This reverts commit be5a845e4c29aadb513ae6e5e2879dccf37efdbb.
This causes large code size regressions, which were not present
in the initial version of this change.
This builds on top of commit 9d0754ada5dbbc0c009bcc2f7824488419cc5530
("[MC] Relax fragments eagerly") and relaxes fragments eagerly to
eliminate MCSection::HasLayout and `getFragmentOffset` overhead. The
approach is slightly different from
1a47f3f3db66589c11f8ddacfeaecc03fb80c510 and has less performance
benefit.
The new layout algorithm also addresses the following problems:
* Size change of MCFillFragment/MCOrgFragment did not influence the
fixed-point iteration, which could be problematic for contrived cases.
* The `invalid number of bytes` error was reported too early. Since
`.zero A-B` might have temporary negative values in the first few
iterations.
* X86AsmBackend::finishLayout performed only one iteration, which might
not converge. In addition, the removed `#ifndef NDEBUG` code (disabled
by default) in X86AsmBackend::finishLayout was problematic, as !NDEBUG
and NDEBUG builds evaluated fragment offsets at different times before
this patch.
* The computed layout for relax-recompute-align.s is optimal now.
Builds with many text sections (e.g. full LTO) shall observe a decrease
in compile time while the new algorithm could be slightly slower for
some -O0 -g projects.
Aligned bundling from the deprecated PNaCl placed constraints how we can
perform iteration.
This reverts commit 1a47f3f3db66589c11f8ddacfeaecc03fb80c510.
Fix#100283
This commit is actually a trigger of other preexisting problems:
* Size change of fill fragments does not influence the fixed-point iteration.
* The `invalid number of bytes` error is reported too early. Since
`.zero A-B` might have temporary negative values in the first few
iterations.
However, the problems appeared at least "benign" (did not affect the
Linux kernel builds) before this commit.
This builds on top of commit 9d0754ada5dbbc0c009bcc2f7824488419cc5530
("[MC] Relax fragments eagerly") and relaxes fragments eagerly to
eliminate MCSection::HasLayout and `getFragmentOffset` overhead.
Note: The removed `#ifndef NDEBUG` code (disabled by default) in
X86AsmBackend::finishLayout was problematic, as (a) !NDEBUG and NDEBUG
builds evaluated fragment offsets at different times before this patch
(b) one iteration might not be sufficient to converge. There might be
some edge cases that it did not handle. Anyhow, this patch probably
makes it work for more cases.
This method is called once per encoded instruction, but never changes
throughout the lifetime of a section. Store this information as a bit
flag in the MCSection instead.
This commit removes the complexity introduced by pending labels in
https://reviews.llvm.org/D5915 by using a simpler approach. D5915 aimed
to ensure padding placement before `.Ltmp0` for the following code, but
at the cost of expensive per-instruction `flushPendingLabels`.
```
// similar to llvm/test/MC/X86/AlignedBundling/labeloffset.s
.bundle_lock align_to_end
calll .L0$pb
.bundle_unlock
.L0$pb:
popl %eax
.Ltmp0: //// padding should be inserted before this label instead of after
addl $_GLOBAL_OFFSET_TABLE_+(.Ltmp0-.L0$pb), %eax
```
(D5915 was adjusted by https://reviews.llvm.org/D8072 and
https://reviews.llvm.org/D71368)
This patch achieves the same goal by setting the offset of the empty
MCDataFragment (`Prev`) in `layoutBundle`. This eliminates the need for
pending labels and simplifies the code.
llvm/test/MC/MachO/pending-labels.s (D71368): relocation symbols are
changed, but the result is still supported by linkers.
There are only three actual uses of the section kind in MCSection:
isText(), XCOFF, and WebAssembly. Store isText() in the MCSection, and
store other info in the actual section variants where required.
ELF and COFF flags also encode all relevant information, so for these
two section variants, remove the SectionKind parameter entirely.
This allows to remove the string switch (which is unnecessary and
inaccurate) from createELFSectionImpl. This was introduced in
[D133456](https://reviews.llvm.org/D133456), but apparently, it was
never hit for non-writable sections anyway and the resulting kind was
never used.
When both aligned bundling and RelaxAll are enabled, bundle padding is
directly written into fragments (https://reviews.llvm.org/D8072).
(The original motivation was memory usage, which has been achieved from
different angles with recent assembler improvement).
The code presents challenges with the work to replace fragment
representation (e.g. #94950#95077). This patch removes the special
handling. RelaxAll still works but the behavior seems slightly different
as revealed by 2 changed tests. However, most `-mc-relax-all` tests are
unchanged.
RelaxAll used to be the default for clang -O0. This mode has significant
code size drawbacks and newer Clang doesn't use it (#90013).
---
flushPendingLabels: The FOffset parameter can be removed: pending labels
will be assigned to the incoming fragment at offset 0.
Pull Request: https://github.com/llvm/llvm-project/pull/95188
The setAtom call introduced by e17bc023f4e5b79f08bfc7f624f8ff0f0cf17ce4
was due to my misunderstanding of flushPendingLabels
(see https://discourse.llvm.org/t/mc-removing-aligned-bundling-support/79518).
When evaluating `.quad x-y`,
MCExpr.cpp:AttemptToFoldSymbolOffsetDifference gives different results
at parse time and layout time because the `if (FA->getAtom() ==
FB.getAtom())` condition in isSymbolRefDifferenceFullyResolvedImpl only
works when `setAtom` with a non-null pointer has been called. Calling
setAtom in flushPendingLabels does not help anything.
Fragments are allocated with `operator new` and stored in an ilist with
Prev/Next/Parent pointers. A more efficient representation would be an
array of fragments without the overhead of Prev/Next pointers.
As the first step, replace ilist with singly-linked lists.
* `getPrevNode` uses have been eliminated by previous changes.
* The last use of the `Prev` pointer remains: for each subsection, there is an insertion point and
the current insertion point is stored at `CurInsertionPoint`.
* `HexagonAsmBackend::finishLayout` needs a backward iterator. Save all
fragments within `Frags`. Hexagon programs are usually small, and the
performance does not matter that much.
To eliminate `Prev`, change the subsection representation to
singly-linked lists for subsections and a pointer to the active
singly-linked list. The fragments from all subsections will be chained
together at layout time.
Since fragment lists are disconnected before layout time, we can remove
`MCFragment::SubsectionNumber` (https://reviews.llvm.org/D69411). The
current implementation of `AttemptToFoldSymbolOffsetDifference` requires
future improvement for robustness.
Pull Request: https://github.com/llvm/llvm-project/pull/95077
Lazy relaxation caused hash table lookups (`getFragmentOffset`) and
complex use/compute interdependencies. Some expressions involding
forward declared symbols (e.g. `subsection-if.s`) cannot be computed.
Recursion detection requires complex `IsBeingLaidOut`
(https://reviews.llvm.org/D79570).
D76114's `invalidateFragmentsFrom` makes lazy relaxation even less
useful.
Switch to eager relaxation to greatly simplify code and resolve these
issues. This change also removes a `getPrevNode` use, which makes it
more feasible to replace the fragment representation, which might yield
a large peak RSS win.
Minor downsides: The number of section relaxations may increase (offset
by avoiding the hash table lookup). For relax-recompute-align.s, the
computed layout is not optimal.
Fixes: c26c5e47ab9ca60835f191c90fa751e9a7dd0f3d (essentially a no-op)
The newly created MCDataFragment should inherit Atom (see
MCMachOStreamer::finishImpl). To the best of my knowledge, this change cannot be
tested at present, but this is important to ensure
MCExpr.cpp:AttemptToFoldSymbolOffsetDifference gives the same result in case we
evaluate the expression again with a MCAsmLayout.
In the following case,
```
.section __DATA,xray_instr_map
lxray_sleds_start1:
.space 16
Lxray_sleds_end1:
.section __DATA,xray_fn_idx
.quad (Lxray_sleds_end1-lxray_sleds_start1)>>4 // can be folded without a MCAsmLayout
```
When we have a MCAsmLayout, without this change, evaluating
(Lxray_sleds_end1-lxray_sleds_start1)>>4 again will fail due to
`FA->getAtom() == nullptr && FB.getAtom() != nullptr` in
MachObjectWriter::isSymbolRefDifferenceFullyResolvedImpl, called by
AttemptToFoldSymbolOffsetDifference.
The newly created MCDataFragment should inherit Atom (see
MCMachOStreamer::finishImpl). I cannot think of a case to test the
behavior, but this is one step towards folding the Mach-O label
difference below and making Mach-O more similar to ELF.
```
.section __DATA,xray_instr_map
lxray_sleds_start1:
.space 16
Lxray_sleds_end1:
.section __DATA,xray_fn_idx
.quad (Lxray_sleds_end1-lxray_sleds_start1)>>4 // error: expected relocatable expression // Mach-O
```
There's a few relevant forward declarations in there that may require downstream
adding explicit includes:
llvm/MC/MCContext.h no longer includes llvm/BinaryFormat/ELF.h, llvm/MC/MCSubtargetInfo.h, llvm/MC/MCTargetOptions.h
llvm/MC/MCObjectStreamer.h no longer include llvm/MC/MCAssembler.h
llvm/MC/MCAssembler.h no longer includes llvm/MC/MCFixup.h, llvm/MC/MCFragment.h
Counting preprocessed lines required to rebuild llvm-project on my setup:
before: 1052436830
after: 1049293745
Which is significant and backs up the change in addition to the usual benefits of
decreasing coupling between headers and compilation units.
Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D119244
CanBeUnnamed is rarely false. Splitting to a createNamedTempSymbol makes the
intention clearer and matches the direction of reverted r240130 (to drop the
unneeded parameters).
No behavior change.
For `.bss; nop`, MC inappropriately calls abort() (via report_fatal_error()) with a message
`cannot have fixups in virtual section!`
It is a bug to crash for invalid user input. Fix it by erroring out early in EmitInstToData().
Similarly, emitIntValue() in a virtual section (SHT_NOBITS in ELF) can crash with the mssage
`non-zero initializer found in section '.bss'` (see D4199)
It'd be nice to report the location but so many directives can call emitIntValue()
and it is difficult to track every location.
Note, COFF does not crash because MCAssembler::writeSectionData() is not
called for an IMAGE_SCN_CNT_UNINITIALIZED_DATA section.
Note, GNU as' arm64 backend reports ``Error: attempt to store non-zero value in section `.bss'``
for a non-zero .inst but fails to do so for other instructions.
We simply reject all instructions, even if the encoding is all zeros.
The Mach-O counterpart is D48517 (see `test/MC/MachO/zerofill-text.s`)
Reviewed By: rnk, skan
Differential Revision: https://reviews.llvm.org/D78138
I plan to use MCSection::getName() in D78138. Having the function in the base class is also convenient for debugging.
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D78251
This simplifies the generic interface and also makes SHF_ARM_PURECODE
more robust (fixes a TODO). Inspecting MCDataFragment contents covers
more cases than MCObjectStreamer::EmitBytes.
(This commit restores the original branch (4272372c571) and applies an
additional change dropped from the original in a bad merge. This change
should address the previous bot failures. Both changes reviewed by pete.)
Summary:
This commit builds upon Derek Schuff's 2014 commit for attaching labels to
existing fragments ( Diff Revision: http://reviews.llvm.org/D5915 )
When temporary labels appear ahead of a fragment, MCObjectStreamer will
track the temporary label symbol in a "Pending Labels" list. Labels are
associated with fragments when a real fragment arrives; otherwise, an empty
data fragment will be created if the streamer's section changes or if the
stream finishes.
This commit moves the "Pending Labels" list into each MCStream, so that
this label-fragment matching process is resilient to section changes. If
the streamer emits a label in a new section, switches to another section to
do other work, then switches back to the first section and emits a
fragment, that initial label will be associated with this new fragment.
Labels will only receive empty data fragments in the case where no other
fragment exists for that section.
The downstream effects of this can be seen in Mach-O relocations. The
previous approach could produce local section relocations and external
symbol relocations for the same data in an object file, and this mix of
relocation types resulted in problems in the ld64 Mach-O linker. This
commit ensures relocations triggered by temporary labels are consistent.
Reviewers: pete, ab, dschuff
Reviewed By: pete, dschuff
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D71368
This reverts commit 4272372c571cd33edc77a8844b0a224ad7339138.
Caused an MSan buildbot failure. More information available in the patch
that introduced the bug: https://reviews.llvm.org/D71368
Summary:
This commit builds upon Derek Schuff's 2014 commit for attaching labels to
existing fragments ( Diff Revision: http://reviews.llvm.org/D5915 )
When temporary labels appear ahead of a fragment, MCObjectStreamer will
track the temporary label symbol in a "Pending Labels" list. Labels are
associated with fragments when a real fragment arrives; otherwise, an empty
data fragment will be created if the streamer's section changes or if the
stream finishes.
This commit moves the "Pending Labels" list into each MCStream, so that
this label-fragment matching process is resilient to section changes. If
the streamer emits a label in a new section, switches to another section to
do other work, then switches back to the first section and emits a
fragment, that initial label will be associated with this new fragment.
Labels will only receive empty data fragments in the case where no other
fragment exists for that section.
The downstream effects of this can be seen in Mach-O relocations. The
previous approach could produce local section relocations and external
symbol relocations for the same data in an object file, and this mix of
relocation types resulted in problems in the ld64 Mach-O linker. This
commit ensures relocations triggered by temporary labels are consistent.
Reviewers: pete, ab, dschuff
Reviewed By: pete, dschuff
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D71368
to reflect the new license.
We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.
Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.
llvm-svn: 351636
SHF_ARM_PURECODE flag when being built with the -mexecute-only flag.
All code sections of an ELF must have the flag set for the final .text
section to be execute-only, otherwise the flag gets removed.
A HasData flag is added to MCSection to aid in the determination that
the section is empty. A virtual setTargetSectionFlags is added to
MCELFObjectTargetWriter to allow subclasses to set target specific
section flags to be added to sections which we then use in the ARM
backend to set SHF_ARM_PURECODE.
Patch by Ivan Lozano!
Reviewed By: echristo
Differential Revision: https://reviews.llvm.org/D48792
llvm-svn: 341593
See r331124 for how I made a list of files missing the include.
I then ran this Python script:
for f in open('filelist.txt'):
f = f.strip()
fl = open(f).readlines()
found = False
for i in xrange(len(fl)):
p = '#include "llvm/'
if not fl[i].startswith(p):
continue
if fl[i][len(p):] > 'Config':
fl.insert(i, '#include "llvm/Config/llvm-config.h"\n')
found = True
break
if not found:
print 'not found', f
else:
open(f, 'w').write(''.join(fl))
and then looked through everything with `svn diff | diffstat -l | xargs -n 1000 gvim -p`
and tried to fix include ordering and whatnot.
No intended behavior change.
llvm-svn: 331184
Summary:
Add LLVM_FORCE_ENABLE_DUMP cmake option, and use it along with
LLVM_ENABLE_ASSERTIONS to set LLVM_ENABLE_DUMP.
Remove NDEBUG and only use LLVM_ENABLE_DUMP to enable dump methods.
Move definition of LLVM_ENABLE_DUMP from config.h to llvm-config.h so
it'll be picked up by public headers.
Differential Revision: https://reviews.llvm.org/D38406
llvm-svn: 315590
I did this a long time ago with a janky python script, but now
clang-format has built-in support for this. I fed clang-format every
line with a #include and let it re-sort things according to the precise
LLVM rules for include ordering baked into clang-format these days.
I've reverted a number of files where the results of sorting includes
isn't healthy. Either places where we have legacy code relying on
particular include ordering (where possible, I'll fix these separately)
or where we have particular formatting around #include lines that
I didn't want to disturb in this patch.
This patch is *entirely* mechanical. If you get merge conflicts or
anything, just ignore the changes in this patch and run clang-format
over your #include lines in the files.
Sorry for any noise here, but it is important to keep these things
stable. I was seeing an increasing number of patches with irrelevant
re-ordering of #include lines because clang-format was used. This patch
at least isolates that churn, makes it easy to skip when resolving
conflicts, and gets us to a clean baseline (again).
llvm-svn: 304787
We had various variants of defining dump() functions in LLVM. Normalize
them (this should just consistently implement the things discussed in
http://lists.llvm.org/pipermail/cfe-dev/2014-January/034323.html
For reference:
- Public headers should just declare the dump() method but not use
LLVM_DUMP_METHOD or #if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)
- The definition of a dump method should look like this:
#if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)
LLVM_DUMP_METHOD void MyClass::dump() {
// print stuff to dbgs()...
}
#endif
llvm-svn: 293359
This extends the work done in r233995 so that now getFragment (in addition to
getSection) also works for variable symbols.
With that the existing logic to decide if a-b can be computed works even if
a or b are variables. Given that, the expression evaluation can avoid expanding
variables as aggressively and that in turn lets the relocation code see the
original variable.
In order for this to work with the asm streamer, there is now a dummy fragment
per section. It is used to assign a section to a symbol when no other fragment
exists.
This patch is a joint work by Maxim Ostapenko andy myself.
llvm-svn: 249303