320 Commits

Author SHA1 Message Date
Fangrui Song
2d4ecba957 MCSymbolMachO: Remove classof
The object file format specific derived classes are used in context
where the type is statically known. We don't use isa/dyn_cast and we
want to eliminate MCSymbol::Kind in the base class.
2025-08-03 18:59:35 -07:00
Fangrui Song
85f00707dd MCSymbolMachO: Migrate away from classof
The object file format specific derived classes are used in context
where the type is statically known. We don't use isa/dyn_cast and we
want to eliminate MCSymbol::Kind in the base class.
2025-08-03 17:48:50 -07:00
Fangrui Song
2bebbe166b MCFragment: Migrate away from appendContents
The fixed-size content of the MCFragment object will be stored as
trailing data (#150846). Any post-assembler-layout adjustments must
target the variable-size tail.
2025-07-28 20:35:35 -07:00
Fangrui Song
f517ac2083 MCSectionCOFF: Avoid cast
The object file format specific derived classes are used in context like
MCStreamer and MCObjectTargetWriter where the type is statically known.
We don't use isa/dyn_cast and we want to eliminate
MCSection::SectionVariant in the base class.
2025-07-26 10:04:04 -07:00
Fangrui Song
b22e22ebfa MC: Allocate initial fragment and define section symbol in changeSection
Reland #150574 with a MCStreamer::changeSection change:
In Mach-O, DWARF sections use Begin as a temporary label, requiring a label
definition, unlike section symbols in other file formats.
(Tested by dec978036ef1037753e7de5b78c978e71c49217b)

---

13a79bbfe583e1d8cc85d241b580907260065eb8 (2017) introduced fragment
creation in MCContext for createELFSectionImpl, which was inappropriate.
Fragments should only be created when using MCSteramer, not during
`MCContext::get*Section` calls.
`initMachOMCObjectFileInfo` defines multiple sections, some of which may
not be used by the code generator. This caused symbol names matching
these sections to be incorrectly marked as undefined (see
https://reviews.llvm.org/D55173).

The fragment code was later replicated in other file formats, such as
WebAssembly (see https://reviews.llvm.org/D46561), XCOFF, and GOFF.

This patch fixes the problem by moving initial fragment allocation from
MCContext::createSection to MCStreamer::changeSection.
While MCContext still creates a section symbol, the symbol is not
attached to the initial fragment. In addition,

* Move `emitLabel`/`setFragment` from `switchSection*` and
  overridden changeSection to `MCObjectStreamer::changeSection` for
  consistency.
* De-virtualize `switchSectionNoPrint`.
* test/CodeGen/XCore/section-name.ll now passes. XCore doesn't support
  MCObjectStreamer. I don't think the MCAsmStreamer output behavior
  change matters.

Pull Request: https://github.com/llvm/llvm-project/pull/150574
2025-07-26 00:05:10 -07:00
dyung
cb37916a25
Revert "MC: Allocate initial fragment and define section symbol in changeSection" (#150736)
Reverts llvm/llvm-project#150574

This is causing a test failure on AArch64 MacOS bots:
https://lab.llvm.org/buildbot/#/builders/190/builds/24187
2025-07-25 20:21:59 -07:00
Fangrui Song
1955a01d63
MC: Allocate initial fragment and define section symbol in changeSection
13a79bbfe583e1d8cc85d241b580907260065eb8 (2017) introduced fragment
creation in MCContext for createELFSectionImpl, which was inappropriate.
Fragments should only be created when using MCSteramer, not during
`MCContext::get*Section` calls.
`initMachOMCObjectFileInfo` defines multiple sections, some of which may
not be used by the code generator. This caused symbol names matching
these sections to be incorrectly marked as undefined (see
https://reviews.llvm.org/D55173).

The fragment code was later replicated in other file formats, such as
WebAssembly (see https://reviews.llvm.org/D46561), XCOFF, and GOFF.

This patch fixes the problem by moving initial fragment allocation from
MCContext::createSection to MCStreamer::changeSection.
While MCContext still creates a section symbol, the symbol is not
attached to the initial fragment.
In addition, move `emitLabel`/`setFragment` from `switchSection*` and
overridden changeSection to `MCObjectStreamer::changeSection` for
consistency.

* test/CodeGen/XCore/section-name.ll now passes. XCore doesn't support
  MCObjectStreamer. I don't think the MCAsmStreamer output behavior
  change matters.

Pull Request: https://github.com/llvm/llvm-project/pull/150574
2025-07-24 23:37:48 -07:00
Kazu Hirata
3e53d4d386
[llvm] Remove unused includes (NFC) (#150265)
These are identified by misc-include-cleaner.  I've filtered out those
that break builds.  Also, I'm staying away from llvm-config.h,
config.h, and Compiler.h, which likely cause platform- or
compiler-specific build failures.
2025-07-23 15:18:46 -07:00
Fangrui Song
0b3579bedc MCMachOStreamer: Call MCObjectStreamer::changeSection
We don't want to emit linker-local label `ltmpN` for addrsig and
cg_profile sections. Make the intention clear to prepare for the pending
refactoring that moves emitLabel from switchSection to changeSection.
2025-07-22 23:48:06 -07:00
Fangrui Song
bdbc0987ca MCObjectStreamer: Remove changeSectionImpl 2025-07-20 16:28:16 -07:00
Fangrui Song
6201761e96 MC: Rename isVirtualSection to isBssSection
The term BSS (Block Started by Symbol) is a standard, widely recognized
term, available in the a.out object file format and adopted by formats
like COFF, XCOFF, Mach-O (called S_ZEROFILL while `__bss` is also used),
and ELF. To avoid introducing unfamiliar terms, we should use
isBSSSection instead of isVirtualSection.
2025-07-20 10:39:17 -07:00
Fangrui Song
dc89a910aa MCStreamer: Simplify with newFragment. NFC 2025-07-19 12:40:36 -07:00
Fangrui Song
dc3a4c0fcf
MC: Restructure MCFragment as a fixed part and a variable tail
Refactor the fragment representation of `push rax; jmp foo; nop; jmp foo`,
previously encoded as
`MCDataFragment(nop); MCRelaxableFragment(jmp foo); MCDataFragment(nop); MCRelaxableFragment(jmp foo)`,

to

```
MCFragment(fixed: push rax, variable: jmp foo)
MCFragment(fixed: nop, variable: jmp foo)
```

Changes:

* Eliminate MCEncodedFragment, moving content and fixup storage to MCFragment.
* The new MCFragment contains a fixed-size content (similar to previous
  MCDataFragment) and an optional variable-size tail.
* The variable-size tail supports FT_Relaxable, FT_LEB, FT_Dwarf, and
  FT_DwarfFrame, with plans to extend to other fragment types.
  dyn_cast/isa should be avoided for the converted fragment subclasses.
* In `setVarFixups`, source fixup offsets are relative to the variable part's start.
  Stored fixup (in `FixupStorage`) offsets are relative to the fixed part's start.
  A lot of code does `getFragmentOffset(Frag) + Fixup.getOffset()`,
  expecting the fixup offset to be relative to the fixed part's start.
* HexagonAsmBackend::fixupNeedsRelaxationAdvanced needs to know the
  associated instruction for a fixup. We have to add a `const MCFragment &` parameter.
* In MCObjectStreamer, extend `absoluteSymbolDiff` to apply to
  FT_Relaxable as otherwise there would be many more FT_DwarfFrame
  fragments in -g compilations.

https://llvm-compile-time-tracker.com/compare.php?from=28e1473e8e523150914e8c7ea50b44fb0d2a8d65&to=778d68ad1d48e7f111ea853dd249912c601bee89&stat=instructions:u

```
stage2-O0-g instructins:u geomeon (-0.07%)
stage1-ReleaseLTO-g (link only) max-rss geomean (-0.39%)
```

```
% /t/clang-old -g -c sqlite3.i -w -mllvm -debug-only=mc-dump &| awk '/^[0-9]+/{s[$2]++;tot++} END{print "Total",tot; n=asorti(s, si); for(i=1;i<=n;i++) print si[i],s[si[i]]}'
Total 59675
Align 2215
Data 29700
Dwarf 12044
DwarfCallFrame 4216
Fill 92
LEB 12
Relaxable 11396
% /t/clang-new -g -c sqlite3.i -w -mllvm -debug-only=mc-dump &| awk '/^[0-9]+/{s[$2]++;tot++} END{print "Total",tot; n=asorti(s, si); for(i=1;i<=n;i++) print si[i],s[si[i]]}'
Total 32287
Align 2215
Data 2312
Dwarf 12044
DwarfCallFrame 4216
Fill 92
LEB 12
Relaxable 11396
```

Pull Request: https://github.com/llvm/llvm-project/pull/148544
2025-07-15 21:56:55 -07:00
Fangrui Song
04395be630
MC: Merge MCFragment.h into MCSection.h
... due to their close relationship. MCSection's inline functions (e.g.
iterator) access MCFragment, and we want MCFragment's inline functions
to access MCSection similarly (#146307).

Pull Request: https://github.com/llvm/llvm-project/pull/146315
2025-06-30 09:41:53 -07:00
Fangrui Song
cd075a4013 MCObjectStreamer: Deduplicate emitInstToData 2025-06-29 16:12:10 -07:00
Jessica Clarke
1b41599cf8
[MC][AArch64][ARM][X86] Push target-dependent assembler flags into targets (#139844)
The .syntax unified directive and .codeX/.code X directives are, other
than some simple common printing code, exclusively implemented in the
targets themselves. Thus, remove the corresponding MCAF_* flags and
reimplement the directives solely within the targets. This avoids
exposing all targets to all other targets' flags.

Since MCAF_SubsectionsViaSymbols is all that remains, convert it to its
own function like other directives, simplifying its implementation.

Note that, on X86, we now always need a target streamer when parsing
assembly, as it's now used for directives that aren't COFF-specific. It
still does not however need to do anything when producing a non-COFF
object file, so this commit does not introduce any new target streamers.

There is some churn in test output, and corresponding UTC regex changes,
due to comments no longer being flushed by these various directives (and
EmitEOL is not exposed outside MCAsmStreamer.cpp so we couldn't do so
even if we wanted to), but that was a bit odd to be doing anyway.

This is motivated by Morello LLVM, which adds yet another assembler flag
to distinguish A64 and C64 instruction sets, but did not update every
switch and so emits warnings during the build. Rather than fix those
warnings it seems better to instead make the problem not exist in the
first place via this change.
2025-05-18 20:09:43 +01:00
Fangrui Song
b1cd3cb3f4 [MC] Replace getSymA()->getSymbol() with getAddSym. NFC
We will replace the MCSymbolRefExpr member in MCValue with MCSymbol.
This change reduces dependence on MCSymbolRefExpr.
2025-04-05 13:34:24 -07:00
Fangrui Song
b73e144bdf MCValue: Simplify code with getSubSym
MCValue::SymB is a MCSymbolRefExpr *, which might become MCSymbol * in
the future. Simplify some code that uses MCValue::SymB.
2025-03-23 12:13:13 -07:00
Fangrui Song
7722d7519c [MC] evaluateAsRelocatableImpl: remove the Fixup argument
Follow-up to d6fbffa23c84e622735b3e880fd800985c1c0072 . This commit
updates all call sites and removes the argument from the function.
2025-03-15 16:10:19 -07:00
Fangrui Song
ad61e53333
[ARM] Move MCStreamer::emitThumbFunc to ARMTargetStreamer
MCStreamer should not declare arch-specific functions. Such functions
should go to MCTargetStreamer.

Move MCMachOStreamer::emitThumbFunc to ARMTargetMachOStreamer, which is
a new subclass of ARMTargetStreamer. (The new class is just placed in
ARMMachObjectWriter.cpp. The conventional split like
ARMELFObjectWriter.cpp/ARMELFObjectWriter.cpp is overkill.)

`emitCFILabel`, called by ARMWinCOFFStreamer.cpp, has to be made public.

Pull Request: https://github.com/llvm/llvm-project/pull/126199
2025-02-10 09:40:43 -08:00
Fangrui Song
3fa5b52714 [MC] Decrease direct access to getContents()
to make it easier to store fragment contents out-of-line. Loosely based
on Alexis Engelke's prototype.
2024-12-21 17:14:26 -08:00
Alexis Engelke
f1cb64b6f0
[MC] Remove Darwin SDK/Version from ObjFileInfo (#103025)
There's only a single user (MCMachOStreamer), so it makes more sense to
move the version emission to the source of the data.
2024-08-14 09:24:07 +02:00
Fangrui Song
f017d89b22 MCAssembler: Move SubsectionsViaSymbols; to MCObjectWriter 2024-07-22 23:31:01 -07:00
Fangrui Song
eb2239299e MCAssembler: Move LinkerOptions to MachObjectWriter 2024-07-22 22:02:51 -07:00
Fangrui Song
ae3c85a708 MCAssembler: Move CGProfile to MCObjectWriter 2024-07-22 21:56:45 -07:00
Fangrui Song
09a399a1dd [MC] Move VersionInfo to MachObjectWriter 2024-07-21 13:03:21 -07:00
Fangrui Song
a2af375556 [MC] Move LOHContainer to MachObjectwriter 2024-07-21 11:19:52 -07:00
Fangrui Song
e299b163c7 [MC] Move isPrivateExtern to MCSymbolMachO 2024-07-21 10:58:20 -07:00
Fangrui Song
aa86f4f185 [MC] Remove unnecessary DWARFMustBeAtTheEnd check
36a15cb975334403216e6145d4abece3026af17a introduced the
DWARFMustBeAtTheEnd check to ensure DWARF sections were placed after all
text sections to help avoid out-of-range branches for Darwin ARM. The
commit removed a Darwin ARM hack from
20e5f5ed7930efdf2bd34bf099f24ac88798c5ea (2009), likely due to a
no-longer-relevant assembler limitation.

However, this check is no longer relevant due to the following:

* Our CodeGen approach reliably places DWARF sections at the end.
* Darwin AArch32 is less relevant today.

Removing this check also addresses a minor clang cc1as crash that could
occur when text sections were placed after DWARF sections
(e9ad54b3ee905ea3a77c35ca7d6e843b2c552e0b (2015)).
2024-07-20 10:43:52 -07:00
Ahmed Bougacha
5f1bb62c6b
[AArch64][PAC] Lower ptrauth constants in code for MachO. (#97665)
This also adds support for auth stubs on MachO using __DATA,__auth_ptr.

Some of the machinery for auth stubs is already implemented;  this
generalizes that a bit to support MachO, and moves some of the shared
logic into MMIImpls.

In particular, this originally had an AuthStubInfo struct, but we no
longer need it beyond a single MCExpr.  So this provides variants of
the symbol stub helper type declarations and functions for "expr
stubs", where a stub points at an arbitrary MCExpr, rather than
a simple MCSymbol (and a bit).
2024-07-10 15:03:17 -07:00
Fangrui Song
009082aa4b [MC] Move MCAssembler::DataRegions to MachObjectWriter
and make some cleanup.
2024-07-04 23:34:54 -07:00
Fangrui Song
1d4d92d1cc [MC] Move MCAssembler::IndirectSymbols to MachObjectWriter 2024-07-04 22:56:03 -07:00
Fangrui Song
94471e73fe [MC] Move MCAssembler::isSymbolLinkerVisible to MCSymbolMachO 2024-07-03 17:25:10 -07:00
Fangrui Song
21276fd7be
[MC] Don't treat altentry symbols as atoms
The current `setAtom` is inaccurate: a `.alt_entry` label can also be
recognized as an atom. This is mostly benign, but might cause two
locations only separated by an `.alt_entry` to have different atoms.

https://reviews.llvm.org/D153167 changed a `evaluateKnownAbsolute` to
`evaluateAsAbsolute` and would not fold `A-B` even if they are only
separated by a `.alt_entry` label, leading to a spurious error
`invalid CFI advance_loc expression`.

The fix is similar to #82268: add a special case for `.alt_entry`.

Fix #97116

Pull Request: https://github.com/llvm/llvm-project/pull/97479
2024-07-02 15:19:28 -07:00
Fangrui Song
66518ad7fd [MC] Remove addFragment. NFC
This was introduced in dcb71c06c7b059e313f22e46bc9c41343a03f1eb to help
migrate away raw `operator new` and refactor the fragment
representation.

This is now unneeded after `MCStreamer::CurFrag` and
`MCSection::CurFragList` refactoring.
2024-06-29 17:26:51 -07:00
Fangrui Song
abbf3bce94 [MC] Avoid some registerSection calls
changeSection is preferred to call the changeSectionImpl hook, which
registers the section symbol.
2024-06-23 15:21:02 -07:00
Fangrui Song
21fac2d1d0 [MC] Ensure all new sections have a MCDataFragment
MCAssembler::layout ensures that every section has at least one
fragment, which simplifies MCAsmLayout::getSectionAddressSize (see
e73353c7201a3080851d99a16f5fe2c17f7697c6 from 2010). It's better to
ensure the condition is satisfied at create time (COFF, GOFF, Mach-O) to
simplify more fragment processing.
2024-06-23 10:08:52 -07:00
Fangrui Song
95f983f823 [MC] Change Subsection parameters from const MCExpr * to uint32_t
Follow-up to 05ba5c0648ae5e80d5afce270495bf3b1eef9af4. uint32_t is
preferred over const MCExpr * in the section stack uses because it
should only be evaluated once. Change the paramter type to match.
2024-06-22 21:48:11 -07:00
Fangrui Song
f808abf508
[MC] Add MCFragment allocation helpers
`allocFragment` might be changed to a placement new when the allocation
strategy changes.

`allocInitialFragment` is to deduplicate the following pattern
```
  auto *F = new MCDataFragment();
  Result->addFragment(*F);
  F->setParent(Result);
```

Pull Request: https://github.com/llvm/llvm-project/pull/95197
2024-06-14 09:39:32 -07:00
Fangrui Song
27588fe205
[MC] Move MCFragment::Atom to MCSectionMachO::Atoms
Mach-O's `.subsections_via_symbols` mechanism associates a fragment with
an atom (a non-temporary defined symbol). The current approach
(`MCFragment::Atom`) wastes space for other object file formats.

After #95077, `MCFragment::LayoutOrder` is only used by
`AttemptToFoldSymbolOffsetDifference`. While it could be removed, we
might explore future uses for `LayoutOrder`.

@aengelke suggests one use case: move `Atom` into MCSection. This works
because Mach-O doesn't support `.subsection`, and `LayoutOrder`, as the
index into the fragment list, is unchanged.

This patch moves MCFragment::Atom to MCSectionMachO::Atoms. `getAtom`
may be called at parse time before `Atoms` is initialized, so a bound
checking is needed to keep the hack working.

Pull Request: https://github.com/llvm/llvm-project/pull/95341
2024-06-13 14:37:15 -07:00
Fangrui Song
4e34035616 [MC] Remove RelaxAll parameters from create*Streamer
Related to clean-up opportunities discussed at #90013.

After these cleanups, the `RelaxAll` parameter from
`createMCObjectStreamer` can be removed as well. As
`createMCObjectStreamer` is a more user-facing API and used by two files
in mlir/, we postpone the cleanup to the future.
2024-04-25 14:57:27 -07:00
Fangrui Song
45b59cb1d4 [MC] Move setRelaxAll() calls to MCObjectStreamer
Related to clean-up opportunities discussed at #90013.
2024-04-25 13:54:04 -07:00
Eli Friedman
7198baccda [COFF] Add MC support for emitting IMAGE_WEAK_EXTERN_ANTI_DEPENDENCY symbols
This is mostly useful for ARM64EC, which uses such symbols extensively.

One interesting quirk of ARM64EC is that we need to be able to emit weak
symbols that point at each other (so if either symbol is defined
elsewhere, both symbols point at the definition). This handling is
currently restricted to weak_anti_dep symbols, because we depend on the
current behavior of resolving weak symbols in some cases.

Differential Revision: https://reviews.llvm.org/D145208
2023-06-07 11:07:21 -07:00
Fangrui Song
01260bbc6b [MC] registerSymbol: change an output paramter to return value 2023-05-04 22:17:56 -07:00
Zequan Wu
439f804c47 Revert "[COFF] Add MC support for emitting IMAGE_WEAK_EXTERN_ANTI_DEPENDENCY symbols"
This reverts commit 10c17c97ebaf81ac26f6830e51a7a57ddcf63cd2. It causes undefined symbol error on chromium windows build. A small repro was uploaded to the code review.
2023-04-27 10:01:56 -04:00
Eli Friedman
10c17c97eb [COFF] Add MC support for emitting IMAGE_WEAK_EXTERN_ANTI_DEPENDENCY symbols
This is mostly useful for ARM64EC, which uses such symbols extensively.

One interesting quirk of ARM64EC is that we need to be able to emit weak
symbols that point at each other (so if either symbol is defined
elsewhere, both symbols point at the definition).  This required a few
changes to the way we handle weak symbols on Windows.

Differential Revision: https://reviews.llvm.org/D145208
2023-04-17 13:17:25 -07:00
Arthur Eubanks
29a88f991b Revert "[COFF] Add MC support for emitting IMAGE_WEAK_EXTERN_ANTI_DEPENDENCY symbols"
This reverts commit fffdb7eac58b4efde5e23c1281e7a7f93a42d280.

Causes crashes, see https://reviews.llvm.org/D145208
2023-04-13 09:09:36 -07:00
Eli Friedman
fffdb7eac5 [COFF] Add MC support for emitting IMAGE_WEAK_EXTERN_ANTI_DEPENDENCY symbols
This is mostly useful for ARM64EC, which uses such symbols extensively.

One interesting quirk of ARM64EC is that we need to be able to emit weak
symbols that point at each other (so if either symbol is defined
elsewhere, both symbols point at the definition).  This required a few
changes to the way we handle weak symbols on Windows.

Differential Revision: https://reviews.llvm.org/D145208
2023-04-07 14:05:45 -07:00
Alexis Engelke
0c049ea60a [MC] Always encode instruction into SmallVector
All users of MCCodeEmitter::encodeInstruction use a raw_svector_ostream
to encode the instruction into a SmallVector. The raw_ostream however
incurs some overhead for the actual encoding.

This change allows an MCCodeEmitter to directly emit an instruction into
a SmallVector without using a raw_ostream and therefore allow for
performance improvments in encoding. A default path that uses existing
raw_ostream implementations is provided.

Reviewed By: MaskRay, Amir

Differential Revision: https://reviews.llvm.org/D145791
2023-04-06 16:21:49 +02:00
Guillaume Chatelet
710a834c4c [Alignment][NFC] Use Align in MCSymbol::setCommon
This patch supersedes D138705.

Differential Revision: https://reviews.llvm.org/D139819
2022-12-12 13:09:24 +00:00