[llvm-mca][Darwin] Fix crash on .subsections_via_symbols in asm input (#182694)

## Summary

This PR fixes an llvm-mca crash on Darwin assembly containing
`.subsections_via_symbols`. The directive is forwarded by
`DarwinAsmParser` to `emitSubsectionsViaSymbols()`, which crashes when
it hits the base `MCStreamer` `llvm_unreachable` path. The fix adds a
no-op override in `llvm/tools/llvm-mca/CodeRegionGenerator.h`, scoped to
llvm-mca only.

## Problem manifestation

I ran into this while building an interactive interpreter/code analyzer
and implementing Apple silicon support.

## Root cause

- `DarwinAsmParser::parseDirectiveSubsectionsViaSymbols` calls
`getStreamer().emitSubsectionsViaSymbols()`
- llvm-mca uses `MCStreamerWrapper`, derived from `MCStreamer`
- `MCStreamerWrapper` did not override `emitSubsectionsViaSymbols()`
- the call therefore reached `MCStreamer::emitSubsectionsViaSymbols()`,
whose base implementation is `llvm_unreachable`, so llvm-mca crashed
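
The call chain above can be modeled with a small self-contained sketch. The names `BaseStreamer`, `WrapperStreamer`, `parseSubsectionsViaSymbols`, and `directiveFails` are hypothetical stand-ins, not LLVM types or functions, and the base method throws `std::logic_error` where the real base uses `llvm_unreachable`, so the failure can be observed without aborting the process.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical stand-in for the base streamer. Its default implementation
// rejects the directive (the real code uses llvm_unreachable, not a throw).
struct BaseStreamer {
  virtual ~BaseStreamer() = default;
  virtual void emitSubsectionsViaSymbols() {
    throw std::logic_error("streamer does not support this directive");
  }
};

// Hypothetical stand-in for a wrapper that does not override the hook.
struct WrapperStreamer : BaseStreamer {};

// Hypothetical stand-in for the parser's directive handler: it forwards
// unconditionally to whatever streamer it was given.
void parseSubsectionsViaSymbols(BaseStreamer &S) {
  S.emitSubsectionsViaSymbols();
}

// Returns true if handling the directive fails for the given streamer.
bool directiveFails(BaseStreamer &S) {
  try {
    parseSubsectionsViaSymbols(S);
    return false;
  } catch (const std::logic_error &) {
    return true;
  }
}
```

Calling `directiveFails` on a `WrapperStreamer` returns `true` in this model, because virtual dispatch falls through to the base default when the derived class provides no override.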

## Reproduction

I was able to reproduce this in three ways. Passing or omitting
`-target arm64-apple-macos`, or any other arch flag, made no difference
to the behavior:

1. AppleClang-generated assembly:
    ```bash
    cat > /tmp/repro.cpp <<'EOF'
    int foo(int x) { return x + 1; }
    EOF
    
    /usr/bin/clang++ -target arm64-apple-macos -O0 -S /tmp/repro.cpp \
      -o /tmp/repro-appleclang.s
    /opt/homebrew/opt/llvm/bin/llvm-mca --show-encoding \
      --register-file-stats /tmp/repro-appleclang.s
    ```

2. Homebrew Clang-generated assembly:
    ```bash
    cat > /tmp/repro.cpp <<'EOF'
    int foo(int x) { return x + 1; }
    EOF
    
    /opt/homebrew/opt/llvm/bin/clang++ -target arm64-apple-macos -O0 -S \
      /tmp/repro.cpp -o /tmp/repro-hbclang.s
    /opt/homebrew/opt/llvm/bin/llvm-mca --show-encoding \
      --register-file-stats /tmp/repro-hbclang.s
    ```

3. Handwritten Darwin assembly:
    ```bash
    cat > /tmp/min.s <<'EOF'
      .text
      .subsections_via_symbols
      .globl _foo
      _foo:
          ret
    EOF
    
    /opt/homebrew/opt/llvm/bin/llvm-mca --show-encoding \
      --register-file-stats /tmp/min.s
    ```

## Fix

Add a no-op `emitSubsectionsViaSymbols()` override to
`MCStreamerWrapper` in `llvm/tools/llvm-mca/CodeRegionGenerator.h`.

This keeps the fix local to llvm-mca’s analysis streamer and does not
change Mach-O object emission behavior. The same no-op override pattern
is already used in `llvm/lib/Object/RecordStreamer.h`
([link](https://github.com/llvm/llvm-project/blob/main/llvm/lib/Object/RecordStreamer.h#L57-L60)).
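
As a rough illustration of why a no-op is sufficient here, the sketch below models an analysis-only streamer that records instructions and ignores a directive that only matters when an object file is written. All names (`BaseStreamer`, `AnalysisStreamer`, `analyzeMinimalInput`, and an `emitInstruction` that takes a string) are hypothetical simplifications, not the llvm-mca API, and the base method throws instead of calling `llvm_unreachable`.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical, simplified base streamer (not the real MCStreamer interface).
struct BaseStreamer {
  virtual ~BaseStreamer() = default;
  virtual void emitInstruction(const std::string &Inst) = 0;
  virtual void emitSubsectionsViaSymbols() {
    throw std::logic_error("unsupported directive");
  }
};

// Hypothetical analysis-only streamer: it collects instructions and never
// writes an object file.
struct AnalysisStreamer : BaseStreamer {
  std::vector<std::string> Insts;

  void emitInstruction(const std::string &Inst) override {
    Insts.push_back(Inst);
  }

  // The directive tells the linker that sections may be split at symbol
  // boundaries. With no object output, there is nothing to record.
  void emitSubsectionsViaSymbols() override {}
};

// Feeds the streamer the same events as the handwritten reproduction:
// the directive first, then a single `ret`.
std::vector<std::string> analyzeMinimalInput() {
  AnalysisStreamer S;
  S.emitSubsectionsViaSymbols();
  S.emitInstruction("ret");
  return S.Insts;
}
```

In this model the directive is accepted silently and the instruction stream is unchanged, which is the behavior the override is meant to give llvm-mca.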

## Validation

Reproduced and verified on macOS arm64 with the above three reproduction
cases:
1. assembly generated by AppleClang
2. assembly generated by Homebrew Clang
3. manually-authored `.s` file containing `.subsections_via_symbols`

With the fix applied, `llvm-mca --show-encoding --register-file-stats`
no longer crashes in any of the three cases.

## Notes

From what I can surmise, this is a consumer-side fix (llvm-mca
parser/streamer), not a producer fix in clang, since the directive is
valid Darwin assembly and can appear in external input files. This
seemed to me like the most targeted, distilled fix that follows the same
pattern used elsewhere. Happy to revise the approach if there’s a better
fit for llvm-mca. Thanks for taking a look!
dan rouhana 2026-02-21 15:28:37 -08:00 committed by GitHub
parent c6ade8a170
commit fdce869d72
2 changed files with 8 additions and 0 deletions

@@ -0,0 +1,7 @@
# RUN: llvm-mca -mtriple=arm64-apple-macos -mcpu=apple-m4 -iterations=1 < %s
.text
.subsections_via_symbols
.globl _foo
_foo:
ret

@@ -106,6 +106,7 @@ public:
  void emitZerofill(MCSection *Section, MCSymbol *Symbol = nullptr,
                    uint64_t Size = 0, Align ByteAlignment = Align(1),
                    SMLoc Loc = SMLoc()) override {}
  void emitSubsectionsViaSymbols() override {}
  void beginCOFFSymbolDef(const MCSymbol *Symbol) override {}
  void emitCOFFSymbolStorageClass(int StorageClass) override {}
  void emitCOFFSymbolType(int Type) override {}