From fdce869d723f8032b48bb0e551b59a13b5d002d9 Mon Sep 17 00:00:00 2001
From: dan rouhana
Date: Sat, 21 Feb 2026 15:28:37 -0800
Subject: [PATCH] [llvm-mca][Darwin] Fix crash on .subsections_via_symbols in asm input (#182694)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

## Summary

This PR fixes an llvm-mca crash on Darwin assembly containing `.subsections_via_symbols`. `DarwinAsmParser` forwards the directive to `emitSubsectionsViaSymbols()`, and in llvm-mca that call lands on the base `MCStreamer` implementation, which is an `llvm_unreachable`. The fix adds a no-op override in `llvm/tools/llvm-mca/CodeRegionGenerator.h`, scoped to llvm-mca only.

## Problem manifestation

I ran across this while building an interactive interpreter/code analyzer and adding Apple silicon support to it.

## Root cause

- `DarwinAsmParser::parseDirectiveSubsectionsViaSymbols` calls `getStreamer().emitSubsectionsViaSymbols()`
- llvm-mca uses `MCStreamerWrapper`, derived from `MCStreamer`
- `MCStreamerWrapper` did not override `emitSubsectionsViaSymbols()`
- the call therefore reached `MCStreamer::emitSubsectionsViaSymbols()`, which is `llvm_unreachable`, causing a crash

## Reproduction

I was able to reproduce this in three ways (note: behavior was the same with or without `-target arm64-apple-macos` or any other arch flags):

1. AppleClang-generated assembly:

```bash
cat > /tmp/repro.cpp <<'EOF'
int foo(int x) { return x + 1; }
EOF
/usr/bin/clang++ -target arm64-apple-macos -O0 -S /tmp/repro.cpp -o /tmp/repro-appleclang.s
/opt/homebrew/opt/llvm/bin/llvm-mca --show-encoding --register-file-stats /tmp/repro-appleclang.s
```

2. Homebrew Clang-generated assembly:

```bash
cat > /tmp/repro.cpp <<'EOF'
int foo(int x) { return x + 1; }
EOF
/opt/homebrew/opt/llvm/bin/clang++ -target arm64-apple-macos -O0 -S /tmp/repro.cpp -o /tmp/repro-hbclang.s
/opt/homebrew/opt/llvm/bin/llvm-mca --show-encoding --register-file-stats /tmp/repro-hbclang.s
```

3. Handwritten Darwin assembly:

```bash
cat > /tmp/min.s <<'EOF'
.text
.subsections_via_symbols
.globl _foo
_foo:
  ret
EOF
/opt/homebrew/opt/llvm/bin/llvm-mca --show-encoding --register-file-stats /tmp/min.s
```

## Fix

Add a no-op `emitSubsectionsViaSymbols()` override to `MCStreamerWrapper` in `llvm/tools/llvm-mca/CodeRegionGenerator.h`.

This keeps the fix local to llvm-mca’s analysis streamer and does not change Mach-O object emission behavior. The same pattern is already used in `llvm/lib/Object/RecordStreamer.h` ([link](https://github.com/llvm/llvm-project/blob/main/llvm/lib/Object/RecordStreamer.h#L57-L60)).

## Validation

Reproduced and verified on macOS arm64 with the three reproduction cases above:

1. assembly generated by AppleClang
2. assembly generated by Homebrew Clang
3. manually authored `.s` file containing `.subsections_via_symbols`

With the patch applied, `llvm-mca --show-encoding --register-file-stats` no longer crashes on any of them.

## Notes

As far as I can tell, this belongs on the consumer side (llvm-mca’s parser/streamer) rather than in clang as the producer, since the directive is valid Darwin assembly and can appear in external input files.

This seemed like the most targeted fix, and it follows the same pattern used elsewhere. Happy to revise the approach if there’s a better fit for llvm-mca. Thanks for taking a look!
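
For reviewers less familiar with the streamer hierarchy, here is a minimal standalone sketch of the dispatch described under Root cause. It is not LLVM code: `BaseStreamer`, `WrapperBefore`, `WrapperAfter`, and `handleDirective` are hypothetical stand-ins for `MCStreamer`, `MCStreamerWrapper` before and after this patch, and the Darwin parser. A thrown exception stands in for `llvm_unreachable` so the failing path can be observed without aborting.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for MCStreamer: the default hook refuses to run.
struct BaseStreamer {
  virtual ~BaseStreamer() = default;
  virtual void emitSubsectionsViaSymbols() {
    // LLVM uses llvm_unreachable here; an exception keeps the sketch testable.
    throw std::logic_error("emitSubsectionsViaSymbols not implemented");
  }
};

// Stand-in for the wrapper before the patch: inherits the failing default.
struct WrapperBefore : BaseStreamer {};

// Stand-in for the wrapper after the patch: the directive is accepted and
// ignored, since an analysis-only streamer has nothing to emit for it.
struct WrapperAfter : BaseStreamer {
  void emitSubsectionsViaSymbols() override {}
};

// Stand-in for the Darwin parser: forwards the directive to the active
// streamer. Returns true if the streamer handled it without failing.
bool handleDirective(BaseStreamer &S, const std::string &Directive) {
  if (Directive != ".subsections_via_symbols")
    return true; // other directives are out of scope for this sketch
  try {
    S.emitSubsectionsViaSymbols();
  } catch (const std::logic_error &) {
    return false;
  }
  return true;
}
```

Calling `handleDirective` with a `WrapperBefore` and `.subsections_via_symbols` takes the failing base path, while the same call with a `WrapperAfter` succeeds. That difference is the whole effect of the one-line override in this patch.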
---
 .../AArch64/Apple/darwin-subsections-via-symbols.s         | 7 +++++++
 llvm/tools/llvm-mca/CodeRegionGenerator.h                  | 1 +
 2 files changed, 8 insertions(+)
 create mode 100644 llvm/test/tools/llvm-mca/AArch64/Apple/darwin-subsections-via-symbols.s

diff --git a/llvm/test/tools/llvm-mca/AArch64/Apple/darwin-subsections-via-symbols.s b/llvm/test/tools/llvm-mca/AArch64/Apple/darwin-subsections-via-symbols.s
new file mode 100644
index 000000000000..bcc17a524729
--- /dev/null
+++ b/llvm/test/tools/llvm-mca/AArch64/Apple/darwin-subsections-via-symbols.s
@@ -0,0 +1,7 @@
+# RUN: llvm-mca -mtriple=arm64-apple-macos -mcpu=apple-m4 -iterations=1 < %s
+
+.text
+.subsections_via_symbols
+.globl _foo
+_foo:
+  ret
diff --git a/llvm/tools/llvm-mca/CodeRegionGenerator.h b/llvm/tools/llvm-mca/CodeRegionGenerator.h
index c30f67a53eac..7083ba363081 100644
--- a/llvm/tools/llvm-mca/CodeRegionGenerator.h
+++ b/llvm/tools/llvm-mca/CodeRegionGenerator.h
@@ -106,6 +106,7 @@ public:
   void emitZerofill(MCSection *Section, MCSymbol *Symbol = nullptr,
                     uint64_t Size = 0, Align ByteAlignment = Align(1),
                     SMLoc Loc = SMLoc()) override {}
+  void emitSubsectionsViaSymbols() override {}
   void beginCOFFSymbolDef(const MCSymbol *Symbol) override {}
   void emitCOFFSymbolStorageClass(int StorageClass) override {}
   void emitCOFFSymbolType(int Type) override {}