575421 Commits

Author SHA1 Message Date
Simon Pilgrim
5674755cb6
[DAG] visitMUL - cleanup pattern matchers to use m_Shl and (commutative) m_Mul directly (#190339)
Based on feedback on #190215
2026-04-03 13:21:51 +00:00
Florian Hahn
c963092b0c
[VPlan] Mark VPCanonicalIVPHI as not reading memory (NFCI). (#190338)
The canonical IV does not access any memory. Mark accordingly. This
should be NFC end-to-end.

PR: https://github.com/llvm/llvm-project/pull/190338
2026-04-03 13:12:20 +00:00
Erich Keane
0a3fdd30e5
[CIR] Handle vtable-lowering-with-incomplete types (#190216)
The NYI diagnostic in getFunctionTypeForVTable showed up a few times in
testing, so this patch is attempting to fix that up.

The reproducer here is a function type for a vtable that has an
incomplete type in it (return or parameter). Classic codegen chooses to
represent this as an opaque type.

This patch instead removes the special v-table handling here, so that we
can instead just represent the types as incomplete record types.

At the moment, this patch ends up lowering incomplete types as 'empty'
types in LLVM-IR, which we may find we need to modify in the future,
however at the moment, it seems to work.

This patch ALSO changes the definition of RecordType::isSized to only be
true for complete types, which prevents a number of other things from
attempting to add attributes/check the size of the type/etc, but those
are irrelevant for the purposes of vtable emission.
2026-04-03 05:59:46 -07:00
Erich Keane
2c734b3951
[CIR] Implement top level 'ExportDecl' emission (#190286)
This is a pretty simple one, it's just a type of decl-context. The actual
exporty-ness is handled on a per-declaration basis.

This patch just makes sure we emit them, as I suspect this will reveal
quite a few more issues in module code.
2026-04-03 05:59:25 -07:00
Amr Hesham
0932472f3b
[CIR][NFC] Add NYI for OMPSplitDirective stmt (#190329)
Fix the warning of missing OMPSplitDirective statement in the emitStmt
switch
2026-04-03 14:45:48 +02:00
alexpaniman
b9924c76da
[clang] Make -dump-tokens option align tokens (#164894)
When using `-Xclang -dump-tokens`, the lexer dump output is currently
difficult to read because the data are misaligned. The existing
implementation simply separates the token name, spelling, flags, and
location using `'\t'`, which results in inconsistent spacing.

For example, the current output looks like this on the example provided in
this patch **(BEFORE THIS PR)**:

<img width="2936" height="632" alt="image"
src="https://github.com/user-attachments/assets/ad893958-6d57-4a76-8838-7fc56e37e6a7"
/>

# Changes

This small PR improves the readability of the token dump by:

+ Adding padding after the token name and after the spelling (the
padding amount was chosen empirically to produce good average
alignment).
+ Swapping the order of location and flags (since flags can take up a
lot of space and disrupt alignment).

The result is a more readable output **(AFTER THIS PR)**:

<img width="1470" height="315" alt="image"
src="https://github.com/user-attachments/assets/c24f24e5-a431-42cc-b5b6-232bac5c635e"
/>
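
The padding approach described above can be sketched in a few lines. This is only an illustration, not the code from the patch: the function name `formatTokenSketch` and the column widths 16 and 24 are assumptions chosen for the example (the commit only says the padding was chosen empirically).

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper: pads the token name and the quoted spelling to fixed
// widths, then prints the location before the flags so that long flag lists
// cannot push the other columns out of alignment.
std::string formatTokenSketch(const std::string &name,
                              const std::string &spelling,
                              const std::string &location,
                              const std::string &flags) {
  std::ostringstream os;
  os << std::left << std::setw(16) << name            // assumed name width
     << std::setw(24) << ('\'' + spelling + '\'')     // assumed spelling width
     << location;
  if (!flags.empty())
    os << ' ' << flags;
  return os.str();
}
```

Names or spellings longer than the chosen widths still overflow their column, which is why fixed padding only gives good alignment on average.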
2026-04-03 08:33:36 -04:00
Lakreite
a44c15874d
[AMDGPU][CodeGen] Implement SimplifyDemandedBitsForTargetNode for readfirstlane. (#190009)
Propagate demanded bits through readfirstlane intrinsic in
AMDGPUISelLowering with SimplifyDemandedBitsForTargetNode
implementation.

This allows upstream zero/sign extensions to be eliminated when only a
subset of bits is used after the intrinsic.

Partially addresses #128390.
2026-04-03 14:30:47 +02:00
theRonShark
00aede8f19
Revert "[Clang][OpenMP] Implement Loop splitting #pragma omp split directive " (#190335)
Reverts llvm/llvm-project#183261

15 new lit tests failing in openmp
2026-04-03 12:27:07 +00:00
Simon Pilgrim
15ed4f6c49
[DAG] isKnownToBeAPowerOfTwo - add missing DemandedElts handling to ISD::TRUNCATE and hidden m_Neg pattern (#190190)
Use MaskedVectorIsZero to match X & -X pattern when only DemandedElts
match the negation pattern
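
As background for the `X & -X` pattern: for any unsigned value the expression isolates the lowest set bit, so the result is a power of two (or zero when `X` is zero). A minimal scalar sketch, with hypothetical helper names that are not part of SelectionDAG:

```cpp
#include <cstdint>

// Isolates the lowest set bit of x. Unsigned negation is written as 0u - x
// to make the wrap-around explicit.
uint32_t lowestSetBit(uint32_t x) { return x & (0u - x); }

// True when v has at most one bit set, i.e. v is zero or a power of two.
bool isPowerOfTwoOrZero(uint32_t v) { return (v & (v - 1u)) == 0u; }
```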

Fixes #181654 (properly)
2026-04-03 12:03:33 +00:00
Sergei Barannikov
f1d167123c
[lldb] Return 0 instead of false from a function returning size_t (NFC) (#190334)
2026-04-03 11:32:37 +00:00
Ilia Kuklin
e24936b7ad
[lldb] Fix DIL error diagnostics output (#187680)
* Correctly return the result when used from the console, so that
`DiagnosticsRendering` could use it to output the error.
* Add location pointer to `DILDiagnosticError` internal formatting to
show diagnostics when called from the API.
2026-04-03 16:29:33 +05:00
Arseniy Obolenskiy
03b9c7278e
[SPIR-V] Emit builtin variable OpVariable into entry block (#189958)
2026-04-03 13:18:48 +02:00
Mehdi Amini
c2ec012098
[mlir][linalg] Fix crash in tile_reduction when output map has constant exprs (#189166)
`generateInitialTensorForPartialReduction` and the `getInitSliceInfo*`
helpers unconditionally cast every result expression of the partial
result AffineMap to `AffineDimExpr`. When the original output indexing
map contains a constant (e.g. `affine_map<(d0,d1,d2)->(d0,0,d2)>`), the
constant expression propagates into the partial map and the cast
triggers an assertion.


Fixes #173025

Assisted-by: Claude Code
2026-04-03 11:09:26 +00:00
Matt Arsenault
273e8d85fe
DiagnosticInfo: Fix missing LLVM_LIFETIME_BOUND on Twine arguments (#190331)
Fix use after free errors in DiagnosticInfoResourceLimit uses.
2026-04-03 11:08:00 +00:00
Mehdi Amini
73bcfb6824
[mlir][Affine] Fix LICM incorrectly hoisting stores from zero-trip-count loops (#189165)
The affine-loop-invariant-code-motion pass was hoisting side-effectful
operations (e.g. affine.store) out of loops whose trip count is
statically known to be zero. This caused stores to execute
unconditionally even though the loop body should never run, producing
incorrect results.

The fix skips hoisting of non-memory-effect-free ops when
getConstantTripCount returns 0. Pure/side-effect-free ops are still
eligible for hoisting because they cannot change observable program
state.
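
The rule can be summarised as a small decision function. This is a simplified sketch under stated assumptions, not the pass's code: `TripCountSketch` and `mayHoistSketch` are hypothetical names, and every other legality check the real pass performs is omitted.

```cpp
#include <cstdint>

// Hypothetical stand-in for the result of a constant trip count query.
struct TripCountSketch {
  bool isConstant; // true when the trip count is statically known
  uint64_t value;  // meaningful only when isConstant is true
};

// Decides whether a loop-invariant op may leave the loop body.
bool mayHoistSketch(bool isLoopInvariant, bool isMemoryEffectFree,
                    TripCountSketch tripCount) {
  if (!isLoopInvariant)
    return false;
  // Pure ops cannot change observable state, so hoisting them is harmless
  // even when the loop body never runs.
  if (isMemoryEffectFree)
    return true;
  // Side-effecting ops must stay inside a loop known to run zero times.
  return !(tripCount.isConstant && tripCount.value == 0);
}
```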

Fixes #128273

Assisted-by: Claude Code
2026-04-03 13:07:26 +02:00
Ryotaro Kasuga
9e516f5c58
[MachinePipeliner] Remove isLoopCarriedDep and use DDG (#174394)
This patch completely removes `isLoopCarriedDep`, which was used
previously to identify loop-carried dependencies in the DAG. Now that we
have the DDG representation, this special handling is no longer
necessary. Simply replacing its usage with the DDG causes several tests
to fail, since cycle detection takes some of the validation-only edges
in the DDG into account. To address this, this patch introduces extra
edges in the DDG, which are used only for cycle detection and not for
other parts of the pass (e.g., scheduling). The extra edges are
determined to preserve the existing behavior of the pass as closely as
possible, which makes the predicates for adding them somewhat complex.

Split off from #135148, and the final patch in the series for #135148
2026-04-03 10:36:34 +00:00
Robert Imschweiler
a2d3783b45
[offload][libc] Adapt test to changes in #190239 (#190330)
2026-04-03 12:03:28 +02:00
Mehdi Amini
ff86be21de
[MLIR][MemRef] Fix AllocOp/AllocaOp flattening domination violation (#188980)
The generic MemRefRewritePattern handles AllocOp/AllocaOp by calling
getFlattenMemrefAndOffset with the op's own result as the source memref.
This inserts ExtractStridedMetadataOp and ReinterpretCastOp that consume
op.result before the alloc op itself in the block. After
replaceOpWithNewOp, op.result is RAUW'd to the new ReinterpretCastOp
result, leaving those earlier ops with forward references — a domination
violation caught by MLIR_ENABLE_EXPENSIVE_PATTERN_API_CHECKS.

Replace the AllocOp/AllocaOp cases in MemRefRewritePattern with a
dedicated AllocLikeFlattenPattern that never touches op.result until the
final replaceOpWithNewOp:
- sizes come from op.getMixedSizes() (operands, not the result)
- strides come from getStridesAndOffset on the MemRefType
- the flat allocation size is computed via
getLinearizedMemRefOffsetAndSize plus the static base offset so the
buffer covers [0, offset+extent)
- castAllocResult is simplified to take the pre-computed sizes and
strides rather than inserting an ExtractStridedMetadataOp on the
original op
- non-zero static base offsets are now correctly preserved in the
reinterpret_cast (the old code hardcoded offset=0, which was a verifier
error for layouts with offset != 0)
- dynamic offsets or strides bail out via notifyMatchFailure

Also remove the now-dead AllocOp/AllocaOp branches from replaceOp() and
the constexpr specialisation in getIndices().

Assisted-by: Claude Code
2026-04-03 11:21:00 +02:00
Harald van Dijk
7c1d91c435
[BOLT] Move extern "C" out of unnamed namespace (#190282)
GCC 15 changes how it interprets extern "C" in unnamed namespaces and
gives the variable internal linkage.
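
A minimal sketch of the shape of the fix, using a made-up symbol name (`sketch_instrumentation_counter` is not a BOLT symbol): the `extern "C"` declaration lives at namespace scope, and only the helpers that use it stay in the unnamed namespace.

```cpp
// Declared outside the unnamed namespace so the symbol keeps external
// C linkage. Per the commit message, GCC 15 gives the variable internal
// linkage when the extern "C" declaration sits inside `namespace { ... }`.
extern "C" {
int sketch_instrumentation_counter = 0; // hypothetical symbol
}

namespace {
// Helpers that are private to this translation unit can remain here.
int bumpCounter() { return ++sketch_instrumentation_counter; }
} // namespace
```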
2026-04-03 09:51:55 +01:00
Mehdi Amini
d725513e7d
[MLIR][Affine] Fix null operands in simplifyConstrainedMinMaxOp (#189246)
`mlir::affine::simplifyConstrainedMinMaxOp` called
`canonicalizeMapAndOperands` with `newOperands` that could contain null
`Value()`s. These nulls came from
`unpackOptionalValues(constraints.getMaybeValues(), newOperands)` where
internal constraint variables added by `appendDimVar` (for `dimOp`,
`dimOpBound`, and `resultDimStart*`) have no associated SSA values.

Passing null Values to `canonicalizeMapAndOperands` risks undefined
behavior:
- `seenDims.find(null_value)` in the DenseMap causes all null operands
to collide at the same key, producing incorrect dim remapping.
- Any null operand that remains referenced in the result map would
propagate as a null Value into `AffineValueMap`, crashing callers that
try to use those operands to create ops.

Fix: Before calling `canonicalizeMapAndOperands`, filter null operands
from `newOperands` by replacing their dim/symbol positions in `newMap`
with constant 0 (safe because internal constraint dims should not appear
in the bound map expression) and compacting `newOperands` to contain
only non-null Values.
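
The compaction step can be illustrated without any MLIR types. In this sketch, which is an assumption-laden stand-in rather than the actual fix, operands are plain name pointers, `nullptr` models a constraint variable with no SSA value, and `compactOperands` is a hypothetical helper. A position mapped to `-1` is one whose dim/symbol would be replaced by constant 0 in the map.

```cpp
#include <cstddef>
#include <vector>

struct CompactedOperands {
  std::vector<const char *> operands; // non-null operands only, order kept
  std::vector<int> newPosition;       // old index -> new index, or -1
};

// Drops null operands and records where each surviving operand moved.
CompactedOperands compactOperands(const std::vector<const char *> &operands) {
  CompactedOperands result;
  result.newPosition.reserve(operands.size());
  for (const char *operand : operands) {
    if (operand == nullptr) {
      // No SSA value: the caller substitutes constant 0 for this position.
      result.newPosition.push_back(-1);
      continue;
    }
    result.newPosition.push_back(static_cast<int>(result.operands.size()));
    result.operands.push_back(operand);
  }
  return result;
}
```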

Fixes #127436

Assisted-by: Claude Code
2026-04-03 10:17:50 +02:00
Zhewen Yu
a7bf24919f
[mlir][IntRangeAnalysis] Fix assertion in inferAffineExpr for mod with range crossing modulus boundary (#188842)
The "small range with constant divisor" optimization in
`inferAffineExpr` for `AffineExprKind::Mod` assumed that if the dividend
range span (`lhsMax - lhsMin`) is less than the divisor, then the mod
results form a contiguous range. This is not always true, as the range
can straddle a modulus boundary.

For example, `[14, 17] mod 8`:
- Span is 3 < 8, so the old condition passed
- But `14%8=6` and `17%8=1` (wraps at 16)
- `umin=6, umax=1` → assertion `umin.ule(umax)` fails

The fix adds a same-quotient check (`lhsMin/rhs == lhsMax/rhs`) to
ensure both endpoints fall within the same modular period. When they
don't, we fall back to the conservative `[0, divisor-1]` range.
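
The check can be reproduced on plain integers. The sketch below is not the MLIR implementation (which works on `APInt` ranges); `modRangeSketch` is a hypothetical helper that assumes unsigned values, `lhsMin <= lhsMax`, and a nonzero divisor.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

// Returns an inclusive [min, max] bound for x % rhs with x in [lhsMin, lhsMax].
std::pair<uint64_t, uint64_t> modRangeSketch(uint64_t lhsMin, uint64_t lhsMax,
                                             uint64_t rhs) {
  assert(rhs != 0 && lhsMin <= lhsMax);
  // Same quotient: both endpoints lie in one modular period, so the
  // remainders form a contiguous, ordered range.
  if (lhsMin / rhs == lhsMax / rhs)
    return {lhsMin % rhs, lhsMax % rhs};
  // The range straddles a multiple of rhs: fall back to the full range.
  return {0, rhs - 1};
}
```

With the example from the message, `modRangeSketch(14, 17, 8)` takes the fallback path because `14/8 == 1` but `17/8 == 2`.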

Assisted-by: Cursor (Claude)

Signed-off-by: Yu-Zhewen <zhewenyu@amd.com>
2026-04-03 10:15:52 +02:00
Donát Nagy
c80443cd37
[NFC][analyzer] Eliminate SwitchNodeBuilder (#188096)
This commit removes the class `SwitchNodeBuilder` because it just
obscured the logic of switch handling by hiding some parts of it in
another source file.
2026-04-03 09:46:06 +02:00
David Green
e46c5a831e
[AArch64] Regenerate arm64-stur.ll. NFC (#190317)
2026-04-03 08:27:29 +01:00
Michael Buch
f91124a55b
[lldb][Module] Only call LoadScriptingResourceInTarget via ModuleList (#190136)
This patch is motivated by
https://github.com/llvm/llvm-project/pull/189943, where we would like to
print the "these module scripts weren't loaded" warning for *all*
modules batched together. I.e., we want to print the warning *after* all
the script loading attempts, not from within each attempt.

To do so we want to hoist the `ReportWarning` calls in
`Module::LoadScriptingResourceInTarget` out into the callsites. But if
we do that, the callers have to remember to print the warnings. To avoid
this, we redirect all callsites to use
`ModuleList::LoadScriptingResourceInTarget`, which will be responsible
for printing the warnings.

To avoid future accidental uses of
`Module::LoadScriptingResourceInTarget` I moved the API into
`ModuleList` and made it `private`.
2026-04-03 07:03:11 +00:00
lonely eagle
8db1f6492a
[mlir][reducer] Remove the restriction that OptReductionPass must be a ModuleOp (#189038)
This PR aims to make the pass more generic by removing the ModuleOp
restriction. This PR reimplements the logic using a standalone
PassManager. Additionally, the isInteresting method has been updated to
accept Operation* for better flexibility. Finally, a dedicated test
directory has been added to improve the organization of OptReductionPass
tests.
2026-04-03 14:49:01 +08:00
michaelselehov
df48719df3
[AMDGPU] Add !noalias metadata to mem-accessing calls w/o pointer args (#188949)
addAliasScopeMetadata in AMDGPULowerKernelArguments skips instructions
with empty PtrArgs, including memory-accessing calls that have no
pointer arguments (e.g. builtins like threadIdx()). Because these calls
never receive !noalias metadata, ScopedNoAliasAA cannot prove they don't
alias noalias kernel arguments. MemorySSA then conservatively reports
them as clobbers, which prevents AMDGPUAnnotateUniformValues from
marking loads as noclobber, blocking scalarization (s_load) and forcing
expensive vector loads (global_load) instead.

Fix by adding all noalias kernel argument scopes to !noalias metadata
for memory-accessing instructions with no pointer arguments. Since such
instructions cannot access memory through any kernel pointer argument,
all noalias scopes are safe to apply.

This fixes a performance regression in rocFFT introduced by bd9668df0f00
("[AMDGPU] Propagate alias information in AMDGPULowerKernelArguments").

Assisted-by: Claude Opus
2026-04-03 08:41:05 +02:00
Ramkumar Ramachandra
e09d1e3ff1
[VPlan] Use not_equal_to to improve code (NFC) (#190262)
2026-04-03 07:32:34 +01:00
Paul Kirth
a52a504e69
[clang-doc] Prepare Info types for Arena allocation (#190046)
To allocate Info structures directly in an Arena, they cannot have
members with nontrivial destructors, or we will leak memory. Before we
migrate them, we can replace growable vector types with intrusive lists.

This introduces some slight overhead as these types now have new pointer
members for use in ilists in later patches.
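
To illustrate the constraint, here is a toy arena that refuses types with nontrivial destructors, plus an intrusive singly linked list whose link pointer lives in the node itself. All names (`ArenaSketch`, `ParamNode`, `FunctionInfoSketch`) are hypothetical, and the arena makes one heap allocation per object for brevity, which a real bump allocator would not do.

```cpp
#include <memory>
#include <new>
#include <type_traits>
#include <utility>
#include <vector>

class ArenaSketch {
  std::vector<std::unique_ptr<unsigned char[]>> blocks;

public:
  template <typename T, typename... Args> T *create(Args &&...args) {
    // The arena never runs destructors, so a member such as std::vector
    // would leak its heap buffer. Reject such types at compile time.
    static_assert(std::is_trivially_destructible<T>::value,
                  "arena-allocated types must be trivially destructible");
    blocks.push_back(std::make_unique<unsigned char[]>(sizeof(T)));
    return new (blocks.back().get()) T{std::forward<Args>(args)...};
  }
};

// Intrusive list node: the extra `next` pointer is the per-object overhead.
struct ParamNode {
  int id;
  ParamNode *next = nullptr;
};

struct FunctionInfoSketch {
  ParamNode *params = nullptr; // list head replaces a growable vector
};

void pushFront(FunctionInfoSketch &info, ParamNode *node) {
  node->next = info.params;
  info.params = node;
}

int countParams(const FunctionInfoSketch &info) {
  int n = 0;
  for (const ParamNode *p = info.params; p != nullptr; p = p->next)
    ++n;
  return n;
}
```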

| Metric | Baseline | Prev | This | Culm% | Seq% |
| :--- | :--- | :--- | :--- | :--- | :--- |
| Time | 920.5s | 1005.7s | 1010.5s | +9.8% | +0.5% |
| Memory | 86.0G | 42.1G | 42.9G | -50.2% | +1.8% |

| Benchmark | Baseline | Prev | This | Culm% | Seq% |
| :--- | :--- | :--- | :--- | :--- | :--- |
| BM_BitcodeReader_Scale/10 | 67.9us | 68.6us | 69.2us | +1.9% | +0.9% |
| BM_BitcodeReader_Scale/10000 | 70.5ms | 21.3ms | 21.9ms | -68.9% | +2.8% |
| BM_BitcodeReader_Scale/4096 | 23.2ms | 4.6ms | 4.6ms | -80.0% | +0.8% |
| BM_BitcodeReader_Scale/512 | 509.4us | 546.3us | 541.8us | +6.4% | -0.8% |
| BM_BitcodeReader_Scale/64 | 114.8us | 117.9us | 117.6us | +2.5% | -0.2% |
| BM_EmitInfoFunction | 1.6us | 1.5us | 1.6us | -1.9% | +3.9% |
| BM_Index_Insertion/10 | 2.3us | 3.9us | 4.0us | +75.3% | +3.0% |
| BM_Index_Insertion/10000 | 3.1ms | 5.3ms | 5.4ms | +72.7% | +2.4% |
| BM_Index_Insertion/4096 | 1.3ms | 2.1ms | 2.1ms | +67.1% | +1.8% |
| BM_Index_Insertion/512 | 153.6us | 253.0us | 259.0us | +68.6% | +2.4% |
| BM_Index_Insertion/64 | 18.1us | 30.1us | 30.3us | +67.8% | +0.5% |
| BM_JSONGenerator_Scale/10 | 36.8us | 37.0us | 38.2us | +3.6% | +3.2% |
| BM_JSONGenerator_Scale/10000 | 89.6ms | 91.7ms | 90.7ms | +1.2% | -1.1% |
| BM_JSONGenerator_Scale/4096 | 33.7ms | 35.1ms | 34.7ms | +2.9% | -1.1% |
| BM_JSONGenerator_Scale/512 | 1.9ms | 1.9ms | 2.0ms | +3.9% | +4.0% |
| BM_JSONGenerator_Scale/64 | 222.4us | 223.3us | 230.1us | +3.5% | +3.1% |
| BM_Mapper_Scale/10000 | 104.3ms | 105.6ms | 100.9ms | -3.2% | -4.4% |
| BM_Mapper_Scale/4096 | 44.3ms | 44.8ms | 42.8ms | -3.5% | -4.4% |
| BM_Mapper_Scale/512 | 7.6ms | 7.6ms | 7.4ms | -2.6% | -3.2% |
| BM_Mapper_Scale/64 | 3.1ms | 3.0ms | 3.0ms | -2.0% | -1.3% |
| BM_MergeInfos_Scale/10000 | 12.2ms | 1.4ms | 1.6ms | -86.7% | +12.5% |
| BM_MergeInfos_Scale/2 | 1.9us | 1.7us | 1.7us | -10.2% | -1.9% |
| BM_MergeInfos_Scale/4096 | 2.8ms | 487.3us | 503.4us | -81.9% | +3.3% |
| BM_MergeInfos_Scale/512 | 68.9us | 38.7us | 38.1us | -44.6% | -1.4% |
| BM_MergeInfos_Scale/64 | 10.3us | 6.4us | 6.4us | -37.6% | -0.4% |
| BM_MergeInfos_Scale/8 | 2.8us | 2.2us | 2.2us | -21.7% | -1.5% |
| BM_SerializeFunctionInfo | 25.5us | 25.9us | 26.0us | +1.9% | +0.4% |
2026-04-03 06:02:32 +00:00
Paul Kirth
4b2623d03c
[clang-doc] Introduce TransientArena for short lived allocations (#190045)
With strings interned, we can move the StringRefs in various Info
structs into a new short lived arena. This change migrates the remaining
SmallVectors in CommentInfo to use an ArrayRef backed by the new
transient arena.

This results in further minor reductions in overall memory usage, but no
significant effect on runtime performance.

| Metric | Baseline | Prev | This | Culm% | Seq% |
| :--- | :--- | :--- | :--- | :--- | :--- |
| Time | 920.5s | 1011.0s | 1005.7s | +9.2% | -0.5% |
| Memory | 86.0G | 44.9G | 42.1G | -51.0% | -6.2% |

| Benchmark | Baseline | Prev | This | Culm% | Seq% |
| :--- | :--- | :--- | :--- | :--- | :--- |
| BM_BitcodeReader_Scale/10 | 67.9us | 70.0us | 68.6us | +1.0% | -2.0% |
| BM_BitcodeReader_Scale/10000 | 70.5ms | 21.3ms | 21.3ms | -69.8% | -0.0% |
| BM_BitcodeReader_Scale/4096 | 23.2ms | 4.5ms | 4.6ms | -80.2% | +2.8% |
| BM_BitcodeReader_Scale/512 | 509.4us | 538.8us | 546.3us | +7.3% | +1.4% |
| BM_BitcodeReader_Scale/64 | 114.8us | 118.0us | 117.9us | +2.7% | -0.1% |
| BM_EmitInfoFunction | 1.6us | 1.6us | 1.5us | -5.5% | -6.2% |
| BM_Index_Insertion/10 | 2.3us | 4.0us | 3.9us | +70.3% | -0.7% |
| BM_Index_Insertion/10000 | 3.1ms | 5.0ms | 5.3ms | +68.6% | +5.0% |
| BM_Index_Insertion/4096 | 1.3ms | 2.0ms | 2.1ms | +64.2% | +4.5% |
| BM_Index_Insertion/512 | 153.6us | 245.0us | 253.0us | +64.8% | +3.2% |
| BM_Index_Insertion/64 | 18.1us | 28.9us | 30.1us | +67.0% | +4.4% |
| BM_JSONGenerator_Scale/10 | 36.8us | 36.4us | 37.0us | +0.4% | +1.7% |
| BM_JSONGenerator_Scale/10000 | 89.6ms | 90.4ms | 91.7ms | +2.3% | +1.5% |
| BM_JSONGenerator_Scale/4096 | 33.7ms | 34.0ms | 35.1ms | +4.0% | +3.0% |
| BM_JSONGenerator_Scale/64 | 222.4us | 220.5us | 223.3us | +0.4% | +1.3% |
| BM_Mapper_Scale/10000 | 104.3ms | 105.4ms | 105.6ms | +1.3% | +0.3% |
| BM_Mapper_Scale/4096 | 44.3ms | 44.7ms | 44.8ms | +1.0% | +0.1% |
| BM_Mapper_Scale/512 | 7.6ms | 7.7ms | 7.6ms | +0.7% | -1.2% |
| BM_MergeInfos_Scale/10000 | 12.2ms | 1.4ms | 1.4ms | -88.2% | +0.1% |
| BM_MergeInfos_Scale/2 | 1.9us | 1.7us | 1.7us | -8.5% | +2.1% |
| BM_MergeInfos_Scale/4096 | 2.8ms | 495.6us | 487.3us | -82.5% | -1.7% |
| BM_MergeInfos_Scale/512 | 68.9us | 34.6us | 38.7us | -43.9% | +11.6% |
| BM_MergeInfos_Scale/64 | 10.3us | 6.0us | 6.4us | -37.4% | +7.2% |
| BM_MergeInfos_Scale/8 | 2.8us | 2.1us | 2.2us | -20.6% | +5.1% |
| BM_SerializeFunctionInfo | 25.5us | 26.8us | 25.9us | +1.4% | -3.3% |
2026-04-03 05:46:25 +00:00
Amit Tiwari
1972cf64fd
[Clang][OpenMP] Implement Loop splitting #pragma omp split directive (#183261)
OpenMP 6.0 Loop-splitting directive `#pragma omp split` construct with `counts`
clause
2026-04-03 10:42:31 +05:30
Fangrui Song
2f7bd4fa97
[ELF] Enable parallel relocation scanning for -z nocombreloc and PPC64 (#190309)
The `bool serial` condition in scanRelocations disabled parallelism for
three cases: -z nocombreloc, MIPS, and PPC64. Resolve two cases:

- nocombreloc: .rela.dyn is now always created with combreloc=true so
  non-relative relocations are sorted deterministically. Since
  #187964 already separates relative relocations unconditionally,
  the only remaining effect of -z nocombreloc is suppressing
  DT_RELACOUNT (gated on ctx.arg.zCombreloc in DynamicSection).

- PPC64: After #181496 moved scanning into scanSectionImpl, the
  sole thread-unsafe access is ctx.ppc64noTocRelax (DenseSet::insert).
  Protect it with ctx.relocMutex, which is already used for rare
  operations during parallel scanning.

MIPS retains serial scanning due to `MipsGotSection` mutations.
2026-04-02 22:00:15 -07:00
Weibo He
bc11c85b6b
[clang][CodeGen] Emit coro.dead intrinsic to improve coroutine allocation elision (#190295)
Part 4/4: Implement HALO for coroutines that flow off final suspend.
Parent PR: #185336
2026-04-03 02:06:10 +00:00
Jackson Stogel
8b903fe38b
[clang][DebugInfo][test] Set -fuse-lld for test matching linker invocation. (#190291)
This test doesn't work as intended when an alternative default linker is
specified via `-DCLANG_DEFAULT_LINKER=ld`. If this test isn't intended
to support an alternate default linker, let me know and I can just change
the downstream usage I'm seeing, though I figure other folks may have
similar configurations. Repro:

```
cmake -S llvm -B build -DLLVM_ENABLE_PROJECTS="clang" -DCLANG_DEFAULT_LINKER=ld -GNinja
ninja -C build
./build/bin/llvm-lit -v clang/test/DebugInfo/CXX/hotpatch.cpp

...

possible intended match
# |             6:  "/usr/bin/ld" "-out:hotpatch.exe" "-libpath:lib/amd64" "-libpath:atlmfc/lib/amd64" "-nologo" "-functionpadmin" "/tmp/lit-tmp-o7x0r1o_/hotpatch-4595de.obj" 
```

As far as I can tell, it passed before because `-mincremental-linker-compatible` was
being used until e97a42d5f9fe51de50aabd4d9bf6874a4955f9fa, which would
match on the compilation line.
2026-04-02 19:03:27 -07:00
Tamir Duberstein
72b00e60b8
[CMake] Version Darwin dylib identities (#189004)
2026-04-02 18:35:50 -07:00
Stanislav Mekhanoshin
7084f18f27
[AMDGPU] Fix i16/i8 flat store in true16 with sramecc (#190238)
The pattern was guarded by the D16PreservesUnusedBits predicate
which is not needed for stores.
2026-04-02 17:32:50 -07:00
Peter Collingbourne
935f21e1d5
gn build: Port d8e9e0af1cb6
Reviewers: 

Pull Request: https://github.com/llvm/llvm-project/pull/190290
2026-04-02 17:13:02 -07:00
Peter Collingbourne
f20b40ef97
gn build: Port f63d33da0a51 more
Reviewers: 

Pull Request: https://github.com/llvm/llvm-project/pull/190289
2026-04-02 17:08:29 -07:00
Benjamin Maxwell
89f31f26c8
[AArch64][SME] Preserve ZA in agnostic ZA functions without +sme (#190141)
`__arm_agnostic("sme_za_state")` does not require +sme, but we must
still preserve ZA in case the function is used with code that makes use
of ZA:

> The use of `__arm_agnostic("sme_za_state")` allows writing functions
> that are compatible with ZA state without having to share ZA state
> with the caller, as required by `__arm_preserves`. The use of this
> attribute does not imply that SME is available.
2026-04-03 00:33:28 +01:00
Jason Molenda
124b0a8fbb
[lldb][kernel debug] Add a missing call to scan local fs for kexts (#190281)
A kernel developer noticed that I missed a call to index the local
filesystem in one of our codepaths, and had a use case that depended on
that working.

rdar://173814556
2026-04-02 16:33:00 -07:00
Michael Kruse
1f75f318ae
[Runtimes] Gracefully handle invalid LLVM_TARGET_TRIPLE (#190284)
In some situations such as reported at
https://github.com/llvm/llvm-project/pull/177953#issuecomment-4179014239,
LLVM_(DEFAULT_)TARGET_TRIPLE is not set. It is used to derive the output
directory in #177953. Only flang-rt currently uses
RUNTIMES_(INSTALL|OUTPUT)_RESOURCE_LIB_PATH, so we should not fail building
the other runtimes despite a missing LLVM_TARGET_TRIPLE.

Compiler-rt uses COMPILER_RT_DEFAULT_TARGET_TRIPLE instead which it
derives itself. Most other LLVM runtime libraries just skip the target
portion of the library path (explicitly so since #93354). Do the same
for RUNTIMES_(INSTALL|OUTPUT)_RESOURCE_LIB_PATH which we hope eventually
can replace the other mechanisms.
2026-04-02 23:32:42 +00:00
Paul Kirth
2600533a66
[clang-doc] Switch to string internment (#190044)
This is the first step in migrating all the Info types to be POD. We
introduced a shared string saver that can be used safely across threads,
and updated the internal representation of various data types to use
these over owned strings, like SmallString or std::string.
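
For readers unfamiliar with interning, a minimal thread-safe interner might look like the sketch below. It is not the clang-doc string saver: `StringInternerSketch` is a hypothetical class, and it uses `std::unordered_set` with a mutex purely for illustration. It relies on the standard guarantee that references to `unordered_set` elements stay valid across rehashing.

```cpp
#include <mutex>
#include <string>
#include <unordered_set>

class StringInternerSketch {
  std::mutex mutex;
  std::unordered_set<std::string> pool;

public:
  // Returns a pointer to the single stored copy of `text`. The pointer stays
  // valid for the interner's lifetime, so callers can keep a cheap
  // non-owning handle instead of an owned std::string.
  const std::string *intern(const std::string &text) {
    std::lock_guard<std::mutex> lock(mutex);
    return &*pool.insert(text).first;
  }
};
```

Equal strings map to the same pointer, so each distinct string is stored once no matter how many records refer to it; the lock taken on every lookup is one plausible source of runtime overhead in a design like this.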

This also required changes to YAMLGenerator to keep the single quoted
string formatting and to update the YAML traits.

This change gives an almost 50% reduction in peak memory when building
documentation for clang, at about a 10% performance loss. Future patches
can mitigate the performance penalties, and further reduce memory use.

| Metric | Baseline | Prev | This | Culm% | Seq% |
| :--- | :--- | :--- | :--- | :--- | :--- |
| Time | 920.5s | 920.5s | 1011.0s | +9.8% | +9.8% |
| Memory | 86.0G | 86.0G | 44.9G | -47.8% | -47.8% |

| Benchmark | Baseline | Prev | This | Culm% | Seq% |
| :--- | :--- | :--- | :--- | :--- | :--- |
| BM_BitcodeReader_Scale/10 | 67.9us | 67.9us | 70.0us | +3.0% | +3.0% |
| BM_BitcodeReader_Scale/10000 | 70.5ms | 70.5ms | 21.3ms | -69.8% | -69.8% |
| BM_BitcodeReader_Scale/4096 | 23.2ms | 23.2ms | 4.5ms | -80.7% | -80.7% |
| BM_BitcodeReader_Scale/512 | 509.4us | 509.4us | 538.8us | +5.8% | +5.8% |
| BM_BitcodeReader_Scale/64 | 114.8us | 114.8us | 118.0us | +2.8% | +2.8% |
| BM_Index_Insertion/10 | 2.3us | 2.3us | 4.0us | +71.6% | +71.6% |
| BM_Index_Insertion/10000 | 3.1ms | 3.1ms | 5.0ms | +60.6% | +60.6% |
| BM_Index_Insertion/4096 | 1.3ms | 1.3ms | 2.0ms | +57.1% | +57.1% |
| BM_Index_Insertion/512 | 153.6us | 153.6us | 245.0us | +59.6% | +59.6% |
| BM_Index_Insertion/64 | 18.1us | 18.1us | 28.9us | +60.0% | +60.0% |
| BM_JSONGenerator_Scale/10 | 36.8us | 36.8us | 36.4us | -1.3% | -1.3% |
| BM_Mapper_Scale/10000 | 104.3ms | 104.3ms | 105.4ms | +1.0% | +1.0% |
| BM_Mapper_Scale/512 | 7.6ms | 7.6ms | 7.7ms | +1.9% | +1.9% |
| BM_MergeInfos_Scale/10000 | 12.2ms | 12.2ms | 1.4ms | -88.2% | -88.2% |
| BM_MergeInfos_Scale/2 | 1.9us | 1.9us | 1.7us | -10.3% | -10.3% |
| BM_MergeInfos_Scale/4096 | 2.8ms | 2.8ms | 495.6us | -82.2% | -82.2% |
| BM_MergeInfos_Scale/512 | 68.9us | 68.9us | 34.6us | -49.7% | -49.7% |
| BM_MergeInfos_Scale/64 | 10.3us | 10.3us | 6.0us | -41.6% | -41.6% |
| BM_MergeInfos_Scale/8 | 2.8us | 2.8us | 2.1us | -24.4% | -24.4% |
| BM_SerializeFunctionInfo | 25.5us | 25.5us | 26.8us | +4.9% | +4.9% |

note: I used an LLM to help generate the test code adjustments and the
YAML traits.
2026-04-02 23:25:21 +00:00
hjagasiaAMD
a76750e6de
Revert "[SimplifyCFG] Extend jump-threading to allow live local defs … (#190269)
…(#135079)"

This reverts commit a757f23404c594f4a48b4ddb6625f88b349d11d5. The commit
causes compilation of the reduce.cu file in hipcub/warp to go from 2 minutes
to several hours.
2026-04-02 23:05:26 +00:00
Jackson Stogel
9d18702cd8
[Bazel] Port b1ef47f45966f06f263dc96d83c869393952cbf8 (#190278)
2026-04-02 16:01:55 -07:00
Dave Lee
8fa4b3d601
[lldb] Simplify some tests to run_to_source_breakpoint (NFC) (#190082)
Many tests have ad hoc forms of the launch & break steps done by
`lldbutil.run_to_source_breakpoint`. This changes some of those tests to
use `run_to_source_breakpoint` instead.

Assisted-by: claude
2026-04-02 15:20:49 -07:00
Andy Kaylor
6c4149dba7
[CIR] Fix handling of catch-all with cleanups (#190233)
We had a bug where exceptions caught with catch-all were not properly
handling a thrown exception if the catch-all handler enclosed a cleanup
handler. The structured CIR was generated correctly, but when we
flattened the CFG and introduced cir.eh.initiate operations, the
cir.eh.initiate for the cleanup's EH path was incorrectly marked as
cleanup-only, even though it chained to the dispatch for the catch-all
handler. This resulted in the landing pad generated for the cleanup not
being marked as having a catch-all handler, so the exception was not
caught.

This change fixes the problem in the FlattenCFG pass.

Assisted-by: Cursor / claude-4.6-opus-high
2026-04-02 15:02:05 -07:00
forking-google-bazel-bot[bot]
842464e7a9
[Bazel] Fixes 71122d8 (#190264)
This fixes 71122d8694cad3ae4450368be3e89bb62aa78173.

Co-authored-by: Google Bazel Bot <google-bazel-bot@google.com>
2026-04-02 14:44:01 -07:00
Deric C.
72cc5a670e
[HLSL] Add TableGen-generated header files to the HLSL distribution (#190222)
This PR adds the TableGen-generated headers from
https://github.com/llvm/llvm-project/pull/187610 to the HLSL
distribution.

Currently the HLSL distribution is incomplete due to missing these
generated headers, preventing successful compilation:
```
Command Output (stderr):
--
In file included from <built-in>:1:

In file included from D:\a\_work\1\ClangHLSL\Binaries\lib\clang\23\include\hlsl.h:24:

D:\a\_work\1\ClangHLSL\Binaries\lib\clang\23\include\hlsl/hlsl_alias_intrinsics.h:42:10: fatal error: 'hlsl_alias_intrinsics_gen.inc' file not found

   42 | #include "hlsl_alias_intrinsics_gen.inc"

      |          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

1 error generated.
```
This PR fixes the error by including `hlsl_alias_intrinsics_gen.inc` and
`hlsl_inline_intrinsics_gen.inc` in the HLSL distribution.
2026-04-02 14:38:58 -07:00
Michael Spencer
b1ef47f459
[libclang] Add clang_ModuleCache_prune (#190067)
This allows a build system to direct Clang to prune a module cache
directory using the same method Clang does internally.

This also changes `clang::maybePruneImpl` to clean up files directly in
the directory, not just subdirectories.
2026-04-02 14:26:27 -07:00
David Green
bf50e847fb
[AArch64] Add tests for st1 from subvector extracts. NFC (#190265)
2026-04-02 22:20:53 +01:00
Roland McGrath
71122d8694
[libc] Move LLVM_LIBC_IS_DEFINED macro to its own header (#190081)
This moves the LLVM_LIBC_IS_DEFINED macro to its own header in
__support/macros.  Its implementation leverages cpp::string_view
instead of rolling its own strcmp; this necessitated fixing
several missing constexpr in the string_view implementation.

The new __support/macros/macro-utils.h is also broken out to hold
the stringification macro and can be used in future for token
pasting shenanigans and other such generic macro machinery.
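
The general trick behind an "is defined" macro of this kind can be sketched as follows. This is an illustration only, not the contents of the new header: the `SKETCH_*` names are made up, and `std::string_view` (C++17) stands in for `cpp::string_view`. A defined macro expands before stringification, so its stringified expansion differs from its own name; an undefined one stringifies to its name unchanged.

```cpp
#include <string_view>

// Two-level stringification: the argument is expanded before `#` is applied.
#define SKETCH_STR_IMPL(x) #x
#define SKETCH_STR(x) SKETCH_STR_IMPL(x)

// Compares the stringified expansion against the unexpanded name.
// Limitation of this sketch: a macro defined as its own name looks undefined.
#define SKETCH_IS_DEFINED(macro)                                               \
  (std::string_view(SKETCH_STR(macro)) != std::string_view(#macro))

#define SKETCH_ENABLED 1 // example macro; SKETCH_ABSENT is left undefined
```

Because the comparison is on string literals, the result can be used in an ordinary `if` or `if constexpr` rather than in `#ifdef`, which keeps both branches visible to the compiler.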
2026-04-02 14:13:51 -07:00