574909 Commits

Author SHA1 Message Date
Bill Wendling
9d3079a7a9
[NFC][CodeGen] Prepare for expansion of InlineAsmPrepare (#189469)
Move some functions around so that the CallBrInst processing is
contained. The 'static' functions don't need to be declared at the top;
just place them before the calls. Fix the naming to use lower-case for
the first letter of function names.
2026-03-30 20:54:00 +00:00
Andy Kaylor
a0ffdf2850
[CIR] Allow replacement of a structor declaration with an alias (#188320)
We had an errorNYI diagnostic to trigger when we generated an alias for
a ctor or dtor that had an existing declaration. Because functions are
used via flat symbol references, all that is needed is to erase the old
declaration. This change does that.
2026-03-30 13:51:29 -07:00
Andy Kaylor
f7329189c0
[CIR] Handle throwing calls inside EH cleanup (#188341)
This implements handling for throwing calls inside an EH cleanup
handler. When such a call occurs, the CFG flattening pass replaces it
with a cir.try_call op that unwinds to a terminate block.

A new CIR operation, cir.eh.terminate, is added to facilitate this
handling, and the design document is updated to describe the new
behavior.

Assisted-by: Cursor / claude-4.6-opus-high
2026-03-30 13:44:11 -07:00
Berke Ates
b6e4d27c48
[MLIR][Mem2Reg] Extract shared utilities for PromotableRegionOpInterface (#188514)
The `PromotableRegionOpInterface` implementations use two helpers that
are likely useful for other dialects implementing this interface as
well:
- `updateTerminator`: Appends the reaching definition as an operand to a
block's terminator, falling back to a default when the block has no
entry (e.g. dead code).
- `replaceWithNewResults`: Clones an operation with additional result
types while preserving its regions, then replaces the original.

This PR extracts them into a common utility header so that downstream
dialects can reuse them directly.
I'm open to discussion about the location of these utilities.
2026-03-30 22:20:39 +02:00
Alexey Merzlyakov
06725d7ef5
[GISel] Keep non-negative info in SUB(CTLZ) (#189314)
Implement non-negative value tracking for SUB-CTLZ chains in GlobalISel,
matching the behavior previously added to SelectionDAG.

Additionally, refactor the SelectionDAG implementation from the previous
patch to improve performance and code density.

Related to https://github.com/llvm/llvm-project/issues/136516 and
https://github.com/llvm/llvm-project/pull/186338#discussion_r2980420174
2026-03-30 22:10:47 +02:00
Alexey Bataev
26e0d15eaa
[SLP] Prefer to trim equal-cost alternate-shuffle subtrees
If the trimming candidate subtree is rooted at an alternate-shuffle node
with binary ops, and this subtree has the same cost as the buildvector
node, it is better to stick with the buildvector node to avoid runtime
perf regressions from shuffle/extra-operation overhead that the cost
model may underestimate. Skip trimming if the subtree contains
ExtractElement nodes, since those operate on already-materialized
vectors, which may reduce vector-to-scalar code movement and give better
perf.

Reviewers: hiraditya, bababuck, fhahn, RKSimon

Pull Request: https://github.com/llvm/llvm-project/pull/188272
2026-03-30 16:03:18 -04:00
Ehsan Amiri
804ece6a4f
[DA] Require nsw for AddRecs in the WeakCrossing SIV test (#185041)
Before the algorithm starts in the weak crossing SIV test, we need to
ensure both addrecs are `nsw`.
2026-03-30 15:51:44 -04:00
forking-google-bazel-bot[bot]
6021270aa3
[Bazel] Fixes 04785ad (#189456)
This fixes 04785adec34ddf9a6ec47f10da5b2b7fe8c9f9c8.

Co-authored-by: Google Bazel Bot <google-bazel-bot@google.com>
2026-03-30 12:49:31 -07:00
Joseph Huber
23f95fa9e3 [LLVM] Fix invalid shadowed type name 2026-03-30 14:32:55 -05:00
Zorojuro
15a7c45163
[libc][math][c23] Add asinbf16 math function (#184170)
Co-authored-by: bassiounix <muhammad.m.bassiouni@gmail.com>
2026-03-30 21:29:55 +02:00
Maksim Levental
f10dccd458
[MLIR][SparseTensor] Add #undef FAILURE_IF_FAILED and ERROR_IF (#188685)
Both DimLvlMapParser.cpp and LvlTypeParser.cpp define FAILURE_IF_FAILED
and ERROR_IF macros that are never undefined, which can leak into
subsequent translation units in unity builds. Add #undef at the end of
each file. See
https://discourse.llvm.org/t/rfc-enabling-unity-build/90306 for more
info.

"clauded" not coded
2026-03-30 12:27:48 -07:00
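The macro-leak problem described in the entry above can be illustrated with a minimal sketch. It assumes a single translation unit standing in for a unity build; `parseLevel` and the reuse of the name `ERROR_IF` as a function are hypothetical and are not taken from DimLvlMapParser.cpp or LvlTypeParser.cpp.

```cpp
#include <string>

// Hypothetical stand-in for one .cpp file in a unity build: the macro is
// meant for local use only.
#define ERROR_IF(cond, msg)                                                    \
  do {                                                                         \
    if (cond)                                                                  \
      return std::string(msg);                                                 \
  } while (false)

std::string parseLevel(int level) {
  ERROR_IF(level < 0, "negative level");
  return "ok";
}

// Undefine at the end of the "file" so that later files concatenated into
// the same translation unit cannot see (or clash with) the macro.
#undef ERROR_IF

// Hypothetical later file in the same unity translation unit: the name is
// free again, here reused as an ordinary function.
bool ERROR_IF(bool cond) { return cond; }
```

Without the `#undef`, the later definition would be expanded as a macro invocation and fail to compile, which is the kind of clash the commit guards against.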
Maksim Levental
03869c74b6
[MLIR][SparseTensor] Add missing #undef REMUI and DIVUI (#188686)
LoopEmitter.cpp and SparseTensorIterator.cpp define REMUI and DIVUI
macros but the existing #undef block at the end of each file omits them.
This can leak the macros into subsequent translation units in unity
builds. See https://discourse.llvm.org/t/rfc-enabling-unity-build/90306
for more info.

"clauded" not coded
2026-03-30 12:27:31 -07:00
Joseph Huber
0d2c59abd5
[Clang] Fix constant bit widths in gpuintrin.h (#189387)
Summary:
The `ull` suffix can mean 128 bits on some architectures. Replace this
with the `stdint.h` constructor to be certain.
2026-03-30 14:19:01 -05:00
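As a rough illustration of the change above, the sketch below builds a 64-bit constant with the `UINT64_C` macro from `<cstdint>` instead of an `ull` suffix. The helper `laneBit` is a hypothetical name and does not come from gpuintrin.h.

```cpp
#include <cstdint>

// Returns a 64-bit mask with only bit `lane` set (lane must be in [0, 63]).
// UINT64_C(1) produces a constant of type uint_least64_t, so its width does
// not depend on how the target defines `unsigned long long`.
constexpr std::uint64_t laneBit(unsigned lane) { return UINT64_C(1) << lane; }

static_assert(sizeof(laneBit(0)) == 8, "result is 64 bits wide");
```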
Jeffrey Byrnes
7364203924
Reapply "[AMDGPU] Add HWUI pressure heuristics to coexec strategy (#184929)" (#189121)
Reland https://github.com/llvm/llvm-project/pull/184929 after fixing
some issues in the NDEBUG builds.

3a640ee is unchanged from the previously approved PR; the unreviewed
portion of this PR is 9cabd8d.
2026-03-30 12:18:29 -07:00
Joseph Huber
a6ffdb595f
[Clang] Improve scan in gpuintrin.h (#189381)
Summary:
Right now the scan checks to avoid the unspecified behavior in
`clzg(0)`. This is used as the source to the shuffle instruction, but
the argument is discarded at zero anyway. So, we simply pass unspecified
behavior to shuffle and then discard it. This should be fine. The scan
routines are expected to be optimal.

Also renames `sum` to `add`.
2026-03-30 14:16:21 -05:00
Brian Cain
651b61fac5
[Hexagon] Add coverage tests for CodeGen analysis and optimization passes (#183952)
Add tests targeting Hexagon CodeGen analysis and optimization passes:

- gen-pred-andn-orn.ll: HexagonGenPredicate pass exercising andn/orn
logical operations, cmp-zero conversion paths, deeper predicate chains,
and byte comparison classification.

- memcpy-likely-aligned.ll: HexagonSelectionDAGInfo exercising the
aligned memcpy specialization path.

- constprop-fp-cmp.ll: HexagonConstPropagation exercising floating-
point comparison constant folding paths.

- sched-timing-classes.ll: Scheduling timing class coverage for various
Hexagon instruction classes.
2026-03-30 12:10:08 -07:00
Brian Cain
ba228181c2
[lld][Hexagon] Fix out-of-range PLT branch thunks (#186545)
Linking large Hexagon binaries (e.g. ASan runtime with >8 MiB of text)
fails with R_HEX_B22_PCREL / R_HEX_PLT_B22_PCREL relocation overflow on
calls to PLT entries, even though the thunk infrastructure exists and
needsThunks is set.

needsThunk() always used s.getVA() to compute the branch destination,
even for PLT calls where the actual destination is the PLT entry. This
meant the distance check used the wrong address and failed to create
thunks when the PLT entry was out of B22_PCREL range.

Fix by using s.getPltVA() when expr == R_PLT_PC. Also override
getThunkSectionSpacing() so ThunkSections are pre-created at appropriate
intervals for large binaries.
2026-03-30 14:06:47 -05:00
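The distance check described in the entry above can be sketched as follows. This is a simplified model, not lld's code: the `Sym` struct, the `viaPlt` flag, and the range constant are assumptions made for illustration. In particular, the sketch assumes a B22_PCREL branch encodes a signed 22-bit word offset, giving a signed 24-bit byte range of roughly +/- 8 MiB.

```cpp
#include <cstdint>

// Hypothetical, simplified symbol model (not lld's Symbol class).
struct Sym {
  std::uint64_t va;    // address of the symbol's definition
  std::uint64_t pltVa; // address of the symbol's PLT entry
};

// Assumed reach of the branch in bytes: [-2^23, 2^23).
constexpr std::int64_t kBranchRange = std::int64_t{1} << 23;

// Returns true when the branch at `branchAddr` cannot reach its destination
// directly. For PLT calls the destination is the PLT entry, not the symbol.
bool needsThunk(bool viaPlt, std::uint64_t branchAddr, const Sym &s) {
  const std::uint64_t dst = viaPlt ? s.pltVa : s.va;
  const std::int64_t offset = static_cast<std::int64_t>(dst - branchAddr);
  return offset < -kBranchRange || offset >= kBranchRange;
}
```

Using the symbol's own address for a PLT call would report "in range" in the case below even though the PLT entry is far away, which matches the failure mode the commit describes.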
Narayan
04785adec3
[LLVMABI] Create ABI Utils (#185105)
This PR introduces `ABIFunctionInfo` and surrounding utility helpers,
and is part of the set of breakout PRs to upstream the LLVM ABI lowering
library prototyped in https://github.com/llvm/llvm-project/pull/140112.

`ABIFunctionInfo` is directly analogous to `CGFunctionInfo` from Clang's
existing CodeGen pipeline, and represents an ABI lowered view of the
function signature, decoupled from both the Clang AST and LLVM IR.

`ABIArgInfo` encodes lowering decisions and currently supports Direct,
Extend, Indirect, and Ignore, which are required for our initial goal of
implementing x86-64 SysV and BPF, but this will change as the library
grows to represent more targets that need them.

This PR is a direct precursor to the implementation of `ABIInfo` in the
library as demonstrated in the PR linked above.
2026-03-31 00:34:50 +05:30
Chinmay Deshpande
14ab059dec
[AMDGPU][TTI] Update cost model for transcendental instructions to be more precise (#189430)
Introduce `getTransInstrCost` instead of `getQuarterRateInstrCost` for transcendental ops
2026-03-30 12:01:45 -07:00
Krzysztof Parzyszek
5c9440f8ae
[flang][OpenMP] Remove misplaced comment, NFC (#189449)
Remove the seemingly random comment listing clauses allowed on a DO
construct. The nearby code has nothing to do with clauses.
2026-03-30 13:59:44 -05:00
zGoldthorpe
0b500d5446
[Support] Move KnownFPClass inference from KnownBits to Support (#189414)
Move logic for inferring `KnownFPClass` from known bits into the Support
library so the logic may be used e.g., for analogous value tracking
functions in SelectionDAG.
2026-03-30 12:36:48 -06:00
Benjamin Maxwell
03cc2a3173
[mailmap] Add mailmap entry for myself (#189447) 2026-03-30 18:35:34 +00:00
RolandF77
db80420930
[PowerPC] Respect chain operand for llvm.ppc.disassemble.dmr lowering (#188334)
Fix ignoring the input chain when turning llvm.ppc.disassemble.dmr into
a store.
2026-03-30 14:30:30 -04:00
Roland McGrath
bdf28a6d48
[fuzzer] Use LIBCXX_ABI_UNSTABLE for hermetic libc++ (#189096)
This build of libc++ never interacts with any other, so
it can always use the latest and best ABI.
2026-03-30 11:24:04 -07:00
Jackson Stogel
7ccd92e5e6
[mlir][python] Disable pytype not-yet-supported error on Buffer import (#189440)
For Python versions <3.12, pytype complains that:

```
error: in <module>: collections.abc.Buffer not supported yet [not-supported-yet]
  from collections.abc import Buffer as _Buffer
```

Since this code appears intended to support <3.12, disable the type
error on this line.
2026-03-30 11:21:35 -07:00
Tomer Shafir
ebc7b2f04e
[MCA] Use LLVM_DEBUG instead of direct NDEBUG check (NFC) (#189389)
Use the conventional multiline `LLVM_DEBUG` macro for a
debug-printing-only code block, instead of unwrapping a direct `NDEBUG`
check.
2026-03-30 21:19:13 +03:00
Kerem Şahin
8bd8304808
[clang-repl] Fix C89 incompatible keywords (#189432)
The `restrict` and `inline` keywords are removed for the C89 interpreter,
since these keywords caused a failure in the runtime preamble.

Fixes #189088
2026-03-30 21:14:52 +03:00
John Harrison
dd59a99cbf
[lldb] In python tests, call dumpSessionInfo(). (#188859)
Updates the lldb python test suite to ensure we call dumpSessionInfo()
in the test result's stopTest() method. This will ensure that we get the
session info dumped for all tests, even those that don't have an
explicit call to dumpSessionInfo() in the test case.

Additionally, I updated the lldb-dap test case to mark the '-dap.log' as
a log file, which will be recorded in the test output on failure.

Here is an example test run with a failure:

```
PASS: LLDB (build/bin/clang-arm64) :: test_step (TestDAP_step.TestDAP_step)
FAIL: LLDB (build/bin/clang-arm64) :: test_step_over_inlined_function (TestDAP_step.TestDAP_step)
Log Files:
 - build/lldb-test-build.noindex/tools/lldb-dap/step/TestDAP_step/Failure.log
 - build/lldb-test-build.noindex/tools/lldb-dap/step/TestDAP_step/Failure-dap.log
======================================================================
FAIL: test_step_over_inlined_function (TestDAP_step.TestDAP_step)
   Test stepping over when the program counter is in another file.
----------------------------------------------------------------------
Traceback (most recent call last):
  File "llvm-project/lldb/test/API/tools/lldb-dap/step/TestDAP_step.py", line 113, in test_step_over_inlined_function
    self.assertFalse(
AssertionError: True is not false : expect path ending with 'main.cpp'.
Config=arm64-build/bin/clang
----------------------------------------------------------------------
Ran 2 tests in 4.849s
```
2026-03-30 11:13:49 -07:00
Zhijie Wang
34f5b80731
[LifetimeSafety] Track origins for lifetimebound calls returning record types (#187917)
- Move `hasOrigins` from free function to `OriginManager` method
- Add pre-scan (`collectLifetimeboundOriginTypes`) to register return
types of `[[clang::lifetimebound]]` calls before fact generation
- Generalize copy/move constructor origin propagation from lambda-only
to all types with `isDefaulted()` and `hasOrigins()` guard
- `isDefaulted()` is a heuristic: it avoids false positives from
user-defined copies with opaque semantics, but can still false-positive
when a defaulted outer copy invokes a user-defined inner copy that
breaks the propagate chain. See
`nested_defaulted_outer_with_user_defined_inner`
- Guard `operator=` origin propagation: pointer-like types always
propagate; other tracked types only when defaulted
- Defer `ThisOrigins` construction until after the pre-scan to avoid
origin list depth mismatch
- Fix `IsArgLifetimeBound` to exclude constructors from the
instance-method branch (latent bug exposed by this change)

Limitations (documented with FIXME tests):
- User-defined copy/move that shallow-copies: false negative
- Defaulted outer copy invoking user-defined inner copy: false positive
- Non-pointer/ref/gsl::Pointer parameter types with
`[[clang::lifetimebound]]`: not registered

Fixes #163600
2026-03-30 20:13:16 +02:00
Nick Sarnie
38a46a12c4
[offload][lit] Disable tests failing on Intel GPU (#189422)
Disable tests that hang, one that fails, and a few that XPASS. We are
seeing new passes/fails because of the named barrier changes being
merged.

Signed-off-by: Nick Sarnie <nick.sarnie@intel.com>
2026-03-30 18:02:34 +00:00
khaki3
e53f82716b
[flang][OpenACC] Add semantic check for GOTO branching out of compute constructs (#189385)
Per OpenACC spec 2.5.4, branching out of `parallel`/`serial`/`kernels`
constructs is not allowed. Add a GOTO check to `NoBranchingEnforce` that
collects labels within the construct block and flags GOTOs targeting
labels outside. In-region GOTOs are allowed.

The check applies only to compute constructs (`parallel`, `serial`,
`kernels`), not to data constructs where GOTO out is valid.
2026-03-30 11:00:47 -07:00
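The label-collection approach described in the entry above can be sketched in a few lines. This is a toy model over integer labels, not flang's `NoBranchingEnforce`; the function name and its inputs are hypothetical.

```cpp
#include <set>
#include <vector>

// Returns the GOTO targets that leave the construct: every target label that
// is not among the labels defined inside the construct's block. GOTOs whose
// target is inside the block are allowed and therefore not reported.
std::vector<int> gotosLeavingConstruct(const std::set<int> &labelsInside,
                                       const std::vector<int> &gotoTargets) {
  std::vector<int> offending;
  for (int target : gotoTargets)
    if (labelsInside.count(target) == 0)
      offending.push_back(target);
  return offending;
}
```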
Alireza Torabian
e2055bce5c
[DA] Adding a test case for PR #188098 (#189428)
Without the changes in patch #188098, this test case crashes.
2026-03-30 13:58:05 -04:00
Rahul Joshi
44925b6ceb
[NFC][LLVM] Drop unused field from IITDescriptor (#189094)
Drop unused `Float_Width` field from `IITDescriptor`.
2026-03-30 10:52:21 -07:00
Aiden Grossman
9331b5bb77 [DAG] Fix -Wunused-variable
A recently introduced local is only used in an assertion which means we
get -Wunused-variable in release+noasserts builds. Mark it
[[maybe_unused]] rather than inlining the definition given there are
multiple uses within the assert.
2026-03-30 17:51:42 +00:00
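A minimal sketch of the pattern from the entry above, assuming C++17: a local that is only read inside `assert` is marked `[[maybe_unused]]` so that builds with `NDEBUG` defined do not warn. The function `sumPrefix` is a hypothetical example, not the code touched by the commit.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sums the first `count` elements of `values`. `size` is only read inside
// assert(), so with NDEBUG defined it would otherwise trigger
// -Wunused-variable.
int sumPrefix(const std::vector<int> &values, std::size_t count) {
  [[maybe_unused]] const std::size_t size = values.size();
  assert(size > 0 && "empty input");
  assert(count <= size && "prefix longer than the vector");
  int sum = 0;
  for (std::size_t i = 0; i < count; ++i)
    sum += values[i];
  return sum;
}
```

Keeping the named local avoids repeating the expression in each assertion, which is the trade-off the commit message mentions.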
Matt Arsenault
fedb525151
clang: Remove unnecessary triple normalize in offloading job (#189435)
These should already have been normalized (and the device side
comes from code, which should have been trivially normalized to
start).
2026-03-30 17:48:20 +00:00
vangthao95
ec6574e90e
AMDGPU/GlobalISel: RegBankLegalize rules for udot2/sdot2 (#189103) 2026-03-30 10:43:05 -07:00
Erick Velez
cd3da41c01
[clang-doc] Integrate enum LIT tests (#187818)
Combine the two separate test files and have them feed from a common
source. This will be the way that tests are handled to prevent testing
divergence in the future.
2026-03-30 10:34:37 -07:00
Charles Zablit
87085a8705
[lldb-dap][windows] don't use the ConPTY in internalConsole mode (#186472)
In `internalConsole` mode (especially in VSCode), lldb-dap should not
use the ConPTY to read the process' output. Because the internalConsole
is not a real terminal, there is no reason to use terminal emulation,
which adds arbitrary line breaks to the output.

Instead, this patch introduces the `eLaunchFlagUsePipes` flag in
ProcessLaunchInfo which tells ProcessLaunchWindows to use regular pipes
instead of a ConPTY to get the stdin and stdout of the debuggee.

The result is that output which is supposed to be on a single line is
properly rendered.

---

The following example is when debugging a program through lldb-dap on
Windows. The program prints the numbers 0 through 999 on a single line.

# Before
<img width="2214" height="672" alt="Screenshot 2026-03-13 at 17 07 35"
src="https://github.com/user-attachments/assets/26292d11-2288-46ee-a6d2-0b66bfa41288"
/>

The line is split if it's longer than 80 characters (default terminal
size).

# After
<img width="2215" height="689" alt="Screenshot 2026-03-13 at 17 12 39"
src="https://github.com/user-attachments/assets/c9cad9af-b1ce-4c7b-91d5-f684e48e64ca"
/>

The line is correctly printed as a single line.

rdar://172491166
2026-03-30 18:33:45 +01:00
Peter Rong
3e2f0bce95
[ObjCDirectPreconditionThunk] precondition check thunk generation (#170618)
## TL;DR

This is a stack of PRs implementing features to expose direct methods
ABI.
You can see the RFC, design, and discussion
[here](https://discourse.llvm.org/t/rfc-optimizing-code-size-of-objc-direct-by-exposing-function-symbols-and-moving-nil-checks-to-thunks/88866).

https://github.com/llvm/llvm-project/pull/170616 Flag
`-fobjc-direct-precondition-thunk` set up
https://github.com/llvm/llvm-project/pull/170617 Code refactoring to
ease later reviews
https://github.com/llvm/llvm-project/pull/170618 **Thunk generation**
https://github.com/llvm/llvm-project/pull/170619 Optimizations, some
class objects can be known to be realized

## Implementation details

### Dispatching
- `GetDirectMethodCallee` handles the dispatching logic. Previously we
only needed to call `GenerateDirectMethod` to get the declaration of a
direct method.
- `GenerateDirectMethod` first attempts to acquire the declaration of
the implementation, and returns it if the flag is not set.
- Generate and return a thunk if we can't dispatch to the true
implementation (i.e. we can't prove the receiver is definitely not null,
or the class object is not realized)

### Precondition check thunk generation

- `GenerateObjCDirectThunk` generates the thunk; it is called on demand
by `GetDirectMethodCallee`
- The thunk inherits all attributes from the true implementation, see
`StartObjCDirectThunk` for more detail.
- `StartObjCDirectThunk` and `FinishObjCDirectThunk` follow the design
pattern of `StartThunk` and `FinishThunk` in CGVTable.

### Precondition check inline generation

- If the function needs to have the precondition check inlined
(`shouldHaveNilCheckInline`), the caller emits the nil check during
`EmitMessageSend`
- Class realization is generated inline
- No extra nil check is generated. We reuse `NullReturnState`, which
already emits a nil check at the caller side to handle `ns_consumed`; we
just need to tell it to do the work by setting the flag
`RequiresNullCheck |= ReceiverCanBeNull;`

### Visibility and linkage

- Visibility is still `Hidden` by default. But `StartObjCMethod` will
now respect source-level visibility attributes, so methods with
`__attribute__((visibility("default")))` can be used in other linking
units
- Linkage is `External` by default

## Tests

- `expose-direct-method.m` follows the example of `direct-method.m`
- `direct-method-ret-mismatch.m` makes sure we can handle the corner case
- `expose-direct-method-consumed.m` and
`expose-direct-method-linkedlist.m` are executable tests that run on Mac
only to validate ARC correctness
- `expose-direct-method-varargs.m`
- `expose-direct-method-visibility-linkage.m`
2026-03-30 10:32:09 -07:00
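The thunk-versus-direct dispatch described in the ObjCDirectPreconditionThunk entry above can be modeled in plain C++. This is only an analogy under stated assumptions: `Object`, `valueImpl`, `valueThunk`, and `callValue` are hypothetical names, class realization is left out, and none of this is the code Clang generates.

```cpp
// Hypothetical stand-ins: `Object` plays the receiver, and `valueImpl` plays
// the "true implementation", which assumes a non-null receiver.
struct Object {
  int value;
};

int valueImpl(const Object *self) { return self->value; }

// Precondition-check thunk: performs the nil check, then forwards to the
// implementation. A nil receiver yields a zero result, mirroring how
// messaging nil behaves in Objective-C.
int valueThunk(const Object *self) {
  if (self == nullptr)
    return 0;
  return valueImpl(self);
}

// A caller that can prove the receiver is non-null may call the
// implementation directly; otherwise it goes through the thunk.
int callValue(const Object *receiver, bool knownNonNull) {
  return knownNonNull ? valueImpl(receiver) : valueThunk(receiver);
}
```

Moving the check into one shared thunk, rather than into every caller, is the code-size idea the linked RFC title refers to.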
John Paul Jepko
cabebddac9
[NFC] Remove unused-but-set global variables (#189315)
Remove four global variables that are set but never read to fix
-Wunused-but-set-global warnings:

- `MFMAChainLength` in AMDGPUIGroupLP.cpp
- `Wide` in llvm-objdump.cpp
- `SaveTemps` in ClangSYCLLinker.cpp
- `DeprecatedDriverCommand` in ClangScanDeps.cpp

Follow up to #178342
2026-03-30 19:29:50 +02:00
Roy Shi
d17296d013
[llvm lib] Read/write non-power-of-two sized unsigned integers (3, 5, 6, 7 bytes) in DataExtractor and FileWriter (#189098)
This allows tools (like gsymutil) to pack data more efficiently into a
file.
2026-03-30 10:28:09 -07:00
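To make the idea of 3-, 5-, 6-, and 7-byte integers concrete, here is a minimal little-endian sketch. The helper names and the `std::vector` buffer are assumptions for illustration; they are not the `DataExtractor` or `FileWriter` interfaces, and byte order handling in the real classes may differ.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Appends the low `size` bytes of `value` in little-endian order.
// `size` is expected to be in [1, 8]; higher bytes of `value` are dropped.
void writeUnsignedLE(std::vector<std::uint8_t> &out, std::uint64_t value,
                     std::size_t size) {
  for (std::size_t i = 0; i < size; ++i)
    out.push_back(static_cast<std::uint8_t>(value >> (8 * i)));
}

// Reads a `size`-byte little-endian unsigned integer starting at `offset`.
// The caller must ensure that offset + size does not exceed in.size().
std::uint64_t readUnsignedLE(const std::vector<std::uint8_t> &in,
                             std::size_t offset, std::size_t size) {
  std::uint64_t value = 0;
  for (std::size_t i = 0; i < size; ++i)
    value |= static_cast<std::uint64_t>(in[offset + i]) << (8 * i);
  return value;
}
```

A value that fits in 24 bits then costs 3 bytes on disk instead of 4, which is where the packing benefit comes from.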
vangthao95
35a1961287
AMDGPU/GlobalISel: RegBankLegalize rules for dot products (#189110) 2026-03-30 10:15:12 -07:00
Justice Adams
c4462a2de0
[green dragon] add lld to LLVM_ENABLE_PROJECTS for ubuntu (#189419) 2026-03-30 10:07:21 -07:00
Shivam Kunwar
96f223c3b7
[DebugInfo] Verify DW_OP_LLVM_implicit_pointer survives ISel (#187641)
PR 1 - https://github.com/llvm/llvm-project/pull/186763 ([DebugInfo]
Lower DW_OP_LLVM_implicit_pointer to DWARF #186763)

RFC -
https://discourse.llvm.org/t/rfc-implementing-dw-op-implicit-pointer-support-in-llvm/90217

---------

Co-authored-by: Shivam Kunwar <phyBrackets@users.noreply.github.com>
2026-03-30 22:33:49 +05:30
vangthao95
2f0118895b
AMDGPU/GlobalISel: RegBankLegalize rules for ds_append/ds_consume (#189143) 2026-03-30 09:57:57 -07:00
Alexis Engelke
7581430722
[IR] Require well-formed IR for BasicBlock::getTerminator (#189416)
BasicBlock::getTerminator() is frequently called on valid IR, yet the
function has to check that the last instruction is in fact a terminator,
even in release builds. This check can only be optimized away when the
instruction is dereferenced.

Therefore, introduce the functions hasTerminator() and
getTerminatorOrNull() as replacement and require (assert) that
getTerminator() always returns a valid terminator. As a side effect,
this forces explicit expression of intent at call sites when unfinished
basic blocks should be supported.
2026-03-30 18:57:37 +02:00
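The split between the three accessors in the entry above can be sketched with a toy model. `Inst` and `Block` are hypothetical stand-ins rather than LLVM's `Instruction` and `BasicBlock`, and the reference return type is a simplification chosen for the sketch.

```cpp
#include <cassert>
#include <vector>

// Toy model, not LLVM's classes: an instruction is just a flag here.
struct Inst {
  bool isTerminator;
};

class Block {
public:
  void push(Inst inst) { insts.push_back(inst); }

  // True when the block is non-empty and ends in a terminator.
  bool hasTerminator() const {
    return !insts.empty() && insts.back().isTerminator;
  }

  // For call sites that must tolerate unfinished blocks.
  const Inst *getTerminatorOrNull() const {
    return hasTerminator() ? &insts.back() : nullptr;
  }

  // For call sites that work on well-formed IR: the check is an assertion,
  // so it disappears in builds with NDEBUG defined.
  const Inst &getTerminator() const {
    assert(hasTerminator() && "block has no terminator");
    return insts.back();
  }

private:
  std::vector<Inst> insts;
};
```

Call sites that may see unfinished blocks pick `hasTerminator()` or `getTerminatorOrNull()` explicitly, which is the "explicit expression of intent" the commit message mentions.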
vangthao95
c32d670757
AMDGPU/GlobalISel: RegBankLegalize rules for ds_ordered_add/swap (#189137) 2026-03-30 09:57:04 -07:00
vangthao95
27e3c43d74
AMDGPU/GlobalISel: RegBankLegalize rules for global_load_lds (#189135) 2026-03-30 09:53:12 -07:00
vangthao95
f4d1745ab3
AMDGPU/GlobalISel: RegBankLegalize rules for lds_direct_load (#189134) 2026-03-30 09:52:34 -07:00
Valentin Clement (バレンタイン クレメン)
2334330bd9
[flang][cuda] Imply zero offset when not provided (#189421) 2026-03-30 09:51:11 -07:00