571412 Commits

Author SHA1 Message Date
Daniel Chen
53aa77092e
[flang] Fix distribution build of Fortran builtin/intrinsic modules. (#184204)
Currently, `-DLLVM_DISTRIBUTION_COMPONENTS="flang-module-interfaces"`
doesn't work: the Fortran builtin/intrinsic modules fail to build as part
of the distribution build, `install-distribution`.
This PR fixes that.
2026-03-04 10:51:24 -05:00
Alexey Karyakin
e8e8d30b22
[Hexagon] Use __HVX_IEEE_FP__ to guard protos that need -mhvx-ieee-fp (#184422)
Hexagon clang recently started to define __HVX_IEEE_FP__ when the
-mhvx-ieee-fp option is specified. Guard the intrinsic macros for
instructions that should only be available with -mhvx-ieee-fp with
__HVX_IEEE_FP__.

Additionally, the following NFC changes are included:

- NFC: Remove guards around HVX v60 intrinsic macros
  Hexagon v60 is the oldest Hexagon version that supports HVX so these
  guards were redundant. Presence of HVX is guarded separately, once
  for the whole file.

- Remove comments from closing guards (HVX protos)
  These comments served a very limited function as they only guard
  one macro. Also, they were incorrect. Instead of fixing them, remove them.
  This will also reduce by a factor of two the amount of changes
  when guarding conditions change.
2026-03-04 09:30:34 -06:00
Akash Banerjee
f55080da98
[flang][OpenMP] Avoid implicit default mapper on pointer captures (#184382)
This change fixes incorrect implicit declare mapper behavior in Flang
OpenMP lowering.

Issue:
Implicit default mappers were being attached/generated for pointer-based
implicit captures, and also on data-motion directives. That could
trigger recursive component mapping that overlaps/conflicts with
explicit user mappings, causing runtime mapping failures.

Fix:

- Skip implicit default mapper generation for implicit pointer captures
(keep support for allocatables).
- Do not auto-attach implicit mappers on target enter data, target exit
data, or target update.
- Apply the same pointer guard in the implicit target-capture lowering
path.
2026-03-04 15:27:06 +00:00
Krzysztof Drewniak
247a9bfc26
[mlir][AMDGPU] Add folders for memref aliases to TDM base creation (#184567)
The TDM base creation ops (amdgpu.make_tdm_base and
amdgpu.make_gather_tdm_base) take references to a
`%memref[%i0, %i1, ...]` for the starting point of the tiles in
global/shared memory that the TDM descriptor refers to. Memory alias ops
can be safely folded into these operations, since these two memref
operands are just pointers to a scalar starting point and don't have
semantics that depend on the memref layout (except to the extent that it
defines a location in memory).

While I'm here, I've cleaned up a few things, like the incorrect file
header and fixed the tests to not use integer address spaces.

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-04 07:26:40 -08:00
Takashi Idobe
a3eb13b5bf
[X86] remove unnecessary movs when %rdx is an input to mulx (#184462)
Closes: https://github.com/llvm/llvm-project/issues/174912

When generating a `mulx` instruction for a widening multiplication, even
if one input is placed in %rdx, LLVM won't place it in the implicit
first slot, instead it'll generate two movs before calling mulx to swap
the registers, which are unnecessary. GCC already has this optimization
(as shown in the issue) so this puts the two compilers closer to each
other on that front.

Co-authored-by: Aiden Grossman <aidengrossman@google.com>
2026-03-04 15:19:16 +00:00
Mircea Trofin
ded64d2417
[DTU] fix dominator tree update eliding reachable nodes (#177683)
The initial CFG looks like this:

![initial_graph.png](https://app.graphite.com/user-attachments/assets/1e3109c5-7c02-4c81-b9b3-fa6a25964e00.png)

After inlining, it looks like this:

![after_inlining.png](https://app.graphite.com/user-attachments/assets/10906dc6-1865-4125-8cd5-c2af69191858.png)

It should be sufficient to add and remove the edges shown in the test, i.e.:
- add: `bb3->bb1.i` and `bb3->bb2.i`
- remove: `bb3->bb4`, `bb3->bb5` and `bb5->bb8`

New nodes, like `bb5.body`, get discovered when adding `bb3->bb2.i` (see the "StepByStep" variant of the test). Without the fix in this patch, however, `bb5.body` gets elided when the deleted edges get taken into account, and `DT` is left invalid.
2026-03-04 07:12:26 -08:00
Mehdi Amini
b28ec5ad18
[mlir][Func] Fix FuncOp verifier ordering via hasRegionVerifier (#184612)
FuncOp::verify() iterated over all blocks and called
getMutableSuccessorOperands() on any RegionBranchTerminatorOpInterface
terminator to check return types. This ran during the entrance phase of
verification — before child ops had been verified — so a malformed
terminator whose getMutableSuccessorOperands() assumed invariants
established by its own verify() could crash instead of emitting a clean
diagnostic.

Fix by switching to hasRegionVerifier=1: rename verify() →
verifyRegions() so the return-type checks run in the exit phase, after
all nested ops have already been verified.

To demonstrate the bug and guard against regression, add
TestCrashingReturnOp to the test dialect. The op implements
RegionBranchTerminatorOpInterface and report_fatal_errors in
getMutableSuccessorOperands() when its 'valid' unit-attr is absent,
reproducing the class of crash described above. The accompanying lit
test confirms a clean diagnostic is emitted rather than a crash.
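
To illustrate the ordering issue, here is a minimal sketch in plain C++ rather than MLIR. The `Op` struct, the `verify` function, and the trace strings are all hypothetical stand-ins; the only point is that checks which query nested ops belong after those ops have been verified.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for an operation with nested ops in its regions.
struct Op {
  std::string name;
  bool wellFormed;         // outcome of the op's own invariant checks
  std::vector<Op> nested;  // ops contained in this op's regions
};

// Verifies `op` in three steps and records what happened in `trace`.
// Returns false as soon as any op fails its own checks.
bool verify(const Op &op, std::vector<std::string> &trace) {
  // Entrance phase: only the op's own invariants are checked here.
  if (!op.wellFormed) {
    trace.push_back(op.name + ": malformed");
    return false;
  }
  // Nested ops are verified before the parent looks at them.
  for (const Op &child : op.nested)
    if (!verify(child, trace))
      return false;
  // Exit phase: checks that depend on nested ops being valid run last.
  trace.push_back(op.name + ": region checks");
  return true;
}
```

With a malformed nested terminator, the parent's region checks never run, so the only output is the nested op's own diagnostic.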
2026-03-04 15:11:14 +00:00
Nick Sarnie
e5a6a0f108
[SPIRV] Fix global emission for modules with no functions (#183833)
Right now we have a problem where if you have an LLVM module with globals
but no functions, a completely empty SPIR-V module is emitted.

This is because global emission is dependent on tracking intrinsic
functions being emitted in functions.

As a simple fix, just insert a service function, which the backend is
already set up to not actually emit, if there are no real functions.

The current use case of the service function is for function pointers. I
don't think it's possible that we need to both generate a service
function for function pointers and for globals with no functions, so I
just added an error (not an assert) in case we do need it for
both cases.

Probably we should rework global handling in the future to work without
these workarounds, but this is a pretty fundamental issue so let's work
around it with this simple change for now.

This change exposed an existing bug:
  We consider basic blocks with no successors as fall-through

Also, fix some existing tests. The symptom was:
We previously emitted an empty module, but now that we don't, we hit a
`spirv-val` error about invalid Function StorageClass for globals
because no `addrspace` was specified. Set the `addrspace` to `1`
(`CrossWorkgroup`) in those tests.

Closes: https://github.com/llvm/llvm-project/issues/182899

---------

Signed-off-by: Nick Sarnie <nick.sarnie@intel.com>
2026-03-04 14:58:45 +00:00
Nikita Popov
c123642824
[CI] Install binutils-dev in pre-merge container (#184608)
This is to get the plugin-api.h file, to allow running tests for the
gold plugin.
2026-03-04 15:57:56 +01:00
Aiden Grossman
33be2d0e7a
[AArch64] Update clmul tests after #184403 (#184611)
This was likely a mid-air collision with #183282. Update the tests to
match the current state of HEAD.
2026-03-04 14:41:29 +00:00
Mehdi Amini
c9ca768c88
[mlir][shape] Fix crash when shape.lib array references undefined symbol (#184613)
In verifyOperationAttribute(), the single-symbol path for shape.lib used
SymbolTable::lookupSymbolIn() followed by an explicit null check. The
array path at line 196-197 used dyn_cast<FunctionLibraryOp>() directly
on the lookup result, which asserts when the symbol is not found (null
pointer).

Fix: use dyn_cast_or_null<> instead of dyn_cast<> so that a missing
symbol falls through to the existing "does not refer to
FunctionLibraryOp" error diagnostic instead of asserting.
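
The difference between the two casts can be sketched in standard C++. The class hierarchy and helper names below are hypothetical stand-ins, not the MLIR types; `dynCast` models a cast that asserts on null input, and `dynCastOrNull` models the null-tolerant variant.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins for the operation hierarchy.
struct Operation {
  virtual ~Operation() = default;
};
struct FunctionLibraryOp : Operation {};
struct OtherOp : Operation {};

// Models a cast whose contract forbids null input.
template <typename To>
To *dynCast(Operation *op) {
  assert(op && "dynCast requires a non-null operation");
  return dynamic_cast<To *>(op);
}

// Models the null-tolerant cast: null in, null out.
template <typename To>
To *dynCastOrNull(Operation *op) {
  return op ? dynamic_cast<To *>(op) : nullptr;
}

// Returns an empty string on success, otherwise the diagnostic text.
// A failed lookup (null) and a wrong op type take the same error path.
std::string verifyLibraryRef(Operation *lookupResult) {
  if (!dynCastOrNull<FunctionLibraryOp>(lookupResult))
    return "does not refer to FunctionLibraryOp";
  return "";
}
```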

Fixes #159653
2026-03-04 15:37:02 +01:00
Mehdi Amini
56e0b6af1d
[mlir][affine] Fix crash in vectorizeAffineLoopNest test utility for reduction loops (#184617)
The test utility function `testVecAffineLoopNest` called
`isLoopParallel` with a `reductions` output parameter, which populates
reduction descriptors when the loop performs a reduction. However, these
descriptors were never added to `strategy.reductionLoops` before calling
`vectorizeAffineLoopNest`. When the vectorizer then processed a loop
with `iter_args`, it found no reduction descriptors in the strategy and
hit an assertion failure.

Fix by registering the reduction loop descriptors in the strategy before
vectorization, matching what the production vectorizer code already does
correctly.

Fixes #128334
2026-03-04 14:36:27 +00:00
Florian Hahn
c370f5af6c
[VPlan] Preserve IsSingleScalar for hoisted predicated load. (#184453)
The predicated loads may be single scalar (e.g. for VF = 1). We should
preserve IsSingleScalar when hoisting them. As all loops access the same
address, IsSingleScalar must match across all loads in the group.

This fixes an assertion when interleaving-only with hoisted loads.

Fixes https://github.com/llvm/llvm-project/issues/184372

PR: https://github.com/llvm/llvm-project/pull/184453
2026-03-04 14:32:00 +00:00
Sayan Saha
50653e5a0d
[tosa] : Enhance tosa.slice folding for dynamic dims. (#184615)
Source IR:
```
func.func @main(%arg0: tensor<?x112x64x112xf32>) -> tensor<?x113x65x112xf32> {
    %0 = tosa.const_shape  {values = dense<[0, 0, 1, 1, 1, 1, 0, 0]> : tensor<8xindex>} : () -> !tosa.shape<8>
    %1 = "tosa.const"() <{values = dense<0.000000e+00> : tensor<1xf32>}> : () -> tensor<1xf32>
    %2 = tosa.pad %arg0, %0, %1 : (tensor<?x112x64x112xf32>, !tosa.shape<8>, tensor<1xf32>) -> tensor<?x114x66x112xf32>
    %3 = tosa.const_shape  {values = dense<0> : tensor<4xindex>} : () -> !tosa.shape<4>
    %4 = tosa.const_shape  {values = dense<[-1, 113, 65, 112]> : tensor<4xindex>} : () -> !tosa.shape<4>
    %5 = tosa.slice %2, %3, %4 : (tensor<?x114x66x112xf32>, !tosa.shape<4>, !tosa.shape<4>) -> tensor<?x113x65x112xf32>
    return %5 : tensor<?x113x65x112xf32>
  }
```

when canonicalized produces

```
$> mlir-opt --canonicalize

func.func @main(%arg0: tensor<?x112x64x112xf32>) -> tensor<?x113x65x112xf32> {
    %0 = tosa.const_shape  {values = dense<0> : tensor<4xindex>} : () -> !tosa.shape<4>
    %1 = tosa.const_shape  {values = dense<[-1, 113, 65, 112]> : tensor<4xindex>} : () -> !tosa.shape<4>
    %2 = "tosa.const"() <{values = dense<0.000000e+00> : tensor<1xf32>}> : () -> tensor<1xf32>
    %3 = tosa.const_shape  {values = dense<[0, 0, 1, 0, 1, 0, 0, 0]> : tensor<8xindex>} : () -> !tosa.shape<8>
    %4 = tosa.pad %arg0, %3, %2 : (tensor<?x112x64x112xf32>, !tosa.shape<8>, tensor<1xf32>) -> tensor<?x113x65x112xf32>
    %5 = tosa.slice %4, %0, %1 : (tensor<?x113x65x112xf32>, !tosa.shape<4>, !tosa.shape<4>) -> tensor<?x113x65x112xf32>
    return %5 : tensor<?x113x65x112xf32>
  }
```

because of the `PadSliceOptimization`. Note that the `tosa.slice` op
after the optimization is essentially a no-op. This change enhances the
folder to fold such `tosa.slice` ops. After this change canonicalization
produces

```
func.func @main(%arg0: tensor<?x112x64x112xf32>) -> tensor<?x113x65x112xf32> {
    %0 = "tosa.const"() <{values = dense<0.000000e+00> : tensor<1xf32>}> : () -> tensor<1xf32>
    %1 = tosa.const_shape  {values = dense<[0, 0, 1, 0, 1, 0, 0, 0]> : tensor<8xindex>} : () -> !tosa.shape<8>
    %2 = tosa.pad %arg0, %1, %0 : (tensor<?x112x64x112xf32>, !tosa.shape<8>, tensor<1xf32>) -> tensor<?x113x65x112xf32>
    return %2 : tensor<?x113x65x112xf32>
  }
```
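
As a rough sketch of the kind of check involved (not the folder's actual condition, which is not shown here), a slice is an identity when every start offset is zero and every requested size matches the input dimension. The `Shape` alias, `kDynamic`, and `isIdentitySlice` are hypothetical names; dynamic dimensions are assumed to be encoded as -1 on both sides.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumption: a dynamic dimension is encoded as -1.
constexpr std::int64_t kDynamic = -1;
using Shape = std::vector<std::int64_t>;

// Returns true when slicing `input` with `start` and `size` selects the
// whole tensor, so the slice could be replaced by its operand.
bool isIdentitySlice(const Shape &input, const Shape &start,
                     const Shape &size) {
  assert(start.size() == input.size() && size.size() == input.size());
  for (std::size_t i = 0; i < input.size(); ++i) {
    if (start[i] != 0)
      return false;
    if (size[i] != input[i])
      return false;
  }
  return true;
}
```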
2026-03-04 09:28:46 -05:00
Benjamin Luke
11c11ec2e9
[clang][Lex] Preserve MultipleIncludeOpt state in Lexer::peekNextPPToken (#183425)
Fixes https://github.com/llvm/llvm-project/issues/180155.

This is a duplicate of https://github.com/llvm/llvm-project/pull/180700
except that I also added some tests. It is fine to go with either PR, but
we should add the tests.

peekNextPPToken lexed a token and mutated MIOpt, which could clear the
controlling-macro state for main files in C++20 modules mode.
Save/restore MIOpt in Lexer::peekNextPPToken.

Add regression coverage in
LexerTest.MainFileHeaderGuardedWithCPlusPlusModules that checks to make
sure the controlling macro is properly set in C++20 mode.

Add source level lit test in miopt-peek-restore-header-guard.cpp that
checks to make sure that the warnings that depend on the MIOpt state
machine are emitted in C++20 mode.
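
The save/restore idea can be sketched with a toy lexer. `MIOptState`, `MiniLexer`, and their members are hypothetical and far simpler than the real classes; the sketch only shows that a peek must leave both the position and the state machine untouched.

```cpp
#include <cassert>

// Hypothetical miniature of the multiple-include optimization state.
struct MIOptState {
  bool readAnyTokens = false;
};

struct MiniLexer {
  MIOptState miOpt;
  int position = 0;

  // Normal lexing consumes a token and updates the state machine.
  int lex() {
    miOpt.readAnyTokens = true;
    return position++;
  }

  // Peeking lexes one token, then restores position and state.
  int peek() {
    const MIOptState savedState = miOpt;
    const int savedPosition = position;
    const int token = lex();
    position = savedPosition;
    miOpt = savedState;
    assert(position == savedPosition && "peek must not consume input");
    return token;
  }
};
```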
2026-03-04 15:11:33 +01:00
Tarun Thammisetty
5c27407842
[analyzer] Suppress optin.cplusplus.VirtualCall warnings in system headers (#184183)
Fixes #184178

The optin.cplusplus.VirtualCall checker reports warnings for virtual
method calls during construction/destruction even when the call site is
in a system header (included via -isystem). Users cannot fix such code
and must resort to NOLINT suppressions.

Add a system header check in checkPreCall before emitting the report,
consistent with how other checkers (e.g. MallocChecker) handle this.
2026-03-04 13:56:47 +00:00
Juan Manuel Martinez Caamaño
073de3b803
[SPIRV] Rename selectSelectDefaultArgs to selectBoolToInt (#184120)
The function is used to extend a `bool` (vector or scalar) into `1/-1`
for `true` and `0` for `false` (vector or scalar).

There is no obvious "default" argument for a select operation, so the
original name is confusing.

This patch:
* renames this function to better signal its intention,
* makes the boolean argument explicit in the function (instead of
implicit through the first register operand of the instruction),
* renames `I` to `InsertAt`.
2026-03-04 14:53:04 +01:00
Joseph Huber
0cbba3ed5f
[flang-rt] Fix incorrect condition for removing backtrace (#184610) 2026-03-04 07:50:48 -06:00
Benjamin Maxwell
c6bb6a7e42
[LV] Add -force-target-supports-masked-memory-ops option (#184325)
This can be used to make target agnostic tail-folding tests much less
verbose, as masked loads/stores can be used rather than scalar
predication.
2026-03-04 13:36:29 +00:00
Lukacma
71de1e47c0
Reapply "[AArch64] Wrap integer SCALAR_TO_VECTOR nodes in bitcasts (#172837)" (#183380) (#184403)
This reverts commit b7ce37c6703f2d82376f50f82a05b807a0ad90ad.
The
[issue](https://github.com/llvm/llvm-project/pull/172837#issuecomment-3961532435)
this patch revealed was fixed by [this
patch](https://github.com/llvm/llvm-project/pull/183549).
2026-03-04 13:34:46 +00:00
Ivan Kosarev
21c1ba16ed
[TableGen] Complete the support for artificial registers (#183371)
Artificial registers were added in
eb0c510ecde667cd911682cc1e855f73f341d134
as a means of giving super-registers heavier weights than that
of their subregisters, even when they only contain a single
physical subregister.

Artificial registers thus do exist in code and participate in
register unit weight calculations, but are not supposed to be
available for register allocation.

This patch completes the support for artificial registers to:

- Ignore artificial registers when joining register unit uber
  sets. Artificial registers may be members of classes that
  together include registers and their sub-registers, making it
  impossible to compute normalised weights for uber sets they
  belong to.

  We have a use case downstream relying on this being supported,
  which allows us to avoid introducing a large number of additional
  register classes.

- Not generate purely artificial register class intersections.
  It is critical not to have such classes, as the common LLVM
  codegen infrastructure will try to use them to constrain
  classes of virtual registers instead of producing COPYs
  whenever both the source and target register classes contain
  the same artificial registers.

- Not generate sub-classes where classes with the same
  non-artificial members already exist. This is mostly for
  convenience. For example, the HI16-capable subset of AMDGPU's
  AV_32 is VGPR_32, except VGPR_32 also contains the artificial
  staging registers. If the staging registers are not ignored,
  we'll end up having an additional generated register class,
  AV_32_with_hi16_in_VGPR_16, -- harmless, but also useless.

Eliminates a few inferred AMDGPU register classes:
    - VS_32_with_hi16
    - VS_32_Lo256_with_hi16
    - VS_32_Lo128_with_hi16
    - VRegOrLds_32_and_VS_32_Lo256
    - VRegOrLds_32_and_VS_32_Lo128
    - SRegOrLds_32_and_VRegOrLds_32

Causes no register class changes for other targets.
2026-03-04 13:33:26 +00:00
Ken Matsui
c2e22e3b79
[clang][cmake] Add option to control hmaptool installation (#172725) 2026-03-04 08:32:55 -05:00
Shilei Tian
47766d7f8c
[AMDGPU][Clang][Doc] Add documentation for WMMA builtins (#183939) 2026-03-04 08:30:42 -05:00
Mehdi Amini
1b3545117d
[mlir][irdl] Fix crash in TypeOp/AttributeOp verify on empty sym_name (#184598)
TypeOp::verify() and AttributeOp::verify() called StringRef::front() to
check for leading '!' or '#' sigils before passing the name to
isValidName(). When sym_name is empty, front() triggers an assertion
failure:
  Assertion `!empty()' failed.

Fix: guard the front() calls with an emptiness check. An empty sym_name
then falls through to isValidName(), which already emits a proper
diagnostic:
  error: name of type is empty
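
A minimal sketch of the guard, using `std::string_view` in place of `StringRef`. The helper `firstChar`, the function `verifyTypeName`, and the sigil error text are hypothetical; only the "name of type is empty" message comes from the description above.

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Models a front() accessor that asserts on an empty string.
char firstChar(std::string_view text) {
  assert(!text.empty() && "front() on an empty string");
  return text.front();
}

// Returns an empty string when the name is accepted, else an error text.
std::string verifyTypeName(std::string_view name) {
  // The emptiness check comes first, so firstChar() is never reached
  // with an empty name.
  if (!name.empty() && (firstChar(name) == '!' || firstChar(name) == '#'))
    return "name of type should not contain a leading sigil";
  // An empty name falls through to the regular validity check.
  if (name.empty())
    return "name of type is empty";
  return "";
}
```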

Fixes #159949
2026-03-04 14:26:50 +01:00
Younan Zhang
05fdd53839
[Clang] Fix the lambda context for constraint evaluation (#184319)
Constraint lambdas in the requires body need complete template arguments
before they can be evaluated. That was connected by
ImplicitConceptSpecializationDecl which is no longer created naturally
after the normalization patch.

This patch fixes the bug by creating a temporary decl for that purpose.
Though the temporary object should go away once we have the evaluation
context track template arguments.

No release note, as this is a regression fix.

Fixes #184047
2026-03-04 21:25:18 +08:00
Tomohiro Kashiwada
0af2d43e06
[Clang] Warn if both of dllexport/dllimport and exclude_from_explicit_instantiation are specified (#183515)
The attributes `exclude_from_explicit_instantiation` and
`dllexport`/`dllimport` serve opposite purposes.
Therefore, if an entity has both attributes, drop one with a warning,
depending on the context of the declaration.
In a template context, the `exclude_from_explicit_instantiation`
attribute takes precedence over the `dllexport` or `dllimport`
attribute. Conversely, the `dllexport` and `dllimport` attributes take
precedence in a non-template context.
2026-03-04 14:10:23 +01:00
Matthew Devereau
5cf09a68a6
[AArch64][ISel] Use vector register for scalar CLMUL (#183282)
Even though there are only v8i8 and v1i64 variants for pmul/pmull, using
them is faster than the current implementation for scalar CLMUL.
2026-03-04 13:07:56 +00:00
Graham Hunter
98ed41718b
[LV] Transform tests for early-exit with stores (#183288)
Precommit of transform tests for #178454
2026-03-04 13:04:05 +00:00
Matt Arsenault
8bb41c929f
AMDGPU: Fix copy of Triple (#184594) 2026-03-04 12:41:56 +00:00
serge-sans-paille
095e1694d9
[clang] Turn misc copy-assign to move-assign (#184144)
That's an automated patch generated from clang-tidy
performance-use-std-move as a follow-up to #184136
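
For readers unfamiliar with this kind of rewrite, here is a small illustrative example. The `Config` type is hypothetical and not taken from the patch; it shows a copy-assignment from a local that is not used afterwards being turned into a move-assignment.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical type used only for illustration.
struct Config {
  std::vector<std::string> names;

  void setNames(std::vector<std::string> newNames) {
    // Before the rewrite this would read `names = newNames;`, copying
    // every string. `newNames` is not used again, so moving is safe.
    names = std::move(newNames);
  }
};
```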
2026-03-04 12:37:29 +00:00
Phoebe Linck
c2784e11cc
[Flang][OpenMP] DEFAULT(NONE) error checking on implicit references (#182214)
A variable with an unspecified data-sharing attribute under a
DEFAULT(NONE) clause only emits an error if the variable is explicitly
referenced in the body of the construct with DEFAULT(NONE).

Ex:

```
!$omp parallel default(none)
!$omp task
a = 1
!$omp end task
!$omp end parallel
end
```
gfortran will error with `‘a’ not specified in enclosing ‘parallel’` on
the above. flang doesn't error.

Fix moves the error check to `CreateImplicitSymbols` and checks the
variable for a violation in any of its enclosing contexts.
2026-03-04 09:33:51 -03:00
Mehdi Amini
1f4074b771
[mlir][llvm] Fix SROA crash on empty LLVM struct types (#184596)
When SROA runs on an alloca of an empty struct type (llvm.struct<()>),
it crashes with:

  Assertion `!subelementIndexMap->empty()' failed.

The root cause is in LLVMStructType::getSubelementIndexMap(): for an
empty struct (no body fields), the loop doesn't execute and an empty
DenseMap is returned as a non-null optional. Later, getTypeAtIndex()
asserts the map is non-empty, triggering the crash.

Fix this by returning std::nullopt for empty structs, indicating they
cannot be destructured. This is consistent with how LLVMArrayType
handles the zero-element case.
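
The contract can be sketched with `std::optional` and `std::map` standing in for the LLVM containers. The signatures below are simplified and hypothetical; field types are represented as plain strings.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <optional>
#include <string>
#include <vector>

using IndexMap = std::map<std::size_t, std::string>;

// Returns std::nullopt for a struct with no fields, signalling that the
// type cannot be destructured, instead of an engaged but empty map.
std::optional<IndexMap>
getSubelementIndexMap(const std::vector<std::string> &fields) {
  if (fields.empty())
    return std::nullopt;
  IndexMap result;
  for (std::size_t i = 0; i < fields.size(); ++i)
    result.emplace(i, fields[i]);
  return result;
}

// Consumers may rely on the map being non-empty.
const std::string &getTypeAtIndex(const IndexMap &map, std::size_t index) {
  assert(!map.empty() && "cannot index into an empty struct");
  return map.at(index);
}
```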

Fixes #108366
2026-03-04 13:31:46 +01:00
Ella Ma
0a1e39517b
[nfc][analyzer][test][z3] Replace "REQUIRES: no-z3" with "UNSUPPORTED: z3" (#184349)
Fixing D120325, continuing #183724

Lit feature "no-z3" is the opposite of "z3", requiring "no-z3" is the
same as unsupporting "z3".
2026-03-04 11:57:57 +00:00
jeanPerier
ee8184573f
Revert "[flang] make lowering to scf.while default" (#184592)
Reverts llvm/llvm-project#184234

This is breaking SPEC and other tests.

Reproducer:

```
subroutine foo()
  logical :: l1, l2
  do while (l1())
    if (l2()) then
      call bar()
    endif
  enddo
end
```

The cause is a pass ordering issue between the SCFToControlFlowPass and
CfgConversionPass
[here](d0f50d5574/flang/lib/Optimizer/Passes/Pipelines.cpp (L239-L240)).

I think they need to be run simultaneously somehow because both SCF
and FIR structured operations may contain each other, and neither will be
happy to get block CFG generated inside their region by the pass
lowering the other.

Reverting while this is sorted out.
2026-03-04 11:42:08 +00:00
Mehdi Amini
9c2829f2e1
[mlir][Func] Use getMutableSuccessorOperands() in FuncOp verifier (#184589)
When verifying return-like terminators, use
getMutableSuccessorOperands() instead of getNumOperands() so that only
the operands passed to the parent region are checked against the
function result types. This handles terminators that implement
RegionBranchTerminatorOpInterface and carry additional operands for
other successor regions (e.g. loop back-edges).

Add tests using test.loop_block_term, which has both an iter operand
(passed back to the region) and an exit operand (passed to the parent).
2026-03-04 12:36:19 +01:00
Mehdi Amini
7f044944e4
[MLIR][Arith][Vector] Reject i0 integer type in arith and vector ops (#183589)
Add ODS type constraints that exclude zero-bitwidth integers (i0) from
operations in the arith and vector dialects.  i0 has no meaningful
arithmetic representation and operations on it can trigger undefined
behavior (e.g. bitwidth calculations assuming non-zero width).

Changes:
- Add `AnyNonZeroBitwidthSignlessInteger` (as a `ConfinedType` over
  `AnySignlessInteger`) and `AnyNonZeroBitwidthSignlessIntegerOrIndex`
  to CommonTypeConstraints.td.
- Introduce `Arith_SignlessIntegerOrIndexLike` in ArithOps.td that wraps
  `AnyNonZeroBitwidthSignlessIntegerOrIndex` via
  `TypeOrValueSemanticsContainer`, and update
  `SignlessFixedWidthIntegerLike` to use
  `AnyNonZeroBitwidthSignlessInteger`.  Replace all uses of the
  shared `SignlessIntegerOrIndexLike` in ArithOps.td with the new
  dialect-local constraint.
- Update `IndexCastTypeConstraint` to use
  `Arith_SignlessIntegerOrIndexLike`.
- Update `BitcastTypeConstraint` to exclude i0 by composing the already-
  defined `SignlessFixedWidthIntegerLike` and `FloatLike` constraints,
  keeping the definition compact (3 alternatives instead of 7).
- Add `AnyVectorOfNonI0Elem` and `AnyVectorOfNonZeroRankNonI0Elem` in
  VectorOps.td and apply them to `vector.contract`, `vector.reduction`,
  `vector.multi_reduction`, `vector.outerproduct`, `vector.bitcast`, and
  `vector.scan`.
- Update arith/invalid.mlir with explicit i0 rejection tests covering
  all integer op families (binary ops, cast ops, extended-multiply ops,
  cmpi, bitcast, index_cast, index_castui) for both scalar and vector<N>
  forms.
- Update vector/invalid.mlir with i0 rejection tests for all covered
  ops.
- Remove the now-invalid i0 canonicalization tests from
  arith/canonicalize.mlir.

Fixes #177822
Fixes #179266
Fixes #180463
Fixes #181532

See also
https://discourse.llvm.org/t/rfc-reject-i0-integer-type-in-arith-and-vector-ops/90011
2026-03-04 12:34:18 +01:00
Jueon Park
ee92ac2343
[mlir][nvgpu] Fix crash in optimize-shared-memory pass with vector element types (#179111)
The `--nvgpu-optimize-shared-memory` pass crashed when processing
memrefs with vector element types (e.g., `memref<16x1xvector<16xf16>,
3>`). This occurred because getElementTypeBitWidth() calls
getIntOrFloatBitWidth(), which asserts the element type must be an
integer or float.
Thus, this PR adds an early-exit guard to return failure() when the
memref's element type is not a scalar int or float.

I wasn't sure if we should support vector types (by multiplying element
bit width by vector length) or just reject them. For now, I've
implemented it to return failure on non-scalar types.

Fixes #177823

Co-authored-by: rebel-jueonpark <jueonpark@rebellions.ai>
2026-03-04 12:33:40 +01:00
Graham Hunter
943eb6fd95
[LV] Use make_early_inc_range in handleFindLastReductions (#184340)
Fixes #182152
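
For context, the idea behind an early-increment range can be sketched with standard containers: the iterator is advanced before the current element is possibly erased. The function below is a hypothetical illustration on `std::list`, not the vectorizer code.

```cpp
#include <list>

// Erases even values while iterating over the list.
void eraseEvens(std::list<int> &values) {
  for (auto it = values.begin(); it != values.end();) {
    // Step past the current element first, so erasing it does not
    // invalidate the iterator used to continue the loop.
    auto current = it++;
    if (*current % 2 == 0)
      values.erase(current);
  }
}
```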
2026-03-04 11:28:56 +00:00
Mirko Brkušanin
d0f50d5574
[AMDGPU] Remove DX10_CLAMP and IEEE bits from gfx1170 (#182107)
Add `DX10ClampAndIEEEMode` feature and set it for every subtarget prior
to gfx1170
2026-03-04 12:16:41 +01:00
Eugene Epshteyn
de5e081a83
[flang][NFC] Converted five tests from old lowering to new lowering (part 23) (#184533)
Tests converted from test/Lower/Intrinsics: adjustr.f90, all.f90,
any.f90, asinpi.f90, associated.f90
2026-03-04 06:10:31 -05:00
Mehdi Amini
8ac00ba7f9
[mlir][SCFToEmitC] Fix crash when scf.while carries a memref loop variable (#183944)
When a scf.while op has a loop-carried value whose type converts to
emitc::ArrayType (e.g. memref<1xf64>), the WhileLowering pattern
unconditionally called emitc::LValueType::get(arrayType), which
triggered an assertion because LValueType cannot wrap an array type.

Fix by returning a match failure in createVariablesForResults and
createVariablesForLoopCarriedValues when the converted type is an
emitc::ArrayType. This converts the crash into a proper legalization
failure.

Fixes #182649
2026-03-04 12:03:46 +01:00
Fedor Nikolaev
f1aa7c3c5f
[mlir][cf] Canonicalize block args with uniform incoming values (#183966)
Add a canonicalization pattern that replaces block arguments with a
common SSA value when all predecessors pass the same value for that
argument. This allows the block argument to be removed by dead code
elimination. This is a first iteration.
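
The core check can be sketched as follows, with integers standing in for SSA values. `Value`, `uniformIncoming`, and the `incoming` layout are hypothetical; the sketch ignores details a real pattern has to handle, such as a predecessor passing the block argument itself.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

using Value = int;  // hypothetical stand-in for an SSA value handle

// incoming[p][a] is the value predecessor p passes for block argument a.
// Returns the shared value if all predecessors agree on argument `arg`.
std::optional<Value>
uniformIncoming(const std::vector<std::vector<Value>> &incoming,
                std::size_t arg) {
  if (incoming.empty())
    return std::nullopt;
  assert(arg < incoming.front().size() && "argument index out of range");
  const Value first = incoming.front()[arg];
  for (const auto &operands : incoming) {
    assert(arg < operands.size() && "argument index out of range");
    if (operands[arg] != first)
      return std::nullopt;
  }
  return first;
}
```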

Idea from #182711
2026-03-04 12:03:17 +01:00
Florian Hahn
f702ee89c1
[VPlan] Fix partially uninitialized accesses after 17aaa0e590a7. (#184583)
17aaa0e590a7 adjusted how parts of the union members are managed. Make
sure the full union is initialized, to fix MSan failure in
https://lab.llvm.org/buildbot/#/builders/164/builds/19313.
2026-03-04 10:56:27 +00:00
Simon Pilgrim
2aab31a94e
[X86] combine-fcopysign.ll - extend test coverage to all x86-64/x86-64-v2/x86-64-v3/x86-64-v4 levels (#184579) 2026-03-04 10:50:19 +00:00
Nikita Popov
177211a99f
[AArch64] Generate test checks (NFC) (#184582) 2026-03-04 11:36:04 +01:00
Henry Baba-Weiss
6bdf076137
[clang] Predefine _MSVC_TRADITIONAL in MSVC compatibility mode (#184278)
As of version 19.15 (Visual Studio 2017 version 15.8), MSVC predefines
the `_MSVC_TRADITIONAL` macro to indicate whether it is using the old
"traditional" preprocessor or the new standards-conforming preprocessor.
Clang now predefines `_MSVC_TRADITIONAL` as 1 when emulating MSVC 19.15
or later, since Clang supports most traditional preprocessor behaviors
(e.g. `/##/` turning into `//`) when running in MSVC compatibility mode.

Currently there isn't a situation where it makes sense for Clang to
report `_MSVC_TRADITIONAL` as 0, since MSVC compatibility mode only
attempts to be compatible with the traditional MSVC preprocessor.
However, this does mean that clang-cl cannot match MSVC's behavior of
implicitly enabling the conforming C preprocessor when compiling with
`/std:c11`, `/std:c17`, or `/std:clatest`.

Fixes #47114
2026-03-04 11:35:34 +01:00
David Spickett
1582dd9c31
[lldb] Change more uses of AppendMessageWithFormat to AppendMessageWithFormatv (#184337)
When the message includes a final newline, Formatv can add that for you.

The only unusual change is one place in platform where we need to print
octal. LLVM doesn't have a built in way to do this (see
llvm/include/llvm/Support/FormatProviders.h) and this is probably the
only place in the codebase that wants to. So I decided not to add it
there.

Instead I've put the number into a format adapter with the normal printf
specifier, then put that into the Formatv format.
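
The approach can be approximated with the standard library alone: format the number with a printf-style specifier first, then splice the result into the larger message. The function names and message text below are hypothetical, and no LLVM APIs are used.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Formats `value` in octal using the printf "%o" specifier.
std::string formatOctal(unsigned value) {
  char buffer[16];
  const int written = std::snprintf(buffer, sizeof(buffer), "%o", value);
  assert(written > 0 &&
         static_cast<std::size_t>(written) < sizeof(buffer));
  (void)written;  // silences unused warnings when assertions are disabled
  return buffer;
}

// Embeds the pre-formatted number in a message and appends the newline.
std::string describePermissions(unsigned mode) {
  return "permissions: " + formatOctal(mode) + "\n";
}
```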
2026-03-04 10:33:10 +00:00
serge-sans-paille
d737cd5055
[clang-tools-extra] Turn misc copy-assign into move-assign (#184146)
That's an automated patch generated from clang-tidy
performance-use-std-move as a follow-up to #184136
2026-03-04 10:20:39 +00:00
Rolf Morel
756d068ead
[MLIR][Python][Transform] Expose PatternDescriptorOpInterface to Python (#184331)
Makes it possible to include Python-defined rewrite patterns in
transform-dialect schedules, inside of `transform.apply_patterns`, which
upon execution of the schedule runs the pattern in a greedy rewriter.

With assistance of Claude.
2026-03-04 10:19:59 +00:00
Devajith
9cc0df99de
[clang-repl] Create virtual files for input_line_N buffers (#182044)
Instead of using memory buffers without file backing, this patch creates
`input_line_N` buffers as virtual files.

This enables us to use input line numbers when verifying `clang-repl`
tests.

Co-authored-by: Vassil Vassilev <v.g.vassilev@gmail.com>
2026-03-04 12:15:59 +02:00