548981 Commits

Author SHA1 Message Date
Sergei Barannikov
b7d6f484c8
[RISCV] Remove non-existent operand of nds.vfwcvt/nds.vfncvt instructions (#153865)
The mask operand is likely a copy-paste error; these instructions don't have one.
2025-08-16 00:46:19 +03:00
Peter Collingbourne
6beb6f34bc dfsan: Fix test with gcc 15.
With gcc 15 we end up emitting a reference to the
std::__glibcxx_assert_fail function because of this change:
361d230fd7
combined with assertion checks in the std::atomic implementation.

This reference is undefined with dfsan causing the test to fail. Fix it
by defining the macro that disables assertions.

Pull Request: https://github.com/llvm/llvm-project/pull/153873
2025-08-15 14:44:27 -07:00
Peter Collingbourne
19cfc30b33 compiler-rt: Make the tests pass on AArch64 and with page size != 4096.
This makes the tests pass on my AArch64 machine with 16K pages.

Not sure why some of the AArch64-specific test failures don't seem to
occur on sanitizer-aarch64-linux. I could also reproduce them by running
buildbot_cmake.sh on my machine.

Pull Request: https://github.com/llvm/llvm-project/pull/153860
2025-08-15 14:44:27 -07:00
Haibo Jiang
21a5729b87
[BOLT] Do not use HLT as split point when building the CFG (#150963)
For x86, the halt instruction is defined as a terminator instruction.
When building the CFG, the instruction sequence following the hlt
instruction is treated as an independent MBB. Since there is no jump
information, the predecessor of this MBB cannot be identified, and it is
considered an unreachable MBB that will be removed.

With this fix, the instruction sequences before and after hlt are no
longer placed in different blocks.
2025-08-15 14:35:13 -07:00
Aiden Grossman
d0b19cf792 [Github][CI] Set CC and CXX in CI Container
We set these explicitly in a bunch of places. That is annoying and it is nice
to get them picked up by default rather than needing to remember.
2025-08-15 21:31:17 +00:00
Stanislav Mekhanoshin
1f25c4883e
[AMDGPU] Mitigate DS_ATOMIC_ASYNC_BARRIER_ARRIVE_B64 bug (#153872)
DS_ATOMIC_ASYNC_BARRIER_ARRIVE_B64 shall not be claused (we already do
not clause DS instructions) and needs waits before and after.
2025-08-15 14:17:54 -07:00
Chenguang Wang
eecbaac5c6
[bazel] Add yaml2obj to mlir/Test/Target/BUILD.bazel (#153875)
https://github.com/llvm/llvm-project/pull/152131 uses yaml2obj, which is
not listed as a dependency of the lit tests in bazel. This is causing
LLVM CI failures, e.g. [1].

[1]:
https://buildkite.com/llvm-project/upstream-bazel/builds/146788/steps/canvas?sid=0198af37-f624-470f-aac1-d9e0b42fab56
2025-08-15 21:16:03 +00:00
Slava Zakharin
25285b3476
[flang] Lower EOSHIFT into hlfir.eoshift. (#153106)
Straightforward lowering of EOSHIFT intrinsic into the new hlfir.eoshift
operation.
2025-08-15 13:55:05 -07:00
Slava Zakharin
4c6afc7993
[flang] Lower hlfir.eoshift to the runtime call. (#153107)
Straightforward lowering of hlfir.eoshift to the runtime call
in LowerHLFIRIntrinsics pass.
2025-08-15 13:54:49 -07:00
Stanislav Mekhanoshin
e3154559ef
[AMDGPU] Select mul_lohi to V_MAD_NC_{I|U}64_I32 on gfx1250 (#153851) 2025-08-15 13:53:08 -07:00
gulfemsavrun
334e9bf2dd
Revert "RuntimeLibcalls: Generate table of libcall name lengths (#153… (#153864)
…210)"

This reverts commit 9a14b1d254a43dc0d4445c3ffa3d393bca007ba3.

Revert "RuntimeLibcalls: Return StringRef for libcall names (#153209)"

This reverts commit cb1228fbd535b8f9fe78505a15292b0ba23b17de.

Revert "TableGen: Emit statically generated hash table for runtime
libcalls (#150192)"

This reverts commit 769a9058c8d04fc920994f6a5bbb03c8a4fbcd05.

Reverted three changes because of a CMake error while building llvm-nm
as reported in the following PR:
https://github.com/llvm/llvm-project/pull/150192#issuecomment-3192223073
2025-08-15 13:32:27 -07:00
Matheus Izvekov
5c51a88f19
[clang] fix DependentNameType -> UnresolvedUsingType transforms (#153862) 2025-08-15 17:21:55 -03:00
Sterling-Augustine
5b0619e79b
Move function info word into its own data structure (#153627)
The sframe generator needs to construct this word separately from FDEs
themselves, so split them into a separate data structure.
2025-08-15 13:16:34 -07:00
Slava Zakharin
95d4362521
[flang] Added hlfir.eoshift operation definition. (#153105)
This is a basic definition of the operation corresponding to
Fortran's EOSHIFT transformational intrinsic.
2025-08-15 13:15:35 -07:00
Craig Topper
c84a43ff3b
[RISCV] Fold (sext_inreg (xor (setcc), -1), i1) -> (add (setcc), -1). (#153855)
This improves all 3 vendor extensions that make sext_inreg i1 legal.
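
The identity behind this fold can be checked with a small scalar model. This is only a sketch, not the DAG combine itself: it assumes setcc produces 0 or 1, and the helper names `sextInReg1`, `beforeFold` and `afterFold` are hypothetical.

```cpp
#include <cstdint>

// Model of (sext_inreg x, i1): keep bit 0 and sign-extend it.
std::int32_t sextInReg1(std::int32_t x) { return (x & 1) ? -1 : 0; }

// (sext_inreg (xor (setcc), -1), i1), with setcc assumed to be 0 or 1.
std::int32_t beforeFold(bool cond) {
  std::int32_t setcc = cond ? 1 : 0;
  return sextInReg1(setcc ^ -1);
}

// (add (setcc), -1), with the same assumption on setcc.
std::int32_t afterFold(bool cond) {
  std::int32_t setcc = cond ? 1 : 0;
  return setcc + -1;
}
```

Both forms give 0 when the condition holds and -1 otherwise, so the xor and the sign extension collapse into a single add.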

Fixes #153781.
2025-08-15 12:55:18 -07:00
Aiden Grossman
ca8ee49c1f
[MLIR] Set LLVM_LIT_ARGS in Standalone Example CMake (#152423)
Setting LLVM_LIT_ARGS to include --quiet and then running check-mlir in
a standard checkout will otherwise cause test failures here because
LLVM_LIT_ARGS gets propagated into this project.
2025-08-15 12:40:32 -07:00
Augusto Noronha
c61fb5ca69
[NFC][lldb] Make C++ symbols in CPlusPlusLanguageTest.cpp valid (#153857) 2025-08-15 19:40:24 +00:00
Alexey Bataev
b157599156 [SLP]Do not include copyable data to the same user twice
If the copyable schedule data is created and the user is used several
times in the user node, there is no need to count the same data for the
same user several times; it needs to be included only once.

Fixes #153754
2025-08-15 12:36:45 -07:00
David Green
732eb5427c
[AArch64] Replace SIMDLongThreeVectorBHSabd with SIMDLongThreeVectorBHS. (#152987)
We just need to use a BinOpFrag to share the patterns. This also moves
UABDL to where it belongs in with similar instructions, and removes some
patterns that are now handled by abd nodes. This is mostly NFC except
for GISel, which will catch back up when it handles abd nodes in the
same way.
2025-08-15 20:35:27 +01:00
Florian Hahn
2ed727f3f6
[VPlan] Move SCEV invalidation to ::executePlan. (NFCI)
Move SCEV invalidation from legacy ILV code-path directly to ::executePlan.
2025-08-15 20:32:41 +01:00
Chenguang Wang
b3e3a2090b
[bazel] Add missing test inputs inclusion on mlir/test/Target. (#153854)
https://github.com/llvm/llvm-project/pull/152131 added a few tests that
depend on `mlir/test/Target/Wasm/inputs/*`, e.g.
`mlir/test/Target/Wasm/import.mlir` reads `inputs/import.yaml.wasm`.
These inputs should be included as data dependency.
2025-08-15 12:32:15 -07:00
CatherineMoore
49e28d77b8
[OpenMP] Update ompdModule.c printf to match argument type (#152785)
Update printf format string to match argument list

---------

Co-authored-by: Joachim <protze@rz.rwth-aachen.de>
Co-authored-by: Joachim Jenke <jenke@itc.rwth-aachen.de>
2025-08-15 14:30:47 -05:00
Augusto Noronha
c6ea7d72d1
[lldb] Fix CXX's SymbolNameFitsToLanguage matching other languages (#153685)
The current implementation of
CPlusPlusLanguage::SymbolNameFitsToLanguage will return true if the
symbol is mangled for any language that lldb knows about.
2025-08-15 12:30:21 -07:00
Bill Wendling
139bde2035
[llvm] Ignore coding assistant artifacts (#153853)
Now that "vibe coding" is a thing, ignore the documentation artifacts
that coding assistants, like Claude and Gemini, use to retain coding
workflows and other metadata.
2025-08-15 12:27:54 -07:00
Alexey Bataev
09f5b9ab0a Revert "[SLP]Do not include copyable data to the same user twice"
This reverts commit 758c6852c3ffe6b5e259cafadd811e60d8c276fb to fix
buildbot  https://lab.llvm.org/buildbot/#/builders/195/builds/13298
2025-08-15 12:08:31 -07:00
Jasmine Tang
d7a29e5d56
[WebAssembly] Reapply #149461 with correct CondCode in combine of SETCC (#153703)
This PR reapplies https://github.com/llvm/llvm-project/pull/149461

In the original `combineVectorSizedSetCCEquality`, the result of setcc
is being negated by returning setcc with the same cond code, leading to
wrong logic.

For example, with
```llvm
  %cmp_16 = call i32 @memcmp(ptr %a, ptr %b, i32 16)
  %res = icmp eq i32 %cmp_16, 0
```

the original PR produces all_true and then also compares the result
for equality with 0 (using the same SETEQ in the returned setcc), meaning
that semantically, it is effectively computing icmp ne.

Instead, the PR should have used SETNE in the returned setcc. This way,
all_true returns 1, which is then compared ne 0, which is equivalent
to icmp eq.
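
The condition-code mix-up can be illustrated with a scalar model. This is a rough sketch, not the WebAssembly lowering: `allTrue`, `buggyCombine` and `fixedCombine` are hypothetical names, and `allTrue` stands in for a lane-wise equality compare followed by all_true.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstring>

using Block16 = std::array<std::uint8_t, 16>;

// Hypothetical stand-in for all_true over a lane-wise equality compare:
// returns 1 if every byte matches, 0 otherwise.
int allTrue(const Block16 &a, const Block16 &b) {
  for (std::size_t i = 0; i < a.size(); ++i)
    if (a[i] != b[i])
      return 0;
  return 1;
}

// What the source computes: memcmp(a, b, 16) == 0.
bool reference(const Block16 &a, const Block16 &b) {
  return std::memcmp(a.data(), b.data(), a.size()) == 0;
}

// SETEQ kept from the original setcc: all_true == 0, which behaves like icmp ne.
bool buggyCombine(const Block16 &a, const Block16 &b) {
  return allTrue(a, b) == 0;
}

// SETNE instead: all_true != 0, which matches icmp eq.
bool fixedCombine(const Block16 &a, const Block16 &b) {
  return allTrue(a, b) != 0;
}
```

For any pair of blocks, `fixedCombine` agrees with `reference`, while `buggyCombine` returns the opposite answer.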
2025-08-15 12:06:47 -07:00
Abhinav Gaba
79cf877627
[Offload] Introduce dataFence plugin interface. (#153793)
The purpose of this fence is to ensure that any `dataSubmit`s inserted
into a queue before a `dataFence` finish before any `dataSubmit`s
inserted after it begin.

This is a no-op for most queues, since they are in-order, and by design
any operations inserted into them occur in order.

But the interface is supposed to be functional for out-of-order queues.

The addition of the interface means that any operations that rely on
such ordering (like ATTACH map-type support in #149036) can invoke it,
without worrying about whether the underlying queue is in-order or
out-of-order.

Once a plugin supports out-of-order queues, the plugin can implement
this function, without requiring any change at the libomptarget level.

---------

Co-authored-by: Alex Duran <alejandro.duran@intel.com>
2025-08-15 11:49:35 -07:00
zGoldthorpe
82caa251d4
[InstCombine] Fold integer unpack/repack patterns through ZExt (#153583)
This patch explicitly enables the InstCombiner to fold integer
unpack/repack patterns such as

```llvm
define i64 @src_combine(i32 %lower, i32 %upper) {
  %base = zext i32 %lower to i64

  %u.0 = and i32 %upper, u0xff
  %z.0 = zext i32 %u.0 to i64
  %s.0 = shl i64 %z.0, 32
  %o.0 = or i64 %base, %s.0

  %r.1 = lshr i32 %upper, 8
  %u.1 = and i32 %r.1, u0xff
  %z.1 = zext i32 %u.1 to i64
  %s.1 = shl i64 %z.1, 40
  %o.1 = or i64 %o.0, %s.1

  %r.2 = lshr i32 %upper, 16
  %u.2 = and i32 %r.2, u0xff
  %z.2 = zext i32 %u.2 to i64
  %s.2 = shl i64 %z.2, 48
  %o.2 = or i64 %o.1, %s.2

  %r.3 = lshr i32 %upper, 24
  %u.3 = and i32 %r.3, u0xff
  %z.3 = zext i32 %u.3 to i64
  %s.3 = shl i64 %z.3, 56
  %o.3 = or i64 %o.2, %s.3

  ret i64 %o.3
}
; =>
define i64 @tgt_combine(i32 %lower, i32 %upper) {
  %base = zext i32 %lower to i64
  %upper.zext = zext i32 %upper to i64
  %s.0 = shl nuw i64 %upper.zext, 32
  %o.3 = or disjoint i64 %s.0, %base
  ret i64 %o.3
}
```

Alive2 proofs: [YAy7ny](https://alive2.llvm.org/ce/z/YAy7ny)
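
As an informal cross-check of the IR above (the Alive2 proof remains the authoritative one), the same equivalence can be written with plain integers. The function names mirror the IR for readability and are otherwise illustrative.

```cpp
#include <cstdint>

// Byte-by-byte unpack/repack, mirroring @src_combine.
std::uint64_t srcCombine(std::uint32_t lower, std::uint32_t upper) {
  std::uint64_t result = lower; // zext i32 %lower to i64
  for (unsigned byte = 0; byte < 4; ++byte) {
    // lshr + and: extract one byte of %upper.
    std::uint64_t piece = (upper >> (8 * byte)) & 0xffu;
    // zext + shl + or: place it in the upper half of the result.
    result |= piece << (32 + 8 * byte);
  }
  return result;
}

// Single zext + shl + or, mirroring @tgt_combine.
std::uint64_t tgtCombine(std::uint32_t lower, std::uint32_t upper) {
  return (static_cast<std::uint64_t>(upper) << 32) | lower;
}
```

For example, both functions map `lower = 0x12345678` and `upper = 0x9abcdef0` to `0x9abcdef012345678`.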
2025-08-15 12:48:32 -06:00
Alexey Bataev
758c6852c3 [SLP]Do not include copyable data to the same user twice
If the copyable schedule data is created and the user is used several
times in the user node, there is no need to count the same data for the
same user several times; it needs to be included only once.

Fixes #153754
2025-08-15 11:47:35 -07:00
Erich Keane
dcdbd5b55d
[OpenACC][NFCI] Implement 'recipe' generation for firstprivate copy (#153622)
The 'firstprivate' clause requires that we do a 'copy' operation, so
this patch creates some AST nodes from which we can generate the copy
operation, including a 'temporary' and array init. For the most part
this is pretty similar to what 'private' does other than the fact that
the source is copy (and not default init!), and that there is a
temporary from which to copy.

---------

Co-authored-by: Andy Kaylor <akaylor@nvidia.com>
2025-08-15 18:42:40 +00:00
Stanislav Mekhanoshin
29976f2e58
[AMDGPU] Handle S_GETREG_B32 hazard on gfx1250 (#153848)
GFX1250 SPG says: S_GETREG_B32 does not wait for idle before executing.
The user must S_WAIT_ALU 0 before S_GETREG_B32 on:
STATUS, STATE_PRIV, EXCP_FLAG_PRIV, or EXCP_FLAG_USER.
2025-08-15 11:38:22 -07:00
XChy
3a4a60deff
[VectorCombine] Apply InstSimplify in scalarizeOpOrCmp to avoid infinite loop (#153069)
Fixes #153012

As we tolerate unfoldable constant expressions in `scalarizeOpOrCmp`, we
may fold
```llvm
define void @bug(ptr %ptr1, ptr %ptr2, i64 %idx) #0 {
entry:
  %158 = insertelement <2 x i64> <i64 5, i64 ptrtoint (ptr @val to i64)>, i64 %idx, i32 0
  %159 = or disjoint <2 x i64> splat (i64 2), %158
  store <2 x i64> %159, ptr %ptr2
  ret void
}
```

to

```llvm
define void @bug(ptr %ptr1, ptr %ptr2, i64 %idx) {
entry:
  %.scalar = or disjoint i64 2, %idx
  %0 = or <2 x i64> splat (i64 2), <i64 5, i64 ptrtoint (ptr @val to i64)>
  %1 = insertelement <2 x i64> %0, i64 %.scalar, i64 0
  store <2 x i64> %1, ptr %ptr2, align 16
  ret void
}
```
And it would be folded back in `foldInsExtBinop`, resulting in an
infinite loop.

This patch forces scalarization iff InstSimplify can fold the constant
expression.
2025-08-15 18:38:04 +00:00
Dave Lee
1dc0005d6d
Revert "[lldb] Fallback to expression eval when Dump of variable fails in dwim-print" (#153824)
Reverts llvm/llvm-project#151374

Superseded by https://github.com/llvm/llvm-project/pull/152417
2025-08-15 11:29:31 -07:00
Stanislav Mekhanoshin
5d28284dbb
[AMDGPU] gfx1250 does not need nop before VGPR dealloc (#153844)
This has no impact as the dealloc is now practically disabled.
2025-08-15 11:29:02 -07:00
Valentin Clement (バレンタイン クレメン)
3720d8b52d
[flang][cuda] Update some bind name to fast version and add __sincosf (#153744)
Use the fast version in the bind name and reorder these fast math
functions. Add missing __sincosf interface.
2025-08-15 11:07:15 -07:00
Aaron Ballman
ed6d505fab
[C][Docs] Add backported language features (#153837)
We've backported a lot more features from C to previous C standards than
we were documenting. I took a pass over the c_status page for Clang and
pulled more entries to add to our documentation.
2025-08-15 13:59:41 -04:00
Kaitlin Peng
0bb1af478a
[DirectX] Add GlobalDCE pass after finalize linkage pass in DirectX backend (#151071)
Fixes #139023.

This PR essentially removes unused global variables:
- Restores the `GlobalDCE` Legacy pass and adds it to the DirectX
backend after the finalize linkage pass
- Converts external global variables with no usage to internal linkage
in the finalize linkage pass
  - (so they can be removed by `GlobalDCE`)
- Makes the `dxil-finalize-linkage` pass usable using the new pass
manager flag syntax
- Adds tests to `finalize_linkage.ll` that make sure unused global
variables are removed
- Adds a use for variable `@CBV` in `opaque-value_as_metadata.ll` so it
isn't removed
- Changes the `scalar-data.ll` run command to avoid removing its global
variables

---------

Co-authored-by: Farzon Lotfi <farzonlotfi@microsoft.com>
2025-08-15 10:45:34 -07:00
Aiden Grossman
069f8121e0
[X86] Add RCU for Skylake Models (#153832)
We cannot actually retire an infinite number of uops per cycle. This
patch adds an RCU to the skylake scheduling model to fix this. I'm
purposefully using a loose upper bound here. We're unlikely to actually
get four fused uops per cycle, but this is better than not setting
anything. Most realistic code I've put through uiCA will retire up to ~6
uops per cycle.

Information taken from
https://en.wikichip.org/wiki/intel/microarchitectures/skylake_(client).

This requires modification of the two zero idiom tests because we do not
currently model the CPU frontend which would likely be the actual
bottleneck in that case.

Related to #153747.
2025-08-15 10:33:26 -07:00
Valentin Clement (バレンタイン クレメン)
115f816069
[flang][cuda] Add missing bind name for __int2double_rn (#153720) 2025-08-15 10:27:19 -07:00
Valentin Clement (バレンタイン クレメン)
0e4af726cb
[flang][cuda] Add interface for __fdividef (#153742) 2025-08-15 10:26:40 -07:00
Valentin Clement (バレンタイン クレメン)
0e8c964c21
[flang][cuda] Add interfaces for double_as_longlong and longlong_as_double (#153719) 2025-08-15 17:26:11 +00:00
Alex MacLean
bc77363235
[NVPTX] Do not mark move of global address as cheap enabling more CSE (#153730) 2025-08-15 10:17:34 -07:00
Valentin Clement (バレンタイン クレメン)
fd3f052aeb
[flang][cuda] Add interfaces for int_as_float and float_as_int (#153716) 2025-08-15 10:00:53 -07:00
Simon Pilgrim
92cb0414ca [X86] avx512vnni-builtins.c / avx512vlvnni-builtins.c - add C/C++ and 32/64-bit test coverage 2025-08-15 17:55:33 +01:00
asraa
b045729eb4
[mlir][presburger] add functionality to compute local mod in IntegerRelation (#153614)
Similar to `IntegerRelation::addLocalFloorDiv`, this adds a utility
`IntegerRelation::addLocalModulo` that adds a local variable equal to an
affine function of the variables modulo a constant modulus. The function
returns the absolute index of the new var in the relation.

This is computed by first finding the floordiv of `exprs // modulus = q`
and then computing the remainder `result = exprs - q * modulus`.
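
The arithmetic behind this can be sketched on plain integers. This is only an illustration, not the `IntegerRelation` API: `floorDiv` and `floorMod` are hypothetical helpers, and a positive modulus is assumed.

```cpp
#include <cassert>
#include <cstdint>

// Floor division for a positive divisor (assumption: rhs > 0).
std::int64_t floorDiv(std::int64_t lhs, std::int64_t rhs) {
  assert(rhs > 0 && "sketch assumes a positive modulus");
  std::int64_t q = lhs / rhs; // C++ division truncates toward zero
  if (lhs % rhs != 0 && lhs < 0)
    --q; // round toward negative infinity instead
  return q;
}

// result = expr - q * modulus, where q = expr floordiv modulus.
std::int64_t floorMod(std::int64_t expr, std::int64_t modulus) {
  std::int64_t q = floorDiv(expr, modulus);
  return expr - q * modulus;
}
```

With this definition the result always lies in `[0, modulus)`, including for negative values of the expression: `floorMod(-7, 3)` is 2, not -1.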

Signed-off-by: Asra Ali <asraa@google.com>
2025-08-15 09:55:13 -07:00
zGoldthorpe
a8d25683ee
[PatternMatch] Allow m_ConstantInt to match integer splats (#153692)
When matching integers, `m_ConstantInt` is a convenient alternative to
`m_APInt` for matching unsigned 64-bit integers, allowing one to
simplify

```cpp
const APInt *IntC;
if (match(V, m_APInt(IntC))) {
  if (IntC->ule(UINT64_MAX)) {
    uint64_t Int = IntC->getZExtValue();
    // ...
  }
}
```
to
```cpp
uint64_t Int;
if (match(V, m_ConstantInt(Int))) {
  // ...
}
```

However, this simplification is only valid if `V` has a scalar type.
Specifically, `m_APInt` also matches integer splats, but `m_ConstantInt`
does not.

This patch ensures that the matching behaviour of `m_ConstantInt`
parallels that of `m_APInt`, and also incorporates it in some obvious
places.
2025-08-15 10:43:54 -06:00
keinflue
af96ed6bf6
[clang] Inject IndirectFieldDecl even if name conflicts. (#153140)
This modifies InjectAnonymousStructOrUnionMembers to inject an
IndirectFieldDecl and mark it invalid even if its name conflicts with
another name in the scope.

This resolves a crash on a further diagnostic
diag::err_multiple_mem_union_initialization which via
findDefaultInitializer relies on these declarations being present.

Fixes #149985
2025-08-15 09:43:29 -07:00
Simon Pilgrim
2c20a9bfb3 [X86] avx512bf16-builtins.c / avx512vlbf16-builtins.c - add C/C++ and 32/64-bit test coverage 2025-08-15 17:38:43 +01:00
CatherineMoore
3a8f579a23
[OpenMP] Update printf statement with missing argument. (#153704) 2025-08-15 16:34:00 +00:00
Valentin Clement (バレンタイン クレメン)
583499a8cf
[flang][cuda] Add missing bind name for __hiloint2double, __double2loint and __double2hiint (#153713) 2025-08-15 09:32:59 -07:00