548550 Commits

Author SHA1 Message Date
Jakub Kuderski
1633e0ba8b
[ADT] Add from_range constructor for (Small)DenseMap (#153515)
This follows how we support range construction for (Small)DenseSet.
2025-08-14 08:53:52 -04:00
Ritanya-B-Bharadwaj
e3dcdb64ee
Claiming support for groupprivate and variable-category (#153553) 2025-08-14 18:15:46 +05:30
Jaden Angella
bfda0e777d
[mlir][EmitC] Expand the MemRefToEmitC pass - Lowering CopyOp (#151206)
This patch lowers `memref.copy` to `emitc.call_opaque "memcpy"`.
From:
```
func.func @copying(%arg0 : memref<9x4x5x7xf32>, %arg1 : memref<9x4x5x7xf32>) {
  memref.copy %arg0, %arg1 : memref<9x4x5x7xf32> to memref<9x4x5x7xf32>
  return
}
```
To:
```cpp
#include <cstring>
void copying(float v1[9][4][5][7], float v2[9][4][5][7]) {
  size_t v3 = 0;
  float* v4 = &v2[v3][v3][v3][v3];
  float* v5 = &v1[v3][v3][v3][v3];
  size_t v6 = sizeof(float);
  size_t v7 = 1260;
  size_t v8 = v6 * v7;
  memcpy(v5, v4, v8);
  return;
}
```
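The byte count passed to `memcpy` in the generated code follows from the static shape: 9 x 4 x 5 x 7 = 1260 elements, each `sizeof(float)` bytes wide. Below is a minimal sketch of that arithmetic. The names `kNumElements`, `kTotalBytes` and `copyBuffer` are hypothetical and not part of the patch.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical constants: element and byte counts of a memref<9x4x5x7xf32>.
constexpr std::size_t kNumElements = 9 * 4 * 5 * 7;
constexpr std::size_t kTotalBytes = sizeof(float) * kNumElements;

static_assert(kNumElements == 1260, "same element count as v7 above");
static_assert(kTotalBytes == sizeof(float[9][4][5][7]),
              "the flat byte count equals the size of the whole array");

// Illustrative helper (an assumption, not code from the pass): copies every
// element of src into dst. std::memcpy takes the destination pointer first.
void copyBuffer(const float (&src)[9][4][5][7], float (&dst)[9][4][5][7]) {
  std::memcpy(&dst[0][0][0][0], &src[0][0][0][0], kTotalBytes);
}
```

Taking the address of element `[0][0][0][0]` gives a flat `float*` to the start of the contiguous storage, which is why a single `memcpy` can cover the whole array.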
2025-08-14 05:25:55 -07:00
lonely eagle
6d08a39eeb
[mlir][nvgpu] Add tma last dim bytes check (#153451)
Add a check that the number of bytes in the last dimension of the TMA must be a
multiple of 16.
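The rule can be sketched as a one-line predicate. This is only an illustration of the stated constraint: the helper name `lastDimBytesOk` and its parameters are assumptions, not the verifier code added by the patch.

```cpp
#include <cstdint>

// Hypothetical helper: true when the innermost dimension spans a multiple of
// 16 bytes. `lastDimSize` is the element count of the last dimension and
// `elementBytes` the size of one element in bytes.
constexpr bool lastDimBytesOk(std::int64_t lastDimSize,
                              std::int64_t elementBytes) {
  return (lastDimSize * elementBytes) % 16 == 0;
}

static_assert(lastDimBytesOk(64, 2), "64 two-byte elements span 128 bytes");
static_assert(!lastDimBytesOk(3, 4), "3 four-byte elements span 12 bytes");
```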
2025-08-14 20:14:20 +08:00
Igor Wodiany
87de48d11f
[mlir][spirv] Add spirv validation for module.mlir target test (#153227)
Creating this patch as an example of using the new `mlir-translate`
flag. Eventually all tests will be updated to validate SPIR-V modules.
2025-08-14 12:45:55 +01:00
Vincent
d3bbdc7bde
[clang] constexpr __builtin_elementwise_abs support (#152497)
Added constant evaluation support for `__builtin_elementwise_abs` on integer, float and vector types.

fixes #152276

---------

Co-authored-by: Simon Pilgrim <llvm-dev@redking.me.uk>
2025-08-14 12:34:23 +01:00
Lang Hames
3bc3b4cf5f
[ORC] Add cloneExternalModuleToContext API.
cloneExternalModuleToContext can be used to clone an LLVM module onto a given
ThreadSafeContext. Callers of this function are responsible for ensuring
exclusive access to the source module and its LLVMContext.
2025-08-14 21:21:17 +10:00
mdenson
f5b36eb3a4
[clang] fix comment lexing of command names with underscore (#152943)
Comment lexer fails to parse non-alphanumeric names.

fixes #33296

---------

Co-authored-by: Brock Denson <brock.denson@virscient.com>
2025-08-14 13:03:55 +02:00
Theodoros Theodoridis
d15b7a83a7
[llvm][LICM] Limit multi-use BOAssociation to FP and Vector (#149829)
Limit the re-association of BOps with multiple users to FP and Vector
arithmetic.
2025-08-14 11:56:55 +01:00
Corentin Jabot
186176de45
[Clang] Do not consider a variadic function ellipsis part of a default arg (#153496)
When stashing the tokens of a parameter of a member function, we would
munch an ellipsis, as the only considered terminal conditions were `,`
and `)`.

Fixes #153445
2025-08-14 12:51:58 +02:00
Andrzej Warzyński
8d4f3171fa
[mlir][linalg] Fix UnPackOp::getTiledOuterDims (#152960)
Fixes `getTiledOuterDims` by making sure that the `outer_dims_perm`
attribute from `linalg.unpack` is taken into account.

Fixes #152037
2025-08-14 11:39:50 +01:00
Michael Kruse
38853a0146
[flang][OpenMP] MSVC buildbot fix
PR #153488 caused the msvc build (https://lab.llvm.org/buildbot/#/builders/166/builds/1397) to fail:
```
..\llvm-project\flang\include\flang/Evaluate/rewrite.h(78): error C2668: 'Fortran::evaluate::rewrite::Identity::operator ()': ambiguous call to overloaded function
..\llvm-project\flang\include\flang/Evaluate/rewrite.h(43): note: could be 'Fortran::evaluate::Expr<Fortran::evaluate::SomeType> Fortran::evaluate::rewrite::Identity::operator ()<Fortran::evaluate::SomeType,S>(Fortran::evaluate::Expr<Fortran::evaluate::SomeType> &&,const U &)'
        with
        [
            S=Fortran::evaluate::value::Integer<128,true,32,unsigned int,unsigned __int64,128>,
            U=Fortran::evaluate::value::Integer<128,true,32,unsigned int,unsigned __int64,128>
        ]
..\llvm-project\flang\lib\Semantics\check-omp-atomic.cpp(174): note: or       'Fortran::evaluate::Expr<Fortran::evaluate::SomeType> Fortran::semantics::ReassocRewriter::operator ()<Fortran::evaluate::SomeType,S,void>(Fortran::evaluate::Expr<Fortran::evaluate::SomeType> &&,const U &,Fortran::semantics::ReassocRewriter::NonIntegralTag)'
        with
        [
            S=Fortran::evaluate::value::Integer<128,true,32,unsigned int,unsigned __int64,128>,
            U=Fortran::evaluate::value::Integer<128,true,32,unsigned int,unsigned __int64,128>
        ]
..\llvm-project\flang\include\flang/Evaluate/rewrite.h(78): note: while trying to match the argument list '(Fortran::evaluate::Expr<Fortran::evaluate::SomeType>, const S)'
        with
        [
            S=Fortran::evaluate::value::Integer<128,true,32,unsigned int,unsigned __int64,128>
        ]
..\llvm-project\flang\include\flang/Evaluate/rewrite.h(78): note: the template instantiation context (the oldest one first) is
..\llvm-project\flang\lib\Semantics\check-omp-atomic.cpp(814): note: see reference to function template instantiation 'U Fortran::evaluate::rewrite::Mutator<Fortran::semantics::ReassocRewriter>::operator ()<const Fortran::evaluate::Expr<Fortran::evaluate::SomeType>&,Fortran::evaluate::Expr<Fortran::evaluate::SomeType>>(T)' being compiled
        with
        [
            U=Fortran::evaluate::Expr<Fortran::evaluate::SomeType>,
            T=const Fortran::evaluate::Expr<Fortran::evaluate::SomeType> &
        ]
```

The reason is that there is an ambiguity between operator() of
ReassocRewriter itself and operator() of the base class `Identity` through
`using Id::operator();`. By the C++ specification, method declarations
in ReassocRewriter hide methods with the same signature from a using
declaration, but this does not apply to
```
evaluate::Expr<T> operator()(..., NonIntegralTag = {})
```
which has a different signature due to an additional tag parameter.
Since it has a default value, it is ambiguous with operator() without
tag parameter.

GCC and Clang both accept this, but in my understanding MSVC is correct
here.

Since the overloads of ReassocRewriter cover all cases (integral and
non-integral), this patch removes the using declaration to avoid the ambiguity.
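The shape of the fix can be sketched in a few lines. This is a heavily simplified stand-in, not the flang code: `Identity` and `Rewriter` here are toy classes, the `Path` enum is hypothetical, and the real operators take `Expr<T>` arguments. The sketch only shows the state after the fix, where the derived class covers both the integral and the non-integral case itself and therefore needs no using-declaration.

```cpp
#include <type_traits>

// Hypothetical marker for which overload was selected.
enum class Path { Base, Integral, NonIntegral };

// Toy stand-in for the base class `Identity`.
struct Identity {
  template <typename T> constexpr Path operator()(T) const {
    return Path::Base;
  }
};

// Toy stand-in for `ReassocRewriter` after the fix. There is deliberately no
// `using Identity::operator();` here, so the overloads declared below hide
// the base-class operator().
struct Rewriter : Identity {
  struct NonIntegralTag {};

  template <typename T,
            typename = std::enable_if_t<std::is_integral<T>::value>>
  constexpr Path operator()(T) const {
    return Path::Integral;
  }

  // The defaulted tag parameter gives this overload a different signature
  // from a one-argument operator(), which is what the commit message
  // identifies as the source of the ambiguity.
  template <typename T,
            typename = std::enable_if_t<!std::is_integral<T>::value>>
  constexpr Path operator()(T, NonIntegralTag = {}) const {
    return Path::NonIntegral;
  }
};

static_assert(Rewriter{}(42) == Path::Integral, "");
static_assert(Rewriter{}(1.5) == Path::NonIntegral, "");
```

Because the two derived overloads are mutually exclusive and together cover every argument type, no call ever needs to reach the base-class operator.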
2025-08-14 12:30:59 +02:00
Florian Hahn
d92671cf7d
[PhaseOrdering] Add tests for optimizing std::find for AArch64. 2025-08-14 11:25:55 +01:00
Ege Beysel
8de85e753f
[mlir][linalg] Add support for scalable vectorization of linalg.batch_mmt4d (#152984)
This PR builds upon the previous #146531 and enables scalable
vectorization for `batch_mmt4d` as well.

---------

Signed-off-by: Ege Beysel <beyselege@gmail.com>
2025-08-14 11:47:51 +02:00
Simon Pilgrim
c96d0da62b
[X86] lowerShuffleAsLanePermuteAndPermute - ensure we've simplified the demanded shuffle mask elts before testing for a matching shuffle (#153554)
When lowering using sublane shuffles, we can sometimes end up with the
same mask as we started with. We already bail in these occasions, but we
weren't fully simplifying the new shuffle mask before testing if it
matched.

Fixes #153457
2025-08-14 10:47:11 +01:00
Matheus Izvekov
9255580a3a
[clang] fix skipped parsing of late parsed attributes (#153558) 2025-08-14 06:42:55 -03:00
tangaac
9315d701eb
[LoongArch] Optimize inserting extracted element for v4i64/v8i32 (#152629) 2025-08-14 17:06:50 +08:00
Björn Pettersson
5e7924a3cb
[SelectionDAG] Handle more opcodes in isGuaranteedNotToBeUndefOrPoison (#147019)
Add special handling of EXTRACT_SUBVECTOR, INSERT_SUBVECTOR,
EXTRACT_VECTOR_ELT, INSERT_VECTOR_ELT and SCALAR_TO_VECTOR in
isGuaranteedNotToBeUndefOrPoison. Make use of DemandedElts to improve
the analysis and only check relevant elements for each operand.

Also start using DemandedElts in the recursive calls that check
isGuaranteedNotToBeUndefOrPoison for all operands for operations that do
not create undef/poison. We can do that for a number of elementwise
operations for which the DemandedElts can be applied to every operand
(e.g. ADD, OR, BITREVERSE, TRUNCATE).
2025-08-14 09:05:15 +00:00
Jan Patrick Lehr
cd8c3bdf14
[ARM] Fix after #153394 (#153561)
This removes two double definitions.
2025-08-14 11:00:19 +02:00
TianYe
44e6bc6fc0
[Headers][X86] Allow AVX2/AVX512 broadcast intrinsics to be used in Constexpr (#153363)
Fix [issue](https://github.com/llvm/llvm-project/issues/152499)
This patch adds support for the following broadcast intrinsics
by wrapping them around existing generic shuffle implementations:
```
_mm_broadcastb_epi8
_mm_broadcastw_epi16
_mm_broadcastd_epi32
_mm_broadcastq_epi64
_mm_broadcastss_ps
_mm_broadcastsd_pd

_mm256_broadcastb_epi8
_mm256_broadcastw_epi16
_mm256_broadcastd_epi32
_mm256_broadcastq_epi64
_mm256_broadcastss_ps
_mm256_broadcastsd_pd

_mm256_broadcastsi128_si256

_mm512_broadcastb_epi8
_mm512_broadcastw_epi16
_mm512_broadcastd_epi32
_mm512_broadcastq_epi64
_mm512_broadcastss_ps
_mm512_broadcastsd_pd

_mm512_broadcast_f32x2 _mm256_broadcast_f32x2
_mm512_broadcast_i32x2 _mm256_broadcast_i32x2 _mm_broadcast_i32x2
_mm512_broadcast_f32x4 _mm256_broadcast_f32x4
_mm512_broadcast_i32x4 _mm256_broadcast_i32x4
_mm512_broadcast_f32x8
_mm512_broadcast_i32x8
_mm512_broadcast_f64x2 _mm256_broadcast_f64x2
_mm512_broadcast_i64x2 _mm256_broadcast_i64x2
_mm512_broadcast_f64x4
_mm512_broadcast_i64x4
```

Co-authored-by: Simon Pilgrim <llvm-dev@redking.me.uk>
2025-08-14 09:40:11 +01:00
mcbarton
b24b8a5bb4
Enable running ClangReplInterpreterTests in an Emscripten environment (#150977)
@vgvassilev @anutosh491 This is what it took for me to enable running
ClangReplInterpreterTests in an Emscripten environment. When I ran this
patch for LLVM 20 we could run InterpreterTest.InstantiateTemplate, but
now it crashes gtest when running in node. Let me know what you think.
2025-08-14 14:07:13 +05:30
Matt Arsenault
ddb2dc50af
ARM: Move gnu half convert calling conv config into tablegen (#153394) 2025-08-14 17:36:29 +09:00
Matt Arsenault
4aae7bc625
ARM: Move half convert libcall config to tablegen (#153389) 2025-08-14 17:35:58 +09:00
Shoreshen
04aebbfbe2
[AMDGPU] Delete AMDGPU Unify Metadata pass (#153548)
Fixes #153150
2025-08-14 16:16:32 +08:00
David Spickett
b0151cb91d
[compiler-rt][hwasan][test] Tweak check in release-shadow.c (#153181)
Since we (Linaro) moved our bots to a new machine, this test has been
failing:
https://lab.llvm.org/buildbot/#/builders/121/builds/1566

Most of the time, the rss difference is greater than 512 on the first
iteration then settles down to 512 for all the rest.
```
starting rss 512
shadow pages: 1024
p = 0xe083e0800000
1536 -> 740
diff 796
1252 -> 740
diff 512
1252 -> 740
diff 512
1252 -> 740
diff 512
1252 -> 740
diff 512
1252 -> 740
diff 512
1252 -> 740
diff 512
1252 -> 740
diff 512
1252 -> 740
diff 512
1252 -> 740
diff 512
p = 0xe083e0800000
passed 1 out of 10
release-shadow.c.tmp: /home/tcwg-buildbot/worker/clang-aarch64-lld-2stage/llvm/compiler-rt/test/hwasan/TestCases/Linux/release-shadow.c:81: int main(): Assertion `success_count > total_count * 0.8' failed.
```
Given that the test was looking for a diff of at least 513, I guess that
512 is ok too.
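The effect of relaxing the threshold can be replayed on the numbers from the log above. This is a sketch with hypothetical helper names (`countSuccesses`, `passes`), not the code in release-shadow.c; only the ten `diff` values and the "more than 80%" criterion come from the log.

```cpp
#include <cstddef>

// The ten "diff" values printed in the log above.
constexpr int kLoggedDiffs[] = {796, 512, 512, 512, 512,
                                512, 512, 512, 512, 512};

// Hypothetical helper: number of iterations whose RSS drop is at least minDiff.
template <std::size_t N>
constexpr std::size_t countSuccesses(const int (&diffs)[N], int minDiff) {
  std::size_t count = 0;
  for (std::size_t i = 0; i < N; ++i)
    if (diffs[i] >= minDiff)
      ++count;
  return count;
}

// Same shape as the failing assertion: success_count > total_count * 0.8.
template <std::size_t N>
constexpr bool passes(const int (&diffs)[N], int minDiff) {
  return countSuccesses(diffs, minDiff) > N * 0.8;
}

static_assert(countSuccesses(kLoggedDiffs, 513) == 1, "passed 1 out of 10");
static_assert(!passes(kLoggedDiffs, 513), "old threshold fails");
static_assert(passes(kLoggedDiffs, 512), "accepting 512 passes");
```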

For future reference, the original bot host was running this kernel:
Linux 5.15.0-136-generic #147-Ubuntu SMP Sat Mar 15 15:51:36 UTC 2025
aarch64 aarch64 aarch64 GNU/Linux

And the new host:
Linux 6.8.0-64-generic #67-Ubuntu SMP PREEMPT_DYNAMIC Sun Jun 15
20:23:40 UTC 2025 aarch64 aarch64 aarch64 GNU/Linux

Though the new host also has more RAM, so the kernel may be less
aggressive with memory management.
2025-08-14 09:13:53 +01:00
Nikita Popov
d1952baa5d
[CodeGen] Remove unnecessary setTypeListBeforeSoften() parameter (NFC)
It does not make sense to set the softening type list without
setting IsSoften=true.
2025-08-14 10:04:56 +02:00
Elvis Wang
01fac67e2a
[TTI] Add cost kind to getAddressComputationCost(). NFC. (#153342)
This patch adds cost kind to `getAddressComputationCost()` for #149955.

Note that this patch also removes all the default values in `getAddressComputationCost()`.
2025-08-14 16:01:44 +08:00
Piotr Fusik
18782db4c9
[RISCV] Improve instruction selection for most significant bit extraction (#151687)
    (seteq (and X, 1<<XLEN-1), 0) -> (xori (srli X, XLEN-1), 1)
    (seteq (and X, 1<<31), 0) -> (xori (srliw X, 31), 1) // RV64
    (setlt X, 0) -> (srli X, XLEN-1) // SRLI is compressible
    (setlt (sext X), 0) -> (srliw X, 31) // RV64
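The four rewrites above rest on simple bit identities, which can be checked in plain C++ for XLEN = 64. This is a sketch, not backend code: the function name `rewritesAgree` is hypothetical, and `srliw` is modeled as a logical shift of the low 32 bits.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical checker: true when each left-hand pattern and its rewritten
// form produce the same 0/1 value for x, assuming XLEN = 64.
constexpr bool rewritesAgree(std::int64_t x) {
  const std::uint64_t ux = static_cast<std::uint64_t>(x);
  const std::uint32_t lo = static_cast<std::uint32_t>(ux); // word seen by srliw
  return
      // (seteq (and X, 1<<63), 0) vs (xori (srli X, 63), 1)
      static_cast<std::uint64_t>((ux & (std::uint64_t{1} << 63)) == 0) ==
          ((ux >> 63) ^ 1u) &&
      // (seteq (and X, 1<<31), 0) vs (xori (srliw X, 31), 1)
      static_cast<std::uint64_t>((ux & (std::uint64_t{1} << 31)) == 0) ==
          ((lo >> 31) ^ 1u) &&
      // (setlt X, 0) vs (srli X, 63)
      static_cast<std::uint64_t>(x < 0) == (ux >> 63) &&
      // (setlt (sext X), 0) vs (srliw X, 31), with X being the low word
      static_cast<std::uint32_t>(static_cast<std::int32_t>(lo) < 0) ==
          (lo >> 31);
}

static_assert(rewritesAgree(0), "");
static_assert(rewritesAgree(-1), "");
static_assert(rewritesAgree(std::numeric_limits<std::int64_t>::min()), "");
static_assert(rewritesAgree(0x80000000LL), "");
```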
2025-08-14 09:59:43 +02:00
Nikolas Klauser
7b904b09eb
[libc++] Remove assertions from <string_view> that are unreachable (#148598)
When assertions are enabled it is impossible to construct a
`string_view` which contains a null pointer and a non-zero size, so
assertions where we check for that on an already constructed
`string_view` are unreachable.
2025-08-14 09:24:20 +02:00
Nikolas Klauser
5b258884db
[libc++] Document how __tree is laid out and how we iterate through it (#152453) 2025-08-14 09:23:23 +02:00
Pavel Skripkin
30144226a4
[llvm] [InstCombine] fold "icmp eq (X + (V - 1)) & -V, X" to "icmp eq (and X, V - 1), 0" (#152851)
This fold optimizes 

```llvm
define i1 @src(i32 %num, i32 %val) {
  %mask = add i32 %val, -1
  %neg = sub nsw i32 0, %val

  %num.biased = add i32 %num, %mask
  %_2.sroa.0.0 = and i32 %num.biased, %neg
  %_0 = icmp eq i32 %_2.sroa.0.0, %num
  ret i1 %_0
}
```
to
```llvm
define i1 @tgt(i32 %num, i32 %val) {
  %mask = add i32 %val, -1
  %tmp = and i32 %num, %mask
  %ret = icmp eq i32 %tmp, 0
  ret i1 %ret
}
```

For power-of-two `val`.

Observed in real life for the following code

```rust
pub fn is_aligned(num: usize) -> bool {
    num.next_multiple_of(1 << 12) == num
}
```
which verifies that num is aligned to 4096.

Alive2 proof https://alive2.llvm.org/ce/z/QisECm
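As a complement to the Alive2 proof, the equivalence can be spot-checked in plain C++. This is a sketch with hypothetical function names (`srcForm`, `tgtForm`, `foldHoldsUpTo`); it uses wrapping unsigned arithmetic and ignores the `nsw` flag on the negation.

```cpp
#include <cstdint>

// Source pattern: ((num + (val - 1)) & -val) == num
constexpr bool srcForm(std::uint32_t num, std::uint32_t val) {
  return ((num + (val - 1u)) & (0u - val)) == num;
}

// Target pattern: (num & (val - 1)) == 0
constexpr bool tgtForm(std::uint32_t num, std::uint32_t val) {
  return (num & (val - 1u)) == 0;
}

// Hypothetical checker: both forms agree for every num in [0, limit).
constexpr bool foldHoldsUpTo(std::uint32_t limit, std::uint32_t val) {
  for (std::uint32_t num = 0; num < limit; ++num)
    if (srcForm(num, val) != tgtForm(num, val))
      return false;
  return true;
}

static_assert(foldHoldsUpTo(1024, 1), "");
static_assert(foldHoldsUpTo(1024, 16), "");
static_assert(foldHoldsUpTo(1024, 4096), "");
static_assert(srcForm(8192, 4096) && tgtForm(8192, 4096), "8192 is aligned");
static_assert(!srcForm(4097, 4096) && !tgtForm(4097, 4096), "4097 is not");
```

The intuition: for a power-of-two `val`, adding `val - 1` and masking with `-val` rounds `num` up to the next multiple of `val`, and rounding up leaves `num` unchanged exactly when its low bits are already zero.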
2025-08-14 10:23:03 +03:00
Carl Ritson
f92afe7171
[AMDGPU] Preserve post dominator tree through SILowerControlFlow (#153528)
Change dominator tree updates to also handle post dominator tree.
2025-08-14 16:19:46 +09:00
XChy
f393f2a61e
[BranchFolding] Avoid moving blocks to fall through to an indirect target (#152916)
Depends on #152591 to fix
https://github.com/llvm/llvm-project/issues/149023.
Similar to an EH pad, there is no real advantage in "falling through" to
an indirect target of an INLINEASM_BR. Moreover, multiple indirect targets of
inline asm at the end of a function may be rotated infinitely.
Therefore, this patch avoids moving blocks so that they fall through to an
indirect target of inline asm.
2025-08-14 16:18:36 +09:00
David Green
4c28bbf5b8
[AArch64] Fix ‘>= 0’ is always true warning. NFC 2025-08-14 08:17:10 +01:00
Matt Arsenault
bbcac029db
ARM: Move more aeabi libcall config into tablegen (#152109) 2025-08-14 15:43:15 +09:00
quic_hchandel
71b066e3a2
[RISCV] Add CodeGen support for qc.insbi and qc.insb insert instructions (#152447)
This patch adds CodeGen support for the qc.insbi and qc.insb instructions
defined in the Qualcomm uC Xqcibm extension. qc.insbi and qc.insb
insert bits into the destination register from an immediate and a register
operand, respectively.
A sequence of `xor`, `and` & `xor` is converted, under appropriate
conditions, to `qc.insbi` or `qc.insb`, depending on the
immediate's value.
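An `xor`, `and`, `xor` sequence is the classic masked-merge idiom for inserting a bit field. The sketch below shows that idiom next to a straightforward reference. It is an illustration under assumptions: the function names, the `width`/`shamt` parameters and the 32-bit register width are hypothetical and are not taken from the Xqcibm specification or from the patterns the patch matches.

```cpp
#include <cstdint>

// Hypothetical mask for a field of `width` bits starting at bit `shamt`
// (requires shamt < 32).
constexpr std::uint32_t fieldMask(unsigned width, unsigned shamt) {
  return (width >= 32 ? ~0u : ((1u << width) - 1u)) << shamt;
}

// Reference form: clear the field in dst, then OR in the low bits of src.
constexpr std::uint32_t insertBits(std::uint32_t dst, std::uint32_t src,
                                   unsigned width, unsigned shamt) {
  return (dst & ~fieldMask(width, shamt)) |
         ((src << shamt) & fieldMask(width, shamt));
}

// Masked-merge form: xor, and, xor. Inside the mask the two xors cancel dst
// and leave src; outside the mask the result is dst unchanged.
constexpr std::uint32_t insertBitsXorAndXor(std::uint32_t dst,
                                            std::uint32_t src, unsigned width,
                                            unsigned shamt) {
  return ((dst ^ (src << shamt)) & fieldMask(width, shamt)) ^ dst;
}

static_assert(insertBits(0xFFFF0000u, 0xABu, 8, 4) == 0xFFFF0AB0u, "");
static_assert(insertBitsXorAndXor(0xFFFF0000u, 0xABu, 8, 4) == 0xFFFF0AB0u, "");
static_assert(insertBitsXorAndXor(0x12345678u, 0x5u, 4, 8) == 0x12345578u, "");
```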
2025-08-14 12:08:28 +05:30
Chuanqi Xu
ab5a5a90c0
[C++20] [Modules] Fix incorrect diagnostic for using befriend target
Close https://github.com/llvm/llvm-project/issues/138558

The compiler failed to understand the redeclaration relationship when
performing checks in MergeFunctionDecl. This seemed to be a complex
circular problem (how can we know the redeclaration relationship before
performing merging?). But the fix seems to be easy and safe: it is fine
to perform the check only if the using decl is a local decl.
2025-08-14 14:23:14 +08:00
Stanislav Mekhanoshin
23b65edfbc
[AMDGPU] Add NV bit to CPol::ALL mask. NFCI. (#153487) 2025-08-13 23:02:50 -07:00
Stanislav Mekhanoshin
1216152f30
[AMDGPU] Fix the comment for OperandType. NFC. (#153489) 2025-08-13 23:02:28 -07:00
Stanislav Mekhanoshin
80d430df5d
[AMDGPU] Add MSG_SAVEWAVE_HAS_TDM on gfx1250 (#153483) 2025-08-13 23:01:50 -07:00
Stanislav Mekhanoshin
fc911fe928
[AMDGPU] Add HW_REG_IB_STS2 on gfx1250 (#153479) 2025-08-13 23:01:28 -07:00
Stanislav Mekhanoshin
cc0d227154
[AMDGPU] Disable s_setkill on gfx1250 (#153471) 2025-08-13 23:01:04 -07:00
Stanislav Mekhanoshin
742bcee2a0
[AMDGPU] Drop duplicated field HasMatrixReuse. NFCI. (#153467) 2025-08-13 23:00:30 -07:00
David Green
d9d9d9ad19
[ARM][MVE] Add shuffle costs for LDn and STn instructions. (#145304)
LD2 is represented in IR as deinterleave-shuffle(load), and ST2 as
store(interleave-shuffle). Whilst the shuffle would be expensive in
general for MVE (it does not have zip/uzp instructions), it should be
treated as cheap when part of the LD2/ST2 pattern. This borrows some
code from the AArch64 backend to produce lower costs. (Some of which
still shows as higher than it should - that just shows how broken the
generic shuffle costs are at the moment, they would be lower if
getShuffleCost was called directly as opposed to going through
getInstructionCost).
2025-08-14 06:59:37 +01:00
Carlos Galvez
3b6d8798ba
[clang-tidy][doc] Improve documentation of the -line-filter flag (#153372)
Fixes #25589

Co-authored-by: Carlos Gálvez <carlos.galvez@zenseact.com>
2025-08-14 07:55:20 +02:00
Terapines MLIR
c164e6309b
[flang][fir] Add conversion of fir.iterate_while to scf.while. (#152439)
This commit is a supplement for
https://github.com/llvm/llvm-project/pull/140374.

RFC: https://discourse.llvm.org/t/rfc-add-fir-affine-optimization-fir-pass-pipeline/86190/6
2025-08-14 13:39:55 +08:00
Aleksei Babushkin
aa503f6572
[compiler-rt][libFuzzer] Add %run directives to focus-function.test (#153185)
Contrary to most testcases in the libFuzzer test suite,
`focus-function.test` seems to lack the `%run` directives, which is an
inconvenience in cases when `%run` actually gets substituted for
something. This PR adds said directives.
2025-08-14 08:36:25 +03:00
Craig Topper
ace08d5ccf
[RISCV] Add MC support for more P extension instructions. (#153458)
These instructions are the shift by immediate and saturate by immediate
instructions from the top half of page 9 of
https://jhauser.us/RISCV/ext-P/RVP-instrEncodings-015.pdf

I've also improved the CHECK lines in the invalid tests to check line
and column number from the diagnostic.

Co-authored-by: realqhc <caiqihan021@hotmail.com>
2025-08-13 22:07:03 -07:00
Oliver Hunt
d8850ee6c0
[clang][Obj-C][PAC] Add support for authenticating block metadata (#152978)
Introduces the use of pointer authentication to protect the invocation,
copy and dispose, reference, and descriptor pointers in Objective-C
block objects.

Resolves #141176
2025-08-13 22:01:24 -07:00
Craig Topper
9f96e3f80f
[SelectionDAG] Pass SDValue to InstrEmitter::EmitCopyFromReg. NFC (#153485)
Instead of passing SDNode and ResNo separately.

This allows us to use SDValue::operator== and avoid creating SDValue
from the operands inside the function.
2025-08-13 21:46:48 -07:00