Added constant evaluation support for `__builtin_elementwise_abs` on integer, float and vector type.
fixes#152276
---------
Co-authored-by: Simon Pilgrim <llvm-dev@redking.me.uk>
cloneExternalModuleToContext can be used to clone an LLVM module onto a given
ThreadSafeContext. Callers of this function are responsible for ensuring
exclusive access to the source module and its LLVMContext.
When stashing the tokens of a parameter of a member function, we would
munch an ellipsis, as the only considered terminal conditions were `,`
and `)`.
Fixes#153445
PR #153488 caused the msvc build (https://lab.llvm.org/buildbot/#/builders/166/builds/1397) to fail:
```
..\llvm-project\flang\include\flang/Evaluate/rewrite.h(78): error C2668: 'Fortran::evaluate::rewrite::Identity::operator ()': ambiguous call to overloaded function
..\llvm-project\flang\include\flang/Evaluate/rewrite.h(43): note: could be 'Fortran::evaluate::Expr<Fortran::evaluate::SomeType> Fortran::evaluate::rewrite::Identity::operator ()<Fortran::evaluate::SomeType,S>(Fortran::evaluate::Expr<Fortran::evaluate::SomeType> &&,const U &)'
with
[
S=Fortran::evaluate::value::Integer<128,true,32,unsigned int,unsigned __int64,128>,
U=Fortran::evaluate::value::Integer<128,true,32,unsigned int,unsigned __int64,128>
]
..\llvm-project\flang\lib\Semantics\check-omp-atomic.cpp(174): note: or 'Fortran::evaluate::Expr<Fortran::evaluate::SomeType> Fortran::semantics::ReassocRewriter::operator ()<Fortran::evaluate::SomeType,S,void>(Fortran::evaluate::Expr<Fortran::evaluate::SomeType> &&,const U &,Fortran::semantics::ReassocRewriter::NonIntegralTag)'
with
[
S=Fortran::evaluate::value::Integer<128,true,32,unsigned int,unsigned __int64,128>,
U=Fortran::evaluate::value::Integer<128,true,32,unsigned int,unsigned __int64,128>
]
..\llvm-project\flang\include\flang/Evaluate/rewrite.h(78): note: while trying to match the argument list '(Fortran::evaluate::Expr<Fortran::evaluate::SomeType>, const S)'
with
[
S=Fortran::evaluate::value::Integer<128,true,32,unsigned int,unsigned __int64,128>
]
..\llvm-project\flang\include\flang/Evaluate/rewrite.h(78): note: the template instantiation context (the oldest one first) is
..\llvm-project\flang\lib\Semantics\check-omp-atomic.cpp(814): note: see reference to function template instantiation 'U Fortran::evaluate::rewrite::Mutator<Fortran::semantics::ReassocRewriter>::operator ()<const Fortran::evaluate::Expr<Fortran::evaluate::SomeType>&,Fortran::evaluate::Expr<Fortran::evaluate::SomeType>>(T)' being compiled
with
[
U=Fortran::evaluate::Expr<Fortran::evaluate::SomeType>,
T=const Fortran::evaluate::Expr<Fortran::evaluate::SomeType> &
]
```
The reason is that there is an ambiguity between operator() of
ReassocRewriter itself and operator() of the base class `Identity` through
`using Id::operator();`. By the C++ specification, method declarations
in ReassocRewriter hide methods with the same signature from a using
declaration, but this does not apply to
```
evaluate::Expr<T> operator()(..., NonIntegralTag = {})
```
which has a different signature due to an additional tag parameter.
Since it has a default value, it is ambiguous with operator() without
tag parameter.
GCC and Clang both accept this, but in my understanding MSVC is correct
here.
Since the overloads of ReassocRewriter cover all cases (integral and
non-integral), removing the using declaration to avoid the ambiguity.
This PR builds upon the previous #146531 and enables scalable
vectorization for `batch_mmt4d` as well.
---------
Signed-off-by: Ege Beysel <beyselege@gmail.com>
When lowering using sublane shuffles, we can sometimes end up with the
same mask as we started with. We already bail in these occasions, but we
weren't fully simplifying the new shuffle mask before testing if it
matched.
Fixes#153457
Add special handling of EXTRACT_SUBVECTOR, INSERT_SUBVECTOR,
EXTRACT_VECTOR_ELT, INSERT_VECTOR_ELT and SCALAR_TO_VECTOR in
isGuaranteedNotToBeUndefOrPoison. Make use of DemandedElts to improve
the analysis and only check relevant elements for each operand.
Also start using DemandedElts in the recursive calls that check
isGuaranteedNotToBeUndefOrPoison for all operands for operations that do
not create undef/poison. We can do that for a number of elementwise
operations for which the DemandedElts can be applied to every operand
(e.g. ADD, OR, BITREVERSE, TRUNCATE).
@vgvassilev @anutosh491 This is what it took for me to enable running
ClangReplInterpreterTests in an Emscripten environment. When I ran this
patch for llvm 20 we could run InterpreterTest.InstantiateTemplate , but
now it crashes gtest when running in node. Let me know what you think.
Since we (Linaro) moved out bots to a new machine, this test has been
failing:
https://lab.llvm.org/buildbot/#/builders/121/builds/1566
Most of the time, the rss difference is greater than 512 on the first
iteration then settles down to 512 for all the rest.
```
starting rss 512
shadow pages: 1024
p = 0xe083e0800000
1536 -> 740
diff 796
1252 -> 740
diff 512
1252 -> 740
diff 512
1252 -> 740
diff 512
1252 -> 740
diff 512
1252 -> 740
diff 512
1252 -> 740
diff 512
1252 -> 740
diff 512
1252 -> 740
diff 512
1252 -> 740
diff 512
p = 0xe083e0800000
passed 1 out of 10
release-shadow.c.tmp: /home/tcwg-buildbot/worker/clang-aarch64-lld-2stage/llvm/compiler-rt/test/hwasan/TestCases/Linux/release-shadow.c:81: int main(): Assertion `success_count > total_count * 0.8' failed.
```
Given that the test was looking for a diff of at least 513, I guess that
512 is ok too.
For future reference, the original bot host was running this kernel:
Linux 5.15.0-136-generic #147-Ubuntu SMP Sat Mar 15 15:51:36 UTC 2025
aarch64 aarch64 aarch64 GNU/Linux
And the new host:
Linux 6.8.0-64-generic #67-Ubuntu SMP PREEMPT_DYNAMIC Sun Jun 15
20:23:40 UTC 2025 aarch64 aarch64 aarch64 GNU/Linux
Though the new host also has more RAM, so the kernel may be less
aggresive with memory management.
This patch add cost kind to `getAddressComputationCost()` for #149955.
Note that this patch also remove all the default value in `getAddressComputationCost()`.
When assertions are enabled it is impossible to construct a
`string_view` which contains a null pointer and a non-zero size, so
assertions where we check for that on an already constructed
`string_view` are unreachable.
Depend on #152591 to fix
https://github.com/llvm/llvm-project/issues/149023.
Similar to an EH pad, there is no real advantage in "falling through" to
an indirect target of an INLINEASM_BR. And multiple indirect targets of
inline asm at the end of a function may be rotated infinitely.
Therefore, this patch avoids such optimization on indirect target of
inline asm as fall through.
This patch adds CodeGen support for qc.insbi and qc.insb instructions
defined in the Qualcomm uC Xqcibm extension. qc.insbi and qc.insb
inserts bits into destination register from immediate and register
operand respectively.
A sequence of `xor`, `and` & `xor` depending on appropriate conditions
are converted to `qc.insbi` or `qc.insb` which depends on the
immediate's value.
Close https://github.com/llvm/llvm-project/issues/138558
The compiler failed to understand the redeclaration-relationship when
performing checks when MergeFunctionDecl. This seemed to be a complex
circular problem (how can we know the redeclaration relationship before
performing merging?). But the fix seems to be easy and safe. It is fine
to only perform the check only if the using decl is a local decl.
LD2 is represented in IR as deinterleave-shuffle(load), and ST2 as
store(interleave-shuffle). Whilst the shuffle would be expensive in
general for MVE (it does not have zip/uzp instructions), it should be
treated as cheap when part of the LD2/ST2 pattern. This borrows some
code from the AArch64 backed to produce lower costs. (Some of which
still shows as higher than it should - that just shows how broken the
generic shuffle costs are at the moment, they would be lower if
getShuffleCost was called directly as opposed to going through
getInstructionCost).
Contrary to most testcases in the libFuzzer test suite,
`focus-function.test` seems to lack the `%run` directives, which is an
inconvenience in cases when `%run` actually gets substituted for
something. This PR adds said directives.
These instructions are the shift by immediate and saturate by immediate
instructions from the top half of page 9 of
https://jhauser.us/RISCV/ext-P/RVP-instrEncodings-015.pdf
I've also improved the CHECK lines in the invalid tests to check line
and column number from the diagnostic.
Co-authored-by: realqhc <caiqihan021@hotmail.com>
Introduces the use of pointer authentication to protect the invocation,
copy and dispose, reference, and descriptor pointers in Objective-C
block objects.
Resolves#141176
Instead of passing SDNode and ResNo separately.
This allows us to use SDValue::operator== and avoid creating SDValue
from the operands inside the function.