We can derive and upgrade alignment for loads/stores using other
well-aligned loads/stores. This optimization does a single forward pass through
each basic block and uses loads/stores (the alignment and the offset) to
derive the best possible alignment for a base pointer, caching the
result. If it encounters another load/store based on that pointer, it
tries to upgrade the alignment. The optimization must be a forward pass within a basic
block because control flow and exception throwing can impact alignment guarantees.
---------
Co-authored-by: Nikita Popov <github@npopov.com>
The CMake flag has been on by default for a month without any issues.
This makes the feature support in LLVM unconditional (but does not
enable the feature by default).
This patch updates the pmulhw/pmulhuw builtins to support constant
expression handling - extending the VectorExprEvaluator::VisitCallExpr
handling code that handles elementwise integer binop builtins.
Hopefully this can be used as reference patch to show how to add future
target specific constexpr handling with minimal code impact.
I've also enabled pmullw constexpr handling (which are tagged on
#152490) as they all use very similar tests.
I've also had to tweak the MMX -> SSE2 wrapper as undefs are not
permitted in constexpr shuffle masks
Fixes#152524
When the target enables +sme2p2, the svcompact intrinsic is now
available in streaming SVE mode, through updating the guards in
arm_sve.td. Included Sema test acle_sve_compact.cpp.
Still missing the "extend to 256-bit" casts - _mm256_castpd128_pd256 / _mm256_castps128_ps256 / _mm256_castsi128_si256 - due to constexpr not liking undefined/poison etc.
HIP runtime support for compressed bundle format v3 is in place,
therefore switch the default compressed bundle format to v3 in compiler.
This allows both compressed and decompressed fat binary size to exceed
4GB by default.
Environment variable COMPRESSED_BUNDLE_FORMAT_VERSION=2 can be used for
backward compatibility for older HIP runtimes not supporting v3.
Fixes: SWDEV-548879
Now that #149310 has restricted lifetime intrinsics to only work on
allocas, we can also drop the explicit size argument. Instead, the size
is implied by the alloca.
This removes the ability to only mark a prefix of an alloca alive/dead.
We never used that capability, so we should remove the need to handle
that possibility everywhere (though many key places, including stack
coloring, did not actually respect this).
I fixed support for varargs functions
(previously it didn't crash but the codegen was incorrect).
I added tests for structs and unions which already work. With the
multivalue abi they crash in the backend, so I added a sema check that
rejects structs and unions for that abi.
It will also crash in the backend if passed an int128 or float128 type.
This change adds the definition of VTableAddrPointOp and the related
AddressPointAttr to the CIR dialect, along with tests for the parsing
and verification of these elements.
Code to generate this operation will be added in a later change.
The landing of #126324 made it so that __has_builtin returns false for
aux triple builtins. CUDA offloading can sometimes compile where the
host is in the aux triple (ie x86_64). This patch explicitly carves out
NVPTX so that we do not run into redefinition errors.
Adds missing C++ run lines to test files containing `constexpr` tests.
Also adds missing 32/64-bit test coverage to the following tests:
- `clang/test/CodeGen/X86/avx512-reduceIntrin.c`
- `clang/test/CodeGen/X86/avx512-reduceMinMaxIntrin.c`
- `clang/test/CodeGen/X86/avx512vpopcntdq-builtins.c`
- `clang/test/CodeGen/X86/avx512vpopcntdqvl-builtins.c`
Additionally, fixes a `_mm512_popcnt_epi64` `constexpr` test that
incorrectly assumed 32-bit integers, leading to incorrect bit counts.
This change updates the test result to assume 64-bit integers.
Update the easy add/sub/mul/logic/cmp/scalar_to_vector intrinsics to be
constexpr compatible.
I'm not expecting anyone to be very interested in using MMX intrinsics,
but they're smaller than the other types and are useful to test the
constexpr handling and test methods before we start applying them to
SSE/AVX2/AVX512 intrinsics.
This commit fixes the false positive that C++ class methods with libc
function names would be false warned about. For example,
```
struct T {void strcpy() const;};
void test(const T& t) { str.strcpy(); // no warn }
```
rdar://156264388
private/firstprivate typically do copy operations, however copying a VLA
isn't really possible. This patch introduces a warning to alert the
person that this copy isn't happening correctly.
As a future direction, we MIGHT consider doing additional work to make
sure they are initialized/copied/deleted/etc correctly.
Under C++17, `DeletedCopy` and `DefaultedCopy` in the test
`clang/test/SemaOpenACC/private_firstprivate_reduction_required_ops.cpp`
are eligible for aggregate initialization. Under C++20 and up, that is
no longer the case. Therefore, an explicit constructor must be declared
to avoid additional diagnostics which the test does not expect.
This test was failing in our downstream testing where our toolchain uses
C++20 as the default dialect.
See this reduced example for more details:
https://godbolt.org/z/46oErYT43
This corrects the codegen for the final class optimization to
correct handle the case where there is no path to perform the
cast, and also corrects the codegen to handle ptrauth protected
vtable pointers.
As part of this fix we separate out the path computation as
that makes it easier to reason about the failure code paths
and more importantly means we can know what the type of the
this object is during the cast.
The allows us to use the GetVTablePointer interface which
correctly performs the authentication operations required when
pointer authentication is enabled. This still leaves incorrect
authentication behavior in the multiple inheritance case but
currently the optimization is disabled entirely if pointer
authentication is enabled.
Fixes#137518
When casting a 0 to a pointer type, the IsNullPtr flag was always set to
false, leading to weird results like a pointer with value 0 that isn't a
null pointer.
This caused
```c++
struct B { const int *p;};
template<B> void f() {}
template void f<B{nullptr}>();
template void f<B{fold(reinterpret_cast<int*>(0))}>();
```
to be valid code, since nullptr and (int*)0 aren't equal. This seems
weird and GCC doesn't behave like this.
NFC patch to clean up extra lines of code in the file
`llvm/test/CodeGen/PowerPC/check-zero-vector.ll` as the current one has
loop unrolled.
Also removing the file `clang/test/CodeGen/PowerPC/check-zero-vector.c`
as the patch affects only the backend.
Co-authored-by: himadhith <himadhith.v@ibm.com>
Forward SourceLocation to `EmitCall` so that clang triggers an error
when a function inside `[[gnu::cleanup(func)]]` is annotated with
`[[gnu::error("some message")]]`.
resolves#146520
The vk::binding attribute allows users to explicitly set the set and
binding for a resource in SPIR-V without chaning the "register"
attribute, which will be used when targeting DXIL.
Fixes https://github.com/llvm/llvm-project/issues/136894
The codegen for the final class dynamic_cast optimization fails to
consider pointer authentication. This change resolves this be simply
disabling the optimization when pointer authentication enabled.
HandleMemberPointerAccess considered whether the lvalue path in a member
pointer access matched the bases of the containing class of the member,
but neglected to check the same for the containing class of the member
itself, thereby ignoring access attempts to members in direct sibling
classes.
Fixes#150705.
Fixes#150709.
Also removes an outdated test buffer-array-operator.hlsl. Array operator on resources is tested in StructuredBuffers-subscripts.hlsl and RWBuffer-subscript.hlsl.
These would not give a correct initializer, but they are not possible
to generate correctly anyway, so this patch makes sure we look through
the array type to correctly diagnose these.
We sometimes don't know if the operand of [[assume]] is true/false, and
that's okay. We can just ignore the attribute in that case.
If we wanted something more fancy, we could bring the assumption to the
constraints, but dropping them should be just as fine for now.
Fixes#151854