The patch introduces __builtin_spirv_generic_cast_to_ptr_explicit, which
is lowered to the llvm.spv.generic.cast.to.ptr.explicit intrinsic.
The SPIR-V builtins are now split across several different files:
BuiltinsSPIRVCore.td,
BuiltinsSPIRVVK.td for Vulkan-specific builtins, BuiltinsSPIRVCL.td for
OpenCL-specific builtins,
and BuiltinsSPIRVCommon.td for common ones.
The patch also introduces a new header defining the SPIR-V friendly
equivalents (__spirv_GenericCastToPtrExplicit_ToGlobal,
__spirv_GenericCastToPtrExplicit_ToLocal and
__spirv_GenericCastToPtrExplicit_ToPrivate). The functions are declared
as aliases to the new builtin, giving C-like languages a definition to
rely on as well as proper front-end diagnostics.
The motivation for the header is to provide a stable binding for
applications or libraries (such as SYCL) and to allow non-SPIR-V targets
to provide an implementation (via libclc, or similar to how it is done
for gpuintrin.h).
This reverts commit 894a0dd57f81211f9e431d9e84f2856d34f46993 with
tests fixed.
This patch is part of a stack that teaches Clang to generate Key Instructions
metadata for C and C++.
RFC:
https://discourse.llvm.org/t/rfc-improving-is-stmt-placement-for-better-interactive-debugging/82668
The feature is only functional in LLVM if LLVM is built with the CMake flag
LLVM_EXPERIMENTAL_KEY_INSTRUCTIONS. Eventually that flag will be removed.
CGDebugInfo::completeFunction was added previously but mistakenly never
called (it fell through the cracks while the patch stack was being put
together). Moved out of #134652 and #134654.
This fixes #139614 on non-Clang compilers by moving `__has_warning`
completely inside the `#if defined(__clang__)` block. This prevents a
parse failure on compilers that don't recognize `__has_warning`.
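A minimal sketch of the guard pattern described above. The macro name `HYPOTHETICAL_VIRTUAL_ANCHOR` and the `Widget` class are illustrative assumptions, not the names used in the patch; only `__has_warning`, `__clang__`, and the warning flag come from the description.

```cpp
#include <cassert>

// Keep __has_warning entirely inside the __clang__ block: compilers that do
// not define __clang__ skip the nested #if without evaluating it.
#if defined(__clang__)
#if __has_warning("-Wunnecessary-virtual-specifier")
#define HYPOTHETICAL_VIRTUAL_ANCHOR()                                          \
  _Pragma("clang diagnostic push")                                             \
  _Pragma("clang diagnostic ignored \"-Wunnecessary-virtual-specifier\"")      \
  virtual void anchor() _Pragma("clang diagnostic pop")
#endif
#endif

// Fallback for other compilers and for Clang versions without the warning.
#ifndef HYPOTHETICAL_VIRTUAL_ANCHOR
#define HYPOTHETICAL_VIRTUAL_ANCHOR() virtual void anchor()
#endif

// Hypothetical final class that still declares a virtual anchor.
struct Widget final {
  HYPOTHETICAL_VIRTUAL_ANCHOR();
  int value() const { return 42; }
};

// Out-of-line definition: this is what pins the vtable to one translation unit.
void Widget::anchor() {}

// Small smoke check that the class is usable on any compiler.
inline void smokeCheck() {
  Widget w;
  assert(w.value() == 42);
}
```

Writing `#if defined(__clang__) && __has_warning(...)` on a single line would not help here, because a compiler without `__has_warning` still has to parse the whole expression.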
Original description:
Followup to #138741.
This adds the requested macro to silence
`-Wunnecessary-virtual-specifier` when declaring virtual anchor
functions in `final` classes, per [LLVM
policy](https://llvm.org/docs/CodingStandards.html#provide-a-virtual-method-anchor-for-classes-in-headers).
It also cleans up any remaining instances of the warning, allowing us to
stop disabling it when we build LLVM.
The IR now includes a global variable for the debugger that holds
the address of the vtable.
Every class that contains virtual functions now has a static
member (marked as artificial) that identifies where that vtable
is loaded in memory. The unmangled name is '_vtable$'.
This new symbol allows a debugger to easily associate
classes with the physical location of their vtables using
only the DWARF information. Previously, this had to be done
by searching for ELF symbols with matching names, which
was time-consuming and error-prone in certain edge cases.
Adds a resource name argument to the `llvm.dx.handlefrombinding` and `llvm.dx.handlefromimplicitbinding` intrinsics.
SPIR-V currently does not seem to need the resource names, so this change only affects DirectX binding intrinsics.
Part 2/4 of https://github.com/llvm/llvm-project/issues/105059
Recently, in some of our internal testing, we noticed that the compiler
was sometimes generating an empty linker.options section, which seems
unnecessary. This change causes the compiler to omit the linker.options
section if it is empty.
It turns out that getVLASize() does not get you the size of a single
dimension of the VLA, it gets you the full count of all elements. This
caused _Countof to return invalid values on VLA ranks. Now switched to
using getVLAElements1D() instead, which only gets a single dimension.
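The difference between the two queries can be illustrated with a small standalone model. This is only a sketch: `Extents`, `totalElementCount`, and `outermostExtent` are hypothetical names that mimic the behaviour described above, not Clang's CodeGen API. For an array declared as `int a[n][m]`, `_Countof(a)` is expected to be `n`, not `n * m`.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <numeric>
#include <vector>

// Models the per-dimension extents of a VLA such as `int a[n][m]` as {n, m}.
using Extents = std::vector<std::size_t>;

// Analogue of a "full count" query: the product of all extents.
std::size_t totalElementCount(const Extents &dims) {
  return std::accumulate(dims.begin(), dims.end(), std::size_t{1},
                         std::multiplies<std::size_t>());
}

// Analogue of a single-dimension query: only the outermost extent.
std::size_t outermostExtent(const Extents &dims) {
  assert(!dims.empty() && "an array has at least one dimension");
  return dims.front();
}
```

With extents `{3, 4}`, the first function yields 12 while the second yields 3, which is the value a count-of operator should report.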
Fixes #141409
These are identified by misc-include-cleaner. I've filtered out those
that break builds. Also, I'm staying away from llvm-config.h,
config.h, and Compiler.h, which likely cause platform- or
compiler-specific build failures.
The builtin is documented to copy `count` elements, but the implementation
copies `count` bytes. Fix that.
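The element-versus-byte distinction can be sketched as follows. The helper `copyElements` is a hypothetical illustration and not the builtin touched by this patch; it only shows the scaling step that a "copy `count` elements" contract requires.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>

// Copies `count` elements (not bytes) from src to dst.
template <typename T>
void copyElements(T *dst, const T *src, std::size_t count) {
  static_assert(std::is_trivially_copyable<T>::value,
                "memcpy requires a trivially copyable type");
  assert((count == 0 || (dst && src)) && "null buffer with non-zero count");
  // Scale the element count to a byte count; passing `count` directly
  // would copy only part of the data whenever sizeof(T) > 1.
  std::memcpy(dst, src, count * sizeof(T));
}
```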
Reviewers: cor3ntin, ojhunt
Pull Request: https://github.com/llvm/llvm-project/pull/140312
See test comment for possible future improvement.
Note: This relands #140615 adding a ".count" suffix to the non-".all"
variants.
Our current intrinsic support for barrier intrinsics is confusing and
incomplete, with multiple intrinsics mapping to the same instruction and
intrinsic names not clearly conveying intrinsic semantics. Further, we
lack support for some variants. This change unifies the IR
representation to a single consistently named set of intrinsics.
- llvm.nvvm.barrier.cta.sync.aligned.all(i32)
- llvm.nvvm.barrier.cta.sync.aligned.count(i32, i32)
- llvm.nvvm.barrier.cta.arrive.aligned.count(i32, i32)
- llvm.nvvm.barrier.cta.sync.all(i32)
- llvm.nvvm.barrier.cta.sync.count(i32, i32)
- llvm.nvvm.barrier.cta.arrive.count(i32, i32)
The following Auto-Upgrade rules are used to maintain compatibility with
IR using the legacy intrinsics:
* llvm.nvvm.barrier0 --> llvm.nvvm.barrier.cta.sync.aligned.all(0)
* llvm.nvvm.barrier.n --> llvm.nvvm.barrier.cta.sync.aligned.all(x)
* llvm.nvvm.bar.sync --> llvm.nvvm.barrier.cta.sync.aligned.all(x)
* llvm.nvvm.barrier --> llvm.nvvm.barrier.cta.sync.aligned.count(x, y)
* llvm.nvvm.barrier.sync --> llvm.nvvm.barrier.cta.sync.all(x)
* llvm.nvvm.barrier.sync.cnt --> llvm.nvvm.barrier.cta.sync.count(x, y)
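The rename rules listed above can be captured as a plain lookup table. This is only a sketch of the name mapping: `upgradedBarrierName` is a hypothetical helper, not LLVM's AutoUpgrade code, and it ignores operand rewriting (for example, `llvm.nvvm.barrier0` additionally gains a constant `0` barrier-id argument).

```cpp
#include <cassert>
#include <map>
#include <string>

// Returns the unified intrinsic name for a legacy barrier intrinsic,
// or an empty string if the name is not one of the legacy intrinsics.
std::string upgradedBarrierName(const std::string &legacy) {
  assert(!legacy.empty() && "expected an intrinsic name");
  static const std::map<std::string, std::string> table = {
      {"llvm.nvvm.barrier0", "llvm.nvvm.barrier.cta.sync.aligned.all"},
      {"llvm.nvvm.barrier.n", "llvm.nvvm.barrier.cta.sync.aligned.all"},
      {"llvm.nvvm.bar.sync", "llvm.nvvm.barrier.cta.sync.aligned.all"},
      {"llvm.nvvm.barrier", "llvm.nvvm.barrier.cta.sync.aligned.count"},
      {"llvm.nvvm.barrier.sync", "llvm.nvvm.barrier.cta.sync.all"},
      {"llvm.nvvm.barrier.sync.cnt", "llvm.nvvm.barrier.cta.sync.count"},
  };
  auto it = table.find(legacy);
  return it == table.end() ? std::string() : it->second;
}
```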
In some specific scenarios, `Ptr.getElementType()` won't be a primitive
type or a vector of primitive types, and thus `getScalarSizeInBits()`
returns zero.
Use the datalayout to get the proper size of the type instead of making
an implicit assumption that the type is a simple primitive type.
Solves SWDEV-534184
Covers aggregate initialisation and -ftrivial-auto-var-init=pattern.
Update how Sema checking is done for HLSL builtins to allow for better
error messages, mainly using 'err_builtin_invalid_arg_type'.
Try to follow the formula outlined in issue #134721.
Closes #134721
clang/lib/CodeGen/CGDebugInfo.cpp:153:2: error: extra ';' outside of a function is incompatible with C++98 [-Werror,-Wc++98-compat-extra-semi]
153 | };
| ^
1 error generated.
This is a scoped helper similar to ApplyDebugLocation that creates a new source
location atom group which instructions can be added to.
A source atom is a source construct that is "interesting" for debug stepping
purposes. We use an atom group number to track the instruction(s) that implement
the functionality for the atom, plus backup instructions/source locations.
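A scoped helper of this kind can be sketched with ordinary RAII. Everything in the block is a hypothetical model: `DebugState` and `ScopedAtomGroup` are illustrative names, and restoring the previous group on scope exit is an assumption about how such a helper would typically behave, not a statement about the patch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the state a code generator would keep.
struct DebugState {
  std::uint64_t nextAtomGroup = 1;    // next unused group number
  std::uint64_t currentAtomGroup = 0; // 0 means "no active group"
};

// Hypothetical scoped helper: opens a fresh atom group on construction and
// restores the previously active group on destruction.
class ScopedAtomGroup {
  DebugState &state;
  std::uint64_t saved;

public:
  explicit ScopedAtomGroup(DebugState &s)
      : state(s), saved(s.currentAtomGroup) {
    state.currentAtomGroup = state.nextAtomGroup++;
  }
  ~ScopedAtomGroup() {
    assert(state.currentAtomGroup != 0 && "a group should still be active");
    state.currentAtomGroup = saved;
  }
  ScopedAtomGroup(const ScopedAtomGroup &) = delete;
  ScopedAtomGroup &operator=(const ScopedAtomGroup &) = delete;
};
```

Instructions emitted while a `ScopedAtomGroup` is alive would be tagged with `currentAtomGroup`; nested scopes get their own numbers and hand control back to the enclosing group when they end.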
Our current intrinsic support for barrier intrinsics is confusing and
incomplete, with multiple intrinsics mapping to the same instruction and
intrinsic names not clearly conveying intrinsic semantics. Further, we
lack support for some variants. This change unifies the IR
representation to a single consistently named set of intrinsics.
- llvm.nvvm.barrier.cta.sync.aligned.all(i32)
- llvm.nvvm.barrier.cta.sync.aligned(i32, i32)
- llvm.nvvm.barrier.cta.arrive.aligned(i32, i32)
- llvm.nvvm.barrier.cta.sync.all(i32)
- llvm.nvvm.barrier.cta.sync(i32, i32)
- llvm.nvvm.barrier.cta.arrive(i32, i32)
The following Auto-Upgrade rules are used to maintain compatibility with
IR using the legacy intrinsics:
* llvm.nvvm.barrier0 --> llvm.nvvm.barrier.cta.sync.aligned.all(0)
* llvm.nvvm.barrier.n --> llvm.nvvm.barrier.cta.sync.aligned.all(x)
* llvm.nvvm.bar.sync --> llvm.nvvm.barrier.cta.sync.aligned.all(x)
* llvm.nvvm.barrier --> llvm.nvvm.barrier.cta.sync.aligned(x, y)
* llvm.nvvm.barrier.sync --> llvm.nvvm.barrier.cta.sync.all(x)
* llvm.nvvm.barrier.sync.cnt --> llvm.nvvm.barrier.cta.sync(x, y)
In 'EmitStoreThroughExtVectorComponentLValue', move the code that
zero-extends the value when the destination scalar type is larger than
the source scalar type to the top of the function, so that every case
is handled.
The previous code missed this case:
```
bool4 b = true.xxxx;
b.xyz = false.xxx;
```
This led to a bad shuffle vector.
Closes #140564
Move the initialization of ptrauth-* function attributes near the
initialization of branch protection attributes. The semantics of these
groups of attributes partially overlap, so handle both groups in the
getDefaultFunctionAttributes() and setTargetAttributes() functions to
prevent them from getting out of sync. This fixes C++ TLS wrappers.
The current region mapping for do-while loops that contain statements
such as break or continue results in inaccurate line coverage reports
for the line following the loop.
This change handles terminating statements the same way that other loop
constructs do, correcting the region mapping for accurate reports. It
also fixes a fragile test relying on exact line numbers.
Fixes #139122
…__builtin_scalbn
Clang generates library calls for __builtin_* functions, which can be a
problem for GPUs that cannot handle them. This patch generates a call to
the device implementation for __builtin_logb and the ldexp intrinsic for
__builtin_scalbn.
This PR adds an amdgcn_load_to_lds intrinsic that abstracts over loads to
LDS from global (address space 1) pointers and buffer fat pointers
(address space 7), since they use the same API and "gather from a
pointer to LDS" is something of an abstract operation.
This commit adds the intrinsic and its lowerings for addrspaces 1 and 7,
and updates the MLIR wrappers to use it (loosening up the restrictions
on loads to LDS along the way to match the ground truth from target
features).
It also plumbs the intrinsic through to clang.