GCC supports three flags related to overflow behavior:
* `-fwrapv`: Makes signed integer overflow well-defined.
* `-fwrapv-pointer`: Makes pointer overflow well-defined.
* `-fno-strict-overflow`: Implies `-fwrapv -fwrapv-pointer`, making both
signed integer overflow and pointer overflow well-defined.
Clang currently only supports `-fno-strict-overflow` and `-fwrapv`, but
not `-fwrapv-pointer`.
This PR proposes to introduce `-fwrapv-pointer` and adjust the semantics
of `-fwrapv` to match GCC.
This allows signed integer overflow and pointer overflow to be
controlled independently, while `-fno-strict-overflow` still exists to
control both at the same time (and that option is consistent across GCC
and Clang).
``` - add clang builtin to Builtins.td
- link builtin in hlsl_intrinsics
- add codegen for spirv intrinsic and two directx intrinsics to retain
signedness information of the operands in CGBuiltin.cpp
- add semantic analysis in SemaHLSL.cpp
- add lowering of spirv intrinsic to spirv backend in
SPIRVInstructionSelector.cpp
- add lowering of directx intrinsics to WaveActiveOp dxil op in
DXIL.td
- add test cases to illustrate passespendent pr merges.
```
Resolves#99170
Reimplement Neon FP8 vector types using attribute `neon_vector_type`
instead of having them as builtin types.
This allows to implement FP8 Neon intrinsics without the need to add
special cases for these types when using `__builtin_shufflevector`
or bitcast (using C-style cast operator) between vectors, both
extensively used in the generated code in `arm_neon.h`.
Reverts llvm/llvm-project#123853
The introduction of `reflect-error.ll` surfaced a bug with the use of
`report_fatal_error` in `SPIRVInstructionSelector` that was propagated
into the pr. This has caused a build-bot breakage, and the work to solve
the underlying issue is tracked here:
https://github.com/llvm/llvm-project/issues/124045. We can re-apply this
commit when the underlying issue is resolved.
This PR relands
[#122992](https://github.com/llvm/llvm-project/pull/122992).
Some machines were failing to run the `reflect-error.ll` test due to the
RUN lines
```llvm
; RUN: not %if spirv-tools %{ llc -O0 -mtriple=spirv64-unknown-unknown %s -o /dev/null 2>&1 -filetype=obj %}
; RUN: not %if spirv-tools %{ llc -O0 -mtriple=spirv32-unknown-unknown %s -o /dev/null 2>&1 -filetype=obj %}
```
which failed when `spirv-tools` was not present on the machine due to
running the command `not` without any arguments.
These RUN lines have been removed since they don't actually test
anything new compared to the other two RUN lines due to the expected
error during instruction selection.
```llvm
; RUN: not llc -verify-machineinstrs -O0 -mtriple=spirv64-unknown-unknown %s -o /dev/null 2>&1 | FileCheck %s
; RUN: not llc -verify-machineinstrs -O0 -mtriple=spirv32-unknown-unknown %s -o /dev/null 2>&1 | FileCheck %s
```
Re-write the sema and codegen for the atomic_test_and_set and
atomic_clear builtin functions to go via AtomicExpr, like the other
atomic builtins do. This simplifies the code, because AtomicExpr already
handles things like generating code for to dynamically select the memory
ordering, which was duplicated for these builtins. This also fixes a few
crash bugs, one when passing an integer to the pointer argument, and one
when using an array.
This also adds diagnostics for the memory orderings which are not valid
for atomic_clear according to
https://gcc.gnu.org/onlinedocs/gcc/_005f_005fatomic-Builtins.html, which
were missing before.
Fixes https://github.com/llvm/llvm-project/issues/111293.
This is a re-land of #120449, modified to allow any non-const pointer
type for the first argument.
Fixes#99152
Tasks completed:
- Implement `reflect` in `clang/lib/Headers/hlsl/hlsl_intrinsics.h`
- Implement the `reflect` SPIR-V target built-in in
`clang/include/clang/Basic/BuiltinsSPIRV.td`
- Add a SPIR-V fast path in `clang/lib/Headers/hlsl/hlsl_detail.h` in
the form
```c++
#if (__has_builtin(__builtin_spirv_reflect))
return __builtin_spirv_reflect(...);
#else
return ...; // regular behavior
#endif
```
- Add codegen for the SPIR-V `reflect` built-in to
`EmitSPIRVBuiltinExpr` in `clang/lib/CodeGen/CGBuiltin.cpp`
- Add HLSL codegen tests to
`clang/test/CodeGenHLSL/builtins/reflect.hlsl`
- Add SPIR-V built-in codegen tests to
`clang/test/CodeGenSPIRV/Builtins/reflect.c`
- Add sema tests to `clang/test/SemaHLSL/BuiltIns/reflect-errors.hlsl`
- Add SPIR-V sema tests to
`clang/test/CodeGenSPIRV/Builtins/reflect-errors.c`
- Create the `int_spv_reflect` intrinsic in
`llvm/include/llvm/IR/IntrinsicsSPIRV.td`
- In `llvm/lib/Target/SPIRV/SPIRVInstructionSelector.cpp` create the
`reflect` lowering and map it to `int_spv_reflect` in
`SPIRVInstructionSelector::selectIntrinsic`
- Create a SPIR-V backend test case in
`llvm/test/CodeGen/SPIRV/hlsl-intrinsics/reflect.ll`
Additional tasks completed:
- Implement sema check for the `reflect` SPIR-V built-in in
`clang/lib/Sema/SemaSPIRV.cpp`
- Required for HLSL codegen to work via the SPIR-V fast path, because
the types defined in `clang/include/clang/Basic/BuiltinsSPIRV.td` are
being overridden
- Create SPIR-V backend error test case in
`llvm/test/CodeGen/SPIRV/opencl/reflect-error.ll`
- Since `reflect` is only available in the GLSL extended instruction
set, using it in OpenCL should result in an error
Incomplete tasks:
- Create SPIR-V backend test case in
`llvm/test/CodeGen/SPIRV/opencl/reflect.ll`
- An OpenCL test is not applicable in this case because the [OpenCL
SPIR-V extended instruction
set](https://registry.khronos.org/SPIR-V/specs/unified1/OpenCL.ExtendedInstructionSet.100.html)
does not include a `reflect` function
This started out as trying to combine bf16 fpround to BFCVT2
instructions, but ended up removing the aarch64.neon.nfcvt intrinsics in
favour of generating fpround instructions directly. This simplifies the
patterns and can lead to other optimizations. The BFCVT2 instruction is
adjusted to makes sure the types are valid, and a bfcvt2 is now
generated in more place. The old intrinsics are auto-upgraded to fptrunc
instructions too.
This patch adds support for the next-generation arch15
CPU architecture to the SystemZ backend.
This includes:
- Basic support for the new processor and its features.
- Detection of arch15 as host processor.
- Assembler/disassembler support for new instructions.
- Exploitation of new instructions for code generation.
- New vector (signed|unsigned|bool) __int128 data types.
- New LLVM intrinsics for certain new instructions.
- Support for low-level builtins mapped to new LLVM intrinsics.
- New high-level intrinsics in vecintrin.h.
- Indicate support by defining __VEC__ == 10305.
Note: No currently available Z system supports the arch15
architecture. Once new systems become available, the
official system name will be added as supported -march name.
``` - add clang builtin to Builtins.td
- link builtin in hlsl_intrinsics
- add codegen for spirv intrinsic and two directx intrinsics to retain
signedness information of the operands in CGBuiltin.cpp
- add semantic analysis in SemaHLSL.cpp
- add lowering of spirv intrinsic to spirv backend in
SPIRVInstructionSelector.cpp
- add lowering of directx intrinsics to WaveActiveOp dxil op in
DXIL.td
- add test cases to illustrate passespendent pr merges.
```
Resolves#70106
---------
Co-authored-by: Finn Plummer <canadienfinn@gmail.com>
Closes https://github.com/llvm/llvm-project/issues/99116
Implements `firstbitlow` by extracting common functionality from
`firstbithigh` into a shared function while also fixing a bug for an edge
case where `u64x3` and larger vectors will attempt to create vectors
larger than the SPRIV max of 4.
---------
Co-authored-by: Steven Perron <stevenperron@google.com>
The `Checked` parameter of `CodeGenFunction::EmitCheck` is of type
`ArrayRef<std::pair<llvm::Value *, SanitizerMask>>`, which is overly
generalized: SanitizerMask can denote that zero or more sanitizers are
enabled, but `EmitCheck` requires that exactly one sanitizer is
specified in the SanitizerMask (e.g.,
`SanitizeTrap.has(Checked[i].second)` enforces that).
This patch replaces SanitizerMask with SanitizerOrdinal in the `Checked`
parameter of `EmitCheck` and code that transitively relies on it. This
should not affect the behavior of UBSan, but it has the advantages that:
- the code is clearer: it avoids ambiguity in EmitCheck about what to do
if multiple bits are set
- specifying the wrong number of sanitizers in `Checked[i].second` will
be detected as a compile-time error, rather than a runtime assertion
failure
Suggested by Vitaly in https://github.com/llvm/llvm-project/pull/122392
as an alternative to adding an explicit runtime assertion that the
SanitizerMask contains exactly one sanitizer.
## Changes
- Delete DirectX length intrinsic
- Delete HLSL length lang builtin
- Implement length algorithm entirely in the header.
## History
- In the past if an HLSL intrinsic lowered to either a spirv op code or
a DXIL opcode we represented it with intrinsics
## Why we are moving away?
- To make HLSL apis more portable the team decided that it makes sense
for some intrinsics to be defined only in the header.
- Since there tends to be more SPIRV opcodes than DXIL opcodes the plan
is to support SPIRV opcodes either with target specific builtins or via
pattern matching.
## Changes
- Delete DirectX length intrinsic
- Delete HLSL length lang builtin
- Implement length algorithm entirely in the header.
## History
- In the past if an HLSL intrinsic lowered to either a spirv op code or
a DXIL opcode we represented it with intrinsics
## Why we are moving away?
- To make HLSL apis more portable the team decided that it makes sense
for some intrinsics to be defined only in the header.
- Since there tends to be more SPIRV opcodes than DXIL opcodes the plan
is to support SPIRV opcodes either with target specific builtins or via
pattern matching.
Replacing the extant streaming mode function call with an intrinsic
allows us to make further optimisations around it. For example, if it's
called within a function that has a known streaming mode, we can remove
the dead code, and avoid the redundant conditional branch.
- Update pr labeler so new SPIRV files get properly labeled.
- Add distance target builtin to BuiltinsSPIRV.td.
- Update TargetBuiltins.h to account for spirv builtins.
- Update clang basic CMakeLists.txt to build spirv builtin tablegen.
- Hook up sema for SPIRV in Sema.h|cpp, SemaSPIRV.h|cpp, and
SemaChecking.cpp.
- Hookup sprv target builtins to SPIR.h|SPIR.cpp target.
- Update GBuiltin.cpp to emit spirv intrinsics when we get the expected
spirv target builtin.
Consensus was reach in this RFC to add both target builtins and pattern
matching:
https://discourse.llvm.org/t/rfc-add-targetbuiltins-for-spirv-to-support-hlsl/83329.
pattern matching will come in a separate pr this one just sets up the
groundwork to do target builtins for spirv.
partially resolves
[#99107](https://github.com/llvm/llvm-project/issues/99107)
This registers `sincos[f|l]` as a clang builtin and updates GCBuiltin to
emit the `llvm.sincos.*` intrinsic when `-fno-math-errno` is set. Note:
`llvm.sincos.*` is only emitted by `__builtin_sincos[f|l]` functions in
this initial patch.
Re-write the sema and codegen for the atomic_test_and_set and
atomic_clear builtin functions to go via AtomicExpr, like the other
atomic builtins do. This simplifies the code, because AtomicExpr already
handles things like generating code for to dynamically select the memory
ordering, which was duplicated for these builtins. This also fixes a few
crash bugs, one when passing an integer to the pointer argument, and one
when using an array.
This also adds diagnostics for the memory orderings which are not valid
for atomic_clear according to
https://gcc.gnu.org/onlinedocs/gcc/_005f_005fatomic-Builtins.html, which
were missing before.
Fixes#111293.
Resolves https://github.com/llvm/llvm-project/issues/99161
- [x] Implement `WaveActiveAllTrue` clang builtin,
- [x] Link `WaveActiveAllTrue` clang builtin with `hlsl_intrinsics.h`
- [x] Add sema checks for `WaveActiveAllTrue` to
`CheckHLSLBuiltinFunctionCall` in `SemaChecking.cpp`
- [x] Add codegen for `WaveActiveAllTrue` to `EmitHLSLBuiltinExpr` in
`CGBuiltin.cpp`
- [x] Add codegen tests to
`clang/test/CodeGenHLSL/builtins/WaveActiveAllTrue.hlsl`
- [x] Add sema tests to
`clang/test/SemaHLSL/BuiltIns/WaveActiveAllTrue-errors.hlsl`
- [x] Create the `int_dx_WaveActiveAllTrue` intrinsic in
`IntrinsicsDirectX.td`
- [x] Create the `DXILOpMapping` of `int_dx_WaveActiveAllTrue` to `114`
in `DXIL.td`
- [x] Create the `WaveActiveAllTrue.ll` and
`WaveActiveAllTrue_errors.ll` tests in `llvm/test/CodeGen/DirectX/`
- [x] Create the `int_spv_WaveActiveAllTrue` intrinsic in
`IntrinsicsSPIRV.td`
- [x] In SPIRVInstructionSelector.cpp create the `WaveActiveAllTrue`
lowering and map it to `int_spv_WaveActiveAllTrue` in
`SPIRVInstructionSelector::selectIntrinsic`.
- [x] Create SPIR-V backend test case in
`llvm/test/CodeGen/SPIRV/hlsl-intrinsics/WaveActiveAllTrue.ll`
Patch[1] has update intrinsic interface for ld1/st1, while based on
ARM's document, "If the intrinsic also has a vnum argument, the ZA slice
number is calculated by adding vnum to slice.". But the "vnum" did not
work for our realization now, this patch fix this point.
[1]ee31ba0dd9
Adds support for the following MSVC intrinsics:
* `_InterlockedCompareExchangePointer_acq`
* `_InterlockedCompareExchangePointer_rel`
* `_InterlockedExchangePointer_acq`
* `_InterlockedExchangePointer_nf`
* `_InterlockedExchangePointer_rel`
These are documented at:
<https://learn.microsoft.com/en-us/cpp/intrinsics/arm64-intrinsics?view=msvc-170#interlocked-intrinsics>
NOTE: `_InterlockedCompareExchangePointer_nf` is not being added since
it already exists, although it was incorrectly added for all
architectures instead of being Arm & AArch64 specific.
This change also unifies how the pointer and non-pointer interlocked
compare-exchange intrinsics are being handled.
This introduces `__builtin_hlsl_resource_getpointer`, which lowers to
`llvm.dx.resource.getpointer` and is used to implement indexing into
resources.
This will only work through the backend for typed buffers at this point,
but the changes to structured buffers should be correct as far as the
frontend is concerned.
Note: We probably want this to return a reference in the HLSL device
address space, but for now we're just using address space 0. Creating a
device address space and updating this code can be done later as
necessary.
Fixes#95956
This patch implements the following intrinsics:
8-bit floating-point convert to deinterleaved half-precision or
BFloat16.
``` c
// Variant is also available for: _bf16[_mf8]_x2
svfloat16x2_t svcvtl1_f16[_mf8]_x2_fpm(svmfloat8_t zn, fpm_t fpm) __arm_streaming;
svfloat16x2_t svcvtl2_f16[_mf8]_x2_fpm(svmfloat8_t zn, fpm_t fpm) __arm_streaming;
```
Defined in https://github.com/ARM-software/acle/pull/323
Co-authored-by: Caroline Concatto caroline.concatto@arm.com
Co-authored-by: Marian Lukac marian.lukac@arm.com
Introduces `__builtin_hlsl_buffer_update_counter` clang buildin that is
used to implement the `IncrementCounter` and `DecrementCounter` methods
on `RWStructuredBuffer` and `RasterizerOrderedStructuredBuffer` (see
Note).
The builtin is translated to LLVM intrisic `llvm.dx.bufferUpdateCounter`
or `llvm.spv.bufferUpdateCounter`.
Introduces `BuiltinTypeMethodBuilder` helper in `HLSLExternalSemaSource`
that enables adding methods to builtin types using builder pattern like
this:
```
BuiltinTypeMethodBuilder(Sema, RecordBuilder, "MethodName", ReturnType)
.addParam("param_name", Type, InOutModifier)
.callBuiltin("buildin_name", { BuiltinParams })
.finalizeMethod();
```
Fixes#113513
[First version](llvm/llvm-project#114148) of this PR was reverted
because of build break.
Introduces `__builtin_hlsl_buffer_update_counter` clang buildin that is
used to implement the `IncrementCounter` and `DecrementCounter` methods
on `RWStructuredBuffer` and `RasterizerOrderedStructuredBuffer` (see
Note).
The builtin is translated to LLVM intrisic `llvm.dx.bufferUpdateCounter`
or `llvm.spv.bufferUpdateCounter`.
Introduces `BuiltinTypeMethodBuilder` helper in `HLSLExternalSemaSource`
that enables adding methods to builtin types using builder pattern like
this:
```
BuiltinTypeMethodBuilder(Sema, RecordBuilder, "MethodName", ReturnType)
.addParam("param_name", Type, InOutModifier)
.callBuiltin("buildin_name", { BuiltinParams })
.finalizeMethod();
```
Fixes#113513