Implements intrinsics used to get the level-of-detail given a texture,
sampler, and a coordinate. They will be used to implement the
corresponding HLSL methods.
Assisted-by: Gemini
- Add CPP_for_OpenCL source language operand
- Handle opencl.cxx.version metadata
Align handling with SPIR-V translator logic and tests presented there
The SPIR-V backend previously only supported function annotations in
llvm.global.annotations and crashed with a fatal error when encountering
global variable entries.
Add missing capabilities to finalize SPV_INTEL_16bit_atomics extension:
- AtomicInt16CompareExchangeINTEL (6260): for i16
load/store/exchange/cmpxchg
- Int16AtomicsINTEL (6261): for i16 arithmetic atomics (add, sub, min,
max, etc.)
- AtomicBFloat16LoadStoreINTEL (6262): for bfloat16 load/store/exchange
This completes the implementation started in 6ef3218.
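As a rough illustration of the capability list above, the sketch below picks a capability from an element type and an atomic operation. The capability names and numbers come from this message; the function name, the string encoding of types and ops, and the exact set of arithmetic ops (the list above ends in "etc.") are assumptions made for the example and do not mirror the backend's actual code.

```python
# Capability names and numbers are copied from the list above.
CAPABILITIES = {
    "AtomicInt16CompareExchangeINTEL": 6260,
    "Int16AtomicsINTEL": 6261,
    "AtomicBFloat16LoadStoreINTEL": 6262,
}

# Hypothetical op spellings; the arithmetic set is a subset ("etc." above).
MEMORY_OPS = {"load", "store", "exchange", "cmpxchg"}
ARITHMETIC_OPS = {"add", "sub", "min", "max"}


def required_capability(elem_type, op):
    """Return (name, number) for an atomic op, or None if this table has no entry."""
    if elem_type == "i16" and op in MEMORY_OPS:
        name = "AtomicInt16CompareExchangeINTEL"
    elif elem_type == "i16" and op in ARITHMETIC_OPS:
        name = "Int16AtomicsINTEL"
    elif elem_type == "bfloat16" and op in {"load", "store", "exchange"}:
        name = "AtomicBFloat16LoadStoreINTEL"
    else:
        return None
    return name, CAPABILITIES[name]
```

A `None` result only means the sketch's table has no entry for that combination; it says nothing about what other extensions may provide.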
Specification:
https://github.com/intel/llvm/blob/sycl/sycl/doc/design/spirv-extensions/SPV_INTEL_16bit_atomics.asciidoc
This PR adds QuadReadAcrossX intrinsic support in HLSL with codegen for
both DirectX and SPIRV backends. Resolves
https://github.com/llvm/llvm-project/issues/99175.
- [x] Implement QuadReadAcrossX clang builtin
- [x] Link QuadReadAcrossX clang builtin with hlsl_intrinsics.h
- [x] Add sema checks for QuadReadAcrossX to
CheckHLSLBuiltinFunctionCall in SemaChecking.cpp
- [x] Add codegen for QuadReadAcrossX to EmitHLSLBuiltinExpr in
CGBuiltin.cpp
- [x] Add codegen tests to
clang/test/CodeGenHLSL/builtins/QuadReadAcrossX.hlsl
- [x] Add sema tests to
clang/test/SemaHLSL/BuiltIns/QuadReadAcrossX-errors.hlsl
- [x] Create the int_dx_QuadReadAcrossX intrinsic in
IntrinsicsDirectX.td
- [x] Create the DXILOpMapping of int_dx_QuadReadAcrossX to 123 in
DXIL.td
- [x] Create the QuadReadAcrossX.ll and QuadReadAcrossX_errors.ll tests
in llvm/test/CodeGen/DirectX/
- [x] Create the int_spv_QuadReadAcrossX intrinsic in IntrinsicsSPIRV.td
- [x] In SPIRVInstructionSelector.cpp create the QuadReadAcrossX
lowering and map it to int_spv_QuadReadAcrossX in
SPIRVInstructionSelector::selectIntrinsic.
- [x] Create SPIR-V backend test case in
llvm/test/CodeGen/SPIRV/hlsl-intrinsics/QuadReadAcrossX.ll
Similar to commit 557efc9a8b68628c2c944678c6471dac30ed9e8e (2022).
cl::ZeroOrMore is the default for cl::list and is unnecessary for
cl::opt since the "may only occur zero or one times!" error was removed.
Also remove cl::init(false) on modified cl::opt<bool> lines.
Adds the resource_load_level intrinsic for DXIL and SPIR-V. It
will be used to load a value from a specific location in the image at
the given mip level, and to implement the Texture Load and
mips[][] methods.
Assisted-by: Gemini
Otherwise multiple translation units in the same process could run into
ID reuse collisions, causing invalid SPIR-V to be generated due to
multiple definitions of the same SPIR-V SSA value.
Closes: https://github.com/llvm/llvm-project/issues/160613
Right now if a module has a service function we always emit `OpName
entry` for the service function's basic block.
The actual service function isn't emitted and no other instruction uses
the basic block `OpName` instruction, so don't emit it.
Signed-off-by: Nick Sarnie <nick.sarnie@intel.com>
This patch implements the SPIR-V lowering for the following HLSL
intrinsics:
- SampleBias
- SampleGrad
- SampleLevel
- SampleCmp
- SampleCmpLevelZero
It defines the required LLVM intrinsics in 'IntrinsicsDirectX.td' and
'IntrinsicsSPIRV.td'.
It updates 'SPIRVInstructionSelector.cpp' to handle the new intrinsics
and generates the correct 'OpImageSample*' instructions with the required
operands (Bias, Grad, Lod, ConstOffset, MinLod, etc.).
CodeGen tests are added to verify the implementation for images with
dimension 1D, 2D, 3D, and Cube.
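For readers unfamiliar with how the optional operands named above are encoded, here is a minimal sketch. It assumes the Image Operands bit values from the SPIR-V specification (Bias 0x1, Lod 0x2, Grad 0x4, ConstOffset 0x8, MinLod 0x80) and the rule that operand ids follow the mask in order of increasing bit. The function name and the dict-based input are hypothetical, and the sketch does not check which combinations are valid for a given 'OpImageSample*' instruction.

```python
# Bit values as listed for Image Operands in the SPIR-V specification
# (only the operands named in this message are included).
IMAGE_OPERAND_BITS = {
    "Bias": 0x1,
    "Lod": 0x2,
    "Grad": 0x4,
    "ConstOffset": 0x8,
    "MinLod": 0x80,
}


def encode_image_operands(operands):
    """Map {operand name: [ids]} to (mask, ids ordered by increasing bit)."""
    mask = 0
    ids = []
    for name in sorted(operands, key=lambda n: IMAGE_OPERAND_BITS[n]):
        mask |= IMAGE_OPERAND_BITS[name]
        ids.extend(operands[name])
    return mask, ids
```

For example, a Grad operand contributes two ids (the x and y derivatives), while Bias, Lod, ConstOffset, and MinLod contribute one each.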
Assisted-by: Gemini
For compute, we don't run the structurizer, so we can't preserve
loop metadata via the LoopMerge instruction. 
SPV_INTEL_unstructured_loop_controls is therefore the only way to preserve
this info in unstructured control flow.
The extension adds support for the `OpFmaKHR` instruction, which
provides a native SPIR-V instruction for fused multiply-add operations
as an alternative to using OpenCL.std::Fma extended instruction.
Translate both LLVM fma intrinsics as well as OCL builtins to `OpFmaKHR`
if the extension is available.
Specification:
https://github.khronos.org/SPIRV-Registry/extensions/KHR/SPV_KHR_fma.html
This patch implements the `sample` and `sample_clamp` intrinsics for
HLSL resources in the SPIR-V backend. It adds the necessary intrinsic
definitions in `IntrinsicsDirectX.td` and `IntrinsicsSPIRV.td`, and
implements the instruction selection logic in
`SPIRVInstructionSelector.cpp`.
Key changes:
- Added `int_dx_resource_sample` and `int_dx_resource_sample_clamp`
intrinsics.
- Added `int_spv_resource_sample` and `int_spv_resource_sample_clamp`
intrinsics.
- Implemented `selectSampleIntrinsic` to handle
`OpImageSampleImplicitLod` generation.
- Added `ResourceDimension` enum in `DXILABI.h` and `HLSLResource.h`.
- Added a new test case
`llvm/test/CodeGen/SPIRV/hlsl-resources/Sample.ll` to verify the
implementation.
- Added support for the extension SPV_KHR_non_semantic_info
- Added support for the extension SPV_KHR_relaxed_extended_instruction
- Added instructions from the documentation of the extension.
- Added supporting tests for the same.
Same as #165302
---------
Co-authored-by: Michal Paszkowski <michal@michalpaszkowski.com>
Added support for the SPV_ALTERA_arbitrary_precision_floating_point
extension, enabling all the arbitrary precision floating-point
operations with instruction definitions and test files.
LLVM has pretty thorough support for `int128`, and it has started seeing
some use. Even though we already have support for the
`SPV_ALTERA_arbitrary_precision_integers` extension, the BE was oddly
capping integer width to 64 bits. This patch adds partial support for
lowering 128-bit integers to `OpTypeInt 128`. Some work remains to be
done around legalisation support and validating constant uses (e.g.
cases that get lowered to `OpSpecConstantOp`).
This adds support for the `SPV_NV_shader_atomic_fp16_vector` extension,
and then uses it to enable lowering of atomic add, sub, min and max on 2
and 4 component vectors of FP16, which are rather common operations in ML
workloads. Even though `bfloat16` also works in practice, we do not
enable it since it's not specified in the extension (which might need
updating / promoting to KHR at least). A `TODO` is also inserted in
`SPIRVModuleAnalysis.cpp` regarding the need to upgrade its ample usage
of `report_fatal_error`; I have a WiP patch for that, but it still needs
a bit of baking. Finally, a paired patch will be necessary in the
Translator, as it's not aware of the extension either - I'll update this
review to reference the PR once I create it.
According to SPIR-V spec:
> It is invalid to decorate any given id or structure member more than
one time with the same
[decoration](https://registry.khronos.org/SPIR-V/specs/unified1/SPIRV.html#Decoration),
unless explicitly allowed below for a specific decoration.
`FuncParamAttr` explicitly allows multiple uses of the decoration on the
same id, so this patch honors it.
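A minimal sketch of the rule quoted above, assuming decorations are modelled as `(target_id, name, operand)` tuples. The tuple layout and function name are hypothetical, and only `FuncParamAttr` is treated as repeatable because it is the one this message names; the specification may allow others.

```python
# Only the decoration named in this message; the spec may list more.
REPEATABLE = {"FuncParamAttr"}


def dedupe_decorations(decorations):
    """Drop repeated decorations on an id, keeping distinct repeatable ones."""
    seen = set()
    result = []
    for target_id, name, operand in decorations:
        # Repeatable decorations are only duplicates if the operand matches too.
        if name in REPEATABLE:
            key = (target_id, name, operand)
        else:
            key = (target_id, name)
        if key in seen:
            continue
        seen.add(key)
        result.append((target_id, name, operand))
    return result
```

With this model, two `FuncParamAttr` decorations with different operands on the same id both survive, while a second `Alignment` on that id is dropped.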
This enables support for atomic RMW ops (add, sub, min and max to be
precise) with `bfloat16` operands, via the [SPV_INTEL_16bit_atomics
extension](https://github.com/intel/llvm/pull/20009). It's logically a
successor to #166031 (I should've used a stack), but I'm putting it up
for early review.
---------
Co-authored-by: Matt Arsenault <arsenm2@gmail.com>
Enable the `SPV_INTEL_bfloat16_arithmetic` extension, which allows arithmetic, relational and `OpExtInst` instructions to take `bfloat16` arguments. This patch only adds support to arithmetic and relational ops. The extension itself is rather fresh, but `bfloat16` is ubiquitous at this point and not supporting these ops is limiting.
This adds BE support for the
[`SPV_INTEL_kernel_attributes`](https://github.khronos.org/SPIRV-Registry/extensions/INTEL/SPV_INTEL_kernel_attributes.html)
extension. The extension is necessary to encode the rather useful
`max_work_group_size` kernel attribute, via `OpExecutionMode
MaxWorkgroupSizeINTEL`, which is the only Execution Mode added by the
extension that this patch adds full processing for. Future patches will
add the other Execution Modes and Capabilities. The test is adapted from
the equivalent Translator test; it depends on #165815.
This PR introduces support for the SPIR-V extension
`SPV_INTEL_predicated_io`. This extension adds predicated load and store
instructions. A predicated load reads from memory if the predicate is
true; otherwise, it yields default_value as the result. A predicated store
writes the value to memory if the predicate is true; otherwise, it
does nothing.
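The semantics described above can be modelled in a few lines. This is only a behavioural sketch: memory is a plain dict and the function names are made up for the example, not taken from the extension or the backend.

```python
def predicated_load(memory, address, predicate, default_value):
    """Read memory[address] only when predicate is true, else use the default."""
    return memory[address] if predicate else default_value


def predicated_store(memory, address, value, predicate):
    """Write value to memory[address] only when predicate is true."""
    if predicate:
        memory[address] = value
```

Note that when the predicate is false the load never touches `memory`, so an address that is not mapped does not matter in that case.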
Reference Specification:
https://github.com/intel/llvm/blob/sycl/sycl/doc/design/spirv-extensions/SPV_INTEL_predicated_io.asciidoc
Implementation of
[SPV_KHR_float_controls2](https://github.khronos.org/SPIRV-Registry/extensions/KHR/SPV_KHR_float_controls2.html)
extension, and corresponding tests.
Some of the tests make use of `!spirv.ExecutionMode` LLVM named
metadata. This is because some SPIR-V instructions don't have a direct
equivalent in LLVM IR, so the SPIR-V Target uses different LLVM named
metadata to convey the necessary information. Below, you will find an
example from one of the newly added tests:
```
!spirv.ExecutionMode = !{!19, !20, !21, !22, !23, !24, !25, !26, !27}
!19 = !{ptr @k_float_controls_float, i32 6028, float poison, i32 131079}
!20 = !{ptr @k_float_controls_all, i32 6028, float poison, i32 131079}
!21 = !{ptr @k_float_controls_float, i32 31}
!22 = !{ptr @k_float_controls_all, i32 31}
!23 = !{ptr @k_float_controls_float, i32 4461, i32 32}
!24 = !{ptr @k_float_controls_all, i32 4461, i32 16}
!25 = !{ptr @k_float_controls_all, i32 4461, i32 32}
!26 = !{ptr @k_float_controls_all, i32 4461, i32 64}
!27 = !{ptr @k_float_controls_all, i32 4461, i32 128}
```
`!spirv.ExecutionMode` contains a list of metadata nodes, and each of
them specifies the required operands for expressing a particular
`OpExecutionMode` instruction in SPIR-V. For example, `!19 = !{ptr
@k_float_controls_float, i32 6028, float poison, i32 131079}` will be
lowered to `OpExecutionMode [[k_float_controls_float_ID]]
FPFastMathDefault [[float_type_ID]] 131079`.
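To make the numbers in the example easier to read, the sketch below decodes one metadata node into a human-readable line. The execution mode numbers (31 ContractionOff, 4461 SignedZeroInfNanPreserve, 6028 FPFastMathDefault) and the FPFastMathMode bit values are taken from my reading of the SPIR-V specification and should be treated as assumptions; the helper names are hypothetical, and the output is a description, not valid SPIR-V assembly.

```python
# Assumed from the SPIR-V specification; not defined in this patch.
EXECUTION_MODES = {
    31: "ContractionOff",
    4461: "SignedZeroInfNanPreserve",
    6028: "FPFastMathDefault",
}
FAST_MATH_BITS = {
    0x1: "NotNaN",
    0x2: "NotInf",
    0x4: "NSZ",
    0x8: "AllowRecip",
    0x10: "Fast",
    0x10000: "AllowContract",
    0x20000: "AllowReassoc",
    0x40000: "AllowTransform",
}


def decode_fast_math(mask):
    """List the names of all fast-math bits set in mask, lowest bit first."""
    return [name for bit, name in FAST_MATH_BITS.items() if mask & bit]


def describe(kernel, mode, *operands):
    """Render one !spirv.ExecutionMode node as readable text."""
    name = EXECUTION_MODES.get(mode, str(mode))
    if mode == 6028:
        fp_type, mask = operands
        flags = "|".join(decode_fast_math(mask))
        return f"OpExecutionMode %{kernel} {name} {fp_type} {flags}"
    extra = "".join(f" {op}" for op in operands)
    return f"OpExecutionMode %{kernel} {name}{extra}"
```

Under these assumptions, the mask 131079 (0x20007) from `!19` decodes to NotNaN, NotInf, NSZ, and AllowReassoc.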
---------
Co-authored-by: Dmitry Sidorov <dmitry.sidorov@intel.com>
Added constraints related to the addressing model, as required by the
specification.
This matches the implementation in the translator.
Same as PR #160089.
All issues are resolved.
This PR introduces the support for the SPIR-V extension
`SPV_KHR_bfloat16`. This extension extends the `OpTypeFloat` instruction
to enable the use of bfloat16 types with cooperative matrices and dot
products.
TODO:
Per the `SPV_KHR_bfloat16` extension, there are a limited number of
instructions that can use the bfloat16 type. For example, arithmetic
instructions like `FAdd` or `FMul` can't operate on `bfloat16` values.
Therefore, a future patch should be added to either emit an error or
fall back to FP32 for arithmetic in cases where bfloat16 must not be
used.
Reference Specification:
https://github.com/KhronosGroup/SPIRV-Registry/blob/main/extensions/KHR/SPV_KHR_bfloat16.asciidoc
Fix code quality issues reported by static analysis tool, such as:
- Rule of Three/Five.
- Dereference after null check.
- Unchecked return value.
- Variable copied when it could be moved.
Prior to this patch, when `NumElems` was 0, `OpTypeRuntimeArray` was
directly generated, but it requires `Shader` capability, so it can only
be generated if `Shader` env is being used. We have observed a pattern
of using unbound arrays that translate into `[0 x ...]` types in OpenCL,
which implies `Kernel` capability, so `OpTypeRuntimeArray` should not be
used. To prevent this scenario, this patch simplifies GEP instructions
where the type is a 0-length array and the first index is also 0. In such
a case, we effectively drop the 0-length array and the first index.
Additionally, the newly added test prior to this patch was generating a
module with both `Shader` and `Kernel` capabilities at the same time,
but they're incompatible. This patch also fixes that.
Finally, prior to this patch, the newly added test was adding `Shader`
capability to the module even with the command line flag
`--avoid-spirv-capabilities=Shader`. This patch also has a fix for that.
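The GEP simplification described above can be sketched as follows. The representation is invented for the example: an array type is a tuple `("array", length, element_type)`, any other type is a string, and indices are a list of constants or value names. It illustrates the rewrite only and is not the backend's implementation.

```python
def simplify_gep(source_type, indices):
    """Drop a leading 0-length array type together with a leading 0 index."""
    is_zero_length_array = (
        isinstance(source_type, tuple)
        and source_type[0] == "array"
        and source_type[1] == 0
    )
    if is_zero_length_array and indices and indices[0] == 0:
        # GEP [0 x T], ptr, 0, i, ...  becomes  GEP T, ptr, i, ...
        return source_type[2], indices[1:]
    return source_type, indices
```

For instance, a GEP over `[0 x i32]` with indices `0, %i` becomes a GEP over `i32` with the single index `%i`; anything else is returned unchanged.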
Add handling for FPFastMathMode in SPIR-V shaders. This is a first pass
that simply does a direct translation when the proper extension is
available.
This will unblock work for HLSL. However, it is not a full solution.
The default math mode for SPIR-V is determined by the API. When
targeting Vulkan, many of the fast math options are assumed. We should do
something particular when targeting Vulkan.
We will also need to handle the HLSL "precise" keyword correctly when
FPFastMathMode is not available.
Unblocks https://github.com/llvm/llvm-project/issues/140739, but we are
keeping it open to track the remaining issues mentioned above.
PR #141787 added code to emit the Fragment execution model. This
required emitting the OriginUpperLeft ExecutionMode. But this was done
by using the same codepath used for OpEntryPoint.
This has 2 issues:
- the interface variables were added to both OpEntryPoint and
OpExecutionMode.
- the existing OpExecutionMode logic was not used.
This commit fixes this, regrouping OpExecutionMode handling in one
place, and fixing the bad codegen that occurred when interface variables
are added.