Issue: https://github.com/llvm/llvm-project/issues/99172
- [x] Implement `WavePrefixSum` clang builtin
- [x] Link `WavePrefixSum` clang builtin with `hlsl_intrinsics.h`
- [x] Add sema checks for `WavePrefixSum` to
`CheckHLSLBuiltinFunctionCall` in `SemaChecking.cpp`
- [x] Add codegen for `WavePrefixSum` to `EmitHLSLBuiltinExpr` in
`CGBuiltin.cpp`
- [x] Add codegen tests to
`clang/test/CodeGenHLSL/builtins/WavePrefixSum.hlsl`
- [x] Add sema tests to
`clang/test/SemaHLSL/BuiltIns/WavePrefixSum-errors.hlsl`
- [x] Create the `int_dx_WavePrefixSum` intrinsic in
`IntrinsicsDirectX.td`
- [x] Create the `DXILOpMapping` of `int_dx_WavePrefixSum` to `121` in
`DXIL.td`
- [x] Create the `WavePrefixSum.ll` and `WavePrefixSum_errors.ll` tests
in `llvm/test/CodeGen/DirectX/`
- [x] Create the `int_spv_WavePrefixSum` intrinsic in
`IntrinsicsSPIRV.td`
- [x] In SPIRVInstructionSelector.cpp create the `WavePrefixSum`
lowering and map it to `int_spv_WavePrefixSum` in
`SPIRVInstructionSelector::selectIntrinsic`.
- [x] Create SPIR-V backend test case in
`llvm/test/CodeGen/SPIRV/hlsl-intrinsics/WavePrefixSum.ll`
I also added a new macro
`GENERATE_HLSL_INTRINSIC_FUNCTION_SELECT_UNSIGNED` in conjunction with
the new function `getUnsignedIntrinsicVariant` to make selecting
unsigned variants of the intrinsic easier. As a result, I was able to
replace `getWaveActiveSumIntrinsic`, `getWaveActiveMaxIntrinsic`, and
`getWaveActiveMinIntrinsic` using the new macro.
The previous WaveActiveBallot implementation did not account for the
fact that the DXC implementation of the intrinsic returns a struct type
with 4 uints, rather than a vector of 4 uints. This must be respected,
otherwise the validator will reject the uses of WaveActiveBallot that
return a vector of 4 uints.
This PR updates the return type and adds the DXC-specific return type
`fouri32` to use for the intrinsic.
This PR adds a new CC1 option and a new dxc driver option. The DXC
option, when set, is translated into the new CC1 option.
The `all-resources-bound` dxc option will create a metadata module flag,
and the print-dx-shader-flags pass will set the appropriate shader
module flag from this metadata module flag.
Fixes https://github.com/llvm/llvm-project/issues/112264
Raw (as in ByteAddress) buffer accesses in DXIL must specify
ElementIndex as undef, and Structured buffer accesses must specify a
value. Ensure that we do this correctly in DXILResourceAccess, and
enforce that the operations are valid in DXILOpLowering.
Fixes#173316
This introduces the DXILMemIntrinsics pass and moves memset and memcpy
handling from DXILLegalize to here. We need to do this so that we can
handle memory intrinsics before the DXILResourceAccess pass so that we
can properly deal with arrays and large structures in resources.
Instead of trying to precalculate GEP offsets ahead of time and then
process resource accesses based off of these offsets, traverse the GEP
chain inline for each access. This makes it easier to get the types
correct when translating GEPs for cbuffer and structured buffer
accesses, which in turn lets us access individual elements of those
structures directly.
Fixes#160208, #164517, and #169430
Continuation of PR #169848 to address PR comments.
This PR makes the test more strict by adding CHECKs to ensure the loads
are indeed using the same or different GEPs.
Fixes an LLVM DirectX codegen test after it broke due to #169141
The CBuffer loads and GEPs are no longer duplicated when there are two
or more accesses within the same basic block.
This PR removes the duplicate check for CBuffer load and GEP from the
original test function `@f` and adds a new test function `@g` which
places duplicate CBuffer loads into separate basic blocks.
- The DXIL data scalarizer only needs to change vectors into arrays. It
does not need to change the types of GEPs to match the pointer type.
This PR simplifies the `visitGetElementPtrInst` method to do just that
while also accounting for nested GEPs from ConstantExprs. (Before this
PR, there were still vector types lingering in nested GEPs with
ConstantExprs.)
- The `equivalentArrayTypeFromVector` function was awkwardly placed near
the top of the file and away from the other helper functions. The
function is now moved next to the other helper functions.
- Removed an unnecessary `||` condition from `isVectorOrArrayOfVectors`
Related tests have also been cleaned up, and the test CHECKs have been
modified to account for the new simplified behavior.
Adds the fwidth intrinsic for HLSL.
The DXIL path only requires modification to the hlsl headers.
The SPIRV path implements the OpFwidth builtin in Clang and instruction
selection for the OpFwidth instruction in LLVM.
Also adds shader stage tests to the ddx_coarse and ddy_coarse
instructions used by fwidth.
Closes#99120
---------
Co-authored-by: Alexander Johnston <alexander.johnston@amd.com>
This change drops the use of the "Layout" type and instead uses explicit
padding throughout the compiler to represent types in HLSL buffers.
There are a few parts to this, though it's difficult to split them up as
they're very interdependent:
1. Refactor HLSLBufferLayoutBuilder to allow us to calculate the padding
of arbitrary types.
2. Teach Clang CodeGen to use HLSL specific paths for cbuffers when
generating aggregate copies, array accesses, and structure accesses.
3. Simplify DXILCBufferAccesses such that it directly replaces accesses
with dx.resource.getpointer rather than recalculating the layout.
4. Basic infrastructure for SPIR-V handling, but the implementation
itself will need work in follow ups.
Fixes several issues, including #138996, #144573, and #156084.
Resolves#147352.
This isn't reachable today but will come into play once we reorder
passes for #147352 and #147351.
Note that the `CBufferRowIntrin` helper struct is copied from the
`DXILCBufferAccess` pass, but it will be removed from there when we
simplify that pass in #147351
fixes#165051
This change reverts the experiment we did for #165311
While some backends seem to support llvm.assume without validation The
validator itself does not so it makes more sense to just remove it.
This pr lets the `dxil-data-scalarization` account for a GEP with a
source type that is a sub-type of the pointer operand type.
The pass is updated so that the replaced GEP introduces zero indices
such that the result type remains the same (with the vector -> array
transform).
Please see resolved issue for an annotated example.
Resolves: https://github.com/llvm/llvm-project/issues/165473
This pr adds support for emitting the `hlsl.wavesize` function attribute
as an entry property metadata for a compute shader.
It follows the implementation of `hlsl.numthreads`.
- Collects the wave range information from the function attribute in
`DXILMetadataAnalysis`
- Introduce the `WaveRange` property tag
- Emit a `WaveSize` or `WaveRange` metadata (depending on shader model)
in `DXILTranslateMetadata`
- Add tests for valid/invalid scenarios
- Updates the base `PSVInfo` to reflect the min/max wave lane counts
Resolves#70118
Implement the f16tof32() intrinsic, including DXILand SPIRV codegen, and
associated tests.
Fixes#99112
---------
Co-authored-by: Tim Corringham <tcorring@amd.com>
This pr introduces an allow-list for module metadata, this encompasses
the llvm metadata nodes: `llvm.ident` and `llvm.module.flags`, as well
as, the generated `dx.` options.
Resolves: #164473.
This pr adds the equivalent validation of `llvm.loop` metadata that is
[done in
DXC](8f21027f2a/lib/DxilValidation/DxilValidation.cpp (L3010)).
This is done as follows:
- Add `llvm.loop` to the metadata allow-list in `DXILTranslateMetadata`
- Iterate through all `llvm.loop` metadata nodes and strip all
incompatible ones
- Raise an error for ill-formed nodes that are compatible with DXIL
Resolves: https://github.com/llvm/llvm-project/issues/137387
Adds the WaveActiveMin intrinsic from #99169. I think I did all of the
required things on the checklist:
- [x] Implement `WaveActiveMin` clang builtin,
- [x] Link `WaveActiveMin` clang builtin with `hlsl_intrinsics.h`
- [x] Add sema checks for `WaveActiveMin` to
`CheckHLSLBuiltinFunctionCall` in `SemaChecking.cpp`
- [x] Add codegen for `WaveActiveMin` to `EmitHLSLBuiltinExpr` in
`CGBuiltin.cpp`
- [x] Add codegen tests to
`clang/test/CodeGenHLSL/builtins/WaveActiveMin.hlsl`
- [x] Add sema tests to
`clang/test/SemaHLSL/BuiltIns/WaveActiveMin-errors.hlsl`
- [x] Create the `int_dx_WaveActiveMin` intrinsic in
`IntrinsicsDirectX.td`
- [x] Create the `DXILOpMapping` of `int_dx_WaveActiveMin` to `119` in
`DXIL.td`
- [x] Create the `WaveActiveMin.ll` and `WaveActiveMin_errors.ll` tests
in `llvm/test/CodeGen/DirectX/`
- [x] Create the `int_spv_WaveActiveMin` intrinsic in
`IntrinsicsSPIRV.td`
- [x] In SPIRVInstructionSelector.cpp create the `WaveActiveMin`
lowering and map it to `int_spv_WaveActiveMin` in
`SPIRVInstructionSelector::selectIntrinsic`.
- [x] Create SPIR-V backend test case in
`llvm/test/CodeGen/SPIRV/hlsl-intrinsics/WaveActiveMin.ll
But as some of the code has changed and was moved around (E.G.
`CGBuiltin.cpp` -> `CGHLSLBuiltins.cpp`) I mostly followed how
`WaveActiveMax()` is implemented.
I have not been able to run the tests myself as I am unsure which
project runs the correct test. Any guidance on how I can test myself
would be helpful.
Also added some tests to the offload-test-suite
https://github.com/llvm/offload-test-suite/pull/478
In DXIL, some 64bit types are actually represented with their 32bit
counterpart. This was already being address in the codegen, however the
metadata generation was lacking this information. This PR is fixing this
issue.
Closes: [#146735](https://github.com/llvm/llvm-project/issues/146735)
This pr updates `DXILPrepare` and `DXILTranslateMetadata` by moving all
the removal of metadata from `DXILPrepare` to `DXILTranslateMetadata` to
have a more consistent definition of what each pass is doing.
It restricts the `DXILPrepare` to only update function attributes and
insert bitcasts, and moves the removal of metadata to
`DXILTranslateMetadata` so that all manipulation of metadata is done in
a single pass.
We were checking for cbuffers where the global was removed, but if the
buffer is completely unused the whole thing can be null.
---------
Co-authored-by: Helena Kotas <hekotas@microsoft.com>
DXILResource was falling over trying to name a resource type that
contained an array, such as `StructuredBuffer<float[3][2]>`. Handle this
by walking through array types to gather the dimensions.
This is a temporary measure to explicitly remove the unrecognized named
metadata when targeting DXIL.
This should be removed for an allowlist as tracked here:
https://github.com/llvm/llvm-project/issues/164473.
This introduces the `dx.Padding` type as an alternative to the
`dx.Layout` types that are currently used for cbuffers. Later, we'll
remove the `dx.Layout` types completely, but making the backend handle
either makes it easier to stage the necessary changes to get there.
See #147352 for details.
Introduces LLVM intrinsic `llvm.dx.resource.getdimensions.x` and its lowering to DXIL op `op.dx.getDimensions`.
The intrinsic will be used to implement `GetDimension` for buffers. The lowering is using `undef` value since it is required by the DXIL format which is based on LLVM 3.7.
Proposal update: https://github.com/llvm/wg-hlsl/pull/350Closes#112982
This explicitly adds two 3-element vectors to the DataLayout so that
they'll be element-aligned. We need to do this more generally for
vectors, but this unblocks some very common cases.
Workaround for #123968
When calculating the upper bound for resource binding to be stored in the PSV0 part of the DXIL container, the compiler needs to take into account that the resource range could be _unbounded_, which is indicated by the binding size being `UINT32_MAX`.
Fixes [#159679](https://github.com/llvm/llvm-project/issues/159679)
This PR makes a few changes to make sure that Root Signature Flags are
always parsed validated and verified, this includes if you use a version
that doesn't support flags. The logic already existed, this PR just
makes sure it is always executed.
Closes: [#161579](https://github.com/llvm/llvm-project/issues/161579)
---------
Co-authored-by: joaosaffran <joao.saffran@microsoft.com>
This PR changes the validation logic for Root Descriptor and Descriptor
Range flags to properly check if the `uint32_t` values are within range
before casting into the enums.
Root Signature 1.2 adds flags to static samplers. This requires us to
change the metadata representation to account for it when being
generated. This patch focus on the metadata changes required in the
backend, frontend changes will come in a future PR.
Root Signature Flags, allow flags to block compilation of certain shader
stages. This PR implements a validation and notify the user if they
compile a root signature that is denying such shader stage.
Closes: https://github.com/llvm/llvm-project/issues/153062
Previously approved: https://github.com/llvm/llvm-project/pull/153287
---------
Co-authored-by: joaosaffran <joao.saffran@microsoft.com>
Co-authored-by: Joao Saffran <{ID}+{username}@users.noreply.github.com>
Co-authored-by: Joao Saffran <jderezende@microsoft.com>
This patch adds 2 small validation to DirectX backend. First, it checks
if registers in descriptor tables are not overflowing, meaning they
don't try to bind registers over the maximum allowed value, this is
checked both on the offset and on the number of descriptors inside the
range; second, it checks if samplers are being mixed with other resource
types.
Closes: #153057, #153058
---------
Co-authored-by: joaosaffran <joao.saffran@microsoft.com>
Co-authored-by: Joao Saffran <{ID}+{username}@users.noreply.github.com>
Co-authored-by: Joao Saffran <jderezende@microsoft.com>