366 Commits

Author SHA1 Message Date
Paul Walker
96d2cb4145
[LLVM][CodeGen][DirectX] Fix scalarisation when vector ConstantFP is used. (#172684)
When using -use-constant-fp-for-fixed-length-splat `splat (float C)`
becomes ConstantFP(C) rather than ConstantVector(C, C, C...).
2026-02-05 13:18:38 +00:00
Kai
751a546fa9
[HLSL][DXIL][SPIRV] WavePrefixSum intrinsic support (#167946)
Issue: https://github.com/llvm/llvm-project/issues/99172
- [x] Implement `WavePrefixSum` clang builtin
- [x] Link `WavePrefixSum` clang builtin with `hlsl_intrinsics.h`
- [x] Add sema checks for `WavePrefixSum` to
`CheckHLSLBuiltinFunctionCall` in `SemaChecking.cpp`
- [x] Add codegen for `WavePrefixSum` to `EmitHLSLBuiltinExpr` in
`CGBuiltin.cpp`
- [x] Add codegen tests to
`clang/test/CodeGenHLSL/builtins/WavePrefixSum.hlsl`
- [x] Add sema tests to
`clang/test/SemaHLSL/BuiltIns/WavePrefixSum-errors.hlsl`
- [x] Create the `int_dx_WavePrefixSum` intrinsic in
`IntrinsicsDirectX.td`
- [x] Create the `DXILOpMapping` of `int_dx_WavePrefixSum` to `121` in
`DXIL.td`
- [x] Create the `WavePrefixSum.ll` and `WavePrefixSum_errors.ll` tests
in `llvm/test/CodeGen/DirectX/`
- [x] Create the `int_spv_WavePrefixSum` intrinsic in
`IntrinsicsSPIRV.td`
- [x] In SPIRVInstructionSelector.cpp create the `WavePrefixSum`
lowering and map it to `int_spv_WavePrefixSum` in
`SPIRVInstructionSelector::selectIntrinsic`.
- [x] Create SPIR-V backend test case in
`llvm/test/CodeGen/SPIRV/hlsl-intrinsics/WavePrefixSum.ll`

I also added a new macro
`GENERATE_HLSL_INTRINSIC_FUNCTION_SELECT_UNSIGNED` in conjunction with
the new function `getUnsignedIntrinsicVariant` to make selecting
unsigned variants of the intrinsic easier. As a result, I was able to
replace `getWaveActiveSumIntrinsic`, `getWaveActiveMaxIntrinsic`, and
`getWaveActiveMinIntrinsic` using the new macro.
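
For readers unfamiliar with the intrinsic, here is a minimal Python sketch of the value `WavePrefixSum` is documented to produce in HLSL: for each active lane, the sum of the inputs of active lanes with a lower index. The helper name `wave_prefix_sum` and the list-based model of a wave are illustrative assumptions, not code from this PR.

```python
from typing import List, Optional


def wave_prefix_sum(values: List[int], active: List[bool]) -> List[Optional[int]]:
    """Model an exclusive prefix sum across the active lanes of one wave.

    values[i] is lane i's input and active[i] says whether lane i takes part.
    Inactive lanes get None because they do not execute the operation.
    """
    result: List[Optional[int]] = []
    running = 0
    for value, is_active in zip(values, active):
        if is_active:
            result.append(running)  # sum of lower-indexed active lanes only
            running += value
        else:
            result.append(None)
    return result
```

With all four lanes active and inputs `[1, 2, 3, 4]`, the sketch yields `[0, 1, 3, 6]`; the lane's own input is never included in its result.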
2026-02-03 03:00:45 -05:00
Joshua Batista
293a668329
[HLSL] Add wave prefix count bits function (#178059)
This PR adds the WavePrefixCountBits function to HLSL, including spirv
and DXIL code generation.
Fixes https://github.com/llvm/llvm-project/issues/99171
2026-01-29 12:09:27 -08:00
Tim Corringham
d5f405558d
[HLSL] Implement f32tof16() intrinsic (#172469)
Implement the f32tof16() intrinsic, DXIL and SPIRV codegen, and related
tests.
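
As a rough illustration of what the intrinsic computes, the sketch below uses Python's `struct` module to produce the IEEE 754 binary16 bit pattern of a float in the low 16 bits of an integer. The function name `f32_to_f16_bits` is hypothetical, and the rounding and out-of-range behaviour of the real intrinsic may differ from Python's, so only exactly representable inputs are shown.

```python
import struct


def f32_to_f16_bits(x: float) -> int:
    """Return the binary16 bit pattern of x as an integer in [0, 0xFFFF].

    struct's "e" format is IEEE 754 half precision; it raises OverflowError
    for finite values too large to represent, unlike a hardware conversion.
    """
    return struct.unpack("<H", struct.pack("<e", x))[0]
```

For example, `f32_to_f16_bits(1.0)` is `0x3C00` and `f32_to_f16_bits(-2.0)` is `0xC000`.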

Fixes #99113

---------

Co-authored-by: Tim Corringham <tcorring@amd.com>
2026-01-26 15:06:48 +00:00
Joshua Batista
058a223388
[HLSL] Add wave active ballot to set of wave ops that set waveops shader flag (#177043)
This PR simply adds wave active ballot to the set of wave ops that
switch on the waveops shader flag.
2026-01-21 10:06:04 -08:00
Joshua Batista
11b1836282
[HLSL] Handle WaveActiveBallot struct return type appropriately (#175105)
The previous WaveActiveBallot implementation did not account for the
fact that the DXC implementation of the intrinsic returns a struct type
with 4 uints, rather than a vector of 4 uints. This must be respected,
otherwise the validator will reject the uses of WaveActiveBallot that
return a vector of 4 uints.
This PR updates the return type and adds the DXC-specific return type
`fouri32` to use for the intrinsic.
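
For context, a ballot is a bitmask with one bit per lane, spread over four 32-bit values. The Python sketch below models only that packing; it does not model the struct-versus-vector typing that this PR is about. The name `wave_active_ballot` and the list-of-booleans input are assumptions made for illustration.

```python
from typing import List, Tuple


def wave_active_ballot(predicates: List[bool]) -> Tuple[int, ...]:
    """Pack one predicate bit per lane into four 32-bit words (x, y, z, w)."""
    if len(predicates) > 128:
        raise ValueError("four 32-bit words hold at most 128 lanes")
    words = [0, 0, 0, 0]
    for lane, pred in enumerate(predicates):
        if pred:
            # lane 0..31 goes to word 0, lane 32..63 to word 1, and so on
            words[lane // 32] |= 1 << (lane % 32)
    return tuple(words)
```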
2026-01-20 10:08:26 -08:00
Joshua Batista
ff617bce04
[HLSL] Fix WaveBallot dxil op function name (#174901)
Just fixing a typo from a copy paste.
2026-01-08 10:17:47 -08:00
Joshua Batista
65b0de42e7
[HLSL] Add WaveActiveBallot builtins and lower to DXIL / SPIR-V (#174638)
This PR adds WaveActiveBallot as a builtin function to HLSL.
Fixes https://github.com/llvm/llvm-project/issues/99163
2026-01-07 10:06:27 -08:00
Deric C.
57b0d8379e
[DirectX] Account for GlobalOffset in CurrentIndex calculation for cbuffer loads with GEPs in DXILResourceAccess pass (#174666)
Fixes #174656

---------

Co-authored-by: Alex Sepkowski <alexsepkowski@gmail.com>
2026-01-07 08:39:25 -08:00
Joshua Batista
52e815f1cd
[HLSL] Add allresourcesbound option to DXC driver and set corresponding module flag (#173411)
This PR adds a new CC1 option and a new dxc driver option. The DXC
option, when set, is translated into the new CC1 option.
The `all-resources-bound` dxc option will create a metadata module flag,
and the print-dx-shader-flags pass will set the appropriate shader
module flag from this metadata module flag.

Fixes https://github.com/llvm/llvm-project/issues/112264
2025-12-30 10:15:46 -08:00
Justin Bogner
ac602d887b
[DirectX] Disallow ElementIndex for raw buffer accesses (#173320)
Raw (as in ByteAddress) buffer accesses in DXIL must specify
ElementIndex as undef, and Structured buffer accesses must specify a
value. Ensure that we do this correctly in DXILResourceAccess, and
enforce that the operations are valid in DXILOpLowering.

Fixes #173316
2025-12-23 13:32:50 -07:00
Justin Bogner
38cdadd9c7
[DirectX] Teach MemIntrinsics about structs and nested arrays (#173078)
Add handling for more complicated cases than simple arrays.
2025-12-22 13:28:41 -07:00
Justin Bogner
b359616349
[DirectX] Resources and simple GEP traversal in DXILMemIntrinsics (#173054)
Walk through GEPs and recognize resource target extension types when
trying to infer the underlying types of memory intrinsics.
2025-12-22 11:47:50 -07:00
Justin Bogner
b324c9f4fa
[DirectX] Move memset and memcpy handling to a new pass. NFC (#172921)
This introduces the DXILMemIntrinsics pass and moves memset and memcpy
handling from DXILLegalize to here. We need to do this so that we can
handle memory intrinsics before the DXILResourceAccess pass so that we
can properly deal with arrays and large structures in resources.
2025-12-18 22:08:43 -07:00
Justin Bogner
c3039a7dc5
[DirectX] Avoid precalculating GEPs in DXILResourceAccess (#172720)
Instead of trying to precalculate GEP offsets ahead of time and then
process resource accesses based off of these offsets, traverse the GEP
chain inline for each access. This makes it easier to get the types
correct when translating GEPs for cbuffer and structured buffer
accesses, which in turn lets us access individual elements of those
structures directly.

Fixes #160208, #164517, and #169430
2025-12-18 22:15:12 +00:00
Finn Plummer
2185596c07
[DirectX] Add lowering support for llvm.fsh[l|r].* (#170570)
This pr adds support to emulate the `llvm.fshl.*` and `llvm.fshr.*`
intrinsics by expanding them, as described
[here](https://llvm.org/docs/LangRef.html#llvm-fshl-intrinsic).
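
The LangRef semantics can be sketched in Python as follows: `fshl` concatenates `a` (high bits) and `b` (low bits), shifts left by `c` modulo the bit width, and keeps the high half; `fshr` shifts right and keeps the low half. This is an illustration of the semantics under an assumed 32-bit default width, not the expansion the backend actually emits.

```python
def fshl(a: int, b: int, c: int, bits: int = 32) -> int:
    """Funnel shift left: high `bits` of ((a:b) << (c % bits))."""
    mask = (1 << bits) - 1
    s = c % bits
    if s == 0:
        return a & mask  # avoid shifting b by the full width
    return ((a << s) | ((b & mask) >> (bits - s))) & mask


def fshr(a: int, b: int, c: int, bits: int = 32) -> int:
    """Funnel shift right: low `bits` of ((a:b) >> (c % bits))."""
    mask = (1 << bits) - 1
    s = c % bits
    if s == 0:
        return b & mask  # avoid shifting a by the full width
    return (((a & mask) << (bits - s)) | ((b & mask) >> s)) & mask
```

Passing the same value for `a` and `b` gives a rotate, which is a handy sanity check for any expansion.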

Resolves #165750.
2025-12-15 10:03:42 -08:00
Alexander Johnston
4ca2caeab6
[HLSL] Implement ddx/ddy_fine intrinsics (#168874)
Implements the HLSL ddx_fine and ddy_fine intrinsics.
For the SPIRV backend the intrinsics are ensured to be unavailable in
opencl (as they require fragment execution stage).

Closes https://github.com/llvm/llvm-project/issues/99098
Closes https://github.com/llvm/llvm-project/issues/99101
2025-12-12 09:34:41 -08:00
Finn Plummer
ed6078c023
[NFC] Add missing analysis to DirectX/llc-pipeline (#170714)
The Runtime Library Function Analysis pass was added in
https://github.com/llvm/llvm-project/pull/168622 but the backend test
case was not updated accordingly.
2025-12-04 18:56:23 +00:00
Deric C.
06c8ee61ab
[NFC] [DirectX] Make DirectX codegen test CBufferAccess/gep-ce-two-uses.ll more strict (#169855)
Continuation of PR #169848 to address PR comments.

This PR makes the test more strict by adding CHECKs to ensure the loads
are indeed using the same or different GEPs.
2025-11-27 13:58:47 -08:00
Deric C.
a1f30c24ea
[NFC] [DirectX] Update DirectX codegen test CBufferAccess/gep-ce-two-uses.ll due to changes to ReplaceConstant (#169848)
Fixes an LLVM DirectX codegen test after it broke due to #169141

The CBuffer loads and GEPs are no longer duplicated when there are two
or more accesses within the same basic block.
This PR removes the duplicate check for CBuffer load and GEP from the
original test function `@f` and adds a new test function `@g` which
places duplicate CBuffer loads into separate basic blocks.
2025-11-27 12:02:15 -08:00
Deric C.
f5e228b32a
[DirectX] Simplify DXIL data scalarization, and data scalarize whole GEP chains (#168096)
- The DXIL data scalarizer only needs to change vectors into arrays. It
does not need to change the types of GEPs to match the pointer type.
This PR simplifies the `visitGetElementPtrInst` method to do just that
while also accounting for nested GEPs from ConstantExprs. (Before this
PR, there were still vector types lingering in nested GEPs with
ConstantExprs.)
- The `equivalentArrayTypeFromVector` function was awkwardly placed near
the top of the file and away from the other helper functions. The
function is now moved next to the other helper functions.
- Removed an unnecessary `||` condition from `isVectorOrArrayOfVectors`

Related tests have also been cleaned up, and the test CHECKs have been
modified to account for the new simplified behavior.
2025-11-24 10:56:20 -08:00
Alexander Johnston
76f1949cfa
[HLSL] Implement the fwidth intrinsic for DXIL and SPIR-V target (#161378)
Adds the fwidth intrinsic for HLSL.
The DXIL path only requires modification to the hlsl headers.
The SPIRV path implements the OpFwidth builtin in Clang and instruction
selection for the OpFwidth instruction in LLVM.
Also adds shader stage tests to the ddx_coarse and ddy_coarse
instructions used by fwidth.

Closes #99120

---------

Co-authored-by: Alexander Johnston <alexander.johnston@amd.com>
2025-11-20 07:38:32 -05:00
Justin Bogner
c4898f3f22
[HLSL][DirectX] Use a padding type for HLSL buffers. (#167404)
This change drops the use of the "Layout" type and instead uses explicit
padding throughout the compiler to represent types in HLSL buffers.

There are a few parts to this, though it's difficult to split them up as
they're very interdependent:

1. Refactor HLSLBufferLayoutBuilder to allow us to calculate the padding
of arbitrary types.
2. Teach Clang CodeGen to use HLSL specific paths for cbuffers when
generating aggregate copies, array accesses, and structure accesses.
3. Simplify DXILCBufferAccesses such that it directly replaces accesses
with dx.resource.getpointer rather than recalculating the layout.
4. Basic infrastructure for SPIR-V handling, but the implementation
itself will need work in follow ups.
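
To give a feel for the padding being made explicit, here is a small Python sketch of the legacy HLSL cbuffer packing rule for 32-bit scalars and vectors: values are packed into 16-byte rows, a value may not straddle a row, and each array element starts a new row. The helper names are hypothetical, the sketch assumes members of at most 16 bytes with 4-byte components, and it is not the algorithm used by `HLSLBufferLayoutBuilder`.

```python
ROW_BYTES = 16  # a legacy cbuffer row holds four 32-bit components


def padding_before(offset: int, size: int) -> int:
    """Padding needed before a scalar/vector of `size` bytes at `offset`.

    Assumes size <= 16. The value is pushed to the next row only when it
    would otherwise cross a 16-byte row boundary.
    """
    used = offset % ROW_BYTES
    if used + size <= ROW_BYTES:
        return 0
    return ROW_BYTES - used


def array_element_padding(elem_size: int) -> int:
    """Padding after a non-final array element so the next starts a row."""
    return (-elem_size) % ROW_BYTES
```

Under these assumptions a `float3` placed after a single `float` needs no padding, while each element of a `float` array except the last is followed by 12 bytes of padding.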

Fixes several issues, including #138996, #144573, and #156084.
Resolves #147352.
2025-11-18 13:38:43 -08:00
Alexander Johnston
ed60cd2563
[HLSL] Implement ddx/ddy_coarse intrinsics (#164831)
Closes https://github.com/llvm/llvm-project/issues/99097
Closes https://github.com/llvm/llvm-project/issues/99100

As ddx and ddy are near identical implementations I've combined them in
this PR. This aims to unblock
https://github.com/llvm/llvm-project/pull/161378

---------

Co-authored-by: Alexander Johnston <alexander.johnston@amd.com>
2025-11-18 16:41:07 +01:00
Justin Bogner
4ae7348513
[DirectX] Teach DXILResourceAccess about cbuffers (#164554)
This isn't reachable today but will come into play once we reorder
passes for #147352 and #147351.

Note that the `CBufferRowIntrin` helper struct is copied from the
`DXILCBufferAccess` pass, but it will be removed from there when we
simplify that pass in #147351
2025-11-10 12:32:43 -08:00
Farzon Lotfi
ecddaaeb3e
[DirectX] Remove llvm.assume intrinsic (#166697)
fixes #165051

This change reverts the experiment we did for #165311

While some backends seem to support llvm.assume without validation, the
validator itself does not, so it makes more sense to just remove it.
2025-11-06 14:05:30 -05:00
Finn Plummer
75c09b7924
[DirectX] Let data scalarizer pass account for sub-types when updating GEP type (#166200)
This pr lets the `dxil-data-scalarization` account for a GEP with a
source type that is a sub-type of the pointer operand type.

The pass is updated so that the replaced GEP introduces zero indices
such that the result type remains the same (with the vector -> array
transform).

Please see resolved issue for an annotated example.

Resolves: https://github.com/llvm/llvm-project/issues/165473
2025-11-06 09:14:50 -08:00
Finn Plummer
6312d27511
[DirectX] Emit hlsl.wavesize function attribute as entry property metadata (#165624)
This pr adds support for emitting the `hlsl.wavesize` function attribute
as an entry property metadata for a compute shader.

It follows the implementation of `hlsl.numthreads`.

- Collects the wave range information from the function attribute in
`DXILMetadataAnalysis`
- Introduce the `WaveRange` property tag
- Emit a `WaveSize` or `WaveRange` metadata (depending on shader model)
in `DXILTranslateMetadata`
- Add tests for valid/invalid scenarios
- Updates the base `PSVInfo` to reflect the min/max wave lane counts

Resolves #70118
2025-11-05 09:18:49 -08:00
Tim Corringham
89ec96b8b4
[HLSL] Implement the f16tof32() intrinsic (#165860)
Implement the f16tof32() intrinsic, including DXIL and SPIRV codegen, and
associated tests.

Fixes #99112

---------

Co-authored-by: Tim Corringham <tcorring@amd.com>
2025-11-04 17:04:39 +00:00
Finn Plummer
ad29838a44
[DirectX] Use an allow-list of DXIL compatible module metadata (#165290)
This pr introduces an allow-list for module metadata. It encompasses
the llvm metadata nodes `llvm.ident` and `llvm.module.flags`, as well
as the generated `dx.` options.

Resolves: #164473.
2025-10-29 13:42:08 -07:00
Finn Plummer
032900eb30
[DirectX] Add DXIL validation of llvm.loop metadata (#164292)
This pr adds the equivalent validation of `llvm.loop` metadata that is
[done in
DXC](8f21027f2a/lib/DxilValidation/DxilValidation.cpp (L3010)).

This is done as follows:
- Add `llvm.loop` to the metadata allow-list in `DXILTranslateMetadata`
- Iterate through all `llvm.loop` metadata nodes and strip all
incompatible ones
- Raise an error for ill-formed nodes that are compatible with DXIL

Resolves: https://github.com/llvm/llvm-project/issues/137387
2025-10-29 11:54:18 -07:00
Sietze Riemersma
bfd4935fa3
[HLSL][DXIL][SPRIV] Added WaveActiveMin intrinsic (#164385)
Adds the WaveActiveMin intrinsic from #99169. I think I did all of the
required things on the checklist:
- [x] Implement `WaveActiveMin` clang builtin,
- [x] Link `WaveActiveMin` clang builtin with `hlsl_intrinsics.h`
- [x] Add sema checks for `WaveActiveMin` to
`CheckHLSLBuiltinFunctionCall` in `SemaChecking.cpp`
- [x] Add codegen for `WaveActiveMin` to `EmitHLSLBuiltinExpr` in
`CGBuiltin.cpp`
- [x] Add codegen tests to
`clang/test/CodeGenHLSL/builtins/WaveActiveMin.hlsl`
- [x] Add sema tests to
`clang/test/SemaHLSL/BuiltIns/WaveActiveMin-errors.hlsl`
- [x] Create the `int_dx_WaveActiveMin` intrinsic in
`IntrinsicsDirectX.td`
- [x] Create the `DXILOpMapping` of `int_dx_WaveActiveMin` to `119` in
`DXIL.td`
- [x] Create the `WaveActiveMin.ll` and `WaveActiveMin_errors.ll` tests
in `llvm/test/CodeGen/DirectX/`
- [x] Create the `int_spv_WaveActiveMin` intrinsic in
`IntrinsicsSPIRV.td`
- [x] In SPIRVInstructionSelector.cpp create the `WaveActiveMin`
lowering and map it to `int_spv_WaveActiveMin` in
`SPIRVInstructionSelector::selectIntrinsic`.
- [x] Create SPIR-V backend test case in
`llvm/test/CodeGen/SPIRV/hlsl-intrinsics/WaveActiveMin.ll`

But as some of the code has changed and was moved around (E.G.
`CGBuiltin.cpp` -> `CGHLSLBuiltins.cpp`) I mostly followed how
`WaveActiveMax()` is implemented.

I have not been able to run the tests myself as I am unsure which
project runs the correct test. Any guidance on how I can test myself
would be helpful.

Also added some tests to the offload-test-suite
https://github.com/llvm/offload-test-suite/pull/478
2025-10-28 11:01:13 -07:00
joaosaffran
f05bd9c2e0
[HLSL] Adding DXIL Storage type into TypedInfo (#164887)
In DXIL, some 64bit types are actually represented with their 32bit
counterpart. This was already being addressed in the codegen; however, the
metadata generation was lacking this information. This PR fixes this
issue.

Closes: [#146735](https://github.com/llvm/llvm-project/issues/146735)
2025-10-27 18:59:03 -04:00
Finn Plummer
0fd330dfe3
[NFC][DirectX] Refactor DXILPrepare/DXILTranslateMetadata (#164285)
This pr updates `DXILPrepare` and `DXILTranslateMetadata` by moving all
the removal of metadata from `DXILPrepare` to `DXILTranslateMetadata` to
have a more consistent definition of what each pass is doing.

It restricts the `DXILPrepare` to only update function attributes and
insert bitcasts, and moves the removal of metadata to
`DXILTranslateMetadata` so that all manipulation of metadata is done in
a single pass.
2025-10-24 13:55:03 -07:00
Justin Bogner
2b42c6c163
[DirectX] Use a well-formed cbuffer in the unused cbuffer test (#164844)
CBuffers still need a layout type for now. Fix the crash when looking up
the cbuffer info.
2025-10-23 17:19:17 +00:00
Justin Bogner
11a24d6b43
[HLSL] Allow completely unused cbuffers (#164557)
We were checking for cbuffers where the global was removed, but if the
buffer is completely unused the whole thing can be null.

---------

Co-authored-by: Helena Kotas <hekotas@microsoft.com>
2025-10-22 17:17:15 -07:00
Justin Bogner
e9c7966046
[DirectX] Fix crash when naming buffers of arrays (#164553)
DXILResource was falling over trying to name a resource type that
contained an array, such as `StructuredBuffer<float[3][2]>`. Handle this
by walking through array types to gather the dimensions.
2025-10-22 10:35:16 -07:00
Finn Plummer
bcf7267937
[DirectX] remove unrecognized 'llvm.errno.tbaa' named metadata for DXIL target (#164472)
This is a temporary measure to explicitly remove the unrecognized named
metadata when targeting DXIL.

This should be replaced by an allow-list, as tracked here:
https://github.com/llvm/llvm-project/issues/164473.
2025-10-21 12:30:13 -07:00
Justin Bogner
507373306e
[DirectX] Introduce dx.Padding type (#160957)
This introduces the `dx.Padding` type as an alternative to the
`dx.Layout` types that are currently used for cbuffers. Later, we'll
remove the `dx.Layout` types completely, but making the backend handle
either makes it easier to stage the necessary changes to get there.

See #147352 for details.
2025-10-16 12:31:54 -06:00
Helena Kotas
78d98161b9
[DirectX] Add llvm.dx.resource.getdimensions.x intrinsic and lowering to DXIL (#161753)
Introduces LLVM intrinsic `llvm.dx.resource.getdimensions.x` and its lowering to DXIL op `op.dx.getDimensions`.
The intrinsic will be used to implement `GetDimension` for buffers. The lowering uses an `undef` value, since this is required by the DXIL format, which is based on LLVM 3.7.

Proposal update: https://github.com/llvm/wg-hlsl/pull/350

Closes #112982
2025-10-15 17:54:15 -07:00
Justin Bogner
bf5f441731
[DirectX] Add 32- and 64-bit 3-element vectors to DataLayout (#160955)
This explicitly adds two 3-element vectors to the DataLayout so that
they'll be element-aligned. We need to do this more generally for
vectors, but this unblocks some very common cases.

Workaround for #123968
2025-10-15 13:33:05 -06:00
Justin Bogner
cfe6becdef
[DirectX] Make a test a bit more readable. NFC (#160747)
CHECK-lines ignore whitespace, so we can remove some here and make this
a bit easier to read.
2025-10-15 09:33:55 -06:00
Helena Kotas
b6b4262575
[DirectX] Fix DXIL container generating invalid PSV0 part for unbounded resources (#163287)
When calculating the upper bound for resource binding to be stored in the PSV0 part of the DXIL container, the compiler needs to take into account that the resource range could be _unbounded_, which is indicated by the binding size being `UINT32_MAX`.
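
The special case described above can be sketched in Python as follows. The function name `binding_upper_bound` is hypothetical and the code is an illustration of the arithmetic, not the actual implementation: without the check, `lower_bound + size - 1` would exceed 32 bits for an unbounded range.

```python
UINT32_MAX = 0xFFFFFFFF


def binding_upper_bound(lower_bound: int, size: int) -> int:
    """Upper register bound of a resource binding, inclusive."""
    if size == UINT32_MAX:
        # Unbounded range: the upper bound is the largest register index.
        return UINT32_MAX
    return lower_bound + size - 1
```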

Fixes [#159679](https://github.com/llvm/llvm-project/issues/159679)
2025-10-14 11:34:38 -07:00
joaosaffran
3a3b21461f
[DirectX] Making sure we always parse, validate and verify Flags (#162171)
This PR makes a few changes to make sure that Root Signature Flags are
always parsed, validated, and verified, including when you use a version
that doesn't support flags. The logic already existed; this PR just
makes sure it is always executed.

Closes: [#161579](https://github.com/llvm/llvm-project/issues/161579)

---------

Co-authored-by: joaosaffran <joao.saffran@microsoft.com>
2025-10-08 16:18:18 -04:00
joaosaffran
58ce3e20e5
[DirectX] Fix Flags validation to prevent casting into enum (#161587)
This PR changes the validation logic for Root Descriptor and Descriptor
Range flags to properly check if the `uint32_t` values are within range
before casting into the enums.
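
One way to picture the "check before cast" pattern is the Python sketch below, which rejects a raw 32-bit value that has bits outside the known set before converting it to an enum. `ExampleFlags`, its members, and `to_flags` are all hypothetical stand-ins; the real flag enums and the exact range check in the PR may differ.

```python
import enum
from typing import Optional


class ExampleFlags(enum.IntFlag):
    """Hypothetical flag set with a gap at bit 2 (0x4)."""
    NONE = 0
    A = 0x1
    B = 0x2
    C = 0x8


KNOWN_BITS = int(ExampleFlags.A | ExampleFlags.B | ExampleFlags.C)


def to_flags(raw: int) -> Optional[ExampleFlags]:
    """Convert a raw uint32 to ExampleFlags, or None if it is out of range."""
    if raw < 0 or raw > 0xFFFFFFFF:
        return None
    if raw & ~KNOWN_BITS:
        return None  # contains a bit that no known flag defines
    return ExampleFlags(raw)
```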
2025-10-06 17:21:28 -04:00
joaosaffran
56ca23c46d
[DirectX] Updating Root Signature Metadata to contain Static Sampler flags (#160210)
Root Signature 1.2 adds flags to static samplers. This requires us to
change the metadata representation to account for it when being
generated. This patch focuses on the metadata changes required in the
backend; frontend changes will come in a future PR.
2025-10-01 16:42:38 -04:00
joaosaffran
e28a559696
[DirectX] Validating Root flags are denying shader stage (#160919)
Root Signature Flags can block compilation of certain shader
stages. This PR implements a validation that notifies the user if they
compile with a root signature that denies the shader stage in use.
Closes: https://github.com/llvm/llvm-project/issues/153062
Previously approved: https://github.com/llvm/llvm-project/pull/153287

---------

Co-authored-by: joaosaffran <joao.saffran@microsoft.com>
Co-authored-by: Joao Saffran <{ID}+{username}@users.noreply.github.com>
Co-authored-by: Joao Saffran <jderezende@microsoft.com>
2025-09-26 13:43:58 -04:00
joaosaffran
7d2f6fd177
[DirectX] Updating DXContainer Yaml to represent Root Signature 1.2 (#159659)
This PR updates the YAML representation of DXContainer to support Root
Signature 1.2, this also requires updating the write logic to support
testing.
2025-09-26 12:04:19 -04:00
joaosaffran
c06f35422d
[DirectX] Adding missing descriptor table validations (#153276)
This patch adds two small validations to the DirectX backend. First, it checks
if registers in descriptor tables are not overflowing, meaning they
don't try to bind registers over the maximum allowed value, this is
checked both on the offset and on the number of descriptors inside the
range; second, it checks if samplers are being mixed with other resource
types.
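
The two checks can be sketched in Python as below. The function names, the string labels for range types, and the treatment of every range as bounded are assumptions made for illustration; the validation in the backend handles more cases (for example unbounded ranges and appended offsets).

```python
from typing import List

UINT32_MAX = 0xFFFFFFFF


def range_overflows(base_register: int, num_descriptors: int) -> bool:
    """True if a bounded range would bind a register above UINT32_MAX."""
    last_register = base_register + num_descriptors - 1
    return last_register > UINT32_MAX


def mixes_samplers(range_types: List[str]) -> bool:
    """True if a table contains samplers together with any other type."""
    has_sampler = any(t == "Sampler" for t in range_types)
    has_other = any(t != "Sampler" for t in range_types)
    return has_sampler and has_other
```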
Closes: #153057, #153058

---------

Co-authored-by: joaosaffran <joao.saffran@microsoft.com>
Co-authored-by: Joao Saffran <{ID}+{username}@users.noreply.github.com>
Co-authored-by: Joao Saffran <jderezende@microsoft.com>
2025-09-26 11:58:54 -04:00
Dan Brown
df420ee2ba
Implements isnan() HLSL intrinsic for DXIL and SPIR-V targets. (#157733)
Addresses #99132.
2025-09-25 12:34:47 -04:00