This moves the responsibility for cleaning up dead intrinsics from
DXILFinalizeLinkage to DXILOpLowering, and moves DXILFinalizeLinkage
back to its pre-#136244 place in the pipeline. Doing this avoids issues
with DXIL passes running on obviously dead code, and makes it more clear
what DXILFinalizeLinkage is really doing.
This also helps with the story for #134260, as cleaning up dead
intrinsics doesn't make sense if this becomes a more generic pass.
Note that test/CodeGen/DirectX/remove-dead-intriniscs.ll already covers
most of the testing here. It'd be nice to have something that catches
the regression from changing the pass ordering but I couldn't come up
with anything that wouldn't be incredibly fragile.
Fixes #138180.
Fixes #136243
This change converts memset into a series of GEPs and stores. It is
intentionally limited to memsets of fixed size. It also converts the byte
stores to typed stores.
DXIL does not support i8, and this also reduces the total number of GEP
and store instructions.
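As a rough illustration of the byte-to-typed-store conversion described above, here is a minimal Python sketch. It models only the arithmetic, not the pass itself: `expand_memset` and its parameters are hypothetical names, and the sketch assumes the destination is an array of integer elements whose size divides the memset length.

```python
def expand_memset(byte_value: int, size_in_bytes: int, elem_bytes: int):
    """Return (element_index, element_value) pairs replacing a fixed-size memset.

    Hypothetical model: the destination is an array of integers that are
    `elem_bytes` wide, and every byte of every element is `byte_value`.
    """
    if not 0 <= byte_value <= 0xFF:
        raise ValueError("memset writes a single byte value")
    if size_in_bytes % elem_bytes != 0:
        raise ValueError("this sketch only handles whole elements")
    # Splat the byte across the element, e.g. 0x01 -> 0x01010101 for 4 bytes.
    elem_value = int.from_bytes(bytes([byte_value]) * elem_bytes, "little")
    # One GEP + store per element instead of one per byte.
    return [(i, elem_value) for i in range(size_in_bytes // elem_bytes)]


stores = expand_memset(0x01, 8, 4)
```

For an 8-byte memset over 4-byte elements this yields two typed stores where a byte-wise expansion would need eight.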
This change also moves DXILFinalizeLinkage to run after Legalization to
clean up any dead intrinsic definitions.
Fixes #137202
While investigating i8 allocas, I found that our i8 legalization was
missing some instructions: load, store, and select.
This change adds those three.
To handle i8 allocas correctly, though, we need to walk the uses and find
the casts.
After finding the casts, I chose the smallest cast as the type to
transform to. That lets me preserve the larger casts that come
later.
Fixes #112272
In addition to the implementation of the UAVsAtEveryStage shader flag
analysis, several unrelated tests have had the `dx.valver` module
metadata defined to avoid setting the UAVsAtEveryStage shader flag in
them.
Example:
```
!dx.valver = !{!0}
!0 = !{i32 1, i32 8}
```
---------
Co-authored-by: Justin Bogner <mail@justinbogner.com>
This PR revises the descriptions of DXIL module flags.
Descriptions such as `D3D11_1_SB_GLOBAL_FLAG_SKIP_OPTIMIZATION` are
referring to Global Flags in DXBC.
DXBC is not a supported backend target, so references to DXBC should not
be present.
There is also confusion with regard to the description of the
`LowPrecisionPresent` DXIL module flag, which currently reads
`D3D11_1_SB_GLOBAL_FLAG_ENABLE_MINIMUM_PRECISION` and implies the use of
minimum precision to handle 16-bit types.
However, this is not true, because both the flags `LowPrecisionPresent`
and `UseNativeLowPrecision` can simultaneously be set in the same DXIL
module, and minimum precision mode is mutually exclusive with native low
precision.
This PR revises the description of the `LowPrecisionPresent` flag to
accurately describe what it represents.
This PR introduces a Metadata Node Kind allowlist. The purpose is to
prevent newer Metadata Node Kinds from being used and inserted into the
emitted DXIL module. Only the metadata kinds that are accepted by the
DXIL Validator are on the allowlist. The GitHub DXC validator doesn't
support these newer Metadata Node Kinds, so we need to filter them out.
We introduce this restrictive allowlist into LLVM and strip all metadata
that isn't found in the list.
The accompanying test adds the `llvm.loop.mustprogress` metadata
node kind and checks that the allowlist filters it out, which shows that
the allowlist works.
The test also has two separate metadata kinds that are on the allowlist
and remain after the DXIL Prepare pass.
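The filtering idea can be sketched in a few lines of Python. This is only a model of "keep what is on the list, strip the rest": the contents of `ALLOWED_KINDS` are an illustrative assumption rather than the actual list in the PR, and `strip_disallowed` is a hypothetical name.

```python
# Illustrative assumption: the real allowlist is whatever the DXIL validator
# accepts; these two kinds only stand in for it.
ALLOWED_KINDS = frozenset({"dbg", "tbaa"})


def strip_disallowed(attached_metadata: dict) -> dict:
    """Keep only metadata whose kind name is on the allowlist."""
    return {
        kind: node
        for kind, node in attached_metadata.items()
        if kind in ALLOWED_KINDS
    }


# Metadata attached to one hypothetical instruction, keyed by kind name.
before = {"dbg": "!10", "tbaa": "!11", "some.newer.kind": "!12"}
after = strip_disallowed(before)
```

Using an allowlist rather than a denylist means that kinds added to LLVM in the future are stripped by default.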
This PR primarily fixes the version-checking logic of the shader flags
`ResMayNotAlias` and `Max64UAVs` to correctly match DXC's behavior.
Primary changes:
- The logic for determining the presence of UAVs for the
`ResMayNotAlias` shader flag checked against the DXIL Version when it
should have been checking against the DXIL Validator Version. (See DXC:
[DxilShaderFlags.cpp#L484](f19b5da541/lib/DXIL/DxilShaderFlags.cpp (L484)))
- The logic for counting UAVs for the `Max64UAVs` shader flag checked
against the DXIL Version when it should have been checking against the
DXIL Validator Version. (See DXC:
[DxilModule.cpp#L327](f19b5da541/lib/DXIL/DxilModule.cpp (L327)))
- Tests have been modified to test the corrected behaviors for these two
flags
Additional changes included for consistency:
- The logic for setting `UseNativeLowPrecision` now checks against
Shader Model version instead of DXIL version to be consistent with the
code comments from DXC
([DxilShaderFlags.h#L280](f19b5da541/include/dxc/DXIL/DxilShaderFlags.h (L280))).
- An additional test has been added to ensure that the module flag
"dx.nativelowprec" set to 0 does not apply the `UseNativeLowPrecision`
shader flag
- Related shader flag tests were renamed to be more consistent, and some
comments were edited for clarification
- Add obj2yaml tests for the `Max64UAVs` shader flag
We can end up with loads of single element vectors when we have scalar
values, because the vectorizer may introduce these to use ops like
shufflevector in some cases. Make sure we're maintaining the correct
type when translating these into resource load operations.
Fixes #136409.
This pass attempts to forward resource handle creation to accesses of
the handle global. This avoids dependence on optimizations like CSE and
GlobalOpt for correctness of DXIL.
Fixes #134574.
Fixes #136620
It was determined that the lifetime intrinsics generated by clang are
likely more correct than the ones in DXC, which explains the lifetimes
missing from the DXC side of the IR diffs.
As such, we legalize LLVM lifetime intrinsics by letting them all
pass through.
Fixes [#114553](https://github.com/llvm/llvm-project/issues/114553)
This implementation replicates the behavior of DXC in setting the
`m_b64UAVs` flag: the `Max64UAVs` DXIL module flag is set in the
presence of more than 8 UAVs in a DXIL module.
The behavior of how UAV (resource) arrays are counted differs based on
Shader Model version:
- If Shader Model < 6.6, then a UAV array counts as a single UAV
regardless of its range size
- If Shader Model >= 6.6, then a UAV array contributes its range size to
the total number of UAVs
I initially thought the complete implementation of this analysis may be
blocked by the resource arrays implementation, but it seems that it is
not the case, as the `@llvm.dx.resource.handle*` already includes a
range size argument.
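The counting rule above can be summarized with a small Python sketch. It only models the description in this message: `needs_max_64_uavs` is a hypothetical name, each UAV binding is represented by its range size, and unbounded ranges are not handled.

```python
def needs_max_64_uavs(uav_range_sizes, shader_model):
    """Return True when the `Max64UAVs` flag would be set.

    uav_range_sizes: one range size per UAV binding (1 for a non-array UAV).
    shader_model: (major, minor) tuple, e.g. (6, 6).
    """
    # From Shader Model 6.6 on, an array contributes its whole range size;
    # before that, every binding counts as a single UAV.
    counts_ranges = shader_model >= (6, 6)
    total = sum(size if counts_ranges else 1 for size in uav_range_sizes)
    return total > 8


# A single 16-element UAV array only trips the flag on Shader Model 6.6+.
flag_on_sm65 = needs_max_64_uavs([16], (6, 5))
flag_on_sm66 = needs_max_64_uavs([16], (6, 6))
```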
Fixes #135654
In #128613 we added safeguards to prevent the lowering of just any
intrinsic in the backend. We used `DiagnosticInfoUnsupported` to do
this.
What we found was that when using `opt` the diagnostic's print function
was called, but when using clang the diagnostic message was used directly.
Printing only the message in the clang case means we miss valuable
debugging information like function name and function type, when the
LLVMContext was only needed to call `getBestLocationFromDebugLoc`.
There are a few potential fixes:
1. Write a custom DiagnosticInfoUnsupported so we can change the Message
just for DirectX. Too heavy-handed, so rejected.
2. Add the function name to the Message in DirectX code. A very simple
one-line change. The downside is that when using opt you see the function
name twice, but it makes the clang-dxc bugs more actionable.
3. Change CodeGenAction.cpp to always use the print function and not the
message directly. The downside is that a bunch of inaccurate information
shows up in the message if you don't specify `-debug-info-kind=standalone`.
4. Add some bookkeeping to know which function called the intrinsic,
keeping a map of these so we can pass the calling function to
`DiagnosticInfoUnsupported` instead of the intrinsic. This would only be
useful if we had debug info so we could distinguish different uses of
the intrinsic by line/column number. We would also need to change from
iterating on every function to doing something like a LazyCallGraph,
which is a nonstarter.
5. Pick a different means of emitting a diagnostic error, because other
uses of `DiagnosticInfoUnsupported` error when we are in the body of a
function, not when we see one being used as in the intrinsic case.
This PR went with a combination of options 2 and 5. It is a small code
change that also only impacts the DirectX backend.
Fixes #135719
LLVM 3.7 did not have a freeze instruction.
Furthermore, this instruction is really only used as syntactic sugar
in LLVM's optimizer passes to keep them from aggressively optimizing
things that could be undef or poison, e.g. rewriting x*2 as x+x.
Most backends treat it as a no-op, so we do the same
by removing it and replacing its uses with its input.
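To make "removing it and replacing its uses with its input" concrete, here is a toy Python model. It is not LLVM code: instructions are plain `(result, opcode, operands)` tuples, `remove_freeze` is a hypothetical name, and the sketch assumes straight-line code where every value is defined before it is used.

```python
def remove_freeze(instructions):
    """Drop `freeze` instructions and forward their input to all users.

    instructions: list of (result, opcode, operands) in program order.
    """
    replacement = {}  # freeze result -> the value it froze
    lowered = []
    for result, opcode, operands in instructions:
        # Rewrite operands that referred to an already-removed freeze.
        operands = [replacement.get(op, op) for op in operands]
        if opcode == "freeze":
            replacement[result] = operands[0]
            continue
        lowered.append((result, opcode, operands))
    return lowered


program = [
    ("%f", "freeze", ["%x"]),
    ("%r", "add", ["%f", "%f"]),
]
lowered = remove_freeze(program)
```

Because operands are rewritten before a freeze is recorded, a freeze of a freeze also collapses to the original input.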
This PR adds support for Root Constants in MC, Object, obj2yaml, and
yaml2obj. It adds:
- new structures to the dxbc definition
- serialization and deserialization logic from dxcontainer to yaml
- tests validating against dxc
- support for multiple parts
Closes: https://github.com/llvm/llvm-project/issues/126633
---------
Co-authored-by: joaosaffran <joao.saffran@microsoft.com>
This introduces a pass that walks accesses to globals in cbuffers and
replaces them with accesses via the cbuffer handle itself. The logic to
interpret the cbuffer metadata is kept in `lib/Frontend/HLSL` so that it
can be reused by other consumers of that metadata.
Fixes #124630.
The `dx.dot2`, `dot3`, and `dot4` intrinsics exist purely to lower
`dx.fdot`, and they map exactly to the DXIL ops of the same name. Using
vectors for their arguments adds unnecessary complexity and causes us to
have vector operations that are not trivial to lower post-scalarizer.
Similarly, the `dx.dot2add` intrinsic is overly generic for something
that only needs to lower to a single `dot2AddHalf` DXIL op. Update its
signature to match the operation it lowers to.
Fixes #134569.
Fixes #135285
This change implements the `usub.sat` intrinsic to perform an unsigned
saturating subtraction on its two arguments.
The minimum value this operation is clamped to is 0.
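For reference, the semantics of an unsigned saturating subtraction can be written as a short Python sketch. It describes the arithmetic only and says nothing about which instructions the lowering emits; `usub_sat` and the `bits` parameter are names chosen for this example.

```python
def usub_sat(a: int, b: int, bits: int = 32) -> int:
    """Unsigned saturating subtraction on `bits`-wide values."""
    mask = (1 << bits) - 1
    a &= mask
    b &= mask
    # Instead of wrapping around on underflow, the result is clamped to 0.
    return a - b if a > b else 0


ordinary = usub_sat(5, 3)
clamped = usub_sat(3, 5)
```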
Fixes #112270
Completed ACs:
- `-res-may-alias` clang-dxc command-line option added
  - It inserts and sets a module metadata flag `dx.resmayalias` to 1
- Shader flag set appropriately:
  - The flag IS NOT set if DXIL Version <= 1.6 OR the command-line option
    `-res-may-alias` is specified
  - Otherwise the flag IS set when:
    - DXIL Version > 1.7 AND function uses UAVs, OR
    - DXIL Version <= 1.7 AND UAVs present globally
- Add tests
  - Tests for Shader Models 6.6, 6.7, and 6.8 corresponding to DXIL
    Versions 1.6, 1.7, and 1.8
  - Tests (`res-may-alias-0.ll`/`res-may-alias-1.ll`) for when the module
    metadata flag `dx.resmayalias` is set to 0 or 1 respectively
  - A frontend test (`res-may-alias.hlsl`) for testing that the
    command-line option `-res-may-alias` inserts `dx.resmayalias` module
    metadata correctly
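The shader flag conditions listed above can be restated as a small decision function. This Python sketch mirrors only the bullet points in this message; `res_may_not_alias` and its parameters are hypothetical names, and versions are modeled as `(major, minor)` tuples.

```python
def res_may_not_alias(dxil_version, res_may_alias_option,
                      function_uses_uavs, module_has_uavs):
    """Return True when the `ResMayNotAlias` shader flag would be set."""
    # Never set for DXIL <= 1.6, or when -res-may-alias was specified.
    if dxil_version <= (1, 6) or res_may_alias_option:
        return False
    # DXIL > 1.7: decided per function.
    if dxil_version > (1, 7):
        return function_uses_uavs
    # DXIL 1.7: decided by UAVs present anywhere in the module.
    return module_has_uavs


on_dxil_1_7 = res_may_not_alias((1, 7), False, False, True)
on_dxil_1_8 = res_may_not_alias((1, 8), False, False, True)
```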
Fixes #112267
Implement the shader flag analysis to set the UseNativeLowPrecision DXIL
module flag.
The flag is only able to be set when the command-line flag
`-enable-16bit-types` is passed to clang-dxc, or equivalently
`-fnative-half-type` is passed to clang.
When the command-line flag is passed, a module metadata flag called
"dx.nativelowprec" is set to 1.
The DXILShaderFlags shader flags analysis checks that the module
metadata flag "dx.nativelowprec" is set to 1 and the DXIL Version is 1.2
or greater before setting the UseNativeLowPrecision DXIL module flag.
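The check described above amounts to a conjunction of two conditions. A minimal Python sketch follows, assuming module flags are available as a plain dictionary; `use_native_low_precision` is a hypothetical name for illustration.

```python
def use_native_low_precision(module_flags: dict, dxil_version) -> bool:
    """Return True when the UseNativeLowPrecision flag would be set.

    module_flags: module metadata flags by name, e.g. {"dx.nativelowprec": 1}.
    dxil_version: (major, minor) tuple.
    """
    return module_flags.get("dx.nativelowprec", 0) == 1 and dxil_version >= (1, 2)


enabled = use_native_low_precision({"dx.nativelowprec": 1}, (1, 2))
```

A missing flag, or a flag set to 0, leaves the shader flag unset regardless of the DXIL version.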
Resolves #99221
Key points: For the SPIRV backend, it decomposes into a `dot` followed by
an `add`.
- [x] Implement dot2add clang builtin,
- [x] Link dot2add clang builtin with hlsl_intrinsics.h
- [x] Add sema checks for dot2add to CheckHLSLBuiltinFunctionCall in
SemaHLSL.cpp
- [x] Add codegen for dot2add to EmitHLSLBuiltinExpr in CGBuiltin.cpp
- [x] Add codegen tests to clang/test/CodeGenHLSL/builtins/dot2add.hlsl
- [x] Add sema tests to clang/test/SemaHLSL/BuiltIns/dot2add-errors.hlsl
- [x] Create the int_dx_dot2add intrinsic in IntrinsicsDirectX.td
- [x] Create the DXILOpMapping of int_dx_dot2add to 162 in DXIL.td
- [x] Create the dot2add.ll and dot2add_errors.ll tests in
llvm/test/CodeGen/DirectX/
Do cleanup in DXILFinalizeLinkage.cpp where intrinsic declarations are getting orphaned.
This change reduces "Unsupported intrinsic for DXIL lowering" errors
when compiling DML shaders from 12218 to 415 and improves our
compilation success rate from less than 1% to 44%.
DXC and the DXIL validator expect resources in a DX container to be
specifically ordered CBuffers, Samplers, SRVs, and then UAVs. Match this
behaviour so that we can pass the validator.
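The expected ordering can be sketched as a stable sort by resource class. This Python model is illustrative only: resources are `(class, name)` pairs with made-up names, and `order_resources` is a hypothetical helper.

```python
# Order expected by DXC and the DXIL validator, per the description above.
CLASS_ORDER = {"CBuffer": 0, "Sampler": 1, "SRV": 2, "UAV": 3}


def order_resources(resources):
    """Sort (resource_class, name) pairs into container order.

    sorted() is stable, so resources of the same class keep their
    relative order.
    """
    return sorted(resources, key=lambda res: CLASS_ORDER[res[0]])


ordered = order_resources(
    [("UAV", "Out"), ("SRV", "In"), ("CBuffer", "CB"), ("Sampler", "S")]
)
```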
Fixes #130232.
Removing `DXILResourceMDAnalysis` that gathers information about
resources for the `DXILTranslateMetadata` pass. It collects the info
based on obsolete resource metadata annotations that are going to be
removed soon.
Part 1/2 of #114126
Update resource metadata tests to generate metadata based on
`llvm.dx.resource.handlefrombinding` data collected in
`DXILResourceBindingAnalysis`.
- `UAVMetadata.ll` is updated, renamed to `uav_metadata.ll`, and placed
under `Metadata` directory in `llvm/test/CodeGen/DirectX`
- `srv_metadata.ll` is a new test for SRV resource metadata
- `cbuf.ll` and `legacy_cb_layout_{0|1}.ll` tests were merged into
`cbuffer_metadata.ll`
- `legacy_cb_layout_{2|3}.ll` tests were moved to `cbuffer.hlsl` in Clang
CodeGen because they were more layout tests than metadata tests
Related to [#114126](https://github.com/llvm/llvm-project/issues/114126)
Update the lowering of `llvm.dx.resource.store.typedbuffer` to match DXC
and repeat the first element in cases where we are storing fewer than 4
elements.
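A minimal Python sketch of the padding rule, as read from the description above: when fewer than 4 elements are stored, the remaining slots repeat the first element. `pad_store_values` is a hypothetical name, and the sketch models only the value list passed to the store, not the DXIL op itself.

```python
def pad_store_values(values):
    """Pad a 1- to 4-element store to 4 values by repeating the first one."""
    if not 1 <= len(values) <= 4:
        raise ValueError("a typed buffer store takes 1 to 4 elements")
    return list(values) + [values[0]] * (4 - len(values))


padded = pad_store_values(["%x", "%y"])
```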
Fixes #128110
Fixes #99205.
- Implements the HLSL intrinsic `AddUint64` used to perform unsigned
64-bit integer addition by using pairs of unsigned 32-bit integers
instead of native 64-bit types
- The LLVM intrinsic `uadd_with_overflow` is used in the implementation
of `AddUint64` in `CGBuiltin.cpp`
- The DXIL op `UAddc` was defined in `DXIL.td`, and a lowering of the
LLVM intrinsic `uadd_with_overflow` to the `UAddc` DXIL op was
implemented in `DXILOpLowering.cpp`
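To show how a 64-bit addition falls out of 32-bit pairs plus a carry, here is a Python sketch. It models the arithmetic only, not the code in `CGBuiltin.cpp`: the `(low, high)` ordering of each pair is an assumption made for this example, and `uadd_with_overflow32` merely imitates the value and overflow bit that the LLVM intrinsic returns.

```python
MASK32 = 0xFFFFFFFF


def uadd_with_overflow32(a: int, b: int):
    """Model of a 32-bit unsigned add returning (result, overflowed)."""
    total = (a & MASK32) + (b & MASK32)
    return total & MASK32, total > MASK32


def add_uint64(a, b):
    """Add two 64-bit values given as (low, high) pairs of 32-bit integers."""
    low, carry = uadd_with_overflow32(a[0], b[0])
    # The carry out of the low halves feeds into the high halves; overflow
    # out of the high half is discarded, as in a wrapping 64-bit add.
    high = (a[1] + b[1] + int(carry)) & MASK32
    return low, high


carried = add_uint64((0xFFFFFFFF, 0), (1, 0))
```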
Notes:
- `__builtin_addc` was not able to be used to implement `AddUint64` in
`hlsl_intrinsics.h` because its `CarryOut` argument is a pointer, and
pointers are not supported in HLSL
- A lowering of the LLVM intrinsic `uadd_with_overflow` to SPIR-V
[already
exists](https://github.com/llvm/llvm-project/blob/main/llvm/test/CodeGen/SPIRV/llvm-intrinsics/uadd.with.overflow.ll)
- When lowering the LLVM intrinsic `uadd_with_overflow` to the `UAddc`
DXIL op, the anonymous struct type `{ i32, i1 }` is replaced with a
named struct type `%dx.types.i32c`. This aspect of the implementation
may be changed when issue #113192 gets addressed
- Fixes issues mentioned in the comments on the original PR #125319
---------
Co-authored-by: Finn Plummer <50529406+inbelic@users.noreply.github.com>
Co-authored-by: Farzon Lotfi <farzonlotfi@microsoft.com>
Co-authored-by: Chris B <beanz@abolishcrlf.org>
Co-authored-by: Justin Bogner <mail@justinbogner.com>
Make sure we're able to print cbuffer comments in a way that's
compatible with DXC.
Fixes #128562
Note: This is a re-commit because I somehow managed to get a completely
empty commit the first time.
When some resource types were present, but not all of them, we were
ending up in a situation where we would fail to initialize the `FirstX`
variables and get incorrect iterators.
Fixes #128560.
Fixes #128071
The current behavior lets intrinsics that don't map to a DXIL op slip
through. Nothing catches this until we hit the DXIL validator. This
change fails earlier so we don't encode invalid LLVM intrinsics that can
slip through because of clang builtins like `__builtin_reduce_and`.
Example:
https://hlsl.godbolt.org/z/13rPj18vn
This PR adds a few more tests to validate some error scenarios of root
signature metadata representation.
Closes: https://github.com/llvm/llvm-project/issues/127280
---------
Co-authored-by: joaosaffran <joao.saffran@microsoft.com>
- Check each call instruction for a `WaveOp` intrinsic and set the
`WaveOps` flag if any call is one. This is done in
DXILShaderFlags.cpp
Resolves #114565
Adding support for Root Signature Flags Element extraction and writing
to DXContainer.
- Adding an analysis to deal with RootSignature metadata definition
- Adding validation for Flag
- writing RootSignature blob into DXIL
Closes: [126632](https://github.com/llvm/llvm-project/issues/126632)
---------
Co-authored-by: joaosaffran <joao.saffran@microsoft.com>
- Set the shader flag `DisableOptimizations` based on `optnone`
attribute of shader entry functions.
- Add DXIL Metadata Analysis pass as pre-requisite for Shader Flags pass
to obtain entry function information collected therein.
- Named module metadata `dx.disable_optimizations` is intended to
indicate disabling optimizations (`-O0`) via a command-line flag. However,
its intent is fulfilled by the `optnone` attribute of shader entry
functions, as implemented in a recent change, so it is no longer needed.
Delete the generation of the named metadata and the corresponding test
file `disable_opt.ll`.
- Add tests to verify correctness of setting shader flag.
Closes #112263
```
- add clang builtin to Builtins.td
- link builtin in hlsl_intrinsics
- add codegen for spirv intrinsic and two directx intrinsics to retain
signedness information of the operands in CGBuiltin.cpp
- add semantic analysis in SemaHLSL.cpp
- add lowering of spirv intrinsic to spirv backend in
SPIRVInstructionSelector.cpp
- add lowering of directx intrinsics to WaveActiveOp dxil op in
DXIL.td
- add test cases to illustrate passespendent pr merges.
```
Resolves #99170
- Redefines `DXILAttribute` to denote a function attribute, compatible
with how it was defined in DXC/LLVM 3.7
- Fix how `DXILAttribute` is emitted to be a struct of set attributes
instead of an "or" of the enums
- Implement the lowering of `DXILAttribute` to LLVM function attributes
in `DXILOpBuilder.cpp`. A custom mapping is defined.
- Audit all current ops to specify the correct attributes consistent
with DXC. This is done here to allow for testing.
- Update testcases in `llvm/test/CodeGen/DirectX` of all ops with
attributes to check that the attributes are set
- Update testcases of ops that previously had incorrectly set attributes
to check that no attributes are set
- Defines `DXILProperty` to denote the other type of attributes from DXC
used to query properties.
- Emit `DXILProperty` as a struct of set attributes.
- Updates `DXIL.td` to specify applicable `DXILProperty`s on ops
Note: `DXILProperty` was referred to as 'queryable attributes' in design
discussion. Changed to property to allow for better expression in
`DXIL.td`
Resolves #114461
Resolves #115912