MC Descriptor Range Representation currently depend on Object
structures. This PR removes that dependency and in order to facilitate
removing to_underlying usage in follow-up PRs.
This pr implements support for a root signature as a target, as specified
[here](https://github.com/llvm/wg-hlsl/blob/main/proposals/0029-root-signature-driver-options.md#target-root-signature-version).
This is implemented in the following steps:
1. Add `rootsignature` as a shader model environment type and define
`rootsig` as a `target_profile`. Only valid as versions 1.0 and 1.1
2. Updates `HLSLFrontendAction` to invoke a special handling of
constructing the `ASTContext` if we are considering an `hlsl` file and
with a `rootsignature` target
3. Defines the special handling to minimally instantiate the `Parser`
and `Sema` to insert the `RootSignatureDecl`
4. Updates `CGHLSLRuntime` to emit the constructed root signature decl
as part of `dx.rootsignatures` with a `null` entry function
5. Updates `DXILRootSignature` to handle emitting a root signature
without an entry function
6. Updates `ToolChains/HLSL` to invoke `only-section=RTS0` to strip any
other generated information
Resolves: https://github.com/llvm/llvm-project/issues/150286.
##### Implementation Considerations
Ideally we could invoke this as part of `clang-dxc` without the need of
a source file. However, the initialization of the `Parser` and `Lexer`
becomes quite complicated to handle this.
Technically, we could avoid generating any of the extra information that
is removed in step 6. However, it seems better to re-use the logic in
`llvm-objcopy` without any need for additional custom logic in
`DXILRootSignature`.
fixes#156068
- We needed to add a new sub arch to the target tripple so we can test
that emulation does not happen when targeting SM6.9
- The HLSL toolchain needed to be updated to handle the conversion of
strings to enums for the new sub arch.
- The emulation is done in DXILIntrinsicExpansion.cpp and needs to be
able to convert both llvm.is.fpclass and lvm.dx.isinf to the proper
emulation
- test updates in TargetParser/TripleTest.cpp, isinf.ll, is_fpclass.ll,
and DXCModeTest.cpp
fixes#152348
SimplifyCFG collapses raw buffer store from a if\else load into a
select.
This change prevents the TargetExtType dx.Rawbuffer from being replace
thus preserving the if\else blocks.
A further change was needed to eliminate the phi node before we process
Intrinsic::dx_resource_getpointer in DXILResourceAccess.cpp
DXC checks if registers are correctly bound to root signature
descriptors. This implements the same check.
closes: #[126645](https://github.com/llvm/llvm-project/issues/126645)
---------
Co-authored-by: joaosaffran <joao.saffran@microsoft.com>
Co-authored-by: Joao Saffran <jderezende@microsoft.com>
Removes uniformity bit from resource initialization intrinsics `llvm.{dx|spv}.resource.handlefrombinding` and `llvm.{dx|spv}.resource.handlefromimplicitbinding`. The flag currently always set to `false`. It should be derived from resource analysis and not provided by codegen.
Closes#135452
This pr fixes some inconsistencies in behaviour of how we handle
`StaticSamplersOffset` with respect to DXC and `RootParameterOffset`.
Namely:
1. Make codegen of `RTS0` always compute the `StaticSamplersOffset`
regardless if there are any `StaticSampler`s. This is to be consistent
and produce an identical `DXContainer` as DXC.
2. Make the `StaticSamplersOffset` and `RootParametersOffset` optional
parameters in the yaml description. This means it will be used when it
is specified (which was not necassarily the case before).
3. Enforce that the provided `StaticSamplersOffset` and
`RootParametersOffset` in a yaml description match the computed value.
For more context see:
https://github.com/llvm/llvm-project/issues/155299.
Description of existing test updates updates:
- `CodeGen/DirectX/ContainerData`: Updated to codegen computed values
(previously unspecified)
- `llvm-objcopy/DXContainer`: Updated to `yaml2obj` computed values
(previously unspecified)
- `ObjectYAML/DXContainer`: Updated to `yaml2obj` computed values
(previously incorrect)
- `ObjectYAML/DXContainerYAMLTest`: Updated to `yaml2obj` computed
values (previously incorrect)
See newly added tests for testing of optional parameter functionality
and `StaticSamplersOffset` computation.
Resolves: https://github.com/llvm/llvm-project/issues/155299
This patch is refactoring Root Parameter Header in DX Container backend
to remove the usage of `to_underlying`. This requires some changes:
first, MC Root Signature should not depend on Object/DXContainer.h;
Second, we need to assume data to be valid in scenarios where it was
originally not expected, this made some tests be removed.
Fixes#139023.
This PR essentially removes unused global variables:
- Restores the `GlobalDCE` Legacy pass and adds it to the DirectX
backend after the finalize linkage pass
- Converts external global variables with no usage to internal linkage
in the finalize linkage pass
- (so they can be removed by `GlobalDCE`)
- Makes the `dxil-finalize-linkage` pass usable using the new pass
manager flag syntax
- Adds tests to `finalize_linkage.ll` that make sure unused global
variables are removed
- Adds a use for variable `@CBV` in `opaque-value_as_metadata.ll` so it
isn't removed
- Changes the `scalar-data.ll` run command to avoid removing its global
variables
---------
Co-authored-by: Farzon Lotfi <farzonlotfi@microsoft.com>
As part of the Root Signature Spec, we need to validate if Root
Signatures are not defining overlapping ranges.
Closes: https://github.com/llvm/llvm-project/issues/126645
---------
Co-authored-by: joaosaffran <joao.saffran@microsoft.com>
Co-authored-by: Joao Saffran <{ID}+{username}@users.noreply.github.com>
Co-authored-by: Joao Saffran <jderezende@microsoft.com>
fixes#151764
This fix has two parts first we track all lifetime intrinsics and if
they are users of an alloca of a target extention like dx.RawBuffer then
we eliminate those memory intrinsics when we visit the alloca.
We do step one to allow us to use the Dead Store Elimination Pass. This
removes the alloca and simplifies the use of the target extention back
to using just the global. That keeps things in a form the
DXILBitcodeWriter is expecting.
Obviously to pull this off we needed to bring back the legacy pass
manager plumbing for the DSE pass and hook it up into the DirectX
backend.
The net impact of this change is that DML shader pass rate went from
89.72% (4268 successful compilations) to 90.98% (4328 successful
compilations).
The resource binding analysis was incorrectly reducing the size of the
`Bindings` vector by one element after sorting and de-duplication. This
led to an inaccurate setting of the `HasOverlappingBinding` flag in the
`DXILResourceBindingInfo` analysis, as the truncated vector no longer
reflected the true binding state.
This update corrects the shrink logic and introduces an `assert` in the
`DXILPostOptimizationValidation` pass. The assertion will trigger if
`HasOverlappingBinding` is set but no corresponding error is detected,
helping catch future inconsistencies.
The bug surfaced when the `srv_metadata.hlsl` and `uav_metadata.hlsl`
tests were updated to include unbounded resource arrays as part of
https://github.com/llvm/llvm-project/issues/145422. These updated test
files are included in this PR, as they would cause the new assertion to
fire if the original issue remained unresolved.
Depends on #152250
Fixes#152754
- Fixes the ArgOperand index in `DXILOpLowering.cpp` used to obtain the
pointer operand of a lifetime intrinsic.
- Updates the tests
`llvm/test/CodeGen/DirectX/legalize-lifetimes-valver-1.5.ll`,
`llvm/test/CodeGen/DirectX/legalize-lifetimes-valver-1.6.ll`,
`llvm/test/CodeGen/DirectX/ShaderFlags/lifetimes-noint64op.ll`, and
`llvm/test/tools/dxil-dis/lifetimes.ll` to use the new size-less
lifetime intrinsic
- Removes lifetime intrinsics from the test
`llvm/test/CodeGen/DirectX/legalize-memset.ll` to be consistent with the
corresponding memcpy test which does not have lifetime intrinsics.
(Removal of lifetime intrinsics from tests like this was suggested here
in the past:
https://github.com/llvm/llvm-project/pull/139173#discussion_r2091778868)
- Rewrites the lifetime legalization functions in the EmbedDXILPass to
re-add the explicit size argument for DXIL
The code that checks for overlapping binding did not compare register space when one of the bindings was for an unbounded resource array, leading to false errors. This change fixes it.
fixes#140819
SROA pass is making it so that some globals get loaded into stack
allocations. This means we find an alloca where we use to expect a load
and now need to walk an alloca -> store -> maybe load chain before we
find the global. Doing so fixes All but two instances of #137715 And
fixes every instance of `Load of "8.sroa.0" is not a global resource
handle we are currently seeing in the DML shaders.
This PR addresses
https://github.com/llvm/llvm-project/pull/144465#issuecomment-3063422828.
Using `joinErrors` and `llvm:Error` instead of boolean values.
---------
Co-authored-by: joaosaffran <joao.saffran@microsoft.com>
Co-authored-by: Joao Saffran <{ID}+{username}@users.noreply.github.com>
Fixes#150482 by replacing all `Instruction.getType()` in
DXILShaderFlags.cpp with `Instruction.getType()->getScalarType()` to
account for vectors types as suggested by @bogner
- Fixes#150050
- Address the issue of many nested geps
- Check for ConstantExpr GEP if we see it check if it needs a global
replacement with a new type. Create the new constExpr Gep and set the
pointer operand to it. Finally cleanup and remove the old nested geps.
Fixes#147395
This PR:
- Excludes lifetime intrinsics from the Int64Ops shader flags analysis
to match DXC behavior and pass DXIL validation.
- Performs legalization of `llvm.lifetime.*` intrinsics in the
EmbedDXILPass just before invoking the DXILBitcodeWriter.
- After invoking the DXILBitcodeWriter, all lifetime intrinsics and
associated bitcasts are removed from the module to keep the Module
Verifier happy. This is fine since lifetime intrinsics are not needed by
any passes after the EmbedDXILPass.
Fixes#149345
Effectively no-op pairs of insertelement-extractelement instructions
were being created due to the ExtractValueInst visitor in the Scalarizer
storing its scalarized result into the Scattered map using an incorrect
key (specifically the type used in the key).
This PR fixes this issue.
Fixes#149179
The issue is that `Builder.CreateGEP` does not return a GEP Instruction
or GEP ContantExpr when the pointer operand is a global variable and all
indices are constant zeroes.
This PR ensures that a GEP instruction is created if `Builder.CreateGEP`
did not return a GEP.
Fixes#149180
This PR removes an assertion that triggered on valid IR. It has been
replaced with an if statement that returns early if the conditions are
not correct.
This PR also adds GEPs to scalar loads and stores from/to global
variables.
Fixes#145924 and #140416
Depends on #146173 being merged first.
This PR moves the scalarizer pass to immediately before the
dxil-flatten-arrays pass to allow the dxil-flatten-arrays pass to turn
scalar GEPs (including i8 GEPs) into flattened array GEPs where
applicable.
A number of LLVM DirectX CodeGen tests have been edited to remove scalar
GEPs and also correct previously uncaught incorrectly-transformed GEPs.
No more validation errors of the form `Access to out-of-bounds memory is
disallowed` or `TGSM pointers must originate from an unambiguous TGSM
global variable` appear anymore after this PR when compiling DML
shaders.
In tandem with #146800, this PR fixes#145370
This PR simplifies the logic for collapsing GEP chains and replacing
GEPs to multidimensional arrays with GEPs to flattened arrays. This
implementation avoids unnecessary recursion and more robustly computes
the index to the flattened array by using the GEPOperator's
collectOffset function, which has the side effect of allowing "i8 GEPs"
and other types of GEPs to be handled naturally in the flattening /
collapsing of GEP chains.
Furthermore, a handful of LLVM DirectX CodeGen tests have been edited to
fix incorrect GEP offsets, mismatched types (e.g., loading i32s from a
an array of floats), and typos.
Fixes#147395
This PR legalizes lifetime intrinsics for DXIL by
- Adding a bitcast for the lifetime intrinsics' pointer operand in
dxil-prepare to ensure it gets cast to an `i8*` when written to DXIL
- Removing the memory attribute from lifetime intrinsics in dxil-prepare
to match DXIL
- Making the DXIL bitcode writer write the base/demangled name of
lifetime intrinsics to the symbol table
- Making lifetime intrinsics an exception to Int64Ops shader flag
analysis (otherwise we get `error: Flags must match usage.` from the
validator)
This pr resolves some discrepancies in verification during `validate` in
`DXILRootSignature.cpp`.
Note: we don't add a backend test for version 1.0 flag values because it
treats the struct as though there is no flags value. However, this will
be used when we use the verifications in the frontend.
- Updates `verifyDescriptorFlag` to check for valid flags based on
version, as reflected [here](https://github.com/llvm/wg-hlsl/pull/297)
- Add test to demonstrate updated flag verifications
- Adds `verifyNumDescriptors` to the validation of `DescriptorRange`s
- Add a test to demonstrate `numDescriptors` verification
- Updates a number of tests that mistakenly had an invalid
`numDescriptors` specified
Resolves: https://github.com/llvm/llvm-project/issues/147107
Fixes#147394
References DXC for the implementation logic:
d751c827ed/lib/HLSL/DxilPreparePasses.cpp (L693-L699)
If DXIL Version < 1.6 then replace lifetime intrinsics with stores
- For validator version >= 1.6, store an undef
- For validator version < 1.6, store zeros
else keep the lifetime intrinsics in the DXIL.
After this PR, the number of DML shaders failing validation due to
#146974 is reduced from 157 to 50.
For SM6.2 and earlier, Raw buffer Loads and Stores can't handle 64 bit
types. This PR expands Raw Buffer Loads and Stores for 64 bit types
double and int64_t. This Adds to the work done in #139996 and #145047 .
Raw Buffer Loads and Stores allow for 64 bit type vectors of size 3 and
4, and the code is modified to allow for that.
Closes#144747
Fixes#140420. The switch.table.* validation errors were caused by DXIL
not supporting private global variables. Converting them to internal
linkage fixes the bug.
May need more discussion on the preserved analyses/a follow-up PR that
fixes what this pass says it preserves.
Fixes#141840
This PR implements support for the `memcpy` intrinsic in the DXIL
CBuffer Access pass with the following restrictions:
- The CBuffer Access must be the `src` operand of `memcpy` and must be
direct (i.e., not a GEP)
- The type of the CBuffer Access must be of an Array Type
These restrictions greatly simplify the implementation of `memcpy` yet
still covers the known uses in DML shaders.
Furthermore, to prevent errors like #141840 from occurring silently
again, this PR adds error reporting for unsupported users of globals in
the DXIL CBuffer Access pass.
Implements static samplers parsing from root signature metadata
representation. This is required to support Root Signatures in HLSL.
Closes: #[126641](https://github.com/llvm/llvm-project/issues/126641)
---------
Co-authored-by: joaosaffran <joao.saffran@microsoft.com>
fixes#140321
Specifically it fixes ` error: Cannot create BufferLoad operation:
Invalid overload type`
https://hlsl.godbolt.org/z/dTq4q7o58
but no new DML shaders are building. This change now exposes #144747.
The change does two things it adds i64 support for intrinsic expansion
for the `dx_resource_load_typedbuffer`, and
`dx_resource_store_typedbuffer` intrinsics.
It also lets loaded typedbuffers crash more gracefully because of ` auto
*EVI = cast<ExtractValueInst>(U);` is now a `dyn_cast` and
`llvm_unreachable`.
fixes#145782
This change modifies `isArrayOfVectors` into `isVectorOrArrayOfVectors`.
The previous implementation did not support vector to array
transformations. Further it was too simplistic and didn't assume allocas
would create multidimensional arrays.
fixes#145408
Before we see the GEP we already have transformed Allocas that get
passed the `isArrayOfVectors`.
The bug is because we are trying to transform a gep for struct of arrays
when we should only be transforming arrays.
The problem with our `visitGetElementPtrInst` is that it was doing
transformations for all allocas when it should be limiting it to the
alloca of array cases. Technically we would have liked to make sure it
was an array of vectors cases but by the time we see the GEP the type
has been changed by replace all uses. There should not be a problem with
looking at all Arrays since DXILDataScalarization does not change any
indicies.
The `dx.rootsignatures` metadata is not recognized in DXIL, so failure
to remove this will cause validation errors.
This metadata is parsed (within `RootSignatureAnalysisWrapper`) into its
binary format. As such, once it has been used to construct the binary
form, it can be safely discarded without loss of information.
This pr ensures that the dxil prepare pass will depend and preserve on
the root signature analysis so that it runs before the metadata is
removed.
- Update `DXILPrepare.cpp` to preserve and depend on
`RootSignatureAnalysisWrapper`
- Update test to demonstrate order is correct
- Provide test-case to demonstrate the metadata is removed
Resolves https://github.com/llvm/llvm-project/issues/145437.
----------
Co-authored-by: Justin Bogner <mail@justinbogner.com>