Fixes #139023.
This PR essentially removes unused global variables:
- Restores the `GlobalDCE` Legacy pass and adds it to the DirectX
backend after the finalize linkage pass
- Converts external global variables with no usage to internal linkage
in the finalize linkage pass
- (so they can be removed by `GlobalDCE`)
- Makes the `dxil-finalize-linkage` pass usable using the new pass
manager flag syntax
- Adds tests to `finalize_linkage.ll` that make sure unused global
variables are removed
- Adds a use for variable `@CBV` in `opaque-value_as_metadata.ll` so it
isn't removed
- Changes the `scalar-data.ll` run command to avoid removing its global
variables
---------
Co-authored-by: Farzon Lotfi <farzonlotfi@microsoft.com>
As part of the Root Signature Spec, we need to validate that Root
Signatures do not define overlapping ranges.
Closes: https://github.com/llvm/llvm-project/issues/126645
---------
Co-authored-by: joaosaffran <joao.saffran@microsoft.com>
Co-authored-by: Joao Saffran <{ID}+{username}@users.noreply.github.com>
Co-authored-by: Joao Saffran <jderezende@microsoft.com>
Fixes #151764
This fix has two parts. First, we track all lifetime intrinsics, and if
they are users of an alloca of a target extension type like dx.RawBuffer,
we eliminate those intrinsics when we visit the alloca.
Second, with those intrinsics gone, we can use the Dead Store Elimination
pass. This removes the alloca and simplifies the use of the target
extension type back to using just the global. That keeps things in the
form the DXILBitcodeWriter is expecting.
To pull this off, we needed to bring back the legacy pass manager
plumbing for the DSE pass and hook it up into the DirectX backend.
The net impact of this change is that DML shader pass rate went from
89.72% (4268 successful compilations) to 90.98% (4328 successful
compilations).
The resource binding analysis was incorrectly reducing the size of the
`Bindings` vector by one element after sorting and de-duplication. This
led to an inaccurate setting of the `HasOverlappingBinding` flag in the
`DXILResourceBindingInfo` analysis, as the truncated vector no longer
reflected the true binding state.
This update corrects the shrink logic and introduces an `assert` in the
`DXILPostOptimizationValidation` pass. The assertion will trigger if
`HasOverlappingBinding` is set but no corresponding error is detected,
helping catch future inconsistencies.
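For reference, the usual sort-and-de-duplicate idiom on a vector can be sketched as follows. This is a minimal illustration, not the code in the analysis: the `Binding` struct and `sortAndDeduplicate` helper are hypothetical stand-ins. The point is that the vector is shrunk to exactly the iterator returned by `std::unique`, with no further adjustment.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <tuple>
#include <vector>

// Hypothetical stand-in for a resource binding; not the LLVM type.
struct Binding {
  uint32_t Space;
  uint32_t LowerBound;
  uint32_t UpperBound;

  bool operator<(const Binding &O) const {
    return std::tie(Space, LowerBound, UpperBound) <
           std::tie(O.Space, O.LowerBound, O.UpperBound);
  }
  bool operator==(const Binding &O) const {
    return std::tie(Space, LowerBound, UpperBound) ==
           std::tie(O.Space, O.LowerBound, O.UpperBound);
  }
};

// Sorts the bindings and removes exact duplicates in place.
void sortAndDeduplicate(std::vector<Binding> &Bindings) {
  std::sort(Bindings.begin(), Bindings.end());
  // std::unique moves the unique elements to the front and returns the new
  // logical end; erasing from there keeps every unique element.
  auto NewEnd = std::unique(Bindings.begin(), Bindings.end());
  Bindings.erase(NewEnd, Bindings.end());
  assert(std::is_sorted(Bindings.begin(), Bindings.end()));
}
```

Erasing one element more than `std::unique` reports would silently drop the last distinct binding, which is the kind of truncation described above.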
The bug surfaced when the `srv_metadata.hlsl` and `uav_metadata.hlsl`
tests were updated to include unbounded resource arrays as part of
https://github.com/llvm/llvm-project/issues/145422. These updated test
files are included in this PR, as they would cause the new assertion to
fire if the original issue remained unresolved.
Depends on #152250
Fixes #152754
- Fixes the ArgOperand index in `DXILOpLowering.cpp` used to obtain the
pointer operand of a lifetime intrinsic.
- Updates the tests
`llvm/test/CodeGen/DirectX/legalize-lifetimes-valver-1.5.ll`,
`llvm/test/CodeGen/DirectX/legalize-lifetimes-valver-1.6.ll`,
`llvm/test/CodeGen/DirectX/ShaderFlags/lifetimes-noint64op.ll`, and
`llvm/test/tools/dxil-dis/lifetimes.ll` to use the new size-less
lifetime intrinsic
- Removes lifetime intrinsics from the test
`llvm/test/CodeGen/DirectX/legalize-memset.ll` to be consistent with the
corresponding memcpy test which does not have lifetime intrinsics.
(Removal of lifetime intrinsics from tests like this was suggested here
in the past:
https://github.com/llvm/llvm-project/pull/139173#discussion_r2091778868)
- Rewrites the lifetime legalization functions in the EmbedDXILPass to
re-add the explicit size argument for DXIL
The code that checks for overlapping binding did not compare register space when one of the bindings was for an unbounded resource array, leading to false errors. This change fixes it.
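A pairwise overlap check that always compares register space first might look like the sketch below. The `BindingRange` struct, the `Unbounded` sentinel, and the `overlaps` function are assumptions made for illustration; the real check also has to take other properties (such as the resource class) into account, which this sketch ignores.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical model of a binding range; not the LLVM data structure.
struct BindingRange {
  uint32_t Space;
  uint32_t LowerBound;
  uint32_t UpperBound; // inclusive; Unbounded marks an unbounded array
};

// Assumed sentinel for "unbounded": the largest representable register.
constexpr uint32_t Unbounded = std::numeric_limits<uint32_t>::max();

// Returns true if A and B share at least one register in the same space.
bool overlaps(const BindingRange &A, const BindingRange &B) {
  assert(A.LowerBound <= A.UpperBound && B.LowerBound <= B.UpperBound);
  // Ranges in different register spaces never overlap, even when one of
  // them is unbounded.
  if (A.Space != B.Space)
    return false;
  return A.LowerBound <= B.UpperBound && B.LowerBound <= A.UpperBound;
}
```

With this ordering, an unbounded array starting at register 3 in space 0 does not conflict with a binding at register 5 in space 1, but does conflict with a binding at register 5 in space 0.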
Fixes #140819
The SROA pass causes some globals to be loaded into stack allocations.
This means we now find an alloca where we used to expect a load, and we
need to walk an alloca -> store -> maybe load chain before we find the
global. Doing so fixes all but two instances of #137715 and fixes every
instance of `Load of "8.sroa.0" is not a global resource handle` that we
are currently seeing in the DML shaders.
This PR addresses
https://github.com/llvm/llvm-project/pull/144465#issuecomment-3063422828.
It uses `joinErrors` and `llvm::Error` instead of boolean values.
---------
Co-authored-by: joaosaffran <joao.saffran@microsoft.com>
Co-authored-by: Joao Saffran <{ID}+{username}@users.noreply.github.com>
Fixes #150482 by replacing all `Instruction.getType()` in
DXILShaderFlags.cpp with `Instruction.getType()->getScalarType()` to
account for vector types, as suggested by @bogner.
- Fixes #150050
- Addresses the issue of many nested GEPs
- Checks for a ConstantExpr GEP. If we see one, we check whether it needs
a global replacement with a new type, create the new ConstantExpr GEP, and
set the pointer operand to it. Finally, we clean up and remove the old
nested GEPs.
Fixes #147395
This PR:
- Excludes lifetime intrinsics from the Int64Ops shader flags analysis
to match DXC behavior and pass DXIL validation.
- Performs legalization of `llvm.lifetime.*` intrinsics in the
EmbedDXILPass just before invoking the DXILBitcodeWriter.
- After invoking the DXILBitcodeWriter, all lifetime intrinsics and
associated bitcasts are removed from the module to keep the Module
Verifier happy. This is fine since lifetime intrinsics are not needed by
any passes after the EmbedDXILPass.
Fixes #149345
Effectively no-op pairs of insertelement-extractelement instructions
were being created due to the ExtractValueInst visitor in the Scalarizer
storing its scalarized result into the Scattered map using an incorrect
key (specifically the type used in the key).
This PR fixes this issue.
Fixes #149179
The issue is that `Builder.CreateGEP` does not return a GEP Instruction
or GEP ConstantExpr when the pointer operand is a global variable and all
indices are constant zeroes.
This PR ensures that a GEP instruction is created if `Builder.CreateGEP`
did not return a GEP.
Fixes #149180
This PR removes an assertion that triggered on valid IR. It has been
replaced with an if statement that returns early if the conditions are
not correct.
This PR also adds GEPs to scalar loads and stores from/to global
variables.
Fixes #145924 and #140416
Depends on #146173 being merged first.
This PR moves the scalarizer pass to immediately before the
dxil-flatten-arrays pass to allow the dxil-flatten-arrays pass to turn
scalar GEPs (including i8 GEPs) into flattened array GEPs where
applicable.
A number of LLVM DirectX CodeGen tests have been edited to remove scalar
GEPs and also correct previously uncaught incorrectly-transformed GEPs.
After this PR, validation errors of the form `Access to out-of-bounds
memory is disallowed` or `TGSM pointers must originate from an
unambiguous TGSM global variable` no longer appear when compiling DML
shaders.
In tandem with #146800, this PR fixes #145370
This PR simplifies the logic for collapsing GEP chains and replacing
GEPs to multidimensional arrays with GEPs to flattened arrays. This
implementation avoids unnecessary recursion and more robustly computes
the index to the flattened array by using the GEPOperator's
collectOffset function, which has the side effect of allowing "i8 GEPs"
and other types of GEPs to be handled naturally in the flattening /
collapsing of GEP chains.
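The idea of deriving the flattened index from a byte offset can be illustrated with a small sketch. It only models a constant byte offset; the helper name `flatIndexFromByteOffset` and its signature are hypothetical and do not correspond to an LLVM API.

```cpp
#include <cassert>
#include <cstdint>

// Converts a constant byte offset into an index into a flattened array whose
// elements are ElemSize bytes wide. Returns false if the offset does not
// land on an element boundary. Hypothetical helper, not an LLVM function.
bool flatIndexFromByteOffset(uint64_t ByteOffset, uint64_t ElemSize,
                             uint64_t &FlatIndex) {
  assert(ElemSize != 0 && "element size must be non-zero");
  if (ByteOffset % ElemSize != 0)
    return false;
  FlatIndex = ByteOffset / ElemSize;
  return true;
}
```

For example, indexing `[1][2]` into a `[2 x [3 x i32]]` array is a byte offset of `(1 * 3 + 2) * 4 = 20`, which maps to flat index 5. An i8 GEP with offset 20 into the same array yields the same flat index, which is why working from byte offsets handles both forms uniformly.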
Furthermore, a handful of LLVM DirectX CodeGen tests have been edited to
fix incorrect GEP offsets, mismatched types (e.g., loading i32s from an
array of floats), and typos.
Fixes #147395
This PR legalizes lifetime intrinsics for DXIL by
- Adding a bitcast for the lifetime intrinsics' pointer operand in
dxil-prepare to ensure it gets cast to an `i8*` when written to DXIL
- Removing the memory attribute from lifetime intrinsics in dxil-prepare
to match DXIL
- Making the DXIL bitcode writer write the base/demangled name of
lifetime intrinsics to the symbol table
- Making lifetime intrinsics an exception to Int64Ops shader flag
analysis (otherwise we get `error: Flags must match usage.` from the
validator)
This PR resolves some discrepancies in verification during `validate` in
`DXILRootSignature.cpp`.
Note: we don't add a backend test for version 1.0 flag values because it
treats the struct as though there is no flags value. However, this will
be used when we use the verifications in the frontend.
- Updates `verifyDescriptorFlag` to check for valid flags based on
version, as reflected [here](https://github.com/llvm/wg-hlsl/pull/297)
- Adds a test to demonstrate the updated flag verifications
- Adds `verifyNumDescriptors` to the validation of `DescriptorRange`s
- Adds a test to demonstrate `numDescriptors` verification
- Updates a number of tests that mistakenly had an invalid
`numDescriptors` specified
Resolves: https://github.com/llvm/llvm-project/issues/147107
Fixes #147394
References DXC for the implementation logic:
d751c827ed/lib/HLSL/DxilPreparePasses.cpp (L693-L699)
If DXIL Version < 1.6 then replace lifetime intrinsics with stores
- For validator version >= 1.6, store an undef
- For validator version < 1.6, store zeros
else keep the lifetime intrinsics in the DXIL.
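The decision above can be written down as a small function. This is only a sketch of the rule as stated; the `LifetimeLowering` enum, the `Version` alias, and `chooseLifetimeLowering` are hypothetical names, not identifiers from the pass.

```cpp
#include <cassert>
#include <utility>

// What to do with a lifetime intrinsic when emitting DXIL.
enum class LifetimeLowering { KeepIntrinsic, StoreUndef, StoreZero };

// {major, minor}; std::pair compares lexicographically.
using Version = std::pair<unsigned, unsigned>;

// Hypothetical helper that encodes the rule described above.
LifetimeLowering chooseLifetimeLowering(Version Dxil, Version Validator) {
  assert(Dxil.first >= 1 && Validator.first >= 1 && "versions start at 1.0");
  const Version V1_6{1, 6};
  if (Dxil >= V1_6)
    return LifetimeLowering::KeepIntrinsic;
  // DXIL older than 1.6: replace the intrinsic with a store whose value
  // depends on the validator version.
  return Validator >= V1_6 ? LifetimeLowering::StoreUndef
                           : LifetimeLowering::StoreZero;
}
```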
After this PR, the number of DML shaders failing validation due to
#146974 is reduced from 157 to 50.
For SM6.2 and earlier, Raw Buffer Loads and Stores can't handle 64 bit
types. This PR expands Raw Buffer Loads and Stores for 64 bit types
double and int64_t. This adds to the work done in #139996 and #145047.
Raw Buffer Loads and Stores allow for 64 bit type vectors of size 3 and
4, and the code is modified to allow for that.
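Conceptually, expanding a 64 bit value means moving it through the buffer as two 32 bit halves and recombining them afterwards. The sketch below shows that idea on the host side only; the `Words` struct and the helper functions are hypothetical, and the word order (low half first) is an assumption of this example rather than something stated above.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Two 32-bit halves of a 64-bit value (low word first, by assumption).
struct Words {
  uint32_t Lo;
  uint32_t Hi;
};

Words splitU64(uint64_t V) {
  return {static_cast<uint32_t>(V), static_cast<uint32_t>(V >> 32)};
}

uint64_t joinU64(Words W) {
  return (static_cast<uint64_t>(W.Hi) << 32) | W.Lo;
}

// Doubles are split by reinterpreting their bit pattern as a 64-bit integer.
Words splitDouble(double D) {
  static_assert(sizeof(double) == sizeof(uint64_t), "expected 64-bit double");
  uint64_t Bits;
  std::memcpy(&Bits, &D, sizeof(Bits));
  return splitU64(Bits);
}

double joinDouble(Words W) {
  uint64_t Bits = joinU64(W);
  double D;
  std::memcpy(&D, &Bits, sizeof(D));
  return D;
}

// Number of 32-bit words needed for a vector of 64-bit elements.
unsigned wordsNeeded(unsigned NumElements) {
  assert(NumElements >= 1 && NumElements <= 4);
  return NumElements * 2;
}
```

Under this model, vectors of three or four 64 bit elements need six or eight 32 bit words, which is why they need handling beyond the one- and two-element cases.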
Closes #144747
Fixes #140420. The switch.table.* validation errors were caused by DXIL
not supporting private global variables. Converting them to internal
linkage fixes the bug.
May need more discussion on the preserved analyses/a follow-up PR that
fixes what this pass says it preserves.
Fixes #141840
This PR implements support for the `memcpy` intrinsic in the DXIL
CBuffer Access pass with the following restrictions:
- The CBuffer Access must be the `src` operand of `memcpy` and must be
direct (i.e., not a GEP)
- The type of the CBuffer Access must be of an Array Type
These restrictions greatly simplify the implementation of `memcpy` yet
still cover the known uses in DML shaders.
Furthermore, to prevent errors like #141840 from occurring silently
again, this PR adds error reporting for unsupported users of globals in
the DXIL CBuffer Access pass.
Implements static samplers parsing from root signature metadata
representation. This is required to support Root Signatures in HLSL.
Closes: #[126641](https://github.com/llvm/llvm-project/issues/126641)
---------
Co-authored-by: joaosaffran <joao.saffran@microsoft.com>
Fixes #140321
Specifically, it fixes `error: Cannot create BufferLoad operation:
Invalid overload type`
https://hlsl.godbolt.org/z/dTq4q7o58
but no new DML shaders are building. This change now exposes #144747.
The change does two things. It adds i64 support to the intrinsic
expansion of the `dx_resource_load_typedbuffer` and
`dx_resource_store_typedbuffer` intrinsics.
It also lets loaded typedbuffers fail more gracefully: `auto *EVI =
cast<ExtractValueInst>(U);` is now a `dyn_cast` with an
`llvm_unreachable`.
Fixes #145782
This change modifies `isArrayOfVectors` into `isVectorOrArrayOfVectors`.
The previous implementation did not support vector to array
transformations. Furthermore, it was too simplistic and did not account
for allocas that create multidimensional arrays.
Fixes #145408
By the time we see the GEP, we have already transformed the allocas that
get past the `isArrayOfVectors` check.
The bug occurs because we are trying to transform a GEP for a struct of
arrays when we should only be transforming arrays.
The problem with our `visitGetElementPtrInst` is that it was doing
transformations for all allocas when it should be limiting them to the
alloca-of-array cases. Ideally we would have made sure it was an
array-of-vectors case, but by the time we see the GEP the type has been
changed by replace-all-uses. There should not be a problem with looking
at all arrays, since DXILDataScalarization does not change any indices.
The `dx.rootsignatures` metadata is not recognized in DXIL, so failure
to remove this will cause validation errors.
This metadata is parsed (within `RootSignatureAnalysisWrapper`) into its
binary format. As such, once it has been used to construct the binary
form, it can be safely discarded without loss of information.
This PR ensures that the DXIL prepare pass depends on and preserves the
root signature analysis, so that the analysis runs before the metadata is
removed.
- Update `DXILPrepare.cpp` to preserve and depend on
`RootSignatureAnalysisWrapper`
- Update test to demonstrate order is correct
- Provide test-case to demonstrate the metadata is removed
Resolves https://github.com/llvm/llvm-project/issues/145437.
----------
Co-authored-by: Justin Bogner <mail@justinbogner.com>
Running the `vector-combine` pass on this test now produces a single
shuffle on a loaded `<1 x float>` instead of an insert into a `<2 x
float>` followed by a shuffle.
This test change matches changes in other tests in PR #144690, which
introduced the optimization.
In #144957 the backend was updated to expect a version in the metadata,
but since the frontend wasn't updated this breaks compilation. This is a
somewhat temporary fix to that until #144813 lands.
Fixes #144608
- There is a `getPointerOperandIndex` function, so we don't need to
iterate over the operands to find the pointer. This resulted in a small
cleanup to `visitStoreInst` and `visitLoadInst`.
- The meat of this change was in visitGetElementPtrInst to account for
allocas and not bail when we don't find a global.
Implements descriptor table parsing from root signature metadata. This
is required to support root signatures in HLSL.
Closes: #[126640](https://github.com/llvm/llvm-project/issues/126640)
---------
Co-authored-by: joaosaffran <joao.saffran@microsoft.com>
Fixes #141136
- Implement `visitExtractElementInst` and `visitInsertElementInst` in
`DXILDataScalarizerVisitor` to scalarize `extractelement` and
`insertelement` instructions whose index operand is not a `ConstantInt`
by converting the vector to an array and then loading from the array
- Rename the `replaceVectorWithArray` helper function to
`equivalentArrayTypeFromVector`, relocate the function toward the top of
the file, and remove the unused `Ctx` parameter
Updates the Root Signature metadata parser to extract version
information. This requirement was added after the initial parser
implementation.
---------
Co-authored-by: joaosaffran <joao.saffran@microsoft.com>
Implements
https://github.com/llvm/wg-hlsl/blob/main/proposals/0026-symbol-visibility.md.
The change is to stop using the `hlsl.export` attribute. Instead,
symbols with "program linkage" in HLSL will have export linkage with
default visibility, and symbols with "external linkage" in HLSL will
have export linkage with hidden visibility.
Fixes #142836
We added a function called `collectIndicesAndDimsFromGEP`, which builds
up the indices and dims for both the recursive case and the base case.
Strictly speaking, solving #142836 did not require adding it to the
recursive case. The recursive case exists for GEP chains, which usually
have two indices per GEP, i.e. a pointer index and an array index. Adding
`collectIndicesAndDimsFromGEP` to the recursive case means we can now do
some mixed-mode indexing: if we get a case with three indices instead of
the usual two, we can treat the last two indices as part of the
computation of the flat array index.
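Given matching lists of indices and dimensions, the flat array index is the usual row-major computation. The sketch below assumes the pointer index has already been dropped and that each index is paired with its array dimension; `flattenIndex` is a hypothetical helper, not the function in the pass.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Computes the row-major index into a flattened array. Indices[I] indexes
// into a dimension of size Dims[I], outermost dimension first.
uint64_t flattenIndex(const std::vector<uint64_t> &Indices,
                      const std::vector<uint64_t> &Dims) {
  assert(Indices.size() == Dims.size() && "one dimension per index");
  uint64_t Flat = 0;
  for (std::size_t I = 0; I < Indices.size(); ++I) {
    assert(Indices[I] < Dims[I] && "index out of bounds");
    // Scale what has been accumulated so far by this dimension, then add
    // the index within it.
    Flat = Flat * Dims[I] + Indices[I];
  }
  return Flat;
}
```

For a `[2 x [3 x [4 x i32]]]` array, indices `{1, 2, 3}` give `(1 * 3 + 2) * 4 + 3 = 23` in the flattened `[24 x i32]` array. The same loop works whether a single GEP contributes one array index or several.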
This change relands https://github.com/llvm/llvm-project/pull/142853.
It fixes the circular reference issue we were seeing in GEPs,
e.g. `%.flat = getelementptr inbounds [16 x i32], ptr %.flat, i32 0, i32
15`