The module currently stores the target triple as a string. This means
that any code that wants to actually use the triple first has to
instantiate a Triple, which is somewhat expensive. The change in #121652
caused a moderate compile-time regression due to this. While it would be
easy enough to work around, I think that architecturally, it makes more
sense to store the parsed Triple in the module, so that it can always be
directly queried.
For this change, I've opted not to add any magic conversions between
std::string and Triple for backwards-compatibilty purses, and instead
write out needed Triple()s or str()s explicitly. This is because I think
a decent number of them should be changed to work on Triple as well, to
avoid unnecessary conversions back and forth.
The only interesting part in this patch is that the default triple is
Triple("") instead of Triple() to preserve existing behavior. The former
defaults to using the ELF object format instead of unknown object
format. We should fix that as well.
Fixes#128071
The current behavior lets intrinsics that don't map to a DXILOP slip
through. Nothing catches this until we hit the DXIL validator. This
change fails earlier so we don't encode invalid llvm intrinsics that can
slip through because of clang builtins like `__builtin_reduce_and`
example:
https://hlsl.godbolt.org/z/13rPj18vn
This removes "replaceFunctionWithNamedStructOp" and folds its
functionality into "replaceFunctionWithOp". It turns out we were
overcomplicating things and this is trivial to handle generically.
Fixes#113192
This introduces `@llvm.dx.resource.load.rawbuffer` and generalizes the
buffer load docs under DirectX/DXILResources.
This resolves the "load" parts of #106188
We'd introduced separate versions of `llvm.dx.resource.load` with a
struct return to handle the CheckAccessFullyMapped case without making
the IR for the common case unnecessarily complicated. However, at this
point the common case is really `resource.getpointer`, so the ergonomics
of a simplified version of `load` don't actually gain us as much as the
cost of having multiple opcodes.
Drop the `dx.resource.loadchecked` functions and have `dx.resource.load`
consistently return `{element_type, i1}`.
Move the DXILOpLoweringPass after DXILTranslateMetadata, and add asserts
in DXILShaderFlags to ensure it isn't scheduled after op lowering. This
will allow us to rely on DirectX intrinsics in the shader flags analysis
rather than having to recover information from lowered operations.
Fixes#120119.
This splits the DXILResourceAnalysis pass into TypeAnalysis and
BindingAnalysis passes. The type analysis pass is made immutable and
populated lazily so that it can be used earlier in the pipeline without
needing to carefully maintain the invariants of the binding analysis.
Fixes#118400
Instead of storing an auxilliary structure with the information from the
DXIL resource target extension types duplicated, access the information
that we can via the type itself.
This also means we need to handle some of the target extension types we
haven't fully defined yet, like Texture and CBuffer. For now we make an
educated guess to what those should look like based on llvm/wg-hlsl#76,
and we can update them fairly easily when we've defined them more
thoroughly.
First part of #118400
DXILOpLowering runs after scalarization but `@llvm.dx.typedbuffer.store`
takes a vector, so the argument is usually an artifact. Avoid creating a
vector just to extract elements from it immediately.
fixes#112974
partially fixes#70103
An earlier version of this change was reverted so some issues could be fixed.
### Changes
- Added new tablegen based way of lowering dx intrinsics to DXIL ops.
- Added int_dx_group_memory_barrier_with_group_sync intrinsic in
IntrinsicsDirectX.td
- Added expansion for int_dx_group_memory_barrier_with_group_sync in
DXILIntrinsicExpansion.cpp`
- Added DXIL backend test case
### Related PRs
* [[clang][HLSL] Add GroupMemoryBarrierWithGroupSync intrinsic
#111883](https://github.com/llvm/llvm-project/pull/111883)
* [[SPIRV] Add GroupMemoryBarrierWithGroupSync intrinsic
#111888](https://github.com/llvm/llvm-project/pull/111888)
Introduces `__builtin_hlsl_buffer_update_counter` clang buildin that is
used to implement the `IncrementCounter` and `DecrementCounter` methods
on `RWStructuredBuffer` and `RasterizerOrderedStructuredBuffer` (see
Note).
The builtin is translated to LLVM intrisic `llvm.dx.bufferUpdateCounter`
or `llvm.spv.bufferUpdateCounter`.
Introduces `BuiltinTypeMethodBuilder` helper in `HLSLExternalSemaSource`
that enables adding methods to builtin types using builder pattern like
this:
```
BuiltinTypeMethodBuilder(Sema, RecordBuilder, "MethodName", ReturnType)
.addParam("param_name", Type, InOutModifier)
.callBuiltin("buildin_name", { BuiltinParams })
.finalizeMethod();
```
Fixes#113513
[First version](llvm/llvm-project#114148) of this PR was reverted
because of build break.
In the DXIL CreateHandle and CreateHandleFromBinding ops, resource
bindings are
indexed from the beginning of the binding space, not from the binding
itself.
Translate from an index into the binding to one from the beginning of
the space
when lowering to these operations.
Introduces `__builtin_hlsl_buffer_update_counter` clang buildin that is
used to implement the `IncrementCounter` and `DecrementCounter` methods
on `RWStructuredBuffer` and `RasterizerOrderedStructuredBuffer` (see
Note).
The builtin is translated to LLVM intrisic `llvm.dx.bufferUpdateCounter`
or `llvm.spv.bufferUpdateCounter`.
Introduces `BuiltinTypeMethodBuilder` helper in `HLSLExternalSemaSource`
that enables adding methods to builtin types using builder pattern like
this:
```
BuiltinTypeMethodBuilder(Sema, RecordBuilder, "MethodName", ReturnType)
.addParam("param_name", Type, InOutModifier)
.callBuiltin("buildin_name", { BuiltinParams })
.finalizeMethod();
```
Fixes#113513
By giving these intrinsics their appropriate attributes, loads of
globals that are stored on the other side of these calls can be
eliminated by the EarlyCSE pass. Stores to the same globals and the
globals themselves require more direct intervention as part of the
create/annotated handle lowering.
Adds a test that verifies that the unneeded globals and their uses can
be eliminated and also that the attributes are set properly.
Fixes#104271
fixes#112974
partially fixes#70103
### Changes
- Added new tablegen based way of lowering dx intrinsics to DXIL ops.
- Added int_dx_group_memory_barrier_with_group_sync intrinsic in
IntrinsicsDirectX.td
- Added expansion for int_dx_group_memory_barrier_with_group_sync in
DXILIntrinsicExpansion.cpp`
- Added DXIL backend test case
### Related PRs
* [[clang][HLSL] Add GroupMemoryBarrierWithGroupSync intrinsic
#111883](https://github.com/llvm/llvm-project/pull/111883)
* [[SPIRV] Add GroupMemoryBarrierWithGroupSync intrinsic
#111888](https://github.com/llvm/llvm-project/pull/111888)
Restricts hlsl countbits to always return a uint32.
Implements a lowering from llvm.ctpop which has an overloaded return
type to dxil cbits op which always returns uint32.
Closes#112779
Rename the function to reflect its correct behavior and to be consistent
with `Module::getOrInsertFunction`. This is also in preparation of
adding a new `Intrinsic::getDeclaration` that will have behavior similar
to `Module::getFunction` (i.e, just lookup, no creation).
The `@llvm.dx.typedBufferLoad` intrinsic is lowered to `@dx.op.bufferLoad`.
There's some complexity here in translating to scalarized IR, which I've
abstracted out into a function that should be useful for samples, gathers, and
CBuffer loads.
I've also updated the DXILResources.rst docs to match what I'm doing here and
the proposal in llvm/wg-hlsl#59. I've removed the content about stores and raw
buffers for now with the expectation that it will be added along with the work.
Note that this change includes a bit of a hack in how it deals with
`getOverloadKind` for the `dx.ResRet` types - we need to adjust how we deal
with operation overloads to generate a table directly rather than proxy through
the OverloadKind enum, but that's left for a later change here.
Part of #91367
Pull Request: https://github.com/llvm/llvm-project/pull/104252
The `@llvm.dx.handle.fromBinding` intrinsic is lowered either to the
`CreateHandle` op or a pair of `CreateHandleFromBinding` and `AnnotateHandle`
ops, depending on the DXIL version. Regardless of the DXIL version we need to
emit metadata about the binding, but that's left to a separate change.
These DXIL ops all need to return the `%dx.types.Handle` type, but the llvm
intrinsic returns a target extension type. To facilitate changing the type of
the operation and all of its users, we introduce `%llvm.dx.cast.handle`, which
can cast between the two handle representations.
Pull Request: https://github.com/llvm/llvm-project/pull/104251
This wires up dxil-op-lower, dxil-intrinsic-expansion, dxil-translate-metadata,
and dxil-pretty-printer to the new pass manager, both as a matter of future
proofing the backend and so that they can be used more flexibly in tests.
A few arbitrary tests are updated in order to test the new PM path, and we drop
the "print-dxil-resource-md" pass since it's redundant with the pretty printer.
Pull Request: https://github.com/llvm/llvm-project/pull/104250
This introduces an anonymous class "OpLowerer" to help with lowering DXIL ops,
and moves the DXILOpBuilder there instead of creating a new one for every
operation. DXILOpBuilder is also changed to own its IRBuilder, since that makes
it simpler to ensure that it isn't misused.
Pull Request: https://github.com/llvm/llvm-project/pull/104248
This adjusts the DXILOpBuilder API in a couple of ways:
1. Remove the need to call `getOverloadTy` before creating Ops
2. Introduce `tryCreateOp` to parallel `createOp` but propagate errors
3. Introduce specialized createOp methods for each DXIL Op
This will simplify usage of the builder in upcoming changes, and also allows us
to propagate errors via DiagnosticInfo rather than using fatal errors.
Pull Request: https://github.com/llvm/llvm-project/pull/101250
Completes #83626
- `CGBuiltin.cpp` - modify `getDotProductIntrinsic` to be able to emit
`dot2`, `dot3`, and `dot4` intrinsics based on element count
- `IntrinsicsDirectX.td` - for floating point add `dot2`, `dot3`, and
`dot4` inntrinsics -`DXIL.td` add dxilop intrinsic lowering for `dot2`,
`dot3`, & `dot4`.
- `DXILOpLowering.cpp` - add vector arg flattening for dot product.
- `DXILOpBuilder.h` - modify `createDXILOpCall` to take a smallVector
instead of an iterator
- `DXILOpBuilder.cpp` - modify `createDXILOpCall` by moving the small
vector up to the calling function in `DXILOpLowering.cpp`.
- Moving one function up gives us access to the `CallInst` and
`Function` which were needed to distinguish the dot product intrinsics
and get the operands without using the iterator.
Return type of DXIL Ops may be different from valid overload type of the
parameters, if any. Such DXIL Ops are correctly represented in DXIL.td.
However, DXILEmitter assumes the return type to be the same as parameter
overload type, if one exists. This results in generation in incorrect
overload index value in DXILOperation.inc for the DXIL Op and incorrect
DXIL operation function call in DXILOpLowering pass.
This change distinguishes return types correctly from parameter overload
types in DXILEmitter backend to handle such DXIL ops.
Add specification for DXIL Op `isinf` and corresponding tests to verify
the above change.
Fixes issue #85125
This change implements lowering for #70076, #70100, #70072, & #70102
`CGBuiltin.cpp` - - simplify `lerp` intrinsic
`IntrinsicsDirectX.td` - simplify `lerp` intrinsic
`SemaChecking.cpp` - remove unnecessary check
`DXILIntrinsicExpansion.*` - add intrinsic to instruction expansion
cases
`DXILOpLowering.cpp` - make sure `DXILIntrinsicExpansion` happens first
`DirectX.h` - changes to support new pass
`DirectXTargetMachine.cpp` - changes to support new pass
Why `any`, and `lerp` as instruction expansion just for DXIL?
- SPIR-V there is an
[OpAny](https://registry.khronos.org/SPIR-V/specs/unified1/SPIRV.html#OpAny)
- SPIR-V has a GLSL lerp extension via
[Fmix](https://registry.khronos.org/SPIR-V/specs/1.0/GLSL.std.450.html#FMix)
Why `exp` instruction expansion?
- We have an `exp2` opcode and `exp` reuses that opcode. So instruction
expansion is a convenient way to do preprocessing.
- Further SPIR-V has a GLSL exp extension via
[Exp](https://registry.khronos.org/SPIR-V/specs/1.0/GLSL.std.450.html#Exp)
and
[Exp2](https://registry.khronos.org/SPIR-V/specs/1.0/GLSL.std.450.html#Exp2)
Why `rcp` as instruction expansion?
This one is a bit of the odd man out and might have to move to
`cgbuiltins` when we better understand SPIRV requirements. However I
included it because it seems like [fast math mode has an AllowRecip
flag](https://registry.khronos.org/SPIR-V/specs/unified1/SPIRV.html#_fp_fast_math_mode)
which lets you compute the reciprocal without performing the division.
We don't have that in DXIL so thought to include it.
* Leverage TableGen record descriptions of LLVM or DirectX intrinsics
that can be directly mapped in DXIL Ops TableGen description. As a
result, such DXIL Ops can be succinctly described without duplication.
DXILEmitter backend can derive the properties of DXIL Ops accordingly.
* Ensured that corresponding lit tests pass.
We have namespaces `DXIL` and `dxil`, which is just confusing. This
renames `DXIL` -> `dxil` making everything consistent.
While the LLVM coding standards don't have a clear direction here, I
chose lower case because by my current unscientific count there are
more places where we had the lowercase namespace than the uppercase.
A new helper class DXILOpBuilder is added to create DXIL op function calls.
TableGen backend for DXILOperation will create table for DXIL op function parameter types.
When create DXIL op function, these parameter types will used to create the function type.
Reviewed By: bogner
Differential Revision: https://reviews.llvm.org/D130291
Add DXIL operation for thread/group id operations.
ID Name Description
93 ThreadId reads the thread ID
94 GroupId reads the group ID (SV_GroupID)
95 ThreadIdInGroup reads the thread ID within the group (SV_GroupThreadID)
96 FlattenedThreadIdInGroup provides a flattened index for a given thread within a given group (SV_GroupIndex)
Also add llvm intrinsic which map to these intrinsics to DXIL operation.
Reviewed By: beanz
Differential Revision: https://reviews.llvm.org/D127990
Add more feature to tableGen backend gen-dxil-operation.
It will generate getOpCodeProperty, getOpCodeClassName and getOpCodeName when build DirectX target.
Each of these functions has a table which generate based on DXIL operations.
These generated functions will replace the manually written functions which used for query DXIL operation information.
Reviewed By: bogner
Differential Revision: https://reviews.llvm.org/D125520
Add more feature to tableGen backend gen-dxil-operation.
It will generate getOpCodeProperty, getOpCodeClassName and getOpCodeName when build DirectX target.
Each of these functions has a table which generate based on DXIL operations.
These generated functions will replace the manually written functions which used for query DXIL operation information.
Reviewed By: bogner
Differential Revision: https://reviews.llvm.org/D125520
A new tableGen backend gen-dxil-intrinsic-map is added to generate map from llvm intrinsic to DXIL operation.
A new file "DXILIntrinsicMap.inc" will be generated when build DirectX target which include the map.
The generated map will replace the manually created map when find DXIL operation from llvm intrinsic.
Reviewed By: bogner
Differential Revision: https://reviews.llvm.org/D125519
A new tableGen backend gen-dxil-enum is added to generate enum for DXIL operation and operation class.
A new file "DXILConstants.inc" will be generated when build DirectX target which include the enums.
More tableGen backends will be added to replace manually written table in DirectX backend.
The unused fields in dxil_inst will be used in future PR.
Reviewed By: bogner
Differential Revision: https://reviews.llvm.org/D125435