- create a clang built-in in Builtins.td
- add semantic checking in SemaHLSL.cpp
- link the WaveReadLaneAt api in hlsl_intrinsics.h
- add lowering to spirv backend op GroupNonUniformShuffle
with Scope = 2 (Group) in SPIRVInstructionSelector.cpp
- add WaveReadLaneAt intrinsic to IntrinsicsDirectX.td and mapping
to DXIL.td
- add tests for HLSL intrinsic lowering to spirv intrinsic in
WaveReadLaneAt.hlsl
- add tests for sema checks in WaveReadLaneAt-errors.hlsl
- add spir-v backend tests in WaveReadLaneAt.ll
- add test to show scalar dxil lowering functionality
- note that this doesn't include support for the scalarizer to
handle WaveReadLaneAt will be added in a future pr
This is the first part #70104
- add degrees builtin
- link degrees api in hlsl_intrinsics.h
- add degrees intrinsic to IntrinsicsDirectX.td
- add degrees intrinsic to IntrinsicsSPIRV.td
- add lowering from clang builtin to dx/spv intrinsics in CGBuiltin.cpp
- add semantic checks to SemaHLSL.cpp
- add expansion of directx intrinsic to llvm fmul for DirectX in
DXILIntrinsicExpansion.cpp
- add mapping to spir-v intrinsic in SPIRVInstructionSelector.cpp
- add test coverage:
- degrees.hlsl -> check hlsl lowering to dx/spv degrees intrinsics
- degrees-errors.hlsl/half-float-only-errors -> check semantic warnings
- hlsl-intrinsics/degrees.ll -> check lowering of spir-v degrees
intrinsic to SPIR-V backend
- DirectX/degrees.ll -> check expansion and scalarization of directx
degrees intrinsic to fmul
Resolves#99104
Implement the intrinsic `llvm.spv.handle.fromBinding`, which returns the
handle for a global resource. This involves creating a global variable
that matches the return-type, set, and binding in the call, and
returning the handle to that resource.
This commit implements the scalar version. It does not handle arrays of
resources yet. It also does not handle storage buffers yet. We do not
have the type for the storage buffers designed yet.
Part of #81036
Reverts llvm/llvm-project#110800
`llvm\test\CodeGen\DirectX\radians.ll` is failing after this change.
@adam-yang please send a new PR with the issue resolved once you've had
time to investigate.
`selectExtInst` tries to add the intrinsic to the SPIRV GL extension
instruction.
`MO_IntrinsicID` is always the first operand when we come from
`selectIntrinsic`.
For all those cases `selectExtInst` needs to know to start at index 2.
There are llvm intrinsics which we do not expect to actually represent
code after lowering or which are not implemented yet but can be found in
a customer's LLVM IR input. We do not want translation to crash when
these llvm intrinsics are found, and this PR fixes the issue with
translation crash for some known cases, aligned with Khronos Translator.
This PR reworks implementation of OpSpecConstantOp with ptr-cast
operation (PtrCastToGeneric, GenericCastToPtr). Previous implementation
didn't take into account a lot of use cases, including multiple
inclusion of pointers, reference to a pointer from OpName, etc. A
reproducer is attached as a new test case.
This PR also fixes wrong type inference for IR patterns which generate
new virtual registers without SPIRV type. Previous implementation
assumed always that result has the same address space as a source that
is not the fact, and, for example, led to impossibility to emit a
ptr-cast operation in the reproducer, because wrong type inference
rendered source and destination with the same address space, eliminating
translation of G_ADDRSPACE_CAST.
Two main goals of this PR are:
* to support "Arithmetic with Overflow" intrinsics, including the
special case when those intrinsics are being generated by the
CodeGenPrepare pass during translations with optimization;
* to redirect intrinsics with aggregate return type to be lowered via
GlobalISel operations instead of SPIRV-specific unfolding/lowering (see
https://github.com/llvm/llvm-project/pull/95012).
There is a new test case
`llvm/test/CodeGen/SPIRV/passes/translate-aggregate-uaddo.ll` that
describes and checks the general logics of the translation.
This PR continues a series of PRs aimed to identify and fix flaws in
code emission, to improve pass rates for the mode with expensive checks
set on (see https://github.com/llvm/llvm-project/pull/101732,
https://github.com/llvm/llvm-project/pull/104104,
https://github.com/llvm/llvm-project/pull/106966), having in mind the
ultimate goal of proceeding towards the non-experimental status of
SPIR-V Backend.
The reproducers are:
1) consider `llc -O3 -mtriple=spirv64-unknown-unknown ...` with:
```
define spir_func i32 @foo(i32 %a, ptr addrspace(4) %p) {
entry:
br label %l1
l1:
%e = phi i32 [ %a, %entry ], [ %i, %body ]
%i = add nsw i32 %e, 1
%fl = icmp eq i32 %i, 0
br i1 %fl, label %exit, label %body
body:
store i8 42, ptr addrspace(4) %p
br label %l1
exit:
ret i32 %i
}
```
2) consider `llc -O0 -mtriple=spirv64-unknown-unknown ...` with:
```
define spir_func i32 @foo(i32 %a, ptr addrspace(4) %p) {
entry:
br label %l1
l1: ; preds = %body, %entry
%e = phi i32 [ %a, %entry ], [ %math, %body ]
%0 = call { i32, i1 } @llvm.uadd.with.overflow.i32(i32 %e, i32 1)
%math = extractvalue { i32, i1 } %0, 0
%ov = extractvalue { i32, i1 } %0, 1
br i1 %ov, label %exit, label %body
body: ; preds = %l1
store i8 42, ptr addrspace(4) %p, align 1
br label %l1
exit: ; preds = %l1
ret i32 %math
}
```
This change adds support for correctly lowering the `__scoped` Clang
builtins, and corresponding scoped LLVM instructions. These were
previously unconditionally lowered to Device scope, which is possibly incorrect.
Furthermore, the default / implicit scope is changed from Device (an
OpenCL assumption) to AllSvmDevices (aka System), since the SPIR-V BE is not
OpenCL specific / can ingest IR coming from other language front-ends. OpenCL
defaulting to Device scope is now reflected in the front-end handling of atomic
ops, which seems preferable.
This commit adds an initial SPIR-V structurizer.
It leverages the previously merged passes, and the convergence region
analysis to determine the correct merge and continue blocks for SPIR-V.
The first part does a branch cleanup (simplifying switches, and
legalizing them), then merge instructions are added to cycles,
convergent and later divergent blocks.
Then comes the important part: splitting critical edges, and making sure
the divergent construct boundaries don't cross.
- we split blocks with multiple headers into 2 blocks.
- we split blocks that are a merge blocks for 2 or more constructs:
SPIR-V spec disallow a merge block to be shared by 2
loop/switch/condition construct.
- we split merge & continue blocks: SPIR-V spec disallow a basic block
to be both a continue block, and a merge block.
- we remove superfluous headers: when a header doesn't bring more info
than the parent on the divergence state, it must be removed.
This PR leverages the merged SPIR-V simulator for testing, as long as
spirv-val. For now, most DXC structurization tests are passing. The
unsupported ones are either caused by unsupported features like switches
on boolean types, or switches in region exits, because the MergeExit
pass doesn't support those yet (there is a FIXME).
This PR is quite large, and the addition not trivial, so I tried to keep
it simple. E.G: as soon as the CFG changes, I recompute the dominator
trees and other structures instead of updating them.
---------
Signed-off-by: Nathan Gauër <brioche@google.com>
When running SPIR-V Backend with optimization levels higher than 0, we
observe GEP Operator's as a new factor, massively used to convey the
semantics of the original LLVM IR. Previously, an issue related to GEP
Operator was mentioned and fixed on the consumer side of toolchains
(see, for example, Khronos Trandslator Issue
https://github.com/KhronosGroup/SPIRV-LLVM-Translator/issues/2486 and PR
https://github.com/KhronosGroup/SPIRV-LLVM-Translator/pull/2487).
However, there is a case when GenCode creates G_PTR_ADD to convey the
original semantics under optimization levels higher than 0 where it's
SPIR-V Backend that fails to translate source LLVM IR correctly.
Consider the following reproducer:
```
%struct = type { i32, [257 x i8], [257 x i8], [129 x i8], i32, i64, i64, i64, i64, i64, i64 }
@Mem = linkonce_odr dso_local addrspace(1) global %struct zeroinitializer, align 8
define weak dso_local spir_func void @__devicelib_assert_fail(ptr addrspace(4) noundef %expr, i32 noundef %line, i1 %fl) {
entry:
%cmp = icmp eq i32 %line, 0
br i1 %cmp, label %lbl, label %exit
lbl:
store i32 %line, ptr addrspace(1) getelementptr inbounds (i8, ptr addrspace(1) @Mem, i64 648), align 8
br i1 %fl, label %lbl, label %exit
exit:
ret void
}
```
converted to the following machine instructions by SPIR-V Backend:
```
%4:type(s64) = OpTypeInt 32, 0
%22:type(s64) = OpTypePointer 5, %4:type(s64)
%2:type(s64) = OpTypeInt 8, 0
%28:type(s64) = OpTypePointer 5, %2:type(s64)
%10:pid(p1) = G_GLOBAL_VALUE @Mem
%36:type(s64) = OpTypeStruct %4:type(s64), %32:type(s64), %32:type(s64), %34:type(s64), %4:type(s64), %35:type(s64), %35:type(s64), %35:type(s64), %35:type(s64), %35:type(s64), %35:type(s64)
%37:iid(s32) = G_INTRINSIC_W_SIDE_EFFECTS intrinsic(@llvm.spv.const.composite)
%8:iid(s32) = ASSIGN_TYPE %37:iid(s32), %36:type(s64)
G_INTRINSIC_W_SIDE_EFFECTS intrinsic(@llvm.spv.init.global), %10:pid(p1), %8:iid(s32)
%29:pid(p1) = nuw G_PTR_ADD %10:pid, %16:iid(s64)
%15:pid(p1) = nuw ASSIGN_TYPE %29:pid(p1), %28:type(s64)
%27:pid(p2) = G_BITCAST %15:pid(p1)
%17:pid(p2) = ASSIGN_TYPE %27:pid(p2), %22:type(s64)
G_STORE %1:iid(s32), %17:pid(p2) :: (store (s32) into %ir.3, align 8, addrspace 1)
```
On the next stage of instruction selection this `G_PTR_ADD`-related
pattern would be interpreted as an initialization of a global variable
and converted to an invalid constant GEP pattern that, in its turn,
would fail to be verified by LLVM during back translation from SPIR-V to
LLVM IR.
This PR introduces a fix for the problem by adding one more case of
`G_PTR_ADD` translation, when we use a non-const GEP to convey the
meaning. The reproducer is attached as a new test case.
This commits add the WaveIsFirstLane() hlsl intrinsinc. This intrinsic
uses the convergence intrinsincs for the SPIR-V backend. On the DXIL
side, I'm not sure what the strategy is for convergence, so I
implemented that like in DXC: a normal builtin function.
Signed-off-by: Nathan Gauër <brioche@google.com>
This adds the SPIRV fdot, sdot, and udot intrinsics and allows them to
be created at codegen depending on the target architecture. This
required moving some of the DXIL-specific choices to DXIL instruction
expansion out of codegen and providing it with at a more generic fdot
intrinsic as well.
Removed some stale comments that gave the obsolete impression that type
conversions should be expected to match overloads.
The SPIRV intrinsic handling involves generating multiply and add
operations for integers and the existing OpDot operation for floating
point.
New tests for generating SPIRV float and integer dot intrinsics are
added as well as expanding HLSL tests to include SPIRV generation
Used new dot product intrinsic generation to implement normalize() in SPIRV
Incidentally changed existing dot intrinsic definitions to use
DefaultAttrsIntrinsic to match the newly added inrinsics
Fixes#88056
This PR continues https://github.com/llvm/llvm-project/pull/101732
changes in virtual register processing aimed to improve correctness of
emitted MIR between passes from the perspective of MachineVerifier.
Namely, the following changes are introduced:
* register classes (lib/Target/SPIRV/SPIRVRegisterInfo.td) and
instruction patterns (lib/Target/SPIRV/SPIRVInstrInfo.td) are corrected
and simplified (by removing unnecessary sophisticated options) -- e.g.,
this PR gets rid of duplicating 32/64 bits patterns, removes ANYID
register class and simplifies definition of the rest of register
classes,
* hardcoded LLT scalar types in passes before instruction selection are
corrected -- the goal is to have correct bit width before instruction
selection, and use 64 bits registers for pattern matching in the
instruction selection pass; 32-bit registers remain where they are
described in such terms by SPIR-V specification (like, for example,
creation of virtual registers for scope/mem semantics operands),
* rework virtual register type/class assignment for calls/builtins
lowering,
* a series of minor changes to fix validity of emitted code between
passes:
- ensure that that bitcast changes the type,
- fix the pattern for instruction selection for OpExtInst,
- simplify inline asm operands usage,
- account for arbitrary integer sizes / update legalizer rules;
* add '-verify-machineinstrs' to existed test cases.
See also https://github.com/llvm/llvm-project/issues/88129 that this PR
may resolve.
This PR fixes a great number of issues reported by MachineVerifier and,
as a result, reduces a number of failed test cases for the mode with
expensive checks set on from ~200 to ~57.
Implement support for HLSL intrinsic saturate.
Implement DXIL codegen for the intrinsic saturate by lowering it to DXIL
Op dx.saturate.
Implement SPIRV codegen by transforming saturate(x) to clamp(x, 0.0f,
1.0f).
Add tests for DXIL and SPIRV CodeGen.
This PR addresses a TODO in
lib/Target/SPIRV/SPIRVInstructionSelector.cpp by adding implementation
of the non-const G_BUILD_VECTOR, and fix emission of the
OpGroupBroadcast instruction for the case when the `..._group_broadcast`
builtin has more than one `local_id` argument and `OpGroupBroadcast`
requires a newly constructed vector with 2 or 3 components instead of
originally passed series of `local_id` arguments.
This PR may resolve https://github.com/llvm/llvm-project/issues/97310 if
the reason for the reported fail is an incorrectly generated
OpGroupBroadcast instruction that was definitely a case.
Existing test is hardened and a new test is added to cover this special
case of the OpGroupBroadcast instruction emission.
This PR finishes #99134 by lowering the length function to the SPIR-V
backend. A test was added to verify that the generated SPIR-V is
correct.
Fixes#99134
This PR contains changes in virtual register processing aimed to improve
correctness of emitted MIR between passes from the perspective of
MachineVerifier. This potentially helps to detect previously missed
flaws in code emission and harden the test suite. As a measure of
correctness and usefulness of this PR we may use a mode with expensive
checks set on, and MachineVerifier reports problems in the test suite.
In order to satisfy Machine Verifier requirements to MIR correctness not
only a rework of usage of virtual registers' types and classes is
required, but also corrections into pre-legalizer and instruction
selection logics. Namely, the following changes are introduced:
* scalar virtual registers have proper bit width,
* detect register class by SPIR-V type,
* add a superclass for id virtual register classes,
* fix Tablegen rules used for instruction selection,
* fixes of minor existed issues (missed flag for proper representation
of a null constant for OpenCL vs. HLSL, wrong usage of integer virtual
registers as a synonym of any non-type virtual register).
The MachineModuleInfo reference was removed from the MachineFunction in
#100357, which broke this code. Instead of finding another way to get at
the MMI, just avoid using it - we can get at the LLVMContext from the
MachineFunction itself.
This PR improves type inference of operand presented by opaque pointers
and aggregate types:
* tries to restore original function return type for aggregate types so
that it's possible to deduce a correct type during emit-intrinsics step
(see llvm/test/CodeGen/SPIRV/SpecConstants/restore-spec-type.ll for the
reproducer of the previously existed issue when spirv-val found a
mismatch between object and ptr types in OpStore due to the incorrect
aggregate types tracing),
* explores untyped pointer operands of store to deduce correct pointee
types,
* creates an extension type to track pointee types from emit-intrinsics
step and further instead of direct and naive usage of TypePointerType
that led previously to crashes due to ban of creation of Value of
TypePointerType type,
* tracks instructions with uncomplete type information and tries to
improve their type info after pass calculated types for all machine
functions (it doesn't traverse a code but rather checks only those
instructions which were tracked as uncompleted),
* address more cases of removing unnecessary bitcasts (see, for example,
changes in test/CodeGen/SPIRV/transcoding/OpGenericCastToPtr.ll where
`CHECK-SPIRV-NEXT` in LIT checks show absence of unneeded bitcasts and
unmangled/mangled versions have proper typing now with equivalent type
info),
* address more cases of well known types or relations between types
within instructions (see, for example, atomic*.ll test cases and
Event-related test cases for improved SPIR-V code generated by the
Backend),
* fix the issue of removing unneeded ptrcast instructions in
pre-legalizer pass that led to creation of new assign-type instructions
with the same argument as source in ptrcast and caused errors in type
inference (the reproducer `complex.ll` test case is added to the PR).
This PR fixes the issue
https://github.com/llvm/llvm-project/issues/96614 by improve pattern
matching and tracking of constant integers. The attached test is
successful if it doesn't crash and generate valid SPIR-V code for both
32 and 64 bits targets.
This PR fixes https://github.com/llvm/llvm-project/issues/96513.
The way of creation of array type constant was incorrect: instead of
creating [1, 1, 1] or [1, 1, 1, 1, 1, ....] constants, the same [1]
constant was always created, substituting original composite constants.
This in its turn led to a situation when only one of constants might
exist in the code without emitting invalid code, the second constant
would be eventually rewritten to the first constant, because a key to
address both was an array of a single element (like [1]).
This PR fixes the issue and purges from the code unneeded copy/pasted
clone of the function that creates an array constant.
This PR continues https://github.com/llvm/llvm-project/pull/94467 and
contains fixes in emission of type intrinsics, constant recording and
corresponding test cases:
* type-deduce-global-dup.ll -- fix of integer constant emission on
32-bit platforms and correct type deduction for globals
* type-deduce-simple-for.ll -- fix of GEP translation (there was an
issue previously that led to incorrect translation/broken logic of
for-range implementation)
This PR also:
* fixes a cast between identical storage classes and updates the test
case to include validation run by spirv-val,
* ensures that Bitcast for pointers satisfies the requirement that the
address spaces must match and adds the corresponding test case,
* improve encode in Tablegen and decode in code of dependencies between
SPIR-V entities and required capability/extensions,
* prevent emission of identical OpTypePointer instructions.
This change is an implementation of #87367's investigation on supporting
IEEE math operations as intrinsics.
Which was discussed in this RFC:
https://discourse.llvm.org/t/rfc-all-the-math-intrinsics/78294
If you want an overarching view of how this will all connect see:
https://github.com/llvm/llvm-project/pull/90088
Changes:
- `llvm/docs/GlobalISel/GenericOpcode.rst` - Document the `G_FTAN`
opcode
- `llvm/include/llvm/IR/Intrinsics.td` - Create the tan intrinsic
- `llvm/include/llvm/Support/TargetOpcodes.def` - Create a `G_FTAN`
Opcode handler
- `llvm/include/llvm/Target/GenericOpcodes.td` - Define the `G_FTAN`
Opcode
- `llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp` Map the tan intrinsic
to `G_FTAN` Opcode
- `llvm/lib/Target/SPIRV/SPIRVInstructionSelector.cpp` - Map the
`G_FTAN` opcode to the GLSL 4.5 and openCL tan instructions.
- `llvm/lib/Target/SPIRV/SPIRVLegalizerInfo.cpp` - Define `G_FTAN` as a
legal spirv target opcode.
Translating global values, IRTranslator pass can sometimes generates
code patterns that require additional efforts during pre-legalization.
This PR addresses this problem to support G_PTRTOINT instruction used in
initialization of GV.
This PR fixes the issue
https://github.com/llvm/llvm-project/issues/88908
Attached test case is updated to check that OpSConvert/OpUConvert is not
generated when input and result types are identical.
This PR resolves the issue that SPIR-V Backend uses the notion of a
pointer size of the target, most notably, in legalizer code, but
Tablegen instruction selection in SPIR-V Backend doesn't account for a
pointer size of the target. See
https://github.com/llvm/llvm-project/issues/88723 for a detailed
description. There are 3 test cases attached to the PR that reproduced
the issue, when dealing with spirv32-spirv64 differences, and are
working correctly now with this PR.
- `CGBuiltin.cpp` - Switch to using
`CGM.getHLSLRuntime().get##NAME##Intrinsic()`
- `CGHLSLRuntime.h` - Add any to backend intrinsic abstraction
- `IntrinsicsSPIRV.td` - Add any intrinsic to SPIR-V.
- `SPIRVInstructionSelector.cpp` - Add means of selecting any intrinsic.
Any and All share the same behavior up to the opCode. They are only
different in vector cases.
Completes #88045
The main point of this change was to add support for HLSL's all
intrinsic.
In the process of doing that I found a few issues around creating an
`OpConstantComposite` via `buildZerosVal`.
First the current code didn't support floats so the process of adding
`buildZerosValF` meant I needed a
float version of `getOrCreateIntConstVector`. After doing so I renamed
both versions to `getOrCreateConstVector`. That meant I needed to create
a float type version of `getOrCreateIntCompositeOrNull`. Luckily the
type information was low for this function so was able to split it out
into a helpwe and rename `getOrCreateIntCompositeOrNull` to
`getOrCreateCompositeOrNull` With the exception of type handling
differences of the code and Null vs 0 Constant Op codes these functions
should be identical.
To handle scalar floats I could not use `buildConstantFP` like this PR
did:
https://github.com/llvm/llvm-project/commit/0a2aaab5aba46#diff-733a189c5a8c3211f3a04fd6e719952a3fa231eadd8a7f11e6ecf1e584d57411R1603
because that would create too many superfluous registers (that causes
problems in the validator), I had to create a float version of
`getOrCreateConstInt` which I called `getOrCreateConstFP`.
similar problems with doing it like this:
https://github.com/llvm/llvm-project/blob/main/llvm/lib/Target/SPIRV/SPIRVBuiltins.cpp#L1540.
`buildZerosValF` also has a use of a function `getZeroFP`. This is
because half, float, and double scalar values of 0 would collide in
`SPIRVDuplicatesTracker<Constant> CT` if you use `APFloat(0.0f)`.
`getORCreateConstFP` needed its own version of `getOrCreateConstIntReg`
which I called `getOrCreateConstFloatReg` The one difference in this
function is `getOrCreateConstFloatReg` returns a bit width so we don't
have to call `getScalarOrVectorBitWidth` twice ie when it is used again
in `getOrCreateConstFP` for `OpConstantF` `addNumImm`.
`getOrCreateConstFloatReg` needed an `assignFloatTypeToVReg` helper
which called a `getOrCreateSPIRVFloatType` helper. There was no
equivalent IntegerType::get for floats so I handled this with a switch
statement on bit widths to get the right LLVM float type.
Finally, there is the use of `bool ZeroAsNull = STI.isOpenCLEnv();` This
is partly a cosmetic change. When Zeros are treated as nulls, we don't
create `OpConstantComposite` vectors which is something we do in the
DXCs SPIRV backend. The DXC SPIRV backend also does not use
`OpConstantNull`. Finally, I needed a means to test the behavior of the
OpConstantNull and `OpConstantComposite` changes and this was one way I
could do that via the same tests.
SPIR-V intsruction selection was mapping the LLVM float min/max
intrinsics to FMin and FMax respectively for GL/Vulkan environments,
which does not match the intrinsics' documented treatment of NaN
operands. This patch switches the mapping to the correctly matched NMin
and NMax operations.
Fixes#87072
This PR:
* fixes OpVariable instructions place in a function (see
https://github.com/llvm/llvm-project/issues/66261),
* improves type inference,
* helps avoiding unneeded bitcasts when validating function call's
This allows to improve existing and add new test cases with more strict
checks. OpVariable fix refers to "All OpVariable instructions in a
function must be the first instructions in the first block" requirement
from SPIR-V spec.
This PR improves type inference in general and deduces types of
composite data structures in particular. Also added a way to insert a
bitcast to make a fun call valid in case of arguments types mismatch due
to opaque pointers type inference.
The attached test `pointers/nested-struct-opaque-pointers.ll`
demonstrates new capabilities: the SPIRV code emitted for this test is
now (1) valid in a sense of data field types and (2) accepted by
`spirv-val`.
More strict LIT checks, support of more composite data structures and
improvement of fun calls from the perspective of type correctness are
main todo's at the moment.