This PR fixes https://github.com/llvm/llvm-project/issues/120078 and
improves/simplifies parsing of demangled function name that aims to
detect a tip for floating point decorations. The latter improvement
fixes also a complaint from `LLVM_USE_SANITIZER=Address`.
This PR is to fix the following issues:
* the SPIR-V Backend didn't generate Alignment decoration for alloca
instructions,
* we need to use types from demangled function declarations to specify
types for opaque pointers.
This PR fixes:
* emission of OpNames (added newly inserted internal intrinsics and
basic blocks)
* emission of function attributes (SRet is added)
* implementation of SPV_INTEL_optnone so that it emits OptNoneINTEL
Function Control flag, and add implementation of the SPV_EXT_optnone
SPIR-V extension.
Re-land of #116636
Adds a new address spaces: hlsl_private. Variables with such address
space will be emitted with a Private storage class.
This is useful for variables global to a SPIR-V module, since up to now,
they were still emitted with a Function storage class, which is wrong.
---------
Signed-off-by: Nathan Gauër <brioche@google.com>
Adds a new address spaces: `hlsl_private`. Variables with such address
space will be emitted with a `Private` storage class.
This is useful for variables global to a SPIR-V module, since up to now,
they were still emitted with a `Function` storage class, which is wrong.
---------
Signed-off-by: Nathan Gauër <brioche@google.com>
Goals of the PR are:
* to ensure that correct types are applied to virtual registers which
were used as return values in call lowering. A reproducer is attached as
a new test case, before the PR it fails because spirv-val considers
output invalid due to wrong result/operand types in OpPhi's;
* improve type inference by speeding up postprocessing of types: by
limiting iterations by checking what remains to process, and processing
each instruction just once for any number of operands with uncomplete
types;
* improve type inference by more accurate work with uncomplete types
(pass uncomplete property to dependent operands, ensure consistency of
uncomplete-types data structure);
* change processing order and add traversing of PHI nodes when type
inference apply instructions results to specify/update/cast operands
type (fixes an issue with OpPhi's result type mismatch with operand
types).
Add testing for the visitor and added a note explaining irreducible CFG
are not supported.
Related to #116692
---------
Signed-off-by: Nathan Gauër <brioche@google.com>
Block sorting was assuming reducible CFG. Meaning we always had a best
node to continue with. Irreducible CFG makes breaks this assumption, so
the algorithm looped indefinitely because no node was a valid candidate.
Fixes#116692
---------
Signed-off-by: Nathan Gauër <brioche@google.com>
This PR fixes generation of OpConstantFunctionPointerINTEL instruction
for the SPIR-V extension SPV_INTEL_function_pointers. Result type of
OpConstantFunctionPointerINTEL must be OpTypePointer with Storage Class
operand equal to CodeSectionINTEL.
See also https://github.com/llvm/llvm-project/pull/116636
CC: @MrSidims
This PR is to solve several intertwined issues with type inference while
adding support for builtins for OpIAddCarry and OpISubBorrow:
* OpIAddCarry and OpISubBorrow generation in a way of supporting SPIR-V
friendly builtins `__spirv_...` -- introduces a new element to account
for, namely, `ptr sret (%struct) %0` argument that is a place to put a
result of the instruction;
* fix early definition of SPIR-V types during call lowering -- namely,
the goal of the PR is to ensure that correct types are applied to
virtual registers which were used as arguments in call lowering and so
caused early definition of SPIR-V types; reproducers are attached as a
new test cases;
* improve parsing of builtin names (e.g., understand a name of a kind
`"anon<int, int> __spirv_IAddCarry<int, int>(int, int)"` that was
incorrectly parsed as `anon` before the PR);
* improve type inference and fix access to erased from parent after
visit instructions -- before the PR visiting of instructions in
emitintrinsics pass replaced old alloca's, bitcast's, etc. instructions
with a newly generated internal SPIR-V intrinsics and after erasing old
instructions there were still references to them in a postprocessing
working list, while records for newly deduced pointee types were lost;
this PR fixes the issue by adding as consistent wrt. internal data
structures action `SPIRVEmitIntrinsics::replaceAllUsesWith()` that fixes
above mentioned problems;
* LLVM IR add/sub instructions result in logical SPIR-V instructions
when applied to bool type;
* fix validation of pointer types for frexp and lgamma_r,
* fix hardcoded reference to AS0 as a Function storage class in
lib/Target/SPIRV/SPIRVBuiltins.cpp -- now it's
`storageClassToAddressSpace(SPIRV::StorageClass::Function)`,
* re-use the same OpTypeStruct for two identical references to struct's
in arithmetic with overflow instructions.
The "topological" sorting was behaving incorrectly in some cases:
the exit of a loop could have a lower rank than a node in the loop.
This causes issues when structurizing some patterns, and also codegen
issues as we could generate BBs in the incorrect order in regard to the
SPIR-V spec.
Fixing this ordering alone broke other parts of the structurizer, which
by luck worked. Had to fix those.
Added more test cases, especially to test basic patterns.
I also needed to tweak/disable some tests for 2 reasons:
- SPIR-V now required reg2mem/mem2reg to run. Meaning dead stores
are optimized away. Some tests require tweaks to avoid having the
whole function removed.
- Mem2Reg will generate variable & load/stores. This generates
G_BITCAST in several cases. And there is currently something wrong
we do with G_BITCAST which causes MIR verifier to complain.
Until this is resolved, I disabled -verify-machineinstrs flag on
those tests.
---------
Signed-off-by: Nathan Gauër <brioche@google.com>
This PR reworks implementation of OpSpecConstantOp with ptr-cast
operation (PtrCastToGeneric, GenericCastToPtr). Previous implementation
didn't take into account a lot of use cases, including multiple
inclusion of pointers, reference to a pointer from OpName, etc. A
reproducer is attached as a new test case.
This PR also fixes wrong type inference for IR patterns which generate
new virtual registers without SPIRV type. Previous implementation
assumed always that result has the same address space as a source that
is not the fact, and, for example, led to impossibility to emit a
ptr-cast operation in the reproducer, because wrong type inference
rendered source and destination with the same address space, eliminating
translation of G_ADDRSPACE_CAST.
Two main goals of this PR are:
* to support "Arithmetic with Overflow" intrinsics, including the
special case when those intrinsics are being generated by the
CodeGenPrepare pass during translations with optimization;
* to redirect intrinsics with aggregate return type to be lowered via
GlobalISel operations instead of SPIRV-specific unfolding/lowering (see
https://github.com/llvm/llvm-project/pull/95012).
There is a new test case
`llvm/test/CodeGen/SPIRV/passes/translate-aggregate-uaddo.ll` that
describes and checks the general logics of the translation.
This PR continues a series of PRs aimed to identify and fix flaws in
code emission, to improve pass rates for the mode with expensive checks
set on (see https://github.com/llvm/llvm-project/pull/101732,
https://github.com/llvm/llvm-project/pull/104104,
https://github.com/llvm/llvm-project/pull/106966), having in mind the
ultimate goal of proceeding towards the non-experimental status of
SPIR-V Backend.
The reproducers are:
1) consider `llc -O3 -mtriple=spirv64-unknown-unknown ...` with:
```
define spir_func i32 @foo(i32 %a, ptr addrspace(4) %p) {
entry:
br label %l1
l1:
%e = phi i32 [ %a, %entry ], [ %i, %body ]
%i = add nsw i32 %e, 1
%fl = icmp eq i32 %i, 0
br i1 %fl, label %exit, label %body
body:
store i8 42, ptr addrspace(4) %p
br label %l1
exit:
ret i32 %i
}
```
2) consider `llc -O0 -mtriple=spirv64-unknown-unknown ...` with:
```
define spir_func i32 @foo(i32 %a, ptr addrspace(4) %p) {
entry:
br label %l1
l1: ; preds = %body, %entry
%e = phi i32 [ %a, %entry ], [ %math, %body ]
%0 = call { i32, i1 } @llvm.uadd.with.overflow.i32(i32 %e, i32 1)
%math = extractvalue { i32, i1 } %0, 0
%ov = extractvalue { i32, i1 } %0, 1
br i1 %ov, label %exit, label %body
body: ; preds = %l1
store i8 42, ptr addrspace(4) %p, align 1
br label %l1
exit: ; preds = %l1
ret i32 %math
}
```
It appears that PR https://github.com/llvm/llvm-project/pull/106429
introduced an issue for builds with SPIRV Backend target when building
with gcc, e.g.:
```
/llvm-project/llvm/lib/Target/SPIRV/SPIRVUtils.cpp:263:36: error: use of parameter from containing function
263 | llvm::SyncScope::ID SubGroup = Ctx.getOrInsertSyncScopeID("subgroup");
| ^~~
/llvm-project/llvm/lib/Target/SPIRV/SPIRVUtils.cpp:256:46: note: ‘llvm::LLVMContext& Ctx’ declared here
256 | SPIRV::Scope::Scope getMemScope(LLVMContext &Ctx, SyncScope::ID Id) {
```
This PR fixes this by removing struct and using static const variables
instead.
This change adds support for correctly lowering the `__scoped` Clang
builtins, and corresponding scoped LLVM instructions. These were
previously unconditionally lowered to Device scope, which is possibly incorrect.
Furthermore, the default / implicit scope is changed from Device (an
OpenCL assumption) to AllSvmDevices (aka System), since the SPIR-V BE is not
OpenCL specific / can ingest IR coming from other language front-ends. OpenCL
defaulting to Device scope is now reflected in the front-end handling of atomic
ops, which seems preferable.
This commit adds an initial SPIR-V structurizer.
It leverages the previously merged passes, and the convergence region
analysis to determine the correct merge and continue blocks for SPIR-V.
The first part does a branch cleanup (simplifying switches, and
legalizing them), then merge instructions are added to cycles,
convergent and later divergent blocks.
Then comes the important part: splitting critical edges, and making sure
the divergent construct boundaries don't cross.
- we split blocks with multiple headers into 2 blocks.
- we split blocks that are a merge blocks for 2 or more constructs:
SPIR-V spec disallow a merge block to be shared by 2
loop/switch/condition construct.
- we split merge & continue blocks: SPIR-V spec disallow a basic block
to be both a continue block, and a merge block.
- we remove superfluous headers: when a header doesn't bring more info
than the parent on the divergence state, it must be removed.
This PR leverages the merged SPIR-V simulator for testing, as long as
spirv-val. For now, most DXC structurization tests are passing. The
unsupported ones are either caused by unsupported features like switches
on boolean types, or switches in region exits, because the MergeExit
pass doesn't support those yet (there is a FIXME).
This PR is quite large, and the addition not trivial, so I tried to keep
it simple. E.G: as soon as the CFG changes, I recompute the dominator
trees and other structures instead of updating them.
---------
Signed-off-by: Nathan Gauër <brioche@google.com>
This PR fixes https://github.com/llvm/llvm-project/issues/96513.
The way of creation of array type constant was incorrect: instead of
creating [1, 1, 1] or [1, 1, 1, 1, 1, ....] constants, the same [1]
constant was always created, substituting original composite constants.
This in its turn led to a situation when only one of constants might
exist in the code without emitting invalid code, the second constant
would be eventually rewritten to the first constant, because a key to
address both was an array of a single element (like [1]).
This PR fixes the issue and purges from the code unneeded copy/pasted
clone of the function that creates an array constant.
This PR continues https://github.com/llvm/llvm-project/pull/94467 and
contains fixes in emission of type intrinsics, constant recording and
corresponding test cases:
* type-deduce-global-dup.ll -- fix of integer constant emission on
32-bit platforms and correct type deduction for globals
* type-deduce-simple-for.ll -- fix of GEP translation (there was an
issue previously that led to incorrect translation/broken logic of
for-range implementation)
This PR also:
* fixes a cast between identical storage classes and updates the test
case to include validation run by spirv-val,
* ensures that Bitcast for pointers satisfies the requirement that the
address spaces must match and adds the corresponding test case,
* improve encode in Tablegen and decode in code of dependencies between
SPIR-V entities and required capability/extensions,
* prevent emission of identical OpTypePointer instructions.
This PR contains a series of fixes which are to improve type inference
and instruction selection.
Namely, it includes:
* fix OpSelect to support operands of a pointer type, according to the
SPIR-V specification (previously only integer/float/vectors of integer
or float were supported) -- a new test case is added and existing test
case is updated;
* fix TableGen typo's in definition of register classes and introduce a
new reg class that is a vector of pointers;
* fix usage of a machine function context when there is a need to switch
between different machine functions to infer/validate correct types;
* add usage of TypedPointerType instead of PointerType so that later
stages of type inference are able to distinguish pointer types by their
element types, effectively supporting hierarchy of pointer/pointee types
and avoiding more complicated recursive type matching on level of
machine instructions in favor of direct pointer comparison using LLVM's
`Type *` values;
* extracting detailed information about operand types using known type
rules for some llvm instructions (for instance, by deducing PHI's
operand pointee types if PHI's results type was deducted on previous
stages of type inference), and adding correspondent
`Intrinsic::spv_assign_ptr_type` to keep type info along consequent
passes,
* ensure that OpConstantComposite reuses a constant when it's already
created and available in the same machine function -- otherwise there is
a crash while building a dependency graph, the corresponding test case
is attached,
* implement deduction of function's return type for opaque pointers, a
new test case is attached,
* make 'emit intrinsics' a module pass to resolve function return types
over the module -- first types for all functions of the module must be
calculated, and only after that it's feasible to deduct function return
types on this earlier stage of translation.
This PR is a small improvement of parsing of base type name in builtins,
allowing to understand `unsigned ...` types. The test case that fails
without the fix is attached.
Add support to generate valid SPIR-V for the WaveGetLaneIndex() HLSL
builtin.
To implement this, I had to fix a few small issues in the backend, like
the i8* pointer type being emitted, even if we have the type information
elsewhere.
Signed-off-by: Nathan Gauër <brioche@google.com>
This pull request aims to remove any dependency on OpenCL/SPIR-V type
information in LLVM IR metadata. While, using metadata might simplify
and prettify the resulting SPIR-V output (and restore some of the
information missed in the transformation to opaque pointers), the
overall methodology for resolving kernel parameter types is highly
inefficient.
The high-level strategy is to assign kernel parameter types in this order:
1. Resolving the types using builtin function calls as mangled names
must contain type information or by looking up builtin definition in
SPIRVBuiltins.td. Then:
- Assigning the type temporarily using an intrinsic and later setting
the right SPIR-V type in SPIRVGlobalRegistry after IRTranslation
- Inserting a bitcast
2. Defaulting to LLVM IR types (in case of pointers the generic i8*
type or types from byval/byref attributes)
In case of type incompatibility (e.g. parameter defined initially as
sampler_t and later used as image_t) the error will be found early on
before IRTranslation (in the SPIRVEmitIntrinsics pass).
SPIRV-V Backend generates unnecessary OpExecutionMode records, putting
into the id's which are not the Entry Point operands of an OpEntryPoint
(ref: https://github.com/llvm/llvm-project/issues/81753). This PR is to
fix the issue.
This PR adds support for atomic instruction on floating-point numbers:
* SPV_EXT_shader_atomic_float_add
* SPV_EXT_shader_atomic_float_min_max
* SPV_EXT_shader_atomic_float16_add
and fixes asm printer output for half floating-type.
This is the first of the 7 steps outlined in #75801. This PR explicitely
calls the SimplifyLoops pass. Directly following this pass should follow
the 6 others required to structurize the IR.
Running this pass could generate empty basic-blocks, which are implicit
fallthrough to the successor BB.
There was a specific condition in the SPIR-V ISel which handled implicit
fallthrough, but it couldn't work on empty basic-blocks. This commits
removes the old logic, and adds this new logic, which checks all
basic-blocks for implicit fallthroughs, including empty ones.
---------
Signed-off-by: Nathan Gauër <brioche@google.com>
This patch replaces uses of StringRef::{starts,ends}with with
StringRef::{starts,ends}_with for consistency with
std::{string,string_view}::{starts,ends}_with in C++20.
I'm planning to deprecate and eventually remove
StringRef::{starts,ends}with.
Introduced the convergent equivalent of the existing G_INTRINSIC opcodes:
- G_INTRINSIC_CONVERGENT
- G_INTRINSIC_CONVERGENT_W_SIDE_EFFECTS
Out of the targets that currently have some support for GlobalISel, the patch
assumes that the convergent intrinsics only relevant to SPIRV and AMDGPU.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D154766
Some opcodes in generic MIR represent calls to intrinsics, where the intrinsic
ID is the first non-def operand to the instruction. These are now represented as
a subclass of GenericMachineInstr, and the method MachineInstr::getIntrinsicID()
is now moved to this subclass GIntrinsic.
Some target-defined instructions behave like GMIR intrinsics, and have an
Intrinsic::ID operand. But they should not be recognized as generic intrinsics,
and should not use GIntrinsic::getIntrinsicID(). Separated these out by
introducing a new AMDGPU::getIntrinsicID().
Reviewed By: arsenm, Pierre-vh
Differential Revision: https://reviews.llvm.org/D155556
This restores commit baa3386edb11a2f9bcadda8cf58d56f3707c39fa.
Originally reverted in d0f7850b01cf17e50a4f4b00e3b84dded94df6b8.
This reverts commit baa3386edb11a2f9bcadda8cf58d56f3707c39fa.
The changes did not cover all occurrences of the deteleted function
MachineInstr::getIntrinsicID().
Some opcodes in generic MIR represent calls to intrinsics, where the intrinsic
ID is the first non-def operand to the instruction. These are now represented as
a subclass of GenericMachineInstr, and the method MachineInstr::getIntrinsicID()
is now moved to this subclass GIntrinsic.
Some target-defined instructions behave like GMIR intrinsics, and have an
Intrinsic::ID operand. But they should not be recognized as generic intrinsics,
and should not use GIntrinsic::getIntrinsicID(). Separated these out by
introducing a new AMDGPU::getIntrinsicID().
Reviewed By: arsenm, Pierre-vh
Differential Revision: https://reviews.llvm.org/D155556
After this patch all arbitrary size integers (smaller than 64 bits) in
LLVM IR will be promoted to regular size type in SPIR-V (OpTypeInt
8/16/32/64).
Differential Revision: https://reviews.llvm.org/D145137
This patch adds support for TargetExtType/target(...) representing
SPIR-V builtin types. After D135202, target(...) is the preferred way
for representing SPIR-V builtin types in LLVM IR and the only working
in the opaque pointer mode.
In order to maintain compatibility with LLVM IR generated by older
versions of Clang and LLVM/SPIR-V Translator, pointers-to-opaque-structs
denoting SPIR-V/OpenCL builtin types will be translated to equivalent
SPIR-V target extension types. This translation is only available in the
typed pointer mode (-opaque-pointers=0).
The relevant LIT tests with SPIR-V builtins were converted to use the
new target(...) notation.
Differential Revision: https://reviews.llvm.org/D144494
The patch fixes gcc's warning in SPIRVUtils.cpp after D142532.
Also it fixes compilation error by MSVC in SPIRVBuiltins.cpp.
Differential Revision: https://reviews.llvm.org/D142937
Fix these build warnings:
SPIRVBuiltins.cpp:1590:30: warning: 'getPrefTypeAlignment' is deprecated: use getPrefTypeAlign instead [-Wdeprecated-declarations]
SPIRVUtils.cpp:209:3: warning: default label in switch which covers all enumeration values [-Wcovered-switch-default]
Differential Revision: https://reviews.llvm.org/D142532
SPIRVModuleAnalysis collects module and external function registers
(usually result of OpFunction) for use when emitting OpFunctionCall.
This patch makes the mapping between the functions and registers using
pointers (instead of name strings) to ensure anonymous functions and
calls can be resolved properly.
Differential Revision: https://reviews.llvm.org/D140548
The patch adds support of the enqueue_kernel builtin function.
It is implemented in the same way as in the SPIRV translator.
2 LIT tests are added to show the improvement.
Differential Revision: https://reviews.llvm.org/D137018
The patch fixes lowering of anonymous functions, removes file/linkage
info for builtin call demangling, and adds relevant test demonstrating
a fixed problem.
Differential Revision: https://reviews.llvm.org/D135390
The patch adds the regularization pass that prepare LLVM IR for
the IR translation. It also contains following changes:
- reduce indentation, make getNonParametrizedType, getSamplerType,
getPipeType, getImageType, getSampledImageType static in SPIRVBuiltins,
- rename mayBeOclOrSpirvBuiltin to getOclOrSpirvBuiltinDemangledName,
- move isOpenCLBuiltinType, isSPIRVBuiltinType, isSpecialType from
SPIRVGlobalRegistry.cpp to SPIRVUtils.cpp, renaming isSpecialType to
isSpecialOpaqueType,
- implment getTgtMemIntrinsic() in SPIRVISelLowering,
- add hasSideEffects = 0 in Pseudo (SPIRVInstrFormats.td),
- add legalization rule for G_MEMSET, correct G_BRCOND rule,
- add capability processing for OpBuildNDRange in SPIRVModuleAnalysis,
- don't correct types of registers holding constants and used in
G_ADDRSPACE_CAST (SPIRVPreLegalizer.cpp),
- lower memset/bswap intrinsics to functions in SPIRVPrepareFunctions,
- change TargetLoweringObjectFileELF to SPIRVTargetObjectFile
in SPIRVTargetMachine.cpp,
- correct comments.
5 LIT tests are added to show the improvement.
Differential Revision: https://reviews.llvm.org/D133253
Co-authored-by: Aleksandr Bezzubikov <zuban32s@gmail.com>
Co-authored-by: Michal Paszkowski <michal.paszkowski@outlook.com>
Co-authored-by: Andrey Tretyakov <andrey1.tretyakov@intel.com>
Co-authored-by: Konrad Trifunovic <konrad.trifunovic@intel.com>
The patch adds support for OpenCL and SPIR-V built-in functions.
Their detection and properties are implemented using TableGen.
Five tests are added to demonstrate the improvement.
Differential Revision: https://reviews.llvm.org/D132024
Co-authored-by: Aleksandr Bezzubikov <zuban32s@gmail.com>
Co-authored-by: Michal Paszkowski <michal.paszkowski@outlook.com>
Co-authored-by: Andrey Tretyakov <andrey1.tretyakov@intel.com>
Co-authored-by: Konrad Trifunovic <konrad.trifunovic@intel.com>
The patch replaces SPIRVBaseInfo.* previously created using macros by
the tablegen approach. There are many small changes in other files due to
differences in namespaces. Also, functions in SPIRVUtils are moved to
the llvm namespace.
Differential Revision: https://reviews.llvm.org/D130518
Co-authored-by: Aleksandr Bezzubikov <zuban32s@gmail.com>
Co-authored-by: Michal Paszkowski <michal.paszkowski@outlook.com>
Co-authored-by: Andrey Tretyakov <andrey1.tretyakov@intel.com>
Co-authored-by: Konrad Trifunovic <konrad.trifunovic@intel.com>