This PR reworks implementation of OpSpecConstantOp with ptr-cast
operation (PtrCastToGeneric, GenericCastToPtr). Previous implementation
didn't take into account a lot of use cases, including multiple
inclusion of pointers, reference to a pointer from OpName, etc. A
reproducer is attached as a new test case.
This PR also fixes wrong type inference for IR patterns which generate
new virtual registers without SPIRV type. Previous implementation
assumed always that result has the same address space as a source that
is not the fact, and, for example, led to impossibility to emit a
ptr-cast operation in the reproducer, because wrong type inference
rendered source and destination with the same address space, eliminating
translation of G_ADDRSPACE_CAST.
This change adds support for correctly lowering the `__scoped` Clang
builtins, and corresponding scoped LLVM instructions. These were
previously unconditionally lowered to Device scope, which is possibly incorrect.
Furthermore, the default / implicit scope is changed from Device (an
OpenCL assumption) to AllSvmDevices (aka System), since the SPIR-V BE is not
OpenCL specific / can ingest IR coming from other language front-ends. OpenCL
defaulting to Device scope is now reflected in the front-end handling of atomic
ops, which seems preferable.
This PR continues https://github.com/llvm/llvm-project/pull/101732
changes in virtual register processing aimed to improve correctness of
emitted MIR between passes from the perspective of MachineVerifier.
Namely, the following changes are introduced:
* register classes (lib/Target/SPIRV/SPIRVRegisterInfo.td) and
instruction patterns (lib/Target/SPIRV/SPIRVInstrInfo.td) are corrected
and simplified (by removing unnecessary sophisticated options) -- e.g.,
this PR gets rid of duplicating 32/64 bits patterns, removes ANYID
register class and simplifies definition of the rest of register
classes,
* hardcoded LLT scalar types in passes before instruction selection are
corrected -- the goal is to have correct bit width before instruction
selection, and use 64 bits registers for pattern matching in the
instruction selection pass; 32-bit registers remain where they are
described in such terms by SPIR-V specification (like, for example,
creation of virtual registers for scope/mem semantics operands),
* rework virtual register type/class assignment for calls/builtins
lowering,
* a series of minor changes to fix validity of emitted code between
passes:
- ensure that that bitcast changes the type,
- fix the pattern for instruction selection for OpExtInst,
- simplify inline asm operands usage,
- account for arbitrary integer sizes / update legalizer rules;
* add '-verify-machineinstrs' to existed test cases.
See also https://github.com/llvm/llvm-project/issues/88129 that this PR
may resolve.
This PR fixes a great number of issues reported by MachineVerifier and,
as a result, reduces a number of failed test cases for the mode with
expensive checks set on from ~200 to ~57.
This PR contains changes in virtual register processing aimed to improve
correctness of emitted MIR between passes from the perspective of
MachineVerifier. This potentially helps to detect previously missed
flaws in code emission and harden the test suite. As a measure of
correctness and usefulness of this PR we may use a mode with expensive
checks set on, and MachineVerifier reports problems in the test suite.
In order to satisfy Machine Verifier requirements to MIR correctness not
only a rework of usage of virtual registers' types and classes is
required, but also corrections into pre-legalizer and instruction
selection logics. Namely, the following changes are introduced:
* scalar virtual registers have proper bit width,
* detect register class by SPIR-V type,
* add a superclass for id virtual register classes,
* fix Tablegen rules used for instruction selection,
* fixes of minor existed issues (missed flag for proper representation
of a null constant for OpenCL vs. HLSL, wrong usage of integer virtual
registers as a synonym of any non-type virtual register).
This PR:
* supports cl_ext_float_atomics by mapping atomic_fetch_add and
atomic_fetch_sub applied to float arguments to the corresponding
instructions from SPV_EXT_shader_atomic_float*_add, and
* fix errors in definition of atomic_fetch_*_explicit builtins by fixing
a valid number of arguments.
This PR continues https://github.com/llvm/llvm-project/pull/94467 and
contains fixes in emission of type intrinsics, constant recording and
corresponding test cases:
* type-deduce-global-dup.ll -- fix of integer constant emission on
32-bit platforms and correct type deduction for globals
* type-deduce-simple-for.ll -- fix of GEP translation (there was an
issue previously that led to incorrect translation/broken logic of
for-range implementation)
This PR also:
* fixes a cast between identical storage classes and updates the test
case to include validation run by spirv-val,
* ensures that Bitcast for pointers satisfies the requirement that the
address spaces must match and adds the corresponding test case,
* improve encode in Tablegen and decode in code of dependencies between
SPIR-V entities and required capability/extensions,
* prevent emission of identical OpTypePointer instructions.
This change seeks to add support for vendor flavoured SPIRV - more
specifically, AMDGCN flavoured SPIRV. The aim is to generate SPIRV that
carries some extra bits of information that are only usable by AMDGCN
targets, forfeiting absolute genericity to obtain greater expressiveness
for target features:
- AMDGCN inline ASM is allowed/supported, under the assumption that the
[SPV_INTEL_inline_assembly](https://github.com/intel/llvm/blob/sycl/sycl/doc/design/spirv-extensions/SPV_INTEL_inline_assembly.asciidoc)
extension is enabled/used
- AMDGCN target specific builtins are allowed/supported, under the
assumption that e.g. the `--spirv-allow-unknown-intrinsics` option is
enabled when using the downstream translator
- the featureset matches the union of AMDGCN targets' features
- the datalayout string is overspecified to affix both the program
address space and the alloca address space, the latter under the
assumption that the
[SPV_INTEL_function_pointers](https://github.com/intel/llvm/blob/sycl/sycl/doc/design/spirv-extensions/SPV_INTEL_function_pointers.asciidoc)
extension is enabled/used, case in which the extant SPIRV datalayout
string would lead to pointers to function pointing to the private
address space, which would be wrong.
Existing AMDGCN tests are extended to cover this new target. It is
currently dormant / will require some additional changes, but I thought
I'd rather put it up for review to get feedback as early as possible. I
will note that an alternative option is to place this under AMDGPU, but
that seems slightly less natural, since this is still SPIRV, albeit
relaxed in terms of preconditions & constrained in terms of
postconditions, and only guaranteed to be usable on AMDGCN targets (it
is still possible to obtain pristine portable SPIRV through usage of the
flavoured target, though).
This PR introduces support of llvm.ptr.annotation to SPIR-V Backend, and
implement several extensions which make use of spirv.Decorations and
llvm.ptr.annotation to annotate global variables and pointers:
- SPV_INTEL_cache_controls
- SPV_INTEL_global_variable_host_access
- SPV_INTEL_global_variable_fpga_decorations
This PR introduces support for inline assembly calls for SPIR-V Backend
in general, and support for SPV_INTEL_inline_assembly [1] extension in
particular. The former part of the PR is agnostic towards
vendor-specific requirements and resolves the task of supporting
successful transformation of inline assembly as long as it's possible
without specific SPIR-V instruction codes.
As a part of the PR there appears an opportunity to bring coherent
inline assembly information up to latest passes of the transformation
process (emitting final SPIR-V instructions), so that PR makes it easy
to add any another required flavor of inline assembly, other then
supported by the vendor specific SPV_INTEL_inline_assembly extension,
if/when needed.
At the moment, however, SPV_INTEL_inline_assembly is the only
implemented way to bring LLVM IR inline assembly calls up to valid
SPIR-V instructions and also the default one. This means that inline
assembly calls will generate an error message of such extension is not
used to prevent LLVM-generated error messages at the final stages of
translation. When the SPV_INTEL_inline_assembly extension is mentioned
among supported, translation of inline assembly is intercepted by this
extension implementation on a pre-legalizer step, and this is a place
where support for a new inline assembly extension may be added if
needed.
This PR also extends support for register classes, improves type
inference during pre-legalizer pass, and fixes a minor bug with
asm-printing of string literals.
[1]
https://github.com/intel/llvm/blob/sycl/sycl/doc/design/spirv-extensions/SPV_INTEL_inline_assembly.asciidoc
Recognize `cl_khr_kernel_clock` builtins and translate them to
`OpReadClockKHR` instructions. The `Scope` operand is deduced from the
builtin function name.
spirv-val does not pass yet due to OpReadClockKHR only supporting the
valid scopes for Vulkan (Device and Subgroup, but not Workgroup), so
leave validation disabled with a TODO.
This PR generation of argument types of internal intrinsic functions
`spv_const_composite` and `spv_track_constant`, so that composite
constants of ConstantVector type preserve their correct type in
transformation passes and can be successfully used further by LLVM
intrinsic functions.
The added test case serves two purposes: it is to check the above
mentioned fix and to demonstrate that a call to __builtin_alloca() maps
to instructions from SPV_INTEL_variable_length_array when this extension
is available.
This PR adds SPIR-V extension SPV_INTEL_variable_length_array that
allows to allocate local arrays whose number of elements is unknown at
compile time:
* add a new SPIR-V internal intrinsic:int_spv_alloca_array
* legalize G_STACKSAVE and G_STACKRESTORE
* implement allocation of arrays (previously getArraySize() of
AllocaInst was not used)
* add tests
This PR is to add support for the SPIR-V extension
SPV_KHR_uniform_group_instructions that adds new instructions to SPIR-V
to support additional group operations within uniform control flow.
This PR adds support for atomic instruction on floating-point numbers:
* SPV_EXT_shader_atomic_float_add
* SPV_EXT_shader_atomic_float_min_max
* SPV_EXT_shader_atomic_float16_add
and fixes asm printer output for half floating-type.
This patch supports SPIR-V capabilities and extensions. In addition,
it inserts decorations related to MIFlags and improves support of switches.
Five tests are included to demonstrate the improvement.
Differential Revision: https://reviews.llvm.org/D131221
Co-authored-by: Aleksandr Bezzubikov <zuban32s@gmail.com>
Co-authored-by: Michal Paszkowski <michal.paszkowski@outlook.com>
Co-authored-by: Andrey Tretyakov <andrey1.tretyakov@intel.com>
Co-authored-by: Konrad Trifunovic <konrad.trifunovic@intel.com>