Now that #149310 has restricted lifetime intrinsics to only work on
allocas, we can also drop the explicit size argument. Instead, the size
is implied by the alloca.
This removes the ability to only mark a prefix of an alloca alive/dead.
We never used that capability, so we should remove the need to handle
that possibility everywhere (though many key places, including stack
coloring, did not actually respect this).
The HLSL frontend generates calls to the intrinsic
@llvm.spv.resource.handlefromimplicitbinding to access a
resource whose set and binding were not explicitly given in the
source code. The correct set and binding cannot be determined
during Clang's codegen or earlier because the DXIL backend must first
remove resources that are not accessed before assigning binding locations
to the resources without an explicit binding.
We will follow its lead.
This is a change from DXC, where implicit bindings for SPIR-V are
assigned before optimizations.
See https://github.com/llvm/wg-hlsl/pull/309
Fix code quality issues reported by a static analysis tool, such as:
- Rule of Three/Five.
- Dereference after null check.
- Unchecked return value.
- Variable copied when it could be moved.
Prior to this patch, when `NumElems` was 0, `OpTypeRuntimeArray` was
directly generated, but it requires `Shader` capability, so it can only
be generated if `Shader` env is being used. We have observed a pattern
of using unbound arrays that translate into `[0 x ...]` types in OpenCL,
which implies `Kernel` capability, so `OpTypeRuntimeArray` should not be
used. To prevent this scenario, this patch simplifies GEP instructions
where the type is a 0-length array and the first index is also 0. In that
case, we effectively drop the 0-length array and the first index.
Additionally, the newly added test prior to this patch was generating a
module with both `Shader` and `Kernel` capabilities at the same time,
but they're incompatible. This patch also fixes that.
Finally, prior to this patch, the newly added test was adding `Shader`
capability to the module even with the command line flag
`--avoid-spirv-capabilities=Shader`. This patch also has a fix for that.
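The GEP simplification described above can be sketched on a toy model. This is only an illustration under stated assumptions: `GepModel` and `simplifyZeroLengthArrayGep` are hypothetical names, not the backend's data structures, and the real patch operates on LLVM IR instructions.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical, simplified model of a GEP: whether the source element type
// is a zero-length array `[0 x T]`, plus the index list.
struct GepModel {
  bool SourceIsZeroLengthArray;
  std::vector<std::optional<int64_t>> Indices; // nullopt = dynamic index
};

// If the GEP indexes a zero-length array and its first index is the
// constant 0, drop both the array wrapper and that first index.
// Otherwise return the GEP unchanged.
GepModel simplifyZeroLengthArrayGep(const GepModel &G) {
  if (!G.SourceIsZeroLengthArray || G.Indices.empty())
    return G;
  const std::optional<int64_t> &First = G.Indices.front();
  if (!First.has_value() || *First != 0)
    return G;
  GepModel R;
  R.SourceIsZeroLengthArray = false; // now indexes T directly
  R.Indices.assign(G.Indices.begin() + 1, G.Indices.end());
  return R;
}
```

With this model, a GEP over `[0 x T]` with indices `0, 5` becomes a GEP over `T` with the single index `5`, so no runtime array type is needed.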
Add support for the intrinsics @llvm.fptosi.sat.* and @llvm.fptoui.sat.*:
- add a legalizer rule for G_FPTOSI_SAT and G_FPTOUI_SAT
- add instruction selection for G_FPTOSI_SAT and G_FPTOUI_SAT
- add a function to add the SaturatedConversion decoration to the intrinsic
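For readers unfamiliar with the saturating conversions, the following sketch shows the semantics the intrinsics have for a `float` to 32-bit integer conversion: NaN maps to 0 and out-of-range values clamp to the nearest representable integer. The function names are hypothetical and this is not the backend's lowering, only a reference model of the behavior.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <limits>

// Reference model of llvm.fptosi.sat.i32.f32.
int32_t fptosiSat32(float X) {
  if (std::isnan(X))
    return 0; // NaN converts to zero
  if (X <= -2147483648.0f)
    return std::numeric_limits<int32_t>::min(); // clamp low
  if (X >= 2147483648.0f)
    return std::numeric_limits<int32_t>::max(); // clamp high
  return static_cast<int32_t>(X); // in range: truncate toward zero
}

// Reference model of llvm.fptoui.sat.i32.f32.
uint32_t fptouiSat32(float X) {
  if (std::isnan(X) || X <= 0.0f)
    return 0; // NaN and negative values convert to zero
  if (X >= 4294967296.0f)
    return std::numeric_limits<uint32_t>::max(); // clamp high
  return static_cast<uint32_t>(X);
}
```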
---------
Co-authored-by: Dmitry Sidorov <dmitry.sidorov@intel.com>
Co-authored-by: Ebin-McW <ebin.jose@multicorewareinc.com>
Co-authored-by: Michal Paszkowski <michal@michalpaszkowski.com>
`Data` now references the first byte of the fixup offset within the current fragment.
MCAssembler::layout asserts that the fixup offset is within either the
fixed-size content or the optional variable-size tail, as this is the
most the generic code can validate without knowing the target-specific
fixup size.
Many backends' applyFixup implementations assert:
```
assert(Offset + Size <= F.getSize() && "Invalid fixup offset!");
```
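As a rough illustration of the target-side check, here is a self-contained sketch. `FragmentModel` and `fixupFits` are hypothetical stand-ins (not MC types); the point is only that the target knows the fixup size, so it can check that the whole fixup fits, which the generic layout code cannot do.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical fragment model: fixed-size content followed by an optional
// variable-size tail.
struct FragmentModel {
  std::size_t FixedSize;
  std::size_t VarSize;
  std::size_t getSize() const { return FixedSize + VarSize; }
};

// Target-specific check: the fixup of `Size` bytes starting at `Offset`
// must lie entirely within the fragment. Written to avoid overflow in
// `Offset + Size`.
bool fixupFits(const FragmentModel &F, std::size_t Offset, std::size_t Size) {
  return Offset <= F.getSize() && Size <= F.getSize() - Offset;
}
```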
This refactoring allows a subsequent change to move the fixed-size
content outside of MCSection::ContentStorage, fixing the
-fsanitize=pointer-overflow issue of #150846.
Pull Request: https://github.com/llvm/llvm-project/pull/151724
Pointers and GEP are untyped. SPIR-V requires structured OpAccessChain.
This means the backend will have to determine a good way to retrieve the
structured access from an untyped GEP. This is not a trivial problem,
and needs to be addressed to have a robust compiler.
The issue is that other workstreams rely on the access chain deduction to
work. So we have 2 options:
- pause all dependent work until we have a good chain deduction.
- submit this limited fix so we can work on both this and other features
in parallel.
The choice we want to make is #2: submitting this **knowing this is not a
good** fix. It only increases the number of patterns we can work with,
thus allowing others to continue working on other parts of the backend.
This patch as-is has many limitations:
- It cannot robustly determine the depth of the structured access from a
GEP. Fixing this would require looking ahead at the full GEP chain.
- It cannot always figure out the correct access indices, especially
with dynamic indices. This will require frontend collaboration.
Because we know this is a temporary hack, this patch only impacts the
logical SPIR-V target. Physical SPIR-V, which can rely on pointer casts,
remains on the old method.
Related to #145002
Using GEP to index into a vector is not disallowed, but it is not recommended.
The SPIR-V backend needs to generate structured access into types, which
is impossible with an untyped GEP instruction unless we add more info to
the IR. Finding a solution is a work in progress, but in the meantime,
we'd like to reduce the number of failures.
Preventing this optimization from rewriting extract/insert
instructions into a GEP helps us lower more code to SPIR-V. This change
should be OK as it's only active when targeting SPIR-V and it disables a
non-recommended transformation.
Related to #145002
Fixes #146942
## Issue
The cause of the bug is in InstCombine, which is converting our load of
float vec4 and bitcast to i32 vec4 into one load of i32 vec4. That means
we have to do a legalization in the SPIR-V backend to convert back:
```diff
- %3 = load <4 x i32>, ptr addrspace(11) %2, align 16
+ %3 = load <4 x float>, ptr addrspace(11) %2, align 16
+ %4 = bitcast <4 x float> %3 to <4 x i32>
```
https://godbolt.org/z/K4GeM4fKT
## The Fix
Just removing the assert isn't enough to fix this bug. If we do so, we
get an assert later:
`Assertion failed: (!storageClassRequiresExplictLayout(SC)), function
getOrCreateSPIRVPointerType, file SPIRVGlobalRegistry.cpp, line 1806.`
If we just remove the assert, the `CreateShuffleVector` uses the source
type via the `NewLoad` when the `Output` type needs to be the
`TargetType`.
We also can't use `CreateBitCast`. That will feed the right types to the
`ShuffleVector`, but it doesn't emit OpBitcast: the LLVM IR isn't
translated over to MIR.
The fix then is to emit `spv_bitcast` just like what
`SPIRVEmitIntrinsics::visitBitCastInst` does.
---------
Co-authored-by: Chris B <beanz@abolishcrlf.org>
This commit adds custom legalization for G_IS_FPCLASS, corresponding to
the @llvm.is.fpclass intrinsic.
The lowering strategy is essentially copied and adjusted from the
target-agnostic LegalizeHelper::lowerISFPCLASS legalization. The reason
we can't just use that directly is that the series of instructions it
expands to isn't logged in the SPIR-V backend's register/type
book-keeping, leading to issues later on in the compilation process.
As such the code introduced here was copied from the aforementioned
helper method, with some notable changes:
* Each new instruction's destination register must have a SPIR-V type
registered to it.
* Instead of a COPY from the floating-point type to integer, we issue a
SPIR-V OpBitcast directly. The backend doesn't currently appear to
handle bitcast-like COPYs.
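To illustrate why the lowering needs the float-to-integer bitcast, here is a small sketch of class tests done on the integer bit pattern of a 32-bit float. The helper names are hypothetical and this is not the code added by the commit; it only shows the kind of integer comparisons such a lowering builds on.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <limits>

// Reinterpret the float's bits as an integer (the role OpBitcast plays).
uint32_t bitsOf(float X) {
  uint32_t B;
  std::memcpy(&B, &X, sizeof B);
  return B;
}

// Clear the sign bit, then compare against the exponent mask 0x7f800000.
bool isNaNBits(float X) { return (bitsOf(X) & 0x7fffffffu) > 0x7f800000u; }
bool isInfBits(float X) { return (bitsOf(X) & 0x7fffffffu) == 0x7f800000u; }
bool isZeroBits(float X) { return (bitsOf(X) & 0x7fffffffu) == 0u; }

// Subnormal: nonzero magnitude with an all-zero exponent field.
bool isSubnormalBits(float X) {
  uint32_t Abs = bitsOf(X) & 0x7fffffffu;
  return Abs != 0u && Abs < 0x00800000u;
}
```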
Fixes #72862
This is a quick fix to make progress on the backend until we get a
proper type scavenging system.
The previous code only checked the type if the resource was used
once. The code now looks at all uses and takes the first
type.
This will certainly break in other cases, but it allows us to move
forward for now until we rewrite the type scavenging to handle untyped
GEP/ptradd correctly.
Related to #145002
lifetime.start and lifetime.end are primarily intended for use on
allocas, to enable stack coloring and other liveness optimizations. This
is necessary because all (static) allocas are hoisted into the entry
block, so lifetime markers are the only way to convey the actual
lifetimes.
However, lifetime.start and lifetime.end are currently *allowed* to be
used on non-alloca pointers. We don't actually do this in practice, but
just the mere fact that this is possible breaks the core purpose of the
lifetime markers, which is stack coloring of allocas. Stack coloring can
only work correctly if all lifetime markers for an alloca are
analyzable.
* If a lifetime marker may operate on multiple allocas via a select/phi,
we don't know which lifetime actually starts/ends and handle it
incorrectly (https://github.com/llvm/llvm-project/issues/104776).
* Stack coloring operates on the assumption that all lifetime markers
are visible, and not, for example, hidden behind a function call or
escaped pointer. It's not possible to change this, as part of the
purpose of lifetime markers is that they work even in the presence of
escaped pointers, where simple use analysis is insufficient.
I don't think there is any way to have coherent semantics for lifetime
markers on allocas, while also permitting them on arbitrary pointer
values.
This PR restricts lifetimes to operate on allocas only. As a followup, I
will also drop the size argument, which is superfluous if we always
operate on an alloca. (This change also renders various code handling
lifetime markers on non-alloca dead. I plan to clean up that kind of
code after dropping the size argument as well.)
In practice, I've only found a few places that currently produce
lifetimes on non-allocas:
* CoroEarly replaces the promise alloca with the result of an intrinsic,
which will later be replaced back with an alloca. I think this is the
only place where there is some legitimate loss of functionality, but I
don't think this is particularly important (I don't think we'd expect
the promise in a coroutine to admit useful lifetime optimization.)
* SafeStack moves unsafe allocas onto a separate frame. We can safely
drop lifetimes here, as SafeStack performs its own stack coloring.
* Similarly for AddressSanitizer, which also moves allocas into separate
memory.
* LSR sometimes replaces the lifetime argument with a GEP chain of the
alloca (where the offsets ultimately cancel out). This is just
unnecessary. (Fixed separately in
https://github.com/llvm/llvm-project/pull/149492.)
* InferAddrSpaces sometimes makes lifetimes operate on an addrspacecast
of an alloca. I don't think this is necessary.
- [x] Implement refract using HLSL source in hlsl_intrinsics.h
- [x] Implement the refract SPIR-V target built-in in
clang/include/clang/Basic/BuiltinsSPIRV.td
- [x] Add sema checks for refract to CheckSPIRVBuiltinFunctionCall in
clang/lib/Sema/SemaSPIRV.cpp
- [x] Add codegen for spv refract to EmitSPIRVBuiltinExpr in
CGBuiltin.cpp
- [x] Add codegen tests to clang/test/CodeGenHLSL/builtins/refract.hlsl
- [x] Add spv codegen test to clang/test/CodeGenSPIRV/Builtins/refract.c
- [x] Add sema tests to clang/test/SemaHLSL/BuiltIns/refract-errors.hlsl
- [x] Add spv sema tests to
clang/test/SemaSPIRV/BuiltIns/refract-errors.c
- [x] Create the int_spv_refract intrinsic in IntrinsicsSPIRV.td
- [x] In SPIRVInstructionSelector.cpp create the refract lowering and
map it to int_spv_refract in SPIRVInstructionSelector::selectIntrinsic.
- [x] Create SPIR-V backend test case in
llvm/test/CodeGen/SPIRV/hlsl-intrinsics/refract.ll
- [x] Check for what OpenCL support is needed.
Resolves https://github.com/llvm/llvm-project/issues/99153
These are identified by misc-include-cleaner. I've filtered out those
that break builds. Also, I'm staying away from llvm-config.h,
config.h, and Compiler.h, which likely cause platform- or
compiler-specific build failures.
This PR adds overrides in `SPIRVTTIImpl` for
`collectFlatAddressOperands` and `rewriteIntrinsicWithAddressSpace` to
enable `InferAddressSpacesPass` to rewrite the
`llvm.spv.generic.cast.to.ptr.explicit` intrinsic (corresponding to
`OpGenericCastToPtrExplicit`) when the address space of the argument can
be inferred. When the destination address space of the cast matches the
inferred address space of the argument, the call is replaced with that
argument. When they do not match, the cast is replaced with a constant
null pointer.
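The replacement rule in the last two sentences can be summarized as a small decision function. This is a sketch with hypothetical names (`FoldResult`, `foldGenericCast`), not the `SPIRVTTIImpl` override itself; address spaces are modeled as plain integers.

```cpp
#include <cassert>
#include <optional>

enum class FoldResult { KeepCast, UseArgument, NullPointer };

// Decide what happens to the cast given the inferred address space of its
// argument (nullopt when nothing could be inferred) and the destination
// address space requested by the cast.
FoldResult foldGenericCast(std::optional<unsigned> InferredAS,
                           unsigned DestAS) {
  if (!InferredAS)
    return FoldResult::KeepCast; // nothing inferred: leave the call alone
  return *InferredAS == DestAS ? FoldResult::UseArgument
                               : FoldResult::NullPointer;
}
```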
The patch adds intrinsics and lowering logic for GlobalSize,
GlobalOffset, SubgroupMaxSize, NumWorkgroups, WorkgroupSize,
WorkgroupId, LocalInvocationId, GlobalInvocationId, SubgroupSize,
NumSubgroups, SubgroupId and SubgroupLocalInvocationId SPIR-V builtins.
The patch also extends spv_thread_id, spv_group_id and
spv_thread_id_in_group to return anyint rather than i32. This allows the
intrinsics to support the OpenCL environment.
For each of the intrinsics, new clang builtins were added as well as a
binding for the SPIR-V "friendly" format. The original format doesn't
define such a binding (it uses global variables), but it is not possible to
express the Input SC which is normally required by the environment
specs, and using builtin functions is the most usual approach for other
backends and programming models.
In DXC, there is an option to enable all KHR extensions. I would like to
extend the existing `-spirv-ext` backend command-line option to have the
same capability. It is like the special case for `all`, except it only
adds the `SPV_KHR_*` extensions.
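The selection rule amounts to a prefix filter over the known extension names. The sketch below assumes a hypothetical helper `selectKhrExtensions` and an arbitrary input list; it does not reflect how the backend stores its extension table.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Keep only the extension names that start with "SPV_KHR_".
std::vector<std::string>
selectKhrExtensions(const std::vector<std::string> &All) {
  const std::string Prefix = "SPV_KHR_";
  std::vector<std::string> Out;
  for (const std::string &Name : All)
    if (Name.compare(0, Prefix.size(), Prefix) == 0)
      Out.push_back(Name);
  return Out;
}
```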
Part of https://github.com/llvm/llvm-project/issues/137650.
In Vulkan, the signedness of the accesses to images has to match the
signedness of the backing image.
See
https://docs.vulkan.org/spec/latest/chapters/textures.html#textures-input,
where it says the behaviour is undefined if
> the signedness of any read or sample operation does not match the
signedness of the image’s format.
Users who define, say, an `RWBuffer<int>` will create a Vulkan image with
a signed integer format. So the HLSL that is generated must match that
expectation.
The solution we use is to generate a `spirv.SignedImage` target type for
signed integer instead of `spirv.Image`. The two types are otherwise the
same.
The backend will add the `signExtend` image operand to accesses to the
image to ensure the image is accessed as a signed image.
Fixes #144580
Add handling for FPFastMathMode in SPIR-V shaders. This is a first pass
that simply does a direct translation when the proper extension is
available. This will unblock work for HLSL. However, it is not a full
solution.
The default math mode for SPIR-V is determined by the API. When
targeting Vulkan, many of the fast math options are assumed. We should do
something particular when targeting Vulkan.
We will also need to handle the HLSL "precise" keyword correctly when
FPFastMathMode is not available.
Unblocks https://github.com/llvm/llvm-project/issues/140739, but we are
keeping it open to track the remaining issues mentioned above.
## Purpose
This patch is one in a series of code-mods that annotate LLVM’s public
interface for export. This patch annotates the `llvm/Target` library.
These annotations currently have no meaningful impact on the LLVM build;
however, they are a prerequisite to support an LLVM Windows DLL (shared
library) build.
## Background
This effort is tracked in #109483. Additional context is provided in
[this
discourse](https://discourse.llvm.org/t/psa-annotating-llvm-public-interface/85307),
and documentation for `LLVM_ABI` and related annotations is found in the
LLVM repo
[here](https://github.com/llvm/llvm-project/blob/main/llvm/docs/InterfaceExportAnnotations.rst).
A subset of these changes was generated automatically using the
[Interface Definition Scanner (IDS)](https://github.com/compnerd/ids)
tool, followed by formatting with `git clang-format`.
The bulk of this change is manual additions of `LLVM_ABI` to
`LLVMInitializeX` functions defined in .cpp files under llvm/lib/Target.
Adding `LLVM_ABI` to the function implementation is required here
because they do not `#include "llvm/Support/TargetSelect.h"`, which
contains the declarations for these functions and was already updated
with `LLVM_ABI` in a previous patch. I considered patching these files
with `#include "llvm/Support/TargetSelect.h"` instead, but since
TargetSelect.h is a large file with a bunch of preprocessor x-macro
stuff in it I was concerned it would unnecessarily impact compile times.
In addition, a number of unit tests under llvm/unittests/Target required
additional dependencies to make them build correctly against the LLVM
DLL on Windows using MSVC.
## Validation
Local builds and tests to validate cross-platform compatibility. This
included llvm, clang, and lldb on the following configurations:
- Windows with MSVC
- Windows with Clang
- Linux with GCC
- Linux with Clang
- Darwin with Clang
There is a builtin __spirv_SpecConstant that the SPIR-V backend expands
into a specialization constant. However, it is currently only enabled for
OpenCL shaders, and not for graphics shaders.
We want to use it for specialization constants coming from HLSL, so we
are enabling it for graphics shaders as well.
Implements https://github.com/llvm/wg-hlsl/pull/287
Fixes https://github.com/llvm/llvm-project/issues/142991
Implements
https://github.com/llvm/wg-hlsl/blob/main/proposals/0026-symbol-visibility.md.
The change is to stop using the `hlsl.export` attribute. Instead,
symbols with "program linkage" in HLSL will have export linkage with
default visibility, and symbols with "external linkage" in HLSL will
have export linkage with hidden visibility.
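The mapping described above is small enough to state as a function. The enum and function names below are hypothetical, chosen only for illustration; they are not Clang or LLVM types.

```cpp
#include <cassert>

enum class HlslLinkage { Program, External };
enum class Visibility { Default, Hidden };

// Both kinds of symbols get export linkage; only the visibility differs.
Visibility visibilityFor(HlslLinkage L) {
  return L == HlslLinkage::Program ? Visibility::Default
                                   : Visibility::Hidden;
}
```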
In the AArch64 version, this helps reduce the number of blr instructions
(indirect jumps) from 325 to 87, and reduces the size of the object
file by 4%. It seems to help make the code more efficient even if it
doesn't greatly affect compile time.
The AMDGPU variants are already marked as final.
The SPIR-V backend does not have access to the original name of a
resource in the source, so it tries to create a name. This leads to some
problems with reflection.
That is why we now pass the name of the resource from Clang to the
SPIR-V backend.
Fixes #138533
PR #141787 added code to emit the Fragment execution model. This
required emitting the OriginUpperLeft ExecutionMode. But this was done
by using the same codepath used for OpEntryPoint.
This has 2 issues:
- the interface variables were added to both OpEntryPoint and
OpExecutionMode.
- the existing OpExecutionMode logic was not used.
This commit fixes this, regrouping OpExecutionMode handling in one
place, and fixing the bad codegen when interface variables are added.
The current implementation outputs the opcode as an immediate, but
spirv-tools requires the name of the operation without the "Op" prefix
for the instruction OpSpecConstantOp.
That is, if the opcode is OpBitcast, the instruction must be
`%1 = OpSpecConstantOp %6 Bitcast %17`
instead of
`%1 = OpBitcast %6 124 %17`
[refer this commit for more
info](0f166be68d)
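The naming rule can be sketched as a one-line string transformation. `specConstantOpOperandName` is a hypothetical helper for illustration and is not the printer code changed by this commit.

```cpp
#include <cassert>
#include <string>

// Turn an opcode name such as "OpBitcast" into the operand spelling that
// follows OpSpecConstantOp, i.e. the same name without the "Op" prefix.
std::string specConstantOpOperandName(const std::string &OpcodeName) {
  if (OpcodeName.rfind("Op", 0) == 0) // name starts with "Op"
    return OpcodeName.substr(2);
  return OpcodeName;
}
```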
---------
Co-authored-by: Dmitry Sidorov <dmitry.sidorov@intel.com>
Co-authored-by: Ebin-McW <ebin.jose@multicorewareinc.com>
A new test added for SPIR-V friendly builtins for
SPV_KHR_bit_instructions revealed that the current mechanism to detect
whether the SPIR-V backend is in an OpenCL environment or a Vulkan
environment was not good enough. This PR updates how the environment is
detected and updates all the tests accordingly.
*UPDATE*: the new approach is to have a new member in `SPIRVSubtarget` to
represent the environment. It can be either Shader, Kernel or Unknown.
If the triple is explicit, we can directly set it at the creation of the
`SPIRVSubtarget`; otherwise we just leave it unknown until we find other
information that can help us set the environment. For now, the only
other information we use to set the environment is the `hlsl.shader`
attribute at `SPIRV::ExecutionModel::ExecutionModel
getExecutionModel(const SPIRVSubtarget &STI, const Function &F)`. Going
forward we should also consider specific instructions that are
Kernel-exclusive or Shader-exclusive.
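A minimal sketch of the resolution order, assuming the three environment states are shader, kernel and unknown. `EnvType` and `resolveEnvironment` are hypothetical names, and the boolean stands in for "the function carries the `hlsl.shader` attribute".

```cpp
#include <cassert>

enum class EnvType { Shader, Kernel, Unknown };

// An environment taken from an explicit triple wins. Otherwise the
// `hlsl.shader` attribute is the only extra hint considered here.
EnvType resolveEnvironment(EnvType FromTriple, bool HasHlslShaderAttr) {
  if (FromTriple != EnvType::Unknown)
    return FromTriple;
  return HasHlslShaderAttr ? EnvType::Shader : EnvType::Unknown;
}
```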
---------
Co-authored-by: marcos.maronas <mmaronas@smtp.igk.intel.com>
TargetExtType values are replaced with calls to
`llvm.spv.track.constant`, with a `poison` value, but
`llvm.spv.assign.type` was called with their original value. This PR
updates the `assign.type` call to be consistent with the
`track.constant` call.
Fixes #134417.
---------
Co-authored-by: Steven Perron <stevenperron@google.com>
These are identified by misc-include-cleaner. I've filtered out those
that break builds. Also, I'm staying away from llvm-config.h,
config.h, and Compiler.h, which likely cause platform- or
compiler-specific build failures.
This relands #141031
This change ensures generated SPIR-V is valid and passes machine
verification:
```
*** Bad machine code: inconsistent constant size ***
- function: foo
- basic block: %bb.1 entry (0x9ec9298)
- instruction: %12:iid(s8) = G_CONSTANT i4 1
```
That is done by promoting `G_CONSTANT` instructions with small integer
types (e.g., `i4`) to `i8` if no extensions for "special" integer types
are enabled.
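The promotion rule can be sketched as a function on bit widths. `promoteConstantWidth` is a hypothetical name; leaving 1-bit values untouched is an assumption of this sketch, not something the commit message states.

```cpp
#include <cassert>

// Return the bit width a G_CONSTANT should use. Widths between 2 and 7
// are widened to 8 unless an extension for "special" integer types is
// enabled. Assumption: 1-bit values are left alone.
unsigned promoteConstantWidth(unsigned Bits, bool SpecialIntExtEnabled) {
  if (SpecialIntExtEnabled)
    return Bits;
  if (Bits > 1 && Bits < 8)
    return 8;
  return Bits;
}
```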
Remove the MCSubtargetInfo argument from applyFixup, introduced in
https://reviews.llvm.org/D45962 , as it's only required by ARM. Instead,
add const MCFragment & so that ARMAsmBackend can retrieve
MCSubtargetInfo via a static member function.
Additionally, remove the MCAssembler argument, which is also only
required by ARM.
Additionally, make applyReloc non-const. Its arguments now fully cover
addReloc's functionality.