60 Commits

Author SHA1 Message Date
Diego Novillo
06aae40c6d
[HLSL][SPIRV] Restore support for -g to generate NSDI (#190007)
The original attempt (#187051) produced a regression for
`intel-sycl-gpu` because `SPIRVEmitNonSemanticDI` will now self-activate
whenever `llvm.dbg.cu` is present. This removed the need for the
explicit `--spv-emit-nonsemantic-debug-info` flag.

The pass is now entered unconditionally for all SPIR-V targets, but
`NonSemantic.Shader.DebugInfo.100` requires the
`SPV_KHR_non_semantic_info` extension. Targets like `spirv64-intel` do not enable
that extension by default. When `checkSatisfiable()` ran on those
targets, it issued a fatal error rather than silently skipping.

Adds an early-out from `emitGlobalDI()`: if
`SPV_KHR_non_semantic_info` is not available for the current target, the
pass returns without emitting anything.
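
The two guards described above (self-activation on `llvm.dbg.cu`, then the extension early-out) can be sketched as a small decision function. This is a toy model, not the pass itself: `ToyModule`, `ToySubtarget`, `NSDIAction` and `decideNSDI` are hypothetical names that only mimic the checks named in this message.

```cpp
#include <cassert>
#include <set>
#include <string>

// Toy stand-ins for the LLVM module and the SPIR-V subtarget (hypothetical).
struct ToyModule {
  std::set<std::string> NamedMetadata;
};
struct ToySubtarget {
  std::set<std::string> Extensions;
};

enum class NSDIAction { SkipNoDebugInfo, SkipNoExtension, Emit };

// Order of checks as described above: no debug metadata means nothing to do;
// a missing extension means skip silently instead of failing later.
NSDIAction decideNSDI(const ToyModule &M, const ToySubtarget &ST) {
  if (!M.NamedMetadata.count("llvm.dbg.cu"))
    return NSDIAction::SkipNoDebugInfo;
  if (!ST.Extensions.count("SPV_KHR_non_semantic_info"))
    return NSDIAction::SkipNoExtension;
  return NSDIAction::Emit;
}

// Usage example: debug info is present but the extension is not enabled.
void nsdiExample() {
  ToyModule M;
  M.NamedMetadata.insert("llvm.dbg.cu");
  ToySubtarget NoExtension;
  assert(decideNSDI(M, NoExtension) == NSDIAction::SkipNoExtension);
}
```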
2026-04-01 21:00:36 -07:00
Arseniy Obolenskiy
4a773b9f35
[SPIR-V] Emit OpLoopMerge for non-shader targets without SPV_INTEL_unstructured_loop_controls extension (#187519)
`OpLoopMerge` emission was not supported because the SPIR-V
structurizer is not run for non-shader targets.

After support for `SPV_INTEL_unstructured_loop_controls` was enabled in
https://github.com/llvm/llvm-project/pull/178799, the backend started to
preserve some information about unstructured control flow. This PR is
intended to enable support for `OpLoopMerge` without the extension.

Note: changes in `llvm/test/CodeGen/SPIRV/pointers/phi-chain-types.ll`
and `llvm/test/CodeGen/SPIRV/llvm-intrinsics/memset.ll` are due to the
fact that the loop layout has changed after enabling the `loop-simplify` pass.
2026-03-30 13:59:40 +02:00
Nick Sarnie
09951fd475
Revert "[HLSL][SPIRV] Add support for -g to generate NonSemantic Debug Info" (#188771)
Reverts llvm/llvm-project#187051

Breaks some OpenMP offload tests
2026-03-26 18:58:47 +00:00
Diego Novillo
85049fc357
[HLSL][SPIRV] Add support for -g to generate NonSemantic Debug Info (#187051)
This adds two related changes to HLSL debug info support in the SPIR-V
backend. It's a first small step towards the plan I described in
https://discourse.llvm.org/t/hlsl-spirv-nsdi-debug-info-support-for-clang-dxc/90149.

## Tag HLSL shaders with `DW_LANG_HLSL` in the front-end

`GetSourceLanguage()` in `clang/lib/CodeGen/CGDebugInfo.cpp` checked
`LO.CPlusPlus` before `LO.HLSL`. Since HLSL is compiled as C++, the HLSL
check was never reached. Shaders compiled with `-g` were tagged with
`DW_LANG_C_plus_plus_14` instead of `DW_LANG_HLSL`. The NSDI pass
already had the correct mapping for `DW_LANG_HLSL` but it was never
triggered.

This fixes #136929 and #136995.
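
The ordering problem described in this section can be reproduced with a reduced model. The sketch below is not the code from `CGDebugInfo.cpp`; `ToyLangOptions`, `ToyLang` and both functions are hypothetical and only show why testing the broader flag first hides the more specific one.

```cpp
#include <cassert>

// Hypothetical reduction of the language options: HLSL sets both flags
// because it is compiled as C++.
struct ToyLangOptions {
  bool CPlusPlus = false;
  bool HLSL = false;
};

enum class ToyLang { C, CPlusPlus14, HLSL };

// Old order: the C++ test wins, so the HLSL branch is unreachable for HLSL.
ToyLang sourceLanguageBefore(const ToyLangOptions &LO) {
  if (LO.CPlusPlus)
    return ToyLang::CPlusPlus14;
  if (LO.HLSL)
    return ToyLang::HLSL;
  return ToyLang::C;
}

// New order: test the more specific language first.
ToyLang sourceLanguageAfter(const ToyLangOptions &LO) {
  if (LO.HLSL)
    return ToyLang::HLSL;
  if (LO.CPlusPlus)
    return ToyLang::CPlusPlus14;
  return ToyLang::C;
}

// Usage example: an HLSL shader is only tagged correctly by the new order.
void sourceLanguageExample() {
  ToyLangOptions Shader;
  Shader.CPlusPlus = true;
  Shader.HLSL = true;
  assert(sourceLanguageBefore(Shader) == ToyLang::CPlusPlus14);
  assert(sourceLanguageAfter(Shader) == ToyLang::HLSL);
}
```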

## Make `SPIRVEmitNonSemanticDI` activate automatically when `-g` is
used

`SPIRVPassConfig::addPreEmitPass()` only scheduled
`SPIRVEmitNonSemanticDI` when `--spv-emit-nonsemantic-debug-info` was
set or the target vendor was AMD. Passing `-g` to clang had no effect on
the SPIR-V backend pass.

The pass is now added unconditionally and self-activates by checking for
`llvm.dbg.cu` in the module. When no debug metadata is present it exits
early with no effect. This avoids the need to inspect module metadata at
pass-configuration time, which is not reliably available.

`--spv-emit-nonsemantic-debug-info` is now a deprecated synonym for
`-g`.

The alternative to the unconditional pass approach is to check at
pass-configuration time whether the module was compiled with debug info
(e.g. via `TargetOptions::DebugInfoForProfiling` or a similar flag
forwarded from the driver). I went with the unconditional approach
because it is simpler and the pass is cheap to enter and exit when no
`llvm.dbg.cu` is present.

I'm not sure whether adding a pass unconditionally is acceptable. Does
this sound reasonable, or would it be better to implement the
flag-forwarding approach?

Changes to tests:

- `clang/test/CodeGenHLSL/` (new): verifies that `-g` on an HLSL SPIR-V
target produces `DebugCompilationUnit` with language code 5
(`DW_LANG_HLSL`).
-
`llvm/test/CodeGen/SPIRV/debug-info/hlsl-debug-info-auto-activation.ll`
(new): verifies that a module with `llvm.dbg.cu` and `DW_LANG_HLSL`
produces `DebugCompilationUnit` without
`--spv-emit-nonsemantic-debug-info`.
- Existing `debug-compilation-unit.ll`, `debug-type-basic.ll`,
`debug-type-pointer.ll`: updated to verify NSDI is emitted whenever
debug metadata is present.
- `llc-pipeline.ll`: updated to reflect that `SPIRVEmitNonSemanticDI` is
now always in the pipeline.

---------

Co-authored-by: Eric Christopher <echristo@gmail.com>
2026-03-25 11:17:09 -07:00
ambergorzynski
741eb80152
[SPIRV] Add pass SPIRVEmitIntrinsics to new pass manager (#188285)
Registered SPIRVEmitIntrinsics with the new pass manager as
`spirv-emit-intrinsics`. The motivation for this is to allow it to be run from
`opt` as a standalone pass for testing purposes. A simple pipeline test
is also included to check that the registration works.
2026-03-25 16:39:41 +00:00
Alex Duran
de0c366e4b
[llvm][SPIRV] Add pass to lower Ctors/Dtors for SPIRV (#187509)
This PR adds a new SPIRV pass that generates a kernel named
"spirv$device$init" that iterates the pointers in the table pointed by
__init_array_start and __init_array_end and executes them. It also
generates symbols for each constructor with the form
__init_array_object_NAME_PRIORITY.

These symbols will be used by the Level Zero plugin in the liboffload
runtime (with the support introduced by #187510) to generate the
aforementioned table as spirv-link cannot create the table itself.

It also does the same thing for destructors, with the kernel name being
"spirv$device$fini", the table pointers __fini_array_start and
__fini_array_end, and the generated symbols prefix __fini_array_object.
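
What the generated init and fini kernels do can be sketched in plain C++ as a walk over a table of function pointers. This is only an illustration under assumptions: `InitFn`, `runTable`, `ctorA`, `ctorB` and `runExampleTable` are hypothetical names, and a local array stands in for the table that the runtime plugin would build from the generated symbols.

```cpp
#include <cassert>
#include <vector>

using InitFn = void (*)();

// Stand-in for the kernel body: call every pointer in the half-open range
// [Start, End), in table order.
void runTable(const InitFn *Start, const InitFn *End) {
  for (const InitFn *It = Start; It != End; ++It) {
    assert(*It && "null entry in the table");
    (*It)();
  }
}

// Hypothetical constructors that record the order in which they ran.
static std::vector<int> Order;
static void ctorA() { Order.push_back(1); }
static void ctorB() { Order.push_back(2); }

// Builds a two-entry table locally and runs it, returning the call order.
std::vector<int> runExampleTable() {
  Order.clear();
  const InitFn Table[] = {ctorA, ctorB};
  runTable(Table, Table + 2);
  return Order;
}
```

The same loop shape would apply to the destructor table, with the fini start and end pointers as the range.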

The code was mostly generated by Claude 4.5 and has been reviewed by me
to the best of my ability.
2026-03-24 21:47:09 +00:00
Nick Sarnie
e4c30c15c8
[SPIRV] Extend lowering of variadic functions (#178980)
Variadic function lowering for SPIR-V was initially added in
https://github.com/llvm/llvm-project/pull/175076.

However, I tried a full OpenMP offloading example that includes a vararg
call and hit a few issues:

1) The OpenMP Device library function `ompx::printf` was incorrectly
being considered a builtin `printf` function that would be handled
specifically by the SPIR-V backend.

The fix here is to remove the `printf` special handling.

2) We were getting an assert in ModuleVerifier saying the LLVM lifetime
intrinsics were being called with an argument that was neither an
`alloca` ptr nor `poison`. The problem is the `alloca` was replaced with
a SPIR-V intrinsic `alloca` in `SPIRVPrepareFunctions`, but the lifetime
intrinsic added in `ExpandVariadics` was not lowered to the SPIR-V
lifetime intrinsic since `ExpandVariadics` is run after
`SPIRVPrepareFunctions`.

The fix here is to just run `ExpandVariadics` first.

3) There were `va` intrinsics taking in an `addrspace(4)` pointer that
were not being expanded.

The fix here is to extend `ExpandVariadics` to support expanding `va`
intrinsics with target-specific address spaces.

---------

Signed-off-by: Nick Sarnie <nick.sarnie@intel.com>
Co-authored-by: Joseph Huber <huberjn@outlook.com>
2026-02-13 18:22:25 +00:00
Lleu Yang
8fc59bc0e3
[SPIRV] Add handling for uinc_wrap and udec_wrap atomics (#179114)
This adds atomicrmw `uinc_wrap` and `udec_wrap` operations support for
SPIR-V. Since SPIR-V doesn't provide dedicated instructions for those
two operations, we have to use the `AtomicExpand` pass to expand the
operations into CAS forms.
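
As an illustration of what a CAS form looks like, here is a sketch in portable C++ with `std::atomic`. It is not the output of `AtomicExpand`; the function names are made up for this example, and the update rules follow the `atomicrmw` semantics in the LLVM Language Reference.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// uinc_wrap: *ptr = (*ptr u>= val) ? 0 : (*ptr + 1)
// Returns the value that was in memory before the update.
uint32_t atomicUIncWrap(std::atomic<uint32_t> &Obj, uint32_t Val) {
  uint32_t Old = Obj.load();
  uint32_t New;
  do {
    New = (Old >= Val) ? 0u : Old + 1u;
    // On failure compare_exchange_weak refreshes Old, so the next iteration
    // recomputes New from the freshly observed value.
  } while (!Obj.compare_exchange_weak(Old, New));
  return Old;
}

// udec_wrap: *ptr = ((*ptr == 0) || (*ptr u> val)) ? val : (*ptr - 1)
uint32_t atomicUDecWrap(std::atomic<uint32_t> &Obj, uint32_t Val) {
  uint32_t Old = Obj.load();
  uint32_t New;
  do {
    New = (Old == 0u || Old > Val) ? Val : Old - 1u;
  } while (!Obj.compare_exchange_weak(Old, New));
  return Old;
}

// Usage example: a counter that wraps to 0 once it has reached 3.
void wrapExample() {
  std::atomic<uint32_t> Counter{3};
  uint32_t Previous = atomicUIncWrap(Counter, 3);
  assert(Previous == 3 && Counter.load() == 0);
}
```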

Closes #177204.
2026-02-10 01:39:05 +01:00
Joseph Huber
efb57947ba
[SPIR-V] Enable variadic function lowering for the SPIR-V target (#175076)
Summary:
We support variadic functions in AMDGPU / NVPTX via an LLVM-IR pass.
This patch applies the same handling here to support them on this
target.

I am unsure what the ABI should look like here, I have mostly copied the
one we use for NVPTX where it's basically a struct layout with natural
alignment. This wastes some space, which is why AMDGPU does not pad
them.

Additionally, this required allowing the SPIRV_FUNC calling convention.
I'm assuming this is compatible with the C calling convention in IR, but
I will need someone to confirm that for me.
2026-01-19 12:19:15 -06:00
Nick Sarnie
75ec177483
[SPIRV] Add legalization pass for zero-size arrays (#172367)
This adds a legalization pass to convert zero size arrays to legal types
for common cases. It doesn't handle all cases, but if we see real use
cases for other cases, we can add them in the future.

For globals, and their initializers, we generally replace `[0 x T]` with
`ptr`.

For instructions, we either replace `[0 x T]` with `poison` or, for
`alloca`, just allocate `T`.

This is motivated by IR generated by the OpenMP front end.

Issue: https://github.com/llvm/llvm-project/issues/170150

---------

Signed-off-by: Nick Sarnie <nick.sarnie@intel.com>
2026-01-07 16:58:53 +00:00
Nathan Gauër
8cfda79105
[HLSL][SPIR-V] Implement vk::push_constant (#166793)
Implements initial support for vk::push_constant.
As is, this allows handling simple push constants, but has one
main issue: layout can be incorrect (See #168401). The layout
issue being not only push-constant related, it's ignored for this PR.

The frontend part of the implementation is straightforward:
 - adding a new attribute
 - when targeting vulkan/spirv, we process it
 - global variables with this attribute get a new AS:
   hlsl_push_constant

The IR has nothing specific, only some RO globals in this new AS.

On the SPIR-V side, we now convert this AS into a PushConstant storage
class. But this creates some issues: the variables in this storage class
must have a specific set of decorations to define their layout.

Current infra to create the SPIR-V types lacks the context required to
make this decision: no indication on the AS or context around the type
being created. Refactoring this would be a heavy task as it would
require getting this information in every place using the GR for type
creation.

Instead, we do something similar to CBuffers:
 - find all globals with this address space, and change their type to
   a target-specific type.
 - insert a new intrinsic in place of every reference to this global
   variable.

This allows the backend to handle both layout variable loads and type
lowering independently.

Type lowering has nothing specific: when we encounter a target extension
type with spirv.PushConstant, we lower this to the correct SPIR-V type
with the proper offset & block decorations.

As for the intrinsic, it's mostly a no-op, but required since we have
this target-specific type.

Note: this implementation prevents the static declaration of multiple
push constants in a single shader module. The actual specification is
more relaxed: there can be only one **used** push constant block per
entrypoint. To correctly implement this, we'd need to keep some
additional state to determine the list of statically used resources per
entrypoint. This shall be addressed as a follow-up (see #170310).
2025-12-18 11:01:11 +01:00
Juan Manuel Martinez Caamaño
d23d8abf1f
[SPIRV][SPIRVPrepareGlobals] Convert llvm.embedded.module from a 0-element array to a 1-element array (#166950)
When compiling with `-fembed-bitcode-marker`, Clang inserts a
placeholder
for the bitcode. This placeholder is a `[0 x i8]` array, which we cannot
represent in SPIRV.

For AMD flavored SPIRV, we extend the `llvm.embedded.module` global to a
`zeroinitializer [1 x i8]` array.

To achieve this, this patch adds a new pass, `SPIRVPrepareGlobals`, that
we can use to write global variable's _non-trivial-to-lower-IR_ ->
_trivial-to-lower-IR_ mappings.

This is a second attempt at
https://github.com/llvm/llvm-project/pull/162082, but cleaner.

In the translator something similar is done for every 0-element array
since https://github.com/KhronosGroup/SPIRV-LLVM-Translator/pull/2743 .
But I don't think we want to do this mapping for all cases.
2025-11-12 08:47:26 +00:00
Alex Voicu
0307147105
[NFC][SPIRV] Add AMDGCN SPIR-V specific defaults to the BE (#165815)
AMDGCN flavoured SPIR-V has slightly different defaults from what the BE
adopts: it assumes all extensions are enabled, and expects nonsemantic
debug info to be generated. Furthermore, it is necessary to encode in
the resulting SPIR-V binary that what was generated was AMDGCN
flavoured, which we do by setting the Generator Version to `UINT16_MAX`
(which matches what we expect to see at reverse translation). We will
register this generator version at
<https://github.com/KhronosGroup/SPIRV-Headers>. This is a preliminary
patch out of a series of patches that are needed for adopting the BE for
AMDGCN flavoured SPIR-V generation.
2025-11-04 12:45:53 +00:00
Kazu Hirata
4eed68357e
[llvm] Use "= default" (NFC) (#166088)
Identified with modernize-use-equals-default.
2025-11-02 17:16:47 -08:00
Steven Perron
00ad9ecc1c
[SPIRV][HLSL] Implement CBuffer access lowering pass (#159136)
This patch introduces a new pass, SPIRVCBufferAccess, which is
responsible for translating accesses to HLSL constant buffer (cbuffer)
global variables into accesses to the proper SPIR-V resource.

The pass operates by:
1. Identifying all cbuffers via the `!hlsl.cbs` metadata.
2. Replacing all uses of cbuffer member global variables with
`llvm.spv.resource.getpointer` intrinsics.
3. Cleaning up the original global variables and metadata.

This approach allows subsequent passes, like SPIRVEmitIntrinsics, to
correctly fold GEPs into a single OpAccessChain instruction.

The patch also includes a comprehensive set of lit tests to cover
various scenarios:
- Basic cbuffer access direct load and GEPs.
- Unused and partially unused cbuffers.

This implements the SPIR-V version of

https://github.com/llvm/wg-hlsl/blob/main/proposals/0016-constant-buffers.md#lowering-to-buffer-load-intrinsics.
2025-09-22 09:01:48 -04:00
Reid Kleckner
f3efbce4a7
[llvm] Move data layout string computation to TargetParser (#157612)
Clang and other frontends generally need the LLVM data layout string in
order to generate LLVM IR modules for LLVM. MLIR clients often need it
as well, since MLIR users often lower to LLVM IR.

Before this change, the LLVM datalayout string was computed in the
LLVM${TGT}CodeGen library in the relevant TargetMachine subclass.
However, none of the logic for computing the data layout string requires
any details of code generation. Clients who want to avoid duplicating
this information were forced to link in LLVMCodeGen and all registered
targets, leading to bloated binaries. This happened in PR #145899,
which measurably increased binary size for some of our users.

By moving this information to the TargetParser library, we
can delete the duplicate datalayout strings in Clang, and retain the
ability to generate IR for unregistered targets.

This is intended to be a very mechanical LLVM-only change, but there is
an immediately obvious follow-up to clang, which will be prepared
separately.

The vast majority of data layouts are computable with two inputs: the
triple and the "ABI name". There is only one exception, NVPTX, which has
a cl::opt to enable short device pointers. I invented a "shortptr" ABI
name to pass this option through the target independent interface.
Everything else fits. Mips is a bit awkward because it uses a special
MipsABIInfo abstraction, which includes members with codegen-like
concepts like ABI physical registers that can't live in TargetParser. I
think the string logic of looking for "n32" "n64" etc is reasonable to
duplicate. We have plenty of other minor duplication to preserve
layering.

---------

Co-authored-by: Matt Arsenault <arsenm2@gmail.com>
Co-authored-by: Sergei Barannikov <barannikov88@gmail.com>
2025-09-11 11:05:29 -07:00
Nick Sarnie
343186deef
[clang][SPIRV] Set program address space for Intel-flavored SPIR-V (#135251)
Technically, SPIR-V should use addrspace(4) for generic pointers.

We already set the default AS in TargetInfo to 4, but that's not enough
for all cases. Also set the program address space to 4 to fix the
remaining cases. AMD already does this for their SPIR-V target, do it
for Intel's SPIR-V target.

I need this for OpenMP offloading to SPIR-V. There are a couple of
places I need to change in the OMP FE to check the program AS, I'll do
that in a follow-up PR.

---------

Signed-off-by: Sarnie, Nick <nick.sarnie@intel.com>
2025-09-11 14:24:41 +00:00
Nathan Gauër
d67ab11f2e
[SPIR-V] Move structurizer to ISel prepare (#157886)
Some passes like LoopSimplify/SimplifyCFG are running between IRPasses
and ISelPrepare. This is an issue because the structurizer generates
OpSelectionMerge/OpLoopMerge instructions at specific places, and those
passes are moving them.
Moving the structurizer later solves this issue.
2025-09-11 15:02:43 +02:00
Steven Perron
25bf86fede
[SPIRV] Add pass to replace gethandlefromimplicitbinding (#146756)
The HLSL frontend generates a call to the intrinsic
@llvm.spv.resource.handlefromimplicitbinding to be able to access a
resource where the set and binding were not explicitly given in the
source code. Determining the correct set and binding cannot be done
during Clang's codegen or earlier because in DXIL, they must first
remove resources that are not accessed before assigning binding locations
to the resources without an explicit binding.

We will follow their lead.

This is a change from DXC, where implicit bindings for SPIR-V are
assigned before optimizations.

See https://github.com/llvm/wg-hlsl/pull/309
2025-08-06 13:10:55 -04:00
Andrew Rogers
19658d1474
[llvm] annotate interfaces in llvm/Target for DLL export (#143615)
## Purpose

This patch is one in a series of code-mods that annotate LLVM’s public
interface for export. This patch annotates the `llvm/Target` library.
These annotations currently have no meaningful impact on the LLVM build;
however, they are a prerequisite to support an LLVM Windows DLL (shared
library) build.

## Background

This effort is tracked in #109483. Additional context is provided in
[this
discourse](https://discourse.llvm.org/t/psa-annotating-llvm-public-interface/85307),
and documentation for `LLVM_ABI` and related annotations is found in the
LLVM repo
[here](https://github.com/llvm/llvm-project/blob/main/llvm/docs/InterfaceExportAnnotations.rst).

A subset of these changes was generated automatically using the
[Interface Definition Scanner (IDS)](https://github.com/compnerd/ids)
tool, followed by formatting with `git clang-format`.

The bulk of this change is manual additions of `LLVM_ABI` to
`LLVMInitializeX` functions defined in .cpp files under llvm/lib/Target.
Adding `LLVM_ABI` to the function implementation is required here
because they do not `#include "llvm/Support/TargetSelect.h"`, which
contains the declarations for these functions and was already updated
with `LLVM_ABI` in a previous patch. I considered patching these files
with `#include "llvm/Support/TargetSelect.h"` instead, but since
TargetSelect.h is a large file with a bunch of preprocessor x-macro
stuff in it I was concerned it would unnecessarily impact compile times.

In addition, a number of unit tests under llvm/unittests/Target required
additional dependencies to make them build correctly against the LLVM
DLL on Windows using MSVC.

## Validation

Local builds and tests to validate cross-platform compatibility. This
included llvm, clang, and lldb on the following configurations:

- Windows with MSVC
- Windows with Clang
- Linux with GCC
- Linux with Clang
- Darwin with Clang
2025-06-17 13:28:45 -07:00
Marcos Maronas
b1703ad38d
[SPIRV] Change how to detect OpenCL/Vulkan Env and update tests accordingly. (#129689)
A new test added for spirv-friendly builtins for
SPV_KHR_bit_instructions unveiled that current mechanism to detect
whether SPIRV Backend is in OpenCL environment or Vulkan environment was
not good enough. This PR updates how to detect the environment and all
the tests accordingly.

*UPDATE*: the new approach is having a new member in `SPIRVSubtarget` to
represent the environment. It can be Kernel, Shader or Unknown.
If the triple is explicit, we can directly set it at the creation of the
`SPIRVSubtarget`, otherwise we just leave it unknown until we find other
information that can help us set the environment. For now, the only
other information we use to set the environment is `hlsl.shader`
attribute at `SPIRV::ExecutionModel::ExecutionModel
getExecutionModel(const SPIRVSubtarget &STI, const Function &F)`. Going
forward we should consider also specific instructions that are
Kernel-exclusive or Shader-exclusive.
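
A minimal sketch of this two-step resolution follows. It is not the `SPIRVSubtarget` code: `ToyEnv`, `ToyFunction`, `initialEnv` and `refineEnv` are hypothetical, and mapping the `hlsl.shader` attribute to a shader environment is an assumption made for the illustration.

```cpp
#include <cassert>
#include <optional>

// Hypothetical environment values for this sketch.
enum class ToyEnv { Kernel, Shader, Unknown };

// Hypothetical function summary: only records whether `hlsl.shader` is set.
struct ToyFunction {
  bool HasHlslShaderAttr = false;
};

// Step 1: take the environment from an explicit triple, else stay Unknown.
ToyEnv initialEnv(std::optional<ToyEnv> FromTriple) {
  return FromTriple.value_or(ToyEnv::Unknown);
}

// Step 2: refine an Unknown environment when a function carries the
// attribute. An environment that is already known is never overwritten.
ToyEnv refineEnv(ToyEnv Current, const ToyFunction &F) {
  if (Current == ToyEnv::Unknown && F.HasHlslShaderAttr)
    return ToyEnv::Shader;
  return Current;
}

// Usage example: no explicit triple, then a shader entry point is seen.
void envExample() {
  ToyFunction Entry;
  Entry.HasHlslShaderAttr = true;
  ToyEnv Env = initialEnv(std::nullopt);
  assert(Env == ToyEnv::Unknown);
  assert(refineEnv(Env, Entry) == ToyEnv::Shader);
}
```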

---------

Co-authored-by: marcos.maronas <mmaronas@smtp.igk.intel.com>
2025-06-03 09:50:23 -04:00
Kazu Hirata
89fd7b3d1e
[SPIRV] Remove unused includes (NFC) (#141450)
These are identified by misc-include-cleaner.  I've filtered out those
that break builds.  Also, I'm staying away from llvm-config.h,
config.h, and Compiler.h, which likely cause platform- or
compiler-specific build failures.
2025-05-26 09:13:43 -07:00
Rahul Joshi
52c2e45c11
[NFC][CodeGen] Adopt MachineFunctionProperties convenience accessors (#141101)
2025-05-23 08:30:29 -07:00
Nathan Gauër
7d98b66e3d
[SPIR-V] Add InferAddrSpaces pass to the backend (#137766)
This commit enables a pass in the backend which propagates the addrspace
of the pointers down to the last use, making sure the addrspace remains
consistent, and thus stripping any addrspacecast. This is required to
lower LLVM-IR to logical SPIR-V, which does not support generic
pointers.

This is now required as HLSL emits several address spaces, and thus
addrspacecasts in some cases:

Example 1: resource access

```llvm
%handle = tail call target("spirv.VulkanBuffer", ...)
%rptr = @llvm.spv.resource.getpointer(%handle, ...);
%cptr = addrspacecast ptr addrspace(11) %rptr to ptr
%fptr = load i32, ptr %cptr
```

Example 2: object methods

```llvm
define void @objectMethod(ptr %this) {
  ret void
}

define void @foo(ptr addrspace(11) %object) {
  %this = addrspacecast ptr addrspace(11) %object to ptr
  call void @objectMethod(ptr %this)
  ret void
}
```
2025-05-07 16:53:25 +02:00
Matthias Braun
675cb70641
Register assembly printer passes (#138348)
Register assembly printer passes in the pass registry.

This makes it possible to use `llc -start-before=<target>-asm-printer ...` in tests.

Adds a `char &ID` parameter to the AssemblyPrinter constructor to allow
targets to use the `INITIALIZE_PASS` macros and register the pass in the
pass registry. This currently has a default parameter so it won't break
any targets that have not been updated.
2025-05-06 18:01:17 -07:00
Sergei Barannikov
bb1765179e
[TTI] Simplify implementation (NFCI) (#136674)
Replace "concept based polymorphism" with simpler PImpl idiom.

This pursues two goals:
* Enforce static type checking. Previously, target implementations hid
base class methods and type checking was impossible. Now that they
override the methods, the compiler will complain on mismatched
signatures.
* Make the code easier to navigate. Previously, if you asked your
favorite LSP server to show a method (e.g. `getInstructionCost()`), it
would show you methods from `TTI`, `TTI::Concept`, `TTI::Model`,
`TTIImplBase`, and target overrides. Now it is two less :)
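
The type-checking benefit can be shown with a reduced C++ example. The classes below are hypothetical stand-ins, not the real `TTI` hierarchy; the sketch only demonstrates that a method marked `override` is checked against the base declaration and is reached through the base interface.

```cpp
#include <cassert>

// Hypothetical stand-in for the shared implementation base class.
struct ToyImplBase {
  virtual ~ToyImplBase() = default;
  virtual int getInstructionCost(int /*Opcode*/) const { return 1; }
};

// Hypothetical target implementation. `override` makes the compiler compare
// this signature with the declaration in the base class.
struct ToyTargetImpl final : ToyImplBase {
  int getInstructionCost(int Opcode) const override {
    return Opcode == 0 ? 0 : 4;
  }
  // A mismatched signature would be rejected at compile time, for example:
  //   int getInstructionCost(long Opcode) const override;
};

// Callers only see the base interface; dispatch reaches the target override.
int queryCost(const ToyImplBase &Impl, int Opcode) {
  return Impl.getInstructionCost(Opcode);
}

// Usage example: the same call site yields the base or the target cost.
void overrideExample() {
  ToyTargetImpl Target;
  assert(queryCost(Target, 1) == 4);
  assert(queryCost(ToyImplBase{}, 1) == 1);
}
```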

There are three commits to hopefully simplify the review.

The first commit removes `TTI::Model`. This is done by deriving
`TargetTransformInfoImplBase` from `TTI::Concept`. This is possible
because they implement the same set of interfaces with identical
signatures.

The first commit makes `TargetTransformImplBase` polymorphic, which
means all derived classes should `override` its methods. This is done in
second commit to make the first one smaller. It appeared infeasible to
extract this into a separate PR because the first commit landed
separately would result in tons of `-Woverloaded-virtual` warnings (and
break `-Werror` builds).

The third commit eliminates `TTI::Concept` by merging it with the only
derived class `TargetTransformImplBase`. This commit could be extracted
into a separate PR, but it touches the same lines in
`TargetTransformInfoImpl.h` (removes `override` added by the second
commit and adds `virtual`), so I thought it may make sense to land these
two commits together.

Pull Request: https://github.com/llvm/llvm-project/pull/136674
2025-04-26 15:25:40 +03:00
Nathan Gauër
a625bc60e2
[HLSL][SPIR-V] Add hlsl_private address space for SPIR-V (#133464)
This is an alternative to
https://github.com/llvm/llvm-project/pull/122103

In SPIR-V, private global variables have the Private storage class. This
PR adds a new address space which allows frontend to emit variable with
this storage class when targeting this backend.

This is covered in this proposal: llvm/wg-hlsl@4c9e11a

This PR will cause addrspacecast to show up in several cases, like class
member functions or assignment. Those will have to be handled in the
backend later on, particularly to fixup pointer storage classes in some
functions.

Before this change, global variable were emitted with the 'Function'
storage class, which was wrong.
2025-04-10 10:55:10 +02:00
Rahul Joshi
3801bf6164
[NFC] Cleanup pass initialization for SPIRV passes (#134189)
- Do not call pass initialization functions from pass constructors.
- Instead, call them from SPIRV target initialization.
- https://github.com/llvm/llvm-project/issues/111767
2025-04-03 08:50:31 -07:00
Nathan Gauër
7c8b1275bc
[SPIR-V] Add pass to remove spv_ptrcast intrinsics (#128896)
OpenCL is allowed to cast pointers, meaning it can resolve some type
mismatches this way. In logical SPIR-V, those are restricted. This new
pass legalizes such pointer casts when targeting logical SPIR-V.

For now, this pass supports 3 cases we witnessed:
 - loading a vec3 from a vec4*.
 - loading a scalar from a vec*.
 - loading the 1st element of an array.

---------

Co-authored-by: Steven Perron <stevenperron@google.com>
2025-03-04 10:30:46 +01:00
Farzon Lotfi
eddeb36cf1
[SPIRV] add pre legalization instruction combine (#122839)
- Add the boilerplate to support instcombine in SPIRV
- instcombine length(X-Y) to distance(X,Y)
- switch HLSL's distance intrinsic to not special case for SPIRV.
- fixes #122766
- In this RFC we were asked to add the infra for pattern matching:
https://discourse.llvm.org/t/rfc-add-targetbuiltins-for-spirv-to-support-hlsl/83329/13
2025-01-17 14:46:14 -05:00
joaosaffran
10b1caf6b9
[SPIRV][OPT] Adding flag to run spirv structurizer (#119301)
This PR adds a new flag to `opt` to run the SPIR-V structurizer. It is
being added to improve testing of this pass.

This change is required to implement a test request that came from
https://github.com/llvm/llvm-project/pull/116331.

---------

Co-authored-by: Joao Saffran <jderezende@microsoft.com>
2024-12-10 18:34:33 -08:00
Vyacheslav Levytskyy
42633cf27b
[SPIR-V] Improve general validity of emitted code between passes (#119202)
This PR improves general validity of emitted code between passes by
generating `TargetOpcode::PHI` instead of `SPIRV::OpPhi` after
Instruction Selection, fixing generation of OpTypePointer instructions
and using proper virtual register classes.

Using `TargetOpcode::PHI` instead of `SPIRV::OpPhi` after Instruction
Selection has the benefit of supporting existing optimization passes
immediately, as an alternative to disabling the passes that use
`MI.isPHI()`. This PR thus makes it possible to revert the changes from
https://github.com/llvm/llvm-project/pull/116060 and get back to
using the `MachineSink` pass.

This PR is a solution to the problem discussed in detail in
https://github.com/llvm/llvm-project/pull/110507. It follows the advice
of the reviewers of PR #110507 to postpone generation of OpPhi
rather than to patch CodeGen. This solution unblocks
improvements wrt. expensive checks and makes it unrelated to the general
points of the discussion about OpPhi vs. G_PHI/PHI.

This PR contains numerous small fixes to the validity of emitted code that
substantially improve the pass rate with expensive checks. Namely, the
test suite with expensive checks set ON now has only 12 failures out of 569
total test cases.

FYI @bogner
2024-12-09 21:10:09 +01:00
Vyacheslav Levytskyy
565a9ac7df
[SPIR-V] Disable Machine Sink pass in SPIR-V Backend (#116060)
Some standard passes that optimize machine instructions in SSA form use
MI.isPHI(), which doesn't account for OpPhi in SPIR-V, and so are able to
break the CFG. MachineSink is among such passes (see for example
1884ffc41c/llvm/lib/CodeGen/MachineSink.cpp (L630)),
so this PR disables the pass to ensure correctness of the generated
code.

There is a reproducer of the issue that demonstrates how MachineSink is
able to generate invalid code for the SPIR-V Backend:

```
error: line 6837: OpPhi must appear within a non-entry block before all non-OpPhi instructions (except for OpLine, which can be mixed with OpPhi).
  %z_fra_3_1 = OpPhi %uint %and187 %4250 %inc194 %4257 %uint_0 %4264
```

The reproducer is part of the SYCL end-to-end test suite
(https://github.com/intel/llvm/blob/sycl/sycl/test-e2e/DeviceLib/imf_fp32_rounding_test.cpp).
At the moment it doesn't seem feasible to make it part of the SPIR-V
Backend test suite because the intermediate LLVM IR that causes the
problem is far too large.
2024-11-19 21:42:44 +01:00
Matin Raayai
bb3f5e1fed
Overhaul the TargetMachine and LLVMTargetMachine Classes (#111234)
Following discussions in #110443, and the following earlier discussions
in https://lists.llvm.org/pipermail/llvm-dev/2017-October/117907.html,
https://reviews.llvm.org/D38482, https://reviews.llvm.org/D38489, this
PR attempts to overhaul the `TargetMachine` and `LLVMTargetMachine`
interface classes. More specifically:
1. Makes `TargetMachine` the only class implemented under
`TargetMachine.h` in the `Target` library.
2. `TargetMachine` contains target-specific interface functions that
relate to IR/CodeGen/MC constructs, whereas before (at least on paper)
it was supposed to have only IR/MC constructs. Any Target that doesn't
want to use the independent code generator simply does not implement
them, and returns either `false` or `nullptr`.
3. Renames `LLVMTargetMachine` to `CodeGenCommonTMImpl`. This renaming
aims to make the purpose of `LLVMTargetMachine` clearer. Its interface
was moved under the CodeGen library, to further emphasize its usage in
Targets that use CodeGen directly.
4. Makes `TargetMachine` the only interface used across LLVM and its
projects. With these changes, `CodeGenCommonTMImpl` is simply a set of
shared function implementations of `TargetMachine`, and CodeGen users
don't need to static cast to `LLVMTargetMachine` every time they need a
CodeGen-specific feature of the `TargetMachine`.
5. More importantly, does not change any requirements regarding library
linking.

cc @arsenm @aeubanks
2024-11-14 13:30:05 -08:00
Alex Voicu
2c13dec328
[clang][llvm][SPIR-V] Explicitly encode native integer widths for SPIR-V (#110695)
SPIR-V doesn't currently encode "native" integer bit-widths in its
datalayout(s). This is problematic as it leads to optimisation passes,
such as InstCombine, getting ideas and e.g. shrinking to non
byte-multiple integer types, which is not desirable and can lead to
breakage further down in the toolchain. This patch addresses that by
encoding `i8`, `i16`, `i32` and `i64` as native types for vanilla SPIR-V
(the spec natively supports them), and `i32` and `i64` for AMDGCNSPIRV
(where the hardware targets are known). We also set the stack alignment
on the latter, as it is overaligned (32-bit vs 8-bit).
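
In a datalayout string, native integer widths are listed in the `n<size1>:<size2>...` component, so the vanilla set above would correspond to a component such as `n8:16:32:64`. The sketch below parses that component from a layout string. It is an illustration only: the helper names and the example strings are made up here and are not the actual SPIR-V datalayouts, and the input is assumed to be well formed.

```cpp
#include <cassert>
#include <cctype>
#include <set>
#include <sstream>
#include <string>

// Collect the widths from the `n...` component of a datalayout string.
// Components are separated by '-', sizes inside a component by ':'.
std::set<unsigned> nativeIntWidths(const std::string &Layout) {
  std::set<unsigned> Widths;
  std::istringstream Components(Layout);
  std::string Component;
  while (std::getline(Components, Component, '-')) {
    // Keep only `n<digit>...`; this also skips `ni:...`, which describes
    // non-integral address spaces rather than integer widths.
    if (Component.size() < 2 || Component[0] != 'n' ||
        !std::isdigit(static_cast<unsigned char>(Component[1])))
      continue;
    std::istringstream Sizes(Component.substr(1));
    std::string Size;
    while (std::getline(Sizes, Size, ':'))
      Widths.insert(static_cast<unsigned>(std::stoul(Size)));
  }
  return Widths;
}

bool isNativeWidth(const std::string &Layout, unsigned Bits) {
  return nativeIntWidths(Layout).count(Bits) != 0;
}

// Usage example with a made-up layout string: i32 is native, i24 is not.
void nativeWidthExample() {
  const std::string Layout = "e-i64:64-n8:16:32:64";
  assert(isNativeWidth(Layout, 32));
  assert(!isNativeWidth(Layout, 24));
}
```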
2024-11-05 17:26:08 +02:00
Nathan Gauër
cba70550cc
[SPIR-V] Fix BB ordering & register lifetime (#111026)
The "topological" sorting was behaving incorrectly in some cases: 
the exit of a loop could have a lower rank than a node in the loop.
This causes issues when structurizing some patterns, and also codegen
issues as we could generate BBs in the incorrect order in regard to the
SPIR-V spec.

Fixing this ordering alone broke other parts of the structurizer, which
by luck worked. Had to fix those.

Added more test cases, especially to test basic patterns.

I also needed to tweak/disable some tests for 2 reasons:
 - SPIR-V now requires reg2mem/mem2reg to run, meaning dead stores
   are optimized away. Some tests require tweaks to avoid having the
   whole function removed.
 - Mem2Reg will generate variable & load/stores. This generates
   G_BITCAST in several cases. We currently do something wrong with
   G_BITCAST, which causes the MIR verifier to complain.
   Until this is resolved, I disabled the -verify-machineinstrs flag on
   those tests.

---------

Signed-off-by: Nathan Gauër <brioche@google.com>
2024-10-30 14:57:32 +01:00
Nathan Gauër
1ed65febd9
[SPIR-V] Add SPIR-V structurizer (#107408)
This commit adds an initial SPIR-V structurizer.
It leverages the previously merged passes, and the convergence region
analysis to determine the correct merge and continue blocks for SPIR-V.

The first part does a branch cleanup (simplifying switches, and
legalizing them), then merge instructions are added to cycles,
convergent and later divergent blocks.
Then comes the important part: splitting critical edges, and making sure
the divergent construct boundaries don't cross.

- we split blocks with multiple headers into 2 blocks.
- we split blocks that are merge blocks for 2 or more constructs:
the SPIR-V spec disallows a merge block being shared by 2
loop/switch/condition constructs.
- we split merge & continue blocks: the SPIR-V spec disallows a basic block
being both a continue block and a merge block.
- we remove superfluous headers: when a header doesn't bring more info
than the parent on the divergence state, it must be removed.

This PR leverages the merged SPIR-V simulator for testing, as well as
spirv-val. For now, most DXC structurization tests are passing. The
unsupported ones are either caused by unsupported features like switches
on boolean types, or switches in region exits, because the MergeExit
pass doesn't support those yet (there is a FIXME).

This PR is quite large, and the addition is not trivial, so I tried to keep
it simple. E.g., as soon as the CFG changes, I recompute the dominator
trees and other structures instead of updating them.

---------

Signed-off-by: Nathan Gauër <brioche@google.com>
2024-09-20 11:36:43 +02:00
Stephen Tozer
3d08ade7bd
[ExtendLifetimes] Implement llvm.fake.use to extend variable lifetimes (#86149)
This patch is part of a set of patches that add an `-fextend-lifetimes`
flag to clang, which extends the lifetimes of local variables and
parameters for improved debuggability. In addition to that flag, the
patch series adds a pragma to selectively disable `-fextend-lifetimes`,
and an `-fextend-this-ptr` flag which functions as `-fextend-lifetimes`
for `this` pointers only. All changes and tests in these patches were
written by Wolfgang Pieb (@wolfy1961), while Stephen Tozer (@SLTozer)
has handled review and merging. The extend lifetimes flag is intended to
eventually be set on by `-Og`, as discussed in the RFC
here:

https://discourse.llvm.org/t/rfc-redefine-og-o1-and-add-a-new-level-of-og/72850

This patch implements a new intrinsic instruction in LLVM,
`llvm.fake.use` in IR and `FAKE_USE` in MIR, that takes a single operand
and has no effect other than "using" its operand, to ensure that its
operand remains live until after the fake use. This patch does not emit
fake uses anywhere; the next patch in this sequence causes them to be
emitted from the clang frontend, such that for each variable (or `this`) a
fake.use operand is inserted at the end of that variable's scope, using
that variable's value. This patch covers everything post-frontend, which
is largely just the basic plumbing for a new intrinsic/instruction,
along with a few steps to preserve the fake uses through optimizations
(such as moving them ahead of a tail call or translating them through
SROA).

Co-authored-by: Stephen Tozer <stephen.tozer@sony.com>
2024-08-29 17:53:32 +01:00
bwlodarcz
62da359ce7
[SPIRV] Emitting DebugSource, DebugCompileUnit (#97558)
This commit introduces emission of DebugSource and DebugCompileUnit from
NonSemantic.Shader.DebugInfo.100, along with the required OpString
holding the filename.
NonSemantic.Shader.DebugInfo.100 is divided, following DWARF, into two
main concepts – emitting DIEs and Line information.
In DWARF, the .debug_abbrev and .debug_info sections are responsible for
emitting a tree of information entries (DIEs) about e.g. types and the
compilation unit. Correspondingly, NonSemantic.Shader.DebugInfo.100 has
instructions like DebugSource, DebugCompileUnit etc. which perform the
same role in a SPIR-V file. The difference is that SPIR-V has no
sections, only a logical layout, which forces the order of instruction
emission.
NonSemantic.Shader.DebugInfo.100 requires this type of global
information to be emitted after the OpTypeXXX and OpConstantXXX
instructions.
One of the goals was to minimize changes to and interaction with
SPIRVModuleAnalysis as much as possible, which the current commit
achieves by emitting its instructions directly into the MachineFunction.
The possibility of duplicates is mitigated by a guard inside the pass,
which emits the global information only once, in one function, so
duplicates have no chance of being emitted.
From that point, adding new debug global instructions should be
straightforward.
2024-08-22 20:27:36 -07:00
Alex Voicu
88e2bb4092
[clang][SPIR-V] Add support for AMDGCN flavoured SPIRV (#89796)
This change seeks to add support for vendor flavoured SPIRV - more
specifically, AMDGCN flavoured SPIRV. The aim is to generate SPIRV that
carries some extra bits of information that are only usable by AMDGCN
targets, forfeiting absolute genericity to obtain greater expressiveness
for target features:

- AMDGCN inline ASM is allowed/supported, under the assumption that the
[SPV_INTEL_inline_assembly](https://github.com/intel/llvm/blob/sycl/sycl/doc/design/spirv-extensions/SPV_INTEL_inline_assembly.asciidoc)
extension is enabled/used
- AMDGCN target specific builtins are allowed/supported, under the
assumption that e.g. the `--spirv-allow-unknown-intrinsics` option is
enabled when using the downstream translator
- the featureset matches the union of AMDGCN targets' features
- the datalayout string is overspecified to affix both the program
address space and the alloca address space, the latter under the
assumption that the
[SPV_INTEL_function_pointers](https://github.com/intel/llvm/blob/sycl/sycl/doc/design/spirv-extensions/SPV_INTEL_function_pointers.asciidoc)
extension is enabled/used, in which case the extant SPIRV datalayout
string would lead to pointers to functions pointing to the private
address space, which would be wrong.

Existing AMDGCN tests are extended to cover this new target. It is
currently dormant / will require some additional changes, but I thought
I'd rather put it up for review to get feedback as early as possible. I
will note that an alternative option is to place this under AMDGPU, but
that seems slightly less natural, since this is still SPIRV, albeit
relaxed in terms of preconditions & constrained in terms of
postconditions, and only guaranteed to be usable on AMDGCN targets (it
is still possible to obtain pristine portable SPIRV through usage of the
flavoured target, though).
2024-06-07 11:50:23 +01:00
Nathan Gauër
a5641f106a
[SPIR-V] Add pass to merge convergence region exit targets (#92531)
The structurizer required regions to be SESE: single entry, single exit.
This new pass transforms multiple-exit regions into single-exit regions.

```
      +---+
      | A |
      +---+
      /   \
   +---+ +---+
   | B | | C |  A, B & C belong to the same convergence region.
   +---+ +---+
     |     |
   +---+ +---+
   | D | | E |  D & E belong to the parent convergence region.
   +---+ +---+  This means B & C are the exit blocks of the region.
      \   /     And D & E the targets of those exits.
       \ /
        |
      +---+
      | F |
      +---+
```

This pass would assign one value per exit target:
D = 0
E = 1

Then, create one variable per exit block (B, C), and assign it to the
correct value: in B, the variable will have the value 0, and in C, the
value 1.

Then, we'd create a new block H, with a PHI node to gather those 2
variables, and a switch, to route to the correct target.

Finally, the branches in B and C are updated to exit to this new block.

```
      +---+
      | A |
      +---+
      /   \
   +---+ +---+
   | B | | C |
   +---+ +---+
      \   /
      +---+
      | H |
      +---+
      /   \
   +---+ +---+
   | D | | E |
   +---+ +---+
      \   /
       \ /
        |
      +---+
      | F |
      +---+
```

Note: the variable is set depending on the condition used to branch. If
B's terminator was conditional, the variable would be set using a
SELECT.
All internal edges of a region are left intact, only exiting edges are
updated.
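
The transformation described above can be sketched on a toy CFG. This is a
simplified model under stated assumptions, not the pass itself: blocks are
plain strings, every exit block is assumed to have a single exiting edge (so
no SELECT is needed), and the successor list of the new block stands in for
the PHI and the switch. `mergeExitTargets` and `MergeResult` are hypothetical
names.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Toy CFG: block name -> successor names (hypothetical model, not LLVM IR).
using CFG = std::map<std::string, std::vector<std::string>>;

struct MergeResult {
  std::map<std::string, int> TargetValue; // exit target -> switch case value
  std::map<std::string, int> ExitValue;   // exit block -> value fed to the PHI
};

// Routes every edge leaving `Region` through the new block `Merge`, which
// then dispatches to the original targets.
MergeResult mergeExitTargets(CFG &G, const std::set<std::string> &Region,
                             const std::string &Merge) {
  assert(!G.count(Merge) && "the merge block must be new");
  MergeResult R;
  std::vector<std::string> Targets; // exit targets, indexed by case value
  for (const std::string &BB : Region) {
    auto It = G.find(BB);
    if (It == G.end())
      continue;
    for (std::string &Succ : It->second) {
      if (Region.count(Succ))
        continue; // internal edges are left intact
      assert(!R.ExitValue.count(BB) && "toy model: one exiting edge per block");
      // Reuse the value of an already-seen target, else hand out the next one.
      auto Ins = R.TargetValue.emplace(Succ, static_cast<int>(Targets.size()));
      if (Ins.second)
        Targets.push_back(Succ);
      R.ExitValue[BB] = Ins.first->second;
      Succ = Merge; // the exit block now branches to the merge block
    }
  }
  G[Merge] = Targets; // stands in for the PHI + switch on the gathered value
  return R;
}
```

Applied to the first diagram with region {A, B, C} and new block H, B and C
both branch to H, H dispatches to D and E, and the recorded values are 0 in B
and 1 in C.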

---------

Signed-off-by: Nathan Gauër <brioche@google.com>
2024-06-03 11:35:55 +02:00
Alex Voicu
1120d8e6f7
[clang][CodeGen] Add AS for Globals to SPIR & SPIRV datalayouts (#88455)
Currently neither the SPIR nor the SPIRV targets specify the AS for
globals in their datalayout strings. This is problematic because
CodeGen/LLVM will default to AS0 in this case, which produces Globals
that end up in the private address space for e.g. OCL, HIPSPV or SYCL.
This patch addresses it by completing the datalayout string.
2024-04-16 11:37:29 +01:00
Vyacheslav Levytskyy
540d255167
[SPIRV] Add vector reduction instructions (#82786)
This PR adds vector reduction instructions according to
https://llvm.org/docs/GlobalISel/GenericOpcode.html#vector-reduction-operations
and thereby widens the range of successfully supported conversions,
covering new cases of vector reduction instructions which IRTranslator
is unable to resolve.

By legalizing vector reduction instructions we introduce new
instruction patterns that should be addressed, including patterns that
are delegated to the pre-legalize step. To address this problem, a new pass
is added that brings instructions newly generated by legalization
into the form required by instruction selection.

The expected overhead for existing cases is minimal, because the new pass
works only with newly introduced instructions; otherwise it is just an
additional code traversal without any actions.
2024-03-04 12:14:58 +01:00
Nathan Gauër
7b08b4360b
[SPIR-V] add convergence region analysis (#78456)
This new analysis returns a hierarchical view of the convergence regions
in the given function.
This will allow our passes to query which basic block belongs to which
convergence region, and structurize the code accordingly.

Definition
----------

A convergence region is a CFG with:
 - a single entry node.
 - one or multiple exit nodes (different from LLVM's regions).
 - one back-edge
 - zero or more subregions.

Excluding subregion nodes, the nodes of a region can only reference a
single convergence token. A subregion uses a different convergence
token.
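
The hierarchy defined above, and the "which region does this block belong
to" query it enables, can be sketched as a small tree. This is a toy model
with hypothetical names (`Region`, `addChild`, `innermost`), not the analysis
added by this commit. It assumes a region's block set also contains the
blocks of its subregions, and it leaves out convergence tokens and back-edges.

```cpp
#include <cassert>
#include <memory>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Toy convergence region: blocks are identified by name.
struct Region {
  Region *Parent = nullptr;
  std::string Entry;              // single entry node
  std::set<std::string> Exits;    // one or multiple exit nodes
  std::set<std::string> Blocks;   // assumption: includes subregion blocks
  std::vector<std::unique_ptr<Region>> Children;

  // Adds a subregion whose blocks must already belong to this region.
  Region *addChild(std::string EntryBlock, std::set<std::string> ChildBlocks) {
    for (const std::string &B : ChildBlocks)
      assert(Blocks.count(B) && "subregion blocks must be in the parent");
    auto Child = std::make_unique<Region>();
    Child->Parent = this;
    Child->Entry = std::move(EntryBlock);
    Child->Blocks = std::move(ChildBlocks);
    Children.push_back(std::move(Child));
    return Children.back().get();
  }

  // Returns the innermost region containing Block, or nullptr if none does.
  const Region *innermost(const std::string &Block) const {
    if (!Blocks.count(Block))
      return nullptr;
    for (const auto &Child : Children)
      if (const Region *R = Child->innermost(Block))
        return R;
    return this;
  }
};
```

A pass could then walk `Parent` pointers from the result of `innermost` to
find every region enclosing a block.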

Algorithm
---------

This algorithm assumes all loops are in the Simplify form.

Create an initial convergence region for the whole function.
  - the convergence token is the function entry token.
  - the entry is the function entrypoint.
  - Exits are all the basic blocks terminating with a return instruction.

Take the function CFG, and process it in DAG order (ignoring
back-edges). If a basic block is a loop header:
 - Create a new region.
   - The parent region is the parent loop's region if any, otherwise, the
     top level region.
   - The region blocks are all the blocks belonging to this loop.
   - For each loop exit:
     - visit the rest of the CFG in DAG order (ignore back-edges).
     - if the region's convergence token is found, add all the
       blocks dominated by the exit from which the token is reachable
       to the region.
   - continue the algorithm with the loop header's successors.
2024-02-02 18:22:14 +01:00
Nathan Gauër
0e1037edbf
[SPIR-V] Strip convergence intrinsics before ISel (#75948)
The structurizer will require the frontend to emit convergence
intrinsics. Once used to restructurize the control flow, those
intrinsics shall be removed, as they cannot be converted to
SPIR-V.

This commit adds a new pass to the SPIR-V backend which strips those
intrinsics.

Those 2 new steps are not limited to Vulkan as OpenCL could
also benefit from not crashing if a convergent operation is in
the IR (even though the frontend doesn't generate such intrinsics).

Signed-off-by: Nathan Gauër <brioche@google.com>
2024-01-15 11:35:35 +01:00
Nathan Gauër
a9ffc92fc4
[SPIR-V] Add pre-headers to loops. (#75844)
This is the first of the 7 steps outlined in #75801. This PR explicitly
calls the SimplifyLoops pass. The 6 other passes required to structurize
the IR should directly follow this one.

Running this pass could generate empty basic-blocks, which are implicit
fallthrough to the successor BB.
There was a specific condition in the SPIR-V ISel which handled implicit
fallthrough, but it couldn't work on empty basic-blocks. This commit
removes the old logic, and adds this new logic, which checks all
basic-blocks for implicit fallthroughs, including empty ones.
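
The check described above can be sketched on a toy block layout. This is a
simplified model, not the ISel code: instructions are plain strings,
`isTerminator` only recognizes the made-up spellings `br ...` and `ret`, and
`makeFallthroughsExplicit` is a hypothetical helper.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy machine block: just a list of textual instructions.
struct Block {
  std::vector<std::string> Insts;
};

// Toy terminator test: anything spelled "br ..." or exactly "ret".
static bool isTerminator(const std::string &Inst) {
  return Inst.rfind("br ", 0) == 0 || Inst == "ret";
}

// Appends an explicit branch to the next block in layout order wherever a
// block falls through implicitly, including when the block is empty.
// Returns the number of branches added.
unsigned makeFallthroughsExplicit(std::vector<Block> &Layout) {
  unsigned Added = 0;
  for (std::size_t I = 0; I < Layout.size(); ++I) {
    std::vector<std::string> &Insts = Layout[I].Insts;
    if (!Insts.empty() && isTerminator(Insts.back()))
      continue; // already ends with an explicit terminator
    assert(I + 1 < Layout.size() && "the last block cannot fall through");
    Insts.push_back("br bb" + std::to_string(I + 1));
    ++Added;
  }
  return Added;
}
```

The empty-block case is the one the old condition could not handle: here an
empty block simply fails the "ends with a terminator" test and receives a
branch like any other fallthrough block.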

---------

Signed-off-by: Nathan Gauër <brioche@google.com>
2024-01-08 11:41:45 +01:00
Paulo Matos
8b7326587b
[SPIRV] Fix SPV_KHR_expect_assume support (#67793)
Since efe0e10718, changes in tests are required: the extension needs to be
added to the Extensions list and to the command line to enable its use in
test runs.
2023-10-09 10:05:58 +02:00
Arthur Eubanks
0a1aa6cda2
[NFC][CodeGen] Change CodeGenOpt::Level/CodeGenFileType into enum classes (#66295)
This will make it easy for callers to see issues with and fix up calls
to createTargetMachine after a future change to the params of
TargetMachine.

This matches other nearby enums.

For downstream users, this should be a fairly straightforward
replacement,
e.g. s/CodeGenOpt::Aggressive/CodeGenOptLevel::Aggressive
or s/CGFT_/CodeGenFileType::
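
A likely reason scoped enums help here is that they do not convert to
integers implicitly, so a stale call site fails to compile instead of
silently passing an enumerator into a parameter of another type. The sketch
below uses toy mirrors of the old and new spellings in hypothetical
`old_style` / `new_style` namespaces; `toInt` and `fromInt` are illustrative
helpers, not LLVM functions.

```cpp
#include <cassert>
#include <type_traits>

// Toy mirror of the old spelling: an unscoped enum inside a namespace.
namespace old_style {
namespace CodeGenOpt {
enum Level { None, Less, Default, Aggressive };
}
} // namespace old_style

// Toy mirror of the new spelling: a scoped enum.
namespace new_style {
enum class CodeGenOptLevel { None, Less, Default, Aggressive };
}

// An unscoped enumerator converts to int implicitly, so a call site can keep
// compiling after a parameter changes type or position.
static_assert(std::is_convertible<old_style::CodeGenOpt::Level, int>::value,
              "unscoped enums convert implicitly");
// A scoped enumerator does not, so the same call site becomes a hard error.
static_assert(!std::is_convertible<new_style::CodeGenOptLevel, int>::value,
              "scoped enums need an explicit cast");

// Conversions now have to be spelled out in both directions.
int toInt(new_style::CodeGenOptLevel L) { return static_cast<int>(L); }

new_style::CodeGenOptLevel fromInt(int V) {
  assert(V >= 0 && V <= 3 && "value must name one of the four levels");
  return static_cast<new_style::CodeGenOptLevel>(V);
}
```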
2023-09-14 14:10:14 -07:00
Nathan Gauër
56396b25f1 [SPIRV-V] Add SPIR-V logical triple to llc
This commit adds the minimal required bits to build a logical SPIR-V
compute shader using LLC.
- Skip OpenCL-only capabilities & extensions for Logical SPIR-V.
- Generate required metadata for entrypoints from HLSL frontend.
- Fix execution mode to GLCompute in logical.

The main issue is the lack of "vulkan" bit in the triple.
This might need to be added as a vendor?
Because as-is, SPIRV32/64 assumes OpenCL, and then, SPIRV assumes
Vulkan. This is ok-ish today, but not correct.

Differential Revision: https://reviews.llvm.org/D156424
2023-09-11 10:31:50 +02:00
Bjorn Pettersson
2dd221fe48 Remove no longer needed includes of LegacyPassManager.h
Most of the removed includes should probably have been removed already
when we removed TargetMachine::adjustPassManager.
2023-02-06 13:38:57 +01:00