69 Commits

Author SHA1 Message Date
Nikita Popov
246a64a12e
[Clang] Rename HasLegalHalfType -> HasFastHalfType (NFC) (#153163)
This option is confusingly named. What it actually controls is whether,
under the default of `-ffloat16-excess-precision=standard`, it is
beneficial for performance to perform calculations on float (without
intermediate rounding) or not. For `-ffloat16-excess-precision=none` the
LLVM `half` type will always be used, and all backends are expected to
legalize it correctly.
2025-08-18 09:23:48 +02:00
Alex Voicu
ff8b4f8151
[AMDGCNSPIRV][NFC] Match AMDGPU's __builtin_va_list type (#152044)
AMDGCN flavoured SPIRV should math AMDGPU TI as much as possible, and
the va_list difference was spurious.
2025-08-05 15:30:16 +01:00
Wenju He
b41398294c
[SPIR] Set MaxAtomicInlineWidth minimum size to 32 for spir32 and 64 for spir64 (#148997)
Set MaxAtomicInlineWidth the same way as SPIR-V targets in 3cfd0c0d3697.
This PR fixes build warning in scoped atomic built-in in #146814:
`warning: large atomic operation may incur significant performance
penalty; ; the access size (2 bytes) exceeds the max lock-free size (0
bytes) [-Watomic-alignment]`
2025-07-17 09:02:10 +08:00
Yaxun (Sam) Liu
beea2a9414
[Clang] Respect MS layout attributes during CUDA/HIP device compilation (#146620)
This patch fixes an issue where Microsoft-specific layout attributes,
such as __declspec(empty_bases), were ignored during CUDA/HIP device
compilation on a Windows host. This caused a critical memory layout
mismatch between host and device objects, breaking libraries that rely
on these attributes for ABI compatibility.

The fix introduces a centralized hasMicrosoftRecordLayout() check within
the TargetInfo class. This check is aware of the auxiliary (host) target
and is set during TargetInfo::adjust if the host uses a Microsoft ABI.

The empty_bases, layout_version, and msvc::no_unique_address attributes
now use this centralized flag, ensuring device code respects them and
maintains layout consistency with the host.

Fixes: https://github.com/llvm/llvm-project/issues/146047
2025-07-09 08:53:10 -04:00
Nathan Gauër
33fee56499
[HLSL][SPIR-V] Change SPV AS map for groupshared (#143519)
The previous mapping we setting the hlsl_groupshared AS to 0, which
translated to either Generic or Function.
Changing this to 3, which translated to Workgroup.

Related to #142804
2025-06-11 14:22:45 +02:00
Nick Sarnie
3b9ebe9201
[clang] Simplify device kernel attributes (#137882)
We have multiple different attributes in clang representing device
kernels for specific targets/languages. Refactor them into one attribute
with different spellings to make it more easily scalable for new
languages/targets.

---------

Signed-off-by: Sarnie, Nick <nick.sarnie@intel.com>
2025-06-05 14:15:38 +00:00
Nathan Gauër
20d70196c9
[HLSL][SPIR-V] Implement vk::ext_builtin_input attribute (#138530)
This variable attribute is used in HLSL to add Vulkan specific builtins
in a shader.
The attribute is documented here:

17727e88fd/proposals/0011-inline-spirv.md

Those variable, even if marked as `static` are externally initialized by
the pipeline/driver/GPU. This is handled by moving them to a specific
address space `hlsl_input`, also added by this commit.

The design for input variables in Clang can be found here:
355771361e/proposals/0019-spirv-input-builtin.md


Co-authored-by: Justin Bogner <mail@justinbogner.com>
2025-06-04 13:22:37 +02:00
Victor Lomuller
c474f8f240
[clang][SPIRV] Add builtin for OpGenericCastToPtrExplicit and its SPIR-V friendly binding (#137805)
The patch introduce __builtin_spirv_generic_cast_to_ptr_explicit which
is lowered to the llvm.spv.generic.cast.to.ptr.explicit intrinsic.

The SPIR-V builtins are now split into 3 differents file:
BuiltinsSPIRVCore.td,
BuiltinsSPIRVVK.td for Vulkan specific builtins, BuiltinsSPIRVCL.td for
OpenCL specific builtins
and BuiltinsSPIRVCommon.td for common ones.

The patch also introduces a new header defining its SPIR-V friendly
equivalent (__spirv_GenericCastToPtrExplicit_ToGlobal,
__spirv_GenericCastToPtrExplicit_ToLocal and
__spirv_GenericCastToPtrExplicit_ToPrivate). The functions are declared
as aliases to the new builtin allowing C-like languages to have a
definition to rely on as well as gaining proper front-end diagnostics.

The motivation for the header is to provide a stable binding for
applications or library (such as SYCL) and allows non SPIR-V targets to
provide an implementation (via libclc or similar to how it is done for
gpuintrin.h).
2025-05-29 15:19:40 +02:00
Steven Perron
5584020d8a
[HLSL][SPIRV] Implement the SPIR-V target type for cbuffers. (#140061)
This change implement the type used to represent cbuffer for SPIR-V.

Fixes https://github.com/llvm/llvm-project/issues/138274.
2025-05-28 07:51:03 -04:00
Nick Sarnie
f9ee5ce605
[clang][SPIR-V] Fix OpenCL addrspace mapping when using non-zero default AS (#137187)
Based on feedback from https://github.com/llvm/llvm-project/pull/136753,
remove the dummy values for OpenCL and make them match the zero default
AS map.

Signed-off-by: Sarnie, Nick <nick.sarnie@intel.com>
2025-04-28 19:26:01 +00:00
Nick Sarnie
52a96491e1
[clang][SPIR-V] Addrspace of opencl_global should always be 1 (#136753)
This fixes a CUDA SPIR-V regression introduced in
https://github.com/llvm/llvm-project/pull/134399.

---------

Signed-off-by: Sarnie, Nick <nick.sarnie@intel.com>
2025-04-24 14:20:13 +00:00
Steven Perron
c073c22865
[HLSL] Use hlsl_device address space for getpointer. (#127675)
We add the hlsl_device address space to represent the device memory
space as defined in section 1.7.1.3 of the [HLSL
spec](https://microsoft.github.io/hlsl-specs/specs/hlsl.pdf).

Fixes https://github.com/llvm/llvm-project/issues/127075
2025-04-22 13:26:32 -04:00
Nathan Gauër
a625bc60e2
[HLSL][SPIR-V] Add hlsl_private address space for SPIR-V (#133464)
This is an alternative to
https://github.com/llvm/llvm-project/pull/122103

In SPIR-V, private global variables have the Private storage class. This
PR adds a new address space which allows frontend to emit variable with
this storage class when targeting this backend.

This is covered in this proposal: llvm/wg-hlsl@4c9e11a

This PR will cause addrspacecast to show up in several cases, like class
member functions or assignment. Those will have to be handled in the
backend later on, particularly to fixup pointer storage classes in some
functions.

Before this change, global variable were emitted with the 'Function'
storage class, which was wrong.
2025-04-10 10:55:10 +02:00
Nick Sarnie
68ee56d150
[clang][OpenMP][SPIR-V] Fix addrspace of global constants (#134399)
SPIR-V has strict address space rules, constant globals cannot be in the
default address space.

The OMPIRBuilder change was required for lit tests to pass, we were
missing an addrspacecast.

---------

Signed-off-by: Sarnie, Nick <nick.sarnie@intel.com>
2025-04-09 15:41:53 +00:00
Nick Sarnie
7a5e4f5405
[clang][NFCI] Fix getGridValues for unsupported targets (#131023)
I broke this in
f3cd223838,
I should have added this to the `SPIRV64` subclass, but I accidentally
added it to base `TargetInfo`.

Using an unsupported target should error in the driver way before this
though.

Signed-off-by: Sarnie, Nick <nick.sarnie@intel.com>
2025-03-13 14:28:49 +00:00
Jon Chesterfield
43999deb37
[spirv][amdgpu] Set atomic size in the clang target info (#128569)
Problem identified by Joseph. The openmp device runtime uses
__scoped_atomic_load_n and similar which presently hit

```
error: large atomic operation may incur significant performance
      penalty; the access size (4 bytes) exceeds the max lock-free size (0 bytes) [-Werror,-Watomic-alignment]
```

This is because the spirv class doesn't set the corresponding field. The
base does, but only if there's a host toolchain, which there isn't.
2025-02-25 18:31:10 +00:00
Chandler Carruth
cd269fee05 [StrTable] Switch Clang builtins to use string tables
This both reapplies #118734, the initial attempt at this, and updates it
significantly.

First, it uses the newly added `StringTable` abstraction for string
tables, and simplifies the construction to build the string table and
info arrays separately. This should reduce any `constexpr` compile time
memory or CPU cost of the original PR while significantly improving the
APIs throughout.

It also restructures the builtins to support sharding across several
independent tables. This accomplishes two improvements from the
original PR:

1) It improves the APIs used significantly.

2) When builtins are defined from different sources (like SVE vs MVE in
   AArch64), this allows each of them to build their own string table
   independently rather than having to merge the string tables and info
   structures.

3) It allows each shard to factor out a common prefix, often cutting the
   size of the strings needed for the builtins by a factor two.

The second point is important both to allow different mechanisms of
construction (for example a `.def` file and a tablegen'ed `.inc` file,
or different tablegen'ed `.inc files), it also simply reduces the sizes
of these tables which is valuable given how large they are in some
cases. The third builds on that size reduction.

Initially, we use this new sharding rather than merging tables in
AArch64, LoongArch, RISCV, and X86. Mostly this helps ensure the system
works, as without further changes these still push scaling limits.
Subsequent commits will more deeply leverage the new structure,
including using the prefix capabilities which cannot be easily factored
out here and requires deep changes to the targets.
2025-02-04 18:04:57 +00:00
Helena Kotas
d92bac8a3e
[HLSL] Introduce address space hlsl_constant(2) for constant buffer declarations (#123411)
Introduces a new address space `hlsl_constant(2)` for constant buffer
declarations.

This address space is applied to declarations inside `cbuffer` block.
Later on, it will also be applied to `ConstantBuffer<T>` syntax and the
default `$Globals` constant buffer.

Clang codegen translates constant buffer declarations to global
variables and loads from `hlsl_constant(2)` address space. More work
coming soon will include addition of metadata that will map these
globals to individual constant buffers and enable their transformation
to appropriate constant buffer load intrinsics later on in an LLVM pass.

Fixes #123406
2025-01-24 16:48:35 -08:00
Farzon Lotfi
21edac25f0
[SPIRV] Add Target Builtins using Distance ext as an example (#121598)
- Update pr labeler so new SPIRV files get properly labeled.
- Add distance target builtin to BuiltinsSPIRV.td.
- Update TargetBuiltins.h to account for spirv builtins.
- Update clang basic CMakeLists.txt to build spirv builtin tablegen.
- Hook up sema for SPIRV in Sema.h|cpp, SemaSPIRV.h|cpp, and
SemaChecking.cpp.
- Hookup sprv target builtins to SPIR.h|SPIR.cpp target.
- Update GBuiltin.cpp to emit spirv intrinsics when we get the expected
spirv target builtin.

Consensus was reach in this RFC to add both target builtins and pattern
matching:
https://discourse.llvm.org/t/rfc-add-targetbuiltins-for-spirv-to-support-hlsl/83329.

pattern matching will come in a separate pr this one just sets up the
groundwork to do target builtins for spirv.

partially resolves
[#99107](https://github.com/llvm/llvm-project/issues/99107)
2025-01-06 11:37:20 -05:00
Chandler Carruth
ca79ff07d8
Revert "Switch builtin strings to use string tables" (#119638)
Reverts llvm/llvm-project#118734

There are currently some specific versions of MSVC that are miscompiling
this code (we think). We don't know why as all the other build bots and
at least some folks' local Windows builds work fine.

This is a candidate revert to help the relevant folks catch their
builders up and have time to debug the issue. However, the expectation
is to roll forward at some point with a workaround if at all possible.
2024-12-13 23:58:48 -08:00
Chandler Carruth
be2df95e92
Switch builtin strings to use string tables (#118734)
The Clang binary (and any binary linking Clang as a library), when built
using PIE, ends up with a pretty shocking number of dynamic relocations
to apply to the executable image: roughly 400k.

Each of these takes up binary space in the executable, and perhaps most
interestingly takes start-up time to apply the relocations.

The largest pattern I identified were the strings used to describe
target builtins. The addresses of these string literals were stored into
huge arrays, each one requiring a dynamic relocation. The way to avoid
this is to design the target builtins to use a single large table of
strings and offsets within the table for the individual strings. This
switches the builtin management to such a scheme.

This saves over 100k dynamic relocations by my measurement, an over 25%
reduction. Just looking at byte size improvements, using the `bloaty`
tool to compare a newly built `clang` binary to an old one:

```
    FILE SIZE        VM SIZE
 --------------  --------------
  +1.4%  +653Ki  +1.4%  +653Ki    .rodata
  +0.0%    +960  +0.0%    +960    .text
  +0.0%    +197  +0.0%    +197    .dynstr
  +0.0%    +184  +0.0%    +184    .eh_frame
  +0.0%     +96  +0.0%     +96    .dynsym
  +0.0%     +40  +0.0%     +40    .eh_frame_hdr
  +114%     +32  [ = ]       0    [Unmapped]
  +0.0%     +20  +0.0%     +20    .gnu.hash
  +0.0%      +8  +0.0%      +8    .gnu.version
  +0.9%      +7  +0.9%      +7    [LOAD #2 [R]]
  [ = ]       0 -75.4% -3.00Ki    .relro_padding
 -16.1%  -802Ki -16.1%  -802Ki    .data.rel.ro
 -27.3% -2.52Mi -27.3% -2.52Mi    .rela.dyn
  -1.6% -2.66Mi  -1.6% -2.66Mi    TOTAL
```

We get a 16% reduction in the `.data.rel.ro` section, and nearly 30%
reduction in `.rela.dyn` where those reloctaions are stored.

This is also visible in my benchmarking of binary start-up overhead at
least:

```
Benchmark 1: ./old_clang --version
  Time (mean ± σ):      17.6 ms ±   1.5 ms    [User: 4.1 ms, System: 13.3 ms]
  Range (min … max):    14.2 ms …  22.8 ms    162 runs

Benchmark 2: ./new_clang --version
  Time (mean ± σ):      15.5 ms ±   1.4 ms    [User: 3.6 ms, System: 11.8 ms]
  Range (min … max):    12.4 ms …  20.3 ms    216 runs

Summary
  './new_clang --version' ran
    1.13 ± 0.14 times faster than './old_clang --version'
```

We get about 2ms faster `--version` runs. While there is a lot of noise
in binary execution time, this delta is pretty consistent, and
represents over 10% improvement. This is particularly interesting to me
because for very short source files, repeatedly starting the `clang`
binary is actually the dominant cost. For example, `configure` scripts
running against the `clang` compiler are slow in large part because of
binary start up time, not the time to process the actual inputs to the
compiler.

----

This PR implements the string tables using `constexpr` code and the
existing macro system. I understand that the builtins are moving towards
a TableGen model, and if complete that would provide more options for
modeling this. Unfortunately, that migration isn't complete, and even
the parts that are migrated still rely on the ability to break out of
the TableGen model and directly expand an X-macro style `BUILTIN(...)`
textually. I looked at trying to complete the move to TableGen, but it
would both require the difficult migration of the remaining targets, and
solving some tricky problems with how to move away from any macro-based
expansion.

I was also able to find a reasonably clean and effective way of doing
this with the existing macros and some `constexpr` code that I think is
clean enough to be a pretty good intermediate state, and maybe give a
good target for the eventual TableGen solution. I was also able to
factor the macros into set of consistent patterns that avoids a
significant regression in overall boilerplate.
2024-12-08 19:00:14 -08:00
Nathan Gauër
f8b4182f07
Revert "[SPIR-V] Fixup storage class for global private (#116636)" (#118312)
This reverts commit aa7fe1c10e5d6d0d3aacdb345fed995de413e142.
2024-12-02 17:32:54 +01:00
Nathan Gauër
aa7fe1c10e
[SPIR-V] Fixup storage class for global private (#116636)
Adds a new address spaces: `hlsl_private`. Variables with such address
space will be emitted with a `Private` storage class.
This is useful for variables global to a SPIR-V module, since up to now,
they were still emitted with a `Function` storage class, which is wrong.

---------

Signed-off-by: Nathan Gauër <brioche@google.com>
2024-12-02 16:17:44 +01:00
Alex Voicu
2c13dec328
[clang][llvm][SPIR-V] Explicitly encode native integer widths for SPIR-V (#110695)
SPIR-V doesn't currently encode "native" integer bit-widths in its
datalayout(s). This is problematic as it leads to optimisation passes,
such as InstCombine, getting ideas and e.g. shrinking to non
byte-multiple integer types, which is not desirable and can lead to
breakage further down in the toolchain. This patch addresses that by
encoding `i8`, `i16`, `i32` and `i64` as native types for vanilla SPIR-V
(the spec natively supports them), and `i32` and `i64` for AMDGCNSPIRV
(where the hardware targets are known). We also set the stack alignment
on the latter, as it is overaligned (32-bit vs 8-bit).
2024-11-05 17:26:08 +02:00
Jay Foad
4dd55c567a
[clang] Use {} instead of std::nullopt to initialize empty ArrayRef (#109399)
Follow up to #109133.
2024-10-24 10:23:40 +01:00
Alex Voicu
e13cbaca69
[clang][CodeGen][SPIR-V] Fix incorrect SYCL usage, implement missing interface (#109415)
This is primarily meant to address the issue identified in #109182,
around incorrect usage of `-fsycl-is-device`; we now have AMDGCN
flavoured SPIR-V which retains the desired behaviour around the default
AS and does not depend on the SYCL language being enabled to do so.
Overall, there are three changes:

1. We unconditionally use the `SPIRDefIsGen` AS map for AMDGCNSPIRV
target, as there is no case where the hack of setting default to private
would be desirable, and it can be used for languages other than OCL/HIP;
2. We implement `SPIRVTargetCodeGenInfo::getGlobalVarAddressSpace` for
SPIR-V in general, because otherwise using it from languages other than
HIP or OpenCL would yield 0, incorrectly;
3. We remove the incorrect usage of `-fsycl-is-device`.
2024-09-26 14:06:14 +01:00
Alex Voicu
3cfd0c0d36
[SPIRV][RFC] Rework / extend support for memory scopes (#106429)
This change adds support for correctly lowering the `__scoped` Clang
builtins, and corresponding scoped LLVM instructions. These were
previously unconditionally lowered to Device scope, which is possibly incorrect. 
Furthermore, the default / implicit scope is changed from Device (an 
OpenCL assumption) to AllSvmDevices (aka System), since the SPIR-V BE is not 
OpenCL specific / can ingest IR coming from other language front-ends. OpenCL 
defaulting to Device scope is now reflected in the front-end handling of atomic 
ops, which seems preferable.
2024-09-25 00:44:57 +01:00
Alex Voicu
88e2bb4092
[clang][SPIR-V] Add support for AMDGCN flavoured SPIRV (#89796)
This change seeks to add support for vendor flavoured SPIRV - more
specifically, AMDGCN flavoured SPIRV. The aim is to generate SPIRV that
carries some extra bits of information that are only usable by AMDGCN
targets, forfeiting absolute genericity to obtain greater expressiveness
for target features:

- AMDGCN inline ASM is allowed/supported, under the assumption that the
[SPV_INTEL_inline_assembly](https://github.com/intel/llvm/blob/sycl/sycl/doc/design/spirv-extensions/SPV_INTEL_inline_assembly.asciidoc)
extension is enabled/used
- AMDGCN target specific builtins are allowed/supported, under the
assumption that e.g. the `--spirv-allow-unknown-intrinsics` option is
enabled when using the downstream translator
- the featureset matches the union of AMDGCN targets' features
- the datalayout string is overspecified to affix both the program
address space and the alloca address space, the latter under the
assumption that the
[SPV_INTEL_function_pointers](https://github.com/intel/llvm/blob/sycl/sycl/doc/design/spirv-extensions/SPV_INTEL_function_pointers.asciidoc)
extension is enabled/used, case in which the extant SPIRV datalayout
string would lead to pointers to function pointing to the private
address space, which would be wrong.

Existing AMDGCN tests are extended to cover this new target. It is
currently dormant / will require some additional changes, but I thought
I'd rather put it up for review to get feedback as early as possible. I
will note that an alternative option is to place this under AMDGPU, but
that seems slightly less natural, since this is still SPIRV, albeit
relaxed in terms of preconditions & constrained in terms of
postconditions, and only guaranteed to be usable on AMDGCN targets (it
is still possible to obtain pristine portable SPIRV through usage of the
flavoured target, though).
2024-06-07 11:50:23 +01:00
Justin Bogner
b0ddbfb77d
[clang][SPIR-V] Set AS for the SPIR-V logical triple (#88939)
This was missed in #88455, causing most of the .hlsl to SPIR-V tests to
fail (such as clang\test\Driver\hlsl-lang-targets-spirv.hlsl)
2024-04-16 12:09:32 -07:00
Alex Voicu
1120d8e6f7
[clang][CodeGen] Add AS for Globals to SPIR & SPIRV datalayouts (#88455)
Currently neither the SPIR nor the SPIRV targets specify the AS for
globals in their datalayout strings. This is problematic because
CodeGen/LLVM will default to AS0 in this case, which produces Globals
that end up in the private address space for e.g. OCL, HIPSPV or SYCL.
This patch addresses it by completing the datalayout string.
2024-04-16 11:37:29 +01:00
Natalie Chouinard
3b57b647a9
[HLSL][SPIR-V] Add create.handle intrinsic (#81038)
Add a SPIR-V target-specific intrinsic for creating handles, which is
used for lowering HLSL resources types like RWBuffer.

`llvm/lib/TargetParser/Triple.cpp`: SPIR-V intrinsics use "spv" as the
target prefix, not "spirv". As far as I can tell, this is the first one
that is used via the `CGBuiltin` codepath, which relies on
`getArchTypePrefix`, so I've corrected it here.

`clang/lib/Basic/Targets/SPIR.h`: When records are laid out in the
lowering from AST to IR, they were incorrectly offset because these
Pointer attributes were defaulting to 32.

Related to #81036
2024-02-08 14:35:44 -05:00
Jonas Paulsson
34dd8ec8ae
[clang, SystemZ] Support -munaligned-symbols (#73511)
When this option is passed to clang, external (and/or weak) symbols
are not assumed to have the minimum ABI alignment normally required.
Symbols defined locally that are not weak are however still given the
minimum alignment.

This is implemented by passing a new parameter to getMinGlobalAlign()
named HasNonWeakDef that is used to return the right alignment value.

This is needed when external symbols created from a linker script may
not get the ABI minimum alignment and must therefore be treated as
unaligned by the compiler.
2024-01-27 18:29:37 +01:00
Natalie Chouinard
c21f48e5ad
[HLSL][SPIR-V] Add Vulkan to target triple (#76749)
Add support for specifying the logical SPIR-V target environment in the
triple as Vulkan. When compiling HLSL, this replaces the DirectX Shader
Model with a Vulkan environment instead.

Currently, the only supported combinations of SPIR-V version and Vulkan
environment are:
- Vulkan 1.2 and SPIR-V 1.5
- Vulkan 1.3 and SPIR-V 1.6

Fixes #70051
2024-01-18 12:52:00 -05:00
Nathan Gauër
53b6a169e4 [SPIR-V] Add SPIR-V logical triple.
Clang implements SPIR-V with both Physical32 and Physical64 addressing
models. This commit adds a new triple value for the Logical
addressing model.

Differential Revision: https://reviews.llvm.org/D155978
2023-09-11 10:15:24 +02:00
Stoorx
42d758bfa6 [clang] Return std::string_view from TargetInfo::getClobbers()
Change the return type of `getClobbers` function from `const char*`
to `std::string_view`. Update the function usages in CodeGen module.

The reasoning of these changes is to remove unsafe `const char*`
strings and prevent unnecessary allocations for constructing the
`std::string` in usages of `getClobbers()` function.

Differential Revision: https://reviews.llvm.org/D148799
2023-04-24 12:16:54 +03:00
Stoorx
830b359d3a [clang] Return std::unique_ptr<TargetInfo> from AllocateTarget
In file 'clang/lib/Basic/Targets.cpp' the function 'AllocateTarget' had a raw pointer as a return type, which have been wrapped in the 'std::unique_ptr' in all usages.
This commit changes the signature of the function to return an instance of 'std::unique_ptr' directly.

Reviewed By: DavidSpickett

Differential Revision: https://reviews.llvm.org/D148574
2023-04-18 10:07:26 +00:00
Paulo Matos
8d0c889752 [clang][WebAssembly] Initial support for reference type funcref in clang
This is the funcref counterpart to 890146b. We introduce a new attribute
that marks a function pointer as a funcref. It also implements builtin
__builtin_wasm_ref_null_func(), that returns a null funcref value.

Differential Revision: https://reviews.llvm.org/D128440
2023-03-17 18:31:44 +01:00
ShangwuYao
8bd13ad6c5 [CUDA][SPIRV] Match builtin types and __GCC_ATOMIC_XXX_LOCK_FREE macros on host/device
This change matches the CUDA/SPIRV behavior with CUDA/NVPTX, and makes some builtin types
and __GCC_ATOMIC_XXX_LOCK_FREE macros the same between the host and device. This is only
done when host triple is provided and known, otherwise the behavior is unchanged.

Reviewed By: yaxunl

Differential Revision: https://reviews.llvm.org/D144047
2023-02-22 15:44:46 +00:00
Archibald Elliott
62c7f035b4 [NFC][TargetParser] Remove llvm/ADT/Triple.h
I also ran `git clang-format` to get the headers in the right order for
the new location, which has changed the order of other headers in two
files.
2023-02-07 12:39:46 +00:00
Zahira Ammarguellat
b4b06d8ff8 __arithmetic_fence enforces ordering on expression evaluation when fast-math
is enabled.
In fast math mode some floating-point optimizations are performed such as
reassociation and distribution.

For example, the compiler may transform (a+b)+c into a+(b+c). Although these
two expressions are equivalent in integer arithmetic, they may not be in
floating-point arithmetic. The builtin tells the compiler that the expression
in parenthesis can’t be re-associated or distributed.
__arithmetic_fence(a+b)+c is not equivalent to a+(b+c).

This patch adds the support of the builtin to SPIR target.

Differential Revision: https://reviews.llvm.org/D142583
2023-01-26 14:18:28 -05:00
Krzysztof Parzyszek
0ca43d4488 DebugInfoMetadata: convert Optional to std::optional 2022-12-04 11:52:02 -06:00
Kazu Hirata
eeee3fee37 [Basic] Use std::nullopt instead of None (NFC)
This patch mechanically replaces None with std::nullopt where the
compiler would warn if None were deprecated.  The intent is to reduce
the amount of manual work required in migrating from Optional to
std::optional.

This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-12-03 11:34:27 -08:00
Xiang Li
7e04c0ad63 [HLSL] Add groupshare address space.
Added keyword, LangAS and TypeAttrbute for groupshared.

Tanslate it to LangAS with asHLSLLangAS.

Make sure it translated into address space 3 for DirectX target.

Reviewed By: aaron.ballman

Differential Revision: https://reviews.llvm.org/D135060
2022-10-20 09:29:09 -07:00
Shangwu Yao
c2f501f395
[CUDA][SPIRV] Assign global address space to CUDA kernel arguments
(resubmit https://reviews.llvm.org/D119207 after fixing the test for
some build settings)

This patch converts CUDA pointer kernel arguments with default address
space to CrossWorkGroup address space (__global in OpenCL). This is
because Generic or Function (OpenCL's private) is not supported as
storage class for kernel pointer types.

Differential revision: https://reviews.llvm.org/D120366
2022-02-24 20:51:43 -08:00
Matthew Voss
9ce09099bb Revert "[CUDA][SPIRV] Assign global address space to CUDA kernel arguments"
This reverts commit 9de4fc0f2d3b60542956f7e5254951d049edeb1f.

Reverting due to test failure: https://lab.llvm.org/buildbot/#/builders/139/builds/17199
2022-02-17 14:32:10 -08:00
Shangwu Yao
9de4fc0f2d
[CUDA][SPIRV] Assign global address space to CUDA kernel arguments
This patch converts CUDA pointer kernel arguments with default address space to
CrossWorkGroup address space (__global in OpenCL). This is because Generic or
Function (OpenCL's private) is not supported as storage class for kernel pointer types.

Differential Revision: https://reviews.llvm.org/D119207
2022-02-17 09:38:06 -08:00
Aaron Ballman
6c75ab5f66 Introduce _BitInt, deprecate _ExtInt
WG14 adopted the _ExtInt feature from Clang for C23, but renamed the
type to be _BitInt. This patch does the vast majority of the work to
rename _ExtInt to _BitInt, which accounts for most of its size. The new
type is exposed in older C modes and all C++ modes as a conforming
extension. However, there are functional changes worth calling out:

* Deprecates _ExtInt with a fix-it to help users migrate to _BitInt.
* Updates the mangling for the type.
* Updates the documentation and adds a release note to warn users what
is going on.
* Adds new diagnostics for use of _BitInt to call out when it's used as
a Clang extension or as a pre-C23 compatibility concern.
* Adds new tests for the new diagnostic behaviors.

I want to call out the ABI break specifically. We do not believe that
this break will cause a significant imposition for early adopters of
the feature, and so this is being done as a full break. If it turns out
there are critical uses where recompilation is not an option for some
reason, we can consider using ABI tags to ease the transition.
2021-12-06 12:52:01 -05:00
Anastasia Stulova
f4d3cb4ca8 [HIPSPV] Add CUDA->SPIR-V address space mapping
Add mapping for CUDA address spaces for HIP to SPIR-V
translation. This change allows HIP device code to be
emitted as valid SPIR-V by mapping unqualified pointers
to generic address space and by mapping __device__ and
__shared__ AS to their equivalent AS in SPIR-V
(CrossWorkgroup and Workgroup, respectively).

Cuda's __constant__ AS is handled specially. In HIP
unqualified pointers (aka "flat" pointers) can point to
__constant__ objects. Mapping this AS to ConstantMemory
would produce to illegal address space casts to
generic AS. Therefore, __constant__ AS is mapped to
CrossWorkgroup.

Patch by linjamaki (Henry Linjamäki)!

Differential Revision: https://reviews.llvm.org/D108621
2021-12-02 13:34:27 +00:00
Anastasia Stulova
a10a69fe9c [SPIR-V] Add SPIR-V triple and clang target info.
Add new triple and target info for ‘spirv32’ and ‘spirv64’ and,
thus, enabling clang (LLVM IR) code emission to SPIR-V target.

The target for SPIR-V is mostly reused from SPIR by derivation
from a common base class since IR output for SPIR-V is mostly
the same as SPIR. Some refactoring are made accordingly.

Added and updated tests for parts that are different between
SPIR and SPIR-V.

Patch by linjamaki (Henry Linjamäki)!

Differential Revision: https://reviews.llvm.org/D109144
2021-11-08 13:34:10 +00:00
Melanie Blower
aaba37187f [clang][PATCH][nfc] Refactor TargetInfo::adjust to pass DiagnosticsEngine to allow diagnostics on target-unsupported options
Reviewed By: aaron.ballman

Differential Revision: https://reviews.llvm.org/D104729
2021-06-29 13:26:23 -04:00