22 Commits

Author SHA1 Message Date
Matheus Izvekov
91cdd35008
[clang] Improve nested name specifier AST representation (#147835)
This is a major change on how we represent nested name qualifications in
the AST.

* The nested name specifier itself and how it's stored is changed. The
prefixes for types are handled within the type hierarchy, which makes
canonicalization for them super cheap, no memory allocation required.
Also translating a type into nested name specifier form becomes a no-op.
An identifier is stored as a DependentNameType. The nested name
specifier gains a lightweight handle class, to be used instead of
passing around pointers, which is similar to what is implemented for
TemplateName. There is still one free bit available, and this handle can
be used within a PointerUnion and PointerIntPair, which should keep
bit-packing aficionados happy.
* The ElaboratedType node is removed, all type nodes in which it could
previously apply to can now store the elaborated keyword and name
qualifier, tail allocating when present.
* TagTypes can now point to the exact declaration found when producing
these, as opposed to the previous situation of there only existing one
TagType per entity. This increases the amount of type sugar retained,
and can have several applications, for example in tracking module
ownership, and other tools which care about source file origins, such as
IWYU. These TagTypes are lazily allocated, in order to limit the
increase in AST size.

This patch offers a great performance benefit.

It greatly improves compilation time for
[stdexec](https://github.com/NVIDIA/stdexec). For one datapoint, for
`test_on2.cpp` in that project, which is the slowest compiling test,
this patch improves `-c` compilation time by about 7.2%, with the
`-fsyntax-only` improvement being at ~12%.

This has great results on compile-time-tracker as well:

![image](https://github.com/user-attachments/assets/700dce98-2cab-4aa8-97d1-b038c0bee831)

This patch also further enables other optimziations in the future, and
will reduce the performance impact of template specialization resugaring
when that lands.

It has some other miscelaneous drive-by fixes.

About the review: Yes the patch is huge, sorry about that. Part of the
reason is that I started by the nested name specifier part, before the
ElaboratedType part, but that had a huge performance downside, as
ElaboratedType is a big performance hog. I didn't have the steam to go
back and change the patch after the fact.

There is also a lot of internal API changes, and it made sense to remove
ElaboratedType in one go, versus removing it from one type at a time, as
that would present much more churn to the users. Also, the nested name
specifier having a different API avoids missing changes related to how
prefixes work now, which could make existing code compile but not work.

How to review: The important changes are all in
`clang/include/clang/AST` and `clang/lib/AST`, with also important
changes in `clang/lib/Sema/TreeTransform.h`.

The rest and bulk of the changes are mostly consequences of the changes
in API.

PS: TagType::getDecl is renamed to `getOriginalDecl` in this patch, just
for easier to rebasing. I plan to rename it back after this lands.

Fixes #136624
Fixes https://github.com/llvm/llvm-project/issues/43179
Fixes https://github.com/llvm/llvm-project/issues/68670
Fixes https://github.com/llvm/llvm-project/issues/92757
2025-08-09 05:06:53 -03:00
Sven van Haastregt
d45d20e871
[OpenCL] Remove image dimensionality comments; NFC (#147312)
The code is correct as it aligns with the SPIR-V Specification, but the
comment was incorrect.
2025-07-09 10:27:30 +02:00
Shafik Yaghmour
6efa366b43
[Clang][NFC] Avoid copies by using std::move (#146960)
Static analysis flagged this code as using copies when we could use move
instead. I used a temporary in some cases instead of an explicit move.
2025-07-07 17:53:45 -07:00
Steven Perron
68173c8091
[HLSL][SPRIV] Handle signed RWBuffer correctly (#144774)
In Vulkan, the signedness of the accesses to images has to match the
signedness of the backing image.
    
See

https://docs.vulkan.org/spec/latest/chapters/textures.html#textures-input,
where it says the behaviour is undefined if
    
> the signedness of any read or sample operation does not match the
signedness of the image’s format.
    
Users who define say an `RWBuffer<int>` will create a Vulkan image with
a signed integer format. So the HLSL that is generated must match that
expecation.
    
The solution we use is to generate a `spirv.SignedImage` target type for
signed integer instead of `spirv.Image`. The two types are otherwise the
same.
    
The backend will add the `signExtend` image operand to access to the
image to ensure the image is access as a signed image.
    
Fixes #144580
2025-07-02 12:09:47 -04:00
Sarah Spall
23be14b222
[HLSL][SPIRV] Boolean in a RawBuffer should be i32 and Boolean vector in a RawBuffer should be <N x i32> (#144929)
Instead of converting the type in a RawBuffer to its HLSL type using
'ConvertType', use 'ConvertTypeForMem'.
ConvertTypeForMem handles booleans being i32 and boolean vectors being <
N x i32 >.
Add tests to show booleans and boolean vectors in RawBuffers now have
the correct type of i32, and respectively.
Closes #141089
2025-06-27 13:43:03 -07:00
Alex Voicu
992f0d1225
[Clang][SPIRV][AMDGPU] Override supportsLibCall for AMDGCNSPIRV (#143814)
The `supportsLibCall` predicate is used to select whether some math builtins get expanded in the FE or they get lowered into libcalls. The default implementation unconditionally returns true, which is problematic for AMDGCN-flavoured SPIRV, as AMDGPU does not support any libcalls at the moment. This change overrides the predicate in order to reflect this and correctly do the expected FE expansion when targeting AMDGCN-flavoured SPIRV.
2025-06-25 11:22:59 +01:00
Nick Sarnie
86d1d6b2c0
[clang] Use TargetInfo to determine device kernel calling convention (#144728)
We should abstract this logic away to `TargetInfo`. See
https://github.com/llvm/llvm-project/pull/137882 for more information.

---------

Signed-off-by: Sarnie, Nick <nick.sarnie@intel.com>
2025-06-18 20:50:12 +00:00
Nick Sarnie
3b9ebe9201
[clang] Simplify device kernel attributes (#137882)
We have multiple different attributes in clang representing device
kernels for specific targets/languages. Refactor them into one attribute
with different spellings to make it more easily scalable for new
languages/targets.

---------

Signed-off-by: Sarnie, Nick <nick.sarnie@intel.com>
2025-06-05 14:15:38 +00:00
Steven Perron
5584020d8a
[HLSL][SPIRV] Implement the SPIR-V target type for cbuffers. (#140061)
This change implement the type used to represent cbuffer for SPIR-V.

Fixes https://github.com/llvm/llvm-project/issues/138274.
2025-05-28 07:51:03 -04:00
Cassandra Beckley
5a4571133a
[HLSL] Implement SpirvType and SpirvOpaqueType (#134034)
This implements the design proposed by [Representing SpirvType in
Clang's Type System](https://github.com/llvm/wg-hlsl/pull/181). It
creates `HLSLInlineSpirvType` as a new `Type` subclass, and
`__hlsl_spirv_type` as a new builtin type template to create such a
type.

This new type is lowered to the `spirv.Type` target extension type, as
described in [Target Extension Types for Inline SPIR-V and Decorated
Types](https://github.com/llvm/wg-hlsl/blob/main/proposals/0017-inline-spirv-and-decorated-types.md).
2025-05-27 11:40:54 -04:00
Aniket Lal
642481a428
[Clang][OpenCL][AMDGPU] Allow a kernel to call another kernel (#115821)
This feature is currently not supported in the compiler.
To facilitate this we emit a stub version of each kernel
function body with different name mangling scheme, and
replaces the respective kernel call-sites appropriately.
    
Fixes https://github.com/llvm/llvm-project/issues/60313
    
D120566 was an earlier attempt made to upstream a solution
for this issue.

---------

Co-authored-by: anikelal <anikelal@amd.com>
2025-04-08 10:29:30 +05:30
Steven Perron
16603d838c
[HLSL] Add SPIR-V target type for RWStructuredBuffers (#133468)
This PR adds the target type for main storage for HLSL raw buffer types.
It does not handle the counter variables that are associated with those
buffers.

This is implementing part of
https://github.com/llvm/wg-hlsl/blob/main/proposals/0018-spirv-resource-representation.md.
We do not handle other HLSL raw buffer types.
2025-04-01 16:59:46 -04:00
Helena Kotas
73e12de062
[HLSL] Implement explicit layout for default constant buffer ($Globals) (#128991)
Processes `HLSLResourceBindingAttr` attributes that represent
`register(c#)` annotations on default constant buffer declarations and
applies its value to the buffer layout. Any default buffer declarations
without an explicit `register(c#)` annotation are placed after the
elements with explicit layout.

This PR also adds a test case for a `cbuffer` that does not have
`packoffset` on all declarations. Same layout rules apply here as well.

Fixes #126791
2025-03-12 22:35:07 -07:00
Helena Kotas
19af8581d5
[HLSL] Constant Buffers CodeGen (#124886)
Translates `cbuffer` declaration blocks to `target("dx.CBuffer")` type. Creates global variables in `hlsl_constant` address space for all `cbuffer` constant and adds metadata describing which global constant belongs to which constant buffer. For explicit constant buffer layout information an explicit layout type `target("dx.Layout")` is used. This might change in the future.

The constant globals are temporary and will be removed in upcoming pass that will translate `load` instructions in the `hlsl_constant` address space to constant buffer load intrinsics calls off a CBV handle (#124630, #112992).

See [Constant buffer design
doc](https://github.com/llvm/wg-hlsl/pull/94) for more details.

Fixes #113514, #106596
2025-02-20 10:32:14 -08:00
Alex Voicu
39ec9de7c2
[clang][CodeGen] sret args should always point to the alloca AS, so use that (#114062)
`sret` arguments are always going to reside in the stack/`alloca`
address space, which makes the current formulation where their AS is
derived from the pointee somewhat quaint. This patch ensures that `sret`
ends up pointing to the `alloca` AS in IR function signatures, and also
guards agains trying to pass a casted `alloca`d pointer to a `sret` arg,
which can happen for most languages, when compiled for targets that have
a non-zero `alloca` AS (e.g. AMDGCN) / map `LangAS::default` to a
non-zero value (SPIR-V). A target could still choose to do something
different here, by e.g. overriding `classifyReturnType` behaviour.

In a broader sense, this patch extends non-aliased indirect args to also
carry an AS, which leads to changing the `getIndirect()` interface. At
the moment we're only using this for (indirect) returns, but it allows
for future handling of indirect args themselves. We default to using the
AllocaAS as that matches what Clang is currently doing, however if, in
the future, a target would opt for e.g. placing indirect returns in some
other storage, with another AS, this will require revisiting.

---------

Co-authored-by: Matt Arsenault <arsenm2@gmail.com>
Co-authored-by: Matt Arsenault <Matthew.Arsenault@amd.com>
2025-02-14 11:20:45 +00:00
Alex Voicu
66acb26946
[clang][CodeGen][SPIRV] Translate amdgpu_flat_work_group_size into max_work_group_size. (#116820)
HIPAMD relies on the `amdgpu_flat_work_group_size` attribute to
implement key functionality such as the `__launch_bounds__` `__global__`
function annotation. This attribute is not available / directly
translatable to SPIR-V, hence as it is AMDGCN flavoured SPIR-V suffers
from information loss.

This patch addresses that limitation by converting the unsupported
attribute into the `max_work_group_size` attribute which maps to
[`MaxWorkgroupSizeINTEL`](https://github.com/KhronosGroup/SPIRV-Registry/blob/main/extensions/INTEL/SPV_INTEL_kernel_attributes.asciidoc),
which is available in / handled by SPIR-V. When reverse translating from
SPIR-V to AMDGCN LLVMIR we invert the map and add the original AMDGPU
attribute.
2025-01-07 12:01:31 +02:00
Steven Perron
d6344c1cd0
[HLSL][SPIRV] Add HLSL type translation for spirv. (#114273)
This commit partially implements SPIRTargetCodeGenInfo::getHLSLType. It
can now generate the spirv type for the following HLSL types:

1. RWBuffer
2. Buffer
3. Sampler

---------

Co-authored-by: Nathan Gauër <github@keenuts.net>
2024-11-04 12:32:23 -05:00
Alex Voicu
e13cbaca69
[clang][CodeGen][SPIR-V] Fix incorrect SYCL usage, implement missing interface (#109415)
This is primarily meant to address the issue identified in #109182,
around incorrect usage of `-fsycl-is-device`; we now have AMDGCN
flavoured SPIR-V which retains the desired behaviour around the default
AS and does not depend on the SYCL language being enabled to do so.
Overall, there are three changes:

1. We unconditionally use the `SPIRDefIsGen` AS map for AMDGCNSPIRV
target, as there is no case where the hack of setting default to private
would be desirable, and it can be used for languages other than OCL/HIP;
2. We implement `SPIRVTargetCodeGenInfo::getGlobalVarAddressSpace` for
SPIR-V in general, because otherwise using it from languages other than
HIP or OpenCL would yield 0, incorrectly;
3. We remove the incorrect usage of `-fsycl-is-device`.
2024-09-26 14:06:14 +01:00
Alex Voicu
3cfd0c0d36
[SPIRV][RFC] Rework / extend support for memory scopes (#106429)
This change adds support for correctly lowering the `__scoped` Clang
builtins, and corresponding scoped LLVM instructions. These were
previously unconditionally lowered to Device scope, which is possibly incorrect. 
Furthermore, the default / implicit scope is changed from Device (an 
OpenCL assumption) to AllSvmDevices (aka System), since the SPIR-V BE is not 
OpenCL specific / can ingest IR coming from other language front-ends. OpenCL 
defaulting to Device scope is now reflected in the front-end handling of atomic 
ops, which seems preferable.
2024-09-25 00:44:57 +01:00
Alex Voicu
ad435bcc14
[clang][CodeGen][SPIR-V][AMDGPU] Tweak AMDGCNSPIRV ABI to allow for the correct handling of aggregates passed to kernels / functions. (#102776)
The AMDGPU kernel ABI is not directly representable in SPIR-V, since it
relies on passing aggregates `byref`, and SPIR-V only encodes `byval`
(which the AMDGPU BE disallows for kernel arguments). As a temporary
solution to this mismatch, we add special handling for AMDGCN flavoured
SPIR-V, whereby aggregates are passed as direct, both to kernels and to
normal functions. This is not ideal (there are pathological cases where
performance is heavily impacted), but empirically robust and guaranteed
to work as the AMDGPU BE retains handling of `direct` passing for legacy
reasons.

We will revisit this in the future, but as it stands it is enough to
pass a wide array of integration tests and generates correct SPIR-V and
correct reverse translation into LLVM IR. The
amdgpu-kernel-arg-pointer-type test is updated via the automated script,
and thus becomes quite noisy.
2024-08-21 13:16:59 +01:00
Kazu Hirata
f3dcc2351c
[clang] Use StringRef::{starts,ends}_with (NFC) (#75149)
This patch replaces uses of StringRef::{starts,ends}with with
StringRef::{starts,ends}_with for consistency with
std::{string,string_view}::{starts,ends}_with in C++20.

I'm planning to deprecate and eventually remove
StringRef::{starts,ends}with.
2023-12-13 08:54:13 -08:00
Sergei Barannikov
992cb98462 [clang][CodeGen] Break up TargetInfo.cpp [8/8]
This commit breaks up CodeGen/TargetInfo.cpp into a set of *.cpp files,
one file per target. There are no functional changes, mostly just code moving.

Non-code-moving changes are:
* A virtual destructor has been added to DefaultABIInfo to pin the vtable to a cpp file.
* A few methods of ABIInfo and DefaultABIInfo were split into declaration + definition
  in order to reduce the number of transitive includes.
* Several functions that used to be static have been placed in clang::CodeGen
  namespace so that they can be accessed from other cpp files.

RFC: https://discourse.llvm.org/t/rfc-splitting-clangs-targetinfo-cpp/69883

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D148094
2023-06-17 07:14:50 +03:00