This is a major change on how we represent nested name qualifications in
the AST.
* The nested name specifier itself and how it's stored is changed. The
prefixes for types are handled within the type hierarchy, which makes
canonicalization for them super cheap, no memory allocation required.
Also translating a type into nested name specifier form becomes a no-op.
An identifier is stored as a DependentNameType. The nested name
specifier gains a lightweight handle class, to be used instead of
passing around pointers, which is similar to what is implemented for
TemplateName. There is still one free bit available, and this handle can
be used within a PointerUnion and PointerIntPair, which should keep
bit-packing aficionados happy.
* The ElaboratedType node is removed, all type nodes in which it could
previously apply to can now store the elaborated keyword and name
qualifier, tail allocating when present.
* TagTypes can now point to the exact declaration found when producing
these, as opposed to the previous situation of there only existing one
TagType per entity. This increases the amount of type sugar retained,
and can have several applications, for example in tracking module
ownership, and other tools which care about source file origins, such as
IWYU. These TagTypes are lazily allocated, in order to limit the
increase in AST size.
This patch offers a great performance benefit.
It greatly improves compilation time for
[stdexec](https://github.com/NVIDIA/stdexec). For one datapoint, for
`test_on2.cpp` in that project, which is the slowest compiling test,
this patch improves `-c` compilation time by about 7.2%, with the
`-fsyntax-only` improvement being at ~12%.
This has great results on compile-time-tracker as well:

This patch also further enables other optimziations in the future, and
will reduce the performance impact of template specialization resugaring
when that lands.
It has some other miscelaneous drive-by fixes.
About the review: Yes the patch is huge, sorry about that. Part of the
reason is that I started by the nested name specifier part, before the
ElaboratedType part, but that had a huge performance downside, as
ElaboratedType is a big performance hog. I didn't have the steam to go
back and change the patch after the fact.
There is also a lot of internal API changes, and it made sense to remove
ElaboratedType in one go, versus removing it from one type at a time, as
that would present much more churn to the users. Also, the nested name
specifier having a different API avoids missing changes related to how
prefixes work now, which could make existing code compile but not work.
How to review: The important changes are all in
`clang/include/clang/AST` and `clang/lib/AST`, with also important
changes in `clang/lib/Sema/TreeTransform.h`.
The rest and bulk of the changes are mostly consequences of the changes
in API.
PS: TagType::getDecl is renamed to `getOriginalDecl` in this patch, just
for easier to rebasing. I plan to rename it back after this lands.
Fixes#136624
Fixes https://github.com/llvm/llvm-project/issues/43179
Fixes https://github.com/llvm/llvm-project/issues/68670
Fixes https://github.com/llvm/llvm-project/issues/92757
Adds the `isHLSLResourceRecordArray()` method to the `Type` class. This method returns `true` if the `Type` represents an array of HLSL resource records. Defining this method on `Type` makes it accessible from both sema and codegen.
The vk::binding attribute allows users to explicitly set the set and
binding for a resource in SPIR-V without chaning the "register"
attribute, which will be used when targeting DXIL.
Fixes https://github.com/llvm/llvm-project/issues/136894
The patch adds intrinsics and lowering logic for GlobalSize,
GlobalOffset, SubgroupMaxSize, NumWorkgroups, WorkgroupSize,
WorkgroupId, LocalInvocationId, GlobalInvocationId, SubgroupSize,
NumSubgroups, SubgroupId and SubgroupLocalInvocationId SPIR-V builtins.
The patch also extend spv_thread_id, spv_group_id and
spv_thread_id_in_group to return anyint rather than i32. This allows the
intrinsics to support the opencl environment.
For each of the intrinsics, new clang builtins were added as well as a
binding for the SPIR-V "friendly" format. The original format doesn't
define such binding (uses global variables) but it is not possible to
express the Input SC which is normally required by the environement
specs, and using builtin functions is the most usual approach for other
backend and programming models.
This pr breaks-up `HLSLRootSignatureUtils` into separate orthogonal and
meaningful libraries. This prevents it ending up as a dumping grounds of
many different parts.
- Creates a library `RootSignatureMetadata` to contain helper functions
for interacting the root signatures in their metadata representation
- Create a library `RootSignatureValidations` to contain helper
functions that will validate various values of root signatures
- Move the serialization of root signature elements to
`HLSLRootSignature`
Resolves: https://github.com/llvm/llvm-project/issues/145946
In one case, we have a null pointer check that's unnecessary because the
only caller of the function already asserts the value is non-null.
In the other case, we've got an anti-pattern of `is` followed by `get`.
The logic was easier to repair by changing `get` to `cast`.
Neither case is a functional change.
Fixes#145525
This pr provides the ability to specify the root signature version as a
compiler option and to retain this in the root signature decl.
It also updates the methods to serialize the version when dumping the
declaration and to output the version when generating the metadata.
- Update `DXContainer.hI` to define the root signature versions
- Update `Options.td` and `LangOpts.h` to define the
`fdx-rootsignature-version` compiler option
- Update `Options.td` to provide an alias `force-rootsig-ver` in
clang-dxc
- Update `Decl.[h|cpp]` and `SeamHLSL.cpp` so that `RootSignatureDecl`
will retain its version type
- Updates `CGHLSLRuntime.cpp` to generate the extra metadata field
- Add tests to illustrate
Resolves https://github.com/llvm/llvm-project/issues/126557.
Note: this does not implement validation based on versioning.
https://github.com/llvm/llvm-project/issues/129940 is required to
retrieve the version and use it for validations.
In #144957 the backend was updated to expect a version in the metadata,
but since the frontend wasn't updated this breaks compilation. This is a
somewhat temporary fix to that until #144813 lands.
Implements
https://github.com/llvm/wg-hlsl/blob/main/proposals/0026-symbol-visibility.md.
The change is to stop using the `hlsl.export` attribute. Instead,
symbols with "program linkage" in HLSL will have export linkage with
default visibility, and symbols with "external linkage" in HLSL will
have export linkage with hidden visibility.
These are identified by misc-include-cleaner. I've filtered out those
that break builds. Also, I'm staying away from llvm-config.h,
config.h, and Compiler.h, which likely cause platform- or
compiler-specific build failures.
The SPIR-V backend does not have access to the original name of a
resource in the source, so it tries to create a name. This leads to some
problems with reflection.
That is why start to pass the name of the resource from Clang to the
SPIR-V backend.
Fixes#138533
This commit is using the same mechanism as vk::ext_builtin_input to
implement the SV_Position semantic input.
The HLSL signature is not yet ready for DXIL, hence this commit only
implements the SPIR-V side.
This is incomplete as it doesn't allow the semantic on hull/domain and
other shaders, but it's a first step to validate the overall
input/output
semantic logic.
Fixes https://github.com/llvm/llvm-project/issues/136969
This variable attribute is used in HLSL to add Vulkan specific builtins
in a shader.
The attribute is documented here:
17727e88fd/proposals/0011-inline-spirv.md
Those variable, even if marked as `static` are externally initialized by
the pipeline/driver/GPU. This is handled by moving them to a specific
address space `hlsl_input`, also added by this commit.
The design for input variables in Clang can be found here:
355771361e/proposals/0019-spirv-input-builtin.md
Co-authored-by: Justin Bogner <mail@justinbogner.com>
`HLSLRootSignature.h` was originally created to hold the struct
definitions of an `llvm::hlsl::rootsig::RootElement` and some helper
functions for it.
However, there many users of the structs that don't require any of the
helper methods. This requires us to link the `FrontendHLSL` library,
where we otherwise wouldn't need to.
For instance:
- This [revert](https://github.com/llvm/llvm-project/pull/142005) was
required as it requires linking to the unrequired `FrontendHLSL` library
- As part of the change required here:
https://github.com/llvm/llvm-project/issues/126557. We will want to add
an `HLSLRootSignatureVersion` enum. Ideally this could live with the
root signature struct defs, but we don't want to link the helper objects
into `clang/Basic/TargetOptions.h`
This change allows the struct definitions to be kept in a single header
file and to then have the `FrontendHLSL` library only be linked when
required.
Adds resource name argument to `llvm.dx.handlefrombinding` and `llvm.dx.handlefromimplicitbinding` intrinsics.
SPIR-V currently does not seem to need the resource names so this change only affects DirectX binding intrinsics.
Part 2/4 of https://github.com/llvm/llvm-project/issues/105059
Constant buffers defined with the `cbuffer` keyword do not have a
constructor. Instead, the call to initialize the resource handle based
on its binding is generated in codegen. This change adds initialization
of `cbuffer` handles that have implicit binding.
Closes #139617
- prereq: Modify `RootSignatureAttr` to hold a reference to the owned
declaration
- Define and implement `MetadataBuilder` in `HLSLRootSignature`
- Integrate and invoke the builder in `CGHLSLRuntime.cpp` to generate
the Root Signature for any associated entry functions
- Add tests to demonstrate functionality in `RootSignature.hlsl`
Resolves https://github.com/llvm/llvm-project/issues/126584
Note: this is essentially just
https://github.com/llvm/llvm-project/pull/125131 rebased onto the new
approach of constructing a root signature decl, instead of holding the
elements in `AdditionalMembers`.
Specifying only `space` in a `register` annotation means the compiler
should implicitly assign a register slot to the resource from the
provided virtual register space.
Closes#133346
- Adds resource constructor that takes explicit binding for all resource
classes.
- Updates implementation of default resource constructor to initialize
resource handle to `poison`.
- Removes initialization of resource classes from Codegen.
- Initialization of `cbuffer` still needs to happen in `CGHLSLRuntime`
because it does not have a corresponding resource class type.
- Adds `ImplicitCastExpr` for builtin function calls. Sema adds these
automatically when a method of a template class is instantiated, but
some resource classes like `ByteAddressBuffer` are not templates so they
need to have this added explicitly.
Design proposal: llvm/wg-hlsl#197
Closes#134154
Fixes#112270
Completed ACs:
- `-res-may-alias` clang-dxc command-line option added
- It inserts and sets a module metadata flag `dx.resmayalias` to 1
- Shader flag set appropriately:
- The flag IS NOT set if DXIL Version <= 1.6 OR the command-line option
`-res-may-alias` is specified
- Otherwise the flag IS set when:
- DXIL Version > 1.7 AND function uses UAVs, OR
- DXIL Version <= 1.7 AND UAVs present globally
- Add tests
- Tests for Shader Models 6.6, 6.7, and 6.8 corresponding to DXIL
Versions 1.6, 1.7, and 1.8
- Tests (`res-may-alias-0.ll`/`res-may-alias-1.ll`) for when the module
metadata flag `dx.resmayalias` is set to 0 or 1 respectively
- A frontend test (`res-may-alias.hlsl`) for testing that that the
command-line option `-res-may-alias` inserts `dx.resmayalias` module
metadata correctly
Fixes#112267
Implement the shader flag analysis to set the UseNativeLowPrecision DXIL
module flag.
The flag is only able to be set when the command-line flag
`-enable-16bit-types` is passed to clang-dxc, or equivalently
`-fnative-half-type` is passed to clang.
When the command-line flag is passed, a module metadata flag called
"dx.nativelowprec" is set to 1.
The DXILShaderFlags shader flags analysis checks that the module
metadata flag "dx.nativelowprec" is set to 1 and the DXIL Version is 1.2
or greater before setting the UseNativeLowPrecision DXIL module flag.
Lower the `SV_GroupIndex` semantic as the
`llvm.spv.flattened.thread.id.in.group` intrinsic.
Depends on #130670.
---------
Co-authored-by: Steven Perron <stevenperron@google.com>
Processes `HLSLResourceBindingAttr` attributes that represent
`register(c#)` annotations on default constant buffer declarations and
applies its value to the buffer layout. Any default buffer declarations
without an explicit `register(c#)` annotation are placed after the
elements with explicit layout.
This PR also adds a test case for a `cbuffer` that does not have
`packoffset` on all declarations. Same layout rules apply here as well.
Fixes#126791
The resource wrapper should have internal linkage because it contains a
handle to the global resource, and it not the actual global.
Makeing this changed exposed that we were zeroinitializing the resouce,
which is a problem. The handle cannot be zeroinitialized. This is
changed to use poison instead.
Fixes https://github.com/llvm/llvm-project/issues/122767.
---------
Co-authored-by: Helena Kotas <hekotas@microsoft.com>
All variable declarations in the global scope that are not resources,
static or empty are implicitly added to implicit constant buffer
`$Globals`. They are created in `hlsl_constant` address space and
collected in an implicit `HLSLBufferDecl` node that is added to the AST
at the end of the translation unit. Codegen is the same as for explicit
constant buffers.
Fixes#123801
This is a second attempt to implement this feature. The first attempt
had to be reverted because of memory leaks. The problem was adding a
`SmallVector` member on `HLSLBufferDecl` node to represent a list of
default buffer declarations. When this vector needed to grow, it
allocated memory that was never released, because all memory used by AST
nodes must be allocated by `ASTContext` allocator and is released all at
once. Destructors on AST nodes are never called.
It this change the list of default buffer declarations is collected in a
`SmallVector` instance on `SemaHLSL`. The `HLSLBufDecl` representing
`$Globals` is created at the end of the translation unit when the number
of declarations is known, and the list is copied into an array allocated
by the `ASTContext` allocator.
All variable declarations in the global scope that are not resources,
static or empty are implicitly added to implicit constant buffer
`$Globals`. They are created in `hlsl_constant` address space and
collected in an implicit `HLSLBufferDecl` node that is added to the AST
at the end of the translation unit. Codegen is the same as for explicit
constant buffers.
Fixes#123801
Translates `cbuffer` declaration blocks to `target("dx.CBuffer")` type. Creates global variables in `hlsl_constant` address space for all `cbuffer` constant and adds metadata describing which global constant belongs to which constant buffer. For explicit constant buffer layout information an explicit layout type `target("dx.Layout")` is used. This might change in the future.
The constant globals are temporary and will be removed in upcoming pass that will translate `load` instructions in the `hlsl_constant` address space to constant buffer load intrinsics calls off a CBV handle (#124630, #112992).
See [Constant buffer design
doc](https://github.com/llvm/wg-hlsl/pull/94) for more details.
Fixes#113514, #106596
This PR implements HLSL's initialization list behvaior as specified in
the draft language specifcation under
[*Decl.Init.Agg*](https://microsoft.github.io/hlsl-specs/specs/hlsl.html#Decl.Init.Agg).
This behavior is a bit unusual for C/C++ because intermediate braces in
initializer lists are ignored and a whole array of additional
conversions occur unintuitively to how initializaiton works in C.
The implementaiton in this PR generates a valid C/C++ initialization
list AST for the HLSL initializer so that there are no changes required
to Clang's CodeGen to support this. This design will also allow us to
use Clang's rewrite to convert HLSL initializers to valid C/C++
initializers that are equivalent. It does have the downside that it will
generate often redundant accesses during codegen. The IR optimizer is
extremely good at eliminating those so this will have no impact on the
final executable performance.
There is some opportunity for optimizing the initializer list generation
that we could consider in subsequent commits. One notable opportunity
would be to identify aggregate objects that occur in the same place in
both initializers and do not require converison, those aggregates could
be initialized as aggregates rather than fully scalarized.
Closes#56067
---------
Co-authored-by: Finn Plummer <50529406+inbelic@users.noreply.github.com>
Co-authored-by: Helena Kotas <hekotas@microsoft.com>
Co-authored-by: Justin Bogner <mail@justinbogner.com>
- Set the shader flag `DisableOptimizations` based on `optnone`
attribute of shader entry functions.
- Add DXIL Metadata Analysis pass as pre-requisite for Shader Flags pass
to obtain entry function information collected therein.
- Named module metadata `dx.disable_optimizations` is intended to
indicate disabling optimizations (`-O0`) via commandline flag. However,
its intent is fulfilled by `optnone` attribute of shader entry functions as
implemented in a recent change, and thus not needed. Delete
generation of named metadata and corresponding test file
`disable_opt.ll`.
- Add tests to verify correctness of setting shader flag.
Closes#112263
Introduces a new address space `hlsl_constant(2)` for constant buffer
declarations.
This address space is applied to declarations inside `cbuffer` block.
Later on, it will also be applied to `ConstantBuffer<T>` syntax and the
default `$Globals` constant buffer.
Clang codegen translates constant buffer declarations to global
variables and loads from `hlsl_constant(2)` address space. More work
coming soon will include addition of metadata that will map these
globals to individual constant buffers and enable their transformation
to appropriate constant buffer load intrinsics later on in an LLVM pass.
Fixes#123406
The HLSL SV_GroupID semantic attribute is lowered into
@llvm.spv.group.id intrinsic in LLVM IR for SPIR-V target.
In the SPIR-V backend, this is now translated to a `WorkgroupId` builtin
variable.
Fixes#118700 which's a follow-up work to #70120
Before this patch, there was a calling-convention mismatch between the
constructors and the actual call emitted for the entrypoint wrapper.
Such mismatch causes the InstCombine pass to replace this call with an
`unreachable`, breaking the whole function.
Signed-off-by: Nathan Gauër <brioche@google.com>
Support HLSL SV_GroupThreadId attribute.
For `directx` target, translate it into `dx.thread.id.in.group` in clang
codeGen and lower `dx.thread.id.in.group` to `dx.op.threadIdInGroup` in
LLVM DirectX backend.
For `spir-v` target, translate it into `spv.thread.id.in.group` in clang
codeGen and lower `spv.thread.id.in.group` to a `LocalInvocationId`
builtin variable in LLVM SPIR-V backend.
Fixes: #70122
UAVs and SRVs have already been converted to use LLVM target types and
we can disable generating of the !hlsl.uavs and !hlsl.srvs! annotations.
This will enable adding tests for structured buffers with user defined
types that this old resource annotations code does not handle (it
crashes).
Part 1 of #114126
Adds `@_init_resource_bindings()` function to module initialization that
includes `handle.fromBinding` intrinsic calls for simple resource
declarations. Arrays of resources or resources inside user defined types
are not supported yet.
While this unblocks our progress on [Compile a runnable shader from
clang](https://github.com/llvm/wg-hlsl/issues/7) milestone, this is
probably not the way we would like to handle resource binding
initialization going forward. Ideally, it should be done via the
resource class constructors in order to support dynamic resource binding
or unbounded arrays if resources.
Depends on PRs #110327 and #111203.
Part 1 of #105076
Fix the calling convention used for the call in the entry point
wrapper. No calling convention is currently set. It can easily use the
calling convention of the function that is being called.
Without this, there is a mismatch in the calling convention between the
call site and the callee. This is undefined behaviour.
This reverts commit 4a63f4d301c0e044073e1b1f8f110015ec1778a1.
It was reverted because of a buildbot breakage, but the fix-forward has
landed (https://github.com/llvm/llvm-project/pull/109023).