Adds support for accessing individual resources from fixed-size global resource arrays.
Design proposal:
https://github.com/llvm/wg-hlsl/blob/main/proposals/0028-resource-arrays.md
Enables indexing into globally scoped, fixed-size resource arrays to retrieve individual resources. The initialization logic is primarily handled during codegen. When a global resource array is indexed, the
codegen translates the `ArraySubscriptExpr` AST node into a constructor call for the corresponding resource record type and binding.
To support this behavior, Sema needs to ensure that:
- The constructor for the specific resource type is instantiated.
- An implicit binding attribute is added to resource arrays that lack explicit bindings (#152452).
Closes#145424
This is a major change on how we represent nested name qualifications in
the AST.
* The nested name specifier itself and how it's stored is changed. The
prefixes for types are handled within the type hierarchy, which makes
canonicalization for them super cheap, no memory allocation required.
Also translating a type into nested name specifier form becomes a no-op.
An identifier is stored as a DependentNameType. The nested name
specifier gains a lightweight handle class, to be used instead of
passing around pointers, which is similar to what is implemented for
TemplateName. There is still one free bit available, and this handle can
be used within a PointerUnion and PointerIntPair, which should keep
bit-packing aficionados happy.
* The ElaboratedType node is removed, all type nodes in which it could
previously apply to can now store the elaborated keyword and name
qualifier, tail allocating when present.
* TagTypes can now point to the exact declaration found when producing
these, as opposed to the previous situation of there only existing one
TagType per entity. This increases the amount of type sugar retained,
and can have several applications, for example in tracking module
ownership, and other tools which care about source file origins, such as
IWYU. These TagTypes are lazily allocated, in order to limit the
increase in AST size.
This patch offers a great performance benefit.
It greatly improves compilation time for
[stdexec](https://github.com/NVIDIA/stdexec). For one datapoint, for
`test_on2.cpp` in that project, which is the slowest compiling test,
this patch improves `-c` compilation time by about 7.2%, with the
`-fsyntax-only` improvement being at ~12%.
This has great results on compile-time-tracker as well:

This patch also further enables other optimziations in the future, and
will reduce the performance impact of template specialization resugaring
when that lands.
It has some other miscelaneous drive-by fixes.
About the review: Yes the patch is huge, sorry about that. Part of the
reason is that I started by the nested name specifier part, before the
ElaboratedType part, but that had a huge performance downside, as
ElaboratedType is a big performance hog. I didn't have the steam to go
back and change the patch after the fact.
There is also a lot of internal API changes, and it made sense to remove
ElaboratedType in one go, versus removing it from one type at a time, as
that would present much more churn to the users. Also, the nested name
specifier having a different API avoids missing changes related to how
prefixes work now, which could make existing code compile but not work.
How to review: The important changes are all in
`clang/include/clang/AST` and `clang/lib/AST`, with also important
changes in `clang/lib/Sema/TreeTransform.h`.
The rest and bulk of the changes are mostly consequences of the changes
in API.
PS: TagType::getDecl is renamed to `getOriginalDecl` in this patch, just
for easier to rebasing. I plan to rename it back after this lands.
Fixes#136624
Fixes https://github.com/llvm/llvm-project/issues/43179
Fixes https://github.com/llvm/llvm-project/issues/68670
Fixes https://github.com/llvm/llvm-project/issues/92757
Now that #149310 has restricted lifetime intrinsics to only work on
allocas, we can also drop the explicit size argument. Instead, the size
is implied by the alloca.
This removes the ability to only mark a prefix of an alloca alive/dead.
We never used that capability, so we should remove the need to handle
that possibility everywhere (though many key places, including stack
coloring, did not actually respect this).
+ Changed type_mismatch minimal handler to accept and print pointer.
This will allow to distinguish null pointer use, misallignment and
incorrect object size.
The change increases binary size by ~1% and has almost no performance
impact.
Fixes#149943
This patch adds a human readable trap category and message to UBSan
traps. The category and message are encoded in a fake frame in the debug
info where the function is a fake inline function where the name encodes
the trap category and message. This is the same mechanism used by
Clang’s `__builtin_verbose_trap()`.
This change allows consumers of binaries built with trapping UBSan to
more easily identify the reason for trapping. In particular LLDB already
has a frame recognizer that recognizes the fake function names emitted
in debug info by this patch. A patch testing this behavior in LLDB will
be added in a separately.
The human readable trap messages are based on the messages currently
emitted by the userspace runtime for UBSan in compiler-rt. Note the
wording is not identical because the userspace UBSan runtime has access
to dynamic information that is not available during Clang’s codegen.
Test cases for each UBSan trap kind are included.
This complements the [`-fsanitize-annotate-debug-info`
feature](https://github.com/llvm/llvm-project/pull/141997). While
`-fsanitize-annotate-debug-info` attempts to annotate all UBSan-added
instructions, this feature (`-fsanitize-debug-trap-reasons`) only
annotates the final trap instruction using SanitizerHandler information.
This work is part of a GSoc 2025 project.
This extends https://github.com/llvm/llvm-project/pull/138577 to more UBSan checks, by changing SanitizerDebugLocation (formerly SanitizerScope) to add annotations if enabled for the specified ordinals.
Annotations will use the ordinal name if there is exactly one ordinal specified in the SanitizerDebugLocation; otherwise, it will use the handler name.
Updates the tests from https://github.com/llvm/llvm-project/pull/141814.
---------
Co-authored-by: Vitaly Buka <vitalybuka@google.com>
We have multiple different attributes in clang representing device
kernels for specific targets/languages. Refactor them into one attribute
with different spellings to make it more easily scalable for new
languages/targets.
---------
Signed-off-by: Sarnie, Nick <nick.sarnie@intel.com>
This enables the LLVM optimizer to view accesses to distinct struct
members as independent, also for array members. For example, the
following two stores no longer alias:
struct S { int a[10]; int b; };
void test(S *p, int i) {
p->a[i] = ...;
p->b = ...;
}
Array members were already added to TBAA struct type nodes in commit
57493e29. Here, we extend a path tag for an array subscript expression.
This introduces the attribute discussed in
https://discourse.llvm.org/t/rfc-function-type-attribute-to-prevent-cfi-instrumentation/85458.
The proposed name has been changed from `no_cfi` to
`cfi_unchecked_callee` to help differentiate from `no_sanitize("cfi")`
more easily. The proposed attribute has the following semantics:
1. Indirect calls to a function type with this attribute will not be
instrumented with CFI. That is, the indirect call will not be checked.
Note that this only changes the behavior for indirect calls on pointers
to function types having this attribute. It does not prevent all
indirect function calls for a given type from being checked.
2. All direct references to a function whose type has this attribute
will always reference the true function definition rather than an entry
in the CFI jump table.
3. When a pointer to a function with this attribute is implicitly cast
to a pointer to a function without this attribute, the compiler will
give a warning saying this attribute is discarded. This warning can be
silenced with an explicit C-style cast or C++ static_cast.
The InstrProf headers are very expensive. Avoid including them in all of
CodeGen/ by moving the CodeGenPGO member behind a unqiue_ptr.
This reduces clang build time by 0.8%.
In 'EmitStoreThroughExtVectorComponentLValue', move the code which ZExts
in the case the Destination Scalar Type is larger than the Source Scalar
Type, to the top of the function, to ensure each condition is handled.
The previous code missed this case:
```
bool4 b = true.xxxx;
b.xyz = false.xxx;
```
Leading to a bad shuffle vector.
Closes#140564
This generalizes the debug info annotation code from https://github.com/llvm/llvm-project/pull/139149 and moves it into a helper function, SanitizerAnnotateDebugInfo().
Future work can use 'ApplyDebugLocation ApplyTrapDI(*this, SanitizerAnnotateDebugInfo(Ordinal));' to add annotations to additional checks.
The 'counted_by' attribute is now available for pointers in structs.
It generates code for sanity checks as well as
__builtin_dynamic_object_size()
calculations. For example:
struct annotated_ptr {
int count;
char *buf __attribute__((counted_by(count)));
};
If the pointer's type is 'void *', use the 'sized_by' attribute, which
works similarly to 'counted_by', but can handle the 'void' base type:
struct annotated_ptr {
int count;
void *buf __attribute__((sized_by(count)));
};
If the 'count' field member occurs after the pointer, use the
'-fexperimental-late-parse-attributes' flag during compilation.
Note that 'counted_by' cannot be applied to a pointer to an incomplete
type, because the size isn't known.
struct foo;
struct annotated_ptr {
int count;
struct foo *buf __attribute__((counted_by(count))); /* invalid */
};
Signed-off-by: Bill Wendling <morbo@google.com>
It isn't used and is redundant with the result pointer type argument.
A more reasonable API would only have LangAS parameters, or IR parameters,
not both. Not all values have a meaningful value for this. I'm also
not sure why we have this at all, it's not overridden by any targets and
further simplification is possible.
This fixes emitting undefined behavior where a 64-bit generic
pointer is written to a 32-bit slot allocated for a private pointer.
This can be seen in test/CodeGenOpenCL/amdgcn-automatic-variable.cl's
wrong_pointer_alloca.
@fmayer introduced '-mllvm -array-bounds-pseudofn'
(https://github.com/llvm/llvm-project/pull/128977/) to make it easier to
see why crashes occurred, and to estimate with a profiler the cycles
spent on these array-bounds checks. This functionality could be usefully
generalized to other checks in future work.
This patch adds the plumbing for -fsanitize-annotate-debug-info, and
connects it to the existing array-bounds-pseudo-fn functionality i.e.,
-fsanitize-annotate-debug-info=array-bounds can be used as a replacement
for '-mllvm -array-bounds-pseudofn', though we do not yet delete the
latter.
Note: we replaced '-mllvm -array-bounds-pseudofn' in
clang/test/CodeGen/bounds-checking-debuginfo.c, because adding test
cases would modify the line numbers in the test assertions, and
therefore obscure that the test output is the same between '-mllvm
-array-bounds-pseudofn' and -fsanitize-annotate-debug-info=array-bounds.
-fno-sanitize-merge (introduced in
https://github.com/llvm/llvm-project/pull/120464) nearly works for CFI:
code that calls EmitCheck will already check the merge options. This
patch fixes one EmitTrapCheck call, which did not check the merge
options, and for two other EmitTrapChecks, adds two TODOs that explain
why it is difficult to fix them.
This is a follow-up of 13aac46332f607a38067b5ddd466071683b8c255.
This commit adjusts the implementation of `hasBooleanRepresentation` to
be somewhat aligned to `hasIntegerRepresentation`.
In particular vector of booleans should be handled in
`hasBooleanRepresentation`, while `_Atomic(bool)` should not.
This patch relands https://github.com/llvm/llvm-project/pull/130990.
If the check value is passed by reference, add `memory(read)`.
Original PR description:
This patch adds `memory(argmem: read, inaccessiblemem: readwrite)` to
**recoverable** ubsan handlers in order to unblock some
memory/loop optimizations. It provides an average of 3% performance
improvement on llvm-test-suite (except for 49 test failures due to ubsan
diagnostics).
The qualifier allows programmer to directly control how pointers are
signed when they are stored in a particular variable.
The qualifier takes three arguments: the signing key, a flag specifying
whether address discrimination should be used, and a non-negative
integer that is used for additional discrimination.
```
typedef void (*my_callback)(const void*);
my_callback __ptrauth(ptrauth_key_process_dependent_code, 1, 0xe27a) callback;
```
Co-Authored-By: John McCall rjmccall@apple.com
In the LLVM middle-end we want to fold `gep inbounds null, idx -> null`:
https://alive2.llvm.org/ce/z/5ZkPx-
This pattern is common in real-world programs
(https://github.com/dtcxzyw/llvm-opt-benchmark/pull/55#issuecomment-1870963906).
Generally, it exists in some (actually) unreachable blocks, which is
introduced by JumpThreading.
However, some old-style offsetof macros are still widely used in
real-world C/C++ code (e.g., hwloc/slurm/luajit). To avoid breaking
existing code and inconvenience to downstream users, this patch removes
the inbounds flag from the struct gep if the base pointer is null.
This feature largely models the same behavior as in C++11. It is
technically a breaking change between C99 and C11, so the paper is not
being backported to older language modes.
One difference between C++ and C is that things which are rvalues in C
are often lvalues in C++ (such as the result of a ternary operator or a
comma operator).
Fixes#96486
This patch adds `memory(argmem: read, inaccessiblemem: readwrite)
mustprogress` to **recoverable** ubsan handlers in order to unblock some
memory/loop optimizations. It provides an average of 3% performance
improvement on llvm-test-suite (except for 49 test failures due to ubsan
diagnostics).
Closes https://github.com/llvm/llvm-project/issues/130093.
This feature is currently not supported in the compiler.
To facilitate this we emit a stub version of each kernel
function body with different name mangling scheme, and
replaces the respective kernel call-sites appropriately.
Fixes https://github.com/llvm/llvm-project/issues/60313
D120566 was an earlier attempt made to upstream a solution
for this issue.
---------
Co-authored-by: anikelal <anikelal@amd.com>
The ClangIR upstreaming project needs the same logic for
hasBooleanRepresentation() that is currently implemented in the standard
clang codegen. In order to share this code, this change moves the
implementation of this function into the AST Type class.
No functional change is intended by this change. The ClangIR use of this
function will be added separately in a later change.
…ncorrect name
Clang needs variables to be represented with unique names. This means
that if a variable shadows another, its given a different name
internally to ensure it has a unique name. If ASan tries to use this
name when printing an error, it will print the modified unique name,
rather than the variable's source code name
Fixes#47326
Make the memory representation of boolean vectors in HLSL, vectors of
i32.
Allow boolean swizzling for boolean vectors in HLSL.
Add tests for boolean vectors and boolean vector swizzling.
Closes#91639
This PR implements HLSL's initialization list behvaior as specified in
the draft language specifcation under
[*Decl.Init.Agg*](https://microsoft.github.io/hlsl-specs/specs/hlsl.html#Decl.Init.Agg).
This behavior is a bit unusual for C/C++ because intermediate braces in
initializer lists are ignored and a whole array of additional
conversions occur unintuitively to how initializaiton works in C.
The implementaiton in this PR generates a valid C/C++ initialization
list AST for the HLSL initializer so that there are no changes required
to Clang's CodeGen to support this. This design will also allow us to
use Clang's rewrite to convert HLSL initializers to valid C/C++
initializers that are equivalent. It does have the downside that it will
generate often redundant accesses during codegen. The IR optimizer is
extremely good at eliminating those so this will have no impact on the
final executable performance.
There is some opportunity for optimizing the initializer list generation
that we could consider in subsequent commits. One notable opportunity
would be to identify aggregate objects that occur in the same place in
both initializers and do not require converison, those aggregates could
be initialized as aggregates rather than fully scalarized.
Closes#56067
---------
Co-authored-by: Finn Plummer <50529406+inbelic@users.noreply.github.com>
Co-authored-by: Helena Kotas <hekotas@microsoft.com>
Co-authored-by: Justin Bogner <mail@justinbogner.com>
Implement HLSL Aggregate Splat casting that handles splatting for arrays
and structs, and vectors if splatting from a vec1.
Closes#100609 and Closes#100619
Depends on #118842