fixes https://github.com/llvm/llvm-project/issues/179859
For matrix types we need to check the language mode so we can change the
matrix memory layout to arrays of vectors. To make this play nice with
how the rest of clang treats matrices we need to modify the
MaybeConvertMatrixAddress and the CreateMemTemp function to know how to
reconstruct a flattened vector.
Rest of changes is just test updates.
fixes#159438
This patch adds `MatrixElementExpr`, a new AST node for HLSL matrix
element and swizzle access (e.g. M._m00, M._11_22_33).
It introduces a shared `ElementAccessExprBase` used by both matrix and
vector swizzle expressions, updates Sema to parse and validate
zero-based and one-based accessors, detects duplicates for l-value
checks, and emits improved diagnostics. CodeGen is updated to lower
scalar and multi-element accesses consistently, and full AST
serialization, dumping, and tooling support is included. This
implementation reflects the updated
[RFC](https://github.com/llvm/wg-hlsl/pull/357/files) for HLSL matrix
accessor semantics.
The conversion code always ended up just getting the type of Src from
the Src argument itself, with no virtual users of this, so there is no
point in also providing this API hook. Fix the documentation as well,
since it seems DestAddr must have been similarly removed at some point
in the past from the API but was still documented.
Also fixes CIR to actually return the casted value!
This just added unnecessary work to the IR, since they are only used for
load and store, which just causes some IR noise. Tests updated by UTC
script to remove the extra lines.
Currently we do not emit lifetimes by default when compiling with
memtag-stack - which means we don't catch use-after-scope (when
compiling without optimization).
This patch fixes that by mirroring ASan, HWASan and MSan, and always
emitting lifetime markers. The patch is based on the changes made in
aeca569.
rdar://163713381
Fixes#174629
This PR is similar to that of #169144 but for matrices.
When storing to a matrix element or matrix row, `insertelement`
instructions have been replaced by GEPs followed by stores to individual
matrix elements. There is no longer storing of the entire matrix to
memory all at once, thus avoiding data races when writing to independent
matrix elements from multiple threads.
Fixes#175808
This PR adds support for boolean matrix splats by adding tests and
fixing a bug in `CodeGenFunction::EmitToMemory` when the type of a
boolean matrix already matches the type expected of a load/store.
This PR also addresses the todo comment in `clang/lib/Sema/SemaExpr.cpp`
regarding support for boolean matrix splats by removing the comment
altogether since it is not necessary.
---------
Co-authored-by: Farzon Lotfi <farzonlotfi@microsoft.com>
Fixes#172711
Fixes the type mismatch issues preventing single matrix subscript
getters and setters from working with boolean matrices.
The changes from this PR also happens to make matrix splats work for
boolean matrices, but adding tests for that and (re)introducing
boolean-matrix-specific sema will be relegated to its own PR.
Fixes#175236
This pull request improves support for HLSL constant matrix types with
boolean elements in Clang's code generation. The main changes ensure
that boolean (i1) matrices are correctly represented and stored as i32
vectors in LLVM IR. This includes updates to both the code generation
logic and related tests.
### Code generation improvements for HLSL boolean matrices
* Updated `convertTypeForLoadStore` in `CodeGenTypes.cpp` to represent
constant matrix types with boolean elements as `FixedVectorType` of
integers, ensuring atomic load/store operations and correct element type
conversion for HLSL.
* Modified `EmitToMemory` in `CGExpr.cpp` to handle both
`ExtVectorBoolType` and `ConstantMatrixBoolType`, improving the handling
of boolean matrices during memory emission.
### Test updates for boolean matrix codegen
* Adjusted test expectations in `BoolMatrix.hlsl` to reflect the new
representation, showing stores and loads of `<N x i32>` instead of `<N x
i1>` for boolean matrices, and added zero-extension where necessary.
* Added a new test for a 4x4 boolean matrix function to verify correct
code generation for initial stores to boolean matrix parameter
declaration allocas.
fixes#170777
If we don't use vector type and instead continue to pass on the matrix
type when we enter `EmitExtVectorElementExpr` Then we don't need to
store the row and column length on the LValue.
Using the Matrix type means we can reuse the isMatrixRow() cases in
EmitLoadOfLValue and EmitStoreThroughLValue and not have to support a
new lValue that is a hybrid between the ExtVectorElt and MatrixRow
cases.
All we need to do to support this is pass the list of column indices as
a `ConstantDataVector` and check the size of this Vector to know how
many column iterations we need to do. Further just index into the vector
to fetch the right encoded element index value.
fixes#172805
For the constant case if we know the row index then we can compute the
offsets via `E->getEncodedElementAccess(Elts)`. We had to also add
column and row sizes to LValue so that we could compute the right index.
the emitter for `MatrixSingleSubscriptExpr` collects the sizes off the
type and passes it to `MakeMatrixRow`.
---------
Co-authored-by: Justin Bogner <mail@justinbogner.com>
fixes#167617
In DXC HLSL supports different indexing modes via codegen for its
equivalent of the MatrixSubscriptExpr when the /Zpr and /Zpc flags are
used see: https://godbolt.org/z/bz5Y5WG36.
This change modifies EmitMatrixSubscriptExpr to consider the
MatrixRowMajor/MatrixColMajor Layout flags before generating an index.
Similarly it introduces `createRowMajorIndex` and
`createColumnMajorIndex` in `MatrixBuilder.h` for use in
`VisitMatrixSubscriptExpr`.
Allow the normal rules for preventing instrumentation on indirect calls
to `cfi_unchecked_callee` function types and `cfi_unchecked_callee`
functions when using `-fsanitize=function`. While it's technically
separate from `-fsanitize=cfi`, this particular UBSan mode checks for
similar control flow bugs so it makes sense to also prevent those
control flow checks from being added onto `cfi_unchecked_callee`
functions.
This change creates an XFAIL that will be addressed by #172805
This part of the change was not intended to make it in. It was a way to
optimize for constant index setters\getters but had the unintended
consequence of doing swizzles without considering the encoded element
offsets.
We need to be calling `MakeExtVectorElt` in `EmitExtVectorElementExpr`
so that we can call `E->getEncodedElementAccess(Elts);` off of the
`ExtVectorElementExpr` Node.
Calling it in `EmitMatrixSingleSubscriptExpr` was wrong because we doing
have the vector encoded elementts yet. But also to get the right index
we need a way to pass on the matrix column and row lengths so
`MakeMatrixRow` needed to be expanded to store row and column sizes on
the LValue.
fixes#166206
- Add swizzle support if row index is constant
- Add test cases
- Add new AST type
- Add new LValue for Matrix Row Type
- TODO: Make the new LValue a dynamic index version of ExtVectorElt
fixes#171049fixes#171050
- Allow Bools for matrix type when in HLSL mode
- use ConvertTypeForMem to figure out the bool size
- Add Bool matrix types to hlsl_basic_types.h
---------
Co-authored-by: Helena Kotas <hekotas@microsoft.com>
If the 'counted_by' value is signed, we will incorrectly allow accesses
when the value is negative. This has obvious bad effects as it will
allow accessing a huge swath of unallocated memory.
Also clarify and rearrange the parameters to make them more
perspicuous.
Fixes: #170987.
fixes#168737fixes#168755
This change fixes adds support for Matrix truncations via the
ICK_HLSL_Matrix_Truncation enum. That ends up being most of the files
changed.
It also allows Matrix as an HLSL Elementwise cast as long as the cast
does not perform a shape transformation ie 3x2 to 2x3.
Tests for the new elementwise and truncation behavior were added. As
well as sema tests to make sure we error n the shape transformation
cast.
I am punting right now on the ConstExpr Matrix support. That will need
to be addressed later. Will file a seperate issue for that if reviewers
agree it can wait.
This change fixes couple of issues with static resources:
- Enables assignment to static resource or resource array variables (fixes#166458)
- Initializes static resources and resource arrays with default constructor that sets the handle to poison
When an individual element of a vector is updated via indexing into the vector, it needs to be handled as a store operation on that one vector element.
Clang treats vectors as one unit, so a vector element needs to be updated, the whole vector is loaded, the element is modified, and then the whole vector is stored. In HLSL vector elements are handled individually. We need to avoid this load/modify/store sequence to prevent overwriting other vector elements that might be getting updated in parallel.
Fixes#167729
Contributes to #160208.
When individual elements of a vector are updated via vector swizzle, it needs to be handled as separate store operations to the individual vector elements.
Clang treats vectors as one unit, so if a part of a vector needs to be updated, the whole vector is loaded, some elements modified, and then the whole vector is stored.
In HLSL vector elements are handled separately. We need to avoid this load/modify/store sequence to prevent overwriting other vector elements that might be getting updated in parallel.
Fixes#152815
This partially reverts #152192, keeping updated tests and
some code reordering in clang/lib/CodeGen/CGExpr.cpp.
compiler-rt/lib/ubsan_minimal/ubsan_minimal_handlers.cpp is exact revert
(with followup #152419)
We don't have a good use case for that, so revert it before we are stuck
maintaining this API.
21.x does not have this patch.
This reverts commit a1209d868632b8aea10450cd2323848ab0b6776a.
Consider a newly added "malloc_span" attribute in the allocation token
instrumentation to ensure that allocation functions with the
"malloc_span" attribute are processed similarly to other memory
allocation functions.
Update the tests to demonstrate applicability to __size_returning_new.
This change drops the use of the "Layout" type and instead uses explicit
padding throughout the compiler to represent types in HLSL buffers.
There are a few parts to this, though it's difficult to split them up as
they're very interdependent:
1. Refactor HLSLBufferLayoutBuilder to allow us to calculate the padding
of arbitrary types.
2. Teach Clang CodeGen to use HLSL specific paths for cbuffers when
generating aggregate copies, array accesses, and structure accesses.
3. Simplify DXILCBufferAccesses such that it directly replaces accesses
with dx.resource.getpointer rather than recalculating the layout.
4. Basic infrastructure for SPIR-V handling, but the implementation
itself will need work in follow ups.
Fixes several issues, including #138996, #144573, and #156084.
Resolves#147352.
`DISubprogram`s are attached to call sites to support various debug info
features, including entry values and tail calls. Clang 9.0
(0f6516856670a435461f56a9faeb4aa8a35a6679) was the first version to
include this kind of call site `DISubprogram` attachment.
This earlier work appears to visit only some call site variants,
however. The call site attachment was added to a higher-level `EmitCall`
path in Clang's code gen that is only used by some call variants. In
particular, some C++ member calls use a different code gen path, which
did not include this call site attachment step, and thus the debug info
it triggers (e.g. call site entries) was not emitted for such calls.
This moves `DISubprogram` attachment to a lower-level call emission path
that is used by all call variants.
Fixes https://github.com/llvm/llvm-project/issues/161962
Currently Clang usually leaves padding bits uninitialized, which means
they are undef at the moment.
When expanding stores of vector types to include padding, the padding
lanes will be poison, hence the padding bits will be poison.
This interacts badly with coercion of arguments and return values, where
3 x float vectors will be loaded as i128 integer; poisoning the padding
bits will make the whole value poison.
Not sure if there's a better way, but I think we have a number of places
that currently rely on the padding being undef, not poison.
PR: https://github.com/llvm/llvm-project/pull/164821
Refactor the logic for inferring allocated types out of `CodeGen` and
into a new reusable component in `clang/AST/InferAlloc.h`.
This is a preparatory step for implementing `__builtin_infer_alloc_token`.
By moving the type inference heuristics into a common place, it can be
shared between the existing allocation-call instrumentation and the new
builtin's implementation.
`UnqualPtrTy` didn't always match `llvm::PointerType::getUnqual`:
sometimes it returned a pointer that is not in address space 0 (notably
for SPIRV).
Since `UnqualPtrTy` was used as the "generic" or "default" pointer type,
this patch renames it to `DefaultPtrTy` to avoid confusion with LLVM's
`PointerType::getUnqual`.
This rename was made as part of
https://github.com/llvm/llvm-project/pull/147835 in order to ease
rebasing the PR, and give a nice window for other patches to get rebased
as well.
It has been a while already, so lets go ahead and rename it back.
Adds support for elementwise and aggregate splat casting struct types
with bitfields. Replacing existing Flattening function which used to
produce a list of GEPs representing a flattened object with one that
produces a list of LValues representing a flattened object. The LValues
can be used by EmitStoreThroughLValue and EmitLoadOfLValue, ensuring
bitfields are properly loaded and stored. This also simplifies the code
in the elementwise and aggregate splat casting functions.
Closes#125986
This API takes a const char* with a default nullptr value and immdiately
passes it down to an API taking a StringRef. All of the places this is
called from are either using compile time string literals, the default
argument, or string objects that have known length. Discarding the
length known from a calling API to just have to strlen it to call the
next layer down that requires a StringRef is a bit silly, so this change
updates CodeGenModule::GetAddrOfConstantCString to use StringRef instead
of const char* for the GlobalName parameter.
It might be worth also replacing the first parameter with an llvm ADT
type that avoids allocation, but that change would have wider impact so
we should consider it separately.
For structured bindings that use custom `get` specializations, the
resulting LLVM IR ascribes the load of the result of `get` to the
binding's declaration, rather than the place where the binding is
referenced - this caused awkward sequencing in the debug info where,
when stepping through the code you'd step back to the binding
declaration every time there was a reference to the binding.
To fix that - when we cross into IRGening a binding - suppress the debug
info location of that subexpression.
I don't represent this as a great bit of API design - certainly open to
ideas, but putting it out here as a place to start.
It's /possible/ this is an incomplete fix, even - if the binding decl
had other subexpressions, those would still get their location applied &
it'd likely be wrong.
So maybe that's a direction to go with to productionize this - add a new
location scoped device that suppresses any overriding - this might be
more robust. How do people feel about that?
In 29992cfd628ed5b968ccb73b17ed0521382ba317 (#145967) support was added
for "trap reasons" on traps emitted in UBSan in trapping mode (e.g.
`-fsanitize-trap=undefined`). This improved the debugging experience by
attaching the reason for trapping as a string on the debug info on trap
instructions. Consumers such as LLDB can display this trap reason string
when the trap is reached.
A limitation of that patch is that the trap reason string is hard-coded
for each `SanitizerKind` even though the compiler actually has much more
information about the trap available at compile time that could be shown
to the user.
This patch is an incremental step in fixing that. It consists of two
main steps.
**1. Introduce infrastructure for building trap reason strings**
To make it convenient to construct trap reason strings this patch
re-uses Clang's powerful diagnostic infrastructure to provide a
convenient API for constructing trap reason strings. This is achieved
by:
* Introducing a new `Trap` diagnostic kind to represent trap diagnostics
in TableGen files.
* Adding a new `Trap` diagnostic component. While this part probably
isn't technically necessary it seemed like I should follow the existing
convention used by the diagnostic system.
* Adding `DiagnosticTrapKinds.td` to describe the different trap
reasons.
* Add the `TrapReasonBuilder` and `TrapReason` classes to provide an
interface for constructing trap reason strings and the trap category.
Note this API while similar to `DiagnosticBuilder` has different
semantics which are described in the code comments. In particular the
behavior when the destructor is called is very different.
* Adding `CodeGenModule::BuildTrapReason()` as a convenient constructor
for the `TrapReasonBuilder`.
This use of the diagnostic system is a little unusual in that the
emitted trap diagnostics aren't actually consumed by normal diagnostic
consumers (e.g. the console). Instead the `TrapReasonBuilder` is just
used to format a string, so in effect the builder is somewhat analagous
to "printf". However, re-using the diagnostics system in this way brings
a several benefits:
* The powerful diagnostic templating languge (e.g. `%select`) can be
used.
* Formatting Clang data types (e.g. `Type`, `Expr`, etc.) just work
out-of-the-box.
* Describing trap reasons in tablegen files opens the door for
translation to different languages in the future.
* The `TrapReasonBuilder` API is very similar to `DiagnosticBuilder`
which makes it easy to use by anyone already familiar with Clang's
diagnostic system.
While UBSan is the first consumer of this new infrastructure the intent
is to use this to overhaul how trap reasons are implemented in the
`-fbounds-safety` implementation (currently exists downstream).
**2. Apply the new infrastructure to UBSan checks for arithmetic
overflow**
To demonstrate using `TrapReasonBuilder` this patch applies it to UBSan
traps for arithmetic overflow. The intention is that we would
iteratively switch to using the `TrapReasonBuilder` for all UBSan traps
where it makes sense in future patches.
Previously for code like
```
int test(int a, int b) { return a + b; }
```
The trap reason string looked like
```
Undefined Behavior Sanitizer: Integer addition overflowed
```
now the trap message looks like:
```
Undefined Behavior Sanitizer: signed integer addition overflow in 'a + b'
```
This string is much more specific because
* It explains if signed or unsigned overflow occurred
* It actually shows the expression that overflowed
One possible downside of this approach is it may blow up Debug info size
because now there can be many more distinct trap reason strings. To
allow users to avoid this a new driver/cc1 flag
`-fsanitize-debug-trap-reasons=` has been added which can either be
`none` (disable trap reasons entirely), `basic` (use the per
`SanitizerKind` hard coded strings), and `detailed` (use the new
expressive trap reasons implemented in this patch). The default is
`detailed` to give the best out-of-the-box debugging experience. The
existing `-fsanitize-debug-trap-reasons` and
`-fno-sanitize-debug-trap-reasons` have been kept for compatibility and
are aliases of the new flag with `detailed` and `none` arguments passed
respectively.
rdar://158612755
This changes a bunch of places which use getAs<TagType>, including
derived types, just to obtain the tag definition.
This is preparation for #155028, offloading all the changes that PR used
to introduce which don't depend on any new helpers.
Adds support for accessing individual resources from fixed-size global resource arrays.
Design proposal:
https://github.com/llvm/wg-hlsl/blob/main/proposals/0028-resource-arrays.md
Enables indexing into globally scoped, fixed-size resource arrays to retrieve individual resources. The initialization logic is primarily handled during codegen. When a global resource array is indexed, the
codegen translates the `ArraySubscriptExpr` AST node into a constructor call for the corresponding resource record type and binding.
To support this behavior, Sema needs to ensure that:
- The constructor for the specific resource type is instantiated.
- An implicit binding attribute is added to resource arrays that lack explicit bindings (#152452).
Closes#145424