Introduce `OverflowBehaviorType` (OBT), a new type attribute in Clang
that provides developers with fine-grained control over the overflow
behavior of integer types. This feature allows for a more nuanced
approach to integer safety, achieving better granularity than global
compiler flags like `-fwrapv` and `-ftrapv`. Type specifiers are also
available as keywords `__ob_wrap` and `__ob_trap`.
These can be applied to integer types (both signed and unsigned) as well
as typedef declarations, where the behavior is one of the following:
* `wrap`: Guarantees that arithmetic operations on the type will wrap on
overflow, similar to `-fwrapv`. This suppresses UBSan's integer overflow
checks for the attributed type and prevents eager compiler
optimizations.
* `trap`: Enforces overflow checking for the type, even when global
flags like `-fwrapv` would otherwise suppress it.
A key aspect of this feature is its interaction with existing
mechanisms. `OverflowBehaviorType` takes precedence over global flags
and, notably, over entries in the Sanitizer Special Case List (SSCL).
This allows developers to "allowlist" critical types for overflow
instrumentation, even if they are disabled by a broad rule in an SSCL.
Signed-off-by: Justin Stitt <justinstitt@google.com>
fixes#159438
This patch adds `MatrixElementExpr`, a new AST node for HLSL matrix
element and swizzle access (e.g. M._m00, M._11_22_33).
It introduces a shared `ElementAccessExprBase` used by both matrix and
vector swizzle expressions, updates Sema to parse and validate
zero-based and one-based accessors, detects duplicates for l-value
checks, and emits improved diagnostics. CodeGen is updated to lower
scalar and multi-element accesses consistently, and full AST
serialization, dumping, and tooling support is included. This
implementation reflects the updated
[RFC](https://github.com/llvm/wg-hlsl/pull/357/files) for HLSL matrix
accessor semantics.
This is my second attempt to merge the #33210 fix. My original patch
#178220 caused test failures on risc-v, as I committed a new test file
that was too target-specific. I reverted the PR (#180183) to avoid
causing disruption, as I did not know how long the fix would take.
I have now fixed the test so that it behaves correctly on risc-v. I have
also changed the run lines so it is tested on aarch64, amd64 and risc-v.
The conversion code always ended up just getting the type of Src from
the Src argument itself, with no virtual users of this, so there is no
point in also providing this API hook. Fix the documentation as well,
since it seems DestAddr must have been similarly removed at some point
in the past from the API but was still documented.
Also fixes CIR to actually return the casted value!
This just added unnecessary work to the IR, since they are only used for
load and store, which just causes some IR noise. Tests updated by UTC
script to remove the extra lines.
There is no reason to use these over fpext/fptrunc and bitcast.
Split out from #174484. The test coverage is also shockingly bad,
so adds a new wasm test which shows different contexts the intrinsics
are used.
I've also reverted this to a more conservative version that leaves the
useFP16ConversionIntrinsics configuration in place, and only replaces
the exact intrinsic usage. This should be removed, but it seems to have
turned into a buggy ABI option. Some contexts which probably meant to
check NativeHalfType or NativeHalfArgsAndReturns were relying on this
instead. Additionally, some of the SVE intrinsics appear to be using
__fp16 but really expect _Float16 treatment.
Fixes#172711
Fixes the type mismatch issues preventing single matrix subscript
getters and setters from working with boolean matrices.
The changes from this PR also happens to make matrix splats work for
boolean matrices, but adding tests for that and (re)introducing
boolean-matrix-specific sema will be relegated to its own PR.
In ScalarExprEmitter::EmitScalarPrePostIncDec we create ConstantInt
values that are either 1 or -1. There is a special case when the type is
i1 (e.g. for unsigned _BitInt(1)) when we need to be able to create a
"i1 true" value for both inc and dec.
To avoid triggering the assertions added by the pull request #171456 we
now treat the ConstantInt as unsigned for increments and as signed for
decrements.
fixes#167617
In DXC HLSL supports different indexing modes via codegen for its
equivalent of the MatrixSubscriptExpr when the /Zpr and /Zpc flags are
used see: https://godbolt.org/z/bz5Y5WG36.
This change modifies EmitMatrixSubscriptExpr to consider the
MatrixRowMajor/MatrixColMajor Layout flags before generating an index.
Similarly it introduces `createRowMajorIndex` and
`createColumnMajorIndex` in `MatrixBuilder.h` for use in
`VisitMatrixSubscriptExpr`.
fixes#166206
- Add swizzle support if row index is constant
- Add test cases
- Add new AST type
- Add new LValue for Matrix Row Type
- TODO: Make the new LValue a dynamic index version of ExtVectorElt
fixes#168737fixes#168755
This change fixes adds support for Matrix truncations via the
ICK_HLSL_Matrix_Truncation enum. That ends up being most of the files
changed.
It also allows Matrix as an HLSL Elementwise cast as long as the cast
does not perform a shape transformation ie 3x2 to 2x3.
Tests for the new elementwise and truncation behavior were added. As
well as sema tests to make sure we error n the shape transformation
cast.
I am punting right now on the ConstExpr Matrix support. That will need
to be addressed later. Will file a seperate issue for that if reviewers
agree it can wait.
This rename was made as part of
https://github.com/llvm/llvm-project/pull/147835 in order to ease
rebasing the PR, and give a nice window for other patches to get rebased
as well.
It has been a while already, so lets go ahead and rename it back.
Adds support for elementwise and aggregate splat casting struct types
with bitfields. Replacing existing Flattening function which used to
produce a list of GEPs representing a flattened object with one that
produces a list of LValues representing a flattened object. The LValues
can be used by EmitStoreThroughLValue and EmitLoadOfLValue, ensuring
bitfields are properly loaded and stored. This also simplifies the code
in the elementwise and aggregate splat casting functions.
Closes#125986
Conditional modifier on lastprivate clause is producing incorrect result
when compiled with clang( C compiler). IR is not emitting while
compilation with C compiler.
However it is working correctly with clang++
The OpenMP hook that emits the conditional modifier
(checkAndEmitLastprivateConditional) is skipped in C because assignment
is a prvalue and takes the scalar path.
Original Codegen Support :
[eddb8](a58da1a2ff (diff-629e03f730f901cdf96b6b48fb0aed8ef156590aaff37857b8e5ad0694beddb8))
```
C = → prvalue → EmitAnyExpr(TEK_Scalar) → ScalarExprEmitter::VisitBinAssign (hook not reached)
C++ = → lvalue → EmitBinaryOperatorLValue
```
```
Failing Test Case :
#include <stdio.h>
#define N 10
int A[N];
void condlastprivate() {
int x, y, z, k;
x = y = z = k = 0;
#pragma omp parallel for lastprivate(conditional: x,y, z) lastprivate(k)
for( k=0; k<N; k++){
if ((k >2 ) && (k <6))
{ x = A[k]; z = A[k]+111; }
else
{ y = A[k]+222; }
}
printf("Expecting: x=5, y=231, z=116 k=10 Got: x=%d y=%d z=%d k=%d \n", x,y,z,k);
}
int main() {
for( int i=0; i<N; i++)
{ A[i] = i; }
condlastprivate();
return 0;
}
```
```
#>./clang -fopenmp cond_c.c
#> ./a.out
Expecting: x=5, y=231, z=116 k=10 **Got: x=-1376379760 y=231 z=631465600** k=10
```
---------
Co-authored-by: Chandra Ghale <ghale@pe31.hpc.amslabs.hpecorp.net>
In 29992cfd628ed5b968ccb73b17ed0521382ba317 (#145967) support was added
for "trap reasons" on traps emitted in UBSan in trapping mode (e.g.
`-fsanitize-trap=undefined`). This improved the debugging experience by
attaching the reason for trapping as a string on the debug info on trap
instructions. Consumers such as LLDB can display this trap reason string
when the trap is reached.
A limitation of that patch is that the trap reason string is hard-coded
for each `SanitizerKind` even though the compiler actually has much more
information about the trap available at compile time that could be shown
to the user.
This patch is an incremental step in fixing that. It consists of two
main steps.
**1. Introduce infrastructure for building trap reason strings**
To make it convenient to construct trap reason strings this patch
re-uses Clang's powerful diagnostic infrastructure to provide a
convenient API for constructing trap reason strings. This is achieved
by:
* Introducing a new `Trap` diagnostic kind to represent trap diagnostics
in TableGen files.
* Adding a new `Trap` diagnostic component. While this part probably
isn't technically necessary it seemed like I should follow the existing
convention used by the diagnostic system.
* Adding `DiagnosticTrapKinds.td` to describe the different trap
reasons.
* Add the `TrapReasonBuilder` and `TrapReason` classes to provide an
interface for constructing trap reason strings and the trap category.
Note this API while similar to `DiagnosticBuilder` has different
semantics which are described in the code comments. In particular the
behavior when the destructor is called is very different.
* Adding `CodeGenModule::BuildTrapReason()` as a convenient constructor
for the `TrapReasonBuilder`.
This use of the diagnostic system is a little unusual in that the
emitted trap diagnostics aren't actually consumed by normal diagnostic
consumers (e.g. the console). Instead the `TrapReasonBuilder` is just
used to format a string, so in effect the builder is somewhat analagous
to "printf". However, re-using the diagnostics system in this way brings
a several benefits:
* The powerful diagnostic templating languge (e.g. `%select`) can be
used.
* Formatting Clang data types (e.g. `Type`, `Expr`, etc.) just work
out-of-the-box.
* Describing trap reasons in tablegen files opens the door for
translation to different languages in the future.
* The `TrapReasonBuilder` API is very similar to `DiagnosticBuilder`
which makes it easy to use by anyone already familiar with Clang's
diagnostic system.
While UBSan is the first consumer of this new infrastructure the intent
is to use this to overhaul how trap reasons are implemented in the
`-fbounds-safety` implementation (currently exists downstream).
**2. Apply the new infrastructure to UBSan checks for arithmetic
overflow**
To demonstrate using `TrapReasonBuilder` this patch applies it to UBSan
traps for arithmetic overflow. The intention is that we would
iteratively switch to using the `TrapReasonBuilder` for all UBSan traps
where it makes sense in future patches.
Previously for code like
```
int test(int a, int b) { return a + b; }
```
The trap reason string looked like
```
Undefined Behavior Sanitizer: Integer addition overflowed
```
now the trap message looks like:
```
Undefined Behavior Sanitizer: signed integer addition overflow in 'a + b'
```
This string is much more specific because
* It explains if signed or unsigned overflow occurred
* It actually shows the expression that overflowed
One possible downside of this approach is it may blow up Debug info size
because now there can be many more distinct trap reason strings. To
allow users to avoid this a new driver/cc1 flag
`-fsanitize-debug-trap-reasons=` has been added which can either be
`none` (disable trap reasons entirely), `basic` (use the per
`SanitizerKind` hard coded strings), and `detailed` (use the new
expressive trap reasons implemented in this patch). The default is
`detailed` to give the best out-of-the-box debugging experience. The
existing `-fsanitize-debug-trap-reasons` and
`-fno-sanitize-debug-trap-reasons` have been kept for compatibility and
are aliases of the new flag with `detailed` and `none` arguments passed
respectively.
rdar://158612755
Summary:
It's extremely common to conditionally blend two vectors. Previously
this was done with mask registers, which is what the normal ternary code
generation does when used on a vector. However, since Clang 15 we have
supported boolean vector types in the compiler. These are useful in
general for checking the mask registers, but are currently limited
because they do not map to an LLVM-IR select instruction.
This patch simply relaxes these checks, which are technically forbidden
by
the OpenCL standard. However, general vector support should be able to
handle these. We already support this for Arm SVE types, so this should
be make more consistent with the clang vector type.
This is a major change on how we represent nested name qualifications in
the AST.
* The nested name specifier itself and how it's stored is changed. The
prefixes for types are handled within the type hierarchy, which makes
canonicalization for them super cheap, no memory allocation required.
Also translating a type into nested name specifier form becomes a no-op.
An identifier is stored as a DependentNameType. The nested name
specifier gains a lightweight handle class, to be used instead of
passing around pointers, which is similar to what is implemented for
TemplateName. There is still one free bit available, and this handle can
be used within a PointerUnion and PointerIntPair, which should keep
bit-packing aficionados happy.
* The ElaboratedType node is removed, all type nodes in which it could
previously apply to can now store the elaborated keyword and name
qualifier, tail allocating when present.
* TagTypes can now point to the exact declaration found when producing
these, as opposed to the previous situation of there only existing one
TagType per entity. This increases the amount of type sugar retained,
and can have several applications, for example in tracking module
ownership, and other tools which care about source file origins, such as
IWYU. These TagTypes are lazily allocated, in order to limit the
increase in AST size.
This patch offers a great performance benefit.
It greatly improves compilation time for
[stdexec](https://github.com/NVIDIA/stdexec). For one datapoint, for
`test_on2.cpp` in that project, which is the slowest compiling test,
this patch improves `-c` compilation time by about 7.2%, with the
`-fsyntax-only` improvement being at ~12%.
This has great results on compile-time-tracker as well:

This patch also further enables other optimziations in the future, and
will reduce the performance impact of template specialization resugaring
when that lands.
It has some other miscelaneous drive-by fixes.
About the review: Yes the patch is huge, sorry about that. Part of the
reason is that I started by the nested name specifier part, before the
ElaboratedType part, but that had a huge performance downside, as
ElaboratedType is a big performance hog. I didn't have the steam to go
back and change the patch after the fact.
There is also a lot of internal API changes, and it made sense to remove
ElaboratedType in one go, versus removing it from one type at a time, as
that would present much more churn to the users. Also, the nested name
specifier having a different API avoids missing changes related to how
prefixes work now, which could make existing code compile but not work.
How to review: The important changes are all in
`clang/include/clang/AST` and `clang/lib/AST`, with also important
changes in `clang/lib/Sema/TreeTransform.h`.
The rest and bulk of the changes are mostly consequences of the changes
in API.
PS: TagType::getDecl is renamed to `getOriginalDecl` in this patch, just
for easier to rebasing. I plan to rename it back after this lands.
Fixes#136624
Fixes https://github.com/llvm/llvm-project/issues/43179
Fixes https://github.com/llvm/llvm-project/issues/68670
Fixes https://github.com/llvm/llvm-project/issues/92757
This extends https://github.com/llvm/llvm-project/pull/138577 to more UBSan checks, by changing SanitizerDebugLocation (formerly SanitizerScope) to add annotations if enabled for the specified ordinals.
Annotations will use the ordinal name if there is exactly one ordinal specified in the SanitizerDebugLocation; otherwise, it will use the handler name.
Updates the tests from https://github.com/llvm/llvm-project/pull/141814.
---------
Co-authored-by: Vitaly Buka <vitalybuka@google.com>
It turns out that getVLASize() does not get you the size of a single
dimension of the VLA, it gets you the full count of all elements. This
caused _Countof to return invalid values on VLA ranks. Now switched to
using getVLAElements1D() instead, which only gets a single dimension.
Fixes#141409
For i1 vectors, we used an i8 fixed vector as the storage type.
If the known minimum number of elements of the scalable vector type is
less than 8, we were doing the cast through memory. This used a load or
store from a fixed vector alloca. If is less than 8, DataLayout
indicates that the load/store reads/writes vscale bytes even if vscale
is known and vscale*X is less than or equal to 8. This means the load or
store is outside the bounds of the fixed size alloca as far as
DataLayout is concerned leading to undefined behavior.
This patch avoids this by widening the i1 scalable vector type with zero
elements until it is divisible by 8. This allows it be bitcasted to/from
an i8 scalable vector. We then insert or extract the i8 fixed vector
into this type.
Hopefully this enables #130973 to be accepted.
The passed indices have to be constant integers anyway, which we verify
before creating the ShuffleVectorExpr. Use the value we create there and
save the indices using a ConstantExpr instead. This way, we don't have
to evaluate the args every time we call getShuffleMaskIdx().
Allows the __ptrauth qualifier to be applied to pointer sized integer types,
updates Sema to ensure trivially copyable, etc correctly handle address
discriminated integers, and updates codegen to perform authentication
around arithmetic on the types.
It isn't used and is redundant with the result pointer type argument.
A more reasonable API would only have LangAS parameters, or IR parameters,
not both. Not all values have a meaningful value for this. I'm also
not sure why we have this at all, it's not overridden by any targets and
further simplification is possible.
When an HLSL Init list is producing a Scalar, handle OpaqueValueExprs in
the Init List with 'emitInitListOpaqueValues'
Copied from 'AggExprEmitter::VisitCXXParenListOrInitListExpr'
Closes#136408
---------
Co-authored-by: Chris B <beanz@abolishcrlf.org>
Most callers want a constant index. Instead of making every caller
create a ConstantInt, we can do it in IRBuilder. This is similar to
createInsertElement/createExtractElement.
The qualifier allows programmer to directly control how pointers are
signed when they are stored in a particular variable.
The qualifier takes three arguments: the signing key, a flag specifying
whether address discrimination should be used, and a non-negative
integer that is used for additional discrimination.
```
typedef void (*my_callback)(const void*);
my_callback __ptrauth(ptrauth_key_process_dependent_code, 1, 0xe27a) callback;
```
Co-Authored-By: John McCall rjmccall@apple.com
fixes#135122
SemaExpr.cpp - Make all doubles fail. Add sema support for float scalars
and vectors when language mode is HLSL.
CGExprScalar.cpp - Allow emit frem when language mode is HLSL.