Added codegen for scope directive, enabled allocate and firstprivate
clauses, and added scope directive LIT test.
Testing
- LIT tests (including new scope test).
- OpenMP scope example test from 5.2 OpenMP API examples document.
- Three executable scope tests from OpenMP_VV/sollve_vv suite.
This implements our original design now that LLVM is comfortable with
structs and arrays of scalable vector types. All SVE ACLE intrinsics
already use struct types so the effect of this change is purely the
types used for alloca and function parameters.
There should be no C/C++ user visible change with this patch.
This patch enable the function multiversion(FMV) and `target_clones`
attribute for RISC-V target.
The proposal of `target_clones` syntax can be found at the
https://github.com/riscv-non-isa/riscv-c-api-doc/pull/48 (which has
landed), as modified by the proposed
https://github.com/riscv-non-isa/riscv-c-api-doc/pull/85 (which adds the
priority syntax).
It supports the `target_clones` function attribute and function
multiversioning feature for RISC-V target. It will generate the ifunc
resolver function for the function that declared with target_clones
attribute.
The resolver function will check the version support by runtime object
`__riscv_feature_bits`.
For example:
```
__attribute__((target_clones("default", "arch=+ver1", "arch=+ver2"))) int bar() {
return 1;
}
```
the corresponding resolver will be like:
```
bar.resolver() {
__init_riscv_feature_bits();
// Check arch=+ver1
if ((__riscv_feature_bits.features[0] & BITMASK_OF_VERSION1) == BITMASK_OF_VERSION1) {
return bar.arch=+ver1;
} else {
// Check arch=+ver2
if ((__riscv_feature_bits.features[0] & BITMASK_OF_VERSION2) == BITMASK_OF_VERSION2) {
return bar.arch=+ver2;
} else {
// Default
return bar.default;
}
}
}
```
This patch is the frontend implementation of the coroutine elide
improvement project detailed in this discourse post:
https://discourse.llvm.org/t/language-extension-for-better-more-deterministic-halo-for-c-coroutines/80044
This patch proposes a C++ struct/class attribute
`[[clang::coro_await_elidable]]`. This notion of await elidable task
gives developers and library authors a certainty that coroutine heap
elision happens in a predictable way.
Originally, after we lower a coroutine to LLVM IR, CoroElide is
responsible for analysis of whether an elision can happen. Take this as
an example:
```
Task foo();
Task bar() {
co_await foo();
}
```
For CoroElide to happen, the ramp function of `foo` must be inlined into
`bar`. This inlining happens after `foo` has been split but `bar` is
usually still a presplit coroutine. If `foo` is indeed a coroutine, the
inlined `coro.id` intrinsics of `foo` is visible within `bar`. CoroElide
then runs an analysis to figure out whether the SSA value of
`coro.begin()` of `foo` gets destroyed before `bar` terminates.
`Task` types are rarely simple enough for the destroy logic of the task
to reference the SSA value from `coro.begin()` directly. Hence, the pass
is very ineffective for even the most trivial C++ Task types. Improving
CoroElide by implementing more powerful analyses is possible, however it
doesn't give us the predictability when we expect elision to happen.
The approach we want to take with this language extension generally
originates from the philosophy that library implementations of `Task`
types has the control over the structured concurrency guarantees we
demand for elision to happen. That is, the lifetime for the callee's
frame is shorter to that of the caller.
The ``[[clang::coro_await_elidable]]`` is a class attribute which can be
applied to a coroutine return type.
When a coroutine function that returns such a type calls another
coroutine function, the compiler performs heap allocation elision when
the following conditions are all met:
- callee coroutine function returns a type that is annotated with
``[[clang::coro_await_elidable]]``.
- In caller coroutine, the return value of the callee is a prvalue that
is immediately `co_await`ed.
From the C++ perspective, it makes sense because we can ensure the
lifetime of elided callee cannot exceed that of the caller if we can
guarantee that the caller coroutine is never destroyed earlier than the
callee coroutine. This is not generally true for any C++ programs.
However, the library that implements `Task` types and executors may
provide this guarantee to the compiler, providing the user with
certainty that HALO will work on their programs.
After this patch, when compiling coroutines that return a type with such
attribute, the frontend checks that the type of the operand of
`co_await` expressions (not `operator co_await`). If it's also
attributed with `[[clang::coro_await_elidable]]`, the FE emits metadata
on the call or invoke instruction as a hint for a later middle end pass
to elide the elision.
The original patch version is
https://github.com/llvm/llvm-project/pull/94693 and as suggested, the
patch is split into frontend and middle end solutions into stacked PRs.
The middle end CoroSplit patch can be found at
https://github.com/llvm/llvm-project/pull/99283
The middle end transformation that performs the elide can be found at
https://github.com/llvm/llvm-project/pull/99285
HLSL output parameters are denoted with the `inout` and `out` keywords
in the function declaration. When an argument to an output parameter is
constructed a temporary value is constructed for the argument.
For `inout` pamameters the argument is initialized via copy-initialization
from the argument lvalue expression to the parameter type. For `out`
parameters the argument is not initialized before the call.
In both cases on return of the function the temporary value is written
back to the argument lvalue expression through an implicit assignment
binary operator with casting as required.
This change introduces a new HLSLOutArgExpr ast node which represents
the output argument behavior. The OutArgExpr has three defined children:
- An OpaqueValueExpr of the argument lvalue expression.
- An OpaqueValueExpr of the copy-initialized parameter.
- A BinaryOpExpr assigning the first with the value of the second.
Fixes#87526
---------
Co-authored-by: Damyan Pepper <damyanp@microsoft.com>
Co-authored-by: John McCall <rjmccall@gmail.com>
As per [1] the indices for a matrix element access operator shall have
integral or unscoped enumeration types and be non-negative. At the
moment, the index expression is converted to SizeType irrespective of
the signedness of the index expression. This causes implicit sign
conversion warnings if any of the indices is signed.
As per the spec, using signed types as indices is allowed and should not
cause any warnings. If the index expression is signed, extend to
SignedSizeType to avoid the warning.
[1]
https://clang.llvm.org/docs/MatrixTypes.html#matrix-type-element-access-operator
PR: https://github.com/llvm/llvm-project/pull/103044
This patch creates an additional EmitRISCVCpuSupports function to handle
situations with multiple features. It also modifies the original
EmitRISCVCpuSupports function to invoke the new one.
As part of the LLVM effort to eliminate debug-info intrinsics, we're
moving to a world where only iterators should be used to insert
instructions. This isn't a problem in clang when instructions get
generated before any debug-info is inserted, however we're planning on
deprecating and removing the instruction-pointer insertion routines.
Scatter some calls to getIterator in a few places, remove a
deref-then-addrof on another iterator, and add an overload for the
createLoadInstBefore utility. Some callers passes a null insertion
point, which we need to handle explicitly now.
This is a minimal patch to support parsing for "omp assume" directives.
These are meant to be hints to a compiler's optimisers: as such, it is
legitimate (if not very useful) to ignore them. The patch builds on top
of the existing support for "omp assumes" directives (note spelling!).
Unlike the "omp [begin/end] assumes" directives, "omp assume" is
associated with a compound statement, i.e. it can appear within a
function. The "holds" assumption could (theoretically) be mapped onto
the existing builtin "__builtin_assume", though the latter applies to a
single point in the program, and the former to a range (i.e. the whole
of the associated compound statement).
This patch fixes sollve's OpenMP 5.1 "omp assume"-based tests.
Fix codegen of consteval functions returning an empty class, and related
issues
If a class is empty, don't store it to memory: the store might overwrite
useful data. Similarly, if a class has tail padding that might overlap
other fields, don't store the tail padding to memory.
The problem here turned out a bit more general than I initially thought:
basically all uses of EmitAggregateStore were broken. Call lowering had
a method that did mostly the right thing, though: CreateCoercedStore.
Adapt CreateCoercedStore so it always does the conservatively right
thing, and use it for both calls and ConstantExpr.
Also, along the way, fix the "overlap" bit in AggValueSlot: the bit was
set incorrectly for empty classes in some cases.
Fixes#93040.
As with other loops, we need only look at a RecordDecl's FieldDecls.
Convert to using them. In the meantime, we can improve the generation of
the 'counted_by' FieldDecl's GEP by creating one GEP instead of a series
of GEPs.
- For languages following SPMD/SIMT programming model, functions and
call sites are marked 'convergent' by default. 'noconvergent' is added
in this patch to allow developers to remove that 'convergent'
attribute when it's safe.
Reviewers:
nhaehnle, Sirraide, yxsamliu, Artem-B, ilovepi, jayfoad, ssahasra, arsenm
Reviewed By: arsenm
Pull Request: https://github.com/llvm/llvm-project/pull/100637
Summary:
Currently there are several layers to handle `printf`. Since we now have
varargs and an implementation of `printf` this can be heavily
simplified.
1. The frontend renames `printf` into `omp_vprintf` and gives it an
argument buffer.
Removing 1. triggered some code in the AMDGPU backend menat for HIP /
OpenCL, so I hadded an exception to it.
2. Forward this to CUDA vprintf or ignore it.
We no longer need special handling for it since we have varargs. So now
we just forward this to CUDA vprintf if we have libc, otherwise just
leave `printf` as an external function and expect that `libc` will be
linked in.
Use this to replace the emission of the amdgpu-unsafe-fp-atomics
attribute in favor of per-instruction metadata. In the future
new fine grained controls should be introduced that also cover
the integer cases.
Add a wrapper around CreateAtomicRMW that appends the metadata,
and update a few use contexts to use it.
This implements the __builtin_cpu_init and __builtin_cpu_supports
builtin routines based on the compiler runtime changes in
https://github.com/llvm/llvm-project/pull/85790.
This is inspired by https://github.com/llvm/llvm-project/pull/85786.
Major changes are a) a restriction in scope to only the builtins (which
have a much narrower user interface), and the avoidance of false
generality. This change deliberately only handles group 0 extensions
(which happen to be all defined ones today), and avoids the tblgen
changes from that review.
I don't have an environment in which I can actually test this, but @BeMg
has been kind enough to report that this appears to work as expected.
Before this can make it into a release, we need a change such as
https://github.com/llvm/llvm-project/pull/99958. The gcc docs claim that
cpu_support can be called by "normal" code without calling the cpu_init
routine because the init routine will have been called by a high
priority constructor. Our current compiler-rt mechanism does not do
this.
Introduces type based signing of member function pointers. To support
this discrimination schema we no longer emit member function pointer to
virtual methods and indices into a vtable but migrate to using thunks.
This does mean member function pointers are no longer necessarily
directly comparable, however as such comparisons are UB this is
acceptable.
We derive the discriminator from the C++ mangling of the type of the
pointer being authenticated.
Co-Authored-By: Akira Hatanaka ahatanaka@apple.com
Co-Authored-By: John McCall rjmccall@apple.com
Co-authored-by: Ahmed Bougacha <ahmed@bougacha.org>
Re-signing occurs when function type discrimination is enabled and a
function pointer is converted to another function pointer type that
requires signing using a different discriminator. A function pointer is
re-signed using discriminator zero when it's converted to a pointer to a
non-function type such as `void*`.
---------
Co-authored-by: Ahmed Bougacha <ahmed@bougacha.org>
Co-authored-by: John McCall <rjmccall@apple.com>
Add the reverse directive which will be introduced in the upcoming
OpenMP 6.0 specification. A preview has been published in [Technical
Report 12](https://www.openmp.org/wp-content/uploads/openmp-TR12.pdf).
---------
Co-authored-by: Alexey Bataev <a.bataev@outlook.com>
Updates codegen for global destructors and raising exceptions to ensure
that the function pointers being passed are signed using the correct
schema.
Notably this requires that CodeGenFunction::createAtExitStub to return
an opaque Constant* rather than a Function* as the value being emitted
is no longer necessarily a raw function pointer depending on the
configured ABI.
Co-Authored-By: Akira Hatanaka <ahatanaka@apple.com>
Co-Authored-By: John McCall <rjmccall@apple.com>
There are two problems with _BitInt prior to this patch:
1. For at least some values of N, we cannot use LLVM's iN for the type
of struct elements, array elements, allocas, global variables, and so
on, because the LLVM layout for that type does not match the high-level
layout of _BitInt(N).
Example: Currently for i128:128 targets correct implementation is
possible either for __int128 or for _BitInt(129+) with lowering to iN,
but not both, since we have now correct implementation of __int128 in
place after a21abc7.
When this happens, opaque [M x i8] types used, where M =
sizeof(_BitInt(N)).
2. LLVM doesn't guarantee any particular extension behavior for integer
types that aren't a multiple of 8. For this reason, all _BitInt types
are now have in-memory representation that is a whole number of bytes.
I.e. for example _BitInt(17) now will have memory layout type i32.
This patch also introduces concept of load/store type and adds an API to
CodeGenTypes that returns the IR type that should be used for load and
store operations. This is particularly useful for the case when a
_BitInt ends up having array of bytes as memory layout type. For
_BitInt(N), let M = sizeof(_BitInt(N)), and let BITS = M * 8. Loads and
stores of iM would both (1) produce far better code from the backends
and (2) be far more optimizable by IR passes than loads and stores of [M
x i8].
Fixes https://github.com/llvm/llvm-project/issues/85139
Fixes https://github.com/llvm/llvm-project/issues/83419
---------
Co-authored-by: John McCall <rjmccall@gmail.com>
Virtual function pointer entries in v-tables are signed with address
discrimination in addition to declaration-based discrimination, where an
integer discriminator the string hash (see
`ptrauth_string_discriminator`) of the mangled name of the overridden
method. This notably provides diversity based on the full signature of
the overridden method, including the method name and parameter types.
This patch introduces ItaniumVTableContext logic to find the original
declaration of the overridden method.
On AArch64, these pointers are signed using the `IA` key (the
process-independent code key.)
V-table pointers can be signed with either no discrimination, or a
similar scheme using address and decl-based discrimination. In this
case, the integer discriminator is the string hash of the mangled
v-table identifier of the class that originally introduced the vtable
pointer.
On AArch64, these pointers are signed using the `DA` key (the
process-independent data key.)
Not using discrimination allows attackers to simply copy valid v-table
pointers from one object to another. However, using a uniform
discriminator of 0 does have positive performance and code-size
implications on AArch64, and diversity for the most important v-table
access pattern (virtual dispatch) is already better assured by the
signing schemas used on the virtual functions. It is also known that
some code in practice copies objects containing v-tables with `memcpy`,
and while this is not permitted formally, it is something that may be
invasive to eliminate.
This is controlled by:
```
-fptrauth-vtable-pointer-type-discrimination
-fptrauth-vtable-pointer-address-discrimination
```
In addition, this provides fine-grained controls in the
ptrauth_vtable_pointer attribute, which allows overriding the default
ptrauth schema for vtable pointers on a given class hierarchy, e.g.:
```
[[clang::ptrauth_vtable_pointer(no_authentication, no_address_discrimination,
no_extra_discrimination)]]
[[clang::ptrauth_vtable_pointer(default_key, default_address_discrimination,
custom_discrimination, 0xf00d)]]
```
The override is then mangled as a parametrized vendor extension:
```
"__vtptrauth" I
<key>
<addressDiscriminated>
<extraDiscriminator>
E
```
To support this attribute, this patch adds a small extension to the
attribute-emitter tablegen backend.
Note that there are known areas where signing is either missing
altogether or can be strengthened. Some will be addressed in later
changes (e.g., member function pointers, some RTTI).
`dynamic_cast` in particular is handled by emitting an artificial
v-table pointer load (in a way that always authenticates it) before the
runtime call itself, as the runtime doesn't have enough information
today to properly authenticate it. Instead, the runtime is currently
expected to strip the v-table pointer.
---------
Co-authored-by: John McCall <rjmccall@apple.com>
Co-authored-by: Ahmed Bougacha <ahmed@bougacha.org>
This patch simplifies instruction creation by replacing all overloads of
instruction constructors/Create methods that are identical other than
the Instruction *InsertBefore/BasicBlock *InsertAtEnd/BasicBlock::iterator
InsertBefore argument with a single version that takes an InsertPosition
argument. The InsertPosition class can be implicitly constructed from
any of the above, internally converting them to the appropriate
BasicBlock::iterator value which can then be used to insert the
instruction (or to not insert it if an invalid iterator is passed).
The upshot of this is that code will be deduplicated, and all callsites
will switch to calling the new unified version without any changes
needed to make the compiler happy. There is at least one exception to
this; the construction of InsertPosition is a user-defined conversion,
so any caller that was already relying on a different user-defined
conversion won't work. In all of LLVM and Clang this happens exactly
once: at clang/lib/CodeGen/CGExpr.cpp:123 we try to construct an alloca
with an AssertingVH<Instruction> argument, which must now be cast to an
Instruction* by using `&*`. If this is more common elsewhere, it could
be fixed by adding an appropriate constructor to InsertPosition.
Prior to 84780af4b02cb3b86e4cb724f996bf8e02f2f2e7, the class didn't have
any information about whether the saved value was volatile.
This is NFC as far as I can tell.
This patch implements the 'loop' construct AST, as well as the basic
appertainment rule. Additionally, it sets up the 'parent' compute
construct, which is necessary for codegen/other diagnostics.
A 'loop' can apply to a for or range-for loop, otherwise it has no other
restrictions (though some of its clauses do).
Using MMRAs, allow `builtin_amdgcn_fence` to emit fences that only
target one or more address spaces, instead of fencing all address spaces
at once.
This is done through a `amdgpu-as` MMRA. Currently focused on OpenCL
fences, but can very easily support more AS names and codegen on more
than just fences.
Due to how `CodeGenFunction::EmitTrapCheck` is implemented
`SanitizerHandler` with numeric value 0x19 needs to be reserved because
`-fbounds-safety` generates trap instructions with that value embedded
in the trap instructions for x86_64 and arm64 just like for UBSan traps.
** x86_64 **
```
ud1l 0x19(%eax), %eax
```
** arm64 **
```
brk #0x5519
```
To avoid upstream Clang and AppleClang diverging their ABIs for
`-fbounds-safety` the slot is being reserved in this patch.
`SanitizerHandler::BoundsSafety` currently has no uses in the code but
uses will be introduced when the CodeGen side of `-fbounds-safety`'s
implementation is upstreamed.
rdar://126884014
Co-authored-by: Dan Liew <dan@su-root.co.uk>
PR #80680 added bits in the codegen to lazily add convergence intrinsics
when required. This logic relied on the LoopStack. The issue is when
parsing the condition, the loopstack doesn't yet reflect the correct
values, as expected since we are not yet in the loop.
However, convergence tokens should sometimes already be available. The
solution which seemed the simplest is to greedily generate the tokens
when we generate SPIR-V.
Fixes#88144
---------
Signed-off-by: Nathan Gauër <brioche@google.com>
https://wg21.link/P2809R3
This is applied as a DR to C++11 (C++98 did not guarantee forward
progress and is left untouched)
As an extension (and to preserve existing behavior in C), we consider
all controlling expression that can be constant folded
in the front end, not just standard constant expressions.
Prior to this change the debug-location for the
`llvm.instrprof.increment` intrinsic was set to whatever the current
DIBuilder's current debug location was set to. This meant that for
switch-statements, a counter's location was set to the previous case's
debug-location, causing confusing stepping behaviour in debuggers. This
patch makes sure we attach a dummy debug-location for the increment
instructions.
rdar://123050737
Latest diff:
f1ab4c2677..adf9bc902b
We address two additional bugs here:
### Problem 1: Deactivated normal cleanup still runs, leading to
double-free
Consider the following:
```cpp
struct A { };
struct B { B(const A&); };
struct S {
A a;
B b;
};
int AcceptS(S s);
void Accept2(int x, int y);
void Test() {
Accept2(AcceptS({.a = A{}, .b = A{}}), ({ return; 0; }));
}
```
We add cleanups as follows:
1. push dtor for field `S::a`
2. push dtor for temp `A{}` (used by ` B(const A&)` in `.b = A{}`)
3. push dtor for field `S::b`
4. Deactivate 3 `S::b`-> This pops the cleanup.
5. Deactivate 1 `S::a` -> Does not pop the cleanup as *2* is top. Should
create _active flag_!!
6. push dtor for `~S()`.
7. ...
It is important to deactivate **5** using active flags. Without the
active flags, the `return` will fallthrough it and would run both `~S()`
and dtor `S::a` leading to **double free** of `~A()`.
In this patch, we unconditionally emit active flags while deactivating
normal cleanups. These flags are deleted later by the `AllocaTracker` if
the cleanup is not emitted.
### Problem 2: Missing cleanup for conditional lifetime extension
We push 2 cleanups for lifetime-extended cleanup. The first cleanup is
useful if we exit from the middle of the expression (stmt-expr/coro
suspensions). This is deactivated after full-expr, and a new cleanup is
pushed, extending the lifetime of the temporaries (to the scope of the
reference being initialized).
If this lifetime extension happens to be conditional, then we use active
flags to remember whether the branch was taken and if the object was
initialized.
Previously, we used a **single** active flag, which was used by both
cleanups. This is wrong because the first cleanup will be forced to
deactivate after the full-expr and therefore this **active** flag will
always be **inactive**. The dtor for the lifetime extended entity would
not run as it always sees an **inactive** flag.
In this patch, we solve this using two separate active flags for both
cleanups. Both of them are activated if the conditional branch is taken,
but only one of them is deactivated after the full-expr.
---
Fixes https://github.com/llvm/llvm-project/issues/63818
Fixes https://github.com/llvm/llvm-project/issues/88478
---
Previous PR logs:
1. https://github.com/llvm/llvm-project/pull/85398
2. https://github.com/llvm/llvm-project/pull/88670
3. https://github.com/llvm/llvm-project/pull/88751
4. https://github.com/llvm/llvm-project/pull/88884
OpenACC is going to need an array sections implementation that is a
simpler version/more restrictive version of the OpenMP version.
This patch moves `OMPArraySectionExpr` to `Expr.h` and renames it `ArraySectionExpr`,
then adds an enum to choose between the two.
This also fixes a couple of 'drive-by' issues that I discovered on the way,
but leaves the OpenACC Sema parts reasonably unimplemented (no semantic
analysis implementation), as that will be a followup patch.
The original change caused widespread breakages in msan/ubsan tests and
causes `use-after-free`. Most likely we are adding more cleanups than
necessary.
Fixes https://github.com/llvm/llvm-project/issues/88478
Promoting the `EHCleanup` to `NormalAndEHCleanup` in `EmitCallArgs`
surfaced another bug with deactivation of normal cleanups. Here we
missed emitting CPP scope ends for deactivated normal cleanups. This
patch also fixes that bug.
We missed emitting CPP scope ends because we remove the `fallthrough`
(clears the insertion point) before deactivating normal cleanups. This
is to make the emitted "normal" cleanup code unreachable. But we still
need to emit CPP scope ends in the original basic block even for a
deactivated normal cleanup.
(This worked correctly before we did not remove `fallthrough` for
`EHCleanup`s).
Fixes https://github.com/llvm/llvm-project/issues/63818 for control flow
out of an expressions.
#### Background
A control flow could happen in the middle of an expression due to
stmt-expr and coroutine suspensions.
Due to branch-in-expr, we missed running cleanups for the temporaries
constructed in the expression before the branch.
Previously, these cleanups were only added as `EHCleanup` during the
expression and as normal expression after the full expression.
Examples of such deferred cleanups include:
`ParenList/InitList`: Cleanups for fields are performed by the
destructor of the object being constructed.
`Array init`: Cleanup for elements of an array is included in the array
cleanup.
`Lifetime-extended temporaries`: reference-binding temporaries in
braced-init are lifetime extended to the parent scope.
`Lambda capture init`: init in the lambda capture list is destroyed by
the lambda object.
---
#### In this PR
In this PR, we change some of the `EHCleanups` cleanups to
`NormalAndEHCleanups` to make sure these are emitted when we see a
branch inside an expression (through statement expressions or coroutine
suspensions).
These are supposed to be deactivated after full expression and destroyed
later as part of the destructor of the aggregate or array being
constructed. To simplify deactivating cleanups, we add two utilities as
well:
* `DeferredDeactivationCleanupStack`: A stack to remember cleanups with
deferred deactivation.
* `CleanupDeactivationScope`: RAII for deactivating cleanups added to
the above stack.
---
#### Deactivating normal cleanups
These were previously `EHCleanups` and not `Normal` and **deactivation**
of **required** `Normal` cleanups had some bugs. These specifically
include deactivating `Normal` cleanups which are not the top of
`EHStack`
[source1](92b56011e6/clang/lib/CodeGen/CGCleanup.cpp (L1319)),
[2](92b56011e6/clang/lib/CodeGen/CGCleanup.cpp (L722-L746)).
This has not been part of our test suite (maybe it was never required
before statement expressions). In this PR, we also fix the emission of
required-deactivated-normal cleanups.
Fix since #75481 got reverted.
- Explicitly set BitfieldBits to 0 to avoid uninitialized field member
for the integer checks:
```diff
- llvm::ConstantInt::get(Builder.getInt8Ty(), Check.first)};
+ llvm::ConstantInt::get(Builder.getInt8Ty(), Check.first),
+ llvm::ConstantInt::get(Builder.getInt32Ty(), 0)};
```
- `Value **Previous` was erroneously `Value *Previous` in
`CodeGenFunction::EmitWithOriginalRHSBitfieldAssignment`, fixed now.
- Update following:
```diff
- if (Kind == CK_IntegralCast) {
+ if (Kind == CK_IntegralCast || Kind == CK_LValueToRValue) {
```
CK_LValueToRValue when going from, e.g., char to char, and
CK_IntegralCast otherwise.
- Make sure that `Value *Previous = nullptr;` is initialized (see
1189e87951)
- Add another extensive testcase
`ubsan/TestCases/ImplicitConversion/bitfield-conversion.c`
---------
Co-authored-by: Vitaly Buka <vitalybuka@gmail.com>
This patch implements the implicit truncation and implicit sign change
checks for bitfields using UBSan. E.g.,
`-fsanitize=implicit-bitfield-truncation` and
`-fsanitize=implicit-bitfield-sign-change`.
HLSL has wave operations and other kind of function which required the
control flow to either be converged, or respect certain constraints as
where and how to re-converge.
At the HLSL level, the convergence are mostly obvious: the control flow
is expected to re-converge at the end of a scope.
Once translated to IR, HLSL scopes disapear. This means we need a way to
communicate convergence restrictions down to the backend.
For this, the SPIR-V backend uses convergence intrinsics. So this commit
adds some code to generate convergence intrinsics when required.
---------
Signed-off-by: Nathan Gauër <brioche@google.com>