As part of the LLVM effort to eliminate debug-info intrinsics, we're
moving to a world where only iterators should be used to insert
instructions. This isn't a problem in clang when instructions get
generated before any debug-info is inserted, however we're planning on
deprecating and removing the instruction-pointer insertion routines.
Scatter some calls to getIterator in a few places, remove a
deref-then-addrof on another iterator, and add an overload for the
createLoadInstBefore utility. Some callers pass a null insertion
point, which we need to handle explicitly now.
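For illustration, a minimal sketch of the iterator-based style, assuming
current LLVM constructor overloads (the helper name here is hypothetical):
```cpp
#include "llvm/IR/Instructions.h"

// Insert via a BasicBlock::iterator rather than a raw Instruction* so
// that debug-info-aware insertion positions are preserved.
void insertLoadBefore(llvm::Type *Ty, llvm::Value *Ptr,
                      llvm::Instruction *InsertPt) {
  // Before: new llvm::LoadInst(Ty, Ptr, "ld", /*isVolatile=*/false, InsertPt);
  new llvm::LoadInst(Ty, Ptr, "ld", /*isVolatile=*/false,
                     InsertPt->getIterator());
}
```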
Fix codegen of consteval functions returning an empty class, and related
issues
If a class is empty, don't store it to memory: the store might overwrite
useful data. Similarly, if a class has tail padding that might overlap
other fields, don't store the tail padding to memory.
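For illustration, a hypothetical case of the hazard (the
[[no_unique_address]] layout is an assumption chosen to make the overlap
concrete):
```cpp
struct E {};
struct S {
  [[no_unique_address]] E e; // may share its address with x
  int x;
};
consteval S makeS() { return {{}, 42}; }
// Emitting a store for e's byte would clobber the low byte of x == 42.
int get() { return makeS().x; }
```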
The problem here turned out to be more general than I initially thought:
basically all uses of EmitAggregateStore were broken. Call lowering had
a method that did mostly the right thing, though: CreateCoercedStore.
Adapt CreateCoercedStore so it always does the conservatively right
thing, and use it for both calls and ConstantExpr.
Also, along the way, fix the "overlap" bit in AggValueSlot: the bit was
set incorrectly for empty classes in some cases.
Fixes #93040.
- For languages following SPMD/SIMT programming model, functions and
call sites are marked 'convergent' by default. 'noconvergent' is added
in this patch to allow developers to remove that 'convergent'
attribute when it's safe.
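A hedged sketch of intended usage in a SPMD language such as HIP
(attribute spelling as added by this patch):
```cpp
// Performs no cross-lane communication, so it is safe to drop the
// implicit 'convergent' attribute that SPMD languages apply by default.
__attribute__((noconvergent)) float scale(float x) {
  return x * 2.0f;
}
```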
Reviewers:
nhaehnle, Sirraide, yxsamliu, Artem-B, ilovepi, jayfoad, ssahasra, arsenm
Reviewed By: arsenm
Pull Request: https://github.com/llvm/llvm-project/pull/100637
In the PowerPC ABI, a few initial arguments are passed through registers,
but their places in the parameter save area are reserved; arguments
passed in memory go after the reserved locations.
For debugging purposes, we may want to save copies of the pass-by-register
arguments into their reserved places on the stack. The new option achieves
this by adding a new function-level attribute and making the
argument-lowering code aware of it.
Introduces type-based signing of member function pointers. To support
this discrimination schema, we no longer emit member function pointers
to virtual methods as indices into a vtable, but migrate to using
thunks. This does mean member function pointers are no longer
necessarily directly comparable; however, as such comparisons are UB,
this is acceptable.
We derive the discriminator from the C++ mangling of the type of the
pointer being authenticated.
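For illustration, a sketch of the comparability caveat (not from the
patch itself):
```cpp
struct S { virtual void f(); };
using MFP = void (S::*)();
// With thunk-based emission, two pointers to the same virtual method may
// no longer compare equal; such comparisons were never reliable anyway.
bool same(MFP a, MFP b) { return a == b; }
```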
Co-authored-by: Akira Hatanaka <ahatanaka@apple.com>
Co-authored-by: John McCall <rjmccall@apple.com>
Co-authored-by: Ahmed Bougacha <ahmed@bougacha.org>
There are two problems with _BitInt prior to this patch:
1. For at least some values of N, we cannot use LLVM's iN for the type
of struct elements, array elements, allocas, global variables, and so
on, because the LLVM layout for that type does not match the high-level
layout of _BitInt(N).
Example: currently, for i128:128 targets, a correct implementation is
possible either for __int128 or for _BitInt(129+) with lowering to iN,
but not both, since we now have a correct implementation of __int128 in
place after a21abc7.
When this happens, opaque [M x i8] types are used, where M =
sizeof(_BitInt(N)).
2. LLVM doesn't guarantee any particular extension behavior for integer
types that aren't a multiple of 8 bits. For this reason, all _BitInt
types now have an in-memory representation that is a whole number of
bytes. For example, _BitInt(17) now has memory layout type i32.
This patch also introduces concept of load/store type and adds an API to
CodeGenTypes that returns the IR type that should be used for load and
store operations. This is particularly useful for the case when a
_BitInt ends up having an array of bytes as its memory layout type. For
_BitInt(N), let M = sizeof(_BitInt(N)), and let BITS = M * 8. Loads and
stores of iBITS would both (1) produce far better code from the backends
and (2) be far more optimizable by IR passes than loads and stores of [M
x i8].
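A brief illustration of the rules above (Clang accepts _BitInt in C++ as
an extension; exact widths are target-dependent, following the examples
in this message):
```cpp
_BitInt(17) a;  // in-memory representation is whole bytes: layout type i32
_BitInt(129) b; // where LLVM's i129 layout can't match the C layout, an
                // opaque [M x i8] is used, M = sizeof(_BitInt(129))
```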
Fixes https://github.com/llvm/llvm-project/issues/85139
Fixes https://github.com/llvm/llvm-project/issues/83419
---------
Co-authored-by: John McCall <rjmccall@gmail.com>
musttail often cannot be generated on PPC targets: when calling a
function defined in another module, PPC needs to restore the TOC
pointer. To restore the TOC pointer, the compiler needs to emit a nop
after the call so that the linker can generate code to restore it. A
tail call cannot produce the expected call sequence in this case.
To avoid the crash inside the compiler backend, a diagnostic is added in
the frontend.
Fixes #63214
The size of the inline ASM `srcloc` cookie was changed from 32 bits to
64 bits in [D105491](https://reviews.llvm.org/D105491). However, that
commit only updated the size of the cookie in `DiagnosticInfoInlineAsm`,
meaning that inline ASM diagnostics that are instead represented with a
`DiagnosticInfoSrcMgr` have their cookies truncated to 32 bits. This PR
replaces the remaining uses of `unsigned` to represent the cookie with
`uint64_t`, allowing the cookie to make it all the way to the diagnostic
handler without being truncated.
When using the -mframe-chain=aapcs or -mframe-chain=aapcs-leaf options,
we cannot use r11 as an allocatable register, even if
-fomit-frame-pointer is also used. This is so that r11 will always point
to a valid frame record, even if we don't create one in every function.
This is in effect a revert of f139ae3d93797, as we have since gained a
more sophisticated way of doing extra IRGen with the addition of
RawAddress in #86923.
PR #80680 added bits in the codegen to lazily add convergence intrinsics
when required. This logic relied on the LoopStack. The issue is that
when visiting the condition, the LoopStack doesn't yet reflect the
correct values; this is expected, since we are not yet in the loop.
However, convergence tokens should sometimes already be available. The
solution that seemed simplest is to greedily generate the tokens when we
generate SPIR-V.
Fixes #88144
---------
Signed-off-by: Nathan Gauër <brioche@google.com>
When using a hard-float ABI for a target without FP registers, it's not
possible to correctly generate code for functions with arguments which
must be passed in floating-point registers. This is diagnosed in CodeGen
instead of Sema, to more closely match GCC's behaviour around inline
functions, which is relied on by the Linux kernel.
Previously, this only checked function signatures as they were
code-generated, but this missed some cases:
* Calls to functions not defined in this translation unit.
* Calls through function pointers.
* Calls to variadic functions, where the variadic arguments have a
floating-point type.
This adds checks to function calls, as well as definitions, so that
these cases are correctly diagnosed.
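For illustration, a hedged sketch of one newly diagnosed case, assuming
a hard-float ABI on a target without FP registers (names are
hypothetical):
```cpp
extern double hypot2(double a, double b); // defined in another TU

double f(void) {
  // The callee is never code-generated in this TU, but its double
  // arguments would need FP registers, so the call itself is diagnosed.
  return hypot2(3.0, 4.0);
}
```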
Latest diff:
f1ab4c2677..adf9bc902b
We address two additional bugs here:
### Problem 1: Deactivated normal cleanup still runs, leading to
double-free
Consider the following:
```cpp
struct A { };
struct B { B(const A&); };
struct S {
  A a;
  B b;
};
int AcceptS(S s);
void Accept2(int x, int y);
void Test() {
  Accept2(AcceptS({.a = A{}, .b = A{}}), ({ return; 0; }));
}
```
We add cleanups as follows:
1. push dtor for field `S::a`
2. push dtor for temp `A{}` (used by `B(const A&)` in `.b = A{}`)
3. push dtor for field `S::b`
4. Deactivate 3 (`S::b`) -> this pops the cleanup.
5. Deactivate 1 (`S::a`) -> does not pop the cleanup, as *2* is on top.
Should create an _active flag_!
6. push dtor for `~S()`.
7. ...
It is important that the deactivation in step **5** use an active flag.
Without it, the `return` would fall through and run both `~S()` and the
dtor for `S::a`, leading to a **double free** (`~A()` runs twice).
In this patch, we unconditionally emit active flags while deactivating
normal cleanups. These flags are deleted later by the `AllocaTracker` if
the cleanup is not emitted.
### Problem 2: Missing cleanup for conditional lifetime extension
We push two cleanups for a lifetime-extended temporary. The first
cleanup is useful if we exit from the middle of the expression
(stmt-expr/coro suspensions). It is deactivated after the full-expr, and
a new cleanup is pushed, extending the lifetime of the temporaries to
the scope of the reference being initialized.
If this lifetime extension happens to be conditional, then we use active
flags to remember whether the branch was taken and if the object was
initialized.
Previously, we used a **single** active flag, which was used by both
cleanups. This is wrong because the first cleanup will be forced to
deactivate after the full-expr and therefore this **active** flag will
always be **inactive**. The dtor for the lifetime-extended entity would
never run, as it always sees an **inactive** flag.
In this patch, we solve this using two separate active flags for both
cleanups. Both of them are activated if the conditional branch is taken,
but only one of them is deactivated after the full-expr.
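For illustration, a hypothetical case of conditional lifetime extension:
whichever arm runs constructs the temporary that is extended to `r`'s
scope, so the scope-end cleanup needs its own active flag rather than
sharing one with the intra-expression cleanup.
```cpp
struct A {
  A(int) {}
  ~A() {}
};

void f(bool cond) {
  // Only the selected arm materializes the temporary. With a single
  // shared flag, the post-full-expr deactivation left the flag
  // inactive, so ~A() at end of scope was skipped.
  const A& r = cond ? A(1) : A(2);
}  // ~A() for the extended temporary must run here
```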
---
Fixes https://github.com/llvm/llvm-project/issues/63818
Fixes https://github.com/llvm/llvm-project/issues/88478
---
Previous PR logs:
1. https://github.com/llvm/llvm-project/pull/85398
2. https://github.com/llvm/llvm-project/pull/88670
3. https://github.com/llvm/llvm-project/pull/88751
4. https://github.com/llvm/llvm-project/pull/88884
Linked to https://github.com/gnustep/libobjc2/pull/289.
More information can be found in issue: #88273.
My solution involves creating a new message-send function for this
calling convention when targeting MSVC. Additional information is
available in the libobjc2 pull request.
I am unsure whether we should check for a runtime version where
objc_msgSend_stret2_np is guaranteed to be present, or leave it as is,
considering it remains a critical bug. What are your thoughts on this,
@davidchisnall?
Prior to #85863, the required parameters of llvm::isKnownNonZero were
Value and DataLayout. After, they are Value, Depth, and SimplifyQuery,
where SimplifyQuery is implicitly constructible from DataLayout. The
change to move Depth before SimplifyQuery needed callers to be updated
unnecessarily, and as commented in #85863, we actually want Depth to be
after SimplifyQuery anyway so that it can be defaulted and the caller
does not need to specify it.
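A sketch of the resulting signature (simplified; see ValueTracking.h for
the authoritative declaration):
```cpp
namespace llvm {
class Value;
class SimplifyQuery;

// Depth now follows SimplifyQuery and is defaulted, so most callers can
// simply write isKnownNonZero(V, Q).
bool isKnownNonZero(const Value *V, const SimplifyQuery &Q,
                    unsigned Depth = 0);
} // namespace llvm
```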
The original change caused widespread breakages in msan/ubsan tests and
caused `use-after-free`s. Most likely we were adding more cleanups than
necessary.
Fixes https://github.com/llvm/llvm-project/issues/88478
Promoting the `EHCleanup` to `NormalAndEHCleanup` in `EmitCallArgs`
surfaced another bug with deactivation of normal cleanups. Here we
missed emitting CPP scope ends for deactivated normal cleanups. This
patch also fixes that bug.
We missed emitting CPP scope ends because we removed the `fallthrough`
(which clears the insertion point) before deactivating normal cleanups.
That is done to make the emitted "normal" cleanup code unreachable. But
we still need to emit CPP scope ends in the original basic block, even
for a deactivated normal cleanup.
(This worked correctly before because we did not remove `fallthrough`
for `EHCleanup`s.)
MSVC doesn't support generating __vectorcall calls in Arm64EC mode, but
it does treat it as a distinct type. The Microsoft STL depends on this
functionality. (Not sure if this is intentional.) Add support for
parsing the same way as MSVC, and add some checks to ensure we don't try
to actually generate code.
The error handling in CodeGen is ugly, but I can't think of a better way
to do it.
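A small sketch of the 'distinct type' behavior being matched (assumes an
Arm64EC target with MSVC-compatible parsing):
```cpp
#include <type_traits>

void f();
void __vectorcall g(); // parsed, and kept as part of the function type

// The calling convention participates in the type system even though
// actually emitting a __vectorcall call is rejected in CodeGen.
static_assert(!std::is_same_v<decltype(&f), decltype(&g)>);
```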
HLSL constant-sized array function parameters do not decay to pointers.
Instead, constant-sized array types are preserved as unique types for
overload resolution, template instantiation, and name mangling.
This implements the change by adding a new `ArrayParameterType` which
represents a non-decaying `ConstantArrayType`. The new type behaves the
same as `ConstantArrayType` except that it does not decay to a pointer.
Values of `ConstantArrayType` in HLSL decay during overload resolution
via a new `HLSLArrayRValue` cast to `ArrayParameterType`.
`ArrayParameterType` values are passed indirectly by-value to functions
in IR generation, resulting in callee-generated memcpy instructions.
The behavior of HLSL function calls is documented in the [draft language
specification](https://microsoft.github.io/hlsl-specs/specs/hlsl.pdf)
under the Expr.Post.Call heading.
Additionally the design of this implementation approach is documented in
[Clang's
documentation](https://clang.llvm.org/docs/HLSL/FunctionCalls.html)
Resolves #70123
This reverts commit ca4c4a6758d184f209cb5d88ef42ecc011b11642.
This was intended not to introduce new consistency diagnostics for
smart pointer types, but failed to ignore sugar around types when
detecting this.
Fixed and test added.
HLSL has wave operations and other kinds of functions which require the
control flow either to be converged, or to respect certain constraints
as to where and how to re-converge.
At the HLSL level, the convergence requirements are mostly obvious: the
control flow is expected to re-converge at the end of a scope.
Once translated to IR, HLSL scopes disappear. This means we need a way
to communicate convergence restrictions down to the backend.
For this, the SPIR-V backend uses convergence intrinsics. So this commit
adds some code to generate convergence intrinsics when required.
---------
Signed-off-by: Nathan Gauër <brioche@google.com>
To authenticate pointers, CodeGen needs access to the key and
discriminators that were used to sign the pointer. That information is
sometimes known from the context, but not always, which is why `Address`
needs to hold that information.
This patch adds methods and data members to `Address`, which will be
needed in subsequent patches to authenticate signed pointers, and uses
the newly added methods throughout CodeGen. Although this patch isn't
strictly NFC as it causes CodeGen to use different code paths in some
cases (e.g., `mergeAddressesInConditionalExpr`), it doesn't cause any
changes in functionality as it doesn't add any information needed for
authentication.
In addition to the changes mentioned above, this patch introduces class
`RawAddress`, which contains a pointer that we know is unsigned, and
adds several new functions for creating `Address` and `LValue` objects.
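A minimal self-contained sketch of the shape of the change (member names
assumed, not the actual clang/lib/CodeGen layout):
```cpp
struct PointerAuthInfo {
  unsigned Key = 0;           // signing key
  unsigned Discriminator = 0; // derived from context, e.g. type mangling
  bool IsSigned = false;
};

// A pointer known to carry no signature.
struct RawAddress {
  void *Pointer;
  unsigned Alignment;
};

// A possibly-signed pointer plus everything needed to authenticate it.
struct Address {
  void *Pointer;
  unsigned Alignment;
  PointerAuthInfo AuthInfo;
};
```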
This reapplies d9a685a9dd589486e882b722e513ee7b8c84870c, which was
reverted because it broke ubsan bots. There seems to be a bug in
coroutine code-gen, which is causing EmitTypeCheck to use the wrong
alignment. For now, pass alignment zero to EmitTypeCheck so that it can
compute the correct alignment based on the passed type (see function
EmitCXXMemberOrOperatorMemberCallExpr).
To authenticate pointers, CodeGen needs access to the key and
discriminators that were used to sign the pointer. That information is
sometimes known from the context, but not always, which is why `Address`
needs to hold that information.
This patch adds methods and data members to `Address`, which will be
needed in subsequent patches to authenticate signed pointers, and uses
the newly added methods throughout CodeGen. Although this patch isn't
strictly NFC as it causes CodeGen to use different code paths in some
cases (e.g., `mergeAddressesInConditionalExpr`), it doesn't cause any
changes in functionality as it doesn't add any information needed for
authentication.
In addition to the changes mentioned above, this patch introduces class
`RawAddress`, which contains a pointer that we know is unsigned, and
adds several new functions for creating `Address` and `LValue` objects.
This reapplies 8bd1f9116aab879183f34707e6d21c7051d083b6. The commit
broke msan bots because LValue::IsKnownNonNull was uninitialized.
[RISCV] RISCV vector calling convention (1/2)
This is the vector calling convention based on
https://github.com/riscv-non-isa/riscv-elf-psabi-doc,
the idea is to split between "scalar" callee-saved registers
and "vector" callee-saved registers. "scalar" ones remain the
original strategy, however, "vector" ones are handled together
with RVV objects.
The stack layout would be:
|--------------------------| <-- FP
| callee-allocated save    |
| area for register varargs|
|--------------------------|
| callee-saved registers   | <-- scalar callee-saved
| (scalar)                 |
|--------------------------|
| RVV alignment padding    |
|--------------------------|
| callee-saved registers   | <-- vector callee-saved
| (vector)                 |
|--------------------------|
| RVV objects              |
|--------------------------|
| padding before RVV       |
|--------------------------|
| scalar local variables   |
|--------------------------| <-- BP
| variable size objects    |
|--------------------------| <-- SP
Note: This patch doesn't cover "tuple" types, e.g. vint32m1x2. They
will be handled in https://github.com/riscv-non-isa/riscv-elf-psabi-doc (2/2).
Differential Revision: https://reviews.llvm.org/D154576
In PR #79382, I need to add a new type that derives from
ConstantArrayType. This means that ConstantArrayType can no longer use
`llvm::TrailingObjects` to store the trailing optional Expr*.
This change refactors ConstantArrayType to store a 60-bit integer and
4 bits for the integer size in bytes. This replaces the APInt field
previously stored in the type but preserves enough information to
recreate it where needed.
To reduce the number of places where the APInt is re-constructed, I've
also added some helper methods to ConstantArrayType to support the
common use cases that operate on either the stored small integer or the
APInt, as appropriate.
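A hedged sketch of the packing (field names assumed, not the actual AST
node layout):
```cpp
#include <cstdint>
#include "llvm/ADT/APInt.h"

struct PackedArraySize {
  uint64_t Value : 60;    // the element count itself
  uint64_t ByteWidth : 4; // size of the original integer type, in bytes
};

// The APInt is recreated only where a caller actually needs one.
llvm::APInt getSizeAsAPInt(PackedArraySize S) {
  return llvm::APInt(S.ByteWidth * 8, S.Value);
}
```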
Resolves #85124.
To authenticate pointers, CodeGen needs access to the key and
discriminators that were used to sign the pointer. That information is
sometimes known from the context, but not always, which is why `Address`
needs to hold that information.
This patch adds methods and data members to `Address`, which will be
needed in subsequent patches to authenticate signed pointers, and uses
the newly added methods throughout CodeGen. Although this patch isn't
strictly NFC as it causes CodeGen to use different code paths in some
cases (e.g., `mergeAddressesInConditionalExpr`), it doesn't cause any
changes in functionality as it doesn't add any information needed for
authentication.
In addition to the changes mentioned above, this patch introduces class
`RawAddress`, which contains a pointer that we know is unsigned, and
adds several new functions for creating `Address` and `LValue` objects.
This reverts commit 92a09c0165b87032e1bd05020a78ed845cf35661.
This is triggering a bunch of new -Wnullability-completeness warnings
in code with existing raw pointer nullability annotations.
The intent was the new nullability locations wouldn't affect those
warnings, so this is a bug at least for now.
This enables clang and external nullability checkers to make use of
these annotations on nullable C++ class types like unique_ptr.
These types are recognized by the presence of the _Nullable attribute.
Nullable standard library types implicitly receive this attribute.
Existing static warnings for raw pointers are extended to smart
pointers:
- nullptr used as return value or argument for non-null functions
(`-Wnonnull`)
- assigning or initializing nonnull variables with nullable values
(`-Wnullable-to-nonnull-conversion`)
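For illustration, a hypothetical sketch of the extended warnings:
```cpp
#include <memory>
#include <utility>

int deref(std::unique_ptr<int> _Nonnull p) { return *p; }

void f() {
  std::unique_ptr<int> _Nullable q = nullptr;
  deref(std::move(q)); // nullable value passed where nonnull is expected
}
```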
It doesn't implicitly add these attributes based on the assume_nonnull
pragma, nor warn on missing attributes where the pragma would apply
them.
I'm not confident that the pragma's current behavior will work well for
C++ (where type-based metaprogramming is much more common than C/ObjC).
We'd like to revisit this once we have more implementation experience.
Support can be detected as `__has_feature(nullability_on_classes)`.
This is needed for backward compatibility, as previously clang would
issue a hard error when _Nullable appears on a smart pointer.
UBSan's `-fsanitize=nullability` will not check smart-pointer types.
It can be made to do so by synthesizing calls to `operator bool`, but
that's left for future work.
Discussion:
https://discourse.llvm.org/t/rfc-allowing-nonnull-etc-on-smart-pointers/77201/26
This implements the C++23 `[[assume]]` attribute.
Assumption information is lowered to a call to `@llvm.assume`, unless the expression has side-effects, in which case it is discarded and a warning is issued to tell the user that the assumption doesn’t do anything. A failed assumption at compile time is an error (unless we are in `MSVCCompat` mode, in which case we don’t check assumptions at compile time).
Due to performance regressions in LLVM, assumptions can be disabled with the `-fno-assumptions` flag. With it, assumptions will still be parsed and checked, but no calls to `@llvm.assume` will be emitted and assumptions will not be checked at compile time.
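A brief example of the attribute and its lowering as described above:
```cpp
int divide_by(int x) {
  [[assume(x != 0)]]; // lowered to a call to @llvm.assume
  return 1000 / x;    // the optimizer may rely on x != 0 here
}

int discarded(int x) {
  [[assume(++x > 0)]]; // has side effects: discarded, with a warning
  return x;            // x is NOT incremented
}
```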
Instead of only handling vscale x 16 x i1 predicate vectors, handle any
scalable i1 vector where the known minimum is divisible by 8.
This is used on RISC-V where we have multiple sizes of predicate
types.
The new experimental calling convention preserve_none is the opposite
of the existing preserve_all: it tries to preserve as few general
registers as possible, so all general registers are caller-saved. It can
also use more general registers to pass arguments. This attribute
doesn't impact floating-point registers; they still follow the C calling
convention.
Currently preserve_none is supported on X86-64 only. It changes the C
calling convention in the following ways:
* RSP and RBP are the only preserved general registers, all other
general registers are caller saved registers.
* We can use [RDI, RSI, RDX, RCX, R8, R9, R11, R12, R13, R14, R15, RAX]
to pass arguments.
It can improve the performance of hot tail-call chains, because many
callee-saved registers' save/restore instructions can be removed if the
tail functions use preserve_none. In my experiments with protocol
buffers, the parsing functions improved by 3% to 10%.
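A hedged sketch of the tail-call-chain use case (Clang attribute
spelling assumed to be `__attribute__((preserve_none))`):
```cpp
__attribute__((preserve_none)) void next_state(void *state);

__attribute__((preserve_none)) void step(void *state) {
  // ... do some work ...
  // No callee-saved GPR save/restore is needed around this chain.
  [[clang::musttail]] return next_state(state);
}
```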
Since https://github.com/ARM-software/acle/pull/276 the ACLE
defines attributes to better describe the use of a given SME state.
Previously the attributes merely described the possibility of it being
'shared' or 'preserved', whereas the new attributes have more semantics
and also describe how the data flows through the program.
For ZT0 we already had to add new LLVM IR attributes:
* aarch64_new_zt0
* aarch64_in_zt0
* aarch64_out_zt0
* aarch64_inout_zt0
* aarch64_preserves_zt0
We have now done the same for ZA, such that we add:
* aarch64_new_za (previously `aarch64_pstate_za_new`)
* aarch64_in_za (more specific variation of `aarch64_pstate_za_shared`)
* aarch64_out_za (more specific variation of `aarch64_pstate_za_shared`)
* aarch64_inout_za (more specific variation of
`aarch64_pstate_za_shared`)
* aarch64_preserves_za (previously `aarch64_pstate_za_shared,
aarch64_pstate_za_preserved`)
This explicitly removes 'pstate' from the name because, with SME2 and
the new ACLE attributes, there is a difference between "sharing ZA"
(sharing the ZA matrix register with the caller) and "sharing PSTATE.ZA"
(sharing either the ZA or ZT0 register, both part of PSTATE.ZA, with the
caller).
This patch builds on top of #76971 and implements support for:
* __arm_new("zt0")
* __arm_in("zt0")
* __arm_out("zt0")
* __arm_inout("zt0")
* __arm_preserves("zt0")
This patch replaces the `__arm_new_za`, `__arm_shared_za` and
`__arm_preserves_za` attributes in favour of:
* `__arm_new("za")`
* `__arm_in("za")`
* `__arm_out("za")`
* `__arm_inout("za")`
* `__arm_preserves("za")`
As described in https://github.com/ARM-software/acle/pull/276.
One change is that `__arm_in/out/inout/preserves(S)` are all mutually
exclusive, whereas previously it was fine to write `__arm_shared_za
__arm_preserves_za`. This case is now represented with `__arm_in("za")`.
The current implementation uses the same LLVM attributes under the hood,
since `__arm_in/out/inout` are all variations of "shared ZA", so can use
the existing `aarch64_pstate_za_shared` attribute in LLVM.
#77941 will add support for the new "zt0" state as introduced
with SME2.
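A hedged sketch of the new spellings (requires an SME-enabled target;
keyword attribute placement per the ACLE PR above):
```cpp
void update_tile(float v) __arm_inout("za"); // reads and modifies ZA
void read_tile(void) __arm_in("za");         // only reads ZA

// Creates a fresh private ZA state for the duration of the call.
void compute(void) __arm_new("za") {
  update_tile(1.0f);
  read_tile();
}
```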
Set the writable and dead_on_unwind attributes for sret arguments. These
indicate that the argument points to writable memory (and it's legal to
introduce spurious writes to it on entry to the function) and that the
argument memory will not be used if the call unwinds.
This enables additional MemCpyOpt/DSE/LICM optimizations.
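For illustration, a function whose result is returned indirectly on
typical ABIs (via an sret argument) and so benefits from the new
attributes:
```cpp
struct Big { long v[8]; };

// The sret pointer is now known to be writable and unused on unwind,
// so stores into it can be introduced or sunk more aggressively.
Big make() {
  Big b{};
  b.v[0] = 1;
  return b;
}
```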
This patch replaces uses of StringRef::{starts,ends}with with
StringRef::{starts,ends}_with for consistency with
std::{string,string_view}::{starts,ends}_with in C++20.
I'm planning to deprecate and eventually remove
StringRef::{starts,ends}with.
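The mechanical change, for illustration:
```cpp
#include "llvm/ADT/StringRef.h"

bool isFooNotBar(llvm::StringRef S) {
  // Before: S.startswith("foo") && !S.endswith("bar")
  return S.starts_with("foo") && !S.ends_with("bar");
}
```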
This reverts the previous implementation to avoid adding inline
functions in the OpenCL headers.
The previous approach was breaking the clspv flow (google/clspv#1231),
while https://reviews.llvm.org/D156743 mentioned that just decorating
the call node with `!fpmath` was enough. This PR implements that idea.
The test has been updated with this implementation.