Adds target codegen info class for DirectX. For now it always translates
`__hlsl_resource_t` handle to `target("dx.TypedBuffer", i32, 1, 0, 1)`
(`RWBuffer<int>`). More work is needed to determine the actual target
exp type and parameters based on the resource handle attributes.
Part 1/2 of #95952
Update codegen for func param with transparent_union attr to be that of
the first union member.
This is a followup to #101738 to fix non-ppc codegen and closes#76773.
The AMDGPU kernel ABI is not directly representable in SPIR-V, since it
relies on passing aggregates `byref`, and SPIR-V only encodes `byval`
(which the AMDGPU BE disallows for kernel arguments). As a temporary
solution to this mismatch, we add special handling for AMDGCN flavoured
SPIR-V, whereby aggregates are passed as direct, both to kernels and to
normal functions. This is not ideal (there are pathological cases where
performance is heavily impacted), but empirically robust and guaranteed
to work as the AMDGPU BE retains handling of `direct` passing for legacy
reasons.
We will revisit this in the future, but as it stands it is enough to
pass a wide array of integration tests and generates correct SPIR-V and
correct reverse translation into LLVM IR. The
amdgpu-kernel-arg-pointer-type test is updated via the automated script,
and thus becomes quite noisy.
Such struct types:
```
struct {
struct{} a;
long long b;
};
stuct {
struct{} a;
double b;
};
```
For such structures, Lo is NoClass and Hi is Integer/SSE. And when this
structure argument is passed, the high part is passed at offset 8 in
memory. So we should do special handling for these types in
EmitVAArg.Fix https://github.com/llvm/llvm-project/issues/79790 and fix
https://github.com/llvm/llvm-project/issues/86371.
struct SuperEmpty { struct{ int a[0];} b;};
Such 0 sized structs in c++ mode can not be ignored in i386 for that c++
fields are never empty.But when EmitVAArg, its size is 0, so that
va_list not increase.Maybe we can just Ignore this kind of arguments,
like X86_64 did. Fix#86385.
Test `aarch64-sme-inline-streaming-attrs.c` caused some buildbot
failures, because the test was missing a `REQUIRES: aarch64-registered
target`. This was because we've demoted the error to a warning, which
then resulted in a different error message, because Clang can't actually
CodeGen the IR.
PR #77936 introduced a diagnostic to avoid calls being inlined into
functions with a different streaming mode, because inlining those
functions may result in different runtime behaviour. This was necessary
because LLVM doesn't check whether inlining is possible and thus blindly
inlines the function without checking feasibility.
In practice however, this introduces an artificial restriction that the
user may not be able to work around. Calling an `always_inline` function
from some header file that is out of the control of the user would
result in an error that the user cannot remedy.
Therefore, this patch demotes the error into a warning (for calls from
streaming[-compatible] -> non-streaming), but the proper fix would be to
fix the AlwaysInliner in LLVM to avoid inlining when it has analyzed the
callee and has determined that inlining is not possible.
Calling an always_inline function for calls from non-streaming ->
streaming will remain an error, because there is little pre-existing
code for SME, so it is expected that the header file can be modified by
the user (e.g. by using `__arm_streaming_compatible` if the code is
claimed to be compatible).
Use this to replace the emission of the amdgpu-unsafe-fp-atomics
attribute in favor of per-instruction metadata. In the future
new fine grained controls should be introduced that also cover
the integer cases.
Add a wrapper around CreateAtomicRMW that appends the metadata,
and update a few use contexts to use it.
The SystemZ ABI code was missing code to handle the transparent_union
extension. Arguments of such types are specified to be passed like
the first member of the union, instead of according to the usual
ABI calling convention for aggregates.
This did not make much difference in practice as the SystemZ ABI
already specifies that 1-, 2-, 4- or 8-byte aggregates are passed
in registers. However, there *is* a difference if the first member
of the transparent union is a scalar integer type smaller than word
size - if passed as a scalar, it needs to be zero- or sign-extended
to word size, while if passed as aggregate, it is not.
Fixed by adding code to handle transparent_union similar to what
is done on other targets.
Summary:
This patch implements support for variadic functions for NVPTX targets.
The implementation here mainly follows what was done to implement it for
AMDGPU in https://github.com/llvm/llvm-project/pull/93362.
We change the NVPTX codegen to lower all variadic arguments to functions
by-value. This creates a flattened set of arguments that the IR lowering
pass converts into a struct with the proper alignment.
The behavior of this function was determined by iteratively checking
what the NVCC copmiler generates for its output. See examples like
https://godbolt.org/z/KavfTGY93. I have noted the main methods that
NVIDIA uses to lower variadic functions.
1. All arguments are passed in a pointer to aggregate.
2. The minimum alignment for a plain argument is 4 bytes.
3. Alignment is dictated by the underlying type
4. Structs are flattened and do not have their alignment changed.
5. NVPTX never passes any arguments indirectly, even very large ones.
This patch passes the tests in the `libc` project currently, including
support for `sprintf`.
So far branch protection, sign return address, guarded control stack
attributes are
only emitted as module flags to indicate the functions need to be
generated with
those features.
The problem is in case of an LTO build the module flags are merged with
the `min`
rule which means if one of the module is not build with sign return
address then the features
will be turned off for all functions. Due to the functions take the
branch-protection and
sign-return-address features from the module flags. The
sign-return-address is
function level option therefore it is expected functions from files that
is
compiled with -mbranch-protection=pac-ret to be protected.
The inliner might inline functions with different set of flags as it
doesn't consider
the module flags.
This patch adds the attributes to all functions and drops the checking
of the module flags
for the code generation.
Module flag is still used for generating the ELF markers.
Also drops the "true"/"false" values from the
branch-protection-enforcement,
branch-protection-pauth-lr, guarded-control-stack attributes as presence
of the
attribute means it is on absence means off and no other option.
Releand with test fixes.
So far branch protection, sign return address, guarded control stack
attributes are
only emitted as module flags to indicate the functions need to be
generated with
those features.
The problem is in case of an LTO build the module flags are merged with
the `min`
rule which means if one of the module is not build with sign return
address then the features
will be turned off for all functions. Due to the functions take the
branch-protection and
sign-return-address features from the module flags. The
sign-return-address is
function level option therefore it is expected functions from files that
is
compiled with -mbranch-protection=pac-ret to be protected.
The inliner might inline functions with different set of flags as it
doesn't consider
the module flags.
This patch adds the attributes to all functions and drops the checking
of the module flags
for the code generation.
Module flag is still used for generating the ELF markers.
Also drops the "true"/"false" values from the
branch-protection-enforcement,
branch-protection-pauth-lr, guarded-control-stack attributes as presence
of the
attribute means it is on absence means off and no other option.
According to RISC-V integer calling convention empty structs or union
arguments or return values are ignored by C compilers which support them
as a non-standard extension. This is not the case for C++, which
requires them to be sized types.
Fixes#97285
This is a mostly-target-independent variadic function optimisation and
lowering pass. It is only enabled for AMDGPU in this initial commit.
The purpose is to make C style variadic functions a zero cost
abstraction. They are lowered to equivalent IR which is then amenable to
other optimisations. This is inherently slightly target specific but
much less so than one might expect - the C varargs interface heavily
constrains the ABI design divergence.
The pass is primarily tested from webassembly. This is because wasm has
a straightforward variadic lowering strategy which coincides exactly
with what this pass transforms code into and a struct passing convention
with few cases to check. Adding further targets conventions is
straightforward and elided from this patch primarily to simplify the
review. Implemented in other branches are Linux X86, AMD64, AArch64 and
NVPTX.
Testing for targets that have existing lowering for va_arg from clang is
most efficiently done by checking that clang | opt completely elides the
variadic syntax from test cases. The lowering produces a struct for each
call site which can be inspected to check the various alignment and
indirections are correct.
AMDGPU presently has no variadic support other than some ad hoc printf
handling. Combined with the pass being inactive on all other targets
landing this represents strict increase in capability with zero risk.
Testing and refining will continue post commit.
In addition to the compiler tests included here, a self contained x64
clang/musl toolchain was constructed using the "lowering" instead of the
systemv ABI and used to build various C programs like lua and libxml2.
Pass variadic arguments without changing their type, unlike the fixed
ones.
Fixed arguments are modified to better fit into registers. This patch
leaves those unchanged.
Splitting struct types into individual fields and packing small structs
into integers works well for passing via registers. Variadic arguments
are currently unimplemented in the backend. They're likely to be
implemented as a pointer to stack memory in which case register-themed
optimisations are inapplicable.
Splitting the struct into fields makes it difficult to implement va_arg
robustly. The rules around padding and alignment to inverse the struct
splitting could be constructed, but at high complexity and no particular
advantage.
Passing types as-is means there is a 1:1 correspondence with the type
information va_arg has to work with and the parameter type at the call
site.
This is an ABI change, but as the only functions affected are variadic
ones which are presently a compilation error, not a functional break.
Factored out of the larger #93362 and can land independently.
This is in effect a revert of f139ae3d93797, as we have since gained a
more sophisticated way of doing extra IRGen with the addition of
RawAddress in #86923.
Make sure that empty structs are treated as if it has a size of one bit
in function parameters and return types so that it occupies a full
argument and/or return register slot.
This fixes crashes and miscompilations when passing and/or returning
empty structs.
Reviewed by: @s-barannikov
When using a hard-float ABI for a target without FP registers, it's not
possible to correctly generate code for functions with arguments which
must be passed in floating-point registers. This is diagnosed in CodeGen
instead of Sema, to more closely match GCC's behaviour around inline
functions, which is relied on by the Linux kernel.
Previously, this only checked function signatures as they were
code-generated, but this missed some cases:
* Calls to functions not defined in this translation unit.
* Calls through function pointers.
* Calls to variadic functions, where the variadic arguments have a
floating-point type.
This adds checks to function calls, as well as definitions, so that
these cases are correctly diagnosed.
```
typedef long long t67 __attribute__((aligned (4)));
struct s67 {
int a;
t67 b;
};
void f67(struct s67 x) {
}
```
When classify:
a: Lo = Integer, Hi = NoClass
b: Lo = Integer, Hi = NoClass
struct S: Lo = Integer, Hi = NoClass
```
define dso_local void @f67(i64 %x.coerce) {
```
In this case, only one i64 register is used when the structure parameter
is transferred, which is obviously incorrect.So we need to treat the
split case specially. fix
https://github.com/llvm/llvm-project/issues/85387.
Empty structs are ignored for parameter passing purposes, but va_arg was
incrementing the pointer anyway for that the size of empty struct in c++
is 1 byte, which could lead to va_list getting out of sync. Fix#86057.
To authenticate pointers, CodeGen needs access to the key and
discriminators that were used to sign the pointer. That information is
sometimes known from the context, but not always, which is why `Address`
needs to hold that information.
This patch adds methods and data members to `Address`, which will be
needed in subsequent patches to authenticate signed pointers, and uses
the newly added methods throughout CodeGen. Although this patch isn't
strictly NFC as it causes CodeGen to use different code paths in some
cases (e.g., `mergeAddressesInConditionalExpr`), it doesn't cause any
changes in functionality as it doesn't add any information needed for
authentication.
In addition to the changes mentioned above, this patch introduces class
`RawAddress`, which contains a pointer that we know is unsigned, and
adds several new functions for creating `Address` and `LValue` objects.
This reapplies d9a685a9dd589486e882b722e513ee7b8c84870c, which was
reverted because it broke ubsan bots. There seems to be a bug in
coroutine code-gen, which is causing EmitTypeCheck to use the wrong
alignment. For now, pass alignment zero to EmitTypeCheck so that it can
compute the correct alignment based on the passed type (see function
EmitCXXMemberOrOperatorMemberCallExpr).
This patch replaces dyn_cast<> with cast<> to resolve potential static
analyzer bugs for
1. Dereferencing a pointer issue with nullptr GVar when calling
addAttribute() in AIXTargetCodeGenInfo::setTargetAttributes(clang::Decl
const *, llvm::GlobalValue *, clang::CodeGen::CodeGenModule &).
2. Dereferencing a pointer issue with nullptr GG when calling
getCorrespondingConstructor() in
DeclareImplicitDeductionGuidesForTypeAlias(clang::Sema &,
clang::TypeAliasTemplateDecl *, clang::SourceLocation).
3. Dereferencing a pointer issue with nullptr CurrentBT when calling
getKind() in
ComplexExprEmitter::GetHigherPrecisionFPType(clang::QualType).
To authenticate pointers, CodeGen needs access to the key and
discriminators that were used to sign the pointer. That information is
sometimes known from the context, but not always, which is why `Address`
needs to hold that information.
This patch adds methods and data members to `Address`, which will be
needed in subsequent patches to authenticate signed pointers, and uses
the newly added methods throughout CodeGen. Although this patch isn't
strictly NFC as it causes CodeGen to use different code paths in some
cases (e.g., `mergeAddressesInConditionalExpr`), it doesn't cause any
changes in functionality as it doesn't add any information needed for
authentication.
In addition to the changes mentioned above, this patch introduces class
`RawAddress`, which contains a pointer that we know is unsigned, and
adds several new functions for creating `Address` and `LValue` objects.
This reapplies 8bd1f9116aab879183f34707e6d21c7051d083b6. The commit
broke msan bots because LValue::IsKnownNonNull was uninitialized.
In PR #79382, I need to add a new type that derives from
ConstantArrayType. This means that ConstantArrayType can no longer use
`llvm::TrailingObjects` to store the trailing optional Expr*.
This change refactors ConstantArrayType to store a 60-bit integer and
4-bits for the integer size in bytes. This replaces the APInt field
previously in the type but preserves enough information to recreate it
where needed.
To reduce the number of places where the APInt is re-constructed I've
also added some helper methods to the ConstantArrayType to allow some
common use cases that operate on either the stored small integer or the
APInt as appropriate.
Resolves#85124.
To authenticate pointers, CodeGen needs access to the key and
discriminators that were used to sign the pointer. That information is
sometimes known from the context, but not always, which is why `Address`
needs to hold that information.
This patch adds methods and data members to `Address`, which will be
needed in subsequent patches to authenticate signed pointers, and uses
the newly added methods throughout CodeGen. Although this patch isn't
strictly NFC as it causes CodeGen to use different code paths in some
cases (e.g., `mergeAddressesInConditionalExpr`), it doesn't cause any
changes in functionality as it doesn't add any information needed for
authentication.
In addition to the changes mentioned above, this patch introduces class
`RawAddress`, which contains a pointer that we know is unsigned, and
adds several new functions for creating `Address` and `LValue` objects.
SizeInBytes of empty structure is 0 in C, while 1 in C++. And empty
structure argument of the function is ignored in X86_64 backend.As a
result, the value of variable arguments in C++ is incorrect. fix#77036
Co-authored-by: Longsheng Mou <moulongsheng@huawei.com>
This is re-working of #74460, which adds a soft-float ABI for AArch64.
That was reverted because it causes errors when building the linux and
fuchsia kernels.
The problem is that GCC's implementation of the ABI compatibility checks
when using the hard-float ABI on a target without FP registers does it's
checks after optimisation. The previous version of this patch reported
errors for all uses of floating-point types, which is stricter than what
GCC does in practice.
This changes two things compared to the first version:
* Only check the types of function arguments and returns, not the types
of other values. This is more relaxed than GCC, while still guaranteeing
ABI compatibility.
* Move the check from Sema to CodeGen, so that inline functions are only
checked if they are actually used. There are some cases in the linux
kernel which depend on this behaviour of GCC.
For Embedded Swift, let's unblock building for RISC-V boards (e.g.
ESP32-C6). This isn't trying to add full RISC-V support to Swift /
Embedded Swift, it's just fixing the immediate blocker (not having
SwiftInfo defined blocks all compilations).