Clang currently passes and returns `__float128` in vector registers on
MinGW targets, which is LLVM's default ABI for `fp128`. However, the
Windows x86-64 calling convention [1] states the following:
__m128 types, arrays, and strings are never passed by immediate
value. Instead, a pointer is passed to memory allocated by the
caller. Structs and unions of size 8, 16, 32, or 64 bits, and __m64
types, are passed as if they were integers of the same size. Structs
or unions of other sizes are passed as a pointer to memory allocated
by the caller. For these aggregate types passed as a pointer,
including __m128, the caller-allocated temporary memory must be
16-byte aligned.
Based on the above it sounds like `__float128` should be passed
indirectly. Thus, change `f128` passing to use the stack and make the
return in xmm0 explicit. This is the identical to `i128`, and passing is
the same as GCC.
Regarding return values, the documentation states:
A scalar return value that can fit into 64 bits, including the __m64
type, is returned through RAX. Non-scalar types including floats,
doubles, and vector types such as __m128, __m128i, __m128d are
returned in XMM0.
This makes it sound like it should be acceptable to return `__float128`
in xmm0; however, GCC returns `__float128` on the stack. That above ABI
statement as well as consistency with `i128` (which is returned in xmm0)
mean that it would likely be better for GCC to change its return ABI to
match Clang rather than the other way around, so that portion is left
as-is.
Clang's MSVC targets do not support `__float128` or `_Float128`, but
these changes would also apply there if it is eventually enabled.
With [2] which should land around the same time, LLVM will also
implement this ABI so it is not technically necessary for Clang to make
a change here as well. This is sill done in order to be consistent with
other types, and to allow calling convention-aware optimizations at all
available optimization layers (@rnk mentioned possible reuse of stack
arguments). An added benefit is readibility of the LLVM IR since it more
accurately reflects what the lowered assembly does.
[1]:
https://learn.microsoft.com/en-us/cpp/build/x64-calling-convention?view=msvc-170
[2]: https://github.com/llvm/llvm-project/pull/128848
`sret` arguments are always going to reside in the stack/`alloca`
address space, which makes the current formulation where their AS is
derived from the pointee somewhat quaint. This patch ensures that `sret`
ends up pointing to the `alloca` AS in IR function signatures, and also
guards agains trying to pass a casted `alloca`d pointer to a `sret` arg,
which can happen for most languages, when compiled for targets that have
a non-zero `alloca` AS (e.g. AMDGCN) / map `LangAS::default` to a
non-zero value (SPIR-V). A target could still choose to do something
different here, by e.g. overriding `classifyReturnType` behaviour.
In a broader sense, this patch extends non-aliased indirect args to also
carry an AS, which leads to changing the `getIndirect()` interface. At
the moment we're only using this for (indirect) returns, but it allows
for future handling of indirect args themselves. We default to using the
AllocaAS as that matches what Clang is currently doing, however if, in
the future, a target would opt for e.g. placing indirect returns in some
other storage, with another AS, this will require revisiting.
---------
Co-authored-by: Matt Arsenault <arsenm2@gmail.com>
Co-authored-by: Matt Arsenault <Matthew.Arsenault@amd.com>
This reverts commit 81fc3add1e627c23b7270fe2739cdacc09063e54.
This breaks some LLDB tests, e.g.
SymbolFile/DWARF/x86/no_unique_address-with-bitfields.cpp:
lldb: ../llvm-project/clang/lib/AST/Decl.cpp:4604: unsigned int clang::FieldDecl::getBitWidthValue() const: Assertion `isa<ConstantExpr>(getBitWidth())' failed.
Save the bitwidth value as a `ConstantExpr` with the value set. Remove
the `ASTContext` parameter from `getBitWidthValue()`, so the latter
simply returns the value from the `ConstantExpr` instead of
constant-evaluating the bitwidth expression every time it is called.
Update codegen for func param with transparent_union attr to be that of
the first union member.
This is a followup to #101738 to fix non-ppc codegen and closes#76773.
Such struct types:
```
struct {
struct{} a;
long long b;
};
stuct {
struct{} a;
double b;
};
```
For such structures, Lo is NoClass and Hi is Integer/SSE. And when this
structure argument is passed, the high part is passed at offset 8 in
memory. So we should do special handling for these types in
EmitVAArg.Fix https://github.com/llvm/llvm-project/issues/79790 and fix
https://github.com/llvm/llvm-project/issues/86371.
struct SuperEmpty { struct{ int a[0];} b;};
Such 0 sized structs in c++ mode can not be ignored in i386 for that c++
fields are never empty.But when EmitVAArg, its size is 0, so that
va_list not increase.Maybe we can just Ignore this kind of arguments,
like X86_64 did. Fix#86385.
This is in effect a revert of f139ae3d93797, as we have since gained a
more sophisticated way of doing extra IRGen with the addition of
RawAddress in #86923.
When using a hard-float ABI for a target without FP registers, it's not
possible to correctly generate code for functions with arguments which
must be passed in floating-point registers. This is diagnosed in CodeGen
instead of Sema, to more closely match GCC's behaviour around inline
functions, which is relied on by the Linux kernel.
Previously, this only checked function signatures as they were
code-generated, but this missed some cases:
* Calls to functions not defined in this translation unit.
* Calls through function pointers.
* Calls to variadic functions, where the variadic arguments have a
floating-point type.
This adds checks to function calls, as well as definitions, so that
these cases are correctly diagnosed.
```
typedef long long t67 __attribute__((aligned (4)));
struct s67 {
int a;
t67 b;
};
void f67(struct s67 x) {
}
```
When classify:
a: Lo = Integer, Hi = NoClass
b: Lo = Integer, Hi = NoClass
struct S: Lo = Integer, Hi = NoClass
```
define dso_local void @f67(i64 %x.coerce) {
```
In this case, only one i64 register is used when the structure parameter
is transferred, which is obviously incorrect.So we need to treat the
split case specially. fix
https://github.com/llvm/llvm-project/issues/85387.
Empty structs are ignored for parameter passing purposes, but va_arg was
incrementing the pointer anyway for that the size of empty struct in c++
is 1 byte, which could lead to va_list getting out of sync. Fix#86057.
In PR #79382, I need to add a new type that derives from
ConstantArrayType. This means that ConstantArrayType can no longer use
`llvm::TrailingObjects` to store the trailing optional Expr*.
This change refactors ConstantArrayType to store a 60-bit integer and
4-bits for the integer size in bytes. This replaces the APInt field
previously in the type but preserves enough information to recreate it
where needed.
To reduce the number of places where the APInt is re-constructed I've
also added some helper methods to the ConstantArrayType to allow some
common use cases that operate on either the stored small integer or the
APInt as appropriate.
Resolves#85124.
SizeInBytes of empty structure is 0 in C, while 1 in C++. And empty
structure argument of the function is ignored in X86_64 backend.As a
result, the value of variable arguments in C++ is incorrect. fix#77036
Co-authored-by: Longsheng Mou <moulongsheng@huawei.com>
- Update CodeGenTypeCache to use a single union for all pointers in
address space zero.
- Introduce a UnqualPtrTy in CodeGenTypeCache, and use that (for
example instead of llvm::PointerType::getUnqual) in some places.
- Drop some redundant bit/pointers casts from ptr to ptr.
MSVC allows users to pass structures with required alignments greater
than 4 to variadic functions. It does not pass them indirectly to
correctly align them. Instead, it passes them directly with the usual 4
byte stack alignment.
This change implements the same logic in clang on the passing side. The
receiving side (va_arg) never implemented any of this indirect logic, so
it doesn't need to be updated.
This issue pre-existed, but @aaron.ballman noticed it when we started
passing structs containing aligned fields indirectly in D152752.
This is an alternative of D157485 and a pre-feature to support AVX10.
AVX10 Architecture Specification: https://cdrdv2.intel.com/v1/dl/getContent/784267
AVX10 Technical Paper: https://cdrdv2.intel.com/v1/dl/getContent/784343
RFC: https://discourse.llvm.org/t/rfc-design-for-avx10-feature-support/72661
Based on the feedbacks from LLVM and GCC community, we have agreed to
start from supporting `-m[no-]evex512` on existing AVX512 features.
The option `-mno-evex512` can be used with `-mavx512xxx` to build
binaries that can run on both legacy AVX512 targets and AVX10-256.
There're still arguments about what's the expected behavior when this
option as well as `-mavx512xxx` used together with `-mavx10.1-256`. We
decided to defer the support of `-mavx10.1` after we made consensus.
Or furthermore, we start from supporting AVX10.2 and not providing any
AVX10.1 options.
Reviewed By: RKSimon, skan
Differential Revision: https://reviews.llvm.org/D159250
This is an alternative of D157485 and a pre-feature to support AVX10.
AVX10 Architecture Specification: https://cdrdv2.intel.com/v1/dl/getContent/784267
AVX10 Technical Paper: https://cdrdv2.intel.com/v1/dl/getContent/784343
RFC: https://discourse.llvm.org/t/rfc-design-for-avx10-feature-support/72661
Based on the feedbacks from LLVM and GCC community, we have agreed to
start from supporting `-m[no-]evex512` on existing AVX512 features.
The option `-mno-evex512` can be used with `-mavx512xxx` to build
binaries that can run on both legacy AVX512 targets and AVX10-256.
There're still arguments about what's the expected behavior when this
option as well as `-mavx512xxx` used together with `-mavx10.1-256`. We
decided to defer the support of `-mavx10.1` after we made consensus.
Or furthermore, we start from supporting AVX10.2 and not providing any
AVX10.1 options.
Reviewed By: RKSimon, skan
Differential Revision: https://reviews.llvm.org/D159250
Partial progress towards replacing `CreateElementBitCast`, as it no
longer does what its name suggests. Either replace its uses with
`Address::withElementType()`, or remove them if no longer needed.
Reviewed By: barannikov88, nikic
Differential Revision: https://reviews.llvm.org/D153314
This commit breaks up CodeGen/TargetInfo.cpp into a set of *.cpp files,
one file per target. There are no functional changes, mostly just code moving.
Non-code-moving changes are:
* A virtual destructor has been added to DefaultABIInfo to pin the vtable to a cpp file.
* A few methods of ABIInfo and DefaultABIInfo were split into declaration + definition
in order to reduce the number of transitive includes.
* Several functions that used to be static have been placed in clang::CodeGen
namespace so that they can be accessed from other cpp files.
RFC: https://discourse.llvm.org/t/rfc-splitting-clangs-targetinfo-cpp/69883
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D148094