137 Commits

Author SHA1 Message Date
Artem Belevich
7c3fdcc276
[CUDA] Add support for __grid_constant__ attribute (#114589)
LLVM support for the attribute has been implemented already, so it just
plumbs it through to the CUDA front-end.

One notable difference from NVCC is that the attribute can be used
regardless of the targeted GPU. On the older GPUs it will just be
ignored. The attribute is a performance hint, and does not warrant a
hard error if compiler can't benefit from it on a particular GPU
variant.
2024-11-05 10:48:54 -08:00
Steven Perron
d6344c1cd0
[HLSL][SPIRV] Add HLSL type translation for spirv. (#114273)
This commit partially implements SPIRTargetCodeGenInfo::getHLSLType. It
can now generate the spirv type for the following HLSL types:

1. RWBuffer
2. Buffer
3. Sampler

---------

Co-authored-by: Nathan Gauër <github@keenuts.net>
2024-11-04 12:32:23 -05:00
Jesse Huang
335e68d8bc
[Clang][RISCV] Support -fcf-protection=return for RISC-V (#112477)
Enables the support of `-fcf-protection=return` on RISC-V, which
requires Zicfiss. It also adds a string attribute "hw-shadow-stack"
to every function if the option is set on RISC-V
2024-10-29 15:47:49 +08:00
Aaron Ballman
af7c58b7ea
Remove support for RenderScript (#112916)
See
https://discourse.llvm.org/t/rfc-deprecate-and-eventually-remove-renderscript-support/81284
for the RFC
2024-10-28 12:48:42 -04:00
Momchil Velikov
53f7f8ecca
[Clang][AArch64] Fix Pure Scalables Types argument passing and return (#112747)
Pure Scalable Types are defined in AAPCS64 here:

https://github.com/ARM-software/abi-aa/blob/main/aapcs64/aapcs64.rst#pure-scalable-types-psts

And should be passed according to Rule C.7 here:

https://github.com/ARM-software/abi-aa/blob/main/aapcs64/aapcs64.rst#682parameter-passing-rules

This part of the ABI is completely unimplemented in Clang, instead it
treats PSTs sometimes as HFAs/HVAs, sometime as general composite types.

This patch implements the rules for passing PSTs by employing the
`CoerceAndExpand` method and extending it to:
* allow array types in the `coerceToType`; Now only `[N x i8]` are
considered padding.
* allow mismatch between the elements of the `coerceToType` and the
elements of the `unpaddedCoerceToType`; AArch64 uses this to map
fixed-length vector types to SVE vector types.

Corectly passing a PST argument needs a decision in Clang about whether
to pass it in memory or registers or, equivalently, whether to use the
`Indirect` or `Expand/CoerceAndExpand` method. It was considered
relatively harder (or not practically possible) to make that decision in
the AArch64 backend.
Hence this patch implements the register counting from AAPCS64 (cf.
`NSRN`, `NPRN`) to guide the Clang's decision.
2024-10-28 15:43:14 +00:00
Matt Arsenault
51b4ada458
clang/AMDGPU: Set noalias.addrspace metadata on atomicrmw (#102462) 2024-10-17 17:10:45 +04:00
Simon Pilgrim
cf5e295ec0 Fix MSVC "not all control paths return a value" warning. NFC. 2024-10-16 17:15:47 +01:00
Helena Kotas
3b4512074e
[HLSL] Make HLSLAttributedResourceType canonical and add code paths to convert HLSL types to DirectX target types (#110327)
Translates `RWBuffer` and `StructuredBuffer` resources buffer types to
DirectX target types `dx.TypedBuffer` and `dx.RawBuffer`.

Includes a change of `HLSLAttributesResourceType` from 'sugar' type to
full canonical type. This is required for codegen and other clang
infrastructure to work property on HLSL resource types.

Fixes #95952 (part 2/2)
2024-10-15 13:38:15 -07:00
Michał Górny
387b37af1a
[LLVM] [Clang] Support for Gentoo *t64 triples (64-bit time_t ABIs) (#111302)
Gentoo is planning to introduce a `*t64` suffix for triples that will be
used by 32-bit platforms that use 64-bit `time_t`. Add support for
parsing and accepting these triples, and while at it make clang
automatically enable the necessary glibc feature macros when this suffix
is used.

An open question is whether we can backport this to LLVM 19.x. After
all, adding new triplets to Triple sounds like an ABI change — though I
suppose we can minimize the risk of breaking something if we move new
enum values to the very end.
2024-10-14 11:18:04 +00:00
Rahul Joshi
fa789dffb1
[NFC] Rename Intrinsic::getDeclaration to getOrInsertDeclaration (#111752)
Rename the function to reflect its correct behavior and to be consistent
with `Module::getOrInsertFunction`. This is also in preparation of
adding a new `Intrinsic::getDeclaration` that will have behavior similar
to `Module::getFunction` (i.e, just lookup, no creation).
2024-10-11 05:26:03 -07:00
Matt Arsenault
d50302f31c
clang/AMDGPU: Stop emitting amdgpu-unsafe-fp-atomics attribute (#111579) 2024-10-09 08:52:32 +04:00
Alex Voicu
e13cbaca69
[clang][CodeGen][SPIR-V] Fix incorrect SYCL usage, implement missing interface (#109415)
This is primarily meant to address the issue identified in #109182,
around incorrect usage of `-fsycl-is-device`; we now have AMDGCN
flavoured SPIR-V which retains the desired behaviour around the default
AS and does not depend on the SYCL language being enabled to do so.
Overall, there are three changes:

1. We unconditionally use the `SPIRDefIsGen` AS map for AMDGCNSPIRV
target, as there is no case where the hack of setting default to private
would be desirable, and it can be used for languages other than OCL/HIP;
2. We implement `SPIRVTargetCodeGenInfo::getGlobalVarAddressSpace` for
SPIR-V in general, because otherwise using it from languages other than
HIP or OpenCL would yield 0, incorrectly;
3. We remove the incorrect usage of `-fsycl-is-device`.
2024-09-26 14:06:14 +01:00
Alex Voicu
3cfd0c0d36
[SPIRV][RFC] Rework / extend support for memory scopes (#106429)
This change adds support for correctly lowering the `__scoped` Clang
builtins, and corresponding scoped LLVM instructions. These were
previously unconditionally lowered to Device scope, which is possibly incorrect. 
Furthermore, the default / implicit scope is changed from Device (an 
OpenCL assumption) to AllSvmDevices (aka System), since the SPIR-V BE is not 
OpenCL specific / can ingest IR coming from other language front-ends. OpenCL 
defaulting to Device scope is now reflected in the front-end handling of atomic 
ops, which seems preferable.
2024-09-25 00:44:57 +01:00
Jonas Paulsson
14120227a3
Target ABI: improve call parameters extensions handling (#100757)
For the purpose of verifying proper arguments extensions per the target's ABI,
introduce the NoExt attribute that may be used by a target when neither sign-
or zeroextension is required (e.g. with a struct in register). The purpose of
doing so is to be able to verify that there is always one of these attributes
present and by this detecting cases where sign/zero extension is actually
missing.

As a first step, this patch has the verification step done for the SystemZ
backend only, but left off by default until all known issues have been
addressed.

Other targets/front-ends can now also add NoExt attribute where needed and do
this check in the backend.
2024-09-19 16:59:31 +02:00
JOE1994
1b913cde2a [clang][CodeGen] Strip unneeded calls to raw_string_ostream::str() (NFC)
Try to avoid excess layer of indirection when possible.

p.s. Remove a call to raw_string_ostream::flush() which is a no-op.
2024-09-13 20:33:58 -04:00
Piyou Chen
9cd9377409
[RISCV][FMV] Support target_clones (#85786)
This patch enable the function multiversion(FMV) and `target_clones`
attribute for RISC-V target.

The proposal of `target_clones` syntax can be found at the
https://github.com/riscv-non-isa/riscv-c-api-doc/pull/48 (which has
landed), as modified by the proposed
https://github.com/riscv-non-isa/riscv-c-api-doc/pull/85 (which adds the
priority syntax).

It supports the `target_clones` function attribute and function
multiversioning feature for RISC-V target. It will generate the ifunc
resolver function for the function that declared with target_clones
attribute.

The resolver function will check the version support by runtime object
`__riscv_feature_bits`.

For example:

```
__attribute__((target_clones("default", "arch=+ver1", "arch=+ver2"))) int bar() {
    return 1;
}
```

the corresponding resolver will be like:

```
bar.resolver() {
    __init_riscv_feature_bits();
    // Check arch=+ver1
    if ((__riscv_feature_bits.features[0] & BITMASK_OF_VERSION1) == BITMASK_OF_VERSION1) {
        return bar.arch=+ver1;
    } else {
        // Check arch=+ver2
        if ((__riscv_feature_bits.features[0] & BITMASK_OF_VERSION2) == BITMASK_OF_VERSION2) {
            return bar.arch=+ver2;
        } else {
            // Default
            return bar.default;
        }
    }
}
```
2024-09-13 18:04:53 +08:00
Jon Roelofs
b3f3c0c633
[clang][AArch64] Put soft-float ABI checks under isSoftFloat(). NFC 2024-09-11 13:37:45 -07:00
Helena Kotas
becb03f3c6
[DirectX] Add DirectXTargetCodeGenInfo (#104856)
Adds target codegen info class for DirectX. For now it always translates
`__hlsl_resource_t` handle to `target("dx.TypedBuffer", i32, 1, 0, 1)`
(`RWBuffer<int>`). More work is needed to determine the actual target
exp type and parameters based on the resource handle attributes.

Part 1/2 of #95952
2024-09-10 12:41:08 -07:00
Lei Huang
ea9204505c
Fix codegen for transparent_union function params (#104816)
Update codegen for func param with transparent_union attr to be that of
the first union member.

This is a followup to #101738 to fix non-ppc codegen and closes #76773.
2024-09-09 11:01:22 -04:00
Alex Voicu
ad435bcc14
[clang][CodeGen][SPIR-V][AMDGPU] Tweak AMDGCNSPIRV ABI to allow for the correct handling of aggregates passed to kernels / functions. (#102776)
The AMDGPU kernel ABI is not directly representable in SPIR-V, since it
relies on passing aggregates `byref`, and SPIR-V only encodes `byval`
(which the AMDGPU BE disallows for kernel arguments). As a temporary
solution to this mismatch, we add special handling for AMDGCN flavoured
SPIR-V, whereby aggregates are passed as direct, both to kernels and to
normal functions. This is not ideal (there are pathological cases where
performance is heavily impacted), but empirically robust and guaranteed
to work as the AMDGPU BE retains handling of `direct` passing for legacy
reasons.

We will revisit this in the future, but as it stands it is enough to
pass a wide array of integration tests and generates correct SPIR-V and
correct reverse translation into LLVM IR. The
amdgpu-kernel-arg-pointer-type test is updated via the automated script,
and thus becomes quite noisy.
2024-08-21 13:16:59 +01:00
Lei Huang
f95026dbf6
[PowerPC] Fix codegen for transparent_union function params (#101738)
Update codegen for func param with transparent_union attr to be that of
the first union member.

PPC fix for: https://github.com/llvm/llvm-project/issues/76773
2024-08-19 12:17:44 -04:00
Jon Roelofs
019ef52275
[clang][AArch64] Point the nofp ABI check diagnostics at the callee (#103392)
... whereever we have the Decl for it, and even when we don't keep the
SourceLocation of it aimed at the call site.

Fixes: #102983
2024-08-14 07:38:14 -07:00
Longsheng Mou
a27f40e5d9
[X86_64] Fix empty field error in vaarg of C++. (#101639)
Such struct types:
```
struct {
  struct{} a;
  long long b;
};

stuct {
  struct{} a;
  double b;
};
```
For such structures, Lo is NoClass and Hi is Integer/SSE. And when this
structure argument is passed, the high part is passed at offset 8 in
memory. So we should do special handling for these types in
EmitVAArg.Fix https://github.com/llvm/llvm-project/issues/79790 and fix
https://github.com/llvm/llvm-project/issues/86371.
2024-08-13 11:35:23 +08:00
Vladislav Belov
635d20e9e7
[RISCV] full support for riscv_rvv_vector_bits attribute (#100110)
Add support for using attribute((rvv_vector_bits(N))), when N < 8.
It allows using all fixed length vector mask types regardless VLEN
value.
2024-08-08 12:45:20 +03:00
Longsheng Mou
4461b69022
[X86_32][C++] fix 0 sized struct case in vaarg. (#86388)
struct SuperEmpty { struct{ int a[0];} b;};
Such 0 sized structs in c++ mode can not be ignored in i386 for that c++
fields are never empty.But when EmitVAArg, its size is 0, so that
va_list not increase.Maybe we can just Ignore this kind of arguments,
like X86_64 did. Fix #86385.
2024-08-02 09:20:49 +08:00
Sander de Smalen
389679d5f9
Reland: "[Clang] Demote always_inline error to warning for mismatching SME attrs" (#100991) (#100996)
Test `aarch64-sme-inline-streaming-attrs.c` caused some buildbot
failures, because the test was missing a `REQUIRES: aarch64-registered
target`. This was because we've demoted the error to a warning, which
then resulted in a different error message, because Clang can't actually
CodeGen the IR.
2024-07-29 11:23:25 +01:00
Sander de Smalen
e3a3397209
Revert "[Clang] Demote always_inline error to warning for mismatching SME attrs" (#100991)
Reverts llvm/llvm-project#100740
2024-07-29 10:19:28 +01:00
Sander de Smalen
5430f73b50
[Clang] Demote always_inline error to warning for mismatching SME attrs (#100740)
PR #77936 introduced a diagnostic to avoid calls being inlined into
functions with a different streaming mode, because inlining those
functions may result in different runtime behaviour. This was necessary
because LLVM doesn't check whether inlining is possible and thus blindly
inlines the function without checking feasibility.

In practice however, this introduces an artificial restriction that the
user may not be able to work around. Calling an `always_inline` function
from some header file that is out of the control of the user would
result in an error that the user cannot remedy.

Therefore, this patch demotes the error into a warning (for calls from
streaming[-compatible] -> non-streaming), but the proper fix would be to
fix the AlwaysInliner in LLVM to avoid inlining when it has analyzed the
callee and has determined that inlining is not possible.

Calling an always_inline function for calls from non-streaming ->
streaming will remain an error, because there is little pre-existing
code for SME, so it is expected that the header file can be modified by
the user (e.g. by using `__arm_streaming_compatible` if the code is
claimed to be compatible).
2024-07-29 09:29:30 +01:00
Matt Arsenault
e108853ac8
clang: Allow targets to set custom metadata on atomics (#96906)
Use this to replace the emission of the amdgpu-unsafe-fp-atomics
attribute in favor of per-instruction metadata. In the future
new fine grained controls should be introduced that also cover
the integer cases.

Add a wrapper around CreateAtomicRMW that appends the metadata,
and update a few use contexts to use it.
2024-07-26 09:57:28 +04:00
James Y Knight
e59a619acf
Clang: don't unnecessarily convert inline-asm operands to x86mmx in IR. (#98273)
The SelectionDAG asm-lowering code can already handle conversion of
other vector types to MMX if needed.
2024-07-23 13:22:24 -04:00
Ulrich Weigand
9af3628ce7 [SystemZ] Fix transparent_union calling convention
The SystemZ ABI code was missing code to handle the transparent_union
extension.  Arguments of such types are specified to be passed like
the first member of the union, instead of according to the usual
ABI calling convention for aggregates.

This did not make much difference in practice as the SystemZ ABI
already specifies that 1-, 2-, 4- or 8-byte aggregates are passed
in registers.  However, there *is* a difference if the first member
of the transparent union is a scalar integer type smaller than word
size - if passed as a scalar, it needs to be zero- or sign-extended
to word size, while if passed as aggregate, it is not.

Fixed by adding code to handle transparent_union similar to what
is done on other targets.
2024-07-18 17:43:28 +02:00
Joseph Huber
486d00eca6
[NVPTX] Implement variadic functions using IR lowering (#96015)
Summary:
This patch implements support for variadic functions for NVPTX targets.
The implementation here mainly follows what was done to implement it for
AMDGPU in https://github.com/llvm/llvm-project/pull/93362.

We change the NVPTX codegen to lower all variadic arguments to functions
by-value. This creates a flattened set of arguments that the IR lowering
pass converts into a struct with the proper alignment.

The behavior of this function was determined by iteratively checking
what the NVCC copmiler generates for its output. See examples like
https://godbolt.org/z/KavfTGY93. I have noted the main methods that
NVIDIA uses to lower variadic functions.

1. All arguments are passed in a pointer to aggregate.
2. The minimum alignment for a plain argument is 4 bytes.
3. Alignment is dictated by the underlying type
4. Structs are flattened and do not have their alignment changed.
5. NVPTX never passes any arguments indirectly, even very large ones.

This patch passes the tests in the `libc` project currently, including
support for `sprintf`.
2024-07-12 17:09:48 -05:00
Daniel Kiss
f34a1654d6
[NFC][Clang] Move set functions out BranchProtectionInfo. (#98451)
To reduce build times move them to TargetCodeGenInfo.

Refactor of #98329
2024-07-12 10:58:34 +02:00
Daniel Kiss
e03f66516d
[Clang][ARM] Call constructor on BranchTargetInfo. (#98307)
Otherwise members will be uninitialised.
2024-07-10 17:26:55 +02:00
Daniel Kiss
1782810b84 [Clang][ARM][AArch64] Alway emit protection attributes for functions. (#82819)
So far branch protection, sign return address, guarded control stack
attributes are
only emitted as module flags to indicate the functions need to be
generated with
those features.
The problem is in case of an LTO build the module flags are merged with
the `min`
rule which means if one of the module is not build with sign return
address then the features
will be turned off for all functions. Due to the functions take the
branch-protection and
sign-return-address features from the module flags. The
sign-return-address is
function level option therefore it is expected functions from files that
is
compiled with -mbranch-protection=pac-ret to be protected.
The inliner might inline functions with different set of flags as it
doesn't consider
the module flags.

This patch adds the attributes to all functions and drops the checking
of the module flags
for the code generation.
Module flag is still used for generating the ELF markers.
Also drops the "true"/"false" values from the
branch-protection-enforcement,
branch-protection-pauth-lr, guarded-control-stack attributes as presence
of the
attribute means it is on absence means off and no other option.

Releand with test fixes.
2024-07-10 11:32:41 +02:00
Daniel Kiss
4b2daeccc7
Revert "[Clang][ARM][AArch64] Alway emit protection attributes for functions." (#98284)
Reverts llvm/llvm-project#82819
2024-07-10 10:22:38 +02:00
Daniel Kiss
e15d67cfc2
[Clang][ARM][AArch64] Alway emit protection attributes for functions. (#82819)
So far branch protection, sign return address, guarded control stack
attributes are
only emitted as module flags to indicate the functions need to be
generated with
those features.
The problem is in case of an LTO build the module flags are merged with
the `min`
rule which means if one of the module is not build with sign return
address then the features
will be turned off for all functions. Due to the functions take the
branch-protection and
sign-return-address features from the module flags. The
sign-return-address is
function level option therefore it is expected functions from files that
is
compiled with -mbranch-protection=pac-ret to be protected.
The inliner might inline functions with different set of flags as it
doesn't consider
the module flags.
 
This patch adds the attributes to all functions and drops the checking
of the module flags
for the code generation.
Module flag is still used for generating the ELF markers.
Also drops the "true"/"false" values from the
branch-protection-enforcement,
branch-protection-pauth-lr, guarded-control-stack attributes as presence
of the
attribute means it is on absence means off and no other option.
2024-07-10 10:06:14 +02:00
Sudharsan Veeravalli
d65f423202
[RISCV] Handle empty structs/unions passing in C++ (#97315)
According to RISC-V integer calling convention empty structs or union
arguments or return values are ignored by C compilers which support them
as a non-standard extension. This is not the case for C++, which
requires them to be sized types.

Fixes #97285
2024-07-08 18:17:51 -07:00
Phoebe Wang
7e01e64714
[X86][vectorcall] Do not consume register for indirect return value (#97939)
This is how MSVC handles it. https://godbolt.org/z/Eav3vx7cd
2024-07-08 21:23:08 +08:00
Tomas Matheson
fa6d38d61a
[AArch64][TargetParser] Split FMV and extensions (#92882)
FMV extensions are really just mappings from FMV feature names to lists
of backend features for codegen. Split them out into their own separate
file.
2024-06-20 15:33:21 +01:00
Mariya Podchishchaeva
6d973b4548
[clang][CodeGen] Return RValue from EmitVAArg (#94635)
This should simplify handling of resulting value by the callers.
2024-06-17 13:29:20 +02:00
Jon Chesterfield
8516f54e6a
[AMDGPU] Implement variadic functions by IR lowering (#93362)
This is a mostly-target-independent variadic function optimisation and
lowering pass. It is only enabled for AMDGPU in this initial commit.

The purpose is to make C style variadic functions a zero cost
abstraction. They are lowered to equivalent IR which is then amenable to
other optimisations. This is inherently slightly target specific but
much less so than one might expect - the C varargs interface heavily
constrains the ABI design divergence.

The pass is primarily tested from webassembly. This is because wasm has
a straightforward variadic lowering strategy which coincides exactly
with what this pass transforms code into and a struct passing convention
with few cases to check. Adding further targets conventions is
straightforward and elided from this patch primarily to simplify the
review. Implemented in other branches are Linux X86, AMD64, AArch64 and
NVPTX.

Testing for targets that have existing lowering for va_arg from clang is
most efficiently done by checking that clang | opt completely elides the
variadic syntax from test cases. The lowering produces a struct for each
call site which can be inspected to check the various alignment and
indirections are correct.

AMDGPU presently has no variadic support other than some ad hoc printf
handling. Combined with the pass being inactive on all other targets
landing this represents strict increase in capability with zero risk.
Testing and refining will continue post commit.

In addition to the compiler tests included here, a self contained x64
clang/musl toolchain was constructed using the "lowering" instead of the
systemv ABI and used to build various C programs like lua and libxml2.
2024-06-06 10:44:53 +01:00
Jon Chesterfield
794457f6f9
[amdgpu] Pass variadic arguments without splitting (#94083)
Pass variadic arguments without changing their type, unlike the fixed
ones.

Fixed arguments are modified to better fit into registers. This patch
leaves those unchanged.

Splitting struct types into individual fields and packing small structs
into integers works well for passing via registers. Variadic arguments
are currently unimplemented in the backend. They're likely to be
implemented as a pointer to stack memory in which case register-themed
optimisations are inapplicable.

Splitting the struct into fields makes it difficult to implement va_arg
robustly. The rules around padding and alignment to inverse the struct
splitting could be constructed, but at high complexity and no particular
advantage.

Passing types as-is means there is a 1:1 correspondence with the type
information va_arg has to work with and the parameter type at the call
site.

This is an ABI change, but as the only functions affected are variadic
ones which are presently a compilation error, not a functional break.
Factored out of the larger #93362 and can land independently.
2024-06-04 13:10:10 +01:00
Jon Chesterfield
b2d7d72ff2
[AArch64] Use ptrmask for vaarg stack alignment (#92836) 2024-05-21 03:22:20 +01:00
Ahmed Bougacha
3575d23ca8
[clang][CodeGen] Remove unused LValue::getAddress CGF arg. (#92465)
This is in effect a revert of f139ae3d93797, as we have since gained a
more sophisticated way of doing extra IRGen with the addition of
RawAddress in #86923.
2024-05-20 10:23:04 -07:00
Florian Hahn
8a4cbeada9
[Clang] Unbreak build take 2 using uint64_t() explicitly. 2024-05-15 15:49:03 +01:00
Florian Hahn
da116bd82c
[Clang] Use ULL for std::max constant argument to fix build failure.
getKnownMinValue returns uint64_t, use ULL to make sure the second arg
is also 64 bit.
2024-05-15 15:37:52 +01:00
Koakuma
c2fba6df94
[clang][SPARC] Treat empty structs as if it's a one-bit type in the CC (#90338)
Make sure that empty structs are treated as if it has a size of one bit
in function parameters and return types so that it occupies a full
argument and/or return register slot.

This fixes crashes and miscompilations when passing and/or returning
empty structs.

Reviewed by: @s-barannikov
2024-05-15 20:49:28 +07:00
Lukacma
421862f8e4
[Clang] Fix incorrect passing of _BitInt args (#90741)
This patch removes incorrect `byval` attribute from pointer argument
passed with >128 bit long _BitInt types.
2024-05-15 10:51:32 +01:00
Phoebe Wang
5bde8017a1
[X86][vectorcall] Pass built types byval when xmm0~6 exhausted (#91846)
This is how MSVC handles it. https://godbolt.org/z/fG386bjnf
2024-05-13 08:31:49 +08:00