2107 Commits

Author SHA1 Message Date
Daniil Kovalev
146fd7cd45
[PAC][Driver] Support pauthtest ABI for AArch64 Linux triples (#97237)
When `pauthtest` is passed either as the environment part of an AArch64 Linux
triple or via `-mabi=`, enable the following ptrauth flags:

- `intrinsics`;
- `calls`;
- `returns`;
- `auth-traps`;
- `vtable-pointer-address-discrimination`;
- `vtable-pointer-type-discrimination`;
- `init-fini`.

Some related functionality is still subject to change, and the ABI itself
might change, so end users are not expected to use this; hence the 'test'
suffix in the ABI name.

If the `-mabi=pauthtest` option is used, it is normalized into the effective
triple.

When the environment part of the effective triple is `pauthtest`, try
to use `aarch64-linux-pauthtest` as multilib directory.

The following is not supported:
- combining the `pauthtest` ABI with any branch protection scheme
  except BTI;
- explicitly setting the environment part of the triple to a value different
  from `pauthtest` in combination with `-mabi=pauthtest`;
- usage on non-Linux OS.
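
The normalization and the unsupported combinations listed above can be modelled roughly as follows. This is only a sketch: `SimpleTriple` and `applyPauthtestABI` are hypothetical names, not the driver's real types or functions, the branch protection rule and the AArch64 check are left out, and the way the real driver reports errors is not shown.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical, simplified view of a target triple (not llvm::Triple).
struct SimpleTriple {
  std::string Arch;
  std::string OS;
  std::string Environment; // empty when the triple does not name one
};

// Returns the effective triple, or std::nullopt for the unsupported
// combinations described in the commit message.
std::optional<SimpleTriple> applyPauthtestABI(SimpleTriple T,
                                              bool MAbiPauthtest) {
  if (MAbiPauthtest) {
    // An explicit environment other than pauthtest conflicts with
    // -mabi=pauthtest.
    if (!T.Environment.empty() && T.Environment != "pauthtest")
      return std::nullopt;
    // -mabi=pauthtest is folded into the effective triple.
    T.Environment = "pauthtest";
  }
  // The pauthtest ABI is only supported on Linux.
  if (T.Environment == "pauthtest" && T.OS != "linux")
    return std::nullopt;
  return T;
}

// Example use of the sketch.
inline void checkPauthtestExamples() {
  auto Effective = applyPauthtestABI({"aarch64", "linux", ""}, true);
  assert(Effective && Effective->Environment == "pauthtest");
  assert(!applyPauthtestABI({"aarch64", "linux", "gnu"}, true));
}
```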

---------

Co-authored-by: Anatoly Trosinenko <atrosinenko@accesssoftek.com>
2024-07-22 21:18:39 +03:00
Alexandros Lamprineas
a2d309912a
[FMV][AArch64] Do not emit ifunc resolver on use. (#97761)
It was raised in https://github.com/llvm/llvm-project/issues/81494 that
we are not generating correct code when there is no TU-local caller.

The suggestion was to emit a resolver:
* Whenever there is a use in the TU.
* When the TU has a definition of the default version.

See the comment for more details:

https://github.com/llvm/llvm-project/issues/81494#issuecomment-1985963497

This got addressed with https://github.com/llvm/llvm-project/pull/84405.

Generating a resolver on use means that we may end up with multiple
resolvers across different translation units. Those resolvers may not be
the same because each translation unit may contain different version
declarations (user's fault). Therefore the order of linking the final
image determines which of these weak symbols gets selected, resulting in
inconsistent behavior. I am proposing to stop emitting a resolver on
use and only do so in the translation unit which contains the default
definition. This way we guarantee the existence of a single resolver.
Now, when a versioned function is used we want to emit a declaration of
the function symbol omitting the multiversion mangling.

I have added a requirement to ACLE mandating that all the function
versions are declared in the translation unit which contains the default
definition: https://github.com/ARM-software/acle/pull/328
2024-07-18 16:34:16 +01:00
Aaron Ballman
d3f8105c65
Revert "Finish deleting the le32/le64 targets" (#99079)
Reverts llvm/llvm-project#98497

We're reverting this for approx 30 days so that the Halide project has
time to transition off the target.
2024-07-16 14:47:09 -04:00
Fangrui Song
148d90729e
[CodeGen] Set attributes on resolvers emitted after ifuncs
Visiting the ifunc calls `GetOrCreateLLVMFunction` with
`NotForDefinition` while visiting the resolver calls
`GetOrCreateLLVMFunction` with `ForDefinition`.

When an ifunc is emitted before its resolver, the `ForDefinition` call
does not call `SetFunctionAttributes`, because the function prematurely
returns due to `(Entry->getValueType() == Ty)` and
`llvm::GlobalIFunc::getResolverFunctionType(DeclTy)`.

This leads to missing `!kcfi_type` with -fsanitize=kcfi.

```
extern void ifunc0(void) __attribute__ ((ifunc("resolver0")));
void *resolver0(void) { return 0; } // SetFunctionAttributes not called

extern void ifunc1(void) __attribute__ ((ifunc("resolver1")));
static void *resolver1(void) { return 0; } // SetFunctionAttributes not called

extern void ifunc2(void) __attribute__ ((ifunc("resolver2")));
static void *resolver2(void*) { return 0; }
```

Ensure `SetFunctionAttributes` is called by calling
`GetOrCreateLLVMFunction` with a dummy non-function type. Now that the
`F->takeName(Entry)` code path may be taken, the
`DisableSanitizerInstrumentation` code
(https://reviews.llvm.org/D150262) should be moved to `checkAliases`,
when the resolver function is finalized.

Pull Request: https://github.com/llvm/llvm-project/pull/98832
2024-07-15 11:27:01 -07:00
Yaxun (Sam) Liu
77fd30f7ce
[CUDA][HIP] Fix template static member (#98580)
Check host/device attributes before emitting a static member of a
template instantiation.

Fixes: https://github.com/llvm/llvm-project/issues/98151
2024-07-12 10:08:34 -04:00
Aaron Ballman
2369a54fbe
Finish deleting the le32/le64 targets (#98497)
This is a revert of ef5e7f90ea4d5063ce68b952c5de473e610afc02 which was a
temporary partial revert of 77ac823fd285973cfb3517932c09d82e6a32f46d.
The le32 and le64 targets are no longer necessary to retain, so this
removes them entirely.
2024-07-12 06:55:49 -04:00
Yaxun (Sam) Liu
90abdf83e2
[CUDA][HIP][NFC] add CodeGenModule::shouldEmitCUDAGlobalVar (#98543)
Extract the logic that decides whether to emit a global var, based on
CUDA/HIP host/device related attributes, into
CodeGenModule::shouldEmitCUDAGlobalVar so that it can be used elsewhere.
2024-07-11 21:52:04 -04:00
NAKAMURA Takumi
da31b684a5
[Coverage] Suppress covmap and profdata for system headers. (#97952)
With `system-headers-coverage=false`, functions defined in system
headers were not instrumented but the corresponding covmaps were still
emitted, wasting space in both covmap and profraw.

This change brings the following improvements:

- Reduced object size (due to the smaller covmap)
- Reduced profraw size (uninstrumented system headers previously occupied
counters)
- A cleaner coverage report: stubs of uninstrumented system
headers are no longer shown.
2024-07-10 17:11:12 +09:00
Chen Zheng
afd0e6d06b
[PowerPC] Diagnose musttail instead of crash inside backend (#93267)
musttail calls often cannot be generated on PPC targets: when
calling a function defined in another module, PPC needs to restore
the TOC pointer. To do so, the compiler emits a nop after the call
so that the linker can generate code to restore the TOC pointer.
A tail call cannot produce the expected call sequence in this case.

To avoid the crash inside the compiler backend, a diagnostic is added in
the frontend.

Fixes #63214
2024-07-08 09:30:01 +08:00
Nick Zavaritsky
ae0d2244a2
[BPF] Fix linking issues in static map initializers (#91310)
When BPF object files are linked with bpftool, every symbol must be
accompanied by BTF info. Ensure that extern functions referenced by
global variable initializers are included in BTF.

The primary motivation is "static" initialization of PROG maps:

```c
extern int elsewhere(struct xdp_md *);

struct {
  __uint(type, BPF_MAP_TYPE_PROG_ARRAY);
  __uint(max_entries, 1);
  __type(key, int);
  __type(value, int);
  __array(values, int (struct xdp_md *));
} prog_map SEC(".maps") = { .values = { elsewhere } };
```

The BPF backend needs debug info to produce BTF. Debug info is not
normally generated for external variables and functions. Previously, it
was solved differently for variables (collecting variable declarations
in the ExternalDeclarations vector) and functions (logic invoked during
codegen in CGExpr.cpp).

This patch generalises ExternalDeclarations to include both function
and variable declarations. This change ensures that function references
are not missed no matter the context. Previously external functions
referenced in constant expressions lacked debug info.
2024-07-05 07:32:09 -07:00
Sven van Haastregt
5fd5b8ada7
[OpenCL] Emit opencl.cxx.version metadata for C++ (#92140)
Currently there is no way to tell whether an IR module was generated
using `-cl-std=cl3.0` or `-cl-std=clc++2021`, i.e., whether the origin
was an OpenCL C or a C++ for OpenCL source.

Add new `opencl.cxx.version` named metadata when compiling C++. Keep the
`opencl.ocl.version` metadata to convey the compatible OpenCL C version.

Fixes https://github.com/llvm/llvm-project/issues/91912
2024-07-03 13:24:22 +02:00
Alex Voicu
9acb533c38
[clang][Driver] Add HIPAMD Driver support for AMDGCN flavoured SPIR-V (#95061)
This patch augments the HIPAMD driver to allow it to target AMDGCN
flavoured SPIR-V compilation. It's mostly straightforward, as we re-use
some of the existing SPIRV infra, however there are a few notable
additions:

- we introduce an `amdgcnspirv` offload arch, rather than relying on
using `generic` (this is already fairly overloaded) or simply using
`spirv` or `spirv64` (we'll want to use these to denote unflavoured
SPIRV, once we bring up that capability)
- initially it won't be possible to mix-in SPIR-V and concrete AMDGPU
targets, as it would require some relatively intrusive surgery in the
HIPAMD Toolchain and the Driver to deal with two triples
(`spirv64-amd-amdhsa` and `amdgcn-amd-amdhsa`, respectively)
- in order to retain user provided compiler flags and have them
available at JIT time, we rely on embedding the command line via
`-fembed-bitcode=marker`, which the bitcode writer had previously not
implemented for SPIRV; we only allow it conditionally for AMDGCN
flavoured SPIRV, and it is handled correctly by the Translator (it ends
up as a string literal)

Once the SPIRV BE is no longer experimental we'll switch to using that
rather than the translator. There's some additional work that'll come
via a separate PR around correctly piping through AMDGCN's
implementation of `printf`, for now we merely handle its flags
correctly.
2024-06-25 12:19:28 +01:00
Alexandros Lamprineas
3d8079229e
[clang][AArch64][FMV] Stop emitting alias to ifunc. (#96221)
Long story short the interaction of two optimizations happening in
GlobalOpt results in a crash. For more details look at the issue
https://github.com/llvm/llvm-project/issues/96197. I will be fixing this
in GlobalOpt but it is a conservative solution since it won't allow us
to optimize resolvers which return a pointer to a function whose
definition is in another TU when compiling without LTO:

```
__attribute__((target_version("simd"))) void bar(void);
__attribute__((target_version("default"))) void bar(void);
int foo() { bar(); }
```

fixes: #96197
2024-06-24 12:01:48 +01:00
Andrew Ng
baba78daf2
[clang] Fix loss of dllexport for exported template specialization (#94664)
When dropping DLL attributes, ensure that the most recent declaration is
being checked.
2024-06-10 19:39:28 +01:00
Oliver Stannard
1a5239251e
[ARM] r11 is reserved when using -mframe-chain=aapcs (#86951)
When using the -mframe-chain=aapcs or -mframe-chain=aapcs-leaf options,
we cannot use r11 as an allocatable register, even if
-fomit-frame-pointer is also used. This is so that r11 will always point
to a valid frame record, even if we don't create one in every function.
2024-06-07 10:58:10 +01:00
smanna12
ccaccc3367
[Clang] Prevent null pointer dereference in target attribute mangling (#94228)
This patch adds assertions in the getMangledNameImpl() function to
ensure that the expected target attributes (TargetAttr,
TargetVersionAttr, and TargetClonesAttr) are not null before they are
passed to appendAttributeMangling() to prevent potential null pointer
dereferences and improve the robustness of the attribute mangling
process.

This assertion will trigger a runtime error with a clear message in
debug build if any of the expected attributes are missing, facilitating
early and easier diagnosis and debugging of such issues related to
attribute mangling.
2024-06-03 18:20:33 -05:00
Nikita Popov
cd9a02e2c7
[CodeGen] Remove useless zero-index constant GEPs (NFCI)
Remove zero-index constant expression GEPs, which are not needed
with opaque pointers and will get folded away.
2024-05-30 10:24:57 +02:00
Chuanqi Xu
b0f10a1dc3
[C++20] [Modules] Don't generate the defintition for non-const available external variables (#93530)
Close https://github.com/llvm/llvm-project/issues/93497

The root cause of the problem is that we incorrectly mark variables from
other modules as constant in LLVM. This patch fixes the problem
by not emitting the definition for non-const available external
variables, since a non-const available-externally variable is not
helpful to optimization.
2024-05-29 13:39:57 +08:00
Alexandros Lamprineas
8930ba98e0
[clang][FMV] Allow declaration of function versions in namespaces. (#93044)
Fixes the following bug:

namespace Name {
int __attribute((target_version("default"))) foo() { return 0; }
}

namespace Name {
int __attribute((target_version("sve"))) foo() { return 1; }
}

int bar() { return Name::foo(); }

error: redefinition of 'foo'
  int __attribute((target_version("sve"))) foo() { return 1; }

note: previous definition is here
  int __attribute((target_version("default"))) foo() { return 0; }

While fixing this I also found that in the absence of default version
declaration, the one we implicitly create has incorrect mangling if
we are in a namespace:

namespace OtherName {
int __attribute((target_version("sve"))) foo() { return 2; }
}

int baz() { return OtherName::foo(); }

In this example instead of creating a declaration for the symbol
@_ZN9OtherName3fooEv.default we are creating one for the symbol
@_Z3foov.default (the namespace mangling prefix is omitted).
This has now been fixed.
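
The difference between the two symbols comes from how the Itanium C++ ABI encodes nested names. Below is a minimal sketch of that encoding for a function with no parameters. `mangleNullaryFunction` is a hypothetical helper written for illustration, not clang's mangler, and it ignores substitutions, templates and the special handling of `std`.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Mangle a function that takes no parameters, given its enclosing
// namespaces (outermost first) and its unqualified name.
std::string mangleNullaryFunction(const std::vector<std::string> &Namespaces,
                                  const std::string &Name) {
  // A <source-name> is the identifier prefixed with its length.
  auto SourceName = [](const std::string &S) {
    return std::to_string(S.size()) + S;
  };

  std::string Out = "_Z";
  if (Namespaces.empty()) {
    Out += SourceName(Name);
  } else {
    // Nested names are wrapped in N ... E.
    Out += "N";
    for (const std::string &NS : Namespaces)
      Out += SourceName(NS);
    Out += SourceName(Name);
    Out += "E";
  }
  Out += "v"; // empty parameter list
  return Out;
}

// Example use: the correct and the incorrect symbol from the text above.
inline void checkManglingExamples() {
  assert(mangleNullaryFunction({"OtherName"}, "foo") + ".default" ==
         "_ZN9OtherName3fooEv.default");
  assert(mangleNullaryFunction({}, "foo") + ".default" == "_Z3foov.default");
}
```

Dropping the namespace list reproduces the incorrect `_Z3foov.default` symbol, which is the name of a different function at global scope.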
2024-05-23 10:09:22 +01:00
lolloz98
67ae86d700
[clang] Fix crash passing function pointer without prototype. (#90255)
Fixes use-after-free iterating over the uses of the function.

Closes #88917
2024-05-21 11:51:21 -07:00
Alex Voicu
10edb4991c
[Clang][CodeGen] Start migrating away from assuming the Default AS is 0 (#88182)
At the moment, Clang is rather liberal in assuming that 0 (and by extension unqualified) is always a safe default. This does not work for targets that actually use a different value for the default / generic AS (for example, the SPIR-V obtained from HIPSPV or SYCL). This patch is a first, fairly safe step towards clearing things up by querying a module's default AS from the target, rather than assuming it's 0, alongside fixing a few places where things break / we encode the 0 == DefaultAS assumption. A bunch of existing tests are extended to check for non-zero default AS usage.
2024-05-19 14:59:03 +01:00
Daniil Kovalev
ad652efa1f
[AArch64][PAC][clang][ELF] Support PAuth ABI core info (#85235)
Depends on #87545

Emit PAuth ABI compatibility tag values as llvm module flags:
- `aarch64-elf-pauthabi-platform`
- `aarch64-elf-pauthabi-version`

For platform 0x10000002 (llvm_linux), the version value bits correspond
to the following LangOptions defined in #85232:

- bit 0: `PointerAuthIntrinsics`;
- bit 1: `PointerAuthCalls`;
- bit 2: `PointerAuthReturns`;
- bit 3: `PointerAuthAuthTraps`;
- bit 4: `PointerAuthVTPtrAddressDiscrimination`;
- bit 5: `PointerAuthVTPtrTypeDiscrimination`;
- bit 6: `PointerAuthInitFini`.
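
As a rough illustration of the bit layout above, the version value can be computed as below. `PAuthOptions` and `pauthABIVersion` are hypothetical stand-ins and not the actual clang code; only the bit positions are taken from the list.

```cpp
#include <cstdint>

// Hypothetical stand-in for the relevant LangOptions fields.
struct PAuthOptions {
  bool PointerAuthIntrinsics = false;
  bool PointerAuthCalls = false;
  bool PointerAuthReturns = false;
  bool PointerAuthAuthTraps = false;
  bool PointerAuthVTPtrAddressDiscrimination = false;
  bool PointerAuthVTPtrTypeDiscrimination = false;
  bool PointerAuthInitFini = false;
};

// Place a single flag at the given bit position.
constexpr std::uint64_t bitAt(bool Enabled, unsigned Position) {
  return static_cast<std::uint64_t>(Enabled) << Position;
}

// Pack the options using the bit positions listed above.
constexpr std::uint64_t pauthABIVersion(const PAuthOptions &O) {
  return bitAt(O.PointerAuthIntrinsics, 0) |
         bitAt(O.PointerAuthCalls, 1) |
         bitAt(O.PointerAuthReturns, 2) |
         bitAt(O.PointerAuthAuthTraps, 3) |
         bitAt(O.PointerAuthVTPtrAddressDiscrimination, 4) |
         bitAt(O.PointerAuthVTPtrTypeDiscrimination, 5) |
         bitAt(O.PointerAuthInitFini, 6);
}

static_assert(pauthABIVersion(PAuthOptions{}) == 0, "no flags set");
static_assert(pauthABIVersion(PAuthOptions{true, true, true, true, true, true,
                                           true}) == 0x7F,
              "all seven flags set");
```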

---------

Co-authored-by: Ahmed Bougacha <ahmed@bougacha.org>
2024-05-09 15:32:18 +03:00
Kees Cook
869ffcf3f6
[CodeGen][i386] Move -mregparm storage earlier and fix Runtime calls (#89707)
When building the Linux kernel for i386, the -mregparm=3 option is
enabled. Crashes were observed in the sanitizer handler functions, and
the problem was found to be a mismatched calling convention.

As was fixed in commit c167c0a4dcdb ("[BuildLibCalls] infer inreg param
attrs from NumRegisterParameters"), call arguments need to be marked as
"in register" when -mregparm is set. Use the same helper developed there
to update the function arguments.

Since CreateRuntimeFunction() is actually part of CodeGenModule, storage
of the -mregparm value is also moved to the constructor, as doing this
in Release() is too late.

Fixes: https://github.com/llvm/llvm-project/issues/89670
2024-04-29 14:54:10 -07:00
Craig Topper
733a87783c
[RISCV] Split code that tablegen needs out of RISCVISAInfo. (#89684)
This introduces a new file, RISCVISAUtils.cpp and moves the rest of
RISCVISAInfo to the TargetParser library.

This will allow us to generate part of RISCVISAInfo.cpp using tablegen.
2024-04-23 15:12:36 -07:00
Chuanqi Xu
39016e33b0
[C++20] [Modules] Don't import non-inline function bodies even if it is always-inline
Recommit
1ecbab56dc

Close https://github.com/llvm/llvm-project/issues/80949

What is new in this commit is that function bodies from
instantiations may be imported if they are marked always-inline. See the
discussion in https://github.com/llvm/llvm-project/issues/86893 for
details.
2024-04-16 13:06:44 +08:00
Chuanqi Xu
aa2741449c
Revert "[C++20] [Modules] Don't import non-inline function bodies even if it is marked as always_inline"
This reverts commit 1ecbab56dcbb78268c8d19af34a50591f90b12a0.

See the discussion in https://github.com/llvm/llvm-project/issues/86893.

The original commit received too many complaints. Let's try to
work around the issue to give a better user experience.
2024-04-15 17:06:03 +08:00
Arthur Eubanks
5d6d8dcd29
[clang][llvm] Remove "implicit-section-name" attribute (#87906)
D33412/D33413 introduced this to support a clang pragma to set section
names for a symbol depending on if it would be placed in
bss/data/rodata/text, which may not be known until the backend. However,
for text we know that only functions will go there, so just directly set
the section in clang instead of going through a completely separate
attribute.

Autoupgrade the "implicit-section-name" attribute to directly setting
the section on a Function.
2024-04-11 12:29:29 -07:00
Bill Wendling
fca51911d4
[NFC][Clang] Improve const correctness for IdentifierInfo (#79365)
The IdentifierInfo isn't typically modified. Use 'const' wherever
possible.
2024-04-11 00:33:40 +00:00
Eli Friedman
71097e9271
[ARM64EC] Add support for parsing __vectorcall (#87725)
MSVC doesn't support generating __vectorcall calls in Arm64EC mode, but
it does treat it as a distinct type. The Microsoft STL depends on this
functionality. (Not sure if this is intentional.) Add support for
parsing the same way as MSVC, and add some checks to ensure we don't try
to actually generate code.

The error handling in CodeGen is ugly, but I can't think of a better way
to do it.
2024-04-09 19:53:56 -07:00
Akira Hatanaka
84780af4b0
[CodeGen][arm64e] Add methods and data members to Address, which are needed to authenticate signed pointers (#86923)
To authenticate pointers, CodeGen needs access to the key and
discriminators that were used to sign the pointer. That information is
sometimes known from the context, but not always, which is why `Address`
needs to hold that information.

This patch adds methods and data members to `Address`, which will be
needed in subsequent patches to authenticate signed pointers, and uses
the newly added methods throughout CodeGen. Although this patch isn't
strictly NFC as it causes CodeGen to use different code paths in some
cases (e.g., `mergeAddressesInConditionalExpr`), it doesn't cause any
changes in functionality as it doesn't add any information needed for
authentication.

In addition to the changes mentioned above, this patch introduces class
`RawAddress`, which contains a pointer that we know is unsigned, and
adds several new functions for creating `Address` and `LValue` objects.

This reapplies d9a685a9dd589486e882b722e513ee7b8c84870c, which was
reverted because it broke ubsan bots. There seems to be a bug in
coroutine code-gen, which is causing EmitTypeCheck to use the wrong
alignment. For now, pass alignment zero to EmitTypeCheck so that it can
compute the correct alignment based on the passed type (see function
EmitCXXMemberOrOperatorMemberCallExpr).
2024-03-28 06:54:36 -07:00
Akira Hatanaka
f75eebab88
Revert "[CodeGen][arm64e] Add methods and data members to Address, which are needed to authenticate signed pointers (#86721)" (#86898)
This reverts commit d9a685a9dd589486e882b722e513ee7b8c84870c.

The commit broke ubsan bots.
2024-03-27 18:14:04 -07:00
Akira Hatanaka
d9a685a9dd
[CodeGen][arm64e] Add methods and data members to Address, which are needed to authenticate signed pointers (#86721)
To authenticate pointers, CodeGen needs access to the key and
discriminators that were used to sign the pointer. That information is
sometimes known from the context, but not always, which is why `Address`
needs to hold that information.

This patch adds methods and data members to `Address`, which will be
needed in subsequent patches to authenticate signed pointers, and uses
the newly added methods throughout CodeGen. Although this patch isn't
strictly NFC as it causes CodeGen to use different code paths in some
cases (e.g., `mergeAddressesInConditionalExpr`), it doesn't cause any
changes in functionality as it doesn't add any information needed for
authentication.

In addition to the changes mentioned above, this patch introduces class
`RawAddress`, which contains a pointer that we know is unsigned, and
adds several new functions for creating `Address` and `LValue` objects.

This reapplies 8bd1f9116aab879183f34707e6d21c7051d083b6. The commit
broke msan bots because LValue::IsKnownNonNull was uninitialized.
2024-03-27 12:24:49 -07:00
Chris B
28ddbd4a86
[NFC] Refactor ConstantArrayType size storage (#85716)
In PR #79382, I need to add a new type that derives from
ConstantArrayType. This means that ConstantArrayType can no longer use
`llvm::TrailingObjects` to store the trailing optional Expr*.

This change refactors ConstantArrayType to store a 60-bit integer and
4 bits for the integer size in bytes. This replaces the APInt field
previously in the type but preserves enough information to recreate it
where needed.
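
A minimal sketch of the packing idea, assuming the 60-bit value sits in the low bits and the 4-bit byte width in the high bits of one 64-bit word. That layout and the names `packSize`, `unpackValue`, `unpackSizeInBytes` and `fitsInValueField` are illustrative assumptions; the actual ConstantArrayType fields may be arranged differently.

```cpp
#include <cstdint>

constexpr unsigned ValueBits = 60;
constexpr std::uint64_t ValueMask = (std::uint64_t{1} << ValueBits) - 1;

// True if Value can be stored in the 60-bit field without losing bits.
constexpr bool fitsInValueField(std::uint64_t Value) {
  return Value <= ValueMask;
}

// Combine a value and its width in bytes (0..15) into one 64-bit word.
constexpr std::uint64_t packSize(std::uint64_t Value, unsigned SizeInBytes) {
  return (Value & ValueMask) |
         (static_cast<std::uint64_t>(SizeInBytes & 0xF) << ValueBits);
}

constexpr std::uint64_t unpackValue(std::uint64_t Packed) {
  return Packed & ValueMask;
}

constexpr unsigned unpackSizeInBytes(std::uint64_t Packed) {
  return static_cast<unsigned>(Packed >> ValueBits);
}

static_assert(unpackValue(packSize(1024, 8)) == 1024, "value round-trips");
static_assert(unpackSizeInBytes(packSize(1024, 8)) == 8, "width round-trips");
static_assert(!fitsInValueField(~std::uint64_t{0}),
              "a full 64-bit value needs the APInt representation");
```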

To reduce the number of places where the APInt is re-constructed I've
also added some helper methods to the ConstantArrayType to allow some
common use cases that operate on either the stored small integer or the
APInt as appropriate.

Resolves #85124.
2024-03-26 14:15:56 -05:00
Akira Hatanaka
b311756450
Revert "[CodeGen][arm64e] Add methods and data members to Address, which are needed to authenticate signed pointers (#67454)" (#86674)
This reverts commit 8bd1f9116aab879183f34707e6d21c7051d083b6.

It appears that the commit broke msan bots.
2024-03-26 07:37:57 -07:00
Alexandros Lamprineas
da9ac43433
[FMV] Allow mixing target_version with target_clones. (#86493)
The latest ACLE allows it and further clarifies the following
in regards to the combination of the two attributes:

"If the `default` matches with another explicitly provided
 version in the same translation unit, then the compiler can
 emit only one function instead of the two. The explicitly
 provided version shall be preferred."

("default" refers to the default clone here)

https://github.com/ARM-software/acle/pull/310
2024-03-26 11:36:34 +00:00
Akira Hatanaka
8bd1f9116a
[CodeGen][arm64e] Add methods and data members to Address, which are needed to authenticate signed pointers (#67454)
To authenticate pointers, CodeGen needs access to the key and
discriminators that were used to sign the pointer. That information is
sometimes known from the context, but not always, which is why `Address`
needs to hold that information.

This patch adds methods and data members to `Address`, which will be
needed in subsequent patches to authenticate signed pointers, and uses
the newly added methods throughout CodeGen. Although this patch isn't
strictly NFC as it causes CodeGen to use different code paths in some
cases (e.g., `mergeAddressesInConditionalExpr`), it doesn't cause any
changes in functionality as it doesn't add any information needed for
authentication.

In addition to the changes mentioned above, this patch introduces class
`RawAddress`, which contains a pointer that we know is unsigned, and
adds several new functions for creating `Address` and `LValue` objects.
2024-03-25 18:05:42 -07:00
Alexandros Lamprineas
772e316457
[FMV] Allow multi versioning without default declaration. (#85454)
This was a limitation which has now been lifted. Please read the
thread below for more details:

https://github.com/llvm/llvm-project/pull/84405#discussion_r1525583647

Basically it allows versioned implementations to be separated across
different TUs without having to share private header files which
contain the default declaration.

The ACLE spec has been updated accordingly to make this explicit:
"Each version declaration should be visible at the translation
 unit in which the corresponding function version resides."

https://github.com/ARM-software/acle/pull/310

If a resolver is required (because there is a caller in the TU),
then a default declaration is implicitly generated.
2024-03-25 09:43:41 +00:00
Alexandros Lamprineas
9cb5004209
Reland [FMV] Emit the resolver along with the default version definit… (#85923)
…ion.

This was reverted because the resolver didn't look as expected in one of
the tests. I believe it had some interaction with #84146. I have now
regenerated it using -target-feature -fp-armv8.
2024-03-20 16:49:51 +00:00
Alexandros Lamprineas
b7975cae7b
Revert "[FMV] Emit the resolver along with the default version definition." (#85914)
Reverts llvm/llvm-project#84405

Between passing the precommit tests on GitHub and being merged,
some change (perhaps in the AArch64 backend?) landed which
altered the generated resolver. I will regenerate the tests,
perhaps using a run line that is less sensitive to such changes.
2024-03-20 06:16:26 -04:00
Alexandros Lamprineas
e6b5bd5854
[FMV] Emit the resolver along with the default version definition. (#84405)
We would like the resolver to be generated eagerly, even if the
versioned function is not called from the current translation
unit. Fixes #81494. It further allows Multi Versioning to work
even if the default target version attribute is omitted from
function declarations.
2024-03-20 09:24:29 +00:00
ostannard
ef395a492a
[AArch64] Add soft-float ABI (#84146)
This is a reworking of #74460, which adds a soft-float ABI for AArch64.
That was reverted because it caused errors when building the Linux and
Fuchsia kernels.

The problem is that GCC's implementation of the ABI compatibility checks
when using the hard-float ABI on a target without FP registers does its
checks after optimisation. The previous version of this patch reported
errors for all uses of floating-point types, which is stricter than what
GCC does in practice.

This changes two things compared to the first version:
* Only check the types of function arguments and returns, not the types
of other values. This is more relaxed than GCC, while still guaranteeing
ABI compatibility.
* Move the check from Sema to CodeGen, so that inline functions are only
checked if they are actually used. There are some cases in the linux
kernel which depend on this behaviour of GCC.
2024-03-19 13:58:51 +00:00
Zaara Syeda
37b5eb0a0a
[AIX][TOC] Add -mtocdata/-mno-tocdata options on AIX (#67999)
This patch enables support that the XL compiler had for AIX under
-qdatalocal/-qdataimported.
2024-03-13 10:26:31 -04:00
Joseph Huber
630289f77d
[HIP] Do not include the CUID module hash with the new driver (#84332)
Summary:
The new driver does not need this hash and it can lead to redefined
symbol errors when the CUID hash isn't set.
2024-03-07 11:04:40 -06:00
Emma Pilkington
4490003a22
[AMDGPU] Rename COV module flag to amdhsa_code_object_version (#79905)
The previous name 'amdgpu_code_object_version', was misleading since
this is really a property of the HSA OS. The new spelling also matches
the asm directive I added in bc82cfb.
2024-03-06 09:51:48 -05:00
Alexandros Lamprineas
b42b7c8a12
[clang] Refactor target attribute mangling. (#81893)
Before this patch all of the 'target', 'target_version' and
'target_clones' attributes were sharing a common mangling logic across
different targets. However we would like to differentiate this logic,
therefore I have moved the default path to ABIInfo and provided
overrides for AArch64. This way we can resolve feature aliases without
affecting the name mangling. The PR #80540 demonstrates a motivating
case.
2024-02-28 17:49:59 +00:00
Florian Hahn
d2a9df2c8f
[TBAA] Handle bitfields when generating !tbaa.struct metadata. (#82922)
At the moment, clang generates what I believe are incorrect !tbaa.struct
fields for named bitfields: the base type size is used
for named bitfields (e.g. sizeof(int)) instead of the bitfield width per
field. This results in overlapping fields in !tbaa.struct metadata.

This causes incorrect results when extracting individual copied fields
from !tbaa.struct, as added in dc85719d5.

This patch fixes that by combining adjacent bitfields
into fields with correct sizes.
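
To illustrate the idea only (this is not the code from the patch): for `struct S { int a : 3; int b : 5; int c; };` on a typical ABI, using sizeof(int) for each bitfield yields two entries that both cover bytes 0..3. The sketch below instead merges a run of consecutive bitfields into one entry covering just the bytes they occupy. `FieldInfo`, `TBAAField` and `buildFields` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical description of one struct member, in bits.
struct FieldInfo {
  std::uint64_t BitOffset;
  std::uint64_t BitWidth;
  bool IsBitfield;
};

// One (offset, size) entry, in bytes.
struct TBAAField {
  std::uint64_t ByteOffset;
  std::uint64_t ByteSize;
};

std::vector<TBAAField> buildFields(const std::vector<FieldInfo> &Fields) {
  std::vector<TBAAField> Out;
  std::size_t I = 0;
  while (I < Fields.size()) {
    std::uint64_t StartBit = Fields[I].BitOffset;
    std::uint64_t EndBit = StartBit + Fields[I].BitWidth;
    if (Fields[I].IsBitfield) {
      // Extend the run over every directly following bitfield.
      while (I + 1 < Fields.size() && Fields[I + 1].IsBitfield) {
        ++I;
        EndBit = Fields[I].BitOffset + Fields[I].BitWidth;
      }
    }
    std::uint64_t StartByte = StartBit / 8;
    std::uint64_t EndByte = (EndBit + 7) / 8; // round up to whole bytes
    Out.push_back({StartByte, EndByte - StartByte});
    ++I;
  }
  return Out;
}

// Example use: struct S { int a : 3; int b : 5; int c; };
inline void checkBitfieldExample() {
  std::vector<FieldInfo> S = {{0, 3, true}, {3, 5, true}, {32, 32, false}};
  std::vector<TBAAField> Out = buildFields(S);
  assert(Out.size() == 2);
  assert(Out[0].ByteOffset == 0 && Out[0].ByteSize == 1); // a and b merged
  assert(Out[1].ByteOffset == 4 && Out[1].ByteSize == 4); // c
}
```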

Fixes https://github.com/llvm/llvm-project/issues/82586
2024-02-27 20:09:54 +00:00
gulfemsavrun
23f895f656
[InstrProf] Single byte counters in coverage (#75425)
This patch inserts 1-byte counters instead of 8-byte counters into
llvm profiles for source-based code coverage. The original idea was
proposed as block-cov for PGO, and this patch repurposes that idea for
coverage: https://groups.google.com/g/llvm-dev/c/r03Z6JoN7d4

The current 8-byte counter mechanism adds counters to minimal regions,
and infers the counters in the remaining regions by adding or
subtracting counters. For example, it infers the counter in the if.else
region by subtracting the counters between if.entry and if.then regions
in an if statement. Whenever there is a control-flow merge, it adds the
counters from all the incoming regions. However, we are not going to be
able to infer counters by subtracting two execution counts when using
single-byte counters. Therefore, this patch conservatively inserts
additional counters for the cases where we need to add or subtract
counters.
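
The following sketch shows why the subtraction no longer works. It models an if statement that is executed twice, once with the condition true and once with it false. `WideCounters` and `ByteCounters` are hypothetical types used only for illustration; they are not the instrumentation's real data structures.

```cpp
#include <cstdint>

// 8-byte counters: only if.entry and if.then are instrumented.
struct WideCounters {
  std::uint64_t Entry = 0;
  std::uint64_t Then = 0;

  constexpr void visit(bool Cond) {
    ++Entry;
    if (Cond)
      ++Then;
  }
  // if.else is inferred by subtraction instead of being counted.
  constexpr std::uint64_t inferredElse() const { return Entry - Then; }
};

// Single-byte counters: each one only records "executed at least once",
// so if.else needs its own counter.
struct ByteCounters {
  std::uint8_t Entry = 0;
  std::uint8_t Then = 0;
  std::uint8_t Else = 0;

  constexpr void visit(bool Cond) {
    Entry = 1;
    if (Cond)
      Then = 1;
    else
      Else = 1;
  }
};

constexpr WideCounters sampleWide() {
  WideCounters C{};
  C.visit(true);
  C.visit(false);
  return C;
}

constexpr ByteCounters sampleByte() {
  ByteCounters C{};
  C.visit(true);
  C.visit(false);
  return C;
}

static_assert(sampleWide().inferredElse() == 1, "subtraction is exact");
static_assert(sampleByte().Entry - sampleByte().Then == 0,
              "subtracting flags would report if.else as never executed");
static_assert(sampleByte().Else == 1, "the extra counter records it");
```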

RFC:
https://discourse.llvm.org/t/rfc-single-byte-counters-for-source-based-code-coverage/75685
2024-02-26 14:44:55 -08:00
Yaxun (Sam) Liu
33a6ce1837
[HIP] Allow partial linking for -fgpu-rdc (#81700)
`-fgpu-rdc` mode allows device functions to call device functions in
a different TU. However, currently all device objects have to be linked
together since only one fat binary is supported. This is time consuming
for the AMDGPU backend since it only supports LTO.

There are use cases where objects can be divided into groups in which
device functions are self-contained but host functions are not. It is
desirable to link/optimize/codegen the device code and generate a fatbin
for each group, while partially linking the host code with `ld -r` or
generating a static library by using the `--emit-static-lib` option of
clang. This avoids linking all device code together and therefore decreases
the linking time for `-fgpu-rdc`.

Previously, clang emitted an external symbol `__hip_fatbin` for all
objects for `-fgpu-rdc`. With this patch, clang emits a unique external
symbol `__hip_fatbin_{cuid}` for the fat binary for each object. When a
group of objects is linked together to generate a fatbin, the symbols
are merged by alias and point to the same fat binary. Each group has its
own fat binary. One executable or shared library can have multiple fat
binaries. Device linking is done for undefined fat binary symbols only
to avoid repeated linking. `__hip_gpubin_handle` is also uniquified and
merged to avoid repeated registering. Symbol `__hip_cuid_{cuid}` is
introduced to facilitate debugging and tooling.

Fixes: https://github.com/llvm/llvm-project/issues/77018
2024-02-22 13:51:31 -05:00
Joseph Huber
cc374d8056
[OpenMP] Remove register_requires global constructor (#80460)
Summary:
Currently, OpenMP handles the `omp requires` clause by emitting a global
constructor into the runtime for every translation unit that requires
it. However, this is not a great solution because it prevents us from
having a defined order in which the runtime is accessed and used.

This patch changes the approach to no longer use global constructors,
but to instead group the flag with the other offloading entries that we
already handle. This has the effect of still registering each flag per
requires TU, but now we have a single constructor that handles
everything.

This patch removes support for the old `__tgt_register_requires` and
replaces it with a warning message. We just had a recent release, and
the OpenMP policy for the past four releases since we switched to LLVM
is that we do not provide strict backwards compatibility between major
LLVM releases now that the library is versioned. This means that a user
will need to recompile if they have an old binary that relied on
`register_requires` having the old behavior. It is important that we
actively deprecate this, as otherwise it would not solve the problem of
having no defined init and shutdown order for `libomptarget`. The
problem of `libomptarget` not having a defined init and shutdown order
cascades into a lot of other issues so I have a strong incentive to be
rid of it.

It is worth noting that the current `__tgt_offload_entry` only has space
for a 32-bit integer here. I am planning to overhaul these at some point
as well.
2024-02-21 11:33:32 -06:00
Chuanqi Xu
1ecbab56dc
[C++20] [Modules] Don't import non-inline function bodies even if it is marked as always_inline
Close https://github.com/llvm/llvm-project/issues/80949

Previously, I thought always-inline functions could be an exception, to
enable as many optimizations as possible. However, it looks like that
breaks the ABI requirement we discussed later. So it looks better to not
import non-inline function bodies at all, even if the function bodies are
marked as always_inline.

To some degree this does not cause regressions, since always_inline
still works within the same TU.
2024-02-18 15:15:28 +08:00