2172 Commits

Author SHA1 Message Date
Akira Hatanaka
8bd1f9116a
[CodeGen][arm64e] Add methods and data members to Address, which are needed to authenticate signed pointers (#67454)
To authenticate pointers, CodeGen needs access to the key and
discriminators that were used to sign the pointer. That information is
sometimes known from the context, but not always, which is why `Address`
needs to hold that information.

This patch adds methods and data members to `Address`, which will be
needed in subsequent patches to authenticate signed pointers, and uses
the newly added methods throughout CodeGen. Although this patch isn't
strictly NFC as it causes CodeGen to use different code paths in some
cases (e.g., `mergeAddressesInConditionalExpr`), it doesn't cause any
changes in functionality as it doesn't add any information needed for
authentication.

In addition to the changes mentioned above, this patch introduces class
`RawAddress`, which contains a pointer that we know is unsigned, and
adds several new functions for creating `Address` and `LValue` objects.
2024-03-25 18:05:42 -07:00
Alexandros Lamprineas
772e316457
[FMV] Allow multi versioning without default declaration. (#85454)
This was a limitation which has now been lifted. Please read the
thread below for more details:

https://github.com/llvm/llvm-project/pull/84405#discussion_r1525583647

Basically it allows to separate versioned implementations across
different TUs without having to share private header files which
contain the default declaration.

The ACLE spec has been updated accordingly to make this explicit:
"Each version declaration should be visible at the translation
 unit in which the corresponding function version resides."

https://github.com/ARM-software/acle/pull/310

If a resolver is required (because there is a caller in the TU),
then a default declaration is implicitly generated.
2024-03-25 09:43:41 +00:00
Alexandros Lamprineas
9cb5004209
Reland [FMV] Emit the resolver along with the default version definit… (#85923)
…ion.

This was reverted because the resolver didn't look as expected in one of
the tests. I believe it had some interaction with #84146. I have now
regenerated it using -target-feature -fp-armv8.
2024-03-20 16:49:51 +00:00
Alexandros Lamprineas
b7975cae7b
Revert "[FMV] Emit the resolver along with the default version definition." (#85914)
Reverts llvm/llvm-project#84405

In between of passing the precommit tests on github and being merged
some change (perhaps in the AArch64 backend?) landed which resulted
in altering the generated resolver. I will regenerate the tests
perhaps using a less sensitive runline to such changes.
2024-03-20 06:16:26 -04:00
Alexandros Lamprineas
e6b5bd5854
[FMV] Emit the resolver along with the default version definition. (#84405)
We would like the resolver to be generated eagerly, even if the
versioned function is not called from the current translation
unit. Fixes #81494. It further allows Multi Versioning to work
even if the default target version attribute is omitted from
function declarations.
2024-03-20 09:24:29 +00:00
ostannard
ef395a492a
[AArch64] Add soft-float ABI (#84146)
This is re-working of #74460, which adds a soft-float ABI for AArch64.
That was reverted because it causes errors when building the linux and
fuchsia kernels.

The problem is that GCC's implementation of the ABI compatibility checks
when using the hard-float ABI on a target without FP registers does it's
checks after optimisation. The previous version of this patch reported
errors for all uses of floating-point types, which is stricter than what
GCC does in practice.

This changes two things compared to the first version:
* Only check the types of function arguments and returns, not the types
of other values. This is more relaxed than GCC, while still guaranteeing
ABI compatibility.
* Move the check from Sema to CodeGen, so that inline functions are only
checked if they are actually used. There are some cases in the linux
kernel which depend on this behaviour of GCC.
2024-03-19 13:58:51 +00:00
Zaara Syeda
37b5eb0a0a
[AIX][TOC] Add -mtocdata/-mno-tocdata options on AIX (#67999)
This patch enables support that the XL compiler had for AIX under
-qdatalocal/-qdataimported.
2024-03-13 10:26:31 -04:00
Joseph Huber
630289f77d
[HIP] Do not include the CUID module hash with the new driver (#84332)
Summary:
The new driver does not need this hash and it can lead to redefined
symbol errors when the CUID hash isn't set.
2024-03-07 11:04:40 -06:00
Emma Pilkington
4490003a22
[AMDGPU] Rename COV module flag to amdhsa_code_object_version (#79905)
The previous name 'amdgpu_code_object_version', was misleading since
this is really a property of the HSA OS. The new spelling also matches
the asm directive I added in bc82cfb.
2024-03-06 09:51:48 -05:00
Alexandros Lamprineas
b42b7c8a12
[clang] Refactor target attribute mangling. (#81893)
Before this patch all of the 'target', 'target_version' and
'target_clones' attributes were sharing a common mangling logic across
different targets. However we would like to differenciate this logic,
therefore I have moved the default path to ABIInfo and provided
overrides for AArch64. This way we can resolve feature aliases without
affecting the name mangling. The PR #80540 demonstrates a motivating
case.
2024-02-28 17:49:59 +00:00
Florian Hahn
d2a9df2c8f
[TBAA] Handle bitfields when generating !tbaa.struct metadata. (#82922)
At the moment, clang generates what I believe are incorrect !tbaa.struct
fields for named bitfields. At the moment, the base type size is used
for named bifields (e.g. sizeof(int)) instead of the bifield width per
field. This results in overalpping fields in !tbaa.struct metadata.

This causes incorrect results when extracting individual copied fields
from !tbaa.struct as in added in dc85719d5.

This patch fixes that by skipping by combining adjacent bitfields
in fields with correct sizes.

Fixes https://github.com/llvm/llvm-project/issues/82586
2024-02-27 20:09:54 +00:00
gulfemsavrun
23f895f656
[InstrProf] Single byte counters in coverage (#75425)
This patch inserts 1-byte counters instead of an 8-byte counters into
llvm profiles for source-based code coverage. The origial idea was
proposed as block-cov for PGO, and this patch repurposes that idea for
coverage: https://groups.google.com/g/llvm-dev/c/r03Z6JoN7d4

The current 8-byte counters mechanism add counters to minimal regions,
and infer the counters in the remaining regions via adding or
subtracting counters. For example, it infers the counter in the if.else
region by subtracting the counters between if.entry and if.then regions
in an if statement. Whenever there is a control-flow merge, it adds the
counters from all the incoming regions. However, we are not going to be
able to infer counters by subtracting two execution counts when using
single-byte counters. Therefore, this patch conservatively inserts
additional counters for the cases where we need to add or subtract
counters.

RFC:
https://discourse.llvm.org/t/rfc-single-byte-counters-for-source-based-code-coverage/75685
2024-02-26 14:44:55 -08:00
Yaxun (Sam) Liu
33a6ce1837
[HIP] Allow partial linking for -fgpu-rdc (#81700)
`-fgpu-rdc` mode allows device functions call device functions in
different TU. However, currently all device objects have to be linked
together since only one fat binary is supported. This is time consuming
for AMDGPU backend since it only supports LTO.

There are use cases that objects can be divided into groups in which
device functions are self-contained but host functions are not. It is
desirable to link/optimize/codegen the device code and generate a fatbin
for each group, whereas partially link the host code with `ld -r` or
generate a static library by using the `--emit-static-lib` option of
clang. This avoids linking all device code together, therefore decreases
the linking time for `-fgpu-rdc`.

Previously, clang emits an external symbol `__hip_fatbin` for all
objects for `-fgpu-rdc`. With this patch, clang emits an unique external
symbol `__hip_fatbin_{cuid}` for the fat binary for each object. When a
group of objects are linked together to generate a fatbin, the symbols
are merged by alias and point to the same fat binary. Each group has its
own fat binary. One executable or shared library can have multiple fat
binaries. Device linking is done for undefined fab binary symbols only
to avoid repeated linking. `__hip_gpubin_handle` is also uniquefied and
merged to avoid repeated registering. Symbol `__hip_cuid_{cuid}` is
introduced to facilitate debugging and tooling.

Fixes: https://github.com/llvm/llvm-project/issues/77018
2024-02-22 13:51:31 -05:00
Joseph Huber
cc374d8056
[OpenMP] Remove register_requires global constructor (#80460)
Summary:
Currently, OpenMP handles the `omp requires` clause by emitting a global
constructor into the runtime for every translation unit that requires
it. However, this is not a great solution because it prevents us from
having a defined order in which the runtime is accessed and used.

This patch changes the approach to no longer use global constructors,
but to instead group the flag with the other offloading entires that we
already handle. This has the effect of still registering each flag per
requires TU, but now we have a single constructor that handles
everything.

This function removes support for the old `__tgt_register_requires` and
replaces it with a warning message. We just had a recent release, and
the OpenMP policy for the past four releases since we switched to LLVM
is that we do not provide strict backwards compatibility between major
LLVM releases now that the library is versioned. This means that a user
will need to recompile if they have an old binary that relied on
`register_requires` having the old behavior. It is important that we
actively deprecate this, as otherwise it would not solve the problem of
having no defined init and shutdown order for `libomptarget`. The
problem of `libomptarget` not having a define init and shutdown order
cascades into a lot of other issues so I have a strong incentive to be
rid of it.

It is worth noting that the current `__tgt_offload_entry` only has space
for a 32-bit integer here. I am planning to overhaul these at some point
as well.
2024-02-21 11:33:32 -06:00
Chuanqi Xu
1ecbab56dc [C++20] [Modules] Don't import non-inline function bodies even if it is marked as always_inline
Close https://github.com/llvm/llvm-project/issues/80949

Previously, I thought the always-inline function can be an exception to
enable optimizations as much as possible. However, it looks like it
breaks the ABI requirement we discussed later. So it looks better to not
import non-inline function bodies at all even if the function bodies are
marked as always_inline.

It doesn't produce regressions in some degree since the always_inline
still works in the same TU.
2024-02-18 15:15:28 +08:00
Prabhuk
ea9ec80b7a
Revert "[AArch64] Add soft-float ABI (#74460)" (#82032)
This reverts commit 9cc98e336980f00cbafcbed8841344e6ac472bdc.

Issue: https://github.com/ClangBuiltLinux/linux/issues/1997
2024-02-16 16:43:50 -08:00
ostannard
9cc98e3369
[AArch64] Add soft-float ABI (#74460)
This adds support for the AArch64 soft-float ABI. The specification for
this ABI was added by https://github.com/ARM-software/abi-aa/pull/232.

Because all existing AArch64 hardware has floating-point hardware, we
expect this to be a niche option, only used for embedded systems on
R-profile systems. We are going to document that SysV-like systems
should only ever use the base (hard-float) PCS variant:
https://github.com/ARM-software/abi-aa/pull/233. For that reason, I've
not added an option to select the ABI independently of the FPU hardware,
instead the new ABI is enabled iff the target architecture does not have
an FPU.

For testing, I have run this through an ABI fuzzer, but since this is
the first implementation it can only test for internal consistency
(callers and callees agree on the PCS), not for conformance to the ABI
spec.
2024-02-15 12:39:16 +00:00
Craig Topper
f45b9d987d
[RISCV] Add canonical ISA string as Module metadata in IR. (#80760)
In an LTO build, we don't set the ELF attributes to indicate what
extensions were compiled with. The target CPU/Attrs in
RISCVTargetMachine do not get set for an LTO build. Each function gets a
target-cpu/feature attribute, but this isn't usable to set ELF attributs
since we wouldn't know what function to use. We can't just once since it
might have been compiler with an attribute likes target_verson.

This patch adds the ISA as Module metadata so we can retrieve it in the
backend. Individual translation units can still be compiled with
different strings so we need to collect the unique set when Modules are
merged.

The backend will need to combine the unique ISA strings to produce a
single value for the ELF attributes. This will be done in a separate
patch.
2024-02-13 16:17:50 -08:00
Jon Roelofs
99d743320c
[clang][fmv] Drop .ifunc from target_version's entrypoint's mangling (#81194)
Fixes: https://github.com/llvm/llvm-project/issues/81043
2024-02-09 08:13:15 -08:00
Dani
6e3e8856d4
[NFC][Clang] Replace Arch with Triplet. (#80465) 2024-02-05 08:25:30 +01:00
Sander de Smalen
d313614b60
[AArch64] Replace LLVM IR function attributes for PSTATE.ZA. (#79166)
Since https://github.com/ARM-software/acle/pull/276 the ACLE
defines attributes to better describe the use of a given SME state.

Previously the attributes merely described the possibility of it being
'shared' or 'preserved', whereas the new attributes have more semantics
and also describe how the data flows through the program.

For ZT0 we already had to add new LLVM IR attributes:
* aarch64_new_zt0
* aarch64_in_zt0
* aarch64_out_zt0
* aarch64_inout_zt0
* aarch64_preserves_zt0

We have now done the same for ZA, such that we add:
* aarch64_new_za       (previously `aarch64_pstate_za_new`)
* aarch64_in_za (more specific variation of `aarch64_pstate_za_shared`)
* aarch64_out_za (more specific variation of `aarch64_pstate_za_shared`)
* aarch64_inout_za (more specific variation of
`aarch64_pstate_za_shared`)
* aarch64_preserves_za (previously `aarch64_pstate_za_shared,
aarch64_pstate_za_preserved`)

This explicitly removes 'pstate' from the name, because with SME2 and
the new ACLE attributes there is a difference between "sharing ZA"
(sharing
the ZA matrix register with the caller) and "sharing PSTATE.ZA" (sharing
either the ZA or ZT0 register, both part of PSTATE.ZA with the caller).
2024-02-01 13:37:37 +00:00
Jonas Paulsson
34dd8ec8ae
[clang, SystemZ] Support -munaligned-symbols (#73511)
When this option is passed to clang, external (and/or weak) symbols
are not assumed to have the minimum ABI alignment normally required.
Symbols defined locally that are not weak are however still given the
minimum alignment.

This is implemented by passing a new parameter to getMinGlobalAlign()
named HasNonWeakDef that is used to return the right alignment value.

This is needed when external symbols created from a linker script may
not get the ABI minimum alignment and must therefore be treated as
unaligned by the compiler.
2024-01-27 18:29:37 +01:00
Sander de Smalen
1652d44d8d
[Clang] Amend SME attributes with support for ZT0. (#77941)
This patch builds on top of #76971 and implements support for:
* __arm_new("zt0")
* __arm_in("zt0")
* __arm_out("zt0")
* __arm_inout("zt0")
* __arm_preserves("zt0")
2024-01-23 12:35:16 +01:00
Dani
1be0d9d7d8
[AArch64][Clang] Fix linker error for function multiversioning (#74358)
AArch64 part of https://github.com/llvm/llvm-project/pull/71706.

Default version is now mangled with .default.
Resolver for the TargetVersion need to be emitted from the
CodeGenModule::EmitMultiVersionFunctionDefinition.
2024-01-22 19:55:16 +01:00
bd1976bris
cd05ade13a
Add a "don't override" mapping for -fvisibility-from-dllstorageclass (#74629)
`-fvisibility-from-dllstorageclass` allows for overriding the visibility
of globals from their DLL storage class. The visibility to apply can be
customised for the different classes of globals via a set of dependent
options that specify the mapping values:
- `-fvisibility-dllexport=<value>`
- `-fvisibility-nodllstorageclass=<value>`
- `-fvisibility-externs-dllimport=<value>`
- `-fvisibility-externs-nodllstorageclass=<value>` 

Currently, one of the existing LLVM visibilities, `hidden`, `protected`,
`default`, can be used as a mapping value. This change adds a new
mapping value: `keep`, which specifies that the visibility should not be
overridden for that class of globals. The behaviour of
`-fvisibility-from-dllstorageclass` is otherwise unchanged and existing
uses of this set of options will be unaffected.

The background to this change is that currently the PS4 and PS5
compilers effectively ignore visibility - dllimport/export is the
supported method for export control in C/C++ source code. Now, we would
like to support visibility attributes and options in our frontend, in
addition to dllimport/export. To support this, we will override the
visibility of globals with explicit dllimport/export annotations but use
the `keep` setting for globals which do not have an explicit
dllimport/export.

There are also some minor improvements to the existing options:
- Make the `LANGOPS` `BENIGN` as they don't involve the AST.
- Correct/clarify the help text for the options.
2024-01-19 21:57:40 +00:00
Wang Pengcheng
3ac9fe69f7
[RISCV] CodeGen of RVE and ilp32e/lp64e ABIs (#76777)
This commit includes the necessary changes to clang and LLVM to support
codegen of `RVE` and the `ilp32e`/`lp64e` ABIs.

The differences between `RVE` and `RVI` are:
* `RVE` reduces the integer register count to 16(x0-x16).
* The ABI should be `ilp32e` for 32 bits and `lp64e` for 64 bits.

`RVE` can be combined with all current standard extensions.

The central changes in ilp32e/lp64e ABI, compared to ilp32/lp64 are:
* Only 6 integer argument registers (rather than 8).
* Only 2 callee-saved registers (rather than 12).
* A Stack Alignment of 32bits (rather than 128bits).
* ilp32e isn't compatible with D ISA extension.

If `ilp32e` or `lp64` is used with an ISA that has any of the registers
x16-x31 and f0-f31, then these registers are considered temporaries.

To be compatible with the implementation of ilp32e in GCC, we don't use
aligned registers to pass variadic arguments and set stack alignment\
to 4-bytes for types with length of 2*XLEN.

FastCC is also supported on RVE, while GHC isn't since there is only one
avaiable register.

Differential Revision: https://reviews.llvm.org/D70401
2024-01-16 20:44:30 +08:00
Sander de Smalen
8e7f073eb4
[Clang][AArch64] Change SME attributes for shared/new/preserved state. (#76971)
This patch replaces the `__arm_new_za`, `__arm_shared_za` and
`__arm_preserves_za` attributes in favour of:
* `__arm_new("za")`
* `__arm_in("za")`
* `__arm_out("za")`
* `__arm_inout("za")`
* `__arm_preserves("za")`

As described in https://github.com/ARM-software/acle/pull/276.

One change is that `__arm_in/out/inout/preserves(S)` are all mutually
exclusive, whereas previously it was fine to write `__arm_shared_za
__arm_preserves_za`. This case is now represented with `__arm_in("za")`.

The current implementation uses the same LLVM attributes under the hood,
since `__arm_in/out/inout` are all variations of "shared ZA", so can use
the existing `aarch64_pstate_za_shared` attribute in LLVM.

#77941 will add support for the new "zt0" state as introduced
with SME2.
2024-01-15 09:41:32 +00:00
Arthur Eubanks
f05b081214
[clang] Adjust -mlarge-data-threshold handling (#77958)
Make it apply to x86-64 medium and large code models since that's what
the backend does.

Limit logic to exclude x86-32.

Default to 0, let the driver set it to 65536 for the medium code model
if one is not passed. Set it to 0 for the large code model by default to
match gcc and since some users make assumptions about the large code
model that any small data will break.
2024-01-12 12:23:42 -08:00
John Brawn
40d5c2bcd4
[clang][AArch64] Add a -mbranch-protection option to enable GCS (#75486)
-mbranch-protection=gcs (enabled by -mbranch-protection=standard) causes
generated objects to be marked with the gcs feature. This is done via
the guarded-control-stack module flag, in a similar way to
branch-target-enforcement and sign-return-address.

Enabling GCS causes the GNU_PROPERTY_AARCH64_FEATURE_1_GCS bit to be set
on generated objects. No code generation changes are required, as GCS
just requires that functions are called using BL and returned from using
RET (or other similar variant instructions), which is already the case.
2024-01-11 12:53:23 +00:00
hev
4df5662907
[clang] Add per-global code model attribute (#72078)
This patch adds a per-global code model attribute, which can override
the target's code model to access global variables.

Currently, the code model attribute is only supported on LoongArch. This
patch also maps GCC's code model names to LLVM's, which allows for
better compatibility between the two compilers.


Suggested-by: Arthur Eubanks <aeubanks@google.com>
Link:
https://discourse.llvm.org/t/how-to-best-implement-code-model-overriding-for-certain-values/71816
Link:
https://discourse.llvm.org/t/rfc-add-per-global-code-model-attribute/74944

---------

Signed-off-by: WANG Rui <wangrui@loongson.cn>
2024-01-06 08:51:59 +08:00
Tomas Matheson
7bd17212ef Re-land "[AArch64] Codegen support for FEAT_PAuthLR" (#75947)
This reverts commit 9f0f5587426a4ff24b240018cf8bf3acc3c566ae.

Fix expensive checks failure by properly marking register def for ADR.
2023-12-21 18:32:55 +00:00
Tomas Matheson
9f0f558742 Revert "[AArch64] Codegen support for FEAT_PAuthLR"
This reverts commit 5992ce90b8c0fac06436c3c86621fbf6d5398ee5.

Builtbot failures with expensive checks enabled.
2023-12-21 16:25:55 +00:00
Tomas Matheson
5992ce90b8 [AArch64] Codegen support for FEAT_PAuthLR
- Adds a new +pc option to -mbranch-protection that will enable
  the use of PC as a diversifier in PAC branch protection code.

- When +pauth-lr is enabled (-march=armv9.5a+pauth-lr) in combination
  with -mbranch-protection=pac-ret+pc, the new 9.5-a instructions
  (pacibsppc, retaasppc, etc) are used.

Documentation for the relevant instructions can be found here:
https://developer.arm.com/documentation/ddi0602/2023-09/Base-Instructions/

Co-authored-by: Lucas Prates <lucas.prates@arm.com>
2023-12-21 14:18:33 +00:00
Dimitry Andric
2c27013fa9 [clang] Add getClangVendor() and use it in CodeGenModule.cpp (#75935)
In 9a38a72f1d482 `ProductId` was assigned from the stringified value of
`CLANG_VENDOR`, if that macro was defined. However, `CLANG_VENDOR` is
supposed to be a string, as it is defined (optionally) as such in the
top-level clang `CMakeLists.txt`.

Furthermore, `CLANG_VENDOR` is only passed as a build-time define when
compiling `Version.cpp`, so add a `getClangVendor()` function to
`Version.h`, and use it in `CodegGenModule.cpp`, instead of relying on
the macro.

Fixes: 9a38a72f1d482
2023-12-20 20:09:39 +01:00
Dimitry Andric
5c1a41f8ad Revert "[clang] Add getClangVendor() and use it in CodeGenModule.cpp (#75935)"
This reverts commit 9055519103eadfba0b48810be926883a71890c55, due to an
incorrectly chosen commit message.
2023-12-20 20:07:22 +01:00
Dimitry Andric
9055519103
[clang] Add getClangVendor() and use it in CodeGenModule.cpp (#75935)
In 9a38a72f1d482 `ProductId` was assigned from the stringified value of
`CLANG_VENDOR`, if that macro was defined. However, `CLANG_VENDOR` is
supposed to be a string, as it is defined (optionally) as such in the
top-level clang `CMakeLists.txt`.

Move the addition of `-DCLANG_VENDOR` to the compiler flags from
`clang/lib/Basic/CMakeLists.txt` to the top-level `CMakeLists.txt`, so
it is consistent across the whole clang codebase. Then remove the
stringification from `CodeGenModule.cpp`, to make it work correctly.

Fixes:		9a38a72f1d482
2023-12-20 20:03:19 +01:00
Kazu Hirata
f3dcc2351c
[clang] Use StringRef::{starts,ends}_with (NFC) (#75149)
This patch replaces uses of StringRef::{starts,ends}with with
StringRef::{starts,ends}_with for consistency with
std::{string,string_view}::{starts,ends}_with in C++20.

I'm planning to deprecate and eventually remove
StringRef::{starts,ends}with.
2023-12-13 08:54:13 -08:00
Mike Rice
0808be47b8
[NFC] Remove unneeded nullptr checks after cast<> (#74674)
Since VD is assigned from a cast<VarDecl> it cannot be a nullptr or it
would have asserted. Remove the subsequent checks to clear up any
misunderstanding.
2023-12-07 16:20:22 -08:00
elizabethandrews
cee5b8777f
[Clang] Fix linker error for function multiversioning (#71706)
Currently target_clones attribute results in a linker error when there
are no multi-versioned function declarations in the calling TU.

In the calling TU, the call is generated with the ‘normal’ assembly
name. This does not match any of the versions or the ifunc, since
version mangling includes a .versionstring, and the ifunc includes
.ifunc suffix. The linker error is not seen with GCC since the mangling
for the ifunc symbol in GCC is the ‘normal’ assembly name for function
i.e. no ifunc suffix.

This PR removes the .ifunc suffix to match GCC. It also adds alias with
the .ifunc suffix so as to ensure backward compatibility.

The changes exclude aarch64 target because the mangling for default
versions on aarch64 does not include a .default suffix and is the
'normal' assembly name, unlike other targets. It is not clear to me what
the correct behavior for this target is.

Old Phabricator review - https://reviews.llvm.org/D158666

---------

Co-authored-by: Tom Honermann <tom@honermann.net>
2023-12-05 18:11:53 -05:00
Momchil Velikov
092507a730
[clang][AArch64] Pass down stack clash protection options to LLVM/Backend (#68993) 2023-11-30 11:18:02 +00:00
Brendan Dahl
c6d70722b4
[clang][CodeGen] Emit annotations for function declarations. (#66716)
Previously, annotations were only emitted for function definitions. With
this change annotations are also emitted for declarations. Also,
emitting function annotations is now deferred until the end so that the
most up to date declaration is used which will have any inherited
annotations.
2023-11-29 15:13:30 -08:00
Dominik Adamski
95943d2fab [Flang] Add code-object-version option (#72638)
Information about code object version can be configured by the user for
AMD GPU target and it needs to be placed in LLVM IR generated by Flang.

Information about code object version in MLIR generated by the parser
can be reused by other tools. There is no need to specify extra flags if
we want to invoke MLIR tools (like fir-opt) separately.

Changes in comparison to a8ac93:
 * added information about required targets for test
   flang/test/Driver/driver-help.f90
2023-11-29 03:01:01 -06:00
Dominik Adamski
f00ffcdb58 Revert "[Flang] Add code-object-version option (#72638)"
This commit causes test errors on buildbots.

This reverts commit a8ac930b99d93b2a539ada7e566993d148899144.
2023-11-28 13:18:46 -06:00
Dominik Adamski
a8ac930b99
[Flang] Add code-object-version option (#72638)
Information about code object version can be configured by the user for
AMD GPU target and it needs to be placed in LLVM IR generated by Flang.

Information about code object version in MLIR generated by the parser
can be reused by other tools. There is no need to specify extra flags if
we want to invoke MLIR tools (like fir-opt) separately.
2023-11-28 19:57:36 +01:00
Yusra Syeda
9a38a72f1d
[SystemZ][z/OS] This change adds support for the PPA2 section in zOS (#68926)
This PR adds support for the PPA2 fields.

---------

Co-authored-by: Yusra Syeda <yusra.syeda@ibm.com>
2023-11-27 16:30:12 -05:00
Youngsuk Kim
ed73121ffe [CodeGenModule] Remove no-op ptr-to-ptr bitcasts (NFC)
Opaque ptr cleanup effort (NFC)
2023-11-20 16:21:21 -06:00
Fangrui Song
9c1c56a803 -fcoverage-mapping: simplify. NFC 2023-11-13 15:16:58 -08:00
Yaxun (Sam) Liu
9774d0ce5f
[CUDA][HIP] Make template implicitly host device (#70369)
Added option -foffload-implicit-host-device-templates which is off by
default.

When the option is on, template functions and specializations without
host/device attributes have implicit host device attributes.

They can be overridden by device template functions with the same
signagure.
They are emitted on device side only if they are used on device side.

This feature is added as an extension.
`__has_extension(cuda_implicit_host_device_templates)` can be used to
check whether it is enabled.

This is to facilitate using standard C++ headers for device.

Fixes: https://github.com/llvm/llvm-project/issues/69956

Fixes: SWDEV-428314
2023-11-09 20:36:38 -05:00
Chuanqi Xu
0f7aaeb324 [C++20] [Modules] Allow export from language linkage
Close https://github.com/llvm/llvm-project/issues/71347

Previously I misread the concept of module purview. I thought if a
declaration attached to a unnamed module, it can't be part of the module
purview. But after the issue report, I recognized that module purview is
more of a concept about locations instead of semantics.

Concretely, the things in the language linkage after module declarations
can be exported.

This patch refactors `Module::isModulePurview()` and introduces some
possible code cleanups.
2023-11-09 17:44:41 +08:00
Chuanqi Xu
8ee9da0232
[C++20] [Modules] Don't import function bodies from other module units even with optimizations (#71031)
Close https://github.com/llvm/llvm-project/issues/60996.

Previously, clang will try to import function bodies from other module
units to get more optimization oppotunities as much as possible. Then
the motivation becomes the direct cause of the above issue.

However, according to the discussion in SG15, the behavior of importing
function bodies from other module units breaks the ABI compatibility. It
is unwanted. So the original behavior of clang is incorrect. This patch
choose to not import function bodies from other module units in all
cases to follow the expectation.

Note that the desired optimized BMI idea is discarded too. Since it will
still break the ABI compatibility after we import function bodies
seperately.

The release note will be added seperately.

There is a similar issue for variable definitions. I'll try to handle
that in a different commit.
2023-11-07 23:04:45 +08:00