16346 Commits

Author SHA1 Message Date
Shilei Tian
e997dca333
[OpenMP] Introduce the initial support for OpenMP kernel language (#66844)
This patch starts the support for OpenMP kernel language, basically to write
OpenMP target regions in SIMT style, similar to kernel languages such as CUDA.
What is included in this first patch is the `ompx_bare` clause for the
`target teams` directive. When `ompx_bare` exists, globalization is disabled
such that local variables will not be globalized. The runtime init/deinit
function calls will not be emitted. That being said, almost all OpenMP
executable directives, such as parallel and task, are not supported in the
region. This patch doesn't include the Sema checks for that, so the use of
them is UB. Simple directives, such as atomic, can be used. We provide a set
of APIs (for C, they are prefixed with `ompx_`; for C++, they are in the
`ompx` namespace) to get thread id, block id, etc.
For more details, you can refer to
https://tianshilei.me/wp-content/uploads/llvm-hpc-2023.pdf.
2023-09-29 13:11:09 -04:00
Pavel Iliin
8ec50d6446 [AArch64] Fix FMV ifunc resolver usage on old Android APIs. Rename internal compiler-rt FMV functions.
The patch fixes Function Multi Versioning features detection by ifunc
resolver on Android API levels < 30.
Ifunc hwcaps parameters are not supported on Android API levels 23-29,
so all CPU features are set unsupported if they were not initialized
before ifunc resolver call.
There is no support for ifunc on Android API levels < 23, so Function
Multi Versioning is disabled in this case.

Also use two underscore prefix for FMV runtime support functions to
avoid conflict with user program ones.

Differential Revision: https://reviews.llvm.org/D158641
2023-09-29 17:10:48 +01:00
Jan Svoboda
8a2fb1391b
[clang] NFCI: Use FileEntryRef in SourceManager::FileInfos (#67742) 2023-09-29 08:04:34 -07:00
Chuanqi Xu
cbbe555904 [C++20] [Modules] Generate init calls for the modules imported in GMF or PMF
I just found that we didn't handle the imports in GMF or PMF when we're
generating the init functions for the current module unit. This looks
like a simple oversight and I'm going to fix that in this patch
directly.
2023-09-29 22:16:31 +08:00
Chuanqi Xu
7e8a0e4bdc [NFC] [C++20] [Modules] Rename NamedModuleHasInit to NamedModuleHasInit
Address comments in
https://github.com/llvm/llvm-project/pull/67638/files#r1340342453 to
rename the field variable.
2023-09-29 21:49:10 +08:00
Jakub Chlanda
3f8d4a8ef2
Reland [NVPTX] Add support for maxclusterrank in launch_bounds (#66496) (#67667)
This reverts commit 0afbcb20fd908f8bf9073697423da097be7db592.
2023-09-29 08:39:31 +02:00
Chuanqi Xu
989173c09c
[C++20] [Modules] Don't generate call to an imported module that doesn't init anything (#67638)
Close https://github.com/llvm/llvm-project/issues/56794

And see https://github.com/llvm/llvm-project/issues/67582 for a detailed
background for the issue.

As required by the Itanium ABI, the module units have to generate the
initialization function. However, the importers are allowed to elide the
call to the initialization function if they are sure the initialization
function doesn't do anything.

This patch implements these semantics.
2023-09-28 23:29:24 +08:00
Nikita Popov
fb2bdbb83d [CodeGen] Avoid use of ConstantExpr::getZExt() (NFC)
Use the constant folding API instead. In preparation for dropping
zext constant expressions.
2023-09-28 16:45:31 +02:00
Chuanqi Xu
9744909a12 [NFC] [C++20] [Modules] Refactor Module::getGlobalModuleFragment and Module::getPrivateModuleFragment
The original implementation of `Module::getGlobalModuleFragment` and
`Module::getPrivateModuleFragment` tried to find the global module
fragment and the private module fragment by comparing strings, which
smells bad. This patch tries to improve this.
2023-09-28 14:06:02 +08:00
Fangrui Song
0d8b864829 CGBuiltin: emit llvm.abs.* instead of neg+icmp+select for abs
instcombine will combine neg+icmp+select to llvm.abs.*. Let's just emit
llvm.abs.* in the first place.
2023-09-27 21:29:56 -07:00
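As a rough illustration of the equivalence the `llvm.abs.*` commit above relies on, the sketch below writes the negate/compare/select pattern out by hand and compares it with `std::abs`. The helper names `abs_via_select` and `check_abs_via_select` are hypothetical, and the sketch does not show the IR clang actually emits.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical helper: the neg + icmp + select shape written out in C++.
// Like abs itself, it is undefined for INT_MIN because the negation overflows.
inline int abs_via_select(int x) {
    int negated = -x;                 // neg
    bool is_negative = x < 0;         // icmp slt x, 0
    return is_negative ? negated : x; // select
}

// Usage: both forms agree on ordinary inputs.
inline void check_abs_via_select() {
    const int samples[] = {-7, 0, 42};
    for (int v : samples) {
        assert(abs_via_select(v) == std::abs(v));
    }
}
```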
Sam McCall
0afbcb20fd Revert "[NVPTX] Add support for maxclusterrank in launch_bounds (#66496)"
This reverts commit dfab31b41b4988b6dc8129840eba68f0c36c0f13.

SemaDeclAttr.cpp cannot depend on Basic's private headers
(lib/Basic/Targets/NVPTX.h)
2023-09-27 10:59:04 +02:00
Jakub Chlanda
dfab31b41b
[NVPTX] Add support for maxclusterrank in launch_bounds (#66496)
Since SM_90, CUDA supports specifying an additional argument to the
launch_bounds attribute, maxBlocksPerCluster, to express the maximum
number of CTAs that can be part of the cluster. See
https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#cluster-dimension-directives-maxclusterrank
and
https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#launch-bounds
for details.
2023-09-27 08:51:26 +02:00
Zequan Wu
4d5d9a5390 Revert "[Coverage] Allow Clang coverage to be used with debug info correlation."
This reverts commit 32db121b29f78e4c41116b2a8f1c730f9522b202 and subsequent commits.

This causes a time regression in llvm-cov even with debug info correlation off.
2023-09-26 20:57:09 -04:00
Arthur Eubanks
a42787d108
[clang] Add -mlarge-data-threshold for x86_64 medium code model (#66839)
Error if not used with x86_64.
Warn if not used with the medium code model (can update if other code
models end up using this).

Set TargetMachine option and add module flag.
2023-09-26 09:44:31 -07:00
Phoebe Wang
31631d307f
[X86][FP16] Add missing handling for FP16 constrained cmp intrinsics (#67400) 2023-09-26 19:27:57 +08:00
Qiu Chaofan
3e97db89ae [PowerPC] Emit IR module flag for current float abi
This is part of the effort to add .gnu_attribute support for PowerPC.
In Clang, an extra metadata field will be added as float-abi to show the
current long double format, so that the backend can emit .gnu_attribute
section data from this metadata.

To avoid breaking existing behavior, the module metadata will only be
emitted when this module makes use of long double.

Reviewed By: nemanjai

Differential Revision: https://reviews.llvm.org/D116016
2023-09-25 17:53:39 +08:00
Björn Pettersson
b4858c634e
[clang][CodeGen] Simplify code based on opaque pointers (#65624)
- Update CodeGenTypeCache to use a single union for all pointers in
  address space zero.
- Introduce a UnqualPtrTy in CodeGenTypeCache, and use that (for
  example instead of llvm::PointerType::getUnqual) in some places.
- Drop some redundant bit/pointers casts from ptr to ptr.
2023-09-25 11:21:24 +02:00
Carlos Eduardo Seo
7523550853
[Clang][CodeGen] Add __builtin_bcopy (#67130)
Add __builtin_bcopy to the list of GNU builtins. Its absence was causing a
series of test failures in glibc.

Adjust the tests to reflect the changes in codegen.

Fixes #51409.
Fixes #63065.
2023-09-24 11:58:14 -03:00
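For readers unfamiliar with `bcopy`, here is a minimal sketch of calling the builtin from the `__builtin_bcopy` commit above. The wrapper names `copy_bytes` and `check_copy_bytes` are hypothetical, and the `memmove` fallback is an assumption added so the sketch also compiles on compilers that lack the builtin.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Portability guard: treat the builtin as missing when __has_builtin is absent.
#ifndef __has_builtin
#define __has_builtin(x) 0
#endif

// Hypothetical wrapper. Note bcopy's (source, destination, length) order,
// which is the reverse of memcpy/memmove.
inline void copy_bytes(const void *src, void *dst, std::size_t n) {
#if __has_builtin(__builtin_bcopy)
    __builtin_bcopy(src, dst, n);
#else
    std::memmove(dst, src, n); // assumed fallback, also overlap-safe
#endif
}

// Usage: an overlapping copy of "abcd" two bytes to the right.
inline void check_copy_bytes() {
    char buf[] = "abcdef";
    copy_bytes(buf, buf + 2, 4);
    assert(std::strcmp(buf, "ababcd") == 0);
}
```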
Umesh Kalappa
2641d9b280 Propagate the volatile qualifier of exp to store/load operations.
This change addresses PR 55207.

We update the volatility on the LValue by looking at the LHS cast operation qualifier and propagate the RValue volatileness from the CGF data structure.

Reviewed By: rjmccall

Differential Revision: https://reviews.llvm.org/D157890
2023-09-23 19:40:24 +05:30
Bruno Cardoso Lopes
34415fd611
[Clang][LLVM][Coroutines] Prevent __coro_gro from outliving __promise (#66706)
When dealing with short-circuiting coroutines (e.g. expected), the
deferred calls that resolve the get_return_object are currently being
emitted after we delete the coroutine frame.

This was caught by ASAN when using optimizations -O1 and above:
optimizations after inlining would place the __coro_gro in the heap, and
subsequent delete of the coroframe followed by the conversion -> BOOM.

This patch forbids the GRO to be placed in the coroutine frame, by
adding a new metadata node that can be attached to `alloca`
instructions.

Fix #49843
2023-09-21 22:52:05 -07:00
Amy Huang
03c698a431
[MSVC, ARM64] Add _Copy* and _Count* intrinsics (#66554)
Implement the _Count* and _Copy* Windows ARM intrinsics:

```
double _CopyDoubleFromInt64(__int64)
float _CopyFloatFromInt32(__int32)
__int32 _CopyInt32FromFloat(float)
__int64 _CopyInt64FromDouble(double)
unsigned int _CountLeadingOnes(unsigned long)
unsigned int _CountLeadingOnes64(unsigned __int64)
unsigned int _CountLeadingSigns(long)
unsigned int _CountLeadingSigns64(__int64)
unsigned int _CountLeadingZeros(unsigned long)
unsigned int _CountLeadingZeros64(unsigned __int64)
unsigned int _CountOneBits(unsigned long)
unsigned int _CountOneBits64(unsigned __int64)
```

Full list of intrinsics here:
[https://learn.microsoft.com/en-us/cpp/intrinsics/arm64-intrinsics](https://learn.microsoft.com/en-us/cpp/intrinsics/arm64-intrinsics)

Bug: [65405](https://github.com/llvm/llvm-project/issues/65405)
2023-09-21 14:34:59 -07:00
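To make the intent of a few of the `_Copy*` and `_Count*` intrinsics listed above concrete, the following is a portable sketch of their semantics as understood here, not the MSVC or clang implementation. All `sketch_` names are hypothetical, `unsigned long` is modelled as a 32-bit integer as on Windows, a result of 32 for a zero input to the leading-zero count is an assumption, and `_CountLeadingSigns` is left out.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Reinterpret the bits of a 32-bit integer as a float (no numeric conversion).
inline float sketch_copy_float_from_int32(std::int32_t bits) {
    static_assert(sizeof(float) == sizeof(std::int32_t), "float must be 32 bits");
    float f;
    std::memcpy(&f, &bits, sizeof f);
    return f;
}

// The inverse direction: expose a float's bit pattern as an integer.
inline std::int32_t sketch_copy_int32_from_float(float f) {
    std::int32_t bits;
    std::memcpy(&bits, &f, sizeof bits);
    return bits;
}

// Count zero bits above the highest set bit; assumed to be 32 for input 0.
inline unsigned sketch_count_leading_zeros(std::uint32_t v) {
    unsigned n = 0;
    for (std::uint32_t mask = 0x80000000u; mask != 0 && (v & mask) == 0; mask >>= 1) {
        ++n;
    }
    return n;
}

// Leading ones are the leading zeros of the complement.
inline unsigned sketch_count_leading_ones(std::uint32_t v) {
    return sketch_count_leading_zeros(~v);
}

// Population count: clear the lowest set bit until none remain.
inline unsigned sketch_count_one_bits(std::uint32_t v) {
    unsigned n = 0;
    for (; v != 0; v &= v - 1) {
        ++n;
    }
    return n;
}

// Usage, assuming IEEE 754 single precision for the bit pattern of 1.0f.
inline void check_sketch_intrinsics() {
    assert(sketch_copy_float_from_int32(0x3F800000) == 1.0f);
    assert(sketch_copy_int32_from_float(1.0f) == 0x3F800000);
    assert(sketch_count_leading_zeros(1u) == 31);
    assert(sketch_count_leading_ones(0xF0000000u) == 4);
    assert(sketch_count_one_bits(0xFFu) == 8);
}
```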
Fangrui Song
9ee65a7618 Revert "[Coverage] Fix -Wswitch after D138847"
This reverts commit ca22d6e40508f6d24a9352835bda9c152e3eee1b.

The base patch 618a22144db5e45da8c95dc22064103e1b5e5b71 has been reverted.
2023-09-20 14:45:19 -07:00
Fangrui Song
ca22d6e405 [Coverage] Fix -Wswitch after D138847 2023-09-20 14:20:58 -07:00
Alex Voicu
de018f5ca4
[clang][CodeGen] The eh_typeid_for intrinsic needs special care too (#65699)
This change is symmetric with the one reviewed in
<https://reviews.llvm.org/D157452> and handles the exception handling
specific intrinsic, which slipped through the cracks, in the same way,
by inserting an address-space cast iff RTTI is in a non-default AS.
2023-09-20 17:12:19 +01:00
Juan Manuel Martinez Caamaño
69183f8eb9
[NFC][Clang] Address reviews about overrideFunctionFeaturesWithTargetFeatures (#65938)
Addressing remarks after merge of D159257

* Add comment
* Remove irrelevant CHECKs from test
* Simplify function
* Use llvm::sort before setting target-features, as is done in
CodeGenModule
2023-09-20 13:37:13 +02:00
Zequan Wu
816144bfd2 [Coverage] Skip visiting ctor member initializers with invalid source locations. 2023-09-19 14:59:41 -04:00
Zahira Ammarguellat
a292e7edf8
Fix math-errno issue (#66381)
Update handling of math errno. This change updates the logic for
generating math intrinsics in place of math library function calls.
The previous logic (https://reviews.llvm.org/D151834) was incorrectly
using intrinsics when math errno handling was needed at optimization
levels above -O0.
This also fixes the issue mentioned in https://reviews.llvm.org/D151834 by
@uabelho.
This is joint work with Andy (@andykaylor).
2023-09-19 09:13:02 -04:00
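The math-errno fix above concerns code that depends on library calls reporting errors through `errno`, which an intrinsic would not do. A minimal sketch of such code follows; the helper name `sqrt_sets_errno` is hypothetical, and whether `errno` is set at all depends on the platform's `math_errhandling`.

```cpp
#include <cerrno>
#include <cmath>

// Hypothetical helper: returns true when std::sqrt reported a domain error
// through errno. Replacing the library call with an intrinsic would lose this.
inline bool sqrt_sets_errno(double x) {
    errno = 0;
    volatile double input = x;                 // keep the call from being folded
    volatile double result = std::sqrt(input); // library call may set errno
    (void)result;
    return errno == EDOM;
}
```

On platforms where `math_errhandling & MATH_ERRNO` is nonzero, `sqrt_sets_errno(-1.0)` is expected to return true, while a valid input such as `4.0` leaves `errno` untouched.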
Louis Dionne
a52560c8dd [clang] Remove spurious trailing whitespace 2023-09-15 17:26:16 -04:00
Zequan Wu
0b8df841f9
[Coverage] Add coverage for constructor member initializers. (#66441)
Previously, constructor member initializers were shown as not covered. This
adds coverage info for them.
2023-09-15 17:06:04 -04:00
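As an example of the kind of code the constructor member initializer coverage commit above is about, the sketch below has a constructor whose initializer list runs real logic, including a branch. The `Clamp` type is hypothetical and only illustrates where such code lives.

```cpp
#include <cassert>

// Hypothetical type: all of the interesting work happens in the member
// initializer list rather than in the (empty) constructor body.
struct Clamp {
    int low;
    int value;
    Clamp(int lo, int v) : low(lo), value(v < lo ? lo : v) {}
};

// Usage: exercise both sides of the conditional in the initializer.
inline void check_clamp() {
    assert(Clamp(0, -5).value == 0);
    assert(Clamp(0, 7).value == 7);
}
```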
Zequan Wu
32db121b29 [Coverage] Allow Clang coverage to be used with debug info correlation.
Debug info correlation is an option in the InstrProfiling pass, which is used by
both IR instrumentation and front-end instrumentation. So Clang coverage can
also benefit from the binary size savings it provides.

Reviewed By: ellis

Differential Revision: https://reviews.llvm.org/D157913
2023-09-15 13:47:23 -04:00
Anton Korobeynikov
51d5d7bbae
Extend retcon.once coroutines lowering to optionally produce a normal result (#66333)
One of the main users of this kind of coroutine is Swift. There, yield-once (`retcon.once`) coroutines are used to temporarily "expose" pointers to internal fields of various objects, creating borrow scopes.

However, in some cases it might also be useful to allow these coroutines to produce a normal result, but there is no convenient way to represent this (as compared to switched-resume coroutines, where C++ `co_return` is transformed to a member / callback call on the promise object).

The extension is simple: we allow the continuation function to have a non-void result and accept optional extra arguments via a special `llvm.coro.end.result` intrinsic that essentially forwards them as normal results.
2023-09-15 09:54:38 -07:00
Arthur Eubanks
0a1aa6cda2
[NFC][CodeGen] Change CodeGenOpt::Level/CodeGenFileType into enum classes (#66295)
This will make it easy for callers to see issues with and fix up calls
to createTargetMachine after a future change to the params of
TargetMachine.

This matches other nearby enums.

For downstream users, this should be a fairly straightforward replacement,
e.g. s/CodeGenOpt::Aggressive/CodeGenOptLevel::Aggressive
or s/CGFT_/CodeGenFileType::
2023-09-14 14:10:14 -07:00
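The motivation in the enum class commit above can be sketched with stand-in types. The enums and the `describe` function below are hypothetical mirrors of the names in the message, not the LLVM definitions; they only show why scoped enums make mistaken calls visible at compile time.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-ins that mirror the names in the commit message.
enum class CodeGenOptLevel { None, Less, Default, Aggressive };
enum class CodeGenFileType { AssemblyFile, ObjectFile, Null };

// Scoped enums do not convert to int implicitly, so passing a raw integer or
// swapping the two arguments of describe() below is a compile error.
static_assert(!std::is_convertible<CodeGenOptLevel, int>::value,
              "enum class values need an explicit cast");

// Hypothetical function taking both enums, standing in for a call site whose
// parameter list might change later.
inline int describe(CodeGenOptLevel level, CodeGenFileType type) {
    return static_cast<int>(level) * 10 + static_cast<int>(type);
}

// Usage: callers must spell out the scoped enumerator names.
inline void check_describe() {
    assert(describe(CodeGenOptLevel::Aggressive, CodeGenFileType::ObjectFile) == 31);
}
```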
Yaxun (Sam) Liu
d7e1932f85
[HIP] Fix comdat of template kernel handle (#66283)
Currently, clang emits LLVM IR that fails the verifier for the following
code:

```
template<typename T>
__global__ void foo(T x);

void bar() {
  foo<<<1, 1>>>(0);
}
```
This is due to clang putting the kernel handle for foo into comdat,
which is not allowed, since the kernel handle is a declaration.

The situation is similar to calling a declaration-only template
function. The callee will be a declaration in LLVM IR and won't be put
into comdat. This is in contrast to calling a template function with a
body, which will be put into comdat.

Fixes: SWDEV-419769
2023-09-14 15:56:02 -04:00
Matt Arsenault
ddc3346a6b
clang/AMDGPU: Fix accidental behavior change for __builtin_amdgcn_ldexph (#66340) 2023-09-14 18:15:44 +03:00
Sergio Afonso
9058762789
[OpenMP][Flang][MLIR] Lowering of requires directive from MLIR to LLVM IR
Default atomic ordering information is processed in the OpenMP dialect
to LLVM IR lowering stage at every spot where an operation can be
affected by it. The rest of clauses are stored globally in the
OpenMPIRBuilderConfig object before starting that lowering stage, so
that the OMPIRBuilder can conditionally modify code generation
depending on these. At the end of the process, the omp.requires
attribute is itself lowered into a global constructor that passes these
clauses as flags to the OpenMP runtime.

Depends on D147217, D147218 and D158278.

Differential Revision: https://reviews.llvm.org/D147219
2023-09-14 10:35:44 +01:00
Sergio Afonso
094a63a20b
[OpenMP][OMPIRBuilder] OpenMPIRBuilder support for requires directive
This patch updates the `OpenMPIRBuilderConfig` structure to hold all
available 'requires' clauses, and it replicates part of the code
generation for the 'requires' registration function from clang in the
`OMPIRBuilder`, to be used with flang.

Porting the rest of features of the clang implementation to the IRBuilder
and sharing it between clang and flang remains for a future patch, due to the
complexity of the logic selecting the attributes of the generated
registration function.

Differential Revision: https://reviews.llvm.org/D147217
2023-09-14 10:33:54 +01:00
Reid Kleckner
c8c075e876
[MS] Follow up fix to pass aligned args to variadic x86_32 functions (#65692)
MSVC allows users to pass structures with required alignments greater
than 4 to variadic functions. It does not pass them indirectly to
correctly align them. Instead, it passes them directly with the usual 4
byte stack alignment.

This change implements the same logic in clang on the passing side. The
receiving side (va_arg) never implemented any of this indirect logic, so
it doesn't need to be updated.

This issue pre-existed, but @aaron.ballman noticed it when we started
passing structs containing aligned fields indirectly in D152752.
2023-09-13 16:29:11 -07:00
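The source pattern behind the variadic alignment fix above looks roughly like the sketch below: a struct with an explicit alignment requirement passed by value through `...` and read back with `va_arg`. The `Aligned8` type and `sum_values` function are hypothetical, and the sketch shows only the source-level shape, not the x86_32 stack layout the commit changes.

```cpp
#include <cassert>
#include <cstdarg>

// Hypothetical struct with a required alignment greater than 4.
struct alignas(8) Aligned8 {
    double value;
    int tag;
};

// Receiving side: read `count` Aligned8 arguments by value via va_arg.
inline double sum_values(int count, ...) {
    va_list args;
    va_start(args, count);
    double total = 0.0;
    for (int i = 0; i < count; ++i) {
        Aligned8 item = va_arg(args, Aligned8);
        total += item.value;
    }
    va_end(args);
    return total;
}

// Usage: the passing side hands the structs over directly.
inline void check_sum_values() {
    Aligned8 a = {1.5, 1};
    Aligned8 b = {2.5, 2};
    assert(sum_values(2, a, b) == 4.0);
}
```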
Joshua Cranmer
bf49237103 [Clang] Enable -print-pipeline-passes in clang.
Reviewed By: arsenm, aeubanks

Differential Revision: https://reviews.llvm.org/D127221
2023-09-13 08:57:10 -07:00
CarolineConcatto
ee31ba0dd9
[AArch64][SME]Update intrinsic interface for ld1/st1 (#65582)
The new ACLE PR#225[1] now combines the slice parameters for some
builtins.
Slice specifies the ZA slice number directly and needs to be explicitly
implemented by the "user" with the base register plus the immediate
offset.

[1]https://github.com/ARM-software/acle/pull/225/files
2023-09-13 15:24:09 +01:00
Joseph Huber
1b7a095e27
[Clang][AMDGPU] Permit language address spaces for AMDGPU globals (#66205)
Summary:
Currently, there is an assertion that prevents us from emitting an
AMDGPU global with a non-target specific address space (i.e. numerical
attribute). I'm unsure what the original intentions of this assertion
were, but we should be able to use OpenCL address spaces when compiling
directly to AMDGPU from C++. This is permitted on NVPTX so I'm unsure
what this assertion is guarding. The patch simply removes the assertion
and adds a test to ensure that these emit the expected address spaces.

Fixes https://github.com/llvm/llvm-project/issues/65069
2023-09-13 08:43:01 -05:00
Joseph Huber
49ff6a96a7
[Clang] Define AMDGPU ABI when referenced in CodeGen for ABI "none" (#66162)
Summary:
We use the `llvm.amdgcn.abi.version` variable to control code generation.
This is emitted in every module now to indicate what should be used when
compiling. Previously, the logic caused us to emit an external reference
to this variable when creating the code for the `none` type. This would
then cause us not to emit the actual definition. This patch refines the
logic to create the external reference, and then update it if it is
found unset by the time we emit the global. I had to remove the
reference to `GetOrCreateLLVMGlobal` because it did not accept the
proper address space.
2023-09-13 08:31:31 -05:00
Benjamin Kramer
88b7e06dcf Revert "[clang][CodeGen] Emit annotations for function declarations."
This reverts commit c6a33ff49dfb3498dae15c718820ea3d9c19f3cb. Makes
clang segfault.

// clang t.cc
class a;
class c {
 public:
  [[clang::annotate("")]] c(const c *) {}
};
class d {
  d(const c *, a *, a *);
  c e;
};
d::d(const c *f, a *, a *) : e(f) {}
2023-09-13 13:22:57 +02:00
Aaron Jarmusch
131ba0ae01 Revert "[Clang][OpenMP] Clang adding the addrSpace according to DataLayout fix (#65483)"
This reverts commit e831a32c93c1ab404785773cc7c08c01730d61e5.
2023-09-12 22:46:09 +00:00
Aaron Jarmusch
e3298bb275 fixup! [Clang][OpenMP] Clang adding the addrSpace according to DataLayout fix (#65483) 2023-09-12 20:52:33 +00:00
Brendan Dahl
c6a33ff49d [clang][CodeGen] Emit annotations for function declarations.
Previously, annotations were only emitted for function definitions. With
this change annotations are also emitted for declarations. Also, emitting
function annotations is now deferred until the end so that the most
up to date declaration is used which will have any inherited annotations.

Differential Revision: https://reviews.llvm.org/D156172/new/
2023-09-12 13:07:55 -07:00
Aaron Jarmusch
e831a32c93
[Clang][OpenMP] Clang adding the addrSpace according to DataLayout fix (#65483)
Fix for an issue where clang was not adding the address space according
to the data layout and was instead using the default, which sometimes
resulted in a crash. The fix includes changes to the cases of
LargeCapMemAlloc and CGroupMemAlloc, where we now set the AddrSpace
according to the DataLayout.
2023-09-12 15:44:39 -04:00
CarolineConcatto
dc8d2ecc5e
[AArch64][SME]Update intrinsic interface for read/write (#65594)
The new ACLE PR#225[1] now combines the slice parameters for some
builtins. This patch is #2 of 3 patches to update the interface.

Slice specifies the ZA slice number directly and needs to be explicitly
implemented by the "user" with the base register plus the immediate
offset.

[1]https://github.com/ARM-software/acle/pull/225/files
2023-09-12 18:08:57 +01:00
CarolineConcatto
7b8d4eff02
[AArch64][SME]Update intrinsic interface for ldr/str (#65593)
The new ACLE PR#225[1] now combines the slice parameters for some
builtins. 

[1]https://github.com/ARM-software/acle/pull/225/files
2023-09-12 17:31:51 +01:00
Adrian Prantl
167acac417
Propagate the DWARF version from the main compiler invocation to PCHContainerGenerator (#66032)

Currently it remains uninitialized and thus always uses the LLVM default
of 4.
2023-09-12 08:31:27 -07:00
Max Iyengar
dbeb3d029d Add missing vrnd intrinsics
This patch adds 8 missing intrinsics as specified in the Arm ACLE document section 2.12.1.1: https://arm-software.github.io/acle/neon_intrinsics/advsimd.html#rounding-3

The intrinsics implemented are:

  - vrnd32z_f64
  - vrnd32zq_f64
  - vrnd64z_f64
  - vrnd64zq_f64
  - vrnd32x_f64
  - vrnd32xq_f64
  - vrnd64x_f64
  - vrnd64xq_f64

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D158626
2023-09-11 12:59:18 +01:00