16319 Commits

Author SHA1 Message Date
Louis Dionne
a52560c8dd [clang] Remove spurious trailing whitespace 2023-09-15 17:26:16 -04:00
Zequan Wu
0b8df841f9
[Coverage] Add coverage for constructor member initializers. (#66441)
Before, constructor member initializers are shown as not covered. This
adds coverage info for them.
2023-09-15 17:06:04 -04:00
Zequan Wu
32db121b29 [Coverage] Allow Clang coverage to be used with debug info correlation.
Debug info correlation is an option in InstrProfiling pass, which is used by
both IR instrumentation and front-end instrumentation. So, Clang coverage can
also benefits the binary size saving from it.

Reviewed By: ellis

Differential Revision: https://reviews.llvm.org/D157913
2023-09-15 13:47:23 -04:00
Anton Korobeynikov
51d5d7bbae
Extend retcon.once coroutines lowering to optionally produce a normal result (#66333)
One of the main user of these kind of coroutines is swift. There yield-once (`retcon.once`) coroutines are used to temporary "expose" pointers to internal fields of various objects creating borrow scopes.

However, in some cases it might be useful also to allow these coroutines to produce a normal result, but there is no convenient way to represent this (as compared to switched-resume kind of coroutines where C++ `co_return`
is transformed to a member / callback call on promise object).

The extension is simple: we allow continuation function to have a non-void result and accept optional extra arguments via a special `llvm.coro.end.result` intrinsic that would essentially forward them as normal results.
2023-09-15 09:54:38 -07:00
Arthur Eubanks
0a1aa6cda2
[NFC][CodeGen] Change CodeGenOpt::Level/CodeGenFileType into enum classes (#66295)
This will make it easy for callers to see issues with and fix up calls
to createTargetMachine after a future change to the params of
TargetMachine.

This matches other nearby enums.

For downstream users, this should be a fairly straightforward
replacement,
e.g. s/CodeGenOpt::Aggressive/CodeGenOptLevel::Aggressive
or s/CGFT_/CodeGenFileType::
2023-09-14 14:10:14 -07:00
Yaxun (Sam) Liu
d7e1932f85
[HIP] Fix comdat of template kernel handle (#66283)
Currently, clang emits LLVM IR that fails verifier for the following
code:

```
template<typename T>
__global__ void foo(T x);

void bar() {
  foo<<<1, 1>>>(0);
}
```
This is due to clang putting the kernel handle for foo into comdat,
which is not allowed, since the kernel handle is a declaration.

The siutation is similar to calling a declaration-only template
function. The callee will be a declaration in LLVM IR and won't be put
into comdat. This is in contrast to calling a template function with
body, which will be put into comdat.

Fixes: SWDEV-419769
2023-09-14 15:56:02 -04:00
Matt Arsenault
ddc3346a6b
clang/AMDGPU: Fix accidental behavior change for __builtin_amdgcn_ldexph (#66340) 2023-09-14 18:15:44 +03:00
Sergio Afonso
9058762789
[OpenMP][Flang][MLIR] Lowering of requires directive from MLIR to LLVM IR
Default atomic ordering information is processed in the OpenMP dialect
to LLVM IR lowering stage at every spot where an operation can be
affected by it. The rest of clauses are stored globally in the
OpenMPIRBuilderConfig object before starting that lowering stage, so
that the OMPIRBuilder can conditionally modify code generation
depending on these. At the end of the process, the omp.requires
attribute is itself lowered into a global constructor that passes these
clauses as flags to the OpenMP runtime.

Depends on D147217, D147218 and D158278.

Differential Revision: https://reviews.llvm.org/D147219
2023-09-14 10:35:44 +01:00
Sergio Afonso
094a63a20b
[OpenMP][OMPIRBuilder] OpenMPIRBuilder support for requires directive
This patch updates the `OpenMPIRBuilderConfig` structure to hold all
available 'requires' clauses, and it replicates part of the code
generation for the 'requires' registration function from clang in the
`OMPIRBuilder`, to be used with flang.

Porting the rest of features of the clang implementation to the IRBuilder
and sharing it between clang and flang remains for a future patch, due to the
complexity of the logic selecting the attributes of the generated
registration function.

Differential Revision: https://reviews.llvm.org/D147217
2023-09-14 10:33:54 +01:00
Reid Kleckner
c8c075e876
[MS] Follow up fix to pass aligned args to variadic x86_32 functions (#65692)
MSVC allows users to pass structures with required alignments greater
than 4 to variadic functions. It does not pass them indirectly to
correctly align them. Instead, it passes them directly with the usual 4
byte stack alignment.

This change implements the same logic in clang on the passing side. The
receiving side (va_arg) never implemented any of this indirect logic, so
it doesn't need to be updated.

This issue pre-existed, but @aaron.ballman noticed it when we started
passing structs containing aligned fields indirectly in D152752.
2023-09-13 16:29:11 -07:00
Joshua Cranmer
bf49237103 [Clang] Enable -print-pipeline-passes in clang.
Reviewed By: arsenm, aeubanks

Differential Revision: https://reviews.llvm.org/D127221
2023-09-13 08:57:10 -07:00
CarolineConcatto
ee31ba0dd9
[AArch64][SME]Update intrinsic interface for ld1/st1 (#65582)
The new ACLE PR#225[1] now combines the slice parameters for some
builtins. 
Slice specifies the ZA slice number directly and needs to be explicity
implemented by the "user" with the base register plus the immediate
offset

[1]https://github.com/ARM-software/acle/pull/225/files
2023-09-13 15:24:09 +01:00
Joseph Huber
1b7a095e27
[Clang][AMDGPU] Permit language address spaces for AMDGPU globals (#66205)
Summary:
Currently, there is an assertion that prevents us from emitting an
AMDGPU global with a non-target specific address space (i.e. numerical
attribute). I'm unsure what the original intentions of this assertion
were, but we should be able to use OpenCL address spaces when compiling
directly to AMDGPU from C++. This is permitted on NVPTX so I'm unsure
what this assertion is guarding. The patch simply removes the assertion
and adds a test to ensure that these emit the expected address spaces.

Fixes https://github.com/llvm/llvm-project/issues/65069
2023-09-13 08:43:01 -05:00
Joseph Huber
49ff6a96a7
[Clang] Define AMDGPU ABI when referenced in CodeGen for ABI "none" (#66162)
Summary:
We use the `llvm.amgcn.abi.version` varaible to control code generation.
This is emitted in every module now to indicate what should be used when
compiling. Previously, the logic caused us to emit an external reference
to this variable when creating the code for the `none` type. This would
then cause us not to emit the actual definition. This patch refines the
logic to create the external reference, and then update it if it is
found unset by the time we emit the global. I had to remove the
reference to `GetOrCreateLLVmGlobal` because it did not accept the
proper address space.
2023-09-13 08:31:31 -05:00
Benjamin Kramer
88b7e06dcf Revert "[clang][CodeGen] Emit annotations for function declarations."
This reverts commit c6a33ff49dfb3498dae15c718820ea3d9c19f3cb. Makes
clang segfault.

// clang t.cc
class a;
class c {
 public:
  [[clang::annotate("")]] c(const c *) {}
};
class d {
  d(const c *, a *, a *);
  c e;
};
d::d(const c *f, a *, a *) : e(f) {}
2023-09-13 13:22:57 +02:00
Aaron Jarmusch
131ba0ae01 Revert "[Clang][OpenMP] Clang adding the addrSpace according to DataLayout fix (#65483)"
This reverts commit e831a32c93c1ab404785773cc7c08c01730d61e5.
2023-09-12 22:46:09 +00:00
Aaron Jarmusch
e3298bb275 fixup! [Clang][OpenMP] Clang adding the addrSpace according to DataLayout fix (#65483) 2023-09-12 20:52:33 +00:00
Brendan Dahl
c6a33ff49d [clang][CodeGen] Emit annotations for function declarations.
Previously, annotations were only emitted for function definitions. With
this change annotations are also emitted for declarations. Also, emitting
function annotations is now deferred until the end so that the most
up to date declaration is used which will have any inherited annotations.

Differential Revision: https://reviews.llvm.org/D156172/new/
2023-09-12 13:07:55 -07:00
Aaron Jarmusch
e831a32c93
[Clang][OpenMP] Clang adding the addrSpace according to DataLayout fix (#65483)
Fix for an issue where clang was not adding the address space according
to the data layout, instead was using the default which resulted in a
crash at times. The fix includes changes to the cases of
LargeCapMemAlloc and CGroupMemAlloc where we are setting the AddrSpace
according to the DataLayout.
2023-09-12 15:44:39 -04:00
CarolineConcatto
dc8d2ecc5e
[AArch64][SME]Update intrinsic interface for read/write (#65594)
The new ACLE PR#225[1] now combines the slice parameters for some
builtins. This patch is the #2 of 3 patches to update the interface.

Slice specifies the ZA slice number directly and needs to be explicity
implemented by the "user" with the base register plus the immediate
offset

[1]https://github.com/ARM-software/acle/pull/225/files
2023-09-12 18:08:57 +01:00
CarolineConcatto
7b8d4eff02
[AArch64][SME]Update intrinsic interface for ldr/str (#65593)
The new ACLE PR#225[1] now combines the slice parameters for some
builtins. 

[1]https://github.com/ARM-software/acle/pull/225/files
2023-09-12 17:31:51 +01:00
Adrian Prantl
167acac417
Propagate the DWARF version from the main compiler invocation to PCHC… (#66032)
…ontainerGenerator

Currently it remains uninitialized and thus always uses the LLVM default
of 4.
2023-09-12 08:31:27 -07:00
Max Iyengar
dbeb3d029d Add missing vrnd intrinsics
This patch adds 8 missing intrinsics as specified in the Arm ACLE document section 2.12.1.1 : [[ https://arm-software.github.io/acle/neon_intrinsics/advsimd.html#rounding-3 | https://arm-software.github.io/acle/neon_intrinsics/advsimd.html#rounding-3]]

The intrinsics implemented are:

  - vrnd32z_f64
  - vrnd32zq_f64
  - vrnd64z_f64
  - vrnd64zq_f64
  - vrnd32x_f64
  - vrnd32xq_f64
  - vrnd64x_f64
  - vrnd64xq_f64

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D158626
2023-09-11 12:59:18 +01:00
Shilei Tian
52b4bec939
[Clang][OpenMP] Emit unroll directive w/o captured stmt (#65862)
The front end doesn't create captured stmt for unroll directive. This
leads to
a crash when `-fopenmp-simd` is used, as reported in #63570.

Fix #63570.
2023-09-09 18:51:58 -04:00
Matt Arsenault
6a08cf12d9 clang: Add __builtin_exp10* and use new llvm.exp10 intrinsic
https://reviews.llvm.org/D157911
2023-09-09 23:14:12 +03:00
Nuno Lopes
8a2d68f6be
[clang][CodeGen] Switch declaration of vtable information to be [0 x ptr] (#65596)
Continuing the discussion in
https://discourse.llvm.org/t/codegen-layout-of-si-class-type-info-doesnt-match-the-actual-size/73274

Before we had this code:
@_ZTVN10__cxxabiv117__class_type_infoE = external global ptr

now we'll produce:
@_ZTVN10__cxxabiv117__class_type_infoE = external global [0 x ptr]

This is because we may not know the exact size of this data, and clang
issues gep inbounds with idx=2. Before, that gep would always result in
poison.
2023-09-09 07:50:35 +01:00
Jan Svoboda
523c471250 Reapply "[clang] NFCI: Adopt SourceManager::getFileEntryRefForID()"
This reapplies ddbcc10b9e26b18f6a70e23d0611b9da75ffa52f, except for a tiny part that was reverted separately: 65331da0032ab4253a4bc0ddcb2da67664bd86a9. That will be reapplied later on, since it turned out to be more involved.

This commit is enabled by 5523fefb01c282c4cbcaf6314a9aaf658c6c145f and f0f548a65a215c450d956dbcedb03656449705b9, specifically the part that makes 'clang-tidy/checkers/misc/header-include-cycle.cpp' separator agnostic.
2023-09-08 19:04:01 -07:00
Phoebe Wang
24194090e1 [X86][RFC] Add new option -m[no-]evex512 to disable ZMM and 64-bit mask instructions for AVX512 features
This is an alternative of D157485 and a pre-feature to support AVX10.

AVX10 Architecture Specification: https://cdrdv2.intel.com/v1/dl/getContent/784267
AVX10 Technical Paper: https://cdrdv2.intel.com/v1/dl/getContent/784343
RFC: https://discourse.llvm.org/t/rfc-design-for-avx10-feature-support/72661

Based on the feedbacks from LLVM and GCC community, we have agreed to
start from supporting `-m[no-]evex512` on existing AVX512 features.
The option `-mno-evex512` can be used with `-mavx512xxx` to build
binaries that can run on both legacy AVX512 targets and AVX10-256.

There're still arguments about what's the expected behavior when this
option as well as `-mavx512xxx` used together with `-mavx10.1-256`. We
decided to defer the support of `-mavx10.1` after we made consensus.
Or furthermore, we start from supporting AVX10.2 and not providing any
AVX10.1 options.

Reviewed By: RKSimon, skan

Differential Revision: https://reviews.llvm.org/D159250
2023-09-08 22:47:22 +08:00
Zahira Ammarguellat
2c93e3c1c8 Take math-errno into account with '#pragma float_control(precise,on)' and
'attribute__((optnone)).

Differential Revision: https://reviews.llvm.org/D151834
2023-09-08 09:48:53 -04:00
Juan Manuel MARTINEZ CAAMAÑO
d60c47476d [Clang] Propagate target-features if compatible when using mlink-builtin-bitcode
Buitlins from AMD's device-libs are compiled without specifying a
target-cpu, which results in builtins without the target-features
attribute set.

Before this patch, when linking this builtins with -mlink-builtin-bitcode
the target-features were not propagated in the incoming builtins.

With this patch, the default target features are propagated
if they are compatible with the target-features in the incoming builtin.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D159206
2023-09-08 11:20:16 +02:00
Phoebe Wang
0856efbf88 Revert "[X86][RFC] Add new option -m[no-]evex512 to disable ZMM and 64-bit mask instructions for AVX512 features"
This reverts commit 7dd48cc24de2d54d40527432cbee8a9d97a8a4f7.

Causing buildbot failure.
2023-09-07 21:59:01 +08:00
Phoebe Wang
7dd48cc24d [X86][RFC] Add new option -m[no-]evex512 to disable ZMM and 64-bit mask instructions for AVX512 features
This is an alternative of D157485 and a pre-feature to support AVX10.

AVX10 Architecture Specification: https://cdrdv2.intel.com/v1/dl/getContent/784267
AVX10 Technical Paper: https://cdrdv2.intel.com/v1/dl/getContent/784343
RFC: https://discourse.llvm.org/t/rfc-design-for-avx10-feature-support/72661

Based on the feedbacks from LLVM and GCC community, we have agreed to
start from supporting `-m[no-]evex512` on existing AVX512 features.
The option `-mno-evex512` can be used with `-mavx512xxx` to build
binaries that can run on both legacy AVX512 targets and AVX10-256.

There're still arguments about what's the expected behavior when this
option as well as `-mavx512xxx` used together with `-mavx10.1-256`. We
decided to defer the support of `-mavx10.1` after we made consensus.
Or furthermore, we start from supporting AVX10.2 and not providing any
AVX10.1 options.

Reviewed By: RKSimon, skan

Differential Revision: https://reviews.llvm.org/D159250
2023-09-07 21:38:35 +08:00
Jan Svoboda
0a9611fd8d Revert "[clang] NFCI: Adopt SourceManager::getFileEntryRefForID()"
This reverts commit ddbcc10b9e26b18f6a70e23d0611b9da75ffa52f.

The 'clang-tidy/checkers/misc/header-include-cycle.cpp' test started failing on Windows: https://lab.llvm.org/buildbot/#/builders/216/builds/26855.
2023-09-06 13:23:23 -07:00
Jan Svoboda
e75ecaa190 [clang] NFCI: Use FileEntryRef in CoverageMappingGen
This removes some uses of the deprecated `FileEntry::getName()`.
2023-09-06 11:15:51 -07:00
Jan Svoboda
ddbcc10b9e [clang] NFCI: Adopt SourceManager::getFileEntryRefForID()
This commit replaces some calls to the deprecated `FileEntry::getName()` with `FileEntryRef::getName()` by swapping current usages of `SourceManager::getFileEntryForID()` with `SourceManager::getFileEntryRefForID()`. This lowers the number of usages of the deprecated `FileEntry::getName()` from 95 to 50.
2023-09-06 10:49:48 -07:00
Chris Bieneman
400d3261a0 [HLSL] Cleanup support for this as an l-value
The goal of this change is to clean up some of the code surrounding
HLSL using CXXThisExpr as a non-pointer l-value. This change cleans up
a bunch of assumptions and inconsistencies around how the type of
`this` is handled through the AST and code generation.

This change is be mostly NFC for HLSL, and completely NFC for other
language modes.

This change introduces a new member to query for the this object's type
and seeks to clarify the normal usages of the this type.

With the introudction of HLSL to clang, CXXThisExpr may now be an
l-value and behave like a reference type rather than C++'s normal
method of it being an r-value of pointer type.

With this change there are now three ways in which a caller might need
to query the type of `this`:

* The type of the `CXXThisExpr`
* The type of the object `this` referrs to
* The type of the implicit (or explicit) `this` argument

This change codifies those three ways you may need to query
respectively as:

* CXXMethodDecl::getThisType()
* CXXMethodDecl::getThisObjectType()
* CXXMethodDecl::getThisArgType()

This change then revisits all uses of `getThisType()`, and in cases
where the only use was to resolve the pointee type, it replaces the
call with `getThisObjectType()`. In other cases it evaluates whether
the desired returned type is the type of the `this` expr, or the type
of the `this` function argument. The `this` expr type is used for
creating additional expr AST nodes and for member lookup, while the
argument type is used mostly for code generation.

Additionally some cases that used `getThisType` in simple queries could
be substituted for `getThisObjectType`. Since `getThisType` is
implemented in terms of `getThisObjectType` calling the later should be
more efficient if the former isn't needed.

Reviewed By: aaron.ballman, bogner

Differential Revision: https://reviews.llvm.org/D159247
2023-09-05 19:38:50 -05:00
Bill Wendling
7d6283fd09 [NFC] Remove unneeded header includes
Use forward decls instead of #including the header files.

Differential Revision: https://reviews.llvm.org/D159421
2023-09-05 13:12:00 -07:00
Paul T Robinson
a4605af26f
[CodeGen][LTO] Rename some misleading variables (#65185)
Some flags named "IsLTO" and "IsThinLTO" implied they described
compilation modes, but with Unified LTO this is no longer true. Rename
these to "PrepForXXX" to be less confusing to readers. Also, deleted
"IsThinOrUnifiedLTO" because Unified implies PrepareForThinLTO.
2023-09-05 08:57:16 -07:00
Kazu Hirata
2cdfdfd7b9 [CodeGen] Modernize EHScopeStack::Cleanup::Flags (NFC) 2023-09-02 09:32:36 -07:00
Vassil Vassilev
92246a9be0 [CodeGen] First check the kind and then the llvm::Function properties.
This patch fixes valgrind reports from downstream consumers about conditional
jump over uninitialised memory.

The original report:

```[ RUN      ] ScopeReflectionTest.IsComplete
==987150== Conditional jump or move depends on uninitialised value(s)
==987150==    at 0x1E1128F: clang::CodeGen::CodeGenModule::SetLLVMFunctionAttributesForDefinition(clang::Decl const*, llvm::Function*) (CodeGenModule.cpp:2391)
==987150==    by 0x1E4F181: clang::CodeGen::CodeGenModule::EmitGlobalFunctionDefinition(clang::GlobalDecl, llvm::GlobalValue*) (CodeGenModule.cpp:5669)
==987150==    by 0x1E4A194: clang::CodeGen::CodeGenModule::EmitGlobalDefinition(clang::GlobalDecl, llvm::GlobalValue*) (CodeGenModule.cpp:3909)
==987150==    by 0x1E4A752: clang::CodeGen::CodeGenModule::EmitGlobal(clang::GlobalDecl) (CodeGenModule.cpp:3649)
==987150==    by 0x1E532F5: clang::CodeGen::CodeGenModule::EmitTopLevelDecl(clang::Decl*) [clone .part.0] (CodeGenModule.cpp:6563)
==987150==    by 0x1B0BEDD: (anonymous namespace)::CodeGeneratorImpl::HandleTopLevelDecl(clang::DeclGroupRef) (ModuleBuilder.cpp:190)
==987150==    by 0x1AEA47B: clang::BackendConsumer::HandleTopLevelDecl(clang::DeclGroupRef) (CodeGenAction.cpp:235)
==987150==    by 0x101B02F: clang::IncrementalASTConsumer::HandleTopLevelDecl(clang::DeclGroupRef) (IncrementalParser.cpp:52)
==987150==    by 0x101ED93: clang::IncrementalParser::ParseOrWrapTopLevelDecl() (IncrementalParser.cpp:276)
==987150==    by 0x101FBBC: clang::IncrementalParser::Parse(llvm::StringRef) (IncrementalParser.cpp:342)
==987150==    by 0x100E104: clang::Interpreter::Parse(llvm::StringRef) (Interpreter.cpp:360)
==987150==    by 0xE734C0: Cpp::Interpreter::Parse(llvm::StringRef) (CppInterOpInterpreter.h:172)
==987150==  Uninitialised value was created by a heap allocation
==987150==    at 0x844BE63: operator new(unsigned long) (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==987150==    by 0x1B0C882: StartModule (ModuleBuilder.cpp:139)
==987150==    by 0x1B0C882: clang::CodeGenerator::StartModule(llvm::StringRef, llvm::LLVMContext&) (ModuleBuilder.cpp:360)
==987150==    by 0x101C4AF: clang::IncrementalParser::GenModule() (IncrementalParser.cpp:372)
==987150==    by 0x101FC0E: clang::IncrementalParser::Parse(llvm::StringRef) (IncrementalParser.cpp:362)
==987150==    by 0x100E104: clang::Interpreter::Parse(llvm::StringRef) (Interpreter.cpp:360)
==987150==    by 0x100E243: clang::Interpreter::create(std::unique_ptr<clang::CompilerInstance, std::default_delete<clang::CompilerInstance> >) (Interpreter.cpp:279)
==987150==    by 0xF2131A: compat::createClangInterpreter(std::vector<char const*, std::allocator<char const*> >&) (Compatibility.h:123)
==987150==    by 0xF22AB9: Cpp::Interpreter::Interpreter(int, char const* const*, char const*, std::vector<std::shared_ptr<clang::ModuleFileExtension>, std::allocator<std::shared_ptr<clang::ModuleFileExtension> > > const&, void*, bool) (CppInterOpInterpreter.h:146)
==987150==    by 0xF1827A: CreateInterpreter (CppInterOp.cpp:2494)
==987150==    by 0xECFA0E: TestUtils::GetAllTopLevelDecls(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::vector<clang::Decl*, std::allocator<clang::Decl*> >&, bool) (Utils.cpp:23)
==987150==    by 0xE9CB85: ScopeReflectionTest_IsComplete_Test::TestBody() (ScopeReflectionTest.cpp:71)
==987150==    by 0xF0ED0C: void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) (in /home/vvassilev/workspace/builds/scratch/cppyy/InterOp/build-with-clang-repl-release/unittests/CppInterOp/CppInterOpTests)
==987150==
```

Differential revision: https://reviews.llvm.org/D159339
2023-09-01 19:52:27 +00:00
Martin Storsjö
f9f2fdcf03 [clang] [MinGW] Add the option -fno-auto-import
In GCC, the .refptr stubs are only generated for x86_64, and only
for code models medium and larger (and medium is the default for
x86_64 since this was introduced). They can be omitted for
projects that are conscious about performance and size, and don't
need automatically importing dll data members, by passing -mcmodel=small.

In Clang/LLVM, such .refptr stubs are generated for any potentially
symbol reference that might end up autoimported. The .refptr stubs
are emitted for three separate reasons:
- Without .refptr stubs, undefined symbols are mostly referenced
  with 32 bit wide relocations. If the symbol ends up autoimported
  from a different DLL, a 32 bit relative offset might not be
  enough to reference data in a different DLL, depending on runtime
  loader layout.
- Without .refptr stubs, the runtime pseudo relocation mechanism
  will need to temporarily make sections read-write-executable
  if there are such relocations in the text section
- On ARM and AArch64, the immediate addressing encoded into
  instructions isn't in the form of a plain 32 bit relative offset,
  but is expressed with various bits scattered throughout two
  instructions - the mingw runtime pseudo relocation mechanism
  doesn't support updating offsets in that form.

If autoimporting is known not to be needed, the user can now
compile with -fno-auto-import, avoiding the extra overhead of
the .refptr stubs.

However, omitting them is potentially fragile as the code
might still rely on automatically importing some symbol without
the developer knowing. If this happens, linking still usually
will succeed, but users may encounter issues at runtime.

Therefore, if the new option -fno-auto-import is passed to the compiler
when driving linking, it passes the flag --disable-auto-import to
the linker, making sure that no symbols actually are autoimported
when the generated code doesn't expect it.

Differential Revision: https://reviews.llvm.org/D61670
2023-09-01 22:39:38 +03:00
Alexander Kornienko
b7f4915644 Revert "Reapply: [IRGen] Emit lifetime intrinsics around temporary aggregate argument allocas"
This reverts commit e698695fbbf62e6676f8907665187f2d2c4d814b. The commit caused
invalid AddressSanitizer: stack-use-after-scope errors.

See https://reviews.llvm.org/D74094#4633785 for details.

Differential Revision: https://reviews.llvm.org/D159346
2023-09-01 12:53:24 +02:00
Francis Visoiu Mistrih
c987f9d7fd
[Matrix] Try to emit fmuladd for both vector and matrix types
For vector * scalar + vector, we emit `fmuladd` directly from clang.

This enables it also for matrix * scalar + matrix.

rdar://113967122

Differential Revision: https://reviews.llvm.org/D158883
2023-08-31 17:13:19 -07:00
Stephen Peckham
282da83756 [XCOFF][AIX] Issue an error when specifying an alias for a common symbol
Summary:

There is no support in XCOFF for labels on common symbols. Therefore, an alias for a common symbol is not supported. Issue an error in the front end when an aliasee is a common symbol. Issue a similar error in the back end in case an IR specifies an alias for a common symbol.

Reviewed by: hubert.reinterpretcast, DiggerLin

Differential Revision:  https://reviews.llvm.org/D158739
2023-08-31 11:43:47 -04:00
Juan Manuel MARTINEZ CAAMAÑO
19550e79b5 [NFC][Clang] Remove redundant function definitions
There were 3 definitions of the mergeDefaultFunctionDefinitionAttributes
function: A private implementation, a version exposed in CodeGen, a
version exposed in CodeGenModule.

This patch removes the private and the CodeGenModule versions and keeps
a single definition in CodeGen.

Reviewed By: jhuber6

Differential Revision: https://reviews.llvm.org/D159256
2023-08-31 14:47:42 +02:00
Fangrui Song
651b2fbc1c [CodeGen] Function multi-versioning: don't set comdat for internal linkage resolvers
For function multi-versioning using the target or target_clones
function attributes, currently we incorrectly set comdat for internal
linkage resolvers. This is problematic for ELF linkers
as GRP_COMDAT deduplication will kick in even with STB_LOCAL signature
(https://groups.google.com/g/generic-abi/c/2X6mR-s2zoc
"GRP_COMDAT group with STB_LOCAL signature").

In short, two `__attribute((target_clones(...))) static void foo()`
in two translation units will be deduplicated. Fix this.

Fix #65114

Reviewed By: erichkeane

Differential Revision: https://reviews.llvm.org/D158963
2023-08-30 09:46:48 -07:00
Anton Rydahl
3c9988f85d [OpenMP] Allow exceptions in target regions when offloading to GPUs
The motivation for this patch is that many code bases use exception handling. As GPUs are not expected to support exception handling in the near future, we can experiment with compiling the code for GPU targets anyway. This will
allow us to run the code, as long as no exception is thrown.

The overall idea is very simple:
- If a throw expression is compiled to AMDGCN or NVPTX, it is replaced with a trap during code generation.
- If a try/catch statement is compiled to AMDGCN or NVPTX, we generate code for the try statement as if it were a basic block.

With this patch, the compilation of the following example
```
int gaussian_sum(int a,int b){
	if ((a + b) % 2 == 0) {throw -1;};
	return (a+b) * ((a+b)/2);
}

int main(void) {
	int gauss = 0;
	#pragma omp target map(from:gauss)
	{
		try {
			gauss = gaussian_sum(1,100);
		}
		catch (int e){
			gauss = e;
		}
	}
	std::cout << "GaussianSum(1,100)="<<gauss<<std::endl;
        #pragma omp target map(from:gauss)
        {
                try {
                     	gauss = gaussian_sum(1,101);
                }
                catch (int e){
                        gauss = e;
                }
        }
	std::cout << "GaussianSum(1,101)="<<gauss<<std::endl;
	return (gauss > 1) ? 0 : 1;
}
```
with offloading to `gfx906` results in
```
./bin/target_try_minimal_fail
GaussianSum(1,100)=5050
AMDGPU fatal error 1: Received error in queue 0x155555506000: HSA_STATUS_ERROR_EXCEPTION: An HSAIL operation resulted in a hardware exception.
zsh: abort (core dumped)
```

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D153924
2023-08-30 09:36:22 -07:00
Juan Manuel MARTINEZ CAAMAÑO
9b35254018 [NFC][Clang] Remove unused function CodeGenModule::addDefaultFunctionDefinitionAttributes
This patch deletes the unused `addDefaultFunctionDefinitionAttributes(llvm::Function);` function,
while it still keeps `void addDefaultFunctionDefinitionAttributes(llvm::AttrBuilder &attrs);` which is being used.

Differential Revision: https://reviews.llvm.org/D158990
2023-08-30 10:32:51 +02:00
Takuya Shimizu
01b88dd66d [NFC] Remove unused variables declared in conditions
D152495 makes clang warn on unused variables that are declared in conditions like `if (int var = init) {}`
This patch is an NFC fix to suppress the new warning in llvm,clang,lld builds to pass CI in the above patch.

Differential Revision: https://reviews.llvm.org/D158016
2023-08-30 10:05:06 +09:00
antonrydahl
7af0eff540 Revert "[OpenMP] Allow exceptions in target regions when offloading to GPUs"
This reverts commit 4c62e943b7178127861ca39163a0ed4caeb14943.
2023-08-29 15:59:47 -07:00