1123 Commits

Author SHA1 Message Date
Zahira Ammarguellat
2c93e3c1c8 Take math-errno into account with '#pragma float_control(precise,on)' and
'__attribute__((optnone))'.

Differential Revision: https://reviews.llvm.org/D151834
2023-09-08 09:48:53 -04:00
Juan Manuel MARTINEZ CAAMAÑO
d60c47476d [Clang] Propagate target-features if compatible when using mlink-builtin-bitcode
Builtins from AMD's device-libs are compiled without specifying a
target-cpu, which results in builtins without the target-features
attribute set.

Before this patch, when linking these builtins with -mlink-builtin-bitcode
the target-features were not propagated to the incoming builtins.

With this patch, the default target features are propagated
if they are compatible with the target-features in the incoming builtin.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D159206
2023-09-08 11:20:16 +02:00
Chris Bieneman
400d3261a0 [HLSL] Cleanup support for this as an l-value
The goal of this change is to clean up some of the code surrounding
HLSL using CXXThisExpr as a non-pointer l-value. This change cleans up
a bunch of assumptions and inconsistencies around how the type of
`this` is handled through the AST and code generation.

This change is mostly NFC for HLSL, and completely NFC for other
language modes.

This change introduces a new member to query for the this object's type
and seeks to clarify the normal usages of the this type.

With the introduction of HLSL to clang, CXXThisExpr may now be an
l-value and behave like a reference type rather than C++'s normal
method of it being an r-value of pointer type.

With this change there are now three ways in which a caller might need
to query the type of `this`:

* The type of the `CXXThisExpr`
* The type of the object `this` refers to
* The type of the implicit (or explicit) `this` argument

This change codifies those three ways you may need to query
respectively as:

* CXXMethodDecl::getThisType()
* CXXMethodDecl::getThisObjectType()
* CXXMethodDecl::getThisArgType()
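
The first two of these queries can be illustrated in plain standard C++, where the `this` expression is a prvalue of pointer type and the object type is its pointee. This is only a sketch: `Widget` and its member functions are hypothetical names, and the HLSL case (where `this` behaves like a reference) cannot be expressed in standard C++.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical class used only to show the two types involved.
struct Widget {
  bool check() {
    // Type of the `this` expression: a pointer to the class.
    static_assert(std::is_same<decltype(this), Widget *>::value,
                  "`this` is a prvalue of pointer type");
    // Type of the object `this` refers to: the class itself.
    static_assert(
        std::is_same<std::remove_pointer<decltype(this)>::type, Widget>::value,
        "the object type is the pointee of `this`");
    return true;
  }

  bool checkConst() const {
    // In a const member function the pointee is const-qualified.
    static_assert(std::is_same<decltype(this), const Widget *>::value,
                  "`this` points to const in a const member function");
    return true;
  }
};
```

Code that only needs the pointee (the second `static_assert`) is the kind of call site the commit moves from `getThisType()` to `getThisObjectType()`.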

This change then revisits all uses of `getThisType()`, and in cases
where the only use was to resolve the pointee type, it replaces the
call with `getThisObjectType()`. In other cases it evaluates whether
the desired returned type is the type of the `this` expr, or the type
of the `this` function argument. The `this` expr type is used for
creating additional expr AST nodes and for member lookup, while the
argument type is used mostly for code generation.

Additionally some cases that used `getThisType` in simple queries could
be substituted for `getThisObjectType`. Since `getThisType` is
implemented in terms of `getThisObjectType` calling the latter should be
more efficient if the former isn't needed.

Reviewed By: aaron.ballman, bogner

Differential Revision: https://reviews.llvm.org/D159247
2023-09-05 19:38:50 -05:00
Alexander Kornienko
b7f4915644 Revert "Reapply: [IRGen] Emit lifetime intrinsics around temporary aggregate argument allocas"
This reverts commit e698695fbbf62e6676f8907665187f2d2c4d814b. The commit caused
invalid AddressSanitizer: stack-use-after-scope errors.

See https://reviews.llvm.org/D74094#4633785 for details.

Differential Revision: https://reviews.llvm.org/D159346
2023-09-01 12:53:24 +02:00
Juan Manuel MARTINEZ CAAMAÑO
19550e79b5 [NFC][Clang] Remove redundant function definitions
There were 3 definitions of the mergeDefaultFunctionDefinitionAttributes
function: A private implementation, a version exposed in CodeGen, a
version exposed in CodeGenModule.

This patch removes the private and the CodeGenModule versions and keeps
a single definition in CodeGen.

Reviewed By: jhuber6

Differential Revision: https://reviews.llvm.org/D159256
2023-08-31 14:47:42 +02:00
Juan Manuel MARTINEZ CAAMAÑO
9b35254018 [NFC][Clang] Remove unused function CodeGenModule::addDefaultFunctionDefinitionAttributes
This patch deletes the unused `addDefaultFunctionDefinitionAttributes(llvm::Function);` function,
while it still keeps `void addDefaultFunctionDefinitionAttributes(llvm::AttrBuilder &attrs);` which is being used.

Differential Revision: https://reviews.llvm.org/D158990
2023-08-30 10:32:51 +02:00
Juan Manuel MARTINEZ CAAMAÑO
b63c6e585d [NFC][Clang] Add missing & to function argument
Differential Revision: https://reviews.llvm.org/D158991
2023-08-29 13:59:17 +02:00
Nikita Popov
14cc7a0772 [Clang] Allow __declspec(noalias) to access inaccessible memory
MSVC defines __declspec(noalias) as follows (https://learn.microsoft.com/en-us/previous-versions/visualstudio/visual-studio-2012/k649tyc7(v=vs.110)?redirectedfrom=MSDN):

> noalias means that a function call does not modify or reference
> visible global state and only modifies the memory pointed to
> directly by pointer parameters (first-level indirections).

> If a function is annotated as noalias, the optimizer can assume
> that, in addition to the parameters themselves, only first-level
> indirections of pointer parameters are referenced or modified
> inside the function. The visible global state is the set of all
> data that is not defined or referenced outside of the compilation
> scope, and their address is not taken. The compilation scope is
> all source files (/LTCG (Link-time Code Generation) builds) or a
> single source file (non-/LTCG build).

The wording is not super clear to me, but I believe this is saying
that __declspec(noalias) functions may access inaccessible memory
(i.e. non-visible global state in their words). Indeed, the Windows
CRT applies this attribute to malloc, which does access inaccessible
memory under LLVM's memory model.

As such, change the attribute to emit
memory(argmem: readwrite, inaccessiblemem: readwrite) instead of
memory(argmem: readwrite).

Fixes https://github.com/llvm/llvm-project/issues/64827.

Differential Revision: https://reviews.llvm.org/D158984
2023-08-29 11:43:57 +02:00
Chuanqi Xu
572cc8d38f Revert "[C++20] [Coroutines] Mark await_suspend as noinline if the awaiter is not empty"
This reverts commit 9d9c25f81456aace2bec4b58498a420e650007d9.
This reverts commit 19ab2664ad3182ffa8fe3a95bb19765e4ae84653.
This reverts commit c4672454743e942f148a1aff1e809dae73e464f6.

As the issue https://github.com/llvm/llvm-project/issues/65018 shows,
the previous fix actually introduced a regression. So this commit reverts
the fix per our policies.
2023-08-28 13:21:17 +08:00
Chuanqi Xu
9d9c25f814 [C++20] [Coroutines] Don't mark await_suspend as noinline if it is specified as always_inline already
Address https://github.com/llvm/llvm-project/issues/64933 and partially
https://github.com/llvm/llvm-project/issues/64945.

After c467245, we will add a noinline attribute to the await_suspend
member function of an awaiter if the awaiter has any non static member
functions.

Obviously, this decision will bring some performance regressions, and
people may complain about them while the long-term solution may not be
available soon. In such cases, it is better to provide a workaround for
users who unexpectedly hit the regression.

Also it is natural to not prevent the inlining if the function is marked
as always_inline by the users already.
2023-08-28 11:43:33 +08:00
eopXD
39a41c8905 [CGCall][RISCV] Handle function calls with parameter of RVV tuple type
This was an oversight in D146872, where function calls with tuple type
were not covered. This commit fixes this.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D157953
2023-08-22 23:41:23 -07:00
Chuanqi Xu
c467245474 [C++20] [Coroutines] Mark await_suspend as noinline if the awaiter is not empty
Close https://github.com/llvm/llvm-project/issues/56301
Close https://github.com/llvm/llvm-project/issues/64151

See the summary and the discussion of https://reviews.llvm.org/D157070
to get the full context.

As @rjmccall pointed out, the root cause is that we currently don't
implement the semantics of '@llvm.coro.save' well ("after the
await-ready returns false, the coroutine is considered to be
suspended").
These semantics imply that we (the compiler) shouldn't write the
spills into the coroutine frame in await_suspend. But this is currently
possible due to some combinations of optimizations, so the semantics are
broken. Inlining is the root of such optimizations.
So in this patch, we add the `noinline` attribute to the
await_suspend call.

Also as an optimization, we don't add the `noinline` attribute to the
await_suspend call if the awaiter is an empty class. This should be
correct since the programmers can't access the local variables in
await_suspend if the awaiter is empty. I think this is necessary for the
performance since it is pretty common.

Another potential optimization is:

    call @llvm.coro.await_suspend(ptr %awaiter, ptr %handle,
                                  ptr @awaitSuspendFn)

Then it is much easier to perform the safety analysis in the middle
end.
If it is safe to inline the call to awaitSuspend, we can replace it
in the CoroEarly pass. Otherwise we could replace it in the CoroSplit
pass.

Reviewed By: rjmccall

Differential Revision: https://reviews.llvm.org/D157833
2023-08-22 09:56:44 +08:00
Erik Pilkington
e698695fbb Reapply: [IRGen] Emit lifetime intrinsics around temporary aggregate argument allocas
This reverts commit e26c24b849211f35a988d001753e0cd15e4a9d7b.

These temporaries are only used in the callee, and their memory can be
reused after the call is complete.

rdar://58552124

Link: https://github.com/llvm/llvm-project/issues/38157
Link: https://github.com/llvm/llvm-project/issues/41896
Link: https://github.com/llvm/llvm-project/issues/43598
Link: https://github.com/ClangBuiltLinux/linux/issues/39
Link: https://reviews.llvm.org/rGfafc6e4fdf3673dcf557d6c8ae0c0a4bb3184402

Reviewed By: rjmccall

Differential Revision: https://reviews.llvm.org/D74094
2023-08-16 15:21:46 -07:00
Martin Storsjö
d60c3d08e7 [clang] Skip stores in init for fields that are empty structs
An empty struct is handled as a struct with a dummy i8, on all targets.

Most targets treat an empty struct return value as essentially
void - but some don't. (Currently, at least x86_64-windows-* and
powerpc64le-* don't treat it as void.)

When initializing a struct with such a no_unique_address member,
make sure we don't write the dummy i8 into the struct where there's
no space allocated for it.

Previously it would clobber the actual valid data of the struct.

Fixes https://github.com/llvm/llvm-project/issues/64253, and
possibly https://github.com/llvm/llvm-project/issues/64077
and https://github.com/llvm/llvm-project/issues/64427 as well.

We should omit the store for any empty record (not only ones
declared with no_unique_address); we can have a situation where a
class doesn't have the no_unique_address attribute, but is embedded
in an outer struct with the no_unique_address attribute - like this:

    struct S {};
    S f();
    struct S2 : public S { S2();};
    S2::S2() : S(f()) {}
    struct S3 { int x; [[no_unique_address]] S2 y; S3(); };
    S3::S3() : x(1), y() {}

Here, the problematic store (which this patch omits) is in
the constructor of S2. In the case of S3, S2 has no valid storage
and aliases x - thus the constructor of S2 should omit the dummy
store.
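
A runnable variant of the snippet above can be sketched as follows. The `readX` helper is a hypothetical addition for illustration; the types are the ones from the commit message. Because `x` is initialized before `y`, a compiler that emitted the dummy store in the `S2` constructor could overwrite `x` on targets where `y` shares storage with `x`.

```cpp
#include <cassert>

struct S {};
S f() { return S{}; }

struct S2 : public S {
  S2();
};
// The base-class initializer is where the problematic dummy store lived.
S2::S2() : S(f()) {}

struct S3 {
  int x;
  [[no_unique_address]] S2 y; // may share storage with x on some ABIs
  S3();
};
// x is initialized first, then y; constructing y must not clobber x.
S3::S3() : x(1), y() {}

// Hypothetical helper: constructs an S3 and reports the value left in x.
int readX() {
  S3 s;
  return s.x;
}
```

With a correct compiler `readX()` returns 1 regardless of whether the ABI actually overlaps `y` with `x`.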

Differential Revision: https://reviews.llvm.org/D157332
2023-08-15 10:59:23 +03:00
Changpeng Fang
d77c62053c [clang][AMDGPU]: Don't use byval for struct arguments in function ABI
Summary:
  Byval requires allocating additional stack space, and always requires an implicit copy to be inserted in codegen,
where it can be difficult to optimize. In this work, we use byref/IndirectAliased promotion method instead of
byval with the implicit copy semantics.

Reviewers:
  arsenm

Differential Revision:
  https://reviews.llvm.org/D155986
2023-08-11 16:37:42 -07:00
Matt Arsenault
25bc999d1f Intrinsics: Add type overload to stacksave and stackstore
This allows use with non-0 address space stacks. llvm_ptr_ty should
never be used. This could use some more percolation up through mlir,
but this is enough to fix existing tests.

https://reviews.llvm.org/D156666
2023-08-09 18:33:11 -04:00
Sander de Smalen
28b5f3087a [Clang][AArch64] Add/implement ACLE keywords for SME.
This patch adds all the language-level function keywords defined in:

  https://github.com/ARM-software/acle/pull/188 (merged)
  https://github.com/ARM-software/acle/pull/261 (update after D148700 landed)

The keywords are used to control PSTATE.ZA and PSTATE.SM, which are
respectively used for enabling the use of the ZA matrix array and Streaming
mode. This information needs to be available on call sites, since the use
of ZA or streaming mode may have to be enabled or disabled around the
call-site (depending on the IR attributes set on the caller and the
callee). For calls to functions from a function pointer, there is no IR
declaration available, so the IR attributes must be added explicitly to the
call-site.

With the exception of '__arm_locally_streaming' and '__arm_new_za' the
information is part of the function's interface, not just the function
definition, and thus needs to be propagated through the
FunctionProtoType::ExtProtoInfo.

This patch adds the definitions of these keywords, as well as codegen and
semantic analysis to ensure conversions between function pointers are valid
and that no conflicting keywords are set. For example, '__arm_streaming'
and '__arm_streaming_compatible' are mutually exclusive.

Differential Revision: https://reviews.llvm.org/D127762
2023-08-08 07:00:59 +00:00
Jon Roelofs
ed83797f3c [Intrinsics][ObjC] Mark objc_retain and friends as thisreturn.
https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-retain

rdar://79869679

Differential revision: https://reviews.llvm.org/D105671
2023-08-01 18:02:00 -07:00
Yaxun (Sam) Liu
ac72531043 [Driver] Add -f[no-]offload-uniform-block
By default, clang assumes HIP kernels are launched with uniform block size,
which is the case for kernels launched through triple chevron or
hipLaunchKernelGGL. Clang adds uniform-work-group-size function attribute
to HIP kernels to allow the backend to do optimizations on that.

However, in some rare cases, HIP kernels can be launched
through hipExtModuleLaunchKernel where global work size is specified,
which may result in non-uniform block size.

To be able to support non-uniform block size for HIP kernels,
an option `-f[no-]offload-uniform-block` is added. This option
is generic for offloading languages. Its default value is on for
CUDA/HIP and off otherwise.

Make -cl-uniform-work-group-size an alias to -foffload-uniform-block.

Reviewed by: Siu Chi Chan, Matt Arsenault, Fangrui Song, Johannes Doerfert

Differential Revision: https://reviews.llvm.org/D155213

Fixes: SWDEV-406592
2023-07-27 16:36:02 -04:00
Amy Huang
27dab4d305 Reland "Try to implement lambdas with inalloca parameters by forwarding without use of inallocas."
This reverts commit 8ed7aa59f489715d39d32e72a787b8e75cfda151.

Differential Revision: https://reviews.llvm.org/D154007
2023-07-26 16:13:36 -07:00
Craig Topper
d53d842d12 [RISCV][AArch64][IRGen] Add a special case to CodeGenFunction::EmitCall for scalable vector return being coerced to fixed vector.
Before falling back to CreateCoercedStore, detect a scalable vector
return being coerced to fixed vector. Handle it using a vector.extract
intrinsic without going through memory.

Reviewed By: c-rhodes

Differential Revision: https://reviews.llvm.org/D155495
2023-07-18 10:04:33 -07:00
Craig Topper
e8dc9dcd7d [IRGen] Remove 'Sve' from the name of some IR names that are shared with RISC-V now.
Reviewed By: c-rhodes

Differential Revision: https://reviews.llvm.org/D155220
2023-07-17 08:43:43 -07:00
Youngsuk Kim
6f986bffc5 [clang] Remove CGBuilderTy::CreateElementBitCast
`CGBuilderTy::CreateElementBitCast()` no longer does what its name suggests.

Remove remaining in-tree uses by one of the following methods.

* drop the call entirely
* fold it to an `Address` construction
* replace it with `Address::withElementType()`

This is a NFC cleanup effort.

Reviewed By: barannikov88, nikic, jrtc27

Differential Revision: https://reviews.llvm.org/D154285
2023-07-02 10:40:16 -04:00
Elliot Goodrich
f0fa2d7c29 [llvm] Move AttributeMask to a separate header
Move `AttributeMask` out of `llvm/IR/Attributes.h` to a new file
`llvm/IR/AttributeMask.h`.  After doing this we can remove the
`#include <bitset>` and `#include <set>` directives from `Attributes.h`.
Since there are many headers including `Attributes.h`, but not needing
the definition of `AttributeMask`, this causes unnecessary bloating of
the translation units and slows down compilation.

This commit adds in the include directive for `llvm/IR/AttributeMask.h`
to the handful of source files that need to see the definition.

This reduces the total number of preprocessing tokens across the LLVM
source files in lib from (roughly) 1,917,509,187 to 1,902,982,273 - a
reduction of ~0.76%. This should result in a small improvement in
compilation time.

Differential Revision: https://reviews.llvm.org/D153728
2023-06-27 15:26:17 +01:00
Eduard Zingerman
06eee734c1 [clang] Allow 'nomerge' attribute for function pointers
Allow specifying 'nomerge' attribute for function pointers,
e.g. like in the following C code:

    extern void (*foo)(void) __attribute__((nomerge));
    void bar(long i) {
      if (i)
        foo();
      else
        foo();
    }

With the goal to attach 'nomerge' to both calls done through 'foo':

    @foo = external local_unnamed_addr global ptr, align 8
    define dso_local void @bar(i64 noundef %i) local_unnamed_addr #0 {
      ; ...
      %0 = load ptr, ptr @foo, align 8, !tbaa !5
      ; ...
    if.then:
      tail call void %0() #1
      br label %if.end
    if.else:
      tail call void %0() #1
      br label %if.end
    if.end:
      ret void
    }
    ; ...
    attributes #1 = { nomerge ... }

Report a warning if 'nomerge' is specified for a variable that
is not a function pointer, e.g.:

    t.c:2:22: warning: 'nomerge' attribute is ignored because 'j' is not a function pointer [-Wignored-attributes]
        2 | int j __attribute__((nomerge));
          |                      ^

The intended use-case is for BPF backend.

BPF provides a sort of "standard library" functions that are called
helpers. BPF also verifies usage of these helpers before program
execution. Because of limitations of verification / runtime model it
is important to keep calls to some of such helpers from merging.

An example can be found at link [1], where the input C code:

     if (data_end - data > 1024) {
         bpf_for_each_map_elem(&map1, cb, &cb_data, 0);
     } else {
         bpf_for_each_map_elem(&map2, cb, &cb_data, 0);
     }

Is converted to bytecode equivalent to:

     if (data_end - data > 1024)
       tmp = &map1;
     else
       tmp = &map2;
     bpf_for_each_map_elem(tmp, cb, &cb_data, 0);

However, BPF verification/runtime requires using the same map address
for each particular `bpf_for_each_map_elem()` call.

The 'nomerge' attribute is a perfect match for this situation, but
unfortunately BPF helpers are declared as pointers to functions:

    static long (*bpf_for_each_map_elem)(void *map, ...) = (void *) 164;

Hence this commit, which allows using 'nomerge' on function pointers.

[1] https://lore.kernel.org/bpf/03bdf90f-f374-1e67-69d6-76dd9c8318a4@meta.com/

Differential Revision: https://reviews.llvm.org/D152986
2023-06-27 01:15:45 +03:00
Amy Huang
8ed7aa59f4 Revert "Try to implement lambdas with inalloca parameters by forwarding without use of inallocas."
Causes a clang crash (see crbug.com/1457256).

This reverts commit 015049338d7e8e0e81f2ad2f94e5a43e2e3f5220.
2023-06-22 11:42:33 -07:00
Amy Huang
015049338d Try to implement lambdas with inalloca parameters by forwarding without use of inallocas.
Differential Revision: https://reviews.llvm.org/D137872
2023-06-20 17:30:20 -07:00
Joseph Huber
8784b6a854 [Clang] Allow bitcode linking when the input is LLVM-IR
Clang provides the `-mlink-bitcode-file` and `-mlink-builtin-bitcode`
options to insert LLVM-IR into the current TU. These are useful
primarily for including LLVM-IR files that require special handling to
be correct and cannot be linked normally, such as GPU vendor libraries
like `libdevice.10.bc`. Currently these options can only be used if the
source input goes through the AST consumer path. This patch makes the
changes necessary to also support this when the input is LLVM-IR. This
will allow the following operation:

```
clang in.bc -Xclang -mlink-builtin-bitcode -Xclang libdevice.10.bc
```

Reviewed By: yaxunl

Differential Revision: https://reviews.llvm.org/D152391
2023-06-20 08:02:58 -05:00
NAKAMURA Takumi
0cbbfb8c2e [CGCall] Prune ArgStruct [-Wunused-variable]
It has been unused since b92ccc355acb
2023-06-16 08:00:57 +09:00
Nikita Popov
b92ccc355a [CGCall] Directly create opaque pointers (NFCI)
2023-06-15 10:06:40 +02:00
Nikita Popov
8a19af513d [Clang] Remove uses of PointerType::getWithSamePointeeType (NFC)
No longer relevant with opaque pointers.
2023-06-12 12:18:28 +02:00
Nikita Popov
2c44168381 [Clang] Remove typed pointer consistency assertions (NFC)
These are no-ops with opaque pointers.
2023-06-09 09:45:43 +02:00
pvanhout
23431b5246 [clang][CodeGen] Fix GPU-specific attributes being dropped by bitcode linking
Device libs make use of patterns like this:
```
__attribute__((target("gfx11-insts")))
static unsigned do_intrin_stuff(void)
{
  return __builtin_amdgcn_s_sendmsg_rtnl(0x0);
}
```
For functions that are assumed to be eliminated if the current GPU target doesn't support them.
At O0 such functions aren't eliminated by common optimizations but often by AMDGPURemoveIncompatibleFunctions instead, which sees the "+gfx11-insts" attribute on, say, GFX9 and knows it's not valid, so it removes the function.

D142907 accidentally made it so such attributes were dropped during bitcode linking, making it impossible for RemoveIncompatibleFunctions to catch the functions and causing ISel to catch fire eventually.

This fixes the issue and adds a new test to ensure we don't accidentally fall into this trap again.

Fixes SWDEV-403642

Reviewed By: arsenm, yaxunl

Differential Revision: https://reviews.llvm.org/D152251
2023-06-07 15:51:52 +02:00
Manna, Soumi
02ce49afb9 [NFC][CLANG] Fix bug with dereference null return value in GetFunctionTypeForVTable()
This patch uses castAs instead of getAs which will assert if the type doesn't match in clang::CodeGen::CodeGenTypes::GetFunctionTypeForVTable(clang::GlobalDecl).
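
The difference between the two accessors can be modeled with a small sketch. Everything here is a simplified stand-in: `Type`, `FunctionProtoType`, `BuiltinType`, `getAsSketch` and `castAsSketch` are hypothetical and do not reproduce the real clang type hierarchy or its implementation; they only mimic the "null on mismatch" versus "assert on mismatch" contract described above.

```cpp
#include <cassert>

// Hypothetical miniature type hierarchy (not clang's real classes).
struct Type {
  virtual ~Type() = default;
};
struct FunctionProtoType : Type {
  int NumParams = 0;
};
struct BuiltinType : Type {};

// getAs-style accessor: returns nullptr when the type does not match,
// so every caller has to check the result before dereferencing it.
template <typename T> const T *getAsSketch(const Type *Ty) {
  return dynamic_cast<const T *>(Ty);
}

// castAs-style accessor: a mismatch is treated as a programming error
// and trips an assertion instead of handing back a null pointer.
template <typename T> const T *castAsSketch(const Type *Ty) {
  const T *Result = dynamic_cast<const T *>(Ty);
  assert(Result && "castAs used on a type of the wrong kind");
  return Result;
}
```

When the caller already knows the type must match, the asserting form documents that expectation and avoids an unchecked null dereference.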

Reviewed By: erichkeane

Differential Revision: https://reviews.llvm.org/D151957
2023-06-02 13:28:06 -07:00
Dmitri Gribenko
daa95c7de5 [clang][analyzer][NFC] Remove unnecessary FALLTHROUGH markers
They are redundant with the [[fallthrough]]; attribute that follows.

Reviewed By: steakhal

Differential Revision: https://reviews.llvm.org/D151723
2023-05-30 18:16:35 +02:00
Florian Hahn
f0687b47a0 [IRGen] Handle infinite cycles in findDominatingStoreToReturnValue.
If there is an infinite cycle in the IR, the loop will never exit. Keep
track of visited basic blocks in a set and return nullptr if a block is
visited again.
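
The visited-set idea can be sketched on a simplified model. This is not the LLVM code: `Block`, `SinglePred`, `HasStore` and `findStoreBlock` are hypothetical names, and the walk over single predecessors is an assumption made to keep the example small.

```cpp
#include <cassert>
#include <unordered_set>

// Hypothetical stand-in for a basic block.
struct Block {
  const Block *SinglePred = nullptr; // unique predecessor, if any
  bool HasStore = false;             // block contains the store we look for
};

// Walks backwards through single predecessors looking for a block that
// contains the store. Returns nullptr if the chain ends or if a block is
// seen twice, which means the walk entered a cycle.
const Block *findStoreBlock(const Block *Start) {
  std::unordered_set<const Block *> Visited;
  for (const Block *B = Start; B != nullptr; B = B->SinglePred) {
    // insert() reports false in .second when B was already present.
    if (!Visited.insert(B).second)
      return nullptr;
    if (B->HasStore)
      return B;
  }
  return nullptr;
}
```

Without the `Visited` check, two blocks that are each other's only predecessor would keep the loop running forever.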

Fixes #62830.

Reviewed By: rjmccall

Differential Revision: https://reviews.llvm.org/D151076
2023-05-24 20:16:42 +01:00
eopXD
5e92298f76 [2/11][POC][Clang][RISCV] Define RVV tuple types
For the cover letter of this patch-set, please check out D146872.

Depends on D146872.

This is the 2nd patch of the patch-set. This patch originates from
D97264. This patch further allows local variable declaration and
function parameter passing by adjustment in clang lowering.

Test cases are provided to demonstrate the LLVM IR generated.

Note: This patch is currently only a proof-of-concept with only a
single RVV tuple type declared here, the rest will be added when
the concept of this patch-set is accepted.

Authored-by: eop Chen <eop.chen@sifive.com>
Co-Authored-by: Hsiangkai Wang <kai.wang@sifive.com>

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D146873
2023-05-22 00:50:40 -07:00
Matt Arsenault
bc37be1855 LangRef: Add "dynamic" option to "denormal-fp-math"
This is stricter than the default "ieee", and should probably be the
default. This patch leaves the default alone. I can change this in a
future patch.

There are non-reversible transforms I would like to perform which are
legal under IEEE denormal handling, but illegal with flush-to-zero
behavior. Namely, conversions between llvm.is.fpclass and fcmp with
zeroes.

Under "ieee" handling, it is legal to translate between
llvm.is.fpclass(x, fcZero) and fcmp x, 0.

Under "preserve-sign" handling, it is legal to translate between
llvm.is.fpclass(x, fcSubnormal|fcZero) and fcmp x, 0.

I would like to compile and distribute some math library functions in
a mode where it's callable from code with and without denormals
enabled, which requires not changing the compares with denormals or
zeroes.

If an IEEE function transforms an llvm.is.fpclass call into an fcmp 0,
it is no longer possible to call the function from code with denormals
enabled, or write an optimization to move the function into a denormal
flushing mode. For the original function, if x was a denormal, the
class would evaluate to false. If the function compiled with denormal
handling was converted to or called from a preserve-sign function, the
fcmp now evaluates to true.
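
The mismatch can be simulated on the host. This is a sketch under stated assumptions: all function names are hypothetical, `std::fpclassify` stands in for llvm.is.fpclass, flushing is modeled in software by `flushPreserveSign`, and the host itself is assumed to run with IEEE denormal handling (no fast-math or flush-to-zero mode).

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Analogue of llvm.is.fpclass(x, fcZero).
bool isZeroClass(float X) { return std::fpclassify(X) == FP_ZERO; }

// Analogue of llvm.is.fpclass(x, fcSubnormal|fcZero).
bool isZeroOrSubnormalClass(float X) {
  int Class = std::fpclassify(X);
  return Class == FP_ZERO || Class == FP_SUBNORMAL;
}

// Software model of "preserve-sign" input flushing: subnormal inputs are
// replaced by a zero with the same sign before the operation sees them.
float flushPreserveSign(float X) {
  return std::fpclassify(X) == FP_SUBNORMAL ? std::copysign(0.0f, X) : X;
}

// Compare against zero as an IEEE-mode function would evaluate it.
bool compareEqZeroIEEE(float X) { return X == 0.0f; }

// Compare against zero as a preserve-sign function would evaluate it.
bool compareEqZeroFlushed(float X) { return flushPreserveSign(X) == 0.0f; }
```

For a subnormal input the class test for zero and the IEEE compare agree (both false), while the flushed compare is true and agrees only with the zero-or-subnormal class test, which is why rewriting one form into the other bakes in a denormal mode.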

This could also be of use for strictfp handling, where code may be
changing the denormal mode.

Alternative name could be "unknown".

Replaces the old AMDGPU custom inlining logic with more conservative
logic which tries to permit inlining for callees with dynamic handling
and avoids inlining other mismatched modes.
2023-04-29 08:44:59 -04:00
Harald van Dijk
6b86813945 [SYCL] Always set NoUnwind attribute for SYCL.
Like CUDA and OpenCL, the SYCL specification says that throwing and
catching exceptions in device functions is not supported, so this change
extends the logic for adding the NoUnwind attribute to SYCL.

The existing convergent.cpp test, which tests that the convergent
attribute is added to functions by default, is renamed and reused to
test that the nounwind attribute is added by default. This test now has
-fexceptions added to it, which the driver adds by default as well.

The obvious question here is why not simply change the driver to remove
-fexceptions. This change follows the direction given by the TODO
comment because removing -fexceptions would also disable the
__EXCEPTIONS macro, which should reflect whether exceptions are enabled
on the host, rather than on the device, to avoid conflicts in types
shared between host and device.

Reviewed By: bader

Differential Revision: https://reviews.llvm.org/D147097
2023-03-30 02:18:52 +01:00
Qiu Chaofan
608212a0ff [Clang] Check feature requirement from inlined callee
Currently clang emits an error when both the always_inline and target
attributes are on the callee, but the caller doesn't have some feature.

This patch makes clang emit error when caller cannot meet target feature
requirements from an always-inlined callee.

Reviewed By: erichkeane

Differential Revision: https://reviews.llvm.org/D143479
2023-03-15 14:21:52 +08:00
Matt Arsenault
dd81810554 clang: Emit nofpclass(nan inf) for -ffinite-math-only
Set this on any source level floating-point type argument,
return value, call return or outgoing parameter which is lowered
to a valid IR type for the attribute. Currently this isn't
applied to emitted intrinsics since those don't go through
ABI code.
2023-03-15 01:13:08 -04:00
Jacob Young
6740991135 [Clang][CodeGen] Fix this argument type for certain destructors
With the Microsoft ABI, some destructors need to offset a parameter to
get the derived this pointer, in which case the type of that parameter
should not be a pointer to the derived type.

Fixes #60465
2023-02-28 16:43:03 -08:00
Akira Hatanaka
57865bc5ad [CodeGen] Add a flag to Address and Lvalue that is used to keep
track of whether the pointer is known not to be null

The flag will be used for the arm64e work we plan to upstream in the
future (see https://lists.llvm.org/pipermail/llvm-dev/2019-October/136091.html).
Currently the flag has no effect on code generation.

Differential Revision: https://reviews.llvm.org/D142584
2023-02-15 10:15:13 -08:00
Francesco Petrogalli
20f3ebd258 [clang][CGCall] Remove header file not used. [NFCI]
Reviewed By: fpetrogalli

Differential Revision: https://reviews.llvm.org/D142976
2023-01-31 16:12:46 +01:00
Sven van Haastregt
1495210914 [OpenCL] Always add nounwind attribute for OpenCL
Neither OpenCL nor C++ for OpenCL support exceptions, so add the
`nounwind` attribute unconditionally for those languages.

Differential Revision: https://reviews.llvm.org/D142033
2023-01-20 12:01:22 +00:00
Guillaume Chatelet
bf5c17ed0f [clang][NFC] Remove dependency on DataLayout::getPrefTypeAlignment
2023-01-13 15:01:29 +00:00
Guillaume Chatelet
eaa1f46f11 [clang][NFC] Remove dependency on DataLayout::getPrefTypeAlignment
2023-01-13 13:19:19 +00:00
Guillaume Chatelet
6916ebd026 [clang][NFC] Use the TypeSize::getXXXValue() instead of TypeSize::getXXXSize()
This change is one of a series to implement the discussion from
https://reviews.llvm.org/D141134.
2023-01-11 16:07:48 +00:00
Matt Arsenault
ce6ae0b2a2 clang: Don't emit "frame-pointer"="none"
This is the default behavior and cuts down on attribute spam.
Probably should also do something to consolidate the option spellings;
printing and parsing it is repeated in at least 3 different places.

In the OpenMP tests, I had to manually delete some metadata check
lines update_cc_test_checks was inserting that included the local
build revision.
2023-01-03 19:42:46 -05:00
Dani Ferreira Franco Moura
0da4cecfb6 [clang][dataflow] Remove unused argument in getNullability
This change will allow users to call getNullability() without providing an ASTContext.

Reviewed By: gribozavr2

Differential Revision: https://reviews.llvm.org/D140104
2022-12-16 12:22:23 +01:00