The motivating use case is to support import the function declaration
across modules to construct call graph edges for indirect calls [1]
when importing the function definition costs too much compile time
(e.g., the function is too large has no `noinline` attribute).
1. Currently, when the compiled IR module doesn't have a function
definition but its postlink combined summary contains the function
summary or a global alias summary with this function as aliasee, the
function definition will be imported from source module by IRMover. The
implementation is in FunctionImporter::importFunctions [2]
2. In order for FunctionImporter to import a declaration of a function,
both function summary and alias summary need to carry the def / decl
state. Specifically, all existing summary fields doesn't differ across
import modules, but the def / decl state of is decided by
`<ImportModule, Function>`.
This change encodes the def/decl state in `GlobalValueSummary::GVFlags`.
In the subsequent changes
1. The indexing step `computeImportForModule` [3]
will compute the set of definitions and the set of declarations for each
module, and passing on the information to bitcode writer.
2. Bitcode writer will look up the def/decl state and sets the state
when it writes out the flag value. This is demonstrated in
https://github.com/llvm/llvm-project/pull/87600
3. Function importer will read the def/decl state when reading the
combined summary to figure out two sets of global values, and IRMover
will be updated to import the declaration (aka linkGlobalValuePrototype [4])
into the destination module.
- The next change is https://github.com/llvm/llvm-project/pull/87600
[1] mentioned in rfc https://discourse.llvm.org/t/rfc-for-better-call-graph-sort-build-a-more-complete-call-graph-by-adding-more-indirect-call-edges/74029#support-cross-module-function-declaration-import-5
[2] 3b337242ee/llvm/lib/Transforms/IPO/FunctionImport.cpp (L1608-L1764)
[3] 3b337242ee/llvm/lib/Transforms/IPO/FunctionImport.cpp (L856)
[4] 3b337242ee/llvm/lib/Linker/IRMover.cpp (L605)
This reverts commit ccdebbae4d77d3efc236af92c22941de5d437e01.
Causes test failures in the presence of Android runtime libraries in resource-dir.
See comments on https://github.com/llvm/llvm-project/pull/87866.
IR for 'target teams loop' is now dependent on suitability of associated
loop-nest.
If a loop-nest:
- does not contain a function call, or
- the -fopenmp-assume-no-nested-parallelism has been specified,
- or the call is to an OpenMP API AND
- does not contain nested loop bind(parallel) directives
then it can be emitted as 'target teams distribute parallel for', which
is the current default. Otherwise, it is emitted as 'target teams
distribute'.
Added debug output indicating how 'target teams loop' was emitted. Flag
is -mllvm -debug-only=target-teams-loop-codegen
Added LIT tests explicitly verifying 'target teams loop' emitted as a
parallel loop and a distribute loop.
Updated other 'loop' related tests as needed to reflect change in IR.
- These updates account for most of the changed files and
additions/deletions.
This patch introduces a new command-line option for clang, namely,
amdgpu-precise-mem-op (or precise-memory in the backend). When this option is specified, a waitcnt
instruction is generated after each memory load/store instruction. The
counter values are always 0, but which counters are involved depends on
the memory instruction.
---------
Co-authored-by: Jun Wang <jun.wang7@amd.com>
This test just checks for the stdout/stderr of clang, but it
incidentally tries to write to `a.out` in the current directory, which
may be write protected. Typically one would write `clang -o %t.o` for a
writeable dir, but since we only care about stdout/stderr, throw away
the object file and just write to /dev/null instead.
As a followup to my previous commits, this is an implementation of a
single clause, in this case the 'default' clause. This implements all
semantic analysis for it on compute clauses, and continues to leave it
rejected for all others (some as 'doesnt appertain', others as 'not
implemented' as appropriate).
This also implements and tests the TreeTransform as requested in the
previous patch.
Summary:
The linker wrapper attempts to maintain consistent semantics with
existing host invocations. Static libraries by default only extract if
there are non-weak symbols that remain undefined. However, we have
situations between linkers that put different meanings on ordering. The
ld.bfd linker requires static libraries to be defined after the symbols,
while `ld.lld` relaxes this rule. The linker wrapper went with the
former as it's the easier solution, however this has caused a lot of
issues as I've had to explain this rule to several people, it also make
it difficult to include things like `libc` in the OpenMP runtime because
it would sometimes be linked before or after.
This patch reworks the logic to more or less perform the following logic
for static libraries.
1. Split library / object inputs.
2. Include every object input and record its undefined symbols
3. Repeatedly try to extract static libraries to resolve these
symbols. If a file is extracted we need to check every library
again to resolve any new undefined symbols.
This allows the following to work and will cause fewer issues when
replacing HIP, which does `--whole-archive` so it's very likely the old
logic will regress.
```console
$ clang -lfoo main.c -fopenmp --offload-arch=native
```
There has been an optimization for `SizeOfPackExprs` since c5452ed9, in
which
we overlooked a case where the template arguments were not yet
formed into a `PackExpansionType` at the token annotation stage. This
led to a problem in that a template involving such expressions may
lose its nature of being dependent, causing some false-positive
diagnostics.
Fixes https://github.com/llvm/llvm-project/issues/84220
Fixes https://github.com/llvm/llvm-project/issues/63818 for control flow
out of an expressions.
#### Background
A control flow could happen in the middle of an expression due to
stmt-expr and coroutine suspensions.
Due to branch-in-expr, we missed running cleanups for the temporaries
constructed in the expression before the branch.
Previously, these cleanups were only added as `EHCleanup` during the
expression and as normal expression after the full expression.
Examples of such deferred cleanups include:
`ParenList/InitList`: Cleanups for fields are performed by the
destructor of the object being constructed.
`Array init`: Cleanup for elements of an array is included in the array
cleanup.
`Lifetime-extended temporaries`: reference-binding temporaries in
braced-init are lifetime extended to the parent scope.
`Lambda capture init`: init in the lambda capture list is destroyed by
the lambda object.
---
#### In this PR
In this PR, we change some of the `EHCleanups` cleanups to
`NormalAndEHCleanups` to make sure these are emitted when we see a
branch inside an expression (through statement expressions or coroutine
suspensions).
These are supposed to be deactivated after full expression and destroyed
later as part of the destructor of the aggregate or array being
constructed. To simplify deactivating cleanups, we add two utilities as
well:
* `DeferredDeactivationCleanupStack`: A stack to remember cleanups with
deferred deactivation.
* `CleanupDeactivationScope`: RAII for deactivating cleanups added to
the above stack.
---
#### Deactivating normal cleanups
These were previously `EHCleanups` and not `Normal` and **deactivation**
of **required** `Normal` cleanups had some bugs. These specifically
include deactivating `Normal` cleanups which are not the top of
`EHStack`
[source1](92b56011e6/clang/lib/CodeGen/CGCleanup.cpp (L1319)),
[2](92b56011e6/clang/lib/CodeGen/CGCleanup.cpp (L722-L746)).
This has not been part of our test suite (maybe it was never required
before statement expressions). In this PR, we also fix the emission of
required-deactivated-normal cleanups.
This turns the current `Pointer` class into a discriminated union of
`BlockPointer` and `IntPointer`. The former is what `Pointer` currently
is while the latter is just an integer value and an optional
`Descriptor*`.
The `Pointer` then has type check functions like
`isBlockPointer()`/`isIntegralPointer()`/`asBlockPointer()`/`asIntPointer()`,
which can be used to access its data.
Right now, the `IntPointer` and `BlockPointer` structs do not have any
methods of their own and everything is instead implemented in Pointer
(like it was before) and the functions now just either assert for the
right type or decide what to do based on it.
This also implements bitcasts by decaying the pointer to an integral
pointer.
`test/AST/Interp/const-eval.c` is a new test testing all kinds of stuff
related to this. It still has a few tests `#ifdef`-ed out but that
mostly depends on other unimplemented things like
`__builtin_constant_p`.
The compiler doesn't know in advance if the streaming and non-streaming
vector-lengths are different, so it should be safe to give a warning
diagnostic to warn the user about possible undefined behaviour. If the
user knows the vector lengths are equal, they can disable the warning
separately.
This relands #87149.
The previous commit exposed failures on some targets. The reason is only
a few targets support COFF ObjectFormatType on Windows:
https://github.com/llvm/llvm-project/blob/main/llvm/lib/TargetParser/Triple.cpp#L835-L842
With #87149, the targets don't support COFF will report "warning:
argument unused during compilation: '-gcodeview-command-line'
[-Wunused-command-line-argument]" in the test gcodeview-command-line.c
This patch limits gcodeview-command-line.c only run on targets support
COFF.
This patch takes advantage of a recent NFC change that refactored
`EvaluateBinaryTypeTrait()` to accept `TypeSourceInfo` instead of
`QualType` c7db450e5c1a83ea768765dcdedfd50f3358d418.
Before:
```
test2.cpp:105:55: error: variable length arrays are not supported in '__is_layout_compatible'
105 | static_assert(!__is_layout_compatible(int[n], int[n]));
| ^
test2.cpp:125:76: error: incomplete type 'CStructIncomplete' where a complete type is required
125 | static_assert(__is_layout_compatible(CStructIncomplete, CStructIncomplete));
| ^
```
After:
```
test2.cpp:105:41: error: variable length arrays are not supported in '__is_layout_compatible'
105 | static_assert(!__is_layout_compatible(int[n], int[n]));
| ^
test2.cpp:125:40: error: incomplete type 'CStructIncomplete' where a complete type is required
125 | static_assert(__is_layout_compatible(CStructIncomplete, CStructIncomplete));
| ^
```
```
typedef long long t67 __attribute__((aligned (4)));
struct s67 {
int a;
t67 b;
};
void f67(struct s67 x) {
}
```
When classify:
a: Lo = Integer, Hi = NoClass
b: Lo = Integer, Hi = NoClass
struct S: Lo = Integer, Hi = NoClass
```
define dso_local void @f67(i64 %x.coerce) {
```
In this case, only one i64 register is used when the structure parameter
is transferred, which is obviously incorrect.So we need to treat the
split case specially. fix
https://github.com/llvm/llvm-project/issues/85387.
MSVC doesn't support generating __vectorcall calls in Arm64EC mode, but
it does treat it as a distinct type. The Microsoft STL depends on this
functionality. (Not sure if this is intentional.) Add support for
parsing the same way as MSVC, and add some checks to ensure we don't try
to actually generate code.
The error handling in CodeGen is ugly, but I can't think of a better way
to do it.
Summary:
These entires are generic for offloading with the new driver now. Having
the `omp` prefix was a historical artifact and is confusing when used
for CUDA. This patch just renames them for now, future patches will
rework the binary format to make it more common.
This reverts commit 407a2f23 which stopped propagating the callback to module compiles, effectively disabling dependency directive scanning for all modular dependencies. Also added a regression test.
Follow-up to discussion: https://github.com/llvm/llvm-project/pull/87761
As discussed after landing the original PR:
Since fails could happen w.r.t. checking `!6`, these checks should be
removed.
- Add '--' argument to prevent interpreting intput files as options
starting with '/'. Fix test failure after
2921a0928c71f4ee652a2478283e47ab5ffebf58.
This diagnostic is one of the ones that GCC also does not have a
warning group for, but a user requested adding a group to control
selectively turning off this diagnostic. So this adds the diagnostic
to a new group, -Wtentative-definition-array
Fixes#87766
After discussion with a few others, and seeing the state of our concepts
support, I believe it is worth trying to see if we can update this for
Clang19. The forcing function is that libstdc++'s `<expected>` header is
guarded by this macro, so we need to update it to support that.
Summary:
These headers are a part of the compiler's resource directory once
installed. However, they are currently placed in the binary directory
temporarily. This makes it more difficult to use the compiler out of the
build directory and will cause issues when moving to `liboffload`. This
patch changes the logic to write these instead to the copmiler's
resource directory inside of the build tree.
NOTE: This doesn't change the Fortran headers, I don't know enough about
those and it won't use the same directory.
Warning '-Wundefined-func-template' incorrectly indicates that no
definition is available for a pure virtual function. However, a
definition is not needed for a pure virtual function.
Fixes#74016
This fixes some problems wrt dependence of captures in lambdas with
an explicit object parameter.
[temp.dep.expr] states that
> An id-expression is type-dependent if [...] its terminal name is
> - associated by name lookup with an entity captured by copy
> ([expr.prim.lambda.capture]) in a lambda-expression that has
> an explicit object parameter whose type is dependent [dcl.fct].
There were several issues with our implementation of this:
1. we were treating by-reference captures as dependent rather than
by-value captures;
2. tree transform wasn't checking whether referring to such a
by-value capture should make a DRE dependent;
3. when checking whether a DRE refers to such a by-value capture, we
were only looking at the immediately enclosing lambda, and not
at any parent lambdas;
4. we also forgot to check for implicit by-value captures;
5. lastly, we were attempting to determine whether a lambda has an
explicit object parameter by checking the `LambdaScopeInfo`'s
`ExplicitObjectParameter`, but it seems that that simply wasn't
set (yet) by the time we got to the check.
All of these should be fixed now.
This fixes#70604, #79754, #84163, #84425, #86054, #86398, and #86399.
This patch fixes a crash that happens when '`this`' is referenced
(implicitly or explicitly) in a dependent class scope function template
specialization that instantiates to a static member function. For
example:
```
template<typename T>
struct A
{
template<typename U>
static void f();
template<>
void f<int>()
{
this; // causes crash during instantiation
}
};
template struct A<int>;
```
This happens because during instantiation of the function body,
`Sema::getCurrentThisType` will return a null `QualType` which we
rebuild the `CXXThisExpr` with. A similar problem exists for implicit
class member access expressions in such contexts (which shouldn't really
happen within templates anyways per [class.mfct.non.static]
p2, but changing that is non-trivial). This patch fixes the crash by building
`UnresolvedLookupExpr`s instead of `MemberExpr`s for these implicit
member accesses, which will then be correctly rebuilt as `MemberExpr`s
during instantiation.
rldimi is 64-bit instruction, due to backward compatibility, it needs to
be expanded into series of rotate and masking in 32-bit environment. In
the future, we may improve bit permutation selector and remove such
direct codegen.
Original commit message:
"
Commit https://github.com/llvm/llvm-project/commit/46f3ade introduced a notion
of printing the attributes on the left to improve the printing of attributes
attached to variable declarations. The intent was to produce more GCC compatible
code because clang tends to print the attributes on the right hand side which is
not accepted by gcc.
This approach has increased the complexity in tablegen and the attrubutes
themselves as now the are supposed to know where they could appear. That lead to
mishandling of the `override` keyword which is modelled as an attribute in
clang.
This patch takes an inspiration from the existing approach and tries to keep the
position of the attributes as they were written. To do so we use simpler
heuristic which checks if the source locations of the attribute precedes the
declaration. If so, it is considered to be printed before the declaration.
Fixes https://github.com/llvm/llvm-project/issues/87151
"
The reason for the bot breakage is that attributes coming from ApiNotes are not
marked implicit even though they do not have source locations. This caused an
assert to trigger. This patch forces attributes with no source location
information to be printed on the left. That change is consistent to the overall
intent of the change to increase the chances for attributes to compile across
toolchains and at the same time the produced code to be as close as possible to
the one written by the user.
Commit https://github.com/llvm/llvm-project/commit/46f3ade introduced a
notion of printing the attributes on the left to improve the printing of
attributes attached to variable declarations. The intent was to produce
more GCC compatible code because clang tends to print the attributes on
the right hand side which is not accepted by gcc.
This approach has increased the complexity in tablegen and the
attrubutes themselves as now the are supposed to know where they could
appear. That lead to mishandling of the `override` keyword which is
modelled as an attribute in clang.
This patch takes an inspiration from the existing approach and tries to
keep the position of the attributes as they were written. To do so we
use simpler heuristic which checks if the source locations of the
attribute precedes the declaration. If so, it is considered to be
printed before the declaration.
Fixes https://github.com/llvm/llvm-project/issues/87151
…irectory (#88007)"
This reverts commit 8671429151d5e67d3f21a737809953ae8bdfbfde.
This commit broke the flang build, so I'm reverting it. See the comments
in merge request #88007 for more information.
Follow-up to #81037.
ToolChain::LibraryPaths holds the new compiler-rt library directory
(e.g. `/tmp/Debug/lib/clang/19/lib/x86_64-unknown-linux-gnu`). However,
it might be empty when the directory does not exist (due to the `if
(getVFS().exists(P))` change in https://reviews.llvm.org/D158475).
If neither the old/new compiler-rt library directories exists, we would
suggest the undesired old compiler-rt file name:
```
% /tmp/Debug/bin/clang++ a.cc -fsanitize=memory -o a
ld.lld: error: cannot open /tmp/Debug/lib/clang/19/lib/linux/libclang_rt.msan-x86_64.a: No such file or directory
clang++: error: linker command failed with exit code 1 (use -v to see invocation)
```
With this change, we will correctly suggest the new compiler-rt file name.
Fix#87150
Pull Request: https://github.com/llvm/llvm-project/pull/87866
Hit this when trying upgrade an old project of mine. I couldn't find a
corresponding existing issue for this when spelunking the open issues
here on github.
Thankfully I can work-around it today with the `[[clang::no_destroy]]`
attribute for my use case. However it should still be properly fixed.
### Issue and History ###
https://godbolt.org/z/EYnhce8MK for reference.
All subsequent text below refers to the example in the godbolt above.
Anonymous unions never have their destructor invoked automatically.
Therefore we can skip vtable initialization of the destructor of a
dynamic class if that destructor effectively does no work.
This worked previously as the following check would be hit and return
true for the trivial anonymous union,
https://github.com/llvm/llvm-project/blob/release/18.x/clang/lib/CodeGen/CGClass.cpp#L1348,
resulting in the code skipping vtable initialization.
This was broken here
982bbf404e
in relation to comments made on this review here
https://reviews.llvm.org/D10508.
### Fixes ###
The check the code is doing is correct however the return value is
inverted. We want to return true here since a field with anonymous union
never has its destructor invoked and thus effectively has a trivial
destructor body from the perspective of requiring vtable init in the
parent dynamic class.
Also added some extra missing unit tests to test for this use case and a
couple others.
This reverts commit e770153865c53c4fd72a68f23acff33c24e42a08.
This wasn't reviewed, and the functionality in question was
intentionally rejected the last time it was discussed in
https://reviews.llvm.org/D56305 .
intrin.h checks for x86_64. But the "x86_64" define is also defined for
ARM64EC, and we don't support all the intrinsics in ARM64EC mode. Fix
the preprocessor checks to handle this correctly. (If we actually need
some of these intrinsics in ARM64EC mode, we can revisit later.)
Not exactly sure how I didn't run into this issue before now... I think
I've built code that requires these headers, but maybe not since the
define fix landed.
Summary:
These headers are a part of the compiler's resource directory once
installed. However, they are currently placed in the binary directory
temporarily. This makes it more difficult to use the compiler out of the
build directory and will cause issues when moving to `liboffload`. This
patch changes the logic to write these instead to the copmiler's
resource directory inside of the build tree.
NOTE: This doesn't change the Fortran headers, I don't know enough about
those and it won't use the same directory.
This test is the bottleneck for OpenMP lit tests, running about twice as
long as the others. Break it into five tests based on run lines with the
same version.
Fix since #75481 got reverted.
- Explicitly set BitfieldBits to 0 to avoid uninitialized field member
for the integer checks:
```diff
- llvm::ConstantInt::get(Builder.getInt8Ty(), Check.first)};
+ llvm::ConstantInt::get(Builder.getInt8Ty(), Check.first),
+ llvm::ConstantInt::get(Builder.getInt32Ty(), 0)};
```
- `Value **Previous` was erroneously `Value *Previous` in
`CodeGenFunction::EmitWithOriginalRHSBitfieldAssignment`, fixed now.
- Update following:
```diff
- if (Kind == CK_IntegralCast) {
+ if (Kind == CK_IntegralCast || Kind == CK_LValueToRValue) {
```
CK_LValueToRValue when going from, e.g., char to char, and
CK_IntegralCast otherwise.
- Make sure that `Value *Previous = nullptr;` is initialized (see
1189e87951)
- Add another extensive testcase
`ubsan/TestCases/ImplicitConversion/bitfield-conversion.c`
---------
Co-authored-by: Vitaly Buka <vitalybuka@gmail.com>
This is a follow-up to #81506. As discussed in #87737, we're rejecting
incomplete types, save for exceptions listed in the C++ standard (`void`
and arrays of unknown bound). Note that arrays of unknown bound of
incomplete types are accepted.
Since we're happy with the current behavior of this intrinsic for
flexible array members
(https://github.com/llvm/llvm-project/pull/87737#discussion_r1553652570),
I added a couple of tests for that as well.
[CWG593](https://cplusplus.github.io/CWG/issues/593.html) "Falling off
the end of a destructor's function-try-block handler". As usual with CWG
issues resolved as NAD, we test for status-quo confirmed by CWG.