Summary:
The `clang-offload-packager` records the architecture of the job.
Currently there are cases where this will be empty. SYCL, CPU, and when
the user manually overrides it to be empty. In these cases we should
alwas consider it 'generic'. Adding this string both makes it clear how
it behaves and triggers the special handling for this architecture,
allowing it to bind to different architectures.
Summary:
Recent changes made a lot of this stuff redundant or unused, clean it up
a bit.
Also snuck in a change to pass the CUDA path since we still use it for
`fatbinary` internally.
Add initial parsing/sema support for new assumption clause so clause can
be specified. For now, it's ignored, just like the others.
Added support for 'no_openmp_construct' to release notes.
Testing
- Updated appropriate LIT tests.
- Testing: check-all
Add manual ELF packaging for `spirv64-intel` images as there is no
SPIR-V linker available. This format will be expected by the runtime
plugin we will submit in the future and is compatible with the format we
already use downstream.
---------
Signed-off-by: Sarnie, Nick <nick.sarnie@intel.com>
Summary:
This patch unifies the existing offloading entires into a single section
called `llvm_offload_entires`. This lets us use a more unified
offloading infrastructure so that all targets share the same handling.
The effect is that people in the runtimes now need to check if the kind
is what they expect, but the expectation is that you can combine
multiple potential providers into a compile job. Doesn't fully work
yet because of other runtime issues, but some day. Mostly this helps the
future of liboffload where we want to handle different languages than
OpenMP.
When computing the context hash, `clang` always includes the compiler's
working directory. This can lead to situations when the only difference
between two compilations is the working directory, different module
variants are generated. These variants are redundant. This PR implements
an optimization that ignores the working directory when computing the
context hash when safe.
Specifically, `clang` checks if it is safe to ignore the working
directory in `isSafeToIgnoreCWD`. The check involves going through
compile command options to see if any paths specified are relative. The
definition of relative path used here is that the input path is not
empty, and `llvm::sys::path::is_absolute` is false. If all the paths
examined are not relative, `clang` considers it safe to ignore the
current working directory and does not consider the working directory
when computing the context hash.
Update Dependency scanner so it can scan the dependency of a TU with
a provided buffer rather than relying on the on disk file system to
provide the input file.
New proposed function `clang-format-vc-diff`.
It is the same as calling `clang-format-region` on all diffs between
the content of a buffer-file and the content of the file at git
revision HEAD. This is essentially the same thing as:
`git-clang-format -f {filename}`
If the current buffer is saved.
The motivation is many project (LLVM included) both have code that is
non-compliant with there clang-format style and disallow unrelated
format diffs in PRs. This means users can't just run
`clang-format-buffer` on the buffer they are working on, and need to
manually go through all the regions by hand to get them
formatted. This is both an error prone and annoying workflow.
Note that PointerUnion::dyn_cast has been soft deprecated in
PointerUnion.h:
// FIXME: Replace the uses of is(), get() and dyn_cast() with
// isa<T>, cast<T> and the llvm::dyn_cast<T>
Literal migration would result in dyn_cast_if_present (see the
definition of PointerUnion::dyn_cast), but this patch uses dyn_cast
because we expect Storage to be nonnull. Note that if Storage were
null, dereferencing Ovl would trigger a segfault.
The atomic construct is a particularly complicated one. The directive
itself is pretty simple, it has 5 options for the 'atomic-clause'.
However, the associated statement is fairly complicated.
'read' accepts:
v = x;
'write' accepts:
x = expr;
'update' (or no clause) accepts:
x++;
x--;
++x;
--x;
x binop= expr;
x = x binop expr;
x = expr binop x;
'capture' accepts either a compound statement, or:
v = x++;
v = x--;
v = ++x;
v = --x;
v = x binop= expr;
v = x = x binop expr;
v = x = expr binop x;
IF 'capture' has a compound statement, it accepts:
{v = x; x binop= expr; }
{x binop= expr; v = x; }
{v = x; x = x binop expr; }
{v = x; x = expr binop x; }
{x = x binop expr ;v = x; }
{x = expr binop x; v = x; }
{v = x; x = expr; }
{v = x; x++; }
{v = x; ++x; }
{x++; v = x; }
{++x; v = x; }
{v = x; x--; }
{v = x; --x; }
{x--; v = x; }
{--x; v = x; }
While these are all quite complicated, there is a significant amount
of similarity between the 'capture' and 'update' lists, so this patch
reuses a lot of the same functions.
This patch implements the entirety of 'atomic', creating a new Sema file
for the sema for it, as it is fairly sizable.
Note that PointerUnion::dyn_cast has been soft deprecated in
PointerUnion.h:
// FIXME: Replace the uses of is(), get() and dyn_cast() with
// isa<T>, cast<T> and the llvm::dyn_cast<T>
Literal migration would result in dyn_cast_if_present (see the
definition of PointerUnion::dyn_cast), but this patch uses dyn_cast
because we expect Storage to be nonnull.
In the discussion around #116792, @rjmccall mentioned that ARCMigrate
has been obsoleted and that we could go ahead and remove it from Clang,
so this patch does just that.
This is an implementation of P1061 Structure Bindings Introduce a Pack
without the ability to use packs outside of templates. There is a couple
of ways the AST could have been sliced so let me know what you think.
The only part of this change that I am unsure of is the
serialization/deserialization stuff. I followed the implementation of
other Exprs, but I do not really know how it is tested. Thank you for
your time considering this.
---------
Co-authored-by: Yanzuo Liu <zwuis@outlook.com>
This is take two of #70976. This iteration of the patch makes sure that
custom
diagnostics without any warning group don't get promoted by `-Werror` or
`-Wfatal-errors`.
This implements parts of the extension proposed in
https://discourse.llvm.org/t/exposing-the-diagnostic-engine-to-c/73092/7.
Specifically, this makes it possible to specify a diagnostic group in an
optional third argument.
Note that PointerUnion::dyn_cast has been soft deprecated in
PointerUnion.h:
// FIXME: Replace the uses of is(), get() and dyn_cast() with
// isa<T>, cast<T> and the llvm::dyn_cast<T>
This patch migrates uses of PointerUnion::dyn_cast to
dyn_cast_if_present (see the definition of PointerUnion::dyn_cast).
Note that we cannot use dyn_cast in any of the migrations in this
patch; placing
assert(!X.isNull());
just before any of dyn_cast_if_present in this patch triggers some
failure in check-clang.
This change removes the need to call the clang-bolt target in order to
apply bolt optimizations to clang. Now running `ninja clang` will build
a clang with bolt optimizations, and `ninja check-clang` and `ninja
install-clang` will test and install bolt optimized clang too.
The clang-bolt target has been kept for compatibilty reasons, but it is
now just an alias to the clang target.
Also, this new design for applying the bolt optimizations to clang will
be easier to generalize and use to optimize other binaries/libraries in
the project.
---------
Co-authored-by: Amir Ayupov <fads93@gmail.com>
Co-authored-by: Petr Hosek <phosek@google.com>
A SYCL kernel entry point function is a non-member function or a static
member function declared with the `sycl_kernel_entry_point` attribute.
Such functions define a pattern for an offload kernel entry point
function to be generated to enable execution of a SYCL kernel on a
device. A SYCL library implementation orchestrates the invocation of
these functions with corresponding SYCL kernel arguments in response to
calls to SYCL kernel invocation functions specified by the SYCL 2020
specification.
The offload kernel entry point function (sometimes referred to as the
SYCL kernel caller function) is generated from the SYCL kernel entry
point function by a transformation of the function parameters followed
by a transformation of the function body to replace references to the
original parameters with references to the transformed ones. Exactly how
parameters are transformed will be explained in a future change that
implements non-trivial transformations. For now, it suffices to state
that a given parameter of the SYCL kernel entry point function may be
transformed to multiple parameters of the offload kernel entry point as
needed to satisfy offload kernel argument passing requirements.
Parameters that are decomposed in this way are reconstituted as local
variables in the body of the generated offload kernel entry point
function.
For example, given the following SYCL kernel entry point function
definition:
```
template<typename KernelNameType, typename KernelType>
[[clang::sycl_kernel_entry_point(KernelNameType)]]
void sycl_kernel_entry_point(KernelType kernel) {
kernel();
}
```
and the following call:
```
struct Kernel {
int dm1;
int dm2;
void operator()() const;
};
Kernel k;
sycl_kernel_entry_point<class kernel_name>(k);
```
the corresponding offload kernel entry point function that is generated
might look as follows (assuming `Kernel` is a type that requires
decomposition):
```
void offload_kernel_entry_point_for_kernel_name(int dm1, int dm2) {
Kernel kernel{dm1, dm2};
kernel();
}
```
Other details of the generated offload kernel entry point function, such
as its name and calling convention, are implementation details that need
not be reflected in the AST and may differ across target devices. For
that reason, only the transformation described above is represented in
the AST; other details will be filled in during code generation.
These transformations are represented using new AST nodes introduced
with this change. `OutlinedFunctionDecl` holds a sequence of
`ImplicitParamDecl` nodes and a sequence of statement nodes that
correspond to the transformed parameters and function body.
`SYCLKernelCallStmt` wraps the original function body and associates it
with an `OutlinedFunctionDecl` instance. For the example above, the AST
generated for the `sycl_kernel_entry_point<kernel_name>` specialization
would look as follows:
```
FunctionDecl 'sycl_kernel_entry_point<kernel_name>(Kernel)'
TemplateArgument type 'kernel_name'
TemplateArgument type 'Kernel'
ParmVarDecl kernel 'Kernel'
SYCLKernelCallStmt
CompoundStmt
<original statements>
OutlinedFunctionDecl
ImplicitParamDecl 'dm1' 'int'
ImplicitParamDecl 'dm2' 'int'
CompoundStmt
VarDecl 'kernel' 'Kernel'
<initialization of 'kernel' with 'dm1' and 'dm2'>
<transformed statements with redirected references of 'kernel'>
```
Any ODR-use of the SYCL kernel entry point function will (with future
changes) suffice for the offload kernel entry point to be emitted. An
actual call to the SYCL kernel entry point function will result in a
call to the function. However, evaluation of a `SYCLKernelCallStmt`
statement is a no-op, so such calls will have no effect other than to
trigger emission of the offload kernel entry point.
Additionally, as a related change inspired by code review feedback,
these changes disallow use of the `sycl_kernel_entry_point` attribute
with functions defined with a _function-try-block_. The SYCL 2020
specification prohibits the use of C++ exceptions in device functions.
Even if exceptions were not prohibited, it is unclear what the semantics
would be for an exception that escapes the SYCL kernel entry point
function; the boundary between host and device code could be an implicit
noexcept boundary that results in program termination if violated, or
the exception could perhaps be propagated to host code via the SYCL
library. Pending support for C++ exceptions in device code and clear
semantics for handling them at the host-device boundary, this change
makes use of the `sycl_kernel_entry_point` attribute with a function
defined with a _function-try-block_ an error.
Previously, they used a hand-rolled Pascal-string encoding different
from all the other string tables produced from TableGen. This moves them
to use the newly introduced runtime abstraction, and enhances that
abstraction to support iterating over the string table as used in this
case.
From what I can tell the Pascal-string encoding isn't critical here to
avoid expensive `strlen` calls, so I think this is a simpler and more
consistent model. But if folks would prefer a Pascal-string style
encoding, I can instead work to switch the `StringTable` abstraction
towards that. It would require some tricky tradeoffs though to make it
reasonably general: either using 4 bytes instead of 1 byte to encode the
size, or having a fallback to `strlen` for long strings.
This fixes a bug where report links generated from files such as
StylePrimitiveNumericTypes+Conversions.h in WebKit result in an error.
---------
Co-authored-by: Brianna Fan <bfan2@apple.com>
When building with -DLLVM_ENABLE_EXPENSIVE_CHECKS=ON with a recent
libstdc++ (e.g. from gcc 13.3.0) the testcase
clang/test/Misc/warning-flags-tree.c fail with the message:
```
+ diagtool tree --internal
.../include/c++/13.3.0/bits/stl_algo.h:2013:
In function:
_ForwardIterator std::lower_bound(_ForwardIterator, _ForwardIterator,
const _Tp &, _Compare) [_ForwardIterator = const
diagtool::DiagnosticRecord *, _Tp = diagtool::DiagnosticRecord, _Compare
= bool (*)(const diagtool::DiagnosticRecord &, const
diagtool::DiagnosticRecord &)]
Error: elements in iterator range [first, last) are not partitioned by the predicate __comp and value __val.
Objects involved in the operation:
iterator "first" @ 0x7ffea8ef2fd8 {
}
iterator "last" @ 0x7ffea8ef2fd0 {
}
```
The reason for this error is that std::lower_bound is called on
BuiltinDiagnosticsByID without it being entirely sorted. Calling
std::lower_bound If the range is not sorted, the behavior of this
function is undefined. This is detected when building with expensive
checks.
To make BuiltinDiagnosticsByID sorted we need to slightly change the
order the inc-files are included. The include of
DiagnosticCrossTUKinds.inc in DiagnosticNames.cpp is included too early
and should be moved down directly after DiagnosticCommentKinds.inc.
As a part of pull request the includes that build up
BuiltinDiagnosticsByID table are extracted into a common wrapper header
file AllDiagnosticKinds.inc that is used by both clang and diagtool.
The first API is clang_visitCXXBaseClasses: this allows visiting the
base classes without going through the generic child visitor (which is
awkward, and doesn't work for template instantiations).
The second API is clang_getOffsetOfBase; this allows computing the
offset of a base in the class layout, the same way
clang_Cursor_getOffsetOfField computes the offset of a field.
Also, add a Python binding for the existing function
clang_isVirtualBase.
* Construct frequently-accessed TimerLock/DefaultTimerGroup early to
reduce overhead.
* Rename `aquireDefaultGroup` to `acquireTimerGlobals` and restore
ManagedStatic::claim. https://reviews.llvm.org/D76099
* Drop mtg::. We use internal linkage, so mtg:: is unneeded and might
mislead users. In addition, llvm/ code almost never introduces a named
namespace not in llvm::. Drop mtg::.
* Replace some unique_ptr with optional to reduce overhead.
* Switch to `functionName()`.
* Simplify `llvm::initTimerOptions` and `TimerGroup::constructForStatistics()`
Pull Request: https://github.com/llvm/llvm-project/pull/122429
These two clauses just take a 'var-list' and specify where the variables
should be copied from/to. This patch implements the AST nodes for them
and ensures they properly take a var-list.
The 'self' clause is an unfortunately difficult one, as it has a
significantly different meaning between 'update' and the other
constructs. This patch introduces a way for the 'self' clause to work
as both. I considered making this two separate AST nodes (one for
'self' on 'update' and one for the others), however this makes the
automated macros/etc for supporting a clause break.
Instead, 'self' has the ability to act as either a condition or as a
var-list clause. As this is the only one of its kind, it is implemented
all within it. If in the future we have more that work like this, we
should consider rewriting a lot of the macros that we use to make
clauses work, and make them separate ast nodes.
The dependency file and the P1689 file are text files, but the
open call misses the OF_Text flag. This PR adds the flag.
Fixes regressions in test cases ClangScanDeps/modules-extern-unrelated.m
and ClangScanDeps/P1689.cppm.
This reverts commit 81fc3add1e627c23b7270fe2739cdacc09063e54.
This breaks some LLDB tests, e.g.
SymbolFile/DWARF/x86/no_unique_address-with-bitfields.cpp:
lldb: ../llvm-project/clang/lib/AST/Decl.cpp:4604: unsigned int clang::FieldDecl::getBitWidthValue() const: Assertion `isa<ConstantExpr>(getBitWidth())' failed.
Save the bitwidth value as a `ConstantExpr` with the value set. Remove
the `ASTContext` parameter from `getBitWidthValue()`, so the latter
simply returns the value from the `ConstantExpr` instead of
constant-evaluating the bitwidth expression every time it is called.
This executable construct has a larger list of clauses than some of the
others, plus has some additional restrictions. This patch implements
the AST node, plus the 'cannot be the body of a if, while, do, switch,
or label' statement restriction. Future patches will handle the
rest of the restrictions, which are based on clauses.
A fairly simple one, only valid on the 'set' construct, this clause
takes an int expression. Most of the work was already done as a part of
parsing, so this patch ends up being a lot of infrastructure.
The 'set' construct is another fairly simple one, it doesn't have an
associated statement and only a handful of allowed clauses. This patch
implements it and all the rules for it, allowing 3 of its for clauses.
The only exception is default_async, which will be implemented in a
future patch, because it isn't just being enabled, it needs a complete
new implementation.
This is the first of a series of patches to add support for OpenMP
offloading to SPIR-V through liboffload with the first intended target
being Intel GPUs. This patch implements the basic driver and
`clang-linker-wrapper` work for JIT mode. There are still many missing
pieces, so this is not yet usable.
We introduce `spirv64-intel-unknown` as the only currently supported
triple. The user-facing argument to enable offloading will be `-fopenmp
-fopenmp-targets=spirv64-intel`
Add a new `SPIRVOpenMPToolChain` toolchain based on the existing general
SPIR-V toolchain which will call all the required SPIR-V tools (and
eventually the SPIR-V backend) as well as add the corresponding device
RTL as an argument to the linker.
We can't get through the front end consistently yet, so it's difficult
to add any LIT tests that execute any tools, but front end changes are
planned very shortly, and then we can add those tests.
---------
Signed-off-by: Sarnie, Nick <nick.sarnie@intel.com>
This will prevent the error on systems with a default encoding other
than utf-8.
```
UnicodeDecodeError: 'gbk' codec can't decode byte 0xb6 in position 12958: illegal multibyte sequence
```
This is a very simple sema implementation, and just required AST node
plus the existing diagnostics. This patch adds tests and adds the AST
node required, plus enables it for 'init' and 'shutdown' (only!)
These two constructs are very simple and similar, and only support 3
different clauses, two of which are already implemented. This patch
adds AST nodes for both constructs, and leaves the device_num clause
unimplemented, but enables the other two.
The arguments to this are the same as for the 'wait' clause, so this
reuses all of that infrastructure. So all this has to do is support a
pair of clauses that are already implemented (if and async), plus create
an AST node. This patch does so, and adds proper testing.
This adds support for the loongarch64 architecture to the offload host
plugin.
Similar to #115773
To fix some test issues, I've had to add the LoongArch64 target to:
- CompilerInvocation::ParseLangArgs
- linkDevice in ClangLinuxWrapper.cpp
- OMPContext::OMPContext (to set the device_kind_cpu trait)
Reviewed By: jhuber6
Pull Request: https://github.com/llvm/llvm-project/pull/120173
This fixes a bug where report links generated from files such as
StylePrimitiveNumericTypes+Conversions.h in WebKit result in an error.
Co-authored-by: Brianna Fan <bfan2@apple.com>
This is a clause that is only valid on 'host_data' constructs, and
identifies variables which it should use the current device address.
From a Sema perspective, the only thing novel here is mild changes to
how ActOnVar works for this clause, else this is very much like the rest
of the 'var-list' clauses.
'delete' is another clause that has very little compile-time
implication, but needs a full AST that takes a var list. This patch
ipmlements it fully, plus adds sufficient test coverage.