The purpose of this flag is to allow the compiler to assume that each
object file passed to the linker has been compiled using a unique
source file name. This is useful for reducing link times when doing
ThinLTO in combination with whole-program devirtualization or CFI,
as it allows modules without exported symbols to be built with ThinLTO.
Reviewers: vitalybuka, teresajohnson
Reviewed By: teresajohnson
Pull Request: https://github.com/llvm/llvm-project/pull/135728
Finding operator delete[] is still problematic, without it the extension
is a security hazard, so reverting until the problem with operator
delete[] is figured out.
This reverts the following PRs:
Reland [MS][clang] Add support for vector deleting destructors (llvm#133451)
[MS][clang] Make sure vector deleting dtor calls correct operator delete (llvm#133950)
[MS][clang] Fix crash on deletion of array of pointers (llvm#134088)
[clang] Do not diagnose unused deleted operator delete[] (llvm#134357)
[MS][clang] Error about ambiguous operator delete[] only when required (llvm#135041)
This feature is currently not supported in the compiler.
To facilitate this we emit a stub version of each kernel
function body with different name mangling scheme, and
replaces the respective kernel call-sites appropriately.
Fixes https://github.com/llvm/llvm-project/issues/60313
D120566 was an earlier attempt made to upstream a solution
for this issue.
---------
Co-authored-by: anikelal <anikelal@amd.com>
Whereas it is UB in terms of the standard to delete an array of objects
via pointer whose static type doesn't match its dynamic type, MSVC
supports an extension allowing to do it.
Aside from array deletion not working correctly in the mentioned case,
currently not having this extension implemented causes clang to generate
code that is not compatible with the code generated by MSVC, because
clang always puts scalar deleting destructor to the vftable. This PR
aims to resolve these problems.
It was reverted due to link time errors in chromium with sanitizer
coverage enabled,
which is fixed by https://github.com/llvm/llvm-project/pull/131929 .
The second commit of this PR also contains a fix for a runtime failure
in chromium reported
in
https://github.com/llvm/llvm-project/pull/126240#issuecomment-2730216384
.
Fixes https://github.com/llvm/llvm-project/issues/19772
Original PR: #130537
Originally reverted due to revert of dependent commit. Relanding with no
changes.
This changes the MemberPointerType representation to use a
NestedNameSpecifier instead of a Type to represent the base class.
Since the qualifiers are always parsed as nested names, there was an
impedance mismatch when converting these back and forth into types, and
this led to issues in preserving sugar.
The nested names are indeed a better match for these, as the differences
which a QualType can represent cannot be expressed syntatically, and
they represent the use case more exactly, being either dependent or
referring to a CXXRecord, unqualified.
This patch also makes the MemberPointerType able to represent sugar for
a {up/downcast}cast conversion of the base class, although for now the
underlying type is canonical, as preserving the sugar up to that point
requires further work.
As usual, includes a few drive-by fixes in order to make use of the
improvements.
Original PR: #130537
Reland after updating lldb too.
This changes the MemberPointerType representation to use a
NestedNameSpecifier instead of a Type to represent the base class.
Since the qualifiers are always parsed as nested names, there was an
impedance mismatch when converting these back and forth into types, and
this led to issues in preserving sugar.
The nested names are indeed a better match for these, as the differences
which a QualType can represent cannot be expressed syntatically, and
they represent the use case more exactly, being either dependent or
referring to a CXXRecord, unqualified.
This patch also makes the MemberPointerType able to represent sugar for
a {up/downcast}cast conversion of the base class, although for now the
underlying type is canonical, as preserving the sugar up to that point
requires further work.
As usual, includes a few drive-by fixes in order to make use of the
improvements.
This changes the MemberPointerType representation to use a
NestedNameSpecifier instead of a Type to represent the class.
Since the qualifiers are always parsed as nested names, there was an
impedance mismatch when converting these back and forth into types, and
this led to issues in preserving sugar.
The nested names are indeed a better match for these, as the differences
which a QualType can represent cannot be expressed syntactically, and it
also represents the use case more exactly, being either dependent or
referring to a CXXRecord, unqualified.
This patch also makes the MemberPointerType able to represent sugar for
a {up/downcast}cast conversion of the base class, although for now the
underlying type is canonical, as preserving the sugar up to that point
requires further work.
As usual, includes a few drive-by fixes in order to make use of the
improvements, and removing some duplications, for example
CheckBaseClassAccess is deduplicated from across SemaAccess and
SemaCast.
This caused link errors when building with sancov. See comment on the PR.
> Whereas it is UB in terms of the standard to delete an array of objects
> via pointer whose static type doesn't match its dynamic type, MSVC
> supports an extension allowing to do it.
> Aside from array deletion not working correctly in the mentioned case,
> currently not having this extension implemented causes clang to generate
> code that is not compatible with the code generated by MSVC, because
> clang always puts scalar deleting destructor to the vftable. This PR
> aims to resolve these problems.
>
> Fixes https://github.com/llvm/llvm-project/issues/19772
This reverts commit d6942d54f677000cf713d2b0eba57b641452beb4.
This patch makes the assertion (that is currently in a comment) that
validates that names mangled by clang can be demangled by LLVM actually
compile/work. There were some minor issues that needed to be fixed (like
starts_with not being available on std::string and needing to call
getDecl() on GD), and a logic issue that should be fixed in this patch.
This enables just uncommenting the assertion to enable it within the
compiler (minus needing to add the header file).
I initially implemented codegen to be a 'no-op' for these declarations,
which I thought was properly implemented. However, when they are a
top-level decl, we have a separate switch. This patch makes sure they
are properly emitted at top-level as a no-op, and adds a test for both
top-level and not top-level.
The resource wrapper should have internal linkage because it contains a
handle to the global resource, and it not the actual global.
Makeing this changed exposed that we were zeroinitializing the resouce,
which is a problem. The handle cannot be zeroinitialized. This is
changed to use poison instead.
Fixes https://github.com/llvm/llvm-project/issues/122767.
---------
Co-authored-by: Helena Kotas <hekotas@microsoft.com>
Whereas it is UB in terms of the standard to delete an array of objects
via pointer whose static type doesn't match its dynamic type, MSVC
supports an extension allowing to do it.
Aside from array deletion not working correctly in the mentioned case,
currently not having this extension implemented causes clang to generate
code that is not compatible with the code generated by MSVC, because
clang always puts scalar deleting destructor to the vftable. This PR
aims to resolve these problems.
Fixes https://github.com/llvm/llvm-project/issues/19772
Add option and statement attribute for controlling emitting of
target-specific
metadata to atomicrmw instructions in IR.
The RFC for this attribute and option is
https://discourse.llvm.org/t/rfc-add-clang-atomic-control-options-and-pragmas/80641,
Originally a pragma was proposed, then it was changed to clang
attribute.
This attribute allows users to specify one, two, or all three options
and must be applied
to a compound statement. The attribute can also be nested, with inner
attributes
overriding the options specified by outer attributes or the target's
default
options. These options will then determine the target-specific metadata
added to atomic
instructions in the IR.
In addition to the attribute, three new compiler options are introduced:
`-f[no-]atomic-remote-memory`, `-f[no-]atomic-fine-grained-memory`,
`-f[no-]atomic-ignore-denormal-mode`.
These compiler options allow users to override the default options
through the
Clang driver and front end. `-m[no-]unsafe-fp-atomics` is aliased to
`-f[no-]ignore-denormal-mode`.
In terms of implementation, the atomic attribute is represented in the
AST by the
existing AttributedStmt, with minimal changes to AST and Sema.
During code generation in Clang, the CodeGenModule maintains the current
atomic options,
which are used to emit the relevant metadata for atomic instructions.
RAII is used
to manage the saving and restoring of atomic options when entering
and exiting nested AttributedStmt.
All variable declarations in the global scope that are not resources,
static or empty are implicitly added to implicit constant buffer
`$Globals`. They are created in `hlsl_constant` address space and
collected in an implicit `HLSLBufferDecl` node that is added to the AST
at the end of the translation unit. Codegen is the same as for explicit
constant buffers.
Fixes#123801
This is a second attempt to implement this feature. The first attempt
had to be reverted because of memory leaks. The problem was adding a
`SmallVector` member on `HLSLBufferDecl` node to represent a list of
default buffer declarations. When this vector needed to grow, it
allocated memory that was never released, because all memory used by AST
nodes must be allocated by `ASTContext` allocator and is released all at
once. Destructors on AST nodes are never called.
It this change the list of default buffer declarations is collected in a
`SmallVector` instance on `SemaHLSL`. The `HLSLBufDecl` representing
`$Globals` is created at the end of the translation unit when the number
of declarations is known, and the list is copied into an array allocated
by the `ASTContext` allocator.
All variable declarations in the global scope that are not resources,
static or empty are implicitly added to implicit constant buffer
`$Globals`. They are created in `hlsl_constant` address space and
collected in an implicit `HLSLBufferDecl` node that is added to the AST
at the end of the translation unit. Codegen is the same as for explicit
constant buffers.
Fixes#123801
As Intel is working to add support for SPIR-V OpenMP device offloading
in upstream clang/liboffload, we need to modify the OpenMP frontend to
allow SPIR-V as well as generate valid IR for SPIR-V. For example, we
need the frontend to generate code to define and interact with device
globals used in the DeviceRTL.
This is the beginning of what I expect will be (many) other changes, but
let's get started with something simple.
---------
Signed-off-by: Sarnie, Nick <nick.sarnie@intel.com>
Kernel Control Flow Integrity (kCFI) is a feature that hardens indirect
calls by comparing a 32-bit hash of the function pointer's type against
a hash of the target function's type. If the hashes do not match, the
kernel may panic (or log the hash check failure, depending on the
kernel's configuration). These hashes are computed at compile time by
applying the xxHash64 algorithm to each mangled canonical function (or
function pointer) type, then truncating the result to 32 bits. This hash
is written into each indirect-callable function header by encoding it as
the 32-bit immediate operand to a `MOVri` instruction, e.g.:
```
__cfi_foo:
nop
nop
nop
nop
nop
nop
nop
nop
nop
nop
nop
movl $199571451, %eax # hash of foo's type = 0xBE537FB
foo:
...
```
This PR extends x86-based kCFI with a 3-bit arity indicator encoded in
the `MOVri` instruction's register (reg) field as follows:
| Arity Indicator | Description | Encoding in reg field |
| --------------- | --------------- | --------------- |
| 0 | 0 parameters | EAX |
| 1 | 1 parameter in RDI | ECX |
| 2 | 2 parameters in RDI and RSI | EDX |
| 3 | 3 parameters in RDI, RSI, and RDX | EBX |
| 4 | 4 parameters in RDI, RSI, RDX, and RCX | ESP |
| 5 | 5 parameters in RDI, RSI, RDX, RCX, and R8 | EBP |
| 6 | 6 parameters in RDI, RSI, RDX, RCX, R8, and R9 | ESI |
| 7 | At least one parameter may be passed on the stack | EDI |
For example, if `foo` takes 3 register arguments and no stack arguments
then the `MOVri` instruction in its kCFI header would instead be written
as:
```
movl $199571451, %ebx # hash of foo's type = 0xBE537FB
```
This PR will benefit other CFI approaches that build on kCFI, such as
FineIBT. For example, this proposed enhancement to FineIBT must be able
to infer (at kernel init time) which registers are live at an indirect
call target: https://lkml.org/lkml/2024/9/27/982. If the arity bits are
available in the kCFI function header, then this information is trivial
to infer.
Note that there is another existing PR proposal that includes the 3-bit
arity within the existing 32-bit immediate field, which introduces
different security properties:
https://github.com/llvm/llvm-project/pull/117121.
This both reapplies #118734, the initial attempt at this, and updates it
significantly.
First, it uses the newly added `StringTable` abstraction for string
tables, and simplifies the construction to build the string table and
info arrays separately. This should reduce any `constexpr` compile time
memory or CPU cost of the original PR while significantly improving the
APIs throughout.
It also restructures the builtins to support sharding across several
independent tables. This accomplishes two improvements from the
original PR:
1) It improves the APIs used significantly.
2) When builtins are defined from different sources (like SVE vs MVE in
AArch64), this allows each of them to build their own string table
independently rather than having to merge the string tables and info
structures.
3) It allows each shard to factor out a common prefix, often cutting the
size of the strings needed for the builtins by a factor two.
The second point is important both to allow different mechanisms of
construction (for example a `.def` file and a tablegen'ed `.inc` file,
or different tablegen'ed `.inc files), it also simply reduces the sizes
of these tables which is valuable given how large they are in some
cases. The third builds on that size reduction.
Initially, we use this new sharding rather than merging tables in
AArch64, LoongArch, RISCV, and X86. Mostly this helps ensure the system
works, as without further changes these still push scaling limits.
Subsequent commits will more deeply leverage the new structure,
including using the prefix capabilities which cannot be easily factored
out here and requires deep changes to the targets.
Per the ELF spec, section groups may only contain local symbols if those
symbols are only
referenced from within the section group. [1] In the case of template
parameter objects,
they can be referenced from outside the group when the type of the
object was declared
in an anonymous namespace. In that case, we can't place the object in a
COMDAT. This matches
GCC's linkage behavior on the test input.
[1]:
https://www.sco.com/developers/gabi/latest/ch4.sheader.html#section_groups
This is an implementation of P1061 Structure Bindings Introduce a Pack
without the ability to use packs outside of templates. There is a couple
of ways the AST could have been sliced so let me know what you think.
The only part of this change that I am unsure of is the
serialization/deserialization stuff. I followed the implementation of
other Exprs, but I do not really know how it is tested. Thank you for
your time considering this.
---------
Co-authored-by: Yanzuo Liu <zwuis@outlook.com>
* We want the default version to have this attribute too otherwise it
becomes indistinguishable from non-versioned functions.
* We don't need the '+' unlike target-features which can negate. This
will allow using the parsing API of target_version/clones for the
metadata too.
Currently, the more features a version has, the higher its priority is.
We are changing ACLE https://github.com/ARM-software/acle/pull/370 as
follows:
"Among any two versions, the higher priority version is determined by
identifying the highest priority feature that is specified in exactly
one of the versions, and selecting that version."
We need to be able to propagate information about FMV attribute strings
from C/C++ source to LLVM IR. This is necessary so that we can
distinguish which target-features are coming from the cmdline, which are
coming from the target attribute, and which are coming from feature
dependency expansion. We need this for static resolution of calls in
LLVM. Here's a motivating example:
Suppose you have target_version("i8mm+dotprod") and
target_version("fcma"). The first version clearly has higher priority.
Now suppose you specify -march=armv8-a+i8mm on the command line. Then
the versions would have target-features "+i8mm,+dotprod" and
"+i8mm,+fcma" respectively. If you are using those to deduce version
priority, then you would incorrectly deduce that the second version was
higher priority than the first.
Currently we need at least one more version other than the default to
trigger FMV. However we would like a header file declaration
__attribute__((target_version("default"))) void f(void);
to guarantee that there will be f.default
Re-apply #113148 after revert in #119331
If function pointer signing is enabled, sign personality function
pointer stored in `.DW.ref.__gxx_personality_v0` section with IA key,
0x7EAD = `ptrauth_string_discriminator("personality")` constant
discriminator and address diversity enabled.
If function pointer signing is enabled, sign personality function
pointer stored in `.DW.ref.__gxx_personality_v0` section with IA key,
0x7EAD = `ptrauth_string_discriminator("personality")` constant
discriminator and address diversity enabled.
Currently we have code with target hooks in CodeGenModule shared between
X86 and AArch64 for sorting MultiVersionResolverOptions. Those are used
when generating IFunc resolvers for FMV. The RISCV target has different
criteria for sorting, therefore it repeats sorting after calling
CodeGenFunction::EmitMultiVersionResolver.
I am moving the FMV priority logic in TargetInfo, so that it can be
implemented by the TargetParser which then makes it possible to query it
from llvm. Here is an example why this is handy:
https://github.com/llvm/llvm-project/pull/87939
When dealing with cpu_specific GlobalDecl,
GetOrCreateMultiVersionResolver should immediately return the already
created llvm function if it exists.
Fixes https://github.com/llvm/llvm-project/issues/115299.
Add `-fptrauth-elf-got` clang cc1 flag and set `ptrauth_elf_got`
preprocessor feature and `PointerAuthELFGOT` LangOption correspondingly.
No additional checks like ensuring OS binary format is ELF are
performed: it should be done on clang driver level when a pauth-enabled
environment implying signed GOT enabled is requested.
If the cc1 flag is passed, "ptrauth-elf-got" IR module flag is set.
Reproducer:
```
//--- a.cppm
export module a;
int func();
static int a = func();
//--- a.cpp
import a;
```
The `func()` should only execute once. However, before this patch we
will somehow import `static int a` from a.cppm incorrectly and
initialize that again.
This is super bad and can introduce serious runtime behaviors.
And also surprisingly, it looks like the root cause of the problem is
simply some oversight choosing APIs.
Fixes: #113187
Avoid to create init function since clang does not support global
variable with flexible array init.
It will cause assertion failure later.
Adds `@_init_resource_bindings()` function to module initialization that
includes `handle.fromBinding` intrinsic calls for simple resource
declarations. Arrays of resources or resources inside user defined types
are not supported yet.
While this unblocks our progress on [Compile a runnable shader from
clang](https://github.com/llvm/wg-hlsl/issues/7) milestone, this is
probably not the way we would like to handle resource binding
initialization going forward. Ideally, it should be done via the
resource class constructors in order to support dynamic resource binding
or unbounded arrays if resources.
Depends on PRs #110327 and #111203.
Part 1 of #105076