Fixes: #113187
Avoid to create init function since clang does not support global
variable with flexible array init.
It will cause assertion failure later.
Adds `@_init_resource_bindings()` function to module initialization that
includes `handle.fromBinding` intrinsic calls for simple resource
declarations. Arrays of resources or resources inside user defined types
are not supported yet.
While this unblocks our progress on [Compile a runnable shader from
clang](https://github.com/llvm/wg-hlsl/issues/7) milestone, this is
probably not the way we would like to handle resource binding
initialization going forward. Ideally, it should be done via the
resource class constructors in order to support dynamic resource binding
or unbounded arrays if resources.
Depends on PRs #110327 and #111203.
Part 1 of #105076
When the arch in the triple in "spirv", the default target codegen is
currently used. We should be using the spir-v target codegen. This will
be used to have SPIR-V specific lowering of the HLSL types.
Done by calling clang::runWithSufficientStackSpace().
Added CodeGenModule::runWithSufficientStackSpace() method similar to the
one in Sema to provide a single warning when this triggers
Fixes: #111699
Gentoo is planning to introduce a `*t64` suffix for triples that will be
used by 32-bit platforms that use 64-bit `time_t`. Add support for
parsing and accepting these triples, and while at it make clang
automatically enable the necessary glibc feature macros when this suffix
is used.
An open question is whether we can backport this to LLVM 19.x. After
all, adding new triplets to Triple sounds like an ABI change — though I
suppose we can minimize the risk of breaking something if we move new
enum values to the very end.
Rename the function to reflect its correct behavior and to be consistent
with `Module::getOrInsertFunction`. This is also in preparation of
adding a new `Intrinsic::getDeclaration` that will have behavior similar
to `Module::getFunction` (i.e, just lookup, no creation).
Currently, `__constant__` variables do not get unconditionally marked as
`constant` in IR, which seems a bit odd given their definition. This is
generally inconsequential for NVPTX/AMDGPU, since said variables get
emitted in the constant address space for those BEs. However, it is
potentially significant for e.g. HIP-on-SPIR-V cases, as SPIR-V does not
allow casts to/from the constant AS (`UniformConstant`), which forces
`__constant__` variables to be emitted in the global AS, thus making IR
constness meaningful.
This patch enables the following command line flags for RISC-V targets:
+ `-fcf-protection=branch` turns on forward-edge control-flow integrity conditioning
+ `-mcf-branch-label-scheme=unlabeled|func-sig` selects the label scheme used in the forward-edge CFI conditioning
This reverts commit 4a63f4d301c0e044073e1b1f8f110015ec1778a1.
It was reverted because of a buildbot breakage, but the fix-forward has
landed (https://github.com/llvm/llvm-project/pull/109023).
HLSL inlines all its functions by default. This uses the alwaysinline
attribute to make the alwaysinliner pass inline any function not
explicitly marked noinline by the user or autogeneration. The
alwayslinline marking takes place in `SetLLVMFunctionAttributesForDefinitions`
where all other inlining interactions are determined.
The outermost entry function is marked noinline because there's no
reason to inline it. Any user calls to an entry function will instead call
the internal mangled version of the entry function.
Adds tests for function and constructor inlining and augments some
existing tests to verify correct inlining of implicitly created
functions as well.
Incidentally restore RUN line that I believe was mistakenly removed as
part of #88918Fixes#89282
This patch enable the function multiversion(FMV) and `target_clones`
attribute for RISC-V target.
The proposal of `target_clones` syntax can be found at the
https://github.com/riscv-non-isa/riscv-c-api-doc/pull/48 (which has
landed), as modified by the proposed
https://github.com/riscv-non-isa/riscv-c-api-doc/pull/85 (which adds the
priority syntax).
It supports the `target_clones` function attribute and function
multiversioning feature for RISC-V target. It will generate the ifunc
resolver function for the function that declared with target_clones
attribute.
The resolver function will check the version support by runtime object
`__riscv_feature_bits`.
For example:
```
__attribute__((target_clones("default", "arch=+ver1", "arch=+ver2"))) int bar() {
return 1;
}
```
the corresponding resolver will be like:
```
bar.resolver() {
__init_riscv_feature_bits();
// Check arch=+ver1
if ((__riscv_feature_bits.features[0] & BITMASK_OF_VERSION1) == BITMASK_OF_VERSION1) {
return bar.arch=+ver1;
} else {
// Check arch=+ver2
if ((__riscv_feature_bits.features[0] & BITMASK_OF_VERSION2) == BITMASK_OF_VERSION2) {
return bar.arch=+ver2;
} else {
// Default
return bar.default;
}
}
}
```
Adds target codegen info class for DirectX. For now it always translates
`__hlsl_resource_t` handle to `target("dx.TypedBuffer", i32, 1, 0, 1)`
(`RWBuffer<int>`). More work is needed to determine the actual target
exp type and parameters based on the resource handle attributes.
Part 1/2 of #95952
Commit 0d527e56a5ee ("GlobalIFunc: Make ifunc respect function address
spaces") added support for this within LLVM, but Clang does not properly
honour the target's address spaces when creating IFUNCs, crashing with
RAUW and verifier assertion failures when compiling C code on a target
with a non-zero program address space, so fix this.
In incremental compilation clang works with multiple `llvm::Module`s.
Our current approach is to create a CodeGenModule entity for every new
module request (via StartModule). However, some of the state such as the
mangle context needs to be preserved to keep the original semantics in
the ever-growing TU.
Fixes: llvm/llvm-project#95581.
cc: @jeaye
With -fsanitize-cfi-icall-experimental-normalize-integers, Clang
appends ".normalized" to KCFI types in CodeGenModule::CreateKCFITypeId,
which changes type hashes also for functions that don't have integer
types in their signatures. However, llvm::setKCFIType does not take
integer normalization into account, which means LLVM generated
functions with KCFI types, e.g. sanitizer constructors, will fail KCFI
checks when integer normalization is enabled in Clang.
Add a cfi-normalize-integers module flag to indicate integer
normalization is used, and append ".normalized" to KCFI types also in
llvm::setKCFIType to fix the type mismatch.
For llvm_linux platform, define the following meaning for bits 9, 10,
11:
- bit 9: set if indirect gotos signing is enabled;
- bit 10: set if type info vtable pointer discrimination is enabled;
- bit 11: set if function pointer type discrimination is enabled.
As part of the LLVM effort to eliminate debug-info intrinsics, we're
moving to a world where only iterators should be used to insert
instructions. This isn't a problem in clang when instructions get
generated before any debug-info is inserted, however we're planning on
deprecating and removing the instruction-pointer insertion routines.
Scatter some calls to getIterator in a few places, remove a
deref-then-addrof on another iterator, and add an overload for the
createLoadInstBefore utility. Some callers passes a null insertion
point, which we need to handle explicitly now.
Treat 8th bit of version value for llvm_linux platform as signed GOT
flag.
- clang: define `PointerAuthELFGOT` LangOption and set 8th bit of
`aarch64-elf-pauthabi-version` LLVM module flag correspondingly;
- llvm-readobj: print `PointerAuthELFGOT` or `!PointerAuthELFGOT` in
version description of llvm_linux platform depending on whether the flag
is set.
If both `-fptrauth-init-fini` and `-fptrauth-calls` are passed, sign
function pointers in `llvm.global_ctors` and `llvm.global_dtors` with
constant discriminator 0xD9D4
(`ptrauth_string_discriminator("init_fini")`). Additionally, if
`-fptrauth-init-fini-address-discrimination` is passed, address
discrimination is used for signing (otherwise, just constant
discriminator is used).
For address discrimination, we use it's special form since uses of
`llvm.global_{c|d}tors` are disallowed (see
`Verifier::visitGlobalVariable`) and we can't emit `getelementptr`
expressions referencing these special arrays. A signed ctor/dtor pointer
with special address discrimination applied looks like the following:
```
ptr ptrauth (ptr @foo, i32 0, i64 55764, ptr inttoptr (i64 1 to ptr))
```
See the attached test for the motivation example. If we're too greedy to
not emit the definition for inline builtins, we may meet a middle end
crash. And it should be good to emit inline builtins always.
This patch replaces getAs with castAs and dyn_cast with cast to ensure
type safety and prevents potential null pointer dereferences. These
changes enforce compile-time checks for correct type casting in
ASTContext and CodeGenModule.
See the attached test for the motivation example. If we're too greedy to
not emit the definition for inline builtins, we may meet a middle end
crash. And it should be good to emit inline builtins always.
When `pauthtest` is either passed as environment part of AArch64 Linux
triple
or passed via `-mabi=`, enable the following ptrauth flags:
- `intrinsics`;
- `calls`;
- `returns`;
- `auth-traps`;
- `vtable-pointer-address-discrimination`;
- `vtable-pointer-type-discrimination`;
- `init-fini`.
Some related stuff is still subject to change, and the ABI itself might
be changed, so end users are not expected to use this and the ABI name
has 'test' suffix.
If `-mabi=pauthtest` option is used, it's normalized to effective
triple.
When the environment part of the effective triple is `pauthtest`, try
to use `aarch64-linux-pauthtest` as multilib directory.
The following is not supported:
- combination of `pauthtest` ABI with any branch protection scheme
except BTI;
- explicit set of environment part of the triple to a value different
from `pauthtest` in combination with `-mabi=pauthtest`;
- usage on non-Linux OS.
---------
Co-authored-by: Anatoly Trosinenko <atrosinenko@accesssoftek.com>
It was raised in https://github.com/llvm/llvm-project/issues/81494 that
we are not generating correct code when there is no TU-local caller.
The suggestion was to emit a resolver:
* Whenever there is a use in the TU.
* When the TU has a definition of the default version.
See the comment for more details:
https://github.com/llvm/llvm-project/issues/81494#issuecomment-1985963497
This got addressed with https://github.com/llvm/llvm-project/pull/84405.
Generating a resolver on use means that we may end up with multiple
resolvers across different translation units. Those resolvers may not be
the same because each translation unit may contain different version
declarations (user's fault). Therefore the order of linking the final
image determines which of these weak symbols gets selected, resulting in
non consisted behavior. I am proposing to stop emitting a resolver on
use and only do so in the translation unit which contains the default
definition. This way we guarantee the existence of a single resolver.
Now, when a versioned function is used we want to emit a declaration of
the function symbol omitting the multiversion mangling.
I have added a requirement to ACLE mandating that all the function
versions are declared in the translation unit which contains the default
definition: https://github.com/ARM-software/acle/pull/328
Visiting the ifunc calls `GetOrCreateLLVMFunction` with
`NotForDefinition` while visiting the resolver calls
`GetOrCreateLLVMFunction` with `ForDefinition`.
When an ifunc is emitted before its resolver, the `ForDefinition` call
does not call `SetFunctionAttributes`, because the function prematurely
returns due to `(Entry->getValueType() == Ty)` and
`llvm::GlobalIFunc::getResolverFunctionType(DeclTy)`.
This leads to missing `!kcfi_type` with -fsanitize=kcfi.
```
extern void ifunc0(void) __attribute__ ((ifunc("resolver0")));
void *resolver0(void) { return 0; } // SetFunctionAttributes not called
extern void ifunc1(void) __attribute__ ((ifunc("resolver1")));
static void *resolver1(void) { return 0; } // SetFunctionAttributes not called
extern void ifunc2(void) __attribute__ ((ifunc("resolver2")));
static void *resolver2(void*) { return 0; }
```
Ensure `SetFunctionAttributes` is called by calling
`GetOrCreateLLVMFunction` with a dummy non-function type. Now that the
`F->takeName(Entry)` code path may be taken, the
`DisableSanitizerInstrumentation` code
(https://reviews.llvm.org/D150262) should be moved to `checkAliases`,
when the resolver function is finalized.
Pull Request: https://github.com/llvm/llvm-project/pull/98832
This is a revert of ef5e7f90ea4d5063ce68b952c5de473e610afc02 which was a
temporary partial revert of 77ac823fd285973cfb3517932c09d82e6a32f46d.
The le32 and le64 targets are no longer necessary to retain, so this
removes them entirely.
Extract the logic whether to emit a global var based on CUDA/HIP
host/device related attributes to CodeGenModule::shouldEmitCUDAGlobalVar
to be used by other places.
With `system-headers-coverage=false`, functions defined in system
headers were not instrumented but corresponding covmaps were emitted. It
caused wasting covmap and profraw.
This change improves:
- Reduce object size (due to reduced covmap)
- Reduce size of profraw (uninstrumented system headers occupied
counters)
- Smarter view of coverage report. Stubs of uninstrumented system
headers will be no longer seen.
musttail is not often possible to be generated on PPC targets as when
calling to a function defined in another module, PPC needs to restore
the TOC pointer. To restore the TOC pointer, compiler needs to emit a
nop after the call to let linker generate codes to restore TOC pointer.
Tail call cannot generate expected call sequence for this case.
To avoid the crash inside the compiler backend, a diagnosis is added in
the frontend.
Fixes#63214
When BPF object files are linked with bpftool, every symbol must be
accompanied by BTF info. Ensure that extern functions referenced by
global variable initializers are included in BTF.
The primary motivation is "static" initialization of PROG maps:
```c
extern int elsewhere(struct xdp_md *);
struct {
__uint(type, BPF_MAP_TYPE_PROG_ARRAY);
__uint(max_entries, 1);
__type(key, int);
__type(value, int);
__array(values, int (struct xdp_md *));
} prog_map SEC(".maps") = { .values = { elsewhere } };
```
BPF backend needs debug info to produce BTF. Debug info is not
normally generated for external variables and functions. Previously, it
was solved differently for variables (collecting variable declarations
in ExternalDeclarations vector) and functions (logic invoked during
codegen in CGExpr.cpp).
This patch generalises ExternalDefclarations to include both function
and variable declarations. This change ensures that function references
are not missed no matter the context. Previously external functions
referenced in constant expressions lacked debug info.
Currently there is no way to tell whether an IR module was generated
using `-cl-std=cl3.0` or `-cl-std=clc++2021`, i.e., whether the origin
was a OpenCL C or C++ for OpenCL source.
Add new `opencl.cxx.version` named metadata when compiling C++. Keep the
`opencl.ocl.version` metadata to convey the compatible OpenCL C version.
Fixes https://github.com/llvm/llvm-project/issues/91912
This patch augments the HIPAMD driver to allow it to target AMDGCN
flavoured SPIR-V compilation. It's mostly straightforward, as we re-use
some of the existing SPIRV infra, however there are a few notable
additions:
- we introduce an `amdgcnspirv` offload arch, rather than relying on
using `generic` (this is already fairly overloaded) or simply using
`spirv` or `spirv64` (we'll want to use these to denote unflavoured
SPIRV, once we bring up that capability)
- initially it is won't be possible to mix-in SPIR-V and concrete AMDGPU
targets, as it would require some relatively intrusive surgery in the
HIPAMD Toolchain and the Driver to deal with two triples
(`spirv64-amd-amdhsa` and `amdgcn-amd-amdhsa`, respectively)
- in order to retain user provided compiler flags and have them
available at JIT time, we rely on embedding the command line via
`-fembed-bitcode=marker`, which the bitcode writer had previously not
implemented for SPIRV; we only allow it conditionally for AMDGCN
flavoured SPIRV, and it is handled correctly by the Translator (it ends
up as a string literal)
Once the SPIRV BE is no longer experimental we'll switch to using that
rather than the translator. There's some additional work that'll come
via a separate PR around correctly piping through AMDGCN's
implementation of `printf`, for now we merely handle its flags
correctly.
Long story short the interaction of two optimizations happening in
GlobalOpt results in a crash. For more details look at the issue
https://github.com/llvm/llvm-project/issues/96197. I will be fixing this
in GlobalOpt but it is a conservative solution since it won't allow us
to optimize resolvers which return a pointer to a function whose
definition is in another TU when compiling without LTO:
```
__attribute__((target_version("simd"))) void bar(void);
__attribute__((target_version("default"))) void bar(void);
int foo() { bar(); }
```
fixes: #96197
When using the -mframe-chain=aapcs or -mframe-chain=aapcs-leaf options,
we cannot use r11 as an allocatable register, even if
-fomit-frame-pointer is also used. This is so that r11 will always point
to a valid frame record, even if we don't create one in every function.
This patch adds assertions in the getMangledNameImpl() function to
ensure that the expected target attributes (TargetAttr,
TargetVersionAttr, and TargetClonesAttr) are not null before they are
passed to appendAttributeMangling() to prevent potential null pointer
dereferences and improve the robustness of the attribute mangling
process.
This assertion will trigger a runtime error with a clear message in
debug build if any of the expected attributes are missing, facilitating
early and easier diagnosis and debugging of such issues related to
attribute mangling.
Close https://github.com/llvm/llvm-project/issues/93497
The root cause of the problem is, we mark the variable from other
modules as constnant in LLVM incorrectly. This patch fixes this problem
by not emitting the defintition for non-const available external
variables. Since the non const available externally variable is not
helpful to the optimization.
Fixes the following bug:
namespace Name {
int __attribute((target_version("default"))) foo() { return 0; }
}
namespace Name {
int __attribute((target_version("sve"))) foo() { return 1; }
}
int bar() { return Name::foo(); }
error: redefinition of 'foo'
int __attribute((target_version("sve"))) foo() { return 1; }
note: previous definition is here
int __attribute((target_version("default"))) foo() { return 0; }
While fixing this I also found that in the absence of default version
declaration, the one we implicitly create has incorrect mangling if
we are in a namespace:
namespace OtherName {
int __attribute((target_version("sve"))) foo() { return 2; }
}
int baz() { return OtherName::foo(); }
In this example instead of creating a declaration for the symbol
@_ZN9OtherName3fooEv.default we are creating one for the symbol
@_Z3foov.default (the namespace mangling prefix is omitted).
This has now been fixed.