The following code snippet is used several times in different methods of the
EmitAssemblyHelper class to get the TargetTriple and then call a single method
on it to check some condition:
TargetTriple(TheModule->getTargetTriple())
Parsing a target triple string is not a trivial operation, and it takes
time to repeat the parsing many times in different methods of the class, and
even several times within one method, just to call a getter
(llvm::Triple(TheModule->getTargetTriple()).getVendor(), for example).
The patch adds a TargetTriple member to the EmitAssemblyHelper class so that
the triple is parsed only once, in the class's constructor.
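The parse-once idea can be sketched without LLVM as follows. This is only an illustration: FakeTriple and EmitHelperSketch are hypothetical stand-ins (not llvm::Triple or the real EmitAssemblyHelper), and the string splitting is deliberately simplistic.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for llvm::Triple that counts how often a string is parsed.
struct FakeTriple {
  static int &parseCount() {
    static int N = 0;
    return N;
  }

  std::string Arch, Vendor, OS;

  explicit FakeTriple(const std::string &Str) {
    ++parseCount();
    const std::size_t A = Str.find('-');
    Arch = Str.substr(0, A);
    if (A == std::string::npos)
      return;
    const std::size_t B = Str.find('-', A + 1);
    Vendor = Str.substr(A + 1, B == std::string::npos ? B : B - A - 1);
    if (B != std::string::npos)
      OS = Str.substr(B + 1);
  }
};

// Hypothetical helper class: the triple is parsed once in the constructor and
// every later query reuses the stored member instead of re-parsing the string.
class EmitHelperSketch {
  const FakeTriple TargetTriple;

public:
  explicit EmitHelperSketch(const std::string &TripleStr)
      : TargetTriple(TripleStr) {}

  bool isAppleVendor() const { return TargetTriple.Vendor == "apple"; }
  bool isDarwinOS() const { return TargetTriple.OS.rfind("darwin", 0) == 0; }
};
```

However many query methods are called on one helper object, the counter in this sketch only increases by one per constructed helper.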
Reviewed By: tejohnson
Differential Revision: https://reviews.llvm.org/D122587
We had some discussion in D99152 and on llvm-dev and finally came up with
a solution: add AMX-specific cast intrinsics. The intrinsics are already
supported in LLVM IR. This patch replaces bitcasts with the AMX cast
intrinsics when the FE emits code.
Differential Revision: https://reviews.llvm.org/D122567
We expect `extern "C"` static functions to be usable in things like
inline assembly, as well as ifuncs.
See the bug report here: https://github.com/llvm/llvm-project/issues/54549
However, we were diagnosing this as 'not defined', because the
ifunc's attempt to look up its resolver would generate a declared IR
function.
Additionally, as background, the way we allow these static extern "C"
functions to work in inline assembly is by making an alias with the C
mangling in MOST situations to the version we emit with
internal-linkage/mangling.
The problem here was twofold. First, we generated the alias after the
ifunc was checked, so the function by that name didn't exist yet.
Second, the ifunc's generation caused a symbol to exist under the name
of the alias already (the declared function above), which suppressed the
alias generation.
This patch fixes all of this by moving the checking of ifuncs/CFE aliases
until AFTER we have generated the extern-C alias. Then, it does a
'fixup' around the GlobalIFunc to make sure we correct the reference.
Differential Revision: https://reviews.llvm.org/D122608
Reimplements MisExpect diagnostics from D66324 to reconstruct its
original checking methodology only using MD_prof branch_weights
metadata.
New checks rely on 2 invariants:
1) For frontend instrumentation, MD_prof branch_weights will always be
populated before llvm.expect intrinsics are lowered.
2) For IR and sample profiling, llvm.expect intrinsics will always be
lowered before branch_weights are populated from the IR profiles.
These invariants allow the checking to assume how the existing branch
weights are populated depending on the profiling method used, and emit
the correct diagnostics. If these invariants are ever invalidated, the
MisExpect related checks would need to be updated, potentially by
re-introducing MD_misexpect metadata, and ensuring it always will be
transformed the same way as branch_weights in other optimization passes.
Frontend based profiling is now enabled without using LLVM Args, by
introducing a new CodeGen option, and checking if the -Wmisexpect flag
has been passed on the command line.
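For context, the source-level annotation that ends up as an llvm.expect intrinsic looks roughly like the sketch below. The function name and data are made up for illustration; whether a diagnostic is actually emitted depends on the collected profile and the flags in use.

```cpp
#include <cassert>

// Counts the values above a threshold. The __builtin_expect annotation claims
// the branch is rarely taken. If profile data showed the branch to be hot,
// that would be the kind of mismatch a MisExpect-style check reports.
int countAbove(const int *Data, int N, int Threshold) {
  int Count = 0;
  for (int I = 0; I < N; ++I) {
    if (__builtin_expect(Data[I] > Threshold, 0)) // annotated as unlikely
      ++Count;
  }
  return Count;
}
```

The annotation only affects optimization hints and diagnostics; the function's result is the same with or without it.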
Reviewed By: tejohnson
Differential Revision: https://reviews.llvm.org/D115907
Beautify the dump format: add indentation for nested structs and struct members. Also fix the test cases in dump-struct-builtin.c.
For example:
struct:
```
struct A {
  int a;
  struct B {
    int b;
    struct C {
      struct D {
        int d;
        union E {
          int x;
          int y;
        } e;
      } d;
      int c;
    } c;
  } b;
};
```
Before:
```
struct A {
int a = 0
struct B {
int b = 0
struct C {
struct D {
int d = 0
union E {
int x = 0
int y = 0
}
}
int c = 0
}
}
}
```
After:
```
struct A {
    int a = 0
    struct B {
        int b = 0
        struct C {
            struct D {
                int d = 0
                union E {
                    int x = 0
                    int y = 0
                }
            }
            int c = 0
        }
    }
}
```
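The indentation scheme can be sketched as a small recursive printer. This is an illustration only: FieldSketch and dumpRecord are hypothetical and are not Clang's data structures or its implementation, the four-space step per nesting level is an assumption, and the printed value is a fixed placeholder.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical description of a record or field; not a Clang type.
struct FieldSketch {
  std::string Type;                 // e.g. "int" or "struct B"
  std::string Name;                 // field name, ignored for records
  std::vector<FieldSketch> Members; // non-empty for a nested struct/union
};

// Prints a record with one extra indentation step per nesting level.
// Assumption: four spaces per level; every scalar is printed as "= 0".
std::string dumpRecord(const FieldSketch &R, unsigned Lvl = 0) {
  const std::string Pad(Lvl * 4, ' ');
  std::string Out = Pad + R.Type + " {\n";
  for (const FieldSketch &M : R.Members) {
    if (M.Members.empty())
      Out += Pad + "    " + M.Type + " " + M.Name + " = 0\n";
    else
      Out += dumpRecord(M, Lvl + 1); // nested record: recurse one level deeper
  }
  return Out + Pad + "}\n";
}
```

Passing the nesting level down the recursion keeps each closing brace aligned with the line that opened it.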
Reviewed By: erichkeane
Differential Revision: https://reviews.llvm.org/D122704
Remove anonymous tag locations by way of 'PrintingPolicy'.
@aaron.ballman suggested removing this extra information in
https://reviews.llvm.org/D122248
struct:
struct S {
  int a;
  struct /* Anonymous */ {
    int x;
  } b;
  int c;
};
Before:
struct S {
int a = 0
struct S::(unnamed at ./builtin_dump_struct.c:20:3) {
int x = 0
}
int c = 0
}
After:
struct S {
int a = 0
struct S::(unnamed) {
int x = 0
}
int c = 0
}
Differential Revision: https://reviews.llvm.org/D122670
Currently, the regcall calling convention in Clang doesn't match
ICC when passing / returning structures: https://godbolt.org/z/axxKMKrW7
This patch tries to fix the problem so that Clang matches ICC.
Reviewed By: LuoYuanke
Differential Revision: https://reviews.llvm.org/D122104
This builtin returns the address of a global instance of the
`std::source_location::__impl` type, which must be defined (with an
appropriate shape) before calling the builtin.
It will be used to implement std::source_location in libc++ in a
future change. The builtin is compatible with GCC's implementation,
and libstdc++'s usage. An intentional divergence is that GCC declares
the builtin's return type to be `const void*` (for
ease-of-implementation reasons), while Clang uses the actual type,
`const std::source_location::__impl*`.
In order to support this new functionality, I've also added a new
'UnnamedGlobalConstantDecl'. This artificial Decl is modeled after
MSGuidDecl, and is used to represent a generic concept of an lvalue
constant with global scope, deduplicated by its value. It's possible
that MSGuidDecl itself, or some of the other similar sorts of things
in Clang might be able to be refactored onto this more-generic
concept, but there's enough special-case weirdness in MSGuidDecl that
I gave up attempting to share code there, at least for now.
Finally, for compatibility with libstdc++'s <source_location> header,
I've added a second exception to the "cannot cast from void* to T* in
constant evaluation" rule. This seems a bit distasteful, but feels
like the best available option.
Reviewers: aaron.ballman, erichkeane
Differential Revision: https://reviews.llvm.org/D120159
This patch adds the necessary AMDGPU calling convention to the ctor /
dtor kernels. These are fundamentally device kernels called by the host
on image load. Without this calling convention information the AMDGPU
plugin is unable to identify them.
Depends on D122504
Fixes #54091
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D122515
The default construction of constructor functions by LLVM tends to make
them have internal linkage. When we call a ctor / dtor function in the
target region we are actually creating a kernel that is called at
registration. Because the ctor is a kernel we need to make sure it's
externally visible so we can actually call it. This prevented AMDGPU
from correctly using constructors while NVPTX could use them simply
because it ignored internal visibility.
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D122504
This patch adds a helper method to determine if a nonvirtual base has an entry in the LLVM struct. Such a base may not have an entry
if the base does not have any fields/bases itself that would change the size of the struct. This utility method is useful for other frontends (Polygeist) that use Clang as an API to generate code.
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D122502
Currently the device kernels all have weak linkage to prevent linkage
errors on multiple definitions. However, this prevents some optimizations
from adequately analyzing them because of the nature of weak linkage.
This patch replaces the weak linkage with weak_odr linkage so we can
statically assert that multiple declarations of the same kernel will
have the same definition.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D122443
Currently clang generates an extra set of simd variant function attributes
with an extra 'v' in the encoding.
For example:
_ZGVbN2v__Z5add_1Pf vs _ZGVbN2vv__Z5add_1Pf
The problem comes from the declaration of ParamAttrs:
llvm::SmallVector<ParamAttrTy, 8> ParamAttrs(ParamPositions.size());
where ParamPositions.size() grows after the following assignment:
Pos = ParamPositions[PVD];
This happens when the PVD is not found in ParamPositions.
ParamPositions needs to be set up for each FD decl. To fix this, move the
initialization of ParamPositions inside the while loop so it runs for each FD.
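The size growth can be reproduced with a simplified model. The sketch below uses std::map in place of the real map type and strings in place of parameter declarations; attrCounts and fillPositions are hypothetical names, and the control flow only loosely mirrors the real loop. The relevant property is that operator[] inserts a default entry when the key is missing.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

using PositionMap = std::map<std::string, unsigned>;
using ParamList = std::vector<std::string>;

// Records the position of each parameter of one declaration.
void fillPositions(PositionMap &Map, const ParamList &Params) {
  unsigned Idx = 0;
  for (const std::string &P : Params)
    Map.emplace(P, Idx++);
}

// Returns, for each redeclaration, how many parameter attributes would be
// allocated (this models ParamAttrs(ParamPositions.size())).
std::vector<std::size_t> attrCounts(const std::vector<ParamList> &Decls,
                                    bool RebuildPerDecl) {
  std::vector<std::size_t> Counts;
  PositionMap ParamPositions;
  if (!RebuildPerDecl && !Decls.empty())
    fillPositions(ParamPositions, Decls.front()); // built once, then shared
  for (const ParamList &Params : Decls) {
    if (RebuildPerDecl) {
      ParamPositions.clear();
      fillPositions(ParamPositions, Params); // rebuilt for this declaration
    }
    Counts.push_back(ParamPositions.size());
    for (const std::string &P : Params)
      (void)ParamPositions[P]; // operator[] inserts P if it is missing
  }
  return Counts;
}
```

With a shared map, each redeclaration whose parameters are unknown adds entries, so a later size query is too large; rebuilding the map per declaration keeps the count equal to the real number of parameters.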
Differential Revision: https://reviews.llvm.org/D122338
Fix clang crash and add bitfield support in __builtin_dump_struct.
In clang 13.0.x, a struct that has three or more members and also a
bitfield causes a crash. In clang 15.x, a struct with even a single
bitfield causes a crash.
Open issue: https://github.com/llvm/llvm-project/issues/54462
Differential Revision: https://reviews.llvm.org/D122248
This information isn't preserved in the DWARF description of function
types, though it probably should be. It is preserved on the function
declarations/definitions themselves through the DW_AT_noreturn attribute,
but we should move it to, or also include it in, the subroutine type
itself. For now, since it is not there, the DWARF is lossy and the
information can't be reconstructed.
Adds basic parsing/sema/serialization support for the
#pragma omp target parallel loop directive.
Differential Revision: https://reviews.llvm.org/D122359
For MachO, lower `@llvm.global_dtors` into `@llvm.global_ctors` with
`__cxa_atexit` calls to avoid emitting the deprecated `__mod_term_func`.
Reuse the existing `WebAssemblyLowerGlobalDtors.cpp` to accomplish this.
Enable fallback to the old behavior via Clang driver flag
(`-fregister-global-dtors-with-atexit`) or llc / code generation flag
(`-lower-global-dtors-via-cxa-atexit`). This escape hatch will be
removed in the future.
Differential Revision: https://reviews.llvm.org/D121736
Currently we create offloading entries to register device variables with
the host. When we register a variable we will look up the symbol in the
device image and map the device address to the host address. This is a
problem when the symbol is declared with hidden visibility or internal
linkage. This means the symbol is not accessible externally and we
cannot get its address. We should still allow static variables to be
declared on the device, but we should not create an offloading entry for
them so they exist independently on the host and device.
Fixes #54309
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D122352
The way the check is written is not compatible with opaque
pointers -- while we don't need to change the IR pointer type,
we do need to change the element type stored in the Address.
As we're going to reassign the initializer, we actually need the
value types to match, not just the pointer types. This is only
relevant with opaque pointers.
This patch extends the support for C/C++ operators for SVE
types to allow one of the arguments to be a scalar, in which
case a vector splat is performed.
Differential Revision: https://reviews.llvm.org/D121829
This requires some adjustment in caller code, because there was
a confusion regarding the meaning of the PtrTy argument: This
argument is the type of the pointer being loaded, not the addresses
being loaded from.
Reapply after fixing the specified pointer type for one call in
47eb4f7dcd845878b16a53dadd765195b9c24b6e, where the used type is
important for determining alignment.
GCC supports power-of-2 size structures for the arguments. Clang supports fewer sizes than GCC, but it always crashes for the unsupported cases.
This patch adds sema checks that diagnose the unsupported cases instead of crashing.
Reviewed By: jyu2
Differential Revision: https://reviews.llvm.org/D107141
Worth noting that the code marked with FIXME is dead and would
produce invalid IR if hit. Someone familiar with this code should
probably look into that.
Before we start addressing the issue with having
a lot of false positives when using debugify in
the original mode, we have made a few patches that
should speed up the execution of the testing
utility Passes.
For example, when testing a large project
(let's say LLVM project itself), we can face
a lot of potential DI issues. Usually, we use
-verify-each-debuginfo-preserve (that is very
similar to -debugify-each) -- it collects
DI metadata before each Pass, and after the Pass
it checks if the Pass preserved the DI metadata.
However, we can speed up this process, since we
don't need to collect DI metadata before each
Pass -- we could use the DI metadata that are
collected after the previous Pass from
the pipeline as an input for the next Pass.
This patch speeds up the utility by ~2x.
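The saving can be illustrated with a toy pipeline that counts metadata collections. Everything in the sketch is hypothetical (ModuleSketch, SnapshotSketch, CheckerSketch and both driver functions are invented for illustration and are not the debugify implementation); it only shows why reusing the previous snapshot roughly halves the number of collections.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

using ModuleSketch = std::vector<int>;   // hypothetical stand-in for a module
using SnapshotSketch = std::vector<int>; // hypothetical stand-in for DI metadata
using PassSketch = std::function<void(ModuleSketch &)>;

struct CheckerSketch {
  unsigned Collections = 0;
  SnapshotSketch collect(const ModuleSketch &M) {
    ++Collections; // the expensive step we want to do less often
    return M;
  }
};

// Collects before and after every pass: 2 * N collections for N passes.
unsigned runCollectingEachTime(ModuleSketch M,
                               const std::vector<PassSketch> &Passes) {
  CheckerSketch C;
  for (const PassSketch &P : Passes) {
    SnapshotSketch Before = C.collect(M);
    P(M);
    SnapshotSketch After = C.collect(M);
    (void)Before; // a real checker would compare Before and After here
    (void)After;
  }
  return C.Collections;
}

// Reuses the snapshot taken after pass i as the input for pass i + 1:
// N + 1 collections for N passes.
unsigned runReusingSnapshots(ModuleSketch M,
                             const std::vector<PassSketch> &Passes) {
  CheckerSketch C;
  SnapshotSketch Prev = C.collect(M);
  for (const PassSketch &P : Passes) {
    P(M);
    SnapshotSketch After = C.collect(M);
    // a real checker would compare Prev and After here
    Prev = std::move(After);
  }
  return C.Collections;
}
```

For a long pipeline the ratio of 2N to N + 1 collections approaches two, which is in line with the roughly 2x figure quoted above.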
Differential Revision: https://reviews.llvm.org/D115622
This requires some adjustment in caller code, because there was
a confusion regarding the meaning of the PtrTy argument: This
argument is the type of the pointer being loaded, not the addresses
being loaded from.
The EmitLoadOfPointer() call already specified the right pointer
type, but it did not match the Address we're loading from, so we
need to insert a bitcast first.
Rather than using a dummy void pointer type, we should specify the
correct private type and perform the bitcast beforehand rather than
afterwards. This way, the Address will have correct alignment
information.