When we apply cloning decisions in the ThinLTO backend, we need to find
the corresponding summary for each function in the IR, and in some cases
for callee functions. This is complicated when the function was a
promoted local, in which case the GUID was formed from the hash of the
original source file prepended to the function name. Those functions
can be identified by the fact that they were given a ".llvm." suffix
during promotion.
We previously didn't do this correctly for promoted locals imported from
other modules, as we only tried the current module source name. This led
to crashes, in particular when the current module also had an local
function of the same original name. In particular, we were attempting to
iterate through the wrong summary's callsites, and there were fewer than
in the actual function so we accessed data off the end (in a release
build with assertion checking off - with assertion checking on we double
check the stack ids and that would have failed). Even if we hadn't
crashed or hit an assert, we could have applied the wrong cloning
decisions, leading to unsats at link time.
Luckily, function importing attaches thinlto_src_file metadata
containing the original source file name to all imported functions. It
normally doesn't do this by default, however, it always does if MemProf
context disambiguation is enabled. Therefore, we can just look to see if
the function contains this metadata and if so use it to recreate the
original GUID.
A similar issue can occur when looking for the ValueInfo / GUID of
a direct tail call to see if we synthesized a callsite record for a
missing tail call frame. In that case, the callee function may be a
declaration, if we imported its caller but not the callee function
definition. Because imported declarations don't get the thinlto_src_file
metadata, we instead look at its caller (which works because this
happens very early in the backend before any inlining).
Add a new AutoUpgrade function to convert some legacy nvvm.annotations
metadata to function level attributes. These attributes are quicker to
look-up so improve compile time and are more idiomatic than using
metadata which should not include required information that changes the
meaning of the program.
Currently supported annotations are:
- !"kernel" -> ptx_kernel calling convention
- !"align" -> alignstack parameter attributes (return not yet supported)
This PR removes the old `nocapture` attribute, replacing it with the new
`captures` attribute introduced in #116990. This change is
intended to be essentially NFC, replacing existing uses of `nocapture`
with `captures(none)` without adding any new analysis capabilities.
Making use of non-`none` values is left for a followup.
Some notes:
* `nocapture` will be upgraded to `captures(none)` by the bitcode
reader.
* `nocapture` will also be upgraded by the textual IR reader. This is to
make it easier to use old IR files and somewhat reduce the test churn in
this PR.
* Helper APIs like `doesNotCapture()` will check for `captures(none)`.
* MLIR import will convert `captures(none)` into an `llvm.nocapture`
attribute. The representation in the LLVM IR dialect should be updated
separately.
This introduces the `captures` attribute as described in:
https://discourse.llvm.org/t/rfc-improvements-to-capture-tracking/81420
This initial patch only introduces the IR/bitcode support for the
attribute and its in-memory representation as `CaptureInfo`. This will
be followed by a patch to upgrade and remove the `nocapture` attribute,
and then by actual inference/analysis support.
Based on the RFC feedback, I've used a syntax similar to the `memory`
attribute, though the only "location" that can be specified is `ret`.
I've added some pretty extensive documentation to LangRef on the
semantics. One non-obvious bit here is that using ptrtoint will not
result in a "return-only" capture, even if the ptrtoint result is only
used in the return value. Without this requirement we wouldn't be able
to continue ordinary capture analysis on the return value.
This consists of:
* Make these instructions part of FPMathOperator.
* Adjust bitcode/ir readers/writers to expect fast math flags on these
instructions.
* Make IRBuilder set the fast math flags on these instructions.
* Update langref and release notes.
* Update a bunch of tests. Some of these are due to InstCombineCasts
incorrectly adding fast math flags to fptrunc, which will be fixed in a
later patch.
Add a specification attribute to LLVM DebugInfo, which is analogous
to DWARF's DW_AT_specification. According to the DWARF spec:
"A debugging information entry that represents a declaration that
completes another (earlier) non-defining declaration may have a
DW_AT_specification attribute whose value is a reference to the
debugging information entry representing the non-defining declaration."
This patch allows types to be specifications of other types. This is
used by Swift to represent generic types. For example, given this Swift
program:
```
struct MyStruct<T> {
let t: T
}
let variable = MyStruct<Int>(t: 43)
```
The Swift compiler emits (roughly) an unsubtituted type for MyStruct<T>:
```
DW_TAG_structure_type
DW_AT_name ("MyStruct")
// "$s1w8MyStructVyxGD" is a Swift mangled name roughly equivalent to
// MyStruct<T>
DW_AT_linkage_name ("$s1w8MyStructVyxGD")
// other attributes here
```
And a specification for MyStruct<Int>:
```
DW_TAG_structure_type
DW_AT_specification (<link to "MyStruct">)
// "$s1w8MyStructVySiGD" is a Swift mangled name equivalent to
// MyStruct<Int>
DW_AT_linkage_name ("$s1w8MyStructVySiGD")
DW_AT_byte_size (0x08)
// other attributes here
```
An extra inhabitant is a bit pattern that does not represent a valid
value for instances of a given type. The number of extra inhabitants is
the number of those bit configurations.
This is used by Swift to save space when composing types. For example,
because Bool only needs 2 bit patterns to represent all of its values
(true and false), an Optional<Bool> only occupies 1 byte in memory by
using a bit configuration that is unused by Bool. Which bit patterns are
unused are part of the ABI of the language.
Since Swift generics are not monomorphized, by using dynamic libraries
you can have generic types whose size, alignment, etc, are known only
at runtime (which is why this feature is needed).
This patch adds num_extra_inhabitants to LLVM-IR debug info and in DWARF
as an Apple extension.
StructType::setBody is the only mechanism that can potentially create
recursion in the type system. Add a runtime check that it is not
actually used to create recursion.
If the check fails, report an error from LLParser, BitcodeReader and
IRLinker. In all other cases assert that the check succeeds.
In future StructType::setBody will be removed in favor of specifying the
body when the type is created, so any performance hit from this runtime
check will be temporary.
Type::isScalableTy and StructType::containsScalableVectorType failed to
detect some cases of structs containing scalable vectors because
containsScalableVectorType did not call back into isScalableTy to check
the element types. Fix this, which requires sharing the same Visited set
in both functions. Also change the external API so that callers are
never required to pass in a Visited set, and normalize the naming to
isScalableTy.
Rename the function to reflect its correct behavior and to be consistent
with `Module::getOrInsertFunction`. This is also in preparation of
adding a new `Intrinsic::getDeclaration` that will have behavior similar
to `Module::getFunction` (i.e, just lookup, no creation).
This change implements support of metadata strings in operand bundle
values. It makes possible calls like:
call void @some_func(i32 %x) [ "foo"(i32 42, metadata !"abc") ]
It requires some extension of the bitcode serialization. As SSA values
and metadata are stored in different tables, there must be a way to
distinguish them during deserialization. It is implemented by putting a
special marker before the metadata index. The marker cannot be treated
as a reference to any SSA value, so it unambiguously identifies
metadata. It allows extending the bitcode serialization without breaking
compatibility.
Metadata as operand bundle values are intended to be used in
floating-point function calls. They would represent the same information
as now is passed by the constrained intrinsic arguments.
When the lexer hits an error during assembly parsing, it just logs the
error in the `ErrorMsg` object, and it's possible that error gets
overwritten later in by the parser with a more generic error message.
This makes some errors reported by the LLVM's asm parser less precise.
Address this by not having the parser overwrite the message logged by
the lexer by assigning error messages generated by the lexer higher
"priority" than those generated by parser and overwriting the error
message only if its same or higher priority.
Update several Assembler unit test to now check the more precise error
messaged reported by the LLVM's AsmParser.
It is almost always simpler to use {} instead of std::nullopt to
initialize an empty ArrayRef. This patch changes all occurrences I could
find in LLVM itself. In future the ArrayRef(std::nullopt_t) constructor
could be deprecated or removed.
During the ThinLTO indexing step for one of our large applications, we
create 4 million instances of FunctionSummary.
Changing:
std::vector<EdgeTy> CallGraphEdgeList;
to:
SmallVector<EdgeTy, 0> CallGraphEdgeList;
in FunctionSummary reduces the size of each instance by 8 bytes. The
rest of the patch makes the same change to other places so that the
types stay compatible across function boundaries.
The primary motivation is to remove `EntryCount` from `FunctionSummary`.
This frees 8 bytes out of `sizeof(FunctionSummary)` (136 bytes as of
64498c5483).
While I'm at it, this PR clean up {SummaryBasedOptimizations,
SyntheticCountsPropagation} since they were not used and there are no
plans to further invest on them.
With this patch, bitcode writer writes a placeholder 0 at the byte
offset of `EntryCount` and bitcode reader can parse the function entry
count at the correct byte offset. Added a TODO to stop writing
`EntryCount` and bump bitcode version
During the ThinLTO indexing step for one of our large applications, we
create 7.5 million instances of GlobalValueSummary.
Changing:
std::vector<ValueInfo> RefEdgeList;
to:
SmallVector<ValueInfo, 0> RefEdgeList;
in GlobalValueSummary reduces the size of each instance by 8 bytes.
The rest of the patch makes the same change to other places so that
the types stay compatible across function boundaries.
Since IR Types are immutable it makes sense to check them on
construction instead of in the IR Verifier pass.
This patch checks that some TargetExtTypes are well-formed in the sense
that they have the expected number of type parameters and integer
parameters. When called from LLParser it gives a diagnostic message.
When called from anywhere else it just asserts that they are
well-formed.
The use of SmallVector here saves 1.03% of heap allocations during the
compilation of X86ISelLowering.cpp.ll, a .ll version of
X86ISelLowering.cpp. The 8 inline elements cover greater than 99% of
invocations encountered here.
While I am it, the patch changes the parameter type to ArrayRef to
allow callers to use any vector type.
Extend `DIBasicType` and `DISubroutineType` with additional field
`annotations`, e.g. as below:
```
!5 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed, annotations: !6)
!6 = !{!7}
!7 = !{!"btf:type_tag", !"tag1"}
```
The field would be used by BPF backend to generate DWARF attributes
corresponding to `btf_type_tag` type attributes, e.g.:
```
0x00000029: DW_TAG_base_type
DW_AT_name ("int")
DW_AT_encoding (DW_ATE_signed)
DW_AT_byte_size (0x04)
0x0000002d: DW_TAG_LLVM_annotation
DW_AT_name ("btf:type_tag")
DW_AT_const_value ("tag1")
```
Such DWARF entries would be used to generate BTF definitions by tools
like [pahole](https://github.com/acmel/dwarves).
Note: similar fields with similar purposes are already present in
DIDerivedType and DICompositeType.
Currently "btf_type_tag" attributes are represented in debug information
as 'annotations' fields in DIDerivedType with DW_TAG_pointer_type tag.
The annotation on a pointer corresponds to pointee having the attributes
in the final BTF.
The discussion in
[thread](https://lore.kernel.org/bpf/87r0w9jjoq.fsf@oracle.com/) came to
conclusion, that such annotations should apply to the annotated type
itself. Hence the necessity to extend `DIBasicType` & `DISubroutineType`
types with 'annotations' field to represent cases like below:
```
int __attribute__((btf_type_tag("foo"))) bar;
```
This was previously tracked as differential revision:
https://reviews.llvm.org/D143966
Reapplies commit c5aeca73 (and its followup commit 21396be8), which were
reverted due to missing functionality in MLIR and Flang regarding printing
debug records. This has now been added in commit 08aa511, along with support
for printing debug records in flang.
This reverts commit 2dc2290860355dd2bac3b655eea895fe30fde257.
"[Flang] Update test to not check for tail calls on debug intrinsics" &
"Reapply#3 "[RemoveDIs] Load into new debug info format by default in LLVM (#89799)"
Recent updates to flang have added debug info generation via MLIR, a path
which currently does not support debug records. The patch that enables
debug records by default (and a small followup patch) are thus being
reverted until the MLIR path has been fixed.
This reverts commits:
21396be865b4640abf6afa0b05de6708a1a996e0
c5aeca732d1ff6769b0659efebd1cfb5f60487e4
Reapplies commit 91446e2, which was reverted due to a downstream error,
discussed on the pull request. The error could not be reproduced
upstream, and cannot be reproduced downstream as-of current main, so
until the error can be confirmed to still exist this patch should
return.
This reverts commit 23f8fac745bdde70ed4f9c585d19c4913734f1b8.
Adds a calling convention for calls to the `__arm_get_current_vg`
support
routine, which preserves X1-X15, X19-X29, SP, Z0-Z31 & P0-P15.
See https://github.com/ARM-software/abi-aa/pull/263
Remove support for the icmp and fcmp constant expressions.
This is part of:
https://discourse.llvm.org/t/rfc-remove-most-constant-expressions/63179
As usual, many of the updated tests will no longer test what they were
originally intended to -- this is hard to preserve when constant
expressions get removed, and in many cases just impossible as the
existence of a specific kind of constant expression was the cause of the
issue in the first place.
This defines a new kind of IR Constant that represents a ptrauth signed
pointer, as used in AArch64 PAuth.
It allows representing most kinds of signed pointer constants used thus
far in the llvm ptrauth implementations, notably those used in the
Darwin and ELF ABIs being implemented for c/c++. These signed pointer
constants are then lowered to ELF/MachO relocations.
These can be simply thought of as a constant `llvm.ptrauth.sign`, with
the interesting addition of discriminator computation: the `ptrauth`
constant can also represent a combined blend, when both address and
integer discriminator operands are used. Both operands are otherwise
optional, with default values 0/null.
This implements the `nusw` and `nuw` flags for `getelementptr` as
proposed at
https://discourse.llvm.org/t/rfc-add-nusw-and-nuw-flags-for-getelementptr/78672.
The three possible flags are encapsulated in the new `GEPNoWrapFlags`
class. Currently this class has a ctor from bool, interpreted as the
InBounds flag. This ctor should be removed in the future, as code gets
migrated to handle all flags.
There are a few places annotated with `TODO(gep_nowrap)`, where I've had
to touch code but opted to not infer or precisely preserve the new
flags, so as to keep this as NFC as possible and make sure any changes
of that kind get test coverage when they are made.
This reverts commit 91446e2aa687ec57ad88dc0df793d0c6e694a7c9 and
a unittest followup 1530f319311908b06fe935c89fca692d3e53184f (#90476).
In a stage-2 -flto=thin -gsplit-dwarf -g -fdebug-info-for-profiling
-fprofile-sample-use= build of clang, a ThinLTO backend compile has
assertion failures:
Global is external, but doesn't have external or weak linkage!
ptr @_ZN5clang12ast_matchers8internal18makeAllOfCompositeINS_8QualTypeEEENS1_15BindableMatcherIT_EEN4llvm8ArrayRefIPKNS1_7MatcherIS5_EEEE
function declaration may only have a unique !dbg attachment
ptr @_ZN5clang12ast_matchers8internal18makeAllOfCompositeINS_8QualTypeEEENS1_15BindableMatcherIT_EEN4llvm8ArrayRefIPKNS1_7MatcherIS5_EEEE
The failures somehow go away if -fprofile-sample-use= is removed.
Reapplies the original commit:
2f01fd99eb8c8ab3db9aba72c4f00e31e9e60a05
The previous application of this patch failed due to some missing
DbgVariableRecord support in clang, which has been added now by commit
8805465e.
This will probably break some downstream tools that don't already handle
debug records. If your downstream code breaks as a result of this
change, the simplest fix is to convert the module in question to the old
debug format before you process it, using
`Module::convertFromNewDbgValues()`. For more information about how to
handle debug records or about what has changed, see the migration
document:
https://llvm.org/docs/RemoveDIsDebugInfo.html
This reverts commit 4fd319ae273ed6c252f2067909c1abd9f6d97efa.
Fixes the broken tests in the original commit:
2f01fd99eb8c8ab3db9aba72c4f00e31e9e60a05
This will probably break some downstream tools that don't already handle
debug records. If your downstream code breaks as a result of this
change, the simplest fix is to convert the module in question to the old
debug format before you process it, using
`Module::convertFromNewDbgValues()`. For more information about how to
handle debug records or about what has changed, see the migration
document:
https://llvm.org/docs/RemoveDIsDebugInfo.html
This reverts commit 00821fed09969305b0003d3313c44d1e761a7131.
This patch enables parsing and creating modules directly into the new
debug info format. Prior to this patch, all modules were constructed
with the old debug info format by default, and would be converted into
the new format just before running LLVM passes. This is an important
milestone, in that this means that every tool will now be exposed to
debug records, rather than those that run LLVM passes. As far as I've
tested, all LLVM tools/projects now either handle debug records, or
convert them to the old intrinsic format.
There are a few unit tests that need updating for this patch; these are
either cases of tests that previously needed to set the debug info
format to function, or tests that depend on the old debug info format in
some way. There should be no visible change in the output of any LLVM
tool as a result of this patch, although the likelihood of this patch
breaking downstream code means an NFC tag might be a little misleading,
if not technically incorrect:
This will probably break some downstream tools that don't already handle
debug records. If your downstream code breaks as a result of this
change, the simplest fix is to convert the module in question to the old
debug format before you process it, using
`Module::convertFromNewDbgValues()`. For more information about how to
handle debug records or about what has changed, see the migration
document:
https://llvm.org/docs/RemoveDIsDebugInfo.html