Reverts llvm/llvm-project#90692
Breaking PPC buildbots. The bots are not meant to test LLD, but are
running a test that is using an old version of LLD without the change
(so is incompatible). Revert until a fix is found.
Reapplies the original commit:
2f01fd99eb8c8ab3db9aba72c4f00e31e9e60a05
The previous application of this patch failed due to some missing
DbgVariableRecord support in clang, which has been added now by commit
8805465e.
This will probably break some downstream tools that don't already handle
debug records. If your downstream code breaks as a result of this
change, the simplest fix is to convert the module in question to the old
debug format before you process it, using
`Module::convertFromNewDbgValues()`. For more information about how to
handle debug records or about what has changed, see the migration
document:
https://llvm.org/docs/RemoveDIsDebugInfo.html
This reverts commit 4fd319ae273ed6c252f2067909c1abd9f6d97efa.
Fixes the broken tests in the original commit:
2f01fd99eb8c8ab3db9aba72c4f00e31e9e60a05
This will probably break some downstream tools that don't already handle
debug records. If your downstream code breaks as a result of this
change, the simplest fix is to convert the module in question to the old
debug format before you process it, using
`Module::convertFromNewDbgValues()`. For more information about how to
handle debug records or about what has changed, see the migration
document:
https://llvm.org/docs/RemoveDIsDebugInfo.html
This reverts commit 00821fed09969305b0003d3313c44d1e761a7131.
This reverts commit 2aabfc811670beb843074c765c056fff4a7b443b.
Add fixes to LLD and Gold tests missed in original change.
Co-authored-by: Jan Voung <jvoung@google.com>
This patch enables parsing and creating modules directly into the new
debug info format. Prior to this patch, all modules were constructed
with the old debug info format by default, and would be converted into
the new format just before running LLVM passes. This is an important
milestone, in that this means that every tool will now be exposed to
debug records, rather than those that run LLVM passes. As far as I've
tested, all LLVM tools/projects now either handle debug records, or
convert them to the old intrinsic format.
There are a few unit tests that need updating for this patch; these are
either cases of tests that previously needed to set the debug info
format to function, or tests that depend on the old debug info format in
some way. There should be no visible change in the output of any LLVM
tool as a result of this patch, although the likelihood of this patch
breaking downstream code means an NFC tag might be a little misleading,
if not technically incorrect:
This will probably break some downstream tools that don't already handle
debug records. If your downstream code breaks as a result of this
change, the simplest fix is to convert the module in question to the old
debug format before you process it, using
`Module::convertFromNewDbgValues()`. For more information about how to
handle debug records or about what has changed, see the migration
document:
https://llvm.org/docs/RemoveDIsDebugInfo.html
GUID often have content in the higher bits of a 64-bit entry so using
the unabbrev encoding is inefficient (lots of VBR control bits).
Instead, use an abbrev with two 32-bit fixed width chunks.
The abbrev also helps encode the "count" in one place instead of
in every record.
Reduces size of distributed backend summary files by 8.7% in one
example app.
Co-authored-by: Jan Voung <jvoung@google.com>
Fixes the reported errors on:
https://github.com/llvm/llvm-project/pull/87379
A previous patch updated the bitcode reading for debug
intrinsics/records to not perform the expensive debug info format
conversion from records to intrinsics in cases where no records were
present, but the patch did not actually track when debug labels had been
seen, resulting in errors when parsing bitcode where functions contained
debug label records but no other debug records. This patch fixes that
case and adds a test for it.
The motivating use case is to support import the function declaration
across modules to construct call graph edges for indirect calls [1]
when importing the function definition costs too much compile time
(e.g., the function is too large has no `noinline` attribute).
1. Currently, when the compiled IR module doesn't have a function
definition but its postlink combined summary contains the function
summary or a global alias summary with this function as aliasee, the
function definition will be imported from source module by IRMover. The
implementation is in FunctionImporter::importFunctions [2]
2. In order for FunctionImporter to import a declaration of a function,
both function summary and alias summary need to carry the def / decl
state. Specifically, all existing summary fields doesn't differ across
import modules, but the def / decl state of is decided by
`<ImportModule, Function>`.
This change encodes the def/decl state in `GlobalValueSummary::GVFlags`.
In the subsequent changes
1. The indexing step `computeImportForModule` [3]
will compute the set of definitions and the set of declarations for each
module, and passing on the information to bitcode writer.
2. Bitcode writer will look up the def/decl state and sets the state
when it writes out the flag value. This is demonstrated in
https://github.com/llvm/llvm-project/pull/87600
3. Function importer will read the def/decl state when reading the
combined summary to figure out two sets of global values, and IRMover
will be updated to import the declaration (aka linkGlobalValuePrototype [4])
into the destination module.
- The next change is https://github.com/llvm/llvm-project/pull/87600
[1] mentioned in rfc https://discourse.llvm.org/t/rfc-for-better-call-graph-sort-build-a-more-complete-call-graph-by-adding-more-indirect-call-edges/74029#support-cross-module-function-declaration-import-5
[2] 3b337242ee/llvm/lib/Transforms/IPO/FunctionImport.cpp (L1608-L1764)
[3] 3b337242ee/llvm/lib/Transforms/IPO/FunctionImport.cpp (L856)
[4] 3b337242ee/llvm/lib/Linker/IRMover.cpp (L605)
As noted when #82404 was pushed (canonicalizing `sitofp` -> `uitofp`),
different signedness on fp casts can have dramatic performance
implications on different backends.
So, it makes to create a reliable means for the backend to pick its
cast signedness if either are correct.
Further, this allows us to start canonicalizing `sitofp`- > `uitofp`
which may easy middle end analysis.
Closes#86141
This patch adds a new flag: `--preserve-input-debuginfo-format`
This flag instructs the tool to not convert the debug info format
(intrinsics/records) of input IR, but to instead determine the format of
the input IR and overwrite the other format-determining flags so that we
process and output the file in the same format that we received it in.
This flag is turned off by llvm-link, llvm-lto, and llvm-lto2, and
should be turned off by any other tool that expects to parse multiple IR
modules and have their debug info formats match.
The motivation for this flag is to allow tools to not convert the debug
info format - verify-uselistorder and llvm-reduce, and any downstream
tools that seek to test or mutate IR as-is, without applying extraneous
modifications to the input. This is a necessary step to using debug
records by default in all (other) LLVM tools.
This patch renames DPLabel to DbgLabelRecord, in accordance with the
ongoing DbgRecord rename. This rename was fairly trivial, since DPLabel
isn't as widely used as DPValue and has no real conflicts in either its
full or abbreviated name. As usual, the entire replacement was done
automatically, with `s/DPLabel/DbgLabelRecord/` and `s/DPL/DLR/`.
As part of the migration to ptradd
(https://discourse.llvm.org/t/rfc-replacing-getelementptr-with-ptradd/68699),
we need to change the representation of the `inrange` attribute, which
is used for vtable splitting.
Currently, inrange is specified as follows:
```
getelementptr inbounds ({ [4 x ptr], [4 x ptr] }, ptr @vt, i64 0, inrange i32 1, i64 2)
```
The `inrange` is placed on a GEP index, and all accesses must be "in
range" of that index. The new representation is as follows:
```
getelementptr inbounds inrange(-16, 16) ({ [4 x ptr], [4 x ptr] }, ptr @vt, i64 0, i32 1, i64 2)
```
This specifies which offsets are "in range" of the GEP result. The new
representation will continue working when canonicalizing to ptradd
representation:
```
getelementptr inbounds inrange(-16, 16) (i8, ptr @vt, i64 48)
```
The inrange offsets are relative to the return value of the GEP. An
alternative design could make them relative to the source pointer
instead. The result-relative format was chosen on the off-chance that we
want to extend support to non-constant GEPs in the future, in which case
this variant is more expressive.
This implementation "upgrades" the old inrange representation in bitcode
by simply dropping it. This is a very niche feature, and I don't think
trying to upgrade it is worthwhile. Let me know if you disagree.
This is the major rename patch that prior patches have built towards.
The DPValue class is being renamed to DbgVariableRecord, which reflects
the updated terminology for the "final" implementation of the RemoveDI
feature. This is a pure string substitution + clang-format patch. The
only manual component of this patch was determining where to perform
these string substitutions: `DPValue` and `DPV` are almost exclusively
used for DbgRecords, *except* for:
- llvm/lib/target, where 'DP' is used to mean double-precision, and so
appears as part of .td files and in variable names. NB: There is a
single existing use of `DPValue` here that refers to debug info, which
I've manually updated.
- llvm/tools/gold, where 'LDPV' is used as a prefix for symbol
visibility enums.
Outside of these places, I've applied several basic string
substitutions, with the intent that they only affect DbgRecord-related
identifiers; I've checked them as I went through to verify this, with
reasonable confidence that there are no unintended changes that slipped
through the cracks. The substitutions applied are all case-sensitive,
and are applied in the order shown:
```
DPValue -> DbgVariableRecord
DPVal -> DbgVarRec
DPV -> DVR
```
Following the previous rename patches, it should be the case that there
are no instances of any of these strings that are meant to refer to the
general case of DbgRecords, or anything other than the DPValue class.
The idea behind this patch is therefore that pure string substitution is
correct in all cases as long as these assumptions hold.
--load-bitcode-into-experimental-debuginfo-iterators
false: Convert to the old debug mode after reading.
true: Upgrade to the new debug info format (*).
unset: Same as false (for now).
(*) As of this patch it actually just means "don't convert to either
mode after loading". Auto-upgrading will be implemented in an upcoming
patch.
With this flag we can incrementally add support for RemoveDIs by
overriding the "unset" behaviour in individual tools. The flag can be
removed once all tools support the new debug info mode.
Reaplying after revert in #85382 (861ebe6446296c96578807363aa292c69d827773).
Fixed intermittent test failure by avoiding piping output in some RUN lines.
If --write-experimental-debuginfo-iterators-to-bitcode is true (default false)
and --expermental-debuginfo-iterators is also true then the new debug info
format (non-instruction records) is written to bitcode directly.
Added the following records:
FUNC_CODE_DEBUG_RECORD_LABEL
FUNC_CODE_DEBUG_RECORD_VALUE
FUNC_CODE_DEBUG_RECORD_DECLARE
FUNC_CODE_DEBUG_RECORD_ASSIGN
FUNC_CODE_DEBUG_RECORD_VALUE_SIMPLE
The last one has an abbrev in FUNCTION_BLOCK BLOCK_INFO. Incidentally, this uses
the last value available without widening the code-length for FUNCTION_BLOCK
from 4 to 5 bits.
Records are formatted as follows:
All DbgRecord start with:
1. DILocation
FUNC_CODE_DEBUG_RECORD_LABEL
2. DILabel
DPValues then share common fields:
2. DILocalVariable
3. DIExpression
FUNC_CODE_DEBUG_RECORD_VALUE
4. Location Metadata
FUNC_CODE_DEBUG_RECORD_DECLARE
4. Location Metadata
FUNC_CODE_DEBUG_RECORD_VALUE_SIMPLE
4. Location Value (single)
FUNC_CODE_DEBUG_RECORD_ASSIGN
4. Location Metadata
5. DIAssignID
6. DIExpression (address)
7. Location Metadata (address)
Encoding the DILocation metadata reference directly appeared to yield smaller
bitcode files than encoding the operands seperately (as is done with instruction
DILocations).
FUNC_CODE_DEBUG_RECORD_VALUE_SIMPLE is by far the most common DbgRecord record
in optimized code (order of 5x-10x over other kinds). Unoptimized code should
only contain FUNC_CODE_DEBUG_RECORD_DECLARE.
If --write-experimental-debuginfo-iterators-to-bitcode is true (default false)
and --expermental-debuginfo-iterators is also true then the new debug info
format (non-instruction records) is written to bitcode directly.
Added the following records:
FUNC_CODE_DEBUG_RECORD_LABEL
FUNC_CODE_DEBUG_RECORD_VALUE
FUNC_CODE_DEBUG_RECORD_DECLARE
FUNC_CODE_DEBUG_RECORD_ASSIGN
FUNC_CODE_DEBUG_RECORD_VALUE_SIMPLE
The last one has an abbrev in FUNCTION_BLOCK BLOCK_INFO. Incidentally, this uses
the last value available without widening the code-length for FUNCTION_BLOCK
from 4 to 5 bits.
Records are formatted as follows:
All DbgRecord start with:
1. DILocation
FUNC_CODE_DEBUG_RECORD_LABEL
2. DILabel
DPValues then share common fields:
2. DILocalVariable
3. DIExpression
FUNC_CODE_DEBUG_RECORD_VALUE
4. Location Metadata
FUNC_CODE_DEBUG_RECORD_DECLARE
4. Location Metadata
FUNC_CODE_DEBUG_RECORD_VALUE_SIMPLE
4. Location Value (single)
FUNC_CODE_DEBUG_RECORD_ASSIGN
4. Location Metadata
5. DIAssignID
6. DIExpression (address)
7. Location Metadata (address)
Encoding the DILocation metadata reference directly appeared to yield smaller
bitcode files than encoding the operands seperately (as is done with instruction
DILocations).
FUNC_CODE_DEBUG_RECORD_VALUE_SIMPLE is by far the most common DbgRecord record
in optimized code (order of 5x-10x over other kinds). Unoptimized code should
only contain FUNC_CODE_DEBUG_RECORD_DECLARE.
`sign-return-address` and similar module attributes should be propagated
to the function level before modules got merged because module flags may
contradict and this information is not recoverable.
Generated code will match with the normal linking flow.
NOTE: For brevity the following talks about ConstantInt but
everything extends to cover ConstantFP as well.
Whilst ConstantInt::get() supports the creation of vectors whereby
each lane has the same value, it achieves this via other constants:
* ConstantVector for fixed-length vectors
* ConstantExprs for scalable vectors
However, ConstantExprs are being deprecated and ConstantVector is
not space efficient for larger vector types. By extending ConstantInt
we can represent vector splats by only storing the underlying scalar
value.
More specifically:
* ConstantInt gains an ElementCount variant of get().
* LLVMContext is extended to map <EC,APInt>->ConstantInt.
* BitcodeReader/Writer support is extended to allow vector types.
Whilst this patch adds the base support, more work is required
before it's production ready. For example, there's likely to be
many places where isa<ConstantInt> assumes a scalar type. Accordingly
the default behaviour of ConstantInt::get() remains unchanged but a
set of flags are added to allow wider testing and thus help with the
migration:
--use-constant-int-for-fixed-length-splat
--use-constant-fp-for-fixed-length-splat
--use-constant-int-for-scalable-splat
--use-constant-fp-for-scalable-splat
NOTE: No change is required to the bitcode format because types and
values are handled separately.
NOTE: For similar reasons as above, code generation doesn't work
out-the-box.
We've been building and testing this no-debug-intrinsic work inside of
the pass manager for a while, so that optimisation passes get exercised
and tested when we turn it on. However, by converting to the
non-intrinsic form in the bitcode loader, we accidentally caused all
parts of LLVM to potentially see non-intrinsic debug-info.
Seeing how we're trying to turn things on incrementally, it was a
mistake to go this far this fast: we can instead just focus on enabling
during optimisations for the moment, then all the other parts of LLVM
later.
Turns out I was using DbgMarker::getDbgValueRange rather than the helper
utility in Instruction::getDbgValueRange, which checks for null-ness.
Original commit message follows.
[DebugInfo][RemoveDIs] Convert debug-info modes when loading bitcode (#78967)
As part of eliminating debug-intrinsics in LLVM, we'll shortly be
pushing the conversion from "old" dbg.value mode to "new" DPValue mode
out from when the pass manager runs, to when modules are loaded. This
patch adds that conversion process and some (temporary) options to
llvm-lto{,2} to help test it.
Specifically: now whenever we load a bitcode module, consider a flag of
whether to "upgrade" it into the new debug-info mode, and if we're
lazily materializing functions then do that lazily too. Doing this
exposes an error in the IRLinker/materializer handling of DPValues,
where we need to transfer the debug-info format flag correctly, and in
ValueMapper we need to remap the Values that DPValues point at.
I've added some test coverage in the modified tests; these will be
exercised by our llvm-new-debug-iterators buildbot.
This upgrading of debug-info won't be happening for the llvm18 release,
instead we'll turn it on after the branch date, thenbe push the boundary
of where "new" debug-info starts and ends down into the existing
debug-info upgrade path over the course of the next release.
As part of eliminating debug-intrinsics in LLVM, we'll shortly be
pushing the conversion from "old" dbg.value mode to "new" DPValue mode
out from when the pass manager runs, to when modules are loaded. This
patch adds that conversion process and some (temporary) options to
llvm-lto{,2} to help test it.
Specifically: now whenever we load a bitcode module, consider a flag of
whether to "upgrade" it into the new debug-info mode, and if we're
lazily materializing functions then do that lazily too. Doing this
exposes an error in the IRLinker/materializer handling of DPValues,
where we need to transfer the debug-info format flag correctly, and in
ValueMapper we need to remap the Values that DPValues point at.
I've added some test coverage in the modified tests; these will be
exercised by our llvm-new-debug-iterators buildbot.
This upgrading of debug-info won't be happening for the llvm18 release,
instead we'll turn it on after the branch date, thenbe push the boundary
of where "new" debug-info starts and ends down into the existing
debug-info upgrade path over the course of the next release.
Add the `dead_on_unwind` attribute, which states that the caller will
not read from this argument if the call unwinds. This allows eliding
stores that could otherwise be visible on the unwind path, for example:
```
declare void @may_unwind()
define void @src(ptr noalias dead_on_unwind %out) {
store i32 0, ptr %out
call void @may_unwind()
store i32 1, ptr %out
ret void
}
define void @tgt(ptr noalias dead_on_unwind %out) {
call void @may_unwind()
store i32 1, ptr %out
ret void
}
```
The optimization is not valid without `dead_on_unwind`, because the `i32
0` value might be read if `@may_unwind` unwinds.
This attribute is primarily intended to be used on sret arguments. In
fact, I previously wanted to change the semantics of sret to include
this "no read after unwind" property (see D116998), but based on the
feedback there it is better to keep these attributes orthogonal (sret is
an ABI attribute, dead_on_unwind is an optimization attribute). This is
a reboot of that change with a separate attribute.
This adds support for a HasTailCall flag on function call edges in the
ThinLTO summary. It is intended for use in aiding discovery of missing
frames from tail calls in profiled call stacks for MemProf of profiled
binaries that did not disable tail call elimination. A follow on change
will add the use of this new flag during MemProf context disambiguation.
The new flag is encoded in the bitcode along with either the hotness
flag from the profile, or the relative block frequency under the
-write-relbf-to-summary flag when there is no profile data.
Because we now will always have some additional call edge information, I
have removed the non-profile function summary record format, and we
simply encode the tail call flag along with a hotness type of none when
there is no profile information or relative block frequency. The change
of record format and name caused most of the test case changes.
I have added explicit testing of generation of the new tail call flag
into the bitcode and IR assembly format as part of the changes to
llvm/test/Bitcode/thinlto-function-summary-refgraph.ll. I have also
added round trip testing through assembly and bitcode to
llvm/test/Assembler/thinlto-summary.ll.
This flag indicates that every bit is known to be zero in at least one
of the inputs. This allows the Or to be treated as an Add since there is
no possibility of a carry from any bit.
If the flag is present and this property does not hold, the result is
poison.
This makes it easier to reverse the InstCombine transform that turns Add
into Or.
This is inspired by a comment here
https://github.com/llvm/llvm-project/pull/71955#discussion_r1391614578
Discourse thread
https://discourse.llvm.org/t/rfc-add-or-disjoint-flag/75036
Close https://github.com/llvm/llvm-project/issues/56980.
This patch tries to introduce a light-weight optimization attribute for
coroutines which are guaranteed to only be destroyed after it reached
the final suspend.
The rationale behind the patch is simple. See the example:
```C++
A foo() {
dtor d;
co_await something();
dtor d1;
co_await something();
dtor d2;
co_return 43;
}
```
Generally the generated .destroy function may be:
```C++
void foo.destroy(foo.Frame *frame) {
switch(frame->suspend_index()) {
case 1:
frame->d.~dtor();
break;
case 2:
frame->d.~dtor();
frame->d1.~dtor();
break;
case 3:
frame->d.~dtor();
frame->d1.~dtor();
frame->d2.~dtor();
break;
default: // coroutine completed or haven't started
break;
}
frame->promise.~promise_type();
delete frame;
}
```
Since the compiler need to be ready for all the cases that the coroutine
may be destroyed in a valid state.
However, from the user's perspective, we can understand that certain
coroutine types may only be destroyed after it reached to the final
suspend point. And we need a method to teach the compiler about this.
Then this is the patch. After the compiler recognized that the
coroutines can only be destroyed after complete, it can optimize the
above example to:
```C++
void foo.destroy(foo.Frame *frame) {
frame->promise.~promise_type();
delete frame;
}
```
I spent a lot of time experimenting and experiencing this in the
downstream. The numbers are really good. In a real-world coroutine-heavy
workload, the size of the build dir (including .o files) reduces 14%.
And the size of final libraries (excluding the .o files) reduces 8% in
Debug mode and 1% in Release mode.
Remove support for zext and sext constant expressions. All places
creating them have been removed beforehand, so this just removes the
APIs and uses of these constant expressions in tests.
There is some additional cleanup that can be done on top of this, e.g.
we can remove the ZExtInst vs ZExtOperator footgun.
This is part of
https://discourse.llvm.org/t/rfc-remove-most-constant-expressions/63179.
This adds a writable attribute, which in conjunction with
dereferenceable(N) states that a spurious store of N bytes is
introduced on function entry. This implies that this many bytes
are writable without trapping or introducing data races. See
https://llvm.org/docs/Atomics.html#optimization-outside-atomic for
why the second point is important.
This attribute can be added to sret arguments. I believe Rust will
also be able to use it for by-value (moved) arguments. Rust likely
won't be able to use it for &mut arguments (tree borrows does not
appear to allow spurious stores).
In this patch the new attribute is only used by LICM scalar promotion.
However, the actual motivation for this is to fix a correctness issue
in call slot optimization, which needs this attribute to avoid
optimization regressions.
Followup to the discussion on D157499.
Differential Revision: https://reviews.llvm.org/D158081
Add an nneg flag to the zext instruction, which specifies that the
argument is non-negative. Otherwise, the result is a poison value.
The primary use-case for the flag is to preserve information when sext
gets replaced with zext due to range-based canonicalization. The nneg
flag allows us to convert the zext back into an sext later. This is
useful for some optimizations (e.g. a signed icmp can fold with sext but
not zext), as well as some targets (e.g. RISCV prefers sext over zext).
Discourse thread: https://discourse.llvm.org/t/rfc-add-zext-nneg-flag/73914
This patch is based on https://reviews.llvm.org/D156444 by
@Panagiotis156, with some implementation simplifications and additional
tests.
---------
Co-authored-by: Panagiotis K <karouzakispan@gmail.com>
This patch adds a new fn attribute, `optdebug`, that specifies that
optimizations should make decisions that prioritize debug info quality,
potentially at the cost of runtime performance.
This patch does not add any functional changes triggered by this
attribute, only the attribute itself. A subsequent patch will use this
flag to disable the post-RA scheduler.
Metadata (via ValueAsMetadata) can reference constant expressions
that may no longer be supported. These references can both be in
function-local metadata and module metadata, if the same expression
is used in multiple functions. At least in theory, such references
could also be in metadata proper, rather than just inside
ValueAsMetadata references in calls.
Instead of trying to expand these expressions (which we can't
reliably do), pretend that the constant has been deleted, which
means that ValueAsMetadata references will get replaced with
undef metadata.
Fixes https://github.com/llvm/llvm-project/issues/68281.
This fixes the bitcode upgrade failure reported in
https://reviews.llvm.org/D155924#4616789.
The expansion always happens in the entry block, so this may be
inaccurate if there are trapping constant expressions.