Fix#91814
When instructions are extracted into a new function the `DIAssignID` metadata
uses and attachments need to be remapped so that the stores and assignment
markers don't link to stores and assignment markers in the original function.
This matches existing inlining behaviour for DIAssignIDs.
When the compiler enables ALSR by default, binary inconsistency occurs
when the extractCodeRegion function is called. The reason is that the
sequence of traversing ExitBlocks containers of the SmallPtrSet type is
changed due to randomization of BasicBlock pointers.
This fixes https://github.com/llvm/llvm-project/issues/86427
This patch renames DPLabel to DbgLabelRecord, in accordance with the
ongoing DbgRecord rename. This rename was fairly trivial, since DPLabel
isn't as widely used as DPValue and has no real conflicts in either its
full or abbreviated name. As usual, the entire replacement was done
automatically, with `s/DPLabel/DbgLabelRecord/` and `s/DPL/DLR/`.
This is the major rename patch that prior patches have built towards.
The DPValue class is being renamed to DbgVariableRecord, which reflects
the updated terminology for the "final" implementation of the RemoveDI
feature. This is a pure string substitution + clang-format patch. The
only manual component of this patch was determining where to perform
these string substitutions: `DPValue` and `DPV` are almost exclusively
used for DbgRecords, *except* for:
- llvm/lib/target, where 'DP' is used to mean double-precision, and so
appears as part of .td files and in variable names. NB: There is a
single existing use of `DPValue` here that refers to debug info, which
I've manually updated.
- llvm/tools/gold, where 'LDPV' is used as a prefix for symbol
visibility enums.
Outside of these places, I've applied several basic string
substitutions, with the intent that they only affect DbgRecord-related
identifiers; I've checked them as I went through to verify this, with
reasonable confidence that there are no unintended changes that slipped
through the cracks. The substitutions applied are all case-sensitive,
and are applied in the order shown:
```
DPValue -> DbgVariableRecord
DPVal -> DbgVarRec
DPV -> DVR
```
Following the previous rename patches, it should be the case that there
are no instances of any of these strings that are meant to refer to the
general case of DbgRecords, or anything other than the DPValue class.
The idea behind this patch is therefore that pure string substitution is
correct in all cases as long as these assumptions hold.
Since some of the users of `CodeExtractor` like `HotColdSplitting` run
late in the pipeline, returns are not cleaned to `unreachable`. So,
just emit `unreachable` directly if the function is `noreturn`.
Closes#84682
As part of the effort to rename the DbgRecord classes, this patch
renames the widely-used functions that operate on DbgRecords but refer
to DbgValues or DPValues in their names to refer to DbgRecords instead;
all such functions are defined in one of `BasicBlock.h`,
`Instruction.h`, and `DebugProgramInstruction.h`.
This patch explicitly does not change the names of any comments or
variables, except for where they use the exact name of one of the
renamed functions. The reason for this is reviewability; this patch can
be trivially examined to determine that the only changes are direct
string substitutions and any results from clang-format responding to the
changed line lengths. Future patches will cover renaming variables and
comments, and then renaming the classes themselves.
I'd reverted this in 6c7805d5d1 after a bad stage. Original commit
messsage follows:
[NFC][RemoveDIs] Bulk update utilities to insert with iterators
As part of the RemoveDIs project we need LLVM to insert instructions using
iterators wherever possible, so that the iterators can carry a bit of
debug-info. This commit implements some of that by updating the contents of
llvm/lib/Transforms/Utils to always use iterator-versions of instruction
constructors.
There are two general flavours of update:
* Almost all call-sites just call getIterator on an instruction
* Several make use of an existing iterator (scenarios where the code is
actually significant for debug-info)
The underlying logic is that any call to getFirstInsertionPt or similar
APIs that identify the start of a block need to have that iterator passed
directly to the insertion function, without being converted to a bare
Instruction pointer along the way.
I've also switched DemotePHIToStack to take an optional iterator: it needs
to take an iterator, and having a no-insert-location behaviour appears to
be important. The constructors for ICmpInst and FCmpInst have been updated
too. They're the only instructions that take block _references_ rather than
pointers for certain calls, and a future patch is going to make use of
default-null block insertion locations.
All of this should be NFC.
As part of the RemoveDIs project we need LLVM to insert instructions using
iterators wherever possible, so that the iterators can carry a bit of
debug-info. This commit implements some of that by updating the contents of
llvm/lib/Transforms/Utils to always use iterator-versions of instruction
constructors.
There are two general flavours of update:
* Almost all call-sites just call getIterator on an instruction
* Several make use of an existing iterator (scenarios where the code is
actually significant for debug-info)
The underlying logic is that any call to getFirstInsertionPt or similar
APIs that identify the start of a block need to have that iterator passed
directly to the insertion function, without being converted to a bare
Instruction pointer along the way.
I've also switched DemotePHIToStack to take an optional iterator: it needs
to take an iterator, and having a no-insert-location behaviour appears to
be important. The constructors for ICmpInst and FCmpInst have been updated
too. They're the only instructions that take block _references_ rather than
pointers for certain calls, and a future patch is going to make use of
default-null block insertion locations.
All of this should be NFC.
Patch 2 of 3 to add llvm.dbg.label support to the RemoveDIs project. The
patch stack adds the DPLabel class, which is the RemoveDIs llvm.dbg.label
equivalent.
1. Add DbgRecord base class for DPValue and the not-yet-added
DPLabel class.
2. Add the DPLabel class.
-> 3. Add support to passes.
The next patch, #82639, will enable conversion between dbg.labels and DPLabels.
AssignemntTrackingAnalysis support could have gone two ways:
1. Have the analysis store a DPLabel representation in its results -
SelectionDAGBuilder reads the analysis results and ignores all DbgRecord
kinds.
2. Ignore DPLabels in the analysis - SelectionDAGBuilder reads the analysis
results but still needs to iterate over DPLabels from the IR.
I went with option 2 because it's less work and is no less correct than 1. It's
worth noting that causes labels to sink to the bottom of packs of debug records.
e.g., [value, label, value] becomes [value, value, label]. This shouldn't be a
problem because labels and variable locations don't have an ordering requirement.
The ordering between variable locations is maintained and the label movement is
deterministic
Patch 1 of 3 to add llvm.dbg.label support to the RemoveDIs project. The
patch stack adds a new base class
-> 1. Add DbgRecord base class for DPValue and the not-yet-added
DPLabel class.
2. Add the DPLabel class.
3. Enable dbg.label conversion and add support to passes.
Patches 1 and 2 are NFC.
In the near future we also will rename DPValue to DbgVariableRecord and
DPLabel to DbgLabelRecord, at which point we'll overhaul the function
names too. The name DPLabel keeps things consistent for now.
When picking the source location for a branch instruction in the
CodeExtractor, we can end up picking the source location of a debugging
intrinsic. This never makes sense because any variable assignment
information (or labels) might originate from completely different
lexical scopes that have been inlined, and also makes the line tables
change between -g and -gmlt. Fix this by skipping debug intrinsics when
looking for branch source locations.
Detected because of test differences with RemoveDIs, the non-intrinsinc
form of debug-info -- fixing in intrinsic form to avoid there being
spurious test differences when we turn it on.
This patch trivially updates various opt passes to handle DPVAssigns. In
all cases, this means some combination of generifying existing code to
handle DPValues and DbgAssignIntrinsics, iterating over DPValues where
previously we did not, or duplicating code for DbgAssignIntrinsics to
the equivalent DPValue function (in inlining and salvageDebugInfo).
Add the `dead_on_unwind` attribute, which states that the caller will
not read from this argument if the call unwinds. This allows eliding
stores that could otherwise be visible on the unwind path, for example:
```
declare void @may_unwind()
define void @src(ptr noalias dead_on_unwind %out) {
store i32 0, ptr %out
call void @may_unwind()
store i32 1, ptr %out
ret void
}
define void @tgt(ptr noalias dead_on_unwind %out) {
call void @may_unwind()
store i32 1, ptr %out
ret void
}
```
The optimization is not valid without `dead_on_unwind`, because the `i32
0` value might be read if `@may_unwind` unwinds.
This attribute is primarily intended to be used on sret arguments. In
fact, I previously wanted to change the semantics of sret to include
this "no read after unwind" property (see D116998), but based on the
feedback there it is better to keep these attributes orthogonal (sret is
an ABI attribute, dead_on_unwind is an optimization attribute). This is
a reboot of that change with a separate attribute.
CodeExtractor shifts dbg.value intrinsics out of the region being
extracted and updates them to be appropriate in the extracted function.
With new non-intrinsic variable locations, we need to manually do this
too, with DPValues.
Most of this patch shifts and refactors some utilities in
fixupDebugInfoPostExtraction so that we can add a single extra helper
lambda that iterates over DPValues and applies update-utilities. We also
have to assign the IsNewDbgInfoFormat flag in a bunch of places -- this
normally gets set the moment you insert a block into a function (or
function into a module), however a few blocks are constructed here
before being inserted, thus we have to do some manual setup.
Tested via LoopExtractor_alloca.ll, which invokes debugify.
Close https://github.com/llvm/llvm-project/issues/56980.
This patch tries to introduce a light-weight optimization attribute for
coroutines which are guaranteed to only be destroyed after it reached
the final suspend.
The rationale behind the patch is simple. See the example:
```C++
A foo() {
dtor d;
co_await something();
dtor d1;
co_await something();
dtor d2;
co_return 43;
}
```
Generally the generated .destroy function may be:
```C++
void foo.destroy(foo.Frame *frame) {
switch(frame->suspend_index()) {
case 1:
frame->d.~dtor();
break;
case 2:
frame->d.~dtor();
frame->d1.~dtor();
break;
case 3:
frame->d.~dtor();
frame->d1.~dtor();
frame->d2.~dtor();
break;
default: // coroutine completed or haven't started
break;
}
frame->promise.~promise_type();
delete frame;
}
```
Since the compiler need to be ready for all the cases that the coroutine
may be destroyed in a valid state.
However, from the user's perspective, we can understand that certain
coroutine types may only be destroyed after it reached to the final
suspend point. And we need a method to teach the compiler about this.
Then this is the patch. After the compiler recognized that the
coroutines can only be destroyed after complete, it can optimize the
above example to:
```C++
void foo.destroy(foo.Frame *frame) {
frame->promise.~promise_type();
delete frame;
}
```
I spent a lot of time experimenting and experiencing this in the
downstream. The numbers are really good. In a real-world coroutine-heavy
workload, the size of the build dir (including .o files) reduces 14%.
And the size of final libraries (excluding the .o files) reduces 8% in
Debug mode and 1% in Release mode.
This adds a writable attribute, which in conjunction with
dereferenceable(N) states that a spurious store of N bytes is
introduced on function entry. This implies that this many bytes
are writable without trapping or introducing data races. See
https://llvm.org/docs/Atomics.html#optimization-outside-atomic for
why the second point is important.
This attribute can be added to sret arguments. I believe Rust will
also be able to use it for by-value (moved) arguments. Rust likely
won't be able to use it for &mut arguments (tree borrows does not
appear to allow spurious stores).
In this patch the new attribute is only used by LICM scalar promotion.
However, the actual motivation for this is to fix a correctness issue
in call slot optimization, which needs this attribute to avoid
optimization regressions.
Followup to the discussion on D157499.
Differential Revision: https://reviews.llvm.org/D158081
The user of CodeExtractor should be able to specify that
the aggregate argument should be passed as a pointer in zero address
space.
CodeExtractor is used to generate outlined functions required by OpenMP
runtime. The arguments of the outlined functions for OpenMP GPU code
are in 0 address space. 0 address space does not need to be the default
address space for GPU device. That's why there is a need to allow
the user of CodeExtractor to specify, that the allocated aggregate parameter
is passed as pointer in zero address space.
This patch adds a new fn attribute, `optdebug`, that specifies that
optimizations should make decisions that prioritize debug info quality,
potentially at the cost of runtime performance.
This patch does not add any functional changes triggered by this
attribute, only the attribute itself. A subsequent patch will use this
flag to disable the post-RA scheduler.
The `BlockFrequency` class abstracts `uint64_t` frequency values. Use it
more consistently in various APIs and disable implicit conversion to
make usage more consistent and explicit.
- Use `BlockFrequency Freq` parameter for `setBlockFreq`,
`getProfileCountFromFreq` and `setBlockFreqAndScale` functions.
- Return `BlockFrequency` in `getEntryFreq()` functions.
- While on it change some `const BlockFrequency& Freq` parameters to
plain `BlockFreqency Freq`.
- Mark `BlockFrequency(uint64_t)` constructor as explicit.
- Add missing `BlockFrequency::operator!=`.
- Remove `uint64_t BlockFreqency::getMaxFrequency()`.
- Add `BlockFrequency BlockFrequency::max()` function.
Currently, the code extractor functionality deletes a debug intrinsic if
its "Location" argument is not part of the extracted function. The
location is the first argument (or the first few arguments in case of a
DIArgList) of all debug intrinsics.
However, according to the docs, the signature of dbg.assign is:
```
void @llvm.dbg.assign(Value *Value,
DIExpression *ValueExpression,
DILocalVariable *Variable,
DIAssignID *ID,
Value *Address,
DIExpression *AddressExpression)
```
That is, there are two `Value` arguments to it: the usual location
argument and an "Address" argument. This Address argument should also
receive the same treatment.
Continuing the patch series to get rid of debug intrinsics [0], instruction
insertion needs to be done with iterators rather than instruction pointers,
so that we can communicate information in the iterator class. This patch
adds an iterator-taking insertBefore method and converts various call sites
to take iterators. These are all sites where such debug-info needs to be
preserved so that a stage2 clang can be built identically; it's likely that
many more will need to be changed in the future.
At this stage, this is just changing the spelling of a few operations,
which will eventually become signifiant once the debug-info bearing
iterator is used.
[0] https://discourse.llvm.org/t/rfc-instruction-api-changes-needed-to-eliminate-debug-intrinsics-from-ir/68939
Differential Revision: https://reviews.llvm.org/D152537
This carries a bitmask indicating forbidden floating-point value kinds
in the argument or return value. This will enable interprocedural
-ffinite-math-only optimizations. This is primarily to cover the
no-nans and no-infinities cases, but also covers the other floating
point classes for free. Textually, this provides a number of names
corresponding to bits in FPClassTest, e.g.
call nofpclass(nan inf) @must_be_finite()
call nofpclass(snan) @cannot_be_snan()
This is more expressive than the existing nnan and ninf fast math
flags. As an added bonus, you can represent fun things like nanf:
declare nofpclass(inf zero sub norm) float @only_nans()
Compared to nnan/ninf:
- Can be applied to individual call operands as well as the return value
- Can distinguish signaling and quiet nans
- Distinguishes the sign of infinities
- Can be safely propagated since it doesn't imply anything about
other operands.
- Does not apply to FP instructions; it's not a flag
This is one step closer to being able to retire "no-nans-fp-math" and
"no-infs-fp-math". The one remaining situation where we have no way to
represent no-nans/infs is for loads (if we wanted to solve this we
could introduce !nofpclass metadata, following along with
noundef/!noundef).
This is to help simplify the GPU builtin math library
distribution. Currently the library code has explicit finite math only
checks, read from global constants the compiler driver needs to set
based on the compiler flags during linking. We end up having to
internalize the library into each translation unit in case different
linked modules have different math flags. By propagating known-not-nan
and known-not-infinity information, we can automatically prune the
edge case handling in most functions if the function is only reached
from fast math uses.
This reverts commit f9599bbc7a3f831e1793a549d8a7a19265f3e504.
For some reason it caused us a huge compile time regression in downstream
workloads. Not sure whether the source of it is in upstream code ir not.
Temporarily reverting until investigated.
Differential Revision: https://reviews.llvm.org/D142330
As discussed in https://github.com/llvm/llvm-project/issues/59901
This change is not NFC. There is one SCEV and EarlyCSE test that have an
improved analysis/optimization case. Rest of the tests are not failing.
I've mostly only added cleanup to SCEV since that is where this issue
started. As a follow up, I believe there is more cleanup opportunity in
SCEV and other affected passes.
There could be cases where there are missed registerAssumption of
guards, but this case is not so bad because there will be no
miscompilation. AssumptionCacheTracker should take care of deleted
guards.
Differential Revision: https://reviews.llvm.org/D142330
value() has undesired exception checking semantics and calls
__throw_bad_optional_access in libc++. Moreover, the API is unavailable without
_LIBCPP_NO_EXCEPTIONS on older Mach-O platforms (see
_LIBCPP_AVAILABILITY_BAD_OPTIONAL_ACCESS).
When a dbg.label is moved into a new function, its corresponding scope
should be preserved, with the exception of the subprogram at the end of
the scope chain, which should now be the subprogram of the destination
function. See D139671.
Differential Revision: https://reviews.llvm.org/D139849
When a dbg.value is moved into a new function, the corresponding
variable should have its entire scope chain reparented with the new
function. The current implementation drops the scope chain and replaces
it with a subprogram for the new function.
Differential Revision: https://reviews.llvm.org/D139671
When a dbg.value instruction for a variable V is extracted into a new
function, the scope of the underlying variable should be set to the new
function iff V was in the scope of the old function (i.e. it hadn't been
inlined). Prior to this patch, the code extractor would always update
the scope of V.
Differential Revision: https://reviews.llvm.org/D139669
When a new function "NewF" is created with instructions extracted from
another function "OldF", the CodeExtractor only preserves debug
line/column of the extracted instructions. However:
1. Any inlinedAt nodes are dropped.
2. The scope chain is replaced with a single node, the Subprogram of NewF.
Both of these are incorrect: most of the debug metadata from the
original instructions should be preserved. We only need to update the
Subprogram found at the scope of the last node of the inline chain; this
Subprogram used to be OldF but now should be NewF.
Differential Revision: https://reviews.llvm.org/D139217
This patch mechanically replaces None with std::nullopt where the
compiler would warn if None were deprecated. The intent is to reduce
the amount of manual work required in migrating from Optional to
std::optional.
This is part of an effort to migrate from llvm::Optional to
std::optional:
https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
This switches everything to use the memory attribute proposed in
https://discourse.llvm.org/t/rfc-unify-memory-effect-attributes/65579.
The old argmemonly, inaccessiblememonly and inaccessiblemem_or_argmemonly
attributes are dropped. The readnone, readonly and writeonly attributes
are restricted to parameters only.
The old attributes are auto-upgraded both in bitcode and IR.
The bitcode upgrade is a policy requirement that has to be retained
indefinitely. The IR upgrade is mainly there so it's not necessary
to update all tests using memory attributes in this patch, which
is already large enough. We could drop that part after migrating
tests, or retain it longer term, to make it easier to import IR
from older LLVM versions.
High-level Function/CallBase APIs like doesNotAccessMemory() or
setDoesNotAccessMemory() are mapped transparently to the memory
attribute. Code that directly manipulates attributes (e.g. via
AttributeList) on the other hand needs to switch to working with
the memory attribute instead.
Differential Revision: https://reviews.llvm.org/D135780
This implements IR and bitcode support for the memory attribute,
as specified in https://reviews.llvm.org/D135597.
The new attribute is not used for anything yet (and as such, the
old memory attributes are unaffected).
Differential Revision: https://reviews.llvm.org/D135592