This patch adds support for DPVAssigns across all of
AssignmentTrackingAnalysis except for AssignmentTrackingLowering, which
is implemented in a separate patch. This patch includes handling
DPValues in MemLocFragFill, the removal of redundant DPValues as part of
AssignmentTrackingAnalysis (which is different to the version in
`BasicBlockUtils.cpp`), and preventing the DPVAssigns from being
directly emitted in SelectionDAG (just as we don't emit llvm.dbg.assigns
directly, but receive a set of locations from
AssignmentTrackingAnalysis' output).
Following on from the previous patch 6aeb7a7,
this patch adds the necessary code to process the DPV equivalents of
llvm.dbg.assign intrinsics. Most of the content of this patch is simply
duplicating existing functionality, using generic code for simple
functions and PointerUnions where storage is required. The most complex
changes are in the places that iterate over instructions, as iterating
over DPValues between instructions is different to iterating over
instructions that may or may not be debug intrinsics; this is most
complex in `AssignmentTrackingLowering::process`, where I've added some
comments to explain the state of the program at each key point depending
on whether we are operating on intrinsics or DPValues.
This patch adds the preliminary changes for handling DPValues in
AssignmentTrackingAnalysis - very few functional changes are included,
but internal data structures have been changed to operate with DPValues
as well as Instructions, allowing future patches to process DPValues
correctly.
Fix https://github.com/llvm/llvm-project/issues/74189 (crash report).
The pruning code uses a BitVector to track which parts of a variable have been
defined in order to find redundant debug records. BitVector uses a u32 to track
size; variables of types with a bit-size greater than roughly max(u32)* can't be
represented using a BitVector.
Fix the assertion by introducing a limit on type size. Improve performance by
bringing the limit down to a sensible number and tracking byte-sizes instead
of bit-sizes.
Skipping variables in this pruning code doesn't cause debug info correctness
issues; it just means there may be some extra redundant debug records.
(*) `max(u32) - 63` due to BitVector::NumBitWords implementation.
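
As a rough illustration of the size guard described above, the following
standalone C++ sketch converts a bit-size to a byte-size and skips variables
above a cap. It is not the code from the patch: `trackedByteCount` and
`MaxTrackedBytes` are hypothetical names, and the 1024-byte cap is an arbitrary
placeholder because the actual limit is not stated here.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical cap; the limit chosen by the real patch is not stated here.
constexpr uint64_t MaxTrackedBytes = 1024;

// Returns the number of bytes to track for a variable of SizeInBits bits, or
// std::nullopt if the variable is too large and should be skipped.
std::optional<uint64_t> trackedByteCount(uint64_t SizeInBits) {
  // Round up to whole bytes without overflowing for very large bit-sizes.
  uint64_t SizeInBytes = SizeInBits / 8 + (SizeInBits % 8 != 0 ? 1 : 0);
  if (SizeInBytes > MaxTrackedBytes)
    return std::nullopt;
  return SizeInBytes;
}
```

A skipped variable simply keeps all of its debug records, which matches the
note above that skipping affects redundancy only, not correctness.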
The whole point of the GenericDomTree.h vs
GenericDomTreeConstruction.h distinction is that the latter only
needs to be included in the source file and not the header.
Fixes #65004 by trimming assignments from out of bounds stores (out of bounds
of either the base variable or the backing alloca). If there's no overlap at
all or the out of bounds access starts at a negative offset from the alloca,
the assignment is simply skipped.
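
The trimming rule can be sketched in standalone C++ as follows. This is an
illustration under assumptions, not the LLVM implementation: `BitRange` and
`trimStore` are hypothetical names, and offsets are taken to be in bits
relative to the start of the storage being checked.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <optional>

struct BitRange {
  int64_t Offset; // Start, in bits, relative to the start of the storage.
  uint64_t Size;  // Length in bits.
};

// Clamp a store covering [Offset, Offset + Size) to storage of StorageSize
// bits. Returns std::nullopt when the assignment should be skipped.
std::optional<BitRange> trimStore(BitRange Store, uint64_t StorageSize) {
  // A store that starts before the storage is skipped outright.
  if (Store.Offset < 0)
    return std::nullopt;
  uint64_t Begin = static_cast<uint64_t>(Store.Offset);
  // No overlap at all: nothing to track.
  if (Begin >= StorageSize || Store.Size == 0)
    return std::nullopt;
  // Otherwise keep only the part that lies inside the storage.
  uint64_t TrimmedSize = std::min(Store.Size, StorageSize - Begin);
  return BitRange{Store.Offset, TrimmedSize};
}
```

In this sketch the same check would be applied once against the bounds of the
variable and once against the bounds of the backing alloca.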
Remove assert from AssignmentTrackingAnalysis that fires if a local variable
has non-alloca storage. The analysis can emit these locations but the
assignment tracking code in SelectionDAG isn't ready to handle non-alloca
storage for locals yet. The AssignmentTrackingPass (pass that adds assignment
tracking metadata) ignores non-alloca dbg.declares, so the only variables
affected are those whose backing storage is changed from an alloca during
optimisation, and the result is the variables are dropped.
Fixes: https://ci.chromium.org/ui/p/pigweed/builders/toolchain/toolchain-ci-pigweed-linux/b8783274592206481489/overview
Reviewed By: StephenTozer
Differential Revision: https://reviews.llvm.org/D149135
The vectors being sorted here shouldn't contain duplicate entries. Prior to
this patch this was checked with an assert within the `std::sort`
predicate. However, `std::sort` may compare an element against itself which
causes the assert to fire (false positive). Move the assert outside of the sort
predicate to avoid such issues.
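
A minimal standalone sketch of the pattern, using a plain `std::vector<int>`
rather than the vectors in the analysis (`sortUnique` is a hypothetical name):

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Sort Values, then check that the sorted range contains no duplicates.
// The check lives outside the comparator, so a self-comparison made by
// std::sort cannot trigger it.
void sortUnique(std::vector<int> &Values) {
  std::sort(Values.begin(), Values.end());
  // Duplicates are adjacent once the range is sorted.
  assert(std::adjacent_find(Values.begin(), Values.end()) == Values.end() &&
         "unexpected duplicate entry");
}
```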
Reviewed By: StephenTozer
Differential Revision: https://reviews.llvm.org/D149045
Such dbg.assigns will occur if you write zero-sized memcpys (see
https://reviews.llvm.org/D146987#4240016).
Handle this in AssignmentTrackingAnalysis (back end) rather than
AssignmentTrackingPass (declare-to-assign) in case it is possible to reproduce
this as a result of optimisations.
Reviewed By: jmorse
Differential Revision: https://reviews.llvm.org/D147435
The elements in FragmentMap are large objects, so using a reference avoids
copying them and improves performance, as is already done at line 1912.
Differential Revision: https://reviews.llvm.org/D147126
MemLocFragmentFill uses an IntervalMap to track which bits of each variable are
stack-homed. Intervals with the same value (same stack location base address)
are automatically coalesced by the map. This patch changes the analysis to take
advantage of that and insert a new dbg loc after each def if any coalescing
took place. This results in some additional redundant defs (we insert a def,
then another that by definition shadows the previous one if any coalescing took
place) but they're all cleaned up thanks to the previous patch in this stack.
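
To make the coalescing step concrete, here is a standalone sketch that mimics
it with a `std::map` instead of LLVM's IntervalMap. All names (`Interval`,
`FragMap`, `insertCoalescing`) are hypothetical, intervals are half-open, and
the sketch assumes the inserted interval does not overlap existing ones.

```cpp
#include <cassert>
#include <iterator>
#include <map>
#include <utility>

struct Interval {
  unsigned Start, End; // Half-open bit range [Start, End).
  int Base;            // Stand-in for the stack location base address.
};

// Key = Start; mapped value = {End, Base}.
using FragMap = std::map<unsigned, std::pair<unsigned, int>>;

// Insert [Start, End) with value Base, merging with touching neighbours that
// have the same Base. Returns the interval actually stored after merging.
Interval insertCoalescing(FragMap &Map, unsigned Start, unsigned End,
                          int Base) {
  auto Next = Map.lower_bound(Start);
  // Merge with a left neighbour that ends exactly at Start.
  if (Next != Map.begin()) {
    auto Prev = std::prev(Next);
    if (Prev->second.first == Start && Prev->second.second == Base) {
      Start = Prev->first;
      Map.erase(Prev); // Erasing Prev leaves Next valid in std::map.
    }
  }
  // Merge with a right neighbour that starts exactly at End.
  if (Next != Map.end() && Next->first == End &&
      Next->second.second == Base) {
    End = Next->second.first;
    Map.erase(Next);
  }
  Map[Start] = {End, Base};
  return {Start, End, Base};
}
```

If the returned interval is wider than the one passed in, coalescing took
place, which is the condition under which the extra location def is inserted.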
This reduces the total number of fragments created by
AssignmentTrackingAnalysis which reduces compile time because LiveDebugValues
computes SSA for every fragment it encounters. There's a geomean reduction in
instructions retired in a CTMark LTO-O3-g build of 0.3% with these two patches.
One small caveat is that this technique can produce partially overlapping
fragments (e.g. slice [0, 32) and slice [16, 64)), which we know
LiveDebugVariables doesn't really handle correctly. Used in combination with
instruction-referencing this isn't a problem, since LiveDebugVariables is
effectively side-stepped in instruction-referencing mode. Given this, the
coalescing is only enabled when instruction-referencing is enabled (but the
behaviour can be overridden using -debug-ata-coalesce-frags=<bool>).
Reviewed By: jmorse
Differential Revision: https://reviews.llvm.org/D146980
`removeRedundantDbgLocsUsingBackwardScan` removes redundant dbg loc definitions
by scanning backwards through contiguous sets of them (a "wedge"), removing
earlier (in IR order terms) defs for fragments of variables that are defined
later in the wedge.
In this patch we use a `BitVector` for each variable to track which bits have
definitions to more accurately determine whether a loc def is redundant. This
patch increases compile time by itself, but reduces it when combined with the
follow-up patch.
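
The backward scan over a single wedge can be sketched in standalone C++ as
below. This is a simplified model, not the LLVM code: `LocDef` and `pruneWedge`
are hypothetical, variables are identified by name, and `std::vector<bool>`
stands in for `BitVector`.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <string>
#include <vector>

// One location definition: bits [Offset, Offset + Size) of variable Var.
struct LocDef {
  std::string Var;
  unsigned Offset;
  unsigned Size;
};

// Scan one wedge backwards and keep only the defs that define at least one
// bit not already defined by a later def of the same variable.
std::vector<LocDef> pruneWedge(const std::vector<LocDef> &Wedge) {
  std::map<std::string, std::vector<bool>> DefinedBits;
  std::vector<LocDef> Kept;
  for (auto It = Wedge.rbegin(); It != Wedge.rend(); ++It) {
    std::vector<bool> &Bits = DefinedBits[It->Var];
    unsigned End = It->Offset + It->Size;
    if (Bits.size() < End)
      Bits.resize(End, false);
    bool DefinesNewBit = false;
    for (unsigned I = It->Offset; I < End; ++I) {
      if (!Bits[I]) {
        DefinesNewBit = true;
        Bits[I] = true;
      }
    }
    if (DefinesNewBit)
      Kept.push_back(*It);
  }
  // Kept was filled back to front; restore IR order.
  std::reverse(Kept.begin(), Kept.end());
  return Kept;
}
```

With per-bit tracking, a def of bits [0, 32) followed by defs of [0, 16) and
[16, 32) is recognised as redundant even though no single later def covers it.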
Reviewed By: jmorse
Differential Revision: https://reviews.llvm.org/D146978
Restructure AssignmentTrackingLowering::join to avoid a map copy in the case
where BB has more than one pred.
We only need to perform a copy of a pred LiveOut if there's exactly one
already-visited pred (Result = PredLiveOut). With more than one pred the result
is built by calling Result = join(std::move(Result), PredLiveOut) for each
subsequent pred, where the join parameters are const &. That is, with more than one pred
we can avoid copying by referencing the first two pred LiveOuts in the first
join and then using a move + reference for the rest.
This reduces compile time for CTMark LTO-O3-g builds.
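
The restructured control flow can be sketched in standalone C++ as follows.
`VarMap`, `join`, and `joinPreds` are hypothetical stand-ins, and the join
used here (keep entries on which both maps agree) is a placeholder rather than
the lattice join used by the analysis.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

using VarMap = std::map<int, int>; // Hypothetical {variable ID: value} map.

// Placeholder join taking both inputs by const reference.
VarMap join(const VarMap &A, const VarMap &B) {
  VarMap Result;
  for (const auto &[Var, Val] : A) {
    auto It = B.find(Var);
    if (It != B.end() && It->second == Val)
      Result.emplace(Var, Val);
  }
  return Result;
}

// Join the LiveOut maps of all already-visited predecessors.
VarMap joinPreds(const std::vector<const VarMap *> &PredLiveOuts) {
  if (PredLiveOuts.empty())
    return {};
  // Exactly one pred: this is the only case that needs a copy.
  if (PredLiveOuts.size() == 1)
    return *PredLiveOuts[0];
  // Two or more: the first join reads both LiveOuts by reference.
  VarMap Result = join(*PredLiveOuts[0], *PredLiveOuts[1]);
  // Remaining preds: the joined map is move-assigned back into Result.
  for (std::size_t I = 2; I < PredLiveOuts.size(); ++I)
    Result = join(Result, *PredLiveOuts[I]);
  return Result;
}
```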
Reviewed By: jmorse
Differential Revision: https://reviews.llvm.org/D144732
Only calculate fragment overlaps for partially stack homed variables. This
filter is already applied to the rest of the analysis - this change simply
prevents some unnecessary work.
Reviewed By: jmorse
Differential Revision: https://reviews.llvm.org/D145515
...rather than using DenseMaps to track per-variable information.
Rather than tracking 3 maps of {VariableID: SomeInfo} per block, use a
BitVector indexed by VariableID to mask 3 vectors of SomeInfo.
BlockInfos now need to be initialised with a call to init which sets the
BitVector width to the number of partially promoted variables in the function
and fills the vectors with Top values.
Prior to this patch, in joinBlockInfo, it was necessary to insert Top values
into the Join result for variables in A XOR B after joining the variables in A
AND B. Now, because the vectors are pre-filled with Top values we need only
join the variables A AND B and set the BitVector of tracked variables to A OR
B.
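
A simplified standalone sketch of the new layout and join is shown below. It
uses one info vector instead of three, `std::vector<bool>` instead of
`BitVector`, and a hypothetical three-element lattice in which joining two
different values yields Top; these are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical lattice: joining two different values gives Top.
enum class Loc { Mem, Val, Top };

Loc joinLoc(Loc A, Loc B) { return A == B ? A : Loc::Top; }

struct BlockInfo {
  std::vector<bool> Tracked; // Indexed by variable ID.
  std::vector<Loc> Locs;     // Meaningful where Tracked is set, else Top.

  // Size everything to the number of variables and pre-fill with Top.
  void init(std::size_t NumVars) {
    Tracked.assign(NumVars, false);
    Locs.assign(NumVars, Loc::Top);
  }
};

BlockInfo joinBlockInfo(const BlockInfo &A, const BlockInfo &B) {
  assert(A.Locs.size() == B.Locs.size() && "mismatched variable counts");
  BlockInfo Join;
  Join.init(A.Locs.size());
  for (std::size_t Var = 0; Var < A.Locs.size(); ++Var) {
    // Only variables in A AND B need an explicit join; the rest stay Top.
    if (A.Tracked[Var] && B.Tracked[Var])
      Join.Locs[Var] = joinLoc(A.Locs[Var], B.Locs[Var]);
    // The tracked set of the result is A OR B.
    Join.Tracked[Var] = A.Tracked[Var] || B.Tracked[Var];
  }
  return Join;
}
```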
The patch achieves an average 0.25% reduction in instructions retired and a
1.1% reduction in max-rss for the CTMark suite in LTO-O3-g builds.
Reviewed By: scott.linder
Differential Revision: https://reviews.llvm.org/D145558
Use RawLocationWrapper rather than a Value to represent the location operand(s)
so that it's possible to represent multiple location
operands. AssignmentTrackingAnalysis still converts variadic debug intrinsics
to kill locations so this patch is NFC.
Reviewed By: StephenTozer
Differential Revision: https://reviews.llvm.org/D145911
As part of this work, removing `SDDbgValue::clearIsEmitted` originally added for
`dbg.addr` in 045c67769d7fe577fc38cccb6fb40fd814437447 was attempted, but it
appears some tests for `DBG_INSTR_REF` now depend on that behaviour as well, so
it was kept and comments were updated instead.
Part of `dbg.addr` removal
Discussed in https://discourse.llvm.org/t/what-is-the-status-of-dbg-addr/62898
Differential Revision: https://reviews.llvm.org/D144800
Where the new checks have been added, `SymmetricDifference` - still being built
- contains entries for variables present in `A` and not in `B`. If
`SymmetricDifference` is empty at this point it means the variables (map keys)
in `A` are a subset of those in `B`, so if `A` and `B` are the same size then
we know they're identical.
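
The reasoning can be checked with a small standalone sketch. `VarMap` and
`sameKeys` are hypothetical names, and the sketch compares map keys only.

```cpp
#include <cassert>
#include <map>

using VarMap = std::map<int, int>; // Hypothetical {variable ID: value} map.

// Returns true if A and B contain exactly the same keys.
bool sameKeys(const VarMap &A, const VarMap &B) {
  // Look for a key present in A but not in B (the "A and not B" half of
  // the symmetric difference).
  for (const auto &Entry : A)
    if (B.count(Entry.first) == 0)
      return false;
  // keys(A) is a subset of keys(B); equal sizes then imply equal key sets,
  // so the "B and not A" half never needs to be computed.
  return A.size() == B.size();
}
```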
This reduces the number of instructions retired building some of the CTMark
projects in a ReleaseLTO-g configuration (geomean change -0.05% with the best
improvement being -0.24% for tramp3d-v4).
Reviewed By: StephenTozer
Differential Revision: https://reviews.llvm.org/D144621
The size lower bound is known - the `Join` map in both cases gets an entry for
each variable from both input maps (union).
This reduces the number of times the map grows, improving ReleaseLTO-g compile
time for CTMark projects by an average of around 0.2%.
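
A standalone sketch of the idea, using `std::unordered_map` in place of the
LLVM map type. `unionMaps` is a hypothetical name, and keeping the value from
`A` on a key clash is a placeholder for the real per-variable join.

```cpp
#include <algorithm>
#include <cassert>
#include <unordered_map>

using VarMap = std::unordered_map<int, int>;

VarMap unionMaps(const VarMap &A, const VarMap &B) {
  VarMap Join;
  // The union holds at least max(|A|, |B|) entries, so reserve that much up
  // front to cut down on rehashing while the map is filled.
  Join.reserve(std::max(A.size(), B.size()));
  for (const auto &[Var, Val] : A)
    Join.emplace(Var, Val);
  for (const auto &[Var, Val] : B)
    Join.emplace(Var, Val); // No effect if Var is already present.
  return Join;
}
```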
Reviewed By: scott.linder
Differential Revision: https://reviews.llvm.org/D144486
Without this patch `getDerefOffsetInBytes` incorrectly always returns
`std::nullopt` for expressions with fragments due to an off-by-one error with
fragment element indices.
Reviewed By: StephenTozer
Differential Revision: https://reviews.llvm.org/D143567
The iterator `FirstOverlap` is invalidated after the call to `insert` - avoid
dereferencing the iterator after the call to `insert`.
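
The general shape of the fix can be sketched with a `std::vector`, whose
`insert` also invalidates iterators. This is an analogy rather than the patched
code, and `insertAndReturnOldFront` is a hypothetical helper.

```cpp
#include <cassert>
#include <vector>

// Insert NewValue at the front of V and return the value the first element
// held before the insertion. V must not be empty.
int insertAndReturnOldFront(std::vector<int> &V, int NewValue) {
  assert(!V.empty() && "expected at least one element");
  auto First = V.begin();
  // Read everything needed through the iterator before mutating.
  int OldFront = *First;
  // insert may reallocate or shift elements; First is invalid afterwards.
  V.insert(V.begin(), NewValue);
  // Use the saved copy rather than dereferencing First again.
  return OldFront;
}
```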
Reviewed By: CarlosAlbertoEnciso
Differential Revision: https://reviews.llvm.org/D141854
Remove LLVM flag -experimental-assignment-tracking. Assignment tracking is
still enabled from Clang with the command line -Xclang
-fexperimental-assignment-tracking which tells Clang to ask LLVM to run the
pass declare-to-assign. That pass converts conventional debug intrinsics to
assignment tracking metadata. With this patch it now also sets a module flag
debug-info-assignment-tracking with the value `i1 true` (using the flag conflict
rule `Max` since enabling assignment tracking on IR that contains only
conventional debug intrinsics should cause no issues).
Update the docs and tests too.
Reviewed By: CarlosAlbertoEnciso
Differential Revision: https://reviews.llvm.org/D142027
Unlike D140903 this patch folds in treating an empty metadata address component
of a dbg.assign the same as undef because it was already being treated that way
in the AssignmentTrackingAnalysis pass.
Reviewed By: scott.linder
Differential Revision: https://reviews.llvm.org/D141125
value() has undesired exception checking semantics and calls
__throw_bad_optional_access in libc++. Moreover, the API is unavailable without
_LIBCPP_NO_EXCEPTIONS on older Mach-O platforms (see
_LIBCPP_AVAILABILITY_BAD_OPTIONAL_ACCESS).
The Assignment Tracking debug-info feature is outlined in this RFC:
https://discourse.llvm.org/t/rfc-assignment-tracking-a-better-way-of-specifying-variable-locations-in-ir
Add initial revision of assignment tracking analysis pass
---------------------------------------------------------
This patch squashes five individually reviewed patches into one:
#1 https://reviews.llvm.org/D136320
#2 https://reviews.llvm.org/D136321
#3 https://reviews.llvm.org/D136325
#4 https://reviews.llvm.org/D136331
#5 https://reviews.llvm.org/D136335
Patch #1 introduces 2 new files: AssignmentTrackingAnalysis.h and .cpp. The
two subsequent patches modify those files only. Patch #4 plumbs the analysis
into SelectionDAG, and patch #5 is a collection of tests for the analysis as
a whole.
The analysis was broken up into smaller chunks for review purposes but for the
most part the tests were written using the whole analysis. It would be possible
to break up the tests for patches #1 through #3 for the purpose of landing the
patches separately. However, most of them would require an update for each
patch. In addition, patch #4 - which connects the analysis to SelectionDAG - is
required by all of the tests.
If there is build-bot trouble, we might try a different landing sequence.
Analysis problem and goal
-------------------------
Variable values can be stored in memory, or available as SSA values, or both.
Using the Assignment Tracking metadata, it's not possible to determine a
variable location just by looking at a debug intrinsic in
isolation. Instructions without any metadata can change the location of a
variable. The meaning of dbg.assign intrinsics changes depending on whether
there are linked instructions, and where they are relative to those
instructions. So we need to analyse the IR and convert the embedded information
into a form that SelectionDAG can consume to produce debug variable locations
in MIR.
The solution is a dataflow analysis which, aiming to maximise the memory
location coverage for variables, outputs a mapping of instruction positions to
variable location definitions.
API usage
---------
The analysis is named `AssignmentTrackingAnalysis`. It is added as a required
pass for SelectionDAGISel when assignment tracking is enabled.
The results of the analysis are exposed via `getResults` using the returned
`const FunctionVarLocs *`'s const methods:
    const VarLocInfo *single_locs_begin() const;
    const VarLocInfo *single_locs_end() const;
    const VarLocInfo *locs_begin(const Instruction *Before) const;
    const VarLocInfo *locs_end(const Instruction *Before) const;
    void print(raw_ostream &OS, const Function &Fn) const;
Debug intrinsics can be ignored after running the analysis. Instead, variable
location definitions that occur between an instruction `Inst` and its
predecessor (or block start) can be found by looping over the range:
    locs_begin(Inst), locs_end(Inst)
Similarly, variables with a memory location that is valid for their lifetime
can be iterated over using the range:
    single_locs_begin(), single_locs_end()
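
The iteration pattern can be sketched as follows. To keep the example
self-contained it uses mock types: `Instruction`, `VarLocInfo`, and
`MockFunctionVarLocs` below are simplified stand-ins, not the LLVM classes, and
`addLocBefore` and `describeLocsBefore` are hypothetical helpers. Only the
`locs_begin`/`locs_end` and `single_locs_begin`/`single_locs_end` names mirror
the methods listed above.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Mock stand-ins for illustration only.
struct Instruction { std::string Name; };
struct VarLocInfo { std::string Variable; std::string Location; };

class MockFunctionVarLocs {
  std::vector<VarLocInfo> SingleLocs;
  std::map<const Instruction *, std::vector<VarLocInfo>> LocsBefore;

public:
  void addLocBefore(const Instruction *I, VarLocInfo Loc) {
    LocsBefore[I].push_back(std::move(Loc));
  }
  const VarLocInfo *single_locs_begin() const { return SingleLocs.data(); }
  const VarLocInfo *single_locs_end() const {
    return SingleLocs.data() + SingleLocs.size();
  }
  // Both return nullptr (an empty range) if nothing is recorded for Before.
  const VarLocInfo *locs_begin(const Instruction *Before) const {
    auto It = LocsBefore.find(Before);
    return It == LocsBefore.end() ? nullptr : It->second.data();
  }
  const VarLocInfo *locs_end(const Instruction *Before) const {
    auto It = LocsBefore.find(Before);
    return It == LocsBefore.end() ? nullptr
                                  : It->second.data() + It->second.size();
  }
};

// Collect the location definitions that take effect just before Inst.
std::vector<std::string> describeLocsBefore(const MockFunctionVarLocs &Locs,
                                            const Instruction &Inst) {
  std::vector<std::string> Out;
  for (const VarLocInfo *It = Locs.locs_begin(&Inst),
                        *End = Locs.locs_end(&Inst);
       It != End; ++It)
    Out.push_back(It->Variable + " -> " + It->Location);
  return Out;
}
```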
Further detail
--------------
For an explanation of the dataflow implementation and the integration with
SelectionDAG, please see the reviews linked at the top of this commit message.
Reviewed By: jmorse