This patch is a part of step-by-step refactoring of CloneFunctionInto.
The goal is to extract reusable pieces out of it that will be later used
to optimize function cloning e.g. in coroutine processing.
Extracted from #109032 (commit 2)
In a variety of places we change the bitwidth of a parameter but don't
update the attributes.
The issue in this case is from the `range` attribute when inlining
`__memset_chk`. `optimizeMemSetChk` will replace an `i32` with an
`i8`, and if the `i32` had a `range` attr assosiated it will cause an
error.
Fixes#112633
Rename the function to reflect its correct behavior and to be consistent
with `Module::getOrInsertFunction`. This is also in preparation of
adding a new `Intrinsic::getDeclaration` that will have behavior similar
to `Module::getFunction` (i.e, just lookup, no creation).
We weren't flagging inlined callee functions with callsite but not
memprof metadata correctly, leading to the callsite metadata not being
stripped when that function was inlined into a callsite that didn't
itself have callsite metadata.
In practice, this meant that we went into the LTO link with many more
calls than necessary having callsite metadata / summary records, which
in turn made the graph larger than necessary.
Fixing this oversight resulted in huge reductions in the thin link of a
large target:
99% fewer duplicated context ids (recall we have to duplicate when
callsites containing the same stack ids are in different functions)
71% fewer graph edges
17% fewer graph nodes
13% fewer functions cloned
44% smaller peak memory
47% smaller time
This patch is part of a set of patches that add an `-fextend-lifetimes`
flag to clang, which extends the lifetimes of local variables and
parameters for improved debuggability. In addition to that flag, the
patch series adds a pragma to selectively disable `-fextend-lifetimes`,
and an `-fextend-this-ptr` flag which functions as `-fextend-lifetimes`
for this pointers only. All changes and tests in these patches were
written by Wolfgang Pieb (@wolfy1961), while Stephen Tozer (@SLTozer)
has handled review and merging. The extend lifetimes flag is intended to
eventually be set on by `-Og`, as discussed in the RFC
here:
https://discourse.llvm.org/t/rfc-redefine-og-o1-and-add-a-new-level-of-og/72850
This patch implements a new intrinsic instruction in LLVM,
`llvm.fake.use` in IR and `FAKE_USE` in MIR, that takes a single operand
and has no effect other than "using" its operand, to ensure that its
operand remains live until after the fake use. This patch does not emit
fake uses anywhere; the next patch in this sequence causes them to be
emitted from the clang frontend, such that for each variable (or this) a
fake.use operand is inserted at the end of that variable's scope, using
that variable's value. This patch covers everything post-frontend, which
is largely just the basic plumbing for a new intrinsic/instruction,
along with a few steps to preserve the fake uses through optimizations
(such as moving them ahead of a tail call or translating them through
SROA).
Co-authored-by: Stephen Tozer <stephen.tozer@sony.com>
This is a helper to avoid writing `getModule()->getDataLayout()`. I
regularly try to use this method only to remember it doesn't exist...
`getModule()->getDataLayout()` is also a common (the most common?)
reason why code has to include the Module.h header.
Performing `instSimplify` while cloning is unsafe due to incomplete
remapping (as reported in #87534). Ideally, `instSimplify` ought to
reason on the updated newly-cloned function, after returns have been
rewritten and callee entry basic block / call-site have been fixed up.
This is in contrast to `CloneAndPruneIntoFromInst` behaviour, which
is inherently expected to clone basic blocks, with pruning on top of
– if any –, and not actually fixing up returns / CFG, which should be
up to the Inliner. We may solve this by letting `instSimplify` work on
the newly-cloned function, while maintaining old function attributes,
so as to avoid inconsistencies between the yet-to-be-solved return
type, and new function ret type attributes.
Original commit: a61f9fe31750cee65c726fb51f1b14e31e177258
Multiple 2-stage buildbots were reporting failures. These issues have been
addressed separately.
Fixes: https://github.com/llvm/llvm-project/issues/87534.
A logic issue arose when inlining via `CloneAndPruneFunctionInto`,
which, besides cloning, performs instruction simplification as well.
By the time a new cloned instruction is being simplified, phi-nodes
are not remapped yet as the whole CFG needs to be processed first.
As `VMap` state at this stage is incomplete, `threadCmpOverPHI` and
variants could lead to unsound optimizations. This issue has been
addressed by performing basic constant folding while cloning, and
postponing instruction simplification once phi-nodes are revisited.
Fixes: https://github.com/llvm/llvm-project/issues/87534.
This is the major rename patch that prior patches have built towards.
The DPValue class is being renamed to DbgVariableRecord, which reflects
the updated terminology for the "final" implementation of the RemoveDI
feature. This is a pure string substitution + clang-format patch. The
only manual component of this patch was determining where to perform
these string substitutions: `DPValue` and `DPV` are almost exclusively
used for DbgRecords, *except* for:
- llvm/lib/target, where 'DP' is used to mean double-precision, and so
appears as part of .td files and in variable names. NB: There is a
single existing use of `DPValue` here that refers to debug info, which
I've manually updated.
- llvm/tools/gold, where 'LDPV' is used as a prefix for symbol
visibility enums.
Outside of these places, I've applied several basic string
substitutions, with the intent that they only affect DbgRecord-related
identifiers; I've checked them as I went through to verify this, with
reasonable confidence that there are no unintended changes that slipped
through the cracks. The substitutions applied are all case-sensitive,
and are applied in the order shown:
```
DPValue -> DbgVariableRecord
DPVal -> DbgVarRec
DPV -> DVR
```
Following the previous rename patches, it should be the case that there
are no instances of any of these strings that are meant to refer to the
general case of DbgRecords, or anything other than the DPValue class.
The idea behind this patch is therefore that pure string substitution is
correct in all cases as long as these assumptions hold.
As part of the effort to rename the DbgRecord classes, this patch
renames the widely-used functions that operate on DbgRecords but refer
to DbgValues or DPValues in their names to refer to DbgRecords instead;
all such functions are defined in one of `BasicBlock.h`,
`Instruction.h`, and `DebugProgramInstruction.h`.
This patch explicitly does not change the names of any comments or
variables, except for where they use the exact name of one of the
renamed functions. The reason for this is reviewability; this patch can
be trivially examined to determine that the only changes are direct
string substitutions and any results from clang-format responding to the
changed line lengths. Future patches will cover renaming variables and
comments, and then renaming the classes themselves.
- [DebugMetadata][DwarfDebug] Support function-local types in lexical
block scopes (4/7)
- [CloneFunction][DebugInfo] Avoid cloning DILocalVariables of inlined
functions
This is a follow-up for https://reviews.llvm.org/D144006, fixing a crash
reported
in Chromium (https://reviews.llvm.org/D144006#4651955).
The first commit is added for convenience, as it has already been
accepted.
If DISubpogram was not cloned (e.g. we are cloning a function that has
other
functions inlined into it, and subprograms of the inlined functions are
not supposed to be cloned), it doesn't make sense to clone its
DILocalVariables as well.
Otherwise get duplicated DILocalVariables not tracked in their
subprogram's retainedNodes, that crash LTO with Chromium.
This is meant to be committed along with
https://reviews.llvm.org/D144006.
With intrinsics representing debug-info, we just clone all the
intrinsics when inlining a function and don't think about it any
further. With non-instruction debug-info however we need to be a bit
more careful and manually move the debug-info from one place to another.
For the most part, this means keeping a "cursor" during block cloning of
where we last copied debug-info from, and performing debug-info copying
whenever we successfully clone another instruction.
There are several utilities in LLVM for doing this, all of which now
need to manually call cloneDebugInfo. The testing story for this is not
well covered as we could rely on normal instruction-cloning mechanisms
to do all the hard stuff. Thus, I've added a few tests to explicitly
test dbg.value behaviours, ahead of them becoming not-instructions.
This patch adds support for CloneBasicBlock duplicating the DPValues
attached to instructions, and adds facilities to remap them into their new
context. The plumbing to achieve this is fairly straightforwards and
mechanical.
I've also added illustrative uses to LoopUnrollRuntime, SimpleLoopUnswitch
and SimplifyCFG. The former only updates for the epilogue right now so I've
added CHECK lines just for the end of an unrolled loop (further updates
coming later). SimpleLoopUnswitch had no debug-info tests so I've added a
new one. The two modified parts of SimplifyCFG are covered by the two
modified SimplifyCFG tests.
These are scenarios where we have to do extra cloning for copying of
DPValues because they're no longer instructions, and remap them too.
This reverts commit 957efa4ce4f0391147cec62746e997226ee2b836.
Original commit message below -- in this follow up, I've shifted
un-necessary inclusions of DebugProgramInstruction.h into being forward
declarations (fixes clang-compile time I hope), and a memory leak in the
DebugInfoTest.cpp IR unittests.
I also tracked a compile-time regression in D154080, more explanation
there, but the result of which is hiding some of the changes behind the
EXPERIMENTAL_DEBUGINFO_ITERATORS compile-time flag. This is tested by the
"new-debug-iterators" buildbot.
[DebugInfo][RemoveDIs] Add prototype storage classes for "new" debug-info
This patch adds a variety of classes needed to record variable location
debug-info without using the existing intrinsic approach, see the rationale
at [0].
The two added files and corresponding unit tests are the majority of the
plumbing required for this, but at this point isn't accessible from the
rest of LLVM as we need to stage it into the repo gently. An overview is
that classes are added for recording variable information attached to Real
(TM) instructions, in the form of DPValues and DPMarker objects. The
metadata-uses of DPValues is plumbed into the metadata hierachy, and a
field added to class Instruction, which are all stimulated in the unit
tests. The next few patches in this series add utilities to convert to/from
this new debug-info format and add instruction/block utilities to have
debug-info automatically updated in the background when various operations
occur.
This patch was reviewed in Phab in D153990 and D154080, I've squashed them
together into this commit as there are dependencies between the two
patches, and there's little profit in landing them separately.
[0] https://discourse.llvm.org/t/rfc-instruction-api-changes-needed-to-eliminate-debug-intrinsics-from-ir/68939
And some intervening fixups. There are two remaining problems:
* A memory leak via https://lab.llvm.org/buildbot/#/builders/236/builds/7120/steps/10/logs/stdio
* A performance slowdown with -g where I'm not completely sure what the cause it
These might be fairly straightforwards to fix, but it's the end of the day
hear, so I figure I'll clear the buildbots til tomorrow.
This reverts commit 7d77bbef4ad9230f6f427649373fe46a668aa909.
This reverts commit 9026f35afe6ffdc5e55b6615efcbd36f25b11558.
This reverts commit d97b2b389a0e511c65af6845119eb08b8a2cb473.
This patch adds a variety of classes needed to record variable location
debug-info without using the existing intrinsic approach, see the rationale
at [0].
The two added files and corresponding unit tests are the majority of the
plumbing required for this, but at this point isn't accessible from the
rest of LLVM as we need to stage it into the repo gently. An overview is
that classes are added for recording variable information attached to Real
(TM) instructions, in the form of DPValues and DPMarker objects. The
metadata-uses of DPValues is plumbed into the metadata hierachy, and a
field added to class Instruction, which are all stimulated in the unit
tests. The next few patches in this series add utilities to convert to/from
this new debug-info format and add instruction/block utilities to have
debug-info automatically updated in the background when various operations
occur.
This patch was reviewed in Phab in D153990 and D154080, I've squashed them
together into this commit as there are dependencies between the two
patches, and there's little profit in landing them separately.
[0] https://discourse.llvm.org/t/rfc-instruction-api-changes-needed-to-eliminate-debug-intrinsics-from-ir/68939
This caused asserts:
llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp:2331:
virtual void llvm::DwarfDebug::endFunctionImpl(const llvm::MachineFunction *):
Assertion `LScopes.getAbstractScopesList().size() == NumAbstractSubprograms &&
"getOrCreateAbstractScope() inserted an abstract subprogram scope"' failed.
See comment on the code review for reproducer.
> RFC https://discourse.llvm.org/t/rfc-dwarfdebug-fix-and-improve-handling-imported-entities-types-and-static-local-in-subprogram-and-lexical-block-scopes/68544
>
> Similar to imported declarations, the patch tracks function-local types in
> DISubprogram's 'retainedNodes' field. DwarfDebug is adjusted in accordance with
> the aforementioned metadata change and provided a support of function-local
> types scoped within a lexical block.
>
> The patch assumes that DICompileUnit's 'enums field' no longer tracks local
> types and DwarfDebug would assert if any locally-scoped types get placed there.
>
> Reviewed By: jmmartinez
>
> Differential Revision: https://reviews.llvm.org/D144006
This reverts commit f8aab289b5549086062588fba627b0e4d3a5ab15.
This reverts commit 0c03f48480f69b854f86d31235425b5cb71ac921.
Going to fix forward size regression instead due to more dependent patches needing to be reverted otherwise.
Unlike every other analysis and transform, simplifyInstruction
permitted operating on instructions which are not inserted
into a function. This created an edge case no other code needs
to really worry about, and limited transforms in cases that
can make use of the context function. Only the inliner and a handful
of other utilities were making use of this, so just fix up these
edge cases. Results in some IR ordering differences since
cloned blocks are inserted eagerly now. Plus some additional
simplifications trigger (e.g. some add 0s now folded out that
previously didn't).
For some reason the inliner calls simplifyInstruction with disembodied
instructions. I consider this to be an API defect. Either the instruction
should always be inserted prior to simplification, or we at least
should pass in the new function for the context.
The Assignment Tracking debug-info feature is outlined in this RFC:
https://discourse.llvm.org/t/
rfc-assignment-tracking-a-better-way-of-specifying-variable-locations-in-ir
The inliner requires two additions:
fixupAssignments - Update inlined instructions' DIAssignID metadata so that
inlined DIAssignID attachments are unique to the inlined instance.
trackInlinedStores - Treat inlined stores to caller-local variables
(i.e. callee stores to argument pointers that point to the caller's allocas) as
assignments. Track them using trackAssignments, which is the same method as is
used by the AssignmentTrackingPass. This means that we're able to detect stale
memory locations due to DSE after inlining. Because the stores are only tracked
_after_ inlining, any DSE or movement of stores _before_ inlining will not be
accounted for. This is an accepted limitation mentioned in the RFC.
One change is also required:
Update CloneBlock to preserve debug use-before-defs. Otherwise the assignments
will be dropped due to having the intrinsic operands replaced with empty
metadata (see use-before-def.ll in this patch and this related discourse post.
Reviewed By: jmorse
Differential Revision: https://reviews.llvm.org/D133318
Update both memprof and callsite metadata to reflect inlined functions.
For callsite metadata this is simply a concatenation of each cloned
call's call stack with that of the inlined callsite's.
For memprof metadata, each profiled memory info block (MIB) is either
moved to the cloned allocation call or left on the original allocation
call depending on whether its context matches the newly refined call
stack context on the cloned call. We also reapply context trimming
optimizations based on the refined set of contexts on each of the calls
(cloned and original).
Depends on D128142.
Reviewed By: snehasish
Differential Revision: https://reviews.llvm.org/D128143
This reverts commit 0d7f3464ce0ba3a97df73e08ee0acd4e33adbe9b and
commit f9403ca41e5f3dab60cd6e5de26eea65dcab01a4. The latter was
"Profile matching and IR annotation for memprof profiles." and was left
from a bad rebase from a commit already pushed upstream.
Update both memprof and callsite metadata to reflect inlined functions.
For callsite metadata this is simply a concatenation of each cloned
call's call stack with that of the inlined callsite's.
For memprof metadata, each profiled memory info block (MIB) is either
moved to the cloned allocation call or left on the original allocation
call depending on whether its context matches the newly refined call
stack context on the cloned call. We also reapply context trimming
optimizations based on the refined set of contexts on each of the calls
(cloned and original), via utilities in MemoryProfileInfo.
Depends on D128142.
Differential Revision: https://reviews.llvm.org/D128143