Patch 1 of 3 to add llvm.dbg.label support to the RemoveDIs project. The
patch stack:
-> 1. Add a DbgRecord base class for DPValue and the not-yet-added
DPLabel class.
2. Add the DPLabel class.
3. Enable dbg.label conversion and add support to passes.
Patches 1 and 2 are NFC.
In the near future we will also rename DPValue to DbgVariableRecord and
DPLabel to DbgLabelRecord, at which point we'll overhaul the function
names too. The name DPLabel keeps things consistent for now.
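As a rough sketch, the hierarchy this stack introduces looks something like
the following; the member details here are assumed for illustration, not
taken from the patch:
```
// Sketch only: a common base for non-instruction debug-info records.
// The discriminator scheme is assumed; the real patch also wires up
// isa<>/dyn_cast<> support and any shared state.
class DbgRecord {
public:
  enum Kind { ValueKind, LabelKind };
  Kind getKind() const { return K; }

protected:
  DbgRecord(Kind K) : K(K) {}

private:
  Kind K;
};

class DPValue : public DbgRecord { // variable-location record (patch 1)
public:
  DPValue() : DbgRecord(ValueKind) {}
};

class DPLabel : public DbgRecord { // dbg.label record (added in patch 2)
public:
  DPLabel() : DbgRecord(LabelKind) {}
};
```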
This patch trivially updates various opt passes to handle DPVAssigns. In
all cases, this means some combination of generifying existing code to
handle DPValues and DbgAssignIntrinsics, iterating over DPValues where
previously we did not, or duplicating code for DbgAssignIntrinsics to
the equivalent DPValue function (in inlining and salvageDebugInfo).
With intrinsics representing debug-info, we just clone all the
intrinsics when inlining a function and don't think about it any
further. With non-instruction debug-info however we need to be a bit
more careful and manually move the debug-info from one place to another.
For the most part, this means keeping a "cursor" during block cloning of
where we last copied debug-info from, and performing debug-info copying
whenever we successfully clone another instruction.
There are several utilities in LLVM for doing this, all of which now
need to manually call cloneDebugInfo. The testing story for this is not
well covered, as previously we could rely on normal instruction-cloning
mechanisms to do all the hard stuff. Thus, I've added a few tests to
explicitly test dbg.value behaviours, ahead of them becoming
not-instructions.
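A minimal sketch of the cursor approach, with hypothetical helpers
(`cloneIfWanted`, `copyDebugInfoRange`) standing in for the real cloning
utilities and their cloneDebugInfo calls:
```
#include "llvm/IR/BasicBlock.h"
#include "llvm/IR/Instruction.h"
#include <iterator>
using namespace llvm;

// Hypothetical helpers for illustration only.
Instruction *cloneIfWanted(Instruction &I);
void copyDebugInfoRange(BasicBlock::iterator From, BasicBlock::iterator To,
                        Instruction *Dest);

void cloneBlockWithDebugInfo(BasicBlock *SrcBB, BasicBlock *NewBB) {
  // Cursor: the point up to which debug-info has already been copied.
  BasicBlock::iterator Cursor = SrcBB->begin();
  for (Instruction &SrcI : *SrcBB) {
    Instruction *NewI = cloneIfWanted(SrcI); // may skip some instructions
    if (!NewI)
      continue;
    NewI->insertInto(NewBB, NewBB->end());
    // Flush debug-info attached in [Cursor, SrcI] onto the clone, then
    // advance the cursor past the instruction we just copied from.
    copyDebugInfoRange(Cursor, SrcI.getIterator(), NewI);
    Cursor = std::next(SrcI.getIterator());
  }
}
```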
The `BlockFrequency` class abstracts `uint64_t` frequency values. Use it
more consistently in various APIs and disable implicit conversion to
make usage more consistent and explicit.
- Use `BlockFrequency Freq` parameter for `setBlockFreq`,
`getProfileCountFromFreq` and `setBlockFreqAndScale` functions.
- Return `BlockFrequency` in `getEntryFreq()` functions.
- While at it, change some `const BlockFrequency& Freq` parameters to
plain `BlockFrequency Freq`.
- Mark `BlockFrequency(uint64_t)` constructor as explicit.
- Add missing `BlockFrequency::operator!=`.
- Remove `uint64_t BlockFrequency::getMaxFrequency()`.
- Add `BlockFrequency BlockFrequency::max()` function.
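A small usage sketch of the resulting API, assuming a
`BlockFrequencyInfo &BFI` and a `BasicBlock *BB` are in scope:
```
BlockFrequency Freq(100);          // explicit construction now required
// BlockFrequency Bad = 100;       // no longer compiles: ctor is explicit
BFI.setBlockFreq(BB, Freq);        // takes BlockFrequency, not uint64_t
BlockFrequency Entry = BFI.getEntryFreq();  // returns BlockFrequency
if (Entry != BlockFrequency::max())         // new operator!= and max()
  Freq = Entry;
```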
For attributes associated with a value (like `dereferenceable(N)`),
instead of always using the attribute from the to-be-inlined caller,
keep using the value at existing callsites that have the attribute if
that value is higher (it provides more information).
Poison-generating return attributes can't be propagated the same way as
others, as they can change the behavior of other uses and/or create UB
where it otherwise wouldn't have occurred.
For example:
```
define nonnull ptr @foo() {
%p = call ptr @bar()
call void @use(ptr %p)
ret ptr %p
}
```
If we inline `@foo` and propagate `nonnull` to `@bar`, it could change
the behavior of `@use` as instead of taking `null`, `@use` will
now be passed `poison`.
This can be even worse in a case like:
```
define nonnull ptr @foo() {
%p = call noundef ptr @bar()
ret ptr %p
}
```
Where propagating `nonnull` to `@bar` will cause UB on `null` return
of `@bar` (`noundef` + `poison`) where it previously wouldn't
have occurred.
To fix this, we only propagate poison-generating return attributes if
either 1) the only use of the callsite to propagate to is a return and
that callsite doesn't have `noundef`, or 2) the callsite to be inlined
has `noundef`.
The former case ensures no new UB or `poison` values will be
added. The latter is UB anyway if the value is `poison`, so we can go
ahead without worrying about behavior changes.
Continuing the patch series to get rid of debug intrinsics [0], instruction
insertion needs to be done with iterators rather than instruction pointers,
so that we can communicate information in the iterator class. This patch
adds an iterator-taking insertBefore method and converts various call sites
to take iterators. These are all sites where such debug-info needs to be
preserved so that a stage2 clang can be built identically; it's likely that
many more will need to be changed in the future.
At this stage, this is just changing the spelling of a few operations,
which will eventually become significant once the debug-info bearing
iterator is used.
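A minimal sketch of the spelling change (the iterator-taking overload is
the one this patch adds):
```
#include "llvm/IR/BasicBlock.h"
#include "llvm/IR/Instruction.h"
using namespace llvm;

void insertAtFront(BasicBlock &BB, Instruction *New) {
  // Before: the position is a bare Instruction pointer, which cannot
  // carry any extra debug-info state:
  //   New->insertBefore(&*BB.getFirstInsertionPt());
  // After: the position is a BasicBlock::iterator, which can later be
  // taught to communicate debug-info alongside the raw position.
  New->insertBefore(BB.getFirstInsertionPt());
}
```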
[0] https://discourse.llvm.org/t/rfc-instruction-api-changes-needed-to-eliminate-debug-intrinsics-from-ir/68939
Differential Revision: https://reviews.llvm.org/D152537
When updating the return type of a deoptimize call during inlining, we need
to drop incompatible return attributes. This bug was exposed once we
relaxed the constraint on adding the attributes through D156844. With
that change, deoptimize calls (which are not `willreturn`) started having
return attributes added to them.
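Conceptually the fix is along these lines (a sketch with assumed names
like `NewDeoptCall` and `Ctx`, not the patch's exact code):
```
// Sketch: after rebuilding the deoptimize call with the caller's return
// type, strip return attributes that are invalid for the new type.
AttributeList AL = NewDeoptCall->getAttributes();
AL = AL.removeRetAttributes(
    Ctx, AttributeFuncs::typeIncompatible(NewDeoptCall->getType()));
NewDeoptCall->setAttributes(AL);
```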
Fixes https://github.com/llvm/llvm-project/issues/64804.
Differential Revision: https://reviews.llvm.org/D158286
When a convergencectrl token is passed to a convergent call, and the called
function in turn calls the entry intrinsic, that intrinsic is now replaced
with the convergencectrl token.
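A simplified sketch of that replacement; here `CB` is the inlined call
site and `ClonedBlocks` is an assumed container of the cloned callee
blocks, not the patch's actual structure:
```
if (auto Bundle = CB.getOperandBundle(LLVMContext::OB_convergencectrl)) {
  Value *Token = Bundle->Inputs[0];
  for (BasicBlock *BB : ClonedBlocks)
    for (Instruction &I : llvm::make_early_inc_range(*BB)) {
      auto *II = dyn_cast<IntrinsicInst>(&I);
      if (II && II->getIntrinsicID() ==
                    Intrinsic::experimental_convergence_entry) {
        // The caller's token now controls the inlined body directly.
        II->replaceAllUsesWith(Token);
        II->eraseFromParent();
      }
    }
}
```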
The spec requires the following check:
A call from function F to function G can be inlined only if:
- at least one of F or G does not make any convergent calls, or,
- both F and G make the same kind of convergent calls: controlled or
uncontrolled.
But this change does not implement this complete check. A proper implementation
requires a whole new analysis that identifies convergence in every function. For
now, we skip that and just do a cursory check for the entry intrinsic. The
underlying assumption is that in a compiler flow that fully implements
convergence control tokens, there is no mixing of controlled and uncontrolled
convergent operations in the whole program.
This is a reboot of the original change D85606 by
Nicolai Haehnle <nicolai.haehnle@amd.com>.
Reviewed By: arsenm, nhaehnle
Differential Revision: https://reviews.llvm.org/D152431
The actual callsite we are adding attributes to doesn't need to be
`willreturn`/`nounwind`; only the instructions between the callsite
and the return do.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D156844
We can do this by just querying the attribute on the callsite itself. This
is both cleaner code and produces better results.
Differential Revision: https://reviews.llvm.org/D156843
This allows use with non-0 address space stacks. llvm_ptr_ty should
never be used. This could use some more percolation up through mlir,
but this is enough to fix existing tests.
https://reviews.llvm.org/D156666
For pseudo probes we would like to keep their original dwarf discriminator (either a zero or null) until the first FS-discriminator pass. The inliner violates that, since it assigns inlinee instructions that have no debug info the debug info of the callsite. This behavior is disabled in this patch.
Reviewed By: wenlei
Differential Revision: https://reviews.llvm.org/D151568
This is on by default and I don't see any reason to turn it off. There's also no testing of it.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D148956
The legacy pass is only used in AMDGPU codegen, which doesn't care about running it in call graph order (it actually has to work around that fact).
Make the legacy pass a module pass and share code with the new pass.
This allows us to remove the legacy inliner infrastructure.
Reviewed By: mtrofin
Differential Revision: https://reviews.llvm.org/D146446
This reverts commit f9599bbc7a3f831e1793a549d8a7a19265f3e504.
For some reason it caused us a huge compile time regression in downstream
workloads. Not sure whether the source of it is in upstream code or not.
Temporarily reverting until investigated.
Differential Revision: https://reviews.llvm.org/D142330
Computing EH-related information has so far only been relevant for analysis passes. Lifting it to IR will allow the IR Verifier to calculate EH funclet coloring and validate funclet operand bundles in a follow-up step.
Reviewed By: rnk, compnerd
Differential Revision: https://reviews.llvm.org/D138122
As discussed in https://github.com/llvm/llvm-project/issues/59901
This change is not NFC. There is one SCEV and one EarlyCSE test with an
improved analysis/optimization result; the rest of the tests are unaffected.
I've mostly only added cleanup to SCEV since that is where this issue
started. As a follow up, I believe there is more cleanup opportunity in
SCEV and other affected passes.
There could be cases where registerAssumption calls for guards are
missed, but this is not so bad because there will be no
miscompilation. AssumptionCacheTracker should take care of deleted
guards.
Differential Revision: https://reviews.llvm.org/D142330
Remove the LLVM flag -experimental-assignment-tracking. Assignment tracking is
still enabled from Clang with the command line option -Xclang
-fexperimental-assignment-tracking, which tells Clang to ask LLVM to run the
pass declare-to-assign. That pass converts conventional debug intrinsics to
assignment tracking metadata. With this patch it now also sets a module flag
debug-info-assignment-tracking with the value `i1 true` (using the flag conflict
rule `Max`, since enabling assignment tracking on IR that contains only
conventional debug intrinsics should cause no issues).
Update the docs and tests too.
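For illustration, setting such a module flag looks roughly like this (a
sketch, not the patch's exact code):
```
#include "llvm/IR/Constants.h"
#include "llvm/IR/Module.h"
using namespace llvm;

void setAssignmentTrackingModuleFlag(Module &M) {
  // Max conflict rule: if another module sets a different value for this
  // flag, the larger one (true) wins when linking.
  M.addModuleFlag(Module::Max, "debug-info-assignment-tracking",
                  ConstantInt::get(Type::getInt1Ty(M.getContext()), 1));
}
```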
Reviewed By: CarlosAlbertoEnciso
Differential Revision: https://reviews.llvm.org/D142027
Non-throwing inline asm infers the nounwind attribute in
instcombine. Thus, it can be handled in the same manner as
non-throwing target functions generally are. Further special-casing is
unnecessary complexity.
It isn't correct to always remove memprof metadata MIBs from the
original allocation call after inlining.
Let's say we have the following partial call graph:
  C   D
   \ /
   v v
    B   E
    |  /
    v v
     A
where A contains an allocation call. If both contexts including B have
the same allocation behavior, the context in the memprof metadata on the
allocation will be pruned, and we will have 2 MIBs with contexts:
A,B and A,E.
Previously, if we inlined A into B we would propagate the matching MIBs onto
the inlined allocation call in B' (A,B in this case), and remove them from
the original out-of-line allocation in A. This is correct if we have a
single round of bottom-up inlining.
However, in the compiler we can have multiple invocations of the inliner
pass (e.g. LTO). We may also inline non-bottom up with an alternative
inliner such as the ModuleInliner. In that case, we could end up first
inlining B into C, without having inlined A into B. The call graph then
looks like:
      D
      |
      v
  C'  B  E
   \  |  /
    v v v
      A
If we subsequently (perhaps on a later invocation of bottom up inlining)
inline A into B, the previous handling would propagate the memprof MIB
context A,B up into the inlined allocation in B', and remove it from the
original allocation in A. The propagation into B' is fine, however, by
removing it from A's allocation, we no longer reflect the context coming
from C'.
To fix this, simply prevent the removal of MIBs from the original
allocation callsites.
Note that the memprof_inline.ll test has some changes to existing
checking to replace "noncold" with "notcold" in the metadata. The
corresponding CHECK was accidentally commented out in the old version
and thus this mistake was not previously detected.
Differential Revision: https://reviews.llvm.org/D140764
This patch mechanically replaces None with std::nullopt where the
compiler would warn if None were deprecated. The intent is to reduce
the amount of manual work required in migrating from Optional to
std::optional.
This is part of an effort to migrate from llvm::Optional to
std::optional:
https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
The Assignment Tracking debug-info feature is outlined in this RFC:
https://discourse.llvm.org/t/rfc-assignment-tracking-a-better-way-of-specifying-variable-locations-in-ir
The inliner requires two additions:
fixupAssignments - Update inlined instructions' DIAssignID metadata so that
inlined DIAssignID attachments are unique to the inlined instance.
trackInlinedStores - Treat inlined stores to caller-local variables
(i.e. callee stores to argument pointers that point to the caller's allocas) as
assignments. Track them using trackAssignments, which is the same method as is
used by the AssignmentTrackingPass. This means that we're able to detect stale
memory locations due to DSE after inlining. Because the stores are only tracked
_after_ inlining, any DSE or movement of stores _before_ inlining will not be
accounted for. This is an accepted limitation mentioned in the RFC.
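As a rough illustration of the first addition, the DIAssignID fix-up might
conceptually look like this (function name assumed; updating the linked
dbg.assign intrinsics to match is omitted here):
```
#include "llvm/ADT/ArrayRef.h"
#include "llvm/ADT/DenseMap.h"
#include "llvm/IR/DebugInfoMetadata.h"
#include "llvm/IR/Instruction.h"
using namespace llvm;

void fixupAssignmentIDs(ArrayRef<Instruction *> InlinedInsts) {
  // Map each callee DIAssignID to a fresh ID for this inlined instance.
  DenseMap<DIAssignID *, DIAssignID *> Remap;
  for (Instruction *I : InlinedInsts) {
    auto *ID = cast_or_null<DIAssignID>(
        I->getMetadata(LLVMContext::MD_DIAssignID));
    if (!ID)
      continue;
    DIAssignID *&Fresh = Remap[ID];
    if (!Fresh)
      Fresh = DIAssignID::getDistinct(I->getContext());
    I->setMetadata(LLVMContext::MD_DIAssignID, Fresh);
  }
}
```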
One change is also required:
Update CloneBlock to preserve debug use-before-defs. Otherwise the assignments
will be dropped due to having the intrinsic operands replaced with empty
metadata (see use-before-def.ll in this patch and the related discourse post).
Reviewed By: jmorse
Differential Revision: https://reviews.llvm.org/D133318
Followup to D135962 to rename remaining uses of
FunctionModRefBehavior to MemoryEffects. Does not touch API names
yet, but also updates variable names FMRB/MRB to ME, to match the
new type name.
Update both memprof and callsite metadata to reflect inlined functions.
For callsite metadata this is simply a concatenation of each cloned
call's call stack with that of the inlined callsite's.
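Roughly, the concatenation looks like this (a sketch with assumed names
like `ClonedCall` and `InlinedCallSite`; the real update lives in the
inliner's metadata utilities):
```
// Sketch: append the inlined call site's !callsite stack ids onto the
// cloned call's own stack.
MDNode *CallMD = ClonedCall->getMetadata(LLVMContext::MD_callsite);
MDNode *SiteMD = InlinedCallSite->getMetadata(LLVMContext::MD_callsite);
SmallVector<Metadata *, 8> Ops(CallMD->op_begin(), CallMD->op_end());
Ops.append(SiteMD->op_begin(), SiteMD->op_end());
ClonedCall->setMetadata(LLVMContext::MD_callsite,
                        MDNode::get(ClonedCall->getContext(), Ops));
```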
For memprof metadata, each profiled memory info block (MIB) is either
moved to the cloned allocation call or left on the original allocation
call depending on whether its context matches the newly refined call
stack context on the cloned call. We also reapply context trimming
optimizations based on the refined set of contexts on each of the calls
(cloned and original).
Depends on D128142.
Reviewed By: snehasish
Differential Revision: https://reviews.llvm.org/D128143
This reverts commit 0d7f3464ce0ba3a97df73e08ee0acd4e33adbe9b and
commit f9403ca41e5f3dab60cd6e5de26eea65dcab01a4. The latter was
"Profile matching and IR annotation for memprof profiles." and was left
from a bad rebase from a commit already pushed upstream.
Update both memprof and callsite metadata to reflect inlined functions.
For callsite metadata this is simply a concatenation of each cloned
call's call stack with that of the inlined callsite's.
For memprof metadata, each profiled memory info block (MIB) is either
moved to the cloned allocation call or left on the original allocation
call depending on whether its context matches the newly refined call
stack context on the cloned call. We also reapply context trimming
optimizations based on the refined set of contexts on each of the calls
(cloned and original), via utilities in MemoryProfileInfo.
Depends on D128142.
Differential Revision: https://reviews.llvm.org/D128143
With the recent addition of the new parameter MergeAttributes (D134117),
callers need to specify several default parameters before getting to
the new parameter.
This patch reorders the parameters so that callers do not have to
specify as many default parameters.
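A hedged before/after sketch of a call site (parameter names assumed from
the surrounding API, not quoted from the patch):
```
// Before: reaching the new flag meant spelling out every default first.
InlineFunction(CB, IFI, /*CalleeAAR=*/nullptr, /*InsertLifetime=*/true,
               /*ForwardVarArgsTo=*/nullptr, /*MergeAttributes=*/true);
// After: MergeAttributes sits right after the required parameters.
InlineFunction(CB, IFI, /*MergeAttributes=*/true);
```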
Differential Revision: https://reviews.llvm.org/D134125