As a follow-on to #113686, this breaks the recursion between phi nodes
that have p1 = phi(x, p2) and p2 = phi(y, p1). The knownFPClass can be
calculated from the classes of p1 and p2.
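A minimal sketch of the idea, written in Python rather than LLVM's C++. It models each phi by its list of incoming values and unions the classes of every non-phi input reachable through the phi cycle. The names `known_classes`, `phis`, and `leaf_classes` are hypothetical and are not LLVM APIs.

```python
def known_classes(value, phis, leaf_classes, visiting=None):
    """Union of the FP classes of all non-phi inputs reachable from `value`.

    phis:         maps a phi name to the list of its incoming values
    leaf_classes: maps a non-phi value to the set of classes it may have
    """
    visiting = set() if visiting is None else visiting
    if value not in phis:
        return set(leaf_classes[value])
    if value in visiting:
        # Recursion back into a phi we are already looking through: it adds
        # no new classes, so break the cycle here.
        return set()
    visiting.add(value)
    result = set()
    for incoming in phis[value]:
        result |= known_classes(incoming, phis, leaf_classes, visiting)
    return result


# p1 = phi(x, p2) and p2 = phi(y, p1)
phis = {"p1": ["x", "p2"], "p2": ["y", "p1"]}
leaf_classes = {"x": {"zero"}, "y": {"normal"}}
p1_classes = known_classes("p1", phis, leaf_classes)
```

Here `p1` can only ever hold a value that came from `x` or `y`, so classes such as NaN or infinity are ruled out for it.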
Given a recursive phi with select:
%p = phi [ 0, entry ], [ %sel, loop]
%sel = select %c, %other, %p
The fp state can be calculated using the knowledge that the select/phi
pair can only be the initial state (0 here) or from %other. This adds a
short-cut into computeKnownFPClass for PHI to detect that the select is
recursive back to the phi, and if so use the state from the other
operand.
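The short-cut can be illustrated with a small Python model (an assumption-laden sketch, not the code in computeKnownFPClass). Instructions are plain tuples and dicts, and `select_operand_for_phi` and `phi_known_classes` are hypothetical helper names.

```python
def select_operand_for_phi(phi_name, select):
    """If `select` feeds back into `phi_name`, return its other operand."""
    _cond, true_val, false_val = select
    if true_val == phi_name:
        return false_val
    if false_val == phi_name:
        return true_val
    return None


def phi_known_classes(phi_name, incoming, selects, classes):
    """Union the classes of the phi inputs, looking through recursive selects."""
    result = set()
    for value in incoming:
        if value in selects:
            other = select_operand_for_phi(phi_name, selects[value])
            if other is not None:
                # The select either keeps the phi value or takes `other`,
                # so only `other` can contribute new classes.
                value = other
        result |= classes[value]
    return result


# %p = phi [ 0, entry ], [ %sel, loop ]
# %sel = select %c, %other, %p
selects = {"sel": ("c", "other", "p")}
classes = {
    "zero_const": {"zero"},
    "other": {"zero", "normal"},
    "sel": {"nan", "inf", "zero", "normal", "subnormal"},  # nothing known
}
p_classes = phi_known_classes("p", ["zero_const", "sel"], selects, classes)
```

Without the short-cut the phi would inherit the "nothing known" state of `%sel`; with it, only the initial value and `%other` matter.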
This helps to address a regression from #83200.
If the pointer returned by a function is not "the base pointer" but has
an offset, we need to track the offset such that users can apply it to
their offset chain when they create accesses.
This was reported by @ye-luo and reduced test cases are included. The
OffsetInfo was moved and the container was replaced with a set to avoid
excessive growth. Otherwise, the patch just replaces the "returns
pointer" flag with the "returned offsets" and applies those offsets at
the call site.
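As a rough illustration only (the function name and the representation of offsets as plain integers are assumptions, not the Attributor's data structures), applying returned offsets at a call site amounts to combining two offset sets:

```python
def apply_returned_offsets(call_site_offsets, returned_offsets):
    """Combine the offsets of the pointer passed in with the offsets the
    callee adds before returning it. Using a set keeps duplicates out."""
    return {base + ret for base in call_site_offsets for ret in returned_offsets}


# The argument may point at offset 0 or 4 of the base object, and the
# callee returns its argument advanced by 8 bytes.
offsets = apply_returned_offsets({0, 4}, {8})
```

A set is used here because different paths often produce the same sum, and a list-like container would grow with every duplicate.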
---------
Co-authored-by: Johannes Doerfert <jdoerfert@llnl.gov>
…ntElimination
The ArgumentPromotion and DeadArgumentElimination passes can change
function signatures while the function name stays the same as before the
transformation. This makes tracing with bpf programs hard, because users
tend to rely on the function signature in the source. See discussion [1]
for details.
This patch adds a suffix to functions whose signatures are changed. The
suffix lets users know that the function signature has changed and that
they need to inspect the IR or binary to find the modified signature
before tracing those functions.
The suffix for ArgumentPromotion is ".argprom" and the suffixes for
DeadArgumentElimination are ".argelim" and ".retelim". The suffix also
gives users a hint about what kind of transformation has been done.
With this patch, I built a recent linux kernel with full LTO enabled. I
got 4 functions with only argpromotion like
```
set_track_update.argelim.argprom
pmd_trans_huge_lock.argprom
...
```
I got 1058 functions with only deadargelim like
```
process_bit0.argelim
pci_io_ecs_init.argelim
...
```
I got 3 functions with both argpromotion and deadargelim
```
set_track_update.argelim.argprom
zero_pud_populate.argelim.argprom
zero_pmd_populate.argelim.argprom
```
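For tooling that consumes symbol names like the ones above, a small helper can strip the suffixes and report which pass changed the signature. This is a hypothetical sketch; `split_signature_suffixes` is not part of LLVM or any bpf tool, and it only knows the three suffixes named in this patch.

```python
KNOWN_SUFFIXES = {
    "argprom": "ArgumentPromotion",
    "argelim": "DeadArgumentElimination",
    "retelim": "DeadArgumentElimination",
}


def split_signature_suffixes(symbol):
    """Return (original_name, passes) for a possibly suffixed symbol."""
    passes = []
    base = symbol
    while True:
        stem, dot, last = base.rpartition(".")
        if not dot or last not in KNOWN_SUFFIXES:
            break
        passes.append(KNOWN_SUFFIXES[last])
        base = stem
    passes.reverse()  # report passes in the order the suffixes appear
    return base, passes


name, passes = split_signature_suffixes("set_track_update.argelim.argprom")
```

A non-empty `passes` list signals that the signature in the source no longer matches the one in the binary.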
[1] https://github.com/llvm/llvm-project/issues/104678
Instead of visiting call sites in Attribute::checkForAllUses, we now
keep track of returns in AAPointerInfo and use the call site return
information as required. This way, the user of
AAPointerInfo(CallSite)Argument can determine if the call return should
be visited. We do not collect them as "may accesses" in the
AAPointerInfo(CallSite)Argument itself in case a return user is found.
When we propagate call site arguments we always need to translate them.
This is important because we used to pick the function argument for a
recursive call rather than the call site argument. `@recBad` and
`@recGood` in `returned.ll` show the problem, as they used to be
transformed the same way. The restructuring cleans the code up and helps derive more
"returned" arguments and better information in the presence of recursive
calls. The "dropped" attributes are simply dropped because we do not
query them anymore, not because we cannot derive them.
Before, we kept the call site access kind (may/must) when we translated
the access. However, the pointer we access it through (by passing it to
the callee) might not be the underlying object. We have similar logic
when we add store and load accesses.
When we encounter a bitcast from an integer type we can use the
information from `KnownBits` to glean some information about the
fpclass:
- If the sign bit is known, we can transfer this information over.
- If the float is IEEE format and enough of the bits are known, we may
be able to prove or rule out some fpclasses such as NaN, Zero, or Inf.
- Allocas and GlobalValues cannot be simplified, so we should not try.
- If we never used any assumed state, the AAUnderlyingObjects doesn't
require an additional update.
- If we have seen an object (or its underlying object) before, we do
not need to inspect it anymore.
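For the integer-to-float bitcast case described earlier, the following Python sketch shows how known-zero and known-one bit masks can rule out NaN, infinity, or zero for an IEEE binary32 value. The function name and the dictionary result are assumptions made for illustration; LLVM's `KnownBits` and fpclass handling are structured differently.

```python
SIGN_MASK = 0x80000000
EXP_MASK = 0x7F800000
MANT_MASK = 0x007FFFFF


def fp_facts_from_known_bits(known_zero, known_one):
    """Facts about a 32-bit integer reinterpreted as an IEEE binary32 float.

    known_zero / known_one are masks of bits proven to be 0 / 1.
    """
    if known_zero & known_one:
        raise ValueError("a bit cannot be known zero and known one")

    # Sign bit: True (set), False (clear), or None (unknown).
    if known_one & SIGN_MASK:
        sign_bit = True
    elif known_zero & SIGN_MASK:
        sign_bit = False
    else:
        sign_bit = None

    exp_may_be_all_ones = (known_zero & EXP_MASK) == 0
    exp_may_be_zero = (known_one & EXP_MASK) == 0
    mant_may_be_zero = (known_one & MANT_MASK) == 0
    mant_may_be_nonzero = (known_zero & MANT_MASK) != MANT_MASK

    return {
        "sign_bit": sign_bit,
        # NaN: exponent all ones, mantissa non-zero.
        "may_be_nan": exp_may_be_all_ones and mant_may_be_nonzero,
        # Inf: exponent all ones, mantissa zero.
        "may_be_inf": exp_may_be_all_ones and mant_may_be_zero,
        # Zero: exponent and mantissa both zero.
        "may_be_zero": exp_may_be_zero and mant_may_be_zero,
    }


# A single exponent bit known to be zero already excludes NaN and Inf.
facts = fp_facts_from_known_bits(known_zero=0x40000000, known_one=0)
```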
The original logic for "SeenObjects" was flawed and caused us to add
intermediate values to the underlying object list if a PHI or select
instruction referenced the same underlying object twice. The test
changes are all instances of this situation and we now correctly derive
`memory(none)` for the functions that only access stack memory.
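The intended behaviour can be sketched as a worklist walk with a seen set. This is a simplified Python model, not the AAUnderlyingObjects implementation; `underlying_objects` and the `operands` map are hypothetical.

```python
def underlying_objects(value, operands):
    """Collect the underlying objects of `value`.

    operands maps phi/select names to their incoming values; anything not
    in the map is treated as an object (alloca, global, argument).
    """
    seen = set()
    objects = []
    worklist = [value]
    while worklist:
        current = worklist.pop()
        if current in seen:
            # Already inspected: skip it instead of recording it again or
            # recording an intermediate value in its place.
            continue
        seen.add(current)
        if current in operands:
            worklist.extend(operands[current])
        else:
            objects.append(current)
    return objects


# A phi and a select that both resolve to the same alloca `a`.
operands = {"phi": ["sel", "a"], "sel": ["a", "a"]}
objs = underlying_objects("phi", operands)
```

Only `a` ends up in the result, even though it is reached three times; neither `phi` nor `sel` is recorded as an object.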
---------
Co-authored-by: Shilei Tian <i@tianshilei.me>
Verify that the arguments of a naked function are not used. They can
only be referenced via registers/stack in inline asm, not as IR values.
Doing so will result in assertion failures in the backend.
There's probably more that we should verify, though I'm not completely
sure what the constraints are (would it be correct to require that naked
functions are exactly an inline asm call + unreachable, or is more
allowed?)
Fixes https://github.com/llvm/llvm-project/issues/104718.
This reverts commit e592c2dcf5b7d2da6c2564f5d9990aa34079bad4.
We can finally reland the PR since the issue that caused the PR to be
reverted has been resolved in
https://github.com/llvm/llvm-project/pull/104051.
When copying map entries, we might run into resizing and invalidate the
RHS of the assignment. We dealt with this before and now use the proper
helper to avoid the problem in another place.
Fixes: https://github.com/llvm/llvm-project/issues/104397
Global variables that reference themselves alongside a function that is
called indirectly can cause an infinite loop in
`AAGlobalValueInfoFloating`. The recursive reference is continually
pushed back into the worklist, causing the attributor to hang
indefinitely.
This patch makes the final major change of the RemoveDIs project, changing the
default IR output from debug intrinsics to debug records. This is expected to
break a large number of tests: every single one that tests for uses or
declarations of debug intrinsics and does not explicitly disable writing
records.
If this patch has broken your downstream tests (or upstream tests on a
configuration I wasn't able to run):
1. If you need to immediately unblock a build, pass
`--write-experimental-debuginfo=false` to LLVM's option processing for all
failing tests (remember to use `-mllvm` for clang/flang to forward arguments to
LLVM).
2. For most test failures, the changes are trivial and mechanical, enough that
they can be done by script; see the migration guide for how to do this:
https://llvm.org/docs/RemoveDIsDebugInfo.html#test-updates
3. If any tests fail for reasons other than FileCheck check lines that need
updating, such as assertion failures, that is most likely a real bug with this
patch and should be reported as such.
For more information, see the recent PSA:
https://discourse.llvm.org/t/psa-ir-output-changing-from-debug-intrinsics-to-debug-records/79578
In some modules, e.g. Kotlin-generated IR, we end up with a huge RefSCC
and the call graph updates done as a result of the inliner take a long
time. This is due to RefSCC::removeInternalRefEdges() getting called
many times, each time removing one function from the RefSCC, but each
call to removeInternalRefEdges() is proportional to the size of the
RefSCC.
There are two places that call removeInternalRefEdges(), in
updateCGAndAnalysisManagerForPass() and
LazyCallGraph::removeDeadFunction().
1) Since LazyCallGraph can deal with spurious (edges that exist in the
graph but not in the IR) ref edges, we can simply not call
removeInternalRefEdges() in updateCGAndAnalysisManagerForPass().
2) LazyCallGraph::removeDeadFunction() still ends up taking the brunt of
compile time with the above change for the original reason. So instead
we batch all the dead function removals so we can call
removeInternalRefEdges() just once. This requires some changes to
callers of removeDeadFunction() to not actually erase the function from
the module, but defer it to when we batch delete dead functions at the
end of the CGSCC run, leaving the function body as "unreachable" in the
meantime. We still need to ensure that call edges are accurate. I had
also tried deleting dead functions after visiting a RefSCC, but deleting
them all at once at the end was simpler.
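The cost argument behind batching can be shown with a toy Python model. It assumes, as a simplification, that every removal call scans all nodes still in the component; the function names are hypothetical and nothing here mirrors the LazyCallGraph data structures.

```python
def remove_one_at_a_time(nodes, dead):
    """Remove dead nodes with one full scan of the component per node."""
    visits = 0
    remaining = list(nodes)
    for d in dead:
        visits += len(remaining)  # each call is proportional to the size
        remaining = [n for n in remaining if n != d]
    return remaining, visits


def remove_batched(nodes, dead):
    """Remove all dead nodes with a single scan of the component."""
    dead = set(dead)
    visits = len(nodes)
    remaining = [n for n in nodes if n not in dead]
    return remaining, visits


nodes = list(range(10))
dead = [0, 1, 2]
slow_remaining, slow_visits = remove_one_at_a_time(nodes, dead)
fast_remaining, fast_visits = remove_batched(nodes, dead)
```

Both variants leave the same nodes behind, but the per-node variant does work proportional to the component size for every dead function, which is what hurts on a huge RefSCC.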
Many test changes are due to not performing unnecessary revisits of an
SCC (the CGSCC infrastructure deems ref edge refinements as unimportant
when it comes to revisiting SCCs, although that seems to not be
consistently true given these changes) because we don't remove some ref
edges. Specifically for devirt-invalidated.ll this seems to expose an
inlining order issue with the inliner. Probably unimportant for this
type of intentionally weird call graph.
Compile time:
https://llvm-compile-time-tracker.com/compare.php?from=6f2c61071c274a1b5e212e6ad4114641ec7c7fc3&to=b08c90d05e290dd065755ea776ceaf1420680224&stat=instructions:u
Remove support for the icmp and fcmp constant expressions.
This is part of:
https://discourse.llvm.org/t/rfc-remove-most-constant-expressions/63179
As usual, many of the updated tests will no longer test what they were
originally intended to -- this is hard to preserve when constant
expressions get removed, and in many cases just impossible as the
existence of a specific kind of constant expression was the cause of the
issue in the first place.
The old use of must-be-executed-context (MBEC) propagated through calls
even when that was not allowed. We now only propagate from call site
arguments. If there are calls/intrinsics that allow propagation, we
need to add them explicitly.
Fixes: https://github.com/llvm/llvm-project/issues/78507
---------
Co-authored-by: Matt Arsenault <arsenm2@gmail.com>
If the calling function has the null_pointer_is_valid attribute, somehow
a null constant reaches here. I'm not sure exactly why; it doesn't
happen for other types of constants.
Fixes #87856
If we had a comparison to a literal nan with a false predicate,
we were incorrectly treating it as an unordered compare. This was
correct for fcmp true, but not fcmp false. I noticed this in the
review for e44d3b3e503fa12fdaead2936b28844aa36237c1 but misdiagnosed
the reason. Also change the test for the fcmp true case to be more
useful, but it wasn't wrong previously.
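To make the distinction concrete, here is a small Python table of what each fcmp predicate yields when one operand is NaN, following the predicate semantics in the LLVM language reference. The helper name `fcmp_with_nan` is made up for this sketch.

```python
ORDERED = {"oeq", "ogt", "oge", "olt", "ole", "one", "ord"}
UNORDERED = {"ueq", "ugt", "uge", "ult", "ule", "une", "uno"}


def fcmp_with_nan(pred):
    """Result of `fcmp <pred> x, NaN` for any x."""
    if pred == "true":
        return True   # always true, regardless of operands
    if pred == "false":
        return False  # always false; must not be treated as unordered
    if pred in UNORDERED:
        return True   # unordered predicates hold when an operand is NaN
    if pred in ORDERED:
        return False  # ordered predicates require both operands non-NaN
    raise ValueError(f"unknown fcmp predicate: {pred}")
```

Treating `false` like an unordered predicate would turn it into `True`, which matches `fcmp true` by coincidence but is wrong for `fcmp false`.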
This reapplication changes debug intrinsic declaration removal to only take
place when printing final IR, so that the processing format of the Module
does not affect the output.
This reverts commit d128448efdd4e2bf3c9bc9a5b43ae642aa78026f.
Reverted due to failures on buildbots, where a new cl flag was placed
in the wrong file, resulting in link errors.
https://lab.llvm.org/buildbot/#/builders/198/builds/8548
This reverts commit 0b398256b3f72204ad1f7c625efe4990204e898a.
This patch adds support for printing the proposed non-instruction debug
info ("RemoveDIs") out to textual IR. This patch does not add any
bitcode support, parsing support, or documentation.
Printing of the new format is controlled by a flag added in this patch,
`--write-experimental-debuginfo`, which defaults to false. The new
format will be printed *iff* this flag is true, so whether we use the IR
format is completely independent of whether we use non-instruction debug
info during LLVM passes (which is controlled by the
`--try-experimental-debuginfo-iterators` flag).
Even with the flag disabled, some existing tests need to be updated, as this
patch causes debug intrinsic declarations to be changed in a round trip,
such that they always appear at the end of a module and have no attributes
(this has no functional change on the module).
The design of this new IR format was proposed previously on
Discourse, and any further discussion about the design can still be
contributed there:
https://discourse.llvm.org/t/rfc-debuginfo-proposed-changes-to-the-textual-ir-representation-for-debug-values/73491
As we'll hopefully move away from using intrinsics for debug-info
shortly, this commit stabilizes a few tests to avoid spurious changes in
the process. Briefly, there are differences in output when we don't use
intrinsics that we're going to suppress in case we have to revert. These
are:
* The attributor test gets different attributes for the dbg.value
intrinsic because it's not present during optimisation. This has no
functional effect and there's no need to test for it.
* The Scalarizer test exposes a "debug-info affects codegen" problem,
but fixing it is fiddly (updating 20 IRBuilder object calls). Pin this
test to not change with RemoveDIs; we can relax it later and get the
correct behaviour.
* DIDefaultTemplateParam.ll tests for explicit metadata node numbers
which is generally bad. Add explicit node-number capturing CHECK lines.