540 Commits

Author SHA1 Message Date
Jeremy Morse
c672ba7dde
[DebugInfo][RemoveDIs] Instrument inliner for non-instr debug-info (#72884)
With intrinsics representing debug-info, we just clone all the
intrinsics when inlining a function and don't think about it any
further. With non-instruction debug-info however we need to be a bit
more careful and manually move the debug-info from one place to another.
For the most part, this means keeping a "cursor" during block cloning of
where we last copied debug-info from, and performing debug-info copying
whenever we successfully clone another instruction.

There are several utilities in LLVM for doing this, all of which now
need to manually call cloneDebugInfo. The testing story for this is not
well covered as we could rely on normal instruction-cloning mechanisms
to do all the hard stuff. Thus, I've added a few tests to explicitly
test dbg.value behaviours, ahead of them becoming not-instructions.
2023-11-26 21:24:29 +00:00
Jeremy Morse
e09758224b
[DebugInfo][RemoveDIs] Instrument jump-threading to update DPValues (#73127)
This patch makes jump-threading handle non-instruction debug-info stored
in DPValues in the same way that it updates dbg.values nowadays. This
involves re-targetting their operands as with dbg.values getting moved
from one block to another, and manually cloning them when duplicating
blocks. The SSAUpdater class also grows some functions for SSA-updating
DPValues in the same way as dbg.values.

All of this is largely covered by existing debug-info tests, except for
the cloning of DPValues attached to elidable instructions and branches,
where I've added a test to thread-debug-info.ll. Where previously we
could rely on dbg.values being copied and cloned as normal instructions
are, as we need to explicitly perform that operation now I've added some
explicit testing for it.
2023-11-23 17:07:10 +00:00
Matthias Braun
cb4627d150
Add setBranchWeigths convenience function. NFC (#72446)
Add `setBranchWeights` convenience function to ProfDataUtils.h and use
it where appropriate.
2023-11-16 10:55:19 -08:00
Nikita Popov
75881dbb0f
[JumpThreading] Don't phi translate past loop phi (#70664)
When evaluating comparisons in predecessors, phi operands are translated
into the predecessor. If the translation is across a backedge, this
means that the two operands of the icmp will be from two different loop
iterations, resulting in incorrect simplification.

Fix this by not performing the phi translation for phis in loop headers.

Note: This is not a complete fix. If the
jump-threading-across-loop-headers option is enabled, the LoopHeaders
variable does not get populated. Additional changes will be needed to
fix that case.

Related to https://github.com/llvm/llvm-project/issues/70651.
2023-10-31 10:20:07 +01:00
Kazu Hirata
9bcc094d37 [llvm] Use llvm::erase_if (NFC) 2023-10-12 22:59:25 -07:00
Matthias Braun
5181156b37
Use BlockFrequency type in more places (NFC) (#68266)
The `BlockFrequency` class abstracts `uint64_t` frequency values. Use it
more consistently in various APIs and disable implicit conversion to
make usage more consistent and explicit.

- Use `BlockFrequency Freq` parameter for `setBlockFreq`,
`getProfileCountFromFreq` and `setBlockFreqAndScale` functions.
- Return `BlockFrequency` in `getEntryFreq()` functions.
- While on it change some `const BlockFrequency& Freq` parameters to
plain `BlockFreqency Freq`.
- Mark `BlockFrequency(uint64_t)` constructor as explicit.
- Add missing `BlockFrequency::operator!=`.
- Remove `uint64_t BlockFreqency::getMaxFrequency()`.
- Add `BlockFrequency BlockFrequency::max()` function.
2023-10-05 11:40:17 -07:00
Nikita Popov
5cacf4e688 [JumpThreading] Avoid use of ConstantExpr::getCast()
Use the constant folding API instead.
2023-09-29 11:10:21 +02:00
Matthias Braun
168c288af1
JumpThreading: Propagate branch weights in tryToUnfoldSelectInCurrBB (#66116)
Propagate "branch_weights" metadata whe turning a select into a
conditional branch in tryToUnfoldSelectInCurrBB
2023-09-12 13:36:49 -07:00
Jeremy Morse
6942c64e81 [NFC][RemoveDIs] Prefer iterator-insertion over instructions
Continuing the patch series to get rid of debug intrinsics [0], instruction
insertion needs to be done with iterators rather than instruction pointers,
so that we can communicate information in the iterator class. This patch
adds an iterator-taking insertBefore method and converts various call sites
to take iterators. These are all sites where such debug-info needs to be
preserved so that a stage2 clang can be built identically; it's likely that
many more will need to be changed in the future.

At this stage, this is just changing the spelling of a few operations,
which will eventually become signifiant once the debug-info bearing
iterator is used.

[0] https://discourse.llvm.org/t/rfc-instruction-api-changes-needed-to-eliminate-debug-intrinsics-from-ir/68939

Differential Revision: https://reviews.llvm.org/D152537
2023-09-11 11:48:45 +01:00
DianQK
7ded71b1e4
[JumpThreading] Invalidate LVI after combineMetadataForCSE. 2023-09-04 11:50:14 +08:00
Nikita Popov
4eafc9b6ff [IR] Treat callbr as special terminator (PR64215)
isLegalToHoistInto() currently return true for callbr instructions.
That means that a callbr with one successor will be considered a
proper loop preheader, which may result in instructions that use
the callbr return value being hoisted past it.

Fix this by adding callbr to isExceptionTerminator (with a rename
to isSpecialTerminator), which also fixes similar assumptions in
other places.

Fixes https://github.com/llvm/llvm-project/issues/64215.

Differential Revision: https://reviews.llvm.org/D158609
2023-08-25 09:20:18 +02:00
Matt Arsenault
fa90f6b9d0 TTI: Pass function to hasBranchDivergence in a few passes
https://reviews.llvm.org/D152033
2023-07-07 09:49:38 -04:00
Arthur Eubanks
3e39cfe5b4 Revert "Revert "InstSimplify: Require instruction be parented""
This reverts commit 0c03f48480f69b854f86d31235425b5cb71ac921.

Going to fix forward size regression instead due to more dependent patches needing to be reverted otherwise.
2023-06-16 13:53:31 -07:00
Arthur Eubanks
0c03f48480 Revert "InstSimplify: Require instruction be parented"
This reverts commit 1536e299e63d7788f38117b0212ca50eb76d7a3b.

Causes large binary size regressions, see comments on https://reviews.llvm.org/rG1536e299e63d7788f38117b0212ca50eb76d7a3b.
2023-06-16 11:24:29 -07:00
Alan Zhao
d6b4f6786b Revert "Revert "InstSimplify: Require instruction be parented""
This reverts commit 00264eac4d0938ae8a0826da38e4777be269124c.

Reason: caused a bunch of bots to break
2023-06-16 10:58:54 -07:00
Alan Zhao
00264eac4d Revert "InstSimplify: Require instruction be parented"
This reverts commit 1536e299e63d7788f38117b0212ca50eb76d7a3b.

Reason: causes a regression in the inliner (see https://crbug.com/1454531 and https://reviews.llvm.org/rG1536e299e63d7788f38117b0212ca50eb76d7a3b#1217141)
2023-06-16 10:36:49 -07:00
Matt Arsenault
1536e299e6 InstSimplify: Require instruction be parented
Unlike every other analysis and transform, simplifyInstruction
permitted operating on instructions which are not inserted
into a function. This created an edge case no other code needs
to really worry about, and limited transforms in cases that
can make use of the context function. Only the inliner and a handful
of other utilities were making use of this, so just fix up these
edge cases. Results in some IR ordering differences since
cloned blocks are inserted eagerly now. Plus some additional
simplifications trigger (e.g. some add 0s now folded out that
previously didn't).
2023-06-02 18:14:28 -04:00
Bjorn Pettersson
a20f7efbc5 Remove several no longer needed includes. NFCI
Mostly removing includes of InitializePasses.h and Pass.h in
passes that no longer has support for the legacy PM.
2023-04-17 13:54:19 +02:00
Arthur Eubanks
7c3c981442 [Passes] Remove some legacy passes
DFAJumpThreading
JumpThreading
LibCallsShrink
LoopVectorize
SLPVectorizer
DeadStoreElimination
AggressiveDCE
CorrelatedValuePropagation
IndVarSimplify

These are part of the optimization pipeline, of which the legacy version is deprecated and being removed.
2023-03-10 17:17:00 -08:00
Benjamin Kramer
da3623de24 [JT] Always create BPI/BFI when running in legacy PM
This is wasteful, but only affects the legacy pass manager. Otherwise
a1b78fb929fccf96acaa0212cf68fee82298e747 would crash JT when running
with that PM. There are still a few users of the legacy PM out there
that are reluctant to migrate, numba in this case.

No test as we don't test legacy PM anymore.
2023-02-17 10:13:20 +01:00
Evgeniy Brevnov
a1b78fb929 [JT][CT] Preserve exisiting BPI/BFI during JumpThreading
Currently, JT creates and updates local instances of BPI\BFI. As a result global ones have to be invalidated if JT made any changes.
In fact, JT doesn't use any information from BPI/BFI for the sake of the transformation itself. It only creates BPI/BFI to keep them up to date. But since it updates local copies (besides cases when it updates profile metadata) it just waste of time.

Current patch is a rework of D124439. D124439 makes one step and replaces local copies with global ones retrieved through AnalysisPassManager. Here we do one more step and don't create BPI/BFI if the only reason of creation is to keep BPI/BFI up to date. Overall logic is the following. If there is cached BPI/BFI then update it along the transformations. If there is no existing BPI/BFI, then create it only if it is required to update profile metadata.

Please note if BPI/BFI exists on exit from JT (either cached or created) it is always up to date and no reason to invalidate it.

Reviewed By: mkazantsev

Differential Revision: https://reviews.llvm.org/D136827
2023-02-16 16:08:34 +07:00
Ben Mudd
e0374fb2f4 [DebugInfo] Make debug intrinsics to track cloned values in JumpThreading
This patch causes debug value intrinsics outside of cloned blocks in the
Jump Threading pass to correctly point towards any derived values. If it cannot,
it kills them.

Reviewed By: probinson, StephenTozer

Differential Revision: https://reviews.llvm.org/D140404
2023-02-01 12:52:37 +00:00
Evgeniy Brevnov
f7c1982309 Revert "[JT][CT] Preserve exisiting BPI/BFI during JumpThreading"
This reverts commit 26e7cb24cb5dfa560683064d37f560558f00aa67.
2023-01-27 15:35:32 +07:00
Evgeniy Brevnov
26e7cb24cb [JT][CT] Preserve exisiting BPI/BFI during JumpThreading
Currently, JT creates and updates local instances of BPI\BFI. As a result global ones have to be invalidated if JT made any changes.
In fact, JT doesn't use any information from BPI/BFI for the sake of the transformation itself. It only creates BPI/BFI to keep them up to date. But since it updates local copies (besides cases when it updates profile metadata) it just waste of time.

Current patch is a rework of D124439. D124439 makes one step and replaces local copies with global ones retrieved through AnalysisPassManager. Here we do one more step and don't create BPI/BFI if the only reason of creation is to keep BPI/BFI up to date. Overall logic is the following. If there is cached BPI/BFI then update it along the transformations. If there is no existing BPI/BFI, then create it only if it is required to update profile metadata.

Please note if BPI/BFI exists on exit from JT (either cached or created) it is always up to date and no reason to invalidate it.

Reviewed By: mkazantsev

Differential Revision: https://reviews.llvm.org/D136827
2023-01-27 15:00:16 +07:00
Christian Ulmann
e741b8c2e5 [llvm][ir] Purge MD_prof custom accessors
This commit purges direct accesses to MD_prof metadata and replaces them
with the accessors provided from the utility file wherever possible.
This commit can be seen as the first step towards switching the branch weights to 64 bits.
See post here: https://discourse.llvm.org/t/extend-md-prof-branch-weights-metadata-from-32-to-64-bits/67492

Reviewed By: davidxl, paulkirth

Differential Revision: https://reviews.llvm.org/D141393
2023-01-19 14:26:26 +01:00
Max Kazantsev
82cee24e3d [JumpThreading] Preserve profile metadata during select unfolding, take 2
Jump threading can replace select and unconditional branch with
conditional branch, but when doing so loses profile information.

This destructive transform can eventually lead to a performance
degradation due to folding of branches in
shouldFoldCondBranchesToCommonDestination as branch probabilities
are no longer known.

The first version was reverted due to assert caused by i32 overflow,
fixed in this version.

Patch by Roman Paukner!

Differential Revision: https://reviews.llvm.org/D138132
Reviewed By: mkazantsev
2023-01-16 19:04:23 +07:00
Dmitri Gribenko
0e9956204d Revert "[JumpThreading] Preserve profile metadata during select unfolding"
This reverts commit 957952dbf2f34ed552e8e1f8c35eed17eee2ea38.

Addition in the newly added code can overflow.  As a result, the
constructor of `BranchProbability()` can trigger an assertion. See
the discussion on https://reviews.llvm.org/D138132 for more details.
2023-01-10 11:54:50 +01:00
Ben Mudd
1f11d1bd12 [DebugInfo] Fix jump threading failing to update cloned dbg.values
This is a patch to fix duplicated dbg.values in the JumpThreading pass not
pointing towards their local value, and instead towards the variable in the
original block.
JumpThreadingPass::cloneInstructions is the changed function to target metadata
as well as normal cloned values.

Reviewed By: jmorse, StephenTozer

Differential Revision: https://reviews.llvm.org/D140006
2023-01-09 11:42:33 +00:00
Max Kazantsev
957952dbf2 [JumpThreading] Preserve profile metadata during select unfolding
Jump threading can replace select and unconditional branch with
conditional branch, but when doing so loses profile information.

This destructive transform can eventually lead to a performance
degradation due to folding of branches in
shouldFoldCondBranchesToCommonDestination as branch probabilities
are no longer known.

Patch by Roman Paukner!

Differential Revision: https://reviews.llvm.org/D138132
Reviewed By: mkazantsev
2023-01-09 16:14:58 +07:00
Yingchi Long
84733b0f17
[JT] check xor operand is exactly the same in processBranchOnXOR
Reproducer:

    ; RUN: opt -S -jump-threading < %s
    define void @test() {
    entry:
    br i1 false, label %loop, label %exit

    loop:
    %bool = phi i1 [ %xor, %loop.latch ], [ false, %entry ]
    %cmp = icmp eq i16 0, 1
    %xor = xor i1 %cmp, %bool
    br i1 %bool, label %loop.latch, label %exit

    loop.latch:
    %dummy = phi i16 [ 0, %loop ]
    br label %loop

    exit:
    ret void
    }

On this occassion, phi node %bool is actually %xor, and doing substitution causes assertion failure.

Fixes: https://github.com/llvm/llvm-project/issues/58812

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D139783
2022-12-21 21:43:55 +08:00
Vasileios Porpodas
32b38d248f [NFC] Rename Instruction::insertAt() to Instruction::insertInto(), to be consistent with BasicBlock::insertInto()
Differential Revision: https://reviews.llvm.org/D140085
2022-12-15 12:27:45 -08:00
Kazu Hirata
6eb0b0a045 Don't include Optional.h
These files no longer use llvm::Optional.
2022-12-14 21:16:22 -08:00
Fangrui Song
d4b6fcb32e [Analysis] llvm::Optional => std::optional 2022-12-14 07:32:24 +00:00
Vasileios Porpodas
06911ba6ea [NFC] Cleanup: Replaces BB->getInstList().insert() with I->insertAt().
This is part of a series of cleanup patches towards making BasicBlock::getInstList() private.

Differential Revision: https://reviews.llvm.org/D138877
2022-12-12 13:33:05 -08:00
Evgeniy Brevnov
50f8eb05af Revert "[JT] Preserve exisiting BPI/BFI during JumpThreading"
This reverts commit 52a4018506e39f50d0c06ac5a1c987eb83b900c7.
2022-11-17 17:11:47 +07:00
Evgeniy Brevnov
52a4018506 [JT] Preserve exisiting BPI/BFI during JumpThreading
Currently, JT creates and updates local instances of BPI\BFI. As a result global ones have to be invalidated if JT made any changes.
In fact, JT doesn't use any information from BPI/BFI for the sake of the transformation itself. It only creates BPI/BFI to keep them up to date. But since it updates local copies (besides cases when it updates profile metadata) it just waste of time.

Current patch is a rework of D124439. D124439 makes one step and replaces local copies with global ones retrieved through AnalysisPassManager. Here we do one more step and don't create BPI/BFI if the only reason of creation is to keep BPI/BFI up to date. Overall logic is the following. If there is cached BPI/BFI then update it along the transformations. If there is no existing BPI/BFI, then create it only if it is required to update profile metadata.

Please note if BPI/BFI exists on exit from JT (either cached or created) it is always up to date and no reason to invalidate it.

Differential Revision: https://reviews.llvm.org/D136827
2022-11-17 17:00:00 +07:00
Usman Nadeem
32755786e0 [JumpThreading] Put a limit on the PHI nodes when duplicating a BB.
Do not duplicate a BB if it has a lot of PHI nodes.
If a threadable chain is too long then the number of duplicated PHI nodes
can add up, leading to a substantial increase in compile time when rewriting
the SSA.

Fixes https://github.com/llvm/llvm-project/issues/58203
Differential Revision: https://reviews.llvm.org/D136716

The threshold of 76 in this patch is reasonably high and reduces the compile
time of cldwat2m_macro.f90 in SPEC2017/cam4 from 80+min to <2min.

Change-Id: I153c89a8e0d89b206a5193dc1b908c67e320717e
2022-10-31 15:51:56 -07:00
Evgeniy Brevnov
03a102e3b2 [JumpThreading][NFC] Reuse existing DT instead of recomputation (newPM)
This is the same change as
503d5771b6c5e3544a9fa3be6b8d085ffbbd4057 with the same intent but for new pass manager.
2022-09-15 12:27:57 +07:00
Sergey Kachkov
be37caca00 [JumpThreading] Process range comparisions with non-local cmp instructions
Use getPredicateOnEdge method if value is a non-local
compare-with-a-constant instruction, that can give more precise
results than getConstantOnEdge.

Differential Revision: https://reviews.llvm.org/D131956
2022-09-02 12:22:45 +02:00
Kazu Hirata
6b1bc80188 [Scalar] Qualify auto in range-based for loops (NFC)
Identified with readability-qualified-auto.
2022-08-20 21:18:25 -07:00
Simon Pilgrim
fdec50182d [CostModel] Replace getUserCost with getInstructionCost
* Replace getUserCost with getInstructionCost, covering all cost kinds.
* Remove getInstructionLatency, it's not implemented by any backends, and we should fold the functionality into getUserCost (now getInstructionCost) to make it easier for targets to handle the cost kinds with their existing cost callbacks.

Original Patch by @samparker (Sam Parker)

Differential Revision: https://reviews.llvm.org/D79483
2022-08-18 11:55:23 +01:00
Kazu Hirata
e20d210eef [llvm] Qualify auto (NFC)
Identified with readability-qualified-auto.
2022-08-07 23:55:27 -07:00
Paul Kirth
d434e40f39 [llvm][NFC] Refactor code to use ProfDataUtils
In this patch we replace common code patterns with the use of utility
functions for dealing with profiling metadata. There should be no change
in functionality, as the existing checks should be preserved in all
cases.

Reviewed By: bogner, davidxl

Differential Revision: https://reviews.llvm.org/D128860
2022-08-03 00:09:45 +00:00
Paul Kirth
6e9bab71b6 Revert "[llvm][NFC] Refactor code to use ProfDataUtils"
This reverts commit 300c9a78819b4608b96bb26f9320bea6b8a0c4d0.

We will reland once these issues are ironed out.
2022-07-27 21:38:11 +00:00
Paul Kirth
300c9a7881 [llvm][NFC] Refactor code to use ProfDataUtils
In this patch we replace common code patterns with the use of utility
functions for dealing with profiling metadata. There should be no change
in functionality, as the existing checks should be preserved in all
cases.

Reviewed By: bogner, davidxl

Differential Revision: https://reviews.llvm.org/D128860
2022-07-27 21:13:54 +00:00
ChenYang Li
6d036b83d1 [JumpThreading] Avoid threadThroughTwoBasicBlocks when PredPred BB ends with indirectbranch
Since we can't change the destination of indirectbr, so when
encounter indirectbr as PredPredBB terminator, we should pass it.

Differential Revision: https://reviews.llvm.org/D129193
2022-07-08 09:29:17 +02:00
Nikita Popov
40a4078e14 [BasicBlockUtils] Allow splitting predecessors with callbr terminators
SplitBlockPredecessors currently asserts if one of the predecessor
terminators is a callbr. This limitation was originally necessary,
because just like with indirectbr, it was not possible to replace
successors of a callbr. However, this is no longer the case since
D67252. As the requirement nowadays is that callbr must reference
all blockaddrs directly in the call arguments, and these get
automatically updated when setSuccessor() is called, we no longer
need this limitation.

The only thing we need to do here is use replaceSuccessorWith()
instead of replaceUsesOfWith(), because only the former does the
necessary blockaddr updating magic.

I believe there's other similar limitations that can be removed,
e.g. related to critical edge splitting.

Differential Revision: https://reviews.llvm.org/D129205
2022-07-07 09:13:25 +02:00
Nuno Lopes
373571dbb4 [NFC] Switch a few uses of undef to poison as placeholders for unreachble code 2022-06-30 23:01:43 +01:00
Nikita Popov
2124b2f0e6 [JumpThreading] Avoid ConstantExpr::get() (NFCI)
This code requires the result to be an UndefValue/ConstantInt
anyway (checked by getKnownConstant), so we are only interested
in the case where this folds.
2022-06-29 16:43:05 +02:00
Simon Moll
b8c2781ff6 [NFC] format InstructionSimplify & lowerCaseFunctionNames
Clang-format InstructionSimplify and convert all "FunctionName"s to
"functionName".  This patch does touch a lot of files but gets done with
the cleanup of InstructionSimplify in one commit.

This is the alternative to the less invasive clang-format only patch: D126783

Reviewed By: spatel, rengolin

Differential Revision: https://reviews.llvm.org/D126889
2022-06-09 16:10:08 +02:00