CHR builds the merged hot-path predicate with
IRBuilder::CreateLogicalAnd. That helper is implemented as a select and
can constant-fold to a non- Instruction (e.g. i1 true). The pass then
attempted to mark the merged condition as having explicitly unknown
branch weights when profile data is present, but it unconditionally did
cast<Instruction>(MergedCondition), which can crash in release builds.
Guard the metadata update with dyn_cast<Instruction> and pass the
containing Function explicitly to avoid calling Instruction::getFunction
when the value is not attached yet.
Add a regression test that exercises the constant-folding case.
Crashing stack:
```
2. Running pass "chr" on function "repro_crash"
#0 0x0000000003be00a4 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (bin/opt+0x3be00a4)
#1 0x0000000003bdd9e8 llvm::sys::RunSignalHandlers() (bin/opt+0x3bdd9e8)
#2 0x0000000003be1300 SignalHandler(int, siginfo_t*, void*) Signals.cpp:0:0
#3 0x0000ffffa8e1d840 (linux-vdso.so.1+0x840)
#4 0x0000000003c815e0 llvm::Instruction::getFunction() const (bin/opt+0x3c815e0)
#5 0x0000000003dcd35c llvm::setExplicitlyUnknownBranchWeightsIfProfiled(llvm::Instruction&, llvm::StringRef, llvm::Function const*) (bin/opt+0x3dcd35c)
#6 0x0000000004fb3670 (anonymous namespace)::CHR::addToMergedCondition(bool, llvm::Value*, llvm::Instruction*, (anonymous namespace)::CHRScope*, llvm::IRBuilder<llvm::ConstantFolder, llvm::IRBuilderDefaultInserter>&, llvm::Value*&) ControlHeightReduction.cpp:0:0
#7 0x0000000004fa7d88 (anonymous namespace)::CHR::run() ControlHeightReduction.cpp:0:0
#8 0x0000000004fa3618 llvm::ControlHeightReductionPass::run(llvm::Function&, llvm::AnalysisManager<llvm::Function>&) (bin/opt+0x4fa3618)
```
Tests: opt <
llvm/test/Transforms/PGOProfile/chr-unknown-profdata-crash.ll
-passes='require<profile-summary>,function(chr)' -force-chr
-chr-merge-threshold=1 -disable-output
These selects are dependent on values live into the CHRScope that we
cannot infer anything about, so mark the branch weights unknown. These
selects usually also just get folded down into a icmps, so the profile
information ends up being kind of redundant.
This patch replaces LLVM_ATTRIBUTE_UNUSED with [[maybe_unused]]. Note
that this patch adjusts the placement of [[maybe_unused]] to comply
with the C++17 language.
Reapplies #159686
This reverts commit 4f33d7b7a9f39d733b7572f9afbf178bca8da127.
The original landing of this patch had an issue where it would try and
hoist allocas into the entry block that were in the entry block. This
would end up actually moving them lower in the block potentially after
users, resulting in invalid IR.
This update fixes this by ensuring that we are only hoisting static
allocas that have been sunk into a split basic block. A regression test
has been added.
Integration tested using a three stage build of clang with IRPGO
enabled.
This reverts commit a00450944d2a91aba302954556c1c23ae049dfc7.
Looks like this one is actually breaking the buildbots. Reverting the switch back
to IRPGO did not fix things.
ControlHeightReduction will duplicate some blocks and insert phi nodes
in exit blocks of regions that it operates on for any live values. This
includes allocas. Having a lifetime annotation refer to a phi node was
made illegal in 92c55a315eab455d5fed2625fe0f61f88cb25499, which causes
the verifier to fail after CHR.
There are some cases where we might not need to drop lifetime
annotations (usually because we do not need the phi to begin with), but
drop all annotations for now to be conservative.
Fixes#159621.
We can use *Set::insert_range to collapse:
for (auto Elem : Range)
Set.insert(E);
down to:
Set.insert_range(Range);
In some cases, we can further fold that into the set declaration.
DenseSet, SmallPtrSet, SmallSet, SetVector, and StringSet recently
gained C++23-style insert_range. This patch replaces:
Dest.insert(Src.begin(), Src.end());
with:
Dest.insert_range(Src);
This patch does not touch custom begin like succ_begin for now.
As part of the "RemoveDIs" project, BasicBlock::iterator now carries a
debug-info bit that's needed when getFirstNonPHI and similar feed into
instruction insertion positions. Call-sites where that's necessary were
updated a year ago; but to ensure some type safety however, we'd like to
have all calls to moveBefore use iterators.
This patch adds a (guaranteed dereferenceable) iterator-taking
moveBefore, and changes a bunch of call-sites where it's obviously safe
to change to use it by just calling getIterator() on an instruction
pointer. A follow-up patch will contain less-obviously-safe changes.
We'll eventually deprecate and remove the instruction-pointer
insertBefore, but not before adding concise documentation of what
considerations are needed (very few).
…f weights" #95136
Reverts #95060, and relands #86609, with the unintended code generation
changes addressed.
This patch implements the changes to LLVM IR discussed in
https://discourse.llvm.org/t/rfc-update-branch-weights-metadata-to-allow-tracking-branch-weight-origins/75032
In this patch, we add an optional field to MD_prof meatdata nodes for
branch weights, which can be used to distinguish weights added from
llvm.expect* intrinsics from those added via other methods, e.g. from
profiles or inserted by the compiler.
One of the major motivations, is for use with MisExpect diagnostics,
which need to know if branch_weight metadata originates from an
llvm.expect intrinsic. Without that information, we end up checking
branch weights multiple times in the case if ThinLTO + SampleProfiling,
leading to some inaccuracy in how we report MisExpect related
diagnostics to users.
Since we change the format of MD_prof metadata in a fundamental way, we
need to update code handling branch weights in a number of places.
We also update the lang ref for branch weights to reflect the change.
This patch implements the changes to LLVM IR discussed in
https://discourse.llvm.org/t/rfc-update-branch-weights-metadata-to-allow-tracking-branch-weight-origins/75032
In this patch, we add an optional field to MD_prof metadata nodes for
branch weights, which can be used to distinguish weights added from
`llvm.expect*` intrinsics from those added via other methods, e.g.
from profiles or inserted by the compiler.
One of the major motivations, is for use with MisExpect diagnostics,
which need to know if branch_weight metadata originates from an
llvm.expect intrinsic. Without that information, we end up checking
branch weights multiple times in the case if ThinLTO + SampleProfiling,
leading to some inaccuracy in how we report MisExpect related
diagnostics to users.
Since we change the format of MD_prof metadata in a fundamental way, we
need to update code handling branch weights in a number of places.
We also update the lang ref for branch weights to reflect the change.
Continuing the patch series to get rid of debug intrinsics [0], instruction
insertion needs to be done with iterators rather than instruction pointers,
so that we can communicate information in the iterator class. This patch
adds an iterator-taking insertBefore method and converts various call sites
to take iterators. These are all sites where such debug-info needs to be
preserved so that a stage2 clang can be built identically; it's likely that
many more will need to be changed in the future.
At this stage, this is just changing the spelling of a few operations,
which will eventually become signifiant once the debug-info bearing
iterator is used.
[0] https://discourse.llvm.org/t/rfc-instruction-api-changes-needed-to-eliminate-debug-intrinsics-from-ir/68939
Differential Revision: https://reviews.llvm.org/D152537
Add an API that allows removing multiple incoming phi values based
on a predicate callback, as suggested on D157621.
This makes sure that the removal is linear time rather than quadratic,
and avoids subtleties around iterator invalidation.
I have replaced some of the more straightforward users with the new
API, though there's a couple more places that should be able to use it.
Differential Revision: https://reviews.llvm.org/D158064
Relative to the previous attempt, this also adjusts RegionInfo
verification to allow unreachable predecessors.
-----
If a block in the CHR region has an unreachable predecessor, then
there will be no edge from that predecessor to the newly cloned
block. However, a phi node entry for it will be left behind. Make
sure that these incoming blocks get dropped as well.
Fixes https://github.com/llvm/llvm-project/issues/64594.
Differential Revision: https://reviews.llvm.org/D157621
If a block in the CHR region has an unreachable predecessor, then
there will be no edge from that predecessor to the newly cloned
block. However, a phi node entry for it will be left behind. Make
sure that these incoming blocks get dropped as well.
Fixes https://github.com/llvm/llvm-project/issues/64594.
Differential Revision: https://reviews.llvm.org/D157621
This patch freezes potentially poisonous conditions in conditional
branches so that we do not "move up" conditional branches
"br i1 poison".
Differential Revision: https://reviews.llvm.org/D145008
Without this patch, the control height reduction pass would combine a
"poison" branch with an earlier well-defined branch, turning the
earlier branch into a "poison" branch also.
This patch fixes the problem by rejecting "poison" conditional
branches.
Differential Revision: https://reviews.llvm.org/D145008
This is a modified version of commit b374423304a8 by
Arthur (https://reviews.llvm.org/D143424).
Here we invoke to the pass independent of PGOOPT. We now check if the
profile is available through the program summary. This ensures CHR is
called in distributed ThinLTO BE compilation (where PGOOPT might not
be created).
Differential Revision: https://reviews.llvm.org/D144769
ControlHeightReduction (CHR) clones the code region to reduce the
branches in the hot code path. The number of clones is linear to the
depth of the region.
Currently it does not have control over the code size increase. We are
seeing one ~9000 BB functions get expanded to ~250000 BBs, an 25x
increase. This creates a big compile time issue for the downstream
optimizations.
This patch adds a cap for number of clones for one region.
Differential Revision: https://reviews.llvm.org/D138333
Use logical instead of bitwise and to combine conditions, to avoid
propagating poison from a later condition if an earlier one is
already false. This avoids introducing branch on poison.
Differential Revision: https://reviews.llvm.org/D125898
While select conditions can be poison, branch on poison is
immediate UB. As such, we need to freeze the condition when
converting a select into a branch.
Differential Revision: https://reviews.llvm.org/D125398
When a block containing llvm.coro.id is cloned during CHR, it inserts an invalid
PHI node with token type to the beginning of the block containing llvm.coro.begin.
To avoid such case, we exclude regions with llvm.coro.id.
Reviewed By: ChuanqiXu
Differential Revision: https://reviews.llvm.org/D124418