Similar to the non-ptr case, directly create the getelementptr
instruction. Going through expandAddToGEP() no longer makes sense
with opaque pointers, where generating the necessary instruction
is trivial.
This avoids recursive expansion of (the SCEV of) StepV while the
IR is in an inconsistent state, in particular with an incomplete
IV phi node, which utilities may not be prepared to deal with.
Fixes https://github.com/llvm/llvm-project/issues/80954.
SCEV treats "or disjoint" the same as "add nsw nuw". However, when
expanding, we cannot generally replace an add SCEV node with an "or
disjoint" instruction. Just dropping the poison flag is insufficient in
this case; we would have to actually convert the or into an add.
This is a partial fix for #79861.
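As a rough illustration of why an "or disjoint" cannot stand in for an
arbitrary add, here is a minimal sketch in plain C++. It does not use the
LLVM API; haveNoCommonBits and orDisjoint are hypothetical names that only
model the flag's precondition on 32-bit integers.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: true when the operands share no set bits, which is
// the condition the "disjoint" flag asserts.
constexpr bool haveNoCommonBits(std::uint32_t A, std::uint32_t B) {
  return (A & B) == 0;
}

// Models "or disjoint": the result is only meaningful when the precondition
// holds. In IR a violated precondition yields poison; here it asserts.
inline std::uint32_t orDisjoint(std::uint32_t A, std::uint32_t B) {
  assert(haveNoCommonBits(A, B) && "would be poison in IR");
  return A | B;
}

// With disjoint operands, or and add agree.
static_assert((0x4u | 0x3u) == (0x4u + 0x3u), "disjoint: or equals add");
// With overlapping bits they differ, so an add cannot become an or.
static_assert((0x6u | 0x3u) != (0x6u + 0x3u), "overlap: or differs from add");
```

The second static_assert is the case the message is concerned with: the
operands of an add are not required to be disjoint, so emitting an or for
them would compute a different value.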
We are replacing with a wider increment. If both OrigInc and
IsomorphicInc are NUW/NSW, then we can preserve them on the wider
increment; the narrower IsomorphicInc would wrap before the wider
OrigInc, so the replacement won't make IsomorphicInc's uses more
poisonous.
PR: https://github.com/llvm/llvm-project/pull/79512
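The "narrower wraps first" argument can be checked on small widths. The
sketch below is plain C++, not LLVM code; wrapsUnsigned and
narrowNoWrapImpliesWideNoWrap are hypothetical helpers, and only the
unsigned (NUW) case for 8-bit versus 16-bit adds is modeled.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model: does X + Step wrap as an unsigned add of width Bits?
// Assumes 1 <= Bits <= 63 and that X and Step both fit in Bits bits.
inline bool wrapsUnsigned(std::uint64_t X, std::uint64_t Step, unsigned Bits) {
  assert(Bits >= 1 && Bits <= 63 && "width outside the modeled range");
  const std::uint64_t Max = (std::uint64_t{1} << Bits) - 1;
  return X + Step > Max;
}

// Exhaustively checks all 8-bit operand pairs: whenever the 8-bit add does
// not wrap, the 16-bit add of the same values does not wrap either.
inline bool narrowNoWrapImpliesWideNoWrap() {
  for (std::uint64_t X = 0; X < 256; ++X)
    for (std::uint64_t Step = 0; Step < 256; ++Step)
      if (!wrapsUnsigned(X, Step, 8) && wrapsUnsigned(X, Step, 16))
        return false;
  return true;
}
```

The converse does not hold (255 + 1 wraps at 8 bits but not at 16), which is
why the flags are only kept when both increments already carry them.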
Move logic to replace congruent IV increments to helper function, to
reduce the indentation by using early returns. This is in preparation
for a follow-up patch.
LSR uses SCEVExpander to generate induction formulas. The expander
internally tries to reuse existing IR expressions. To do that, it needs
to strip any poison-generating flags (nsw, nuw, exact, nneg, etc.)
which may not be valid for the newly added users.
This is conservatively correct, but has the effect that LSR will strip
nneg flags on zext instructions involved in trip counts in loop
preheaders. To avoid this, this patch adjusts the expander to reinfer
the flags on the CSE candidate if legal for all possible users.
This should fix the regression reported in
https://github.com/llvm/llvm-project/issues/71200.
This should arguably be done inside canReuseInstruction instead, but
doing it outside is more conservative compile time wise. Both
canReuseInstruction and isGuaranteedNotToBePoison walk operand lists, so
right now we are performing work which is roughly O(N^2) in the size of
the operand graph. We should fix that before making the per operand step
more expensive. My tentative plan is to land this, and then rework the
code to sink the logic into more core interfaces.
zext nneg was recently added to the IR in #67982. Teaching SCEVExpander
to emit nneg when possible is valuable since SCEV may have proved
non-trivial facts about loop bounds which would otherwise be lost when
materializing the value.
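For readers unfamiliar with the flag, the following sketch shows what nneg
asserts, using plain C++ integer conversions rather than LLVM IR. The
function names are hypothetical, and an 8-bit to 32-bit extension is assumed
purely for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Zero extension: reinterpret the 8 bits as unsigned, then widen.
constexpr std::int32_t zeroExtend(std::int8_t V) {
  return static_cast<std::int32_t>(static_cast<std::uint8_t>(V));
}

// Sign extension: widen while preserving the signed value.
constexpr std::int32_t signExtend(std::int8_t V) {
  return static_cast<std::int32_t>(V);
}

// Models "zext nneg": the operand is promised to be non-negative. In IR a
// broken promise yields poison; this sketch asserts instead.
inline std::int32_t zextNNeg(std::int8_t V) {
  assert(V >= 0 && "would be poison in IR");
  return zeroExtend(V);
}

// For non-negative inputs both extensions agree...
static_assert(zeroExtend(100) == signExtend(100), "agree when non-negative");
// ...but for negative inputs they do not.
static_assert(zeroExtend(-1) == 255 && signExtend(-1) == -1, "differ when negative");
```

When the operand is known non-negative, later passes are free to treat the
zext as a sext, which is the kind of fact that would otherwise be lost.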
gcc warned about it:
../lib/Transforms/Utils/ScalarEvolutionExpander.cpp: In lambda function:
../lib/Transforms/Utils/ScalarEvolutionExpander.cpp:2104:22: warning: unused variable 'ARPtrTy' [-Wunused-variable]
2104 | if (PointerType *ARPtrTy = dyn_cast<PointerType>(ARTy)) {
| ^~~~~~~
Fix the warning by removing the variable and turning dyn_cast into isa.
Remove all the expandCodeFor() uses that specify an explicit type,
as well as InsertNoopCastOfTo() calls and most uses of
getEffectiveSCEVType().
The only place where no-op casts can now be inserted are public
expandCodeFor() uses.
SCEVExpander currently has special handling for the case where the
start or the step of an addrec does not dominate the loop header,
which is not used by any lit test.
Initially I thought that this is entirely dead code, because
addrec operands are required to be loop invariant. However,
SCEV currently allows creating an addrec with operands that are
loop invariant but defined *after* the loop.
This doesn't seem like a useful case to allow, and we don't
appear to be using this outside a single easy to adjust unit test.
A lot of SCEV expressions only work on integers -- in which case
the effective type will always be the same as the type.
There is a lot more cleanup to do here.
Continuing the patch series to get rid of debug intrinsics [0], instruction
insertion needs to be done with iterators rather than instruction pointers,
so that we can communicate information in the iterator class. This patch
adds an iterator-taking insertBefore method and converts various call sites
to take iterators. These are all sites where such debug-info needs to be
preserved so that a stage2 clang can be built identically; it's likely that
many more will need to be changed in the future.
At this stage, this is just changing the spelling of a few operations,
which will eventually become significant once the debug-info bearing
iterator is used.
[0] https://discourse.llvm.org/t/rfc-instruction-api-changes-needed-to-eliminate-debug-intrinsics-from-ir/68939
Differential Revision: https://reviews.llvm.org/D152537
SCEVExpander tries to reuse existing instructions with the same
SCEV expression. However, doing this replacement blindly is not
safe, because the instruction might be more poisonous.
What we were already doing is to drop poison-generating flags on
the reused instruction. But this is not the only way that more
poison can be introduced. The poison-generating flag might not
be directly on the reused instruction, or the poison contribution
might come from something like 0 * %var, which folds to 0 but can
still introduce poison.
This patch fixes the issue in a principled way, by determining which
values can contribute poison to the SCEV expression, and then
checking whether any additional values can contribute poison to the
instruction being reused. Poison-generating flags are dropped if
doing that enables reuse.
This is a pretty big hammer and does cause some regressions in
tests, but less than I would have expected. I wasn't able to come
up with a less intrusive fix that still satisfies the correctness
requirements.
Fixes https://github.com/llvm/llvm-project/issues/63763.
Fixes https://github.com/llvm/llvm-project/issues/63926.
Fixes https://github.com/llvm/llvm-project/issues/64333.
Fixes https://github.com/llvm/llvm-project/issues/63727.
Differential Revision: https://reviews.llvm.org/D158181
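The 0 * %var case can be modeled with a tiny value type that tracks poison
explicitly. This is a sketch in plain C++ under simplified semantics (poison
propagates through mul); Value and mul are hypothetical names and are not
how SCEV or the expander represent anything.

```cpp
#include <cassert>

// Hypothetical model of an IR value: either poison or a concrete integer.
struct Value {
  bool IsPoison;
  int Bits;
};

// Multiplication propagates poison from either operand, even when the
// other operand is zero.
constexpr Value mul(Value A, Value B) {
  return (A.IsPoison || B.IsPoison) ? Value{true, 0}
                                    : Value{false, A.Bits * B.Bits};
}

// With a well-defined %var the product is the constant 0, matching the
// folded SCEV expression.
static_assert(!mul(Value{false, 0}, Value{false, 7}).IsPoison, "defined");
static_assert(mul(Value{false, 0}, Value{false, 7}).Bits == 0, "folds to 0");
// With a poison %var the instruction is poison, while the folded SCEV
// constant 0 never is.
static_assert(mul(Value{false, 0}, Value{true, 0}).IsPoison, "poison leaks");
```

In this model the SCEV expression has no poison contributors at all, but the
instruction has one (%var), so reusing the instruction would be unsound.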
This method is only used to determine whether a related expansion
exists; the actual value is unused. Clarify that by renaming
get -> has and returning bool.
replaceCongruentIVs analysis is based on ScalarEvolution; this makes
comparing different PHIs and performing the replacement straightforward.
However, it can have some side-effects: it isn't aware whether an
induction variable is in canonical form, so it can perform replacements
which obscure the meaning of the IR.
In test22 in widen-loop-comp.ll, the resulting loop can't be analyzed by
ScalarEvolution at all.
My attempted solution is to restrict the transform: don't try to replace
induction variables using PHI nodes that don't represent simple
induction variables.
I'm not sure if this is the best solution; suggestions welcome.
Differential Revision: https://reviews.llvm.org/D121950
/Users/jiefu/llvm-project/llvm/lib/Transforms/Utils/ScalarEvolutionExpander.cpp:293:13: error: function 'FactorOutConstant' is not needed and will not be emitted [-Werror,-Wunneeded-internal-declaration]
static bool FactorOutConstant(const SCEV *&S, const SCEV *&Remainder,
^
1 error generated.
Instead of checking the pointer type, check the element type of
the GEP.
Previously we ended up reusing GEP increments that were not in
expanded form, thus not respecting LSR's choice of representation.
The change in 2011-10-06-ReusePhi.ll recovers a regression that
appeared when converting that test to opaque pointers.
Changes in various Thumb tests now compute the step outside the
loop instead of using add.w inside the loop, which is LSR's
preferred representation for this target.
When normalizing a SCEV expression during expansion, there should be
no need for it to be invertible, as it will only be used for code
generation. This fixes a crash after 7f5b15ad150e.
Fixes https://github.com/llvm/llvm-project/issues/63678.
SCEVExpander keeps track of all instructions it inserted. However,
it currently misses some phi nodes created during LCSSA construction.
Fix this by collecting these into another argument.
This also removes the IRBuilder argument, which was added for
essentially the same purpose, but only handles the root LCSSA nodes,
not those inserted by SSAUpdater.
This was reported as a regression on D149344, but the reduced test
case also reproduces without it.
Differential Revision: https://reviews.llvm.org/D150681
This patch-set aims to simplify the existing RVV segment load/store
intrinsics to use a type that represents a tuple of vectors instead.
To achieve this, first we need to relax the current limitation for an
aggregate type to be a target of load/store/alloca when the aggregate
type contains homogeneous scalable vector types. Then we adjust the
prolog of an LLVM function during lowering to clang. Finally we
re-define the RVV segment load/store intrinsics to use the tuple types.
The pull request under the RVV intrinsic specification is
riscv-non-isa/rvv-intrinsic-doc#198
---
This is the 1st patch of the patch-set. This patch originated from
D98169.
This patch allows aggregate type (StructType) that contains homogeneous
scalable vector types to be a target of load/store/alloca. The RFC of
this patch was posted in LLVM Discourse.
https://discourse.llvm.org/t/rfc-ir-permit-load-store-alloca-for-struct-of-the-same-scalable-vector-type/69527
The main changes in this patch are:
- Extend `StructLayout::StructSize` from `uint64_t` to `TypeSize` to
  accommodate an expression of scalable size.
- Allow `StructType::isSized` to also return true for homogeneous
  scalable vector types.
- Let `Type::isScalableTy` return true when `Type` is `StructType`
  and contains scalable vectors.
Extra description is added in the LLVM Language Reference Manual on the
relaxation of this patch.
Authored-by: Hsiangkai Wang <kai.wang@sifive.com>
Co-Authored-by: eop Chen <eop.chen@sifive.com>
Reviewed By: craig.topper, nikic
Differential Revision: https://reviews.llvm.org/D146872
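To make the "expression of scalable size" concrete, here is a minimal sketch
in plain C++. SketchSize, homogeneousStructSize and bytesAt are hypothetical
names loosely inspired by the idea behind TypeSize; they are not its API,
and padding between struct elements is assumed to be absent.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical size that is either fixed or a multiple of runtime vscale.
struct SketchSize {
  std::uint64_t KnownMin; // bytes at vscale == 1, or exact bytes if fixed
  bool Scalable;
};

// Size of a struct made of Count identical elements (no padding assumed).
// The result is scalable exactly when the element is scalable.
constexpr SketchSize homogeneousStructSize(SketchSize Elem, unsigned Count) {
  return SketchSize{Elem.KnownMin * Count, Elem.Scalable};
}

// Concrete byte size once vscale is known.
constexpr std::uint64_t bytesAt(SketchSize S, std::uint64_t VScale) {
  return S.Scalable ? S.KnownMin * VScale : S.KnownMin;
}

// A struct of two <vscale x 4 x i32> members: 16 bytes minimum each, so
// 32 bytes minimum in total, and 64 bytes when vscale is 2.
static_assert(homogeneousStructSize(SketchSize{16, true}, 2).KnownMin == 32,
              "minimum size scales with the element count");
static_assert(bytesAt(homogeneousStructSize(SketchSize{16, true}, 2), 2) == 64,
              "actual size scales with vscale");
```

A plain uint64_t cannot carry the Scalable bit, which is the motivation
given above for widening the stored struct size.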
This is part of an effort to remove ConstantExpr based
representations of `vscale` so that its LangRef definition can
be relaxed to accommodate a less strict definition of constant.
Differential Revision: https://reviews.llvm.org/D144891
Currently, for all invocations, it's equivalent, since that is literally
how `SCEVMinMaxExpr::getType()` is defined. But for e.g. `select`,
we'll want to ask about the hand type, and not the type of the operand
that happens to be first.
Use a consistent type for the operands() methods of different SCEV
types. Also make the API consistent by only providing operands(),
rather than also providing op_begin() and op_end() for some of them.
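The benefit of a single operands() accessor with one return type can be
sketched as follows. This is plain C++ with hypothetical node types (Node,
AddNode, CastNode) and a std::vector standing in for the operand list; the
real SCEV classes and their range type are not reproduced here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Node {
  int Id;
};

// Hypothetical common operand-list type shared by every node kind.
using OperandList = std::vector<const Node *>;

struct AddNode {
  OperandList Ops;
  const OperandList &operands() const { return Ops; }
};

struct CastNode {
  OperandList Ops; // holds exactly one operand by convention
  const OperandList &operands() const { return Ops; }
};

// Generic code can rely on operands() alone, with no need for separate
// op_begin()/op_end() accessors that only some node kinds provide.
template <typename NodeT> std::size_t countOperands(const NodeT &N) {
  std::size_t Count = 0;
  for (const Node *Op : N.operands()) {
    (void)Op; // only counting, the operand itself is not inspected
    ++Count;
  }
  return Count;
}
```

Because both node kinds return the same type, countOperands needs no
per-kind special cases.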
Move LCSSA fixup from ::expandCodeForImpl to ::expand(). This has
the advantage that we directly preserve LCSSA nodes here instead of
relying on doing so in rememberInstruction. It also ensures that we
don't add the non-LCSSA-safe value to InsertedExpressions.
Alternative to D132704.
Fixes #57000.
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D134739
Simplify the code by using CastInst::CreateBitOrPointerCast directly. By
not going through the builder, the temporary instruction also won't get
registered in InsertedValues & co, which means less work overall and
simplifies the clean-up.
An instruction being hoisted could have nuw/nsw flags inferred from the old
context, and we cannot simply move it to the new location while keeping them,
because we are going to introduce new uses of it that didn't exist before.
The example in https://github.com/llvm/llvm-project/issues/57187 shows how
this can produce a branch on poison from an initially well-defined program.
This patch forcefully recomputes poison-generating flags in the new context.
Differential Revision: https://reviews.llvm.org/D132022
Reviewed By: fhahn, nikic
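Why a flag that was valid in the old context may be invalid after hoisting
can be shown with a toy range model. The sketch is plain C++, not the
analysis used by the patch; Range and canKeepNUW are hypothetical, and an
8-bit unsigned add with a constant step is assumed.

```cpp
#include <cassert>

// Hypothetical description of what is known about an 8-bit operand at the
// point where the add executes: it lies in [Lo, Hi].
struct Range {
  unsigned Lo;
  unsigned Hi;
};

// nuw is justified only if no value in the known range makes the 8-bit
// add X + Step exceed 255.
constexpr bool canKeepNUW(Range Known, unsigned Step) {
  return Known.Hi + Step <= 255;
}

// Old context: the add sits behind a guard that proved X <= 100, so
// "add nuw X, 100" never wraps.
static_assert(canKeepNUW(Range{0, 100}, 100), "valid behind the guard");
// New context: hoisted above the guard, X may be anything in [0, 255],
// so the flag has to be recomputed and, in this case, dropped.
static_assert(!canKeepNUW(Range{0, 255}, 100), "invalid once hoisted");
```

Keeping the flag in the second case would make the hoisted add poison for
inputs on which the original program was well defined.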