Need to consider the length of the original vector for extractelements,
not the length, matched number of the scalars. It fixes 2 issues: 1)
improves cost estimation; 2) Fixes crashes after D158449.
This caused asserts:
Assertion failed: NumElts > 1 && "Expected at least 2-element fixed length vector(s).",
file C:\b\s\w\ir\cache\builder\src\third_party\llvm\llvm\lib\Transforms\Vectorize\SLPVectorizer.cpp, line 7096
see comment on 59a67ea35d
> Need to consider the length of the original vector for extractelements,
> not the length, matched number of the scalars. It fixes 2 issues: 1)
> improves cost estimation; 2) Fixes crashes after D158449.
This reverts commit 59a67ea35d608480257fc64ec3e5106ef50de740.
Need to consider the length of the original vector for extractelements,
not the length, matched number of the scalars. It fixes 2 issues: 1)
improves cost estimation; 2) Fixes crashes after D158449.
This reverts commit 9a99944df068b29b905cd8ba9a2132cc6382b6fb.
Due to test suite failures on all our SVE buildbots e.g.:
https://lab.llvm.org/buildbot/#/builders/184/builds/7375
clang: ../llvm/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp:3565:
InstructionCost llvm::AArch64TTIImpl::getShuffleCost(TTI::ShuffleKind,
VectorType *, ArrayRef<int>, TTI::TargetCostKind, int, VectorType *,
ArrayRef<const Value *>): Assertion `Mask.size() == TpNumElts && "Expected Mask and Tp size to match!"' failed.
artificial for better cost estimation.
Need to use original source vector type, not the one artificially
constructed, based on the number of vectorized scalars. It affect the
cost significantly.
I propose that we go ahead and enabled SLP by default. Over the last few weeks, @luke and I have been working through codegen issues seen at small VLs from a couple of SPEC workloads. We still have a ways to go to get optimal codegen, but we're at the point where having a single configuration we're all tuning against is probably the right default.
As a bit of history, I introduced this TTI hook back in a310637132 back in August of last year to unblock enabling LoopVectorizer. At the time, we had a couple known issues: constant materialization, address generation, and a general lack of maturity of small fixed vector codegen. By now, each of these has had significant investment. I can't say any of them are completely fixed, but we're no longer seeing instances of them every place we look.
What we're mostly seeing at this point is a long tail of code gen opportunities, many involving build vectors, shuffles, and extract patterns. I have a couple patches up to continue iterating on those issues, but I don't think they need to be blockers for enabling SLP.
Differential Revision: https://reviews.llvm.org/D152750
With this patch an undefined mask in a shufflevector will be printed as poison.
This change is done to support the new shufflevector semantics
for undefined mask elements.
Differential Revision: https://reviews.llvm.org/D149210
RISCV has "vfabs.v" and "vfsqrt.v" so math functions abs and sqrt
can be SLP vectorized. But others exp/log/sin/asin/sinh/asinh/...
can not.
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D145562