12211 Commits

Author SHA1 Message Date
Max Kazantsev
0cbb8ec030 Revert "[AssumptionCache] caches @llvm.experimental.guard's"
This reverts commit f9599bbc7a3f831e1793a549d8a7a19265f3e504.

For some reason it caused us a huge compile time regression in downstream
workloads. Not sure whether the source of it is in upstream code or not.
Temporarily reverting until investigated.

Differential Revision: https://reviews.llvm.org/D142330
2023-02-20 18:38:07 +07:00
Simon Tatham
a8cd35c3b7 [LowerTypeTests] Support generating Armv6-M jump tables. (reland)
[Originally committed as f6ddf7781471b71243fa3c3ae7c93073f95c7dff;
reverted in bbef38352fbade9e014ec97d5991da5dee306da7 due to test
breakage; now relanded with the Arm tests conditioned on
`arm-registered-target`]

The LowerTypeTests pass emits a jump table in the form of an
`inlineasm` IR node containing a string representation of some
assembly. It tests the target triple to see what architecture it
should be generating assembly for. But that's not good enough for
`Triple::thumb`, because the 32-bit PC-relative `b.w` branch
instruction isn't available in all supported architecture versions. In
particular, Armv6-M doesn't support that instruction (although the
similar Armv8-M Baseline does).

Most of this patch is concerned with working out whether the
compilation target is Armv6-M or not, which I'm doing by going through
all the functions in the module, retrieving a TargetTransformInfo for
each one, and querying it via a new method I've added to check its
SubtargetInfo. If any function's TTI indicates that it's targeting an
architecture supporting B.W, then we assume we're also allowed to use
B.W in the jump table.
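
As a rough illustration of that decision rule (not the code from the patch), the module-wide check can be sketched as an "any of" over per-function answers. `FunctionInfo` and `canUseWideBranch` are hypothetical names for this sketch; the real patch queries TargetTransformInfo, which is not modelled here.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical stand-in for the per-function subtarget query. The real patch
// asks TargetTransformInfo; this struct only records the answer.
struct FunctionInfo {
  bool SupportsWideBranch; // true if the function's subtarget has B.W
};

// Mirrors the rule in the message: B.W is allowed in the jump table as soon
// as any function in the module targets an architecture that supports it.
bool canUseWideBranch(const std::vector<FunctionInfo> &Functions) {
  return std::any_of(
      Functions.begin(), Functions.end(),
      [](const FunctionInfo &F) { return F.SupportsWideBranch; });
}
```

With this rule, a module with no functions at all falls back to the conservative answer (no B.W).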

The Armv6-M compatible jump table format requires a temporary
register, and therefore also has to use the stack in order to restore
that register.

Another consequence of this change is that jump tables on Arm/Thumb
are no longer always the same size. In particular, on an architecture
that supports Arm and Thumb-1 but not Thumb-2, the Arm and Thumb
tables are different sizes from //each other//. As a consequence,
``getJumpTableEntrySize`` can no longer base its answer on the target
triple's architecture: it has to take into account the decision that
``selectJumpTableArmEncoding`` made, which meant I had to move that
function to an earlier point in the code and store its answer in the
``LowerTypeTestsModule`` class.

Reviewed By: lenary

Differential Revision: https://reviews.llvm.org/D143576
2023-02-20 10:46:47 +00:00
Nikita Popov
be88b5814d [InstCombine] Call simplifyLoadInst()
InstCombine is supposed to be a superset of InstSimplify, but
failed to invoke load simplification.

Unfortunately, this causes a minor compile-time regression, which
will be mitigated in a future commit.
2023-02-20 10:49:44 +01:00
Matt Devereau
8299c764bd [InstSimplify] Simplify icmp between Shl instructions of the same value
define i1 @compare_vscales() {
  %vscale = call i64 @llvm.vscale.i64()
  %vscalex2 = shl nuw nsw i64 %vscale, 1
  %vscalex4 = shl nuw nsw i64 %vscale, 2
  %cmp = icmp ult i64 %vscalex2, %vscalex4
  ret i1 %cmp
}

This IR is currently emitted by LLVM. This icmp is redundant as this snippet
can be simplified to true or false as both operands originate from the same
@llvm.vscale.i64() call.
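
To make the arithmetic concrete, here is a small brute-force sketch in C++ over 8-bit values. It assumes the shared operand is non-zero (as vscale is) and that neither shift loses bits (the `nuw` flag); the helper names are made up for this example and are not part of the patch.

```cpp
#include <cstdint>

// Returns true if `X << Shift` loses no set bits in 8 bits, i.e. the shift
// would satisfy the `nuw` flag.
bool shlIsNuw(uint8_t X, unsigned Shift) {
  return ((static_cast<unsigned>(X) << Shift) >> 8) == 0;
}

// Exhaustively checks that, for every non-zero X where both shifts are nuw,
// (X << 1) <u (X << 2) holds.
bool shlCompareAlwaysTrue() {
  for (unsigned V = 1; V < 256; ++V) {
    const uint8_t X = static_cast<uint8_t>(V);
    if (!shlIsNuw(X, 1) || !shlIsNuw(X, 2))
      continue; // the nuw precondition does not hold for this X
    const uint8_t A = static_cast<uint8_t>(X << 1);
    const uint8_t B = static_cast<uint8_t>(X << 2);
    if (!(A < B))
      return false;
  }
  return true;
}
```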

Differential Revision: https://reviews.llvm.org/D142542
2023-02-20 09:25:34 +00:00
Max Kazantsev
5fe915bb8c [SCEV] Canonicalize ext(min/max(x, y)) to min/max(ext(x), ext(y))
I stumbled over this while trying to improve our exit count work. These expressions
are equivalent for complementary signed/unsigned ext and min/max (including umin_seq),
but they are not canonicalized and SCEV cannot recognize them as the same.

The benefit of this canonicalization is that SCEV can prove some new equivalences which
it couldn't prove because of different forms. There is 1 test where the trip count seems pessimized.
I could not directly figure out why, but it just seems to be an unrelated issue that we can fix.
Other changes seem neutral or positive to me.
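
The equivalence can be checked by brute force on small types. The sketch below reads "complementary" as pairing zext with umin/umax and sext with smin/smax (an assumption about the intended meaning) and extends 8-bit values to 16 bits; it is an illustration in plain C++, not SCEV code.

```cpp
#include <algorithm>
#include <cstdint>

// Checks ext(min/max(a, b)) == min/max(ext(a), ext(b)) for every pair of
// 8-bit values, with zext paired with umin/umax and sext with smin/smax.
bool extCommutesWithMinMax() {
  for (int A = 0; A < 256; ++A) {
    for (int B = 0; B < 256; ++B) {
      const uint8_t UA = static_cast<uint8_t>(A);
      const uint8_t UB = static_cast<uint8_t>(B);
      const int8_t SA = static_cast<int8_t>(UA); // same bits, signed view
      const int8_t SB = static_cast<int8_t>(UB);
      const uint16_t ZA = UA, ZB = UB; // zero-extended operands
      const int16_t XA = SA, XB = SB;  // sign-extended operands
      if (static_cast<uint16_t>(std::min(UA, UB)) != std::min(ZA, ZB))
        return false;
      if (static_cast<uint16_t>(std::max(UA, UB)) != std::max(ZA, ZB))
        return false;
      if (static_cast<int16_t>(std::min(SA, SB)) != std::min(XA, XB))
        return false;
      if (static_cast<int16_t>(std::max(SA, SB)) != std::max(XA, XB))
        return false;
    }
  }
  return true;
}
```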

Differential Revision: https://reviews.llvm.org/D141481
Reviewed By: nikic
2023-02-20 16:12:58 +07:00
Kohei Asano
a4d6c7dd99 [InstSimplify] Fold LoadInst for uniform constant global variables
Fold LoadInst for uniformly initialized constants, even if there
are non-constant GEP indices.
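
The intuition, sketched in C++ under simplifying assumptions: if every byte of the initializer is the same, an in-bounds load yields the same value whatever the index is. `getUniformByte` is a hypothetical helper for illustration; the exact conditions checked by the patch are not shown here.

```cpp
#include <cstddef>
#include <cstdint>

// If every byte in Data equals the first one, stores that byte in Out and
// returns true. Returns false for empty or non-uniform data.
bool getUniformByte(const uint8_t *Data, size_t Size, uint8_t &Out) {
  if (Size == 0)
    return false;
  for (size_t I = 1; I < Size; ++I)
    if (Data[I] != Data[0])
      return false;
  Out = Data[0];
  return true;
}
```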

Goal proof: https://alive2.llvm.org/ce/z/oZtVby

Motivated by https://github.com/rust-lang/rust/issues/107208

Differential Revision: https://reviews.llvm.org/D144184
2023-02-20 09:43:52 +01:00
Kazu Hirata
a28b252d85 Use APInt::getSignificantBits instead of APInt::getMinSignedBits (NFC)
Note that getMinSignedBits has been soft-deprecated in favor of
getSignificantBits.
2023-02-19 23:56:52 -08:00
Max Kazantsev
df9c5bd8d2 [SCEV] Support umin/smin in SCEVLoopGuardRewriter
Adds support for these SCEVs to cover more cases.

Differential Revision: https://reviews.llvm.org/D143259
Reviewed By: dmakogon, fhahn
2023-02-20 13:05:00 +07:00
Kazu Hirata
f8f3db2756 Use APInt::count{l,r}_{zero,one} (NFC) 2023-02-19 22:04:47 -08:00
Kazu Hirata
cbde2124f1 Use APInt::popcount instead of APInt::countPopulation (NFC)
This is for consistency with the C++20-style bit manipulation
functions in <bit>.
2023-02-19 11:29:12 -08:00
Noah Goldstein
3bd38f6639 [ValueTracking] Add cases for additional ops in isKnownNonZero
Add cases for the following ops:
    - 0-X            -- https://alive2.llvm.org/ce/z/6C75Li
    - bitreverse(X)  -- https://alive2.llvm.org/ce/z/SGG1q9
    - bswap(X)       -- https://alive2.llvm.org/ce/z/p7pzwh
    - ctpop(X)       -- https://alive2.llvm.org/ce/z/c5y3BC
    - abs(X)         -- https://alive2.llvm.org/ce/z/yxXGz_
                        https://alive2.llvm.org/ce/z/rSRg4K
    - uadd_sat(X, Y) -- https://alive2.llvm.org/ce/z/Zw-y4W
                        https://alive2.llvm.org/ce/z/2NRqRz
                        https://alive2.llvm.org/ce/z/M1OpF8
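
Alongside the Alive2 proofs, the same facts can be sanity-checked by brute force on small integers. The C++ sketch below uses 8-bit values (16-bit for bswap); all helper names are made up for this example and none of this is ValueTracking code.

```cpp
#include <cstdint>

uint8_t bitreverse8(uint8_t X) {
  uint8_t R = 0;
  for (unsigned I = 0; I < 8; ++I)
    if (X & (1u << I))
      R = static_cast<uint8_t>(R | (1u << (7 - I)));
  return R;
}

uint16_t bswap16(uint16_t X) {
  return static_cast<uint16_t>((X << 8) | (X >> 8));
}

unsigned ctpop8(uint8_t X) {
  unsigned N = 0;
  for (; X != 0; X = static_cast<uint8_t>(X >> 1))
    N += X & 1u;
  return N;
}

// X is the bit pattern of an i8; abs of the minimum value wraps to itself.
uint8_t abs8(uint8_t X) {
  return (X & 0x80u) ? static_cast<uint8_t>(0u - X) : X;
}

uint8_t uaddSat8(uint8_t X, uint8_t Y) {
  const unsigned Sum = static_cast<unsigned>(X) + Y;
  return Sum > 255u ? uint8_t(255) : static_cast<uint8_t>(Sum);
}

// Checks that each op maps a non-zero input to a non-zero result. For
// uadd_sat one non-zero operand is enough, whatever the other operand is.
bool nonZeroIsPreserved() {
  for (unsigned V = 1; V < 65536; ++V) {
    if (bswap16(static_cast<uint16_t>(V)) == 0)
      return false;
    if (V > 255)
      continue;
    const uint8_t X = static_cast<uint8_t>(V);
    if (static_cast<uint8_t>(0u - X) == 0 || bitreverse8(X) == 0 ||
        ctpop8(X) == 0 || abs8(X) == 0)
      return false;
    for (unsigned W = 0; W < 256; ++W)
      if (uaddSat8(X, static_cast<uint8_t>(W)) == 0)
        return false;
  }
  return true;
}
```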

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D142828
2023-02-18 13:45:15 -06:00
Noah Goldstein
9a8f517f57 [ValueTracking] Add KnownBits patterns xor(x, x - 1) and and(x, -x) for knowing upper bits to be zero
These two BMI patterns will clear the upper bits of result past the
first set bit. So if we know a single bit in `x` is set, we know that
`results[bitwidth - 1, log2(x) + 1] = 0`.
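
A brute-force sketch of the claim on 8-bit values, written in plain C++ for illustration (the function name is made up): whenever bit `I` of `x` is set, every result bit above `I` is zero for both patterns.

```cpp
#include <cstdint>

// For every non-zero 8-bit x and every bit I that is set in x, checks that
// xor(x, x - 1) (blsmsk) and and(x, -x) (blsi) have no bits set above I.
bool upperBitsCleared() {
  for (unsigned V = 1; V < 256; ++V) {
    const uint8_t X = static_cast<uint8_t>(V);
    const uint8_t Blsmsk = static_cast<uint8_t>(X ^ (X - 1));
    const uint8_t Blsi = static_cast<uint8_t>(X & (0 - X));
    for (unsigned I = 0; I < 8; ++I) {
      if (!(X & (1u << I)))
        continue; // bit I is not known to be set
      if ((Blsmsk >> (I + 1)) != 0 || (Blsi >> (I + 1)) != 0)
        return false;
    }
  }
  return true;
}
```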

Alive2:
blsmsk: https://alive2.llvm.org/ce/z/a397BS
blsi: https://alive2.llvm.org/ce/z/tsbQhC

Differential Revision: https://reviews.llvm.org/D142271
2023-02-18 13:31:17 -06:00
Teresa Johnson
8045ba8948 [ThinLTO/WPD] Handle function alias in vtable correctly
We were not summarizing a function alias in the vtable, leading to
incorrect WPD in some cases, and missing WPD in others.

Specifically, we would end up ignoring function aliases as they aren't
summarized, so we could incorrectly devirtualize if there was a single
other non-alias function in a compatible vtable. And if there was only
one implementation, but it was an alias, we would not be able to
identify and perform the single implementation devirtualization.

Handling the alias summary correctly also required fixing the handling
in mustBeUnreachableFunction, so that it is not incorrectly ignored.

Regular LTO is conservatively correct because it will skip
devirtualizing when any pointer within a vtable is not a function.
However, it needs additional work to be able to take advantage of
function alias within the vtable that is in fact the only
implementation. For that reason, the Regular LTO testing in the second
test case is currently disabled, and will be enabled along with a follow
on enhancement fix for Regular LTO WPD.

Differential Revision: https://reviews.llvm.org/D144209
2023-02-16 18:20:12 -08:00
Simon Tatham
bbef38352f Revert "[LowerTypeTests] Support generating Armv6-M jump tables."
This reverts commit f6ddf7781471b71243fa3c3ae7c93073f95c7dff.

Eight buildbots reported that the two test files changed by that
commit had started failing. The buildbots in question all had in
common that they build with a very restricted `LLVM_TARGETS_TO_BUILD`,
such as only X86 or AArch64 or Hexagon. I didn't notice this before
commit because my own build has the full default set of targets, and
in that circumstance, the tests pass.

I assume the problem has something to do with the attempt to query
TargetTransformInfo: if you can't make a valid TTI for the target
triple then you can't ask it what kind of inline assembler you should
be emitting, and so `opt` without the Arm backend can't get the Arm
cases of these tests right.

I don't have time to fix this until next week, so I'll revert the
change for now to keep the buildbots happy.
2023-02-16 17:11:06 +00:00
Simon Tatham
f6ddf77814 [LowerTypeTests] Support generating Armv6-M jump tables.
The LowerTypeTests pass emits a jump table in the form of an
`inlineasm` IR node containing a string representation of some
assembly. It tests the target triple to see what architecture it
should be generating assembly for. But that's not good enough for
`Triple::thumb`, because the 32-bit PC-relative `b.w` branch
instruction isn't available in all supported architecture versions. In
particular, Armv6-M doesn't support that instruction (although the
similar Armv8-M Baseline does).

Most of this patch is concerned with working out whether the
compilation target is Armv6-M or not, which I'm doing by going through
all the functions in the module, retrieving a TargetTransformInfo for
each one, and querying it via a new method I've added to check its
SubtargetInfo. If any function's TTI indicates that it's targeting an
architecture supporting B.W, then we assume we're also allowed to use
B.W in the jump table.

The Armv6-M compatible jump table format requires a temporary
register, and therefore also has to use the stack in order to restore
that register.

Another consequence of this change is that jump tables on Arm/Thumb
are no longer always the same size. In particular, on an architecture
that supports Arm and Thumb-1 but not Thumb-2, the Arm and Thumb
tables are different sizes from //each other//. As a consequence,
``getJumpTableEntrySize`` can no longer base its answer on the target
triple's architecture: it has to take into account the decision that
``selectJumpTableArmEncoding`` made, which meant I had to move that
function to an earlier point in the code and store its answer in the
``LowerTypeTestsModule`` class.

Reviewed By: lenary

Differential Revision: https://reviews.llvm.org/D143576
2023-02-16 15:34:49 +00:00
Florian Hahn
b32b7068ef [ConstraintSystem] Use sparse representation for constraints. (NFC)
Update ConstraintSystem to use a sparse representation for entries in a
row. Most rows only contain a small number of variables, so the sparse
representation can result in significant speedups.
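
To illustrate the general idea only: a sparse row keeps just the non-zero coefficients together with the id of the variable they belong to. The `SparseRow` layout and helper names below are assumptions for this sketch and are not taken from ConstraintSystem.

```cpp
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical sparse row: (variable id, coefficient) pairs for the
// non-zero entries only, ordered by variable id.
using SparseRow = std::vector<std::pair<unsigned, int64_t>>;

// Converts a dense coefficient vector into the sparse form.
SparseRow toSparse(const std::vector<int64_t> &Dense) {
  SparseRow Row;
  for (unsigned Id = 0; Id < Dense.size(); ++Id)
    if (Dense[Id] != 0)
      Row.emplace_back(Id, Dense[Id]);
  return Row;
}

// Looks up a coefficient; variables that are not stored are implicitly zero.
int64_t coefficientOf(const SparseRow &Row, unsigned Id) {
  for (const auto &Entry : Row)
    if (Entry.first == Id)
      return Entry.second;
  return 0;
}
```

A row over many variables that mentions only two of them then costs two entries to store and to scan, which is where the speedup for mostly-empty rows would come from.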

For a large test case from D135915, it halves the time spent in
ConstraintElimination.

To ensure this returns the same results as the old implementation in all
cases, I built a large set of projects with an extra assertion that it
produces the same result as the old implementation.
2023-02-16 14:44:49 +00:00
Jay Foad
c76acb9dff [UniformityAnalysis] Fix some file headers and pass names
Differential Revision: https://reviews.llvm.org/D144167
2023-02-16 11:12:31 +00:00
Nikita Popov
eeb125659c [InstSimplify] Slightly optimize simplifyLoad() (NFC)
Check upfront whether the load is based on a constant global
with definitive initializer. Don't bother computing offsets
otherwise.
2023-02-16 10:41:23 +01:00
Nikita Popov
9ca2c309ab [InstSimplify] Fix poison safety in insertvalue fold
We can only fold insertvalue undef, (extractvalue x, n) to x
if x is not poison, otherwise we might be replacing undef with
poison (https://alive2.llvm.org/ce/z/fnw3c8). The insertvalue
poison case is always fine.

I didn't go to particularly large effort to preserve cases where
folding with undef is still legal (mainly when there is a chain of
multiple inserts that end up covering the whole aggregate),
because this shouldn't really occur in practice: We should always
be generating the insertvalue poison form when constructing
aggregates nowadays.

Differential Revision: https://reviews.llvm.org/D144106
2023-02-16 09:39:44 +01:00
Zain Jaffal
df2ea2fc28 [ConstraintElimination] Add NODEBUG condition around dump 2023-02-15 18:53:22 +00:00
Zain Jaffal
07f93d8c2c Recommit "[ConstraintElimination] Change debug output to display variable names."
This reverts commit 02ae7e72b3f00969eeb579a2b4346082827f0b35.

include Value.h in ConstraintSystem.h
2023-02-15 16:38:35 +00:00
Nikita Popov
02ae7e72b3 Revert "Recommit "[ConstraintElimination] Change debug output to display variable names.""
This reverts commit 2a2a6bfcfe8e62886542cb673ac8df349cf26499.

This causes build failures:

    /home/npopov/repos/llvm-project/llvm/lib/Analysis/ConstraintSystem.cpp: In member function ‘llvm::SmallVector<std::__cxx11::basic_string<char> > llvm::ConstraintSystem::getVarNamesList() const’:
    /home/npopov/repos/llvm-project/llvm/lib/Analysis/ConstraintSystem.cpp:118:10: error: invalid use of incomplete type ‘class llvm::Value’
      118 |     if (V->getName().empty())
          |          ^~
    In file included from /home/npopov/repos/llvm-project/llvm/lib/Analysis/ConstraintSystem.cpp:9:
    /home/npopov/repos/llvm-project/llvm/include/llvm/Analysis/ConstraintSystem.h:21:7: note: forward declaration of ‘class llvm::Value’
       21 | class Value;
          |       ^~~~~
    /home/npopov/repos/llvm-project/llvm/lib/Analysis/ConstraintSystem.cpp:119:22: error: invalid use of incomplete type ‘class llvm::Value’
      119 |       OperandName = V->getNameOrAsOperand();
          |                      ^~
    /home/npopov/repos/llvm-project/llvm/include/llvm/Analysis/ConstraintSystem.h:21:7: note: forward declaration of ‘class llvm::Value’
       21 | class Value;
          |       ^~~~~
    /home/npopov/repos/llvm-project/llvm/lib/Analysis/ConstraintSystem.cpp:121:41: error: invalid use of incomplete type ‘class llvm::Value’
      121 |       OperandName = std::string("%") + V->getName().str();
          |                                         ^~
    /home/npopov/repos/llvm-project/llvm/include/llvm/Analysis/ConstraintSystem.h:21:7: note: forward declaration of ‘class llvm::Value’
       21 | class Value;
          |       ^~~~~
2023-02-15 16:36:44 +01:00
Zain Jaffal
2a2a6bfcfe Recommit "[ConstraintElimination] Change debug output to display variable names."
This reverts commit 62d0e1a8541f93dfbf66d982f66da32676df2df7.

remove `dumpWithNames` function
2023-02-15 15:23:53 +00:00
Zain Jaffal
62d0e1a854 Revert "[ConstraintElimination] Change debug output to display variable names."
This reverts commit 869c87ad10e87db7c032c3464338ab9d50916510.

`dumpWithNames` function should be removed
2023-02-15 15:21:46 +00:00
Zain Jaffal
869c87ad10 [ConstraintElimination] Change debug output to display variable names.
Previously, when the constraint system printed its rows, the variables were named x1, x2, ..., xn, making it hard to tell which values in the IR they relate to.

Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D142618
2023-02-15 15:07:48 +00:00
Nikita Popov
07916cea2e [ConstantFold] Check for constant global earlier (NFC)
Check that the underlying object is a constant global with
definitive initializer upfront, so we can skip the more expensive
offset calculation logic if we can't perform the fold anyway.
2023-02-15 15:17:05 +01:00
Sanjay Patel
74a2dd1356 [InstSimplify] fix/improve folding with an SNaN vector element operand
Follow-up to the equivalent change for scalars:
D143505 / 83ba349ae0a8
2023-02-14 19:10:56 -05:00
Sanjay Patel
83ba349ae0 [InstSimplify] fix/improve folding with an SNaN operand
There are 2 issues here:

1. In the default LLVM FP environment (regular FP math instructions),
   SNaN is some flavor of "don't care" which we will nail down in
   D143074, so this is just a quality-of-implementation improvement
   for default FP.
2. In the constrained FP environment (constrained intrinsics), SNaN
   must not propagate through a math operation; it has to be quieted
   according to IEEE-754 spec. That is independent of exception
   handling mode, so the current behavior is a miscompile.
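
For readers unfamiliar with quieting, here is a bit-level sketch in C++ for binary32 values. It assumes the common IEEE 754-2008 encoding in which the most significant mantissa bit (bit 22) is the quiet bit; the helper names are hypothetical and this is not the InstSimplify code.

```cpp
#include <cstdint>

// binary32 NaN: exponent bits all ones and a non-zero mantissa.
bool isNaNBits(uint32_t Bits) {
  return (Bits & 0x7F800000u) == 0x7F800000u && (Bits & 0x007FFFFFu) != 0;
}

// Signaling NaN: a NaN whose quiet bit (bit 22, an assumption about the
// encoding) is clear.
bool isSignalingNaNBits(uint32_t Bits) {
  return isNaNBits(Bits) && (Bits & 0x00400000u) == 0;
}

// Quiets a NaN by setting the quiet bit; non-NaN values pass through.
uint32_t quietNaNBits(uint32_t Bits) {
  return isNaNBits(Bits) ? (Bits | 0x00400000u) : Bits;
}
```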

Differential Revision: https://reviews.llvm.org/D143505
2023-02-14 17:51:06 -05:00
Nikita Popov
bfbfbd8b65 [LVI] Fix and re-enable at-use reasoning (PR60629)
This fixes the handling of phi nodes in getConstantRangeAtUse()
and re-enables it, reverting the workaround from
c77c186a647b385c291ddabecd70a2b4f84ae342.

For phi nodes, while we can make use of the edge condition for the
incoming value, we shouldn't look past the phi node to look for
further conditions, because we might be reasoning about values
from two different cycle iterations (which will have the same
SSA value).

To handle this more specifically we would have to detect cycles,
and there doesn't seem to be any motivating case for that at this
point.
2023-02-14 15:56:39 +01:00
Simon Pilgrim
faf5616e11 BlockFrequencyInfoImpl.cpp - add missing closing namespace comment. NFC
Fixes clang-tidy llvm-namespace-comment warning
2023-02-12 16:42:28 +00:00
Simon Pilgrim
738370ae0e DemandedBits.cpp - use auto* when initializing from cast<>. NFC.
Silence clang-tidy warnings
2023-02-12 14:57:11 +00:00
Dmitry Makogon
c77c186a64 [LVI] Don't traverse uses when calculating range at use
This effectively reverts 5c38c6a and 4f772b0.

A recently introduced LazyValueInfo::getConstantRangeAtUse returns incorrect
ranges for values in certain cases. One such example is described in PR60629.
The issue has something to do with traversing PHI uses of a value transitively.
As nikic pointed out, we're effectively reasoning about values from different
loop iterations.

In the faulting test case, CVP caused a miscompile because the calculated
range for a shift argument was incorrect. It returned the empty set, even
though the code is clearly not dead. CVP then erased the shift instruction
because of the empty range.
2023-02-10 17:06:36 +07:00
Craig Topper
68c906811b [ValueTracking] Replace an always false condition with an assert. NFC
The one caller of this function already checked that V isn't a
Constant.

Alternatively, we could remove the check from the caller if reviewers
prefer.

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D143677
2023-02-09 15:19:28 -08:00
Archibald Elliott
62c7f035b4 [NFC][TargetParser] Remove llvm/ADT/Triple.h
I also ran `git clang-format` to get the headers in the right order for
the new location, which has changed the order of other headers in two
files.
2023-02-07 12:39:46 +00:00
Sergey Kachkov
203cc665cf [PHITransAddr] Simplify casts in PHITransAddr
Try to simplify casts in a similar way as for GEP and ADD with
constant (e.g. sext/zext + trunc).

Differential Revision: https://reviews.llvm.org/D143167
2023-02-07 12:43:19 +03:00
Max Kazantsev
0c4a735200 [SCEV] Support sext in SCEVLoopGuardRewriter
There is no particular reason why it's not supported, and it is useful.

Differential Revision: https://reviews.llvm.org/D143257
Reviewed By: fhahn
2023-02-07 14:00:30 +07:00
Max Kazantsev
d7eda3ca10 [SCEV][NFC] Remove check for rewriteable types
I guess its only reason to exist is potential CT optimization, otherwise it is
just creating cohesion between this code and rewriter internals. We plan to
extend the rewriter. I'd rather not have this cohesion, unless there is a serious
reason to have it.

Differential Revision: https://reviews.llvm.org/D143246
2023-02-07 12:43:42 +07:00
Bjorn Pettersson
eec670ac8e Revert "[Lint] Use new PM instead of legacy PM in lintFunction and lintModule"
This reverts commit 525ed98be483188db6dc3bb69cecd0123148ceca.

Some buildbots are failing when linking bugpoint.
Reverting to investigate that further.
2023-02-06 19:29:06 +01:00
Bjorn Pettersson
525ed98be4 [Lint] Use new PM instead of legacy PM in lintFunction and lintModule
There are some helpers in the Lint analysis pass that will set up
a pass manager and then run the Lint pass on a given Function/Module.

Those have been using the LegacyPassManager, but as a small step
towards removing the deprecated legacy pass manager this patch is
changing those helpers into using the new pass manager instead.

No idea if anyone is really using those helpers. Maybe an
alternative would have been to just remove them. There are at least no unit
tests or similar that verify that they work, so I validated this
patch by using a hacked opt binary that called those functions
before running the normal pipeline.

Differential Revision: https://reviews.llvm.org/D143388
2023-02-06 19:21:23 +01:00
Florian Hahn
8537a7c91c [ConstraintElim] Update existing constraint system in place (NFC).
This patch breaks up the solving step into 2 phases:

1. Collect all rows where the variable to eliminate is != 0 and remove
   it from the original system.
2. Process all collected rows to build a new set of constraints, and add them to
   the original system.

This is much more efficient for excessive cases, as this avoids a large
number of moves to the new system. This reduces the time spent in
ConstraintElimination for the test case shared in D135915 from ~3s to
0.6s.
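
The two phases can be sketched as a generic Fourier-Motzkin style elimination step in C++. The row layout (constant in slot 0, constraints of the form "sum <= constant") and the function name are assumptions for this illustration, and overflow is ignored; it is not the ConstraintSystem implementation.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumed layout for this sketch: Row[0] is the constant C and Row[i] the
// coefficient of variable i, meaning sum(Row[i] * x_i) <= C.
using Row = std::vector<int64_t>;

void eliminateVariable(std::vector<Row> &System, size_t Var) {
  // Phase 1: collect every row that mentions Var and remove it from the
  // system, compacting the remaining rows in place.
  std::vector<Row> Collected;
  size_t Kept = 0;
  for (size_t I = 0; I < System.size(); ++I) {
    if (System[I][Var] != 0) {
      Collected.push_back(System[I]);
    } else {
      if (Kept != I)
        System[Kept] = System[I];
      ++Kept;
    }
  }
  System.resize(Kept);

  // Phase 2: combine each (positive, negative) pair so that Var cancels,
  // and append the result to the same system. Both multipliers are
  // positive, so the direction of the inequality is preserved.
  for (const Row &Upper : Collected) {
    if (Upper[Var] <= 0)
      continue;
    for (const Row &Lower : Collected) {
      if (Lower[Var] >= 0)
        continue;
      Row Combined(Upper.size());
      for (size_t J = 0; J < Upper.size(); ++J)
        Combined[J] = Upper[J] * -Lower[Var] + Lower[J] * Upper[Var];
      System.push_back(Combined);
    }
  }
}
```

For example, eliminating y from {x - y <= 0, y <= 5} leaves the single constraint x <= 5.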
2023-02-06 16:43:42 +00:00
Florian Hahn
d82811df4d [ConstraintElim] Move some array accesses to variables (NFC).
Move some accesses that are used multiple times to variables. This also
will make updating them easier in the future.
2023-02-06 16:13:26 +00:00
Mircea Trofin
79f7a5e02b [mlgo] Disable mlgo tests when python version is 3.6
Supporting 3.6 requires a bit too much of a change in the mlgo test python scripts.
2023-02-03 19:45:22 -08:00
Mircea Trofin
d62cdfadc0 [mlgo] fixes for old python versions 2023-02-03 18:08:14 -08:00
Mircea Trofin
b72e893d1d [mlgo] Fix type annotation in log_reader, for older python3 versions 2023-02-03 18:04:09 -08:00
Mircea Trofin
5fd51fcba6 Reland "[mlgo] Hook up the interactive runner to the mlgo-ed passes"
This reverts commit a772f0bb920a4957fb94dd8dbe45943809fd0ec3.

The main problem was related to how we handled `dbgs()` from the hosted
compiler. Using explicit `subprocess.communicate`, and not relying on
dbgs() being flushed until the end appears to address the problem.

Also some fixes due to some bots running older pythons, so we can't have
nice things like `int | float` and such.
2023-02-03 17:54:42 -08:00
Mircea Trofin
a772f0bb92 Revert "[mlgo] Hook up the interactive runner to the mlgo-ed passes"
This reverts commit a7354899d1a235a796b3a2ccb45f6596983c8672.

The way stdout/stderr get routed seems to work differently locally and
on the bots. Investigating.
2023-02-03 16:34:31 -08:00
Mircea Trofin
a7354899d1 [mlgo] Hook up the interactive runner to the mlgo-ed passes
This hooks up the interactive model runner to the passes that support
ml-based decisions. Because the interface to this runner is the exact
same as the one used during inference, we just reuse the exact same
setup we have for "release mode". This makes "release mode" a misnomer -
and that's something we needed to resolve sooner or later (e.g.
supporting more than one embedded model for the same problem was another
reason to drop that nomenclature). That will happen in a subsequent
change.

To use this evaluator, just enable the pass in (currently) "release"
mode, but also pass the base name for the 2 channel files via the
pass-specific flag.

The 2 files are the responsibility of the hosting process. The added
tests use a minimal, toy such host, illustrating setup and
communication.

Differential Revision: https://reviews.llvm.org/D143218
2023-02-03 16:22:57 -08:00
Craig Topper
2919ec041f [RISCV] Remove side effects from vsetvli intrinsics.
Delete the opt intrinsics since they are now identical.

I left the side effects due to user expectations about how these
interact with things like inline assembly or function calls. Or
that they wouldn't be hoisted. I think we should look at other
ways to address those.

If I could, I'd rename these somehow to distance them from
the vsetvli instruction. In some sense they only query the VL for
a particular SEW and LMUL. They don't guarantee a vsetvli
instruction will be emitted.

Fixes https://github.com/llvm/llvm-project/issues/59359

Reviewed By: rogfer01, kito-cheng

Differential Revision: https://reviews.llvm.org/D143220
2023-02-03 13:03:56 -08:00
Teresa Johnson
6827c4f0de [MemProf] Add helper to access the back (last) call stack id
This is split out of D140908 as suggested.

Differential Revision: https://reviews.llvm.org/D143184
2023-02-03 07:51:32 -08:00
Sander de Smalen
005311399e [LoopVectorize][TTI] NFCI: Clarify enum for the tail folding style.
This NFC (intended) patch has several small changes:
* It renames PredicationStyle to TailFoldingStyle.
* It renames TTI.emitActiveLaneMask() to TTI.getPreferredTailFoldingStyle()
* Simplifies some of its uses in the LoopVectorizer

Rationale: To my surprise PredicationStyle::None did not mean 'no
predication', but rather 'no active lane mask intrinsic', such that the
predicate is created using a splat + compare with stepvector. The enum is
also highly specific to tail folding, so it seems better to name this
around that feature, i.e. 'tail folding style'.
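
To show what "splat + compare with stepvector" amounts to, here is a scalar C++ sketch of one common formulation of the lane predicate. The function name and the exact comparison are assumptions for illustration; the code the LoopVectorizer emits may differ in detail.

```cpp
#include <cstdint>
#include <vector>

// Builds the predicate for one vector iteration as
// (splat(Index) + stepvector) <u splat(TripCount),
// where stepvector is <0, 1, 2, ...>.
std::vector<bool> laneMask(uint64_t Index, uint64_t TripCount,
                           unsigned Lanes) {
  std::vector<bool> Mask(Lanes);
  for (unsigned Lane = 0; Lane < Lanes; ++Lane)
    Mask[Lane] = (Index + Lane) < TripCount; // stepvector contributes Lane
  return Mask;
}
```

In the last iteration of a loop with 6 iterations and 4 lanes (Index = 4), only the first two lanes are active.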

This also makes it more amenable to extend it to other tail folding styles,
such as the one added in D142109.

Reviewed By: david-arm

Differential Revision: https://reviews.llvm.org/D142887
2023-02-03 14:59:57 +00:00