169 Commits

Author SHA1 Message Date
Ramkumar Ramachandra
1abb055c57
[IVDesc] Make getCastInsts return an ArrayRef (NFC) (#169021)
To make it clear that the return value is immutable.
2025-11-24 08:57:55 +00:00
Luke Lau
5da0445420
[LV] Consolidate shouldOptimizeForSize and remove unused BFI/PSI. NFC (#168697)
#158690 plans on passing BFI as a lazy lambda to avoid computing
BlockFrequencyInfo when not needed.

In preparation for that, this PR removes BFI and PSI from some
constructors that aren't used. It also consolidates the two calls to
llvm::shouldOptimizeForSize so that the result is computed once and
passed where needed.

This also renames OptForSize in LoopVectorizationLegality to clarify
that it's to prevent runtime SCEV checks, see
https://reviews.llvm.org/D68082
2025-11-19 21:29:26 +08:00
Florian Hahn
a6edeedbfa Revert "[LV] Use ExtractLane(LastActiveLane, V) live outs when tail-folding. (#149042)"
This reverts commit 62d1a080e69e3c5e98840e000135afa7c688a77b.

This appears to be causing some runtime failures on RISCV
https://lab.llvm.org/buildbot/#/builders/210/builds/5221
2025-11-13 22:34:55 +00:00
Florian Hahn
62d1a080e6
[LV] Use ExtractLane(LastActiveLane, V) live outs when tail-folding. (#149042)
Building on top of https://github.com/llvm/llvm-project/pull/148817,
introduce a new abstract LastActiveLane opcode that gets lowered to
Not(Mask) → FirstActiveLane(NotMask) → Sub(result, 1).

When folding the tail, update all extracts for uses outside the loop the
extract the value of the last actice lane.

See also https://github.com/llvm/llvm-project/issues/148603

PR: https://github.com/llvm/llvm-project/pull/149042
2025-11-12 15:11:00 +00:00
Florian Hahn
af9a4263a1
[LAA] Only use inbounds/nusw in isNoWrap if the GEP is dereferenced. (#161445)
Update isNoWrap to only use the inbounds/nusw flags from GEPs that are
guaranteed to be dereferenced on every iteration. This fixes a case
where we incorrectly determine no dependence.

I think the issue is isolated to code that evaluates the resulting
AddRec at BTC, just using it to compute the distance between accesses
should still be fine; if the access does not execute in a given
iteration, there's no dependence in that iteration. But isolating the
code is not straight-forward, so be conservative for now. The practical
impact should be very minor (only one loop changed across a corpus with
27k modules from large C/C++ workloads.

Fixes https://github.com/llvm/llvm-project/issues/160912.

PR: https://github.com/llvm/llvm-project/pull/161445
2025-11-04 17:08:12 +00:00
Florian Hahn
5e3ac2a6f2
[LV] Bail out on loops with switch as latch terminator.
Currently we cannot vectorize loops with latch blocks terminated by a
switch. In the future this could be handled by materializing appropriate
compares.

Fixes https://github.com/llvm/llvm-project/issues/156894.
2025-10-12 21:20:35 +01:00
Craig Topper
59fe8401f4
[LV] Use StringRef::consume_front. NFC (#161454) 2025-10-01 08:58:30 -07:00
Graham Hunter
54fc5367f6 [LV] Fix crash in uncountable exit with side effects checking
Fixes an ICE reported on PR #145663, as an assert was found to be
reachable with a specific combination of unreachable blocks.
2025-09-12 10:41:05 +00:00
Graham Hunter
e285602fda [LV] Enforce addrec in current loop for uncountable exit load address check
Addresses post-commit review raised for #145663
2025-09-11 11:18:22 +00:00
Graham Hunter
3c810b76b9
[LV] Add initial legality checks for early exit loops with side effects (#145663)
This adds initial support to LoopVectorizationLegality to analyze loops
with side effects (particularly stores to memory) and an uncountable
exit. This patch alone doesn't enable any new transformations, but
does give clearer reasons for rejecting vectorization for such a loop.

The intent is for a loop like the following to pass the specific checks,
and only be rejected at the end until the transformation code is
committed:

```
// Assume a is marked restrict
// Assume b is known to be large enough to access up to b[N-1]
for (int i = 0; i < N; ++) {
  a[i]++;
  if (b[i] > threshold)
    break;
}
```
2025-09-10 13:54:52 +01:00
Florian Hahn
b9b0ea5f62
[LV] Pass DT to isGuaranteedNotToBePoison in canVectorizeWithIfCvt.
Pass DT to slightly improve analysis results. Note that the context
instruction is already passed.
2025-09-05 20:42:56 +01:00
Shih-Po Hung
9876b06bc7
[LV] Add initial legality checks for loops with unbound loads. (#152422)
This patch splits out the legality checks from PR #151300, following the
landing of PR #128593.

It is a step toward supporting vectorization of early-exit loops that
contain potentially faulting loads.
In this commit, an early-exit loop is considered legal for vectorization
if it satisfies the following criteria:

1. it is a read-only loop.
2. all potentially faulting loads are unit-stride, which is the only
type currently supported by vp.load.ff.
2025-09-05 08:20:16 +08:00
Tobias Stadler
8135b7c1ab
[LV] Emit all remarks for unvectorizable instructions (#153833)
If ExtraAnalysis is requested, emit all remarks caused by unvectorizable instructions - instead of only the first.
This is in line with how other places handle DoExtraAnalysis and it can be quite helpful to get info about all instructions in a loop that prevent vectorization.
2025-08-18 18:04:53 +01:00
Florian Hahn
7d815c7642
[LV] Remove unused variables after 965231ca0a9a. (NFC)
Clean up unused/dead variables after 965231ca0a9a
(https://github.com/llvm/llvm-project/pull/151311)
2025-08-01 10:12:20 +01:00
Florian Hahn
965231ca0a
[LV] Replace UncountableEdge with UncountableEarlyExitingBlock (NFC). (#151311)
Only the uncountable exiting BB is used. Store it instead of a piar of
Exiting BB and Exit BB.

PR: https://github.com/llvm/llvm-project/pull/151311
2025-08-01 09:37:02 +01:00
Florian Hahn
fd97dfbb78
[LV] Don't mark ptrs as safe to speculate if fed by UB/poison op. (#143204)
Add additional checks before marking pointers safe to load
speculatively. If some computations feeding the pointer may trigger UB,
we cannot load the pointer speculatively, because we cannot compute the
address speculatively. The UB triggering instructions will be
predicated, but if the predicated block does not execute the result is
poison.

Similarly, we also cannot load the pointer speculatively if it may be
poison. The patch also checks if any of the operands defined outside the
loop may be poison when entering the loop. We *don't* need to check if
any operation inside the loop may produce poison due to flags, as those
will be dropped if needed.

There are some types of instructions inside the loop that can produce
poison independent of flags. Currently loads are also checked, not sure
if there's a convenient API to check for all such operands.

Fixes https://github.com/llvm/llvm-project/issues/142957.

PR: https://github.com/llvm/llvm-project/pull/143204
2025-06-20 13:05:19 +01:00
Jeremy Morse
9eb0020555
[DebugInfo][RemoveDIs] Remove a swathe of debug-intrinsic code (#144389)
Seeing how we can't generate any debug intrinsics any more: delete a
variety of codepaths where they're handled. For the most part these are
plain deletions, in others I've tweaked comments to remain coherent, or
added a type to (what was) type-generic-lambdas.

This isn't all the DbgInfoIntrinsic call sites but it's most of the
simple scenarios.

Co-authored-by: Nikita Popov <github@npopov.com>
2025-06-17 15:55:14 +01:00
Shih-Po Hung
e5263e3ec8
[LV][NFC] Clean up tail-folding check for early-exit loops (#133931)
This patch moves the check for a single latch exit from computeMaxVF()
to LoopVectorizationLegality::canFoldTailByMasking(), as it duplicates
the logic when foldTailByMasking() returns false.

It also updates the NoScalarEpilogueNeeded logic to return false for
loops that are neither single-latch-exit nor early-exit. This avoids
applying tail-folding in unsupported cases and prevents triggering
assertions during analysis.
2025-04-18 10:57:11 +08:00
chrisPyr
71f4c7dabe
[NFC]Make file-local cl::opt global variables static (#126486)
#125983
2025-03-03 13:46:33 +07:00
Ramkumar Ramachandra
4ac43b541c
[LV] Restrict widest induction type to be IntegerType (NFC) (#128173)
As the name of the function suggests, convertPointerToIntegerType should
return an IntegerType instead of a Type, and should only ever be called
with integer or ptr type. Fix the callers getWiderType, and
addInductionPhi to narrow the type of WidestIndTy to IntegerType,
stripping unclear casts. While at it, rename convertPointerToIntegerType
and getWiderType for clarity.
2025-02-24 19:10:33 +00:00
Benjamin Maxwell
e0e67a6207
[LV] Add initial support for vectorizing literal struct return values (#109833)
This patch adds initial support for vectorizing literal struct return
values. Currently, this is limited to the case where the struct is
homogeneous (all elements have the same type) and not packed. The users
of the call also must all be `extractvalue` instructions.

The intended use case for this is vectorizing intrinsics such as:

```
declare { float, float } @llvm.sincos.f32(float %x)
```

Mapping them to structure-returning library calls such as:

```
declare { <4 x float>, <4 x float> } @Sleef_sincosf4_u10advsimd(<4 x float>)
```

Or their widened form (such as `@llvm.sincos.v4f32` in this case).

Implementing this required two main changes:

1. Supporting widening `extractvalue`
2. Adding support for vectorized struct types in LV
  * This is mostly limited to parts of the cost model and scalarization

Since the supported use case is narrow, the required changes are
relatively small.
2025-02-17 09:51:35 +00:00
David Sherwood
4a2ebd6661
[LV][NFC] Refactor structures used to maintain uncountable exit info (#123219)
I've removed the HasUncountableEarlyExit variable, since we can
already determine whether or not a loop has an early exit by seeing
if we found an uncountable exit.

I have also deleted the old UncountableExitingBlocks and
UncountableExitBlocks lists and replaced them with a single
uncountable edge. This means we don't need to worry about keeping the
list entries in sync and makes it clear which exiting block
corresponds to which exit block.
2025-01-22 09:40:08 +00:00
Florian Hahn
b0697dc1de
[LV] Only check isVectorizableEarlyExitLoop with multiple exits. (#121994)
Currently we emit early-exit related debug messages/remarks even when
there is a single exit. Update to only check isVectorizableEarlyExitLoop
if there isn't a single exit block.

PR: https://github.com/llvm/llvm-project/pull/121994
2025-01-09 12:05:19 +00:00
Benjamin Maxwell
f88ef1bd1b
[LV] Teach LoopVectorizationLegality about struct vector calls (#119221)
This is a split-off from #109833 and only adds code relating to checking
if a struct-returning call can be vectorized.

This initial patch only allows the case where all users of the struct
return are `extractvalue` operations that can be widened.

```
%call = tail call { float, float } @foo(float %in_val)
%extract_a = extractvalue { float, float } %call, 0
%extract_b = extractvalue { float, float } %call, 1
```

Note: The tests require the VFABI changes from #119000 to pass.
2025-01-09 09:27:29 +00:00
Finn Plummer
45c01e8a33
[NFC][TargetTransformInfo][VectorUtils] Consolidate isVectorIntrinsic... api (#117635)
- update `VectorUtils:isVectorIntrinsicWithScalarOpAtArg` to use TTI for
all uses, to allow specifiction of target specific intrinsics
- add TTI to the `isVectorIntrinsicWithStructReturnOverloadAtField` api
- update TTI api to provide `isTargetIntrinsicWith...` functions and
  consistently name them
- move `isTriviallyScalarizable` to VectorUtils
  
- update all uses of the api and provide the TTI parameter

Resolves #117030
2024-12-19 11:54:26 -08:00
David Sherwood
c18fda02e1
[LoopVectorize] Use new single string variant of reportVectorizationFailure (#120414) 2024-12-19 10:07:13 +00:00
Florian Hahn
5fae408d3a
[VPlan] Dispatch to multiple exit blocks via middle blocks. (#112138)
A more lightweight variant of
https://github.com/llvm/llvm-project/pull/109193,
which dispatches to multiple exit blocks via the middle blocks.

The patch also introduces a bit of required scaffolding to enable
early-exit vectorization, including an option. At the moment, early-exit
vectorization doesn't come with legality checks, and is only used if the
option is provided and the loop has metadata forcing vectorization. This
is only intended to be used for testing during bring-up, with @david-arm
enabling auto early-exit vectorization plugging in the changes from
https://github.com/llvm/llvm-project/pull/88385.

PR: https://github.com/llvm/llvm-project/pull/112138
2024-12-11 21:11:05 +00:00
Ellis Hoag
6ab26eab4f
Check hasOptSize() in shouldOptimizeForSize() (#112626) 2024-10-28 09:45:03 -07:00
Graham Hunter
6f1a8c2da2
[LV] Vectorize histogram operations (#99851)
This patch implements autovectorization support for the 'all-in-one'
histogram intrinsic, which seems to have more support than the
'standalone' intrinsic. See
https://discourse.llvm.org/t/rfc-vectorization-support-for-histogram-count-operations/74788/
for an overview of the work and my notes on the tradeoffs between the
two approaches.
2024-09-27 13:08:55 +01:00
David Sherwood
f4eeae1244
[LoopVectorize] Address comments on PR #107004 left post-commit (#109300)
* Rename Speculative -> Uncountable and update tests.
* Add comments explaining why it's safe to ignore the predicates when
building up a list of exiting blocks.
* Reshuffle some code to do (hopefully) cheaper checks first.
2024-09-23 13:36:25 +01:00
David Sherwood
02ee96eca9
[Analysis] Teach isDereferenceableAndAlignedInLoop about SCEV predicates (#106562)
Currently if a loop contains loads that we can prove at compile time
are dereferenceable when certain conditions are satisfied the function
isDereferenceableAndAlignedInLoop will still return false because
getSmallConstantMaxTripCount will return 0 when SCEV predicates
are required. This patch changes getSmallConstantMaxTripCount to take
an optional Predicates pointer argument so that we can permit
functions such as isDereferenceableAndAlignedInLoop to consider more
cases.
2024-09-23 09:56:37 +01:00
Benjamin Kramer
57777a5066 [LoopVectorize] Silence unused variable warning 2024-09-19 11:01:58 +02:00
David Sherwood
e762d4dac7
[LoopVectorize] Teach LoopVectorizationLegality about more early exits (#107004)
This patch is split off from PR #88385 and concerns only the code
related to the legality of vectorising early exit loops. It is the first
step in adding support for vectorisation of a simple class of loops that
typically involves searching for something, i.e.

  for (int i = 0; i < n; i++) {
    if (p[i] == val)
      return i;
  }
  return n;

or

  for (int i = 0; i < n; i++) {
    if (p1[i] != p2[i])
      return i;
  }
  return n;

In this initial commit LoopVectorizationLegality will only consider
early exit loops legal for vectorising if they follow these criteria:

1. There are no stores in the loop.
2. The loop must have only one early exit like those shown in the above
example. I have referred to such exits as speculative early exits, to
distinguish from existing support for early exits where the
exit-not-taken count is known exactly at compile time.
3. The early exit block dominates the latch block.
4. The latch block must have an exact exit count.
5. There are no loads after the early exit block.
6. The loop must not contain reductions or recurrences. I don't see
anything fundamental blocking vectorisation of such loops, but I just
haven't done the work to support them yet.
7. We must be able to prove at compile-time that loops will not contain
faulting loads.

Tests have been added here:

  Transforms/LoopVectorize/AArch64/simple_early_exit.ll
2024-09-19 09:41:25 +01:00
ErikHogeman
78e1e6ace6
[LV] Check for vector-to-scalar casts in legalizer (#106244)
The code makes assumptions later on the operations and their inputs
being scalar in the loops that are processed, so we should make sure
this is the case in the legalizer.
2024-09-06 11:20:14 +02:00
Madhur Amilkanthwar
cd46829e54
[LV] Fix emission of debug message in legality check (#101924)
Successful vectorization message is emitted even
after "Result" is false. "Result" = false indicates failure of one of
the legality check and thus
successful message should not be printed.
2024-09-04 16:28:39 +05:30
Daniil Fukalov
0da2ba811a
[NFC] Cleanup in ADT and Analysis headers. (#104484)
Remove unused directly includes and forward declarations in ADT and
Analysis headers.
2024-08-17 13:11:18 +02:00
Florian Hahn
f0df4fbd0c
[LV] Support generating masks for switch terminators. (#99808)
Update createEdgeMask to created masks where the terminator in Src is a
switch. We need to handle 2 separate cases:

1. Dst is not the default desintation. Dst is reached if any of the
cases with destination == Dst are taken. Join the conditions for each
case where destination == Dst using a logical OR.
2. Dst is the default destination. Dst is reached if none of the cases
with destination != Dst are taken. Join the conditions for each case
where the destination is != Dst using a logical OR and negate it.

Edge masks are created for every destination of cases and/or 
default when requesting a mask where the source is a switch.

Fixes https://github.com/llvm/llvm-project/issues/48188.

PR: https://github.com/llvm/llvm-project/pull/99808
2024-08-11 20:38:36 +02:00
Florian Hahn
edf46f365c
[SCEV] Use const SCEV * explicitly in more places.
Use const SCEV * explicitly in more places to prepare for
https://github.com/llvm/llvm-project/pull/91961. Split off as suggested.
2024-08-03 20:10:01 +01:00
Ramkumar Ramachandra
e1a3aa8c0f
LV/Legality: fix style after cursory reading (NFC) (#100363) 2024-07-24 16:32:24 +01:00
Kazu Hirata
5c83498984
[Transforms] Use range-based for loops (NFC) (#99607) 2024-07-21 13:11:25 -07:00
Graham Hunter
22a7f6dcc4
Revert "[LV] Autovectorization for the all-in-one histogram intrinsic" (#98493)
Reverts llvm/llvm-project#91458 to deal with post-commit reviewer
requests.
2024-07-11 16:39:30 +01:00
Graham Hunter
1860fd049e
[LV] Autovectorization for the all-in-one histogram intrinsic (#91458)
This patch implements limited loop vectorization support for the 'all-in-one' histogram intrinsic. The feature is disabled by default, and when enabled will only vectorize if there are no other users of values in the gather-modify-scatter sequence.
2024-07-11 15:33:30 +01:00
Florian Hahn
0577cdaa32
[LV] Split checking if tail-folding is possible, collecting masked ops. (#77612)
Introduce new canFoldTail helper which only checks if tail-folding is
possible, but without modifying MaskedOps.

Just because tail-folding is possible doesn't mean the tail will be
folded; that's up to the cost-model to decide. Separating the check if
tail-folding is possible and preparing for tail-folding makes sure that
MaskedOps is only populated when tail-folding is actually selected.

PR: https://github.com/llvm/llvm-project/pull/77612
2024-07-08 16:34:42 +01:00
Nikita Popov
2d209d964a
[IR] Add getDataLayout() helpers to BasicBlock and Instruction (#96902)
This is a helper to avoid writing `getModule()->getDataLayout()`. I
regularly try to use this method only to remember it doesn't exist...

`getModule()->getDataLayout()` is also a common (the most common?)
reason why code has to include the Module.h header.
2024-06-27 16:38:15 +02:00
Florian Hahn
e949b54a5b
[LAA] Use PSE::getSymbolicMaxBackedgeTakenCount. (#93499)
Update LAA to use PSE::getSymbolicMaxBackedgeTakenCount which returns
the minimum of the countable exits.

When analyzing dependences and computing runtime checks, we need the
smallest upper bound on the number of iterations. In terms of memory
safety, it shouldn't matter if any uncomputable exits leave the loop,
as long as we prove that there are no dependences given the minimum of
the countable exits. The same should apply also for generating runtime
checks.

Note that this shifts the responsiblity of checking whether all exit
counts are computable or handling early-exits to the users of LAA.

Depends on https://github.com/llvm/llvm-project/pull/93498

PR: https://github.com/llvm/llvm-project/pull/93499
2024-06-04 22:23:30 +01:00
Florian Hahn
b54a78d69b
[LV,LAA] Don't vectorize loops with load and store to invar address.
Code checking stores to invariant addresses and reductions made an
incorrect assumption that the case of both a load & store to the same
invariant address does not need to be handled.

In some cases when vectorizing with runtime checks, there may be
dependences with a load and store to the same address, storing a
reduction value.

Update LAA to separately track if there was a store-store and a
load-store dependence with an invariant addresses.

Bail out early if there as a load-store dependence with invariant
address. If there was a store-store one, still apply the logic checking
if they all store a reduction.
2024-05-04 20:53:54 +01:00
Niwin Anto
eaf0d82529
[LV] Disable fold tail by masking when IV is used outside (#81609)
When induction variable are used outside the loop body, tail folding
by masking mis-compiles, because for users outside of the loop the
final value of the induction is computed separately from the vector
loop.

Fixes https://github.com/llvm/llvm-project/issues/76069
Fixes https://github.com/llvm/llvm-project/issues/51677
2024-03-04 11:33:30 +00:00
Graham Hunter
b070629c10
[LV] Increase max VF if vectorized function variants exist (#66639)
If there are function calls in the candidate loop and we have vectorized
variants available, try some wider VFs in case the conservative initial
maximum based on the widest types in the loop won't actually allow us
to make use of those function variants.
2023-11-13 10:27:10 +00:00
Simon Pilgrim
3ca4fe80d4 [Transforms] Use StringRef::starts_with/ends_with instead of startswith/endswith. NFC.
startswith/endswith wrap starts_with/ends_with and will eventually go away (to more closely match string_view)
2023-11-06 16:50:18 +00:00
Florian Hahn
aac8acb115
[VPlan] Model masked assumes as replicate recipes, drop them (NFCI).
Replace ConditionalAssume set by treating conditional assumes like other
predicated instructions (i.e. create a VPReplicateRecipe with a mask)
and later remove any assume recipes with masks during VPlan cleanup.

This reduces coupling of VPlan construction and Legal by removing a
shared set between the 2 and results in a cleaner code structure
overall.

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D157034
2023-08-06 20:56:24 +01:00