1018 Commits

Author SHA1 Message Date
Florian Hahn
7c34848ae1
[VPlan] Hoist loads with invariant addresses using noalias metadata. (#166247)
This patch implements a transform to hoists single-scalar replicated
loads with invariant addresses out of the vector loop to the preheader
when scoped noalias metadata proves they cannot alias with any stores in
the loop.

This enables hosting of loads we can prove do not alias any stores in
the loop due to memory runtime checks added during vectorization.

PR: https://github.com/llvm/llvm-project/pull/166247
2025-11-18 09:35:48 +00:00
Ramkumar Ramachandra
ef023cae38
Reland [VPlan] Expand WidenInt inductions with nuw/nsw (#168354)
Changes: The previous patch had to be reverted to a mismatching-OpType
assert in cse. The reduced-test has now been added corresponding to a
RVV pointer-induction, and the pointer-induction case has been updated
to use createOverflowingBinaryOp.

While at it, record VPIRFlags in VPWidenInductionRecipe.
2025-11-17 13:44:25 +00:00
Alex Bradbury
f2336d4c7e
Revert "[VPlan] Expand WidenInt inductions with nuw/nsw" (#168080)
Reverts llvm/llvm-project#163538

This is causing build failures on the two-stage RVV buildbots. e.g.
https://lab.llvm.org/buildbot/#/builders/214/builds/1363. I've shared a
reproducer and more information at
https://github.com/llvm/llvm-project/pull/163538#issuecomment-3533482822

This reverts commit 355e0f94af5adabe90ac57110ce1b47596afd4cd.
2025-11-14 16:11:48 +00:00
Ramkumar Ramachandra
355e0f94af
[VPlan] Expand WidenInt inductions with nuw/nsw (#163538)
While at it, record VPIRFlags in VPWidenInductionRecipe.
2025-11-14 12:10:55 +00:00
Florian Hahn
a6edeedbfa Revert "[LV] Use ExtractLane(LastActiveLane, V) live outs when tail-folding. (#149042)"
This reverts commit 62d1a080e69e3c5e98840e000135afa7c688a77b.

This appears to be causing some runtime failures on RISCV
https://lab.llvm.org/buildbot/#/builders/210/builds/5221
2025-11-13 22:34:55 +00:00
Ramkumar Ramachandra
9ba738af2c
[VPlan] Fix assert in store-user in narrowToSingleScalars (#167686)
Follow up on c2d4c7c18b96 ([VPlan] Permit more users in
narrowToSingleScalars) to fix an assert related to WidenStore users of
the recipe being narrowed in narrowToSingleScalars.
2025-11-12 17:53:24 +00:00
Florian Hahn
62d1a080e6
[LV] Use ExtractLane(LastActiveLane, V) live outs when tail-folding. (#149042)
Building on top of https://github.com/llvm/llvm-project/pull/148817,
introduce a new abstract LastActiveLane opcode that gets lowered to
Not(Mask) → FirstActiveLane(NotMask) → Sub(result, 1).

When folding the tail, update all extracts for uses outside the loop the
extract the value of the last actice lane.

See also https://github.com/llvm/llvm-project/issues/148603

PR: https://github.com/llvm/llvm-project/pull/149042
2025-11-12 15:11:00 +00:00
Luke Lau
bfd4155f23
[VPlan] Don't apply predication discount to non-originally-predicated blocks (#160449)
Split off from #158690. Currently if an instruction needs predicated due
to tail folding, it will also have a predicated discount applied to it
in multiple places.
This is likely inaccurate because we can expect a tail folded
instruction to be executed on every iteration bar the last.

This fixes it by checking if the instruction/block was originally
predicated, and in doing so prevents vectorization with tail folding
where we would have had to scalarize the memory op anyway.

On llvm-test-suite this causes 4 loops in total to no longer be
vectorized with -O3 on arm64-apple-darwin, and there's no observable
performance impact.
2025-11-10 12:10:40 +00:00
Ramkumar Ramachandra
2d1d5fe78e
[VPlan] Simplify branch-cond with getVectorTripCount (#155604)
Call getVectorTripCount first, and call getTripCount failing that, in
simplifyBranchConditionForVFAndUF, to simplify missed cases. While at
it, strip the dead check for a zero TC.
2025-11-10 10:43:37 +00:00
Florian Hahn
b0b4616790
[VPlan] Handle single-scalar conds in VPWidenSelectRecipe. (#165506)
Generalize VPWidenSelectRecipe codegen to consider single-scalar
conditions instead of just loop-invariant ones.

If the condition is a single-scalar, we can simply use a scalar
condition.

PR: https://github.com/llvm/llvm-project/pull/165506
2025-11-05 22:11:29 +00:00
David Sherwood
7b3fe5fd42
[LV][NFC] Remove undef values in some test cases (#164401)
Split off from PR #163525, this standalone patch replaces simple cases
where undef is used as a value for arithmetic or getelementptr
instructions. This will reduce the likelihood of contributors hitting
the `undef deprecator` warning in github.
2025-11-05 09:18:02 +00:00
Florian Hahn
af9a4263a1
[LAA] Only use inbounds/nusw in isNoWrap if the GEP is dereferenced. (#161445)
Update isNoWrap to only use the inbounds/nusw flags from GEPs that are
guaranteed to be dereferenced on every iteration. This fixes a case
where we incorrectly determine no dependence.

I think the issue is isolated to code that evaluates the resulting
AddRec at BTC, just using it to compute the distance between accesses
should still be fine; if the access does not execute in a given
iteration, there's no dependence in that iteration. But isolating the
code is not straight-forward, so be conservative for now. The practical
impact should be very minor (only one loop changed across a corpus with
27k modules from large C/C++ workloads.

Fixes https://github.com/llvm/llvm-project/issues/160912.

PR: https://github.com/llvm/llvm-project/pull/161445
2025-11-04 17:08:12 +00:00
Florian Hahn
b7e922a3da
[VPlan] Convert BuildVector with all-equal values to Broadcast. (#165826)
Fold BuildVector where all operands are equal to Broadcast of the first
operand. This will subsequently make it easier to remove additional
buildvectors/broadcasts, e.g. via
https://github.com/llvm/llvm-project/pull/165506.

PR: https://github.com/llvm/llvm-project/pull/165826
2025-11-01 17:28:42 -07:00
Florian Hahn
317b42ef5c
[VPlan] Remove original recipe after narrowing to single-scalar.
Directly remove RepOrWidenR after replacing all uses. Removing the dead
user early unlocks additional opportunities for further narrowing.
2025-10-31 04:38:16 +00:00
Florian Hahn
98d3a25f74
[VPlan] Don't preserve LCSSA in expandSCEVs. (#165505)
This follows similar reasoning as 45ce88758d24
(https://github.com/llvm/llvm-project/pull/159556):

LV does not preserve LCSSA, it constructs it just before processing a
loop to vectorize. Runtime check expressions are invariant to that loop,
so expanding them should not break LCSSA form for the loop we are about
to vectorize.

LV creates SCEV and memory runtime checks early on and then disconnects
the blocks temporarily. The patch fixes a mis-compile, where previously
LCSSA construction during SCEV expand may replace uses in currently
unreachable SCEV/memory check blocks.

Fixes https://github.com/llvm/llvm-project/issues/162512

PR: https://github.com/llvm/llvm-project/pull/165505
2025-10-29 18:25:46 +00:00
paperchalice
249883d0c5
[test][Transforms] Remove unsafe-fp-math uses part 2 (NFC) (#164786)
Post cleanup for #164534.
2025-10-23 20:31:31 +08:00
Florian Hahn
bfc322dd72
Revert "[VPlan] Run narrowInterleaveGroups during general VPlan optimizations. (#149706)"
This reverts commit 8d29d09309654541fb2861524276ada6a3ebf84c.

There have been reports of mis-compiles
in https://github.com/llvm/llvm-project/pull/149706.

Revert while I investigate.
2025-10-22 21:27:11 +01:00
Florian Hahn
8d29d09309
[VPlan] Run narrowInterleaveGroups during general VPlan optimizations. (#149706)
Move narrowInterleaveGroups to to general VPlan optimization stage.

To do so, narrowInterleaveGroups now has to find a suitable VF where all
interleave groups are consecutive and saturate the full vector width.

If such a VF is found, the original VPlan is split into 2:
 a) a new clone which contains all VFs of Plan, except VFToOptimize, and
 b) the original Plan with VFToOptimize as single VF.

The original Plan is then optimized. If a new copy for the other VFs has
been created, it is returned and the caller has to add it to the list of
candidate plans.

Together with https://github.com/llvm/llvm-project/pull/149702, this
allows to take the narrowed interleave groups into account when
computing costs to choose the best VF and interleave count.

One example where we currently miss interleaving/unrolling when
narrowing interleave groups is https://godbolt.org/z/Yz77zbacz

PR: https://github.com/llvm/llvm-project/pull/149706
2025-10-21 11:37:42 +01:00
David Sherwood
822c291aac
[LV][NFC] Remove undef from phi incoming values (#163762)
Split off from PR #163525, this standalone patch replaces
 use of undef as incoming PHI values with zero, in order
 to reduce the likelihood of contributors hitting the
 `undef deprecator` warning in github.
2025-10-21 10:49:27 +01:00
Florian Hahn
9317975a7a
[VPlan] Match legacy behavior w.r.t. using pointer phis as scalar addrs.
When the legacy cost model scalarizes loads that are used as addresses
for other loads and stores, it looks to phi nodes, if they are direct
address operands of loads/stores. Match this behavior in
isUsedByLoadStoreAddress, to fix a divergence between legacy and
VPlan-based cost model.
2025-10-20 11:09:25 +01:00
Nikita Popov
573ca36753
[IR] Replace alignment argument with attribute on masked intrinsics (#163802)
The `masked.load`, `masked.store`, `masked.gather` and `masked.scatter`
intrinsics currently accept a separate alignment immarg. Replace this
with an `align` attribute on the pointer / vector of pointers argument.

This is the standard representation for alignment information on
intrinsics, and is already used by all other memory intrinsics. This
means the signatures now match llvm.expandload, llvm.vp.load, etc.
(Things like llvm.memcpy used to have a separate alignment argument as
well, but were already migrated a long time ago.)

It's worth noting that the masked.gather and masked.scatter intrinsics
previously accepted a zero alignment to indicate the ABI type alignment
of the element type. This special case is gone now: If the align
attribute is omitted, the implied alignment is 1, as usual. If ABI
alignment is desired, it needs to be explicitly emitted (which the
IRBuilder API already requires anyway).
2025-10-20 08:50:09 +00:00
Florian Hahn
b9ce7656e9
[VPlan] Add VPInstruction to unpack vector values to scalars. (#155670)
Add a new Unpack VPInstruction (name to be improved) to explicitly
extract scalars values from vectors.

Test changes are movements of the extracts: they are no generated
together and also directly after the producer.

Depends on https://github.com/llvm/llvm-project/pull/155102 (included in
PR)

PR: https://github.com/llvm/llvm-project/pull/155670
2025-10-19 18:49:05 +00:00
Nikita Popov
8fa4a1029c [LoopVectorize] Regenerate test checks (NFC) 2025-10-16 18:21:42 +02:00
Ramkumar Ramachandra
b71515cc76
[VPlan] Extend licm to hoist assumes (#162636)
Assumes are safe to hoist if they're guaranteed to execute, since they
don't alias, and don't throw. This mirrors what the IR-LICM does.
2025-10-16 13:59:32 +00:00
David Sherwood
c48aa54656
[LV][NFC] Remove undef from function return values (#163578)
Split off from PR #163525, this standalone patch replaces `ret * undef`
returns with `ret void` in order to reduce the likelihood of
contributors hitting the `undef deprecator` warning in github.
2025-10-16 09:49:38 +01:00
Florian Hahn
ba69e33e13
[LV] Consistently apply address def scalarization across loop.
Consistently scalarize loads used as part of address computations across
all uses in the loop. This aligns the VPlan and legacy cost model and
fixes a divergence crash. It doesn't matter if the load and address
users are in different blocks, as long as they are in the same loop, the
scalar value can be used. This removes a number of insert/extracts.
2025-10-09 22:04:15 +01:00
Florian Hahn
98ce434870
[VPlan] Skip VPBlendRecipe in isUsedByLoadStoreAddress.
VPBlendRecipes are introduced as part of if-conversion, potentially adding
a def-use chain from a load used in a compare to another load/store. In
the scalar IR, there is no connection via def-use chains, so the legacy
cost model won't consider the load used by memory operation.

Skipping blends brings the VPlan-based cost-computation in line with the
legacy cost model after https://github.com/llvm/llvm-project/pull/162157.
2025-10-08 18:43:23 +01:00
Joel E. Denny
6d44b9082e
[LoopUnroll] Skip remainder loop guard if skip unrolled loop (#156549)
The original loop (OL) that serves as input to LoopUnroll has basic
blocks that are arranged as follows:

```
OLPreHeader
OLHeader <-.
...        |
OLLatch ---'
OLExit
```

In this depiction, every block has an implicit edge to the next block
below, so any explicit edge indicates a conditional branch.

Given OL and unroll count N, LoopUnroll sometimes creates an unrolled
loop (UL) with a remainder loop (RL) epilogue arranged like this:

```
,-- ULGuard
|   ULPreHeader
|   ULHeader <-.
|   ...        |
|   ULLatch ---'
|   ULExit
`-> RLGuard -----.
    RLPreHeader  |
,-> RLHeader     |
|   ...          |
`-- RLLatch      |
    RLExit       |
    OLExit <-----'
```

Each UL iteration executes N OL iterations, but each RL iteration
executes 1 OL iteration. ULGuard or RLGuard checks whether the first
iteration of UL or RL should execute, respectively. If so, ULLatch or
RLLatch checks whether to execute each subsequent iteration.

Once reached, OL always executes its first iteration but not necessarily
the next N-1 iterations. Thus, ULGuard is always required before the
first UL iteration. However, when control flows from ULGuard directly to
RLGuard, the first OL iteration has yet to execute, so RLGuard is then
redundant before the first RL iteration.

Thus, this patch makes the following changes:
- Adjust ULGuard to branch to RLPreHeader instead of RLGuard, thus
eliminating RLGuard's unnecessary branch instruction for that path.
- Eliminate the creation of RLGuard phi node poison values. Without this
patch, RLGuard has such a phi node for each value that is defined by any
OL iteration and used in OLExit. The poison value is required where
ULGuard is the predecessor. The poison value indicates that control flow
from ULGuard to RLGuard to Exit has no counterpart in OL because the
first OL iteration must execute either in UL or RL.
- Simplify the CFG by not splitting ULExit and RLGuard because, without
the ULGuard predecessor, the single block can now be a dedicated UL
exit.
- To RLPreHeader, add an `llvm.assume` call that asserts the RL trip
count is non-zero. Without this patch, RLPreHeader is reachable only
when RLGuard guarantees that assertion is true. With this patch, RLGuard
guarantees it only when RLGuard is the predecessor, and the OL structure
guarantees it when ULGuard is the predecessor. If RL itself is unrolled
later, this guarantee somehow prevents ScalarEvolution from giving up
when trying to compute a maximum trip count for RL. That maximum trip
count enables the branch instruction in the final unrolled instance of
RLLatch to be eliminated. Without the `llvm.assume` call, some existing
unroll tests start to fail because that instruction is not eliminated.

The original motivation for this patch is to facilitate later patches
that fix LoopUnroll's computation of branch weights so that they
maintain the block frequency of OL's body (see #135812). Specifically,
this patch ensures RLGuard's branch weights do not affect RL's
contribution to the block frequency of OL's body in the case that
ULGuard skips UL.
2025-10-07 10:45:49 -04:00
Florian Hahn
74af5784a5
Reapply "[VPlan] Compute cost of more replicating loads/stores in ::computeCost. (#160053)" (#162157)
This reverts commit f80c0baf058dbdc5 and 94eade61a02ae5.

Recommit a small fix for targets using prefersVectorizedAddressing.

Original message:
Update VPReplicateRecipe::computeCost to compute costs of more
replicating loads/stores.

There are 2 cases that require extra checks to match the legacy cost
model:
1. If the pointer is based on an induction, the legacy cost model passes
its SCEV to getAddressComputationCost. In those cases, still fall back
to the legacy cost. SCEV computations will be added as follow-up
2. If a load is used as part of an address of another load, the legacy
cost model skips the scalarization overhead. Those cases are currently
handled by a usedByLoadOrStore helper.

Note that getScalarizationOverhead also needs updating, because when the
legacy cost model computes the scalarization overhead, scalars have not
been collected yet, so we can't each for replicating recipes to skip
their cost, except other loads. This again can be further improved by
modeling inserts/extracts explicitly and consistently, and compute costs
for those operations directly where needed.

PR: https://github.com/llvm/llvm-project/pull/160053
2025-10-06 22:16:08 +01:00
Alexey Bataev
94eade61a0 Revert "[VPlan] Match legacy CM in ::computeCost if load is used by load/store."
This reverts commit 1d65d9ce06fef890389e61990d9c748162334e55 to fix
crashes, reported in the commits
2025-10-05 08:37:52 -07:00
Florian Hahn
1d65d9ce06
[VPlan] Match legacy CM in ::computeCost if load is used by load/store.
If a load is scalarized because it is used by a load/store address, the
legacy cost model does not pass ScalarEvolution to getAddressComputationCost.

Match the behavior in VPReplicateRecipe::computeCost.
2025-10-03 22:01:46 +01:00
Florian Hahn
1a850279c5
[LV] Re-compute cost of scalarized load users.
If there are direct memory op users of the newly scalarized load,
their cost may have changed because there's no scalarization
overhead for the operand. Update it.

This ensures assigning consistent costs to scalarized memory
instructions that themselves have scalarized memory instructions as
operands.
2025-10-01 22:30:18 +01:00
Alexey Bader
2d6e7ef567
[LV] Add additional tests for replicating load/store costs.
Includes test for https://github.com/llvm/llvm-project/issues/161404
2025-10-01 19:15:19 +01:00
Florian Hahn
8907b6d393
[VPlan] Remove original loop blocks if dead. (#155497)
Build on top of https://github.com/llvm/llvm-project/pull/154510 to
completely remove the blocks of dead scalar loops.

Depends on https://github.com/llvm/llvm-project/pull/154510. 

PR: https://github.com/llvm/llvm-project/pull/155497
2025-10-01 16:53:59 +00:00
Florian Hahn
45ce88758d
[LV] Don't preserve LCSSA in SCEVExpander for runtime checks. (#159556)
LV does not preserve LCSSA, it constructs it just before processing a
loop to vectorize. Runtime check expressions are invariant to that loop,
so expanding them should not break LCSSA form for the loop we are about
to vectorize.

This fixes a crash when discarding instructions generated when expanding
runtime checks, if the expansion introduces LCSSA phis for values from
other loops which are not in LCSSA form: we would introduce new LCSSA
phis and update all outside users, some of which are not created by the
expander and cannot be cleaned up.

Fixes https://github.com/llvm/llvm-project/issues/158259.

PR: https://github.com/llvm/llvm-project/pull/159556
2025-09-30 10:03:55 +00:00
Florian Hahn
8460dbb450
[VPlan] Mark VPInstruction::Broadcast as not reading/writing memory.
This enables additional DCE/CSE opportunities and ensures that we don't
end up with multiple redundant users of a VPInstruction using EVL. It
fixes a verifier error in the added test_3_inductions test.
2025-09-27 20:48:42 +01:00
Florian Hahn
78af056137
[VPlan] Run CSE closer to VPlan::execute. (#160572)
Additional CSE opportunities are exposed after converting to concrete
recipes/dissolving regions and materializing various expressions. Run
CSE later, to capitalize on some of the late opportunities.

PR: https://github.com/llvm/llvm-project/pull/160572
2025-09-26 09:38:58 +00:00
Florian Hahn
2016af5652
[VPlan] Create epilogue minimum iteration check in VPlan. (#157545)
Move creation of the minimum iteration check for the epilogue vector
loop to VPlan. This is a first step towards breaking up and moving
skeleton creation for epilogue vectorization to VPlan.

It moves most logic out of EpilogueVectorizerEpilogueLoop: the minimum
iteration check is created directly in VPlan, connecting the check
blocks from the main vector loop is done as post-processing. Next steps
are to move connecting and updating the branches from the check blocks
to VPlan, as well as updating the incoming values for phis.

Test changes are improvements due to folding of live-ins.

PR: https://github.com/llvm/llvm-project/pull/157545
2025-09-25 07:13:38 +00:00
Ramkumar Ramachandra
844e47c60c
[LV] Fixup a test after 66fd420 (#160505)
Follow up on 66fd420 ([LV] Don't ignore invariant stores when costing)
to regen a test with UTC.
2025-09-24 11:56:18 +00:00
Ramkumar Ramachandra
66fd42008a
[LV] Don't ignore invariant stores when costing (#158682)
Invariant stores of reductions are removed early in the VPlan
construction, and there is no reason to ignore them while costing.
2025-09-24 11:33:24 +01:00
Florian Hahn
81c0c7337d
[LV] Pass operand info to getMemoryOpCost in getMemInstScalarizationCost.
Pass operand info to getMemoryOpCost in getMemInstScalarizationCost.
This matches the behavior in VPReplicateRecipe::computeCost.
2025-09-19 21:03:38 +01:00
Florian Hahn
19659eec2b
[LV] Add additional test for replicating store costs.
Add tests for costing replicating stores with x86_fp80, scalarizing
costs after discarding interleave groups and cost when preferring vector
addressing.
2025-09-19 20:38:53 +01:00
Florian Hahn
50b9ca4dda
[VPlan] Simplify Plan's entry in removeBranchOnConst. (#154510)
After https://github.com/llvm/llvm-project/pull/153643, there may be a
BranchOnCond with constant condition in the entry block.

Simplify those in removeBranchOnConst. This removes a number of
redundant conditional branch from entry blocks.

In some cases, it may also make the original scalar loop unreachable,
because we know it will never execute. In that case, we need to remove
the loop from LoopInfo, because all unreachable blocks may dominate each
other, making LoopInfo invalid. In those cases, we can also completely
remove the loop, for which I'll share a follow-up patch.

Depends on https://github.com/llvm/llvm-project/pull/153643.

PR: https://github.com/llvm/llvm-project/pull/154510
2025-09-18 19:25:05 +01:00
Ramkumar Ramachandra
46fcece2a8
[VPlan] Extend CSE to eliminate GEPs (#156699)
The motivation for this patch is to close the gap between the
VPlan-based CSE and the legacy CSE, to make it easier to remove the
legacy CSE. Before this patch, stubbing out the legacy CSE leads to 22
test failures, and after this patch, there are only 12 failures, and all
of them seem to have a single root cause:
VPlanTransforms::createInterleaveGroups() and
VPInterleaveGroup::execute(). The improvements from this patch are of
course welcome.

While developing the patch, a miscompile was found when GEP
source-element-types differ, and this has been fixed.

Co-authored-by: Florian Hahn <flo@fhahn.com>
Co-authored-by: Luke Lau <luke@igalia.com>
2025-09-16 10:14:32 +00:00
Florian Hahn
1858532c48
[VPlan] Handle predicated UDiv in VPReplicateRecipe::computeCost.
Account for predicated UDiv,SDiv,URem,SRem in
VPReplicateRecipe::computeCost: compute costs of extra phis and apply
getPredBlockCostDivisor.

Fixes https://github.com/llvm/llvm-project/issues/158660
2025-09-15 21:46:50 +01:00
Florian Hahn
4949cb4a5e
[VPlan] Track VPValues instead of VPRecipes in calculateRegisterUsage. (#155301)
Update calculateRegisterUsageForPlan to track live-ness of VPValues
instead of recipes. This gives slightly more accurate results for
recipes that define multiple values (i.e. VPInterleaveRecipe).

When tracking the live-ness of recipes, all VPValues defined by an
VPInterleaveRecipe are considered alive until the last use of any of
them. When tracking the live-ness of individual VPValues, we can
accurately track the individual values until their last use.

Note the changes in large-loop-rdx.ll and pr47437.ll. This patch
restores the original behavior before introducing VPlan-based liveness
tracking.

PR: https://github.com/llvm/llvm-project/pull/155301
2025-09-15 20:55:11 +01:00
Antonio Frighetto
370607065d
[llvm] Regenerate test checks including TBAA semantics (NFC)
Tests exercizing TBAA metadata (both purposefully and not), and
previously generated via UTC, have been regenerated and updated
to version 6.
2025-09-12 20:01:17 +02:00
Joel E. Denny
0e3c5566c0
[PGO] Add llvm.loop.estimated_trip_count metadata (#152775)
This patch implements the `llvm.loop.estimated_trip_count` metadata
discussed in [[RFC] Fix Loop Transformations to Preserve Block
Frequencies](https://discourse.llvm.org/t/rfc-fix-loop-transformations-to-preserve-block-frequencies/85785).
As the RFC explains, that metadata enables future patches, such as PR
#128785, to fix block frequency issues without losing estimated trip
counts.
2025-09-11 15:55:18 -04:00
Florian Hahn
1efa997317
[VPlan] Handle stores to single-scalar addr in narrowToSingleScalars.
Move handling of stores to single-scalar/uniform address from
replicateByVF to narrowToSingleScalar.
2025-09-10 21:58:29 +01:00
Nikita Popov
a301e1a895
[InstCombine] Split GEPs with multiple non-zero offsets (#151333)
Split GEPs that have more than one non-zero offset into two GEPs. This
is in preparation for the ptradd migration, which can only represent
such GEPs.

This also enables CSE and LICM of the common base.
2025-09-10 16:51:58 +02:00