34818 Commits

Author SHA1 Message Date
Peter Collingbourne
75bb30ddbf
Move {load,store}(llvm.protected.field.ptr) lowering to InstCombine.
The previous position of llvm.protected.field.ptr lowering for loads
and stores was problematic as it not only inhibited optimizations such
as DSE (as stores to a llvm.protected.field.ptr were not considered to
must-alias stores to the non-protected.field pointer) but also required
changes to other optimization passes to avoid transformations that would
reduce PFP coverage.

Address this by moving the load/store part of the lowering to
InstCombine, where it will run earlier than the PFP-breaking and
AA-relying transformations. The deactivation symbol, null comparison
and EmuPAC parts of the lowering remain in PreISelLowering.

Now that the transformation inhibitions are no longer needed, remove them
(i.e. partially revert #151649, and revert #182976).

This change resulted in a 2.4% reduction in Fleetbench .text size and
the following improvements to PFP performance overhead for BM_PROTO_Arena
on various microarchitectures:

                    before   after
  Apple M2 Ultra     3.5%    3.3%
  Google Axion C4A   3.3%    2.9%
  Google Axion N4A   2.7%    2.2%

Reviewers: fmayer, nikic, vitalybuka

Reviewed By: fmayer

Pull Request: https://github.com/llvm/llvm-project/pull/186548
2026-04-06 17:47:24 -07:00
Arthur Eubanks
82505fbfc8
[Inliner] Put inline history into IR as !inline_history metadata (#190700)
(Reland of #190092 with verifier change to look through GlobalAliases)

So that it's preserved across all inline invocations rather than just
one inliner pass run.

This prevents cases where devirtualization in the simplification
pipeline uncovers inlining opportunities that should be discarded due to
inline history, but we dropped the inline history between inliner pass
runs, causing code size to blow up, sometimes exponentially.

For compile time reasons, we want to limit this to only call sites that
have the potential to inline through SCCs, potentially with the help of
devirtualization. This means that the callee is in a non-trivial
(Ref)SCC, or the call site was previously an indirect call, which can
potentially be devirtualized to call any function.

The CGSCCUpdater::InlinedInternalEdges logic still seems to be relevant
even with this change, as monster_scc.ll blows up if I remove that code.


http://llvm-compile-time-tracker.com/compare.php?from=e830d88e8ae5f44a97cc76136a0a4e83aa9157c0&to=ed535e732fc41b79ab8efda2417886cbd0812f7f&stat=instructions:u

Fixes #186926.
2026-04-06 17:31:43 -07:00
Wei Wang
1ae179b325
[SampleProfileMatcher] Fix backward matching of non-anchor locations (#190118)
The backward matching loop in `matchNonCallsiteLocs` was ineffective
because `InsertMatching` used `std::unordered_map::insert()` which does
not overwrite existing entries. Since forward matching already inserted
entries for all non-anchor locations, the backward matching for the
second half was silently ignored.

The backward matching can update forward mappings in
`IRToProfileLocationMap` in 2 ways:
- The IR location maps a new different profile location. Change
`insert()` to `insert_or_assign()` so that entry overwrite can happen.
- The IR location maps the same profile location. Add `erase()` to
remove such mapping.
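
The difference between the two container calls can be illustrated with a minimal sketch. It assumes plain `int` line numbers in place of the real location type, and the helper names `recordForward` and `recordBackward` are hypothetical; reading "maps the same profile location" as "IR location equals profile location" is also an assumption of this sketch.

```cpp
#include <cassert>
#include <unordered_map>

// Hypothetical stand-in: integer line numbers instead of LLVM's location type.
using LocationMap = std::unordered_map<int, int>;

// Forward pass: insert() never overwrites, so the first mapping wins.
// Returns true only if a new entry was added.
bool recordForward(LocationMap &Map, int IRLoc, int ProfileLoc) {
  return Map.insert({IRLoc, ProfileLoc}).second;
}

// Backward pass: replace the forward mapping, or drop it when the IR location
// maps to the same profile location and therefore needs no entry (assumption).
void recordBackward(LocationMap &Map, int IRLoc, int ProfileLoc) {
  if (IRLoc == ProfileLoc) {
    Map.erase(IRLoc);
    return;
  }
  Map.insert_or_assign(IRLoc, ProfileLoc);
  assert(Map.at(IRLoc) == ProfileLoc && "backward mapping must win");
}
```

With `insert()` alone, a second call for an existing key is silently a no-op, which matches the ineffective backward loop described above.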
2026-04-06 15:21:31 -07:00
Arthur Eubanks
70d3dcaa64
Revert "[Inliner] Put inline history into IR as !inline_history metadata" (#190666)
Reverts llvm/llvm-project#190092

Crashes reported in
https://github.com/llvm/llvm-project/pull/190092#issuecomment-4194546908
2026-04-06 20:31:54 +00:00
Kirill Stoimenov
cdbb1f5014
Revert "[InstCombine] Fix #163110: Support peeling off matching shifts from icmp operands via canEvaluateShifted" (#190638)
Reverts llvm/llvm-project#165975

Breaks Sanitizer bots:
https://lab.llvm.org/buildbot/#/builders/52/builds/16329
2026-04-06 11:30:36 -07:00
vporpo
8d442bc5b5
[SandboxVec][LoadStoreVec] Add support for constants (#189769)
Up until now the pass would only vectorize load-store pairs. This patch
implements vectorization of constant-store pairs.
2026-04-06 11:25:20 -07:00
Arthur Eubanks
72d4ce9889
[Inliner] Put inline history into IR as !inline_history metadata (#190092)
So that it's preserved across all inline invocations rather than just
one inliner pass run.

This prevents cases where devirtualization in the simplification
pipeline uncovers inlining opportunities that should be discarded due to
inline history, but we dropped the inline history between inliner pass
runs, causing code size to blow up, sometimes exponentially.

For compile time reasons, we want to limit this to only call sites that
have the potential to inline through SCCs, potentially with the help of
devirtualization. This means that the callee is in a non-trivial
(Ref)SCC, or the call site was previously an indirect call, which can
potentially be devirtualized to call any function.

The CGSCCUpdater::InlinedInternalEdges logic still seems to be relevant
even with this change, as monster_scc.ll blows up if I remove that code.


http://llvm-compile-time-tracker.com/compare.php?from=e830d88e8ae5f44a97cc76136a0a4e83aa9157c0&to=ed535e732fc41b79ab8efda2417886cbd0812f7f&stat=instructions:u

Fixes #186926.
2026-04-06 10:24:41 -07:00
Domenic Nutile
5b33f85a08
[AMDGPU] Change isSingleLaneExecution to account for WWM enabling lanes even if there's only one workitem (#188316)
This issue was discovered during some downstream work around Vulkan CTS
tests, specifically
`dEQP-VK.subgroups.arithmetic.compute.subgroupadd_float`

---------

Co-authored-by: Matt Arsenault <arsenm2@gmail.com>
2026-04-06 12:51:46 -04:00
Matt Arsenault
bf2a97a0dd
AMDGPU: Add range attribute to mbcnt intrinsic callsites (#189191)
It seems the known bits handling added in
686987a540bc176bceaad43ffe530cb3e88796d5 is insufficient to perform many
range based optimizations. For some reason computeConstantRange doesn't
fall back on KnownBits, and has a separate, less used form which tries to
use computeKnownBits.
2026-04-06 14:40:54 +00:00
Max Graey
c4281fd5af
[Support][ValueTracking] Improve KnownFPClass for fadd. Handle infinity signs (#190559)
Improve KnownFPClass reasoning for fadd:

- Refine NaN handling for infinities by checking opposite-sign cases:
  - `-inf` + `+inf` --> `nan`
  - `+inf` + `-inf` --> `nan`
  - `+inf` + `+inf` --> `+inf`
  - `-inf` + `-inf` --> `-inf`
- Introduce `cannotBeOrderedLessEqZero` as pair to
`cannotBeOrderedGreaterEqZero`.
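
The four infinity cases listed above follow IEEE 754 addition and can be checked on ordinary `double` values. This is only a sketch of the arithmetic facts; `addsOppositeInfinities` and `checkedAdd` are hypothetical helpers and do not mirror the KnownFPClass API.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

constexpr double Inf = std::numeric_limits<double>::infinity();

// True when A + B is NaN solely because two infinities of opposite sign meet.
bool addsOppositeInfinities(double A, double B) {
  return std::isinf(A) && std::isinf(B) && std::signbit(A) != std::signbit(B);
}

// Adds two values and checks the opposite-sign rule against the real result.
double checkedAdd(double A, double B) {
  const double Sum = A + B;
  if (addsOppositeInfinities(A, B))
    assert(std::isnan(Sum) && "opposite-sign infinities must produce NaN");
  return Sum;
}
```

Same-sign infinities keep their sign, so a NaN result from infinite operands needs both signs to be possible.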
2026-04-06 16:23:20 +02:00
Florian Hahn
0403639667
[VPlan] Skip successors outside any loop when updating LoopInfo. (#190553)
Successors outside of any loop do not contribute to the innermost loop,
skip them to avoid incorrect results due to
getSmallestCommonLoop(nullptr, X) returning nullptr.
2026-04-06 12:58:41 +01:00
陈子昂
05ff170026
[InstCombine] Fix #163110: Support peeling off matching shifts from icmp operands via canEvaluateShifted (#165975)
Consider a pattern like `icmp (shl nsw X, L), (add nsw (shl nsw Y, L),
K)`. When the constant K is a multiple of 2^L, this can be simplified to
`icmp X, (add nsw Y, K >> L)`.
This patch extends canEvaluateShifted to support `Instruction::Add` and
updates its signature to accept `Instruction::BinaryOps` instead of a
boolean. This change allows the function to distinguish between LShr and
AShr requirements, ensuring that information is preserved according to
the signedness and overflow flags (nsw/nuw) of the operands.
The logic is integrated into `foldICmpCommutative` to enable peeling off
matching shifts from both sides of a comparison even when an offset is
present.

Fixes: #163110
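
The equivalence behind the fold can be sketched on plain integers. The example below assumes a signed less-than predicate and inputs small enough that nothing overflows (the nsw assumption); `compareShifted` and `comparePeeled` are hypothetical names, and multiplication by 2^L stands in for `shl`.

```cpp
#include <cassert>
#include <cstdint>

// Original form: icmp slt (shl nsw X, L), (add nsw (shl nsw Y, L), K).
// Multiplication by 2^L models the shift so negative inputs stay well defined;
// callers must keep every intermediate in range (the nsw assumption).
bool compareShifted(int32_t X, int32_t Y, int32_t K, unsigned L) {
  const int32_t Scale = int32_t{1} << L;
  return X * Scale < Y * Scale + K;
}

// Peeled form: icmp slt X, (add nsw Y, K >> L), valid when 2^L divides K.
bool comparePeeled(int32_t X, int32_t Y, int32_t K, unsigned L) {
  const int32_t Scale = int32_t{1} << L;
  assert(K % Scale == 0 && "K must be a multiple of 2^L");
  return X < Y + K / Scale;
}
```

Dividing both sides by 2^L preserves the ordering only because no bits are lost, which is why the multiple-of-2^L condition on K matters.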
2026-04-06 13:44:17 +02:00
Florian Hahn
58208a0cc1
[LV] Additional epilogue tests for find-iv and with uses of IV.(NFC) (#190548)
Additional test coverage for loops not yet supported, with sinkable
find-iv expressions (github.com/llvm/llvm-project/pull/183911) and uses
of the IV.

PR: https://github.com/llvm/llvm-project/pull/190548
2026-04-05 20:42:11 +00:00
Alexey Bataev
eaf0135b77
[SLP][NFC] Fix run line for the test, fix test name
Reviewers: 

Pull Request: https://github.com/llvm/llvm-project/pull/190537
2026-04-05 12:20:43 -04:00
Elvis Wang
47cd798670
Revert "[LV] Enable scalable FindLast on RISCV." (#190463)
Reverts llvm/llvm-project#184931 since it crashes llvm-test-suite.
https://lab.llvm.org/buildbot/#/builders/210/builds/9807
2026-04-04 23:03:11 +08:00
Elvis Wang
a955b3caba
[LV] Enable scalable FindLast on RISCV. (#184931)
This patch enables FindLast reduction vectorization with scalable vectors
on RISCV.
2026-04-04 18:58:58 +08:00
Antonio Frighetto
d27cbc5fa9
[DSE] Introduce eliminateRedundantStoresViaDominatingConditions (#181709)
While optimizing tautological assignments, if there exists a dominating
condition that implies the value being stored in a pointer, and such a
condition appears in a node that dominates the store via equality edge,
then subsequent stores may be redundant, if no write occurs in between.
This is achieved via a DFS top-down walk of the dom-tree, collecting
dominating conditions and propagating them to each subtree, popping them
upon backtracking.

This also generalizes `dominatingConditionImpliesValue` transform, which
was previously taking into account only the immediate dominator.
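
A source-level illustration of the kind of store this targets, as a sketch only: the function names are hypothetical, and whether the pass handles this exact shape is an assumption. The guarding condition `*P == 0` sits in a block that dominates the store but is not its immediate dominator.

```cpp
#include <cassert>

// Before: the inner store writes a value the outer condition already implies.
int guardedStoreBefore(int *P, bool Flag) {
  assert(P && "sketch expects a valid pointer");
  if (*P == 0) {
    if (Flag)
      *P = 0; // redundant: *P == 0 holds here and nothing wrote in between
    return 1;
  }
  return 0;
}

// After: the same observable behaviour with the redundant store removed.
int guardedStoreAfter(int *P, bool Flag) {
  assert(P && "sketch expects a valid pointer");
  (void)Flag; // the flag only guarded the removed store
  return *P == 0 ? 1 : 0;
}
```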

Compile-time:
https://llvm-compile-time-tracker.com/compare.php?from=f8906704104e446a7482aeca32d058b91867e05c&to=24c5d61f1e28acbe6a59ea4e9a5da0ffcee3bf1a&stat=instructions:u.

Compile-time w/ limit on recursion:
https://llvm-compile-time-tracker.com/compare.php?from=24c5d61f1e28acbe6a59ea4e9a5da0ffcee3bf1a&to=9889567fe8a0515ab895b22003c93fabfd9ac4e5&stat=instructions:u.
Seems to alleviate the small regression in stage2-O3, but seemingly adds
one in stage2-O0-g.
2026-04-04 10:55:25 +00:00
Antonio Frighetto
48c59d1a97
[DSE] Introduce tests for PR181709 (NFC) (#190454) 2026-04-04 12:23:17 +02:00
Florian Hahn
093c6391b2
[LV] Add additional tests with IV live-outs. (NFC) (#190395)
Add additional tests with IV live-out users, for which epilogue
vectorization is not enabled yet.

Also modernize check lines.
2026-04-04 10:20:04 +01:00
Joseph Huber
8a8434f22a
[OpenMP] Move alloc / free shared from TLI to alloc tags (#190365)
Summary:
Allocation kinds were added after these were introduced. We only needed
the TLI to identify these in the attributor so we can now just use
attributes. Update the usage in OpenMP and drop the TLI interface.

Fixes: https://github.com/llvm/llvm-project/issues/190072
2026-04-03 15:15:48 -05:00
Wei Wang
f33e9faa5d
[SampleProfile] Fix FuncMappings key mismatch for renamed functions in stale profile matching (#187899)
Fix a bug where `distributeIRToProfileLocationMap` fails to find
location mappings from IR to profile for renamed functions because
`FuncMappings` is indexed by the IR function name while
`distributeIRToProfileLocationMap` looks up by the profile function
name. Fixed by making `FuncMappings` use the profile function name as
its key.
2026-04-03 11:38:51 -07:00
Sander de Smalen
730a07f225
[LV] Only create partial reductions when profitable. (#181706)
We want the LV cost-model to make the best possible decision of VF and
whether or not to use partial reductions. At the moment, when the LV can
use partial reductions for a given VF range, it assumes those are always
preferred. After transforming the plan to use partial reductions, it
then chooses the most profitable VF. It is possible for a different VF
to have been more profitable, if it wouldn't have chosen to use partial
reductions.

This PR changes that, to first decide whether partial reductions are
more profitable for a given chain. If not, then it won't do the
transform.

This causes some regressions for AArch64 which are addressed in a
follow-up PR to keep this one simple.
2026-04-03 17:42:51 +01:00
hjagasiaAMD
a76750e6de
Revert "[SimplifyCFG] Extend jump-threading to allow live local defs … (#190269)
…(#135079)"

This reverts commit a757f23404c594f4a48b4ddb6625f88b349d11d5. The commit
causes compilation of the reduce.cu file in hipcub/warp to go from 2
minutes to several hours.
2026-04-02 23:05:26 +00:00
Valeriy Savchenko
75a354ba55
[InstCombine] Use ComputeNumSignBits in isKnownExactCastIntToFP (#190235)
For signed int-to-FP casts, ComputeNumSignBits can prove exactness where
computeKnownBits cannot -- e.g. through ashr(shl x, a), b where sign propagation is
tracked precisely but individual known bits are all unknown.
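
Why sign bits help can be seen on a concrete instance, sketched below under stated assumptions: the shift amounts 16 and 8 are picked for illustration, `float` is taken to have a 24-bit significand, and `shlThenAshr` and `isExactInFloat` are hypothetical helpers rather than LLVM functions.

```cpp
#include <cassert>
#include <cstdint>

// Models ashr(shl X, 16), 8 on i32. Every low bit of the result depends on X,
// yet the top 9 bits are copies of the sign, so the value lies in [-2^23, 2^23).
int32_t shlThenAshr(int32_t X) {
  const uint32_t Shifted = static_cast<uint32_t>(X) << 16;
  // Signed right shift is arithmetic (guaranteed since C++20).
  const int32_t Result = static_cast<int32_t>(Shifted) >> 8;
  assert(Result >= -(1 << 23) && Result < (1 << 23));
  return Result;
}

// True when converting V to float and back loses nothing.
bool isExactInFloat(int32_t V) {
  return static_cast<int64_t>(static_cast<float>(V)) == V;
}
```

A value confined to that range always converts exactly, whereas an arbitrary `int32_t` such as 2^24 + 1 does not.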
2026-04-02 21:04:29 +01:00
Matt Arsenault
2ec19b86b5
ValueTracking: x - floor(x) cannot introduce overflow (#189003)
This returns a value with an absolute value less than 1 so it
should be possible to propagate no-infs.
2026-04-02 19:26:22 +00:00
Florian Hahn
8e61085291
[Matrix] Place allocas in function entry. (#190032)
Create allocas for temporary matrixes in the function entry. Limit the
lifetime via lifetime.start & lifetime.end. This avoids dynamic allocas.

Improvement suggested in
https://github.com/llvm/llvm-project/pull/188721.

PR: https://github.com/llvm/llvm-project/pull/190032
2026-04-02 17:36:13 +00:00
Alexey Bataev
c2f97c5917
[SLP] Do not skip tiny trees with gathered loads to vectorize
The isTreeTinyAndNotFullyVectorizable check for 2-node trees
(insertelement root + gather child) was too aggressive: it rejected
trees even when LoadEntriesToVectorize was non-empty, preventing
gathered loads from being vectorized into masked loads/strided loads, etc.

Reviewers: hiraditya, RKSimon

Pull Request: https://github.com/llvm/llvm-project/pull/190181
2026-04-02 09:47:01 -04:00
Alexey Bataev
dc2d25f80b
Revert "[SLP] Do not skip tiny trees with gathered loads to vectorize"
This reverts commit 94ec7ffa46d351b86fbbe3a445ceef37f331c4a2 to fix
reported issue https://github.com/llvm/llvm-project/pull/190040#issuecomment-4177827078

Reviewers: 

Pull Request: https://github.com/llvm/llvm-project/pull/190176
2026-04-02 09:26:31 -04:00
Alexey Bataev
94ec7ffa46
[SLP] Do not skip tiny trees with gathered loads to vectorize
The isTreeTinyAndNotFullyVectorizable check for 2-node trees
(insertelement root + gather child) was too aggressive: it rejected
trees even when LoadEntriesToVectorize was non-empty, preventing
gathered loads from being vectorized into masked loads/strided loads, etc.

Reviewers: RKSimon, hiraditya

Pull Request: https://github.com/llvm/llvm-project/pull/190040
2026-04-02 06:47:53 -04:00
Gabriel Baraldi
5e0a06b34d
Move ExpandMemCmp and MergeIcmp to the middle end (#77370)
Moving these into the middle-end pipeline will allow for additional
optimization of the expansion result, such as CSE of redundant loads
(c.f. https://godbolt.org/z/bEna4Md9r). For now, we conservatively place
the passes at the end of the middle-end pipeline, so we mostly don't
benefit from additional optimizations yet. The pipeline position will be
moved in a future change.

This builds on work done by legrosbuffle in
https://reviews.llvm.org/D60318.

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 09:57:00 +02:00
Weibo He
7ccd1cb9a4
Reland "[CoroSplit] Erase trivially dead allocas after spilling (#189295)" (#190124)
The original PR contained a use-after-delete issue, which has been
resolved in #189521.

Reland #189295, which was reverted in #189311.
2026-04-02 07:45:13 +00:00
Nikita Popov
1662c200a5
[Passes][LoopRotate] Move minsize handling fully into pass (#189956)
Make this dependent only on the minsize attribute and drop the pipeline
handling.

Rename the enable-loop-header-duplication option to
enable-loop-header-duplication-at-minsize to clarify that it controls
header duplication at minsize only (in other cases it is enabled by
default, independently of this option).
2026-04-02 09:32:56 +02:00
Nikita Popov
40e7fa632d
[Passes][FuncSpec] Move optsize/minsize handling into pass (#189952)
Instead of using the Os/Oz level during pass pipeline construction,
query the optsize/minsize attribute on the function to determine whether
specialization is allowed to take place. This ensures consistent
behavior for per-function attributes.

It's worth noting that FuncSpec *already* checks for minsize, but at the
call-site level.
2026-04-02 09:32:39 +02:00
Hans Wennborg
3b81be803f
WholeProgramDevirt: Import/export the CVP byte directly in the summary (#188979)
rather than using absolute symbol constants on ELF/x86.

This leads to better codegen as the absolute symbol constants were not
resolved until link time (see bug for example).

Fixes #188470
2026-04-02 09:28:32 +02:00
Alexey Bataev
c6669c4993
[SLP] Guard FMulAdd conversion to require single-use/non-reordered FMul operands
The FMulAdd (CombinedVectorize) transformation in transformNodes() marks
an FMul child entry with zero cost, assuming it is fully absorbed into
the fmuladd intrinsic. However, when any FMul scalar has multiple uses
(e.g., also stored separately), the FMul must survive as a separate
node.

Reviewers: hiraditya, RKSimon, bababuck

Pull Request: https://github.com/llvm/llvm-project/pull/189692
2026-04-01 17:14:52 -04:00
Matt Arsenault
6c923741c1
ValueTracking: llvm.amdgcn.fract cannot introduce overflow (#189002)
This returns a value with an absolute value less than 1.
2026-04-01 21:07:57 +00:00
Valeriy Savchenko
33ca7a4667
[LICM] Reassociate add/sub expressions to hoist invariant computations (#183082)
While `sub` is not associative, we can still reassociate `add` and
`sub`.

## Alive2 proofs

| Case | Transform | Proof |
|------|-----------|-------|
| 1 | `(x + c1) - c2` => `x + (c1 - c2)` |
[proof](https://alive2.llvm.org/ce/z/iofzYy) |
| 2 | `(x - c1) - c2` => `x - (c1 + c2)` |
[proof](https://alive2.llvm.org/ce/z/U4K_tE) |
| 3 | `(x - c1) + c2` => `x + (c2 - c1)` |
[proof](https://alive2.llvm.org/ce/z/moiJVw) |
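
The three identities in the table hold for wrapping (modular) arithmetic, which can be spot-checked with unsigned integers. This sketch covers only the flag-free case; how the patch treats nsw/nuw flags is left to the Alive2 proofs, and all function names here are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Case 1: (x + c1) - c2  =>  x + (c1 - c2)
uint32_t case1Before(uint32_t X, uint32_t C1, uint32_t C2) { return (X + C1) - C2; }
uint32_t case1After(uint32_t X, uint32_t C1, uint32_t C2) { return X + (C1 - C2); }

// Case 2: (x - c1) - c2  =>  x - (c1 + c2)
uint32_t case2Before(uint32_t X, uint32_t C1, uint32_t C2) { return (X - C1) - C2; }
uint32_t case2After(uint32_t X, uint32_t C1, uint32_t C2) { return X - (C1 + C2); }

// Case 3: (x - c1) + c2  =>  x + (c2 - c1)
uint32_t case3Before(uint32_t X, uint32_t C1, uint32_t C2) { return (X - C1) + C2; }
uint32_t case3After(uint32_t X, uint32_t C1, uint32_t C2) { return X + (C2 - C1); }

// Asserts that all three rewrites agree for one choice of operands.
void checkReassociation(uint32_t X, uint32_t C1, uint32_t C2) {
  assert(case1Before(X, C1, C2) == case1After(X, C1, C2));
  assert(case2Before(X, C1, C2) == case2After(X, C1, C2));
  assert(case3Before(X, C1, C2) == case3After(X, C1, C2));
}
```

In each "after" form the parenthesised term involves only `c1` and `c2`, so when both are loop-invariant it can be computed once outside the loop.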
2026-04-01 18:29:40 +01:00
Ramkumar Ramachandra
82e8494070
[VPlan] Avoid unnecessary BTC SymbolicValue creation (NFC) (#189929)
Don't unnecessarily create a backedge-taken-count SymbolicValue. This
allows us to simplify some code.
2026-04-01 16:25:48 +00:00
Florian Hahn
0b61cd39e4
[LV] Add epilogue minimum iteration check in VPlan as well. (#189372)
Update LV to also use the VPlan-based addMinimumIterationCheck for the
iteration count check for the epilogue.

As the VPlan-based addMinimumIterationCheck uses VPExpandSCEV, those
need to be placed in the entry block for now, moving vscale * VF * IC to
the entry for scalable vectors.

The new logic also fails to simplify some checks involving PtrToInt,
because they were only simplified when going through generated IR, then
folding some PtrToInt in IR, then constructing SCEVs again. But those
should be cleaned up by later combines, and there is not really much we
can do other than trying to go through IR.

PR: https://github.com/llvm/llvm-project/pull/189372
2026-04-01 15:47:41 +01:00
David Green
fd40c60665
[VectorCombine] Fix transitive Uses in foldShuffleToIdentity (#188989)
The Uses in foldShuffleToIdentity is intended to detect where an operand
is used to distinguish between splats, identities and concats of the
same value. When looking through multiple unsimplified shuffles the same
Use could be both a splat and an identity though. This patch changes the
Use to a Value and an original Use, so that even if we are looking
through multiple vectors we recognise the splat vs identity vs concat of
each use correctly.

Fixes #180338
2026-04-01 14:53:04 +01:00
Nikita Popov
31edb8fab5
[LICM] Generate test checks (NFC) (#189963) 2026-04-01 15:52:17 +02:00
Lewis Crawford
ce78c16738
SROA: Fix tree merge IRBuilder insert point (#189680)
StoreInfos is sorted by slice offset, not program order. Anchoring the
IRBuilder at StoreInfos.back() could emit shufflevectors before SSA
values defined later in the same block (invalid IR).

Insert merged shuffles immediately before TheLoad when the load shares
the store block. When the load is elsewhere, insert before the store
block terminator so the merge runs after every store + any trailing
instructions in that block.
2026-04-01 14:45:05 +01:00
Henry Jiang
bf50489eeb
[Pseudoprobe][MachO] Enable pseudo probes emission for MachO (#185758)
Enable pseudo probes emission for MachO. Due to the 16 character limit
of MachO segment and section, the file sections will be
`__PSEUDO_PROBE,__probes` and `__PSEUDO_PROBE,__probe_descs`.
2026-03-31 16:27:58 -07:00
Alexey Bataev
c20e233020
[SLP] Replace TrackedToOrig DenseMap with parallel SmallVector in reduction
Replace the DenseMap<Value*, Value*> TrackedToOrig with a SmallVector<Value*>
indexed in parallel with Candidates. This avoids hash-table overhead for the
tracked-value-to-original-value mapping in horizontal reduction processing.

Fixes #189686
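
The data-structure change can be sketched with standard containers. This is an illustration only: `std::vector` stands in for `SmallVector`, strings stand in for `Value*`, and the `TrackedCandidates` type with its members is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Two vectors kept in lockstep: TrackedToOrig[I] is the original value for
// Candidates[I], so a lookup is an index instead of a hash-table probe.
struct TrackedCandidates {
  std::vector<std::string> Candidates;
  std::vector<std::string> TrackedToOrig;

  // Appends a tracked value and its original; returns the shared index.
  std::size_t add(std::string Tracked, std::string Orig) {
    Candidates.push_back(std::move(Tracked));
    TrackedToOrig.push_back(std::move(Orig));
    assert(Candidates.size() == TrackedToOrig.size());
    return Candidates.size() - 1;
  }

  // Returns the original value recorded for the candidate at Index.
  const std::string &originalOf(std::size_t Index) const {
    assert(Index < TrackedToOrig.size());
    return TrackedToOrig[Index];
  }
};
```

The trade-off is that both vectors must always be updated together, which the size assertion in `add` checks.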
2026-03-31 16:22:57 -07:00
Henry Jiang
5d624b5b93
[VPlan] Stop outerloop vectorization from vectorizing nonvector intrinsics (#185347)
In outer-loop VPlan, avoid emitting vector intrinsic calls for intrinsics
without a vector form. In VPRecipeBuilder, detect missing vector intrinsic
mapping and emit scalar handling instead of a vector call.

Also fix assertion when `llvm.pseudoprobe` in VPlan's native path is being
treated as a `WIDEN-INTRINSIC`.

Reproducer: https://godbolt.org/z/GsPYobvYs
2026-03-31 16:01:39 -07:00
vporpo
d8e9e0af1c
[SandboxVec][LoadStoreVec] Initial pass implementation (#188308)
This patch implements a new simple region pass that can vectorize
store-load chains.
2026-03-31 15:15:43 -07:00
Justin Fargnoli
cb32b8bffb
[LoopUnrollPass] Don't pre-set UP.Count before legality checks in computeUnrollCount() (#185979)
We currently set `UP.Count` to `TripCount` and `MaxTripCount` prior to
full and upper bound unrolling, respectively. This was likely done to
ensure that calls to `UCE.getUnrolledLoopSize(UP)` use the appropriate
trip count. However, we can use `UCE.getUnrolledLoopSize(UP,
FullUnrollTripCount)` instead.

To prevent unintentional unrolling, we set `UP.Count = 0` when
early-exiting `computeUnrollCount()`. (Note: this does not occur
[here](eb687fb106/llvm/lib/Transforms/Scalar/LoopUnrollPass.cpp (L1190-L1198)).
This seems like a bug.)

We only perform early exits when evaluating runtime unrolling. At that
point, [we know `TripCount` is
false](3fb31e7b06/llvm/lib/Transforms/Scalar/LoopUnrollPass.cpp (L1157-L1158)),
and thus we could not have leaked `TripCount`. However, we [could've
leaked
`MaxTripCount`](eb687fb106/llvm/lib/Transforms/Scalar/LoopUnrollPass.cpp (L1102-L1110)).

It seems like:
eb687fb106/llvm/lib/Transforms/Scalar/LoopUnrollPass.cpp (L1181-L1188)

was supposed to handle this case. However:

- It uses `<` instead of `<=`. This breaks the existing convention
[[1]](eb687fb106/llvm/lib/Transforms/Scalar/LoopUnrollPass.cpp (L869))
[[2]](eb687fb106/llvm/lib/Transforms/Scalar/LoopUnrollPass.cpp (L1103))
for how `UP.MaxUpperBound` is treated.
- It's ignored when a target sets `UP.Force = true`.

Thus:
- When `UP.Force == false`, we leak `MaxTripCount` into runtime
unrolling when `MaxTripCount && (UP.UpperBound || MaxOrZero) &&
MaxTripCount == UP.MaxUpperBound`
- When `UP.Force == true`, we leak `MaxTripCount` into runtime unrolling
when `MaxTripCount && (UP.UpperBound || MaxOrZero) && MaxTripCount <=
UP.MaxUpperBound`.

This PR:
- Uses `UCE.getUnrolledLoopSize(UP, FullUnrollTripCount)`
- Stops setting `TripCount` and `MaxTripCount` prior to calling
`shouldFullUnroll()`
- Removes the `UP.Count = 0` safeguards
- Swaps `<` with `<=`, to address the `UP.Force == false` case
- Adds a test to document the behavior change (no longer leaking
`MaxTripCount`) in the `UP.Force == true` case.
2026-03-31 19:50:52 +00:00
Daniel Thornburgh
05dd3ae10c
[SimplifyLibCalls] Prevent orphaned global string literals (#189502)
When `printf` is simplified to `puts`, `SimplifyLibCalls` would eagerly
create a global string for the argument before checking if `puts` is
emittable. If `puts` is not emittable (e.g. because it's an unextracted
bitcode libfunc), the optimization aborts, leaving an orphaned global
string in the module. Under expensive checks, this triggers a fatal
error because the function pass modified the module without reporting
it.

This change defers the creation of the global string until after
checking if `puts` is emittable.

(This PR was created with the help of Gemini CLI.)
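
The ordering fix can be sketched with a toy model. Everything here is hypothetical: `ToyModule`, its fields, and `simplifyPrintfToPuts` only imitate the shape of the problem and are not the `SimplifyLibCalls` code.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-in for a module: a flag for whether puts can be emitted and the
// list of global strings created so far.
struct ToyModule {
  bool PutsEmittable = true;
  std::vector<std::string> GlobalStrings;
};

// Rewrites printf("text\n") to puts("text") in the toy model. Returns true if
// the rewrite happened. The emittability check runs before any global string
// is created, so a bail-out leaves the module untouched.
bool simplifyPrintfToPuts(ToyModule &M, const std::string &Format) {
  const std::size_t Before = M.GlobalStrings.size();
  if (Format.empty() || Format.back() != '\n' ||
      Format.find('%') != std::string::npos)
    return false;
  if (!M.PutsEmittable) {
    assert(M.GlobalStrings.size() == Before && "bail-out must not modify the module");
    return false;
  }
  M.GlobalStrings.push_back(Format.substr(0, Format.size() - 1));
  return true;
}
```

Creating the string first and checking afterwards would leave an extra entry in `GlobalStrings` on the failure path, which is the orphan described above.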
2026-03-31 12:12:15 -07:00
Anshil Gandhi
be9b162a79
[LoadStoreVectorizer] Add tests for mixed-type vectorization (#189716)
Precommitting this test to reflect impact of the mixed-type
vectorization PR #177908. NFC.
2026-03-31 17:51:42 +00:00
Alexey Bataev
38c0f53a14 [SLP][NFC] Add a test for incorrect fma-conversion for fmuls with multi uses 2026-03-31 08:00:21 -07:00