208342 Commits

Author SHA1 Message Date
Sameer Sahasrabuddhe
f9adee2f6b
[AMDGPU] asyncmark support for ASYNC_CNT (#185813)
The ASYNC_CNT is used to track the progress of asynchronous copies
between global and LDS memories. By including it in asyncmark, the
compiler can now assist the programmer in generating waits for
ASYNC_CNT.

Assisted-By: Claude Sonnet 4.5

This is part of a stack:

- #185813
- #185810 

Fixes: LCOMPILER-332
2026-04-07 07:23:09 +05:30
Peter Collingbourne
75bb30ddbf
Move {load,store}(llvm.protected.field.ptr) lowering to InstCombine.
The previous position of llvm.protected.field.ptr lowering for loads
and stores was problematic as it not only inhibited optimizations such
as DSE (as stores to a llvm.protected.field.ptr were not considered to
must-alias stores to the non-protected.field pointer) but also required
changes to other optimization passes to avoid transformations that would
reduce PFP coverage.

Address this by moving the load/store part of the lowering to
InstCombine, where it will run earlier than the PFP-breaking and
AA-relying transformations. The deactivation symbol, null comparison
and EmuPAC parts of the lowering remain in PreISelLowering.

Now that the transformation inhibitions are no longer needed, remove them
(i.e. partially revert #151649, and revert #182976).

This change resulted in a 2.4% reduction in Fleetbench .text size and
the following improvements to PFP performance overhead for BM_PROTO_Arena
on various microarchitectures:

                    before   after
  Apple M2 Ultra     3.5%    3.3%
  Google Axion C4A   3.3%    2.9%
  Google Axion N4A   2.7%    2.2%

Reviewers: fmayer, nikic, vitalybuka

Reviewed By: fmayer

Pull Request: https://github.com/llvm/llvm-project/pull/186548
2026-04-06 17:47:24 -07:00
Jim Lin
eb35aa90f6
[RISCV] Use per-SEW immediate inversion for vrol intrinsic patterns (#190113)
The VPatBinaryV_VI_VROL multiclass was using InvRot64Imm for all SEW
widths when converting vrol immediate intrinsics to vror.vi. This
produced unnecessarily large immediates for narrower element types
(e.g., 61 instead of 5 for SEW=8 rotate-left by 3).

Use the appropriate InvRot{SEW}Imm transform to match what the SDNode
patterns already do.
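
As a rough illustration of the arithmetic (a sketch, not the TableGen transform from the patch): rotating left by `imm` at element width `sew` equals rotating right by `(sew - imm) mod sew`. The helper names `invRotImm`, `rotl8` and `rotr8` below are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: the rotate-right amount that is equivalent to a
// rotate-left by `imm` at element width `sew`.
constexpr unsigned invRotImm(unsigned sew, unsigned imm) {
  return (sew - imm % sew) % sew;
}

// Reference 8-bit rotates, used only to check the equivalence.
constexpr std::uint8_t rotl8(std::uint8_t x, unsigned k) {
  k %= 8;
  return static_cast<std::uint8_t>((x << k) | (x >> ((8 - k) % 8)));
}

constexpr std::uint8_t rotr8(std::uint8_t x, unsigned k) {
  k %= 8;
  return static_cast<std::uint8_t>((x >> k) | (x << ((8 - k) % 8)));
}

// Usage: the per-SEW inversion yields 5 for SEW=8, the 64-bit one yields 61.
inline void example() {
  assert(invRotImm(8, 3) == 5);
  assert(invRotImm(64, 3) == 61);
  assert(rotl8(0x81, 3) == rotr8(0x81, invRotImm(8, 3)));
}
```

Note that 61 and 5 are congruent modulo 8, so both describe the same 8-bit rotation; the per-SEW form is simply the smaller immediate.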

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 08:42:23 +08:00
Arthur Eubanks
82505fbfc8
[Inliner] Put inline history into IR as !inline_history metadata (#190700)
(Reland of #190092 with verifier change to look through GlobalAliases)

So that it's preserved across all inline invocations rather than just
one inliner pass run.

This prevents cases where devirtualization in the simplification
pipeline uncovers inlining opportunities that should be discarded due to
inline history, but we dropped the inline history between inliner pass
runs, causing code size to blow up, sometimes exponentially.

For compile time reasons, we want to limit this to only call sites that
have the potential to inline through SCCs, potentially with the help of
devirtualization. This means that the callee is in a non-trivial
(Ref)SCC, or the call site was previously an indirect call, which can
potentially be devirtualized to call any function.

The CGSCCUpdater::InlinedInternalEdges logic still seems to be relevant
even with this change, as monster_scc.ll blows up if I remove that code.


http://llvm-compile-time-tracker.com/compare.php?from=e830d88e8ae5f44a97cc76136a0a4e83aa9157c0&to=ed535e732fc41b79ab8efda2417886cbd0812f7f&stat=instructions:u

Fixes #186926.
2026-04-06 17:31:43 -07:00
Lucas Ramirez
94875aea7e
[CodeGen] Fix multiple connected component issue in rematerializer (#186674)
This fixes a rematerializer issue wherein re-creating the interval of a
non-rematerializable super-register defined over multiple MIs, some of
which defining entirely dead sub-registers, could cause a crash when
changing the order of sub-definitions (for example during scheduling)
because the re-created interval could end up with multiple connected
components, which is illegal. The solution is to split separate
components of the interval in such cases. The added unit test crashes
without that added behavior.
2026-04-06 23:26:16 +00:00
Craig Topper
8e1ea8af38
[RISCV][P-ext] Add isel patterns for mhacc/mhaccu/mhaccsu. (#190670)
2026-04-06 15:37:51 -07:00
Wei Wang
1ae179b325
[SampleProfileMatcher] Fix backward matching of non-anchor locations (#190118)
The backward matching loop in `matchNonCallsiteLocs` was ineffective
because `InsertMatching` used `std::unordered_map::insert()` which does
not overwrite existing entries. Since forward matching already inserted
entries for all non-anchor locations, the backward matching for the
second half was silently ignored.

The backward matching can update forward mappings in
`IRToProfileLocationMap` in 2 ways:
- The IR location maps a new different profile location. Change
`insert()` to `insert_or_assign()` so that entry overwrite can happen.
- The IR location maps the same profile location. Add `erase()` to
remove such mapping.
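
The difference between the two standard-library calls can be sketched as follows. This is a minimal illustration, not the matcher code: `LocMap` and its `int` keys and values are hypothetical stand-ins for the real location types.

```cpp
#include <cassert>
#include <unordered_map>

// Hypothetical stand-in for a map from IR location to profile location.
using LocMap = std::unordered_map<int, int>;

// insert() leaves an existing entry untouched and reports that via `.second`.
bool tryInsert(LocMap &m, int irLoc, int profLoc) {
  return m.insert({irLoc, profLoc}).second;
}

// insert_or_assign() (C++17) overwrites the mapped value if the key exists.
void overwrite(LocMap &m, int irLoc, int profLoc) {
  m.insert_or_assign(irLoc, profLoc);
}

inline void example() {
  LocMap m;
  assert(tryInsert(m, 10, 100));   // forward matching inserts the entry
  assert(!tryInsert(m, 10, 200));  // a later insert() is silently ignored
  assert(m.at(10) == 100);
  overwrite(m, 10, 200);           // insert_or_assign() replaces it
  assert(m.at(10) == 200);
}
```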
2026-04-06 15:21:31 -07:00
Steven Wu
79e669f000
[CAS] Revert an unintentional change in #190634 (#190686)
Revert a change in #190634 that unintentionally introduced an implicit
signed-to-unsigned cast.
2026-04-06 21:37:15 +00:00
Ehsan Amiri
8a11fe97a2
[DA] Require nsw for AddRecs involved in GCD test (#186892)
Similar to other tests, we now require that the AddRecs used in the GCD
test are `nsw`. In this case, all recursively identified `AddRec`s are
also checked. Note that there is already a similar check in
`getConstantCoefficient` for expressions processed in that function.
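
For context, the textbook form of the GCD test can be sketched as below. This is a simplified illustration over plain integers, not the DependenceAnalysis implementation; `gcdTestMayDepend` is a hypothetical name. The reasoning assumes the subscript arithmetic does not wrap, which is the property the `nsw` requirement is meant to guarantee.

```cpp
#include <cassert>
#include <cstdint>
#include <numeric>

// Subscripts a*i + c1 and b*j + c2 can only be equal for some integers i, j
// if gcd(a, b) divides (c2 - c1). Returns false when the test proves
// independence, true when a dependence cannot be ruled out.
bool gcdTestMayDepend(std::int64_t a, std::int64_t b,
                      std::int64_t c1, std::int64_t c2) {
  const std::int64_t g = std::gcd(a, b);
  if (g == 0)            // both coefficients are zero: compare the constants
    return c1 == c2;
  return (c2 - c1) % g == 0;
}

inline void example() {
  // A[2*i] vs A[2*j + 1]: even and odd indices never meet.
  assert(!gcdTestMayDepend(2, 2, 0, 1));
  // A[2*i] vs A[4*j + 6]: gcd is 2 and divides 6, so a dependence is possible.
  assert(gcdTestMayDepend(2, 4, 0, 6));
}
```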
2026-04-06 17:33:16 -04:00
Shilei Tian
ef715849d7
[NFC][AMDGPU] Add some debug prints to SIMemoryLegalizer (#190658)
2026-04-06 17:17:33 -04:00
Joe Nash
af95b0a615
[AMDGPU] Remove implicit super-reg defs on mov64 pseudos (#190379)
The mov64 pseudo is split into two 32-bit movs, but those 32-bit movs
still carried an implicit def of the full 64-bit register. Removing it
affects VOPD formation, so we can emit more VOPD instructions.
2026-04-06 21:11:06 +00:00
Anshul Nigham
97d50c1490
[NewPM] Adds a port for AArch64PreLegalizerCombiner (#190567)
Standard porting (note that TargetPassConfig dependency was [removed
earlier](e27e7e4339)).

---------

Co-authored-by: Matt Arsenault <arsenm2@gmail.com>
2026-04-06 14:01:37 -07:00
Andrew
ee51de9836
[llvm-cov] add ability to show non executed test vectors for mc/dc coverage (#187517)
- Added `-show-mcdc-non-executed-vectors` option
- Non-executed test vectors now are tracked
- When the option is present, they are written to the UI
2026-04-06 15:59:14 -05:00
Zile Xiong
d917027334
[llvm-cov] Guard against empty CountedRegions in findMainViewFileID (#189270)
When processing coverage generated from branch coverage mode, some
functions can reach findMainViewFileID with an empty CountedRegions
list. In that case the current logic still proceeds to infer the main
view file, even though there is no regular counted region available to
do so.

Return std::nullopt early when CountedRegions is empty.

This was observed when reproducing issue #189169 with:
  cargo llvm-cov --lib --branch

The issue appears related to branch-only coverage information being
recorded separately in CountedBranchRegions, while
findMainViewFileID currently only consults CountedRegions.
This patch is a defensive fix for the empty-region case; further
investigation may still be needed to determine whether branch regions
should participate in main view file selection.

Co-authored-by: Zile Xiong <xiongzile99@gmail.com>
2026-04-06 15:58:31 -05:00
Chinmay Deshpande
9033e872fd
[AMDGPU][GISel] RegBankLegalize rules for update_dpp (#190662)
2026-04-06 13:52:10 -07:00
Congzhe
fbe6d79465
[LoopFusion] Fix out-of-date LoopInfo being used during fusion (#189452)
This is a fix for
[187902](https://github.com/llvm/llvm-project/issues/187902), where
`LoopInfo` is not in a valid state at the beginning of `ScalarEvolution::createSCEVIter`.

The reason for the bug is that `mergeLatch()` is called at a place
where control flow and dominator trees have been updated but `LoopInfo`
has not completed the update yet. `mergeLatch()` calls into
`ScalarEvolution`, which uses `LoopInfo`, so an out-of-date `LoopInfo`
would result in a crash or unpredictable results.

This patch moves `mergeLatch()` to the place where `LoopInfo` has
completed its update and hence is in a valid state.
2026-04-06 16:35:28 -04:00
Steven Wu
1a0ca1019d
[CAS] Harden validate() against on-disk corruption (#190634)
Fixes found by fuzzer:

OnDiskTrieRawHashMap:
- Bounds-check data slot offsets in TrieVerifier::visitSlot() before
  calling getRecord(), preventing asData() assertion on out-of-bounds
  trie entries.
- Validate subtrie headers (NumBits, bounds) before constructing
  SubtrieHandle, preventing SEGV in getSlots() from corrupt NumBits.
- Validate arena bump pointer alignment, catching misaligned BumpPtr
  that would crash store() with an alignment assertion.
- Fix comma operator bug in getOrCreateRoot() where the
  compare_exchange_strong result was discarded, causing asSubtrie()
  assertion when RootTrieOffset was corrupted to zero.

OnDiskGraphDB:
- Reject invalid (zero) ref offsets in validate callback, preventing
  asData() assertion when corrupt data pool refs are resolved via
  recoverFromFileOffset().
- Validate DataRecordHandle layout flags before calling getTotalSize(),
  preventing llvm_unreachable on corrupt NumRefsFlags/DataSizeFlags.
- Validate data pool bump pointer alignment, catching misaligned
  BumpPtr that would crash store() in DataRecordHandle::constructImpl().
- Check data record refs offset alignment before calling getRefs(),
  preventing PointerUnion assertion from misaligned refs pointer.

MappedFileRegionArena:
- Convert assertions in initializeHeader() to errors so corrupted
  arena headers return an error on CAS open instead of crashing.
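
The comma operator pitfall mentioned for getOrCreateRoot() can be illustrated generically. The sketch below is an assumption about the general shape of such a bug, not the actual CAS code; `buggyPublish` and `fixedPublish` are hypothetical names.

```cpp
#include <atomic>
#include <cassert>

// Buggy shape: the comma operator evaluates the compare-exchange, discards
// its result, and yields `true`, so the caller believes the write succeeded.
bool buggyPublish(std::atomic<int> &slot, int desired) {
  int expected = 0;
  return (slot.compare_exchange_strong(expected, desired), true);
}

// Fixed shape: propagate the compare-exchange result to the caller.
bool fixedPublish(std::atomic<int> &slot, int desired) {
  int expected = 0;
  return slot.compare_exchange_strong(expected, desired);
}

inline void example() {
  std::atomic<int> slot{7};        // already non-zero, so the exchange fails
  assert(buggyPublish(slot, 5));   // still reports success
  assert(slot.load() == 7);        // although nothing was written
  assert(!fixedPublish(slot, 5));  // the fixed form reports the failure
}
```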

Assisted-By: Claude
2026-04-06 13:33:22 -07:00
Arthur Eubanks
70d3dcaa64
Revert "[Inliner] Put inline history into IR as !inline_history metadata" (#190666)
Reverts llvm/llvm-project#190092

Crashes reported in
https://github.com/llvm/llvm-project/pull/190092#issuecomment-4194546908
2026-04-06 20:31:54 +00:00
Chinmay Deshpande
40d5a7d69e
[AMDGPU][UniformityAnalysis] Mark set_inactive and set_inactive_chain_arg as SourceOfDivergence (#190640)
`set_inactive` produces a result that varies per-lane based on the EXEC mask, even when both inputs are uniform.
2026-04-06 12:40:22 -07:00
Aadarsh Keshri
326593b4b4
[Support][Modules] Removed prepareForGetLock and its usages. Ensured parent directory exists when creating lock file. (#189888)
Following #187372
2026-04-06 12:37:32 -07:00
Lucas Ramirez
5e1162eebc
[CodeGen] Move rollback capabilities outside of the rematerializer (#184341)
The rematerializer implements support for rolling back
rematerializations by modifying MIs that should normally be deleted in
an attempt to make them "transparent" to other analyses. This involves:

1. setting their opcode to DBG_VALUE and
2. setting their read register operands to the sentinel register.

This approach has several drawbacks.

1. It forces the rematerializer to support tracking these "dead MIs"
(even if support is optional, these data-structures have to exist).
2. It is not actually clear whether this mechanism will interact well
with all other analyses. This is an issue since the intent of the
rematerializer is to be usable in as many contexts as possible.
3. In practice, it has shown itself to be relatively error-prone.

This commit removes rollback support from the rematerializer and moves
those capabilities to a rematerializer listener that can be instantiated
on-demand and implements the same functionality on top of standard
rematerializer operations. The rematerializer now actually deletes MIs
that are no longer useful after rematerializations, and has support for
re-creating them on-demand without requiring additional tracking on its
part.
2026-04-06 19:23:19 +00:00
Daniel Thornburgh
fecf609998
Reland "[LTO][LLD] Prevent invalid LTO libfunc transforms (#164916)" (#190642)
This reverts commit 1ec7e86b3a779df2a0af3f37e58c8f5b3a398d7f after issue
#190072 was fixed.
2026-04-06 19:20:45 +00:00
Henry Jiang
412d6941e3
[VFS] Guard against null key/value nodes when parsing YAML overlay (#190506)
When a VFS overlay YAML file contains malformed content such as tabs,
the YAML parser can produce KeyValueNode entries where `getKey` returns
nullptr. The VFS overlay parser then passes the nullptr to
`parseScalarString`, which then calls dyn_cast.

Switch to `dyn_cast_if_present` for the above callsites and a few more.
2026-04-06 12:10:26 -07:00
Alexis Engelke
a105f27f61
[Scheduler][NFC] Don't use set to track visited nodes (#190480)
The visited set can grow rather large and we can use an unused field in
SDNode to store the same information without the use of a hash set.

This improves compile times: stage2-O3 -0.14%.
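
The general idea of replacing a visited set with a per-node field can be sketched as follows. This is a generic illustration, not the scheduler code: the `Node` struct and its `visited` member are hypothetical, and the commit does not say which SDNode field is reused.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical graph node with a spare field used as the "visited" marker.
struct Node {
  std::vector<Node *> succs;
  bool visited = false;
};

// Iterative DFS that counts reachable nodes. Marking the node itself avoids
// the hashing and allocation cost of a separate visited set.
std::size_t countReachable(Node *root) {
  std::size_t count = 0;
  std::vector<Node *> worklist{root};
  root->visited = true;
  while (!worklist.empty()) {
    Node *n = worklist.back();
    worklist.pop_back();
    ++count;
    for (Node *s : n->succs) {
      if (!s->visited) {
        s->visited = true;
        worklist.push_back(s);
      }
    }
  }
  return count;
}

inline void example() {
  Node a, b, c;
  a.succs = {&b, &c};
  b.succs = {&c};
  c.succs = {&a};  // cycle back to the root
  assert(countReachable(&a) == 3);
}
```

One trade-off of this design is that the marker lives in the nodes, so it has to be reset (or versioned) before the next traversal.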
2026-04-06 18:37:26 +00:00
Kirill Stoimenov
cdbb1f5014
Revert "[InstCombine] Fix #163110: Support peeling off matching shifts from icmp operands via canEvaluateShifted" (#190638)
Reverts llvm/llvm-project#165975

Breaks Sanitizer bots:
https://lab.llvm.org/buildbot/#/builders/52/builds/16329
2026-04-06 11:30:36 -07:00
vporpo
8d442bc5b5
[SandboxVec][LoadStoreVec] Add support for constants (#189769)
Up until now the pass would only vectorize load-store pairs. This patch
implements vectorization of constant-store pairs.
2026-04-06 11:25:20 -07:00
SiliconA-Z
5c13d2f099
[ARM] Enable creation of ARMISD::CMN nodes (#163223)
Map ARMISD::CMN to tCMN instead of armcmpz.

Rename the cmn instructions to match this new reality.

Please note that I do not have merge permissions.
2026-04-06 20:05:14 +02:00
Craig Topper
38034d42bd
[RISCV] Use EVT instead of MVT in compressShuffleOfShuffles. (#190636)
For the test case I just grabbed a test that exercised this code path
and made the VT non-simple.

Fixes #190605.
2026-04-06 11:03:38 -07:00
Chinmay Deshpande
12e957fd7f
[AMDGPU][GISel] RegBankLegalize rules for amdgcn_inverse_ballot (#190629)
2026-04-06 10:30:35 -07:00
Tomer Shafir
37801e9e99
[MCA] Enhance debug prints of processor resources (#190132)
Previously, `computeProcResourceMasks()` would print resource masks in
debug mode from multiple call sites, creating noise in the debug output.
This patch aims to fix this and also print more info about the
resources.

It splits the debug prints for resources into 2 types:

1. No simulation - mask only
2. Simulation - mask + other info

For 2, it shares printing in a single place in the `ResourceManager`
constructor, that should cover all the other simulation cases
indirectly:

1. `llvm/lib/MCA/HardwareUnits/ResourceManager` - covered
2. `llvm/lib/MCA/InstrBuilder.cpp` - should be covered indirectly - only
used by `llvm-mca` before simulation that constructs a `ResourceManager`
3. `llvm/tools/llvm-mca/Views/SummaryView.cpp` - after simulation that
constructs a `ResourceManager`
4. `llvm/tools/llvm-mca/Views/BottleneckAnalysis.cpp` - after simulation
that constructs a `ResourceManager`

It also adds `BufferSize` to the output, which should be useful to debug
scheduling model + MCA integration.

For 1, it inlines mask-only printing into 2 other callers:

1. `llvm/include/llvm/MCA/Stages/InstructionTables.h`
2. `llvm/tools/llvm-exegesis/lib/SchedClassResolution.cpp`

as they only use the masks there. I think this is a reasonable
duplication across distinguishably different users/tools.

Now every pair of callers, even across groups (1 and 2), effectively
prints in a mutually exclusive way.

The patch adds debug tests for the 3 new callers, in the corresponding
root test directories, to drive further location of logically
target-independent tests that just require some target at the root. I
think this convention is more discoverable, and is pretty widely used in
the project.
2026-04-06 20:27:18 +03:00
Arthur Eubanks
72d4ce9889
[Inliner] Put inline history into IR as !inline_history metadata (#190092)
So that it's preserved across all inline invocations rather than just
one inliner pass run.

This prevents cases where devirtualization in the simplification
pipeline uncovers inlining opportunities that should be discarded due to
inline history, but we dropped the inline history between inliner pass
runs, causing code size to blow up, sometimes exponentially.

For compile time reasons, we want to limit this to only call sites that
have the potential to inline through SCCs, potentially with the help of
devirtualization. This means that the callee is in a non-trivial
(Ref)SCC, or the call site was previously an indirect call, which can
potentially be devirtualized to call any function.

The CGSCCUpdater::InlinedInternalEdges logic still seems to be relevant
even with this change, as monster_scc.ll blows up if I remove that code.


http://llvm-compile-time-tracker.com/compare.php?from=e830d88e8ae5f44a97cc76136a0a4e83aa9157c0&to=ed535e732fc41b79ab8efda2417886cbd0812f7f&stat=instructions:u

Fixes #186926.
2026-04-06 10:24:41 -07:00
vangthao95
eb065bf028
AMDGPU/GlobalISel: RegBankLegalize rules for G_EXTRACT_VECTOR_ELT (#189144)
2026-04-06 10:22:11 -07:00
Craig Topper
b44d2c977c
[RISCV] Use a vector MemVT when converting store+extractelt into a vector store. (#190107)
This is needed so that `allowsMemoryAccessForAlignment` checks for
unaligned vector memory support instead of unaligned scalar memory
support when called from `RISCVTargetLowering::expandUnalignedVPStore`.

While there, remove the incorrect setting of the truncating store flag
on the vector instruction, and restrict the transform to simple stores
since we don't have tests for volatile or atomic.

Fixes #189037
2026-04-06 09:58:04 -07:00
Craig Topper
0d14772a91
[RISCV][P-ext] Add isel patterns for macc*.h00/macc*.w00. (#190444)
The RV32 macc*.h00 instructions take the lower half words from rs1 and
rs2, compute the full word product by extending the inputs, and
add to rd. The RV64 macc*.w00 is similar but operates on words
and produces a double word result.

I've restricted this to the case where the multiply has a single use.
We don't have a general macc that multiplies the full xlen bits
of rs1 and rs2, so I'm allowing the input to be sext_inreg/and or
have sufficient sign/zero bits according to
ComputeNumSignBits/computeKnownBits.

We should also add mul*.h00/mul*.w00 patterns, but those we should
restrict to at least one input being sext_inreg/and and prefer
regular mul when there are no sext_inreg/and.
2026-04-06 09:57:29 -07:00
Wooseok Lee
0bef4c7aab
[AMDGPU] Add v2i32 and/or patterns for VOP3 AND_OR and OR3 operations (#188375)
Add the ThreeOp_v2i32_Pats pattern class to support v2i32 vector
operations for the AND_OR_B32 and OR3_B32 instructions. The new patterns
match the v2i32 and-or or or-or instruction sequence, extract the
individual 32-bit elements from the v2i32 operands, and apply the and_or
or or3 VOP3 operations.
2026-04-06 16:54:21 +00:00
Domenic Nutile
5b33f85a08
[AMDGPU] Change isSingleLaneExecution to account for WWM enabling lanes even if there's only one workitem (#188316)
This issue was discovered during some downstream work around Vulkan CTS
tests, specifically
`dEQP-VK.subgroups.arithmetic.compute.subgroupadd_float`

---------

Co-authored-by: Matt Arsenault <arsenm2@gmail.com>
2026-04-06 12:51:46 -04:00
Ryotaro Kasuga
34a16392fa
[DA] Use SmallVector instead of raw new/delete (NFC) (#190586)
Some functions used `new`/`delete` to allocate/free arrays. To avoid
memory leaks, it would be better to avoid using raw pointers. This patch
replaces the use of them with `SmallVector`.
2026-04-06 15:54:34 +00:00
Matt Arsenault
bf2a97a0dd
AMDGPU: Add range attribute to mbcnt intrinsic callsites (#189191)
It seems the known bits handling added in
686987a540bc176bceaad43ffe530cb3e88796d5
is insufficient to perform many range based optimizations. For some
reason
computeConstantRange doesn't fall back on KnownBits, and has a separate,
less used form which tries to use computeKnownBits.
2026-04-06 14:40:54 +00:00
Max Graey
c4281fd5af
[Support][ValueTracking] Improve KnownFPClass for fadd. Handle infinity signs (#190559)
Improve KnownFPClass reasoning for fadd:

- Refine NaN handling for infinities by checking opposite-sign cases:
  - `-inf` + `+inf` --> `nan`
  - `+inf` + `-inf` --> `nan`
  - `+inf` + `+inf` --> `+inf`
  - `-inf` + `-inf` --> `-inf`
- Introduce `cannotBeOrderedLessEqZero` as pair to
`cannotBeOrderedGreaterEqZero`.
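
The four infinity cases listed above follow from IEEE 754 addition and can be checked directly. The snippet is only a numeric illustration of those cases, not the KnownFPClass logic; `addInf` is a hypothetical helper.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Adds two infinities whose signs are selected by the flags.
double addInf(bool lhsNegative, bool rhsNegative) {
  const double inf = std::numeric_limits<double>::infinity();
  return (lhsNegative ? -inf : inf) + (rhsNegative ? -inf : inf);
}

inline void example() {
  assert(std::isnan(addInf(true, false)));   // -inf + +inf
  assert(std::isnan(addInf(false, true)));   // +inf + -inf
  assert(std::isinf(addInf(false, false)) && addInf(false, false) > 0);
  assert(std::isinf(addInf(true, true)) && addInf(true, true) < 0);
}
```

This assumes default floating-point semantics; options such as `-ffast-math` may change the outcome.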
2026-04-06 16:23:20 +02:00
Joe Nash
2ccc941549
[AMDGPU] Mark two instructions as DPMACC (#190391)
It appears these were accidentally missed in #170319
2026-04-06 13:43:35 +00:00
Trung Nguyen
b6e7c475cb
[CodeGen] Ignore ANNOTATION_LABEL in scheduler (#190499)
This fixes a crash in `clang` for `armv7` targets when optimizations are
enabled.

Fixes #190497
2026-04-06 14:16:01 +02:00
Florian Hahn
0403639667
[VPlan] Skip successors outside any loop when updating LoopInfo. (#190553)
Successors outside of any loop do not contribute to the innermost loop,
skip them to avoid incorrect results due to
getSmallestCommonLoop(nullptr, X) returning nullptr.
2026-04-06 12:58:41 +01:00
陈子昂
05ff170026
[InstCombine] Fix #163110: Support peeling off matching shifts from icmp operands via canEvaluateShifted (#165975)
Consider a pattern like `icmp (shl nsw X, L), (add nsw (shl nsw Y, L),
K)`. When the constant K is a multiple of 2^L, this can be simplified to
`icmp X, (add nsw Y, K >> L)`.
This patch extends canEvaluateShifted to support `Instruction::Add` and
updates its signature to accept `Instruction::BinaryOps` instead of a
boolean. This change allows the function to distinguish between LShr and
AShr requirements, ensuring that information is preserved according to
the signedness and overflow flags (nsw/nuw) of the operands.
The logic is integrated into `foldICmpCommutative` to enable peeling off
matching shifts from both sides of a comparison even when an offset is
present.
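
The arithmetic behind the fold can be checked on small values. The sketch below is an illustration over 64-bit integers with a signed less-than predicate, in a range where nothing overflows (mirroring the `nsw` assumption); it is not the InstCombine code, and `peelingAgrees` is a hypothetical name.

```cpp
#include <cassert>
#include <cstdint>

// Checks, for x and y in [-16, 16], that
//   (x << L) < (y << L) + K   agrees with   x < y + (K >> L).
// Multiplication and division by 2^L stand in for the shifts so that
// negative operands stay well defined.
bool peelingAgrees(unsigned L, std::int64_t K) {
  const std::int64_t scale = std::int64_t{1} << L;
  for (std::int64_t x = -16; x <= 16; ++x) {
    for (std::int64_t y = -16; y <= 16; ++y) {
      const bool original = x * scale < y * scale + K;
      const bool peeled = x < y + K / scale;
      if (original != peeled)
        return false;
    }
  }
  return true;
}

inline void example() {
  assert(peelingAgrees(2, 8));   // K = 8 is a multiple of 2^2
  assert(!peelingAgrees(2, 6));  // K = 6 is not: x = 1, y = 0 disagrees
}
```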

Fixes: #163110
2026-04-06 13:44:17 +02:00
Florian Hahn
64a0bd1227
[LV] Return best VPlan together with VF from computeBestVF (NFC). (#190385)
computeBestVF iterates over all VPlans and picks the VF of the most
profitable VPlan. This VPlan is later needed for execution and
additional checks. Instead of retrieving it multiple times later, just
directly return it from computeBestVF.

This removes some redundant lookups.

PR: https://github.com/llvm/llvm-project/pull/190385
2026-04-06 11:01:18 +01:00
Florian Hahn
f7cdebb478
[VPlan] Mark unary ops as not having side-effects (NFC). (#190554)
Mark unary ops (only FNeg currently) to neither read nor write memory,
similar to binary and cast ops.

Should currently be NFC end-to-end.
2026-04-06 09:05:38 +01:00
Florian Hahn
c109dd1e9a
[VPlan] Refactor FindLastSelect matching to use m_Specific(PhiR) (NFC). (#190547)
Match the select operands directly against PhiR using m_Specific,
binding only the non-phi IV expression. This replaces the generic
TrueVal/FalseVal matching followed by an assert and conditional
extraction.

Split off from approved
https://github.com/llvm/llvm-project/pull/183911/ as suggested.
2026-04-05 20:07:34 +00:00
Samuel Thibault
9ce30c8dc3
[Orc][LibResolver] Fix GNU/Hurd build (#184470)
GNU/Hurd does not put a PATH_MAX static constraint on path lengths. We can instead check the symlink length.
2026-04-05 19:56:31 +01:00
Florian Hahn
36e495dd90
[VPlan] Use APSInt in CheckSentinel directly (NFC). (#190534)
Simplify the sentinel checking logic by using APSInt and checking for
both a signed and unsigned sentinel in a single call.

Removes the IsSigned argument.

Split off from approved
https://github.com/llvm/llvm-project/pull/183911/ as suggested.
2026-04-05 16:43:59 +00:00
Florian Hahn
a2c16bb59f
[VPlan] Rename CondSelect to FindLastSelect (NFC). (#190536)
Use the more descriptive name FindLastSelect for the conditional select
that picks between the reduction phi and the IV value.

Split off from approved
https://github.com/llvm/llvm-project/pull/183911/ as suggested.
2026-04-05 16:39:34 +00:00
Hassnaa Hamdi
c5a904946a
[LV][NFC] remove dead code in canFoldTailByMasking() (#190263)
Remove unused ReductionLiveOuts variable in `canFoldTailByMasking()`.
The set was being populated with reduction loop exit instructions but
was never actually used anywhere in the function.
2026-04-05 12:59:32 +01:00