llvm-project

Author	SHA1	Message	Date
Ramkumar Ramachandra	95c525b1db	[VPlan] Preserve nusw on VectorEndPointer (#151558 ) In createInterleaveGroups, get the nusw in addition to inbounds from the existing GEP, and set them on the VPVectorEndPointerRecipe.	2025-08-11 10:38:25 +01:00
David Sherwood	aba0ce10c7	[LV] Add new line to interleaving disabled message (#152722 )	2025-08-11 09:53:20 +01:00
David Sherwood	9181a7e294	[LV] Fix branch weights in epilogue min iteration check block (#152534 ) I've changed how we construct the EpilogueVectorizerEpilogueLoop and EpilogueVectorizerMainLoop classes so that we construct the parent class with an additional boolean parameter indicating whether we're vectorising the main or epilogue loop. The InnerLoopAndEpilogueVectorizer class uses this new argument in combination with the EpilogueLoopVectorizationInfo struct to set the right UF and VF values. This then allows EpilogueVectorizerEpilogueLoop to access the correct values of VF and UF for the main loop, which are required when setting branch weights in the minimum iteration check block.	2025-08-11 09:52:54 +01:00
Elvis Wang	37fe7a9933	[LV] Generate scalar xor for VPInstruction::Not if possible. (#152628 ) `VPInstruction::Not`, which generates an xor instruction, is widely used for the exit condition. This patch makes `VPInstruction::Not` generate a scalar `xor` if possible. This can help reduce the (splat true) operands in the `xor` and make the `xor` scalar.	2025-08-11 16:35:21 +08:00
David Green	6ca6d45b29	[VectorCombine] Use hasOneUser in shuffle-to-identity fold (#152675 ) We need to check that the node is part of the graph being converted, so will not contain external uses when transformed.	2025-08-11 07:45:15 +01:00
Mel Chen	6db3776f9b	[LV][EVL] Simplify EVL recipe transformation by using a single EVL mask. nfc (#152479 ) The EVL mask is always defined as `icmp ult (step-vector, EVL)`, so we only need to generate it once per plan in the header. Then, we replace all uses of the header mask with the EVL mask, and recursively optimize the users of EVL mask into EVL recipes. This way, the transformation to EVL recipes can be done with just a single loop.	2025-08-11 11:09:01 +08:00
Florian Hahn	86813aa786	[VPlan] Add dedicated user for resume phi with epilogue vectorization. Epilogue vectorization currently relies on the resume phi for the canonical induction being always available, which is why VPPhi are considered to have side-effects, to prevent their removal. This patch adds a new ResumeForEpilogue opcode to mark the resume phi as used for epilogue vectorization. This allows treating VPPhis in general as not having side-effects, enabling removal of unused VPPhis.	2025-08-10 21:21:16 +01:00
David Green	cfe190979e	Revert "[SLP]Initial FMAD support (#149102 )" This reverts commit 0fffb9f9ed81f4c2084b8fe040c88b60bb6c372a due to major performance regressions.	2025-08-10 15:16:01 +01:00
Florian Hahn	06fd0f9d65	[VPlan] Move initial skeleton construction earlier (NFC). (#150848 ) Split up the not clearly named prepareForVectorization transform into buildVPlan0, which adds the vector preheader, middle and scalar preheader blocks, as well as the canonical induction recipes and sets the trip count. The new transform is run directly after building the plain CFG VPlan initially. The remaining code handling early exits and adding the branch in the middle block is renamed to handleEarlyExitsAndAddMiddleCheck and still runs at the original position. With the code movement, we only have to add the skeleton once to the initial VPlan, and cloning will take care of the rest. It will also enable moving other construction steps to work directly on VPlan0, like adding resume phis. PR: https://github.com/llvm/llvm-project/pull/150848	2025-08-09 20:54:42 +01:00
Alexey Bataev	0fffb9f9ed	[SLP]Initial FMAD support (#149102 ) Added initial check for potential fmad conversion in reductions and operands vectorization. Added the check for instruction to fix #152683	2025-08-08 10:30:23 -07:00
Alexander Richardson	3a4b351ba1	[IR] Introduce the `ptrtoaddr` instruction This introduces a new `ptrtoaddr` instruction which is similar to `ptrtoint` but has two differences: 1) Unlike `ptrtoint`, `ptrtoaddr` does not capture provenance 2) `ptrtoaddr` only extracts (and then extends/truncates) the low index-width bits of the pointer For most architectures, difference 2) does not matter since index (address) width and pointer representation width are the same, but this does make a difference for architectures that have pointers that aren't just plain integer addresses such as AMDGPU fat pointers or CHERI capabilities. This commit introduces textual and bitcode IR support as well as basic code generation, but optimization passes do not handle the new instruction yet so it may result in worse code than using ptrtoint. Follow-up changes will update capture tracking, etc. for the new instruction. RFC: https://discourse.llvm.org/t/clarifiying-the-semantics-of-ptrtoint/83987/54 Reviewed By: nikic Pull Request: https://github.com/llvm/llvm-project/pull/139357	2025-08-08 10:12:39 -07:00
Alexey Bataev	0419b459be	Revert "[SLP]Initial FMAD support (#149102 )" This reverts commit 0bcf45ea3458ba79eb4257afcfd6af954292c9ce to fix the regressions reported in https://github.com/llvm/llvm-project/issues/152683	2025-08-08 09:17:59 -07:00
Mel Chen	ab7281d896	[VPlan] Update naming in VPInterleaveRecipe constructor. nfc (#152472 )	2025-08-08 20:17:10 +08:00
Florian Hahn	82d633e9ff	[VPlan] Materialize vector trip count using VPInstructions. (#151925 ) Materialize the vector trip count computation using VPInstruction instead of directly creating IR. This is one of the last few steps needed to model the full vector skeleton in VPlan. It also simplifies vector-trip count computations for scalable vectors, as we can re-use the UF x VF computation. PR: https://github.com/llvm/llvm-project/pull/151925	2025-08-08 11:44:32 +01:00
Alexey Bataev	0bcf45ea34	[SLP]Initial FMAD support (#149102 ) Added initial check for potential fmad conversion in reductions and operands vectorization.	2025-08-07 09:51:43 -04:00
Florian Hahn	95c32bf2d4	[VPlan] Return invalid cost if any skeleton block has invalid costs. (#151940 ) We need to reject plans that contain recipes with invalid costs. LICM can move recipes with invalid costs out of the loop region, which then get missed by the main cost computation. Extend the logic to check recipes for invalid cost currently only covering the middle block to include all skeleton blocks. Fixes https://github.com/llvm/llvm-project/issues/144358 Fixes https://github.com/llvm/llvm-project/issues/151664 PR: https://github.com/llvm/llvm-project/pull/151940	2025-08-07 10:45:27 +01:00
Florian Hahn	a485e0eae0	[VPlan] Retrieve vector TC for epilogue from resume phi (NFC). Instead of relying on getOrCreateVectorTripCount to initialize EPI.VectorTripCount, delay initialization after we retrieved the resume phi and get the trip count from there. This makes the code independent of legacy vector trip count creation.	2025-08-07 07:52:35 +01:00
Luke Lau	df8da2ff83	[VPlan] Support VPWidenPointerInductionRecipes with EVL tail folding (#152110 ) Now that VPWidenPointerInductionRecipes are modelled in VPlan in #148274, we can support them in EVL tail folding. We need to replace their VFxUF operand with EVL as the increment is not guaranteed to always be VF on the penultimate iteration, and UF is always 1 with EVL tail folding. We also need to move the creation of the backedge value to the latch so that EVL dominates it. With this we will no longer fail to convert a VPlan to EVL tail folding, so adjust tryAddExplicitVectorLength to account for this. This brings us to 99.4% of all vector loops vectorized on SPEC CPU 2017 with tail folding vs no tail folding. The test in only-compute-cost-for-vplan-vfs.ll previously relied on widened pointer inductions with EVL tail folding to end up in a scenario with no vector VPlans, so this also replaces it with an unvectorizable fixed-order recurrence test from first-order-recurrence-multiply-recurrences.ll that also gets discarded.	2025-08-07 10:54:24 +08:00
Ramkumar Ramachandra	092388171f	[VPlan] Introduce m_[Specific]ICmp matcher (#151540 )	2025-08-06 20:35:35 +01:00
Florian Hahn	25d1285eec	[VPlan] Replace single-entry VPPhis with their incoming values. Replace trivial, single-entry VPPhis with their incoming values,	2025-08-06 20:03:31 +01:00
Alexey Bataev	4784ce9ebc	[SLP][NFC]Check an external user before trying to address it in debug dump, NFC	2025-08-06 08:58:16 -07:00
Florian Hahn	e80e7e717e	[VPlan] Use scalar VPPhi instead of VPWidenPHIRecipe in createPlainCFG. (#150847 ) The initial VPlan closely reflects the original scalar loop, so using VPWidenPHIRecipe here is premature. Widened phi recipes should only be introduced together with other widened recipes. PR: https://github.com/llvm/llvm-project/pull/150847	2025-08-06 14:43:03 +01:00
Florian Hahn	777c320e6c	[VPlan] Address comments missed in #142309 . Address additional comments from https://github.com/llvm/llvm-project/pull/142309.	2025-08-06 11:52:08 +01:00
Andrew Rogers	a3c386d241	[llvm] annotate recently added interfaces for DLL export (#152179 ) ## Purpose This patch is one in a series of code-mods that annotate LLVM’s public interface for export. This patch annotates symbols that were recently added to LLVM and fixes incorrectly annotated symbols. ## Background This effort is tracked in #109483. Additional context is provided in [this discourse](https://discourse.llvm.org/t/psa-annotating-llvm-public-interface/85307), and documentation for `LLVM_ABI` and related annotations is found in the LLVM repo [here](https://github.com/llvm/llvm-project/blob/main/llvm/docs/InterfaceExportAnnotations.rst). ## Overview The bulk of these changes were generated automatically using the [Interface Definition Scanner (IDS)](https://github.com/compnerd/ids) tool, followed by formatting with `git clang-format`. The following manual adjustments were also applied after running IDS: - Add `LLVM_EXPORT_TEMPLATE` and `LLVM_TEMPLATE_ABI` annotations to explicitly instantiated instances of `llvm::object::SFrameParser`. ## Validation On Windows 11: ``` cmake -B build -S llvm -G Ninja -DLLVM_ENABLE_PROJECTS="llvm;clang;clang-tools-extra;lldb;lld" -DLLVM_OPTIMIZED_TABLEGEN=ON -DLLVM_BUILD_LLVM_DYLIB=ON -DLLVM_BUILD_LLVM_DYLIB_VIS=ON -DLLVM_LINK_LLVM_DYLIB=ON -DLLVM_BUILD_TESTS=ON -DCLANG_LINK_CLANG_DYLIB=OFF -DCMAKE_BUILD_TYPE=Release ninja -C build ```	2025-08-05 23:12:07 -07:00
Florian Hahn	d478502a42	[VPlan] Ensure that IV resume phi for epilogue is always first. (NFCI) Update handling of canonical IV resume phi for the epilogue loop to make sure the resume phi for the canonical IV is always the first phi in the scalar preheader. This makes it easier to retrieve it in preparePlanForEpilogueVectorLoop. For now, we keep an assert to make sure we use the same resume phi as before. This will be removed in the future.	2025-08-05 21:06:41 +01:00
Florian Hahn	47258ca470	[VPlan] Use VPPhi instead of dyn_cast + opcode check in isPhi (NFC).	2025-08-05 19:20:12 +01:00
Luke Lau	94a6cd464e	[VPlan] Expand VPWidenPointerInductionRecipe into separate recipes (#148274 ) This is the VPWidenPointerInductionRecipe equivalent of #118638, with the motivation of allowing us to use the EVL as the induction step. There is a new VPInstruction added, WidePtrAdd, to allow adding the step vector to the induction phi, since VPInstruction::PtrAdd only handles scalars or multiple scalar lanes. Originally this transformation was copied from the original recipe's execute code, but it has since been simplified by teaching `unrollWidenInductionByUF` to unroll the recipe, which brings it in line with VPWidenIntOrFpInductionRecipe.	2025-08-05 16:54:02 +08:00
Florian Hahn	c9dd14d1d4	[VPlan] Compute interleave count for VPlan. (#149702 ) Move selectInterleaveCount to LoopVectorizationPlanner and retrieve some information directly from VPlan. Register pressure was already computed for a VPlan, and with this patch we now also check for reductions directly on VPlan, as well as checking how many load and store operations remain in the loop. This should be mostly NFC, but we may compute slightly different interleave counts, except for some edge cases, e.g. where dead loads have been removed. This shouldn't happen in practice, and the patch doesn't cause changes across a large test corpus on AArch64. Computing the interleave count based on VPlan allows for making better decisions in presence of VPlan optimizations, for example when operations on interleave groups are narrowed. Note that there are a few test changes for tests that were still checking the legacy cost-model output when it was computed in selectInterleaveCount. PR: https://github.com/llvm/llvm-project/pull/149702	2025-08-05 09:42:55 +01:00
Mel Chen	8761b6cf8f	[VPlan] Use VPTypeAnalysis to get the step type of widen pointer induction (#147925 ) This patch uses VPTypeAnalysis to determine its type since the induction step is not always a live-in value in the VPlan and may be defined by a recipe.	2025-08-05 09:13:44 +08:00
Florian Hahn	0433e1e15f	[VPlan] Add VPlan::getTrue/getFalse convenience helpers (NFC). Makes it slightly more convenient to create true/false constants.	2025-08-04 21:04:55 +01:00
Florian Hahn	215e6beae0	[LV] Use MapVector for ScalarCostsTy for deterministic iter order (NFC) We iterate over the scalar costs of instruction when printing costs, and currently the iteration order is not deterministic. Currently no tests check the output with multiple instructions in the map, but those will come soon.	2025-08-04 19:31:07 +01:00
Alexey Bataev	e27831ff9b	[SLP] Fix a check for main/alternate interchanged instruction If the instruction is checked for matching the main instruction, need to check if the opcode of the main instruction is compatible with the operands of the instruction. If they are not, need to check the alternate instruction and its operands for compatibility and return alternate instruction as a match. Fixes #151699 Fixed check for non-supported binary operations.	2025-08-04 11:20:54 -07:00
Michael Halkenhäuser	70af09e3a1	Revert "[SLP] Fix a check for main/alternate interchanged instruction" (#151997 ) This reverts commit 3ee8d047109ea4bb479095f4b153c2120a8d726c. Revert reason: FAILED build for openmp-offload-amdgpu-runtime-2 https://lab.llvm.org/buildbot/#/builders/10/builds/10827	2025-08-04 12:57:20 -04:00
Alexey Bataev	3ee8d04710	[SLP] Fix a check for main/alternate interchanged instruction If the instruction is checked for matching the main instruction, need to check if the opcode of the main instruction is compatible with the operands of the instruction. If they are not, need to check the alternate instruction and its operands for compatibility and return alternate instruction as a match. Fixes #151699	2025-08-04 08:31:35 -07:00
Simon Pilgrim	88c6448fa2	Revert "[VectorCombine] Shrink loads used in shufflevector rebroadcasts" (#151960 ) Reverts llvm/llvm-project#128938 while a crash regression is investigated	2025-08-04 15:03:53 +01:00
Alexey Bataev	7cd1ce3aa0	[SLP]Check vector-like instruction for dominance in copyables Need to check if the vector-like instruction is dominated by main operation in the copyables to prevent broken def-use chain Fixes #151456	2025-08-04 06:14:19 -07:00
Florian Hahn	66a8341f6d	[VPlan] Skip disconnected exit blocks in hasEarlyExit. (#151718 ) Currently hasEarlyExit returns true, if there are multiple exit blocks. ExitBlocks contains the wrapped original IR exit blocks. Without checking the predecessors we incorrectly return true for loops with multiple countable exits, that have been vectorized by requiring a scalar epilogue. In that case, the exit blocks will get disconnected. Fix this by filtering out disconnected exit blocks. Currently this should only impact the 'early-exit vectorized' statistic. PR: https://github.com/llvm/llvm-project/pull/151718	2025-08-04 11:31:00 +01:00
Leon Clark	1feed444aa	[VectorCombine] Shrink loads used in shufflevector rebroadcasts (#128938 ) Attempt to shrink the size of vector loads where only some of the incoming lanes are used for rebroadcasts in shufflevector instructions. --------- Co-authored-by: Leon Clark <leoclark@amd.com> Co-authored-by: Simon Pilgrim <llvm-dev@redking.me.uk>	2025-08-04 10:49:27 +01:00
Kazu Hirata	3549134836	[Vectorize] Remove an unnecessary cast (NFC) (#151850 ) getNumElements() already returns unsigned.	2025-08-03 08:44:50 -07:00
Florian Hahn	559d1dff89	[VPlan] Materialize BackedgeTakenCount using VPInstructions. Explicitly compute the backedge-taken count using VPInstruction. This is needed to model the full skeleton in VPlan. NFC modulo some instruction re-ordering.	2025-08-03 12:21:28 +01:00
Simon Pilgrim	b983ce8145	[VPlan] handleMaxMinNumReductions - fix gcc Wparentheses warning. NFC.	2025-08-03 11:50:31 +01:00
Florian Hahn	39c30665e9	[VPlan] Update type of cloned instruction in scalarizeInstruction. The operands of the replicate recipe may have been narrowed, resulting in a narrower result type. Update the type of the cloned instruction to the correct type. Fixes https://github.com/llvm/llvm-project/issues/151392.	2025-08-02 19:49:59 +01:00
Florian Hahn	08f50e9665	[VPlan] Use vector tripcount if computable when simplifying conds. (#151034 ) Update isConditionTrueViaVFAndUF to use the vector trip count if computable. This is the case when it has been materialized to a constant. Otherwise fall back to the trip count. PR: https://github.com/llvm/llvm-project/pull/151034	2025-08-02 16:31:31 +01:00
Ramkumar Ramachandra	af0be76a35	[VPlan] Replace reverse RPOT with PO traversal (NFC) (#151757 )	2025-08-02 08:46:27 +01:00
Florian Hahn	eee9755881	[LV] Refine check to find epilogue IV resume value. Make sure to check that the vector trip count is contained in the list of incoming values to serve as tie-breaker with phis with all-zero incoming values. Fixes https://github.com/llvm/llvm-project/issues/151686.	2025-08-01 20:54:39 +01:00
Florian Hahn	c300a99ea8	[LV] Use MapVector for InstsToScalarize for deterministic iter order (NFC) We iterate over InstsToScalarize when printing costs, and currently the iteration order is not deterministic. Currently no tests check the output with multiple instructions in InstsToScalarize, but those will come soon.	2025-08-01 14:29:53 +01:00
Florian Hahn	7d815c7642	[LV] Remove unused variables after 965231ca0a9a. (NFC) Clean up unused/dead variables after 965231ca0a9a (https://github.com/llvm/llvm-project/pull/151311)	2025-08-01 10:12:20 +01:00
Florian Hahn	965231ca0a	[LV] Replace UncountableEdge with UncountableEarlyExitingBlock (NFC). (#151311 ) Only the uncountable exiting BB is used. Store it instead of a pair of Exiting BB and Exit BB. PR: https://github.com/llvm/llvm-project/pull/151311	2025-08-01 09:37:02 +01:00
Mel Chen	86916ff0f0	[LV] Fix gap mask requirement for interleaved access (#151105 ) When interleaved stores contain gaps, a mask is required to skip the gaps, regardless of whether scalar epilogues are allowed. This patch corrects the condition under which a gap mask is needed, ensuring consistency between the legacy and VPlan-based cost models and avoiding assertion failures. Related #149981	2025-08-01 14:24:30 +08:00
Ramkumar Ramachandra	d07f48e4da	[VPlan] Use m_BinaryOr matcher for clarity (NFC) (#151541 )	2025-08-01 06:56:27 +01:00

6397 Commits