llvm-project

Author	SHA1	Message	Date
Thurston Dang	ef5022745c	[NFCI][msan] Refactor into 'horizontalReduce' (#152961 ) The functionality is used by two helper functions, and will be used even more in the future (e.g., https://github.com/llvm/llvm-project/pull/152941).	2025-08-11 15:48:20 -07:00
Florian Hahn	1c7c8e3ad3	Revert "[VPlan] Remove trivial dead VPPhi cycles." This reverts commit 1f17bb133f4f49942a1e0245291811ca3c99a7d2. This seems to be breaking some RISCV bots, reverting for now https://lab.llvm.org/buildbot/#/builders/210/builds/1266	2025-08-11 22:05:30 +01:00
Florian Hahn	1f17bb133f	[VPlan] Remove trivial dead VPPhi cycles. Update removeDeadRecipes to remove trivial dead VPPhi cycles. Should effectively be NFC end-to-end.	2025-08-11 21:29:49 +01:00
XChy	df75b4b942	Revert "[DFAJumpThreading] Prevent pass from using too much memory." (#153075 ) Reverts llvm/llvm-project#145482	2025-08-12 04:26:47 +08:00
Alexey Bataev	2d7b55a028	[SLP]Initial support for copyable elements Adds initial support for copyable elements, both schedulable and non-schedulable. Adds support only for add for now, other opcodes will added in future. Still some cases are not handled, e.g. stores do not include this, because currently do not check for copyable elements. Reviewers: hiraditya, RKSimon Reviewed By: RKSimon Pull Request: https://github.com/llvm/llvm-project/pull/147366	2025-08-11 09:41:19 -04:00
Alexey Bataev	67af2f6c5c	[SLP]Initial FMAD support (#149102 ) Added initial check for potential fmad conversion in reductions and operands vectorization. Added the check for instruction to fix #152683 Skipped the code for reduction to avoid regressions.	2025-08-11 05:53:55 -07:00
Weibo He	13cd725857	[CoroSplit] Remove lifetime marker checks for subranges of allocas (#152886 ) #150248 starts to drop size argument of lifetime markers. Then lifetime markers cannot refer to subrange of allocas and we can remove this check.	2025-08-11 13:02:07 +02:00
Andreas Jonson	330a589450	[PredicateInfo] Handle trunc nuw i1 condition. (#152988 ) proof: https://alive2.llvm.org/ce/z/mxtn4L	2025-08-11 13:00:54 +02:00
Luke Lau	aea82a780a	[VPlan] Remove some getCanonicalIV() uses. NFC (#152969 ) A lot of time getCanonicalIV() is used to get the canonical IV type, e.g. to instantiate a VPTypeAnalysis or to get the LLVMContext. However VPTypeAnalysis has a constructor that takes the VPlan directly and there's a method on VPlan to get the LLVMContext directly, so use those instead where possible. This lets us remove a constructor on VPTypeAnalysis. Also remove an unused LLVMContext argument in UnrollState whilst we're here.	2025-08-11 18:12:05 +08:00
Luke Lau	acb86fb9e0	[TTI] Consistently pass the pointer type to getAddressComputationCost. NFCI (#152657 ) In some places we were passing the type of value being accessed, in other cases we were passing the type of the pointer for the access. The most "involved" user is LoopVectorizationCostModel::getMemInstScalarizationCost, which is the only call site that passes in the SCEV, and it passes along the pointer type. This changes call sites to consistently pass the pointer type, and renames the arguments to clarify this. No target actually checks the contents of the type passed, only to see if it's a vector or not, so this shouldn't have an effect.	2025-08-11 18:00:12 +08:00
Ramkumar Ramachandra	95c525b1db	[VPlan] Preserve nusw on VectorEndPointer (#151558 ) In createInterleaveGroups, get the nusw in addition to inbounds from the existing GEP, and set them on the VPVectorEndPointerRecipe.	2025-08-11 10:38:25 +01:00
David Sherwood	aba0ce10c7	[LV] Add new line to interleaving disabled message (#152722 )	2025-08-11 09:53:20 +01:00
David Sherwood	9181a7e294	[LV] Fix branch weights in epilogue min iteration check block (#152534 ) I've changed how we construct the EpilogueVectorizerEpilogueLoop and EpilogueVectorizerMainLoop classes so that we construct the parent class with an additional boolean parameter indicating whether we're vectorising the main or epilogue loop. The InnerLoopAndEpilogueVectorizer class uses this new argument in combination with the EpilogueLoopVectorizationInfo struct to set the right UF and VF values. This then allows EpilogueVectorizerEpilogueLoop to access the correct values of VF and UF for the main loop, which are required when setting branch weights in the minimum iteration check block.	2025-08-11 09:52:54 +01:00
Elvis Wang	37fe7a9933	[LV] Generate scalar xor for VPInstruction::Not if possible. (#152628 ) `VPInstruction::Not` which will generate xor instruction is widely used for the exit condition. This patch make `VPInstruction::Not` generate scalar `xor` if possible. This can help reducing the (splat true) in the `xor` and make `xor` be scalar.	2025-08-11 16:35:21 +08:00
Yingwei Zheng	84b31581f8	Revert "[PatternMatch] Add `m_[Shift]OrSelf` matchers." (#152953 ) Reverts llvm/llvm-project#152924 According to `f67668b586`, it is not an NFC change.	2025-08-11 09:35:16 +02:00
hanbeom	a750fcb52b	[GVN] Check IndirectBr in Predecessor Terminators (#151188 ) Critical edges with an IndirectBr terminator cannot be split. Add a check it to prevent assertion failures. Fixes: #150229	2025-08-11 09:25:52 +02:00
Nikita Popov	35bad229c1	[PredicateInfo] Use bitcast instead of ssa.copy (#151174 ) PredicateInfo needs some no-op to which the predicate can be attached. Currently this is an ssa.copy intrinsic. This PR replaces it with a no-op bitcast. Using a bitcast is more efficient because we don't have the overhead of an overloaded intrinsic. It also makes things slightly simpler overall.	2025-08-11 09:25:01 +02:00
David Green	6ca6d45b29	[VectorCombine] Use hasOneUser in shuffle-to-identity fold (#152675 ) We need to check that the node is part of the graph being converted, so will not contain external uses when transformed.	2025-08-11 07:45:15 +01:00
Mel Chen	6db3776f9b	[LV][EVL] Simplify EVL recipe transformation by using a single EVL mask. nfc (#152479 ) The EVL mask is always defined as `icmp ult (step-vector, EVL)`, so we only need to generate it once per plan in the header. Then, we replace all uses of the header mask with the EVL mask, and recursively optimize the users of EVL mask into EVL recipes. This way, the transformation to EVL recipes can be done with just a single loop.	2025-08-11 11:09:01 +08:00
Yingwei Zheng	1c499351d6	[PatternMatch] Add `m_[Shift]OrSelf` matchers. (#152924 ) Address the comment https://github.com/llvm/llvm-project/pull/147414/files#r2228612726. As they are usually used to match integer packing patterns, it is enough to handle constant shamts.	2025-08-11 09:58:16 +08:00
Florian Hahn	86813aa786	[VPlan] Add dedicated user for resume phi with epilogue vectorization. Epilogue vectorization currently relies on the resume phi for the canonical induction being always available, which is why VPPhi are considered to have side-effects, to prevent their removal. This patch adds a new ResumeForEpilogue opcode to mark the resume phi as used for epilogue vectorization. This allows treating VPPhis in general as not having side-effects, enabling removal of unused VPPhis.	2025-08-10 21:21:16 +01:00
David Green	cfe190979e	Revert "[SLP]Initial FMAD support (#149102 )" This reverts commit 0fffb9f9ed81f4c2084b8fe040c88b60bb6c372a due to major performance regressions.	2025-08-10 15:16:01 +01:00
weiguozhi	5e87792200	[LoopInfo] Pointer to stack object may not be loop invariant in a coroutine function (#149936 ) A coroutine function may be split to ramp function and resume function, and they have different stack frames, so a pointer to stack objects may have different addresses depending on where it is used, so it's not a loop invariant. It temporarily fixes https://github.com/llvm/llvm-project/issues/149604.	2025-08-09 14:20:19 -07:00
Florian Hahn	06fd0f9d65	[VPlan] Move initial skeleton construction earlier (NFC). (#150848 ) Split up the not clearly named prepareForVectorization transform into buildVPlan0, which adds the vector preheader, middle and scalar preheader blocks, as well as the canonical induction recipes and sets the trip count. The new transform is run directly after building the plain CFG VPlan initially. The remaining code handling early exits and adding the branch in the middle block is renamed to handleEarlyExitsAndAddMiddleCheck and still runs at the original position. With the code movement, we only have to add the skeleton once to the initial VPlan, and cloning will take care of the rest. It will also enable moving other construction steps to work directly on VPlan0, like adding resume phis. PR: https://github.com/llvm/llvm-project/pull/150848	2025-08-09 20:54:42 +01:00
Alexey Bataev	0fffb9f9ed	[SLP]Initial FMAD support (#149102 ) Added initial check for potential fmad conversion in reductions and operands vectorization. Added the check for instruction to fix #152683	2025-08-08 10:30:23 -07:00
Alexander Richardson	3a4b351ba1	[IR] Introduce the `ptrtoaddr` instruction This introduces a new `ptrtoaddr` instruction which is similar to `ptrtoint` but has two differences: 1) Unlike `ptrtoint`, `ptrtoaddr` does not capture provenance 2) `ptrtoaddr` only extracts (and then extends/truncates) the low index-width bits of the pointer For most architectures, difference 2) does not matter since index (address) width and pointer representation width are the same, but this does make a difference for architectures that have pointers that aren't just plain integer addresses such as AMDGPU fat pointers or CHERI capabilities. This commit introduces textual and bitcode IR support as well as basic code generation, but optimization passes do not handle the new instruction yet so it may result in worse code than using ptrtoint. Follow-up changes will update capture tracking, etc. for the new instruction. RFC: https://discourse.llvm.org/t/clarifiying-the-semantics-of-ptrtoint/83987/54 Reviewed By: nikic Pull Request: https://github.com/llvm/llvm-project/pull/139357	2025-08-08 10:12:39 -07:00
Drew Kersnar	90e8c8e718	[InferAlignment] Propagate alignment between loads/stores of the same base pointer (#145733 ) We can derive and upgrade alignment for loads/stores using other well-aligned loads/stores. This optimization does a single forward pass through each basic block and uses loads/stores (the alignment and the offset) to derive the best possible alignment for a base pointer, caching the result. If it encounters another load/store based on that pointer, it tries to upgrade the alignment. The optimization must be a forward pass within a basic block because control flow and exception throwing can impact alignment guarantees. --------- Co-authored-by: Nikita Popov <github@npopov.com>	2025-08-08 12:05:29 -05:00
Alexey Bataev	0419b459be	Revert "[SLP]Initial FMAD support (#149102 )" This reverts commit 0bcf45ea3458ba79eb4257afcfd6af954292c9ce to fix the regresions, reported in https://github.com/llvm/llvm-project/issues/152683	2025-08-08 09:17:59 -07:00
Szymon Piotr Milczek	fd41700962	[InstCombine] visitShuffleVectorInst assert with vector of pointers fix. (#152341 ) In visitShuffleVectorInst there's an if block that's meant to turn shufflevector followed by bitcast into extractelement where possible. It assumes that there will never be bitcasts performed on vectors of ptr as such operations are almost always illegal, and ptrtoint instructions should be used instead. There is however an edge case where a bitcast instruction can be performed on a vector of type `<1 x ptr>` to turn it into type `ptr` In this edge case, the code initializes the variable `VecBitWidth` to 0. Then, when iterating over users that are bitcasts, an attempt is made to create a vector of size 0, which triggers and assert. This commit changes initialization of `VecBitWidth` to use datalayout to find the the size of the vector instead of getPrimitiveSizeInBits method which results in 0 for ptr and vectors of ptr.	2025-08-08 15:23:02 +02:00
Mel Chen	ab7281d896	[VPlan] Update naming in VPInterleaveRecipe constructor. nfc (#152472 )	2025-08-08 20:17:10 +08:00
Florian Hahn	82d633e9ff	[VPlan] Materialize vector trip count using VPInstructions. (#151925 ) Materialize the vector trip count computation using VPInstruction instead of directly creating IR. This is one of the last few steps needed to model the full vector skeleton in VPlan. It also simplifies vector-trip count computations for scalable vectors, as we can re-use the UF x VF computation. PR: https://github.com/llvm/llvm-project/pull/151925	2025-08-08 11:44:32 +01:00
Nikita Popov	c23b4fbdbb	[IR] Remove size argument from lifetime intrinsics (#150248 ) Now that #149310 has restricted lifetime intrinsics to only work on allocas, we can also drop the explicit size argument. Instead, the size is implied by the alloca. This removes the ability to only mark a prefix of an alloca alive/dead. We never used that capability, so we should remove the need to handle that possibility everywhere (though many key places, including stack coloring, did not actually respect this).	2025-08-08 11:09:34 +02:00
Antonio Frighetto	e977b28c37	[InstCombine] Match intrinsic recurrences when known to be hoisted For value-accumulating recurrences of kind: ``` %umax.acc = phi i8 [ %umax, %backedge ], [ %a, %entry ] %umax = call i8 @llvm.umax.i8(i8 %umax.acc, i8 %b) ``` The binary intrinsic may be simplified into an intrinsic with init value and the other operand, if the latter is loop-invariant: ``` %umax = call i8 @llvm.umax.i8(i8 %a, i8 %b) ``` Proofs: https://alive2.llvm.org/ce/z/ea2cVC. Fixes: https://github.com/llvm/llvm-project/issues/145875.	2025-08-08 09:31:50 +02:00
Bushev Dmitry	b5902924b2	[DFAJumpThreading] Prevent pass from using too much memory. (#145482 ) The limit 'dfa-max-num-paths' that is used to control number of enumerated paths was not checked against inside getPathsFromStateDefMap. It may lead to large memory consumption for complex enough switch statements.	2025-08-07 20:15:42 +03:00
Alexey Bataev	0bcf45ea34	[SLP]Initial FMAD support (#149102 ) Added initial check for potential fmad conversion in reductions and operands vectorization.	2025-08-07 09:51:43 -04:00
Nikita Popov	dbfc3ed690	[TypeSanitizer] Use alloca size for lifetime markers (#152154 ) Split out from https://github.com/llvm/llvm-project/pull/150248: Use the size of the alloca instead of the size passed to the lifetime intrinsic. As a bonus, this handles dynamic allocas correctly (see the added test) instead of doing a memset with size -1...	2025-08-07 14:39:32 +02:00
Florian Hahn	95c32bf2d4	[VPlan] Return invalid cost if any skeleton block has invalid costs. (#151940 ) We need to reject plans that contain recipes with invalid costs. LICM can move recipes with invalid costs out of the loop region, which then get missed by the main cost computation. Extend the logic to check recipes for invalid cost currently only covering the middle block to include all skeleton blocks. Fixes https://github.com/llvm/llvm-project/issues/144358 Fixes https://github.com/llvm/llvm-project/issues/151664 PR: https://github.com/llvm/llvm-project/pull/151940	2025-08-07 10:45:27 +01:00
Matt Arsenault	1110e2ff9f	InlineFunction: Split inlining into predicate and apply functions (#134213 ) This is to support a new inline function reduction in llvm-reduce, which should pre-filter callsites that are not eligible for inlining. This code was mostly structured as a match and apply, with a few exceptions. The ugliest piece is for propagating and verifying compatible getGC and personalities. Also collection of EHPad and the convergence token to use are now cached in InlineFunctionInfo. I was initially confused by the split between the checks performed here and isInlineViable, so better document how this system is supposed to work. It turns out this split does make sense, in that isInlineViable checks if it's possible based on the callee content and the ultimate inline depended on the callsite context. I think more renames of these functions would help, and isInlineViable should probably move out of InlineCost to be with these transfoms.	2025-08-07 16:13:36 +09:00
Florian Hahn	a485e0eae0	[VPlan] Retrieve vector TC for epilogue from resume phi (NFC). Instead of relying on getOrCreateVectorTripCount to initialize EPI.VectorTripCount, delay initialization after we retrieved the resume phi and get the trip count from there. This makes the code independent of legacy vector trip count creation.	2025-08-07 07:52:35 +01:00
Luke Lau	df8da2ff83	[VPlan] Support VPWidenPointerInductionRecipes with EVL tail folding (#152110 ) Now that VPWidenPointerInductionRecipes are modelled in VPlan in #148274, we can support them in EVL tail folding. We need to replace their VFxUF operand with EVL as the increment is not guaranteed to always be VF on the penultimate iteration, and UF is always 1 with EVL tail folding. We also need to move the creation of the backedge value to the latch so that EVL dominates it. With this we will no longer fail to convert a VPlan to EVL tail folding, so adjust tryAddExplicitVectorLength to account for this. This brings us to 99.4% of all vector loops vectorized on SPEC CPU 2017 with tail folding vs no tail folding. The test in only-compute-cost-for-vplan-vfs.ll previously relied on widened pointer inductions with EVL tail folding to end up in a scenario with no vector VPlans, so this also replaces it with an unvectorizable fixed-order recurrence test from first-order-recurrence-multiply-recurrences.ll that also gets discarded.	2025-08-07 10:54:24 +08:00
Ramkumar Ramachandra	092388171f	[VPlan] Introduce m_[Specific]ICmp matcher (#151540 )	2025-08-06 20:35:35 +01:00
Florian Hahn	25d1285eec	[VPlan] Replace single-entry VPPhis with their incoming values. Replace trivial, single-entry VPPhis with their incoming values,	2025-08-06 20:03:31 +01:00
Alexey Bataev	4784ce9ebc	[SLP][NFC]Check an external user before trying to address it in debug dump, NFC	2025-08-06 08:58:16 -07:00
Yussur Mustafa Oraji	ded1f3ec96	[TSan] Add option to ignore capturing behavior when instrumenting (#148156 ) While not needed for most applications, some tools such as [MUST](https://www.i12.rwth-aachen.de/cms/i12/forschung/forschungsschwerpunkte/lehrstuhl-fuer-hochleistungsrechnen/~nrbe/must/) depend on the instrumentation being present. MUST uses the ThreadSanitizer annotation interface to detect data races in MPI programs, where the capture tracking is detrimental as it has no bearing on MPI data races, leading to missed races.	2025-08-06 15:47:33 +02:00
Florian Hahn	e80e7e717e	[VPlan] Use scalar VPPhi instead of VPWidenPHIRecipe in createPlainCFG. (#150847 ) The initial VPlan closely reflects the original scalar loop, so unsing VPWidenPHIRecipe here is premature. Widened phi recipes should only be introduced together with other widened recipes. PR: https://github.com/llvm/llvm-project/pull/150847	2025-08-06 14:43:03 +01:00
Florian Hahn	777c320e6c	[VPlan] Address comments missed in #142309 . Address additional comments from https://github.com/llvm/llvm-project/pull/142309.	2025-08-06 11:52:08 +01:00
Andrew Rogers	a3c386d241	[llvm] annotate recently added interfaces for DLL export (#152179 ) ## Purpose This patch is one in a series of code-mods that annotate LLVM’s public interface for export. This patch annotates symbols that were recently added to LLVM and fixes incorrectly annotated symbols. ## Background This effort is tracked in #109483. Additional context is provided in [this discourse](https://discourse.llvm.org/t/psa-annotating-llvm-public-interface/85307), and documentation for `LLVM_ABI` and related annotations is found in the LLVM repo [here](https://github.com/llvm/llvm-project/blob/main/llvm/docs/InterfaceExportAnnotations.rst). ## Overview The bulk of these changes were generated automatically using the [Interface Definition Scanner (IDS)](https://github.com/compnerd/ids) tool, followed formatting with `git clang-format`. The following manual adjustments were also applied after running IDS: - Add `LLVM_EXPORT_TEMPLATE` and `LLVM_TEMPLATE_ABI` annotations to explicitly instantiated instances of `llvm::object::SFrameParser`. ## Validation On Windows 11: ``` cmake -B build -S llvm -G Ninja -DLLVM_ENABLE_PROJECTS="llvm;clang;clang-tools-extra;lldb;lld" -DLLVM_OPTIMIZED_TABLEGEN=ON -DLLVM_BUILD_LLVM_DYLIB=ON -DLLVM_BUILD_LLVM_DYLIB_VIS=ON -DLLVM_LINK_LLVM_DYLIB=ON -DLLVM_BUILD_TESTS=ON -DCLANG_LINK_CLANG_DYLIB=OFF -DCMAKE_BUILD_TYPE=Release ninja -C build ```	2025-08-05 23:12:07 -07:00
Mircea Trofin	f675483905	[profcheck] Annotate `select` instructions (#152171 ) For `select`, we don't have the equivalent of the branch probability analysis to offer defaults, so we make up our own and allow their overriding with flags. Issue #147390	2025-08-06 02:48:50 +02:00
Florian Hahn	d478502a42	[VPlan] Ensure that IV resume phi for epilogue is always first. (NFCI) Update handling of canonical IV resume phi for the epilogue loop to make sure the resume phi for the canonical IV is always the first phi in the scalar preheader. This makes it easier to retrieve it in preparePlanForEpilogueVectorLoop. For now, we keep an assert to make sure we use the same resume phi as before. This will be removed in the future.	2025-08-05 21:06:41 +01:00
Florian Hahn	47258ca470	[VPlan] Use VPPhi instead of dyn_cast + opcode check in isPhi (NFC).	2025-08-05 19:20:12 +01:00

1 2 3 4 5 ...

40664 Commits