llvm-project

Author	SHA1	Message	Date
Florian Hahn	2bdc1a1337	[LV] Use frozen start value for FindLastIV if needed. (#132691 ) FindLastIV introduces multiple uses of the start value, where in the original source there was only a single use, when the epilogue is vectorized. Each use of undef may produce a different result, so introducing multiple uses can produce incorrect results when the input is undef/poison. If the start value may be undef or poison, freeze it and use the frozen value, which will be the same at all uses. See the following scenarios in Alive2: * Both main and epilogue vector loops execute, go to exit block: https://alive2.llvm.org/ce/z/_TSvRr * Both main and epilogue vector loops execute, go to scalar loop: https://alive2.llvm.org/ce/z/CsPj5v * Only epilogue vector loop executes, go to exit block: https://alive2.llvm.org/ce/z/5XqkNV * Only epilogue vector loop executes, go to scalar loop: https://alive2.llvm.org/ce/z/JUpqRN The latter 2 show requiring freezing the resume phi. That means we cannot freeze in the preheader. We could move the freeze to the main iteration count check, but that would be a bit fragile to find and other transforms can sink the freeze if needed. Depends on https://github.com/llvm/llvm-project/pull/132689 and https://github.com/llvm/llvm-project/pull/132690. Fixes https://github.com/llvm/llvm-project/issues/126836 PR: https://github.com/llvm/llvm-project/pull/132691	2025-04-04 11:48:01 +01:00
Florian Hahn	a4573ee38d	[LoopUnroll] UnrollRuntimeMultiExit takes precedence over TTI. (#134259 ) Update UnrollRuntimeLoopRemainder to always give priority to the UnrollRuntimeMultiExit option, if provided. After ad9da92cf6f7357 (https://github.com/llvm/llvm-project/pull/124462), we would ignore the option if the backend indicates multi-exit is profitable. This means it cannot be used to disable runtime unrolling. To be consistent with canProfitablyRuntimeUnrollMultiExitLoop, always respect the option. This surfaced while discussing https://github.com/llvm/llvm-project/pull/131998. PR: https://github.com/llvm/llvm-project/pull/134259	2025-04-04 10:16:50 +01:00
Tobias Stadler	1302610f03	[MergeFunc] Fix crash caused by bitcasting ArrayType (#133259 ) createCast in MergeFunctions did not consider ArrayTypes, which results in the creation of a bitcast between ArrayTypes in the thunk function, leading to an assertion failure in the provided test case. The version of createCast in GlobalMergeFunctions does handle ArrayTypes, so this common code has been factored out into the IRBuilder.	2025-04-04 10:16:40 +01:00
Mircea Trofin	4532512f6c	[ctxprof] Move `MoveSymbolGUID` to address dependency issues (#134334 ) See PR #134192	2025-04-03 19:02:46 -07:00
Mircea Trofin	2146826169	[ctxprof] Support for "move" semantics for the contextual root (#134192 ) This PR finishes what PR #133992 started.	2025-04-03 18:36:45 -07:00
Alex MacLean	ba0a52a04b	[InferAS] Support getAssumedAddrSpace for Arguments for NVPTX (#133991 )	2025-04-03 16:47:36 -07:00
Florian Hahn	cdff7f0b6e	[LV] Retrieve middle VPBB via scalar ph to fix epilogue resumephis (NFC) If ScalarPH has predecessors, we may need to update its reduction resume values. If there is a middle block, it must be the first predecessor. Note that the first predecessor may not be the middle block, if the middle block doesn't branch to the scalar preheader. In that case, fixReductionScalarResumeWhenVectorizingEpilog will be a no-op. In preparation for https://github.com/llvm/llvm-project/pull/106748.	2025-04-03 21:46:48 +01:00
Mircea Trofin	61768b3528	[ctxprof] Don't import roots elsewhere (#134012 ) Block a context root from being imported by its callers. Suppose that happened. Its caller - usually a message pump - inlines its copy of the root. Then it (the root) and whatever it calls will be the non-contextually optimized callee versions.	2025-04-03 13:21:39 -07:00
Alexey Bataev	daab7d0807	[SLP]Initial support for (masked)loads + compress and (masked)interleaved Added initial support for (masked)loads + compress and (masked)interleaved loads. Reviewers: RKSimon, hiraditya Reviewed By: RKSimon Pull Request: https://github.com/llvm/llvm-project/pull/132099	2025-04-03 13:17:40 -07:00
Alexey Bataev	7c4013d591	Revert "[SLP]Initial support for (masked)loads + compress and (masked)interleaved" This reverts commit 0bec0f5c059af5f920fe22ecda469b666b5971b0 to fix a crash reported in https://lab.llvm.org/buildbot/#/builders/143/builds/6668.	2025-04-03 12:58:49 -07:00
Alexey Bataev	0bec0f5c05	[SLP]Initial support for (masked)loads + compress and (masked)interleaved Added initial support for (masked)loads + compress and (masked)interleaved loads. Reviewers: RKSimon, hiraditya Reviewed By: RKSimon Pull Request: https://github.com/llvm/llvm-project/pull/132099	2025-04-03 13:21:22 -04:00
Matt Arsenault	a54736afd5	CloneFunction: Do not delete blocks with address taken (#134209 ) If a block with a single predecessor also had its address taken, it was getting deleted in this post-inline cleanup step. This would result in the blockaddress in the resulting function getting deleted and replaced with inttoptr 1. This fixes one bug required to permit inlining of functions with blockaddress uses. At the moment this is not testable (at least without an annoyingly complex unit test), and is a pre-bug fix for future patches. Functions with blockaddress uses are rejected in isInlineViable, so we don't get this far with the current InlineFunction uses (some of the existing cases seem to reproduce this part of the rejection logic, like PartialInliner). This will be tested in a pending llvm-reduce change. Prerequisite for #38908	2025-04-03 23:52:25 +07:00
gbMattN	61ef286506	Fix signed/unsigned mismatch warning (#134255 )	2025-04-03 15:56:33 +01:00
gbMattN	59074a3760	[ASan] Add metadata to renamed instructions so ASan doesn't use the incorrect name (#119387 ) Clang needs variables to be represented with unique names. This means that if a variable shadows another, it's given a different name internally to ensure it has a unique name. If ASan tries to use this name when printing an error, it will print the modified unique name, rather than the variable's source code name. Fixes #47326	2025-04-03 15:27:14 +01:00
Ramkumar Ramachandra	6bbdc70066	[LV] Use getCallWideningDecision in more places (NFC) (#134236 )	2025-04-03 14:53:19 +01:00
Camsyn	ecc35456d7	[Utils] Fix incorrect LCSSA PHI nodes when splitting critical edges with MergeIdenticalEdges (#131744 ) This PR fixes incorrect LCSSA PHI node generation when splitting critical edges with both `PreserveLCSSA` and `MergeIdenticalEdges` enabled. The bug caused PHI nodes in the split block to miss predecessors when multiple identical edges were merged.	2025-04-03 12:02:03 +02:00
Yingwei Zheng	73e1710a4d	[SimplifyCFG] Remove unused variable. NFC. (#134211 )	2025-04-03 15:22:51 +08:00
Ryotaro Kasuga	91f3965be4	[LoopInterchange] Fix the vectorizable check for a loop (#133667 ) In the profitability check for vectorization, the dependency matrix was not handled correctly. This can lead to a wrong decision: it may say "this loop can be vectorized" when in fact it cannot. The root cause is that the check returns early when it finds '=' or 'I' in the dependency matrix. To make sure that we can actually vectorize the loop, we need to check all the rows of the matrix. This patch fixes the process of checking whether we can vectorize the loop or not. Now it won't make a wrong decision for a loop that cannot be vectorized. Related: #131130	2025-04-03 16:21:19 +09:00
Yingwei Zheng	b6c0ce0bb6	[IR][NFC] Use `SwitchInst::defaultDestUnreachable` (#134199 )	2025-04-03 14:47:47 +08:00
Snehasish Kumar	7f2abe8fd1	Revert "[Metadata] Preserve MD_prof when merging instructions when one is missing." (#134200 ) Reverts llvm/llvm-project#132433 I suspect this change caused a failure in the bolt build bot. https://lab.llvm.org/buildbot/#/builders/113/builds/6621 ``` !9185 = !{!"branch_weights", i32 3912, i32 802} Wrong number of operands !9185 = !{!"branch_weights", i32 3912, i32 802} fatal error: error in backend: Broken module found, compilation aborted! ```	2025-04-02 22:11:17 -07:00
Mircea Trofin	d59b2c4def	[ctxprof][nfc] Make `computeImportForFunction` a member of `ModuleImportsManager` (#134011 )	2025-04-02 18:18:17 -07:00
Mircea Trofin	02467f9e21	[ctxprof] Option to move a whole tree to its own module (#133992 ) Modules may contain a mix of functions that participate or don't participate in callgraphs covered by a contextual profile. We currently have been importing all the functions under a context root in the module defining that root, but if the other functions there are covered by flat profiles, the result is difficult to reason about. This patch allows moving everything under a context root (and that root) in its own module. For now, we expect a module with a filename matching the GUID of the function be present in the set of modules known by the linker. This mechanism can be improved in a later patch. Subsequent patches will handle implementing "move" instead of "import" semantics for the root function (because we want to make sure only one version of the root exists - so the optimizations we perform are actually the ones being observed at runtime).	2025-04-02 18:15:48 -07:00
Matt Arsenault	7559c64c5e	CloneModule: Map global initializers after mapping the function (#134082 )	2025-04-03 07:17:12 +07:00
Florian Hahn	380defd4b3	[VPlan] Update VPInterleaveRecipe to take debug loc directly as arg (NFC)	2025-04-02 22:46:38 +01:00
Florian Hahn	4b67c53e20	[VPlan] Use recipe debug loc instead of instr DLs in more cases (NFC) Update both VPInterleaveRecipe and VPReplicateRecipe codegen to use debug location directly from the recipe, not the underlying instruction. This removes another dependency on underlying instructions.	2025-04-02 21:51:17 +01:00
vporpo	a1b0b4997e	[SandboxVec][NFC] Replace std::regex with llvm::Regex (#134110 )	2025-04-02 13:46:56 -07:00
Krzysztof Drewniak	554859c736	[TTI] Make isLegalMasked{Load,Store} take an address space (#134006 ) In order to facilitate targets that only support masked loads/stores on certain address spaces (AMDGPU will support them in an upcoming patch, but only for address space 7), add an AddressSpace parameter to isLegalMaskedLoad and isLegalMaskedStore	2025-04-02 15:38:10 -05:00
Florian Hahn	3bdf9a0880	[EquivalenceClasses] Use SmallVector for deterministic iteration order. (#134075 ) Currently iterators over EquivalenceClasses will iterate over std::set, which guarantees the order specified by the comparator. Unfortunately in many cases, EquivalenceClasses are used with pointers, so iterating over std::set of pointers will not be deterministic across runs. There are multiple places that explicitly try to sort the equivalence classes before using them to try to get a deterministic order (LowerTypeTests, SplitModule), but there are others that do not at the moment and this can result at least in non-deterministic value naming in Float2Int. This patch updates EquivalenceClasses to keep track of all members via an extra SmallVector and removes code from LowerTypeTests and SplitModule to sort the classes before processing. Overall it looks like compile-time slightly decreases in most cases, but close to noise: https://llvm-compile-time-tracker.com/compare.php?from=7d441d9892295a6eb8aaf481e1715f039f6f224f&to=b0c2ac67a88d3ef86987e2f82115ea0170675a17&stat=instructions PR: https://github.com/llvm/llvm-project/pull/134075	2025-04-02 20:27:43 +01:00
Alexey Bataev	843ef77dc2	[SLP]Update mapping between values and their matching entries upon selection Need to update the mapping between gathered values and their matching entries, if the list of the entries is updated and only some of them are selected for final shuffling. Fixes #134085	2025-04-02 11:59:32 -07:00
Snehasish Kumar	c18994c7cd	[Metadata] Preserve MD_prof when merging instructions when one is missing. (#132433 ) Preserve branch weight metadata when merging instructions if one of the instructions is missing metadata. This is similar in behaviour to what we do today for other types of metadata such as mmra, memprof and callsite metadata.	2025-04-02 11:13:45 -06:00
Snehasish Kumar	dde0be9d97	[Metadata] Handle memprof, callsite merging when one is missing. (#132106 ) For memprof and callsite metadata we want to pick one deterministically and keep that even if one of them may be missing.	2025-04-02 11:10:02 -06:00
Alexey Bataev	48a4b14cb6	[SLP]Fix whole vector registers calculations for compares Need to check that the calculated number of the elements is not larger than the original number of scalars to prevent a compiler crash. Fixes #134013	2025-04-02 07:26:40 -07:00
Yingwei Zheng	65ed35393c	[IR] Add helper `CmpPredicate::dropSameSign` (#134071 ) Address review comment https://github.com/llvm/llvm-project/pull/133711#discussion_r2024519641	2025-04-02 22:25:01 +08:00
Han-Kuan Chen	5bbcc765cc	[SLP][REVEC] getNumElements should not be used as VF when REVEC is enabled. (#134031 )	2025-04-02 19:04:07 +08:00
Luke Lau	8107b430ed	[VPlan] Simplify select c, x, x -> x (#133731 ) As noted in 1a9358c090d0507be21c5e9b2d97a23ef1de8ab0, some simplifications can produce a redundant select where the true and false operands are the same, which this patch removes. The is_fpclass test was changed so the condition wasn't made dead.	2025-04-02 10:26:48 +01:00
Ryotaro Kasuga	528e408b94	[LoopInterchange] Add an option to control the cost heuristics applied (#133664 ) LoopInterchange has several heuristic functions to determine if exchanging two loops is profitable or not. Whether or not to use each heuristic and the order in which to use them were fixed, but #125830 allows them to be changed internally at will. This patch adds a new option to control them via the compiler option. The previous patch also added an option to prioritize the vectorization heuristic. This patch also removes it to avoid conflicts between it and the newly introduced one, e.g., both `-loop-interchange-prioritize-vectorization=1` and `-loop-interchange-profitabilities='cache,vectorization'` are specified.	2025-04-02 15:41:40 +09:00
Alexey Bataev	0e3049c562	[SLP]Support revectorization of the previously vectorized scalars If a scalar instruction is marked for vectorization in the tree, it cannot, in general, be vectorized as part of another node in the same tree. This may prevent some potentially profitable vectorization opportunities, since some nodes end up being buildvector/gather nodes, which add to the total cost. The patch allows revectorization of the previously vectorized scalars. Reviewers: hiraditya, RKSimon Reviewed By: RKSimon, hiraditya Pull Request: https://github.com/llvm/llvm-project/pull/133091	2025-04-01 14:30:06 -04:00
Matt Arsenault	7e25b24073	IRNormalizer: Replace cl::opts with pass parameters (#133874 ) Not sure why the "fold-all" option naming didn't match the variable "FoldPreOutputs", but I've preserved the difference. More annoyingly, the pass name "normalize" does not match the pass name IRNormalizer and should probably be fixed one way or the other. Also the existing test coverage for the flags is lacking. I've added a test that shows they parse, but we should have tests that they do something.	2025-04-01 23:27:20 +07:00
Jeremy Morse	1ebc308bba	[DebugInfo][RemoveDIs] Remove debug-intrinsic printing cmdline options (#131855 ) During the transition from debug intrinsics to debug records, we used several different command line options to customise handling: the printing of debug records to bitcode and textual could be independent of how the debug-info was represented inside a module, whether the autoupgrader ran could be customised. This was all valuable during development, but now that totally removing debug intrinsics is coming up, this patch removes those options in favour of a single flag (experimental-debuginfo-iterators), which enables autoupgrade, in-memory debug records, and debug record printing to bitcode and textual IR. We need to do this ahead of removing the experimental-debuginfo-iterators flag, to reduce the amount of test-juggling that happens at that time. There are quite a number of weird test behaviours related to this -- some of which I simply delete in this commit. Things like print-non-instruction-debug-info.ll , the test suite now checks for debug records in all tests, and we don't want to check we can print as intrinsics. Or the update_test_checks tests -- these are duplicated with write-experimental-debuginfo=false to ensure file writing for intrinsics is correct, but that's something we're imminently going to delete. A short survey of curious test changes: * free-intrinsics.ll: we don't need to test that debug-info is a zero cost intrinsic, because we won't be using intrinsics in the future. * undef-dbg-val.ll: apparently we pinned this to non-RemoveDIs in-memory mode while we sorted something out; it works now either way. * salvage-cast-debug-info.ll: was testing intrinsics-in-memory get salvaged, isn't necessary now * localize-constexpr-debuginfo.ll: was producing "dead metadata" intrinsics for optimised-out variable values, dbg-records takes the (correct) representation of poison/undef as an operand. Looks like we didn't update this in the past to avoid spurious test differences. * Transforms/Scalarizer/dbginfo.ll: this test was explicitly testing that debug-info affected codegen, and we deferred updating the tests until now. This is just one of those silent gnochange issues that get fixed by RemoveDIs. Finally: I've added a bitcode test, dbg-intrinsics-autoupgrade.ll.bc, that checks we can autoupgrade debug intrinsics that are in bitcode into the new debug records.	2025-04-01 14:27:11 +01:00
Florian Hahn	9e5bfbf77d	[EquivalenceClasses] Update member_begin to take ECValue (NFC). Remove a level of indirection and update code to use range-based for loops.	2025-04-01 09:28:46 +01:00
Florian Hahn	64d493f987	[EquivalenceClasses] Return ECValue directly from insert (NFC). Removes a redundant lookup in the mapping.	2025-04-01 08:45:46 +01:00
Ningning Shi(史宁宁)	6b647de031	[NFC] Remove the unused hasMinSize() (#133838 ) The 'hasOptSize()' is 'hasFnAttribute(Attribute::OptimizeForSize) || hasMinSize()', so we don't need another 'hasMinSize()'.	2025-04-01 15:23:34 +08:00
Alexey Bataev	cf6a452cc7	[SLP]Fix same/alternate analysis in split node analysis for compares getSameOpcode in some cases may consider 2 compares as having same opcode, even though previously they were considered as alternate. It may happen, because getSameOpcode loses info about previous instructions and their states. Need to use isAlternateInstruction function instead for the correct analysis. Reviewers: RKSimon, hiraditya Reviewed By: RKSimon Pull Request: https://github.com/llvm/llvm-project/pull/133769	2025-03-31 19:33:40 -04:00
Luke Lau	6afe5e5d1a	[LV][EVL] Peek through combination tail-folded + predicated masks (#133430 ) If a recipe was predicated and tail folded at the same time, it will have a mask like EMIT vp<%header-mask> = icmp ule canonical-iv, backedge-tc EMIT vp<%mask> = logical-and vp<%header-mask>, vp<%pred-mask> When converting to an EVL recipe, if the mask isn't exactly just the header-mask we copy the whole logical-and. We can remove this redundant logical-and (because it's now covered by EVL) and just use vp<%pred-mask> instead. This lets us remove the widened canonical IV in more places.	2025-03-31 21:28:39 +01:00
Luke Lau	b739a3cb65	[VPlan] Add m_Deferred. NFC (#133736 ) This copies over the implementation of m_Deferred which allows matching values that were bound in the pattern, and uses it for the (X && Y) || (X && !Y) -> X simplification.	2025-03-31 21:01:28 +01:00
Alexey Bataev	bfd8cc0a3e	[SLP]Fix a check for the whole register use Need to check the value type, not the return type, of the instructions, when doing the analysis for the whole register use to prevent a compiler crash. Fixes #133751	2025-03-31 10:52:12 -07:00
Rahul Joshi	74b7abf154	[IRBuilder] Add new overload for CreateIntrinsic (#131942 ) Add a new `CreateIntrinsic` overload with no `Types`, useful for creating calls to non-overloaded intrinsics that don't need additional mangling.	2025-03-31 08:10:34 -07:00
Alexey Bataev	78777a204a	[LV]Split store-load forward distance analysis from other checks, NFC (#121156 ) The patch splits the store-load forwarding distance analysis from other dependency analysis in LAA. Currently it supports only power-of-2 distances, required to support non-power-of-2 distances in future. Part of #100755	2025-03-31 07:28:44 -04:00
Florian Hahn	809f857d2c	[VPlan] Support early-exit loops in optimizeForVFAndUF. (#131539 ) Update optimizeForVFAndUF to support early-exit loops by handling BranchOnCond(Or(..., CanonicalIV == TripCount)) via SCEV PR: https://github.com/llvm/llvm-project/pull/131539	2025-03-31 07:55:48 +01:00
Kazu Hirata	2fc08d4c31	[Vectorize] Use DenseMap::insert_range (NFC) (#133656 )	2025-03-30 22:57:45 -07:00
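
The determinism problem described in 3bdf9a0880 can be illustrated with a small standalone sketch. `InsertionOrderedSet` and `insertionOrderDemo` are hypothetical names invented for this illustration; this is not the `llvm::EquivalenceClasses` implementation, only the general idea of pairing a lookup structure with a side vector so that iteration follows insertion order rather than pointer values.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <vector>

// Hypothetical toy container, NOT llvm::EquivalenceClasses.
// The map provides lookup; the vector records insertion order, so iteration
// does not depend on the numeric values of pointer keys.
template <typename T> class InsertionOrderedSet {
  std::map<T, std::size_t> Index; // value -> position in Members
  std::vector<T> Members;         // iteration follows insertion order

public:
  // Returns true if Value was not present before.
  bool insert(const T &Value) {
    bool Inserted = Index.emplace(Value, Members.size()).second;
    if (Inserted)
      Members.push_back(Value);
    return Inserted;
  }

  const std::vector<T> &members() const { return Members; }
};

// Inserts pointers to three heap objects in the order C, A, B and returns the
// pointed-to values in iteration order. The heap addresses may differ between
// runs, but the returned order only depends on the insertion order.
std::vector<int> insertionOrderDemo() {
  auto A = std::make_unique<int>(1);
  auto B = std::make_unique<int>(2);
  auto C = std::make_unique<int>(3);

  InsertionOrderedSet<const int *> Set;
  Set.insert(C.get());
  Set.insert(A.get());
  Set.insert(B.get());
  bool InsertedAgain = Set.insert(A.get()); // already present
  assert(!InsertedAgain && "re-inserting must not add a second entry");
  (void)InsertedAgain;

  std::vector<int> Order;
  for (const int *P : Set.members())
    Order.push_back(*P);
  return Order;
}
```

Iterating a `std::set<const int *>` over the same three pointers would instead visit them in address order, which is the source of the run-to-run variation the commit message describes.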
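
The two folds mentioned in 8107b430ed (`select c, x, x -> x`) and b739a3cb65 (`(X && Y) || (X && !Y) -> X`) can be sanity-checked over plain booleans. This is only a sketch: the function names are made up for illustration, and plain C++ `bool` does not model poison or undef, which the real VPlan simplifications have to respect.

```cpp
#include <cassert>
#include <initializer_list>

// Models `select c, x, y` on plain integers (no poison semantics).
constexpr int selectValue(bool C, int TrueVal, int FalseVal) {
  return C ? TrueVal : FalseVal;
}

// The unsimplified form of the second fold: (X && Y) || (X && !Y).
constexpr bool unsimplified(bool X, bool Y) {
  return (X && Y) || (X && !Y);
}

// Exhaustively checks both identities over all boolean inputs.
bool foldsHoldForAllInputs() {
  for (bool X : {false, true}) {
    for (bool Y : {false, true}) {
      // (X && Y) || (X && !Y) must equal X.
      if (unsimplified(X, Y) != X)
        return false;
      // select c, x, x must equal x, whatever the condition is.
      int Val = X ? 7 : 11; // arbitrary payload values
      if (selectValue(Y, Val, Val) != Val)
        return false;
    }
  }
  return true;
}

// Convenience wrapper for callers that prefer an assertion.
inline void checkFolds() { assert(foldsHoldForAllInputs()); }
```

Calling `checkFolds()` from a `main` function runs the exhaustive check over all four input combinations.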
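
The reasoning in 6afe5e5d1a, that the header mask becomes redundant once an explicit vector length (EVL) bounds the active lanes, can be sketched with scalar lane loops. The helper names are hypothetical, and the EVL formula used here (the minimum of the vector width and the remaining iterations) is an assumption made for this sketch, not something taken from the patch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Lanes enabled by mask = header-mask && pred-mask, where the header mask for
// a lane is (canonical-iv + lane) <= backedge-taken-count.
std::vector<bool> activeWithCombinedMask(std::size_t IV, std::size_t BackedgeTC,
                                         const std::vector<bool> &Pred) {
  std::vector<bool> Active(Pred.size(), false);
  for (std::size_t Lane = 0; Lane < Pred.size(); ++Lane) {
    bool HeaderMask = IV + Lane <= BackedgeTC;
    Active[Lane] = HeaderMask && Pred[Lane];
  }
  return Active;
}

// Lanes enabled when only pred-mask is used and lanes at or beyond EVL are
// switched off. Assumption for this sketch: EVL = min(VF, remaining iterations).
std::vector<bool> activeWithEVL(std::size_t IV, std::size_t BackedgeTC,
                                const std::vector<bool> &Pred) {
  assert(IV <= BackedgeTC && "sketch assumes at least one remaining iteration");
  std::size_t Remaining = BackedgeTC + 1 - IV;
  std::size_t EVL = std::min(Pred.size(), Remaining);
  std::vector<bool> Active(Pred.size(), false);
  for (std::size_t Lane = 0; Lane < EVL; ++Lane)
    Active[Lane] = Pred[Lane];
  return Active;
}
```

For example, with a trip count of 10 (`BackedgeTC = 9`), `IV = 8`, and a four-lane predicate `{true, false, true, true}`, both helpers enable only lane 0: the last two lanes are cut off by the header mask in one case and by the EVL in the other.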


39366 Commits