llvm-project

Author	SHA1	Message	Date
Yingwei Zheng	eafbab6fac	[EntryExitInstrumenter][AArch64][RISCV][LoongArch] Pass `__builtin_return_address(0)` into `_mcount` (#121107 ) On RISC-V, AArch64, and LoongArch, the `_mcount` function takes `__builtin_return_address(0)` as an argument since `__builtin_return_address(1)` is not available on these platforms. This patch fixes the argument passing to match the behavior of glibc/gcc. Closes https://github.com/llvm/llvm-project/issues/121103.	2025-01-01 15:02:08 +08:00
Florian Hahn	b06a45c66f	[VPlan] Add all blocks to outer loop if present during ::execute (NFCI). This ensures that all blocks created during VPlan execution are properly added to an enclosing loop, if present. Split off from https://github.com/llvm/llvm-project/pull/108378 and also needed once more of the skeleton blocks are created directly via VPlan. This also allows removing the custom logic for early-exit loop vectorization added as part of https://github.com/llvm/llvm-project/pull/117008.	2024-12-31 19:34:34 +00:00
Simon Pilgrim	b195bb87e1	[VectorCombine] scalarizeLoadExtract - consistently use LoadInst and ExtractElementInst specific operand getters. NFC Noticed while investigating the hung builds reported after af83093933ca73bc82c33130f8bda9f1ae54aae2	2024-12-31 14:42:39 +00:00
Florian Hahn	ddef380cd6	[VPlan] Move simplifyRecipe(s) definitions up to allow re-use (NFC) Move definitions to allow easy reuse in https://github.com/llvm/llvm-project/pull/108378.	2024-12-31 13:23:19 +00:00
Muhammad Omair Javaid	332d2647ff	Revert "[LV]: Teach LV to recursively (de)interleave. (#89018 )" This reverts commit ccfe0de0e1e37ed369c9bf89dd0188ba0afb2e9a. This breaks LLVM build on AArch64 SVE Linux buildbots https://lab.llvm.org/buildbot/#/builders/143/builds/4462 https://lab.llvm.org/buildbot/#/builders/17/builds/4902 https://lab.llvm.org/buildbot/#/builders/4/builds/4399 https://lab.llvm.org/buildbot/#/builders/41/builds/4299	2024-12-31 03:12:24 +05:00
Simon Pilgrim	d5a96eb125	Revert af83093933ca73bc82c33130f8bda9f1ae54aae2 "[VectorCombine] eraseInstruction - ensure we reattempt to fold other users of an erased instruction's operands" Reports of hung builds, but I don't have time to investigate at the moment.	2024-12-30 21:20:56 +00:00
Simon Pilgrim	af83093933	[VectorCombine] eraseInstruction - ensure we reattempt to fold other users of an erased instruction's operands As we're reducing the use count of the operands it's more likely that they will now fold, as they were previously being prevented by a m_OneUse check, or the cost of retaining the extra instruction had been too high. This is necessary for some upcoming patches, although the only change so far is instruction ordering as it allows some SSE folds of 256/512-bit with 128-bit subvectors to occur earlier in foldShuffleToIdentity as the subvector concats are free. Pulled out of #120984	2024-12-30 17:52:42 +00:00
Florian Hahn	16d19aaedf	[VPlan] Manage created blocks directly in VPlan. (NFC) (#120918) This patch changes the way blocks are managed by VPlan. Previously all blocks reachable from entry would be cleaned up when a VPlan is destroyed. With this patch, each VPlan keeps track of blocks created for it in a list and this list is then used to delete all blocks in the list when the VPlan is destroyed. To do so, block creation is funneled through helpers directly in VPlan. The main advantage of doing so is it simplifies CFG transformations, as those do not have to take care of deleting any blocks, just adjusting the CFG. This helps to simplify https://github.com/llvm/llvm-project/pull/108378 and https://github.com/llvm/llvm-project/pull/106748. This also simplifies handling of 'immutable' blocks a VPlan holds references to, which at the moment only include the scalar header block. PR: https://github.com/llvm/llvm-project/pull/120918	2024-12-30 12:08:12 +00:00
Florian Hahn	7f3428d3ed	[VPlan] Compute induction end values in VPlan. (#112145 ) Use createDerivedIV to compute IV end values directly in VPlan, instead of creating them up-front. This allows updating IV users outside the loop as follow-up. Depends on https://github.com/llvm/llvm-project/pull/110004 and https://github.com/llvm/llvm-project/pull/109975. PR: https://github.com/llvm/llvm-project/pull/112145	2024-12-29 19:05:08 +00:00
Simon Pilgrim	f2f02b21cd	[VectorCombine] foldShuffleOfBinops - only accept exact matching cmp predicates m_SpecificCmp allowed equivalent predicate+flags which don't necessarily work after being folded from "shuffle (cmpop), (cmpop)" into "cmpop (shuffle), (shuffle)" Fixes #121110	2024-12-28 09:21:31 +00:00
Fangrui Song	edc42b2dc1	[SLP] Migrate away from PointerUnion::get	2024-12-27 21:01:09 -08:00
Zequan Wu	4d8f9594b2	Revert "Reland "[LoopVectorizer] Add support for partial reductions" (#120721)" This reverts commit c858bf620c3ab2a4db53e84b9365b553c3ad1aa6 as it causes an optimization crash at -O2, see https://github.com/llvm/llvm-project/pull/120721#issuecomment-2563192057	2024-12-27 11:51:54 -08:00
Florian Hahn	8caeb2e0c2	[VPlan] Always create initial blocks in constructor (NFC). Update C++ unit tests to use VPlanTestBase to construct initial VPlan, using a constructor that creates the VP blocks directly in the constructor. Split off from and in preparation for https://github.com/llvm/llvm-project/pull/120918.	2024-12-27 17:43:22 +00:00
Alexey Bataev	07ba457525	[SLP][NFC]Add dump of combined entries, where applicable	2024-12-27 07:56:10 -08:00
Hassnaa Hamdi	ccfe0de0e1	[LV]: Teach LV to recursively (de)interleave. (#89018 ) Currently available intrinsics are only ld2/st2, which don't support interleaving factor > 2. This patch teaches the LV to use ld2/st2 recursively to support high interleaving factors.	2024-12-27 12:42:07 +00:00
Elvis Wang	47e1c87a61	[VPlan] Set debug location for VPReduction/VPWidenIntrinsicRecipe. (#120054) This patch adds the missing debug location for VPReduction/VPWidenIntrinsicRecipe.	2024-12-27 10:37:21 +08:00
Florian Hahn	2dfe1b4042	[VPlan] Remove stray space when printing reverse vector pointer. printFlags() takes care of printing the required space, remove the extra printed space between flags and operands.	2024-12-26 21:26:17 +00:00
Alexey Bataev	889215a30e	[SLP]Followup fix for the poisonous logical op in reductions If the VectorizedTree may still generate a poisonous value but is not the original operand of the reduction op, we need to check whether Res is still the operand in order to generate correct code. Fixes #114905	2024-12-26 05:11:26 -08:00
DaPorkchop_	cea738bc9a	[SimplifyCFG] Replace unreachable switch lookup table holes with poison (#94990 ) As discussed in #94468, this causes switch lookup table entries which are unreachable to be poison instead of filling them with a value from one of the reachable cases. --------- Co-authored-by: DianQK <dianqk@dianqk.net>	2024-12-26 07:47:26 +08:00
Usman Nadeem	5fb57131b7	[DFAJumpThreading] Don't bail early after encountering unpredictable values (#119774 ) After #96127 landed, mshockwave reported that the pass was no longer threading SPEC2006/perlbench. After 96127 we started bailing out in `getStateDefMap` and rejecting the transformation because one of the unpredictable values was coming from inside the loop. There was no fundamental change in that function except that we started calling `Loop->contains(IncomingBB)` instead of `LoopBBs.count(IncomingBB)`. After some analysis I came to the conclusion that even before 96127 we would reject the transformation if we provided large enough limits on the path traversal (large enough so that LoopBBs contained blocks corresponding to that unpredictable value). In this patch I changed `getStateDefMap` to not terminate early on finding an unpredictable value, this is because `getPathsFromStateDefMap`, later, actually has checks to ensure that the final list of paths only have predictable values. As a result we can now partially thread functions like `negative6` in the tests that have some predictable paths. This patch does not really have any compile-time impact on the test suite without `-dfa-early-exit-heuristic=false` (early exit is enabled by default). Change-Id: Ie1633b370ed4a0eda8dea52650b40f6f66ef49a3	2024-12-25 01:29:01 -08:00
LiqinWeng	b5f0ec80d5	[VPlan] Remove redundant printing final in VPlan::execute (#121048 ) Multiple prints will cause problems when testing ir-bb	2024-12-25 10:11:02 +08:00
Alexey Bataev	07d284d4eb	[SLP]Add cost estimation for gather node reshuffling Adds cost estimation for the variants of the permutations of the scalar values, used in gather nodes. Currently, SLP just unconditionally emits shuffles for the reused buildvectors, but in some cases better to leave them as buildvectors rather than shuffles, if the cost of such buildvectors is better. X86, AVX512, -O3+LTO Metric: size..text Program size..text results results0 diff test-suite :: External/SPEC/CINT2006/445.gobmk/445.gobmk.test 912998.00 913238.00 0.0% test-suite :: MultiSource/Benchmarks/MiBench/consumer-lame/consumer-lame.test 203070.00 203102.00 0.0% test-suite :: External/SPEC/CFP2017speed/638.imagick_s/638.imagick_s.test 1396320.00 1396448.00 0.0% test-suite :: External/SPEC/CFP2017rate/538.imagick_r/538.imagick_r.test 1396320.00 1396448.00 0.0% test-suite :: MultiSource/Benchmarks/Bullet/bullet.test 309790.00 309678.00 -0.0% test-suite :: External/SPEC/CFP2017rate/526.blender_r/526.blender_r.test 12477607.00 12470807.00 -0.1% CINT2006/445.gobmk - extra code vectorized MiBench/consumer-lame - small variations CFP2017speed/638.imagick_s CFP2017rate/538.imagick_r - extra vectorized code Benchmarks/Bullet - extra code vectorized CFP2017rate/526.blender_r - extra vector code RISC-V, sifive-p670, -O3+LTO CFP2006/433.milc - regressions, should be fixed by https://github.com/llvm/llvm-project/pull/115173 CFP2006/453.povray - extra vectorized code CFP2017rate/508.namd_r - better vector code CFP2017rate/510.parest_r - extra vectorized code SPEC/CFP2017rate - extra/better vector code CFP2017rate/526.blender_r - extra vectorized code CFP2017rate/538.imagick_r - extra vectorized code CINT2006/403.gcc - extra vectorized code CINT2006/445.gobmk - extra vectorized code CINT2006/464.h264ref - extra vectorized code CINT2006/483.xalancbmk - small variations CINT2017rate/525.x264_r - better vectorization Reviewers: RKSimon Reviewed By: RKSimon Pull Request: https://github.com/llvm/llvm-project/pull/115201	2024-12-24 15:35:29 -05:00
Florian Hahn	2d038caeeb	[VPlan] Remove stray space when printing VPWidenCastRecipe. printFlags() already takes care of printing a single space if there are no flags. Remove the extra space when printing a recipe without flags.	2024-12-24 20:23:48 +00:00
Alexey Bataev	852feea820	[SLP]Propagate AssumptionCache where possible	2024-12-24 09:20:26 -08:00
Alexey Bataev	0d6cb0ae9d	[SLP]Fix strict weak ordering criterion in comparators Fixes #121019	2024-12-24 08:13:57 -08:00
Alexey Bataev	f0f8dab712	[SLP]Check if the first reduced value requires freeze/swap, if it may be too poisonous If several reduced values are combined and the first reduced value is just the original reduced value of the bool logical op, need to freeze it to prevent the propagation of the poison value. Fixes #114905	2024-12-24 07:40:35 -08:00
Sam Tebbs	c858bf620c	Reland "[LoopVectorizer] Add support for partial reductions" (#120721 ) This re-lands the reverted #92418 When the VF is small enough so that dividing the VF by the scaling factor results in 1, the reduction phi execution thinks the VF is scalar and sets the reduction's output as a scalar value, tripping assertions expecting a vector value. The latest commit in this PR fixes that by using `State.VF` in the scalar check, rather than the divided VF. --------- Co-authored-by: Nicholas Guy <nicholas.guy@arm.com>	2024-12-24 12:08:17 +00:00
Alexey Bataev	030829a7e5	[SLP]Drop samesign flag if the vector node has reduced bitwidth If the operands of the icmp instructions have a reduced bitwidth after MinBitwidth analysis, the samesign flag must be dropped to preserve correctness of the transformation. Fixes #120823	2024-12-23 16:55:11 -08:00
Benjamin Maxwell	9ab5474e56	[LV] Rename `ToVectorTy` to `toVectorTy` (NFC) (#120404 ) This is for consistency with other helpers (and also follows the LLVM naming conventions).	2024-12-23 23:33:44 +00:00
Florian Hahn	c7a777322d	[VPlan] Replace else-if dyn_cast with cast (NFC). The recipes handled here are either VPWidenIntrinsic or VPWidenCast, so replace the else-if dyn_cast with a single else + cast.	2024-12-23 19:46:22 +00:00
Simon Pilgrim	e3f8c229f5	[VectorCombine] foldInsExtVectorToShuffle - inserting into a poison base vector can be modelled as a single src shuffle We already canonicalized an undef base vector to the RHS to improve further folding, this extends this to improve the shuffle cost estimate of the single src shuffle	2024-12-23 15:49:17 +00:00
Simon Pilgrim	29c89d7265	[VectorCombine] foldShuffleOfShuffles - fold "shuffle (shuffle x, y, m1), (shuffle y, x, m2)" -> "shuffle x, y, m3" (#120959 ) foldShuffleOfShuffles currently only folds unary shuffles to ensure we don't end up with a merged shuffle with more than 2 sources, but this prevented cases where both shuffles were sharing sources. This patch generalizes the merge process to find up to 2 sources as it merges with the inner shuffles, it also moves the undef/poison handling stages into the merge loop as well. Fixes #120764	2024-12-23 14:56:15 +00:00
Han-Kuan Chen	11676da808	[SLP] Normalize debug messages for newTreeEntry. (#119514 ) A debug message should follow after newTreeEntry. Make ExtractValueInst and ExtractElementInst use setOperand directly.	2024-12-23 21:42:02 +08:00
Haopeng Liu	8daba2c13d	Skip negative length while inferring initializes attr (#120874) Bail out on negative length while inferring the initializes attr. Otherwise it causes an assertion error: `Attribute 'initializes' does not support unordered ranges`	2024-12-22 19:01:52 -08:00
LiqinWeng	b1fab4f849	[LV][VPlan] Initialize the variable 'VPID' of the createEVLRecipe (#120926 ) Resolve the compilation error caused by the merge issue: #119510	2024-12-23 09:23:22 +08:00
LiqinWeng	8a51471d83	[LV][VPlan] Extract the implementation of transform Recipe to EVLRecipe into a small function. NFC (#119510 )	2024-12-23 08:28:19 +08:00
Simon Pilgrim	bf873aa3ec	[VectorCombine] foldShuffleToIdentity - add debug message for match Helps with debugging to show that the fold found the match.	2024-12-22 17:21:44 +00:00
Simon Pilgrim	f96337e04e	[VectorCombine] foldConcatOfBoolMasks - add debug message for match + cost-comparison Helps with debugging to show that the fold found the match, and shows the old + new costs to indicate whether the fold was/wasn't profitable.	2024-12-22 16:21:02 +00:00
Florian Hahn	e1833e3a7e	[VPlan] Simplify redundant VPDerivedIVRecipe (NFC). Split DerivedIV simplification off from https://github.com/llvm/llvm-project/pull/112145 and use to remove the need for extra checks in createScalarIVSteps. Required an extra simplification run after IV transforms.	2024-12-22 09:39:19 +00:00
LiqinWeng	86fa35ce7e	[LV][VPlan] Use opcode to retrieve the VPID of the CallRecipe, rather than underlying instruction (#120816 ) This patch may cause the flags in the CallRecipe to be lost after EVL transformation, and it has been addressed in the patch: #119847	2024-12-22 10:28:20 +08:00
Florian Hahn	9b496deb90	[VPlan] Set and use debug location for VPPredInstPHIRecipe. Update the recipe to always set its debug location and use it during IR generation.	2024-12-21 21:57:47 +00:00
GrumpyPigSkin	f7ba2bdf86	[LLVM][SLSR] Add a debug counter (#119981 ) Added debug counter and test for SLSR. Fixes: https://github.com/llvm/llvm-project/issues/119770	2024-12-21 12:37:44 -05:00
Florian Hahn	bb86c5dd4d	[VPlan] Use inferScalarType in VPInstruction::ResumePhi codegen (NFC). Use VPlan-based type analysis to retrieve type of phi node. Also adds missing type inference for ResumePhi and ComputeReductionResult opcodes.	2024-12-21 15:55:21 +00:00
vporpo	7a38445ee2	[SandboxVec][DAG] Register move instr callback (#120146 ) This patch implements the move instruction notifier for the DAG. Whenever an instruction moves the notifier will maintain the DAG.	2024-12-20 23:10:24 -08:00
Kazu Hirata	adf0c817f3	[memprof] Undrift MemProf profile even when some frames are missing (#120500) This patch makes the MemProf undrifting process a little more lenient. Consider an inlined call hierarchy: foo -> bar -> ::new If bar tail-calls ::new, the profile appears to indicate that foo directly calls ::new. This is a problem because the perceived call hierarchy in the profile looks different from what we can obtain from the inline stack in the IR. Recall that undrifting works by constructing and comparing a list of direct calls from the profile and that from the IR. This patch modifies the construction of the latter. Specifically, if foo calls bar in the IR, but bar is missing from the profile, we pretend that foo directly calls some heap allocation function. We apply this transformation only in the inline stack leading to some heap allocation function.	2024-12-20 15:40:08 -08:00
Owen Anderson	bc8fa9c443	Revert "SimplifyLibCalls: Use default globals address space when building new global strings. (#118729 )" (#119616 ) This reverts commit cfa582e8aaa791b52110791f5e6504121aaf62bf.	2024-12-21 09:33:39 +13:00
Teresa Johnson	c7451ffcb9	[MemProf] Supporting hinting mostly-cold allocations after cloning (#120633 ) Optionally unconditionally hint allocations as cold or not cold during the cloning step if the percentage of bytes allocated is at least that of the given threshold. This is similar to PR120301 which supports this during matching, but enables the same behavior during cloning, to reduce the false positives that can be addressed by cloning at the cost of carrying the additional size metadata/summary.	2024-12-20 11:27:54 -08:00
Thurston Dang	5bb650345d	Remove -bounds-checking-unique-traps (replace with -fno-sanitize-merge=local-bounds) (#120682) #120613 removed -ubsan-unique-traps and replaced it with -fno-sanitize-merge (introduced in #120511), which allows fine-grained control of which UBSan checks to prevent merging. This analogous patch removes -bounds-checking-unique-traps, and allows it to be controlled via -fno-sanitize-merge=local-bounds. Most of this patch is simply plumbing through the compiler flags into the bounds checking pass. Note: this patch subtly changes -fsanitize-merge (the default) to also include -fsanitize-merge=local-bounds. This is different from the previous behavior, where -fsanitize-merge (or the old -ubsan-unique-traps) did not affect local-bounds (requiring the separate -bounds-checking-unique-traps). However, we argue that the new behavior is more intuitive. Removing -bounds-checking-unique-traps and merging its functionality into -fsanitize-merge breaks backwards compatibility; we hope that this is acceptable since '-mllvm -bounds-checking-unique-traps' was an experimental flag.	2024-12-20 10:07:44 -08:00
Simon Pilgrim	82b5bda42c	[VectorCombine] Add "VC: Erasing" debug message to help the log show when dead WorkList instructions are erased.	2024-12-20 17:59:14 +00:00
Simon Pilgrim	e3157d3f0d	[VectorCombine] foldBitcastShuffle - add debug message for match + cost-comparison Helps with debugging to show that the fold found the match, and shows the old + new costs to indicate whether the fold was/wasn't profitable.	2024-12-20 17:59:13 +00:00
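
Note on 29c89d7265 (foldShuffleOfShuffles): the fold "shuffle (shuffle x, y, m1), (shuffle y, x, m2)" -> "shuffle x, y, m3" relies on the fact that both inner shuffles read from the same two sources, so every lane of the outer shuffle can be re-expressed as an index into concat(x, y). The sketch below illustrates that mask arithmetic on plain integer vectors. It is not the VectorCombine implementation: `shuffle` and `mergeMasks` are hypothetical helper names, all vectors and inner masks are assumed to have the same lane count N, and poison lanes are modelled as -1.

```cpp
#include <cassert>
#include <vector>

// Poison mask entries are modelled as -1 (assumption of this sketch).
constexpr int Poison = -1;

// Reference semantics of a two-source shuffle: lane i of the result is
// concat(A, B)[Mask[i]], or poison when Mask[i] is -1.
std::vector<int> shuffle(const std::vector<int> &A, const std::vector<int> &B,
                         const std::vector<int> &Mask) {
  const int N = static_cast<int>(A.size());
  std::vector<int> Result;
  for (int M : Mask) {
    if (M == Poison)
      Result.push_back(Poison);
    else
      Result.push_back(M < N ? A[M] : B[M - N]);
  }
  return Result;
}

// Hypothetical helper: computes M3 such that
//   shuffle(shuffle(x, y, M1), shuffle(y, x, M2), Outer) == shuffle(x, y, M3)
// assuming x, y, M1 and M2 all have N lanes.
std::vector<int> mergeMasks(const std::vector<int> &M1,
                            const std::vector<int> &M2,
                            const std::vector<int> &Outer, int N) {
  std::vector<int> M3;
  for (int O : Outer) {
    if (O == Poison) {
      M3.push_back(Poison);
    } else if (O < N) {
      // Lane taken from shuffle(x, y, M1): M1 already indexes concat(x, y).
      M3.push_back(M1[O]);
    } else {
      // Lane taken from shuffle(y, x, M2): M2 indexes concat(y, x), so swap
      // the halves to get an index into concat(x, y).
      const int K = M2[O - N];
      if (K == Poison)
        M3.push_back(Poison);
      else
        M3.push_back(K < N ? K + N : K - N);
    }
  }
  return M3;
}
```

Calling `mergeMasks(M1, M2, Outer, N)` and feeding the result to `shuffle(X, Y, M3)` should give the same lanes as evaluating the nested shuffles directly, which is the property the merged form has to preserve.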


38497 Commits