llvm-project

Author	SHA1	Message	Date
Simon Pilgrim	718d50d6d0	[VectorCombine] foldPermuteOfBinops - prefer the new fold for matching costs. Minor tweak to #114101 - as we're reducing the instruction count, we should prefer the fold if the old/new costs are the same.	2024-11-01 17:28:37 +00:00
Simon Pilgrim	92af82a48d	[VectorCombine] Fold "shuffle (binop (shuffle, shuffle)), undef" --> "binop (shuffle), (shuffle)" (#114101 ) Add foldPermuteOfBinops - to fold a permute (single source shuffle) through a binary op that is being fed by other shuffles. Fixes #94546 Fixes #49736	2024-10-31 10:58:09 +00:00
Simon Pilgrim	bc999ee57a	[PhaseOrdering][X86] Add test coverage for #94546	2024-10-30 11:55:04 +00:00
Simon Pilgrim	2de1fc8286	[PhaseOrdering][X86] Add additional test coverage for #49736 I've kept the old PR50392 tag since this is such an old issue....	2024-10-30 11:10:48 +00:00
Yingwei Zheng	095d49da76	[InstCombine] Set `samesign` when converting signed predicates into unsigned (#112642 ) Alive2: https://alive2.llvm.org/ce/z/6cqdt-	2024-10-17 20:43:48 +08:00
Yingwei Zheng	62cd07fb67	[InstCombine] Canonicalize `sub mask, X -> ~X` when high bits are ignored (#110635 ) Alive2: https://alive2.llvm.org/ce/z/NJgBPL The motivating case of this patch is to emit `andn` on RISC-V with zbb for expressions like `(sub 63, X) & 63`.	2024-10-02 12:48:06 +08:00
Philip Reames	2c7786e94a	Prefer use of 0.0 over -0.0 for fadd reductions w/nsz (in IR) (#106770 ) This is a follow up to 924907bc6, and is mostly motivated by consistency but does include one additional optimization. In general, we prefer 0.0 over -0.0 as the identity value for an fadd. We use that value in several places, but don't in others. So, let's be consistent and use the same identity (when nsz allows) everywhere. This creates a bunch of test churn, but due to 924907bc6, most of that churn doesn't actually indicate a change in codegen. The exception is that this change enables the use of 0.0 for nsz, but not reassoc, fadd reductions. Or said differently, it allows the neutral value of an ordered fadd reduction to be 0.0.	2024-09-03 09:16:37 -07:00
Shengchen Kan	87c86aa6b9	[X86,SimplifyCFG] Support hoisting load/store with conditional faulting (Part I) (#96878 ) This is simplifycfg part of https://github.com/llvm/llvm-project/pull/95515 In this PR, we support hoisting load/store with conditional faulting in `SimplifyCFGOpt::speculativelyExecuteBB` to eliminate conditional branches. This is for cases like ``` void test (int a, int b) { if (a) b = a; } ``` In the following patches, we will support the hoist in `SimplifyCFGOpt::hoistCommonCodeFromSuccessors`. That is for cases like ``` void test (int a, int c, int d) { if (a) c = a; else d = a; } ```	2024-08-29 10:42:44 +08:00
Nikita Popov	a105877646	[InstCombine] Remove some of the complexity-based canonicalization (#91185 ) The idea behind this canonicalization is that it allows us to handle fewer patterns, because we know that some will be canonicalized away. This is indeed very useful to e.g. know that constants are always on the right. However, this is only useful if the canonicalization is actually reliable. This is the case for constants, but not for arguments: Moving these to the right makes it look like the "more complex" expression is guaranteed to be on the left, but this is not actually the case in practice. It fails as soon as you replace the argument with another instruction. The end result is that it looks like things correctly work in tests, while they actually don't. We use the "thwart complexity-based canonicalization" trick to handle this in tests, but it's often a challenge for new contributors to get this right, and based on the regressions this PR originally exposed, we clearly don't get this right in many cases. For this reason, I think that it's better to remove this complexity canonicalization. It will make it much easier to write tests for commuted cases and make sure that they are handled.	2024-08-21 12:02:54 +02:00
Alexey Bataev	ecbbe5b431	[SLP]Fix mask building for alternate node cost estimation (#102966 ) Need to use the same functionality in the cost model as in the codegen, to correctly build the shuffle mask and estimate the cost.	2024-08-12 17:26:56 -04:00
Florian Hahn	5a42a677aa	[VPlan] Mark VPVectorPointer as only using the first part of the ptr. VPVectorPointerRecipe only uses the first part of the pointer operand, so mark it accordingly. Follow-up suggested as part of https://github.com/llvm/llvm-project/pull/99808.	2024-08-12 08:46:55 +01:00
Florian Hahn	f0df4fbd0c	[LV] Support generating masks for switch terminators. (#99808 ) Update createEdgeMask to create masks where the terminator in Src is a switch. We need to handle 2 separate cases: 1. Dst is not the default destination. Dst is reached if any of the cases with destination == Dst are taken. Join the conditions for each case where destination == Dst using a logical OR. 2. Dst is the default destination. Dst is reached if none of the cases with destination != Dst are taken. Join the conditions for each case where the destination is != Dst using a logical OR and negate it. Edge masks are created for every destination of cases and/or default when requesting a mask where the source is a switch. Fixes https://github.com/llvm/llvm-project/issues/48188. PR: https://github.com/llvm/llvm-project/pull/99808	2024-08-11 20:38:36 +02:00
Florian Hahn	4399dbe331	[LV] Adjust test for #48188 to use AVX level closer to report. Update AVX level for https://github.com/llvm/llvm-project/issues/48188 to be closer to the one used in the reproducer.	2024-08-11 15:04:07 +01:00
Simon Pilgrim	da286c8bf6	[VectorCombine] foldShuffleToIdentity - peek through bitcasts to see if they come from the same value to form identity sequence (#98334 ) Workaround until I can get #96884 fixed properly - when trying to find identity sequences, peek through any bitcasts to see if the values all came from the same source. We don't run CSE frequently enough to merge all the bitcasts that we end up with.	2024-07-15 21:36:23 +01:00
Simon Pilgrim	3ef2805dcf	[PhaseOrdering][X86] Fix cut+paste typo in blendv test	2024-07-10 13:35:48 +01:00
YAMAMOTO Takashi	5d79110959	[Pipelines] Perform mergefunc after constmerge (#92498 ) Constmerge can fold switch jump tables, possibly making functions identical again. It can help mergefunc. On the other hand, the opposite seems unlikely. Fixes https://github.com/llvm/llvm-project/issues/92201.	2024-07-05 12:28:03 +02:00
Florian Hahn	99d6c6d936	[VPlan] Model branch cond to enter scalar epilogue in VPlan. (#92651 ) This patch moves branch condition creation to enter the scalar epilogue loop to VPlan. Modeling the branch in the middle block also requires modeling the successor blocks. This is done using the recently introduced VPIRBasicBlock. Note that the middle.block is still created as part of the skeleton and then patched in during VPlan execution. Unfortunately the skeleton needs to create the middle.block early on, as it is also used for induction resume value creation and is also needed to properly update the dominator tree during skeleton creation. After this patch lands, I plan to move induction resume value and phi node creation in the scalar preheader to VPlan. Once that is done, we should be able to create the middle.block in VPlan directly. This is a re-worked version based on the earlier https://reviews.llvm.org/D150398 and the main change is the use of VPIRBasicBlock. Depends on https://github.com/llvm/llvm-project/pull/92525 PR: https://github.com/llvm/llvm-project/pull/92651	2024-07-05 10:08:42 +01:00
Simon Pilgrim	b546096d94	[VectorCombine] foldShuffleToIdentity - handle bitcasts with equal element counts (#97731 ) Basic initial patch for #96884 that just handles case where we bitcast between float/integers of the same element width	2024-07-05 09:47:42 +01:00
Simon Pilgrim	7de7f50fc9	[InstCombine][X86] Fold blendv(x,y,shuffle(bitcast(sext(m)))) -> select(shuffle(m),x,y) (#96882 ) We already handle blendv(x,y,bitcast(sext(m))) -> select(m,x,y) cases, but this adds support for peeking through one-use shuffles as well. VectorCombine should already have canonicalized the IR to shuffle(bitcast(...)) for us. The particular use case is where we have split generic 256/512-bit code to use target-specific blendv intrinsics (e.g. AVX1 spoofing AVX2 256-bit ops). Fixes #58895	2024-07-03 12:21:31 +01:00
Simon Pilgrim	8467cc61ce	[X86] Add phase ordering test coverage for #58895	2024-06-27 10:45:11 +01:00
Simon Pilgrim	6f8efc76c9	[PhaseOrdering][X86] Regenerate pr67803.ll	2024-06-26 13:59:34 +01:00
Nikita Popov	8e8d2595da	[ConstantFolding] Canonicalize constexpr GEPs to i8 (#89872 ) This patch canonicalizes constant expression GEPs to use i8 source element type, aka ptradd. This is the ConstantFolding equivalent of the InstCombine canonicalization introduced in #68882. I believe all our optimizations working on constant expression GEPs (like GlobalOpt etc) have already been switched to work on offsets, so I don't expect any significant fallout from this change. This is part of: https://discourse.llvm.org/t/rfc-replacing-getelementptr-with-ptradd/68699	2024-05-20 11:47:30 +02:00
Andreas Jonson	b8f3024a31	[InstCombine] Swap out range metadata to range attribute for cttz/ctlz/ctpop (#88776 ) Since all optimizations that use range metadata now also handle range attribute, this patch replaces writes of range metadata for call instructions to range attributes.	2024-04-25 01:45:50 +08:00
Craig Topper	e15f47f267	[InstCombine] Don't use dominating conditions to transform sub into xor. (#88566 ) Other passes are unable to reverse this transform if we use dominating conditions. Fixes #88239.	2024-04-17 13:16:08 -07:00
Craig Topper	421a8c5892	[InstCombine] Add phase ordering test for #88239 . NFC	2024-04-17 10:40:25 -07:00
Simon Pilgrim	6fd2fdccf2	[VectorCombine] foldShuffleOfCastops - extend shuffle(bitcast(x),bitcast(y)) -> bitcast(shuffle(x,y)) support Handle shuffle mask scaling for cases where the bitcast src/dst element counts are different	2024-04-11 14:02:56 +01:00
Simon Pilgrim	212b2bbcd1	[VectorCombine][X86] foldShuffleOfCastops - fold shuffle(cast(x),cast(y)) -> cast(shuffle(x,y)) iff cost efficient (#87510 ) Based off the existing foldShuffleOfBinops fold Fixes #67803	2024-04-04 11:22:37 +01:00
Monad	56b3222b79	[InstCombine] Remove the canonicalization of `trunc` to `i1` (#84628 ) Remove the canonicalization of `trunc` to `i1` according to the suggestion of https://github.com/llvm/llvm-project/pull/83829#issuecomment-1986801166 `a84e66a92d/llvm/lib/Transforms/InstCombine/InstCombineCasts.cpp (L737-L745)` Alive2: https://alive2.llvm.org/ce/z/cacYVA	2024-03-29 21:47:35 +08:00
Simon Pilgrim	ee5e027cc6	[X86] getShuffleCost - recognise concat_vector(X,Y) shuffle as InsertSubvector instead of PermuteTwoSrc We don't have a concat_vector shuffle kind and improveShuffleKindFromMask won't alter the base type to match it as InsertSubvector. But since this is how X86 will lower concat_vector anyhow, just recognise it explicitly. Another step for #67803	2024-03-21 09:29:39 +00:00
Simon Pilgrim	7812fcf3d7	[VectorCombine] foldBitcastShuf - add support for binary shuffles (REAPPLIED) Generalise fold to "bitcast (shuf V0, V1, MaskC) --> shuf (bitcast V0), (bitcast V1), MaskC'". Reapplied with a clang codegen test fix. Further prep work for #67803	2024-03-20 15:06:19 +00:00
Simon Pilgrim	ada24ae5e6	Revert 2ac85d8d200a9e1e0ced501c2d2f04404c400bd9 "[VectorCombine] foldBitcastShuf - add support for binary shuffles" Breaks some tests in other subprojects - will recommit with a fix later	2024-03-20 13:39:42 +00:00
Simon Pilgrim	2ac85d8d20	[VectorCombine] foldBitcastShuf - add support for binary shuffles Generalise fold to "bitcast (shuf V0, V1, MaskC) --> shuf (bitcast V0), (bitcast V1), MaskC'". Further prep work for #67803	2024-03-20 13:19:30 +00:00
Simon Pilgrim	08e036e734	[PhaseOrdering][X86] Add test coverage for #67803	2024-03-05 15:50:37 +00:00
Nilanjana Basu	c1c5b854ad	[LV] Remove loop trip count threshold for deciding whether to interleave a loop (#67725 ) A set of microbenchmarks (https://github.com/llvm/llvm-test-suite/pull/26) showed that loop interleaving can be beneficial for loops with low trip count as well. Loop interleaving count computation is updated accordingly in prior patches while this patch removes the loop trip count threshold for interleaving.	2024-02-05 17:23:58 -08:00
Nikita Popov	90ba33099c	[InstCombine] Canonicalize constant GEPs to i8 source element type (#68882 ) This patch canonicalizes getelementptr instructions with constant indices to use the `i8` source element type. This makes it easier for optimizations to recognize that two GEPs are identical, because they don't need to see past many different ways to express the same offset. This is a first step towards https://discourse.llvm.org/t/rfc-replacing-getelementptr-with-ptradd/68699. This is limited to constant GEPs only for now, as they have a clear canonical form, while we're not yet sure how exactly to deal with variable indices. The test llvm/test/Transforms/PhaseOrdering/switch_with_geps.ll gives two representative examples of the kind of optimization improvement we expect from this change. In the first test SimplifyCFG can now realize that all switch branches are actually the same. In the second test it can convert it into simple arithmetic. These are representative of common optimization failures we see in Rust. Fixes https://github.com/llvm/llvm-project/issues/69841.	2024-01-24 15:25:29 +01:00
Nikita Popov	cd7ea4ea65	[LAA] Drop alias scope metadata that is not valid across iterations (#79161 ) LAA currently adds memory locations with their original AATags to AST. However, scoped alias AATags may be valid only within one loop iteration, while LAA reasons across iterations. Fix this by determining which alias scopes are defined inside the loop, and drop AATags that reference these scopes. Fixes https://github.com/llvm/llvm-project/issues/79137.	2024-01-24 11:20:16 +01:00
Nikita Popov	543cf08636	[PhaseOrdering] Add additional test for #79161 (NFC)	2024-01-24 10:46:11 +01:00
Florian Hahn	f18536d642	[VPlan] Model address separately. (#72164 ) Move vector pointer generation to a separate VPVectorPointerRecipe. This further untangles address computation from the memory recipes and is also needed to enable explicit unrolling in VPlan. https://github.com/llvm/llvm-project/pull/72164	2024-01-01 19:51:15 +00:00
Yingwei Zheng	1228becf7d	[FuncAttrs] Deduce `noundef` attributes for return values (#76553 ) This patch deduces `noundef` attributes for return values. IIUC, a function returns `noundef` values iff all of its return values are guaranteed not to be `undef` or `poison`. Definition of `noundef` from LangRef: ``` noundef This attribute applies to parameters and return values. If the value representation contains any undefined or poison bits, the behavior is undefined. Note that this does not refer to padding introduced by the type’s storage representation. ``` Alive2: https://alive2.llvm.org/ce/z/g8Eis6 Compile-time impact: http://llvm-compile-time-tracker.com/compare.php?from=30dcc33c4ea3ab50397a7adbe85fe977d4a400bd&to=c5e8738d4bfbf1e97e3f455fded90b791f223d74&stat=instructions:u \|stage1-O3\|stage1-ReleaseThinLTO\|stage1-ReleaseLTO-g\|stage1-O0-g\|stage2-O3\|stage2-O0-g\|stage2-clang\| \|--\|--\|--\|--\|--\|--\|--\| \|+0.01%\|+0.01%\|-0.01%\|+0.01%\|+0.03%\|-0.04%\|+0.01%\| The motivation of this patch is to reduce the number of `freeze` insts and enable more optimizations.	2023-12-31 20:44:48 +08:00
Nikita Popov	a5f3415533	[InstCombine] Replace non-demanded undef vector with poison If an operand (esp to shufflevector or insertelement) is not demanded, canonicalize it from undef to poison.	2023-12-18 16:12:37 +01:00
Nikita Popov	cf47af493b	[InstCombine] Generalize folds for inversion of icmp operands (#74317 ) We have a bunch of folds that basically perform X pred Y to ~Y pred ~X for various special cases where this saves an instruction. Generalize these folds to use isFreeToInvert(). We have to make sure that we consume an instruction in either of the inversions, otherwise we're just going to swap the icmp back and forth. Fixes https://github.com/llvm/llvm-project/issues/74302.	2023-12-08 11:25:41 +01:00
Nikita Popov	d77067d08a	[ValueTracking] Add dominating condition support in computeKnownBits() (#73662 ) This adds support for using dominating conditions in computeKnownBits() when called from InstCombine. The implementation uses a DomConditionCache, which stores which branches may provide information that is relevant for a given value. DomConditionCache is similar to AssumptionCache, but does not try to do any kind of automatic tracking. Relevant branches have to be explicitly registered and invalidated values explicitly removed. The necessary tracking is done inside InstCombine. The reason why this doesn't just do exactly the same thing as AssumptionCache is that a lot more transforms touch branches and branch conditions than assumptions. AssumptionCache is an immutable analysis and mostly gets away with this because only a handful of places have to register additional assumptions (mostly as a result of cloning). This is very much not the case for branches. This change regresses compile-time by about ~0.2%. It also improves stage2-O0-g builds by about ~0.2%, which indicates that this change results in additional optimizations inside clang itself. Fixes https://github.com/llvm/llvm-project/issues/74242.	2023-12-06 14:17:18 +01:00
Craig Topper	7ec4f6094e	[InstCombine] Infer disjoint flag on Or instructions. (#72912 ) The disjoint flag was recently added to IR in #72583 We already set it when we turn an add into an or. This patch sets it on Ors that weren't converted from an Add.	2023-12-02 14:11:12 -08:00
Craig Topper	03d4a9d94d	[InstCombine] Set disjoint flag when turning Add into Or. (#72702 ) The disjoint flag was recently added to IR in #72583	2023-11-27 12:54:11 -08:00
Yingwei Zheng	dc6d077396	[CVP] Infer nneg on existing zext (#72052 ) This patch infers `nneg` flags for existing zext instructions in CVP. After https://github.com/llvm/llvm-project/pull/71534 and this patch, we can drop `zext -> zext nneg` transform in `RISCVCodeGenPrepare`: `40671bbdef/llvm/lib/Target/RISCV/RISCVCodeGenPrepare.cpp (L74-L83)` This is an alternative to #72049.	2023-11-13 22:41:37 +08:00
dewen	3b82336188	Revert "[PM] Execute IndVarSimplifyPass precede RessociatePass" (#71617 ) Reverts llvm/llvm-project#71054	2023-11-08 09:22:55 +08:00
dewen	e4d27d7f32	[PM] Execute IndVarSimplifyPass precede RessociatePass (#71054 ) ReassociatePass may clear nsw/nuw flags of some instructions, which may have side effects on optimizations in IndVarSimplifyPass.	2023-11-08 09:21:17 +08:00
Nikita Popov	30240e428f	[PhaseOrdering] Regenerate test checks (NFC)	2023-10-12 14:40:13 +02:00
DianQK	2d1e8a03f5	[EarlyCSE] Compare GEP instructions based on offset (#65875 ) Closes #65763. This will provide more opportunities for constant propagation for subsequent optimizations.	2023-09-20 06:14:45 +08:00
Alexey Bataev	c619222ea4	[SLP]Use common logic for cost estimation of the alternate vector nodes. We can use buildShuffleEntryMask() to build the shuffle mask correctly not only for the alternate nodes with reuses, but also for the nodes without reused scalars. This allows us to better estimate the cost of the node and emit better code. Differential Revision: https://reviews.llvm.org/D157413	2023-08-09 11:50:39 -07:00

261 Commits