llvm-project

Author	SHA1	Message	Date
alexfh	067632e141	Revert "[DAGCombiner] Transform `(icmp eq/ne (and X,C0),(shift X,C1))` to use rotate or to getter constants." due to a miscompile (#71598) - Revert "[DAGCombiner] Transform `(icmp eq/ne (and X,C0),(shift X,C1))` to use rotate or to getter constants." - causes a miscompile, see `112e49b381 (commitcomment-131943923)` - Revert "[X86] Fix gcc warning about mix of enumeral and non-enumeral types. NFC", which fixes a compiler warning in the commit above	2023-11-08 15:07:12 +01:00
Simon Pilgrim	1085b70a94	[DAG] Don't fold (zext (bitop (load x), cst)) -> (bitop (zextload x), (zext cst)) if the zext is free Prevents an infinite loop if we've been trying to narrow the bitop to a more preferable type	2023-11-04 15:32:13 +00:00
Craig Topper	70b35ec0a8	[SelectionDAG] Add initial support for nneg flag on ISD::ZERO_EXTEND. (#70872) This adds the nneg flag to SDNodeFlags and the node printing code. SelectionDAGBuilder will add this flag to the node if the target doesn't prefer sign extend. A future RISC-V patch can remove the sign extend preference from SelectionDAGBuilder. I've also added the flag to the DAG combine that converts ISD::SIGN_EXTEND to ISD::ZERO_EXTEND.	2023-11-03 11:15:08 -07:00
Craig Topper	20020c1b43	[DAGCombiner] Fix misuse of getZeroExtendInReg in SimplifySelectCC. (#70066) If VT has less bits than SCC, using a ZeroExtendInReg isn't going to fix it. That's an AND instruction. We need to truncate the value instead. This should be ok because we already checked that the boolean contents is ZeroOrOne so the setcc can only produce 0 or 1. No test because I found this while trying to make i32 legal for RISC-V 64 which I'm not ready to upload yet. You can see in the coverage report that this line isn't tested today. https://lab.llvm.org/coverage/coverage-reports/coverage/Users/buildslave/jenkins/workspace/coverage/llvm-project/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp.html#L27270	2023-10-24 12:35:55 -07:00
Pierre van Houtryve	2bc93584f5	[DAG] Constant Folding for U/SMUL_LOHI (#69437)	2023-10-24 07:37:55 +02:00
Ramkumar Ramachandra	98c90a13c6	ISel: introduce vector ISD::LRINT, ISD::LLRINT; custom RISCV lowering (#66924) The issue #55208 noticed that std::rint is vectorized by the SLPVectorizer, but a very similar function, std::lrint, is not. std::lrint corresponds to ISD::LRINT in the SelectionDAG, and std::llrint is a familiar cousin corresponding to ISD::LLRINT. Now, neither ISD::LRINT nor ISD::LLRINT have a corresponding vector variant, and the LangRef makes this clear in the documentation of llvm.lrint.* and llvm.llrint.*. This patch extends the LangRef to include vector variants of llvm.lrint.* and llvm.llrint.*, and lays the necessary ground-work of scalarizing it for all targets. However, this patch would be devoid of motivation unless we show the utility of these new vector variants. Hence, the RISCV target has been chosen to implement a custom lowering to the vfcvt.x.f.v instruction. The patch also includes a CostModel for RISCV, and a trivial follow-up can potentially enable the SLPVectorizer to vectorize std::lrint and std::llrint, fixing #55208. The patch includes tests, obviously for the RISCV target, but also for the X86, AArch64, and PowerPC targets to justify the addition of the vector variants to the LangRef.	2023-10-19 13:05:04 +01:00
Noah Goldstein	112e49b381	[DAGCombiner] Transform `(icmp eq/ne (and X,C0),(shift X,C1))` to use rotate or to getter constants. If `C0` is a mask and `C1` shifts out all the masked bits (to essentially compare two subsets of `X`), we can arbitrarily re-order shift as `srl` or `shl`. If `C1` (shift amount) is a power of 2, we can replace the and+shift with a rotate. Otherwise, based on target preference we can arbitrarily swap `shl` and `srl` in/out to get better constants. On x86 we can use this re-ordering to: 1) get better `and` constants for `C0` (zero extended moves or avoid imm64). 2) convert `srl` to `shl` if `shl` will be implementable with `lea` or `add` (both of which can be preferable). Proofs: https://alive2.llvm.org/ce/z/qzGM_w Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D152116	2023-10-18 01:16:55 -05:00
Pierre van Houtryve	c464fea779	[DAG] Constant fold FMAD (#69324) This has very little effect on codegen in practice, but is a nice to have I think. See #68315	2023-10-18 07:46:24 +02:00
Björn Pettersson	4acb96c99f	[SelectionDAG] Tidy up around endianness and isConstantSplat (#68212) The BuildVectorSDNode::isConstantSplat function could depend on endianness, and it takes a bool argument that can be used to indicate if big or little endian should be considered when internally casting from a vector to a scalar. However, that argument is default set to false (= little endian). And in many situations, even in target generic code such as DAGCombiner, the endianness isn't specified when using the function. The intent with this patch is to highlight that endianness doesn't matter, depending on the context in which the function is used. In DAGCombiner the code is slightly refactored. Back in the days when the code was written it wasn't possible to request a MinSplatBits size when calling isConstantSplat. Instead the code re-expanded the found SplatValue to match with the EltBitWidth. Now we can just provide EltBitWidth as MinSplatBits and remove the logic for doing the re-expand. While being at it, tidying up around isConstantSplat, this patch also adds an explicit check in BuildVectorSDNode::isConstantSplat to break out from the loop if trying to split an odd VecWidth into two halves. Haven't been able to prove that there could be miscompiles involved if not doing so. There are lit tests that trigger that scenario, although I think they happen to later discard the returned SplatValue for other reasons.	2023-10-16 14:53:53 +02:00
Yingwei Zheng	53c81a8c16	[RISCV][SDAG] Fix constant narrowing when narrowing loads (#69015) When narrowing logic ops(OR/XOR) with constant rhs, `DAGCombiner` will fixup the constant rhs node. It is incorrect when lhs is also a constant. For example, we will incorrectly replace `xor OpaqueConstant:i64<8191>, Constant:i64<-1>` with `xor (and OpaqueConstant:i64<8191>, Constant:i64<65535>), Constant:i64<-1>`. Fixes #68855.	2023-10-14 06:38:17 +08:00
Jie Fu	573a083c1c	[DAG] Remove unused variable 'VT' in DAGCombiner.cpp (NFC) /llvm-project/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp:26896:7: error: unused variable 'VT' [-Werror,-Wunused-variable] EVT VT = N->getValueType(0); ^ 1 error generated.	2023-10-09 18:30:38 +08:00
Simon Pilgrim	072675f14e	[DAG] foldSelectOfBinops - correctly handle select of binops where ResNo != 0 Correctly handle cases where the select(cond, binop(x, y), binop(z, y)) --> binop(select(cond, x, z), y) fold is selecting ResNo != 0 results (UADDO flags etc.) Fixes #68539	2023-10-09 11:08:55 +01:00
Ben Mudd	6d6b395b53	[DebugInfo][SelectionDAG] Add debug info salvaging for TRUNC nodes This patch adds support for salvaging TRUNC nodes during SelectionDAG, fixing LLVM issue #63076: https://github.com/llvm/llvm-project/issues/63076 Reviewed in: https://github.com/llvm/llvm-project/pull/66922	2023-10-06 16:10:33 +01:00
Alexey Bataev	e22818d5c9	[IR]Add NumSrcElts param to is..Mask static function in ShuffleVectorInst. Need to add NumSrcElts param to is..Mask functions in ShuffleVectorInstruction class for better mask analysis. Mask.size() not always matches the sizes of the permuted vector(s). Allows to better estimate the cost in SLP and fix uses of the functions in other cases. Differential Revision: https://reviews.llvm.org/D158449	2023-10-05 06:17:07 -07:00
Arthur Eubanks	07389535a7	Revert "[IR]Add NumSrcElts param to is..Mask static function in ShuffleVectorInst." This reverts commit b186f1f68be11630355afb0c08b80374a6d31782. Causes crashes, see https://reviews.llvm.org/D158449.	2023-10-04 14:37:16 -07:00
Alexey Bataev	b186f1f68b	[IR]Add NumSrcElts param to is..Mask static function in ShuffleVectorInst. Need to add NumSrcElts param to is..Mask functions in ShuffleVectorInstruction class for better mask analysis. Mask.size() not always matches the sizes of the permuted vector(s). Allows to better estimate the cost in SLP and fix uses of the functions in other cases. Differential Revision: https://reviews.llvm.org/D158449	2023-10-04 07:53:30 -07:00
Alexey Bataev	1129dec778	Revert "[IR]Add NumSrcElts param to is..Mask static function in ShuffleVectorInst." This reverts commit 6f43d28f3452b3ef598bc12b761cfc2dbd0f34c9 to fix a crash reported in https://reviews.llvm.org/D158449.	2023-10-03 13:02:16 -07:00
Alexey Bataev	6f43d28f34	[IR]Add NumSrcElts param to is..Mask static function in ShuffleVectorInst. Need to add NumSrcElts param to is..Mask functions in ShuffleVectorInstruction class for better mask analysis. Mask.size() not always matches the sizes of the permuted vector(s). Allows to better estimate the cost in SLP and fix uses of the functions in other cases. Differential Revision: https://reviews.llvm.org/D158449	2023-10-03 10:26:11 -07:00
Simon Pilgrim	b4f591363c	[DAG] visitSHL - move SimplifyDemandedBits after all standard folds to give them a chance to match Pulled out of D155472	2023-10-02 16:09:35 +01:00
Alexey Bataev	ebcb5d59fc	Revert "[IR]Add NumSrcElts param to is..Mask static function in ShuffleVectorInst." This reverts commit 9f5960e004ff54082ccfa9396522e07358f5b66b to fix buildbots reported here https://lab.llvm.org/buildbot/#/builders/230/builds/19412.	2023-09-29 15:03:46 -07:00
Alexey Bataev	9f5960e004	[IR]Add NumSrcElts param to is..Mask static function in ShuffleVectorInst. Need to add NumSrcElts param to is..Mask functions in ShuffleVectorInstruction class for better mask analysis. Mask.size() not always matches the sizes of the permuted vector(s). Allows to better estimate the cost in SLP and fix uses of the functions in other cases. Differential Revision: https://reviews.llvm.org/D158449	2023-09-29 13:16:03 -07:00
Jay Foad	6e3d2a4b38	[ISel] Fix another crash in new FMA DAG combine (#67818) Following on from D135150, this patch fixes another crash caused by this DAG combine: fadd (fma A, B, (fmul C, D)), E --> fma A, B, (fma C, D, E) The combine calls ReplaceAllUsesOfValueWith to replace (fmul C, D) with (fma C, D, E). This can cause nodes to get CSEd. In D135150 the problem was that the (fma C, D, E) node got CSEd away. In this new case, the problem is that the outer fadd node gets CSEd away. To fix it we have to return SDValue(N, 0) from the combine and be careful not to add a deleted node to the worklist.	2023-09-29 17:18:23 +01:00
Alexey Bataev	3204f88a8b	Revert "[IR]Add NumSrcElts param to is..Mask static function in ShuffleVectorInst." This reverts commit c88c281cf1ac1a01c55231b93826d7c8ae83985b to fix the crash revealed by https://lab.llvm.org/buildbot/#/builders/230/builds/19353.	2023-09-28 11:57:32 -07:00
Noah Goldstein	de7881ebf5	[DAGCombiner] Combine `(select c, (and X, 1), 0)` -> `(and (zext c), X)` The middle end canonicalizes: `(and (zext c), X)` -> `(select c, (and X, 1), 0)` But the `and` + `zext` form gets better codegen.	2023-09-28 13:46:46 -05:00
Alexey Bataev	c88c281cf1	[IR]Add NumSrcElts param to is..Mask static function in ShuffleVectorInst. Need to add NumSrcElts param to is..Mask functions in ShuffleVectorInstruction class for better mask analysis. Mask.size() not always matches the sizes of the permuted vector(s). Allows to better estimate the cost in SLP and fix uses of the functions in other cases. Differential Revision: https://reviews.llvm.org/D158449	2023-09-28 11:03:21 -07:00
David Green	03647e2e4b	[AArch64] Handle scalable vectors in combineFMulOrFDivWithIntPow2. The transform will still not trigger as takeInexpensiveLog2 will bail out for any scalable vector, but this guards against a scalable typesize error.	2023-09-26 15:34:34 +01:00
Noah Goldstein	bc38c427d4	[DAGCombiner][AArch64] Fix incorrect cast VT in `takeInexpensiveLog2` Previously, we were taking `CurVT` before finalizing `ToCast` which meant potentially returning an `SDValue` with an illegal `ValueType` for the operation. Fix is to just take `CurVT` after we have finalized `ToCast` with `PeekThroughCastsAndTrunc`.	2023-09-23 09:50:42 -05:00
Noah Goldstein	6d6314ba64	[DAGCombiner] Extend `combineFMulOrFDivWithIntPow2` to work for non-splat float vecs Do so by extending `matchUnaryPredicate` to also work for `ConstantFPSDNode` types then encapsulate the constant checks in a lambda and pass it to `matchUnaryPredicate`. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D154868	2023-09-20 13:28:24 -05:00
Noah Goldstein	47c642f9a0	[DAGCombiner] Fold IEEE `fmul`/`fdiv` by Pow2 to `add`/`sub` of exp Note: This is moving D154678 which previously implemented this in InstCombine. Concerns were brought up that this was de-canonicalizing and really targeting a codegen improvement, so placing in DAGCombiner. This implements: ``` (fmul C, (uitofp Pow2)) -> (bitcast_to_FP (add (bitcast_to_INT C), Log2(Pow2) << mantissa)) (fdiv C, (uitofp Pow2)) -> (bitcast_to_FP (sub (bitcast_to_INT C), Log2(Pow2) << mantissa)) ``` The motivation is mostly fdiv where 2^(-p) is a fairly common expression. The patch is intentionally conservative about the transform, only doing so if we: 1) have IEEE floats 2) C is normal 3) add/sub of max(Log2(Pow2)) stays in the min/max exponent bounds. Alive2 can't realistically prove this, but did test float16/float32 cases (within the bounds of the above rules) exhaustively. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D154805	2023-09-20 13:28:24 -05:00
Fraser Cormack	ebefe83c09	[NFC] Fix spelling 'constanst' -> 'constants'	2023-09-20 15:33:03 +01:00
Luke Lau	22d0bd8632	[DAGCombiner] Combine vp.strided.store with unit stride to vp.store (#66774) This is the VP equivalent of #66677. If we have a strided store where the stride is equal to the element width, we can just use a regular VP store.	2023-09-19 16:43:50 +01:00
Luke Lau	469f6b9b4c	[DAGCombiner] Combine vp.strided.load with unit stride to vp.load (#66766) This is the VP equivalent of #65674. We already combine MGATHER loads with unit stride to MLOAD, so this extends it for EXPERIMENTAL_VP_STRIDED_LOAD.	2023-09-19 16:39:28 +01:00
Sergei Barannikov	caaf61eb6e	[SDag] Fold saddo[_carry] with bitwise-not argument to ssubo[_carry] (#66571) Fold `(saddo (not a), 1)` to `(ssubo 0, a)` and `(saddo_carry (not a), b, c)` to `(ssubo_carry b, a, !c)`. Proof: https://alive2.llvm.org/ce/z/Lj49YM This is the same as https://reviews.llvm.org/D46505 and https://reviews.llvm.org/D59208, but for signed opcodes.	2023-09-18 14:45:41 +03:00
Philip Reames	09a5aac514	[TLI] Add extend as explicit parameter to shouldRemoveExtendFromGSIndex [nfc] Note: Reviewed as part of a stack of changes in PR# 66405.	2023-09-15 14:48:02 -07:00
Arthur Eubanks	0a1aa6cda2	[NFC][CodeGen] Change CodeGenOpt::Level/CodeGenFileType into enum classes (#66295) This will make it easy for callers to see issues with and fix up calls to createTargetMachine after a future change to the params of TargetMachine. This matches other nearby enums. For downstream users, this should be a fairly straightforward replacement, e.g. s/CodeGenOpt::Aggressive/CodeGenOptLevel::Aggressive or s/CGFT_/CodeGenFileType::	2023-09-14 14:10:14 -07:00
Philip Reames	61757fbd04	[DAG] Remove pointless peephole from refineUniformBase [nfc] No need to special case add 0, N. SelectionDAG::getNode contains the canonicalization and simplification for this case, so no need to duplicate it here.	2023-09-13 10:16:11 -07:00
Philip Reames	2f005df066	[DAG][X86] Fold mgather/mscatter/etc with splat index (#65980) A splat index means the operation is reading from (writing to) the same memory location. Generally, zero is the cheapest value to splat. As such, we'd prefer to add the splatted value to the base, and use a constant zero as the index operand.	2023-09-13 09:26:30 -07:00
Yingwei Zheng	4793c2c3de	[DAGCombiner][RISCV] Prefer to sext i32 non-negative values (#65984) By default, `DAGCombiner` folds `sext x` to `zext x` when `x` is non-negative. It will generate redundant `zext` inst seq on riscv64 (typically `slli (srli x, 32), 32`). godbolt: https://godbolt.org/z/osf6adP1o This patch applies the transform iff `zext` is cheaper than `sext`.	2023-09-12 19:02:35 +08:00
Mohamed Atef	741c127817	[SelectionDAG] Add computeOverflowForSignedMul / computeOverflowForUnsignedMul overflow handlers Support signed multiplication Support unsigned multiplication Differential Revision: https://reviews.llvm.org/D159406	2023-09-07 10:03:18 +01:00
Simon Pilgrim	84447c044f	[DAG] Add SelectionDAG::isADDLike helper. NFC. Make the DAGCombine helper global so we can more easily reuse it.	2023-09-06 16:54:25 +01:00
Simon Pilgrim	e4d0e12099	[DAG] Fold (shl (sext (add_nsw x, c1)), c2) -> (add (shl (sext x), c2), c1 << c2) (REAPPLIED) Assuming the ADD is nsw then it may be sign-extended to merge with a SHL op in a similar fold to the existing (shl (add x, c1), c2) -> (add (shl x, c2), c1 << c2) fold. This is most useful for helping to expose address math for X86, but has also touched several aarch64 test cases as well. Alive2: https://alive2.llvm.org/ce/z/2UpSbJ Differential Revision: https://reviews.llvm.org/D159198	2023-09-06 13:19:42 +01:00
Dmitri Gribenko	97bf104d97	Revert "[DAG] Fold (shl (sext (add_nsw x, c1)), c2) -> (add (shl (sext x), c2), c1 << c2)" This reverts commit b027ce0ab93060bc6cb79d5402d21520e8b93fb7. This commit breaks Transforms/InferAddressSpaces/AMDGPU/flat_atomic.ll.	2023-09-06 11:28:55 +02:00
Simon Pilgrim	b027ce0ab9	[DAG] Fold (shl (sext (add_nsw x, c1)), c2) -> (add (shl (sext x), c2), c1 << c2) Assuming the ADD is nsw then it may be sign-extended to merge with a SHL op in a similar fold to the existing (shl (add x, c1), c2) -> (add (shl x, c2), c1 << c2) fold. This is most useful for helping to expose address math for X86, but has also touched several aarch64 test cases as well. Alive2: https://alive2.llvm.org/ce/z/2UpSbJ Differential Revision: https://reviews.llvm.org/D159198	2023-09-06 10:06:21 +01:00
David Sherwood	50598f0ff4	[DAGCombiner][SVE] Add support for illegal extending masked loads In some cases where the same mask is used for multiple extending masked loads it can be more efficient to combine the zero- or sign-extend into the load even if it's not a legal or custom operation. This leads to splitting up the extending load into smaller parts, which also requires splitting the mask. For SVE at least this improves the performance of the SPEC benchmark x264 slightly on neoverse-v1 (~0.3%), and at least one other benchmark improves by around 30%. The uplift for SVE seems due to removing the dependencies (vector unpacks) introduced between the loads and the vector operations, since this should increase the level of parallelism. See tests: CodeGen/AArch64/sve-masked-ldst-sext.ll CodeGen/AArch64/sve-masked-ldst-zext.ll https://reviews.llvm.org/D159191	2023-09-05 10:41:21 +00:00
Fangrui Song	111fcb0df0	[llvm] Fix duplicate word typos. NFC Those fixes were taken from https://reviews.llvm.org/D137338	2023-09-01 18:25:16 -07:00
Konstantina Mitropoulou	17fc78e7a4	[DAGCombiner] Change foldAndOrOfSETCC() to optimize and/or patterns with floating points. This reverts commit 48fa79a503a7cf380f98b6335fbd349afae1bd86. Reviewed By: brooksmoses Differential Revision: https://reviews.llvm.org/D159240	2023-08-31 11:36:50 -07:00
Luke Lau	3a4ad45a2c	[DAGCombiner] Combine trunc (splat_vector x) -> splat_vector (trunc x) From the discussion in https://reviews.llvm.org/D158853, moving the truncate into the splat helps more splatted scalar operands get selected on RISC-V, and also avoids the need for splat_vector_parts on RV32. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D159147	2023-08-30 15:22:57 +01:00
Simon Pilgrim	d037445f3a	[DAG] visitSHL - use FoldConstantArithmetic to fold constants in (shl (add x, c1), c2) -> (add (shl x, c2), c1 << c2) fold Matches what we do in the (shl (mul x, c1), c2) -> (mul x, c1 << c2) fold as well as inside visitShiftByConstant	2023-08-29 18:52:24 +01:00
Konstantina Mitropoulou	48fa79a503	Revert "[DAGCombiner] Change foldAndOrOfSETCC() to optimize and/or patterns with floating points." This reverts commit 5ec13535235d07eafd64058551bc495f87c283b1.	2023-08-24 20:39:04 -07:00
Konstantina Mitropoulou	5ec1353523	[DAGCombiner] Change foldAndOrOfSETCC() to optimize and/or patterns with floating points. CMP(A,C)||CMP(B,C) => CMP(MIN/MAX(A,B), C) CMP(A,C)&&CMP(B,C) => CMP(MIN/MAX(A,B), C) If the operands are proven to be non NaN, then the optimization can be applied for all predicates. We can apply the optimization for the following predicates for FMINNUM/FMAXNUM (for quiet and signaling NaNs) and for FMINNUM_IEEE/FMAXNUM_IEEE if we can prove that the operands are not signaling NaNs. - ordered lt/le and || - ordered gt/ge and || - unordered lt/le and && - unordered gt/ge and && Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D155267	2023-08-24 10:48:56 -07:00
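
The fold described in commit 47c642f9a0 above (`fmul`/`fdiv` by a power of two rewritten as an integer `add`/`sub` on the exponent field) can be illustrated outside LLVM. The following is only a rough sketch for IEEE-754 binary32, not the DAGCombiner implementation: the helper names `f32_to_bits`, `bits_to_f32`, `fmul_pow2`, and `fdiv_pow2` are hypothetical, and it assumes the commit's stated preconditions hold (C is a normal float and the adjusted exponent stays within the representable range).

```python
import struct

# Width of the explicit mantissa field of IEEE-754 binary32.
MANTISSA_BITS = 23


def f32_to_bits(x: float) -> int:
    """Reinterpret a binary32 value as its 32-bit integer pattern."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def bits_to_f32(b: int) -> float:
    """Reinterpret a 32-bit integer pattern as a binary32 value."""
    return struct.unpack("<f", struct.pack("<I", b & 0xFFFFFFFF))[0]


def fmul_pow2(c: float, log2_pow2: int) -> float:
    """Compute c * 2**log2_pow2 by adding to the exponent field.

    Assumes c is normal and the result exponent does not overflow.
    """
    return bits_to_f32(f32_to_bits(c) + (log2_pow2 << MANTISSA_BITS))


def fdiv_pow2(c: float, log2_pow2: int) -> float:
    """Compute c / 2**log2_pow2 by subtracting from the exponent field.

    Assumes c is normal and the result exponent does not underflow.
    """
    return bits_to_f32(f32_to_bits(c) - (log2_pow2 << MANTISSA_BITS))
```

For example, `fmul_pow2(1.5, 3)` agrees with `1.5 * 8` and `fdiv_pow2(1.5, 3)` agrees with `1.5 / 8`. Outside the preconditions (zero, denormal, infinity, or NaN inputs, or an exponent that leaves the valid range) the integer arithmetic no longer matches the floating-point result, which is why the commit restricts the transform.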

3685 Commits