llvm-project

Author	SHA1	Message	Date
Simon Pilgrim	056cf936a7	[DAG] Fold (and X, (bswap/bitreverse (not Y))) -> (and X, (not (bswap/bitreverse Y))) (#112547) On ANDNOT-capable targets we can always do this profitably; without ANDNOT we only attempt this if we don't introduce an additional NOT. Fixes #112425	2024-10-28 11:52:44 +00:00
James Chesterman	11c818816d	[AArch64] Improve index selection for histograms (#111150) Removes unnecessary extends on the indices passed into histogram instructions. It also removes the instruction when the mask is zero.	2024-10-22 11:14:00 +01:00
Simon Pilgrim	f0b3b6d15b	[DAG] isConstantIntBuildVectorOrConstantInt - peek through bitcasts (#112710) (REAPPLIED) Alter both isConstantIntBuildVectorOrConstantInt + isConstantFPBuildVectorOrConstantFP to return a bool instead of the underlying SDNode, and adjust usage to account for this. Update isConstantIntBuildVectorOrConstantInt to peek through bitcasts when attempting to find a constant, in particular this improves canonicalization of constants to the RHS on commutable instructions. X86 is the beneficiary here as it often bitcasts rematerializable 0/-1 vector constants as vXi32 and bitcasts to the requested type. Minor cleanup that helps with #107423. Reapplied after regression fix ba1255def64a9c3c68d97ace051eec76f546eeb0	2024-10-20 14:23:21 +01:00
Simon Pilgrim	ba1255def6	[DAG] Use FoldConstantArithmetic to constant fold (and (ext (and V, c1)), c2) -> (and (ext V), (and c1, (ext c2))) Noticed while triaging the regression from #112710 reported by @mstorsjo - don't rely on isConstantIntBuildVectorOrConstantInt+getNode to guarantee constant folding (if it fails to constant fold it will infinite loop), use FoldConstantArithmetic instead.	2024-10-20 13:05:23 +01:00
Martin Storsjö	b26df3e463	Revert "[DAG] isConstantIntBuildVectorOrConstantInt - peek through bitcasts (#112710)" This reverts commit a630771b28f4b252e2754776b8f3ab416133951a. This caused compilation to hang for Windows/ARM, see https://github.com/llvm/llvm-project/pull/112710 for details.	2024-10-20 00:49:16 +03:00
Simon Pilgrim	93ec08d629	[DAG] Move SIGN_EXTEND_INREG constant folding inside FoldConstantArithmetic Update visitSIGN_EXTEND_INREG to call FoldConstantArithmetic instead of getNode.	2024-10-19 20:57:07 +01:00
Simon Pilgrim	e1330d96a0	[DAG] visitFMA/FDIV - avoid SDLoc duplication. NFC.	2024-10-18 11:57:41 +01:00
Simon Pilgrim	5c37316b54	[DAG] visitFMA/FMAD - use FoldConstantArithmetic to add missing vector constant folding support	2024-10-18 11:12:06 +01:00
Simon Pilgrim	a630771b28	[DAG] isConstantIntBuildVectorOrConstantInt - peek through bitcasts (#112710) Alter both isConstantIntBuildVectorOrConstantInt + isConstantFPBuildVectorOrConstantFP to return a bool instead of the underlying SDNode, and adjust usage to account for this. Update isConstantIntBuildVectorOrConstantInt to peek through bitcasts when attempting to find a constant, in particular this improves canonicalization of constants to the RHS on commutable instructions. X86 is the beneficiary here as it often bitcasts rematerializable 0/-1 vector constants as vXi32 and bitcasts to the requested type. Minor cleanup that helps with #107423	2024-10-18 10:52:55 +01:00
Simon Pilgrim	3ec1b1a4dd	[DAG] visitFP_EXTEND - use FoldConstantArithmetic to attempt to constant fold Don't rely on isConstantFPBuildVectorOrConstantFP followed by getNode() to constant fold - FoldConstantArithmetic will do all of this for us.	2024-10-18 10:10:44 +01:00
Simon Pilgrim	3a1df05ca9	[DAG] visitFP_ROUND - use FoldConstantArithmetic to attempt to constant fold Don't rely on isConstantFPBuildVectorOrConstantFP followed by getNode() to constant fold - FoldConstantArithmetic will do all of this for us.	2024-10-18 10:10:43 +01:00
Simon Pilgrim	7a43be1690	[DAG] visitXROUND - use FoldConstantArithmetic to attempt to constant fold Don't rely on isConstantFPBuildVectorOrConstantFP followed by getNode() to constant fold - FoldConstantArithmetic will do all of this for us.	2024-10-18 10:10:43 +01:00
Simon Pilgrim	c72992bf89	[DAG] visitABS - use FoldConstantArithmetic to attempt to constant fold Don't rely on isConstantFPBuildVectorOrConstantFP followed by getNode() to constant fold - FoldConstantArithmetic will do all of this for us. Cleanup for #112682	2024-10-18 10:10:43 +01:00
Simon Pilgrim	256bbdb3f6	[DAG] visitFCEIL/FTRUNC/FFLOOR/FNEG - use FoldConstantArithmetic to attempt to constant fold Don't rely on isConstantFPBuildVectorOrConstantFP followed by getNode() to constant fold - FoldConstantArithmetic will do all of this for us. Cleanup for #112682	2024-10-17 16:53:44 +01:00
Simon Pilgrim	cf046c8717	[DAG] visitSIGN_EXTEND_INREG - avoid SDLoc duplication. NFC.	2024-10-17 12:51:11 +01:00
Simon Pilgrim	5692a0c6f8	[DAG] visitFP_TO_SINT/FP_TO_UINT - use FoldConstantArithmetic to attempt to constant fold Don't rely on isConstantFPBuildVectorOrConstantFP followed by getNode() to constant fold - FoldConstantArithmetic will do all of this for us. Cleanup for #112682	2024-10-17 12:50:09 +01:00
Simon Pilgrim	784c15a282	[DAG] visitSINT_TO_FP/UINT_TO_FP - use FoldConstantArithmetic to attempt to constant fold Don't rely on isConstantIntBuildVectorOrConstantInt followed by getNode() to constant fold - FoldConstantArithmetic will do all of this for us. Cleanup for #112682	2024-10-17 12:50:09 +01:00
Simon Pilgrim	8268bc48eb	[DAG] Avoid SDLoc duplication in FP<->INT combines. NFC.	2024-10-17 12:50:09 +01:00
Lewis Crawford	f5f00764ab	[DAGCombiner] Fix check for extending loads (#112182) Fix a check for extending loads in DAGCombiner, where if the result type has more bits than the loaded type it should count as an extending load. All backends apart from AArch64 ignore this ExtTy argument to shouldReduceLoadWidth, so this change currently only impacts AArch64.	2024-10-16 13:23:46 +01:00
Simon Pilgrim	25b702f263	[DAG] visitXOR - add missing comment for or/and constant setcc demorgan fold. NFC. Noticed while triaging #112347 which is using this fold - we described the or->and fold, but not the equivalent and->or which is also handled.	2024-10-16 11:15:36 +01:00
Simon Pilgrim	30deb76d46	[DAG] visitXOR - add missing comment for or/and constant demorgan fold. NFC. Noticed while triaging #112347 which is using this fold - we described the or->and fold, but not the equivalent and->or which is also handled.	2024-10-15 16:32:27 +01:00
c8ef	854ded9b24	Reapply "[DAG] Enhance SDPatternMatch to match integer minimum and maximum patterns in addition to the existing ISD nodes." (#112203) This patch adds icmp+select patterns for integer min/max matchers in SDPatternMatch, similar to those in IR PatternMatch. Reapply #111774. Closes #108218.	2024-10-15 21:07:06 +08:00
c8ef	a3b0c31ebc	Revert "[DAG] Enhance SDPatternMatch to match integer minimum and maximum patterns in addition to the existing ISD nodes." (#112200) Reverts llvm/llvm-project#111774. This appears to be causing some tests to fail.	2024-10-14 21:43:49 +08:00
c8ef	11f625cb87	[DAG] Enhance SDPatternMatch to match integer minimum and maximum patterns in addition to the existing ISD nodes. (#111774) Closes #108218. This patch adds icmp+select patterns for integer min/max matchers in SDPatternMatch, similar to those in IR PatternMatch.	2024-10-14 21:19:34 +08:00
Oliver Stannard	1e49670b31	[DAGISel] Keep flags when converting FP load/store to integer (#111679) This DAG combine replaces a floating-point load/store pair which has no other uses with an integer one, but did not copy the memory operand flags to the new instructions, resulting in it dropping the volatile flag. This optimisation is still valid if one or both of the instructions is volatile, so we can copy over the whole MachineMemOperand to generate volatile integer loads and stores where needed.	2024-10-10 09:17:50 +01:00
Simon Pilgrim	1dcb6dc757	[DAG] foldVSelectToSignBitSplatMask - pull out repeated code and use getShiftAmountConstant helper. We're assuming shift amount type matches the result type - which is true for vectors, but I'm hoping to generalize this fold in the future.	2024-10-08 17:36:34 +01:00
Simon Pilgrim	520562c597	Revert 412d59f0a510a05c08ed45545943dfd2f901bc5d "[DAG] combineShiftToMULH - handle zext nneg as sext" Reverting until I can investigate a miscompilation reported by @mstorsjo	2024-10-01 10:52:27 +01:00
Simon Pilgrim	412d59f0a5	[DAG] combineShiftToMULH - handle zext nneg as sext Fixes poor codegen on AVX512 targets for a test case from #109790	2024-09-30 12:12:32 +01:00
Pawan Nirpal	26f272ebbd	[X86][SelectionDAG] - Add support for llvm.canonicalize intrinsic (#106370) Enable support for fcanonicalize intrinsic lowering.	2024-09-23 12:15:38 +01:00
Pierre van Houtryve	758444ca3e	[AMDGPU] Promote uniform ops to I32 in DAGISel (#106383) Promote uniform binops, selects and setcc between 2 and 16 bits to 32 bits in DAGISel. Solves #64591	2024-09-19 09:00:21 +02:00
David Green	2242cd2b6a	[DAG] Fold vecreduce.or(sext(x)) to sext(vecreduce.or(x)) (#108959) The same is true for and / xor reductions, where the sext / zext can be sunk down through the bitwise operation. https://alive2.llvm.org/ce/z/TvzCd5	2024-09-17 15:24:00 +01:00
Matt Arsenault	c49a1ae6d6	DAG: Reorder isFMAFasterThanFMulAndFAdd checks (NFC) Basic legality checks should be first.	2024-09-15 16:33:01 +04:00
Robert Dazi	8837898b8d	[DAGCombine] Count leading ones: refine post DAG/Type Legalisation if promotion (#102877) This PR is related to #99591. In this PR, instead of modifying how the legalisation occurs depending on surrounding instructions, we refine after legalisation. This PR has two parts: * `SDPatternMatch/MatchContext`: Slightly modify the code that matches Operands (used by `m_Node(...)`) and Unary/Binary/Ternary Patterns to make it compatible with `VPMatchContext`, instead of only supporting `m_Opc`. Some tests were added to ensure no regressions. * `DAGCombiner`: Add a `foldSubCtlzNot` which detects and rewrites the patterns using a matching context. Remaining Tasks: - [ ] GlobalISel - [ ] Currently the pattern matching will occur even before legalisation. Should I restrict it to specific stages instead? - [ ] Style: Add a visitVP_SUB? Move `foldSubCtlzNot` to another location for style consistency? @topperc --------- Co-authored-by: v01dxyz <v01dxyz@v01d.xyz>	2024-09-15 15:48:36 +04:00
Simon Pilgrim	5910e8d607	[DAG] visitUDIV - call SimplifyDemandedBits to handle hidden constant foldable cases Fixes #108728	2024-09-15 12:29:28 +01:00
Simon Pilgrim	69a21154ca	[DAG] Fold trunc(srl(extract_elt(vec,c1),c2)) -> extract_elt(bitcast(vec),c3) (#107987) Extends existing trunc(extract_elt(vec,c1)) -> extract_elt(bitcast(vec),c3) fold. Noticed while working on #107404	2024-09-13 15:13:58 +01:00
Simon Pilgrim	6ec889e53f	[DAG] Add support for neg(abd(x,y)) patterns. Currently limited to cases which have legal/custom ABDS/ABDU handling - I'll extend this for all targets in future (similar to how we support neg(abs(x))) once I've addressed some outstanding regressions on aarch64/riscv. Helps avoid a lot of extra cmov instructions on x86 in particular, and allows us to more easily improve the codegen in future commits.	2024-09-06 13:16:09 +01:00
Princeton Ferro	8f77d37f25	[DAGCombiner] cache negative result from getMergeStoreCandidates() (#106949) Cache negative search result from getStoreMergeCandidates() so that mergeConsecutiveStores() does not iterate quadratically over a potentially long sequence of unmergeable stores.	2024-09-04 18:18:53 +04:00
Simon Pilgrim	b25b9a7d6c	[DAG] visitSELECT - add "select usubo(x, y).overflow, (sub y, x), (usubo x, y) -> abdu(x, y)" fold (and neg equivalent) Handle cases where CGP has merged the CMP+SUB into a USUBO node - improves a few outstanding niggles from #100810	2024-09-04 11:59:10 +01:00
Simon Pilgrim	4baf29e81e	[DAG] Handle cases where a shift amount is larger than the pre-extended value bitwidth In the (zext (shl (zext x), cst)) -> (shl (zext x), cst) fold, don't use a bitmask / MaskedValueIsZero as we can't guarantee that the shift amount is in bounds. Fixes #106202	2024-08-27 18:12:24 +01:00
Simon Pilgrim	807557654a	[DAG] visitTRUNCATE_USAT_U - use sd_match to match FP_TO_UINT_SAT pattern. NFC.	2024-08-23 16:39:32 +01:00
Sumanth Gundapaneni	e78156a0e2	Scalarize the vector inputs to llvm.lround intrinsic by default. (#101054) The verifier is updated in a different patch to allow vector types for the llvm.lround and llvm.llround intrinsics.	2024-08-21 12:13:56 -05:00
Björn Pettersson	278fc8efdf	[DAGCombiner] Fix ReplaceAllUsesOfValueWith mutation bug in visitFREEZE (#104924) In visitFREEZE we have been collecting a set/vector of MaybePoisonOperands that was later iterated over, applying a freeze to those operands. However, C-level fuzz testing has discovered that the recursiveness of ReplaceAllUsesOfValueWith may cause later operands in the MaybePoisonOperands vector to be replaced when replacing an earlier operand. That would then turn up as assertion failures (Assertion `N1.getOpcode() != ISD::DELETED_NODE && "Operand is DELETED_NODE!"' failed.) when trying to freeze those later operands. So we need to make sure that the vector with MaybePoisonOperands is mutated as well when needed. Or, as the solution used in this patch, make sure to keep track of operand numbers that should be frozen instead of having a vector of SDValues. And then we can refetch the operands while iterating over operand numbers. The problem was seen after adding SELECT_CC to the set of operations included in "AllowMultipleMaybePoisonOperands". I'm not sure, but I guess that this could happen for other operations as well for which we allow multiple maybe poison operands.	2024-08-21 17:56:27 +02:00
Simon Pilgrim	8109e5de57	[DAG] Add select_cc -> abd folds (#102137) Fixes #100810	2024-08-21 12:07:40 +01:00
Tianqing Wang	7f87b5bf0e	[SelectionDAG][X86] Preserve unpredictable metadata for conditional branches in SelectionDAG, as well as JCCs generated by X86 backend. (#102101) This builds on 09515f2c2, which preserves unpredictable metadata in CodeGen for `select`. This patch does it for conditional branches.	2024-08-19 11:04:48 +08:00
Craig Topper	067f2e9f18	[SelectionDAG] Use getSignedConstant/getAllOnesConstant.	2024-08-17 00:04:01 -07:00
Craig Topper	321de07b77	[DAGCombiner] Remove TRUNCATE_(S/U)SAT_(S/U) from an assert that isn't tested. NFC (#104466)	2024-08-16 08:42:55 -07:00
Craig Topper	e027e04f01	[DAGCombiner] Don't let scalarizeBinOpOfSplats create illegal scalar MULHS/MULHU (#104518) Type legalization lacks generic support for these operations. They are normally only created when the type is legal. This scalarization case is new. We could update type legalization, but there are some corner cases that make it not straightforward. For example, if the promoted type isn't 2x the narrow type we need to over promote. Fixes #104480	2024-08-15 21:07:22 -07:00
YunQiang Su	fb9e685fc4	Intrinsic: introduce minimumnum and maximumnum for IR and SelectionDAG (#96649) C23 introduced new functions fminimum_num and fmaximum_num, and they follow the minimumNumber and maximumNumber of IEEE754-2019. Let's introduce new intrinsics to support them. This patch introduces support only for scalar values. The support of vector (vp, vp.reduce, vector.reduce), experimental.constrained will be added in future patches. With this patch, MIPSr6 and LoongArch can work out of the box with fcanonical and fmax/fmin. Aarch64/PowerPC64 can use the same logic as MIPSr6 and LoongArch, while they have no fcanonical support yet. I will add it in future patches. The FMIN/FMAX instructions of RISC-V follow the minimumNumber/maximumNumber of IEEE754-2019. We can just add it in a future patch. Background: https://discourse.llvm.org/t/rfc-fix-llvm-min-f-and-llvm-max-f-intrinsics/79735 Currently we have fminnum/fmaxnum, which have different behavior on different platforms for NUM vs sNaN: 1) Fallback to fmin(3)/fmax(3): return qNaN. 2) ARM64/ARM32+Neon: same as libc. 3) MIPSr6/LoongArch/RISC-V: return NUM. The fix of fminnum/fmaxnum to follow minNUM/maxNUM of IEEE754-2008 will be submitted as separate patches.	2024-08-15 14:09:36 +08:00
Froster	234cb4c6e3	[SelectionDAG] Scalarize binary ops of splats before legal types (#100749) Fixes #65072. This allows binary ops of splats to be scalarized if the operation isn't legal on the element type, but is legal on the type it will be legalized to. I assume that if an Op is legal in both scalar and vector form, choosing the scalar version should always be better no matter what the type is. There are some cases that my approach can't scalarize, for example: ``` llvm ; test/CodeGen/RISCV/rvv/select-int.ll define <vscale x 4 x i64> @select_nxv4i64(i1 zeroext %c, <vscale x 4 x i64> %a, <vscale x 4 x i64> %b) { %v = select i1 %c, <vscale x 4 x i64> %a, <vscale x 4 x i64> %b ret <vscale x 4 x i64> %v } ``` https://godbolt.org/z/xzqrKrxvK `xor (splat i1, splat i1)` is generated in a late step after LegalizeType, from select. I didn't figure out how to make `xor i1, i1` legal at this time. --------- Co-authored-by: Luke Lau <luke@igalia.com>	2024-08-15 00:07:00 +08:00
Kazu Hirata	5ce326ccb1	[SelectionDAG] Construct SmallVector with ArrayRef (NFC) (#103705)	2024-08-14 08:22:20 -07:00

3921 Commits