llvm-project

Author	SHA1	Message	Date
Craig Topper	8f04d81ede	[SelectionDAG][RISCV] Mask constants to narrow size in TargetLowering::expandUnalignedStore. If the SRL for Hi constant folds, but we don't remove those bits from the Lo, we can end up with strange constant folding through DAGCombine later. I've only seen this with constants being lowered to constant pools during lowering on RISC-V.	2023-09-18 09:10:19 -07:00
Yingwei Zheng	e042ff7eef	[SDAG][RISCV] Avoid expanding is-power-of-2 pattern on riscv32/64 with zbb This patch adjusts the legality check for riscv to use `cpop/cpopw` since `isOperationLegal(ISD::CTPOP, MVT::i32)` returns false on rv64gc_zbb. Clang vs gcc: https://godbolt.org/z/rc3s4hjPh Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D156390	2023-09-17 02:56:09 +08:00
Kazu Hirata	5fb990ac51	[SelectionDAG] Use isNullConstant (NFC)	2023-09-02 09:32:43 -07:00
Fangrui Song	111fcb0df0	[llvm] Fix duplicate word typos. NFC Those fixes were taken from https://reviews.llvm.org/D137338	2023-09-01 18:25:16 -07:00
Simon Pilgrim	2a81396b1b	[DAG] SimplifyDemandedBits - add SMIN/SMAX KnownBits comparison analysis Followup to D158364 Also, final fix for Issue #59902 which noted that the snippet should just return 1	2023-09-01 12:42:30 +01:00
Simon Pilgrim	aca8b9d0d5	[DAG] SimplifyDemandedBits - if we're only demanding the signbits, a MIN/MAX node can be simplified to a OR or AND node Extension to the signbit case, if the signbits extend down through all the demanded bits then SMIN/SMAX/UMIN/UMAX nodes can be simplified to a OR/AND/AND/OR. Alive2: https://alive2.llvm.org/ce/z/mFVFAn (general case) Differential Revision: https://reviews.llvm.org/D158364	2023-09-01 10:56:32 +01:00
Daniel Paoliello	0c5c7b52f0	Emit the CodeView `S_ARMSWITCHTABLE` debug symbol for jump tables The CodeView `S_ARMSWITCHTABLE` debug symbol is used to describe the layout of a jump table, it contains the following information: * The address of the branch instruction that uses the jump table. * The address of the jump table. * The "base" address that the values in the jump table are relative to. * The type of each entry (absolute pointer, a relative integer, a relative integer that is shifted). Together this information can be used by debuggers and binary analysis tools to understand what a jump table indirect branch is doing and where it might jump to. Documentation for the symbol can be found in the Microsoft PDB library dumper: `0fe89a942f/cvdump/dumpsym7.cpp (L5518)` This change adds support to LLVM to emit the `S_ARMSWITCHTABLE` debug symbol as well as to dump it out (for testing purposes). Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D149367	2023-08-31 12:06:50 -07:00
Luke Lau	6e4860f5d0	[SDAG] Add SimplifyDemandedBits support for ISD::SPLAT_VECTOR This improves some cases where a splat_vector uses a build_pair that can be simplified, e.g: (rotl x:i64, splat_vector (build_pair x1:i32, x2:i32)) rotl only demands the bottom 6 bits, so this patch allows it to simplify it to: (rotl x:i64, splat_vector (build_pair x1:i32, undef:i32)) Which in turn improves some cases where a splat_vector_parts is lowered on RV32. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D158839	2023-08-28 10:35:56 +01:00
Arthur Eubanks	0a4fc4ac1c	Revert "Emit the CodeView `S_ARMSWITCHTABLE` debug symbol for jump tables" This reverts commit 8d0c3db388143f4e058b5f513a70fd5d089d51c3. Causes crashes, see comments in https://reviews.llvm.org/D149367. Some follow-up fixes are also reverted: This reverts commit 636269f4fca44693bfd787b0a37bb0328ffcc085. This reverts commit 5966079cf4d4de0285004eef051784d0d9f7a3a6. This reverts commit e7294dbc85d24a08c716d9babbe7f68390cf219b.	2023-08-25 18:34:15 -07:00
Daniel Paoliello	8d0c3db388	Emit the CodeView `S_ARMSWITCHTABLE` debug symbol for jump tables The CodeView `S_ARMSWITCHTABLE` debug symbol is used to describe the layout of a jump table, it contains the following information: * The address of the branch instruction that uses the jump table. * The address of the jump table. * The "base" address that the values in the jump table are relative to. * The type of each entry (absolute pointer, a relative integer, a relative integer that is shifted). Together this information can be used by debuggers and binary analysis tools to understand what a jump table indirect branch is doing and where it might jump to. Documentation for the symbol can be found in the Microsoft PDB library dumper: `0fe89a942f/cvdump/dumpsym7.cpp (L5518)` This change adds support to LLVM to emit the `S_ARMSWITCHTABLE` debug symbol as well as to dump it out (for testing purposes). Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D149367	2023-08-25 10:19:17 -07:00
Simon Pilgrim	d254014fdb	[DAG] Add willNotOverflowAdd/willNotOverflowSub helper functions. Matches similar instructions on InstCombine	2023-08-24 17:52:54 +01:00
Yingwei Zheng	d6639f83a9	[SDAG][RISCV] Avoid folding `setcc (xor C1, -1), C2, cond` into `setcc (xor C2, -1), C1, cond` This patch fixes https://github.com/llvm/llvm-project/issues/64935. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D158654	2023-08-24 04:18:17 +08:00
Kazu Hirata	134115618a	[CodeGen] Use isAllOnesConstant and isNullConstant (NFC)	2023-08-20 22:56:40 -07:00
Simon Pilgrim	95865e5138	[DAG] SimplifyDemandedBits - if we're only demanding the signbit, a SMIN/SMAX node can be simplified to a OR/AND node respectively. Alive2: https://alive2.llvm.org/ce/z/MehvFB REAPPLIED from 54d663d5896008 with fix for using the correct DemandedBits mask.	2023-08-20 14:20:49 +01:00
Craig Topper	0a5347f40d	[DAG] SimplifyDemandedBits - Use DemandedBits instead of OriginalDemandedBits when simplifying UMIN/UMAX to AND/OR. DemandedBits is forced to all ones if there are multiple users. The changed X86 test cases look like they were miscompiles before. The value of eax/rax from the cmov is returned from the function in addition to being used by the sar. That usage needs all bits even though the sar doesn't.	2023-08-18 11:59:18 -07:00
Thurston Dang	29b2009061	Revert "[DAG] SimplifyDemandedBits - if we're only demanding the signbit, a SMIN/SMAX node can be simplified to a OR/AND node respectively." This reverts commit 54d663d5896008c09c938f80357e2a056454bc65, which breaks the test CodeGen/SystemZ/ctpop-01.ll for stage2-ubsan check (see https://lab.llvm.org/buildbot/#/builders/85/builds/18410) I manually confirmed that the test had been passing immediately prior to that commit (BUILDBOT_REVISION=4772c66cfb00d60f8f687930e9dd3aa1b6872228 llvm-zorg/zorg/buildbot/builders/sanitizers/buildbot_bootstrap_ubsan.sh)	2023-08-18 18:08:10 +00:00
Simon Pilgrim	bd9bf9cb67	[X86] SimplifyDemandedBits - move MaskedValueIsZero as late as possible to avoid unnecessary (recursive) analysis costs. NFC. Mentioned on D155472 for the SHL equivalent	2023-08-18 15:14:06 +01:00
Simon Pilgrim	4cd1c07491	[DAG] SimplifyDemandedBits - if we're only demanding the msb, a UMIN/UMAX node can be simplified to a AND/OR node respectively. Alive2: https://alive2.llvm.org/ce/z/qnvmc6	2023-08-18 12:12:22 +01:00
Simon Pilgrim	54d663d589	[DAG] SimplifyDemandedBits - if we're only demanding the signbit, a SMIN/SMAX node can be simplified to a OR/AND node respectively. Alive2: https://alive2.llvm.org/ce/z/MehvFB	2023-08-18 11:35:34 +01:00
Noah Goldstein	e7f7b63fb3	[DAGCombiner][X86] Guard `(X & Y) ==/!= Y` --> `(X & Y) !=/== 0` behind TLI preference On X86 for vec types `(X & Y) == Y` is generally preferable to `(X & Y) != 0`. Creating zero requires an extra instruction and on pre-avx512 targets there is no vector `pcmpne` so it requires two additional instructions to invert the `pcmpeq`. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D157014	2023-08-16 02:00:15 -05:00
Simon Pilgrim	b0a77af4f1	[DAG] SimplifyDemandedBits - add sra(shl(x,c1),c1) -> sign_extend_inreg(x) demanded elts fold Move the sra(shl(x,c1),c1) -> sign_extend_inreg(x) fold inside SimplifyDemandedBits so we can recognize hidden splats with DemandedElts masks. Because the c1 shift amount has multiple uses, hidden splats won't get simplified to a splat constant buildvector - meaning the existing fold in DAGCombiner::visitSRA can't fire as it won't see a uniform shift amount. I also needed to add TLI preferSextInRegOfTruncate hook to help keep truncate(sign_extend_inreg(x)) vector patterns on X86 so we can use PACKSS more efficiently. Differential Revision: https://reviews.llvm.org/D157972	2023-08-15 16:32:03 +01:00
Bjorn Pettersson	e53b28c833	[llvm] Drop some bitcasts and references related to typed pointers Differential Revision: https://reviews.llvm.org/D157551	2023-08-10 15:07:07 +02:00
Alex Bradbury	1cffd26483	[TargetLowering][RISCV] Improve codegen for saturating bf16 to int conversion Extending to f32 first (as is done for f16) results in better generated code for RISC-V (and affects no other in-tree tests). Additionally, performing the FP_EXTEND first seems equally justified for bf16 as for f16. Differential Revision: https://reviews.llvm.org/D156944	2023-08-07 11:21:25 +01:00
Simon Pilgrim	ae60706da0	[DAG] SimplifyDemandedBits - call ComputeKnownBits for constant non-uniform ISD::SRL shift amounts We only attempted to determine KnownBits for uniform constant shift amounts, but ComputeKnownBits is able to handle some non-uniform cases as well that we can use as a fallback.	2023-07-21 14:52:57 +01:00
Simon Pilgrim	7567b72f4d	[DAG] ShrinkDemandedConstant - early-out for empty DemandedBits/Elts Leave this to constant folding in SimplifyDemandedBits Fixes #63975	2023-07-20 12:18:10 +01:00
Simon Pilgrim	d7eb9240c0	[DAG] SimplifyDemandedBits - attempt to use SimplifyMultipleUseDemandedBits for bitcasts from larger element types Attempt to avoid multi-use ops if the bitcast doesn't need anything from them.	2023-07-18 18:38:03 +01:00
Simon Pilgrim	e9caa37e9c	[DAG] Move lshr narrowing from visitANDLike to SimplifyDemandedBits Inspired by some of the cases from D145468 Let SimplifyDemandedBits handle the narrowing of lshr to half-width if we don't require the upper bits, the narrowed shift is profitable and the zext/trunc are free. A future patch will propose the equivalent shl narrowing combine. Differential Revision: https://reviews.llvm.org/D146121	2023-07-17 15:50:09 +01:00
Jon Roelofs	56e60bc5bb	TargetLowering: fix an infinite DAG combine in SimplifySETCC TargetLowering::SimplifySetCC wants to swap the operands of a SETCC to canonicalize the constant to the RHS. The bug here was that it did so whether or not the RHS was already a constant, leading to an infinite loop. rdar://111847838 Differential revision: https://reviews.llvm.org/D155095 This reverts commit cdc633e4bc93d4bf241ecd4c29691ae065749313.	2023-07-12 16:13:27 -07:00
Jon Roelofs	cdc633e4bc	Revert "TargetLowering: fix an infinite DAG combine in SimplifySETCC" This reverts commit b76c85b355578d9076c22a86faf4ea8de1745bdf. It broke the RISCV-enabled bots. Oops.	2023-07-12 12:22:03 -07:00
Jon Roelofs	b76c85b355	TargetLowering: fix an infinite DAG combine in SimplifySETCC TargetLowering::SimplifySetCC wants to swap the operands of a SETCC to canonicalize the constant to the RHS. The bug here was that it did so whether or not the RHS was already a constant, leading to an infinite loop. rdar://111847838 Differential revision: https://reviews.llvm.org/D155095	2023-07-12 11:44:15 -07:00
Matt Arsenault	b59022b42e	DAG: Handle lowering of unordered fcZero|fcSubnormal to fcmp	2023-07-11 18:30:15 -04:00
Matt Arsenault	310f839612	DAG: Lower is.fpclass fcInf to fcmp of fabs InstCombine should have taken care of this, but I think this is more useful in the future when the expansion tries to handle multiple cases at a time with fcmp. x87 looks worse to me but the only thing I know about it is that I aggressively do not care about it. https://reviews.llvm.org/D143198	2023-07-07 17:00:10 -04:00
Matt Arsenault	64df9573a7	DAG: Handle inversion of fcSubnormal | fcZero There are a number of more test combinations here that can be done together and reduce the number of instructions. https://reviews.llvm.org/D143191	2023-07-06 21:19:44 -04:00
Matt Arsenault	61820f8b5d	CodeGen: Optimize lowering of is.fpclass fcZero|fcSubnormal Combine the two checks into a check if the exponent bits are 0. The inverted case isn't reachable until a future change, and GlobalISel currently doesn't attempt the inversion optimization. https://reviews.llvm.org/D143182	2023-07-06 13:03:57 -04:00
Matt Arsenault	1588e18b2d	DAG: Check isCondCodeLegal in is_fpclass expansion to fcmp eq 0 Results in some x86 codegen diffs. Some look better, some look worse. https://reviews.llvm.org/D152094	2023-07-06 13:00:52 -04:00
David Green	f55d96b9a2	[DAG][AArch64] Handle vector types when expanding sdiv/udiv into mulh The aarch64 backend will benefit from expanding 64vector sdiv/udiv into mulh using shift(mul(ext, ext)), as the larger type size is legal and the mul(ext, ext) can efficiently use smull/umull instructions. This extends the existing code in GetMULHS to handle vector types for it. Differential Revision: https://reviews.llvm.org/D154049	2023-07-02 15:02:52 +01:00
Dhruv Chawla	3f77724de7	[TargetLowering] Better code generation for ISD::SADDSAT/SSUBSAT when operand sign is known When the sign of either of the operands is known, it is possible to determine what the saturating value will be without having to compute it using the sign bits. Differential Revision: https://reviews.llvm.org/D153575	2023-06-23 13:20:36 +05:30
Matt Arsenault	18b93562cf	DAG: Expand legalization of is.fpclass to fcmp for DAZ Try to use a compare with 0 if DAZ is assumed. FPClassTest really needs to be marked as a bitmask enum, but the API for that is currently broken.	2023-06-22 06:18:02 -04:00
Noah Goldstein	5c8188c7bc	[DAGCombine] Use `IsKnownNeverZero` to see if we need zero-check in is_pow2 setcc pattern `ctpop(X) eq/ne 1` is checking if X is a non-zero power of 2. Power of 2 check including zero is `(X & (X-1)) eq/ne 0` and unfortunately there is no good pattern for checking a power of 2 while excluding zero. So, when lowering `ctpop(X) eq/ne 1`, explicitly check `IsKnownNeverZero(X)` to maybe be able to optimize out the extra zero check. We need this explicitly as DAGCombiner does not re-analyze provable setcc nodes, and the middle-end never finds it beneficial to broaden `ctpop(X) eq/ne 1` -> `ctpop(X) ule/ugt 1` (power of 2 including zero). Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D152675	2023-06-12 13:52:43 -05:00
Yeting Kuo	2fe2a6d4b8	[DAGCombiner] Use generalized pattern matcher in visitFMA to support vp.fma. Note: Some patterns in visitFMA need to be refined to support splat of constant. Reviewed By: luke Differential Revision: https://reviews.llvm.org/D152260	2023-06-08 09:40:21 +08:00
Craig Topper	ee27e5df9e	[TargetLowering][ARM][AArch64] Remove usage of NoSignedWrap/NoUnsignedWrap from AVGFLOOR/CEIL transform. Use computeOverflowForUnsignedAdd and computeOverflowForSignedAdd instead. Unfortunately, this recomputes some known bits and sign bits we may have already computed, but was the easiest fix without a lot of restructuring. This recovers the regressions from D151472. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D151858	2023-06-01 14:18:08 -07:00
Dhruv Chawla	3b3912e9b8	Reapply [SelectionDAG] Handle NSW for ADD/SUB in computeKnownBits() This exposed a miscompile due to incorrect flag preservation in integer type legalization, which has been fixed in D151472. ----- This patch is a continuation of D150110. It separates the cases for ADD and SUB into their own cases so that computeForAddSub can be directly called and the NSW flag passed. This allows better optimization when the NSW flag is enabled, and allows fixing up the TODO that was there previously in SimplifyDemandedBits. Differential Revision: https://reviews.llvm.org/D150769	2023-05-31 12:25:41 +02:00
Nikita Popov	2ba14283cd	Revert "[SelectionDAG] Handle NSW for ADD/SUB in computeKnownBits()" This reverts commit b66551370fdfc6f357ae0d77237119d2b1077b62. This has exposed a pre-existing miscompile, reported in https://reviews.llvm.org/D150769#4370467.	2023-05-25 11:13:51 +02:00
Dhruv Chawla	b66551370f	[SelectionDAG] Handle NSW for ADD/SUB in computeKnownBits() This patch is a continuation of D150110. It separates the cases for ADD and SUB into their own cases so that computeForAddSub can be directly called and the NSW flag passed. This allows better optimization when the NSW flag is enabled, and allows fixing up the TODO that was there previously in SimplifyDemandedBits. Differential Revision: https://reviews.llvm.org/D150769	2023-05-17 15:15:05 +02:00
Jay Foad	d8229e2f14	[KnownBits] Define and use intersectWith and unionWith Define intersectWith and unionWith as two complementary ways of combining KnownBits. The names are chosen for consistency with ConstantRange. Deprecate commonBits as a synonym for intersectWith. Differential Revision: https://reviews.llvm.org/D150443	2023-05-16 09:23:51 +01:00
Craig Topper	a983ef2c17	[DAGCombiner][AArch64][VE] Teach BuildUDIV/SDIV to use 2x mul when mulh/mul_lohi are not available. Correct the legality of i32 mul_lohi on AArch64. Previously, AArch64 incorrectly reported i32 mul_lohi as Legal. This allowed BuildUDIV/SDIV to use them. A later DAGCombiner would replace them with MULHS/MULHU because only the high half was used. This conversion does not check the legality of MULHS/MULHU under the assumption that LegalizeDAG can turn it back into MUL_LOHI later. After they are converted to MULHS/MULHU, DAGCombine ran and saw that these operations aren't supported but an i64 MUL is. So they get converted to that plus a shift. Without this, LegalizeDAG would convert back MUL_LOHI and isel would fail to find a pattern. This patch teaches BuildUDIV/SDIV to create the wide mul and shift so that we can report the correct operation legality on AArch64. It also enables div by constant folding for more cases on VE. I don't know if VE wants this div by constant optimization or not. If they don't want it, they can use the isIntDivCheap hook to disable it. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D150333	2023-05-12 09:06:17 -07:00
Dhruv Chawla	1d21d2eb7f	[TargetLowering] Fix unnecessary call to `computeKnownBits` (NFCI) In the SimplifyDemandedBits function, there is a fallthrough to the default case in the case of ISD::ADD, ISD::MUL and ISD::SUB. This leads to a call to computeKnownBits which is unnecessary as the calls to SimplifyDemandedBits in the cases themselves handle the calculation of the known bits. This information is discarded through the Known2 variables. By keeping this information around and calling KnownBits::mul or KnownBits::computeForAddSub directly, the unnecessary computation can be avoided. For now, the NSW bit is not passed through to KnownBits as this is something that computeKnownBits does not handle either. This requires updating computeForAddCarry to handle the flag as well. Differential Revision: https://reviews.llvm.org/D150110	2023-05-08 16:14:01 +02:00
Simon Pilgrim	051918c71e	[DAG] expandIntMINMAX - add umax(x,1) --> sub(x,cmpeq(x,0)) fold Move the fold from X86 to generic expansion (We also have several existing expansions that are missing freezes on repeated operands - I've added a TODO for now).	2023-05-05 19:27:52 +01:00
Simon Pilgrim	04e809ab90	[DAG] Add TargetLowering::expandABD and convert X86 lowering to use it Scalar widening cases are still custom lowered in the X86 backend - we still need to add promotion/legalization support to handle these	2023-05-05 15:13:23 +01:00
Evgenii Kudriashov	a82d27a9a6	[X86] Support llvm.{min,max}imum.f{16,32,64} Addresses https://github.com/llvm/llvm-project/issues/53353 Reviewed By: RKSimon, pengfei Differential Revision: https://reviews.llvm.org/D145634	2023-05-04 21:04:48 +08:00
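
Several of the commit messages above rest on small bit-level identities: the is-power-of-2 pattern (5c8188c7bc), the `umax(x,1) --> sub(x,cmpeq(x,0))` fold (051918c71e), and the sign-bit-only simplification of SMIN/SMAX/UMIN/UMAX to OR/AND/AND/OR (aca8b9d0d5). The following is a minimal Python sketch that checks those identities exhaustively on 8-bit values. It is not LLVM code: the helper names, the 8-bit width, and the assumption that `cmpeq` yields all-ones when true are illustrative choices, not taken from the patches.

```python
BITS = 8                      # illustrative width; the folds themselves are width-agnostic
MASK = (1 << BITS) - 1
SIGN = 1 << (BITS - 1)


def to_signed(x):
    """Interpret an unsigned BITS-wide value as two's complement."""
    return x - (1 << BITS) if x & SIGN else x


def is_pow2_via_ctpop(x):
    """ctpop(x) == 1: true only for a non-zero power of 2."""
    return bin(x).count("1") == 1


def is_pow2_via_mask(x, known_nonzero=False):
    """(x & (x - 1)) == 0 also accepts zero, so a zero check is added
    unless the caller already knows x is non-zero."""
    clears = (x & ((x - 1) & MASK)) == 0
    return clears if known_nonzero else (x != 0 and clears)


def umax_with_one(x):
    """umax(x, 1) written as sub(x, cmpeq(x, 0)).
    Assumption: cmpeq produces all-ones (-1) when the comparison is true."""
    cmp = MASK if x == 0 else 0
    return (x - cmp) & MASK


def check_all():
    """Exhaustively verify the identities for every 8-bit input."""
    for a in range(1 << BITS):
        assert is_pow2_via_ctpop(a) == is_pow2_via_mask(a)
        if a != 0:
            assert is_pow2_via_ctpop(a) == is_pow2_via_mask(a, known_nonzero=True)
        assert umax_with_one(a) == max(a, 1)
        for b in range(1 << BITS):
            sa, sb = to_signed(a), to_signed(b)
            # Only the sign bit (msb) is demanded in each comparison below.
            assert ((min(sa, sb) & MASK) & SIGN) == ((a | b) & SIGN)  # smin -> or
            assert ((max(sa, sb) & MASK) & SIGN) == ((a & b) & SIGN)  # smax -> and
            assert (min(a, b) & SIGN) == ((a & b) & SIGN)             # umin -> and
            assert (max(a, b) & SIGN) == ((a | b) & SIGN)             # umax -> or
    return True
```

Calling `check_all()` raises an `AssertionError` if any identity fails on some input and returns `True` otherwise. The `known_nonzero` flag mirrors the role `IsKnownNeverZero(X)` plays in the commit message: it lets the extra zero check be dropped.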


1406 Commits