llvm-project

Author	SHA1	Message	Date
Benjamin Kramer	5f5a64134b	Revert "[DAGCombiner] Simplifying `{si|ui}tofp` when only signbit is needed" This reverts commit 353fbeb0a294d2c7cef6d88607fa0fd50ee81462. It crashes when it encounters a UINT_TO_FP. llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp:1618 in SDValue llvm::SelectionDAG::getConstant(const ConstantInt &, const SDLoc &, EVT, bool, bool): VT.isInteger() && "Cannot create FP integer constant!"	2024-03-20 15:08:37 +01:00
Noah Goldstein	353fbeb0a2	[DAGCombiner] Simplifying `{si|ui}tofp` when only signbit is needed If we only need the signbit, `uitofp` simplifies to 0 and `sitofp` simplifies to `bitcast`. Closes #85138	2024-03-19 17:17:35 -05:00
Craig Topper	23323e2837	[TargetLowering][RISCV] Propagate fastmath flags for the vector operations emitted in expandVecReduce. (#85164) We used the fastmath flags for any scalar ops created, but not vector.	2024-03-14 08:39:32 -07:00
Arthur Eubanks	94c988bcfd	[NFC] Remove unused parameter from shouldAssumeDSOLocal()	2024-03-11 19:48:17 +00:00
Noah Goldstein	61c06775c9	[KnownBits] Add API for `nuw` flag in `computeForAddSub`; NFC	2024-03-05 12:59:58 -06:00
Owen Anderson	2c5a68858b	Fix non-splat vector SREM expansion when one of the divisors is a power of two. (#82706) The expansion previously used, derived from Hacker's Delight, does not work correctly when the dividend is INT_MIN and the divisor is a power of two. We now use an alternate derivation of the A and Q constants specifically for the power-of-two divisor case to avoid this problem. Credit to Fabian Giesen for the new derivation. Fixes https://github.com/llvm/llvm-project/issues/77169	2024-02-25 10:13:05 -05:00
David Majnemer	be36812fb7	[TargetLowering] Be more efficient in fp -> bf16 NaN conversions We can avoid masking completely as it is OK (and probably preferable) to bring over some of the existing NaN payload.	2024-02-21 22:47:27 +00:00
David Majnemer	9eff001d3d	[TargetLowering] Correctly yield NaN from FP_TO_BF16 We didn't set the exponent field, resulting in tiny numbers instead of NaNs.	2024-02-21 22:17:02 +00:00
David Majnemer	ddc0f1d8fe	[TargetLowering] Actually add the adjustment to the significand The logic was supposed to be choosing between {0, 1, -1} as an adjustment to the FP bit pattern. However, the adjustment itself was used as the bit pattern instead, which resulted in garbage results.	2024-02-21 19:34:11 +00:00
David Majnemer	cc13f3ba45	Correctly round FP -> BF16 when SDAG expands such nodes (#82399) We did something pretty naive: (1) round FP64 -> BF16 by first rounding to FP32; (2) skip FP32 -> BF16 rounding entirely; (3) take the top 16 bits of a FP32, which will turn some NaNs into infinities. Let's do this in a more principled way by rounding types with more precision than FP32 to FP32 using round-inexact-to-odd, which will negate double rounding issues.	2024-02-21 12:37:02 -05:00
Craig Topper	d485317357	[TargetLowering] Emit SIGN_EXTEND_INREG instead of shift pair from optimizeSetCCOfSignedTruncationCheck. (#81785) sext_inreg is our canonical form of shift pair before op legalization so DAG combiner will probably create it anyway. If it isn't legal LegalizeDAG will expand to shifts later.	2024-02-15 09:24:02 -08:00
David Green	2e3de997ab	[DAG] Generalize setcc(setcc) fold to use known bits. If we have a `SETCC (SETCC), 0, NE` and ZeroOrOneBooleanContent, we can remove the outer setcc as it will produce the same value as the inner. This can be generalized to anything where the top bits are known to be 0, as the value will remain as 1 or 0.	2024-02-06 12:39:48 +00:00
Craig Topper	f72da9f4fd	[SelectionDAG] Use getShiftAmountConstant to simplify code. NFC (#80561) Replace calls to getShiftAmountTy+getConstant with getShiftAmountConstant.	2024-02-04 16:05:14 -08:00
Kazu Hirata	39fa304866	[llvm] Use StringRef::starts_with (NFC)	2024-01-31 23:54:07 -08:00
PiJoules	a356e6ccad	[SelectionDAG] Expand fixed point multiplication into libcall (#79352) 32-bit ARMv6 with thumb doesn't support MULHS/MUL_LOHI as legal/custom nodes during expansion which will cause fixed point multiplication of _Accum types to fail with fixed point arithmetic. Prior to this, we just happen to use fixed point multiplication on platforms that happen to support these MULHS/MUL_LOHI. This patch attempts to check if the multiplication can be done via libcalls, which are provided by the arm runtime. These libcall attempts are made elsewhere, so this patch refactors that libcall logic into its own functions and the fixed point expansion calls and reuses that logic.	2024-01-30 13:58:55 -08:00
Philip Reames	0fc5f4b524	[DAG] Set nneg flag when forming zext in demanded bits (#72281) We do the same for the analogous transform in DAGCombine, but this case was missed in the recent patch which added support for zext nneg. Sorry for the lack of test coverage. Not sure how to exercise this piece of logic. It appears to have only minimal impact on LIT tests (only test/CodeGen/X86/wide-scalar-shift-by-byte-multiple-legalization.ll), and even then, the changes without it appear uninteresting. Maybe we should remove this transform instead?	2024-01-18 07:34:08 -08:00
Alex Bradbury	2d54ec36f7	[SelectionDAG] Add and use SDNode::getAsAPIntVal() helper (#77455) This is the logical equivalent for #76710 for APInt and uses the same naming scheme. Converted existing users through: `git grep -l "cast<ConstantSDNode>\(.\).getAPIntValueValue" | xargs sed -E -i 's/cast<ConstantSDNode>\((.*)\)->getAPIntValue/\1->getAsAPIntVal/'`	2024-01-09 14:27:07 +00:00
Simon Pilgrim	d460c1de3b	[DAG] SimplifyDemandedBits - don't fold sext(x) -> aext(x) if we lose a 0/-1 allsignbits mask (#77296) For targets that use 0/-1 boolean results, we want to keep this pattern through extensions/truncations as much as possible - so avoid simplifying to any_extend even if we don't demand the upper bits. Noticed in triage for https://reviews.llvm.org/D152928	2024-01-08 18:01:41 +00:00
Simon Pilgrim	f45b75949d	[DAG] SimplifyDemandedBits - call demanded elts variant directly for SELECT/SELECT_CC nodes. Don't rebuild the demanded elts mask every time.	2024-01-04 10:53:45 +00:00
Simon Pilgrim	72db578d71	[DAG] Fix typo in VSELECT SimplifyDemandedVectorElts handling. NFC. Rename UndefZero -> UndefSel (undefined elements from Sel operand).	2024-01-04 10:50:42 +00:00
David Green	771fd1ad2a	[DAG] Extend input types if needed in combineShiftToAVG. (#76791) This attempts to fix #76734 which is a crash in invalid TRUNC nodes types from unoptimized input code in combineShiftToAVG. The NVT can be VT if the larger type was legal and the adds will not overflow, in which case the inputs should be extended. From what I can tell this appears to be valid (if not optimal for this case): https://alive2.llvm.org/ce/z/fRieHR The result has also been changed to getExtOrTrunc in case that VT==NVT, which is not handled by SEXT/ZEXT.	2024-01-03 10:52:01 +00:00
Craig Topper	bbd57e1832	[SelectionDAG] Add initial plumbing for the disjoint flag. (#76751) This copies the flag from IR to the SDNode in SelectionDAGBuilder, clears the flag in SimplifyDemandedBits, and adds it to canCreateUndefOrPoison. Uses of the flag will come in later patches.	2024-01-02 21:58:00 -08:00
Sander de Smalen	81b7f115fb	[llvm][TypeSize] Fix addition/subtraction in TypeSize. (#72979) It seems TypeSize is currently broken in the sense that: TypeSize::Fixed(4) + TypeSize::Scalable(4) => TypeSize::Fixed(8) without failing its assert that explicitly tests for this case: assert(LHS.Scalable == RHS.Scalable && ...); The reason this fails is that `Scalable` is a static method of class TypeSize, and LHS and RHS are both objects of class TypeSize. So this is evaluating if the pointer to the function Scalable == the pointer to the function Scalable, which is always true because LHS and RHS have the same class. This patch fixes the issue by renaming `TypeSize::Scalable` -> `TypeSize::getScalable`, as well as `TypeSize::Fixed` to `TypeSize::getFixed`, so that it no longer clashes with the variable in FixedOrScalableQuantity. The new methods now also better match the coding standard, which specifies that: * Variable names should be nouns (as they represent state) * Function names should be verb phrases (as they represent actions)	2023-11-22 08:52:53 +00:00
Simon Pilgrim	98efa8f9aa	[DAG] Fix ShrinkDemandedOp doxygen description to match behaviour. NFC. ShrinkDemandedOp checks for both isTruncateFree AND isZExtFree but extends with ANY_EXTEND.	2023-11-18 22:44:08 +00:00
Tavian Barnes	75cf672b12	[SDAG] Simplify is-power-of-2 codegen (#72275) When x is not known to be nonzero, `ctpop(x) == 1` is expanded to `x != 0 && (x & (x - 1)) == 0`, resulting in codegen like `leal -1(%rdi), %eax; testl %eax, %edi; sete %cl; testl %edi, %edi; setne %al; andb %cl, %al`. But another expression that works is `(x ^ (x - 1)) > x - 1`, which has nicer codegen: `leal -1(%rdi), %eax; xorl %eax, %edi; cmpl %eax, %edi; seta %al`	2023-11-15 22:26:34 +09:00
Yingwei Zheng	650026897c	[RISCV][SDAG] Prefer ShortForwardBranch to lower sdiv by pow2 (#67364) This patch lowers `sdiv x, +/-2**k` to `add + select + shift` when the short forward branch optimization is enabled. The latter inst seq performs faster than the seq generated by target-independent DAGCombiner. This algorithm is described in Hacker's Delight. This patch also removes duplicate logic in the X86 and AArch64 backend. But we cannot do this for the PowerPC backend since it generates a special instruction `addze`.	2023-11-10 21:38:47 +08:00
Craig Topper	70b35ec0a8	[SelectionDAG] Add initial support for nneg flag on ISD::ZERO_EXTEND. (#70872) This adds the nneg flag to SDNodeFlags and the node printing code. SelectionDAGBuilder will add this flag to the node if the target doesn't prefer sign extend. A future RISC-V patch can remove the sign extend preference from SelectionDAGBuilder. I've also added the flag to the DAG combine that converts ISD::SIGN_EXTEND to ISD::ZERO_EXTEND.	2023-11-03 11:15:08 -07:00
Qiu Chaofan	b46e768455	[DAGCombine] Fold setcc_eq infinity into is.fpclass (#67829)	2023-11-01 11:51:15 +09:00
Simon Pilgrim	8d2efd7427	[DAG] Avoid ComputeNumSignBits call when we know the result is unsigned D146121 needs to set the NSW flag, but given the result is NUW then we know that the result has leading zeros, so we don't need to call ComputeNumSignBits - just reuse the existing KnownBits value instead.	2023-10-29 17:35:24 +00:00
Simon Pilgrim	d96529af3c	[DAG] Attempt shl narrowing in SimplifyDemandedBits (REAPPLIED) If a shl node leaves the upper half bits zero / undemanded, then see if we can profitably perform this with a half-width shl and a free trunc/zext. Followup to D146121 Reapplied - moved after the ShrinkDemandedOp call; reuse the existing KnownBits result; ensure that we only attempt this if all the upper bits are demanded; 547dc461225ba should address the remaining regressions that were noticed in the previous commit. Differential Revision: https://reviews.llvm.org/D155472	2023-10-29 15:38:46 +00:00
Simon Pilgrim	547dc46122	[DAG] SimplifyDemandedBits - ensure we drop NSW/NUW flags when we simplify a SHL node's input We already do this for variable shifts, but we missed it for constant shifts Fixes #69965	2023-10-26 10:34:58 +01:00
Simon Pilgrim	2a40ec2d3e	[DAG] SimplifyDemandedBits - fix isOperationLegal typo in D146121 We need to check that the simplified ISD::SRL node is legal, not the old one Noticed while trying to isolate the regressions in D155472	2023-10-17 17:50:12 +01:00
Kirill Stoimenov	0a776996af	Revert "[DAG] Attempt shl narrowing in SimplifyDemandedBits" This reverts commit 7a8c04ef84ecdab4390b451d4c2fe17bc45a7b63.	2023-10-04 22:15:41 +00:00
Simon Pilgrim	7a8c04ef84	[DAG] Attempt shl narrowing in SimplifyDemandedBits If a shl node leaves the upper half bits zero / undemanded, then see if we can profitably perform this with a half-width shl and a free trunc/zext. Followup to D146121 Differential Revision: https://reviews.llvm.org/D155472	2023-10-04 10:23:02 +01:00
Nick Desaulniers	e0a48c065b	[InlineAsm] add comments for NumOperands and ConstraintType (#67474) Splitting up patches for #20571. I found these comments generally useful to add and not predicated on those changes. Hopefully they help future travelers.	2023-09-28 08:24:56 -07:00
Nick Desaulniers	35a364fa5c	[TargetLowering] fix index OOB (#67494) I accidentally introduced this in commit 330fa7d2a4e0 ("[TargetLowering] Deduplicate choosing InlineAsm constraint between ISels (#67057)") Fix forward.	2023-09-26 15:50:26 -07:00
Sam McCall	679c3a1791	[TargetLowering] use stable_sort to avoid nondeterminism After 330fa7d2a4e0cfbb4b078 we were seeing nondeterministic failures of llvm/test/CodeGen/ARM/thumb-big-stack.ll, with different code being generated in different runs. Switching sort -> stable_sort fixes this. It looks like the old algorithm picked the first best option, and using stable_sort restores that behavior.	2023-09-26 15:16:09 +02:00
Nick Desaulniers	330fa7d2a4	[TargetLowering] Deduplicate choosing InlineAsm constraint between ISels (#67057) Given a list of constraints for InlineAsm (ex. "imr") I'm looking to modify the order in which they are chosen. Before doing so, I noticed a fair amount of logic is duplicated between SelectionDAGISel and GlobalISel for this. That is because SelectionDAGISel is also trying to lower immediates during selection. If we detangle these concerns into: 1. choose the preferred constraint 2. attempt to lower that constraint Then we can slide down the list of constraints until we find one that can be lowered. That allows the implementation to be shared between instruction selection frameworks. This makes it so that later I might only need to adjust the priority of constraints in one place, and have both selectors behave the same.	2023-09-25 08:53:03 -07:00
Sirish Pande	e6f9483f77	[SelectionDAG] Flags are dropped when creating a new FMUL (#66701) While simplifying some vector operators in DAG combine, we may need to create new instructions for simplified vectors. At that time, we need to make sure that all the flags of the new instruction are copied/modified from the old instruction. If "contract" is dropped from an instruction like FMUL, it may not generate FMA instruction which would impact performance. Here's an example where "contract" flag is dropped when FMUL is created. Replacing.2 t42: v2f32 = fmul contract t41, t38 With: t48: v2f32 = fmul t38, t38 Co-authored-by: Sirish Pande <sirish.pande@amd.com>	2023-09-21 10:26:34 -05:00
Craig Topper	8f04d81ede	[SelectionDAG][RISCV] Mask constants to narrow size in TargetLowering::expandUnalignedStore. If the SRL for Hi constant folds, but we don't remove those bits from the Lo, we can end up with strange constant folding through DAGCombine later. I've only seen this with constants being lowered to constant pools during lowering on RISC-V.	2023-09-18 09:10:19 -07:00
Yingwei Zheng	e042ff7eef	[SDAG][RISCV] Avoid expanding is-power-of-2 pattern on riscv32/64 with zbb This patch adjusts the legality check for riscv to use `cpop/cpopw` since `isOperationLegal(ISD::CTPOP, MVT::i32)` returns false on rv64gc_zbb. Clang vs gcc: https://godbolt.org/z/rc3s4hjPh Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D156390	2023-09-17 02:56:09 +08:00
Kazu Hirata	5fb990ac51	[SelectionDAG] Use isNullConstant (NFC)	2023-09-02 09:32:43 -07:00
Fangrui Song	111fcb0df0	[llvm] Fix duplicate word typos. NFC Those fixes were taken from https://reviews.llvm.org/D137338	2023-09-01 18:25:16 -07:00
Simon Pilgrim	2a81396b1b	[DAG] SimplifyDemandedBits - add SMIN/SMAX KnownBits comparison analysis Followup to D158364 Also, final fix for Issue #59902 which noted that the snippet should just return 1	2023-09-01 12:42:30 +01:00
Simon Pilgrim	aca8b9d0d5	[DAG] SimplifyDemandedBits - if we're only demanding the signbits, a MIN/MAX node can be simplified to an OR or AND node Extension to the signbit case, if the signbits extend down through all the demanded bits then SMIN/SMAX/UMIN/UMAX nodes can be simplified to an OR/AND/AND/OR. Alive2: https://alive2.llvm.org/ce/z/mFVFAn (general case) Differential Revision: https://reviews.llvm.org/D158364	2023-09-01 10:56:32 +01:00
Daniel Paoliello	0c5c7b52f0	Emit the CodeView `S_ARMSWITCHTABLE` debug symbol for jump tables The CodeView `S_ARMSWITCHTABLE` debug symbol is used to describe the layout of a jump table; it contains the following information: * The address of the branch instruction that uses the jump table. * The address of the jump table. * The "base" address that the values in the jump table are relative to. * The type of each entry (absolute pointer, a relative integer, a relative integer that is shifted). Together this information can be used by debuggers and binary analysis tools to understand what a jump table indirect branch is doing and where it might jump to. Documentation for the symbol can be found in the Microsoft PDB library dumper: `0fe89a942f/cvdump/dumpsym7.cpp (L5518)` This change adds support to LLVM to emit the `S_ARMSWITCHTABLE` debug symbol as well as to dump it out (for testing purposes). Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D149367	2023-08-31 12:06:50 -07:00
Luke Lau	6e4860f5d0	[SDAG] Add SimplifyDemandedBits support for ISD::SPLAT_VECTOR This improves some cases where a splat_vector uses a build_pair that can be simplified, e.g: (rotl x:i64, splat_vector (build_pair x1:i32, x2:i32)) rotl only demands the bottom 6 bits, so this patch allows it to simplify it to: (rotl x:i64, splat_vector (build_pair x1:i32, undef:i32)) Which in turn improves some cases where a splat_vector_parts is lowered on RV32. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D158839	2023-08-28 10:35:56 +01:00
Arthur Eubanks	0a4fc4ac1c	Revert "Emit the CodeView `S_ARMSWITCHTABLE` debug symbol for jump tables" This reverts commit 8d0c3db388143f4e058b5f513a70fd5d089d51c3. Causes crashes, see comments in https://reviews.llvm.org/D149367. Some follow-up fixes are also reverted: This reverts commit 636269f4fca44693bfd787b0a37bb0328ffcc085. This reverts commit 5966079cf4d4de0285004eef051784d0d9f7a3a6. This reverts commit e7294dbc85d24a08c716d9babbe7f68390cf219b.	2023-08-25 18:34:15 -07:00
Daniel Paoliello	8d0c3db388	Emit the CodeView `S_ARMSWITCHTABLE` debug symbol for jump tables The CodeView `S_ARMSWITCHTABLE` debug symbol is used to describe the layout of a jump table; it contains the following information: * The address of the branch instruction that uses the jump table. * The address of the jump table. * The "base" address that the values in the jump table are relative to. * The type of each entry (absolute pointer, a relative integer, a relative integer that is shifted). Together this information can be used by debuggers and binary analysis tools to understand what a jump table indirect branch is doing and where it might jump to. Documentation for the symbol can be found in the Microsoft PDB library dumper: `0fe89a942f/cvdump/dumpsym7.cpp (L5518)` This change adds support to LLVM to emit the `S_ARMSWITCHTABLE` debug symbol as well as to dump it out (for testing purposes). Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D149367	2023-08-25 10:19:17 -07:00
Simon Pilgrim	d254014fdb	[DAG] Add willNotOverflowAdd/willNotOverflowSub helper functions. Matches similar instructions on InstCombine	2023-08-24 17:52:54 +01:00
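
Note on commit 75cf672b12 ([SDAG] Simplify is-power-of-2 codegen): the message states that `(x ^ (x - 1)) > x - 1` can stand in for `x != 0 && (x & (x - 1)) == 0` when x may be zero. The following is a minimal standalone sketch of that identity in plain C++, not the SelectionDAG code from the patch. All function names here are hypothetical and chosen for illustration; the comparison is assumed to be unsigned, matching the `seta` in the quoted codegen.

```cpp
#include <cassert>
#include <cstdint>

// Reference definition: exactly one bit is set (counted bit by bit).
bool isPow2ByPopcount(uint32_t x) {
  unsigned count = 0;
  for (; x != 0; x >>= 1)
    count += x & 1u;
  return count == 1;
}

// Two-test form quoted in the commit message.
bool isPow2TwoTests(uint32_t x) {
  return x != 0 && (x & (x - 1u)) == 0;
}

// Single unsigned compare quoted in the commit message.
// For x == 0, x - 1 wraps to all ones, so the compare is false.
bool isPow2OneCompare(uint32_t x) {
  return (x ^ (x - 1u)) > (x - 1u);
}

// Returns true if all three forms agree for every x in [0, limit).
bool formsAgreeBelow(uint32_t limit) {
  for (uint32_t x = 0; x < limit; ++x) {
    bool expected = isPow2ByPopcount(x);
    if (isPow2TwoTests(x) != expected || isPow2OneCompare(x) != expected)
      return false;
  }
  return true;
}

// Convenience wrapper: aborts (in builds with assertions enabled)
// if the forms ever disagree on 16-bit inputs.
void verifyForms() {
  assert(formsAgreeBelow(1u << 16));
}
```

Calling `verifyForms()` from a test harness checks the three forms against each other on all 16-bit inputs; the zero case is the one that separates the single-compare form from the plain `(x & (x - 1)) == 0` test.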


1445 Commits