llvm-project

Author	SHA1	Message	Date
Craig Topper	e30a4fc3e2	[TargetLowering] Improve one signature of forceExpandWideMUL. (#123991 ) We have two forceExpandWideMUL functions. One takes the low and high half of 2 inputs and calculates the low and high half of their product. This does not calculate the full 2x width product. The other signature takes 2 inputs and calculates the low and high half of their full 2x width product. Previously it did this by sign/zero extending the inputs to create the high bits and then calling the other function. We can instead copy the algorithm from the other function and use the Signed flag to determine whether we should do SRA or SRL. This avoids the need to multiply the high part of the inputs and add them to the high half of the result. This improves the generated code for signed multiplication. This should improve the performance of #123262. I don't know yet how close we will get to gcc.	2025-01-23 12:49:35 -08:00
Craig Topper	cdd321462a	[TargetLowering] Use getShiftAmountConstant. NFC (#123802 ) Previously we always used the pointer size which might need to be legalized on some targets.	2025-01-21 12:05:52 -08:00
Graham Hunter	d9f165ddea	[SDAG] Add an ISD node to help lower vector.extract.last.active (#118810 ) Based on feedback from the clastb codegen PR, I'm refactoring basic codegen for the vector.extract.last.active intrinsic to lower to an ISD node in SelectionDAGBuilder then expand in LegalizeVectorOps, instead of doing everything in the builder. The new ISD node (vector_find_last_active) only covers finding the index of the last active element of the mask, and extracting the element + handling passthru is left to existing ISD nodes.	2025-01-20 12:57:05 +00:00
Craig Topper	e2449f1bce	[SelectionDAG] Use SDNode::op_iterator instead of SDNodeIterator. NFC (#122147 ) I think SDNodeIterator primarily exists because GraphTraits requires an iterator that dereferences to SDNode. op_iterator dereferences to SDUse which is implicitly convertible to SDValue. This piece of code can use SDValue instead of SDNode* so we should prefer to use the the more common op_iterator.	2025-01-09 09:09:55 -08:00
Simon Pilgrim	112793a90e	[DAG] expandUINT_TO_FP - use getShiftAmountConstant helper. NFC. Don't bother with separate getShiftAmountTy/getConstant calls.	2025-01-06 18:49:50 +00:00
Alex MacLean	6018820c48	[NVPTX] Fix lowering of i1 SETCC (#115035 ) Add DAG legalization support for expanding i1 SETCC nodes using appropriate logical operations to simulate integer comparisons. Use these expansions to handle i1 SETCC in NVPTX. fixes #58428 and #57405	2024-12-05 12:54:24 -08:00
Simon Pilgrim	b1a48af56a	[DAG] SimplifyDemandedVectorElts - add handling for INT<->FP conversions (#117884 )	2024-12-04 07:37:01 +00:00
Craig Topper	b076fbb844	[TargetLowering] Use Type* instead of EVT in shouldSignExtendTypeInLibCall. (#118587 ) I want to use this function for GISel too so Type * is a better common interface. All of the callers already convert EVT to Type * as needed by calling lowering anyway.	2024-12-03 22:06:55 -08:00
Craig Topper	caa8aa551b	[SelectionDAG] Rename CallOptions::IsSExt to IsSigned. NFC (#118574 ) This is eventually passed to shouldSignExtendTypeInLibCall which calls it IsSigned.	2024-12-03 18:25:44 -08:00
Félix-Antoine Constantin	7a56dc7245	[Clang] Attribute NoFPClass should not prevent tail call optimization. (#116741 ) Fixes #111950	2024-11-22 17:28:45 -08:00
Simon Pilgrim	51809e4a26	[DAG] SimplifyDemandedVectorElts - add SimplifyMultipleUse handling to SEXT/ZEXT/TRUNC nodes (#116227 ) Allows us to bypass multiple uses of a SEXT/ZEXT/TRUNC node operand	2024-11-16 12:40:42 +00:00
Sam Elliott	862f42eedf	[TargetLowering] Use Correct VT for Multi-out Asm (#116024 ) This was overlooked in 7d940432c46be83b8fcb5dbefee439585fa820cd - when inline assembly has multiple outputs, they are returned as members of a struct, and the `getAsmOperandType` needs to be called for each member of struct. The difference between this and the single-output case is that in the latter, there isn't a struct wrapping the outputs. I noticed this when trying to use the same mechanism in the RISC-V backend. Committing two tests: - One that shows a crash before this change, which is fixed by this change. - One (commented out) that shows a different crash with tied inputs/outputs. This is commented as it is not fixed by this change and needs more work in target-independent inline asm handling code.	2024-11-14 12:31:31 +00:00
Alex MacLean	7a8fe0f83c	[SelectionDAG] Fixup type usage of CondCodeAction table (#116082 ) Ensure that all uses of CondCodeAction table are checking the compared types, not the produced type. This is a prerequisite to landing #115035	2024-11-13 13:20:16 -08:00
Yingwei Zheng	f74aed7938	[DAGCombiner] Add basic support for `trunc nsw/nuw` (#113808 ) This patch adds basic support for `trunc nsw/nuw` in SDAG. It will allow DAGCombiner to further eliminate in-reg `zext/sext` instructions.	2024-11-07 00:23:53 +08:00
Simon Pilgrim	9540a7ae82	[DAG] SimplifyMultipleUseDemandedBits - bypass ADD nodes if either operand is zero (#112588 ) The dpbusd_const.ll test change is due to us losing the expanded add reduction pattern as one of the elements is known to be zero (removing one of the adds from the reduction pyramid). I don't think its of concern. Noticed while working on #107423	2024-11-05 17:20:41 +00:00
Simon Pilgrim	0ac2e42227	[DAG] SimplifyDemandedBits - ignore SRL node if we're just demanding known sign bits (#114805 ) Check to see if we are only demanding (shifted) signbits from a SRL node that are also signbits in the source node. We can't demand any upper zero bits that the SRL will shift in (up to max shift amount), and the lower demanded bits bound must already be all signbits. Same fold as #114389 which added this for SimplifyMultipleUseDemandedBits	2024-11-04 17:27:45 +00:00
Sander de Smalen	3098200fcc	[ISel] Propagate disjoint flag in ShrinkDemandedOp (#114560 ) When trying to evaluate an expression in a narrower type, the DAGCombine should propagate the disjoint flag, as it's equally valid on the narrower expression. This helps improve better use of addressing modes for some Arm SME instructions, for example.	2024-11-03 19:42:04 +00:00
Kazu Hirata	6927a434ba	[SelectionDAG] Remove unused includes (NFC) (#114697 ) Identified with misc-include-cleaner.	2024-11-03 07:03:33 -08:00
Simon Pilgrim	9fb4bc5bf4	[DAG] SimplifyMultipleUseDemandedBits - ignore SRL node if we're just demanding known sign bits (#114389 ) Check to see if we are only demanding (shifted) signbits from a SRL node that are also signbits in the source node. We can't demand any upper zero bits that the SRL will shift in (up to max shift amount), and the lower demanded bits bound must already be all signbits.	2024-10-31 16:40:29 +00:00
Kazu Hirata	f582cd3dc7	[SelectionDAG] Fix a warning This patch fixes: llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp:1489:17: error: unused variable 'Flags' [-Werror,-Wunused-variable]	2024-10-30 17:49:51 -07:00
Yingwei Zheng	cf9d1c1486	[SDAG] Simplify `SDNodeFlags` with bitwise logic (#114061 ) This patch allows using enumeration values directly and simplifies the implementation with bitwise logic. It addresses the comment in https://github.com/llvm/llvm-project/pull/113808#discussion_r1819923625.	2024-10-31 08:10:07 +08:00
Matt Arsenault	067e8b8dc5	DAG: Lower fcNormal is.fpclass to compare with inf (#100389 )	2024-10-17 15:49:13 +04:00
Nikita Popov	255a99c29f	[APInt] Fix APInt constructions where value does not fit bitwidth (NFCI) (#80309 ) This fixes all the places that hit the new assertion added in https://github.com/llvm/llvm-project/pull/106524 in tests. That is, cases where the value passed to the APInt constructor is not an N-bit signed/unsigned integer, where N is the bit width and signedness is determined by the isSigned flag. The fixes either set the correct value for isSigned, set the implicitTrunc flag, or perform more calculations inside APInt. Note that the assertion is currently still disabled by default, so this patch is mostly NFC.	2024-10-17 08:48:08 +02:00
Michael Marjieh	b5600c6f85	[TargetLowering][SelectionDAG] Exploit nneg Flag in UINT_TO_FP (#108931 ) 1. Propagate the nneg flag in WidenVecRes 2. Use SINT_TO_FP in expandUINT_TO_FP when possible.	2024-10-14 20:55:48 +04:00
YunQiang Su	d52c8408ff	SelectionDAG/expandFMINNUM_FMAXNUM: skips vector if SETCC/VSELECT is not legal (#109570 ) If SETCC or VSELECT is not legal for vector, we should not expand it, instead we can split the vectors. So that, some simple scale instructions can be emitted instead of some pairs of comparation+selection.	2024-10-10 08:39:25 +08:00
Paul Walker	02dd6b1014	[LLVM][CodeGen] Add lowering for scalable vector bfloat operations. (#109803 ) Specifically: fabs, fadd, fceil, fdiv, ffloor, fma, fmax, fmaxnm, fmin, fminnm, fmul, fnearbyint, fneg, frint, fround, froundeven, fsub, fsqrt & ftrunc	2024-10-07 13:01:59 +01:00
Matt Arsenault	5883ad34d6	DAG: Handle vector legalization of minimumnum/maximumnum (#109779 ) Follow the same patterns as the other min/max variants.	2024-09-30 13:43:35 +04:00
Jonas Paulsson	14120227a3	Target ABI: improve call parameters extensions handling (#100757 ) For the purpose of verifying proper arguments extensions per the target's ABI, introduce the NoExt attribute that may be used by a target when neither sign- or zeroextension is required (e.g. with a struct in register). The purpose of doing so is to be able to verify that there is always one of these attributes present and by this detecting cases where sign/zero extension is actually missing. As a first step, this patch has the verification step done for the SystemZ backend only, but left off by default until all known issues have been addressed. Other targets/front-ends can now also add NoExt attribute where needed and do this check in the backend.	2024-09-19 16:59:31 +02:00
Pierre van Houtryve	758444ca3e	[AMDGPU] Promote uniform ops to I32 in DAGISel (#106383 ) Promote uniform binops, selects and setcc between 2 and 16 bits to 32 bits in DAGISel Solves #64591	2024-09-19 09:00:21 +02:00
David Green	960c975acd	[AArch64] Expand scmp/ucmp vector operations with sub (#108830 ) Unlike scalar, where AArch64 prefers expanding scmp/ucmp with select, under Neon we can use the arithmetic expansion to generate fewer instructions. Notably it also prevents the scalarization of vselect during vector-legalization.	2024-09-16 18:44:52 +01:00
Lawrence Benson	b74e779219	[x86] Add lowering for `@llvm.experimental.vector.compress` (#104904 ) This is a follow-up to #92289 that adds lowering of the new `@llvm.experimental.vector.compress` intrinsic on x86 with AVX512 instructions. This intrinsic maps directly to `vpcompress`.	2024-09-13 21:48:01 +02:00
YunQiang Su	5773adb0bf	SelectionDAG: Remove unneeded getSelectCC in expandFMINIMUMNUM_FMAXIMUMNUM (#107416 ) ISD::FCANONICALIZE is enough, which can process NaN or non-NaN correctly, thus getSelectCC is not needed here.	2024-09-11 09:53:04 +08:00
Simon Pilgrim	7e07c1df67	[DAG] expandAVG - consistently use getShiftAmountConstant for constant shift amounts. NFC	2024-09-10 09:25:58 +01:00
Matt Arsenault	77f1b481b8	DAG: Lower single infinity is.fpclass tests to fcmp (#100380 ) InstCombine also should have taken care of this, but this should be helpful when the fcmp based lowering strategy tries to combine multiple tests.	2024-09-06 09:15:18 +04:00
Matt Arsenault	fc3e6a8186	DAG: Handle lowering unordered compare with inf (#100378 ) Try to take advantage of the nan check behavior of fcmp. x86_64 looks better, x86_32 looks worse.	2024-09-05 19:54:32 +04:00
Dávid Ferenc Szabó	e9eaf19eb6	[CodeGen] Allow mixed scalar type constraints for inline asm (#65465 ) GCC supports code like "asm volatile ("" : "=r" (i) : "0" (f))" where i is integer type and f is floating point type. Currently this code produces an error with Clang. The change allows mixed scalar types between input and output constraints. Co-authored-by: Matt Arsenault <Matthew.Arsenault@amd.com>	2024-08-29 22:53:28 +04:00
David Majnemer	ea1f05e618	[SelectionDAG] Fix lowering of IEEE 754 2019 minimum/maximum We used integer comparisons instead of floating point comparisons resulting in very odd behavior.	2024-08-20 05:09:49 +00:00
Craig Topper	f7d94b783f	[SelectionDAG] Use getAllOnesConstant.	2024-08-17 17:57:05 -07:00
Craig Topper	067f2e9f18	[SelectionDAG] Use getSignedConstant/getAllOnesConstant.	2024-08-17 00:04:01 -07:00
Craig Topper	7afb51e035	[SelectionDAG][X86] Add SelectionDAG::getSignedConstant and use it in a few places. (#104555 ) PR #80309 proposes to have users of APInt's uint64_t constructor opt-in to implicit truncation. Currently, that patch requires SelectionDAG::getConstant to opt-in. This patch adds getSignedConstant so we can start fixing some of the cases that require implicit truncation.	2024-08-16 09:21:11 -07:00
Craig Topper	3dea42f3e5	[TargetLowering] Don't call SelectionDAG::getTargetLoweringInfo() from TargetLowering methods. NFC (#104197 ) If we are inside a TargetLowering method, `SelectionDAG::getTargetLoweringInfo()` should be the same as `this`.	2024-08-15 12:33:12 -07:00
YunQiang Su	fb9e685fc4	Intrinsic: introduce minimumnum and maximumnum for IR and SelectionDAG (#96649 ) C23 introduced new functions fminimum_num and fmaximum_num, and they follow the minimumNumber and maximumNumber of IEEE754-2019. Let's introduce new intrinsics to support them. This patch introduces support only support for scalar values. The support of vector (vp, vp.reduce, vector.reduce), experimental.constrained will be added in future patches. With this patch, MIPSr6 and LoongArch can work out of box with fcanonical and fmax/fmin. Aarch64/PowerPC64 can use the same login as MIPSr6 and LoongArch, while they have no fcanonical support yet. I will add it in future patches. The FMIN/FMAX of RISC-V instructions follows the minimumNumber/maximumNumber of IEEE754-2019. We can just add it in future patch. Background https://discourse.llvm.org/t/rfc-fix-llvm-min-f-and-llvm-max-f-intrinsics/79735 Currently we have fminnum/fmaxnum, which have different behavior on different platform for NUM vs sNaN: 1) Fallback to fmin(3)/fmax(3): return qNaN. 2) ARM64/ARM32+Neon: same as libc. 3) MIPSr6/LoongArch/RISC-V: return NUM. And the fix of fminnum/fmaxnum to follow minNUM/maxNUM of IEEE754-2008 will submit as separated patches.	2024-08-15 14:09:36 +08:00
Craig Topper	e687a9f2dd	[TargetLowering] Remove unncessary null check. NFC	2024-08-14 12:26:41 -07:00
Craig Topper	abc1acf8df	[TargetLowering][AMDGPU][ARM][RISCV][X86] Teach SimplifyDemandedBits to combine (srl (sra X, C1), ShAmt) -> sra(X, C1+ShAmt) (#101751 ) If the upper bits of the shr aren't demanded. This helps with cases where the outer srl was originally an sra and was converted to a srl by SimplifyDemandedBits before it had a chance to combine with the inner sra. This can occur when the inner sra was part of a sign_extend_inreg expansion. There are some regressions in ARM and Thumb2.	2024-08-14 08:44:57 -07:00
Craig Topper	51bad732dc	[SelectionDAG] Replace EVTToAPFloatSemantics with MVT/EVT::getFltSemantics. (#103001 )	2024-08-13 11:35:28 -07:00
Pierre van Houtryve	7389545d0d	Reapply "[AMDGPU] Always lower s/udiv64 by constant to MUL" (#101942 ) Reland #100723, fixing the ARM issue at the cost of a small loss of optimization in `test/CodeGen/AMDGPU/fshr.ll` Solves #100383	2024-08-12 09:00:22 +02:00
Craig Topper	0c783be985	[TargetLowering] Use APInt::isSubsetOf to simplify an expression. NFC	2024-08-09 22:09:40 -07:00
Bjorn Pettersson	bbefd5713f	[TargetLowering] Handle vector types in expandFixedPointMul (#102635 ) In TargetLowering::expandFixedPointMul when expanding fixed point multiplication, and when using a widened MUL as strategy for the lowering, there was a bug resulting in assertion failures like this: Assertion `VT.isVector() == N1.getValueType().isVector() && "SIGN_EXTEND result type type should be vector iff the operand " "type is vector!"' failed. Problem was that we did not consider that VT could be a vector type when setting up the WideVT. This patch should fix that bug.	2024-08-10 00:25:57 +02:00
Kazu Hirata	f4fb735840	[llvm] Construct SmallVector<SDValue> with ArrayRef (NFC) (#102578 )	2024-08-09 09:15:42 -07:00
Simon Pilgrim	13d04fa560	[DAG] Add legalization handling for ABDS/ABDU (#92576 ) (REAPPLIED) Always match ABD patterns pre-legalization, and use TargetLowering::expandABD to expand again during legalization. abdu(lhs, rhs) -> sub(xor(sub(lhs, rhs), usub_overflow(lhs, rhs)), usub_overflow(lhs, rhs)) Alive2: https://alive2.llvm.org/ce/z/dVdMyv REAPPLIED: Fix regression issue with "abs(ext(x) - ext(y)) -> zext(abd(x, y))" fold failing after type legalization	2024-08-08 11:39:05 +01:00

1 2 3 4 5 ...

1548 Commits