llvm-project

Author	SHA1	Message	Date
Craig Topper	fe012bd52d	[SelectionDAG] Use Register around RegisterSDNode related functions. NFC RegisterSDNode itself already stored a Register.	2024-09-17 23:26:56 -07:00
ErikHogeman	e16ec9b45e	[SelectionDAG] Do not build illegal nodes with users (#108573 ) When we build a node with illegal type which has a user, it's possible that it can end up being processed by the DAG combiner later before it's removed, which can trigger an assert expecting the types to be legalized already.	2024-09-16 10:02:42 +01:00
anjenner	4af249fe6e	Add usub_cond and usub_sat operations to atomicrmw (#105568 ) These both perform conditional subtraction, returning the minuend and zero respectively, if the difference is negative.	2024-09-06 16:19:20 +01:00
Philip Reames	e1bde1c5b2	[SDAG] Fix a typo in comment	2024-09-03 08:04:57 -07:00
Sam Tebbs	44cfbef1b3	[AArch64] Lower partial add reduction to udot or svdot (#101010 ) This patch introduces lowering of the partial add reduction intrinsic to a udot or svdot for AArch64. This also involves adding a `shouldExpandPartialReductionIntrinsic` target hook, which AArch64 will return false from in the cases that it can be lowered.	2024-09-02 14:06:14 +01:00
Philip Reames	924907bc6a	[DAG] Prefer 0.0 over -0.0 as neutral value for FADD w/NoSignedZero (#106616 ) When getting a neutral value, we can prefer using a positive zero over a negative zero if nsz is set on the FADD (or reduction). A positive zero should be cheaper to materialize on basically all targets. Arguably, we should be doing this kind of canonicalization in DAGCombine, but we don't do that for any of the other reduction variants, so this seems like the path of least resistance. This does mean that we can only do this for "fast" reductions. Just nsz isn't enough, as that goes through the SEQ_FADD path where the IR level start value isn't folded away. If folks think this is too RISCV specific, let me know. There's a trivial RISCV specific implementation. I went with the generic one as I thought this might benefit other targets.	2024-08-30 07:56:14 -07:00
Philip Reames	74b4ec17e2	[VP] Remove VP_PROPERTY_REDUCTION and VP_PROPERTY_CMP [nfc] (#105551 ) These lists are quite static and several of the parameters are actually constant across all users. Heavy use of macros is undesirable, and not idiomatic in LLVM, so let's just use the naive switch cases. I'll probably continue with removing the other property macros. These two just happened to be the two I actually had to figure out for an unrelated change.	2024-08-29 09:57:58 -07:00
Sergei Barannikov	4d7a0abae8	[DataLayout] Change return type of `getStackAlignment` to `MaybeAlign` (#105478 ) Currently, `getStackAlignment` asserts if the stack alignment wasn't specified. This makes it inconvenient to use and complicates testing. This change also makes `exceedsNaturalStackAlignment` method redundant.	2024-08-27 22:59:33 +03:00
Sumanth Gundapaneni	e78156a0e2	Scalarize the vector inputs to llvm.lround intrinsic by default. (#101054 ) Verifier is updated in a different patch to allow vector types for the llvm.lround and llvm.llround intrinsics.	2024-08-21 12:13:56 -05:00
Philip Reames	91b423d955	[DAG][RISCV] Use vp.<binop> when widening illegal types for binops which can trap (#105214 ) This allows the use of a single wider operation with a restricted EVL instead of having to split and cover via decreasing powers-of-two sizes. On RISCV, this avoids the need for a bunch of vslidedown and vslideup instructions to extract subvectors, and VL toggles to switch between the various widths. Note there is a potential downside of using vp nodes; we lose any generic DAG combines which might have applied to the split form.	2024-08-20 13:51:10 -07:00
Tianqing Wang	7f87b5bf0e	[SelectionDAG][X86] Preserve unpredictable metadata for conditional branches in SelectionDAG, as well as JCCs generated by X86 backend. (#102101 ) This builds on 09515f2c2, which preserves unpredictable metadata in CodeGen for `select`. This patch does it for conditional branches.	2024-08-19 11:04:48 +08:00
Craig Topper	067f2e9f18	[SelectionDAG] Use getSignedConstant/getAllOnesConstant.	2024-08-17 00:04:01 -07:00
Craig Topper	7afb51e035	[SelectionDAG][X86] Add SelectionDAG::getSignedConstant and use it in a few places. (#104555 ) PR #80309 proposes to have users of APInt's uint64_t constructor opt-in to implicit truncation. Currently, that patch requires SelectionDAG::getConstant to opt-in. This patch adds getSignedConstant so we can start fixing some of the cases that require implicit truncation.	2024-08-16 09:21:11 -07:00
YunQiang Su	fb9e685fc4	Intrinsic: introduce minimumnum and maximumnum for IR and SelectionDAG (#96649 ) C23 introduced new functions fminimum_num and fmaximum_num, and they follow the minimumNumber and maximumNumber of IEEE754-2019. Let's introduce new intrinsics to support them. This patch introduces support only for scalar values. The support of vector (vp, vp.reduce, vector.reduce), experimental.constrained will be added in future patches. With this patch, MIPSr6 and LoongArch can work out of the box with fcanonical and fmax/fmin. Aarch64/PowerPC64 can use the same logic as MIPSr6 and LoongArch, while they have no fcanonical support yet. I will add it in future patches. The FMIN/FMAX of RISC-V instructions follows the minimumNumber/maximumNumber of IEEE754-2019. We can just add it in a future patch. Background https://discourse.llvm.org/t/rfc-fix-llvm-min-f-and-llvm-max-f-intrinsics/79735 Currently we have fminnum/fmaxnum, which have different behavior on different platforms for NUM vs sNaN: 1) Fallback to fmin(3)/fmax(3): return qNaN. 2) ARM64/ARM32+Neon: same as libc. 3) MIPSr6/LoongArch/RISC-V: return NUM. And the fix of fminnum/fmaxnum to follow minNUM/maxNUM of IEEE754-2008 will be submitted as separate patches.	2024-08-15 14:09:36 +08:00
Craig Topper	51bad732dc	[SelectionDAG] Replace EVTToAPFloatSemantics with MVT/EVT::getFltSemantics. (#103001 )	2024-08-13 11:35:28 -07:00
Craig Topper	f58f92c213	[SelectionDAG] Move SelectionDAG::getAllOnesConstant out of line. NFC (#102995 ) This function has to get the scalar size and create an APInt. I don't think it belongs inline.	2024-08-13 10:15:16 -07:00
Paul Walker	4197386dbd	[LLVM][SelectionDAG] Remove scalable vector restriction from poison analysis. (#102504 ) The following functions have an early exit for scalable vectors: SelectionDAG::canCreateUndefOrPoison SelectionDAG::isGuaranteedNotToBeUndefOrPoison The implementations of these don't look to be sensitive to the vector type other than some uses of demanded elts analysis that doesn't fully support scalable types. That said the initial calculation demands all elements and so I've followed the same scheme as used by TargetLowering::SimplifyDemandedBits.	2024-08-13 12:53:20 +01:00
Simon Pilgrim	13d04fa560	[DAG] Add legalization handling for ABDS/ABDU (#92576 ) (REAPPLIED) Always match ABD patterns pre-legalization, and use TargetLowering::expandABD to expand again during legalization. abdu(lhs, rhs) -> sub(xor(sub(lhs, rhs), usub_overflow(lhs, rhs)), usub_overflow(lhs, rhs)) Alive2: https://alive2.llvm.org/ce/z/dVdMyv REAPPLIED: Fix regression issue with "abs(ext(x) - ext(y)) -> zext(abd(x, y))" fold failing after type legalization	2024-08-08 11:39:05 +01:00
Simon Pilgrim	e4e96b3e26	Revert b1234ddbe2652aa7948242a57107ca7ab12fd2f8. "[DAG] Add legalization handling for ABDS/ABDU (#92576 )" Reverting #92576 while we identify a reported regression	2024-08-07 17:11:25 +01:00
Simon Pilgrim	b1234ddbe2	[DAG] Add legalization handling for ABDS/ABDU (#92576 ) Always match ABD patterns pre-legalization, and use TargetLowering::expandABD to expand again during legalization. abdu(lhs, rhs) -> sub(xor(sub(lhs, rhs), usub_overflow(lhs, rhs)), usub_overflow(lhs, rhs)) Alive2: https://alive2.llvm.org/ce/z/dVdMyv	2024-08-06 10:18:06 +01:00
Simon Pilgrim	1b85935453	Fix MSVC "not all control paths return a value" warning. NFC.	2024-08-06 09:38:09 +01:00
Luke Lau	33fc322696	[SelectionDAG] Simplify vselect true, T, F -> T (#100992 ) This addresses a TODO where we can fold a vselect to its true operand if the boolean is known to be all trues, by factoring out the logic from extractBooleanFlip which checks TLI.getBooleanContents.	2024-08-06 10:49:20 +08:00
Kazu Hirata	8d1b17b662	[CodeGen] Construct SmallVector with ArrayRef (NFC) (#101841 )	2024-08-04 00:41:29 -07:00
Matt Arsenault	9843843c88	SelectionDAG: Do not propagate divergence through copy glue (#101210 ) This fixes DAG divergence mishandling inline asm. This was considering the glue nodes for divergence, when the divergence should only come from the individual CopyFromRegs that are glued. As a result, having any VGPR CopyFromRegs would taint any uniform SGPR copies as divergent, resulting in SGPR copies to VGPR virtual registers later.	2024-07-31 00:04:58 +04:00
Manish Kausik H	f6d1d6fe7b	[SelectionDAG] Use unaligned store to legalize `EXTRACT_VECTOR_ELT` type when Stack is non-realignable (#98176 ) This patch ports the commit a6614ec5b7c1dbfc4b847884c5de780cf75e8e9c to SelectionDAG TypeLegalization. Fixes #98044 Co-authored-by: Manish Kausik H <hmamishkausik@gmail.com>	2024-07-29 15:23:36 +08:00
Vitaly Buka	455990d18f	Reland "SelectionDAG: Avoid using MachineFunction::getMMI" (#99779 ) Reverts llvm/llvm-project#99777 Co-authored-by: Matt Arsenault <Matthew.Arsenault@amd.com>	2024-07-24 10:38:53 +04:00
Simon Pilgrim	5bd38a98d7	[DAG] ComputeNumSignBits - subo_carry(x,x,c) -> bitwidth 'allsignbits' (#99935 ) Handle cases where the subo_carry is subtracting the same operand (=zero) - so only the subtraction of the 0/1 carry bit is affecting the result, giving a 0/-1 allsignbits value. Noticed while improving ABDS/ABDU expansion.	2024-07-23 11:49:12 +01:00
Vitaly Buka	98c0e55d9d	Revert "SelectionDAG: Avoid using MachineFunction::getMMI" (#99777 ) Reverts llvm/llvm-project#99696 https://lab.llvm.org/buildbot/#/builders/164/builds/1262	2024-07-20 12:20:50 -07:00
Joseph Huber	615b7eeaa9	Reapply "[LLVM][LTO] Factor out RTLib calls and allow them to be dropped (#98512 )" This reverts commit 740161a9b98c9920dedf1852b5f1c94d0a683af5. I moved the `ISD` dependencies into the CodeGen portion of the handling, it's a little awkward but it's the easiest solution I can think of for now.	2024-07-20 09:29:31 -05:00
Matt Arsenault	c2019a37bd	SelectionDAG: Avoid using MachineFunction::getMMI (#99696 )	2024-07-20 10:53:41 +04:00
NAKAMURA Takumi	740161a9b9	Revert "[LLVM][LTO] Factor out RTLib calls and allow them to be dropped (#98512 )" This reverts commit c05126bdfc3b02daa37d11056fa43db1a6cdef69. (llvmorg-19-init-17714-gc05126bdfc3b) See #99610	2024-07-20 12:36:57 +09:00
Jie Fu	544c390aac	[CodeGen] Fix -Wunused-variable in SelectionDAG.cpp (NFC) /llvm-project/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp:7560:9: error: unused variable 'VecVT' [-Werror,-Wunused-variable] EVT VecVT = N1.getValueType(); ^ 1 error generated.	2024-07-17 21:20:55 +08:00
Lawrence Benson	177ce1900f	[LLVM] Add `llvm.experimental.vector.compress` intrinsic (#92289 ) This PR adds a new vector intrinsic `@llvm.experimental.vector.compress` to "compress" data within a vector based on a selection mask, i.e., it moves all selected values (i.e., where `mask[i] == 1`) to consecutive lanes in the result vector. A `passthru` vector can be provided, from which remaining lanes are filled. The main reason for this is that the existing `@llvm.masked.compressstore` has very strong constraints in that it can only write values that were selected, resulting in guard branches for all targets except AVX-512 (and even there the AMD implementation is _very_ slow). More instruction sets support "compress" logic, but only within registers. So to store the values, an additional store is needed. But this combination is likely significantly faster on many targets as it avoids branches. In follow up PRs, my plan is to add target-specific lowerings for x86, SVE, and possibly RISCV. I also want to combine this with a store instruction, as this is probably a common case and we can avoid some memory writes in that case. See [discussion in forum](https://discourse.llvm.org/t/new-intrinsic-for-masked-vector-compress-without-store/78663) for initial discussion on the design.	2024-07-17 14:24:24 +02:00
Amara Emerson	f270a4dd66	[AArch64] Don't tail call memset if it would convert to a bzero. (#98969 ) Well, not quite that simple. We can tc memset since it returns the first argument but bzero doesn't do that and therefore we can end up miscompiling. This patch also refactors the logic out of isInTailCallPosition() into the callers. As a result memcpy and memmove are also modified to do the same thing for consistency. rdar://131419786	2024-07-17 01:31:52 -07:00
Joseph Huber	c05126bdfc	[LLVM][LTO] Factor out RTLib calls and allow them to be dropped (#98512 ) Summary: The LTO pass and LLD linker have logic in them that forces extraction and prevents internalization of needed runtime calls. However, these currently take all RTLibcalls into account, even if the target does not support them. The target opts-out of a libcall if it sets its name to nullptr. This patch pulls this logic out into a class in the header so that LTO / lld can use it to determine if a symbol actually needs to be kept. This is important for targets like AMDGPU that want to be able to use `lld` to perform the final link step, but do not want the overhead of uncalled functions. (This adds like a second to the link time trivially)	2024-07-16 06:22:09 -05:00
Volodymyr Vasylkun	afb584a562	[SelectionDAG] Ensure that we don't create `UCMP`/`SCMP` nodes with operands being scalars and result being a 1-element vector during scalarization (#98687 ) This patch fixes a problem that existed before where in some situations a `UCMP`/`SCMP` node which operated on 1-element vectors had a legal result type (i.e. `v1i64` on AArch64), but illegal operands (i.e. `v1i65`). This meant that operand scalarization was performed on the node and the operands were changed to a legal scalar type, but the result wasn't. This then led to `UCMP`/`SCMP` nodes with different vector-ness of operands and result appearing in the SDAG. This patch addresses this issue by fully scalarizing the `UCMP`/`SCMP` node and then turning its result back into a 1-element vector using a `SCALAR_TO_VECTOR` node. It also adds several assertions to `SelectionDAG::getNode()` to avoid this or a similar issue arising in the future. I wasn't sure if these two changes are unrelated enough to warrant two small separate PRs, but I'm happy to split this PR into two if that's deemed more appropriate.	2024-07-12 21:41:49 +01:00
Farzon Lotfi	0b58f34c98	[X86][CodeGen] Add base trig intrinsic lowerings (#96222 ) This change is an implementation of https://github.com/llvm/llvm-project/issues/87367's investigation on supporting IEEE math operations as intrinsics, which was discussed in this RFC: https://discourse.llvm.org/t/rfc-all-the-math-intrinsics/78294 This change adds constraint intrinsics and some lowering cases for `acos`, `asin`, `atan`, `cosh`, `sinh`, and `tanh`. The only x86 specific change was for f80. https://github.com/llvm/llvm-project/issues/70079 https://github.com/llvm/llvm-project/issues/70080 https://github.com/llvm/llvm-project/issues/70081 https://github.com/llvm/llvm-project/issues/70083 https://github.com/llvm/llvm-project/issues/70084 https://github.com/llvm/llvm-project/issues/95966 The x86 lowering is going to be done in three PRs, with this being the first. A second PR will be put up for Loop Vectorizing and then SLPVectorizer. The constraint intrinsics are also going to be in multiple parts, but just 2. This part covers just the llvm specific changes; part 2 will cover clang specific changes and legalization for backends that have special legalization requirements like aarch64 and wasm.	2024-07-11 15:58:43 -04:00
Luke Lau	baf22a527c	[SelectionDAG] Handle vscale range wrapping in isKnownNeverZero As pointed out by @preames, ConstantRange can wrap so it's possible for zero to be in a range without zero being the minimum. This fixes this by checking contains instead.	2024-07-09 23:05:22 +08:00
Bjorn Pettersson	c2fbc701aa	[SelectionDAG] Let ComputeKnownSignBits handle (shl (ext X), C) (#97695 ) Add simple support for looking through ZEXT/ANYEXT/SEXT when doing ComputeKnownSignBits for SHL. This is valid for the case when all extended bits are shifted out, because then the number of sign bits can be found by analysing the EXT operand. A future improvement could be to pass along the "shifted left by" information in the recursive calls to ComputeKnownSignBits. Allowing us to handle this more generically.	2024-07-05 22:37:26 +02:00
Luke Lau	e4b28420f6	[SelectionDAG] Handle VSCALE in isKnownNeverZero (#97789 ) VSCALE is by definition greater than zero, but this checks it via getVScaleRange anyway. The motivation for this is to be able to check if the EVL for a VP strided load is non-zero in #97394. I added the tests to the RISC-V backend since the existing X86 known-never-zero.ll test crashed when trying to lower vscale for the +sse2 RUN line.	2024-07-05 16:11:06 +08:00
Craig Topper	8419da8bd4	[SelectionDAG] Remove LegalTypes argument from getShiftAmountConstant. (#97653 ) #97645 proposed to remove LegalTypes from getShiftAmountTy. This patch removes it from getShiftAmountConstant which is one of the callers of getShiftAmountTy.	2024-07-04 18:33:25 -07:00
Craig Topper	3141c11fe8	[SelectionDAG] Remove LegalTypes argument from getShiftAmountTy. NFC (#97757 ) This argument is no longer used inside the function. Remove it from the interface.	2024-07-04 15:24:54 -07:00
Yingwei Zheng	d5c9ffd545	[SDAG] Intersect poison-generating flags after CSE (#97434 ) This patch fixes a miscompilation when `N` gets CSEed to `Existing`: ``` Existing: t5: i32 = sub nuw Constant:i32<0>, t3 N: t30: i32 = sub Constant:i32<0>, t3 ``` Fixes https://github.com/llvm/llvm-project/issues/96366.	2024-07-03 20:32:46 +08:00
Youngsuk Kim	a95c85fba5	[llvm][CodeGen] Avoid 'raw_string_ostream::str' (NFC) (#97318 ) Since `raw_string_ostream` doesn't own the string buffer, it is desirable (in terms of memory safety) for users to directly reference the string buffer rather than use `raw_string_ostream::str()`. Work towards TODO comment to remove `raw_string_ostream::str()`.	2024-07-01 21:52:37 -04:00
Shilei Tian	9a4f57ec1e	[SelectionDAG] Use `EVT::getIntegerVT` in `getBitcastedAnyExtOrTrunc` (#96658 ) `SelectionDAG::getBitcastedAnyExtOrTrunc` assumes that there is always a valid integer type corresponding to another type, which is not always true when it comes to vector type. For example, `<3 x i8>` doesn't have a corresponding integer type. Fix SWDEV-464698.	2024-07-01 15:10:57 -04:00
Nikita Popov	f2f18459d4	Revert "Intrinsic: introduce minimumnum and maximumnum (#93841 )" As far as I can tell, this pull request was not approved, and did not go through an RFC on discourse. This reverts commit 89881480030f48f83af668175b70a9798edca2fb. This reverts commit 225d8fc8eb24fb797154c1ef6dcbe5ba033142da.	2024-06-21 08:34:04 +02:00
YunQiang Su	8988148003	Intrinsic: introduce minimumnum and maximumnum (#93841 ) Currently, on different platforms, the behavior of llvm.minnum is different if one operand is sNaN: When we compare sNaN vs NUM: ARM/AArch64/PowerPC: follow the IEEE754-2008's minNUM: return qNaN. RISC-V/Hexagon follow the IEEE754-2019's minimumNumber: return NUM. X86: Returns NUM but not same with IEEE754-2019's minimumNumber as +0.0 is not always greater than -0.0. MIPS/LoongArch/Generic: return NUM. LIBCALL: returns qNaN. So, let's introduce llvm.minimumnum/llvm.maximumnum, which always follow IEEE754-2019's minimumNumber/maximumNumber. Half-fix: #93033	2024-06-21 11:53:08 +08:00
Simon Pilgrim	1216e7045f	[DAG] getNode - add value type assertions for AVG nodes.	2024-06-12 17:00:33 +01:00
Simon Pilgrim	346f16d504	[DAG] Move isNullConstantOrUndef helper to SelectionDAGNodes.h to allow other future uses. NFC.	2024-06-12 16:32:36 +01:00
Simon Pilgrim	53fecef1ec	[DAG] FoldConstantArithmetic - allow binop folding to work with differing bitcasted constants (#94863 ) We currently only constant fold binop(bitcast(c1),bitcast(c2)) if c1 and c2 are both bitcasted and from the same type. This patch relaxes this assumption to allow the constant build vector to originate from different types (and allow cases where only one operand was bitcasted). We still ensure we bitcast back to one of the original types if both operands were bitcasted (we assume that if we have a non-bitcasted constant then it's legal to keep using that type).	2024-06-09 11:30:05 +01:00
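
Note on the ABDS/ABDU commits (b1234ddbe2, 13d04fa560): the expansion quoted in those messages, abdu(lhs, rhs) -> sub(xor(sub(lhs, rhs), usub_overflow(lhs, rhs)), usub_overflow(lhs, rhs)), can be checked on scalars with a minimal sketch. This is not the code in TargetLowering::expandABD; the function names below are hypothetical, and the sketch assumes the overflow result is used as an all-ones/all-zeros mask on 32-bit values.

```cpp
#include <cstdint>

// Hypothetical scalar model of the quoted expansion (not LLVM code).
// Assumption: usub_overflow is treated as a mask that is all ones when
// lhs - rhs wraps (lhs < rhs) and zero otherwise.
constexpr uint32_t abdu_expanded(uint32_t lhs, uint32_t rhs) {
  uint32_t diff = lhs - rhs;                 // sub(lhs, rhs), wraps modulo 2^32
  uint32_t mask = (lhs < rhs) ? ~0u : 0u;    // overflow flag widened to a mask
  return (diff ^ mask) - mask;               // negates diff only when it wrapped
}

// Straightforward reference: unsigned absolute difference.
constexpr uint32_t abdu_reference(uint32_t lhs, uint32_t rhs) {
  return lhs > rhs ? lhs - rhs : rhs - lhs;
}
```

When the subtraction wraps, the mask is -1, so (diff ^ -1) - (-1) equals ~diff + 1, the two's-complement negation of diff; otherwise the mask is 0 and diff passes through unchanged.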

2607 Commits
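
Note on commit 4af249fe6e: the message describes usub_cond and usub_sat as conditional subtractions that yield the minuend and zero respectively when the difference would be negative. A minimal sketch of the value each operation computes is given below. The function names are hypothetical, and the sketch models only the arithmetic on plain 32-bit integers; it does not model atomicity or the fact that atomicrmw returns the old memory value.

```cpp
#include <cstdint>

// Hypothetical model of the value computed by usub_cond (not LLVM code):
// subtract only if the result would not go below zero, else keep the minuend.
constexpr uint32_t usub_cond_value(uint32_t minuend, uint32_t subtrahend) {
  return minuend >= subtrahend ? minuend - subtrahend : minuend;
}

// Hypothetical model of the value computed by usub_sat (not LLVM code):
// subtract, clamping at zero instead of wrapping.
constexpr uint32_t usub_sat_value(uint32_t minuend, uint32_t subtrahend) {
  return minuend >= subtrahend ? minuend - subtrahend : 0u;
}
```

For example, with a minuend of 3 and a subtrahend of 5, the first function leaves 3 unchanged while the second clamps to 0; with 5 and 3, both give 2.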