llvm-project

Author	SHA1	Message	Date
Paul Walker	68732ce8e0	[LLVM][CodeGen][SVE] Add isel for bfloat unordered reductions. (#143540 ) The omissions are VECREDUCE_SEQ_* and MUL. The former goes down a different code path and the latter is unsupported across all element types.	2025-06-20 11:46:25 +01:00
Philip Reames	939666380f	[SDAG] Add partial_reduce_sumla node (#141267 ) We have recently added the partial_reduce_smla and partial_reduce_umla nodes to represent Acc += ext(a) * ext(b) where the two extends have to have the same source type, and have the same extend kind. For riscv64 w/zvqdotq, we have the vqdot and vqdotu instructions which correspond to the existing nodes, but we also have vqdotsu which represents the case where the two extends are sign and zero respectively (i.e. not the same type of extend). This patch adds a partial_reduce_sumla node which has sign extension for A, and zero extension for B. The addition is somewhat mechanical.	2025-06-09 07:17:45 -07:00
Philip Reames	1651aa2943	[SDAG] Split the partial reduce legalize table by opcode [nfc] (#141970 ) On its own, this change should be non-functional. This is a preparatory change for https://github.com/llvm/llvm-project/pull/141267 which adds a new form of PARTIAL_REDUCE_*MLA. As noted in the discussion on that review, AArch64 needs a different set of legal and custom types for the PARTIAL_REDUCE_SUMLA variant than the currently existing PARTIAL_REDUCE_UMLA/SMLA.	2025-05-29 14:05:31 -07:00
Philip Reames	cf2f558501	[DAG/RISCV] Continue migrating to getInsertSubvector and getExtractSubvector Follow up to 6e654caab, use the new routines in more places. Note that I've excluded from this patch any case which uses a getConstant index instead of a getVectorIdxConstant index just to minimize room for error. I'll get those in a separate follow up.	2025-05-08 09:40:45 -07:00
Nicholas Guy	a1f369e630	[AArch64][SVE] Add dot product lowering for PARTIAL_REDUCE_MLA node (#130933 ) Add lowering in tablegen for PARTIAL_REDUCE_U/SMLA ISD nodes. Only happens when the combine has been performed on the ISD node. Also adds in check to only do the DAG combine when the node can then eventually be lowered, so changes neon tests too. --------- Co-authored-by: James Chesterman <james.chesterman@arm.com>	2025-04-23 13:19:41 +01:00
Jim Lin	94f6b6d538	[SelectionDAG][RISCV] Promote VECREDUCE_{FMAX,FMIN,FMAXIMUM,FMINIMUM} (#128800 ) This patch also adds the tests for VP_REDUCE_{FMAX,FMIN,FMAXIMUM,FMINIMUM}, which have been supported for a while.	2025-02-28 23:13:30 +08:00
James Chesterman	d4a0848dc6	[SelectionDAG] Add PARTIAL_REDUCE_U/SMLA ISD Nodes (#125207 ) Add signed and unsigned PARTIAL_REDUCE_MLA ISD nodes. Add command line argument (aarch64-enable-partial-reduce-nodes) that indicates whether the intrinsic experimental_vector_partial_reduce_add will be transformed into the new ISD node. Lowering with the new ISD nodes will, for now, always be done as an expand.	2025-02-18 09:08:47 +00:00
Benjamin Maxwell	19556eccf6	[RTLIB] Rename getFSINCOS() to getSINCOS (NFC) (#126705 ) This makes the name more consistent with the other helpers.	2025-02-11 11:51:35 +00:00
Benjamin Maxwell	701223ac20	[IR] Add llvm.sincospi intrinsic (#125873 ) This adds the `llvm.sincospi` intrinsic, legalization, and lowering (mostly reusing the lowering for sincos and frexp). The `llvm.sincospi` intrinsic takes a floating-point value and returns both the sine and cosine of the value multiplied by pi. It computes the result more accurately than the naive approach of doing the multiplication ahead of time, especially for large input values. ``` declare { float, float } @llvm.sincospi.f32(float %Val) declare { double, double } @llvm.sincospi.f64(double %Val) declare { x86_fp80, x86_fp80 } @llvm.sincospi.f80(x86_fp80 %Val) declare { fp128, fp128 } @llvm.sincospi.f128(fp128 %Val) declare { ppc_fp128, ppc_fp128 } @llvm.sincospi.ppcf128(ppc_fp128 %Val) declare { <4 x float>, <4 x float> } @llvm.sincospi.v4f32(<4 x float> %Val) ``` Currently, the default lowering of this intrinsic relies on the `sincospi[f\|l]` functions being available in the target's runtime (e.g. libc).	2025-02-11 09:01:30 +00:00
Benjamin Maxwell	4bf97aa818	[IR] Add `llvm.modf` intrinsic (#121948 ) This adds the `llvm.modf` intrinsic, legalization, and lowering (mostly reusing the lowering for sincos and frexp). The `llvm.modf` intrinsic takes a floating-point value and returns both the integral and fractional parts (as a struct). ``` declare { float, float } @llvm.modf.f32(float %Val) declare { double, double } @llvm.modf.f64(double %Val) declare { x86_fp80, x86_fp80 } @llvm.modf.f80(x86_fp80 %Val) declare { fp128, fp128 } @llvm.modf.f128(fp128 %Val) declare { ppc_fp128, ppc_fp128 } @llvm.modf.ppcf128(ppc_fp128 %Val) declare { <4 x float>, <4 x float> } @llvm.modf.v4f32(<4 x float> %Val) ``` This corresponds to the libm `modf` function but returns multiple values in a struct (rather than take output pointers), which makes it easier to vectorize.	2025-02-07 09:25:13 +00:00
Graham Hunter	d9f165ddea	[SDAG] Add an ISD node to help lower vector.extract.last.active (#118810 ) Based on feedback from the clastb codegen PR, I'm refactoring basic codegen for the vector.extract.last.active intrinsic to lower to an ISD node in SelectionDAGBuilder then expand in LegalizeVectorOps, instead of doing everything in the builder. The new ISD node (vector_find_last_active) only covers finding the index of the last active element of the mask, and extracting the element + handling passthru is left to existing ISD nodes.	2025-01-20 12:57:05 +00:00
Craig Topper	8ce81f17a1	[LegalizeVectorOps][RISCV] Use VP_FP_EXTEND/ROUND when promoting VP_FP* operations. (#122784 ) This preserves the original VL leading to more reuse of VL for vsetvli. The VLOptimizer can also clean up a lot of this, but I'm not sure if it gets all of it. There are some regressions in here from propagating the mask too, but I'm not sure if that's a concern.	2025-01-13 15:18:41 -08:00
abhishek-kaushik22	366e62a0cb	[X86] Combine `uitofp <v x i32> to <v x half>` (#121809 ) Closes #121793	2025-01-08 16:49:29 +08:00
Simon Pilgrim	923675193b	[DAG] VectorLegalizer::ExpandUINT_TO_FLOAT- pull out repeated getValueType calls. NFC.	2025-01-06 18:49:51 +00:00
Phoebe Wang	1547382033	[X86] Support lowering of FMINIMUMNUM/FMAXIMUMNUM (#121464 )	2025-01-06 21:28:58 +08:00
Craig Topper	e32afded92	[LegalizeVectorOps] Use getBoolConstant instead of getAllOnesConstant in VectorLegalizer::UnrollVSETCC. (#121526 ) This code should follow the target preference for boolean contents of a vector type. We shouldn't assume that true is negative one.	2025-01-03 10:46:37 -08:00
Benjamin Maxwell	ea6b8fa4b9	[SDAG] Merge multiple-result libcall expansion into DAG.expandMultipleResultFPLibCall() (#114792 ) This merges the logic for expanding both FFREXP and FSINCOS into one method `DAG.expandMultipleResultFPLibCall()`. This reduces duplication and also allows FFREXP to benefit from the stack slot elimination implemented for FSINCOS. This method will also be used in future to implement more multiple-result intrinsics (such as modf and sincospi).	2024-11-06 11:06:06 +00:00
Benjamin Maxwell	89a8c71db6	[SDAG] Support expanding `FSINCOS` to vector library calls (#114039 ) This shares most of its code with the scalar sincos expansion. It allows expanding vector FSINCOS nodes to a library call from the specified `-vector-library`. The upside of this is it will mean the vectorizer only needs to handle the sincos intrinsic, which has no memory effects, and this can handle lowering the intrinsic to a call that takes output pointers.	2024-10-31 12:41:43 +00:00
Yingwei Zheng	cf9d1c1486	[SDAG] Simplify `SDNodeFlags` with bitwise logic (#114061 ) This patch allows using enumeration values directly and simplifies the implementation with bitwise logic. It addresses the comment in https://github.com/llvm/llvm-project/pull/113808#discussion_r1819923625.	2024-10-31 08:10:07 +08:00
Benjamin Maxwell	c3260c65e8	[IR] Add `llvm.sincos` intrinsic (#109825 ) This adds the `llvm.sincos` intrinsic, legalization, and lowering. The `llvm.sincos` intrinsic takes a floating-point value and returns both the sine and cosine (as a struct). ``` declare { float, float } @llvm.sincos.f32(float %Val) declare { double, double } @llvm.sincos.f64(double %Val) declare { x86_fp80, x86_fp80 } @llvm.sincos.f80(x86_fp80 %Val) declare { fp128, fp128 } @llvm.sincos.f128(fp128 %Val) declare { ppc_fp128, ppc_fp128 } @llvm.sincos.ppcf128(ppc_fp128 %Val) declare { <4 x float>, <4 x float> } @llvm.sincos.v4f32(<4 x float> %Val) ``` The lowering is built on top of the existing FSINCOS ISD node, with additional type legalization to allow for f16, f128, and vector values.	2024-10-29 10:52:20 +00:00
Tex Riddell	875afa939d	[X86][CodeGen] Add base atan2 intrinsic lowering (p4) (#110760 ) This change is part of this proposal: https://discourse.llvm.org/t/rfc-all-the-math-intrinsics/78294 Based on example PR #96222 and fix PR #101268, with some differences due to 2-arg intrinsic and intermediate refactor (RuntimeLibCalls.cpp). - Add llvm.experimental.constrained.atan2 - Intrinsics.td, ConstrainedOps.def, LangRef.rst - Add to ISDOpcodes.h and TargetSelectionDAG.td, connect to intrinsic in BasicTTIImpl.h, and LibFunc_ in SelectionDAGBuilder.cpp - Update LegalizeDAG.cpp, LegalizeFloatTypes.cpp, LegalizeVectorOps.cpp, and LegalizeVectorTypes.cpp - Update isKnownNeverNaN in SelectionDAG.cpp - Update SelectionDAGDumper.cpp - Update libcalls - RuntimeLibcalls.def, RuntimeLibcalls.cpp - TargetLoweringBase.cpp - Expand for vectors, promote f16 - X86ISelLowering.cpp - Expand f80, promote f32 to f64 for MSVC Part 4 for Implement the atan2 HLSL Function #70096.	2024-10-16 11:43:17 -07:00
Paul Walker	02dd6b1014	[LLVM][CodeGen] Add lowering for scalable vector bfloat operations. (#109803 ) Specifically: fabs, fadd, fceil, fdiv, ffloor, fma, fmax, fmaxnm, fmin, fminnm, fmul, fnearbyint, fneg, frint, fround, froundeven, fsub, fsqrt & ftrunc	2024-10-07 13:01:59 +01:00
Craig Topper	92a8b81bdf	[LegalizeVectorOps] Enable ExpandFABS/COPYSIGN to use integer ops for fixed vectors in some cases. (#109232 ) Copy the same FSUB check from ExpandFNEG to avoid breaking AArch64 and ARM.	2024-09-30 11:44:49 -07:00
Craig Topper	d21a43579e	[LegalizeVectorOps][RISCV] Don't scalarize FNEG in ExpandFNEG if FSUB is marked Promote. We have a special check that tries to determine if vector FP operations are supported for the type to determine whether to scalarize or not. If FP arithmetic would be promoted, don't unroll. This improves Zvfhmin codegen on RISC-V.	2024-09-18 18:19:21 -07:00
Craig Topper	da46244e49	Revert "[LegalizeVectorOps] Make the AArch64 hack in ExpandFNEG more specific." This reverts commit 884ff9e3f9741ac282b6cf8087b8d3f62b8e138a. Regression was reported in Halide for arm32.	2024-09-17 09:04:43 -07:00
Craig Topper	f36580fcb5	[LegalizeVectorOps] Remove calls to DAG.UnrollVectorOp from some expansion handlers. NFC (#108930 ) Instead, return SDValue() to tell the caller to do the unrolling. This is consistent with how some other handlers work, especially the handlers that live in TLI. ExpandBITREVERSE was rewritten to not take the Results vector as an argument.	2024-09-17 08:35:22 -07:00
Craig Topper	884ff9e3f9	[LegalizeVectorOps] Make the AArch64 hack in ExpandFNEG more specific. Only scalarize single element vectors when vector FSUB is not supported and scalar FNEG is supported.	2024-09-16 21:48:42 -07:00
Craig Topper	3e798476de	[LegalizeDAG][RISCV] Don't promote f16 vector ISD::FNEG/FABS/FCOPYSIGN to f32 when we don't have Zvfh. (#106652 ) The fp_extend will canonicalize NaNs which is not the semantics of FNEG/FABS/FCOPYSIGN. For fixed vectors I'm scalarizing due to test changes on other targets where the scalarization is expected. I will try to address in a follow up. For scalable vectors, we bitcast to integer and use integer logic ops.	2024-09-03 22:44:49 -07:00
Craig Topper	366ac8c090	[LegalizeVectorOps] Defer UnrollVectorOp in ExpandFNEG to caller. (#106783 ) Make ExpandFNEG return SDValue() when it doesn't expand. The caller already knows how to Unroll when Results is empty.	2024-09-02 16:16:12 -07:00
Yingwei Zheng	affc0c64b6	[SDAG] Expand vector [u\|s]cmp in VectorLegalizer (#106883 ) Address comment https://github.com/llvm/llvm-project/pull/106747#issuecomment-2322922855.	2024-09-01 22:35:52 +08:00
Craig Topper	c25293c6dd	[LegalizeVectorOps][RISCV] Don't promote VP_FABS/FNEG/FCOPYSIGN. (#106659 ) Promoting canonicalizes NaNs which changes the semantics. Bitcast to integer and use logic ops instead.	2024-08-30 09:44:51 -07:00
Craig Topper	aa91d90cb0	[LegalizeVectorOps][PowerPC] Use xor to expand fneg. (#106595 ) This preserves the semantics of fneg and matches what we do in LegalizeDAG. I kept the legal FSUB check to force unrolling for some targets that don't have FSUB but have XOR. On AArch64, using xor broke some tests that expected to see a (v1f64 (fma (insertvector_elt (f64 (fneg (extractvectorelt X)))))) pattern.	2024-08-29 15:00:23 -07:00
Sumanth Gundapaneni	e78156a0e2	Scalarize the vector inputs to llvm.lround intrinsic by default. (#101054 ) Verifier is updated in a different patch to allow vector types for the llvm.lround and llvm.llround intrinsics.	2024-08-21 12:13:56 -05:00
Lawrence Benson	177ce1900f	[LLVM] Add `llvm.experimental.vector.compress` intrinsic (#92289 ) This PR adds a new vector intrinsic `@llvm.experimental.vector.compress` to "compress" data within a vector based on a selection mask, i.e., it moves all selected values (i.e., where `mask[i] == 1`) to consecutive lanes in the result vector. A `passthru` vector can be provided, from which remaining lanes are filled. The main reason for this is that the existing `@llvm.masked.compressstore` has very strong constraints in that it can only write values that were selected, resulting in guard branches for all targets except AVX-512 (and even there the AMD implementation is _very_ slow). More instruction sets support "compress" logic, but only within registers. So to store the values, an additional store is needed. But this combination is likely significantly faster on many targets as it avoids branches. In follow up PRs, my plan is to add target-specific lowerings for x86, SVE, and possibly RISCV. I also want to combine this with a store instruction, as this is probably a common case and we can avoid some memory writes in that case. See [discussion in forum](https://discourse.llvm.org/t/new-intrinsic-for-masked-vector-compress-without-store/78663) for initial discussion on the design.	2024-07-17 14:24:24 +02:00
Farzon Lotfi	0b58f34c98	[X86][CodeGen] Add base trig intrinsic lowerings (#96222 ) This change is an implementation of https://github.com/llvm/llvm-project/issues/87367's investigation on supporting IEEE math operations as intrinsics. Which was discussed in this RFC: https://discourse.llvm.org/t/rfc-all-the-math-intrinsics/78294 This change adds constrained intrinsics and some lowering cases for `acos`, `asin`, `atan`, `cosh`, `sinh`, and `tanh`. The only x86 specific change was for f80. https://github.com/llvm/llvm-project/issues/70079 https://github.com/llvm/llvm-project/issues/70080 https://github.com/llvm/llvm-project/issues/70081 https://github.com/llvm/llvm-project/issues/70083 https://github.com/llvm/llvm-project/issues/70084 https://github.com/llvm/llvm-project/issues/95966 The x86 lowering is going to be done in three PRs with this being the first. A second PR will be put up for Loop Vectorizing and then SLPVectorizer. The constrained intrinsics are also going to be in multiple parts, but just 2. This part covers just the llvm specific changes, part2 will cover clang specific changes and legalization for backends that have special legalization requirements like aarch64 and wasm.	2024-07-11 15:58:43 -04:00
Nikita Popov	f2f18459d4	Revert "Intrinsic: introduce minimumnum and maximumnum (#93841 )" As far as I can tell, this pull request was not approved, and did not go through an RFC on discourse. This reverts commit 89881480030f48f83af668175b70a9798edca2fb. This reverts commit 225d8fc8eb24fb797154c1ef6dcbe5ba033142da.	2024-06-21 08:34:04 +02:00
YunQiang Su	8988148003	Intrinsic: introduce minimumnum and maximumnum (#93841 ) Currently, on different platforms, the behavior of llvm.minnum is different if one operand is sNaN: When we compare sNaN vs NUM: ARM/AArch64/PowerPC: follow the IEEE754-2008's minNUM: return qNaN. RISC-V/Hexagon follow the IEEE754-2019's minimumNumber: return NUM. X86: Returns NUM but not the same as IEEE754-2019's minimumNumber as +0.0 is not always greater than -0.0. MIPS/LoongArch/Generic: return NUM. LIBCALL: returns qNaN. So, let's introduce llvm.minimumnum/llvm.maximumnum, which always follow IEEE754-2019's minimumNumber/maximumNumber. Half-fix: #93033	2024-06-21 11:53:08 +08:00
Poseydon42	995835fe6d	[SelectionDAG] Add support for the 3-way comparison intrinsics [US]CMP (#91871 ) This PR adds initial support for the `scmp`/`ucmp` 3-way comparison intrinsics in the SelectionDAG. Some of the expansions/lowerings are not optimal yet.	2024-06-17 11:16:52 +02:00
Simon Pilgrim	ea2ee5dc2f	[DAG] Add legalization handling for AVGCEIL/AVGFLOOR nodes (#92096 ) Always match AVG patterns pre-legalization, and use TargetLowering::expandAVG to expand again during legalization. I've removed the X86 custom AVGCEILU pattern detection and replaced with combines to try and convert other AVG nodes to AVGCEILU.	2024-06-12 14:11:07 +01:00
Matt Arsenault	212b78aad4	DAG: Improve fminimum/fmaximum vector expansion logic (#93579 ) First, expandFMINIMUM_FMAXIMUM should be a never-fail API. The client wanted it expanded, and it can always be expanded. This logic was tied up with what the VectorLegalizer wanted. Prefer using the min/max opcodes, and unrolling if we don't have a vselect. This seems to produce better code in all the changed tests.	2024-06-06 19:01:39 +02:00
Craig Topper	21711f89b9	[LegalizeVectorOps] Move VP_STORE legalization from LegalizeDAG to LegalizeVectorOps. 705636a1130551ab105aec95b909a35a0305fc9f moved reductions from LegalizeVectorOps to LegalizeDAG, but the way it was done inadvertently moved stores from LegalizeVectorOps to LegalizeDAG too. This was not intended or desired. Found when this was pulled into my downstream which has other changes that make the distinction important.	2024-06-05 12:23:24 -07:00
Farzon Lotfi	1d87433593	[x86] Add tan intrinsic part 4 (#90503 ) This change is an implementation of #87367's investigation on supporting IEEE math operations as intrinsics. Which was discussed in this RFC: https://discourse.llvm.org/t/rfc-all-the-math-intrinsics/78294 Much of this change was following how G_FSIN and G_FCOS were used. Changes: - `llvm/docs/GlobalISel/GenericOpcode.rst` - Document the `G_FTAN` opcode - `llvm/docs/LangRef.rst` - Document the tan intrinsic - `llvm/include/llvm/Analysis/VecFuncs.def` - Associate the tan intrinsic as a vector function similar to the tanf libcall. - `llvm/include/llvm/CodeGen/BasicTTIImpl.h` - Map the tan intrinsic to `ISD::FTAN` - `llvm/include/llvm/CodeGen/ISDOpcodes.h` - Define ISD opcodes for `FTAN` and `STRICT_FTAN` - `llvm/include/llvm/IR/Intrinsics.td` - Create the tan intrinsic - `llvm/include/llvm/IR/RuntimeLibcalls.def` - Define tan libcall mappings - `llvm/include/llvm/Target/GenericOpcodes.td` - Define the `G_FTAN` Opcode - `llvm/include/llvm/Support/TargetOpcodes.def` - Create a `G_FTAN` Opcode handler - `llvm/include/llvm/Target/GlobalISel/SelectionDAGCompat.td` - Map `G_FTAN` to `ftan` - `llvm/include/llvm/Target/TargetSelectionDAG.td` - Define `ftan`, `strict_ftan`, and `any_ftan` and map them to the ISD opcodes for `FTAN` and `STRICT_FTAN` - `llvm/lib/Analysis/VectorUtils.cpp` - Associate the tan intrinsic as a vector intrinsic - `llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp` - Map the tan intrinsic to `G_FTAN` Opcode - `llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp` - Add `G_FTAN` to the list of floating point math operations also associate `G_FTAN` with the `TAN_F` runtime lib. - `llvm/lib/CodeGen/GlobalISel/Utils.cpp` - More floating point math operation common behaviors. - `llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp` - List the function expansion operations for `FTAN` and `STRICT_FTAN`. Also define both opcodes in `PromoteNode`. - `llvm/lib/CodeGen/SelectionDAG/LegalizeFloatTypes.cpp` - More `FTAN` and `STRICT_FTAN` handling in the legalizer - `llvm/lib/CodeGen/SelectionDAG/LegalizeTypes.h` - Define `SoftenFloatRes_FTAN` and `ExpandFloatRes_FTAN`. - `llvm/lib/CodeGen/SelectionDAG/LegalizeVectorOps.cpp` - Define `FTAN` as a legal vector operation. - `llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp` - Define `FTAN` as a legal vector operation. - `llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp` - define tan as an intrinsic that doesn't return NaN. - `llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp` - Map `LibFunc_tan`, `LibFunc_tanf`, and `LibFunc_tanl` to `ISD::FTAN`. Map `Intrinsic::tan` to `ISD::FTAN` and add selection dag handling for `Intrinsic::tan`. - `llvm/lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp` - Define `ftan` and `strict_ftan` names for the equivalent ISD opcodes. - `llvm/lib/CodeGen/TargetLoweringBase.cpp` - Define a Tan128 libcall and ISD::FTAN as a target lowering action. - `llvm/lib/Target/X86/X86ISelLowering.cpp` - Add x86_64 lowering for tan intrinsic resolves https://github.com/llvm/llvm-project/issues/70082	2024-06-05 15:01:33 -04:00
Craig Topper	f2d74002fd	[LegalizeVectorOps][X86] Add ISD::ABDS/ABDU to the list of opcodes handled by LegalizeVectorOps. (#92332 ) The expand code is present, but we were missing the type query code so the nodes would be ignored until LegalizeDAG.	2024-05-15 21:46:31 -07:00
Craig Topper	705636a113	[SelectionDAG][RISCV] Move VP_REDUCE* legalization to LegalizeDAG.cpp. (#90522 ) LegalizeVectorOps is responsible for legalizing nodes that perform an operation on each element and may need to be scalarized. This is not true for nodes like VP_REDUCE.*, BUILD_VECTOR, SHUFFLE_VECTOR, EXTRACT_SUBVECTOR, etc. This patch drops any nodes with a scalar result from LegalizeVectorOps and handles them in LegalizeDAG instead. This required moving the reduction promotion to LegalizeDAG. I have removed the support for integer promotion as it was incorrect for integer min/max reductions. Since it was untested, it was best to assert on it until it was really needed. There are a couple regressions that can be fixed with a small DAG combine which I will do as a follow up.	2024-04-29 22:44:24 -07:00
Qiu Chaofan	4a8f2f2e1a	[Legalizer] Expand fmaximum and fminimum (#67301 ) According to langref, llvm.maximum/minimum has -0.0 < +0.0 semantics and propagates NaN. Expand the nodes on targets not supporting the operation, by adding extra check for NaN and using is_fpclass to check zero signs.	2024-04-29 15:09:54 +08:00
Matt Arsenault	401658cb4b	AMDGPU: Fix vector handling of fptrunc_round	2024-04-24 12:42:55 +02:00
choikwa	422bf13f33	[AMDGPU] In VectorLegalizer::Expand, if UnrollVectorOp returns Load, … (#88475 ) …return only Load since other output is chain. Added testcase that showed mismatched expected arity when Load and chain were returned as separate items after 003b58f65bdd5d9c7d0c1b355566c9ef430c0e7d	2024-04-16 06:04:37 -04:00
Paul Walker	bd6eb54886	[LLVM][CodeGen] Teach SelectionDAG how to expand FREM to a vector math call. (#83859 ) This removes, at least when a vector library is available, a failure case for scalable vectors. Doing so means we can confidently cost vector FREM instructions without making an assumption that later passes will transform the IR before it gets to the code generator. NOTE: Whilst only FREM has been implemented the same mechanism can be used for the other libm related ISD nodes.	2024-03-08 12:09:05 +00:00
Craig Topper	de41eae41f	[SelectionDAG][RISCV] Use FP type for legality query for LRINT/LLRINT in LegalizeVectorOps. (#82728 ) This matches how LRINT/LLRINT is queried for scalar types in LegalizeDAG. It's confusing if they do different things since a "Legal" vector LRINT/LLRINT would get through to LegalizeDAG which would then consider it illegal. This doesn't happen currently because RISC-V uses Custom.	2024-02-22 20:18:52 -08:00
Nico Weber	184ca39529	[llvm] Move CodeGenTypes library to its own directory (#79444 ) Finally addresses https://reviews.llvm.org/D148769#4311232 :) No behavior change.	2024-01-25 12:01:31 -05:00
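
As an illustration of the semantics that commit 4a8f2f2e1a (#67301) describes for `llvm.maximum`/`llvm.minimum` (NaN propagates, and -0.0 is ordered below +0.0), here is a minimal scalar sketch in Python. The function names `fmaximum` and `fminimum` are hypothetical and chosen only for this example. The actual legalizer expansion works on DAG nodes and, per the commit message, uses `is_fpclass` to check zero signs; this sketch only models the resulting values.

```python
import math


def fmaximum(a: float, b: float) -> float:
    """Scalar model of llvm.maximum semantics (hypothetical helper)."""
    # Extra NaN check: any NaN operand propagates to the result.
    if math.isnan(a) or math.isnan(b):
        return math.nan
    # An ordinary comparison treats -0.0 and +0.0 as equal, so when both
    # operands are zero the sign has to be inspected explicitly.
    if a == 0.0 and b == 0.0:
        return a if math.copysign(1.0, a) > 0 else b
    return a if a > b else b


def fminimum(a: float, b: float) -> float:
    """Scalar model of llvm.minimum semantics (hypothetical helper)."""
    if math.isnan(a) or math.isnan(b):
        return math.nan
    # Prefer the negative zero when both operands are zero.
    if a == 0.0 and b == 0.0:
        return a if math.copysign(1.0, a) < 0 else b
    return a if a < b else b
```

For example, `fmaximum(-0.0, 0.0)` yields a positive zero and `fminimum(0.0, -0.0)` yields a negative zero, whereas a plain `a > b` comparison could return either zero depending on operand order.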

337 Commits