For the purpose of verifying proper argument extension per the target's ABI,
introduce the NoExt attribute that a target may use when neither sign nor
zero extension is required (e.g. with a struct in a register). The purpose of
doing so is to be able to verify that exactly one of these attributes is
always present, thereby detecting cases where sign/zero extension is actually
missing.
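A minimal IR sketch of the intended invariant (assuming the IR spelling
`noext`; the callee is illustrative):
```
; Every integer argument narrower than a GPR carries exactly one of the
; three extension attributes, so a missing extension becomes detectable.
declare void @callee(i8 signext, i8 zeroext, i8 noext)
```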
As a first step, this patch enables the verification for the SystemZ
backend only, but leaves it off by default until all known issues have been
addressed.
Other targets/front-ends can now also add the NoExt attribute where needed
and perform this check in the backend.
Unlike the scalar case, where AArch64 prefers expanding scmp/ucmp with selects,
under Neon we can use the arithmetic expansion to generate fewer
instructions. Notably it also prevents the scalarization of vselect
during vector-legalization.
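Roughly, the arithmetic expansion for vectors looks like this in IR (a
sketch; vector compares produce 0/-1 masks, which supply the +1/-1 values
directly):
```
%lt  = icmp slt <4 x i32> %x, %y
%gt  = icmp sgt <4 x i32> %x, %y
%ltm = sext <4 x i1> %lt to <4 x i32>   ; -1 where x < y
%gtm = sext <4 x i1> %gt to <4 x i32>   ; -1 where x > y
%r   = sub <4 x i32> %ltm, %gtm         ; -1 / 0 / +1 per lane
```
No per-lane selects remain, so there is nothing for vselect scalarization
to break apart.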
This is a follow-up to #92289 that adds lowering of the new
`@llvm.experimental.vector.compress` intrinsic on x86 with AVX512
instructions. This intrinsic maps directly to `vpcompress`.
GCC supports code like "asm volatile ("" : "=r" (i) : "0" (f))", where i
has integer type and f has floating-point type. Currently this code
produces an error with Clang. This change allows mixed scalar types
between input and output constraints.
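The same tied constraint can be written directly in LLVM IR (a sketch; the
empty asm template just moves the value):
```
define i32 @bitmove(float %f) {
  ; "=r,0" ties the float input to the i32 output register
  %i = call i32 asm sideeffect "", "=r,0"(float %f)
  ret i32 %i
}
```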
Co-authored-by: Matt Arsenault <Matthew.Arsenault@amd.com>
PR #80309 proposes to have users of APInt's uint64_t
constructor opt in to implicit truncation. Currently, that patch
requires SelectionDAG::getConstant to opt in.
This patch adds getSignedConstant so we can start fixing some of the
cases that require implicit truncation.
C23 introduced new functions fminimum_num and fmaximum_num, and they
follow the minimumNumber and maximumNumber of IEEE754-2019. Let's
introduce new intrinsics to support them.
This patch introduces support only for scalar values. Support for the
vector variants (vp, vp.reduce, vector.reduce) and the
experimental.constrained variants will be added in future patches.
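A scalar sketch of the new intrinsics (the clamp function is illustrative):
```
declare float @llvm.minimumnum.f32(float, float)
declare float @llvm.maximumnum.f32(float, float)

; IEEE754-2019 minimumNumber/maximumNumber return the NUM operand even
; when the other operand is an sNaN, so the clamp is NaN-safe.
define float @clamp01(float %x) {
  %lo = call float @llvm.maximumnum.f32(float %x, float 0.0)
  %hi = call float @llvm.minimumnum.f32(float %lo, float 1.0)
  ret float %hi
}
```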
With this patch, MIPSr6 and LoongArch work out of the box with
fcanonical and fmax/fmin.
AArch64/PowerPC64 can use the same logic as MIPSr6 and LoongArch, but
they have no fcanonical support yet; I will add it in future patches.
RISC-V's FMIN/FMAX instructions follow the
minimumNumber/maximumNumber of IEEE754-2019, so we can simply add support
in a future patch.
Background
https://discourse.llvm.org/t/rfc-fix-llvm-min-f-and-llvm-max-f-intrinsics/79735
Currently we have fminnum/fmaxnum, which have different behavior on
different platforms for NUM vs sNaN:
1) Fallback to fmin(3)/fmax(3): return qNaN.
2) ARM64/ARM32+Neon: same as libc.
3) MIPSr6/LoongArch/RISC-V: return NUM.
The fix to make fminnum/fmaxnum follow minNUM/maxNUM of IEEE754-2008
will be submitted as separate patches.
This fold applies only if the upper bits of the shr aren't demanded.
This helps with cases where the outer srl was originally an sra and was
converted to a srl by SimplifyDemandedBits before it had a chance to
combine with the inner sra. This can occur when the inner sra was part
of a sign_extend_inreg expansion.
There are some regressions in ARM and Thumb2.
In TargetLowering::expandFixedPointMul, when expanding fixed-point
multiplication using a widened MUL as the lowering strategy, there was a
bug resulting in assertion failures like this:
Assertion `VT.isVector() == N1.getValueType().isVector() &&
"SIGN_EXTEND result type type should be vector iff the operand "
"type is vector!"' failed.
The problem was that we did not consider that VT could be a vector type
when setting up the WideVT. This patch fixes that bug.
Always match ABD patterns pre-legalization, and use TargetLowering::expandABD to expand again during legalization.
abdu(lhs, rhs) -> sub(xor(sub(lhs, rhs), usub_overflow(lhs, rhs)), usub_overflow(lhs, rhs))
Alive2: https://alive2.llvm.org/ce/z/dVdMyv
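In IR terms the expansion is roughly (a sketch; the sign-extended overflow
bit stands in for the usub_overflow mask above, and the names are
illustrative):
```
define i32 @abdu_expanded(i32 %a, i32 %b) {
  %d  = sub i32 %a, %b
  %o  = call { i32, i1 } @llvm.usub.with.overflow.i32(i32 %a, i32 %b)
  %ov = extractvalue { i32, i1 } %o, 1
  %m  = sext i1 %ov to i32   ; all-ones iff %a u< %b
  %x  = xor i32 %d, %m       ; conditionally complement the difference...
  %r  = sub i32 %x, %m       ; ...and add one, i.e. negate: |a - b|
  ret i32 %r
}
```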
REAPPLIED: Fix regression with "abs(ext(x) - ext(y)) -> zext(abd(x, y))" fold failing after type legalization
For a __thread variable x, when emulated TLS is enabled and there is an
access to x, the compiler first looks up the symbol __emutls_v.x within
the module. However, an issue arises with an alias y of x: the compiler
still tries to look up __emutls_v.y instead of __emutls_v.x. As a
result, the lookup returns a nullptr, causing the compiler to crash. The
purpose of this MR (Merge Request) is to ensure that in emulated TLS,
before checking __emutls_v.y, the compiler first identifies which global
value y is an alias of.
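A minimal reproducer sketch: the load through @y must resolve to
__emutls_v.x, not a non-existent __emutls_v.y:
```
@x = thread_local global i32 0
@y = thread_local alias i32, ptr @x

define i32 @get() {
  %v = load i32, ptr @y   ; with emulated TLS this goes via __emutls_v.x
  ret i32 %v
}
```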
Always match ABD patterns pre-legalization, and use TargetLowering::expandABD to expand again during legalization.
abdu(lhs, rhs) -> sub(xor(sub(lhs, rhs), usub_overflow(lhs, rhs)), usub_overflow(lhs, rhs))
Alive2: https://alive2.llvm.org/ce/z/dVdMyv
This produces better code on x86_64, but only in the unordered case. I'm
not sure what the exact condition should be to avoid the regression; a
free fabs might do it, or it may require legality checks for the
alternative integer expansion.
Have a simpler lowering for exact udivs in both SelectionDAG and
GlobalISel.
The algorithm is the same for unsigned exact divs as for signed divs,
save for using a logical rather than arithmetic shift for even divisors,
according to Hacker's Delight, 2nd Edition, page 242.
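For example, a divide that is known exact by 6 (a sketch; the constants
are for 32 bits):
```
define i32 @div6(i32 %x) {
  %q = udiv exact i32 %x, 6
  ret i32 %q
}
; lowers to the equivalent of:
;   %s = lshr exact i32 %x, 1       ; divide out the even factor
;   %q = mul i32 %s, -1431655765    ; 0xAAAAAAAB == 3^-1 (mod 2^32)
; the signed form uses ashr in place of lshr.
```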
This PR adds a new vector intrinsic `@llvm.experimental.vector.compress`
to "compress" data within a vector based on a selection mask: it moves
all selected values (i.e., where `mask[i] == 1`) to consecutive lanes in
the result vector. A `passthru` vector can be provided, from which the
remaining lanes are filled.
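For example (a sketch with a constant mask):
```
; Lanes 0 and 2 are selected, so they move to result lanes 0 and 1;
; result lanes 2 and 3 are filled from %passthru.
%r = call <4 x i32> @llvm.experimental.vector.compress.v4i32(
         <4 x i32> %v,
         <4 x i1> <i1 true, i1 false, i1 true, i1 false>,
         <4 x i32> %passthru)
```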
The main reason for this is that the existing
`@llvm.masked.compressstore` has very strong constraints in that it can
only write values that were selected, resulting in guard branches for
all targets except AVX-512 (and even there the AMD implementation is
_very_ slow). More instruction sets support "compress" logic, but only
within registers, so an additional store is needed to write the values to
memory. But this combination is likely significantly faster on many
targets, as it avoids branches.
In follow-up PRs, my plan is to add target-specific lowerings for x86,
SVE, and possibly RISC-V. I also want to combine this with a store
instruction, as that is probably a common case and would avoid some
memory writes.
See the [discussion in the
forum](https://discourse.llvm.org/t/new-intrinsic-for-masked-vector-compress-without-store/78663)
for the initial design discussion.
The previous expansion of [US]CMP was done using two selects and two
compares. It produced decent code, but on many platforms it is better to
implement [US]CMP nodes by performing the following operation:
```
[us]cmp(x, y) = (x [us]> y) - (x [us]< y)
```
This patch adds this new expansion, as well as a hook in TargetLowering to allow some targets to keep using the select-based approach. AArch64 and SystemZ are currently the only targets to prefer the select-based approach, but other targets may adopt it as well if it provides better codegen.
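For contrast, a sketch of the select-based form that the hook preserves
(names are illustrative):
```
define i8 @scmp_selects(i32 %x, i32 %y) {
  %lt = icmp slt i32 %x, %y
  %gt = icmp sgt i32 %x, %y
  %t  = select i1 %gt, i8 1, i8 0    ; +1 if x > y, else 0
  %r  = select i1 %lt, i8 -1, i8 %t  ; -1 if x < y
  ret i8 %r
}
```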
When looking for the largest legal integer type for a target,
`TargetLowering::findOptimalMemOpLowering` assumes that `MVT::i64` is
the largest possible integer type. The patch removes this assumption and
uses `MVT::LAST_INTEGER_VALUETYPE` instead.
#97645 proposed to remove LegalTypes from getShiftAmountTy. This patch
removes it from getShiftAmountConstant, which is one of the callers of
getShiftAmountTy.
As far as I can tell, this pull request was not approved and
did not go through an RFC on Discourse.
This reverts commit 89881480030f48f83af668175b70a9798edca2fb.
This reverts commit 225d8fc8eb24fb797154c1ef6dcbe5ba033142da.
Currently, the behavior of llvm.minnum differs across platforms if one
operand is sNaN. When we compare sNaN vs NUM:
1) ARM/AArch64/PowerPC: follow IEEE754-2008's minNUM: return qNaN.
2) RISC-V/Hexagon: follow IEEE754-2019's minimumNumber: return NUM.
3) X86: returns NUM, but does not match IEEE754-2019's minimumNumber,
as +0.0 is not always treated as greater than -0.0.
4) MIPS/LoongArch/Generic: return NUM.
5) LIBCALL: returns qNaN.
So, let's introduce llvm.minimumnum/llvm.maximumnum, which always follow
IEEE754-2019's minimumNumber/maximumNumber.
Half-fix: #93033
This PR adds initial support for the `scmp`/`ucmp` 3-way comparison
intrinsics in the SelectionDAG. Some of the expansions/lowerings
are not optimal yet.
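A sketch of their use (the result may be any integer type of at least two
bits):
```
declare i8 @llvm.scmp.i8.i32(i32, i32)

; returns -1 if %a s< %b, 0 if equal, +1 if %a s> %b
define i8 @cmp3(i32 %a, i32 %b) {
  %c = call i8 @llvm.scmp.i8.i32(i32 %a, i32 %b)
  ret i8 %c
}
```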
Converting to avgfloor and then expanding it back to shift+add later is likely to prevent other folds (re-association and value-tracking in particular) in the meantime.
Fixes #95284