llvm-project

Author	SHA1	Message	Date
Shih-Po Hung	3d985a6f1b	[RISCV][TTI] Scale the cost of Select with LMUL (#88098 ) Use the Val type to estimate the instruction cost for SelectInst.	2024-04-10 14:18:15 +08:00
Shih-Po Hung	ee52add6cb	[RISCV][TTI] Implement cost of intrinsic active_lane_mask (#87931 ) This patch uses the argument type to infer the LMUL cost for the index generation, add, and comparison.	2024-04-10 10:08:33 +08:00
David Green	4ac2721e51	[AArch64] Add costs for ST3 and ST4 instructions, modelled as store(shuffle). (#87934 ) This tries to add some costs for the shuffle in a ST3/ST4 instruction, which are represented in LLVM IR as store(interleaving shuffle). In order to detect the store, it needs to add a CxtI context instruction to check the users of the shuffle. LD3 and LD4 are added, LD2 should be a zip1 shuffle, which will be added in another patch. It should help fix some of the regressions from #87510.	2024-04-09 16:36:08 +01:00
Alexey Bataev	413a66f339	[LV, VP]VP intrinsics support for the Loop Vectorizer + adding new tail-folding mode using EVL. (#76172 ) This patch introduces generating VP intrinsics in the Loop Vectorizer. Currently the Loop Vectorizer supports vector predication in a very limited capacity via tail-folding and masked load/store/gather/scatter intrinsics. However, this does not let architectures with active vector length predication support take advantage of their capabilities. Architectures with general masked predication support also can only take advantage of predication on memory operations. By having a way for the Loop Vectorizer to generate Vector Predication intrinsics, which (will) provide a target-independent way to model predicated vector instructions, these architectures can make better use of their predication capabilities. Our first approach (implemented in this patch) builds on top of the existing tail-folding mechanism in the LV (just adds a new tail-folding mode using EVL), but instead of generating masked intrinsics for memory operations it generates VP intrinsics for load/store instructions. The patch adds a new VPlanTransforms to replace the wide header predicate compare with EVL and updates codegen for load/stores to use VP store/load with EVL. Another important part of this approach is how the Explicit Vector Length is computed. (VP intrinsics define this vector length parameter as Explicit Vector Length (EVL)). We use an experimental intrinsic `get_vector_length`, that can be lowered to architecture specific instruction(s) to compute EVL. Also, added a new recipe to emit instructions for computing EVL. Using VPlan in this way will eventually help build and compare VPlans corresponding to different strategies and alternatives. Differential Revision: https://reviews.llvm.org/D99750	2024-04-04 18:30:17 -04:00
Shih-Po Hung	97523e5321	[RISCV][TTI] Scale the cost of intrinsic stepvector with LMUL (#87301 ) Use the return type to measure the LMUL size for latency/throughput cost	2024-04-04 08:30:15 +08:00
Shih-Po Hung	d7a43a00fe	[RISCV][TTI] Scale the cost of trunc/fptrunc/fpext with LMUL (#87101 ) Use the destination data type to measure the LMUL size for latency/throughput cost	2024-04-02 09:30:51 +08:00
Shih-Po Hung	84f24c2daf	[RISCV][TTI] Scale the cost of intrinsic umin/umax/smin/smax with LMUL (#87245 ) Use the return type to measure the LMUL size for throughput/latency cost	2024-04-02 09:26:27 +08:00
Shih-Po Hung	c7954ca312	Recommit "[RISCV] Refine cost on Min/Max reduction (#79402 )" (#86480 ) This is recommitted as the test and fix for llvm.vector.reduce.fmaximum/fminimum are covered in #80553 and #80697	2024-04-01 14:44:10 +08:00
ShihPo Hung	aa2d5d5413	Recommit "[RISCV][TTI] Scale the cost of the sext/zext with LMUL (#86617 )" Changes in Recommit: Add an additional check on sign/zero extend to the same type. Original message: Use the destination data type to measure the LMUL size for latency/throughput cost	2024-03-26 23:41:16 -07:00
Jianjian Guan	05a7b22a01	[RISCV] Add areInlineCompatible for riscv target (#86639 ) Inline a callee if its target-features are a subset of the callers target-features.	2024-03-27 14:16:03 +08:00
ShihPo Hung	da3e58e74a	Revert "[RISCV][TTI] Scale the cost of the sext/zext with LMUL (#86617 )" This reverts commit 7545c635729a2055a429c5decd26a619a8d6e74b as it's failing on the Linux bots.	2024-03-26 21:47:32 -07:00
Shih-Po Hung	7545c63572	[RISCV][TTI] Scale the cost of the sext/zext with LMUL (#86617 ) Use the destination data type to measure the LMUL size for latency/throughput cost	2024-03-27 10:58:17 +08:00
Craig Topper	2fbc40d36d	[RISCV] Split compound if statement to fix a crash. We're not allowed to call getELEN when the vector extension is not enabled. If we're looking at a vector type, isTypeLegal would only return true if the vector extensions are enabled. So early out for non-vector types before we call isTypeLegal and getELEN.	2024-03-26 11:53:17 -07:00
ShihPo Hung	5dc0c75aab	[RISCV][TTI] Fix missing return in the end of function	2024-03-25 23:32:18 -07:00
Shih-Po Hung	817f453aa5	[RISCV][TTI] Refactor getCastInstrCost to exit early (#86619 ) To reduce the indentation by using early returns, this patch hoists the return for illegal type and non vector type earlier. It should mostly be an NFC.	2024-03-26 14:15:40 +08:00
Shih-Po Hung	3cb024198f	[RISCV][CostModel] Estimate cost of llvm.vector.reduce.fmaximum/fminimum (#80697 ) The ‘llvm.vector.reduce.fmaximum/fminimum.*’ intrinsics propagate NaNs if any element of the vector is a NaN. Following #79402, the patch adds the cost for NaN check (vmfne + vcpop)	2024-03-25 17:17:36 +08:00
Kolya Panchenko	aa68e2814d	[RISCV] Support `llvm.masked.compressstore` intrinsic (#83457 ) The changeset enables lowering of `llvm.masked.compressstore(%data, %ptr, %mask)` for RVV for fixed vector type into: ``` %0 = vcompress %data, %mask, %vl %new_vl = vcpop %mask, %vl vse %0, %ptr, %1, %new_vl ``` Such lowering is only possible when `%data` fits into available LMULs and otherwise `llvm.masked.compressstore` is scalarized by `ScalarizeMaskedMemIntrin` pass. Even though RVV spec in the section `15.8` provide alternative sequence for compressstore, use of `vcompress + vcpop` should be a proper canonical form to lower `llvm.masked.compressstore`. If RISC-V target find the sequence from `15.8` better, peephole optimization can transform `vcompress + vcpop` into that sequence.	2024-03-13 15:18:51 -04:00
Visoiu Mistrih Francis	eceb24c439	[RISCV] Hoist immediate addresses from loads/stores (#83644 ) In case of loads/stores from an immediate address, avoid rematerializing the constant for every block and allow consthoist to hoist it to the entry block.	2024-03-05 22:41:56 -08:00
Shih-Po Hung	fb67dce1cb	[RISCV] Fix crash when unrolling loop containing vector instructions (#83384 ) When MVT is not a vector type, TCK_CodeSize should return an invalid cost. This patch adds a check in the beginning to make sure all cost kinds return invalid costs consistently. Before this patch, TCK_CodeSize returns a valid cost on scalar MVT but other cost kinds don't. This fixes the issue #83294 where a loop contains vector instructions and MVT is scalar after type legalization when the vector extension is not enabled.	2024-03-02 12:33:55 +08:00
Shih-Po Hung	6ee9c8afbc	[RISCV][CostModel] Updates reduction and shuffle cost (#77342 ) - Make `andi` cost 1 in SK_Broadcast - Query the cost of VID_V, VRSUB_VX/VRSUB_VI which would scale with LMUL	2024-02-29 15:41:19 +08:00
Philip Reames	f037e709ca	[RISCV][TTI] Cost a subvector extract at a register boundary with exact vlen (#82405 ) If we have exact vlen knowledge, we can figure out which indices correspond to register boundaries. Our lowering uses this knowledge to replace the vslidedown.vi with a sub-register extract. Our costs can reflect that as well. This is another piece split off https://github.com/llvm/llvm-project/pull/80164 --------- Co-authored-by: Luke Lau <luke_lau@icloud.com>	2024-02-21 07:56:08 -08:00
Philip Reames	2549c24142	Reapply "[RISCV][TTI] Extract subvector at index zero is free (#81751 )" This reverts commit 834d11c21541c8bf92ef598c1171e8163b69e8c7 which was a revert of my 3a626937b1b652e3c87cd0050df9c24cc5127d3b. I had failed to rebase after new tests added overnight by fc0b67e1d79d1f199687f8f06d619984d9520230. Original commit message follows: Extracting a subvector at index zero corresponds to a type conversion and possibly a subregister operation. We will not emit a vslidedown. As such, they are free. As an aside, it looks like we're not passing an index in for cases where the subvec type is scalable. For at least index zero, we probably should be. Revert "Revert "[RISCV][TTI] Extract subvector at index zero is free (#81751)""	2024-02-15 16:51:15 -08:00
Craig Topper	834d11c215	Revert "[RISCV][TTI] Extract subvector at index zero is free (#81751 )" This reverts commit 3a626937b1b652e3c87cd0050df9c24cc5127d3b. Causes tests added by fc0b67e1d79d1f199687f8f06d619984d9520230 to fail.	2024-02-15 12:51:23 -08:00
Philip Reames	3a626937b1	[RISCV][TTI] Extract subvector at index zero is free (#81751 ) Extracting a subvector at index zero corresponds to a type conversion and possibly a subregister operation. We will not emit a vslidedown. As such, they are free. As an aside, it looks like we're not passing an index in for cases where the subvec type is scalable. For at least index zero, we probably should be.	2024-02-15 07:43:50 -08:00
Philip Reames	59e559067b	Revert "[RISCV] Refine cost on Min/Max reduction" (#80340 ) Reverts llvm/llvm-project#79402. Crash reported. On closer inspection, this patch does not handle Intrinsic::maximum and Intrinsic::minimum.	2024-02-01 13:09:07 -08:00
Alexey Bataev	8ad14b6d90	[TTI]Add support for strided loads/stores. Added basic legality check and cost estimation functions for strided loads and stores. These interfaces will be built upon in https://github.com/llvm/llvm-project/pull/80310. Reviewers: preames Reviewed By: preames Pull Request: https://github.com/llvm/llvm-project/pull/80329	2024-02-01 16:07:38 -05:00
Shih-Po Hung	2800448f88	[RISCV] Refine cost on Min/Max reduction (#79402 ) This patch is split off from #77342, and follows #79103 - Correct for CodeSize cost that 1 instruction is not included. 3 is from {VMV.S, ReductionOp, VMV.X} - Add SplitCost which chains a series of VMAX/VMIN/... which scales with LMUL. - Use MVT to estimate VL.	2024-01-30 16:47:32 +08:00
Shih-Po Hung	bf716fb716	[RISCV] Refine cost on Min/Max reduction with i1 type (#79401 ) It is split off from #77342. InstCombine transform min/max reduction with i1 into arithmetic reduction, so this patch reuses the cost logic in arithmetic reduction cost function.	2024-01-26 19:35:27 +08:00
Shih-Po Hung	84be954cb2	[RISCV][CostModel] Refine Arithmetic reduction costs (#79103 ) This patch is split off from #77342 - Correct for CodeSize cost that 1 instruction is not included. 3 is from {VMV.S, ReductionOp, VMV.X} - Add SplitCost Unordered reduction chain a series of VADD/VFADD/... which scales with LMUL. Ordered reductions chain a series of VFREDOSUMs. - Use MVT to estimate VL.	2024-01-25 10:49:44 +08:00
Shih-Po Hung	7e63940f69	[RISCV][CostModel] Make VMV_S_X and VMV_X_S cost independent of LMUL (#78739 ) Following #77963, instructions like VMV_S_X/VMV_X_S handle single element, so the cost doesn't scale with LMUL.	2024-01-23 11:00:19 +08:00
Philip Reames	8bf624af47	[RISCV] Key VectorIntrinsicCostTable by SEW [nfc-ish] Previously, we'd keyed the table by the vector type, but we were actually assigning the same cost for all the types with a common element type. Unless we'd missed an entry, this means that effectively we were performing an SEW lookup. Restructure the table to make this SEW dependence more explicit, and in the process greatly reduce the size of the table.	2024-01-18 17:10:56 -08:00
Philip Reames	2663d2cb9c	[RISCV] Adjust select shuffle cost to reflect mask creation cost (#77963 ) This is inspired by https://github.com/llvm/llvm-project/pull/77342#pullrequestreview-1814673242, and is split off of same with some differences in style. A select is a vmerge.vv with the additional cost of materializing the bitmask vector in a vreg. All masks fit within a single vector register (e8 + m8 is the worst case), and thus our worst case cost should be roughly 3 (2 scalar to produce the address, one vector load op). Given most shuffles are small, and the mask will be instead produced by LUI/ADDI + vmv.s.x or ADDI + vmv.s.x, using 2 as the default seems quite reasonable. At worst, we're not going to be off by much. The prior lowering scaled the cost of the bitmask with LMUL, which I don't understand. At m1 it did use the same base cost of 2. (@lukel97 You wrote the original code here, anything I'm missing here?)	2024-01-18 10:24:47 -08:00
Luke Lau	a348397a1c	[RISCV] Don't scale cost by LMUL for TCK_CodeSize in getMemoryOpCost (#78407 )	2024-01-17 21:41:35 +07:00
Shih-Po Hung	475890cd2e	[RISCV][CostModel] Add getRISCVInstructionCost() to TTI for CostKind (#76793 ) Instruction cost for CodeSize and Latency/RecipThroughput can be very different. Considering the diversity of CostKind and vendor-specific cost, and how they are spread across various TTI functions, it's becoming quite a challenge to handle. This patch adds an interface getRISCVInstructionCost to address it.	2024-01-04 21:04:36 +08:00
Vitaly Buka	9c39d9bb49	Revert "[RISCV][CostModel] Add getRISCVInstructionCost() to TTI for Cost… (#73651 )" (#76536 ) Fails on bots https://lab.llvm.org/buildbot/#/builders/5/builds/39629 Issue #76535 This reverts commit 3e75dece919511e4a2edada82d783304cc14a9cd.	2023-12-28 13:30:56 -08:00
Shih-Po Hung	3e75dece91	[RISCV][CostModel] Add getRISCVInstructionCost() to TTI for Cost… (#73651 ) …Kind Instruction cost for CodeSize and Latency/RecipThroughput can be very different. Considering the diversity of CostKind and vendor-specific cost, and how they are spread across various TTI functions, it's becoming quite a challenge to handle. This patch adds an interface getRISCVInstructionCost to address it.	2023-12-28 14:36:01 +08:00
melonedo	3eaed9e6f5	[RISCV] Implement intrinsics for XCVbitmanip extension in CV32E40P (#74993 ) Implement XCVbitmanip intrinsics for CV32E40P according to the specification. This commit is part of a patch-set to upstream the vendor specific extensions of CV32E40P that need LLVM intrinsics to implement Clang builtins. Contributors: @CharKeaney, @ChunyuLiao, @jeremybennett, @lewis-revill, @NandniJamnadas, @PaoloS02, @simonpcook, @xingmingjie. Spec: `05481cf0ef/specifications/corev-builtin-spec.md (listing-of-pulp-bit-manipulation-builtins-xcvbitmanip)`. Previously reviewed on Phabricator: https://reviews.llvm.org/D157510. Parallel GCC patch: https://gcc.gnu.org/pipermail/gcc-patches/2023-November/635795.html. Co-authored-by: melonedo <funanzeng@gmail.com>	2023-12-17 19:29:40 +08:00
Sander de Smalen	81b7f115fb	[llvm][TypeSize] Fix addition/subtraction in TypeSize. (#72979 ) It seems TypeSize is currently broken in the sense that: TypeSize::Fixed(4) + TypeSize::Scalable(4) => TypeSize::Fixed(8) without failing its assert that explicitly tests for this case: assert(LHS.Scalable == RHS.Scalable && ...); The reason this fails is that `Scalable` is a static method of class TypeSize, and LHS and RHS are both objects of class TypeSize. So this is evaluating if the pointer to the function Scalable == the pointer to the function Scalable, which is always true because LHS and RHS have the same class. This patch fixes the issue by renaming `TypeSize::Scalable` -> `TypeSize::getScalable`, as well as `TypeSize::Fixed` to `TypeSize::getFixed`, so that it no longer clashes with the variable in FixedOrScalableQuantity. The new methods now also better match the coding standard, which specifies that: * Variable names should be nouns (as they represent state) * Function names should be verb phrases (as they represent actions)	2023-11-22 08:52:53 +00:00
Wang Pengcheng	e179b125fb	[RISCV][NFC] Pass MCSubtargetInfo instead of FeatureBitset in RISCVMatInt (#71770 ) The use of `hasFeature` is more descriptive and the callers of `RISCVMatInt` have no need to call `getFeatureBits()` any more.	2023-11-09 15:15:23 +08:00
Fangrui Song	8e247b8f47	Replace TypeSize::{getFixed,getScalable} with canonical TypeSize::{Fixed,Scalable}. NFC	2023-10-27 00:30:41 -07:00
Ramkumar Ramachandra	98c90a13c6	ISel: introduce vector ISD::LRINT, ISD::LLRINT; custom RISCV lowering (#66924 ) The issue #55208 noticed that std::rint is vectorized by the SLPVectorizer, but a very similar function, std::lrint, is not. std::lrint corresponds to ISD::LRINT in the SelectionDAG, and std::llrint is a familiar cousin corresponding to ISD::LLRINT. Now, neither ISD::LRINT nor ISD::LLRINT have a corresponding vector variant, and the LangRef makes this clear in the documentation of llvm.lrint.* and llvm.llrint.*. This patch extends the LangRef to include vector variants of llvm.lrint.* and llvm.llrint.*, and lays the necessary ground-work of scalarizing it for all targets. However, this patch would be devoid of motivation unless we show the utility of these new vector variants. Hence, the RISCV target has been chosen to implement a custom lowering to the vfcvt.x.f.v instruction. The patch also includes a CostModel for RISCV, and a trivial follow-up can potentially enable the SLPVectorizer to vectorize std::lrint and std::llrint, fixing #55208. The patch includes tests, obviously for the RISCV target, but also for the X86, AArch64, and PowerPC targets to justify the addition of the vector variants to the LangRef.	2023-10-19 13:05:04 +01:00
Alexey Bataev	e22818d5c9	[IR]Add NumSrcElts param to is..Mask static function in ShuffleVectorInst. Need to add NumSrcElts param to is..Mask functions in ShuffleVectorInstruction class for better mask analysis. Mask.size() not always matches the sizes of the permuted vector(s). Allows to better estimate the cost in SLP and fix uses of the functions in other cases. Differential Revision: https://reviews.llvm.org/D158449	2023-10-05 06:17:07 -07:00
Arthur Eubanks	07389535a7	Revert "[IR]Add NumSrcElts param to is..Mask static function in ShuffleVectorInst." This reverts commit b186f1f68be11630355afb0c08b80374a6d31782. Causes crashes, see https://reviews.llvm.org/D158449.	2023-10-04 14:37:16 -07:00
Alexey Bataev	b186f1f68b	[IR]Add NumSrcElts param to is..Mask static function in ShuffleVectorInst. Need to add NumSrcElts param to is..Mask functions in ShuffleVectorInstruction class for better mask analysis. Mask.size() not always matches the sizes of the permuted vector(s). Allows to better estimate the cost in SLP and fix uses of the functions in other cases. Differential Revision: https://reviews.llvm.org/D158449	2023-10-04 07:53:30 -07:00
Alexey Bataev	1129dec778	Revert "[IR]Add NumSrcElts param to is..Mask static function in ShuffleVectorInst." This reverts commit 6f43d28f3452b3ef598bc12b761cfc2dbd0f34c9 to fix a crash reported in https://reviews.llvm.org/D158449.	2023-10-03 13:02:16 -07:00
Alexey Bataev	6f43d28f34	[IR]Add NumSrcElts param to is..Mask static function in ShuffleVectorInst. Need to add NumSrcElts param to is..Mask functions in ShuffleVectorInstruction class for better mask analysis. Mask.size() not always matches the sizes of the permuted vector(s). Allows to better estimate the cost in SLP and fix uses of the functions in other cases. Differential Revision: https://reviews.llvm.org/D158449	2023-10-03 10:26:11 -07:00
Alexey Bataev	ebcb5d59fc	Revert "[IR]Add NumSrcElts param to is..Mask static function in ShuffleVectorInst." This reverts commit 9f5960e004ff54082ccfa9396522e07358f5b66b to fix buildbots reported here https://lab.llvm.org/buildbot/#/builders/230/builds/19412.	2023-09-29 15:03:46 -07:00
Alexey Bataev	9f5960e004	[IR]Add NumSrcElts param to is..Mask static function in ShuffleVectorInst. Need to add NumSrcElts param to is..Mask functions in ShuffleVectorInstruction class for better mask analysis. Mask.size() not always matches the sizes of the permuted vector(s). Allows to better estimate the cost in SLP and fix uses of the functions in other cases. Differential Revision: https://reviews.llvm.org/D158449	2023-09-29 13:16:03 -07:00
Alexey Bataev	3204f88a8b	Revert "[IR]Add NumSrcElts param to is..Mask static function in ShuffleVectorInst." This reverts commit c88c281cf1ac1a01c55231b93826d7c8ae83985b to fix the crash revealed by https://lab.llvm.org/buildbot/#/builders/230/builds/19353.	2023-09-28 11:57:32 -07:00
Alexey Bataev	c88c281cf1	[IR]Add NumSrcElts param to is..Mask static function in ShuffleVectorInst. Need to add NumSrcElts param to is..Mask functions in ShuffleVectorInstruction class for better mask analysis. Mask.size() not always matches the sizes of the permuted vector(s). Allows to better estimate the cost in SLP and fix uses of the functions in other cases. Differential Revision: https://reviews.llvm.org/D158449	2023-09-28 11:03:21 -07:00
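
The inlining rule stated in commit 05a7b22a01 (inline a callee if its target-features are a subset of the caller's) can be illustrated with a minimal sketch. This is not the LLVM implementation, which is C++ inside the RISC-V TTI; the function name `are_inline_compatible` and the feature strings below are hypothetical, and feature sets are modeled as plain Python sets.

```python
# Hypothetical sketch of the subset rule from the areInlineCompatible commit.
# Feature sets are modeled as sets of feature-name strings (an assumption;
# the real check operates on subtarget feature bits in C++).

def are_inline_compatible(caller_features, callee_features):
    """Return True when every callee feature is also enabled in the caller."""
    return set(callee_features) <= set(caller_features)


# Illustrative feature names only.
caller = {"m", "a", "c", "v"}

# The callee uses a subset of the caller's features: inlining is allowed.
print(are_inline_compatible(caller, {"m", "c"}))

# The callee needs a feature the caller lacks: inlining is rejected.
print(are_inline_compatible(caller, {"m", "zbb"}))
```

The subset direction matters: code from the callee ends up compiled under the caller's features, so any feature the callee relies on must already be available there.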

232 Commits