llvm-project

Author	SHA1	Message	Date
David Green	77941eba7f	[CostModel] Add a DstTy to getShuffleCost (#141634 ) A shuffle will take two input vectors and a mask, to produce a new vector of size <MaskElts x SrcEltTy>. Historically it has been assumed that the SrcTy and the DstTy are the same for getShuffleCost, with that being relaxed in recent years. If the Tp passed to getShuffleCost is the SrcTy, then the DstTy can be calculated from the Mask elts and the src elt size, but the Mask is not always provided and the Tp is not reliably always the SrcTy. This has led to situations notably in the SLP vectorizer but also in the generic cost routines where assumption about how vectors will be legalized are built into the generic cost routines - for example whether they will widen or promote, with the cost modelling assuming they will widen but the default lowering to promote for integer vectors. This patch attempts to start improving that - it originally tried to alter more of the cost model but that too quickly became too many changes at once, so this patch just plumbs in a DstTy to getShuffleCost so that DstTy and SrcTy can be reliably distinguished. The callers of getShuffleCost have been updated to try and include a DstTy that is more accurate. Otherwise it tries to be fairly non-functional, keeping the SrcTy used as the primary type used in shuffle cost routines, only using DstTy where it was in the past (for InsertSubVector for example). Some asserts have been added that help to check for consistent values when a Mask and a DstTy are provided to getShuffleCost. Some of them took a while to get right, and some non-mask calls might still be incorrect. Hopefully this will provide a useful base to build more shuffles that alter size.	2025-06-21 12:29:29 +01:00
Florian Hahn	071a6feabd	[TTI] Remove PPC hasActiveVectorLength impl, simplify interface (NFC). (#142310 ) PPCTTIImpl defines hasActiveVectorLength and also getVPMemoryOpCost, but they appear unused (i.e. no changes to tests). Remove them, as they complicate the interface for hasActiveVectorLength. This simplifies the only use in LV as now no placeholder values need to be passed. PR: https://github.com/llvm/llvm-project/pull/142310	2025-06-18 19:02:17 +01:00
David Green	abd2c07e39	[CostModel] Make Op0 and Op1 const in getVectorInstrCost. NFC (#137631 ) This does not alter much at the moment, but allows const pointers to be passed as Op0 and Op1, simplifying later patches	2025-05-01 15:55:08 +01:00
Sergei Barannikov	bb1765179e	[TTI] Simplify implementation (NFCI) (#136674 ) Replace "concept based polymorphism" with simpler PImpl idiom. This pursues two goals: * Enforce static type checking. Previously, target implementations hid base class methods and type checking was impossible. Now that they override the methods, the compiler will complain on mismatched signatures. * Make the code easier to navigate. Previously, if you asked your favorite LSP server to show a method (e.g. `getInstructionCost()`), it would show you methods from `TTI`, `TTI::Concept`, `TTI::Model`, `TTIImplBase`, and target overrides. Now it is two less :) There are three commits to hopefully simplify the review. The first commit removes `TTI::Model`. This is done by deriving `TargetTransformInfoImplBase` from `TTI::Concept`. This is possible because they implement the same set of interfaces with identical signatures. The first commit makes `TargetTransformImplBase` polymorphic, which means all derived classes should `override` its methods. This is done in second commit to make the first one smaller. It appeared infeasible to extract this into a separate PR because the first commit landed separately would result in tons of `-Woverloaded-virtual` warnings (and break `-Werror` builds). The third commit eliminates `TTI::Concept` by merging it with the only derived class `TargetTransformImplBase`. This commit could be extracted into a separate PR, but it touches the same lines in `TargetTransformInfoImpl.h` (removes `override` added by the second commit and adds `virtual`), so I thought it may make sense to land these two commits together. Pull Request: https://github.com/llvm/llvm-project/pull/136674	2025-04-26 15:25:40 +03:00
David Green	98b6f8dc69	[CostModel] Remove optional from InstructionCost::getValue() (#135596 ) InstructionCost is already an optional value, containing an Invalid state that can be checked with isValid(). There is little point in returning another optional from getValue(). Most uses do not make use of it being a std::optional, dereferencing the value directly (either isValid has been checked previously or the Cost is assumed to be valid). The one case that does in AMDGPU used value_or which has been replaced by a isValid() check.	2025-04-23 07:46:27 +01:00
Sergei Barannikov	3334c3597d	[TTI] Fix discrepancies in prototypes between interface and implementations (NFCI) (#136655 ) These are not diagnosed because implementations hide the methods of the base class rather than overriding them. This works as long as a hiding function is callable with the same arguments as the same function from the base class. Pull Request: https://github.com/llvm/llvm-project/pull/136655	2025-04-22 11:40:12 +03:00
Sergei Barannikov	0014b49482	[TTI] Make all interface methods const (NFCI) (#136598 ) Making `TargetTransformInfo::Model::Impl` `const` makes sure all interface methods are `const`, in `BasicTTIImpl`, its bases, and in all derived classes. Pull Request: https://github.com/llvm/llvm-project/pull/136598	2025-04-22 06:27:29 +03:00
Sergei Barannikov	e0c1e23b99	[TTI] Constify BasicTTIImplBase::thisT() (NFCI) (#136575 ) The main change is making `thisT` method `const`, the rest of the changes is fixing compilation errors (). () There are two tricky methods, `getVectorInstrCost()` and `getIntImmCost()`. They have several overloads; some of these overloads are typically pulled in to derived classes using the `using` directive, and then hidden by methods in the derived class. The compiler does not complain if the hiding methods are not marked as `const`, which means that clients will use the methods from the base class. If after this change your target fails cost model tests, this must be the reason. To resolve the issue you need to make all hiding overloads `const`. See the second commit in this PR. Pull Request: https://github.com/llvm/llvm-project/pull/136575	2025-04-21 21:42:40 +03:00
Pedro Lobo	6a94bd136d	[PPC] Change placeholder from `undef` to `poison` (#134552 ) Call `insertelement` on a `poison` value instead of `undef`.	2025-04-07 21:56:02 +01:00
Kostas	65220fc5ae	[PPC] Fix coding style violations in PPCTargetTransformInfo.cpp (#130666 ) Fixing violations of the coding conventions. Applying clang-tidy recommendations.	2025-03-13 13:06:49 -04:00
Henry Jiang	6d0cfbc9c0	[PPC] Implement `areInlineCompatible` (#126562 ) After the default implementation swap from https://github.com/llvm/llvm-project/pull/117493, where `areInlineCompatible` checks if the callee features are a subset of caller features. This is not a safe assumption in general on PPC. We fallback to check for strict feature set equality for now, and see what improvements we can make.	2025-02-24 17:53:43 -05:00
Kazu Hirata	f71cb9dbb7	[PowerPC] Remove unused includes (NFC) (#116163 ) Identified with misc-include-cleaner.	2024-11-14 07:55:18 -08:00
RolandF77	06c8210a67	update P7 32-bit partial vector load cost (#108261 ) Update cost model to reflect codegen change to use lfiwzx for 32-bit partial vector loads on pwr7 with https://github.com/llvm/llvm-project/pull/104507.	2024-10-03 12:28:43 -04:00
Philip Reames	d288574363	[TTI][RISCV] Model cost of loading constants arms of selects and compares (#109824 ) This follows in the spirit of 7d82c99403f615f6236334e698720bf979959704, and extends the costing API for compares and selects to provide information about the operands passed in an analogous manner. This allows us to model the cost of materializing the vector constant, as some select-of-constants are significantly more expensive than others when you account for the cost of materializing the constants involved. This is a stepping stone towards fixing https://github.com/llvm/llvm-project/issues/109466. A separate SLP patch will be required to utilize the new API.	2024-09-25 07:25:57 -07:00
azhan92	1df4d866cc	[PowerPC] Add support for -mcpu=pwr11 / -mtune=pwr11 (#99511 ) This PR adds support for -mcpu=pwr11/power11 and -mtune=pwr11/power11 in clang and llvm.	2024-07-23 09:49:41 -04:00
David Green	4ac2721e51	[AArch64] Add costs for ST3 and ST4 instructions, modelled as store(shuffle). (#87934 ) This tries to add some costs for the shuffle in a ST3/ST4 instruction, which are represented in LLVM IR as store(interleaving shuffle). In order to detect the store, it needs to add a CxtI context instruction to check the users of the shuffle. LD3 and LD4 are added, LD2 should be a zip1 shuffle, which will be added in another patch. It should help fix some of the regressions from #87510.	2024-04-09 16:36:08 +01:00
Il-Capitano	308ed0233a	[Intrinsics] Make `patchpoint.i64` generic on its return type (#85911 ) Currently patchpoints can only have two result types, `void` and `i64`. This limits the result to general purpose registers. This patch makes `patchpoint.i64` an overloadable intrinsic, allowing result values that can fit in a single register (e.g. integers, pointers, floats).	2024-03-26 19:08:52 +05:30
Chen Zheng	8d1046ae49	[PowerPC] adjust cost for extract i64 from vector on P9 and above (#82963 ) https://godbolt.org/z/Ma347Tx1W	2024-03-04 09:37:11 +08:00
Chen Zheng	80f3bb4cf2	[PowerPC] adjust cost for vector insert/extract with non const index (#79092 ) P9 has vxform `Vector Extract Element Instructions` like `vextuwrx` and P10 has vxform `Vector Insert Element instructions` like `vinsd`. Update the instruction cost reflecting these instructions. Fixes https://github.com/llvm/llvm-project/issues/50249	2024-02-19 09:57:49 +08:00
RolandF77	4beea6b195	[PowerPC] lower partial vector store cost (#78358 ) There are matching store opcodes (stfd, stxsiwx) for the load opcodes that make 32-bit and 64-bit vector operations cheap with VSX, so stores should also be cheap.	2024-01-23 16:07:18 -05:00
Kazu Hirata	286ef12b47	[Target] Remove unnecessary includes (NFC)	2023-12-07 21:03:56 -08:00
Sander de Smalen	81b7f115fb	[llvm][TypeSize] Fix addition/subtraction in TypeSize. (#72979 ) It seems TypeSize is currently broken in the sense that: TypeSize::Fixed(4) + TypeSize::Scalable(4) => TypeSize::Fixed(8) without failing its assert that explicitly tests for this case: assert(LHS.Scalable == RHS.Scalable && ...); The reason this fails is that `Scalable` is a static method of class TypeSize, and LHS and RHS are both objects of class TypeSize. So this is evaluating if the pointer to the function Scalable == the pointer to the function Scalable, which is always true because LHS and RHS have the same class. This patch fixes the issue by renaming `TypeSize::Scalable` -> `TypeSize::getScalable`, as well as `TypeSize::Fixed` to `TypeSize::getFixed`, so that it no longer clashes with the variable in FixedOrScalableQuantity. The new methods now also better match the coding standard, which specifies that: * Variable names should be nouns (as they represent state) * Function names should be verb phrases (as they represent actions)	2023-11-22 08:52:53 +00:00
Youngsuk Kim	e69e066bfe	[llvm][PowerPC] Remove no-op ptr-to-ptr bitcasts (NFC) Opaque ptr cleanup effort.	2023-11-01 16:40:32 -05:00
Fangrui Song	8e247b8f47	Replace TypeSize::{getFixed,getScalable} with canonical TypeSize::{Fixed,Scalable}. NFC	2023-10-27 00:30:41 -07:00
Roland Froese	4d425f8663	[PowerPC] vector cost model add cost to extract i1 Try to avoid some unprofitable predication on PPC. Recognize in the cost model that computing on i1 values will require extra mask or compare operation. Differential Revision: https://reviews.llvm.org/D155876	2023-08-14 17:04:11 -04:00
Ting Wang	65f68812d3	[PowerPC] update PPCTTIImpl::supportsTailCallFor() check conditions This patch reuse `PPCTargetLowering::isEligibleForTCO()` to check `PPCTTIImpl::supportsTailCallFor()`. Fixes #59315 Reviewed By: shchenz Differential Revision: https://reviews.llvm.org/D140369	2023-02-28 22:29:16 -05:00
Luke Lau	b02b1e0ed6	[LV][NFC] Use ElementCount for getMaxInterleaveFactor In order to allow targets to disable interleaving for scalable vectors, pass the entire VF's ElementCount to getMaxInterleaveFactor. This is based off of the approach used here: `8d36708507` The plan would then be to disable interleaving on scalable VFs on RISC-V in a follow up patch. See https://reviews.llvm.org/D143723#4132349 Reviewed By: reames Differential Revision: https://reviews.llvm.org/D144474	2023-02-22 10:15:05 +00:00
ShihPo Hung	5fb3a57ea7	[Cost] Add CostKind to getVectorInstrCost and its related users LoopUnroll estimates the loop size via getInstructionCost(), but getInstructionCost() cannot pass CostKind to getVectorInstrCost(). And so does getShuffleCost() to getBroadcastShuffleOverhead(), getPermuteShuffleOverhead(), getExtractSubvectorOverhead(), and getInsertSubvectorOverhead(). To address this, this patch adds an argument CostKind to these functions. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D142116	2023-01-21 05:29:24 -08:00
Alexey Bataev	9b5f62685a	[SLP]Fix cost of the broadcast buildvector/gather. Need to include the cost of the initial insertelement to the cost of the broadcasts. Also, need to adjust the cost of the gather/buildvector if the element is inserted into poison/undef vector. Differential Revision: https://reviews.llvm.org/D140498	2023-01-06 09:25:05 -08:00
Chen Zheng	85edf1fc70	[PowerPC] remove the ctr clobbers check related to TLS access Dynamic tls access model will be lowered to MI which clobbers CTR in the loop in ISEL(ADDItlsgdLADDR) and post-isel CTR loop pass will revert the loop to a normal compare + branch form. So no need to add this clobber check in hardware loop insertion pass now. Reviewed By: nemanjai Differential revision: https://reviews.llvm.org/D140367	2023-01-05 21:23:29 -05:00
Chen Zheng	f74324a1f8	[PowerPC] don't generate hardware loop. If the candidate loop already has hardware loop related intrinsics, don't generate hardware loop on PPC. PPC does not support nested hardware loops.	2022-12-19 20:32:29 -05:00
Chen Zheng	b5e1fc19da	[PowerPC] don't check CTR clobber in hardware loop insertion pass We added a new post-isel CTRLoop pass in D122125. That pass will expand the hardware loop related intrinsic to CTR loop or normal loop based on the loop context. So we don't need to conservatively check the CTR clobber now on the IR level. Reviewed By: lkail Differential Revision: https://reviews.llvm.org/D135847	2022-12-04 20:53:49 -05:00
Krzysztof Parzyszek	86fe4dfdb6	TargetTransformInfo: convert Optional to std::optional Recommit: added missing "#include <cstdint>".	2022-12-02 11:42:15 -08:00
Krzysztof Parzyszek	4e12d1836a	Revert "TargetTransformInfo: convert Optional to std::optional" This reverts commit b83711248cb12639e7ef7303cfbb4452b4067e85. Some buildbots are failing.	2022-12-02 11:34:04 -08:00
Krzysztof Parzyszek	b83711248c	TargetTransformInfo: convert Optional to std::optional	2022-12-02 11:27:12 -08:00
Nemanja Ivanovic	4ea121c904	[PowerPC] Fix a number of inefficiencies and issues with atomic code gen There are a few issues with the code we generate for atomic operations and the way we generate it: - Hard coded CR0 for compares - Order of operands for compares not conducive to emitting compare-immediate or for CSE of compares - Missing MachineMemOperand for st[bhwd]cx intrinsics - Missing intrinsic properties for the same - Unnecessary blocks with store conditional instructions to clear reservation (which ends up hindering performance) - Move from CR instructions just to compare the result of a store conditional with zero (even though it is a record-form) This patch aims to resolve all of those issues. Differential revision: https://reviews.llvm.org/D134783	2022-10-03 19:55:29 -05:00
Philip Reames	c9608d57b8	[TTI] Plumb through OperandValueInfo in getMemoryOpCost [NFC] This has the effect of exposing the power-of-two property for use in memory op costing, but no target actually uses it yet. The main point of this change is simple consistency with the recently changes getArithmeticInstrCost, and to remove the last (interface) use of OperandValueKind.	2022-08-23 07:55:42 -07:00
Philip Reames	104fa367ee	[TTI] Use OperandValueInfo in getArithmeticInstrCost implementation [NFC] This change completes the process of replacing OperandValueKind and OperandValueProperties which were previously passed independently in this API with a single container class which contains both. This is the change which motivated the whole sequence which preceeded it. In an original spike version of this change, I'd noticed a nasty bug: I'd changed the signature without changing names, and as result, we silently passed additional information through a callsite which previously dropped the power-of-two fact. This might be harmless in most cases, but at least a couple clearly dependend for correctness on not passing that property through. I did my best to split off prior changes which reduced the scope of this one, and which made it possible to use compiler assistance. For instance, every parameter which changes type in this change also changes name. This was intentional to make sure that every call site possible effected must show up in the diff. This let me audit each one closely.	2022-08-22 15:16:39 -07:00
Ting Wang	d2d77e050b	[PowerPC][Coroutines] Add tail-call check with call information for coroutines Fixes #56679. Reviewed By: ChuanqiXu, shchenz Differential Revision: https://reviews.llvm.org/D131953	2022-08-21 22:20:40 -04:00
Simon Pilgrim	5263155d5b	[CostModel] Add CostKind argument to getShuffleCost Defaults to TCK_RecipThroughput - as most explicit calls were assuming TCK_RecipThroughput (vectorizers) or was just doing a before-vs-after comparison (vectorcombiner). Calls via getInstructionCost were just dropping the CostKind, so again there should be no change at this time (as getShuffleCost and its expansions don't use CostKind yet) - but it will make it easier for us to better account for size/latency shuffle costs in inline/unroll passes in the future. Differential Revision: https://reviews.llvm.org/D132287	2022-08-21 10:54:51 +01:00
Alexey Bataev	d53e245951	[COST][NFC]Introduce OperandValueKind in getMemoryOpCost, NFC. Added OperandValueKind OpdInfo parameter to getMemoryOpCost functions to better estimate cost with immediate values. Part of D126885.	2022-08-19 07:33:00 -07:00
Simon Pilgrim	fdec50182d	[CostModel] Replace getUserCost with getInstructionCost * Replace getUserCost with getInstructionCost, covering all cost kinds. * Remove getInstructionLatency, it's not implemented by any backends, and we should fold the functionality into getUserCost (now getInstructionCost) to make it easier for targets to handle the cost kinds with their existing cost callbacks. Original Patch by @samparker (Sam Parker) Differential Revision: https://reviews.llvm.org/D79483	2022-08-18 11:55:23 +01:00
Daniil Fukalov	7ed3d81333	[NFCI] Move cost estimation from TargetLowering to TargetTransformInfo. TragetLowering had two last InstructionCost related `getTypeLegalizationCost()` and `getScalingFactorCost()` members, but all other costs are processed in TTI. E.g. it is not comfortable to use other TTI members in these two functions overrided in a target. Minor refactoring: `getTypeLegalizationCost()` now doesn't need DataLayout parameter - it was always passed from TTI. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D117723	2022-08-18 00:38:55 +03:00
Fangrui Song	de9d80c1c5	[llvm] LLVM_FALLTHROUGH => [[fallthrough]]. NFC With C++17 there is no Clang pedantic warning or MSVC C5051.	2022-08-08 11:24:15 -07:00
Paul Kirth	d434e40f39	[llvm][NFC] Refactor code to use ProfDataUtils In this patch we replace common code patterns with the use of utility functions for dealing with profiling metadata. There should be no change in functionality, as the existing checks should be preserved in all cases. Reviewed By: bogner, davidxl Differential Revision: https://reviews.llvm.org/D128860	2022-08-03 00:09:45 +00:00
Paul Kirth	6e9bab71b6	Revert "[llvm][NFC] Refactor code to use ProfDataUtils" This reverts commit 300c9a78819b4608b96bb26f9320bea6b8a0c4d0. We will reland once these issues are ironed out.	2022-07-27 21:38:11 +00:00
Paul Kirth	300c9a7881	[llvm][NFC] Refactor code to use ProfDataUtils In this patch we replace common code patterns with the use of utility functions for dealing with profiling metadata. There should be no change in functionality, as the existing checks should be preserved in all cases. Reviewed By: bogner, davidxl Differential Revision: https://reviews.llvm.org/D128860	2022-07-27 21:13:54 +00:00
Congzhe Cao	a9dccb0072	[TargetTransformInfo] Added an opt/llc option for cache line size In some passes we need a valid number of cache line size to do analysis or transformation, e.g., loop cache analysis and loop date prefetch. However, for some backend targets, `TTIImpl->getCacheLineSize()` is not implemented and hence 'TTI.getCacheLineSize()' would just return 0 which eventually might produce invalid result. In this patch we add a user-specified opt/llc option for cache line size. If the option is specified by users we use the value supplied, otherwise we fall-back to the default value obtained from `TTIImpl->->getCacheLineSize()`. The powerpc target already has such an option, this patch generalizes this option to TargetTransformInfo.cpp. Reviewed By: bmahjour, #loopoptwg Differential Revision: https://reviews.llvm.org/D127342	2022-06-16 15:57:51 -04:00
eopXD	6a84579243	[LSR][TTI][PowerPC][SystemZ][X86] Add const-ness to TTI::isLSRCostLess. NFC Reviewed By: Meinersbur Differential Revision: https://reviews.llvm.org/D126350	2022-05-27 15:22:23 -07:00
Qiu Chaofan	d9d15af787	[PowerPC] Treat llvm.fmuladd intrinsic as using CTR This fixes bug 55463, similar to D78668. This is a temporary fix since we will switch to post-isel CTR loop determination in the future. Reviewed By: dim, shchenz Differential Revision: https://reviews.llvm.org/D125746	2022-05-18 15:57:55 +08:00

1 2 3 4 5

238 Commits