227 Commits

Author SHA1 Message Date
Kazu Hirata
f71cb9dbb7
[PowerPC] Remove unused includes (NFC) (#116163)
Identified with misc-include-cleaner.
2024-11-14 07:55:18 -08:00
RolandF77
06c8210a67
update P7 32-bit partial vector load cost (#108261)
Update cost model to reflect codegen change to use lfiwzx 
for 32-bit partial vector loads on pwr7 with
https://github.com/llvm/llvm-project/pull/104507.
2024-10-03 12:28:43 -04:00
Philip Reames
d288574363
[TTI][RISCV] Model cost of loading constants arms of selects and compares (#109824)
This follows in the spirit of 7d82c99403f615f6236334e698720bf979959704,
and extends the costing API for compares and selects to provide
information about the operands passed in an analogous manner. This
allows us to model the cost of materializing the vector constant, as
some select-of-constants are significantly more expensive than others
when you account for the cost of materializing the constants involved.

This is a stepping stone towards fixing
https://github.com/llvm/llvm-project/issues/109466. A separate SLP patch
will be required to utilize the new API.
2024-09-25 07:25:57 -07:00
azhan92
1df4d866cc
[PowerPC] Add support for -mcpu=pwr11 / -mtune=pwr11 (#99511)
This PR adds support for -mcpu=pwr11/power11 and -mtune=pwr11/power11 in
clang and llvm.
2024-07-23 09:49:41 -04:00
David Green
4ac2721e51
[AArch64] Add costs for ST3 and ST4 instructions, modelled as store(shuffle). (#87934)
This tries to add some costs for the shuffle in a ST3/ST4 instruction,
which are represented in LLVM IR as store(interleaving shuffle). In
order to detect the store, it needs to add a CxtI context instruction to
check the users of the shuffle. LD3 and LD4 are added, LD2 should be a
zip1 shuffle, which will be added in another patch.

It should help fix some of the regressions from #87510.
2024-04-09 16:36:08 +01:00
Il-Capitano
308ed0233a
[Intrinsics] Make patchpoint.i64 generic on its return type (#85911)
Currently patchpoints can only have two result types, `void` and `i64`.
This limits the result to general purpose registers.
This patch makes `patchpoint.i64` an overloadable intrinsic, allowing
result values that can fit in a single register (e.g. integers,
pointers, floats).
2024-03-26 19:08:52 +05:30
Chen Zheng
8d1046ae49
[PowerPC] adjust cost for extract i64 from vector on P9 and above (#82963)
https://godbolt.org/z/Ma347Tx1W
2024-03-04 09:37:11 +08:00
Chen Zheng
80f3bb4cf2
[PowerPC] adjust cost for vector insert/extract with non const index (#79092)
P9 has vxform `Vector Extract Element Instructions` like `vextuwrx` and
P10 has vxform `Vector Insert Element instructions` like `vinsd`. Update
the instruction cost reflecting these instructions.

Fixes https://github.com/llvm/llvm-project/issues/50249
2024-02-19 09:57:49 +08:00
RolandF77
4beea6b195
[PowerPC] lower partial vector store cost (#78358)
There are matching store opcodes (stfd, stxsiwx) for the load opcodes
that make 32-bit and 64-bit vector operations cheap with VSX, so stores
should also be cheap.
2024-01-23 16:07:18 -05:00
Kazu Hirata
286ef12b47 [Target] Remove unnecessary includes (NFC) 2023-12-07 21:03:56 -08:00
Sander de Smalen
81b7f115fb
[llvm][TypeSize] Fix addition/subtraction in TypeSize. (#72979)
It seems TypeSize is currently broken in the sense that:

  TypeSize::Fixed(4) + TypeSize::Scalable(4) => TypeSize::Fixed(8)

without failing its assert that explicitly tests for this case:

  assert(LHS.Scalable == RHS.Scalable && ...);

The reason this fails is that `Scalable` is a static method of class
TypeSize,
and LHS and RHS are both objects of class TypeSize. So this is
evaluating
if the pointer to the function Scalable == the pointer to the function
Scalable,
which is always true because LHS and RHS have the same class.

This patch fixes the issue by renaming `TypeSize::Scalable` ->
`TypeSize::getScalable`, as well as `TypeSize::Fixed` to
`TypeSize::getFixed`,
so that it no longer clashes with the variable in
FixedOrScalableQuantity.

The new methods now also better match the coding standard, which
specifies that:
* Variable names should be nouns (as they represent state)
* Function names should be verb phrases (as they represent actions)
2023-11-22 08:52:53 +00:00
Youngsuk Kim
e69e066bfe [llvm][PowerPC] Remove no-op ptr-to-ptr bitcasts (NFC)
Opaque ptr cleanup effort.
2023-11-01 16:40:32 -05:00
Fangrui Song
8e247b8f47 Replace TypeSize::{getFixed,getScalable} with canonical TypeSize::{Fixed,Scalable}. NFC 2023-10-27 00:30:41 -07:00
Roland Froese
4d425f8663 [PowerPC] vector cost model add cost to extract i1
Try to avoid some unprofitable predication on PPC. Recognize in the cost model that computing on i1 values will require extra mask or compare operation.

Differential Revision: https://reviews.llvm.org/D155876
2023-08-14 17:04:11 -04:00
Ting Wang
65f68812d3 [PowerPC] update PPCTTIImpl::supportsTailCallFor() check conditions
This patch reuse `PPCTargetLowering::isEligibleForTCO()` to check
`PPCTTIImpl::supportsTailCallFor()`.

Fixes #59315

Reviewed By: shchenz

Differential Revision: https://reviews.llvm.org/D140369
2023-02-28 22:29:16 -05:00
Luke Lau
b02b1e0ed6 [LV][NFC] Use ElementCount for getMaxInterleaveFactor
In order to allow targets to disable interleaving for scalable vectors, pass the entire VF's ElementCount to getMaxInterleaveFactor.
This is based off of the approach used here: 8d36708507

The plan would then be to disable interleaving on scalable VFs on RISC-V in a follow up patch.
See https://reviews.llvm.org/D143723#4132349

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D144474
2023-02-22 10:15:05 +00:00
ShihPo Hung
5fb3a57ea7 [Cost] Add CostKind to getVectorInstrCost and its related users
LoopUnroll estimates the loop size via getInstructionCost(),
but getInstructionCost() cannot pass CostKind to getVectorInstrCost().
And so does getShuffleCost() to getBroadcastShuffleOverhead(),
getPermuteShuffleOverhead(), getExtractSubvectorOverhead(),
and getInsertSubvectorOverhead().

To address this, this patch adds an argument CostKind to these
functions.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D142116
2023-01-21 05:29:24 -08:00
Alexey Bataev
9b5f62685a [SLP]Fix cost of the broadcast buildvector/gather.
Need to include the cost of the initial insertelement to the cost of the
broadcasts. Also, need to adjust the cost of the gather/buildvector if
the element is inserted into poison/undef vector.

Differential Revision: https://reviews.llvm.org/D140498
2023-01-06 09:25:05 -08:00
Chen Zheng
85edf1fc70 [PowerPC] remove the ctr clobbers check related to TLS access
Dynamic tls access model will be lowered to MI which clobbers CTR in
the loop in ISEL(ADDItlsgdLADDR) and post-isel CTR loop pass will revert
the loop to a normal compare + branch form.

So no need to add this clobber check in hardware loop insertion pass now.

Reviewed By: nemanjai

Differential revision: https://reviews.llvm.org/D140367
2023-01-05 21:23:29 -05:00
Chen Zheng
f74324a1f8 [PowerPC] don't generate hardware loop.
If the candidate loop already has hardware loop related intrinsics,
don't generate hardware loop on PPC. PPC does not support nested
hardware loops.
2022-12-19 20:32:29 -05:00
Chen Zheng
b5e1fc19da [PowerPC] don't check CTR clobber in hardware loop insertion pass
We added a new post-isel CTRLoop pass in D122125. That pass will expand
the hardware loop related intrinsic to CTR loop or normal loop based
on the loop context. So we don't need to conservatively check the CTR
clobber now on the IR level.

Reviewed By: lkail

Differential Revision: https://reviews.llvm.org/D135847
2022-12-04 20:53:49 -05:00
Krzysztof Parzyszek
86fe4dfdb6 TargetTransformInfo: convert Optional to std::optional
Recommit: added missing "#include <cstdint>".
2022-12-02 11:42:15 -08:00
Krzysztof Parzyszek
4e12d1836a Revert "TargetTransformInfo: convert Optional to std::optional"
This reverts commit b83711248cb12639e7ef7303cfbb4452b4067e85.

Some buildbots are failing.
2022-12-02 11:34:04 -08:00
Krzysztof Parzyszek
b83711248c TargetTransformInfo: convert Optional to std::optional 2022-12-02 11:27:12 -08:00
Nemanja Ivanovic
4ea121c904 [PowerPC] Fix a number of inefficiencies and issues with atomic code gen
There are a few issues with the code we generate for atomic operations and the way we generate it:

- Hard coded CR0 for compares
- Order of operands for compares not conducive to
  emitting compare-immediate or for CSE of compares
- Missing MachineMemOperand for st[bhwd]cx intrinsics
- Missing intrinsic properties for the same
- Unnecessary blocks with store conditional
  instructions to clear reservation (which ends
  up hindering performance)
- Move from CR instructions just to compare the
  result of a store conditional with zero (even
  though it is a record-form)

This patch aims to resolve all of those issues.

Differential revision: https://reviews.llvm.org/D134783
2022-10-03 19:55:29 -05:00
Philip Reames
c9608d57b8 [TTI] Plumb through OperandValueInfo in getMemoryOpCost [NFC]
This has the effect of exposing the power-of-two property for use in memory op costing, but no target actually uses it yet.  The main point of this change is simple consistency with the recently changes getArithmeticInstrCost, and to remove the last (interface) use of OperandValueKind.
2022-08-23 07:55:42 -07:00
Philip Reames
104fa367ee [TTI] Use OperandValueInfo in getArithmeticInstrCost implementation [NFC]
This change completes the process of replacing OperandValueKind and OperandValueProperties which were previously passed independently in this API with a single container class which contains both.

This is the change which motivated the whole sequence which preceeded it.  In an original spike version of this change, I'd noticed a nasty bug: I'd changed the signature without changing names, and as result, we silently passed additional information through a callsite which previously dropped the power-of-two fact.  This might be harmless in most cases, but at least a couple clearly dependend for correctness on not passing that property through.

I did my best to split off prior changes which reduced the scope of this one, and which made it possible to use compiler assistance.  For instance, every parameter which changes type in this change also changes name.  This was intentional to make sure that every call site possible effected must show up in the diff.  This let me audit each one closely.
2022-08-22 15:16:39 -07:00
Ting Wang
d2d77e050b [PowerPC][Coroutines] Add tail-call check with call information for coroutines
Fixes #56679.

Reviewed By: ChuanqiXu, shchenz

Differential Revision: https://reviews.llvm.org/D131953
2022-08-21 22:20:40 -04:00
Simon Pilgrim
5263155d5b [CostModel] Add CostKind argument to getShuffleCost
Defaults to TCK_RecipThroughput - as most explicit calls were assuming TCK_RecipThroughput (vectorizers) or was just doing a before-vs-after comparison (vectorcombiner). Calls via getInstructionCost were just dropping the CostKind, so again there should be no change at this time (as getShuffleCost and its expansions don't use CostKind yet) - but it will make it easier for us to better account for size/latency shuffle costs in inline/unroll passes in the future.

Differential Revision: https://reviews.llvm.org/D132287
2022-08-21 10:54:51 +01:00
Alexey Bataev
d53e245951 [COST][NFC]Introduce OperandValueKind in getMemoryOpCost, NFC.
Added OperandValueKind OpdInfo parameter to getMemoryOpCost functions to
better estimate cost with immediate values.

Part of D126885.
2022-08-19 07:33:00 -07:00
Simon Pilgrim
fdec50182d [CostModel] Replace getUserCost with getInstructionCost
* Replace getUserCost with getInstructionCost, covering all cost kinds.
* Remove getInstructionLatency, it's not implemented by any backends, and we should fold the functionality into getUserCost (now getInstructionCost) to make it easier for targets to handle the cost kinds with their existing cost callbacks.

Original Patch by @samparker (Sam Parker)

Differential Revision: https://reviews.llvm.org/D79483
2022-08-18 11:55:23 +01:00
Daniil Fukalov
7ed3d81333 [NFCI] Move cost estimation from TargetLowering to TargetTransformInfo.
TragetLowering had two last InstructionCost related `getTypeLegalizationCost()`
and `getScalingFactorCost()` members, but all other costs are processed in TTI.

E.g. it is not comfortable to use other TTI members in these two functions
overrided in a target.

Minor refactoring: `getTypeLegalizationCost()` now doesn't need DataLayout
parameter - it was always passed from TTI.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D117723
2022-08-18 00:38:55 +03:00
Fangrui Song
de9d80c1c5 [llvm] LLVM_FALLTHROUGH => [[fallthrough]]. NFC
With C++17 there is no Clang pedantic warning or MSVC C5051.
2022-08-08 11:24:15 -07:00
Paul Kirth
d434e40f39 [llvm][NFC] Refactor code to use ProfDataUtils
In this patch we replace common code patterns with the use of utility
functions for dealing with profiling metadata. There should be no change
in functionality, as the existing checks should be preserved in all
cases.

Reviewed By: bogner, davidxl

Differential Revision: https://reviews.llvm.org/D128860
2022-08-03 00:09:45 +00:00
Paul Kirth
6e9bab71b6 Revert "[llvm][NFC] Refactor code to use ProfDataUtils"
This reverts commit 300c9a78819b4608b96bb26f9320bea6b8a0c4d0.

We will reland once these issues are ironed out.
2022-07-27 21:38:11 +00:00
Paul Kirth
300c9a7881 [llvm][NFC] Refactor code to use ProfDataUtils
In this patch we replace common code patterns with the use of utility
functions for dealing with profiling metadata. There should be no change
in functionality, as the existing checks should be preserved in all
cases.

Reviewed By: bogner, davidxl

Differential Revision: https://reviews.llvm.org/D128860
2022-07-27 21:13:54 +00:00
Congzhe Cao
a9dccb0072 [TargetTransformInfo] Added an opt/llc option for cache line size
In some passes we need a valid number of cache line size to do analysis or
transformation, e.g., loop cache analysis and loop date prefetch. However,
for some backend targets, `TTIImpl->getCacheLineSize()` is not implemented
and hence 'TTI.getCacheLineSize()' would just return 0 which eventually might
produce invalid result.

In this patch we add a user-specified opt/llc option for cache line size.
If the option is specified by users we use the value supplied, otherwise we
fall-back to the default value obtained from `TTIImpl->->getCacheLineSize()`.
The powerpc target already has such an option, this patch generalizes
this option to TargetTransformInfo.cpp.

Reviewed By: bmahjour, #loopoptwg

Differential Revision: https://reviews.llvm.org/D127342
2022-06-16 15:57:51 -04:00
eopXD
6a84579243 [LSR][TTI][PowerPC][SystemZ][X86] Add const-ness to TTI::isLSRCostLess. NFC
Reviewed By: Meinersbur

Differential Revision: https://reviews.llvm.org/D126350
2022-05-27 15:22:23 -07:00
Qiu Chaofan
d9d15af787 [PowerPC] Treat llvm.fmuladd intrinsic as using CTR
This fixes bug 55463, similar to D78668. This is a temporary fix since
we will switch to post-isel CTR loop determination in the future.

Reviewed By: dim, shchenz

Differential Revision: https://reviews.llvm.org/D125746
2022-05-18 15:57:55 +08:00
Vasileios Porpodas
fa8a9fea47 Recommit "[SLP][TTI] Refactoring of getShuffleCost Args to work like getArithmeticInstrCost"
This reverts commit 6a9bbd9f20dcd700e28738788bb63a160c6c088c.

Code review: https://reviews.llvm.org/D124202
2022-04-26 14:02:40 -07:00
Vasileios Porpodas
39aa202aff Recommit "[SLP] Fix lookahead operand reordering for splat loads." attempt 3, fixed assertion crash.
Original review: https://reviews.llvm.org/D121354

This reverts commit e6ead19b774718113007ecb1a4449d7af0cbcfeb.
2022-03-23 18:32:17 -07:00
Arthur Eubanks
e6ead19b77 Revert "Recommit "[SLP] Fix lookahead operand reordering for splat loads." attempt 2, fixed assertion crash."
This reverts commit 27bd8f94928201f87f6b659fc2228efd539e8245.

Causes crashes, see comments in D121973
2022-03-23 10:57:45 -07:00
Vasileios Porpodas
27bd8f9492 Recommit "[SLP] Fix lookahead operand reordering for splat loads." attempt 2, fixed assertion crash.
Original review: https://reviews.llvm.org/D121354

This reverts commit f7d7d2a08d16356c57f6d2d36bc2fc0589a55df9.
2022-03-22 16:41:55 -07:00
Arthur Eubanks
f7d7d2a08d Revert "Recommit "[SLP] Fix lookahead operand reordering for splat loads.""
This reverts commit 79613185d305013de743cdbd6690e4d77c8af27e.

Causes crashes, see comments in https://reviews.llvm.org/D121973.
2022-03-22 13:33:49 -07:00
Vasileios Porpodas
79613185d3 Recommit "[SLP] Fix lookahead operand reordering for splat loads."
Original review: https://reviews.llvm.org/D121354

The original commit 9136145eb019e1d18c966d4d06a3df349b88cc14 broke the build on several targets.

Differential Revision: https://reviews.llvm.org/D121973
2022-03-21 15:57:32 -07:00
Kazu Hirata
bf039a8620 [Target] Use range-based for loops (NFC) 2022-01-23 22:53:15 -08:00
Qiu Chaofan
8dedf9b58b [PowerPC] Change CTR clobber estimation for 128-bit floating types
Reviewed By: shchenz

Differential Revision: https://reviews.llvm.org/D117459
2022-01-22 23:20:14 +08:00
Nikita Popov
f5ac23b5ae [ArgPromotion][TTI] Pass types to ABI compatibility hook
The areFunctionArgsABICompatible() hook currently accepts a list of
pointer arguments, though what we're actually interested in is the
ABI compatibility after these pointer arguments have been converted
into value arguments.

This means that a) the current API is incompatible with opaque
pointers (because it requires inspection of pointee types) and
b) it can only be used in the specific context of ArgPromotion.
I would like to reuse the API when inspecting calls during inlining.

This patch converts it into an areTypesABICompatible() hook, which
accepts a list of types. This makes the method more generally usable,
and compatible with opaque pointers from an API perspective (the
actual usage in ArgPromotion/Attributor is still incompatible,
I'll follow up on that in separate patches).

Differential Revision: https://reviews.llvm.org/D116031
2021-12-22 09:37:51 +01:00
Nemanja Ivanovic
a3ea9052d6 [PowerPC] Do not increase cost for getUserCost with MMA types
Commit 150681f increases
cost of producing MMA types (vector pair and quad).
However, it increases the cost for getUserCost() which is
used in unrolling. As a result, loops that contain these
types already (from the user code) cannot be unrolled
(even with the user's unroll pragma). This was an unintended
sideeffect. Reverting that portion of the commit to allow
unrolling such loops.

Differential revision: https://reviews.llvm.org/D115424
2021-12-21 13:36:08 -06:00
Mikael Holmen
cb413f208a [PowerPC] Fix gcc warning about unused variable [NFC]
gcc warned about
../lib/Target/PowerPC/PPCTargetTransformInfo.cpp:1401:13: warning: unused variable 'VecTy' [-Wunused-variable]
 1401 |   if (auto *VecTy = dyn_cast<FixedVectorType>(DataType)) {
      |             ^~~~~
2021-12-09 10:31:56 +01:00