1390 Commits

Author SHA1 Message Date
liqinweng
1f8746cc80 [RISCV][CostModel] Add half type support for the cost model of sqrt/fabs
1. Refactor for costs of sqrt/fabs
2. Add half type support for the cost model of sqrt/fabs

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D132908
2023-01-09 12:57:03 +08:00
liqinweng
f3408739da [RISCV][CostModel] Add cost model for integer abs
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D132999
2023-01-09 11:38:24 +08:00
Alexey Bataev
9b5f62685a [SLP]Fix cost of the broadcast buildvector/gather.
Need to include the cost of the initial insertelement to the cost of the
broadcasts. Also, need to adjust the cost of the gather/buildvector if
the element is inserted into poison/undef vector.

Differential Revision: https://reviews.llvm.org/D140498
2023-01-06 09:25:05 -08:00
Yeting Kuo
1e9e1b9cf8 [VP][RISCV] Add vp.ctlz/cttz and RISC-V support.
The patch also adds expandVPCTLZ and expandVPCTTZ to expand vp.ctlz/cttz nodes
and the cost model of vp.ctlz/cttz.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D140370
2023-01-04 15:15:01 +08:00
Sjoerd Meijer
5c94faba0b [TTI] [AArch64] getMemoryOpCost for ptr types
Opaque ptr types have a size in bits of 0. The legalised type is an i64 or
vector of i64s, which do have a size. Because of this difference in size, target
hook getMemoryOpCost modelled stores of ptr types as extending/truncating
load/stores. Now we just check for opaque ptr types and return the legalised
cost. This makes stores of pointers cheaper, and as a result we now SLP
vectorise the changed test case.

Differential Revision: https://reviews.llvm.org/D140193
2022-12-16 15:38:17 +00:00
Sjoerd Meijer
e909c3d31f [CostModel][AArch64] Precommit opaque ptr store tests. NFC. 2022-12-16 15:34:12 +00:00
Nikita Popov
36ec97a575 [CostModel] Convert some tests to opaque pointers (NFC)
These required some manual fixup.
2022-12-15 09:57:59 +01:00
Nikita Popov
9c1dca3c2f [CostModel] Convert test to opaque pointers (NFC)
Replace GEP index from 0 to 1 so it is not a trivial GEP.
2022-12-15 09:52:29 +01:00
Nikita Popov
68c50b111d [CostModel] Convert some tests to opaque pointers (NFC) 2022-12-15 09:50:34 +01:00
Yeting Kuo
ad68586a37 [VP][RISCV] Add vp.ctpop and RISC-V support.
The patch also adds expandVPCTPOP in TargetLowering to expand VP_CTPOP nodes.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D139920
2022-12-14 09:47:44 +08:00
Roman Lebedev
64d46e141c
[NFC][Costmodel][X86] Replication shuffle: AVX512F can promote i1 to i32.
As the added codegen test coverage shows,
there isn't that much difference between AVX512DQI and
baseline AVX512F codegen, DQI added `vpmovm2d`/`vpmovd2m`,
but with just the Foundation we can use `vpternlogd`/`vptestmd`
to do the same.
2022-12-13 21:21:07 +03:00
Roman Lebedev
ff5fcda430
[x86][Costmodel] AVX512VL: add missing costs for v8 i1<->i32 casts
This would come up as a regression in the follow-up Replication-of-i1 patch.

https://godbolt.org/z/fxr9Mzssr
2022-12-13 21:21:07 +03:00
liqinweng
6efb45f5ab [AARCH64][CostModel] Modified the cost of mask vector load/store
Reviewed By: david-arm

Differential Revision: https://reviews.llvm.org/D134413
2022-12-09 14:11:21 +08:00
Alex Richardson
9114ac67a9 Overload all llvm.annotation intrinsics for globals argument
The global constant arguments could be in a different address space
than the first argument, so we have to add another overloaded argument.
This patch was originally made for CHERI LLVM (where globals can be in
address space 200), but it also appears to be useful for in-tree targets
as can be seen from the test diffs.

Differential Revision: https://reviews.llvm.org/D138722
2022-12-07 18:29:18 +00:00
Yeting Kuo
0f8c761c48 [VP][RISCV] Recommit "Add vp.fshl/fshr and RISC-V support."
This reverts commit 7883e5b061bdbbe8bee5f479ebe911db5045b7e9.

The original commit was reverted that it didn't update test files after D136263
landed. The recommit fixed those.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D139509
2022-12-07 15:58:12 +08:00
Kazu Hirata
7883e5b061 Revert "[VP][RISCV] Add vp.fshl/fshr and RISC-V support."
This reverts commit 70de0e014013b4d97febe6704881a9a8c893d078.

I'm seeing:

Failed Tests (2):
  LLVM :: CodeGen/RISCV/rvv/fixed-vectors-fshr-fshl-vp.ll
  LLVM :: CodeGen/RISCV/rvv/fshr-fshl-vp.ll

Also reported at:

https://lab.llvm.org/buildbot/#/builders/123/builds/14531
2022-12-06 22:27:43 -08:00
Yeting Kuo
8c8a6e1488 [RISCV] Add basic cost model for vp float rounding instructions.
Reviewed By: craig.topper, reames

Differential Revision: https://reviews.llvm.org/D137766
2022-12-07 14:15:13 +08:00
Yeting Kuo
70de0e0140 [VP][RISCV] Add vp.fshl/fshr and RISC-V support.
The patch made VectorLegalizer expand ISD::VP_FSHL and ISD::VP_FSHR to
achieve the codegen.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D138379
2022-12-07 12:16:36 +08:00
liqinweng
cfd73186db [RISCV][CostModel] Add a test for reverse shuffles cost on RISCV, NFC
Reviewed By: benshi001

Differential Revision: https://reviews.llvm.org/D134519
2022-12-07 10:16:20 +08:00
jacquesguan
d11cc69143 [RISCV][NFC] Add test coverage for insertelement/extractelement of widen vector type.
Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D135534
2022-12-06 15:16:59 +08:00
Matt Arsenault
a74c5707be Fix some test files with executable permissions 2022-12-02 17:12:03 -05:00
Philip Reames
73eacf94e0 [RISCV] Incorporate LMUL into costs for arithmetic and shuffles
This reuses the routine implemented in 0e6f0b7 to implement several existing TODOs. Many of the operations scale linearly with LMUL; this change represents that in the cost model.

Differential Revision: https://reviews.llvm.org/D139039
2022-12-01 10:46:27 -08:00
Philip Reames
7d82c99403 [RISCV][TTI] Account for constant materialization cost when costing arithmetic operations
At the IR level, we generally assume that constants are free to materialize. However, for RISCV due to some quirks of the ISA, materializing arbitrary constants can be rather expensive. We frequently fallback to constant pool loads.

We've been slowly moving in the direction of modeling the cost of the remat as part of the instruction cost. This has the effect of disincentivizing vectorization - mostly SLP - when we'd have to materialize an expensive constant.

We need better modeling of which constants are expensive and not, but the moment let's be consistent with how we model arithmetic and memory instructions. The difference between the two is that arithmetic can sometimes fold a splat operation which stores can not.

Differential Revision: https://reviews.llvm.org/D138941
2022-11-30 07:20:51 -08:00
David Green
f2a92db29e [AArch64] Don't treat SVE scalable extends as free widening instructions
The logic in isWideningInstruction handles instructions like uaddw and
smull, where 'add(x, zext(y))' or 'mul(sext(x), sext(y))' can be
converted to single instructions, making the extends free. This doesn't
apply the same to SVE instructions though.
https://godbolt.org/z/695d3nhGd

(There are instructions like SMULLT/B, but they require top/bottom lane
interleaving. That is similar to MVE instructions, which required a
special pass to perform the lane interleaving).

This patch just bails out of the call to isWideningInstruction if the
vector is scalable, getting a more accurate cost.

Differential Revision: https://reviews.llvm.org/D138591
2022-11-30 13:09:48 +00:00
ShihPo Hung
0e6f0b7cc3 [RISCV] Add cost model for fixed broadcast shuffle
This patch adds basic broadcast shuffle costs in order to enable SLP vectorization.
And adds `getLMULCost` to consider reciprocal throughput for different LMUL.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D137276
2022-11-30 04:58:52 -08:00
Philip Reames
3c9d247112 [RISCV] Add test coverage for vector constant materialization costs on arithmetic instructions 2022-11-29 12:00:58 -08:00
Philip Reames
e726c5879a [RISCV] Add cost model coverage for vector arithmetic 2022-11-29 11:50:52 -08:00
Mateja Marjanovic
595a08847a [AMDGPU] Add support for new LLVM vector types
Add VReg, AReg and SReg on AMDGPU for bit widths: 288, 320, 352 and 384.

Differential Revision: https://reviews.llvm.org/D138205
2022-11-29 17:02:04 +01:00
David Green
57dc4a8cab [AArch64] Extend testing for widening conditions under SVE. NFC 2022-11-29 15:53:39 +00:00
Philip Reames
db07d79ab0 [RISCV] Add cost model for integer and float vector arithmetic instructions.
This patch implements getArithmeticInstrCost for RISCV, supports cost
model for integer and float vector arithmetic instructions.

Differential Revision: https://reviews.llvm.org/D133552 (Original patch by jacquesguan.  Subset by me with todos added.)
2022-11-28 09:04:38 -08:00
Zain Jaffal
6e4cea55f0 [AArch64] Fix cost model for udiv instruction when one of the operands is a uniform constant
Currently the model over estimates the cost of a udiv instruction with one constant. The correct cost for a udiv instruction is
insert_cost * extract_cost * num_elements

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D135991
2022-11-28 10:38:17 +02:00
Haohai Wen
1215e86a0e [CostModel][X86] Fix permute latency cost
Avx512 permute latency should be 3 instead of 1.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D138427
2022-11-23 19:17:16 +08:00
Haohai Wen
2dfe76e989 [CostModel][X86] Add CostKinds test coverage for shufflevector instruction
Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D138485
2022-11-23 10:30:48 +08:00
Yeting Kuo
ed9638c44b [VP][RISCV] Add vp.nearbyint and RISC-V support.
nearbyint has the property to execute without exception.
For not modifying fflags, the patch added new machine opcode
PseudoVFROUND_NOEXCEPT_V that expands vfcvt.x.f.v and vfcvt.f.x.v between a pair
of frflags and fsflags.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D137685
2022-11-16 14:05:35 +08:00
Yeting Kuo
5c3ca10b09 [VP][RISCV] Add vp.bswap and RISC-V support.
The patch also added function expandVPBSWAP to expand ISD::VP_BSWAP nodes.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D137928
2022-11-16 11:36:38 +08:00
Roman Lebedev
11abb7fedb
[NFC][X86][Costmodel] Drop reduntant interleaved cost test coverage
These are already covered by the more general tests i've added.
2022-11-15 21:30:06 +03:00
Roman Lebedev
8e37b53360
[X86] Rewrite getScalarizationOverhead()
All of our insert/extract ops work on 128-bit lanes.

For `Insert`, we need to extract affected 128-bit lane,
unless it's being fully overwritten (FIXME: do we need to be
careful about legalization-induced padding that we obviously don't demand?),
perform insertions, and then insert the 128-bit lane back.

But hold on. If we are operating on an 256-bit legal vector,
and thus have two 128-bit subvectors, and are fully overwriting them both,
we don't actually need to insert *both* subvectors,
only the second one, into the implicitly-widened first one.

Also, `Insert` wasn't actually querying the costs,
but just assuming them to be `1`.

`getShuffleCost(TTI::SK_ExtractSubvector)` notes:
```
  // Note that in general, the insertion starting at the beginning of a vector
  // isn't free, because we need to preserve the rest of the wide vector.
```
... so as far as i can tell, we didn't account for that.

I was hoping this would allow vectorization at a higher VF at one case i looked at,
but the subvector insertion cost is still dis-advising that.

The change for `Extract` is NFC, and is for consistency only,
i wanted to get rid of of that weird explicit discounting of insertion of 0'th element,
since the general code should already deal with that.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D137913
2022-11-15 21:07:12 +03:00
Philip Reames
73482b457e [RISCV] Fix cost of legal fixed length masked load and stores
We can cost them the same way as a scalable masked load/store. By hitting the default path, we were costing them as if they were being scalarized. This is a significant over estimate.

Differential Revision: https://reviews.llvm.org/D137218
2022-11-02 07:24:38 -07:00
Yeting Kuo
71e4e35581 [VP][RISCV] Add vp.rint and RISC-V support.
FRINT uses dynamic rounding mode instead of static rounding mode. The patch
rename VFCVT_X_F_VL to VFCVT_RM_X_F_VL for static rounding mode uses and added
new ISDNode VFCVT_X_F_VL directly selected to PseudoVFCVT_X_F_V.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D136662
2022-11-01 14:52:47 +08:00
Craig Topper
e94dc58dff [RISCV] Inline scalar ceil/floor/trunc/rint/round/roundeven.
This avoids the call overhead as well as the the save/restore of
fflags and the snan handling in the libm function.

The save/restore of fflags and snan handling are needed to be
correct for -ftrapping-math. I think we can ignore them in the
default environment.

The inline sequence will generate an invalid exception for nan
and an inexact exception if fractional bits are discarded.

I've used a custom inserter to explicitly create the control flow
around the float->int->float conversion.

We can probably avoid the final fsgnj after the conversion for
no signed zeros FMF, but I'll leave that for future work.

Note the comparison constant is slightly different than glibc uses.
They use 1<<53 for double, I'm using 1<<52. I believe either are valid.
Numbers >= 1<<52 can't have any fractional bits. It's ok to do the
float->int->float conversion on numbers between 1<<53 and 1<<52 since
they will all fit in 64. We only have a problem if the double can't fit
in i64

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D136508
2022-10-26 14:36:49 -07:00
Bjorn Pettersson
ec9ccb1668 [test] Use -passes syntax in Analysis tests
Another step towards getting rid of dependencies to the legacy
pass manager.

Primary change here is to just do -passes=foo instead of -foo in
simple situations (when running a single pass). But also
updated a few test running multiple passes.
2022-10-21 20:38:42 +02:00
Craig Topper
020450211b [RISCV] Add missing vscale x 1 cost model entries and tests.
Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D136411
2022-10-21 09:05:59 -07:00
Simon Pilgrim
ca5fbac783 [CostModel][X86] Remove duplicate RUN line from cttz cost tests 2022-10-21 14:07:03 +01:00
Craig Topper
851669792f [RISCV] Add vscale x 1 cost model tests for compares. NFC 2022-10-20 21:36:10 -07:00
Jolanta Jensen
66e3589cd7 [NFC][CostModel] Added floating point frem test for SVE
Differential Revision: https://reviews.llvm.org/D136241
2022-10-19 19:34:14 +00:00
David Green
de6dfbbb30 [ARM] Fix for MVE i128 vector icmp costs.
We were hitting an assert as the legalied type needn't be a vector.

Fixes #58364
2022-10-14 18:49:25 +01:00
Simon Pilgrim
a640aa5bfd [CostModel][X86] Add insertelement costs into a known base vector value
We were only testing inserting into undef/poison base vectors

Test coverage for Issue #58261
2022-10-11 12:07:25 +01:00
Craig Topper
de0de294eb [RISCV] Update cost of vector roundeven to match round which uses the same sequence but a different FRM value.
Reviewed By: reames, eopXD

Differential Revision: https://reviews.llvm.org/D134978
2022-09-30 20:01:35 -07:00
Philip Reames
02bfe2de7c [RISCV] Adjust vector immediate store materialization cost
This change updates the costs to make constant pool loads match their actual cost, and adds the broadcast special case to avoid too many regressions. We really need more information about the constants being rematerialized, but this is an incremental improvement.

Differential Revision: https://reviews.llvm.org/D134746
2022-09-29 07:37:13 -07:00
eopXD
02a982829c [RISCV] Add lowering for llvm.roundeven
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D134785
2022-09-29 06:08:14 -07:00