28 Commits

Author SHA1 Message Date
Luke Lau
d0d864f6f4 [SLP] Explicitly pass AccessTy to getGEPCost
Building on D149889, this patch updates SLP to pass the vector type as
the AccessTy to getGEPCost.
This should have the effect of GEPs being costed for more often instead
of being treated as foldable into the address mode and thus free, as
some architectures, notably RISC-V, do not have offset+reg addressing
modes for vector memory accesses.

Note that in SLP, GEPs are costed in two places: getPointersChainCost
and GetGEPCostDiff.

Reviewed By: ABataev

Differential Revision: https://reviews.llvm.org/D153570
2023-06-29 18:42:24 +01:00
Luke Lau
2b28f8f044 [RISCV][SLP] Add tests for unprofitable SLP vectorization due to GEP. NFC
Reviewed By: ABataev

Differential Revision: https://reviews.llvm.org/D149888
2023-06-29 18:42:22 +01:00
Luke Lau
b87a09301f [RISCV] Add tests for cost modelling constants in phis
Reviewed By: ABataev

Differential Revision: https://reviews.llvm.org/D149168
2023-06-29 13:55:22 +01:00
Philip Reames
7f26c27e03 [RISCV] Enable SLP by default (when vectors are available)
I propose that we go ahead and enabled SLP by default. Over the last few weeks, @luke and I have been working through codegen issues seen at small VLs from a couple of SPEC workloads. We still have a ways to go to get optimal codegen, but we're at the point where having a single configuration we're all tuning against is probably the right default.

As a bit of history, I introduced this TTI hook back in a310637132 back in August of last year to unblock enabling LoopVectorizer. At the time, we had a couple known issues: constant materialization, address generation, and a general lack of maturity of small fixed vector codegen. By now, each of these has had significant investment. I can't say any of them are completely fixed, but we're no longer seeing instances of them every place we look.

What we're mostly seeing at this point is a long tail of code gen opportunities, many involving build vectors, shuffles, and extract patterns. I have a couple patches up to continue iterating on those issues, but I don't think they need to be blockers for enabling SLP.

Differential Revision: https://reviews.llvm.org/D152750
2023-06-14 09:49:58 -07:00
Luke Lau
c27a0b21c5 [SLP][RISCV] Account for offset folding in getPointersChainCost
For a GEP in a pointer chain, if:
1) a pointer chain is unit-strided
2) the base pointer wasn't folded and is sitting in a register somewhere
3) the distance between the GEP and the base pointer is small enough and
   can be folded into the addressing mode of the using load/store

Then we can exclude that GEP from the total cost of the pointer chain,
as it will likely be folded away.

In order to check if 3) holds, we need to know the type of memory access
being made by the users of the pointer chain. For that, we need to pass
along a new argument to getPointersChainCost. (Using the source pointer
type of the GEP isn't accurate, see https://reviews.llvm.org/D149889 for
more details).

Also note that 2) is currently an assumption, and could be modelled more
accurately.

This prevents some unprofitable cases from being SLP vectorized on
RISC-V by making the scalar costs cheaper and closer to the actual
codegen.

For now the getPointersChainCost hook is duplicated for RISC-V to prevent
disturbing other targets, but could be merged back in and shared with
other targets in a following patch.

Reviewed By: ABataev

Differential Revision: https://reviews.llvm.org/D149654
2023-05-22 13:55:30 +01:00
Luke Lau
53afdb712d [SLP][RISCV] Add test for folding offsets in GEP pointer chains 2023-05-22 10:11:02 +01:00
Luke Lau
8288d39b4c [RISCV] Add test for unprofitable SLP vectorization
Reviewed By: ABataev

Differential Revision: https://reviews.llvm.org/D149653
2023-05-19 14:45:39 +01:00
Tobias Hieta
f84bac329b
[NFC][Py Reformat] Reformat lit.local.cfg python files in llvm
This is a follow-up to b71edfaa4ec3c998aadb35255ce2f60bba2940b0
since I forgot the lit.local.cfg files in that one.

Reformatting is done with `black`.

If you end up having problems merging this commit because you
have made changes to a python file, the best way to handle that
is to run git checkout --ours <yourfile> and then reformat it
with black.

If you run into any problems, post to discourse about it and
we will try to help.

RFC Thread below:

https://discourse.llvm.org/t/rfc-document-and-standardize-python-code-style

Reviewed By: barannikov88, kwk

Differential Revision: https://reviews.llvm.org/D150762
2023-05-17 17:03:15 +02:00
ManuelJBrito
8b56da5e9f [IR] Change shufflevector undef mask to poison
With this patch an undefined mask in a shufflevector will be printed as poison.
This change is done to support the new shufflevector semantics
for undefined mask elements.

Differential Revision: https://reviews.llvm.org/D149210
2023-04-27 14:41:10 +01:00
Luke Lau
f23ea4cbd4 [RISCV] Model select and insertsubvector shuffle kinds
Selects get lowered to a vmerge with a mask, and insertsubvectors get
lowered to a vslideup.

Differential Revision: https://reviews.llvm.org/D146747
2023-03-24 17:30:32 +00:00
Luke Lau
1c9094a201 [RISCV] Add test case for two equivalent reductions
They are functionally equivalent but currently one fails to vectorize
because the cost of an insert subvector shuffle is too expensive.
D146747 will update the cost of these types of shuffles, so add a test
case for it.
2023-03-24 17:30:32 +00:00
Luke Lau
40b408cb05 [RISCV] Enable SLP in RISC-V SLP reduction tests
Horizontal reduction can still kick in even when the max VF is set to 0,
but strange stuff can happen as it affects the cost model.
Enable it for these tests as eventually the goal will be to have SLP
enabled.
2023-03-24 17:30:32 +00:00
Luke Lau
8d16c6809a [RISCV] Increase default vectorizer LMUL to 2
After some discussion and experimentation, we have seen that changing the default number of vector register bits to LMUL=2 strikes a sweet spot.
Whilst we could be clever here and make the vectorizer smarter about dynamically selecting an LMUL that
a) Doesn't affect register pressure
b) Suitable for the microarchitecture
we would need to teach its heuristics about RISC-V register grouping specifics.
Instead this just does the easy, pragmatic thing by changing the default to a safe value that doesn't affect register pressure signifcantly[1], but should increase throughput and unlock more interleaving.

[1] Register spilling when compiling sqlite at various levels of `-riscv-v-register-bit-width-lmul`:

LMUL=1    2573 spills
LMUL=2    2583 spills
LMUL=4    2819 spills
LMUL=8    3256 spills

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D143723
2023-03-23 10:33:50 +00:00
Ben Shi
9855fe4568 [RISCV][NFC] Add more tests for SLP vectorization (binops on load/store)
Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D146025
2023-03-23 09:01:04 +08:00
Luke Lau
e69f8bac42 [RISCV][NFC] Add test case for SLP reduction vectorization failure
Horizontal reductions still occur on RISC-V, despite the maximum SLP VF
reported back by TTI being 1, to disable SLP.
This can cause the cost model to think it can vectorize a gather into
smaller, widened loads, when it will actually fail to do so.
This should ultimately be fixed whenever SLP is re-enabled for RISC-V at
some point.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D146529
2023-03-21 15:57:52 +00:00
Ben Shi
ce455f4434 [RISCV][NFC] Add more floating point tests for SLP vectorization
Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D146108
2023-03-16 13:30:16 +08:00
Ben Shi
72ce9d1ccd [RISCV][NFC] Add tests for SLP vectorization of smin/smax/umin/umax
Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D146015
2023-03-16 13:30:16 +08:00
Ben Shi
013235a200 [RISCV][NFC] Add tests for SLP vectorization of math functions
RISCV has "vfabs.v" and "vfsqrt.v" so math functions abs and sqrt
can be SLP vectorized. But others exp/log/sin/asin/sinh/asinh/...
can not.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D145562
2023-03-10 07:34:21 +08:00
Nikita Popov
580210a0c9 [SLP] Convert some tests to opaque pointers (NFC) 2022-12-23 10:02:57 +01:00
Bjorn Pettersson
3be72f4029 [test][SLPVectorizer] Use -passes syntax in RUN lines. NFC 2022-10-13 10:44:38 +02:00
Philip Reames
02bfe2de7c [RISCV] Adjust vector immediate store materialization cost
This change updates the costs to make constant pool loads match their actual cost, and adds the broadcast special case to avoid too many regressions. We really need more information about the constants being rematerialized, but this is an incremental improvement.

Differential Revision: https://reviews.llvm.org/D134746
2022-09-29 07:37:13 -07:00
Philip Reames
17f2ee804a [RISCV][SLP] Add test coverage for stores of constants 2022-09-27 07:53:33 -07:00
Philip Reames
a310637132 [RISCV] Disable SLP vectorization by default due to unresolved profitability issues
This change implements a TTI query with the goal of disabling slp vectorization on RISCV. The current default configuration disables SLP already, but its current tied to the ability to lower fixed length vectors. Over in D131508, I want to enable fixed length vectors for purposes of LoopVectorizer, but preliminary analysis has revealed a couple of SLP specific issues we need to resolve before enabling it by default. This change exists to allow us to enable LV without SLP.

Differential Revision: https://reviews.llvm.org/D132680
2022-08-26 14:11:22 -07:00
Alexey Bataev
0e7ed32c71 [SLP]Cost for a constant buildvector.
In many cases constant buildvector results in a vector load from a
constant/data pool. Need to consider this cost too.

Differential Revision: https://reviews.llvm.org/D126885
2022-08-19 08:02:42 -07:00
Philip Reames
1062595808 [RISCV][SLP] Add some basic test coverage 2022-08-11 13:05:14 -07:00
Philip Reames
7d6e8f2a96 [slp] Delete dead scalar instructions feeding vectorized instructions
If we vectorize a e.g. store, we leave around a bunch of getelementptrs for the individual scalar stores which we removed. We can go ahead and delete them as well.

This is purely for test output quality and readability. It should have no effect in any sane pipeline.

Differential Revision: https://reviews.llvm.org/D122493
2022-03-28 20:10:13 -07:00
eopXD
3cf15af2da [RISCV] Remove experimental prefix from rvv-related extensions.
Extensions affected: +v, +zve*, +zvl*

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D117860
2022-01-22 20:18:40 -08:00
Kito Cheng
f142c45f1e [RISCV] Set getMinVectorRegisterBitWidth to 16 if enable fixed length vector code gen for RVV
getMinVectorRegisterBitWidth means what vector types is supported in
this target, and actually RISC-V support all fixed length vector types with
vector length less than `getMinRVVVectorSizeInBits`, so set it to 16,
means 2 x i8, that is minimal fixed length vector size in theory.

That also fixed one issue, some testcase migth become non-vectorizable
when `-riscv-v-vector-bits-min` set to larger value, because the vector size is
smaller than `-riscv-v-vector-bits-min`.

For example, following code can vectorize by SLP with
`-riscv-v-vector-bits-min=128` or `-riscv-v-vector-bits-min=256`, but
can't vectorize `-riscv-v-vector-bits-min=512` or larger:

```
void foo(double *da) {
  da[0] = 0;
  da[1] = 1;
  da[2] = 2;
  da[3] = 3;
}
```

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D116534
2022-01-08 11:16:21 +08:00