7 Commits

Author SHA1 Message Date
Luke Lau
15f9cf164c [LV][RISCV] Don't interleave scalable vector loops
It's less clear with scalable vectors than fixed length vectors that
interleaving exposes more ILP, as scalable vectors can be thought of a
sort of hardware form of interleaving, especially with larger LMULs.
This also addresses the unexpected additional unrolling that occurs when
using larger LMULs in the loop vectorizer.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D144485
2023-02-22 10:15:11 +00:00
Sander de Smalen
5a115452c4 Reland D143267: [LoopVectorize] Use DataLayout::getIndexType instead of i32 for non-constant GEP indices.
Fixed issue where 'ConstantInt::get(IndextTy, -Part)' was executed with the wrong type for Part,
e.g. IndexTy was i64, but Part was 'unsigned', which led to things like 'mul i64 .., 4294967292',
which was obviously wrong.

Also changed sve-vector-reverse.ll to be vectorized with UF>1 to test this.

This reverts commit 1f01cdda68614dba12af3cc3aff38541d0abcc6b.
2023-02-09 09:42:29 +00:00
Sander de Smalen
1f01cdda68 Revert "[LoopVectorize] Use DataLayout::getIndexType instead of i32 for non-constant GEP indices."
This patch causes a regression, so reverting it while I investigate the issue.

This reverts commit e6eb84a191ca2a1afd5789c5bb398da68bb6065e.
2023-02-08 15:46:52 +00:00
Sander de Smalen
e6eb84a191 [LoopVectorize] Use DataLayout::getIndexType instead of i32 for non-constant GEP indices.
This is specifically relevant for loops that vectorize using a scalable VF,
where the code results in:

  %vscale = call i32 llvm.vscale.i32()
  %vf.part1 = mul i32 %vscale, 4
  %gep = getelementptr  ..., i32 %vf.part1

Which InstCombine then changes into:

  %vscale = call i32 llvm.vscale.i32()
  %vf.part1 = mul i32 %vscale, 4
  %vf.part1.zext = sext i32 %vf.part1 to i64
  %gep = getelementptr  ..., i32 %vf.part1.zext

D143016 tried to remove these extends, but that only works when
the call to llvm.vscale.i32() has a single use. After doing any
kind of CSE on these calls the combine no longer kicks in.

It seems more sensible to ask DataLayout what type to use, rather
than relying on InstCombine to insert the extend and hoping it can
fold it away.

I've only changed this for indices that are not constant, because
I vaguely remember there was a reason for sticking with i32. It
would also mean patching up loads more tests.

Reviewed By: paulwalker-arm

Differential Revision: https://reviews.llvm.org/D143267
2023-02-07 11:47:51 +00:00
Nikita Popov
5b40015063 [LoopVectorize] Convert some tests to opaque pointers (NFC)
For these tests update_test_checks.py had to be rerun.
2022-12-14 15:27:31 +01:00
Roman Lebedev
be51fa4580
[NFC] Port all runlines for LoopVectorize pass tests to -passes syntax 2022-12-05 22:17:30 +03:00
jacquesguan
45bae1be90 [RISCV][test] Add inloop reduction vectorize test. NFC 2022-08-04 15:06:44 +08:00