4 Commits

Author SHA1 Message Date
Craig Topper
a2b5b584a5 [RISCV] Use register allocation hints to improve use of compressed instructions.
Compressed instructions usually require one of the source registers
to also be the source register. The register allocator doesn't have
that bias on its own.

This patch adds register allocation hints to introduce this bias.
I've started with ADDI, ADDIW, and SLLI. These all have a 5-bit
field for the register. If the source and dest register are the
same they are guaranteed to compress as long as the immediate is
also 6 bits.

This code was inspired by similar code from the SystemZ target.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D138242
2022-11-25 08:39:44 -08:00
Philip Reames
d89d45ca9a [RISCV][InsertVSETVLI] Default to MA not MU
This changes the default value used for mask policy from mask undisturbed to mask agnostic. In hardware, there may be a minor preference for ta/ma, but since this is only going to apply to instructions which don't use the mask policy bit, this is functionally mostly a nop. The main value is to make future changes to using MA when legal for masked instructions easier to review by reducing test churn.

The prior code was motivated by a desire to minimize state transitions between masked and unmasked code. This patch achieves the same effect using the demanded field logic (landed in afb45ff), and there are no regressions I spotted in the test diffs. (Given the size, I have only been able to skim.) I do want to call out that regressions are possible here; the demanded analysis only works on a block local scope right now, so e.g. a tight loop mixing masked and unmasked computation might see an extra vsetvli or two.

Differential Revision: https://reviews.llvm.org/D133803
2022-10-06 07:59:39 -07:00
Philip Reames
b45a262679 [RISCV] Enable fixed length vectors and loop vectorization with same
This change enables the use of RISCV's variable length vector registers for fixed length vectors in the IR, and implicitly enables various IR transforms which generate fixed length vectors if legal (e.g. LoopVectorize). Specifically, this enables fixed length vectors which are known to be inbounds of the underlying variable hardware size.

For context, remember that the +V extension provides a minimum VLEN of 128. The embedded variants provide lower minimums. The analogy here is essentially vectorizing for SSE on a machine which may or may not include AVX2/AVX512. We won't get full utilization by default, but we will get some benefit. And of course, with an explicit mcpu we can vectorize to the exact target hardware.

The LV impact is mostly related to vectorizer robustness. In cases we haven't yet fully implemented scalable vectorization support, we can fall back to fixed length vectorization.

SLP has been disabled for now, even when fixed vectors are enabled.  See a310637 and associated review.  There are a few addiitional code quality issues which need worked through before turning SLP on would be reasonable.

Differential Revision: https://reviews.llvm.org/D131508
2022-08-26 14:45:23 -07:00
Bjorn Pettersson
3a39bb96ca [SelectionDAG] Use correct boolean representation in FoldConstantArithmetic
The description of SETCC says
  /// SetCC operator - This evaluates to a true value iff the condition is
  /// true.  If the result value type is not i1 then the high bits conform
  /// to getBooleanContents.

Without this patch, we sign extended the i1 to the used larger type
regardless of getBooleanContents. This resulted in miscompiles, as
shown in the attached testcase that ended up returning -1 instead of
1 when using -mattr=+v.

Fixes https://github.com/llvm/llvm-project/issues/55168

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D124618
2022-04-28 18:42:16 +02:00