635 Commits

Author SHA1 Message Date
Craig Topper
5a6c622afd [RISCV] Remove special case for constant shift amount in FSHL/FSHR lowering to FSL/FSR.
Remove fshl/fshr with constant shift amount isel patterns. Replace
with fsr/fsl with constant isel patterns.

This hack was trying to preserve as much optimization opportunity
for fshl/fshr by constant as possible, but the conversion to
RISCVISD::FSR/FSL happens so late it probably isn't worth much.

The new isel patterns are needed by D117468 anyway.
2022-01-18 11:47:50 -08:00
Craig Topper
aa7fc02feb Recommit "[RISCV] Make the operand order for RISCVISD::FSL(W)/FSR(W) match the instruction register numbering."
This reverts the revert commit e32838573929ac85fc4df3058593798d10ce4cd2.

Accidental demanded bits change has been removed. The demanded bits
code itself was remove in a pre-commit since it isn't tested.

Original commit message:
Previous we used the fshl/fshr operand ordering for simplicity. This
made things confusing when D117468 proposed adding intrinsics for
the instructions. We can't just use the generic funnel shifting
intrinsics because fsl/fsr have different functionality that should
be exposed to software.

Now we use rs1, rs3, rs2/shamt order which matches the instruction
printing order and the order used in this intrinsic header
https://github.com/riscv/riscv-bitmanip/blob/main-history/cproofs/rvintrin.h
2022-01-18 10:52:43 -08:00
Craig Topper
b3a0ec7645 [RISCV] Remove DemandedBits handling for FSR/FSL until we have test cases for it.
Testing may be easier after D117468. Right now we get demanded bits
optimizations done on ISD::FSHL/FSHR before they become FSR/FSL. This
makes it hard to test.
2022-01-18 10:52:43 -08:00
Craig Topper
e328385739 Revert "[RISCV] Make the operand order for RISCVISD::FSL(W)/FSR(W) match the instruction register numbering."
This reverts commit b634f8a663d56877663f5224a785d9f0263c4176.

I broke the SimplifyDemandedBits code, but we don't have tests.
2022-01-18 10:36:03 -08:00
Craig Topper
b634f8a663 [RISCV] Make the operand order for RISCVISD::FSL(W)/FSR(W) match the instruction register numbering.
Previous we used the fshl/fshr operand ordering for simplicity. This
made things confusing when D117468 proposed adding intrinsics for
the instructions. We can't just use the generic funnel shifting
intrinsics because fsl/fsr have different functionality that should
be exposed to software.

Now we use rs1, rs3, rs2/shamt order which matches the instruction
printing order and the order used in this intrinsic header
https://github.com/riscv/riscv-bitmanip/blob/main-history/cproofs/rvintrin.h
2022-01-18 09:47:28 -08:00
David Sherwood
f4515ab858 Revert "[CodeGen][AArch64] Ensure isSExtCheaperThanZExt returns true for negative constants"
This reverts commit 197f3c0deb76951315118ef13937b67ea9cbd5aa.

Reverting after miscompilation errors discovered with ffmpeg.
2022-01-18 08:40:20 +00:00
Han-Kuan Chen
ec9cb3a79c [RISCV] Provide VLOperand in td.
Currently, users expected VL is the last operand. However, since some
intrinsics has tail policy in the last operand, this rule cannot be used
anymore.

Reviewed By: craig.topper, frasercrmck

Differential Revision: https://reviews.llvm.org/D117452
2022-01-17 20:25:47 -08:00
Han-Kuan Chen
3fc4b5896a [RISCV] Make SplatOperand start from 0.
Current SplatOperand starts from 1 because operand 0 (or 1) is intrinsic
id in SelectionDAG.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D117453
2022-01-17 20:14:59 -08:00
Craig Topper
116af698e2 [RISCV] When expanding CONCAT_VECTORS, don't create INSERT_SUBVECTORS for undef subvectors.
For fixed vectors, the undef will get expanded to an all zeros
build_vector. We don't want that so suppress creating the
insert_subvector.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D117379
2022-01-17 14:40:59 -08:00
Craig Topper
9c410838d2 [RISCV] Legalize fixed length (insert_subvector undef, X, 0) to a scalable insert.
We were considering this legal, but later the undef would become an all
zeros vector. This would cause us to need to re-legalize the insert later
into a vslideup with zero vector.

This patch catches the case and directly legalizes it to a scalable
insert.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D117377
2022-01-17 14:31:30 -08:00
David Sherwood
197f3c0deb [CodeGen][AArch64] Ensure isSExtCheaperThanZExt returns true for negative constants
When we know the value we're extending is a negative constant then it
makes sense to use SIGN_EXTEND because this may improve code quality in
some cases, particularly when doing a constant splat of an unpacked vector
type. For example, for SVE when splatting the value -1 into all elements
of a vector of type <vscale x 2 x i32> the element type will get promoted
from i32 -> i64. In this case we want the splat value to sign-extend from
(i32 -1) -> (i64 -1), whereas currently it zero-extends from
(i32 -1) -> (i64 0xFFFFFFFF). Sign-extending the constant means we can use
a single mov immediate instruction.

New tests added here:

  CodeGen/AArch64/sve-vector-splat.ll

I believe we see some code quality improvements in these existing
tests too:

  CodeGen/AArch64/reduce-and.ll
  CodeGen/AArch64/unfold-masked-merge-vector-variablemask.ll

The apparent regressions in CodeGen/AArch64/fast-isel-cmp-vec.ll only
occur because the test disables codegen prepare and branch folding.

Differential Revision: https://reviews.llvm.org/D114357
2022-01-17 11:08:57 +00:00
Craig Topper
4c1e1e05cb [RISCV] Add RISCVISD::BFPW to ComputeNumSignBitsForTargetNode. 2022-01-15 15:23:49 -08:00
Fraser Cormack
877d1b3d07 [SelectionDAG][VP] Add splitting/widening for VP_LOAD and VP_STORE
Original patch by @hussainjk.

This patch was split off from D109377 to keep vector legalization
(widening/splitting) separate from vector element legalization
(promoting).

While the original patch added a third overload of
SelectionDAG::getVPStore, this patch takes the liberty of collapsing
those all down to 1, as three overloads seems excessive for a
little-used node.

The original patch also used ModifyToType in places, but that method
still crashes on scalable vector types. Seeing as the other VP
legalization methods only work when all operands need identical
widening, this patch follows in that vein.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D117235
2022-01-15 11:41:29 +00:00
Chenbing.Zheng
fdd33a0c75 [RISCV][NFC] Add a function to customLegalizeToWOp by Intrinsic
These cases follow the same pattern, so they can be combined
to a unqiue function.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D117378
2022-01-15 08:28:08 +00:00
Craig Topper
2baa1dffd1 [RISCV] Add basic support for matching shuffles to vslidedown.vi.
Specifically the unary shuffle case where the elements being
shifted in are undef. This handles the shuffles produce by expanding
llvm.reduce.mul.

I did not reduce the VL which would increase the number of vsetvlis,
but may improve the execution speed. We'd also want to narrow the
multiplies so we could share vsetvlis between the vslidedown.vi and
the next multiply.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D117239
2022-01-14 09:04:54 -08:00
Craig Topper
ac6b4896ea [RISCV] Honor the VT when converting float point register names to register class for inline assembly.
It appears the code here was written for the inline asm clobbering
a specific register, but it also gets used for named input and
output registers.

For the input and output case, we should honor the VT so we
don't insert conversion instructions around the inline assembly.

For the clobber, case we need to pick the largest register class.

Reviewed By: asb, jrtc27

Differential Revision: https://reviews.llvm.org/D117279
2022-01-14 09:04:00 -08:00
jacquesguan
88c0e0806b [RISCV] Improve i64 splat vector lowering in RV32.
We could use vmv.v.i/vmv.v.x whose eew is 32 to lower the i64 splat vector if the i64 constant scalar could be splitted into two same i32 scalar.

Differential Revision: https://reviews.llvm.org/D117079
2022-01-14 14:06:01 +08:00
David Sherwood
ba471ba8d2 Revert "[CodeGen][AArch64] Ensure isSExtCheaperThanZExt returns true for negative constants"
This reverts commit 31009f0b5afb504fc1f30769c038e1b7be6ea45b.

It seems to be causing SVE VLA buildbot failures and has introduced a
genuine regression. Reverting for now.
2022-01-13 15:59:43 +00:00
David Sherwood
31009f0b5a [CodeGen][AArch64] Ensure isSExtCheaperThanZExt returns true for negative constants
When we know the value we're extending is a negative constant then it
makes sense to use SIGN_EXTEND because this may improve code quality in
some cases, particularly when doing a constant splat of an unpacked vector
type. For example, for SVE when splatting the value -1 into all elements
of a vector of type <vscale x 2 x i32> the element type will get promoted
from i32 -> i64. In this case we want the splat value to sign-extend from
(i32 -1) -> (i64 -1), whereas currently it zero-extends from
(i32 -1) -> (i64 0xFFFFFFFF). Sign-extending the constant means we can use
a single mov immediate instruction.

New tests added here:

  CodeGen/AArch64/sve-vector-splat.ll

I believe we see some code quality improvements in these existing
tests too:

  CodeGen/AArch64/dag-numsignbits.ll
  CodeGen/AArch64/reduce-and.ll
  CodeGen/AArch64/unfold-masked-merge-vector-variablemask.ll

The apparent regressions in CodeGen/AArch64/fast-isel-cmp-vec.ll only
occur because the test disables codegen prepare and branch folding.

Differential Revision: https://reviews.llvm.org/D114357
2022-01-13 09:43:07 +00:00
Lian Wang
16877c5d2c [RISCV] Add bfp and bfpw intrinsic in zbf extension
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D116994
2022-01-13 02:53:00 +00:00
Craig Topper
63b17eb9ec [RISCV] Add strictfp support for compares.
This adds support for STRICT_FSETCC(quiet) and STRICT_FSETCCS(signaling).

FEQ matches well to STRICT_FSETCC oeq.
FLT/FLE matches well to STRICT_FSETCCS olt/ole.

Others require commuting operands or multiple instructions.

STRICT_FSETCC olt/ole/ogt/oge/ult/ule/ugt/uge uses FLT/FLE,
but we need to save/restore FFLAGS around them to avoid spurious
exceptions. I've implemented pseudo instructions with a
CustomInserter to insert the save/restore CSR instructions.
Unfortunately, this doesn't honor exceptions for signaling NANs
but I'm not sure if signaling nans are really supported by the
constrained intrinsics.

STRICT_FSETCC one and ueq expand to a pair of FLT instructions
with a save/restore of fflags around each. This could be improved
in the future.

There may be some opportunities to generate better code for strict
comparisons mixed with nonans fast math flags. I've left FIXMEs in
the .td files for that.

Co-Authored-by: ShihPo Hung <shihpo.hung@sifive.com>

Reviewed By: arcbbb

Differential Revision: https://reviews.llvm.org/D116694
2022-01-11 20:01:41 -08:00
Craig Topper
be1cc64cc1 [RISCV] Add DAG combine to fold (fp_to_int (ffloor X)) -> (fcvt X, rdn)
Similar for ceil, trunc, round, and roundeven. This allows us to use
static rounding modes to avoid a libcall.

This optimization is done for AArch64 as isel patterns.
RISCV doesn't have instructions for ceil/floor/trunc/round/roundeven
so the operations don't stick around until isel to enable a pattern
match. Thus I've implemented a DAG combine.

We only handle XLen types except i32 on RV64. i32 will be type
legalized to a RISCVISD node. All other types will be type legalized
to XLen and maintain the FP_TO_SINT/UINT ISD opcode.

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D116771
2022-01-11 09:05:57 -08:00
wangpc
c6430fade3 [RISCV] Generate 32 bits jumptable entries when code model is small
The code can only address the whole RV32 address space or the lower 2 GiB
of the RV64 address space in small code model, so 32 bits entry is enough.
Cache hit ratio and code size have some improvements.

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D116435
2022-01-11 18:20:37 +08:00
wangpc
98d51c2542 [RISCV] Override TargetLowering::BuildSDIVPow2 to generate SELECT
When `Zbt` is enabled, we can generate SELECT for division by power
of 2, so that there is no data dependency.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D114856
2022-01-11 15:54:35 +08:00
jacquesguan
b607cd3928 [RISCV] Use vmv.s.x to build one element splat vector.
When we want to create an splat vector that only the first element is initialized, we could use vmv.s.x or vfmv.s.f to build it.

Differential Revision: https://reviews.llvm.org/D116277
2022-01-11 10:21:18 +08:00
jacquesguan
6b8362eb8d [RISCV] Disable EEW=64 for index values when XLEN=32.
Disable EEW=64 for vector index load/store when XLEN=32.

Differential Revision: https://reviews.llvm.org/D106518
2022-01-10 10:51:27 +08:00
Kazu Hirata
435a5a3652 [llvm] Fix bugprone argument comments (NFC)
Identified with bugprone-argument-comment.
2022-01-08 11:56:38 -08:00
Craig Topper
75117fb340 [RISCV] Don't advertise i32->i64 zextload as free for RV64.
The zextload hook is only used to determine whether to insert a
zero_extend or any_extend for narrow types leaving a basic block.
Returning true from this hook tends to cause any load whose output
leaves the basic block to become an LWU instead of an LW.

Since we tend to prefer sexts for i32 compares on RV64, this can
cause extra sext.w instructions to be created in other basic blocks.

If we use LW instead of LWU this gives the MIR pass from D116397
a better chance of removing them.

Another option might be to teach getPreferredExtendForValue in
FunctionLoweringInfo.cpp about our preference for sign_extend of
i32 compares. That would cause SIGN_EXTEND to be chosen for any
value used by a compare instead of using the isZExtFree heuristic.
That will require code to convert from the llvm::Type* to EVT/MVT
as well as querying the type legalization actions to get the
promoted type in order to call TargetLowering::isSExtCheaperThanZExt.
That seemed like many extra steps when no other target wants it.
Though it would avoid us needing to lean on the MIR pass in some cases.

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D116567
2022-01-06 08:13:42 -08:00
Craig Topper
808c662665 [RISCV] Change RISCVISD::FCVT*RTZ opcodes to take rounding mode as an operand.
Pre-work for a future change that will use these opcodes with other
rounding modes.

Differential Revision: https://reviews.llvm.org/D116724
2022-01-06 08:12:12 -08:00
Victor Perez
5527139302 [RISCV][VP] Add RVV codegen for [nX]vXi1 vp.select
Expand [nX]vXi1 vp.select the same way as [nX]vXi1 vselect.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D115546
2022-01-02 23:12:32 -08:00
Craig Topper
15787ccd45 [RISCV] Add support for STRICT_LRINT/LLRINT/LROUND/LLROUND. Tests for other strict intrinsics.
This patch adds isel support for STRICT_LRINT/LLRINT/LROUND/LLROUND.

It also adds test cases for f32 and f64 constrained intrinsics that
correspond to the intrinsics in float-intrinsics.ll and
double-intrinsics.ll. Support for promoting the integer argument of
STRICT_FPOWI was added.

I've skipped adding tests for f16 intrinsics, since we don't have libcalls
for them and we have inconsistent support for promoting them in LegalizeDAG.
This will need to be examined more closely.

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D116323
2021-12-30 11:54:32 -08:00
Hsiangkai Wang
a1c7ddf926 [RISCV] Support passing scalable vectur values through the stack.
After consuming all vector registers, the scalable vector values will be
passed indirectly. The pointer values will be saved in general
registers. If all general registers are used up, we will report an error to
notify users the compiler does not support passing scalable vector
values through the stack. In this patch, we remove the restriction. After
all general registers are used up, we use the stack to save the
pointers which point to the indirect passed scalable vector values.

Differential Revision: https://reviews.llvm.org/D116310
2021-12-28 09:26:36 +08:00
Kazu Hirata
e7774f499b Use static_assert instead of assert (NFC)
Identified with misc-static-assert.
2021-12-26 14:26:44 -08:00
Jim Lin
02478a26f2 [RISCV] Use DAG variable directly instead of DCI.DAG
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D116087
2021-12-24 13:06:55 +08:00
Craig Topper
0a35211b34 [RISCV] Don't allow vector types to be used with inline asm 'r' constraint
The 'r' constraint uses the GPR class. There is generic support
for bitcasting and extending/truncating non-integer VTs to the
required integer VT. This doesn't work for scalable vectors and
instead crashes.

To prevent this, explicitly reject vectors. Fixed vectors might
work without crashing, but it doesn't seem worthwhile to allow.

While there remove an unnecessary level of indentation in the
"vr" and "vm" constraint handling.

Differential Revision: https://reviews.llvm.org/D115810
2021-12-23 20:32:36 -06:00
Victor Perez
10b3675aa9 [RISCV][VP] Lower mask vector VP AND/OR/XOR to RVV instructions
For fixed and scalable vectors, each intrinsic x is lowered to vmx.mm,
dropping the mask, which is safe to do as masked-off elements are
undef anyway.

Differential Revision: https://reviews.llvm.org/D115339
2021-12-23 15:02:32 -06:00
Craig Topper
7704c503ec [RISCV] Use positive 0.0 for the neutral element in fadd reductions if nsz is present.
-0.0 requires a constant pool. +0.0 can be made with vmv.v.x x0.

Not doing this in getNeutralElement for fear of changing other targets.

Differential Revision: https://reviews.llvm.org/D115978
2021-12-23 10:38:00 -06:00
Craig Topper
b7b260e19a [RISCV] Support strict FP conversion operations.
This adds support for strict conversions between fp types and between
integer and fp.

NOTE: RISCV has static rounding mode instructions, but the constrainted
intrinsic metadata is not used to select static rounding modes. Dynamic
rounding mode is always used.

Differential Revision: https://reviews.llvm.org/D115997
2021-12-23 09:40:58 -06:00
jacquesguan
28a3e7dea2 [RISCV] Override hasAndNotCompare to use more andn when have Zbb extension.
Enable transform (X & Y) == Y ---> (~X & Y) == 0 and (X & Y) != Y ---> (~X & Y) != 0 when have Zbb extension to use more andn instruction.

Differential Revision: https://reviews.llvm.org/D115922
2021-12-23 10:42:20 +08:00
Craig Topper
66bbefeb13 [RISCV] Revert Zfhmin related changes that aren't tested and depend on f16 being a legal type.
Our Zfhmin support is only MC layer, but these are CodeGen layer
interfaces. If f16 isn't a Legal type for CodeGen with Zfhmin, then
these interfaces should keep their non-Zfh behavior.

Reviewed By: luismarques

Differential Revision: https://reviews.llvm.org/D115822
2021-12-16 08:55:28 -08:00
Craig Topper
3926893439 [RISCV] Add isel support for scalar STRICT_FADD/FSUB/FMUL/FDIV/FSQRT.
Test that STRICT_FMINNUM/FMAXNUM are lowered to libcalls for f32/f64.
The RISC-V instructions don't match the behavior of fmin/fmax libcalls
with respect to SNaN.

Promoting FMINNUM/FMAXNUM for f16 needs more work outside of the
RISC-V backend.

Reviewed By: asb, arcbbb

Differential Revision: https://reviews.llvm.org/D115680
2021-12-14 10:50:55 -08:00
Craig Topper
3f1c403a2b [RISCV] Use AdjustInstrPostInstrSelection to insert a FRM dependency for scalar FP instructions with dynamic rounding mode.
In order to support constrained FP intrinsics we need to model FRM
dependency. Whether or not a instruction uses FRM is based on a 3
bit field in the instruction. Because of this we can't add
'Uses = [FRM]' to the tablegen descriptions.

This patch examines the immediate after isel and adds an implicit
use of FRM. This idea came from Roger Ferrer Ibanez.

Other ideas:
We could be overly conservative and just pretend all instructions with
frm field read the FRM register. Or we could have pseudoinstructions
for CodeGen with rounding mode.

Reviewed By: asb, frasercrmck, arcbbb

Differential Revision: https://reviews.llvm.org/D115555
2021-12-14 10:17:57 -08:00
Craig Topper
b18b2a01ef [RISCV] Don't use VLMAX for start value splat in reduction lowering.
The reduction instructions only reads the first element. The
execution time for a splat may take longer with a larger VL.
We should use the smallest VL we can.

Reviewed By: frasercrmck, HsiangKai

Differential Revision: https://reviews.llvm.org/D115536
2021-12-13 09:06:42 -08:00
Kito Cheng
39c861719b [RISCV] Fix vm operand constraint to fit GCC's behavior
- `vm` constraint is used for masking operand, which always v0.

- Update testcase, only masking operand should use `vm`, vector mask operations
  should just use `vr` for any vector register.

 - Revise the description of `vm` constraint.

- This patch also fix issue on RISCVRegisterInfo.td and RISCVISelLowering.cpp.

  RISCVRegisterInfo.td:
  - The first VT in the list must be the largest total size since the
    SelectionDAGBuilder uses the first register in the list as the canonical
    type for the register.

  RISCVISelLowering.cpp:
  - Fix RISCVTargetLowering::splitValueIntoRegisterParts and
    RISCVTargetLowering::joinRegisterPartsIntoValue for handling vectors
    with different total size, that will happened on fractional LMUL since
    fractional LMUL is always occupy one vector register.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D112599
2021-12-09 14:46:49 +08:00
Craig Topper
acdbd34cfb [RISCV] Loosen some restrictions on lowering constant BUILD_VECTORs using vid.v.
The immediate size check on StepNumerator did not take into account
that vmul.vi does not exist. It also did not account for power of 2
constants that can be done with vshl.vi.

This patch fixes this by moving the conversion from mul to shift
further up. Then we can consider the immediates separately for MUL
vs SHL. For MUL I've allowed simm12 which requires a single addi
before a vmul.vx. For SHL I've allowed any uimm5 which works with
vshl.vi. We could relax these further in the future. This is a
starting point that allows us to emit the same number of instructions
we were already using for smaller numerators.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D115081
2021-12-06 09:34:40 -08:00
Victor Perez
9eb7322748 [RISCV][VP] Add RVV codegen for vp.select
Lower vp.select instrinsic to VSELECT_VL.

Reviewed By: rogfer01

Differential Revision: https://reviews.llvm.org/D114629
2021-12-03 11:02:20 +00:00
Craig Topper
2f6beb7b0e [RISCV] Add inline expansion for vector ftrunc/fceil/ffloor.
This prevents scalarization of fixed vector operations or crashes
on scalable vectors.

We don't have direct support for these operations. To emulate
ftrunc we can convert to the same sized integer and back to fp using
round to zero. We don't need to do a convert if the value is large
enough to have no fractional bits or is a nan.

The ceil and floor lowering would be better if we changed FRM, but
we don't model FRM correctly yet. So I've used the trunc lowering
with a conditional add or subtract with 1.0 if the truncate rounded
in the wrong direction.

There are also missed opportunities to use masked instructions.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D113543
2021-12-01 11:25:28 -08:00
Craig Topper
d8f9eaad89 [RISCV] Teach RISCVTargetLowering::shouldSinkOperands to handle udiv/sdiv/urem/srem.
The V extension supports .vx instructions for integer division and
remainder so we should sink splats for that operand.
2021-11-30 18:47:51 -08:00
David Green
9e8a71caf0 [DAG] Create fptosi.sat from clamped fptosi
This adds a fold in DAGCombine to create fptosi_sat from sequences for
smin(smax(fptosi(x))) nodes, where the min/max saturate the output of
the fp convert to a specific bitwidth (say INT_MIN and INT_MAX). Because
it is dealing with smin(/smax) in DAG they may currently be ISD::SMIN,
ISD::SETCC/ISD::SELECT, ISD::VSELECT or ISD::SELECT_CC nodes which need
to be handled similarly.

A shouldConvertFpToSat method was added to control when converting may
be profitable. The original fptosi will have a less strict semantics
than the fptosisat, with less values that need to produce defined
behaviour.

This especially helps on ARM/AArch64 where the vcvt instructions
naturally saturate the result.

Differential Revision: https://reviews.llvm.org/D111976
2021-11-30 15:29:14 +00:00
Hans Wennborg
a87782c34d Revert "[DAG] Create fptosi.sat from clamped fptosi"
It causes builds to fail with this assert:

llvm/include/llvm/ADT/APInt.h:990:
bool llvm::APInt::operator==(const llvm::APInt &) const:
Assertion `BitWidth == RHS.BitWidth && "Comparison requires equal bit widths"' failed.

See comment on the code review.

> This adds a fold in DAGCombine to create fptosi_sat from sequences for
> smin(smax(fptosi(x))) nodes, where the min/max saturate the output of
> the fp convert to a specific bitwidth (say INT_MIN and INT_MAX). Because
> it is dealing with smin(/smax) in DAG they may currently be ISD::SMIN,
> ISD::SETCC/ISD::SELECT, ISD::VSELECT or ISD::SELECT_CC nodes which need
> to be handled similarly.
>
> A shouldConvertFpToSat method was added to control when converting may
> be profitable. The original fptosi will have a less strict semantics
> than the fptosisat, with less values that need to produce defined
> behaviour.
>
> This especially helps on ARM/AArch64 where the vcvt instructions
> naturally saturate the result.
>
> Differential Revision: https://reviews.llvm.org/D111976

This reverts commit 52ff3b009388f1bef4854f1b6470b4ec19d10b0e.
2021-11-30 15:36:56 +01:00