Previously we had the same instructions being generated for `ISD::CTLZ` and `ISD::CTLZ_ZERO_UNDEF` which did not take advantage of the fact that zero is an invalid input for `ISD::CTLZ_ZERO_UNDEF`. This commit separates codegen for the two cases to allow for the optimization for the latter case.
The details of the optimization are outlined in #82075Fixes#82075
Co-authored-by: Manish Kausik H <hmamishkausik@gmail.com>
We masked out the sign bit from one value, and the non-sign bits
from the other so there should be no common bits set.
No idea how to test this on the DAG path, other than scraping
the debug logs. A few targets hit this path with f16 values, but
the resulting i16 ors get anyext promoted and lose the disjoint
flag. In the fp128 case, PPC gets further and the or loses the flag
somewhere else later. Adding a haveNoCommonBits assert shows this
works though.
Any fp128 need to end up as libcall, as will f32->i128 and f64->i128.
f16 are a bit special as the maximum range of the result fits in a i17,
so can be shrank to an i64. Vector with i128/fp128 types are scalarized.
As far as I can tell, this pull request was not approved, and
did not go through an RFC on discourse.
This reverts commit 89881480030f48f83af668175b70a9798edca2fb.
This reverts commit 225d8fc8eb24fb797154c1ef6dcbe5ba033142da.
Currently, on different platform, the behaivor of llvm.minnum is
different if one operand is sNaN:
When we compare sNaN vs NUM:
ARM/AArch64/PowerPC: follow the IEEE754-2008's minNUM: return qNaN.
RISC-V/Hexagon follow the IEEE754-2019's minimumNumber: return NUM. X86:
Returns NUM but not same with IEEE754-2019's minimumNumber as
+0.0 is not always greater than -0.0.
MIPS/LoongArch/Generic: return NUM.
LIBCALL: returns qNaN.
So, let's introduce llvm.minmumnum/llvm.maximumnum, which always follow
IEEE754-2019's minimumNumber/maximumNumber.
Half-fix: #93033
This change is an implementation of #87367's investigation on supporting
IEEE math operations as intrinsics.
Which was discussed in this RFC:
https://discourse.llvm.org/t/rfc-all-the-math-intrinsics/78294
Much of this change was following how G_FSIN and G_FCOS were used.
Changes:
- `llvm/docs/GlobalISel/GenericOpcode.rst` - Document the `G_FTAN`
opcode
- `llvm/docs/LangRef.rst` - Document the tan intrinsic
- `llvm/include/llvm/Analysis/VecFuncs.def` - Associate the tan
intrinsic as a vector function similar to the tanf libcall.
- `llvm/include/llvm/CodeGen/BasicTTIImpl.h` - Map the tan intrinsic to
`ISD::FTAN`
- `llvm/include/llvm/CodeGen/ISDOpcodes.h` - Define ISD opcodes for
`FTAN` and `STRICT_FTAN`
- `llvm/include/llvm/IR/Intrinsics.td` - Create the tan intrinsic
- `llvm/include/llvm/IR/RuntimeLibcalls.def` - Define tan libcall
mappings
- `llvm/include/llvm/Target/GenericOpcodes.td` - Define the `G_FTAN`
Opcode
- `llvm/include/llvm/Support/TargetOpcodes.def` - Create a `G_FTAN`
Opcode handler
- `llvm/include/llvm/Target/GlobalISel/SelectionDAGCompat.td` - Map
`G_FTAN` to `ftan`
- `llvm/include/llvm/Target/TargetSelectionDAG.td` - Define `ftan`,
`strict_ftan`, and `any_ftan` and map them to the ISD opcodes for `FTAN`
and `STRICT_FTAN`
- `llvm/lib/Analysis/VectorUtils.cpp` - Associate the tan intrinsic as a
vector intrinsic
- `llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp` Map the tan intrinsic
to `G_FTAN` Opcode
- `llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp` - Add `G_FTAN` to
the list of floating point math operations also associate `G_FTAN` with
the `TAN_F` runtime lib.
- `llvm/lib/CodeGen/GlobalISel/Utils.cpp` - More floating point math
operation common behaviors.
- llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp - List the function
expansion operations for `FTAN` and `STRICT_FTAN`. Also define both
opcodes in `PromoteNode`.
- `llvm/lib/CodeGen/SelectionDAG/LegalizeFloatTypes.cpp` - More `FTAN`
and `STRICT_FTAN` handling in the legalizer
- `llvm/lib/CodeGen/SelectionDAG/LegalizeTypes.h` - Define
`SoftenFloatRes_FTAN` and `ExpandFloatRes_FTAN`.
- `llvm/lib/CodeGen/SelectionDAG/LegalizeVectorOps.cpp` - Define `FTAN`
as a legal vector operation.
- `llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp` - Define
`FTAN` as a legal vector operation.
- `llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp` - define tan as an
intrinsic that doesn't return NaN.
- `llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp` Map
`LibFunc_tan`, `LibFunc_tanf`, and `LibFunc_tanl` to `ISD::FTAN`. Map
`Intrinsic::tan` to `ISD::FTAN` and add selection dag handling for
`Intrinsic::tan`.
- `llvm/lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp` - Define `ftan`
and `strict_ftan` names for the equivalent ISD opcodes.
- `llvm/lib/CodeGen/TargetLoweringBase.cpp` -Define a Tan128 libcall and
ISD::FTAN as a target lowering action.
- `llvm/lib/Target/X86/X86ISelLowering.cpp` - Add x86_64 lowering for
tan intrinsic
resolves https://github.com/llvm/llvm-project/issues/70082
The testing we have for vector ptradd was a bit lacking. In adding tests
this patch found a couple of issues mostly with the way v3 vectors of
ptrs were sometimes legalized via i64, and with non-i64 additions. It
does not attempt to fix the issue with mergevalues from returning vector
ptrs.
This hooks up G_INTRINSIC_LLRINT instructions, very similar to the lrint
nodes that already exist. On AArch64 they are treated the same as lrint
with the default return types.
Currently the inserted mergelike instructions will be inserted at the
location of the G_PHI. Seems like the behaviour was correct before, but
the rework done in https://reviews.llvm.org/D114198 forgot to include
the part which makes sure the instructions will be inserted after all
the G_PHIs.
This extends the legalization of lrint, adding libcall support for
fp128. The old vector legal types were removed as they were not being
properly handled (vector lrint is a fairly new concept as far as I
understand). They can be re-added properly in a followup.
This patch legalizes G_ZEXT, G_SEXT, and G_ANYEXT. If the type is a
legal mask type, then the instruction is legalized as the element-wise
select, where the condition on the select is the mask typed source
operand, and the true and false values are 1 or -1 (for
zero/any-extension and sign extension) and zero. If the type is a legal integer
or vector integer type, then the instruction is marked as legal.
The legalization of the extends may introduce a G_SPLAT_VECTOR, which
needs to be legalized in this patch for the extend test cases to pass.
A G_SPLAT_VECTOR is legal if the vector type is a legal integer or
floating point vector type and the source operand is sXLen type. This is
because the SelectionDAG patterns only support sXLen typed
ISD::SPLAT_VECTORS, and we'd like to reuse those patterns. A
G_SPLAT_VECTOR is cutom legalized if it has a legal s1 element vector
type and s1 scalar operand. It is legalized to G_VMSET_VL or G_VMCLR_VL
if the splat is all ones or all zeros respectivley. In the case of a
non-constant mask splat, we legalize by promoting the scalar value to
s8.
In order to get the s8 element vector back into s1 vector, we use a
G_ICMP. In order for the splat vector and extend tests to pass, we also
need to legalize G_ICMP in this patch.
A G_ICMP is legal if the destination type is a legal bool vector and the LHS and
RHS are legal integer vector types.
G_VSCALE should be lowered using VLENB. If the type is not sXLen it
should be lowered using a G_VSCALE on the narrow type and a G_MUL.
regbank select and instruction select are straightforward so we really
only need to add tests to show it works.
This patch improves codegen for scalar (<128bits) version
of llvm.abs intrinsic by using the existing non-XOR based lowering.
This takes the generated code closer to SDAG.
codegen with GISel for > 128 bit types is not very good
with these method so not doing so.
This is part of #70452 that changes the type used for the external
interface of MMO to LocationSize as opposed to uint64_t. This means the
constructors take LocationSize, and convert ~UINT64_C(0) to
LocationSize::beforeOrAfter(). The getSize methods return a
LocationSize.
This allows us to be more precise with unknown sizes, not accidentally
treating them as unsigned values, and in the future should allow us to
add proper scalable vector support but none of that is included in this
patch. It should mostly be an NFC.
Global ISel is still expected to use the underlying LLT as it needs, and
are not expected to see unknown sizes for generic operations. Most of
the changes are hopefully fairly mechanical, adding a lot of getValue()
calls and protecting them with hasValue() where needed.
Recommits llvm/llvm-project#80378 which was reverted in
llvm/llvm-project#84330. The problem was that the change in
llvm/test/CodeGen/AArch64/GlobalISel/legalizer-info-validation.mir used
217 as an opcode instead of a regex.
This patch is stacked on
https://github.com/llvm/llvm-project/pull/80372,
https://github.com/llvm/llvm-project/pull/80307, and
https://github.com/llvm/llvm-project/pull/80306.
ShuffleVector on scalable vector types gets IRTranslate'd to
G_SPLAT_VECTOR since a ShuffleVector that has operates on scalable
vectors is a splat vector where the value of the splat vector is the 0th
element of the first operand, because the index mask operand is the
zeroinitializer (undef and poison are treated as zeroinitializer here).
This is analogous to what happens in SelectionDAG for ShuffleVector.
`buildSplatVector` is renamed to`buildBuildVectorSplatVector`. I did not
make this a separate patch because it would cause problems to revert
that change without reverting this change too.
i8 vectors do not have their sizes changed as I noticed regressions in
some tests when that was done.
This patch also adds support for most G_VECREDUCE_* operations to
moreElementsVector in LegalizerHelper.cpp.
The code for getting the "neutral" element is taken almost exactly as it
is in SelectionDAG, with the exception that support for
G_VECREDUCE_{FMAXIMUM,FMINIMUM} was not added.
The code for SelectionDAG is located at
SelectionDAG::getNeutralELement().
This alters the lowering of G_COPYSIGN to support vector types. The
general idea is that we just lower it to vector operations using and/or
and a mask, which are now converted to a BIF/BIT/BSP.
In the process the existing AArch64LegalizerInfo::legalizeFCopySign can
be removed, replying on expanding the scalar versions to vector instead,
which just needs a small adjustment to allow widening scalars to
vectors.
Extend LegalizerHelper's API to lower integer constants to a load from
constant pool. Previously, this lowering existed only for FP constants.
Apply this change to RISCV.
Legalize G_ABS for larger/smaller width vectors with legal element sizes
Fallsback for the smaller width vector tests because it is unable to
legalize for G_ANYEXT smaller width vectors
This fills out the fcmp handling to be more like the other instructions,
adding better support for fp16 and some larger vectors.
Select of f16 values is still not handled optimally in places as the
select is only legal for s32 values, not s16. This would be correct for
integer but not necessarily for fp. It is as if we need to do
legalization -> regbankselect -> extra legaliation -> selection.
Legalize shl/lshr/ashr for smaller/larger vector widths with legal
element sizes
Smaller than legal vector types does not work at the moment as it relies
on G_ANYEXT to work with smaller than legal vector types
Moved extractParts() and extractVectorParts() from LegalizerHelper
to Utils to be able to use it in different passes.
extractParts() will also try to use unmerge when doing irregular
splits where possible, falling back to extract elements when not.