MMX_MOVD64from64rr moves an MMX register to a 64-bit GPR.
MMX_MOVD64from64mr is the memory version of moving a MMX register to a
64-bit GPR. It requires the REX.W bit to be set. There are no isel
patterns that use this instruction.
MMX_MOVQ64mr is the MMX register store instruction. It doesn't
require a REX.W prefix. This makes it one byte shorter to encode
than MMX_MOVD64from64mr in many cases.
Both store instructions output the same mnemonic string. The assembler
would choose MMX_MOVQ64mr if it was to parse the output. Which is
another reason using it is the correct thing to do.
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D122241
On RV32, we need to type legalize i64 scalar arguments to intrinsics.
We usually do this by splatting the value into a vector separately.
If the scalar happens to be sign extended, we can continue using a .vx
intrinsic.
We already special cased sign extended constants, this extends it
to any sign extended value.
I've only added tests for one case of vadd. Most intrinsics go
through the same check.
Reviewed By: khchen
Differential Revision: https://reviews.llvm.org/D122186
The mask being NoRegister prevented the existing aliases from matching
since NoRegister isn't in the VMV0 register class.
To workaround this I've added new aliases that look for zero_reg.
I had to motify tablegen to generate matching code for zero_reg.
And as a consequence, I had to change the EmitPriority for an ARM
alias that used zero_reg that started printing.
Reviewed By: frasercrmck
Differential Revision: https://reviews.llvm.org/D121496
This adds LLVMAnyPointerToElt to use instead of LLVMPointerToElt.
This allows us to preserve the address space as part of the type
overload for the intrinsic, but still require the vector element
type to match the pointer type.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D122042
intrinsics.
Those operations are updated under a tail agnostic policy, but they
could have mask agnostic or undisturbed.
Reviewed By: rogfer01
Differential Revision: https://reviews.llvm.org/D120228
For '-filetype=null', 'NVPTXTargetStreamer' is not created, so the
return value of 'OutStreamer->getTargetStreamer()' should be checked
before calling the methods.
Differential Revision: https://reviews.llvm.org/D122001
These tests are located in 'X86' subfolders which means that they should
be compiled for that target. As they did not have the target specified
explicitly, they in fact were compiled for a default target triple. Not
all targets support all required features for these tests; for example,
if NVPTX is used as a default triple, the tests fail. The patch makes the
tests run for 'x86_64', thus they pass regardless of the default target.
Differential Revision: https://reviews.llvm.org/D121998
In the frame index lowering we have to insert shift and add
instructions to adjust stack object access. We need to take care of the stack
object user kind and use scalar shift/add for scalar users.
Reviewed By: rampitec
Differential Revision: https://reviews.llvm.org/D121524
This fixes bug <https://github.com/llvm/llvm-project/issues/54022>. For
now this means that defined functions will have two .functype directives
emitted. Given discussion in that bug has suggested interest in moving
towards using something other than .functype to mark the beginning of a
function (which would, as a side-effect, solve this issue), this patch
doesn't attempt to avoid that duplication.
Some test cases that used CHECK-LABEL: foo rather than CHECK-LABEL: foo:
are broken by this change. This patch updates those test cases to always
have a colon at the end of the CHECK-LABEL string.
Differential Revision: https://reviews.llvm.org/D122134
Add the UsesMaskPolicy flag to indicate the operations result
would be effected by the mask policy. (ex. mask operations).
It means RISCVInsertVSETVLI should decide the mask policy according
by mask policy operand or passthru operand.
If UsesMaskPolicy is false (ex. unmasked, store, and reduction operations),
the mask policy could be either mask undisturbed or agnostic.
Currently, RISCVInsertVSETVLI sets UsesMaskPolicy operations default to
MA, otherwise to MU to keep the current mask policy would not be changed
for unmasked operations.
Add masked-tama, masked-tamu, masked-tuma and masked-tumu test cases.
I didn't add all operations because most of implementations are using
the same pseudo multiclass. Some tests maybe be duplicated in different
tests. (ex. masked vmacc with tumu shows in vmacc-rv32.ll and masked-tumu)
I think having different tests only for policy would make the testing
clear.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D120226
Sinking must check for interference between the block prologue
and the instruction being sunk.
Specifically check for clobbering of uses by the prologue, and
overwrites to prologue defined registers by the sunk instruction.
Reviewed By: rampitec, ruiling
Differential Revision: https://reviews.llvm.org/D121277
On RV32, we need to type legalize i64 scalar arguments to intrinsics.
We usually do this by splatting the value into a vector separately.
If the scalar happens to be sign extended, we can continue using a .vx
intrinsic.
We already special cased sign extended constants, this extends it
to any sign extended value.
I've only added tests for one case of vadd. Most intrinsics go
through the same check. I can add more tests if we're concerned.
Differential Revision: https://reviews.llvm.org/D122186
As suggested on PR35908, if we are adding/subtracting an extracted bit, attempt to use BT instead to fold the op and use a ADC/SBB op.
Reapply with extra type legality checks - LowerAndToBT was originally only used during lowering, now that it can occur earlier we might encounter illegal types that we can either promote to i32 or just bail.
Differential Revision: https://reviews.llvm.org/D122084
BUILD_VECTOR of i16 and undef gets expanded to the COPY_TO_REGCLASS.
The latter is further lowererd to the copy instructions.
We need to provide the correct register class for the uniform and divergent BUILD_VECTOR nodes
to avoid VGPR to SGPR copies.
Reviewed By: rampitec
Differential Revision: https://reviews.llvm.org/D122068
AVX512 has excellent broadcast ops for everything but vXi1 bool vectors - so if we're broadcasting a comparison result, see if we can broadcast the comparison operands instead.
Trying to reduce the number of masked loads in favour of more unpklo/hi
instructions. Both ISD::ZEXTLOAD and ISD::SEXTLOAD are supported to extensions
from legal types.
Both of normal and masked loads test cases added to guard compile crash.
Reviewed By: paulwalker-arm
Differential Revision: https://reviews.llvm.org/D120953
This test aims to demonstrate the WebAssembly backend's behaviour around
emission of the .functype directive. It covers defined and declared
functions as well as libcalls.
It currently fails to emit functypes for all defined functions at the
head of the file, causing issues with the type checker
<https://github.com/llvm/llvm-project/issues/54022>. The patch in
<https://reviews.llvm.org/D122134> is a proposal to fix this issue.
Ensure we don't attempt to fold to illegal types to ADC/SBB nodes.
After D122084 its possible for ADD(X,AND(SRL(Y,Z),1) patterns to be matched before type legalization.
Add shl/srl/sra to the list of ops that we canonicalize with a select to expose an identity merge
Differential Revision: https://reviews.llvm.org/D122070
As suggested on PR35908, if we are adding/subtracting an extracted bit, attempt to use BT instead to fold the op and use a ADC/SBB op.
Differential Revision: https://reviews.llvm.org/D122084
This reverts commit 011c64191ef9ccc6538d52f4b57f98f37d4ea36e and
e725e2afe02e18398525652c9bceda1eb055ea64.
Differential Revision: https://reviews.llvm.org/D122117
On GFX10.3 targets, the following instruction sequence
v_cmp_* SGPR, ...
s_and_saveexec ..., SGPR
leads to a fairly long stall caused by a VALU write to a SGPR and having the
following SALU wait for the SGPR.
An equivalent sequence is to save the exec mask manually instead of letting
s_and_saveexec do the work and use a v_cmpx instruction instead to do the
comparison.
This patch modifies the SIOptimizeExecMasking pass as this is the last position
where s_and_saveexec instructions are inserted. It does the transformation by
trying to find the pattern, extracting the operands and generating the new
instruction sequence.
It also changes some existing lit tests and introduces a few new tests to show
the changed behavior on GFX10.3 targets.
Reviewed By: sebastian-ne, critson
Differential Revision: https://reviews.llvm.org/D119696
After D108936, @llvm.smul.with.overflow.i64 was lowered to __multi3
instead of __mulodi4, which also doesn't exist on PowerPC 32-bit, not
even with compiler-rt. Block it as well so that we get inline code.
Because libgcc doesn't have __muloti4, we block that as well.
Fixes#54460.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D122090
This is the first patch to enable the XCOFF64 object writer.
Currently only fileHeader and sectionHeaders are supported.
Reviewed By: jhenderson, DiggerLin
Differential Revision: https://reviews.llvm.org/D120861
ComputePHILiveOutRegInfo assumes that constant incoming values to
Phis will be zero extended if they aren't a legal type. To guarantee
that we should zero_extend rather than any_extend constants.
This fixes a bug for RISCV where any_extend of constants can be
treated as a sign_extend.
Differential Revision: https://reviews.llvm.org/D122053
The code that inserts AssertZExt based on predecessor information assumes
constants are zero extended for phi incoming values this allows
AssertZExt to be created in blocks consuming a Phi.
SelectionDAG::getNode treats any_extend of i32 constants as sext for RISCV.
The code that creates phi incoming values in the predecessors creates an
any_extend for the constants which then gets treated as a sext by getNode.
This makes the AssertZExt incorrect and can cause zexts to be
incorrectly removed.
This bug was introduced by D105918
Differential Revision: https://reviews.llvm.org/D122052