We effectively need multiple implementations of fdiv, split between
the two selectors with some pre-processing. Add yet another test to
check that flag combinations are interpreted consistently. There is
already quite a bit of test redundancy here, but there are so many
interesting permutations that it's unwieldy to cover every detail
in any one test. The existing overlapping fdiv tests are already
hard to follow.
This isn't always folded to fneg for a freestanding fsub depending on
the denormal mode. When matching source modifiers, we're implicitly
canonicalizing the input so we can fold it here.
Doesn't bother handling the VOP3P case since it's only relevant with
DAZ, which nobody really uses with f16.
For f64, tests show an existing bug where DAGCombiner tries to respect
the denormal mode for fsub -0, x, but not after it's lowered to fadd
-0, (fneg x). Either the fold is wrong or we shouldn't restrict the
fsub case based on the denormal mode.
https://reviews.llvm.org/D155652
This has come up a few times in review; the current names seem to be universally confusing. Even I, as the original author of most of these, get confused. Switch to the SLOW/FAST naming used by x86; hopefully that's a bit clearer.
This reverts commit f8a36d8c3e264c4fccf8058e699201a452ea7bb7.
I believe this is causing an assertion failure on the
sanitizer-x86_64-linux buildbot:
clang++: /b/sanitizer-x86_64-linux/build/llvm-project/llvm/include/llvm/Support/Casting.h:578: decltype(auto) llvm::cast(From *) [To = llvm::BinaryOperator, From = llvm::Value]: Assertion `isa<To>(Val) && "cast<Ty>() argument of incompatible type!"' failed.
#10 0x000055bdd7e82408 canonicalizeLogicFirst(llvm::BinaryOperator&, llvm::IRBuilder<llvm::TargetFolder, llvm::IRBuilderCallbackInserter>&) /b/sanitizer-x86_64-linux/build/llvm-project/llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp:2131:5
#11 0x000055bdd7e80183 llvm::InstCombinerImpl::visitAnd(llvm::BinaryOperator&) /b/sanitizer-x86_64-linux/build/llvm-project/llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp:2661:20
Likely the code is encountering a constant expression in a case it
didn't before.
Correct RISC-V strictfp tests to follow the rules documented in the LangRef:
https://llvm.org/docs/LangRef.html#constrained-floating-point-intrinsics
Mostly these tests just needed the strictfp attribute on function definitions.
I've also removed the strictfp attribute from uses of the constrained
intrinsics because it comes by default since D154991, but I only did this
in tests I was changing anyway.
Test changes verified with D146845.
The patch D153600 implemented `-frecord-command-line` for the XCOFF direct assembly path. This patch adds support for the XCOFF integrated assembly path.
Reviewed By: scott.linder
Differential Revision: https://reviews.llvm.org/D154921
In preparation for removing support for add expressions, mark them
as undesirable. As such, we will no longer implicitly create such
expressions, but they still exist.
For a relocation, we don't differentiate the two cases:
* the symbol index is 0
* the symbol index is nonzero, the type is not STT_SECTION, and the name is empty. Clang generates such local symbols for RISC-V linker relaxation.
So we may print
```
Offset Info Type Symbol's Value Symbol's Name + Addend
000000000000001c 0000000100000039 R_RISCV_32_PCREL 0000000000000000 0
// llvm-readobj
0x1C R_RISCV_32_PCREL - 0x0
```
while GNU readelf prints "<null>", which is clearer. Let's match the GNU behavior.
Related to https://reviews.llvm.org/D81842
```
000000000000001c 0000000100000039 R_RISCV_32_PCREL 0000000000000000 <null> + 0
// llvm-readobj
0x1C R_RISCV_32_PCREL <null> 0x0
```
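The distinction between the two cases can be sketched as a small name-selection helper. This is only an illustration, not the llvm-readobj code: `relocation_symbol_name` and its parameters are hypothetical, and returning an empty string for symbol index 0 is an assumption that leaves the column formatting of either output style aside.

```python
STT_SECTION = 3  # ELF symbol type for section symbols


def relocation_symbol_name(sym_index, sym_type, name, section_name=""):
    """Pick the text shown in the symbol-name column of a relocation dump.

    Hypothetical helper: the arguments stand in for fields that a real
    dumper would read from the symbol table.
    """
    if sym_index == 0:
        # No symbol is referenced at all (assumed to print nothing here).
        return ""
    if sym_type == STT_SECTION and not name:
        # Section symbols are conventionally shown by their section name.
        return section_name
    # A real symbol with an empty name is made visible as "<null>".
    return name or "<null>"
```

With this split, a nameless non-section symbol is distinguishable from a relocation that references no symbol.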
Reviewed By: jhenderson, kito-cheng
Differential Revision: https://reviews.llvm.org/D155353
We don't support this as an argument or return type; it's always promoted to <2 x s32>.
Performing the widening prevents us from having selection failures due to unsupported
extends.
Fixes https://github.com/llvm/llvm-project/issues/58274
This is an alternative to D155288 that can handle other sources of
xori like FP compares. Unfortunately, it misses the i64 setge case
on RV32 in condops.ll.
Reviewed By: asb
Differential Revision: https://reviews.llvm.org/D155328
Similar to combineVectorSignBitsTruncation, we don't require all-signbits source inputs, just enough signbits to reach into the lowest i16 to safely use PACKSSDW.
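A minimal sketch of why the sign bits only need to reach into the lowest i16: a signed-saturating pack of an i32 lane matches a plain truncation as long as the value has at least 17 sign bits. The helper names below are hypothetical and model a single lane, not the actual X86 lowering code.

```python
def num_sign_bits(value, width=32):
    """Count the top bits of a width-bit value that equal its sign bit."""
    value &= (1 << width) - 1
    sign = (value >> (width - 1)) & 1
    count = 0
    for bit in range(width - 1, -1, -1):
        if (value >> bit) & 1 != sign:
            break
        count += 1
    return count


def packss_i32_to_i16(value):
    """Signed-saturate one 32-bit lane to 16 bits, as a PACKSSDW lane does."""
    value &= 0xFFFFFFFF
    signed = value - (1 << 32) if value & 0x80000000 else value
    clamped = max(-32768, min(32767, signed))
    return clamped & 0xFFFF


def truncate_i32_to_i16(value):
    """Keep only the low 16 bits of a 32-bit lane."""
    return value & 0xFFFF
```

For example, 0xFFFF8000 has 17 sign bits and packs to the same bits as truncation, while 0x00008000 has only 16 and saturates to 0x7FFF instead.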
Refactor to use BasicBlockUtils functions and make life easier for
a subsequent patch for updating the dominator tree.
Reviewed By: dmgreen
Differential Revision: https://reviews.llvm.org/D154053
A vmv.v.v shares the same encoding as a vmerge that isn't masked, so we can
also fold it into its operands if we treat it as a vmerge with an all-ones
mask. We take care here not to actually transform the existing vmv into a
vmerge, otherwise things like True.hasOneUse() become inaccurate. Instead this
just returns an equivalent list of operands.
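The equivalence relied on here can be sketched element-wise. This is an illustrative model only: `vmerge` and `vmv_v_v` are hypothetical Python stand-ins for the instructions' semantics over the body elements, not the RISC-V backend code.

```python
def vmerge(mask, false_vals, true_vals):
    """Per element, take true_vals where the mask bit is set, else false_vals."""
    return [t if m else f for m, f, t in zip(mask, false_vals, true_vals)]


def vmv_v_v(src):
    """Copy every element of the source vector."""
    return list(src)


def all_ones_mask(length):
    """Mask that selects the 'true' operand in every element."""
    return [True] * length
```

With an all-ones mask the false operand never contributes, so `vmerge(all_ones_mask(n), anything, src)` produces the same elements as `vmv_v_v(src)`.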
This is an alternative to D153351.
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D155101
Currently when folding vmerge into its operands, we stop if the VLs aren't
identical. However since the body of (vmerge (vop)) is the intersection of
vmerge and vop's bodies, we can use the smaller of the two VLs if we know it
ahead of time. This patch relaxes the constraint on VL if they are both
constants, or if either of them is VLMAX.
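The relaxed rule can be sketched as follows. This is a simplified model under stated assumptions, not the patch itself: `VLMAX` is a hypothetical sentinel, constant VLs are Python ints, and non-constant VLs are represented by register-name strings.

```python
VLMAX = object()  # hypothetical sentinel standing in for the VLMAX operand


def merged_vl(vmerge_vl, vop_vl):
    """Return the VL to use for the folded op, or None if the fold must stop."""
    if vmerge_vl is vop_vl or vmerge_vl == vop_vl:
        return vmerge_vl  # identical VLs: the original condition
    if vmerge_vl is VLMAX:
        return vop_vl  # any other VL is no larger than VLMAX
    if vop_vl is VLMAX:
        return vmerge_vl
    if isinstance(vmerge_vl, int) and isinstance(vop_vl, int):
        return min(vmerge_vl, vop_vl)  # both known: take the smaller
    return None  # two different unknown VLs: cannot pick the smaller one
```

For instance, `merged_vl(4, 8)` gives 4, while two distinct register VLs such as `"a0"` and `"a1"` give `None`.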
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D155071
Within the AggressiveInstCombine pass we have
an analysis/optimization that matches the
table-based CTZ pattern. Some targets do
not support/define ctz(0), but since
AggressiveInstCombine is just an extension of
InstCombine, it should be a target-independent
canonicalization pass. We therefore decided
to introduce several instructions, such as select
and compare, that produce canonical IR even if
the input is 0. Targets that do support that
input are expected to handle this case and
produce optimal assembly.
This patch optimizes the CTTZ/CTLZ instructions
for a zero input with a `DAG combine` that
generates the cttz(x) & 0x1f pattern (the
same goes for ctlz).
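A small sketch of why the mask can replace the select, assuming a 32-bit cttz that is defined to return 32 for a zero input and a select whose zero-input result is 0. Both are assumptions made for illustration; the function names are hypothetical.

```python
def cttz32(x):
    """Count trailing zeros of a 32-bit value; returns 32 for x == 0."""
    x &= 0xFFFFFFFF
    if x == 0:
        return 32
    return (x & -x).bit_length() - 1


def select_form(x):
    """Canonical IR shape: compare against zero, then select."""
    return 0 if (x & 0xFFFFFFFF) == 0 else cttz32(x)


def masked_form(x):
    """Combined shape: 32 & 0x1f == 0, and results 0..31 are unchanged."""
    return cttz32(x) & 0x1F
```

The two forms agree for every input because the only result outside 0..31 is the zero-input result 32, which the mask maps to 0.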
Differential Revision: https://reviews.llvm.org/D151449
Adapt the existing ANY/ZERO_EXTEND_VECTOR_INREG shuffle matching to also recognise SIGN_EXTEND_VECTOR_INREG patterns to handle cases where we're effectively "splatting" all-signbits sources.
Currently, when compiling for an execute-only target without movt,
EmitStructByval generates a constant pool load, which isn't
compatible with execute-only. Handle this by emitting tMOVi32imm,
and also simplify the existing movt handling by emitting t2MOVi32imm
or MOVi32imm.
Differential Revision: https://reviews.llvm.org/D154944
The expansion of the various MOVi32imm pseudo-instructions works by
splitting the operand into components (either halfwords or bytes) and
emitting instructions to combine those components into the final
result. When the operand is an immediate with some components being
zero this can result in pointless instructions that just add zero.
Avoid this by restructuring things so that a separate function handles
splitting the operand into components, then don't emit the component
if it is a zero immediate. This is straightforward for movw/movt,
where we just don't emit the movt if it's zero, but the thumb1
expansion using mov/add/lsl is more complex, as even when we don't
emit a given byte we still need to get the shift correct.
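The thumb1 case can be sketched as below. This is an illustrative model only, not the expansion code in LLVM: the `movs`/`lsls`/`adds` tuples are a hypothetical encoding of the mov/add/lsl sequence, and `run` is a tiny simulator added to check the result.

```python
def split_bytes(imm):
    """Split a 32-bit immediate into bytes, most significant first."""
    return [(imm >> shift) & 0xFF for shift in (24, 16, 8, 0)]


def expand_thumb1(imm):
    """Build a mov/lsl/add sequence, skipping adds of zero bytes."""
    insts = []
    pending_shift = 0  # shift owed for bytes passed since the last emit
    started = False
    for byte in split_bytes(imm & 0xFFFFFFFF):
        if started:
            pending_shift += 8
        if byte == 0:
            continue  # nothing to add, but the shift is still owed
        if not started:
            insts.append(("movs", byte))
            started = True
        else:
            insts.append(("lsls", pending_shift))
            insts.append(("adds", byte))
        pending_shift = 0
    if not started:
        return [("movs", 0)]
    if pending_shift:
        insts.append(("lsls", pending_shift))
    return insts


def run(insts):
    """Simulate the sequence on a single 32-bit register."""
    reg = 0
    for op, value in insts:
        if op == "movs":
            reg = value
        elif op == "lsls":
            reg = (reg << value) & 0xFFFFFFFF
        elif op == "adds":
            reg = (reg + value) & 0xFFFFFFFF
    return reg
```

For 0x12003400 this yields four instructions (mov, shift by 16, add, shift by 8) rather than the seven a fixed byte-by-byte sequence would use, with the skipped bytes accounted for in the merged shift amounts.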
Differential Revision: https://reviews.llvm.org/D154943
Currently for armv6-m and armv8-m.baseline, we emit constant pool code when we
use execute-only (XO) in combination with stack guards.
XO is a new feature for armv6-m, and this patch is part of a series of patches
that substitutes constant pool generation with the tMOVi32imm equivalent.
However, XO for armv8-m.baseline has been available for about 6 years, so
for armv8-m.baseline this is a bugfix.
Reviewed By: simonwallis2, olista01
Differential Revision: https://reviews.llvm.org/D155170