This was set if `TT.isTargetAEABI()`. This was previously set above
if `TM.isAAPCS_ABI() && (TT.isTargetAEABI() || TT.isTargetGNUAEABI() ||
TT.isTargetMuslAEABI() || TT.isAndroid())`.
So this could differ based on a manually specified -target-abi flag, due
to the `isAAPCS_ABI` part of the original condition. I'm guessing
these should be consistent, so either this second group of
setLibcallImpl calls should have been guarded by the `isAAPCS_ABI`
check, or the first condition should drop it.
There doesn't appear to be any meaningful test coverage using the
manually specified ABI option, so #152108 tries to remove it.
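A minimal sketch of the inconsistency described above (condition structure only; the surrounding code and the exact setLibcallImpl calls are approximated):
```cpp
// First group: guarded by both the computed ABI and the triple.
if (TM.isAAPCS_ABI() && (TT.isTargetAEABI() || TT.isTargetGNUAEABI() ||
                         TT.isTargetMuslAEABI() || TT.isAndroid())) {
  // setLibcallImpl(...) calls for the AEABI helpers ...
}

// Second group: guarded by the triple alone, so a manually specified
// -target-abi flag can make the two groups disagree.
if (TT.isTargetAEABI()) {
  // more setLibcallImpl(...) calls ...
}
```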
These float operations were expanded for scalar f32/f64/f128, but not
for f16 and, more problematically, not for vectors. A small subset of
them was separately set to expand for vectors.
Change these to always expand by default, and adjust targets to mark
these as legal where necessary instead.
This is a much safer default, and avoids unnecessary legalization
failures because a target failed to manually mark them as expand.
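A hedged sketch of the resulting pattern (the operations and types shown are illustrative, not the exact set touched by this change):
```cpp
// In the generic TargetLoweringBase constructor: expand these FP
// operations for every type by default, scalar and vector alike.
for (MVT VT : MVT::all_valuetypes())
  setOperationAction({ISD::FSIN, ISD::FCOS, ISD::FCBRT, ISD::FLOG},
                     VT, Expand);

// In a target's lowering constructor: opt back in where the target
// actually has an instruction for the operation.
setOperationAction(ISD::FSIN, MVT::f32, Legal);
```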
Fixes https://github.com/llvm/llvm-project/issues/110753.
Fixes https://github.com/llvm/llvm-project/issues/121390.
At the moment the following piece of code causes undefined behavior:
```c
int a;
void b() {
  register float d2 asm("d2") = a;
  asm("" ::"r"(d2));
}
```
This happens because the variable and register types are incompatible:
d2 is a 64-bit floating-point register, but the variable is a 32-bit float.
Follow-up to 28417e64, and to the whole line of work started with 4b81dc7.
This change merges the handling for VPStore - currently in
lowerInterleavedVPStore - into the existing dedicated routine used in
the shuffle lowering path. This removes the last use of the dedicated
lowerInterleavedVPStore and thus we can remove it.
This contains two functional changes.
First, like in 28417e64, merging support for vp.store exposes the
strided store optimization for code using vp.store.
Second, it seems the strided store case had a significant missed
optimization: we were performing the strided store at the full
unit-strided store type width (i.e. LMUL) rather than reducing it to
match the input width. This became obvious when I tried to use the
mask created by the helper routine, as doing so caused a type
incompatibility.
Normally, I'd try not to include an optimization in an API rework, but
structuring the code to both be correct for vp.store and not optimize
the existing case turned out to be more involved than seemed worthwhile.
I could pull this part out as a pre-change, but it's a bit awkward on
its own, as it turns out to be somewhat of a half step towards the
possible optimization; the full optimization is complex with the old
code structure.
---------
Co-authored-by: Craig Topper <craig.topper@sifive.com>
This continues in the direction started by commit 4b81dc7. We
essentially merge the handling for VPLoad - currently in
lowerInterleavedVPLoad - into the existing dedicated routine. This
removes the last use of the dedicated lowerInterleavedVPLoad and thus
we can remove it.
This isn't quite NFC, as the main callback has support for the strided
load optimization whereas the VPLoad-specific version didn't. So this
adds the ability to form a strided load for a vp.load deinterleave with
one shuffle used.
I'm surprised at how bad the test coverage is here. There is some
overlap with existing tests, but they aren't comprehensive and do not
cover all the ABIs or all the different types.
Fixes #147935
Add LLVMContext to getOptimalMemOpType and findOptimalMemOpLowering, so
that we can use EVT::getVectorVT to construct vector EVTs in
getOptimalMemOpType.
Related to [#146673](https://github.com/llvm/llvm-project/pull/146673).
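A minimal sketch of what the extra parameter enables (MyTargetLowering is a hypothetical target, and the signature is approximated from the description above):
```cpp
EVT MyTargetLowering::getOptimalMemOpType(
    LLVMContext &Context, const MemOp &Op,
    const AttributeList &FuncAttributes) const {
  // With a context in hand, we can return a vector EVT directly,
  // e.g. a 128-bit v16i8 for large enough copies.
  if (Op.size() >= 16)
    return EVT::getVectorVT(Context, MVT::i8, 16);
  return EVT::Other;
}
```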
This fully consolidates all the calling convention configuration into
RuntimeLibcallInfo. I'm assuming that the __aeabi functions have a
universal calling convention, and that other ABIs simply don't use them.
This will enable splitting RuntimeLibcallInfo into ABI and lowering
components.
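A hedged sketch of the kind of per-implementation configuration this enables (the helper and enumerator names are approximations, not the exact API):
```cpp
// Hypothetical: record the AEABI calling convention with each
// concrete libcall implementation, instead of in the subtarget.
for (RTLIB::LibcallImpl Impl :
     {RTLIB::impl___aeabi_dadd, RTLIB::impl___aeabi_fadd})
  Info.setLibcallImplCallingConv(Impl, CallingConv::ARM_AAPCS);
```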
Previously we had a table with an entry for every Libcall giving
the comparison to use against integer 0 if it was a soft
float compare function. This was only relevant to a handful of
opcodes, so it was wasteful. Now that we can distinguish the
abstract libcall for the compare from the concrete implementation,
we can just directly hardcode the comparison against the libcall
impl without this configuration system.
Instead of associating the libcall with the RTLIB::Libcall, put it
into a table indexed by the RTLIB::LibcallImpl. The LibcallImpls
should contain all ABI details for a particular implementation, not
the abstract Libcall. In the future the wrappers in terms of the
RTLIB::Libcall should be removed.
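A hedged sketch of the direction (the enumerator names are illustrative; the real mapping is driven by the LibcallImpl tables):
```cpp
// Map each concrete soft-float compare implementation to the
// predicate applied to its integer result, keyed on the
// RTLIB::LibcallImpl rather than the abstract RTLIB::Libcall.
CmpInst::Predicate getSoftFloatCmpPredicate(RTLIB::LibcallImpl Impl) {
  switch (Impl) {
  case RTLIB::impl___aeabi_fcmpeq: // returns 1 when equal
    return CmpInst::ICMP_NE;       // "equal" means result != 0
  case RTLIB::impl___eqsf2:        // returns 0 when equal
    return CmpInst::ICMP_EQ;       // "equal" means result == 0
  default:
    return CmpInst::BAD_ICMP_PREDICATE;
  }
}
```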
This is a module-level property that needs to be globally
consistent, so it does not belong in the subtarget.
Now that the Triple knows the default exception handling type,
consolidate the interpretation of None as "use the target's default
exception handling" in TargetMachine and use that. This enables
moving the configuration of UNWIND_RESUME to RuntimeLibcalls.
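A minimal sketch of the consolidation, assuming a triple-level default query (both method names here are approximations of the described interfaces):
```cpp
// Hypothetical: one place in TargetMachine resolves None to the
// triple's default instead of each backend doing it separately.
ExceptionHandling TargetMachine::getExceptionHandlingType() const {
  if (Options.ExceptionModel == ExceptionHandling::None)
    return getTargetTriple().getDefaultExceptionHandling();
  return Options.ExceptionModel;
}
```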
Manually submitting, closes #147226
This marks ffloor as legal provided that armv8 and neon are present (or
fullfp16 for the fp16 instructions). The existing arm_neon_vrintm
intrinsics are auto-upgraded to llvm.floor.
If this is OK I will update the other vrint intrinsics.
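A rough sketch of the legality change (the subtarget predicate names are approximated from the ARM backend):
```cpp
// vrintm rounds toward minus infinity, i.e. ISD::FFLOOR, so mark the
// operation legal where the instructions exist.
if (Subtarget->hasV8Ops() && Subtarget->hasNEON()) {
  for (MVT VT : {MVT::f32, MVT::v2f32, MVT::v4f32})
    setOperationAction(ISD::FFLOOR, VT, Legal);
  if (Subtarget->hasFullFP16())
    for (MVT VT : {MVT::f16, MVT::v4f16, MVT::v8f16})
      setOperationAction(ISD::FFLOOR, VT, Legal);
}
```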
Fix a -Wrange-loop-construct warning that breaks -Werror builds:
```
/llvm-project/llvm/lib/Target/ARM/ARMISelLowering.cpp:2769:19: error: loop variable '[Reg, N]' creates a copy from type 'std::pair<unsigned int, llvm::SDValue> const' [-Werror,-Wrange-loop-construct]
  for (const auto [Reg, N] : RegsToPass) {
                  ^
/llvm-project/llvm/lib/Target/ARM/ARMISelLowering.cpp:2769:8: note: use reference type 'std::pair<unsigned int, llvm::SDValue> const &' to prevent copying
  for (const auto [Reg, N] : RegsToPass) {
       ^~~~~~~~~~~~~~~~~~~~~
                  &
/llvm-project/llvm/lib/Target/ARM/ARMISelLowering.cpp:2954:19: error: loop variable '[Reg, N]' creates a copy from type 'std::pair<unsigned int, llvm::SDValue> const' [-Werror,-Wrange-loop-construct]
  for (const auto [Reg, N] : RegsToPass)
                  ^
/llvm-project/llvm/lib/Target/ARM/ARMISelLowering.cpp:2954:8: note: use reference type 'std::pair<unsigned int, llvm::SDValue> const &' to prevent copying
  for (const auto [Reg, N] : RegsToPass)
       ^~~~~~~~~~~~~~~~~~~~~
                  &
2 errors generated.
```
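The fix suggested by the diagnostic: bind the structured binding by const reference so each `std::pair<unsigned, SDValue>` element is not copied.
```cpp
for (const auto &[Reg, N] : RegsToPass) {
  // ... loop body unchanged ...
}
```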
Work towards separating the ABI existence of libcalls from the
lowering selection. Set the libcall selection through enums, rather
than through raw string names.
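A hedged before/after sketch (the specific libcall and impl enumerator names are illustrative):
```cpp
// Before: select the implementation by raw string name.
setLibcallName(RTLIB::SDIV_I32, "__aeabi_idiv");

// After: select it through the concrete implementation enum.
setLibcallImpl(RTLIB::SDIV_I32, RTLIB::impl___aeabi_idiv);
```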
This prevents CSE from folding multiple READ_REGISTER nodes of
"volatile" registers together, as they would otherwise end up with the
same chain. The new chain output should be the chain from the new
READ_REGISTER node.
Fixes #144845
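A minimal sketch of the intended chain threading (the variable names are illustrative):
```cpp
// Each READ_REGISTER carries its own chain output; threading it
// forward keeps two reads of the same "volatile" register distinct.
SDValue Read = DAG.getNode(ISD::READ_REGISTER, DL,
                           DAG.getVTList(VT, MVT::Other), Chain, RegName);
SDValue Value = Read.getValue(0);
Chain = Read.getValue(1); // use the new chain, not the incoming one
```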
These are module-level properties, and querying them through
a function-level subtarget context is confusing. Plus, we don't
need an aliased name. This doesn't avoid all the uses, just the
ones in the TargetLowering constructor.
We need the full set of ABI options to accurately compute
the full set of libcalls. This partially resolves the missing
information required to compute the set of ARM libcalls.
These are module-level concepts, and attaching them to the
function-level subtarget is confusing. Similarly, the other
helpers that only operate on the triple should also be removed
from the subtarget.
Work towards making RuntimeLibcalls the centralized location for
all libcall information. This requires changing the encoding from
tracking the ISD::CondCode to using CmpInst::Predicate.
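A small hedged illustration of the two encodings of the same comparison (the conversion helper is from llvm/CodeGen/Analysis.h):
```cpp
// Before: tables stored the SelectionDAG-level condition code, tying
// RuntimeLibcalls to SelectionDAG. After: store the IR-level
// predicate and convert where lowering needs a CondCode.
CmpInst::Predicate Pred = CmpInst::ICMP_NE;
ISD::CondCode CC = getICmpCondCode(Pred); // yields ISD::SETNE
```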
These are the easy cases that do not really depend on the subtarget,
other than for the deceptive predicates on the subtarget class. Most
of the rest of the cases here also do not, but this is obscured by
going through helper predicates added onto the subtarget which hide
dependence on TargetOptions.
This saves 2 instructions in the ARM soft float case for fcmp ueq.
This code is written in a confusingly general way. The point
of getCmpLibcallCC is to express that the compiler-rt implementations
of the FP compares are different aliases around functions which may
return -1 in some cases. This does not apply to the call for unordered,
which returns a normal boolean.
Also stop overriding the default value for the unordered compare for ARM.
This was setting it to the same value as the default, which is now assumed.
These Module-level triple checks implemented in the Subtarget are kind of
a pain, and we should probably get rid of all of them across all targets
and stick to a consistent set of names.