Mips requires fp128 args/returns to be passed differently than i128. It
handles this by inspecting the pre-legalization type. However, for soft
float libcalls, the original type is currently not provided (it will
look like a i128 call). To work around that, MIPS maintains a list of
libcalls working on fp128.
This patch removes that list by providing the original, pre-softening
type to calling convention lowering. This is done by carrying additional
information in CallLoweringInfo, as we unfortunately do need both types
(we want the un-softened type for OrigTy, but we need the softened type
for the actual register assignment etc.)
This is in preparation for completely removing all the custom
pre-analysis code in the Mips backend and replacing it with use of
OrigTy.
It is common to have ABI requirements for illegal types: For example,
two i64 argument parts that originally came from an fp128 argument may
have a different call ABI than ones that came from a i128 argument.
The current calling convention lowering does not provide access to this
information, so backends come up with various hacks to support it (like
additional pre-analysis cached in CCState, or bypassing the default
logic entirely).
This PR adds the original IR type to InputArg/OutputArg and passes it
down to CCAssignFn. It is not actually used anywhere yet, this just does
the mechanical changes to thread through the new argument.
Add LLVM Context to getOptimalMemOpType and findOptimalMemOpLowering. So
that we can use EVT::getVectorVT to generate EVT type in
getOptimalMemOpType.
Related to [#146673](https://github.com/llvm/llvm-project/pull/146673).
This PR takes the work previously done by @pawan-nirpal-031 on X86 in
#106370, and makes it available in common code. This should enable all
targets to use `__builtin_canonicalize` for all `f(16|32|64|128)` data
types.
Canonicalization is implemented here as multiplication by `1.0`, as
suggested in [the
docs](https://llvm.org/docs/LangRef.html#llvm-canonicalize-intrinsic).
Now we define FMAXNUM and FMINNUM as IEEE754-2008 with +0.0>-0.0.
MIPSr6's fmax/fmin just follow this rules full.
FMAXNUM_IEEE and FMINNUM_IEEE will be removed in future once:
1. Fixes FMAXNUM/FMINNUM for all targets
2. The use of FMAXNUM_IEEE/FMINNUM_IEEE are not used by middle end
anymore.
issue reason:
Because mips3 has the feature 'FeatureGP64Bit', when target mips3
process function `writeVarArgRegs`, the result of `getGPRSizeInBytes` is
8 and the result of `GetVarArgRegs` is `Mips::A0, Mips::A1, Mips::A2,
Mips::A3`. This would generate `gpr64 = COPY $a1` which should be `gpr64
= COPY $a1_64`.
Also when process `CC_Mips_FixedArg`, mips would CCDelegateTo
`CC_MipsO32_FP`. In fact, it should CCDelegateTo `CC_MipsN`.
Fix#98716.
Set it to true only if isInt<16>.
By default implemention defines them to true always. For most cases,
MIPS uses 16bit IMM, and for microMIPS, ICMP and ADDiu have 16bit IMM
flavors.
The llvm.readcyclecounter intrinsic can be implemented via the `rdhwr
$2, $hwr_cc` instruction.
$hwr_cc: High-resolution cycle counter. This register provides read
access to the coprocessor 0 Count Register.
Fix#106318.
The llvm.readcyclecounter intrinsic can be implemented via the `rdhwr
$3, $hwr_cc` instruction.
$hwr_cc: High-resolution cycle counter. This register provides read
access to the coprocessor 0 Count Register.
Fix#106318.
This is needed for architectures that actually use strict pointer
arithmetic instead of integers such as AArch64 with FEAT_CPA (see
https://github.com/llvm/llvm-project/pull/105669) or CHERI. Using an
index as the first operand of pointer arithmetic may result in an
invalid output.
While there are quite a few codegen changes here, these only change the
order of registers in add instructions. One MIPS combine had to be
updated to handle the new node order.
Reviewed By: topperc
Pull Request: https://github.com/llvm/llvm-project/pull/125279
I want to use this function for GISel too so Type * is a better common
interface. All of the callers already convert EVT to Type * as needed
by calling lowering anyway.
MIPSr6 has max.s/max.d/min.s/min.d instructions, which can be used as
fcanonicalize.
For pre-R6, we have no instructions that can fcanonicalize an float, so
let's use `fadd Y,X,X` to quiet it if it is NaN.
IEEE754-2008 requires that the result of general-computational and
quiet-computational operation shouldn't be signal NaN.
This reverts commit 740161a9b98c9920dedf1852b5f1c94d0a683af5.
I moved the `ISD` dependencies into the CodeGen portion of the handling,
it's a little awkward but it's the easiest solution I can think of for
now.
Well, not quite that simple. We can tc memset since it returns the first
argument but bzero doesn't do that and therefore we can end up
miscompiling.
This patch also refactors the logic out of isInTailCallPosition() into the callers.
As a result memcpy and memmove are also modified to do the same thing
for consistency.
rdar://131419786
Summary:
The LTO pass and LLD linker have logic in them that forces extraction
and prevent internalization of needed runtime calls. However, these
currently take all RTLibcalls into account, even if the target does not
support them. The target opts-out of a libcall if it sets its name to
nullptr. This patch pulls this logic out into a class in the header so
that LTO / lld can use it to determine if a symbol actually needs to be
kept.
This is important for targets like AMDGPU that want to be able to use
`lld` to perform the final link step, but does not want the overhead of
uncalled functions. (This adds like a second to the link time trivially)
Summary:
These Libcalls represent which functions are available to the backend.
If a runtime call is not available, the target sets the the name to
`nullptr`. Currently, this logic is spread around the various targets.
This patch pulls all of the locations that disable libcalls into the
intializer. This patch is effectively NFC.
The motivation behind this patch is that currently the LTO handling uses
the list of all runtime calls to determine which functions cannot be
internalized and must be extracted from static libraries. We do not want
this to happen for libcalls that are not emitted by the backend. A
follow-up patch will move out this logic so the LTO pass can know which
rtlib calls are actually used by the backend.
MIPS max.fmt/min.fmt instructions is IEEE2008 compatiable. If either
argument is sNaN, the result will be NaN.
So we define fminnum_ieee instead of fminnum in Mips32r6InstrInfo.td. We
also should define fcanonicalize. So that we can define fminnum as
expand to fcanonicalize and fminnum_ieee.
- The behavior is similar to UCOMISD on x86, which is also used to
compare two fp values, specifically on handling of NaNs.
- Update related tests regarding this change.
- The further goal is to implement `llvm.minimum` and `llvm.maximum`
intrinsics for MIPS R6 and Pre-R6.
Part of https://github.com/llvm/llvm-project/issues/64207
CallSiteInfo is originally used only for argument - register pairs. Make
it struct, in which we can store additional data for call sites.
Also, the variables/methods used for CallSiteInfo are named for its
original use case, e.g., CallFwdRegsInfo. Refactor these for the
upcoming
use, e.g. addCallArgsForwardingRegs() -> addCallSiteInfo().
An upcoming patch will add type ids for indirect calls to propogate them
from
middle-end to the back-end. The type ids will be then used to emit the
call
graph section.
Original RFC:
https://lists.llvm.org/pipermail/llvm-dev/2021-June/151044.html
Updated RFC:
https://lists.llvm.org/pipermail/llvm-dev/2021-July/151739.html
Differential Revision: https://reviews.llvm.org/D107109?id=362888
Co-authored-by: Necip Fazil Yildiran <necip@google.com>
Argument-register pairs in CallSiteInfo is only needed when
EmitCallSiteInfo
is on. Currently, the pairs are always pushed to the vector but only
used
when EmitCallSiteInfo is on.
Don't fill the CallSiteInfo vector unless used.
Differential Revision: https://reviews.llvm.org/D107108?id=362887
Co-authored-by: Necip Fazil Yildiran <necip@google.com>
MIPSr6 ISA requires normal load/store instructions support
misunaligned memory access, while it is not always do so
by hardware. On some microarchitectures or some corner cases
it may need support by OS.
Don't confuse with pre-R6's lwl/lwr famlily: MIPSr6 doesn't
support them, instead, r6 requires lw instruction support
misunaligned memory access. So, if -mstrict-align is used for
pre-R6, lwl/lwr won't be disabled.
If -mstrict-align is used for r6 and the access is not well
aligned, some lb/lh instructions will be used to replace lw.
This is useful for OS kernels.
To be back-compatible with GCC, -m(no-)unaligned-access are also
added as Neg-Alias of -m(no-)strict-align.
The Mips target uses two TargetOpcode enumerators called
`PseudoD_SELECT_I` and `PseudoD_SELECT_I64`. A SDAG node is created
using these enumerators which is manually selected in
`MipsSEISelDAGToDAG.cpp` and ultimately expanded in
`EmitInstrWithCustomInserter` in `MipsISelLowering.cpp`.
This is not causing any upstream build to fail at the moment but it is
not guaranteed that these enumerators do not clash with Target ISD nodes
(i.e. those in the `MipsISD` namespace). We have seen this happening in
our downstream builds in which `Mips::PseudoD_SELECT_I` ends having the
same integer value as `MipsISD::VEXTRACT_ZEXT_ELT`. This confuses the
function `trySelect` in `MipsSEISelDAGToDAG.cpp` and causes a crash in 3
tests.
This change adds a new Target ISD opcode for these two cases and uses
them for the SDAG nodes. No test is included because this is a potential
error in the future not one that can be demonstrated in the current
codebase.
This include 2 fixes:
1. Disallow 'f' for softfloat.
2. Allow 'r' for softfloat.
Currently, 'f' is accpeted by clang, then LLVM meets an internal error.
'r' is rejected by LLVM by: couldn't allocate input reg for constraint
'r'.
Fixes: #64241, #63632
---------
Co-authored-by: Fangrui Song <i@maskray.me>
This follows on from #76708, allowing
`cast<ConstantSDNode>(N)->getZExtValue()` to be replaced with just
`N->getAsZextVal();`
Introduced via `git grep -l "cast<ConstantSDNode>\(.*\).*getZExtValue" |
xargs sed -E -i
's/cast<ConstantSDNode>\((.*)\)->getZExtValue/\1->getAsZExtVal/'` and
then using `git clang-format` on the result.
This helper function shortens examples like
`cast<ConstantSDNode>(Node->getOperand(1))->getZExtValue();` to
`Node->getConstantOperandVal(1);`.
Implemented with:
`git grep -l
"cast<ConstantSDNode>\(.*->getOperand\(.*\)\)->getZExtValue\(\)" | xargs
sed -E -i
's/cast<ConstantSDNode>\((.*)->getOperand\((.*)\)\)->getZExtValue\(\)/\1->getConstantOperandVal(\2)/`
and `git grep -l
"cast<ConstantSDNode>\(.*\.getOperand\(.*\)\)->getZExtValue\(\)" | xargs
sed -E -i
's/cast<ConstantSDNode>\((.*)\.getOperand\((.*)\)\)->getZExtValue\(\)/\1.getConstantOperandVal(\2)/'`.
With a couple of simple manual fixes needed. Result then processed by
`git clang-format`.
This patch replaces uses of StringRef::{starts,ends}with with
StringRef::{starts,ends}_with for consistency with
std::{string,string_view}::{starts,ends}_with in C++20.
I'm planning to deprecate and eventually remove
StringRef::{starts,ends}with.