Fix issue1: In mips1-4, require a minimum of 2 instructions between a
mflo/mfhi and the next mul/dmult/div/ddiv/divu/ddivu instruction.
Fix issue2: In mips1-4, should not put mflo into the delay slot for the
return.
Fix https://github.com/llvm/llvm-project/issues/81291
This adds support for:
* `muslabin32` (MIPS N32)
* `muslabi64` (MIPS N64)
* `muslf32` (LoongArch ILP32F/LP64F)
* `muslsf` (LoongArch ILP32S/LP64S)
As we start adding glibc/musl cross-compilation support for these
targets in Zig, it would make our life easier if LLVM recognized these
triples. I'm hoping this'll be uncontroversial since the same has
already been done for `musleabi`, `musleabihf`, and `muslx32`.
I intentionally left out a musl equivalent of `gnuf64` (LoongArch
ILP32D/LP64D); my understanding is that Loongson ultimately settled on
simply `gnu` for this much more common case, so there doesn't *seem* to
be a particularly compelling reason to add a `muslf64` that's basically
deprecated on arrival.
Note: I don't have commit access.
We encountered this problem in Zig, causing all of our
`mips(el)-linux-gnueabi*` tests to fail:
https://github.com/ziglang/zig/issues/21215
For these unusual cases, let's just bail in `MipsFastISel` since
`MipsTargetLowering` can handle them fine.
Note: I don't have commit access.
… instructions.
This is a fix I stumbled upon while working on something else. I decided
to break it out since it seems like a good "first issue" to submit. I
updated the comments in the "wrong error" test files to indicate that
the messages are no longer incorrect, but I left the names of the test
files alone. I was not sure what to do with those, so I would appreciate
thoughts or guidance.
MIPSr6 has max.s/max.d/min.s/min.d instructions, which can be used as
fcanonicalize.
For pre-R6, we have no instructions that can fcanonicalize an float, so
let's use `fadd Y,X,X` to quiet it if it is NaN.
IEEE754-2008 requires that the result of general-computational and
quiet-computational operation shouldn't be signal NaN.
The renamable flag is useful during MachineCopyPropagation but renamable
flag will be dropped after lowerCopy in some case.
This patch introduces extra arguments to pass the renamable flag to
copyPhysReg.
The o32 ABI specifies:
> Each relocation type of R_MIPS_HI16 must have an associated R_MIPS_LO16 entry immediately following it in the list of relocations. [...] the addend AHL is computed as (AHI << 16) + (short)ALO
In practice, the high-part and low-part relocations may not be adjacent
in assembly files, requiring the assembler to reorder relocations.
http://reviews.llvm.org/D19718 performed the reordering, but did not
optimize for the common case where a %lo immediately follows its
matching %hi. The quadratic time complexity could make sections with
many relocations very slow to process.
This patch implements the fast path, simplifies the code, and makes the
behavior more similar to GNU assembler (for the .rel.mips_hilo_8b test).
We also remove `OriginalSymbol`, removing overhead for other targets.
Fix#104562
Pull Request: https://github.com/llvm/llvm-project/pull/104723
For MIPS's o32 ABI (REL), https://reviews.llvm.org/D19718 introduced
`OriginalAddend` to find the matching R_MIPS_LO16 relocation for
R_MIPS_GOT16 when STT_SECTION conversion is applicable.
lw $2, %lo(local1)
lui $2, %got(local1)
However, we could just store the original `Addend` in
`ELFRelocationEntry` and remove `OriginalAddend`.
Note: The relocation ordering algorithm in
https://reviews.llvm.org/D19718 is inefficient (#104562), which will be
addressed by another patch.
We need to mask the SRL result to 8 bits before ORing in the SLL. This
is needed in case bits 23:16 of the input aren't zero. They will have
been shifted into bits 15:8.
We don't need to AND the result with 0xffff. It's ok if the upper 16
bits of the register are garbage.
Fixes#103035.
1. Add MipsPat to optimize (andi (srl (truncate i64 $1), x), y) to (andi
(truncate (dsrl i64 $1, x)), y).
2. Add MipsPat to optimize (ext (truncate i64 $1), x, y) to (truncate
(dext i64 $1, x, y)).
The assembly result is the same as gcc.
Fixes https://github.com/llvm/llvm-project/issues/42826
Generating the mapping from a register class to a register bank is
complex:
- there can be lots of register classes
- the mapping may be ambiguos
- a register class can span several register banks (e.g. a register
class containing all registers)
- the type information is not enough to decide which register bank to
map to (e.g. a register class containing floating point and vector
registers, and all register can represent a f64 value)
The approach taken here is to encode the register banks in an array
indexed by the ID of the register class. To save space, the entries are
packed into chunks of size 2^n.
The parameter is confusing as it duplicates MCStreamer::isVeboseAsm
(initialized from MCTargetOptions::AsmVerbose). After
233cca169237b91d16092c82bd55ee6a283afe98, no in-tree target uses the
parameter.
This reverts commit 740161a9b98c9920dedf1852b5f1c94d0a683af5.
I moved the `ISD` dependencies into the CodeGen portion of the handling,
it's a little awkward but it's the easiest solution I can think of for
now.
MachineFunction's probably should not include a backreference to
the owning MachineModuleInfo. Most of these references were used
just to query the MCContext, which MachineFunction already directly
stores. Other contexts are using it to query the LLVMContext, which
can already be accessed through the IR function reference.
Well, not quite that simple. We can tc memset since it returns the first
argument but bzero doesn't do that and therefore we can end up
miscompiling.
This patch also refactors the logic out of isInTailCallPosition() into the callers.
As a result memcpy and memmove are also modified to do the same thing
for consistency.
rdar://131419786
Summary:
The LTO pass and LLD linker have logic in them that forces extraction
and prevent internalization of needed runtime calls. However, these
currently take all RTLibcalls into account, even if the target does not
support them. The target opts-out of a libcall if it sets its name to
nullptr. This patch pulls this logic out into a class in the header so
that LTO / lld can use it to determine if a symbol actually needs to be
kept.
This is important for targets like AMDGPU that want to be able to use
`lld` to perform the final link step, but does not want the overhead of
uncalled functions. (This adds like a second to the link time trivially)
Summary:
These Libcalls represent which functions are available to the backend.
If a runtime call is not available, the target sets the the name to
`nullptr`. Currently, this logic is spread around the various targets.
This patch pulls all of the locations that disable libcalls into the
intializer. This patch is effectively NFC.
The motivation behind this patch is that currently the LTO handling uses
the list of all runtime calls to determine which functions cannot be
internalized and must be extracted from static libraries. We do not want
this to happen for libcalls that are not emitted by the backend. A
follow-up patch will move out this logic so the LTO pass can know which
rtlib calls are actually used by the backend.
Some targets only pass a TargetMachine & to their subtarget constructor
and require a static_cast to their target-specific TargetMachine subclass
to create *InstructionSelector.
These 3 targets already have the correct TargetMachine subclass
reference so no cast is needed.
The InstructionSelector objects all take these arguments in as `const`.
This function does not modify the object. Therefore we can mark them as
`const` here.
This reverts commit ab58b6d58edf6a7c8881044fc716ca435d7a0156.
In `CodeGen/Generic/MachineBranchProb.ll`, `llc` crashed with dumped MIR
when targeting PowerPC. Move test to `llc/new-pm`, which is X86
specific.