Spill/reload instructions are artificially generated by the compiler and
have no relation to the original source code. So the best thing to do is
not attach any debug location to them (instead of just taking the next
debug location we find on following instructions).
Refer to https://reviews.llvm.org/rG3e081703c349dd00b8ef6991c2d15964915dd8f4
Reviewed By: asb, kito-cheng, benshi001
Differential Revision: https://reviews.llvm.org/D129173
We need a pseudo for each scalar FP register class. Previously
we distinguished the pseudos by naming them with F16, F32, F64, or
BF16 in place of the F in the normal instruction name.
Because these strings can appear in other parts of the name we had
to do things like matching "_VBF16" to "_VF".
This patch replaces the F16, F32, F64 strings with FPR16, FPR32, and
FPR64. We also use FPR16 for BF16 since that is the scalar register
class for bf16.
Since the FPR16/32/64 string does not appear anywhere else in the pseudo
names, we can use this to simplify the string replacements. This
also allows us to simplify some BF16 related code.
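A minimal sketch of why a unique token makes the rewriting simpler. The helper name `stripScalarFPRegClass` and the pseudo names used with it are hypothetical and only illustrate the idea; they are not the code from this patch.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: map a scalar-FP pseudo name back to the generic "F"
// spelling. Since "FPR16", "FPR32" and "FPR64" are assumed to occur at most
// once in a pseudo name, a plain substring replacement is sufficient and no
// special cases (such as "_VBF16" -> "_VF") are needed.
std::string stripScalarFPRegClass(std::string Name) {
  for (const char *Token : {"FPR16", "FPR32", "FPR64"}) {
    std::string::size_type Pos = Name.find(Token);
    if (Pos == std::string::npos)
      continue;
    // Encode the uniqueness assumption explicitly.
    assert(Name.find(Token, Pos + 1) == std::string::npos &&
           "register class token expected to be unique");
    Name.replace(Pos, std::string(Token).size(), "F");
    break;
  }
  return Name;
}
```

With the older F16/F32/F64/BF16 spellings the same substring could match in other parts of the name, which is what forced the extra matching rules.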
Reviewed By: wangpc
Differential Revision: https://reviews.llvm.org/D157749
These test cases previously caused an error. RISCVInstrInfo::copyPhysReg also needed a tweak in order to account for copying bf16 values in FPR16 registers.
Differential Revision: https://reviews.llvm.org/D156883
Instead of checking '!Zfh && Zfhmin' first, handle Zfh. Then assert
that the other case is F+Zfhmin. The F+Zfhmin check will need to be
relaxed for bfloat16 support. As it was written before there would
be no error to catch that. Instead it would just silently create
fsgnj.h instructions.
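The reordered check can be sketched as follows. `Features`, `CopyOpcode` and `selectFPR16CopyOpcode` are hypothetical stand-ins for the subtarget queries and opcodes; the fallback to a single-precision copy is an assumption made for illustration.

```cpp
#include <cassert>

// Hypothetical stand-ins for subtarget feature queries and opcodes.
struct Features {
  bool HasF;
  bool HasZfh;
  bool HasZfhmin;
};
enum class CopyOpcode { FSGNJ_H, FSGNJ_S };

// Pick the opcode used to copy a value held in an FPR16 register.
CopyOpcode selectFPR16CopyOpcode(const Features &ST) {
  // Handle the fully supported case first.
  if (ST.HasZfh)
    return CopyOpcode::FSGNJ_H;
  // Everything else must be F+Zfhmin. Asserting here makes an unexpected
  // configuration fail loudly instead of silently emitting fsgnj.h.
  assert(ST.HasF && ST.HasZfhmin && "unexpected extensions for FPR16 copy");
  // Assumed fallback: copy through the containing 32-bit register.
  return CopyOpcode::FSGNJ_S;
}
```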
Depends on D154628
For the cover letter of the patch-set, please check out D154628.
This is the 2nd patch of the patch-set.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D154629
If the operands to the mul have other uses we may be extending their
live range past a kill flag.
Reviewed By: asb, asi-sc
Differential Revision: https://reviews.llvm.org/D155046
This change continues with the line of work discussed in https://discourse.llvm.org/t/riscv-transition-in-vector-pseudo-structure-policy-variants/71295.
This change handles most of the binary pseudos. I excluded pseudos which have _TIED variants, and those that produce mask results. Both are a bit different in functionality, and deserve their own change and review. As with previous changes in the series, we replace the existing TA and TU forms with a single unified pseudo with a passthru (which may be implicit_def) and a policy operand.
As before, we see codegen changes (some improvements and some regressions) due to scheduling differences caused by the extra implicit_def instructions.
Differential Revision: https://reviews.llvm.org/D154245
This change continues with the line of work discussed in https://discourse.llvm.org/t/riscv-transition-in-vector-pseudo-structure-policy-variants/71295. In D153155, we started removing the legacy distinction between unsuffixed (TA) and _TU pseudos. This patch continues that effort for the unary instruction families.
The change consists of a few interacting pieces:
* Adding a vector policy operand to VPseudoUnaryNoMaskTU.
* Then using VPseudoUnaryNoMaskTU for all cases where VPseudoUnaryNoMask was previously used and deleting the unsuffixed form.
* Then renaming VPseudoUnaryNoMaskTU to VPseudoUnaryNoMask, and adjusting the RISCVMaskedPseudo table to use the combined pseudo.
* Fixing up two places in C++ code which manually construct VMV_V_* instructions.
Normally, I'd try to factor this into a couple of changes, but in this case, the table structure is tied to naming and thus we can't really separate the otherwise NFC bits.
As before, we see codegen changes (some improvements and some regressions) due to scheduling differences caused by the extra implicit_def instructions.
Differential Revision: https://reviews.llvm.org/D153899
We were only checking for the previous instruction to write exactly
the register or a super register. We ignored writes to a subregister
and continued searching for the producing instruction. We need to
abort instead.
There's another check inside the if body to abort if the registers
don't match exactly. So we just need to check for overlap so we
enter the if body.
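A toy model of the corrected scan, assuming registers can be described as inclusive ranges of register units. The types and function names are hypothetical; the real code queries register overlap through the target register info.

```cpp
#include <cassert>

// Toy register model: a register covers an inclusive range of register units.
struct Reg {
  unsigned FirstUnit;
  unsigned LastUnit;
};

// True if the two registers share at least one unit (same register,
// sub-register, or super-register).
bool regsOverlap(Reg A, Reg B) {
  return A.FirstUnit <= B.LastUnit && B.FirstUnit <= A.LastUnit;
}

bool isSameReg(Reg A, Reg B) {
  return A.FirstUnit == B.FirstUnit && A.LastUnit == B.LastUnit;
}

enum class ScanResult { KeepSearching, FoundProducer, Abort };

// Classify one earlier instruction that writes `Written` while we look for
// the producer of `Wanted`.
ScanResult classifyWrite(Reg Written, Reg Wanted) {
  assert(Written.FirstUnit <= Written.LastUnit &&
         Wanted.FirstUnit <= Wanted.LastUnit);
  // Unrelated write: keep walking backwards.
  if (!regsOverlap(Written, Wanted))
    return ScanResult::KeepSearching;
  // Any overlapping write enters the "if body"; only an exact match is the
  // producer, a sub- or super-register write aborts the search.
  if (!isSameReg(Written, Wanted))
    return ScanResult::Abort;
  return ScanResult::FoundProducer;
}
```

Before the fix, the sub-register case corresponded to `KeepSearching`, which let the scan walk past a partial redefinition.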
Reviewed By: fakepaper56
Differential Revision: https://reviews.llvm.org/D153490
With `-fsanitize=kcfi` (Kernel Control-Flow Integrity), Clang emits
"kcfi" operand bundles to indirect call instructions. Similarly to
the target-specific lowering added in D119296, implement KCFI operand
bundle lowering for RISC-V.
This patch disables the generic KCFI pass for RISC-V in Clang, and
adds the KCFI machine function pass in `RISCVPassConfig::addPreSched`
to emit target-specific `KCFI_CHECK` pseudo instructions before calls
that have KCFI operand bundles. The machine function pass also bundles
the instructions to ensure we emit the checks immediately before the
calls, which is not possible with the generic pass.
`KCFI_CHECK` instructions are lowered in `RISCVAsmPrinter` to a
contiguous code sequence that traps if the expected hash in the
operand bundle doesn't match the hash before the target function
address. This patch emits an `ebreak` instruction for error handling
to match the Linux kernel's `BUG()` implementation. Just like for X86,
we also emit trap locations to a `.kcfi_traps` section to support
error handling, as we cannot embed additional information to the trap
instruction itself.
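The comparison performed by the emitted sequence can be modelled as below. This is only a sketch of the check's semantics: `kcfiCheckPasses`, the word-indexed `Memory` vector and the assumption that the hash occupies the word directly before the callee's first instruction are illustrative, and the real sequence is emitted as machine code by `RISCVAsmPrinter`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns true when the indirect call may proceed and false when the check
// would reach the trap (`ebreak`).
// `Memory` models 32-bit words; `TargetIndex` is the index of the callee's
// first instruction, and the type hash is assumed to sit in the word
// immediately before it.
bool kcfiCheckPasses(const std::vector<std::uint32_t> &Memory,
                     std::size_t TargetIndex, std::uint32_t ExpectedHash) {
  assert(TargetIndex >= 1 && TargetIndex < Memory.size());
  return Memory[TargetIndex - 1] == ExpectedHash;
}
```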
Relands commit 62fa708ceb027713b386c7e0efda994f8bdc27e2 with fixed
tests.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D148385
We can mostly get this from the operand info in MCInstrDesc.
The exception is the _TIED pseudos so I've added a new flag for those.
Reviewed By: frasercrmck
Differential Revision: https://reviews.llvm.org/D152313
Sometimes a developer would like to have more control over cmov vs branch. We have unpredictable metadata in LLVM IR, but currently it is ignored by the X86 backend. Propagate this metadata and avoid cmov->branch conversion in X86CmovConversion for cmov with this metadata.
Example:
```
int MaxIndex(int n, int *a) {
  int t = 0;
  for (int i = 1; i < n; i++) {
    // cmov is converted to branch by X86CmovConversion
    if (a[i] > a[t]) t = i;
  }
  return t;
}
int MaxIndex2(int n, int *a) {
  int t = 0;
  for (int i = 1; i < n; i++) {
    // cmov is preserved
    if (__builtin_unpredictable(a[i] > a[t])) t = i;
  }
  return t;
}
```
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D118118
This commit implements the two NTLH intrinsic functions.
```
type __riscv_ntl_load (type *ptr, int domain);
void __riscv_ntl_store (type *ptr, type val, int domain);
```
```
enum {
  __RISCV_NTLH_INNERMOST_PRIVATE = 2,
  __RISCV_NTLH_ALL_PRIVATE,
  __RISCV_NTLH_INNERMOST_SHARED,
  __RISCV_NTLH_ALL
};
```
We encode the non-temporal domain into MachineMemOperand flags.
1. Create the RISC-V built-in function with custom semantic checking.
2. Assume the domain argument is a compile-time constant,
and turn it into LLVM IR metadata (nontemp_node).
3. Encode the domain value as a two-bit MachineMemOperand TargetMMOflag.
4. According to the MachineMemOperand TargetMMOflag, select the corresponding ntlh instruction.
Currently, it supports scalar types and fixed-length vector types.
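A sketch of how the four domain values fit into two flag bits. The flag constants and function names here are hypothetical (the actual bit positions are target-defined), and the "subtract 2" mapping is an assumption that follows from the enum starting at 2.

```cpp
#include <cassert>

// Hypothetical two-bit flag layout; the real bit positions are defined by
// the target's MachineMemOperand flags.
constexpr unsigned NontemporalBit0 = 1u << 0;
constexpr unsigned NontemporalBit1 = 1u << 1;

// Domain values follow the enum above: 2 (INNERMOST_PRIVATE) .. 5 (ALL).
unsigned encodeDomainSketch(int Domain) {
  assert(Domain >= 2 && Domain <= 5 && "domain out of range");
  int Level = Domain - 2; // 0..3 fits in two bits
  unsigned Flags = 0;
  if (Level & 1)
    Flags |= NontemporalBit0;
  if (Level & 2)
    Flags |= NontemporalBit1;
  return Flags;
}

// Inverse mapping, as instruction selection would need it to pick the
// matching ntl.* hint.
int decodeDomainSketch(unsigned Flags) {
  int Level = ((Flags & NontemporalBit0) ? 1 : 0) |
              ((Flags & NontemporalBit1) ? 2 : 0);
  return Level + 2;
}
```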
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D143364
It was only in RISCVInstrInfo because it was used by 2 passes, but those
passes have been merged in D147173.
Reviewed By: asb
Differential Revision: https://reviews.llvm.org/D147174
I think the failure was caused by a mistake in an earlier patch.
Original commit message:
We've supported .insn for non-compressed for a while. This finishes the compressed support.
Differential Revision: https://reviews.llvm.org/D146663
We've supported .insn for non-compressed for a while. This finishes the compressed support.
Reviewed By: asb
Differential Revision: https://reviews.llvm.org/D146663
I think it's good practice to avoid having default ctors unless they're really
valid/useful. For OutlinedFunction the default ctor was used to represent a
bail-out value for getOutliningCandidateInfo(), so I changed the API to return
an optional<OutlinedFunction> instead which seems a tad cleaner.
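A minimal sketch of the pattern, with simplified stand-in types. `OutlinedFunctionSketch`, its fields and `getCandidateInfoSketch` are hypothetical and do not mirror the real outliner signatures.

```cpp
#include <optional>
#include <vector>

// Stand-in type without a default constructor: every instance describes a
// real outlining opportunity.
struct OutlinedFunctionSketch {
  OutlinedFunctionSketch(unsigned NumCandidates, unsigned SequenceSize)
      : NumCandidates(NumCandidates), SequenceSize(SequenceSize) {}
  unsigned NumCandidates;
  unsigned SequenceSize;
};

// The bail-out is expressed as std::nullopt rather than as a
// default-constructed "empty" object.
std::optional<OutlinedFunctionSketch>
getCandidateInfoSketch(const std::vector<unsigned> &CandidateStarts,
                       unsigned SequenceSize) {
  // Outlining a sequence that occurs fewer than twice cannot pay off.
  if (CandidateStarts.size() < 2)
    return std::nullopt;
  return OutlinedFunctionSketch(
      static_cast<unsigned>(CandidateStarts.size()), SequenceSize);
}
```

Callers then test the result before use, so a bail-out can no longer be mistaken for a valid value.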
Differential Revision: https://reviews.llvm.org/D146375
This does not work by a mere composition of `enumerate` and `zip_equal`,
because C++17 does not allow for recursive expansion of structured
bindings.
This implementation uses `zippy` to manage the iteratees and adds the
stream of indices as the first zipped range. Because we have an upfront
assertion that all input ranges are of the same length, we only need to
check if the second range has ended during iteration.
As a consequence of using `zippy`, `enumerate` will now follow the
reference and lifetime semantics of the `zip*` family of functions. The
main difference is that `enumerate` exposes each tuple of references
through a new tuple-like type `enumerate_result`, with the familiar
`.index()` and `.value()` member functions.
Because the `enumerate_result` returned on dereference is a
temporary, enumeration result can no longer be used through an
lvalue ref.
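The flat binding this enables can be illustrated with a standalone C++17 sketch. It is an eager, simplified stand-in (`enumerateBoth` is a hypothetical name) and not the lazy `zippy`-based implementation described above; it only shows the index being carried as the first element of each tuple and the upfront length assertion.

```cpp
#include <cassert>
#include <cstddef>
#include <tuple>
#include <vector>

// Eagerly build (index, a, b) tuples of references into both ranges.
template <typename A, typename B>
std::vector<std::tuple<std::size_t, const A &, const B &>>
enumerateBoth(const std::vector<A> &As, const std::vector<B> &Bs) {
  // Upfront check that both ranges have the same length.
  assert(As.size() == Bs.size() && "ranges must have equal length");
  std::vector<std::tuple<std::size_t, const A &, const B &>> Out;
  Out.reserve(As.size());
  for (std::size_t I = 0; I < As.size(); ++I)
    Out.emplace_back(I, As[I], Bs[I]);
  return Out;
}

// Usage: one flat structured binding instead of nested ones.
int indexWeightedDot(const std::vector<int> &Xs, const std::vector<int> &Ys) {
  int Sum = 0;
  for (auto [Idx, X, Y] : enumerateBoth(Xs, Ys))
    Sum += static_cast<int>(Idx) * X * Y;
  return Sum;
}
```

Note that the loop binds each tuple by value (`auto`), in line with the remark that the result of dereferencing is a temporary.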
Reviewed By: dblaikie, zero9178
Differential Revision: https://reviews.llvm.org/D144503
D145471 added overrides of the other signature to return MemBytes,
but shouldn't have removed these overrides.
These signatures will now call the MemBytes signature and ignore
the MemBytes. This matches X86.
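The forwarding relationship between the two signatures can be sketched like this. `Instr` is a hypothetical stand-in for a machine instruction and the functions are free functions rather than the real `RISCVInstrInfo` members; a return value of 0 is taken to mean "not a stack load".

```cpp
// Hypothetical stand-in for a machine instruction.
struct Instr {
  bool IsStackLoad;
  int FrameIndex;
  unsigned DestReg;
  unsigned Bytes;
};

// Signature that also reports how many bytes are read from the slot.
unsigned isLoadFromStackSlot(const Instr &MI, int &FrameIndex,
                             unsigned &MemBytes) {
  if (!MI.IsStackLoad)
    return 0;
  FrameIndex = MI.FrameIndex;
  MemBytes = MI.Bytes;
  return MI.DestReg;
}

// Signature without the byte count: forward and ignore MemBytes.
unsigned isLoadFromStackSlot(const Instr &MI, int &FrameIndex) {
  unsigned Dummy = 0;
  return isLoadFromStackSlot(MI, FrameIndex, Dummy);
}
```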
Refer to: https://reviews.llvm.org/D44782
After https://reviews.llvm.org/D130302, LW+SEXT.B can be folded into LB
as a partial reload of a stack slot. `StackSlotColoring` then optimizes
incorrectly because it is not given the exact number of bytes loaded
from the stack: LB+SW is misinterpreted as a full reload/restore of the
stack slot without the sign extension, so the SW is considered a
redundant store.
The testcase is copied from llvm/test/CodeGen/X86/pr30821.mir.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D145471
The instructions in the XTHeadMac extension (multiply accumulate
instructions) were marked as commutative but because the destination
register was also an input (accumulate) register and was connected to
the destination register with a register allocator constraint, all
three operands (instead of two) were incorrectly considered
commutative. To fix that, an appropriate fixCommutedOpIndices call was
added for these instructions in findCommutedOpIndices.
New test functions have been added to test the correct behaviour in
xtheadmac.ll.
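A simplified, standalone sketch of the index-fixing logic. `fixCommutedOpIndicesSketch` is a reimplementation written for illustration, not the LLVM function itself, and the choice of operands 2 and 3 as the commutable pair (with operand 1 being the tied accumulator) is an assumption about the operand layout.

```cpp
constexpr unsigned CommuteAnyOperandIndex = ~0u;

// Restrict the requested operand pair to the two operands that really are
// commutable. An index equal to CommuteAnyOperandIndex means "pick for me".
bool fixCommutedOpIndicesSketch(unsigned &ResultIdx1, unsigned &ResultIdx2,
                                unsigned CommutableOpIdx1,
                                unsigned CommutableOpIdx2) {
  if (ResultIdx1 == CommuteAnyOperandIndex &&
      ResultIdx2 == CommuteAnyOperandIndex) {
    ResultIdx1 = CommutableOpIdx1;
    ResultIdx2 = CommutableOpIdx2;
  } else if (ResultIdx1 == CommuteAnyOperandIndex) {
    if (ResultIdx2 == CommutableOpIdx1)
      ResultIdx1 = CommutableOpIdx2;
    else if (ResultIdx2 == CommutableOpIdx2)
      ResultIdx1 = CommutableOpIdx1;
    else
      return false;
  } else if (ResultIdx2 == CommuteAnyOperandIndex) {
    if (ResultIdx1 == CommutableOpIdx1)
      ResultIdx2 = CommutableOpIdx2;
    else if (ResultIdx1 == CommutableOpIdx2)
      ResultIdx2 = CommutableOpIdx1;
    else
      return false;
  } else {
    // Both indices fixed by the caller: they must be exactly the pair.
    return (ResultIdx1 == CommutableOpIdx1 &&
            ResultIdx2 == CommutableOpIdx2) ||
           (ResultIdx1 == CommutableOpIdx2 && ResultIdx2 == CommutableOpIdx1);
  }
  return true;
}

// For a multiply-accumulate, only the two multiplicands (assumed to be
// operands 2 and 3) may be swapped; the tied accumulator (operand 1) may not.
bool canCommuteMacOperands(unsigned Idx1, unsigned Idx2) {
  return fixCommutedOpIndicesSketch(Idx1, Idx2, 2, 3);
}
```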
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D144278
Some immediate types in the RISCV target description lack an operand type field.
This leads to them being listed as OPERAND_UNKNOWN in MCOperandInfo. This patch adds these fields.
This is NFC because it does not affect the flow of any current tool implementation.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D144105
For in-order cores MachineCombiner makes better decisions when the critical path
is calculated only for the current basic block and does not take into account
other blocks from the trace.
This patch adds a virtual method to TargetInstrInfo to allow each target to decide
which strategy to use.
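The shape of such a hook can be sketched as below. All names (`TraceStrategy`, `TargetHooks`, `InOrderTargetHooks`, `getCombinerTraceStrategy`) are hypothetical stand-ins chosen for illustration and are not the identifiers introduced by the patch.

```cpp
// Hypothetical strategy enum: whole-trace critical path vs. current block only.
enum class TraceStrategy { MinInstrCount, Local };

// Stand-in for the target hook interface: the default keeps the existing
// trace-based behaviour.
struct TargetHooks {
  virtual ~TargetHooks() = default;
  virtual TraceStrategy getCombinerTraceStrategy() const {
    return TraceStrategy::MinInstrCount;
  }
};

// A target with in-order cores opts into the block-local strategy.
struct InOrderTargetHooks : TargetHooks {
  TraceStrategy getCombinerTraceStrategy() const override {
    return TraceStrategy::Local;
  }
};

// The combiner asks the target instead of hard-coding one strategy.
TraceStrategy pickStrategy(const TargetHooks &TII) {
  return TII.getCombinerTraceStrategy();
}
```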
Depends on D140541
Reviewed By: spatel
Differential Revision: https://reviews.llvm.org/D140542
The motivation behind this patch is to unify some of the outliner logic across architectures. This looks nicer in general and makes fixing [issues like this](https://reviews.llvm.org/D124707#3483805) easier.
There are some notable changes here:
1. `isMetaInstruction()` is used directly instead of checking for specific meta-instructions like `IMPLICIT_DEF` or `KILL`. This was already done in the RISC-V implementation, but other architectures still did hardcoded checks.
- As an exception to this, CFI instructions are explicitly delegated to the target because RISC-V has different handling for those.
2. `isTargetIndex()` checks are replaced with an assert; none of the architectures supported actually use `MO_TargetIndex` at this point in time.
3. `isCFIIndex()` and `isFI()` checks are also replaced with asserts, since these operands should not exist in [any context](https://reviews.llvm.org/D122635#3447214) at this stage in the pipeline.
Reviewed by: paquette
Differential Revision: https://reviews.llvm.org/D125072
This patch adds the instructions of the Zcb extension.
Instructions in the Zcb extension provide shorter encodings for some of the bit manipulation instructions.
Co-authored-by: Craig Topper <craig.topper@sifive.com>
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D131141