With D134950, targets get notified when a virtual register is created and/or
cloned. Targets can do the needful with the delegate callback. AMDGPU propagates
the virtual register flags maintained in the target file itself. They are useful
to identify a certain type of machine operands while inserting spill stores and
reloads. Since RegAllocFast spills the physical register itself, there is no way
its virtual register can be mapped back to retrieve the flags. It can be solved
by passing the virtual register as an additional argument. This argument has no
use when the spill interfaces are called during the greedy allocator or even the
PrologEpilogInserter and can pass a null register in such cases.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D138656
- Test cases for arch only has 16-bit instruction such as ck801/ck802 need
compile with -mattr=+btst16
- Fix the GPR copy instruction with MOV16 for 16-bit only arch.
After the frameindex is resolved, the offset can be negative. It would
be materialized as unsigned integer and can still calculated by add instruction.
CSKY arch has multiple FPU instruction versions such as FPU, FPUv2 and FPUv3 to implement floating operations.
For now, we just only support FPUv2 and FPUv3.
It includes the encoding, asm parsing of instructions and codegen of DAG nodes.
Loading constants inline is expensive on CSKY and it's in general better
to place the constant nearby in code space and then it can be loaded with a
simple 16/32 bit load instruction like lrw.
It needs lift or duplicates constant pool entry to make constant nearby so that lrw instruction can reach.
Lower global symbols such as call/external symbol.
Lower other leaf DAG node such as frame address/block address/jumptable/vastart.
Normally some leaf symbols need reside in constant pool as ABI prefers, and are addressed by
lrw or jsri instructions.
Every symbol in constant pool is lowered with one entry in target constant pool. The
entry has different type corresponding to different leaf node such as blockaddress,
jumptable, or global value.
Complete basic arithmetic operations such as add/sub/mul/div, and it also includes converions
and some specific operations such as bswap.Add load/store patterns to generate different addressing mode instructions.
Also enable some infra such as copy physical register and eliminate frame index.
Ooops. It constructs codegen infra and provide only basic code to generate first add instruction successfully.
Differential Revision: https://reviews.llvm.org/D112206