219 Commits

Author SHA1 Message Date
Patryk Wychowaniec
328cb9b731
[AVR] Remove earlyclobber from LDDRdPtrQ (#85277)
LDDRdPtrQ was marked as `earlyclobber`, which doesn't play well with
GreedyRA (which can generate this instruction through `loadRegFromStackSlot()`).

This seems to be the same case as:

a99b912c9b/llvm/lib/Target/AVR/AVRInstrInfo.td (L1421)

Closes https://github.com/llvm/llvm-project/issues/81911.
2024-03-15 19:07:54 +08:00
Nikita Popov
7bdc80f35c [AVR] Convert tests to opaque pointers (NFC) 2024-02-05 13:55:50 +01:00
Jay Foad
7b3bbd83c0 Revert "[CodeGen] Really renumber slot indexes before register allocation (#67038)"
This reverts commit 2501ae58e3bb9a70d279a56d7b3a0ed70a8a852c.

Reverted due to various buildbot failures.
2023-10-09 12:31:32 +01:00
Jay Foad
2501ae58e3
[CodeGen] Really renumber slot indexes before register allocation (#67038)
PR #66334 tried to renumber slot indexes before register allocation, but
the numbering was still affected by list entries for instructions which
had been erased. Fix this to make the register allocator's live range
length heuristics even less dependent on the history of how instructions
have been added to and removed from SlotIndexes's maps.
2023-10-09 11:44:41 +01:00
Ben Shi
5db0a450be
[AVR] Fix a crash in AVRInstrInfo::insertIndirectBranch (#67324)
Fixes https://github.com/llvm/llvm-project/issues/67042
2023-10-02 21:14:22 +08:00
Guozhi Wei
cbdccb30c2 [RA] Split a virtual register in cold blocks if it is not assigned preferred physical register
If a virtual register is not assigned its preferred physical register, some
COPY instructions will be changed to real register move instructions. In this
case we can try to split the virtual register in colder blocks; if this succeeds,
the original COPY instructions can be deleted, and the new COPY instructions in
colder blocks will be generated as register move instructions. This results in
fewer dynamic register move instructions being executed.

The new test case split-reg-with-hint.ll gives an example: the hot path contains
24 instructions without this patch and only 4 instructions with it.

Differential Revision: https://reviews.llvm.org/D156491
2023-09-15 19:52:50 +00:00
Fangrui Song
806761a762 [test] Change llc -march= to -mtriple=
The issue is uncovered by #47698: for IR files without a target triple,
-mtriple= specifies the full target triple while -march= merely sets the
architecture part of the default target triple, leaving a target triple which
may not make sense, e.g. riscv64-apple-darwin.

Therefore, -march= is error-prone and not recommended for tests without a target
triple. The issue has been benign as we recognize $unknown-apple-darwin as ELF instead
of rejecting it outright.
2023-09-11 14:42:37 -07:00
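The difference between the two flags described in 806761a762 can be sketched as string manipulation on the triple. This is only an illustrative model, not LLVM's implementation: the function names and the host default triple `x86_64-apple-darwin` are assumptions chosen for the example.

```python
def apply_march(default_triple: str, march: str) -> str:
    """Model of -march=: replace only the architecture part of the default triple."""
    _, _, rest = default_triple.partition("-")
    return f"{march}-{rest}"


def apply_mtriple(default_triple: str, mtriple: str) -> str:
    """Model of -mtriple=: the given triple replaces the default one entirely."""
    return mtriple


if __name__ == "__main__":
    host_default = "x86_64-apple-darwin"  # hypothetical host default triple
    print(apply_march(host_default, "riscv64"))    # vendor/OS leak in from the host
    print(apply_mtriple(host_default, "riscv64"))  # nothing leaks in from the host
```

In this model, `-march=` keeps the vendor and OS of whatever machine runs the test, which is how a combination such as `riscv64-apple-darwin` can arise.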
Jay Foad
0f10850e51 [CodeGen] Add machine verification to some tests
This is to catch errors in an upcoming patch.
2023-07-24 11:04:10 +01:00
Patryk Wychowaniec
4e831753b9 [AVR] Expand shifts of all types except int8 and int16
Currently our AVRShiftExpand pass expands only 32-bit shifts, with the
assumption that other kinds of shifts (e.g. 64-bit ones) are
automatically reduced to 8-bit ones by LLVM during ISel.

However this is not always true and causes problems in the rust-lang runtime.

This commit changes the logic a bit, so that instead of expanding only
32-bit shifts, we expand shifts of all types except 8-bit and 16-bit.

This is not the optimal solution, because 64-bit shifts could instead be
expanded to 32-bit shifts, which have been heavily optimized.

I've checked the generated code using rustc + simavr, and all shifts
seem to behave correctly.

Spotted in the wild in rustc:
https://github.com/rust-lang/compiler-builtins/issues/523
https://github.com/rust-lang/rust/issues/112140

Reviewed By: benshi001

Differential Revision: https://reviews.llvm.org/D154785
2023-07-19 11:57:00 +08:00
Jianjian GUAN
eb33db4f91 [AVR] Enable verifyInstructionPredicates for AVR
This patch fixes the failed test of verifyInstructionPredicates which is caused by verifyInstructionPredicates. verifyInstructionPredicates will add JMPk without checking the target predicate.

Reviewed By: benshi001

Differential Revision: https://reviews.llvm.org/D155570
2023-07-19 11:43:39 +08:00
Ben Shi
71d90f3108 [AVR] Optimize 8-bit rotation when rotation bits == 3
Fixes https://github.com/llvm/llvm-project/issues/63100

Reviewed By: aykevl

Differential Revision: https://reviews.llvm.org/D152365
2023-06-11 08:41:47 +08:00
Ben Shi
e21df8296d [AVR] Optimize 8-bit rotation when rotation bits >= 4
Fixes https://github.com/llvm/llvm-project/issues/63100

Reviewed By: aykevl, Patryk27, jacquesguan

Differential Revision: https://reviews.llvm.org/D152130
2023-06-11 08:36:22 +08:00
Ben Shi
f3837e726f [AVR] Fix incorrect expansion of pseudo instruction ROLBRd
Since ROLBRd needs an implicit R1 (on AVR) or an implicit R17 (on AVRTiny),
we split ROLBRd to ROLBRdR1 (on AVR) and ROLBRdR17 (on AVRTiny).

Reviewed By: aykevl, Patryk27

Differential Revision: https://reviews.llvm.org/D152248
2023-06-11 00:20:43 +08:00
Ben Shi
cef723a0fe [AVR] Enable sub register liveness
Reviewed By: Patryk27

Differential Revision: https://reviews.llvm.org/D152606
2023-06-11 00:16:35 +08:00
Ben Shi
3b8c12c18e [AVR][NFC] Improve CodeGen tests
Reviewed By: Patryk27

Differential Revision: https://reviews.llvm.org/D152605
2023-06-11 00:15:20 +08:00
Ben Shi
b1f0cb89c1 [AVR][NFC][test] Supplement more tests of 8-bit rotation
Reviewed By: Patryk27, jacquesguan

Differential Revision: https://reviews.llvm.org/D152129
2023-06-06 11:24:18 +08:00
Ben Shi
53a7c254e4 [AVR][NFC][test] Supplement a test of the pseudo instruction RORBRd
Reviewed By: aykevl, Patryk27

Differential Revision: https://reviews.llvm.org/D152087
2023-06-04 23:19:21 +08:00
Patryk Wychowaniec
ff75a2be34 [AVR] Fix incorrect operands of pseudo instruction 'ROLBRd'
Fixes https://github.com/llvm/llvm-project/issues/63098

Reviewed by: benshi001

Differential Revision: https://reviews.llvm.org/D152063
2023-06-04 11:08:57 +08:00
Tobias Hieta
f84bac329b
[NFC][Py Reformat] Reformat lit.local.cfg python files in llvm
This is a follow-up to b71edfaa4ec3c998aadb35255ce2f60bba2940b0
since I forgot the lit.local.cfg files in that one.

Reformatting is done with `black`.

If you end up having problems merging this commit because you
have made changes to a python file, the best way to handle that
is to run git checkout --ours <yourfile> and then reformat it
with black.

If you run into any problems, post to discourse about it and
we will try to help.

RFC Thread below:

https://discourse.llvm.org/t/rfc-document-and-standardize-python-code-style

Reviewed By: barannikov88, kwk

Differential Revision: https://reviews.llvm.org/D150762
2023-05-17 17:03:15 +02:00
Ben Shi
2a528760bf [AVR] Fix an issue of writing 16-bit ports
For 16-bit ports, normal devices require writing the high byte first
and then the low byte. But XMEGA devices require the reverse order.

Fixes https://github.com/llvm/llvm-project/issues/58395

Reviewed By: aykevl, jacquesguan

Differential Revision: https://reviews.llvm.org/D141752
2023-04-17 15:35:33 +08:00
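The byte-order rule from 2a528760bf can be sketched as a function that returns the sequence of 8-bit writes for one 16-bit port write. This is a minimal model, not the backend code: the function name and the `is_xmega` flag are hypothetical, and it assumes the high byte lives at `addr + 1`.

```python
from typing import List, Tuple


def port16_write_order(addr: int, value: int, is_xmega: bool) -> List[Tuple[int, int]]:
    """Return (address, byte) pairs in the order they should be written."""
    lo = (addr, value & 0xFF)
    hi = (addr + 1, (value >> 8) & 0xFF)  # assumption: high byte at addr + 1
    # Normal devices: high byte first, then low byte. XMEGA: the reverse.
    return [lo, hi] if is_xmega else [hi, lo]


if __name__ == "__main__":
    print(port16_write_order(0x84, 0x1234, is_xmega=False))
    print(port16_write_order(0x84, 0x1234, is_xmega=True))
```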
Ben Shi
811759b100 [AVR] Disable post increment load from program memory space
We temporarily only allow post increment load/store from/to data memory,
and disable post increment load from program space.

Updates https://github.com/llvm/llvm-project/issues/59914

Reviewed By: mzh

Differential Revision: https://reviews.llvm.org/D147761
2023-04-12 11:52:55 +08:00
Ben Shi
aa18091124 [AVR][NFC] Fix errors in commit 6e57f68e41c92936b9ef3a4e6fb286e8805a9fbc 2023-04-10 11:17:06 +08:00
Ben Shi
6e57f68e41 [AVR] Reject invalid LDD instruction with explicit error
We should reject "ldd Rn, X" with an explicit error message
rather than "llvm_unreachable" in llvm's release build.

Fixes https://github.com/llvm/llvm-project/issues/62012

Reviewed By: Miss_Grape

Differential Revision: https://reviews.llvm.org/D147877
2023-04-10 10:34:45 +08:00
Ben Shi
acb4d143bd [AVR] Fix incorrect expansion of pseudo instructions LPMWRdZ/ELPMWRdZ
The 'ELPM' instruction has three forms:

--------------------------
| form        | feature  |
| ----------- | -------- |
| ELPM        | hasELPM  |
| ELPM Rd, Z  | hasELPMX |
| ELPM Rd, Z+ | hasELPMX |
--------------------------

The second form is always used in the expansion of pseudo instructions
LPMWRdZ/ELPMWRdZ. But for devices without ELPMX and with only ELPM,
only the first form can be used.

Reviewed By: aykevl, Miss_Grape

Differential Revision: https://reviews.llvm.org/D141264
2023-04-06 15:27:04 +08:00
Jay Foad
effb7ab6c2 [TwoAddressInstruction] Improve tests for register killed by instruction
Define and use a MachineOperand overload of isPlainlyKilled. This
improves codegen in a couple of tests because it catches the case where
MO does not kill Reg but another operand of the same instruction does.

Differential Revision: https://reviews.llvm.org/D147167
2023-03-30 19:20:03 +01:00
Ben Shi
2a6e39dbf8 [AVR] Do not emit 'LPM Rd, Z' on devices without FeatureLPMX
The 'LPM' instruction has three forms:

------------------------
| form       | feature |
| ---------- | ------- |
| LPM        | hasLPM  |
| LPM Rd, Z  | hasLPMX |
| LPM Rd, Z+ | hasLPMX |
------------------------

The second form is always selected in ISelDAGToDAG, even on devices
without FeatureLPMX. This patch emits "LPM + MOV" on devices with
only FeatureLPM.

Reviewed By: jacquesguan

Differential Revision: https://reviews.llvm.org/D141246
2023-03-24 17:47:24 +08:00
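The form selection described in 2a6e39dbf8 can be sketched as follows. This is a hedged illustration rather than the ISel code: the function name and the textual instruction format are made up for the example. It relies on the fact that the operand-less `lpm` form always loads into `r0`, which is why a `mov` is needed afterwards.

```python
from typing import List


def expand_lpm_byte(rd: str, has_lpmx: bool) -> List[str]:
    """Pick the instruction sequence for loading one byte from program memory via Z."""
    if has_lpmx:
        # 'LPM Rd, Z' is only available with FeatureLPMX.
        return [f"lpm {rd}, Z"]
    # With only FeatureLPM, the bare form implicitly targets r0; copy it out.
    return ["lpm", f"mov {rd}, r0"]


if __name__ == "__main__":
    print(expand_lpm_byte("r24", has_lpmx=True))
    print(expand_lpm_byte("r24", has_lpmx=False))
```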
Ben Shi
4fa9dc9482 [AVR] Fix incorrect expansion of the pseudo 'ELPMBRdZ' instruction
The 'ELPM' instruction has three forms:

--------------------------
| form        | feature  |
| ----------- | -------- |
| ELPM        | hasELPM  |
| ELPM Rd, Z  | hasELPMX |
| ELPM Rd, Z+ | hasELPMX |
--------------------------

The second form is always used in the expansion of the pseudo
instruction 'ELPMBRdZ'. But on devices that have ELPM but not ELPMX,
only the first form can be emitted.

Reviewed By: jacquesguan

Differential Revision: https://reviews.llvm.org/D141221
2023-03-21 11:33:56 +08:00
Ben Shi
30d8f4e843 [AVR] Fix incorrect flags of livein registers when spilling them
In AVRFrameLowering::spillCalleeSavedRegisters(), when a 16-bit
livein register is spilled, two PUSH instructions are generated
for the higher and lower 8-bit registers. But these two 8-bit
registers are marked as killed in the two PUSH instructions, so
any future use of them will cause a crash.

This patch fixes the above issue by adding the two sub 8-bit
registers to the livein list.

Fixes https://github.com/llvm/llvm-project/issues/56423

Reviewed By: jacquesguan

Differential Revision: https://reviews.llvm.org/D144720
2023-02-28 11:08:54 +08:00
Ben Shi
f37d7c9381 [AVR] Optimize 16-bit comparison with a constant
Fixes https://github.com/llvm/llvm-project/issues/30923

Reviewed By: jacquesguan, aykevl

Differential Revision: https://reviews.llvm.org/D142281
2023-02-09 15:13:01 +08:00
Ayke van Laethem
8202a3da3c
[AVR] Support most address space casts
All hardware address spaces on AVR can be freely cast between (they keep
the same bit pattern). They just aren't dereferenceable when they're in
a different address space as they really do point to a separate address
space.

This is supported in avr-gcc: https://godbolt.org/z/9Gfvhnhv9

avr-gcc also supports the `__memx` address space which is 24 bits. We
don't support this address space yet but I've added a safeguard just in
case.

Differential Revision: https://reviews.llvm.org/D142107
2023-01-24 18:41:14 +01:00
Ben Shi
029f669db3 [AVR] Emit 'eicall' for devices with large program memory
Fixes https://github.com/llvm/llvm-project/issues/58856

Reviewed By: aykevl

Differential Revision: https://reviews.llvm.org/D142298
2023-01-23 09:06:10 +08:00
Ben Shi
c919ea5b48 [AVR] Fix incorrectly printed global symbol operands in inline-asm
Fixes https://github.com/llvm/llvm-project/issues/58879

Reviewed By: aykevl

Differential Revision: https://reviews.llvm.org/D142096
2023-01-20 09:45:00 +08:00
Ben Shi
159e2a804d [AVR] Fix a bug in AsmPrinter when printing inline-asm operands
Fixes https://github.com/llvm/llvm-project/issues/58878

Reviewed By: aykevl, Miss_Grape

Differential Revision: https://reviews.llvm.org/D141589
2023-01-13 14:23:41 +08:00
Ayke van Laethem
9592920890
[AVR] Optimize 32-bit shifts: optimize REG_SEQUENCE
This pseudo-instruction stores two small (8-bit) registers into one wide
(16-bit) register. But apparently the order matters a lot to the
register allocator.
This patch changes the order of inserting the registers to optimize for
the best register allocation in the tests of shift32.ll. It might be
detrimental in other cases, but keeping the registers in the same
physical register seems like it would be a common case.

Differential Revision: https://reviews.llvm.org/D140573
2023-01-08 20:05:31 +01:00
Ayke van Laethem
fad5e0cf50
[AVR] Optimize 32-bit shifts: reverse shift + move
This optimization turns shifts of almost a multiple of 8 into a shift
in the opposite direction. Unfortunately it doesn't compose well with
the other optimizations (I've tried) so it's separate from them.

Differential Revision: https://reviews.llvm.org/D140572
2023-01-08 20:05:31 +01:00
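The idea in fad5e0cf50 can be illustrated arithmetically: a left shift by 7 gives the same result as moving the value up by one whole byte and then shifting right by 1. The sketch below models this on plain integers. It assumes that "almost a multiple of 8" means amounts such as 7, 15 or 23; the function name is hypothetical, and the real patch works on registers rather than wide integers.

```python
MASK32 = 0xFFFFFFFF


def shl_via_reverse(x: int, n: int) -> int:
    """Left shift a 32-bit value by n using a byte move plus a short right shift."""
    whole_bytes = (n + 7) // 8        # round the shift amount up to whole bytes
    back = whole_bytes * 8 - n        # bits to shift back in the opposite direction
    wide = x << (8 * whole_bytes)     # byte move, kept wide so no bits are lost
    return (wide >> back) & MASK32    # short reverse shift, then truncate to 32 bits


if __name__ == "__main__":
    # Shift by 7: one byte move plus a single-bit right shift.
    print(hex(shl_via_reverse(0x12345678, 7)))
```

For a shift by 7, this replaces seven single-bit shifts with one byte move and one single-bit shift in the other direction.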
Ayke van Laethem
81f5f22f27
[AVR] Optimize 32-bit shifts: shift by 4 bits
This uses a complicated shift sequence that avr-gcc also uses, but
extended to work over any number of bytes and in both directions
(logical shift left and logical shift right). Unfortunately it can't be
used for an arithmetic shift right: I've tried to come up with a
sequence but couldn't.

Differential Revision: https://reviews.llvm.org/D140571
2023-01-08 20:05:31 +01:00
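A 4-bit left shift over several bytes can be built from nibble swaps, masks and XORs, as sketched below. This models one well-known swap/andi/eor pattern on a little-endian list of bytes; whether it matches the exact sequence emitted by 81f5f22f27 is an assumption, and the function names are hypothetical.

```python
from typing import List


def swap(b: int) -> int:
    """Model of the AVR 'swap' instruction: exchange the two nibbles of a byte."""
    return ((b << 4) | (b >> 4)) & 0xFF


def shl4(bytes_le: List[int]) -> List[int]:
    """Shift a little-endian multi-byte value left by 4 bits."""
    b = list(bytes_le)
    top = len(b) - 1
    b[top] = swap(b[top]) & 0xF0          # swap + andi on the most significant byte
    for i in range(top, 0, -1):
        b[i - 1] = swap(b[i - 1])         # swap lower byte
        b[i] ^= b[i - 1]                  # eor: mix both nibbles into the upper byte
        b[i - 1] &= 0xF0                  # andi: keep only the shifted-up nibble
        b[i] ^= b[i - 1]                  # eor: cancel the unwanted high nibble
    return b


def shl4_u32(x: int) -> int:
    return int.from_bytes(bytes(shl4(list(x.to_bytes(4, "little")))), "little")


if __name__ == "__main__":
    print(hex(shl4_u32(0x12345678)))
```

Each additional byte costs four instructions in this model, regardless of the total width, which is what makes the pattern extendable to any number of bytes.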
Ayke van Laethem
8f8afabd32
[AVR] Optimize 32-bit shift: move bytes around
This patch optimizes 32-bit constant shifts by renaming registers. This
is very effective as the compiler would otherwise need to do a lot of
single bit shift instructions. Instead, the registers are renamed at the
SSA level which means the register allocator will insert the necessary
mov instructions.

Unfortunately, the register allocator will insert some unnecessary movs
with the current code. This will be fixed in a later patch.

Differential Revision: https://reviews.llvm.org/D140570
2023-01-08 20:05:31 +01:00
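The byte-move idea in 8f8afabd32 can be illustrated on a list of four bytes: a shift by a multiple of 8 is just a relabelling of which byte holds which part of the value. The following is a simplified sketch, with a hypothetical function name, that models registers as list elements.

```python
def shl_by_bytes(x: int, n: int) -> int:
    """Left shift a 32-bit value by 8*n bits by moving bytes instead of shifting bits."""
    b = list(x.to_bytes(4, "little"))   # b[0] is the least significant byte
    moved = [0] * n + b[: 4 - n]        # low bytes become zero, the rest move up
    return int.from_bytes(bytes(moved), "little")


if __name__ == "__main__":
    print(hex(shl_by_bytes(0x12345678, 1)))
```

In the compiler this relabelling happens at the SSA level, so no bit-shift instructions are needed for the byte-aligned part of the shift.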
Ayke van Laethem
840d10a1d2
[AVR] Custom lower 32-bit shift instructions
32-bit shift instructions were previously expanded using the default
SelectionDAG expander, which meant it used 16-bit constant shifts and
ORed them together. This works, but is far from optimal.

I've optimized 32-bit shifts on AVR using a custom inserter. This is
done using three new pseudo-instructions that take the upper and lower
bits of the value in two separate 16-bit registers and output two
16-bit registers.

This is the first commit in a series. When completed, shift instructions
will take around 31% fewer instructions on average for constant 32-bit
shifts, and will in all cases be equal to or better than the old behavior. It
also tends to match or outperform avr-gcc: the only cases where avr-gcc
does better are when it uses a loop to shift, or when the LLVM register
allocator inserts some unnecessary movs. But it even outperforms avr-gcc
in some cases where avr-gcc does not use a loop.

As a side effect, non-constant 32-bit shifts also become more efficient.

For some real-world differences: the build of compiler-rt I use in
TinyGo becomes 2.7% smaller and the build of picolibc I use becomes 0.9%
smaller. I think picolibc is a better representation of real-world code,
but even a ~1% reduction in code size is really significant.

The current patch just lays the groundwork. The result is actually a
regression in code size. Later patches will use this as a basis to
optimize these shift instructions.

Differential Revision: https://reviews.llvm.org/D140569
2023-01-08 20:05:31 +01:00
Ayke van Laethem
0408b131eb
[SelectionDAG][AVR] Add support for lrint and lround intrinsics
Integer legalization already supported splitting the output integer of
llround and llrint, but did not support this for lround and lrint yet.
This is not a problem for 32-bit architectures, but for 8/16-bit
architectures like AVR it results in a crash like this:

    ExpandIntegerResult #0: t7: i32 = lround t6

    LLVM ERROR: Do not know how to expand the result of this operator!

This patch simply adds lrint/lround to the list of ISD opcodes to expand.

Fixes https://github.com/llvm/llvm-project/issues/59573.

Differential Revision: https://reviews.llvm.org/D140822
2023-01-08 18:56:07 +01:00
Ayke van Laethem
167338de96
[AVR] correctly declare __do_copy_data and __do_clear_bss
These two symbols are declared in object files to indicate whether .data
needs to be copied from flash or .bss needs to be cleared. They are
supported on avr-gcc and reduce firmware size a bit, which is especially
important on very small chips.

I checked the behavior of avr-gcc and matched it as well as possible.
From my investigation, it seems to work as follows:

__do_copy_data is set when the compiler finds a data symbol:
  * without a section name
  * with a section name starting with ".data" or ".gnu.linkonce.d"
  * with a section name starting with ".rodata" or ".gnu.linkonce.r" and
    flash and RAM are in the same address space

__do_clear_bss is set when the compiler finds a data symbol:
  * without a section name
  * with a section name that starts with .bss

Simply checking whether the calculated section name starts with ".data",
".rodata" or ".bss" should result in the same behavior.

Fixes: https://github.com/llvm/llvm-project/issues/58857

Differential Revision: https://reviews.llvm.org/D140830
2023-01-08 18:56:06 +01:00
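The simplified prefix rule from 167338de96 can be sketched as a small function over section names. This follows only the simplified rule stated in the commit message (prefixes ".data", ".rodata" and ".bss"), not the full avr-gcc behavior; the function name is hypothetical.

```python
from typing import Iterable, Set


def startup_symbols(section_names: Iterable[str]) -> Set[str]:
    """Return which startup helper symbols should be declared for these sections."""
    symbols = set()
    for name in section_names:
        if name.startswith(".data") or name.startswith(".rodata"):
            symbols.add("__do_copy_data")   # data must be copied from flash
        if name.startswith(".bss"):
            symbols.add("__do_clear_bss")   # zero-initialized data must be cleared
    return symbols


if __name__ == "__main__":
    print(sorted(startup_symbols([".text", ".data.counter", ".bss"])))
```

An object file that only contains code declares neither symbol in this model, which is where the firmware size saving comes from.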
Roman Lebedev
3bb5ddd175
[NFC][Codegen][AVR] Make shift.ll autogenerate-able 2022-12-24 19:26:42 +03:00
Ben Shi
a59e96f1a1 [AVR] Select 16-bit LDS/STS for load/store on AVRTiny.
The 32-bit LDS/STS are not available on AVRTiny, so we have
to use their compact 16-bit form for memory access.

Reviewed By: aykevl

Differential Revision: https://reviews.llvm.org/D139687
2022-12-23 11:03:45 +08:00
Ben Shi
c41d425030 [AVR][MC] Fix illegal operand forms.
These operands are illegal and rejected by avr-gcc.
    subi r24, -lo8(symbol+offset)
    sbci r25, -hi8(symbol+offset)

Their correct forms are
    subi r24, lo8(-(symbol+offset))
    sbci r25, hi8(-(symbol+offset))

Reviewed By: aykevl

Differential Revision: https://reviews.llvm.org/D140473
2022-12-23 09:48:06 +08:00
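The operand forms in c41d425030 implement a 16-bit addition by subtracting the negated constant byte by byte. The sketch below models that `subi`/`sbci` pair on plain integers; the helper names are hypothetical and the model ignores all status flags except the borrow.

```python
def lo8(v: int) -> int:
    return v & 0xFF


def hi8(v: int) -> int:
    return (v >> 8) & 0xFF


def add_const16(reg16: int, k: int) -> int:
    """Add k to a 16-bit register pair using subi lo8(-(k)) / sbci hi8(-(k))."""
    neg = -k
    lo = (reg16 & 0xFF) - lo8(neg)            # subi r24, lo8(-(k))
    borrow = 1 if lo < 0 else 0
    hi = (reg16 >> 8) - hi8(neg) - borrow     # sbci r25, hi8(-(k))
    return ((hi & 0xFF) << 8) | (lo & 0xFF)


if __name__ == "__main__":
    print(hex(add_const16(0x1000, 0x0234)))
```

Separately from the syntax question, negating before taking `hi8` is not numerically the same as negating afterwards: for `k = 0x0234`, `hi8(-k)` is `0xFD` while `-hi8(k)` truncated to a byte is `0xFE`.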
Ben Shi
3730f13428 [AVR] Fix a bug in AsmPrinter when printing memory operands.
Reviewed By: aykevl

Differential Revision: https://reviews.llvm.org/D140383
2022-12-23 09:42:29 +08:00
Ayke van Laethem
d1d3005c9f
[AVR] Do not emit instructions invalid for attiny10
The attiny4/attiny5/attiny9/attiny10 have a slightly modified
instruction set that drops a number of useful instructions. This patch
makes sure to not emit them on these "reduced tiny" cores.

The affected instructions are:

  * lds and sts (load/store directly from data)
  * ldd and std (load/store with displacement)
  * adiw and sbiw (add/sub register pairs)
  * various other instructions that were emitted without checking
    whether the chip actually supports them (movw, adiw, etc)

There is a variant of lds and sts on these chips, but it can only
address a limited portion of the address space and is mainly useful to
load/store I/O registers (as an extension to the in and out
instructions). I have not implemented it here; implementing it can be
done in a separate patch.

This patch is not optimal. I'm sure it can be improved a lot. For
example, we could teach the instruction selector to not select lddw/stdw
instructions so that the weird pointer adjustments are not necessary.
But for now I've focused just on correctness, not on code quality.

Updates: https://github.com/llvm/llvm-project/issues/53459

Differential Revision: https://reviews.llvm.org/D131867
2022-12-22 17:04:53 +01:00
Ayke van Laethem
5527b21516
[AVR] Do not use R0/R1 on avrtiny
This patch makes sure the compiler uses R16/R17 on avrtiny (attiny10
etc) instead of R0/R1.

Some notes:

  * For the NEGW and ROLB instructions, it adds an explicit zero
    register. This is necessary because the zero register is different
    on avrtiny (and InstrInfo Uses lines need a fixed register).
  * Not entirely sure about putting all tests in features/avr-tiny.ll,
    but it doesn't seem like the "target-cpu"="attiny10" attribute
    works.

Updates: https://github.com/llvm/llvm-project/issues/53459

Differential Revision: https://reviews.llvm.org/D138582
2022-11-28 18:05:55 +01:00
Ayke van Laethem
91ae1afd3c
[AVR] Remove unused register scavenger
The LPMW/ELPMW instruction can be modified to use an earlyclobber, which
prevents it from using the Z register as an output register.

Also see: https://reviews.llvm.org/D131844

Differential Revision: https://reviews.llvm.org/D117957
2022-11-27 15:31:12 +01:00
Ben Shi
f452b9dcaf [AVR] Fix wrong ABI of AVRTiny.
A scalar which exceeds 4 bytes should be returned via the stack, rather
than via registers, on an AVRTiny device.

Reviewed By: aykevl

Differential Revision: https://reviews.llvm.org/D138201
2022-11-23 09:32:47 +08:00
Ayke van Laethem
a560e57a7e
[AVR] Only push and clear R1 in interrupts when necessary
R1 is a reserved register, but LLVM gives the APIs to know when it is
used or not. So this patch uses these APIs to only save/clear/restore R1
in interrupts when necessary.

The main issue here was getting inline assembly to work. One could argue
that this is the job of Clang, but for consistency I've made sure that
R1 is always usable in inline assembly even if that means clearing it
when it might not be needed.

Information on inline assembly in AVR can be found here:

https://www.nongnu.org/avr-libc/user-manual/inline_asm.html#asm_code

Essentially, this seems to suggest that r1 can be freely used in avr-gcc
inline assembly, even without specifying it as an input operand.

Differential Revision: https://reviews.llvm.org/D117426
2022-08-15 14:29:38 +02:00
Ayke van Laethem
43a8dbc5be
[AVR] Use @earlyclobber instead of register scavenging
The code to support the case when the register allocator has assigned
the same register to the src and the dst register operand isn't actually
needed:

  * LDWRdPtr and LDDWRdPtrQ have an @earlyclobber on the output
    register, so the register allocator will make sure to allocate a
    different register for the output register.
  * LDDWRdYQ does not have an @earlyclobber, but the pointer register is
    the fixed Y register which is reserved. The register allocator won't
    use reserved registers for the output value.

This removes a special case in the code that makes the pseudo
instruction expansion pass more complicated than it needs to be.

Differential Revision: https://reviews.llvm.org/D131844
2022-08-15 14:29:38 +02:00
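The hazard that `@earlyclobber` guards against in 43a8dbc5be can be shown with a toy model of a 16-bit load expanded into two 8-bit loads. This is a simplified sketch, not the pseudo-instruction expansion pass: the register names, the memory contents and the function names are all assumptions for illustration.

```python
from typing import Dict, Tuple

MEM = {0x0100: 0x34, 0x0101: 0x12}  # hypothetical memory holding 0x1234 at 0x0100


def expand_ldw(regs: Dict[str, int], mem: Dict[int, int],
               dst: Tuple[str, str], ptr: Tuple[str, str]) -> int:
    """Model a 16-bit load as two byte loads; dst and ptr are (low, high) register pairs."""
    def pointer() -> int:
        return regs[ptr[0]] | (regs[ptr[1]] << 8)

    regs[dst[0]] = mem.get(pointer(), 0)       # first load writes the low output byte
    regs[dst[1]] = mem.get(pointer() + 1, 0)   # second load re-reads the pointer
    return regs[dst[0]] | (regs[dst[1]] << 8)


def load_with(dst: Tuple[str, str]) -> int:
    regs = {"r24": 0, "r25": 0, "r30": 0x00, "r31": 0x01}  # pointer pair holds 0x0100
    return expand_ldw(regs, MEM, dst, ("r30", "r31"))


if __name__ == "__main__":
    print(hex(load_with(("r24", "r25"))))  # output distinct from the pointer
    print(hex(load_with(("r30", "r31"))))  # output overlaps the pointer
```

When the output pair overlaps the pointer pair, the first byte load overwrites part of the pointer before the second load uses it, so the second byte comes from the wrong address. Marking the output as earlyclobber tells the register allocator never to pick such an overlapping assignment.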