97 Commits

Author SHA1 Message Date
knickish
e13f1d1daf
[M68k] ARII atomic load/store (#108982)
Only ARI was supported; this PR adds ARII support for atomic
loads/stores (also with zero displacement). Closes #107939
2024-10-18 23:49:26 +08:00
Alex Rønne Petersen
ad4a582fd9
[llvm] Consistently respect naked fn attribute in TargetFrameLowering::hasFP() (#106014)
Some targets (e.g. PPC and Hexagon) already did this. I think it's best
to do this consistently so that frontend authors don't run into
inconsistent results when they emit `naked` functions. For example, in
Zig, we had to change our emit code to also set `frame-pointer=none` to
get reliable results across targets.

Note: I don't have commit access.
2024-10-18 09:35:42 +04:00
Peter Lafreniere
d3c10b51a9
[M68k] Introduce more MOVI cases (#98377)
Add three more special cases for loading registers with immediates.

The first allows values in the range of [-255, 255] to be loaded with
MOVEQ, even if the register is more than 8 bits and the sign extension
is unwanted. This is done by loading the bitwise complement of the
desired value, then performing a NOT instruction on the loaded register.
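
The complement trick can be modelled in a few lines. This is a minimal sketch, not the backend's code. It assumes the NOT is byte-sized (NOT.B), which the message does not state but which is what makes values outside MOVEQ's own [-128, 127] range reachable. All function names are made up for illustration.

```python
MASK32 = 0xFFFFFFFF

def moveq(imm8):
    """Model MOVEQ: sign-extend an 8-bit immediate into a 32-bit register."""
    if not -128 <= imm8 <= 127:
        raise ValueError("MOVEQ immediate must fit in a signed byte")
    return imm8 & MASK32

def not_byte(reg):
    """Model a byte-sized NOT (assumed NOT.B): invert only the low 8 bits."""
    return reg ^ 0xFF

def moveq_not_immediate(value):
    """Pick the MOVEQ immediate that yields `value` after the byte NOT."""
    pre = (value & MASK32) ^ 0xFF  # register contents needed before the NOT
    # Reinterpret the 32-bit pattern as a signed number.
    return pre - (1 << 32) if pre & 0x80000000 else pre

for value in (200, 255, -255, -129):
    imm = moveq_not_immediate(value)
    result = not_byte(moveq(imm))
    print(f"{value:5d}: moveq #{imm}; not.b -> {result:#010x}")
```

Under that assumption, every value in [-255, -129] and [128, 255] maps to an immediate that MOVEQ can encode.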

This special case is only used when a simple MOVEQ cannot be used, and
is only used for 32 bit data registers. Address registers cannot support
MOVEQ, and the two-instruction sequence is no faster or smaller than a
plain MOVE instruction when loading 16 bit immediates on the 68000, and
likely slower for more sophisticated microarchitectures. However, the
instruction sequence is both smaller and faster than the corresponding
MOVE instruction for 32 bit register widths.

The second special case is for zeroing address registers. This simply
expands to subtracting a register with itself, consuming one instruction
word rather than 2-3, with a small improvement in speed as well.

The last special case is for assigning sign-extended 16-bit values to a
full address register. This takes advantage of the fact that the movea.w
instruction sign extends the output, permitting the immediate to be
smaller. This is similar to using lea with a 16-bit address, which is
not added in this patch as 16-bit absolute addressing is not yet
implemented.
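
The sign-extension property this last case relies on can be sketched as below. The helper names are hypothetical and the model covers only the immediate-to-address-register move, not the backend's selection logic.

```python
def movea_w(imm16):
    """Model MOVEA.W: sign-extend a 16-bit immediate to a 32-bit address register."""
    imm16 &= 0xFFFF
    return imm16 | 0xFFFF0000 if imm16 & 0x8000 else imm16

def fits_movea_w(value):
    """True if a signed 32-bit value survives truncation to 16 bits plus sign extension."""
    return -0x8000 <= value <= 0x7FFF

for value in (0x1234, -2, -0x8000, 0x8000):
    if fits_movea_w(value):
        print(f"{value}: movea.w gives {movea_w(value):#010x}")
    else:
        print(f"{value}: needs a full 32-bit immediate")
```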

This is a v2 submission of #90817. It also creates a 'Data' test
directory to better align with the backend's tablegen layout.
2024-09-03 19:04:23 -04:00
Michael Liao
8e4b8155c1 [M68k] Fix compilation pipeline check
- After 'RemoveLoadsIntoFakeUses' is enabled to support llvm.fake.use
2024-09-03 12:36:29 -04:00
Peter Rong
74e4694b8c
[LTO] enable ObjCARCContractPass only on optimized build (#101114)
#92331 tried to run `ObjCARCContractPass` by default, but it caused a
regression on O0 builds and was reverted.
This patch tries to bring that back by:

1. reverting the [revert](1579e9ca9c).
2. calling `createObjCARCContractPass` only on optimized builds.

Tests are updated to reflect the changes. Specifically, all `O0` tests
should not include `ObjCARCContractPass`.

Signed-off-by: Peter Rong <PeterRong@meta.com>
2024-08-09 13:04:25 -07:00
Michael Liao
a1af1de438 [M68k] Fix compilation pipeline check
- After ExpandVP pass is merged into PreISelIntrinsicLowering
2024-08-06 14:29:50 -04:00
Michael Liao
7b0f14394d [M68k] Fix compilation pipeline check
- After 'lowerConstantIntrinsics' is merged into pre-isel lowering
2024-08-01 20:08:03 -04:00
Michael Liao
f5bab9678e [M68k] Fix compilation pipeline check
- Fix check after cab81dd03813ac6333ad7fc031d72b84341fe2b9
2024-05-31 20:06:54 -04:00
Nikita Popov
1579e9ca9c Revert "Run ObjCContractPass in Default Codegen Pipeline (#92331)"
This reverts commit 8cc8e5d6c6ac9bfc888f3449f7e424678deae8c2.
This reverts commit dae55c89835347a353619f506ee5c8f8a2c136a7.

Causes major compile-time regressions for unoptimized builds.
2024-05-24 08:14:26 +02:00
Michael Liao
dae55c8983 [M68k] Fix compilation pipeline check
- This change is after running ObjCContractPass in default codegen
  pipeline (#92331).
2024-05-24 00:26:09 -04:00
Peter Lafreniere
ebbc5de7db
[M68k] Correctly emit non-pic relocations (#89863)
The m68k backend will always emit external calls (including libcalls)
with PC-relative PLT relocations, even in non-pic mode or when -fno-plt
is used.

This is unexpected, as other function calls are emitted with absolute
addressing, and a static code model suggests that there is no PLT. It
also leads to a miscompilation where the call instruction emitted
expects an immediate address, while the relocation emitted for that
instruction is PC-relative.

This miscompilation can even be seen in the default C function in
godbolt:
https://godbolt.org/z/zEoazovzo

Fix the issue by classifying external function references based upon the
pic mode. This triggers a change in the static code model, making it
more in line with the expected behaviour and allowing use of this
backend in more bare-metal situations where a PLT does not exist.

The change avoids the issue where we emit a PLT32 relocation for an
absolute call, and makes libcalls and other external calls use absolute
addressing modes when a static code model is desired.

Further work should be done in instruction lowering and validation to
ensure that miscompilations of the same type don't occur.
2024-05-03 23:14:56 +08:00
Peter Lafreniere
c4c9d4f306
[M68k] Add support for MOVEQ instruction (#88542)
Add support for the moveq instruction, which is both faster and smaller
(1/2 to 1/3 the size) than a move with immediate to register.
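
The size claim follows from the encodings: MOVEQ packs the register and an 8-bit immediate into a single opcode word. A minimal sketch of that encoding, using a hypothetical helper name and the byte sizes of the immediate MOVE forms for comparison:

```python
def encode_moveq(data_reg, imm8):
    """Encode MOVEQ #imm8,Dn as its single 16-bit opcode word."""
    if not 0 <= data_reg <= 7:
        raise ValueError("data register must be D0-D7")
    if not -128 <= imm8 <= 127:
        raise ValueError("immediate must fit in a signed byte")
    # 0111 rrr 0 dddddddd: register in bits 11-9, immediate in bits 7-0.
    return 0x7000 | (data_reg << 9) | (imm8 & 0xFF)

# Encoded sizes in bytes: MOVEQ is one word; MOVE with an immediate needs
# the opcode word plus one (word-sized) or two (long-sized) extension words.
SIZE_BYTES = {"moveq": 2, "move.w #imm,Dn": 4, "move.l #imm,Dn": 6}

print(f"moveq #-1,d1 -> {encode_moveq(1, -1):#06x}")
```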

This change introduces the instruction, along with a set of
pseudoinstructions to handle immediate moves to a register that is
lowered post-RA.

Pseudos are used as moveq can only write to the full register, which
makes matching i8 and i16 immediate loads difficult in tablegen.
Furthermore, selecting moveq before RA constrains that immediate to be
moved into a data register, which may not be optimal.

The bulk of this change is fixes to existing tests, which cover the new
functionality sufficiently.
2024-04-26 20:34:21 +08:00
Peter Lafreniere
614a578034
[M68k] Add support for bitwise NOT instruction (#88049)
Currently the bitwise NOT instruction is not recognized. Add support for
using NOT on data registers. This is a partial implementation that puts
NOT at the same level of support as NEG currently enjoys.

Using not rather than eori cuts the length of the encoded instruction
to a half or a third, leading to a reduction of 4-10 cycles per
instruction on the original 68000.

This change includes tests for both bitwise and arithmetic negation.
2024-04-09 09:07:26 -07:00
Michael Liao
a490bbf539 [M68k] Fix compilation pipeline check
- Add 'Init Undef Pass', which is target-independent now.
2024-03-01 14:49:27 -05:00
Fangrui Song
cd0d11be7a [M68k] Convert tests to opaque pointers (NFC)
2024-02-06 12:53:16 -08:00
Min-Yih Hsu
4bd79ea3fe [M68k] Add pc-relative displacement (PCD) addressing mode for MOVSX
And disable offset folding altogether since we cannot always gain the
precise offset there to see if that fits into a certain size of
displacement.
2023-12-29 11:52:49 -08:00
Min-Yih Hsu
2476e2a911 [M68k] Optimize for overflow arithmetics that will never overflow
We lower overflow arithmetics to its M68kISD counterparts that produce
results of {i16/i32, i8} in which the second result represents CCR. In
the event where we're certain there won't be an overflow, for instance
8 & 16-bit multiplications, we simply use zero in replacement of the
second result.
This patch replaces M68kISD::CMOV that takes this kind of zero or
all-ones CCR as condition value with its corresponding operand value.
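
One way to read the "never overflow" case is arithmetic: a product of two 16-bit operands always fits in a 32-bit result. The check below is a sketch of that fact only, with hypothetical names, and says nothing about how the backend detects the case.

```python
def widening_mul_overflows(a, b, signed):
    """Check whether a 16-bit x 16-bit product overflows a 32-bit result."""
    product = a * b
    if signed:
        return not -(1 << 31) <= product < (1 << 31)
    return not 0 <= product < (1 << 32)

# The extreme operand values give the largest-magnitude products.
signed_extremes = (-0x8000, 0x7FFF)
unsigned_extremes = (0, 0xFFFF)

any_signed = any(widening_mul_overflows(a, b, True)
                 for a in signed_extremes for b in signed_extremes)
any_unsigned = any(widening_mul_overflows(a, b, False)
                   for a in unsigned_extremes for b in unsigned_extremes)
print(any_signed, any_unsigned)
```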
2023-12-26 20:55:23 -08:00
Min-Yih Hsu
6f85075ff7 [M68k] U/SMULd32d16 are not supposed to be commutative
M68k only has 16-bit x 16-bit -> 32-bit variant for multiplications
taking 16-bit operands. We still define two input operands for this
class of instructions, and tie the first operand to the result value.
The problem is that the two operands have different register classes
(DR32 and DR16) hence making these instructions commutative produces
invalid MachineInstr (though the final assembly will still be correct).
2023-12-26 20:55:22 -08:00
Min-Yih Hsu
b80e1acc8c [M68k] Improve codegen of overflow arithmetics
The codegen logic for overflow arithmetics (e.g. llvm.uadd.overflow)
was a mess; overflow multiplications were not even supported.
This patch cleans up the legalization of overflow arithmetics and adds
support for common variants of overflow multiplications.
2023-12-26 11:08:11 -08:00
Sheng
725656bdd8
[M68k] Emit RTE for interrupt handler. (#72787)
Fixes #64833
2023-12-04 15:11:00 +08:00
Tobias Stadler
ba0763e4cb [GlobalISel][M68k] Update test after 373c343
Missed test case in experimental target, which was not covered by pre-merge checks.
2023-11-02 03:32:47 +01:00
Matthias Braun
94aaaf4fb4 Update m68k tests to new block placement
e3cf80c5c1fe55efd8216575ccadea0ab087e79c affected block placement of
some tests in the experimental m68k target. This updates them.
2023-10-25 11:33:56 -07:00
Min-Yih Hsu
fd84b1a99d [M68k] Add new calling convention M68k_RTD
`M68k_RTD` is really similar to X86's stdcall, in which callee pops the
arguments from stack. In LLVM IR it can be written as `m68k_rtdcc`.
This patch also improves how ExpandPseudo Pass handles popping stack at
function returns in the absence of the RTD instruction.

Differential Revision: https://reviews.llvm.org/D149864
2023-10-15 16:12:31 -07:00
Jay Foad
0f10850e51 [CodeGen] Add machine verification to some tests
This is to catch errors in an upcoming patch.
2023-07-24 11:04:10 +01:00
Ian Douglas Scott
6a4e72b232 [M68k][MC] Add support for 32 bit register-register multiply/divide
Previously when targeting 68020+, instruction selection attempted to
emit a 32-bit register-register multiplication, but failed at instruction
selection. With this, it succeeds.

Differential Revision: https://reviews.llvm.org/D152120
2023-06-29 21:39:41 -07:00
Sheng
65b710efc1 [m68k] Fix incorrect handling of TLS when matching addressing mode.
`TargetGlobalTLSAddress` is not considered and handled correctly when matching addressing mode, which leads to an incorrect result of instruction selection.

fixes #63162.

Reviewed By: myhsu

Differential Revision: https://reviews.llvm.org/D153103
2023-06-23 08:30:53 +08:00
Matt Arsenault
80e2c26dfd RegisterCoalescer: Fix name of pass
I finally snapped and fixed this inconsistency.
2023-06-21 10:30:43 -04:00
Sheng
4c2ec08ebc [m68k] Add TLS Support
This patch introduces TLS (Thread-Local Storage) support to the LLVM m68k backend.

Reviewed By: glaubitz

Differential Revision: https://reviews.llvm.org/D144941
2023-06-03 19:09:47 +08:00
Tobias Hieta
f84bac329b
[NFC][Py Reformat] Reformat lit.local.cfg python files in llvm
This is a follow-up to b71edfaa4ec3c998aadb35255ce2f60bba2940b0
since I forgot the lit.local.cfg files in that one.

Reformatting is done with `black`.

If you end up having problems merging this commit because you
have made changes to a python file, the best way to handle that
is to run git checkout --ours <yourfile> and then reformat it
with black.

If you run into any problems, post to discourse about it and
we will try to help.

RFC Thread below:

https://discourse.llvm.org/t/rfc-document-and-standardize-python-code-style

Reviewed By: barannikov88, kwk

Differential Revision: https://reviews.llvm.org/D150762
2023-05-17 17:03:15 +02:00
Craig Topper
4a9e55a522 [M68k] Update divide-by-constant.ll after D150333.
2023-05-13 23:01:32 -07:00
Ian Douglas Scott
34b37c00ab [M68k] Add instruction selection support for zext with PCD addressing
Instruction selection was failing when trying to zero extend a value
loaded from a PC-relative address. This adds support for zero extension
using the "program counter indirect with displacement" addressing mode.
It also adds a test with code that was previously failing to compile.

This fixes a compile error in Rust's libcore.

Differential Revision: https://reviews.llvm.org/D149034
2023-04-29 16:27:16 -07:00
Ian Douglas Scott
7452212681 [M68k] Override CanLowerReturn to fix assertion with large return
If it couldn't fit the return value in two registers, this caused an
error during codegen. It seems this method is implemented in other
backends but not here, and allows it to pass return values in memory
when it isn't able to do so in registers.

Seems to fix compilation of Rust code with certain return types:
https://github.com/rust-lang/rust/issues/89498

Differential Revision: https://reviews.llvm.org/D148856
2023-04-22 12:23:04 -07:00
Min-Yih Hsu
aee4399f58 [M68k] Add subtarget features for M68881/2 FPU
Note that technically both M68000/010 can use M68881, despite the fact
that usually only M68020 and newer ISAs are equipped with M68881/2.
M68040 and newer processors have builtin M68882.

Differential Revision: https://reviews.llvm.org/D147479
2023-04-06 11:09:23 -07:00
Min-Yih Hsu
ed372d194f [M68k] Add support for lowering atomic fence
Ideally we want to lower ATOMIC_FENCE into `__sync_synchronize`.
However, libgcc doesn't implement that builtin as GCC simply generates an
inline assembly barrier whenever there needs to be a fence.

We use a similar way to lower ATOMIC_FENCE.

Differential Revision: https://reviews.llvm.org/D146996
2023-04-01 19:57:04 -07:00
Min-Yih Hsu
876980a59c [M68k] Add support for lowering i1 SIGN_EXTEND_INREG
Lowering i1 (inreg) sext with `(NEG (AND %val, 1))`.
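
The identity behind this lowering can be checked with a small sketch. The function name and the default 32-bit register width are assumptions for illustration.

```python
def sext_inreg_i1(value, bits=32):
    """Sign-extend bit 0 of `value` across a `bits`-wide register via NEG(AND(value, 1))."""
    mask = (1 << bits) - 1
    return -(value & 1) & mask  # 0 stays 0; 1 becomes all ones

for value in (0, 1, 0xFE, 0xFF):
    print(f"{value:#04x} -> {sext_inreg_i1(value):#010x}")
```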
2023-03-27 12:26:51 -07:00
Min-Yih Hsu
a85b37d0ca [M68k] Add support for lowering ATOMIC_SWAP
Lower to calling __sync_lock_test_and_set_* for target < M68020.
2023-03-27 10:58:52 -07:00
Min-Yih Hsu
70511e6176 [M68k] Fix CConvs for pointer type return values
Put the value into A0 instead of data registers. And remove the
redundant `RetCC_M68kCommon` as there aren't many rules shared between
existing CCs other than the pointer one.
This change is tested by existing tests.
2023-03-24 11:00:35 -07:00
Min-Yih Hsu
7335cd0513 [M68k] Add support for basic memory constraints in inline asm
This patch adds support for 'm', 'Q', and 'U' memory constraints.

Differential Revision: https://reviews.llvm.org/D143529
2023-03-08 13:52:34 -08:00
Min-Yih Hsu
058f7449cf [M68k] Provide exception pointer and selector registers
Using d0 for exception pointer and d1 for selector, as suggested by GCC.
2023-02-23 16:25:34 -08:00
Nick Desaulniers
cf86855c44 [M68k] fix test regression introduced by D140180
I added a new pass, callbrprepare, to the pass pipelines in
commit a3a84c9e2511 ("[llvm] add CallBrPrepare pass to pipelines")
but did not test experimental backends.
2023-02-17 09:22:24 -08:00
Simon Pilgrim
97a1c98f8e [M68k] Fix M68k pipeline order test after 4ece50737d5385fb80cfa23f5297d1111f8eed39
2023-01-21 13:00:56 +00:00
Paul Kirth
557a5bc336 [codegen] Add StackFrameLayoutAnalysisPass
Issue #58168 describes the difficulty diagnosing stack size issues
identified by -Wframe-larger-than. For simple code, it's easy to
understand the stack layout and where space is being allocated, but in
more complex programs, where code may be heavily inlined, unrolled, and
have duplicated code paths, it is no longer easy to manually inspect the
source program and understand where stack space can be attributed.

This patch implements a machine function pass that emits remarks with a
textual representation of stack slots, and also outputs any available
debug information to map source variables to those slots.

The new behavior can be used by adding `-Rpass-analysis=stack-frame-layout`
to the compiler invocation. Like other remarks the diagnostic
information can be saved to a file in a machine readable format by
adding -fsave-optimization-record.

Fixes: #58168

Reviewed By: nickdesaulniers, thegameg

Differential Revision: https://reviews.llvm.org/D135488
2023-01-19 01:51:14 +00:00
Paul Kirth
64a553a2b5 Revert "[LoongArch][M68k] Add 'Stack Frame Layout Analysis' to pipeline tests. NFC"
I missed that a forward fix was out when reverting
0a652c540556a118bbd9386ed3ab7fd9e60a9754.

This reverts commit 488bea797e167e6bf5ddab5f7eea78031b575ba0.
2023-01-13 23:07:23 +00:00
Craig Topper
488bea797e [LoongArch][M68k] Add 'Stack Frame Layout Analysis' to pipeline tests. NFC
These targets were missed in D135488.
2023-01-13 14:50:48 -08:00
Craig Topper
1b969c6b60 Recommit "[M68k] Regenerate divide-by-constant.ll. NFC"
Division algorithm was improved in D140750.

Fixes #59802.
2023-01-03 12:24:14 -08:00
Craig Topper
7f9ddd69d8 Revert "[M68k] Regenerate divide-by-constant.ll. NFC"
This reverts commit 0277f849c36ab6fe122b4fa1ae739e82869b5613.

I pasted the wrong bug number.
2023-01-03 12:23:53 -08:00
Craig Topper
0277f849c3 [M68k] Regenerate divide-by-constant.ll. NFC
Division algorithm was improved by D140750.

Fixes #59791.
2023-01-03 12:23:26 -08:00
Jonas Paulsson
5ecd363295 Reapply "[CodeGen] Add new pass for late cleanup of redundant definitions."
This reverts commit 122efef8ee9be57055d204d52c38700fe933c033.

- Patch fixed to not reuse definitions from predecessors in EH landing pads.
- Late review suggestions (by MaskRay) have been addressed.
- M68k/pipeline.ll test updated.
- Init captures added in processBlock() to avoid capturing structured bindings.
- RISCV has this disabled for now.

Original commit message:

A new pass MachineLateInstrsCleanup is added to be run after PEI.

This is a simple pass that removes redundant and identical instructions
whenever found by scanning the MF once while keeping track of register
definitions in a map. These instructions are typically immediate loads
resulting from rematerialization, and address loads emitted by target in
eliminateFrameIndex().
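
The single-scan idea can be illustrated with a toy model. This is only a sketch under strong assumptions: instructions are hypothetical (dest_reg, opcode, operand) tuples rather than MachineInstrs, only immediate loads are treated as removable, and any other write simply invalidates what is known about the register.

```python
def cleanup_redundant_loads(block):
    """Drop immediate loads that rewrite a value the register already holds."""
    known = {}  # register -> immediate it is currently known to hold
    kept = []
    for dest, opcode, operand in block:
        if opcode == "load_imm":
            if known.get(dest) == operand:
                continue  # identical redefinition: redundant
            known[dest] = operand
        else:
            known.pop(dest, None)  # any other write clobbers the register
        kept.append((dest, opcode, operand))
    return kept

block = [
    ("d0", "load_imm", 5),
    ("d1", "load_imm", 7),
    ("d0", "load_imm", 5),  # redundant: d0 still holds 5
    ("d0", "add", "d1"),    # clobbers d0
    ("d0", "load_imm", 5),  # needed again after the clobber
]
cleaned = cleanup_redundant_loads(block)
print(cleaned)
```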

This is enabled by default, but a target could easily disable it by means of
'disablePass(&MachineLateInstrsCleanupID);'.

This late cleanup is naturally not "optimal" in removing instructions as it
is done by looking at phys-regs, but still quite effective. It would be
desirable to improve other parts of CodeGen and avoid these redundant
instructions in the first place, but there are no ideas for this yet.

Differential Revision: https://reviews.llvm.org/D123394

Reviewed By: RKSimon, foad, craig.topper, arsenm, asb
2022-12-05 12:53:50 -06:00
Michał Górny
e99edb9235 Revert "[test] Fix CodeGen/M68k/pipeline.ll after D123394 MachineLateInstrsCleanupPass"
This reverts commit f55880e830e150d98e5340cdc3c4c41867a5514d.
The original change was reverted.
2022-12-05 16:47:17 +01:00
Dmitry Vyukov
dbe8c2c316 Use-after-return sanitizer binary metadata
Currently per-function metadata consists of:
(start-pc, size, features)

This adds a new UAR feature and if it's set an additional element:
(start-pc, size, features, stack-args-size)

Reviewed By: melver

Differential Revision: https://reviews.llvm.org/D136078
2022-12-05 14:40:31 +01:00