433 Commits

Author SHA1 Message Date
Fangrui Song
728490257e [Sparc,test] Change llc -march= to -mtriple=
Similar to 806761a7629df268c8aed49657aeccffa6bca449

-mtriple= specifies the full target triple while -march= merely sets the
architecture part of the default target triple (e.g. Windows, macOS),
leaving a target triple which may not make sense.

Therefore, -march= is error-prone and not recommended for tests without a target
triple. The issue has been benign as we recognize sparc*-apple-darwin as ELF instead
of rejecting it outrightly.
2024-12-15 10:29:34 -08:00
Koakuma
23d209f350
[SPARC] Allow overaligned allocas (#107223)
SPARC ABI doesn't use stack realignment, so let LLVM know about it in
`SparcFrameLowering`. This has the side effect of making all overaligned
allocations go through `LowerDYNAMIC_STACKALLOC`, so implement the
missing logic there too for overaligned allocations.
This makes the SPARC backend not crash on overaligned `alloca`s and fix
https://github.com/llvm/llvm-project/issues/89569.
2024-11-03 22:53:03 +07:00
Alex Rønne Petersen
5785cbb405
[llvm] Ensure that soft float targets don't emit fma() libcalls. (#106615)
The previous behavior could be harmful in some edge cases, such as
emitting a call to `fma()` in the `fma()` implementation itself.

Do this by just being more accurate in `isFMAFasterThanFMulAndFAdd()`.
This was already done for PowerPC; this commit just extends that to Arm,
z/Arch, and x86. MIPS and SPARC already got it right, but I added tests
for them too, for good measure.

Note: I don't have commit access.
2024-10-19 06:13:15 -07:00
Alex Rønne Petersen
ad4a582fd9
[llvm] Consistently respect naked fn attribute in TargetFrameLowering::hasFP() (#106014)
Some targets (e.g. PPC and Hexagon) already did this. I think it's best
to do this consistently so that frontend authors don't run into
inconsistent results when they emit `naked` functions. For example, in
Zig, we had to change our emit code to also set `frame-pointer=none` to
get reliable results across targets.

Note: I don't have commit access.
2024-10-18 09:35:42 +04:00
Matt Arsenault
187dcd8e22
DAG: Preserve disjoint flag when emitting final instructions (#110795) 2024-10-02 19:37:04 +04:00
Koakuma
dbad963a69
[SPARC] Align i128 to 16 bytes in SPARC datalayouts (#106951)
Align i128s to 16 bytes, following the example at
https://reviews.llvm.org/D86310.

clang already does this implicitly, but do it in backend code too for
the benefit of other frontends (see e.g
https://github.com/llvm/llvm-project/issues/102783 &
https://github.com/rust-lang/rust/issues/128950).
2024-09-30 08:32:33 +07:00
Bevin Hansson
12033e550b
[ISelDAG] Salvage debug info at isel by referring to frame indices. (#109126)
We can refer to frame index locations when salvaging debug info
for certain nodes, which prevents the compiler from optimizing
out the location.
2024-09-24 15:02:04 +02:00
Koakuma
576b7a781a
[SPARC] Remove assertions in printOperand for inline asm operands (#104692)
Inline asm operands could contain any kind of relocation, so remove the
checks.

Fixes https://github.com/llvm/llvm-project/issues/103493
2024-08-20 20:05:06 +07:00
Daniel Cederman
7faf1a0868
[Sparc] Add errata workaround pass for GR712RC and UT700 (#103843)
This patch adds a pass that provides workarounds for the errata
described in GRLIB-TN-0009, GRLIB-TN-0010, GRLIB-TN-0011, GRLIB-TN-0012,
and GRLIB-TN-0013, that are applicable to the GR712RC and UT700. The
documents are available for download from here:

https://www.gaisler.com/index.php/information/app-tech-notes

The pass will detect certain sensitive instruction sequences and prevent
them from occurring by inserting NOP instruction. Below is an overview
of each of the workarounds. A similar implementation is available in
GCC.

GRLIB-TN-0009:

* Insert NOPs to prevent the sequence (stb/sth/st/stf) -> (single
non-store/load instruction) -> (any store)

* Insert NOPs to prevent the sequence (std/stdf) -> (any store)

GRLIB-TN-0010:

* Insert a NOP between load instruction and atomic instruction (swap and
casa).

* Insert a NOP at branch target if load in delay slot and atomic
instruction at branch target.

* Do not allow functions to begin with atomic instruction.

GRLIB-TN-0011:

* Insert .p2align 4 before atomic instructions (swap and casa).

GRLIB-TN-0012:

* Place a NOP at the branch target of an integer branch if it is a
floating-point operation or a floating-point branch.

GRLIB-TN-0013:

* Prevent (div/sqrt) instructions in the delay slot.

* Insert NOPs to prevent the sequence (div/sqrt) -> (two or three
floating point operations or loads) -> (div/sqrt).

* Do not insert NOPs if any of the floating point operations have a
dependency on the destination register of the first (div/sqrt).

* Do not insert NOPs if one of the floating point operations is a
(div/sqrt).

* Insert NOPs to prevent (div/sqrt) followed by a branch.
2024-08-19 07:59:58 +02:00
Sergei Barannikov
991192b211
[Sparc] Remove custom lowering for ADD[CE] / SUB[CE] (#100861)
The default lowering produces fewer instructions.
2024-07-28 18:22:40 +03:00
Sergei Barannikov
77f89f1f54
[Sparc] Remove custom lowering for SMULO / UMULO (#100858)
The underlying issue was fixed by 7c4fe0e9. The lowering is tested by
[us]mulo-128-legalisation-lowering.ll and there are no changes.
2024-07-28 18:15:23 +03:00
Koakuma
edd2d7c558
[NFC][SPARC] Fix typos and style mismatches
Fix style errors accidentally introduced in PRs #87259 and #94245.

Reviewers: rorth, jrtc27, brad0, s-barannikov

Reviewed By: s-barannikov

Pull Request: https://github.com/llvm/llvm-project/pull/96019
2024-06-19 21:44:48 +07:00
Fangrui Song
4a67f80982 [test] Fix check prefixes 2024-05-13 11:25:12 -07:00
Fangrui Song
b9ae06ba15 [test] Convert text files from CRLF to LF
Skip *.pdb, *.rc, *crlf*, and FileCheck/dos-style-eol.txt
2024-05-03 10:09:52 -07:00
Koakuma
697dd93ae3
[SPARC] Implement L and H inline asm argument modifiers (#87259)
This adds support for using the L and H argument modifiers for twinword
operands in inline asm code, such as in:

```
%1 = tail call i64 asm sideeffect "rd %pc, ${0:L} ; srlx ${0:L}, 32, ${0:H}", "={o4}"()
```

This is needed by the Linux kernel.
2024-04-05 04:34:07 +07:00
James Y Knight
c1a99b2c77
[Sparc] limit MaxAtomicSizeInBitsSupported to 32 for 32-bit Sparc. (#81655)
When in 32-bit mode, the backend doesn't currently implement 64-bit
atomics, even though the hardware is capable if you have specified a V9
CPU. Thus, limit the width to 32-bit, for now, leaving behind a TODO.

This fixes a regression triggered by PR #73176.
2024-02-13 15:40:51 -05:00
Koakuma
c2f9885a8a
[SPARC] Support reserving arbitrary general purpose registers (#74927)
This adds support for marking arbitrary general purpose registers -
except for those with special purpose (G0, I6-I7, O6-O7) - as reserved,
as needed by some software like the Linux kernel.
2024-02-11 02:04:18 -05:00
Nikita Popov
ff9af4c43a [CodeGen] Convert tests to opaque pointers (NFC) 2024-02-05 14:07:09 +01:00
Koakuma
118d4234ac
[SPARC] Prefer RDPC over CALL to implement GETPCX for 64-bit target
On 64-bit target, prefer using RDPC over CALL to get the value of %pc.
This is faster on modern processors (Niagara T1 and newer) and avoids
polluting the processor's predictor state.

The old behavior of using a fake CALL is still done when tuning for
classic UltraSPARC processors, since RDPC is much slower there.

A quick pgbench test on a SPARC T4 shows about 2% speedup on SELECT
loads, and about 7% speedup on INSERT/UPDATE loads.

Reviewed By: @s-barannikov

Github PR: https://github.com/llvm/llvm-project/pull/78280
2024-01-16 22:46:39 +07:00
Fangrui Song
f972e4d343 [MC,ELF] .section: unconditionally print section flag 'G' after 'o'
* Placing 'G' before 'M' (SHF_MERGE) can be misleading as the sh_entsize
  argument goes before the section group name, if a reader doesn't know
  that the order of extra arguments is not affected by the order of flags.
* 'a', 'w', and 'x' indicate basic permission-related flags. Separating
  them with 'G' is kinda ugly.

Simplify code and move 'G' after 'o'. The new output is more similar to
GCC.
2024-01-09 10:48:23 -08:00
Sergei Barannikov
b9208aca9b
[Sparc] Remove duplicate ALU and SETHI instructions (NFCI) (#66851)
There are no 64-bit variants of these ALU / SETHI instructions in V9.
Remove these instruction definitions and add patterns to match DAG nodes
to the generic instructions defined in SparcInstrInfo.td.

This is not strictly NFC because of the changes in
`2011-01-11-FrameAddr.ll` test. The reason is that Sparc delay slot
filler pass handled ADDrr but not ADDXrr, which are now the same
instruction.
2023-10-10 20:34:20 +03:00
Jay Foad
7b3bbd83c0 Revert "[CodeGen] Really renumber slot indexes before register allocation (#67038)"
This reverts commit 2501ae58e3bb9a70d279a56d7b3a0ed70a8a852c.

Reverted due to various buildbot failures.
2023-10-09 12:31:32 +01:00
Jay Foad
2501ae58e3
[CodeGen] Really renumber slot indexes before register allocation (#67038)
PR #66334 tried to renumber slot indexes before register allocation, but
the numbering was still affected by list entries for instructions which
had been erased. Fix this to make the register allocator's live range
length heuristics even less dependent on the history of how instructions
have been added to and removed from SlotIndexes's maps.
2023-10-09 11:44:41 +01:00
Jay Foad
01aa0c776d [SPARC] Add a missing SPARC64-LABEL check 2023-09-28 13:15:09 +01:00
Sergei Barannikov
dd477ebd23
[Sparc] Remove LEA instructions (NFCI) (#65850)
LEA_ADDri and LEAX_ADDri are printed / encoded the same way as ADDri. I
had to change the type of simm13Op so that it can be used in both 32-
and 64-bit modes. This required the changes in operands of some
InstAliases.
2023-09-20 03:34:39 +03:00
Jay Foad
e0919b189b [CodeGen] Renumber slot indexes before register allocation (#66334)
RegAllocGreedy uses SlotIndexes::getApproxInstrDistance to approximate
the length of a live range for its heuristics. Renumbering all slot
indexes with the default instruction distance ensures that this estimate
will be as accurate as possible, and will not depend on the history of
how instructions have been added to and removed from SlotIndexes's maps.

This also means that enabling -early-live-intervals, which runs the
SlotIndexes analysis earlier, will not cause large amounts of churn due
to different register allocator decisions.
2023-09-19 11:18:12 +01:00
Rainer Orth
715fc4fc60
[Sparc] Don't emit __multi3 on 32-bit SPARC (#66362)
LLVM fails to build on 32-bit Solaris/SPARC: several programs fail to
link due to undefined references to `__multi3`. This reference is from
`lib/libLLVMScalarOpts.a(LoopStrengthReduce.cpp.o)`. However, This
function exists neither in the 32-bit `libgcc.a` nor in
`libclang_rt.builtins-sparc.a`. It's only defined in their 64-bit
counterparts.

The same issue affects several 32-bit targets, e.g. 32-bit PowerPC as
described in Issue #54460. The fix is the same: inhibit the libcall for
32-bit compilations. This patch does just that, regenerating the
affected testcases. It allows the build to complete.

Tested on `sparc-sun-solaris2.11`.
2023-09-15 07:31:59 +02:00
Fangrui Song
806761a762 [test] Change llc -march= to -mtriple=
The issue is uncovered by #47698: for IR files without a target triple,
-mtriple= specifies the full target triple while -march= merely sets the
architecture part of the default target triple, leaving a target triple which
may not make sense, e.g. riscv64-apple-darwin.

Therefore, -march= is error-prone and not recommended for tests without a target
triple. The issue has been benign as we recognize $unknown-apple-darwin as ELF instead
of rejecting it outrightly.
2023-09-11 14:42:37 -07:00
Simon Pilgrim
3ad4f92f83 [DAG] More aggressively (extract_vector_elt (build_vector x, y), c) iff element is zero constant
We currently don't extract vector elements from multi-use build vectors unless TLI.aggressivelyPreferBuildVectorSources accepts them, which seems a little extreme for constant build vectors (especially as under some cases ComputeKnownBits will indirectly extract the data for us).

This is causing a few regressions in some upcoming SimplifyDemandedBits work I'm looking at, all of which just need to know that the element is zero, so I've tweaked the fold to accept zero elements as well, which will typically fold very easily.

Differential Revision: https://reviews.llvm.org/D155582
2023-07-18 17:31:34 +01:00
Simon Pilgrim
b8bda50932 [Sparc] Regenerate float-constants.ll test checks 2023-07-18 17:31:34 +01:00
Amaury Séchet
015323ff9b [NFC] Autogenerate CodeGen/SPARC/LeonInsertNOPLoadPassUT.ll 2023-06-15 13:24:39 +00:00
Tobias Hieta
f84bac329b
[NFC][Py Reformat] Reformat lit.local.cfg python files in llvm
This is a follow-up to b71edfaa4ec3c998aadb35255ce2f60bba2940b0
since I forgot the lit.local.cfg files in that one.

Reformatting is done with `black`.

If you end up having problems merging this commit because you
have made changes to a python file, the best way to handle that
is to run git checkout --ours <yourfile> and then reformat it
with black.

If you run into any problems, post to discourse about it and
we will try to help.

RFC Thread below:

https://discourse.llvm.org/t/rfc-document-and-standardize-python-code-style

Reviewed By: barannikov88, kwk

Differential Revision: https://reviews.llvm.org/D150762
2023-05-17 17:03:15 +02:00
Brad Smith
c30c291887 [SPARC] Lower BR_CC to BPr on 64-bit target whenever possible
On 64-bit target, when doing i64 BR_CC where one of the comparison operands is a
constant zero, try to fold the compare and BPcc into a BPr instruction.

For all integers, EQ and NE comparison are available, additionally for signed
integers, GT, GE, LT, and LE is also available.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D142461
2023-04-26 18:56:00 -04:00
Brad Smith
eee590ca4b Revert "[SPARC] Lower BR_CC to BPr on 64-bit target whenever possible"
This reverts commit 6590a372fa3f4582c04b4b179f90a3c728e75025.
2023-03-12 04:20:25 -04:00
Koakuma
6590a372fa [SPARC] Lower BR_CC to BPr on 64-bit target whenever possible
On 64-bit target, when doing i64 BR_CC where one of the comparison operands is a
constant zero, try to fold the compare and BPcc into a BPr instruction.

For all integers, EQ and NE comparison are available, additionally for signed
integers, GT, GE, LT, and LE is also available.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D142461
2023-03-11 17:47:53 -05:00
Koakuma
24e300190a [SPARC] Implement hooks for conditional branch relaxation
Integrate the BranchRelaxation pass to help with relaxing out-of-range
conditional branches.

This is mostly of concern for SPARCv9, which uses conditional branches with
much smaller range than its v8 counterparts.
(Some large autogenerated code, such as the ones generated by TableGen, already
hits this limitation when building in Release)

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D142458
2023-03-11 17:42:09 -05:00
Matt Arsenault
778cf5431c IR: Add atomicrmw uinc_wrap and udec_wrap
These are essentially add/sub 1 with a clamping value.

AMDGPU has instructions for these. CUDA/HIP expose these as
atomicInc/atomicDec. Currently we use target intrinsics for these,
but those do no carry the ordering and syncscope. Add these to
atomicrmw so we can carry these and benefit from the regular
legalization processes.
2023-01-24 17:55:11 -04:00
Koakuma
ac16ea89db [SPARC] Fix SELECT_REG emission for f128s
In LowerSELECT_CC, SELECT_REG between two f128s should only be emitted if we
have hardware quadfloat enabled.

This should fix issue #59646

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D140515
2022-12-22 13:52:56 -05:00
Koakuma
eaade37fdd [SPARC] Mark the %g0 register as constant & use it to materialize zeros
Materialize zeros by copying from %g0, which is now marked as constant.

This makes it possible for some common operations (like integer negation) to be
performed in fewer instructions.

This continues @arichardson's patch at D132561.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D138887
2022-12-13 17:25:42 -05:00
Koakuma
f8f41c3fcd [SPARC] Lower SELECT_CC to MOVr on 64-bit target whenever possible
On 64-bit target, when doing i64 SELECT_CC where one of the comparison operands
is a constant zero, try to fold the compare and MOVcc into a MOVr instruction.

For all integers, EQ and NE comparison are available, additionally for signed
integers, GT, GE, LT, and LE is also available.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D138922
2022-12-07 15:34:58 -05:00
Brad Smith
7806f86a5e Revert "[SPARC] Mark the %g0 register as constant & use it to materialize zeros"
2 of the Sparc tests are now failing.

This reverts commit 2c41310fc146a1f609147c65ac5f30e5a57e84a8.
2022-12-07 15:27:57 -05:00
Koakuma
2c41310fc1 [SPARC] Mark the %g0 register as constant & use it to materialize zeros
Materialize zeros by copying from %g0, which is now marked as constant.

This makes it possible for some common operations (like integer negation) to be
performed in fewer instructions.

This continues @arichardson's patch at D132561.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D138887
2022-12-07 13:34:13 -05:00
Koakuma
f63a19baf0 [SPARC] Add tail call support for 64-bit target
Extend SPARC tail call support, first introduced in D51206 (commit 1c235c375492180c2eecb6331f169486019fd2d2),
to also cover 64-bit target.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D138741
2022-11-26 23:29:05 -05:00
Koakuma
fd0aeaa83a [SPARC] Don't emit deprecated FP branches when targeting v9
Don't emit deprecated v8-style FP compares & branches when targeting v9
processors.

For now, always use %fcc0, because currently the allocator requires allocatable
registers to also be spillable, which isn't the case with v9 FCC registers.

The work to enable allocation over the entire FCC register file will be done in
a future patch.

Fixes bug #17834

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D135515
2022-11-16 20:56:17 -05:00
Koakuma
586d5f91e6 [SPARC] Improve integer branch handling for v9 targets
Do not emit deprecated v8-style branches when targeting a v9 processor.

As a side effect, this also fixes the emission of useless ba's when doing
conditional branches on 64-bit integer values.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D130006
2022-11-16 20:51:20 -05:00
Matt Arsenault
48732d3541 SPARC: Register null target streamer
Fixes null dereference in emitFunctionBodyStart for 64-bit
2022-11-02 16:05:34 -07:00
Koakuma
d3fcbee10d [SPARC] Make calls to function with big return values work
Implement CanLowerReturn and associated CallingConv changes for SPARC/SPARC64.

In particular, for SPARC64 there's new `RetCC_Sparc64_*` functions that handles the return case of the calling convention.
It uses the same analysis as `CC_Sparc64_*` family of funtions, but fails if the return value doesn't fit into the return registers.

This makes calls to functions with big return values converted to an sret function as expected, instead of crashing LLVM.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D132465
2022-10-18 00:01:55 +00:00
Craig Topper
30305d7948 [TargetLowering][RISCV][Sparc] Don't emit zero check in CTTZTableLookup for CTTZ_ZERO_UNDEF.
The code incorrectly checked for CTLZ_ZERO_UNDEF instead of
CTTZ_ZERO_UNDEF.

While I was there I flipped the condition into an early out.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D136010
2022-10-17 10:15:39 -07:00
Rainer Orth
d9993484ee [Sparc] Don't use SunStyleELFSectionSwitchSyntax
As discussed in D85414 <https://reviews.llvm.org/D85414>, two tests
currently `FAIL` on Sparc since that backend uses the Sun assembler syntax
for the `.section` directive, controlled by
`SunStyleELFSectionSwitchSyntax`.

Instead of adapting the affected tests, this patch changes that default.
The internal assembler still accepts both forms as input, only the output
syntax is affected.

Current support for the Sun syntax is cursory at best: the built-in
assembler cannot even assemble some of the directives emitted by GCC, and
the set supported by the Solaris assembler is even larger: SPARC Assembly
Language Reference Manual, 3.4 Pseudo-Op Attributes
<https://docs.oracle.com/cd/E37838_01/html/E61063/gmabi.html#scrolltoc>.

A few Sparc test cases need to be adjusted. At the same time, the patch
fixes the failures from D85414 <https://reviews.llvm.org/D85414>.

Tested on `sparcv9-sun-solaris2.11`.

Differential Revision: https://reviews.llvm.org/D85415
2022-08-17 12:59:29 +02:00
Shubham Narlawar
ab4fc87a9d [DAG] Emit table lookup from TargetLowering::expandCTTZ()
This patch emits table lookup in expandCTTZ.

Context -
https://reviews.llvm.org/D113291 transforms set of IR instructions to
cttz intrinsic but there are some targets which does not support CTTZ or
CTLZ. Hence, I generate a table lookup in TargetLowering::expandCTTZ().

Differential Revision: https://reviews.llvm.org/D128911
2022-08-08 12:08:05 +01:00