419 Commits

Author SHA1 Message Date
Koakuma
697dd93ae3
[SPARC] Implement L and H inline asm argument modifiers (#87259)
This adds support for using the L and H argument modifiers for twinword
operands in inline asm code, such as in:

```
%1 = tail call i64 asm sideeffect "rd %pc, ${0:L} ; srlx ${0:L}, 32, ${0:H}", "={o4}"()
```

This is needed by the Linux kernel.
2024-04-05 04:34:07 +07:00
James Y Knight
c1a99b2c77
[Sparc] limit MaxAtomicSizeInBitsSupported to 32 for 32-bit Sparc. (#81655)
When in 32-bit mode, the backend doesn't currently implement 64-bit
atomics, even though the hardware is capable if you have specified a V9
CPU. Thus, limit the width to 32-bit, for now, leaving behind a TODO.

This fixes a regression triggered by PR #73176.
2024-02-13 15:40:51 -05:00
Koakuma
c2f9885a8a
[SPARC] Support reserving arbitrary general purpose registers (#74927)
This adds support for marking arbitrary general purpose registers -
except for those with special purpose (G0, I6-I7, O6-O7) - as reserved,
as needed by some software like the Linux kernel.
2024-02-11 02:04:18 -05:00
Nikita Popov
ff9af4c43a [CodeGen] Convert tests to opaque pointers (NFC) 2024-02-05 14:07:09 +01:00
Koakuma
118d4234ac
[SPARC] Prefer RDPC over CALL to implement GETPCX for 64-bit target
On 64-bit target, prefer using RDPC over CALL to get the value of %pc.
This is faster on modern processors (Niagara T1 and newer) and avoids
polluting the processor's predictor state.

The old behavior of using a fake CALL is still done when tuning for
classic UltraSPARC processors, since RDPC is much slower there.

A quick pgbench test on a SPARC T4 shows about 2% speedup on SELECT
loads, and about 7% speedup on INSERT/UPDATE loads.

Reviewed By: @s-barannikov

Github PR: https://github.com/llvm/llvm-project/pull/78280
2024-01-16 22:46:39 +07:00
Fangrui Song
f972e4d343 [MC,ELF] .section: unconditionally print section flag 'G' after 'o'
* Placing 'G' before 'M' (SHF_MERGE) can be misleading as the sh_entsize
  argument goes before the section group name, if a reader doesn't know
  that the order of extra arguments is not affected by the order of flags.
* 'a', 'w', and 'x' indicate basic permission-related flags. Separating
  them with 'G' is kinda ugly.

Simplify code and move 'G' after 'o'. The new output is more similar to
GCC.
2024-01-09 10:48:23 -08:00
Sergei Barannikov
b9208aca9b
[Sparc] Remove duplicate ALU and SETHI instructions (NFCI) (#66851)
There are no 64-bit variants of these ALU / SETHI instructions in V9.
Remove these instruction definitions and add patterns to match DAG nodes
to the generic instructions defined in SparcInstrInfo.td.

This is not strictly NFC because of the changes in
`2011-01-11-FrameAddr.ll` test. The reason is that Sparc delay slot
filler pass handled ADDrr but not ADDXrr, which are now the same
instruction.
2023-10-10 20:34:20 +03:00
Jay Foad
7b3bbd83c0 Revert "[CodeGen] Really renumber slot indexes before register allocation (#67038)"
This reverts commit 2501ae58e3bb9a70d279a56d7b3a0ed70a8a852c.

Reverted due to various buildbot failures.
2023-10-09 12:31:32 +01:00
Jay Foad
2501ae58e3
[CodeGen] Really renumber slot indexes before register allocation (#67038)
PR #66334 tried to renumber slot indexes before register allocation, but
the numbering was still affected by list entries for instructions which
had been erased. Fix this to make the register allocator's live range
length heuristics even less dependent on the history of how instructions
have been added to and removed from SlotIndexes's maps.
2023-10-09 11:44:41 +01:00
Jay Foad
01aa0c776d [SPARC] Add a missing SPARC64-LABEL check 2023-09-28 13:15:09 +01:00
Sergei Barannikov
dd477ebd23
[Sparc] Remove LEA instructions (NFCI) (#65850)
LEA_ADDri and LEAX_ADDri are printed / encoded the same way as ADDri. I
had to change the type of simm13Op so that it can be used in both 32-
and 64-bit modes. This required the changes in operands of some
InstAliases.
2023-09-20 03:34:39 +03:00
Jay Foad
e0919b189b [CodeGen] Renumber slot indexes before register allocation (#66334)
RegAllocGreedy uses SlotIndexes::getApproxInstrDistance to approximate
the length of a live range for its heuristics. Renumbering all slot
indexes with the default instruction distance ensures that this estimate
will be as accurate as possible, and will not depend on the history of
how instructions have been added to and removed from SlotIndexes's maps.

This also means that enabling -early-live-intervals, which runs the
SlotIndexes analysis earlier, will not cause large amounts of churn due
to different register allocator decisions.
2023-09-19 11:18:12 +01:00
Rainer Orth
715fc4fc60
[Sparc] Don't emit __multi3 on 32-bit SPARC (#66362)
LLVM fails to build on 32-bit Solaris/SPARC: several programs fail to
link due to undefined references to `__multi3`. This reference is from
`lib/libLLVMScalarOpts.a(LoopStrengthReduce.cpp.o)`. However, This
function exists neither in the 32-bit `libgcc.a` nor in
`libclang_rt.builtins-sparc.a`. It's only defined in their 64-bit
counterparts.

The same issue affects several 32-bit targets, e.g. 32-bit PowerPC as
described in Issue #54460. The fix is the same: inhibit the libcall for
32-bit compilations. This patch does just that, regenerating the
affected testcases. It allows the build to complete.

Tested on `sparc-sun-solaris2.11`.
2023-09-15 07:31:59 +02:00
Fangrui Song
806761a762 [test] Change llc -march= to -mtriple=
The issue is uncovered by #47698: for IR files without a target triple,
-mtriple= specifies the full target triple while -march= merely sets the
architecture part of the default target triple, leaving a target triple which
may not make sense, e.g. riscv64-apple-darwin.

Therefore, -march= is error-prone and not recommended for tests without a target
triple. The issue has been benign as we recognize $unknown-apple-darwin as ELF instead
of rejecting it outrightly.
2023-09-11 14:42:37 -07:00
Simon Pilgrim
3ad4f92f83 [DAG] More aggressively (extract_vector_elt (build_vector x, y), c) iff element is zero constant
We currently don't extract vector elements from multi-use build vectors unless TLI.aggressivelyPreferBuildVectorSources accepts them, which seems a little extreme for constant build vectors (especially as under some cases ComputeKnownBits will indirectly extract the data for us).

This is causing a few regressions in some upcoming SimplifyDemandedBits work I'm looking at, all of which just need to know that the element is zero, so I've tweaked the fold to accept zero elements as well, which will typically fold very easily.

Differential Revision: https://reviews.llvm.org/D155582
2023-07-18 17:31:34 +01:00
Simon Pilgrim
b8bda50932 [Sparc] Regenerate float-constants.ll test checks 2023-07-18 17:31:34 +01:00
Amaury Séchet
015323ff9b [NFC] Autogenerate CodeGen/SPARC/LeonInsertNOPLoadPassUT.ll 2023-06-15 13:24:39 +00:00
Tobias Hieta
f84bac329b
[NFC][Py Reformat] Reformat lit.local.cfg python files in llvm
This is a follow-up to b71edfaa4ec3c998aadb35255ce2f60bba2940b0
since I forgot the lit.local.cfg files in that one.

Reformatting is done with `black`.

If you end up having problems merging this commit because you
have made changes to a python file, the best way to handle that
is to run git checkout --ours <yourfile> and then reformat it
with black.

If you run into any problems, post to discourse about it and
we will try to help.

RFC Thread below:

https://discourse.llvm.org/t/rfc-document-and-standardize-python-code-style

Reviewed By: barannikov88, kwk

Differential Revision: https://reviews.llvm.org/D150762
2023-05-17 17:03:15 +02:00
Brad Smith
c30c291887 [SPARC] Lower BR_CC to BPr on 64-bit target whenever possible
On 64-bit target, when doing i64 BR_CC where one of the comparison operands is a
constant zero, try to fold the compare and BPcc into a BPr instruction.

For all integers, EQ and NE comparison are available, additionally for signed
integers, GT, GE, LT, and LE is also available.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D142461
2023-04-26 18:56:00 -04:00
Brad Smith
eee590ca4b Revert "[SPARC] Lower BR_CC to BPr on 64-bit target whenever possible"
This reverts commit 6590a372fa3f4582c04b4b179f90a3c728e75025.
2023-03-12 04:20:25 -04:00
Koakuma
6590a372fa [SPARC] Lower BR_CC to BPr on 64-bit target whenever possible
On 64-bit target, when doing i64 BR_CC where one of the comparison operands is a
constant zero, try to fold the compare and BPcc into a BPr instruction.

For all integers, EQ and NE comparison are available, additionally for signed
integers, GT, GE, LT, and LE is also available.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D142461
2023-03-11 17:47:53 -05:00
Koakuma
24e300190a [SPARC] Implement hooks for conditional branch relaxation
Integrate the BranchRelaxation pass to help with relaxing out-of-range
conditional branches.

This is mostly of concern for SPARCv9, which uses conditional branches with
much smaller range than its v8 counterparts.
(Some large autogenerated code, such as the ones generated by TableGen, already
hits this limitation when building in Release)

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D142458
2023-03-11 17:42:09 -05:00
Matt Arsenault
778cf5431c IR: Add atomicrmw uinc_wrap and udec_wrap
These are essentially add/sub 1 with a clamping value.

AMDGPU has instructions for these. CUDA/HIP expose these as
atomicInc/atomicDec. Currently we use target intrinsics for these,
but those do no carry the ordering and syncscope. Add these to
atomicrmw so we can carry these and benefit from the regular
legalization processes.
2023-01-24 17:55:11 -04:00
Koakuma
ac16ea89db [SPARC] Fix SELECT_REG emission for f128s
In LowerSELECT_CC, SELECT_REG between two f128s should only be emitted if we
have hardware quadfloat enabled.

This should fix issue #59646

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D140515
2022-12-22 13:52:56 -05:00
Koakuma
eaade37fdd [SPARC] Mark the %g0 register as constant & use it to materialize zeros
Materialize zeros by copying from %g0, which is now marked as constant.

This makes it possible for some common operations (like integer negation) to be
performed in fewer instructions.

This continues @arichardson's patch at D132561.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D138887
2022-12-13 17:25:42 -05:00
Koakuma
f8f41c3fcd [SPARC] Lower SELECT_CC to MOVr on 64-bit target whenever possible
On 64-bit target, when doing i64 SELECT_CC where one of the comparison operands
is a constant zero, try to fold the compare and MOVcc into a MOVr instruction.

For all integers, EQ and NE comparison are available, additionally for signed
integers, GT, GE, LT, and LE is also available.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D138922
2022-12-07 15:34:58 -05:00
Brad Smith
7806f86a5e Revert "[SPARC] Mark the %g0 register as constant & use it to materialize zeros"
2 of the Sparc tests are now failing.

This reverts commit 2c41310fc146a1f609147c65ac5f30e5a57e84a8.
2022-12-07 15:27:57 -05:00
Koakuma
2c41310fc1 [SPARC] Mark the %g0 register as constant & use it to materialize zeros
Materialize zeros by copying from %g0, which is now marked as constant.

This makes it possible for some common operations (like integer negation) to be
performed in fewer instructions.

This continues @arichardson's patch at D132561.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D138887
2022-12-07 13:34:13 -05:00
Koakuma
f63a19baf0 [SPARC] Add tail call support for 64-bit target
Extend SPARC tail call support, first introduced in D51206 (commit 1c235c375492180c2eecb6331f169486019fd2d2),
to also cover 64-bit target.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D138741
2022-11-26 23:29:05 -05:00
Koakuma
fd0aeaa83a [SPARC] Don't emit deprecated FP branches when targeting v9
Don't emit deprecated v8-style FP compares & branches when targeting v9
processors.

For now, always use %fcc0, because currently the allocator requires allocatable
registers to also be spillable, which isn't the case with v9 FCC registers.

The work to enable allocation over the entire FCC register file will be done in
a future patch.

Fixes bug #17834

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D135515
2022-11-16 20:56:17 -05:00
Koakuma
586d5f91e6 [SPARC] Improve integer branch handling for v9 targets
Do not emit deprecated v8-style branches when targeting a v9 processor.

As a side effect, this also fixes the emission of useless ba's when doing
conditional branches on 64-bit integer values.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D130006
2022-11-16 20:51:20 -05:00
Matt Arsenault
48732d3541 SPARC: Register null target streamer
Fixes null dereference in emitFunctionBodyStart for 64-bit
2022-11-02 16:05:34 -07:00
Koakuma
d3fcbee10d [SPARC] Make calls to function with big return values work
Implement CanLowerReturn and associated CallingConv changes for SPARC/SPARC64.

In particular, for SPARC64 there's new `RetCC_Sparc64_*` functions that handles the return case of the calling convention.
It uses the same analysis as `CC_Sparc64_*` family of funtions, but fails if the return value doesn't fit into the return registers.

This makes calls to functions with big return values converted to an sret function as expected, instead of crashing LLVM.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D132465
2022-10-18 00:01:55 +00:00
Craig Topper
30305d7948 [TargetLowering][RISCV][Sparc] Don't emit zero check in CTTZTableLookup for CTTZ_ZERO_UNDEF.
The code incorrectly checked for CTLZ_ZERO_UNDEF instead of
CTTZ_ZERO_UNDEF.

While I was there I flipped the condition into an early out.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D136010
2022-10-17 10:15:39 -07:00
Rainer Orth
d9993484ee [Sparc] Don't use SunStyleELFSectionSwitchSyntax
As discussed in D85414 <https://reviews.llvm.org/D85414>, two tests
currently `FAIL` on Sparc since that backend uses the Sun assembler syntax
for the `.section` directive, controlled by
`SunStyleELFSectionSwitchSyntax`.

Instead of adapting the affected tests, this patch changes that default.
The internal assembler still accepts both forms as input, only the output
syntax is affected.

Current support for the Sun syntax is cursory at best: the built-in
assembler cannot even assemble some of the directives emitted by GCC, and
the set supported by the Solaris assembler is even larger: SPARC Assembly
Language Reference Manual, 3.4 Pseudo-Op Attributes
<https://docs.oracle.com/cd/E37838_01/html/E61063/gmabi.html#scrolltoc>.

A few Sparc test cases need to be adjusted. At the same time, the patch
fixes the failures from D85414 <https://reviews.llvm.org/D85414>.

Tested on `sparcv9-sun-solaris2.11`.

Differential Revision: https://reviews.llvm.org/D85415
2022-08-17 12:59:29 +02:00
Shubham Narlawar
ab4fc87a9d [DAG] Emit table lookup from TargetLowering::expandCTTZ()
This patch emits table lookup in expandCTTZ.

Context -
https://reviews.llvm.org/D113291 transforms set of IR instructions to
cttz intrinsic but there are some targets which does not support CTTZ or
CTLZ. Hence, I generate a table lookup in TargetLowering::expandCTTZ().

Differential Revision: https://reviews.llvm.org/D128911
2022-08-08 12:08:05 +01:00
Sanjay Patel
8b75671314 [SDAG] try to replace subtract-from-constant with xor
This is almost the same as the abandoned D48529, but it
allows splat vector constants too.

This replaces the x86-specific code that was added with
the alternate patch D48557 with the original generic
combine.

This transform is a less restricted form of an existing
InstCombine and the proposed SDAG equivalent for that
in D128080:
https://alive2.llvm.org/ce/z/OUm6N_

Differential Revision: https://reviews.llvm.org/D128123
2022-07-08 08:14:24 -04:00
Koakuma
1466d65d9b [SPARC] Don't do leaf optimization on procedures with inline assembly
On SPARC, leaf function optimization omits the register window sliding (and the associated register name changes). This might result in miscompilation of procedures containing inline assembly, as some of the register constraints used may interfere with the register usage of optimized functions, so we disable leaf procedure optimization on those procedures to prevent it from happening.

This is a continuation of patch D102342 by @LemonBoy, the original comment is reproduced below:

> Leaf functions allow the compiler to omit the setup and teardown of a frame pointer, therefore avoiding the exchange of the in/out register. According to the SPARC architecture manual every reference to %i0-%i5 should be replaced with %o0-o5, if the target register is already in use a further remapping step to %g1-%g7 is required to free the output register.
>
> Add a simple check to make sure not to stomp on any output register that's already in use.

Reviewed By: dcederman

Differential Revision: https://reviews.llvm.org/D128263
2022-06-27 15:09:30 +02:00
LemonBoy
700eadca5f [SPARC] Fix type for i64 inline asm operands
Differential Revision: https://reviews.llvm.org/D101694
2022-06-04 18:32:16 -04:00
Nuno Lopes
80b3dcc045 [Support] Make report_fatal_error respect its GenCrashDiag argument so it doesn't generate a backtrace
There are a few places where we use report_fatal_error when the input is broken.
Currently, this function always crashes LLVM with an abort signal, which
then triggers the backtrace printing code.
I think this is excessive, as wrong input shouldn't give a link to
LLVM's github issue URL and tell users to file a bug report.
We shouldn't print a stack trace either.

This patch changes report_fatal_error so it uses exit() rather than
abort() when its argument GenCrashDiag=false.

Reviewed by: nikic, MaskRay, RKSimon

Differential Revision: https://reviews.llvm.org/D126550
2022-05-30 19:19:23 +01:00
Brad Smith
37ccfc55ab [Sparc] Have test use IAS 2022-05-23 01:22:07 -04:00
Mark Kettenis
3d869c88bb [Sparc] Make sure that we really don't emit quad-precision unless the "hard-quad-float" feature is available
Make sure that we really don't emit quad-precision unless the "hard-quad-float"
feature is available. Add missing replacement instruction patterns that are
needed to emit alternative code for conditional moves of quad-precision floats.

Test from koakuma.

Reviewed By: koakuma

Differential Revision: https://reviews.llvm.org/D119104
2022-05-18 20:11:58 -04:00
Daniel Cederman
1c235c3754 [Sparc] Add tail call support
This patch adds tail call support to the 32-bit Sparc backend.

Two new instructions are defined, TAIL_CALL and TAIL_CALLri. They are
encoded the same as CALL and BINDri, but are marked with isReturn so
that the epilogue gets emitted. In contrast to CALL, TAIL_CALL is not
marked with isCall. This makes it possible to use the leaf function
optimization when the only call a function makes is a tail call.

TAIL_CALL modifies the return address in %o7, so for leaf functions
the value in %o7 needs to be restored after the call. For normal
functions which uses the restore instruction this is not necessary.

Reviewed By: koakuma

Differential Revision: https://reviews.llvm.org/D51206
2022-03-08 13:50:54 +01:00
Nikita Popov
f430c1eb64 [Tests] Add elementtype attribute to indirect inline asm operands (NFC)
This updates LLVM tests for D116531 by adding elementtype attributes
to operands that correspond to indirect asm constraints.
2022-01-06 14:23:51 +01:00
Koakuma
3e0f3041cc [SPARC] Zero-extend the operands when doing UMULO on 64-bit integers
On SPARC, S/UMULO operation on 64-bit integers works by extending the value to 128-bit, then doing a multiplication and checking the upper half of the result.
This makes UMULO works correctly by putting a zero in the upper half rather than doing a sign extension.

Reviewed By: LemonBoy

Differential Revision: https://reviews.llvm.org/D110555
2021-11-14 19:59:52 +01:00
Nick Desaulniers
39e5dd113f [SparcISelLowering] avoid emitting libcalls to __muloti4 and __mulodi4
These compiler-rt-only symbols aren't available in libgcc.  Similar to
D108842, D108844, and D108926.

Fixes: pr/52043

Reviewed By: craig.topper, rengolin

Differential Revision: https://reviews.llvm.org/D112750
2021-10-29 13:14:09 -07:00
Jake Egan
56049b7129 Fix tests defaulting to incorrect triples on AIX
The tests only specify -march, so when the tests are run on AIX the target OS defaults to AIX, which causes the tests to misbehave.

This patch constrains the tests by specifying -mtriple instead of -march.

Reviewed By: daltenty, jsji, MaskRay

Differential Revision: https://reviews.llvm.org/D110186
2021-09-27 11:30:45 -04:00
Matt Arsenault
fae05692a3 CodeGen: Print/parse LLTs in MachineMemOperands
This will currently accept the old number of bytes syntax, and convert
it to a scalar. This should be removed in the near future (I think I
converted all of the tests already, but likely missed a few).

Not sure what the exact syntax and policy should be. We can continue
printing the number of bytes for non-generic instructions to avoid
test churn and only allow non-scalar types for generic instructions.

This will currently print the LLT in parentheses, but accept parsing
the existing integers and implicitly converting to scalar. The
parentheses are a bit ugly, but the parser logic seems unable to deal
without either parentheses or some keyword to indicate the start of a
type.
2021-06-30 16:54:13 -04:00
LemonBoy
5be3a1a064 [SPARC] Legalize truncation and extension between fp128 and half
Lower truncations and expansions between fp128 and half values into libcalls.
Expand truncating stores into two separate truncation and a store operations.

Reviewed By: jrtc27

Differential Revision: https://reviews.llvm.org/D104185
2021-06-13 20:05:15 +02:00
Wolfgang Pieb
5a1589fc6d [static initializers] Emit global_ctors and global_dtors in reverse order when .ctors/.dtors are used.
Reviewed By: rnk, MaskRay, efriedma

Differential Revision: https://reviews.llvm.org/D103495
2021-06-10 16:44:47 -07:00