522 Commits

Author SHA1 Message Date
Lei Huang
82acd8c377
[PowerPC] Add code to spill and restore DMRp registers (#142443) 2025-06-18 13:50:57 -04:00
Lei Huang
05f1ca7d17
[PowerPC] Spill and restore DMR register (#141530)
Add spilling and restoring of DMR registers.
2025-06-02 13:11:39 -04:00
Lei Huang
4b09eedf7b
[PowerPC] Update DMF VSX ACC data transfer instructions (#138897)
For cpu=future, acc registers no longer overlap VSRs and are prefixed
with `dm`. The original, xxmfacc/xxmtacc instructions are now extended
menemonics to it's dm* equivalents.
2025-05-26 12:47:12 -04:00
Philip Reames
f2ecd86e34
[Analysis] Remove implicit LocationSize conversion from uint64_t (#133342)
This change removes the uint64_t constructor on LocationSize
preventing implicit conversion, and fixes up the using APIs to adapt to
the change. Note that I'm adding a couple of explicit conversion points
on routines where passing in a fixed offset as an integer seems likely
to have well understood semantics.

We had an unfortunate case which arose if you tried to pass a TypeSize
value to a parameter of LocationSize type. We'd find the implicit
conversion path through TypeSize -> uint64_t -> LocationSize which works
just fine for fixed values, but looses information and fails assertions
if the TypeSize was scalable. This change breaks the first link in that
implicit conversion chain since that seemed to be the easier one.
2025-04-18 07:46:31 -07:00
zhijian lin
1a540c3b8b
[PowerPC] Deprecate uses of ISD::ADDC/ISD::ADDE/ISD::SUBC/ISD::SUBE (#133155)
ISD::ADDC, ISD::ADDE, ISD::SUBC and ISD::SUBE are being deprecated,
using ISD::UADDO_CARRY,ISD::USUBO_CARRY instead. Lowering the UADDO,
UADDO_CARRY, USUBO, USUBO_CARRY in the patch.
2025-04-03 13:22:49 -04:00
Craig Topper
6b7daf2249
[MachineCombiner][Targets] Use Register in TII genAlternativeCodeSequence interface. NFC (#131272) 2025-03-13 23:27:56 -07:00
Kazu Hirata
f4aea1324d
[PowerPC] Avoid repeated hash lookups (NFC) (#129193) 2025-02-27 23:01:19 -08:00
Craig Topper
571b787b83
[CodeGen] Change copyPhysReg interface to use Register instead of MCRegister. (#128473)
NVPTX, SPIRV, and WebAssembly pass virtual registers to this function
since they don't perform register allocation. We need to use Register to
avoid a virtual register being converted to MCRegister by the caller.
2025-02-24 09:55:34 -08:00
Christopher Di Bella
08c69b2ef6 Revert "[CodeGen] Remove static member function Register::isVirtualRegister. NFC (#127968)"
This reverts commit ff99af7ea03b3be46bec7203bd2b74048d29a52a.
2025-02-20 22:06:21 +00:00
Craig Topper
ff99af7ea0
[CodeGen] Remove static member function Register::isVirtualRegister. NFC (#127968)
Use nonstatic member instead. This requires explicit conversions, but
many will go away as we continue converting unsigned to Register.

In a few places where it was simple, I changed unsigned to Register.
2025-02-20 08:35:50 -08:00
David Tenty
aa9e519b24 Revert "[PowerPC] Deprecate uses of ISD::ADDC/ISD::ADDE/ISD::SUBC/ISD::SUBE (#116984)"
This reverts commit 7763119c6eb0976e4836f81c9876c49a36d46d73 (leaving the modifications from 03cb46d248b08)..
2025-02-19 09:44:39 -05:00
zhijian lin
7763119c6e
[PowerPC] Deprecate uses of ISD::ADDC/ISD::ADDE/ISD::SUBC/ISD::SUBE (#116984)
ISD::ADDC, ISD::ADDE, ISD::SUBC and ISD::SUBE are being deprecated,
using ISD::UADDO_CARRY,ISD::USUBO_CARRY instead. Lowering the UADDO,
UADDO_CARRY, USUBO, USUBO_CARRY in the patch.
2025-02-13 09:09:17 -05:00
Hua Tian
a9d2834508
[llvm][CodeGen] Fix the issue caused by live interval checking in window scheduler (#123184)
At some corner cases, the cloned MI still retains an old slot index,
which leads to the compiler crashing. This patch update the slot index
map before delete the recycled MI.

https://github.com/llvm/llvm-project/issues/123165
2025-01-23 09:39:03 +08:00
Venkata Ramanaiah Nalamothu
f7d8336a2f
[llvm] Pass MachineInstr flags to storeRegToStackSlot/loadRegFromStackSlot (NFC) (#120622)
This patch is in preparation to enable setting the MachineInstr::MIFlag
flags, i.e. FrameSetup/FrameDestroy, on callee saved register
spill/reload instructions in prologue/epilogue. This eventually helps in
setting the prologue_end and epilogue_begin markers more accurately.

The DWARF Spec in "6.4 Call Frame Information" says:

The code that allocates space on the call frame stack and performs the
save
operation is called the subroutine’s prologue, and the code that
performs
the restore operation and deallocates the frame is called its epilogue.

which means the callee saved register spills and reloads are part of
prologue (a.k.a frame setup) and epilogue (a.k.a frame destruction),
respectively. And, IIUC, LLVM backend uses FrameSetup/FrameDestroy flags
to identify instructions that are part of call frame setup and
destruction.

In the trunk, while most targets consistently set
FrameSetup/FrameDestroy on save/restore call frame information (CFI)
instructions of callee saved registers, they do not consistently set
those flags on the actual callee saved register spill/reload
instructions.

I believe this patch provides a clean mechanism to set
FrameSetup/FrameDestroy flags on the actual callee saved register
spill/reload instructions as needed. And, by having default argument of
MachineInstr::NoFlags for Flags, this patch is a NFC.

With this patch, the targets have to just pass FrameSetup/FrameDestroy
flag to the storeRegToStackSlot/loadRegFromStackSlot calls from the
target derived spillCalleeSavedRegisters and restoreCalleeSavedRegisters
to set those flags on callee saved register spill/reload instructions.

Also, this patch makes it very easy to set the source line information
on callee saved register spill/reload instructions which is needed by
the DwarfDebug.cpp implementation to set prologue_end and epilogue_begin
markers more accurately.

As per DwarfDebug.cpp implementation:

prologue_end is the first known non-DBG_VALUE and non-FrameSetup
location
    that marks the beginning of the function body

epilogue_begin is the first FrameDestroy location that has been seen in
the
    epilogue basic block

With this patch, the targets have to just do the following to set the
source line information on callee saved register spill/reload
instructions, without hampering the LLVM's efforts to avoid adding
source line information on the artificial code generated by the
compiler.

    <Foo>InstrInfo::storeRegToStackSlot() {
    ...
      DebugLoc DL =
Flags & MachineInstr::FrameSetup ? DebugLoc() : MBB.findDebugLoc(I);
    ...
    }

    <Foo>InstrInfo::loadRegFromStackSlot() {
    ...
      DebugLoc DL =
Flags & MachineInstr::FrameDestroy ? MBB.findDebugLoc(I) : DebugLoc();
    ...
    }

While I understand this patch would break out-of-tree backend builds, I
think it is in the right direction.

One immediate use case that can benefit from this patch is fixing
#120553 becomes simpler.
2025-01-22 13:36:39 +05:30
Pengcheng Wang
3ef78188d0
[PowerPC] Use RegisterClassInfo::getRegPressureSetLimit (#120383)
`RegisterClassInfo::getRegPressureSetLimit` is a wrapper of
`TargetRegisterInfo::getRegPressureSetLimit` with some logics to
adjust the limit by removing reserved registers.

It seems that we shouldn't use
`TargetRegisterInfo::getRegPressureSetLimit`
directly, just like the comment "This limit must be adjusted
dynamically for reserved registers" said.

Separate from https://github.com/llvm/llvm-project/pull/118787
2025-01-03 16:41:18 +08:00
Kazu Hirata
f71cb9dbb7
[PowerPC] Remove unused includes (NFC) (#116163)
Identified with misc-include-cleaner.
2024-11-14 07:55:18 -08:00
Kazu Hirata
4048c64306
[llvm] Remove redundant control flow statements (NFC) (#115831)
Identified with readability-redundant-control-flow.
2024-11-12 10:09:42 -08:00
zhijian lin
674574d25c
Promote 32bit pseudo instr that infer extsw removal to 64bit in PPCMIPeephole (#85451)
Fixes:   https://github.com/llvm/llvm-project/issues/71030

Bug only happens in 64bit involving spills. Since we don't know when the
spill will happen, all instructions in the chain used to deduce sign
extension for eliminating 'extsw' will need to be promoted to 64-bit
pseudo instructions.

The following instruction will promoted in PPCMIPeepholes: EXTSH, LHA,
ISEL to EXTSH8, LHA8, ISEL8
2024-10-31 15:49:36 -04:00
Keith Packard
44b020a381
[PowerPC][ISelLowering] Support -mstack-protector-guard=tls (#110928)
Add support for using a thread-local variable with a specified offset
for holding the stack guard canary value. This supports both 32- and 64-
bit PowerPC targets.

This mirrors changes from #108942 but targeting PowerPC instead of
RISCV. Because both of these PRs modify the same driver functions, this
series is stack on top of the RISC-V one.

---------

Signed-off-by: Keith Packard <keithp@keithp.com>
2024-10-17 19:06:47 -07:00
Piyou Chen
b01c006f73
[TII][RISCV] Add renamable bit to copyPhysReg (#91179)
The renamable flag is useful during MachineCopyPropagation but renamable
flag will be dropped after lowerCopy in some case.

This patch introduces extra arguments to pass the renamable flag to
copyPhysReg.
2024-08-27 10:08:43 +08:00
azhan92
1df4d866cc
[PowerPC] Add support for -mcpu=pwr11 / -mtune=pwr11 (#99511)
This PR adds support for -mcpu=pwr11/power11 and -mtune=pwr11/power11 in
clang and llvm.
2024-07-23 09:49:41 -04:00
Kazu Hirata
5e22a53698
[Target] Use range-based for loops (NFC) (#98705) 2024-07-13 17:40:51 -07:00
Chen Zheng
6a992bc89f [PowerPC] refactor CPU info in PPCTargetParser.def, NFC
CPU features will be done in follow up patches.
2024-07-03 00:20:14 -04:00
Nikita Popov
9df71d7673
[IR] Add getDataLayout() helpers to Function and GlobalValue (#96919)
Similar to https://github.com/llvm/llvm-project/pull/96902, this adds
`getDataLayout()` helpers to Function and GlobalValue, replacing the
current `getParent()->getDataLayout()` pattern.
2024-06-28 08:36:49 +02:00
Zaara Syeda
194e7cc7aa
[PowerPC][AIX] 64-bit large code-model support for toc-data (#90619)
This patch adds support for toc-data for 64-bit large code-model on AIX.
The sequence ADDIStocHA8/ADDItocL8 is used to access the data directly
from the TOC.
When emitting the instruction ADDIStocHA8, we check if the symbol has
toc-data attribute before creating a toc entry for it. When emitting the
instruction ADDItocL8, we use the LA8 instruction to load the address.
2024-05-21 14:00:24 -04:00
Xu Zhang
f6d431f208
[CodeGen] Make the parameter TRI required in some functions. (#85968)
Fixes #82659

There are some functions, such as `findRegisterDefOperandIdx` and  `findRegisterDefOperand`, that have too many default parameters. As a result, we have encountered some issues due to the lack of TRI  parameters, as shown in issue #82411.

Following @RKSimon 's suggestion, this patch refactors 9 functions, including `{reads, kills, defines, modifies}Register`,  `registerDefIsDead`, and `findRegister{UseOperandIdx, UseOperand, DefOperandIdx, DefOperand}`, adjusting the order of the TRI parameter and making it required. In addition, all the places that call these functions have also been updated correctly to ensure no additional impact.

After this, the caller of these functions should explicitly know whether to pass the `TargetRegisterInfo` or just a `nullptr`.
2024-04-24 14:24:14 +01:00
Zaara Syeda
76ad289748
[PowerPC] 32-bit large code-model support for toc-data (#85129)
This patch adds the pseudo op ADDItocL for 32-bit large code-model
support for toc-data.
2024-04-17 09:24:53 -04:00
Pengcheng Wang
b564036933
[MachineCombiner][NFC] Split target-dependent patterns
We split target-dependent MachineCombiner patterns into their target
folder.

This makes MachineCombiner much more target-independent.

Reviewers:
davemgreen, asavonic, rotateright, RKSimon, lukel97, LuoYuanke, topperc, mshockwave, asi-sc

Reviewed By: topperc, mshockwave

Pull Request: https://github.com/llvm/llvm-project/pull/87991
2024-04-11 12:20:27 +08:00
Zaara Syeda
cc761a7c35
[PowerPC][NFC] Rename ADDItocL to match the 64-bit naming convention (#85099)
In preparation of adding a similar instruction for large code model on
AIX for 32-bit, rename the exisitng ADDItocL 64-instruction to ADDItocL8
to match the naming convention of other instructions with 32-bit and
64-bit variants.
2024-03-13 11:57:07 -04:00
David Green
44be5a7fdc
[Codegen] Make Width in getMemOperandsWithOffsetWidth a LocationSize. (#83875)
This is another part of #70452 which makes getMemOperandsWithOffsetWidth
use a LocationSize for Width, as opposed to the unsigned it currently
uses. The advantages on it's own are not super high if
getMemOperandsWithOffsetWidth usually uses known sizes, but if the
values can come from an MMO it can help be more accurate in case they
are Unknown (and in the future, scalable).
2024-03-06 17:40:13 +00:00
Felix (Ting Wang)
5b05870953
[PowerPC] Support local-dynamic TLS relocation on AIX (#66316)
Supports TLS local-dynamic on AIX, generates below sequence of code:

```
.tc foo[TC],foo[TL]@ld # Variable offset, ld relocation specifier
.tc mh[TC],mh[TC]@ml # Module handle for the caller
lwz 3,mh[TC]\(2\) $$ For 64-bit: ld 3,mh[TC]\(2\)
bla .__tls_get_mod # Modifies r0,r3,r4,r5,r11,lr,cr0
#r3 = &TLS for module
lwz 4,foo[TC]\(2\) $$ For 64-bit: ld 4,foo[TC]\(2\)
add 5,3,4 # Compute &foo
.rename mh[TC], "\_$TLSML" # Symbol for the module handle must have the name "_$TLSML"
```

---------

Co-authored-by: tingwang <tingwang@tingwangs-MBP.lan>
Co-authored-by: tingwang <tingwang@tingwangs-MacBook-Pro.local>
2024-03-01 08:09:40 +08:00
Philip Reames
3ff7caea33
[TTI] Use Register in isLoadFromStackSlot and isStoreToStackSlot [nfc] (#80339) 2024-02-01 17:52:35 -08:00
Nemanja Ivanovic
67c1c1dbb6
[PowerPC][X86] Make cpu id builtins target independent and lower for PPC (#68919)
Make __builtin_cpu_{init|supports|is} target independent and provide an
opt-in query for targets that want to support it. Each target is still
responsible for their specific lowering/code-gen. Also provide code-gen
for PowerPC.

I originally proposed this in https://reviews.llvm.org/D152914 and this
addresses the comments I received there.

---------

Co-authored-by: Nemanja Ivanovic <nemanjaivanovic@nemanjas-air.kpn>
Co-authored-by: Nemanja Ivanovic <nemanja@synopsys.com>
2024-01-26 11:24:50 -05:00
Shengchen Kan
550f0eb2ce [NFC] Rename TargetInstrInfo::FoldImmediate to TargetInstrInfo::foldImmediate and simplify implementation for X86 2024-01-26 20:50:58 +08:00
Chen Zheng
1f61507401 [NFC][PowerPC] remove the redundant spill related flags setting 2024-01-18 23:07:42 -05:00
Qiu Chaofan
85071a3c74
[PowerPC] Implement fence builtin (#76495) 2024-01-15 11:19:16 +08:00
Kai Luo
56414220df
[PowerPC] Use 'sync; ld; cmp; bc; isync' for atomic load seq-cst on 32-bit platform (#75905)
`cmp; bc; isync` is more performant than `lwsync` theoretically.

64-bit platform already features it, now implement it for 32-bit
platform.
2023-12-20 10:01:02 +08:00
Kai Luo
2f82662ce9
[PowerPC] Let base implementation decide if MI is rematerizable by default (#75772)
If MI is not PPC specific instructions, let base implementation decide
if MI is rematerizable.
This can fix failure in #75570 after #75271 .
2023-12-18 17:39:22 +08:00
Chen Zheng
4b932d84f4
[PowerPC] redesign the target flags (#69695)
12 bit is not enough for PPC's target specific flags. If 8 bit for the
bitmask flags, 4 bit for the direct mask, PPC can total have 16 direct
mask and 8 bitmask. Not enough for PPC, see this issue in
https://github.com/llvm/llvm-project/pull/66316

Redesign how PPC target set the target specific flags. With this patch,
all ppc target flags are direct flags. No bitmask flag in PPC anymore.

This patch aligns with some targets like X86 which also has many target
specific flags.

The patch also fixes a bug related to flag `MO_TLSGDM_FLAG` and `MO_LO`.
They are the same value and the test case changes in this PR shows the
bug.
2023-12-07 12:47:25 +08:00
Alex Bradbury
b717365216
[MachineScheduler][NFCI] Add Offset and OffsetIsScalable args to shouldClusterMemOps (#73778)
These are picked up from getMemOperandsWithOffsetWidth but weren't then
being passed through to shouldClusterMemOps, which forces backends to
collect the information again if they want to use the kind of heuristics
typically used for the similar shouldScheduleLoadsNear function (e.g.
checking the offset is within 1 cache line).

This patch just adds the parameters, but doesn't attempt to use them.
There is potential to use them in the current PPC and AArch64
shouldClusterMemOps implementation, and I intend to use the offset in
the heuristic for RISC-V. I've left these for future patches in the
interest of being as incremental as possible.

As noted in the review and in an inline FIXME, an ElementCount-style abstraction may later be used to condense these two parameters to one argument. ElementCount isn't quite suitable as it doesn't support negative offsets.
2023-12-06 15:30:48 +00:00
Ramkumar Ramachandra
9468de48fc
TargetInstrInfo: make getOperandLatency return optional (NFC) (#73769)
getOperandLatency has the following behavior: it returns -1 as a special
value, negative numbers other than -1 on some target-specific overrides,
or a valid non-negative latency. This behavior can be surprising, as
some callers do arithmetic on these negative values. Change the
interface of getOperandLatency to return a std::optional<unsigned> to
prevent surprises in callers. While at it, change the interface of
getInstrLatency to return unsigned instead of int.

This change was inspired by a refactoring in
TargetSchedModel::computeOperandLatency.
2023-12-01 11:29:19 +00:00
Alex Bradbury
6cf3566850
[NFC][MachineScheduler] Rename NumLoads parameter of shouldClusterMemOps to ClusterSize (#73757)
As the same hook is called for both load and store clustering, NumLoads
is a misleading name. Use ClusterSize instead.
2023-11-29 09:47:03 +00:00
Craig Topper
a845061935
[AArch64] Use the same fast math preservation for MachineCombiner reassociation as X86/PowerPC/RISCV. (#72820)
Don't blindly copy the original flags from the pre-reassociated
instrutions.
This copied the integer poison flags which are not safe to preserve
after reassociation.
    
For the FP flags, I think we should only keep the intersection of
the flags. Override setSpecialOperandAttr to do this.

Fixes #72777.
2023-11-22 14:17:45 -08:00
Kai Luo
eb7698254a
[PowerPC][EarlyIfConversion] Do not insert isel if subtarget doesn't support isel (#72211)
Some subtargets of PPC don't support `isel` instruction, early-ifcvt
should not insert this instruction.
2023-11-20 09:17:04 +08:00
Kazu Hirata
ac4a272913 [llvm] Stop including llvm/ADT/DenseSet.h (NFC)
Identified with clangd.
2023-11-11 09:48:29 -08:00
Nemanja Ivanovic
46d5d264fc [PowerPC] Improve kill flag computation and add verification after MI peephole
The MI Peephole pass has grown to include a large number of transformations over the years. Many of the transformations require re-computation of kill flags but don't do a good job of re-computing them. This causes us to have very common failures when the compiler is built with expensive checks. Over time, we added and augmented a function that is supposed to go and fix up kill flags after each transformation but we keep missing cases.
This patch does the following:
- Removes the function to re-compute kill flags
- Adds LiveVariables to compute and maintain kill flags while transforming code
- Adds re-computation of kill flags for the post-RA peepholes for each block that contains a transformed instruction

Reviewed By: stefanp

Differential Revision: https://reviews.llvm.org/D133103
2023-09-22 15:26:39 -04:00
Arthur Eubanks
0a1aa6cda2
[NFC][CodeGen] Change CodeGenOpt::Level/CodeGenFileType into enum classes (#66295)
This will make it easy for callers to see issues with and fix up calls
to createTargetMachine after a future change to the params of
TargetMachine.

This matches other nearby enums.

For downstream users, this should be a fairly straightforward
replacement,
e.g. s/CodeGenOpt::Aggressive/CodeGenOptLevel::Aggressive
or s/CGFT_/CodeGenFileType::
2023-09-14 14:10:14 -07:00
Reid Kleckner
cda23c0732 [PPC] Fix layering issues between MCTargetDesc and CodeGen
See issue #64166 for more information about the layering issue.

The PPCMCTargetDesc library was including CodeGen headers such as
PPCInstrInfo.h and calling inline functions in them. This doesn't work
in the Bazel build, and is error-prone. If the inline function moves to
a cpp file, it will result in linker errors.

To address the issue, I moved several inline functions to
PPCMCTargetDesc.cpp, and declared them in the PPC namespace in
PPCMCTargetDesc.h, which seemed like the most straightforward fix.

Differential Revision: https://reviews.llvm.org/D156488
2023-08-30 16:09:01 -07:00
Craig Topper
d2e605c92a [PowerPC] Correct missue of getOperandConstraint in PPCInstrInfo::commuteInstructionImpl
getOperandConstraint does not return a bool, it returns an int. It
returns -1 if there is no TIED_TO.

Additionally, TIED_TO is only set on use operands not defs and it
points to the def that the use is tied to. So calling it on operand 0
is guaranteed to return -1.

As far as I can tell this code must have been copied from the
generic implementation prior to 6aa2744bed0b8.o

Unfortunately, this code is not executed in lit tests. I just happened
to notice it while looking for other uses of TIED_TO for something
I was working on.

Reviewed By: nemanjai

Differential Revision: https://reviews.llvm.org/D152754
2023-08-09 21:53:49 -07:00
Sander de Smalen
bbb95893de [TII] NFCI: Simplify the interface for isTriviallyReMaterializable
Currently `isTriviallyReMaterializable` calls
`isReallyTriviallyReMaterializable` and
`isReallyTriviallyReMaterializableGeneric`. The two interfaces
are confusing, but there are also some real issues with this.

The documentation of this function (see below) suggests that
`isReallyTriviallyRematerializable` allows the target to override the
default behaviour.

  /// For instructions with opcodes for which the M_REMATERIALIZABLE flag is
  /// set, this hook lets the target specify whether the instruction is actually
  /// trivially rematerializable, taking into consideration its operands.

It however implements something different. The default behaviour
is the analysis done in `isReallyTriviallyReMaterializableGeneric`,
which is testing if it is safe to rematerialize the MachineInstr.

The result of `isReallyTriviallyReMaterializable` is only considered if
`isReallyTriviallyReMaterializableGeneric` returns `false`.  That means
there is no way to override the default behaviour if
`isReallyTriviallyReMaterializableGeneric` returns true (i.e. it is safe to
rematerialize, but we'd rather not).

By making this a single interface, we can override the interface to do either.

Reviewed By: craig.topper, nemanjai

Differential Revision: https://reviews.llvm.org/D156520
2023-08-07 13:01:06 +00:00