7430 Commits

Author SHA1 Message Date
Hubert Tong
0812cde3bf NFC: Make isPPC64 const and use member initializer 2024-11-01 20:41:25 -04:00
zhijian lin
674574d25c
Promote 32bit pseudo instr that infer extsw removal to 64bit in PPCMIPeephole (#85451)
Fixes:   https://github.com/llvm/llvm-project/issues/71030

Bug only happens in 64bit involving spills. Since we don't know when the
spill will happen, all instructions in the chain used to deduce sign
extension for eliminating 'extsw' will need to be promoted to 64-bit
pseudo instructions.

The following instruction will promoted in PPCMIPeepholes: EXTSH, LHA,
ISEL to EXTSH8, LHA8, ISEL8
2024-10-31 15:49:36 -04:00
Zaara Syeda
ccddd13602
Enable aggressive constant merge in GlobalMerge for AIX (#113956)
Enable merging all constants without looking at use in GlobalMerge by
default to replace PPCMergeStringPool pass on AIX.
2024-10-31 11:22:48 -04:00
Fangrui Song
facdae62b7 [MCInstPrinter] Make printRegName non-const
Similar to printInst. printRegName may change states (e.g. #113834).
2024-10-29 19:14:54 -07:00
Maryam Moghadas
8a0cb9ac86
[PowerPC] Add custom lowering for ssubo (#111748)
This patch is to improve the codegen for ssubo node for i32 in 64-bit
mode by custom lowering.
2024-10-29 15:43:05 -04:00
Lei Huang
522f34cfff
[PowerPC] Expand global named register support (#113482)
Enable all valid registers for intrinsics that read from and write
to global named registers.
2024-10-24 10:05:18 -04:00
Lei Huang
a19f05b9ec
Revert "[PowerPC] Expand global named register support" (#113457)
Reverts llvm/llvm-project#112603
2024-10-23 09:36:28 -04:00
Lei Huang
06d192925d
[PowerPC] Expand global named register support (#112603)
Enable all valid registers for intrinsics that read from and write
to global named registers.
2024-10-22 14:34:24 -04:00
RolandF77
fc59f2cc0f
[PowerPC] special case small int constant for custom scalar_to_vector (#109850)
Special case small int constant in the PPC custom lowering of
scalar_to_vector.
2024-10-21 12:19:07 -04:00
Zaara Syeda
c5ca1b8626
[PPC] Add custom lowering for uaddo (#110137)
Improve the codegen for uaddo node for i64 in 64-bit mode and i32 in
32-bit mode by custom lowering.
2024-10-21 11:13:16 -04:00
Alex Rønne Petersen
ad4a582fd9
[llvm] Consistently respect naked fn attribute in TargetFrameLowering::hasFP() (#106014)
Some targets (e.g. PPC and Hexagon) already did this. I think it's best
to do this consistently so that frontend authors don't run into
inconsistent results when they emit `naked` functions. For example, in
Zig, we had to change our emit code to also set `frame-pointer=none` to
get reliable results across targets.

Note: I don't have commit access.
2024-10-18 09:35:42 +04:00
Keith Packard
44b020a381
[PowerPC][ISelLowering] Support -mstack-protector-guard=tls (#110928)
Add support for using a thread-local variable with a specified offset
for holding the stack guard canary value. This supports both 32- and 64-
bit PowerPC targets.

This mirrors changes from #108942 but targeting PowerPC instead of
RISCV. Because both of these PRs modify the same driver functions, this
series is stack on top of the RISC-V one.

---------

Signed-off-by: Keith Packard <keithp@keithp.com>
2024-10-17 19:06:47 -07:00
Jay Foad
85c17e4092
[LLVM] Make more use of IRBuilder::CreateIntrinsic. NFC. (#112706)
Convert many instances of:
  Fn = Intrinsic::getOrInsertDeclaration(...);
  CreateCall(Fn, ...)
to the equivalent CreateIntrinsic call.
2024-10-17 16:20:43 +01:00
Qiongsi Wu
f9d0789064
[PGO] Initialize GCOV Writeout and Reset Functions in the Runtime on AIX (#108570)
This PR registers the writeout and reset functions for `gcov` for all
modules in the PGO runtime, instead of registering them
using global constructors in each module. The change is made for AIX
only, but the same mechanism works on Linux on Power.

When registering such functions using global constructors in each module
without `-ffunction-sections`, the AIX linker cannot garbage collect
unused undefined symbols, because such symbols are grouped in the same
section as the `__sinit` symbol. Keeping such undefined symbols causes
link errors (see test case
https://github.com/llvm/llvm-project/pull/108570/files#diff-500a7e1ba871e1b6b61b523700d5e30987900002add306e1b5e4972cf6d5a4f1R1
for this scenario). This PR implements the initialization in the
runtime, hence avoiding introducing `__sinit` into each module.

The implementation adds a new global variable `__llvm_covinit_functions`
to each module. This new global variable contains the function pointers
to the `Writeout` and `Reset` functions. `__llvm_covinit_functions`'s
section is the named section `__llvm_covinit`. The linker will aggregate
all the `__llvm_covinit` sections from each module
to form one single named section in the final binary. The pair of
functions
```
const __llvm_gcov_init_func_struct *__llvm_profile_begin_covinit();
const __llvm_gcov_init_func_struct *__llvm_profile_end_covinit();
```
are implemented to return the start and end address of this named
section in the final binary, and they are used in function
```
__llvm_profile_gcov_initialize()
```
(which is a constructor function in the runtime) so the runtime knows
the addresses of all the `Writeout` and `Reset` functions from all the
modules.

One noticeable implementation detail relevant to AIX is that to preserve
the `__llvm_covinit` from the linker's garbage collection, a `.ref`
pseudo instruction is inserted into them, referring to the section that
contains the `__llvm_gcov_ctr` variables, which are used in the
instrumented code. The `__llvm_gcov_ctr` variables did not belong to
named sections before, but this PR added them to the
`__llvm_gcov_ctr_section` named section, so we can add a `.ref` pseudo
instruction that refers to them in the `__llvm_covinit` section.
2024-10-17 09:32:10 -04:00
Jay Foad
d9c95efb6c
[LLVM] Make more use of IRBuilder::CreateIntrinsic. NFC. (#112546)
Convert almost every instance of:
  CreateCall(Intrinsic::getOrInsertDeclaration(...), ...)
to the equivalent CreateIntrinsic call.
2024-10-16 15:43:30 +01:00
Stefan Pintilie
dcc5ba4a4d
[PowerPC] Add missing patterns for lround when i32 is returned. (#111863)
The patch adds support for lround when the output type of the rounding
is i32.
The support for a rounding result of type i64 existed before this patch.
2024-10-16 10:25:09 -04:00
Christudasan Devadasan
488d3924dd
[CodeGen][NewPM] Port EarlyIfConversion pass to NPM. (#108508) 2024-10-16 13:22:57 +05:30
Lei Huang
23da16933b
[NFC][PowerPC] Use tablegen's MatchRegisterName() (#111553)
Use PPC `MatchRegisterName()` that is auto generated by table gen.
2024-10-15 16:58:36 -04:00
Rahul Joshi
fa789dffb1
[NFC] Rename Intrinsic::getDeclaration to getOrInsertDeclaration (#111752)
Rename the function to reflect its correct behavior and to be consistent
with `Module::getOrInsertFunction`. This is also in preparation of
adding a new `Intrinsic::getDeclaration` that will have behavior similar
to `Module::getFunction` (i.e, just lookup, no creation).
2024-10-11 05:26:03 -07:00
Lei Huang
4e6a6eda30
[PowerPC] Update matchRegisterName() to return MCRegister instead of bool (#111186)
Initial patch to start using TableGen's auto generated function
`MatchRegisterName()`.

Update `PPCAsmParser::matchRegisterName()` implementation to align more
with tablegen's auto generated function.
2024-10-08 11:27:18 -04:00
RolandF77
06c8210a67
update P7 32-bit partial vector load cost (#108261)
Update cost model to reflect codegen change to use lfiwzx 
for 32-bit partial vector loads on pwr7 with
https://github.com/llvm/llvm-project/pull/104507.
2024-10-03 12:28:43 -04:00
Philip Reames
d288574363
[TTI][RISCV] Model cost of loading constants arms of selects and compares (#109824)
This follows in the spirit of 7d82c99403f615f6236334e698720bf979959704,
and extends the costing API for compares and selects to provide
information about the operands passed in an analogous manner. This
allows us to model the cost of materializing the vector constant, as
some select-of-constants are significantly more expensive than others
when you account for the cost of materializing the constants involved.

This is a stepping stone towards fixing
https://github.com/llvm/llvm-project/issues/109466. A separate SLP patch
will be required to utilize the new API.
2024-09-25 07:25:57 -07:00
Youngsuk Kim
d31e314131 [llvm] Don't call raw_string_ostream::flush() (NFC)
Don't call raw_string_ostream::flush(), which is essentially a no-op.
As specified in the docs, raw_string_ostream is always unbuffered.
( 65b13610a5226b84889b923bae884ba395ad084d for further reference )
2024-09-20 12:19:59 -05:00
Jay Foad
e03f427196
[LLVM] Use {} instead of std::nullopt to initialize empty ArrayRef (#109133)
It is almost always simpler to use {} instead of std::nullopt to
initialize an empty ArrayRef. This patch changes all occurrences I could
find in LLVM itself. In future the ArrayRef(std::nullopt_t) constructor
could be deprecated or removed.
2024-09-19 16:16:38 +01:00
Lei Huang
4b524088a8
[NFC] Update function names in MCTargetAsmParser.h (#108643)
Update function names to adhere to LLVM coding standard.
2024-09-18 11:43:49 -04:00
Zaara Syeda
22067a8eb4
[PowerPC] Fix assert exposed by PR 95931 in LowerBITCAST (#108062)
Hit Assertion failed: Num < NumOperands && "Invalid child # of SDNode!" 
Fix by checking opcode and value type before calling getOperand.
2024-09-10 14:14:01 -04:00
Qiu Chaofan
06c331163e
[PowerPC] Implement llvm.set.rounding intrinsic (#67302) 2024-09-10 14:30:31 +08:00
anjenner
4af249fe6e
Add usub_cond and usub_sat operations to atomicrmw (#105568)
These both perform conditional subtraction, returning the minuend and
zero respectively, if the difference is negative.
2024-09-06 16:19:20 +01:00
Craig Topper
aafaa69434
[Target] Use templated MachineFunction::getSubtarget in *CallingConv.td. NFC (#107311)
This hides away the static_cast needed to get the target specific Subtarget
object.
2024-09-04 23:15:25 -07:00
RolandF77
26ba186bd0
[PowerPC] Improve pwr7 codegen for v4i8 load (#104507)
There are no partial vector loads on pwr7 so current v4i8 codegen is an
int load then store to vector sized temp and re-load as vector. Try to
use lfiwax to load 32 bits into an FP reg and take advantage of VSX FP
and vector reg sharing to move the result to the right vector position.
2024-09-04 12:55:27 -04:00
Kazu Hirata
9a17a6016d
[PowerPC] Use DenseMap::operator[] (NFC) (#107044) 2024-09-03 20:02:18 -07:00
Matt Arsenault
911b96058a
PPC: Custom lower ppcf128 is_fpclass if is_fpclass is custom (#105540)
Unfortunately expandIS_FPCLASS is called directly in SelectionDAGBuilder
depending on whether IS_FPCLASS is custom or not. This helps avoid ppc test
regressions in a future patch where the custom lowering would be bypassed.
2024-08-29 14:01:54 +04:00
RolandF77
89bbcbe285
[PowerPC] fix legalization crash (#105563)
If v2i64 scalar_to_vector is made custom, llc can crash in certain
legalization cases where v2i64 vectors are injected, even if they
weren't otherwise present. The code generated would be fine, but that
operation is not handled in ReplaceNodeResults. Add handling.
2024-08-28 11:22:23 -04:00
Piyou Chen
b01c006f73
[TII][RISCV] Add renamable bit to copyPhysReg (#91179)
The renamable flag is useful during MachineCopyPropagation but renamable
flag will be dropped after lowerCopy in some case.

This patch introduces extra arguments to pass the renamable flag to
copyPhysReg.
2024-08-27 10:08:43 +08:00
Kai Luo
8e901c255d
[PowerPC] Retire PPCExpandISel pass (#84289)
We can decide whether to expand isel or not in instruction selection
pass and early-if-conversion pass. The transformation implemented in
PPCExpandISel can be retired considering PPC backend doesn't generate
`isel` instructions post-RA.
Also if we are seeking performant branch-or-isel decision, we can turn
to selectoptimize pass.

---------

Co-authored-by: Kai Luo <lkail@cn.ibm.com>
2024-08-27 09:43:52 +08:00
Craig Topper
4b0c0ec6b8
[CodeGen] Use MCRegister for CCState::AllocateReg and CCValAssign::getReg. NFC (#106032) 2024-08-26 11:40:25 -07:00
Kazu Hirata
33e7cd6ff2
[llvm] Prefer StringRef::substr to StringRef::slice (NFC) (#105943)
S.substr(N) is simpler than S.slice(N, StringRef::npos) and
S.slice(N, S.size()). Also, substr is probably better recognizable
than slice thanks to std::string_view::substr.
2024-08-25 11:30:49 -07:00
Zaara Syeda
327edbe07a
[PowerPC] Fix mask for __st[d/w/h/b]cx builtins (#104453)
These builtins are currently returning CR0 which will have the format
[0, 0, flag_true_if_saved, XER].
We only want to return flag_true_if_saved. This patch adds a shift to
remove the XER bit before returning.
2024-08-22 09:55:46 -04:00
Craig Topper
e994494a59
[PowerPC] Use MathExtras helpers to simplify code. NFC (#104691) 2024-08-17 23:12:52 -07:00
Amy Kwan
cf721e29c6
[PowerPC] Do not merge TLS constants within PPCMergeStringPool.cpp (#94059)
This patch prevents thread-local constants to be merged within
PPCMergeStringPool.cpp.

The PPCMergeStringPool pass primarily merges non-thread-local constants
together, and thread-local constants should not be mixed together with
other (non-thread-local) constants. In the event that thread-local and
other non-thread-local constants are pooled together, the
llvm.threadlocal.address intrinsic can fail as it expects its argument
to be a thread-local global value, but the merged string structure
created by the PPCMergeStringPool pass is not thread-local as a whole.
2024-08-16 15:06:50 -04:00
Kazu Hirata
3080c80671
[PowerPC] Use range-based for loops (NFC) (#104410) 2024-08-15 17:58:46 -07:00
Amy Kwan
9325381998
[PowerPC][GlobalMerge] Enable GlobalMerge by default on AIX (#101226)
This patch turns on the GlobalMerge pass by default on AIX and updates
LIT tests accordingly.
2024-08-15 15:25:54 -04:00
Amy Kwan
5e990b0b7f
[PowerPC][GlobalMerge] Reduce TOC usage by merging internal and private global data (#101224)
This patch aims to reduce TOC usage by merging internal and private
global data.

Moreover, we also add the GlobalMerge pass within the PPCTargetMachine
pipeline, which is disabled by default. This transformation can be
enabled by -ppc-global-merge.
2024-08-14 10:14:33 -04:00
Craig Topper
5e231ffe29 [PowerPC] Use APInt::getZExtValue() instead of getRawData(). NFC 2024-08-13 19:05:12 -07:00
RolandF77
8b6e9de3dd
[PowerPC] improve P10 store forwarding on P7 scalar to vector (#102330)
Try to make P7 code with scalar to vector operations that use store/re-load to run smoother on P10 by supplying enough store width to cover the load and allow hardware store forwarding.
2024-08-12 12:30:06 -04:00
Kazu Hirata
f4fb735840
[llvm] Construct SmallVector<SDValue> with ArrayRef (NFC) (#102578) 2024-08-09 09:15:42 -07:00
Tim Gymnich
408d82d352
[PowerPC] Respect endianness when bitcasting to fp128 (#95931)
Fixes #92246

Match the behaviour of `bitcast v2i64 (BUILD_PAIR %lo %hi)` when
encountering `bitcast fp128 (BUILD_PAIR %lo $hi)`.
by inserting a missing swap of the arguments based on endianness.

### Current behaviour: 
**fp128**
bitcast fp128 (BUILD_PAIR %lo $hi) => BUILD_FP128 %lo %hi
BUILD_FP128 %lo %hi => MTVSRDD %hi %lo

**v2i64**
bitcast v2i64 (BUILD_PAIR %lo %hi) => BUILD_VECTOR %hi %lo
BUILD_VECTOR %hi %lo => MTVSRDD %lo %hi
2024-08-08 08:51:04 +08:00
Lei Huang
64510c1411
[PPC] Implement BCD assist builtins (#101390)
Implement BCD assist builtins for XL and GCC compatibility.

GCC compat:
```
unsigned int __builtin_cdtbcd (unsigned int);
unsigned int __builtin_cbcdtd (unsigned int);
unsigned int __builtin_addg6s (unsigned int, unsigned int);
```

64BIT XL compat:
```
long long __cdtbcd (long long); 
long long __cbcdtd (long long); 
long long __addg6s (long long source1, long long source2)
```
2024-08-07 13:38:48 -04:00
Zaara Syeda
d07f106e51
[PPC][AIX] Save/restore r31 when using base pointer (#100182)
When the base pointer r30 is used to hold the stack pointer, r30 is
spilled in the prologue. On AIX registers are saved from highest to
lowest, so r31 also needs to be saved.

Fixes https://github.com/llvm/llvm-project/issues/96411
2024-08-07 09:59:45 -04:00
Alexis Engelke
da0e66e64c
[CodeGen][NFC] Add wrapper method for MBBMap (#101893)
This is a preparation for changing the data structure of MBBMap.
2024-08-04 18:34:26 +02:00