1491 Commits

Author SHA1 Message Date
Nemanja Ivanovic
1fed131660 [PowerPC] Canonicalize shuffles to match more single-instruction masks on LE
We currently miss a number of opportunities to emit single-instruction
VMRG[LH][BHW] instructions for shuffles on little endian subtargets. Although
this in itself is not a huge performance opportunity since loading the permute
vector for a VPERM can always be pulled out of loops, producing such merge
instructions is useful to downstream optimizations.
Since VPERM is essentially opaque to all subsequent optimizations, we want to
avoid it as much as possible. Other permute instructions have semantics that can
be reasoned about much more easily in later optimizations.

This patch does the following:
- Canonicalize shuffles so that the first element comes from the first vector
  (since that's what most of the mask matching functions want)
- Switch the elements that come from splat vectors so that they match the
  corresponding elements from the other vector (to allow for merges)
- Add debugging messages for when a shuffle is matched to a VPERM so that
  anyone interested in improving this further can get the info for their code
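
The first bullet can be illustrated with a small sketch. This is not the
patch's code: the function names are hypothetical, and the mask convention
(indices 0..n-1 select from the first vector, n..2n-1 from the second,
negative means undefined) is assumed to follow the usual shuffle-mask layout.

```c
#include <assert.h>
#include <stddef.h>

/* Swap the roles of the two input vectors in a shuffle mask.
   Indices 0..n-1 refer to the first vector, n..2n-1 to the second.
   Negative entries (undefined lanes) are left untouched. */
void commute_shuffle_mask(int *mask, size_t n) {
  assert(mask != NULL && n > 0);
  for (size_t i = 0; i < n; ++i) {
    if (mask[i] < 0)
      continue;
    mask[i] = mask[i] < (int)n ? mask[i] + (int)n : mask[i] - (int)n;
  }
}

/* If lane 0 comes from the second vector, commute the mask so that it
   comes from the first. Returns 1 when the mask was commuted, in which
   case the caller must also swap the two shuffle operands. */
int canonicalize_first_from_first(int *mask, size_t n) {
  assert(mask != NULL && n > 0);
  if (mask[0] >= (int)n) {
    commute_shuffle_mask(mask, n);
    return 1;
  }
  return 0;
}
```

For a 4-element shuffle, the mask {4, 0, 5, 1} becomes {0, 4, 1, 5} once the
operands are swapped, which is the interleaved shape a merge-style matcher
would look for.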

Differential revision: https://reviews.llvm.org/D77448
2020-06-18 21:54:22 -05:00
Esme-Yi
ad6024e29f [PowerPC] Custom lower rotl v1i128 to vector_shuffle.
Summary: A bug is reported in bugzilla-45628, where the swap_with_shift case can't be matched to a single HW instruction, xxswapd, as expected.
In fact, the case matches the rotate idiom. MatchRotate handles an 'or' of two operands and generates a rot[lr] if the operands match that idiom. However, PPC doesn't support ROTL v1i128, so we custom lower ROTL v1i128 to a vector_shuffle. The vector_shuffle is then matched to a single HW instruction during instruction selection.
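
As a rough illustration of why a rotate can become a swap: rotating a 128-bit
value left by 64 bits is the same as exchanging its two doublewords. The
sketch below models the value as two 64-bit halves. The type and function
names are made up for this example and are not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* A 128-bit value modeled as two 64-bit halves (hypothetical type). */
typedef struct { uint64_t hi, lo; } u128;

/* Logical shift left by n bits, 0 < n < 128. */
u128 shl128(u128 x, unsigned n) {
  assert(n > 0 && n < 128);
  u128 r;
  if (n >= 64) {
    r.hi = x.lo << (n - 64);
    r.lo = 0;
  } else {
    r.hi = (x.hi << n) | (x.lo >> (64 - n));
    r.lo = x.lo << n;
  }
  return r;
}

/* Logical shift right by n bits, 0 < n < 128. */
u128 shr128(u128 x, unsigned n) {
  assert(n > 0 && n < 128);
  u128 r;
  if (n >= 64) {
    r.lo = x.hi >> (n - 64);
    r.hi = 0;
  } else {
    r.lo = (x.lo >> n) | (x.hi << (64 - n));
    r.hi = x.hi >> n;
  }
  return r;
}

/* The rotate idiom: (x << n) | (x >> (128 - n)). */
u128 rotl128(u128 x, unsigned n) {
  n &= 127;
  if (n == 0)
    return x;
  u128 a = shl128(x, n);
  u128 b = shr128(x, 128 - n);
  u128 r = { a.hi | b.hi, a.lo | b.lo };
  return r;
}

/* What a doubleword swap does to the same value. */
u128 swap_doublewords(u128 x) {
  u128 r = { x.lo, x.hi };
  return r;
}
```

With these definitions, rotl128(x, 64) and swap_doublewords(x) return the same
halves for any x.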

Reviewed By: steven.zhang

Differential Revision: https://reviews.llvm.org/D81076
2020-06-18 01:32:23 +00:00
Qiu Chaofan
13edcd696e [PowerPC] Support constrained rounding operations
This patch adds handling of the constrained FP intrinsics for rounding,
truncation and extension on the PowerPC target, with the necessary tests.

Reviewed By: steven.zhang

Differential Revision: https://reviews.llvm.org/D64193
2020-06-14 23:43:31 +08:00
Qiu Chaofan
7315d221a2 [PowerPC] Exploit vnmsubfp instruction
On PowerPC, we have the vnmsubfp Altivec instruction for the fnmsub operation
on the v4f32 type. The default pattern for this instruction never matches
since we don't have a legal fneg for v4f32 when VSX is disabled.

Reviewed By: steven.zhang

Differential Revision: https://reviews.llvm.org/D80617
2020-06-14 23:19:17 +08:00
Guillaume Chatelet
1778564f91 [Alignment][NFC] Migrate the rest of backends
Summary: This is a followup on D81196

Reviewers: courbet

Subscribers: arsenm, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, sbc100, jgravelle-google, hiraditya, aheejin, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, Jim, lenary, s.egerton, pzheng, sameer.abuasal, apazos, luismarques, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D81278
2020-06-08 07:17:20 +00:00
Qiu Chaofan
7a001a2d92 [PowerPC] Require nsz flag for c-a*b to FNMSUB
On PowerPC, FNMSUB (both VSX and non-VSX versions) means -(a*b-c). But
the backend used to generate these instructions regardless of whether the
nsz flag was present. If a*b-c==0, such a transformation changes the sign
of zero.

This patch introduces a PPC-specific FNMSUB ISD opcode, which may help
improve combined FMA code sequences.
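
The sign-of-zero difference is easy to reproduce in plain C. The sketch below
assumes IEEE 754 doubles, the default round-to-nearest mode, and no fast-math
flags. The function names are illustrative only.

```c
#include <math.h>

/* The expression as written in the source program. */
double sub_form(double a, double b, double c) {
  return c - a * b;
}

/* What FNMSUB computes according to the description above. */
double fnmsub_form(double a, double b, double c) {
  return -(a * b - c);
}

/* Returns 1 when the two forms disagree on the sign bit of the result. */
int sign_of_zero_differs(double a, double b, double c) {
  return (signbit(sub_form(a, b, c)) != 0) !=
         (signbit(fnmsub_form(a, b, c)) != 0);
}
```

With a = b = c = 1.0, sub_form yields +0.0 while fnmsub_form yields -0.0. The
two compare equal with ==, but they differ in the sign bit, which is why the
rewrite is only valid when signed zeros can be ignored.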

Reviewed By: steven.zhang

Differential Revision: https://reviews.llvm.org/D76585
2020-06-04 16:41:27 +08:00
QingShan Zhang
a462561cee [NFC][PowerPC] Remove unused node PPCISD::VMADDFP and PPCISD::VNMSUBFP
These two nodes were added by 69caef2b781130a7d0eeaf8898eb346b6423ae03 in 2005
and they are no longer used by the PowerPC backend. ISD::FMA is the preferred
way to get VMADDFP if we really want to create that node. For VNMSUBFP, we will
also add a more generic node, FNMSUB, in D76585 if we really want it.

Reviewed By: qiucf

Differential Revision: https://reviews.llvm.org/D80429
2020-06-03 06:36:30 +00:00
Li Rong Yi
3101601b54 [PowerPC] Exploit vabsd on P9
Summary: Exploit vabsd* for absolute difference of vectors on P9,
for example:
void foo (char *restrict p, char *restrict q, char *restrict t)
{
  for (int i = 0; i < 16; i++)
     t[i] = abs (p[i] - q[i]);
}
this case should be matched to the HW instruction vabsdub.

Reviewed By: steven.zhang

Differential Revision: https://reviews.llvm.org/D80271
2020-06-01 02:30:27 +00:00
Zequan Wu
80e107ccd0 Add NoMerge MIFlag to avoid MIR branch folding
Let codegen recognize the nomerge attribute and disable branch folding when the attribute is given.

Differential Revision: https://reviews.llvm.org/D79537
2020-05-29 12:31:06 -07:00
Lei Huang
2368bf52cd [PowerPC] Add support for -mcpu=pwr10 in both clang and llvm
Summary:
This patch simply adds support for the new CPU in anticipation of
Power10. There isn't really any functionality added so there are no
associated test cases at this time.

Reviewers: stefanp, nemanjai, amyk, hfinkel, power-llvm-team, #powerpc

Reviewed By: stefanp, nemanjai, amyk, #powerpc

Subscribers: NeHuang, steven.zhang, hiraditya, llvm-commits, wuzish, shchenz, cfe-commits, kbarton, echristo

Tags: #clang, #powerpc, #llvm

Differential Revision: https://reviews.llvm.org/D80020
2020-05-27 13:14:25 -05:00
Lei Huang
559845f8fe Revert "[PowerPC] Add support for -mcpu=pwr10 in both clang and llvm"
This reverts commit 7eb666b1556b86503f2f386bf921186cdbb2d22a.
2020-05-27 09:40:21 -05:00
Lei Huang
7eb666b155 [PowerPC] Add support for -mcpu=pwr10 in both clang and llvm
Summary:
This patch simply adds support for the new CPU in anticipation of
Power10. There isn't really any functionality added so there are no
associated test cases at this time.

Reviewers: stefanp, nemanjai, amyk, hfinkel, power-llvm-team, #powerpc

Reviewed By: stefanp, nemanjai, amyk, #powerpc

Subscribers: NeHuang, steven.zhang, hiraditya, llvm-commits, wuzish, shchenz, cfe-commits, kbarton, echristo

Tags: #clang, #powerpc, #llvm

Differential Revision: https://reviews.llvm.org/D80020
2020-05-26 13:48:22 -05:00
Nemanja Ivanovic
099a875f28 [PowerPC] Unaligned FP default should apply to scalars only
As reported in PR45186, we could be in a situation where we don't
want to handle unaligned memory accesses for FP scalars but still
have VSX (which allows unaligned access for vectors). Change the
default to only apply to scalars.

Fixes: https://bugs.llvm.org/show_bug.cgi?id=45186
2020-05-26 10:19:06 -05:00
Nemanja Ivanovic
793cc518b9 [PowerPC] Prevent legalization loop from promoting SELECT_CC from v4i32 to v4i32
As reported in https://bugs.llvm.org/show_bug.cgi?id=45709 we can hit an
infinite loop in legalization since we set the legalization action for
ISD::SELECT_CC for all fixed length vector types to Promote. Without some
different legalization action for the type being promoted to, the legalizer
simply loops. Since we don't have patterns to match the node, the right
legalization action should be Expand.

Differential revision: https://reviews.llvm.org/D79854
2020-05-25 20:09:07 -05:00
Amy Kwan
b631f86ac5 [TLI][PowerPC] Introduce TLI query to check if MULH is cheaper than MUL + SHIFT
This patch introduces a TargetLowering query, isMulhCheaperThanMulShift.

Currently, DAG Combine transforms mulhs/mulhu into a
wider multiply and a shift if the wide multiply is legal.

This TLI function is implemented on 64-bit PowerPC, as it is more desirable to
have multiply-high over multiply + shift for words and doublewords. Having
multiply-high can also aid in further transformations that can be done.
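
For reference, the two forms being compared can be written out for unsigned
32-bit words as below. This is only a sketch of the arithmetic, with a
hypothetical function name. The signed (mulhs) case is analogous, using sign
extension instead of zero extension.

```c
#include <stdint.h>

/* Multiply-high expressed as a wider multiply plus a shift: widen both
   operands to 64 bits, multiply, and keep the upper 32 bits. A target
   with a cheap multiply-high instruction can produce the same value
   without forming the full 64-bit product. */
uint32_t mulhu32_via_wide_mul(uint32_t a, uint32_t b) {
  return (uint32_t)(((uint64_t)a * (uint64_t)b) >> 32);
}
```

For example, 0x80000000 * 2 is 0x1_0000_0000, so the high word is 1.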

Differential Revision: https://reviews.llvm.org/D78271
2020-05-23 16:47:12 -05:00
Nemanja Ivanovic
1a493b0fa5 [PowerPC] Add missing handling for half precision
The fix for PR39865 took care of some of the handling for half precision
but it missed a number of issues that still exist. This patch fixes the
remaining issues that cause crashes in the PPC back end.

Fixes: https://bugs.llvm.org/show_bug.cgi?id=45776

Differential revision: https://reviews.llvm.org/D79283
2020-05-22 07:50:11 -05:00
Sean Fertile
ce4ebc14a8 [PowerPC] Remove support for SplitCSR.
SplitCSR was only supported for functions with the CXX_FAST_TLS calling
convention. Clang only emits that calling convention for Darwin, which is
no longer supported by the PowerPC backend. Another IR producer could
use the calling convention, but considering the calling convention is
meant to be an optimization and the codegen for SplitCSR can be
atrocious on Power (see the modified lit test), it is best to remove it
and codegen CXX_FAST_TLS the same as the C calling convention.

Differential Revision: https://reviews.llvm.org/D79018
2020-05-14 10:32:17 -04:00
Qiu Chaofan
e9753822b5 [PowerPC] Respect SDNodeFlags in lowering SELECT_CC
The legalizer should respect both command-line options and SDNode-level
fast-math flags.

Also, this patch propagates other flags during custom simplifying.

Reviewed By: steven.zhang

Differential Revision: https://reviews.llvm.org/D79074
2020-05-13 14:05:47 +08:00
Qiu Chaofan
e8d2ff22f0 [PowerPC] Add fma/fsqrt/fmax strict-fp intrinsics
This patch adds strict-fp intrinsics support for fma, fsqrt, fmaxnum and
fminnum on PowerPC.

Reviewed By: hfinkel

Differential Revision: https://reviews.llvm.org/D72749
2020-05-12 13:44:09 +08:00
Kang Zhang
dcc5ff3bc2 [PowerPC] Use PredictableSelectIsExpensive to enable select to branch in CGP
Summary:
This patch sets the variable PredictableSelectIsExpensive so that
CodeGenPrepare converts a select to a branch based on BranchProbability.

When the BranchProbability is more than MinPercentageForPredictableBranch,
PPC will convert the SELECT to a branch.
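
The threshold test described above can be sketched as follows. This is a
simplified model, not the CodeGenPrepare implementation: the function name,
the use of raw branch weights, and the integer comparison are all assumptions
made for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 when the more likely side of a select
   is taken with a probability strictly greater than min_percentage,
   i.e. when the select looks predictable enough to become a branch.
   The weights stand in for profile data on the true and false values. */
int select_is_predictable(uint32_t true_weight, uint32_t false_weight,
                          unsigned min_percentage) {
  assert(min_percentage <= 100);
  uint64_t total = (uint64_t)true_weight + false_weight;
  if (total == 0)
    return 0; /* no profile information, keep the select */
  uint64_t max = true_weight > false_weight ? true_weight : false_weight;
  /* max / total > min_percentage / 100, done in integer arithmetic. */
  return max * 100 > total * (uint64_t)min_percentage;
}
```

With a threshold of 99, weights of 995 and 5 qualify (99.5%), while weights of
99 and 1 do not, because exactly 99% is not more than the threshold.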

Reviewed By: nemanjai

Differential Revision: https://reviews.llvm.org/D71883
2020-05-11 15:02:09 +00:00
Craig Topper
d1119980e5 [SelectionDAG] Use Align/MaybeAlign for ConstantPoolSDNode.
This patch stores the alignment for ConstantPoolSDNode as an
Align and updates the getConstantPool interface to take a MaybeAlign.

Removing getAlignment() will be done as a follow up.

Differential Revision: https://reviews.llvm.org/D79436
2020-05-08 16:04:11 -07:00
Sean Fertile
2a3cf5e583 [PowerPC][AIX] Pass ByVal formal args that span registers and stack.
Implement passing of ByVal formal arguments when the argument is passed
partly in the argument registers, with the remainder of the argument
passed on the stack.
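
The register/stack split can be pictured with the arithmetic below. This is a
minimal sketch under stated assumptions: the struct and function are
hypothetical, each argument register is assumed to hold one pointer-sized
chunk, and details of the real AIX ABI such as the parameter save area are
ignored.

```c
#include <assert.h>
#include <stddef.h>

/* Result of splitting a by-value argument (hypothetical type). */
typedef struct {
  unsigned regs_used;    /* argument registers consumed */
  size_t bytes_in_regs;  /* bytes of the argument held in registers */
  size_t bytes_on_stack; /* remaining bytes passed on the stack */
} byval_split;

/* Split an argument of arg_size bytes across free_gprs remaining
   argument registers of ptr_size bytes each, spilling the rest. */
byval_split split_byval(size_t arg_size, unsigned free_gprs,
                        size_t ptr_size) {
  assert(ptr_size == 4 || ptr_size == 8);
  byval_split s;
  size_t regs_needed = (arg_size + ptr_size - 1) / ptr_size;
  s.regs_used = regs_needed < free_gprs ? (unsigned)regs_needed : free_gprs;
  size_t capacity = (size_t)s.regs_used * ptr_size;
  s.bytes_in_regs = arg_size < capacity ? arg_size : capacity;
  s.bytes_on_stack = arg_size - s.bytes_in_regs;
  return s;
}
```

For a 20-byte argument with two 8-byte registers left, 16 bytes travel in
registers and the remaining 4 bytes go on the stack.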

Differential Revision: https://reviews.llvm.org/D78515
2020-04-28 14:57:14 -04:00
Craig Topper
a58b62b4a2 [IR] Replace all uses of CallBase::getCalledValue() with getCalledOperand().
This method has been commented as deprecated for a while. Remove
it and replace all uses with the equivalent getCalledOperand().

I also made a few cleanups in here. For example, this removes uses
of getElementType on a pointer when we could just use getFunctionType
from the call.

Differential Revision: https://reviews.llvm.org/D78882
2020-04-27 22:17:03 -07:00
Stefan Pintilie
1354a03e74 [PowerPC][Future] Implement PC Relative Tail Calls
Tail Calls were initially disabled for PC Relative code because it was not safe
to make certain assumptions about the tail calls (namely that all compiled
functions no longer used the TOC pointer in R2). However, once all of the
TOC pointer references have been removed it is safe to tail call everything
that was tail called prior to the PC relative additions as well as a number of
new cases.
For example, it is now possible to tail call indirect functions as there is no
need to save and restore the TOC pointer for indirect functions if the caller
is marked as may clobber R2 (st_other=1). For the same reason it is now also
possible to tail call functions that are external.

Differential Revision: https://reviews.llvm.org/D77788
2020-04-27 12:55:08 -05:00
Victor Huang
e20b07b021 [PowerPC][Future] Add missing changes for PC Relative addressing
1. Use Subtarget.isUsingPCRelativeCalls() in LowerConstantPool to
check if using PCRelative addressing.

2. Change MO_GOT_FLAG = 32 to MO_GOT_FLAG = 8 in PPC.h to use
consecutive bits.

Differential Revision: https://reviews.llvm.org/D78406
2020-04-23 10:26:43 -05:00
Victor Huang
a60ca4b4e9 [PowerPC][Future] Initial support for PCRel addressing to get block address
Add initial support for PCRelative addressing to get block address
instead of using TOC.

Differential Revision: https://reviews.llvm.org/D76294
2020-04-22 15:01:29 -05:00
Victor Huang
02141a17ae [PowerPC][Future] Remove redundant r2 save and restore for indirect call
Currently, an indirect call produces the following sequence in PCRelative mode:

extern void function( );
extern void (*ptrfunc) ( );

void g() {
    ptrfunc=function;
}

void f() {
    (*ptrfunc) ( );
}

Producing

paddi 3, 0, .LC0@PCREL, 1
ld 3, 0(3)
std 2, 24(1)
ld 12, 0(3)
mtctr 12
bctrl
ld 2, 24(1)

Though the caller does not use or preserve r2, it is still saved and restored
across a function call. This patch removes these redundant saves and
restores for indirect calls.

Differential Revision: https://reviews.llvm.org/D77749
2020-04-22 12:05:51 -05:00
Victor Huang
43abef06f4 [PowerPC][Future] Initial support for PCRel addressing for jump tables.
Add initial support for PC Relative addressing to get jump table base
address instead of using TOC.

Differential Revision: https://reviews.llvm.org/D75931
2020-04-22 10:45:01 -05:00
Craig Topper
d22989c34e [CallSite removal][Target] Replace CallSite with CallBase. NFC
In some cases just delete an unneeded include.
2020-04-21 23:29:36 -07:00
Stefan Pintilie
a92ee77d85 [PowerPC][Future] Add offsets to PC Relative relocations.
This is an optimization that applies to global addresses and
allows for the following transformation:
Convert this:

paddi r3, 0, symbol@PCREL, 1
ld r4, 8(r3)

To this:

pld r4, symbol@PCREL+8(0), 1

An instruction is saved and the linker can do the addition when
the symbol is resolved.

Differential Revision: https://reviews.llvm.org/D76160
2020-04-21 11:08:19 -05:00
Christopher Tetreault
a9b137f9ff [SVE] Remove calls to getBitWidth from PowerPC
Reviewers: efriedma, sdesmalen, hfinkel, david-arm, fpetrogalli

Reviewed By: efriedma, fpetrogalli

Subscribers: wuzish, nemanjai, tschuett, hiraditya, kbarton, rkruppe, psnobl, shchenz, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77900
2020-04-20 14:18:37 -07:00
Nemanja Ivanovic
64b31d96df [PowerPC] Do not attempt to reuse load for 64-bit FP_TO_UINT without FPCVT
We call the function that attempts to reuse the conversion without checking
whether the target matches the constraints that the callee expects. This patch
adds the check prior to the call.

Fixes: https://bugs.llvm.org/show_bug.cgi?id=43976

Differential revision: https://reviews.llvm.org/D77564
2020-04-20 13:00:06 -05:00
Sean Fertile
d52bb6d099 [PowerPC][AIX] ByVal formal argument support: passing on the stack.
Adds support for passing a ByVal formal argument completely on the stack
(ie after all argument registers are exhausted).

Differential Revision: https://reviews.llvm.org/D78263
2020-04-20 12:04:59 -04:00
Stefan Pintilie
b771c4a842 [PowerPC][Future] More support for PCRel addressing for global values
Add initial support for PC Relative addressing for global values that
require GOT indirect addressing. This patch adds PCRelative support for
global addresses that may not be known at link time and may require
access through the GOT.

Differential Revision: https://reviews.llvm.org/D76064
2020-04-17 11:06:13 -05:00
Stefan Pintilie
18b6050324 [PowerPC][Future] Initial support for PC Relative addressing for global values
This patch adds PC Relative support for global values that are known at link
time. If a global value requires access through the global offset table (GOT)
it is not covered in this patch.

Differential Revision: https://reviews.llvm.org/D75280
2020-04-16 12:45:22 -05:00
Chris Bowler
bee6c234ed [AIX][PowerPC] Implement caller byval arguments in stack memory
Differential Revision: https://reviews.llvm.org/D77578
2020-04-15 17:57:31 -04:00
Craig Topper
113f37a1f9 [CallSite removal][TargetLowering] Replace ImmutableCallSite with CallBase
Differential Revision: https://reviews.llvm.org/D77995
2020-04-13 13:50:15 -07:00
Nemanja Ivanovic
512600e3c0 [PowerPC] Handle f16 as a storage type only
The PPC back end currently crashes (fails to select) with f16 input. This patch
expands it on subtargets prior to ISA 3.0 (Power9) and uses the HW conversions
on Power9.
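
To give a sense of what treating f16 as a storage-only type involves when
there is no hardware conversion, the sketch below widens an IEEE 754
binary16 bit pattern to a float in software. It is illustrative only: the
function name is made up, and the patch itself works at the SelectionDAG
level rather than with C code like this.

```c
#include <stdint.h>
#include <string.h>

/* Convert an IEEE 754 binary16 bit pattern to a float. */
float half_to_float(uint16_t h) {
  uint32_t sign = (uint32_t)(h & 0x8000u) << 16;
  uint32_t exp = (h >> 10) & 0x1Fu;
  uint32_t mant = h & 0x3FFu;
  uint32_t bits;

  if (exp == 0x1Fu) {
    /* Infinity or NaN: all-ones exponent, keep the payload bits. */
    bits = sign | 0x7F800000u | (mant << 13);
  } else if (exp != 0) {
    /* Normal number: rebias the exponent from 15 to 127. */
    bits = sign | ((exp + 112u) << 23) | (mant << 13);
  } else if (mant == 0) {
    /* Signed zero. */
    bits = sign;
  } else {
    /* Subnormal half: normalize it, since it is a normal float. */
    uint32_t shift = 0;
    while ((mant & 0x400u) == 0) {
      mant <<= 1;
      ++shift;
    }
    mant &= 0x3FFu;
    bits = sign | ((113u - shift) << 23) | (mant << 13);
  }

  float f;
  memcpy(&f, &bits, sizeof f); /* reinterpret the bits without aliasing UB */
  return f;
}
```

For example, 0x3C00 widens to 1.0f and 0x0001, the smallest subnormal half,
widens to 2^-24.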

Fixes https://bugs.llvm.org/show_bug.cgi?id=39865

Differential revision: https://reviews.llvm.org/D68237
2020-04-11 07:34:47 -05:00
Kai Luo
b7d5229d78 [PowerPC] Update alignment for ReuseLoadInfo in LowerFP_TO_INTForReuse
In LowerFP_TO_INTForReuse, when emitting `stfiwx`, alignment of 4 is
set for the `MachineMemOperand`, but RLI(ReuseLoadInfo)'s alignment is
not updated for following loads.

It's related to failed alignment check reported in
https://bugs.llvm.org/show_bug.cgi?id=45297

Differential Revision: https://reviews.llvm.org/D77624
2020-04-10 05:49:19 +00:00
jasonliu
085689d44c [PPC][AIX] Implement variadic function handling in LowerFormalArguments_AIX
Summary:
This patch adds support for handling of variadic functions for AIX.
This includes ensuring that we use and consume the correct type of
va_list (char *va_list) for AIX.

Authored by: ZarkoCA

Reviewers: cebowleratibm, sfertile, jasonliu

Reviewed by: jasonliu

Differential Revision: https://reviews.llvm.org/D76130
2020-04-09 16:49:44 +00:00
Stefan Pintilie
75828ef615 [PowerPC][Future] Initial support for PCRel addressing for constant pool loads
Add initial support for PC Relative addressing for constant pool loads.
This includes adding a new relocation for @pcrel and adding a new PowerPC flag
to identify PC relative addressing.

Differential Revision: https://reviews.llvm.org/D74486
2020-04-09 11:17:23 -05:00
Sean Fertile
d0b57b41f4 [PowerPC][AIX][NFC] Replace deprecated getByValAlign call.
Replace call to deprecated 'getByValAlign()' with
'getNonZeroByValAlign()'.
2020-04-08 13:27:39 -04:00
Matt Arsenault
84aa58cbe2 CodeGen: Use Register in TargetLowering
2020-04-08 12:10:58 -04:00
Sean Fertile
8abfd2c3bb [PowerPC][AIX] Enable passing byval formal arguments in multiple registers.
Any or all the argument registers can be used to pass a byval formal
argument, with the limitation that the argument must fit in the
available registers (ie: is not split between registers and stack).

Differential Revision: https://reviews.llvm.org/D76902
2020-04-08 11:16:33 -04:00
Stefan Pintilie
6c4b40def7 [PowerPC][Future] Add Support For Functions That Do Not Use A TOC.
On PowerPC most functions require a valid TOC pointer.

This is the case because either the function itself needs to use this
pointer to access the TOC or because other functions that are called
from that function expect a valid TOC pointer in the register R2.
The main exception to this is leaf functions that do not access the TOC
since they are guaranteed not to need a valid TOC pointer.

This patch introduces a feature that will allow more functions to not
require a valid TOC pointer in R2.

Differential Revision: https://reviews.llvm.org/D73664
2020-04-08 08:07:35 -05:00
Chris Bowler
d6ea82d11c [AIX][PPC] Implement by-val caller arguments in multiple registers
Differential Revision: https://reviews.llvm.org/D76380
2020-04-06 11:06:51 -04:00
jasonliu
d65557d15d [NFC][XCOFF][AIX] Refactor get/setContainingCsect
Summary:
In the current architecture, we always require setContainingCsect to be
called on every MCSymbol that is used in an XCOFF context.
This is very hard to achieve because symbols get created everywhere
and other MCSymbol types (ELF, COFF) do not have similar rules.
It's very easy to miss setting the containing csect, and we would
need to add a lot of XCOFF-specialized code around some common code areas.

This patch intends to:
1. Rely on getFragment().getParent() to get csect from labels.
2. Only use get/setRepresentedCsect (was get/setContainingCsect)
if symbol itself represents a csect.

Reviewers: DiggerLin, hubert.reinterpretcast, daltenty

Differential Revision: https://reviews.llvm.org/D77080
2020-04-03 13:33:12 +00:00
Guillaume Chatelet
1dffa2550b [Alignment][NFC] Transition to MachineFrameInfo::getObjectAlign()
Summary:
This patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: arsenm, sdardis, nemanjai, jvesely, nhaehnle, hiraditya, kbarton, jrtc27, atanasyan, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77215
2020-04-01 14:08:28 +00:00
Guillaume Chatelet
c7468c1696 [Alignment][NFC] Use Align in SelectionDAG::getMemIntrinsicNode
Summary:
This patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: jholewinski, nemanjai, hiraditya, kbarton, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77149
2020-04-01 09:32:05 +00:00
Kai Luo
8eb40e41f6 [PowerPC] Don't generate ST_VSR_SCAL_INT if power8-vector is disabled
Summary:
In https://bugs.llvm.org/show_bug.cgi?id=45297, instruction selection
fails for `PPCISD::ST_VSR_SCAL_INT`. The reason the backend generates
`PPCISD::ST_VSR_SCAL_INT` with `-power8-vector` in the IR is that PPC's
combiner checks `hasP8Altivec` rather than `hasP8Vector`. This patch
should resolve PR45297.

Differential Revision: https://reviews.llvm.org/D76773
2020-04-01 02:15:25 +00:00