llvm-project

Author	SHA1	Message	Date
Amy Kwan	b631f86ac5	[TLI][PowerPC] Introduce TLI query to check if MULH is cheaper than MUL + SHIFT This patch introduces a TargetLowering query, isMulhCheaperThanMulShift. Currently in DAG Combine, it will transform mulhs/mulhu into a wider multiply and a shift if the wide multiply is legal. This TLI function is implemented on 64-bit PowerPC, as it is more desirable to have multiply-high over multiply + shift for words and doublewords. Having multiply-high can also aid in further transformations that can be done. Differential Revision: https://reviews.llvm.org/D78271	2020-05-23 16:47:12 -05:00
Xiangling_Liao	2419dce5d1	[NFC][AIX] Remove spaces after the comma for '.csect' directive To be consistent with other directives like '.comm', '.lcomm', we remove the spaces after the comma for '.csect' on AIX. Differential Revision: https://reviews.llvm.org/D80247	2020-05-22 11:10:32 -04:00
Nemanja Ivanovic	1a493b0fa5	[PowerPC] Add missing handling for half precision The fix for PR39865 took care of some of the handling for half precision but it missed a number of issues that still exist. This patch fixes the remaining issues that cause crashes in the PPC back end. Fixes: https://bugs.llvm.org/show_bug.cgi?id=45776 Differential revision: https://reviews.llvm.org/D79283	2020-05-22 07:50:11 -05:00
Jon Roelofs	5a8db275f8	Revert "[llvm][test] Add COM: directives before colon-less non-CHECKs in comments. NFC" This reverts commit 183d6af081899973f00fc24aeafcfc32de732f02. Revert pending further consensus building: https://reviews.llvm.org/D79963#2050521	2020-05-22 05:36:15 -06:00
QingShan Zhang	d1076d729a	[NFC][Test] Add test coverage for fsqrt on PowerPC	2020-05-22 10:59:27 +00:00
Jon Roelofs	183d6af081	[llvm][test] Add COM: directives before colon-less non-CHECKs in comments. NFC Differential Revision: https://reviews.llvm.org/D79963	2020-05-21 09:29:27 -06:00
Chen Zheng	8086cdd1b0	[PowerPC] add more high latency opcodes for machine combiner pass Reviewed By: steven.zhang Differential Revision: https://reviews.llvm.org/D80097	2020-05-21 02:39:20 -04:00
Kang Zhang	58684fbb6f	[NFC][PowerPC] Add 2 new cases to test livevars pass	2020-05-20 05:32:09 +00:00
Qiu Chaofan	ad4f196e25	[NFC] [PowerPC] Refresh fma-mutate.ll using script This is a clean-up after D78989. The old comments are out of date.	2020-05-19 13:39:58 +08:00
Chen Zheng	a6be4d17e3	[PowerPC-QPX] adjust operands order of qpx fma instructions. convert %3 = QVFMADD %2, %0, %1, implicit $rm to %3 = QVFMADD %2, %1, %0, implicit $rm Reviewed By: hfinkel, steven.zhang Differential Revision: https://reviews.llvm.org/D78986	2020-05-18 22:59:51 -04:00
Chen Zheng	4a69eda6f3	[PowerPC][MachineCombiner] add testcase for reassociating FMA - NFC	2020-05-18 21:18:01 -04:00
Chen Zheng	455ccde137	[PowerPC] add more high latency opcodes for machinecombiner - NFC	2020-05-17 21:02:06 -04:00
Li Rong Yi	80173566f4	[PowerPC] Add an intrinsic for Popcntb Summary: This patch adds the intrinsic llvm.ppc.popcntb for the HW instruction POPCNTB Reviewed By: steven.zhang Differential Revision: https://reviews.llvm.org/D79703	2020-05-15 15:19:12 +08:00
Sean Fertile	ce4ebc14a8	[PowerPC] Remove support for SplitCSR. SplitCSR was only supported for functions with CXX_FAST_TLS calling convention. Clang only emits that calling convention for Darwin which is no longer supported by the PowerPC backend. Another IR producer could use the calling convention, but considering the calling convention is meant to be an optimization and the codegen for SplitCSR can be atrocious on Power (see the modified lit test) it is best to remove it and codegen CXX_FAST_TLS same as the C calling convention. Differential Revision: https://reviews.llvm.org/D79018	2020-05-14 10:32:17 -04:00
Qiu Chaofan	2866c6cad4	[NFC] [PowerPC] Narrow fast-math flags in tests A lot of tests under PowerPC are using fast flag, while fast is just alias of 7 fast-math flags. This change makes test points clearer. mc-instrlat.ll and sms-iterator.ll keeps unchanged since they are not testing fast-math behavior. (one for machine combiner crash, one for machine pipeliner bug) Reviewed By: steven.zhang, spatel Differential Revision: https://reviews.llvm.org/D78989	2020-05-13 17:22:45 +08:00
Qiu Chaofan	8ffe8891cd	[PowerPC] Exploit VSX neg, abs and nabs for f32 xsnegdp, xsabsdp and xsnabsdp can be used to operate on f32 operand. This patch adds the missing patterns since we prefer VSX instructions when available. Reviewed By: steven.zhang Differential Revision: https://reviews.llvm.org/D75344	2020-05-13 14:28:50 +08:00
Qiu Chaofan	e9753822b5	[PowerPC] Respect SDNodeFlags in lowering SELECT_CC Legalizer should respect both command-line options or SDNode-level fast-math flags. Also, this patch propagates other flags during custom simplifying. Reviewed By: steven.zhang Differential Revision: https://reviews.llvm.org/D79074	2020-05-13 14:05:47 +08:00
Kang Zhang	782a4dd1a4	[PowerPC] Use add instead of addReg in ppc-early-ret pass Summary: The ppc-early-ret pass uses addReg() to add operands to the new instruction, which cannot preserve the flags of the old operand. This has caused machine verification failures. This patch uses add() instead of addReg(). Reviewed By: steven.zhang Differential Revision: https://reviews.llvm.org/D77997	2020-05-13 05:59:52 +00:00
Justin Hibbits	0138cc0125	PowerPC: Treat llvm.fma.f* intrinsic as using CTR with SPE Summary: The SPE doesn't have a 'fma' instruction, so the intrinsic becomes a libcall. It really should become an expansion to two instructions, but for some reason the compiler doesn't think that's as optimal as a branch. Since this lowering is done after CTR is allocated for loops, tell the optimizer that CTR may be used in this case. This prevents a "Invalid PPC CTR loop!" assertion in the case that a fma() function call is used in a C/C++ file, and clang converts it into an intrinsic. Reviewed By: shchenz Differential Revision: https://reviews.llvm.org/D78668	2020-05-12 17:19:43 -05:00
Kamau Bridgeman	cd83333fc8	[PowerPC] Fold redundant load immediates of zero and delete if possible This patch folds redundant load immediates into a zero for instructions which recognise this as the value zero and not the register. If the load immediate is no longer in use it is then deleted. This is already done in earlier passes but the ppc-mi-peephole allows for a more general implementation. Differential Revision: https://reviews.llvm.org/D69168	2020-05-12 13:15:06 -05:00
Qiu Chaofan	e8d2ff22f0	[PowerPC] Add fma/fsqrt/fmax strict-fp intrinsics This patch adds strict-fp intrinsics support for fma, fsqrt, fmaxnum and fminnum on PowerPC. Reviewed By: hfinkel Differential Revision: https://reviews.llvm.org/D72749	2020-05-12 13:44:09 +08:00
jasonliu	a1b04aaea2	Move PowerPC specific test under PowerPC directive to fix build break Fix build break on x86 platform which was introduced by https://reviews.llvm.org/D79127	2020-05-11 20:05:05 +00:00
jasonliu	51e6fc44d0	[XCOFF][AIX] Emit correct alignment for csect Summary: This patch tries to emit the correct alignment result for both object file generation path and assembly path. Reviewed by: hubert.reinterpretcast, DiggerLin, daltenty Differential Revision: https://reviews.llvm.org/D79127	2020-05-11 19:43:10 +00:00
Kang Zhang	dcc5ff3bc2	[PowerPC] Use PredictableSelectIsExpensive to enable select to branch in CGP Summary: This patch sets the variable PredictableSelectIsExpensive to enable the select-to-if conversion based on BranchProbability in CodeGenPrepare. When the BranchProbability is more than MinPercentageForPredictableBranch, PPC will convert SELECT to a branch. Reviewed By: nemanjai Differential Revision: https://reviews.llvm.org/D71883	2020-05-11 15:02:09 +00:00
Hubert Tong	601d5bd516	[Target][XCOFF] Correctly halt when mixing AIX or XCOFF with ppc64le The code to prevent using `PPCXCOFFMCAsmInfo` with little-endian targets used an incorrect check. Also, there does not appear to be sufficient earlier checking to prevent failing this check, so the check here is upgraded to be a `report_fatal_error`. `PPCAIXAsmPrinter` was also missing a check against use with little-endian targets. This patch adds such a check in.	2020-05-08 16:51:34 -04:00
Hubert Tong	b116ded57d	[AIX] Avoid structor alias; die before bad alias codegen Summary: `AsmPrinter::emitGlobalIndirectSymbol` is dependent on `MCStreamer::emitAssignment` to produce `.set` directives for alias symbols; however, the `.set` pseudo-op on AIX is documented as not usable with external relocatable terms or expressions, which limits its applicability in generating alias symbols. Disable generating aliases on AIX until a different implementation strategy is available. Reviewers: cebowleratibm, jasonliu, sfertile, daltenty, DiggerLin Reviewed By: jasonliu Differential Revision: https://reviews.llvm.org/D79044	2020-05-08 16:51:34 -04:00
Jinsong Ji	80b78a47e5	[MachinePipeliner] Add ORE for MachinePipeliner This patch adds ORE for MachinePipeliner, so that people can analyze their code using opt-viewer or other tools, then optimize the code to catch more pipelining opportunities. Reviewed By: bcahoon Differential Revision: https://reviews.llvm.org/D79368	2020-05-05 16:04:53 +00:00
Sean Fertile	47f9e71ac7	[PowerPC][AIX][NFC] Remove spills and reloads from arg passing test.	2020-05-04 14:26:33 -04:00
diggerlin	a2c8cd1812	[AIX] emit .extern and .weak directive linkage SUMMARY: emit .extern and .weak directive linkage Reviewers: hubert.reinterpretcast, Jason Liu Subscribers: wuzish, nemanjai, hiraditya Differential Revision: https://reviews.llvm.org/D76932	2020-04-30 09:54:10 -04:00
Chen Zheng	957c5dd78b	[PowerPC-QPX] add more test for QPX madd/msub operands order - NFC	2020-04-29 01:17:14 -04:00
QingShan Zhang	b5f89744cc	[DAGCombine] Checking the cost directly to improve the code readability Call getNegatedExpression(Cost) and check the Cost to make the code more clear. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D78347	2020-04-29 01:49:39 +00:00
Sean Fertile	2a3cf5e583	[PowerPC][AIX] Pass ByVal formal args that span registers and stack. Implement passing of ByVal formal arguments when the argument is passed partly in the argument registers, with the remainder of the argument passed on the stack. Differential Revision: https://reviews.llvm.org/D78515	2020-04-28 14:57:14 -04:00
Chen Zheng	22fdbd01a3	[Powerpc] add triple for new added qpx test case - NFC	2020-04-28 05:32:10 -04:00
Ng Zhi An	500b4ad5f4	[PowerPC] Fix downcast from nullptr for target streamer getTargetStreamer() might return null (e.g. when running inlined-strings.ll test), downcasting to a reference will be wrong. This is detectable with -fsanitize=null. Reviewed By: steven.zhang Differential Revision: https://reviews.llvm.org/D78686	2020-04-28 09:20:10 +00:00
Chen Zheng	949018cc27	[PowerPC] add test case for reorder operands of qpx fma instr - nfc.	2020-04-28 04:43:32 -04:00
Chen Zheng	45d92806ea	[PowerPC] use inst-level fast-math-flags to drive MachineCombiner Currently, on PowerPC target, it uses function scope UnsafeFPMath option to drive Machine Combiner pass. This is not accurate in two ways: 1: the scope is not accurate. Machine Combiner pass only requires instruction-level flags instead of the function scope. 2: the float point flag is not accurate. Machine Combiner pass only requires float point flags reassoc and nsz. Reviewed By: steven.zhang Differential Revision: https://reviews.llvm.org/D78183	2020-04-28 03:31:12 -04:00
Kang Zhang	4bb0a1cb70	[PowerPC] Fix the liveins for ppc-expand-isel pass Summary: In the ppc-expand-isel pass, we use stepForward() to update the liveins. This function is not recommended, because it needs accurate kill info. This patch uses the function computeAndAddLiveIns() to update the liveins; it is the recommended method and fixes the liveins bug for the ppc-expand-isel pass. Reviewed By: efriedma, lkail Differential Revision: https://reviews.llvm.org/D78657	2020-04-28 03:22:48 +00:00
LemonBoy	f30416fdde	[AsmPrinter] Fix emission of non-standard integer constants for BE targets The code assumed that zero-extending the integer constant to the designated alloc size would be fine even for BE targets, but that's not the case as that pulls in zeros from the MSB side while we actually expect the padding zeros to go after the LSB. I've changed the codepath handling the constant integers to use the store size for both small(er than u64) and big constants and then add zero padding right after that. Differential Revision: https://reviews.llvm.org/D78011	2020-04-27 14:57:29 -07:00
Stefan Pintilie	1354a03e74	[PowerPC][Future] Implement PC Relative Tail Calls Tail Calls were initially disabled for PC Relative code because it was not safe to make certain assumptions about the tail calls (namely that all compiled functions no longer used the TOC pointer in R2). However, once all of the TOC pointer references have been removed it is safe to tail call everything that was tail called prior to the PC relative additions as well as a number of new cases. For example, it is now possible to tail call indirect functions as there is no need to save and restore the TOC pointer for indirect functions if the caller is marked as may clobber R2 (st_other=1). For the same reason it is now also possible to tail call functions that are external. Differential Revision: https://reviews.llvm.org/D77788	2020-04-27 12:55:08 -05:00
Kang Zhang	f85e35d2a3	[NFC][PowerPC] Add the killed flag for the case expand-isel-liveness.mir	2020-04-26 04:40:20 +00:00
Kang Zhang	fe2a522533	[NFC][PowerPC] Add a new test case in expand-isel-liveness.mir	2020-04-26 03:15:54 +00:00
Fangrui Song	10bc12588d	[XRay] Change Sled.Function to PC-relative for sled version 2 and make llvm-xray support sled version 2 addresses Follow-up of D78082 and D78590. Otherwise, because xray_instr_map is now read-only, the absolute relocation used for Sled.Function will cause a text relocation.	2020-04-24 14:41:56 -07:00
Fangrui Song	25e22613df	[XRay] Change ARM/AArch64/powerpc64le to use version 2 sled (PC-relative address) Follow-up of D78082 (x86-64). This change avoids dynamic relocations in `xray_instr_map` for ARM/AArch64/powerpc64le. MIPS64 cannot use 64-bit PC-relative addresses because R_MIPS_PC64 is not defined. Because MIPS32 shares the same code, for simplicity, we don't use PC-relative addresses for MIPS32 as well. Tested on AArch64 Linux and ppc64le Linux. Reviewed By: ianlevesque Differential Revision: https://reviews.llvm.org/D78590	2020-04-24 08:35:43 -07:00
Kang Zhang	302e11cd97	[NFC][PowerPC] Fix the liveins for 3 mir test cases	2020-04-24 08:03:02 +00:00
Victor Huang	a60ca4b4e9	[PowerPC][Future] Initial support for PCRel addressing to get block address Add initial support for PCRelative addressing to get block address instead of using TOC. Differential Revision: https://reviews.llvm.org/D76294	2020-04-22 15:01:29 -05:00
Victor Huang	02141a17ae	[PowerPC][Future] Remove redundant r2 save and restore for indirect call Currently an indirect call produces the following sequence on PCRelative mode: extern void function( ); extern void (ptrfunc) ( ); void g() { ptrfunc=function; } void f() { (ptrfunc) ( ); } Producing paddi 3, 0, .LC0@PCREL, 1 ld 3, 0(3) std 2, 24(1) ld 12, 0(3) mtctr 12 bctrl ld 2, 24(1) Though the caller does not use or preserve r2, it is still saved and restored across a function call. This patch is added to remove these redundant save and restores for indirect calls. Differential Revision: https://reviews.llvm.org/D77749	2020-04-22 12:05:51 -05:00
Victor Huang	43abef06f4	[PowerPC][Future] Initial support for PCRel addressing for jump tables. Add initial support for PC Relative addressing to get jump table base address instead of using TOC. Differential Revision: https://reviews.llvm.org/D75931	2020-04-22 10:45:01 -05:00
Qiu Chaofan	c12722cde8	[PowerPC] Exploit RLDIMI for OR with large immediates This patch exploits rldimi instruction for patterns like `or %a, 0b000011110000`, which saves number of instructions when the operand has only one use, compared with `li-ori-sldi-or`. Reviewed By: nemanjai Differential Revision: https://reviews.llvm.org/D77850	2020-04-22 14:16:52 +08:00
Stefan Pintilie	a92ee77d85	[PowerPC][Future] Add offsets to PC Relative relocations. This is an optimization that applies to global addresses and allows for the following transformation: Convert this: paddi r3, 0, symbol@PCREL, 1 ld r4, 8(r3) To this: pld r4, symbol@PCREL+8(0), 1 An instruction is saved and the linker can do the addition when the symbol is resolved. Differential Revision: https://reviews.llvm.org/D76160	2020-04-21 11:08:19 -05:00
Kang Zhang	e477915bfe	[PowerPC] Add a new test case expand-isel-liveness.mir	2020-04-21 16:00:34 +00:00

2489 Commits