llvm-project

Author	SHA1	Message	Date
Qiu Chaofan	d03263814a	[PowerPC] Fix operand regclass of XSTSTDCSP	2024-03-20 17:45:19 +08:00
Lei Huang	f20660701d	[PowerPC] Add DFP conversion instructions definitions and MC tests Add td definitions and asm/disasm tests for the quantum conversion instructions in ISA 3.1 section 5.6.5 Reviewed By: stefanp Differential Revision: https://reviews.llvm.org/D154394	2023-07-12 13:57:49 -04:00
Qiu Chaofan	69bc8ff766	Reland "[PowerPC] Simplify fp-to-int store optimization" The build failure should be fixed by de681d53. Follow-up refactor will be done in future patches. This reverts commit e7c5ced0b9f0551ea17e1d2b48be86f03a772c59.	2023-06-05 13:53:08 +08:00
Vitaly Buka	e7c5ced0b9	Revert "[PowerPC] Simplify fp-to-int store optimization" Breaks https://lab.llvm.org/buildbot/#/builders/18/builds/9118 This reverts commit 8064caf83fb166b709bfe0e7641c5181341cb064.	2023-05-24 10:05:28 -07:00
Qiu Chaofan	8064caf83f	[PowerPC] Simplify fp-to-int store optimization On PowerPC VSX targets, fp-to-int will be transformed into xscv with mfvsr. When the result is to be stored, mfvsr can be replaced by a direct store. This change simplifies the optimization by using existing fp-to-int code, which helps CSE and handling strictfp cases. Reviewed By: shchenz Differential Revision: https://reviews.llvm.org/D141473	2023-05-23 16:40:54 +08:00
Simon Pilgrim	8757ce4901	[PowerPC] Replace PPCISD::VABSD cases with generic ISD::ABDU(X,Y) node A move towards using the generic ISD::ABDU nodes on more backends Also support ISD::ABDS for v4i32 types using the existing signbit flip trick PowerPC has a select(icmp_ugt(x,y),sub(x,y),sub(y,x)) -> abdu(x,y) combine that I intend to move to DAGCombiner in a future patch. The ABS(SUB(X,Y)) -> PPCISD::VABSD(X,Y,1) v4i32 combine wasn't legal (https://alive2.llvm.org/ce/z/jc2hLU) - so I've removed it, having already added the legal sub nsw tests equivalent. Differential Revision: https://reviews.llvm.org/D142313	2023-02-25 20:17:17 +00:00
Stefan Pintilie	2e47aafb02	[PowerPC] Fix float materialization patterns. Two of the float materialization patterns use the VSSRC register class. This register class is not available before Power 8. The patterns will stay the same for Power 8 and up but must use the class F4RC for Power 7 and earlier. This patch fixes those patterns. Reviewed By: nemanjai, amyk, #powerpc Differential Revision: https://reviews.llvm.org/D142120	2023-02-13 10:18:53 -05:00
James Y Knight	0be684ed97	[PowerPC] Switch to by-name matching for instructions (part 2 of 2). This is a follow-on to https://reviews.llvm.org/D134073. Currently, all of the "memri"-style complex operands, which contain both a register and an immediate, are encoded into a single field in the instruction definition. This requires complex encoders/decoders, and instruction definitions that insert and extract the correct parts of the bits. Now, switch to naming and encoding/decoding the sub-operands separately. Thus, we can now disable useDeprecatedPositionallyEncodedOperands. Reviewed By: barannikov88 Differential Revision: https://reviews.llvm.org/D137670	2023-02-02 15:28:45 -05:00
James Y Knight	4b43ef3e5c	[PowerPC] Switch to by-name matching for instructions (part 1 of 2). This is a follow-on to https://reviews.llvm.org/D134073. After https://reviews.llvm.org/D137653 we can now switch the PPC target away from positional operand matching. This patch fixes all of the "easy" cases. While this changes a large number of lines of tablegen source, it results in only a single non-comment change in the code generated by tablegen: the (unused) codegen-only "MTVRSAVEv" instruction was previously incorrectly encoding operand 0, and now encodes (correctly) operand 1. Changes which result in generated-code changes have been split off into the next (smaller) patch, for ease of review. Reviewed By: barannikov88 Differential Revision: https://reviews.llvm.org/D137661	2023-02-02 15:28:45 -05:00
Nemanja Ivanovic	f68fc8d9d2	[PowerPC] Fix incorrect shift amount for build_vector The pattern for a build_vector node was incorrect for big endian subtargets.	2023-01-30 16:36:08 -06:00
Stefan Pintilie	c1d0118459	[PowerPC] Materialize floats in the range [-16.0, 15.0]. Previous to this patch we only materialized 0.0 and all other floating point values would be loaded from the TOC. This patch adds materialization for the floating point values that can be represented as integers in [-16.0, 15.0]. For example we will now materialize 3.0 and -5.0 but not 4.7. Reviewed By: nemanjai, lei, #powerpc Differential Revision: https://reviews.llvm.org/D138844	2023-01-04 12:52:30 -06:00
Maryam Moghadas	934d5fa2b8	[PowerPC] Exploit xxperm, check for dead vectors and substitute vperm with xxperm The vperm instruction requires the data to be in the Altivec registers. If one of the vector operands is not used after this vperm instruction, then it can be substituted by xxperm, which doubles the number of available registers. Reviewed By: stefanp Differential Revision: https://reviews.llvm.org/D133700	2022-11-23 13:28:12 -06:00
Maryam Moghadas	a09140f34f	Revert "[PowerPC] Remove extra swap for extract+vperm on LE" This reverts commit f7294ac8093a2fbd8c00254580eaac6c4e1f7b24.	2022-08-25 12:34:43 -05:00
Stefan Pintilie	1492c88f49	[PowerPC] Fix bugs in sign-/zero-extension elimination This patch fixes the following two bugs in `PPCInstrInfo::isSignOrZeroExtended` helper, which is used from sign-/zero-extension elimination in PPCMIPeephole pass. - Registers defined by load with update (e.g. LBZU) were identified as already sign or zero-extended. But it is true only for the first def (loaded value) and not for the second def (i.e. updated pointer). - Registers defined by ORIS/XORIS were identified as already sign-extended. But, it is not true for sign extension depending on the immediate (while it is ok for zero extension). To handle the first case, the parameter for the helpers is changed from `MachineInstr` to a register number to distinguish first and second defs. Also, this patch moves the initialization of PPCMIPeepholePass to allow mir test case. Reviewed By: nemanjai Differential Revision: https://reviews.llvm.org/D40554	2022-08-19 07:05:40 -05:00
Amy Kwan	af430944b3	[PowerPC][AIX] Allow VSX patterns to be 32-bit and 64-bit safe on P8+. This patch updates two patterns involving `scalar_to_vector` and `SCALAR_TO_VECTOR_PERMUTED` nodes to be safe for both 64-bit and 32-bit by pulling the patterns out of the 64-bit specific guard. These patterns are matched on POWER8 and above. Differential Revision: https://reviews.llvm.org/D125389	2022-05-27 10:34:17 -05:00
Amy Kwan	c35ca3a1c7	[PowerPC] Implement XL compat __fnabs and __fnabss builtins. This patch implements the following floating point negative absolute value builtins that are required for compatibility with the XL compiler: ``` double __fnabs(double); float __fnabss(float); ``` These builtins will emit: - fnabs on PWR6 and below, or if VSX is disabled. - xsnabsdp on PWR7 and above, if VSX is enabled. Differential Revision: https://reviews.llvm.org/D125506	2022-05-19 11:28:40 -05:00
Stefan Pintilie	ef34442232	[NFC][PowerPC] Move the Register Operands for PowerPC into PPCRegisterInfo.td Currently the register operand definitions are found in three separate files. This patch moves all of the definitions into PPCRegisterInfo.td. Reviewed By: amyk Differential Revision: https://reviews.llvm.org/D123543	2022-04-18 14:50:24 -05:00
Nemanja Ivanovic	766ca2c59e	[PowerPC] Add missed VSX shuffles instead of Altivec ones VSX introduced some permute instructions that are direct replacements for Altivec ones except they can target all the VSX registers. We have added code generation for most of these but somehow missed the low/hi word merges (XXMRG[LH]W). This caused some additional spills on some large computationally intensive code. This patch simply adds the missed patterns.	2022-03-14 10:11:54 -05:00
Qiu Chaofan	b2497e5435	[PowerPC] Add generic fnmsub intrinsic Currently in Clang, we have two types of builtins for fnmsub operation: one for float/double vector, they'll be transformed into IR operations; one for float/double scalar, they'll generate corresponding intrinsics. But for the vector version of builtin, the 3 op chain may be recognized as expensive by some passes (like early cse). We need some way to keep the fnmsub form until code generation. This patch introduces ppc.fnmsub.* intrinsic to unify four fnmsub intrinsics. Reviewed By: shchenz Differential Revision: https://reviews.llvm.org/D116015	2022-03-07 13:00:06 +08:00
Chen Zheng	63cd1842a7	[PowerPC] use lvx + splat directly for aligned splat load Reviewed By: nemanjai Differential Revision: https://reviews.llvm.org/D114062	2021-12-08 02:02:18 +00:00
Tarique Islam	0850655da6	Big-endian version of vpermxor A big-endian version of vpermxor, named vpermxor_be, is added to LLVM and Clang. vpermxor_be can be called directly on both the little-endian and the big-endian platforms. Reviewed By: nemanjai Differential Revision: https://reviews.llvm.org/D114540	2021-11-30 22:49:55 +00:00
Nemanja Ivanovic	5840f7197d	[PowerPC] Respect rounding mode in the back end Currently, the floating point instructions that depend on rounding mode are correctly marked in the PPC back end with an implicit use of the RM register. Similarly, instructions that explicitly define the register are marked with an implicit def of the same register. So for the most part, RM-using code won't be moved across RM-setting instructions. However, calls are not marked as RM-setting instructions so code can be moved across calls. This is generally desired, but so is the ability to turn off this behaviour with an appropriate option - and -frounding-math really should be that option. This patch provides a set of call instructions (for direct and indirect calls) that are marked with an implicit def of the RM register. These will be used for calls that are marked with the strictfp attribute. Differential revision: https://reviews.llvm.org/D111433	2021-11-10 08:19:58 -06:00
Chen Zheng	7c6f5950f0	[PowerPC] comment for different input register classes; nfc Add comments to explain why XXPERMDIs and XXPERMDI have different input register classes, vsfrc for XXPERMDIs and vsrc for XXPERMDI. This addresses the comments in abandoned patch D113178, we keep using `f0` instead of using `vs0` for XXPERMDIs on purpose.	2021-11-08 02:21:30 +00:00
Chen Zheng	fed2889f07	[PowerPC] use correct selection for v16i8/v8i16 splat load Reviewed By: nemanjai Differential Revision: https://reviews.llvm.org/D113236	2021-11-05 10:04:03 +00:00
Chen Zheng	5a8b196340	[PowerPC] handle more splat loads without stack operation This mostly improves splat loads code generation on Power7 Reviewed By: jsji Differential Revision: https://reviews.llvm.org/D106555	2021-11-03 05:17:41 +00:00
Albion Fung	4195ed9959	[PowerPC] Improved codegen related to xscvdpsxws/xscvdpuxws This patch removes the unnecessary mf/mtvsr generated in conjunction with xscvdpsxws/xscvdpuxws. Differential revision: https://reviews.llvm.org/D109902	2021-09-30 14:31:00 -05:00
Kai Luo	666ee849f0	[PowerPC] Fix shift amount of xxsldwi when performing vector int_to_double POC ``` // main.c #include <stdio.h> #include <altivec.h> extern vector double foo(vector int s); int main() { vector int s = {0, 1, 0, 4}; vector double vd; vd = foo(s); printf("%lf %lf\n", vd[0], vd[1]); return 0; } // poc.c vector double foo(vector int s) { int x1 = s[1]; int x3 = s[3]; double d1 = x1; double d3 = x3; vector double x = { d1, d3 }; return x; } ``` Compiled with `poc.c main.c -mcpu=pwr8 -O3` on BE machine. Current clang gives ``` 4.000000 1.000000 ``` while xlc gives ``` 1.000000 4.000000 ``` Xlc's output should be correct. Reviewed By: shchenz, #powerpc Differential Revision: https://reviews.llvm.org/D107428	2021-08-06 06:01:29 +00:00
Jinsong Ji	6f84d94b9c	[PowerPC] Fix copy/paste error in scalar_to_vector patterns https://reviews.llvm.org/D100478 refactoring added a copy/paste error for v8i16 patterns. Reviewed By: #powerpc, shchenz Differential Revision: https://reviews.llvm.org/D107609	2021-08-06 02:59:01 +00:00
Quinn Pham	e002d251dd	[PowerPC] Floating Point Builtins for XL Compat. This patch is in a series of patches to provide builtins for compatibility with the XL compiler. This patch adds builtins related to floating point operations Reviewed By: #powerpc, nemanjai, amyk, NeHuang Differential Revision: https://reviews.llvm.org/D103986	2021-07-21 08:33:39 -05:00
Albion Fung	3434ac9e39	[PowerPC] Store, load, move from and to registers related builtins This patch implements store, load, move from and to registers related builtins, as well as the builtin for stfiw. The patch aims to provide feature parity with xlC on AIX. Differential revision: https://reviews.llvm.org/D105946	2021-07-20 15:46:14 -05:00
Lei Huang	c8937b6cb9	[PowerPC] Implement XL compact math builtins Implement a subset of builtins required for compatibility with the AIX XL compiler. Reviewed By: nemanjai Differential Revision: https://reviews.llvm.org/D105930	2021-07-16 13:21:13 -05:00
Amy Kwan	ba627a32e1	[PowerPC] Update Refactored Load/Store Implementation, XForm VSX Patterns, and Tests This patch includes the following updates to the load/store refactoring effort introduced in D93370: - Update various VSX patterns that used to "force" an XForm, to instead just XForm. This allows the ability for the patterns to compute the most optimal addressing mode (and to produce a DForm instruction when possible) - Update pattern and test case for the LXVD2X/STXVD2X intrinsics - Update LIT test cases that used to use the XForm instruction to use the DForm instruction Differential Revision: https://reviews.llvm.org/D95115	2021-07-16 09:28:48 -05:00
Albion Fung	db26cd30b6	[PowerPC] Improve f32 to i32 bitcast code gen The code gen for f32 to i32 bitcast is not currently the most efficient; this patch removes some unnecessary instructions generated. Differential revision: https://reviews.llvm.org/D100782	2021-05-31 16:00:58 -05:00
Nemanja Ivanovic	511f4ae54e	[PowerPC] Add patterns for vselect of v1i128 These patterns are missing even though the underlying instruction doesn't really care about the type. Added these patterns to resolve https://bugs.llvm.org/show_bug.cgi?id=50084	2021-05-17 06:37:46 -05:00
Nemanja Ivanovic	39e4676ca7	[PowerPC] Provide doubleword vector predicate form comparisons on Power7 There are two reasons this shouldn't be restricted to Power8 and up: 1. For XL compatibility 2. Because clang will expand comparison operators to these intrinsics* *Without this patch, the following causes a selection error: int test(vector signed long a, vector signed long b) { return a < b; } This patch provides the handling for the intrinsics in the back end and removes the Power8 guards from the predicate functions (vec_{all|any}_{eq|ne|gt|ge|lt|le}).	2021-05-13 04:56:56 -05:00
Albion Fung	ffbffaf6b6	[PowerPC] Improve codegen for int-to-fp conversion of subword vector extract When an integer is converted into floating point in subword vector extract, it can be done in 2 instructions instead of the 3+ instructions it generates right now. This patch removes the unnecessary generation. Differential: https://reviews.llvm.org/D100604	2021-05-11 15:00:11 -05:00
Qiu Chaofan	f7294ac809	[PowerPC] Remove extra swap for extract+vperm on LE This is a simple fix on LE. On BE, vector shuffles are categorized into different ops. We may need more work to eliminate these in tablegen/pre-isel. Reviewed By: nemanjai Differential Revision: https://reviews.llvm.org/D101605	2021-05-07 13:48:08 +08:00
Amy Kwan	64d951be61	[PowerPC] Add new infrastructure to select load/store instructions, update P8/P9 load/store patterns. This patch introduces a new infrastructure that is used to select the load and store instructions in the PPC backend. The primary motivation is that the current implementation of selecting load/stores is dependent on the ordering of patterns in TableGen. Given this limitation, we are not able to easily and reliably generate the P10 prefixed load and store instructions (such as when the immediates fit within 34 bits). This refactoring is meant to provide us with more control over the patterns/different forms to exploit, as well as eliminating the dependency on pattern declaration order in TableGen. The idea of this refactoring is that it introduces a set of addressing modes that correspond to different instruction formats of a particular load and store instruction, along with a set of common flags that describe a load/store. Whenever a load/store instruction is being selected, we analyze the instruction and compute a set of flags for it. The computed flags are then used to select the most optimal load/store addressing mode. This patch is the first of a series of patches to be committed - it contains the initial implementation of the refactored load/store selection infrastructure and also updates P8/P9 patterns to adopt this infrastructure. The idea is that incremental patches will add more implementation and support, and eventually the old implementation will be removed. Differential Revision: https://reviews.llvm.org/D93370	2021-04-30 09:53:19 -05:00
Zarko Todorovski	f818ec9dd1	[AIX] Allow safe for 32bit P9 VSX extract and insert pattern matches In https://reviews.llvm.org/D92789 PPC64 checks were added that disallowed most VSX pattern matching. We enable some safe ones for 32bit in this patch. Reviewed By: nemanjai Differential Revision: https://reviews.llvm.org/D97503	2021-04-27 07:27:43 -04:00
Nemanja Ivanovic	6725b90a02	[PowerPC] Add vec_ctsl and vec_ctul to altivec.h These are added for compatibility with XLC. They are similar to vec_cts and vec_ctu except that the result is a doubleword vector regardless of the parameter type.	2021-04-23 11:03:38 -05:00
Nemanja Ivanovic	092619cf6b	[PowerPC] Improve codegen for vector fp to int widening conversions We currently do not utilize instructions that convert single precision vectors to doubleword integer vectors. These conversions come up in code occasionally and this improvement allows us to open code some functions that need to be added to altivec.h.	2021-04-22 05:04:06 -05:00
Nemanja Ivanovic	03e7fefff8	[PowerPC] Canonicalize shuffles on big endian targets as well Extend shuffle canonicalization and conversion of shuffles fed by vectorized scalars to big endian subtargets. For big endian subtargets, loads and direct moves of scalars into vector registers put the data in the correct element for SCALAR_TO_VECTOR if the data type is 8 bytes wide. However, if the data type is narrower, the value still ends up in the wrong place - although a different wrong place than on little endian targets. This patch extends the combine that keeps values where they are if they feed a shuffle to big endian targets. Differential revision: https://reviews.llvm.org/D100478	2021-04-20 07:29:47 -05:00
Nemanja Ivanovic	ff769dd111	[PowerPC] Minor improvement for insert_vector_elt codegen For v2f64, all VSX subtargets can insert an element with a single XXPERMDI.	2021-04-16 18:52:37 -05:00
Zarko Todorovski	6b7838b68c	[AIX] Allow safe for 32bit P8 VSX pattern matching Pull some of the safe for 32bit pattern matching for Pwr8 and above. Reviewed By: nemanjai Differential Revision: https://reviews.llvm.org/D97909	2021-04-14 08:12:48 -04:00
Nemanja Ivanovic	8be3181df6	[PowerPC] Fix incorrect subreg typo from 0148bf53f0a0	2021-04-14 05:01:12 -05:00
Nemanja Ivanovic	0148bf53f0	[PowerPC] Use correct node to get a super register from a subreg The VSX tablegen file has some rather egregious uses of COPY_TO_REGCLASS even in situations where it needs to use SUBREG_TO_REG. While this produces correct code, it often doesn't allow the register coalescer to coalesce copies and the resulting code ends up being suboptimal. This patch just changes over patterns that should use SUBREG_TO_REG.	2021-04-13 19:52:21 -05:00
Albion Fung	9b6ac9e999	[P10] [Power PC] Exploiting new load rightmost vector element instructions. This pull request implements patterns to exploit the load rightmost vector element instructions for loading element 0 on little endian PowerPC subtargets into v8i16 and v16i8 vector registers for i16 and i8 data types. Differential Revision: https://reviews.llvm.org/D94816#inline-921403	2021-03-09 16:08:17 -05:00
Baptiste Saleil	34dc1ccb96	[PowerPC] Exploit the vinsw, vinsd, and vins[wd][lr]x instructions on P10 This patch generates the vinsw, vinsd, vinsblx, vinshlx, vinswlx, vinsdlx, vinsbrx, vinshrx, vinswrx and vinsdrx instructions for vector insertion on P10. Differential Revision: https://reviews.llvm.org/D94454	2021-02-18 14:17:47 +00:00
Craig Topper	94206f1f90	[PowerPC] Remove vnot_ppc and replace with the standard vnot. immAllOnesV has special support for looking through bitcasts automatically so isel patterns don't need to explicitly look for the bitconvert.	2021-01-31 19:41:33 -08:00
Nemanja Ivanovic	1150bfa6bb	[PowerPC] Add missing negate for VPERMXOR on little endian subtargets This intrinsic is supposed to have the permute control vector complemented on little endian systems (as the ABI specifies and GCC implements). With the current code gen, the result vector is byte-reversed. Differential revision: https://reviews.llvm.org/D95004	2021-01-25 12:23:33 -06:00

253 Commits