llvm-project

Author	SHA1	Message	Date
Kai Luo	56414220df	[PowerPC] Use 'sync; ld; cmp; bc; isync' for atomic load seq-cst on 32-bit platform (#75905) `cmp; bc; isync` is theoretically more performant than `lwsync`. The 64-bit platform already uses it; now implement it for the 32-bit platform.	2023-12-20 10:01:02 +08:00
Chen Zheng	4b932d84f4	[PowerPC] redesign the target flags (#69695) 12 bits are not enough for PPC's target-specific flags. With 8 bits for the bitmask flags and 4 bits for the direct mask, PPC can have a total of 16 direct masks and 8 bitmasks. That is not enough for PPC (see https://github.com/llvm/llvm-project/pull/66316). Redesign how the PPC target sets the target-specific flags. With this patch, all PPC target flags are direct flags; PPC no longer has bitmask flags. This aligns PPC with targets like X86, which also have many target-specific flags. The patch also fixes a bug related to the flags `MO_TLSGDM_FLAG` and `MO_LO`: they have the same value, and the test case changes in this PR show the bug.	2023-12-07 12:47:25 +08:00
Sander de Smalen	81b7f115fb	[llvm][TypeSize] Fix addition/subtraction in TypeSize. (#72979) It seems TypeSize is currently broken in the sense that: TypeSize::Fixed(4) + TypeSize::Scalable(4) => TypeSize::Fixed(8) without failing its assert that explicitly tests for this case: assert(LHS.Scalable == RHS.Scalable && ...); The reason this fails is that `Scalable` is a static method of class TypeSize, and LHS and RHS are both objects of class TypeSize. So this is evaluating if the pointer to the function Scalable == the pointer to the function Scalable, which is always true because LHS and RHS have the same class. This patch fixes the issue by renaming `TypeSize::Scalable` -> `TypeSize::getScalable`, as well as `TypeSize::Fixed` to `TypeSize::getFixed`, so that it no longer clashes with the variable in FixedOrScalableQuantity. The new methods now also better match the coding standard, which specifies that: * Variable names should be nouns (as they represent state) * Function names should be verb phrases (as they represent actions)	2023-11-22 08:52:53 +00:00
Qiu Chaofan	426ad99bb2	[PowerPC] Forbid f128 SELECT_CC optimized into fsel (#71497)	2023-11-15 12:20:06 +08:00
Qiu Chaofan	5f295552f1	[PowerPC] Fix incorrect symbol name of frexp libcall (#71626) frexpl is for ppc_fp128. The correct symbol name for f128 is frexpf128.	2023-11-08 14:41:19 +08:00
Paulo Matos	7b9d73c2f9	[NFC] Remove Type::getInt8PtrTy (#71029) Replace this with PointerType::getUnqual(). Followup to the opaque pointer transition. Fixes an in-code TODO item.	2023-11-07 17:26:26 +01:00
Nikita Popov	127ed9ae26	[PowerPC] Use zext instead of anyext in custom and combine (#68784) This custom combine currently converts `and(anyext(x),c)` into `anyext(and(x,c))`. This is not correct, because the original expression guaranteed that the high bits are zero, while the new one sets them to undef. Emit `zext(and(x,c))` instead. Fixes https://github.com/llvm/llvm-project/issues/68783.	2023-10-12 09:32:17 +02:00
Kishan Parmar	696ea67f19	Disable call to fma for soft-float The PowerPC backend generates calls to libc functions for soft-float, regardless of the -nostdlib/-ffreestanding flags. fma is not a function provided by compiler-rt builtins and thus should not be generated here. PR: https://github.com/llvm/llvm-project/issues/55230 (#55230) Below is the patch given by @nemanjai Reviewed By: jhibbits Differential Revision: https://reviews.llvm.org/D156344	2023-09-28 14:06:54 +05:30
Nick Desaulniers	330fa7d2a4	[TargetLowering] Deduplicate choosing InlineAsm constraint between ISels (#67057) Given a list of constraints for InlineAsm (ex. "imr") I'm looking to modify the order in which they are chosen. Before doing so, I noticed a fair amount of logic is duplicated between SelectionDAGISel and GlobalISel for this. That is because SelectionDAGISel is also trying to lower immediates during selection. If we detangle these concerns into: 1. choose the preferred constraint 2. attempt to lower that constraint Then we can slide down the list of constraints until we find one that can be lowered. That allows the implementation to be shared between instruction selection frameworks. This makes it so that later I might only need to adjust the priority of constraints in one place, and have both selectors behave the same.	2023-09-25 08:53:03 -07:00
Arthur Eubanks	0a1aa6cda2	[NFC][CodeGen] Change CodeGenOpt::Level/CodeGenFileType into enum classes (#66295) This will make it easy for callers to see issues with and fix up calls to createTargetMachine after a future change to the params of TargetMachine. This matches other nearby enums. For downstream users, this should be a fairly straightforward replacement, e.g. s/CodeGenOpt::Aggressive/CodeGenOptLevel::Aggressive or s/CGFT_/CodeGenFileType::	2023-09-14 14:10:14 -07:00
Maryam Moghadas	7b021f2e64	[PowerPC] Optimize VPERM and fix code order for swapping vector operands on LE This patch reverts commit 7614ba0a5db8 to optimize VPERM when one of its vector operands is XXSWAPD, similar to XXPERM. It also reorganizes the little-endian swap code on LE, swapping the vector operand after adjusting the mask operand. This ensures that the vector operand is swapped at the correct point in the code, resulting in a valid constant pool for the mask operand. Reviewed By: stefanp Differential Revision: https://reviews.llvm.org/D149083	2023-09-13 15:00:49 -05:00
Nick Desaulniers	93bd428742	[InlineAsm] refactor InlineAsm class NFC (#65649) I would like to steal one of these bits to denote whether a kind may be spilled by the register allocator or not, but I'm afraid to touch any of this code using bitwise operands. Make flags a first class type using bitfields, rather than launder data around via `unsigned`.	2023-09-11 09:27:37 -07:00
Amy Kwan	3f46e5453d	[AIX][TLS] Produce a faster local-exec access sequence with -maix-small-local-exec-tls (And optimize when load/store offsets are 0) This patch utilizes the -maix-small-local-exec-tls option added in D155544 to produce a faster access sequence for the local-exec TLS model, where loading from the TOC can be avoided. The patch either produces an addi/la with a displacement off of r13 (the thread pointer) when the address is calculated, or it produces an addi/la followed by a load/store when the address is calculated and used for further accesses. This patch also optimizes this sequence a bit more where we can remove the addi/la when the load/store offset is 0. A follow up patch will be posted to account for when the load/store offset is non-zero, and currently in these situations we keep the addi/la that precedes the load/store. Furthermore, this access sequence is only performed for TLS variables that are less than ~32KB in size. Differential Revision: https://reviews.llvm.org/D155600	2023-09-07 20:05:29 -05:00
Ting Wang	71be020dda	[SelectionDAG][PowerPC] Memset reuse vector element for tail store On PPC there are instructions to store element from vector(e.g. stxsdx/stxsiwx), and these instructions can be leveraged to avoid tail constant in memset and constant splat array initialization. This patch tries to explore these opportunities. Reviewed By: shchenz Differential Revision: https://reviews.llvm.org/D138883	2023-09-06 01:52:38 -04:00
Fangrui Song	111fcb0df0	[llvm] Fix duplicate word typos. NFC Those fixes were taken from https://reviews.llvm.org/D137338	2023-09-01 18:25:16 -07:00
Matt Arsenault	ad9d13d535	SelectionDAG: Swap operands of atomic_store Irritatingly, atomic_store had operands in the opposite order from regular store. This made it difficult to share patterns between regular and atomic stores. There was a previous incomplete attempt to move atomic_store into the regular StoreSDNode which would be better. I think it was a mistake for all atomicrmw to swap the operand order, so maybe it's better to take this one step further. https://reviews.llvm.org/D123143	2023-08-31 17:30:10 -04:00
Nick Desaulniers	2fad6e6985	[InlineAsm] wrap Kind in enum class NFC Should add some minor type safety to the use of this information, since there's quite a bit of metadata being laundered through an `unsigned`. I'm looking to potentially add more bitfields to that `unsigned`, but I find InlineAsm's big ol' bag of enum values and usage of `unsigned` confusing, type-unsafe, and un-ergonomic. These can probably be better abstracted. I think the lack of static_cast outside of InlineAsm indicates the prior code smell fixed here. Reviewed By: qcolombet Differential Revision: https://reviews.llvm.org/D159242	2023-08-31 08:54:51 -07:00
Qiu Chaofan	21bea1a208	[PowerPC] Support initial-exec TLS relocation on AIX Add TLS_IE relocation type to XCOFF writer, and emit code sequence for initial-exec TLS variables. Reviewed By: lkail Differential Revision: https://reviews.llvm.org/D156292	2023-08-30 16:22:16 +08:00
Chen Zheng	732f63d96d	[PowerPC]set default min-jump-table-entries to 64 on PPC Reviewed By: nemanjai Differential Revision: https://reviews.llvm.org/D159050	2023-08-29 21:42:22 -04:00
Bjorn Pettersson	e53b28c833	[llvm] Drop some bitcasts and references related to typed pointers Differential Revision: https://reviews.llvm.org/D157551	2023-08-10 15:07:07 +02:00
Kai Luo	f26af16e2c	[PowerPC][AIX] Enable quadword atomics by default for AIX On AIX, a libatomic supporting inline quadword atomic operations has been released, so that compatibility is not an issue now, we can enable quadword atomics by default. Reviewed By: #powerpc, nemanjai Differential Revision: https://reviews.llvm.org/D151312	2023-07-25 08:21:07 +08:00
Brad Smith	a3e524df90	[PowerPC] Reorder setMaxAtomicSizeInBitsSupported(). NFC Reorder setMaxAtomicSizeInBitsSupported() in numerical and more logical order. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D155379	2023-07-22 20:01:27 -04:00
Kamau Bridgeman	62c1cf7c63	[PowerPC][Future] Enable __builtin_mma_xxm[t|f]acc Future cpu instructions dmxxinstdmr512 and dmxxextfdmr512 insert and extract quad vectors from the new wide accumulator (wacc) register class. The introduction of these new instructions renders the p10 instructions xxmtacc and xxmfacc obsolete since the new wacc register class is a better choice for handling quad vector operations. This patch ensures that, for future cpu, instructions dmxxinstdmr512 and dmxxextfdmr512 are generated by custom lowering the intrinsics for xxm[t|f]acc to produce no instructions. Reviewed By: amyk, lei Differential Revision: https://reviews.llvm.org/D153034	2023-07-14 13:38:40 -05:00
Nemanja Ivanovic	b0e249d5e2	Reland "[PowerPC] Remove extend between shift and and" The commit originally caused a bootstrap failure on the big endian PPC bot as the combine was interfering with the legalizer when applied on illegal types. This update restricts the combine to the only types for which it is actually needed. Tested on PPC BE bootstrap locally.	2023-07-07 14:45:05 -04:00
Nemanja Ivanovic	7cd9084c69	Revert "[PowerPC] Remove extend between shift and and" This reverts commit a57236de4eb8f38b4201647b10146941cbbb5c0b. Causes a bootstrap failure on ppc64be.	2023-07-05 20:04:49 -04:00
Nemanja Ivanovic	a57236de4e	[PowerPC] Remove extend between shift and and The SDAG will sometimes insert an extend between the shift and an and (immediate) even though the immediate is narrower than the narrow size. This does not allow us to produce a rotate instruction (such as rlwinm). This patch just adds a combine to move the extend onto the and. Differential revision: https://reviews.llvm.org/D152911	2023-07-05 16:33:07 -04:00
Elliot Goodrich	b0abd4893f	[llvm] Add missing StringExtras.h includes In preparation for removing the `#include "llvm/ADT/StringExtras.h"` from the header to source file of `llvm/Support/Error.h`, first add in all the missing includes that were previously included transitively through this header.	2023-06-25 15:42:22 +01:00
Amy Kwan	f5ae075048	[AIX][TLS] Generate 32-bit local-exec access code sequence This patch adds support for the TLS local-exec access model on AIX so that the 32-bit (specifically, non-optimized) code sequence can be generated. This work is a follow-up to D149722. The generated sequence is as follows: ``` .tc var[TC],var[TL]@le. // variable offset, with the le relocation specifier bla .__get_tpointer() // get the thread pointer, modifies r3 lwz reg1, var[TC](2) // load the variable offset add reg2, r3, reg1 // add the variable offset to the retrieved thread pointer ``` Differential Revision: https://reviews.llvm.org/D152669	2023-06-20 11:57:38 -05:00
Amy Kwan	d5659808b2	[AIX][TLS] Generate 64-bit local-exec access code sequence This patch adds support for the TLS local-exec access model on AIX so that the 64-bit (specifically, non-optimized) code sequence can be generated. For this patch in particular, the generated sequence involves a load of the variable offset, followed by an add of the loaded variable offset to r13 (which holds the thread pointer). This code sequence looks like the following: ``` ld reg1,var[TC](2) add reg2, reg1, r13 // r13 contains the thread pointer ``` The TOC (.tc pseudo-op) entries generated in the assembly files are also changed so that the @le relocation is added for the variable offset. Differential Revision: https://reviews.llvm.org/D149722	2023-06-19 12:17:30 -05:00
Matt Arsenault	eece6ba283	IR: Add llvm.ldexp and llvm.experimental.constrained.ldexp intrinsics AMDGPU has native instructions and target intrinsics for this, but these really should be subject to legalization and generic optimizations. This will enable legalization of f16->f32 on targets without f16 support. Implement a somewhat horrible inline expansion for targets without libcall support. This could be better if we could introduce control flow (GlobalISel version not yet implemented). Support for strictfp legalization is less complete but works for the simple cases.	2023-06-06 17:07:18 -04:00
Qiu Chaofan	9e17e08324	[PowerPC] Combine fptoint-store under strict cases Reviewed By: shchenz Differential Revision: https://reviews.llvm.org/D141249	2023-06-05 16:24:02 +08:00
Qiu Chaofan	590c6a1727	[PowerPC] Require FPCVT for store fptoi combination	2023-06-05 14:26:32 +08:00
Qiu Chaofan	69bc8ff766	Reland "[PowerPC] Simplify fp-to-int store optimization" The build failure should be fixed by de681d53. Follow-up refactor will be done in future patches. This reverts commit e7c5ced0b9f0551ea17e1d2b48be86f03a772c59.	2023-06-05 13:53:08 +08:00
Craig Topper	6006d43e2d	LLVM_FALLTHROUGH => [[fallthrough]]. NFC Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D150996	2023-05-24 12:40:10 -07:00
Vitaly Buka	e7c5ced0b9	Revert "[PowerPC] Simplify fp-to-int store optimization" Breaks https://lab.llvm.org/buildbot/#/builders/18/builds/9118 This reverts commit 8064caf83fb166b709bfe0e7641c5181341cb064.	2023-05-24 10:05:28 -07:00
Nemanja Ivanovic	de681d53ba	[PowerPC] Do not attempt to combine fptoui without FPCVT Commit 8064caf83fb166b709bfe0e7641c5181341cb064 added a call to a function that performs this combine without checking whether the target supports FPCVT. This caused asserts to trip on BE bots as the default target does not have this feature.	2023-05-24 11:14:26 -05:00
Krasimir Georgiev	c37ced7d02	silence an unused variable warning after 8064caf83fb166b709bfe0e7641c5181341cb064	2023-05-23 12:47:13 +00:00
Qiu Chaofan	8064caf83f	[PowerPC] Simplify fp-to-int store optimization On PowerPC VSX targets, fp-to-int will be transformed into xscv with mfvsr. When the result is to be stored, mfvsr can be replaced by a direct store. This change simplifies the optimization by using existing fp-to-int code, which helps CSE and handling strictfp cases. Reviewed By: shchenz Differential Revision: https://reviews.llvm.org/D141473	2023-05-23 16:40:54 +08:00
Sergei Barannikov	da42b2846c	[CodeGen] Support allocating of arguments by decreasing offsets Previously, `CCState::AllocateStack` always allocated stack space by increasing offsets. For targets with stack growing up (away from zero) it is more convenient to allocate arguments by decreasing offsets, so that the first argument is at the top of the stack. This is important when calling a function with variable number of arguments: the callee does not know the size of the stack, but must be able to access "fixed" arguments. For that to work, the "fixed" arguments should have fixed offsets relative to the stack top, i.e. the variadic arguments area should be at the stack bottom (at lowest addresses). The in-tree target with stack growing up is AMDGPU, but it allocates arguments by increasing addresses. It does not support variadic arguments. A drive-by change is to promote stack size/offset to 64-bit integer. This is what MachineFrameInfo expects. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D149575	2023-05-17 21:51:52 +03:00
Sergei Barannikov	01a7967447	[CodeGen] Replace CCState's getNextStackOffset with getStackSize (NFC) The term "next stack offset" is misleading because the next argument is not necessarily allocated at this offset due to alignment constraints. It also does not make much sense when allocating arguments at negative offsets (introduced in a follow-up patch), because the returned offset would be past the end of the next argument. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D149566	2023-05-17 21:51:45 +03:00
Zequan Wu	3977b77a6b	[CodeGen] Fix nomerge attribute not working in tail calls. In D79537, `nomerge` was made to only apply to non-tail calls. This fixes it by also applying it to tail calls. For ARM, I only made the new MI to inherit the flag under `TCRETURNdi` and `TCRETURNri`, because that's the place tail calls got replaced. Not sure if there's any other place needed. Fixes #61545. Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D146749	2023-05-10 14:25:11 -04:00
NAKAMURA Takumi	c1221251fb	Restore CodeGen/MachineValueType.h from `Support` This is a rework of rG13e77db2df94 (r328395; MVT). Since `LowLevelType.h` has been restored to `CodeGen`, `MachineValueType.h` can be restored as well. Depends on D148767 Differential Revision: https://reviews.llvm.org/D149024	2023-05-03 00:13:20 +09:00
Kai Luo	eee024bf1b	[PowerPC] Update `incr` after resetting the register in MI After performing signed extension, we update the register in MI. We should also update `incr` register which is tracking the register in `MI`. Fixes https://github.com/llvm/llvm-project/issues/61882. Reviewed By: #powerpc, shchenz Differential Revision: https://reviews.llvm.org/D147594	2023-04-14 17:36:30 +08:00
Maryam Moghadas	cf0395f816	[PowerPC] Fix the xxperm swap requirements This patch fixes the xxperm vector operand swap condition so that the single-use operand is in V2 to prevent copying. It also fixes the subtarget condition to exploit the xxperm. Reviewed By: stefanp Differential Revision: https://reviews.llvm.org/D146632	2023-04-05 20:13:40 -05:00
Qiu Chaofan	5b8ea2d0e1	[PowerPC] Lower IS_FPCLASS by test data class instruction Power ISA 3.0 introduced new 'test data class' instructions, which accept flags for: NaN/Infinity/Zero/Denormal. This instruction can be used to implement custom lowering for llvm.is.fpclass, but some extra bits provided by the intrinsic are missing (normal and QNaN/SNaN). For those categories not natively supported, this patch uses a two-way or three-way combination to implement correct behavior. Reviewed By: sepavloff, shchenz Differential Revision: https://reviews.llvm.org/D140381	2023-04-03 11:37:17 +08:00
Craig Topper	219ff07f72	[Targets] Rename Flag->Glue. NFC Long long ago Glue was called Flag, and it was never completely renamed.	2023-04-02 19:28:51 -07:00
Simon Pilgrim	8153b92d9b	[DAG] Add SelectionDAG::SplitScalar helper Similar to the existing SelectionDAG::SplitVector helper, this helper creates the EXTRACT_ELEMENT nodes for the LO/HI halves of the scalar source. Differential Revision: https://reviews.llvm.org/D147264	2023-03-31 18:35:40 +01:00
Amy Kwan	6126356d82	[PowerPC] Implement 64-bit ELFv2 Calling Convention in TableGen (for integers/floats/vectors in registers) This patch partially implements the parameter passing rules outlined in the ELFv2 ABI within TableGen. Specifically, it implements the parameter assignment of integers, floats, and vectors within registers - where the GPR numbering will be "skipped" depending on the ordering of floats and vectors that appear within a parameter list. As we begin to adopt GlobalISel to the PowerPC backend, there is a need for a TableGen definition that encapsulates the ELFv2 parameter passing rules. Thus, this patch also changes the default calling convention that is returned within the ccAssignFnForCall() function used in our GlobalISel implementation, and also adds some additional testing of the calling convention that is implemented. Future patches that build on top of this initial TableGen definition will aim to add more of the ABI complexities, including support for additional types and also in-memory arguments. Differential Revision: https://reviews.llvm.org/D137504	2023-03-27 08:23:04 -05:00
Kazu Hirata	7bb6d1b32e	[llvm] Skip getAPIntValue (NFC) ConstantSDNode provides some convenience functions like isZero, getZExtValue, and isMinSignedValue that are named identically to those provided by APInt, so we can "skip" getAPIntValue.	2023-03-22 22:10:25 -07:00
Simon Pilgrim	da570ef1b4	[DAG] Match select(icmp(x,y),sub(x,y),sub(y,x)) -> abd(x,y) patterns Pulled out of PowerPC, and added ABDS support as well (hence the additional v4i32 PPC matches) Differential Revision: https://reviews.llvm.org/D144789	2023-03-14 15:10:30 +00:00
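
The TypeSize fix in commit 81b7f115fb above turns on a C++ name-hiding detail that is easy to miss. The following is a minimal sketch of that failure mode, not the LLVM code itself: `QuantityBase`, `Size`, `sameKindBuggy`, and `sameKindFixed` are hypothetical names invented for illustration. It assumes a layout like the one the commit message describes, where a static factory in the derived class shares its name with a data member of the base class.

```cpp
#include <cassert>

// Hypothetical stand-in for the base class: the state lives in a data
// member named Scalable.
struct QuantityBase {
  unsigned Quantity;
  bool Scalable;
  QuantityBase(unsigned Q, bool S) : Quantity(Q), Scalable(S) {}
};

// Hypothetical stand-in for the derived class: the static factory named
// Scalable hides the base-class data member of the same name.
struct Size : QuantityBase {
  Size(unsigned Q, bool S) : QuantityBase(Q, S) {}
  static Size Fixed(unsigned Q) { return Size(Q, false); }
  static Size Scalable(unsigned Q) { return Size(Q, true); }
  // Qualified lookup still reaches the hidden data member.
  bool isScalable() const { return QuantityBase::Scalable; }
};

// Buggy check: LHS.Scalable and RHS.Scalable both name the static member
// function, so this compares one function's address with itself.
inline bool sameKindBuggy(const Size &LHS, const Size &RHS) {
  return LHS.Scalable == RHS.Scalable;
}

// Intended check: compare the stored flags.
inline bool sameKindFixed(const Size &LHS, const Size &RHS) {
  return LHS.isScalable() == RHS.isScalable();
}
```

With these definitions, `sameKindBuggy(Size::Fixed(4), Size::Scalable(4))` reports a match even though the two values differ in kind, while `sameKindFixed` does not. Renaming the factories, as the commit does with `getFixed` and `getScalable`, removes the clash so that the member access refers to the data member again.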

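
The reasoning behind commit 127ed9ae26 above (zext instead of anyext) can be checked with a small integer model. This is only a sketch under stated assumptions: it models an i8 to i32 extension, uses a `HighBits` parameter to stand in for the unspecified upper bits that anyext may produce, and assumes the mask `C` fits in the narrow type. All function names are hypothetical and are not SelectionDAG APIs.

```cpp
#include <cassert>
#include <cstdint>

// Model of anyext i8 -> i32: the low 8 bits are X, the upper 24 bits are
// unspecified. HighBits stands in for whatever ends up there.
constexpr uint32_t anyExt(uint8_t X, uint32_t HighBits) {
  return (HighBits & 0xFFFFFF00u) | X;
}

// Model of zext i8 -> i32: the upper 24 bits are zero.
constexpr uint32_t zExt(uint8_t X) { return X; }

// Original expression: and(anyext(x), c). With C confined to the low
// 8 bits, the mask clears the unspecified upper bits.
constexpr uint32_t original(uint8_t X, uint32_t C, uint32_t HighBits) {
  return anyExt(X, HighBits) & C;
}

// Old combine: anyext(and(x, c)). The upper bits are unspecified again.
constexpr uint32_t oldCombine(uint8_t X, uint32_t C, uint32_t HighBits) {
  return anyExt(static_cast<uint8_t>(X & C), HighBits);
}

// New combine: zext(and(x, c)). The upper bits are zero, as in the original.
constexpr uint32_t newCombine(uint8_t X, uint32_t C) {
  return zExt(static_cast<uint8_t>(X & C));
}
```

For any choice of `HighBits`, `newCombine` agrees with `original`, whereas `oldCombine` differs whenever the stand-in upper bits are nonzero.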

1820 Commits