llvm-project

Author	SHA1	Message	Date
Kai Luo	b42738805a	[PowerPC] Auto gen test checks for #69299. NFC.	2023-10-18 02:21:22 +00:00
Kai Luo	3104681686	[PowerPC][Atomics] Remove redundant block to clear reservation (#68430) This PR follows what https://reviews.llvm.org/D134783 does for quadword CAS.	2023-10-13 10:59:27 +08:00
Nikita Popov	127ed9ae26	[PowerPC] Use zext instead of anyext in custom and combine (#68784) This custom combine currently converts `and(anyext(x),c)` into `anyext(and(x,c))`. This is not correct, because the original expression guaranteed that the high bits are zero, while the new one sets them to undef. Emit `zext(and(x,c))` instead. Fixes https://github.com/llvm/llvm-project/issues/68783.	2023-10-12 09:32:17 +02:00
Nikita Popov	0ead1faef0	[PowerPC] Add test for #68783 (NFC)	2023-10-11 12:15:26 +02:00
Jay Foad	7b3bbd83c0	Revert "[CodeGen] Really renumber slot indexes before register allocation (#67038)" This reverts commit 2501ae58e3bb9a70d279a56d7b3a0ed70a8a852c. Reverted due to various buildbot failures.	2023-10-09 12:31:32 +01:00
Jay Foad	2501ae58e3	[CodeGen] Really renumber slot indexes before register allocation (#67038) PR #66334 tried to renumber slot indexes before register allocation, but the numbering was still affected by list entries for instructions which had been erased. Fix this to make the register allocator's live range length heuristics even less dependent on the history of how instructions have been added to and removed from SlotIndexes's maps.	2023-10-09 11:44:41 +01:00
Lei	529ad40e05	[PowerPC] Fix missing kill flag update for XVCVDPSP transformations (#67997) Add transformed register to kill flag work list for XVCVDPSP transformations. Ref: reviews.llvm.org/D133103	2023-10-06 10:24:54 -04:00
Kishan Parmar	696ea67f19	Disable call to fma for soft-float The PowerPC backend generates calls to libc functions for soft-float, regardless of the -nostdlib/-ffreestanding flag. fma is not a function provided by compiler-rt builtins and thus should not be generated here. PR: https://github.com/llvm/llvm-project/issues/55230 (#55230) Below is the patch given by @nemanjai Reviewed By: jhibbits Differential Revision: https://reviews.llvm.org/D156344	2023-09-28 14:06:54 +05:30
Qiu Chaofan	cc627828f5	Pre-commit some PowerPC test cases	2023-09-28 15:51:14 +08:00
Wael Yehia	da55b1b52f	[XCOFF] Do not generate the special .ref for zero-length sections (#66805) Co-authored-by: Wael Yehia <wyehia@ca.ibm.com>	2023-09-28 01:33:41 -04:00
esmeyi	d7195c57d8	Reland https://reviews.llvm.org/D159073. The patch failed in test-suite due to a liveness error after rebasing on https://reviews.llvm.org/D133103, and now it's fixed. ``` [PowerPC][Peephole] Combine rldicl/rldicr and andi/andis after isel. Summary: rldicl/rldicr can be eliminated if it's used to clear the high-order or low-order n bits and all bits cleared will be ANDed with 0 by andi/andis. Or they can be folded to `andi 0` if all bits to AND are already zero in the input. Reviewed By: qiucf, shchenz Differential Revision: https://reviews.llvm.org/D159073 ```	2023-09-26 06:24:47 -04:00
Kai Luo	5fabc8ba22	[PowerPC] Add test to show wrong target flags printed at MO_TLSGDM_FLAG operand. NFC.	2023-09-26 05:13:26 +00:00
esmeyi	77147a95b8	Revert "[PowerPC][Peephole] Combine rldicl/rldicr and andi/andis after isel." This reverts commit 2de74e1bd4d540063d7495fa6254781abd41e179. A test-suite failure occurs due to this commit, will fix soon.	2023-09-25 23:31:34 -04:00
esmeyi	2de74e1bd4	[PowerPC][Peephole] Combine rldicl/rldicr and andi/andis after isel. Summary: rldicl/rldicr can be eliminated if it's used to clear the high-order or low-order n bits and all bits cleared will be ANDed with 0 by andi/andis. Or they can be folded to `andi 0` if all bits to AND are already zero in the input. Reviewed By: qiucf, shchenz Differential Revision: https://reviews.llvm.org/D159073	2023-09-25 23:11:34 -04:00
Matthias Braun	740ee00a4c	PPCBranchCoalescing: Fix invalid branch weights (#67211) Re-normalize branch-weights after removing a block successor to avoid branch-weights not adding up to 100%. This changes MIR for the `test/CodeGen/PowerPC/branch_coalesce.ll` test like this: ```diff - successors: %bb.6(0x40000000); %bb.6(50.00%) + successors: %bb.6(0x80000000); %bb.6(100.00%) ``` This doesn't affect codegen on its own but fixing this helps with fluctuations I have with some of my upcoming changes.	2023-09-25 10:41:04 -07:00
Nemanja Ivanovic	46d5d264fc	[PowerPC] Improve kill flag computation and add verification after MI peephole The MI Peephole pass has grown to include a large number of transformations over the years. Many of the transformations require re-computation of kill flags but don't do a good job of re-computing them. This causes us to have very common failures when the compiler is built with expensive checks. Over time, we added and augmented a function that is supposed to go and fix up kill flags after each transformation but we keep missing cases. This patch does the following: - Removes the function to re-compute kill flags - Adds LiveVariables to compute and maintain kill flags while transforming code - Adds re-computation of kill flags for the post-RA peepholes for each block that contains a transformed instruction Reviewed By: stefanp Differential Revision: https://reviews.llvm.org/D133103	2023-09-22 15:26:39 -04:00
Jay Foad	e0919b189b	[CodeGen] Renumber slot indexes before register allocation (#66334 ) RegAllocGreedy uses SlotIndexes::getApproxInstrDistance to approximate the length of a live range for its heuristics. Renumbering all slot indexes with the default instruction distance ensures that this estimate will be as accurate as possible, and will not depend on the history of how instructions have been added to and removed from SlotIndexes's maps. This also means that enabling -early-live-intervals, which runs the SlotIndexes analysis earlier, will not cause large amounts of churn due to different register allocator decisions.	2023-09-19 11:18:12 +01:00
Craig Topper	f71a9e8bb7	[SelectionDAG][RISCV][PowerPC][X86] Use TargetConstant for immediates for ISD::PREFETCH. (#66601 ) The intrinsic uses ImmArg so TargetConstant would be consistent with how other intrinsics are handled. This hides the constants from type legalization so we can remove the promotion support. isel patterns are updated accordingly.	2023-09-18 08:58:50 -07:00
Guozhi Wei	cbdccb30c2	[RA] Split a virtual register in cold blocks if it is not assigned preferred physical register If a virtual register is not assigned preferred physical register, it means some COPY instructions will be changed to real register move instructions. In this case we can try to split the virtual register in colder blocks, if success, the original COPY instructions can be deleted, and the new COPY instructions in colder blocks will be generated as register move instructions. It results in fewer dynamic register move instructions executed. The new test case split-reg-with-hint.ll gives an example, the hot path contains 24 instructions without this patch, now it is only 4 instructions with this patch. Differential Revision: https://reviews.llvm.org/D156491	2023-09-15 19:52:50 +00:00
Maryam Moghadas	7b021f2e64	[PowerPC] Optimize VPERM and fix code order for swapping vector operands on LE This patch reverts commit 7614ba0a5db8 to optimize VPERM when one of its vector operands is XXSWAPD, similar to XXPERM. It also reorganizes the little-endian swap code on LE, swapping the vector operand after adjusting the mask operand. This ensures that the vector operand is swapped at the correct point in the code, resulting in a valid constant pool for the mask operand. Reviewed By: stefanp Differential Revision: https://reviews.llvm.org/D149083	2023-09-13 15:00:49 -05:00
Simon Pilgrim	e6b85c3027	[DAG] FoldSetCC - add missing icmp(X,undef) -> isTrueWhenEqual case (REAPPLIED) Followup to D59363 which failed to handle the icmp(X,undef) -> isTrueWhenEqual case - similar to llvm::ConstantFoldCompareInstruction As discussed on the review, this is affecting some previously reduced test cases, but will also prevent reductions from relying on this inconsistent behaviour in the future. Reapplied after reversion at e1e3c75c7dad72 with a tweak to the pseudo-probe-peep.ll test Differential Revision: https://reviews.llvm.org/D158068	2023-09-13 12:33:39 +01:00
Simon Pilgrim	e1e3c75c7d	Revert rG6c56cf71ee82ec3a28e0dfc2b751bd10c16929da "[DAG] FoldSetCC - add missing icmp(X,undef) -> isTrueWhenEqual case" Need to address a missed test change	2023-09-13 11:27:47 +01:00
Simon Pilgrim	6c56cf71ee	[DAG] FoldSetCC - add missing icmp(X,undef) -> isTrueWhenEqual case Followup to D59363 which failed to handle the icmp(X,undef) -> isTrueWhenEqual case - similar to llvm::ConstantFoldCompareInstruction As discussed on the review, this is affecting some previously reduced test cases, but will also prevent reductions from relying on this inconsistent behaviour in the future. Differential Revision: https://reviews.llvm.org/D158068	2023-09-13 11:01:58 +01:00
Qiu Chaofan	69b056d563	[PowerPC] Implement SchedModel for Power7 Reviewed By: shchenz Differential Revision: https://reviews.llvm.org/D158704	2023-09-13 14:55:07 +08:00
Qiu Chaofan	d4d0b5eaab	Fix MIR failure after b922a362	2023-09-08 16:33:45 +08:00
Qiu Chaofan	b922a36211	[PowerPC] Define SchedModel for Power8 PowerPC subtargets prior to Power9 use the 'legacy' itinerary way to provide scheduling information. This patch re-writes the tablegen file to define the scheduling information in the new SchedModel way, which can bring improvements to some benchmarks. Reviewed By: shchenz Differential Revision: https://reviews.llvm.org/D154488	2023-09-08 15:43:21 +08:00
bzEq	d9efcb54c9	[PEI][PowerPC] Fix false alarm of stack size limit (#65559) PPC64 allows stack size up to ((2^63)-1) bytes. Currently llc reports ``` warning: stack frame size (4294967568) exceeds limit (4294967295) in function 'main' ``` if the stack allocated is larger than 4G.	2023-09-08 15:16:00 +08:00
Amy Kwan	3f46e5453d	[AIX][TLS] Produce a faster local-exec access sequence with -maix-small-local-exec-tls (And optimize when load/store offsets are 0) This patch utilizes the -maix-small-local-exec-tls option added in D155544 to produce a faster access sequence for the local-exec TLS model, where loading from the TOC can be avoided. The patch either produces an addi/la with a displacement off of r13 (the thread pointer) when the address is calculated, or it produces an addi/la followed by a load/store when the address is calculated and used for further accesses. This patch also optimizes this sequence a bit more where we can remove the addi/la when the load/store offset is 0. A follow up patch will be posted to account for when the load/store offset is non-zero, and currently in these situations we keep the addi/la that precedes the load/store. Furthermore, this access sequence is only performed for TLS variables that are less than ~32KB in size. Differential Revision: https://reviews.llvm.org/D155600	2023-09-07 20:05:29 -05:00
Amy Kwan	8bdbee8aaa	[AIX][TLS] Add target attribute for -maix-small-local-exec-tls option. This patch adds a target attribute for an AIX-specific option that informs the compiler that it can use a faster access sequence for the local-exec TLS model (formally named aix-small-local-exec-tls). The Clang portion of this option is in D155544. The initial implementation to generate the faster access sequence is in D155600. Differential Revision: https://reviews.llvm.org/D156203	2023-09-07 20:05:29 -05:00
stefanp-ibm	0a4a8bec34	[PowerPC] Turn string pooling on by default. (#65628) This patch turns the string pooling pass on by default. Some tests are updated as required.	2023-09-07 16:49:31 -04:00
Wael Yehia	11d5c7bd28	[AIX] Add threadId and use nanosecond timestamp in sinit/sterm symbols With ThinLTO, when compiling SPEC 2017 omnetpp_r with -threads=4, two small modules can end up with the same timestamp in their sinit symbols when calculating time in seconds, creating duplicate definitions. This patch uses a timestamp in nanoseconds. Because the race can be between threads, embed the thread ID as well. Reviewed By: xingxue, daltenty Differential Revision: https://reviews.llvm.org/D159319	2023-09-07 17:46:41 +00:00
Amy Kwan	f94f85348d	Revert "[AIX][TLS] Generate .extern and .ref references to __tls_get_addr for local-exec accesses." This reverts commit f0b2f6954101c9052763a99a1e7ac135770e779a. The implementation is incorrect and breaks compiling local-exec programs.	2023-09-07 12:10:37 -05:00
esmeyi	b85a9b3093	[PowerPC] Try to use less instructions to materialize 64-bit constant when High32=Low32. Summary: Materializing a 64-bit constant with High32=Low32 only requires 2 instructions instead of 3 when Low32 can be materialized in 1 instruction. Reviewed By: qiucf Differential Revision: https://reviews.llvm.org/D158495	2023-09-07 13:03:17 -04:00
Stefan Pintilie	84e2fd7ee4	[PowerPC] Add a pass to merge all of the constant global arrays into one pool. On PowerPC the number of TOC entries must be kept low for large applications. In order to reduce the number of constant global arrays we can pool them into one structure and then access them as the base address of that structure plus some offset. The constant global arrays may be arrays of `i8` which are constant strings but they may also be arrays of `i32, i64, etc...`. Reviewed By: lei, amyk Differential Revision: https://reviews.llvm.org/D155730	2023-09-07 11:14:56 -04:00
Stefan Pintilie	492c1f3d7c	[PowerPC] Merge rotate and clear into single instruction. This patch tries to catch a codegen opportunity where the rotate and mask can be merged into a single RLDCL instruction. Reviewed By: lei, amyk Differential Revision: https://reviews.llvm.org/D158328	2023-09-07 09:25:41 -04:00
Ting Wang	71be020dda	[SelectionDAG][PowerPC] Memset reuse vector element for tail store On PPC there are instructions to store an element from a vector (e.g. stxsdx/stxsiwx), and these instructions can be leveraged to avoid the tail constant in memset and constant splat array initialization. This patch tries to explore these opportunities. Reviewed By: shchenz Differential Revision: https://reviews.llvm.org/D138883	2023-09-06 01:52:38 -04:00
Amy Kwan	f0b2f69541	[AIX][TLS] Generate .extern and .ref references to __tls_get_addr for local-exec accesses. Compiling with TLS variables requires -pthread, but if the user omits this option, the compiler will not show any obvious indication during compilation that -pthread is needed for programs using TLS variables. Instead, the user will experience a segmentation fault when running programs with TLS variables in them and without specifying -pthread. This patch aims to generate .extern/.ref references to __tls_get_addr[DS] for local-exec accesses, in order to trigger an error from the linker to indicate that there is an undefined symbol to __tls_get_addr. Doing so will remind the user to compile/link with -pthread. Differential Revision: https://reviews.llvm.org/D151335	2023-09-05 12:15:14 -05:00
Qiu Chaofan	082c5d7f63	[PowerPC] Implement builtin for mffsl mffsl is available since ISA 3.0. The builtin is named with ppc prefix to follow our convention. For targets earlier than power9, GCC generates extra code to support the functionality, while this patch does not implement such behavior. Reviewed By: nemanjai, tuliom Differential Revision: https://reviews.llvm.org/D158065	2023-09-05 11:22:09 +08:00
Matt Arsenault	b14e83d1a4	IR: Add llvm.exp10 intrinsic We currently have log, log2, log10, exp and exp2 intrinsics. Add exp10 to fix this asymmetry. AMDGPU already has most of the code for f32 exp10 expansion implemented alongside exp, so the current implementation is duplicating nearly identical effort between the compiler and library which is inconvenient. https://reviews.llvm.org/D157871	2023-09-01 19:45:03 -04:00
Chen Zheng	a69cb20768	[NFC] Fix the PowerPC broken cases in D152215. Reviewed By: qiucf Differential Revision: https://reviews.llvm.org/D159052	2023-09-01 02:07:48 -04:00
Stephen Peckham	282da83756	[XCOFF][AIX] Issue an error when specifying an alias for a common symbol Summary: There is no support in XCOFF for labels on common symbols. Therefore, an alias for a common symbol is not supported. Issue an error in the front end when an aliasee is a common symbol. Issue a similar error in the back end in case an IR specifies an alias for a common symbol. Reviewed by: hubert.reinterpretcast, DiggerLin Differential Revision: https://reviews.llvm.org/D158739	2023-08-31 11:43:47 -04:00
Qiu Chaofan	21bea1a208	[PowerPC] Support initial-exec TLS relocation on AIX Add TLS_IE relocation type to XCOFF writer, and emit code sequence for initial-exec TLS variables. Reviewed By: lkail Differential Revision: https://reviews.llvm.org/D156292	2023-08-30 16:22:16 +08:00
Chen Zheng	732f63d96d	[PowerPC]set default min-jump-table-entries to 64 on PPC Reviewed By: nemanjai Differential Revision: https://reviews.llvm.org/D159050	2023-08-29 21:42:22 -04:00
Chen Zheng	833b1e307f	[NFC] add testcase for MinimumJumpTableEntries change on PowerPC.	2023-08-29 21:13:50 -04:00
Serguei Katkov	a701b7e368	[CGP] Remove dead PHI nodes before elimination of mostly empty blocks Before elimination of a mostly empty block it makes sense to remove dead PHI nodes. It opens up more opportunities for elimination and also eliminates the dead code itself. The change turned out to cause many unit test failures; some of the tests I've updated, and for another one I disabled this optimization. The pattern I observed in the tests is that there is an infinite loop without side effects. As a result, after elimination of the dead PHI node all other related instructions are also removed and the tests stop checking what they are expected to check. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D158503	2023-08-29 04:35:06 +00:00
esmeyi	8514d207ba	[AIX] Handle ReadOnlyWithRel kind on AIX. Summary: This patch handles the SectionKind of ReadOnlyWithRel on AIX. The failure was discovered during sanitizer enablement and occurred with the `-fsanitize-coverage` option. Reviewed By: hubert.reinterpretcast Differential Revision: https://reviews.llvm.org/D157483	2023-08-28 00:21:09 -04:00
Craig Topper	2ad50f354a	[DAGCombiner][RISCV][AArch64][PowerPC] Restrict foldAndOrOfSETCC from using SMIN/SMAX where an OR/AND would do. This removes some diffs created by D153502. I'm assuming an AND/OR won't be worse than an SMIN/SMAX. For RISC-V at least, AND/OR can be a shorter encoding than SMIN/SMAX. It's weird that we have two different functions responsible for folding logic of setccs, but I'm not ready to try to untangle that. I'm unclear if the PowerPC change is a regression or not. It looks like it might use more registers, but I don't understand PowerPC registers so I'm not sure. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D158292	2023-08-23 20:26:23 -07:00
Kai Luo	1ceaec3e81	[PowerPC][altivec] Optimize codegen of vec_promote According to https://www.ibm.com/docs/en/xl-c-and-cpp-linux/16.1.1?topic=functions-vec-promote, elements not specified by the input index argument are undefined. So that we don't need to set these elements to be zeros. Reviewed By: nemanjai, #powerpc Differential Revision: https://reviews.llvm.org/D158487	2023-08-24 02:10:13 +00:00
esmeyi	96b5ea6e00	[NFC][PowerPC] Add cases for 64-bit constants.	2023-08-23 04:10:16 -04:00
Stefan Pintilie	d0e1e7649b	[NFC][PowerPC] Add a test case for rotate and clear. Added a test case for situations where a rotate is followed by a clear. NFC because only a test case is added.	2023-08-21 11:01:47 -04:00
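
Note on 127ed9ae26: the soundness argument in that message (`and(anyext(x),c)` may become `zext(and(x,c))` but not `anyext(and(x,c))`) can be checked with a small model. This is a minimal Python sketch, not LLVM code. The 8-bit and 32-bit widths, the helper names, and the `junk` parameter that stands in for the unspecified high bits are all assumptions made for illustration, and the mask `c` is assumed to fit in the narrow width.

```python
NARROW_BITS = 8    # assumed width of x (illustration only)
WIDE_BITS = 32     # assumed width after extension (illustration only)

NARROW_MASK = (1 << NARROW_BITS) - 1
WIDE_MASK = (1 << WIDE_BITS) - 1


def zext(x):
    """Zero-extend: the high bits are guaranteed to be 0."""
    return x & NARROW_MASK


def anyext(x, junk):
    """Any-extend: the high bits are unspecified; `junk` models one possible choice."""
    high = junk & WIDE_MASK & ~NARROW_MASK
    return high | (x & NARROW_MASK)


def original(x, c, junk):
    # and(anyext(x), c): the mask is applied after extending, so it clears the junk
    return anyext(x, junk) & c


def old_combine(x, c, junk):
    # anyext(and(x, c)): the junk is introduced after masking and survives
    return anyext(x & c & NARROW_MASK, junk)


def new_combine(x, c):
    # zext(and(x, c)): high bits are zero, matching the original expression
    return zext(x & c)


x, c = 0xAB, 0x0F
for junk in (0x00000000, 0xFFFFFF00, 0x12345600):
    assert original(x, c, junk) == new_combine(x, c)

print(hex(original(x, c, 0xFFFFFF00)))     # 0xb
print(hex(old_combine(x, c, 0xFFFFFF00)))  # 0xffffff0b
```

In this model the original expression and the `zext` form agree for every choice of `junk`, while the `anyext` form lets the junk through into the result.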


3716 Commits
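
Note on 740ee00a4c: the MIR diff in that message shows a successor weight going from 0x40000000 (50.00%) to 0x80000000 (100.00%) once the other successor is removed. The arithmetic can be sketched as below. This is a rough Python illustration, not the code in PPCBranchCoalescing: the function names, the hypothetical block name `bb.7`, and the even split for an all-zero input are assumptions; only the 0x80000000 = 100% scale is taken from the diff.

```python
DENOMINATOR = 1 << 31  # 0x80000000 is printed as 100.00% in the MIR diff


def renormalize(numerators):
    """Rescale surviving successor weights so they sum to (about) DENOMINATOR."""
    if not numerators:
        return []
    total = sum(numerators)
    if total == 0:
        # No information left: split evenly (a choice made for this sketch).
        return [DENOMINATOR // len(numerators)] * len(numerators)
    # Integer division rounds down, so the sum can fall slightly short.
    return [n * DENOMINATOR // total for n in numerators]


def percent(numerator):
    return 100.0 * numerator / DENOMINATOR


# Two successors at 50% each; "bb.7" is a hypothetical second successor.
weights = {"bb.6": 0x40000000, "bb.7": 0x40000000}
del weights["bb.7"]  # the block successor is removed

fixed = renormalize(list(weights.values()))
print(hex(fixed[0]), f"{percent(fixed[0]):.2f}%")  # 0x80000000 100.00%
```

Without the rescaling step the remaining edge would keep its 50% weight, which is the "not adding up to 100%" state the commit message describes.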