llvm-project

Author	SHA1	Message	Date
Douglas Yung	cc2c8ab21f	Require asserts for llvm/test/CodeGen/PowerPC/sms-regpress.mir.	2024-01-22 13:51:03 -08:00
Ryotaro KASUGA	7556626dcf	[CodeGen][MachinePipeliner] Limit register pressure when scheduling (#74807) In software pipelining, when searching for the Initiation Interval (II), `MachinePipeliner` tries to reduce register pressure, but doesn't check how many variables can actually be alive at the same time. As a result, a lot of register spills/fills can be generated after register allocation, which might cause performance degradation. To prevent such cases, this patch adds a check phase that calculates the maximum register pressure of the scheduled loop and rejects it if the pressure is too high. This can be enabled by specifying `pipeliner-register-pressure`. Additionally, the II search range is currently fixed at 10, which is too small to find a schedule when the above algorithm is applied. Therefore this patch also adds a new option `pipeliner-ii-search-range` to specify the length of the range to search. There is one more new option `pipeliner-register-pressure-margin`, which can be used to estimate a register pressure limit less than actual for conservative analysis. Discourse thread: https://discourse.llvm.org/t/considering-register-pressure-when-deciding-initiation-interval-in-machinepipeliner/74725	2024-01-22 17:06:37 +09:00
Hans Wennborg	677ced8af2	Require asserts for llvm/test/CodeGen/PowerPC/fence.ll	2024-01-15 17:25:49 +01:00
Nikita Popov	87bc91d425	[PowerPC] Fix shuffle combine with undef elements (#77787) This custom DAG combine works on a shuffle where one source vector is a zero splat, which means we can adjust the shuffle indices to refer to any element of the splat -- as long as we stay in the same vector. In the case where an undef (-1) index into the non-splat vector was used, we ended up adjusting the splat index to -1+NumElements, which points into the wrong vector. Fix this by using the first element from the splat if the other one is undef. There are four cases this theoretically affects, but in practice I only managed to demonstrate a miscompile with one of them. I think two of these are effectively dead due to the operand canonicalization at the start of the transform. Fixes https://github.com/llvm/llvm-project/issues/77748.	2024-01-15 10:12:33 +01:00
Qiu Chaofan	ce1f9465b0	[NFC] Pre-commit case of ppcf128 extractelt soften	2024-01-15 15:27:36 +08:00
Qiu Chaofan	85071a3c74	[PowerPC] Implement fence builtin (#76495 )	2024-01-15 11:19:16 +08:00
Philip Reames	e4d01bb227	[SCEV] Special case sext in isKnownNonZero (#77834 ) The existing logic in isKnownNonZero relies on unsigned ranges, which can be problematic when our range calculation is imprecise. Consider the following: %offset.nonzero = or i32 %offset, 1 --> %offset.nonzero U: [1,0) S: [1,0) %offset.i64 = sext i32 %offset.nonzero to i64 --> (sext i32 %offset.nonzero to i64) U: [-2147483648,2147483648) S: [-2147483648,2147483648) Note that the unsigned range for the sext does contain zero in this case despite the fact that it can never actually be zero. Instead, we can push the query down one level - relying on the fact that the sext is an invertible operation and that the result can only be zero if the input is. We could likely generalize this reasoning for other invertible operations, but special casing sext seems worthwhile.	2024-01-12 07:45:28 -08:00
Nikita Popov	13b5882ee6	[PowerPC] Add test for #77748 (NFC)	2024-01-11 15:45:52 +01:00
Kai Luo	6615581526	[PowerPC] Make verifier happy when lowering `llvm.trap` (#77266) `llvm.trap` is lowered to `PPC::TRAP`, and `PPC::TRAP` is marked as a terminator. The verifier complains that a terminator should not lie in the middle of an MBB. See #77095. Fix it by removing `isTerminator` and `isBarrier` and setting `isTrap` instead, which was introduced by https://reviews.llvm.org/D48836# and is being used by X86 and AArch64. `PPC::TRAP` is not a hardware memory barrier and `llvm.trap` doesn't indicate a memory barrier either.	2024-01-10 09:23:30 +08:00
Fangrui Song	f972e4d343	[MC,ELF] .section: unconditionally print section flag 'G' after 'o' * Placing 'G' before 'M' (SHF_MERGE) can be misleading as the sh_entsize argument goes before the section group name, if a reader doesn't know that the order of extra arguments is not affected by the order of flags. * 'a', 'w', and 'x' indicate basic permission-related flags. Separating them with 'G' is kinda ugly. Simplify code and move 'G' after 'o'. The new output is more similar to GCC.	2024-01-09 10:48:23 -08:00
Kai Luo	225e2704af	[PowerPC] Precommit test for lowering llvm.trap on ppc64le. NFC.	2024-01-08 10:20:01 +08:00
Chen Zheng	d6aef863d8	[PowerPC] make LR/LR8 CTR/CTR8 aliased (#76926 ) fixes https://github.com/llvm/llvm-project/issues/47156 fixes https://github.com/llvm/llvm-project/issues/47155	2024-01-08 09:37:40 +08:00
Chen Zheng	dd4dc2111e	nfc add cases for pr47156 and pr47155	2024-01-04 03:56:40 -05:00
Arthur Eubanks	ece1359857	Revert "[PowerPC] Add test after #75271 on PPC. NFC. (#75616 )" This reverts commit 5cfc7b3342ce4de0bbe182b38baa8a71fc83f8f8. This depends on 0e46b49de43349f8cbb2a7d4c6badef6d16e31ae which is being reverted.	2024-01-03 17:09:45 +00:00
Kai Luo	8ae73fea3a	[PowerPC] Precommit test for #72845 . NFC.	2024-01-03 03:03:48 +00:00
Qiu Chaofan	c97a7675ee	[PowerPC] Expand FSINCOS of fp128 (#76494 )	2023-12-29 11:27:06 +08:00
Kai Luo	5cfc7b3342	[PowerPC] Add test after #75271 on PPC. NFC. (#75616 ) Demonstrate `IMPLICIT_DEF implicit-def ...` can be generated after coalescing on PPC. The case is reduced from failure in #75570. The failure is triggered after #75271 .	2023-12-26 00:21:56 +08:00
stephenpeckham	7026086073	[XCOFF] Use RLDs to print branches even without -r (#74342 ) This presents misleading and confusing output. If you have a function defined at the beginning of an XCOFF object file, and you have a function call to an external function, the function call disassembles as a branch to the local function. That is, `void f() { f(); g();}` disassembles as >00000000 <.f>: 0: 7c 08 02 a6 mflr 0 4: 94 21 ff c0 stwu 1, -64(1) 8: 90 01 00 48 stw 0, 72(1) c: 4b ff ff f5 bl 0x0 <.f> 10: 4b ff ff f1 bl 0x0 <.f> With this PR, the second call will display: `10: 4b ff ff f1 bl 0x0 <.g> ` Using -r can help, but you still get the confusing output: >10: 4b ff ff f1 bl 0x0 <.f> 00000010: R_RBR .g	2023-12-21 08:17:32 -06:00
Kai Luo	56414220df	[PowerPC] Use 'sync; ld; cmp; bc; isync' for atomic load seq-cst on 32-bit platform (#75905 ) `cmp; bc; isync` is more performant than `lwsync` theoretically. 64-bit platform already features it, now implement it for 32-bit platform.	2023-12-20 10:01:02 +08:00
Paul Kirth	9a578a9f60	Revert "[StackColoring] Delete dead stack slots (#75351 )" (#75655 ) This reverts commit 08b306dc8e7c0b2498f4f194a3c51686d56dbd20. it causes the following assertion failure: llvm/include/llvm/CodeGen/MachineFrameInfo.h:530: int64_t llvm::MachineFrameInfo::getObjectOffset(int) const: Assertion `!isDeadObjectIndex(ObjectIdx) && "Getting frame offset for a dead object?"' failed.	2023-12-15 13:32:39 -08:00
mohammed-nurulhoque	08b306dc8e	[StackColoring] Delete dead stack slots (#75351 ) deletes slots that have lifetime markers and the lifetime ranges are empty.	2023-12-15 09:58:19 +00:00
Nikita Popov	9c093cbb5e	Revert "[StackColoring] Delete dead stack slots (#72633 )" This reverts commit a29457844bf0c4b2eb5c0f3877b6e8ef30cdef52. Causes an assertion failure in llvm/test/DebugInfo/COFF/lexicalblock.ll.	2023-12-13 14:31:09 +01:00
mohammed-nurulhoque	a29457844b	[StackColoring] Delete dead stack slots (#72633 ) Deletes slots that have lifetime markers and the lifetime ranges are empty.	2023-12-13 13:01:21 +01:00
paperchalice	60eca674b1	[CodeGen] Port `ExpandMemCmp` to new pass manager (#74050 )	2023-12-13 16:18:24 +08:00
bcahoon	a19c7c403f	[MachinePipeliner] Fix store-store dependences (#72575) The pipeliner needs to mark store-store order dependences as loop carried dependences. Otherwise, the stores may be scheduled further apart than the MII. The order dependences imply that the first instance of the dependent store is scheduled before the second instance of the source store instruction.	2023-12-11 21:10:34 -06:00
Maryam Moghadas	8f6f5ec776	[PowerPC] Move __ehinfo TOC entries to the end of the TOC section (#73586 ) On AIX, the __ehinfo toc-entry is never referenced directly using instructions, therefore we can allocate them with the TE storage mapping class to move them to the end of TOC.	2023-12-08 15:03:11 -05:00
Stefan Pintilie	ea8b95d0d5	[PowerPC] Add a set of extended mnemonics that are missing from Power 10. (#73003 ) This patch adds the majority of the missing extended mnemonics that were introduced in Power 10. The only extended mnemonics that were not added are related to the plq and pstq instructions. These will be added in a separate patch as the instructions themselves would also have to be added.	2023-12-07 13:40:00 -05:00
Chen Zheng	4b932d84f4	[PowerPC] redesign the target flags (#69695) 12 bits are not enough for PPC's target-specific flags. With 8 bits for the bitmask flags and 4 bits for the direct mask, PPC can have at most 16 direct masks and 8 bitmasks in total. That is not enough for PPC; see the issue in https://github.com/llvm/llvm-project/pull/66316 Redesign how the PPC target sets the target-specific flags. With this patch, all PPC target flags are direct flags; there are no bitmask flags in PPC anymore. This aligns PPC with targets like X86, which also have many target-specific flags. The patch also fixes a bug related to the flags `MO_TLSGDM_FLAG` and `MO_LO`: they have the same value, and the test case changes in this PR show the bug.	2023-12-07 12:47:25 +08:00
Nikita Popov	eecb99c5f6	[Tests] Add disjoint flag to some tests (NFC) These tests rely on SCEV recognizing an "or" with no common bits as an "add". Add the disjoint flag to relevant or instructions in preparation for switching SCEV to use the flag instead of the ValueTracking query. The IR with disjoint flag matches what InstCombine would produce.	2023-12-05 14:09:36 +01:00
stephenpeckham	4b1254e7d4	[AIX] In assembly file, create a dummy text renamed to an empty string (#73052 ) This works around an AIX assembler and linker bug. If the -fno-integrated-as and -frecord-command-line options are used but there's no actual code in the source file, the assembler creates an object file with only an .info section. The AIX linker rejects such an object file.	2023-12-04 17:35:47 -06:00
Ramkumar Ramachandra	d48d1edcf3	PowerPC/aix-cc-abi: regenerate test using UTC (NFC) (#73963) Split out the parts of aix-cc-abi.ll that need to be regenerated by utils/update_mir_test_checks.py into aix-cc-abi-mir.ll, and regenerate it using the script. Regenerate aix-cc-abi.ll using utils/update_llc_test_checks.py.	2023-12-01 08:22:18 +00:00
Kai Luo	afd9582b36	[PowerPC] Enhance test for PR #73609 . NFC.	2023-11-30 05:06:29 +00:00
Kai Luo	00f9946680	[PowerPC] Precommit test of building vector via load and zeros. NFC.	2023-11-28 03:32:57 +00:00
Bjorn Pettersson	30afb21547	Revert "[MCP] Enhance MCP copy Instruction removal for special case (#70778 )" This reverts commit cae46f6210293ba4d3568eb21b935d438934290d. Reverted due to miscompiles. See https://github.com/llvm/llvm-project/issues/73512	2023-11-27 19:39:40 +01:00
Chen Zheng	abc405858d	[XCOFF] make related SD symbols as isFunction (#69553) This will help tools like llvm-symbolizer recognize more functions.	2023-11-26 11:59:09 +08:00
Stefan Pintilie	d896b1f5a6	[PowerPC] Do not string pool globals that are part of llvm used. (#66848) The string pooling pass was incorrectly pooling global variables that were part of llvm.used or llvm.compiler.used. This patch fixes the pass to prevent that by checking each candidate to make sure that it is not in either of those lists.	2023-11-24 12:21:28 -05:00
LWenH	32903b0b6d	[MCP] fix PowerPC redundant copy instructions removal fail test cases, NFC	2023-11-23 01:54:53 +08:00
Kai Luo	bfd3734610	[PowerPC] Use MIR test so that it's not affected by instruction selection. NFC.	2023-11-20 09:51:12 +00:00
Kai Luo	592386400d	[PowerPC] Precommit test to show codegen while `isel` is unavailable. NFC.	2023-11-20 07:28:21 +00:00
Kai Luo	eb7698254a	[PowerPC][EarlyIfConversion] Do not insert `isel` if subtarget doesn't support `isel` (#72211 ) Some subtargets of PPC don't support `isel` instruction, early-ifcvt should not insert this instruction.	2023-11-20 09:17:04 +08:00
Qiu Chaofan	426ad99bb2	[PowerPC] Forbid f128 SELECT_CC optimized into fsel (#71497 )	2023-11-15 12:20:06 +08:00
Qiongsi Wu	c8b11091e8	[SelectionDAG] Handling Oversized Alloca Types under 32 bit Mode to Avoid Code Generator Crash (#71472) Situations may arise that lead to a negative `NumElements` argument of an `alloca` instruction. In this case the `NumElements` is treated as a large unsigned value. Such large arrays may cause the size constant to overflow during code generation under 32 bit mode, leading to a crash. This PR limits the constant's bit width to the width of the pointer on the target. With this fix, ``` alloca i32, i32 -1 ``` and ``` alloca [4294967295 x i32], i32 1 ``` generate the exact same PowerPC assembly code under 32 bit mode.	2023-11-14 10:52:51 -05:00
Kai Luo	acdf7c8f27	[PowerPC] Precommit test to show impact of early-ifcvt on target without `isel`. NFC.	2023-11-14 06:10:05 +00:00
stephenpeckham	1d1fede493	[XCOFF] Ensure .file is emitted before any .info pseudo-ops (#71577 ) When generating the assembly code for AIX/XCOFF, the .file pseudo-op needs to be emitted first, before any csects are generated. Otherwise, information such as the embedded command line will be associated with part of the object file rather than the entire object file.	2023-11-09 16:03:45 -06:00
Juergen Ributzka	6d1d7be133	Obsolete WebKit Calling Convention (#71567 ) The WebKit Calling Convention was created specifically for the WebKit FTL. FTL doesn't use LLVM anymore and therefore this calling convention is obsolete. This commit removes the WebKit CC, its associated tests, and documentation.	2023-11-09 09:08:41 -08:00
Qiu Chaofan	5f295552f1	[PowerPC] Fix incorrect symbol name of frexp libcall (#71626 ) frexpl is for ppc_fp128. The correct symbol name for f128 is frexpf128.	2023-11-08 14:41:19 +08:00
Qiu Chaofan	d199fd76f7	[NFC] Add f128 frexp intrinsics for PowerPC	2023-11-08 11:27:40 +08:00
Nikita Popov	e4a4122eb6	[IR] Remove zext and sext constant expressions (#71040 ) Remove support for zext and sext constant expressions. All places creating them have been removed beforehand, so this just removes the APIs and uses of these constant expressions in tests. There is some additional cleanup that can be done on top of this, e.g. we can remove the ZExtInst vs ZExtOperator footgun. This is part of https://discourse.llvm.org/t/rfc-remove-most-constant-expressions/63179.	2023-11-03 10:46:07 +01:00
Nikita Popov	060de415af	Reapply [InstCombine] Simplify and/or of icmp eq with op replacement (#70335 ) Relative to the first attempt, this contains two changes: First, we only handle the case where one side simplifies to true or false, instead of calling simplification recursively. The previous approach would return poison if one operand simplified to poison (under the equality assumption), which is incorrect. Second, we do not fold llvm.is.constant in simplifyWithOpReplaced(). We may be assuming that a value is constant, if the equality holds, but it may not actually be constant. This is nominally just a QoI issue, but the std::list implementation in libstdc++ relies on the precise behavior in a way that causes miscompiles. ----- and/or in logical (select) form benefit from generic simplifications via simplifyWithOpReplaced(). However, the corresponding fold for plain and/or currently does not exist. Similar to selects, there are two general cases for this fold (illustrated with `and`, but there are `or` conjugates). The basic case is something like `(a == b) & c`, where the replacement of a with b or b with a inside c allows it to fold to true or false. Then the whole operation will fold to either false or `a == b`. The second case is something like `(a != b) & c`, where the replacement inside c allows it to fold to false. In that case, the operand can be replaced with c, because in the case where a == b (and thus the icmp is false), c itself will already be false. As the test diffs show, this catches quite a lot of patterns in existing test coverage. This also obsoletes quite a few existing special-case and/or of icmp folds we have (e.g. simplifyAndOrOfICmpsWithLimitConst), but I haven't removed anything as part of this patch in the interest of risk mitigation. Fixes #69050. Fixes #69091.	2023-11-03 10:16:15 +01:00
Kai Luo	7b5505b0d5	[PowerPC] Change registers used in test due to ABI breakage. NFC. (#70758 ) Usage of `r30` and `r31` has broken current traceback table's convention on AIX. Avoid using CSRs in livein list.	2023-11-03 07:08:33 +08:00
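
The argument in e4d01bb227 ([SCEV] Special case sext in isKnownNonZero) is that sign extension can only produce zero when its input is zero, even though the range computed for the extended value includes zero. The following is a minimal Python sketch that checks this by brute force at a small bit width. It is not LLVM code: the helper names `sext` and `sext_is_zero_only_for_zero` are hypothetical, and an 8-bit to 16-bit extension is assumed as a stand-in for the i32 to i64 case in the commit message.

```python
def sext(value: int, from_bits: int, to_bits: int) -> int:
    """Sign-extend a from_bits-wide bit pattern to to_bits.

    Both the input and the result are unsigned bit patterns.
    """
    sign_bit = 1 << (from_bits - 1)
    # Reinterpret the pattern as a signed integer ...
    signed = (value ^ sign_bit) - sign_bit
    # ... then take its two's complement pattern at the wider width.
    return signed & ((1 << to_bits) - 1)


def sext_is_zero_only_for_zero(from_bits: int = 8, to_bits: int = 16) -> bool:
    """Check exhaustively that sext(x) == 0 exactly when x == 0."""
    return all(
        (sext(v, from_bits, to_bits) == 0) == (v == 0)
        for v in range(1 << from_bits)
    )


if __name__ == "__main__":
    # `or x, 1` can never be zero, so neither can its sign extension.
    nonzero_inputs = [v | 1 for v in range(1 << 8)]
    assert all(sext(v, 8, 16) != 0 for v in nonzero_inputs)
    assert sext_is_zero_only_for_zero()
```

Because the check is exhaustive only for the chosen widths, it illustrates the invertibility argument rather than proving it for every width.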


3772 Commits
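
Commit c8b11091e8 ([SelectionDAG] Handling Oversized Alloca Types under 32 bit Mode) states that `alloca i32, i32 -1` and `alloca [4294967295 x i32], i32 1` produce the same PowerPC assembly once the size constant is limited to the pointer width. The arithmetic behind that can be sketched in Python as below. This is only an illustration of 32-bit wraparound, not the SelectionDAG implementation: the functions `wrap` and `alloca_size_bytes` are hypothetical, and a 4-byte `i32` is assumed.

```python
def wrap(value: int, bits: int) -> int:
    """Truncate an integer to an unsigned value of the given bit width."""
    return value & ((1 << bits) - 1)


def alloca_size_bytes(num_elements: int, elem_size_bytes: int,
                      pointer_bits: int = 32) -> int:
    """Byte count of an alloca, computed modulo 2**pointer_bits.

    A negative element count is reinterpreted as a large unsigned value,
    as the commit message describes.
    """
    count = wrap(num_elements, pointer_bits)
    return wrap(count * elem_size_bytes, pointer_bits)


I32_SIZE = 4  # assumed size of i32 in bytes

# alloca i32, i32 -1
size_a = alloca_size_bytes(-1, I32_SIZE)
# alloca [4294967295 x i32], i32 1
size_b = alloca_size_bytes(1, 4294967295 * I32_SIZE)

if __name__ == "__main__":
    print(hex(size_a), hex(size_b))
```

Under these assumptions both forms reduce to the same pointer-width byte count, which is consistent with the identical assembly the commit message reports.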