llvm-project

Author	SHA1	Message	Date
stephenpeckham	7026086073	[XCOFF] Use RLDs to print branches even without -r (#74342 ) This presents misleading and confusing output. If you have a function defined at the beginning of an XCOFF object file, and you have a function call to an external function, the function call disassembles as a branch to the local function. That is, `void f() { f(); g();}` disassembles as >00000000 <.f>: 0: 7c 08 02 a6 mflr 0 4: 94 21 ff c0 stwu 1, -64(1) 8: 90 01 00 48 stw 0, 72(1) c: 4b ff ff f5 bl 0x0 <.f> 10: 4b ff ff f1 bl 0x0 <.f> With this PR, the second call will display: `10: 4b ff ff f1 bl 0x0 <.g> ` Using -r can help, but you still get the confusing output: >10: 4b ff ff f1 bl 0x0 <.f> 00000010: R_RBR .g	2023-12-21 08:17:32 -06:00
Kai Luo	56414220df	[PowerPC] Use 'sync; ld; cmp; bc; isync' for atomic load seq-cst on 32-bit platform (#75905 ) `cmp; bc; isync` is more performant than `lwsync` theoretically. 64-bit platform already features it, now implement it for 32-bit platform.	2023-12-20 10:01:02 +08:00
Paul Kirth	9a578a9f60	Revert "[StackColoring] Delete dead stack slots (#75351 )" (#75655 ) This reverts commit 08b306dc8e7c0b2498f4f194a3c51686d56dbd20. it causes the following assertion failure: llvm/include/llvm/CodeGen/MachineFrameInfo.h:530: int64_t llvm::MachineFrameInfo::getObjectOffset(int) const: Assertion `!isDeadObjectIndex(ObjectIdx) && "Getting frame offset for a dead object?"' failed.	2023-12-15 13:32:39 -08:00
mohammed-nurulhoque	08b306dc8e	[StackColoring] Delete dead stack slots (#75351 ) deletes slots that have lifetime markers and the lifetime ranges are empty.	2023-12-15 09:58:19 +00:00
Nikita Popov	9c093cbb5e	Revert "[StackColoring] Delete dead stack slots (#72633 )" This reverts commit a29457844bf0c4b2eb5c0f3877b6e8ef30cdef52. Causes an assertion failure in llvm/test/DebugInfo/COFF/lexicalblock.ll.	2023-12-13 14:31:09 +01:00
mohammed-nurulhoque	a29457844b	[StackColoring] Delete dead stack slots (#72633 ) Deletes slots that have lifetime markers and the lifetime ranges are empty.	2023-12-13 13:01:21 +01:00
paperchalice	60eca674b1	[CodeGen] Port `ExpandMemCmp` to new pass manager (#74050 )	2023-12-13 16:18:24 +08:00
bcahoon	a19c7c403f	[MachinePipeliner] Fix store-store dependences (#72575 ) The pipeliner needs to mark store-store order dependences as loop carried dependences. Otherwise, the stores may be scheduled further apart than the MII. The order dependences imply that the first instance of the dependent store is scheduled before the second instance of the source store instruction.	2023-12-11 21:10:34 -06:00
Maryam Moghadas	8f6f5ec776	[PowerPC] Move __ehinfo TOC entries to the end of the TOC section (#73586 ) On AIX, the __ehinfo toc-entry is never referenced directly using instructions, therefore we can allocate them with the TE storage mapping class to move them to the end of TOC.	2023-12-08 15:03:11 -05:00
Stefan Pintilie	ea8b95d0d5	[PowerPC] Add a set of extended mnemonics that are missing from Power 10. (#73003 ) This patch adds the majority of the missing extended mnemonics that were introduced in Power 10. The only extended mnemonics that were not added are related to the plq and pstq instructions. These will be added in a separate patch as the instructions themselves would also have to be added.	2023-12-07 13:40:00 -05:00
Chen Zheng	4b932d84f4	[PowerPC] redesign the target flags (#69695 ) 12 bits are not enough for PPC's target-specific flags. With 8 bits for the bitmask flags and 4 bits for the direct mask, PPC can have at most 16 direct masks and 8 bitmasks in total. That is not enough for PPC; see the issue in https://github.com/llvm/llvm-project/pull/66316 Redesign how the PPC target sets its target-specific flags. With this patch, all PPC target flags are direct flags; PPC no longer has bitmask flags. This aligns PPC with targets like X86, which also have many target-specific flags. The patch also fixes a bug related to the flags `MO_TLSGDM_FLAG` and `MO_LO`: they have the same value, and the test case changes in this PR show the bug.	2023-12-07 12:47:25 +08:00
Nikita Popov	eecb99c5f6	[Tests] Add disjoint flag to some tests (NFC) These tests rely on SCEV recognizing an "or" with no common bits as an "add". Add the disjoint flag to relevant or instructions in preparation for switching SCEV to use the flag instead of the ValueTracking query. The IR with disjoint flag matches what InstCombine would produce.	2023-12-05 14:09:36 +01:00
stephenpeckham	4b1254e7d4	[AIX] In assembly file, create a dummy text renamed to an empty string (#73052 ) This works around an AIX assembler and linker bug. If the -fno-integrated-as and -frecord-command-line options are used but there's no actual code in the source file, the assembler creates an object file with only an .info section. The AIX linker rejects such an object file.	2023-12-04 17:35:47 -06:00
Ramkumar Ramachandra	d48d1edcf3	PowerPC/aix-cc-abi: regenerate test using UTC (NFC) (#73963 ) Split out the parts of aix-cc-abi.ll that need to be regenerated by utils/update_mir_test_checks.py into aix-cc-abi-mir.ll, and regenerate it using the script. Regenerate aix-cc-abi.ll using utils/update_llc_test_checks.py.	2023-12-01 08:22:18 +00:00
Kai Luo	afd9582b36	[PowerPC] Enhance test for PR #73609 . NFC.	2023-11-30 05:06:29 +00:00
Kai Luo	00f9946680	[PowerPC] Precommit test of building vector via load and zeros. NFC.	2023-11-28 03:32:57 +00:00
Bjorn Pettersson	30afb21547	Revert "[MCP] Enhance MCP copy Instruction removal for special case (#70778 )" This reverts commit cae46f6210293ba4d3568eb21b935d438934290d. Reverted due to miscompiles. See https://github.com/llvm/llvm-project/issues/73512	2023-11-27 19:39:40 +01:00
Chen Zheng	abc405858d	[XCOFF] make related SD symbols as isFunction (#69553 ) This will help tools like llvm-symbolizer recognize more functions.	2023-11-26 11:59:09 +08:00
Stefan Pintilie	d896b1f5a6	[PowerPC] Do not string pool globals that are part of llvm used. (#66848 ) The string pooling pass was incorrectly pooling global variables that were part of llvm.used or llvm.compiler.used. This patch fixes the pass to prevent that by checking each candidate to make sure that it is not in either of those lists.	2023-11-24 12:21:28 -05:00
LWenH	32903b0b6d	[MCP] fix PowerPC redundant copy instructions removal fail test cases, NFC	2023-11-23 01:54:53 +08:00
Kai Luo	bfd3734610	[PowerPC] Use MIR test so that it's not affected by instruction selection. NFC.	2023-11-20 09:51:12 +00:00
Kai Luo	592386400d	[PowerPC] Precommit test to show codegen while `isel` is unavailable. NFC.	2023-11-20 07:28:21 +00:00
Kai Luo	eb7698254a	[PowerPC][EarlyIfConversion] Do not insert `isel` if subtarget doesn't support `isel` (#72211 ) Some subtargets of PPC don't support `isel` instruction, early-ifcvt should not insert this instruction.	2023-11-20 09:17:04 +08:00
Qiu Chaofan	426ad99bb2	[PowerPC] Forbid f128 SELECT_CC optimized into fsel (#71497 )	2023-11-15 12:20:06 +08:00
Qiongsi Wu	c8b11091e8	[SelectionDAG] Handling Oversized Alloca Types under 32 bit Mode to Avoid Code Generator Crash (#71472 ) Situations may arise leading to negative `NumElements` argument of an `alloca` instruction. In this case the `NumElements` is treated as a large unsigned value. Such large arrays may cause the size constant to overflow during code generation under 32 bit mode, leading to a crash. This PR limits the constant's bit width to the width of the pointer on the target. With this fix, ``` alloca i32, i32 -1 ``` and ``` alloca [4294967295 x i32], i32 1 ``` generate the exact same PowerPC assembly code under 32 bit mode.	2023-11-14 10:52:51 -05:00
Kai Luo	acdf7c8f27	[PowerPC] Precommit test to show impact of early-ifcvt on target without `isel`. NFC.	2023-11-14 06:10:05 +00:00
stephenpeckham	1d1fede493	[XCOFF] Ensure .file is emitted before any .info pseudo-ops (#71577 ) When generating the assembly code for AIX/XCOFF, the .file pseudo-op needs to be emitted first, before any csects are generated. Otherwise, information such as the embedded command line will be associated with part of the object file rather than the entire object file.	2023-11-09 16:03:45 -06:00
Juergen Ributzka	6d1d7be133	Obsolete WebKit Calling Convention (#71567 ) The WebKit Calling Convention was created specifically for the WebKit FTL. FTL doesn't use LLVM anymore and therefore this calling convention is obsolete. This commit removes the WebKit CC, its associated tests, and documentation.	2023-11-09 09:08:41 -08:00
Qiu Chaofan	5f295552f1	[PowerPC] Fix incorrect symbol name of frexp libcall (#71626 ) frexpl is for ppc_fp128. The correct symbol name for f128 is frexpf128.	2023-11-08 14:41:19 +08:00
Qiu Chaofan	d199fd76f7	[NFC] Add f128 frexp intrinsics for PowerPC	2023-11-08 11:27:40 +08:00
Nikita Popov	e4a4122eb6	[IR] Remove zext and sext constant expressions (#71040 ) Remove support for zext and sext constant expressions. All places creating them have been removed beforehand, so this just removes the APIs and uses of these constant expressions in tests. There is some additional cleanup that can be done on top of this, e.g. we can remove the ZExtInst vs ZExtOperator footgun. This is part of https://discourse.llvm.org/t/rfc-remove-most-constant-expressions/63179.	2023-11-03 10:46:07 +01:00
Nikita Popov	060de415af	Reapply [InstCombine] Simplify and/or of icmp eq with op replacement (#70335 ) Relative to the first attempt, this contains two changes: First, we only handle the case where one side simplifies to true or false, instead of calling simplification recursively. The previous approach would return poison if one operand simplified to poison (under the equality assumption), which is incorrect. Second, we do not fold llvm.is.constant in simplifyWithOpReplaced(). We may be assuming that a value is constant, if the equality holds, but it may not actually be constant. This is nominally just a QoI issue, but the std::list implementation in libstdc++ relies on the precise behavior in a way that causes miscompiles. ----- and/or in logical (select) form benefit from generic simplifications via simplifyWithOpReplaced(). However, the corresponding fold for plain and/or currently does not exist. Similar to selects, there are two general cases for this fold (illustrated with `and`, but there are `or` conjugates). The basic case is something like `(a == b) & c`, where the replacement of a with b or b with a inside c allows it to fold to true or false. Then the whole operation will fold to either false or `a == b`. The second case is something like `(a != b) & c`, where the replacement inside c allows it to fold to false. In that case, the operand can be replaced with c, because in the case where a == b (and thus the icmp is false), c itself will already be false. As the test diffs show, this catches quite a lot of patterns in existing test coverage. This also obsoletes quite a few existing special-case and/or of icmp folds we have (e.g. simplifyAndOrOfICmpsWithLimitConst), but I haven't removed anything as part of this patch in the interest of risk mitigation. Fixes #69050. Fixes #69091.	2023-11-03 10:16:15 +01:00
Kai Luo	7b5505b0d5	[PowerPC] Change registers used in test due to ABI breakage. NFC. (#70758 ) Usage of `r30` and `r31` has broken current traceback table's convention on AIX. Avoid using CSRs in livein list.	2023-11-03 07:08:33 +08:00
Qiu Chaofan	b46e768455	[DAGCombine] Fold setcc_eq infinity into is.fpclass (#67829 )	2023-11-01 11:51:15 +09:00
Nikita Popov	e46dd6fbc0	Revert "[InstCombine] Simplify and/or of icmp eq with op replacement (#70335 )" This reverts commit 1770a2e325192f1665018e21200596da1904a330. Stage 2 llvm-tblgen crashes when generating X86GenAsmWriter.inc and other files.	2023-10-30 18:33:03 +01:00
Nikita Popov	1770a2e325	[InstCombine] Simplify and/or of icmp eq with op replacement (#70335 ) and/or in logical (select) form benefit from generic simplifications via simplifyWithOpReplaced(). However, the corresponding fold for plain and/or currently does not exist. Similar to selects, there are two general cases for this fold (illustrated with `and`, but there are `or` conjugates). The basic case is something like `(a == b) & c`, where the replacement of a with b or b with a inside c allows it to fold to true or false. Then the whole operation will fold to either false or `a == b`. The second case is something like `(a != b) & c`, where the replacement inside c allows it to fold to false. In that case, the operand can be replaced with c, because in the case where a == b (and thus the icmp is false), c itself will already be false. As the test diffs show, this catches quite a lot of patterns in existing test coverage. This also obsoletes quite a few existing special-case and/or of icmp folds we have (e.g. simplifyAndOrOfICmpsWithLimitConst), but I haven't removed anything as part of this patch in the interest of risk mitigation. Fixes #69050. Fixes #69091.	2023-10-30 10:05:39 +01:00
Simon Pilgrim	c9c9bf0f20	[DAG] WidenVectorOperand - add basic handling for *_EXTEND_VECTOR_INREG nodes Fixes Issue #70208	2023-10-25 16:52:15 +01:00
Matthias Braun	e3cf80c5c1	BlockFrequencyInfoImpl: Avoid big numbers, increase precision for small spreads BlockFrequencyInfo calculates block frequencies as Scaled64 numbers but as a last step converts them to unsigned 64bit integers (`BlockFrequency`). This improves the factors picked for this conversion so that: * Avoid big numbers close to UINT64_MAX to avoid users overflowing/saturating when adding multiple frequencies together or when multiplying with integers. This leaves the topmost 10 bits unused to allow for some room. * Spread the difference between hottest/coldest block as much as possible to increase precision. * If the hot/cold spread cannot be represented, lose precision at the lower end, but keep the frequencies at the upper end for hot blocks differentiable.	2023-10-24 20:27:39 -07:00
Ramkumar Ramachandra	98c90a13c6	ISel: introduce vector ISD::LRINT, ISD::LLRINT; custom RISCV lowering (#66924 ) The issue #55208 noticed that std::rint is vectorized by the SLPVectorizer, but a very similar function, std::lrint, is not. std::lrint corresponds to ISD::LRINT in the SelectionDAG, and std::llrint is a familiar cousin corresponding to ISD::LLRINT. Now, neither ISD::LRINT nor ISD::LLRINT have a corresponding vector variant, and the LangRef makes this clear in the documentation of llvm.lrint.* and llvm.llrint.*. This patch extends the LangRef to include vector variants of llvm.lrint.* and llvm.llrint.*, and lays the necessary ground-work of scalarizing it for all targets. However, this patch would be devoid of motivation unless we show the utility of these new vector variants. Hence, the RISCV target has been chosen to implement a custom lowering to the vfcvt.x.f.v instruction. The patch also includes a CostModel for RISCV, and a trivial follow-up can potentially enable the SLPVectorizer to vectorize std::lrint and std::llrint, fixing #55208. The patch includes tests, obviously for the RISCV target, but also for the X86, AArch64, and PowerPC targets to justify the addition of the vector variants to the LangRef.	2023-10-19 13:05:04 +01:00
Kai Luo	b42738805a	[PowerPC] Auto gen test checks for #69299 . NFC.	2023-10-18 02:21:22 +00:00
Kai Luo	3104681686	[PowerPC][Atomics] Remove redundant block to clear reservation (#68430 ) This PR is following what https://reviews.llvm.org/D134783 does for quadword CAS.	2023-10-13 10:59:27 +08:00
Nikita Popov	127ed9ae26	[PowerPC] Use zext instead of anyext in custom and combine (#68784 ) This custom combine currently converts `and(anyext(x),c)` into `anyext(and(x,c))`. This is not correct, because the original expression guaranteed that the high bits are zero, while the new one sets them to undef. Emit `zext(and(x,c))` instead. Fixes https://github.com/llvm/llvm-project/issues/68783.	2023-10-12 09:32:17 +02:00
Nikita Popov	0ead1faef0	[PowerPC] Add test for #68783 (NFC)	2023-10-11 12:15:26 +02:00
Jay Foad	7b3bbd83c0	Revert "[CodeGen] Really renumber slot indexes before register allocation (#67038 )" This reverts commit 2501ae58e3bb9a70d279a56d7b3a0ed70a8a852c. Reverted due to various buildbot failures.	2023-10-09 12:31:32 +01:00
Jay Foad	2501ae58e3	[CodeGen] Really renumber slot indexes before register allocation (#67038 ) PR #66334 tried to renumber slot indexes before register allocation, but the numbering was still affected by list entries for instructions which had been erased. Fix this to make the register allocator's live range length heuristics even less dependent on the history of how instructions have been added to and removed from SlotIndexes's maps.	2023-10-09 11:44:41 +01:00
Lei	529ad40e05	[PowerPC] Fix missing kill flag update for XVCVDPSP transformations (#67997 ) Add transformed register to kill flag work list for XVCVDPSP transformations. Ref: reviews.llvm.org/D133103	2023-10-06 10:24:54 -04:00
Kishan Parmar	696ea67f19	Disable call to fma for soft-float The PowerPC backend generates calls to libc functions for soft-float, regardless of the -nostdlib/-ffreestanding flags. fma is not a function provided by compiler-rt builtins and thus should not be generated here. PR: #55230 (https://github.com/llvm/llvm-project/issues/55230) Below is patch given by @nemanjai Reviewed By: jhibbits Differential Revision: https://reviews.llvm.org/D156344	2023-09-28 14:06:54 +05:30
Qiu Chaofan	cc627828f5	Pre-commit some PowerPC test cases	2023-09-28 15:51:14 +08:00
Wael Yehia	da55b1b52f	[XCOFF] Do not generate the special .ref for zero-length sections (#66805 ) Co-authored-by: Wael Yehia <wyehia@ca.ibm.com>	2023-09-28 01:33:41 -04:00
esmeyi	d7195c57d8	Reland https://reviews.llvm.org/D159073. The patch failed in test-suite due to a liveness error after rebasing on https://reviews.llvm.org/D133103, and now it's fixed. ``` [PowerPC][Peephole] Combine rldicl/rldicr and andi/andis after isel. Summary: rldicl/rldicr can be eliminated if it's used to clear the high-order or low-order n bits and all bits cleared will be ANDed with 0 by andi/andis. Or they can be folded to `andi 0` if all bits to AND are already zero in the input. Reviewed By: qiucf, shchenz Differential Revision: https://reviews.llvm.org/D159073 ```	2023-09-26 06:24:47 -04:00
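
The two cases of the and/or fold described in the InstCombine entries for #70335 can be sanity-checked by brute force. The sketch below is a Python illustration, not LLVM code: the helper `check` and the example predicates for `c` are hypothetical, chosen only so that replacing a with b lets `c` fold to a constant.

```python
from itertools import product

def check(lhs, rhs, bits=3):
    """Return True if lhs(a, b) == rhs(a, b) for every pair of unsigned values."""
    values = range(1 << bits)
    return all(lhs(a, b) == rhs(a, b) for a, b in product(values, repeat=2))

# Basic case, c = (a <= b): replacing a with b gives (b <= b), which is true,
# so (a == b) & c folds to a == b.
case1 = check(lambda a, b: (a == b) and (a <= b),
              lambda a, b: a == b)

# Basic case, c = (a < b): replacing a with b gives (b < b), which is false,
# so (a == b) & c folds to false.
case2 = check(lambda a, b: (a == b) and (a < b),
              lambda a, b: False)

# Second case, (a != b) & c with c = (a < b): c is already false whenever
# a == b, so the whole expression can be replaced with c.
case3 = check(lambda a, b: (a != b) and (a < b),
              lambda a, b: a < b)

print(case1, case2, case3)
```

An exhaustive check over a 3-bit domain is only evidence for these particular predicates; it says nothing about how simplifyWithOpReplaced() itself decides that `c` folds.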

3755 Commits
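
The arithmetic behind the oversized-alloca entry (#71472) can be worked through by hand. The sketch below is a Python model, not the SelectionDAG code: `to_unsigned` and `alloca_size` are hypothetical helpers, and the model assumes that limiting the size constant to the pointer width behaves like wrapping modulo 2^32.

```python
POINTER_BITS = 32   # pointer width in 32-bit mode
I32_SIZE = 4        # size of an i32 in bytes

def to_unsigned(value, bits):
    """Reinterpret a possibly negative integer as an unsigned value of the given width."""
    return value & ((1 << bits) - 1)

def alloca_size(elem_size, num_elements, bits=POINTER_BITS):
    """Byte size of the allocation, wrapped to the pointer width (assumed model)."""
    count = to_unsigned(num_elements, bits)
    return to_unsigned(elem_size * count, bits)

# alloca i32, i32 -1: NumElements -1 is treated as 4294967295.
size_a = alloca_size(I32_SIZE, -1)

# alloca [4294967295 x i32], i32 1: one element whose type is a 4294967295-entry array.
size_b = alloca_size(I32_SIZE * 4294967295, 1)

print(hex(size_a), hex(size_b))
```

Under this model both forms request the same wrapped size, which is consistent with the commit's statement that they generate identical PowerPC assembly in 32-bit mode.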