llvm-project

Author	SHA1	Message	Date
Antonio Frighetto	d79c4c1119	[CGP] Regenerate `revert-constant-ptr-propagation-on-calls.ll` test (NFC) Multiple buildbots were previously failing.	2024-09-02 09:55:43 +02:00
Antonio Frighetto	e4e0dfb0c2	[CGP] Undo constant propagation of pointers across calls It may be profitable to revert SCCP propagation of C++ static values, if such constants are pointers, in order to avoid redundant pointer computation, since the method returning the constant is non-removable.	2024-09-02 09:33:23 +02:00
Antonio Frighetto	ed6d9f6d2a	[CGP] Introduce test for PR102926 (NFC)	2024-09-02 09:33:23 +02:00
Stephen Tozer	3d08ade7bd	[ExtendLifetimes] Implement llvm.fake.use to extend variable lifetimes (#86149) This patch is part of a set of patches that add an `-fextend-lifetimes` flag to clang, which extends the lifetimes of local variables and parameters for improved debuggability. In addition to that flag, the patch series adds a pragma to selectively disable `-fextend-lifetimes`, and an `-fextend-this-ptr` flag which functions as `-fextend-lifetimes` for this pointers only. All changes and tests in these patches were written by Wolfgang Pieb (@wolfy1961), while Stephen Tozer (@SLTozer) has handled review and merging. The extend lifetimes flag is intended to eventually be set on by `-Og`, as discussed in the RFC here: https://discourse.llvm.org/t/rfc-redefine-og-o1-and-add-a-new-level-of-og/72850 This patch implements a new intrinsic instruction in LLVM, `llvm.fake.use` in IR and `FAKE_USE` in MIR, that takes a single operand and has no effect other than "using" its operand, to ensure that its operand remains live until after the fake use. This patch does not emit fake uses anywhere; the next patch in this sequence causes them to be emitted from the clang frontend, such that for each variable (or this) a fake.use operand is inserted at the end of that variable's scope, using that variable's value. This patch covers everything post-frontend, which is largely just the basic plumbing for a new intrinsic/instruction, along with a few steps to preserve the fake uses through optimizations (such as moving them ahead of a tail call or translating them through SROA). Co-authored-by: Stephen Tozer <stephen.tozer@sony.com>	2024-08-29 17:53:32 +01:00
Simon Pilgrim	f673882323	[X86] Allow speculative BSR/BSF instructions on targets with CMOV (#102885) Currently targets without LZCNT/TZCNT won't speculate with BSR/BSF instructions in case they have a zero value input, meaning we always insert a test+branch for the zero-input case. This patch proposes we allow speculation if the target has CMOV, and perform a branchless select instead to handle the zero input case. This will predominantly help x86-64 targets where we haven't set any particular cpu target. We already always perform BSR/BSF instructions if we were lowering a CTLZ/CTTZ_ZERO_UNDEF instruction.	2024-08-22 11:11:00 +01:00
Sjoerd Meijer	70e8c982d0	[AArch64] Bail out for scalable vecs in areExtractShuffleVectors (#105484) The added test triggers the following assert in `areExtractShuffleVectors` that is called from `shouldSinkOperands`: Assertion `(!isScalable() || isZero()) && "Request for a fixed element count on a scalable object"' failed. I don't think scalable types can be extract shuffles, so bail early if this is the case.	2024-08-21 15:27:09 +01:00
Noah Goldstein	e4c67ba67e	Recommit "[CodeGenPrepare] Folding `urem` with loop invariant value" Was missing remainder on `Start` value. Also changed logic as nikic suggested (getting loop from `PN` instead of `Rem`). The prior impl increased the complexity of the code and made debugging it more difficult. Closes #104877	2024-08-20 09:17:49 -07:00
Noah Goldstein	9b25ad818c	[CodeGenPrepare][X86] Add tests for fixing `urem` transform; NFC	2024-08-20 09:17:49 -07:00
Noah Goldstein	731ae694a3	Revert "[CodeGenPrepare] Folding `urem` with loop invariant value" This reverts commit c64ce8bf283120fd145a57d0e61f9697f719139d. Seems to be causing stage2 failures on buildbots. Reverting while I investigate.	2024-08-18 20:36:35 -07:00
Noah Goldstein	c64ce8bf28	[CodeGenPrepare] Folding `urem` with loop invariant value ``` for(i = Start; i < End; ++i) Rem = (i nuw+ IncrLoopInvariant) u% RemAmtLoopInvariant; ``` -> ``` Rem = (Start nuw+ IncrLoopInvariant) % RemAmtLoopInvariant; for(i = Start; i < End; ++i, ++rem) Rem = rem == RemAmtLoopInvariant ? 0 : Rem; ``` In its current state, this only applies if `IncrLoopInvariant` and `Start` are both zero. Alive2 seemed unable to prove this (see: https://alive2.llvm.org/ce/z/ATGDp3 which is clearly wrong but still checks out...) so wrote an exhaustive test here: https://godbolt.org/z/WYa561388 Closes #96625	2024-08-18 15:58:24 -07:00
Noah Goldstein	f16125a13c	[CodeGenPrepare][X86] Add tests for folding `urem` with loop invariant value; NFC	2024-08-18 15:58:24 -07:00
David Green	0e124537aa	[AArch64] Sink operands to fmuladd. (#102297) A fmuladd can be treated as a fma when sinking operands to the intrinsic, similar to D126234. Addresses a small part of #102195	2024-08-09 11:48:37 +01:00
Fangrui Song	8ea31db272	[CodeGenPrepare] Use MapVector to stabilize iteration order DenseMap iteration order is not guaranteed to be deterministic. Without the change, llvm/test/Transforms/CodeGenPrepare/X86/statepoint-relocate.ll would fail when `combineHashValue` changes (#95970). Fixes: dba7329ebb0dbe1fabb3faaedfd31da3b8bd611d	2024-06-18 17:19:51 -07:00
Stephen Tozer	094572701d	[RemoveDIs] Print IR with debug records by default (#91724) This patch makes the final major change of the RemoveDIs project, changing the default IR output from debug intrinsics to debug records. This is expected to break a large number of tests: every single one that tests for uses or declarations of debug intrinsics and does not explicitly disable writing records. If this patch has broken your downstream tests (or upstream tests on a configuration I wasn't able to run): 1. If you need to immediately unblock a build, pass `--write-experimental-debuginfo=false` to LLVM's option processing for all failing tests (remember to use `-mllvm` for clang/flang to forward arguments to LLVM). 2. For most test failures, the changes are trivial and mechanical, enough that they can be done by script; see the migration guide for a guide on how to do this: https://llvm.org/docs/RemoveDIsDebugInfo.html#test-updates 3. If any tests fail for reasons other than FileCheck check lines that need updating, such as assertion failures, that is most likely a real bug with this patch and should be reported as such. For more information, see the recent PSA: https://discourse.llvm.org/t/psa-ir-output-changing-from-debug-intrinsics-to-debug-records/79578	2024-06-14 15:07:27 +01:00
Nikita Popov	d10b76552f	[ConstantFold] Remove notional over-indexing fold (#93697) The data-layout independent constant folding currently has some rather gnarly code for canonicalizing GEP indices to reduce "notional overindexing", and then infers inbounds based on that canonicalization. Now that we canonicalize to i8 GEPs, this canonicalization is essentially useless, as we'll discard it as soon as the GEP hits the data-layout aware constant folder anyway. As such, I'd like to remove this code entirely. This shouldn't have any impact on optimization capabilities.	2024-05-30 08:36:44 +02:00
wanglei	9d4f7f44b6	[test][LoongArch] Add -mattr=+d option. NFC Because most of tests assume target-abi=`lp64d`, adding the corresponding feature is reasonable. rg -l loongarch -g '!*.s' | xargs sed -i '/mtriple=loongarch/ {/-mattr=/!{/target-abi/! s/mtriple=loongarch.. /&-mattr=+d /}}'	2024-05-14 20:23:04 +08:00
Yingwei Zheng	ab12bba0aa	[CGP] Drop poison-generating flags after hoisting (#90382) See the following case: ``` define i8 @src1(i8 %x) { entry: %cmp = icmp eq i8 %x, -1 br i1 %cmp, label %exit, label %if.then if.then: %inc = add nuw nsw i8 %x, 1 br label %exit exit: %retval = phi i8 [ %inc, %if.then ], [ -1, %entry ] ret i8 %retval } define i8 @tgt1(i8 %x) { entry: %inc = add nuw nsw i8 %x, 1 %0 = icmp eq i8 %inc, 0 br i1 %0, label %exit, label %if.then if.then: ; preds = %entry br label %exit exit: ; preds = %if.then, %entry %retval = phi i8 [ %inc, %if.then ], [ -1, %entry ] ret i8 %retval } ``` `optimizeBranch` converts `icmp eq X, -1` into cmp to zero on RISC-V and hoists the add into the entry block. Poison-generating flags should be dropped as they no longer hold. Proof: https://alive2.llvm.org/ce/z/sP7mvK Fixes https://github.com/llvm/llvm-project/issues/90380	2024-04-29 15:51:49 +08:00
Alex Bradbury	1c8410a67d	[CodeGenPrepare] Preserve flags (such as nsw/nuw) in SinkCast (#89904) As demonstrated in the test change, when deciding to sink a trunc we were losing its flags. This patch moves to cloning the original instruction instead.	2024-04-25 15:05:07 +01:00
Alex Bradbury	2554a85c03	[CodeGenPrepare][test] Add test for sinking of truncs demonstrating nsw/nuw are dropped SinkCast creates a new cast with the same type and inputs, which drops the nsw/nuw flags. Reviewed as part of <https://github.com/llvm/llvm-project/pull/89904> but split out so I can land the test separately.	2024-04-25 15:01:55 +01:00
Matthias Braun	652bcf685c	CodeGenPrepare: Add support for llvm.threadlocal.address address-mode sinking (#87844) Depending on the TLSMode many thread-local accesses on x86 can be expressed by adding a %fs: segment register to an addressing mode. Even if there are multiple users of a `llvm.threadlocal.address` intrinsic it is generally not worth sharing the value in a register; it is better to fold the %fs access into multiple addressing modes. Hence this changes CodeGenPrepare to duplicate the `llvm.threadlocal.address` intrinsic as necessary. Introduces a new `TargetLowering::addressingModeSupportsTLS` callback that allows targets to indicate whether TLS accesses can be part of an addressing mode. This is fixing a performance problem, as this folding of TLS-accesses into multiple addressing modes happened naturally before the introduction of the `llvm.threadlocal.address` intrinsic, but regressed due to `SelectionDAG` keeping things in registers when accessed across basic blocks, so CodeGenPrepare needs to duplicate to mitigate this. We see a ~0.5% recovery in a codebase with heavy TLS usage (HHVM). This fixes most of #87437	2024-04-17 12:48:02 -07:00
wanglei	8e4b0890a6	[LoongArch] Return true from shouldConsiderGEPOffsetSplit (#88371) Not performing GEP splits can block important optimizations, for example by preventing the element indices / member offsets from being (partially) folded into load/store instruction immediates.	2024-04-15 09:01:04 +08:00
wanglei	5fc8a190b3	[LoongArch] Pre commit test for #88371. NFC	2024-04-12 17:57:28 +08:00
Yingwei Zheng	38a44bdc93	[CodeGenPrepare] Reverse the canonicalization of isInf/isNanOrInf (#81572) In commit `2b582440c1`, we canonicalize the isInf/isNanOrInf idiom into fabs+fcmp for better analysis/codegen (See also the discussion in https://github.com/llvm/llvm-project/pull/76338). This patch reverses the fabs+fcmp to `is.fpclass`. If the `is.fpclass` is not supported by the target, it will be expanded by TLI. Fixes the regression introduced by `2b582440c1` and https://github.com/llvm/llvm-project/pull/80414#issuecomment-1936374206.	2024-03-18 18:27:45 +08:00
Stephen Tozer	d128448efd	Revert "Reapply "[RemoveDIs] Print non-intrinsic debug info in textual IR output (#79281)"" Reverted due to some test failures on some buildbots. https://lab.llvm.org/buildbot/#/builders/67/builds/14669 This reverts commit aa436493ab7ad4cf323b0189c15c59ac9dc293c7.	2024-02-27 10:17:24 +00:00
Stephen Tozer	aa436493ab	Reapply "[RemoveDIs] Print non-intrinsic debug info in textual IR output (#79281)" Fixes the prior issue in which the symbol for a cl-arg was unavailable to some binaries. This reverts commit dc06d75ab27b4dcae2940fc386fadd06f70faffe.	2024-02-27 09:59:08 +00:00
Stephen Tozer	dc06d75ab2	Revert "[RemoveDIs] Print non-intrinsic debug info in textual IR output (#79281)" Reverted due to failures on buildbots, where a new cl flag was placed in the wrong file, resulting in link errors. https://lab.llvm.org/buildbot/#/builders/198/builds/8548 This reverts commit 0b398256b3f72204ad1f7c625efe4990204e898a.	2024-02-26 18:49:18 +00:00
Stephen Tozer	0b398256b3	[RemoveDIs] Print non-intrinsic debug info in textual IR output (#79281) This patch adds support for printing the proposed non-instruction debug info ("RemoveDIs") out to textual IR. This patch does not add any bitcode support, parsing support, or documentation. Printing of the new format is controlled by a flag added in this patch, `--write-experimental-debuginfo`, which defaults to false. The new format will be printed iff this flag is true, so whether we use the IR format is completely independent of whether we use non-instruction debug info during LLVM passes (which is controlled by the `--try-experimental-debuginfo-iterators` flag). Even with the flag disabled, some existing tests need to be updated, as this patch causes debug intrinsic declarations to be changed in a round trip, such that they always appear at the end of a module and have no attributes (this has no functional change on the module). The design of this new IR format was proposed previously on Discourse, and any further discussion about the design can still be contributed there: https://discourse.llvm.org/t/rfc-debuginfo-proposed-changes-to-the-textual-ir-representation-for-debug-values/73491	2024-02-26 18:22:05 +00:00
Nikita Popov	2d69827c5c	[Transforms] Convert tests to opaque pointers (NFC)	2024-02-05 11:57:34 +01:00
Nick Anderson	f1ec0d12bb	Port CodeGenPrepare to new pass manager (and BasicBlockSectionsProfil… (#77182) Port CodeGenPrepare to new pass manager and dependency BasicBlockSectionsProfileReader Fixes: #75380 Co-authored-by: Krishna-13-cyber <84722531+Krishna-13-cyber@users.noreply.github.com>	2024-01-09 13:32:59 +07:00
Simon Pilgrim	7648371c25	Revert 4d7c5ad58467502fcbc433591edff40d8a4d697d "[NewPM] Update CodeGenPreparePass reference in CodeGenPassBuilder (#77054)" Revert e0c554ad87d18dcbfcb9b6485d0da800ae1338d1 "Port CodeGenPrepare to new pass manager (and BasicBlockSectionsProfil… (#75380)" Revert #75380 and #77054 as they were breaking EXPENSIVE_CHECKS buildbots: https://lab.llvm.org/buildbot/#/builders/104	2024-01-05 12:28:10 +00:00
Nick Anderson	e0c554ad87	Port CodeGenPrepare to new pass manager (and BasicBlockSectionsProfil… (#75380) Port CodeGenPrepare to new pass manager and dependency BasicBlockSectionsProfileReader Fixes: #64560 Co-authored-by: Krishna-13-cyber <84722531+Krishna-13-cyber@users.noreply.github.com>	2024-01-05 13:47:56 +07:00
Jeremy Morse	f85a38e21c	Follow up to d0858bffa11, add missing REQUIRES x86	2023-12-06 17:48:50 +00:00
Jeremy Morse	d0858bffa1	[DebugInfo][RemoveDIs] Maintain DPValues on skipped instrs in CGP (#74602) It turns out that CodeGenPrepare will skip over consecutive select instructions as it knows it can optimise them all at the same time. This is unfortunate for the RemoveDIs project to remove intrinsic-based debug-info, because that means debug-info attached to those skipped instructions doesn't get seen by optimizeInst and so doesn't get updated. Add code to handle debug-info on those skipped instructions manually. This code will also have been slower when it had dbg.values stuffed in between instructions, but with RemoveDIs it'll go faster because the dbg.values won't break up the select sequence.	2023-12-06 17:25:33 +00:00
zhongyunde 00443407	d6f4d5209f	[CGP][AArch64] Rebase the common base offset for better ISel When all the large constant offsets share the same value in bits 12 to 23, fold: add x8, x0, #2031, lsl #12; add x8, x8, #960; ldr x9, [x8, x8]; ldr x8, [x8, #2056] into: add x8, x0, #2031, lsl #12; ldr x9, [x8, #960]; ldr x8, [x8, #3016]	2023-12-05 09:01:41 +08:00
Jeremy Morse	3ef98bcd46	[DebugInfo][RemoveDIs] Support maintaining DPValues in CodeGenPrepare (#73660) CodeGenPrepare needs to support the maintenance of DPValues, the non-instruction replacement for dbg.value intrinsics. This means there are a few functions we need to duplicate or replicate the functionality of: * fixupDbgValue for setting users of sunk addr GEPs, * The remains of placeDbgValues needs a DPValue implementation for sinking * Rollback of RAUWs needs to update DPValues * Rollback of instruction removal needs supporting (see github #73350) * A few places where we have to use iterators rather than instructions. There are three places where we have to use the setHeadBit call on iterators to indicate which portion of debug-info records we're about to splice around. This is because CodeGenPrepare, unlike other optimisation passes, is very much concerned with which block an operation occurs in and where in the block instructions are because it's preparing things to be in a format that's good for SelectionDAG. There isn't a large amount of test coverage for debuginfo behaviours in this pass, hence I've added some more.	2023-11-30 15:29:05 +00:00
Nikita Popov	c9832da350	[CGP] Drop nneg flag when moving zext past instruction (#72103) Fix the issue by not reusing the zext at all. The code already handles creation of new zexts if more than one is needed. Always use that code-path instead of trying to reuse the old zext in some case. (Alternatively we could also drop poison-generating flags on the old zext, but it seems cleaner to not reuse it at all, especially if it's not always possible anyway.) Fixes https://github.com/llvm/llvm-project/issues/72046.	2023-11-14 09:03:06 +01:00
Alex Richardson	e39f6c1844	[opt] Infer DataLayout from triple if not specified There are many tests that specify a target triple/CPU flags but no DataLayout which can lead to IR being generated that has unusual behaviour. This commit attempts to use the default DataLayout based on the relevant flags if there is no explicit override on the command line or in the IR file. It is not currently possible to differentiate an empty `target datalayout = ""` in the IR file from a missing datalayout, since the current APIs don't allow detecting this case. If it is considered useful to support this case (instead of passing "-data-layout=" on the command line), I can change IR parsers to track whether they have seen such a directive and change the callback type. Differential Revision: https://reviews.llvm.org/D141060	2023-10-26 12:07:37 -07:00
Paul Walker	6383785bad	[SVE][CodeGenPrepare] Sink address calculations that match SVE gather/scatter addressing modes. (#66996) SVE supports scalar+vector and scalar+extw(vector) addressing modes. However, the masked gather/scatter intrinsics take a vector of addresses, which means address computations can be hoisted out of loops. This is especially true for things like offsets where the true size of offsets is lost by the time you get to code generation. This is problematic because it forces the code generator to legalise towards `<vscale x 2 x ty>` vectors that will not maximise bandwidth if the main block datatype is in fact i32 or smaller. This patch sinks GEPs and extends for cases where one of the above addressing modes can be used. NOTE: There are cases where it would be better to split the extend in two with one half hoisted out of a loop and the other within the loop. Whilst true I think this switch of default is still better than before because the extra extends are an improvement over being forced to split a gather/scatter.	2023-10-11 13:20:08 +01:00
Jay Foad	7b3bbd83c0	Revert "[CodeGen] Really renumber slot indexes before register allocation (#67038)" This reverts commit 2501ae58e3bb9a70d279a56d7b3a0ed70a8a852c. Reverted due to various buildbot failures.	2023-10-09 12:31:32 +01:00
Jay Foad	2501ae58e3	[CodeGen] Really renumber slot indexes before register allocation (#67038) PR #66334 tried to renumber slot indexes before register allocation, but the numbering was still affected by list entries for instructions which had been erased. Fix this to make the register allocator's live range length heuristics even less dependent on the history of how instructions have been added to and removed from SlotIndexes's maps.	2023-10-09 11:44:41 +01:00
Alex Richardson	83c4227ab7	Auto-generate test checks for tests affected by D141060 These files had manual CHECK lines which make the diff from D141060 very difficult to review.	2023-10-04 10:51:35 -07:00
Jay Foad	e0919b189b	[CodeGen] Renumber slot indexes before register allocation (#66334) RegAllocGreedy uses SlotIndexes::getApproxInstrDistance to approximate the length of a live range for its heuristics. Renumbering all slot indexes with the default instruction distance ensures that this estimate will be as accurate as possible, and will not depend on the history of how instructions have been added to and removed from SlotIndexes's maps. This also means that enabling -early-live-intervals, which runs the SlotIndexes analysis earlier, will not cause large amounts of churn due to different register allocator decisions.	2023-09-19 11:18:12 +01:00
Serguei Katkov	a701b7e368	[CGP] Remove dead PHI nodes before elimination of mostly empty blocks Before eliminating mostly empty blocks it makes sense to remove dead PHI nodes. This opens up more opportunities for elimination and also eliminates the dead code itself. The change caused many unit tests to fail; I have updated some of them, and for another one I disabled this optimization. The pattern I observed in the tests is that there is an infinite loop without side effects. As a result, after elimination of the dead PHI node all other related instructions are also removed and the tests stop checking what they are expected to check. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D158503	2023-08-29 04:35:06 +00:00
Harvin Iriawan	db158c7c83	[AArch64] Update generic sched model to A510 Refresh of the generic scheduling model to use A510 instead of A55. Main benefits are to the little core, and introducing SVE scheduling information. Changes tested on various OoO cores, no performance degradation is seen. Differential Revision: https://reviews.llvm.org/D156799	2023-08-21 12:25:15 +01:00
Jordan Rupprecht	f5b5a30858	Revert "[CodeGenPrepare][NFC] Update the dominator tree instead of rebuilding it" This reverts commit 0b1d1cdb89322c277baf5221218a830195fef9d4. It causes a clang crash. Details will be posted to D153638.	2023-08-01 23:08:55 -07:00
Momchil Velikov	0b1d1cdb89	[CodeGenPrepare][NFC] Update the dominator tree instead of rebuilding it Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D153638	2023-08-01 18:07:03 +01:00
Matthias Braun	02ba5b8c6b	Ignore load/store until stack address computation No longer conservatively assume a load/store accesses the stack when we can prove that we did not compute any stack-relative address up to this point in the program. We do this in a cheap not-quite-a-dataflow-analysis: Assume `NoStackAddressUsed` when all predecessors of a block already guarantee it. Process blocks in reverse post order to guarantee that except for loop headers we have processed all predecessors of a block before processing the block itself. For loops we accept the conservative answer as they are unlikely to be shrink-wrappable anyway. Differential Revision: https://reviews.llvm.org/D152213	2023-06-26 13:50:36 -07:00
Nikita Popov	b7bd3a734c	[CGP] Fix infinite loop in icmp operand swapping Don't swap the operands if they're the same. Fixes the issue reported at https://reviews.llvm.org/D152541#4427017.	2023-06-16 15:50:12 +02:00
Serguei Katkov	d119c386cd	[CGP] Additional tests for removing operand of assume. NFC.	2023-06-16 11:52:46 +07:00
Serguei Katkov	d57ed844fe	[CGP] Add test to show the missed case in remove llvm.assume	2023-06-07 17:20:57 +07:00
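
The `urem` fold from c64ce8bf28 (recommitted as e4c67ba67e with the remainder taken on the `Start` value) can be checked with a small model. This is only a sketch in Python, not the CodeGenPrepare implementation: the function names are hypothetical, and because Python integers do not wrap, the `nuw` condition on the add is assumed to hold rather than verified.

```python
def urem_reference(start, end, incr, rem_amt):
    """Original loop: recompute the remainder on every iteration."""
    return [(i + incr) % rem_amt for i in range(start, end)]


def urem_folded(start, end, incr, rem_amt):
    """Folded loop: one remainder before the loop, then increment and wrap."""
    out = []
    rem = (start + incr) % rem_amt  # computed once, outside the loop
    for _ in range(start, end):
        out.append(rem)
        rem += 1
        if rem == rem_amt:  # wrap with a compare/select instead of a division
            rem = 0
    return out


def exhaustive_check(limit=8):
    """Compare both loops for all small starts, trip counts, increments, divisors."""
    for start in range(limit):
        for trip in range(limit):
            for incr in range(limit):
                for rem_amt in range(1, limit):
                    args = (start, start + trip, incr, rem_amt)
                    if urem_reference(*args) != urem_folded(*args):
                        return False
    return True
```

Calling `exhaustive_check()` compares the two loops over a small parameter space, in the spirit of the exhaustive test linked from the commit message.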

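The reasoning in ab12bba0aa (dropping `nuw`/`nsw` after hoisting the add) can be illustrated by enumerating all `i8` inputs. The sketch below is a hypothetical Python model of `add i8 %x, 1`, not LLVM code: it reports inputs for which the flagged add would be poison in the hoisted position even though the original program never executed the add for them.

```python
def add_i8_flags(x):
    """Model `add i8 x, 1`: return (wrapped_result, violates_nuw, violates_nsw)."""
    unsigned = x & 0xFF
    signed = unsigned - 256 if unsigned >= 128 else unsigned
    result = (unsigned + 1) & 0xFF
    return result, unsigned + 1 > 0xFF, signed + 1 > 127


def newly_poisoned_inputs():
    """Inputs where the hoisted add is poison but the original never ran it."""
    bad = []
    for x in range(256):
        _, nuw, nsw = add_i8_flags(x)
        executed_in_src = x != 0xFF  # src only reaches the add when x != -1
        if (nuw or nsw) and not executed_in_src:
            bad.append(x)
    return bad


def compare_is_equivalent_without_flags():
    """With flags dropped, `inc == 0` matches `x == -1` for every i8 value."""
    return all((add_i8_flags(x)[0] == 0) == (x == 0xFF) for x in range(256))
```

In this model the only newly affected input is `x == -1` (0xFF), which is exactly the value the hoisted compare needs to test, so keeping the flags would make the branch condition poison on that path.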

427 Commits