llvm-project

Author	SHA1	Message	Date
Youngsuk Kim	caa32e6d6f	[llvm][LSR] Fix where invariant on ScaledReg & Scale is violated (#112576) Comments attached to the `ScaledReg` field of `struct Formula` explain that `ScaledReg` must be non-null when `Scale` is non-zero. This fixes up a code path where this invariant is violated. Also, add an assert to ensure this invariant holds true. Without this patch, the compiler aborts with the attached test case. Fixes #76504	2024-10-17 10:47:44 -04:00
Jeremy Morse	e6bf48d110	[X86] Don't request 0x90 nop filling in p2align directives (#110134) As of rev ea222be0d, LLVM's assembler will actually try to honour the "fill value" part of p2align directives. X86 printed these as 0x90, which isn't actually what it wanted: we want multi-byte nops for .text padding. Compiling via a textual assembly file produces single-byte nop padding since ea222be0d but the built-in assembler will produce multi-byte nops. This divergent behaviour is undesirable. To fix: don't set the byte padding field for x86, which allows the assembler to pick multi-byte nops. Test that we get the same multi-byte padding when compiled via textual assembly or directly to object file. Added same-align-bytes-with-llasm-llobj.ll to that effect, updated numerous other tests to not contain check-lines for the explicit padding.	2024-10-02 11:14:05 +01:00
Nikita Popov	9f3d1695eb	[SCEVExpander] Preserve gep nuw during expansion (#102133 ) When expanding SCEV adds to geps, transfer the nuw flag to the resulting gep. (Note that this doesn't apply to IV increment GEPs, which go through a different code path.)	2024-10-02 11:45:00 +02:00
Sergey Kachkov	1f2a634c44	Reland "[LSR] Do not create duplicated PHI nodes while preserving LCSSA form" (#107380 ) Motivating example: https://godbolt.org/z/eb97zrxhx Here we have 2 induction variables in the loop: one is corresponding to i variable (add rdx, 4), the other - to res (add rax, 2). The second induction variable can be removed by rewriteLoopExitValues() method (final value of res at loop exit is unroll_iter * -2); however, this doesn't happen because we have duplicated LCSSA phi nodes at loop exit: ``` ; Preheader: for.body.preheader.new: ; preds = %for.body.preheader %unroll_iter = and i64 %N, -4 br label %for.body ; Loop: for.body: ; preds = %for.body, %for.body.preheader.new %lsr.iv = phi i64 [ %lsr.iv.next, %for.body ], [ 0, %for.body.preheader.new ] %i.07 = phi i64 [ 0, %for.body.preheader.new ], [ %inc.3, %for.body ] %inc.3 = add nuw i64 %i.07, 4 %lsr.iv.next = add nsw i64 %lsr.iv, -2 %niter.ncmp.3.not = icmp eq i64 %unroll_iter, %inc.3 br i1 %niter.ncmp.3.not, label %for.end.loopexit.unr-lcssa.loopexit, label %for.body, !llvm.loop !7 ; Exit blocks for.end.loopexit.unr-lcssa.loopexit: ; preds = %for.body %inc.3.lcssa = phi i64 [ %inc.3, %for.body ] %lsr.iv.next.lcssa11 = phi i64 [ %lsr.iv.next, %for.body ] %lsr.iv.next.lcssa = phi i64 [ %lsr.iv.next, %for.body ] br label %for.end.loopexit.unr-lcssa ``` rewriteLoopExitValues requires %lsr.iv.next value to have only 2 uses: one in LCSSA phi node, the other - in induction phi node. Here we have 3 uses of this value because of duplicated lcssa nodes, so the transform doesn't apply and leads to an extra add operation inside the loop. The proposed solution is to accumulate inserted instructions that will require LCSSA form update into SetVector and then call formLCSSAForInstructions for this SetVector once, so the same instructions don't process twice. Reland fixes the issue with preserve-lcssa.ll test: it fails in the situation when x86_64-unknown-linux-gnu target is unavailable in opt. 
The changes are moved into separate duplicated-phis.ll test with explicit x86 target requirement to fix bots which are not building this target.	2024-09-09 16:14:51 +03:00
dyung	2bf551e600	Revert "[LSR] Do not create duplicated PHI nodes while preserving LCSSA form" (#107666 ) Reverts llvm/llvm-project#107380 Change is causing the test preserve-lcssa.ll to fail on at least 2 build bots: - https://lab.llvm.org/buildbot/#/builders/190/builds/5231 - https://lab.llvm.org/buildbot/#/builders/161/builds/1855	2024-09-06 19:54:26 -07:00
Sergey Kachkov	2cb4d1b1bd	[LSR] Do not create duplicated PHI nodes while preserving LCSSA form (#107380 ) Motivating example: https://godbolt.org/z/eb97zrxhx Here we have 2 induction variables in the loop: one is corresponding to i variable (add rdx, 4), the other - to res (add rax, 2). The second induction variable can be removed by rewriteLoopExitValues() method (final value of res at loop exit is unroll_iter * -2); however, this doesn't happen because we have duplicated LCSSA phi nodes at loop exit: ``` ; Preheader: for.body.preheader.new: ; preds = %for.body.preheader %unroll_iter = and i64 %N, -4 br label %for.body ; Loop: for.body: ; preds = %for.body, %for.body.preheader.new %lsr.iv = phi i64 [ %lsr.iv.next, %for.body ], [ 0, %for.body.preheader.new ] %i.07 = phi i64 [ 0, %for.body.preheader.new ], [ %inc.3, %for.body ] %inc.3 = add nuw i64 %i.07, 4 %lsr.iv.next = add nsw i64 %lsr.iv, -2 %niter.ncmp.3.not = icmp eq i64 %unroll_iter, %inc.3 br i1 %niter.ncmp.3.not, label %for.end.loopexit.unr-lcssa.loopexit, label %for.body, !llvm.loop !7 ; Exit blocks for.end.loopexit.unr-lcssa.loopexit: ; preds = %for.body %inc.3.lcssa = phi i64 [ %inc.3, %for.body ] %lsr.iv.next.lcssa11 = phi i64 [ %lsr.iv.next, %for.body ] %lsr.iv.next.lcssa = phi i64 [ %lsr.iv.next, %for.body ] br label %for.end.loopexit.unr-lcssa ``` rewriteLoopExitValues requires %lsr.iv.next value to have only 2 uses: one in LCSSA phi node, the other - in induction phi node. Here we have 3 uses of this value because of duplicated lcssa nodes, so the transform doesn't apply and leads to an extra add operation inside the loop. The proposed solution is to accumulate inserted instructions that will require LCSSA form update into SetVector and then call formLCSSAForInstructions for this SetVector once, so the same instructions don't process twice.	2024-09-06 18:39:47 +03:00
Shan Huang	d83d09facd	[DebugInfo][LoopStrengthReduce] Fix missing debug location updates (#97519 ) Fix #97510 . Note that, for the new phi instruction `NewPH`, which replaces the old phi `PH` and the cast `ShadowUse`, I choose to propagate the debug location of `PH` to it, because the cast is eliminated according to the optimization semantics.	2024-07-15 09:44:18 +08:00
Philip Reames	cb76896d6e	[SCEVExpander] Recognize urem idiom during expansion (#96005) If we have a urem expression, emitting it as a urem is significantly better than letting the full expansion kick in. We have the risk of a udiv or mul which could have previously been shared, but losing that seems like a reasonable tradeoff for being able to round trip a urem w/o modification.	2024-06-19 08:40:04 -07:00
Stephen Tozer	094572701d	[RemoveDIs] Print IR with debug records by default (#91724 ) This patch makes the final major change of the RemoveDIs project, changing the default IR output from debug intrinsics to debug records. This is expected to break a large number of tests: every single one that tests for uses or declarations of debug intrinsics and does not explicitly disable writing records. If this patch has broken your downstream tests (or upstream tests on a configuration I wasn't able to run): 1. If you need to immediately unblock a build, pass `--write-experimental-debuginfo=false` to LLVM's option processing for all failing tests (remember to use `-mllvm` for clang/flang to forward arguments to LLVM). 2. For most test failures, the changes are trivial and mechanical, enough that they can be done by script; see the migration guide for a guide on how to do this: https://llvm.org/docs/RemoveDIsDebugInfo.html#test-updates 3. If any tests fail for reasons other than FileCheck check lines that need updating, such as assertion failures, that is most likely a real bug with this patch and should be reported as such. For more information, see the recent PSA: https://discourse.llvm.org/t/psa-ir-output-changing-from-debug-intrinsics-to-debug-records/79578	2024-06-14 15:07:27 +01:00
Freddy Ye	4def1ce101	Reland "[X86] Remove knl/knm specific ISAs supports (#92883 )" (#93136 ) This reverts commit aa4069ea96e5eb62bc8c7895b9d920f129611b3a.	2024-05-24 13:46:34 +08:00
Freddy Ye	aa4069ea96	Revert "[X86] Remove knl/knm specific ISAs supports (#92883 )" (#93123 ) This reverts commit 282d2ab58f56c89510f810a43d4569824a90c538.	2024-05-23 10:25:23 +08:00
Freddy Ye	282d2ab58f	[X86] Remove knl/knm specific ISAs supports (#92883 ) Cont. patch after https://github.com/llvm/llvm-project/pull/75580	2024-05-23 09:46:44 +08:00
Nikita Popov	8e8d2595da	[ConstantFolding] Canonicalize constexpr GEPs to i8 (#89872 ) This patch canonicalizes constant expression GEPs to use i8 source element type, aka ptradd. This is the ConstantFolding equivalent of the InstCombine canonicalization introduced in #68882. I believe all our optimizations working on constant expression GEPs (like GlobalOpt etc) have already been switched to work on offsets, so I don't expect any significant fallout from this change. This is part of: https://discourse.llvm.org/t/rfc-replacing-getelementptr-with-ptradd/68699	2024-05-20 11:47:30 +02:00
Nikita Popov	6409c21857	[SCEVExpander] Use PoisoningVH for OrigFlags It's common to delete some instructions after using SCEVExpander, while it is still live (but will not be used afterwards). In that case, the AssertingVH may trigger. Replace it with a PoisoningVH so that we only detect the case where the SCEVExpander actually is used in a problematic fashion after the instruction removal. The alternative would be to add clear() calls to more code paths. Fixes https://github.com/llvm/llvm-project/issues/83404.	2024-03-05 16:41:52 +01:00
Nikita Popov	2d69827c5c	[Transforms] Convert tests to opaque pointers (NFC)	2024-02-05 11:57:34 +01:00
Stephen Tozer	7c53e9f667	[RemoveDIs][DebugInfo] Add support for DPValues to LoopStrengthReduce (#78706 ) This patch trivially extends support for DbgValueInst recovery to DPValues in LoopStrengthReduce; they are handled identically, so this is mostly done by reusing the DbgValueInst code (using templates or auto-parameter lambdas to reduce actual code duplication).	2024-01-22 18:59:19 +00:00
Nikita Popov	eecb99c5f6	[Tests] Add disjoint flag to some tests (NFC) These tests rely on SCEV recognizing an "or" with no common bits as an "add". Add the disjoint flag to relevant or instructions in preparation for switching SCEV to use the flag instead of the ValueTracking query. The IR with disjoint flag matches what InstCombine would produce.	2023-12-05 14:09:36 +01:00
Alex Richardson	e39f6c1844	[opt] Infer DataLayout from triple if not specified There are many tests that specify a target triple/CPU flags but no DataLayout which can lead to IR being generated that has unusual behaviour. This commit attempts to use the default DataLayout based on the relevant flags if there is no explicit override on the command line or in the IR file. One thing that is currently not possible is differentiating a missing datalayout from an explicit `target datalayout = ""` in the IR file, since the current APIs don't allow detecting this case. If it is considered useful to support this case (instead of passing "-data-layout=" on the command line), I can change IR parsers to track whether they have seen such a directive and change the callback type. Differential Revision: https://reviews.llvm.org/D141060	2023-10-26 12:07:37 -07:00
Nikita Popov	4de93db447	[LSR] Regenerate test checks (NFC) While there also remove some UB from the test.	2023-09-21 16:34:44 +02:00
Florian Hahn	3ba3ea3c06	[IVUsers] Check getExpr result in findAddRecForLoop. This fixes a crash if the SCEV for the use isn't invertible and nullptr is returned. Fixes https://github.com/llvm/llvm-project/issues/63840	2023-07-20 14:56:19 +01:00
Nikita Popov	4ec3ea8afa	[LSR] Convert some tests to opaque pointers (NFC) These no longer show codegen regressions.	2023-07-12 11:48:44 +02:00
Nikita Popov	bd0710c221	[LSR] Move test to target specific directory (NFC) Uses an x86 triple.	2023-07-12 11:44:09 +02:00
Florian Hahn	69ca5c9d62	[SCEV] Add flag to control invertible check for normalization. When normalizing a SCEV expression during expansion, there should be no need for it to be invertible, as it will only be used for code generation. This fixes a crash after 7f5b15ad150e. Fixes https://github.com/llvm/llvm-project/issues/63678.	2023-07-05 18:11:44 +01:00
Florian Hahn	7f5b15ad15	[LSR] Move normalization check to normalizeForPostIncUse. Move the logic added in 3a57152d85e1 to normalizeForPostIncUse to catch additional un-invertable cases. This fixes another mis-compile pointed out by @peixin in D153004.	2023-07-04 11:56:51 +01:00
Florian Hahn	02591d26b9	[LSR] Add test for another normalization miscompile. Based on @peixin test case shared in D153004.	2023-07-03 18:57:31 +01:00
Nikita Popov	b51153792b	[LSR] Convert some tests to opaque pointers (NFC)	2023-06-23 17:13:57 +02:00
Florian Hahn	3a57152d85	[LSR] Return nullptr from getExpr if the result isn't invertible. getExpr is missing a check to make sure the result is invertible. This can lead to incorrect results, so return nullptr in those cases like in other places in IVUsers. Fixes #62660. Reviewed By: qcolombet Differential Revision: https://reviews.llvm.org/D153202	2023-06-22 19:10:48 +01:00
Florian Hahn	dae5cd73cb	Recommit "[LSR] Consider post-inc form when creating extends/truncates." This reverts the revert commit 1797ab36efc9c90c921cd725831f8c3f6a7125a2. The recommitted version now checks the PostIncLoopSets for all fixups and returns nullptr if the result doesn't match for all fixups.	2023-06-19 17:57:06 +01:00
NAKAMURA Takumi	7400bdc19f	pr62660-normalization-failure.ll REQUIRES: asserts (#62660 )	2023-06-18 15:24:53 +09:00
Florian Hahn	8225698212	[LSR] Enable SCEV verification for test from f3a0ad2d and mark as XFAIL The test fails SCEV verification, which causes the expensive check bots to fail. Always run verification and mark as XFAIL until fixed.	2023-06-17 21:06:49 +01:00
Florian Hahn	1797ab36ef	Revert "[LSR] Consider post-inc form when creating extends/truncates." This reverts commits abfeda5af329b5889db709ff74506e20e0b569e9 and fe19036e1266d2a90b44725c82b898134906e4c3. The added assertion triggers during clang bootstrap builds. Revert while I investigate.	2023-06-17 17:58:41 +01:00
Florian Hahn	f3a0ad2d8b	[LSR] Add test for #62660 . Add test for LSR miscompile.	2023-06-17 17:37:25 +01:00
Florian Hahn	abfeda5af3	[LSR] Consider post-inc form when creating extends/truncates. GenerateTruncates at the moment creates extends/truncates for post-inc uses of normalized expressions. For example, if an add rec of the form {1,+,-1} is used outside the loop, the normalized form will use {1,+,-1} instead of {0,+,-1}. When naively sign-extending the normalized expression, it will get extended incorrectly to {1,+,-1} for the wider type, if the backedge-taken count of the loop is 1. To address this, the patch updates GenerateTruncates to check if the LSRUse contains any fixups with PostIncLoops. If that's the case, first de-normalize the expression, then perform the extend/truncate, then normalize again. There may be other places where similar checks are needed and the helper can be generalized for those cases. I'd not be surprised if other subtle mis-compiles are caused by this. Fixes #38847. Fixes #58039. Fixes #62852. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D153004	2023-06-17 09:58:37 +01:00
Florian Hahn	f63c038af4	[LSR] Add test case for #58039 .	2023-06-17 09:57:00 +01:00
Florian Hahn	672b35d554	[LSR] Move new test to X86 subdir. The test added in 1665cb06307 requires the X86 backend, so move it to the X86 subdirectory.	2023-06-15 11:11:06 +01:00
Dmitry Makogon	0a3dc73e70	[Test] Move LoopStrengthReduce/pr62563.ll to X86 specific test folder (NFC) The test case is X86 specific. Should unblock buildbots after 253e3e2.	2023-05-31 20:24:30 +07:00
sgokhale	c4a60c9d34	[CodeGen][ShrinkWrap] Enable PostShrinkWrap by default This is an attempt to reland D42600 and enabling this optimisation by default. This also resolves the issue pointed out in the context of PGO build. Differential Revision: https://reviews.llvm.org/D42600	2023-05-25 13:56:29 +05:30
Tobias Hieta	f84bac329b	[NFC][Py Reformat] Reformat lit.local.cfg python files in llvm This is a follow-up to b71edfaa4ec3c998aadb35255ce2f60bba2940b0 since I forgot the lit.local.cfg files in that one. Reformatting is done with `black`. If you end up having problems merging this commit because you have made changes to a python file, the best way to handle that is to run git checkout --ours <yourfile> and then reformat it with black. If you run into any problems, post to discourse about it and we will try to help. RFC Thread below: https://discourse.llvm.org/t/rfc-document-and-standardize-python-code-style Reviewed By: barannikov88, kwk Differential Revision: https://reviews.llvm.org/D150762	2023-05-17 17:03:15 +02:00
Alan Zhao	f4999d3535	Revert "[CodeGen][ShrinkWrap] Split restore point" This reverts commit 1ddfd1c8186735c62b642df05c505dc4907ffac4. The original commit causes a Chrome build assertion failure with ThinLTO: https://crbug.com/1443635	2023-05-08 16:27:59 -07:00
sgokhale	1ddfd1c818	[CodeGen][ShrinkWrap] Split restore point Try to reland D42600 Differential Revision: https://reviews.llvm.org/D42600	2023-05-08 13:21:07 +05:30
sgokhale	bb5befefc6	Revert "[CodeGen][ShrinkWrap] Split restore point" This reverts commit 5f0bccc3d1a74111458c71f009817c9995f4bf83. An issue has been reported here: https://github.com/ClangBuiltLinux/linux/issues/1833	2023-04-13 10:52:28 +05:30
Nikita Popov	e7f4ad13ae	[Transforms] Convert some tests to opaque pointers (NFC)	2023-04-11 16:49:12 +02:00
sgokhale	5f0bccc3d1	[CodeGen][ShrinkWrap] Split restore point This patch splits a restore point to allow it to only post-dominate blocks reachable by use or def of CSRs(Callee Saved Registers)/FI(Frame Index). Benchmarking this on SPEC2017, this gives around 4% improvement on povray and no significant change for others. Co-authored-by: junbuml Differential Revision: https://reviews.llvm.org/D42600	2023-04-11 11:58:50 +05:30
Dmitry Makogon	3d7242f05e	Reapply "[LSR] Preserve LCSSA when rewriting instruction with PHI user" This reverts commit efd34ba60f3839b0a68b2e32ff9011b6823bc16f. Reapplies 8ff4832679e1. Missed a failing test. Needed to just update test checks.	2023-04-06 17:31:27 +07:00
Nico Weber	efd34ba60f	Revert "[LSR] Preserve LCSSA when rewriting instruction with PHI user" This reverts commit 8ff4832679e1ff2d2a1cfaa45bb5cb995b0685a1. Breaks tests, see https://reviews.llvm.org/D146811#4232839	2023-03-30 06:40:16 -04:00
Dmitry Makogon	8ff4832679	[LSR] Preserve LCSSA when rewriting instruction with PHI user Fixes https://github.com/llvm/llvm-project/issues/61182. LoopStrengthReduce may sometimes break LCSSA form when applying a rewrite for an instruction used in a PHI. It happens if: - The PHI is in a loop exit block, - The edge from the corresponding exiting block to that exit is critical, - The PHI has at least two inputs coming from loop blocks, - and the rewritten instruction is inserted in the loop. In such case we split the critical edge and then replace PHI inputs with the rewritten instruction. However ExitBlock is no longer a loop exit, so LCSSA form is broken. This patch fixes it by collecting all inserted instructions for PHIs whose parent block is not a loop exit and then forming LCSSA for them. Differential Revision: https://reviews.llvm.org/D146811	2023-03-30 14:46:28 +07:00
Dmitry Makogon	8e85bede79	[Test] Regenerate test checks for some LSR tests (NFC)	2023-03-24 21:24:22 +07:00
Nikita Popov	9ed2f14c87	[AsmParser] Remove typed pointer auto-detection IR is now always parsed in opaque pointer mode, unless -opaque-pointers=0 is explicitly given. There is no automatic detection of typed pointers anymore. The -opaque-pointers=0 option is added to any remaining IR tests that haven't been migrated yet. Differential Revision: https://reviews.llvm.org/D141912	2023-01-18 09:58:32 +01:00
Florian Hahn	20ecc07991	[MachineCombiner] Lift same-bb restriction for reassociable ops. This patch relaxes the restriction that both reassociate operands must be in the same block as the root instruction. The comment indicates that the reason for this restriction was that the operands not in the same block won't have a depth in the trace. I believe this is outdated; if the operand is in a different block, it must dominate the current block (otherwise it would need to be a phi), which in turn means the operand's block must be included in the current trace, and depths must be available. There's a test case (no_reassociate_different_block) added in 70520e2f1c5fc4 which shows that we have accurate depths for operands defined in other blocks. This allows reassociation of code that computes the final reduction value after vectorization, among other things. Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D141302	2023-01-13 15:32:44 +00:00
Nikita Popov	055fb7795a	[Transforms] Convert some tests to opaque pointers (NFC) These are all tests where conversion worked automatically, and required no manual fixup.	2023-01-05 12:43:45 +01:00


167 Commits