llvm-project

Author	SHA1	Message	Date
Nikita Popov	b7bd3a734c	[CGP] Fix infinite loop in icmp operand swapping Don't swap the operands if they're the same. Fixes the issue reported at https://reviews.llvm.org/D152541#4427017.	2023-06-16 15:50:12 +02:00
Nikita Popov	03de1cb715	[InstCombine][CGP] Move swapMayExposeCSEOpportunities() fold InstCombine tries to swap compare operands to match sub instructions in order to expose "CSE opportunities". However, it doesn't really make sense to perform this transform in the middle-end, as we cannot actually CSE the instructions there. The backend already performs this fold in `18f5446a45/llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp (L4236)` on the SDAG level, however this only works within a single basic block. To handle cross-BB cases, we do need to handle this in the IR layer. This patch moves the fold from InstCombine to CGP in the backend, while keeping the same (somewhat dubious) heuristic. Differential Revision: https://reviews.llvm.org/D152541	2023-06-15 14:17:58 +02:00
Florian Hahn	e97b8a7e3f	[AArch64] Don't use tbl lowering if ZExt can be folded into user. If the ZExt can be lowered to a single ZExt to the next power-of-2 and the remaining ZExt folded into the user, don't use tbl lowering. Fixes #62620. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D150482	2023-06-02 11:53:04 +01:00
Bing1 Yu	1845d9de18	[CGP] Disable default copy ctor and copy assignment operator for InstructionRemover class InstructionRemover manages resources such as dynamically allocated memory, it's generally a good practice to either implement a custom copy constructor or disable the default one. Reviewed By: pengfei Differential Revision: https://reviews.llvm.org/D151543	2023-05-27 23:20:33 +08:00
Joshua Cranmer	3ac1cef866	[CodeGen] Fix crash in CodeGenPrepare::optimizeGatherScatterInst. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D151141	2023-05-23 15:02:03 -04:00
Thomas Symalla	b819fd7e2c	[NFC] Fix typo in CodeGenPrepare.cpp	2023-05-19 11:27:28 +02:00
Krzysztof Drewniak	0bc739a4ae	[GlobalISel] Handle ptr size != index size in IRTranslator, CodeGenPrepare While the original motivation for this patch (address space 7 on AMDGPU) has been reworked and is not presently planned to reach IR translation, the incorrect (by the spec) handling of index offset width in IR translation and CodeGenPrepare is likely to trip someone - possibly future AMD, since we have a p7:160:256:256:32 now, so we convert to the other API now. Reviewed By: aemerson, arsenm Differential Revision: https://reviews.llvm.org/D143526	2023-05-12 16:21:01 +00:00
NAKAMURA Takumi	c1221251fb	Restore CodeGen/MachineValueType.h from `Support` This is rework of; - rG13e77db2df94 (r328395; MVT) Since `LowLevelType.h` has been restored to `CodeGen`, `MachinveValueType.h` can be restored as well. Depends on D148767 Differential Revision: https://reviews.llvm.org/D149024	2023-05-03 00:13:20 +09:00
Wang, Xin10	9c1e4ee690	[NFC]Fix 2 logic dead code First, in CodeGenPrepare.cpp, line 6891, the VectorCond will always be false because if not function will return at 6888. Second, in SelectionDAGBuilder.cpp, line 5443, getSExtValue() will return value as int type, but now we use unsigned Val to maintain it, which make the if condition at 5452 meaningless. Reviewed By: skan Differential Revision: https://reviews.llvm.org/D149033	2023-04-28 03:02:59 -04:00
Jordan Rupprecht	fbf42f1fe2	Revert "[CodeGenPrepare] Estimate liveness of loop invariants when checking for address folding profitability" This reverts commit 5344d8e10bb7d8672d4bfae8adb010465470d51b. It causes non-determinism when building clang. See the review thread on D143897.	2023-04-27 19:16:32 -07:00
Momchil Velikov	5344d8e10b	[CodeGenPrepare] Estimate liveness of loop invariants when checking for address folding profitability When checking the profitability of folding an address computation into a memory instruction, the compiler tries to determine the liveness of the values, comprising the address, at the point of the memory instruction. This patch improves on the live variable estimates by including the loop invariants which are references in the loop body. Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D143897	2023-04-24 10:21:36 +01:00
Momchil Velikov	4f02a0f606	[NFC][CodeGenPrepare] Match against the correct instruction when checking profitability of folding an address The "nested" `AddressingModeMatcher`s in `AddressingModeMatcher::isProfitableToFoldIntoAddressingMode` are constructed using the original memory instruction, even though they check whether the address operand of a differrent memory instructon is foldable. The memory instruction is used only for a dominance check (when not checking for profitability), and using the wrong memory instruction does not change the outcome of the test - if an address is foldable, the dominance test afects which of the two possible ways to fold is chosen, but this result is discarded. As an example, in target triple = "x86_64-linux" declare i1 @check(i64, i64) define i32 @f(i1 %cc, ptr %p, ptr %q, i64 %n) { entry: br label %loop loop: %iv = phi i64 [ %i, %C ], [ 0, %entry ] %offs = mul i64 %iv, 4 %c.0 = icmp ult i64 %iv, %n br i1 %c.0, label %A, label %fail A: br i1 %cc, label %B, label %C C: %u = phi i32 [0, %A], [%w, %B] %i = add i64 %iv, 1 %a.0 = getelementptr i8, ptr %p, i64 %offs %a.1 = getelementptr i8, ptr %a.0, i64 4 %v = load i32, ptr %a.1 %c.1 = icmp eq i32 %v, %u br i1 %c.1, label %exit, label %loop B: %a.2 = getelementptr i8, ptr %p, i64 %offs %a.3 = getelementptr i8, ptr %a.2, i64 4 %w = load i32, ptr %a.3 br label %C exit: ret i32 -1 fail: ret i32 0 } the dominance test is perfomed between `%i = ...` and `%v = ...` at the moment we're checking whether `%a3 = ...` is foldable Using the memory instruction, which uses the interesting address is "more correct" and this change is needed by a future patch. Reviewed By: mkazantsev Differential Revision: https://reviews.llvm.org/D143896	2023-04-21 18:09:51 +01:00
Momchil Velikov	6c9066fe2e	Recommit "[AArch64] Fix incorrect `isLegalAddressingMode`" This patch recommits 0827e2fa3fd15b49fd2d0fc676753f11abb60cab after reverting it in ed7ada259f665a742561b88e9e6c078e9ea85224. Added workround for `Targetlowering::AddrMode` no longer being an aggregate in C++20. `AArch64TargetLowering::isLegalAddressingMode` has a number of defects, including accepting an addressing mode, which consists of only an immediate operand, or not checking the offset range for an addressing mode in the form `1*ScaledReg + Offs`. This patch fixes the above issues. Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D143895 Change-Id: I41a520c13ce21da503ca45019979bfceb8b648fa	2023-04-21 16:21:01 +01:00
Momchil Velikov	ed7ada259f	Revert "[AArch64] Fix incorrect `isLegalAddressingMode`" This reverts commit 0827e2fa3fd15b49fd2d0fc676753f11abb60cab. Failing buildbot, perhaps due to `-std=c++20`.	2023-04-20 16:10:45 +01:00
Momchil Velikov	0827e2fa3f	[AArch64] Fix incorrect `isLegalAddressingMode` `AArch64TargetLowering::isLegalAddressingMode` has a number of defects, including accepting an addressing mode which consists of only an immediate operand, or not checking the offset range for an addressing mode in the form `1*ScaledReg + Offs`. This patch fixes the above issues. Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D143895 Change-Id: I756fa21941844ded44f082ac7eea4391219f9851	2023-04-20 15:43:11 +01:00
Akshay Khadse	43b38696aa	Fix uninitialized class members Reviewed By: LuoYuanke Differential Revision: https://reviews.llvm.org/D148692	2023-04-20 11:18:34 +08:00
Akshay Khadse	8bf7f86d79	Fix uninitialized pointer members in CodeGen This change initializes the members TSI, LI, DT, PSI, and ORE pointer feilds of the SelectOptimize class to nullptr. Reviewed By: LuoYuanke Differential Revision: https://reviews.llvm.org/D148303	2023-04-17 16:32:46 +08:00
Momchil Velikov	99e57f06c4	[CodeGenPrepare] Increase the limit on the number of instructions to scan ... when finding all memory uses for an address and make it a parameter. Now that we have avoided potentially exponential run time of `FindAllMemoryUses` in D143893. it'd be beneficial to increase the limit up from 20. Reviewed By: mkazantsev Differential Revision: https://reviews.llvm.org/D143894 Change-Id: I3abdf40332ef65e9b2f819ac32ac60e4200ec51d	2023-03-30 14:38:22 +01:00
Momchil Velikov	2453da0a4e	[CodeGenPrepare] Fix counting uses when folding addresses into memory instructions The counter of the number of instructions seen in `FindAllMemoryUses` is reset after returning from a recursive invocation of `FindAllMemoryUses` to the value it had before the call. In effect, depending on the shape of the uses graph, the function may scan up to `2^N-1` instructions where `N` is the scan limit (`MaxMemoryUsesToScan`). This does not look intuitive or intended. This patch changes the counting to just count the scanned instructions, independent of the shape of the references. Reviewed By: mkazantsev Differential Revision: https://reviews.llvm.org/D143893 Change-Id: I99f5de55e84843cf2fbea287d6ae4312fa196240	2023-03-30 14:18:14 +01:00
Peter Rong	670c92a415	[CodeGen] Remove redundent instructions generated by combineAddrModes. CodeGenPare may optimize memory access modes. During such optimization, it might create a new instruction representing combined value. Later, If the optimization failed, the generated value is not removed and remains a dead instruction. Normally this won't be a problem as dead code will be eliminated later. However, in this case (Issue 58538), the generated instruction may trigger an infinite loop. The infinite loop involves `sinkCmpExpression`, where it tries to optimize the placeholder generated by us. (See the test case detailed in the issue) To fix this, we remove the unnecessary placeholder immediately when we abort the optimization. `AddressingModeCombiner` will keep track of the placeholder, and remove it if it is an inserted placeholder and has no uses. This patch fixes https://github.com/llvm/llvm-project/issues/58538, a test is also included. Reviewed By: skatkov Differential Revision: https://reviews.llvm.org/D147041	2023-03-30 00:35:56 -07:00
Craig Topper	697a28b380	[CodeGenPrepare][RISCV] Correct the MathUsed flag for shouldFormOverflowOp For add, if we match the constant edge case the add isn't used by the compare so we shouldn't check for 2 users. For sub, the compare is not a user of the sub so the math is used if the sub has any users. This regresses RISC-V which I will work on other patches for. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D146786	2023-03-27 09:58:50 -07:00
Momchil Velikov	6a2a5f08de	[CodeGenPrepare] Don't give up if unable to sink first arg to a cold call Reviewed By: mkazantsev Differential Revision: https://reviews.llvm.org/D143892	2023-03-23 17:31:09 +00:00
Kazu Hirata	398af9b43b	[llvm] Use *{Map,Set}::contains (NFC)	2023-03-15 18:06:32 -07:00
Kazu Hirata	a585fa2637	[CodeGen] Use *{Set,Map}::contains (NFC)	2023-03-14 08:07:42 -07:00
Paul Walker	adbdf273ef	[CodeGenPrepare] Stop llvm.vscale() -> getelementptr(null, 1) transformation. I've pulled this change from D145404 to land in isolation because I'm concerned the code might be more important than the test coverage might suggest (NOTE: the code has no test coverage).	2023-03-08 15:47:03 +00:00
Xiang1 Zhang	eed31bbb37	[NFC] Remove dead code in ExtAddrMode::print checked by coverty tool	2023-03-08 15:01:28 +08:00
Sander de Smalen	170e7a0ec2	[AArch64][SME2] Add CodeGen support for target("aarch64.svcount"). This patch adds AArch64 CodeGen support such that the type can be passed and returned to/from functions, and also adds support to use this type in load/store operations and PHI nodes. Reviewed By: paulwalker-arm Differential Revision: https://reviews.llvm.org/D136862	2023-03-02 12:07:41 +00:00
Kazu Hirata	a28b252d85	Use APInt::getSignificantBits instead of APInt::getMinSignedBits (NFC) Note that getMinSignedBits has been soft-deprecated in favor of getSignificantBits.	2023-02-19 23:56:52 -08:00
Jake Egan	08533f8b86	Revert "[CGP] Add generic TargetLowering::shouldAlignPointerArgs() implementation" These commits are causing a test-suite build failure on AIX. Revert for now for time to investigate. https://lab.llvm.org/buildbot/#/builders/214/builds/5779/steps/9/logs/stdio This reverts commit bd87a2449da0c82e63cebdf9c131c54a5472e3a7 and 4c72266830ffa332ebb7cf1d3bbd6c56d001fa0f.	2023-02-14 15:20:06 -05:00
Alex Richardson	bd87a2449d	[CGP] Add generic TargetLowering::shouldAlignPointerArgs() implementation This function was added for ARM targets, but aligning global/stack pointer arguments passed to memcpy/memmove/memset can improve code size and performance for all targets that don't have fast unaligned accesses. This adds a generic implementation that adjusts the alignment to pointer size if unaligned accesses are slow. Review D134168 suggests that this significantly improves performance on synthetic benchmarks such as Dhrystone on RV32 as it avoids memcpy() calls. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D134282	2023-02-09 10:11:40 +00:00
Kazu Hirata	caa99a01f5	Use llvm::popcount instead of llvm::countPopulation(NFC)	2023-01-22 12:48:51 -08:00
Piotr Fusik	898b5c9f5e	[NFC] Fix "form/from" typos Reviewed By: #libc, ldionne Differential Revision: https://reviews.llvm.org/D142007	2023-01-22 20:05:51 +01:00
ShihPo Hung	5fb3a57ea7	[Cost] Add CostKind to getVectorInstrCost and its related users LoopUnroll estimates the loop size via getInstructionCost(), but getInstructionCost() cannot pass CostKind to getVectorInstrCost(). And so does getShuffleCost() to getBroadcastShuffleOverhead(), getPermuteShuffleOverhead(), getExtractSubvectorOverhead(), and getInsertSubvectorOverhead(). To address this, this patch adds an argument CostKind to these functions. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D142116	2023-01-21 05:29:24 -08:00
Guillaume Chatelet	8fd5558b29	[NFC] Use TypeSize::geFixedValue() instead of TypeSize::getFixedSize() This change is one of a series to implement the discussion from https://reviews.llvm.org/D141134.	2023-01-11 16:49:38 +00:00
Guillaume Chatelet	48f5d77eee	[NFC] Use TypeSize::getKnownMinValue() instead of TypeSize::getKnownMinSize() This change is one of a series to implement the discussion from https://reviews.llvm.org/D141134.	2023-01-11 16:36:39 +00:00
OCHyams	042107494d	[DebugInfo][NFC] Rename is/setUndef to is/setKilllocation These names better reflect the semantics and also the implementation, since it's not just "undef" operands that are sentinels used to signal that the debug intrinsic terminates dominating locations definitions. Related to https://discourse.llvm.org/t/auto-undef-debug-uses-of-a-deleted-value Reviewed By: StephenTozer Differential Revision: https://reviews.llvm.org/D140903	2023-01-06 09:15:02 +00:00
serge-sans-paille	38818b60c5	Move from llvm::makeArrayRef to ArrayRef deduction guides - llvm/ part Use deduction guides instead of helper functions. The only non-automatic changes have been: 1. ArrayRef(some_uint8_pointer, 0) needs to be changed into ArrayRef(some_uint8_pointer, (size_t)0) to avoid an ambiguous call with ArrayRef((uint8_t), (uint8_t)) 2. CVSymbol sym(makeArrayRef(symStorage)); needed to be rewritten as CVSymbol sym{ArrayRef(symStorage)}; otherwise the compiler is confused and thinks we have a (bad) function prototype. There was a few similar situation across the codebase. 3. ADL doesn't seem to work the same for deduction-guides and functions, so at some point the llvm namespace must be explicitly stated. 4. The "reference mode" of makeArrayRef(ArrayRef<T> &) that acts as no-op is not supported (a constructor cannot achieve that). Per reviewers' comment, some useless makeArrayRef have been removed in the process. This is a follow-up to https://reviews.llvm.org/D140896 that introduced the deduction guides. Differential Revision: https://reviews.llvm.org/D140955	2023-01-05 14:11:08 +01:00
Sprite	a9f9f3dff4	Correct typos (NFC) Just found some typos while reading the llvm/circt project. compliment -> complement emitsd -> emits	2022-12-16 10:51:26 -08:00
Kazu Hirata	3c09ed006a	[llvm] Use std::nullopt instead of None in comments (NFC) This is part of an effort to migrate from llvm::Optional to std::optional: https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716	2022-12-04 17:12:44 -08:00
Kazu Hirata	998960ee1f	[CodeGen] Use std::nullopt instead of None (NFC) This patch mechanically replaces None with std::nullopt where the compiler would warn if None were deprecated. The intent is to reduce the amount of manual work required in migrating from Optional to std::optional. This is part of an effort to migrate from llvm::Optional to std::optional: https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716	2022-12-02 20:36:08 -08:00
Kazu Hirata	000749d753	[CodeGen] Use std::optional in CodeGenPrepare.cpp (NFC) This is part of an effort to migrate from llvm::Optional to std::optional: https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716	2022-11-26 14:27:19 -08:00
OCHyams	3115e6828c	[Assignment Tracking][25/*] Replace sunk address uses in dbg.assign intrinsics The Assignment Tracking debug-info feature is outlined in this RFC: https://discourse.llvm.org/t/ rfc-assignment-tracking-a-better-way-of-specifying-variable-locations-in-ir Reviewed By: StephenTozer Differential Revision: https://reviews.llvm.org/D136255	2022-11-21 15:50:47 +00:00
Alex Richardson	754d25844a	[CGP] Update MemIntrinsic alignment if possible Previously it was only being done if shouldAlignPointerArgs() returned true, which right now is only true for ARM targets. Updating the argument alignment attributes of memcpy/memset intrinsics if the underlying object has larger alignment can be beneficial even when CGP didn't increase alignment (as can be seen from the test changes), so invert the loop and if condition. Differential Revision: https://reviews.llvm.org/D134281	2022-11-17 11:59:35 +00:00
Haohai Wen	e419620fc2	[CodeGenPrep] Change ValueToSExts from DeseMap to MapVector mergeSExts iterates throught ValueToSExts. Using DenseMap result in unstable optimization path so that output IR may vary even if the input IR is same. Reviewed By: wxiao3 Differential Revision: https://reviews.llvm.org/D137234	2022-11-04 11:15:18 +08:00
David Green	16e4e4ab87	[CodeGenPrep] Handle constants in ConvertPhiType This is a simple addition to the convertPhiTypes in CodeGenPrepare to consider and convert constants as it converts the phi type. Someone fixed the bug in the motivating example, so the undef is now a constant 0. This does mean converting between integer and floating point constants, which may have different materialization. Differential Revision: https://reviews.llvm.org/D135561	2022-10-13 16:41:44 +01:00
Florian Hahn	6b86b481e3	[AArch64] Use tbl for truncating vector FPtoUI conversions. On AArch64, doing the vector truncate separately after the fptoui conversion can be lowered more efficiently using tbl.4, building on D133495. https://alive2.llvm.org/ce/z/T538CC Depends on D133495 Reviewed By: t.p.northover Differential Revision: https://reviews.llvm.org/D133496	2022-09-16 14:57:43 +01:00
Florian Hahn	8491d01cc3	[AArch64] Lower vector trunc using tbl. Similar to using tbl to lower vector ZExts, tbl4 can be used to lower vector truncates. The initial version support i32->i8 conversions. Depends on D120571 Reviewed By: t.p.northover Differential Revision: https://reviews.llvm.org/D133495	2022-09-16 12:42:49 +01:00
Florian Hahn	5871f18827	[AArch64] Lower extending uitofp using tbl. On AArch64, doing the zero-extend separately first can be lowered more efficiently using tbl, building on D120571. https://alive2.llvm.org/ce/z/8Je595 Depends on D120571 Reviewed By: t.p.northover Differential Revision: https://reviews.llvm.org/D133494	2022-09-16 10:20:25 +01:00
Florian Hahn	81a11da762	[CGP,AArch64] Replace zexts with shuffle that can be lowered using tbl. This patch extends CodeGenPrepare to lower zext v16i8 -> v16i32 in loops using a wide shuffle creating a v64i8 vector, selecting groups of 3 zero elements and an element from the input. This is profitable on AArch64 where such shuffles can be lowered to tbl instructions, but only in loops, because it requires materializing 4 masks, which can be done in the loop preheader. This is the only reason the transform is part of CGP. If there's a better alternative I missed, please let me know. The same goes for the shouldReplaceZExtWithShuffle hook which guards this. I am not sure if this transform will be beneficial on other targets, but it seems like there is no way other convenient way. This improves the generated code for loops like the one below in combination with D96522. int foo(uint8_t p, int N) { unsigned long long sum = 0; for (int i = 0; i < N ; i++, p++) { unsigned int v = p; sum += (v < 127) ? v : 256 - v; } return sum; } https://clang.godbolt.org/z/Wco866MjY Reviewed By: t.p.northover Differential Revision: https://reviews.llvm.org/D120571	2022-09-15 19:18:13 +01:00
Xiang1 Zhang	16743c9534	[CodeGen] Limit building time in CodeGenPrepare for huge function Details: Currently CodeGenPrepare is very time consuming in handling big functions. Old Algorithm : It iterate each BB in function, and go on handle very instructions in BB. Due to some instruction optimizations may affect the BBs' dominate tree. The old logic will re-iterate and try optimize for each BB. Suppose we have a big function with 20000 BBs, If we handled the last BB with fine tuning the dominate tree. We need totally re-iterate and try optimize the 20000 BBs from the beginning. The Complex is near N! And we really encounter somes big tests (> 20000 BBs) that cost more than 30 mins in this pass. (Debug version compiler will cost 2 hours here) What this patch do for huge function ? It mainly changes the iteration way for optimization. 1 We do optimizeBlock for each BB (that is same with old way). And, in the meaning time, If BB is changed/updated in the optimization, it will be put into FreshBBs (try do optimizeBlock again). The new created BB at previous iteration will also put into FreshBBs. 2 For the BBs which not updated at previous iteration, we directly skip it. Strictly speaking, here may miss some opportunity, but the probability is very small. 3 For Instructions in single BB, we do optimizeInst for each instruction. If optimizeInst change the instruction dominator in this BB, rather than break and go back to optimize the first BB (the old way), we directly iterate instructions (to do optimizeInst) in this updated BB again (the new way). What this patch do for small/normal (not huge) function ? It is same with the Old Algorithm. (NFC) Reviewed By: LuoYuanke Differential Revision: https://reviews.llvm.org/D129352	2022-09-07 10:05:40 +08:00

1 2 3 4 5 ...

803 Commits