llvm-project

Author	SHA1	Message	Date
Momchil Velikov	078899cd64	[SimplifyCFG] Allow SimplifyCFG hoisting to skip over non-matching instructions SimplifyCFG does some common code hoisting, which is limited to hoisting a sequence of identical instruction in identical order and stops at the first non-identical instruction. This patch allows hoisting instruction pairs over same-length sequences of non-matching instructions. The linear asymptotic complexity of the algorithm stays the same, there's an extra parameter `simplifycfg-hoist-common-skip-limit` serving to limit compilation time and/or the size of the hoisted live ranges. The patch improves SPECv6/525.x264_r by about 10%. Reviewed By: nikic, dmgreen Differential Revision: https://reviews.llvm.org/D129370	2022-09-05 15:13:46 +01:00
Tian Zhou	8fa432be4f	[InstCombine] reduce test-for-overflow of shifted value Fixes #57338. The added code makes the following transformations: For unsigned predicates / eq / ne: icmp pred (x << 1), x --> icmp getSignedPredicate(pred) x, 0 icmp pred x, (x << 1) --> icmp getSignedPredicate(pred) 0, x Some examples: https://alive2.llvm.org/ce/z/ckn4cj https://alive2.llvm.org/ce/z/h-4bAQ Differential Revision: https://reviews.llvm.org/D132888	2022-09-05 09:51:51 -04:00
Florian Hahn	408ebe5e3a	[VPlan] Move VPWidenCallRecipe to VPlanRecipes.cpp (NFC). Depends on D132585. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D132586	2022-09-05 10:48:29 +01:00
Nikita Popov	388b684354	[LICM] Separate check for writability and thread-safety (NFCI) This used a single check to make sure that the object is both writable and thread-local. Separate them out to make the deficiencies in the current code more obvious.	2022-09-05 09:43:17 +02:00
Florian Hahn	ba3d29f871	[LCSSA] Update unreachable uses with poison. Users of LCSSA may not expect non-phi uses when checking the uses outside a loop, which may cause crashes. This is due to the fact that we do not update uses in unreachable blocks. To ensure all reachable uses outside the loop are phis, update uses in unreachable blocks to use poison in dead code. Fixes #57508.	2022-09-04 22:26:18 +01:00
Kazu Hirata	7d8c2d17eb	[llvm] Use range-based for loops (NFC) Identified with modernize-loop-convert.	2022-09-03 23:27:25 -07:00
Fangrui Song	9fc679b87c	[SanitizerCoverage] Simplify pc-table and improve test. NFC	2022-09-03 14:29:21 -07:00
Kazu Hirata	9eca5ed790	[llvm] Use std::enable_if_t (NFC)	2022-09-03 11:17:44 -07:00
Kazu Hirata	fedc59734a	[llvm] Use range-based for loops (NFC)	2022-09-03 11:17:40 -07:00
Sanjay Patel	22e1f66f26	[SCCP] add helper function for replacing signed operations; NFC Preliminary refactoring for planned enhancement in D133198.	2022-09-03 10:30:10 -04:00
Sanjay Patel	5c759edc57	[InstCombine] reduce another or-xor bitwise logic pattern ~(A & ?) \| (A ^ B) --> ~((A & ?) & B) https://alive2.llvm.org/ce/z/mxex6V This is similar to 9d218b61cc50 where we peeked through another logic op to find a common operand.	2022-09-03 09:32:08 -04:00
Richard Smith	053841c562	Revert "[AggressiveInstCombine] Lower Table Based CTTZ" This reverts commit fec01ee3f5244bb9a04bc4310fc892c56c5b6bab. According to asan, this patch introduces a heap use after free.	2022-09-02 16:19:09 -07:00
Francis Visoiu Mistrih	c5b10f348e	[Matrix] Use print instead of dump for matrix-print-after-transpose-opt We should be able to use this option even if LLVM_ENABLE_DUMP is not on. (should fix the bots too)	2022-09-02 16:12:21 -07:00
Francis Visoiu Mistrih	81bdb4068d	[Matrix] Simplify matmuls with scalars If one of the operands is a transposed splat, the transpose can be removed. This is useful to simplify when transposes are distributed to operands of a matmul: * k^T -> k * (A * k)^t -> A^t * k Differential Revision: https://reviews.llvm.org/D130177	2022-09-02 15:50:25 -07:00
Sameer Sahasrabuddhe	46b293cb3f	[Attributor] Simplify offset calculation for a constant GEP Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D132931	2022-09-02 23:53:51 +05:30
Arthur Eubanks	57fd866551	[LoopPassManager] Implement and use LoopNestAnalysis::run() instead of manually creating LoopNests The current code is basically just emulating what the analysis manager does. Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D132581	2022-09-02 10:55:53 -07:00
Djordje Todorovic	fec01ee3f5	[AggressiveInstCombine] Lower Table Based CTTZ This patch introduces recognition of table-based ctz implementation during the AggressiveInstCombine. This fixes the [0]. [0] https://bugs.llvm.org/show_bug.cgi?id=46434 Differential Revision: https://reviews.llvm.org/D113291	2022-09-02 17:26:55 +02:00
Jolanta Jensen	958abe864a	[LoopLoadElim] Add stores with matching sizes as load-store candidates We are not building up a proper list of load-store candidates because we are throwing away stores where the type don't match the load. This patch adds stores with matching store sizes as candidates. Author of the original patch: David Sherwood. Differential Revision: https://reviews.llvm.org/D130233	2022-09-02 13:11:25 +01:00
Muhammad Omair Javaid	18de7c6a3b	Revert "[InstCombine] Treat passing undef to noundef params as UB" This reverts commit c911befaec494c52a63e3b957e28d449262656fb. It has broken LLDB Arm/AArch64 Linux buildbots. I don't really understand the underlying reason. Reverting for now to make the buildbots green. https://reviews.llvm.org/D133036	2022-09-02 16:09:50 +05:00
Mikael Holmen	51d4c7ceea	[GlobalOpt] Fix debug variance problem in hasOnlyColdCalls hasOnlyColdCalls skipped over calls to intrinsics, but it did so after checking the linkage of the called function. This meant that the presence of a call to a debug intrinsic could affect the outcome of the optimization. In my original reproducer (for an out of tree target) it was particularly interesting, because the actual IR after GlobalOpt was not different with debug intrinsics present, so -print-after-all printouts didn't show anything there. However, without debuginfo, GlobalOpt went further and ran BlockFrequencyAnalysis and (more importantly) LoopAnalysis, and later on in the pipeline, instcombine behaved in different ways when LoopInfo was present. So a call to a dbg.declare prevented running LoopAnalysis in GlobalOpt, which later prevented InstCombine from doing an optimization. The dbg-intrinsic-loopanalysis.ll testcase tries to expose this. Then I also noted that adding a dbg.declare actually made the existing testcase colccc_coldsites.ll generate different code, so I modified that to now test it behaves the same way with and without the dbg.declare. Reviewed By: nikic, fhahn Differential Revision: https://reviews.llvm.org/D133193	2022-09-02 12:29:44 +02:00
Sergey Kachkov	be37caca00	[JumpThreading] Process range comparisions with non-local cmp instructions Use getPredicateOnEdge method if value is a non-local compare-with-a-constant instruction, that can give more precise results than getConstantOnEdge. Differential Revision: https://reviews.llvm.org/D131956	2022-09-02 12:22:45 +02:00
Nikita Popov	c453e5b901	Revert "[DSE] Eliminate noop store even through has clobbering between LoadI and StoreI" This reverts commit cd8f3e75813995c1d2da35370ffcf5af3aff9c2f. As pointed out by Eli on the review, this is missing an alignment check. The value might be written at an offset.	2022-09-02 09:28:48 +02:00
Nikita Popov	639d912282	[LICM] Allow load-only scalar promotion in the presence of unwinding Currently, we bail out of scalar promotion if the loop may unwind and the memory may be visible on unwind. This is because we can't insert stores of the promoted value on unwind edges. However, nowadays scalar promotion also has support for only promoting loads, while leaving stores in place. This kind of promotion is safe even in the presence of unwinding. Differential Revision: https://reviews.llvm.org/D133111	2022-09-02 09:27:13 +02:00
luxufan	cd8f3e7581	[DSE] Eliminate noop store even through has clobbering between LoadI and StoreI For noop store of the form of LoadI and StoreI, An invariant should be kept is that the memory state of the related MemoryLoc before LoadI is the same as before StoreI. For this example: ``` define void @pr49927(i32* %q, i32* %p) { %v = load i32, i32* %p, align 4 store i32 %v, i32* %q, align 4 store i32 %v, i32* %p, align 4 ret void } ``` Here the definition of the store's destination is different with the definition of the load's destination, which it seems that the invariant mentioned above is broken. But the definition of the store's destination would write a value that is LoadI, actually, the invariant is still kept. So we can safely ignore it. Differential Revision: https://reviews.llvm.org/D132657	2022-09-02 06:37:41 +00:00
Vitaly Buka	ad3a77df2d	[msan] Fix debug info with getNextNode When we want to add instrumentation after an instruction, instrumentation still should keep debug info of the instruction. Reviewed By: kda, kstoimenov Differential Revision: https://reviews.llvm.org/D133091	2022-09-01 20:13:56 -07:00
Chenbing Zheng	d30cf77cb1	[InstCombine] complete fold extractvalue (any_mul_with_overflow X, -1) When we do extractvalue (any_mul_with_overflow X, -1) --> (-X and icmp), which left partly failed to match vector constant with poison element. This patch try to fix it. Alive2: https://alive2.llvm.org/ce/z/2rGp_3 Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D132996	2022-09-02 10:58:42 +08:00
Vitaly Buka	ad2b356f85	[msan] Use no-origin functions when possible Saves 1.8% of .text size on CTMark Reviewed By: kda Differential Revision: https://reviews.llvm.org/D133077	2022-09-01 19:18:38 -07:00
Arthur Eubanks	c911befaec	[InstCombine] Treat passing undef to noundef params as UB Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D133036	2022-09-01 15:16:45 -07:00
Rong Xu	0caa4a9559	[PGO] Support PGO annotation of CallBrInst We currently instrument CallBrInst but do not annotate it with the branch weight. This patch enables PGO annotation of CallBrInst. Differential Revision: https://reviews.llvm.org/D133040	2022-09-01 14:13:50 -07:00
Vitaly Buka	ef0f866718	[msan] Combine shadow check of the same instruction Reduces .text size by 1% on our large binary. On CTMark (-O2 -fsanitize=memory -fsanitize-memory-use-after-dtor -fsanitize-memory-param-retval) Size -0.4% Time -0.8% Reviewed By: kda Differential Revision: https://reviews.llvm.org/D133071	2022-09-01 13:55:59 -07:00
Vitaly Buka	9110673062	[nfc][msan] Group checks per instruction It's a preparation of to combine shadow checks of the same instruction Reviewed By: kda, kstoimenov Differential Revision: https://reviews.llvm.org/D133065	2022-09-01 13:10:16 -07:00
Jordan Rupprecht	3031a250de	[MSan] Fix determinism issue when using msan-track-origins. When instrumenting `alloca`s, we use a `SmallSet` (i.e. `SmallPtrSet`). When there are fewer elements than the `SmallSet` size, it behaves like a vector, offering stable iteration order. Once we have too many `alloca`s to instrument, the iteration order becomes unstable. This manifests as non-deterministic builds because of the global constant we create while instrumenting the alloca. The test added is a simple IR file, but was discovered while building `libcxx/src/filesystem/operations.cpp` from libc++. A reduced C++ example from that: ``` // clang++ -fsanitize=memory -fsanitize-memory-track-origins \ // -fno-discard-value-names -S -emit-llvm \ // -c op.cpp -o op.ll struct Foo { ~Foo(); }; bool func1(Foo); void func2(Foo); void func3(int) { int f_st, t_st; Foo f, t; func1(f) \|\| func1(f) \|\| func1(t) \|\| func1(f) && func1(t); func2(f); } ``` Reviewed By: kda Differential Revision: https://reviews.llvm.org/D133034	2022-09-01 09:15:57 -07:00
Nuno Lopes	858fe8664e	Expand Div/Rem: consider the case where the dividend is zero So we can't use ctlz in poison-producing mode	2022-09-01 17:04:26 +01:00
Nikita Popov	f5c178b6a4	[LICM] Remove unnecessary condition (NFC)	2022-09-01 15:42:35 +02:00
Nikita Popov	315aef667e	[LICM] Fix thread safety checks for promotion of byval args This code was relying on a very subtle contract: The expectation was that for non-allocas, the unwind safety check would already perform a capture check, so we don't need to perform it later. This held true when this unwind safety was only handled for allocas and noalias calls, but became incorrect when byval support was added. To avoid this kind of issue, just remove the dependency between the unwind and thread-safety checks entirely. At worst, this means we perform a redundant capture check. If this should turn out to be problematic for compile-time, we can cache that query in a more explicit way.	2022-09-01 15:33:46 +02:00
Sanjay Patel	c3d1504d63	[InstCombine] fix crash on type mismatch with fcmp fold The existing predicate doesn't work for a single-element vector, so make sure we are not crossing scalar/vector types. Test (was crashing) based on the post-commit example for: 482777123427	2022-09-01 08:57:55 -04:00
Sanjay Patel	addbdac5d5	[InstCombine] fold power-of-2 ctlz/cttz with inverted result When X is a power-of-two or zero and zero input is poison: ctlz(i32 X) ^ 31 --> cttz(X) cttz(i32 X) ^ 31 --> ctlz(X) https://alive2.llvm.org/ce/z/Cs7sFE	2022-09-01 08:57:55 -04:00
Nikita Popov	3f8b1d0f15	[LICM] Add some debug output to scalar promotion (NFC)	2022-09-01 14:46:30 +02:00
Alexey Bataev	982d9ef1c1	[SLP]Fix PR55734: SLP vectorizer's reduce_and formation introduces poison. Need either follow the original order of the operands for bool logical ops, or emit freeze instruction to avoid poison propagation. Differential Revision: https://reviews.llvm.org/D126877	2022-09-01 05:34:45 -07:00
Yuanbo Li	ebd0249fcf	[DebugInfo] Missing debug location after replacement in processSRem function This patch fixes an issue in which CorrelatedValuePropagation::processSRem would create new instructions to represent the SRem instruction, but would not correctly copy any existing debug location metadata to the new instruction. Differential Revision: https://reviews.llvm.org/D132218	2022-09-01 13:18:17 +01:00
Florian Hahn	fc444ddc77	[VPlan] Add field to track if intrinsic should be used for call. (NFC) This patch moves the cost-based decision whether to use an intrinsic or library call to the point where the recipe is created. This untangles code-gen from the cost model and also avoids doing some extra work as the information is already computed at construction. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D132585	2022-09-01 13:14:40 +01:00
Nuno Lopes	fa154a9170	Revert "Expand Div/Rem: consider the case where the dividend is zero" This reverts commit 4aed09868b5a51a29aade11d9d412c3313310f29.	2022-09-01 12:11:22 +01:00
Nuno Lopes	4aed09868b	Expand Div/Rem: consider the case where the dividend is zero So we can't use ctlz in poison-producing mode	2022-09-01 12:00:03 +01:00
Pavel Samolysov	527b9a9d90	[DeadArgElim] Use structure bindings in foreach loops. NFC Differential Revision: https://reviews.llvm.org/D133026	2022-09-01 13:48:46 +03:00
Nikita Popov	43e7d9af1d	[InstCombine] Fold extractvalue of phi Just as we do for most other operations, we should push extractvalue instructions through phis, if this does not increase unfolded instruction count.	2022-09-01 10:51:54 +02:00
Arthur Eubanks	04f3c20989	[NFC][LICM] Stop passing around unused BFI Uses of this were removed in 1a25d0bfbb6b587caa03bacd121b67086a774598.	2022-08-31 19:15:34 -07:00
Vitaly Buka	53d1ae88f8	[nfc][msan] Prepare the code for check sorting	2022-08-31 15:36:49 -07:00
Nikita Popov	ab6876a40d	reland: [Local] Allow creating callbr with duplicate successors Since D129288, callbr is allowed to have duplicate successors. This patch removes a limitation which prevents optimizations from actually producing such callbrs. This is probably the riskiest of all the recent callbr changes, because code with incorrect assumptions might be lurking somewhere. I fixed the one case I encountered ahead of time in `8201e3ef5c`. Reviewed By: nickdesaulniers Differential Revision: https://reviews.llvm.org/D129997 Originally landed as commit 08860f525a23 ("[Local] Allow creating callbr with duplicate successors") Reverted in commit 1cf6b93df168 ("Revert "[Local] Allow creating callbr with duplicate successors"")	2022-08-31 13:23:00 -07:00
Alexey Bataev	588115c117	[SLP][NFC]Add a check for SelectInst to match description, NFC.	2022-08-31 13:04:21 -07:00
Alexey Bataev	d8d9ee10bb	[SLP][NFC]Fix comment and make function following naming standard, NFC.	2022-08-31 12:37:55 -07:00


31400 Commits
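
The InstCombine fold listed in commit 8fa432be4f (`icmp pred (x << 1), x --> icmp getSignedPredicate(pred) x, 0`, and the swapped-operand form) can be sanity-checked by brute force at a small bit width. The sketch below is only an illustration and is not taken from the patch: it assumes an 8-bit integer type, and the names `to_signed`, `PREDICATES` and `fold_holds` are hypothetical helpers. The Alive2 links in the commit message remain the authoritative proof.

```python
# Brute-force check, at 8 bits only, of the fold described in commit 8fa432be4f:
#   icmp pred (x << 1), x  -->  icmp signed(pred) x, 0
#   icmp pred x, (x << 1)  -->  icmp signed(pred) 0, x
# All names here are illustrative; none of this is LLVM code.
import operator

BITS = 8                      # assumed width, small enough to enumerate
MASK = (1 << BITS) - 1


def to_signed(value):
    """Reinterpret an unsigned BITS-wide value as two's complement."""
    return value - (1 << BITS) if value & (1 << (BITS - 1)) else value


# Unsigned predicates plus eq/ne. The same Python operator serves as the
# signed counterpart once both operands are converted with to_signed().
PREDICATES = {
    "eq": operator.eq,
    "ne": operator.ne,
    "ugt": operator.gt,
    "uge": operator.ge,
    "ult": operator.lt,
    "ule": operator.le,
}


def fold_holds(pred):
    """Return True if both forms of the fold agree for every 8-bit x."""
    compare = PREDICATES[pred]
    for x in range(1 << BITS):
        shifted = (x << 1) & MASK          # shl wraps at the type width
        sx = to_signed(x)
        # Form 1: (x << 1) pred x  versus  x signed-pred 0
        if compare(shifted, x) != compare(sx, 0):
            return False
        # Form 2: x pred (x << 1)  versus  0 signed-pred x
        if compare(x, shifted) != compare(0, sx):
            return False
    return True


results = {pred: fold_holds(pred) for pred in PREDICATES}
```

Calling `fold_holds("ugt")`, for example, compares the unsigned test `(x << 1) > x` against the signed test `x > 0` for all 256 inputs; `results` collects the outcome for every predicate in the table.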