llvm-project

Author	SHA1	Message	Date
Aleksandr Popov	bca5501869	[IRCE] Add NSW flag to main loop's indvar base We have guarantees that the induction variable will not overflow in the main loop after the loop is constrained. Therefore we can add no-wrap flags on its base in order not to miss the info that the loop is countable. Add the NSW flag now, since adding the NUW flag requires a bit more complicated analysis. Reviewed By: skatkov Differential Revision: https://reviews.llvm.org/D154954	2023-07-17 01:03:52 +02:00
Nuno Lopes	68f1391a62	[ScalarizeMaskedMemIntrin] Use poison instead of undef as placeholder [NFC] This is used for masked out lanes, that are replaced with the passthrough value	2023-07-17 10:11:14 +01:00
ManuelJBrito	ace9b6bbf5	[NewGVN] Canonicalize expressions for commutative intrinsics Ensure that commutative intrinsics that only differ by a permutation of their operands get the same value number by sorting the operand value numbers. Fixes https://github.com/llvm/llvm-project/issues/46753 Differential Revision: https://reviews.llvm.org/D155309	2023-07-16 17:24:17 +01:00
Maksim Kita	da822ce90e	[InstCombine] Generalise ((x1 ^ y1) | (x2 ^ y2)) == 0 transform Generalise ((x1 ^ y1) | (x2 ^ y2)) == 0 transform to more than two pairs of variables https://github.com/llvm/llvm-project/issues/57831. Depends D154384. Reviewed By: goldstein.w.n, nikic Differential Revision: https://reviews.llvm.org/D154306	2023-07-15 16:57:16 -05:00
Maksim Kita	39f0afde98	[InstCombine] Generalise ((x1 ^ y1) | (x2 ^ y2)) == 0 transform tests Precommit tests for D154306. Differential Revision: https://reviews.llvm.org/D154384	2023-07-15 16:57:16 -05:00
zhongyunde	4d2723bd00	[ValueTracking] Support vscale assumes for isKnownToBeAPowerOfTwo This patch is separated from D154953 to see what tests are affected by this change alone, according to the review comment. Depends on the related LangRef update in D155193. Reviewed By: paulwalker-arm, nikic, david-arm Differential Revision: https://reviews.llvm.org/D155350	2023-07-15 19:42:58 +08:00
zhongyunde	a41e7a2a5d	[tests] precommit tests for D155350 Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D155363	2023-07-15 19:36:37 +08:00
khei4	b02d349cbf	Revert "Revert "[MemCpyOpt] implement single BB stack-move optimization which unify the static unescaped allocas"" This reverts commit 36a6eb7d12a9f827bf3d5d4e5fdc68b8a62807b2. [MemCpyOpt] check that load/store and dest/src alloca are all in the same bb Differential Revision: https://reviews.llvm.org/D153453 Co-authored-by: serge-sans-paille <sguelton@mozilla.com>	2023-07-15 16:27:38 +09:00
khei4	a92e197114	[MemCpyOpt] precommit tests to add multi-BB stack-move optimization to check crash for D153453 (NFC) Differential Revision: https://reviews.llvm.org/D155179 Co-authored-by: serge-sans-paille <sguelton@mozilla.com>	2023-07-15 16:27:38 +09:00
Zhongyunde	7203286329	[LangRef] vscale_range implies the vscale is power-of-two According to the discussion on D154953, we need to make the LangRef change before optimizations rely on the new behaviour: vscale_range implies vscale is a power-of-two value, with parsing of the attribute rejecting values that are not a power-of-two. Thanks to nikic for the wonderful summary of the discussion on D154953: To provide a bit more context here. We would like to have power of two vscale exposed in a target-independent way, so we can make use of this in places like ValueTracking, just like we currently do the vscale range. Some options that have been discussed are: - Remove support for non-power-of-two vscales entirely. (This is my personal preference, but this is hard to undo if it turns out someone does need them.) - Add an extra attribute vscale_pow2, or a data layout property. - Make vscale_range imply power-of-two vscale, as a compromise solution (what this patch does). This would be relatively easy to turn into one of the two above at a later point. Reviewed By: paulwalker-arm, nikic, efriedma Differential Revision: https://reviews.llvm.org/D155193	2023-07-15 09:13:48 +08:00
Johannes Doerfert	232ce90541	[OpenMP][FIX] Adjust "known" attributes for runtime functions This showed up when we started to deduce readnone for the argument of __kmpc_global_thread_num. The known attributes for "getters" did not allow reading arguments, but getters sometimes do read them.	2023-07-14 17:01:48 -07:00
Johannes Doerfert	55544518c6	[Attributor] Allow IR-attr deduction for non-IPO amendable functions If the function is non-IPO amendable we skip most attributes/AAs. However, if an AA has an isImpliedByIR that can deduce the attribute from other attributes, we can run those. For now, we manually enable them; if we have more later we can use some automation/flag.	2023-07-14 13:54:04 -07:00
Johannes Doerfert	4dc5662c27	[Attributor][NFC] Update all tests with the script Three tests needed manual adjustment after https://reviews.llvm.org/D148216 got reverted. See https://github.com/llvm/llvm-project/issues/63746.	2023-07-14 13:53:38 -07:00
Anna Thomas	dfaf4587e4	Precommit follow-up testcase for interleaved miscompile Follow-up testcase for PR63602. Suggested by Ayal in D154309, more complete fix coming up which should handle this testcase as well.	2023-07-14 16:04:56 -04:00
Nikita Popov	2bc7d02312	Revert "[InstSimplify] Make simplifyWithOpReplaced() recursive (PR63104)" This is very likely the cause of a stage 2 failure in Transforms/LoopVectorize/check-prof-info.ll. Revert until I can investigate this. This reverts commit 3d199d086e076f0b9b90d4c59f2226a417a639b5.	2023-07-14 18:33:39 +02:00
Nikita Popov	3d199d086e	[InstSimplify] Make simplifyWithOpReplaced() recursive (PR63104) Support replacement of operands not only in the immediate instruction, but also instructions it uses. For the most part, this extension is straightforward, but there are two bits worth highlighting: First, we can now no longer assume that if the Op is a vector, the instruction also returns a vector. If Op is a vector and the instruction returns a scalar, we should consider it as a cross-lane operation. Second, for the x ^ x special case, we can no longer assume that the operand is RepOp, as we might have a replacement higher up the instruction chain. There is one optimization regression, but it is in a fuzzer-generated test case. Fixes https://github.com/llvm/llvm-project/issues/63104.	2023-07-14 16:33:40 +02:00
Jay Foad	70eafa391b	[InstCombine] Regenerate AMDGPU test checks	2023-07-14 15:28:55 +01:00
Alexey Bataev	8ab962e411	[SLP]Relax assertion to check if the input scalars were extended to match the size of base node (PR63668). Need to adjust the check for assert and take into account case where the original scalars are reused and were extended to match the vector factor of the reused SLP node.	2023-07-14 07:19:49 -07:00
Nikita Popov	547544112b	[InstSimplify] Allow gep inbounds x, 0 -> x in non-refining op replacement After the semantics change from https://reviews.llvm.org/D154051, gep inbounds x, 0 can no longer produce poison. As such, we can also perform this fold during non-refining operand replacement and avoid unnecessary drops of the inbounds flag. The online alive2 version has not been updated to the new semantics yet, but we can use the following proof locally: define ptr @src(ptr %base, i64 %offset) { %cmp = icmp eq i64 %offset, 0 %gep = getelementptr inbounds i8, ptr %base, i64 %offset %sel = select i1 %cmp, ptr %base, ptr %gep ret ptr %sel } define ptr @tgt(ptr %base, i64 %offset) { %gep = getelementptr inbounds i8, ptr %base, i64 %offset ret ptr %gep }	2023-07-14 16:14:50 +02:00
Nikita Popov	91b84811ab	[InstSimplify] Add tests for recursive simplify with op replaced (NFC)	2023-07-14 16:06:34 +02:00
Alexey Bataev	bc8abb42bb	Revert "[SLP]Relax assertion to check if the input scalars were extended to" This reverts commit 6fdfc81287ecdc2a7f409d08538ec6ce2bd698da to fix the check in the assert (need to use the end function, not begin).	2023-07-14 07:04:06 -07:00
Alexey Bataev	6fdfc81287	[SLP]Relax assertion to check if the input scalars were extended to match the size of base node (PR63668). Need to adjust the check for assert and take into account case where the original scalars are reused and were extended to match the vector factor of the reused SLP node.	2023-07-14 06:48:25 -07:00
Nikita Popov	21827268ad	[InstCombine] Fold add of zext and sext of i1 (zext a) + (sext a) is 0 if a is a bool. The regression is in a fuzzer-generated test. Proof: https://alive2.llvm.org/ce/z/KotnN6	2023-07-14 14:52:13 +02:00
Nikita Popov	893ad30d11	[InstCombine] Add test for add of zext and sext (NFC)	2023-07-14 14:52:13 +02:00
Nikita Popov	dc2b2ae7dc	[InstCombine] Fold cttz of lowest set bit cttz(-a & a) is the same as cttz(a). -a & a is an idiom to extract the lowest set bit, which naturally does not affect the number of trailing zeroes. Proof: https://alive2.llvm.org/ce/z/Yp26x7	2023-07-14 14:31:35 +02:00
Nikita Popov	c8bc1abf55	[InstCombine] Add tests for cttz of lowest set bit (NFC)	2023-07-14 14:31:35 +02:00
Jay Foad	9ff71814cb	[EarlyCSE] Do not CSE convergent calls with memory effects D149348 did this for readnone calls, which are handled by SimpleValue. This patch does the same for all other CSEable calls, which are handled by CallValue. Differential Revision: https://reviews.llvm.org/D153151	2023-07-14 11:43:41 +01:00
Jay Foad	c2f8fe7cd8	[EarlyCSE] Precommit test for D153151 Differential Revision: https://reviews.llvm.org/D155210	2023-07-14 11:43:41 +01:00
Nikita Popov	cd1dcd2c95	[InstCombine] Handle const select arm in foldSelectCtlzToCttz() The select arm that takes the ctlz result can also instead be a constant with the bit width (as this is what the ctlz evaluates to for a==0). This avoids a regression when strengthening the simplifyWithOpReplaced() fold. Proof: https://alive2.llvm.org/ce/z/DMRL5A	2023-07-14 12:00:39 +02:00
Nikita Popov	701a8b348e	[InstCombine] Add test for ctlz->cttz fold with constant in select (NFC)	2023-07-14 11:52:14 +02:00
Noah Goldstein	ddd18d02c7	[InstCombine] Transform `icmp eq/ne ({su}div exact X,Y),C` -> `icmp eq/ne X, YC` We can do this if `YC` doesn't overflow. This is trivial if `C` is 0/1. Otherwise we actually generate a `mul` instruction iff the `div` has one use. Alive2 Links: udiv: https://alive2.llvm.org/ce/z/GWPW67 sdiv: https://alive2.llvm.org/ce/z/bUoX9h Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D150091	2023-07-13 19:36:59 -05:00
Noah Goldstein	fd691fce59	[InstCombine] Add tests for `icmp eq/ne ({su}div exact X, Y), C`; NFC Differential Revision: https://reviews.llvm.org/D150090	2023-07-13 19:36:59 -05:00
Alexey Bataev	ec6b40ab9b	[SLP]Add a test with the stores with long distances between them, NFC.	2023-07-13 15:14:09 -07:00
Nikita Popov	ddb46abd3c	[LSR] Don't consider users of constant outside loop In CollectLoopInvariantFixupsAndFormulae(), LSR looks at users outside the loop. E.g. if we have an addrec based on %base, and %base is also used outside the loop, then we have to keep it in a register anyway, which may make it more profitable to use %base + %idx style addressing. This reasoning doesn't hold up when the base is a constant, because the constant can be rematerialized. The lsr-memcpy.ll test regressed when enabling opaque pointers, because inttoptr (i64 6442450944 to ptr) now also has a use outside the loop (previously it didn't due to a pointer type difference), and that extra "use" results in worse use of addressing modes in the loop. However, the use outside the loop actually gets rematerialized, so the alleged register saving does not occur. The same reasoning also applies to other types of constants, such as global variable references. Differential Revision: https://reviews.llvm.org/D155073	2023-07-13 12:22:38 +02:00
Nikita Popov	e8a5df7beb	[LSR] Add test variant with global variables (NFC) A variant of the test using globals instead of inttoptr expressions for D155073.	2023-07-13 12:12:48 +02:00
khei4	36a6eb7d12	Revert "[MemCpyOpt] implement single BB stack-move optimization which unify the static unescaped allocas" This reverts commit 96ae0851c26237378fa1280b0a9ad713e1b72bdb.	2023-07-13 18:04:49 +09:00
khei4	96ae0851c2	[MemCpyOpt] implement single BB stack-move optimization which unify the static unescaped allocas Differential Revision: https://reviews.llvm.org/D153453	2023-07-13 14:52:30 +09:00
khei4	393215649b	[MemCpyOpt] precommit test to add single BB stack-move optimization (NFC) Differential Revision: https://reviews.llvm.org/D152277	2023-07-13 14:52:30 +09:00
Noah Goldstein	d50c1fcb5d	[InstCombine] Fold `(icmp eq/ne (zext i1 X) (sext i1 Y))`-> `(icmp eq/ne (or X, Y), 0)` This comes up when adding two `bool` types in C/C++ ``` bool foo(bool a, bool b) { return a + b; } ... -> define i1 @foo(i1 %a, i1 %b) { %conv = zext i1 %a to i32 %conv3.neg = sext i1 %b to i32 %tobool4 = icmp ne i32 %conv, %conv3.neg ret i1 %tobool4 } ``` Proof: https://alive2.llvm.org/ce/z/HffWAN Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D154574	2023-07-12 17:17:52 -05:00
Noah Goldstein	83ad4cb61f	[InstCombine] Add tests for folding `(icmp eq/ne (zext i1) (sext i1))`; NFC Differential Revision: https://reviews.llvm.org/D154573	2023-07-12 17:17:52 -05:00
Shilei Tian	bcba20b5d0	[Attributor] Add AAAddressSpace to deduce address spaces This patch adds initial support for the `AAAddressSpace` abstract attributor interface to deduce and query address space information for a pointer. We simply query the underlying objects that a pointer can point to and find a common address space if they exist. This is the minimal support for the interface, we currently manifest changes on loads and stores. Additionally we should use the target transform information to deduce if an address space transformation is a no-op for the target machine when calculating compatibility. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D120586	2023-07-12 15:47:41 -04:00
Eli Friedman	60712732ea	[IndVars] Teach replaceCongruentIVs to avoid scrambling induction variables replaceCongruentIVs analysis is based on ScalarEvolution; this makes comparing different PHIs and performing the replacement straightforward. However, it can have some side-effects: it isn't aware whether an induction variable is in canonical form, so it can perform replacements which obscure the meaning of the IR. In test22 in widen-loop-comp.ll, the resulting loop can't be analyzed by ScalarEvolution at all. My attempted solution is to restrict the transform: don't try to replace induction variables using PHI nodes that don't represent simple induction variables. I'm not sure if this is the best solution; suggestions welcome. Differential Revision: https://reviews.llvm.org/D121950	2023-07-12 12:27:39 -07:00
Anna Thomas	1159266734	[SLP] Add support for fmaximum/fminimum reduction This patch adds support for vectorized reduction of maximum/minimum intrinsics which are under the appropriate reduction kind. Differential Revision: https://reviews.llvm.org/D154463	2023-07-12 15:22:38 -04:00
Anna Thomas	a43aebcd91	[SLP] Test for minimum/maximum reduction minimum/maximum tests from D154463. This contains tests where we vectorize minimum/maximum as well as the tests where we currently do not identify reduction patterns. Differential Revision: https://reviews.llvm.org/D155096	2023-07-12 15:22:37 -04:00
Matt Arsenault	6ed48ebf2e	ValueTracking: Recognize fpclass clamping select patterns Improve computeKnownFPClass select handling to cover the case where the condition performs a class test. This allows us to recognize no-nans in cases like: %not.nan = fcmp ord float %x, 0.0 %select = select i1 %not.nan, float %x, float 0.0 Math library code has similar edge case filtering on the inputs and final results. https://reviews.llvm.org/D153089	2023-07-12 13:14:05 -04:00
Matt Arsenault	05f0de3d74	ValueTracking: Add base computeKnownFPClass select handling tests Prepare to handle class clamping patterns. Working around some kind of select special casing bug in attributor where computeKnownFPClass is never called on select.	2023-07-12 13:14:05 -04:00
Peixin Qiao	ab73bd3897	[InstCombine] Enhance select icmp and folding This folds (a << k) ? 2^k * a : 0 to 2^k * a. https://alive2.llvm.org/ce/z/_dDRjo Fix #62155. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D148420	2023-07-12 22:39:45 +08:00
Maciej Gabka	5b0e19a7ab	[TLI][AArch64] Add mappings to vectorized functions from ArmPL Arm Performance Libraries contain a math library which provides vectorized versions of common math functions. This patch allows using it with clang and llvm via -fveclib=ArmPL or -vector-library=ArmPL, so loops with such calls can be vectorized. The executable needs to be linked with the amath library. Arm Performance Libraries are available at: https://developer.arm.com/Tools%20and%20Software/Arm%20Performance%20Libraries Reviewed by: paulwalker-arm Differential Revision: https://reviews.llvm.org/D154508	2023-07-12 12:53:18 +00:00
Nikita Popov	edb2fc6dab	[llvm] Remove explicit -opaque-pointers flag from tests (NFC) Opaque pointers mode is enabled by default, no need to explicitly enable it.	2023-07-12 14:35:55 +02:00
Krasimir Georgiev	c256e19671	Revert "Revert "IRBuilder: Fix not handling strictfp minnum/maxnum"" This reverts commit 593797ab9bedca6e9b0b7a9ed0589cf76023ab00. I didn't realize that there was already a fix for the broken tests fd2254b7358d0f78a79784688bd8012c1a52b9cf.	2023-07-12 14:13:31 +02:00
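
Several of the InstCombine commits listed above describe small integer identities: cttz(-a & a) == cttz(a), (zext a) + (sext a) == 0 for an i1, icmp ne (zext i1 X), (sext i1 Y) matching (or X, Y) != 0, and icmp eq (udiv exact X, Y), C matching icmp eq X, Y*C when Y*C does not overflow. The following is a minimal brute-force sketch of those identities in Python. It assumes an 8-bit width as a stand-in for the general case and that cttz of zero returns the bit width. All helper names are hypothetical; this is not LLVM code and does not replace the Alive2 proofs linked in the commit messages.

```python
BITS = 8                      # assumed width; chosen small so exhaustive checks are fast
MASK = (1 << BITS) - 1


def cttz(value, bits=BITS):
    """Count trailing zeros of an unsigned value; returns `bits` for zero (assumption)."""
    value &= (1 << bits) - 1
    if value == 0:
        return bits
    return (value & -value).bit_length() - 1


def zext_i1(bit):
    """Zero-extend a 1-bit value: 0 -> 0, 1 -> 1."""
    return bit & 1


def sext_i1(bit, bits=BITS):
    """Sign-extend a 1-bit value: 0 -> 0, 1 -> all ones."""
    return (1 << bits) - 1 if bit & 1 else 0


def check_cttz_lowest_set_bit():
    # cttz(-a & a) == cttz(a): isolating the lowest set bit keeps the trailing zeros.
    return all(cttz((-a & a) & MASK) == cttz(a) for a in range(1 << BITS))


def check_add_zext_sext():
    # (zext a) + (sext a) == 0 in wrapping arithmetic when a is a bool.
    return all((zext_i1(a) + sext_i1(a)) & MASK == 0 for a in (0, 1))


def check_icmp_zext_sext():
    # icmp ne (zext X), (sext Y)  <=>  (X | Y) != 0 for bools X, Y.
    return all(
        (zext_i1(x) != sext_i1(y)) == ((x | y) != 0)
        for x in (0, 1)
        for y in (0, 1)
    )


def check_udiv_exact_eq():
    # icmp eq (udiv exact X, Y), C  <=>  icmp eq X, Y*C, provided Y*C does not overflow.
    for x in range(1 << BITS):
        for y in range(1, 1 << BITS):
            if x % y != 0:        # "exact" requires the division to have no remainder
                continue
            for c in range(1 << BITS):
                if y * c > MASK:  # skip cases where Y*C overflows the width
                    continue
                if ((x // y) == c) != (x == y * c):
                    return False
    return True


if __name__ == "__main__":
    for check in (
        check_cttz_lowest_set_bit,
        check_add_zext_sext,
        check_icmp_zext_sext,
        check_udiv_exact_eq,
    ):
        print(check.__name__, check())
```

Exhaustive checking at a small width is only a sanity check on the arithmetic; it says nothing about poison, undef, or vector semantics, which the linked Alive2 proofs cover.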


26330 Commits