llvm-project

Author	SHA1	Message	Date
Johannes Doerfert	3c8a4c6f47	[OpenMP] Eliminate redundant barriers in the same block Patch originally by Giorgis Georgakoudis (@ggeorgakoudis), typos and bugs introduced later by me. This patch allows us to remove redundant barriers if they are part of a "consecutive" pair of barriers in a basic block with no impacted memory effect (read or write) in-between them. Memory accesses to local (=thread private) or constant memory are allowed to appear. Technically we could also allow any other memory that is not used to share information between threads, e.g., the result of a malloc that is also not captured. However, it will be easier to do more reasoning once the code is put into an AA. That will also allow us to look through phis/selects reasonably. At that point we should also deal with calls, barriers in different blocks, and other complexities. Differential Revision: https://reviews.llvm.org/D118002	2022-02-01 01:07:50 -06:00
Johannes Doerfert	989674f110	[OpenMP] Ensure to remove noinline from all runtime functions eventually We used to remove noinline from known OpenMP runtime functions (which are declared in OMPKinds.td). Now we remove noinline from all functions with the proper prefixes: __kmpc, _ZN4_OMP (= namespace omp), omp_	2022-02-01 01:07:50 -06:00
Serguei Katkov	28c5e1b760	[RS4GC] Make PointerToBase mapping be independent on call site. NFC. PointerToBase is a mapping between potentially derived pointer to its base. As soon as we are in SSA form if there is a base of derived pointer and it is available at def of derived pointer, the same base will be available at any point where derived pointer is alive. So the mapping of derived pointer to base pointer is not a property of a call site but the same on function level. Reviewers: reames, yrouban Reviewed By: reames Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D118604	2022-02-01 11:47:36 +07:00
Fangrui Song	7aaf024dac	[BitcodeWriter] Fix cases of some functions `WriteIndexToFile` is used by external projects so I do not touch it.	2022-01-31 16:46:11 -08:00
Fangrui Song	85dfe19b36	[ModuleUtils] Move EmbedBufferInModule to LLVMTransformsUtils D116542 adds EmbedBufferInModule which introduces a layer violation (https://llvm.org/docs/CodingStandards.html#library-layering). See 2d5f857a1eaf5f7a806d12953c79b96ed8952da8 for detail. EmbedBufferInModule does not use BitcodeWriter functionality and should be moved LLVMTransformsUtils. While here, change the function case to the prevailing convention. It seems that EmbedBufferInModule just follows the steps of EmbedBitcodeInModule. EmbedBitcodeInModule calls WriteBitcodeToFile but has IR update operations which ideally should be refactored to another library. Reviewed By: jhuber6 Differential Revision: https://reviews.llvm.org/D118666	2022-01-31 16:33:57 -08:00
Kirill Stoimenov	a5dd6c7419	[ASan] Fixed null pointer bug introduced in D112098. Also added some more tests to cover the "else if" part. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D118645	2022-01-31 21:50:10 +00:00
William S. Moses	8cb9c73609	[LoopIdiom] Keep TBAA when creating memcpy/memmove When upgrading a loop of load/store to a memcpy, the existing pass does not keep existing aliasing information. This patch allows existing aliasing information to be kept. Reviewed By: jeroen.dobbelaere Differential Revision: https://reviews.llvm.org/D108221	2022-01-31 16:28:13 -05:00
Alexey Bataev	afaaecc88c	[SLP]Alternate vectorization for cmp instructions. Added support for alternate ops vectorization of the cmp instructions. It allows to vectorize either cmp instructions with same/swapped predicate but different (swapped) operands kinds or cmp instructions with different predicates and compatible operands kinds. Differential Revision: https://reviews.llvm.org/D115955	2022-01-31 11:11:25 -08:00
Jay Foad	8faad29634	Revert "[Local] invertCondition: try modifying an existing ICmpInst" This reverts commit a6b54ddaba2d5dc0f72dcc4591c92b9544eb0016. Apparently it is not safe to modify the condition even if it passes the hasOneUse test, because StructurizeCFG might have other references to the condition that are not manifest in the IR use-def chains.	2022-01-31 14:55:36 +00:00
Jay Foad	a6b54ddaba	[Local] invertCondition: try modifying an existing ICmpInst This avoids various cases where StructurizeCFG would otherwise insert an xor i1 instruction, and since it generally runs late in the pipeline, instcombine does not clean up the xor-of-cmp pattern. Differential Revision: https://reviews.llvm.org/D118478	2022-01-31 10:44:17 +00:00
Nikita Popov	4810051a82	[Inline][Cloning] Reliably remove unreachable blocks during cloning (PR53206) The pruning cloner already tries to remove unreachable blocks. The original cloning process will simplify instructions and constant terminators, and only clone blocks that are reachable at that point. However, phi nodes can only be simplified after everything has been cloned. For that reason, additional blocks may become unreachable after phi simplification. The code does try to handle this as well, but only removes blocks that don't have predecessors. It misses unreachable cycles. This can cause issues if SEH exception handling code is part of an unreachable cycle, as the inliner is not prepared to deal with that. This patch instead performs an explicit scan for reachable blocks, and drops everything else. Fixes https://github.com/llvm/llvm-project/issues/53206. Differential Revision: https://reviews.llvm.org/D118449	2022-01-31 09:31:34 +01:00
Max Kazantsev	70b3beb0e2	[InstCombine] Generalize and-reduce pattern to handle `ne` case as well as `eq` Following Sanjay's proposal from discussion in D118317, this patch generalizes and-reduce handling to fold the following pattern ``` icmp ne (bitcast(icmp ne (lhs, rhs)), 0) ``` into ``` icmp ne (bitcast(lhs), bitcast(rhs)) ``` https://alive2.llvm.org/ce/z/WDcuJ_ Differential Revision: https://reviews.llvm.org/D118431 Reviewed By: lebedev.ri	2022-01-31 12:14:08 +07:00
Ricky Zhou	30ac5f9e64	[InstCombine] Do not combine atomic and non-atomic loads Before this change, InstCombine was willing to fold atomic and non-atomic loads through a PHI node as long as the first PHI argument is not an atomic load. The combined load would be non-atomic, which is incorrect. Fix this by only combining the loads in a PHI node when all of the arguments are non-atomic loads. Thanks to Eli Friedman for pointing out the bug at https://github.com/llvm/llvm-project/issues/50777#issuecomment-981045342! Fixes #50777 Differential Revision: https://reviews.llvm.org/D115113	2022-01-30 10:05:11 -05:00
Ricky Zhou	de80b53d1a	[InstCombine] Use range for loops (NFC) Preliminary clean-up for D115113 Differential Revision: https://reviews.llvm.org/D116086	2022-01-30 09:10:39 -05:00
Ricky Zhou	4aabed05a8	[InstCombine] Uppercase some variable names (NFC) Uppercase some variable names, per LLVM coding standards. This change intentionally does not rename every miscased variable, as a follow-up change ( D116086 ) intends to eliminate many of those by switching loops to range for loops. Differential Revision: https://reviews.llvm.org/D118553	2022-01-30 09:10:39 -05:00
Florian Hahn	8f12175fed	[VPlan] Use VPlan to check if only the first lane is used. This removes the remaining dependence on LoopVectorizationCostModel from buildScalarSteps and is required so it can be moved out of ILV. It also allows us to remove a few unneeded instructions. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D116554	2022-01-30 13:07:29 +00:00
Nuno Lopes	dd995aceda	[InstCombine] remove incorrect gep(x, undef) -> undef optimization gep(x, undef) carries the provenance of x, so we can't replace it with any pointer like undef. This leaves room for improvement for the poison case, but that's currently not possible as the demanded bits API doesn't distinguish between undef & poison bits. Fixes #44790	2022-01-30 11:34:32 +00:00
Nuno Lopes	f1c18acb07	[NewGVN] do phi(undef, x) -> x only if x is not poison phi([undef, A], [x, B]) -> x is only correct if x is guaranteed to be a non-poison value. Otherwise we would be changing an undef to poison in the branch A. Differential Revision: https://reviews.llvm.org/D117907	2022-01-29 21:43:57 +00:00
Florian Hahn	efd4938723	[VPlan] Handle IV vector splat using VPWidenCanonicalIV. This patch tries to use an existing VPWidenCanonicalIVRecipe instead of creating another step-vector for canonical induction recipes in widenIntOrFpInduction. This has the following benefits: 1. First step to avoid setting both vector and scalar values for the same induction def. 2. Reducing complexity of widenIntOrFpInduction through making things more explicit in VPlan 3. Only need to splat the vector IV for block in masks. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D116123	2022-01-29 16:25:27 +00:00
Max Kazantsev	3b194ca7ab	Recommit "[InstCombine] Fold and-reduce idiom" Checks of original vector types made more thorough. Differential Revision: https://reviews.llvm.org/D118317	2022-01-29 11:27:48 +07:00
Philip Reames	6888081e32	[SLP] Use moveBefore to simplify code [NFC]	2022-01-28 12:44:07 -08:00
Ahmed Bougacha	634ca7349d	[ObjCARC] Require the function argument in the clang.arc.attachedcall bundle. Currently, the clang.arc.attachedcall bundle takes an optional function argument. Depending on whether the argument is present, calls with this bundle have the following semantics: - on x86, with the argument present, the call is lowered to: call _target mov rax, rdi call _objc_retainAutoreleasedReturnValue - on AArch64, without the argument, the call is lowered to: bl _target mov x29, x29 and the objc runtime call is expected to be emitted separately. That's because, on x86, the objc runtime checks for both the mov and the call on x86, and treats the combination as the ARC autorelease elision marker. But on AArch64, it only checks for the dedicated NOP marker, as that's historically been sufficiently unique. Thanks to that, the runtime call wasn't required to be adjacent to the NOP marker, so it wasn't emitted as part of the bundle sequence. This patch unifies both architectures: on AArch64, we now emit all 3 instructions for the bundle. This guarantees that the runtime call is adjacent to the marker in the sequence, and that's information the runtime can use to further optimize this. This helps simplify some of the handling, in particular BundledRetainClaimRVs, which no longer needs to know whether the bundle is sufficient or not: it now always should be. Note that this does not include an AutoUpgrade for the nullary bundles, as they are only produced in ObjCContract as part of the obj/asm emission pipeline, and are not expected to be in bitcode. Differential Revision: https://reviews.llvm.org/D118214	2022-01-28 12:41:45 -08:00
Philip Reames	746e435ff7	Revert "[SLP] Add a clarifying assert in block scheduling [NFC]" This reverts commit db49a78900f5e4b59714565876b5dbb5e2dfe840. The reasoning in the patch applied to a downstream branch, and I got myself confused when trying to split apart pieces. Thankfully, the assert was simply weaker than the actual invariant currently upstream which is that ReadyInsts is not empty.	2022-01-28 12:10:31 -08:00
Andrew Litteken	3785c1d055	[IRSim][IROutliner] Allowing Intrinsic Calls to be Used in Similarity Matching and Outlined Regions Due to some complications with lifetime, and assume-like intrinsics, intrinsics were not included as outlinable instructions. This patch opens up most intrinsics, excluding lifetime and assume-like intrinsics, to be outlined. For similarity, it is required that the intrinsic IDs, and the intrinsics names match exactly, as well as the function type. This puts intrinsics in a different class than normal call instructions (https://reviews.llvm.org/D109448), where the name will no longer have to match. This also adds an additional command line flag debug option to disable outlining intrinsics. Recommit of: 8de76bd569732acae6a10fdcb0152a49f7d4cd39 Adds extra checking of intrinsic function calls names to avoid taking the address of intrinsic calls when extracting function calls. Reviewers: paquette, jroelofs Differential Revision: https://reviews.llvm.org/D109450	2022-01-28 13:52:21 -06:00
Philip Reames	db49a78900	[SLP] Add a clarifying assert in block scheduling [NFC] The fact we could have a block with a valid scheduling window, but nothing to schedule was surprising to me. After digging through the code, this can only happen if we don't find anything to directly vectorize. However, the reduction handling code relies on this mode, so we can't simply consider such trees unvectorizeable. The assert conveys both that this situation can happen, but also that it can only happen for an immediate gather. Context: We built the bundle before deciding that vectorization of a bundle is possible. A side effect of bundle construction is manipulating the scheduling window, so a bundle which isn't vectorizable can cause the creation or expansion of a scheduling window.	2022-01-28 11:08:59 -08:00
Ellis Hoag	eea002a9c4	[InstrProf][NFC] Move function out of InstrProf.h `createIRLevelProfileFlagVar()` seems to be only used in `PGOInstrumentation.cpp` so we move it to that file. Then it can also take advantage of directly using options rather than passing them as arguments. Reviewed By: kyulee, phosek Differential Revision: https://reviews.llvm.org/D118097	2022-01-28 09:24:26 -08:00
Alexey Bataev	cec8b614f3	[SLP]Do not reorder top nodes if they do not require reordering. No need to reorder the top nodes, if they are not stores or insertelement instructions and each node should be analyzed only once, when the bottom-to-top analysis is performed. We still end up with extractelements for the top node scalars and the final shuffle just adds an extra cost and currently crashes the compiler for PHI nodes. Differential Revision: https://reviews.llvm.org/D116760	2022-01-28 09:16:18 -08:00
Nikita Popov	7d176844d0	[CodeExtractor] Fix warning in assert (NFC)	2022-01-28 16:33:34 +01:00
Nikita Popov	cf0357a545	[BasicBlockUtils] Fix typo in API name (NFC) detatch -> detach. As this requires touching all uses, also lower-case it in accordance with the style guide.	2022-01-28 16:32:13 +01:00
Nikita Popov	0ebbf3435f	[ArgPromotion] Don't assume all entry block instrs are executed We should abort this walk if we hit any instruction that is not guaranteed to transfer.	2022-01-28 16:08:42 +01:00
Nikita Popov	8b36c437df	[ArgPromotion] Make areFunctionArgsABICompatible() static (NFC) This function used to be shared with the Attributor, but can now be made private.	2022-01-28 15:26:36 +01:00
Hans Wennborg	fabaca10b8	Revert "[InstCombine] Fold and-reduce idiom" It causes builds to fail with llvm/include/llvm/Support/Casting.h:269: typename llvm::cast_retty<X, Y>::ret_type llvm::cast(Y) [with X = llvm::IntegerType; Y = const llvm::Type; typename llvm::cast_retty<X, Y>::ret_type = const llvm::IntegerType]: Assertion `isa<X>(Val) && "cast<Ty>() argument of incompatible type!"' failed. See the code review for link to a reproducer. > This patch introduces folding of and-reduce idiom and generates code > that is easier to read and which is lest costly in terms of icmp operations. > The folding is > ``` > icmp eq (bitcast(icmp ne (lhs, rhs)), 0) > ``` > into > ``` > icmp eq(bitcast(lhs), bitcast(rhs)) > ``` > > See PR53419. > > Differential Revision: https://reviews.llvm.org/D118317 > Reviewed By: lebedev.ri, spatel This reverts commit 8599bb0f26738ed88aae62aba57d82f7cf326cf9. This also revertes the dependent change: "[Test] Add 'ne' tests for and-reduce pattern folding" This reverts commit a4aaa5995308ac2ba1bf180c9ce9c321cdb9f28a.	2022-01-28 12:16:03 +01:00
Florian Hahn	b339bbdb19	[Matrix] Use ArrayType for allocas instead of VectorType. When creating an alloca to copy a matrix due to memory conflicts, those allocas used to use VectorTypes, which forced them to have huge alignments for large vectors. This patch updates LowerMatrixIntrinsics to use a corresponding array type, like Clang already does, to get more manageable alignments. Reviewed By: anemet, thegameg Differential Revision: https://reviews.llvm.org/D118239	2022-01-28 10:47:52 +00:00
Nikita Popov	91e5096d82	[InlineFunction] Use phis() iterator (NFC)	2022-01-28 10:36:28 +01:00
Florian Hahn	96400f179f	[VPlan] Record whether scalar IVs are needed in induction recipe. (NFC) This explicitly records whether a scalar IV is needed in the VPWidenIntOrFpInductionRecipe, to remove a dependence on the cost-model during its ::execute. It will also be used in D116123 to determine if a vector phi will be generated. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D118167	2022-01-28 09:34:03 +00:00
Max Kazantsev	8599bb0f26	[InstCombine] Fold and-reduce idiom This patch introduces folding of and-reduce idiom and generates code that is easier to read and which is lest costly in terms of icmp operations. The folding is ``` icmp eq (bitcast(icmp ne (lhs, rhs)), 0) ``` into ``` icmp eq(bitcast(lhs), bitcast(rhs)) ``` See PR53419. Differential Revision: https://reviews.llvm.org/D118317 Reviewed By: lebedev.ri, spatel	2022-01-28 11:20:08 +07:00
Ellis Hoag	11d3074267	[InstrProf] Add single byte coverage mode Use the llvm flag `-pgo-function-entry-coverage` to create single byte "counters" to track functions coverage. This mode has significantly less size overhead in both code and data because * We mark a function as "covered" with a store instead of an increment which generally requires fewer assembly instructions * We use a single byte per function rather than 8 bytes per block The trade off of course is that this mode only tells you if a function has been covered. This is useful, for example, to detect dead code. When combined with debug info correlation [0] we are able to create an instrumented Clang binary that is only 150M (the vanilla Clang binary is 143M). That is an overhead of 7M (4.9%) compared to the default instrumentation (without value profiling) which has an overhead of 31M (21.7%). [0] https://groups.google.com/g/llvm-dev/c/r03Z6JoN7d4 Reviewed By: kyulee Differential Revision: https://reviews.llvm.org/D116180	2022-01-27 17:38:55 -08:00
Vitaly Buka	bddc814b44	[msan] Copy origin of byval arguments Depends on D117278 Reviewed By: kda, eugenis Differential Revision: https://reviews.llvm.org/D117285	2022-01-27 16:24:07 -08:00
Florian Hahn	9fd7a2e379	[ConstraintElimination] Use constraints with 0 or 1 coefficients. isConditionImplied is able to correctly handle 0 or 1 coefficients, so let it handle those cases, rather than skipping them.	2022-01-27 18:41:33 +00:00
Florian Hahn	258a0a3a55	[ConstraintElimination] Use simplified constraint for == 0. When checking x == 0, checking x u<= 0 is sufficient and simpler than x u>= 0 && x u<= 0. https://alive2.llvm.org/ce/z/btM7d3 ---------------------------------------- define i1 @src(i4 %a) { %0: %c = icmp eq i4 %a, 0 ret i1 %c } => define i1 @tgt(i4 %a) { %0: %c = icmp ule i4 %a, 0 ret i1 %c } Transformation seems to be correct!	2022-01-27 13:31:23 +00:00
Florian Hahn	a78ce48c37	[ConstraintElimination] Introduce struct to manage constraints. (NFC) This patch adds a struct to manage a list of constraints. It simplifies a follow-up change, that adds pre-conditions that must hold before a list of constraints can be used.	2022-01-27 12:40:09 +00:00
Nikita Popov	d839afe3f9	[InstCombine] Avoid pointer element type access in PointerReplacer This code replaces the address space of the pointers while keeping the element type. Use the appropriate helpers to make this work with opaque pointers.	2022-01-27 12:28:32 +01:00
Nikita Popov	648faa3b5d	[InstCombine] Mark element type access as non-opaque (NFC) Also make the function static to make it more obvious that it is only used in the one place.	2022-01-27 11:40:29 +01:00
Florian Hahn	bb5c1b0691	[LoopVersioning] Use IRBuilder for OR simplification.	2022-01-27 09:55:51 +00:00
Nikita Popov	2c736f666b	[InstCombine] Skip GEP of bitcast transform with opaque pointers This transform is fundamentally incompatible with opaque pointers. Usually we would not hit it anyway because the bitcast is folded away earlier, but due to worklist order it might survive until here, so make sure we bail out explicitly.	2022-01-27 10:51:45 +01:00
Nikita Popov	b7179d9279	[InstCombine] Extract GEP of bitcast folds into separate function (NFC)	2022-01-27 10:48:00 +01:00
Nikita Popov	73cd8e29ad	[InstCombine] Skip PromoteCastOfAllocation() transform under opaque pointers I think this can't be hit anyway (because a ptr-to-ptr bitcast would get folded earlier), but in the interest of being explicit skip this transform for opaque pointers entirely.	2022-01-27 10:25:45 +01:00
Nikita Popov	8d992862a0	[InstCombine] Remove some pointer element type accesses One of these is guarded against opaque pointers, and the others were accessing the call function type in a rather convoluted way.	2022-01-27 10:15:35 +01:00
Benjamin Kramer	f15014ff54	Revert "Rename llvm::array_lengthof into llvm::size to match std::size from C++17" This reverts commit ef8206320769ad31422a803a0d6de6077fd231d2. - It conflicts with the existing llvm::size in STLExtras, which will now never be called. - Calling it without llvm:: breaks C++17 compat	2022-01-26 16:55:53 +01:00
serge-sans-paille	ef82063207	Rename llvm::array_lengthof into llvm::size to match std::size from C++17 As a consequence move llvm::array_lengthof from STLExtras.h to STLForwardCompat.h (which is included by STLExtras.h so no build breakage expected).	2022-01-26 16:17:45 +01:00
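
The message for 4810051a82 above argues that removing only blocks without predecessors misses unreachable cycles, and that an explicit reachability scan catches them. The following is a minimal sketch of that reasoning on a toy control-flow graph. It is not LLVM code: the dictionary-based CFG, the block names, and both helper functions are hypothetical and exist only for illustration.

```python
from collections import deque


def reachable_blocks(cfg, entry):
    """Return the set of blocks reachable from entry via a breadth-first scan."""
    seen = {entry}
    worklist = deque([entry])
    while worklist:
        block = worklist.popleft()
        for succ in cfg.get(block, ()):
            if succ not in seen:
                seen.add(succ)
                worklist.append(succ)
    return seen


def blocks_without_predecessors(cfg, entry):
    """Return the non-entry blocks that no block branches to."""
    has_pred = {succ for succs in cfg.values() for succ in succs}
    return {b for b in cfg if b != entry and b not in has_pred}


# Hypothetical CFG: "a" and "b" branch to each other, but nothing
# reachable from "entry" branches into either of them.
cfg = {
    "entry": ["exit"],
    "exit": [],
    "a": ["b"],
    "b": ["a"],
}

# The predecessor check finds nothing to remove, because "a" and "b"
# each have a predecessor (the other block in the cycle).
dead_by_pred_check = blocks_without_predecessors(cfg, "entry")

# The explicit scan marks everything outside the reachable set as dead.
dead_by_scan = set(cfg) - reachable_blocks(cfg, "entry")
```

In this toy graph the predecessor check reports no dead blocks, while the scan reports both blocks of the cycle, which matches the gap the commit message describes.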

29605 Commits