llvm-project

Author	SHA1	Message	Date
Alexey Bataev	ad2a0ccf8f	[SLP]Alternate vectorization for cmp instructions. Added support for alternate ops vectorization of the cmp instructions. It allows to vectorize either cmp instructions with same/swapped predicate but different (swapped) operands kinds or cmp instructions with different predicates and compatible operands kinds. Differential Revision: https://reviews.llvm.org/D115955	2022-02-03 06:24:10 -08:00
Simon Pilgrim	6b4ebdd46f	ModuleUtils - VFABI::setVectorVariantNames - use ArrayRef<> instead of const SmallVector to pass argument	2022-02-03 12:11:48 +00:00
Florian Hahn	413e47ecd4	[ConstraintElimination] Handle degenerate case with branch to same dest. When a conditional branch has the same block as both true and false successor it is not safe to add the condition. Fixes PR49819.	2022-02-03 11:09:14 +00:00
Roman Lebedev	ee4ba9f3a1	Revert "[SimplifyCFG] Start redesigning `FoldTwoEntryPHINode()`." Unfortunately, it seems we really do need to take the long route; start from the "merge" block, find (all the) "dispatch" blocks, and deal with each "dispatch" block separately, instead of simply starting from each "dispatch" block like it would logically make sense, otherwise we run into a number of other missing folds around `switch` formation, missing sinking/hoisting and phase ordering. This reverts commit 85628ce75b3084dc0f185a320152baf85b59aba7. This reverts commit c5fff9095342a792bf4b9a077fe3c3a83c4e566c. This reverts commit 34a98e1046e3aa55e5f26ab20a15e96b4034d25a. This reverts commit 1e353f092288309d74d380367aa50bbd383780ed.	2022-02-03 12:32:50 +03:00
Florian Mayer	fa75a62cb5	[NFC] pull retvec logic to MemoryTaggingSupport. we will also need this for aarch64 stack tagging. Reviewed By: eugenis Differential Revision: https://reviews.llvm.org/D118852	2022-02-02 16:05:52 -08:00
Fangrui Song	85628ce75b	[SimplifyCFG] Fix -Wunused-variable in -DLLVM_ENABLE_ASSERTIONS=off builds	2022-02-02 15:11:22 -08:00
Florian Mayer	f7a6c341cb	[mte] support more complicated lifetimes (e.g. for exceptions). Reviewed By: eugenis Differential Revision: https://reviews.llvm.org/D118848	2022-02-02 14:39:22 -08:00
Florian Mayer	1d679097da	[NFC] remove excessive whitespace.	2022-02-02 13:35:33 -08:00
Florian Mayer	712b31e2d4	[NFC] factor isStandardLifetime out of HWASan this is so we can use it for aarch64 stack tagging. Reviewed By: eugenis Differential Revision: https://reviews.llvm.org/D118836	2022-02-02 13:23:55 -08:00
Alexey Bataev	8a1dfbc4d8	Revert "[SLP]Alternate vectorization for cmp instructions." This reverts commit 842a2360a84692f2e4c37cc3e652640e6627d004 to fix the bugs reported by users in https://reviews.llvm.org/D115955#3291538.	2022-02-02 12:06:36 -08:00
Anna Thomas	a73e4ce6a5	[LoopFuse] Change DT to reference in FusionCandidate struct. NFC Assertion added in f50821cff0 confirms that the DT is indeed nonnull. Change it to a reference instead of a pointer to make this explicit in FusionCandidate. Suggested in D118472.	2022-02-02 14:55:37 -05:00
Alexey Bataev	842a2360a8	[SLP]Alternate vectorization for cmp instructions. Added support for alternate ops vectorization of the cmp instructions. It allows to vectorize either cmp instructions with same/swapped predicate but different (swapped) operands kinds or cmp instructions with different predicates and compatible operands kinds. Differential Revision: https://reviews.llvm.org/D115955	2022-02-02 10:32:52 -08:00
Alexandros Lamprineas	438a81a284	[Function Specialisation] Fix use after free This is a fix for a use-after-free found by the address sanitizer when compiling GCC: https://github.com/llvm/llvm-project/issues/52821 The Function Specialization pass may remove instructions, cached inside the PredicateBase class, which are later being dereferenced from the SCCPInstVisitor class. To prevent the dangling references I am lazily deleting the dead instructions after the Solver has run. Differential Revision: https://reviews.llvm.org/D118591	2022-02-02 16:32:10 +00:00
Roman Lebedev	c5fff90953	[NFC][SimplifyCFG] Merge `FoldTwoEntryPHINode()` into its only callee	2022-02-02 17:53:56 +03:00
Roman Lebedev	34a98e1046	[NFC][SimplifyCFG] `FoldTwoEntryPHINode()`: s/BB/MergeBB/	2022-02-02 17:53:56 +03:00
Roman Lebedev	1e353f0922	[SimplifyCFG] Start redesigning `FoldTwoEntryPHINode()`. The current `FoldTwoEntryPHINode()` is not quite designed correctly. It starts from the merge point, and then tries to detect the 'divergence' point. Because of that, it is limited to the simple two-predecessor case, where the PHI completely goes away. But that is rather pessimistic, and it doesn't make much sense from the costmodel side of things. For example if there is some other unrelated predecessor of the merge point, we could split the merge point so that the then/else blocks first branch to an empty block and then to the merge point, and then we'd be able to speculate the then/else code. But if we'd instead simply start at the divergence point, and look for the merge point, then we'll just natively support this case. There's also the fact that `SpeculativelyExecuteBB()` already does just that, but only if there is a single block to speculate, and with a much more restrictive cost model. But that also means we have code duplication. Now, sadly, while this is as much NFCI as possible, there is just no way to cleanly migrate to the proper implementation. The results are going to be different somewhat because of various phase ordering effects and SimplifyCFG block iteration strategy.	2022-02-02 17:53:56 +03:00
Benjamin Kramer	0c3d22a592	Revert "[SLP]Alternate vectorization for cmp instructions." This reverts commit 83620bd2ad867f706c699d0f2b8be10e43d9f3d7. It's causing miscompilations, see review comments at https://reviews.llvm.org/D115955	2022-02-02 13:08:51 +01:00
Florian Hahn	1c9f15426f	[GVN] Replace PointerIntPair with separate pointer & kind fields (NFC). After adding another value kind in 8a12cae862af, Value * pointers do not have enough available empty bits to store the kind (e.g. on ARM) To address this, the patch replaces the PointerIntPair with separate value and kind fields.	2022-02-02 09:44:15 +00:00
Florian Hahn	8a12cae862	[GVN] Support load of pointer-select to value-select conversion. This patch extends the available-value logic to detect loads of pointer-selects that can be replaced by a value select. For example, consider the code below: loop: %sel.phi = phi i32* [ %start, %ph ], [ %sel, %ph ] %l = load %ptr %l.sel = load %sel.phi %sel = select cond, %ptr, %sel.phi ... exit: %res = load %sel use(%res) The load of the pointer phi can be replaced by a load of the start value outside the loop and a new phi/select chain based on the loaded values, as illustrated below %l.start = load %start loop: sel.phi.prom = phi i32 [ %l.start, %ph ], [ %sel.prom, %ph ] %l = load %ptr %sel.prom = select cond, %l, %sel.phi.prom ... exit: use(%sel.prom) This is a first step towards allowing vectorizing loops using common libc++ library functions, like std::min_element (https://clang.godbolt.org/z/6czGzzqbs) #include <vector> #include <algorithm> int foo(const std::vector<int> &V) { return *std::min_element(V.begin(), V.end()); } Reviewed By: reames Differential Revision: https://reviews.llvm.org/D118143	2022-02-02 09:23:09 +00:00
serge-sans-paille	e188aae406	Cleanup header dependencies in LLVMCore Based on the output of include-what-you-use. This is a big chunk of changes. It is very likely to break downstream code unless they took a lot of care in avoiding hidden header dependencies, something the LLVM codebase doesn't do that well :-/ I've tried to summarize the biggest changes below: - llvm/include/llvm-c/Core.h: no longer includes llvm-c/ErrorHandling.h - llvm/IR/DIBuilder.h no longer includes llvm/IR/DebugInfo.h - llvm/IR/IRBuilder.h no longer includes llvm/IR/IntrinsicInst.h - llvm/IR/LLVMRemarkStreamer.h no longer includes llvm/Support/ToolOutputFile.h - llvm/IR/LegacyPassManager.h no longer includes llvm/Pass.h - llvm/IR/Type.h no longer includes llvm/ADT/SmallPtrSet.h - llvm/IR/PassManager.h no longer includes llvm/Pass.h nor llvm/Support/Debug.h And the usual count of preprocessed lines: $ clang++ -E -Iinclude -I../llvm/include ../llvm/lib/IR/*.cpp -std=c++14 -fno-rtti -fno-exceptions | wc -l before: 6400831 after: 6189948 200k lines less to process is not that bad ;-) Discourse thread on the topic: https://llvm.discourse.group/t/include-what-you-use-include-cleanup Differential Revision: https://reviews.llvm.org/D118652	2022-02-02 06:54:20 +01:00
Sander de Smalen	2a44eaf20f	[LV] Allow a scalable VF for the epilogue. For some reason we limited the epilogue VF to be fixed-width, but there is not necessarily a reason for doing so. If the main VF=vscale x 16, the epilogue VF could be either fixed-width, or a scalable VF up to vscale x 8. Reviewed By: david-arm Differential Revision: https://reviews.llvm.org/D118688	2022-02-01 22:38:55 +00:00
Anna Thomas	f50821cff0	[LoopFuse] Add assertion for non-null DT in fusion candidate The code paths analyzed (all constructor invocations of fusion candidate) pass in a non-null DT. Adding this assert as requested in D118472 before converting this to a reference argument.	2022-02-01 17:00:09 -05:00
Anna Thomas	bc48a26655	[LoopPeel] Use reference instead of pointer for DT argument Cleanup code in peelLoop API. We already have usage of DT without guarding against a null DT, so this change constant folds the remaining null DT checks. Also make the argument a reference so that it is clear the argument is a nonnull DT. Extracted from D118472.	2022-02-01 17:00:08 -05:00
Florian Mayer	aefb2e134d	[hwasan] work around lifetime issue with setjmp. setjmp can return twice, but PostDominatorTree is unaware of this. as such, it overestimates postdominance, leaving some cases (see attached compiler-rt) where memory does not get untagged on return. this causes false positives later in the program execution. this is a crude workaround to unblock use-after-scope for now, in the longer term PostDominatorTree should be made aware of returns_twice functions, as this may cause problems elsewhere. Reviewed By: eugenis Differential Revision: https://reviews.llvm.org/D118647	2022-02-01 12:14:20 -08:00
Matt Morehouse	de4e8bc3ac	[HWASan] Properly handle musttail calls. Fixes a compile error when the `clang::musttail` attribute is used. Reviewed By: eugenis Differential Revision: https://reviews.llvm.org/D118712	2022-02-01 11:23:43 -08:00
Anna Thomas	4fc52db116	[InstCombine] Remove weaker fence adjacent to a stronger fence We have an instCombine rule to remove identical consecutive fences. We can extend this to remove weaker fences when we have consecutive stronger fence. As stated in the LangRef, a fence with a stronger ordering also implies ordering weaker than itself: "A fence which has seq_cst ordering, in addition to having both acquire and release semantics specified above, participates in the global program order of other seq_cst operations and/or fences." Reviewed-By: reames Differential Revision: https://reviews.llvm.org/D118607	2022-02-01 11:05:34 -08:00
Fangrui Song	30e8f83c84	[GlobalOpt] Don't replace alias with aliasee if either alias/aliasee may be preemptible Generalize D99629 for ELF. A default visibility non-local symbol is preemptible in a -shared link. `isInterposable` is an insufficient condition. Moreover, a non-preemptible alias may be referenced in a sub constant expression which intends to lower to a PC-relative relocation. Replacing the alias with a preemptible aliasee may introduce a linker error. Respect dso_preemptable and suppress optimization to fix the above issues. With the change, `alias = 345` will not be rewritten to use aliasee in a `-fpic` compile. ``` int aliasee; extern int alias __attribute__((alias("aliasee"), visibility("hidden"))); void foo() { alias = 345; } // intended to access the local copy ``` While here, refine the condition for the alias as well. For some binary formats like COFF, `isInterposable` is a sufficient condition. But I think canonicalization for the changed case has little advantage, so I don't bother to add the `Triple(M.getTargetTriple()).isOSBinFormatELF()` or `getPICLevel/getPIELevel` complexity. For instrumentations, it's recommended not to create aliases that refer to globals that have a weak linkage or are preemptible. However, the following is supported and the IR needs to handle such cases. ``` int aliasee __attribute__((weak)); extern int alias __attribute__((alias("aliasee"))); ``` There are other places where GlobalAlias isInterposable usage may need to be fixed. Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D107249	2022-02-01 10:41:16 -08:00
Alexey Bataev	83620bd2ad	[SLP]Alternate vectorization for cmp instructions. Added support for alternate ops vectorization of the cmp instructions. It allows to vectorize either cmp instructions with same/swapped predicate but different (swapped) operands kinds or cmp instructions with different predicates and compatible operands kinds. Differential Revision: https://reviews.llvm.org/D115955	2022-02-01 09:54:20 -08:00
Olle Fredriksson	9d555b4a83	[DFAJumpThreading] make update order deterministic We tracked down some non-determinism in compilation output to the DFAJumpThreading pass. These changes fixed our issue: * Make the DefMap type a MapVector to make its iteration order depend on insertion order. * Sort the values to be inserted into NewDefs by instruction order to make the insertion order deterministic. Since these values come from iterating over a ValueMap, which doesn't have deterministic iteration order, I couldn't fix this at its source. Reviewed By: alexey.zhikhar Differential Revision: https://reviews.llvm.org/D118590	2022-02-01 11:02:58 -05:00
Nikita Popov	1652c3b80c	[GlobalOpt] Avoid early exit before dead constant check In a similar vein to 236fbf571dc6cebcb81ac5187a170c8de6d5bc0e, make sure we don't early-exit before the dead constant check.	2022-02-01 15:57:19 +01:00
Nikita Popov	236fbf571d	[GlobalStatus] Skip non-pointer dead constant users Constant expressions with a non-pointer result type used an early exit that bypassed the later dead constant user check, and resulted in different optimization outcomes depending on whether dead users were present or not. This fixes the issue reported in https://reviews.llvm.org/D117223#3287039.	2022-02-01 15:51:32 +01:00
Benjamin Kramer	5281f0dab2	Revert "[SLP]Alternate vectorization for cmp instructions." This reverts commit afaaecc88c6e5989de8a6a0266610860ef99d9d6. Crashes when compiling SciPy, test case https://reviews.llvm.org/P8276	2022-02-01 11:40:43 +01:00
Florian Hahn	7fe4fa9a0a	[LV] Use onlyFirstLaneDemanded when widening pointer phis (NFCI). This removes another instance of recipe execution still relying on the cost model. Depends on D116554. Reviewed By: david-arm Differential Revision: https://reviews.llvm.org/D116656	2022-02-01 09:50:47 +00:00
Jay Foad	d2e5d3512b	[StructurizeCFG] Clean up some boolean not instructions In some cases StructurizeCFG inserts i1 xor instructions to invert predicates. Add a quick loop to clean these up afterwards if we can get away with modifying an existing compare instruction instead. (StructurizeCFG is generally run late in the pipeline so instcombine does not clean them up for us.) Differential Revision: https://reviews.llvm.org/D118623	2022-02-01 09:35:37 +00:00
Nikita Popov	79179a378b	[ArgPromotion] Use range-based for loop (NFC)	2022-02-01 10:34:14 +01:00
Johannes Doerfert	3b8ffe668d	[Attributor][FIX] Relax assertion in IRPosition::verify A call base can be a floating value if we talk about the instruction and not the return value. This distinction was not made before but is important for liveness, e.g., a call site return value might be unused (=dead) but the call site is not.	2022-02-01 02:25:44 -06:00
Johannes Doerfert	a265cf22af	[Attributor] Introduce the `AA::isPotentiallyReachable` helper APIs To make usage easier (compared to the many reachability related AAs), this patch introduces a helper API, `AA::isPotentiallyReachable`, which performs all the necessary steps. It also does the "backwards" reachability (see D106720) as that simplifies the AA a lot (backwards queries were somewhat different from the other query resolvers), and ensures we use cached values in every stage. To test inter-procedural reachability in a reasonable way this patch includes an extension to `AAPointerInfo::forallInterferingWrites`. Basically, we can exclude writes if they cannot reach a load "during the lifetime" of the allocation. That is, we need to go up the call graph to determine reachability until we can determine the allocation would be dead in the caller. This leads to new constant propagations (through memory) in `value-simplify-pointer-info-gpu.ll`. Note: The new code contains plenty of debug output to determine how reachability queries are resolved. Parts extracted from D110078. Differential Revision: https://reviews.llvm.org/D118673	2022-02-01 01:40:45 -06:00
Johannes Doerfert	b51b83f68e	[Attributor] Introduce the concept of query AAs D106720 introduced features that did not work properly as we could add new queries after a fixpoint was reached and which could not be answered by the information gathered up to the fixpoint alone. As an alternative to D110078, which forced eager computation where we want to continue to be lazy, this patch fixes the problem. QueryAAs are AAs that allow lazy queries during their lifetime. They are never fixed if they have no outstanding dependences and always run as part of the updates in an iteration. To determine if we are done, all query AAs are asked if they received new queries, if not, we only need to consider updated AAs, as before. If new queries are present we go for another iteration. Differential Revision: https://reviews.llvm.org/D118669	2022-02-01 01:40:44 -06:00
Kuter Dinel	b2d1ae0611	[Attributor] AAFunctionReachability, Instruction reachability. This patch implements instruction reachability for AAFunctionReachability attribute. It is used to tell if a certain instruction can reach a function transitively. NOTE: I created a new commit based on D106720 and set the author back to Kuter. Other metadata, etc. is wrong. I also addressed the remaining review comments and fixed the unit test. Differential Revision: https://reviews.llvm.org/D106720	2022-02-01 01:40:44 -06:00
Johannes Doerfert	ac3ec22df9	[Attributor] Use AAFunctionReachability to determine AANoRecurse We missed out on AANoRecurse in the module pass because we had no call graph. With AAFunctionReachability we can simply ask if the function may reach itself. Differential Revision: https://reviews.llvm.org/D110099	2022-02-01 01:40:44 -06:00
Johannes Doerfert	d1186ce7a9	[Attributor] Make interprocedural value explicit in genericValueTraversal genericValueTraversal can look through arguments and allow value simplification across function boundaries. In fact, the latter already happened unchecked. With this change we allow the user of genericValueTraversal to opt-out of interprocedural traversal if required. We explicitly look through arguments now which helps to do various things, incl. the propagation of constants into OpenMP parallel regions (on the host).	2022-02-01 01:40:44 -06:00
Johannes Doerfert	a1db0e523d	[Attributor][FIX] Liveness handling in the isAssumedDead helpers This fixes a conceptual problem with our AAIsDead usage which conflated call site liveness with call site return value liveness. Without the fix tests would obviously miscompile as we make genericValueTraversal more powerful (in a follow up). The effects on the tests are mixed but mostly marginal. The most prominent one is the lack of `noreturn` for functions. The reason is that we make entire blocks live at the same time (for time reasons). Now that we actually look at the block liveness, which we need to do, the return instructions are live and will survive. As an example, `noreturn_async.ll` has been modified to retain the `noreturn` even with block granularity. We could address this easily but there is little need in practice.	2022-02-01 01:18:52 -06:00
Johannes Doerfert	0f471710f8	[Attributor] Use edge liveness rather than block liveness We moved to the edge API a while back, not all uses were adjusted. Edge liveness is more precise.	2022-02-01 01:18:51 -06:00
Johannes Doerfert	53b6753bdd	[Attributor][FIX] Address two oversights in AAIsDead No tests as these were found browsing the code and I'm not sure how to test them properly.	2022-02-01 01:18:51 -06:00
Johannes Doerfert	cfabffb034	[Attributor][NFCI] Improve debug diagnostic	2022-02-01 01:18:51 -06:00
Johannes Doerfert	adf0d57f15	[Attributor] Provide convenient helpers for isAssumedRead{None,Only} We have two attributes that can answer readnone queries. While there is a dependence between them, it seems best to not force the users to know what AA to ask. The helpers also allow to check for readonly nicely. Test changes show where we now deduce readnone but haven't before, mostly because we only asked AAMemoryBehavior and not AAMemoryLocation. AANoAlias has not been ported to the new API yet.	2022-02-01 01:18:51 -06:00
Johannes Doerfert	e140d51319	[Attributor] Use CFG reasoning to filter potentially interfering writes Since D104432 we can look through memory by analyzing all writes that might interfere with a load. This patch provides some logic to exclude writes that cannot interfere with a location, due to CFG reasoning. We make sure to avoid multi-thread write-read situations properly while we ignore writes that cannot reach a load or writes that will be overwritten before the load is reached. Differential Revision: https://reviews.llvm.org/D106397	2022-02-01 01:18:51 -06:00
Johannes Doerfert	191fa419a6	[Attributor][NFC] Make debug output more useful and concise	2022-02-01 01:18:51 -06:00
Johannes Doerfert	3f0e670498	[Attributor][NFCI] Expose some nosync reasoning to outside users. No-sync is a property that we need in more places as complex transformations emerge. To simplify the query we provide an `AA::isNoSyncInst` helper now and expose two existing helpers through the `AANoSync` class.	2022-02-01 01:07:50 -06:00
Johannes Doerfert	a5b6aef24e	[Attributor][NFCI] Remove anonymous namespaces The namespaces made it more complicated to implement static helpers, among other things. We should not need them at all.	2022-02-01 01:07:50 -06:00
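
Illustration for 4fc52db116 ([InstCombine] Remove weaker fence adjacent to a stronger fence): a minimal, self-contained sketch of the ordering comparison that rule relies on. It is not the InstCombine code. The names `Ordering`, `isAtLeastAsStrong` and `redundantFence` are hypothetical, and the sketch assumes both fences use the same synchronization scope. Acquire and release are treated as incomparable, so a pair consisting of one of each is left alone.

```cpp
#include <cassert>

// Orderings a fence instruction may carry, per the LangRef.
// Hypothetical enum for this sketch; not LLVM's AtomicOrdering.
enum class Ordering { Acquire, Release, AcquireRelease, SequentiallyConsistent };

// Hypothetical helper: true if a fence with ordering A gives every
// guarantee that a fence with ordering B gives.
bool isAtLeastAsStrong(Ordering A, Ordering B) {
  if (A == B)
    return true;
  switch (A) {
  case Ordering::SequentiallyConsistent:
    return true; // seq_cst implies acquire, release and acq_rel
  case Ordering::AcquireRelease:
    return B == Ordering::Acquire || B == Ordering::Release;
  case Ordering::Acquire:
  case Ordering::Release:
    return false; // acquire and release do not imply each other
  }
  return false;
}

// For two adjacent fences (same sync scope assumed), return the index of
// the fence that can be dropped: 0 for the first, 1 for the second,
// or -1 if neither ordering covers the other.
int redundantFence(Ordering First, Ordering Second) {
  if (isAtLeastAsStrong(Second, First))
    return 0;
  if (isAtLeastAsStrong(First, Second))
    return 1;
  return -1;
}
```

With identical orderings the sketch drops the first fence, which corresponds to the existing rule for identical consecutive fences mentioned in the commit message.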


29605 Commits