llvm-project

Author	SHA1	Message	Date
Nathan Chancellor	4e0008dcbe	Revert "[InstCombine] try to narrow shifted bswap-of-zext" This reverts commit 9e9bda2e8f5b88715bad767a4b7740df32b040d2. This causes a backend error when building the Linux kernel for arm64. See https://reviews.llvm.org/D122166 for a simplified reproducer.	2022-03-22 17:32:33 -07:00
Vasileios Porpodas	27bd8f9492	Recommit "[SLP] Fix lookahead operand reordering for splat loads." attempt 2, fixed assertion crash. Original review: https://reviews.llvm.org/D121354 This reverts commit f7d7d2a08d16356c57f6d2d36bc2fc0589a55df9.	2022-03-22 16:41:55 -07:00
Philip Reames	7abefc4222	[instcombine] Fold away memset/memmove from otherwise unused alloca The motivation for this is that while both memcpyopt and dse will catch this case, both are limited by MSSA's walk back threshold when finding clobbers. As such, if you have a memcpy of an otherwise dead alloca placed towards the end of a long basic block with lots of other memory instructions, it would be missed. This is a bit undesirable for such an "obviously" useless bit of code. As noted in comments, we should probably generalize instcombine's escape analysis peephole (see visitAllocInst) to allow read xor write. Doing that would subsume this code in a more general way, but is also a more involved change. For the moment, I went with the easiest fix.	2022-03-22 13:48:48 -07:00
Arthur Eubanks	f7d7d2a08d	Revert "Recommit "[SLP] Fix lookahead operand reordering for splat loads."" This reverts commit 79613185d305013de743cdbd6690e4d77c8af27e. Causes crashes, see comments in https://reviews.llvm.org/D121973.	2022-03-22 13:33:49 -07:00
Sanjay Patel	ccf8c969c2	[InstCombine] reorder code, fix formatting; NFC The affected code can be updated to solve #54364, so make some cosmetic diffs before real changes.	2022-03-22 16:33:01 -04:00
Florian Hahn	50c8588e44	[LV] Remove Loop argument from createInductionResumeValues (NFCI). createInductionResumeValues uses its loop argument only to get the pre-header, but the pre-header is already known (we created/cached it earlier). Remove the unneeded loop argument.	2022-03-22 14:23:12 +00:00
Sanjay Patel	60820e53ec	[InstCombine] try to canonicalize logical shift after bswap When shifting by a byte-multiple: bswap (shl X, C) --> lshr (bswap X), C bswap (lshr X, C) --> shl (bswap X), C This is an IR implementation of a transform suggested in D120648. The "swaps cancel" test models the motivating optimization from that proposal. Alive2 checks (as noted in the other review, we could use knownbits to handle shift-by-variable-amount, but that can be an enhancement patch): https://alive2.llvm.org/ce/z/pXUaRf https://alive2.llvm.org/ce/z/ZnaMLf Differential Revision: https://reviews.llvm.org/D122010	2022-03-22 09:10:55 -04:00
Djordje Todorovic	91ea247039	[Debugify] Use DebugifyLevel in Debugify original mode Before this patch the DebugifyLevel option was used for the synthetic mode, so after this, it will be used in the original mode as well. Differential Revision: https://reviews.llvm.org/D115623	2022-03-22 14:04:56 +01:00
Nikita Popov	afb526b3f4	[LICM] Handle store of pointer to itself (PR54495) Rather than iterating over users and comparing operands, iterate over uses and check operand number. Otherwise, we'll end up promoting a store twice if it has two equal operands. This can only happen with opaque pointers, as otherwise both operands differ by a level of indirection, so a bitcast would have to be involved. Fixes https://github.com/llvm/llvm-project/issues/54495.	2022-03-22 14:00:07 +01:00
Sanjay Patel	9e9bda2e8f	[InstCombine] try to narrow shifted bswap-of-zext This is the IR counterpart to 370ebc9d9a573d6 which provided a bswap narrowing fix for issue #53867. Here we can be more general (although I'm not sure yet what would happen for illegal types in codegen - too rare to worry about?): https://alive2.llvm.org/ce/z/3-CPfo This will be more effective if we have moved the shift after the bswap as proposed in D122010, but it is independent of that patch. Differential Revision: https://reviews.llvm.org/D122166	2022-03-22 08:22:30 -04:00
Djordje Todorovic	73777b4c35	[Debugify] Optimize debugify original mode Before we start addressing the issue of having a lot of false positives when using debugify in the original mode, we have made a few patches that should speed up the execution of the testing utility Passes. For example, when testing a large project (let's say the LLVM project itself), we can face a lot of potential DI issues. Usually, we use -verify-each-debuginfo-preserve (which is very similar to -debugify-each): it collects DI metadata before each Pass, and after the Pass it checks whether the Pass preserved the DI metadata. However, we can speed up this process, since we don't need to collect DI metadata before each Pass: we can use the DI metadata collected after the previous Pass in the pipeline as the input for the next Pass. This patch speeds up the utility by ~2x. Differential Revision: https://reviews.llvm.org/D115622	2022-03-22 12:14:00 +01:00
serge-sans-paille	a53b689f0c	Fix missing include under -DEXPENSIVE_CHECK Regression introduced by f1985a3f855d3676c5aad0e5c258d2ea38598f44	2022-03-22 10:37:56 +01:00
serge-sans-paille	f1985a3f85	Cleanup includes: Transforms/IPO Preprocessor output diff: -238205 lines Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup Differential Revision: https://reviews.llvm.org/D122183	2022-03-22 10:06:28 +01:00
Chuanqi Xu	902f4708fe	[NFC] [Coroutines] Remove unnecessary check and constraints on SmallVector The CoroSplit pass used to check for the existence of coroutine intrinsics before starting work. This check is unnecessary and wasteful, since it iterates over the Module. This patch also removes the size constraint on the SmallVector that holds the possible coroutines in the Module. The original value was 4; given that coroutines are actually used in practice, 4 is a relatively low threshold.	2022-03-22 14:24:46 +08:00
Vasileios Porpodas	79613185d3	Recommit "[SLP] Fix lookahead operand reordering for splat loads." Original review: https://reviews.llvm.org/D121354 The original commit 9136145eb019e1d18c966d4d06a3df349b88cc14 broke the build on several targets. Differential Revision: https://reviews.llvm.org/D121973	2022-03-21 15:57:32 -07:00
Hirochika Matsumoto	86f970e595	[IROutliner][NFC] Fix typo in doc of findOrCreatePHIInBlock Typo Fix in Documentation Author: hkmatsumoto Reviewers: AndrewLitteken Differential Revision: https://reviews.llvm.org/D121627	2022-03-21 12:34:20 -05:00
Philip Reames	ee7324b898	Rename mayBeMemoryDependent to mayHaveNonDefUseDependency [nfc]	2022-03-21 10:01:40 -07:00
Andrew Litteken	4e500df89e	[IROutliner] Fix phi nodes when self referential within block but doesn't contain branch When outlining a phi node, if the incoming branch is a block contained in the region and the branch from that block is not outlined, we create broken code. The fix is to recognize when that branch from the included incoming block is not contained, and ignore the region. Reviewer: paquette Differential Revision: https://reviews.llvm.org/D121311	2022-03-21 11:05:15 -05:00
psamolysov-intel	2ed030ba88	[InferAddressSpaces][NFC] Small code improvements for the InferAddressSpaces pass The patch contains a number of code improvements: marking as const everything that can be const and fixing some typos in comments. The patch also removes the shadowing parameter TTI from the rewriteWithNewAddressSpaces method; the TTI parameter is not required because the same field is in the class. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D121671	2022-03-21 11:03:12 -05:00
Alexey Bataev	79a182371e	[SLP]Make stricter check for instructions that do not require scheduling. Need to check that the instructions with external operands can be reordered safely before actually excluding them from the scheduling.	2022-03-21 06:09:12 -07:00
Sophia	72bde608d2	[LV] Fix typo in comment Reviewed by: fhahn (Florian Hahn) Differential Revision: https://reviews.llvm.org/D121781	2022-03-21 20:30:05 +08:00
Florian Hahn	0ebac76e6e	[LV] Remove unneeded Loop argument from completeLoopSkeleton. (NFCI) completeLoopSkeleton uses its loop argument only to get the pre-header, but the pre-header is already known (we created/cached it earlier). Remove the unneeded loop argument.	2022-03-21 10:07:25 +00:00
Andrew Litteken	38e8880e93	[IROutliner] Do not outlined from functions with optnone Since the IROutliner is performing an optimization, it should not outline from functions explicitly marked with optnone. This adds an extra check and test to make sure this does not occur. Reviewers: paquette Differential Revision: https://reviews.llvm.org/D121567	2022-03-20 23:39:23 -05:00
Florian Hahn	487629cc61	[LV] Remove dead Loop argument from emitMemRuntimeChecks. (NFC)	2022-03-20 21:01:15 +00:00
Philip Reames	b7806c8b37	[SLP] Explicit track required stacksave/alloca dependency The semantics of an inalloca alloca instruction requires that it not be reordered with a preceding stacksave intrinsic call. Unfortunately, there's no def/use edge or memory dependence edge. (The memory point is slightly subtle, but in general a new allocation can't alias with a call which executes strictly before it comes into existence.) I'd tried to tackle this same case previously in 689babdf6, but the fix chosen there turned out to be incomplete. As such, this change contains a full revert of the first fix attempt. This was noticed when investigating problems which surfaced with D118538, but this is definitely an existing bug. This time around, I managed to reduce a couple of additional cases, including one which was being actively miscompiled even without the new scheduling change. (See test diffs) Compile time wise, we only spend extra time when seeing a stacksave (rare), and even then we walk the block at most once per schedule window extension. Likely a non-issue.	2022-03-20 13:58:45 -07:00
Kazu Hirata	bce1bf0ee2	[Transform] Apply clang-tidy fixes for readability-redundant-smartptr-get (NFC)	2022-03-20 10:41:22 -07:00
Philip Reames	6253b77da9	[SLP] Respect control dependence within a block during scheduling This fixes an active miscompile visible in the test changes. The basic problem is that the scheduling dependency graph didn't have any edges for control dependence within a single basic block. The result is that we could (and in some rare cases did) perform reorderings within a block which could introduce new undefined behavior along paths which didn't previously contain any. Impact wise, we have two major cases where control is not guaranteed to reach a later instruction in the block: may throw calls, and calls containing infinite loops. * The former case was mostly covered by the memory dependencies, and triggering it requires a function which can throw, but not write to memory. In theory, such a case is possible, but not likely in practice. * The latter case is likely more of an issue in practice. After this code was first written, we changed the IR semantics to allow well defined infinite loops without satisfying mustprogress. Even for C/C++ - which do imply mustprogress - recent changes to how we treat atomics (e.g. an atomic read does not always imply a write) could expose this issue. I'm a bit shocked we don't seem to have a bug report which hit this in real code actually. Compile time wise, this results in a single extra scan of the scheduling window in the common case. Since we stop scanning at the next instruction which isn't guaranteed to execute, no matter what order we traverse instructions in, we scan the block once. The exception to this is that when we extend the scheduling window downwards, we invalidate all dependencies, and thus rescan. So the potentially expensive case is when we have a call in a big schedule window which is frequently extended. We could optimize this case (by caching the last instruction not guaranteed to transfer execution and scanning only the extended window, starting there), but I decided to leave the complexity until it mattered. That same case is already degenerate with memory dependences which is more expensive than the control dependence scan. We could also consider combining the memory dependence and control dependence sets to reduce memory usage, but since it complicates the code slightly and makes debugging a bit harder, I went with the simplest scheme for now. This was noticed while trying to understand the failures reported against D118538, but is not otherwise related to that change.	2022-03-19 13:36:24 -07:00
Florian Hahn	1a820ff039	[LV] Remove unnecessary uses of Loop* (NFC). Update functions that previously took a loop pointer but only to get the pre-header. Instead, pass the block directly. This removes the requirement for the loop object to be created up-front.	2022-03-19 20:18:47 +00:00
Johannes Doerfert	4166738c38	[OpenMP][FIX] Do not crash when kernels are debug wrapper functions With debug information enabled (-g) Clang will wrap the actual target region into a new function which is called from the "kernel". The problem is that the "kernel" is now basically a wrapper without all the things we expect. More importantly, if we end up asking for an AAKernelInfo for the "target region function" we might try to turn it into SPMD mode. That used to cause an assertion as that function doesn't have an appropriately named `_exec_mode` global. While the global is going away soon we still need to make sure to properly handle this case, e.g., perform optimizations reliably. Differential Revision: https://reviews.llvm.org/D122043	2022-03-19 14:15:55 -05:00
Fangrui Song	c6692f819e	[GlobalOpt] Don't replace alias with aliasee if either alias/aliasee may be preemptible Generalize D99629 for ELF. A default visibility non-local symbol is preemptible in a -shared link. `isInterposable` is an insufficient condition. Moreover, a non-preemptible alias may be referenced in a sub constant expression which intends to lower to a PC-relative relocation. Replacing the alias with a preemptible aliasee may introduce a linker error. Respect dso_preemptable and suppress optimization to fix the above issues. With the change, `alias = 345` will not be rewritten to use aliasee in a `-fpic` compile. ``` int aliasee; extern int alias __attribute__((alias("aliasee"), visibility("hidden"))); void foo() { alias = 345; } // intended to access the local copy ``` While here, refine the condition for the alias as well. For some binary formats like COFF, `isInterposable` is a sufficient condition. But I think canonicalization for the changed case has little advantage, so I don't bother to add the `Triple(M.getTargetTriple()).isOSBinFormatELF()` or `getPICLevel/getPIELevel` complexity. For instrumentations, it's recommended not to create aliases that refer to globals that have weak linkage or are preemptible. However, the following is supported and the IR needs to handle such cases. ``` int aliasee __attribute__((weak)); extern int alias __attribute__((alias("aliasee"))); ``` There are other places where GlobalAlias isInterposable usage may need to be fixed. Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D107249	2022-03-18 14:17:05 -07:00
Philip Reames	1093949cff	[SLP] Add comment clarifying assumption that tripped me up [NFC] I keep thinking this assumption is probably exploitable for a bug in the existing implementation, but all of my attempts at writing a test case have failed. So for the moment, just document this very subtle assumption.	2022-03-18 11:40:19 -07:00
Kazu Hirata	3e0f7c7881	[Vectorize] Fix an 'unused function' warning This patch fixes: llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp:3917:13: error: unused function 'needToScheduleSingleInstruction' [-Werror,-Wunused-function]	2022-03-18 11:24:57 -07:00
Kazu Hirata	b3d8c0d069	[Vectorize] Fix an 'unused variable' warning This patch fixes: llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp:8148:18: error: unused variable 'SDTE' [-Werror,-Wunused-variable]	2022-03-18 11:24:54 -07:00
Nick Desaulniers	e1bae23f6f	[SCCP] do not clean up dead blocks that have their address taken Fixes a crash observed in IPSCCP. Because the SCCPSolver has already internalized BlockAddresses as Constants or ConstantExprs, we don't want to try to update their Values in the ValueLatticeElement. Instead, continue to propagate these BlockAddress Constants, continue converting BasicBlocks to unreachable, but don't delete the "dead" BasicBlocks which happen to have their address taken. Leave replacing the BlockAddresses to another pass. Fixes: https://github.com/llvm/llvm-project/issues/54238 Fixes: https://github.com/llvm/llvm-project/issues/54251 Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D121744	2022-03-18 11:02:15 -07:00
Philip Reames	8f108c32bc	Revert "[SLP] Optionally preserve MemorySSA" This reverts commit 1cfa986d68e2f04854ef30c432b8aa28e13a9706. See https://github.com/llvm/llvm-project/issues/54256 for why I'm discontinuing the project. Separately, it turns out that while this patch does correctly preserve MSSA, it's correct only at the end of the pass; not between vectorization attempts. Even if we decide to resurrect this, we'll need to fix that before reapplying.	2022-03-18 10:45:59 -07:00
Florian Mayer	078b546555	[HWASan] do not replace lifetime intrinsics with tagged address. Quote from the LLVM Language Reference If ptr is a stack-allocated object and it points to the first byte of the object, the object is initially marked as dead. ptr is conservatively considered as a non-stack-allocated object if the stack coloring algorithm that is used in the optimization pipeline cannot conclude that ptr is a stack-allocated object. By replacing the alloca pointer with the tagged address before this change, we confused the stack coloring algorithm. Reviewed By: eugenis Differential Revision: https://reviews.llvm.org/D121835	2022-03-18 10:39:51 -07:00
Florian Mayer	dbc918b649	Revert "[HWASan] do not replace lifetime intrinsics with tagged address." Failed on buildbot: /home/buildbot/buildbot-root/llvm-clang-x86_64-sie-ubuntu-fast/build/bin/llc: error: : error: unable to get target for 'aarch64-unknown-linux-android29', see --version and --triple. FileCheck error: '<stdin>' is empty. FileCheck command line: /home/buildbot/buildbot-root/llvm-clang-x86_64-sie-ubuntu-fast/build/bin/FileCheck /home/buildbot/buildbot-root/llvm-project/llvm/test/Instrumentation/HWAddressSanitizer/stack-coloring.ll --check-prefix=COLOR This reverts commit 208b923e74feeb986fe5114ca39a74b1d2032ed7.	2022-03-18 10:04:48 -07:00
Florian Hahn	5ab421fb4e	[LICM] Add allowspeculation pass options. This adds a new option to control AllowSpeculation added in D119965 when using `-passes=...`. This allows reproducing #54023 using opt. Reviewed By: aeubanks Differential Revision: https://reviews.llvm.org/D121944	2022-03-18 16:51:57 +00:00
Florian Mayer	208b923e74	[HWASan] do not replace lifetime intrinsics with tagged address. Quote from the LLVM Language Reference If ptr is a stack-allocated object and it points to the first byte of the object, the object is initially marked as dead. ptr is conservatively considered as a non-stack-allocated object if the stack coloring algorithm that is used in the optimization pipeline cannot conclude that ptr is a stack-allocated object. By replacing the alloca pointer with the tagged address before this change, we confused the stack coloring algorithm. Reviewed By: eugenis Differential Revision: https://reviews.llvm.org/D121835	2022-03-18 09:45:05 -07:00
Nikita Popov	ab2284a643	[LowerConstantIntrinsics] Make TLI a required dependency The way the pass is actually used in the optimization pipeline, TLI will be available, but this is not the case when running just -lower-constant-intrinsics in tests, which ends up being quite confusing. Require TLI unconditionally, as we usually do.	2022-03-18 14:59:18 +01:00
Nikita Popov	fc8946fae7	[InstCombine] Remove integer SPF of SPF folds (NFCI) Now that we canonicalize to intrinsics, these folds should no longer be needed. Only one fold that also applies to floating-point min/max is retained.	2022-03-18 10:20:48 +01:00
Nikita Popov	f96428e16d	[MemorySSA] Don't optimize uses during construction This changes MemorySSA to be constructed in unoptimized form. MemorySSA::ensureOptimizedUses() can be called to optimize all uses (once). This should be done by passes where having optimized uses is beneficial, either because we're going to query all uses anyway, or because we're doing def-use walks. This should help reduce the compile-time impact of MemorySSA for some use cases (the reason why I started looking into this is D117926), which can avoid optimizing all uses upfront, and instead only optimize those that are actually queried. Actually, we have an existing use-case for this, which is EarlyCSE. Disabling eager use optimization there gives a significant compile-time improvement, because EarlyCSE will generally only query clobbers for a subset of all uses (this change is not included in this patch). Differential Revision: https://reviews.llvm.org/D121381	2022-03-18 09:56:16 +01:00
Florian Hahn	4a699ae9c6	[LoopSimplifyCFG] Check predecessors of exits before marking them dead. LoopSimplifyCFG may process loops that are not in loop-simplify/canonical form. For loops not in canonical form, exit blocks may be reachable from non-loop blocks and we cannot consider them as dead if they only are not reachable from the loop itself. Unfortunately the smallest test I could come up with requires running multiple passes: -passes='loop-mssa(loop-instsimplify,loop-simplifycfg,simple-loop-unswitch)' The reason is that loops are canonicalized at the beginning of loop pipelines, so a later transform has to break canonical form in a way that breaks LoopSimplifyCFG's dead-exit analysis. Alternatively we could try to require all loop passes to maintain canonical form. That in turn would also require additional verification. Fixes #54023, #49931. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D121925	2022-03-18 08:54:44 +00:00
Andrew Wei	0af3e6a22d	[InstCombine] Sink instructions with multiple users in a successor block. This patch tries to sink instructions when they are only used in a successor block. This is a further enhancement patch based on Anna's commit: D109700, which allows sinking an instruction having multiple uses in a single user. In this patch, sink instructions with multiple users in a single successor block will be supported. It could fix a known issue from rust: https://github.com/rust-lang/rust/issues/51346#issuecomment-394443610 Reviewed By: nikic, reames Differential Revision: https://reviews.llvm.org/D121585	2022-03-18 11:53:45 +08:00
Vasileios Porpodas	9136145eb0	Revert "[SLP] Fix lookahead operand reordering for splat loads." due to build failures This reverts commit 5efa78985bf5cbba1c4346ba41a16435fc516446.	2022-03-17 18:22:04 -07:00
Vasileios Porpodas	5efa78985b	[SLP] Fix lookahead operand reordering for splat loads. Splat loads are inexpensive in X86. For a 2-lane vector we need just one instruction: `movddup (%reg), xmm0`. Using the standard Splat score leads to worse code. This patch adds a new score dedicated for splat loads. Please note that a splat is usually three IR instructions: - It is usually a load and 2 inserts: %ld = load double, double* %gep %ins1 = insertelement <2 x double> poison, double %ld, i32 0 %ins2 = insertelement <2 x double> %ins1, double %ld, i32 1 - But it can also be a load, an insert and a shuffle: %ld = load double, double* %gep %ins = insertelement <2 x double> poison, double %ld, i32 0 %shf = shufflevector <2 x double> %ins, <2 x double> poison, <2 x i32> zeroinitializer Because of this some of the lit tests contain more IR instructions. Differential Revision: https://reviews.llvm.org/D121354	2022-03-17 18:05:54 -07:00
Paul Kirth	964398ccb1	Revert "Revert "Revert "[misexpect] Re-implement MisExpect Diagnostics""" This reverts commit 6cf560d69a222bff4af4e1d092437fd77f0f981c.	2022-03-18 00:21:33 +00:00
Paul Kirth	6cf560d69a	Revert "Revert "[misexpect] Re-implement MisExpect Diagnostics"" I mistakenly reverted my commit, so I'm relanding it. This reverts commit 10866a1df4a82cdc54187330c509a2d46235455d.	2022-03-18 00:04:22 +00:00
Paul Kirth	10866a1df4	Revert "[misexpect] Re-implement MisExpect Diagnostics" This reverts commit e7749d4713a5ec886011ceb0fc821c6723061724.	2022-03-17 23:54:26 +00:00
Paul Kirth	e7749d4713	[misexpect] Re-implement MisExpect Diagnostics Reimplements MisExpect diagnostics from D66324 to reconstruct its original checking methodology only using MD_prof branch_weights metadata. New checks rely on 2 invariants: 1) For frontend instrumentation, MD_prof branch_weights will always be populated before llvm.expect intrinsics are lowered. 2) for IR and sample profiling, llvm.expect intrinsics will always be lowered before branch_weights are populated from the IR profiles. These invariants allow the checking to assume how the existing branch weights are populated depending on the profiling method used, and emit the correct diagnostics. If these invariants are ever invalidated, the MisExpect related checks would need to be updated, potentially by re-introducing MD_misexpect metadata, and ensuring it always will be transformed the same way as branch_weights in other optimization passes. Frontend based profiling is now enabled without using LLVM Args, by introducing a new CodeGen option, and checking if the -Wmisexpect flag has been passed on the command line. Differential Revision: https://reviews.llvm.org/D115907	2022-03-17 23:46:23 +00:00
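
The byte-multiple shift canonicalization described in commit 60820e53ec (`bswap (shl X, C) --> lshr (bswap X), C` and `bswap (lshr X, C) --> shl (bswap X), C`) can be sanity-checked outside of LLVM. The following is a minimal sketch in Python, not the InstCombine implementation: it assumes 32-bit values, and the helper names (`bswap32`, `shl32`, `lshr32`, `check_bswap_shift_identities`) are hypothetical and chosen only for illustration.

```python
import random

MASK32 = 0xFFFFFFFF  # assumption: model a 32-bit integer type


def bswap32(x: int) -> int:
    """Reverse the byte order of a 32-bit value."""
    return int.from_bytes((x & MASK32).to_bytes(4, "little"), "big")


def shl32(x: int, c: int) -> int:
    """Left shift, truncated to 32 bits."""
    return (x << c) & MASK32


def lshr32(x: int, c: int) -> int:
    """Logical (zero-filling) right shift on a 32-bit value."""
    return (x & MASK32) >> c


def check_bswap_shift_identities(trials: int = 1000, seed: int = 0) -> bool:
    """Check both rewrites on random inputs for byte-multiple shift amounts."""
    rng = random.Random(seed)
    for _ in range(trials):
        x = rng.getrandbits(32)
        for c in (0, 8, 16, 24):
            # bswap (shl X, C) --> lshr (bswap X), C
            if bswap32(shl32(x, c)) != lshr32(bswap32(x), c):
                return False
            # bswap (lshr X, C) --> shl (bswap X), C
            if bswap32(lshr32(x, c)) != shl32(bswap32(x), c):
                return False
    return True
```

Calling `check_bswap_shift_identities()` exercises both rewrites for shift amounts of 0, 8, 16 and 24 bits. The byte-multiple restriction matters: with a shift of 4, `bswap32(shl32(1, 4))` and `lshr32(bswap32(1), 4)` differ. Random testing like this is only a sanity check; the Alive2 links in the commit message are the actual proofs.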

30058 Commits