llvm-project

Author	SHA1	Message	Date
Wei Mi	ee35784a90	[SampleFDO] Support enabling -funique-internal-linkage-name. now -funique-internal-linkage-name flag is available, and we want to flip it on by default since it is beneficial to have separate sample profiles for different internal symbols with the same name. As a preparation, we want to avoid regression caused by the flip. When we flip -funique-internal-linkage-name on, the profile is collected from binary built without -funique-internal-linkage-name so it has no uniq suffix, but the IR in the optimized build contains the suffix. This kind of mismatch may introduce transient regression. To avoid such mismatch, we introduce a NameTable section flag indicating whether there is any name in the profile containing uniq suffix. Compiler will decide whether to keep uniq suffix during name canonicalization depending on the NameTable section flag. The flag is only available for extbinary format. For other formats, by default compiler will keep uniq suffix so they will only experience transient regression when -funique-internal-linkage-name is just flipped. Another type of regression is caused by places where we miss to call getCanonicalFnName. Those places are fixed. Differential Revision: https://reviews.llvm.org/D96932	2021-03-09 21:41:40 -08:00
Philip Reames	cf1899e0a9	[rs4gc] common bdv operand visitation [nfc]	2021-03-09 20:28:47 -08:00
Arnold Schwaighofer	590ac0a26a	[coro async] Transfer the original function's attributes to the clone rdar://75052917 Differential Revision: https://reviews.llvm.org/D98051	2021-03-09 17:01:41 -08:00
Leonard Chan	cf371573b0	[llvm] Change DSOLocalEquivalent type if the underlying global value type changes We encountered an issue where LTO running on IR that used the DSOLocalEquivalent constant would result in bad codegen. The underlying issue was ValueMapper wasn't properly handling DSOLocalEquivalent, so this just adds the machinery for handling it. This code path is triggered by a fix to DSOLocalEquivalent::handleOperandChangeImpl where DSOLocalEquivalent could potentially not have the same type as its underlying GV. This updates DSOLocalEquivalent::handleOperandChangeImpl to change the type if the GV type changes and handles this constant in ValueMapper. Differential Revision: https://reviews.llvm.org/D97978	2021-03-09 15:09:48 -08:00
Sanjay Patel	23fd647cc6	[SLP] remove dead null check; NFC We cast<> to Instruction (not dyn_cast<>), so we already required/assumed that Cmp is not null.	2021-03-09 17:43:07 -05:00
Jianzhou Zhao	8506fe5b41	[dfsan] Tracking origins at memory transfer This is a part of https://reviews.llvm.org/D95835. Reviewed By: morehouse Differential Revision: https://reviews.llvm.org/D98192	2021-03-09 22:15:07 +00:00
Juneyoung Lee	f49354838e	Revert "[InstCombine] Add simplification of two logical and/ors" This reverts commit 07c3b97e184d5bd828b8a680cdce46e73f3db9fc due to a reported failure in two-stage build.	2021-03-10 05:48:31 +09:00
gbtozers	df69c69427	[DebugInfo] Handle multiple variable location operands in IR This patch updates the various IR passes to correctly handle dbg.values with a DIArgList location. This patch does not actually allow DIArgLists to be produced by salvageDebugInfo, and it does not affect any pass after codegen-prepare. Other than that, it should cover every IR pass. Most of the changes simply extend code that operated on a single debug value to operate on the list of debug values in the style of any_of, all_of, for_each, etc. Instances of setOperand(0, ...) have been replaced with with replaceVariableLocationOp, which takes the value that is being replaced as an additional argument. In places where this value isn't readily available, we have to track the old value through to the point where it gets replaced. Differential Revision: https://reviews.llvm.org/D88232	2021-03-09 16:44:38 +00:00
Sanjay Patel	2986a9c7e2	[InstCombine] canonicalize 'not' op after min/max intrinsic This is another step towards parity between existing select transforms and min/max intrinsics (D98152).. The existing 'not' folds around select are complicated, so it's likely that we will need to enhance this, but this should be a safe step.	2021-03-09 11:33:28 -05:00
Sanjay Patel	41b9209a12	[InstCombine] fold min/max intrinsics with not ops This is a partial translation of the existing select-based folds. We need to recreate several different transforms to avoid regressions as noted in D98152. https://alive2.llvm.org/ce/z/teuZ_J	2021-03-09 08:55:48 -05:00
Florian Hahn	92da5b7119	[InstCombine] Simplify phis with incoming pointer-casts. If the incoming values of a phi are pointer casts of the same original value, replace the phi with a single cast. Such redundant phis are somewhat common after loop-rotate and removing them can avoid some unnecessary code bloat, e.g. because an iteration of a loop is peeled off to make the phi invariant. It should also simplify further analysis on its own. InstCombine already uses stripPointerCasts in a couple of places and also simplifies phis based on the incoming values, so the patch should fit in the existing scope. The patch causes binary changes in 47 out of 237 benchmarks in MultiSource/SPEC2000/SPEC2006 with -O3 -flto on X86. Reviewed By: lebedev.ri Differential Revision: https://reviews.llvm.org/D98058	2021-03-09 11:40:18 +00:00
Hongtao Yu	4c3d759d00	[CSSPGO] Always use callsite samples as callsite probe counts. For CS profile, the callsite count of previously inlined callees is populated with the entry count of the callees. Therefore when trying to get a weight for calliste probe after inlinining, the callsite count should always be used. The same fix has already been made for non-probe case. Reviewed By: wenlei Differential Revision: https://reviews.llvm.org/D98094	2021-03-08 22:52:36 -08:00
Akira Hatanaka	dca5737945	Move ObjCARCUtil.h back to llvm/Analysis Instead of adding the header to llvm/IR, just duplicate the marker string in the auto upgrader.	2021-03-08 16:35:24 -08:00
Alina Sbirlea	29482426b5	Revert "[LICM] Make promotion faster" Revert 3d8f842712d49b0767832b6e3f65df2d3f19af4e Revision triggers a miscompile sinking a store incorrectly outside a threading loop. Detected by tsan. Reverting while investigating. Differential Revision: https://reviews.llvm.org/D89264	2021-03-08 12:53:03 -08:00
Philip Reames	ebc61f9d3c	[instcombine] Collapse trivial or recurrences If we have a recurrence of the form <Start, Or, Step> we know that the value taken by the recurrence stabilizes on the first iteration (provided step is loop invariant). We can exploit that fact to remove the loop carried dependence in the recurrence. Differential Revision: https://reviews.llvm.org/D97578 (or part)	2021-03-08 09:21:38 -08:00
Philip Reames	239a618180	[instcombine] Collapse trivial and recurrences If we have a recurrence of the form <Start, And, Step> we know that the value taken by the recurrence stabilizes on the first iteration (provided step is loop invariant). We can exploit that fact to remove the loop carried dependence in the recurrence. Differential Revision: https://reviews.llvm.org/D97578 (and part)	2021-03-08 09:21:38 -08:00
Philip Reames	97a7bc5831	[gvn] Precisely propagate equalities to phi operands The code used for propagating equalities (e.g. assume facts) was conservative in two ways - one of which this patch fixes. Specifically, it shifts the code reasoning about whether a use is dominated by the end of the assume block to consider phi uses to exist on the predecessor edge. This matches the dominator tree handling for dominates(Edge, Use), and simply extends it to dominates(BB, Use). Note that the decision to use the end of the block is itself a conservative choice. The more precise option would be to use the later of the assume and the value, and replace all uses after that. GVN handles that case separately (with the replace operand mechanism) because it used to be expensive to ask dominator questions within blocks. With the new instruction ordering support, we should probably rewrite this code at some point to simplify. Differential Revision: https://reviews.llvm.org/D98082	2021-03-08 08:59:00 -08:00
Sanne Wouda	05a6e2eb9a	[InstCombine] Add a combine for a shuffle of similar bitcasts Some intrinsics wrapper code has the habit of ignoring the type of the elements in vectors, thinking of vector registers as a "bag of bits". As a consequence, some operations are shared between vectors of different types are shared. For example, functions that rearrange elements in a vector can be shared between vectors of int32 and float. This can result in bitcasts in awkward places that prevent the backend from recognizing some instructions. For AArch64 in particular, it inhibits the selection of dup from a general purpose register (GPR), and mov from GPR to a vector lane. This patch adds a pattern in InstCombine to move the bitcasts past the shufflevector if this is possible. Sometimes this even allows InstCombine to remove the bitcast entirely, as in the included tests. Alternatively this could be done with a few extra patterns in the AArch64 backend, but InstCombine seems like a better place for this. Differential Revision: https://reviews.llvm.org/D97397	2021-03-08 16:32:30 +00:00
Sanne Wouda	5e963a2441	Rehome an orphaned comment [NFC] As seen in 35827164c4, the "shuffle x, x, mask" comment has drifted away from the implementation of the pattern. Put it back.	2021-03-08 16:32:30 +00:00
Stephen Tozer	4343c68fa3	Fix: [DebugInfo] Support DIArgList in DbgVariableIntrinsic This patch removed the only use of a lambda capture, triggering an error on `-Werror -Wunused-lambda-capture` builds.	2021-03-08 14:57:11 +00:00
gbtozers	e5d958c456	[DebugInfo] Support DIArgList in DbgVariableIntrinsic This patch updates DbgVariableIntrinsics to support use of a DIArgList for the location operand, resulting in a significant change to its interface. This patch does not update all IR passes to support multiple location operands in a dbg.value; the only change is to update the DbgVariableIntrinsic interface and its uses. All code outside of the intrinsic classes assumes that an intrinsic will always have exactly one location operand; they will still support DIArgLists, but only if they contain exactly one Value. Among other changes, the setOperand and setArgOperand functions in DbgVariableIntrinsic have been made private. This is to prevent code from setting the operands of these intrinsics directly, which could easily result in incorrect/invalid operands being set. This does not prevent these functions from being called on a debug intrinsic at all, as they can still be called on any CallInst pointer; it is assumed that any code directly setting the operands on a generic call instruction is doing so safely. The intention for making these functions private is to prevent DIArgLists from being overwritten by code that's naively trying to replace one of the Values it points to, and also to fail fast if a DbgVariableIntrinsic is updated to use a DIArgList without a valid corresponding DIExpression.	2021-03-08 14:36:13 +00:00
Ta-Wei Tu	df9158c9a4	[LoopInterchange] Replace tightly-nesting-ness check with the one from `LoopNest` The check `tightlyNested()` in `LoopInterchange` is similar to the one in `LoopNest`. In fact, the former misses some cases where loop-interchange is not feasible and results in incorrect behaviour. Replacing it with the much robust version provided by `LoopNest` reduces code duplications and fixes https://bugs.llvm.org/show_bug.cgi?id=48113. `LoopInterchange` has a weaker definition of tightly or perfectly nesting-ness than the one implemented in `LoopNest::arePerfectlyNested()`. Therefore, `tightlyNested()` is instead implemented with `LoopNest::checkLoopsStructure` and additional checks for unsafe instructions. Reviewed By: Whitney Differential Revision: https://reviews.llvm.org/D97290	2021-03-08 11:36:08 +08:00
Mehdi Amini	8d5a981a13	Revert "[SimplifyCFG] Update FoldBranchToCommonDest to be poison-safe" This reverts commit 99108c791de0285ee726a10e8274772b18cee73c. Clang is miscompiling LLVM with this change, a stage-2 build hits multiple failures. As a repro, I built clang in a stage1 directory and used it this way: cmake -G Ninja ../llvm \ -DCMAKE_CXX_COMPILER=`pwd`/../build-stage1/bin/clang++ \ -DCMAKE_C_COMPILER=`pwd`/../build-stage1/bin/clang \ -DLLVM_TARGETS_TO_BUILD="X86;NVPTX;AMDGPU" \ -DLLVM_ENABLE_PROJECTS=mlir \ -DLLVM_BUILD_EXAMPLES=ON \ -DCMAKE_BUILD_TYPE=Release \ -DLLVM_ENABLE_ASSERTIONS=On ninja check-mlir	2021-03-08 00:15:47 +00:00
Juneyoung Lee	07c3b97e18	[InstCombine] Add simplification of two logical and/ors This is a patch that adds folding of two logical and/ors that share one variable: a && (a && b) -> a && b a && (a & b) -> a && b ... This is towards removing the poison-unsafe select optimization (D93065 has more context). Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D96945	2021-03-08 02:38:43 +09:00
Juneyoung Lee	d672c81126	[InstCombine] use safe transformation by default .. since it will be folded into and/or anyway	2021-03-08 02:25:29 +09:00
Nikita Popov	2b494f85f1	[CVP] Remove -cvp-dont-add-nowrap-flags option This option was originally added to work around a bug in LFTR. The bug has long since been fixed.	2021-03-07 18:19:31 +01:00
Nikita Popov	176bbcae11	[DSE] Remove MemDep-based implementation The MemorySSA-based implementation has been enabled without issue for a while now, so keeping the old implementation around doesn't seem useful anymore. This drops the MemDep-based implementation. Differential Revision: https://reviews.llvm.org/D97877	2021-03-07 18:17:31 +01:00
Juneyoung Lee	33590ed4f2	[InstCombine] fix another poison-unsafe select transformation This fixes another unsafe select folding by disabling it if EnableUnsafeSelectTransform is set to false. EnableUnsafeSelectTransform's default value is true, hence it won't affect generated code (unless the flag is explicitly set to false).	2021-03-08 02:11:04 +09:00
Juneyoung Lee	99108c791d	[SimplifyCFG] Update FoldBranchToCommonDest to be poison-safe This patch makes FoldBranchToCommonDest merge branch conditions into `select i1` rather than `and/or i1` when it is called by SimplifyCFG. It is known that merging conditions into and/or is poison-unsafe, and this is towards making things more correct by removing possible miscompilations. Currently, InstCombine simply consumes these selects into and/or of i1 (which is also unsafe), so the visible effect would be very small. The unsafe select -> and/or transformation will be removed in the future. There has been efforts for updating optimizations to support the select form as well, and they are linked to D93065. The safe transformation is fired when it is called by SimplifyCFG only. This is done by setting the new `PoisonSafe` argument as true. Another place that calls FoldBranchToCommonDest is LoopSimplify. `PoisonSafe` flag is set to false in this case because enabling it has a nontrivial impact in performance because SCEV is more conservative with select form and InductiveRangeCheckElimination isn't aware of select form of and/or i1. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D95026	2021-03-08 01:38:03 +09:00
Juneyoung Lee	5bb38e84d3	[LoopUnswitch] unswitch if cond is in select form of and/or as well Hello all, I'm trying to fix unsafe propagation of poison values in and/or conditions by using equivalent select forms (`select i1 A, i1 B, i1 false` and `select i1 A, i1 true, i1 false`) instead. D93065 has links to patches for this. This patch allows unswitch to happen if the condition is in this form as well. `collectHomogenousInstGraphLoopInvariants` is updated to keep traversal if Root and the visiting I matches both m_LogicalOr()/m_LogicalAnd(). Other than this, the remaining changes are almost straightforward and simply replaces Instruction::And/Or check with match(m_LogicalOr()/m_LogicalAnd()). Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D97756	2021-03-08 01:19:43 +09:00
Nikita Popov	3fedaf2a52	[GVN] Don't explicitly materialize undefs from dead blocks When materializing an available load value, do not explicitly materialize the undef values from dead blocks. Doing so will will force creation of a phi with an undef operand, even if there is a dominating definition. The phi will be folded away on subsequent GVN iterations, but by then we may have already poisoned MDA cache slots. Simply don't register these values in the first place, and let SSAUpdater do its thing.	2021-03-06 23:46:24 +01:00
Fangrui Song	fb2cf0dd60	[FunctionImport] Delete unneeded setLive. NFC ValueInfo's in Worklist are guaranteed to be live.	2021-03-06 14:09:54 -08:00
Mauri Mustonen	494b5ba364	[VPlan] Support to widen call intructions in VPlan native path Add support to widen call instructions in VPlan native path by using a correct recipe when such instructions are encountered. This is already used by inner loop vectorizer. Previously call instructions got handled by wrong recipes and resulted in unreachable instruction errors like this one: https://bugs.llvm.org/show_bug.cgi?id=48139. Patch by Mauri Mustonen <mauri.mustonen@tuni.fi> Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D97278	2021-03-06 21:59:52 +00:00
Roman Lebedev	2ad1f5eb1a	[InstCombine] Don't canonicalize (gep i8* X, -(ptrtoint Y)) as (inttoptr (sub (ptrtoint X), (ptrtoint Y))) It's just a wrong thing to do. We introduce inttoptr where there were none, which results in loosing all provenance information because we no longer have a GEP{i,}, and pessimize all future optimizations, because we are basically not allowed to look past `inttoptr`. (gep i8* X, -(ptrtoint Y)) is the canonical form. So just drop this fold. Noticed while reviewing D98120.	2021-03-06 23:00:25 +03:00
Fangrui Song	9fb6782c69	[rs4gc] Fix -Wunused-variable in -DLLVM_ENABLE_ASSERTIONS=off builds	2021-03-06 11:42:27 -08:00
Roman Lebedev	b46c085d2b	[NFCI] SCEVExpander: emit intrinsics for integral {u,s}{min,max} SCEV expressions These intrinsics, not the icmp+select are the canonical form nowadays, so we might as well directly emit them. This should not cause any regressions, but if it does, then then they would needed to be fixed regardless. Note that this doesn't deal with `SCEVExpander::isHighCostExpansion()`, but that is a pessimization, not a correctness issue. Additionally, the non-intrinsic form has issues with undef, see https://reviews.llvm.org/D88287#2587863	2021-03-06 21:52:46 +03:00
Ta-Wei Tu	8a003861a3	[NPM] Add -enable-loopinterchange option to NPM We have the `enable-loopinterchange` option in legacy pass manager but not in NPM. Add `LoopInterchange` pass to the optimization pipeline (at the same position as before) when `enable-loopinterchange` is turned on. Reviewed By: aeubanks, fhahn Differential Revision: https://reviews.llvm.org/D98116	2021-03-07 02:39:28 +08:00
William S. Moses	d163e75c81	[Attributor] Enable heap-to-stack of any size Enable Attributor's heap-to-stack to lower unbounded allocations given a max size of -1 Differential Revision: https://reviews.llvm.org/D97873	2021-03-06 12:57:32 -05:00
Philip Reames	5db2735af9	[gvn] Handle simply phi equivalence cases GVN basically doesn't handle phi nodes at all. This is for a reason - we can't value number their inputs since the predecessor blocks have probably not been visited yet. However, it also creates a significant pass ordering problem. As it stands, instcombine and simplifycfg ends up implementing CSE of phi nodes. This means that for any series of CSE opportunities intermixed with phi nodes, we end up having to alternate instcombine/simplifycfg and gvn to make progress. This patch handles the simplest case by simply preprocessing the phi instructions in a block, and CSEing them if they are syntactically identical. This turns out to be powerful enough to handle many cases in a single invocation of GVN since blocks which use the cse'd phi results are visited after the block containing the phi. If there's a CSE opportunity in one the phi predecessors required to recognize the phi CSE opportunity, that will require a second iteration on the function. (Still within a single run of gvn though.) Compile time wise, this could go either way. On one hand, we're potentially causing GVN to iterate over the function more. On the other, we're cutting down on iterations between two passes and potentially shrinking the IR aggressively. So, a bit unclear what to expect. Note that this does still rely on instcombine to canonicalize block order of the phis, but that's a one time transformation independent of the values incoming to the phi. Differential Revision: https://reviews.llvm.org/D98080	2021-03-06 09:31:12 -08:00
Philip Reames	8fe59ba51e	[rs4gc] track the original value in the state use for base pointer rewriting I'd originally intended to build on this for another purpose and have decided not to, but at a minimum, the stronger asserts are useful.	2021-03-06 08:46:15 -08:00
Philip Reames	6334952ff0	[rs4gc] minor code style improvement	2021-03-06 08:46:15 -08:00
Philip Reames	51b13a7ea0	[gvn] CSE gc.relocates based on meaning, not spelling The last two operands to a gc.relocate represent indices into the associated gc.statepoint's gc bundle list. (Effectively, gc.relocates are projections from the gc.statepoints multiple return values.) We can use this to recognize when two gc.relocates are equivalent (and can be CSEd), even when the indices are non-equal. This is particular useful when considering a chain of multiple statepoints as it lets us eliminate all duplicate gc.relocates in a single pass. Differential Revision: https://reviews.llvm.org/D97974 (Note: Part of the reviewed change was split and landed as f352463a)	2021-03-05 10:16:12 -08:00
Philip Reames	f352463ade	Mark gc.relocate and gc.result as readnone For some reason, we had been marking gc.relocates as reading memory. There's no known reason for this, and I suspect it to be a legacy of very early implementation conservatism. gc.relocate and gc.result are simply projections of the return values from the associated statepoint. Note that the LangRef has always declared them readnone. The EarlyCSE change is simply moving the special casing from readonly to readnone handling. As noted by the test diffs, this does allow some additional CSE when relocates are separated by stores, but since we generate gc.relocates in batches, this is unlikely to help anything in practice. This was reviewed as part of https://reviews.llvm.org/D97974, but split at reviewer request before landing. The motivation is to enable the GVN changes in that patch.	2021-03-05 10:07:17 -08:00
Philip Reames	99f93dd3a5	[rs4gc] avoid insert base computation instructions for deopt uses If we have a value live over a call which is used for deopt at the call, we know that the value must be a base pointer. We can avoid potentially inserting IR to materialize a base for this value. In it's current form, this is mostly a compile time optimization. Building the base pointer graph (and then optimizing it away again) is a relatively expensive operation. We also sometimes end up with better codegen in practice - due to failures in optimizing away the inserted base pointer propogation - but those are optimization bugs we're fixing concurrently. The alternative to this would be to extend the base pointer inference with the ability to generally reuse multiple-base input instructions (phis and selects). That's somewhat invasive and complicated, so we're defering it a bit longer. Differential Revision: https://reviews.llvm.org/D97885	2021-03-05 09:55:36 -08:00
gbtozers	65600cb2a7	[DebugInfo] Add DIArgList MD to store multple values in DbgVariableIntrinsics This patch adds a new metadata node, DIArgList, which contains a list of SSA values. This node is in many ways similar in function to the existing ValueAsMetadata node, with the difference being that it tracks a list instead of a single value. Internally, it uses ValueAsMetadata to track the individual values, but there is also a reasonable amount of DIArgList-specific value-tracking logic on top of that. Similar to ValueAsMetadata, it is a special case in parsing and printing due to the fact that it requires a function state (as it may reference function-local values). This patch should not result in any immediate functional change; it allows for DIArgLists to be parsed and printed, but debug variable intrinsics do not yet recognize them as a valid argument (outside of parsing). Differential Revision: https://reviews.llvm.org/D88175	2021-03-05 17:02:24 +00:00
David Sherwood	fec0a0adac	[SVE][LoopVectorize] Add support for extracting the last lane of a scalable vector There are certain loops like this below: for (int i = 0; i < n; i++) { a[i] = b[i] + 1; *inv = a[i]; } that can only be vectorised if we are able to extract the last lane of the vectorised form of 'a[i]'. For fixed width vectors this already works since we know at compile time what the final lane is, however for scalable vectors this is a different story. This patch adds support for extracting the last lane from a scalable vector using a runtime determined lane value. I have added support to VPIteration for runtime-determined lanes that still permit the caching of values. I did this by introducing a new class called VPLane, which describes the lane we're dealing with and provides interfaces to get both the compile-time known lane and the runtime determined value. Whilst doing this work I couldn't find any explicit tests for extracting the last lane values of fixed width vectors so I added tests for both scalable and fixed width vectors. Differential Revision: https://reviews.llvm.org/D95139	2021-03-05 09:57:56 +00:00
Michael Kruse	b119120673	[clang][OpenMP] Use OpenMPIRBuilder for workshare loops. Initial support for using the OpenMPIRBuilder by clang to generate loops using the OpenMPIRBuilder. This initial support is intentionally limited to: * Only the worksharing-loop directive. * Recognizes only the nowait clause. * No loop nests with more than one loop. * Untested with templates, exceptions. * Semantic checking left to the existing infrastructure. This patch introduces a new AST node, OMPCanonicalLoop, which becomes parent of any loop that has to adheres to the restrictions as specified by the OpenMP standard. These restrictions allow OMPCanonicalLoop to provide the following additional information that depends on base language semantics: * The distance function: How many loop iterations there will be before entering the loop nest. * The loop variable function: Conversion from a logical iteration number to the loop variable. These allow the OpenMPIRBuilder to act solely using logical iteration numbers without needing to be concerned with iterator semantics between calling the distance function and determining what the value of the loop variable ought to be. Any OpenMP logical should be done by the OpenMPIRBuilder such that it can be reused MLIR OpenMP dialect and thus by flang. The distance and loop variable function are implemented using lambdas (or more exactly: CapturedStmt because lambda implementation is more interviewed with the parser). It is up to the OpenMPIRBuilder how they are called which depends on what is done with the loop. By default, these are emitted as outlined functions but we might think about emitting them inline as the OpenMPRuntime does. For compatibility with the current OpenMP implementation, even though not necessary for the OpenMPIRBuilder, OMPCanonicalLoop can still be nested within OMPLoopDirectives' CapturedStmt. Although OMPCanonicalLoop's are not currently generated when the OpenMPIRBuilder is not enabled, these can just be skipped when not using the OpenMPIRBuilder in case we don't want to make the AST dependent on the EnableOMPBuilder setting. Loop nests with more than one loop require support by the OpenMPIRBuilder (D93268). A simple implementation of non-rectangular loop nests would add another lambda function that returns whether a loop iteration of the rectangular overapproximation is also within its non-rectangular subset. Reviewed By: jdenny Differential Revision: https://reviews.llvm.org/D94973	2021-03-04 22:52:59 -06:00
Wei Mi	2357d29335	[SampleFDO] Another fix to prevent repeated indirect call promotion in sample loader pass. In https://reviews.llvm.org/rG5fb65c02ca5e91e7e1a00e0efdb8edc899f3e4b9, to prevent repeated indirect call promotion for the same indirect call and the same target, we used zero-count value profile to indicate an indirect call has been promoted for a certain target. We removed PromotedInsns cache in the same patch. However, there was a problem in that patch described below, and that problem led me to add PromotedInsns back as a mitigation in https://reviews.llvm.org/rG4ffad1fb489f691825d6c7d78e1626de142f26cf. When we get value profile from metadata by calling getValueProfDataFromInst, we need to specify the maximum possible number of values we expect to read. We uses MaxNumPromotions in the last patch so the maximum number of value information extracted from metadata is MaxNumPromotions. If we have many values including zero-count values when we write the metadata, some of them will be dropped when we read them because we only read MaxNumPromotions values. It will allow repeated indirect call promotion again. We need to make sure if there are values indicating promoted targets, those values need to be saved in metadata with higher priority than other values. The patch fixed that problem. We change to use -1 to represent the count of a promoted target instead of 0 so it is easier to sort the values. When we prepare to update the metadata in updateIDTMetaData, we will sort the values in the descending count order and extract only MaxNumPromotions values to write into metadata. Since -1 is the max uint64_t number, if we have equal to or less than MaxNumPromotions of -1 count values, they will all be kept in metadata. If we have more than MaxNumPromotions of -1 count values, we will only save MaxNumPromotions such values maximally. In such case, we have logic in place in doesHistoryAllowICP to guarantee no more promotion in sample loader pass will happen for the indirect call, because it has been promoted enough. With this change, now we can remove PromotedInsns without problem. Differential Revision: https://reviews.llvm.org/D97350	2021-03-04 18:44:12 -08:00
David Blaikie	a2a55def35	Move llvm/Analysis/ObjCARCUtil.h to IR to fix layering. This is included from IR files, and IR doesn't/can't depend on Analysis (because Analysis depends on IR). Also fix the implementation - don't use non-member static in headers, as it leads to ODR violations, inaccurate "unused function" warnings, etc. And fix the header protection macro name (we don't generally include "LIB" in the names, so far as I can tell).	2021-03-04 16:14:53 -08:00
Jianzhou Zhao	db7fe6cd4b	[dfsan] Propagate origin tracking at store This is a part of https://reviews.llvm.org/D95835. Reviewed By: morehouse, gbalats Differential Revision: https://reviews.llvm.org/D97789	2021-03-04 23:34:44 +00:00

1 2 3 4 5 ...

26739 Commits