llvm-project

Author	SHA1	Message	Date
AdityaK	b42d245b77	[GVNHoist] Replace combineKnownMetadata with combineMetadataForCSE (#92197 ) There is no reason to call combineMetadata directly with a list of MD_ nodes. The combineMetadataForCSE function handles all the metadata correctly Partially fixes: #30866	2024-05-15 07:44:34 -07:00
Jie Fu	7c8176ebd3	[Coroutines] Remove unused function (NFC) llvm-project/llvm/lib/Transforms/Coroutines/CoroSplit.cpp:1223:1: error: unused function 'scanPHIsAndUpdateValueMap' [-Werror,-Wunused-function] scanPHIsAndUpdateValueMap(Instruction Prev, BasicBlock NewBlock, ^ 1 error generated.	2024-05-15 22:08:17 +08:00
Hans	3bb39690d7	[coro] Lower `llvm.coro.await.suspend.handle` to resume with tail call (#89751 ) The C++ standard requires that symmetric transfer from one coroutine to another is performed via a tail call. Failure to do so is a miscompile and often breaks programs by quickly overflowing the stack. Until now, the coro split pass tried to ensure this in the `addMustTailToCoroResumes()` function by searching for `llvm.coro.resume` calls to lower as tail calls if the conditions were right: the right function arguments, attributes, calling convention etc., and if a `ret void` was sure to be reached after traversal with some ad-hoc constant folding following the call. This was brittle, as the kind of implicit invariants required for a tail call to happen could easily be broken by other passes (e.g. if some instruction got in between the `resume` and `ret`), see for example 9d1cb18d19862fc0627e4a56e1e491a498e84c71 and 284da049f5feb62b40f5abc41dda7895e3d81d72. Also the logic seemed backwards: instead of searching for possible tail call candidates and doing them if the circumstances are right, it seems better to start with the intention of making the tail calls we need, and forcing the circumstances to be right. Now that we have the `llvm.coro.await.suspend.handle` intrinsic (since f78688134026686288a8d310b493d9327753a022) which corresponds exactly to symmetric transfer, change the lowering of that to also include the `resume` part, always lowered as a tail call.	2024-05-15 15:29:08 +02:00
Pietro Ghiglio	83d9aa2768	[VPlan] Add scalar inferencing support for addrspace cast (#92107 ) Fixes https://github.com/llvm/llvm-project/issues/91434 PR: https://github.com/llvm/llvm-project/pull/92107	2024-05-15 14:03:21 +01:00
Jay Foad	1650f1b3d7	Fix typo "indicies" (#92232 )	2024-05-15 13:10:16 +01:00
Florian Hahn	d187005cad	[VPlan] Update VPBlendRecipe codegen for first-lane only. Update VPBlendRecipe::execute to support generating code for first-lane only. This fixes a crash in the newly added test @test_not_first_lane_only_wide_compare_incoming_order_swapped.	2024-05-15 11:00:15 +01:00
Daniel Kiss	45726c1a3a	[LLVM] Make sanitizers respect the disable_sanitizer_instrumentation attribute. (#91732 ) `disable_sanitizer_instrumentation` is attached to functions that shall not be instrumented, e.g. ifunc resolvers, because those run before everything is initialised. Some sanitizers already handle this attribute; this patch adds it to DataFlow and Coverage too.	2024-05-15 08:40:16 +02:00
Matt Arsenault	d7bb0723fe	InstCombine: Emit ldexp intrinsic in exp2->ldexp combine (#92039 ) Prefer to emit the intrinsic over a libcall in the intrinsic or no-math-errno case.	2024-05-15 07:41:28 +02:00
Matt Arsenault	847c83f7cc	InstCombine: Process addrspacecast uses in PointerReplacer (#91953 ) This was looking through an addrspacecast, and not finding a later unfoldable cast to another address space. Fixes improperly deleting a required alloca + memcpy and introducing an illegal addrspacecast. This also required fixing some worklist management issues with addrspacecast, and assuming that only memcpy sources could need replacement. Regresses one test function, but this looks like it optimized before by accident. It never saw the pointer use by the call to readonly_callee, which should require insertion of a new cast. Fixes #68120	2024-05-15 07:02:31 +02:00
Nikita Popov	71fbbb69d6	[IR] Move GlobalValue::getGUID() out of line (NFC) Avoid including MD5.h in a core IR header.	2024-05-15 10:49:25 +09:00
AtariDreams	4d1ecf1923	[Transforms] Preserve inbounds attribute of transformed GEPs when flattening loops (#86961 ) When flattening the loop, if the GEP was inbound, it should stay inbound, because the only thing that changed is how the pointers are calculated, not the elements being accessed. Proof: https://alive2.llvm.org/ce/z/dApMpQ	2024-05-15 10:26:23 +09:00
Philip Reames	baca93fc83	[LSR] Tweak debug output to always print initial cost	2024-05-14 13:34:20 -07:00
Florian Hahn	67d840b60f	[VPlan] Relax over-aggressive assertion in VPTransformState::get(). There are cases where a vector value has some users that demand the single scalar value only (NeedsScalar), while other users demand the vector value (see attached test cases). In those cases, the NeedsScalar users should only demand the first lane. Fixes https://github.com/llvm/llvm-project/issues/91883.	2024-05-14 19:10:49 +01:00
Mingming Liu	6c8ebc0535	[NFC][CallPromotionUtils]Extract a helper function versionCallSiteWithCond from versionCallSite (#81181 ) * This is to be used by https://github.com/llvm/llvm-project/pull/81378 to implement a variant of versionCallSite that compares vtables. * The parent patch is https://github.com/llvm/llvm-project/pull/81051	2024-05-14 10:13:57 -07:00
Graham Hunter	2b15c4a62b	[AArch64] Postcommit fixes for histogram intrinsic (#92095 ) A buildbot with expensive checks enabled flagged some problems with my patch. There was also a post-commit nit on the langref changes.	2024-05-14 15:16:42 +01:00
AdityaK	bf7a0f9958	Fix incorrect codegen with respect to GEPs #85333 (#92047 ) As mentioned in #68882 and https://discourse.llvm.org/t/rfc-replacing-getelementptr-with-ptradd/68699 Gep arithmetic isn't consistent with different types. GVNSink didn't realize this and sank all geps as long as their operands can be wired via PHIs in a post-dominator. Fixes: #85333 Reapply: #88440 after fixing the non-determinism issues in #90995	2024-05-14 06:13:11 -07:00
Florian Hahn	b1e99a699d	[LV] Drop redundant comment from createEdgeMask (NFC). Follow-up to remove a redundant comment post-commit https://github.com/llvm/llvm-project/pull/91897	2024-05-14 12:43:47 +01:00
Ramkumar Ramachandra	d7ef34bfe3	[LV] update comment following 63d8058 (NFC) (#91120 ) Address a review comment post landing 63d8058 (LoopVectorize: guard appending InstsToScalarize; fix bug) to update a comment.	2024-05-14 10:59:26 +01:00
Florian Hahn	632317e9ab	[VPlan] Add non-poison propagating LogicalAnd VPInstruction opcode. (#91897 ) Add a new opcode to model non-poison propagating logical AND operations used when generating edge masks. This follows the similar decision to model Not as dedicated opcode as well, to improve clarity. This also helps to simplify the matchers for https://github.com/llvm/llvm-project/pull/89386. PR: https://github.com/llvm/llvm-project/pull/91897	2024-05-14 09:42:49 +01:00
Lei Wang	5b6f151104	[SampleFDO] Improve stale profile matching by diff algorithm (#87375 ) This change improves the matching algorithm by using the diff algorithm. The current matching algorithm only processes the callsites grouped by the same name functions; it doesn't consider the order relationships between different name functions, so it sometimes fails to handle the ambiguous anchor case. For example (`foo:1` means a callsite [callee_name: callsite_location]): ``` IR : foo:1 bar:2 foo:4 bar:5 Profile : bar:3 foo:5 bar:6 ``` The `foo:1` is matched to the 2nd `foo:5`, and using the diff algorithm (finding the longest common subsequence) can help on this issue. One well-known diff algorithm is the Myers diff algorithm (paper "An O(ND) Difference Algorithm and Its Variations", Eugene W. Myers); its variations have been implemented and used in many famous tools, like GNU diff or git diff. It provides an efficient way to find the longest common subsequence or the shortest edit script through graph searching. There are several variations/refinements for the algorithm, but as in our case the number of function callsites is usually very small, we implemented the basic greedy version in this change, which should be good enough. We observed better matchings and positive perf improvement on our internal services.	2024-05-13 16:01:29 -07:00
Florian Hahn	e122380445	[LV] Use VPBuilder to create Select (NFCI).	2024-05-13 20:44:39 +01:00
chenlin	79643565a8	[LoopUnroll] Remove redundant debug instructions after blocks have been merged (#91246 ) Remove redundant debug instructions after blocks have been merged into the predecessor, It can reduce some compile time in some cases. This change only fixes the situation of loop unrolling, and other situations are not considered. "RemoveRedundantDbgInstrs" seems to be very time-consuming. Thus, we just add here after the "Dest" has been merged into the "Fold", this may be a more targeted solution!!! fixes: https://github.com/llvm/llvm-project/issues/89073	2024-05-13 09:42:04 -07:00
Paul Kirth	89a080cb79	[llvm][NFC] Document cl::opt MisExpectTolerance and fix typo Pull Request: https://github.com/llvm/llvm-project/pull/90670	2024-05-13 16:19:09 +00:00
Matt Arsenault	8823abea6f	InstCombine: Simplify vector initialization	2024-05-13 13:59:45 +02:00
Orlando Cazalet-Hyams	91d7ca904c	[DebugInfo] Remap extracted DIAssignIDs in hotcoldsplit (#91940 ) Fix #91814 When instructions are extracted into a new function the `DIAssignID` metadata uses and attachments need to be remapped so that the stores and assignment markers don't link to stores and assignment markers in the original function. This matches existing inlining behaviour for DIAssignIDs.	2024-05-13 12:49:42 +01:00
Matt Arsenault	c5b0da9d83	InstCombine: Preserve inbounds in PointerReplacer (#91735 ) This avoids spurious test changes in a future commit.	2024-05-13 13:49:09 +02:00
Graham Hunter	fbb37e9606	[AArch64] Add an all-in-one histogram intrinsic Based on discussion from https://discourse.llvm.org/t/rfc-vectorization-support-for-histogram-count-operations/74788 Current interface is: llvm.experimental.histogram(<vecty> ptrs, <intty> inc_amount, <vecty> mask) The integer type used by 'inc_amount' needs to match the type of the buckets in memory. The intrinsic covers the following operations: * Gather load * histogram on the elements of 'ptrs' * multiply the histogram results by 'inc_amount' * add the result of the multiply to the values loaded by the gather * scatter store the results of the add Supports lowering to histcnt instructions for AArch64 targets, and scalarization for all others at present.	2024-05-13 11:35:28 +01:00
Yingwei Zheng	b5f4210e9f	[InstCombine] Drop nuw flag when CtlzOp is a sub nuw (#91776 ) See the following case: ``` define i32 @src1(i32 %x) { %dec = sub nuw i32 -2, %x %ctlz = tail call i32 @llvm.ctlz.i32(i32 %dec, i1 false) %sub = sub nsw i32 32, %ctlz %shl = shl i32 1, %sub %ugt = icmp ult i32 %x, -2 %sel = select i1 %ugt, i32 %shl, i32 1 ret i32 %sel } define i32 @tgt1(i32 %x) { %dec = sub nuw i32 -2, %x %ctlz = tail call i32 @llvm.ctlz.i32(i32 %dec, i1 false) %sub = sub nsw i32 32, %ctlz %and = and i32 %sub, 31 %shl = shl nuw i32 1, %and ret i32 %shl } ``` `nuw` in `%dec` should be dropped after the select instruction is eliminated. Alive2: https://alive2.llvm.org/ce/z/7S9529 Fixes https://github.com/llvm/llvm-project/issues/91691.	2024-05-13 14:27:59 +08:00
Kazu Hirata	e6785fd752	[Scalar] Fix a warning This patch fixes: llvm/lib/Transforms/Scalar/GVNSink.cpp:270:33: error: lambda capture 'this' is not used [-Werror,-Wunused-lambda-capture] While I am at it, this patch replaces llvm::for_each with a range-based for loop.	2024-05-12 23:02:37 -07:00
AdityaK	abe3c5ac19	[GVNSink] Fix non-determinism by using a deterministic ordering (#90995 ) GVNSink used to order instructions based on their pointer values and was prone to non-determinism because of that. This patch ensures all the values stored are using a deterministic order. I have also added a verifier (`ModelledPHI::verifyModelledPHI`) to assert when ordering isn't preserved. Additionally, I have added a test case (mirror graph image of an existing test) that would have failed before this patch. Fixes: #77852	2024-05-12 19:41:54 -07:00
David Green	b7ed097f29	[VectorCombine] Add intrinsics handling to shuffleToIdentity (#91000 ) This is probably the most involved addition, as it tries to make use of isTriviallyVectorizable with isVectorIntrinsicWithScalarOpAtArg to handle a number of different intrinsics that are all lane-wise. Additional tests have been added for some of the different intrinsics from isVectorIntrinsicWithScalarOpAtArg / isVectorIntrinsicWithOverloadTypeAtArg.	2024-05-12 20:31:11 +01:00
Shan Huang	cdd782183d	[DebugInfo][LICM] Fix missing debug location updates (#91729 )	2024-05-11 16:26:04 +01:00
Shan Huang	3773191fc4	[DebugInfo][JumpThreading] Fix missing debug location updates (#91581 )	2024-05-11 16:10:00 +01:00
Alex Bradbury	3be8e2c95d	[InstCombine] Prefer to keep power-of-2 constants when combining ashr exact and slt/ult of a constant (#86111 ) We have flexibility in what constant to use when combining an `ashr exact` with a slt or ult of a constant, and it's not possible to revisit this decision later in the compilation pipeline after the `ashr exact` is removed. Keeping a constant close to power-of-2 (pow2val + 1) should be no worse than neutral, and in some cases may allow better codegen later on for targets that can more cheaply generate power of 2 (which may be selectable if converting back to setle/setge) or near power of 2 constants. Alive2 proofs: <https://alive2.llvm.org/ce/z/2BmPnq> and <https://alive2.llvm.org/ce/z/DtuhnR>	2024-05-10 13:50:03 +01:00
Florian Hahn	28767afd53	[LAA] Support backward dependences with non-constant distance. (#91525 ) Following up to 933f49248, also update the code reasoning about backwards dependences to support non-constant distances. Update the code to use the signed minimum distance instead of a constant distance. This means we checked the lower bound of the dependence distance and the distance may be larger at runtime (and safe for vectorization). Whether to classify it as Unknown or Backwards depends on the vector width and LAA was updated to take TTI to get the maximum vector register width. If the minimum dependence distance is larger than the max vector width, we consider it as backwards-vectorizable. Otherwise we classify them as Unknown, so we re-try with runtime checks. PR: https://github.com/llvm/llvm-project/pull/91525	2024-05-10 11:47:13 +01:00
Graham Hunter	2e8d815596	[TTI] Support scalable offsets in getScalingFactorCost (#88113 ) Part of the work to support vscale-relative immediates in LSR.	2024-05-10 11:22:11 +01:00
Jeffrey Byrnes	f865dbff17	[SeparateConstOffsetFromGEP] Support GEP reordering for different types (#90802 ) This doesn't show up in existing lit tests, but has an impact on real code -- especially after the canonicalization of GEPs to i8. Alive2 tests for the inbounds handling: Case 1: https://alive2.llvm.org/ce/z/6bfFY3 Case 2: https://alive2.llvm.org/ce/z/DkLMLF	2024-05-09 16:57:36 -07:00
Eli Friedman	f893dccbba	Replace uses of ConstantExpr::getCompare. (#91558 ) Use ICmpInst::compare() where possible, ConstantFoldCompareInstOperands in other places. This only changes places where either the fold is guaranteed to succeed, or the code doesn't use the resulting compare if we fail to fold.	2024-05-09 16:50:01 -07:00
Florian Hahn	c3d2af0f4e	[VPlan] VPEVLBasedIVPHI is a VPSingleDefRecipe. VPEVLBasedIVPHIRecipe inherits from VPSingleDefRecipe. Add VPEVLBasedIVPHISC to VPSingleDefRecipe::classof to make isa/dyn_cast & co work as expected. Split off https://github.com/llvm/llvm-project/pull/67934.	2024-05-09 19:18:37 +01:00
Yuxuan Chen	87235fa9af	[NFC] CoroElide: Refactor `Lowerer` into `CoroIdElider` (#91539 ) This patch contains no functional changes. The main goal of this patch is to get better clarity out of the code, to make intentions and assumptions clear. One major design problem I had in the past were `Lowerer`. It previously inherited from `coro::LowererBase` but it doesn't use any of the fields or methods from `LowererBase`. It might be an artifact leftover from previous designs of this code. Furthermore, we should clarify that although one such instance is bound to the function, `Lowerer` was dedicated to one `CoroId` instruction at a time. We rely on a sequence of fragile constructs like `CoroBegins.clear(); DestroyAddr.clear()`. This doesn't help understand the code. What's worse is that we have confusing calls like `elideHeapAllocations(CoroId->getFunction(), ...` and it might get confused with `CoroId->getCoroutine()`. The new structure intends to make it clear that we always operate on one `CoroId` at a time, which may have multiple `CoroBegin`s. Such structure doesn't rely on frequent `.clear()` that's prone to miss.	2024-05-09 08:21:40 -07:00
Shan Huang	b452b34932	[DebugInfo][IndVarSimplify] Fix missing debug location updates (#91443 ) Adds debug location updates for the newly created `phi`, `add`, `icmp` and `sitofp` instructions in `IndVarSimplify`. Fixes #91436	2024-05-09 12:58:53 +01:00
Alexey Bataev	58a94b1d0a	[SLP]Fix PR91467: Look through scalar cast, when trying to cast to another type. Need to look through the SExt/ZExt scalars to be gathered, when trying to reduce their width after minbitwidth analysis to prevent permanent attempts to revectorize such gathered instructions.	2024-05-09 04:19:43 -07:00
Nikita Popov	534701d5f9	[InstCombine] Handle commuted variants in or of xor pattern This pattern only handled commutation in the "or", while all involved operations are commutative. Make sure we handle all sixteen patterns.	2024-05-09 15:16:25 +09:00
Nikita Popov	0d335f78e4	[InstCombine] Handle more commuted cases in matchesSquareSum()	2024-05-09 12:35:16 +09:00
AtariDreams	409ff97aac	[InstCombine] Fix comment from #88193 (NFC) (#91427 ) It is inaccurate and needs to be corrected.	2024-05-09 09:26:36 +09:00
Mircea Trofin	96568f3539	[llvm][ctx_profile] Add instrumentation lowering (#90821 ) This adds the instrumentation lowering pass. (Tracking Issue: #89287, RFC referenced there)	2024-05-08 16:49:08 -07:00
Arthur Eubanks	2fb3774321	Revert "[SLP]Fix PR91467: Look through scalar cast, when trying to cast to another type." This reverts commit 2475efa91d8b4fa8f1a2d16052cb6d14be7d5dc6. Causes crashes, see comments on `2475efa91d`.	2024-05-08 23:01:47 +00:00
Mingming Liu	64f4ceb09e	[Inline][PGO] After inline, update InvokeInst profile counts in caller and cloned callee (#83809 ) A related change is https://reviews.llvm.org/D133121, which correctly preserves both branch weights and value profiles for invoke instruction. * If the branch weight of the `invokeinst` specifies taken / not-taken branches, there is no scale.	2024-05-08 15:48:40 -07:00
Teresa Johnson	cec6665f2b	[MemProf] Optionally update hints on existing hot/cold new calls (#91047 ) If directed by an option, update hints on calls to new that already provide a hot/cold hint.	2024-05-08 13:41:29 -07:00
AdityaK	42d99013bd	NFC: Add a comment indicating UpdateAnalysisInformation invalidates DFS Numbering (#91252 )	2024-05-08 11:03:46 -07:00
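
The stale-profile matching commit above (5b6f151104) describes aligning IR callsite anchors with profile anchors by finding a longest common subsequence of callee names. The following is a minimal sketch of that idea only, not the code from the commit: it uses a plain dynamic-programming LCS rather than the greedy Myers variant the commit mentions, and the function name `lcs_match` and the `(callee_name, location)` tuple layout are assumptions made for illustration.

```python
def lcs_match(ir_anchors, profile_anchors):
    """Map IR callsite locations to profile locations via an LCS of callee names.

    Each anchor is a hypothetical (callee_name, location) tuple, listed in
    source order. This is a classic O(N*M) dynamic-programming LCS, swapped in
    for the Myers greedy algorithm for brevity.
    """
    n, m = len(ir_anchors), len(profile_anchors)
    # dp[i][j] = LCS length of ir_anchors[i:] and profile_anchors[j:]
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if ir_anchors[i][0] == profile_anchors[j][0]:
                dp[i][j] = dp[i + 1][j + 1] + 1
            else:
                dp[i][j] = max(dp[i + 1][j], dp[i][j + 1])

    # Walk the table from the start to recover one optimal alignment.
    matches = {}
    i = j = 0
    while i < n and j < m:
        if ir_anchors[i][0] == profile_anchors[j][0]:
            matches[ir_anchors[i][1]] = profile_anchors[j][1]
            i += 1
            j += 1
        elif dp[i + 1][j] >= dp[i][j + 1]:
            i += 1  # skip an IR anchor that has no counterpart
        else:
            j += 1  # skip a profile anchor that has no counterpart
    return matches


# The example from the commit message: IR foo:1 bar:2 foo:4 bar:5,
# profile bar:3 foo:5 bar:6.
ir = [("foo", 1), ("bar", 2), ("foo", 4), ("bar", 5)]
profile = [("bar", 3), ("foo", 5), ("bar", 6)]
print(lcs_match(ir, profile))
```

In this sketch the order-aware alignment leaves `foo:1` unmatched and pairs `foo:4` with the profile's `foo:5`, which is the ambiguity the commit message describes.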


36393 Commits