llvm-project

Author	SHA1	Message	Date
Wenju He	fe146e9b59	[InferAddressSpaces] Fix constant replace to avoid modifying other functions (#70611) A constant value is unique in llvm context. InferAddressSpaces was replacing its users in other functions as well. This leads to unexpected behavior in our downstream use case after the pass. InferAddressSpaces is a function pass, so it shall not modify functions other than the currently processed one. Co-authored-by: Abhinav Gaba <abhinav.gaba@intel.com>	2023-11-13 13:28:56 +08:00
JOE1994	c42d006f05	[llvm][InstrProfiling] Remove no-op ptr-to-ptr bitcasts (NFC) Opaque ptr cleanup effort (NFC).	2023-11-12 13:44:06 -05:00
Léonard Oest O'Leary	ff36411b23	[InstCombine] Use zext's nneg flag for icmp folding (#70845) This PR fixes https://github.com/llvm/llvm-project/issues/55013 : the max intrinsic is not generated for this simple loop case: https://godbolt.org/z/hxz1xhMPh. This is caused by an ICMP not being folded into a select, thus not generating the max intrinsic. For the story: Since LLVM 14, the SCCP pass got smarter by folding sext into zext for positive ranges: https://reviews.llvm.org/D81756. After this change, InstCombine was sometimes unable to fold ICMP correctly as both of the arguments pointed to mismatched zext/sext. To fix this, @rotateright implemented this fix: https://reviews.llvm.org/D124419 that tries to resolve the mismatch by knowing if the argument of a zext is positive (in which case, it is like a sext) by using ValueTracking. However, ValueTracking is not smart enough to infer that the value is positive in some cases. Recently, @nikic implemented #67982, which keeps the information that a zext is non-negative. This PR simply uses this information to do the folding accordingly. TLDR: This PR uses the recent nneg tag on zext to fold the icmp accordingly in instcombine. This PR also contains test cases for sext/zext folding with InstCombine as well as x86 regression tests for the max/min case.	2023-11-13 00:53:53 +08:00
Florian Hahn	34c2dcd5ac	[VPlan] Move initial skeleton construction to createInitialVPlan. (NFC) This patch moves creating the middle VPBBs and an initial empty vector loop region for the top-level loop to createInitialVPlan. This consolidates code to create the initial VPlan skeleton and enables adding other bits outside the main region during initial VPlan construction. In particular, D150398 will add the exit check & branch to the middle block. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D158333	2023-11-12 13:00:44 +00:00
Michael Maitland	acef83c142	[VectorCombine] Fix crash in scalarizeVPIntrinsic (#72039) When getSplatOp returns nullptr, the intrinsic cannot be scalarized. This patch includes a test case that fixes a crash from trying to scalarize the VPIntrinsic when getSplatOp returns nullptr. This fixes https://github.com/llvm/llvm-project/issues/72034.	2023-11-11 19:54:15 -05:00
Alan Phipps	d3d49bca3e	[InstrProfiling] Don't attempt to create duplicate data variables. (#71998) Fixes a bug introduced by commit f95b2f1acf11 ("Reland [InstrProf][compiler-rt] Enable MC/DC Support in LLVM Source-based Code Coverage (1/3)"). createDataVariable() needs to check that a data variable wasn't already created before creating it. Previously, this was done inadvertently in getOrCreateRegionCounters(), which checked that the RegionCounters was not created multiple times before creating the counter section and the data variable. When the creation of the data variable was abstracted into its own function (createDataVariable()), there was no corresponding check. This was failing on a case in which an instrumented function was being inlined into multiple functions and a duplicate data variable was created, which led to a segfault in emitNameData(). Test case added based on the repro that also ensures a single data variable was created in this case.	2023-11-11 18:34:29 -06:00
Kazu Hirata	22b0f7ba6e	[Transforms] Include llvm/ADT/SmallSet.h (NFC) This patch adds #include "llvm/ADT/SmallSet.h" to a couple of files that are relying on transitive includes of SmallSet.h. It in turn unblocks the removal of unnecessary includes of llvm/ADT/SmallSet.h in several other files.	2023-11-11 12:25:39 -08:00
Kazu Hirata	d4360e428f	[llvm] Stop including llvm/ADT/DenseMap.h (NFC) Identified with clangd.	2023-11-11 10:07:19 -08:00
Florian Hahn	167b598648	[ConstraintElim] Remove redundant debug output (NFC). The removed code was printing `Processing facts ...` a second time.	2023-11-11 13:01:12 +00:00
Florian Hahn	ed6f4994d8	[VPlan] Handle conditional ordered reductions with scalar VFs. VPReductionRecipe::execute was not handling predicates for ordered reduction with scalar VFs, which was causing a crash. This patch adds dedicated handling for scalar VFs when dealing with the condition. The other operands are already handled in a similar fashion below. Fixes #70988.	2023-11-11 12:55:40 +00:00
Kazu Hirata	bafd35ca04	[llvm] Stop including llvm/ADT/SmallPtrSet.h (NFC) Identified with clangd.	2023-11-11 00:35:14 -08:00
Kazu Hirata	c22fffcba4	[llvm] Stop including llvm/ADT/MapVector.h (NFC) Identified with clangd.	2023-11-10 23:56:20 -08:00
Kazu Hirata	84a48ee9fb	[llvm] Stop including llvm/ADT/SetVector.h (NFC) Identified with clangd.	2023-11-10 23:50:23 -08:00
Vidhush Singhal	754b93e466	[Attributor] New attribute to identify what byte ranges are alive for an allocation (#66148) Changes the size of allocations automatically. For now, implements the case when a single range from start of the allocation is alive and the allocation can be reduced.	2023-11-10 16:26:37 -08:00
William Junda Huang	683f2df6e5	[SampleProfile] Fix bug where remapper returns empty string and crashing Sample Profile loader (#71479) Normally SampleContext does not allow using an empty StringRef to construct an object; this is to prevent bugs reading the profile. However, empty names may be emitted by a function whose name is intentionally set to empty, or by a bug in the remapper that returns an empty string. Regardless, converting it to FunctionId first will prevent the assert, and that assert check is unnecessary, which will be addressed in another patch.	2023-11-10 21:38:13 +00:00
Nikita Popov	b43b2a64b5	[InstCombine] Avoid use of shift constant expressions (NFCI) Use the constant folding API instead. As we're working on ImmConstants, these folds are guaranteed to succeed.	2023-11-10 16:58:10 +01:00
Nikita Popov	707bb42163	[InstCombine] Require immediate constant in canEvaluateShifted() Otherwise we risk infinite loops when shift constant expressions are no longer supported.	2023-11-10 16:12:49 +01:00
Nikita Popov	8391f405cb	[InstCombine] Avoid uses of ConstantExpr::getLShr() Use the constant folding API instead.	2023-11-10 15:50:42 +01:00
Nikita Popov	eb5199e8d4	[InstCombine] Avoid some uses of ConstantExpr::getLShr() (NFC) Use the constant folding API instead. As we're working on ImmConstant, it is guaranteed to succeed.	2023-11-10 15:46:14 +01:00
Nikita Popov	c2a1966627	[InstCombine] Remove bitcast handling from SimplifyDemandedBits The complex set of type checks in this code reduces down to "always return nullptr". Drop the code to use the default implementation instead, which will just compute the KnownBits for the bitcast.	2023-11-10 15:25:39 +01:00
Nikita Popov	192e7d3d52	[IRBuilder] Add IsNonNeg param to CreateZExt() (NFC)	2023-11-10 12:00:34 +01:00
Alexander Potapenko	f577bfb995	[sanitizer][msan] fix AArch64 vararg support for KMSAN (#70660) Cast StackSaveAreaPtr, GrRegSaveAreaPtr, VrRegSaveAreaPtr to pointers to fix assertions in getShadowOriginPtrKernel(). Fixes: https://github.com/llvm/llvm-project/issues/69738 Patch by Mark Johnston.	2023-11-10 09:33:49 +01:00
Noah Goldstein	9ef829097b	[InstCombine] Fix buggy transform in `foldNestedSelects`; PR 71330 The bug is that `IsAndVariant` is used to assume which arm in the select the output `SelInner` should be placed in, but the inner select condition is matched with `m_c_LogicalOp`. With fully simplified ops, this works fine, but if the select condition is not simplified, it's possible for it to match both `LogicalAnd` and `LogicalOr`, i.e. `select true, true, false`. In PR71330 for example, the issue occurs in the following IR: ``` define i32 @bad() { %..i.i = select i1 false, i32 0, i32 3 %brmerge = select i1 true, i1 true, i1 false %not.cmp.i.i.not = xor i1 true, true %.mux = zext i1 %not.cmp.i.i.not to i32 %retval.0.i.i = select i1 %brmerge, i32 %.mux, i32 %..i.i ret i32 %retval.0.i.i } ``` When simplifying: ``` %retval.0.i.i = select i1 %brmerge, i32 %.mux, i32 %..i.i ``` We end up matching `%brmerge` as `LogicalAnd` for `IsAndVariant`, but the inner select (`%..i.i`) condition, which is `false`, with `LogicalOr`. Closes #71489	2023-11-09 16:36:49 -06:00
Nikita Popov	ed86e740ef	Revert "[SROA] Limit the number of allowed slices when trying to split allocas" This reverts commit e13e808283f7fd9e873ae922dd1ef61aeaa0eb4a. This causes performance regressions on GPU targets, see https://github.com/llvm/llvm-project/issues/69785. Revert the change for now.	2023-11-09 16:38:52 +01:00
Nikita Popov	369c9b791b	[MemCpyOpt] Require writable object during call slot optimization (#71542) Call slot optimization may introduce writes to the destination object that occur earlier than in the original function. We currently already check that the destination is dereferenceable and aligned, but we do not make sure that it is writable. As such, we might introduce a write to read-only memory, or introduce a data race. Fix this by checking that the object is writable. For arguments, this is indicated by the new writable attribute. Tests using sret/dereferenceable are updated to use it.	2023-11-09 15:55:44 +01:00
Nikita Popov	1b1c81772f	[InstCombine] Drop poison flags in simplifyAssocCastAssoc() The nneg flag on zext may no longer hold after the reassociation.	2023-11-09 11:58:02 +01:00
Chuanqi Xu	b7b5907b56	[Coroutines] Introduce [[clang::coro_only_destroy_when_complete]] (#71014) Close https://github.com/llvm/llvm-project/issues/56980. This patch tries to introduce a light-weight optimization attribute for coroutines which are guaranteed to only be destroyed after they have reached the final suspend. The rationale behind the patch is simple. See the example: ```C++ A foo() { dtor d; co_await something(); dtor d1; co_await something(); dtor d2; co_return 43; } ``` Generally the generated .destroy function may be: ```C++ void foo.destroy(foo.Frame frame) { switch(frame->suspend_index()) { case 1: frame->d.~dtor(); break; case 2: frame->d.~dtor(); frame->d1.~dtor(); break; case 3: frame->d.~dtor(); frame->d1.~dtor(); frame->d2.~dtor(); break; default: // coroutine completed or haven't started break; } frame->promise.~promise_type(); delete frame; } ``` This is because the compiler needs to be ready for all the cases in which the coroutine may be destroyed in a valid state. However, from the user's perspective, we can understand that certain coroutine types may only be destroyed after they have reached the final suspend point. And we need a method to teach the compiler about this. That is what this patch does. After the compiler recognizes that the coroutines can only be destroyed after completion, it can optimize the above example to: ```C++ void foo.destroy(foo.Frame frame) { frame->promise.~promise_type(); delete frame; } ``` I spent a lot of time experimenting with this downstream. The numbers are really good. In a real-world coroutine-heavy workload, the size of the build dir (including .o files) reduces by 14%. And the size of final libraries (excluding the .o files) reduces by 8% in Debug mode and 1% in Release mode.	2023-11-09 14:42:07 +08:00
Allen	7ec86f4d68	[SimplifyCFG] Fix the compile crash for invalid upper bound value (#71351) Fix the crash for the last landed PR70542. Note: For '%add = add nuw i32 %x, 1', we can only infer the LowerBound is 1, but the UpperBound is wrapped to 0 in computeConstantRange, so we can't assume the UpperBound is a valid bound when its value is 0. Fix https://github.com/llvm/llvm-project/issues/71329. Reviewed By: zmodem, nikic	2023-11-09 12:33:24 +08:00
Anna Thomas	29f03bf48d	[GuardWidening] Require analyses only if necessary We need to request analyses needed for guard widening only if there are guards/widenable conditions.	2023-11-08 11:54:10 -05:00
Jeremy Morse	f1b0a54451	Reapply 7d77bbef4ad92, adding new debug-info classes This reverts commit 957efa4ce4f0391147cec62746e997226ee2b836. Original commit message below -- in this follow up, I've shifted unnecessary inclusions of DebugProgramInstruction.h into being forward declarations (fixes clang-compile time I hope), and a memory leak in the DebugInfoTest.cpp IR unittests. I also tracked a compile-time regression in D154080, more explanation there, but the result of which is hiding some of the changes behind the EXPERIMENTAL_DEBUGINFO_ITERATORS compile-time flag. This is tested by the "new-debug-iterators" buildbot. [DebugInfo][RemoveDIs] Add prototype storage classes for "new" debug-info This patch adds a variety of classes needed to record variable location debug-info without using the existing intrinsic approach, see the rationale at [0]. The two added files and corresponding unit tests are the majority of the plumbing required for this, but at this point isn't accessible from the rest of LLVM as we need to stage it into the repo gently. An overview is that classes are added for recording variable information attached to Real (TM) instructions, in the form of DPValues and DPMarker objects. The metadata-uses of DPValues are plumbed into the metadata hierarchy, and a field added to class Instruction, which are all stimulated in the unit tests. The next few patches in this series add utilities to convert to/from this new debug-info format and add instruction/block utilities to have debug-info automatically updated in the background when various operations occur. This patch was reviewed in Phab in D153990 and D154080, I've squashed them together into this commit as there are dependencies between the two patches, and there's little profit in landing them separately. [0] https://discourse.llvm.org/t/rfc-instruction-api-changes-needed-to-eliminate-debug-intrinsics-from-ir/68939	2023-11-08 16:42:35 +00:00
Nikita Popov	2c61f9cab5	[CVP] Fix use after scope Store the result of ConstantRange::sdiv() in a variable, as getSingleElement() will return a pointer to the APInt it contains.	2023-11-08 16:53:47 +01:00
Florian Hahn	26ab444e88	[ConstraintElim] Make sure add-rec is for the current loop. Update addInfoForInductions to also check if the add-rec is for the current loop. Otherwise we might add incorrect facts or crash. Fixes a miscompile & crash introduced by 00396e6a1a0b.	2023-11-08 14:07:28 +00:00
Nikita Popov	d687057de8	[CVP] Try to fold sdiv to constant If we know that the sdiv result is a single constant, directly use that instead of performing narrowing. Fixes https://github.com/llvm/llvm-project/issues/71659.	2023-11-08 14:49:24 +01:00
Markos Horro	9d2903c8e5	[IndVars] Add check of loop invariant for trunc instructions (#71072) The same idea as in 34d380e1f63a7e2cdb9ab1e6498f727fcd710a14, but considering truncation instructions. Improvement for #59633.	2023-11-08 11:16:23 +00:00
Nikita Popov	567c02a80e	[InstCombine] Remove inttoptr/ptrtoint handling from indexed compare fold Looking through inttoptr / ptrtoint intermixed with GEPs is very questionable from a provenance perspective. We also don't seem to have any test coverage that shows this is useful (apart from one test I added to guard against a crash).	2023-11-08 11:13:57 +01:00
Nikita Popov	5918f62301	[InstCombine] Infer zext nneg flag (#71534) Use KnownBits to infer the nneg flag on zext instructions. Currently we only set nneg when converting sext -> zext, but don't set it when we have a zext in the first place. If we want to use it in optimizations, we should make sure the flag inference is consistent.	2023-11-08 09:34:40 +01:00
Vladislav Dzhidzhoev	6beddd668a	Revert "[DebugMetadata][DwarfDebug] Support function-local types in lexical block scopes (4/7)" This caused assert: llvm/llvm/lib/CodeGen/AsmPrinter/DwarfFile.cpp:110: void llvm::DwarfFile::addScopeVariable(LexicalScope *, DbgVariable *): Assertion `Ret.second' failed. See comments https://reviews.llvm.org/D144006#4656350. This reverts commit 3b449bd46a11a55a40cbc0016a99b202fa05248e.	2023-11-08 00:29:24 +01:00
Antonio Frighetto	7d39838948	[InstCombine] Favour `CreateZExtOrTrunc` in `narrowFunnelShift` (NFC) Use `CreateZExtOrTrunc`, reduce test and regenerate checks.	2023-11-07 22:48:14 +01:00
Paulo Matos	7b9d73c2f9	[NFC] Remove Type::getInt8PtrTy (#71029) Replace this with PointerType::getUnqual(). Followup to the opaque pointer transition. Fixes an in-code TODO item.	2023-11-07 17:26:26 +01:00
Philip Reames	551c280cfd	[indvars] Always fallback to truncation if AddRec widening fails (#70967) The current code structure results in cases where if a) we can't clone the IV user (because it's not in our whitelist) or b) can't prove the SCEV expressions are identical, we'd sometimes leave both the original unwidened IV and the partially widened IV in code. Instead, just truncate the wide IV to the use - same as what we'd do if we couldn't find an addrec to start with. Noticed this while playing with changing how we produce addrecs. The current structure results in a very tight interlock between SCEV's internal capabilities and indvars code.	2023-11-07 07:49:39 -08:00
Antonio Frighetto	caa124b58d	[InstCombine] Zero-extend shift amounts in narrow funnel shift ops An issue arose when handling shift amounts while performing narrowed funnel shifts simplification. Specifically, shift amounts were incorrectly truncated when their type was narrower than the target bit width. This has been addressed by zero-extending `ShAmt` in such cases. Fixes: https://github.com/llvm/llvm-project/issues/71463. Proof: https://alive2.llvm.org/ce/z/5draKz.	2023-11-07 14:15:32 +01:00
Nikita Popov	6e56c35d19	[SpeculativeExecution] Add only-if-divergent-target pass option The optimization pipeline enables this option, but it was not preserved in -print-pipeline-passes output.	2023-11-07 11:49:37 +01:00
Hans Wennborg	05ed92127c	Revert "Reland [SimplifyCFG] Delete the unnecessary range check for small mask operation (#70542)" This caused https://github.com/llvm/llvm-project/issues/71329 > Fix the compile crash when the default result has no result for > https://github.com/llvm/llvm-project/pull/65835 > > Fixes https://github.com/llvm/llvm-project/issues/65120 > Reviewed By: zmodem, nikic This reverts commit 7c4180a36a905b7ed46c09df77af1b65e356f92a.	2023-11-07 10:53:22 +01:00
Nikita Popov	e360a16fee	[GlobalOpt] Cache whether CC is changeable (#71381) The hasAddressTaken() call in hasOnlyColdCalls() has quadratic complexity if there are many cold calls to a function: We're going to visit each call of the function, and then for each of them iterate all the users of the function. We've recently encountered a case where GlobalOpt spends more than an hour in these hasAddressTaken() checks when full LTO is used. Avoid this by moving the hasAddressTaken() check into hasChangeableCC() and caching its result, so it is only computed once per function.	2023-11-07 10:36:45 +01:00
Allen	a0cd6265bc	[InstCombine] Split the FMul with reassoc into a helper function, NFC (#71493) The reassoc check is really hard to find because the handling branch is too large, so split it into a helper function.	2023-11-07 15:30:56 +08:00
Philip Reames	23099ac239	Add known and demanded bits support for zext nneg (#70858) zext nneg was recently added to the IR in #67982. This patch teaches demanded bits and known bits about the semantics of the instruction, and adds a couple of test cases to illustrate basic functionality.	2023-11-06 18:47:56 -08:00
LiqinWeng	5d3d08463d	[InstCombinePHI] Remove dead PHI on UnaryOperator (#71386) This patch mainly solves the problem of dead PHI on UnaryOperator	2023-11-07 09:45:33 +08:00
Tom Stellard	2400c54c37	[Vectorize] Remove Transforms/Vectorize.h (#71294) The only thing in this file is a declaration for createLoadStoreVectorizerPass(), and this function is already declared in LoadStoreVectorizer.h.	2023-11-06 14:04:22 -08:00
Simon Pilgrim	3ca4fe80d4	[Transforms] Use StringRef::starts_with/ends_with instead of startswith/endswith. NFC. startswith/endswith wrap starts_with/ends_with and will eventually go away (to more closely match string_view)	2023-11-06 16:50:18 +00:00
Florian Hahn	a002271972	[VPlan] Add VPValue::replaceUsesWithIf (NFCI). Add replaceUsesWithIf helper and use it in a few places.	2023-11-06 16:08:22 +00:00

35045 Commits