35045 Commits

Author SHA1 Message Date
Wenju He
fe146e9b59
[InferAddressSpaces] Fix constant replace to avoid modifying other functions (#70611)
A constant value is unique in llvm context. InferAddressSpaces was
replacing its users in other functions as well. This leads to unexpected
behavior in our downstream use case after the pass.

InferAddressSpaces is a function pass, so it shall not modify functions
other than the one currently being processed.

Co-authored-by: Abhinav Gaba <abhinav.gaba@intel.com>
2023-11-13 13:28:56 +08:00
JOE1994
c42d006f05 [llvm][InstrProfiling] Remove no-op ptr-to-ptr bitcasts (NFC)
Opaque ptr cleanup effort (NFC).
2023-11-12 13:44:06 -05:00
Léonard Oest O'Leary
ff36411b23
[InstCombine] Use zext's nneg flag for icmp folding (#70845)
This PR fixes https://github.com/llvm/llvm-project/issues/55013 : the
max intrinsic is not generated for this simple loop case :
https://godbolt.org/z/hxz1xhMPh. This is caused by an ICMP not being
folded into a select, thus not generating the max intrinsic.

For the story :

Since LLVM 14, SCCP pass got smarter by folding sext into zext for
positive ranges : https://reviews.llvm.org/D81756. After this change,
InstCombine was sometimes unable to fold ICMP correctly as both of the
arguments pointed to mismatched zext/sext. To fix this, @rotateright
implemented this fix : https://reviews.llvm.org/D124419 that tries to
resolve the mismatch by knowing if the argument of a zext is positive
(in which case, it is like a sext) by using ValueTracking, however
ValueTracking is not smart enough to infer that the value is positive in
some cases. Recently, @nikic implemented #67982 which keeps the
information that a zext is non-negative. This PR simply uses this
information to do the folding accordingly.

TLDR : This PR uses the recent nneg tag on zext to fold the icmp
accordingly in instcombine.
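
As a rough illustration of why the nneg flag makes this fold legal, the sketch below models i8 to i32 extension in plain C++. It does not use the LLVM API, and the helper names (`zext8`, `sext8`, `cmpWide`, `cmpNarrow`) are made up for this example. When the source value is non-negative, zero- and sign-extension agree, so a signed compare of a zext against a sext can be treated as a compare of the narrow values.

```C++
#include <cassert>
#include <cstdint>

// Hypothetical helpers modelling i8 -> i32 extension; not part of LLVM.
inline int32_t zext8(int8_t V) {
  return static_cast<int32_t>(static_cast<uint8_t>(V));
}
inline int32_t sext8(int8_t V) { return static_cast<int32_t>(V); }

// Models `icmp slt (zext a), (sext b)` on the widened values.
inline bool cmpWide(int8_t A, int8_t B) { return zext8(A) < sext8(B); }

// Models the folded form `icmp slt a, b`. Only valid when `a` is known
// non-negative, which is what the nneg flag on the zext records.
inline bool cmpNarrow(int8_t A, int8_t B) {
  assert(A >= 0 && "fold requires a non-negative zext operand");
  return A < B;
}
```

For a negative operand the two extensions differ (`zext8(-1)` is 255 while `sext8(-1)` is -1), which is why the fold needs the flag.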

This PR also contains test cases for sext/zext folding with InstCombine
as well as x86 regression tests for the max/min case.
2023-11-13 00:53:53 +08:00
Florian Hahn
34c2dcd5ac
[VPlan] Move initial skeleton construction to createInitialVPlan. (NFC)
This patch moves creating the middle VPBBs and an initial empty
vector loop region for the top-level loop to createInitialVPlan.

This consolidates code to create the initial VPlan skeleton and enables
adding other bits outside the main region during initial VPlan
construction. In particular, D150398 will add the exit check & branch to
the middle block.

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D158333
2023-11-12 13:00:44 +00:00
Michael Maitland
acef83c142
[VectorCombine] Fix crash in scalarizeVPIntrinsic (#72039)
When getSplatOp returns nullptr, the intrinsic cannot be scalarized.
This patch fixes a crash from trying to scalarize the VPIntrinsic in
that case and includes a test case for it.

This fixes https://github.com/llvm/llvm-project/issues/72034.
2023-11-11 19:54:15 -05:00
Alan Phipps
d3d49bca3e
[InstrProfiling] Don't attempt to create duplicate data variables. (#71998)
Fixes a bug introduced by
commit f95b2f1acf11 ("Reland [InstrProf][compiler-rt] Enable MC/DC
Support in LLVM Source-based Code Coverage (1/3)")

createDataVariable() needs to check that a data variable wasn't already
created before creating it. Previously, this was done inadvertently in
getOrCreateRegionCounters(), which checked that the RegionCounters was
not created multiple times before creating the counter section and the
data variable. When the creation of the data variable was abstracted
into its own function (createDataVariable()), there was no corresponding
check. This was failing on a case in which an instrumented function was
being inlined into multiple functions and a duplicate data variable was
created, which led to a segfault in emitNameData(). Test case added
based on the repro that also ensures a single data variable was created
in this case.
2023-11-11 18:34:29 -06:00
Kazu Hirata
22b0f7ba6e [Transforms] Include llvm/ADT/SmallSet.h (NFC)
This patch adds #include "llvm/ADT/SmallSet.h" to a couple of files
that are relying on transitive includes of SmallSet.h.  It in turn
unblocks the removal of unnecessary includes of llvm/ADT/SmallSet.h in
several other files.
2023-11-11 12:25:39 -08:00
Kazu Hirata
d4360e428f [llvm] Stop including llvm/ADT/DenseMap.h (NFC)
Identified with clangd.
2023-11-11 10:07:19 -08:00
Florian Hahn
167b598648
[ConstraintElim] Remove redundant debug output (NFC).
The removed code was printing `Processing facts ...` a second time.
2023-11-11 13:01:12 +00:00
Florian Hahn
ed6f4994d8
[VPlan] Handle conditional ordered reductions with scalar VFs.
VPReductionRecipe::execute was not handling predicates for ordered
reduction with scalar VFs, which was causing a crash. This patch adds
dedicated handling for scalar VFs when dealing with the condition.
The other operands are already handled in a similar fashion below.

Fixes #70988.
2023-11-11 12:55:40 +00:00
Kazu Hirata
bafd35ca04 [llvm] Stop including llvm/ADT/SmallPtrSet.h (NFC)
Identified with clangd.
2023-11-11 00:35:14 -08:00
Kazu Hirata
c22fffcba4 [llvm] Stop including llvm/ADT/MapVector.h (NFC)
Identified with clangd.
2023-11-10 23:56:20 -08:00
Kazu Hirata
84a48ee9fb [llvm] Stop including llvm/ADT/SetVector.h (NFC)
Identified with clangd.
2023-11-10 23:50:23 -08:00
Vidhush Singhal
754b93e466
[Attributor] New attribute to identify what byte ranges are alive for an allocation (#66148)
Changes the size of allocations automatically.
For now, implements the case when a single range from start of the
allocation is alive and the allocation can be reduced.
2023-11-10 16:26:37 -08:00
William Junda Huang
683f2df6e5
[SampleProfile] Fix bug where remapper returns empty string and crashes Sample Profile loader (#71479)
Normally SampleContext does not allow using an empty StringRef to
construct an object; this is to prevent bugs when reading the profile.
However, empty names may be emitted by a function whose name is
intentionally set to empty, or by a bug in the remapper that returns an
empty string. Regardless, converting it to FunctionId first will prevent
the assert. That assert check is unnecessary and will be
addressed in another patch.
2023-11-10 21:38:13 +00:00
Nikita Popov
b43b2a64b5 [InstCombine] Avoid use of shift constant expressions (NFCI)
Use the constant folding API instead. As we're working on
ImmConstants, these folds are guaranteed to succeed.
2023-11-10 16:58:10 +01:00
Nikita Popov
707bb42163 [InstCombine] Require immediate constant in canEvaluateShifted()
Otherwise we risk infinite loops when shift constant expressions
are no longer supported.
2023-11-10 16:12:49 +01:00
Nikita Popov
8391f405cb [InstCombine] Avoid uses of ConstantExpr::getLShr()
Use the constant folding API instead.
2023-11-10 15:50:42 +01:00
Nikita Popov
eb5199e8d4 [InstCombine] Avoid some uses of ConstantExpr::getLShr() (NFC)
Use the constant folding API instead. As we're working on
ImmConstant, it is guaranteed to succeed.
2023-11-10 15:46:14 +01:00
Nikita Popov
c2a1966627 [InstCombine] Remove bitcast handling from SimplifyDemandedBits
The complex set of type checks in this code reduces down to
"always return nullptr". Drop the code to use the default
implementation instead, which will just compute the KnownBits
for the bitcast.
2023-11-10 15:25:39 +01:00
Nikita Popov
192e7d3d52 [IRBuilder] Add IsNonNeg param to CreateZExt() (NFC) 2023-11-10 12:00:34 +01:00
Alexander Potapenko
f577bfb995
[sanitizer][msan] fix AArch64 vararg support for KMSAN (#70660)
Cast StackSaveAreaPtr, GrRegSaveAreaPtr, VrRegSaveAreaPtr to pointers to
fix assertions in getShadowOriginPtrKernel().

Fixes: https://github.com/llvm/llvm-project/issues/69738

Patch by Mark Johnston.
2023-11-10 09:33:49 +01:00
Noah Goldstein
9ef829097b [InstCombine] Fix buggy transform in foldNestedSelects; PR 71330
The bug is that `IsAndVariant` is used to assume which arm of the
select the output `SelInner` should be placed in, but the inner
select condition is matched with `m_c_LogicalOp`. With fully simplified
ops, this works fine, but if the select condition is not simplified, it
is possible for it to match both `LogicalAnd` and `LogicalOr`, e.g.
`select true, true, false`.

In PR71330 for example, the issue occurs in the following IR:
```
define i32 @bad() {
  %..i.i = select i1 false, i32 0, i32 3
  %brmerge = select i1 true, i1 true, i1 false
  %not.cmp.i.i.not = xor i1 true, true
  %.mux = zext i1 %not.cmp.i.i.not to i32
  %retval.0.i.i = select i1 %brmerge, i32 %.mux, i32 %..i.i
  ret i32 %retval.0.i.i
}
```

When simplifying:
```
%retval.0.i.i = select i1 %brmerge, i32 %.mux, i32 %..i.i
```

We end up matching `%brmerge` as `LogicalAnd` for `IsAndVariant`, but
match the inner select (`%..i.i`) condition, which is `false`, with
`LogicalOr`.
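
To make the ambiguity concrete, here is a minimal sketch in plain C++ of how an unsimplified `select true, true, false` satisfies both select-based patterns. This is a toy model, not LLVM's pattern matchers: `Op`, `Select`, `isLogicalAnd` and `isLogicalOr` are hypothetical names, and the only assumption is that a logical and has the shape `select c, x, false` while a logical or has the shape `select c, true, y`.

```C++
// Toy i1 operand: a constant false/true or an opaque variable.
enum class Op { False, True, Var };

// Toy model of `select i1 Cond, i1 TVal, i1 FVal`; not llvm::SelectInst.
struct Select {
  Op Cond, TVal, FVal;
};

// `select c, x, false` is the select form of a logical and.
inline bool isLogicalAnd(const Select &S) { return S.FVal == Op::False; }

// `select c, true, y` is the select form of a logical or.
inline bool isLogicalOr(const Select &S) { return S.TVal == Op::True; }
```

With `Select{Op::True, Op::True, Op::False}` both predicates return true, so a flag derived from one match says nothing about which pattern a second, independent match picked.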

Closes #71489
2023-11-09 16:36:49 -06:00
Nikita Popov
ed86e740ef Revert "[SROA] Limit the number of allowed slices when trying to split allocas"
This reverts commit e13e808283f7fd9e873ae922dd1ef61aeaa0eb4a.

This causes performance regressions on GPU targets, see
https://github.com/llvm/llvm-project/issues/69785. Revert the
change for now.
2023-11-09 16:38:52 +01:00
Nikita Popov
369c9b791b
[MemCpyOpt] Require writable object during call slot optimization (#71542)
Call slot optimization may introduce writes to the destination object
that occur earlier than in the original function. We currently already
check that the destination is dereferenceable and aligned, but we
do not make sure that it is writable. As such, we might introduce a
write to read-only memory, or introduce a data race.

Fix this by checking that the object is writable. For arguments, this is
indicated by the new writable attribute. Tests using
sret/dereferenceable are updated to use it.
2023-11-09 15:55:44 +01:00
Nikita Popov
1b1c81772f [InstCombine] Drop poison flags in simplifyAssocCastAssoc()
The nneg flag on zext may no longer hold after the reassociation.
2023-11-09 11:58:02 +01:00
Chuanqi Xu
b7b5907b56
[Coroutines] Introduce [[clang::coro_only_destroy_when_complete]] (#71014)
Close https://github.com/llvm/llvm-project/issues/56980.

This patch tries to introduce a light-weight optimization attribute for
coroutines which are guaranteed to only be destroyed after they have
reached the final suspend.

The rationale behind the patch is simple. See the example:

```C++
A foo() {
  dtor d;
  co_await something();
  dtor d1;
  co_await something();
  dtor d2;
  co_return 43;
}
```

Generally the generated .destroy function may be:

```C++
void foo.destroy(foo.Frame *frame) {
  switch(frame->suspend_index()) {
    case 1:
      frame->d.~dtor();
      break;
    case 2:
      frame->d.~dtor();
      frame->d1.~dtor();
      break;
    case 3:
      frame->d.~dtor();
      frame->d1.~dtor();
      frame->d2.~dtor();
      break;
    default: // coroutine completed or haven't started
      break;
  }

  frame->promise.~promise_type();
  delete frame;
}
```

This is because the compiler needs to be ready for all the cases in
which the coroutine may be destroyed in a valid state.

However, from the user's perspective, we can understand that certain
coroutine types may only be destroyed after they have reached the final
suspend point. We need a method to teach the compiler about this, and
that is what this patch provides. Once the compiler recognizes that the
coroutines can only be destroyed after completion, it can optimize the
above example to:

```C++
void foo.destroy(foo.Frame *frame) {
  frame->promise.~promise_type();
  delete frame;
}
```

I spent a lot of time experimenting with this downstream. The numbers
are really good. In a real-world coroutine-heavy workload, the size of
the build dir (including .o files) shrinks by 14%, and the size of the
final libraries (excluding the .o files) shrinks by 8% in Debug mode and
1% in Release mode.
2023-11-09 14:42:07 +08:00
Allen
7ec86f4d68
[SimplifyCFG] Fix the compile crash for invalid upper bound value (#71351)
Fix the crash introduced by the recently landed PR70542.

Note:
For '%add = add nuw i32 %x, 1', we can only infer the LowerBound is 1,
but the UpperBound is wrapped to 0 in computeConstantRange,
so we can't assume the UpperBound is a valid bound when its value is 0.
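
The wrap-around in the note can be reproduced with ordinary 32-bit unsigned arithmetic. The sketch below is a toy model and not `llvm::ConstantRange`: `Range32`, `rangeOfAddNuw` and `upperBoundIsUsable` are hypothetical names, and it assumes a half-open `[Lower, Upper)` representation.

```C++
#include <cstdint>

// Toy half-open range [Lower, Upper) over 32-bit values.
struct Range32 {
  uint32_t Lower, Upper;
};

// Range of `add nuw i32 %x, C`: the result is at least C and at most
// UINT32_MAX, so the exclusive upper bound is UINT32_MAX + 1, which wraps
// to 0 in 32 bits.
inline Range32 rangeOfAddNuw(uint32_t C) {
  uint32_t Max = UINT32_MAX;
  uint32_t Upper = Max + 1u; // wraps to 0
  return {C, Upper};
}

// An Upper of 0 here means "wrapped past the maximum", so it must not be
// used as if it were a real bound on the value.
inline bool upperBoundIsUsable(const Range32 &R) { return R.Upper != 0; }
```

For `C == 1` this yields `Lower == 1` and `Upper == 0`, matching the case described in the note.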

Fix https://github.com/llvm/llvm-project/issues/71329.
Reviewed By: zmodem, nikic
2023-11-09 12:33:24 +08:00
Anna Thomas
29f03bf48d [GuardWidening] Require analyses only if necessary
We need to request analyses needed for guard widening only if there are
guards/widenable conditions.
2023-11-08 11:54:10 -05:00
Jeremy Morse
f1b0a54451 Reapply 7d77bbef4ad92, adding new debug-info classes
This reverts commit 957efa4ce4f0391147cec62746e997226ee2b836.

Original commit message below. In this follow-up, I've turned
unnecessary inclusions of DebugProgramInstruction.h into forward
declarations (which I hope fixes the clang compile time), and fixed a
memory leak in the DebugInfoTest.cpp IR unittests.

I also tracked down a compile-time regression in D154080 (more
explanation there), the result of which is hiding some of the changes
behind the EXPERIMENTAL_DEBUGINFO_ITERATORS compile-time flag. This is
tested by the "new-debug-iterators" buildbot.

[DebugInfo][RemoveDIs] Add prototype storage classes for "new" debug-info

This patch adds a variety of classes needed to record variable location
debug-info without using the existing intrinsic approach, see the rationale
at [0].

The two added files and corresponding unit tests are the majority of the
plumbing required for this, but at this point aren't accessible from the
rest of LLVM as we need to stage it into the repo gently. An overview is
that classes are added for recording variable information attached to Real
(TM) instructions, in the form of DPValues and DPMarker objects. The
metadata-uses of DPValues are plumbed into the metadata hierarchy, and a
field added to class Instruction, which are all stimulated in the unit
tests. The next few patches in this series add utilities to convert to/from
this new debug-info format and add instruction/block utilities to have
debug-info automatically updated in the background when various operations
occur.

This patch was reviewed in Phab in D153990 and D154080, I've squashed them
together into this commit as there are dependencies between the two
patches, and there's little profit in landing them separately.

[0] https://discourse.llvm.org/t/rfc-instruction-api-changes-needed-to-eliminate-debug-intrinsics-from-ir/68939
2023-11-08 16:42:35 +00:00
Nikita Popov
2c61f9cab5 [CVP] Fix use after scope
Store the result of ConstantRange::sdiv() in a variable, as
getSingleElement() will return a pointer to the APInt it contains.
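
The lifetime issue can be sketched with a small stand-in type. This is an illustrative model only: `ToyRange` and `foldedSDivOr` are hypothetical names, and the toy `sdiv` assumes strictly positive ranges. The point it shows is that a pointer returned into a temporary dangles, while a pointer into a named variable stays valid.

```C++
// Toy stand-in for a range type holding the closed interval [Lo, Hi].
struct ToyRange {
  int Lo, Hi;

  // Returns a pointer into *this, or nullptr if the range has several values.
  const int *getSingleElement() const { return Lo == Hi ? &Lo : nullptr; }

  // Returns a new range by value. Assumes both ranges are strictly positive.
  ToyRange sdiv(const ToyRange &RHS) const {
    return {Lo / RHS.Hi, Hi / RHS.Lo};
  }
};

inline int foldedSDivOr(const ToyRange &L, const ToyRange &R, int Fallback) {
  // Unsafe: `const int *E = L.sdiv(R).getSingleElement();` would leave E
  // pointing into a temporary that is destroyed at the end of the statement.
  ToyRange Res = L.sdiv(R); // keep the result alive in a variable
  if (const int *E = Res.getSingleElement())
    return *E; // safe: Res outlives E
  return Fallback;
}
```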
2023-11-08 16:53:47 +01:00
Florian Hahn
26ab444e88
[ConstraintElim] Make sure add-rec is for the current loop.
Update addInfoForInductions to also check if the add-rec is for the
current loop. Otherwise we might add incorrect facts or crash.

Fixes a miscompile & crash introduced by 00396e6a1a0b.
2023-11-08 14:07:28 +00:00
Nikita Popov
d687057de8 [CVP] Try to fold sdiv to constant
If we know that the sdiv result is a single constant, directly
use that instead of performing narrowing.

Fixes https://github.com/llvm/llvm-project/issues/71659.
2023-11-08 14:49:24 +01:00
Markos Horro
9d2903c8e5
[IndVars] Add check of loop invariant for trunc instructions (#71072)
The same idea as in 34d380e1f63a7e2cdb9ab1e6498f727fcd710a14, but considering
truncation instructions.
Improvement for #59633.
2023-11-08 11:16:23 +00:00
Nikita Popov
567c02a80e [InstCombine] Remove inttoptr/ptrtoint handling from indexed compare fold
Looking through inttoptr / ptrtoint intermixed with GEPs is very
questionable from a provenance perspective. We also don't seem to
have any test coverage that shows this is useful (apart from one
test I added to guard against a crash).
2023-11-08 11:13:57 +01:00
Nikita Popov
5918f62301
[InstCombine] Infer zext nneg flag (#71534)
Use KnownBits to infer the nneg flag on zext instructions.

Currently we only set nneg when converting sext -> zext, but don't set
it when we have a zext in the first place. If we want to use it in
optimizations, we should make sure the flag inference is consistent.
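
A minimal sketch of the inference idea, using a toy 8-bit model rather than `llvm::KnownBits` itself (`Known8`, `knownAndMask` and `canInferNNeg` are hypothetical names): if the sign bit of the zext operand is known to be zero, the operand is non-negative and the flag can be set.

```C++
#include <cstdint>

// Toy 8-bit known-bits model: a set bit in Zero/One means that bit of the
// value is known to be 0/1.
struct Known8 {
  uint8_t Zero = 0, One = 0;
  bool isNonNegative() const { return (Zero & 0x80) != 0; }
};

// Known bits of `x & Mask` for an unknown x: every bit cleared in Mask is
// known to be zero in the result.
inline Known8 knownAndMask(uint8_t Mask) {
  Known8 K;
  K.Zero = static_cast<uint8_t>(~Mask);
  return K;
}

// A zext may carry nneg when its operand's sign bit is known to be zero.
inline bool canInferNNeg(const Known8 &Operand) {
  return Operand.isNonNegative();
}
```

For example, the operand `x & 0x7F` has a known-zero sign bit, so `canInferNNeg(knownAndMask(0x7F))` is true, while nothing is known about `x & 0xFF`.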
2023-11-08 09:34:40 +01:00
Vladislav Dzhidzhoev
6beddd668a Revert "[DebugMetadata][DwarfDebug] Support function-local types in lexical block scopes (4/7)"
This caused assert:
llvm/llvm/lib/CodeGen/AsmPrinter/DwarfFile.cpp:110:
void llvm::DwarfFile::addScopeVariable(LexicalScope *, DbgVariable *):
Assertion `Ret.second' failed.

See comments https://reviews.llvm.org/D144006#4656350.

This reverts commit 3b449bd46a11a55a40cbc0016a99b202fa05248e.
2023-11-08 00:29:24 +01:00
Antonio Frighetto
7d39838948 [InstCombine] Favour CreateZExtOrTrunc in narrowFunnelShift (NFC)
Use `CreateZExtOrTrunc`, reduce test and regenerate checks.
2023-11-07 22:48:14 +01:00
Paulo Matos
7b9d73c2f9
[NFC] Remove Type::getInt8PtrTy (#71029)
Replace this with PointerType::getUnqual().
Followup to the opaque pointer transition. Fixes an in-code TODO item.
2023-11-07 17:26:26 +01:00
Philip Reames
551c280cfd
[indvars] Always fallback to truncation if AddRec widening fails (#70967)
The current code structure results in cases where if a) we can't clone
the IV user (because it's not in our whitelist) or b) can't prove the
SCEV expressions are identical, we'd sometimes leave both the original
unwidened IV and the partially widened IV in code. Instead, just
truncate the wide IV to the use - same as what we'd do if we couldn't
find an addrec to start with.

Noticed this while playing with changing how we produce addrecs. The
current structure results in a very tight interlock between SCEV's
internal capabilities and indvars code.
2023-11-07 07:49:39 -08:00
Antonio Frighetto
caa124b58d [InstCombine] Zero-extend shift amounts in narrow funnel shift ops
An issue arose when handling shift amounts while performing
narrowed funnel shifts simplification. Specifically, shift
amounts were incorrectly truncated when their type was
narrower than the target bit width. This has been addressed
by zero-extending `ShAmt` in such cases.

Fixes: https://github.com/llvm/llvm-project/issues/71463.

Proof: https://alive2.llvm.org/ce/z/5draKz.
2023-11-07 14:15:32 +01:00
Nikita Popov
6e56c35d19 [SpeculativeExecution] Add only-if-divergent-target pass option
The optimization pipeline enables this option, but it was not
preserved in -print-pipeline-passes output.
2023-11-07 11:49:37 +01:00
Hans Wennborg
05ed92127c Revert "Reland [SimplifyCFG] Delete the unnecessary range check for small mask operation (#70542)"
This caused https://github.com/llvm/llvm-project/issues/71329

> Fix the compile crash when the default result has no result  for
> https://github.com/llvm/llvm-project/pull/65835
>
> Fixes https://github.com/llvm/llvm-project/issues/65120
> Reviewed By: zmodem, nikic

This reverts commit 7c4180a36a905b7ed46c09df77af1b65e356f92a.
2023-11-07 10:53:22 +01:00
Nikita Popov
e360a16fee
[GlobalOpt] Cache whether CC is changeable (#71381)
The hasAddressTaken() call in hasOnlyColdCalls() has quadratic
complexity if there are many cold calls to a function: We're going to
visit each call of the function, and then for each of them iterate all
the users of the function.

We've recently encountered a case where GlobalOpt spends more than an
hour in these hasAddressTaken() checks when full LTO is used.

Avoid this by moving the hasAddressTaken() check into hasChangeableCC()
and caching its result, so it is only computed once per function.
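
The caching pattern described here can be sketched in isolation. The code below is a toy model, not the GlobalOpt implementation: `ToyFunction`, `ChangeableCCCache` and the `Scans` counter are hypothetical, and the user list is reduced to a vector of flags. It shows the expensive per-function scan running once no matter how many call sites ask.

```C++
#include <string>
#include <unordered_map>
#include <vector>

// Toy function record; Users stands in for the use list walked by the check.
struct ToyFunction {
  std::string Name;
  std::vector<int> Users; // 0 = direct call, 1 = address-taking use
};

// Caches the per-function answer so the O(#users) scan runs once per function.
class ChangeableCCCache {
  std::unordered_map<const ToyFunction *, bool> Cache;

public:
  unsigned Scans = 0; // counts full scans, for illustration only

  bool hasChangeableCC(const ToyFunction &F) {
    auto It = Cache.find(&F);
    if (It != Cache.end())
      return It->second; // cached: no rescan of the users
    ++Scans;
    bool AddressTaken = false;
    for (int U : F.Users)
      AddressTaken |= (U == 1);
    return Cache[&F] = !AddressTaken;
  }
};
```

Querying the same function from many call sites then costs one scan plus hash lookups, instead of one scan per call site.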
2023-11-07 10:36:45 +01:00
Allen
a0cd6265bc
[InstCombine] Split the FMul with reassoc into a helper function, NFC (#71493)
The reassoc check is really hard to find because the handling branch is
too large, so split it into a helper function.
2023-11-07 15:30:56 +08:00
Philip Reames
23099ac239
Add known and demanded bits support for zext nneg (#70858)
zext nneg was recently added to the IR in #67982.   This patch teaches
demanded bits and known bits about the semantics of the instruction, and
adds a couple of test cases to illustrate basic functionality.
2023-11-06 18:47:56 -08:00
LiqinWeng
5d3d08463d
[InstCombinePHI] Remove dead PHI on UnaryOperator (#71386)
This patch mainly solves the problem of dead PHI on UnaryOperator
2023-11-07 09:45:33 +08:00
Tom Stellard
2400c54c37
[Vectorize] Remove Transforms/Vectorize.h (#71294)
The only thing in this file is a declaration for
createLoadStoreVectorizerPass(), and this function is already declared
in LoadStoreVectorizer.h.
2023-11-06 14:04:22 -08:00
Simon Pilgrim
3ca4fe80d4 [Transforms] Use StringRef::starts_with/ends_with instead of startswith/endswith. NFC.
startswith/endswith wrap starts_with/ends_with and will eventually go away (to more closely match string_view)
2023-11-06 16:50:18 +00:00
Florian Hahn
a002271972
[VPlan] Add VPValue::replaceUsesWithIf (NFCI).
Add replaceUsesWithIf helper and use it in a few places.
2023-11-06 16:08:22 +00:00