35540 Commits

Author SHA1 Message Date
Florian Hahn
3683852d49
[VPlan] Use replaceUsesWithIf in replaceAllUsesWith and add comment (NFCI).
Follow-up to post-commit comments for b1bfe221e6.
2024-01-21 12:56:16 +00:00
Kazu Hirata
b7a66d0fae [llvm] Use SmallString::operator std::string (NFC) 2024-01-19 18:54:11 -08:00
Fangrui Song
c71a5bf940
[msan] Unpoison indirect outputs for userspace when -msan-handle-asm-conservative is specified (#77393)
KMSAN defaults to `msan-handle-asm-conservative`, which inserts
`__msan_instrument_asm_store` calls to unpoison indirect outputs in
inline assembly (e.g. `=m` constraints in source).

```c
unsigned f() {
  unsigned v;
  // __msan_instrument_asm_store unpoisons v before invoking the asm.
  asm("movl $1,%0" : "=m"(v));
  return v;
}
```

Extend the mechanism to userspace, but require explicit
`-mllvm -msan-handle-asm-conservative` for experiments for now.

As https://docs.kernel.org/dev-tools/kmsan.html#inline-assembly-instrumentation
says, this approach may mask certain errors (an indirect output may not
actually be initialized), but it also helps to avoid a lot of false
positives.

Link: https://github.com/google/sanitizers/issues/192
2024-01-19 16:18:28 -08:00
Pranav Kant
4482fd846a Revert "[InstCombine] Try to fold trunc(shuffle(zext)) to just a shuffle (#78636)"
This reverts commit 4d11f04b20f0bd7488e19e8f178ba028412fa519.

This breaks some programs as mentioned in #78636
2024-01-19 21:02:20 +00:00
Manish Kausik H
a0b9117454
LoopDeletion: Move EH pad check before the isLoopNeverExecuted Check (#78189)
This commit modifies `LoopDeletion::deleteLoopIfDead` to check if the
exit block of a loop is an EH pad before checking if the loop gets
executed. This handles the case where an unreachable loop has a
landingpad as an Exit block, and the loop gets deleted, leaving leaving
the landingpad without an edge from an unwind clause.

Fixes #76852.
2024-01-19 15:30:20 +01:00
Alexey Bataev
4d11f04b20
[InstCombine] Try to fold trunc(shuffle(zext)) to just a shuffle (#78636)
Tries to remove extra trunc/ext instruction for shufflevector
instructions.
2024-01-19 09:29:01 -05:00
Jay Foad
7017efa1a1 Fix typo "widended" 2024-01-19 13:50:26 +00:00
Florian Hahn
42fb1fac9e
[VPlan] Use DebugLoc from recipe in VPWidenCallRecipe (NFCI).
Instead of using the debug location of the underlying instruction, use
the debug location from the recipe. This removes an unneeded dependency
of the underlying instruction.
2024-01-19 13:33:03 +00:00
Florian Hahn
abdb61f5fd
[VPlan] Introduce VPSingleDefRecipe. (#77023)
This patch introduces a new common base class for recipes defining a
single result VPValue. This has been discussed/mentioned at various
previous reviews as potential follow-up and helps to replace various
getVPSingleValue calls.

PR: https://github.com/llvm/llvm-project/pull/77023
2024-01-19 10:27:53 +00:00
yonillasky
9299ca797a
[Coroutines] Fix inline comment about frame layout (#78626)
`ResumeIndex` isn't part of the frame struct header, so it necessarily
appears after the promise.

Co-authored-by: Yoni Lavi <yoni.lavi@nextsilicon.com>
2024-01-19 09:46:15 +08:00
Yingwei Zheng
9acc404230
[InstCombine] Recognize more rotation patterns (#78107)
InstCombine already handles the pattern `(shl ShVal, (X & (Width - 1)))
| (lshr ShVal, ((-X) & (Width - 1)))`. Under certain circumstances, `X &
(Width - 1)` will be simplified to `X`. Therefore, this patch adds
support for the pattern `(shl ShVal, X) | (lshr ShVal, ((-X) & (Width -
1)))`.

Alive2: https://alive2.llvm.org/ce/z/P7JQ2V
2024-01-18 20:29:53 +08:00
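
As a rough illustration of the rotation pattern described in 9acc404230, the C sketch below compares the fully masked form with the form whose left-shift amount is unmasked, assuming a 32-bit width. The function names are hypothetical and not taken from the patch; the real transform works on LLVM IR, where an out-of-range shift produces poison rather than C undefined behavior.

```c
#include <assert.h>
#include <stdint.h>

/* Form already handled: both shift amounts are masked with Width - 1. */
static uint32_t rotl32_masked(uint32_t v, uint32_t x) {
  return (v << (x & 31u)) | (v >> ((0u - x) & 31u));
}

/* Newly recognized form: the left-shift amount is used unmasked.
   Only valid for x < 32, because shifting by >= 32 is undefined in C. */
static uint32_t rotl32_unmasked(uint32_t v, uint32_t x) {
  return (v << x) | (v >> ((0u - x) & 31u));
}

/* Hypothetical helper: checks that both forms agree for every valid x. */
static void check_rotl32(uint32_t v) {
  for (uint32_t x = 0; x < 32; ++x)
    assert(rotl32_masked(v, x) == rotl32_unmasked(v, x));
}
```

Calling `check_rotl32` with any value exercises all 32 in-range shift amounts; the two forms only differ for amounts the unmasked form does not define.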
Congcong Cai
64e94438a4
[InstCombine] combine mul(abs(x),abs(y)) to abs(mul(x,y)) (#78395)
Fixes: https://github.com/llvm/llvm-project/issues/78076
Alive2 Proof: https://alive2.llvm.org/ce/z/XEDy0f
2024-01-18 20:12:00 +08:00
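
The arithmetic identity behind 64e94438a4 can be sketched in C as follows. This is only an illustration under a simplifying assumption: 32-bit inputs are widened to 64 bits so that neither the product nor the absolute value can overflow. The function names are hypothetical, and the conditions the actual fold imposes on fixed-width IR values are covered by the linked Alive2 proof, not by this sketch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* mul(abs(x), abs(y)), computed in 64 bits to avoid overflow. */
static int64_t mul_of_abs(int32_t x, int32_t y) {
  return (int64_t)(llabs((long long)x) * llabs((long long)y));
}

/* abs(mul(x, y)), computed in 64 bits to avoid overflow. */
static int64_t abs_of_mul(int32_t x, int32_t y) {
  return (int64_t)llabs((long long)x * (long long)y);
}

/* Hypothetical helper: compares both forms on a few boundary values. */
static void check_abs_mul(void) {
  static const int32_t vals[] = {INT32_MIN, -7, -1, 0, 1, 3, INT32_MAX};
  const unsigned n = sizeof vals / sizeof vals[0];
  for (unsigned i = 0; i < n; ++i)
    for (unsigned j = 0; j < n; ++j)
      assert(mul_of_abs(vals[i], vals[j]) == abs_of_mul(vals[i], vals[j]));
}
```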
Paschalis Mpeis
37c87d5689
[LV][AArch64] LoopVectorizer allows scalable frem instructions (#76247)
LoopVectorizer is aware when a target can replace a scalable frem
instruction with a vector library call for a given VF and it returns the
relevant cost. Otherwise, it returns an invalid cost (as previously).

Add tests that check costs on AArch64, both when no vector library is
available and when one is (with and without tail-folding).

NOTE: Invoking CostModel directly (not through LV) would still return
invalid costs.
2024-01-18 08:32:53 +00:00
Cyndy Ishida
735adbf1a8
[llvm] Teach MachO about XROS (#78373)
Add support for XROS to encode in Mach-O file formats.
2024-01-17 10:35:20 -08:00
alexfh
2d5cc1c9b3
Revert "[SimplifyCFG] switch: Do Not Transform the Default Case if the Condition is Too Wide" (#78469)
Reverts llvm/llvm-project#77831, which depends on #76669, which
seriously regresses compilation time / memory usage; see
https://github.com/llvm/llvm-project/pull/76669#issuecomment-1889271710.
2024-01-17 19:04:34 +01:00
Stephen Tozer
69ec35fbec Revert "Create overloads of debug intrinsic utilities for DPValues (#78313)"
This reverts commit 4f57e207, which added several unused functions, causing
build errors on some buildbots.
2024-01-17 15:51:48 +00:00
Stephen Tozer
4f57e2076b
[RemoveDIs][DebugInfo] Create overloads of debug intrinsic utilities for DPValues (#78313)
In preparation for the major chunk of the assignment tracking
implementation, this patch adds a new set of overloaded versions of
existing functions that take DbgVariableIntrinsics, with the overloads
taking DPValues. This is used specifically to allow us to use generic code
to handle both DbgVariableIntrinsics and DPValues, reducing code
duplication. This patch doesn't actually add the uses of these functions.
2024-01-17 15:36:52 +00:00
Bruno De Fraine
656bf13004
[AST] Don't merge memory locations in AliasSetTracker (#65731)
This changes the AliasSetTracker to track memory locations instead of
pointers in its alias sets. The motivation for this is outlined in an RFC
posted on LLVM discourse:
https://discourse.llvm.org/t/rfc-dont-merge-memory-locations-in-aliassettracker/73336

In the data structures of the AST implementation, I made the choice to
replace the linked list of `PointerRec` entries (that had to go anyway)
with a simple flat vector of `MemoryLocation` objects, but for the
`AliasSet` objects referenced from a lookup table, I retained the
mechanism of a linked list, reference counting, forwarding, etc. The
data structures could be revised in a follow-up change.
2024-01-17 15:59:13 +01:00
Alexandros Lamprineas
92289db82f
[VFABI] Move the Vector ABI demangling utility to LLVMCore. (#77513)
This fixes #71892, allowing us to check mangled names in the IR verifier.
2024-01-17 09:55:30 +00:00
Tanmay
4426a1b759
[InstCombine] Add log-pow simplification for FP exponent edge case. (#76641)
Fixes https://github.com/llvm/llvm-project/issues/76549

The optimization was missed for two reasons:
1. `optimizePow` converting almost integer FP exponents to integer, and
turning `pow` to `powi`.
2. `optimizeLog` not accepting `Intrinsic::powi` as a target.

This patch converts constantInt back to constantFP where applicable and
adds a test.
2024-01-17 16:50:10 +07:00
Davide Italiano
b6f922fbf5 Revert "[CloneFunction][DebugInfo] Avoid cloning DILocalVariables of inlined functions (#75385)"
This reverts commit fc6faa1113e9069f41b5500db051210af0eea843.
2024-01-16 17:01:01 -08:00
Teresa Johnson
070738ba88
[MemProf][NFC] Explicitly specify llvm version of function_ref (#77783)
As suggested in https://github.com/llvm/llvm-project/pull/75823, to
avoid confusion with std::function_ref, qualify all uses with llvm::
(we were already using the llvm version, but this avoids ambiguity).
2024-01-16 11:20:55 -08:00
David Green
7850c94b86 [NFC] sentinal -> sentinel 2024-01-16 17:22:06 +00:00
Alexey Bataev
093206bb7e [SLP]Fix PR78298: Assertion `GEP->getNumIndices() == 1 &&
!isa<Constant>(GEPIdx)' failed.

The non-constant index might be folded to a constant during earlier stages
of vectorization. Need to consider this option and filter out GEPs
with constant indices from the candidates list.
2024-01-16 09:17:35 -08:00
yonillasky
be690ea3db
[Coroutines] Fix incorrect attribute name coroutine.presplit (NFC) (#78296)
Those are probably leftovers from an old name of the same attribute.
Fixed for the sake of consistency.

Co-authored-by: Yoni Lavi <yoni.lavi@nextsilicon.com>
2024-01-16 17:35:07 +01:00
Florian Hahn
9a402d6fbb
[LV] Make DL optional argument for VPBuilder member functions (NFCI). 2024-01-16 15:50:09 +00:00
Florian Hahn
e7671bc9d6
[LV] Fix indent for loop in adjustRecipesForReductions (NFC). 2024-01-16 15:28:46 +00:00
Alexey Bataev
d79fdb2749 [SLP]Fix PR78236: correctly track external values, replaced several
times during reduction vectorization.

If the external value was replaced in the vectorizer several times during reduction vectorization, need to find the original value to correctly handle external uses and emit extractelement instructions properly.
2024-01-16 06:52:43 -08:00
Florian Hahn
6011d6b2cc
[VPlan] Use start value of reduction phi to determine type (NFCI).
Instead of accessing the underlying original IR value, check the type of
the start value from the recipe directly.
2024-01-16 14:39:51 +00:00
XChy
26d3cd1d07
[MoveAutoInit] Ignore unreachable basicblocks and handle catchswitch (#78232)
Fixes #78049
This patch makes two changes:
- Ignore unreachable predecessors when looking for nearest common
dominator.
- Check catchswitch with `getFirstNonPHI`, instead of
`getFirstInsertionPt`. The latter skips EHPad.
2024-01-16 18:45:44 +08:00
Nikita Popov
de8f782355 Revert "Simplify (a % b) lt/ge (b-1) into (a % b) eq/ne (b-1) (#72504)"
This reverts commit 01f4d40aad58c5c34a8ae30edbf4e0ebbf235838.

Causes test failures.
2024-01-16 11:39:42 +01:00
elhewaty
01f4d40aad
Simplify (a % b) lt/ge (b-1) into (a % b) eq/ne (b-1) (#72504)
Alive2: https://alive2.llvm.org/ce/z/i7zYtE
Fixes: https://github.com/llvm/llvm-project/issues/71280
2024-01-16 10:15:15 +01:00
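
The reasoning behind 01f4d40aad can be sketched in C, assuming unsigned operands and a nonzero `b` (the commit title does not state these conditions, so they are assumptions here). Since `a % b` is at most `b - 1`, "less than `b - 1`" is the same as "not equal to `b - 1`", and "greater than or equal to `b - 1`" is the same as "equal to `b - 1`". The function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Original comparisons. */
static bool rem_lt(unsigned a, unsigned b) { return (a % b) < (b - 1); }
static bool rem_ge(unsigned a, unsigned b) { return (a % b) >= (b - 1); }

/* Simplified comparisons. */
static bool rem_ne(unsigned a, unsigned b) { return (a % b) != (b - 1); }
static bool rem_eq(unsigned a, unsigned b) { return (a % b) == (b - 1); }

/* Hypothetical helper: exhaustively compares the forms for small values,
   skipping b == 0 because the remainder is undefined there. */
static void check_rem_folds(unsigned limit) {
  for (unsigned b = 1; b <= limit; ++b)
    for (unsigned a = 0; a <= limit; ++a) {
      assert(rem_lt(a, b) == rem_ne(a, b));
      assert(rem_ge(a, b) == rem_eq(a, b));
    }
}
```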
Kazu Hirata
d041af3019 [Transforms] Use a range-based for loop (NFC) 2024-01-15 21:25:50 -08:00
Mel Chen
b6e8f6604c
[LV] Skipping all debug instructions when native vplan is enabled (#77413)
The following internal error occurred when using native vplan to
vectorize a program with debug info generation enabled.

Assertion `!isa<DbgInfoIntrinsic>(CI) && "DbgInfoIntrinsic should have been dropped during VPlan construction"' failed.

This patch ignores all debug instructions to fix the error when native
vplan is enabled.
2024-01-16 11:08:10 +08:00
XChy
2c0fc0f37f
[DFAJumpThreading] Handle circular determinator (#78177)
Fixes the buildbot failure in
https://github.com/llvm/llvm-project/pull/78134#issuecomment-1892195197
When a path has a single `determinator`, the determinator
actually takes itself as a predecessor. Thus, we need to let `Prev` be
the determinator when `PathBBs` has only one element.
2024-01-15 17:52:53 -08:00
Noah Goldstein
60e8915d22 [InstCombine] Add folds for (add/sub/disjoint_or/icmp C, (ctpop (not x)))
`(ctpop (not x))` <-> `(sub nuw nsw BitWidth(x), (ctpop x))`. The
`sub` expression can sometimes be constant folded depending on the use
case of `(ctpop (not x))`.

This patch adds folds for the following cases:

`(add/sub/disjoint_or C, (ctpop (not x)))`
    -> `(add/sub/disjoint_or C', (ctpop x))`
`(cmp pred C, (ctpop (not x)))`
    -> `(cmp swapped_pred C', (ctpop x))`

Where `C'` depends on how we constant fold `C` with `BitWidth(x)` for
the given opcode.

Proofs: https://alive2.llvm.org/ce/z/qUgfF3

Closes #77859
2024-01-15 12:05:38 -08:00
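
A small C sketch of the identity used in 60e8915d22, assuming a 32-bit width: `ctpop(~x)` equals `32 - ctpop(x)`, so a constant combined with `ctpop(~x)` can be rewritten against `ctpop(x)` with an adjusted constant. The concrete adjusted constants below (`C + 32` for the add, `32 - C` for the unsigned compare) are derived here for illustration and are not quoted from the patch; the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Portable population count (clears the lowest set bit each step). */
static uint32_t ctpop32(uint32_t x) {
  uint32_t n = 0;
  while (x) {
    x &= x - 1;
    ++n;
  }
  return n;
}

/* add C, (ctpop (not x)) */
static uint32_t add_ctpop_not(uint32_t c, uint32_t x) {
  return c + ctpop32(~x);
}

/* Rewritten against ctpop(x): (C + 32) - ctpop(x). */
static uint32_t add_ctpop_folded(uint32_t c, uint32_t x) {
  return (c + 32u) - ctpop32(x);
}

/* icmp ult C, (ctpop (not x)) */
static int ult_ctpop_not(uint32_t c, uint32_t x) {
  return c < ctpop32(~x);
}

/* Swapped predicate with C' = 32 - C; assumes C <= 32 so C' does not wrap. */
static int ugt_ctpop_folded(uint32_t c, uint32_t x) {
  return (32u - c) > ctpop32(x);
}

/* Hypothetical helper: compares original and rewritten forms. */
static void check_ctpop_folds(void) {
  static const uint32_t xs[] = {0u, 1u, 0xFFu, 0x80000000u,
                                0xDEADBEEFu, 0xFFFFFFFFu};
  for (unsigned i = 0; i < sizeof xs / sizeof xs[0]; ++i)
    for (uint32_t c = 0; c <= 32; ++c) {
      assert(add_ctpop_not(c, xs[i]) == add_ctpop_folded(c, xs[i]));
      assert(ult_ctpop_not(c, xs[i]) == ugt_ctpop_folded(c, xs[i]));
    }
}
```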
Stephen Tozer
304119860a
[DebugInfo][RemoveDIs][NFC] Split findDbgDeclares into two functions (#77478)
This patch follows on from comments on
https://github.com/llvm/llvm-project/pull/73498, implementing the
proposed split of findDbgDeclares into two separate functions for
DbgDeclareInsts and DPVDeclares, which return containers rather than
taking containers by reference.
2024-01-15 17:46:56 +00:00
XChy
019ffbf324
[DFAJumpThreading] Extends the bitwidth of state from uint64_t to APInt (#78134)
Fixes #78059
2024-01-15 18:24:18 +08:00
Vitaly Buka
253d2f931e
Revert "[InstCombine] Fold icmp pred (inttoptr X), (inttoptr Y) -> icmp pred X, Y" (#78023)
Reverts llvm/llvm-project#77832

To fix https://lab.llvm.org/buildbot/#/builders/236/builds/8673

Also, truncation to a shorter type looks incorrect.

Tracking issue: #78024.
2024-01-13 11:15:30 -08:00
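
The truncation concern mentioned in 253d2f931e can be illustrated with a C model. It assumes a hypothetical target with 32-bit pointers and 64-bit integer operands, and it models `inttoptr` as plain integer truncation; none of these names come from the patch. Two integers that differ only in their high bits compare unequal, while the pointers produced from them compare equal, so replacing the pointer comparison with the integer comparison would change the result.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Models inttoptr from i64 to a hypothetical 32-bit pointer:
   the high 32 bits are dropped. */
static uint32_t inttoptr32(uint64_t v) { return (uint32_t)v; }

/* icmp eq (inttoptr X), (inttoptr Y) */
static bool cmp_ptrs_eq(uint64_t x, uint64_t y) {
  return inttoptr32(x) == inttoptr32(y);
}

/* icmp eq X, Y: the result the fold would produce. */
static bool cmp_ints_eq(uint64_t x, uint64_t y) { return x == y; }

/* Hypothetical helper: a counterexample where the two comparisons differ. */
static void check_truncation_counterexample(void) {
  const uint64_t x = 0x100000001ULL; /* differs from y only in bit 32 */
  const uint64_t y = 0x000000001ULL;
  assert(cmp_ptrs_eq(x, y));
  assert(!cmp_ints_eq(x, y));
}
```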
Alexey Bataev
6fdc2ce8c5 [SLP]Fix PR77916: transform the whole mask, not only the elements for
the second vector.

Need to transform all elements in the long mask. If we decide to
produce a shorter version, some elements may still have incorrect indices
after transformation for the first vector in the permutation.
2024-01-12 07:07:43 -08:00
Yingwei Zheng
2aae304cbc
[InstCombine] Fold icmp pred (inttoptr X), (inttoptr Y) -> icmp pred X, Y (#77832)
NOTE: Alive2 proofs are unavailable because `inttoptr` is unsupported.
2024-01-12 23:03:07 +08:00
Qiongsi Wu
39bb790b90
[SimplifyCFG] switch: Do Not Transform the Default Case if the Condition is Too Wide (#77831)
https://github.com/llvm/llvm-project/pull/76669 taught SimplifyCFG to
handle switches when `default` has only one case. When the `switch`'s
condition is wider than 64 bits, the current implementation can calculate
the wrong default value. This PR skips cases where the condition is too
wide.
2024-01-12 08:54:35 -05:00
Nikita Popov
6c2fbc3a68
[IRBuilder] Add CreatePtrAdd() method (NFC) (#77582)
This abstracts over the common pattern of creating a gep with i8 element
type.
2024-01-12 14:21:21 +01:00
Florian Hahn
59d6f033a2
[VPlan] Support narrowing widened loads in truncateToMinimalBitwidths.
MinBWs may also contain widened load instructions, handle them by only
narrowing their result.

Fixes https://github.com/llvm/llvm-project/issues/77468
2024-01-12 13:14:13 +00:00
Alexey Bataev
39b2104b4a [SLP]Fix a crash for reduced values with minbitwidth, which are reused.
If the reduced values are additionally affected by minbitwidth analysis,
need to cast them to a proper type before doing any math, if they are
reused.
2024-01-12 04:49:48 -08:00
Kazu Hirata
7b9bc4729b [IPO] Use a range-based for loop (NFC) 2024-01-11 22:48:22 -08:00
Fangrui Song
7740565f56 [asan] Enable StackSafetyAnalysis by default
StackSafetyAnalysis determines whether stack-allocated variables are
guaranteed to be safe from memory access bugs and enables the removal of
certain unneeded instrumentations.
(hwasan enables StackSafetyAnalysis in https://reviews.llvm.org/D108381)

In a release build of clang, text sections are 9% smaller.

Test updates:

* asan-stack-safety.ll: test the -asan-use-stack-safety=1 default
* lifetime-uar-uas.ll: switch to an indexed store to prevent
  StackSafetyAnalysis from optimizing out instrumentation for %c
* alloca_vla_interact.cpp: add a load to prevent StackSafetyAnalysis
  from optimizing out `__asan_alloca_poison` for the VLA `array`
* scariness_score_test.cpp: add -asan-use-stack-safety=0 to make a load
  of a `__asan_poison_memory_region`-poisoned local variable fail as
  intended.
* other .ll tests: add -asan-use-stack-safety=0

Reviewed By: kstoimenov

Pull Request: https://github.com/llvm/llvm-project/pull/77210
2024-01-11 14:03:28 -08:00
Zequan Wu
e7f7948751 Revert "[asan] Enable StackSafetyAnalysis by default"
This reverts commit 51fbab134560ece663517bf1e8c2a30300d08f1a.
This causes the compiler to crash. Will file an issue to track the status.
2024-01-11 15:24:44 -05:00
Philip Reames
f5dd70c582
[LSR] Require non-zero step when considering wrap around for term folding (#77809)
The term folding logic needs to prove that the induction variable does
not cycle through the same set of values so that testing for the value
of the IV on the exiting iteration is guaranteed to trigger only on that
iteration. The prior code checked the no-self-wrap property on the IV,
but this is insufficient as a zero step is trivially no-self-wrap per
SCEV's definition but does repeat the same series of values.

In its current form, this effectively disables LSR's term-folding for
all non-constant strides. This is still a net improvement: we've
disabled term-folding entirely, so being able to enable it for constant
strides is a gain.

As future work, there are two SCEV weaknesses worth investigating.

First, sext (or i32 %a, 1) to i64 does not return true for
isKnownNonZero. This is because we check only the unsigned range in that
query. We could either do query pushdown, or check the signed range as
well. I tried the second locally and it has very broad impact - i.e. we
have a bunch of missing optimizations here.

Second, zext (or i32 %a, 1) to i64 as the increment to the IV in
expensive_expand_short_tc causes the addrec to no longer be provably
no-self-wrap. I didn't investigate this so it might be necessary, but
the loop structure is such that I find this result surprising.
2024-01-11 10:07:17 -08:00
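
To make the zero-step problem described in f5dd70c582 concrete, the C sketch below counts how many iterations of a loop see the induction variable equal to its value on the last iteration. It is a simplified model with hypothetical names, not LSR code, and it assumes `n >= 1` and no wrap-around for the nonzero-step case. With a nonzero step the final value is reached exactly once, so testing for it identifies the exiting iteration; with a zero step every iteration matches, so such a test would exit too early.

```c
#include <assert.h>
#include <stdint.h>

/* Counts iterations i in [0, n) on which the IV equals the value it
   has on the last iteration (i == n - 1). Assumes n >= 1. */
static unsigned count_exit_hits(uint32_t start, uint32_t step, unsigned n) {
  const uint32_t final_iv = start + step * (uint32_t)(n - 1);
  unsigned hits = 0;
  uint32_t iv = start;
  for (unsigned i = 0; i < n; ++i, iv += step)
    if (iv == final_iv)
      ++hits;
  return hits;
}

/* Hypothetical helper: contrasts a nonzero step with a zero step. */
static void check_exit_hits(void) {
  assert(count_exit_hits(100u, 4u, 10u) == 1u);  /* unique exit value */
  assert(count_exit_hits(100u, 0u, 10u) == 10u); /* every iteration matches */
}
```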
Vladislav Dzhidzhoev
fc6faa1113
[CloneFunction][DebugInfo] Avoid cloning DILocalVariables of inlined functions (#75385)
- [DebugMetadata][DwarfDebug] Support function-local types in lexical
block scopes (4/7)
- [CloneFunction][DebugInfo] Avoid cloning DILocalVariables of inlined
functions

This is a follow-up for https://reviews.llvm.org/D144006, fixing a crash
reported in Chromium (https://reviews.llvm.org/D144006#4651955).

The first commit is added for convenience, as it has already been
accepted.

If DISubprogram was not cloned (e.g. we are cloning a function that has
other functions inlined into it, and subprograms of the inlined functions
are not supposed to be cloned), it doesn't make sense to clone its
DILocalVariables as well.
Otherwise we get duplicated DILocalVariables not tracked in their
subprogram's retainedNodes, that crash LTO with Chromium.

This is meant to be committed along with
https://reviews.llvm.org/D144006.
2024-01-11 17:08:12 +01:00