35577 Commits

Author SHA1 Message Date
DianQK
a58dcc5e08
Reland "[SimplifyCFG] Improve the precision of PtrValueMayBeModified"
This relands commit f890f010f6a70addbd885acd0c8d1b9578b6246f.

The result value of `getelementptr inbounds (TY, null, not zero)` is a poison value.
We can think of it as undefined behavior.
2024-01-25 06:42:14 +08:00
DianQK
a0c1b5bdda
Reland "[SimplifyCFG] Check if the return instruction causes undefined behavior"
This relands commit b6a0be8ce3114d0c57e7a7d6c3c222986ca506ad.

Returning undef for a noundef return value is undefined behavior.

Example:

```
define noundef i32 @test_ret_noundef(i1 %cond) {
entry:
  br i1 %cond, label %bb1, label %bb2
bb1:
  br label %bb2
bb2:
  %r = phi i32 [ undef, %entry ], [ 1, %bb1 ]
  ret i32 %r
}
```
2024-01-25 06:42:14 +08:00
Alexey Bataev
36e4a7ecca [SLP]Fix PR79321: SLPVectorizer's PHICompare doesn't provide a strict
weak ordering.
Try to make PHICompare meet the strict weak ordering criteria.
2024-01-24 13:46:05 -08:00
Alexey Bataev
48bbd76587 [SLP]Fix PR79229: Check that extractelement is used only in a single node
before erasing.

Before trying to erase the extractelement instruction, it is not enough
to check for a single use; we also need to check that it is not used in
several nodes because of the preliminary node reordering.
2024-01-24 11:22:22 -08:00
Alexey Bataev
ca654acc16 [SLP]Fix PR79321: SLPVectorizer's PHICompare doesn't provide a strict
weak ordering.

Compared NumUses to meet the requirements of the strict weak ordering.
2024-01-24 09:36:25 -08:00
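A comparator handed to a sort has to be a strict weak ordering: irreflexive, asymmetric, and transitive, with transitive incomparability. The sketch below is not the SLPVectorizer code; `PhiInfo`, `numUses`, and `id` are hypothetical names, used only to show that a lexicographic comparison of a key tuple satisfies these properties and how they can be brute-force checked on a sample.

```cpp
#include <algorithm>
#include <tuple>
#include <vector>

// Hypothetical stand-in for the data a PHI comparator might inspect.
struct PhiInfo {
  unsigned numUses; // primary sort key
  unsigned id;      // tie-breaker
};

// Lexicographic comparison of a key tuple is a strict weak ordering.
inline bool phiLess(const PhiInfo &a, const PhiInfo &b) {
  return std::tie(a.numUses, a.id) < std::tie(b.numUses, b.id);
}

// True when neither element orders before the other.
inline bool phiEquiv(const PhiInfo &a, const PhiInfo &b) {
  return !phiLess(a, b) && !phiLess(b, a);
}

// Brute-force check of the strict weak ordering axioms on a sample.
inline bool isStrictWeakOrderOn(const std::vector<PhiInfo> &xs) {
  for (const PhiInfo &a : xs) {
    if (phiLess(a, a)) return false; // irreflexivity
    for (const PhiInfo &b : xs) {
      if (phiLess(a, b) && phiLess(b, a)) return false; // asymmetry
      for (const PhiInfo &c : xs) {
        // transitivity of the ordering
        if (phiLess(a, b) && phiLess(b, c) && !phiLess(a, c)) return false;
        // transitivity of incomparability
        if (phiEquiv(a, b) && phiEquiv(b, c) && !phiEquiv(a, c)) return false;
      }
    }
  }
  return true;
}

// Sorting with the comparator is well defined once the axioms hold.
inline std::vector<PhiInfo> sortedByUses(std::vector<PhiInfo> xs) {
  std::sort(xs.begin(), xs.end(), phiLess);
  return xs;
}
```

A comparator that violates any of these axioms makes `std::sort` undefined behavior, which is why this kind of bug tends to surface as a crash or as an assertion in checked standard library builds.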
Kazu Hirata
1605bf5815
[ConstraintElimination] Use std::move in the constructor (NFC) (#79259)
Moving the contents of Coefficients saves 0.43% of heap allocations
during the compilation of a large preprocessed file, namely
X86ISelLowering.cpp, for the X86 target.
2024-01-24 09:18:57 -08:00
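The saving described in the ConstraintElimination commit above comes from moving a container into a member instead of copying it. A minimal sketch of that idiom, assuming a hypothetical `ConstraintRow` type (the real class and constructor signature in ConstraintElimination may differ):

```cpp
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical row type; only the Coefficients name comes from the commit.
struct ConstraintRow {
  std::vector<int64_t> Coefficients;

  // Take the vector by value and move it into the member, so a caller that
  // passes an rvalue hands over its buffer instead of triggering a copy.
  explicit ConstraintRow(std::vector<int64_t> C)
      : Coefficients(std::move(C)) {}
};
```

Constructing with `ConstraintRow r(std::move(v));` transfers the heap buffer of `v` to the member, so no new allocation is needed for the coefficients.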
Jeremy Morse
0065d06760
[NFC][DebugInfo] Maintain RemoveDIs flag when attributor creates functions (#79143)
We're using this flag (IsNewDbgInfoFormat) to detect the boundaries in
LLVM of what's treating debug-info as intrinsics (i.e. dbg.value), and
what's using DPValue objects (the non-intrinsic replacement). The
attributor tends to create new wrapper functions and doesn't insert them
into Modules in the usual way, thus we have to manually update that flag
to signal what debug-info mode it's using.

I've added some --try-experimental-debuginfo-iterators RUN lines to
tests that would otherwise crash because of this, so that they're
exercised by our new-debuginfo-iterators buildbot.

NB: there's an attributor test with a dbg.value in it, however
attributes re-order themselves in RemoveDIs mode for various reasons, so
we're going to address that in a different patch.
2024-01-24 15:20:05 +00:00
Florian Hahn
3d91d9613e
[ConstraintElim] Make sure min/max intrinsic results are not poison.
The result of umin may be poison and in that case the added constraints
are not valid in contexts where poison doesn't cause UB. Only queue
facts for min/max intrinsics if the result is guaranteed to not be
poison.

This could be improved in the future, by only adding the fact when
solving conditions using the result value.

Fixes https://github.com/llvm/llvm-project/issues/78621.
2024-01-24 14:25:55 +00:00
Nikita Popov
90ba33099c
[InstCombine] Canonicalize constant GEPs to i8 source element type (#68882)
This patch canonicalizes getelementptr instructions with constant
indices to use the `i8` source element type. This makes it easier for
optimizations to recognize that two GEPs are identical, because they
don't need to see past many different ways to express the same offset.

This is a first step towards
https://discourse.llvm.org/t/rfc-replacing-getelementptr-with-ptradd/68699.
This is limited to constant GEPs only for now, as they have a clear
canonical form, while we're not yet sure how exactly to deal with
variable indices.

The test llvm/test/Transforms/PhaseOrdering/switch_with_geps.ll gives
two representative examples of the kind of optimization improvement we
expect from this change. In the first test SimplifyCFG can now realize
that all switch branches are actually the same. In the second test it
can convert it into simple arithmetic. These are representative of
common optimization failures we see in Rust.

Fixes https://github.com/llvm/llvm-project/issues/69841.
2024-01-24 15:25:29 +01:00
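The idea behind the i8 canonicalization above is that a GEP with constant indices is just a base pointer plus a constant byte offset, no matter which source element type was used to express it. The C++ sketch below mirrors that arithmetic; `Record` and the helper names are hypothetical and stand in for LLVM types, not for any InstCombine code.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical layout standing in for an LLVM struct type { i32, [4 x i32] }.
struct Record {
  int32_t head;
  int32_t tail[4];
};

// Byte offset of tail[i], computed through the struct layout
// (analogous to a GEP with a struct source element type).
inline std::size_t offsetViaStruct(std::size_t i) {
  return offsetof(Record, tail) + i * sizeof(int32_t);
}

// Byte offset of the (1 + i)-th int32_t counted from the start
// (analogous to a GEP with an i32 source element type).
inline std::size_t offsetViaI32(std::size_t i) {
  return (1 + i) * sizeof(int32_t);
}

// Address computed from a plain byte offset
// (analogous to the canonical GEP with an i8 source element type).
inline const unsigned char *byteGep(const void *base, std::size_t bytes) {
  return static_cast<const unsigned char *>(base) + bytes;
}
```

Both offset helpers yield the same number of bytes for the same element, so once everything is expressed as a byte offset, two address computations can be compared by comparing a single integer.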
Nikita Popov
89dae798cc [Loads] Use BatchAAResults for available value APIs (NFCI)
This allows caching AA queries both within and across the calls,
and enables us to use a custom AAQI configuration.
2024-01-24 14:04:21 +01:00
Jeremy Morse
fe0e632b00
[DebugInfo][RemoveDIs] Support DPValues in HWAsan (#78731)
This patch extends HWASAN to support maintenance of debug-info that
isn't stored as intrinsics, but is instead in a DPValue object. This is
straightforward: we collect any such objects in StackInfoBuilder, and
apply the same operations to them as we would to dbg.value and similar
intrinsics.

I've also replaced some calls to getNextNode with debug-info skipping
next calls, and use iterators for instruction insertion rather than
instruction pointers. This avoids any difference in output between
intrinsic / non-intrinsic debug-info, but also means that any debug-info
comes before code inserted by HWAsan, rather than afterwards. See the
test modifications, where the variable assignment (presented as a
dbg.value) jumps up over all the code inserted by HWAsan. Seeing how the
code inserted by HWAsan is always (AFAIUI) given the source-location of
the instruction being instrumented, I don't believe this will have any
effect on which lines variable assignments become visible on; it may
extend the number of instructions covered by the assignments though.
2024-01-24 10:38:35 +00:00
Kazu Hirata
873a7bb129 [Transforms] Use llvm::pred_size and llvm::predecessors (NFC) 2024-01-24 00:27:35 -08:00
Craig Topper
3dea0aa8f4
[LSR] Fix incorrect comment. NFC (#79207) 2024-01-23 17:57:34 -08:00
Jeffrey Byrnes
f709fbb1bb
[SROA] Only try additional vector type candidates when needed (#77678)
f9c2a341b9
causes regressions when we have a slice with integer vector type that is
the same size as the partition, and a ptr load/store slice that is not
the size of the element type.

Ref `vector-promotion.ll:ptrLoadStoreTys`. 

Before the patch, we would only consider `<4 x i32>` as a candidate type
for vector promotion, and would find that it is a viable type for all
the slices.

After the patch, we now add `<2 x ptr>` as a candidate type due to slice
with user `store ptr %val0, ptr %obj, align 8` -- and flag that we
`HaveVecPtrTy`. The pre-existing behavior of this flag results in
removing the viable `<4 x i32>` and keeping only the unviable `<2 x
ptr>`, which results in a failure to promote.

The end result is failing to promote an alloca that was previously
promoted -- this does not appear to be the intent of that patch, which
has the goal of increasing promotions by providing more promotion
opportunities.

This PR preserves this behavior via a simple reorganization of the
implementation: try first the slice types with same size as the partition,
then, if there is no promotable type, try the `LoadStoreTys`.
2024-01-23 17:22:49 -08:00
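The reorganization described in the SROA commit above is a two-phase candidate search. A minimal sketch under stated assumptions: types are modeled as strings, `pickPromotionType` and `firstViable` are hypothetical names, and an empty string means "no promotable type"; the real logic lives in SROA and works on LLVM vector types.

```cpp
#include <functional>
#include <string>
#include <vector>

// Return the first candidate accepted by the viability check, or an empty
// string when none is viable.
inline std::string
firstViable(const std::vector<std::string> &candidates,
            const std::function<bool(const std::string &)> &isViable) {
  for (const std::string &ty : candidates)
    if (isViable(ty)) return ty;
  return std::string();
}

// Two-phase selection: fall back to load/store-derived types only when no
// partition-sized candidate is promotable.
inline std::string
pickPromotionType(const std::vector<std::string> &partitionSizedTys,
                  const std::vector<std::string> &loadStoreTys,
                  const std::function<bool(const std::string &)> &isViable) {
  std::string ty = firstViable(partitionSizedTys, isViable);
  if (!ty.empty()) return ty;
  return firstViable(loadStoreTys, isViable);
}
```

With this ordering, a viable `<4 x i32>` is never discarded just because a `<2 x ptr>` candidate also exists; the pointer-vector candidate is only consulted when the first phase finds nothing.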
Jeffrey Byrnes
2a61be4e4c [SROA] NFC: Extract code to checkVectorTypesForPromotion
Change-Id: Ib6f237cc791a097f8f2411bc1d6502f11d4a748e
2024-01-23 15:40:20 -08:00
Paul Kirth
9d476e1e1a
[clang][FatLTO] Avoid UnifiedLTO until it can support WPD/CFI (#79061)
Currently, the UnifiedLTO pipeline seems to have trouble with several
LTO features, like SplitLTO units, which means we cannot use important
optimizations like Whole Program Devirtualization or security hardening
instrumentation like CFI.

This patch reverts FatLTO to using distinct pipelines for Full LTO and
ThinLTO. It still avoids module cloning, since that was error prone.
2024-01-23 14:04:52 -08:00
Alexey Bataev
bb3e0d7fc3 [SLP]Fix PR79193: skip analysis of gather nodes for minbitwidth.
There is no need to analyze small graphs that consist only of a gather
node; skipping them avoids the crash.
2024-01-23 12:44:49 -08:00
gulfemsavrun
7fe951ad8a
Revert "Reapply [hwasan] Update dbg.assign intrinsics in HWAsan pass #78606" (#79186)

This reverts commit 13c6f1ea2e7eb15fe492d8fca4fa1857c6f86370 because it
causes an assertion in DebugInfoMetadata.cpp:1968 in Clang Linux
builders for Fuchsia.

https://logs.chromium.org/logs/fuchsia/buildbucket/cr-buildbucket/8758111613576762817/+/u/clang/build/stdout
2024-01-23 10:12:10 -08:00
Alina Sbirlea
edeaf41e22
[ConstantHoisting] Cache OptForSize. (#79170)
Cache OptForSize to remove quadratic behavior.

For each constant analyzed, ConstantHoisting calls
`shouldOptimizeForSize(F)`, which calls `PSI.getTotalCallCount(F)`.
PSI.getTotalCallCount(F) goes through all the instructions in all basic
blocks, and checks if each is a call, to count them up.

This reduces `llc` time for a very large IR from ~10min to under 3min.
Reproducer testcase is much too large to share.
2024-01-23 09:42:47 -08:00
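The quadratic behavior in the ConstantHoisting commit above arises when an O(instructions) query runs once per constant. The sketch below illustrates the caching idea only; `FakeFunction`, `gQueryCount`, and both helper functions are hypothetical stand-ins and do not reproduce the ProfileSummaryInfo API.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a function body: one flag per instruction,
// true when the instruction is a call.
struct FakeFunction {
  std::vector<bool> isCall;
};

// Counts how many times the expensive query ran, for illustration only.
static std::size_t gQueryCount = 0;

// Models an O(instructions) query that walks the whole function to count
// calls before answering.
inline bool expensiveOptForSizeQuery(const FakeFunction &F,
                                     std::size_t maxCalls) {
  ++gQueryCount;
  std::size_t calls = 0;
  for (bool c : F.isCall) calls += c ? 1 : 0;
  return calls <= maxCalls;
}

// Runs the query once and reuses the answer for every constant, so the total
// cost is O(instructions + constants) rather than their product.
inline std::size_t countSizeOptimizedConstants(const FakeFunction &F,
                                               std::size_t numConstants,
                                               std::size_t maxCalls) {
  const bool optForSize = expensiveOptForSizeQuery(F, maxCalls); // cached
  std::size_t n = 0;
  for (std::size_t i = 0; i < numConstants; ++i)
    if (optForSize) ++n;
  return n;
}
```

The cached answer is valid here because the query depends only on the function, not on the individual constant being analyzed.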
Jeremy Morse
4782ac8dd3
[DebugInfo][RemoveDIs] Use splice in Outliner rather than moveBefore (#79124)
This patch replaces a utility in the outliner that moves the contents of
one basic block into another basic block, with a call to splice instead.
I think it's NFC, however I'd like a second pair of eyes to look at it
just in case.

The reason for doing this is an edge case in the handling of DPValue
objects, the replacement for dbg.values. If there's a variable
assignment "dangling" at the end of a block (which happens when we
delete the terminator), inserting instructions at end() doesn't shift
the DPValue up into the block. We could probably fix this; but it's much
easier to use splice at the only call site that does this.

Patch adds --try-experimental-debuginfo-iterators to a test to exercise
this code path.
2024-01-23 16:23:48 +00:00
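As an analogy for the Outliner change above, moving the whole contents of one list into another with a single splice relinks the elements instead of reinserting them one by one. This sketch uses `std::list` and a hypothetical `moveContents` helper; LLVM's basic blocks use intrusive lists with their own splice, so treat this only as an illustration of the operation.

```cpp
#include <list>
#include <string>

// A basic block modeled as a list of instruction names (assumption: the real
// code uses LLVM's intrusive lists, not std::list).
using Block = std::list<std::string>;

// Move every element of `from` to the end of `to` in one operation.
// Elements are relinked rather than copied, and `from` is left empty.
inline void moveContents(Block &to, Block &from) {
  to.splice(to.end(), from);
}
```

Because the range is transferred as a unit, anything attached to positions within that range travels with it, which is the property the commit relies on for trailing debug records.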
Stephen Tozer
632f44e5ed
[RemoveDIs][DebugInfo] Handle DPVAssign in most transforms (#78986)
This patch trivially updates various opt passes to handle DPVAssigns. In
all cases, this means some combination of generifying existing code to
handle DPValues and DbgAssignIntrinsics, iterating over DPValues where
previously we did not, or duplicating code for DbgAssignIntrinsics to
the equivalent DPValue function (in inlining and salvageDebugInfo).
2024-01-23 16:16:59 +00:00
OCHyams
13c6f1ea2e Reapply [hwasan] Update dbg.assign intrinsics in HWAsan pass #78606
llvm.dbg.assign intrinsics have 2 {value, expression} pairs; fix hwasan to update
the second expression.

Fixes #76545
2024-01-23 11:24:21 +00:00
AtariDreams
96adf69ba9
[InstCombine] Remove one-use check if other logic operand is constant (#77973)
By using `match(W, m_ImmConstant())`, we do not need to worry about
one-time use anymore.
2024-01-23 12:10:59 +01:00
Stephen Tozer
60e1c835d3
[RemoveDIs][DebugInfo] Update SROA to handle DPVAssigns (#78475)
SROA needs to update llvm.dbg.assign intrinsics when it migrates debug
info in response to alloca splitting; this patch updates the debug info
migration code to handle DPVAssigns as well, making use of generic code
to avoid duplication as much as possible.
2024-01-23 09:37:27 +00:00
Jeremy Morse
be0c809836 [NFC][Debuginfo][RemoveDIs] Switch an insertion to use iterators
With the soon-to-land new-debug-info storage model, it's going to be
important to use iterators for instruction insertion rather than
instruction pointers. This (single line in instcombine) is the last place
that trips up our internal testing for debug-info, where we insert a PHI
and it should be using an iterator.
2024-01-22 23:12:01 +00:00
gulfemsavrun
b00aa1c77b
Revert "Reapply [hwasan] Update dbg.assign intrinsics in HWAsan pass … (#79053)
…#78606"

This reverts commit 76160718df7c1f31ff50a4964d749c2b9d83f9cf because it
caused an assertion failure in emitDbgValue function in Codegen in Clang
Linux toolchain builders for Fuchsia.
https://logs.chromium.org/logs/fuchsia/buildbucket/cr-buildbucket/8758181086086431185/+/u/clang/build/stdout
2024-01-22 12:44:46 -08:00
Stephen Tozer
7c53e9f667
[RemoveDIs][DebugInfo] Add support for DPValues to LoopStrengthReduce (#78706)
This patch trivially extends support for DbgValueInst recovery to
DPValues in LoopStrengthReduce; they are handled identically, so this is
mostly done by reusing the DbgValueInst code (using templates or
auto-parameter lambdas to reduce actual code duplication).
2024-01-22 18:59:19 +00:00
Mingming Liu
5ce286849a
[CGProfile] Use callee's PGO name when caller->callee is an indirect call. (#78610)
- With PGO, indirect call edges are constructed using value profiles, and the profile address is mapped to a function's PGO name. The PGO name is computed using a function's linkage before LTO internalization or global promotion.
- With ThinLTO, local functions [could be
promoted](2663d2cb9c/llvm/lib/Transforms/Utils/FunctionImportUtils.cpp (L288)) to have external linkage; and with
[full](2663d2cb9c/llvm/lib/LTO/LTO.cpp (L1328))
or
[thin](2663d2cb9c/llvm/lib/LTO/LTO.cpp (L448))
LTO, global functions could be internalized. Edge construction should use a function's PGO name before its linkage is updated.
2024-01-22 10:36:03 -08:00
Stephen Tozer
89aa3355e2
[RemoveDIs][DebugInfo] Remove redundant DPVAssigns (#78574)
DPValues are already supported by most of the utilities that remove
redundant debug info after certain passes; the exception to this is
`removeUndefDbgAssignsFromEntryBlock`, which applies only to
llvm.dbg.assigns which were previously unimplemented for DPValues. Now
that DPVAssigns exist, we have to support removing redundant instances
in the same way, which this patch implements.
2024-01-22 18:04:07 +00:00
OCHyams
76160718df Reapply [hwasan] Update dbg.assign intrinsics in HWAsan pass #78606
llvm.dbg.assign intrinsics have 2 {value, expression} pairs; fix hwasan to update
the second expression.

Fixes #76545
2024-01-22 17:07:44 +00:00
Jie Fu
ac3ee1b1ae [Transforms] Fix -Wunused-variable and remove redundant VerifyStates after #75826 (NFC)
llvm-project/llvm/lib/Transforms/Scalar/RewriteStatepointsForGC.cpp:1064:18: error: unused variable 'I' [-Werror,-Wunused-variable]
    Instruction *I = cast<Instruction>(Pair.first);
                 ^
llvm-project/llvm/lib/Transforms/Scalar/RewriteStatepointsForGC.cpp:1066:11: error: unused variable 'BaseValue' [-Werror,-Wunused-variable]
    auto *BaseValue = State.getBaseValue();
          ^
2 errors generated.
2024-01-22 22:55:53 +08:00
Nikita Popov
ebb853fbe5 [ConstraintElim] Remove unused checkCondition() parameters (NFC) 2024-01-22 15:55:35 +01:00
Petr Maj
3c246efd04
True fixpoint algorithm in RS4GC (#75826)
Fixes a problem where the explicit marking of various instructions as
conflicts did not propagate to their users. An example of this:

```
%getelementptr = getelementptr i8, <2 x ptr addrspace(1)> zeroinitializer, <2 x i64> <i64 888, i64 908>
%shufflevector = shufflevector <2 x ptr addrspace(1)> %getelementptr, <2 x ptr addrspace(1)> zeroinitializer, <4 x i32> <i32 0, i32 1, i32 2, i32 3>
%shufflevector1 = shufflevector <2 x ptr addrspace(1)> %getelementptr, <2 x ptr addrspace(1)> zeroinitializer, <4 x i32> <i32 0, i32 1, i32 2, i32 3>
%select = select i1 false, <4 x ptr addrspace(1)> %shufflevector1, <4 x ptr addrspace(1)> %shufflevector
```

Here the vector shuffles will get single base (gep) during the fixpoint
and therefore the select will get a known base (gep). We later mark the
shuffles as conflicts, but this does not change the base of select. This
gets caught by an assert where the select's type will differ from its
(wrong) base later on.

The solution in the MR is to move the explicit conflict marking into the
fixpoint phase.

---------

Co-authored-by: Petr Maj <pmaj@azul.com>
2024-01-22 09:10:04 -05:00
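The RS4GC fix above hinges on applying the explicit conflict marking inside the fixpoint loop so that it reaches users of the marked value. The sketch below is a simplified model, not the RewriteStatepointsForGC implementation: `Node`, `forceConflict`, `meet`, and `computeBases` are hypothetical, and the lattice is reduced to unknown, a single base id, or conflict.

```cpp
#include <cstddef>
#include <vector>

// Lattice value: kUnknown (top), a non-negative base id, or kConflict (bottom).
constexpr int kUnknown = -2;
constexpr int kConflict = -1;

// Hypothetical node: an empty operand list marks a value that is its own base.
struct Node {
  std::vector<std::size_t> operands;
  bool forceConflict = false; // explicit marking, e.g. type differs from base
};

// Combine two lattice values: unknown is neutral, differing values conflict.
inline int meet(int a, int b) {
  if (a == kUnknown) return b;
  if (b == kUnknown) return a;
  return a == b ? a : kConflict;
}

// Iterate to a fixpoint. Because the explicit marking is applied inside the
// loop, a forced conflict flows on to every user of the marked node.
inline std::vector<int> computeBases(const std::vector<Node> &nodes) {
  std::vector<int> state(nodes.size(), kUnknown);
  bool changed = true;
  while (changed) {
    changed = false;
    for (std::size_t i = 0; i < nodes.size(); ++i) {
      int next;
      if (nodes[i].forceConflict) {
        next = kConflict;
      } else if (nodes[i].operands.empty()) {
        next = static_cast<int>(i);
      } else {
        next = kUnknown;
        for (std::size_t op : nodes[i].operands) next = meet(next, state[op]);
      }
      if (next != state[i]) {
        state[i] = next;
        changed = true;
      }
    }
  }
  return state;
}
```

Modeling the commit's example as a gep (node 0), two shuffles of it (nodes 1 and 2, both force-marked), and a select of the shuffles (node 3), the select ends up as a conflict too. If the marking were applied only after the loop, the select would keep the gep as its base, which is the stale state the commit describes.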
Alexey Bataev
5a667bee9c [InstCombine] Try to fold trunc(shuffle(zext)) to just a shuffle (#78636)
Tries to remove extra trunc/ext instruction for shufflevector
instructions.

Differential Review: https://github.com/llvm/llvm-project/pull/78636
2024-01-22 05:50:20 -08:00
Orlando Cazalet-Hyams
5266c1285b
Revert "[hwasan] Update dbg.assign intrinsics in HWAsan pass" (#78971)
Reverts llvm/llvm-project#78606

https://lab.llvm.org/buildbot/#/builders/77/builds/33963
2024-01-22 13:30:50 +00:00
Orlando Cazalet-Hyams
a590f2315f
[hwasan] Update dbg.assign intrinsics in HWAsan pass (#78606)
llvm.dbg.assign intrinsics have 2 {value, expression} pairs; fix hwasan to update
the second expression.

Fixes #76545
2024-01-22 11:38:00 +00:00
Stephen Tozer
6aeb7a71d4
[RemoveDIs][DebugInfo] Add interface changes for AT analysis (#78460)
This patch adds the preliminary changes for handling DPValues in
AssignmentTrackingAnalysis - very few functional changes are included,
but internal data structures have been changed to operate with DPValues
as well as Instructions, allowing future patches to process DPValues
correctly.
2024-01-22 11:05:27 +00:00
Florian Hahn
3683852d49
[VPlan] Use replaceUsesWithIf in replaceAllUseswith and add comment (NFCI).
Follow-up to post-commit comments for b1bfe221e6.
2024-01-21 12:56:16 +00:00
Kazu Hirata
b7a66d0fae [llvm] Use SmallString::operator std::string (NFC) 2024-01-19 18:54:11 -08:00
Fangrui Song
c71a5bf940
[msan] Unpoison indirect outputs for userspace when -msan-handle-asm-conservative is specified (#77393)
KMSAN defaults to `msan-handle-asm-conservative`, which inserts
`__msan_instrument_asm_store` calls to unpoison indirect outputs in
inline assembly (e.g. `=m` constraints in source).

```c
unsigned f() {
  unsigned v;
  // __msan_instrument_asm_store unpoisons v before invoking the asm.
  asm("movl $1,%0" : "=m"(v));
  return v;
}
```

Extend the mechanism to userspace, but require explicit
`-mllvm -msan-handle-asm-conservative` for experiments for now.

As

https://docs.kernel.org/dev-tools/kmsan.html#inline-assembly-instrumentation
says, this approach may mask certain errors (an indirect output may not
actually be initialized), but it also helps to avoid a lot of false
positives.

Link: https://github.com/google/sanitizers/issues/192
2024-01-19 16:18:28 -08:00
Pranav Kant
4482fd846a Revert "[InstCombine] Try to fold trunc(shuffle(zext)) to just a shuffle (#78636)"
This reverts commit 4d11f04b20f0bd7488e19e8f178ba028412fa519.

This breaks some programs as mentioned in #78636
2024-01-19 21:02:20 +00:00
Manish Kausik H
a0b9117454
LoopDeletion: Move EH pad check before the isLoopNeverExecuted Check (#78189)
This commit modifies `LoopDeletion::deleteLoopIfDead` to check if the
exit block of a loop is an EH pad before checking if the loop gets
executed. This handles the case where an unreachable loop has a
landingpad as an Exit block, and the loop gets deleted, leaving
the landingpad without an edge from an unwind clause.

Fixes #76852.
2024-01-19 15:30:20 +01:00
Alexey Bataev
4d11f04b20
[InstCombine] Try to fold trunc(shuffle(zext)) to just a shuffle (#78636)
Tries to remove extra trunc/ext instruction for shufflevector
instructions.
2024-01-19 09:29:01 -05:00
Jay Foad
7017efa1a1 Fix typo "widended" 2024-01-19 13:50:26 +00:00
Florian Hahn
42fb1fac9e
[VPlan] Use DebugLoc from recipe in VPWidenCallRecipe (NFCI).
Instead of using the debug location of the underlying instruction, use
the debug location from the recipe. This removes an unneeded dependency
of the underlying instruction.
2024-01-19 13:33:03 +00:00
Florian Hahn
abdb61f5fd
[VPlan] Introduce VPSingleDefRecipe. (#77023)
This patch introduces a new common base class for recipes defining a
single result VPValue. This has been discussed/mentioned at various
previous reviews as potential follow-up and helps to replace various
getVPSingleValue calls.

PR: https://github.com/llvm/llvm-project/pull/77023
2024-01-19 10:27:53 +00:00
yonillasky
9299ca797a
[Coroutines] Fix inline comment about frame layout (#78626)
`ResumeIndex` isn't part of the frame struct header, so it necessarily
appears after the promise.

Co-authored-by: Yoni Lavi <yoni.lavi@nextsilicon.com>
2024-01-19 09:46:15 +08:00
Yingwei Zheng
9acc404230
[InstCombine] Recognize more rotation patterns (#78107)
InstCombine already handles the pattern `(shl ShVal, (X & (Width - 1)))
| (lshr ShVal, ((-X) & (Width - 1)))`. Under certain circumstances, `X &
(Width - 1)` will be simplified to `X`. Therefore, this patch adds
support for the pattern `(shl ShVal, X) | (lshr ShVal, ((-X) & (Width -
1)))`.

Alive2: https://alive2.llvm.org/ce/z/P7JQ2V
2024-01-18 20:29:53 +08:00
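The rotation pattern from the commit above can be written out in C++ for `Width = 32`. This is a sketch for illustration, not InstCombine code; `rotlPattern` and `rotlReference` are hypothetical names, and the unmasked left shift is only valid under the stated assumption that the shift amount is already known to be below the width.

```cpp
#include <cstdint>

// Reference rotate-left by repeated single-bit rotation.
inline uint32_t rotlReference(uint32_t v, unsigned x) {
  for (unsigned i = 0; i < x % 32; ++i) v = (v << 1) | (v >> 31);
  return v;
}

// The pattern from the commit message with Width = 32:
// (shl ShVal, X) | (lshr ShVal, ((-X) & (Width - 1))).
// Assumption: the caller guarantees x < 32, which is what lets the mask on the
// left shift amount be dropped; shifting by 32 or more is undefined in C++.
inline uint32_t rotlPattern(uint32_t v, uint32_t x) {
  return (v << x) | (v >> ((0u - x) & 31u));
}
```

For `x == 0` the right shift amount is `(0 - 0) & 31 == 0`, so the expression reduces to `v | v`, which is why the pattern is still a correct rotate at the boundary.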
Congcong Cai
64e94438a4
[InstCombine] combine mul(abs(x),abs(y)) to abs(mul(x,y)) (#78395)
Fixes: https://github.com/llvm/llvm-project/issues/78076
Alive2 Proof: https://alive2.llvm.org/ce/z/XEDy0f
2024-01-18 20:12:00 +08:00
Paschalis Mpeis
37c87d5689
[LV][AArch64] LoopVectorizer allows scalable frem instructions (#76247)
LoopVectorizer is aware when a target can replace a scalable frem
instruction with a vector library call for a given VF and it returns the
relevant cost. Otherwise, it returns an invalid cost (as previously).

Add tests that check costs on AArch64, when there is no vector library
available and when there is (with and without tail-folding).

NOTE: Invoking CostModel directly (not through LV) would still return
invalid costs.
2024-01-18 08:32:53 +00:00