This relands commit f890f010f6a70addbd885acd0c8d1b9578b6246f.
The result value of `getelementptr inbounds (TY, null, not zero)` is a poison value.
We can think of it as undefined behavior.
Before trying to erase the extractelement instruction, it is not enough
to check for a single use; we also need to check that it is not used in
several nodes because of the preliminary nodes reordering.
We're using this flag (IsNewDbgInfoFormat) to detect the boundaries in
LLVM of what's treating debug-info as intrinsics (i.e. dbg.value), and
what's using DPValue objects (the non-intrinsic replacement). The
attributor tends to create new wrapper functions and doesn't insert them
into Modules in the usual way, thus we have to manually update that flag
to signal what debug-info mode it's using.
I've added some --try-experimental-debuginfo-iterators RUN lines to
tests that would otherwise crash because of this, so that they're
exercised by our new-debuginfo-iterators buildbot.
NB: there's an attributor test with a dbg.value in it, however
attributes re-order themselves in RemoveDIs mode for various reasons, so
we're going to address that in a different patch.
The result of umin may be poison, and in that case the added constraints
are not valid in contexts where poison doesn't cause UB. Only queue
facts for min/max intrinsics if the result is guaranteed to not be
poison.
This could be improved in the future, by only adding the fact when
solving conditions using the result value.
Fixes https://github.com/llvm/llvm-project/issues/78621.
This patch canonicalizes getelementptr instructions with constant
indices to use the `i8` source element type. This makes it easier for
optimizations to recognize that two GEPs are identical, because they
don't need to see past many different ways to express the same offset.
This is a first step towards
https://discourse.llvm.org/t/rfc-replacing-getelementptr-with-ptradd/68699.
This is limited to constant GEPs only for now, as they have a clear
canonical form, while we're not yet sure how exactly to deal with
variable indices.
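As a rough illustration of why a single `i8` form helps, here is a minimal standalone sketch (not LLVM code) that folds constant indices into one byte offset. `IndexStep` and `constantByteOffset` are hypothetical names, and the byte strides are supplied by hand rather than queried from a DataLayout.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical model of one constant GEP index: the index value and the
// byte stride of the type it steps over (assumed, not taken from LLVM).
struct IndexStep {
  int64_t Index;
  int64_t StrideInBytes;
};

// Fold all constant indices into a single byte offset, i.e. the operand
// of a `getelementptr i8, ptr %p, i64 <offset>` form.
int64_t constantByteOffset(const std::vector<IndexStep> &Steps) {
  int64_t Offset = 0;
  for (const IndexStep &S : Steps)
    Offset += S.Index * S.StrideInBytes;
  return Offset;
}
```

Assuming a 4-byte `i32`, `getelementptr [4 x i32], ptr %p, i64 0, i64 2` is modeled as `{{0, 16}, {2, 4}}` and `getelementptr i32, ptr %p, i64 2` as `{{2, 4}}`; both fold to offset 8, so they would share one canonical spelling.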
The test llvm/test/Transforms/PhaseOrdering/switch_with_geps.ll gives
two representative examples of the kind of optimization improvement we
expect from this change. In the first test SimplifyCFG can now realize
that all switch branches are actually the same. In the second test it
can convert it into simple arithmetic. These are representative of
common optimization failures we see in Rust.
Fixes https://github.com/llvm/llvm-project/issues/69841.
LAA currently adds memory locations with their original AATags to AST.
However, scoped alias AATags may be valid only within one loop
iteration, while LAA reasons across iterations.
Fix this by determining which alias scopes are defined inside the loop,
and drop AATags that reference these scopes.
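The filtering idea can be sketched in standalone C++ as follows. This is only a model: `ScopedTags`, `referencesAny` and `tagsForCrossIterationQueries` are hypothetical names, scopes are represented as plain strings, and the sketch drops both lists whenever either one references a loop-defined scope, which may be coarser than what the patch does.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for the scoped-alias part of AATags.
struct ScopedTags {
  std::vector<std::string> Scope;   // models !alias.scope
  std::vector<std::string> NoAlias; // models !noalias
};

// True if any listed scope is one of the scopes defined inside the loop.
bool referencesAny(const std::vector<std::string> &Names,
                   const std::set<std::string> &LoopScopes) {
  for (const std::string &N : Names)
    if (LoopScopes.count(N))
      return true;
  return false;
}

// Keep the tags only if they never mention a loop-defined scope;
// otherwise return empty tags, since such scopes are only meaningful
// within a single iteration.
ScopedTags tagsForCrossIterationQueries(const ScopedTags &Tags,
                                        const std::set<std::string> &LoopScopes) {
  if (referencesAny(Tags.Scope, LoopScopes) ||
      referencesAny(Tags.NoAlias, LoopScopes))
    return ScopedTags{};
  return Tags;
}
```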
Fixes https://github.com/llvm/llvm-project/issues/79137.
This is a followup to #76819. After those changes, we can still run into
an assertion failure for a slight variation of the test case: When
fixing up MemoryPhis, we map the incoming access to the access of the
cloned instruction -- which may now no longer exist.
Fix this by reusing the getNewDefiningAccessForClone() helper, which
will look upwards for a new defining access in that case.
f9c2a341b9
causes regressions when we have a slice with integer vector type that is
the same size as the partition, and a ptr load/store slice that is not
the size of the element type.
Ref `vector-promotion.ll:ptrLoadStoreTys`.
Before the patch, we would only consider `<4 x i32>` as a candidate type
for vector promotion, and would find that it is a viable type for all
the slices.
After the patch, we now add `<2 x ptr>` as a candidate type due to slice
with user `store ptr %val0, ptr %obj, align 8` -- and flag that we
`HaveVecPtrTy`. The pre-existing behavior of this flag results in
removing the viable `<4 x i32>` and keeping only the unviable `<2 x
ptr>`, which results in a failure to promote.
The end result is failing to promote an alloca that was previously
promoted -- this does not appear to be the intent of that patch, which
has the goal of increasing promotions by providing more promotion
opportunities.
This PR preserves the previous promotion behavior via a simple
reorganization of the implementation: first try the slice types with the
same size as the partition, then, if there is no promotable type, try
the `LoadStoreTys`.
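The two-phase ordering described above can be sketched as follows. This is a simplified standalone model, not the SROA implementation: types are plain strings, and `pickPromotionType` and the `IsViable` predicate are hypothetical.

```cpp
#include <cassert>
#include <functional>
#include <optional>
#include <string>
#include <vector>

// Try partition-sized slice types first; only if none is viable fall
// back to the types collected from loads and stores.
std::optional<std::string> pickPromotionType(
    const std::vector<std::string> &SameSizeTys,
    const std::vector<std::string> &LoadStoreTys,
    const std::function<bool(const std::string &)> &IsViable) {
  for (const std::string &Ty : SameSizeTys)
    if (IsViable(Ty))
      return Ty;
  for (const std::string &Ty : LoadStoreTys)
    if (IsViable(Ty))
      return Ty;
  return std::nullopt;
}
```

In the `ptrLoadStoreTys` situation, `<4 x i32>` sits in the first list and is viable, so it is chosen before the unviable `<2 x ptr>` is ever considered.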
Here's a raft of minor fixes for the RemoveDIs project that's replacing
dbg.value intrinsics with DPValue objects, all IMO trivial:
* When inserting functions or blocks and calling setIsNewDbgInfoFormat,
do that after setting the Parent pointer, just in case conversion from
(or to) dbg.value mode is triggered.
* When transferring DPValues from an empty range in a splice call, don't
transfer if there are no DPValues attached to the source block at all.
* stripNonLineTableDebugInfo should drop DPValues.
* In insertBefore, don't try to transfer DPValues if there aren't any.
Currently, the UnifiedLTO pipeline seems to have trouble with several
LTO features, like SplitLTO units, which means we cannot use important
optimizations like Whole Program Devirtualization or security hardening
instrumentation like CFI.
This patch reverts FatLTO to using distinct pipelines for Full LTO and
ThinLTO. It still avoids module cloning, since that was error prone.
This patch replaces a utility in the outliner that moves the contents of
one basic block into another basic block, with a call to splice instead.
I think it's NFC, however I'd like a second pair of eyes to look at it
just in case.
The reason for doing this is an edge case in the handling of DPValue
objects, the replacement for dbg.values. If there's a variable
assignment "dangling" at the end of a block (which happens when we
delete the terminator), inserting instructions at end() doesn't shift
the DPValue up into the block. We could probably fix this; but it's much
easier to use splice at the only call site that does this.
Patch adds --try-experimental-debuginfo-iterators to a test to exercise
this code path.
This patch trivially updates various opt passes to handle DPVAssigns. In
all cases, this means some combination of generifying existing code to
handle DPValues and DbgAssignIntrinsics, iterating over DPValues where
previously we did not, or duplicating code for DbgAssignIntrinsics to
the equivalent DPValue function (in inlining and salvageDebugInfo).
With enough codegen complete, we can now correctly report the size of
vector registers for LSX/LASX, allowing auto vectorization (the
`auto-vec` feature needs to be enabled simultaneously).
As described, the `auto-vec` feature is an experimental one. It exists to
ensure that automatic vectorization is not enabled by default, because
the information provided by the current `TTI` cannot yield additional
benefits for automatic vectorization.
This patch trivially extends support for DbgValueInst recovery to
DPValues in LoopStrengthReduce; they are handled identically, so this is
mostly done by reusing the DbgValueInst code (using templates or
auto-parameter lambdas to reduce actual code duplication).
Fixes a problem where the explicit marking of various instructions as
conflicts did not propagate to their users. An example of this:
```
%getelementptr = getelementptr i8, <2 x ptr addrspace(1)> zeroinitializer, <2 x i64> <i64 888, i64 908>
%shufflevector = shufflevector <2 x ptr addrspace(1)> %getelementptr, <2 x ptr addrspace(1)> zeroinitializer, <4 x i32> <i32 0, i32 1, i32 2, i32 3>
%shufflevector1 = shufflevector <2 x ptr addrspace(1)> %getelementptr, <2 x ptr addrspace(1)> zeroinitializer, <4 x i32> <i32 0, i32 1, i32 2, i32 3>
%select = select i1 false, <4 x ptr addrspace(1)> %shufflevector1, <4 x ptr addrspace(1)> %shufflevector
```
Here the vector shuffles will each get a single base (the gep) during the
fixpoint, and therefore the select will get a known base (the gep). We
later mark the shuffles as conflicts, but this does not change the base
of the select. This gets caught by an assert later on, because the
select's type differs from that of its (wrong) base.
The solution in the MR is to move the explicit conflict marking into the
fixpoint phase.
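A toy model of the difference is sketched below. It is not the pass's actual data structure: `State`, `Node`, `solve` and `exampleGraph` are hypothetical, and the lattice is reduced to Unknown, a single base id, or Conflict. The point is that a conflict forced inside the fixpoint loop is seen by users when they recompute their own state.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical lattice: Unknown < Base(id) < Conflict.
struct State {
  enum Kind { Unknown, Base, Conflict };
  Kind K = Unknown;
  int BaseId = -1;
};

State makeBase(int Id) { State S; S.K = State::Base; S.BaseId = Id; return S; }
State makeConflict() { State S; S.K = State::Conflict; return S; }

State meet(const State &A, const State &B) {
  if (A.K == State::Unknown) return B;
  if (B.K == State::Unknown) return A;
  if (A.K == State::Base && B.K == State::Base && A.BaseId == B.BaseId)
    return A;
  return makeConflict();
}

struct Node {
  std::vector<int> Operands; // empty means the node is its own base
  bool ForceConflict;        // explicit conflict marking
};

// Iterate to a fixpoint; forced conflicts are applied inside the loop,
// so they flow into every user.
std::vector<State> solve(const std::vector<Node> &Nodes) {
  std::vector<State> States(Nodes.size());
  bool Changed = true;
  while (Changed) {
    Changed = false;
    for (std::size_t I = 0; I < Nodes.size(); ++I) {
      State New;
      if (Nodes[I].ForceConflict)
        New = makeConflict();
      else if (Nodes[I].Operands.empty())
        New = makeBase(static_cast<int>(I));
      else
        for (int Op : Nodes[I].Operands)
          New = meet(New, States[Op]);
      if (New.K != States[I].K || New.BaseId != States[I].BaseId) {
        States[I] = New;
        Changed = true;
      }
    }
  }
  return States;
}

// Mirrors the IR above: 0 = gep, 1 and 2 = shuffles, 3 = select.
std::vector<Node> exampleGraph(bool MarkShuffles) {
  return {Node{{}, false}, Node{{0}, MarkShuffles},
          Node{{0}, MarkShuffles}, Node{{2, 1}, false}};
}
```

With `exampleGraph(false)` the select ends up with the gep as its base, which is the state that a later, out-of-loop marking of the shuffles would leave stale; with `exampleGraph(true)` the select becomes a conflict as well.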
---------
Co-authored-by: Petr Maj <pmaj@azul.com>
This commit modifies `LoopDeletion::deleteLoopIfDead` to check if the
exit block of a loop is an EH pad before checking if the loop gets
executed. This handles the case where an unreachable loop has a
landingpad as an exit block, and the loop gets deleted, leaving
the landingpad without an edge from an unwind clause.
Fixes #76852.
This changes the AliasSetTracker to track memory locations instead of
pointers in its alias sets. The motivation for this is outlined in an RFC
posted on LLVM discourse:
https://discourse.llvm.org/t/rfc-dont-merge-memory-locations-in-aliassettracker/73336
In the data structures of the AST implementation, I made the choice to
replace the linked list of `PointerRec` entries (that had to go anyway)
with a simple flat vector of `MemoryLocation` objects, but for the
`AliasSet` objects referenced from a lookup table, I retained the
mechanism of a linked list, reference counting, forwarding, etc. The
data structures could be revised in a follow-up change.
I found another case where in the end block we could have a PHI that we
deal with incorrectly. The two incoming values are unique - one of them
is the induction variable and another one is a value defined outside the
loop, e.g.
%final_val = phi i32 [ %inc, %while.body ], [ %d, %while.cond ]
We won't correctly select between the two values in the new end block
that we create and so we will get the wrong result.