161 Commits

Author SHA1 Message Date
Csanád Hajdú
e53c46a908
[Statepoint] Treat result of atomicrmw xchg as a base pointer (#97280)
Atomic RMW Xchg wasn't handled before when searching for known base
pointers in the IR.
2024-11-08 10:23:27 -08:00
Paul Walker
38fffa630e
[LLVM][IR] Use splat syntax when printing Constant[Data]Vector. (#112548) 2024-11-06 11:53:33 +00:00
Petr Maj
3c246efd04
True fixpoint algorithm in RS4GC (#75826)
Fixes a problem where the explicit marking of various instructions as
conflicts did not propagate to their users. An example of this:

```
%getelementptr = getelementptr i8, <2 x ptr addrspace(1)> zeroinitializer, <2 x i64> <i64 888, i64 908>
%shufflevector = shufflevector <2 x ptr addrspace(1)> %getelementptr, <2 x ptr addrspace(1)> zeroinitializer, <4 x i32> <i32 0, i32 1, i32 2, i32 3>
%shufflevector1 = shufflevector <2 x ptr addrspace(1)> %getelementptr, <2 x ptr addrspace(1)> zeroinitializer, <4 x i32> <i32 0, i32 1, i32 2, i32 3>
%select = select i1 false, <4 x ptr addrspace(1)> %shufflevector1, <4 x ptr addrspace(1)> %shufflevector
```

Here the vector shuffles get a single base (the gep) during the fixpoint,
so the select gets a known base (the gep). We later mark the shuffles as
conflicts, but this does not change the base of the select. An assert
catches this later on, when the select's type differs from that of its
(wrong) base.

The solution in the MR is to move the explicit conflict marking into the
fixpoint phase.
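
The difference between marking conflicts after the fixpoint and during it can be sketched with a toy lattice in Python. This is a minimal model, not the pass's implementation: the `solve` function, the three-state lattice, and the node names are assumptions made for illustration, and the gep is simply treated as a known base.

```python
# Toy lattice: UNKNOWN < ("base", v) < CONFLICT. All names here are hypothetical.
UNKNOWN = ("unknown", None)
CONFLICT = ("conflict", None)


def base(v):
    return ("base", v)


def meet(a, b):
    """Combine two states; differing bases collapse to CONFLICT."""
    if a == UNKNOWN:
        return b
    if b == UNKNOWN:
        return a
    if a == CONFLICT or b == CONFLICT:
        return CONFLICT
    return a if a == b else CONFLICT


def solve(operands, known_bases, forced_conflicts, mark_inside_fixpoint):
    """Iterate to a fixed point, marking forced conflicts early or late."""
    states = {node: UNKNOWN for node in operands}

    def state_of(v):
        return base(v) if v in known_bases else states[v]

    changed = True
    while changed:
        changed = False
        for node, ops in operands.items():
            if mark_inside_fixpoint and node in forced_conflicts:
                new = CONFLICT
            else:
                new = UNKNOWN
                for op in ops:
                    new = meet(new, state_of(op))
            if new != states[node]:
                states[node] = new
                changed = True

    if not mark_inside_fixpoint:
        # Late patching: users of these nodes never see the conflict.
        for node in forced_conflicts:
            states[node] = CONFLICT
    return states


# Mirrors the shape of the IR example above: two shuffles feeding a select.
operands = {
    "shufflevector": ["getelementptr"],
    "shufflevector1": ["getelementptr"],
    "select": ["shufflevector1", "shufflevector"],
}
known = {"getelementptr"}
forced = {"shufflevector", "shufflevector1"}

late = solve(operands, known, forced, mark_inside_fixpoint=False)
early = solve(operands, known, forced, mark_inside_fixpoint=True)
```

With late marking, `late["select"]` still names the gep as its base even though both inputs are conflicts; with marking inside the loop, `early["select"]` becomes a conflict as well.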

---------

Co-authored-by: Petr Maj <pmaj@azul.com>
2024-01-22 09:10:04 -05:00
Petr Maj
a1358225c5
Improvements to RS4GC BDV Algorithm (#69795)
Previously, after the algorithm fixpointed, the state was manually
patched by emitting BDVs for EE instructions earlier, while marking some
(but not all) vector and vector<->scalar instructions as conflict. This
causes issues as not all instructions that required BDVs had them
emitted and due to after-fixpoint patching, the extra BDVs did not
propagate to their users.

This change fixes both by rewriting the logic for BDV insertion &
patching. Instead of inserting the BDV for EE earlier, it merely marks
every EE instruction as a conflict. The two phase insertion algorithm
(first insert empty instructions and patch the BDVState, then actually
connect the BDV instructions to their input bases) then ensures correct
propagation to all its users. Furthermore the shufflevector instruction
as well as all instances of IE instruction are conservatively marked as
conflicts as well, fixing the second problem.

This change does not fix the handling of constant values and vectors in
the BDV. 

---------

Co-authored-by: Petr Maj <pmaj@azul.com>
2023-11-02 20:19:40 -04:00
Markus Böck
e6e62efa88
[RS4GC] Copy argument attributes from call to statepoint (#68475)
The current implementation ignores argument attributes on calls,
discarding them entirely when creating a statepoint from a call
instruction. This is problematic in some scenarios because argument
attributes affect the ABI of the call, leading to undefined behavior if
it is called with the wrong ABI attributes. Annotating the function
declaration with the right parameter attributes does not solve this
either, because the call might be indirect, which requires the
attributes to be present on the arguments.

This PR simply copies all parameter attributes over from the original
call to the created statepoint.
Note that some argument attributes become invalid after the lowering as
they imply memory effects that no longer hold with the statepoints.
These do not need to be explicitly handled in this PR as they are
removed by the `stripNonValidDataFromBody`.
2023-10-16 23:23:45 +02:00
Nuno Lopes
1844d64818 [RewriteStatepointsForGC] Use poison instead of undef as placeholder [NFC]
This is used in shufflevectors where the placeholder arg is unused.
It's also used when deleting invariant_start
2023-07-22 22:25:56 +01:00
Nikita Popov
6f7c9d1e17 [RewriteStatepointsForGC] Convert tests to opaque pointers (NFC) 2023-06-21 12:48:08 +02:00
Tobias Hieta
f84bac329b
[NFC][Py Reformat] Reformat lit.local.cfg python files in llvm
This is a follow-up to b71edfaa4ec3c998aadb35255ce2f60bba2940b0
since I forgot the lit.local.cfg files in that one.

Reformatting is done with `black`.

If you end up having problems merging this commit because you
have made changes to a python file, the best way to handle that
is to run git checkout --ours <yourfile> and then reformat it
with black.

If you run into any problems, post to discourse about it and
we will try to help.

RFC Thread below:

https://discourse.llvm.org/t/rfc-document-and-standardize-python-code-style

Reviewed By: barannikov88, kwk

Differential Revision: https://reviews.llvm.org/D150762
2023-05-17 17:03:15 +02:00
Nikita Popov
9ed2f14c87 [AsmParser] Remove typed pointer auto-detection
IR is now always parsed in opaque pointer mode, unless
-opaque-pointers=0 is explicitly given. There is no automatic
detection of typed pointers anymore.

The -opaque-pointers=0 option is added to any remaining IR tests
that haven't been migrated yet.

Differential Revision: https://reviews.llvm.org/D141912
2023-01-18 09:58:32 +01:00
Paul Walker
eae26b6640 [IRBuilder] Use canonical i64 type for insertelement index used by vector splats.
Instcombine prefers this canonical form (see getPreferredVectorIndex),
as does IRBuilder when passing the index as an integer, so we may as
well use the preferred form from creation.

NOTE: All test changes are mechanical with nothing else expected
beyond a change of index type from i32 to i64.

Differential Revision: https://reviews.llvm.org/D140983
2023-01-11 14:08:06 +00:00
Nikita Popov
20fa198687 [RewriteStatepointsForGC] Avoid branch on undef UB in tests (NFC) 2023-01-03 14:31:33 +01:00
Nikita Popov
f01a3a893c [RewriteStatepointsForGC] Convert some tests to opaque pointers (NFC) 2023-01-03 14:27:26 +01:00
Denis Antrushin
86ed0daae7 [RS4GC] Rematerialize derived pointers before uses.
Introduce an option to rematerialize derived pointers immediately
before their uses instead of after every statepoint. This can be
beneficial when a pointer is live across many statepoints but has
few uses.
The initial implementation is simple: it rematerializes the derived
pointer before every use, even if there are several uses in the same
block or the rematerialization instructions could be hoisted.
The transformation is considered profitable if it would insert fewer
instructions than rematerializing after every live statepoint would.
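
The profitability rule described above can be written down as a small Python sketch. The function name, its parameters, and the simple "chain length times number of sites" cost are assumptions for illustration; the pass's actual accounting may differ.

```python
def remat_at_uses_is_profitable(chain_len, num_uses, num_live_statepoints):
    """Hypothetical cost model: compare instruction counts for both strategies.

    chain_len: instructions needed to recompute the derived pointer once.
    num_uses: uses of the derived pointer (one rematerialization per use).
    num_live_statepoints: statepoints the pointer is live across.
    """
    cost_at_uses = chain_len * num_uses
    cost_after_statepoints = chain_len * num_live_statepoints
    # Profitable only when strictly fewer instructions would be inserted.
    return cost_at_uses < cost_after_statepoints


many_statepoints_few_uses = remat_at_uses_is_profitable(2, 1, 5)
few_statepoints_many_uses = remat_at_uses_is_profitable(2, 5, 1)
```

A pointer that is live across five statepoints but used once favors rematerializing at the use; the reverse case does not.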

Depends on D138910, D138911

Reviewed By: anna, skatkov

Differential Revision: https://reviews.llvm.org/D138912
2022-12-27 17:08:57 +03:00
Denis Antrushin
0dfe53b614 [RS4GC] Add few tests for derived pointer rematerialization. NFC.
Precommit a few tests for the upcoming 'rematerialize derived pointers
at uses' feature.

Reviewed By: skatkov

Differential Revision: https://reviews.llvm.org/D138911
2022-12-13 13:43:10 +03:00
Bjorn Pettersson
3528e63d89 [test] Remove duplicate RUN lines in Transform tests 2022-12-08 11:47:16 +01:00
Roman Lebedev
dcd5f6f2fd
[NFC] Port all RewriteStatepointsForGC tests to -passes= syntax 2022-12-07 22:22:08 +03:00
Matt Arsenault
a74c5707be Fix some test files with executable permissions 2022-12-02 17:12:03 -05:00
Nikita Popov
3ddf56fd37 [Statepoint] Use default attributes for some GC intrinsics
This adds the default intrinsic attributes (nosync, nofree, nocallback,
willreturn) to the gc.result, gc.relocate, gc.pointer.base and
gc.pointer.offset intrinsics. As far as I understand, all of these
are supposed to be pure. Some quotes from LangRef:

> A gc.result is modeled as a ‘readnone’ pure function. It has no
> side effects since it is just a projection of the return value of
> the previous call represented by the gc.statepoint.

> A gc.relocate is modeled as a readnone pure function. It has no
> side effects since it is just a way to extract information about
> work done during the actual call modeled by the gc.statepoint.

Having willreturn in particular will be important to avoid
optimization regressions in the future.

Differential Revision: https://reviews.llvm.org/D136929
2022-11-08 09:27:22 +01:00
Nikita Popov
304f1d59ca [IR] Switch everything to use memory attribute
This switches everything to use the memory attribute proposed in
https://discourse.llvm.org/t/rfc-unify-memory-effect-attributes/65579.
The old argmemonly, inaccessiblememonly and inaccessiblemem_or_argmemonly
attributes are dropped. The readnone, readonly and writeonly attributes
are restricted to parameters only.

The old attributes are auto-upgraded both in bitcode and IR.
The bitcode upgrade is a policy requirement that has to be retained
indefinitely. The IR upgrade is mainly there so it's not necessary
to update all tests using memory attributes in this patch, which
is already large enough. We could drop that part after migrating
tests, or retain it longer term, to make it easier to import IR
from older LLVM versions.

High-level Function/CallBase APIs like doesNotAccessMemory() or
setDoesNotAccessMemory() are mapped transparently to the memory
attribute. Code that directly manipulates attributes (e.g. via
AttributeList) on the other hand needs to switch to working with
the memory attribute instead.

Differential Revision: https://reviews.llvm.org/D135780
2022-11-04 10:21:38 +01:00
Nikita Popov
d41ecfab92 [X86] Use default attributes for intrinsics
This adds the default attributes (nocallback, nosync, nofree,
willreturn) to some X86 intrinsics. This will be needed to avoid
optimization regressions in the future (once we remove the
readonly -> willreturn implication for intrinsics).

Due to the number of intrinsics, this patch focuses just on the
IntrNoMem intrinsics up to the AVX2 section.

Differential Revision: https://reviews.llvm.org/D136939
2022-10-31 09:11:54 +01:00
Danila Malyutin
451497a030 [RS4GC] Handle vectors of pointers in non-live clobbering
Fix crash when trying to unconditionally cast alloca type to PointerType

Differential Revision: https://reviews.llvm.org/D131146
2022-08-16 17:47:30 +03:00
Max Kazantsev
a40af8589e [RS4GC] Handle special cases in unreachable code for memcpy/memmov
The existing code doesn't expect dummy values (undef, poison, null-derived
constants etc.) as arguments of these intrinsics. However, they can appear
in unreachable code. Currently we fail when trying to find a base for them.

Handle these cases separately. Return null as base for them to be consistent
with the handling in the main algorithm in findBaseDefiningValue.

Differential Revision: https://reviews.llvm.org/D129561
Reviewed By: apilipenko
2022-07-22 11:30:43 +07:00
Serguei Katkov
5e1ccdf960 [RS4GC] Handle freeze case for vector
Finding the BDV for a vector value does not handle the freeze instruction.
Add handling for it, as is already done for the scalar case.

Reviewed By: apilipenko
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D128254
2022-06-23 11:58:41 +07:00
Nikita Popov
41d5033eb1 [IR] Enable opaque pointers by default
This enables opaque pointers by default in LLVM. The effect of this
is twofold:

* If IR that contains *neither* explicit ptr nor %T* types is passed
  to tools, we will now use opaque pointer mode, unless
  -opaque-pointers=0 has been explicitly passed.
* Users of LLVM as a library will now default to opaque pointers.
  It is possible to opt-out by calling setOpaquePointers(false) on
  LLVMContext.

A cmake option to toggle this default will not be provided. Frontends
or other tools that want to (temporarily) keep using typed pointers
should disable opaque pointers via LLVMContext.

Differential Revision: https://reviews.llvm.org/D126689
2022-06-02 09:40:56 +02:00
Max Kazantsev
5a08e81779 [RS4GC] Add support for 'freeze' instruction to findBaseDefiningValue
Because this instruction is a noop, we can simply go through it in
search of the base.
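
As a rough illustration of "going through" a no-op instruction, here is a minimal Python sketch over a toy def table. The `defs` mapping, the value names, and the choice of pass-through opcodes are assumptions for this example only, not the data structures used by findBaseDefiningValue.

```python
def find_base_defining_value(value, defs):
    """Walk back through pass-through instructions to the defining value.

    defs maps a value name to (opcode, pointer_operand). This is a toy
    stand-in for IR, not LLVM's representation.
    """
    pass_through = ("freeze", "getelementptr")
    seen = set()
    while value in defs and value not in seen:
        seen.add(value)
        opcode, operand = defs[value]
        if opcode not in pass_through:
            break  # e.g. a load defines its own base
        value = operand
    return value


defs = {
    "p.fr": ("freeze", "p.gep"),
    "p.gep": ("getelementptr", "obj"),
    "obj": ("load", "slot"),
}
bdv = find_base_defining_value("p.fr", defs)
```

Here the freeze and the gep are both looked through, so the search for `p.fr` ends at `obj`.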
2022-05-06 20:46:29 +07:00
Dmitry Makogon
e9b4f2256a [RS4GC] Add tests showing cases in which we could find a better base (NFC) 2022-04-28 17:22:11 +07:00
Dmitry Makogon
d03d2d8aea [RS4GC] Prune inputs of BDV if they are BDV themselves
Don't check whether an input of BDV can be pruned if the input
is the BDV itself. BDV is present in the states map, so in case
the input is the BDV itself, we'd return false. So explicitly check this case.

Differential Revision: https://reviews.llvm.org/D123846
2022-04-26 16:05:00 +07:00
Dmitry Makogon
084ad1ebee [Test] Add more tests showing duplicate PHIs generated by RS4GC (NFC)
This adds more tests with derived pointers.
2022-04-19 23:05:50 +07:00
Dmitry Makogon
6f8feeb342 [Test] Add more tests showing duplicate PHIs generated by RS4GC (NFC) 2022-04-18 17:39:06 +07:00
Dmitry Makogon
2603dcdd8d [Test] Add tests showing duplicate PHIs generated by RS4GC (NFC) 2022-04-13 15:51:17 +07:00
Daniil Suchkov
7c3e2b92cf [RewriteStatepointsForGC] Fix an incorrect assertion
The assertion verifying that a newly computed value matches what is
already cached used stripPointerCasts() to strip bitcasts, however the
values can be not only pointers, but also vectors of pointers. That is
problematic because stripPointerCasts() doesn't handle vectors of
pointers. This patch introduces an ad-hoc utility function to strip all
bitcasts regardless of the value type.

Reviewed By: skatkov, reames

Differential Revision: https://reviews.llvm.org/D119994
2022-02-17 18:44:57 +00:00
Daniil Suchkov
a99989529e [RewriteStatepointsForGC] Add a test exposing an incorrect assertion 2022-02-17 00:22:46 +00:00
Nikita Popov
46f9e45ef0 [Statepoint] Update gc.statepoint calls in tests with elementtype (NFC)
This updates tests for the LangRef change in D117890.
2022-02-04 14:15:41 +01:00
Nikita Popov
9f30afffaa [RS4GC] Restore DAG check line (NFC)
It's fishy that this is needed, but this is what the test did
previously. Should hopefully address buildbot failures.
2022-02-04 10:26:15 +01:00
Nikita Popov
c680eeab30 [IRBuilder][RS4GC] Require FunctionCallee when creating statepoint
This makes the statepoint methods in IRBuilder accept a
FunctionCallee, which carries both the callee and function type.
This is used to add the elementtype attribute to the statepoint call.

RS4GC requires an additional tweak to actually preserve that attribute
-- previously the attributes on the call were completely overwritten.

Differential Revision: https://reviews.llvm.org/D118886
2022-02-04 09:47:32 +01:00
Nikita Popov
2071f7f252 [RS4GC] Regenerate test checks (NFC) 2022-02-03 12:29:44 +01:00
Zarko Todorovski
9769e97c35 [LLVM] Inclusive terms: remove/replace references to sanity in RewriteStatepointsForGC.cpp and test
Part of work to have the LLVM backend to use more inclusive terms.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D112461
2021-10-25 16:17:41 -04:00
Nikita Popov
80110aafa0 [Tests] Fix incorrect noalias metadata
Mostly this fixes cases where !noalias or !alias.scope were passed
a scope rather than a scope list. In some cases I opted to drop
the metadata entirely instead, because it is not really relevant
to the test.
2021-09-18 20:51:00 +02:00
Yevgeny Rouban
88024a724c [RS4GC] Use one DVCache for both inlineGetBaseAndOffset() and insertParsePoints()
This new test demonstrates a case where a base ptr is generated
twice for the same value: the first one is generated while
the gc.get.pointer.base() is inlined, the second is generated
for the statepoint. This happens because the methods
inlineGetBaseAndOffset() and insertParsePoints() do not share
their defining value cache used by the findBasePointer() method.
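
The duplication described here, and the effect of sharing one cache, can be sketched in Python. The helper names and the string-based "base" values are hypothetical; the comments only indicate which step of the pass each call stands in for.

```python
def find_base_pointer(value, cache, emitted):
    """Return the base for value, materializing it only on a cache miss."""
    if value not in cache:
        new_base = value + ".base"
        emitted.append(new_base)  # stands in for inserting new IR
        cache[value] = new_base
    return cache[value]


def lower(value, share_cache):
    """Run both steps, with either separate caches or one shared cache."""
    emitted = []
    inline_cache = {}
    statepoint_cache = inline_cache if share_cache else {}
    find_base_pointer(value, inline_cache, emitted)      # inlineGetBaseAndOffset() step
    find_base_pointer(value, statepoint_cache, emitted)  # insertParsePoints() step
    return emitted


separate = lower("%p", share_cache=False)
shared = lower("%p", share_cache=True)
```

With separate caches the base is emitted twice for the same value; with a shared cache the second lookup is a hit and only one base is emitted.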

Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D103240
2021-07-12 18:13:00 +07:00
Yevgeny Rouban
a95c336b5e [RS4GC] Add a test to demonstrate duplication of base generation. NFC
This new test demonstrates a case where a base ptr is generated
twice for the same value: the first one is generated while
the gc.get.pointer.base() is inlined, the second is generated
for the statepoint. This happens because the methods
inlineGetBaseAndOffset() and insertParsePoints() do not share
their defining value cache used by the findBasePointer() method.

Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D103238
2021-07-12 18:13:00 +07:00
Philip Reames
ac81cb7e6d Allow ptrtoint/inttoptr of non-integral pointer types in IR
I don't like landing this change, but it's an acknowledgement of a practical reality.  Despite not having well specified semantics for inttoptr and ptrtoint involving non-integral pointer types, they are used in practice.  Here's a quick summary of the current pragmatic reality:
* I happen to know that the main external user of non-integral pointers has effectively disabled the verifier rules.
* RS4GC (the lowering pass for the abstract GC machine model, which is the key motivation for non-integral pointers) even supports them.  We just have all the tests using an integral pointer space to let the verifier run.
* Certain idioms (such as alignment checks for alignment N, where any relocation is guaranteed to be N byte aligned) are fine in practice.
* As implemented, inttoptr/ptrtoint are CSEd and are not control dependent.  This means that any code which is intending to check a particular bit pattern at site of use must be wrapped in an intrinsic or external function call.

This change allows them in the Verifier, and updates the LangRef to specify them as implementation dependent.  This allows us to acknowledge current reality while still leaving ourselves room to punt on figuring out "good" semantics until the future.
2021-06-11 13:38:32 -07:00
Philip Reames
c880d5e583 [RS4GC] Treat inttoptr as base pointer
This is a modified version of a patch by tolziplohu with a style change, and most importantly, a revised commit message.

inttoptr for a non-integral address space is currently ill defined in the LangRef.  Figuring out exactly what the dynamic semantics of such a cast would be is hard, and not yet settled.  Despite that, we still need to go ahead and implement something in RS4GC for a couple of reasons.

First, as a simple consistency argument.  We apparently added support for constexpr inttoptrs a while back, and even have tests which exercise them.  Having a lack of constant folding trigger a crash during lowering is non-ideal.

Second, and more fundamentally, the optimizer is allowed to insert undefined constructs in unreachable code.  At the same time, we can't assume that dynamically dead code is always pruned before lowering.  As a result, we must assume that inttoptrs can occur (even if completely ill defined) along dead paths.  We need the lowering to not crash.  The stackmaps produced can be garbage (as the assumption is the code is dynamically dead), but the lowering itself can't crash.

Differential Revision: https://reviews.llvm.org/D103492
2021-06-07 10:27:23 -07:00
Yevgeny Rouban
4d26f41f76 [RS4GC] Introduce intrinsics to get base ptr and offset
Some optimizations may need to get (base, offset) for any GC pointer.
The base can be calculated by generating the needed instructions, as is
done by the RewriteStatepointsForGC::findBasePointer() function. The
offset can be calculated in the same way. To avoid exposing the base
calculation, and to keep the offset calculation as simple as
ptrtoint(derived_ptr) - ptrtoint(base_ptr) (which is illegal
outside RS4GC), this patch introduces 2 intrinsics:

 @llvm.experimental.gc.get.pointer.base(%derived_ptr)
 @llvm.experimental.gc.get.pointer.offset(%derived_ptr)

These intrinsics are inlined by RS4GC along with generation of
statepoint sequences.
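
The offset arithmetic mentioned above is plain integer subtraction. The following Python sketch uses made-up addresses; the `relocate_derived` helper is a hypothetical illustration of why a (base, offset) pair is useful, not something this patch defines.

```python
def pointer_offset(derived_addr, base_addr):
    """offset = ptrtoint(derived_ptr) - ptrtoint(base_ptr)."""
    return derived_addr - base_addr


def relocate_derived(new_base_addr, offset):
    """Hypothetical use: rebuild a derived address from a moved base."""
    return new_base_addr + offset


offset = pointer_offset(0x1018, 0x1000)
moved = relocate_derived(0x2000, offset)
```

For a derived pointer 24 bytes into an object at `0x1000`, the offset is 24, and the same offset applies if the object is later found at `0x2000`.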

With these new intrinsics the GC parseable lowering for atomic
memcpy intrinsics (6ec2c5e402a724ba99bce82a9cac7a3006d660f4)
could be implemented as a separate pass.

Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D100445
2021-05-27 09:14:14 +07:00
Philip Reames
3f1c218318 [rs4gc] Strip memory related attributes consistently
I noticed that rs4gc is not stripping a number of memory aliasing related attributes. We do strip some from call sites, but don't strip the same ones from declarations or parameters.

Why do we need to strip these? Two answers:

* Safepoints conceptually read and write to the entire garbage collected heap in the physical model. We need this to preserve ordering of all loads and stores with respect to possible relocation.
* We can infer other attributes from these. For instance, readnone can imply both nofree and nosync, neither of which holds after physical rewriting.

Note: This exposed a latent issue which was fixed a couple weeks back in 01801d5274.

Differential Revision: https://reviews.llvm.org/D99802
2021-05-14 07:54:56 -07:00
Philip Reames
01801d5274 [rs4gc] Fix a latent bug around attribute stripping for intrinsics
This change fixes a latent bug which was exposed by a change currently in review (https://reviews.llvm.org/D99802#2685032).

The story on this is a bit involved.  Without this change, what ended up happening with the pending review was that we'd strip attributes off intrinsics, and then selectiondag would fail to lower the intrinsic.  Why?  Because the lowering of the intrinsic relies on the presence of the readonly attribute.  We don't have a matcher to select the case where there's a glue node needed.

Now, on the surface, this still seems like a codegen bug.  However, here it gets fun.  I was unable to reproduce this with a standalone test at all, and was pretty much stuck until skatkov provided the critical detail.  This reproduces only when RS4GC and codegen are run in the same process and context.  Why?  Because it turns out we can't roundtrip the stripped attribute through serialized IR!

We'll happily print out the missing attribute, but when we parse it back, the auto-upgrade logic has a side effect of blindly overwriting attributes on intrinsics with those specified in Intrinsics.td.  This makes it impossible to exercise SelectionDAG from a standalone test case.

At this point, I decided to treat this as an RS4GC bug as a) we don't need to strip in this case, and b) I could write a test which shows the correct behavior to ensure this doesn't break again in the future.

As an aside, I'd originally set out to handle libfuncs too - since in theory they might have the same issues - but backed away quickly when I realized how the semantics of builtin, nobuiltin, and no-builtin-x all interacted.  I'm utterly convinced that no part of the optimizer handles that correctly, and decided not to open that can of worms here.
2021-04-19 13:14:07 -07:00
Philip Reames
a505801e2b [rs4gc] Strip nofree and nosync attributes when lowering from abstract model
The safepoints being inserted exist to free memory, or to coordinate with another thread to do so.  Thus, we must strip any inferred attributes and reinfer them after the lowering.

I'm not aware of any active miscompiles caused by this, but since I'm working on strengthening inference of both and leveraging them in the optimization decisions, I figured a bit of future proofing was warranted.
2021-04-02 09:12:24 -07:00
Philip Reames
d01653f827 [rs4gc] add tests for existing code stripping attributes from function signatures 2021-04-02 08:59:55 -07:00
Serguei Katkov
9fec382601 [RS4GC] Fix hang on infinite loop
The meetBDVState utility may set the base pointer for a conflict state.
The base of a conflict state has no meaning at that point, but it is
still used when comparing BDV states. That comparison serves as the
indicator of progress for each iteration, and the RS4GC pass loops until
it reaches a fixed point.
As a result, in the added test, each iteration updates the state of some
phi nodes with a different base value for the conflict state. This is
reported as progress, even though no further progress is possible for a
conflict state.
In reality, the base value is merely transferred from one state to
another, and the pass detects that transfer as progress on these states.
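
The non-termination can be reproduced in a toy Python model. Everything here is an assumption made for illustration: the `meet` function is a simplified stand-in rather than meetBDVState, the `keep_base_on_conflict` flag models a conflict state that carries a meaningless base, and the iteration cap exists only so the sketch can report non-convergence instead of hanging.

```python
UNKNOWN = ("unknown", None)


def meet(a, b, keep_base_on_conflict):
    """Toy meet; optionally lets a conflict state carry b's base."""
    if a == UNKNOWN:
        return b
    if b == UNKNOWN:
        return a
    if a[0] == "base" and b[0] == "base" and a[1] == b[1]:
        return a
    return ("conflict", b[1] if keep_base_on_conflict else None)


def run_to_fixpoint(phis, keep_base_on_conflict, max_iters=100):
    """Return (converged, iterations). States are compared as whole tuples."""
    states = {phi: UNKNOWN for phi in phis}
    for iteration in range(1, max_iters + 1):
        progress = False
        for phi, operands in phis.items():
            new = UNKNOWN
            for op in operands:
                # Anything that is not a phi is treated as its own base.
                op_state = states[op] if op in states else ("base", op)
                new = meet(new, op_state, keep_base_on_conflict)
            if new != states[phi]:
                states[phi] = new
                progress = True
        if not progress:
            return True, iteration
    return False, max_iters


# A cycle of three phis, visited against the direction of data flow.
phis = {
    "phi3": ["c", "phi2"],
    "phi2": ["b", "phi1"],
    "phi1": ["a", "phi3"],
}
normalized = run_to_fixpoint(phis, keep_base_on_conflict=False)
carrying_base = run_to_fixpoint(phis, keep_base_on_conflict=True)
```

When conflict states carry a base, the three phis keep handing their bases around the cycle and every iteration looks like progress. When conflict states are normalized to carry no base, the loop settles after a few iterations. As in the real test, this toy case depends on the traversal order.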

The test is very fragile: the traversal order of states and of the operands
of phi nodes plays an important role.

Reviewers: reames, dantrushin
Reviewed By: reames
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D99058
2021-03-23 12:54:51 +07:00
Philip Reames
ef884e155d [rs4gc] don't force a conflict for a canonical broadcast
A broadcast is a shufflevector where only one input is used. Because of the way we handle constants (undef is a constant), the canonical shuffle sees a meet of (some value) and (nullptr). Given this, every broadcast gets treated as a conflict and a new base pointer computation is added.

The other way to tackle this would be to change constant handling specifically for undefs, but this seems easier.

Differential Revision: https://reviews.llvm.org/D98315
2021-03-16 12:59:06 -07:00
Philip Reames
5cabf472cb [rs4gc] don't duplicate existing values which are provably base pointers
RS4GC needs to rewrite the IR to ensure that every relocated pointer has an associated base pointer. The existing code isn't particularly smart about avoiding duplication of existing IR when it turns out the original pointer we were asked to materialize a base pointer for is itself a base pointer.

This patch adds a stage to the algorithm which prunes nodes proven (with a simple forward dataflow fixed point) to be base pointers from the list of nodes considered for duplication. This does require changing some of the later invariants slightly, which is probably the riskiest part of the change.
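
A pruning stage of this kind can be sketched in Python as a fixed-point loop over a candidate list. The function name, the `candidates` mapping, and the `derived` set are assumptions for illustration: an operand counts as a base pointer if it is neither a known derived pointer nor still a candidate.

```python
def prune_known_bases(candidates, derived):
    """Drop candidates whose inputs are all base pointers, to a fixed point.

    candidates: node -> list of operands (hypothetical toy representation).
    derived: operands known to be derived pointers, never bases.
    Returns the nodes that still need a duplicated base computation.
    """
    remaining = dict(candidates)
    changed = True
    while changed:
        changed = False
        for node in list(remaining):
            inputs_are_bases = all(
                op not in remaining and op not in derived
                for op in remaining[node]
            )
            if inputs_are_bases:
                # The node is itself a base pointer; reuse it as-is.
                del remaining[node]
                changed = True
    return set(remaining)


candidates = {
    "phi_a": ["base1", "base2"],
    "sel": ["phi_a", "base3"],
    "phi_b": ["gep1", "base1"],
}
still_needed = prune_known_bases(candidates, derived={"gep1"})
```

Here `phi_a` merges two base pointers, so it is a base pointer itself, and `sel` follows once `phi_a` is pruned. Only `phi_b`, which has a derived input, remains a candidate for duplication.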

Differential Revision: D98122
2021-03-16 12:51:21 -07:00