Changes the size of allocations automatically.
For now, implements the case when a single range from start of the
allocation is alive and the allocation can be reduced.
The very first AA, at least the first one in order, is not necessary
anymore. `AAReturnedValues` was from a different time; one might say, a
simpler time.
It was rewriten once to use `Attribute::getAssumedSimplifiedValues`,
which is what the replacement, `AAPotentialValuesReturned`, does too.
To match the old behavior we needed to avoid the helper
`AAReturnedFromReturnedValues` and iterate the return instructions
explicitly, however, it is still less complexity than it was before.
`AAReturnedFromReturnedValues` and `getAssumedSimplifiedValues` now
allow users to stop at PHI and select nodes or to ignore those and look
through. `AANoFPClass` will stop at select and phi nodes to read the
fast math flags.
Fixes: https://github.com/llvm/llvm-project/issues/63404
Differential Revision: https://reviews.llvm.org/D154917
AANoCapture is now the first non-boolean AA that is always queried via
the new APIs and not created manually.
We explicitly do not manifest nocapture for `null` and `undef` anymore.
AANonNull is now the first AA that is always queried via the new APIs
and not created manually. Others will follow shortly to avoid trivial
AAs whenever possible.
This commit introduced some helper logic that will make it simpler to
port the next one. It also untangles AADereferenceable and AANonNull
such that the former does not keep a handle on the latter. Finally,
we stop deducing `nonnull` for `undef`, which was incorrect.
If an attribute is implied by the IR we do not (always) create an AA
anymore. To keep test coverage, and given the lack of a good heuristic
to decide otherwise, we will now also manifest such attributes.
We had some custom handling for existing MemoryEffects but we now move
it to the place we check other existing attributes before we manifest
new ones. If we later decide to curb duplication (of attributes on the
call site and callee), we can do that at a single location and for all
attributes.
The test changes basically add known `memory` callee information to the
call sites.
This is a partial cleanup to centralize the initialization and update
decisions for AAs. Lifting the burdon and boilerplate on users and
making it harder to accidentally perform unsound deductions.
The two static helpers show how we can lift the decisions to generate an
AA into the Attributor, avoiding trivial AAs that just cost us compile
time and maintenance code (to check for pre-conditions).
This converts the arg-count-mismatch.ll test to opaque pointers.
The bitcasts of the called functions are now implicit and this
affects behavior: We now infer memory attributes. This should be
fine, as function-level attributes are not affected by signature
mismatches.
Differential Revision: https://reviews.llvm.org/D153406
We now consistently use `CallBase::getCalledOperand` rather than
`getCalledFunction`, as we do not want the type checked performed by the
latter. This exposed various missing checks to handle mismatches
properly, but it is good to have them explicit now.
In a follow up we might want to flag certain calls as UB, but for now,
we allow everything to cut down on unexpected differences.
Instead of creating an AA for an IR attribute we can first check if it
is implied/known. If so, we can save the time to create the AA, figure
out it is implied, fix it, and later manifest it in the IR
(redundantly). Other IR attributes can be added to the list in
`AA::hasAssumedIRAttr` later on, for now we support 8 different ones.
It was never really useful to track #iterations, though it helped during
the initial development. What we should track, in a follow up, are
potentially #updates. That is also what we should restrict instead of
the #iterations.
Derive the mustprogress attribute based on the willreturn attribute
or the fact that all callers are mustprogress.
Differential Revision: https://reviews.llvm.org/D94740
Similar to loads, PHIs can be used to introduce non-dynamically unique
values into the simplification "algorithm". We need to check that PHIs
do not carry such a value from one iteration into the next as can cause
downstream reasoning to fail, e.g., downstream could think a comparison
is equal because the simplified values are equal while they are defined
in different loop iterations. Similarly, instructions in cycles are now
conservatively treated as non-dynamically unique. We could do better but
I'll leave that for the future.
The change in AAUnderlyingObjects allows us to ignore dynamically unique
when we simply look for underlying objects. The user of that AA should
be aware that the result might not be a dynamically unique value.
When we simplify loads we need to adjust types (esp. null-values)
properly to avoid inconsinstencies down the line. Add a cast and an
error message.
Fixes: https://github.com/llvm/llvm-project/issues/60788
IR is now always parsed in opaque pointer mode, unless
-opaque-pointers=0 is explicitly given. There is no automatic
detection of typed pointers anymore.
The -opaque-pointers=0 option is added to any remaining IR tests
that haven't been migrated yet.
Differential Revision: https://reviews.llvm.org/D141912
This patch adds two checks that have in experiments caused issues. One
was an oversight that allowed new AAs during cleanup to be optimistic.
The other treated functions as functions even if they were used as
values, e.g., in a cast instruction. In such cases we might have assumed
the value is dead if the function is not entered, which isn't true.
The new test functions don't expose a bug but I kept them around.
Check lines were regenerated for these.
The alignment changes in byval-2. look suspicious at first glance,
but actually only propagate pre-existing UB.
This resolves a recent regression introduced by a bug fix and allows us
to use dominating write information (formerly HasBeenWrittenTo
information) to skip potential interfering accesses.
Generally, there are two changes here:
1) If we have dominating writes they form a chain and we can look at the
least one to minimize the distance between the write and the (read)
access in question.
2) If such a least dominating write exists, we can ignore writes in
other functions as long as they cannot be reached from code between
this write and the (read) access in question.
We have all the tools available to make such queries and the positive
tests show the result. Note that the negative test from the bug fix is
still in tree and not affected.
As a side-effect, we can remove the (arbitrary) treshold now on the
number of interfering accesses since we do not iterate over dominating
ones anymore.
This patch introduces a new AA `AAUnderlyingObjects`. It is basically like a wrapper
AA of the function `AA::getAssumedUnderlyingObjects`, but it can recursively do
query if the underlying object is an indirect access, such as a phi node or a select
instruction.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D141164
We were already treating branch on poison as UB, but branch on
undef is also UB. Move the checks into the correct function.
From LangRef for br:
> If ‘cond’ is poison or undef, this instruction has undefined behavior.
From LangRef for switch:
> If ‘value’ is poison or undef, this instruction has undefined behavior.
There is a minor regression in dont-distribute-phi.ll, apparently
we handle that pattern in logical but not bitwise form.
AAPotentialConstantValues now works for PHI and Load by simply examinig
AAPotentialValues for the instruction itself.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D140371
Even if all loads and stores are in `nosync` functions we cannot
guarantee there is no synchronization going on between them. As such, we
cannot use CFG reasoning. We could check the entire module, or, what
happens now to minimize test churn, is to check if all accesses are in
the same function that is `nosync`. A follow up will undo some of the
regressions where possible.
Similarly, reachability cannot be used to exclude an access if the
access is not known to be executed by the same thread as the given
instruction.
The OpenMP-opt test was added for the latter problem.
We had two AAs for reachability but it was very cumbersome to extend
them. We also had some fallback to use LLVM-core mechanisms and cache
the result. The new design shares the query code and interface nicely
between AAIntraFnReachability and AAInterFnReachability.
As part of the rewrite we also added the ExclusionSet to the queries.
This switches everything to use the memory attribute proposed in
https://discourse.llvm.org/t/rfc-unify-memory-effect-attributes/65579.
The old argmemonly, inaccessiblememonly and inaccessiblemem_or_argmemonly
attributes are dropped. The readnone, readonly and writeonly attributes
are restricted to parameters only.
The old attributes are auto-upgraded both in bitcode and IR.
The bitcode upgrade is a policy requirement that has to be retained
indefinitely. The IR upgrade is mainly there so it's not necessary
to update all tests using memory attributes in this patch, which
is already large enough. We could drop that part after migrating
tests, or retain it longer term, to make it easier to import IR
from older LLVM versions.
High-level Function/CallBase APIs like doesNotAccessMemory() or
setDoesNotAccessMemory() are mapped transparently to the memory
attribute. Code that directly manipulates attributes (e.g. via
AttributeList) on the other hand needs to switch to working with
the memory attribute instead.
Differential Revision: https://reviews.llvm.org/D135780
If we have a constant aggregate, e.g., as an initializer, we usually
failed to extract the proper value/type from it. This patch provides the
size and offset information necessary to extract the right part of the
constant.
Now that the legacy PM is no longer tested, the huge matrix of
test prefixes used by attributor tests is no longer needed and very
confusing for the casual reader. Reduce the prefixes down to just
CHECK, TUNIT and CGSCC.
This is the first patch in a series intended for removing flag
-enable-new-pm=0 from lit tests. This is part of a bigger
effort of completely removing legacy code related to legacy
pass manager in favor of currently default new pass manager.
In this patch flag has been removed only from tests where no significant
change has been required because checks has been duplicated for
both PMs.
Reviewed By: fhahn
Differential Revision: https://reviews.llvm.org/D134150
Revert "[Attributor] Teach AAPointerInfo to look into aggregates"
This reverts commit 844f6c5d03d58e7ac0c6b838e4a7834ac575ab9b and
4ed0a88cd8a77370073feb270d77a9e8b27bd68c as they broke the buildbots
that run openmp/libomptarget/test/offloading/bug49021.cpp.
If we have a constant aggregate, e.g., as an initializer, we usually
failed to extract the proper value/type from it. This patch provides the
size and offset information necessary to extract the right part of the
constant.
For the longest time we used `AAValueSimplify` and
`genericValueTraversal` to determine "potential values". This was
problematic for many reasons:
- We recomputed the result a lot as there was no caching for the 9
locations calling `genericValueTraversal`.
- We added the idea of "intra" vs. "inter" procedural simplification
only as an afterthought. `genericValueTraversal` did offer an option
but `AAValueSimplify` did not. Thus, we might end up with "too much"
simplification in certain situations and then gave up on it.
- Because `genericValueTraversal` was not a real `AA` we ended up with
problems like the infinite recursion bug (#54981) as well as code
duplication.
This patch introduces `AAPotentialValues` and replaces the
`AAValueSimplify` uses with it. `genericValueTraversal` is folded into
`AAPotentialValues` as are the instruction simplifications performed in
`AAValueSimplify` before. We further distinguish "intra" and "inter"
procedural simplification now.
`AAValueSimplify` was not deleted as we haven't ported the
re-materialization of instructions yet. There are other differences over
the former handling, e.g., we may not fold trivially foldable
instructions right now, e.g., `add i32 1, 1` is not folded to `i32 2`
but if an operand would be simplified to `i32 1` we would fold it still.
We are also even more aware of function/SCC boundaries in CGSCC passes,
which is good even if some tests look like they regress.
Fixes: https://github.com/llvm/llvm-project/issues/54981
Note: A previous version was flawed and consequently reverted in
6555558a80589d1c5a1154b92cc3af9495f8f86c.
This reverts commit f17639ea0cd30f52ac853ba2eb25518426cc3bb8 as three
AMDGPU tests haven't been updated. Will need to verify the changes are
not regressions we should avoid.