As part of the migration to ptradd
(https://discourse.llvm.org/t/rfc-replacing-getelementptr-with-ptradd/68699),
we need to change the representation of the `inrange` attribute, which
is used for vtable splitting.
Currently, inrange is specified as follows:
```
getelementptr inbounds ({ [4 x ptr], [4 x ptr] }, ptr @vt, i64 0, inrange i32 1, i64 2)
```
The `inrange` is placed on a GEP index, and all accesses must be "in
range" of that index. The new representation is as follows:
```
getelementptr inbounds inrange(-16, 16) ({ [4 x ptr], [4 x ptr] }, ptr @vt, i64 0, i32 1, i64 2)
```
This specifies which offsets are "in range" of the GEP result. The new
representation will continue working when canonicalizing to ptradd
representation:
```
getelementptr inbounds inrange(-16, 16) (i8, ptr @vt, i64 48)
```
The inrange offsets are relative to the return value of the GEP. An
alternative design could make them relative to the source pointer
instead. The result-relative format was chosen on the off-chance that we
want to extend support to non-constant GEPs in the future, in which case
this variant is more expressive.
This implementation "upgrades" the old inrange representation in bitcode
by simply dropping it. This is a very niche feature, and I don't think
trying to upgrade it is worthwhile. Let me know if you disagree.
Teaching ConstantFoldLoadFromUniformValue that types that are padded in
memory can't be considered as uniform.
Using the big hammer to prevent optimizations when loading from a
constant for which DataLayout::typeSizeEqualsStoreSize would return
false.
Main problem solved would be something like this:
store i17 -1, ptr %p, align 4
%v = load i8, ptr %p, align 1
If for example the i17 occupies 32 bits in memory, then LLVM IR doesn't
really tell where the padding goes. And even if we assume that the 15
most significant bits are padding, then they should be considered as
undefined (even if LLVM backend typically would pad with zeroes).
Anyway, for a big-endian target the load would read those most
significant bits, which aren't guaranteed to be one's. So it would be
wrong to constant fold the load as returning -1.
If LLVM IR had been more explicit about the placement of padding, then
we could allow the constant fold of the load in the example, but only
for little-endian.
Fixes: https://github.com/llvm/llvm-project/issues/81793
This is intended as the replacement for ConstantExpr::getIntegerCast(),
which does not require availability of the corresponding constant
expressions. It just forwards to ConstantFoldCastOperand with the
correct opcode.
Introduce isDesirableCastOp() which determines whether IR builder
and constant folding should produce constant expressions for a
given cast type. This mirrors what we do for binary operators.
Mark zext/sext as undesirable, which prevents most creations of such
constant expressions. This is still somewhat incomplete and there
are a few more places that can create zext/sext expressions.
This is part of the work for
https://discourse.llvm.org/t/rfc-remove-most-constant-expressions/63179.
The reason for the odd result in the constantexpr-fneg.c test is
that initially the "a[]" global is created with an [0 x i32] type,
at which point the icmp expression cannot be folded. Later it is
replaced with an [1 x i32] global and the icmp gets folded away.
But at that point we no longer fold the zext.
Instead of ConstantExpr::getCast() with a fixed opcode, use the
corresponding getXYZ methods instead. For the one place creating
a pointer bitcast drop it entirely, as this is redundant with
opaque pointers.
The function StripPtrCastKeepAS() no longer makes any sense, as we've
migrated to using opaque pointers throughout the codebase. Hence, remove
it. No changes to tests are required.
Differential Revision: https://reviews.llvm.org/D156555
This makes it possible to use canonicalize to perform a dynamic check
for whether denormal flushing is enabled, which will fold out when the
denormal mode is known. Previously it would only fold if denormal
flushing were known enabled.
https://reviews.llvm.org/D156107
Constant folding cannot fail here, because we're really working
on plain integers. It might be better to make all of this work
on APInts instead of Constants.
Handle constant folding and idempotent folding. Not sure
this is an appropriate use of undef for the inf/nan case. The
C version says the second result is "unspecified". The AMDGPU
instruction returns 0.
This is stricter than the default "ieee", and should probably be the
default. This patch leaves the default alone. I can change this in a
future patch.
There are non-reversible transforms I would like to perform which are
legal under IEEE denormal handling, but illegal with flushing zero
behavior. Namely, conversions between llvm.is.fpclass and fcmp with
zeroes.
Under "ieee" handling, it is legal to translate between
llvm.is.fpclass(x, fcZero) and fcmp x, 0.
Under "preserve-sign" handling, it is legal to translate between
llvm.is.fpclass(x, fcSubnormal|fcZero) and fcmp x, 0.
I would like to compile and distribute some math library functions in
a mode where it's callable from code with and without denormals
enabled, which requires not changing the compares with denormals or
zeroes.
If an IEEE function transforms an llvm.is.fpclass call into an fcmp 0,
it is no longer possible to call the function from code with denormals
enabled, or write an optimization to move the function into a denormal
flushing mode. For the original function, if x was a denormal, the
class would evaluate to false. If the function compiled with denormal
handling was converted to or called from a preserve-sign function, the
fcmp now evaluates to true.
This could also be of use for strictfp handling, where code may be
changing the denormal mode.
Alternative name could be "unknown".
Replaces the old AMDGPU custom inlining logic with more conservative
logic which tries to permit inlining for callees with dynamic handling
and avoids inlining other mismatched modes.
There is no getNullValue in ConstantFP. Due to inheritance, we're calling
Constant::getNullValue which handles any type including FP.
Since we already know we want an FP constant we can use ConstantFP::getZero
which might be faster and is a more readable name for an FP zero.
These expressions will now only be created if explicitly requested
in IR/bitcode (and by LowerTypeTests, which has a tricky to remove
use).
This is in preparation for removing these expressions entirely,
but also fixes#60983 in the meantime.
Check that the underlying object is a constant global with
definitive initializer upfront, so we can skip the more expensive
offset calculation logic if we can't perform the fold anyway.
Similar to how `makeArrayRef` is deprecated in favor of deduction guides, do the
same for `makeMutableArrayRef`.
Once all of the places in-tree are using the deduction guides for
`MutableArrayRef`, we can mark `makeMutableArrayRef` as deprecated.
Differential Revision: https://reviews.llvm.org/D141814
Use deduction guides instead of helper functions.
The only non-automatic changes have been:
1. ArrayRef(some_uint8_pointer, 0) needs to be changed into ArrayRef(some_uint8_pointer, (size_t)0) to avoid an ambiguous call with ArrayRef((uint8_t*), (uint8_t*))
2. CVSymbol sym(makeArrayRef(symStorage)); needed to be rewritten as CVSymbol sym{ArrayRef(symStorage)}; otherwise the compiler is confused and thinks we have a (bad) function prototype. There was a few similar situation across the codebase.
3. ADL doesn't seem to work the same for deduction-guides and functions, so at some point the llvm namespace must be explicitly stated.
4. The "reference mode" of makeArrayRef(ArrayRef<T> &) that acts as no-op is not supported (a constructor cannot achieve that).
Per reviewers' comment, some useless makeArrayRef have been removed in the process.
This is a follow-up to https://reviews.llvm.org/D140896 that introduced
the deduction guides.
Differential Revision: https://reviews.llvm.org/D140955
Copied from the existing llvm.amdgcn.class handling; eventually I will
fold that to the generic intrinsic when legal. The tests should
probably move into an instsimplify only test.