https://alive2.llvm.org/ce/z/YGT5SN
https://alive2.llvm.org/ce/z/PVDxCw
https://alive2.llvm.org/ce/z/8buR2N
This is tricky because with positive numbers, we only go up, so we can
in fact always hit the signed_max boundary. This is important because
the intrinsic we use can also saturate the OTHER way, i.e. clamp to
INT_MIN if the result goes in that direction, and the range checking we
do only works for positive numbers.
Because of this issue, we can only do this for constants.
For the simplification
```
(C && A) || (!C && B) --> sel C, A, B
```
(and related), if `C` (or `!C`) is the condition in the select
instruction representing the logical and, we can preserve that logical
and's branch weights when emitting the new instruction. Otherwise, the
profile data is unknown.
If `C` is the condition of both logical ands, then we just take the
branch weights of the first logical and (though in practice they should
be equal.)
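As a sanity check of the Boolean identity behind this fold, the following sketch enumerates every input combination in plain C++. It models only the truth table, not poison semantics or branch weights, and the function names are illustrative rather than taken from InstCombine.

```cpp
#include <cassert>

// Left-hand side of the fold: (C && A) || (!C && B).
bool logicalForm(bool C, bool A, bool B) { return (C && A) || (!C && B); }

// Right-hand side of the fold: a select on C.
bool selectForm(bool C, bool A, bool B) { return C ? A : B; }

// Returns true if both forms agree on all eight input combinations.
bool formsAgree() {
  for (int Bits = 0; Bits < 8; ++Bits) {
    bool C = (Bits & 1) != 0;
    bool A = (Bits & 2) != 0;
    bool B = (Bits & 4) != 0;
    if (logicalForm(C, A, B) != selectForm(C, A, B))
      return false;
  }
  return true;
}
```

Because the resulting select is keyed on the same `C`, the probability of taking its true arm is the probability that `C` holds, which is why the weights of the logical and conditioned on `C` can be reused.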
Furthermore, `select-safe-transforms.ll` now passes under the profcheck
configuration, so we remove it from the failing tests.
Tracking issue: #147390
If a select instruction is replaced with one whose conditional is the
negation of the original, then the replacement's branch weights are the
reverse of the original's.
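A minimal sketch of this rule, using a toy struct in place of a real select instruction with `!prof` metadata. `ToySelect` and `invertCondition` are hypothetical names for illustration only and do not correspond to LLVM classes or helpers.

```cpp
#include <cassert>
#include <cstdint>

// Toy model of a select carrying two branch weights (not an LLVM type).
struct ToySelect {
  bool Cond;
  int TrueVal;
  int FalseVal;
  uint32_t TrueWeight;
  uint32_t FalseWeight;

  int eval() const { return Cond ? TrueVal : FalseVal; }
};

// Rewrites `select C, T, F` as `select !C, F, T`. Each weight travels
// with its arm, so the weight pair ends up reversed.
ToySelect invertCondition(const ToySelect &S) {
  return ToySelect{!S.Cond, S.FalseVal, S.TrueVal, S.FalseWeight,
                   S.TrueWeight};
}
```

The inverted select evaluates to the same value as the original, and the weight that described the original true arm now describes the new false arm.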
Tracking issue: #147390
The `masked.load`, `masked.store`, `masked.gather` and `masked.scatter`
intrinsics currently accept a separate alignment immarg. Replace this
with an `align` attribute on the pointer / vector of pointers argument.
This is the standard representation for alignment information on
intrinsics, and is already used by all other memory intrinsics. This
means the signatures now match llvm.expandload, llvm.vp.load, etc.
(Things like llvm.memcpy used to have a separate alignment argument as
well, but were already migrated a long time ago.)
It's worth noting that the masked.gather and masked.scatter intrinsics
previously accepted a zero alignment to indicate the ABI type alignment
of the element type. This special case is gone now: If the align
attribute is omitted, the implied alignment is 1, as usual. If ABI
alignment is desired, it needs to be explicitly emitted (which the
IRBuilder API already requires anyway).
In the case where we have a conditional that is implied by a previous
conditional (like x < 10 => x < 20 in a select), we can simply propagate
the profile information along the select.
Consider the following transform:
```
C = binop float A, nnan OOp
D = select ninf, i1 cond, float C, float A
->
E = select ninf, i1 cond, float OOp, float Identity
F = binop float A, E
```
We cannot propagate ninf from the original select, because OOp may be
inf, and the flag only guarantees that FalseVal (op OOp) is never
infinity.
Examples: -inf + +inf = NaN, -inf - -inf = NaN, 0 * inf = NaN
Specifically, if the original select has both ninf and nnan, we can
safely propagate the flag.
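The arithmetic in the examples above can be reproduced in C++, assuming the platform's `double` follows IEEE 754 and fast-math style optimizations are disabled. The helper names are illustrative. The point is that infinite operands can combine into a NaN, which is not an infinity, so a result that satisfies ninf says nothing about whether its operands were finite.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Each helper combines two non-NaN operands, at least one of which is
// infinite, and produces NaN under IEEE 754 arithmetic.
double negInfPlusInf() {
  const double Inf = std::numeric_limits<double>::infinity();
  return -Inf + Inf;
}

double negInfMinusNegInf() {
  const double Inf = std::numeric_limits<double>::infinity();
  return -Inf - (-Inf);
}

double zeroTimesInf() {
  const double Inf = std::numeric_limits<double>::infinity();
  return 0.0 * Inf;
}
```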
Alive2:
+ fadd: https://alive2.llvm.org/ce/z/TWfktv
+ fsub: https://alive2.llvm.org/ce/z/RAsjJb
+ fmul: https://alive2.llvm.org/ce/z/8eg4ND
Closes https://github.com/llvm/llvm-project/issues/161634.
Logical booleans in LLVM are represented by select statements - e.g. the
statement
```
A && B
```
is represented as
```
select i1 %A, i1 %B, i1 false
```
When LLVM folds two of the same logical booleans into a logical boolean
and a bitwise boolean (e.g. `A && B && C` -> `A && (B & C)`), the first
logical boolean is a select statement that retains the original
condition from the first logical boolean of the original statement. This
means that the new select statement has the same branch weights as the
original select statement.
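To illustrate why the logical form is a select rather than a plain `and`, here is a toy model in C++ where `std::nullopt` stands in for poison. This is a deliberate simplification of LLVM's semantics, and `ToyBool`, `logicalAnd` and `bitwiseAnd` are hypothetical names.

```cpp
#include <cassert>
#include <optional>

using ToyBool = std::optional<bool>; // std::nullopt models poison

// Models `select i1 %A, i1 %B, i1 false`: %B is only observed when %A is
// true, so a poison %B is harmless when %A is false.
ToyBool logicalAnd(ToyBool A, ToyBool B) {
  if (!A)
    return std::nullopt; // a poison condition poisons the select
  return *A ? B : ToyBool(false);
}

// Models `and i1 %A, %B`: poison in either operand poisons the result.
ToyBool bitwiseAnd(ToyBool A, ToyBool B) {
  if (!A || !B)
    return std::nullopt;
  return *A && *B;
}
```

In this model the two forms differ only when the first operand is false and the second is poison, which is the case the select form exists to protect.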
Tracking issue: #147390
This reverts commit 572b579632fb79ea6eb562a537c9ff1280b3d4f5.
This is a reland of #159666 but with a fix moving the `extern`
declaration of the flag under the LLVM namespace, which is needed to fix
a linker error caused by #161240.
If `select` simplification produces the transform:
```
(select A && B, T, F) -> (select A, T, F)
```
or
```
(select A || B, T, F) -> (select A, T, F)
```
it stands to reason that if the branches are the same, then the branch
weights remain the same since the net effect is a simplification of the
conditional.
There are also cases where InstCombine negates the conditional (and
therefore reverses the branches); this PR asserts that the branch
weights are reversed in this case.
Tracking issue: #147390
A common idiom is the usage of the PatternMatch match function within a
functional algorithm like all_of. Introduce a match functor to shorten
this idiom.
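The idiom can be sketched without LLVM headers as follows. Everything here is a toy stand-in: `SpecificInt`, the free `match` function, and `makeMatchFunctor` are hypothetical names chosen for illustration and are not the PatternMatch API.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Toy pattern: matches integers equal to Expected.
struct SpecificInt {
  int Expected;
  bool match(int V) const { return V == Expected; }
};

// Toy stand-in for a free `match(Value, Pattern)` function.
template <typename Pattern> bool match(int V, const Pattern &P) {
  return P.match(V);
}

// Hypothetical functor that binds a pattern so it can be passed directly
// to an algorithm instead of writing a lambda at every call site.
template <typename Pattern> struct MatchFunctor {
  Pattern P;
  bool operator()(int V) const { return match(V, P); }
};

template <typename Pattern> MatchFunctor<Pattern> makeMatchFunctor(Pattern P) {
  return MatchFunctor<Pattern>{P};
}

// The longer idiom: a lambda wrapping match().
bool allZeroWithLambda(const std::vector<int> &Vs) {
  return std::all_of(Vs.begin(), Vs.end(),
                     [](int V) { return match(V, SpecificInt{0}); });
}

// The shortened idiom: the functor is passed directly.
bool allZeroWithFunctor(const std::vector<int> &Vs) {
  return std::all_of(Vs.begin(), Vs.end(), makeMatchFunctor(SpecificInt{0}));
}
```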
Co-authored-by: Luke Lau <luke@igalia.com>
```llvm
%sub = sub nsw T %x, %y
%cmp = icmp sgt T %x, %y ; or sge
%neg = sub T 0, %sub
%abs = select i1 %cmp, T %sub, T %neg
```
becomes:
```llvm
%sub = sub nsw T %x, %y
%abs = call T @llvm.abs.T(T %sub, i1 false)
```
Alive2:
https://alive2.llvm.org/ce/z/ApdJX8
https://alive2.llvm.org/ce/z/gRTmZk
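The equivalence behind this fold can be spot-checked in C++ over a small range of inputs. This is only a sketch: it assumes `X - Y` does not overflow (mirroring `sub nsw`), and the function names are illustrative.

```cpp
#include <cassert>
#include <cstdlib>

// Source pattern: select (x > y), (x - y), 0 - (x - y).
int absDiffSelect(int X, int Y) {
  int Sub = X - Y; // assumed not to overflow, mirroring `sub nsw`
  return X > Y ? Sub : -Sub;
}

// Target pattern: abs of the same subtraction.
int absDiffAbs(int X, int Y) { return std::abs(X - Y); }

// Compares both patterns on every pair in [-8, 8] x [-8, 8].
bool patternsAgree() {
  for (int X = -8; X <= 8; ++X)
    for (int Y = -8; Y <= 8; ++Y)
      if (absDiffSelect(X, Y) != absDiffAbs(X, Y))
        return false;
  return true;
}
```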
This patch addresses
https://github.com/llvm/llvm-project/pull/155216#discussion_r2297724663.
This patch adds a helper function to put the inverse cast on constants,
optionally preserving cast flags.
Follow-up patches will add trunc/ext handling on VectorCombine and flags
preservation on InstCombine.
When folding `X Pred C2 ? X BOp C1 : C2 BOp C1` to `min/max(X, C2) BOp
C1`, if NUW/NSW flags are present on `X BOp C1` and could be safely
applied to `C2 BOp C1`, then they may be added on the BOp after the fold
is complete. https://alive2.llvm.org/ce/z/n_3aNJ
Preserving these flags can allow subsequent transforms to re-order the
min/max and BOp, which in the case of NVPTX would allow for some
potential future transformations which would improve
instruction-selection.
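For one concrete instance of this fold, take `Pred` as signed less-than and `BOp` as addition. The sketch below checks the identity on a small range where no addition overflows. The constants and function names are illustrative assumptions, and the sketch does not model the NUW/NSW flags themselves.

```cpp
#include <algorithm>
#include <cassert>

constexpr int C1 = 5;  // illustrative constant operand of the BOp
constexpr int C2 = 10; // illustrative constant compared against X

// Source pattern: X < C2 ? X + C1 : C2 + C1.
int selectForm(int X) { return X < C2 ? X + C1 : C2 + C1; }

// Folded pattern: min(X, C2) + C1.
int minForm(int X) { return std::min(X, C2) + C1; }

// Compares both forms on a range where no addition can overflow.
bool foldHolds() {
  for (int X = -100; X <= 100; ++X)
    if (selectForm(X) != minForm(X))
      return false;
  return true;
}
```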
Add the following folds for integer min max folding in InstCombine:
- (X < Y) ? X : (Y - 1) ==> MIN(X, Y - 1)
- (X > Y) ? X : (Y + 1) ==> MAX(X, Y + 1)
These are safe when overflow corresponding to the sign of the comparison
is poison. (proof https://alive2.llvm.org/ce/z/oj5iiI).
The most common of these patterns is likely the minimum case, which
occurs in some internal library code when clamping an integer index to a
range (the maximum case is included for completeness). Here is a
simplified example:
int clampToWidth(int idx, int width) {
  if (idx >= width)
    return width - 1;
  return idx;
}
https://cuda.godbolt.org/z/nhPzWrc3W
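The clamp above computes the same value as a minimum against `width - 1`, provided `width - 1` does not overflow. A small C++ check, where `clampToWidthMin` and `clampFormsAgree` are illustrative names:

```cpp
#include <algorithm>
#include <cassert>

// The branchy form from the example.
int clampToWidth(int idx, int width) {
  if (idx >= width)
    return width - 1;
  return idx;
}

// The min form the fold is meant to produce.
int clampToWidthMin(int idx, int width) { return std::min(idx, width - 1); }

// Compares both forms on a small grid where `width - 1` cannot overflow.
bool clampFormsAgree() {
  for (int Idx = -20; Idx <= 20; ++Idx)
    for (int Width = -20; Width <= 20; ++Width)
      if (clampToWidth(Idx, Width) != clampToWidthMin(Idx, Width))
        return false;
  return true;
}
```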
Extend folding for `X Pred C2 ? X BOp C1 : C2 BOp C1` to `min/max(X, C2)
BOp C1` to allow min and max as `BOp`. This ensures a constant clamping
pattern is folded into a pair of min/max instructions. Here is a
simplified example of a case where this folding is not occurring
currently.
int clampToU8(int v) {
  if (v < 0) return 0;
  if (v > 255) return 255;
  return v;
}
https://godbolt.org/z/78jhKPWbv
Generic proof: https://alive2.llvm.org/ce/z/cdpLYy
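The intended result of the fold can be written as a max followed by a min. The following C++ sketch compares that form against the branchy clamp over a range of inputs. `clampToU8MinMax` and `clampFormsAgree` are illustrative names.

```cpp
#include <algorithm>
#include <cassert>

// The branchy form from the example.
int clampToU8(int v) {
  if (v < 0) return 0;
  if (v > 255) return 255;
  return v;
}

// The same clamp expressed as a pair of min/max operations.
int clampToU8MinMax(int v) { return std::min(std::max(v, 0), 255); }

// Compares both forms across values below, inside, and above [0, 255].
bool clampFormsAgree() {
  for (int V = -1000; V <= 1000; ++V)
    if (clampToU8(V) != clampToU8MinMax(V))
      return false;
  return true;
}
```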
Before this patch, InstCombine hung because it replaced a value with a
more complex one:
```
%sel = select i1 %cmp, i32 %smax, i32 0 ->
%sel = select i1 %cmp, i32 %masked, i32 0 ->
%sel = select i1 %cmp, i32 %smax, i32 0 ->
...
```
This patch makes this replacement more conservative. It only performs
the replacement iff the new value is one of the operands of the original
value.
Closes https://github.com/llvm/llvm-project/issues/142405.
Having a finite Depth (or recursion limit) for computeKnownBits is very
limiting, but is currently a load-bearing necessity, as all KnownBits
are recomputed on each call and there is no caching. As a prerequisite
for an effort to remove the recursion limit altogether, either using a
clever caching technique, or writing an easily invalidated KnownBits
analysis, make the Depth argument in APIs in ValueTracking uniformly the
last argument with a default value. This would aid in removing the
argument when the time comes, as many callers that currently pass 0
explicitly are now updated to omit the argument altogether.
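The shape of the API change can be sketched with a toy recursive query. Everything below is hypothetical (`ToyNode`, `ToyMaxDepth`, and this `isKnownNonZero` are not the ValueTracking declarations). It only shows how a trailing, defaulted `Depth` lets top-level callers omit the argument while recursive calls still pass it.

```cpp
#include <cassert>
#include <vector>

constexpr unsigned ToyMaxDepth = 6; // hypothetical limit for this sketch

// Toy value: either known non-zero itself or defined by one operand.
struct ToyNode {
  const ToyNode *Operand;
  bool KnownNonZero;
};

// Depth is the last parameter and defaults to 0.
bool isKnownNonZero(const ToyNode *N, unsigned Depth = 0) {
  if (!N || Depth >= ToyMaxDepth)
    return false; // give up conservatively at the recursion limit
  if (N->KnownNonZero)
    return true;
  return isKnownNonZero(N->Operand, Depth + 1);
}

// Builds a chain of Length nodes where only the last one is known
// non-zero, then queries the first node without passing Depth.
bool chainIsKnownNonZero(unsigned Length) {
  if (Length == 0)
    return false;
  std::vector<ToyNode> Chain(Length, ToyNode{nullptr, false});
  for (unsigned I = 0; I + 1 < Length; ++I)
    Chain[I].Operand = &Chain[I + 1];
  Chain.back().KnownNonZero = true;
  return isKnownNonZero(&Chain.front());
}
```

In this sketch a chain of six nodes is still resolved, while a chain of seven hits the limit and the query returns the conservative answer.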
Make use of known bits when trying to decompose a select/icmp bittest and folding it into an and. This means we can fold when additional information, for instance via a range attribute or metadata, allows us to conclude that the resulting mask is in fact a power of two.
For `trunc nuw`, this saves an instruction; otherwise it only produces
different instructions without the select, which is the same behavior as
for the bit test before.
proof: https://alive2.llvm.org/ce/z/a6QmyV
Consider the following pattern:
```
%cmp = fcmp <pred> double %x, 0.000000e+00
%negX = fneg <fmf> double %x
%sel = select i1 %cmp, double %x, double %negX
```
We cannot propagate ninf from fneg to select since `%negX` may not be
chosen. Similarly, we cannot propagate nnan unless `%negX` is guaranteed
to be selected when `%x` is NaN.
This patch also propagates nnan/ninf from fcmp to avoid regression in
`PhaseOrdering/generate-fabs.ll`.
Alive2: https://alive2.llvm.org/ce/z/t6U-tA
Closes https://github.com/llvm/llvm-project/issues/121430 and
https://github.com/llvm/llvm-project/issues/113989.
Changes: There was a serious bug in the previous patch, leading to a
miscompile. See #122723 for the miscompile report from Alexander, and
the follow-up investigation by Nikita. The patch has since been
reworked, and now includes the testcase from the miscompile.
Follow up on 4a0d53a (PatternMatch: migrate to CmpPredicate) to get rid
of one of the FIXMEs it introduced by replacing a predicate comparison
with CmpPredicate::getMatching.
Co-authored-by: Nikita Popov <npopov@redhat.com>