3302 Commits

Author SHA1 Message Date
Craig Topper
bf8054644d [DAGCombiner] Don't expand (neg (abs x)) if the abs has an additional user.
If the types aren't legal, the expansions may get type legalized in a
different way preventing code sharing. If the type is legal, we will
share some instructions between the two expansions, but we will need an
extra register.

Since we don't appear to fold (neg (sub A, B)) if the sub has an
additional user, I think it makes sense not to expand NABS.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D120513
2022-03-01 07:32:07 -08:00
Sanjay Patel
acb96ffd14 [SDAG] fold bitwise logic with shifted operands
LOGIC (LOGIC (SH X0, Y), Z), (SH X1, Y) --> LOGIC (SH (LOGIC X0, X1), Y), Z

https://alive2.llvm.org/ce/z/QmR9rR

This is a reassociation + factoring fold. The common shift operation is moved
after a bitwise logic op on 2 input operands.
We get simpler cases of these patterns in IR, but I suspect we would miss all
of these exact tests in IR too. We also handle the simpler form of this plus
several other folds in DAGCombiner::hoistLogicOpWithSameOpcodeHands().

This is a partial implementation of a transform suggested in D111530
(only handles 'or' bitwise logic as a first step - need to stamp out more
tests for other opcodes).
Several of the same tests added for D111530 are altered here (but not
fully optimized). I'm not sure yet if this would help/hinder that patch,
but this should be an improvement for all tests added with ecf606cb4329ae
since it removes a shift operation in those examples.

Differential Revision: https://reviews.llvm.org/D120516
2022-02-27 09:54:12 -05:00
Simon Pilgrim
fadd20f80d [DAG] Ensure type is legal for bswap(shl(x,c)) -> zext(bswap(trunc(shl(x,c-bw/2)))) fold
As reported on D120192
2022-02-27 11:25:22 +00:00
Simon Pilgrim
370ebc9d9a [DAG] Attempt to fold bswap(shl(x,c)) -> zext(bswap(trunc(shl(x,c-bw/2))))
If the shl is at least half the bitwidth (i.e. the lower half of the bswap source is zero), then we can reduce the shift and perform the bswap at half the bitwidth and just zero extend.

Based off PR51391 + PR53867

Differential Revision: https://reviews.llvm.org/D120192
2022-02-24 19:33:51 +00:00
Sanjay Patel
4a3708cd6b [SDAG] remove shift that is redundant with part of funnel shift
This is the SDAG translation of D120253 :
https://alive2.llvm.org/ce/z/qHpmNn

The SDAG nodes can have different operand types than the result value.
We can see an example of that with AArch64 - the funnel shift amount
is an i64 rather than i32.

We may need to make that match even more flexible to handle
post-legalization nodes, but I have not stepped into that yet.

Differential Revision: https://reviews.llvm.org/D120264
2022-02-24 11:25:46 -05:00
Craig Topper
c7d6448d03 [DAGCombiner][TargetLowering] Pass SDValue by value to isMulAddWithConstProfitable.
Internally to DAGCombiner the SDValues were passed by non-const
reference despite not being modified. They were then passed by
const reference to TLI.

This patch passes them by value which is consistent with the vast
majority of code.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D120420
2022-02-23 12:40:45 -08:00
Pawe Bylica
afdaa86b77
[DAGCombine] Extend combineCarryDiamond()
In combineCarryDiamond() use getAsCarry() to find more candidates for being a carry flag.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D118362
2022-02-23 21:37:49 +01:00
Paweł Bylica
df0c16ce00
[NFC][DAGCombine] Use isOperandOf() in combineCarryDiamond
Pre-commit for https://reviews.llvm.org/D118362.
2022-02-21 21:41:31 +01:00
Simon Pilgrim
46f1e8359e [DAG] visitBSWAP - pull out repeated SDLoc. NFC
Cleanup for D120192
2022-02-21 13:08:01 +00:00
Chen Zheng
efe5b8ad90 [ISEL] remove unnecessary getNode(); NFC
Reviewed By: RKSimon, craig.topper

Differential Revision: https://reviews.llvm.org/D120049
2022-02-20 21:08:49 -05:00
Luo, Yuanke
67ef63138b [SDAG] enable binop identity constant folds for sub
This patch extract the sub folding from D119654 and leave only add
folding in that patch.

Differential Revision: https://reviews.llvm.org/D120116
2022-02-21 09:37:36 +08:00
Sanjay Patel
a2963d871e [SDAG] fold sub-of-shift to add-of-shift
This fold is done in IR:
https://alive2.llvm.org/ce/z/jWyFrP

There is an x86 test that shows an improvement
from the added flexibility of using add (commutative).

The other diffs are presumed neutral.

Note that this could also be folded to an 'xor',
but I'm not sure if that would be universally better
(eg, x86 can convert adds more easily into LEA).

This helps prevent regressions from a potential fold for
issue #53829.
2022-02-18 11:55:50 -05:00
Paul Walker
6457f42bde [DAGCombiner] Extend ISD::ABDS/U combine to handle more cases.
The current ABD combine doesn't quite work for SVE because only a
single scalable vector per scalar integer type is legal (e.g. for
i32, <vscale x 4 x i32> is the only legal scalable vector type).

This patch extends the combine to also trigger for the cases when
operand extension must be retained.

Differential Revision: https://reviews.llvm.org/D115739
2022-02-17 13:32:20 +00:00
David Green
655d0d86f9 [DAGCombine] Move AVG combine to SimplifyDemandBits
This moves the matching of AVGFloor and AVGCeil into a place where
demand bit are available, so that it can detect more cases for more
folds. It changes the transform to start from a shift, not from a
truncate. We match the pattern shr(add(ext(A), ext(B)), 1), transforming
to ext(hadd(A, B)).

For signed values, because only the bottom bits are demanded llvm will
transform the above to use a lshr too, as opposed to ashr. In order to
correctly detect the hadd we need to know the demanded bits to turn it
back. Depending on whether the shift is signed (ashr) or logical (lshr),
and the extensions are signed or unsigned we can create different nodes.
If the shift is signed:
  Needs >= 2 sign bits. https://alive2.llvm.org/ce/z/h4gQAW generating signed rhadd.
  Needs >= 2 zero bits. https://alive2.llvm.org/ce/z/B64DUA generating unsigned rhadd.
If the shift is unsigned:
  Needs >= 1 zero bits. https://alive2.llvm.org/ce/z/ByD8sj generating unsigned rhadd.
  Needs 1 demanded bit zero and >= 2 sign bits https://alive2.llvm.org/ce/z/hvPGxX and
    https://alive2.llvm.org/ce/z/32P5n1 generating signed rhadd.

Differential Revision: https://reviews.llvm.org/D119072
2022-02-15 10:17:02 +00:00
David Green
03380c70ed [DAGCombine] Basic combines for AVG nodes.
This adds very basic combines for AVG nodes, mostly for constant folding
and handling degenerate (zero) cases. The code performs mostly the same
transforms as visitMULHS, adjusted for AVG nodes.

Constant folding extends to a higher bitwidth and drops the lowest bit.
For undef nodes, `avg undef, x` is transformed to x.  There is also a
transform for `avgfloor x, 0` transforming to `shr x, 1`.

Differential Revision: https://reviews.llvm.org/D119559
2022-02-14 11:18:35 +00:00
Craig Topper
e72fe654b7 [DAGCombiner] Use getShiftAmountConstant in DAGCombiner::foldSelectOfConstants.
This enables fshl to be matched earlier on X86

  %6 = lshr i32 %3, 1
  %7 = select i1 %4, i32 -2147483648, i32 0
  %8 = or i32 %6, %7

X86 uses i8 for shift amounts. SelectionDAGBuilder creates the
ISD::SRL with an i8 shift type. DAGCombiner turns the select into
an ISD::SHL. Prior to this patch it would use i32 for the shift
amount. fshl matching failed because the shift amounts have different
types. LegalizeDAG fixes the ISD::SHL shift amount to i8. This
allowed fshl matching to succeed.

With this patch, the ISD::SHL will be created with an i8 shift
amount. This allows the fshl to match immediately.

No test case beause we still end up with a fshl either way.
2022-02-13 19:09:26 -08:00
Sanjay Patel
96b7e0b5a0 [SDAG] clean up scalarizing load transform
I have not found a way to expose a difference for this patch in a test
because it only triggers for a one-use load, but this is the code that
was adapted into D118376 and caused miscompiles. The new code pattern
is the same as what we do in narrowExtractedVectorLoad() (reduces load
width for a subvector extract).

This removes seemingly unnecessary manual worklist management and fixes
the chain updating via "SelectionDAG::makeEquivalentMemoryOrdering()".

Differential Revision: https://reviews.llvm.org/D119549
2022-02-12 11:41:19 -05:00
Sanjay Patel
429f10f5f2 [SDAG] reduce code duplication and fix formatting; NFC 2022-02-12 10:22:13 -05:00
David Green
4072e362c0 [ISel] Port AArch64 HADD and RHADD to ISel
This ports the aarch64 combines for HADD and RHADD over to DAG combine,
so that they can be used in more architectures (notably MVE in a
followup patch). They are renamed to AVGFLOOR and AVGCEIL in the
process, to avoid confusion with instructions such as X86 hadd. The code
was also rewritten slightly to remove the AArch64 idiosyncrasies.

The general pattern for a AVGFLOORS is
  %xe = sext i8 %x to i32
  %ye = sext i8 %y to i32
  %a = add i32 %xe, %ye
  %r = lshr i32 %a, 1
  %t = trunc i32 %r to i8

An AVGFLOORU is equivalent with zext. Because of the truncate
lshr==ashr, as the top bits are not demanded. An AVGCEIL also includes
an extra rounding, so includes an extra add of 1.

Differential Revision: https://reviews.llvm.org/D106237
2022-02-11 18:28:56 +00:00
Reid Kleckner
b5a592a8e2 [DAG] Remove pointless std::function wrapper, NFC 2022-02-09 14:30:43 -08:00
Reid Kleckner
f63c150187 Revert "[DagCombine] Increase depth by number of operands to avoid a pathological compile time."
Appears to be causing check-llvm to fail

This reverts commit 49ab760090514dcbf84bd9dc7429146c4ca578ef.
2022-02-09 13:55:40 -08:00
Alina Sbirlea
49ab760090 [DagCombine] Increase depth by number of operands to avoid a pathological compile time.
We're hitting a pathological compile-time case, profiled to be in
DagCombiner::visitTokenFactor and many inserts into a SmallPtrSet.
It looks like one of the paths around findBetterNeighborChains is not
capped and leads to this.

This patch resolves the issue. Looking for feedback if this solution
looks reasonable.

Differential Revision: https://reviews.llvm.org/D118877
2022-02-09 13:31:28 -08:00
Sander de Smalen
ec46232517 [DAGCombiner] Fold ty1 extract_vector(ty2 splat(V)) -> ty1 splat(V)
This seems like an obvious fold, which leads to a few improvements.

Reviewed By: david-arm

Differential Revision: https://reviews.llvm.org/D118920
2022-02-09 14:30:01 +00:00
Sanjay Patel
905abc5b7d [SDAG] enable binop identity constant folds for fmul/fdiv
The test diffs are identical to D119111.

This only affects x86 currently because no other target
has an override for the TLI hook that controls this transform.
2022-02-08 10:52:28 -05:00
Sanjay Patel
a68e098024 [SDAG] move x86 select-with-identity-constant fold behind a target hook; NFC
This is no-functional-change-intended because only the
x86 target enables the TLI hook currently.

We can add fmul/fdiv opcodes to the switch similar to the
proposal D119111, but we don't need to make other changes
like enabling target-specific combines.

We can also add integer opcodes (add, or, shl, etc.) to
the switch because this function is called from all of the
generic binary opcodes.

The goal is to incrementally enable the profitable diffs
from D90113 while avoiding regressions.

Differential Revision: https://reviews.llvm.org/D119150
2022-02-08 09:55:05 -05:00
Simon Pilgrim
fd2bb51f1e [ADT] Add APInt/MathExtras isShiftedMask variant returning mask offset/length
In many cases, calls to isShiftedMask are immediately followed with checks to determine the size and position of the bitmask.

This patch adds variants of APInt::isShiftedMask, isShiftedMask_32 and isShiftedMask_64 that return these values as additional arguments.

I've updated a number of cases that were either performing seperate size/position calculations or had created their own local wrapper versions of these.

Differential Revision: https://reviews.llvm.org/D119019
2022-02-08 12:04:13 +00:00
Simon Pilgrim
74555fd367 [DAG] visitINSERT_VECTOR_ELT - break if-else chain as they both return (style). NFC. 2022-02-07 09:58:47 +00:00
Craig Topper
c35ccd2ac8 [DAGCombiner][RISCV] Allow rotates by non-constant to be matched for i32 on riscv64 with Zbb.
rv64izbb has a RORW/ROLW instructions that operate on the lower
32-bits of a 64-bit value and sign extend bit 31 of the result.

DAGCombiner won't match rotate idioms because the i32 type isn't Legal
on riscv64.

This patch teaches DAGCombiner to allow it if the type is going to
be promoted and the target has Custom type legalization for ISD::ROTL
or ISD::ROTR. I've restricted this to scalar types. It doesn't appear
any in tree targets other than riscv64 have custom type legalization
for rotates.

If this patch isn't acceptable, I guess I can match SRLW, SLLW, and OR
after type legalization, but I'd like to avoid that if possible.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D119062
2022-02-06 10:58:12 -08:00
Bjorn Pettersson
cecf11c315 [DAGCombiner] Fold SSHLSAT/USHLSAT to SHL when no saturation will occur
When the shift amount is known and a known sign bit analysis of
the shiftee indicates that no saturation will occur, then we can
replace SSHLSAT/USHLSAT by SHL.

Differential Revision: https://reviews.llvm.org/D118765
2022-02-06 18:59:06 +01:00
Sander de Smalen
6452549f30 [DAGCombiner] Fold vecreduce_or/and if operand is insert_subvector.
Fold:
  vecreduce_or(insert_subvec(zeroinitializer, vec))
  -> vecreduce_or(vec)

  vecreduce_and(insert_subvec(allones, vec))
  -> vecreduce_and(vec)

  vecreduce_and/or(insert_subvec(undef, vec))
  -> vecreduce_and/or(vec)

This is useful for SVE which uses insert/extract subvector
to convert fixed-width to/from scalable vectors.

Reviewed By: bsmith

Differential Revision: https://reviews.llvm.org/D118919
2022-02-05 14:35:53 +00:00
Bjorn Pettersson
3db39e7479 [DAGCombiner] Fix dependency analysis in checkMergeStoreCandidatesForDependencies
In the aftermath of D116895 a problem was found in the analysis of
dependencies between store merge candidates in
checkMergeStoreCandidatesForDependencies, that is needed to avoid
the cycles are introduced in the DAG.

In the past it has been enough (or assumed to be enough) to start
scanning from non-chain operands when analysing the store merge
candidates for dependencies, assuming that the analysis of chain
dependencies performed when finding the candidates would cover
up for potential dependencies that exist involving the chain operands.
It was however discovered that one could end up with scenarios such
as descibed in the aarch64-checkMergeStoreCandidatesForDependencies.ll
test case, when the dependency between two stores is given by a mix
of chain operand dependencies and non-chain operand dependencies.

The fix in this patch make sure that we also account for chain operand
dependencies when doing the more elaborate analysis in
checkMergeStoreCandidatesForDependencies, no longer relying on that
the earlier check involving chain operands is enough.

Differential Revision: https://reviews.llvm.org/D118943
2022-02-04 08:53:01 +01:00
David Green
c89cfbd4dd Revert "[DAG] Extend SearchForAndLoads with any_extend handling"
This reverts commit 100763a88fe97b22cd5e3f69d203669aac3ed48f as it was
making incorrect assumptions about implicit zero_extends.
2022-02-01 20:18:40 +00:00
Bjorn Pettersson
3885879046 [DAGCombine] Add simple folds for SSHLSAT/USHLSAT
Do "simplifyShift" and "FoldConstantArithmetic" folds for the SSHLSAT
and USHLSAT DAG nodes.

This includes folds such as:
  (shlsat undef/poison, x) -> 0
  (shlsat x, undef/poison) -> undef
  (shlsat x, too_large_shamt) -> undef
  (shlsat 0, x) -> 0
  (shlsat x, 0) -> x
  (shlsat c1, c2) -> c3

Differential Revision: https://reviews.llvm.org/D118603
2022-02-01 10:51:35 +01:00
David Sherwood
daa80339df [CodeGen] Support folds of not(cmp(cc, ...)) -> cmp(!cc, ...) for scalable vectors
I have updated TargetLowering::isConstTrueVal to also consider
SPLAT_VECTOR nodes with constant integer operands. This allows the
optimisation to also work for targets that support scalable vectors.

Differential Revision: https://reviews.llvm.org/D117210
2022-02-01 09:50:00 +00:00
Kazu Hirata
2bea207d26 [CodeGen] Use default member initialization (NFC)
Identified with modernize-use-default-member-init.
2022-01-30 12:32:51 -08:00
Cullen Rhodes
5d089d9a83 [DAGCombiner] Fix invalid size request in combineRepeatedFPDivisors
If we have a vector FP division with a splatted divisor, use
getVectorMinNumElements when scaling the num of uses by splat factor.

For AArch64 the combine kicks in for the <vscale x 4 x float> case since it's
above the fdiv threshold (3) when scaling num uses by splat factor, but the
codegen is worse (splat + vector fdiv + vector fmul) than the <vscale x 2 x
double> case (splat + vector fdiv).

If the combine could be converted into a scalar FP division by
scalarizeBinOpOfSplats it may be cheaper, but it looks like this is predicated
on the isExtractVecEltCheap TLI function which is implemented for x86 but not
AArch64. Perhaps for now combineRepeatedFPDivisors should only scale num uses
by splat if the division can be converted into scalar op.

Reviewed By: sdesmalen

Differential Revision: https://reviews.llvm.org/D118343
2022-01-28 17:01:08 +00:00
alex-t
5157f984ae [AMDGPU] Enable divergence-driven XNOR selection
Currently not (xor_one_use) pattern is always selected to S_XNOR irrelative od the node divergence.
This relies on further custom selection pass which converts to VALU if necessary and replaces with V_NOT_B32 ( V_XOR_B32)
on those targets which have no V_XNOR.
Current change enables the patterns which explicitly select the not (xor_one_use) to appropriate form.
We assume that xor (not) is already turned into the not (xor) by the combiner.

Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D116270
2022-01-26 15:33:10 +03:00
David Green
57356d6bb7 [DAG] Create fptoui.sat from clamped fptoui
This is the unsigned variant of D111976, where we convert a clamped
fptoui to a fptoui.sat. Because we are unsigned, the condition this time
is only UMIN of UINT_MAX. Similarly to D111976 it handles ISD::UMIN,
ISD::SETCC/ISD::SELECT, ISD::VSELECT or ISD::SELECT_CC nodes.

This especially helps on ARM/AArch64 where the vcvt instructions
naturally saturate the result.

Differential Revision: https://reviews.llvm.org/D114964
2022-01-26 08:37:44 +00:00
Simon Pilgrim
15e2be291f [DAG] visitMULHS/MULHU/AND - remove some redundant LHS constant checks
Now that we constant fold and canonicalize constants to the RHS, we don't need to check both LHS and RHS for specific constants
2022-01-25 11:54:23 +00:00
Bjorn Pettersson
109cc5adcc [DAGCombine] Fold SRA of a load into a narrower sign-extending load
An sra is basically sign-extending a narrower value. Fold away the
shift by doing a sextload of a narrower value, when it is legal to
reduce the load width accordingly.

Differential Revision: https://reviews.llvm.org/D116930
2022-01-25 12:14:48 +01:00
Paweł Bylica
9d32847b33
[DAGCombine] Remove unused param in combineCarryDiamond(). NFC 2022-01-24 20:57:00 +01:00
Craig Topper
a43ed49f5b [DAGCombiner][RISCV] Canonicalize (bswap(bitreverse(x))->bitreverse(bswap(x)).
If the bitreverse gets expanded, it will introduce a new bswap. By
putting a bswap before the bitreverse, we can ensure it gets cancelled
out when this happens.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D118012
2022-01-24 08:31:53 -08:00
Bjorn Pettersson
46cacdbb21 [DAGCombiner] Adjust some checks in DAGCombiner::reduceLoadWidth
In code review for D117104 two slightly weird checks were found
in DAGCombiner::reduceLoadWidth. They were typically checking
if BitsA was a mulitple of BitsB by looking at (BitsA & (BitsB - 1)),
but such a comparison actually only make sense if BitsB is a power
of two.

The checks were related to the code that attempted to shrink a load
based on the fact that the loaded value would be right shifted.

Afaict the legality of the value types is checked later (typically in
isLegalNarrowLdSt), so the existing checks were both overly
conservative as well as being wrong whenever ExtVTBits wasn't a
power of two. The latter was a situation triggered by a number of
lit tests so we could not just assert on ExtVTBIts being a power of
two).

When attempting to simply remove the checks I found some problems,
that seems to have been guarded by the checks (maybe just out of
luck). A typical example would be a pattern like this:

  t1 = load i96* ptr
  t2 = srl t1, 64
  t3 = truncate t2 to i64

When DAGCombine is visiting the truncate reduceLoadWidth is called
attempting to narrow the load to 64 bits (ExtVT := MVT::i64). Then
the SRL is detected and we set ShAmt to 64.

In the past we've bailed out due to i96 not being a multiple of 64.
If we simply remove that check then we would end up replacing the
load with a new load that would read 64 bits but with a base pointer
adjusted by 64 bits. So we would read 32 bits the wasn't accessed by
the original load.
This patch will instead utilize the fact that the logical left shift
can be folded away by using a zextload. Thus, the pattern above will
now be combined into

  t3 = load i32* ptr+offset, zext to i64


Another case is shown in the X86/shift-folding.ll test case:

  t1 = load i32* ptr
  t2 = srl i32 t1, 8
  t3 = truncate t2 to i16

In the past we bailed out due to the shift count (8) not being a
multiple of 16. Now the narrowing kicks in and we get

  t3 = load i16* ptr+offset

Differential Revision: https://reviews.llvm.org/D117406
2022-01-24 12:22:04 +01:00
David Green
b27e5459d5 [DAG] Convert truncstore(extend(x)) back to store(x)
Pulled out of D106237, this folds truncstore(extend(x)) back to store(x)
if the original store was legal. This can come up due to the order we
fold nodes. A fold from X86 needs to be adjusted to prevent infinite
loops, to have it pick the operand of a trunc more directly.

Differential Revision: https://reviews.llvm.org/D117901
2022-01-22 13:20:36 +00:00
David Green
100763a88f [DAG] Extend SearchForAndLoads with any_extend handling
This extends the code in SearchForAndLoads to be able to look through
ANY_EXTEND nodes, which can be created from mismatching IR types where
the AND node we begin from only demands the low parts of the register.
That turns zext and sext into any_extends as only the low bits are
demanded. To be able to look through ANY_EXTEND nodes we need to handle
mismatching types in a few places, potentially truncating the mask to
the size of the final load.

Recommitted with a more conservative check for the type of the extend.

Differential Revision: https://reviews.llvm.org/D117457
2022-01-18 21:03:08 +00:00
Hans Wennborg
f4615feaa1 Revert "[DAG] Extend SearchForAndLoads with any_extend handling"
This caused builds to fail with

  llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp:5638:
  bool (anonymous namespace)::DAGCombiner::BackwardsPropagateMask(llvm::SDNode *):
  Assertion `NewLoad && "Shouldn't be masking the load if it can't be narrowed"' failed.

See the code review for a link to a reproducer.

> This extends the code in SearchForAndLoads to be able to look through
> ANY_EXTEND nodes, which can be created from mismatching IR types where
> the AND node we begin from only demands the low parts of the register.
> That turns zext and sext into any_extends as only the low bits are
> demanded. To be able to look through ANY_EXTEND nodes we need to handle
> mismatching types in a few places, potentially truncating the mask to
> the size of the final load.
>
> Differential Revision: https://reviews.llvm.org/D117457

This reverts commit 578008789fd061a88ce47dac6ff627001b404348.
2022-01-18 10:50:55 +01:00
David Green
578008789f [DAG] Extend SearchForAndLoads with any_extend handling
This extends the code in SearchForAndLoads to be able to look through
ANY_EXTEND nodes, which can be created from mismatching IR types where
the AND node we begin from only demands the low parts of the register.
That turns zext and sext into any_extends as only the low bits are
demanded. To be able to look through ANY_EXTEND nodes we need to handle
mismatching types in a few places, potentially truncating the mask to
the size of the final load.

Differential Revision: https://reviews.llvm.org/D117457
2022-01-17 15:25:11 +00:00
Bjorn Pettersson
9f237c9e7d [DAGCombine] Refactor DAGCombiner::ReduceLoadWidth. NFCI
Update code comments in DAGCombiner::ReduceLoadWidth and refactor
the handling of SRL a bit. The refactoring is done with the intent
of adding support for folding away SRA by using SEXTLOAD in a
follow-up patch.

The function is also renamed as DAGCombiner::reduceLoadWidth.

Differential Revision: https://reviews.llvm.org/D117104
2022-01-16 20:24:52 +01:00
Nadav Rotem
e2cc091a7d Fix a missed opportunity to merge stores.
This commit fixes a missed opportunity in merging consecutive stores.
The code that searches for stores skipped the case of stores that
directly connect to the root. The comment above the implementation lists
this case but the code did not handle it. I found this pattern when
looking into the shared_ptr destructor. GCC generates the right
sequence. Here is a small repo:

    int foo(int* buff) {
        buff[0] = 0;
        int x = buff[1];
        buff[1] = 0;
        return x;
    }

Differential Revision: https://reviews.llvm.org/D116895
2022-01-10 13:49:02 -08:00
Craig Topper
cbcbbd6ac8 [ValueTracking][SelectionDAG] Rename ComputeMinSignedBits->ComputeMaxSignificantBits. NFC
This function returns an upper bound on the number of bits needed
to represent the signed value. Use "Max" to match similar functions
in KnownBits like countMaxActiveBits.

Rename APInt::getMinSignedBits->getSignificantBits. Keeping the old
name around to keep this patch size down. Will do a bulk rename as
follow up.

Rename KnownBits::countMaxSignedBits->countMaxSignificantBits.

Reviewed By: lebedev.ri, RKSimon, spatel

Differential Revision: https://reviews.llvm.org/D116522
2022-01-03 11:33:30 -08:00