Try to remove extra bitcasts around logicops if we're dealing with illegal types
Fixes the regressions in D145939
Differential Revision: https://reviews.llvm.org/D146032
[DAGCombiner] handle more store value forwarding
When lowering calls on targets like PPC, some stack loads
are generated for by-value parameters. The CALLSEQ_START node
prevents such loads from being combined.
Suggested by @RolandF, this patch removes the unnecessary
loads for the byval parameter by extending ForwardStoreValueToDirectLoad.
Reviewed By: nemanjai, RolandF
Differential Revision: https://reviews.llvm.org/D138899
Try to more aggressively narrow masks of extended values.
This is mainly for cases where the mask is trying to zero out any_extended upper bits, assuming we can zext/trunc the values for free.
This catches a few actual missed folds, as well as helps canonicalize a number of other cases which were being caught in isel etc.
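As a rough illustration of why such a mask can be narrowed, the sketch below models an any_extend from 8 to 32 bits as "low bits are the value, upper bits are unspecified" and checks that masking with a constant whose upper 24 bits are zero matches zero-extending the narrowly masked value. The helper names (`any_extend`, `zext`, `wide_masked`, `narrowed`) and the 8/32-bit widths are assumptions made for this example, not the patch's actual code.

```python
import random

NARROW_BITS = 8
NARROW_MASK = (1 << NARROW_BITS) - 1
WIDE_MASK = (1 << 32) - 1

def any_extend(x, junk):
    """Model any_extend i8 -> i32: low bits are x, upper bits are unspecified (junk)."""
    return ((junk << NARROW_BITS) | (x & NARROW_MASK)) & WIDE_MASK

def zext(x):
    """Zero-extend an i8 value to i32."""
    return x & NARROW_MASK

def wide_masked(x, junk, mask):
    """Original form: and(any_extend(x), mask) in the wide type."""
    return any_extend(x, junk) & mask

def narrowed(x, mask):
    """Narrowed form: zext(and(x, trunc(mask)))."""
    return zext(x & (mask & NARROW_MASK))

def check(samples=2000, seed=0):
    """Compare both forms on random inputs with masks that clear the upper bits."""
    rng = random.Random(seed)
    for _ in range(samples):
        x = rng.randrange(1 << NARROW_BITS)
        junk = rng.randrange(1 << 24)
        mask = rng.randrange(1 << NARROW_BITS)  # upper 24 bits are zero
        if wide_masked(x, junk, mask) != narrowed(x, mask):
            return False
    return True
```

Because the mask discards every bit the any_extend left unspecified, the result does not depend on `junk`, which is what makes the narrow form equivalent.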
Differential Revision: https://reviews.llvm.org/D145866
This patch resolves suboptimal code generation reported by https://github.com/llvm/llvm-project/issues/60571 .
DAGCombiner currently converts `(x or/xor const) + y` to `(x + y) + const` if this is valid.
However, if `.. + const` is broken down into a sequence of adds with carries, the benefit is not clear: the conversion introduces two more add(-with-carry) ops (6 in total) in the case of the reported issue, whereas the optimal sequence needs only 4 add(-with-carry)s.
This patch resolves this issue by allowing this conversion only when (1) `.. + const` is legal or promotable, or (2) `const` is a sign bit because it does not introduce more adds.
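The arithmetic behind the conversion can be checked with a small sketch. It assumes that "valid" means `x` and `const` share no set bits (so the `or` behaves like an add), and it treats the sign-bit `xor` case separately; the 8-bit width and the helper names are illustrative assumptions, not taken from the patch.

```python
BITS = 8
MASK = (1 << BITS) - 1
SIGN_BIT = 1 << (BITS - 1)

def add(a, b):
    """Wrapping add at BITS width."""
    return (a + b) & MASK

def or_form_holds(x, y, c):
    """(x | c) + y == (x + y) + c; expected when x and c share no set bits."""
    return add(x | c, y) == add(add(x, y), c)

def xor_sign_form_holds(x, y):
    """(x ^ SIGN_BIT) + y == (x + y) + SIGN_BIT; flipping the top bit is a wrapping add."""
    return add(x ^ SIGN_BIT, y) == add(add(x, y), SIGN_BIT)

def check_all():
    """Exhaustively check both identities over all 8-bit x and y."""
    for x in range(1 << BITS):
        for y in range(1 << BITS):
            if not xor_sign_form_holds(x, y):
                return False
            for c in (0x01, 0x10, 0x80, 0xF0):
                if x & c == 0 and not or_form_holds(x, y, c):
                    return False
    return True
```

For example, `or_form_holds(0x01, 0, 0x01)` is false because `x` and `const` overlap, which is the case the validity check has to exclude.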
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D144116
The merged store touches memory for other underlying objects, so mapping
the merged store to the first underlying object is not correct. For example
in https://github.com/llvm/llvm-project/issues/60744, the merged store is
not correctly analyzed as dependent with memory operations which are also
part of the merged store.
Fixes #60744
Reviewed By: foad
Differential Revision: https://reviews.llvm.org/D144711
This is generally done by InstCombine, but can be emitted as an
intermediate step and is cheap to handle.
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D145177
This is generally done by InstCombine, but can be emitted as an
intermediate step and is cheap to handle.
Differential Revision: https://reviews.llvm.org/D145143
This is guarding a check for isTypeLegal, so it should check
LegalTypes.
Fixes PR61111.
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D145139
This patch adds AArch64 CodeGen support such that the type can be passed
and returned to/from functions, and also adds support to use this type in
load/store operations and PHI nodes.
Reviewed By: paulwalker-arm
Differential Revision: https://reviews.llvm.org/D136862
To make legalization easier, the operands and outputs have the same size for
these ISD Nodes. When legalizing the results in SplitVectorResult the operands
are legalized to the same size as the outputs.
The ISD Node has two output/results, therefore the legalizing functions update
both results/outputs.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D144744
`(and/or (icmp eq/ne A,C0), (icmp eq/ne A,C1))` can be lowered to
`(icmp eq/ne (and (sub A, (smin C0, C1)), (not (sub (smax C0, C1), (smin C0, C1)))), 0)`
generically if `(sub (smax C0, C1), (smin C0,C1))` is a power of 2.
This covers the existing case of `(and/or (icmp eq/ne A, C_Pow2),(icmp eq/ne A, -C_Pow2))`
as well as other cases.
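The `or`/`eq` form of this fold can be brute-forced at a small bit width. The sketch below uses 6-bit values to keep the exhaustive loop cheap; the width and the helper names (`folded_eq`, `original_or_eq`) are assumptions made for the example.

```python
BITS = 6
MASK = (1 << BITS) - 1

def to_signed(v):
    """Interpret a BITS-wide pattern as a signed integer."""
    v &= MASK
    return v - (1 << BITS) if v & (1 << (BITS - 1)) else v

def is_pow2(v):
    return v != 0 and v & (v - 1) == 0

def folded_eq(a, c0, c1):
    """(icmp eq (and (sub A, smin), (not (sub smax, smin))), 0).

    Returns None when (smax - smin) is not a power of 2, i.e. the fold does not apply.
    """
    lo, hi = sorted((to_signed(c0), to_signed(c1)))
    diff = (hi - lo) & MASK
    if not is_pow2(diff):
        return None
    return (((a - lo) & MASK) & (~diff & MASK)) == 0

def original_or_eq(a, c0, c1):
    """(or (icmp eq A, C0), (icmp eq A, C1))."""
    return (a & MASK) == (c0 & MASK) or (a & MASK) == (c1 & MASK)

def check_all():
    """Compare the folded and original forms for every applicable constant pair."""
    for c0 in range(1 << BITS):
        for c1 in range(1 << BITS):
            for a in range(1 << BITS):
                folded = folded_eq(a, c0, c1)
                if folded is not None and folded != original_or_eq(a, c0, c1):
                    return False
    return True
```

The `and`/`ne` form is the logical negation of this result, so the same check covers it.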
Alive2 Links:
EQ: https://alive2.llvm.org/ce/z/mLJiUW
NE: https://alive2.llvm.org/ce/z/TKnzUr
Differential Revision: https://reviews.llvm.org/D144283
65420c8041f4 introduced an ICE in combineMinNumMaxNum(...) when
combineMinNumMaxNumImpl(...) returns an SDValue(). Make sure to check that a
value is returned before trying to perform an FNEG on it.
GitHub Issue: #60924
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D144571
This partially reverts a regression introduced in 8f25e382c5b1 for
AArch64 targets. In particular, we restore the logic of `(abs (sub nsw
x, y)) -> abds(x, y)` for all targets except X86, which keeps the logic
introduced in 8f25e382c5b1. See also https://reviews.llvm.org/D142288.
Differential Revision: https://reviews.llvm.org/D144379
Some TargetLowering functions that need opcodes are often used in DAGCombiner.
This patch makes the MatchContextClass classes hold TargetLowering members and
pass the specific opcodes to those TargetLowering functions.
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D144075
Recommit bbdf24357932b064f2aa18ea1356b474e0220dde.
Original commit message:
If a chain of two selects share a true/false value and are controlled
by two setcc nodes, that are never both true, we can fold away one of
the selects. So, the following:
(select (setcc X, const0, eq), Y,
(select (setcc X, const1, eq), Z, Y))
Can be combined to:
select (setcc X, const1, eq) Z, Y
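A minimal sketch of why the fold holds, modelling `select` and `setcc ... eq` with plain Python comparisons. The function names are illustrative, and the check assumes `const0 != const1`, which is what guarantees that the two setcc nodes are never both true.

```python
def select(cond, true_val, false_val):
    return true_val if cond else false_val

def chained(x, const0, const1, y, z):
    """(select (setcc X, const0, eq), Y, (select (setcc X, const1, eq), Z, Y))"""
    return select(x == const0, y, select(x == const1, z, y))

def folded(x, const1, y, z):
    """select (setcc X, const1, eq) Z, Y"""
    return select(x == const1, z, y)

def check_all(values=range(-3, 4)):
    """Compare both forms for every x and every pair of distinct constants."""
    for x in values:
        for c0 in values:
            for c1 in values:
                if c0 == c1:
                    continue  # the two setccs must never both be true
                if chained(x, c0, c1, "Y", "Z") != folded(x, c1, "Y", "Z"):
                    return False
    return True
```

When `const0 == const1` the two forms disagree for `x == const0` (the chain yields Y, the folded form yields Z), so the "never both true" condition is required.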
Differential Revision: https://reviews.llvm.org/D142535
This doesn't make sense as an option. fneg and fabs are bit
preserving by definition. If a target has some fneg or fabs
instructions that are not bit-preserving, it's incorrect to lower
fneg/fabs to use them.
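To make "bit preserving" concrete, the sketch below treats fneg and fabs on 32-bit floats as pure sign-bit operations on the underlying integer pattern. The helper names are assumptions made for illustration.

```python
import struct

SIGN = 0x80000000

def f32_bits(x):
    """Bit pattern of a Python float rounded to IEEE-754 binary32."""
    return struct.unpack("<I", struct.pack("<f", x))[0]

def bits_f32(b):
    """binary32 value for a 32-bit pattern."""
    return struct.unpack("<f", struct.pack("<I", b))[0]

def fneg_bits(b):
    """fneg: flip the sign bit, leave exponent and mantissa untouched."""
    return b ^ SIGN

def fabs_bits(b):
    """fabs: clear the sign bit, leave exponent and mantissa untouched."""
    return b & ~SIGN & 0xFFFFFFFF
```

Working on the bit pattern directly means NaN payloads and signed zeros pass through unchanged apart from the sign bit, which is the property the message relies on.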
Requiring a bitcast to exist was unhelpful. The most basic cases
are always going to be a CopyFromReg or load, so they would need
a new cast inserted. Don't require a bitcast if it's a free
operation. I don't think this logic makes particularly much sense
(it seems to be imparting special interpretation of bitcast), but
this needs to be in sync with foldSignChangeInBitcast.
We should also get rid of this hasBitPreservingFPLogic hook. fabs/fneg
are bitpreserving or incorrectly implemented, so this should just be a
regular legality check.
While working on D143731 I hit a case where a build_vector with 2 undef operands could be generated (with one undef hidden behind a bitcast).
That made `reduceBuildVecTruncToBitCast` crash because it seems to assume there is at least one good operand.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D143886
Remove a dead masked store if another one has the same base pointer and mask, or
if the following store has an all-true constant mask and its size is equal to or
bigger than that of the first store.
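The two conditions can be sketched with a toy memory model. This is an illustration only: stores are `(base, values, mask)` tuples, the stores are assumed to be back to back with nothing reading memory in between, and the second condition is assumed to also require the same base pointer. None of the names come from the patch.

```python
def masked_store(mem, base, values, mask):
    """Write values[i] to mem[base + i] for each lane i whose mask bit is set."""
    for i, (v, m) in enumerate(zip(values, mask)):
        if m:
            mem[base + i] = v

def first_store_is_dead(first, second):
    """Decide whether `first` is fully overwritten by the following `second` store."""
    base1, vals1, mask1 = first
    base2, vals2, mask2 = second
    if base1 != base2:
        return False
    same_mask = list(mask1) == list(mask2)
    covers = all(mask2) and len(vals2) >= len(vals1)
    return same_mask or covers

def run(stores, size=8):
    """Apply the stores in order to zero-initialised memory and return the result."""
    mem = [0] * size
    for base, values, mask in stores:
        masked_store(mem, base, values, mask)
    return mem
```

Whenever `first_store_is_dead` returns true, `run([first, second])` and `run([second])` leave memory in the same state, which is what makes the first store removable.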
Differential Revision: https://reviews.llvm.org/D143069
One of the cleanups necessary for D136529 - another being how we're going to handle moving freeze through multiple result nodes (like uaddo and subcarry)
So long as the operation is reassociative, we can reassociate the double
vecreduce from for example fadd(vecreduce(a), vecreduce(b)) to
vecreduce(fadd(a,b)). This will in general save a few instructions, but some
architectures (MVE) require the opposite fold, so a shouldExpandReduction is
added to account for it. Only targets that use shouldExpandReduction will be
affected.
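A small sketch of the reassociation, using integer addition so that both forms agree exactly. The function names are illustrative; with floating-point adds the two forms can round differently, which is why the fold is only applied when reassociation is permitted.

```python
from functools import reduce
import operator

def vecreduce_add(v):
    """Horizontal add of all lanes of a vector."""
    return reduce(operator.add, v, 0)

def separate(a, b):
    """add(vecreduce(a), vecreduce(b)): two reductions, then one scalar add."""
    return vecreduce_add(a) + vecreduce_add(b)

def combined(a, b):
    """vecreduce(add(a, b)): one lane-wise vector add, then a single reduction."""
    return vecreduce_add([x + y for x, y in zip(a, b)])
```

The combined form replaces one full reduction with a single vector add, which is where the saving of a few instructions comes from.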
Differential Revision: https://reviews.llvm.org/D141870
The patch tries to eliminate duplicated combine work for vp SDNodes. The idea is to
introduce a MatchContext that verifies specific patterns and generates specific node
information. There are two MatchContexts in DAGCombiner: EmptyMatcher is for
normal nodes and VPMatcher is for vp nodes.
The idea of this patch comes from Simon Moll's proposal [0]. I only fixed some
minor issues and added a few new features in this patch.
[0]: c38a14484a
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D141891
Add support for removing a dead store for scalable types. Avoid removing a
scalable-type store in favor of a fixed-type store, since the size of a scalable
type is unknown at compile time.
Differential Revision: https://reviews.llvm.org/D142100
I'm intending to add generic legalization in the future, but for now I've added basic support to targets that have the necessary MIN/MAX support to expand to SUB(MAX(X,Y),MIN(X,Y)).
This exposed a couple of issues with the DAG combines - in particular we need to catch trunc(abs(sub(ext(x),ext(y)))) patterns earlier before the SSE/AVX vector trunc expansion folds trigger.
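The SUB(MAX(X,Y),MIN(X,Y)) expansion mentioned above can be checked exhaustively at 8 bits against the absolute difference computed without overflow, which is also what a trunc(abs(sub(ext(x),ext(y)))) pattern computes. The names `abdu` and `abds` and the 8-bit width are assumptions made for this sketch.

```python
BITS = 8
MASK = (1 << BITS) - 1

def to_signed(v):
    """Interpret an 8-bit pattern as a signed integer."""
    v &= MASK
    return v - (1 << BITS) if v & (1 << (BITS - 1)) else v

def abdu(x, y):
    """Unsigned absolute difference via SUB(UMAX(x, y), UMIN(x, y))."""
    return (max(x, y) - min(x, y)) & MASK

def abds(x, y):
    """Signed absolute difference via SUB(SMAX(x, y), SMIN(x, y)), as an 8-bit pattern."""
    sx, sy = to_signed(x), to_signed(y)
    return (max(sx, sy) - min(sx, sy)) & MASK

def check_all():
    """Compare against abs() of the difference taken in unbounded precision."""
    for x in range(1 << BITS):
        for y in range(1 << BITS):
            if abdu(x, y) != abs(x - y):
                return False
            if abds(x, y) != abs(to_signed(x) - to_signed(y)) & MASK:
                return False
    return True
```

Note that the signed result can need all 8 bits as an unsigned pattern: the difference between -128 and 127 is 255.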
Differential Revision: https://reviews.llvm.org/D142288
If a chain of two selects share a true/false value and are controlled
by two setcc nodes, that are never both true, we can fold away one of
the selects. So, the following:
(select (setcc X, const0, eq), Y,
(select (setcc X, const1, eq), Z, Y))
Can be combined to:
select (setcc X, const1, eq) Z, Y
Differential Revision: https://reviews.llvm.org/D142535