3171 Commits

Author SHA1 Message Date
Guozhi Wei
40c08367e4 [DAGCombiner] Add command line options to guard store width reduction
optimizations

As discussed in the thread http://lists.llvm.org/pipermail/llvm-dev/2020-May/141838.html,
some bit field access width can be reduced by ReduceLoadOpStoreWidth, some
can't. If two accesses are very close, and the first access width is reduced,
the second is not. Then the wide load of second access will be stalled for long
time.

This patch add command line options to guard ReduceLoadOpStoreWidth and
ShrinkLoadReplaceStoreWithStore, so users can use them to disable these
store width reduction optimizations.

Differential Revision: https://reviews.llvm.org/D80745
2020-05-29 09:41:41 -07:00
Sanjay Patel
21dadd774f [DAGCombiner] avoid unnecessary indirection from SDNode/SDValue; NFCI 2020-05-29 09:31:52 -04:00
Florian Hahn
d20a3d35e1 [DAGComb] Do not turn insert_elt into shuffle for single elt vectors.
Currently combineInsertEltToShuffle turns insert_vector_elt into a
vector_shuffle, even if the inserted element is a vector with a single
element. In this case, it should be unlikely that the additional shuffle
would be more efficient than a insert_vector_elt.

Additionally, this fixes a infinite cycle in DAGCombine, where
combineInsertEltToShuffle turns a insert_vector_elt into a shuffle,
which gets turned back into a insert_vector_elt/extract_vector_elt by
a custom AArch64 lowering (in visitVECTOR_SHUFFLE).

Such insert_vector_elt and extract_vector_elt combinations can be
lowered efficiently using mov on AArch64.

There are 2 test changes in arm64-neon-copy.ll: we now use one or two
mov instructions instead of a single zip1. The reason that we need a
second mov in ins1f2 is that we have to move the result to the result
register and is not really related to the DAGCombine fold I think.
But in any case, on most uarchs, mov should be cheaper than zip1. On a
Cortex-A75 for example, zip1 is twice as expensive as mov
(https://developer.arm.com/docs/101398/latest/arm-cortex-a75-software-optimization-guide-v20)

Reviewers: spatel, efriedma, dmgreen, RKSimon

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D80710
2020-05-29 13:21:13 +01:00
David Sherwood
b147b88c84 [CodeGen] Add support for extracting elements of scalable vectors
I have tried to ensure that SelectionDAG and DAGCombiner do
sensible things for scalable vectors, and added support for a
limited number of simple folds. Codegen support for the vector
extract patterns have also been added to the AArch64 backend.

New vector extract tests have been added here:

  CodeGen/AArch64/sve-extract-element.ll

and I have also added new folds using inserts and extracts here:

  CodeGen/AArch64/sve-insert-element.ll

Differential Revision: https://reviews.llvm.org/D80208
2020-05-29 07:49:43 +01:00
Sanjay Patel
f368040c14 [DAGCombiner] try to move splat after binop with splat constant
binop (splat X), (splat C) --> splat (binop X, C)
binop (splat C), (splat X) --> splat (binop C, X)

We do this in IR, and there's a similar fold for the case with 2
non-constant operands just above the code diff in this patch.

This was discussed in D79718, and the extra shuffle in the test
(llvm/test/CodeGen/X86/vector-fshl-128.ll::sink_splatvar) where it
was noticed disappears because demanded elements analysis is no
longer blocked. The large majority of the test diffs seem to be
benign code scheduling changes, but I do see another type of win:
moving the splat later allows binop narrowing in some cases.

Regressions were avoided on x86 and ARM with the INSERT_VECTOR_ELT
restriction.

Differential Revision: https://reviews.llvm.org/D79886
2020-05-26 08:12:46 -04:00
Amy Kwan
b631f86ac5 [TLI][PowerPC] Introduce TLI query to check if MULH is cheaper than MUL + SHIFT
This patch introduces a TargetLowering query, isMulhCheaperThanMulShift.

Currently in DAG Combine, it will transform mulhs/mulhu into a
wider multiply and a shift if the wide multiply is legal.

This TLI function is implemented on 64-bit PowerPC, as it is more desirable to
have multiply-high over multiply + shift for words and doublewords. Having
multiply-high can also aid in further transformations that can be done.

Differential Revision: https://reviews.llvm.org/D78271
2020-05-23 16:47:12 -05:00
Marcello Maggioni
dbaed589ab [SelectionDAG] Add the option of disabling generic combines.
Summary:
For some targets generic combines don't really do much and they
consume a disproportionate amount of time.
There's not really a mechanism in SDISel to tactically disable
combines, but we can have a switch to disable all of them and
let the targets just implement what they specifically need.

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79112
2020-05-21 20:11:29 +00:00
Simon Pilgrim
9d4b4f344d DAGCombiner.cpp - remove non-constant EXTRACT_SUBVECTOR/INSERT_SUBVECTOR handling. NFC.
Now that D79814 has landed, we can assume that subvector ops use constant, in-range indices.
2020-05-15 12:41:35 +01:00
Craig Topper
d1119980e5 [SelectionDAG] Use Align/MaybeAlign for ConstantPoolSDNode.
This patch stores the alignment for ConstantPoolSDNode as an
Align and updates the getConstantPool interface to take a MaybeAlign.

Removing getAlignment() will be done as a follow up.

Differential Revision: https://reviews.llvm.org/D79436
2020-05-08 16:04:11 -07:00
Sanjay Patel
2f1fe1864d [DAGCombiner] sink target-supported FP<->int cast op after concat vectors
Try to combine N short vector cast ops into 1 wide vector cast op:
concat (cast X), (cast Y)... -> cast (concat X, Y...)

This is part of solving PR45794:
https://bugs.llvm.org/show_bug.cgi?id=45794

As noted in the code comment, this is uglier than I was hoping because
the opcode determines whether we pass the source or destination type
to isOperationLegalOrCustom(). Also IIUC, there's no way to validate
what the other (dest or src) type is. Without the extra legality check
on that, there's an ARM regression test in:
test/CodeGen/ARM/isel-v8i32-crash.ll
...that will crash trying to lower an unsupported v8f32 to v8i16.

Differential Revision: https://reviews.llvm.org/D79360
2020-05-06 10:25:58 -04:00
Simon Pilgrim
a09a3c6d3e Revert rG8e05ac0a510c - "[DAGCombine] visitTRUNCATE - remove GetDemandedBits call"
Causing buildbot failures
2020-05-02 20:08:33 +01:00
Simon Pilgrim
8e05ac0a51 [DAGCombine] visitTRUNCATE - remove GetDemandedBits call
rL368553 added SimplifyMultipleUseDemandedBits handling for ISD::TRUNCATE to SimplifyDemandedBits so we don't need to duplicate this (and it gets rid of another GetDemandedBits call which is slowly being replaced with SimplifyMultipleUseDemandedBits anyhow).
2020-05-02 19:52:17 +01:00
Craig Topper
6a1ad76dab [X86] Don't return true from isTruncateFree for vectors
Also fix some cost tables for vXi1 types to match the costs entries for the types they will be promoted to.

Differential Revision: https://reviews.llvm.org/D79045
2020-04-30 16:43:35 -07:00
Simon Pilgrim
96238486ed [DAGCombine] Move the remaining X86 funnel shift patterns to DAGCombine
X86 matches several 'shift+xor' funnel shift patterns:

  fold (or (srl (srl x1, 1), (xor y, 31)), (shl x0, y))  -> (fshl x0, x1, y)
  fold (or (shl (shl x0, 1), (xor y, 31)), (srl x1, y))  -> (fshr x0, x1, y)
  fold (or (shl (add x0, x0), (xor y, 31)), (srl x1, y)) -> (fshr x0, x1, y)

These patterns are also what we end up with the proposed expansion changes in D77301.

This patch moves these to DAGCombine's generic MatchFunnelPosNeg.

All existing X86 test cases still pass, and we just have a small codegen change in pr32282.ll.

Reviewed By: @spatel

Differential Revision: https://reviews.llvm.org/D78935
2020-04-30 12:57:17 +01:00
Simon Pilgrim
6547a5ceb2 [DAG] Add TODO comment regarding ADD(X,X) -> SHL(X,1) canonicalization
As discussed on D78935
2020-04-30 12:57:16 +01:00
David Sherwood
058cd8c5be [CodeGen] Add support for inserting elements into scalable vectors
Summary:
This patch tries to ensure that we do something sensible when
generating code for the ISD::INSERT_VECTOR_ELT DAG node when operating
on scalable vectors. Previously we always returned 'undef' when
inserting an element into an out-of-bounds lane index, whereas now
we only do this for fixed length vectors. For scalable vectors it
is assumed that the backend will do the right thing in the same way
that we have to deal with variable lane indices.

In this patch I have permitted a few basic combinations for scalable
vector types where it makes sense, but in general avoided most cases
for now as they currently require the use of BUILD_VECTOR nodes.

This patch includes tests for all scalable vector types when inserting
into lane 0, but I've only included one or two vector types for other
cases such as variable lane inserts.

Differential Revision: https://reviews.llvm.org/D78992
2020-04-30 11:14:04 +01:00
QingShan Zhang
b5f89744cc [DAGCombine] Checking the cost directly to improve the code readability
Call getNegatedExpression(Cost) and check the Cost to make the code more clear.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D78347
2020-04-29 01:49:39 +00:00
QingShan Zhang
2957fa0cd1 [NFC][DAGCombine] Adding three helper functions and change the getNegatedExpression to negateExpression
This is a NFC patch for D77319. The idea is to hide the getNegatibleCost inside the getNegatedExpression()
to have it return null if the cost is expensive, and add some helper function for easy to use. And
rename the old getNegatedExpression to negateExpression to avoid the semantic conflict.

Reviewed By: RKSimon

Differential revision: https://reviews.llvm.org/D78291
2020-04-27 04:11:42 +00:00
Matt Arsenault
e6605a209c DAG: Fix wrong legality check for ISD::FMAD
Since 1725f2884175ca618d29b06e35f5c6ebd618053d, this should check
isFMADLegalForFAddFSub rather than the the plain isOperationLegal.

This would assert in a subset of cases due to an oddity in how FMAD is
selected. We will allow FMA formation pre-legalize, but not FMAD even
in cases where it would be valid.

The current hook requires passing in the root fadd/fsub. However, in
this distributed case, this would be far more complicated to pass in
the relevant operand. AMDGPU doesn't get any value from the node, and
only needs the type and is the only implementor, so I'm not sure why
we have this complexity. Just rename and expand the assert to avoid
the more complicated checks spread through the distribution logic.
2020-04-13 10:25:39 -07:00
Sanjay Patel
1318ddbc14 [VectorUtils] rename scaleShuffleMask to narrowShuffleMaskElts; NFC
As proposed in D77881, we'll have the related widening operation,
so this name becomes too vague.

While here, change the function signature to take an 'int' rather
than 'size_t' for the scaling factor, add an assert for overflow of
32-bits, and improve the documentation comments.
2020-04-11 10:05:49 -04:00
Craig Topper
c41685b16f [SelectionDAG] Make getZeroExtendInReg take a vector VT if the operand VT is a vector.
This removes a call to getScalarType from a bunch of call sites.
It also makes the behavior consistent with SIGN_EXTEND_INREG.

Differential Revision: https://reviews.llvm.org/D77631
2020-04-07 11:34:08 -07:00
Pierre-vh
4fc59a468f Revert "[CodeGen][SelectionDAG] Flip Booleans More Often"
This reverts commit 23342bdcc888835e744f38a2fcd0a5c651e33a31.
2020-04-07 09:09:10 +01:00
Pierre-vh
23342bdcc8 [CodeGen][SelectionDAG] Flip Booleans More Often
Differential Revision: https://reviews.llvm.org/D77201
2020-04-07 08:19:57 +01:00
Matt Arsenault
70726cec5b DAG: Combine extract_vector_elt of concat_vectors
Fixes extra canonicalize regressions when legalizing
vector fminnum/fmaxnum.
2020-04-06 09:26:29 -04:00
Craig Topper
97e57f3b24 [DAGCombiner] Use getAnyExtOrTrunc instead of getSExtOrTrunc in the zext(setcc) combine.
We're ANDing with 1 right after which will cause the SIGN_EXTEND to
be combined to ANY_EXTEND later. Might as well just start with an
ANY_EXTEND.

While there replace create the AND using the getZeroExtendInReg
helper to remove the need to explicitly create the VecOnes constant.
2020-04-05 22:44:45 -07:00
Craig Topper
586c051a27 [DAGCombiner] Replace a hardcoded constant in visitZERO_EXTEND with a proper check for the condition its trying to protect.
This code is replacing a shift with a new shift on an extended type.
If the shift amount type can't represent the maximum shift amount
for the new type, the amount needs to be extended to a type that
can.

Previously, the code just hardcoded a check for 256 bits which
seems to have been an assumption that the original shift amount
was MVT::i8. But that seems more catered to a specific target
like X86 that uses i8 as its legal shift amount type. Other
targets may use different types.

This commit changes the code to look at the real type of the shift
amount and makes sure it has enough bits for the Log2 of the
new type. There are similar checks to this in SelectionDAGBuilder
and LegalizeIntegerTypes.
2020-04-05 20:35:57 -07:00
Guillaume Chatelet
3a78f44daf [Alignment][NFC] Convert SelectionDAG::InferPtrAlignment to MaybeAlign
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77212
2020-04-01 13:22:11 +00:00
Qiu Chaofan
95bcab8272 [DAGCombiner] Require ninf for sqrt recip estimation
Currently, DAG combiner uses (fmul (rsqrt x) x) to estimate square
root of x. However, this method would return NaN if x is +Inf, which
is incorrect.

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D76853
2020-04-01 16:23:43 +08:00
Craig Topper
f92563f907 [VectorUtils][X86] De-templatize scaleShuffleMask and 2 X86 shuffle mask helpers and move their implementation to cpp files
Summary: These were templated due to SelectionDAG using int masks for shuffles and IR using unsigned masks for shuffles. But now that D72467 has landed we have an int mask version of IRBuilder::CreateShuffleVector. So just use int instead of a template

Reviewers: spatel, efriedma, RKSimon

Reviewed By: efriedma

Subscribers: hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D77183
2020-04-01 00:46:48 -07:00
Nemanja Ivanovic
4821411347 [DAGCombine] Fix splitting indexed loads in ForwardStoreValueToDirectLoad()
In DAGCombiner::visitLOAD() we perform some checks before breaking up an indexed
load. However, we don't do the same checking in ForwardStoreValueToDirectLoad()
which can lead to failures later during combining
(see: https://bugs.llvm.org/show_bug.cgi?id=45301).

This patch just adds the same checks to this function as well.

Fixes: https://bugs.llvm.org/show_bug.cgi?id=45301

Differential revision: https://reviews.llvm.org/D76778
2020-03-27 18:03:47 -05:00
Guillaume Chatelet
74eac9031a [Alignment][NFC] MachineMemOperand::getAlign/getBaseAlign
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: arsenm, dschuff, sdardis, nemanjai, jvesely, nhaehnle, sbc100, jgravelle-google, hiraditya, aheejin, kbarton, jrtc27, atanasyan, jfb, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D76925
2020-03-27 15:49:13 +00:00
Juneyoung Lee
1bcc500b48 [DAGCombine] Add basic optimizations for FREEZE in SelDag
Summary: This patch is the first effort to adding basic optimizations for FREEZE in SelDag.

Reviewers: spatel, lebedev.ri

Reviewed By: spatel

Subscribers: xbolva00, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D76707
2020-03-27 12:20:39 +09:00
Craig Topper
9f7d4150b9 [X86] Move combineLoopMAddPattern and combineLoopSADPattern to an IR pass before SelecitonDAG.
These transforms rely on a vector reduction flag on the SDNode
set by SelectionDAGBuilder. This flag exists because SelectionDAG
can't see across basic blocks so SelectionDAGBuilder is looking
across and saving the info. X86 is the only target that uses this
flag currently. By removing the X86 code we can remove the flag
and the SelectionDAGBuilder code.

This pass adds a dedicated IR pass for X86 that looks across the
blocks and transforms the IR into a form that the X86 SelectionDAG
can finish.

An advantage of this new approach is that we can enhance it to
shrink the phi nodes and final reduction tree based on the zeroes
that we need to concatenate to bring the partially reduced
reduction back up to the original width.

Differential Revision: https://reviews.llvm.org/D76649
2020-03-26 14:10:20 -07:00
Sanjay Patel
0eeee83d75 [VectorUtils] move x86's scaleShuffleMask to generic VectorUtils
We have some long-standing missing shuffle optimizations that could
use this transform via VectorCombine now:
https://bugs.llvm.org/show_bug.cgi?id=35454
(and we still don't get that case in the backend either)

This function is apparently templated because there's existing code
in IR that treats mask values as unsigned and backend code that
treats masks values as signed.

The mask values are not endian-dependent (as shown by the existing
bitcast transform from DAGCombiner).

Differential Revision: https://reviews.llvm.org/D76508
2020-03-23 09:58:55 -04:00
Sam Parker
62fdb1f534 [DAGCombine] Skip PostInc combine with later users
When decided whether to generate a post-inc load/store, look at the
other memory nodes that use the same base address and, if any proceed
the current node, then don't do the combine.
The change only seems to be affecting the Arm backend, which I was
surprised at, but it appears to fix a lot of our issues around MVE
masked load/stores having to store a temporary address after an early
post-increment on a shared base address.

Differential Revision: https://reviews.llvm.org/D75847
2020-03-23 08:39:53 +00:00
Sam Parker
8e45eaf1da [NFC][DAGCombine] Refactor post-inc logic
Extract the decision to combine into a post-inc address into a
couple of functions to make the logic more clear and re-usable.

Differential Revision: https://reviews.llvm.org/D76060
2020-03-23 08:32:20 +00:00
Qiu Chaofan
763871053c [DAGCombiner] Require nsz for aggressive fma fold
For folding pattern `x-(fma y,z,u*v) -> (fma -y,z,(fma -u,v,x))`, if
`yz` is 1, `uv` is -1 and `x` is -0, sign of result would be changed.

Differential Revision: https://reviews.llvm.org/D76419
2020-03-22 23:10:07 +08:00
Simon Pilgrim
c5fd9e3888 [DAG] Don't permit EXTLOAD when combining FSHL/FSHR consecutive loads (PR45265)
Technically we can permit EXTLOAD of the LHS operand but only if all the extended bits are shifted out. Until we test coverage for that case, I'm just disabling this to fix PR45265.
2020-03-21 10:52:41 +00:00
Pirama Arumuga Nainar
edcfb47ff6 [DAGCombiner] Do not fold truncate(build_vector(..)) if it creates an illegal type
Summary:
It can be the case that a vector type is legal but the corresponding
scalar type is not legal for an architecture (i8 vs. v16i8 on AArch64).
Check if the scalar type created when folding
  truncate(build_vector(x,y)) -> build_vector(truncate(x),truncate(y))

is legal if we are running after the type legalizer.

This fixes https://github.com/android/ndk/issues/1207.

Reviewers: RKSimon, srhines

Subscribers: kristof.beyls, hiraditya, danielkiss, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D76312
2020-03-20 09:20:16 -07:00
Bjorn Pettersson
d168b77780 [DAGCombiner] Fix non-determinism problem related to argument evaluation order in visitFDIV
Summary:
For some reason the order in which we call getNegatedExpression
for the involved operands, after a call to isCheaperToUseNegatedFPOps,
seem to matter. This patch includes a new test case in
test/CodeGen/X86/fdiv.ll that crashes if we reverse the order of
those calls. Before this patch that could happen depending on
which compiler that were used when buildind llvm. With my GCC
version (7.4.0) I got the crash, because it seems like it is
using a different order for the argument evaluation compared
to clang.

All other users of isCheaperToUseNegatedFPOps already used this
pattern with unfolded/ordered calls to getNegatedExpression, so
this patch is aligning visitFDIV with the other use cases.

This patch simply deals with the non-determinism for FDIV. While
the underlying problem with getNegatedExpression is discussed
further in D76439.

Reviewers: spatel, RKSimon

Reviewed By: spatel

Subscribers: hiraditya, mgrang, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D76319
2020-03-20 16:11:17 +01:00
QingShan Zhang
d577193c0f [DAGCombine] Respect the uses when combine FMA for a*b+/-c*d
If it is a*b-c*d, it could be also folded into fma(a, b, -c*d) or fma(-c, d, a*b).
This patch is trying to respect the uses of a*b and c*d to make the best choice.

Differential Revision: https://reviews.llvm.org/D75982
2020-03-18 03:34:27 +00:00
Simon Pilgrim
c9656a3b31 [DAGCombiner] matchRotateSub - handle shift amount truncation
Under certain circumstances we'll end up in the position where the negated shift amount will get truncated to the type specified getScalarShiftAmountTy(), so we need to test for a truncated version of the shift amount as well.

This allows us to remove half of the remaining patterns tested for by X86ISelLowering's combineOrShiftToFunnelShift.
2020-03-17 16:01:23 +00:00
Simon Pilgrim
5641804298 [DAG] MatchRotate - Add funnel shift by variable support
Followup to D75114, this patch reuses the existing MatchRotate ROTL/ROTR rotation pattern code to also recognize the more general FSHL/FSHR funnel shift patterns when we have variable shift amounts, matched with MatchFunnelPosNeg which acts in an (almost) equivalent manner to MatchRotatePosNeg.
2020-03-15 11:50:45 +00:00
Brian Cain
ad7b930bd1 Initialize IsFast* values
We must initialize these values in case some targets do not assign to
them in allowsMemoryAccess().
2020-03-13 17:46:32 -05:00
QingShan Zhang
e601196833 [NFC][DAGCombine] Move the fold of a*b-c and a-b*c into lambda function
This will help the review of https://reviews.llvm.org/D75982. It is
a simple code refactor.
2020-03-13 02:35:46 +00:00
Simon Pilgrim
2a2d242017 [DAGCombine] foldVSelectOfConstants - ensure constants are same type
Fix bug identified by https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=21167, foldVSelectOfConstants must ensure that the 2 build vectors have scalars of the same type before trying to compare APInt values.
2020-03-12 20:02:05 +00:00
Simon Pilgrim
d8f9416fdc [DAG] MatchRotate - Add funnel shift by immediate support
This patch reuses the existing MatchRotate ROTL/ROTR rotation pattern code to also recognize the more general FSHL/FSHR funnel shift patterns when we have constant shift amounts.

Differential Revision: https://reviews.llvm.org/D75114
2020-03-11 18:55:18 +00:00
Simon Pilgrim
7202d9cde9 [DAG] Combine fshl/fshr(load1,load0,c) if we have consecutive loads
As noted on D75114, if both arguments of a funnel shift are consecutive loads we are missing the opportunity to combine them into a single load.

Differential Revision: https://reviews.llvm.org/D75624
2020-03-06 11:36:18 +00:00
Sanjay Patel
29a2b20ab3 [SDAG] simplify FP binops to undef
As discussed in the commit thread for rGa253a2a and D73978, we can do more undef folding for FP ops.
The nnan and ninf fast-math-flags specify that if an operand is the disallowed value, the result is
poison, so we can produce an undef result.

But this doesn't work as expected (the undef operand cases remain) because of a Flags propagation
problem in SelectionDAGBuilder.

I've added DAGCombiner calls to enable these for the other cases because we've shown in other
patches that (because of the limited way that SDAG iterates), it is possible to miss simplifications
like this if they are done only at node creation time.

Several potential follow-ups to expand on this patch are possible.

Differential Revision: https://reviews.llvm.org/D75576
2020-03-04 10:42:16 -05:00
Craig Topper
d8ad7cc088 [DAGCombiner][X86] Improve narrowExtractedVectorLoad to handle cases where the element size isn't byte sized by the subvector is.
Summary:
Follow up from D75377. If the subvector is byte sized and the
index is aligned to the subvector size, we can shrink the load.

Reviewers: spatel, RKSimon

Reviewed By: RKSimon

Subscribers: dbabokin, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D75434
2020-03-03 08:41:31 -08:00