52796 Commits

Author SHA1 Message Date
Vettel
cae46f6210
[MCP] Enhance MCP copy Instruction removal for special case (#70778)
Machine Copy Propagation Pass may lose some opportunities to further
remove the redundant copy instructions during the ForwardCopyPropagateBlock
procedure. When we Clobber a "Def" register, we also need to remove the record 
from the copy maps that indicates "Src" defined "Def" to ensure the correct semantics
of the ClobberRegister function.

For more information, please see the C++ test case generated code in 
"vector.body" after the MCP Pass: https://gcc.godbolt.org/z/nK4oMaWv5.
2023-11-22 23:57:42 +08:00
Mircea Trofin
06e113af4c [mlgo] Fix test post 42d484082cd190400e0e493a8d679762ce0efbba
The opcodes got bumped, and an expected 1299 became 1300.
2023-11-22 07:25:04 -08:00
Momchil Velikov
ed5404cd6b
[AArch64] Add quadword gather load/scatter store intrinsics with unscaled vector offset (#71290)
This patch adds intrinsics of the form

sv<type>_t svld1q_gather_u64offset_<typ>(svbool_t pg, const <type>_t *base, svuint64_t offs);
void svst1q_scatter_u64offset_<typ>(svbool_t, <type>_t *base, svuint64_t offst, sv<type>_t data);

as well as their short forms.

ACLE spec: ARM-software/acle#257
2023-11-22 14:38:13 +00:00
CarolineConcatto
55f067f3ba
Revert "Revert "[SVE2.1][Clang][LLVM]Add BFloat16 builtin in Clang and LLVM intrinisc (#70362)"" (#73110)

This reverts commit e1ee0e85104eed2c68b6821d9e5a2066e4154099.

The patch https://github.com/llvm/llvm-project/pull/70362 caused an LLDB
test to fail. The Feature sve2p1 in AArch64InstrInfo.td uses
AssemblerPredicate instead of AssemblerPredicateWithAll. This patch adds
PR #70362 again, with the fix in AArch64InstrInfo.td.
2023-11-22 13:34:51 +00:00
simpal01
74cdb8e6f8
[llvm][ARM] Emit MVE .arch_extension after .fpu directive if it does not include MVE features (#71545)
The floating-point and MVE features together specify the MVE
functionality that is supported on the Cortex-M85 processor. But the FPU
extension for the underlying architecture (armv8.1-m.main) is FPV5, which
does not include MVE-F. So the compiler's -S output and `-save-temps=obj`
output lose the MVE feature, which leads to an assembler error. What happens
here is that the .fpu directive overrides any features previously set by the
.cpu directive. Since the corresponding .fpu generated (.fpu fpv5-d16) does
not include MVE-F, it overrides those features even though they are supported
and set by the .cpu directive. It looks like .fpu is supposed to do this.

In this case, there should be an .arch_extension directive re-enabling
the relevant extensions after .fpu if the goal is to keep these
extensions enabled. GCC also does the same.

So this patch enables the MVE features by emitting the below arch
extension:
  .fpu fpv5-d16
  .arch_extension mve.fp

---------

Co-authored-by: Simi Pallipurath <simi.pallipurath.com>
2023-11-22 09:16:58 +00:00
Craig Topper
9a6452377b [RISCV] Add more Zbs patterns for -riscv-experimental-rv64-legal-i32. 2023-11-21 23:24:00 -08:00
Yeting Kuo
a756a6b97e
[TargetLowering][RISCV] Introduce shouldFoldSelectWithSingleBitTest and RISC-V implement. (#72978)
DAGCombiner folds (select_cc seteq (and x, y), 0, 0, A) to (and (sra
(shl x)) A) where y has a single bit set. Previously, DAGCombiner relied
on `shouldAvoidTransformToShift` to decide when to do the combine, but
`shouldAvoidTransformToShift` is only about shift cost. This patch
introduces a specific hook to decide when to do the combine and disables
the combine when Zicond is enabled and AndMask <= 1024.
2023-11-22 08:22:14 +08:00
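The fold described in the shouldFoldSelectWithSingleBitTest entry above can be checked on plain integers. The following is a minimal sketch, assuming a 32-bit value and a mask `1 << k`; the function names are hypothetical and are not LLVM APIs. The smear step uses `0 - (top >> 31)`, which produces the same all-ones or all-zero word an arithmetic shift right by 31 would.

```cpp
#include <cassert>
#include <cstdint>

// Select form: (x & (1 << k)) == 0 ? 0 : a, for a 32-bit x and 0 <= k < 32.
uint32_t select_form(uint32_t x, unsigned k, uint32_t a) {
  return (x & (uint32_t{1} << k)) == 0 ? 0u : a;
}

// Shift form: move bit k into the sign position, smear it across the word,
// then AND with a.
uint32_t shift_form(uint32_t x, unsigned k, uint32_t a) {
  uint32_t top = x << (31u - k);      // shl: bit k now sits in bit 31
  uint32_t smear = 0u - (top >> 31);  // all ones if bit 31 is set, else zero
  return smear & a;
}

// Compares both forms over every bit position and a few sample values.
void check_single_bit_fold() {
  const uint32_t samples[] = {0u, 1u, 0x80000000u, 0xDEADBEEFu, 0xFFFFFFFFu};
  for (unsigned k = 0; k < 32; ++k)
    for (uint32_t x : samples)
      for (uint32_t a : samples)
        assert(select_form(x, k, a) == shift_form(x, k, a));
}
```

Whether the shift form is cheaper than a conditional select depends on the target, which is the decision the new hook is meant to make.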
Muhammad Omair Javaid
e1ee0e8510 Revert "[SVE2.1][Clang][LLVM]Add BFloat16 builtin in Clang and LLVM intrinisc (#70362)"
This reverts commit f79676a17eae4c63318561ba35613d97053fa12c.
2023-11-22 00:09:50 +05:00
Craig Topper
7a6fd49c8a [RISCV] Use short forward branch for ISD::ABS.
We can use short forward branch to conditionally negate if the
value is negative.
2023-11-21 11:00:06 -08:00
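As a rough illustration of the ISD::ABS entry above, the conditional-negate shape looks like this in source form. This is only a sketch with a hypothetical function name; it says nothing about the exact instructions the backend selects. The negate is done on the unsigned representation so that the most negative input wraps instead of overflowing.

```cpp
#include <cassert>
#include <cstdint>

// abs as "skip one negate when the value is non-negative": a single
// forward branch over a single instruction.
int64_t abs_via_conditional_negate(int64_t v) {
  uint64_t u = static_cast<uint64_t>(v);
  if (v < 0) {
    u = 0u - u;  // wrapping negate
  }
  return static_cast<int64_t>(u);
}

// Spot-checks a few values on both sides of zero.
void check_abs() {
  assert(abs_via_conditional_negate(-5) == 5);
  assert(abs_via_conditional_negate(7) == 7);
  assert(abs_via_conditional_negate(0) == 0);
}
```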
Craig Topper
ce6127433b [RISCV] Add rv32 command line to short-forward-branch-opt.ll. NFC 2023-11-21 11:00:05 -08:00
Joe Nash
b072ec5ec6 [AMDGPU] NFC. Run auto-update on a few tests 2023-11-21 13:52:02 -05:00
David Li
30e8dcd8cc
Enable customer lowering for fabs_v16f16 with AVX2 (#72914)
This is the part-2 change to improve codegen for vec_fabs. In this patch,
v16f16 and v32f16 fabs are improved.

There will be at least two follow-up patches after this one.
1) fixing the ISEL crash when fabs.v32f16 uses custom lowering with
AVX512
2) better expansion for v16f16, v32f16 types on AVX1 subtargets.
2023-11-21 10:34:00 -08:00
Simon Pilgrim
1552b91162 [X86] X86FixupVectorConstantsPass - attempt to match VEX logic ops back to EVEX if we can create a broadcast fold
On non-DQI AVX512 targets, X86InstrInfo::setExecutionDomainCustom will convert EVEX int-domain instructions to VEX fp-domain instructions. But, if we have the chance to use a broadcast fold we're better off using an EVEX instruction, so handle a reverse fold.
2023-11-21 18:01:29 +00:00
Momchil Velikov
28f62d72f4
[AArch64] Add SVE2.1 intrinsics for indexed quadword gather loads and scatter stores (#70476)
This patch adds the quadword gather load intrinsics of the form

    sv<type>_t svld1q_gather_u64index_<typ>(svbool_t, const <type>_t *, svuint64_t);
    sv<type>_t svld1q_gather_u64base_index_<typ>(svbool_t, svuint64_t, int64_t);

and the quadword scatter store intrinsics of the form

    void svst1q_scatter_u64index_<typ>(svbool_t, <type>_t *, svuint64_t, sv<type>_t);
    void svst1q_scatter_u64base_index_<typ>(svbool_t, svuint64_t, int64_t, sv<type>_t);

ACLE spec: https://github.com/ARM-software/acle/pull/257
2023-11-21 16:44:14 +00:00
Jessica Del
f85e7ab035
[AMDGPU] - Add constant folding to s_wqm intrinsic (#72382)
Fold any constant input to the `s_wqm` intrinsic.
2023-11-21 16:36:45 +01:00
Momchil Velikov
f335883808
[AArch64][SVE2.1] Add intrinsics for quadword loads/stores with unscaled offset (#70474)
This patch adds a set of SVE2.1 quadword load/store intrinsics:

  * Contiguous zero-extending load to quadword (single vector)

    sv<type>_t svld1uwq[_<typ>](svbool_t, const <type>_t *ptr);
    sv<type>_t svld1uwq_vnum[_<typ>](svbool_t, const <type>_t *ptr, int64_t vnum);
 
    sv<type>_t svld1udq[_<typ>](svbool_t, const <type>_t *ptr);
    sv<type>_t svld1udq_vnum[_<typ>](svbool_t, const <type>_t *ptr, int64_t vnum);

  * Contiguous truncating store of single vector operand

    void svst1uwq[_<typ>](svbool_t, const <type>_t *ptr, sv<type>_t data);
    void svst1uwq_vnum[_<typ>](svbool_t, const <type>_t *ptr, int64_t vnum, sv<type>_t data);

    void svst1udq[_<typ>](svbool_t, const <type>_t *ptr, sv<type>_t data);
    void svst1udq_vnum[_<typ>](svbool_t, const <type>_t *ptr, int64_t vnum, sv<type>_t data);

  * Gather load quadword

    sv<type>_t svld1q_gather[_u64base]_<typ>(svbool_t pg, svuint64_t zn);
    sv<type>_t svld1q_gather[_u64base]_offset_<typ>(svbool_t pg, svuint64_t zn, int64_t offset);

  * Scatter store quadword

    void svst1q_scatter[_u64base][_<typ>](svbool_t pg, svuint64_t zn, sv<type>_t data);
    void svst1q_scatter[_u64base]_offset[_<typ>](svbool_t pg, svuint64_t zn, int64_t offset, sv<type>_t data);

  * Contiguous load two, three or four quadword structures.

    sv<type>x2_t svld2q[_<typ>](svbool_t pg, const <type>_t *rn);
    sv<type>x2_t svld2q_vnum[_<typ>](svbool_t pg, const <type>_t *rn, uint64_t vnum);
    sv<type>x3_t svld3q[_<typ>](svbool_t pg, const <type>_t *rn);
    sv<type>x3_t svld3q_vnum[_<typ>](svbool_t pg, const <type>_t *rn, uint64_t vnum);
    sv<type>x4_t svld4q[_<typ>](svbool_t pg, const <type>_t *rn);
    sv<type>x4_t svld4q_vnum[_<typ>](svbool_t pg, const <type>_t *rn, uint64_t vnum);

  * Contiguous store two, three or four quadword structures.

    void svst2q[_<typ>](svbool_t pg, <type>_t *rn, sv<type>x2_t zt);
    void svst2q_vnum[_<typ>](svbool_t pg, <type>_t *rn, int64_t vnum, sv<type>x2_t zt);
    void svst3q[_<typ>](svbool_t pg, <type>_t *rn, sv<type>x3_t zt);
    void svst3q_vnum[_<typ>](svbool_t pg, <type>_t *rn, int64_t vnum, sv<type>x3_t zt);
    void svst4q[_<typ>](svbool_t pg, <type>_t *rn, sv<type>x4_t zt);
    void svst4q_vnum[_<typ>](svbool_t pg, <type>_t *rn, int64_t vnum, sv<type>x4_t zt);

ACLE spec: https://github.com/ARM-software/acle/pull/257

Co-authored-by: Caroline Concatto <caroline.concatto@arm.com>
Co-authored-by: Hassnaa Hamdi <hassnaa.hamdi@arm.com>
2023-11-21 15:34:59 +00:00
Brandon Wu
2749f52ec4
[RISCV] Convert all floating point vector type operands to integer vector type (#69559) 2023-11-21 23:19:10 +08:00
CarolineConcatto
f79676a17e
[SVE2.1][Clang][LLVM]Add BFloat16 builtin in Clang and LLVM intrinisc (#70362)
This patch implements the builtins in Clang
and the LLVM-IR intrinsic for the following:

For BFADD, BFSUB, BFMAX, BFMIN, BFMAXNM and BFMINNM, for instance:
svbfloat16_t svadd[_bf16]_m (svbool_t pg, svbfloat16_t zdn, svbfloat16_t zm);
svbfloat16_t svadd[_bf16]_x (svbool_t pg, svbfloat16_t zdn, svbfloat16_t zm);
svbfloat16_t svadd[_bf16]_z (svbool_t pg, svbfloat16_t zdn, svbfloat16_t zm);
svbfloat16_t svadd[_n_bf16]_m (svbool_t pg, svbfloat16_t zdn, bfloat16_t zm);
svbfloat16_t svadd[_n_bf16]_x (svbool_t pg, svbfloat16_t zdn, bfloat16_t zm);
svbfloat16_t svadd[_n_bf16]_z (svbool_t pg, svbfloat16_t zdn, bfloat16_t zm);
The add could be replaced by sub, max, min, maxnm and minnm.

For BFMUL:
svbfloat16_t svmul[_bf16]_m(svbool_t pg, svbfloat16_t zdn, svbfloat16_t zm);
svbfloat16_t svmul[_bf16]_x(svbool_t pg, svbfloat16_t zdn, svbfloat16_t zm);
svbfloat16_t svmul[_bf16]_z(svbool_t pg, svbfloat16_t zdn, svbfloat16_t zm);
svbfloat16_t svmul[_n_bf16]_m(svbool_t pg, svbfloat16_t zdn, bfloat16_t zm);
svbfloat16_t svmul[_n_bf16]_x(svbool_t pg, svbfloat16_t zdn, bfloat16_t zm);
svbfloat16_t svmul[_n_bf16]_z(svbool_t pg, svbfloat16_t zdn, bfloat16_t zm);

svbfloat16_t svmul_lane[_bf16](svbfloat16_t zn, svbfloat16_t zm,
                               uint64_t imm_idx);

For BFCLAMP:
svbfloat16_t svclamp[_bf16](svbfloat16_t op, svbfloat16_t min,
                            svbfloat16_t max);

For BFMLA and BFMLS:
svbfloat16_t svmla[_bf16]_m(svbool_t pg, svbfloat16_t zda, svbfloat16_t zn,
                            svbfloat16_t zm);
svbfloat16_t svmla[_bf16]_z(svbool_t pg, svbfloat16_t zda, svbfloat16_t zn,
                            svbfloat16_t zm);
svbfloat16_t svmla[_bf16]_x(svbool_t pg, svbfloat16_t zda, svbfloat16_t zn,
                            svbfloat16_t zm);
svbfloat16_t svmla[_n_bf16]_m(svbool_t pg, svbfloat16_t zda, svbfloat16_t zn,
                              bfloat16_t zm);
svbfloat16_t svmla[_n_bf16]_z(svbool_t pg, svbfloat16_t zda, svbfloat16_t zn,
                              bfloat16_t zm);
svbfloat16_t svmla[_n_bf16]_x(svbool_t pg, svbfloat16_t zda, svbfloat16_t zn,
                              bfloat16_t zm);

svbfloat16_t svmla_lane[_bf16](svbfloat16_t zda, svbfloat16_t zn,
                               svbfloat16_t zm, uint64_t

According to the PR#257[1]
[1]ARM-software/acle#257

Co-authored-by: Matthew Devereau <matthew.devereau@arm.com>
2023-11-21 14:02:18 +00:00
Simon Pilgrim
8336bfb17e [X86] Regenerate ispow2.ll. NFC. 2023-11-21 12:58:24 +00:00
ZhaoQi
775d2f3201
[LoongArch][MC] Support to get the FixupKind for BL (#72938)
Previously, bolt could not get FixupKind for BL correctly, because bolt
cannot get target-flags for BL. Here just add support in MCCodeEmitter.

Fixes https://github.com/llvm/llvm-project/pull/72826.
2023-11-21 19:00:29 +08:00
Momchil Velikov
ef9bcace83
[MachineSink][AArch64] Preserve debug location when rematerialising an instruction to replace a COPY (#72685)
Fixes a regression in `tools/lldb-dap/optimized/TestDAP_optimized.py`
caused by enabling "sink-and-fold" in MachineSink.
2023-11-21 10:10:23 +00:00
CarolineConcatto
c77d79b6fe
[SVE2.1][Clang][LLVM]Add 128bits builtin in Clang and LLVM intrinisc (#71930)
This patch implements the builtins in Clang
and the LLVM-IR intrinsic for the following:

EXTQ
// Variants are also available for:
// _s8, _s16, _u16, _s32, _u32, _s64, _u64
// _bf16, _f16, _f32, _f64
svuint8_t svextq_lane[_u8](svuint8_t zdn, svuint8_t zm, uint64_t imm);

TBLQ and TBXQ
// Variants are also available for:
// _u8, _u16, _s16, _u32, _s32, _u64, _s64
// _bf16, _f16, _f32, _f64
svint8_t svtblq[_s8](svint8_t zn, svuint8_t zm);
svint8_t svtbxq[_s8](svint8_t zn, svuint8_t zm);

UZPQ1, UZPQ2, ZIPQ1 and ZIPQ2
// Variants are also available for:
// _s8, _u16, _s16, _u32, _s32, _u64, _s64
// _bf16, _f16, _f32, _f64
svuint8_t svuzpq1[_u8](svuint8_t zn, svuint8_t zm);
svuint8_t svuzpq2[_u8](svuint8_t zn, svuint8_t zm);
svuint8_t svzipq1[_u8](svuint8_t zn, svuint8_t zm);
svuint8_t svzipq2[_u8](svuint8_t zn, svuint8_t zm);

PMOV
// Variants are available for:
// _s8, _u16, _s16, _s32, _u32, _s64, _u64
svbool_t svpmov_lane[_u8](svuint8_t zn, uint64_t imm);
svbool_t svpmov[_u8](svuint8_t zn); // The immediate is zero
svuint8_t svpmov_u8_z(svbool_t pn); // The immediate is zero

// Variants are available for:
// _s16, _s32, _u32, _s64, _u64
svuint16_t svpmov_lane[_u16]_m(svuint16_t zd, svbool_t pn, uint64_t imm);

According to the PR#257[1]
[1]ARM-software/acle#257

Co-authored-by: Hassnaa Hamdi <hassnaa.hamdi@arm.com>
2023-11-21 10:08:57 +00:00
Jessica Del
bebf3a9b8a
[AMDGPU] - Add bitreplicate const folding tests (#72649)
Add more test cases for `s_bitreplicate` constant folding.
2023-11-21 09:06:04 +01:00
Valery Pykhtin
c8c81f4dd6
[AMDGPU] Move ballot64 wave32 mode tests to the appropriate place. NFC. (#72959) 2023-11-21 08:29:15 +01:00
Freddy Ye
d102f8bda1
[MachineBlockPlacement][X86] Use max of MDAlign and TLIAlign to align Loops. (#71026)
This patch adds backend consumption of a new loop metadata:
!1 = !{!"llvm.loop.align", i32 64}
which is generated from clang's new loop attribute:
[[clang::code_align()]]
clang patch: #70762
2023-11-21 14:06:32 +08:00
Craig Topper
a1de0946ab [RISCV] Fix spelling error in test name. NFC
foward -> forward
2023-11-20 21:50:35 -08:00
Liao Chunyu
9166cd2a71
[RISCV] DAG combine (mul (add x, 1), y) -> vmadd (#71495)
vmadd: (mul (add x, 1), y) -> (add (mul x, y), y)
       (mul x, (add y, 1)) -> (add x, (mul x, y))
vnmsub: (mul (sub 1, x), y) -> (sub y, (mul x, y))
        (mul x, (sub 1, y)) -> (sub x, (mul x, y))

Comparison with gcc:
vmadd: https://gcc.godbolt.org/z/xjePx87Y7
vnmsub: https://gcc.godbolt.org/z/b17zG7nT1
2023-11-21 13:43:34 +08:00
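The rewrites in the vmadd entry above rest on distributing the multiply over the add or subtract. A minimal scalar sketch, assuming wrapping 32-bit unsigned arithmetic per element (the function names are hypothetical and only model a single lane):

```cpp
#include <cassert>
#include <cstdint>

// Left-hand sides as written in the commit message.
uint32_t mul_add1(uint32_t x, uint32_t y) { return (x + 1u) * y; }
uint32_t mul_sub1(uint32_t x, uint32_t y) { return (1u - x) * y; }

// Right-hand sides: a multiply-add and a negated multiply-subtract.
uint32_t madd_form(uint32_t x, uint32_t y) { return x * y + y; }
uint32_t nmsub_form(uint32_t x, uint32_t y) { return y - x * y; }

// The identities hold modulo 2^32, including at the wrap-around edges.
void check_vmadd_identities() {
  const uint32_t samples[] = {0u, 1u, 2u, 3u, 0x7FFFFFFFu, 0x80000000u, 0xFFFFFFFFu};
  for (uint32_t x : samples)
    for (uint32_t y : samples) {
      assert(mul_add1(x, y) == madd_form(x, y));
      assert(mul_sub1(x, y) == nmsub_form(x, y));
    }
}
```

Because the identities hold under wrap-around, the rewritten form needs no extra overflow checks on integer lanes.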
ZhaoQi
2ca028ce7c
[LoongArch][MC] Pre-commit tests for instr bl fixupkind testing (#72826)
This patch is used to test whether fixupkind for bl can be returned
correctly. When BL has target-flags(loongarch-call), there is no error.
But without this flag, an assertion error will appear. So the test is
just tagged as "Expectedly Failed" for now, until the following patch fixes it.
2023-11-21 08:34:52 +08:00
Min-Yih Hsu
0e24179797
[SelectionDAG] Add support to filter SelectionDAG dumps during ISel by function names (#72696)
`-debug-only=isel-dump` is the new debug type for printing SelectionDAG
after each ISel phase. This can be further filtered by
`-filter-print-funcs=<function names>`.
Note that the existing `-debug-only=isel` will take precedence over the
new behavior and print SelectionDAG dumps of every single function
regardless of `-filter-print-funcs`'s values.
2023-11-20 14:00:47 -08:00
Simon Pilgrim
59d14b6233 [X86] combineLoad - try to reuse existing constant pool entries for smaller vector constant data (REAPPLIED)
If we already have a YMM/ZMM constant whose lower bits match a smaller XMM/YMM constant, then ensure we reuse the same constant pool entry.

Extends the similar combines we already have to reuse VBROADCAST_LOAD/SUBV_BROADCAST_LOAD constant loads.

This is mainly a canonicalization, but should make it easier for us to merge constant loads in a future commit (related to both #70947 and better X86FixupVectorConstantsPass usage for #71078).

Reapplied with fix to ensure we don't 'flip-flop' between multiple matching constants - only perform the fold if the new constant pool entry is larger than the current entry.
2023-11-20 15:38:48 +00:00
Simon Pilgrim
2fdf283c3f [X86] constant-pool-sharing.ll - add test showing failure to reuse subvectors when storing larger vector types
We do correctly use implicit zero-extension of xmm constant load -> ymm constant store though.
2023-11-20 15:24:38 +00:00
Simon Pilgrim
23e1b6159e [X86] Regenerate constant-pool-sharing.ll with AVX test coverage
Shows failure to share the constant pool load (broadcast) on AVX targets
2023-11-20 15:24:38 +00:00
bcahoon
28b5054751
[AMDGPU] Fix PromoteAlloca size check of alloca for store (#72528)
When storing a subvector, too many elements were written when the
size of the alloca is smaller than the size of the vector store.
This patch checks for the minimum of the alloca vector and the
store vector to determine the number of elements to store.
2023-11-20 07:57:48 -06:00
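The size check in the PromoteAlloca entry above amounts to clamping the element count. A toy model, assuming both the promoted alloca and the stored value are plain element arrays and the store starts at element 0 (the function name is hypothetical and unrelated to the pass's real code):

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Copies at most min(alloca size, store size) elements, so a store wider
// than the alloca cannot write past its end. Returns the count written.
std::size_t store_subvector(std::vector<int>& alloca_vec,
                            const std::vector<int>& store) {
  std::size_t n = std::min(alloca_vec.size(), store.size());
  std::copy_n(store.begin(), n, alloca_vec.begin());
  return n;
}

// A 4-element store into a 2-element alloca writes only 2 elements.
void check_store_subvector() {
  std::vector<int> small(2, 0);
  assert(store_subvector(small, {1, 2, 3, 4}) == 2);
  assert(small == (std::vector<int>{1, 2}));
}
```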
Valery Pykhtin
57a11b7f75
[AMDGPU] Add live-through register set printing to GCNRegPressurePrinter pass. (#71096)
Add live-through register set printing, assuming a live-through register
is in both the live-in and live-out sets and has no redefinitions, but may
have uses in the block.
2023-11-20 13:35:47 +01:00
Simon Pilgrim
dfc03c45c1 [X86] vector-half-conversions.ll - regenerate with AVX512 slow/fast lane shuffles
Adds missing check prefixes
2023-11-20 12:09:16 +00:00
dewen
4594d5bb3a
[AArch64] Add missing bf16 store pattern (#72844)
We have STURHi store patterns but would fail to select from unscaled
offsets. This adds the missing pattern.
2023-11-20 19:46:58 +08:00
Simon Pilgrim
761a963dfc
[DAG] narrowExtractedVectorBinOp - ensure we limit late node creation to LegalOperations only (#72130)
Avoids infinite issues in some upcoming patches to help D152928 - x86 sees a number of regressions that are addressed by extending SimplifyDemandedVectorEltsForTargetNode to cover more binop opcodes
2023-11-20 10:56:41 +00:00
hstk30-hw
abcbca21cc
[AArch64] Fix big endian shuffle vector miscompile (#68673)
Fixes https://github.com/llvm/llvm-project/issues/65884
2023-11-20 10:24:20 +00:00
Rin
befa925aca
[MachineLICM][AArch64] Hoist COPY instructions with other uses in the loop (#71403)
When there is a COPY instruction in the loop with other uses, we want to
hoist the COPY, which in turn leads to the users being hoisted as well.

Co-authored-by: David Green <David.Green@arm.com>
2023-11-20 10:01:04 +00:00
Sam Tebbs
f7b5c25507
[AArch64][SME] Remove immediate argument restriction for svldr and svstr (#68565)
The svldr_vnum and svstr_vnum builtins always modify the base register
and tile slice and provide immediate offsets of zero, even when the
offset provided to the builtin is an immediate. This patch optimises the
output of the builtins when the offset is an immediate, to pass it
directly to the instruction and to not need the base register and tile
slice updates.
2023-11-20 09:57:29 +00:00
Kai Luo
bfd3734610 [PowerPC] Use MIR test so that it's not affected by instruction selection. NFC. 2023-11-20 09:51:12 +00:00
Diana
61332cb047
[AMDGPU] Emit backend_stack_size PAL metadata (#72509)
For chain functions, PAL uses a `backend_stack_size` metadata item,
which at the moment has the same meaning as `stack_frame_size_in_bytes`.
We emit both for now in order to simplify coordination with PAL.

The new item must be emitted in the `shader_functions` section, just as
the metadata for other module entry functions. For simplicity, we mark
chain functions as module entry functions and emit the same metadata for
all of them.
2023-11-20 10:01:13 +01:00
Matthew Devereau
cdf6693f07
[AArch64][SME] Add support for sme-fa64 (#70809) 2023-11-20 08:37:52 +00:00
Serge Pavlov
a2e1de1934 [ARM][FPEnv] Lowering of fpenv intrinsics
The change implements lowering of `get_fpenv`, `set_fpenv` and
`reset_fpenv`.

Differential Revision: https://reviews.llvm.org/D81843
2023-11-20 15:08:25 +07:00
Kai Luo
592386400d [PowerPC] Precommit test to show codegen while isel is unavailable. NFC. 2023-11-20 07:28:21 +00:00
Piyou Chen
3494c555c9
[RISCV] postpone removal in initundef pass (#71661)
The InitUndef pass needs to replace the implicit def with an Undef pseudo, but
the current removal method will break noreg2implicit.

This patch postpones the removal until all basic blocks have been processed.
2023-11-20 11:44:27 +08:00
Kai Luo
eb7698254a
[PowerPC][EarlyIfConversion] Do not insert isel if subtarget doesn't support isel (#72211)
Some PPC subtargets don't support the `isel` instruction, so early-ifcvt
should not insert it.
2023-11-20 09:17:04 +08:00
Noah Goldstein
ed7c97e0ad Recommit "[DAGCombiner] Transform (icmp eq/ne (and X,C0),(shift X,C1)) to use rotate or to getter constants." (2nd Try)
Added a missing check that the mask and shift amount add up to the correct
bitwidth, as well as test cases for the bug.

Closes #71729
2023-11-19 12:15:04 -06:00
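For the DAGCombiner recommit above, the simplest instance of the pattern is a 32-bit value compared half against half: a 16-bit low mask and a logical shift right by 16, which together cover the full bitwidth. The sketch below only demonstrates that this one case is equivalent to a rotate comparison; the helper names are hypothetical, and the general form the combine produces may differ.

```cpp
#include <cassert>
#include <cstdint>

// Rotate left on 32 bits; the r == 0 guard avoids a shift by 32.
uint32_t rotl32(uint32_t x, unsigned r) {
  r &= 31u;
  return r == 0 ? x : (x << r) | (x >> (32u - r));
}

// (icmp eq (and X, 0xFFFF), (lshr X, 16)): low half equals high half.
bool mask_shift_form(uint32_t x) { return (x & 0xFFFFu) == (x >> 16); }

// Rotating by 16 swaps the halves, so x is unchanged iff they are equal.
bool rotate_form(uint32_t x) { return rotl32(x, 16) == x; }

// Both predicates agree on matching and non-matching inputs.
void check_rotate_equivalence() {
  const uint32_t samples[] = {0u, 0x12341234u, 0x12345678u,
                              0xFFFF0000u, 0xABCDABCDu, 0xFFFFFFFFu};
  for (uint32_t x : samples)
    assert(mask_shift_form(x) == rotate_form(x));
}
```

If the mask width and shift amount did not sum to 32, the two halves being compared would not line up under a rotate, which is why the added bitwidth check matters.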
Noah Goldstein
160a13a0cc [X86] Add more tests for transform (icmp eq/ne (and X,C0),(shift X,C1)); PR71598 2023-11-19 12:15:03 -06:00
Simon Pilgrim
aeccab5664 Revert rGbfbfd1caa4da "[X86] combineLoad - try to reuse existing constant pool entries for smaller vector constant data"
Investigating reports of this causing infinite loops
2023-11-18 22:44:08 +00:00