21888 Commits

Author SHA1 Message Date
Wei Ding
7ab1f7a421 AMDGPU : Fix an error for the llvm.cttz implementation.
Differential Revision: http://reviews.llvm.org/D39014

llvm-svn: 316037
2017-10-17 21:49:52 +00:00
Tim Northover
350a87eaf1 AArch64: account for possible frame index operand in compares.
If the address of a local is used in a comparison, AArch64 can fold the
address-calculation into the comparison via "adds". Unfortunately, a couple of
places (both hit in this one test) are not ready to deal with that yet and just
assume the first source operand is a register.

llvm-svn: 316035
2017-10-17 21:43:52 +00:00
Simon Pilgrim
7cd4e2c96f [X86][SSE] Tests packuswb/truncation codegen from PR34773
llvm-svn: 316033
2017-10-17 21:14:53 +00:00
Konstantin Zhuravlyov
7dabe9ced7 AMDGPU: Start generating metadata for MaxFlatWorkGroupSize
Differential Revision: https://reviews.llvm.org/D38958

llvm-svn: 316024
2017-10-17 20:03:21 +00:00
Sanjay Patel
94c0eb031c [ARM, AArch64] adjust tests trying to maintain their objective; NFC
A smarter compiler will see that these might be better without a jump table
if we're just using the constant values of the switch.

llvm-svn: 316012
2017-10-17 16:54:56 +00:00
Gadi Haber
85d99b4310 [X86][Broadwell] Added the broadwell cpu to the scheduling regression tests.<NFC>
NFC.
Added the Broadwell cpu and the BROADWELL prefix to all the scheduling regression tests, as part of prepartion for a larger commit of adding all Broadwell scheduiling.

Reviewers: RKSimon, zvi, aaboud
Differential Revision: https://reviews.llvm.org/D38994

Change-Id: I54bc9065168844c107b1729fcdc1d311ce3ea0a9
llvm-svn: 315998
2017-10-17 13:45:39 +00:00
Yichao Yu
a18b0b1817 Fix implicit null check with negative offset
Summary:
It seems that negative offset was accidentally allowed in D17967.
AFAICT small negative offset should be valid (always raise segfault) on all archs that I'm aware of (especially x86, which is the only one with this optimization enabled) and such case can be useful when loading hiden metadata from an object.

However, like the positive side, it should only be done within a certain limit.
For now, use the same limit on the positive side for the negative side.
A separate option can be added if needs appear.

Reviewers: mcrosier, skatkov

Reviewed By: skatkov

Subscribers: sanjoy, llvm-commits

Differential Revision: https://reviews.llvm.org/D38925

llvm-svn: 315991
2017-10-17 11:47:36 +00:00
Gadi Haber
3020490aac [X86][Skylake] fixed/updated regression test mmx-schedule.ll which failed after r315978.
Change-Id: I60cd7e03ea6c3d9a3dc661a882458e83feca66e3
llvm-svn: 315985
2017-10-17 10:00:08 +00:00
Gadi Haber
1e0f1f476a [X86][SKL] Updated scheduling information for the SkylakeClient target
Updated the scheduling information for the SkylakeClient target with the following changes:

1. regrouped the instructions after adding load and store latencies.
2. regrouped the instructions after adding identified missing ports in several groups.
The changes were made after revisiting the latencies impact of all the load and store uOps.

Reviewers: zvi, RKSimon, craig.topper
Differential Revision: https://reviews.llvm.org/D38727

Change-Id: I778a308cc11e490e8fa5e27e2047412a1dca029f
llvm-svn: 315978
2017-10-17 06:47:04 +00:00
Daniel Sanders
3229217620 [globalisel][tablegen] Add a GIM_CheckIsSameOperand test where OtherInsnID and OtherOpIdx differ
llvm-svn: 315972
2017-10-17 05:24:44 +00:00
Craig Topper
341f2ab444 [X86] Add masked palignr tests to vector-shuffle-masked.ll
llvm-svn: 315971
2017-10-17 04:17:56 +00:00
Craig Topper
19f2f49ef1 [X86] Add AVX512BW to the vector-shuffle-masked test to prepare for an upcoming commit.
llvm-svn: 315970
2017-10-17 04:17:55 +00:00
Mark Searles
4e3d6160db Use the return value of UpdateNodeOperands(); in some cases, UpdateNodeOperands() modifies the node in-place and using the return value isn’t strictly necessary. However, it does not necessarily modify the node, but may return a resultant node if it already exists in the DAG. See comments in UpdateNodeOperands(). In that case, the return value must be used to avoid such scenarios as an infinite loop (node is assumed to have been updated, so added back to the worklist, and re-processed; however, node hasn’t changed so it is once again passed to UpdateNodeOperands(), assumed modified, added back to worklist; cycle infinitely repeats).
Differential Revision: https://reviews.llvm.org/D38466

llvm-svn: 315957
2017-10-16 23:38:53 +00:00
Simon Pilgrim
a590c74549 [X86][AVX] Add v4x64 vector shuffle test for <0,2,1,3> mask
llvm-svn: 315955
2017-10-16 23:20:16 +00:00
Quentin Colombet
0bd2825517 Re-apply [AArch64][RegisterBankInfo] Use the statically computed mappings for COPY
This reverts commit r315823, thus re-applying r315781.

Also make sure we don't use G_BITCAST mapping for non-generic registers.
Non-generic registers don't have a type but do have a reg bank.
Something the COPY mapping now how to deal with but the G_BITCAST
mapping don't.

-- Original Commit Message --
We use to resort on the generic implementation to get the mappings for
COPYs. The generic implementation resorts on table lookup and
dynamically allocated objects to get the valid mappings.

Given we already know how to map G_BITCAST and have the static mappings
for them, use that code path for COPY as well. This is much more
efficient.

Improve the compile time of RegBankSelect by up to 20%.

Note: When we eventually generate all the mappings via TableGen, we
wouldn't have to do that dance to shave compile time. The intent of this
change was to make sure that moving to static structure really pays off.

NFC.

llvm-svn: 315947
2017-10-16 22:28:40 +00:00
Quentin Colombet
9f20af6135 [AArch64][RegisterBankInfo] Add mapping support for G_BITCAST of s128
Anything bigger than 64-bit just map to FPR.

llvm-svn: 315946
2017-10-16 22:28:38 +00:00
Quentin Colombet
7c114d3d70 [AArch64][LegalizerInfo] Mark s128 G_BITCAST legal
We used to mark all G_BITCAST of 128-bit legal but only for vector
types. Scalars of this size are just fine as well.

llvm-svn: 315945
2017-10-16 22:28:27 +00:00
Simon Pilgrim
03c89a840a [X86][3DNow] Add scheduling latency/throughput tests for 3DNow! instructions
llvm-svn: 315942
2017-10-16 21:55:09 +00:00
Simon Pilgrim
608e1b57cf [X86][MMX] Add scheduling latency/throughput tests for MMX instructions
llvm-svn: 315939
2017-10-16 21:29:29 +00:00
Alexander Timofeev
9dff31c769 [AMDGPU] : revert r315908
llvm-svn: 315916
2017-10-16 16:57:37 +00:00
Sanjay Patel
a4b89ed0b7 [x86] add minmax tests with more predicate coverage; NFC
llvm-svn: 315913
2017-10-16 15:20:00 +00:00
Alexander Timofeev
3828242c7e [AMDGPU] Prevent Machine Copy Propagation from replacing live copy with the dead one
Differential revision: https://reviews.llvm.org/D38754

llvm-svn: 315908
2017-10-16 14:35:29 +00:00
Simon Pilgrim
259b190f0d Fix test name typo.
llvm-svn: 315907
2017-10-16 14:33:51 +00:00
Simon Pilgrim
664f2f697a [X86][SSE] Added additional PACKUS shuffle tests
Mainly inspired by PR34773

llvm-svn: 315906
2017-10-16 14:32:41 +00:00
Stefan Maksimovic
ee6b5a79dc [mips] Provide alternate predicates for constant synthesis
Ordering of patterns should not be of importance anymore
since the predicates used are mutually exclusive now.

llvm-svn: 315901
2017-10-16 13:18:21 +00:00
Yonghong Song
6621cf67cf bpf: fix bug on silently truncating 64-bit immediate
We came across an llvm bug when compiling some testcases that 64-bit
immediates are silently truncated into 32-bit and then packed into
BPF_JMP | BPF_K encoding.  This caused comparison with wrong value.

This bug looks to be introduced by r308080.  The Select_Ri pattern is
supposed to be lowered into J*_Ri while the latter only support 32-bit
immediate encoding, therefore Select_Ri should have similar immediate
predicate check as what J*_Ri are doing.

Reported-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Reviewed-by: Yonghong Song <yhs@fb.com>
llvm-svn: 315889
2017-10-16 04:14:53 +00:00
Hiroshi Inoue
e3a3e3c9e9 [PowerPC] Eliminate sign- and zero-extensions if already sign- or zero-extended
This patch enables redundant sign- and zero-extension elimination in PowerPC MI Peephole pass.
If the input value of a sign- or zero-extension is known to be already sign- or zero-extended, the operation is redundant and can be eliminated.
One common case is sign-extensions for a method parameter or for a method return value; they must be sign- or zero-extended as defined in PPC ELF ABI. 
For example of the following simple code, two extsw instructions are generated before the invocation of int_func and before the return. With this patch, both extsw are eliminated.

void int_func(int);
void ii_test(int a) {
    if (a & 1) return int_func(a);
}

Such redundant sign- or zero-extensions are quite common in many programs; e.g. I observed about 60,000 occurrences of the elimination while compiling the LLVM+CLANG.

Differential Revision: https://reviews.llvm.org/D31319

llvm-svn: 315888
2017-10-16 04:12:57 +00:00
Daniel Sanders
ea8711b88e Re-commit r315885: [globalisel][tblgen] Add support for iPTR and implement am_unscaled* and am_indexed*
Summary:
iPTR is a pointer of subtarget-specific size to any address space. Therefore
type checks on this size derive the SizeInBits from a subtarget hook.

At this point, we can import the simplests G_LOAD rules and select load
instructions using them. Further patches will support for the predicates to
enable additional loads as well as the stores.

The previous commit failed on MSVC due to a failure to convert an
initializer_list to a std::vector. Hopefully, MSVC will accept this version.

Depends on D37457

Reviewers: ab, qcolombet, t.p.northover, rovka, aditya_nandakumar

Reviewed By: qcolombet

Subscribers: kristof.beyls, javed.absar, llvm-commits, igorb

Differential Revision: https://reviews.llvm.org/D37458

llvm-svn: 315887
2017-10-16 03:36:29 +00:00
Daniel Sanders
ce72d611af Revert r315885: [globalisel][tblgen] Add support for iPTR and implement am_unscaled* and am_indexed*
MSVC doesn't like one of the constructors.

llvm-svn: 315886
2017-10-16 02:15:39 +00:00
Daniel Sanders
6735ea86cd [globalisel][tblgen] Add support for iPTR and implement am_unscaled* and am_indexed*
Summary:
iPTR is a pointer of subtarget-specific size to any address space. Therefore
type checks on this size derive the SizeInBits from a subtarget hook.

At this point, we can import the simplests G_LOAD rules and select load
instructions using them. Further patches will support for the predicates to
enable additional loads as well as the stores.

Depends on D37457

Reviewers: ab, qcolombet, t.p.northover, rovka, aditya_nandakumar

Reviewed By: qcolombet

Subscribers: kristof.beyls, javed.absar, llvm-commits, igorb

Differential Revision: https://reviews.llvm.org/D37458

llvm-svn: 315885
2017-10-16 01:16:35 +00:00
Daniel Sanders
a71f454765 [globalisel][tablegen] Implement unindexed load, non-extending load, and MemVT checks
Summary:
This includes some context-sensitivity in the MVT to LLT conversion so that
pointer types are tested correctly.
FIXME: I'm not happy with the way this is done since everything is a
       special-case. I've yet to find a reasonable way to implement it.

select-load.mir fails because <1 x s64> loads in tablegen get priority over s64
loads. This is fixed in the next patch and as such they should be committed
together, I've posted them separately to help with the review.

Depends on D37456

Reviewers: ab, qcolombet, t.p.northover, rovka, aditya_nandakumar

Subscribers: kristof.beyls, javed.absar, llvm-commits, igorb

Differential Revision: https://reviews.llvm.org/D37457

llvm-svn: 315884
2017-10-16 00:56:30 +00:00
Craig Topper
a5af4a64d0 [AVX512] Don't mark EXTLOAD as legal with AVX512. Continue using custom lowering.
Summary:
This was impeding our ability to combine the extending shuffles with other shuffles as you can see from the test changes.

There's one special case that needed to be added to use VZEXT directly for v8i8->v8i64 since the custom lowering requires v64i8.

Reviewers: RKSimon, zvi, delena

Reviewed By: delena

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D38714

llvm-svn: 315860
2017-10-15 16:41:17 +00:00
Amjad Aboud
c8d67979c0 [X86] Ignore DBG instructions in X86CmovConversion optimization to resolve PR34565
Differential Revision: https://reviews.llvm.org/D38359

llvm-svn: 315851
2017-10-15 11:00:56 +00:00
Craig Topper
a9cd59fb5d [X86] Lower vselect with constant condition to vector_shuffle even with AVX512 instructions.
Summary:
It's better to use our shuffle lowering code to handle these than loading an immediate into a k-register.

It really feels like this should be a DAG combine optimization rather than a lowering operation, but that's a problem for another day.

Reviewers: RKSimon, delena, zvi

Reviewed By: delena

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D38932

llvm-svn: 315849
2017-10-15 06:39:07 +00:00
Craig Topper
f02e97859b [X86] Don't use constant condition for select instruction when testing masking ops.
We should be able to fold constant conditions by converting to shuffles, but fixing that would break these tests in their current form. Since they are really trying to test masking ops, add a non-constant mask to the selects.

llvm-svn: 315848
2017-10-15 06:05:50 +00:00
Konstantin Zhuravlyov
263f7f6676 AMDGPU: Temporary disable pal metadata check line in llvm-readobj test
It fails on mips

llvm-svn: 315837
2017-10-14 23:42:11 +00:00
Craig Topper
dfb443e88c [X86] Remove a bunch of dead FileCheck lines with the wrong prefix.
llvm-svn: 315828
2017-10-14 21:46:55 +00:00
Simon Pilgrim
36fe00ee17 [X86][SSE] Don't attempt to reduce the imul vector width of odd sized vectors (PR34947)
llvm-svn: 315825
2017-10-14 19:57:19 +00:00
Simon Pilgrim
3f49b988e0 [X86][SSE] Test vector imul reduction on 32 and 64-bit targets
llvm-svn: 315824
2017-10-14 19:46:08 +00:00
Konstantin Zhuravlyov
a01d8b0b63 AMDGPU: Bring HSA metadata on par with the specification
Differential Revision: https://reviews.llvm.org/D38753

llvm-svn: 315821
2017-10-14 19:03:51 +00:00
Konstantin Zhuravlyov
b3c605d680 llvm-readobj: Print AMDGPU note contents
Differential Revision: https://reviews.llvm.org/D38752

llvm-svn: 315819
2017-10-14 18:21:42 +00:00
Simon Pilgrim
5bd4431aec Cleanup update_llc_test_checks.py notes.
llvm-svn: 315817
2017-10-14 17:37:03 +00:00
Konstantin Zhuravlyov
7b4be1ed89 AMDGPU: Cleanup elf-notes.ll test
llvm-svn: 315816
2017-10-14 17:36:53 +00:00
Konstantin Zhuravlyov
716af741e9 llvm-readobj: Print AMDGPU note type names
Differential Revision: https://reviews.llvm.org/D38751

llvm-svn: 315813
2017-10-14 16:43:46 +00:00
Konstantin Zhuravlyov
eda425edd4 AMDGPU: Do not emit deprecated notes for code object v3
Differential Revision: https://reviews.llvm.org/D38749

llvm-svn: 315810
2017-10-14 15:59:07 +00:00
Konstantin Zhuravlyov
9c05b2bc3b AMDGPU: Add support for isa version note
- Emit NT_AMD_AMDGPU_ISA
  - Add assembler parsing for isa version directive
    - If isa version directive does not match command line arguments, then return error

Differential Revision: https://reviews.llvm.org/D38748

llvm-svn: 315808
2017-10-14 15:40:33 +00:00
Simon Pilgrim
f367c27d2d [X86][SSE] Support combining AND(EXTRACT(SHUF(X)), C) -> EXTRACT(SHUF(X))
If we are applying a byte mask to a value extracted from a shuffle, see if we can combine the mask into shuffle.

Fixes the last issue with PR22415

llvm-svn: 315807
2017-10-14 15:01:36 +00:00
Craig Topper
f7e777763d [X86] Add patterns for vzmovl+cvtpd2dq/cvttpd2dq with a load.
llvm-svn: 315802
2017-10-14 07:04:48 +00:00
Craig Topper
61010a85b8 [X86] Add AVX512 versions of VCVTPD2PS to load folding tables.
llvm-svn: 315801
2017-10-14 05:55:43 +00:00
Craig Topper
ee277e190c [X86] Add patterns for vzmovl+cvtpd2ps with a load.
llvm-svn: 315800
2017-10-14 05:55:42 +00:00