346 Commits

Author SHA1 Message Date
Matt Arsenault
1a02d30c87 AMDGPU: Fix unused variable warnings in release builds
llvm-svn: 361030
2019-05-17 12:59:27 +00:00
Matt Arsenault
a510b570c2 AMDGPU/GlobalISel: Legalize G_FCEIL
llvm-svn: 361028
2019-05-17 12:20:05 +00:00
Matt Arsenault
6aebcd5499 AMDGPU/GlobalISel: Legalize G_INTRINSIC_TRUNC
llvm-svn: 361027
2019-05-17 12:20:01 +00:00
Matt Arsenault
6aafc5e19d AMDGPU/GlobalISel: Legalize G_FRINT
llvm-svn: 361026
2019-05-17 12:19:57 +00:00
Matt Arsenault
1448f5689e AMDGPU/GlobalISel: Legalize G_FCOPYSIGN
llvm-svn: 361025
2019-05-17 12:19:52 +00:00
Matt Arsenault
2b6f76f05f AMDGPU/GlobalISel: Fix non-power-of-2 G_EXTRACT sources
llvm-svn: 358894
2019-04-22 15:22:46 +00:00
Amara Emerson
946b1246d6 [GlobalISel] Enable CSE in the IRTranslator & legalizer for -O0 with constants only.
Other opcodes shouldn't be CSE'd until we can be sure debug info quality won't
be degraded.

This change also improves the IRTranslator so that in most places, but not all,
it creates constants using the MIRBuilder directly instead of first creating a
new destination vreg and then creating a constant. By doing this, the
buildConstant() method can just return the vreg of an existing G_CONSTANT
instead of having to create a COPY from it.

I measured a 0.2% improvement in compile time and a 0.9% improvement in code
size at -O0 ARM64.

Compile time:
Program                                        base   cse    diff
test-suite...ark/tramp3d-v4/tramp3d-v4.test     9.04   9.12  0.8%
test-suite...Mark/mafft/pairlocalalign.test     2.68   2.66 -0.7%
test-suite...-typeset/consumer-typeset.test     5.53   5.51 -0.4%
test-suite :: CTMark/lencod/lencod.test         5.30   5.28 -0.3%
test-suite :: CTMark/Bullet/bullet.test        25.82  25.76 -0.2%
test-suite...:: CTMark/ClamAV/clamscan.test     6.92   6.90 -0.2%
test-suite...TMark/7zip/7zip-benchmark.test    34.24  34.17 -0.2%
test-suite :: CTMark/SPASS/SPASS.test           6.25   6.24 -0.1%
test-suite...:: CTMark/sqlite3/sqlite3.test     1.66   1.66 -0.1%
test-suite :: CTMark/kimwitu++/kc.test         13.61  13.60 -0.0%
Geomean difference                                          -0.2%

Code size:
Program                                        base     cse      diff
test-suite...-typeset/consumer-typeset.test    1315632  1266480 -3.7%
test-suite...:: CTMark/ClamAV/clamscan.test    1313892  1297508 -1.2%
test-suite :: CTMark/lencod/lencod.test        1439504  1423112 -1.1%
test-suite...TMark/7zip/7zip-benchmark.test    2936980  2904172 -1.1%
test-suite :: CTMark/Bullet/bullet.test        3478276  3445460 -0.9%
test-suite...ark/tramp3d-v4/tramp3d-v4.test    8082868  8033492 -0.6%
test-suite :: CTMark/kimwitu++/kc.test         3870380  3853972 -0.4%
test-suite :: CTMark/SPASS/SPASS.test          1434904  1434896 -0.0%
test-suite...Mark/mafft/pairlocalalign.test    764528   764528   0.0%
test-suite...:: CTMark/sqlite3/sqlite3.test    782092   782092   0.0%
Geomean difference                                              -0.9%

Differential Revision: https://reviews.llvm.org/D60580

llvm-svn: 358369
2019-04-15 05:04:20 +00:00
Matt Arsenault
4ed6ccab9b AMDGPU/GlobalISel: Fix non-power-of-2 select
llvm-svn: 357762
2019-04-05 14:03:04 +00:00
Matt Arsenault
d3093c2f1f GlobalISel: Implement fewerElementsVector for phi
llvm-svn: 355048
2019-02-28 00:16:32 +00:00
Matt Arsenault
72bcf15dbf GlobalISel: Implement moreElementsVector for phi
llvm-svn: 355047
2019-02-28 00:01:05 +00:00
Matt Arsenault
f4bfe4cd17 AMDGPU/GlobalISel: Fix bit ops for non-power-of-2 sizes
llvm-svn: 354825
2019-02-25 21:32:48 +00:00
Matt Arsenault
82b103998b AMDGPU/GlobalISel: Clamp max implicit_def elements
llvm-svn: 354818
2019-02-25 20:46:06 +00:00
Matt Arsenault
2e0ee47712 AMDGPU/GlobalISel: Make phis legal
llvm-svn: 354592
2019-02-21 15:48:13 +00:00
Matt Arsenault
b10fa8df3f AMDGPU/GlobalISel: Fix bit count ops for non-power-of-2 types
llvm-svn: 354587
2019-02-21 15:22:20 +00:00
Matt Arsenault
75e30c4d5d GlobalISel: Fix fewerElementsVector for ctlz with different result type
Also complete the set of related operations.

llvm-svn: 354480
2019-02-20 16:42:52 +00:00
Matt Arsenault
c4d07554e4 GlobalISel: Implement moreElementsVector for g_insert results
llvm-svn: 354477
2019-02-20 16:11:22 +00:00
Matt Arsenault
b4c95b338b GlobalISel: Implement moreElementsVector for select
llvm-svn: 354354
2019-02-19 17:03:09 +00:00
Matt Arsenault
4d88427a58 GlobalISel: Implement moreElementsVector for G_EXTRACT source
llvm-svn: 354348
2019-02-19 16:44:22 +00:00
Matt Arsenault
26b7e859ef GlobalISel: Implement moreElementsVector for bit ops
llvm-svn: 354345
2019-02-19 16:30:19 +00:00
Matt Arsenault
fbe92a53d0 GlobalISel: Implement widenScalar for g_extract scalar results
llvm-svn: 354293
2019-02-18 22:39:27 +00:00
Matt Arsenault
530d05e94a GlobalISel: Add alignment to LegalityQuery MMOs
This allows targets to specify the minimum alignment required for the
load/store.

llvm-svn: 354071
2019-02-14 22:41:09 +00:00
Matt Arsenault
9e5e868d95 AMDGPU/GlobalISel: Fix RegBankSelect for GEP.
This is basically a pointer typed add, so shouldn't be any different.
This was assuming everything was an SGPR, which is not true.

Also cleanup legality for GEP. I don't seem to be seeing the problem
the hack marking s64 as a legal pointer type the comment mentions.

llvm-svn: 354067
2019-02-14 22:24:28 +00:00
Matt Arsenault
00ccd13c73 AMDGPU/GlobalISel: Only make f16 constants legal on f16 targets
We could deal with it, but there's no real point.

llvm-svn: 353845
2019-02-12 14:54:55 +00:00
Matt Arsenault
18ec382698 GlobalISel: Implement moreElementsVector for implicit_def
llvm-svn: 353754
2019-02-11 22:00:39 +00:00
Matt Arsenault
9dba67f431 GlobalISel: Add G_FCANONICALIZE instruction
llvm-svn: 353719
2019-02-11 17:05:20 +00:00
Matt Arsenault
b0a227049f AMDGPU/GlobalISel: Fix shift legalization for non-power-of-2
clampScalar doesn't do anything for non-power-of-2 in range.
There should probably be a combination rule to reduce the number
of matching rules.

llvm-svn: 353526
2019-02-08 15:06:24 +00:00
Matt Arsenault
0f2debb1c2 AMDGPU/GlobalISel: Fix non-power-of-2 implicit_def
llvm-svn: 353522
2019-02-08 14:46:27 +00:00
Matt Arsenault
dc88a2ce35 AMDGPU/GlobalISel: Don't use a copy in addrspacecast lowering
llvm-svn: 353516
2019-02-08 14:16:11 +00:00
Matt Arsenault
a8b4339c2f AMDGPU/GlobalISel: Legalize addrspacecast
Use a placeholder constant for now on targets
that need the load from the queue ptr.

llvm-svn: 353497
2019-02-08 02:40:47 +00:00
Matt Arsenault
fbec8fe93b GlobalISel: Implement narrowScalar for shift main type
This is pretty much directly ported from SelectionDAG. Doesn't include
the shift by non-constant but known bits version, since there isn't a
globalisel version of computeKnownBits yet.

This shows a disadvantage of targets not specifically which type
should be used for the shift amount. If type 0 is legalized before
type 1, the operations on the shift amount type use the wider type
(which are also less likely to legalize). This can be avoided by
targets specifying legalization actions on type 1 earlier than for
type 0.

llvm-svn: 353455
2019-02-07 19:37:44 +00:00
Matt Arsenault
d914189a2e AMDGPU/GlobalISel: Restrict g_implicit_def legality
llvm-svn: 353452
2019-02-07 19:10:15 +00:00
Matt Arsenault
c0f7569aab AMDGPU/GlobalISel: Legalize fsqrt
llvm-svn: 353438
2019-02-07 18:14:39 +00:00
Matt Arsenault
93fdec739b AMDGPU/GlobalISel: Legalize some f16 operations
llvm-svn: 353436
2019-02-07 18:03:11 +00:00
Matt Arsenault
c83b82363c GlobalISel: Implement fewerElementsVector for shifts
Introduce a new function which handles instructions with multiple type
indices, but have the same number of vector elements.

Also legalize v2s16 shifts when applicable.

llvm-svn: 353432
2019-02-07 17:38:00 +00:00
Matt Arsenault
91be65be65 GlobalISel: Try to make legalize rules more useful for vectors
Mostly keep the existing functions on scalars, but add versions which
also operate based on the vector element size.

llvm-svn: 353430
2019-02-07 17:25:51 +00:00
Matt Arsenault
10547230f3 AMDGPU/GlobalISel: Legalize select for v4s16
Also add some more select tests to help show future legalization
changes.

llvm-svn: 353045
2019-02-04 14:04:52 +00:00
Fangrui Song
b21ed3c57a [AMDGPU] Fix -Wunused-variable after rL352978
llvm-svn: 352982
2019-02-03 03:51:52 +00:00
Matt Arsenault
888aa5dedd GlobalISel: Implement widenScalar for G_UNMERGE_VALUES
For the scalar case only.

Also move the similar G_MERGE_VALUES handling to a separate function
and cleanup to make them look more similar.

llvm-svn: 352979
2019-02-03 00:07:33 +00:00
Matt Arsenault
0e5d856eb8 GlobalISel: Implement widenScalar for G_EXTRACT vector sources
Handle the basic element extract case.

llvm-svn: 352978
2019-02-02 23:56:00 +00:00
Matt Arsenault
eb2603cfb2 AMDGPU/GlobalISel: Avoid reporting illegal extloads as legal
This avoids breaking a test in a future commit.

llvm-svn: 352977
2019-02-02 23:39:13 +00:00
Matt Arsenault
58f9d3df97 AMDGPU/GlobalISel: Legalize icmp for pointer types
llvm-svn: 352976
2019-02-02 23:35:15 +00:00
Matt Arsenault
2065c94dd3 AMDGPU/GlobalISel: Legalize constant for pointer types
llvm-svn: 352975
2019-02-02 23:33:49 +00:00
Matt Arsenault
2491f82679 AMDGPU/GlobalISel: Legalize select for pointer types
llvm-svn: 352974
2019-02-02 23:31:50 +00:00
Matt Arsenault
cbaada6bc1 GlobalISel: Legalization for inttoptr/ptrtoint
llvm-svn: 352973
2019-02-02 23:29:55 +00:00
Matt Arsenault
c7bce739ad GlobalISel: Handle odd splits in fewerElementsVector for load/store
llvm-svn: 352720
2019-01-31 02:46:05 +00:00
Matt Arsenault
d1bfc8d0c3 GlobalISel: Implement narrowScalar for bswap
llvm-svn: 352719
2019-01-31 02:34:03 +00:00
Matt Arsenault
d5684f76e0 GlobalISel: Allow bitcount ops to have different result type
For AMDGPU the result is always 32-bit for 64-bit inputs.

llvm-svn: 352717
2019-01-31 02:09:57 +00:00
Matt Arsenault
dc6c78596b GlobalISel: Implement fewerElementsVector for select
llvm-svn: 352601
2019-01-30 04:19:31 +00:00
Matt Arsenault
f6cab16258 AMDGPU/GlobalISel: Fix clamping shifts with 16-bit insts
llvm-svn: 352599
2019-01-30 03:36:25 +00:00
Matt Arsenault
045bc9a4a6 GlobalISel: Support narrowScalar for uneven loads
llvm-svn: 352594
2019-01-30 02:35:38 +00:00