3308 Commits

Author SHA1 Message Date
David Green
2abaa027d9 [AArch64] Teach the costmodel about widening muls
A vector mul(sext, sext) or mul(zext, zext) will be code generated as a
single smull or umull instruction. This most notably effects v2i64
multiplies, which are otherwise not legal and need to be expanded.

The oneuse check has also been slightly changed, as it is already
checked from the use of isWideningInstruction in getCastInstrCost.

Differential Revision: https://reviews.llvm.org/D123006
2022-04-04 12:45:04 +01:00
David Green
2e2f38a1ac [AArch64] Add widening arithmetic cost tests. NFC 2022-04-04 12:19:45 +01:00
Dávid Bolvanský
fb65aaf0be [NFCI] Fixed missing colon in CHECK directives - part 2 2022-04-03 14:42:59 +02:00
Dávid Bolvanský
f02a0a69af [NFCI] Fixed missing colon in CHECK directives 2022-04-03 11:52:38 +02:00
Simon Pilgrim
d663166acb [CostModel][X86] Reduce cost of v2i64 icmp base cost on SSE2 targets
Based off the script from D103695, we were exaggerating the cost of the v2i64 comparison expansion using instruction count instead of effective throughput
2022-03-30 09:11:55 +01:00
Johannes Doerfert
a81fff8afd Reapply "[Intrinsics] Add nocallback to the default intrinsic attributes"
This reverts commit c5f789050daab25aad6770790987e2b7c0395936 and
reapplies 7aea3ea8c3b33c9bb338d5d6c0e4832be1d09ac3 with additional test
changes.
2022-03-25 09:36:50 -05:00
Arthur Eubanks
d051c566cd [test] Remove the last couple uses of -analyze in llvm/test 2022-03-23 11:31:12 -07:00
David Green
c56dd20f69 [AArch64] Add extra insert subvector cost model tests. NFC 2022-03-22 12:20:19 +00:00
Yeting Kuo
ecd7a0132a [RISCV] Add basic cost model for vector casting
To perform the cost model of vector casting, the patch consider most vector
casts as their scalar form and consider those vector form of free scalr castings
as 1.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D121771
2022-03-22 14:17:08 +08:00
Simon Pilgrim
5dde9c1286 [CostModel][X86] Reduce cost of extracting bool vector elements
For constant indices, these are now just a MOVMSK+TEST/BT
2022-03-18 19:02:47 +00:00
Florian Hahn
1b7ef6aac8
[BasicAA] Account for wrapping when using abs(VarIndex) >= abs(Scale).
The patch adds an extra check to only set MinAbsVarIndex if
abs(V * Scale) won't wrap. In the absence of IsNSW, try to use the
bitwidths of the original V and Scale to rule out wrapping.

Attempt to model https://alive2.llvm.org/ce/z/HE8ZKj

The code in the else if below probably needs the same treatment, but I
need to come up with a test first.

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D121695
2022-03-18 14:41:15 +00:00
Simon Pilgrim
4455c5cdea [CostModel][X86] Update RUN -passes=* to double quotes to appease update scripts on windows 2022-03-18 11:44:18 +00:00
Craig Topper
bbd2ecf9f0 [RISCV] Add +experimental-zvfh extension to cover half types in vectors.
Currently we allow half types in vectors if the scalar Zfh extension
is enabled. This behavior is not inline with the vector spec. For f32
and f64 types, the Zve32f, Zve64f, Zve64d, and V explicitly control
the availablity of floating point types in vectors.

In order to make our compiler compliant, we either need to remove all support
for half in vectors or we need an extension to control it.

Draft spec here https://github.com/riscv/riscv-v-spec/pull/780

Reviewed By: kito-cheng

Differential Revision: https://reviews.llvm.org/D121345
2022-03-17 10:04:02 -07:00
Florian Hahn
e5822ded56
[FunctionAttrs] Infer argmemonly .
This patch adds initial argmemonly inference, by checking the underlying
objects of locations returned by MemoryLocation.

I think this should cover most cases, except function calls to other
argmemonly functions.

I'm not sure if there's a reason why we don't infer those yet.

Additional argmemonly can improve codegen in some cases. It also makes
it easier to come up with a C reproducer for 7662d1687b09 (already fixed,
but I'm trying to see if C/C++ fuzzing could help to uncover similar
issues.)

Compile-time impact:
NewPM-O3: +0.01%
NewPM-ReleaseThinLTO: +0.03%
NewPM-ReleaseLTO+g: +0.05%

https://llvm-compile-time-tracker.com/compare.php?from=067c035012fc061ad6378458774ac2df117283c6&to=fe209d4aab5b593bd62d18c0876732ddcca1614d&stat=instructions

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D121415
2022-03-16 10:24:33 +00:00
Nikita Popov
57d57b1afd [AAEval] Make compatible with opaque pointers
With opaque pointers, we cannot use the pointer element type to
determine the LocationSize for the AA query. Instead, -aa-eval
tests are now required to have an explicit load or store for any
pointer they want to compute alias results for, and the load/store
types are used to determine the location size.

This may affect ordering of results, and sorting within one result,
as the type is not considered part of the sorted string anymore.

To somewhat minimize the churn, printing still uses faux typed
pointer notation.
2022-03-16 10:02:11 +01:00
Florian Hahn
a9772a7148
[BasicAA] Add test showing incorrect noalias result with wrapping.
@mul_may_overflow_var_nonzero_minabsvarindex_one_index shows BasicAA
incorrectly determining noalias for (%gep.917, i8* %gep.idx).
If %v == 10581764700698480926, %idx == 917 and the GEPs alias.
https://alive2.llvm.org/ce/z/yzDgnn
2022-03-15 12:32:07 +00:00
Nikita Popov
04b717c423 [TLI] Check that malloc argument has type size_t
DSE assumes that this is the case when forming a calloc from a
malloc + memset pair.

For tests, either update the malloc signature or change the
data layout.
2022-03-14 17:22:24 +01:00
David Sherwood
e7b89c2fc3 Add BasicTTIImpl cost model for llvm.get.active.lane.mask intrinsic
The vectoriser sometimes generates predicated vector loops using
the llvm.get.active.lane.mask intrinsic so it's important that we
are able to calculate a valid cost for the call instruction. When
SVE is enabled we are able to use a single whilelo instruction
for some vector types - in such cases I've marked the cost as 1.
For all other cases I've set the cost according to how the intrinsic
will be expanded.

Tests added here:

  Analysis/CostModel/AArch64/sve-intrinsics.ll
  Analysis/CostModel/ARM/active_lane_mask.ll
  Analysis/CostModel/RISCV/active_lane_mask.ll

Differential Revision: https://reviews.llvm.org/D121109
2022-03-14 09:35:05 +00:00
Yeting Kuo
ae7c6647f3 [RISCV] Add basic code modeling for fixed length vector reduction.
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D121447
2022-03-14 11:04:31 +08:00
Florian Hahn
aa590e5823
[AArch64] Improve costs for some conversions to fp16.
Currently the cost model under-estimates the cost of certain
FP16 conversions.

This patch updates getCastInstrCost to return more accurate costs for
the cases improved in c2ed9fd05479.

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D113700
2022-03-11 10:27:39 +00:00
Florian Hahn
697f55e368
[AArch64] Move fp16 cast tests.
Move FP16 tests to fp16cast function, as suggested in D113700.
2022-03-10 12:22:06 +00:00
Arthur Eubanks
16823adf2a [test] Modify some tests to remove implicit -basic-aa in legacy PM RUN lines 2022-03-08 14:35:06 -08:00
Arthur Eubanks
b81d5baa0f [test] Use new PM for -aa-eval tests 2022-03-08 14:15:53 -08:00
Roman Lebedev
2f80ea7f4f
[NFC][LV] Use different braces in debug output
The analysis passes output function name encapsulated in `'` braces,
but LV uses `"`. Harmonizing this may help in creating an update script
for the LV costmodel test checks.

Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D121105
2022-03-07 19:32:37 +03:00
David Green
43b638241a [AArch64] Use NPM for cost model tests. NFC
As per the other tests, this switches the run lines back to using the
NPM via
-passes='print<cost-model>' -cost-kind=throughput 2>&1 -disable-output
2022-03-07 08:57:50 +00:00
Arthur Eubanks
f909aed671 Revert "[SCEV] Infer ranges for SCC consisting of cycled Phis"
This reverts commit fc539b0004d4fe8072aca00e38599a2300a955ce.

Causes miscompiles, see D110620.
2022-03-04 19:52:44 -08:00
Alex Tsao
89f15fc687 [RISCV] Add cost modelling for masked memory op
The patch adds very basic cost model for masked memory op on scalable vector.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D117884
2022-03-03 20:47:58 +08:00
David Green
47f4cd9c3d [AArch64] Update costs for some fp16 converts
This updates the costs for FP16 converts, as some of them were pretty
high.

Differential Revision: https://reviews.llvm.org/D120771
2022-03-03 11:17:24 +00:00
David Green
65c0e45a37 [AArch64] Vector shifts cost 1
The costs of vector shifts was 2 as opposed to 1, as the nodes are
marked custom. Fix this like the others and mark the nodes as cheap.

Differential Revision: https://reviews.llvm.org/D120773
2022-03-03 10:42:57 +00:00
David Green
97e0366d67 [AArch64] Add some fp16 conversion cost tests. NFC 2022-03-02 18:07:14 +00:00
Nikita Popov
98cfcae4e9 Revert "[RISCV] Add cost modelling for masked memory op"
This reverts commit 76f243b53b1c4bed5defe8ffac1fd739a39b0097.

The newly added test fails.
2022-03-02 17:32:10 +01:00
Alex Tsao
76f243b53b [RISCV] Add cost modelling for masked memory op
The patch adds very basic cost model for masked memory op on scalable vector.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D117884
2022-03-02 22:48:41 +08:00
David Green
02de975259 [AArch64] Add some tests for the cost of extending an extract. NFC 2022-03-02 14:47:32 +00:00
David Green
62c2b070d5 [AArch64] Add simple arithmetic cost model test. NFC 2022-03-01 23:31:02 +00:00
David Green
2e7c35ea12 [AArch64] Cleanup and extend cast costs. NFC 2022-02-26 17:59:02 +00:00
David Green
5fe8307b70 [AArch64] Add scalar min/max costs. NFC
The vector costs were already added, this adds scalar variants to
complete the test coverage.
2022-02-25 17:11:24 +00:00
Max Kazantsev
fc539b0004 [SCEV] Infer ranges for SCC consisting of cycled Phis
Our current strategy of computing ranges of SCEVUnknown Phis was to simply
compute the union of ranges of all its inputs. In order to avoid infinite recursion,
we mark Phis as pending and conservatively return full set for them. As result,
even simplest patterns of cycled phis always have a range of full set.

This patch makes this logic a bit smarter. We basically do the same, but instead
of taking inputs of single Phi we find its strongly connected component (SCC)
and compute the union of all inputs that come into this SCC from outside.

Processing entire SCC together has one more advantage: we can set range for all
of them at once, because the only thing that happens to them is the same value is
being passed between those Phis. So, despite we spend more time analyzing a
single Phi, overall we may save time by not processing other SCC members, so
amortized compile time spent should be approximately the same.

Differential Revision: https://reviews.llvm.org/D110620
Reviewed By: reames
2022-02-17 18:03:52 +07:00
Pavel Kosov
37fa99eda0 [SchedModels][CortexA55] Add ASIMD integer instructions
Depends on D114642

Original review https://reviews.llvm.org/D112201

OS Laboratory. Huawei Russian Research Institute. Saint-Petersburg

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D117003
2022-02-17 13:41:57 +03:00
Serguei Katkov
194899caef [MemoryDependency] Relax the re-ordering of atomic store and unordered load/store
Atomic store with Release semantic allows re-ordering of unordered load/store before the store.
Implement it.

Reviewers: reames
Reviewed By: reames
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D119844
2022-02-17 10:53:25 +07:00
Roman Lebedev
ae48af582b
[NFC][SCEV] Recognize umin_seq when operand is zext'ed in zero-check
zext(umin(x,y)) == umin(zext(x),zext(y))
zext(x) == 0  ->  x == 0

While it is not a very likely scenario, we probably should not expect
that instcombine already dropped such a redundant zext,
but handle directly. Moreover, perhaps there was no ZExtInst,
and SCEV somehow managed to  pull out said zext out of the SCEV expression.
2022-02-16 22:16:02 +03:00
Roman Lebedev
3c7d48ed90
[NFC][SCEV] Recognize umin_seq when operand is zext'ed in umin but not in zero-check
zext(umin(x,y)) == umin(zext(x),zext(y))
zext(x) == 0  ->  x == 0

Extra leading zeros do not affect the result of comparison with zero,
nor do they matter for the unsigned min/max,
so we should not be dissuaded when we find a zero-extensions,
but instead we should just skip it.
2022-02-16 22:16:02 +03:00
Roman Lebedev
21c6c43e6f
[NFC][SCEV] Add tests for umin_seq recognition with interfering zext's 2022-02-16 22:16:01 +03:00
Philip Reames
b59f135f16 Precommit tests from D119844, expanded with additional coverage 2022-02-16 07:55:43 -08:00
Serguei Katkov
15f1cffb3a [MemoryDependency] Relax the re-ordering with volatile store.
Volatile store does not provide any special rules for reordering with
atomics. Usual must alias anaylsis is enough here.

This makes the bahavior similar to how volatile load is handled.

Reviewers: reames, nikic
Reviewed By: reames
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D119818
2022-02-16 10:58:48 +07:00
Serguei Katkov
2e487da3cb [MemoryDepndency] Add a test for re-ordering with volatile load/store. 2022-02-16 10:27:11 +07:00
Roman Lebedev
65715ac72a
[SCEV] Generalize umin_seq matching
Since we don't greedily flatten `umin_seq(a, umin(b, c))` into `umin_seq(a, b, c)`,
just looking at the operands of the outer-level `umin` is not sufficient,
and we need to recurse into all same-typed `umin`'s.
2022-02-11 21:58:19 +03:00
Roman Lebedev
c234809ff8
[SCEV] Recognize x == 0 ? 0 : umin_seq(..., x, ...) -> umin_seq(x, umin_seq(...)) 2022-02-11 21:58:19 +03:00
Roman Lebedev
281421693b
[SCEV] Recognize x == 0 ? 0 : umin(..., x, ...) -> umin_seq(x, umin(...))
That is the canonical expansion for umin_seq,
so we really should roundtrip it.
2022-02-11 21:58:19 +03:00
Roman Lebedev
4d0c0e6cc2
[SCEV] createNodeForSelectOrPHIInstWithICmpInstCond(): generalize eq handling
The current logic was: https://alive2.llvm.org/ce/z/j8muXk
but in reality the offset to the Y in the 'true' hand
does not need to exist: https://alive2.llvm.org/ce/z/MNQ7DZ
https://alive2.llvm.org/ce/z/S2pMQD

To catch that, instead of computing the Y's in both
hands and checking their equality, compute Y and C,
and check that C is 0 or 1.
2022-02-11 21:58:19 +03:00
Roman Lebedev
bfce0ca203
[NFC][SCEV] Add test more tests for umin_seq recognition 2022-02-11 21:58:18 +03:00