26 Commits

Author SHA1 Message Date
Matt Arsenault
8fcf387202 InstCombine: Convert target tests to opaque pointers
The opaquify script deleted a few declarations for some reason which
were manually deleted.
2022-12-01 21:56:14 -05:00
Nikita Popov
fcfc31fffb [InstCombine] Convert some tests to opaque pointers (NFC)
Conversion was performed (without manual fixup) using:
https://gist.github.com/nikic/98357b71fd67756b0f064c9517b62a34
2022-10-28 13:07:30 +02:00
Bjorn Pettersson
acdc419c89 [test] Use -passes=instcombine instead of -instcombine in lots of tests. NFC
Another step moving away from the deprecated syntax of specifying
pass pipeline in opt.

Differential Revision: https://reviews.llvm.org/D119081
2022-02-07 14:26:59 +01:00
David Green
ab0c5cea0b [ARM] Use v2i1 for MVE and CDE intrinsics
This adjusts all the MVE and CDE intrinsics now that v2i1 is a legal
type, to use a <2 x i1> as opposed to emulating the predicate with a
<4 x i1>. The v4i1 workarounds have been removed leaving the natural
v2i1 types, notably in vctp64 which now generates a v2i1 type.

AutoUpgrade code has been added to upgrade old IR, which needs to
convert the old v4i1 to a v2i1 be converting it back and forth to an
integer with arm.mve.v2i and arm.mve.i2v intrinsics. These should be
optimized away in the final assembly.

Differential Revision: https://reviews.llvm.org/D114455
2021-12-03 15:27:58 +00:00
David Green
5a6dfbb8cd [ARM] Teach DemandedVectorElts about VMOVN lanes
The class of instructions that write to narrow top/bottom lanes only
demand the even or odd elements of the input lanes. Which means that a
pair of VMOVNT; VMOVNB demands no lanes from the original input. This
teaches that to instcombine from the target hooks available through
ARMTTIImpl.

Differential Revision: https://reviews.llvm.org/D109325
2021-09-14 11:05:31 +01:00
David Green
ac5a5af19d [ARM] Add tests for MVE narrowing intrinsic demand bits. 2021-09-06 22:03:32 +01:00
David Green
a77e2d196c [ARM] Fix arm.mve.pred.v2i range upper limit
The range metadata specifies a half open range, so our top limit was one
off.
2021-07-05 21:06:30 +01:00
Arthur Eubanks
1aa02b37e7 Revert "[BuildLibCalls/SimplifyLibCalls] Fix attributes on created CallInst instructions."
This reverts commit 1eda5453f2dcc9a9a4b4578fe74163c529974503.

Causes https://crbug.com/1223647:
Incompatible argument and return types for 'returned' attribute
  tail call void @llvm.memset.p0i8.i64(i8* noalias noundef returned writeonly align 1 dereferenceable(255) %arraydecay, i8 0, i64 255, i1 false), !dbg !985
2021-06-24 19:24:34 -07:00
Jonas Paulsson
1eda5453f2 [BuildLibCalls/SimplifyLibCalls] Fix attributes on created CallInst instructions.
- When emitting libcalls, do not only pass the calling convention from the
  function prototype but also the attributes.

- Do not pass attributes from e.g. libc memcpy to llvm.memcpy.

Review: Reid Kleckner, Eli Friedman, Arthur Eubanks

Differential Revision: https://reviews.llvm.org/D103992
2021-06-24 14:47:24 -05:00
Dávid Bolvanský
cd54c57919 Reland "[Libcalls, Attrs] Annotate libcalls with noundef"
Fixed Clang tests.
2021-02-20 06:18:48 +01:00
Dávid Bolvanský
94d034fb86 Revert "[Libcalls, Attrs] Annotate libcalls with noundef"
This reverts commit 33b0c63775ce58014c55e285671e3315104a6076. Bots are failing. Some Clang tests need to be updated too.
2021-02-20 04:18:42 +01:00
Dávid Bolvanský
33b0c63775 [Libcalls, Attrs] Annotate libcalls with noundef
I think we can use here same logic as for nonnull.

strlen(X) - X must be noundef => valid pointer.

for libcalls with size arg, we add noundef only if size is known and greater than 0 - so pointers must be noundef (valid ones)

Reviewed By: jdoerfert, aqjune

Differential Revision: https://reviews.llvm.org/D95122
2021-02-20 04:10:07 +01:00
Juneyoung Lee
9d70dbdc2b [InstCombine] use poison as placeholder for undemanded elems
Currently undef is used as a don’t-care vector when constructing a vector using a series of insertelement.
However, this is problematic because undef isn’t undefined enough.
Especially, a sequence of insertelement can be optimized to shufflevector, but using undef as its placeholder makes shufflevector a poison-blocking instruction because undef cannot be optimized to poison.
This makes a few straightforward optimizations incorrect, such as:

```
;  https://bugs.llvm.org/show_bug.cgi?id=44185

define <4 x float> @insert_not_undef_shuffle_translate_commute(float %x, <4 x float> %y, <4 x float> %q) {
  %xv = insertelement <4 x float> %q, float %x, i32 2
  %r = shufflevector <4 x float> %y, <4 x float> %xv, <4 x i32> { 0, 6, 2, undef }
  ret <4 x float> %r ; %r[3] is undef
}
=>
define <4 x float> @insert_not_undef_shuffle_translate_commute(float %x, <4 x float> %y, <4 x float> %q) {
  %r = insertelement <4 x float> %y, float %x, i32 1
  ret <4 x float> %r ; %r[3] = %y[3], incorrect if %y[3] = poison
}

Transformation doesn't verify!
ERROR: Target is more poisonous than source
```

I’d like to suggest
1. Using poison as insertelement’s placeholder value (IRBuilder::CreateVectorSplat should be patched too)
2. Updating shufflevector’s semantics to return poison element if mask is undef

Note that poison is currently lowered into UNDEF in SelDag, so codegen part is okay.
m_Undef() matches PoisonValue as well, so existing optimizations will still fire.

The only concern is hidden miscompilations that will go incorrect when poison constant is given.
A conservative way is copying all tests having `insertelement undef` & replacing it with `insertelement poison` & run Alive2 on it, but it will create many tests and people won’t like it. :(

Instead, I’ll simply locally maintain the tests and run Alive2.
If there is any bug found, I’ll report it.

Relevant links: https://bugs.llvm.org/show_bug.cgi?id=43958 , http://lists.llvm.org/pipermail/llvm-dev/2019-November/137242.html

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D93586
2020-12-28 08:58:15 +09:00
Meera Nakrani
545de56f87 [ARM] Enabled VMLAV and Add instructions to use VMLAVA
Used InstCombine to enable VMLAV and Add instructions to generate VMLAVA instead with tests.
2020-08-19 08:36:49 +00:00
Sebastian Neubauer
2a6c871596 [InstCombine] Move target-specific inst combining
For a long time, the InstCombine pass handled target specific
intrinsics. Having target specific code in general passes was noted as
an area for improvement for a long time.

D81728 moves most target specific code out of the InstCombine pass.
Applying the target specific combinations in an extra pass would
probably result in inferior optimizations compared to the current
fixed-point iteration, therefore the InstCombine pass resorts to newly
introduced functions in the TargetTransformInfo when it encounters
unknown intrinsics.
The patch should not have any effect on generated code (under the
assumption that code never uses intrinsics from a foreign target).

This introduces three new functions:
TargetTransformInfo::instCombineIntrinsic
TargetTransformInfo::simplifyDemandedUseBitsIntrinsic
TargetTransformInfo::simplifyDemandedVectorEltsIntrinsic

A few target specific parts are left in the InstCombine folder, where
it makes sense to share code. The largest left-over part in
InstCombineCalls.cpp is the code shared between arm and aarch64.

This allows to move about 3000 lines out from InstCombine to the targets.

Differential Revision: https://reviews.llvm.org/D81728
2020-07-22 15:59:49 +02:00
Simon Tatham
01aefae4a1 [ARM,MVE] Add an InstCombine rule permitting VPNOT.
Summary:
If a user writing C code using the ACLE MVE intrinsics generates a
predicate and then complements it, then the resulting IR will use the
`pred_v2i` IR intrinsic to turn some `<n x i1>` vector into a 16-bit
integer; complement that integer; and convert back. This will generate
machine code that moves the predicate out of the `P0` register,
complements it in an integer GPR, and moves it back in again.

This InstCombine rule replaces `i2v(~v2i(x))` with a direct complement
of the original predicate vector, which we can already instruction-
select as the VPNOT instruction which complements P0 in place.

Reviewers: ostannard, MarkMurrayARM, dmgreen

Reviewed By: dmgreen

Subscribers: kristof.beyls, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70484
2019-12-02 16:20:30 +00:00
Simon Tatham
f4f77aa53e [ARM,MVE] Add InstCombine rules for pred_i2v / pred_v2i.
If you're writing C code using the ACLE MVE intrinsics that passes the
result of a vcmp as input to a predicated intrinsic, e.g.

  mve_pred16_t pred = vcmpeqq(v1, v2);
  v_out = vaddq_m(v_inactive, v3, v4, pred);

then clang's codegen for the compare intrinsic will create calls to
`@llvm.arm.mve.pred.v2i` to convert the output of `icmp` into an
`mve_pred16_t` integer representation, and then the next intrinsic
will call `@llvm.arm.mve.pred.i2v` to convert it straight back again.
This will be visible in the generated code as a `vmrs`/`vmsr` pair
that move the predicate value pointlessly out of `p0` and back into it again.

To prevent that, I've added InstCombine rules to remove round trips of
the form `v2i(i2v(x))` and `i2v(v2i(x))`. Also I've taught InstCombine
about the known and demanded bits of those intrinsics. As a result,
you now get just the generated code you wanted:

  vpt.u16 eq, q1, q2
  vaddt.u16 q0, q3, q4

Reviewers: ostannard, MarkMurrayARM, dmgreen

Reviewed By: dmgreen

Subscribers: kristof.beyls, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70313
2019-11-18 10:39:30 +00:00
David Bolvansky
e80fcf0340 [SimplifyLibCalls] Mark known arguments with nonnull
Reviewers: efriedma, jdoerfert

Reviewed By: jdoerfert

Subscribers: ychen, rsmith, joerg, aaron.ballman, lebedev.ri, uenoku, jdoerfert, hfinkel, javed.absar, spatel, dmgreen, llvm-commits

Differential Revision: https://reviews.llvm.org/D53342

llvm-svn: 372091
2019-09-17 09:32:52 +00:00
David Bolvansky
39130314fe [SimplifyLibCalls] Add dereferenceable bytes from known callsites
Summary:
int mm(char *a, char *b) {
    return memcmp(a,b,16);
}

Currently:
define dso_local i32 @mm(i8* nocapture readonly %a, i8* nocapture readonly %b) local_unnamed_addr #1 {
entry:
  %call = tail call i32 @memcmp(i8* %a, i8* %b, i64 16)
  ret i32 %call
}

After patch:
define dso_local i32 @mm(i8* nocapture readonly %a, i8* nocapture readonly %b) local_unnamed_addr #1 {
entry:
  %call = tail call i32 @memcmp(i8* dereferenceable(16)  %a, i8* dereferenceable(16)  %b, i64 16)
  ret i32 %call
}




Reviewers: jdoerfert, efriedma

Reviewed By: jdoerfert

Subscribers: javed.absar, spatel, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D66079

llvm-svn: 368657
2019-08-13 09:11:49 +00:00
Eric Christopher
cee313d288 Revert "Temporarily Revert "Add basic loop fusion pass.""
The reversion apparently deleted the test/Transforms directory.

Will be re-reverting again.

llvm-svn: 358552
2019-04-17 04:52:47 +00:00
Eric Christopher
a863435128 Temporarily Revert "Add basic loop fusion pass."
As it's causing some bot failures (and per request from kbarton).

This reverts commit r358543/ab70da07286e618016e78247e4a24fcb84077fda.

llvm-svn: 358546
2019-04-17 02:12:23 +00:00
Alexandros Lamprineas
61f0ba1fcc [InstCombine, ARM] Convert vld1 to llvm load
Convert a vector load intrinsic into an llvm load instruction.
This is beneficial when the underlying object being addressed
comes from a constant, since we get constant-folding for free.

Differential Revision: https://reviews.llvm.org/D46273

llvm-svn: 333643
2018-05-31 12:19:18 +00:00
Alexandros Lamprineas
52457d33b2 [InstCombine, ARM, AArch64] Convert table lookup to shuffle vector
Turning a table lookup intrinsic into a shuffle vector instruction
can be beneficial. If the mask used for the lookup is the constant
vector {7,6,5,4,3,2,1,0}, then the back-end generates byte reverse
instructions instead.

Differential Revision: https://reviews.llvm.org/D46133

llvm-svn: 333550
2018-05-30 14:38:50 +00:00
Chad Rosier
274d72faad [InstCombine] Combine XOR and AES instructions on ARM/ARM64.
The ARM/ARM64 AESE and AESD instructions have a builtin XOR as the first step in
the instruction. Therefore, if the AES key is zero and the AES data was
previously XORed, it can be combined into a single instruction.

Differential Revision: https://reviews.llvm.org/D47239
Patch by Michael Brase!

llvm-svn: 333193
2018-05-24 15:26:42 +00:00
Justin Bogner
3c6fbad388 InstCombine: Move tests that use target intrinsics into subdirectories
Tests with target intrinsics are inherently target specific, so it
doesn't actually make sense to run them if we've excluded their
target.

llvm-svn: 302979
2017-05-13 05:39:46 +00:00
Sam Parker
214f7bf5cc Enable simplify libcalls for ARM PCS
Teach SimplifyLibcalls that in can treat functions annotated with
apcs, aapcs or aapcs_vfp like normal C functions if they only take
and return integer or pointer values, and the target is not iOS.

Differential Revision: https://reviews.llvm.org/D24453

llvm-svn: 281322
2016-09-13 12:10:14 +00:00