50 Commits

Author SHA1 Message Date
David Sherwood
7f763d9b48
[AArch64] Support symmetric complex deinterleaving with higher factors (#151295)
For loops such as this:

```
struct foo {
  double a, b;
};

void foo(struct foo *dst, struct foo *src, int n) {
  for (int i = 0; i < n; i++) {
    dst[i].a += src[i].a * 3.2;
    dst[i].b += src[i].b * 3.2;
  }
}
```

the complex deinterleaving pass will spot that the deinterleaving
associated with the structured loads cancels out the interleaving
associated with the structured stores. This happens even though
they are not truly "complex" numbers because the pass can handle
symmetric operations too. This is great because it means we can
then perform normal loads and stores instead. However, we can also
do the same for higher interleave factors, e.g. 4:

```
struct foo {
  double a, b, c, d;
};

void foo(struct foo *dst, struct foo *src, int n) {
  for (int i = 0; i < n; i++) {
    dst[i].a += src[i].a * 3.2;
    dst[i].b += src[i].b * 3.2;
    dst[i].c += src[i].c * 3.2;
    dst[i].d += src[i].d * 3.2;
  }
}
```

This PR extends the pass to effectively treat such structures as
a set of complex numbers, i.e.

```
struct foo_alt {
  std::complex<double> x, y;
};
```

with equivalence between members:

```
  foo_alt.x.real == foo.a
  foo_alt.x.imag == foo.b
  foo_alt.y.real == foo.c
  foo_alt.y.imag == foo.d
```

I've written the code to handle sets with arbitrary numbers of
complex values, but since we only support interleave factors
between 2 and 4 I've restricted the sets to 1 or 2 complex
numbers. Also, for now I've restricted support for interleave
factors of 4 to purely symmetric operations only. However, it
could also be extended to handle complex multiplications,
reductions, etc.

Fixes: https://github.com/llvm/llvm-project/issues/144795
2025-08-12 11:05:15 +01:00
David Sherwood
3b9ea7bbc5
Fix build warnings after 6fbc397964340ebc9cb04a094fd04bef9a53abc3 (#151100) 2025-07-29 09:27:58 +01:00
David Sherwood
6fbc397964
[IR] Add new CreateVectorInterleave interface (#150931)
This PR adds a new interface to IRBuilder called CreateVectorInterleave,
which can be used to create vector.interleave intrinsics of factors 2-8.

For convenience I have also moved getInterleaveIntrinsicID and
getDeinterleaveIntrinsicID from VectorUtils.cpp to Intrinsics.cpp where
it can be used by IRBuilder.
2025-07-29 08:47:07 +01:00
Paul Walker
b7ef5dbac9
[LLVM][ComplexDeinterleaving] Update splat identification to include vector ConstantInt/FP. (#144516) 2025-06-18 11:53:27 +01:00
Florian Hahn
5c7bc6a0e6
[ComplexDeinterleave] Don't try to combine single FP reductions. (#139469)
Currently the apss tries to combine floating point reductions, without
checking for the correct fast-math flags and it also creates invalid IR
(using llvm.reduce.add for FP types).

For now, just bail out for non-integer types.

PR: https://github.com/llvm/llvm-project/pull/139469
2025-05-13 08:44:11 +01:00
Matt Arsenault
9383fb23e1
Reapply "IR: Remove uselist for constantdata (#137313)" (#138961)
Reapply "IR: Remove uselist for constantdata (#137313)"

This reverts commit 5936c02c8b9c6d1476f7830517781ce8b6e26e75.

Fix checking uselists of constants in assume bundle queries
2025-05-08 08:00:09 +02:00
Kirill Stoimenov
5936c02c8b Revert "IR: Remove uselist for constantdata (#137313)"
Possibly breaks the build: https://lab.llvm.org/buildbot/#/builders/24/builds/8119

This reverts commit 87f312aad6ede636cd2de5d18f3058bf2caf5651.
2025-05-07 00:07:55 +00:00
Matt Arsenault
87f312aad6
IR: Remove uselist for constantdata (#137313)
This is a resurrected version of the patch attached to this RFC:

https://discourse.llvm.org/t/rfc-constantdata-should-not-have-use-lists/42606

In this adaptation, there are a few differences. In the original patch, the Use's
use list was replaced with an unsigned* to the reference count in the value. This
version leaves them as null and leaves the ref counting only in Value.

Remove use-lists from instances of ConstantData (which are shared
across modules and have no operands).

To continue supporting most of the use-list API, store a ref-count in
place of the use-list; this is for API like Value::use_empty and
Value::hasNUses.  Operations that actually need the use-list -- like
Value::use_begin -- will assert.

This change has three benefits:

 1. The compiler output cannot in any way depend on the use-list order
    of instances of ConstantData.

 2. There's no use-list traffic when adding and removing simple
    constants from operand lists (although there is ref-count traffic;
    YMMV).

 3. It's cheaper to serialize use-lists (since we're no longer
    serializing the use-list order of things like i32 0).

The downside is that you can't look at all the users of ConstantData,
but traversals of users of i32 0 are already ill-advised.

Possible follow-ups:
  - Track if an instance of a ConstantVector/ConstantArray/etc. is known
    to have all ConstantData arguments, and drop the use-lists to
    ref-counts in those cases.  Callers need to check Value::hasUseList
    before iterating through the use-list.
  - Remove even the ref-counts.  I'm not sure they have any benefit
    besides minimizing the scope of this commit, and maintaining the
    counts is not free.

Fixes #58629

Co-authored-by: Duncan P. N. Exon Smith <dexonsmith@apple.com>
2025-05-06 17:20:37 +02:00
Kazu Hirata
47f391fd0e
[CodeGen] Remove unused local variables (NFC) (#138441) 2025-05-04 00:26:37 -07:00
Kazu Hirata
cfc2b0d094
[llvm] Use llvm::SmallVector::pop_back_val (NFC) (#136533) 2025-04-21 08:13:16 -07:00
Kazu Hirata
031475594a
[llvm] Use llvm::SmallVector::pop_back_val (NFC) (#136441) 2025-04-19 11:49:19 -07:00
Matt Arsenault
d36ce05237
ComplexDeinterleaving: Avoid using getNumUses (#136354) 2025-04-18 23:26:43 +02:00
Kazu Hirata
62d2cc84ac
[CodeGen] Avoid repeated hash lookups (NFC) (#135540) 2025-04-13 05:45:53 -07:00
Nicholas Guy
3f4b2f12a1
[llvm] Fix crash when complex deinterleaving operates on an unrolled loop (#129735)
When attempting to perform complex deinterleaving on an unrolled loop
containing a reduction, the complex deinterleaving pass would fail to
accommodate the wider types when accumulating the unrolled paths.
Instead of trying to alter the incoming IR to fit expectations, the pass
should instead decide against processing any reduction that results in a
non-complex or non-vector value.
2025-03-19 13:44:02 +00:00
Nicholas Guy
1b2943534f
[llvm] Fix crash caused by reprocessing complex reductions (#122077)
If a complex pattern had the shape of both a complex->complex reduction
and a complex->single reduction, the matching would recognise both and
deem the graph a valid transformation. Preventing this reprocessing
results in only one of these matching, meaning that in the case of an
invalid graph, we don't try to transform it anyway.
2025-01-09 08:31:57 +00:00
Nicholas Guy
8e1b49c38e
Complex deinterleaving/single reductions build fix Reapply "Add support for single reductions in ComplexDeinterleavingPass (#112875)" (#120441)
This reverts commit 76714be5fd4ace66dd9e19ce706c2e2149dd5716, fixing the
build failure that caused the revert.

The failure stemmed from the complex deinterleaving pass identifying a
series of add operations as a "complex to single reduction", so when it
tried to transform this erroneously identified pattern, it faulted. The
fix applied is to ensure that complex numbers (or patterns that match
them) are used throughout, by checking if there is a deinterleave node
amidst the graph.
2025-01-06 09:59:32 +00:00
Florian Hahn
76714be5fd
Revert "Add support for single reductions in ComplexDeinterleavingPass (#112875)"
This reverts commit b3eede5e1fa7ab742b86e9be22db7bccd2505b8a.

This has been breaking most AArch64 stage2 builds for 4+ hours,
reverting to get the bots back to green.

https://lab.llvm.org/buildbot/#/builders/41/builds/4172
https://lab.llvm.org/buildbot/#/builders/4/builds/4281
https://lab.llvm.org/buildbot/#/builders/199/builds/263
https://lab.llvm.org/buildbot/#/builders/198/builds/334
https://lab.llvm.org/buildbot/#/builders/143/builds/4276
https://lab.llvm.org/buildbot/#/builders/17/builds/4725
2024-12-18 15:06:52 +00:00
Nicholas Guy
b3eede5e1f
Add support for single reductions in ComplexDeinterleavingPass (#112875)
The Complex Deinterleaving pass assumes that all values emitted will
result in complex numbers, this patch aims to remove that assumption and
adds support for emitting just the real or imaginary components, not
both.
2024-12-18 10:34:26 +00:00
Kazu Hirata
735ab61ac8
[CodeGen] Remove unused includes (NFC) (#115996)
Identified with misc-include-cleaner.
2024-11-12 23:15:06 -08:00
Maciej Gabka
bfc0317153
Move several vector intrinsics out of experimental namespace (#88748)
This patch is moving out following intrinsics:
* vector.interleave2/deinterleave2
* vector.reverse
* vector.splice

from the experimental namespace.

All these intrinsics exist in LLVM for more than a year now, and are
widely used, so should not be considered as experimental.
2024-04-29 10:16:45 +01:00
Jeremy Morse
f33f66be7d [NFC][RemoveDIs] Always use iterators for inserting PHIs
It's becoming potentially unsafe to insert a PHI instruction using a plain
Instruction pointer. Switch all the remaining sites that create and insert
PHIs to use iterators instead. For example, the code in
ComplexDeinterleavingPass.cpp is definitely at-risk of mixing PHIs and
debug-info.
2024-03-05 17:00:12 +00:00
Florian Hahn
4c9223c770
[ComplexDeinterleaving] Use MapVector to fix codegen non-determinism. 2023-09-06 15:57:35 +01:00
Igor Kirillov
e2cb07c322 [CodeGen] Fix incorrect insertion point selection for reduction nodes in ComplexDeinterleavingPass
When replacing ComplexDeinterleavingPass::ReductionOperation, we can do it
either from the Real or Imaginary part. The correct way is to take whichever
is later in the BasicBlock, but before the patch, we just always took the
Real part.

Fixes https://github.com/llvm/llvm-project/issues/65044

Differential Revision: https://reviews.llvm.org/D159209
2023-08-31 10:38:01 +00:00
Igor Kirillov
46b2ad0224 [CodeGen] Improve speed of ComplexDeinterleaving pass
Cache all results of running `identifyNode`, even those that do not identify
potential complex operations. This patch prevents ComplexDeinterleaving pass
from repeatedly trying to identify Nodes for the same pair of instructions.

Fixes https://github.com/llvm/llvm-project/issues/64379

Differential Revision: https://reviews.llvm.org/D156916
2023-08-04 14:11:43 +00:00
Igor Kirillov
c15557d64e [CodeGen] Extend ComplexDeinterleaving pass to recognise patterns using integer types
AArch64 introduced CMLA and CADD instructions as part of SVE2. This
change allows to generate such instructions when this architecture
feature is available.

Differential Revision: https://reviews.llvm.org/D153808
2023-07-19 11:01:19 +00:00
Igor Kirillov
0aecf7ff0d [CodeGen] Fix incorrectly detected reduction bug in ComplexDeinterleaving pass
Using ACLE intrinsics, it is possible to create a loop that the
deinterleaving pass incorrectly classified as a reduction loop.
For example, for fixed-width vectors the loop was like below:

vector.body:
  %a = phi <4 x float> [ %init.a, %entry ], [ %updated.a, %vector.body ]
  %b = phi <4 x float> [ %init.b, %entry ], [ %updated.b, %vector.body ]
  ...
; Does not depend on %a or %b:
  %updated.a = ...
  %updated.b = ...

Differential Revision: https://reviews.llvm.org/D154598
2023-07-10 12:54:38 +00:00
Igor Kirillov
7f20407cee [CodeGen] Add support for Splats in ComplexDeinterleaving pass
This commit allows generating of complex number intrinsics for expressions
with constants or loops invariants, which are represented as splats.
For instance, after vectorizing loops in the following code snippets,
the ComplexDeinterleaving pass will be able to generate complex number
intrinsics:

```
complex<> x = ...;
for (int i = 0; i < N; ++i)
    c[i] = a[i] * b[i] * x;
```

or

```
for (int i = 0; i < N; ++i)
    c[i] = a[i] * b[i] * (11.0 + 3.0i);
```

Differential Revision: https://reviews.llvm.org/D153355
2023-07-05 17:02:52 +00:00
Igor Kirillov
b4f9c3a933 [CodeGen] Refactor ComplexDeinterleaving to run identification on Values instead of Instructions
This change will make it easier to add identification of complex constants
in future patches.

Differential Revision: https://reviews.llvm.org/D153446
2023-07-03 10:35:14 +00:00
Wang, Xin10
c78acc9275 [NFC]Fix possibly derefer nullptr in ComplexDeinterleavingPass.cpp
Fix static analyzer reports issue, add assert to avoid analyzer report.

Reviewed By: igor.kirillov

Differential Revision: https://reviews.llvm.org/D153942
2023-06-28 23:26:09 -04:00
Igor Kirillov
1fce8df53a Fix the ComplexDeinterleaving bug when handling mixed reductions.
Add a missing check that ensures that ComplexDeinterleaving for reduction
is only analyzed for Real and Imaginary Instructions of the same type.

Differential Revision: https://reviews.llvm.org/D153862
2023-06-27 14:40:49 +00:00
Igor Kirillov
04a8070b46 Revert "Revert "[CodeGen] Extend reduction support in ComplexDeinterleaving pass to support predication""
Adds the capability to recognize SelectInst that appear in the IR.
These instructions are generated during scalable vectorization for reduction
and when the code contains conditions inside the loop body or when
"-prefer-predicate-over-epilogue=predicate-dont-vectorize" is set.

Differential Revision: https://reviews.llvm.org/D152558

This reverts commit ab09654832dba5cef8baa6400fdfd3e4d1495624.

Reason: Reapplying after removing unnecessary default case in switch expression.
2023-06-23 10:13:22 +00:00
Vitaly Buka
ab09654832 Revert "[CodeGen] Extend reduction support in ComplexDeinterleaving pass to support predication"
ComplexDeinterleavingPass.cpp:1849:3: error: default label in switch which covers all enumeration values

This reverts commit 116953b82130df1ebd817b3587b16154f659c013.
2023-06-22 11:29:38 -07:00
Igor Kirillov
116953b821 [CodeGen] Extend reduction support in ComplexDeinterleaving pass to support predication
Adds the capability to recognize SelectInst that appear in the IR.
These instructions are generated during scalable vectorization for reduction
and when the code contains conditions inside the loop body or when
"-prefer-predicate-over-epilogue=predicate-dont-vectorize" is set.

Differential Revision: https://reviews.llvm.org/D152558
2023-06-22 16:49:40 +00:00
Kazu Hirata
c97ab93dca [CodeGen] Fix a warning
This patch fixes:

  llvm/lib/CodeGen/ComplexDeinterleavingPass.cpp:1790:3: error:
  default label in switch which covers all enumeration values
  [-Werror,-Wcovered-switch-default]
2023-06-14 10:53:11 -07:00
Igor Kirillov
2cbc265cc9 [CodeGen] Add support for reductions in ComplexDeinterleaving pass
This commit enhances the ComplexDeinterleaving pass to handle unordered
reductions in simple one-block vectorized loops, supporting both
SVE and Neon architectures.

Differential Revision: https://reviews.llvm.org/D152022
2023-06-14 17:27:26 +00:00
Igor Kirillov
1a1e76100e [CodeGen] Improve handling -Ofast generated code by ComplexDeinterleaving pass
Code generated with -Ofast and -O3 -ffp-contract=fast (add
-ffinite-math-only to enable vectorization) can differ significantly.
Code compiled with -O3 can be deinterleaved using patterns as the
instruction order is preserved. However, with the -Ofast flag, there
can be multiple changes in the computation sequence, and even the real
and imaginary parts may not be calculated in parallel.
For more details, refer to
llvm/test/CodeGen/AArch64/complex-deinterleaving-*-fast.ll and
llvm/test/CodeGen/AArch64/complex-deinterleaving-*-contract.ll tests.
This patch implements a more general approach and enables handling most
-Ofast cases.

Differential Revision: https://reviews.llvm.org/D148558
2023-05-31 18:31:38 +00:00
Igor Kirillov
40a81d3100 [CodeGen] Refactor IR generation functions to use IRBuilder in ComplexDeinterleaving pass
This patch updates several functions in LLVM's IR generation code to accept
an IRBuilder object as an argument, rather than an Instruction that indicates
the insertion point for new instructions.
This change is necessary to handle sophisticated -Ofast optimization cases
from D148558 where it's unclear which instructions should be used as the
insertion point for new operations.

Differential Revision: https://reviews.llvm.org/D148703
2023-05-30 16:18:28 +00:00
Elliot Goodrich
ac73c48e09 [llvm] Reduce ComplexDeinterleavingPass.h includes
Remove the unnecessary `"llvm/IR/PatternMatch.h"` include directive from
`ComplexDeinterleavingPass.h` and move it to the corresponding source
file.

Add missing includes that were transitively included by this header to 3
other source files.

This reduces the total number of preprocessing tokens across the LLVM
source files in `lib` from (roughly) 1,964,876,961 to 1,935,091,611 - a
reduction of ~1.52%. This should result in a small improvement in
compilation time.
2023-05-20 17:49:18 +01:00
Elliot Goodrich
b7fb2a3fec Revert "[llvm] Reduce ComplexDeinterleavingPass.h includes"
This reverts commit 058ca5c07106d38ad66e3ec4972a613a64e88151.
2023-05-20 14:21:07 +01:00
Elliot Goodrich
058ca5c071 [llvm] Reduce ComplexDeinterleavingPass.h includes
Remove the unnecessary `"llvm/IR/PatternMatch.h"` include directive from
`ComplexDeinterleavingPass.h` and move it to the corresponding source
file.

Add missing includes that were transitively included by this header to 2
other source files.

This reduces the total number of preprocessing tokens across the LLVM
source files in `lib` from (roughly) 1,964,876,961 to 1,935,091,611 - a
reduction of ~1.52%. This should result in a small improvement in
compilation time.

Differential Revision: https://reviews.llvm.org/D150514
2023-05-20 13:36:50 +01:00
Igor Kirillov
6850bc35c6 [CodeGen] Enable AArch64 SVE FCMLA/FCADD instruction generation in ComplexDeinterleaving
This commit adds support for scalable vector types in theComplexDeinterleaving
pass, allowing it to recognize and handle `llvm.vector.interleave2` and
`llvm.vector.deinterleave2` intrinsics for both fixed and scalable vectors

Differential Revision: https://reviews.llvm.org/D147451
2023-04-21 09:58:35 +00:00
Akshay Khadse
aab0ca3e79 Fix uninitialized scalar members in CodeGen
This change fixes some static code analysis warnings.

Reviewed By: LuoYuanke

Differential Revision: https://reviews.llvm.org/D148811
2023-04-21 12:22:34 +08:00
Igor Kirillov
c692e87ab8 [CodeGen] Enable processing of interconnected complex number operations
With this patch, ComplexDeinterleavingPass now has the ability to handle
any number of interconnected operations involving complex numbers.
For example, the patch enables the processing of code like the following:

for (int i = 0; i < 1000; ++i) {
    a[i] =  w[i] * v[i];
    b[i] =  w[i] * u[i];
}

This code has multiple arrays containing complex numbers and a common
subexpression `w` that appears in two expressions.

Differential Revision: https://reviews.llvm.org/D146988
2023-04-18 13:05:49 +00:00
Akshay Khadse
8bf7f86d79 Fix uninitialized pointer members in CodeGen
This change initializes the members TSI, LI, DT, PSI, and ORE pointer feilds of the SelectOptimize class to nullptr.

Reviewed By: LuoYuanke

Differential Revision: https://reviews.llvm.org/D148303
2023-04-17 16:32:46 +08:00
David Green
dc764a2e2d [ComplexDeinterleaving] Propagate fast math flags to symmetric operations.
This is a simple patch to make sure fast math flags are propagated through to
the newly created symmetric operations, which can help with later
simplifications.

Differential Revision: https://reviews.llvm.org/D146409
2023-03-28 12:12:02 +01:00
Nicholas Guy
96615c856d [Codegen][ARM][AArch64] Support symmetric operations on complex numbers
Differential Revision: https://reviews.llvm.org/D142482
2023-03-14 12:11:10 +00:00
Nicholas Guy
49384f1411 Cleanup of Complex Deinterleaving pass (NFCI)
Differential Revision: https://reviews.llvm.org/D143177
2023-03-14 12:11:09 +00:00
Kazu Hirata
4501133d96 Ensure newlines at the end of files (NFC) 2022-12-16 23:36:51 -08:00
Nicholas Guy
a3dc5b534a [ARM][CodeGen] Add integer support for complex deinterleaving
Differential Revision: https://reviews.llvm.org/D139628
2022-12-12 11:38:19 +00:00
Nicholas Guy
d52e2839f3 [ARM][CodeGen] Add support for complex deinterleaving
Adds the Complex Deinterleaving Pass implementing support for complex numbers in a target-independent manner, deferring to the TargetLowering for the given target to create a target-specific intrinsic.

Differential Revision: https://reviews.llvm.org/D114174
2022-11-14 14:02:27 +00:00