12 Commits

Author SHA1 Message Date
Valentin Clement
208a4510d4
[flang][NFC] Fix typo 2023-11-17 10:54:45 -08:00
Mats Petersson
8dcee5800c
[flang]Check for dominance in loop versioning (#68797)
This avoids trying to version loops that can't be versioned, and thus
avoids hitting an assert.

Co-authored with Slava Zakharin (who provided the test-code).
2023-10-12 13:07:16 +01:00
Slava Zakharin
7beb65ae2d
[flang] Fixed LoopVersioning for array slices. (#65703)
The first test case added in the LIT test demonstrates the problem.
Even though we did not consider the inner loop as a candidate for
the transformation due to the array_coor with a slice, we decided to
version the outer loop for the same function argument.
During the cloning of the outer loop we dropped the slicing completely
producing invalid code.

I restructured the code so that we record all arg uses that cannot be
transformed (regardless of the reason), and then fixup the usage
information across the loop nests. I also noticed that we may generate
redundant contiguity checks for the inner loops, so I fixed it
since it was easy with the new way of keeping the usage data.
2023-09-08 09:01:10 -07:00
Tom Eccles
ad9af7de90 [flang][LoopVersioning] support fir.array_coor
This is the last piece required for the loop versioning patch to work on
code lowered via HLFIR. With this patch, HLFIR performance on spec2017
roms is now similar to the FIR lowering.

Adding support for fir.array_coor means that many more loops will be
versioned, even in the FIR lowering. So far as I have seen, these do not
seem to have an impact on performance for the benchmarks I tried, but I
expect it would speed up some programs, if the loop being versioned
happened to be the hot code.

The main difference between fir.array_coor and fir.coordinate_of is
that fir.coordinate_of uses zero-based indices, whereas fir.array_coor
uses the indices as specified in the Fortran program (starting from 1 by
default, but also supporting non default lower bounds). I opted to
transform fir.array_coor operations into fir.coordinate_of operations
because this allows both to share the same offset calculation logic.

The tricky bit of this patch is getting the correct lower bounds for the
array operand to subtract from the fir.array_coor indices to get a
zero-based indices. So far as I can tell, the FIR lowering will always
provide lower bounds (shift) information in the shape operand to the
fir.array_coor when non-default lower bounds are used. If none is given,
I originally tried falling back to reading lower bounds from the box,
but this led to misscompilation in SPEC2017 cam4. Therefore the pass
instead assumes that if it can't already find an SSA value for the shift
information, the default lower bound (1) should be used.

A suspect the incorrect lower bounds in the box for the FIR lowering was
already a known issue (see https://reviews.llvm.org/D158119).

Differential Revision: https://reviews.llvm.org/D158597
2023-09-04 10:40:40 +00:00
Slava Zakharin
cccf4d6e4a [flang] Skip OPTIONAL arguments in LoopVersioning.
This patch fixes multiple tests failing with segfault due to accessing
absent argument box before the loop versioning check.
The absent arguments might be treated as contiguous for the purpose
of loop versioning, but this is not done in this patch.

Reviewed By: PeteSteinfeld

Differential Revision: https://reviews.llvm.org/D158800
2023-08-25 08:33:49 -07:00
Tom Eccles
8d24b7322e [flang][LoopVersioning] support reboxed operands
Since https://reviews.llvm.org/D158119, many boxes lowered via HLFIR are
reboxed with better lower bounds information after they are declared.

For the loop versioning pass to support FIR lowered via HLFIR, it needs
to dereference fir.rebox operations to figure out that the variable was
a function argument.

I decided to modify the existing dereferencing of fir.declare so that
the declared/reboxed value is used in the versioned loop instead of the
function argument. This makes it easier for the improved lower bounds
information to be accessed. In doing this, I changed ArgInfo to store
ArgInfo::arg by value instead of by pointer because mlir::Value has
value-type semantics.

Differential Revision: https://reviews.llvm.org/D158408
2023-08-23 09:53:05 +00:00
Slava Zakharin
668f261bfa [flang] Make ISO_Fortran_binding.h a standalone header again.
This implements the proposal from
https://discourse.llvm.org/t/adding-flang-specific-header-files-to-clang/72442/6
Since ISO_Fortran_binding.h is supposed to be included from users'
C/C++ codes, it would better have no dependencies on other header
files.

Reviewed By: PeteSteinfeld

Differential Revision: https://reviews.llvm.org/D158549
2023-08-22 18:56:27 -07:00
Tom Eccles
05011024fd [flang][LoopVersioning] support fir.declare
When FIR comes from HLFIR, there will be a fir.declare operation between
the source and the usage of each source variable (and some temporary
allocations). This pass needs to be able to follow these so that it can
still transform loops when HLFIR is used, otherwise it mistakenly
assumes these values are not function arguments.

More work is needed after this patch to fully support HLFIR, because the
generated code tends to use fir.array_coor instead of fir.coordinate_of.

Differential Revision: https://reviews.llvm.org/D157964
2023-08-18 09:51:22 +00:00
Tom Eccles
53cc33b00b [flang] Store KindMapping by value in FirOpBuilder
Previously only a constant reference was stored in the FirOpBuilder.
However, a lot of code was merged using

FirOpBuilder builder{rewriter, getKindMapping(mod)};

This is incorrect because the KindMapping returned will go out of scope
as soon as FirOpBuilder's constructor had run. This led to an infinite
loop running some tests using HLFIR (because the stack space containing
the kind mapping was re-used and corrupted).

One solution would have just been to fix the incorrect call sites,
however, as a large number of these had already made it past review, I
decided to instead change FirOpBuilder to store its own copy of the
KindMapping. This is not costly because nearly every time we construct a
KindMapping is exclusively to construct a FirOpBuilder. To make this
common pattern simpler, I added a new constructor to FirOpBuilder which
calls getKindMapping().

Differential Revision: https://reviews.llvm.org/D151881
2023-06-05 09:57:57 +00:00
Mats Petersson
b812932b35 [FLANG] Change loop versioning to use shift instead of divide
Despite me being convinced that the use of divide didn't produce any
divide instructions, it does in fact add more instructions than using
a plain shift operation.

This patch simply changes the divide to a shift right, with an
assert to check that the "divisor" is a power of two.

Reviewed By: kiranchandramohan, tblah

Differential Revision: https://reviews.llvm.org/D151880
2023-06-01 19:29:57 +01:00
Mats Petersson
b75f9ce3fe [FLANG] Support all arrays for LoopVersioning
This patch makes more than 2D arrays work, with a fix for the way that
loop index is calculated. Removing the restriction of number of
dimensions.

This also changes the way that the actual index is calculated, such that
the stride is used rather than the extent of the previous dimension. Some
tests failed without fixing this - this was likely a latent bug in the
2D version too, but found in a test using 3D arrays, so wouldn't
have been found with 2D only. This introduces a division on the index
calculation - however it should be a nice and constant value allowing
a shift to be used to actually divide - or otherwise removed by using
other methods to calculate the result. In analysing code generated with
optimisation at -O3, there are no divides produced.

Some minor refactoring to avoid repeatedly asking for the "rank" of the
array being worked on.

This improves some of the SPEC-2017 ROMS code, in the same way as the
limited 2D array improvements - less overhead spent calculating array
indices in the inner-most loop and better use of vector-instructions.

Reviewed By: kiranchandramohan

Differential Revision: https://reviews.llvm.org/D151140
2023-05-30 18:54:40 +01:00
Mats Petersson
a716ace13d Add loop-versioning pass to improve unit-stride
Introduce conditional code to identify stride of "one element", and simplify the array accesses for that case.

This allows better loop performance in various benchmarks.

Reviewed By: tblah, kiranchandramohan

Differential Revision: https://reviews.llvm.org/D141306
2023-04-18 09:53:07 +01:00