Instead of using the IV block argument of the do-loop we will use
the do-variable value loaded from its location. This usage is consistent
with other uses of the do-variable inside the loop.
Differential Revision: https://reviews.llvm.org/D133140
Keep the original data type of integer do-variables
for structured loops. When do-variable's data type
is an integer type shorter than IndexType, processing
the do-variable separately from the DoLoop's iteration index
allows getting rid of type casts, which can make backend
optimizations easier.
For example,
```
do i = 2, n-1
do j = 2, n-1
... = a(j-1, i)
end do
end do
```
If value of 'j' is computed by casting the DoLoop's iteration
index to 'i32', then Flang will produce the following LLVM IR:
```
%1 = trunc i64 %iter_index to i32
%2 = sub i32 %1, 1
%3 = sext i32 %2 to i64
```
LLVM's InstCombine may try to get rid of the sign extension,
and may transform this into:
```
%1 = shl i64 %iter_index, 32
%2 = add i64 %1, -4294967296
%3 = ashr exact i64 %2, 32
```
The extra computations for the element address applied on top
of this awkward pattern confuse LLVM vectorizer so that
it does not recognize the unit-strided access of 'a'.
Measured performance improvements on `SPEC CPU2000@IceLake`:
```
168.wupwise: 11.96%
171.swim: 11.22%
172.mrgid: 56.38%
178.galgel: 7.29%
301.apsi: 8.32%
```
Differential Revision: https://reviews.llvm.org/D132176
1. Remove the redundant collapse clause in MLIR OpenMP worksharing-loop
operation.
2. Fix several typos.
3. Refactor the chunk size type conversion since CreateSExtOrTrunc has
both type check and type conversion.
Reviewed By: kiranchandramohan
Differential Revision: https://reviews.llvm.org/D128338
As per issue #1196, the loop induction variable, which is an argument
in the omp.wsloop operation, does not have a memory location, so when
passed to a function or subroutine, the reference to the value is not
a memory location, but the value of the induction variable. The callee
function/subroutine is then trying to dereference memory at address 1
or some other "not a good memory location".
This is fixed by creating a temporary memory location and storing the
value of the induction variable in that.
Test fixes as a consequence of the changed code generated.
Add checking for some of the omp-unstructured.f90 to check for alloca,
store and load operations, to ensure the correct flow. Add a test
for CYCLE inside a omp-do loop.
Also convert to use -emit-fir in the omp-unstructrued, and make
the symbol matching consistent in the omp-wsloop-variable test.
Reviewed By: peixin
Differential Revision: https://reviews.llvm.org/D126711
The types of lower bound, upper bound, and step are converted into the
type of the loop variable if necessary. OpenMP runtime requires 32-bit
or 64-bit loop variables. OpenMP loop iteration variable cannot have
more than 64 bits size and will be narrowed.
This patch is part of upstreaming code from the fir-dev branch of
https://github.com/flang-compiler/f18-llvm-project. (#1256)
Co-authored-by: kiranchandramohan <kiranchandramohan@gmail.com>
Reviewed By: kiranchandramohan, shraiysh
Differential Revision: https://reviews.llvm.org/D125740