The I/O runtime's API allows -1 to be passed for a unit number in a
READ, WRITE, or PRINT statement, where it gets replaced by 5 or 6 as
appropriate. This turns out to have been a bad idea, as it prevents the
I/O runtime from detecting and reporting a program's invalid attempt to
use -1 as an I/O unit number. So just pass 5 or 6 as appropriate.
`llvm.fcmp` does support fast math attributes therefore so should
`arith.cmpf`.
The heavy churn in flang tests are because flang sets
`fastmath<contract>` by default on all operations that support the fast
math interface. Downstream users of MLIR should not be so effected.
This was requested in https://github.com/llvm/llvm-project/issues/74263
Patch 2/3 of the transition step 1 described in
https://discourse.llvm.org/t/rfc-enabling-the-hlfir-lowering-by-default/72778/7.
All the modified tests are still here since coverage for the direct
lowering to FIR was still needed while it was default. Some already have
an HLFIR version, some have not and will need to be ported in step 2
described in the RFC.
Note that another 147 lit tests use -emit-fir/-emit-llvm outputs but do
not need a flag since the HLFIR/no HLFIR output is the same for what is
being tested.
Change the separator in the `uniqueCGIdent` method to `X`. This change
is required to enable OpenMP offloading for the NVPTX target, as dots
are not valid identifiers in PTX and `uniqueCGIdent` is used to mangle
some literals. Follow up patches will change the remainder of `.`
appearances in names to `X` and add support for the NVPTX target.
Currently the local builder used in IntrinsicCall doesn't have the
fastmath flags passed to it. This results in the fastmath attribute
not being added to certain runtime calls. This patch simply forwards
the fastmath flags from the parent builder.
Differential Revision: https://reviews.llvm.org/D154611
The global names were created using a hash based on the address
of std::vector::data address. Since the memory may be reused
by different std::vector's, this may cause non-equivalent
constant expressions to map to the same name. This is what is happening
in the modified flang/test/Lower/constant-literal-mangling.f90 test.
I changed the name creation to use a map between the constant expressions
and corresponding unique names. The uniquing is done using a name counter
in FirConverter. The effect of this change is that the equivalent
constant expressions are now mapped to the same global, and the naming
is "stable" (i.e. it does not change from compilation to compilation).
Though, the issue is not HLFIR specific it was affecting several tests
when using HLFIR lowering.
Differential Revision: https://reviews.llvm.org/D150380
Following RFC at
https://discourse.llvm.org/t/rfc-ffp-contract-default-value/66301
This adds the `fastmath<contract>` attribute to `fir.call` and some
floating point arithmetic operations (hence the many test changes).
Instead of testing for this specific attribute, I am using a regular
expression to match any attributes.
Currently, lowering is promoting main program array and character
variables that are not saved into static memory.
This causes issues with equivalence initial value images because
semantics is relying on IsSaved to build the initial value of variables
in static memory. It seems more robust to have IsSaved be the place
deciding if a variable needs to be in static memory (except for common
block members).
Move the logic to decide if a main program variable must be in static
memory into evaluate::IsSaved and add two options to semantics to
replace the llvm options that were used in lowering:
- SaveMainProgram (off by default): save all main program variables.
- SaveBigMainProgramVariables (on by default): save all main program
variables that are bigger than 32 bytes.
The first options is required to run a few old programs that expect all
main program variables to be in bss (and therefore zero initialized).
The second option is added to allow performance testing: placing big
arrays in static memory seems a sane default to avoid blowing up the
stack with old programs that define big local arrays in the main
program, but since it is easier to prove that an alloca does not
escape/is not modified by calls, keeping big arrays on the stack could
yield improvements.
The logic of SaveBigMainProgramVariables is slightly changed compared to what
it was doing in lowering. The old code was placing all arrays and all
explicit length characters in static memory.
The new code is placing everything bigger than 32 bytes in static
memory. This has the advantages of being a simpler logic, and covering
the cases of scalar derived type with big array components or many
components. Small strings and arrays are now left on the stack (after
all, a character(1) can fit in register).
Note: I think it could have been nicer to add a single "integer" option
to set a threshold to place main program variables in static memory so
that this can be fine tuned by the drivers (SaveMainProgram would be
implemented by setting it to zero). But the language feature options are
not meant to carry integer options. Extending it for this seems an
overkill precedent, and placing it in SemanticsContext is weird (it is
a too low level option to be a bare member of SemanticsContext in my
opinion). So I just rolled my own dices and picked 32 for the sake of
simplicity.
Differential Revision: https://reviews.llvm.org/D134735
Instead of using the IV block argument of the do-loop we will use
the do-variable value loaded from its location. This usage is consistent
with other uses of the do-variable inside the loop.
Differential Revision: https://reviews.llvm.org/D133140
Keep the original data type of integer do-variables
for structured loops. When do-variable's data type
is an integer type shorter than IndexType, processing
the do-variable separately from the DoLoop's iteration index
allows getting rid of type casts, which can make backend
optimizations easier.
For example,
```
do i = 2, n-1
do j = 2, n-1
... = a(j-1, i)
end do
end do
```
If value of 'j' is computed by casting the DoLoop's iteration
index to 'i32', then Flang will produce the following LLVM IR:
```
%1 = trunc i64 %iter_index to i32
%2 = sub i32 %1, 1
%3 = sext i32 %2 to i64
```
LLVM's InstCombine may try to get rid of the sign extension,
and may transform this into:
```
%1 = shl i64 %iter_index, 32
%2 = add i64 %1, -4294967296
%3 = ashr exact i64 %2, 32
```
The extra computations for the element address applied on top
of this awkward pattern confuse LLVM vectorizer so that
it does not recognize the unit-strided access of 'a'.
Measured performance improvements on `SPEC CPU2000@IceLake`:
```
168.wupwise: 11.96%
171.swim: 11.22%
172.mrgid: 56.38%
178.galgel: 7.29%
301.apsi: 8.32%
```
Differential Revision: https://reviews.llvm.org/D132176
As Fortran 2018 16.9.163, the reshape is the only intrinsic which
requires the shape argument to be rank-one integer array and the SIZE
of it to be one constant expression. The current expression lowering
converts the shape expression with slice in intrinsic into one box value
with the box element type of unknown extent. However, the genReshape
requires the box element type to be constant size. So, convert the box
value into one with box element type of sequence of 1 x constant. This
corner case is found in cam4 in SPEC 2017
https://github.com/llvm/llvm-project/issues/56140.
Reviewed By: Jean Perier
Differential Revision: https://reviews.llvm.org/D128597
Added new -lower-math-early option that defaults to 'true' that matches
the current math lowering scheme. If set to 'false', the intrinsic math
operations will be lowered to MLIR operations, which should potentially
enable more MLIR optimizations, or libm calls, if there is no corresponding
MLIR operation exists or if "precise" mode is requested.
The generated math MLIR operations are then converted to LLVM dialect
during codegen phase.
The -lower-math-early option is not exposed to users currently. I plan to
get rid of the "early" lowering completely, when "late" lowering
is robust enough to support all math intrinsics that are currently
supported via pgmath. So "late" mode will become default and -lower-math-early
option will not be needed. This will effectively eliminate the mandatory
dependency on pgmath in Fortran lowering, but this is WIP.
Differential Revision: https://reviews.llvm.org/D128385
These tests were left behind during the upstreaming of parts lowering.
This patch is part of the upstreaming effort from fir-dev branch.
Reviewed By: jeanPerier
Differential Revision: https://reviews.llvm.org/D128632
Co-authored-by: V Donaldson <vdonaldson@nvidia.com>
Co-authored-by: Jean Perier <jperier@nvidia.com>
Co-authored-by: Eric Schweitz <eschweitz@nvidia.com>