Defined I/O subroutines have UNIT= and IOSTAT= dummy arguments that are
required to have type INTEGER with its default kind. When that default
kind is modified via -fdefault-integer-8, calls to defined I/O
subroutines from the runtime don't work.
Add a flag to the two data structures shared between the compiler and
the runtime support library to indicate that a defined I/O subroutine
was compiled under -fdefault-integer-8. This has been done in a
compatible manner, so that existing binaries are compatible with the new
library and new binaries are compatible with the old library, unless of
course -fdefault-integer-8 is used.
Fixes https://github.com/llvm/llvm-project/issues/148638.
When parsing a specification part, the parser will look ahead to see if
the next construct is an executable construct. In doing so it will
invoke the OpenMPConstruct parser, whereas the only necessary thing to
check would be the directive alone.
`__has_builtin` is not available on all compilers. Make sure it works
when not defined.
Also fix some formatting issues:
- Use brace initialization where possible
- Fix wrong capitalization of variables.
- Add `std::` for `uint64_t` and `int64_t`, as is mostly done in this
part of the codebase.
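A minimal sketch of the usual guard pattern, assuming the goal is simply to make `__has_builtin(x)` evaluate to 0 on compilers that do not provide it. The helper name `CountLeadingZeros` and the choice of `__builtin_clzll` are illustrative only and are not taken from the patch:

```cpp
// Fall back to "builtin not available" on compilers without __has_builtin.
#ifndef __has_builtin
#define __has_builtin(x) 0
#endif

#include <cstdint>

// Hypothetical helper: counts the leading zero bits of a 64-bit value.
// Uses the compiler builtin when it is known to exist, and a portable
// loop otherwise.
inline int CountLeadingZeros(std::uint64_t x) {
  if (x == 0) {
    return 64; // the builtin is undefined for zero, so handle it here
  }
#if __has_builtin(__builtin_clzll)
  return __builtin_clzll(x);
#else
  int n{0};
  for (std::uint64_t mask{std::uint64_t{1} << 63}; (x & mask) == 0;
       mask >>= 1) {
    ++n;
  }
  return n;
#endif
}
```

Writing `#if defined(__has_builtin) && __has_builtin(x)` on a single line does not help, because the preprocessor still has to parse the second operand when the macro is undefined; defining the fallback first avoids that.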
Interpret TRANSFER(SOURCE=BOZ literal, MOLD=integer or real scalar) as
if it had been a reference to the standard INT or REAL intrinsic
functions, for which a BOZ literal is an acceptable argument, with a
warning about non-conformance. It's a needless extension that has
somehow crept into some other compilers and real applications.
When blank tokens arise from macro replacement in token sequences with
token pasting (##), the preprocessor is producing some bogus tokens
(e.g., "name(") that can lead to subtle bugs later when macro names are
not recognized as such.
The fix is to not paste tokens together when the result would not be a
valid Fortran or C token in the preprocessing context.
The recipe body generation was moved from lowering into FIR's
implementation of the MappableType API. Since all Fortran variable
types now implement this interface, lowering of OpenACC was updated to
use this API directly. No test changes were needed: all of the private,
firstprivate, and recipe tests get the same body as before.
This patch allows optimizing redundant array repacking, when
the source array is statically known to be contiguous.
This is part of the implementation plan for the array repacking
feature, though it does not affect any real-life use case
as long as FIR inlining is not a thing. I experimented with
simple cases of FIR inlining using `-inline-all`, and I recorded
these cases in optimize-array-repacking.fir tests.
In order to create temporary copies of assumed-type arrays
(e.g. for `-frepack-arrays`), we have to allow the source_box
to be a !fir.box.
This patch replaces #147618.
This reverts commit e8e5d07767c444913f837dd35846a92fcf520eab.
This previously failed because the flang-rt build could not find the
llvm header file. It passed on some machines but only because they
had globally installed copies of older llvm.
To fix this, I've copied the required routines from llvm into flang.
With the following justification:
* Flang can, and does, use llvm headers.
* Some Flang headers are also used in Flang-rt.
* Flang-rt cannot use llvm headers.
* Therefore any Flang header used in Flang-rt cannot use llvm headers
either.
To support that conclusion,
https://flang.llvm.org/docs/IORuntimeInternals.html
states:
"The Fortran I/O runtime support library is written in C++17, and uses
some C++17 standard library facilities, but it is intended to not have
any link-time dependences on the C++ runtime support library or any LLVM
libraries."
This talks about libraries but I assume it applies to llvm in general.
Nothing in flang/include/flang/Common, or flang/include/flang/Common
includes any llvm header, and I see some very similar headers there
that duplicate llvm functionality. Like float128.h.
I can only assume this means these files must remain free of
dependencies on LLVM.
I have copied the two routines literally and put them in the
flang::common namespace, for lack of a better place for them, so that
they don't clash with anything.
I have specialised the function to the 1 type flang needs, as it might
save a bit of compile time.
Dispatch is the last construct (after ATOMIC and ALLOCATORS) where the
associated block requires a specific form.
Using OmpDirectiveSpecification for the begin and the optional end
directives will make the structure of all block directives more uniform.
The ALLOCATORS construct is one of the few constructs that require a
special form of the associated block.
Convert the AST node to use OmpDirectiveSpecification for the directive
and the optional end directive, and to use parser::Block as the body:
the form of the block is checked in the semantic checks (with a more
meaningful message).
This PR proposes re-modelling `reduce` specifiers to match OpenMP and
OpenACC. In particular, this PR includes the following:
* A new `fir` op: `fir.declare_reduction` which is identical to OpenMP's
`omp.declare_reduction` op.
* Updating the `reduce` clause on `fir.do_concurrent.loop` to use the
new op.
* Re-uses the `ReductionProcessor` component to emit reductions for `do
concurrent` just like we do for OpenMP. To do this, the
`ReductionProcessor` had to be refactored to be more generalized.
* Updates mapping `do concurrent` to `fir.loop ... unordered` nests using
the new reduction model.
Unfortunately, this is a big PR that would be difficult to divide into
smaller parts because at the bottom of the changes are the `fir`
table-gen changes to `do concurrent`, and these MLIR changes cascade to
the other parts, which have to be modified to not break things.
This PR goes in the same direction we went for `private/local`
specifiers. Now the `do concurrent` and OpenMP (and OpenACC) dialects
are modelled in essentially the same way, which hopefully makes mapping
between them more straightforward.
PR stack:
- https://github.com/llvm/llvm-project/pull/145837 (this one)
- https://github.com/llvm/llvm-project/pull/146025
- https://github.com/llvm/llvm-project/pull/146028
- https://github.com/llvm/llvm-project/pull/146033
OpenMP 6.0 introduced alternative spelling for some directives, with the
previous spellings still allowed.
Warn the user when a new spelling is encountered with OpenMP version set
to an older value.
After the recent move to work queues, linking in the Fortran runtime
built for offload on AMDGPU can, in certain cases, fail with missing
symbols. This PR tries to address this issue by encompassing more of the
library in RT_OFFLOAD_API_GROUP_BEGIN, which has the effect of compiling
these functions for AMDGPU, resolving the missing symbols.
This PR should address the following issue:
https://github.com/llvm/llvm-project/issues/145888
In the case of nested loops, `acc.loop` is meant to subsume all of the
loops that it applies to (when explicitly described as doing so in the
OpenACC specification). So when there is an `acc loop tile(...)` present
on nested Fortran DO loops, `acc.loop` should apply to the `n` loops
that `tile` applies to. This change lowers such nested Fortran loops
with tile clause into a collapsed `acc.loop` with `n` IVs, loop bounds,
and step, in a similar fashion to the current lowering for acc loops
with `collapse` clause.
The test flang/test/Semantics/io08.f90 was failing when UBSAN was
enabled:
```
/home/david.spickett/llvm-project/flang/include/flang/Common/format.h:224:26: runtime error: signed integer overflow: 10 * 987654321098765432 cannot be represented in type 'int64_t' (aka 'long')
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior /home/david.spickett/llvm-project/flang/include/flang/Common/format.h:224:26
```
This is because the code was effectively:
* Take the risk of UB happening
* Check whether it happened or not
Which UBSAN is obviously not going to like. Instead of checking after
the fact, use llvm's helpers that catch overflow without actually doing
it.
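The "check before computing" idea can be sketched as follows. This is a standalone illustration, not the LLVM helpers the patch uses; the names `AppendDigit` and `ParseDecimal` are hypothetical:

```cpp
#include <cstdint>
#include <limits>
#include <optional>
#include <string_view>

// Returns value * 10 + digit, or std::nullopt if that would not fit in
// int64_t. The range test runs before the arithmetic, so the signed
// overflow never actually happens.
inline std::optional<std::int64_t> AppendDigit(
    std::int64_t value, int digit) {
  constexpr std::int64_t max{std::numeric_limits<std::int64_t>::max()};
  if (value < 0 || digit < 0 || digit > 9) {
    return std::nullopt;
  }
  if (value > (max - digit) / 10) {
    return std::nullopt; // value * 10 + digit would exceed max
  }
  return value * 10 + digit;
}

// Parses a non-empty string of decimal digits, reporting overflow as
// std::nullopt instead of invoking undefined behaviour.
inline std::optional<std::int64_t> ParseDecimal(std::string_view text) {
  if (text.empty()) {
    return std::nullopt;
  }
  std::int64_t value{0};
  for (char ch : text) {
    if (ch < '0' || ch > '9') {
      return std::nullopt;
    }
    auto next{AppendDigit(value, ch - '0')};
    if (!next) {
      return std::nullopt;
    }
    value = *next;
  }
  return value;
}
```

With this shape, the multiplication from the UBSAN report above (`10 * 987654321098765432`) is rejected by the range test and never evaluated.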
This patch adds an option to select the method for computing complex
number division. It uses `LoweringOptions` to determine whether to lower
complex division to a runtime function call or to MLIR's `complex.div`,
and `CodeGenOptions` to select the computation algorithm for
`complex.div`. The available option values and their corresponding
algorithms are as follows:
- `full`: Lower to a runtime function call. (Default behavior)
- `improved`: Lower to `complex.div` and expand to Smith's algorithm.
- `basic`: Lower to `complex.div` and expand to the algebraic algorithm.
See also the discussion in the following discourse post:
https://discourse.llvm.org/t/optimization-of-complex-number-division/83468
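For readers unfamiliar with the two expansions named above, here is a rough standalone sketch of the textbook algebraic formula and Smith's algorithm on `std::complex<double>`. It illustrates the arithmetic only; it is not the MLIR expansion itself, the function names are hypothetical, and the runtime routine used by `full` is not shown:

```cpp
#include <cmath>
#include <complex>

using Complex = std::complex<double>;

// Algebraic formula: multiply by the conjugate of the divisor.
// The intermediate products and c*c + d*d can overflow or underflow
// for inputs of extreme magnitude.
inline Complex DivideAlgebraic(Complex x, Complex y) {
  double a{x.real()}, b{x.imag()}, c{y.real()}, d{y.imag()};
  double den{c * c + d * d};
  return {(a * c + b * d) / den, (b * c - a * d) / den};
}

// Smith's algorithm: scale by the ratio of the divisor's smaller
// component to its larger one, which keeps intermediates closer to the
// range of the inputs.
inline Complex DivideSmith(Complex x, Complex y) {
  double a{x.real()}, b{x.imag()}, c{y.real()}, d{y.imag()};
  if (std::fabs(c) >= std::fabs(d)) {
    double r{d / c};
    double den{c + d * r};
    return {(a + b * r) / den, (b - a * r) / den};
  } else {
    double r{c / d};
    double den{c * r + d};
    return {(a * r + b) / den, (b * r - a) / den};
  }
}
```

Both functions agree on ordinary inputs such as (1+2i)/(3+4i). They differ when the components are large: dividing (1e200, 1e200) by itself overflows the intermediates of the algebraic form and yields NaN, while the Smith form returns (1, 0).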
---------
Co-authored-by: Tarun Prabhu <tarunprabhu@gmail.com>
If a `do concurrent` loop is offloaded then there should be no CUDA data
transfer in it. Update semantics and lowering to take that into
account.
`AssignmentChecker` has to be put into a separate pass because the
checkers in `SemanticsVisitor` cannot have the same `Enter/Leave`
functions. The `DoForallChecker` already has `Enter/Leave` functions
for the `DoConstruct`.
If the size of the other Interval was 0, (that.size_ - 1) would wrap
below zero.
I've fixed this so that a zero-size interval A is within interval B if
the start of A is within B. There are a few ways you could handle
zero-sized intervals in theory, but this one passes all tests so I assume
it's the intention.
This fixes the following tests when ubsan is enabled:
Flang :: Lower/OpenMP/PFT/sections-pft.f90
Flang :: Lower/OpenMP/derived-type-allocatable.f90
Flang :: Lower/OpenMP/privatization-proc-ptr.f90
Flang :: Lower/OpenMP/sections.f90
Flang :: Parser/OpenMP/sections.f90
Flang :: Semantics/OpenMP/clause-validity01.f90
Flang :: Semantics/OpenMP/if-clause.f90
Flang :: Semantics/OpenMP/parallel-sections01.f90
Flang :: Semantics/OpenMP/private-assoc.f90
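The zero-size rule described above can be sketched with a simplified stand-in for the interval class. This is an assumption-laden illustration, not the actual Flang code: positions here are plain integers, so the `size_ - 1` wrap would be well-defined but would still produce a meaningless end position.

```cpp
#include <cstddef>

// Simplified, hypothetical stand-in for an interval [start, start + size).
class Interval {
public:
  Interval(std::size_t start, std::size_t size)
      : start_{start}, size_{size} {}

  // True when position x lies inside this interval.
  bool Contains(std::size_t x) const {
    return x >= start_ && x - start_ < size_;
  }

  // True when 'that' lies inside this interval. An empty 'that' is
  // treated as contained when its start position is contained, which
  // avoids ever computing that.size_ - 1 for a zero size.
  bool Contains(const Interval &that) const {
    if (that.size_ == 0) {
      return Contains(that.start_);
    }
    return Contains(that.start_) &&
        Contains(that.start_ + (that.size_ - 1));
  }

private:
  std::size_t start_;
  std::size_t size_;
};
```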
The behaviour of strncmp is undefined if either string pointer is null
(https://en.cppreference.com/w/cpp/string/byte/strncmp.html).
I've copied the logic over from Compare to another CharBlock, which had
code to avoid UB in memcmp.
The test Preprocessing/kind-suffix.F90 was failing with UBSAN enabled,
and now passes.
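One way to keep null pointers away from the C string functions is to settle the empty cases by length first. The sketch below is a hypothetical helper, not the CharBlock code itself; it treats a null pointer as an empty string and sidesteps `strncmp` by comparing lengths and then calling `memcmp` on non-null data only:

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>

// Three-way comparison of a (pointer, length) block against a
// NUL-terminated string. Returns -1, 0, or 1. Null pointers are treated
// as empty strings (an assumption of this sketch), and no library call
// ever receives a null pointer.
inline int CompareBlockToCString(
    const char *block, std::size_t bytes, const char *that) {
  std::size_t thatBytes{that ? std::strlen(that) : 0};
  if (!block) {
    bytes = 0;
  }
  if (bytes == 0) {
    return thatBytes == 0 ? 0 : -1;
  }
  if (thatBytes == 0) {
    return 1;
  }
  int cmp{std::memcmp(block, that, std::min(bytes, thatBytes))};
  if (cmp != 0) {
    return cmp < 0 ? -1 : 1;
  }
  // Equal over the common prefix: the shorter string sorts first.
  return bytes < thatBytes ? -1 : bytes > thatBytes ? 1 : 0;
}
```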
Instead of emitting globals in the program/default address space, emit
them in the global address space. This also requires changing how
address-of code generation is handled: we need to cast to the default
address space to prevent code-gen issues.
In OpenMP Version 5.1, the tile and unroll directives were added. When
using these directives, it is possible to nest them within other OpenMP
Loop Constructs. This patch enables the semantics to allow for this
behaviour on these specific directives. Any nested loops will be stored
within the initial Loop Construct until reaching the DoConstruct itself.
Relevant tests have been added, and previous behaviour has been retained
with no changes.
See also, #110008
Folding hands complex exponentiations with constant arguments off to the
native libm, and on at least one host, this can produce spurious warnings
about division by zero and invalid arguments. Handle the case of a zero
base specially to avoid that, and also emit better warnings for the
undefined 0.**0 and (0.,0.)**0 cases. And add a test for these warnings
and the existing related ones.
Adds a hint to the warning message to disable a warning and updates the
tests to expect this.
Also fixes a bug in the storage of canonical spelling of error flags so
that they are not used after free.
Reland #145901 with a fix for shared library builds.
So far flang generates runtime derived type info global definitions (as
opposed to declarations) for all the types used in the current
compilation unit even when the derived types are defined in other
compilation units. It is using linkonce_odr to achieve the derived type
descriptor address "uniqueness" aspect needed to match two derived types
inside the runtime.
This comes at a big compile time cost because of all the extra globals
and their definitions in apps with many and complex derived types.
This patch adds an experimental option to only generate the rtti
definition for the types defined in the current compilation unit and to
only generate external declaration for the derived type descriptor
object of types defined elsewhere.
Note that objects compiled with this option are not compatible with
object files compiled without it, because files compiled without it may
drop the rtti for types they define if it is not used in the compilation
unit, because of the linkonce_odr aspect.
I am adding the option so that we can better measure the extra cost of
the current approach on apps and allow speeding up some compilation
where devirtualization does not matter (and the build config links to
all module file object anyway).
This patch fixes:
flang/../mlir/include/mlir/IR/TypeRange.h:51:19: error: 'ArrayRef'
is deprecated: Use {} or ArrayRef<T>() instead
[-Werror,-Wdeprecated-declarations]
flang/../mlir/include/mlir/IR/ValueRange.h:401:20: error: 'ArrayRef'
is deprecated: Use {} or ArrayRef<T>() instead
[-Werror,-Wdeprecated-declarations]
Reinstate commits e5559ca4 and 925dbc79. Fix the issues with compilation
hangs by including DenseMapInfo specialization where the corresponding
instance of DenseMap was defined.
Ref: https://github.com/llvm/llvm-project/pull/144960
ArrayRef has a constructor that accepts std::nullopt. This
constructor dates back to the days when we still had llvm::Optional.
Since the use of std::nullopt outside the context of std::optional is
kind of abuse and not intuitive to newcomers, I would like to move
away from the constructor and eventually remove it.
This patch replaces std::nullopt with {}. There are a couple of
places where std::nullopt is replaced with TypeRange() to accommodate
perfect forwarding.
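The reason `{}` cannot be used everywhere is a general C++ rule: a braced list has no type, so a forwarding template cannot deduce a parameter from it. A minimal sketch with hypothetical stand-ins (`Range` plays the role of a range type such as ArrayRef or TypeRange; `CountOperands` and `CallThrough` are invented for illustration):

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for a non-owning range type.
struct Range {
  Range() = default;
  Range(const std::vector<int> &v) : data{v.data()}, size{v.size()} {}
  const int *data{nullptr};
  std::size_t size{0};
};

// A function that takes the range by value, like many builder APIs.
inline std::size_t CountOperands(Range operands) { return operands.size; }

// A perfect-forwarding wrapper around CountOperands.
template <typename... Args> std::size_t CallThrough(Args &&...args) {
  return CountOperands(std::forward<Args>(args)...);
}

// Direct call: the parameter type is known, so {} builds an empty Range.
inline std::size_t DirectEmpty() { return CountOperands({}); }

// Forwarded call: the argument needs a type of its own, so an explicitly
// constructed empty Range is passed. CallThrough({}) would not compile.
inline std::size_t ForwardedEmpty() { return CallThrough(Range()); }
```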
Convert all binary calls of min/max to extremum operations, so that
extremums generated by the compiler compare equal, and user min/max
calls also compare equal.
Fixes #133646
Originally opened as #144162 but I accidentally pushed a merge in such a
way that a bunch of code owners got added to the review. This is just
rebasing the original work on main and fixing the failing tests.