I triaged a benchmark that showed inaccurate results, when
local-alloc-tbaa
was enabled. It turned out to be not a real TBAA issue, but rather
TBAA affecting optimizations that affect FMA generation, which
introduced
an expected accuracy variation. I would like to keep this threshold
control for future uses.
OpenMP 6.0 has changed the modifiers on the MAP clause:
- map-type-modifier has been split into individual modifiers,
- map-type "delete" has become a modifier,
- new modifiers have been added.
This patch adds parsing support for all of the OpenMP 6.0 modifiers. The
old "map-type-modifier" is retained, but is no longer created in
parsing. It will remain to take advantage of the preexisting modifier
validation for older versions: when the OpenMP version is < 6.0, the
modifiers will be rewritten back as map-type-modifiers (or map- type in
case of "delete").
In this patch the modifiers will always be rewritten in the older format
to isolate these changes to parsing as much as possible.
Add the syntax `-fintrinsic-modules-path=<dir>` as an alias to the
existing option `-fintrinsic-modules-path <dir>`. gfortran also supports
both alternatives.
This is particularly useful with CMake which de-duplicates command-line
options. For instance,
`-fintrinsic-modules-path /path/A -fintrinsic-modules-path /path/B`
is de-duplicated to
`-fintrinsic-modules-path /path/A /path/B`
since it conisiders the second `-fintrinsic-modules-path`
"redundant". This can be avoided using the syntax
`-fintrinsic-modules-path=/path/A -fintrinsic-modules-path=/path/B`.
A report of the following code not generating an error led to fixing two bugs in directive checking.
- We should treat CombinedConstructs as OpenACC Constructs
- We should treat DoConstruct index variables as private.
```fortran
subroutine sub(nn)
integer :: nn, ii
!$acc serial loop default(none)
do ii = 1, nn
end do
!$acc end serial loop
end subroutine
```
Here `nn` should be flagged as needing a data clause while `ii` should
still get one implicitly.
Pointer remappings unconditionally update the element byte size and
derived type of the pointer's descriptor. This is okay when the pointer
is polymorphic, but not when a pointer is associated with an extended
type.
To communicate this monomorphic case to the runtime, add a new entry
point so as to not break forward binary compatibility.
Fixed a crash in the following example:
```
subroutine sub()
implicit none
print *, (i, i = 1, 2) ! Problem: using undefined var in implied-do loop
end subroutine sub
```
The error message was already generated, but the compiler crashed before
it could display it.
The following code is now accepted:
```
module m
end
program m
use m
end
```
The PROGRAM name doesn't really have an effect on the compilation
result, so it shouldn't result in symbol name conflicts.
This change makes the main program symbol name all uppercase in the
cooked character stream. This makes it distinct from all other symbol
names that are all lowercase in cooked character stream.
Modified the tests that were checking for lower case main program name.
Based on the OpenACC specification — which states that if the bind name
is given as an identifier it should be resolved according to the
compiled language, and if given as a string it should be used unmodified
— we introduce two distinct `bindName` representations for `acc routine`
to handle each case appropriately: one as an array of `SymbolRefAttr`
for identifiers and another as an array of `StringAttr` for strings.
To ensure correct correspondence between bind names and devices, this
patch also introduces two separate sets of device attributes. The
routine operation is extended accordingly, along with the necessary
updates to the OpenACC dialect and its lowering.
This PR changes how `-Werror` promotes warnings to errors so that it
interoperates with `-Wfatal-error`. It maintains the property that
warnings and other messages promoted to errors are displayed as there
original message.
It is possible that a non-polymorphic dummy argument
has a dynamic type that does not match its static type
in a valid Fortran program, e.g. when the actual and
the dummy arguments have different compatible derived
SEQUENCE types:
module mod
type t
sequence
integer x
end type
contains
subroutine test(x)
type t
sequence
integer x
end type
type(t) :: x(:)
end subroutine
end module
'test' may be called with an actual argument of type 'mod::t',
which is the dynamic type of 'x' on entry to 'test'.
If we create the repacking temporary based on the static type of 'x'
('test::t'), then the runtime will report the types mismatch
as an error. Thus, we have to create the temporary using
the dynamic type of 'x'. The fact that the dummy's type
has SEQUENCE or BIND attribute is not easily computable
at this stage, so we use the dynamic type for all derived
type cases. As long as this is done only when the repacking
actually happens, the overhead should not be noticeable.
Defined I/O subroutines have UNIT= and IOSTAT= dummy arguments that are
required to have type INTEGER with its default kind. When that default
kind is modified via -fdefault-integer-8, calls to defined I/O
subroutines from the runtime don't work.
Add a flag to the two data structures shared between the compiler and
the runtime support library to indicate that a defined I/O subroutine
was compiled under -fdefault-integer-8. This has been done in a
compatible manner, so that existing binaries are compatible with the new
library and new binaries are compatible with the old library, unless of
course -fdefault-integer-8 is used.
Fixes https://github.com/llvm/llvm-project/issues/148638.
When a type-bound generic ASSIGNMENT(=) procedure is ambiguous for a
particular reference, say so, rather than claiming that no specific
procedure matched the types and ranks of the LHS and RHS.
Fixes https://github.com/llvm/llvm-project/issues/148675.
Fixes#148386
The first time the line was classified (using
`Prescanner::ClassifyLine(const char *)`) the line was correctly
classified as a compiler directive. But then later on the token form is
invoked (`Prescanner::ClassifyLine(TokenSequence, Provenance)`). This
one incorrectly classified the line as a comment because there was no
whitespace token right after the sentinel. This fixes the issue by
ensuring this whitespace is added.
When parsing a specification part, the parser will look ahead to see if
the next construct is an executable construct. In doing so it will
invoke OpenMPConstruct parser, whereas the only necessary thing to check
would be the directive alone.
Interpret TRANSFER(SOURCE=BOZ literal, MOLD=integer or real scalar) as
if it had been a reference to the standard INT or REAL intrinsic
functions, for which a BOZ literal is an acceptable argument, with a
warning about non-conformance. It's a needless extension that has
somehow crept into some other compilers and real applications.
When blank tokens arise from macro replacement in token sequences with
token pasting (##), the preprocessor is producing some bogus tokens
(e.g., "name(") that can lead to subtle bugs later when macro names are
not recognized as such.
The fix is to not paste tokens together when the result would not be a
valid Fortran or C token in the preprocessing context.
This patch allows optimizing redundant array repacking, when
the source array is statically known to be contiguous.
This is part of the implementation plan for the array repacking
feature, though, it does not affect any real life use case
as long as FIR inlining is not a thing. I experimented with
simple cases of FIR inling using `-inline-all`, and I recorded
these cases in optimize-array-repacking.fir tests.
We cannot attach any "data" or "descriptor" tag to accesses
of derived types that contain descriptors, because this
will make them non-aliasing with any generic "data" or "descriptor"
accesses, which is not correct. We have to skip TBAA tags attachment
for such accesses same way we do it for boxes.
In order to create temporary copies of assumed-type arrays
(e.g. for `-frepack-arrays`), we have to allow the source_box
to be a !fir.box.
This patch replaces #147618.
Extends reduction support for `do concurrent`, in particular, for
associating names. Consider the following input:
```fortran
subroutine dc_associate_reduce
integer :: i
real, allocatable, dimension(:) :: x
associate(x_associate => x)
do concurrent (i = 1:10) reduce(+: x_associate)
end do
end associate
end subroutine
```
The declaration of `x_associate` is emitted as follows:
```mlir
%13:2 = hlfir.declare %10(%12) {uniq_name = "...."} : (!fir.heap<!fir.array<?xf32>>, !fir.shapeshift<1>) -> (!fir.box<!fir.array<?xf32>>, !fir.heap<!fir.array<?xf32>>)
```
where the HLFIR base type is an array descriptor (i.e. the
allocatable/heap attribute is dropped as stipulated by the spec; section
11.1.3.3).
The problem here is that `declare_reduction` ops accept only reference
types. This restriction is already partially handled for
`fir::BaseBoxType`'s by allocating a stack slot for the descriptor and
storing the box in that stack allocation. We have to modify this a
littble bit for `associate` since the HLFIR and FIR base types are
different (unlike most scenarios).
Built on top of https://github.com/llvm/llvm-project/pull/146533
Adding some more options I use to generate minimal testcases for
MLIR->LLVMIR conversion.
Hopefully this will save us all some time when writing these tests.
The following options are added
- `-simplify-mlir` runs CSE and canonicalization at the end of the MLIR
pipeline
- `-enable-aa` allows toggling whether TBAA is enabled
- `-test-gen` is shorthand for `-emit-final-mlir -simplify-mlir
-enable-aa=false`
This commit removes the verifier that checked if branch weights are
negative. This check was too strict because weights are interpreted as
unsigned integers.
This showed up when running the verifier on LLVM dialect modules that
were imported from LLVM IR.
The switch to `GetSymbolVector` introduced a regression on detecting
implicit data transfer when the rhs is a function call.
Make sure the symbol we are looking at are of interest to detect data
transfer.
Dispatch is the last construct (after ATOMIC and ALLOCATORS) where the
associated block requires a specific form.
Using OmpDirectiveSpecification for the begin and the optional end
directives will make the structure of all block directives more uniform.
The ALLOCATORS construct is one of the few constructs that require a
special form of the associated block.
Convert the AST node to use OmpDirectiveSpecification for the directive
and the optional end directive, and to use parser::Block as the body:
the form of the block is checked in the semantic checks (with a more
meaningful message).
Handles some loose ends in `do concurrent` reduction declarations. This
PR extends `getAllocaBlock` to handle declare ops, and also emit
`fir.yield` in all regions.
This PR proposes re-modelling `reduce` specifiers to match OpenMP and
OpenACC. In particular, this PR includes the following:
* A new `fir` op: `fir.delcare_reduction` which is identical to OpenMP's
`omp.declare_reduction` op.
* Updating the `reduce` clause on `fir.do_concurrent.loop` to use the
new op.
* Re-uses the `ReductionProcessor` component to emit reductions for `do
concurrent` just like we do for OpenMP. To do this, the
`ReductionProcessor` had to be refactored to be more generalized.
* Upates mapping `do concurrent` to `fir.loop ... unordered` nests using
the new reduction model.
Unfortunately, this is a big PR that would be difficult to divide up in
smaller parts because the bottom of the changes are the `fir` table-gen
changes to `do concurrent`. However, doing these MLIR changes cascades
to the other parts that have to be modified to not break things.
This PR goes in the same direction we went for `private/local`
speicifiers. Now the `do concurrent` and OpenMP (and OpenACC) dialects
are modelled in essentially the same way which makes mapping between
them more trivial, hopefully.
PR stack:
- https://github.com/llvm/llvm-project/pull/145837 (this one)
- https://github.com/llvm/llvm-project/pull/146025
- https://github.com/llvm/llvm-project/pull/146028
- https://github.com/llvm/llvm-project/pull/146033
The MappableType OpenACC type interface is a richer interface that
allows OpenACC dialect to be capable to better interact with a source
dialect, FIR in this case. fir.box and fir.class types already
implemented this interface. Now the same is being done with the other
FIR types that represent variables.
One additional notable change is that fir.array no longer implements
this interface. This is because MappableType is primarily intended for
variables - and FIR variables of this type have storage associated and
thus there's a pointer-like type (fir.ref/heap/pointer) that holds the
array type.
The end goal of promoting these FIR types to MappableType is that we
will soon implement ability to generate recipes outside of the frontend
via this interface.
OpenMP 6.0 introduced alternative spelling for some directives, with the
previous spellings still allowed.
Warn the user when a new spelling is encountered with OpenMP version set
to an older value.
The `-mframe-pointer` flag is not explicitly set in the original `flang`
invocation and so the value passed to `flang -fc1` can vary depending on
the host machine, so don't verify it in the output.
`-mframe-pointer` forwarding is already verified by
`flang/test/Driver/frame-pointer-forwarding.f90`.