InitializeClone(), implemented in #120295, was not handling top-level
pointers and allocatables correctly.
Pointers and unallocated variables must be skipped.
This caused some regressions in the Fujitsu testsuite:
https://linaro.atlassian.net/browse/LLVM-1488
Allocatable members of privatized derived types must be allocated,
with the same bounds as the original object, whenever that member
is also allocated in it, but Flang was not performing such
initialization.
The `Initialize` runtime function can't perform this task unless
its signature is changed to receive an additional parameter, the
original object, that is needed to find out which allocatable
members, with their bounds, must also be allocated in the clone.
As `Initialize` is used not only for privatization, sometimes this
other object won't even exist, so this new parameter would need
to be optional.
Because of this, it seemed better to add a new runtime function:
`InitializeClone`.
To avoid unnecessary calls, lowering inserts a call to it only for
privatized items that are derived types with allocatable members.
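The rule above (allocate a member in the clone, with the original's bounds, only when it is allocated in the original) can be sketched as follows. This is a minimal illustration, not the runtime's actual code: `AllocatableMember`, `Derived`, `initializeCloneMember` and `initializeClone` are hypothetical stand-ins, and real descriptors carry far more state than a lower bound and a size.

```cpp
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical stand-in for an allocatable array component: an empty optional
// means "unallocated"; otherwise it holds a lower bound and element storage.
struct AllocatableMember {
  struct Storage {
    long lowerBound;
    std::vector<double> elements;
  };
  std::optional<Storage> storage;
  bool isAllocated() const { return storage.has_value(); }
};

// Hypothetical derived type with two allocatable members.
struct Derived {
  AllocatableMember a;
  AllocatableMember b;
};

// Allocate `clone` with the bounds of `orig`, but only if `orig` is allocated.
// Element values are deliberately not copied; only the shape is reproduced.
inline void initializeCloneMember(AllocatableMember &clone,
                                  const AllocatableMember &orig) {
  if (!orig.isAllocated()) {
    clone.storage.reset();
    return;
  }
  clone.storage = AllocatableMember::Storage{
      orig.storage->lowerBound,
      std::vector<double>(orig.storage->elements.size())};
}

// The original object is a required argument here, which is the reason a
// separate entry point was preferred over an optional extra parameter.
inline void initializeClone(Derived &clone, const Derived &orig) {
  initializeCloneMember(clone.a, orig.a);
  initializeCloneMember(clone.b, orig.b);
}
```

In this sketch a member that is unallocated in the original stays unallocated in the clone, so no storage is created that the original does not have.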
Fixes https://github.com/llvm/llvm-project/issues/114888
Fixes https://github.com/llvm/llvm-project/issues/114889
Convert `DataSharingProcessor::symTable` from pointer to reference.
This avoids accidental null pointer dereferences and makes it
possible to use `symTable` when delayed privatization is disabled.
Both OpenMP privatization and DO CONCURRENT LOCAL lowering were incorrect
for pointers and derived types with default initialization.
For pointers, the descriptor was not established with the rank/type
code/element size, leading to undefined behavior if any inquiry was made
to it prior to a pointer assignment (and if/when using the runtime for
pointer assignments, the descriptor must have been established).
For derived types with default initialization, the copies were not
default initialized.
This patch simplifies the representation of OpenMP loop wrapper
operations by introducing the `NoTerminator` trait and updating the
verifier for the `LoopWrapperInterface` accordingly.
Since loop wrappers are already limited to having exactly one region
containing exactly one block, and this block can only hold a single
`omp.loop_nest` or loop wrapper and an `omp.terminator` that does not
return any values, it makes sense to simplify the representation of loop
wrappers by removing the terminator.
There is an extensive list of Lit tests that needed updating to remove
the `omp.terminator`s, which adds some noise to this patch, but actual
changes are limited to the definition of the `omp.wsloop`, `omp.simd`,
`omp.distribute` and `omp.taskloop` loop wrapper ops, Flang lowering for
those, `LoopWrapperInterface::verifyImpl()`, SCF to OpenMP conversion
and OpenMP dialect documentation.
OpenMP prohibits privatisation of variables that appear in expressions
for statement functions.
This is a re-working of an old patch https://reviews.llvm.org/D93213 by
@praveen-g-ctt.
The old patch couldn't be landed because of ordering concerns. Statement
functions are rewritten during parse tree rewriting, but this was done
after resolve-directives and so some array expressions were incorrectly
identified as statement functions. For this reason **I have opted to
re-order the semantics driver so that resolve-directives is run after
parse tree rewriting**.
Closes #54677
---------
Co-authored-by: Praveen <praveen@compilertree.com>
This patch creates a simple RAII wrapper class for `SymMap` to make it
easier to use and prevent a missing matching `popScope()` for a
`pushScope()` call on simple use cases.
Some push-pop pairs are replaced with instances of the new class by this
patch.
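A minimal sketch of the RAII idea described above, assuming a heavily simplified `SymMap`. Only the `pushScope()`/`popScope()` names come from the description; the wrapper name `SymMapScope` and the `add`/`lookup`/`depth` helpers are hypothetical and exist only to make the example self-contained.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical, simplified symbol map: a stack of name -> value scopes.
class SymMap {
public:
  SymMap() { scopes.emplace_back(); }
  void pushScope() { scopes.emplace_back(); }
  void popScope() {
    assert(scopes.size() > 1 && "popScope() without matching pushScope()");
    scopes.pop_back();
  }
  void add(const std::string &name, int value) { scopes.back()[name] = value; }
  // Look a name up from the innermost scope outwards; nullptr if not found.
  const int *lookup(const std::string &name) const {
    for (auto it = scopes.rbegin(); it != scopes.rend(); ++it) {
      auto found = it->find(name);
      if (found != it->end())
        return &found->second;
    }
    return nullptr;
  }
  std::size_t depth() const { return scopes.size(); }

private:
  std::vector<std::map<std::string, int>> scopes;
};

// RAII wrapper (hypothetical name): pushes a scope on construction and pops
// it on destruction, so early returns cannot leave the scope stack unbalanced.
class SymMapScope {
public:
  explicit SymMapScope(SymMap &symMap) : map(symMap) { map.pushScope(); }
  ~SymMapScope() { map.popScope(); }
  SymMapScope(const SymMapScope &) = delete;
  SymMapScope &operator=(const SymMapScope &) = delete;

private:
  SymMap &map;
};
```

Declaring a `SymMapScope` at the top of a block replaces an explicit push at the start and a pop on every exit path; copying is disabled so that one push is always matched by exactly one pop.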
This patch removes unused and undefined method declarations from
`DataSharingProcessor`, as well as the unused `hasLastPrivateOp` class
member. The `insPt` class member is replaced by a local `InsertionGuard`
in the only place it is set and used.
This patch updates the `omp.parallel` operation according to the results
of the discussion in [this
RFC](https://discourse.llvm.org/t/rfc-disambiguation-between-loop-and-block-associated-omp-parallelop/79972).
It is removed from the set of loop wrapper operations, changing the
expected MLIR representation for composite `distribute parallel do/for`
into the following:
```mlir
omp.parallel {
  ...
  omp.distribute {
    omp.wsloop {
      omp.loop_nest ... { ... }
      omp.terminator
    }
    omp.terminator
  }
  ...
  omp.terminator
}
```
MLIR verifiers for operations impacted by this representation change are
updated, as well as related tests. The `LoopWrapperInterface` is also
updated, since it's no longer representing an optional "role" of an
operation but a mandatory set of restrictions instead.
Variables referenced in the body of statement functions need to be
handled as if they are explicitly referenced. Otherwise, they are
skipped during implicit privatization, because statement functions
are represented as procedures in the parse tree.
To avoid missing symbols referenced only in statement functions
during implicit privatization, new symbols, associated with them,
are created and inserted into the context of the directive that
privatizes them. They are later collected and processed in
lowering. To avoid confusing these new symbols with regular ones,
they are tagged with the new OmpFromStmtFunction flag.
Fixes https://github.com/llvm/llvm-project/issues/74273
Handles variables that are storage associated via `equivalence`. The
problem is that these variables are declared as `fir.ptr`s while their
privatized storage is declared as `fir.ref` which was triggering a
validation error in the OpenMP dialect.
This patch introduces a new OpenMP clause definition not defined by the spec.
Its main purpose is to define the `loop_inclusive` (previously "inclusive",
renamed according to the parent of this PR in the stack) argument of
`omp.loop_nest` in such a way that a followup implementation of a tablegen
backend to automatically generate clause and operation operand structures
directly from `OpenMP_Op` and `OpenMP_Clause` definitions can properly generate
the `LoopNestOperands` structure.
`collapse` clause arguments are also moved into this new definition, as they
represent information on the loop nests being collapsed rather than the
`collapse` clause itself.
Currently, there are some inconsistencies in how clause arguments are
named in the OpenMP dialect. Additionally, the clause operand structures
associated with them also diverge in certain cases. The purpose of this
patch is to normalize argument names across all `OpenMP_Clause` tablegen
definitions and clause operand structures.
This has the benefit of providing more consistent representations for
clauses in the dialect, but the main short-term advantage is that it
enables the development of an OpenMP-specific tablegen backend to
automatically generate the clause operand structures without breaking
dependent code.
The main re-naming decisions made in this patch are the following:
- Variadic arguments (i.e. multiple values) have the "_vars" suffix.
This and other similar suffixes are removed from array attribute
arguments.
- Individual required or optional value arguments do not have any suffix
added to them (e.g. "val", "var", "expr", ...), except for `if` which
would otherwise result in an invalid C++ variable name.
- The associated clause's name is prepended to argument names that don't
already contain it as part of their name. This avoids future collisions
between arguments named the same way on different clauses and adding
both clauses to the same operation.
- Privatization and reduction related arguments that contain lists of
symbols pointing to privatizer/reducer operations use the "_syms"
suffix. This removes the inconsistencies between the names for
"copyprivate_funcs", "[in]reductions", "privatizers", etc.
- General improvements to names, replacement of camel case for snake
case everywhere, etc.
- Renaming of operation-associated operand structures to use the
"Operands" suffix in place of "ClauseOps", to better differentiate
between clause operand structures and operation operand structures.
- Fields on clause operand structures are sorted according to the
tablegen definition of the same clause.
The assembly format for a few arguments is updated to better reflect the
clause they are associated with:
- `chunk_size` -> `dist_schedule_chunk_size`
- `grain_size` -> `grainsize`
- `simd` -> `par_level_simd`
Don't use `copyHostAssociateVar` for allocatable variables. It isn't
clear to me whether or not this should be addressed in
`copyHostAssociateVar` instead of inside OpenMP. I opted for OpenMP
to minimise how many things I affected. `copyHostAssociateVar` will
not update the destination variable if the destination variable
was unallocated. This is incorrect because assignment inside of the
OpenMP block can cause the allocation status of the variable to
change. Furthermore, `copyHostAssociateVar` seems to only copy the
variable address not other metadata like the size of the allocation.
Reallocation by assignment could cause this to change.
This patch enables the lastprivate clause to be used in the presence of
the collapse clause.
Note: the way we currently implement lastprivate means that this adds a
large number of compare instructions to the end of every iteration of
the loop. This is a clearly non-optimal thing to do, but lastprivate in
general will need re-implementing to prevent this. This is planned as
part of the delayed privatization work. This current implementation is
just a stop-gap measure as generating sub-optimal but working code is
better than crashing out.
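To illustrate where the extra compares come from, here is a rough emulation of `lastprivate` on a two-level collapsed nest. It is only a sketch under stated assumptions (unit steps, inclusive bounds, sequential execution); `isLastIteration` and `runCollapsedLoop` are hypothetical names and this is not the code Flang generates.

```cpp
// Hypothetical helper: is (i, j) the sequentially last iteration of the nest?
// It costs one comparison per collapsed loop on every iteration.
inline bool isLastIteration(int i, int iLast, int j, int jLast) {
  return i == iLast && j == jLast;
}

// Emulates `lastprivate(x)` on a two-level collapsed nest with unit steps and
// inclusive bounds. Returns the value the original variable holds afterwards.
inline int runCollapsedLoop(int iLb, int iUb, int jLb, int jUb, int original) {
  for (int i = iLb; i <= iUb; ++i) {
    for (int j = jLb; j <= jUb; ++j) {
      int privateX = i * 100 + j; // the loop body writes the private copy
      // The check runs at the end of every iteration, which is the overhead
      // mentioned in the note above.
      if (isLastIteration(i, iUb, j, jUb))
        original = privateX; // copy the private value back out
    }
  }
  return original;
}
```

The number of comparisons per iteration grows with the number of collapsed loops, which is why the note above calls this a stop-gap.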
This patch removes the introduction of `fir.undef` operations as a way
to keep track of insertion points inside of the `DataSharingProcessor`,
and it replaces them with an `InsertionGuard` to avoid creating such
operations inside of loop wrappers.
Leaving any `fir.undef` operation inside of a loop wrapper would result
in a verifier error, since they enforce strict requirements on the
contents of their code regions.
Extends delayed privatization support to `target .. private(..)`. With
this PR, `private` is supported for `target` **only** in delayed
privatization mode.
The object identity requires more than just `Symbol`. Don't use `id()`
to get the Symbol associated with the object, because the return value
will need to change. Instead use `sym()`, which is added for that reason.
Fixes a bug in emitting deallocation logic when delayed privatization is
disabled. I introduced the bug when implementing delayed privatization
for allocatables: when delayed privatization is disabled, the
deallocation ops are emitted for only one allocatable variable.
This PR contains 2 commits:
1. A commit to reapply changes introduced in #91116 (was reverted earlier
due to test suite failures)
2. A commit containing a possible solution for the issue causing the
test suite failures. In particular, it introduces a simple symbol
visitor class to keep track of the currently active OMP construct and
to mark this active construct as the scope defining the symbol being
visited.
Besides duplicating code, privatizing variables in every section
causes problems when synchronization barriers are used. This
happens because each section is executed by a given thread, which
will cause the program to hang if not all running threads execute
the barrier operation.
Fixes https://github.com/llvm/llvm-project/issues/72824
The current implementation of default clause privatization incorrectly
fails to privatize in the presence of non-OpenMP constructs (i.e. nested
constructs with regions whose symbols need to be privatized in the scope
of the parent OpenMP construct). This patch fixes this by handling
non-OpenMP constructs separately: if a nested construct is a non-OpenMP
construct with a region, its symbols are collected and privatized in the
scope of the parent OpenMP construct.
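The collection step described above can be sketched as a recursive walk. The `Construct` tree below is a hypothetical simplification and not Flang's parse tree or PFT. Skipping nested OpenMP constructs is an assumption made for this sketch, on the premise that they resolve their own data-sharing attributes.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical, simplified construct tree.
struct Construct {
  bool isOpenMP = false;
  std::vector<std::string> symbols; // symbols referenced directly in the region
  std::vector<Construct> nested;    // constructs nested inside the region
};

// Collect the symbols to privatize in the scope of construct `c`: its own
// symbols plus those of nested non-OpenMP constructs, at any depth.
inline void collectSymbols(const Construct &c, std::set<std::string> &out) {
  out.insert(c.symbols.begin(), c.symbols.end());
  for (const Construct &child : c.nested)
    if (!child.isOpenMP) // assumption: nested OpenMP constructs are skipped
      collectSymbols(child, out);
}
```

Calling `collectSymbols` on the parent OpenMP construct yields the set that would be privatized in its scope, including symbols that appear only inside, say, a nested `if` or `do` region.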
Fixes https://github.com/llvm/llvm-project/issues/71914 and
https://github.com/llvm/llvm-project/issues/71915
This patch updates lowering from PFT to MLIR of workshare loops to
follow the loop wrapper approach. Unit tests impacted by this change are
also updated.
As the last patch of the stack, this should compile and pass unit tests.
This patch replaces some `saveInsertionPoint`, `restoreInsertionPoint`
call pairs for an `InsertionGuard` instance where it makes sense within
Flang OpenMP lowering to make further modifications less error-prone.
This patch updates Flang lowering to use the new set of OpenMP clause
operand structures and their groupings into directive-specific sets of
clause operands.
It simplifies the passing of information from the clause processor and
the creation of operations.
The `DataSharingProcessor` is slightly modified to not hold delayed
privatization state. Instead, optional arguments are added to
`processStep1` which are only passed when delayed privatization is used.
This enables using the clause operand structure for `private` and
removes the need for the ad-hoc `DelayedPrivatizationInfo` structure.
The processing of the `schedule` clause is updated to process the
`chunk` modifier rather than requiring two separate calls to the
`ClauseProcessor`.
Lowering of a block-associated `ordered` construct is updated to emit a
TODO error if the `simd` clause is specified, since it is not currently
supported by the `ClauseProcessor` or later compilation stages.
Removed processing of `schedule` from `omp.simdloop`, as it doesn't
apply to `simd` constructs.
The clause templates defined in ClauseT.h were originally based on
flang's parse tree nodes. Since those representations are going to be
reused for clang (together with the clause splitting code), it makes
sense to separate them from flang, and instead have them based on the
actual OpenMP spec (v5.2).
The member names in the templates follow the naming presented in the
spec, and the representation (e.g. members) is derived from the clause
definitions as described in the spec.
Since the representations of some clauses have changed (while preserving
the information), the current code using the clauses (especially the
code converting parser::OmpClause to omp::Clause) needs to be adjusted.
This patch does not make any functional changes.
PR #81833 introduced some changes that broke some debug builds. This
happened due to an indirectly included file referencing an `operator <<`
function which is defined in a `.cpp` file that is not linked with `tco`
and `fir-opt`.
Adds basic support for emitting delayed privatizers from flang. So far,
only scalar symbols are supported; support for more complicated types
will be added later. This also makes sure that reduction and delayed
privatization work properly together by merging the body-gen callbacks
for both in case both clauses are present on the parallel construct.
This started as an experiment to reduce the compilation time of
iterating over `Lower/OpenMP.cpp` a bit since it is too slow at the
moment. Trying to do that, I split the `DataSharingProcessor`,
`ReductionProcessor`, and `ClauseProcessor` into their own files and
extracted some shared code into a util file. All of these new `.h/.cpp`
files as well as `OpenMP.cpp` are now under a `Lower/OpenMP/` directory.
This resulted in a slightly better organization of the OpenMP lowering
code, hence this NFC.
As for the compilation time, this unfortunately does not affect it much
(it shaves off a few seconds of `OpenMP.cpp` compilation) since, from
what I learned, the bottleneck is in `DirectivesCommon.h` and
`PFTBuilder.h`, which both seem to consume a lot of time in template
instantiation.