See #90452. The old parse tree errors exploded to thousands of unhelpful
lines when there were multiple missing end directives.
Instead, allow a missing end directive in the parse tree then validate
that it is present during semantics (where the error messages are a lot
easier to control).
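A minimal sketch of the idea, using hypothetical stand-in types rather than Flang's actual parse-tree classes: the parser stores the end directive as an optional member, and a separate semantic pass reports exactly one message for each construct where it is absent.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration; not Flang's real parse-tree nodes.
struct EndDirective {
  std::string name;
};

struct BlockConstruct {
  std::string name;                   // e.g. "parallel"
  int line;                           // source line of the begin directive
  std::optional<EndDirective> end;    // empty when the parser saw no end directive
  std::vector<BlockConstruct> nested; // constructs inside the block
};

// Semantic check: emit one diagnostic per construct that lacks its end
// directive, then continue into the nested constructs.
void checkEndDirectives(const BlockConstruct &c,
                        std::vector<std::string> &diags) {
  assert(c.line > 0 && "begin directive must carry a source line");
  if (!c.end) {
    diags.push_back("line " + std::to_string(c.line) +
                    ": expected end directive for '" + c.name + "'");
  }
  for (const BlockConstruct &n : c.nested)
    checkEndDirectives(n, diags);
}
```

Because the parse always succeeds, the number of messages grows with the number of unterminated constructs rather than with the parser's backtracking.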
I'm trying to remove the redirection in SmallSet.h:
template <typename PointeeType, unsigned N>
class SmallSet<PointeeType*, N> : public SmallPtrSet<PointeeType*, N>
{};
to make it clear that we are using SmallPtrSet. Only a handful of
places rely on this redirection.
This patch replaces SmallSet with SmallPtrSet where the element type is
a pointer.
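To illustrate why the redirection obscures which container is in use, here is a small sketch with hypothetical `MySet` and `MyPtrSet` stand-ins built on standard containers. These are not the LLVM ADT classes; only the shape of the partial specialization mirrors the snippet above.

```cpp
#include <cassert>
#include <set>
#include <type_traits>
#include <unordered_set>

// Hypothetical stand-in for a pointer-specialized set (N is ignored here).
template <typename PtrT, unsigned N>
class MyPtrSet {
  std::unordered_set<PtrT> items;

public:
  bool insert(PtrT p) {
    assert(p != nullptr && "this sketch stores non-null pointers only");
    return items.insert(p).second;
  }
  bool contains(PtrT p) const { return items.count(p) != 0; }
};

// Hypothetical stand-in for the general-purpose set.
template <typename T, unsigned N>
class MySet {
  std::set<T> items;

public:
  bool insert(const T &v) { return items.insert(v).second; }
  bool contains(const T &v) const { return items.count(v) != 0; }
};

// The redirection: for pointer element types, MySet silently becomes MyPtrSet.
template <typename PointeeType, unsigned N>
class MySet<PointeeType *, N> : public MyPtrSet<PointeeType *, N> {};

// A reader of `MySet<int *, 4>` cannot tell from the spelling alone that
// the pointer-set implementation is what actually gets used.
static_assert(std::is_base_of<MyPtrSet<int *, 4>, MySet<int *, 4>>::value,
              "pointer element types are redirected");
```

Spelling the pointer set directly at each use site makes the choice visible without having to look up the specialization.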
This reverts commit 5178aeff7b96e86b066f8407b9d9732ec660dd2e.
In addition:
* Scalar constant UNSIGNED BOUNDARY is explicitly cast
to the result type so that the generated hlfir.eoshift
operation is valid. The lowering produces signless constants
by default. It might be a bigger issue in lowering, so I just
want to "fix" it for EOSHIFT in this patch.
* Since we have to create an unsigned integer constant during
HLFIR inlining, I added code in createIntegerConstant
to make it possible.
Fixes a regression uncovered by Fujitsu test 0686_0024.f90. In
particular, verifies that a pre-determined symbol is only privatized by
its defining evaluation (e.g. the loop for which the symbol was marked
as pre-determined).
In relation to the approval and merge of the
[PRIF](https://github.com/llvm/llvm-project/pull/76088) specification
about multi-image features in Flang, here is a first PR to add support
for the `-fcoarray` compilation flag and the initialization of the PRIF
environment.
Other PRs will follow to add support for lowering to PRIF.
Consider the following example:
```fortran
implicit none
integer :: i, j
do concurrent (i=1:10) local(j)
block
do j=1,20
end do
end block
end do
```
Without the fix introduced in this PR, the compiler would "re-localize"
the `j` variable inside the `fir.do_concurrent` loop:
```mlir
fir.do_concurrent {
%7 = fir.alloca i32 {bindc_name = "j"}
%8:2 = hlfir.declare %7 {uniq_name = "_QFloop_in_nested_blockEj"} : (!fir.ref<i32>) -> (!fir.ref<i32>, !fir.ref<i32>)
...
fir.do_concurrent.loop (%arg0) = (%5) to (%6) step (%c1) local(@_QFloop_in_nested_blockEj_private_i32 %4#0 -> %arg1 : !fir.ref<i32>) {
%12:2 = hlfir.declare %arg1 {uniq_name = "_QFloop_in_nested_blockEj"} : (!fir.ref<i32>) -> (!fir.ref<i32>, !fir.ref<i32>)
...
%17:2 = fir.do_loop %arg2 = %14 to %15 step %c1_1 iter_args(%arg3 = %16) -> (index, i32) {
fir.store %arg3 to %8#0 : !fir.ref<i32>
...
}
}
}
```
This happened because we did a shallow look-up of `j` and since the loop
is nested inside a `block`, the look-up failed and we re-created a local
allocation for `j` inside the parent `fir.do_concurrent` loop. This
means that we ended up not using the actual localized symbol which is
passed as a region argument to the `fir.do_concurrent.loop` op.
In case of `j`, we do not need to do a shallow look-up. The shallow
look-up is only needed if a symbol is an OpenMP private one or an
iteration variable of a `do concurrent` loop, neither of which applies
to `j`.
With the fix, `j` is properly resolved to the `local` region argument:
```mlir
fir.do_concurrent {
...
fir.do_concurrent.loop (%arg0) = (%5) to (%6) step (%c1) local(@_QFloop_in_nested_blockEj_private_i32 %4#0 -> %arg1 : !fir.ref<i32>) {
...
%10:2 = hlfir.declare %arg1 {uniq_name = "_QFloop_in_nested_blockEj"} : (!fir.ref<i32>) -> (!fir.ref<i32>, !fir.ref<i32>)
...
%15:2 = fir.do_loop %arg2 = %12 to %13 step %c1_1 iter_args(%arg3 = %14) -> (index, i32) {
fir.store %arg3 to %10#0 : !fir.ref<i32>
...
}
}
}
```
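The difference between the two kinds of look-up can be sketched with a hypothetical scoped symbol table. The class and method names below are illustrative assumptions, not Flang's symbol map API; the strings stand in for SSA values.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical scoped symbol table; values stand in for SSA values.
class ScopedSymbolMap {
  std::vector<std::unordered_map<std::string, std::string>> scopes;

public:
  ScopedSymbolMap() { scopes.emplace_back(); }

  void pushScope() { scopes.emplace_back(); }
  void popScope() {
    assert(scopes.size() > 1 && "cannot pop the outermost scope");
    scopes.pop_back();
  }
  void bind(const std::string &sym, const std::string &value) {
    scopes.back()[sym] = value;
  }

  // Shallow look-up: consult the innermost scope only.
  std::optional<std::string> lookupShallow(const std::string &sym) const {
    auto it = scopes.back().find(sym);
    if (it == scopes.back().end())
      return std::nullopt;
    return it->second;
  }

  // Deep look-up: innermost scope first, then each enclosing scope.
  std::optional<std::string> lookupDeep(const std::string &sym) const {
    for (auto scope = scopes.rbegin(); scope != scopes.rend(); ++scope) {
      auto it = scope->find(sym);
      if (it != scope->end())
        return it->second;
    }
    return std::nullopt;
  }
};
```

In this model, binding `j` in the loop scope and then entering a nested `block` scope makes the shallow look-up miss while the deep look-up still finds the `local` binding, which is the situation described above.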
Both clang and gfortran support the -fopenmp-simd flag, which enables
OpenMP support only for simd constructs, while disabling the rest of
OpenMP.
Implement the appropriate parse tree rewriting to remove non-SIMD OpenMP
constructs at the parsing stage.
Add a new SimdOnly flang OpenMP IR pass which rewrites generated OpenMP
FIR to handle untangling composite simd constructs, and clean up OpenMP
operations leftover after the parse tree rewriting stage.
With this approach, the two parts of the logic required to make the flag
work can be self-contained within the parse tree rewriter and the MLIR
pass, respectively. It does not need to be implemented within the core
lowering logic itself.
The flag is expected to have no effect if -fopenmp is passed explicitly,
and is only expected to remove OpenMP constructs, not things like OpenMP
library function calls. This matches the behaviour of other compilers.
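As a rough sketch of the parse-tree half of this approach, the following uses a hypothetical miniature tree (not Flang's parse-tree classes). It assumes, for illustration only, that a construct is kept when its directive is exactly `simd`; composite constructs, which the MLIR pass handles, are out of scope here.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical miniature parse tree; not Flang's parse-tree classes.
struct Node {
  bool isOmpConstruct;    // true for an OpenMP construct, false for a statement
  std::string text;       // directive name, or statement text
  std::vector<Node> body; // only constructs carry a body in this sketch
};

// Assumption for this sketch: only a plain "simd" construct survives.
bool keepInSimdOnlyMode(const Node &n) {
  return !n.isOmpConstruct || n.text == "simd";
}

// Rewrite a block so that every non-SIMD construct is replaced by its
// (already rewritten) body, hoisted into the enclosing block.
std::vector<Node> rewriteSimdOnly(std::vector<Node> block) {
  std::vector<Node> out;
  for (Node &n : block) {
    assert((n.isOmpConstruct || n.body.empty()) &&
           "only constructs carry a body");
    std::vector<Node> body = rewriteSimdOnly(std::move(n.body));
    if (keepInSimdOnlyMode(n)) {
      n.body = std::move(body);
      out.push_back(std::move(n));
    } else {
      for (Node &child : body)
        out.push_back(std::move(child));
    }
  }
  return out;
}
```

Only constructs are touched in this model; ordinary statements, including calls to library functions, pass through unchanged.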
---------
Signed-off-by: Kajetan Puchalski <kajetan.puchalski@arm.com>
When the rhs of the data transfer is from a different type, allocate a
new temp on the host and first transfer the rhs to it. Then, use the
elemental op created to do the conversion.
The semantic analysis of the ATOMIC construct will require additional
rewriting (reassociation of certain expressions for user convenience),
and that will be driven by diagnoses made in the semantic checks.
While the rewriting of min/max is not required to be done in semantic
analysis, moving it there will keep all rewriting for the ATOMIC
construct in a single location.
Fixes a bug when a block variable is marked as implicit private. In such
a case, we can simply ignore privatizing that symbol within the context
of the current OpenMP construct since the "private" allocation for the
symbol will be emitted within the nested block anyway.
Add Automap modifier to the MLIR op definition for the DeclareTarget
directive's Enter clause. Also add lowering support in Flang.
Automap Ref: OpenMP 6.0 section 7.9.7.
Fixes a bug when a block variable is marked as pre-determined private.
In such a case, we can simply ignore privatizing that symbol within the
context of the current OpenMP construct since the "private" allocation
for the symbol will be emitted within the nested block anyway.
Fixes #149563
When emitting unstructured `do concurrent` loops, reduction processing
should be skipped since we are not emitting a `fir.do_concurrent` loop in
the first place.
Reviewed in #152379
- Move the allocator index setup after the allocate statement; otherwise
the derived type descriptor is not allocated.
- Support arrays of derived type with a device component
When the rhs is an array element, the assert was triggered but this is
still a valid transfer. Remove the assert. The operation has a verifier
to check its validity.
- Move the allocator index setup after the allocate statement; otherwise
the derived type descriptor is not allocated.
- Support arrays of derived type with a device component
Currently, we indicate to the runtime that implicit scalar captures are
firstprivate (via map and capture types), which is enough for the runtime
trace to treat them as such, but we do not CodeGen the IR in such a way
that we can take full advantage of this aspect of the OpenMP
specification.
This patch seeks to change that by applying the correct symbol flags
(firstprivate/implicit) to the implicitly captured scalars within target
regions, which then triggers the delayed privatization code generation
for these symbols, bringing the code generation in line with the explicit
firstprivate clause. Currently, similarly to delayed privatization, I
have sheltered this segment of code behind the
EnabledDelayedPrivitization flag, as without it we'll trigger a compiler
error for firstprivate not being supported any time we implicitly capture
a scalar and try to firstprivatize it. In future, when this flag is
removed, it can also be removed here. So, for now, you need to enable
this by passing the flag to the compiler when compiling any programs.
Descriptors for derived types with CUDA components are allocated in
managed memory. The lowering was calling the standard runtime for the
allocate statement where it should instead create a `cuf.allocate`
operation.
See #150178
This may regress some test cases which only ever passed by accident.
I've tested SPEC2017 and a sample of applications to check that this
doesn't break anything too obvious. Presumably this was not a widely
used feature or we would have noticed the bug sooner.
I'm unsure whether this should be backported to LLVM 21 or not: I think
it is much better to refuse to compile than to silently produce the
wrong result, but there is a chance this could regress something which
previously worked by accident. Opinions welcome.
When the LoweringBridge is created, it registers an MLIR Diagnostics
handler with the MLIRContext. However, it never deregisters it once
lowering is finished.
This patch fixes that scenario. It also makes the Diagnostics handler
optional.
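The register-then-deregister lifetime can be sketched with RAII. `Context` and `Bridge` below are hypothetical stand-ins written for this illustration; they are not the MLIRContext or LoweringBridge classes, and the patch itself may structure this differently.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <optional>
#include <string>
#include <utility>

// Hypothetical stand-in for a context that owns diagnostic handlers.
class Context {
  std::map<std::size_t, std::function<void(const std::string &)>> handlers;
  std::size_t nextId = 0;

public:
  std::size_t registerHandler(std::function<void(const std::string &)> h) {
    handlers.emplace(nextId, std::move(h));
    return nextId++;
  }
  void eraseHandler(std::size_t id) {
    assert(handlers.count(id) == 1 && "unknown handler id");
    handlers.erase(id);
  }
  std::size_t handlerCount() const { return handlers.size(); }
  void emit(const std::string &msg) const {
    for (const auto &entry : handlers)
      entry.second(msg);
  }
};

// Hypothetical bridge: the handler is optional, and if one was installed
// it is removed again when the bridge is destroyed.
class Bridge {
  Context &ctx;
  std::optional<std::size_t> handlerId;

public:
  explicit Bridge(Context &c,
                  std::function<void(const std::string &)> handler = nullptr)
      : ctx(c) {
    if (handler)
      handlerId = ctx.registerHandler(std::move(handler));
  }
  ~Bridge() {
    if (handlerId)
      ctx.eraseHandler(*handlerId);
  }
  Bridge(const Bridge &) = delete;
  Bridge &operator=(const Bridge &) = delete;
};
```

With this shape, a context that outlives the bridge no longer holds a handler that refers to state owned by the finished lowering.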
The current mangleName implementation doesn't take a FoldingContext,
which prevents the proper evaluation of expressions containing parameter
references to an integer constant. Since parametrized derived types are
not yet implemented, the compiler will crash there in some cases (see
example in issue #127424).
This is a workaround so that the crash doesn't happen until the feature
is properly implemented.
Fixes #127424
The structure is
- OmpBeginDirective (aka OmpDirectiveSpecification)
- Block
- optional<OmpEndDirective> (aka optional<OmpDirectiveSpecification>)
The OmpBeginDirective and OmpEndDirective are effectively different
names for OmpDirectiveSpecification. They exist to allow the semantic
analyses to distinguish between the beginning and the ending of a block
construct without maintaining additional context.
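A simplified sketch of why the distinct names help: with separate types, a visitor can overload on begin versus end and needs no extra state. The types below reuse the node names for readability, but their members and the use of inheritance are assumptions of this sketch, not the definitions in parse-tree.h.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <vector>

// Simplified stand-ins named after the parse-tree nodes; members are hypothetical.
struct OmpDirectiveSpecification {
  std::string directive;
  std::vector<std::string> clauses;
};
struct OmpBeginDirective : OmpDirectiveSpecification {};
struct OmpEndDirective : OmpDirectiveSpecification {};

struct BlockConstruct {
  OmpBeginDirective begin;
  std::optional<OmpEndDirective> end; // the end directive may be absent
};

// Overloads let the visitor tell begin from end by type alone, so it does
// not have to remember which part of the construct it is currently in.
struct Visitor {
  std::vector<std::string> log;

  void enter(const OmpBeginDirective &d) {
    log.push_back("begin " + d.directive);
  }
  void enter(const OmpEndDirective &d) { log.push_back("end " + d.directive); }

  void walk(const BlockConstruct &c) {
    assert(!c.begin.directive.empty() && "begin must name a directive");
    enter(c.begin);
    if (c.end)
      enter(*c.end);
  }
};
```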
The actual changes are in the parser: parse-tree.h and openmp-parser.cpp
in particular. The rest is simply changing the way the directive/clause
information is accessed (typically making it simpler).
All standalone and block constructs now use OmpDirectiveSpecification to
store the directive/clause information.
Implement the lowering for delayed privatisation for composite
"distribute simd" constructs. Fixes new crashes previously masked by simd
information on composite constructs being ignored.
Signed-off-by: Kajetan Puchalski <kajetan.puchalski@arm.com>
Implement the lowering for delayed privatisation for composite "do simd"
constructs. Fixes new crashes previously masked by simd information on
composite constructs being ignored, such as llvm#150975.
Signed-off-by: Kajetan Puchalski <kajetan.puchalski@arm.com>
The OpenMPSectionConstruct corresponds to the `!$omp section` directive,
but there is nothing in the AST node that stores the directive
information. Even though the only possibility (at the moment) is
"section" without any clauses, for improved generality it is helpful to
have that information anyway.
An error report of the following code generating non-atomic code led us
to realize there are missing checks in the OpenACC atomics code. Add
some of those checks for atomic and sketch how the rest of the code
should proceed in checking the rest of the properties. The following
cases are all reported as errors.
```fortran
! Originally reported error!
!$acc atomic capture
a = b
c = b
!$acc end atomic capture
! Other ambiguous, but related errors!
!$acc atomic capture
x = i
i = x
!$acc end atomic capture
!$acc atomic capture
a = b
b = b
!$acc end atomic capture
!$acc atomic capture
a = b
a = c
!$acc end atomic capture
```
OpenMP loop transformations do not have data-sharing attributes and do
not explicitly privatize the loop variable. The DataSharingProcessor was
still used in #144785 because `createAndSetPrivatizedLoopVar` expected
it.
We skip that function and directly write to the loop variable. If the
loop variable is implicitly or explicitly privatized, it will be due to
surrounding OpenMP constructs such as `parallel`.