Predefine the `__pic__/__pie__/__PIC__/__PIE__` macros based on the
configured relocation level. This logic mirrors that of the clang
driver, where `__pic__/__PIC__` are defined for both PIC and PIE modes,
but `__pie__/__PIE__` are only defined for PIE mode.
Fixes https://github.com/llvm/llvm-project/issues/135275
Both clang and gfortran support the -fopenmp-simd flag, which enables
OpenMP support only for simd constructs, while disabling the rest of
OpenMP.
Implement the appropriate parse tree rewriting to remove non-SIMD OpenMP
constructs at the parsing stage.
Add a new SimdOnly flang OpenMP IR pass which rewrites generated OpenMP
FIR to handle untangling composite simd constructs, and clean up OpenMP
operations leftover after the parse tree rewriting stage.
With this approach, the two parts of the logic required to make the flag
work can be self-contained within the parse tree rewriter and the MLIR
pass, respectively. It does not need to be implemented within the core
lowering logic itself.
The flag is expected to have no effect if -fopenmp is passed explicitly,
and is only expected to remove OpenMP constructs, not things like OpenMP
library functions calls. This matches the behaviour of other compilers.
---------
Signed-off-by: Kajetan Puchalski <kajetan.puchalski@arm.com>
PR #153488 caused the msvc build (https://lab.llvm.org/buildbot/#/builders/166/builds/1397) to fail:
```
..\llvm-project\flang\include\flang/Evaluate/rewrite.h(78): error C2668: 'Fortran::evaluate::rewrite::Identity::operator ()': ambiguous call to overloaded function
..\llvm-project\flang\include\flang/Evaluate/rewrite.h(43): note: could be 'Fortran::evaluate::Expr<Fortran::evaluate::SomeType> Fortran::evaluate::rewrite::Identity::operator ()<Fortran::evaluate::SomeType,S>(Fortran::evaluate::Expr<Fortran::evaluate::SomeType> &&,const U &)'
with
[
S=Fortran::evaluate::value::Integer<128,true,32,unsigned int,unsigned __int64,128>,
U=Fortran::evaluate::value::Integer<128,true,32,unsigned int,unsigned __int64,128>
]
..\llvm-project\flang\lib\Semantics\check-omp-atomic.cpp(174): note: or 'Fortran::evaluate::Expr<Fortran::evaluate::SomeType> Fortran::semantics::ReassocRewriter::operator ()<Fortran::evaluate::SomeType,S,void>(Fortran::evaluate::Expr<Fortran::evaluate::SomeType> &&,const U &,Fortran::semantics::ReassocRewriter::NonIntegralTag)'
with
[
S=Fortran::evaluate::value::Integer<128,true,32,unsigned int,unsigned __int64,128>,
U=Fortran::evaluate::value::Integer<128,true,32,unsigned int,unsigned __int64,128>
]
..\llvm-project\flang\include\flang/Evaluate/rewrite.h(78): note: while trying to match the argument list '(Fortran::evaluate::Expr<Fortran::evaluate::SomeType>, const S)'
with
[
S=Fortran::evaluate::value::Integer<128,true,32,unsigned int,unsigned __int64,128>
]
..\llvm-project\flang\include\flang/Evaluate/rewrite.h(78): note: the template instantiation context (the oldest one first) is
..\llvm-project\flang\lib\Semantics\check-omp-atomic.cpp(814): note: see reference to function template instantiation 'U Fortran::evaluate::rewrite::Mutator<Fortran::semantics::ReassocRewriter>::operator ()<const Fortran::evaluate::Expr<Fortran::evaluate::SomeType>&,Fortran::evaluate::Expr<Fortran::evaluate::SomeType>>(T)' being compiled
with
[
U=Fortran::evaluate::Expr<Fortran::evaluate::SomeType>,
T=const Fortran::evaluate::Expr<Fortran::evaluate::SomeType> &
]
```
The reason is that there is an ambiguity between operator() of
ReassocRewriter itself and operator() of the base class `Identity` through
`using Id::operator();`. By the C++ specification, method declarations
in ReassocRewriter hide methods with the same signature from a using
declaration, but this does not apply to
```
evaluate::Expr<T> operator()(..., NonIntegralTag = {})
```
which has a different signature due to an additional tag parameter.
Since it has a default value, it is ambiguous with operator() without
tag parameter.
GCC and Clang both accept this, but in my understanding MSVC is correct
here.
Since the overloads of ReassocRewriter cover all cases (integral and
non-integral), removing the using declaration to avoid the ambiguity.
When a REAL or COMPLEX literal appears without an explicit kind suffix
or a kind-determining exponent letter, and the conversion of that
literal from decimal to binary is inexact, emit a warning if that
constant is later implicitly widened to a more precise kind, since it
will have a different value than was probably intended.
Values that convert exactly from decimal to default real, e.g. 1.0 and
0.125, do not elicit this warning.
There are many contexts in which Fortran implicitly converts constants.
This patch covers name constant values, variable and component
initialization, constants in expressions, structure constructor
components, and array constructors.
For example, "real(8) :: tenth = 0.1" is a common Fortran bug that's
hard to find, and is one that often trips up even experienced Fortran
programmers. Unlike C and C++, the literal constant 0.1 is *not* double
precision by default, and it does not have the same value as 0.1d0 or
0.1_8 do when it is converted from decimal to real(4) and then to
real(8).
An atomic update expression of form
x = x + a + b
is technically illegal, since the right-hand side is parsed as (x+a)+b,
and the atomic variable x should be an argument to the top-level +. When
the type of x is integer, the result of (x+a)+b is guaranteed to be the
same as x+(a+b), so instead of reporting an error, the compiler can
treat (x+a)+b as x+(a+b).
This PR implements this kind of reassociation for integral types, and
for the two arithmetic associative/commutative operators: + and *.
Reinstate PR153098 one more time with fixes for the issues that came up:
- unused variable "lsrc",
- use of ‘outer1’ before deduction of ‘auto’.
When the rhs of the data transfer is from a different type, allocate a
new temp on the host and first transfer the rhs to it. Then, use the
elemental op created to do the conversion.
An atomic update expression of form
x = x + a + b
is technically illegal, since the right-hand side is parsed as (x+a)+b,
and the atomic variable x should be an argument to the top-level +. When
the type of x is integer, the result of (x+a)+b is guaranteed to be the
same as x+(a+b), so instead of reporting an error, the compiler can
treat (x+a)+b as x+(a+b).
This PR implements this kind of reassociation for integral types, and
for the two arithmetic associative/commutative operators: + and *.
Reinstate PR153098 with fixes for the issues that came up:
- unused variable "lsrc",
- use of ‘outer1’ before deduction of ‘auto’.
An atomic update expression of form
x = x + a + b
is technically illegal, since the right-hand side is parsed as (x+a)+b,
and the atomic variable x should be an argument to the top-level +. When
the type of x is integer, the result of (x+a)+b is guaranteed to be the
same as x+(a+b), so instead of reporting an error, the compiler can
treat (x+a)+b as x+(a+b).
This PR implements this kind of reassociation for integral types, and
for the two arithmetic associative/commutative operators: + and *.
Properly cast the selector to `i64` regardless of its integer type.
We used to generate llvm.trunc always.
We have to use `i64` as long as the case values may exceed INT_MAX.
Fixes#153050.
A long chain of fir.pack_arrays might require multiple iterations
of the greedy rewriter, if we use down-top traversal. The rewriter
may not converge in 10 (default) iterations. It is not an error,
but it was reported as such.
This patch changes the traversal to top-down and also disabled
the hard error, if the rewriter does not converge soon enough.
There semantic analysis of the ATOMIC construct will require additional
rewriting (reassociation of certain expressions for user convenience),
and that will be driven by diagnoses made in the semantic checks.
While the rewriting of min/max is not required to be done in semantic
analysis, moving it there will make all rewriting for ATOMIC construct
be located in a single location.
Add a new AutomapToTargetData pass. This gathers the declare target
enter variables which have the AUTOMAP modifier. And adds
omp.declare_target_enter/exit mapping directives for fir.alloca and
fir.free oeprations on the AUTOMAP enabled variables.
Automap Ref: OpenMP 6.0 section 7.9.7.
Fixes a bug when a block variable is marked as implicit private. In such
case, we can simply ignore privatizing that symbol within the context of
the currrent OpenMP construct since the "private" allocation for the
symbol will be emitted within the nested block anyway.
Add a new `AutomapToTargetData` pass. This gathers the declare target
enter variables which have the `AUTOMAP` modifier. And adds
`omp.declare_target_enter/exit` mapping directives for `fir.allocmem`
and `fir.freemem` oeprations on the `AUTOMAP` enabled variables.
Automap Ref: OpenMP 6.0 section 7.9.7.
Add Automap modifier to the MLIR op definition for the DeclareTarget
directive's Enter clause. Also add lowering support in Flang.
Automap Ref: OpenMP 6.0 section 7.9.7.
Fixes a bug when a block variable is marked as pre-determined private.
In such case, we can simply ignore privatizing that symbol within the
context of the currrent OpenMP construct since the "private" allocation
for the symbol will be emitted within the nested block anyway.
Now that #149310 has restricted lifetime intrinsics to only work on
allocas, we can also drop the explicit size argument. Instead, the size
is implied by the alloca.
This removes the ability to only mark a prefix of an alloca alive/dead.
We never used that capability, so we should remove the need to handle
that possibility everywhere (though many key places, including stack
coloring, did not actually respect this).
Fixes#149563
When emitting unstructured `do concurrent` loops, reduction processing
should be skipped since we are not emitting `fir.do_concurrent` loop in
the first place.
The function ResolveOmpObject had a lot of highly-indented code in two
variant visitors. Extract the visitors into their own functions, and
reformat the code. Replace !(||) with !&&! in a couple of places to make
the formatting a bit nicer. Use llvm::enumerate instead of manually
maintaining iteration index.
Reviewed in #152379
- Move the allocator index set up after the allocate statement otherwise
the derived type descriptor is not allocated.
- Support array of derived-type with device component
When the rhs is a an array element, the assert was triggered but this is
still a valid transfer. Remove the assert. The operation has a verifier
to check its validity.
- Move the allocator index set up after the allocate statement otherwise
the derived type descriptor is not allocated.
- Support array of derived-type with device component
They were inserted in the current scope.
OpenMP spec (all versions):
The names of critical constructs are global entities of the program. If
a name conflicts with any other entity, the behavior of the program is
unspecified.
This PR implements a semantic checker to ensure the legality of nested
OpenACC parallelism. The following are quotes from Spec 3.3. We need to
disallow loops from having parallelism at the same level as or at a
sub-level of child loops.
>**2.9.2 gang clause**
>[2064] <ins>When the parent compute construct is a parallel
construct</ins>, or on an orphaned loop construct, the gang clause
behaves as follows. (...) The associated dimension is the value of the
dim argument, if it appears, or is dimension one. The dim argument must
be a constant positive integer with value 1, 2, or 3.
>[2112] The region of a loop with a gang(dim:d) clause may not contain a
loop construct with a gang(dim:e) clause where e >= d unless it appears
within a nested compute region.
>[2074] <ins>When the parent compute construct is a kernels
construct</ins>, the gang clause behaves as follows. (...)
>[2148] The region of a loop with the gang clause may not contain
another loop with a gang clause unless within a nested compute region.
>**2.9.3 worker clause**
>[2122]/[2129] The region of a loop with the worker clause may not
contain a loop with the gang or worker clause unless within a nested
compute region.
>**2.9.4 vector clause**
>[2141]/[2148] The region of a loop with the vector clause may not
contain a loop with a gang, worker, or vector clause unless within a
nested compute region.
https://openacc.org/sites/default/files/inline-images/Specification/OpenACC-3.3-final.pdf
Module files shouldn't ever produce parsing errors, and if they did in
the case of a badly-generated module file, the compiler will notice and
crash. So we can run the parser on module files with message deferral
enabled, and that saves time that would otherwise be spent generating
messages on failed parsing alternatives that are discarded anyway when
backtracking. It's not a big savings (single digit percentage on overall
compilation time for a big application with lots of modules), but worth
doing.