Fix a crash caused by an invalid expression in the atomic capture
clause, due to the `checkForSymbolMatch` function not accounting for
`GetExpr` potentially returning null.
Fix https://github.com/llvm/llvm-project/issues/139884
Flang at Ofast usually produces executables that consume more stack that
other Fortran compilers.
This is in part because the alloca created from temporary heap
allocation by the StackArray pass are created at the function scope
level without lifetimes, and LLVM does not/is not able to merge alloca
that do not have overlapping lifetimes.
This patch adds an option to generate LLVM lifetime in the StackArray
pass at the previous heap allocation/free using the LLVM dialect
operation for it.
This patch relies on #140235 and #139724 to speed-up compilations of
files with derived type array global with initial value.
Currently, such derived type global init was lowered to an
llvm.mlir.insertvalue chain in the LLVM IR dialect because there was no
way to represent such value via attributes.
This chain was later folded in LLVM dialect to LLVM IR using LLVM IR
(not dialect) folding. This insert chain generation and folding is very
expensive for big arrays. For instance, this patch brings down the
compilation of FM_lib fmsave.f95 from 50s
to 0.5s.
Between firstprivate, private and reduction init regions, the difference
is largely whether or not the temp that is created is initialized or
not. Some recent fixes were made to privatization (#135698, #137869) but
did not get propagated to reductions, even though they need to return
the yield the same things from their init regions.
To mitigate this discrepancy in the future, refactor the init region
recipes so they can be shared between the three recipe ops.
Also add "none" to the OpenACC_ReductionOperator enum for better error
checking.
Switch from `int64_t` to `int64_t*` to fit with the rest of the
implementation.
New tentative with some fix. The previous was reverted some time ago.
Reviewed in #138010
On Clang 17 the implicit copy assignment operator was issuing a warning because of the user-declared copy constructor. Declare the copy assignment operator as default.
This patch updates several passes to include the DLTI dialect, since
their use of the `fir::support::getOrSetMLIRDataLayout()` utility
function could, in some cases, require this dialect to be loaded in
advance.
Also, the `CUFComputeSharedMemoryOffsetsAndSize` pass has been updated
with a dependency to the GPU dialect, as its invocation to
`cuf::getOrCreateGPUModule()` would result in the same kind of error if
no other operations or attributes from that dialect were present in the
input MLIR module.
An assignment to a whole polymorphic object in a PURE subprogram that is
implemented by means of a defined assignment procedure shouldn't be
subjected to the same definability checks as it would be for an
intrinsic assignment (which would also require it to be allocatable).
Fixes https://github.com/llvm/llvm-project/issues/139129.
Fortran 2023 constraint C7108 prohibits the use of a structure
constructor in a way that is ambiguous with a generic function reference
(intrinsic or user-defined). Sadly, no Fortran compiler implements this
constraint, and the common portable interpretation seems to be the
generic resolution, not the structure constructor.
Restructure the processing of structure constructors in expression
analysis so that it can be driven both from the parse tree as well as
from generic resolution, and then use it to detect ambigous structure
constructor / generic function cases, so that a portability warning can
be issued. And document this as a new intentional violation of the
standard in Extensions.md.
Fixes https://github.com/llvm/llvm-project/issues/138807.
The compiler was emitting a warning about incompatible shapes being used
for two calls to the same procedure with an implicit interface when one
passed a whole array and the other passed a scalar. When the scalar is a
whole element of a contiguous array, however, we must allow for storage
association and not flag it as being a problem.
This is a fix for https://github.com/llvm/llvm-project/issues/134912
which is a problem with mapping `fir.boxchar<k>` type values to the
target i.e an `omp.target` op.
There really are two problems. Fixing the first exposed the second. The
first problem is that OpenMP lowering of maps in `omp.target` in Flang
cannot handle the mapping of a value that doesnt have a defining
operation. In other words, a value that is a block argument. This is handled
by mapping the value using a `MapInfoOp`.
The second problem this fixes is that it adds bounds to `omp.map.info`
ops that map `fir.char<k, ?>` types by extracting the length from the
corresponding `fir.boxchar`
Early HLFIR optimizations may experience problems with values
produced by hlfir.associate. In most cases this is a unique
local memory allocation, but it can also reuse some other
hlfir.expr memory sometimes. It seems to be safe to assume
unique allocation for trivial types, since we always
allocate new memory for them.
This change allows marking more designators producing an opaque
box with 'contiguous' attribute, e.g. like in test1 case
in flang/test/HLFIR/propagate-contiguous-attribute.fir.
This would make isSimplyContiguous() return true for such
designators allowing merging hlfir.eval_in_mem with hlfir.assign
where the LHS is a contiguous array section.
Depends on #139003
Address failing Fujitsu test suite cases that were broken by the patch
to defer the handling of !$ lines in -fopenmp vs. normal compilation to
actual compilation rather than processing them immediately in -E mode.
Tested on the samples in the bug report as well as all of the Fujitsu
tests that I could find that use !$ lines.
Fixes https://github.com/llvm/llvm-project/issues/136845.
This compiler allows an element of an assumed-shape array or POINTER to
be used in sequence association as an actual argument, so long as the
array is declared to have the CONTIGUOUS attribute.
Make sure that this extension is under control of a LanguageFeature
enum, so that a hypothetical compiler driver option could disable it,
and add an optional portability warning for its use.
Bring the typed expression representation of a coindexed reference up to
F'2023, which removed some restrictions that had allowed the current
representation to suffice for older revisions of the language. This new
representation is somewhat more simple -- it uses a DataRef as its base,
so any subscripts in a part-ref can be represented as an ArrayRef there.
Update the code that creates the CoarrayRef, and add more checking to
it, as well as actually capturing any STAT=, TEAM=, & TEAM_NUMBER=
specifiers that might appear. Enforce the constraint that the part-ref
must have subscripts if it is an array. (And update a pile of
copied-and-pasted test code that lacked such subscripts.)
This aims to implement most of the initial arguments for defaultmap
aside from firstprivate and none, and some of the more recent OpenMP 6
additions which will come in subsequent updates (with the OpenMP 6
variants needing parsing/semantic support first).
OpenACC routines annotations in separate compilation units currently get
ignored, which leads to errors in compilation. There are two reason for
currently ignoring open acc routine information and this PR is
addressing both.
- The module file reader doesn't read back in openacc directives from
module files.
- Simple fix in `flang/lib/Semantics/mod-file.cpp`
- The lowering to HLFIR doesn't generate routine directives for symbols
imported from other modules that are openacc routines.
- This is the majority of this diff, and is address by the changes that
start in `flang/lib/Lower/CallInterface.cpp`.
Currently, we do not generate the appropriate checks to check if an
optional
allocatable argument is present before accessing relevant components of
it,
in particular when creating bounds, we must generate a presence check
and we
must make sure we do not generate/keep an load external to the presence
check
by utilising the raw address rather than the regular address of the info
data structure.
Similarly in cases for optional allocatables we must treat them like
non-allocatable
arguments and generate an intermediate allocation that we can have as a
location
in memory that we can access later in the lowering without causing
segfaults when
we perform "mapping" on it, even if the end result is an empty
allocatable
(basically, we shouldn't explode if someone tries to map a non-present
optional,
similar to C++ when mapping null data).
The `AddDebugInfo` pass currently has a dependency on the `DLTI` MLIR
dialect caused by a call to the `fir::support::getOrSetMLIRDataLayout()`
utility function.
This dependency is not captured in the pass definition. This patch adds
the dependency and simplifies several unit tests that had to explicitly
use the `DLTI` dialect to prevent the missing dependency from causing
compiler failures.
This is a fix for the issue https://github.com/llvm/llvm-project/issues/137126
that turned out to be a driver issue.
FrontendActions has a loop to process multiple input files and `flang -fc1`
accept multiple files, but the semantic, lowering, and llvm codegen
actions were not re-entrant, and crash or weird behaviors occurred
when processing multiple files with `-fc1`.
This patch makes the actions reentrant by cleaning-up the contexts/modules
if needed on entry.
When lowering allocatables, the generated calls to runtime functions
were not using the runtime::createArguments utility which handles the
required conversions. createArguments is where I added the implicit
volatile casts to handle converting volatile variables to the
appropriate type based on their volatility in the callee. Because the
calls to allocatable runtime functions were not using this function,
their arguments were not casted to have the appropriate volatility.
Add a test to demonstrate that volatile and allocatable
class/box/reference types are appropriately casted before calling into
the runtime library.
Instead of using a recursive variadic template to perform the
conversions in createArguments, map over the arguments directly so that
createArguments can be called with an ArrayRef of arguments. Some cases
in Allocatable.cpp already had a vector of values at the point where
createArguments needed to be called - the new overload allows calling
with a vector of args or the variadic version with each argument spelled
out at the callsite.
This change resulted in the allocatable runtime calls having their
arguments converted left-to-right, which changed some of the test
results. I used CHECK-DAG to ignore the order.
Add some missing handling of volatile class entities, which I previously
missed because I had not yet enabled volatile class entities in Lower.
Fixes#108136
In #108136 (the new testcase), flang was missing the length parameter
required for the variable length string when boxing the global variable.
The code that is initializing global variables for OpenMP did not
support types with length parameters.
Instead of duplicating this initialization logic in OpenMP, I decided to
use the exact same initialization as is used in the base language
because this will already be well tested and will be updated for any new
types. The difference for OpenMP is that the global variables will be
zero initialized instead of left undefined.
Previously `Fortran::lower::createGlobalInitialization` was used to
share a smaller amount of the logic with the base language lowering. I
think this bug has demonstrated that helper was too low level to be
helpful, and it was only used in OpenMP so I have made it static inside
of ConvertVariable.cpp.
Support is added for parsing. Basic semantics support is added to
forward the code to Lowering. Lowering will emit a TODO error. Detailed
semantics checks and lowering is further work.
This patch,
- Added support for lowering of task_reduction to MLIR
- Added support for lowering of in_reduction to MLIR
- Fixed incorrect DSA handling for variables in the presence of 'in_reduction' clause.
ConstantBase has a constructor that takes a value of any type as an
input: template <typename T> ConstantBase(const T &). A derived type
Constant<T> is a member of many Expr<T> classes (as an alternative in
the member variant).
When trying (erroneously) to create Expr<T> from a wrong input, if the
specific instance of Expr<T> contains Constant<T>, it's that constructor
that will be instantiated, leading to cryptic and confusing errors.
Eliminate the constructor from overload for invalid input values to help
produce more meaningful diagnostics.
The variant in Expr<Type<TypeCategory::Logical, KIND>> only contains
Relational<SomeType>, not other, more specific Relational<T> types.
When calling AsGenericExpr for a value of type Relational<T>, the AsExpr
function will attempt to create Expr<> directly for Relational<T>, which
won't work for the above reason.
Implement an overload of AsExpr for Relational<T>, which will wrap the
Relational<T> in Relational<SomeType> before creating Expr<>.
Enabling volatility lowering by default revealed some issues in lowering
and op verification.
For example, given volatile variable of a nested type, accessing
structure members of a structure member would result in a volatility
mismatch when the inner structure member is designated (and thus a
verification error at compile time).
In other cases, I found correct codegen when the checks were disabled,
also related to allocatable types and how we handle volatile references
of boxes.
This hides the strict verification of fir and hlfir ops behind a flag so
I can iteratively improve lowering of volatile variables without causing
compile-time failures, keeping the strict verification on when running
tests.
The UPDATE clause can be specified on both ATOMIC and DEPOBJ directives.
Currently, the ATOMIC directive has its own handling of it, and the
definition of the UPDATE clause only supports its use in the DEPOBJ
directive, where it takes a dependence-type as an argument.
The UPDATE clause on the ATOMIC directive may not have any arguments.
Since the implementation of the ATOMIC construct will be modified to use
the standard handling of clauses, the definition of UPDATE should
reflect that.
[RFC on
discourse](https://discourse.llvm.org/t/rfc-volatile-representation-in-flang/85404/1)
Flang currently lacks support for volatile variables. For some cases,
the compiler produces TODO error messages and others are ignored. Some
of our tests are like the example from _C.4 Clause 8 notes: The VOLATILE
attribute (8.5.20)_ and require volatile variables.
Prior commits:
```
c9ec1bc753b0 [flang] Handle volatility in lowering and codegen (#135311)
e42f8609858f [flang][nfc] Support volatility in Fir ops (#134858)
b2711e1526f9 [flang][nfc] Support volatile on ref, box, and class types (#134386)
```
This patch,
- Added a new attribute `nontemporal` to fir.load and fir.store operation in the FIR dialect.
- Added a pass `lower-nontemporal` which is called before FIRToLLVM conversion pass and adds the nontemporal attribute to loads and stores on the list items specified in the nontemporal clause of the SIMD directive.
- Set the `UnitAttr:$nontemporal` to llvm.load and llvm.store operations during FIR to LLVM dialect conversion, if the corresponding fir.load or fir.store operations have the nontemporal attribute.
- Attached the `nontemporal metadata` to load and store instructions that have the nontemporal attribute, during LLVM dialect to LLVM IR translation.
This patch produces the following TBAA tree for a function:
```
Function root
|
"any access"
|
|- "descriptor member"
|- "any data access"
|
|- "dummy arg data"
|- "target data"
|
|- "allocated data"
|- "direct data"
|- "global data"
```
The TBAA tags are assigned using the following logic:
* All POINTER variables point to the root of "target data".
* Dummy arguments without POINTER/TARGET point to their
leafs under "dummy arg data".
* Dummy arguments with TARGET point to the root of "target data".
* Global variables without descriptors point to their leafs under
"global data" (including the ones with TARGET).
* Global variables with descriptors point to their leafs under
"direct data" (including the ones with TARGET).
* Locally allocated variables point to their leafs under
"allocated data" (including the ones with TARGET).
This change makes it possible to disambiguate globals like:
```
module data
real, allocatable :: a(:)
real, allocatable, target :: b(:)
end
```
Indeed, two direct references to global vars cannot alias
even if any/both of them have TARGET attribute.
In addition, the dummy arguments without POINTER/TARGET cannot alias
any other variable even with POINTER/TARGET. This was not expressed
in TBAA before this change.
As before, any "unknown" memory references (such as with Indirect
source, as classified by FIR alias analysis) may alias with
anything, as long as they point to the root of "any access".
Please think of the counterexamples for which this structure
may not work.
In cleaning up LowLevelIntrinsics I found some uncalled functions. We
would like to remove direct calls to llvm instructions wherever possible
to make it easier on consumers of our IR to match LLVM intrinsics, so if
this code is needed again we should use the op from the llvm dialect
instead anyways.