This helps to disambiguate accesses in the caller and the callee
after LLVM inlining in some apps. I did not see any performance
changes, but this is one step towards enabling other optimizations
in the apps that I am looking at.
The definition of llvm.noalias says:
```
... indicates that memory locations accessed via pointer values based on the argument or return value are not also accessed, during the execution of the function, via pointer values not based on the argument or return value. This guarantee only holds for memory locations that are modified, by any means, during the execution of the function.
```
I believe this exactly matches Fortran rules for the dummy arguments
that are modified during their subprogram execution.
I also set llvm.noalias and llvm.nocapture on the !fir.box<> arguments,
because the corresponding descriptors cannot be captured and cannot
alias anything (not based on them) during the execution of the
subprogram.
Refactors the utils needed to create privtization/locatization ops for
both the fir and OpenMP dialects into a shared location isolating OpenMP
stuff out of it as much as possible.
This helps to disambiguate accesses in the caller and the callee
after LLVM inlining in some apps. I did not see any performance
changes, but this is one step towards enabling other optimizations
in the apps that I am looking at.
The definition of llvm.noalias says:
```
... indicates that memory locations accessed via pointer values based on the argument or return value are not also accessed, during the execution of the function, via pointer values not based on the argument or return value. This guarantee only holds for memory locations that are modified, by any means, during the execution of the function.
```
I believe this exactly matches Fortran rules for the dummy arguments
that are modified during their subprogram execution.
I also set llvm.noalias and llvm.nocapture on the !fir.box<> arguments,
because the corresponding descriptors cannot be captured and cannot
alias anything (not based on them) during the execution of the
subprogram.
A dummy argument with an explicit INTEGER type of non-default kind can
be forward-referenced from a specification expression in many Fortran
compilers. Handle by adding type declaration statements to the initial
pass over a specification part's declaration constructs. Emit an
optional warning under -pedantic.
Fixes https://github.com/llvm/llvm-project/issues/140941.
FORMAT("J=",I3) is accepted by a few other Fortran compilers as a valid
format for input as well as for output. The character string edit
descriptor "J=" is interpreted as if it had been 2X on input, causing
two characters to be skipped over. The skipped characters don't have to
match the characters in the literal string. An optional warning is
emitted under control of the -pedantic option.
The current semantic check in place is incorrect, this patch fixes this.
Up to 1 **'default'** named mapper should be allowed for each derived
type.
The current semantic check only allows up to 1 **'default'** named
mapper across all derived types.
This also makes sure that declare mappers follow proper scoping rules
for both default and named mappers.
Co-authored-by: Raghu Maddhipatla <Raghu.Maddhipatla@amd.com>
This patch only supports the conversion from `fir.do_loop` to `scf.for`.
This pass is still experimental, and future work will focus on gradually
improving this conversion pass.
Co-authored-by: yanming <ming.yan@terapines.com>
Fix a crash caused by an invalid expression in the atomic capture
clause, due to the `checkForSymbolMatch` function not accounting for
`GetExpr` potentially returning null.
Fix https://github.com/llvm/llvm-project/issues/139884
Flang at Ofast usually produces executables that consume more stack that
other Fortran compilers.
This is in part because the alloca created from temporary heap
allocation by the StackArray pass are created at the function scope
level without lifetimes, and LLVM does not/is not able to merge alloca
that do not have overlapping lifetimes.
This patch adds an option to generate LLVM lifetime in the StackArray
pass at the previous heap allocation/free using the LLVM dialect
operation for it.
This patch relies on #140235 and #139724 to speed-up compilations of
files with derived type array global with initial value.
Currently, such derived type global init was lowered to an
llvm.mlir.insertvalue chain in the LLVM IR dialect because there was no
way to represent such value via attributes.
This chain was later folded in LLVM dialect to LLVM IR using LLVM IR
(not dialect) folding. This insert chain generation and folding is very
expensive for big arrays. For instance, this patch brings down the
compilation of FM_lib fmsave.f95 from 50s
to 0.5s.
Between firstprivate, private and reduction init regions, the difference
is largely whether or not the temp that is created is initialized or
not. Some recent fixes were made to privatization (#135698, #137869) but
did not get propagated to reductions, even though they need to return
the yield the same things from their init regions.
To mitigate this discrepancy in the future, refactor the init region
recipes so they can be shared between the three recipe ops.
Also add "none" to the OpenACC_ReductionOperator enum for better error
checking.
Switch from `int64_t` to `int64_t*` to fit with the rest of the
implementation.
New tentative with some fix. The previous was reverted some time ago.
Reviewed in #138010
On Clang 17 the implicit copy assignment operator was issuing a warning because of the user-declared copy constructor. Declare the copy assignment operator as default.
This patch updates several passes to include the DLTI dialect, since
their use of the `fir::support::getOrSetMLIRDataLayout()` utility
function could, in some cases, require this dialect to be loaded in
advance.
Also, the `CUFComputeSharedMemoryOffsetsAndSize` pass has been updated
with a dependency to the GPU dialect, as its invocation to
`cuf::getOrCreateGPUModule()` would result in the same kind of error if
no other operations or attributes from that dialect were present in the
input MLIR module.
An assignment to a whole polymorphic object in a PURE subprogram that is
implemented by means of a defined assignment procedure shouldn't be
subjected to the same definability checks as it would be for an
intrinsic assignment (which would also require it to be allocatable).
Fixes https://github.com/llvm/llvm-project/issues/139129.
Fortran 2023 constraint C7108 prohibits the use of a structure
constructor in a way that is ambiguous with a generic function reference
(intrinsic or user-defined). Sadly, no Fortran compiler implements this
constraint, and the common portable interpretation seems to be the
generic resolution, not the structure constructor.
Restructure the processing of structure constructors in expression
analysis so that it can be driven both from the parse tree as well as
from generic resolution, and then use it to detect ambigous structure
constructor / generic function cases, so that a portability warning can
be issued. And document this as a new intentional violation of the
standard in Extensions.md.
Fixes https://github.com/llvm/llvm-project/issues/138807.
The compiler was emitting a warning about incompatible shapes being used
for two calls to the same procedure with an implicit interface when one
passed a whole array and the other passed a scalar. When the scalar is a
whole element of a contiguous array, however, we must allow for storage
association and not flag it as being a problem.
This is a fix for https://github.com/llvm/llvm-project/issues/134912
which is a problem with mapping `fir.boxchar<k>` type values to the
target i.e an `omp.target` op.
There really are two problems. Fixing the first exposed the second. The
first problem is that OpenMP lowering of maps in `omp.target` in Flang
cannot handle the mapping of a value that doesnt have a defining
operation. In other words, a value that is a block argument. This is handled
by mapping the value using a `MapInfoOp`.
The second problem this fixes is that it adds bounds to `omp.map.info`
ops that map `fir.char<k, ?>` types by extracting the length from the
corresponding `fir.boxchar`
Early HLFIR optimizations may experience problems with values
produced by hlfir.associate. In most cases this is a unique
local memory allocation, but it can also reuse some other
hlfir.expr memory sometimes. It seems to be safe to assume
unique allocation for trivial types, since we always
allocate new memory for them.
This change allows marking more designators producing an opaque
box with 'contiguous' attribute, e.g. like in test1 case
in flang/test/HLFIR/propagate-contiguous-attribute.fir.
This would make isSimplyContiguous() return true for such
designators allowing merging hlfir.eval_in_mem with hlfir.assign
where the LHS is a contiguous array section.
Depends on #139003
Address failing Fujitsu test suite cases that were broken by the patch
to defer the handling of !$ lines in -fopenmp vs. normal compilation to
actual compilation rather than processing them immediately in -E mode.
Tested on the samples in the bug report as well as all of the Fujitsu
tests that I could find that use !$ lines.
Fixes https://github.com/llvm/llvm-project/issues/136845.
This compiler allows an element of an assumed-shape array or POINTER to
be used in sequence association as an actual argument, so long as the
array is declared to have the CONTIGUOUS attribute.
Make sure that this extension is under control of a LanguageFeature
enum, so that a hypothetical compiler driver option could disable it,
and add an optional portability warning for its use.
Bring the typed expression representation of a coindexed reference up to
F'2023, which removed some restrictions that had allowed the current
representation to suffice for older revisions of the language. This new
representation is somewhat more simple -- it uses a DataRef as its base,
so any subscripts in a part-ref can be represented as an ArrayRef there.
Update the code that creates the CoarrayRef, and add more checking to
it, as well as actually capturing any STAT=, TEAM=, & TEAM_NUMBER=
specifiers that might appear. Enforce the constraint that the part-ref
must have subscripts if it is an array. (And update a pile of
copied-and-pasted test code that lacked such subscripts.)
This aims to implement most of the initial arguments for defaultmap
aside from firstprivate and none, and some of the more recent OpenMP 6
additions which will come in subsequent updates (with the OpenMP 6
variants needing parsing/semantic support first).
OpenACC routines annotations in separate compilation units currently get
ignored, which leads to errors in compilation. There are two reason for
currently ignoring open acc routine information and this PR is
addressing both.
- The module file reader doesn't read back in openacc directives from
module files.
- Simple fix in `flang/lib/Semantics/mod-file.cpp`
- The lowering to HLFIR doesn't generate routine directives for symbols
imported from other modules that are openacc routines.
- This is the majority of this diff, and is address by the changes that
start in `flang/lib/Lower/CallInterface.cpp`.
Currently, we do not generate the appropriate checks to check if an
optional
allocatable argument is present before accessing relevant components of
it,
in particular when creating bounds, we must generate a presence check
and we
must make sure we do not generate/keep an load external to the presence
check
by utilising the raw address rather than the regular address of the info
data structure.
Similarly in cases for optional allocatables we must treat them like
non-allocatable
arguments and generate an intermediate allocation that we can have as a
location
in memory that we can access later in the lowering without causing
segfaults when
we perform "mapping" on it, even if the end result is an empty
allocatable
(basically, we shouldn't explode if someone tries to map a non-present
optional,
similar to C++ when mapping null data).
The `AddDebugInfo` pass currently has a dependency on the `DLTI` MLIR
dialect caused by a call to the `fir::support::getOrSetMLIRDataLayout()`
utility function.
This dependency is not captured in the pass definition. This patch adds
the dependency and simplifies several unit tests that had to explicitly
use the `DLTI` dialect to prevent the missing dependency from causing
compiler failures.
This is a fix for the issue https://github.com/llvm/llvm-project/issues/137126
that turned out to be a driver issue.
FrontendActions has a loop to process multiple input files and `flang -fc1`
accept multiple files, but the semantic, lowering, and llvm codegen
actions were not re-entrant, and crash or weird behaviors occurred
when processing multiple files with `-fc1`.
This patch makes the actions reentrant by cleaning-up the contexts/modules
if needed on entry.
When lowering allocatables, the generated calls to runtime functions
were not using the runtime::createArguments utility which handles the
required conversions. createArguments is where I added the implicit
volatile casts to handle converting volatile variables to the
appropriate type based on their volatility in the callee. Because the
calls to allocatable runtime functions were not using this function,
their arguments were not casted to have the appropriate volatility.
Add a test to demonstrate that volatile and allocatable
class/box/reference types are appropriately casted before calling into
the runtime library.
Instead of using a recursive variadic template to perform the
conversions in createArguments, map over the arguments directly so that
createArguments can be called with an ArrayRef of arguments. Some cases
in Allocatable.cpp already had a vector of values at the point where
createArguments needed to be called - the new overload allows calling
with a vector of args or the variadic version with each argument spelled
out at the callsite.
This change resulted in the allocatable runtime calls having their
arguments converted left-to-right, which changed some of the test
results. I used CHECK-DAG to ignore the order.
Add some missing handling of volatile class entities, which I previously
missed because I had not yet enabled volatile class entities in Lower.
Fixes#108136
In #108136 (the new testcase), flang was missing the length parameter
required for the variable length string when boxing the global variable.
The code that is initializing global variables for OpenMP did not
support types with length parameters.
Instead of duplicating this initialization logic in OpenMP, I decided to
use the exact same initialization as is used in the base language
because this will already be well tested and will be updated for any new
types. The difference for OpenMP is that the global variables will be
zero initialized instead of left undefined.
Previously `Fortran::lower::createGlobalInitialization` was used to
share a smaller amount of the logic with the base language lowering. I
think this bug has demonstrated that helper was too low level to be
helpful, and it was only used in OpenMP so I have made it static inside
of ConvertVariable.cpp.
Support is added for parsing. Basic semantics support is added to
forward the code to Lowering. Lowering will emit a TODO error. Detailed
semantics checks and lowering is further work.
This patch,
- Added support for lowering of task_reduction to MLIR
- Added support for lowering of in_reduction to MLIR
- Fixed incorrect DSA handling for variables in the presence of 'in_reduction' clause.
ConstantBase has a constructor that takes a value of any type as an
input: template <typename T> ConstantBase(const T &). A derived type
Constant<T> is a member of many Expr<T> classes (as an alternative in
the member variant).
When trying (erroneously) to create Expr<T> from a wrong input, if the
specific instance of Expr<T> contains Constant<T>, it's that constructor
that will be instantiated, leading to cryptic and confusing errors.
Eliminate the constructor from overload for invalid input values to help
produce more meaningful diagnostics.
The variant in Expr<Type<TypeCategory::Logical, KIND>> only contains
Relational<SomeType>, not other, more specific Relational<T> types.
When calling AsGenericExpr for a value of type Relational<T>, the AsExpr
function will attempt to create Expr<> directly for Relational<T>, which
won't work for the above reason.
Implement an overload of AsExpr for Relational<T>, which will wrap the
Relational<T> in Relational<SomeType> before creating Expr<>.
Enabling volatility lowering by default revealed some issues in lowering
and op verification.
For example, given volatile variable of a nested type, accessing
structure members of a structure member would result in a volatility
mismatch when the inner structure member is designated (and thus a
verification error at compile time).
In other cases, I found correct codegen when the checks were disabled,
also related to allocatable types and how we handle volatile references
of boxes.
This hides the strict verification of fir and hlfir ops behind a flag so
I can iteratively improve lowering of volatile variables without causing
compile-time failures, keeping the strict verification on when running
tests.