When a module variable is referenced inside an internal procedure, but
the use statement for the module is inside the host, semantics may not
create any symbols with HostAssocDetails directly under the internal
procedure scope.
So pft::getScopeVariableList, that is called in the bridge when lowering
the internal procedure scope, failed to instantiate the module
variables. This lead to "symbol is not mapped to any IR value" compile
time errors.
This patch fixes the issue by adding the variables to the list of
"captured" global variables from the host program, so that they are
instantiated as part of the `internalProcedureBindings` in the bridge.
The rational of doing it that way instead of changing
`getScopeVariableList` is that `getScopeVariableList` would have to
import all the module variables used inside the host since it cannot
know which ones are referenced inside the internal procedure from the
semantics::Scope information. The fix in this patch only instantiates
the module variables from the host that are actually referenced inside
the internal procedure.
With HLFIR the lbounds for the ALLOCATABLE result are taken from the
mutable box created for the result, so the non-default lbounds might be
propagated further causing incorrect result, e.g.:
```
program p
real, allocatable :: p5(:)
allocate(p5, source=real_init())
print *, lbound(p5, 1) ! must print 1, but prints 7
contains
function real_init()
real, allocatable :: real_init(:)
allocate(real_init(7:8))
end function real_init
end program p
```
With FIR lowering the box passed for `source` has explicit lower bound 1
at the call site, but the runtime box initialized by `real_init` call
still has lower bound 7. I am not sure if the runtime box initialized by
`real_init` will ever be accessed in a debugger via Fortran variable
names, but I think that having the right runtime bounds that can be
accessible via examining registers/stack might be good in general. So I
decided to update the runtime bounds at the point of return.
This change fixes the test above for HLFIR.
Reviewed By: jeanPerier
Differential Revision: https://reviews.llvm.org/D156187
HLFIR lowering always adds hlfir.declare when symbols are bound to their
address allocated on the stack. Ensure that the declare is placed along
with the alloca if it is hoisted. And always return the mlir value that
is bound to the symbol (i.e the alloca in FIR lowering and the declare
in HLFIR lowering).
Context: Loop index variables in OpenMP parallel regions should be
privatised to work correctly.
Reviewed By: tblah
Differential Revision: https://reviews.llvm.org/D158594
Unlike other executable constructs with associating selectors, the
selector of a SELECT RANK construct can have the ALLOCATABLE or POINTER
attribute, and will work as an allocatable or object pointer within
each rank case, so long as there is no RANK(*) case.
Getting this right exposed a correctness risk with the popular
predicate IsAllocatableOrPointer() -- it will be true for procedure
pointers as well as object pointers, and in many contexts, a procedure
pointer should not be acceptable. So this patch adds the new predicate
IsAllocatableOrObjectPointer(), and updates some call sites of the original
function to use the new one.
Differential Revision: https://reviews.llvm.org/D159043
Commonblock names are not variables, but they can be marked as
threadprivate in OpenMP. This requires the commonblock name to
be bound to the address of the Commonblock. hlfir.declares are
not required for these, but we should be able to retrieve the
mlir Value corresponding to the Commonblock. This patch enables
this by special casing the Commonblocks like procedures.
Reviewed By: tblah, vzakhari
Differential Revision: https://reviews.llvm.org/D158070
Lower the acc delcare directive in function/subroutine
to the newly introduced acc.declare operation. Only a single
acc.declare operation is procduced in a function or subroutine
so they don't end up nested.
Depends on D158314
Reviewed By: razvanlupusoru
Differential Revision: https://reviews.llvm.org/D158315
The routine directive can appear in the specification part of
a subroutine, function or module and therefore appear before the
function or subroutine is lowered. We keep track of the created
routine info attribute and attach them to the function at the end
of the lowering if the directive appeared before the function was
lowered.
Reviewed By: razvanlupusoru
Differential Revision: https://reviews.llvm.org/D158204
This patch adds lowering for the exit part of the OpenACC declare construct
in function/subroutine.
Depends on D156560
Reviewed By: razvanlupusoru
Differential Revision: https://reviews.llvm.org/D156568
This patch provides support for usage of common block
in private/firstprivate and lastprivate clauses.
Reviewed By: kiranchandramohan
Differential Revision: https://reviews.llvm.org/D156120
This supports the common block in OpenMP privat clause by making
each common block member host-associated privatization and
adds the test case.
Reviewed By: kiranchandramohan
Differential Revision: https://reviews.llvm.org/D127215
This patch adds the skeleton and the basic lowering for OpenACC declare
construct when located in the module declaration. This patch just lower the
create clause with or without modifier. Other clause and global descrutor
lowering will come in follow up patches to keep this one small enough for
review.
Reviewed By: razvanlupusoru
Differential Revision: https://reviews.llvm.org/D156266
This is an attempt at mimicing the method in which
threadprivate handles the following type of variables:
program main
integer :: i
!$omp declare target to(i)
end
Which essentially generates a GlobalOp for the variable (which
would normally only be an alloca) when it's instantiated. The
main difference is there is no operation generated within the
function, instead the declare target attribute is appended
later within handleDeclareTarget.
Reviewers: kiranchandramohan
Differential Revision: https://reviews.llvm.org/D152037
This patch implements an early outlining transform of omp.target operations in
flang. The pass is needed because optimizations may cross target op region
boundaries, but with the outlining the resulting functions only contain a
single omp.target op plus a func.return, so there should not be any opportunity
to optimize across region boundaries.
The patch also adds an interface to be able to store and retrieve the parent
function name of the original target operation. This is needed to be able to
create correct kernel function names when lowering to LLVM-IR.
Reviewed By: kiranchandramohan, domada
Differential Revision: https://reviews.llvm.org/D154879
This patch lowers allocatables and pointers named in "private" OpenMP clause.
Reviewed By: kiranchandramohan
Differential Revision: https://reviews.llvm.org/D148570
This patch extends the logic for lowering loop construct reductions to parallel block reductions.
Reviewed By: kiranchandramohan
Differential Revision: https://reviews.llvm.org/D154182
This patch adds 'unordered' attribute handling the HLFIR elementals'
builders and fixes the attribute handling in lowering and transformations.
Depends on D154031, D154032
Reviewed By: jeanPerier, tblah
Differential Revision: https://reviews.llvm.org/D154035
Extend the SourceFile class to take account of #line directives
when computing source file positions for error messages.
Adjust the output of #line directives to -E output so that they
reflect any #line directives that were in the input.
Differential Revision: https://reviews.llvm.org/D153910
Lowering relies on dead code generation / unreachable block deletion
to delete some code that is potentially invalid.
However, calling mlir::simplifyRegion also merges block, which may
promote SSA values to block arguments. Not all FIR types are intended
to be block arguments.
The added test shows an example where block merging led to
fir.shape<> being block arguments (and a failure later in codegen).
Reviewed By: tblah, clementval, vdonaldson
Differential Revision: https://reviews.llvm.org/D153858
Lower user defined assignment inside the hlfir.region_assign
"userDefinedAssignment" mlir region.
This is done by adding an entry point to ConvertCall.h in order
to call genUserCall with the region block arguments as arguments.
The codegen for hlfir.region_assign with user defined assignment
will be added in a later patch.
Differential Revision: https://reviews.llvm.org/D153404
When the ENTRY statement is used, the same source can return different
types depending on the entry point. These different return values are
storage associated (share the same storage). Previously, this led to the
declaration of the results to all have the largest type. This patch adds
a convert between the stack allocation and the declaration so that the
hlfir.decl gets the right type.
I haven't managed to generate code where this convert converted a
reference to an allocation for a smaller type into an allocation for a
larger one, but I have added an assert just in case.
This is a different solution to https://reviews.llvm.org/D152725, see
discussion there.
Differential Revision: https://reviews.llvm.org/D152931
SmallSetVector has an inefficiency where it does set insertions
regardless of the number of elements present within it. This contrasts
with other "Small-" containers where they use linear scan up to a
certain size "N", after which they switch to another strategy.
This patch implements this functionality in SetVector, adding a template
parameter "N" which specifies the number of elements upto which the
SetVector follows the "small" strategy. Due to the use of "if
constexpr", there is no "small" code emitted when N is 0 which makes
this a zero overhead change for users using the default behaviour.
This change also allows having SmallSetVector use DenseSet instead of
SmallDenseSet by default, which helps a little with performance.
The reason for implementing this functionality in SetVector instead of
SmallSetVector is that it allows reusing all the code that is already
there and it is just augmented with the "isSmall" checks.
This change gives a good speedup (0.4%):
https://llvm-compile-time-tracker.com/compare.php?from=086601eac266ec253bf313c746390ff3e5656132&to=acd0a72a4d3ee840f7b455d1b35d82b11ffdb3c0&stat=instructions%3Au
Differential Revision: https://reviews.llvm.org/D152497
Begin upstreaming of CUDA Fortran support in LLVM Flang.
This first patch implements parsing for CUDA Fortran syntax,
including:
- a new LanguageFeature enum value for CUDA Fortran
- driver change to enable that feature for *.cuf and *.CUF source files
- parse tree representation of CUDA Fortran syntax
- dumping and unparsing of the parse tree
- the actual parsers for CUDA Fortran syntax
- prescanning support for !@CUF and !$CUF
- basic sanity testing via unparsing and parse tree dumps
... along with any minimized changes elsewhere to make these
work, mostly no-op cases in common::visitors instances in
semantics and lowering to allow them to compile in the face
of new types in variant<> instances in the parse tree.
Because CUDA Fortran allows the kernel launch chevron syntax
("call foo<<<blocks, threads>>>()") only on CALL statements and
not on function references, the parse tree nodes for CallStmt,
FunctionReference, and their shared Call were rearranged a bit;
this caused a fair amount of one-line changes in many files.
More patches will follow that implement CUDA Fortran in the symbol
table and name resolution, and then semantic checking.
Differential Revision: https://reviews.llvm.org/D150159
In some scenarios, a SELECT CASE could cause an error while lowering to FIR.
This was caused by a spurious extra branch added after the end statement.
Fixes#62726
Differential Revision: https://reviews.llvm.org/D151118
The following PowerPC vector type syntax is added:
VECTOR ( element-type-spec )
where element-type-sec is integer-type-spec, real-type-sec or unsigned-type-spec.
Two opaque types (__VECTOR_PAIR and __VECTOR_QUAD) are also added.
A finite set of functionalities are implemented in order to support the new types:
1. declare objects
2. declare function result
3. declare type dummy arguments
4. intrinsic assignment between the new type objects (e.g. v1=v2)
5. reference functions that return the new types
Submit on behalf of @tislam @danielcchen
Authors: @tislam @danielcchen
Differential Revision: https://reviews.llvm.org/D150876
Generate supporting data structures and calls to new runtime IO functions
for defined IO that accesses non-type-bound procedures, such as `wft` in:
module m1
type t
integer n
end type
interface write(formatted)
module procedure wft
end interface
contains
subroutine wft(dtv, unit, iotype, v_list, iostat, iomsg)
class(t), intent(in) :: dtv
integer, intent(in) :: unit
character(*), intent(in) :: iotype
integer, intent(in) :: v_list(:)
integer, intent(out) :: iostat
character(*), intent(inout) :: iomsg
iostat = 0
write(unit,*,iostat=iostat,iomsg=iomsg) 'wft was called: ', dtv%n
end subroutine
end module
module m2
contains
subroutine test1
use m1
print *, 'test1, should call wft: ', t(1)
end subroutine
subroutine test2
use m1, only: t
print *, 'test2, should not call wft: ', t(2)
end subroutine
end module
use m1
use m2
call test1
call test2
print *, 'main, should call wft: ', t(3)
end
Symbols corresponding to entries returning character results
must be mapped to EmboxCharOp, first, before we can map them
to DeclareOp. The code may be reworked after HLFIR is enabled
by default, but right now it seems like an acceptable solution to me.
Differential Revision: https://reviews.llvm.org/D150749
Fortran doesn't allow inaccessible procedure bindings to be
overridden, and this needs to apply to generic resolution.
When resolving a type-bound generic procedure from another
module, ensure only that the most extended override from its
module is used if it is PRIVATE, not a later apparent override
from another module.
Differential Revision: https://reviews.llvm.org/D150721
The global names were created using a hash based on the address
of std::vector::data address. Since the memory may be reused
by different std::vector's, this may cause non-equivalent
constant expressions to map to the same name. This is what is happening
in the modified flang/test/Lower/constant-literal-mangling.f90 test.
I changed the name creation to use a map between the constant expressions
and corresponding unique names. The uniquing is done using a name counter
in FirConverter. The effect of this change is that the equivalent
constant expressions are now mapped to the same global, and the naming
is "stable" (i.e. it does not change from compilation to compilation).
Though, the issue is not HLFIR specific it was affecting several tests
when using HLFIR lowering.
Differential Revision: https://reviews.llvm.org/D150380
This patch lowers assignments to vector subscripted designators into the
newly added hlfir.elemental_addr and hlfir.region_assign.
Note that the codegen of these operation to FIR is still TODO and will
still emit a TODO message when trying to compile programs end to end.
Differential Revision: https://reviews.llvm.org/D149962
Lower Forall to the previously added hlfir.forall, hlfir.forall_mask.
hlfir.forall_index, and hlfir.region_assign operations.
The HLFIR assignment code lowering is moved into genDataAssignment for
more readability and so that user defined assignment (still a TODO),
will be able to share most of the logic.
Differential Revision: https://reviews.llvm.org/D149878
hlfir.assign, in general, ends up calling the Assign runtime that asserts
that the types of LHS and RHS match. In case of implicit logical<->integer
conversions (allowed as an extension) the operands of hlfir.assign
have non-matching types. This change makes sure that the lowering
produces explicit type cast (either as a scalar fir.convert or
as a hlfir.elemental producing array expression).
Reviewed By: jeanPerier
Differential Revision: https://reviews.llvm.org/D149765
Lower vector subscripted designators as values when they appear outside
of the assignment left-hand side and input IO contexts.
This matches Fortran semantics where vector subscripted designators cannot
be written to outside of the two contexts mentioned above: they are
passed/taken by value where they appear.
This patch uses the added hlfir.element_addr to lower vector designators
in lowering. But when reaching the end of the designator lowering, the
hlfir.element_addr is turned into an hlfir.elemental when lowering is
not asking for the hlfir.elemental_addr.
This approach allows lowering vector subscripted in the same way in
while visiting the designator, and only adapt to the context at the
edge.
The part where lowering uses the hlfir.elemental_addr will be
done in further patch as it requires lowering assignments in the
new hlfir.region_assign op, and there is not codegen yet for these
new operations.
Differential Revision: https://reviews.llvm.org/D149480
The explicit `ignoring all compiler directives` reminder warning is no
longer accurate. Any similar, more accurate message is best generated
by the front end (change pending).
Modify code generation for assigned gotos to generate a runtime error
for most cases that violate F90 Clause 8.2.4, rather than treating a
nonconformant GOTO as a nop. For example, generate a runtime error for
a GOTO that attempts to branch to a label for a FORMAT statement.
Relax the requirement that an assigned GOTO with a label list must
branch to a label in the list, and instead allow a branch to any valid
assigned GOTO target in scope.
The current lowering for polymorphic pointer association was not
dealing with NULL in a "context aware" fashion: it was calling the
`PointerAssociate` runtime entry point with a fir.box<none> target.
But the fir.box<none> is a descriptor for a scalar, this lead the
runtime to set the pointer rank to zero, regardless of its actual
rank.
I do not think there is a way to expose this problem with the Fortran
code currently supported by flang, because most further manipulation of
the pointer would either set the rank correctly, or do not rely on the
rank in the runtime descriptor.
However, this is incorrect, and when assumed rank are supported, the
following would have failed:
```
subroutine check_rank(p)
class(*), pointer :: p(..)
p => null()
select rank(p)
rank (1)
print *, "OK"
rank default
print *, "FAILED"
end select
end subroutine
class(*), pointer :: p(:)
p => null()
call check_rank(p)
end
```
Instead, detect NULL() in polymorphic pointer lowering and trigger the
deallocation of the pointer.
Differential Revision: https://reviews.llvm.org/D147317
When the LHS is referring to a parent component the box need to be
reboxed to the parent component type so the runtime can handle the
assignment correctly.
Reviewed By: PeteSteinfeld
Differential Revision: https://reviews.llvm.org/D146046
Make use of the new runtime entry point for assignment to
LHS allocatable polymorphic.
Reviewed By: jeanPerier
Differential Revision: https://reviews.llvm.org/D145324
A block construct is an execution control construct that supports
declaration scopes contained within a parent subprogram scope or another
block scope. (blocks may be nested.) This is implemented by applying
basic scope processing to the block level.
Name uniquing/mangling is extended to support this. The term "block" is
heavily overloaded in Fortran standards. Prior name uniquing used tag `B`
for common block objects. Existing tag choices were modified to free up `B`
for block construct entities, and `C` for common blocks, and resolve
additional issues with other tags. The "old tag -> new tag" changes can
be summarized as:
-> B -- block construct -> new
B -> C -- common block
C -> YI -- intrinsic type descriptor; not currently generated
CT -> Y -- nonintrinsic type descriptor; not currently generated
G -> N -- namelist group
L -> -- block data; not needed -> deleted
Existing name uniquing components consist of a tag followed by a name
from user source code, such as a module, subprogram, or variable name.
Block constructs are different in that they may be anonymous. (Like other
constructs, a block may have a `block-construct-name` that can be used
in exit statements, but this name is optional.) So blocks are given a
numeric compiler-generated preorder index starting with `B1`, `B2`,
and so on, on a per-procedure basis.
Name uniquing is also modified to include component names for all
containing procedures rather than for just the immediate host. This
fixes an existing name clash bug with same-named entities in same-named
host subprograms contained in different-named containing subprograms,
and variations of the bug involving modules and submodules.
F18 clause 9.7.3.1 (Deallocation of allocatable variables) paragraph 1
has a requirement that an allocated, unsaved allocatable local variable
must be deallocated on procedure exit. The following paragraph 2 states:
When a BLOCK construct terminates, any unsaved allocated allocatable
local variable of the construct is deallocated.
Similarly, F18 clause 7.5.6.3 (When finalization occurs) paragraph 3
has a requirement that a nonpointer, nonallocatable object must be
finalized on procedure exit. The following paragraph 4 states:
A nonpointer nonallocatable local variable of a BLOCK construct
is finalized immediately before it would become undefined due to
termination of the BLOCK construct.
These deallocation and finalization requirements, along with stack
restoration requirements, require knowledge of block exits. In addition
to normal block termination at an end-block-stmt, a block may be
terminated by executing a branching statement that targets a statement
outside of the block. This includes
Single-target branch statements:
- goto
- exit
- cycle
- return
Bounded multiple-target branch statements:
- arithmetic goto
- IO statement with END, EOR, or ERR specifiers
Unbounded multiple-target branch statements:
- call with alternate return specs
- computed goto
- assigned goto
Lowering code is extended to determine if one of these branches exits
one or more relevant blocks or other constructs, and adds a mechanism to
insert any necessary deallocation, finalization, or stack restoration
code at the source of the branch. For a single-target branch it suffices
to generate the exit code just prior to taking the indicated branch.
Each target of a multiple-target branch must be analyzed individually.
Where necessary, the code must first branch to an intermediate basic
block that contains exit code, followed by a branch to the original target
statement.
This patch implements an `activeConstructStack` construct exit mechanism
that queries a new `activeConstruct` PFT bit to insert stack restoration
code at block exits. It ties in to existing code in ConvertVariable.cpp
routine `instantiateLocal` which has code for finalization, making block
exit finalization on par with subprogram exit finalization. Deallocation
is as yet unimplemented for subprograms or blocks. This may result in
memory leaks for affected objects at either the subprogram or block level.
Deallocation cases can be addressed uniformly for both scopes in a future
patch, presumably with code insertion in routine `instantiateLocal`.
The exit code mechanism is not limited to block construct exits. It is
also available for use with other constructs. In particular, it is used
to replace custom deallocation code for a select case construct character
selector expression where applicable. This functionality is also added
to select type and associate constructs. It is available for use with
other constructs, such as select rank and image control constructs,
if that turns out to be necessary.
Overlapping nonfunctional changes include eliminating "FIR" from some
routine names and eliminating obsolete spaces in comments.
- always use genExprAddr when lowering to HLFIR: it does not create
temporary for array sections without vector subscripts, so there is
no need to have custom logic.
- update mangling to deal with AssocDetailsEntity. Their name is
required in HLFIR so that it can be added to the hlfir.declare
that is created for the selector once it is lowered. This should
allow getting debug info for selector when debug info are generated
from hlfir.declare.
The rest of associate construct lowering is unchanged and shared with
the current lowering.
This patch also enables select type lowering to work properly, but some
other todos (mainly about parent component references) prevents porting
the tests for now, so this will be done later.
Differential Revision: https://reviews.llvm.org/D144740
Use the runtime when there lhs or rhs is polymorphic. The runtime
allows to deal better with polymorphic entities and aliasing.
Reviewed By: PeteSteinfeld
Differential Revision: https://reviews.llvm.org/D144418