Move non-common files from FortranCommon to FortranSupport (analogous to
LLVMSupport) such that
* declarations and definitions that are only used by the Flang compiler,
but not by the runtime, are moved to FortranSupport
* declarations and definitions that are used by both ("common"), the
compiler and the runtime, remain in FortranCommon
* generic STL-like/ADT/utility classes and algorithms remain in
FortranCommon
This allows a for cleaner separation between compiler and runtime
components, which are compiled differently. For instance, runtime
sources must not use STL's `<optional>` which causes problems with CUDA
support. Instead, the surrogate header `flang/Common/optional.h` must be
used. This PR fixes this for `fast-int-sel.h`.
Declarations in include/Runtime are also used by both, but are
header-only. `ISO_Fortran_binding_wrapper.h`, a header used by compiler
and runtime, is also moved into FortranCommon.
The new LLVM stack save/restore intrinsic operations are more convenient
than function calls because they do not add function declarations to the
module and therefore do not block the parallelisation of passes.
Furthermore they could be much more easily marked with memory effects
than function calls if that ever proved useful.
This builds on top of #107879.
Resolves#108016
This PR aims to factor in the allocation address space provided by an
architectures data layout when generating the intrinsic instructions,
this allows them to be lowered later with the address spaces in tow.
This aligns the intrinsic creation with the LLVM IRBuilder's
https://github.com/llvm/llvm-project/blob/main/llvm/include/llvm/IR/IRBuilder.h#L1053
This is also necessary for the below example to compile for OpenMP AMD
GPU and not ICE the compiler in ISEL as AMD's stackrestore and stacksave
are expected to have the appropriate allocation address space for AMD
GPU.
program main
integer(4), allocatable :: test
allocate(test)
!$omp target map(tofrom:test)
do i = 1, 10
test = test + 50
end do
!$omp end target
deallocate(test)
end program
The PR also fixes the issue I opened a while ago which hits the same
error when compiling for AMDGPU:
https://github.com/llvm/llvm-project/issues/82368
Although, you have to have the appropriate GPU LIBC and Fortran offload
runtime (both compiled for AMDGPU) added to the linker for the command
or it will reach another ISEL error and ICE weirdly. But with the
pre-requisites it works fine with this PR.
Some passes in the flang pipeline are creating `fir.alloca` operation
like `hlfir.concat`. When these allocas are located in a loop, the stack
can quickly be used too much leading to segfaults.
This behavior can be seen in
https://github.com/jacobwilliams/json-fortran/blob/master/src/tests/jf_test_36.F90
This patch insert a call to LLVM stacksave/stackrestore in the body of
the loop to reclaim the alloca in its scope.
This PR is an alternative implementation to #95173