271 Commits

Author SHA1 Message Date
Matthias Springer
599c739905
[mlir][GPU] Add NVVM-specific cf.assert lowering (#120431)
This commit add an NVIDIA-specific lowering of `cf.assert` to to
`__assertfail`.

Note: `getUniqueFormatGlobalName`, `getOrCreateFormatStringConstant` and
`getOrDefineFunction` are moved to `GPUOpsLowering.h`, so that they can
be reused.
2025-01-06 12:00:11 +01:00
Matthias Springer
c870632ef6
[flang] Fix some memory leaks (#121050)
This commit fixes some but not all memory leaks in Flang. There are
still 91 tests that fail with ASAN.

- Use `mlir::OwningOpRef` instead of `std::unique_ptr`. The latter does
not free allocations of nested blocks.
- Pass `ModuleOp` as value instead of reference.
- Add few missing deallocations in test cases and other places.
2024-12-25 09:42:03 +01:00
Valentin Clement (バレンタイン クレメン)
d36836de01
[flang][cuda] Create descriptor in managed memory when emboxing fir.box_addr value (#120980) 2024-12-23 09:52:59 -08:00
Kazu Hirata
392651a7ec
[flang] Migrate away from PointerUnion::{is,get} (NFC) (#120880)
Note that PointerUnion::{is,get} have been soft deprecated in
PointerUnion.h:

  // FIXME: Replace the uses of is(), get() and dyn_cast() with
  //        isa<T>, cast<T> and the llvm::dyn_cast<T>

I'm not touching PointerUnion::dyn_cast for now because it's a bit
complicated; we could blindly migrate it to dyn_cast_if_present, but
we should probably use dyn_cast when the operand is known to be
non-null.
2024-12-22 13:30:16 -08:00
Valentin Clement (バレンタイン クレメン)
e650ac1654
[flang][cuda][NFC] Fix typo in CUFAllocDescriptor (#120797)
Missing `r` in the function name.
2024-12-20 13:57:47 -08:00
Valentin Clement (バレンタイン クレメン)
81831ef3e7
[flang][cuda] Correctly allocate descriptor in managed memory when reboxing (#120795)
Reboxing might create a new in memory descriptor. If this one was
allocate with managed memory, allocate the new one in managed memory as
well.
2024-12-20 13:32:31 -08:00
Valentin Clement (バレンタイン クレメン)
3e13acfbf4
[flang][cuda] Make default.nonTbpDefinedIoTable compiler generated (#120686)
`default.nonTbpDefinedIoTable` is a special global defined for IO that
doesn't follow the mangling scheme and is then not handle correctly in
the `CompilerGeneratedNames` pass. Update how it is generated with
doGenerated so it can be handle without special handling.

Also do not generate comdat in gpu module as the current code is not
handling nested module correctly.
2024-12-20 10:37:48 -08:00
Matthias Springer
eb6c4197d5
[mlir][CF] Split cf-to-llvm from func-to-llvm (#120580)
Do not run `cf-to-llvm` as part of `func-to-llvm`. This commit fixes
https://github.com/llvm/llvm-project/issues/70982.

This commit changes the way how `func.func` ops are lowered to LLVM.
Previously, the signature of the entire region (i.e., entry block and
all other blocks in the `func.func` op) was converted as part of the
`func.func` lowering pattern.

Now, only the entry block is converted. The remaining block signatures
are converted together with `cf.br` and `cf.cond_br` as part of
`cf-to-llvm`. All unstructured control flow is not converted as part of
a single pass (`cf-to-llvm`). `func-to-llvm` no longer deals with
unstructured control flow.

Also add more test cases for control flow dialect ops.

Note: This PR is in preparation of #120431, which adds an additional
GPU-specific lowering for `cf.assert`. This was a problem because
`cf.assert` used to be converted as part of `func-to-llvm`.

Note for LLVM integration: If you see failures, add
`-convert-cf-to-llvm` to your pass pipeline.
2024-12-20 13:46:45 +01:00
Valentin Clement (バレンタイン クレメン)
e93d226664
[flang][cuda] Update CompilerGeneratedNames pass to work on gpu module (#120660)
- Update `CompilerGeneratedNames` so it can perform renaming in
gpu.module
- Update Codegen so it look in the correct module for the type
descriptor.
2024-12-19 19:07:00 -08:00
Valentin Clement (バレンタイン クレメン)
4530273d7c
[flang][cuda] Allocate descriptor in managed memory when emboxing device memory (#120485)
When emboxing memory that comes from CUFMemAlloc, we need to allocate
the descriptor in manage memory as it might be passed to a kernel.
2024-12-18 18:20:45 -08:00
Peter Klausler
fc97d2e68b
[flang] Add UNSIGNED (#113504)
Implement the UNSIGNED extension type and operations under control of a
language feature flag (-funsigned).

This is nearly identical to the UNSIGNED feature that has been available
in Sun Fortran for years, and now implemented in GNU Fortran for
gfortran 15, and proposed for ISO standardization in J3/24-116.txt.

See the new documentation for details; but in short, this is C's
unsigned type, with guaranteed modular arithmetic for +, -, and *, and
the related transformational intrinsic functions SUM & al.
2024-12-18 07:02:37 -08:00
Valentin Clement (バレンタイン クレメン)
5e1f87e849
[flang][cuda] Correctly allocate memory for descriptor load (#120164)
CodeGen will allocate memory for a new descriptor on descriptor loads.
CUDA Fortran local descriptor are allocated in managed memory by the
runtime. The newly allocated storage for cuda descriptor must also be
allocated through the runtime.
2024-12-16 19:12:05 -08:00
Michael Kruse
c91ba04328
[Flang][NFC] Split runtime headers in preparation for cross-compilation. (#112188)
Split some headers into headers for public and private declarations in
preparation for #110217. Moving the runtime-private headers in
runtime-private include directory will occur in #110298.

* Do not use `sizeof(Descriptor)` in the compiler. The size of the
descriptor is target-dependent while `sizeof(Descriptor)` is the size of
the Descriptor for the host platform which might be too small when
cross-compiling to a different platform. Another problem is that the
emitted assembly ((cross-)compiling to the same target) is not identical
between Flang's running on different systems. Moving the declaration of
`class Descriptor` out of the included header will also reduce the
amount of #included sources.

* Do not use `sizeof(ArrayConstructorVector)` and
`alignof(ArrayConstructorVector)` in the compiler. Same reason as with
`Descriptor`.

* Compute the descriptor's extra flags without instantiating a
Descriptor. `Fortran::runtime::Descriptor` is defined in the runtime
source, but not the compiler source.

* Move `InquiryKeywordHashDecode` into runtime-private header. The
function is defined in the runtime sources and trying to call it in the
compiler would lead to a link-error.

* Move allocator-kind magic numbers into common header. They are the
only declarations out of `allocator-registry.h` in the compiler as well.
 
This does not make Flang cross-compile ready yet, the main goal is to
avoid transitive header dependencies from Flang to clang-rt. There are
more assumptions that host platform is the same as the target platform.
2024-12-06 15:29:00 +01:00
Tom Eccles
e9fc2faf0c
[flang][CodeGen] fix bug hoisting allocas using a shared constant arg (#116251)
When hoisting the allocas with a constant integer size, the constant
integer was moved to where the alloca is hoisted to unconditionally.

By CodeGen there have been various iterations of mlir canonicalization
and dead code elimination. This can cause lots of unrelated bits of code
to share the same constant values. If for some reason the alloca
couldn't be hoisted all of the way to the entry block of the function,
moving the constant might result in it no-longer dominating some of the
remaining uses.

In theory, there should be dominance analysis to ensure the location of
the constant does dominate all uses of it. But those constants are
effectively free anyway (they aren't even separate instructions in LLVM
IR), so it is less expensive just to leave the old one where it was and
insert a new one we know for sure is immediately before the alloca.
2024-11-15 10:31:20 +00:00
Valentin Clement (バレンタイン クレメン)
e5092c3019
[flang][cuda] Support malloc and free conversion in gpu module (#116112) 2024-11-13 17:09:38 -08:00
Valentin Clement (バレンタイン クレメン)
466b58ba38
[flang] Avoid generating duplicate symbol in comdat (#114472)
In case where a fir.global might be duplicated in an inner module
(gpu.module), the conversion pattern will be applied on the module and
the gpu module version of the global and try to generate multiple comdat
with the same symbol name. This is what we have in the implementation of
CUDA Fortran.

Just check for the presence of the `ComdatSelectorOp` before creating a
new one.
2024-10-31 18:59:04 -07:00
Asher Mancinelli
0c9a02355a
[flang][fir] always use memcpy for fir.box (#113949)
@jeanPerier explained the importance of converting box loads and stores
into `memcpy`s instead of aggregate loads and stores, and I'll do my
best to explain it here.

* [(godbolt link) Example comparing opt transformations on memcpys vs
aggregate load/stores](https://godbolt.org/z/be7xM83cG)
* LLVM can more effectively reason about memcpys compared to aggregate
load/stores.
* This came up when others were discussing array descriptors for
assumed-rank arrays passed to `bind(c)` subroutines, with the
implication that the array descriptors are known to have lower bounds of
1 and that they are not pointer/allocatable types.
* [(godbolt link) Clang also uses memcpys so we should probably follow
them, assuming the clang developers are generatign what they know Opt
will handle more effectively.](https://godbolt.org/z/YT4x7387W)
* This currently may not help much without the `nocapture` attribute
being propagated to function calls, but [it looks like someone may do
this soon (discourse
link)](https://discourse.llvm.org/t/applying-the-nocapture-attribute-to-reference-passed-arguments-in-fortran-subroutines/81401/23)
or I can do this in a follow-up patch.

Note on test `flang/test/Fir/embox-char.fir`: it looks like the original
test was auto-generated. I wasn't too sure which parts were especially
important to test, so I regenerated the test. If we want the updated
version to look more like the old version, I'll make those changes.
2024-10-30 09:50:27 -07:00
Scott Manley
e6a4346b5a
[flang] add getElementType() to fir::SquenceType and fir::VectorType (#112770)
getElementType() was missing from Sequence and Vector types. Did a
replace of the obvious places getEleTy() was used for these two types
and updated to use this name instead.

Co-authored-by: Scott Manley <scmanley@nvidia.com>
2024-10-18 09:29:25 +02:00
jeanPerier
2f0b4f43fc
[flang][extension] support concatenation with absent optional (#112678)
Fix #112593 by adding support in lowering to concatenation with an
absent optional _assumed length_ dummy argument because:
1. Most compilers seem to support it (most likely by accident).
2. This actually makes the compiler codegen simpler. Codegen was going
out of its way to poke the LLVM optimizer bear by producing an undef
argument for the length.

I insist on the fact that no compiler support this with _explicit
length_ optional arguments and the executable will segfault and I would
discourage users from using that "feature" because runtime checks for
bad optional dereference will kick when used (For instance, "nagfor
-C=present" will produce an executable that abort with an error message
. Flang does not have such runtime check option so far).

Hence, I am not updating the Extensions.md document because this is not
something I think we should advertise.
2024-10-17 13:25:09 +02:00
Abid Qadeer
cd12ffb622
[mlir][debug] Allow multiple DIGlobalVariableExpression on globals. (#111981)
Currently, we allow only one DIGlobalVariableExpressionAttr per global.
It is especially evident in import where we pick the first from the list
and ignore the rest. In contrast, LLVM allows multiple
DIGlobalVariableExpression to be attached to the global. They are needed
for correct working of things like DICommonBlock. This PR removes this
restriction in mlir. Changes are mostly mechanical. One thing on which I
went a bit back and forth was the representation inside GlobalOp. I
would be happy to change if there are better ways to do this.

---------

Co-authored-by: Tobias Gysi <tobias.gysi@nextsilicon.com>
2024-10-13 23:36:00 +01:00
Leandro Lupori
390943f25b
[flang] Implement conversion of compatible derived types (#111165)
With some restrictions, BIND(C) derived types can be converted to
compatible BIND(C) derived types.
Semantics already support this, but ConvertOp was missing the
conversion of such types.

Fixes https://github.com/llvm/llvm-project/issues/107783
2024-10-09 10:37:46 -03:00
Matthias Springer
206fad0e21
[mlir][NFC] Mark type converter in populate... functions as const (#111250)
This commit marks the type converter in `populate...` functions as
`const`. This is useful for debugging.

Patterns already take a `const` type converter. However, some
`populate...` functions do not only add new patterns, but also add
additional type conversion rules. That makes it difficult to find the
place where a type conversion was added in the code base. With this
change, all `populate...` functions that only populate pattern now have
a `const` type converter. Programmers can then conclude from the
function signature that these functions do not register any new type
conversion rules.

Also some minor cleanups around the 1:N dialect conversion
infrastructure, which did not always pass the type converter as a
`const` object internally.
2024-10-05 21:32:40 +02:00
jeanPerier
1753de2d95
[flang][FIR] remove fir.complex type and its fir.real element type (#111025)
Final patch of
https://discourse.llvm.org/t/rfc-flang-replace-usages-of-fir-complex-by-mlir-complex-type/82292

Since fir.real was only still used as fir.complex element type, this
patch removes it at the same time.
2024-10-04 09:57:03 +02:00
jeanPerier
c2601f1769
[flang][NFC] remove unused fir.constc operation (#110821)
As part of [RFC to replace fir.complex usages by mlir.complex
type](https://discourse.llvm.org/t/rfc-flang-replace-usages-of-fir-complex-by-mlir-complex-type/82292).

fir.constc is unused so instead of porting it, just remove it.
Complex constants are currently created with inserts in lowering
already. When using mlir complex, we may just want to start using
[complex.constant](4f6ad17adc/mlir/include/mlir/Dialect/Complex/IR/ComplexOps.td (L131C5-L131C16)).
2024-10-02 16:16:57 +02:00
Sirui Mu
fde3c16ac9
[mlir][LLVM] Add operand bundle support (#108933)
This PR adds LLVM [operand
bundle](https://llvm.org/docs/LangRef.html#operand-bundles) support to
MLIR LLVM dialect. It affects these 3 operations related to making
function calls: `llvm.call`, `llvm.invoke`, and `llvm.call_intrinsic`.

This PR adds two new parameters to each of the 3 operations. The first
parameter is a variadic operand `op_bundle_operands` that contains the
SSA values for operand bundles. The second parameter is a property
`op_bundle_tags` which holds an array of strings that represent the tags
of each operand bundle.
2024-09-26 07:59:37 +02:00
Jan Leyonberg
4290e34ebd
[flang][AMDGPU] Convert math ops to AMD GPU library calls instead of libm calls (#99517)
This patch invokes a pass when compiling for an AMDGPU target to lower
math operations to AMD GPU library calls library calls instead of libm
calls.
2024-09-10 09:48:55 -04:00
Nikita Popov
67e19e5bb1 [flang] Set isSigned=true for negative constant (NFC)
We're providing this as a negative signed value, so set the flag.
Currently doesn't make a difference, but will assert in the future.

Split out of https://github.com/llvm/llvm-project/pull/80309.
2024-09-05 15:25:05 +02:00
Peter Klausler
9e53e77265
[flang] Fix warnings from more recent GCCs (#106567)
While experimenting with some more recent C++ features, I ran into
trouble with warnings from GCC 12.3.0 and 14.2.0. These warnings looked
legitimate, so I've tweaked the code to avoid them.
2024-09-04 10:52:51 -07:00
Slava Zakharin
cfd4c1805e
[RFC][flang] Replace special symbols in uniqued global names. (#104859)
This change addresses more "issues" as the one resolved in #71338.
Some targets (e.g. NVPTX) do not accept global names containing
`.`. In particular, the global variables created to represent
the runtime information of derived types use `.` in their names.
A derived type's descriptor object may be used in the device code,
e.g. to initialize a descriptor of a variable of this type.
Thus, the runtime type info objects may need to be compiled
for the device.

Moreover, at least the derived types' descriptor objects
may need to be registered (think of `omp declare target`)
for the host-device association so that the addendum pointer
can be properly mapped to the device for descriptors using
a derived type's descriptor as their addendum pointer.
The registration implies knowing the name of the global variable
in the device image so that proper host code can be created.
So it is better to name the globals the same way for the host
and the device.

CompilerGeneratedNamesConversion pass renames all uniqued globals
such that the special symbols (currently `.`) are replaced
with `X`. The pass is supposed to be run for the host and the device.

An option is added to FIR-to-LLVM conversion pass to indicate
whether the new pass has been run before or not. This setting
affects how the codegen computes the names of the derived types'
descriptors for FIR derived types.

fir::NameUniquer now allows `X` to be part of a name, because
the name deconstruction may be applied to the mangled names
after CompilerGeneratedNamesConversion pass.
2024-08-21 13:37:03 -07:00
Valentin Clement (バレンタイン クレメン)
15e1e3b234
[flang] Read the extra field from the in box when doing reboxing (#102992)
Updated version of #102686. The issue was that in some rebox case the
addendum presence flag should be updated and not always taken from the
"from" box. This is the case when reboxing a fir.class to a fir.box that
doesn't require an addendum for example.

Open a new review since there is a bit of additional code in the CodeGen
part.
2024-08-14 11:23:56 -07:00
Valentin Clement (バレンタイン クレメン)
8fc9b4efd2
Revert "[flang] Read the extra field from the in box when doing reboxing" (#102931)
Reverts llvm/llvm-project#102686 as it might be the source of buildbot
failures https://lab.llvm.org/buildbot/#/builders/143/builds/1392.
2024-08-12 09:35:50 -07:00
Valentin Clement (バレンタイン クレメン)
dab7e3c30d
[flang] Read the extra field from the in box when doing reboxing (#102686)
The extra field in the descriptor carries multiple information and
cannot be deducted anymore when doing a reboxing. This patch updates the
codegen to retrieve the extra field value from the inboc and set it in
the new box.
2024-08-12 08:48:27 -07:00
Valentin Clement (バレンタイン クレメン)
0def9a923d
[flang] Add allocator_idx attribute on fir.embox and fircg.ext_embox (#101212)
#100690 introduces allocator registry with the ability to store
allocator index in the descriptor. This patch adds an attribute to
fir.embox and fircg.ext_embox to be able to set the allocator index
while populating the descriptor fields.
2024-08-01 12:49:17 -07:00
Valentin Clement (バレンタイン クレメン)
6df4e7c25f
[flang] Add ability to have special allocator for descriptor data (#100690)
This patch enhances the descriptor with the ability to have specialized
allocator. The allocators are registered in a dedicated registry and the
index of the desired allocator is stored in the descriptor. The default
allocator, std::malloc, is registered at index 0.

In order to have this allocator index in the descriptor, the f18Addendum
field is repurposed to be able to hold the presence flag for the
addendum (lsb) and the allocator index.

Since this is a change in the semantic and name of the 7th field of the
descriptor, the CFI_VERSION is bumped to the date of the initial change.

This patch only adds the ability to have this features as part of the
descriptor but does not add specific allocator yet. CUDA fortran will be
the first user of this feature to allocate descriptor data in the
different type of device memory base on the CUDA attribute.

---------

Co-authored-by: Slava Zakharin <szakharin@nvidia.com>
2024-08-01 09:39:53 -07:00
jeanPerier
d8b672dac9
[flang][NFC] rename fircg op operand index accessors (#100584)
fircg operations have xxxOffset members to give the operand index of
operand xxx. This is a bit weird when looking at usage (e.g.
`arrayCoor.shiftOffset` reads like it is shifting some offset). Rename
them to getXxxOperandIndex.
2024-07-25 18:02:03 +02:00
Alexis Perry-Holby
f1d3fe7aae
Add basic -mtune support (#98517)
Initial implementation for the -mtune flag in Flang.

This PR is a clean version of PR #96688, which is a re-land of PR #95043
2024-07-16 16:48:24 +01:00
Matthias Springer
e73cf2f0c5
[flang] Remove materialization workaround in type converter (#98743)
This change is in preparation of #97903, which adds extra checks for
materializations: it is now enforced that they produce an SSA value of
the correct type, so the current workaround no longer works.

The original workaround avoided target materializations by directly
returning the to-be-converted SSA value from the materialization
callback. This can be avoided by initializing the lowering patterns that
insert the materializations without a type converter. For
`cg::XEmboxOp`, the existing workaround that skips
`unrealized_conversion_cast` ops is still in place.

Also remove the lowering pattern for `unrealized_conversion_cast`. This
pattern has no effect because `unrealized_conversion_cast` ops that are
inserted by the dialect conversion framework are never matched by the
pattern driver.
2024-07-15 16:07:48 +02:00
Ramkumar Ramachandra
db791b278a
mlir/LogicalResult: move into llvm (#97309)
This patch is part of a project to move the Presburger library into
LLVM.
2024-07-02 10:42:33 +01:00
Tarun Prabhu
8dd9494056
Revert "[flang] Add basic -mtune support" (#96678)
Reverts llvm/llvm-project#95043
2024-06-25 13:25:39 -06:00
Alexis Perry-Holby
a790279bf2
[flang] Add basic -mtune support (#95043)
This PR adds -mtune as a valid flang flag and passes the information
through to LLVM IR as an attribute on all functions. No specific
architecture optimizations are added at this time.
2024-06-25 18:39:35 +01:00
Vijay Kandiah
6d340e4c44
[flang] fixing alloca hoisting for blocks having single op. (#96009)
This change fixes the issue
https://github.com/llvm/llvm-project/issues/95977 due to commit
c0cba5198155dba246ddd5764f57595d9bbbddef inserting allocas after the
terminator op in the insertion block in the case where the block had
only a single operation, its terminator, in it. With this change, the
hoisted constant-sized allocas are placed at the front of the insertion
block, rather than right after the first operation in it.
2024-06-19 18:45:23 -05:00
jeanPerier
a786919256
[flang] allow assumed-rank box in fir.store (#95980)
Codegen is done with a memcpy using the rank from the "value" descriptor
like for the fir.load case.
Rational described in
https://github.com/llvm/llvm-project/blob/main/flang/docs/AssumedRank.md.
2024-06-19 10:12:19 +02:00
Vijay Kandiah
c0cba51981
[Flang] Hoisting constant-sized allocas at flang codegen. (#95310)
This change modifies the `AllocaOpConversion` in flang codegen to insert
constant-sized LLVM allocas at the entry block of `LLVMFuncOp` or
OpenACC/OpenMP Op, rather than in-place at the `fir.alloca`. This
effectively hoists constant-sized FIR allocas to the proper block.

When compiling the example subroutine below with `flang-new`, we get a
llvm.stacksave/stackrestore pair around a constant-sized `fir.alloca
i32`.

```
subroutine test(n)
    block
      integer :: n
      print *, n
    end block
  end subroutine test
```

Without the proposed change, downstream LLVM compilation cannot hoist
this constant-sized alloca out of the stacksave/stackrestore region
which may lead to missed downstream optimizations:

```
*** IR Dump After Safe Stack instrumentation pass (safe-stack) ***
define void @test_(ptr %0) !dbg !3 {
  %2 = call ptr @llvm.stacksave.p0(), !dbg !7
  %3 = alloca i32, i64 1, align 4, !dbg !8
  %4 = call ptr @_FortranAioBeginExternalListOutput(i32 6, ptr @_QQclX62c91d05f046c7a656e7978eb13f2821, i32 4), !dbg !9
  %5 = load i32, ptr %3, align 4, !dbg !10, !tbaa !11
  %6 = call i1 @_FortranAioOutputInteger32(ptr %4, i32 %5), !dbg !10
  %7 = call i32 @_FortranAioEndIoStatement(ptr %4), !dbg !9
  call void @llvm.stackrestore.p0(ptr %2), !dbg !15
  ret void, !dbg !16
}
```

With this change, the `llvm.alloca` is already hoisted out of the
stacksave/stackrestore region during flang codegen:

```
// -----// IR Dump After FIRToLLVMLowering (fir-to-llvm-ir) //----- //
  llvm.func @test_(%arg0: !llvm.ptr {fir.bindc_name = "n"}) attributes {fir.internal_name = "_QPtest"} {
    %0 = llvm.mlir.constant(4 : i32) : i32
    %1 = llvm.mlir.constant(1 : i64) : i64
    %2 = llvm.alloca %1 x i32 {bindc_name = "n"} : (i64) -> !llvm.ptr
    %3 = llvm.mlir.constant(6 : i32) : i32
    %4 = llvm.mlir.undef : i1
    %5 = llvm.call @llvm.stacksave.p0() {fastmathFlags = #llvm.fastmath<contract>} : () -> !llvm.ptr
    %6 = llvm.mlir.addressof @_QQclX62c91d05f046c7a656e7978eb13f2821 : !llvm.ptr
    %7 = llvm.call @_FortranAioBeginExternalListOutput(%3, %6, %0) {fastmathFlags = #llvm.fastmath<contract>} : (i32, !llvm.ptr, i32) -> !llvm.ptr
    %8 = llvm.load %2 {tbaa = [#tbaa_tag]} : !llvm.ptr -> i32
    %9 = llvm.call @_FortranAioOutputInteger32(%7, %8) {fastmathFlags = #llvm.fastmath<contract>} : (!llvm.ptr, i32) -> i1
    %10 = llvm.call @_FortranAioEndIoStatement(%7) {fastmathFlags = #llvm.fastmath<contract>} : (!llvm.ptr) -> i32
    llvm.call @llvm.stackrestore.p0(%5) {fastmathFlags = #llvm.fastmath<contract>} : (!llvm.ptr) -> ()
    llvm.return
  }
```

---------

Co-authored-by: Vijay Kandiah <vkandiah@sky6.pgi.net>
2024-06-14 11:36:05 -05:00
Valentin Clement (バレンタイン クレメン)
c1654c38e8
[flang] Carry over alignment computed by frontend for COMMON (#94280)
The frontend computes the necessary alignment for COMMON blocks but this
information is never carried over to the code generation and can lead to
segfault for COMMON block that requires a non default alignment.

This patch add an optional attribute on fir.global and carries over the
information.
2024-06-04 11:15:31 -07:00
jeanPerier
fd8b2d2046
[flang] lower RANK intrinsic (#93694)
First commit is reviewed in
https://github.com/llvm/llvm-project/pull/93682.

Lower RANK using fir.box_rank. This patches updates fir.box_rank to
accept box reference, this avoids the need of generating an assumed-rank
fir.load just for the sake of reading ALLOCATABLE/POINTER rank. The
fir.load would generate a "dynamic" memcpy that is hard to optimize
without further knowledge. A read effect is conditionally given to the
operation.
2024-05-30 11:02:09 +02:00
jeanPerier
e398383f9a
[flang][fir] add codegen for fir.load of assumed-rank fir.box (#93569)
- Update LLVM type conversion of assumed-rank fir.box/class to generate
the type of the maximum ranked descriptor. That way, alloca for assumed
rank descriptor copies are always big enough. This is needed in the
fir.load case that generates a new storage for the value
- Add a "computeBoxSize" helper to compute the dynamic size of a
descriptor.
- Use that size to generate an llvm.memcpy intrinsic to copy the input
descriptor into the new storage.

Looking at https://reviews.llvm.org/D108221?id=404635, it seems valid to
add the TBAA node on the memcpy, which I did.

In a further patch, I think we should likely always use a memcpy since
LLVM seems to have a better time optimizing it than fir.load/fir.store
patterns.
2024-05-30 09:30:27 +02:00
jeanPerier
26e0ce0b36
[flang] update fir.box_rank and fir.is_array codegen (#93541)
fir.box_rank codegen was invalid, it was assuming the rank field in the
descriptor was an i32. This is not correct. Do not hard code the type,
use the named position to find the type, and convert as needed in the
patterns.
2024-05-28 17:32:27 +02:00
Krzysztof Parzyszek
101f977f2c
[flang][CodeGen] Avoid out-of-bounds memory access in SelectCaseOp (#92955)
`SelectCaseOp::getCompareOperands` may return an empty range for the
"default" case. Do not dereference the range until it is expected to be
non-empty.

This was detected by address-sanitizer.
2024-05-22 10:52:17 -05:00
Abid Qadeer
f156b9ce7a
[flang] Add debug information for module variables. (#91582)
This PR add debug info for module variables. The module variables are
added as global variables but their scope is set to module instead of
compile unit. The scope of function declared inside a module is also set
accordingly.

After this patch, a module variable could be evaluated in the GDB as `p
helper::gli` where helper is name of the module and gli is the name of
the variable. A future patch will add the import module functionality
which will remove the need to prefix the name with helper::.

The line number where is module is declared is a best guess at the
moment as this information is not part of the GlobalOp.
2024-05-22 10:59:29 +01:00
Abid Qadeer
cd5ee2715e
[reland][flang] Initial debug info support for local variables (#92304)
This is same as #90905 with an added fix. The issue was that we
generated variable info even when user asked for line-tables-only. This
caused llvm dwarf generation code to fail an assertion as it expected an
empty variable list.

Fixed by not generating debug info for variables when user wants only
line table. I also updated a test check for this case.
2024-05-16 09:10:59 +01:00