Valentin Clement (バレンタイン クレメン) 68945cce4d
[flang] Restrict mem2reg promotion through fir.declare to single-block case (#182933)
The PromotableOpInterface on fir.declare allows mem2reg to promote
allocas accessed through declare ops. However, MLIR's mem2reg computes
defining blocks and live-in sets only from direct users of the slot
pointer. Stores through fir.declare are users of the declare result, not
the alloca, so they are not registered as defining blocks. This causes
missing phi nodes at join points (loop headers, merge blocks), which
silently drops conditional updates to promoted variables.

This was observed in CUDA Fortran kernels where a loop variable updated
conditionally (e.g., mywatch = max(1, mywatch-32)) became constant after
promotion, producing incorrect results at runtime.

The fix restricts promotion through fir.declare to cases where all users
of the declare are in the same block. In single-block cases no phi nodes
are needed, so the MLIR limitation does not apply. Cross-block cases are
left unpromoted until the MLIR mem2reg infrastructure is extended to
track defining blocks through PromotableOpInterface results.
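
The reasoning above can be sketched with a small toy model. This is not the code from the patch and uses none of MLIR's real classes: `ToyOp`, `definingBlocks`, `allUsersInSameBlock`, and the block names are hypothetical stand-ins chosen for illustration. It only shows why stores through the declare result are invisible when defining blocks are collected from direct users of the slot, and what a same-block check looks like.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Toy stand-ins for illustration only; these are not MLIR or FIR classes.
struct ToyOp {
  std::string block;   // enclosing block
  std::string operand; // memory reference the op reads or writes
  bool isStore;        // true for a store, false for any other user
};

// Models the limitation: only stores directly on the slot pointer count
// as definitions, so stores on the declare result are never seen.
std::set<std::string> definingBlocks(const std::vector<ToyOp> &ops,
                                     const std::string &slot) {
  std::set<std::string> blocks;
  for (const ToyOp &op : ops)
    if (op.isStore && op.operand == slot)
      blocks.insert(op.block);
  return blocks;
}

// Models the restriction: promotion through the declare is allowed only
// when every user of the declare result lives in a single block.
bool allUsersInSameBlock(const std::vector<ToyOp> &ops,
                         const std::string &declareResult) {
  std::set<std::string> blocks;
  for (const ToyOp &op : ops)
    if (op.operand == declareResult)
      blocks.insert(op.block);
  return blocks.size() <= 1;
}

// Users in a loop with a conditional update, spread over four blocks.
std::vector<ToyOp> loopExample() {
  return {
      {"entry", "%alloca", false},  // fir.declare %alloca
      {"entry", "%declare", true},  // fir.store %arg0 to %declare
      {"loop", "%declare", false},  // fir.load %declare
      {"update", "%declare", true}, // fir.store %new to %declare
      {"exit", "%declare", false},  // fir.load %declare
  };
}

// Same slot, but every access sits in the entry block.
std::vector<ToyOp> singleBlockExample() {
  return {
      {"entry", "%alloca", false},  // fir.declare %alloca
      {"entry", "%declare", true},  // store through the declare
      {"entry", "%declare", false}, // load through the declare
  };
}

void checkToyModel() {
  // No store uses %alloca directly, so no defining block is recorded.
  assert(definingBlocks(loopExample(), "%alloca").empty());
  // The declare has users in four blocks, so promotion is refused.
  assert(!allUsersInSameBlock(loopExample(), "%declare"));
  // All users share one block, so promotion stays allowed.
  assert(allUsersInSameBlock(singleBlockExample(), "%declare"));
}
```

In the toy model the loop case is rejected because its users span several blocks, while the single-block case still qualifies for promotion.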

With the current behavior, mem2reg rewrites the first function below into the second:
```
func.func @loop_conditional_update(%arg0: i32, %cdt: i1) -> i32 {
  %c1 = arith.constant 1 : i32
  %alloca = fir.alloca i32 {bindc_name = "mywatch", uniq_name = "_QFkernelEmywatch"}
  %declare = fir.declare %alloca {uniq_name = "_QFkernelEmywatch"} : (!fir.ref<i32>) -> !fir.ref<i32>
  fir.store %arg0 to %declare : !fir.ref<i32>
  llvm.br ^loop
^loop:
  %val = fir.load %declare : !fir.ref<i32>
  llvm.cond_br %cdt, ^update, ^exit
^update:
  %new = arith.subi %val, %c1 : i32
  fir.store %new to %declare : !fir.ref<i32>
  llvm.br ^loop
^exit:
  %result = fir.load %declare : !fir.ref<i32>
  return %result : i32
}
```

```
  func.func @loop_conditional_update(%arg0: i32, %arg1: i1) -> i32 {
    %c1_i32 = arith.constant 1 : i32
    fir.declare_value %arg0 {uniq_name = "_QFkernelEmywatch"} : i32
    llvm.br ^bb1
  ^bb1:  // 2 preds: ^bb0, ^bb2
    llvm.cond_br %arg1, ^bb2, ^bb3
  ^bb2:  // pred: ^bb1
    %0 = arith.subi %arg0, %c1_i32 : i32 // Uses %arg0, not the current loop value.
    fir.declare_value %0 {uniq_name = "_QFkernelEmywatch"} : i32
    llvm.br ^bb1
  ^bb3:  // pred: ^bb1
    return %arg0 : i32 // always returns %arg0
  }
```

A more complete fix should probably be made in mem2reg itself so that it
supports these cases. I'll look into that later this week.
2026-02-23 20:59:45 +00:00

Flang

Flang is a ground-up implementation of a Fortran front end written in modern C++. It started off as the f18 project (https://github.com/flang-compiler/f18) with an aim to replace the previous flang project (https://github.com/flang-compiler/flang) and address its various deficiencies. F18 was subsequently accepted into the LLVM project and rechristened as Flang.

Please note that flang is not yet ready for production use.

Getting Started

Read more about flang in the docs directory. Start with the compiler overview.

To better understand Fortran as a language and the specific grammar accepted by flang, read Fortran For C Programmers and flang's specifications of the Fortran grammar and the OpenMP grammar.

Treatment of language extensions is covered in this document.

To understand the compiler's handling of intrinsics, see the discussion of intrinsics.

To understand how a flang program communicates with libraries at runtime, see the discussion of runtime descriptors.

If you're interested in contributing to the compiler, read the style guide and also review how flang uses modern C++ features.

If you are interested in writing new documentation, follow LLVM's Markdown style guide.

Consult the Getting Started with Flang guide for information on building and running flang.