6210 Commits

Author SHA1 Message Date
Connector Switch
6560adb584
[flang] optimize atand/atan2d precision (#154544)
Part of https://github.com/llvm/llvm-project/issues/150452.
2025-08-22 15:55:46 +08:00
Valentin Clement (バレンタイン クレメン)
1d05d693a1
[flang][cuda] Fix offset with multiple assumed size shared array (#154844)
When multiple assumed-size variables are used in a kernel with dynamic
shared memory, each variable uses the 0 offset. Update the pass to
account for that.

```
attributes(global) subroutine testany( a )
    real(4), shared :: smasks(*)
    real(8), shared :: dmasks(*)
end subroutine
```
2025-08-21 21:51:43 +00:00
Anchu Rajendran S
bce9b6d177
[Flang][Flang-Driver]Fix to add atomic control options in non-fc1 mode (#154638) 2025-08-21 10:15:33 -07:00
Renaud Kauffmann
3856bb6bbf
[flang] [acc] Adding allocation to the recipe of scalar allocatables (#154643)
Currently the privatization recipe of a scalar allocatable is as follows:

```
 acc.private.recipe @privatization_ref_box_heap_i32 : !fir.ref<!fir.box<!fir.heap<i32>>> init {
  ^bb0(%arg0: !fir.ref<!fir.box<!fir.heap<i32>>>):
    %0 = fir.alloca !fir.box<!fir.heap<i32>>
    %1:2 = hlfir.declare %0 {uniq_name = "acc.private.init"} : (!fir.ref<!fir.box<!fir.heap<i32>>>) -> (!fir.ref<!fir.box<!fir.heap<i32>>>, !fir.ref<!fir.box<!fir.heap<i32>>>)
    acc.yield %1#0 : !fir.ref<!fir.box<!fir.heap<i32>>>
  }
```

This change adds the allocation for the scalar.
2025-08-20 16:04:57 -07:00
Akash Banerjee
d69ccded4f
[MLIR] Add cpow support in ComplexToROCDLLibraryCalls (#153183)
This PR adds support for complex power operations (`cpow`) in the
`ComplexToROCDLLibraryCalls` conversion pass, specifically targeting
AMDGPU architectures. The implementation optimises complex
exponentiation by using mathematical identities and special-case
handling for small integer powers.

- Force lowering to `complex.pow` operations for the `amdgcn-amd-amdhsa`
target instead of using library calls
- Convert `complex.pow(z, w)` to `complex.exp(w * complex.log(z))` using
mathematical identity
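
The identity above can be checked numerically. The following is a minimal sketch in Python, not the MLIR pass itself; the helper name `cpow_via_exp_log` is hypothetical, and the special-case handling for small integer powers mentioned above is not modeled.

```python
import cmath


def cpow_via_exp_log(z: complex, w: complex) -> complex:
    """Compute z**w as exp(w * log(z)) on the principal branch of log.

    z == 0 is not handled here: cmath.log(0) raises ValueError, so a real
    implementation would need a separate case for it.
    """
    return cmath.exp(w * cmath.log(z))


z, w = complex(1.5, -2.0), complex(0.5, 0.25)
print(cpow_via_exp_log(z, w))  # value from the exp/log identity
print(z ** w)                  # Python's built-in complex power, for comparison
```

The two printed values should agree up to floating-point rounding.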
2025-08-20 17:18:30 +00:00
Jean-Didier PAILLEUX
e4334afca0
[flang] Add support of THIS_IMAGE and NUM_IMAGES with PRIF (#154081)
In relation to the approval and merge of the
https://github.com/llvm/llvm-project/pull/76088 specification about
multi-image features in Flang.
Here is a PR on adding support for `THIS_IMAGE` and `NUM_IMAGES` in
conformance with the PRIF specification.
For `THIS_IMAGE`, the lowering to the subroutine containing the coarray
argument is not present in this PR, and will be in a future one.
2025-08-20 08:11:42 +02:00
Andre Kuhlenschmidt
d4673febb4
[flang][openacc] fix unguarded dereference of type pointer (#153606)
The added test used to cause a segfault; now it doesn't.
2025-08-19 15:10:09 -07:00
Connector Switch
a0eb9958eb
[flang] optimize asind precision (#154350)
Part of https://github.com/llvm/llvm-project/issues/150452.
2025-08-19 23:54:09 +08:00
Michael Klemm
a5f1ddd115
[Flang][OpenMP] Fix issue with named constants in SHARED and FIRSTPRIVATE clauses (#154335)
There seemingly was a regression that prevented the usage of named
constants (w/ PARAMETER attribute) in SHARED and FIRSTPRIVATE clauses.
This PR corrects that.
2025-08-19 17:52:27 +02:00
Slava Zakharin
6f489fb5e5
Reapply "[flang] Lower EOSHIFT into hlfir.eoshift." (#153907) (#154241)
This reverts commit 5178aeff7b96e86b066f8407b9d9732ec660dd2e.

In addition:
  * Scalar constant UNSIGNED BOUNDARY is explicitly cast
    to the result type so that the generated hlfir.eoshift
    operation is valid. The lowering produces signless constants
    by default. It might be a bigger issue in lowering, so I just
    want to "fix" it for EOSHIFT in this patch.
  * Since we have to create an unsigned integer constant during
    HLFIR inlining, I added code in createIntegerConstant
    to make it possible.
2025-08-19 08:36:14 -07:00
Slava Zakharin
c79a88ee0a
[flang] Convert hlfir.designate with comp and contiguous result. (#154232)
Array sections like this have not been using the knowledge that
the result is contiguous:
```
type t
  integer :: f
end type
type(t) :: a(:)
a%f = 0
```

Peter Klausler is working on a change that will result in the
corresponding hlfir.designate having a component and a non-box result.
This patch fixes the issues found in HLFIR-to-FIR conversion.
2025-08-19 08:35:40 -07:00
Krzysztof Parzyszek
8255d240a9
[flang][OpenMP] Avoid crash with MAP w/o modifiers, version >= 6.0 (#154352)
The current code will crash on the MAP clause with OpenMP version >= 6.0
when the clause does not explicitly list any modifiers. The proper fix
is to update the handling of assumed-size arrays for OpenMP 6.0+, but in
the short term keep the behavior from 5.2, just avoid the crash.
2025-08-19 10:18:51 -05:00
Krzysztof Parzyszek
42350f428d
[flang][OpenMP] Parse GROUPPRIVATE directive (#153807)
No semantic checks or lowering yet.
2025-08-19 08:32:43 -05:00
Leandro Lupori
ddb36a8102
[flang] Preserve dynamic length of characters in ALLOCATE (#152564)
Fixes #151895
2025-08-19 09:25:08 -03:00
Peter Klausler
50a40738d6
[flang] Catch semantic error with LBOUND/UBOUND (#154184)
The "ARRAY=" argument to these intrinsics cannot be scalar, whether
"DIM=" is present or not. (Allowing the "ARRAY=" argument to be scalar
when "DIM=" is absent would be a conceivable extension returning an
empty result array, like SHAPE() does with extents, but it doesn't seem
useful in a programming language without compilation-time rank
polymorphism apart from assumed-rank dummy arguments, and those are
supported.)

Fixes https://github.com/llvm/llvm-project/issues/154044.
2025-08-18 14:45:38 -07:00
Peter Klausler
2cf982c0f5
[flang] Don't duplicate impure function call for UBOUND() (#153648)
Because the per-dimension information in a descriptor holds an extent
and a lower bound, but not an upper bound, the calculation of the upper
bound sometimes requires that the extent and lower bound be extracted
from a descriptor and added together, minus 1. This shouldn't be
attempted when the NamedEntity of the descriptor is something that
shouldn't be duplicated and used twice; specifically, it shouldn't apply
to NamedEntities containing references to impure functions as parts of
subscript expressions.
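
The hazard can be illustrated with a toy model in Python. This is only a sketch: `Dim`, `upper_bound`, and `impure_index` are hypothetical names, and how flang itself avoids the duplication is not shown.

```python
from dataclasses import dataclass


@dataclass
class Dim:
    """Toy per-dimension descriptor entry: a lower bound and an extent only."""
    lower_bound: int
    extent: int


def upper_bound(dim: Dim) -> int:
    # The upper bound is not stored, so it is derived from the other two.
    return dim.lower_bound + dim.extent - 1


calls = 0


def impure_index() -> int:
    """Stand-in for an impure function used inside a subscript expression."""
    global calls
    calls += 1
    return 0


dims = [Dim(lower_bound=-2, extent=5)]

# Duplicating the designator evaluates the impure function twice.
naive = dims[impure_index()].lower_bound + dims[impure_index()].extent - 1
calls_naive = calls

# Evaluating the designator once keeps the side effect single.
calls = 0
safe = upper_bound(dims[impure_index()])
calls_safe = calls

print(naive, calls_naive)
print(safe, calls_safe)
```

Both forms compute the same bound, but the duplicated form calls the impure function twice.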

Fixes https://github.com/llvm/llvm-project/issues/153031.
2025-08-18 14:43:13 -07:00
Krzysztof Parzyszek
8429f7faaa
[flang][OpenMP] Parsing support for DYN_GROUPPRIVATE (#153615)
This does not perform semantic checks or lowering.
2025-08-18 13:35:02 -05:00
Connector Switch
b368e7f6a5
[flang] optimize acosd precision (#154118)
Part of https://github.com/llvm/llvm-project/issues/150452.
2025-08-18 14:15:52 +00:00
Krzysztof Parzyszek
ae75884130
[Frontend][OpenMP] Add 6.1 as a valid OpenMP version (#153628)
Co-authored-by: Michael Klemm <michael.klemm@amd.com>
2025-08-18 09:13:27 -05:00
Chaitanya
4a3bf27c69
[OpenMP] Introduce omp.target_allocmem and omp.target_freemem omp dialect ops. (#145464)
This PR introduces two new ops in omp dialect, omp.target_allocmem and
omp.target_freemem.
omp.target_allocmem: Allocates heap memory on device. Will be lowered to
omp_target_alloc call in llvm.
omp.target_freemem: Deallocates heap memory on device. Will be lowered
to omp_target_free call in llvm.


Example:
  %1 = omp.target_allocmem %device : i32, i64
  omp.target_freemem %device, %1 : i32, i64

The work in this PR is C-P/inspired from @ivanradanov commit from
coexecute implementation:
[Add fir omp target alloc and free
ops](be860ac8ba)
[Lower omp_target_{alloc,free} to
llvm](6e2d584dc9)
2025-08-18 18:15:11 +05:30
Kareem Ergawy
c1e2a9c66d
[flang][OpenMP] Only privaize pre-determined symbols when defined the evaluation. (#154070)
Fixes a regression uncovered by Fujitsu test 0686_0024.f90. In
particular, verifies that a pre-determined symbol is only privatized by
its defining evaluation (e.g. the loop for which the symbol was marked
as pre-determined).
2025-08-18 13:36:08 +02:00
Slava Zakharin
5178aeff7b
Revert "[flang] Lower EOSHIFT into hlfir.eoshift." (#153907)
Reverts llvm/llvm-project#153106

Buildbots failing:
* https://lab.llvm.org/buildbot/#/builders/199/builds/5188
* https://lab.llvm.org/buildbot/#/builders/41/builds/8329
2025-08-15 17:48:40 -07:00
Jean-Didier PAILLEUX
acdbb00af5
[flang] Adding support of -fcoarray flang and init PRIF (#151675)
In relation to the approval and merge of the
[PRIF](https://github.com/llvm/llvm-project/pull/76088) specification
about multi-image features in Flang, here is a first PR to add support
for the `-fcoarray` compilation flag and the initialization of the PRIF
environment.
Other PRs will follow for adding support of lowering to PRIF.
2025-08-15 16:04:49 -07:00
Slava Zakharin
9f302ed0cf
[flang] Inline hlfir.eoshift during HLFIR intrinsics simplication. (#153108)
This patch generalizes the code for hlfir.cshift to be applicable
for hlfir.eoshift. The major difference is the selection
of the boundary value that might be statically/dynamically absent,
in which case the default scalar value has to be used.
The scalar value of the boundary is always computed before
the hlfir.elemental or the assignment loop.
Contrary to hlfir.cshift simplification, the SHIFT value is not
normalized, because the original value (and its sign) participates in
the EOSHIFT index computation for addressing the input array and
selecting which elements of the result are assigned from the boundary
operand.
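
As a rough illustration of why the sign of SHIFT matters, here is a minimal Python sketch of rank-1 EOSHIFT semantics. It models the Fortran intrinsic, not the HLFIR inlining code; `eoshift_1d` is a hypothetical helper, and the default boundary of 0 assumes a numeric array.

```python
def eoshift_1d(array, shift, boundary=0):
    """End-off shift of a rank-1 sequence.

    Result element i comes from array[i + shift] when that index is in
    range; otherwise it is filled from the boundary value.
    """
    n = len(array)
    result = []
    for i in range(n):
        src = i + shift  # the unnormalized shift, sign included
        result.append(array[src] if 0 <= src < n else boundary)
    return result


print(eoshift_1d([1, 2, 3, 4, 5], 2))   # [3, 4, 5, 0, 0]
print(eoshift_1d([1, 2, 3, 4, 5], -2))  # [0, 0, 1, 2, 3]
```

A positive shift fills the tail from the boundary and a negative shift fills the head, so reducing SHIFT modulo the extent (as is valid for CSHIFT) would change the result.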
2025-08-15 15:22:06 -07:00
Slava Zakharin
25285b3476
[flang] Lower EOSHIFT into hlfir.eoshift. (#153106)
Straightforward lowering of EOSHIFT intrinsic into the new hlfir.eoshift
operation.
2025-08-15 13:55:05 -07:00
Slava Zakharin
4c6afc7993
[flang] Lower hlfir.eoshift to the runtime call. (#153107)
Straightforward lowering of hlfir.eoshift to the runtime call
in LowerHLFIRIntrinsics pass.
2025-08-15 13:54:49 -07:00
Slava Zakharin
95d4362521
[flang] Added hlfir.eoshift operation definition. (#153105)
This is a basic definition of the operation corresponding to
the Fortran's EOSHIFT transformational intrinsic.
2025-08-15 13:15:35 -07:00
Valentin Clement (バレンタイン クレメン)
3720d8b52d
[flang][cuda] Update some bind name to fast version and add __sincosf (#153744)
Use the fast version in the bind name and reorder these fast math
functions. Add missing __sincosf interface.
2025-08-15 11:07:15 -07:00
Valentin Clement (バレンタイン クレメン)
115f816069
[flang][cuda] Add missing bind name for __int2double_rn (#153720) 2025-08-15 10:27:19 -07:00
Valentin Clement (バレンタイン クレメン)
0e4af726cb
[flang][cuda] Add interface for __fdividef (#153742) 2025-08-15 10:26:40 -07:00
Valentin Clement (バレンタイン クレメン)
0e8c964c21
[flang][cuda] Add interfaces for double_as_longlong and longlong_as_double (#153719) 2025-08-15 17:26:11 +00:00
Valentin Clement (バレンタイン クレメン)
fd3f052aeb
[flang][cuda] Add interfaces for int_as_float and float_as_int (#153716) 2025-08-15 10:00:53 -07:00
Valentin Clement (バレンタイン クレメン)
583499a8cf
[flang][cuda] Add missing bind name for __hiloint2double, __double2loint and __double2hiint (#153713) 2025-08-15 09:32:59 -07:00
Akash Banerjee
1fd1d63463
[MLIR][OpenMP] Add a new AutomapToTargetData conversion pass in FIR (#153048)
Add a new AutomapToTargetData pass. It gathers the declare target
enter variables that have the AUTOMAP modifier and adds
omp.declare_target_enter/exit mapping directives for fir.alloca and
fir.free operations on the AUTOMAP-enabled variables.

Automap Ref: OpenMP 6.0 section 7.9.7.
2025-08-15 15:41:41 +01:00
Kareem Ergawy
b9e33fd493
[flang] Do not re-localize loop ivs when nested inside blocks (#153350)
Consider the following example:
```fortran
  implicit none
  integer :: i, j

  do concurrent (i=1:10) local(j)
    block
      do j=1,20
      end do
    end block
  end do
```

Without the fix introduced in this PR, the compiler would "re-localize"
the `j` variable inside the `fir.do_concurrent` loop:
```mlir
    fir.do_concurrent {
      %7 = fir.alloca i32 {bindc_name = "j"}
      %8:2 = hlfir.declare %7 {uniq_name = "_QFloop_in_nested_blockEj"} : (!fir.ref<i32>) -> (!fir.ref<i32>, !fir.ref<i32>)
      ...
      fir.do_concurrent.loop (%arg0) = (%5) to (%6) step (%c1) local(@_QFloop_in_nested_blockEj_private_i32 %4#0 -> %arg1 : !fir.ref<i32>) {
        %12:2 = hlfir.declare %arg1 {uniq_name = "_QFloop_in_nested_blockEj"} : (!fir.ref<i32>) -> (!fir.ref<i32>, !fir.ref<i32>)
        ...
        %17:2 = fir.do_loop %arg2 = %14 to %15 step %c1_1 iter_args(%arg3 = %16) -> (index, i32) {
          fir.store %arg3 to %8#0 : !fir.ref<i32>
          ...
        }
      }
    }
```

This happened because we did a shallow look-up of `j` and since the loop
is nested inside a `block`, the look-up failed and we re-created a local
allocation for `j` inside the parent `fir.do_concurrent` loop. This
means that we ended up not using the actual localized symbol which is
passed as a region argument to the `fir.do_concurrent.loop` op.

In case of `j`, we do not need to do a shallow look-up. The shallow
look-up is only needed if a symbol is an OpenMP private one or an
iteration variable of a `do concurrent` loop. Neither of which applies
to `j`.
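
The difference between the two kinds of look-up can be sketched with a toy scope chain in Python. This is an assumption-laden model, not flang's symbol map: the `Scope` class and its method names are hypothetical.

```python
class Scope:
    """Toy lexical scope holding name -> value bindings, with a parent link."""

    def __init__(self, parent=None):
        self.parent = parent
        self.symbols = {}

    def bind(self, name, value):
        self.symbols[name] = value

    def lookup_shallow(self, name):
        # Only the innermost scope is consulted.
        return self.symbols.get(name)

    def lookup(self, name):
        # Walk outward through the enclosing scopes.
        scope = self
        while scope is not None:
            if name in scope.symbols:
                return scope.symbols[name]
            scope = scope.parent
        return None


loop_scope = Scope()
loop_scope.bind("j", "%arg1")            # the localized region argument
block_scope = Scope(parent=loop_scope)   # the nested `block`

print(block_scope.lookup_shallow("j"))   # not found in the block's own scope
print(block_scope.lookup("j"))           # found in the enclosing loop scope
```

In this model the shallow look-up misses the binding made by the enclosing loop, which is the situation that led to a fresh allocation being created for `j`.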

With the fix, `j` is properly resolved to the `local` region argument:
```mlir
    fir.do_concurrent {
      ...
      fir.do_concurrent.loop (%arg0) = (%5) to (%6) step (%c1) local(@_QFloop_in_nested_blockEj_private_i32 %4#0 -> %arg1 : !fir.ref<i32>) {
        ...
        %10:2 = hlfir.declare %arg1 {uniq_name = "_QFloop_in_nested_blockEj"} : (!fir.ref<i32>) -> (!fir.ref<i32>, !fir.ref<i32>)
        ...
        %15:2 = fir.do_loop %arg2 = %12 to %13 step %c1_1 iter_args(%arg3 = %14) -> (index, i32) {
          fir.store %arg3 to %10#0 : !fir.ref<i32>
          ...
        }
      }
    }
```
2025-08-15 08:45:02 +02:00
Valentin Clement (バレンタイン クレメン)
3bc4d66082
[flang][cuda] Add interfaces for __int2float_rX (#153708) 2025-08-14 16:45:44 -07:00
Valentin Clement (バレンタイン クレメン)
ffe4870472
[flang][cuda] Add interfaces for __float2int_rX and __float2unit_rX (#153691) 2025-08-14 23:11:45 +00:00
Valentin Clement (バレンタイン クレメン)
602f308d4f
[flang][cuda] Add interface for __saturatef (#153705) 2025-08-14 15:55:17 -07:00
Valentin Clement (バレンタイン クレメン)
2775c79c4f
[flang][cuda] Add interfaces for __float2ll_rX (#153702) 2025-08-14 15:44:52 -07:00
Valentin Clement (バレンタイン クレメン)
ca9ddd54b7
[flang][cuda] Add interfaces for __ll2float_rX (#153694) 2025-08-14 15:35:02 -07:00
Valentin Clement (バレンタイン クレメン)
df15c0d716
[flang][cuda] Add interfaces for __dsqrt_rn and __dsqrt_rz (#153624) 2025-08-14 22:08:33 +00:00
Valentin Clement (バレンタイン クレメン)
b989c7c2e0
[flang][cuda] Add interfaces for __drcp_rX (#153681) 2025-08-14 21:44:47 +00:00
Valentin Clement (バレンタイン クレメン)
06590444f5
[flang][cuda] Add bind names for __double2ull_rX interfaces (#153678) 2025-08-14 21:10:20 +00:00
Valentin Clement (バレンタイン クレメン)
bad3df4764
[flang][cuda] Add bind names for __double2ll_rX interfaces (#153660) 2025-08-14 13:34:25 -07:00
Valentin Clement (バレンタイン クレメン)
20a829937c
[flang][cuda] Add interfaces for __expf and __exp10f (#153633) 2025-08-14 11:36:55 -07:00
Valentin Clement (バレンタイン クレメン)
e27e4f3a99
[flang][cuda] Add interfaces for __uint2float_rX functions (#153620)
Also add bind name for __uint2double_rn
2025-08-14 18:05:37 +00:00
Valentin Clement (バレンタイン クレメン)
efce767a88
[flang][cuda] Add interfaces for __ull2float_rX functions (#153613) 2025-08-14 10:28:17 -07:00
Valentin Clement (バレンタイン クレメン)
a8f1f1b41f
[flang][cuda] Add interfaces for __logf, __log2f and __log10f (#153611) 2025-08-14 17:17:52 +00:00
Valentin Clement (バレンタイン クレメン)
6961139ce9
[flang][cuda] Add interfaces for __sinf and __tanf (#153609) 2025-08-14 09:50:03 -07:00
Ian McInerney
28d5bc5649
[Flang][Driver] Predefine pic/pie macros based on configured level (#153449)
Predefine the `__pic__/__pie__/__PIC__/__PIE__` macros based on the
configured relocation level. This logic mirrors that of the clang
driver, where `__pic__/__PIC__` are defined for both PIC and PIE modes,
but `__pie__/__PIE__` are only defined for PIE mode.
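
The rule can be summarized in a short Python sketch. The function `pic_pie_macros` and its parameters are hypothetical, and giving each macro the level as its value is an assumption made here for illustration.

```python
def pic_pie_macros(pic_level: int, is_pie: bool) -> dict:
    """Return the predefined macros for a given relocation configuration.

    pic_level: 0 means no position-independent code; a nonzero value is
    the configured level. is_pie selects PIE mode on top of that level.
    """
    macros = {}
    if pic_level > 0:
        # Defined for both PIC and PIE modes.
        macros["__pic__"] = pic_level
        macros["__PIC__"] = pic_level
        if is_pie:
            # Defined only for PIE mode.
            macros["__pie__"] = pic_level
            macros["__PIE__"] = pic_level
    return macros


print(pic_pie_macros(2, False))
print(pic_pie_macros(2, True))
```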

Fixes https://github.com/llvm/llvm-project/issues/135275
2025-08-14 17:48:20 +01:00