10575 Commits

Author SHA1 Message Date
Valentin Clement (バレンタイン クレメン)
9a0e03f430
[flang][cuda] Update implicit data transfer for device component (#147882)
Update the detection of implicit data transfer when a device resident
allocatable derived-type component is involved and remove the TODOs.
2025-07-10 09:50:31 -07:00
Krzysztof Parzyszek
9b0ae6ccd6
[flang][OpenMP] Issue a warning when parsing future directive spelling (#147765)
OpenMP 6.0 introduced alternative spelling for some directives, with the
previous spellings still allowed.

Warn the user when a new spelling is encountered with OpenMP version set
to an older value.
2025-07-10 09:57:03 -05:00
Daniel Paoliello
154de3e1bd
[flang] Don't check the '-mframe-pointer' flag (#147837)
The `-mframe-pointer` flag is not explicitly set in the original `flang`
invocation and so the value passed to `flang -fc1` can vary depending on
the host machine, so don't verify it in the output.

`-mframe-pointer` forwarding is already verified by
`flang/test/Driver/frame-pointer-forwarding.f90`.
2025-07-10 07:27:33 -07:00
Daniel Chen
13ead00049
[Flang] Fix PowerPC build failure due to the deprecation of ArrayRef(std::nullopt_t) {}. (#147816)
Our local Flang build on PowerPC was broken as
```
llvm/flang/../mlir/include/mlir/IR/ValueRange.h:401:20: error: 'ArrayRef' is deprecated: Use {} or ArrayRef<T>() instead [-Werror,-Wdeprecated-declarations]
  401 |       : ValueRange(ArrayRef<Value>(std::forward<Arg>(arg))) {}
      |                    ^
llvm/flang/lib/Optimizer/CodeGen/CodeGen.cpp:2243:53: note: in instantiation of function template specialization 'mlir::ValueRange::ValueRange<const std::nullopt_t &, void>' requested here
 2243 |                              /*cstInteriorIndices=*/std::nullopt, fieldIndices,
      |                                                     ^
 llvm/include/llvm/ADT/ArrayRef.h:70:18: note: 'ArrayRef' has been explicitly marked deprecated here
   70 |     /*implicit*/ LLVM_DEPRECATED("Use {} or ArrayRef<T>() instead", "{}")
      |                  ^
llvm/include/llvm/Support/Compiler.h:244:50: note: expanded from macro 'LLVM_DEPRECATED'
  244 | #define LLVM_DEPRECATED(MSG, FIX) __attribute__((deprecated(MSG, FIX)))
      |                                                  ^
1 error generated.
```

This patch is to fix it.
2025-07-10 09:53:03 -04:00
agozillon
75f81ded8f
[Flang][FlangRT][Runtime] Add RT_OFFLOAD_API_GROUP_BEGIN to missing symbols on AMDGPU (#147612)
After the recent move to work queues, linking in the Fortran runtime
built for offload on AMDGPU can, in certain cases, fail with missing
symbols. This PR tries to address this issue by encompassing more of
the library in RT_OFFLOAD_API_GROUP_BEGIN, which has the effect of
compiling these functions for AMDGPU, resolving the missing symbols.

This PR should address the following issue:
https://github.com/llvm/llvm-project/issues/145888
2025-07-10 13:19:58 +02:00
Krzysztof Parzyszek
2546c6d3f7
[flang][OpenMP] Recognize remaining OpenMP 6.0 spellings in parser (#147723)
Parse OpenMP 6.0 spellings for directives that don't use
OmpDirectiveNameParser.
2025-07-09 16:02:24 -05:00
Krzysztof Parzyszek
d2adfcaa9e
[flang][OpenMP] Handle multiple spellings in OmpDirectiveNameParser (#147722)
Collect all spellings from all supported OpenMP versions before parsing.
Break up the list of spellings by the initial letter to speed up parsing
a little.
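
The bucketing idea can be sketched as follows. This is only an illustration, not the code in `OmpDirectiveNameParser`: the names `SpellingBuckets`, `BucketByInitial` and `MatchDirective` are hypothetical, and the input is assumed to be lowercase already.

```cpp
#include <algorithm>
#include <map>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical container: directive spellings grouped by first character.
using SpellingBuckets = std::map<char, std::vector<std::string>>;

SpellingBuckets BucketByInitial(const std::vector<std::string>& spellings) {
  SpellingBuckets buckets;
  for (const std::string& s : spellings) {
    if (!s.empty()) buckets[s.front()].push_back(s);
  }
  // Longest spellings first, so "target data" is tried before "target".
  for (auto& entry : buckets) {
    std::sort(entry.second.begin(), entry.second.end(),
              [](const std::string& a, const std::string& b) {
                return a.size() > b.size();
              });
  }
  return buckets;
}

// Returns the longest known spelling that is a prefix of `input`,
// or an empty string if none matches. Only one bucket is scanned.
std::string MatchDirective(const SpellingBuckets& buckets,
                           std::string_view input) {
  if (input.empty()) return "";
  auto it = buckets.find(input.front());
  if (it == buckets.end()) return "";
  for (const std::string& s : it->second) {
    if (input.substr(0, s.size()) == s) return s;
  }
  return "";
}
```

With the spellings `target`, `target data`, `taskwait` and `parallel`, an input starting with `p` only ever looks at one candidate instead of four.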
2025-07-09 16:02:01 -05:00
Vijay Kandiah
c4138a24dc
[mlir][acc][flang] Lower nested ACC loops with tile clause as collapsed loops (#147801)
In the case of nested loops, `acc.loop` is meant to subsume all of the
loops that it applies to (when explicitly described as doing so in the
OpenACC specification). So when there is an `acc loop tile(...)` present
on nested Fortran DO loops, `acc.loop` should apply to the `n` loops
that `tile` applies to. This change lowers such nested Fortran loops
with tile clause into a collapsed `acc.loop` with `n` IVs, loop bounds,
and step, in a similar fashion to the current lowering for acc loops
with `collapse` clause.
2025-07-09 15:47:11 -05:00
Andre Kuhlenschmidt
fc9dd58734
[flang][driver] add -Wfatal-errors (#147614)
Adds the flag `-Wfatal-errors` which truncates the error messages at 1 error.
2025-07-09 12:35:43 -07:00
Leandro Lupori
a63846b475
[flang] Fix array assignment regression introduced by #147371 (#147761)
In some cases fixed shape arrays can be fir.heap/fir.ptr, even
after hlfir::derefPointersAndAllocatables() is called.
2025-07-09 14:41:56 -03:00
David Spickett
e8e5d07767
Revert "[flang] Avoid undefined behaviour when parsing format expressions (#147539)"
This reverts commit d0caf0d4857c2b00ba988f86703663685ec8697f.

MathExtras.h is not found in some builds.
2025-07-09 14:49:58 +00:00
David Spickett
d0caf0d485
[flang] Avoid undefined behaviour when parsing format expressions (#147539)
The test flang/test/Semantics/io08.f90 was failing when UBSAN was
enabled:
```
/home/david.spickett/llvm-project/flang/include/flang/Common/format.h:224:26: runtime error: signed integer overflow: 10 * 987654321098765432 cannot be represented in type 'int64_t' (aka 'long')
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior /home/david.spickett/llvm-project/flang/include/flang/Common/format.h:224:26
```
This is because the code was effectively:
* Take the risk of UB happening
* Check whether it happened or not

Which UBSAN is obviously not going to like. Instead of checking after
the fact, use llvm's helpers that catch overflow without actually doing
it.
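
A minimal sketch of "check before, not after", assuming a plain digit-accumulation loop. The patch itself uses LLVM's helpers; the function name `ParseDecimal` and the hand-written pre-check below are illustrative only.

```cpp
#include <cstdint>
#include <limits>
#include <optional>
#include <string_view>

// Accumulates decimal digits into an int64_t. The overflow test runs
// before the multiply-add, so the signed overflow never actually happens.
std::optional<std::int64_t> ParseDecimal(std::string_view digits) {
  constexpr std::int64_t kMax = std::numeric_limits<std::int64_t>::max();
  if (digits.empty()) return std::nullopt;
  std::int64_t value = 0;
  for (char c : digits) {
    if (c < '0' || c > '9') return std::nullopt;
    const std::int64_t digit = c - '0';
    // value * 10 + digit <= kMax  <=>  value <= (kMax - digit) / 10
    if (value > (kMax - digit) / 10) return std::nullopt;
    value = value * 10 + digit;
  }
  return value;
}
```

For the input in the UBSAN report, the 19th digit is rejected by the pre-check instead of evaluating `10 * 987654321098765432`.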
2025-07-09 15:31:45 +01:00
Maksim Levental
1770e9b5c6
[mlir] remove dangling builders from td (#147619)
These are "dangling" builders (decls are emitted but there are no defns
anywhere).
2025-07-09 09:59:24 -04:00
Shunsuke Watanabe
c9900015a9
[flang] Add -fcomplex-arithmetic= option and select complex division algorithm (#146641)
This patch adds an option to select the method for computing complex
number division. It uses `LoweringOptions` to determine whether to lower
complex division to a runtime function call or to MLIR's `complex.div`,
and `CodeGenOptions` to select the computation algorithm for
`complex.div`. The available option values and their corresponding
algorithms are as follows:
- `full`: Lower to a runtime function call. (Default behavior)
- `improved`: Lower to `complex.div` and expand to Smith's algorithm.
- `basic`: Lower to `complex.div` and expand to the algebraic algorithm.
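
For reference, the two expansions named above can be written in textbook form as below. This is a sketch on `std::complex<double>`, not the code MLIR generates for `complex.div`; the function names are made up for the example, and special-value handling (the kind of thing a runtime call for `full` would cover) is left out.

```cpp
#include <cmath>
#include <complex>

// "basic": straight algebraic formula. c*c + d*d can overflow or
// underflow even when the true quotient is representable.
std::complex<double> DivideAlgebraic(std::complex<double> x,
                                     std::complex<double> y) {
  const double a = x.real(), b = x.imag(), c = y.real(), d = y.imag();
  const double denom = c * c + d * d;
  return {(a * c + b * d) / denom, (b * c - a * d) / denom};
}

// "improved": Smith's algorithm. Scaling by the ratio of the smaller to
// the larger component of the divisor avoids squaring large values.
std::complex<double> DivideSmith(std::complex<double> x,
                                 std::complex<double> y) {
  const double a = x.real(), b = x.imag(), c = y.real(), d = y.imag();
  if (std::fabs(c) >= std::fabs(d)) {
    const double r = d / c;
    const double denom = c + d * r;
    return {(a + b * r) / denom, (b - a * r) / denom};
  }
  const double r = c / d;
  const double denom = c * r + d;
  return {(a * r + b) / denom, (b * r - a) / denom};
}
```

Both give 0.44 + 0.08i for (1 + 2i) / (3 + 4i). With all four components at 1e200, the algebraic form produces NaN because the intermediate products overflow, while Smith's form returns 1 + 0i.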

See also the discussion in the following discourse post:
https://discourse.llvm.org/t/optimization-of-complex-number-division/83468

---------

Co-authored-by: Tarun Prabhu <tarunprabhu@gmail.com>
2025-07-09 13:43:54 +09:00
Valentin Clement (バレンタイン クレメン)
46caad52ac
[flang][cuda] Do not produce data transfer in offloaded do concurrent (#147435)
If a `do concurrent` loop is offloaded then there should be no CUDA data
transfer in it. Update semantics and lowering to take that into
account.

`AssignmentChecker` has to be put into a separate pass because the
checkers in `SemanticsVisitor` cannot have the same `Enter/Leave`
functions. The `DoForallChecker` already has `Enter/Leave` functions
for the `DoConstruct`.
2025-07-08 10:52:15 -07:00
Leandro Lupori
e976eaf303
[flang] Fix optimization of array assignments after #146408 (#147371)
Host associated variables were not being handled properly.
For array references, get the fixed shape extents from the value
type instead, which works correctly in all cases.
2025-07-08 14:47:26 -03:00
Jack Styles
9a8d45f626
[Flang][OpenMP] Fix crash when block.end() is missed (#147519)
As reported in #145917 and #147309, there are situations where flang
may crash. This is because `nextIt` in
`RewriteOpenMPLoopConstruct` gets re-assigned when an iterator is erased
from the block. If this is missed, Flang may attempt to access a
location in memory that is not accessible and cause a compiler crash.

This adds protection where the crash can occur, and a test with a
reproducer that can trigger the crash.
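
The general hazard can be shown with a standard container. This is a generic sketch, not the `RewriteOpenMPLoopConstruct` code: the `std::list<std::string>` block, its marker strings and the function name are stand-ins invented for the example.

```cpp
#include <iterator>
#include <list>
#include <string>

// Hypothetical stand-in for a block of constructs: after each "begin"
// marker, erase the "end" markers that directly follow it.
void DropTrailingEnds(std::list<std::string>& block) {
  for (auto it = block.begin(); it != block.end(); ++it) {
    if (*it != "begin") continue;
    auto nextIt = std::next(it);
    // erase() returns the iterator after the erased element, which may be
    // block.end(); compare against end() before every dereference.
    while (nextIt != block.end() && *nextIt == "end") {
      nextIt = block.erase(nextIt);
    }
  }
}
```

Without the `nextIt != block.end()` test, a block that ends right after the erased element would be dereferenced past its last entry.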

Fixes #147309
2025-07-08 12:28:58 -05:00
David Spickett
31786ee89f
[flang] Avoid undefined behaviour in Interval::Contains (#147505)
If the size of the other Interval was 0, (that.size_ - 1) would wrap
below zero.

I've fixed this so that a zero size interval A is within interval B if
the start of A is within B. There are a few ways you could handle zero
sized intervals in theory but this one passes all tests so I assume it's
the intention.
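
A minimal sketch of that rule, assuming an interval is just an unsigned start and size. The struct below is a simplified stand-in for illustration, not flang's `Interval` template.

```cpp
#include <cstddef>

struct Interval {
  std::size_t start;
  std::size_t size;

  // True if position x lies in [start, start + size).
  bool Contains(std::size_t x) const {
    return start <= x && x - start < size;
  }

  // A zero-sized interval is contained if its start is contained.
  // Handling it first means (that.size - 1) is never evaluated for
  // size == 0, where it would wrap around.
  bool Contains(const Interval& that) const {
    if (that.size == 0) return Contains(that.start);
    return Contains(that.start) && Contains(that.start + (that.size - 1));
  }
};
```

With `Interval b{10, 5}`, the empty interval `{12, 0}` is contained and `{15, 0}` is not, since position 15 is one past the end of `b`.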

This fixes the following tests when ubsan is enabled:
  Flang :: Lower/OpenMP/PFT/sections-pft.f90
  Flang :: Lower/OpenMP/derived-type-allocatable.f90
  Flang :: Lower/OpenMP/privatization-proc-ptr.f90
  Flang :: Lower/OpenMP/sections.f90
  Flang :: Parser/OpenMP/sections.f90
  Flang :: Semantics/OpenMP/clause-validity01.f90
  Flang :: Semantics/OpenMP/if-clause.f90
  Flang :: Semantics/OpenMP/parallel-sections01.f90
  Flang :: Semantics/OpenMP/private-assoc.f90
2025-07-08 14:39:08 +01:00
David Spickett
d889a7485f
[flang] Avoid UB in CharBlock Compare to C string (#147329)
The behaviour of strncmp is undefined if either string pointer is null
(https://en.cppreference.com/w/cpp/string/byte/strncmp.html).

I've copied the logic over from Compare to another CharBlock, which had
code to avoid UB in memcmp.
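
The shape of such a guard might look like this. It is a sketch under assumptions, not the `CharBlock` code: the block is modelled as a raw pointer plus length, the function name is invented, and the comparison uses `strlen` plus `memcmp` rather than `strncmp`. The point is that the null checks come before any library call that requires valid pointers.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>

// Compares a (pointer, length) block with a C string. Returns a negative,
// zero, or positive value like strcmp. `data` may be null only if size == 0;
// `that` may be null and is then treated as an empty string.
int CompareToCString(const char* data, std::size_t size, const char* that) {
  if (size == 0) {
    return (that == nullptr || *that == '\0') ? 0 : -1;
  }
  if (that == nullptr) {
    return 1;  // non-empty block vs. no string at all
  }
  // Both pointers are valid from here on.
  const std::size_t thatSize = std::strlen(that);
  if (int cmp = std::memcmp(data, that, std::min(size, thatSize))) {
    return cmp;
  }
  return size < thatSize ? -1 : (size > thatSize ? 1 : 0);
}
```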

The test Preprocessing/kind-suffix.F90 was failing with UBSAN enabled,
and now passes.
2025-07-08 08:52:36 +01:00
Valentin Clement (バレンタイン クレメン)
659c8102f4
Reland [flang][cuda] Allocate derived-type with CUDA component in managed memory (#147416)
2025-07-07 17:40:04 -07:00
Valentin Clement (バレンタイン クレメン)
07cc7ea7d5
Reland [flang][cuda] Do not create global for derived-type with allocatable device components (#147402)
Reviewed in #146780

Derived types with CUDA device allocatable components will be handled
via CUDA allocation. Do not create globals for them.
2025-07-07 15:19:41 -07:00
Daniel Paoliello
71ffa2a4d3
[flang] Correctly handle -mframe-pointer=reserved (#146937)
Fixes `#146802`

#146582 started using the `Reserved` Frame Pointer kind for Arm64
Windows, but this revealed a bug in Flang where it copied the
`-mframe-pointer=reserved` flag from Clang, but didn't correctly handle
it in its own command line parser and subsequent compilation pipeline.

This change adds support for `-mframe-pointer=reserved` and adds a test
to make sure that functions are correctly marked when the flag is set.
2025-07-07 09:15:47 -07:00
Leandro Lupori
6855573700
[flang][OpenMP] Fix parallel-firstprivate-clause-scalar.f90 test (#146932)
Fix REQUIRES and references to declared variables.

Fixes #146875
2025-07-07 09:10:11 -03:00
David Spickett
b1a8c8a32c
[flang][test] Fix REQUIRES and options for a few x86 specific tests (#146872)
These should have been looking for the "x86" target, not "x86_64".

When run on AArch64 they failed because bbc tried to compile for
AArch64. Add a target option to fix that, as these tests are x86
specific.
2025-07-07 10:26:17 +01:00
David Spickett
3e934dded0
[flang][test] Fix test REQUIRES and options for aint.f90 (#146870)
This test should have been looking for the "x86" target, not "x86_64".

In the time it's not been running, -target must have been changed to
-triple.
2025-07-07 10:25:41 +01:00
Jack Styles
8a221a585c
[Flang][OpenMP] Push context when parsing DECLARE VARIANT (#147075)
Basic parsing and semantics support for Declare Variant was added in
#130578. However, this did not include variants of `Pre` and `Post`
within `OmpAttributeVisitor`. This meant that when a function in the
class tried to get the context using `GetContext`, Flang would crash as
the context was empty. To make such lookups possible, for example when
resolving names as part of the `uniform` clause in the `simd` directive,
the context is now pushed within `OmpAttributeVisitor` when parsing a
`DECLARE VARIANT` directive.

Fixes #145222
2025-07-07 09:57:13 +01:00
Kiran Chandramohan
5271f9fba9
[Flang][Doc] NFC: Minor fix for headings (#147077)
Use a top level section to ensure that there is only one entry in the
flang.llvm.org/docs page.

Also generate a table of contents.
2025-07-07 09:57:03 +01:00
Michael Klemm
19afd27eb8
[Flang] Fix ACOSD and ASIND (fixes issue #145593) (#145656)
Original implementation converted DEG->RAD before calling ACOS/ASIN, but
the conversion needs to happen after invoking ACOS/ASIN.
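
In other words, ACOSD(x) is ACOS(x) converted from radians to degrees, not ACOS applied to a converted argument. A small sketch of the corrected order, with illustrative function names:

```cpp
#include <cmath>

constexpr double kPi = 3.14159265358979323846;

double RadiansToDegrees(double radians) { return radians * (180.0 / kPi); }

// The argument of acos/asin is a plain ratio, so it is passed through
// unchanged; only the resulting angle is converted to degrees.
double Acosd(double x) { return RadiansToDegrees(std::acos(x)); }
double Asind(double x) { return RadiansToDegrees(std::asin(x)); }
```

For example, `Acosd(0.5)` is approximately 60 and `Asind(0.5)` is approximately 30.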
2025-07-07 09:09:11 +02:00
Joseph Huber
6db02dc431
[Clang] Introduce --offload-targets for -fopenmp-targets (#146594)
Summary:
This patch is mostly an NFC that renames the existing `-fopenmp-targets`
into `--offload-targets`. Doing this early simplifies a follow-up patch
that will hopefully allow this syntax to be used more generically over
the existing `--offload` syntax (which I think is mostly unmaintained
now). This follows the well-trodden path of trying to pull
language-specific offload options into generic ones, but right now this
is still just OpenMP specific.
2025-07-04 16:20:53 -05:00
agozillon
fd5ed046fd
[Flang][OpenMP][NFC] Remove flag toggling deprecated no hlfir flow in map-types-and-sizes.f90 (#146995)
We no longer utilise the deprecated FIR only flow, so we should be
testing for the current HLFIR flow that we support as opposed to the
older one that we no longer maintain.
2025-07-04 17:24:56 +02:00
Tom Eccles
ed17bf1e4c
[flang] Fix tests broken by #146734 (#147055)
These tests referred to privatizers which were never declared
2025-07-04 14:50:29 +01:00
jeanPerier
274e798a98
[flang] use set_union instead of merge in added DerivedTypeCache (#147024)
When merging the lists of recursive references under two components,
duplicates should be removed.
If the recursive references to parent nodes (referred to by the depth of
the parent node) are [1, 2, 5] and [4, 5], the new list should be
[1, 2, 4, 5].

With std::merge the order was correct but 5 was duplicated. Use
std::set_union instead that removes duplicates.
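
The difference is easy to reproduce with the lists from the example above. This is a standalone sketch on `std::vector<int>`; the function names are illustrative and the container type used in the patch may differ.

```cpp
#include <algorithm>
#include <iterator>
#include <vector>

// Both inputs must already be sorted.
std::vector<int> MergeKeepingDuplicates(const std::vector<int>& a,
                                        const std::vector<int>& b) {
  std::vector<int> out;
  std::merge(a.begin(), a.end(), b.begin(), b.end(),
             std::back_inserter(out));
  return out;
}

std::vector<int> UnionWithoutDuplicates(const std::vector<int>& a,
                                        const std::vector<int>& b) {
  std::vector<int> out;
  std::set_union(a.begin(), a.end(), b.begin(), b.end(),
                 std::back_inserter(out));
  return out;
}
```

For [1, 2, 5] and [4, 5], the first returns [1, 2, 4, 5, 5] and the second [1, 2, 4, 5]. Repeated merging of lists that share entries therefore grows without bound, while the union stays at the number of distinct parents.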

With this patch Fujitsu tests 0394_0030.f90 [1] and 0390_0230.f90 [2]
finally compile with -g in about 10s. Their compilation was hanging
before #146543, and they were now hitting an error:
"LLVM ERROR: SmallVector unable to grow" which is fixed by this patch.

[1]: 0d02267bb9/Fortran/0394/0394_0030.f90
[2]: 0d02267bb9/Fortran/0390/0390_0230.f90
2025-07-04 14:42:42 +02:00
Leandro Lupori
0ba59587fa
[flang] Optimize assignments of multidimensional arrays (#146408)
Assignments of n-dimensional arrays, with trivial RHS, were
always being converted to n nested loops. For contiguous arrays,
it's possible to flatten them and use a single loop, which can
usually be better optimized by LLVM.
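
The transformation can be pictured in C++ terms as follows. This is only an analogy for the generated loops, assuming a contiguous column-major 3-D array and a scalar right-hand side; `Array3D` and both functions are invented for the illustration.

```cpp
#include <cstddef>
#include <vector>

// A contiguous 3-D array stored in column-major order, as in Fortran.
struct Array3D {
  std::size_t n1, n2, n3;
  std::vector<double> data;  // n1 * n2 * n3 elements
  Array3D(std::size_t a, std::size_t b, std::size_t c)
      : n1(a), n2(b), n3(c), data(a * b * c) {}
};

// Before: one loop per dimension.
void AssignNested(Array3D& a, double value) {
  for (std::size_t k = 0; k < a.n3; ++k)
    for (std::size_t j = 0; j < a.n2; ++j)
      for (std::size_t i = 0; i < a.n1; ++i)
        a.data[i + a.n1 * (j + a.n2 * k)] = value;
}

// After: because storage is contiguous, the same elements can be visited
// by a single loop over the flattened index space.
void AssignFlat(Array3D& a, double value) {
  for (std::size_t idx = 0; idx < a.data.size(); ++idx)
    a.data[idx] = value;
}
```

Both functions write exactly the same elements; the flat form simply gives the optimizer one loop with a single trip count to work with.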

In a test program, using a 3-dimensional array and varying its
size, the resulting speedup was as follows (measured on Graviton4):

16K     1.09
64K     1.40
128K    1.90
256K    1.91
512K    1.00

For sizes above or equal to 512K no improvement was observed.
It looks like LLVM stops trying to perform aggressive loop
unrolling at a certain threshold and just uses nested loops
instead. Larger sizes won't fit in the L1 and L2 caches either.

This was noticed while profiling 527.cam4_r. This optimization
makes aer_rad_props_sw slightly faster, but unfortunately it
practically doesn't change 527.cam4_r total execution time.
2025-07-04 08:49:51 -03:00
David Spickett
d84df61c00
[flang] Fix x86 REQUIRES in a couple of tests (#146869)
Many tests in Flang are looking for x86_64-registered-target, but this
never exists because the target is just called x86.

These two pass with this corrected. I still need to look into why the
others fail.
2025-07-04 08:43:57 +01:00
Kareem Ergawy
3e78afff0d
[flang] Fix Windows bot failure caused by #146667 (#147002)
Fixes a Windows bot failure caused by #146667. Just run the test if an
AMD GPU target is registered. Hopefully, the bot now passes.

Test coverage is not reduced since `bbc` is still run on all platforms.
2025-07-04 08:41:29 +01:00
Kareem Ergawy
8c9e0c6c61
[flang][OpenMP] Allocate reduction init temps on the stack for GPUs (#146667)
Temps needed for the reduction init regions are now allocated on the heap
all the time. However, this is a performance killer for GPUs since malloc
calls are prohibitively expensive. Therefore, we should do these
allocations on the stack for GPU reductions.
2025-07-04 06:29:34 +02:00
delaram-talaashrafi
1f7effc887
[mlir][acc][flang] Use SymbolRefAttr for func_name in ACC routine (#146951)
Changed the type of the `func_name` attribute from SymbolNameAttr to
SymbolRefAttr. SymbolNameAttr is typically used when defining a symbol
(e.g., `sym_name`), while SymbolRefAttr is appropriate for referencing
existing operations. This change ensures that MLIR can correctly track
the link to the referenced `func.func` operation.
2025-07-03 14:55:49 -07:00
Peter Klausler
2b7e3f6fa6
[flang] Unify derived types in distinct module files (#146759)
When using -fhermetic-module-files it's possible for a derived type to
have multiple distinct definition sites that are being compared for
being the same type, as in argument association. Accept them as being
the same type so long as they have the same names, the same module
names, and identical definitions.
2025-07-03 14:34:16 -07:00
Peter Klausler
dd3214d5a6
[flang] Fix handling of identifier in column 1 of free form continuat… (#146430)
…ion line

An obsolete flag ("insertASpace_") is being used to signal some cases in
the prescanner's implementation of continuation lines when a token
should be broken when it straddles a line break. It turns out that it's
sufficient to simply note these cases without ever actually inserting a
space, so don't do that (fixing the motivating bug). This leaves some
variables with obsolete names, so change them as well.

This patch handles the third of the three bugs reported in
https://github.com/llvm/llvm-project/issues/146362 .
2025-07-03 14:32:38 -07:00
Andre Kuhlenschmidt
bc89380179
[flang][preprocessor] fix use of bitwise-and for logical-and (#146758)
The preprocessor used bitwise AND to implement logical AND; this is a bug.
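
The two operators disagree whenever both operands are non-zero but share no set bits, as in an expression like `2 && 1`. A small sketch of the difference, with illustrative function names that are not taken from the preprocessor source:

```cpp
// Bitwise AND combines the operands bit by bit.
long BitwiseAnd(long lhs, long rhs) { return lhs & rhs; }

// Logical AND only asks whether both operands are non-zero.
long LogicalAnd(long lhs, long rhs) {
  return (lhs != 0 && rhs != 0) ? 1 : 0;
}
```

`LogicalAnd(2, 1)` is 1 (true), whereas `BitwiseAnd(2, 1)` is 0, which would wrongly make the condition false.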

towards #146362
2025-07-03 12:37:54 -07:00
Andre Kuhlenschmidt
67d6679c91
[flang][prescanner] fix invalid check (#146613)
`TokenSequence::pop_back()` had a check that assumed tokens are never
empty. Loosen this check since that isn't true.

towards #146362
2025-07-03 12:36:34 -07:00
parabola94
2b49d36c08
[flang][cmake] Separate FLANG_INCLUDE_TOOLS from FLANG_BUILD_TOOLS (#145005)
If we disable `FLANG_BUILD_TOOLS`, not only building the tools but also
generating the targets for them is skipped now. On the other hand, llvm
separates them into `LLVM_BUILD_TOOLS` and `LLVM_INCLUDE_TOOLS`.
This patch introduces `FLANG_INCLUDE_TOOLS` for the distinction.
2025-07-03 16:45:47 +02:00
jeanPerier
8763ac3252
[flang] fix skip-external-rtti-definition for ppc (#146826)
PPC does not use comdat. There is no need to check for that in the test,
just remove it.

Fix for https://lab.llvm.org/buildbot/#/builders/201/builds/5278
2025-07-03 14:10:31 +02:00
jeanPerier
4868d66282
[flang] improve DITypeAttr caching with recursive derived types (#146543)
The current strategy for caching DITypeAttr when generating debug
metadata for derived types is not optimal. This turns out to be a
compile-time issue in apps with very complex derived types, like CP2K.

See the added debug-cyclic-derived-type-caching-simple.f90 test for more
details about the duplication issue.

As a real-world example justifying the new, non-trivial caching strategy:
in CP2K, emitting debug type info for `swarm_worker_type` in swarm_worker.F
caused 1,747,347 LLVM debug metadata nodes to be emitted instead of 8023
after this patch (200x fewer), leading to noticeable compile-time
improvements (I measured 0.12s spent in the `AddDebugInfo` pass instead of
7.5s prior to this patch).

The main idea is that the cache now associates, with the cached
DITypeAttr tree for a derived type, a list of the parent nodes that this
DITypeAttr refers to recursively via indices.

When leaving the context of a parent node, all types that were cached
and linked to this parent node are cleared from the cache.
This allows more reuse in sub-trees while still fulfilling the MLIR
requirement that DITypeAttr types referring to a parent DITypeAttr via
integer id should only be used inside the DITypeAttr of the parent.

Most of the complexity comes from computing the "list of parent nodes"
by merging the ones from the components.

This is done in such a way that the extra cost for apps without
recursive derived types is minimal, because the extra data structure
should not require extra dynamic allocations when there is little or no
recursion.

Example:

Take the following type graph (Fortran source for it is in the added
debug-cyclic-derived-type-caching-complex.f90).
A is the top-level type and has direct components of types B, C, and
E.
There are cycles in the type tree introduced by types B and D.
Types `C` and `E` are of interest here because they are in the middle of
those cycles and appear in several places in the type tree. Their
occurrences are labeled in brackets in the order of visit by the
DebugTypeGenerator.

```
 A -> B -> C [1] -> D -> E [1] -> F -> G -> B
 |   |              |             |
 |   |              |             | -> D
 |   |              |
 |   |              | -> H -> E [2] ->  F -> G -> B
 |   |                                  |
 |   |                                  |-> D
 |   |
 |   | -> I -> E [3] ->  F -> G -> B
 |   |                   |
 |   |                   |-> D
 |   | -> C [2]
 |
 | -> C [3] -> D
 | -> E [4] -> F -> G -> B
               |
               | -> D
```

With this patch, E[2] and E[3] can share the same DITypeAttr, as can
C[1] and C[2], while they previously all got their own nodes.

To be safe with regards to cycles in MLIR, a DITypeAttr created for a
node N2 under a node N1 being recursively referred to and above the
recursive reference to N1 shall not be used above N1 in the DITypeAttr
tree. It can however be used in several places under N1.

Hence here:
- E[2] cannot reuse E[1] DITypeAttr because D appears above and under
  E[1].
- E[3] can reuse E[2] DITypeAttr because they are both under B and above
  D.
- E[4] cannot reuse E[3] DITypeAttr because it is above B.

This is achieved by this patch because when visiting A and reaching B,
the recursive reference to B is registered in the visit context. D is
added to this context when going back up through F. So when getting back
to E[1] with the information needed to build its DITypeAttr, its
recursive references are known and saved along with the DITypeAttr in the
cache.

When reaching back D, the cache for E is cleared because it is known it
depended on D. A new DITypeAttr is created after E[2], and this time it
only depends on B because the D under E[2] is not a recursive reference
(D is not above E[2]). Hence, when reaching E[3] it can be reused, and
the cache entry for E[2] is cleared when reaching B, which leads to a
new DITypeAttr to be created for E[4].
2025-07-03 14:09:01 +02:00
Abid Qadeer
d56c06e6c9
[flang][debug] Generate DISubprogramAttr for omp::TargetOp. (#146532)
This is a combination of https://github.com/llvm/llvm-project/pull/138149
and https://github.com/llvm/llvm-project/pull/138039 which were opened
separately for ease of reviewing. The only other change is adjustments
to 2 tests which have gone in since.

There are `DeclareOp`s present for the variables mapped into the target
region. That allows us to generate debug information for them. But the
`TargetOp` is still part of the parent function and those variables get
the parent function's `DISubprogram` as a scope.
    
In `OMPIRBuilder`, a new function is created for the `TargetOp`. We also
create a new `DISubprogram` for it. All the variables that were in the
target region now have to be updated to have the correct scope. This
after-the-fact updating of debug information becomes very difficult in
certain cases. Take the example of variable arrays. The type of those
arrays depends on the artificial `DILocalVariable`(s) which hold the
size(s) of the array. This new function will now require that we
generate the new variables and new types. A similar issue exists for
character type variables too.
    
To avoid this after-the-fact updating, this PR generates a
`DISubprogramAttr` for the `TargetOp` while generating the debug info in
`flang`. Then we don't need to generate a `DISubprogram` in
`OMPIRBuilder`. This change is made a bit more complicated by the
fact that in the new scheme, the debug location already points to the new
`DISubprogram` by the time it reaches `convertOmpTarget`. But we need
some code generation in the parent function so we have to carefully
manage the debug locations.
    
This fixes issue `#134991`.
2025-07-03 10:38:28 +01:00
KAWASHIMA Takahiro
d67013a2b4
[Flang][AArch64][VecLib] Add libmvec support for Flang/AArch64 (#146453)
`-fveclib=libmvec` for AArch64 (NEON and SVE) in Clang was supported by
#143696. This patch does the same for Flang.

Vector functions defined in `libmvec` are currently used for the
following Fortran operator and functions.

- Power operator (`**`)
- Fortran intrinsic functions listed below for `real(kind=4)` and
`real(kind=8)` (including their corresponding specific intrinsic
functions)
- Fortran intrinsic functions which are expanded using functions listed
below (for example, `sin` for `complex(kind=8)`)

```
sin
tan
cos
asin
acos
atan (both atan(x) and atan(y, x))
atan2
cosh
tanh
asinh
acosh
atanh
erf
erfc
exp
log
log10
```

As with Clang/AArch64, glibc 2.40 or higher is required to use all these
functions.
2025-07-03 14:38:45 +09:00
Valentin Clement
e718ce0037
Revert "[flang][cuda] Do not create global for derived-type with allocatable device components (#146780)"
This reverts commit e873ce31ae0e875081c8e5480c9c4925c97469ce.
2025-07-02 17:51:55 -07:00
Valentin Clement
a5350785db
Revert "[flang][cuda] Allocate derived-type with CUDA component in managed memory (#146797)"
This reverts commit 925588cd001a91d592b99e6e7c6bee9514f5a26e.
2025-07-02 17:51:45 -07:00
Valentin Clement (バレンタイン クレメン)
925588cd00
[flang][cuda] Allocate derived-type with CUDA component in managed memory (#146797)
Similarly to descriptor for device data, put derived type holding device
descriptor in managed memory.
2025-07-02 16:02:08 -07:00
Valentin Clement (バレンタイン クレメン)
e873ce31ae
[flang][cuda] Do not create global for derived-type with allocatable device components (#146780)
Derived types with CUDA device allocatable components will be handled
via CUDA allocation. Do not create globals for them.
2025-07-02 15:43:09 -07:00