2779 Commits

Author SHA1 Message Date
Peter Klausler
bbcdad1f8e
[flang][runtime] MCLOCK library routine (#148960)
Add MCLOCK as an interface to std::clock().
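
As a rough illustration of what an interface to std::clock() can look like, here is a minimal sketch. The name `MclockSketch` and the choice of a 64-bit return value are assumptions made for this example; they are not taken from the flang runtime.

```cpp
#include <cstdint>
#include <ctime>

// Hypothetical wrapper (not the flang-rt entry point): returns the processor
// time used by the program so far, in ticks of 1/CLOCKS_PER_SEC seconds.
// std::clock() yields (std::clock_t)(-1) when that time is unavailable,
// which is reported here as -1.
std::int64_t MclockSketch() {
  const std::clock_t ticks{std::clock()};
  if (ticks == static_cast<std::clock_t>(-1)) {
    return -1;
  }
  return static_cast<std::int64_t>(ticks);
}
```

Dividing the result by CLOCKS_PER_SEC converts it to seconds.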
2025-07-16 09:10:07 -07:00
Peter Klausler
52a46dc57f
[flang] Allow -fdefault-integer-8 with defined I/O (#148927)
Defined I/O subroutines have UNIT= and IOSTAT= dummy arguments that are
required to have type INTEGER with its default kind. When that default
kind is modified via -fdefault-integer-8, calls to defined I/O
subroutines from the runtime don't work.

Add a flag to the two data structures shared between the compiler and
the runtime support library to indicate that a defined I/O subroutine
was compiled under -fdefault-integer-8. This has been done in a
compatible manner, so that existing binaries are compatible with the new
library and new binaries are compatible with the old library, unless of
course -fdefault-integer-8 is used.

Fixes https://github.com/llvm/llvm-project/issues/148638.
2025-07-16 09:09:49 -07:00
Krzysztof Parzyszek
51b6f64b89
[flang][OpenMP] Avoid unnecessary parsing of OpenMP constructs (#148629)
When parsing a specification part, the parser will look ahead to see if
the next construct is an executable construct. In doing so it will
invoke OpenMPConstruct parser, whereas the only necessary thing to check
would be the directive alone.
2025-07-15 09:05:56 -05:00
Valentin Clement (バレンタイン クレメン)
90ef114a33
[flang][cuda] Add cuf.set_allocator_idx for device component (#148750)
2025-07-14 19:31:44 -07:00
Valentin Clement (バレンタイン クレメン)
2c6771889a
[flang][cuda] Introduce cuf.set_allocator_idx operation (#148717)
2025-07-14 17:23:18 -07:00
Valentin Clement (バレンタイン クレメン)
5eecec8e81
[flang] Fix use of __has_builtin and formatting (#148746)
`__has_builtin` is not available on all compilers. Make sure it works
when not defined.
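
A common way to make `__has_builtin` safe on compilers that lack it is to define a fallback before first use. The sketch below shows that general pattern; the helper `LeadingZeroBitCount` is a hypothetical example, not code from this patch.

```cpp
// Compilers without __has_builtin would reject "#if __has_builtin(x)",
// so provide a fallback that reports every builtin as unavailable.
#ifndef __has_builtin
#define __has_builtin(x) 0
#endif

#include <cstdint>

// Hypothetical helper: number of leading zero bits in a 64-bit value.
int LeadingZeroBitCount(std::uint64_t x) {
#if __has_builtin(__builtin_clzll)
  // __builtin_clzll is undefined for 0, so handle that case separately.
  return x == 0 ? 64 : __builtin_clzll(x);
#else
  // Portable fallback: scan from the most significant bit downwards.
  int count{0};
  for (std::uint64_t mask{std::uint64_t{1} << 63};
       mask != 0 && (x & mask) == 0; mask >>= 1) {
    ++count;
  }
  return count;
#endif
}
```

Both branches return the same values, so callers do not need to know which one was compiled.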

Also fix some formatting issues: 
- Use brace initialization where possible
- Fix wrong capitalization of variables.
- Add `std::` for `uint64_t` and `int64_t` as it is mostly done in this
part of the codebase.
2025-07-14 17:21:09 -07:00
Razvan Lupusoru
c4fc358156
[flang][acc][nfc] Move FIROpenACCSupport to Support subfolder (#148710)
In order to prepare for adding FIROpenACCTransforms, move the FIR
OpenACC support library to its own subfolder.
2025-07-14 13:42:20 -07:00
Peter Klausler
69f38443e5
[flang] Extension: TRANSFER(boz, integer or real scalar) (#147604)
Interpret TRANSFER(SOURCE=BOZ literal, MOLD=integer or real scalar) as
if it had been a reference to the standard INT or REAL intrinsic
functions, for which a BOZ literal is an acceptable argument, with a
warning about non-conformance. It's a needless extension that has
somehow crept into some other compilers and real applications.
2025-07-14 11:12:50 -07:00
Peter Klausler
4dceb25dd1
[flang] Don't create bogus tokens from token pasting (##) (#147596)
When blank tokens arise from macro replacement in token sequences with
token pasting (##), the preprocessor is producing some bogus tokens
(e.g., "name(") that can lead to subtle bugs later when macro names are
not recognized as such.

The fix is to not paste tokens together when the result would not be a
valid Fortran or C token in the preprocessing context.
2025-07-14 11:11:43 -07:00
Razvan Lupusoru
b54cfa46a7
[flang][acc] Implement MappableType's generatePrivateInit (#148302)
The recipe body generation was moved from lowering into FIR's
implementation of MappableType API. And now since all Fortran variable
types implement this type, lowering of OpenACC was updated to use this
API directly. No test changes were needed - all of the private,
firstprivate, and recipe tests get the same body as before.
2025-07-14 10:53:54 -07:00
Slava Zakharin
4775b96898
[flang] Optimize redundant array repacking. (#147881)
This patch allows optimizing redundant array repacking, when
the source array is statically known to be contiguous.
This is part of the implementation plan for the array repacking
feature, though, it does not affect any real life use case
as long as FIR inlining is not a thing. I experimented with
simple cases of FIR inlining using `-inline-all`, and I recorded
these cases in optimize-array-repacking.fir tests.
2025-07-14 09:41:42 -07:00
Slava Zakharin
fc99ef7411
[flang] Allow embox's source_box to be a !fir.box. (#148305)
In order to create temporary copies of assumed-type arrays
(e.g. for `-frepack-arrays`), we have to allow the source_box
to be a !fir.box.

This patch replaces #147618.
2025-07-14 09:40:42 -07:00
David Spickett
29d8c346c5
Reland "[flang] Avoid undefined behaviour when parsing format expressions (#147539)" (#148169)
This reverts commit e8e5d07767c444913f837dd35846a92fcf520eab.

This previously failed because the flang-rt build could not find the 
llvm header file. It passed on some machines but only because they
had globally installed copies of older llvm.

To fix this, I've copied the required routines from llvm into flang.

With the following justification:
* Flang can, and does, use llvm headers.
* Some Flang headers are also used in Flang-rt.
* Flang-rt cannot use llvm headers.
* Therefore any Flang header used in Flang-rt cannot use llvm headers
either.

To support that conclusion,
https://flang.llvm.org/docs/IORuntimeInternals.html
states:
"The Fortran I/O runtime support library is written in C++17, and uses
some C++17 standard library facilities, but it is intended to not have
any link-time dependences on the C++ runtime support library or any LLVM
libraries."

This talks about libraries but I assume it applies to llvm in general.

Nothing in flang/include/flang/Common, or flang/include/flang/Common
includes any llvm header, and I see some very similar headers there
that duplicate llvm functionality. Like float128.h.

I can only assume this means these files must remain free of dependencies
on LLVM.

I have copied the two routines literally and put them in the flang::common
namespace, for lack of a better place for them, so that they don't clash
with something.

I have specialised the function to the 1 type flang needs, as it might
save a bit of compile time.
2025-07-14 10:14:04 +01:00
Krzysztof Parzyszek
61a9d2c22d
[flang][OpenMP] Use OmpDirectiveSpecification in DISPATCH (#148008)
Dispatch is the last construct (after ATOMIC and ALLOCATORS) where the
associated block requires a specific form.
Using OmpDirectiveSpecification for the begin and the optional end
directives will make the structure of all block directives more uniform.
2025-07-11 07:28:54 -05:00
Krzysztof Parzyszek
638943b27e
[flang][OpenMP] Convert AST node for ALLOCATORS to use Block as body (#148005)
The ALLOCATORS construct is one of the few constructs that require a
special form of the associated block.
Convert the AST node to use OmpDirectiveSpecification for the directive
and the optional end directive, and to use parser::Block as the body:
the form of the block is checked in the semantic checks (with a more
meaningful message).
2025-07-11 06:45:11 -05:00
Kareem Ergawy
a510e75949
[flang][fir] Small clean-up in fir_DoConcurrentLoopOp's definition (#146028)
Re-organizes the op definition a little bit and removes a method that
does not add much value to the API.

PR stack:
- https://github.com/llvm/llvm-project/pull/145837
- https://github.com/llvm/llvm-project/pull/146025
- https://github.com/llvm/llvm-project/pull/146028 (this one)
- https://github.com/llvm/llvm-project/pull/146033
2025-07-11 08:30:36 +02:00
Kareem Ergawy
7c8a197918
[NFC][flang] Move ReductionProcessor to Lower/Support. (#146025)
With #145837, the `ReductionProcessor` component is now used by both
OpenMP and `do concurrent`. Therefore, this PR moves it to a shared
location: `flang/Lower/Support`.

PR stack:
- https://github.com/llvm/llvm-project/pull/145837
- https://github.com/llvm/llvm-project/pull/146025 (this one)
- https://github.com/llvm/llvm-project/pull/146028
- https://github.com/llvm/llvm-project/pull/146033
2025-07-11 07:42:51 +02:00
Kareem Ergawy
eba35cc1c0
[flang][do concurrent] Re-model reduce to match how reductions are modelled in OpenMP and OpenACC (#145837)
This PR proposes re-modelling `reduce` specifiers to match OpenMP and
OpenACC. In particular, this PR includes the following:

* A new `fir` op: `fir.declare_reduction` which is identical to OpenMP's
`omp.declare_reduction` op.
* Updating the `reduce` clause on `fir.do_concurrent.loop` to use the
new op.
* Re-uses the `ReductionProcessor` component to emit reductions for `do
concurrent` just like we do for OpenMP. To do this, the
`ReductionProcessor` had to be refactored to be more generalized.
* Updates mapping `do concurrent` to `fir.loop ... unordered` nests using
the new reduction model.

Unfortunately, this is a big PR that would be difficult to divide up in
smaller parts because the bottom of the changes are the `fir` table-gen
changes to `do concurrent`. However, doing these MLIR changes cascades
to the other parts that have to be modified to not break things.

This PR goes in the same direction we went for `private/local`
specifiers. Now the `do concurrent` and OpenMP (and OpenACC) dialects
are modelled in essentially the same way which makes mapping between
them more trivial, hopefully.

PR stack:
- https://github.com/llvm/llvm-project/pull/145837 (this one)
- https://github.com/llvm/llvm-project/pull/146025
- https://github.com/llvm/llvm-project/pull/146028
- https://github.com/llvm/llvm-project/pull/146033
2025-07-11 06:39:30 +02:00
Valentin Clement (バレンタイン クレメン)
9a0e03f430
[flang][cuda] Update implicit data transfer for device component (#147882)
Update the detection of implicit data transfer when a device resident
allocatable derived-type component is involved and remove the TODOs.
2025-07-10 09:50:31 -07:00
Krzysztof Parzyszek
9b0ae6ccd6
[flang][OpenMP] Issue a warning when parsing future directive spelling (#147765)
OpenMP 6.0 introduced alternative spelling for some directives, with the
previous spellings still allowed.

Warn the user when a new spelling is encountered with OpenMP version set
to an older value.
2025-07-10 09:57:03 -05:00
agozillon
75f81ded8f
[Flang][FlangRT][Runtime] Add RT_OFFLOAD_API_GROUP_BEGIN to missing symbols on AMDGPU (#147612)
After the recent move to work queues, linking in the Fortran runtime built
for offload on AMDGPU can, in certain cases, fail with missing symbols.
This PR tries to address this issue by encompassing more of the library in
RT_OFFLOAD_API_GROUP_BEGIN, which has the effect of compiling these
functions for AMDGPU, resolving the missing symbols.

This PR should address the following issue:
https://github.com/llvm/llvm-project/issues/145888
2025-07-10 13:19:58 +02:00
Vijay Kandiah
c4138a24dc
[mlir][acc][flang] Lower nested ACC loops with tile clause as collapsed loops (#147801)
In the case of nested loops, `acc.loop` is meant to subsume all of the
loops that it applies to (when explicitly described as doing so in the
OpenACC specification). So when there is a `acc loop tile(...)` present
on nested Fortran DO loops, `acc.loop` should apply to the `n` loops
that `tile` applies to. This change lowers such nested Fortran loops
with tile clause into a collapsed `acc.loop` with `n` IVs, loop bounds,
and step, in a similar fashion to the current lowering for acc loops
with `collapse` clause.
2025-07-09 15:47:11 -05:00
Andre Kuhlenschmidt
fc9dd58734
[flang][driver] add -Wfatal-errors (#147614)
Adds the flag `-Wfatal-errors` which truncates the error messages at 1 error.
2025-07-09 12:35:43 -07:00
David Spickett
e8e5d07767
Revert "[flang] Avoid undefined behaviour when parsing format expressions (#147539)"
This reverts commit d0caf0d4857c2b00ba988f86703663685ec8697f.

MathExtras.h is not found in some builds.
2025-07-09 14:49:58 +00:00
David Spickett
d0caf0d485
[flang] Avoid undefined behaviour when parsing format expressions (#147539)
The test flang/test/Semantics/io08.f90 was failing when UBSAN was
enabled:
```
/home/david.spickett/llvm-project/flang/include/flang/Common/format.h:224:26: runtime error: signed integer overflow: 10 * 987654321098765432 cannot be represented in type 'int64_t' (aka 'long')
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior /home/david.spickett/llvm-project/flang/include/flang/Common/format.h:224:26
```
This is because the code was effectively:
* Take the risk of UB happening
* Check whether it happened or not

Which UBSAN is obviously not going to like. Instead of checking after
the fact, use llvm's helpers that catch overflow without actually doing
it.
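
To illustrate the idea of checking before the arithmetic rather than after it, here is a minimal sketch. It does not use the LLVM helpers mentioned above; it swaps in a division-based pre-check, and the function name `ParseNonNegative` is a hypothetical example.

```cpp
#include <cstdint>
#include <limits>
#include <optional>
#include <string_view>

// Hypothetical helper: parse a non-negative decimal integer, returning
// std::nullopt on an empty string, a non-digit, or a value too large for
// int64_t. The overflow test runs before the multiply-add, so the signed
// overflow itself never happens.
std::optional<std::int64_t> ParseNonNegative(std::string_view text) {
  constexpr std::int64_t max{std::numeric_limits<std::int64_t>::max()};
  if (text.empty()) {
    return std::nullopt;
  }
  std::int64_t value{0};
  for (char ch : text) {
    if (ch < '0' || ch > '9') {
      return std::nullopt;
    }
    const std::int64_t digit{ch - '0'};
    // 10 * value + digit <= max  is equivalent to  value <= (max - digit) / 10
    if (value > (max - digit) / 10) {
      return std::nullopt;
    }
    value = 10 * value + digit;
  }
  return value;
}
```

With this shape, an input such as 9876543210987654321 is rejected at the last digit instead of triggering the UBSAN report shown above.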
2025-07-09 15:31:45 +01:00
Maksim Levental
1770e9b5c6
[mlir] remove dangling builders from td (#147619)
These are "dangling" builders (decls are emitted but there are no defns
anywhere).
2025-07-09 09:59:24 -04:00
Shunsuke Watanabe
c9900015a9
[flang] Add -fcomplex-arithmetic= option and select complex division algorithm (#146641)
This patch adds an option to select the method for computing complex
number division. It uses `LoweringOptions` to determine whether to lower
complex division to a runtime function call or to MLIR's `complex.div`,
and `CodeGenOptions` to select the computation algorithm for
`complex.div`. The available option values and their corresponding
algorithms are as follows:
- `full`: Lower to a runtime function call. (Default behavior)
- `improved`: Lower to `complex.div` and expand to Smith's algorithm.
- `basic`: Lower to `complex.div` and expand to the algebraic algorithm.
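
For readers unfamiliar with the two expanded algorithms, the following sketch shows the textbook algebraic formula and Smith's algorithm on `std::complex<double>`. It illustrates the math only; the function names are hypothetical and this is not the code that the `complex.div` expansion generates.

```cpp
#include <cmath>
#include <complex>

// Algebraic form:
// (a+bi)/(c+di) = ((ac+bd) + (bc-ad)i) / (c^2+d^2)
std::complex<double> DivideAlgebraic(
    std::complex<double> x, std::complex<double> y) {
  const double a{x.real()}, b{x.imag()}, c{y.real()}, d{y.imag()};
  const double denom{c * c + d * d};
  return {(a * c + b * d) / denom, (b * c - a * d) / denom};
}

// Smith's algorithm: divide through by the larger-magnitude component of
// the divisor so that c*c and d*d are never formed directly.
std::complex<double> DivideSmith(
    std::complex<double> x, std::complex<double> y) {
  const double a{x.real()}, b{x.imag()}, c{y.real()}, d{y.imag()};
  if (std::fabs(c) >= std::fabs(d)) {
    const double r{d / c};
    const double denom{c + d * r};
    return {(a + b * r) / denom, (b - a * r) / denom};
  }
  const double r{c / d};
  const double denom{c * r + d};
  return {(a * r + b) / denom, (b * r - a) / denom};
}
```

The algebraic form is shorter, but its `c * c + d * d` term can overflow or underflow for divisors with very large or very small components; Smith's scaling avoids forming those squares.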

See also the discussion in the following discourse post:
https://discourse.llvm.org/t/optimization-of-complex-number-division/83468

---------

Co-authored-by: Tarun Prabhu <tarunprabhu@gmail.com>
2025-07-09 13:43:54 +09:00
Valentin Clement (バレンタイン クレメン)
46caad52ac
[flang][cuda] Do not produce data transfer in offloaded do concurrent (#147435)
If a `do concurrent` loop is offloaded then there should be no CUDA data
transfer in it. Update the semantic and lowering to take that into
account.

`AssignmentChecker` has to be put into a separate pass because the
checkers in `SemanticsVisitor` cannot have the same `Enter/Leave`
functions. The `DoForallChecker` already has `Enter/Leave` functions
for the `DoConstruct`.
2025-07-08 10:52:15 -07:00
David Spickett
31786ee89f
[flang] Avoid undefined behaviour in Interval::Contains (#147505)
If the size of the other Interval was 0, (that.size_ - 1) would wrap
below zero.

I've fixed this so that a zero size interval A is within interval B if
the start of A is within B. There are a few ways you could handle
zero-sized intervals in theory, but this one passes all tests so I assume
it's the intention.
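
The rule described above can be sketched on a simplified interval type. `IntervalSketch` is a hypothetical stand-in that uses plain `std::size_t` offsets; it is not flang's `Interval` template.

```cpp
#include <cstddef>

// Hypothetical simplified interval covering [start, start + size).
struct IntervalSketch {
  std::size_t start{0};
  std::size_t size{0};

  // A single position is contained if it lies in [start, start + size).
  bool Contains(std::size_t x) const {
    return x >= start && x - start < size;
  }

  // Another interval is contained if both of its end points are. For an
  // empty interval, "size - 1" would wrap around, so only its start is
  // checked in that case.
  bool Contains(const IntervalSketch &that) const {
    if (that.size == 0) {
      return Contains(that.start);
    }
    return Contains(that.start) && Contains(that.start + (that.size - 1));
  }
};
```

Under this rule an empty interval positioned at the one-past-the-end offset of B is not considered to be inside B.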

This fixes the following tests when ubsan is enabled:
  Flang :: Lower/OpenMP/PFT/sections-pft.f90
  Flang :: Lower/OpenMP/derived-type-allocatable.f90
  Flang :: Lower/OpenMP/privatization-proc-ptr.f90
  Flang :: Lower/OpenMP/sections.f90
  Flang :: Parser/OpenMP/sections.f90
  Flang :: Semantics/OpenMP/clause-validity01.f90
  Flang :: Semantics/OpenMP/if-clause.f90
  Flang :: Semantics/OpenMP/parallel-sections01.f90
  Flang :: Semantics/OpenMP/private-assoc.f90
2025-07-08 14:39:08 +01:00
David Spickett
d889a7485f
[flang] Avoid UB in CharBlock Compare to C string (#147329)
The behaviour of strncmp is undefined if either string pointer is null
(https://en.cppreference.com/w/cpp/string/byte/strncmp.html).

I've copied the logic over from Compare to another CharBlock, which had
code to avoid UB in memcmp.
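
A minimal sketch of this kind of guard follows. The function `CompareSketch` and its exact ordering rules are assumptions made for illustration (it also assumes `block` is non-null whenever `blockSize` is non-zero); it is not the CharBlock implementation.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical three-way comparison of a (pointer, length) block of
// characters with a C string. Null pointers are treated as empty strings,
// and memcmp is only called when there is at least one byte to compare,
// so it never receives a null pointer.
int CompareSketch(const char *block, std::size_t blockSize, const char *cstr) {
  const std::size_t cstrSize{cstr ? std::strlen(cstr) : 0};
  const std::size_t common{blockSize < cstrSize ? blockSize : cstrSize};
  if (common > 0) {
    const int cmp{std::memcmp(block, cstr, common)};
    if (cmp != 0) {
      return cmp < 0 ? -1 : 1;
    }
  }
  // The common prefix matches; the shorter operand orders first.
  if (blockSize < cstrSize) {
    return -1;
  }
  return blockSize > cstrSize ? 1 : 0;
}
```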

The test Preprocessing/kind-suffix.F90 was failing with UBSAN enabled,
and now passes.
2025-07-08 08:52:36 +01:00
Valentin Clement (バレンタイン クレメン)
659c8102f4
Reland [flang][cuda] Allocate derived-type with CUDA component in managed memory (#147416)
2025-07-07 17:40:04 -07:00
Valentin Clement (バレンタイン クレメン)
07cc7ea7d5
Reland [flang][cuda] Do not create global for derived-type with allocatable device components (#147402)
Reviewed in #146780

Derived types with CUDA device allocatable components will be handled via
CUDA allocation. Do not create globals for them.
2025-07-07 15:19:41 -07:00
Valentin Clement
e718ce0037
Revert "[flang][cuda] Do not create global for derived-type with allocatable device components (#146780)"
This reverts commit e873ce31ae0e875081c8e5480c9c4925c97469ce.
2025-07-02 17:51:55 -07:00
Valentin Clement (バレンタイン クレメン)
e873ce31ae
[flang][cuda] Do not create global for derived-type with allocatable device components (#146780)
Derived types with CUDA device allocatable components will be handled via
CUDA allocation. Do not create globals for them.
2025-07-02 15:43:09 -07:00
Kareem Ergawy
b1774222c7
[flang] Emit fir.global in the global address space (#146653)
Instead of emitting globals in the program/default address space, emit
them in the global address space. This also requires changes to how
address-of code-gen is handled: we need to cast to the default address
space to prevent code-gen issues.
2025-07-02 17:15:22 +02:00
Jack Styles
65cb0eae58
[Flang][OpenMP] Add Semantics support for Nested OpenMPLoopConstructs (#145917)
In OpenMP Version 5.1, the tile and unroll directives were added. When
using these directives, it is possible to nest them within other OpenMP
Loop Constructs. This patch enables the semantics to allow for this
behaviour on these specific directives. Any nested loops will be stored
within the initial Loop Construct until reaching the DoConstruct itself.

Relevant tests have been added, and previous behaviour has been retained
with no changes.

See also, #110008
2025-07-01 08:39:15 +01:00
Peter Klausler
a93d843ab3
[flang] Don't warn on (0.,0.)**(nonzero noninteger) (#145179)
Folding hands complex exponentiations with constant arguments off to the
native libm, and on at least one host, this can produce spurious warnings
about division by zero and invalid arguments. Handle the case of a zero
base specially to avoid that, and also emit better warnings for the
undefined 0.**0 and (0.,0.)**0 cases. And add a test for these warnings
and the existing related ones.
2025-06-30 10:21:37 -07:00
Andre Kuhlenschmidt
83b462af17
[flang][CLI] Have the CLI hint the flag to disable a warning (#144767)
Adds a hint to the warning message to disable a warning and updates the
tests to expect this.

Also fixes a bug in the storage of canonical spelling of error flags so
that they are not used after free.
2025-06-30 10:17:05 -07:00
jeanPerier
faefe7cf7d
[flang] add option to generate runtime type info as external (#146071)
Reland #145901 with a fix for shared library builds.

So far flang generates runtime derived type info global definitions (as
opposed to declarations) for all the types used in the current
compilation unit even when the derived types are defined in other
compilation units. It uses linkonce_odr to achieve the derived type
descriptor address "uniqueness" needed to match two derived types
inside the runtime.

This comes at a big compile time cost because of all the extra globals
and their definitions in apps with many and complex derived types.

This patch adds an experimental option to only generate the rtti
definition for the types defined in the current compilation unit and to
only generate external declaration for the derived type descriptor
object of types defined elsewhere.

Note that objects compiled with this option are not compatible with
object files compiled without because files compiled without it may drop
the rtti for type they defined if it is not used in the compilation unit
because of the linkonce_odr aspect.

I am adding the option so that we can better measure the extra cost of
the current approach on apps and allow speeding up some compilation
where devirtualization does not matter (and the build config links to
all module file object anyway).
2025-06-30 09:58:00 +02:00
Kazu Hirata
c57c5f53a3
[flang] Fix warnings
This patch fixes:

  flang/../mlir/include/mlir/IR/TypeRange.h:51:19: error: 'ArrayRef'
  is deprecated: Use {} or ArrayRef<T>() instead
  [-Werror,-Wdeprecated-declarations]

  flang/../mlir/include/mlir/IR/ValueRange.h:401:20: error: 'ArrayRef'
  is deprecated: Use {} or ArrayRef<T>() instead
  [-Werror,-Wdeprecated-declarations]
2025-06-28 12:55:22 -07:00
Krzysztof Parzyszek
344b5b7f9e
[flang][OpenMP] Move lowering of ATOMIC to separate file, NFC (#146225)
Reinstate commits e5559ca4 and 925dbc79. Fix the issues with compilation
hangs by including DenseMapInfo specialization where the corresponding
instance of DenseMap was defined.

Ref: https://github.com/llvm/llvm-project/pull/144960
2025-06-28 13:38:00 -05:00
Valentin Clement (バレンタイン クレメン)
75175e7230
[flang][cuda] Inline this_thread_block() calls (#146144)
2025-06-27 14:59:29 -07:00
Valentin Clement (バレンタイン クレメン)
b2f504ff15
[flang][cuda] Inline this_warp() calls (#146134)
2025-06-27 14:12:17 -07:00
jeanPerier
37e2d10499
Revert "[flang] add option to generate runtime type info as external" (#146064)
Reverts llvm/llvm-project#145901

Broke shared library builds because of the usage of
`skipExternalRttiDefinition` in Lowering.
2025-06-27 14:05:59 +02:00
jeanPerier
e816817bbb
[flang] add option to generate runtime type info as external (#145901)
So far flang generates runtime derived type info global definitions (as
opposed to declarations) for all the types used in the current
compilation unit even when the derived types are defined in other
compilation units. It uses linkonce_odr to achieve the derived type
descriptor address "uniqueness" needed to match two derived types
inside the runtime.

This comes at a big compile time cost because of all the extra globals
and their definitions in apps with many and complex derived types.

This patch adds an experimental option to only generate the rtti
definition for the types defined in the current compilation unit and to
only generate external declaration for the derived type descriptor
object of types defined elsewhere.

Note that objects compiled with this option are not compatible with
object files compiled without because files compiled without it may drop
the rtti for type they defined if it is not used in the compilation unit
because of the linkonce_odr aspect.

I am adding the option so that we can better measure the extra cost of
the current approach on apps and allow speeding up some compilation
where devirtualization does not matter (and the build config links to
all module file object anyway).
2025-06-27 13:00:29 +02:00
jeanPerier
b989c76f39
[flang][NFC] switch ValueRange(nullopt) to ValueRange{} after #146011 (#146043)
Clean up some std::nullopt usages in the FIR ops builder that trigger a
deprecation warning after #146011.
2025-06-27 12:49:34 +02:00
Kazu Hirata
938cdb30f1
[flang] Migrate away from std::nullopt (NFC) (#145928)
ArrayRef has a constructor that accepts std::nullopt.  This
constructor dates back to the days when we still had llvm::Optional.

Since the use of std::nullopt outside the context of std::optional is
kind of abuse and not intuitive to newcomers, I would like to move
away from the constructor and eventually remove it.

This patch replaces std::nullopt with {}.  There are a couple of
places where std::nullopt is replaced with TypeRange() to accommodate
perfect forwarding.
2025-06-26 12:41:49 -07:00
Andre Kuhlenschmidt
283c2e8d7c
[flang][semantics] fix issue with equality of min/max in module files (#145824)
Convert all binary calls of min/max to extremum operations, so that
extremums generated by the compiler compare equal, and user min/max
calls also compare equal.

Fixes #133646

Originally opened as #144162 but I accidentally pushed a merge in such a
way that a bunch of code owners got added to the review. This is just
rebasing the original work on main and fixing the failing tests.
2025-06-26 12:15:57 -07:00
Valentin Clement (バレンタイン クレメン)
2b2bd51f3b
[flang][cuda] Inline this_grid call for cooperative groups (#145796)
2025-06-25 16:40:47 -07:00
Krzysztof Parzyszek
77a3ae5845
[flang][OpenMP] Remove recognition of versions 3.0 and older (#145708)
The oldest supported version is now 3.1. In terms of semantic analysis
the compiler treats all versions <= 4.5 identically, and there is no
plan to add version-specific checks for older versions.

See discourse thread:

https://discourse.llvm.org/t/rfc-remove-openmp-versions-prior-to-3-1/86901
2025-06-25 10:20:52 -05:00