2791 Commits

Author SHA1 Message Date
jeanPerier
faefe7cf7d
[flang] add option to generate runtime type info as external (#146071)
Reland #145901 with a fix for shared library builds.

So far flang generates runtime derived type info global definitions (as
opposed to declarations) for all the types used in the current
compilation unit even when the derived types are defined in other
compilation units. It uses linkonce_odr to achieve the derived type
descriptor address "uniqueness" needed to match two derived types
inside the runtime.

This comes at a big compile time cost because of all the extra globals
and their definitions in apps with many and complex derived types.

This patch adds an experimental option to only generate the rtti
definition for the types defined in the current compilation unit and to
only generate external declaration for the derived type descriptor
object of types defined elsewhere.

Note that object files compiled with this option are not compatible
with object files compiled without it: because of the linkonce_odr
linkage, files compiled without it may drop the rtti for types they
define if it is not used in the compilation unit.

I am adding the option so that we can better measure the extra cost of
the current approach on apps and allow speeding up some compilation
where devirtualization does not matter (and the build config links to
all module file objects anyway).
2025-06-30 09:58:00 +02:00
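The trade-off described in #146071 has a rough C++ analogy, sketched below. This is not flang's generated IR: `TypeInfo`, `vectorTypeInfo`, `pointTypeInfo`, and `sameType` are hypothetical names, a C++17 `inline` variable stands in for a linkonce_odr definition, and an `extern` declaration stands in for the external declaration emitted under the new option.

```cpp
#include <cassert>

// Hypothetical miniature of a derived type descriptor.
struct TypeInfo {
  const char *name;
};

// Current scheme, roughly: every compilation unit that uses the type carries
// a mergeable definition, and the linker keeps a single copy.
inline const TypeInfo vectorTypeInfo{"vector"};

// Scheme under the option, roughly: units that only use the type carry a
// declaration...
extern const TypeInfo pointTypeInfo;

// Types are matched by descriptor address, so the address must be unique.
bool sameType(const TypeInfo *a, const TypeInfo *b) { return a == b; }

// ...and only the unit that defines the type carries the definition.
const TypeInfo pointTypeInfo{"point"};

// Usage: address comparison distinguishes the two descriptors.
void checkDescriptorIdentity() {
  assert(sameType(&pointTypeInfo, &pointTypeInfo));
  assert(!sameType(&pointTypeInfo, &vectorTypeInfo));
}
```

In the declaration-only scheme the link fails if no unit provides the definition, which mirrors the incompatibility the commit message warns about.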
Kazu Hirata
c57c5f53a3
[flang] Fix warnings
This patch fixes:

  flang/../mlir/include/mlir/IR/TypeRange.h:51:19: error: 'ArrayRef'
  is deprecated: Use {} or ArrayRef<T>() instead
  [-Werror,-Wdeprecated-declarations]

  flang/../mlir/include/mlir/IR/ValueRange.h:401:20: error: 'ArrayRef'
  is deprecated: Use {} or ArrayRef<T>() instead
  [-Werror,-Wdeprecated-declarations]
2025-06-28 12:55:22 -07:00
Krzysztof Parzyszek
344b5b7f9e
[flang][OpenMP] Move lowering of ATOMIC to separate file, NFC (#146225)
Reinstate commits e5559ca4 and 925dbc79. Fix the issues with compilation
hangs by including DenseMapInfo specialization where the corresponding
instance of DenseMap was defined.

Ref: https://github.com/llvm/llvm-project/pull/144960
2025-06-28 13:38:00 -05:00
Valentin Clement (バレンタイン クレメン)
75175e7230
[flang][cuda] Inline this_thread_block() calls (#146144)
2025-06-27 14:59:29 -07:00
Valentin Clement (バレンタイン クレメン)
b2f504ff15
[flang][cuda] Inline this_warp() calls (#146134)
2025-06-27 14:12:17 -07:00
jeanPerier
37e2d10499
Revert "[flang] add option to generate runtime type info as external" (#146064)
Reverts llvm/llvm-project#145901

Broke shared library builds because of the usage of
`skipExternalRttiDefinition` in Lowering.
2025-06-27 14:05:59 +02:00
jeanPerier
e816817bbb
[flang] add option to generate runtime type info as external (#145901)
So far flang generates runtime derived type info global definitions (as
opposed to declarations) for all the types used in the current
compilation unit even when the derived types are defined in other
compilation units. It uses linkonce_odr to achieve the derived type
descriptor address "uniqueness" needed to match two derived types
inside the runtime.

This comes at a big compile time cost because of all the extra globals
and their definitions in apps with many and complex derived types.

This patch adds an experimental option to only generate the rtti
definition for the types defined in the current compilation unit and to
only generate external declaration for the derived type descriptor
object of types defined elsewhere.

Note that object files compiled with this option are not compatible
with object files compiled without it: because of the linkonce_odr
linkage, files compiled without it may drop the rtti for types they
define if it is not used in the compilation unit.

I am adding the option so that we can better measure the extra cost of
the current approach on apps and allow speeding up some compilation
where devirtualization does not matter (and the build config links to
all module file objects anyway).
2025-06-27 13:00:29 +02:00
jeanPerier
b989c76f39
[flang][NFC] switch ValueRange(nullopt) to ValueRange{} after #146011 (#146043)
Clean up some std::nullopt usages in the FIR ops builder that trigger a
deprecation warning after #146011.
2025-06-27 12:49:34 +02:00
Kazu Hirata
938cdb30f1
[flang] Migrate away from std::nullopt (NFC) (#145928)
ArrayRef has a constructor that accepts std::nullopt.  This
constructor dates back to the days when we still had llvm::Optional.

Since the use of std::nullopt outside the context of std::optional is
kind of abuse and not intuitive to newcomers, I would like to move
away from the constructor and eventually remove it.

This patch replaces std::nullopt with {}.  There are a couple of
places where std::nullopt is replaced with TypeRange() to accommodate
perfect forwarding.
2025-06-26 12:41:49 -07:00
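The perfect-forwarding caveat in #145928 can be shown without LLVM headers. The following is a minimal sketch, assuming a hypothetical `MiniRange` type as a stand-in for `TypeRange` and a hypothetical `buildOp` forwarding helper; neither is the real MLIR API.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for a range type such as TypeRange (not the MLIR class).
struct MiniRange {
  MiniRange() = default;
  MiniRange(const std::vector<int> &values)
      : data(values.data()), size(values.size()) {}
  const int *data = nullptr;
  std::size_t size = 0;
};

// Direct call: the parameter type is known, so `{}` value-initializes it.
std::size_t countDirect(MiniRange range) { return range.size; }

// Forwarding call: Args is deduced from the arguments, and a bare `{}` has
// no type to deduce, so callers must spell out MiniRange() instead.
template <typename... Args> std::size_t buildOp(Args &&...args) {
  return countDirect(std::forward<Args>(args)...);
}

// Usage: both spellings produce an empty range.
void checkEmptyRanges() {
  assert(countDirect({}) == 0);
  assert(buildOp(MiniRange()) == 0);
}
```

Writing `buildOp({})` would not compile, because template argument deduction cannot assign a type to a braced empty list.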
Andre Kuhlenschmidt
283c2e8d7c
[flang][semantics] fix issue with equality of min/max in module files (#145824)
Convert all binary calls of min/max to extremum operations, so that
extremums generated by the compiler compare equal, and user min/max
calls also compare equal.

Fixes #133646

Originally opened as #144162 but I accidentally pushed a merge in such a
way that a bunch of code owners got added to the review. This is just
rebasing the original work on main and fixing the failing tests.
2025-06-26 12:15:57 -07:00
Valentin Clement (バレンタイン クレメン)
2b2bd51f3b
[flang][cuda] Inline this_grid call for cooperative groups (#145796)
2025-06-25 16:40:47 -07:00
Krzysztof Parzyszek
77a3ae5845
[flang][OpenMP] Remove recognition of versions 3.0 and older (#145708)
The oldest supported version is now 3.1. In terms of semantic analysis
the compiler treats all versions <= 4.5 identically, and there is no
plan to add version-specific checks for older versions.

See discourse thread:

https://discourse.llvm.org/t/rfc-remove-openmp-versions-prior-to-3-1/86901
2025-06-25 10:20:52 -05:00
jeanPerier
22ee837ec0
[flang][NFC] do not copy fields in fir::RecordType::getTypeList (#145530)
For historical reasons, `fir::RecordType::getTypeList` was returning an
std::vector, causing the entire field list to be copied when called.

It is called a lot indirectly in all type helpers, which themselves are
called a lot in derived type heavy code like WRF.
The `fir::hasDynamicType` helper is also called a lot, and it can just
check for length parameters to avoid looping on all derived type
components in most cases.
2025-06-25 11:51:07 +02:00
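The cost addressed by #145530 comes from returning a container by value. A minimal sketch of the difference, assuming a hypothetical `MiniRecordType` class and a hypothetical `hasLengthParameters` helper (the real `fir::RecordType` and `fir::hasDynamicType` are not reproduced here):

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical miniature of a record type; not the real fir::RecordType.
class MiniRecordType {
public:
  using Field = std::pair<std::string, int>; // component name, type id

  MiniRecordType(std::vector<Field> fields, std::vector<Field> lenParams)
      : fields_(std::move(fields)), lenParams_(std::move(lenParams)) {}

  // Old style: returns by value, copying every field on each call.
  std::vector<Field> getTypeListCopy() const { return fields_; }

  // New style: returns a reference to the stored list, no copy.
  const std::vector<Field> &getTypeList() const { return fields_; }

  const std::vector<Field> &getLenParamList() const { return lenParams_; }

private:
  std::vector<Field> fields_;
  std::vector<Field> lenParams_;
};

// Cheap check that inspects only the length parameters instead of looping
// over all components.
bool hasLengthParameters(const MiniRecordType &type) {
  return !type.getLenParamList().empty();
}

// Usage: repeated calls to getTypeList refer to the same storage.
void checkNoCopy() {
  MiniRecordType type({{"x", 1}, {"y", 2}}, {});
  assert(&type.getTypeList() == &type.getTypeList());
  assert(!hasLengthParameters(type));
}
```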
Tom Eccles
8f7f48a97e
[flang][OpenMP][NFC] remove globals with mlir::StateStack (#144898)
Idea suggested by @skatrak
2025-06-24 18:30:37 +01:00
Krzysztof Parzyszek
fb209929e1
[flang][OpenMP] Set isNewBlock directly on OpenMP constructs (#144593)
When the PFT builder decides that an evaluation needs a new block it
checks if the evaluation has nested evaluations. In such case it sets
the flag on the first nested evaluation. This works under the assumption
that such an evaluation only serves as a container, and does not, by
itself, generate any code.

This fails for OpenMP constructs that contain nested evaluations because
the top-level evaluation does generate code that wraps the code from the
nested evaluations. In such cases, the code for the top-level evaluation
may be emitted in a wrong place.

When setting the `isNewBlock` flag, recognize OpenMP directives, and
treat them accordingly.

This fixes https://github.com/llvm/llvm-project/issues/139071
2025-06-23 08:09:50 -05:00
Peter Klausler
9fd22cb56d
[flang][NFC] Move new code to right place (#144551)
Some new code was added to flang/Semantics that only depends on
facilities in flang/Evaluate. Move it into Evaluate and clean up some
minor stylistic problems.
2025-06-19 13:42:46 -07:00
Slava Zakharin
8631b4f1b4
[flang] Set low probability for array repacking code. (#144830)
This allows LLVM to move the blocks that do the repacking, which are
most probably cold, out of line from the potentially hot code.
2025-06-19 12:12:04 -07:00
Andre Kuhlenschmidt
17f5b8b52a
[flang][driver] add ability to look up feature flags without setting them (#144559)
This just adds some convenience methods to feature control and rewrites
old code in terms of those methods. Also cleans up some names that I
just realized were overloads of another method.
2025-06-18 11:21:35 -07:00
Krzysztof Parzyszek
4084ffcf1e
[flang] Show types in DumpEvExpr (#143743)
When dumping evaluate::Expr, show type names which contain a lot of
useful information.

For example show
```
expr <Fortran::evaluate::SomeType> {
  expr <Fortran::evaluate::SomeKind<Fortran::common::TypeCategory::Integer>> {
    expr <Fortran::evaluate::Type<Fortran::common::TypeCategory::Integer, 4>> {
      ...
```
instead of
```
expr T {
  expr T {
    expr T {
      ...
```
2025-06-18 11:31:03 -05:00
Slava Zakharin
70343c8d44
[mlir][flang] Added Weighted[Region]BranchOpInterface's. (#142079)
The new interfaces provide getters and setters for the weight
information about the branches of BranchOpInterface and
RegionBranchOpInterface operations.

These interfaces are done the same way as LLVM dialect's
BranchWeightOpInterface.

The plan is to produce this information in Flang, e.g. mark
most probably "cold" code as such and allow LLVM to order
basic blocks accordingly. An example of such code is the
copy loops generated for array repacking - we can mark them
as "cold" assuming that the copy will not happen dynamically.
If the copy actually happens the overhead of the copy is probably high
enough so that we may not care about the little overhead
of jumping to the "cold" code and fetching it.
2025-06-17 16:14:13 -07:00
Krzysztof Parzyszek
5f841a6284
[flang][OpenMP] Set _OPENMP macro for version 6.0 (#144410)
2025-06-17 07:41:20 -05:00
Jack Styles
cb355def95
[Flang][OpenMP] Add Parsing support for Indirect Clause (#143505)
As part of OpenMP Version 5.1, support for the `indirect` clause was
added for the `declare target` directive. This clause should follow an
`enter` clause, and allows procedure calls to be done indirectly through
OpenMP.

This adds Parsing support for the clause, along with semantics checks.
Currently, lowering for the clause is not supported so a TODO message
will be output to the user. It also performs version checking as
`indirect` is only supported in OpenMP 5.1 or greater.

See also: #110008
2025-06-17 09:05:36 +01:00
Peter Klausler
2bf3ccabfa
[flang] Restructure runtime to avoid recursion (relanding) (#143993)
Recursion, both direct and indirect, prevents accurate stack size
calculation at link time for GPU device code. Restructure these
recursive (often mutually so) routines in the Fortran runtime with new
implementations based on an iterative work queue with
suspendable/resumable work tickets: Assign, Initialize, initializeClone,
Finalize, and Destroy.

Default derived type I/O is also recursive, but already disabled. It can
be added to this new framework later if the overall approach succeeds.

Note that derived type FINAL subroutine calls, defined assignments, and
defined I/O procedures all perform callbacks into user code, which may
well reenter the runtime library. This kind of recursion is not handled
by this change, although it may be possible to do so in the future using
thread-local work queues.

(Relanding this patch after reverting initial attempt due to some test
failures that needed some time to analyze and fix.)

Fixes https://github.com/llvm/llvm-project/issues/142481.
2025-06-16 14:37:01 -07:00
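The general technique named in #143993, replacing recursion with an explicit queue of resumable work items, can be sketched as follows. This is an illustration under stated assumptions, not the runtime's implementation: `Node`, `Ticket`, and `sumPayloads` are hypothetical, and a simple payload sum stands in for operations such as Assign or Destroy.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical nested value: an instance with nested component instances.
struct Node {
  int payload = 0;
  std::vector<Node> components;
};

// A work ticket records which node it handles and where to resume.
struct Ticket {
  const Node *node;
  std::size_t nextComponent = 0;
};

// Sums all payloads without recursion. Each ticket is suspended when it
// schedules work for a component and resumed once that work completes.
int sumPayloads(const Node &root) {
  int total = 0;
  std::vector<Ticket> queue;
  queue.push_back(Ticket{&root});
  while (!queue.empty()) {
    Ticket &ticket = queue.back();
    assert(ticket.node != nullptr);
    if (ticket.nextComponent == 0)
      total += ticket.node->payload; // first visit: handle the node itself
    if (ticket.nextComponent < ticket.node->components.size()) {
      // Suspend this ticket and schedule the next component.
      const Node *child = &ticket.node->components[ticket.nextComponent++];
      queue.push_back(Ticket{child}); // `ticket` is not used after this
    } else {
      queue.pop_back(); // ticket complete
    }
  }
  return total;
}
```

Because the pending work lives in a container rather than on the call stack, the stack depth of `sumPayloads` is constant regardless of nesting depth, which is the property that matters for link-time stack size calculation.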
FYK
52d34865b9
Fix and reapply IR PGO support for Flang (#142892)
This PR resubmits the changes from #136098, which was previously
reverted due to a build failure during the linking stage:

```
undefined reference to `llvm::DebugInfoCorrelate'  
undefined reference to `llvm::ProfileCorrelate'
```

The root cause was that `llvm/lib/Frontend/Driver/CodeGenOptions.cpp`
references symbols from the `Instrumentation` component, but the
`LINK_COMPONENTS` in the `llvm/lib/Frontend/CMakeLists.txt` for
`LLVMFrontendDriver` did not include it. As a result, linking failed in
configurations where these components were not transitively linked.

### Fix:

This updated patch explicitly adds `Instrumentation` to
`LINK_COMPONENTS` in the relevant `llvm/lib/Frontend/CMakeLists.txt`
file to ensure the required symbols are properly resolved.

---------

Co-authored-by: ict-ql <168183727+ict-ql@users.noreply.github.com>
Co-authored-by: Chyaka <52224511+liliumshade@users.noreply.github.com>
Co-authored-by: Tarun Prabhu <tarunprabhu@gmail.com>
2025-06-13 12:05:16 -06:00
Tom Eccles
4a47634a00
[flang][OpenMP] Support substrings and complex part refs for DEPEND (#143907)
Fixes #142404

The parser can't tell the difference between array indexing and a
substring: that has to be done in semantics once we have types.
Substrings can only be in the form string([lower]:[higher]) not
string(index) or string(lower:higher:step). I added semantic checks to
catch this for the DEPEND clause.

This patch also adds lowering for correct substrings and for complex
part references.
2025-06-13 14:16:58 +01:00
Valentin Clement (バレンタイン クレメン)
9992668404
[flang][cuda] Add runtime check for passing device arrays (#144003)
2025-06-12 20:47:58 -07:00
Krzysztof Parzyszek
141d390dcb
[flang][OpenMP] Overhaul implementation of ATOMIC construct (#137852)
The parser will accept a wide variety of illegal attempts at forming an
ATOMIC construct, leaving it to the semantic analysis to diagnose any
issues. This consolidates the analysis into one place and allows us to
produce more informative diagnostics.

The parser's outcome will be parser::OpenMPAtomicConstruct object
holding the directive, parser::Body, and an optional end-directive. The
prior variety of OmpAtomicXyz classes, as well as OmpAtomicClause have
been removed. READ, WRITE, etc. are now proper clauses.

The semantic analysis consistently operates on "evaluation"
representations, mainly evaluate::Expr (as SomeExpr) and
evaluate::Assignment. The results of the semantic analysis are stored in
a mutable member of the OpenMPAtomicConstruct node. This follows a
precedent of having `typedExpr` member in parser::Expr, for example.
This allows the lowering code to avoid duplicated handling of AST nodes.

Using a BLOCK construct containing multiple statements for an ATOMIC
construct that requires multiple statements is now allowed. In fact, any
nesting of such BLOCK constructs is allowed.

This implementation will parse, and perform semantic checks for both
conditional-update and conditional-update-capture, although no MLIR will
be generated for those. Instead, a TODO error will be issued prior to
lowering.

The allowed forms of the ATOMIC construct were based on the OpenMP 6.0
spec.
2025-06-11 10:05:34 -05:00
Peter Klausler
10f512f7bb
Revert runtime work queue patch, it breaks some tests that need investigation (#143713)
Revert "[flang][runtime] Another try to fix build failure"

This reverts commit 13869cac2b5051e453aa96ad71220d9d33404620.

Revert "[flang][runtime] Fix build bot flang-runtime-cuda-gcc errors
(#143650)"

This reverts commit d75e28477af0baa063a4d4cc7b3cf657cfadd758.

Revert "[flang][runtime] Replace recursion with iterative work queue
(#137727)"

This reverts commit 163c67ad3d1bf7af6590930d8f18700d65ad4564.
2025-06-11 07:55:06 -07:00
Valentin Clement (バレンタイン クレメン)
a3201ce9e1
[flang][cuda] Add option to disable warp function in semantic (#143640)
These functions are not available in some lower compute capabilities.
Add option in the language feature to enforce the semantic check on
these.
2025-06-10 22:10:26 -07:00
Peter Klausler
b994a4c04f
[flang][NFC] Clean up code in two new functions (#142037)
Two recently-added functions in Semantics/tools.h need some cleaning up
to conform to the coding style of the project. One of them should
actually be in Parser/tools.{h,cpp}, the other doesn't need to be
defined in the header.
2025-06-10 14:44:41 -07:00
Peter Klausler
163c67ad3d
[flang][runtime] Replace recursion with iterative work queue (#137727)
Recursion, both direct and indirect, prevents accurate stack size
calculation at link time for GPU device code. Restructure these
recursive (often mutually so) routines in the Fortran runtime with new
implementations based on an iterative work queue with
suspendable/resumable work tickets: Assign, Initialize, initializeClone,
Finalize, and Destroy.

Default derived type I/O is also recursive, but already disabled. It can
be added to this new framework later if the overall approach succeeds.

Note that derived type FINAL subroutine calls, defined assignments, and
defined I/O procedures all perform callbacks into user code, which may
well reenter the runtime library. This kind of recursion is not handled
by this change, although it may be possible to do so in the future using
thread-local work queues.

The effects of this restructuring on CPU performance are yet to be
measured.
2025-06-10 14:44:19 -07:00
Pranav Bhandarkar
f993f362ef
[Flang][OpenMP] - When mapping a fir.boxchar, map the underlying data pointer as a member (#141715)
This PR adds functionality to the `MapInfoFinalization` pass wherein the
underlying data pointer of a `fir.boxchar` is mapped as a member of the
parent boxchar.
2025-06-10 13:09:32 -05:00
Andre Kuhlenschmidt
d502c68dcb
[flang][common] return ENUM_CLASS names definition to original state (#143553)
This PR simply reverts a few lines in
bf60aa1c551ef5de62fd1d1cdcbff58cba55cacd to their state in
bcba39a56fd4e1debe3854d564c3e03bf0a50ee6 so that they are constant for
some of the build tests that require it. This should fix the breakage
caused by #142022.
2025-06-10 09:21:20 -07:00
Cameron McInally
cde1035a2f
[flang] Add support for -mrecip[=<list>] (#143418)
This patch adds support for the -mrecip command line option. The parsing
of this option is equivalent to Clang's and it is implemented by
setting the "reciprocal-estimates" function attribute.

Also move the ParseMRecip(...) function to CommonArgs, so that Flang is
able to make use of it as well.

---------

Co-authored-by: Cameron McInally <cmcinally@nvidia.com>
2025-06-10 08:25:33 -06:00
Andre Kuhlenschmidt
bf60aa1c55
[flang][cli] Add diagnostic flags to the CLI (#142022)
This change allows the flang CLI to accept `-W[no-]<feature>` flags matching the clang syntax and enable and disable usage and language feature warnings.
2025-06-10 06:41:13 -07:00
jeanPerier
59e4d0b34d
[flang][hlfir] ensure hlfir.declare result box attributes are consistent (#143137)
Prevent hlfir.declare outputs from being fir.box/class values with the
heap/pointer attribute to ensure the runtime descriptor attributes are
in line with the Fortran attributes for the entities being declared
(only fir.ref<box/class> can be ALLOCATABLE/POINTER).

This fixes a bug where an associated entity inside a SELECT TYPE was being
unexpectedly reallocated inside assign runtime because the selector was allocatable
and this attribute was not properly removed when creating the descriptor
for the associated entity (that does not inherit the ALLOCATABLE/POINTER
attribute according to Fortran 2023 section 11.1.3.3).
2025-06-10 14:41:14 +02:00
Kajetan Puchalski
18f8e23815
[flang][OpenMP] Make static duration variables default to shared DSA (#142783)
According to the OpenMP standard, variables with static storage duration
are predetermined as shared.
Add a check when creating implicit symbols for OpenMP to fix them
erroneously getting set to firstprivate.

Fixes llvm#140732.

---------

Signed-off-by: Kajetan Puchalski <kajetan.puchalski@arm.com>
2025-06-09 15:52:24 +01:00
Tom Eccles
ce603a0f16
[flang][openmp]Add UserReductionDetails and use in DECLARE REDUCTION (#140066)
This adds another puzzle piece for the support of OpenMP DECLARE
REDUCTION functionality.

This adds support for operators with derived types, as well as declaring
multiple different types with the same name or operator.

A new detail class for UserReductionDetails is introduced to hold the
list of types supported for a given reduction declaration.

Tests for parsing and symbol generation added.

Declare reduction is still not supported in lowering; it will generate a
"Not yet implemented" fatal error.

Fixes #141306
Fixes #97241
Fixes #92832
Fixes #66453

---------

Co-authored-by: Mats Petersson <mats.petersson@arm.com>
2025-06-09 11:17:03 +01:00
Pranav Bhandarkar
8395912895
[Flang] - Handle BoxCharType in fir.box_offset op (#141713)
To map `fir.boxchar` types reliably onto an offload target, such as a
GPU, the `omp.map.info` operation is used to map the underlying data
pointer (`fir.ref<fir.char<k, ?>>`) wrapped by the `fir.boxchar` MLIR
value. The `omp.map.info` operation needs a pointer to the underlying
data pointer.
Given a reference to a descriptor (`fir.box`), the `fir.box_offset` is
used to obtain the address of the underlying data pointer. This PR
extends `fir.box_offset` to provide the same functionality for
`fir.boxchar` as well.
2025-06-06 10:48:07 -05:00
Kajetan Puchalski
0d40574e16
[flang] Inline hlfir.copy_in for trivial types (#138718)
hlfir.copy_in implements copying non-contiguous array slices for
functions that take in arrays required to be contiguous through
flang-rt.

For large arrays of trivial types, this can incur overhead compared to a
plain, inlined copy loop.

To address that, add a new InlineHLFIRCopyIn optimisation pass to inline
hlfir.copy_in operations for trivial types.

For the time being, the pattern is only applied in cases where the
copy-in does not require a corresponding copy-out, such as when the
function being called declares the array parameter as intent(in).

Applying this optimisation reduces the runtime of thornado-mini's
DeleptonizationProblem by about 10%.

---------

Signed-off-by: Kajetan Puchalski <kajetan.puchalski@arm.com>
2025-06-06 15:10:17 +01:00
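The "plain, inlined copy loop" that #138718 refers to has roughly the following shape when written by hand. This is only a sketch of the idea in C++, not the code produced by the InlineHLFIRCopyIn pass; `copyInStrided` is a hypothetical helper and `double` stands in for any trivial element type.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Copies `count` elements spaced `stride` apart into a contiguous temporary.
// Assumes `base` points to at least (count - 1) * stride + 1 elements.
std::vector<double> copyInStrided(const double *base, std::size_t count,
                                  std::size_t stride) {
  assert(stride >= 1);
  std::vector<double> temp(count);
  for (std::size_t i = 0; i < count; ++i)
    temp[i] = base[i * stride];
  return temp;
}
```

Since the callee only reads the array in the intent(in) case, no matching copy back into the strided storage is needed once the call returns.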
jeanPerier
6a41f53c39
[flang][hlfir] do not propagate polymorphic temporary as allocatables (#142609)
Polymorphic temporaries are currently propagated as
fir.ref<fir.class<fir.heap<>>> because their allocation may be delayed
to the hlfir.assign copy (using realloc).

This patch moves away from this and directly allocates the temp and
propagates it as a fir.class.

The creation of polymorphic temporaries is also simplified by avoiding the
need to call the runtime to set up the descriptor altogether (the runtime
is still called for the allocation currently because alloca/allocmem do
not support polymorphism).
2025-06-06 09:53:41 +02:00
Kareem Ergawy
bac4aa440c
[flang] Extend localization support for do concurrent (init regions) (#142564)
Extends support for locality specifiers in `do concurrent` by supporting
data types that need `init` regions.

This further unifies the paths taken by the compiler for OpenMP
privatization clauses and `do concurrent` locality specifiers.
2025-06-05 01:01:53 +02:00
Peter Klausler
4b23d4c7ca
[flang] Extension: allow override of inaccessible DEFERRED binding (#142691)
Inaccessible procedure bindings can't be overridden, but DEFERRED
bindings must be in a non-abstract extension. We presently emit an error
for an attempt to override an inaccessible binding in this case. But
some compilers accept this usage, and since it seems safe enough, I'll
allow it with an optional warning. Codes can avoid this warning and
conform to the standard by changing the deferred bindings to be public.
2025-06-04 09:23:34 -07:00
Leandro Lupori
aac1f85393
[flang][OpenMP] Explicitly set Shared DSA in symbols (#142154)
Before this change, OmpShared was not always set in shared symbols.
Instead, absence of private flags was interpreted as shared DSA.
The problem was that symbols with no flags, with only a host
association, could also mean "has same DSA as in the enclosing
context". Now shared symbols behave the same as private and can be
treated the same way.

Because of the host association symbols with no flags mentioned
above, it was also incorrect to simply test the flags of a given
symbol to find out if it was private or shared. The function
GetSymbolDSA() was added to fix this. It would be better to avoid
the need of these special symbols, but this would require changes
to how symbols are collected in lowering.

Besides that, some semantic checks need to know if a DSA clause
was used or not. To avoid confusing implicit symbols with DSA
clauses a new flag was added: OmpExplicit. It is now set for all
symbols with explicitly determined data-sharing attributes.

With the changes above, AddToContextObjectWithDSA() and the symbol
to DSA map could probably be removed and the DSA could be obtained
directly from the symbol, but this was not attempted.

Some debug messages were also added, with the "omp" DEBUG_TYPE, to
make it easier to debug the creation of implicit symbols and to
visualize all associations of a given symbol.

Fixes #130533
Fixes #140882
2025-06-03 10:58:23 -03:00
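One plausible reading of the lookup described in #142154 is sketched below: a symbol with no DSA flags defers to its host association until a flagged symbol is found. `MiniSymbol`, `DSA`, and `getSymbolDSA` are hypothetical and much simpler than flang's `Symbol` class and the actual `GetSymbolDSA()` function.

```cpp
#include <cassert>

// Hypothetical data-sharing attribute; None models "no DSA flags set".
enum class DSA { None, Shared, Private, FirstPrivate };

// Hypothetical miniature of a symbol; not flang's semantics::Symbol.
struct MiniSymbol {
  DSA dsa = DSA::None;
  const MiniSymbol *host = nullptr; // host association, if any
};

// Follows host associations until a symbol with a DSA flag is found.
DSA getSymbolDSA(const MiniSymbol &symbol) {
  for (const MiniSymbol *s = &symbol; s != nullptr; s = s->host)
    if (s->dsa != DSA::None)
      return s->dsa;
  return DSA::None;
}

// Usage: a flagless symbol takes the DSA of its enclosing context.
void checkInheritedDSA() {
  MiniSymbol outer;
  outer.dsa = DSA::Shared;
  MiniSymbol inner;
  inner.host = &outer;
  assert(getSymbolDSA(inner) == DSA::Shared);
}
```

Testing only `inner.dsa` here would report no attribute at all, which is the pitfall the commit message describes.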
Kajetan Puchalski
01d4b16406
[flang][OpenMP] Resolve names for declare simd uniform clause (#142160)
Add a visitor for OmpClause::Uniform to resolve its parameter names.
Add Symbol::Flag::OmpUniform to attach it to the resolved symbols.
Fixes issue #140741.

---------

Signed-off-by: Kajetan Puchalski <kajetan.puchalski@arm.com>
2025-06-02 16:37:02 +01:00
Tarun Prabhu
597340b5b6
Revert "Add IR Profile-Guided Optimization (IR PGO) support to the Flang compiler" (#142159)
Reverts llvm/llvm-project#136098
2025-05-30 08:27:08 -06:00
FYK
d27a210a77
Add IR Profile-Guided Optimization (IR PGO) support to the Flang compiler (#136098)
This patch implements IR-based Profile-Guided Optimization support in
Flang through the following flags:

- `-fprofile-generate` for instrumentation-based profile generation

- `-fprofile-use=<dir>/file` for profile-guided optimization

Resolves #74216 (implements IR PGO support phase)

**Key changes:**

- Frontend flag handling aligned with Clang/GCC semantics

- Instrumentation hooks into LLVM PGO infrastructure

- LIT tests verifying:

    - Instrumentation metadata generation

    - Profile loading from specified path

    - Branch weight attribution (IR checks)

**Tests:**

- Added gcc-flag-compatibility.f90 test module verifying:

    -  Flag parsing boundary conditions

    -  IR-level profile annotation consistency

    -  Profile input path normalization rules

- SPEC2006 benchmark results will be shared in comments

For details on LLVM's PGO framework, refer to [Clang PGO
Documentation](https://clang.llvm.org/docs/UsersManual.html#profile-guided-optimization).

This implementation was developed by [XSCC Compiler
Team](https://github.com/orgs/OpenXiangShan/teams/xscc).

---------

Co-authored-by: ict-ql <168183727+ict-ql@users.noreply.github.com>
Co-authored-by: Tom Eccles <t@freedommail.info>
2025-05-30 08:13:53 -06:00
Cameron McInally
ce9cef79ea
[flang] Add support for -mprefer-vector-width=<value> (#142073)
This patch adds support for the -mprefer-vector-width= command line
option. The parsing of this option is equivalent to Clang's and it is
implemented by setting the "prefer-vector-width" function attribute.

Co-authored-by: Cameron McInally <cmcinally@nvidia.com>
2025-05-30 07:50:18 -06:00
Yussur Mustafa Oraji
5c3bf36c99
[flang] Add __COUNTER__ preprocessor macro (#136827)
This commit adds support for the `__COUNTER__` preprocessor macro, which
works the same as the one found in clang.
It is useful to generate unique names at compile-time.
2025-05-30 08:33:53 -04:00
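Since #136827 states that the macro works the same as the one found in clang, the usual C/C++ idiom for unique names illustrates its use. The sketch below is C++ compiled with a compiler that provides `__COUNTER__` (an extension in GCC, Clang, and MSVC); the `UNIQUE_NAME` helper macros and variable names are hypothetical.

```cpp
#include <cassert>

// Two-level concatenation so that __COUNTER__ is expanded before pasting.
#define CONCAT_IMPL(a, b) a##b
#define CONCAT(a, b) CONCAT_IMPL(a, b)
// Each expansion of UNIQUE_NAME(prefix) yields a different identifier.
#define UNIQUE_NAME(prefix) CONCAT(prefix, __COUNTER__)

// These two definitions do not clash because the generated names differ.
int UNIQUE_NAME(tmp_) = 10;
int UNIQUE_NAME(tmp_) = 20;

// Each use of __COUNTER__ evaluates to the previous value plus one.
constexpr int firstValue = __COUNTER__;
constexpr int secondValue = __COUNTER__;

// Usage: consecutive expansions differ by exactly one.
void checkCounter() { assert(secondValue == firstValue + 1); }
```

The absolute values depend on how often the macro was expanded earlier in the translation unit, so only the difference between consecutive uses is relied on here.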
Kareem Ergawy
f5d3470d42
[flang][OpenMP] Allow structure component in task depend clauses (#141923)
Even though the spec (version 5.2) prohibits structure components from
being specified in `depend` clauses, this restriction is not sensible.

This PR rectifies the issue by lifting that restriction and allowing
structure components in `depend` clauses (which is allowed by OpenMP
6.0).
2025-05-30 06:22:29 +02:00