This patch implements IR-based Profile-Guided Optimization support in
Flang through the following flags:
- `-fprofile-generate` for instrumentation-based profile generation
- `-fprofile-use=<dir>/file` for profile-guided optimization
Resolves#74216 (implements IR PGO support phase)
**Key changes:**
- Frontend flag handling aligned with Clang/GCC semantics
- Instrumentation hooks into LLVM PGO infrastructure
- LIT tests verifying:
- Instrumentation metadata generation
- Profile loading from specified path
- Branch weight attribution (IR checks)
**Tests:**
- Added gcc-flag-compatibility.f90 test module verifying:
- Flag parsing boundary conditions
- IR-level profile annotation consistency
- Profile input path normalization rules
- SPEC2006 benchmark results will be shared in comments
For details on LLVM's PGO framework, refer to [Clang PGO
Documentation](https://clang.llvm.org/docs/UsersManual.html#profile-guided-optimization).
This implementation was developed by [XSCC Compiler
Team](https://github.com/orgs/OpenXiangShan/teams/xscc).
---------
Co-authored-by: ict-ql <168183727+ict-ql@users.noreply.github.com>
Co-authored-by: Tom Eccles <t@freedommail.info>
This patch adds support for the -mprefer-vector-width= command line
option. The parsing of this options is equivalent to Clang's and it is
implemented by setting the "prefer-vector-width" function attribute.
Co-authored-by: Cameron McInally <cmcinally@nvidia.com>
There are various places where the -fveclib option is parsed to
determine whether its value is correct for the target. Unfortunately
these places assume case-insensitivity and subsequently use "LIBMVEC"
where the driver mandates "libmvec", thus rendering the diagnosistic
useless.
This PR corrects the naming along with similar incorrect uses within the
test files.
This PR starts the effort to upstream AMD's internal implementation of
`do concurrent` to OpenMP mapping. This replaces #77285 since we
extended this WIP quite a bit on our fork over the past year.
An important part of this PR is a document that describes the current
status downstream, the upstreaming status, and next steps to make this
pass much more useful.
In addition to this document, this PR also contains the skeleton of the
pass (no useful transformations are done yet) and some testing for the
added command line options.
This looks like a huge PR but a lot of the added stuff is documentation.
It is also worth noting that the downstream pass has been validated on
https://github.com/BerkeleyLab/fiats. For the CPU mapping, this achived
performance speed-ups that match pure OpenMP, for GPU mapping we are
still working on extending our support for implicit memory mapping and
locality specifiers.
PR stack:
- https://github.com/llvm/llvm-project/pull/126026 (this PR)
- https://github.com/llvm/llvm-project/pull/127595
- https://github.com/llvm/llvm-project/pull/127633
- https://github.com/llvm/llvm-project/pull/127634
- https://github.com/llvm/llvm-project/pull/127635
Add -f[no-]slp-vectorize to the flang driver.
Add corresponding -fvectorize-slp to the flang frontend.
Enable -fslp-vectorize at -O2 and higher in flang to match the current
behaviour in clang.
---------
Signed-off-by: Kajetan Puchalski <kajetan.puchalski@arm.com>
This PR addresses some of the issues described in
https://github.com/llvm/llvm-project/issues/127617. Key changes:
- Stop assuming fixed-form for `-x f95` unless the input is a `.i` file.
This change ensures compatibility with `-save-temps` workflows while
preventing unintended fixed-form assumptions.
- Ensure `-x f95-cpp-input` enables `-cpp` by default, aligning Flang's
behavior with `gfortran`.
`-fd-lines-as-code` and `-fd-lines-as-comments` enables treatment for
lines beginning with `d` or `D` in fixed form sources.
Using these options in free form has no effect.
If the `-fd-lines-as-code` option is given they are treated as if the
first column contained a blank.
If the `-fd-lines-as-comments` option is given, they are treated as
comment lines.
This patch adds the -fvectorize and -fno-vectorize flags to flang.
Note that this also changes the behaviour of `flang -fc1` to match that
of `clang -cc1`, which is that vectorization is only enabled in the
presence of the `-vectorize-loops` flag.
Additionally, this patch changes the behaviour of the default
optimisation levels to match clang, such that vectorization only happens
at the same levels as it does there.
This patch is in draft while I write an RFC to discuss the above two
changes.
This is needed to generate proper ABI flags in the ELF header for LTO
builds. If these flags aren't set correctly, we can't link with objects
that were built with the correct flags.
For non-LTO builds the mcpu/mattr in the TargetMachine will cause the
backend to infer an ABI. For LTO builds the mcpu/mattr aren't set.
I've only added lp64, lp64f, and lp64d ABIs. ilp32* requires riscv32
which is not yet supported in flang. lp64e requires a different
DataLayout string and would need additional plumbing.
Fixes#115679
Move non-common files from FortranCommon to FortranSupport (analogous to
LLVMSupport) such that
* declarations and definitions that are only used by the Flang compiler,
but not by the runtime, are moved to FortranSupport
* declarations and definitions that are used by both ("common"), the
compiler and the runtime, remain in FortranCommon
* generic STL-like/ADT/utility classes and algorithms remain in
FortranCommon
This allows a for cleaner separation between compiler and runtime
components, which are compiled differently. For instance, runtime
sources must not use STL's `<optional>` which causes problems with CUDA
support. Instead, the surrogate header `flang/Common/optional.h` must be
used. This PR fixes this for `fast-int-sel.h`.
Declarations in include/Runtime are also used by both, but are
header-only. `ISO_Fortran_binding_wrapper.h`, a header used by compiler
and runtime, is also moved into FortranCommon.
When -fimplicit-none-ext is passed, flang behaves as if "implicit
none(external)" was specified for all relevant constructs in Fortran
source file.
Note: implicit17.f90 was based on implicit07.f90 with `implicit
none(external)` removed and `-fimplicit-none-ext` added.
The behavior is not entirely consistent with that of clang for the
moment since detailed timing information on the LLVM IR optimization and
code generation passes is not provided. The -ftime-report= option is
also not enabled since that is only relevant for information about the
LLVM IR passes. However, some code to handle that option has been
included, to make it easier to support the option when the issues
blocking it are resolved. A FortranSupport library has been created that
is intended to mirror the LLVM and MLIR support libraries.
Based on @tarunprabhu's PR
https://github.com/llvm/llvm-project/pull/107270 with minor changes
addressing latest review feedback. He's busy and we'd like to get this
support in ASAP.
Co-authored-by: Tarun Prabhu <tarun.prabhu@gmail.com>
Implement the UNSIGNED extension type and operations under control of a
language feature flag (-funsigned).
This is nearly identical to the UNSIGNED feature that has been available
in Sun Fortran for years, and now implemented in GNU Fortran for
gfortran 15, and proposed for ISO standardization in J3/24-116.txt.
See the new documentation for details; but in short, this is C's
unsigned type, with guaranteed modular arithmetic for +, -, and *, and
the related transformational intrinsic functions SUM & al.
-frealloc-lhs is the default.
If -fno-realloc-lhs is specified, then an allocatable on the left
side of an intrinsic assignment is not implicitly (re)allocated
to conform with the right hand side. Fortran runtime will issue
an error if there is a mismatch in shape/type/allocation-status.
The aliases are -mcpu=help and -mtune=help. There is still an issue with
the output which prints an example line that references clang. That is
not fixed here because it is printed in llvm/MC/SubtargetInfo.cpp. Some
more thought is needed to determine how best to handle this.
Fixes#117010
nsw is now added to do-variable increment when -fno-wrapv is enabled as
GFortran seems to do.
That means the option introduced by #91579 isn't necessary any more.
Note that the feature of -flang-experimental-integer-overflow is enabled
by default.
The underlying issue was caused by a file included in two different
places which resulted in duplicate definition errors when linking
individual shared libraries. This was fixed in c3201ddaeac02a2c86a38b
[#109874].
This does a global rename from `flang-new` to `flang`. I also
removed/changed any TODOs that I found related to making this change.
---------
Co-authored-by: H. Vetinari <h.vetinari@gmx.com>
Co-authored-by: Andrzej Warzynski <andrzej.warzynski@arm.com>
Fortran INCLUDE lines have (until now) been treated like #include
directives. This isn't how things work with other Fortran compilers when
running under the -E option for preprocessing only, so stop doing it by
default, and add -fpreprocess-include-lines to turn it back on when
desired.
There are autoconf-configured projects for which the generated Makefile
is invoking flang with more than one -J option, each one specifying the
same directory. Although only one module directory should be specified
(by either -J or -module-dir), it should not really matter how many
times this same directory has been specified.
Apparently, other compilers understand it that way, hence autoconf's
configure script may generate a Makefile with the repetitive -J's.
For example, when trying to build the ABINIT [1] project (which can be
configured by either CMake or the configure script) when configured by
autoconf, it fails to build as such:
```
make[3]: Entering directory 'src/98_main'
mpifort -DHAVE_CONFIG_H -I. -I../../../src/98_main -I../.. -I../../src/incs -I../../../src/incs -Ifallbacks/exports/include -Jbuild/mods -Jbuild/mods -c -o abinit-abinit.o `test -f 'abinit.F90' || echo '../../../src/98_main/'`abinit.F90
error: Only one '-module-dir/-J' option allowed
make[3]: *** [Makefile:3961: abinit-abinit.o] Error 1
```
This patch solves the problem.
[1] https://github.com/abinit/abinit.git
Add support for the -frecord-command-line option that will produce the
llvm.commandline metadata which will eventually be saved in the object
file. This behavior is also supported in clang. Some refactoring of the
code in flang to handle these command line options was carried out. The
corresponding -grecord-command-line option which saves the command line
in the debug information has not yet been enabled for flang.
Adding hidden options to disable types through the
`TargetCharacteristics`. I am seeing issues when I do this
programmatically and would like, for anyone, to have the ability to
reproduce them for development and testing purposes.
I am planning to file a couple of issues following this patch.