Four "issues" on GitHub report possible performance problems, likely
detected by static analysis. None of them would ever make a measurable
difference in compilation time, but I'm resolving them to clean up the
open issues list.
Fixes https://github.com/llvm/llvm-project/issues/79703, .../79705,
.../79706, & .../79707.
This patch forwards the target CPU and features information from the
Flang frontend to MLIR func.func operation attributes, which are later
used to populate the target_cpu and target_features llvm.func
attributes.
This is achieved in two stages:
1. Introduce the `fir.target_cpu` and `fir.target_features` module
attributes with information from the target machine immediately after
the initial creation of the MLIR module in the lowering bridge.
2. Update the target rewrite flang pass to get this information from the
module and pass it along to all func.func MLIR operations, respectively
as attributes named `target_cpu` and `target_features`. These attributes
will be automatically picked up during Func to LLVM dialect lowering and
used to initialize the corresponding llvm.func named attributes.
The target rewrite and FIR to LLVM lowering passes are updated with the
ability to override these module attributes, and the `CodeGenSpecifics`
optimizer class is augmented to make this information available to
target-specific MLIR transformations.
This completes a full flow by which target CPU and features make it all
the way from compiler options to LLVM IR function attributes.
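As a rough illustration of the two stages described above, here is a
minimal sketch that does not use MLIR at all. The `Module` and `Function`
structs, the helper names, and the use of an empty string to mean "no
override" are all hypothetical stand-ins; only the attribute names come
from this patch.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-ins for an MLIR module and its func.func operations;
// attributes are modelled as a plain string-to-string map.
using Attributes = std::map<std::string, std::string>;

struct Function {
  std::string name;
  Attributes attrs;
};

struct Module {
  Attributes attrs;
  std::vector<Function> functions;
};

// Stage 1: record the target machine's CPU and features on the module.
void setModuleTarget(Module &module, const std::string &cpu,
                     const std::string &features) {
  module.attrs["fir.target_cpu"] = cpu;
  module.attrs["fir.target_features"] = features;
}

// Stage 2: copy the module-level values onto every function. A non-empty
// override replaces the module attribute first (assumed override semantics).
void forwardTargetToFunctions(Module &module,
                              const std::string &cpuOverride = "",
                              const std::string &featuresOverride = "") {
  if (!cpuOverride.empty())
    module.attrs["fir.target_cpu"] = cpuOverride;
  if (!featuresOverride.empty())
    module.attrs["fir.target_features"] = featuresOverride;

  for (Function &fn : module.functions) {
    auto cpu = module.attrs.find("fir.target_cpu");
    if (cpu != module.attrs.end())
      fn.attrs["target_cpu"] = cpu->second;
    auto features = module.attrs.find("fir.target_features");
    if (features != module.attrs.end())
      fn.attrs["target_features"] = features->second;
  }
}
```

In this sketch a function only receives an attribute when the module
carries the corresponding one, so modules without target information are
left untouched.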
The existing type size computation in LoopVersioning does not work
for REAL*10, because the computed element size is 10 bytes,
which violates the power-of-two assertion.
It is better to use the DataLayout to compute the storage size
of each element of an array of the given type.
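To make the arithmetic concrete, here is a small sketch of why the
storage size behaves better than the raw value size. The helper names are
hypothetical, and the 16-byte ABI alignment for the 80-bit type is an
assumption that holds for typical x86-64 data layouts; the real code
queries the DataLayout rather than hard-coding anything.

```cpp
#include <cassert>
#include <cstdint>

// Returns true when `value` is a non-zero power of two.
inline bool isPowerOfTwo(std::uint64_t value) {
  return value != 0 && (value & (value - 1)) == 0;
}

// Rounds `sizeInBytes` up to the next multiple of `abiAlignment`, which is
// the stride between consecutive array elements of that type.
inline std::uint64_t storageSizeInBytes(std::uint64_t sizeInBytes,
                                        std::uint64_t abiAlignment) {
  assert(isPowerOfTwo(abiAlignment) && "ABI alignments are powers of two");
  return (sizeInBytes + abiAlignment - 1) & ~(abiAlignment - 1);
}

// REAL*10 uses the 80-bit x87 format: 10 bytes of value bits.
const std::uint64_t real10ValueBytes = 80 / 8;
// Assumed ABI alignment of that type on x86-64 (illustrative constant).
const std::uint64_t real10AbiAlignment = 16;
```

With these assumed numbers, the value size (10) fails a power-of-two
check, while `storageSizeInBytes(real10ValueBytes, real10AbiAlignment)`
yields 16, which passes.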
Derived types passed with VALUE in a BIND(C) context must be passed like C
structs, and LLVM does not implement the ABI for this (it is up to the
frontends, like clang).
Previous patch #75802 implemented the simple case where the derived
type has one field; this patch implements the general case. Note that
the generated LLVM IR is compliant from an X86-64 C ABI point of view and
compatible with clang-generated assembly, but it is not guaranteed
to match the LLVM IR signatures generated by clang for the equivalent C
functions, because several LLVM IR signatures may lead to the same X86-64
signature.
Implement the C struct passing ABI on X86-64 for the trivial case where
the structs have one element. This is required to cover some cases of
BIND(C) derived types passed with the VALUE attribute.
In the context of C/Fortran interoperability (BIND(C)), it is possible
to give the VALUE attribute to a BIND(C) derived type dummy, which
according to Fortran 2018 18.3.6 - 2. (4) implies that it must be passed
like the equivalent C structure value. The way C structure values are
passed is ABI dependent.
LLVM does not implement the C struct passing ABI for LLVM aggregate type
arguments. It is up to the front-end, as clang does, to split the
struct into registers or pass the struct on the stack (llvm "byval") as
required by the target ABI.
So the logic for C struct passing sits in clang. Using it from flang
requires setting up a lot of clang context and bridging the FIR/MLIR
representation to the clang AST representation for function signatures (in
both directions). It is a non-trivial task.
See
https://stackoverflow.com/questions/39438033/passing-structs-by-value-in-llvm-ir/75002581#75002581.
Since BIND(C) structs are rather limited compared to generic C structs
(e.g. no bit fields), it is easier to provide a limited implementation
for the cases that matter to Fortran.
This patch:
- Updates the generic target rewrite pass to keep track of both the new
argument type and attributes. The motivation for this is to be able to
tell if a previously marshalled argument is passed in memory (it is a C
pointer), or if it is being passed on the stack (has the byval llvm
attributes).
- Adds an entry point in the target specific codegen to marshal struct
arguments, and use it in the generic target rewrite pass.
- Implements limited support for the X86-64 case. So far, the support
allows telling if a struct must be passed in registers or on the stack,
and dealing with the stack case. The register case is left TODO in this
patch.
The X86-64 ABI implemented is the System V ABI for AMD64 version 1.0.
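For readers unfamiliar with the register-versus-stack decision, here is a
heavily simplified sketch of the System V classification, restricted to
structs whose flattened members are naturally aligned integer or
float/double scalars. The `Field` type and `classifyStruct` name are
hypothetical and this is not the code added by the patch; it ignores x87,
vector, and packed members entirely.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum class ArgClass { Integer, Sse, Memory };

// One scalar member of a flattened struct (hypothetical helper type).
struct Field {
  std::size_t offset; // byte offset within the struct
  std::size_t size;   // byte size of the scalar, at most 8 in this sketch
  bool isFloat;       // true for float/double, false for integers/pointers
};

// Returns one class per eightbyte, or a single Memory entry when the struct
// must be passed on the stack (the "byval" case).
// Precondition: structSize > 0 and every field lies inside the struct.
std::vector<ArgClass> classifyStruct(const std::vector<Field> &fields,
                                     std::size_t structSize) {
  assert(structSize > 0);
  // Structs larger than two eightbytes are passed in memory.
  if (structSize > 16)
    return {ArgClass::Memory};

  // Each eightbyte starts as SSE and becomes INTEGER as soon as any
  // integer field lands in it (INTEGER wins when classes are merged).
  std::vector<ArgClass> classes((structSize + 7) / 8, ArgClass::Sse);
  for (const Field &field : fields) {
    assert(field.size > 0 && field.size <= 8);
    assert(field.offset + field.size <= structSize);
    // A misaligned field forces the whole struct into memory.
    if (field.offset % field.size != 0)
      return {ArgClass::Memory};
    if (!field.isFloat)
      classes[field.offset / 8] = ArgClass::Integer;
  }
  return classes;
}
```

For example, a struct of two doubles is classified as two SSE eightbytes,
while a struct of three doubles (24 bytes) falls into the memory class.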
COMPLEX(10) passing by value and returning follows C complex
passing/returning ABI.
Cover the COMPLEX(10) case (X87 / __Complex long double on X86-64).
Implements System V ABI for AMD64 version 1.0.
The LLVM signatures match the ones generated by clang for the __Complex
long double case.
Note that a FIXME is added for the COMPLEX(8) case, which is incorrect in
a corner case. This will be fixed when dealing with passing derived types
by value in BIND(C) context.
This update makes the user-visible messages about features that
are not yet implemented more consistent. I also cleaned up some of
the code.
For NYI messages that refer to intrinsics, I made sure that the message
begins with "not yet implemented: intrinsic:" to make them easier to
recognize.
I created some utility functions for NYI reporting that I put into
.../include/Optimizer/Support/Utils.h. These mainly convert MLIR types
to their Fortran equivalents.
I converted the NYI code to use the newly created utility functions.
Function arguments or return values that are complex floating point values
aren't correctly lowered for Windows x86 32-bit and 64-bit targets.
See: https://github.com/llvm/llvm-project/issues/61976
Add targets that are specific for these platforms and OS.
With thanks to @mstorsjo for pointing out the fix.
Reviewed By: vzakhari
Differential Revision: https://reviews.llvm.org/D147768
After the extraction of the TypeConverter, move the header files
to the include dir so that the shared library build works.
Reviewed By: PeteSteinfeld
Differential Revision: https://reviews.llvm.org/D147979
Add LoongArch64 linux target specifics to Target.cpp, similar to the
RISCV-64 ones in D136547.
For LoongArch, a complex floating-point number, or a structure
containing just one complex floating-point number, is passed as though
it were a structure containing two floating-point reals.
Reviewed By: vzakhari
Differential Revision: https://reviews.llvm.org/D143131
This is the first patch of several that will enable generating code for AMD
GPUs. It adds the AMDGPU target so it can be used with the --target and -mcpu
options.
Reviewed By: kiranchandramohan
Differential Revision: https://reviews.llvm.org/D143102
Clang uses signext/zeroext attributes for integer arguments shorter than
the default 'int' type on a target. So Flang has to match this for functions
from Fortran runtime and also for BIND(C) routines. This patch implements
ABI adjustments only for Fortran runtime calls. BIND(C) part will be done
separately.
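The rule being matched can be sketched as follows. The enum, the function
name, and the 32-bit width assumed for the default `int` are illustrative
assumptions, not the code in this patch; actual targets may apply further
target-specific rules.

```cpp
#include <cassert>

enum class IntExtension { None, SignExt, ZeroExt };

// Assumed width of the target's default C `int`, in bits (illustrative).
const unsigned defaultIntBits = 32;

// Picks the extension attribute for an integer argument of `bitWidth` bits:
// integers narrower than `int` are sign- or zero-extended by signedness,
// and anything at least as wide as `int` needs no attribute.
inline IntExtension extensionFor(unsigned bitWidth, bool isSigned) {
  assert(bitWidth > 0 && "integer types have at least one bit");
  if (bitWidth >= defaultIntBits)
    return IntExtension::None;
  return isSigned ? IntExtension::SignExt : IntExtension::ZeroExt;
}
```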
This resolves https://github.com/llvm/llvm-project/issues/58579
Differential Revision: https://reviews.llvm.org/D142677
Add support for ppc64 (big endian) in order to support flang on 64-bit AIX.
Reviewed By: clementval, kiranchandramohan
Differential Revision: https://reviews.llvm.org/D138390
Clang uses signext/zeroext attributes for integer arguments shorter than
the default 'int' type on a target. So Flang has to match this for functions
from Fortran runtime and also for BIND(C) routines. This patch implements
ABI adjustments only for Fortran runtime calls. BIND(C) part will be done
separately.
This resolves https://github.com/llvm/llvm-project/issues/58579
Differential Revision: https://reviews.llvm.org/D137050
In an attempt to fix errors in Flang regression tests on the RISCV64
platform, a RISCV64 target was added, along with tests for it.
Reviewed By: vzakhari
Differential Revision: https://reviews.llvm.org/D136547
This allows all ELF operating systems to use target specifics tuned for Linux,
since they use mostly the same ABIs. If some triples are to be excluded, it's
better done at the driver layer.
Reviewed By: emaste
Differential Revision: https://reviews.llvm.org/D135100
As described in Issue #57642, `flang` currently lacks SPARC support in
`Optimizer/CodeGen/Target.cpp`, which causes a considerable number of tests
to `FAIL` with
error: flang/lib/Optimizer/CodeGen/Target.cpp:310: not yet implemented:
target not implemented
This patch fixes this by following GCC's documentation of the ABI described
in the Issue.
Tested on `sparcv9-sun-solaris2.11`.
Differential Revision: https://reviews.llvm.org/D133561
When testing LLVM 15.0.0 rc1 on Solaris, I found that 50+ flang tests
`FAIL`ed with
error:
/vol/llvm/src/llvm-project/local/flang/lib/Optimizer/CodeGen/Target.cpp:310:
not yet implemented: target not implemented
This patch fixes that for Solaris/x86, where the fix is trivial (just
handling it like the other x86 OSes).
Tested on `amd64-pc-solaris2.11`; only a single failure remains now.
Differential Revision: https://reviews.llvm.org/D131054
This patch is part of the upstreaming effort from fir-dev branch.
Reviewed By: jeanPerier, PeteSteinfeld
Differential Revision: https://reviews.llvm.org/D128331
Co-authored-by: Jean Perier <jperier@nvidia.com>
Co-authored-by: Eric Schweitz <eschweitz@nvidia.com>
Co-authored-by: Kiran Chandramohan <kiran.chandramohan@arm.com>
This patch is part of the upstreaming effort from fir-dev branch.
Reviewed By: jeanPerier, PeteSteinfeld
Differential Revision: https://reviews.llvm.org/D127634
Co-authored-by: Eric Schweitz <eschweitz@nvidia.com>
This patch basically extends https://reviews.llvm.org/D122008 with
support for MacOSX/Darwin.
To facilitate this, I've added `MacOSX` to the list of supported OSes in
Target.cpp. Flang already supports `Darwin` and it doesn't really do
anything OS-specific there (it could probably safely skip checking the
OS for now).
Note that generating executables remains hidden behind the
`-flang-experimental-exec` flag. Also, we don't need to add `-lm` on
MacOSX as `libm` is effectively included in `libSystem` (which is linked
in unconditionally).
Differential Revision: https://reviews.llvm.org/D125628
This patch adds Win32 to the list of supported triples in
`fir::CodeGenSpecifics`. This change means that we can use the "native"
triple, even when running tests on Windows. Currently this affects only
1 test, but it will change once we start adding more tests for lowering
and code-generation.
Differential Revision: https://reviews.llvm.org/D119332
This patch adds conversion for primitive operations on complex types.
- fir.addc
- fir.subc
- fir.mulc
- fir.divc
- fir.negc
This also adds the type conversion for the !fir.complex<KIND> type.
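For reference, the arithmetic behind the five operations can be written
out on a pair of reals. This is a sketch using the textbook formulas; the
`Complex` struct and function bodies are illustrative and do not claim to
mirror the exact instruction sequence the conversion emits (division in
particular is often implemented more carefully in practice).

```cpp
#include <cassert>

// A complex value as a (real, imaginary) pair; hypothetical helper type.
struct Complex {
  double re;
  double im;
};

// fir.addc: component-wise addition.
inline Complex addc(Complex a, Complex b) {
  return {a.re + b.re, a.im + b.im};
}

// fir.subc: component-wise subtraction.
inline Complex subc(Complex a, Complex b) {
  return {a.re - b.re, a.im - b.im};
}

// fir.mulc: (a + bi)(c + di) = (ac - bd) + (ad + bc)i.
inline Complex mulc(Complex a, Complex b) {
  return {a.re * b.re - a.im * b.im, a.re * b.im + a.im * b.re};
}

// fir.divc: multiply by the conjugate of b and divide by |b|^2.
inline Complex divc(Complex a, Complex b) {
  const double denom = b.re * b.re + b.im * b.im;
  assert(denom != 0.0 && "division by complex zero");
  return {(a.re * b.re + a.im * b.im) / denom,
          (a.im * b.re - a.re * b.im) / denom};
}

// fir.negc: negate both components.
inline Complex negc(Complex a) { return {-a.re, -a.im}; }
```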
This patch is part of the upstreaming effort from fir-dev branch.
This patch was updated to avoid failure on windows buildbot.
Flang codegen does not support windows target so we force the test
to use a known target instead.
Reviewed By: kiranchandramohan, rovka
Differential Revision: https://reviews.llvm.org/D113434
Co-authored-by: Jean Perier <jperier@nvidia.com>
Co-authored-by: Eric Schweitz <eschweitz@nvidia.com>
This patch adds conversion for primitive operations on complex types.
- fir.addc
- fir.subc
- fir.mulc
- fir.divc
- fir.negc
This also adds the type conversion for the !fir.complex<KIND> type.
This patch is part of the upstreaming effort from fir-dev branch.
Reviewed By: rovka
Differential Revision: https://reviews.llvm.org/D113434
Co-authored-by: Jean Perier <jperier@nvidia.com>
Co-authored-by: Eric Schweitz <eschweitz@nvidia.com>
Rewrite function signatures and calls to functions that accept or return
COMPLEX values.
Also teach insert_value and extract_value about the MLIR ComplexType, by
adding AnyComplex to AnyCompositeLike.
This patch is part of the effort for upstreaming the fir-dev branch.
Differential Revision: https://reviews.llvm.org/D113273
Co-authored-by: Eric Schweitz <eschweitz@nvidia.com>
Co-authored-by: Kiran Chandramohan <kiran.chandramohan@arm.com>
Co-authored-by: Tim Keith <tkeith@nvidia.com>
Co-authored-by: Jean Perier <jperier@nvidia.com>
This patch adds the basic infrastructure for the TargetRewrite pass,
which rewrites certain FIR dialect operations into target-specific
forms. In particular, it converts boxchar function parameters, call
arguments and return values. Other conversions will be included in
future patches.
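To give a feel for what such a signature rewrite looks like, here is a
toy sketch that splits each boxchar parameter into a character pointer
and a length. Types are modelled as plain strings, the function name is
hypothetical, and placing the lengths at the end of the parameter list
as `i64` is an assumption that mirrors a common Fortran calling
convention rather than a statement about every target.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Rewrites a parameter list: every "boxchar" becomes a "char*" in place,
// and one "i64" length per boxchar is appended after all other parameters
// (assumed convention, see the note above).
std::vector<std::string>
splitBoxchars(const std::vector<std::string> &params) {
  std::vector<std::string> rewritten;
  std::size_t lengthCount = 0;
  for (const std::string &param : params) {
    if (param == "boxchar") {
      rewritten.push_back("char*");
      ++lengthCount;
    } else {
      rewritten.push_back(param);
    }
  }
  // Append the trailing length parameters.
  rewritten.insert(rewritten.end(), lengthCount, std::string("i64"));
  return rewritten;
}
```

Call sites need the matching rewrite so that callers and callees agree on
where the pointer and the length live.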
This patch is part of the effort for upstreaming the fir-dev branch.
Differential Revision: https://reviews.llvm.org/D112910
Co-authored-by: Eric Schweitz <eschweitz@nvidia.com>
Co-authored-by: Jean Perier <jperier@nvidia.com>
Co-authored-by: Kiran Chandramohan <kiran.chandramohan@arm.com>
Co-authored-by: Tim Keith <tkeith@nvidia.com>