537 Commits

Author SHA1 Message Date
Jakub Kuderski
10c5d75623
[flang] Fix new clang-tidy warning llvm-type-switch-case-types. NFC. (#178512)
Pre-committing this before landing the new check in
https://github.com/llvm/llvm-project/pull/177892
2026-01-28 16:07:23 -05:00
agozillon
a16668a8d7
[Flang][OpenMP][MLIR] Align declare mapper pass handling with other map and global operations (#176852)
This PR makes a couple of minor tweaks to the lowering for
declare_mapper operations:

1) Add declare_mapper operations to the list of global operations that
   have optimisation passes executed on them. This is primarily to keep
   it in line with other global operations that contain regions. It
   prevents oddities where we embed FIR/HLFIR into the mapper that needs
   to be lowered before being converted to LLVM-IR. One example that
   springs to mind is if we ever decide to remove the single block
   condition on the operation to allow conditional checks for mapped
   data.
2) Add a CodeGenOpenMP.cpp conversion for DeclareMapperOp to make sure
   we convert the return type correctly from a BoxType to a struct type
   rather than an opaque pointer when lowering. Currently, I've left out
   the block argument types from being converted as they're wrapped in a
   fir.ref and would be opaque pointers in either case.

So some minor additions to keep declare_mapper a little more in line with
the rest of the OpenMP operations.
2026-01-23 22:36:39 +01:00
Susan Tan (ス-ザン タン)
2698d15664
[flang] Lowering FIR memory ops to MemRef dialect (#173507)
This patch introduces FIRToMemRef, a lowering pass that converts FIR
memory operations to the MemRef dialect, including support for slices,
shifts, and descriptor-style access patterns. To support partial
lowering, where FIR and MemRef types can coexist, we extend the handling
of fir.convert to correctly marshal between FIR reference-like types and
MemRef descriptors. The patch also factors the type conversion logic
into a reusable FIRToMemRefTypeConverter, which centralizes the rules
for converting FIR types (e.g. !fir.ref, !fir.box, sequences, logicals)
to their corresponding memref types, and is used throughout the new
pass.

---------

Co-authored-by: Scott Manley <rscottmanley@gmail.com>
Co-authored-by: jeanPerier <jean.perier.polytechnique@gmail.com>
2026-01-14 10:46:50 -05:00
khaki3
9057744221
[flang] Fix SelectCaseOpConversion to convert block signatures (#175298)
When `fir.select_case` branches to blocks with arguments that have FIR
types (e.g., `!fir.ref`), the block signature must be converted to LLVM
types before creating the branch. Otherwise, the branch passes LLVM
types (`!llvm.ptr`) but the block expects FIR types, causing a type
mismatch error.

This adds block signature conversion similar to what
`SelectOpConversionBase` already does for `fir.select` and
`fir.select_rank`.
2026-01-12 09:45:15 -08:00
Thirumalai Shaktivel
212527c00b
[Flang] Add FIR and LLVM lowering support for prefetch directive (#167272)
Implementation details:
* Add PrefetchOp in FirOps
* Handle PrefetchOp in FIR Lowering and also pass required default
values
* Handle PrefetchOp in CodeGen.cpp
* Add required tests
2026-01-05 13:24:10 +05:30
Victor Chernyakin
c438773432
[LLVM][ADT] Migrate users of make_scope_exit to CTAD (#174030)
This is a followup to #173131, which introduced the CTAD functionality.
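
As a rough illustration of what such a migration looks like, the sketch below contrasts a factory-function style with class template argument deduction (CTAD, C++17). `ScopeExit`, `makeScopeExit` and `traceBoth` are hypothetical stand-ins written for this example; they are not LLVM's actual classes or signatures.

```cpp
#include <string>
#include <utility>

// Hypothetical stand-in for an RAII scope-exit helper (not LLVM's class):
// runs the stored callable when the object goes out of scope.
template <typename Callable>
class ScopeExit {
public:
  explicit ScopeExit(Callable fn) : fn_(std::move(fn)) {}
  ScopeExit(const ScopeExit &) = delete;
  ScopeExit &operator=(const ScopeExit &) = delete;
  ~ScopeExit() { fn_(); }

private:
  Callable fn_;
};

// Old style: a factory function exists only to deduce the callable type.
template <typename Callable>
ScopeExit<Callable> makeScopeExit(Callable fn) {
  return ScopeExit<Callable>(std::move(fn));
}

// Uses both spellings and records the order in which the callbacks fire.
std::string traceBoth() {
  std::string log;
  {
    auto oldStyle = makeScopeExit([&log] { log += "factory;"; });
    // New style: CTAD deduces Callable from the constructor argument.
    ScopeExit newStyle([&log] { log += "ctad;"; });
    log += "body;";
  } // Destructors run in reverse order of construction.
  return log;
}
```

Both spellings produce the same object type; the CTAD form simply drops the helper function.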
2026-01-02 20:42:56 -08:00
Abid Qadeer
fc9e6e13fd
[flang] Represent use statement in fir. (#168106)
We have a longstanding issue in debug info where the use statement is not
fully respected. The problem has been described in
https://github.com/llvm/llvm-project/issues/160923. This is the first part
of the effort to address this issue. This PR adds infrastructure to emit
`use` statement information in FIR, which will be used by subsequent
patches to generate DWARF debug information.

Information about use statements is collected during semantic
analysis and stored in `PreservedUseStmt` objects. During lowering,
a `fir.use_stmt` operation is emitted for each `PreservedUseStmt`
object. The `fir.use_stmt` operation captures the module name, `only`
list symbols, and any renames specified in the use statement. The
`fir.use_stmt` is removed during `CodeGen`.
2026-01-02 12:10:18 +00:00
Susan Tan (ス-ザン タン)
01c3e25586
[flang] restrict fir.convert lowering (#172117)
Restrict lowering of fir.convert and exclude core memref types from it.
This is in preparation for a lowering that accommodates the MemRef dialect.
2025-12-15 11:52:18 -05:00
Abid Qadeer
0e8222b84b
[flang][debug] Make common blocks data extraction more robust. (#168752)
Our current implementation for extracting information about common blocks
required traversal of FIR, which was not ideal, but previously there was
no other way to obtain that information. The `[hl]fir.declare` was
extended in commit https://github.com/llvm/llvm-project/pull/155325 to
include storage and storage_offset. This commit adds these operands to
`fircg.ext_declare` and then uses them in `AddDebugInfoPass` to create
debug data for common blocks.
2025-11-20 14:28:56 +00:00
Jean-Didier PAILLEUX
3b83e7fa4e
[flang] Implement !DIR$ IVDEP directive (#133728)
This directive tells the compiler to ignore vector dependencies in the
following loop. It must be placed before a `do` loop.

Sometimes the compiler may not have sufficient information to decide
whether a particular loop is vectorizable due to potential dependencies
between iterations. The directive tells the compiler that vectorization
is safe, which is expressed with `parallelAccesses` metadata.

This directive is also equivalent to `#pragma clang loop assume(safety)`
in C++.
2025-11-14 14:06:46 +01:00
Abid Qadeer
cfc56c982f
[flang][debug] Track dummy argument positions explicitly. (#167489)
CHARACTER dummy arguments were treated as local variables in debug info.
This happened because our method to get the argument number was not
robust. It relied on `DeclareOp` having a direct reference to arguments
which was not the case for character arguments. This is fixed by storing
source-level argument positions in `DeclareOp`.

Fixes #112886
2025-11-12 10:21:32 +00:00
Jacques Pienaar
a51c1f89ff
[mlir] Remove deprecated GEN_PASS_CLASSES (#167496)
Update CIR & flang in preparation for the removal in #166904 (split out
from there; that PR has sufficient approvals, but I wanted to allow more
time before removing the feature).
2025-11-11 13:12:45 +00:00
Kazu Hirata
ee0652b4da
[flang] Remove unused local variables (NFC) (#167105)
Identified with bugprone-unused-local-non-trivial-variable.
2025-11-08 07:40:59 -08:00
Valentin Clement (バレンタイン クレメン)
0589409b64
[flang][cuda] Support gpu.launch_func with async token in target rewrite pass (#165485)
2025-10-28 20:19:28 -07:00
Jean-Didier PAILLEUX
c1779f33bd
[flang] Implement !DIR$ [NO]INLINE and FORCEINLINE directives (#134350)
This patch adds support for these three directives: `!dir$ inline`,
`!dir$ noinline` and `!dir$ forceinline`.
- `!dir$ noinline` tells the compiler not to perform inlining on
specific function calls by adding the `noinline` metadata on the call.
- `!dir$ inline` tells the compiler to attempt inlining on specific
function calls by adding the `inlinehint` metadata on the call.
- `!dir$ forceinline` tells the compiler to always perform inlining on
specific function calls by adding the `alwaysinline` metadata on the
call.

Currently, these directives can be placed before a `DO` loop, function
calls, or assignments. Other statements may be added in the future if
needed.

For the `inline` directive the correct name might be `forceinline`, but
I'm not sure.
2025-10-28 08:02:15 +01:00
Valentin Clement (バレンタイン クレメン)
47ea8543e2
[flang] Update target rewrite to support workgroup and private attributions (#164515)
Some operations like gpu.func have arguments that need to stay in
place while rewriting the signature. This is the case for the workgroup
and private attributions.
Update the target rewrite pass to be aware of that when adding an argument
at the end of the function signature. If any trailing arguments are
present, the new argument will be inserted just before them.
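
The insertion rule described above can be sketched on a plain list of argument names. This is only an illustration under assumptions: `addArgument` and `numTrailing` are hypothetical names, and the real pass operates on MLIR function signatures rather than strings.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: inserts newArg before the last numTrailing entries
// of args (the arguments that must stay at the end of the signature).
// With numTrailing == 0 this is a plain append.
std::vector<std::string> addArgument(std::vector<std::string> args,
                                     const std::string &newArg,
                                     std::size_t numTrailing) {
  // Clamp so that a too-large count cannot move the iterator before begin().
  std::size_t keep = numTrailing > args.size() ? args.size() : numTrailing;
  args.insert(args.end() - static_cast<std::ptrdiff_t>(keep), newArg);
  return args;
}
```

For example, with two trailing attribution arguments, a new argument lands in front of both of them rather than after them.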
2025-10-22 09:48:10 -07:00
Jakub Kuderski
23ead47655
[flang][mlir] Migrate to free create functions. NFC. (#164657)
See
https://discourse.llvm.org/t/psa-opty-create-now-with-100-more-tab-complete/87339.

I plan to mark these as deprecated in
https://github.com/llvm/llvm-project/pull/164649.
2025-10-22 12:47:48 -04:00
jeanPerier
c9fb37c75f
[flang][FIR] add fir.assumed_size_extent to abstract assumed-size extent encoding (#164452)
The purpose of this patch is to allow converting FIR array representation to
memref when possible without hitting memref verifier issue.

The issue was that FIR arrays may be assumed size, in which case the
last dimension will not be known at runtime. Flang uses -1 to encode
this to fulfill Fortran 2023 standard requirements in 18.5.3 point 5
about CFI_desc_t.

When arrays are converted to memref, if this `-1` reaches memref
operations, it triggers verifier errors (even if the conversion happened
in code that is guarded so that it is not entered at runtime when the
array is assumed-size, because folders/verifiers do not take
reachability into account).

This follows up on the discussions in #163505.
2025-10-22 11:46:18 +02:00
Valentin Clement (バレンタイン クレメン)
7b10e977f8
[flang][cuda] Do not fail if global is not found (#163445)
2025-10-14 20:51:42 +00:00
Valentin Clement (バレンタイン クレメン)
6a7754f2ac
[flang][cuda] Set address space for constant variables (#163430)
Set the correct address space for constant variables. Address of
operation will introduce an address cast.
2025-10-14 19:16:26 +00:00
Valentin Clement (バレンタイン クレメン)
1c8cd1ed97
[flang][cuda] Add a TODO for code generation of CONSTANT variable (#163268)
2025-10-13 21:52:31 +00:00
Alexey Bataev
0ca23a3054
[Flang]Fix the build with the EXPENSIVE_CHECKS enabled (#162541)
2025-10-08 16:35:23 -04:00
Valentin Clement (バレンタイン クレメン)
e47a42e5b5
[flang][cuda] Do not use managed memory inside gpu module (#160730)
Do not issue a call to _FortranACUFAllocDescriptor inside a gpu module.
2025-09-25 16:51:49 +00:00
Valentin Clement (バレンタイン クレメン)
37de695cb1
[flang][cuda] Make sure global device descriptor is allocated in managed memory (#160596)
When the descriptor of a global device variable is re-materialized to be
passed to a kernel, make sure it is allocated in managed memory
otherwise the kernel launch will fail.
2025-09-24 20:32:59 +00:00
agozillon
046d6a3998
[Flang][OpenMP] Additional global address space modifications for device (#119585)
A prior PR added a portion of the global address space modifications
required for declare target to; this PR seeks to add a small amount more
left over from that PR.

The intent is to allow for more correct IR that the backends (in
particular AMDGPU) can treat more aptly for optimisations and code
correctness.

1/3 required PRs to enable declare target to mapping, should look at PR
3/3 to check for full green passes (this one will fail a number due to
some dependencies).

Co-authored-by: Raghu Maddhipatla <raghu.maddhipatla@amd.com>
2025-09-17 03:27:03 +02:00
Fabian Mora
48babe1931
[mlir][LLVM] Add LLVMAddrSpaceAttrInterface and NVVMMemorySpaceAttr (#157339)
This patch introduces the `LLVMAddrSpaceAttrInterface` for defining
compatible LLVM address space attributes

To test this interface, this patch also:
- Adds NVVMMemorySpaceAttr implementing both LLVMAddrSpaceAttrInterface
and MemorySpaceAttrInterface
- Converts NVVM memory space constants from enum to MLIR enums
- Updates all NVVM memory space references to use new attribute system
- Adds support for NVVM memory spaces in ptr dialect translation

Example:
```mlir
llvm.func @nvvm_ptr_address_space(
    !ptr.ptr<#nvvm.memory_space<global>>,
    !ptr.ptr<#nvvm.memory_space<shared>>,
    !ptr.ptr<#nvvm.memory_space<constant>>,
    !ptr.ptr<#nvvm.memory_space<local>>,
    !ptr.ptr<#nvvm.memory_space<tensor>>,
    !ptr.ptr<#nvvm.memory_space<shared_cluster>>
  ) -> !ptr.ptr<#nvvm.memory_space<generic>>
```
Translating the above code to LLVM produces:
```llvm
declare ptr @nvvm_ptr_address_space(ptr addrspace(1), ptr addrspace(3), ptr addrspace(4), ptr addrspace(5), ptr addrspace(6), ptr addrspace(7))
```


To convert the memory space enum to the new enum class use:
```bash
grep -r . -e "NVVMMemorySpace::kGenericMemorySpace" -l | xargs sed -i -e "s/NVVMMemorySpace::kGenericMemorySpace/NVVMMemorySpace::Generic/g"
grep -r . -e "NVVMMemorySpace::kGlobalMemorySpace" -l | xargs sed -i -e "s/NVVMMemorySpace::kGlobalMemorySpace/NVVMMemorySpace::Global/g"
grep -r . -e "NVVMMemorySpace::kSharedMemorySpace" -l | xargs sed -i -e "s/NVVMMemorySpace::kSharedMemorySpace/NVVMMemorySpace::Shared/g"
grep -r . -e "NVVMMemorySpace::kConstantMemorySpace" -l | xargs sed -i -e "s/NVVMMemorySpace::kConstantMemorySpace/NVVMMemorySpace::Constant/g"
grep -r . -e "NVVMMemorySpace::kLocalMemorySpace" -l | xargs sed -i -e "s/NVVMMemorySpace::kLocalMemorySpace/NVVMMemorySpace::Local/g"
grep -r . -e "NVVMMemorySpace::kTensorMemorySpace" -l | xargs sed -i -e "s/NVVMMemorySpace::kTensorMemorySpace/NVVMMemorySpace::Tensor/g"
grep -r . -e "NVVMMemorySpace::kSharedClusterMemorySpace" -l | xargs sed -i -e "s/NVVMMemorySpace::kSharedClusterMemorySpace/NVVMMemorySpace::SharedCluster/g"
```

NOTE: A future patch will add support for ROCDL, it wasn't added here to
keep the patch small.
2025-09-14 09:05:28 -04:00
agozillon
8f16af3c20
[Flang][OpenMP] Fix mapping of character type with LEN > 1 specified (#154172)
Currently, there are a number of issues with mapping characters with LENs
specified (strings effectively). They're represented as a char type in
FIR with a len parameter, and then later on they're expanded into an
array of characters when we're translating to the LLVM dialect. However,
we don't generate bounds for these at lowering. The fix in this PR is
to generate bounds from the LEN parameter and attach them to
the map on lowering from FIR to the LLVM dialect when we encounter this
type.
2025-09-09 16:36:04 +02:00
jeanPerier
3beec2f687
[flang] do not rely on existing fir.convert in TargetRewrite (#157413)
TargetRewrite is doing a shallow rewrite of function signatures. It is
only rewriting function definitions (FuncOp), calls (CallOp) and
AddressOfOp. It is not trying to visit each operation that may have an
operand with a function type.
It therefore needs function signature casts around the operations it is
rewriting.

Previously, these casts were not inserted after AddressOfOp rewrites
because lowering tends to always insert a function cast to the void type
after generating AddressOfOp, so the pass relied on implicitly updating
this cast's operand type to get the required cast. This is brittle because
there is no guarantee that such a convert is present, and canonicalization
and other passes may remove it.

Insert a cast on the result of rewritten operations. If it is
redundant, it will be canonicalized away later.
2025-09-08 17:22:25 +02:00
jeanPerier
355dbbc37c
[flang][FIR] enable fir.box_addr codegen inside fir.global (#157120)
FIR lowering of the fir.box type inside fir.global is special (it is an
actual descriptor struct value instead of being a descriptor in memory)
and causes builtin.unrealized_conversion_cast to be inserted under the
hood by the MLIR dialect conversion framework after each operation
producing a fir.box is translated. These builtin.unrealized_conversion_cast
ops must be removed before the code generation of operations using the
fir.box in order to get the right "by value" code generation required in
global initial value definitions.
2025-09-08 10:15:22 +02:00
Chaitanya
4a3bf27c69
[OpenMP] Introduce omp.target_allocmem and omp.target_freemem omp dialect ops. (#145464)
This PR introduces two new ops in the omp dialect, omp.target_allocmem and
omp.target_freemem.
omp.target_allocmem: Allocates heap memory on device. Will be lowered to
an omp_target_alloc call in llvm.
omp.target_freemem: Deallocates heap memory on device. Will be lowered
to an omp_target_free call in llvm.


Example:
  %1 = omp.target_allocmem %device : i32, i64
  omp.target_freemem %device, %1 : i32, i64

The work in this PR is C-P/inspired from @ivanradanov commit from
coexecute implementation:
[Add fir omp target alloc and free
ops](be860ac8ba)
[Lower omp_target_{alloc,free} to
llvm](6e2d584dc9)
2025-08-18 18:15:11 +05:30
Slava Zakharin
b8e4232bd2
[flang] Cast fir.select[_rank] selector to i64. (#153239)
Properly cast the selector to `i64` regardless of its integer type.
We used to generate llvm.trunc always.

We have to use `i64` as long as the case values may exceed INT_MAX.

Fixes #153050.
2025-08-12 16:43:44 -07:00
Valentin Clement (バレンタイン クレメン)
3847620ba9
[flang][NFC] Move the rest of ops creation to new APIs (#152079)
2025-08-05 07:27:43 -07:00
Valentin Clement (バレンタイン クレメン)
3b23fdb35d
[flang][NFC] Update more FIR op creation to the new APIs (#152060)
2025-08-04 17:53:44 -07:00
Maksim Levental
dcfc853c51
[mlir][NFC] update flang/lib create APIs (12/n) (#149914)
See https://github.com/llvm/llvm-project/pull/147168 for more info.
2025-07-24 19:05:40 -04:00
Diego Caballero
c99c213e72
[mlir][Flang][NFC] Replace use of vector.insertelement/extractelement (#143272)
This PR is part of the last step to remove `vector.extractelement` and
`vector.insertelement` ops (RFC:
https://discourse.llvm.org/t/rfc-psa-remove-vector-extractelement-and-vector-insertelement-ops-in-favor-of-vector-extract-and-vector-insert-ops).
It replaces `vector.insertelement` and `vector.extractelement` with
`vector.insert` and `vector.extract` in Flang. It looks like no lit
tests are impacted?
2025-07-18 14:43:03 -07:00
Kelvin Li
df56b1a2cf
[flang] handle allocation of zero-sized objects (#149165)
This PR handles the allocation of zero-sized objects for different
implementations. One byte is allocated for the zero-sized objects.
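
A minimal sketch of the idea, assuming a `malloc`-style allocator: since `malloc(0)` is allowed to return a null pointer, the requested size is bumped to one byte so that a zero-sized object still gets a valid, freeable address. The helper names below are hypothetical and are not taken from the patch.

```cpp
#include <cstddef>
#include <cstdlib>

// Hypothetical helper: never request fewer than one byte.
std::size_t adjustedAllocSize(std::size_t requested) {
  return requested == 0 ? 1 : requested;
}

// Allocates storage for an object of the requested size. A zero-sized
// object still receives one byte, so the result can be passed to free().
void *allocateObject(std::size_t requested) {
  return std::malloc(adjustedAllocSize(requested));
}
```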
2025-07-17 23:52:48 -04:00
Slava Zakharin
bcee18a2e2
[flang] Handle SEQUENCE derived types for array repacking. (#148777)
It is possible that a non-polymorphic dummy argument
has a dynamic type that does not match its static type
in a valid Fortran program, e.g. when the actual and
the dummy arguments have different compatible derived
SEQUENCE types:
module mod
  type t
    sequence
    integer x
  end type
contains
  subroutine test(x)
    type t
      sequence
      integer x
    end type
    type(t) :: x(:)
  end subroutine
end module

'test' may be called with an actual argument of type 'mod::t',
which is the dynamic type of 'x' on entry to 'test'.
If we create the repacking temporary based on the static type of 'x'
('test::t'), then the runtime will report the type mismatch
as an error. Thus, we have to create the temporary using
the dynamic type of 'x'. The fact that the dummy's type
has SEQUENCE or BIND attribute is not easily computable
at this stage, so we use the dynamic type for all derived
type cases. As long as this is done only when the repacking
actually happens, the overhead should not be noticeable.
2025-07-16 12:11:15 -07:00
Akash Banerjee
dbb12109b9
[OpenMP] Add TargetAMDGPU support for Complex argument and return types (#144924)
2025-07-16 14:00:06 +01:00
Akash Banerjee
fc114e4d93
[MLIR] Add ComplexTOROCDLLibraryCalls pass (#144926)
2025-07-16 13:59:41 +01:00
Razvan Lupusoru
c4fc358156
[flang][acc][nfc] Move FIROpenACCSupport to Support subfolder (#148710)
In order to prepare for adding FIROpenACCTransforms, move the FIR
OpenACC support library to its own subfolder.
2025-07-14 13:42:20 -07:00
Tom Eccles
9a805ba169
[flang][NFC] Fix deprecation warning (#147932)
I started getting deprecation warnings from operations constructors
which seem to be doing implicit construction of mlir::ValueRange from a
std::nullopt by relying on implicit conversion from std::nullopt into
llvm::ArrayRef. ArrayRef{std::nullopt} is what has been deprecated.
2025-07-11 10:37:34 +01:00
Kareem Ergawy
eba35cc1c0
[flang][do concurrent] Re-model reduce to match reductions are modelled in OpenMP and OpenACC (#145837)
This PR proposes re-modelling `reduce` specifiers to match OpenMP and
OpenACC. In particular, this PR includes the following:

* A new `fir` op: `fir.declare_reduction` which is identical to OpenMP's
`omp.declare_reduction` op.
* Updating the `reduce` clause on `fir.do_concurrent.loop` to use the
new op.
* Re-uses the `ReductionProcessor` component to emit reductions for `do
concurrent` just like we do for OpenMP. To do this, the
`ReductionProcessor` had to be refactored to be more generalized.
* Updates mapping `do concurrent` to `fir.loop ... unordered` nests using
the new reduction model.

Unfortunately, this is a big PR that would be difficult to divide up in
smaller parts because the bottom of the changes are the `fir` table-gen
changes to `do concurrent`. However, doing these MLIR changes cascades
to the other parts that have to be modified to not break things.

This PR goes in the same direction we went for `private/local`
specifiers. Now the `do concurrent` and OpenMP (and OpenACC) dialects
are modelled in essentially the same way, which hopefully makes mapping
between them more trivial.

PR stack:
- https://github.com/llvm/llvm-project/pull/145837 (this one)
- https://github.com/llvm/llvm-project/pull/146025
- https://github.com/llvm/llvm-project/pull/146028
- https://github.com/llvm/llvm-project/pull/146033
2025-07-11 06:39:30 +02:00
Daniel Chen
13ead00049
[Flang] Fix PowerPC build failure due to the deprecation of ArrayRef(std::nullopt_t) {}. (#147816)
Our local Flang build on PowerPC was broken as
```
llvm/flang/../mlir/include/mlir/IR/ValueRange.h:401:20: error: 'ArrayRef' is deprecated: Use {} or ArrayRef<T>() instead [-Werror,-Wdeprecated-declarations]
  401 |       : ValueRange(ArrayRef<Value>(std::forward<Arg>(arg))) {}
      |                    ^
llvm/flang/lib/Optimizer/CodeGen/CodeGen.cpp:2243:53: note: in instantiation of function template specialization 'mlir::ValueRange::ValueRange<const std::nullopt_t &, void>' requested here
 2243 |                              /*cstInteriorIndices=*/std::nullopt, fieldIndices,
      |                                                     ^
 llvm/include/llvm/ADT/ArrayRef.h:70:18: note: 'ArrayRef' has been explicitly marked deprecated here
   70 |     /*implicit*/ LLVM_DEPRECATED("Use {} or ArrayRef<T>() instead", "{}")
      |                  ^
llvm/include/llvm/Support/Compiler.h:244:50: note: expanded from macro 'LLVM_DEPRECATED'
  244 | #define LLVM_DEPRECATED(MSG, FIX) __attribute__((deprecated(MSG, FIX)))
      |                                                  ^
1 error generated.
```

This patch is to fix it.
2025-07-10 09:53:03 -04:00
Shunsuke Watanabe
c9900015a9
[flang] Add -fcomplex-arithmetic= option and select complex division algorithm (#146641)
This patch adds an option to select the method for computing complex
number division. It uses `LoweringOptions` to determine whether to lower
complex division to a runtime function call or to MLIR's `complex.div`,
and `CodeGenOptions` to select the computation algorithm for
`complex.div`. The available option values and their corresponding
algorithms are as follows:
- `full`: Lower to a runtime function call. (Default behavior)
- `improved`: Lower to `complex.div` and expand to Smith's algorithm.
- `basic`: Lower to `complex.div` and expand to the algebraic algorithm.

See also the discussion in the following discourse post:
https://discourse.llvm.org/t/optimization-of-complex-number-division/83468
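
To make the difference between the `basic` and `improved` algorithms concrete, here is a small sketch of the two textbook formulations on `std::complex<double>`. It is an illustration only, not the code that the `complex.div` expansion emits; the function names are made up for this example.

```cpp
#include <cmath>
#include <complex>

using Complex = std::complex<double>;

// Algebraic division: multiply by the conjugate of the divisor.
// c*c + d*d can overflow or underflow even when the quotient itself
// is representable.
Complex divideAlgebraic(Complex x, Complex y) {
  double a = x.real(), b = x.imag(), c = y.real(), d = y.imag();
  double denom = c * c + d * d;
  return {(a * c + b * d) / denom, (b * c - a * d) / denom};
}

// Smith's algorithm: divide through by the larger component of the
// divisor first, so the intermediate values stay closer to the operands
// in magnitude.
Complex divideSmith(Complex x, Complex y) {
  double a = x.real(), b = x.imag(), c = y.real(), d = y.imag();
  if (std::fabs(c) >= std::fabs(d)) {
    double r = d / c;
    double denom = c + d * r;
    return {(a + b * r) / denom, (b - a * r) / denom};
  }
  double r = c / d;
  double denom = c * r + d;
  return {(a * r + b) / denom, (b * r - a) / denom};
}
```

For moderate inputs both functions agree up to rounding. For operands near the limits of the floating-point range, the algebraic form can produce non-finite results where Smith's form does not, which is the usual motivation for offering both.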

---------

Co-authored-by: Tarun Prabhu <tarunprabhu@gmail.com>
2025-07-09 13:43:54 +09:00
Kareem Ergawy
b1774222c7
[flang] Emit fir.global in the global address space (#146653)
Instead of emitting globals in the program/default address space, emit
them in the global address space. This also requires changes to how
address-of code-gen is handled: we need to cast to the default address
space to prevent code-gen issues.
2025-07-02 17:15:22 +02:00
jeanPerier
faefe7cf7d
[flang] add option to generate runtime type info as external (#146071)
Reland #145901 with a fix for shared library builds.

So far flang generates runtime derived type info global definitions (as
opposed to declarations) for all the types used in the current
compilation unit even when the derived types are defined in other
compilation units. It uses linkonce_odr to achieve the derived type
descriptor address "uniqueness" needed to match two derived types
inside the runtime.

This comes at a big compile-time cost because of all the extra globals
and their definitions in apps with many and complex derived types.

This patch adds an experimental option to only generate the rtti
definition for the types defined in the current compilation unit and to
only generate external declarations for the derived type descriptor
objects of types defined elsewhere.

Note that objects compiled with this option are not compatible with
object files compiled without it, because files compiled without it may
drop the rtti for types they define if it is not used in the compilation
unit because of the linkonce_odr aspect.

I am adding the option so that we can better measure the extra cost of
the current approach on apps and allow speeding up some compilations
where devirtualization does not matter (and the build config links to
all module file objects anyway).
2025-06-30 09:58:00 +02:00
jeanPerier
37e2d10499
Revert "[flang] add option to generate runtime type info as external" (#146064)
Reverts llvm/llvm-project#145901

Broke shared library builds because of the usage of
`skipExternalRttiDefinition` in Lowering.
2025-06-27 14:05:59 +02:00
jeanPerier
e816817bbb
[flang] add option to generate runtime type info as external (#145901)
So far flang generates runtime derived type info global definitions (as
opposed to declarations) for all the types used in the current
compilation unit even when the derived types are defined in other
compilation units. It uses linkonce_odr to achieve the derived type
descriptor address "uniqueness" needed to match two derived types
inside the runtime.

This comes at a big compile-time cost because of all the extra globals
and their definitions in apps with many and complex derived types.

This patch adds an experimental option to only generate the rtti
definition for the types defined in the current compilation unit and to
only generate external declarations for the derived type descriptor
objects of types defined elsewhere.

Note that objects compiled with this option are not compatible with
object files compiled without it, because files compiled without it may
drop the rtti for types they define if it is not used in the compilation
unit because of the linkonce_odr aspect.

I am adding the option so that we can better measure the extra cost of
the current approach on apps and allow speeding up some compilations
where devirtualization does not matter (and the build config links to
all module file objects anyway).
2025-06-27 13:00:29 +02:00
Kazu Hirata
938cdb30f1
[flang] Migrate away from std::nullopt (NFC) (#145928)
ArrayRef has a constructor that accepts std::nullopt.  This
constructor dates back to the days when we still had llvm::Optional.

Since the use of std::nullopt outside the context of std::optional is
kind of an abuse and not intuitive to newcomers, I would like to move
away from the constructor and eventually remove it.

This patch replaces std::nullopt with {}.  There are a couple of
places where std::nullopt is replaced with TypeRange() to accommodate
perfect forwarding.
2025-06-26 12:41:49 -07:00
Slava Zakharin
8631b4f1b4
[flang] Set low probability for array repacking code. (#144830)
This allows LLVM to place the blocks that do the repacking, which are
most probably cold, out of line from the potentially hot code.
2025-06-19 12:12:04 -07:00