llvm-project

Author	SHA1	Message	Date
Maksim Levental	46f6df0848	[mlir][NFC] update `flang/Optimizer/Transforms` create APIs (11/n) (#149915 ) See https://github.com/llvm/llvm-project/pull/147168 for more info.	2025-07-21 19:37:17 -04:00
Slava Zakharin	4775b96898	[flang] Optimize redundant array repacking. (#147881 ) This patch allows optimizing redundant array repacking, when the source array is statically known to be contiguous. This is part of the implementation plan for the array repacking feature, though, it does not affect any real life use case as long as FIR inlining is not a thing. I experimented with simple cases of FIR inlining using `-inline-all`, and I recorded these cases in optimize-array-repacking.fir tests.	2025-07-14 09:41:42 -07:00
Sergio Afonso	7e8b3fea43	[Flang] Add missing dependent dialects to MLIR passes (#139260 ) This patch updates several passes to include the DLTI dialect, since their use of the `fir::support::getOrSetMLIRDataLayout()` utility function could, in some cases, require this dialect to be loaded in advance. Also, the `CUFComputeSharedMemoryOffsetsAndSize` pass has been updated with a dependency to the GPU dialect, as its invocation to `cuf::getOrCreateGPUModule()` would result in the same kind of error if no other operations or attributes from that dialect were present in the input MLIR module.	2025-05-13 16:17:49 +01:00
Slava Zakharin	2d12d31f44	[flang] Propagate contiguous attribute through HLFIR. (#138797 ) This change allows marking more designators producing an opaque box with 'contiguous' attribute, e.g. like in test1 case in flang/test/HLFIR/propagate-contiguous-attribute.fir. This would make isSimplyContiguous() return true for such designators allowing merging hlfir.eval_in_mem with hlfir.assign where the LHS is a contiguous array section. Depends on #139003	2025-05-12 18:33:47 -07:00
Slava Zakharin	0ac8cb1b3d	[flang] Recognize fir.pack_array in LoopVersioning. (#133191 ) This change enables LoopVersioning when `fir.pack_array` is met in the def-use chain. It fixes a couple of huge performance regressions caused by enabling `-frepack-arrays`.	2025-03-31 11:41:43 -07:00
Michael Kruse	b815a3942a	[Flang] Move non-common headers to FortranSupport (#124416 ) Move non-common files from FortranCommon to FortranSupport (analogous to LLVMSupport) such that * declarations and definitions that are only used by the Flang compiler, but not by the runtime, are moved to FortranSupport * declarations and definitions that are used by both ("common"), the compiler and the runtime, remain in FortranCommon * generic STL-like/ADT/utility classes and algorithms remain in FortranCommon This allows for a cleaner separation between compiler and runtime components, which are compiled differently. For instance, runtime sources must not use STL's `<optional>` which causes problems with CUDA support. Instead, the surrogate header `flang/Common/optional.h` must be used. This PR fixes this for `fast-int-sel.h`. Declarations in include/Runtime are also used by both, but are header-only. `ISO_Fortran_binding_wrapper.h`, a header used by compiler and runtime, is also moved into FortranCommon.	2025-02-06 15:29:10 +01:00
agozillon	4186805060	[Flang][MLIR] Extend DataLayout utilities to have basic GPU Module support (#123149 ) As there are now certain areas where we may have either a ModuleOp or a GPUModuleOp, both of which can have DataLayouts, and we may need the DataLayout utilities in these areas, I've taken the liberty of trying to extend them for use with both. Those with more knowledge of how they wish the GPUModuleOp's to interact with their parent ModuleOp's DataLayout may have further alterations they wish to make in the future, but for the moment, it'll simply utilise the basic data layout construction which I believe combines parent and child datalayouts from the ModuleOp and GPUModuleOp. If there is no GPUModuleOp DataLayout it should default to the parent ModuleOp. It's worth noting there is some weirdness if you have two module operations defining builtin dialect DataLayout Entries: it appears the combinatorial functionality for DataLayouts doesn't support the merging of these. This behaviour is useful for areas like: https://github.com/llvm/llvm-project/pull/119585/files#diff-19fc4bcb38829d085e25d601d344bbd85bf7ef749ca359e348f4a7c750eae89dR1412 where we have a crossroads between the two different module operations.	2025-01-30 17:31:50 +01:00
Slava Zakharin	711419e302	[flang] Enable loop-versioning for slices. (#120344 ) Loops resulting from array expressions like array(:,i) may be versioned for the unit stride of the innermost dimension, when the initial array is an assumed-shape array (which are contiguous in many Fortran programs). This speeds up facerec by about 12% due to further vectorization of the innermost loop produced for the total SUM reduction.	2024-12-23 07:53:10 -08:00
jeanPerier	c4204c0b29	[flang] replace fir.complex usages with mlir complex (#110850 ) Core patch of https://discourse.llvm.org/t/rfc-flang-replace-usages-of-fir-complex-by-mlir-complex-type/82292. After that, the last step is to remove fir.complex from FIR types.	2024-10-03 17:10:57 +02:00
Abid Qadeer	d07dc73bcf	[flang][debug] Support derived types. (#99476 ) This PR adds initial debug support for derived type. It handles `RecordType` and generates appropriate `DICompositeTypeAttr`. The `TypeInfoOp` is used to get information about the parent and location of the derived type. We use `getTypeSizeAndAlignment` to get the size and alignment of the components of the derived types. This function needed a few changes to be suitable to be used here: 1. The `getTypeSizeAndAlignment` errored out on unsupported type which would not work with incremental way we are building debug support. A new variant of this function has been added that returns an std::optional. The original function has been renamed to `getTypeSizeAndAlignmentOrCrash` as it will call `TODO()` for unsupported types. 2. The Character type was returning size of just element and not the whole string which has been fixed. The testcase checks for offsets of the components which had to be hardcoded in the test. So the testcase is currently enabled on x86_64. With this PR in place, this is how the debugging of derived types looks: ``` type :: t_date integer :: year, month, day end type type :: t_address integer :: house_number end type type, extends(t_address) :: t_person character(len=20) name end type type, extends(t_person) :: t_employee type(t_date) :: hired_date real :: monthly_salary end type type(t_employee) :: employee (gdb) p employee $1 = ( t_person = ( t_address = ( house_number = 1 ), name = 'John', ' ' <repeats 16 times> ), hired_date = ( year = 2020, month = 1, day = 20 ), monthly_salary = 3.1400001 ) ```	2024-08-27 10:30:49 +01:00
Christian Sigg	fac349a169	Reapply "[mlir] Mark `isa/dyn_cast/cast/...` member functions depreca… (#90406 ) …ted. (#89998)" (#90250) This partially reverts commit 7aedd7dc754c74a49fe84ed2640e269c25414087. This change removes calls to the deprecated member functions. It does not mark the functions deprecated yet and does not disable the deprecation warning in TypeSwitch. This seems to cause problems with MSVC.	2024-04-28 22:01:42 +02:00
dyung	7aedd7dc75	Revert "[mlir] Mark `isa/dyn_cast/cast/...` member functions deprecated. (#89998 )" (#90250 ) This reverts commit 950b7ce0b88318f9099e9a7c9817d224ebdc6337. This change is causing build failures on a bot https://lab.llvm.org/buildbot/#/builders/216/builds/38157	2024-04-26 12:09:13 -07:00
Christian Sigg	950b7ce0b8	[mlir] Mark `isa/dyn_cast/cast/...` member functions deprecated. (#89998 ) See https://mlir.llvm.org/deprecation and https://discourse.llvm.org/t/preferred-casting-style-going-forward.	2024-04-26 16:28:30 +02:00
Tom Eccles	08dc03c570	[flang][NFC] Use tablegen to create LoopVersioning constructor (#90037 ) The pass is currently defined as only considering function arguments as candidates for the optimization. I would prefer to generalise the pass for other top level operations only when there is a concrete use case before making too many assumptions about the current set of top level operations. Therefore I have not adapted this pass to run on all top level operations.	2024-04-26 10:54:46 +01:00
Slava Zakharin	3e47e75feb	[flang] Use DataLayout for computing type size in LoopVersioning. (#79778 ) The existing type size computation in LoopVersioning does not work for REAL*10, because the computed element size is 10 bytes, which violates the power-of-two assertion. We'd better use the DataLayout for computing the storage size of each element of an array of the given type.	2024-01-29 09:14:47 -08:00
David Green	49212d1601	[Flang] Fix for replacing loop uses in LoopVersioning pass (#77899 ) The added test case has a loop that is versioned, which has a use of the loop in an if block after the loop. The current code replaces all uses of the loop with the new version If, but only if the parent blocks match. As far as I can see it should be safe to replace all the uses, then construct the result for the If with op.op.	2024-01-20 22:16:05 +00:00
David Green	4056287d3a	[Flang] Clean up LoopVersioning LLVM_DEBUG blocks. NFC (#77818 ) Just a little trick to put LLVM_DEBUG blocks into separate { } scopes, so they clang-format better.	2024-01-15 11:23:50 +00:00
Valentin Clement	208a4510d4	[flang][NFC] Fix typo	2023-11-17 10:54:45 -08:00
Mats Petersson	8dcee5800c	[flang]Check for dominance in loop versioning (#68797 ) This avoids trying to version loops that can't be versioned, and thus avoids hitting an assert. Co-authored with Slava Zakharin (who provided the test-code).	2023-10-12 13:07:16 +01:00
Slava Zakharin	7beb65ae2d	[flang] Fixed LoopVersioning for array slices. (#65703 ) The first test case added in the LIT test demonstrates the problem. Even though we did not consider the inner loop as a candidate for the transformation due to the array_coor with a slice, we decided to version the outer loop for the same function argument. During the cloning of the outer loop we dropped the slicing completely producing invalid code. I restructured the code so that we record all arg uses that cannot be transformed (regardless of the reason), and then fixup the usage information across the loop nests. I also noticed that we may generate redundant contiguity checks for the inner loops, so I fixed it since it was easy with the new way of keeping the usage data.	2023-09-08 09:01:10 -07:00
Tom Eccles	ad9af7de90	[flang][LoopVersioning] support fir.array_coor This is the last piece required for the loop versioning patch to work on code lowered via HLFIR. With this patch, HLFIR performance on spec2017 roms is now similar to the FIR lowering. Adding support for fir.array_coor means that many more loops will be versioned, even in the FIR lowering. So far as I have seen, these do not seem to have an impact on performance for the benchmarks I tried, but I expect it would speed up some programs, if the loop being versioned happened to be the hot code. The main difference between fir.array_coor and fir.coordinate_of is that fir.coordinate_of uses zero-based indices, whereas fir.array_coor uses the indices as specified in the Fortran program (starting from 1 by default, but also supporting non default lower bounds). I opted to transform fir.array_coor operations into fir.coordinate_of operations because this allows both to share the same offset calculation logic. The tricky bit of this patch is getting the correct lower bounds for the array operand to subtract from the fir.array_coor indices to get zero-based indices. So far as I can tell, the FIR lowering will always provide lower bounds (shift) information in the shape operand to the fir.array_coor when non-default lower bounds are used. If none is given, I originally tried falling back to reading lower bounds from the box, but this led to miscompilation in SPEC2017 cam4. Therefore the pass instead assumes that if it can't already find an SSA value for the shift information, the default lower bound (1) should be used. I suspect the incorrect lower bounds in the box for the FIR lowering was already a known issue (see https://reviews.llvm.org/D158119). Differential Revision: https://reviews.llvm.org/D158597	2023-09-04 10:40:40 +00:00
Slava Zakharin	cccf4d6e4a	[flang] Skip OPTIONAL arguments in LoopVersioning. This patch fixes multiple tests failing with segfault due to accessing absent argument box before the loop versioning check. The absent arguments might be treated as contiguous for the purpose of loop versioning, but this is not done in this patch. Reviewed By: PeteSteinfeld Differential Revision: https://reviews.llvm.org/D158800	2023-08-25 08:33:49 -07:00
Tom Eccles	8d24b7322e	[flang][LoopVersioning] support reboxed operands Since https://reviews.llvm.org/D158119, many boxes lowered via HLFIR are reboxed with better lower bounds information after they are declared. For the loop versioning pass to support FIR lowered via HLFIR, it needs to dereference fir.rebox operations to figure out that the variable was a function argument. I decided to modify the existing dereferencing of fir.declare so that the declared/reboxed value is used in the versioned loop instead of the function argument. This makes it easier for the improved lower bounds information to be accessed. In doing this, I changed ArgInfo to store ArgInfo::arg by value instead of by pointer because mlir::Value has value-type semantics. Differential Revision: https://reviews.llvm.org/D158408	2023-08-23 09:53:05 +00:00
Slava Zakharin	668f261bfa	[flang] Make ISO_Fortran_binding.h a standalone header again. This implements the proposal from https://discourse.llvm.org/t/adding-flang-specific-header-files-to-clang/72442/6 Since ISO_Fortran_binding.h is supposed to be included from users' C/C++ codes, it would better have no dependencies on other header files. Reviewed By: PeteSteinfeld Differential Revision: https://reviews.llvm.org/D158549	2023-08-22 18:56:27 -07:00
Tom Eccles	05011024fd	[flang][LoopVersioning] support fir.declare When FIR comes from HLFIR, there will be a fir.declare operation between the source and the usage of each source variable (and some temporary allocations). This pass needs to be able to follow these so that it can still transform loops when HLFIR is used, otherwise it mistakenly assumes these values are not function arguments. More work is needed after this patch to fully support HLFIR, because the generated code tends to use fir.array_coor instead of fir.coordinate_of. Differential Revision: https://reviews.llvm.org/D157964	2023-08-18 09:51:22 +00:00
Tom Eccles	53cc33b00b	[flang] Store KindMapping by value in FirOpBuilder Previously only a constant reference was stored in the FirOpBuilder. However, a lot of code was merged using FirOpBuilder builder{rewriter, getKindMapping(mod)}; This is incorrect because the KindMapping returned will go out of scope as soon as FirOpBuilder's constructor has run. This led to an infinite loop running some tests using HLFIR (because the stack space containing the kind mapping was re-used and corrupted). One solution would have just been to fix the incorrect call sites, however, as a large number of these had already made it past review, I decided to instead change FirOpBuilder to store its own copy of the KindMapping. This is not costly because nearly every time we construct a KindMapping, it is exclusively to construct a FirOpBuilder. To make this common pattern simpler, I added a new constructor to FirOpBuilder which calls getKindMapping().	Differential Revision: https://reviews.llvm.org/D151881	2023-06-05 09:57:57 +00:00
Mats Petersson	b812932b35	[FLANG] Change loop versioning to use shift instead of divide Despite me being convinced that the use of divide didn't produce any divide instructions, it does in fact add more instructions than using a plain shift operation. This patch simply changes the divide to a shift right, with an assert to check that the "divisor" is a power of two. Reviewed By: kiranchandramohan, tblah Differential Revision: https://reviews.llvm.org/D151880	2023-06-01 19:29:57 +01:00
Mats Petersson	b75f9ce3fe	[FLANG] Support all arrays for LoopVersioning This patch makes more than 2D arrays work, with a fix for the way that loop index is calculated. Removing the restriction of number of dimensions. This also changes the way that the actual index is calculated, such that the stride is used rather than the extent of the previous dimension. Some tests failed without fixing this - this was likely a latent bug in the 2D version too, but found in a test using 3D arrays, so wouldn't have been found with 2D only. This introduces a division on the index calculation - however it should be a nice and constant value allowing a shift to be used to actually divide - or otherwise removed by using other methods to calculate the result. In analysing code generated with optimisation at -O3, there are no divides produced. Some minor refactoring to avoid repeatedly asking for the "rank" of the array being worked on. This improves some of the SPEC-2017 ROMS code, in the same way as the limited 2D array improvements - less overhead spent calculating array indices in the inner-most loop and better use of vector-instructions. Reviewed By: kiranchandramohan Differential Revision: https://reviews.llvm.org/D151140	2023-05-30 18:54:40 +01:00
Mats Petersson	a716ace13d	Add loop-versioning pass to improve unit-stride Introduce conditional code to identify stride of "one element", and simplify the array accesses for that case. This allows better loop performance in various benchmarks. Reviewed By: tblah, kiranchandramohan Differential Revision: https://reviews.llvm.org/D141306	2023-04-18 09:53:07 +01:00

29 Commits
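
The unit-stride versioning idea described in commits a716ace13d and b812932b35 can be illustrated outside the compiler with a small C++ sketch. This is not the pass itself, which rewrites FIR operations; `Box1D`, `sumStrided`, `sumVersioned` and `byteOffsetToIndex` are hypothetical names invented for this example. It assumes a rank-1 array of `double` described by a base address, an extent and a byte stride.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a descriptor ("box") of a rank-1 array.
struct Box1D {
  const char *base;          // address of the first element
  std::ptrdiff_t extent;     // number of elements
  std::ptrdiff_t byteStride; // distance in bytes between consecutive elements
};

// Generic loop: every access scales the index by the runtime byte stride.
inline double sumStrided(const Box1D &box) {
  double total = 0.0;
  for (std::ptrdiff_t i = 0; i < box.extent; ++i)
    total += *reinterpret_cast<const double *>(box.base + i * box.byteStride);
  return total;
}

// Versioned loop: a single runtime check selects a copy of the loop that
// indexes a plain contiguous array, which is easier to vectorize.
inline double sumVersioned(const Box1D &box) {
  if (box.byteStride == static_cast<std::ptrdiff_t>(sizeof(double))) {
    const double *data = reinterpret_cast<const double *>(box.base);
    double total = 0.0;
    for (std::ptrdiff_t i = 0; i < box.extent; ++i)
      total += data[i];
    return total;
  }
  return sumStrided(box); // fall back to the original, unmodified loop
}

// Convert a non-negative byte offset to an element index with a right shift
// instead of a divide. The element size must be a power of two, which is why
// a 10-byte element (see commit 3e47e75feb) cannot use this form directly.
inline std::ptrdiff_t byteOffsetToIndex(std::ptrdiff_t byteOffset,
                                        std::size_t elemSize) {
  assert(elemSize != 0 && (elemSize & (elemSize - 1)) == 0 &&
         "element size must be a power of two");
  unsigned shift = 0;
  while ((std::size_t{1} << shift) < elemSize)
    ++shift;
  return byteOffset >> shift;
}
```

Calling `sumVersioned` on a box whose stride equals `sizeof(double)` takes the contiguous branch, and any other stride falls through to `sumStrided`. Both branches must produce the same result, so the check only affects performance.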