llvm-project

Author	SHA1	Message	Date
Akash Banerjee	3b10b9a2b0	[MLIR][OpenMP] Add lowering support for AUTOMAP modifier (#151513 ) Add Automap modifier to the MLIR op definition for the DeclareTarget directive's Enter clause. Also add lowering support in Flang. Automap Ref: OpenMP 6.0 section 7.9.7.	2025-08-11 12:45:22 +01:00
Krzysztof Parzyszek	43db6c5cc1	[flang][OpenMP] General utility to get directive id from AST node (#150121 ) Fortran::parser::omp::GetOmpDirectiveName(t) will get the OmpDirectiveName object that corresponds to construct t. That object (an AST node) contains the enum id and the source information of the directive. Replace uses of extractOmpDirective and getOpenMPDirectiveEnum with the new function.	2025-07-23 08:25:33 -05:00
Maksim Levental	a3a007ad5f	[mlir][NFC] update `flang/Lower` create APIs (8/n) (#149912 ) See https://github.com/llvm/llvm-project/pull/147168 for more info.	2025-07-21 19:54:29 -04:00
Krzysztof Parzyszek	ff5784bb90	[flang][OpenMP] Move extractOmpDirective to Utils.cpp, NFC (#148653 )	2025-07-17 12:11:12 -05:00
Kareem Ergawy	7c8a197918	[NFC][flang] Move `ReductionProcessor` to `Lower/Support`. (#146025 ) With #145837, the `ReductionProcessor` component is now used by both OpenMP and `do concurrent`. Therefore, this PR moves it to a shared location: `flang/Lower/Support`. PR stack: - https://github.com/llvm/llvm-project/pull/145837 - https://github.com/llvm/llvm-project/pull/146025 (this one) - https://github.com/llvm/llvm-project/pull/146028 - https://github.com/llvm/llvm-project/pull/146033	2025-07-11 07:42:51 +02:00
Kareem Ergawy	7e9887a50d	[flang] Generalize names of delayed privatization CLI flags (#138816 ) Remove the `openmp` prefix from delayed privatization/localization flags since they are now used for `do concurrent` as well. PR stack: - https://github.com/llvm/llvm-project/pull/137928 - https://github.com/llvm/llvm-project/pull/138505 - https://github.com/llvm/llvm-project/pull/138506 - https://github.com/llvm/llvm-project/pull/138512 - https://github.com/llvm/llvm-project/pull/138534 - https://github.com/llvm/llvm-project/pull/138816 (this PR)	2025-05-29 12:27:03 +02:00
Akash Banerjee	fbb11b4c4e	[OpenMP][Flang] Fix OOB access for derived type mapping (#140948 )	2025-05-22 01:34:40 +01:00
Sergio Afonso	30b0946326	[Flang][MLIR][OpenMP] Improve use_device_* handling (#137198 ) This patch updates MLIR op verifiers for operations taking arguments that must always be defined by an `omp.map.info` operation to check this requirement. It also modifies Flang lowering for `use_device_{addr, ptr}`, as well as the custom MLIR printer and parser for these clauses, to support initializing it to `OMP_MAP_RETURN_PARAM` and represent this in the MLIR representation as `return_param`. This internal mapping flag is what eventually is used for variables passed via these clauses into the target region when translating to LLVM IR, so making it explicit in Flang and MLIR removes an inconsistency in the current representation.	2025-05-15 12:28:06 +01:00
Sergio Afonso	b231f6f862	[MLIR][OpenMP] Improve omp.map.info verification (#132066 ) This patch makes the `map_type` and `map_capture_type` arguments of the `omp.map.info` operation required, which was already an invariant being verified by its users via `verifyMapClause()`. This makes it clearer, as getters no longer return misleading `std::optional` values. Checks for the `mapper_id` argument are moved to a verifier for the operation, rather than being checked by users. Functionally NFC, but not marked as such due to a reordering of arguments in the assembly format of `omp.map.info`.	2025-03-20 15:48:45 +00:00
Krzysztof Parzyszek	d67947162f	[flang][OpenMP] Implement HAS_DEVICE_ADDR clause (#128568 ) The HAS_DEVICE_ADDR clause indicates that the listed object(s) exist at an address that is a valid device address. Specifically, `has_device_addr(x)` means that (in C/C++ terms) `&x` is a device address. When entering a target region, `x` does not need to be allocated on the device, or have its contents copied over (in the absence of additional mapping clauses). Passing its address verbatim to the region for use is sufficient, and is the intended goal of the clause. Some Fortran objects use descriptors in their in-memory representation. If `x` had a descriptor, both the descriptor and the contents of `x` would be located in the device memory. However, the descriptors are managed by the compiler, and can be regenerated at various points as needed. The address of the effective descriptor may change, hence it's not safe to pass the address of the descriptor to the target region. Instead, the descriptor itself is always copied, but for objects like `x`, no further mapping takes place (as this keeps the storage pointer in the descriptor unchanged). --------- Co-authored-by: Sergio Afonso <safonsof@amd.com>	2025-03-10 08:11:01 -05:00
Kareem Ergawy	9543e9e927	[flang][OpenMP] Handle pre-determined `lastprivate` for `simd` (#129507 ) This PR tries to fix `lastprivate` update issues in composite constructs. In particular, pre-determined `lastprivate` symbols are attached to the wrong leaf of the composite construct (the outermost one). When using delayed privatization (should be the default mode in the future), this results in trying to update the `lastprivate` symbol in the wrong construct (outside the `omp.loop_nest` op). For example, given the following input: ```fortran !$omp target teams distribute parallel do simd collapse(2) private(y_max) do i=x_min,x_max do j=y_min,y_max enddo enddo ``` Without the fixes introduced in this PR, the `DataSharingProcessor` tries to generate the `lastprivate` update ops in the `parallel` op since this is the op for which the DSP instance is created. The fix consists of 2 main parts: 1. Instead of creating a single DSP instance, one instance is created for the leaf constructs that might need privatization (whether for explicit, implicit, or pre-determined symbols). 2. When generating the `lastprivate` comparison ops, we don't directly use the SSA values of the UBs and steps. Instead, we regenerate these SSA values from the original loop bounds' expressions. We have to do this to avoid using `host_eval` values in the `lastprivate` comparison logic which is illegal.	2025-03-07 05:44:39 +01:00
jeanPerier	a8db1fb9b5	[flang] update fir.coordinate_of to carry the fields (#127231 ) This patch updates fir.coordinate_of to carry the field index as attributes instead of relying on getting it from the fir.field_index operations defining its operands. The rationale is that FIR currently has a few operations that require DAGs to be preserved in order to be able to do code generation. This is the case of fir.coordinate_of, which requires its fir.field operand producer to be visible. This makes IR transformation harder and more brittle, so I want to update FIR to get rid of this. Codegen/printer/parser of fir.coordinate_of and many tests need to be updated after this change.	2025-02-28 09:50:05 +01:00
Akash Banerjee	ee17955dfe	[MLIR][OpenMP] Add OMP Mapper field to MapInfoOp (#120994 ) This patch adds the mapper field to the omp.map.info op. Depends on #117046.	2025-02-18 17:27:48 +00:00
jeanPerier	662133a278	[flang][OpenMP][OpenACC] remove libEvaluate dependency in passes (#123784 ) Move OpenACC/OpenMP helpers from Lower/DirectivesCommon.h that are also used in OpenACC and OpenMP mlir passes into a new Optimizer/Builder/DirectivesCommon.h so that parser and evaluate headers are not included in Optimizer libraries (this introduces pointless compile-time and link-time overhead). This should fix https://github.com/llvm/llvm-project/issues/123377	2025-01-21 20:32:42 +01:00
Kareem Ergawy	e532241b02	Re-apply (#117867 ): [flang][OpenMP] Implicitly map allocatable record fields (#120374 ) This re-applies #117867 with a small fix that hopefully prevents build bot failures. The fix is avoiding `dyn_cast` for the result of `getOperation()`. Instead we can assign the result to `mlir::ModuleOp` directly since the type of the operation is known statically (`OpT` in `OperationPass`).	2024-12-18 09:19:45 +01:00
Kareem Ergawy	dc936f3c19	Revert "[flang][OpenMP] Implicitly map allocatable record fields (#117867 )" (#120360 )	2024-12-18 06:52:24 +01:00
Kareem Ergawy	db09014a07	[flang][OpenMP] Implicitly map allocatable record fields (#117867 ) This is a starting PR to implicitly map allocatable record fields. This PR contains the following changes: 1. Re-purposes some of the utils used in `Lower/OpenMP.cpp` so that these utils work on the `mlir::Value` level rather than the `semantics::Symbol` level. This takes one step towards enabling MLIR passes to more easily do some lowering themselves (e.g. creating `omp.map.bounds` ops for implicitly captured data like this PR does). 2. Adds support for implicitly capturing and mapping allocatable fields in record types. There is still quite some distance to cover to have full support for this. I added a number of todos to guide further development. Co-authored-by: Andrew Gozillon <andrew.gozillon@amd.com>	2024-12-18 05:37:58 +01:00
agozillon	e508bacce4	[Flang][OpenMP] Derived type explicit allocatable member mapping (#113557 ) This PR is one of 3 in a PR stack, this is the primary change set which seeks to extend the current derived type explicit member mapping support to handle descriptor member mapping at arbitrary levels of nesting. The PR stack seems to do this reasonably (from testing so far) but as you can create quite complex mappings with derived types (in particular when adding allocatable derived types or arrays of allocatable derived types) I imagine there will be hiccups, which I am more than happy to address. There will also be further extensions to this work to handle the implicit auto-magical mapping of descriptor members in derived types and a few other changes planned for the future (with some ideas on optimizing things). The changes in this PR primarily occur in the OpenMP lowering and the OMPMapInfoFinalization pass. In the OpenMP lowering several utility functions were added or extended to support the generation of appropriate intermediate member mappings which are currently required when the parent (or multiple parents) of a mapped member are descriptor types. We need to map the entirety of these types or do a "deep copy" for lack of a better term, where we map both the base address and the descriptor as without the copying of both of these we lack the information in the case of the descriptor to access the member or attach the pointers data to the pointer and in the latter case we require the base address to map the chunk of data. Currently we do not segment descriptor based derived types as we do with regular non-descriptor derived types, we effectively map their entirety in all cases at the moment, I hope to address this at some point in the future as it adds a fair bit of a performance penalty to having nestings of allocatable derived types as an example. 
The process of mapping all intermediate descriptor members in a members path only occurs if a member has an allocatable or object parent in its symbol path or the member itself is a member or allocatable. This occurs in the createParentSymAndGenIntermediateMaps function, which will also generate the appropriate address for the allocatable member within the derived type to use as the varPtr field of the map (for intermediate allocatable maps and final allocatable mappings). In this case it's necessary as we can't utilise the usual Fortran::lower functionality such as gatherDataOperandAddrAndBounds without causing issues later in the lowering due to extra allocas being spawned which seem to affect the pointer attachment (at least this is my current assumption, it results in memory access errors on the device due to incorrect map information generation). This is similar to why we do not use the MLIR value generated for this and utilise the original symbol provided when mapping descriptor types external to derived types. Hopefully this can be rectified in the future so this function can be simplified and more closely aligned to the other type mappings. We also make use of fir::CoordinateOp as opposed to the HLFIR version as the HLFIR version doesn't support the appropriate lowering to FIR necessary at the moment, we also cannot use a single CoordinateOp (similarly to a single GEP) as when we index through a descriptor operation (BoxType) we encounter issues later in the lowering, however in either case we need access to intermediate descriptors so individual CoordinateOp's aid this (although, being able to compress them into a smaller amount of CoordinateOp's may simplify the IR and perhaps result in a better end product, something to consider for the future).
The other large change area was in the OMPMapInfoFinalization pass, where the pass had to be extended to support the expansion of box types (or multiple nestings of box types) within derived types, or box type derived types. This requires expanding each BoxType mapping from one into two maps and then modifying all of the existing member indices of the overarching parent mapping to account for the addition of these new members alongside adjusting the existing member indices to support the addition of these new maps which extend the original member indices (as a base address of a box type is currently considered a member of the box type at a position of 0 as when lowered to LLVM-IR it's a pointer contained at this position in the descriptor type, however, this means extending mapped children of this expanded descriptor type to additionally incorporate the new member index in the correct location in its own index list). I believe there is a reasonable amount of comments that should aid in understanding this better, alongside the test alterations for the pass. A subset of the changes were also aimed at making some of the utilities for packing and unpacking the DenseIntElementsAttr containing the member indices shareable across the lowering and OMPMapInfoFinalization, this required moving some functions to the Lower/Support/Utils.h header, and transforming the lowering structure containing the member index data into something more similar to the version used in OMPMapInfoFinalization. There were also some other attempts at tidying things up in relation to the member index data generation in the lowering, some of which required creating a logical operator for the OpenMP ID class so it can be utilised as a map key (it simply utilises the symbol address for the moment as ordering isn't particularly important).
Otherwise I have added a set of new tests encompassing some of the mappings currently supported by this PR (unfortunately as you can have arbitrary nestings of all shapes and types it's not very feasible to cover them all).	2024-11-16 12:28:37 +01:00
Sergio Afonso	88478a89cd	[Flang][OpenMP] Improve entry block argument creation and binding (#110267 ) The main purpose of this patch is to centralize the logic for creating MLIR operation entry blocks and for binding them to the corresponding symbols. This minimizes the chances of mixing arguments up for operations having multiple entry block argument-generating clauses and prevents divergence while binding arguments. Some changes implemented to this end are: - Split into two functions the creation of the entry block, and the binding of its arguments and the corresponding Fortran symbol. This enabled a significant simplification of the lowering of composite constructs, where it's no longer necessary to manually ensure the lists of arguments and symbols refer to the same variables in the same order and also match the expected order by the `BlockArgOpenMPOpInterface`. - Removed redundant and error-prone passing of types and locations from `ClauseProcessor` methods. Instead, these are obtained from the values in the appropriate clause operands structure. This also simplifies argument lists of several lowering functions. - Access block arguments of already created MLIR operations through the `BlockArgOpenMPOpInterface` instead of directly indexing the argument list of the operation, which is not scalable as more entry block argument-generating clauses are added to an operation. - Simplified the implementation of `genParallelOp` to no longer need to define different callbacks depending on whether delayed privatization is enabled.	2024-10-07 11:26:35 +01:00
Krzysztof Parzyszek	f98244392b	[flang][OpenMP] Parse lastprivate modifier, add TODO to lowering (#110568 ) Parse the lastprivate clause with a modifier. Codegen for it is not yet implemented.	2024-10-02 15:36:45 -05:00
Akash Banerjee	142433684a	[OpenMP][Flang] Fix dynamic-extent array mapping (#107247 ) This patch fixes the mapping and lowering of arrays with dynamic extents and adds a new test for the same. The fix discards the incomplete dynamic extent information and replaces it with just the base type. When lowering to llvm later, the bounds information is used instead.	2024-09-05 12:44:10 +01:00
Kareem Ergawy	10df320743	[flang][OpenMP] Enable delayed privatization for `omp parallel` by default (#90945 ) Flips the delayed privatization switch to be on by default. After the recent fixes related to delayed privatization, the gfortran test suite runs successfully with delayed privatization turned on by default for `omp parallel`.	2024-08-02 09:46:34 +02:00
Alexander Shaposhnikov	77d8cfb3c5	[Flang] Switch to common::visit more call sites (#90018 ) Switch to common::visit more call sites. Test plan: ninja check-all	2024-06-17 12:59:04 -07:00
Kareem Ergawy	1539da4601	[flang][OpenMP] Add `--openmp-enable-delayed-privatization-staging` flag (#94749 )	2024-06-07 18:08:25 +02:00
Krzysztof Parzyszek	b025d6913e	[flang][OpenMP] Make object identity more precise (#94495 ) Derived type components may use a given `Symbol` regardless of what parent objects they are a part of. Because of that, simply using a symbol address is not sufficient to determine object identity. Make the designator a part of the IdTy. To compare identities, when symbols are equal (and non-null), compare the designators.	2024-06-06 07:28:41 -05:00
Krzysztof Parzyszek	8b18f2fe06	[flang][OpenMP] Add `sym()` member function to omp::Object (#94493 ) The object identity requires more than just `Symbol`. Don't use `id()` to get the Symbol associated with the object, because the return value will need to change. Instead use `sym()` which is added for that reason.	2024-06-05 13:38:28 -05:00
Krzysztof Parzyszek	7a66e4209b	[flang][OpenMP] Remove unnecessary `Fortran::` qualification, NFC (#92298 ) The `Fortran::` namespace is redundant for all parts of the code in this PR, except for names of functions in their definitions.	2024-05-16 07:49:01 -05:00
Krzysztof Parzyszek	be7c9e3957	[flang][OpenMP] Decompose compound constructs, do recursive lowering (#90098 ) A compound construct with a list of clauses is broken up into individual leaf/composite constructs. Each such construct has the list of clauses that apply to it based on the OpenMP spec. Each lowering function (i.e. a function that generates MLIR ops) is now responsible for generating its body as described below. Functions that receive AST nodes extract the construct, and the clauses from the node. They then create a work queue consisting of individual constructs, and invoke a common dispatch function to process (lower) the queue. The dispatch function examines the current position in the queue, and invokes the appropriate lowering function. Each lowering function receives the queue as well, and once it needs to generate its body, it either invokes the dispatch function on the rest of the queue (if any), or processes nested evaluations if the work queue is at the end. Re-application of ca1bd5995f6ed934f9187305190a5abfac049173 with fixes for compilation errors.	2024-05-13 10:32:16 -05:00
Krzysztof Parzyszek	25a3ba3315	Revert "[flang][OpenMP] Decompose compound constructs, do recursive lowering (#90098 )" It breaks some builds, e.g. https://lab.llvm.org/buildbot/#/builders/268/builds/13909 This reverts commit ca1bd5995f6ed934f9187305190a5abfac049173.	2024-05-13 08:43:45 -05:00
Krzysztof Parzyszek	ca1bd5995f	[flang][OpenMP] Decompose compound constructs, do recursive lowering (#90098 ) A compound construct with a list of clauses is broken up into individual leaf/composite constructs. Each such construct has the list of clauses that apply to it based on the OpenMP spec. Each lowering function (i.e. a function that generates MLIR ops) is now responsible for generating its body as described below. Functions that receive AST nodes extract the construct, and the clauses from the node. They then create a work queue consisting of individual constructs, and invoke a common dispatch function to process (lower) the queue. The dispatch function examines the current position in the queue, and invokes the appropriate lowering function. Each lowering function receives the queue as well, and once it needs to generate its body, it either invokes the dispatch function on the rest of the queue (if any), or processes nested evaluations if the work queue is at the end.	2024-05-13 08:09:24 -05:00
agozillon	e3ca558ffb	[Flang] Remove deprecated cast style that snuck in during landing of 435e850ba97ab567a14b6c84d2b27cadb771cb27	2024-05-10 14:56:01 -05:00
Andrew Gozillon	435e850ba9	[Flang][OpenMP][MLIR] Initial derived type member map support This patch is one in a series of four patches that seeks to refactor slightly and extend the current record type map support that was put in place for Fortran's descriptor types to handle explicit member mapping for record types at a single level of depth. For example, the below case where two members of a Fortran derived type are mapped explicitly: '''' type :: scalar_and_array real(4) :: real integer(4) :: array(10) integer(4) :: int end type scalar_and_array type(scalar_and_array) :: scalar_arr !$omp target map(tofrom: scalar_arr%int, scalar_arr%real) '''' Current cases of derived type mapping left for future work are: > explicit member mapping of nested members (e.g. two layers of record types where we explicitly map a member from the internal record type) > Fortran's automagical mapping of all elements and nested elements of a derived type > explicit member mapping of a derived type and then constituent members (redundant in Fortran due to former case but still legal as far as I am aware) > explicit member mapping of a record type (may be handled reasonably, just not fully tested in this iteration) > explicit member mapping for Fortran allocatable types (a variation of nested record types) This patch seeks to support this by extending the Flang-new OpenMP lowering to support generation of this newly required information, creating the necessary parent <-to-> member map_info links, calculating the member indices and setting if it's a partial map. The OMPDescriptorMapInfoGen pass has also been generalized into a map finalization phase, now named OMPMapInfoFinalization. This pass was extended to support the insertion of member maps into the BlockArg and MapOperands of relevant map carrying operations. Similar to the method in which descriptor types are expanded and constituent members inserted.
Pull Request: https://github.com/llvm/llvm-project/pull/82853	2024-05-10 14:16:26 -05:00
Krzysztof Parzyszek	554be97d7f	[flang][OpenMP] Implement getIterationVariableSymbol helper function,… (#90087 ) … NFC	2024-04-30 11:44:22 -05:00
Krzysztof Parzyszek	992413de99	[flang][OpenMP] Move clause/object conversion to happen early, in genOMP (#87086 ) This removes the last use of genOmpObjectList2, which has now been removed. --------- Co-authored-by: Sergio Afonso <safonsof@amd.com>	2024-04-18 12:02:04 -05:00
Sergio Afonso	734026347c	Reapply "[Flang][OpenMP][Lower] NFC: Move clause processing helpers into the ClauseProcessor (#85258 )" (#85807 ) This patch contains slight modifications to the reverted PR #85258 to avoid issues with constructs containing multiple reduction clauses, uncovered by a test on the gfortran testsuite. This reverts commit 9f80444c2e669237a5c92013f1a42b91b5609012.	2024-03-21 12:25:48 +00:00
Sergio Afonso	9f80444c2e	Revert "[Flang][OpenMP][Lower] NFC: Move clause processing helpers into the ClauseProcessor (#85258 )" Reverting due to failing gfortran test. This reverts commit 2f2f16f32bb2a6c250b19adbc229d9dc3b38640c.	2024-03-19 13:25:33 +00:00
Sergio Afonso	2f2f16f32b	[Flang][OpenMP][Lower] NFC: Move clause processing helpers into the ClauseProcessor (#85258 ) This patch moves some code in PFT to MLIR OpenMP lowering to the `ClauseProcessor` class. This is so that some behavior that is related to certain clauses stays within the `ClauseProcessor`, rather than the caller being responsible for always doing this when the clause is present.	2024-03-19 11:49:45 +00:00
Krzysztof Parzyszek	63e70c0553	[flang][OpenMP] Convert repeatable clauses (except Map) in ClauseProc… (#81623 ) …essor Rename `findRepeatableClause` to `findRepeatableClause2`, and make the new `findRepeatableClause` operate on new `omp::Clause` objects. Leave `Map` unchanged, because it will require more changes for it to work. [Clause representation 3/6]	2024-03-15 07:04:42 -05:00
Kareem Ergawy	26b8be201e	[flang][OpenMP][MLIR] Basic support for delayed privatization code-gen (#81833 ) Adds basic support for emitting delayed privatizers from flang. So far, only types of symbols are supported (i.e. scalars), support for more complicated types will be added later. This also makes sure that reduction and delayed privatization work properly together by merging the body-gen callbacks for both in case both clauses are present on the parallel construct.	2024-02-28 10:15:57 +01:00
Kareem Ergawy	4d4af15c3f	[NFC][flang][OpenMP] Split `DataSharing` and `Clause` processors (#81973 ) This started as an experiment to reduce the compilation time of iterating over `Lower/OpenMP.cpp` a bit since it is too slow at the moment. Trying to do that, I split the `DataSharingProcessor`, `ReductionProcessor`, and `ClauseProcessor` into their own files and extracted some shared code into a util file. All of these new `.h/.cpp` files as well as `OpenMP.cpp` are now under a `Lower/OpenMP/` directory. This resulted in a slightly better organization of the OpenMP lowering code and hence opening this NFC. As for the compilation time, this unfortunately does not affect it much (it shaves off a few seconds of `OpenMP.cpp` compilation) since from what I learned the bottleneck is in `DirectivesCommon.h` and `PFTBuilder.h` which both consume a lot of time in template instantiation it seems.	2024-02-21 15:55:42 +01:00

40 Commits