llvm-project

Author	SHA1	Message	Date
Longsheng Mou	f047b735e9	[mlir][NFC] Use `getDefiningOp<OpTy>()` instead of `dyn_cast<OpTy>(getDefiningOp())` (#150428 ) This PR uses `val.getDefiningOp<OpTy>()` to replace `dyn_cast<OpTy>(val.getDefiningOp())` , `dyn_cast_or_null<OpTy>(val.getDefiningOp())` and `dyn_cast_if_present<OpTy>(val.getDefiningOp())`.	2025-07-25 10:35:51 +08:00
Kazu Hirata	1a0f482de8	[mlir] Remove unused includes (NFC) (#150476 ) These are identified by misc-include-cleaner. I've filtered out those that break builds. Also, I'm staying away from llvm-config.h, config.h, and Compiler.h, which likely cause platform- or compiler-specific build failures.	2025-07-24 11:23:53 -07:00
Longsheng Mou	3eb49c482c	[mlir][NFC] Use `hasOneBlock` instead of `llvm::hasSingleElement(region)` (#149809 )	2025-07-24 10:11:21 +08:00
Tom Eccles	a1c61ac756	[mlir][OpenMP] Allow composite SIMD REDUCTION and IF (#147568 ) Reduction support: https://github.com/llvm/llvm-project/pull/146671 If Support is fixed in this PR The problem for the IF clause in composite constructs was that wsloop and simd both operate on the same CanonicalLoopInfo structure: with the SIMD processed first, followed by the wsloop. Previously the IF clause generated code like ``` if (cond) { while (...) { simd_loop_body; } } else { while (...) { nonsimd_loop_body; } } ``` The problem with this is that this invalidates the CanonicalLoopInfo structure to be processed by the wsloop later. To avoid this, in this patch I preserve the original loop, moving the IF clause inside of the loop: ``` while (...) { if (cond) { simd_loop_body; } else { non_simd_loop_body; } } ``` On simple examples I tried LLVM was able to hoist the if condition outside of the loop at -O3. The disadvantage of this is that we cannot add the llvm.loop.vectorize.enable attribute on either the SIMD or non-SIMD loops because they both share a loop back edge. There's no way of solving this without keeping the old design of having two different loops: which cannot be represented using only one CanonicalLoopInfo structure. I don't think the presence or absence of this attribute makes much difference. In my testing it is the llvm.loop.parallel_access metadata which makes the difference to vectorization. LLVM will vectorize if legal whether or not this attribute is there in the TRUE branch. In the FALSE branch this means the loop might be vectorized even when the condition is false: but I think this is still standards compliant: OpenMP 6.0 says that when the if clause is false that should be treated like the SIMDLEN clause is one. The SIMDLEN clause is defined as a "hint". For the same reason, SIMDLEN and SAFELEN clauses are silently ignored when SIMD IF is used. I think it is better to implement SIMD IF and ignore SIMDLEN and SAFELEN and some vectorization encouragement metadata when combined with IF than to ignore IF because IF could have correctness consequences whereas the rest are optimiztion hints. For example, the user might use the IF clause to disable SIMD programatically when it is known not safe to vectorize the loop. In this case it is not at all safe to add the parallel access or SAFELEN metadata.	2025-07-15 10:30:02 +01:00
Michael Kruse	96bc07d492	[MLIR][OpenMP] Add canonical loop LLVM-IR lowering (#147069 ) Support for translating the operations introduced in #144785 to LLVM-IR. In order to keep the lowering simple, `OpenMPIRBuider::unrollLoopHeuristic` is applied when encountering the `omp.unroll_heuristic` op. As a result, the operation that unrolling is applied to (`omp.canonical_loop`) must have been emitted before even though logically there is no such requirement. Eventually, all transformations on a loop must be applied directly after emitting `omp.canonical_loop`, i.e. future transformations must be looked-up when encountering `omp.canonical_loop` itself. This is because many OpenMPIRBuilder methods (e.g. `createParallel`) expect all the region code to be emitted withing a callback. In the case of `createParallel`, the region code is getting outlined into a new function. Therefore, making the operation order a formal requirement would not make the implementation any easier.	2025-07-11 12:54:25 +02:00
agozillon	71783fea2c	[Flang][OpenMP][MLIR] Fix regression by #146653 by adding address space cast to getRefPtrIfDeclareTarget The patch introduced changes to add address spaces to a wider array of MLIR/LLVM values, however, it was missing an address space cast that exists in our downstream implementation that's required for declare target to work correctly.	2025-07-08 12:31:27 -05:00
Kajetan Puchalski	9006bc8717	[OpenMP] Enable simd in non-reduction composite constructs (#146097 ) Despite currently being ignored with a warning, simd as a leaf in composite constructs behaves as expected when the construct does not contain a reduction. Enable it for those non-reduction constructs. --------- Signed-off-by: Kajetan Puchalski <kajetan.puchalski@arm.com>	2025-07-08 14:27:33 +01:00
Tom Eccles	ea5ee2e743	[mlir][OpenMP] Don't allow firstprivate for simd (#146734 ) This is not allowed by the openmp standard.	2025-07-04 12:15:07 +01:00
Kareem Ergawy	8c9e0c6c61	[flang][OpenMP] Allocate `reduction` init temps on the stack for GPUs (#146667 ) Temps needed for the reduction init regions are now allocate on the heap all the time. However, this is performance killer for GPUs since malloc calls are prohibitively expensive. Therefore, we should do these allocations on the stack for GPU reductions.	2025-07-04 06:29:34 +02:00
Abid Qadeer	d56c06e6c9	[flang][debug] Generate DISubprogramAttr for omp::TargetOp. (#146532 ) This is combination of https://github.com/llvm/llvm-project/pull/138149 and https://github.com/llvm/llvm-project/pull/138039 which were opened separately for ease of reviewing. Only other change is adjustments in 2 tests which have gone in since. There are `DeclareOp` present for the variables mapped into target region. That allow us to generate debug information for them. But the `TargetOp` is still part of parent function and those variables get the parent function's `DISubprogram` as a scope. In `OMPIRBuilder`, a new function is created for the `TargetOp`. We also create a new `DISubprogram` for it. All the variables that were in the target region now have to be updated to have the correct scope. This after the fact updating of debug information becomes very difficult in certain cases. Take the example of variable arrays. The type of those arrays depend on the artificial `DILocalVariable`(s) which hold the size(s) of the array. This new function will now require that we generate the new variable and and new types. Similar issue exist for character type variables too. To avoid this after the fact updating, this PR generates a `DISubprogramAttr` for the `TargetOp` while generating the debug info in `flang`. Then we don't need to generate a `DISubprogram` in `OMPIRBuilder`. This change is made a bit more complicated by the the fact that in new scheme, the debug location already points to the new `DISubprogram` by the time it reaches `convertOmpTarget`. But we need some code generation in the parent function so we have to carefully manage the debug locations. This fixes issue `#134991`.	2025-07-03 10:38:28 +01:00
Tom Eccles	16b75c819d	[mlir][OpenMP] implement SIMD reduction (#146671 ) This replicates clang's implementation. Basically: - A private copy of the reduction variable is created, initialized to the reduction neutral value (using regions from the reduction declaration op). - The body of the loop is lowered as usual, with accesses to the reduction variable mapped to the private copy. - After the loop, we inline the reduction region from the declaration op to combine the privatized variable into the original variable. - As usual with the SIMD construct, attributes are added to encourage vectorization of the loop and to assert that memory accesses in the loop don't alias across iterations. I have verified that simple scalar examples do vectorize at -O3 and the tests I could find in the Fujitsu test suite produce correct results. I tested on top of #146097 and this seemed to work for composite constructs as well. Fixes #144290	2025-07-02 16:49:34 +01:00
Abid Qadeer	232c2921e1	Reland [mlir][OpenMP] Use correct debug location with link clause. (#145889 ) https://github.com/llvm/llvm-project/pull/145026 was reverted because it failed a sanitizer test. That issue has been fixed in https://github.com/llvm/llvm-project/pull/145883.	2025-06-26 19:32:30 +01:00
Abid Qadeer	a75279e4a5	Revert "[mlir][OpenMP] Use correct debug location with link clause." (#145768 ) Reverts llvm/llvm-project#145026 Caused a CI failure on https://lab.llvm.org/buildbot/#/builders/169/builds/12504.	2025-06-25 20:06:36 +01:00
Abid Qadeer	006037675c	[mlir][OpenMP] Use correct debug location with link clause. (#145026 ) Please see the following program. ``` module test_0 INTEGER :: sp = 1 !$omp declare target link(sp) end module test_0 program main use test_0 integer :: new_len !$omp target map(tofrom:new_len) map(tofrom:sp) new_len = sp !$omp end target print , new_len print , sp end program ``` When compiled with `flang -g -O0 -fopenmp --offload-arch=gfx1100` will fail the compilation with the following error: `dbg attachment points at wrong subprogram for function` The reason is that with the `link` clause on `!$omp declare target`, an extra load instruction is inserted. But the debug location was not updated before insertion which caused an invalid location to be attached to the instruction.	2025-06-25 13:49:40 +01:00
Kajetan Puchalski	d3ed84ed67	[Utils][mlir] Fix interaction between CodeExtractor and OpenMPIRBuilder (#145051 ) CodeExtractor can currently erroneously insert an alloca into a different function than it inserts its users into, in cases where code is being extracted out of a function that has already been outlined. Add an assertion that the two blocks being inserted into are actually in the same function. Add a check to findAllocaInsertPoint in OpenMP to LLVMIR translation to prevent the aforementioned scenario from happening. OpenMPIRBuilder relies on a callback mechanism to fix-up a module later on during the finaliser step. In some cases this results in the module being invalid prior to the finalise step running. Remove calls to verifyModule wrapped in LLVM_DEBUG from CodeExtractor, as the presence of those results in the compiler crashing with -mllvm -debug due to premature module verification where it would not crash without -debug. Call ompBuilder->finalize() the end of mlir::translateModuleToLLVMIR, in order to make sure the module has actually been finalized prior to trying to verify it. Resolves https://github.com/llvm/llvm-project/issues/138102. --------- Signed-off-by: Kajetan Puchalski <kajetan.puchalski@arm.com>	2025-06-25 13:34:35 +01:00
Krzysztof Parzyszek	8231dd71cb	[flang][OpenMP] Skip runtime mapping with no offload targets (#145594 ) When no offload targets are specified flang will avoid offloading for "target" constructs, but not "target data" constructs. This patch makes the behavior consistent across all offload-related operations. While ignoring "target" may produce semantically incorrect code, it may still be a useful debugging tool. -- This reinstates commits 6ba1955 and 349f8d6, reverted due to compilation failures in the gfortran test suite. These build problems were caused by an unrelated issue (https://github.com/llvm/llvm-project/issues/145558) which is now fixed. Ref: https://github.com/llvm/llvm-project/pull/144534	2025-06-25 07:10:08 -05:00
Krzysztof Parzyszek	3a71884ab1	[flang][OpenMP] Map device pointers on host device as well (#145562 ) Given a TARGET DATA construct with USE_DEVICE_PTR(x) and IF(FALSE), the compiler will crash if `x` was used in the body. The cause of the crash is that the MLIR->LLVM codegen tries to look up the translated value of x, but one had not been mapped. Given an IF clause, the translation will generate an if-then-else construct, with the "else" block corresponding to the false condition, i.e. the host device playing the role of the target device. In that block, still process the USE_DEVICE_ADDR/USE_DEVICE_PTR clauses, which will cause the translation mappings to be created. Fixes https://github.com/llvm/llvm-project/issues/145558	2025-06-24 15:38:23 -05:00
Tom Eccles	cc756716cf	[mlir][NFC] Move LLVM::ModuleTranslation::SaveStack to a shared header (#144897 ) This is so that we can re-use the same code in Flang.	2025-06-24 17:45:10 +01:00
antoine moynault	5fa55b2dfc	Revert "[flang][OpenMP] Skip runtime mapping with no offload targets (#144534 )" (#145478 ) And also revert 6ba1955 "[flang][OpenMP] Fix ignore-target-data.f90 test" As it causes several bot failures https://github.com/llvm/llvm-project/pull/144534#issuecomment-2995303224	2025-06-24 10:51:26 +02:00
Krzysztof Parzyszek	349f8d67d4	[flang][OpenMP] Skip runtime mapping with no offload targets (#144534 ) When no offload targets are specified flang will ignore "target" constructs, but not "target data" constructs. This patch makes the behavior consistent across all offload-related operations. While ignoring "target" may produce semantically incorrect code, it may still be a useful debugging tool.	2025-06-20 08:09:36 -05:00
Tom Eccles	aa01e8e9cf	[mlir][OpenMP] Fix broken insertion point for charbox with omp task (#143112 ) Fixes #142365	2025-06-17 10:42:42 +01:00
NimishMishra	bf1fe6eb33	[mlir][OpenMP] Reintroduce TODO for translation of linear clause (#143531 ) Reintroduce a TODO for linear clause translation unless corner issues (like linear variables being entities other than `alloca`, and support for linear variables of types other than integer) are solved.	2025-06-10 07:06:28 -07:00
Tom Eccles	b03081e9fb	[mlir][OpenMP] set correct insert point after creating a barrier (#142997 ) Fixes #138436	2025-06-06 10:43:13 +01:00
Tom Eccles	8d06d4c132	[mlir][OpenMP] Add translation of private_barrier attr to LLVMIR (#140090 ) Part of a series to fix https://github.com/llvm/llvm-project/issues/136357	2025-05-22 15:24:20 +01:00
NimishMishra	0baacd1a58	[flang][OpenMP] Support MLIR lowering of linear clause for omp.wsloop (#139385 ) This patch adds support for MLIR lowering of linear clause on omp.wsloop (except for linear modifiers).	2025-05-19 23:33:06 -07:00
Sergio Afonso	0cd7e8aa91	[MLIR][OpenMP] Assert on map translation functions, NFC (#137199 ) This patch adds assertions to map-related MLIR to LLVM IR translation functions and utils to explicitly document whether they are intended for host or device compilation only. Over time, map-related handling has increased in complexity. This is compounded by the fact that some handling is device-specific and some is host-specific. By explicitly asserting on these functions on the expected compilation pass, the flow should become slighlty easier to follow.	2025-05-15 12:29:06 +01:00
Tom Eccles	e40200901c	[mlir][OpenMP] cancel(lation point) taskgroup LLVMIR (#137841 ) A cancel or cancellation point for taskgroup is always nested inside of a task inside of the taskgroup. For the task which is cancelled, it is that task which needs to be cleaned up: not the owning taskgroup. Therefore the cancellation branch handler is done in the conversion of the task not in conversion of taskgroup. I added a firstprivate clause to the test for cancel taskgroup to demonstrate that the block being branched to is the same block where mandatory cleanup code is added. Cancellation point follows exactly the same code path.	2025-05-08 11:15:58 +01:00
Tom Eccles	8338a3c92b	[mlir][OpenMP] Convert omp.cancellation_point to LLVMIR (#137205 ) This is basically identical to cancel except without the if clause. taskgroup will be implemented in a followup PR.	2025-05-08 11:09:13 +01:00
Tom Eccles	a385c47a59	[mlir][OpenMP] convert wsloop cancellation to LLVMIR (#137194 ) Taskloop support will follow in a later patch.	2025-05-08 11:08:52 +01:00
Kaviya Rajendiran	857ac4c229	[MLIR][OpenMP] Lowering nontemporal clause to LLVM IR for SIMD directive (#118751 ) This patch, - Added a new attribute `nontemporal` to fir.load and fir.store operation in the FIR dialect. - Added a pass `lower-nontemporal` which is called before FIRToLLVM conversion pass and adds the nontemporal attribute to loads and stores on the list items specified in the nontemporal clause of the SIMD directive. - Set the `UnitAttr:$nontemporal` to llvm.load and llvm.store operations during FIR to LLVM dialect conversion, if the corresponding fir.load or fir.store operations have the nontemporal attribute. - Attached the `nontemporal metadata` to load and store instructions that have the nontemporal attribute, during LLVM dialect to LLVM IR translation.	2025-04-30 11:13:20 +05:30
Pranav Bhandarkar	7dd8122d4e	[Flang][MLIR][OpenMP] - Add support for firstprivate when translating omp.target ops from MLIR to LLVMIR (#131213 ) This patch adds support to translate `firstprivate` clauses on `omp.target` ops when translating from MLIR to LLVMIR. Presently, this PR is restricted to supporting only included tasks, i.e `#omp target nowait firstprivate(some_variable)` will likely not work correctly even if it produces object code.	2025-04-29 14:53:15 -05:00
Tom Eccles	7b70fc74d0	[mlir][OpenMP] Convert omp.cancel sections to LLVMIR (#137193 ) This is quite ugly but it is the best I could think of. The old FiniCBWrapper was way too brittle depending upon the exact block structure inside of the section, and could be confused by any control flow in the section (e.g. an if clause on cancel). The wording in the comment and variable names didn't seem to match where it was actually branching too as well. Clang's (non-OpenMPIRBuilder) lowering for cancel inside of sections branches to a block containing __kmpc_for_static_fini. This was hard to achieve here because sometimes the FiniCBWrapper has to run before the worksharing loop finalization has been crated. To get around this ordering issue I created a dummy branch to a dummy block, which is then fixed later once all of the information is available.	2025-04-29 17:19:40 +01:00
NimishMishra	b62afbccc8	[mlir][OpenMP] Add __atomic_store to AtomicInfo (#121055 ) This PR adds functionality for `__atomic_store` libcall in AtomicInfo. This allows for supporting complex types in `atomic write`. Fixes https://github.com/llvm/llvm-project/issues/113479 Fixes https://github.com/llvm/llvm-project/issues/115652	2025-04-29 07:53:36 -07:00
Tom Eccles	2085119887	[mlir][OpenMP] Convert omp.cancel parallel to LLVMIR (#137192 ) Support for other constructs will follow in subsequent PRs.	2025-04-28 10:33:55 +01:00
Dominik Adamski	adfc577895	[OpenMP][CodeExtractor]Add align metadata to load instructions (#131131 ) Moving code to another function can lead to missed optimization opportunities, because function passes operate on smaller chunks of code, and they cannot figure out all details. One example of missed optimization opportunities after code extraction is information about pointer alignment. The instruction combine pass adds information about pointer alignment to LLVM intrinsic memcpy calls if it can deduce it from the code or if align metadata is added. If this information is not present, then further optimization passes can generate inefficient code. If we add align metadata to extracted pointers, then the instruction combine pass can add the align attribute to the LLVM intrinsic memcpy call and unblock further optimization. Scope of changes: 1. Analyze MLIR map operations. Add information about the alignment of objects that are passed by reference to OpenMP GPU kernels. 2. Propagate alignment information to the outlined by `CodeExtractor` helper functions.	2025-04-10 09:45:30 +02:00
Jan Leyonberg	1aed6ad906	[MLIR][OpenMP] Enable multiple variables for target teams reductions (#134903 ) This patch enables multiple reductions to be used in a reduction clause inside target regions for GPU offloading. --------- Co-authored-by: Sergio Afonso <safonsof@amd.com>	2025-04-09 13:01:53 -04:00
NimishMishra	53fa92dcad	[mlir][llvm][OpenMP] Hoist __atomic_load alloca (#132888 ) Current implementation of `__atomic_compare_exchange` uses an alloca for `__atomic_load`, leading to issues like https://github.com/llvm/llvm-project/issues/120724. This PR hoists this alloca to `AllocaIP`. Fixes: https://github.com/llvm/llvm-project/issues/120724	2025-04-09 03:01:44 -07:00
Jan Leyonberg	fbc8335311	[MLIR][OpenMP] Add codegen for teams reductions (#133310 ) This patch adds the lowering of teams reductions from the omp dialect to LLVM-IR. Some minor cleanup was done in clang to remove an unused parameter.	2025-04-07 12:47:16 -04:00
Sergio Afonso	f59b5b8d59	[MLIR][OpenMP] Fix standalone distribute on the device (#133094 ) This patch updates the handling of target regions to set trip counts and kernel execution modes properly, based on clang's behavior. This fixes a race condition on `target teams distribute` constructs with no `parallel do` loop inside. This is how kernels are classified, after changes introduced in this patch: ```f90 ! Exec mode: SPMD. ! Trip count: Set. !$omp target teams distribute parallel do do i=... end do ! Exec mode: Generic-SPMD. ! Trip count: Set (outer loop). !$omp target teams distribute do i=... !$omp parallel do private(idx, y) do j=... end do end do ! Exec mode: Generic-SPMD. ! Trip count: Set (outer loop). !$omp target teams distribute do i=... !$omp parallel ... !$omp end parallel end do ! Exec mode: Generic. ! Trip count: Set. !$omp target teams distribute do i=... end do ! Exec mode: SPMD. ! Trip count: Not set. !$omp target parallel do do i=... end do ! Exec mode: Generic. ! Trip count: Not set. !$omp target ... !$omp end target ``` For the split `target teams distribute + parallel do` case, clang produces a Generic kernel which gets promoted to Generic-SPMD by the openmp-opt pass. We can't currently replicate that behavior in flang because our codegen for these constructs results in the introduction of calls to the `kmpc_distribute_static_loop` family of functions, instead of `kmpc_distribute_static_init`, which currently prevent promotion of the kernel to Generic-SPMD. For the time being, instead of relying on the openmp-opt pass, we look at the MLIR representation to find the Generic-SPMD pattern and directly tag the kernel as such during codegen. This is what we were already doing, but incorrectly matching other kinds of kernels as such in the process.	2025-04-03 15:41:00 +01:00
Kareem Ergawy	f23a6ef54c	[flang][OpenMP] Process `omp.atomic.update` while translating scopes for target device (#132165 ) Fixes a bug introduced by https://github.com/llvm/llvm-project/pull/130078. For non-BlockArgOpenMPOpInterface ops, we also want to map their entry block arguments to their operands, if any. For the current support in the OpenMP dialect, the table below lists all ops that have arguments (SSA operands and/or attributes) and not target-related. Of all these ops, we need to only process `omp.atomic.update` since it is the only op that has SSA operands & an attached region. Therefore, the region's entry block arguments must be mapped to the op's operands in case they are referenced inside the region. \| op \| operands? \| region(s)? \| parent is func? \| processed? \| \|--------------\|-------------\|------------\|------------------\|-------------\| \| atomic.read \| yes \| no \| yes \| no \| \| atomic.write \| yes \| no \| yes \| no \| \| atomic.update \| yes \| yes \| yes \| yes \| \| critical \| no \| no \| yes \| no \| \| declare_mapper \| no \| yes \| no \| no \| \| declare_reduction \| no \| yes \| no \| no \| \| flush \| yes \| no \| yes \| no \| \| private \| no \| yes \| yes \| no \| \| threadprivate \| yes \| no \| yes \| no \| \| yield \| yes \| no \| yes \| no \|	2025-03-20 16:21:09 -05:00
Sergio Afonso	b231f6f862	[MLIR][OpenMP] Improve omp.map.info verification (#132066 ) This patch makes the `map_type` and `map_capture_type` arguments of the `omp.map.info` operation required, which was already an invariant being verified by its users via `verifyMapClause()`. This makes it clearer, as getters no longer return misleading `std::optional` values. Checks for the `mapper_id` argument are moved to a verifier for the operation, rather than being checked by users. Functionally NFC, but not marked as such due to a reordering of arguments in the assembly format of `omp.map.info`.	2025-03-20 15:48:45 +00:00
Kareem Ergawy	b7eb01b3a1	[NFC][OpenMP][MLIR] Refactor code related to collecting privatizer info into a shared util (#131582 ) Moves code needed to collect info about delayed privatizers into a shared util instread of repeating the same patter across all relevant constructs.	2025-03-19 12:06:33 +01:00
Kareem Ergawy	e737b846b4	[flang][OpenMP] Translate OpenMP scopes when compiling for target device (#130078 ) If a `target` directive is nested in a host OpenMP directive (e.g. parallel, task, or a worksharing loop), flang currently crashes if the target directive-related MLIR ops (e.g. `omp.map.bounds` and `omp.map.info` depends on SSA values defined inside the parent host OpenMP directives/ops. This PR tries to solve this problem by treating these parent OpenMP ops as "SSA scopes". Whenever we are translating for the device, instead of completely translating host ops, we just tranlate their MLIR ops as pure SSA values.	2025-03-19 08:26:19 +01:00
Kareem Ergawy	49b8d8472f	[OpenMP][MLIR] Support LLVM translation for `distribute` with delayed privatization (#131564 ) Adds support for tranlating delayed privatization (`private` and `firstprivate`) for `omp.distribute` ops.	2025-03-18 10:14:42 +01:00
Matthias Springer	6c867e27a7	[mlir] Use `getSingleElement`/`hasSingleElement` in various places (#131460 ) This is a code cleanup. Update a few places in MLIR that should use `hasSingleElement`/`getSingleElement`. Note: `hasSingleElement` is faster than `.getSize() == 1` when it is used with linked lists etc. Depends on #131508.	2025-03-17 07:43:18 +01:00
Sergio Afonso	72b8744aa5	[MLIR][OpenMP] Reduce overhead of target compilation (#130945 ) This patch avoids calling `TargetOp::getInnermostCapturedOmpOp` multiple times during initialization of default and runtime target attributes in MLIR to LLVM IR translation of `omp.target` operations. This is a potentially expensive operation, so this change should help keep compile times lower.	2025-03-14 15:18:32 +00:00
Michael Klemm	28ffa7f6a4	[flang][OpenMP] Fix missing missing inode issue (#130798 ) When outlining an offload region, Flang creates a unique name by querying an inode ID. However, when the name of the actual source file does not match the logical file in a `#line` preprocessor directive, code-gen was failing as it could not determine the inode ID. This PR checks for this condition and if the logical file name does not exist, the inode is replaced with a hash value created from the source code itself.	2025-03-13 15:50:37 +01:00
Sergio Afonso	6ff33edf4d	[MLIR][OpenMP] Minor improvements to BlockArgOpenMPOpInterface, NFC (#130789 ) This patch introduces a use for the new `getBlockArgsPairs` to avoid having to manually list each applicable clause. Also, the `numClauseBlockArgs()` function is introduced, which simplifies the implementation of the interface's verifier and enables better memory handling within `getBlockArgsPairs`.	2025-03-13 14:48:19 +00:00
Krzysztof Parzyszek	d67947162f	[flang][OpenMP] Implement HAS_DEVICE_ADDR clause (#128568 ) The HAS_DEVICE_ADDR indicates that the object(s) listed exists at an address that is a valid device address. Specifically, `has_device_addr(x)` means that (in C/C++ terms) `&x` is a device address. When entering a target region, `x` does not need to be allocated on the device, or have its contents copied over (in the absence of additional mapping clauses). Passing its address verbatim to the region for use is sufficient, and is the intended goal of the clause. Some Fortran objects use descriptors in their in-memory representation. If `x` had a descriptor, both the descriptor and the contents of `x` would be located in the device memory. However, the descriptors are managed by the compiler, and can be regenerated at various points as needed. The address of the effective descriptor may change, hence it's not safe to pass the address of the descriptor to the target region. Instead, the descriptor itself is always copied, but for objects like `x`, no further mapping takes place (as this keeps the storage pointer in the descriptor unchanged). --------- Co-authored-by: Sergio Afonso <safonsof@amd.com>	2025-03-10 08:11:01 -05:00
Tom Eccles	ca1833b91e	[mlir][OpenMP] cast address space of private variables (#130301 ) Fixes #130159 The problem is that the alloca created for the private variable uses the default alloca address space in that module, but the function the pointer is being passed to expects a different address space, leading to a type missmatch in the function argument. I know nothing about how AMDGPU is supposed to work. I based this solution on code from createDeviceArgumentAccessor(). Please could somebody from AMD confirm this solution is appropriate.	2025-03-07 18:30:57 +00:00

1 2 3 4 5 ...

321 Commits