llvm-project

Author	SHA1	Message	Date
Kareem Ergawy	42da12063f	[flang][OpenMP] Extend delayed privatization for `omp.simd` (#122156 ) Adds support for delayed privatization for `simd` directives. This PR includes PFT down to LLVM IR lowering.	2025-01-12 07:46:58 +01:00
William Moses	38fcf62483	[MLIR] Import LLVM add flag to disable loadalldialects (#122574 ) Co-authored-by: Oleksandr "Alex" Zinenko <ftynse@gmail.com>	2025-01-11 09:11:22 -05:00
Kareem Ergawy	6f9e688203	[flang][OpenMP] Fix reduction init region block management (#122079 ) Replaces https://github.com/llvm/llvm-project/pull/121886 Fixes https://github.com/llvm/llvm-project/issues/120254 (hopefully 🤞) ## Problem Consider the following example: ```fortran program test real :: x(1) integer :: i !$omp parallel do reduction(+:x) do i = 1,1 x = 1 end do !$omp end parallel do end program ``` The HLFIR+OMP IR for this example looks like this: ```mlir func.func @_QQmain() { ... omp.parallel { %5 = fir.embox %4#0(%3) : (!fir.ref<!fir.array<1xf32>>, !fir.shape<1>) -> !fir.box<!fir.array<1xf32>> %6 = fir.alloca !fir.box<!fir.array<1xf32>> ... omp.wsloop private(@_QFEi_private_ref_i32 %1#0 -> %arg0 : !fir.ref<i32>) reduction(byref @add_reduction_byref_box_1xf32 %6 -> %arg1 : !fir.ref<!fir.box<!fir.array<1xf32>>>) { omp.loop_nest (%arg2) : i32 = (%c1_i32) to (%c1_i32_0) inclusive step (%c1_i32_1) { ... omp.yield } } omp.terminator } return } ``` The problem addressed by this PR is related to: the `alloca` in the `omp.parallel` region + the related `reduction` clause on the `omp.wsloop` op. When we try to translate the reduction from MLIR to LLVM, we have to choose an `alloca` insertion point. This happens in `convertOmpWsloop` where at entry to that function, this is what the LLVM module looks like: ```llvm define void @_QQmain() { %tid.addr = alloca i32, align 4 ... entry: %omp_global_thread_num = call i32 @__kmpc_global_thread_num(ptr @1) br label %omp.par.entry omp.par.entry: %tid.addr.local = alloca i32, align 4 ... br label %omp.par.region omp.par.region: br label %omp.par.region1 omp.par.region1: ... %5 = alloca { ptr, i64, i32, i8, i8, i8, i8, [1 x [3 x i64]] }, align 8 ``` Now, when we choose an `alloca` insertion point for the reduction, this is the chosen block `omp.par.entry` (without the changes in this PR). The problem is that the allocation needed for the reduction needs to reference the `%5` SSA value.
This results in inserting allocations in `omp.par.entry` that reference allocations in a later block `omp.par.region1` which causes the `Instruction does not dominate all uses!` error. ## Possible solution - take 2: This PR contains a more localized solution than https://github.com/llvm/llvm-project/pull/121886. It makes sure that on entry to `initReductionVars`, the IR builder is at a point where we can start inserting the initialization region; to make things cleaner, we still split the builder insertion point to a dedicated `omp.reduction.init`. This way we avoid splitting after the latest allocation block, which is what was causing the issue.	2025-01-09 16:11:18 +01:00
William Moses	1c067a513c	[MLIR] Enable import of non self referential alias scopes (#121987 ) Fixes #121965. --------- Co-authored-by: Christian Ulmann <christianulmann@gmail.com> Co-authored-by: Alex Zinenko <git@ozinenko.com>	2025-01-08 13:40:05 +01:00
Alex MacLean	4583f6d344	[NVPTX] Switch front-ends and tests to ptx_kernel cc (#120806 ) The `ptx_kernel` calling convention is a more idiomatic and standard way of specifying a NVPTX kernel than using the metadata, which is not supposed to change the meaning of the program. Further, checking the calling convention is significantly faster than traversing the metadata, improving compile time. This change updates the clang and mlir frontends as well as the NVPTXCtorDtorLowering pass to emit kernels using the calling convention. In addition, this updates all NVPTX unit tests to use the calling convention as well.	2025-01-07 18:24:50 -08:00
William Moses	b5f21671ef	MLIR: Enable importing inlineasm calls (#121624 )	2025-01-05 11:02:49 -05:00
agozillon	fa56e8bb64	[OpenMP][MLIR] Fix threadprivate lowering when compiling for target when target operations are in use (#119310 ) Currently the compiler will ICE in programs like the following on the device lowering pass: ``` program main implicit none type i1_t integer :: val(1000) end type i1_t integer :: i type(i1_t), pointer :: newi1 type(i1_t), pointer :: tab=>null() integer, dimension(:), pointer :: tabval !$omp THREADPRIVATE(tab) allocate(newi1) tab=>newi1 tab%val(:)=1 tabval=>tab%val !$omp target teams distribute parallel do do i = 1, 1000 tabval(i) = i end do !$omp end target teams distribute parallel do end program main ``` This is due to the fact that THREADPRIVATE returns a result operation, and this operation can actually be used by other LLVM dialect (or other dialect) operations. However, we currently skip the lowering of threadprivate, so we effectively never generate and bind an LLVM-IR result to the threadprivate operation result. So when we later go on to lower dependent LLVM dialect operations, we are missing the required LLVM-IR result, try to access and use it and then ICE. The fix in this particular PR is to allow compilation of threadprivate for device as well as host, and simply treat the device compilation as a no-op, binding the LLVM-IR result of threadprivate with no alterations, which will allow the rest of the compilation to proceed, where we'll eventually discard the host segment in any case. The other possible solution to this I can think of is doing something similar to Flang's passes that occur prior to CodeGen to the LLVM dialect, where they erase/no-op certain unrequired operations or transform them to a lower-level series of operations. And we would erase/no-op threadprivate on device as we'd never have these in target regions.
The main issues I can see with this are that we currently do not specialise this stage based on whether we're compiling for device or host, so it's setting a precedent and adding another point of having to understand the separation between target and host compilation. I am also not sure we'd necessarily want to enforce this at a dialect level in case someone else wishes to add a different lowering flow or translation flow. Another possible issue is that a target operation we have/utilise would depend on the result of threadprivate, meaning we'd not be allowed to entirely erase/no-op it; I am not sure of any situations where this may be an issue currently, though.	2025-01-03 18:01:01 +01:00
Kaviya Rajendiran	d3eb65f15d	[MLIR][OpenMP] Lowering aligned clause to LLVM IR for SIMD directive (#119536 ) This patch: - Added translation support for the aligned clause in the SIMD directive by passing the alignment details to the "llvm.assume" intrinsic. - Updated the insertion point for the llvm.assume intrinsic call in "OMPIRBuilder.cpp". - Added a check in the aligned clause MLIR lowering to ensure that the alignment value is a power of 2.	2025-01-03 16:22:38 +05:30
Thirumalai Shaktivel	cbe583b0bd	[Flang] Add translation support for MutexInOutSet and InOutSet (#120715 ) Implementation details: Mutexinoutset and Inoutset are recognized as flag=0x4 and 0x8 respectively; the flags are set in `kmp_depend_info` and passed as an argument to the `__kmpc_omp_task_with_deps` runtime call	2024-12-26 15:02:09 +05:30
Muhammad Omair Javaid	927a70daf3	Revert "[Flang OpenMP] Add LLVM translation support for UNTIED in Task (#115283 )" This reverts commit 919aead1db64b2f1444842bc75a3af7836238671. It breaks following LLVM bots: https://lab.llvm.org/buildbot/#/builders/199 https://lab.llvm.org/buildbot/#/builders/143 https://lab.llvm.org/buildbot/#/builders/17	2024-12-24 01:47:24 +05:00
Thirumalai Shaktivel	919aead1db	[Flang OpenMP] Add LLVM translation support for UNTIED in Task (#115283 ) Implementation details: The UNTIED clause is recognized by setting the flag=0 for the default case or performing logical OR to flag if other clauses are specified, and this flag is passed as an argument to the `__kmpc_omp_task_alloc` runtime call.	2024-12-20 16:36:51 +05:30
Mehdi Amini	6a7d6c5f69	[MLIR] Add a MLIR_NVVM_EMBED_LIBDEVICE CMake option that embeds libdevice in the binary (#120238 ) This removes a runtime dependency on the CUDA Toolkit path: instead of looking up the filesystem, we use a version of libdevice embedded in the binary at build time.	2024-12-17 16:53:38 +01:00
Mehdi Amini	72e8b9aeaa	[MLIR] Add a BlobAttr interface for attribute to wrap arbitrary content and use it as linkLibs for ModuleToObject (#120116 ) This change allows exposing, through an interface, attributes that wrap content as external resources; the usage inside ModuleToObject shows how we will be able to provide runtime libraries without relying on the filesystem.	2024-12-17 01:30:56 +01:00
Renaud Kauffmann	9919295cfd	[mlir][gpu] Adding ELF section option to the gpu-module-to-binary pass (#119440 ) This is a follow-up of #117246. I thought then it would be easy to edit a DictionaryAttr but it turns out that these attributes are immutable and need to be passed during the construction of the gpu.binary Op. The first commit was using the NVVMTargetAttr to pass the information. After feedback from @fabianmcg, this PR now passes the information through a new option of the gpu-module-to-binary pass. Please add reviewers, as you see fit.	2024-12-16 09:09:41 -08:00
Ivan R. Ivanov	7c9404c279	[flang][OpenMP] Add frontend support for ompx_bare clause (#111106 )	2024-12-13 21:44:43 +09:00
Jie Fu	46ec271e03	[mlir] Fix -Wunused-variable in OpenMPToLLVMIRTranslation.cpp (NFC) /llvm-project/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp:3921:12: error: unused variable 'varType' [-Werror,-Wunused-variable] Type varType = mapInfoOp.getVarType(); ^ 1 error generated.	2024-12-12 22:11:41 +08:00
Kareem Ergawy	f9734b9df1	[mlir][OpenMP] - MLIR to LLVMIR translation support for delayed privatization of allocatables in `omp.target` ops (#116576 ) This PR adds support to translate the `private` clause from MLIR to LLVMIR when used on allocatables in the context of an `omp.target` op. This replaces https://github.com/llvm/llvm-project/pull/113208. Parent PR: https://github.com/llvm/llvm-project/pull/116770. Only the latest commit is relevant to the PR.	2024-12-12 14:39:58 +01:00
Zichen Lu	4971e53612	[mlir][Target] Support Fatbin target for static nvptxcompiler (#118044 ) ### Background In `lib/Target/LLVM/NVVM/Target.cpp`, `NVPTXSerializer` compiles PTX to binary with two different flows controlled by `MLIR_ENABLE_NVPTXCOMPILER`. If building mlir with `-DMLIR_ENABLE_NVPTXCOMPILER=ON`, the flow does not check if the target is `gpu::CompilationTarget::Fatbin`, and compiles PTX to cubin directly, which is not consistent with the other flow. ### Implementation Use the static [nvfatbin](https://docs.nvidia.com/cuda/nvfatbin/index.html) library. I have tested it locally; the two flows return the same Fatbin result when given the same `GpuModule` as input.	2024-12-10 11:45:24 +01:00
Kareem Ergawy	0e70e0edd5	[reapply (#118463 )][OpenMP][OMPIRBuilder] Add delayed privatization support for `wsloop` (#119170 ) This reapplies PR #118463 after introducing a fix for a bug uncovered by the test suite. The problem is that when the alloca block is terminated with a conditional branch, this violates a pre-condition of `allocatePrivateVars` (which assumes the alloca block has a single successor). This new PR includes a test that reproduces the issue. Extend MLIR to LLVM lowering by adding support for `omp.wsloop` for delayed privatization. This also refactors a few bits of code to isolate the logic needed for `firstprivate` initialization in a shared util that can be used across constructs that need it. The same is done for `dealloc` regions.	2024-12-09 14:32:04 +01:00
NimishMishra	9eb4056144	[mlir][llvm] Translation support for task detach (#116601 ) This PR adds translation support for task detach. Essentially, if the `detach` clause is present on a task, emit a `__kmpc_task_allow_completion_event` on it, and store its return (of type `kmp_event_t*`) into the `event_handle`.	2024-12-08 06:09:52 -08:00
Kareem Ergawy	c54616ea48	Revert "[OpenMP][OMPIRBuilder] Add delayed privatization support for `wsloop` (#118463 )" (#118848 )	2024-12-05 20:49:13 +01:00
Kareem Ergawy	0993335134	[OpenMP][OMPIRBuilder] Add delayed privatization support for `wsloop` (#118463 ) Extend MLIR to LLVM lowering by adding support for `omp.wsloop` for delayed privatization. This also refactors a few bits of code to isolate the logic needed for `firstprivate` initialization in a shared util that can be used across constructs that need it. The same is done for `dealloc` regions. Parent PR: https://github.com/llvm/llvm-project/pull/118447. Only the latest commit is relevant for this PR.	2024-12-05 05:59:52 +01:00
Kareem Ergawy	7f72d71de7	[OpenMP][OMPIRBuilder] Refactor reduction initialization logic into one util (#118447 ) This refactors the logic needed to emit init logic for reductions by moving some duplicated code into a shared util. The logic for doing so is quite involved and is needed for any construct that has reductions. Moreover, when a construct has both private and reduction clauses, both sets of clauses need to cooperate with each other when emitting the logic needed for allocation and initialization. Therefore, this PR clearly sets the boundaries for the logic needed to initialize reductions.	2024-12-05 05:23:49 +01:00
Krzysztof Drewniak	92a15dd748	[mlir][LLVM] Plumb range attributes on parameters and results through (#117801 ) We've had the ability to define LLVM's `range` attribute through #llvm.constant_range for some time, and have used this for some GPU intrinsics. This commit allows using `llvm.range` as a parameter or result attribute on function declarations and definitions.	2024-11-27 12:31:51 -06:00
NimishMishra	b9e3a769b9	[flang][mlir][llvm][OpenMP] Add lowering and translation support for mergeable clause on task (#114662 ) Add FIR generation and LLVMIR translation support for mergeable clause on task construct. If mergeable clause is present on a task, the relevant flag in `ompt_task_flag_t` is set and passed to `__kmpc_omp_task_alloc`.	2024-11-26 02:40:26 -08:00
Renaud Kauffmann	7fcc0f9065	Populate the llvm::GlobalVariable ELF section, with the attribute from the ObjectAttrs (#117246 )	2024-11-22 07:58:45 -08:00
arthurqiu	81055ff070	[mlir][nvvm] Add attributes for cluster dimension PTX directives (#116973 ) The PTX programming model provides cluster dimension directives, which are leveraged by the downstream `ptxas` compiler. See https://docs.nvidia.com/cuda/nvvm-ir-spec/#supported-properties and https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#cluster-dimension-directives This PR introduces the cluster dimension directives to MLIR's NVVM dialect as listed below: ``` cluster_dim_{x,y,z} -> exact number of CTAs per cluster cluster_max_blocks -> max number of CTAs per cluster ```	2024-11-20 18:31:01 +01:00
Zichen Lu	08e7609692	[mlir][fix] Add callback functions for ModuleToObject (#116916 ) Here is the [merged MR](https://github.com/llvm/llvm-project/pull/116007) which caused a failure and [was reverted](https://github.com/llvm/llvm-project/pull/116811). Thanks to @joker-eph for the help, I fixed it (`ModuleObject` was not being constructed with the callback functions in `mlir/lib/Target/LLVM/NVVM/Target.cpp`) and split the unit tests that don't need `ptxas` out of the original test so that the tests can run more widely.	2024-11-20 13:22:08 +01:00
Mehdi Amini	af41c55673	Revert "[MLIR] Add callback functions for ModuleToObject" (#116811 ) Reverts llvm/llvm-project#116007 Bot is broken.	2024-11-19 15:28:17 +01:00
Zichen Lu	2153672ba3	[MLIR] Add callback functions for ModuleToObject (#116007 ) In ModuleToObject flow, users may want to add some callback functions invoked with LLVM IR/ISA for debugging or other purposes.	2024-11-19 13:51:08 +01:00
Tom Eccles	a6385a3fc8	[mlir][OpenMP][NFC] use llvm::zip_equal for firstprivate copy region translation (#116416 ) I think this is a bit easier to read.	2024-11-18 10:25:19 +00:00
Victor Perez	4f78f85190	[MLIR][SPIRV] Add definition and (de)serialization for cache controls (#115461 ) [SPV_INTEL_cache_controls](https://htmlpreview.github.io/?https://github.com/KhronosGroup/SPIRV-Registry/blob/main/extensions/INTEL/SPV_INTEL_cache_controls.html) defines decorations for load and store cache control. Add support for this extension in the SPIR-V dialect. As several `CacheControlLoadINTEL` and `CacheControlStoreINTEL` may be applied to the same value, these are represented as array attributes. (De)Serialization takes care of this representation. --------- Signed-off-by: Victor Perez <victor.perez@codeplay.com>	2024-11-18 09:42:31 +01:00
agozillon	b5db75bfce	[OpenMP][MLIR] Descriptor explicit member map lowering changes (#113556 ) This is one of 3 PRs in a PR stack that aims to add support for explicit mapping of allocatable members in derived types. The primary changes in this PR are the OpenMPToLLVMIRTranslation.cpp changes, which are small and seek to alter the current member mapping to add an additional map insertion for pointers. Effectively, if the member is a pointer (currently indicated by having a varPtrPtr field) we add an additional map for the pointer and then alter the subsequent mapping of the member (the data) to utilise the member rather than the parent's base pointer. This appears to be necessary in certain cases when mapping pointer data within record types to avoid segfaulting on device (due to incorrect data mapping). In general this record type mapping may be simplifiable in the future. There are also additions of tests which should help to showcase the effect of the changes above.	2024-11-16 12:26:29 +01:00
lfrenot	40afff7bd9	[mlir][LLVM] Add disjoint flag (#115855 ) The implementation is mostly based on the one existing for the exact flag. disjoint means that for each bit, that bit is zero in at least one of the inputs. This allows the Or to be treated as an Add since no carry can occur from any bit. If the disjoint keyword is present, the result value of the or is a [poison value](https://llvm.org/docs/LangRef.html#poisonvalues) if both inputs have a one in the same bit position. For vectors, only the element containing the bit is poison.	2024-11-15 13:48:01 +01:00
agozillon	d84d0caf28	[Flang][OpenMP] Update MapInfoFinalization to use BlockArgs Interface and modify use_device_ptr/addr to be order independent (#113919 ) This patch primarily updates the MapInfoFinalization pass to utilise the BlockArgument interface. It also moves the arguments newly added by the MapInfoFinalization pass to the end of the BlockArg/Relevant MapInfo lists, instead of one prior to the owning descriptor type. During this it was noted that the use_device_ptr/addr handling of target data was a little bit too order dependent so I've attempted to make it less so, as we cannot depend on argument ordering to be the same as Fortran for any future frontends.	2024-11-14 15:47:37 +01:00
lfrenot	89aaf2cf68	[mlir][LLVM] Add nneg flag (#115498 ) This implementation is based on the existing one for the exact flag. If the nneg flag is set and the argument is negative, the result is a poison value.	2024-11-11 14:01:50 +01:00
lfrenot	afa178d360	[mlir][LLVM] Add exact flag (#115327 ) The implementation is mostly based on the one existing for the nsw and nuw flags. If the exact flag is present, the corresponding operation returns a poison value when the result is not exact. (For a division, if rounding happens; for a right shift, if a non-zero bit is shifted out.)	2024-11-08 13:56:44 +01:00
Matthias Springer	b613a54075	[mlir][IR][NFC] Cleanup insertion point API usage (#115415 ) Use `setInsertionPointToStart` / `setInsertionPointToEnd` when possible.	2024-11-08 14:31:27 +09:00
Tom Eccles	8269c400b4	[mlir][OpenMP][NFC] delayed privatisation cleanup (#115298 ) Upstreaming some code cleanups ahead of supporting delayed task execution. - Make allocatePrivateVars not need to be a template (it will need to operate separately on firstprivate and private variables for delayed task execution so it can't index into lists of all variables in the operation). - Use llvm::SmallVectorImpl for function arguments - collectPrivatizationDecls already reserves size for privateDecls so we don't need to do that in callers - Use llvm::zip_equal instead of C-style array indexing	2024-11-07 12:27:31 +00:00
Ilya Enkovich	2f743ac52e	[MLIR] [AMX] Utilize x86_amx type for AMX dialect in MLIR. (#111197 ) This patch is intended to resolve #109481 and improve the usability of the AMX dialect. In LLVM IR, AMX intrinsics use `x86_amx` which is one of the primitive types. This type is supposed to be used for AMX intrinsic calls and no other operations. AMX dialect of MLIR uses regular 2D vector types, which are then lowered to arrays of vectors in the LLVMIR dialect. This creates an inconsistency in the types used in the LLVMIR dialect and LLVMIR. Translation of AMX intrinsic calls to LLVM IR doesn't require result types to match and that is where tile loads and mul operation results get `x86_amx` type. This works in very simple cases when mul and tile store operations directly consume the result of another AMX intrinsic call, but it doesn't work when an argument is a block argument (phi node). In addition to translation problems, this inconsistency between types used in MLIR and LLVM IR makes MLIR verification and transformation quite problematic. Both `amx.tileload` and `vector::transfer_read` can load values of the same type, but only one of them can be used in AMX operations. In general, by looking at the type of a value, we cannot determine whether it can only be used for AMX operations or, on the contrary, can be used in any operations except AMX ones. To remove this inconsistency and make AMX operations more explicit in their limitations, I propose to add `LLVMX86AMXType` type to the LLVMIR dialect to match `x86_amx` type in LLVM IR, and introduce `amx::TileType` to be used by AMX operations in MLIR. This resolves translation problems for AMX usage with phi nodes and provides proper type verification in MLIR for AMX operations. P.S. This patch also adds missing FP16 support. It's trivial but unrelated to type system changes, so let me know if I should submit it separately. --------- Signed-off-by: Ilya Enkovich <ilya.enkovich@intel.com>	2024-11-06 14:30:53 +00:00
Tom Eccles	28452acac0	[mlir][OpenMP] delayed privatisation for TASK (#114785 ) This uses essentially an identical implementation to that used for ParallelOp. The private variable allocation and deallocation use shared functions to avoid code duplication. FIRSTPRIVATE variable copying uses duplicated code for now because I anticipate the implementation diverging in the near future once I store data for firstprivate variables in the task description structure. After enabling delayed privatisation for TASK in flang, one more test in the fujitsu test suite passes (I haven't looked into why).	2024-11-06 13:19:12 +00:00
Zichen Lu	f87484d591	Fix libnvptxcompiler_static.a absolute path (#115015 ) Now when building llvm-solid with `-DMLIR_ENABLE_NVPTXCOMPILER=ON`, there will be an absolute path (`/path/to/libnvptxcompiler_static.a`) in MLIRNVVMTarget dependencies (in `/build/path/install/lib/cmake/mlir/MLIRTargets.cmake`). For example, ```cmake set_target_properties(MLIRNVVMTarget PROPERTIES INTERFACE_LINK_LIBRARIES "MLIRIR;MLIRExecutionEngineUtils;MLIRSupport;MLIRGPUDialect;MLIRTargetLLVM;MLIRNVVMToLLVMIRTranslation;LLVMSupport;/path/to/libnvptxcompiler_static.a" ) ``` If a downstream project uses a pre-built llvm and depends on MLIRNVVMTarget, it may fail to build because the `libnvptxcompiler_static.a` absolute path does not exist. After this commit, there will be no absolute path in `/build/path/install/lib/cmake/mlir/MLIRTargets.cmake` ```cmake set_target_properties(MLIRNVVMTarget PROPERTIES INTERFACE_LINK_LIBRARIES "MLIRIR;MLIRExecutionEngineUtils;MLIRSupport;MLIRGPUDialect;MLIRTargetLLVM;MLIRNVVMToLLVMIRTranslation;LLVMSupport;\$<LINK_ONLY:MLIR_NVPTXCOMPILER_LIB>" ) ``` Then a downstream project can modify the `libnvptxcompiler_static.a` path and use cmake to build. For example, ```cmake # find_library(...) add_library(MLIR_NVPTXCOMPILER_LIB STATIC IMPORTED GLOBAL) set_property(TARGET MLIR_NVPTXCOMPILER_LIB PROPERTY IMPORTED_LOCATION ${...}) ```	2024-11-06 11:51:18 +01:00
Sergio Afonso	d3e796c2d0	[MLIR][OpenMP] Update not-yet-implemented errors, NFC (#114966 ) This patch improves not-yet-implemented error diagnostics to more closely follow the format used by Flang lowering for the same kind of errors. This helps keep some level of uniformity from a user perspective.	2024-11-05 12:48:54 +00:00
Sergio Afonso	6c28530ed0	[Flang][OpenMP] Properly bind arguments of composite operations (#113682 ) When composite constructs are lowered, clauses for each leaf construct are lowered before creating the set of loop wrapper operations, using these outside values to populate their operand lists. Then, when the loop nest associated to that composite construct is lowered, the binding of Fortran symbols to the entry block arguments defined by these loop wrappers is performed, resulting in the creation of `hlfir.declare` operations in the entry block of the `omp.loop_nest`. This approach prevents `hlfir.declare` operations related to the binding and other operations resulting from the evaluation of the clauses from being inserted between loop wrapper operations, which would be an illegal MLIR representation. However, this introduces the problem of entry block arguments defined by a wrapper that then should be used by one of its nested wrappers, because the corresponding Fortran symbol would still be mapped to an outside value at the time of gathering the list of operands for the nested wrapper. This patch adds operand re-mapping logic to update wrappers without changing when clauses are evaluated or where the `hlfir.declare` creation is performed.	2024-10-31 16:39:53 +00:00
Sergio Afonso	bd6c21460f	[MLIR][OpenMP] Emit descriptive errors for all unsupported clauses (#114037 ) This patch improves error reporting in the MLIR to LLVM IR translation pass for the 'omp' dialect by emitting descriptive errors when encountering clauses not yet supported by that pass. Additionally, not-yet-implemented errors previously missing for some clauses are added, to avoid silently ignoring them. Error messages related to inlining of `omp.private` and `omp.declare_reduction` regions have been updated to use the same format.	2024-10-31 11:59:51 +00:00
Sergio Afonso	21a6032eca	[MLIR][OpenMP] Simplify translation to LLVM IR error handling (#114036 ) This patch unifies the handling of errors passed through the OpenMPIRBuilder and removes some redundant error messages through the introduction of a custom `ErrorInfo` subclass. Additionally, the current list of operations and clauses unsupported by the MLIR to LLVM IR translation pass is added to a new Lit test to check they are being reported to the user.	2024-10-31 11:34:24 +00:00
Abid Qadeer	89f2d50cda	[mlir][debug] Support DIGenericSubrange. (#113441 ) `DIGenericSubrange` is used when the dimensions of the arrays are unknown at build time (e.g. assumed-rank arrays in Fortran). It has the same `lowerBound`, `upperBound`, `count` and `stride` fields as `DISubrange`, and its translation looks quite similar as a result. --------- Co-authored-by: Tobias Gysi <tobias.gysi@nextsilicon.com>	2024-10-31 10:09:26 +00:00
Sergio Afonso	a1f2fb6078	[MLIR][OpenMP] Prevent composite omp.simd related crashes (#113680 ) This patch updates the translation of `omp.wsloop` with a nested `omp.simd` to prevent uses of block arguments defined by the latter from triggering null pointer dereferences. This happens because the inner `omp.simd` operation representing composite `do simd` constructs is currently skipped and not translated, but this results in block arguments defined by it not being mapped to an LLVM value. The proposed solution is to map these block arguments to the LLVM value associated to the corresponding operand, which is defined above.	2024-10-29 17:05:12 +00:00
Sergio Afonso	d87964de78	[OpenMP][OMPIRBuilder] Error propagation across callbacks (#112533 ) This patch implements an approach to communicate errors between the OMPIRBuilder and its users. It introduces `llvm::Error` and `llvm::Expected` objects to replace the values returned by callbacks passed to `OMPIRBuilder` codegen functions. These functions then check the result for errors when callbacks are called and forward them back to the caller, which has the flexibility to recover, exit cleanly or dump a stack trace. This prevents a failed callback from leaving the IR in an invalid state while the codegen process continues, triggering unrelated assertions or segmentation faults. In the case of MLIR to LLVM IR translation of the 'omp' dialect, this change results in the compiler emitting errors and exiting early instead of triggering a crash for not-yet-implemented errors. The behavior in Clang and openmp-opt stays unchanged, since callbacks will continue to always return 'success'.	2024-10-25 11:30:16 +01:00
Kareem Ergawy	ad70f3e095	[flang][OpenMP] Support `target enter\|update\|exit .. nowait` (#113305 ) Extends `nowait` support for other device directives. This PR refactors the task generation utils used for the `target` directive so that they are general enough to be reused for other device directives as well.	2024-10-23 10:48:54 +02:00
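
The poison conditions described in the `disjoint`, `nneg` and `exact` flag commits listed above (#115855, #115498, #115327) can be modelled in plain C++. This is only a sketch of the semantics as those commit messages state them, not code from the MLIR LLVM dialect: the `MaybePoison` struct and the function names `orDisjoint`, `udivExact`, `lshrExact` and `zextNneg` are hypothetical, 32-bit unsigned operands are assumed, and "negative" for `nneg` is read as "sign bit set", as for LLVM's `zext nneg`.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for an LLVM value that may be poison.
struct MaybePoison {
  bool poison;          // true when the flag's precondition was violated
  std::uint64_t value;  // meaningful only when poison is false
};

// `or disjoint`: poison if both inputs have a one in the same bit position.
// Otherwise no carry can occur, so a | b equals a + b.
inline MaybePoison orDisjoint(std::uint32_t a, std::uint32_t b) {
  if ((a & b) != 0)
    return {true, 0};
  return {false, static_cast<std::uint64_t>(a | b)};
}

// `udiv exact`: poison if rounding happens, i.e. the remainder is non-zero.
// This sketch requires b != 0 (division by zero is not modelled here).
inline MaybePoison udivExact(std::uint32_t a, std::uint32_t b) {
  assert(b != 0 && "division by zero is outside this sketch");
  if (a % b != 0)
    return {true, 0};
  return {false, static_cast<std::uint64_t>(a / b)};
}

// `lshr exact`: poison if a non-zero bit is shifted out.
// This sketch requires shift < 32.
inline MaybePoison lshrExact(std::uint32_t a, unsigned shift) {
  assert(shift < 32 && "oversized shifts are outside this sketch");
  const std::uint32_t shiftedOutMask = (1u << shift) - 1u;
  if ((a & shiftedOutMask) != 0)
    return {true, 0};
  return {false, static_cast<std::uint64_t>(a >> shift)};
}

// `zext nneg` from 32 to 64 bits: poison if the argument is negative,
// i.e. its sign bit is set when read as a signed 32-bit integer.
inline MaybePoison zextNneg(std::uint32_t a) {
  if ((a >> 31) != 0)
    return {true, 0};
  return {false, static_cast<std::uint64_t>(a)};
}
```

For instance, `orDisjoint(0x5, 0xA)` is not poison and yields `0xF`, the same as `0x5 + 0xA`, while `orDisjoint(0x6, 0x3)` is poison because bit 1 is set in both inputs.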

1188 Commits