This patch replaces SmallSet<T *, N> with SmallPtrSet<T *, N>. Note
that SmallSet.h "redirects" SmallSet to SmallPtrSet for pointer
element types:
template <typename PointeeType, unsigned N>
class SmallSet<PointeeType*, N> : public SmallPtrSet<PointeeType*, N>
{};
We only have 140 instances that rely on this "redirection", with the
vast majority of them under llvm/. Since relying on the redirection
doesn't improve readability, this patch replaces SmallSet with
SmallPtrSet for pointer element types.
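For illustration, the replacement is of this shape (the element type here is
just an example):
```cpp
#include "llvm/ADT/SmallPtrSet.h"
#include "llvm/ADT/SmallSet.h"
#include "llvm/IR/BasicBlock.h"

// Before: compiles only because SmallSet redirects to SmallPtrSet for
// pointer element types.
llvm::SmallSet<llvm::BasicBlock *, 8> VisitedViaRedirect;

// After: names the container that is actually used.
llvm::SmallPtrSet<llvm::BasicBlock *, 8> Visited;
```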
Fixes #153043.
This is another case of the debug location not getting updated when the
insert point is changed by `restoreIP`. Fixed by using the wrapper
function that updates the debug location.
A function added in PR #151306 was under the NDEBUG macro, which caused
the build to fail in certain cases. It has been moved out of the #ifdef
check to ensure it is always compiled.
This fixes #147063.
I tried to fix this issue in a more general way in
https://github.com/llvm/llvm-project/pull/147091, but the reviewer
suggested fixing the locations that are causing this issue, so this is a
more targeted approach.
`restoreIP` is frequently used in the `OMPIRBuilder` to change the
insert position. This function eventually calls
`SetInsertPoint(BasicBlock *TheBB, BasicBlock::iterator IP)`, which
updates both the insert point and the debug location. But if `IP` points
to the end of `TheBB`, the debug location is not updated and we can end
up with a mismatch between the insert point and the debug location.
The problem can occur in 2 different code patterns.
This code below shows the first scenario.
```
1. auto curPos = builder.saveIP();
2. builder.restoreIP(/* some new pos */);
3. // generate some code
4. builder.restoreIP(curPos);
```
If `curPos` points to the end of a basic block, we could have a problem.
But it is an easy one to handle, as we have the location beforehand and
can save the correct debug location before 2 and then restore it after 3.
This can be done either manually or using the `llvm::InsertPointGuard`
as shown below.
```
// manual approach
auto curPos = builder.saveIP();
llvm::DebugLoc DbgLoc = builder.getCurrentDebugLocation();
builder.restoreIP(/* some new pos */);
// generate some code
builder.SetCurrentDebugLocation(DbgLoc);
builder.restoreIP(curPos);

{
  // using InsertPointGuard
  llvm::InsertPointGuard IPG(builder);
  builder.restoreIP(/* some new pos */);
  // generate some code
}
```
This PR fixes one problematic case using the manual approach.
For the 2nd scenario, look at the code below.
```
1. void fn(InsertPointTy allocIP, InsertPointTy codegenIP) {
2. builder.setInsertPoint(allocIP);
3. // generate some alloca
4. builder.setInsertPoint(codegenIP);
5. }
```
The `fn` can be called from anywhere and we can't assume that the debug
location of the builder is valid at the start of the function. So if 4
does not update the debug location because `codegenIP` points at the end
of the block, the rest of the code can end up using the debug location
of the `allocIP`. Unlike the first case, we don't have a debug location
that we can save beforehand and restore afterwards.
The solution here is to use the location of the last instruction in that
block. I have added a wrapper function over `restoreIP` that can be
called in such cases. This PR uses it to fix one problematic case.
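A minimal sketch of what such a wrapper can look like (the name and exact
logic here are illustrative of the idea, not the API added by this PR):
```cpp
#include "llvm/IR/IRBuilder.h"

// Restore the insertion point and, when IP is at the end of a non-empty block,
// take the debug location from the block's last instruction so that the two
// stay consistent.
static void restoreIPWithDebugLoc(llvm::IRBuilderBase &Builder,
                                  llvm::IRBuilderBase::InsertPoint IP) {
  Builder.restoreIP(IP);
  llvm::BasicBlock *BB = IP.getBlock();
  if (BB && IP.getPoint() == BB->end() && !BB->empty())
    Builder.SetCurrentDebugLocation(BB->back().getDebugLoc());
}
```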
Scan reductions are supported in OpenMP with the help of the `scan`
directive. The reduction clause of the for loop/simd directive can take
an `inscan` modifier, with the body of the directive containing a `scan`
directive. This PR implements the lowering logic for scan reductions in
OpenMP worksharing loops.
The body of the loop is split into two loops (the input phase loop and
the scan phase loop) and a scan reduction loop is added in the middle.
The input phase loop populates a temporary buffer with the initial
values that are to be reduced. The buffer is used by the reduction loop
to perform the scan reduction. The scan phase loop copies the values of
the buffer to the reduction variable before executing the scan phase.
Below is a high-level view of the code generated.
```
<declare pointer to buffer> ptr
omp parallel {
  size num_iters = <num_iters>
  // temp buffer allocation
  omp masked {
    buff = malloc(num_iters * scanvarstype)
    *ptr = buff
  }
  barrier;
  // input phase loop
  for (i: 0..<num_iters>) {
    <input phase>;
    buffer = *ptr;
    buffer[i] = red;
  }
  // scan reduction
  omp masked {
    for (int k = 0; k != ceil(log2(num_iters)); ++k) {
      i = pow(2, k)
      for (size cnt = last_iter; cnt >= i; --cnt) {
        buffer = *ptr;
        buffer[cnt] op= buffer[cnt-i];
      }
    }
  }
  barrier;
  // scan phase loop
  for (i: 0..<num_iters>) {
    buffer = *ptr;
    red = buffer[i];
    <scan phase>;
  }
  // temp buffer deletion
  omp masked {
    free(*ptr)
  }
  barrier;
}
```
The temporary buffer needs to be shared between all threads performing
the reduction since it is read/written in the input and scan phase
worksharing loops. This is achieved by declaring a pointer to the buffer
in the shared region and dynamically allocating the buffer from the
master thread. This is why allocation, deallocation and scan reduction
are performed within `masked`. The code is verified to produce correct
results for Fortran programs with the code changes in
https://github.com/llvm/llvm-project/pull/133149
This adds MLIR-to-LLVM support for atomic control options.
Atomic control options are used to specify architectural characteristics
that help the lowering of atomic operations. The options are:
`-f[no-]atomic-remote-memory`, `-f[no-]atomic-fine-grained-memory`, and
`-f[no-]atomic-ignore-denormal-mode`.
The legacy option `-m[no-]unsafe-fp-atomics` is aliased to
`-f[no-]atomic-ignore-denormal-mode`.
More details can be found in
https://github.com/llvm/llvm-project/pull/102569. This PR implements the
MLIR to LLVM lowering support of atomic control attributes specified
with OpenMP `atomicUpdateOp`.
Initial support can be found in PR:
https://github.com/llvm/llvm-project/pull/150860
`LocationDescription` contains both the insertion point and the debug
location. When a `LocationDescription` is available, it is better to use
`updateToLocation`, which will update both. This PR replaces
`restoreIP(Loc.IP)` with `updateToLocation(Loc)` as the former may not
update the debug location in all cases.
I am not checking the return value of `updateToLocation` because it is
checked just a few lines above in all cases and we would have returned
early if it had failed.
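A sketch of the shape of the change at an illustrative call site:
```cpp
// Before: restores only the insertion point; the builder's debug location can
// become stale relative to the new block.
Builder.restoreIP(Loc.IP);

// After: updates both the insertion point and the debug location. The return
// value is already checked a few lines earlier on these paths.
updateToLocation(Loc);
```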
Reverts llvm/llvm-project#147950
I noticed some failures in the reduction tests with clang after this
change. I need to understand the failures better, so I am reverting this
for now.
Reduction support: https://github.com/llvm/llvm-project/pull/146671
IF support is fixed in this PR
The problem with the IF clause in composite constructs was that wsloop
and simd both operate on the same CanonicalLoopInfo structure, with the
SIMD processed first, followed by the wsloop. Previously the IF clause
generated code like
```
if (cond) {
  while (...) {
    simd_loop_body;
  }
} else {
  while (...) {
    nonsimd_loop_body;
  }
}
```
The problem is that this invalidates the CanonicalLoopInfo structure
that is to be processed by the wsloop later. To avoid this, in this
patch I preserve the original loop, moving the IF clause inside of the
loop:
```
while (...) {
  if (cond) {
    simd_loop_body;
  } else {
    non_simd_loop_body;
  }
}
```
On simple examples I tried, LLVM was able to hoist the if condition out
of the loop at -O3.
The disadvantage of this is that we cannot add the
llvm.loop.vectorize.enable attribute on either the SIMD or non-SIMD
loops because they both share a loop back edge. There's no way of
solving this without keeping the old design of having two different
loops, which cannot be represented using only one CanonicalLoopInfo
structure. I don't think the presence or absence of this attribute makes
much difference; in my testing it is the llvm.loop.parallel_accesses
metadata which makes the difference to vectorization. LLVM will
vectorize if legal whether or not this attribute is there in the TRUE
branch. In the FALSE branch this means the loop might be vectorized even
when the condition is false, but I think this is still standards
compliant: OpenMP 6.0 says that when the IF clause is false it should be
treated as if the SIMDLEN clause were one, and the SIMDLEN clause is
defined as a "hint". For the same reason, the SIMDLEN and SAFELEN
clauses are silently ignored when SIMD IF is used.
I think it is better to implement SIMD IF and ignore SIMDLEN and SAFELEN
and some vectorization-encouragement metadata when combined with IF than
to ignore IF, because IF could have correctness consequences whereas the
rest are optimization hints. For example, the user might use the IF
clause to disable SIMD programmatically when it is known not to be safe
to vectorize the loop. In that case it is not at all safe to add the
parallel access or SAFELEN metadata.
We have this pattern of code in OMPIRBuilder for many functions that are
used in reduction operations.
```
Function *LtGRFunc = Function::Create
BasicBlock *EntryBlock = BasicBlock::Create(Ctx, "entry", LtGRFunc);
Builder.SetInsertPoint(EntryBlock);
```
The insertion point is moved to the new function but the debug location
is not updated. This means that the reduction function will use a debug
location that points to another function. The problem gets hidden
because these functions get inlined, but the potential for failure
exists.
This patch resets the debug location when the insertion point is moved
to a new function. Some `InsertPointGuard`s have been added to make sure
we restore the debug location correctly when we are done with the
reduction function.
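A minimal sketch of the fixed shape (assuming `Builder`, `FuncTy`, `M`, and
`Ctx` are in scope; the function and block names are illustrative):
```cpp
{
  // Saves and later restores both the insertion point and the debug location.
  llvm::IRBuilderBase::InsertPointGuard IPG(Builder);

  llvm::Function *LtGRFunc = llvm::Function::Create(
      FuncTy, llvm::GlobalValue::InternalLinkage, ".omp.reduction.func", &M);
  llvm::BasicBlock *EntryBlock =
      llvm::BasicBlock::Create(Ctx, "entry", LtGRFunc);
  Builder.SetInsertPoint(EntryBlock);
  // The new function has no meaningful location yet; clear the stale one so we
  // don't emit instructions whose !dbg points at the caller's subprogram.
  Builder.SetCurrentDebugLocation(llvm::DebugLoc());
  // ... emit the body of the reduction helper ...
} // The previous insertion point and debug location are restored here.
```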
Implement a state machine that consumes tokens (words delimited by
whitespace) and returns the corresponding directive id, or fails if the
tokens do not form a valid name.
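Roughly, the matcher has this shape (a hand-written sketch, not the
TableGen-generated code; the table contents and the `Directive` enum below are
stand-ins):
```cpp
#include "llvm/ADT/ArrayRef.h"
#include "llvm/ADT/DenseMap.h"
#include "llvm/ADT/StringMap.h"
#include "llvm/ADT/StringRef.h"
#include <optional>
#include <vector>

enum class Directive { TargetTeamsDistribute /*, ... */ };

// Transitions[state] maps the next token to the next state; Accept maps an
// accepting state to the directive id it spells.
static std::vector<llvm::StringMap<int>> Transitions;
static llvm::DenseMap<int, Directive> Accept;

static std::optional<Directive>
matchDirective(llvm::ArrayRef<llvm::StringRef> Tokens) {
  int State = 0; // start state
  for (llvm::StringRef Tok : Tokens) {
    auto It = Transitions[State].find(Tok);
    if (It == Transitions[State].end())
      return std::nullopt; // no transition: tokens do not form a valid name
    State = It->second;
  }
  auto Acc = Accept.find(State);
  if (Acc == Accept.end())
    return std::nullopt; // stopped in a non-accepting state
  return Acc->second;
}
```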
This is a combination of https://github.com/llvm/llvm-project/pull/138149
and https://github.com/llvm/llvm-project/pull/138039, which were opened
separately for ease of reviewing. The only other change is adjustments
to 2 tests which have gone in since.
There are `DeclareOp`s present for the variables mapped into the target
region, which allow us to generate debug information for them. But the
`TargetOp` is still part of the parent function, so those variables get
the parent function's `DISubprogram` as their scope.
In `OMPIRBuilder`, a new function is created for the `TargetOp`. We also
create a new `DISubprogram` for it. All the variables that were in the
target region now have to be updated to have the correct scope. This
after-the-fact updating of debug information becomes very difficult in
certain cases. Take the example of variable-sized arrays: the types of
those arrays depend on the artificial `DILocalVariable`(s) which hold
the size(s) of the array. The new function would now require that we
generate new variables and new types. A similar issue exists for
character-type variables.
To avoid this after-the-fact updating, this PR generates a
`DISubprogramAttr` for the `TargetOp` while generating the debug info in
`flang`. Then we don't need to generate a `DISubprogram` in
`OMPIRBuilder`. This change is made a bit more complicated by the fact
that, in the new scheme, the debug location already points to the new
`DISubprogram` by the time it reaches `convertOmpTarget`. But we need
some code generation in the parent function, so we have to carefully
manage the debug locations.
This fixes issue #134991.
The code in `OpenMPIRBuilder::getTargetEntryUniqueInfo` calls
`ID.getDevice()` even when `getUniqueID` has failed and `ID` is
uninitialized. This caused a sanitizer failure for me in
https://github.com/llvm/llvm-project/pull/145026. Fix it by giving a
default value to `ID`. The value chosen is the same as used in
`OpenMPToLLVMIRTranslation.cpp`.
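A sketch of the shape of the fix (`FileName` and the placeholder default are
assumptions; the actual default matches `OpenMPToLLVMIRTranslation.cpp`):
```cpp
llvm::sys::fs::UniqueID ID(/*Device=*/0, /*File=*/0); // placeholder default
if (llvm::sys::fs::getUniqueID(FileName, ID)) {
  // getUniqueID failed: ID keeps its default value instead of being read
  // uninitialized by the ID.getDevice() call that follows.
}
```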
CodeExtractor can currently erroneously insert an alloca into a
different function than the one it inserts the alloca's users into, in
cases where code is being extracted out of a function that has already
been outlined. Add an assertion that the two blocks being inserted into
are actually in the same function.
Add a check to `findAllocaInsertPoint` in the OpenMP to LLVMIR
translation to prevent the aforementioned scenario from happening.
OpenMPIRBuilder relies on a callback mechanism to fix up a module later
on during the finalise step. In some cases this results in the module
being invalid prior to the finalise step running. Remove calls to
verifyModule wrapped in LLVM_DEBUG from CodeExtractor, as their presence
results in the compiler crashing with `-mllvm -debug` due to premature
module verification, where it would not crash without `-debug`.
Call `ompBuilder->finalize()` at the end of
`mlir::translateModuleToLLVMIR` to make sure the module has actually
been finalized prior to trying to verify it.
Resolves https://github.com/llvm/llvm-project/issues/138102.
---------
Signed-off-by: Kajetan Puchalski <kajetan.puchalski@arm.com>
Without this, gcc warned like:
/repo/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp:7559:68: warning: suggest parentheses around '&&' within '||' [-Wparentheses]
7559 | NumStaleCIArgs == (OffloadingArraysToPrivatize.size() + 2) &&
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~
7560 | "Wrong number of arguments for StaleCI when shareds are present");
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
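The warning points at operator precedence inside an assertion; the fix is to
parenthesize the `&&` explicitly. A sketch of the shape (the `assert` and its
left-hand `!HasShareds` operand are assumptions reconstructed from the
diagnostic):
```cpp
assert(!HasShareds ||
       (NumStaleCIArgs == (OffloadingArraysToPrivatize.size() + 2) &&
        "Wrong number of arguments for StaleCI when shareds are present"));
```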
In OMPIRBuilder, we have many cases where we don't handle the debug
location correctly while changing the location or insertion point. This
is one of those cases.
Please see the following test program.
```
program main
implicit none
integer i, j
integer array(16384)
!$omp target teams distribute
DO i=1,16384
!$omp parallel do
DO j=1,16384
array(j) = i
ENDDO
!$omp end parallel do
ENDDO
!$omp end target teams distribute
print *, array
end program main
```
When compiled with the command
`flang -g -O2 -fopenmp test.f90 -o test --offload-arch=gfx90a`, it fails
verification with the following error: `!dbg attachment points at wrong
subprogram for function`.
This happens because we were dropping the debug location in
`createCanonicalLoop`, so calls to functions like
`__kmpc_distribute_static_4u` get generated without a debug location.
When such a function gets inlined, the locations inside it are not
adjusted because the call instruction does not have a debug location
(`llvm/lib/Transforms/Utils/InlineFunction.cpp:fixupLineNumbers`).
Later, the verifier finds that the caller has instructions with debug
locations that point to another function, and fails.
The fix is simply to not drop the debug location.
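A sketch of the shape of the fix (block and builder names are illustrative):
```cpp
// Preserve the current debug location across the insert-point change instead
// of letting it be dropped, so that runtime calls such as
// __kmpc_distribute_static_4u are emitted with a !dbg attachment.
llvm::DebugLoc DL = Builder.getCurrentDebugLocation();
Builder.SetInsertPoint(LoopBodyBB);
Builder.SetCurrentDebugLocation(DL);
```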
When we offload to the target, the pointers to data used by the kernel
are passed in arrays created by `OMPIRBuilder`. These arrays of pointers
are allocated on the stack on the host. This is fine for the most part
because, absent the `nowait` clause, the default behavior is that target
tasks are included tasks. That is, the host is blocked until the
offloaded target kernel is done. In turn, this means that the host's
stack frame is intact and accessing the array of pointers when
offloading is safe. However, when `nowait` is used on the `!$omp target`
construct, the target task is a deferred task, meaning the generating
task on the host does not have to wait for the target task to finish. In
such cases, it is very likely that the stack frame of the function
invoking the target call has been unwound, leading to memory access
errors as shown below.
```
AMDGPU error: Error in hsa_amd_memory_pool_allocate: HSA_STATUS_ERROR_INVALID_ALLOCATION: The requested allocation is not valid.
AMDGPU error: Error in hsa_amd_memory_pool_allocate: HSA_STATUS_ERROR_INVALID_ALLOCATION: The requested allocation is not valid. "PluginInterface" error: Failure to allocate device memory: Failed to allocate from memory manager
fort.cod.out: /llvm/llvm-project/offload/plugins-nextgen/common/src/PluginInterface.cpp:1434: Error llvm::omp::target::plugin::PinnedAllocationMapTy::lockMappedHostBuffer(void *, size_t): Assertion `HstPtr && "Invalid pointer"' failed.
Aborted (core dumped)
```
This PR implements support in `OMPIRBuilder` to store these arrays of
pointers in the task structure that is passed to the target task,
thereby ensuring they are available when the target task is eventually
scheduled.
---------
Co-authored-by: Sergio Afonso <safonsof@amd.com>
In "get<lang>DirectiveName(Kind, Version)", return the spelling that
corresponds to Version, and in "get<lang>DirectiveKindAndVersions(Name)"
return the pair {Kind, VersionRange}, where VersionRange contains the
minimum and the maximum versions that allow "Name" as a spelling. This
applies to clauses as well. In general it applies to classes that have
spellings (defined via TableGen class "Spelling").
Given a Kind and a Version, getting the corresponding spelling requires
a runtime search (which can fail in the general case). To avoid
generating the search function inline, a small additional component of
llvm/Frontend was added: LLVMFrontendDirective. The corresponding header
file also defines the C++ classes "Spelling" and "VersionRange", which
are used in TableGen/DirectiveEmitter as well.
For background information see
https://discourse.llvm.org/t/rfc-alternative-spellings-of-openmp-directives/85507
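A hypothetical usage sketch of the OpenMP instantiations (the exact namespaces,
signatures, and the `Min`/`Max` field names are assumptions based on the
description above):
```cpp
// Spelling of a directive as of a given version of the spec.
llvm::StringRef Name =
    llvm::omp::getOpenMPDirectiveName(llvm::omp::OMPD_target, /*Version=*/52);

// Kind plus the range of versions that accept a given spelling.
auto [Kind, Versions] = llvm::omp::getOpenMPDirectiveKindAndVersions("target");
// Versions.Min / Versions.Max bound the versions allowing this spelling.
```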
This PR fixes a crash that currently happens given the following input:
```fortran
subroutine caller()
real :: x
integer :: i
!$omp target
x = i
call callee(x,x)
!$omp end target
endsubroutine caller
subroutine callee(x1,x2)
real :: x1, x2
endsubroutine callee
```
The crash happens because the following sequence of events takes place
in the `OMPIRBuilder`:
1. ....
2. An outlined function for the target region is created. At first, the
outlined function still refers to the SSA values from the original
function of the target region.
3. The builder then iterates over the users of the SSA values used in
the target region to replace them with the corresponding function
arguments of the outlined function.
4. If the same instruction references the SSA value more than once (say
m times), all uses of that SSA value in the instruction are replaced at
once, deleting all m uses of the value.
5. The next m-1 iterations will still iterate over the same instruction,
dropping the last m-1 actual users of the value.
Hence, we collect all users first before modifying them, as sketched
below.
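A minimal sketch of the collect-then-rewrite pattern (`Input`, `OutlinedFn`,
and `ReplacementArg` are illustrative names):
```cpp
// Snapshot the users before rewriting: replacing several uses inside a single
// instruction must not invalidate the iteration over users.
llvm::SmallVector<llvm::User *> Users(Input->users());
for (llvm::User *U : Users)
  if (auto *I = llvm::dyn_cast<llvm::Instruction>(U))
    if (I->getFunction() == OutlinedFn)
      I->replaceUsesOfWith(Input, ReplacementArg);
```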
Static analysis flagged the passing of Dependencies to emitTargetCall
as a place where we could use std::move to avoid copying. A closer look
indicated we could instead turn the parameter into a const & and drop
the default value, since the default was only used in two lines in a
test and changing those two locations was easy.
The device and implementation sets should trigger elision of tokens if
they do not match statically in a begin/end declare variant. This simply
extends the existing logic for the device set to also cover the
implementation set.
Reported by @kkwli.
Static analysis flagged `MapnamesName` because we could move it into
`createOffloadMapnames`. I just replaced the local with a direct call to
`createPlatformSpecificName` at the function argument location.
This patch adds support for LLVM IR atomicrmw `fmaximum` and `fminimum`
instructions.
These mirror the `llvm.maximum.*` and `llvm.minimum.*` instructions, but
are atomic and use IEEE754 2019 handling for NaNs, which is different to
`fmax` and `fmin`. See:
https://llvm.org/docs/LangRef.html#llvm-minimum-intrinsic
for more details.
Future changes will allow this LLVM IR to be lowered to specialised
assembler instructions on suitable targets, such as AArch64.
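As a sketch, emitting one of the new operations through `IRBuilder` (the enum
spellings follow this change; `Builder`, `Ptr`, and `Val` are assumed to be in
scope):
```cpp
// %old = atomicrmw fmaximum ptr %p, float %v seq_cst
llvm::Value *Old = Builder.CreateAtomicRMW(
    llvm::AtomicRMWInst::FMaximum, Ptr, Val, llvm::MaybeAlign(),
    llvm::AtomicOrdering::SequentiallyConsistent);
```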
When we get a function back from `CodeExtractor`, we discard its entry
block after copying its instructions into the entry block we prepared.
While copying the instructions, the terminator is discarded for obvious
reasons. But if there were debug values attached to the terminator,
those are useful and need to be copied.
This is quite ugly, but it is the best I could think of. The old
FiniCBWrapper was way too brittle, depending on the exact block
structure inside the section, and could be confused by any control flow
in the section (e.g. an if clause on cancel). The wording in the comment
and the variable names didn't seem to match where it was actually
branching to, either.
Clang's (non-OpenMPIRBuilder) lowering for cancel inside of sections
branches to a block containing __kmpc_for_static_fini.
This was hard to achieve here because sometimes the FiniCBWrapper has to
run before the worksharing loop finalization has been created.
To get around this ordering issue I created a dummy branch to a dummy
block, which is then fixed later once all of the information is
available.
SPIR-V has strict address space rules; constant globals cannot be in the
default address space.
The OMPIRBuilder change was required for lit tests to pass: we were
missing an addrspacecast.
---------
Signed-off-by: Sarnie, Nick <nick.sarnie@intel.com>
When creating the name of an outlined region, Clang uses the file's
inode ID to generate a unique name. When the file does not exist, this
causes a fatal abort of the compiler. This PR switches to using a hash
value instead.
---------
Co-authored-by: Michael Kruse <github@meinersbur.de>
A small clean-up following up on #131795. It seems we had 2 quite
similar implementations of the same thing: emitting the task
dependencies struct and filling it in. This PR unifies the 2 versions
into one. This is better since we had to fix a bug in one of them in
#131795, so this applies the fix to both.
Fixes #121289
Given the following MLIR, where a variable `x` is `private` for the
`omp.parallel` op and a `depend` operand for the `omp.task` op:
```mlir
omp.private {type = private} @_QFEx_private_i32 : i32
llvm.func @nested_task_with_deps() {
%0 = llvm.mlir.constant(1 : i64) : i64
%1 = llvm.alloca %0 x i32 {bindc_name = "x"} : (i64) -> !llvm.ptr
omp.parallel private(@_QFEx_private_i32 %1 -> %arg0 : !llvm.ptr) {
omp.task depend(taskdependout -> %arg0 : !llvm.ptr) {
omp.terminator
}
omp.terminator
}
llvm.return
}
```
Before the fix proposed by this PR, the IR builder would emit the
allocation and the initialization logic for the task's dependency info
struct in the parent function's alloc region:
```llvm
define void @nested_task_with_deps() {
....
%.dep.arr.addr = alloca [1 x %struct.kmp_dep_info], align 8
%2 = getelementptr inbounds [1 x %struct.kmp_dep_info], ptr %.dep.arr.addr, i64 0, i64 0
%3 = getelementptr inbounds nuw %struct.kmp_dep_info, ptr %2, i32 0, i32 0
%4 = ptrtoint ptr %omp.private.alloc to i64
store i64 %4, ptr %3, align 4
....
br label %entry
omp.par.entry: ; preds = %entry
....
%omp.private.alloc = alloca i32, align 4
```
Note the following:
- The private value `x` is allocated where it should be, in the parallel
op's entry region.
- However, since the private value is also a dependency of the task, and
since allocation and initialization of the task dependency info struct
both happen in the alloc region,
- this results in the private value being referenced before it is
actually defined.
This PR fixes the issue by only allocating the task dependency info in
the alloc region, while initialization happens at the current IP of the
function with the rest of the logic that depends on it, as sketched
below.
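A minimal sketch of the split (names are illustrative): only the `alloca` goes
to the alloca insertion point, while the initialization stays at the current
insertion point, where `%omp.private.alloc` is already defined:
```cpp
// Allocation only, at the alloc region / alloca insertion point.
Builder.restoreIP(AllocaIP);
llvm::AllocaInst *DepArray =
    Builder.CreateAlloca(DepArrayTy, /*ArraySize=*/nullptr, ".dep.arr.addr");

// Initialization at the current codegen IP, after privatized values exist.
Builder.restoreIP(CodeGenIP);
llvm::Value *Base =
    Builder.CreateConstInBoundsGEP2_64(DepArrayTy, DepArray, 0, 0);
// ... store each dependency's base address, length, and flags here ...
```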