llvm-project

Author	SHA1	Message	Date
Valentin Clement (バレンタインクレメン)	1d05d693a1	[flang][cuda] Fix offset with multiple assumed size shared array (#154844 ) When multiple assumed-size variables are used in a kernel with dynamic shared memory, each variable uses the 0 offset. Update the pass to account for that. ``` attributes(global) subroutine testany( a ) real(4), shared :: smasks() real(8), shared :: dmasks() end subroutine ```	2025-08-21 21:51:43 +00:00
Chaitanya	4a3bf27c69	[OpenMP] Introduce omp.target_allocmem and omp.target_freemem omp dialect ops. (#145464 ) This PR introduces two new ops in omp dialect, omp.target_allocmem and omp.target_freemem. omp.target_allocmem: Allocates heap memory on device. Will be lowered to omp_target_alloc call in llvm. omp.target_freemem: Deallocates heap memory on device. Will be lowered to omp_target_free call in llvm. Example: %1 = omp.target_allocmem %device : i32, i64 omp.target_freemem %device, %1 : i32, i64 The work in this PR is C-P/inspired from @ivanradanov commit from coexecute implementation: [Add fir omp target alloc and free ops](`be860ac8ba`) [Lower omp_target_{alloc,free} to llvm](`6e2d584dc9`)	2025-08-18 18:15:11 +05:30
Terapines MLIR	c164e6309b	[flang][fir] Add conversion of `fir.iterate_while` to `scf.while`. (#152439 ) This commit is a supplement for https://github.com/llvm/llvm-project/pull/140374. RFC: https://discourse.llvm.org/t/rfc-add-fir-affine-optimization-fir-pass-pipeline/86190/6	2025-08-14 13:39:55 +08:00
Slava Zakharin	b8e4232bd2	[flang] Cast fir.select[_rank] selector to i64. (#153239 ) Properly cast the selector to `i64` regardless of its integer type. We used to generate llvm.trunc always. We have to use `i64` as long as the case values may exceed INT_MAX. Fixes #153050.	2025-08-12 16:43:44 -07:00
Terapines MLIR	8e9ca057eb	[flang][fir] Add conversion of `fir.if` to `scf.if`. (#149959 ) This commit is a supplement for https://github.com/llvm/llvm-project/pull/140374. RFC: https://discourse.llvm.org/t/rfc-add-fir-affine-optimization-fir-pass-pipeline/86190/6	2025-07-25 10:03:50 +08:00
Kelvin Li	df56b1a2cf	[flang] handle allocation of zero-sized objects (#149165 ) This PR handles the allocation of zero-sized objects for different implementations. One byte is allocated for the zero-sized objects.	2025-07-17 23:52:48 -04:00
Valentin Clement (バレンタインクレメン)	1e4e2b332d	[flang][cuda] Import type descriptor in the gpu module when needed (#149157 )	2025-07-16 14:12:27 -07:00
Valentin Clement (バレンタインクレメン)	2c6771889a	[flang][cuda] Introduce cuf.set_allocator_idx operation (#148717 )	2025-07-14 17:23:18 -07:00
Slava Zakharin	4775b96898	[flang] Optimize redundant array repacking. (#147881 ) This patch allows optimizing redundant array repacking, when the source array is statically known to be contiguous. This is part of the implementation plan for the array repacking feature, though, it does not affect any real life use case as long as FIR inlining is not a thing. I experimented with simple cases of FIR inlining using `-inline-all`, and I recorded these cases in optimize-array-repacking.fir tests.	2025-07-14 09:41:42 -07:00
Slava Zakharin	fc99ef7411	[flang] Allow embox's source_box to be a !fir.box. (#148305 ) In order to create temporary copies of assumed-type arrays (e.g. for `-frepack-arrays`), we have to allow the source_box to be a !fir.box. This patch replaces #147618.	2025-07-14 09:40:42 -07:00
Christian Ulmann	374d5da214	[MLIR][Interfaces] Remove negative branch weight verifier (#148234 ) This commit removes the verifier that checked if branch weights are negative. This check was too strict because weights are interpreted as unsigned integers. This showed up when running the verifier on LLVM dialect modules that were imported from LLVM IR.	2025-07-14 07:34:29 +02:00
Kareem Ergawy	eba35cc1c0	[flang][do concurrent] Re-model `reduce` to match how reductions are modelled in OpenMP and OpenACC (#145837 ) This PR proposes re-modelling `reduce` specifiers to match OpenMP and OpenACC. In particular, this PR includes the following: * A new `fir` op: `fir.declare_reduction` which is identical to OpenMP's `omp.declare_reduction` op. * Updating the `reduce` clause on `fir.do_concurrent.loop` to use the new op. * Re-uses the `ReductionProcessor` component to emit reductions for `do concurrent` just like we do for OpenMP. To do this, the `ReductionProcessor` had to be refactored to be more generalized. * Updates mapping `do concurrent` to `fir.loop ... unordered` nests using the new reduction model. Unfortunately, this is a big PR that would be difficult to divide up in smaller parts because the bottom of the changes are the `fir` table-gen changes to `do concurrent`. However, doing these MLIR changes cascades to the other parts that have to be modified to not break things. This PR goes in the same direction we went for `private/local` specifiers. Now the `do concurrent` and OpenMP (and OpenACC) dialects are modelled in essentially the same way which makes mapping between them more trivial, hopefully. PR stack: - https://github.com/llvm/llvm-project/pull/145837 (this one) - https://github.com/llvm/llvm-project/pull/146025 - https://github.com/llvm/llvm-project/pull/146028 - https://github.com/llvm/llvm-project/pull/146033	2025-07-11 06:39:30 +02:00
Razvan Lupusoru	4859b92b7f	[flang][acc] Update FIR ref, heap, and pointer to be MappableType (#147834 ) The MappableType OpenACC type interface is a richer interface that allows the OpenACC dialect to interact better with a source dialect, FIR in this case. fir.box and fir.class types already implemented this interface. Now the same is being done with the other FIR types that represent variables. One additional notable change is that fir.array no longer implements this interface. This is because MappableType is primarily intended for variables - and FIR variables of this type have storage associated and thus there's a pointer-like type (fir.ref/heap/pointer) that holds the array type. The end goal of promoting these FIR types to MappableType is that we will soon implement the ability to generate recipes outside of the frontend via this interface.	2025-07-10 15:23:57 -07:00
Tom Eccles	ed17bf1e4c	[flang] Fix tests broken by #146734 (#147055 ) These tests referred to privatizers which were never declared	2025-07-04 14:50:29 +01:00
Razvan Lupusoru	f16983f7d0	[flang][acc] Ensure fir.class is handled in type categorization (#146174 ) fir.class is treated similarly as fir.box - but it has one key distinction which is that it doesn't hold an element type. Thus the categorization logic was mishandling this case for this reason (and also the fact that it assumed that a base object is always a fir.ref). This PR improves this handling and adds appropriate test exercising both a class and a class field to ensure categorization works.	2025-06-30 15:04:14 -07:00
Valentin Clement (バレンタインクレメン)	f4cecfe1bb	[flang][cuda] Bring PARAMETER arrays into the GPU module (#146416 )	2025-06-30 14:24:44 -07:00
jeanPerier	22ee837ec0	[flang][NFC] do not copy fields in fir::RecordType::getTypeList (#145530 ) For historical reason, `fir::RecordType::getTypeList` was returning an std::vector, causing the entire field list to be copied when called. It is called a lot indirectly in all type helpers, which themselves are called a lot in derived type heavy code like WRF. The `fir::hasDynamicType` helper is also called a lot, and it can just check for length parameters to avoid looping on all derived type components in most cases.	2025-06-25 11:51:07 +02:00
Lei Huang	d715ecba79	Revert "[flang][fir] Add fir.if -> scf.if and add filecheck test … (#142965 )" (#145345 ) This reverts commit 823750d873dff1d03865900042fc9b58e0f7f9c3. Test causes segfault on aix flang builder.	2025-06-23 16:46:47 -04:00
Slava Zakharin	70343c8d44	[mlir][flang] Added Weighted[Region]BranchOpInterface's. (#142079 ) The new interfaces provide getters and setters for the weight information about the branches of BranchOpInterface and RegionBranchOpInterface operations. These interfaces are done the same way as LLVM dialect's BranchWeightOpInterface. The plan is to produce this information in Flang, e.g. mark most probably "cold" code as such and allow LLVM to order basic blocks accordingly. An example of such a code is copy loops generated for arrays repacking - we can mark it as "cold" assuming that the copy will not happen dynamically. If the copy actually happens the overhead of the copy is probably high enough so that we may not care about the little overhead of jumping to the "cold" code and fetching it.	2025-06-17 16:14:13 -07:00
Kareem Ergawy	282e471018	[flang] Erase `fir.local` ops before lowering `fir` to `llvm` (#143687 ) `fir.local` ops are not supposed to have any uses at this point (i.e. during lowering to LLVM). In case of serialization, the `fir.do_concurrent` users are expected to have been lowered to `fir.do_loop` nests. In case of parallelization, the `fir.do_concurrent` users are expected to have been lowered to the target parallel model (e.g. OpenMP). This hopefully resolved a build issue introduced by https://github.com/llvm/llvm-project/pull/142567 (see for example: https://lab.llvm.org/buildbot/#/builders/199/builds/4009).	2025-06-12 05:58:55 +02:00
Jameson Nash	082251bba4	[AArch64] fix trampoline implementation: use X15 (#126743 ) AAPCS64 reserves any of X9-X15 for a compiler to choose to use for this purpose, and says not to use X16 or X18 like GCC (and the previous implementation) chose to use. The X18 register may need to get used by the kernel in some circumstances, as specified by the platform ABI, so it is generally an unwise choice. Simply choosing a different register fixes the problem of this being broken on any platform that actually follows the platform ABI (which is all of them except EABI, if I am reading this linux kernel bug correctly https://lkml2.uits.iu.edu/hypermail/linux/kernel/2001.2/01502.html). As a side benefit, also generate slightly better code and avoids needing the compiler-rt to be present. I did that by following the XCore implementation instead of PPC (although in hindsight, following the RISCV might have been slightly more readable). That X18 is wrong to use for this purpose has been known for many years (e.g. https://www.mail-archive.com/gcc@gcc.gnu.org/msg76934.html) and also known that fixing this to use one of the correct registers is not an ABI break, since this only appears inside of a translation unit. Some of the other temporary registers (e.g. X9) are already reserved inside llvm for internal use as a generic temporary register in the prologue before saving registers, while X15 was already used in rare cases as a scratch register in the prologue as well, so I felt that seemed the most logical choice to choose here.	2025-06-11 21:49:01 -04:00
Pranav Bhandarkar	f993f362ef	[Flang][OpenMP] - When mapping a `fir.boxchar`, map the underlying data pointer as a member (#141715 ) This PR adds functionality to the `MapInfoFinalization` pass wherein the underlying data pointer of a `fir.boxchar` is mapped as a member of the parent boxchar.	2025-06-10 13:09:32 -05:00
Dominik Adamski	007d29e30c	[Flang] Turn on alias analysis for locally allocated objects (#143489 ) Previously, a bug in the MemCpyOpt LLVM IR pass caused issues with adding alias tags for locally allocated objects for Fortran code. However, the bug has now been fixed (https://github.com/llvm/llvm-project/pull/129537 ), and we can safely enable alias tags for these objects. This change should improve the accuracy of the alias analysis. More accurate alias analysis assumes that Cray pointers do not alias with other variables. This assumption is common among other compilers. If the code violates this assumption, it can lead to incorrect results (see: https://github.com/llvm/llvm-project/issues/141928)	2025-06-10 16:46:13 +02:00
Q	823750d873	[flang][fir] Add fir.if -> scf.if and add filecheck test file (#142965 ) This commit is a supplement for https://github.com/llvm/llvm-project/pull/140374. RFC: https://discourse.llvm.org/t/rfc-add-fir-affine-optimization-fir-pass-pipeline/86190/6 --------- Co-authored-by: ZhiQiang Fan <zhiqiang.fan@terapines.com>	2025-06-10 15:43:24 +08:00
Pranav Bhandarkar	8395912895	[Flang] - Handle `BoxCharType` in `fir.box_offset` op (#141713 ) To map `fir.boxchar` types reliably onto an offload target, such as a GPU, the `omp.map.info` operation is used to map the underlying data pointer (`fir.ref<fir.char<k, ?>>`) wrapped by the `fir.boxchar` MLIR value. The `omp.map.info` operation needs a pointer to the underlying data pointer. Given a reference to a descriptor (`fir.box`), the `fir.box_offset` is used to obtain the address of the underlying data pointer. This PR extends `fir.box_offset` to provide the same functionality for `fir.boxchar` as well.	2025-06-06 10:48:07 -05:00
Tom Eccles	d16ecad968	[flang] Disable noalias by default (#142128 ) With these enabled we see a 70% performance regression for exchange2_r on neoverse-v1 (aws graviton 3) using `-mcpu=native -Ofast -flto`. There is also a smaller regression on neoverse-v2. This appears to be because function specialization is no longer kicking in during LTO for digits_2. This can be seen in the output executable: previously it contained specialized copies of the function with names like `_QMbrute_forcePdigits_2.specialized.4`. Now there are no names like this. The bug is not in flang - instead in the function specialization pass - but due to the size of the regression I would like to request that this is disabled until function specialization has been fixed.	2025-05-30 17:35:41 +01:00
Slava Zakharin	a0d699a8e6	Reland "[flang] Added noalias attribute to function arguments. (#140803 )" This helps to disambiguate accesses in the caller and the callee after LLVM inlining in some apps. I did not see any performance changes, but this is one step towards enabling other optimizations in the apps that I am looking at. The definition of llvm.noalias says: ``` ... indicates that memory locations accessed via pointer values based on the argument or return value are not also accessed, during the execution of the function, via pointer values not based on the argument or return value. This guarantee only holds for memory locations that are modified, by any means, during the execution of the function. ``` I believe this exactly matches Fortran rules for the dummy arguments that are modified during their subprogram execution. I also set llvm.noalias and llvm.nocapture on the !fir.box<> arguments, because the corresponding descriptors cannot be captured and cannot alias anything (not based on them) during the execution of the subprogram.	2025-05-29 13:42:57 -07:00
Slava Zakharin	6ee2453360	Revert "[flang] Added noalias attribute to function arguments." (#141884 ) Reverts llvm/llvm-project#140803 Buildbot failure: https://lab.llvm.org/buildbot/#/builders/143/builds/8041	2025-05-28 18:06:11 -07:00
Slava Zakharin	2426ac6865	[flang] Added noalias attribute to function arguments. (#140803 ) This helps to disambiguate accesses in the caller and the callee after LLVM inlining in some apps. I did not see any performance changes, but this is one step towards enabling other optimizations in the apps that I am looking at. The definition of llvm.noalias says: ``` ... indicates that memory locations accessed via pointer values based on the argument or return value are not also accessed, during the execution of the function, via pointer values not based on the argument or return value. This guarantee only holds for memory locations that are modified, by any means, during the execution of the function. ``` I believe this exactly matches Fortran rules for the dummy arguments that are modified during their subprogram execution. I also set llvm.noalias and llvm.nocapture on the !fir.box<> arguments, because the corresponding descriptors cannot be captured and cannot alias anything (not based on them) during the execution of the subprogram.	2025-05-28 17:18:04 -07:00
MingYan	953302eb98	[flang][fir] Add FIR structured control flow ops to SCF dialect pass. (#140374 ) This patch only supports the conversion from `fir.do_loop` to `scf.for`. This pass is still experimental, and future work will focus on gradually improving this conversion pass. Co-authored-by: yanming <ming.yan@terapines.com>	2025-05-25 14:28:47 +08:00
Valentin Clement (バレンタインクレメン)	6811a3bedf	[flang][cuda] Allocate extra descriptor in managed memory when it is coming from device (#140818 )	2025-05-20 18:55:13 -07:00
jeanPerier	ed07412888	[flang] translate derived type array init to attribute if possible (#140268 ) This patch relies on #140235 and #139724 to speed-up compilations of files with derived type array global with initial value. Currently, such derived type global init was lowered to an llvm.mlir.insertvalue chain in the LLVM IR dialect because there was no way to represent such value via attributes. This chain was later folded in LLVM dialect to LLVM IR using LLVM IR (not dialect) folding. This insert chain generation and folding is very expensive for big arrays. For instance, this patch brings down the compilation of FM_lib fmsave.f95 from 50s to 0.5s.	2025-05-20 16:11:27 +02:00
Valentin Clement (バレンタインクレメン)	f5609aa1b0	[flang][cuda] Use a reference for asyncObject (#140614 ) Switch from `int64_t` to `int64_t*` to fit with the rest of the implementation. New tentative with some fix. The previous was reverted some time ago. Reviewed in #138010	2025-05-19 15:02:53 -07:00
jeanPerier	416b7dfaa0	[flang] use DataLayout instead of GEP to compute element size (#140235 ) Now that the datalayout is part of codegen, use that to generate type size constants in codegen instead of generating GEP.	2025-05-19 13:59:09 +02:00
Dominik Adamski	eb4fde9a4e	Revert "[Flang] Turn on alias analysis for locally allocated objects" (#140202 ) Reverts llvm/llvm-project#139682 (commit: cf16c97bfa1416672d8990862369e86f360aa11e ) due to reported regression in Fujitsu Fortran test suite: https://ci.linaro.org/job/tcwg_flang_test--main-aarch64-Ofast-sve_vla-build/2081/artifact/artifacts/notify/mail-body.txt/view/	2025-05-16 09:44:33 +02:00
Sergio Afonso	30b0946326	[Flang][MLIR][OpenMP] Improve use_device_* handling (#137198 ) This patch updates MLIR op verifiers for operations taking arguments that must always be defined by an `omp.map.info` operation to check this requirement. It also modifies Flang lowering for `use_device_{addr, ptr}`, as well as the custom MLIR printer and parser for these clauses, to support initializing it to `OMP_MAP_RETURN_PARAM` and represent this in the MLIR representation as `return_param`. This internal mapping flag is what eventually is used for variables passed via these clauses into the target region when translating to LLVM IR, so making it explicit in Flang and MLIR removes an inconsistency in the current representation.	2025-05-15 12:28:06 +01:00
Asher Mancinelli	f486cc4417	[flang] Add loop annotation attributes to the loop backedge (#126082 ) Flang currently adds loop metadata to a conditional branch in the loop preheader, while clang adds it to the loop latch's branch instruction. Langref says: > Currently, loop metadata is implemented as metadata attached to the branch instruction in the loop latch block. > > https://llvm.org/docs/LangRef.html#llvm-loop I misread langref a couple times, but I think this is the appropriate branch op for the LoopAnnotationAttr. In a couple examples I found that the metadata was lost entirely during canonicalization. This patch makes the codegen look more like clang's and the annotations persist through codegen. * current clang: https://godbolt.org/z/8WhbcrnG3 * current flang: https://godbolt.org/z/TrPboqqcn	2025-05-14 07:07:57 -07:00
Dominik Adamski	cf16c97bfa	[Flang] Turn on alias analysis for locally allocated objects (#139682 ) Previously, a bug in the MemCpyOpt LLVM IR pass caused issues with adding alias tags for locally allocated objects for Fortran code. However, the bug has now been fixed ( https://github.com/llvm/llvm-project/pull/129537 ), and we can safely enable alias tags for these objects. This change should improve the accuracy of the alias analysis.	2025-05-14 09:21:18 +02:00
Asher Mancinelli	bbb7f01481	[flang] Fix volatile attribute propagation on allocatables (#139183 ) Ensure volatility is reflected not just on the reference to an allocatable, but on the box, too. When we declare a volatile allocatable, we now get a volatile reference to a volatile box. Some related cleanups: * SELECT TYPE constructs check the selector's type for volatility when creating and designating the type used in the selecting block. * Refine the verifier for fir.convert. In general, I think it is ok to implicitly drop volatility in any ptr-to-int conversion because it means we are in codegen (and representing volatility on the LLVM ops and intrinsics) or we are calling an external function (are there any cases I'm not thinking of?) * An allocatable test that was XFAILed is now passing. Making allocatables' boxes volatile resulted in accesses of those boxes being volatile, which resolved some errors coming from the strict verifier. * I noticed a runtime function was missing the fir.runtime attribute.	2025-05-13 08:13:47 -07:00
Slava Zakharin	2d12d31f44	[flang] Propagate contiguous attribute through HLFIR. (#138797 ) This change allows marking more designators producing an opaque box with 'contiguous' attribute, e.g. like in test1 case in flang/test/HLFIR/propagate-contiguous-attribute.fir. This would make isSimplyContiguous() return true for such designators allowing merging hlfir.eval_in_mem with hlfir.assign where the LHS is a contiguous array section. Depends on #139003	2025-05-12 18:33:47 -07:00
MingYan	db2d5762eb	[flang][fir] Support promoting `fir.do_loop` with results to `affine.for`. (#137790 ) Co-authored-by: yanming <ming.yan@terapines.com>	2025-05-09 10:55:21 +08:00
Kareem Ergawy	227e1ff73b	[flang][fir] Add locality specifiers modeling to `fir.do_concurrent.loop` (#138506 )	2025-05-08 21:42:52 +02:00
Kareem Ergawy	a83bb35e99	[flang][fir] Add `fir.local` op for locality specifiers (#138505 ) Adds a new `fir.local` op to model `local` and `local_init` locality specifiers. This op is a clone of `omp.private`. In particular, this new op also models the privatization/localization logic of an SSA value in the `fir` dialect just like `omp.private` does for OpenMP. PR stack: - https://github.com/llvm/llvm-project/pull/137928 - https://github.com/llvm/llvm-project/pull/138505 (this PR) - https://github.com/llvm/llvm-project/pull/138506 - https://github.com/llvm/llvm-project/pull/138512 - https://github.com/llvm/llvm-project/pull/138534 - https://github.com/llvm/llvm-project/pull/138816	2025-05-07 14:00:06 +02:00
Asher Mancinelli	7220fdad0c	[flang] Hide strict volatility checks behind flag (#138183 ) Enabling volatility lowering by default revealed some issues in lowering and op verification. For example, given volatile variable of a nested type, accessing structure members of a structure member would result in a volatility mismatch when the inner structure member is designated (and thus a verification error at compile time). In other cases, I found correct codegen when the checks were disabled, also related to allocatable types and how we handle volatile references of boxes. This hides the strict verification of fir and hlfir ops behind a flag so I can iteratively improve lowering of volatile variables without causing compile-time failures, keeping the strict verification on when running tests.	2025-05-02 09:03:20 -07:00
Valentin Clement (バレンタインクレメン)	9b6b144438	Revert "[flang][cuda] Use a reference for asyncObject" (#138221 ) Reverts llvm/llvm-project#138186	2025-05-01 17:41:44 -07:00
Valentin Clement (バレンタインクレメン)	7f922f1400	[flang][cuda] Use a reference for asyncObject (#138186 ) Switch from `int64_t` to `int64_t*` to fit with the rest of the implementation. New tentative with some fix. The previous was reverted yesterday.	2025-05-01 17:04:12 -07:00
Valentin Clement (バレンタインクレメン)	01a18809ee	Revert "[flang][cuda] Use a reference for asyncObject (#138010 )" (#138082 ) This reverts commit 9b0eaf71e674a28ee55be3afa11b5f7d4da732c0.	2025-04-30 22:03:26 -07:00
Valentin Clement (バレンタインクレメン)	9b0eaf71e6	[flang][cuda] Use a reference for asyncObject (#138010 ) Switch from `int64_t` to `int64_t*` to fit with the rest of the implementation.	2025-04-30 14:02:29 -07:00
Asher Mancinelli	8836bce842	[flang] Add lowering of volatile references (#132486 ) [RFC on discourse](https://discourse.llvm.org/t/rfc-volatile-representation-in-flang/85404/1) Flang currently lacks support for volatile variables. For some cases, the compiler produces TODO error messages and others are ignored. Some of our tests are like the example from _C.4 Clause 8 notes: The VOLATILE attribute (8.5.20)_ and require volatile variables. Prior commits: ``` c9ec1bc753b0 [flang] Handle volatility in lowering and codegen (#135311) e42f8609858f [flang][nfc] Support volatility in Fir ops (#134858) b2711e1526f9 [flang][nfc] Support volatile on ref, box, and class types (#134386) ```	2025-04-30 08:46:33 -07:00
Kaviya Rajendiran	857ac4c229	[MLIR][OpenMP] Lowering nontemporal clause to LLVM IR for SIMD directive (#118751 ) This patch, - Added a new attribute `nontemporal` to fir.load and fir.store operation in the FIR dialect. - Added a pass `lower-nontemporal` which is called before FIRToLLVM conversion pass and adds the nontemporal attribute to loads and stores on the list items specified in the nontemporal clause of the SIMD directive. - Set the `UnitAttr:$nontemporal` to llvm.load and llvm.store operations during FIR to LLVM dialect conversion, if the corresponding fir.load or fir.store operations have the nontemporal attribute. - Attached the `nontemporal metadata` to load and store instructions that have the nontemporal attribute, during LLVM dialect to LLVM IR translation.	2025-04-30 11:13:20 +05:30

719 Commits