llvm-project

Author	SHA1	Message	Date
Slava Zakharin	6f489fb5e5	Reapply "[flang] Lower EOSHIFT into hlfir.eoshift." (#153907 ) (#154241 ) This reverts commit 5178aeff7b96e86b066f8407b9d9732ec660dd2e. In addition: * Scalar constant UNSIGNED BOUNDARY is explicitly cast to the result type so that the generated hlfir.eoshift operation is valid. The lowering produces signless constants by default. It might be a bigger issue in lowering, so I just want to "fix" it for EOSHIFT in this patch. * Since we have to create an unsigned integer constant during HLFIR inlining, I added code in createIntegerConstant to make it possible.	2025-08-19 08:36:14 -07:00
Slava Zakharin	c79a88ee0a	[flang] Convert hlfir.designate with comp and contiguous result. (#154232 ) Array sections like this have not been using the knowledge that the result is contiguous: ``` type t integer :: f end type type(t) :: a(:) a%f = 0 ``` Peter Klausler is working on a change that will result in the corresponding hlfir.designate having a component and a non-box result. This patch fixes the issues found in HLFIR-to-FIR conversion.	2025-08-19 08:35:40 -07:00
Slava Zakharin	9f302ed0cf	[flang] Inline hlfir.eoshift during HLFIR intrinsics simplification. (#153108 ) This patch generalizes the code for hlfir.cshift to be applicable for hlfir.eoshift. The major difference is the selection of the boundary value that might be statically/dynamically absent, in which case the default scalar value has to be used. The scalar value of the boundary is always computed before the hlfir.elemental or the assignment loop. Contrary to hlfir.cshift simplification, the SHIFT value is not normalized, because the original value (and its sign) participates in the EOSHIFT index computation for addressing the input array and selecting which elements of the result are assigned from the boundary operand.	2025-08-15 15:22:06 -07:00
Slava Zakharin	4c6afc7993	[flang] Lower hlfir.eoshift to the runtime call. (#153107 ) Straightforward lowering of hlfir.eoshift to the runtime call in LowerHLFIRIntrinsics pass.	2025-08-15 13:54:49 -07:00
Slava Zakharin	95d4362521	[flang] Added hlfir.eoshift operation definition. (#153105 ) This is a basic definition of the operation corresponding to the Fortran's EOSHIFT transformational intrinsic.	2025-08-15 13:15:35 -07:00
Slava Zakharin	4775b96898	[flang] Optimize redundant array repacking. (#147881 ) This patch allows optimizing redundant array repacking, when the source array is statically known to be contiguous. This is part of the implementation plan for the array repacking feature, though it does not affect any real life use case as long as FIR inlining is not a thing. I experimented with simple cases of FIR inlining using `-inline-all`, and I recorded these cases in optimize-array-repacking.fir tests.	2025-07-14 09:41:42 -07:00
Kareem Ergawy	ab1c4905f4	[flang][do concurrent] Extend `getAllocaBlock()` and emit yields correctly (#146853 ) Handles some loose ends in `do concurrent` reduction declarations. This PR extends `getAllocaBlock` to handle declare ops, and also emits `fir.yield` in all regions.	2025-07-11 10:17:17 +02:00
Leandro Lupori	a63846b475	[flang] Fix array assignment regression introduced by #147371 (#147761 ) In some cases fixed shape arrays can be fir.heap/fir.ptr, even after hlfir::derefPointersAndAllocatables() is called.	2025-07-09 14:41:56 -03:00
Leandro Lupori	e976eaf303	[flang] Fix optimization of array assignments after #146408 (#147371 ) Host associated variables were not being handled properly. For array references, get the fixed shape extents from the value type instead, that works correctly in all cases.	2025-07-08 14:47:26 -03:00
Leandro Lupori	0ba59587fa	[flang] Optimize assignments of multidimensional arrays (#146408 ) Assignments of n-dimensional arrays, with trivial RHS, were always being converted to n nested loops. For contiguous arrays, it's possible to flatten them and use a single loop, that can usually be better optimized by LLVM. In a test program, using a 3-dimensional array and varying its size, the resulting speedup was as follows (measured on Graviton4): 16K: 1.09, 64K: 1.40, 128K: 1.90, 256K: 1.91, 512K: 1.00. For sizes above or equal to 512K no improvement was observed. It looks like LLVM stops trying to perform aggressive loop unrolling at a certain threshold and just uses nested loops instead. Larger sizes won't fit in L1 and L2 caches either. This was noticed while profiling 527.cam4_r. This optimization makes aer_rad_props_sw slightly faster, but unfortunately it practically doesn't change 527.cam4_r total execution time.	2025-07-04 08:49:51 -03:00
jeanPerier	aeaf319b8c	[flang] avoid useless rebox of polymorphic scalars (#145507 ) Do not create new descriptor for polymorphic scalars when lowering hlfir.declare. hlfir.declare of box/class is lowered to a fir.rebox to ensure that local lower bounds and descriptor attributes (Pointer/Allocatable/None) are properly set-up in the descriptor associated to the symbol. For polymorphic scalar, this created a useless temporary descriptor. This was breaking invalid code #145256 that violates OPTIONAL usage rules. I am not fixing it primarily to support this invalid code, but rather because it is dumb to create a useless fir.rebox.	2025-06-25 09:41:33 +02:00
Kareem Ergawy	59d6fbb8ff	[flang][fir] Provide allocation block for `fir.local` when required (#144521 ) Extends `fir::FirOpBuilder::getAllocaBlock()` to support `fir.local`. This allows us to retrieve an allocation block when needed for `fir.local`.	2025-06-18 10:24:08 +02:00
Simone Pellegrini	abbbe4a6cd	[mlir][vector] Fix attaching write effects on transfer_write's base (#142940 ) This fixes an issue with `TransferWriteOp`'s implementation of the `MemoryEffectOpInterface` where the write effect was attached to the stored value rather than the base. This had the effect that when asking for the memory effects for the input memref buffer using `getEffectsOnValue(...)`, the function would return no-effects (as the effect would have been attached to the stored value rather than the input buffer).	2025-06-11 12:37:34 +01:00
jeanPerier	59e4d0b34d	[flang][hlfir] ensure hlfir.declare result box attributes are consistent (#143137 ) Prevent hlfir.declare output to be fir.box/class values with the heap/pointer attribute to ensure the runtime descriptor attributes are in line with the Fortran attributes for the entities being declared (only fir.ref<box/class> can be ALLOCATABLE/POINTERS). This fixes a bug where an associated entity inside a SELECT TYPE was being unexpectedly reallocated inside assign runtime because the selector was allocatable and this attribute was not properly removed when creating the descriptor for the associated entity (that does not inherit the ALLOCATABLE/POINTER attribute according to Fortran 2023 section 11.1.3.3).	2025-06-10 14:41:14 +02:00
Slava Zakharin	ba8077c9dd	[flang] Use optimal shape for assign expansion as a loop. (#143050 ) During `hlfir.assign` inlining and `ElementalAssignBufferization` we can deduce the optimal shape from `lhs` and `rhs` shapes. This would probably be better done in a separate pass that propagates constant shapes, but I have not seen any benchmarks that would benefit from this yet. So consider this as a workaround for a bigger TODO issue. The `ElementalAssignBufferization` case is from 465.tonto, but I do not have performance results yet (I do not expect much).	2025-06-06 10:45:38 -07:00
Slava Zakharin	e16f603351	[flang] Relax conflicts detection in ElementalAssignBufferization. (#143045 ) If there is a read-effect operation inside `hlfir.elemental`, there is no reason to block moving it to the assignment point unless there are write-effect operations between the elemental and the assignment. The previous code was disallowing the optimization even if there were only read-effect operations in between. This case is from 465.tonto, though this change does not improve performance at all.	2025-06-06 10:45:26 -07:00
Kajetan Puchalski	0d40574e16	[flang] Inline hlfir.copy_in for trivial types (#138718 ) hlfir.copy_in implements copying non-contiguous array slices for functions that take in arrays required to be contiguous through flang-rt. For large arrays of trivial types, this can incur overhead compared to a plain, inlined copy loop. To address that, add a new InlineHLFIRCopyIn optimisation pass to inline hlfir.copy_in operations for trivial types. For the time being, the pattern is only applied in cases where the copy-in does not require a corresponding copy-out, such as when the function being called declares the array parameter as intent(in). Applying this optimisation reduces the runtime of thornado-mini's DeleptonizationProblem by about 10%. --------- Signed-off-by: Kajetan Puchalski <kajetan.puchalski@arm.com>	2025-06-06 15:10:17 +01:00
jeanPerier	6a41f53c39	[flang][hlfir] do not propagate polymorphic temporary as allocatables (#142609 ) Polymorphic temporaries are currently propagated as fir.ref<fir.class<fir.heap<>>> because their allocation may be delayed to the hlfir.assign copy (using realloc). This patch moves away from this and directly allocates the temp and propagates it as a fir.class. The creation of polymorphic temporaries is also simplified by avoiding the need to call the runtime to set up the descriptor altogether (the runtime is still called for the allocation currently because alloca/allocmem do not support polymorphism).	2025-06-06 09:53:41 +02:00
Asher Mancinelli	898df4b8ed	[flang] Skip opt-bufferization when memory effect does not have an associated value (#140781 ) Memory effects on the volatile memory resource may not be attached to a particular source, in which case the value of an effect will be null. This caused this test case to crash in the optimized bufferization pass's safety analysis because it assumes it can get the SSA value modified by the memory effect. This is because memory effects on the volatile resource indicate that the operation must not be reordered with respect to other volatile operations, but there is not a material ssa value that can be pointed to. This patch changes the safety checks such that memory effects which do not have associated values are not safe for optimized bufferization.	2025-05-22 06:50:25 -07:00
Slava Zakharin	54aa9282ed	[flang] Undo the effects of CSE for hlfir.exactly_once. (#140190 ) CSE may delete operations from hlfir.exactly_once and reuse the equivalent results from the parent region(s), e.g. from the parent hlfir.region_assign. This makes it problematic to clone hlfir.exactly_once before the top-level hlfir.where. This patch adds a "canonicalizer" that pulls in such operations back into hlfir.exactly_once.	2025-05-20 09:22:05 -07:00
Valentin Clement (バレンタインクレメン)	f5609aa1b0	[flang][cuda] Use a reference for asyncObject (#140614 ) Switch from `int64_t` to `int64_t*` to fit with the rest of the implementation. New attempt with some fixes. The previous one was reverted some time ago. Reviewed in #138010	2025-05-19 15:02:53 -07:00
Slava Zakharin	ee47aea435	[flang] Treat hlfir.associate as Allocate for FIR alias analysis. (#139004 ) Early HLFIR optimizations may experience problems with values produced by hlfir.associate. In most cases this is a unique local memory allocation, but it can also reuse some other hlfir.expr memory sometimes. It seems to be safe to assume unique allocation for trivial types, since we always allocate new memory for them.	2025-05-12 18:34:12 -07:00
Slava Zakharin	2d12d31f44	[flang] Propagate contiguous attribute through HLFIR. (#138797 ) This change allows marking more designators producing an opaque box with 'contiguous' attribute, e.g. like in test1 case in flang/test/HLFIR/propagate-contiguous-attribute.fir. This would make isSimplyContiguous() return true for such designators allowing merging hlfir.eval_in_mem with hlfir.assign where the LHS is a contiguous array section. Depends on #139003	2025-05-12 18:33:47 -07:00
Slava Zakharin	3aad7d7a3c	[flang] Fixed designator codegen for contiguous boxes. (#139003 ) Contiguous variables represented with a box do not have explicit shape, but it looks like the base/shape computation was assuming that. This caused generation of raw address fir.array_coor without the shape. This patch is needed to fix failures happening with #138797.	2025-05-12 18:33:29 -07:00
Asher Mancinelli	7220fdad0c	[flang] Hide strict volatility checks behind flag (#138183 ) Enabling volatility lowering by default revealed some issues in lowering and op verification. For example, given volatile variable of a nested type, accessing structure members of a structure member would result in a volatility mismatch when the inner structure member is designated (and thus a verification error at compile time). In other cases, I found correct codegen when the checks were disabled, also related to allocatable types and how we handle volatile references of boxes. This hides the strict verification of fir and hlfir ops behind a flag so I can iteratively improve lowering of volatile variables without causing compile-time failures, keeping the strict verification on when running tests.	2025-05-02 09:03:20 -07:00
Valentin Clement (バレンタインクレメン)	9b6b144438	Revert "[flang][cuda] Use a reference for asyncObject" (#138221 ) Reverts llvm/llvm-project#138186	2025-05-01 17:41:44 -07:00
Valentin Clement (バレンタインクレメン)	7f922f1400	[flang][cuda] Use a reference for asyncObject (#138186 ) Switch from `int64_t` to `int64_t*` to fit with the rest of the implementation. New attempt with some fixes. The previous one was reverted yesterday.	2025-05-01 17:04:12 -07:00
Valentin Clement (バレンタインクレメン)	01a18809ee	Revert "[flang][cuda] Use a reference for asyncObject (#138010 )" (#138082 ) This reverts commit 9b0eaf71e674a28ee55be3afa11b5f7d4da732c0.	2025-04-30 22:03:26 -07:00
Valentin Clement (バレンタインクレメン)	9b0eaf71e6	[flang][cuda] Use a reference for asyncObject (#138010 ) Switch from `int64_t` to `int64_t*` to fit with the rest of the implementation.	2025-04-30 14:02:29 -07:00
Slava Zakharin	7dad8b91bc	[flang] Fetch the initial reduction value from the input array. (#136790 ) Instead of using loop-carried IsFirst predicate, we can fetch the initial reduction values for MIN/MAX LOC/VAL reductions from the array itself. This results in slightly cleaner loop nests, especially those generated for total reductions. Otherwise, LLVM is able to peel the first iteration of the innermost loop, but the surroundings of the peeled code are executed multiple times within the outer loop(s). This patch does the manual peeling, which only works for non-masked reductions where the input array is not empty.	2025-04-30 13:53:26 -07:00
Asher Mancinelli	8836bce842	[flang] Add lowering of volatile references (#132486 ) [RFC on discourse](https://discourse.llvm.org/t/rfc-volatile-representation-in-flang/85404/1) Flang currently lacks support for volatile variables. For some cases, the compiler produces TODO error messages and others are ignored. Some of our tests are like the example from _C.4 Clause 8 notes: The VOLATILE attribute (8.5.20)_ and require volatile variables. Prior commits: ``` c9ec1bc753b0 [flang] Handle volatility in lowering and codegen (#135311) e42f8609858f [flang][nfc] Support volatility in Fir ops (#134858) b2711e1526f9 [flang][nfc] Support volatile on ref, box, and class types (#134386) ```	2025-04-30 08:46:33 -07:00
Slava Zakharin	4091f4dd96	Reland [flang] Generalized simplification of HLFIR reduction ops. (#136071 ) (#136246 ) This change generalizes SumAsElemental inlining in SimplifyHLFIRIntrinsics pass so that it can be applied to ALL, ANY, COUNT, MAXLOC, MAXVAL, MINLOC, MINVAL, SUM. This change makes the special handling of the reduction operations in OptimizedBufferization redundant: once HLFIR operations are inlined, the hlfir.elemental inlining should do the rest of the job.	2025-04-18 11:56:07 -07:00
Slava Zakharin	ce0c472791	Revert "Reland [flang] Generalized simplification of HLFIR reduction ops. (#136071 )" This reverts commit 32311a6b68d3de4642599abe14922c686bdb30fc.	2025-04-17 17:26:55 -07:00
Slava Zakharin	32311a6b68	Reland [flang] Generalized simplification of HLFIR reduction ops. (#136071 ) This change generalizes SumAsElemental inlining in SimplifyHLFIRIntrinsics pass so that it can be applied to ALL, ANY, COUNT, MAXLOC, MAXVAL, MINLOC, MINVAL, SUM. This change makes the special handling of the reduction operations in OptimizedBufferization redundant: once HLFIR operations are inlined, the hlfir.elemental inlining should do the rest of the job.	2025-04-17 16:19:47 -07:00
Slava Zakharin	f39242ceed	Revert "[flang] Generalized simplification of HLFIR reduction ops." (#136218 ) Reverts llvm/llvm-project#136071	2025-04-17 15:47:46 -07:00
Slava Zakharin	655b9db7b9	[flang] Generalized simplification of HLFIR reduction ops. (#136071 ) This change generalizes SumAsElemental inlining in SimplifyHLFIRIntrinsics pass so that it can be applied to ALL, ANY, COUNT, MAXLOC, MAXVAL, MINLOC, MINVAL, SUM. This change makes the special handling of the reduction operations in OptimizedBufferization redundant: once HLFIR operations are inlined, the hlfir.elemental inlining should do the rest of the job.	2025-04-17 15:42:48 -07:00
Miguel Saldivar	0f86e2395e	[flang] Avoid optimizing min and max if not valid type (#134972 ) In `makeMinMaxInitValGenerator` it explicitly checks for only `FloatType` and `IntegerType`, so we shouldn't match if we don't have either of those types. Fix for #134308	2025-04-15 10:14:58 +01:00
Asher Mancinelli	c9ec1bc753	[flang] Handle volatility in lowering and codegen (#135311 ) * Enable lowering and conversion patterns to pass volatility information from higher level operations to lower level ones. * Enable codegen to pass volatility to LLVM dialect ops by setting an attribute on loads, stores, and memory intrinsics. * Add utilities for passing along the volatility from an input type to an output type. To introduce volatile types into the IR, entities with the volatile attribute will be given a volatile type in the bridge; this is not enabled in this patch. User code should not result in IR with volatile types yet, so this patch contains no tests with Fortran source, only IR that already contains volatile types. Part 3 of #132486.	2025-04-14 11:02:23 -07:00
Valentin Clement (バレンタインクレメン)	f4d87c42a6	[flang][cuda] Add asyncId to allocate entry point (#134947 )	2025-04-09 10:52:02 -07:00
Asher Mancinelli	85fd83ed49	[flang][nfc] Use llvm memmove intrinsic over regular call (#134294 ) Follow up to #134170. We should be using the LLVM intrinsics instead of plain fir.calls when we can. Existing code creates a declaration for the llvm intrinsic and a regular fir.call, which makes it hard for consumers of the IR to find all the intrinsic calls.	2025-04-04 06:13:30 -07:00
Slava Zakharin	5f268d04f9	[flang] Code generation for fir.pack/unpack_array. (#132080 ) The code generation relies on `ShallowCopyDirect` runtime to copy data between the original and the temporary arrays (both directions). The allocations are done by the compiler generated code. The heap allocations could have been passed to `ShallowCopy` runtime, but I decided to expose the allocations so that the temporary descriptor passed to `ShallowCopyDirect` has `nocapture` - maybe this will be better for LLVM optimizations.	2025-03-31 11:42:17 -07:00
jeanPerier	44261dae5b	[flang][NFC] use hlfir.declare first result when both results are raw pointers (#132261 ) Currently, the helpers to get fir::ExtendedValue out of hlfir::Entity use hlfir.declare second result (`#1`) in most cases. This is because this result is the same as the input and matches what FIR was getting before lowering to HLFIR. But this creates odd situations when both hlfir.declare results are raw pointers and either result ends up being used in the IR depending on whether the code was generated by a helper using fir::ExtendedValue, or via "pure HLFIR" helpers using the first result. This will typically prevent simple CSE and easy identification that two operations (e.g. load/store) are touching the exact same memory location without using alias analysis or "manual detection" (looking for common hlfir.declare defining op). Hence, when hlfir.declare results are both raw pointers, use `#0` when producing `fir::ExtendedValue`. When `#0` is a fir.box, keep using `#1` because these are not the same. The only code change is in HLFIRTools.cpp and is pretty small, but there is a big test fallout of `#1` to `#0`.	2025-03-21 11:41:04 +01:00
jeanPerier	b8271ec8b3	[flang] accept character type in fir::changeTypeShape (#131892 ) There is no reason for character element type to be forbidden in this helper. The assert was firing in character pointer assignment in FORALL after #130772 added a usage of this helper.	2025-03-19 10:01:24 +01:00
jeanPerier	3ff3b29dd6	[flang] lower remaining cases of pointer assignments inside forall (#130772 ) Implement handling of `NULL()` RHS, polymorphic pointers, as well as lower bounds or bounds remapping in pointer assignment inside FORALL. These cases eventually do not require updating hlfir.region_assign, lowering can simply prepare the new descriptor for the LHS inside the RHS region. Looking more closely at the polymorphic cases, there is no need to call the runtime, fir.rebox and fir.embox do handle the dynamic type setting correctly. After this patch, the last remaining TODO is the allocatable assignment inside FORALL, which like some cases here, is more likely an accidental feature given FORALL was deprecated in F2003 at the same time that allocatable components were added.	2025-03-14 10:51:46 +01:00
jeanPerier	40e245a9aa	[flang] add support for procedure pointer assignment inside FORALL (#130114 ) Very similar to object pointer assignment, the difference is the SSA types of the LHS (!fir.ref<!fir.boxproc<()->()>>) and RHS (!fir.boxproc<()->()>). The RHS must be saved as a simple address, not a descriptor (it is not possible to make a CFI descriptor out of a procedure entity).	2025-03-07 10:28:02 +01:00
jeanPerier	7302e1b94e	[flang] implement simple pointer assignments inside FORALL (#129522 ) The semantics of pointer assignments inside FORALL require evaluating the targets (RHS) and pointer variables (LHS) of all iterations before evaluating the assignments. In practice, if the compiler can prove that the RHS and LHS evaluations are not impacted by the assignments, the evaluation of the FORALL assignment statement can be done in a single loop. However, if the compiler cannot prove this, it needs to "save" the addresses of the targets and/or the pointer descriptors of each iteration before doing the assignments. This patch implements the most common cases where there is no lower bound spec, no bounds remapping, the LHS is not polymorphic, and the RHS is not NULL. The HLFIR operation used to represent assignments inside FORALL can be used for pointer assignments too (the only difference being that the LHS is a descriptor address). The analysis for intrinsic assignment can be reused, with the distinction that the RHS data is not read during the assignment. The logic that is used to save LHS in intrinsic assignments inside FORALL is extracted to be used for the RHS of pointer assignments when needed (saving a descriptor value). Pointer assignment LHS are just descriptor addresses and are saved as int_ptr values.	2025-03-05 11:24:04 +01:00
jeanPerier	9a659fac2f	[flang] fix MAXVAL(x%array_comp_with_custom_lower_bounds) (#129684 ) The HLFIR inlining of MAXVAL kicks in at O1 and more when the argument is an array component reference but the implementation did not account for the rare cases where the array components have non default lower bounds. This patch fixes the issue by using `getElementAt` to compute the element address. Rename `indices` to `oneBasedIndices` for more clarity.	2025-03-04 17:52:05 +01:00
Slava Zakharin	a704e6587b	[flang] Added alternative inlining code for hlfir.cshift. (#129176 ) Flang generates slower code for `CSHIFT(CSHIFT(PTR(:,:,I),sh1,1),sh2,2)` pattern in facerec than other compilers. The first CSHIFT can be done as two memcpy's wrapped in a loop for the second dimension. This does require creating a temporary array, but it seems to be faster than the current hlfir.elemental inlining. I started with modifying the new index computation in hlfir.elemental inlining: the new arith.select approach does enable some vectorization in LLVM, but on x86 it is using gathers/scatters and does not give much speed-up. I also experimented with LoopBoundSplitPass and InductiveRangeCheckElimination for a simple (not chained) CSHIFT case, but I could not adjust them to split the loop with a condition on the value of the IV into two loops with disjoint iteration spaces. I thought if I could do it, I would be able to keep the hlfir.elemental inlining mostly untouched, and then adjust the hlfir.elemental inlining heuristics for the facerec case. Since I was not able to make these passes work for me, I added a special case inlining for CSHIFT(ARRAY,SH,DIM=1) via hlfir.eval_in_mem. If ARRAY is not statically known to have the contiguous leading dimension, there is a dynamic check for contiguity, which allows exposing it to LLVM and enabling the rewrite of the copy loops into memcpys. This approach is stepping on the toes of LoopVersioning, but it is helpful in facerec case. I measured ~6% speed-up on grace, and ~4% on zen4.	2025-03-03 09:58:20 -08:00
jeanPerier	a8db1fb9b5	[flang] update fir.coordinate_of to carry the fields (#127231 ) This patch updates fir.coordinate_of to carry the field index as attributes instead of relying on getting it from the fir.field_index operations defining its operands. The rationale is that FIR currently has a few operations that require DAGs to be preserved in order to be able to do code generation. This is the case of fir.coordinate_of, which requires its fir.field operand producer to be visible. This makes IR transformation harder/brittle, so I want to update FIR to get rid of this. Codegen/printer/parser of fir.coordinate_of and many tests need to be updated after this change.	2025-02-28 09:50:05 +01:00
Slava Zakharin	754d896ca7	[flang] Propagate fast-math flags to FIROpBuilder. (#126316 ) One constructor was missing to propagate fast-math flags from an operation to the builder. It is fixed now. And the builder creation in one opt-bufferization case should take the rewriter, I think.	2025-02-10 15:23:34 -08:00
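
Note on commit 9f302ed0cf: the message says that SHIFT is normalized for hlfir.cshift inlining but not for hlfir.eoshift, because its sign and magnitude decide which result elements come from the boundary. A minimal sketch of that difference for the rank-1 case follows. It is written in Python purely for illustration and is not the HLFIR code; the function names `cshift_1d` and `eoshift_1d`, and the default boundary of 0 for integer data, are assumptions of this sketch.

```python
def cshift_1d(array, shift):
    """Circular shift: the shift can be normalized modulo the extent."""
    n = len(array)
    if n == 0:
        return []
    normalized = shift % n  # Python's % yields a value in [0, n)
    return [array[(i + normalized) % n] for i in range(n)]


def eoshift_1d(array, shift, boundary=0):
    """End-off shift: the original shift (with its sign) is used as is."""
    n = len(array)
    result = []
    for i in range(n):
        src = i + shift  # source index in the input array
        if 0 <= src < n:
            result.append(array[src])
        else:
            # Source index fell off either end: take the boundary value.
            result.append(boundary)
    return result


if __name__ == "__main__":
    data = [1, 2, 3, 4, 5]
    print(cshift_1d(data, 2))       # circular: elements wrap around
    print(eoshift_1d(data, 2))      # end-off: vacated slots get boundary
    print(eoshift_1d(data, -1, 9))  # negative shift fills from the front
    print(eoshift_1d(data, 7))      # |shift| >= extent: all boundary
```

Normalizing the shift in `eoshift_1d` the way `cshift_1d` does would make a shift of 7 behave like a shift of 2, which is why the sketch keeps the raw value.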

304 Commits