llvm-project

Author	SHA1	Message	Date
Kazu Hirata	70c73d1b72	[mlir] Use std::nullopt instead of None in comments (NFC) This is part of an effort to migrate from llvm::Optional to std::optional: https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716	2022-12-04 17:23:50 -08:00
Kazu Hirata	1a36588ec6	[mlir] Use std::nullopt instead of None (NFC) This patch mechanically replaces None with std::nullopt where the compiler would warn if None were deprecated. The intent is to reduce the amount of manual work required in migrating from Optional to std::optional. This is part of an effort to migrate from llvm::Optional to std::optional: https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716	2022-12-03 18:50:27 -08:00
Christian Sigg	be065c41d8	[mlir] Change scf::LoopNest to store 'results'. This fixes the case where scf::LoopNest::loops is empty. Change LoopVector and ValueVector to SmallVector. Reviewed By: ftynse Differential Revision: https://reviews.llvm.org/D136926	2022-12-01 06:51:45 +01:00
Matthias Springer	0d9761d50e	[mlir][SCF] Add tensor.dim(scf.foreach_thread) folding Dim sizes of `scf.foreach_thread` op results match the dim sizes of their respective tied shared_outs operands. Differential Revision: https://reviews.llvm.org/D138484	2022-11-22 11:28:27 +01:00
Mohammed Anany	77533d79f7	[mlir][SCF] Adding custom builder to SCF::WhileOp. This is a similar builder to the one for SCF::IfOp which allows users to pass region builders to it. Refer to the builders for IfOp. Reviewed By: tpopp Differential Revision: https://reviews.llvm.org/D137709	2022-11-15 18:16:49 +01:00
Nicolas Vasilache	f0a411da77	[mlir][Transform]Significantly cleanup scf.foreach_thread and GPU transform permutation handling Previously, the need for a dense permutation leaked into the thread_dim_mapping specification. This revision allows to use a sparse specification of the thread_dim_mapping and the proper completion / sorting is applied automatically. In the process, the semantics of scf.foreach_thread is tightened to require a matching number of thread dimensions and mappings. The relevant negative test is added. Differential Revision: https://reviews.llvm.org/D137906	2022-11-14 09:19:49 -08:00
Alexander Belyaev	07665e78cb	[mlir] Fix forward the fix for incorrect Optional<ArrayAttr> usage.	2022-11-11 10:53:04 +01:00
Alexander Belyaev	b7162e136e	[mlir] Fix incorrect access to the Optional<ArrayAttr> underlying values.	2022-11-11 10:46:04 +01:00
Guray Ozen	6663f34704	[mlir] Introduce device mapper attribute for `thread_dim_map` and `mapped to dims` `scf.foreach_thread` defines mapping its loops to processors via an integer array, see an example below. A lowering can use this mapping. However, expressing mapping as an integer array is very confusing, especially when there are multiple levels of parallelism. In addition, the op does not verify the integer array. This change introduces device mapping attribute to make mapping descriptive and verifiable. Then it makes GPU transform dialect use it. ``` scf.foreach_thread (%i, %j) in (%c1, %c2) { scf.foreach_thread (%i2, %j2) in (%c1, %c2) {...} { thread_dim_mapping = [0, 1]} } { thread_dim_mapping = [0, 1]} ``` It first introduces a `DeviceMappingInterface` which is an attribute interface. `scf.foreach_thread` defines its mapping via this interface. A lowering must define its attributes and implement this interface as well. This way gives us a clear validation. The change also introduces two new attributes (`#gpu.thread<x/y/z>` and `#gpu.block<x,y,z>` ). After this change, the above code prints as below, as seen here, this way clarifies the loop mappings. The change also implements consuming of these two new attribute by the transform dialect. Transform dialect binds the outermost loops to the thread blocks and innermost loops to threads. ``` scf.foreach_thread (%i, %j) in (%c1, %c2) { scf.foreach_thread (%i2, %j2) in (%c1, %c2) {...} { thread_dim_mapping = [#gpu.thread<x>, #gpu.thread<y>]} } { thread_dim_mapping = [#gpu.block<x>, #gpu.block<y>]} ``` Reviewed By: ftynse, nicolasvasilache Differential Revision: https://reviews.llvm.org/D137413	2022-11-11 08:44:57 +01:00
Mehdi Amini	34233d4995	Apply clang-tidy fixes for readability-redundant-smartptr-get in SCF.cpp (NFC)	2022-11-09 00:41:30 +00:00
Mehdi Amini	f5865c8701	Apply clang-tidy fixes for llvm-qualified-auto in SCF.cpp (NFC)	2022-11-09 00:41:30 +00:00
Kazu Hirata	585e35a998	[mlir] Use llvm::is_contained (NFC)	2022-11-06 19:56:15 -08:00
Sanjoy Das	788390c130	Make scf.for and affine.for conditionally speculatable for (I = Start; I < End; I += 1) always terminates so mark {scf\|affine}.for as RecursivelySpeculatable when step is known to be 1. Reviewed By: chelini Differential Revision: https://reviews.llvm.org/D136376	2022-10-30 16:08:42 -07:00
Jeff Niu	07d8fe9391	[mlir][scf] Add an IndexSwitchOp The `scf.index_switch` is a control-flow operation that branches to one of the given regions based on the values of the argument and the cases. The argument is always of type `index`. Example: ```mlir %0 = scf.index_switch %arg0 -> i32 case 2 { %1 = arith.constant 10 : i32 scf.yield %1 : i32 } case 5 { %2 = arith.constant 20 : i32 scf.yield %2 : i32 } default { %3 = arith.constant 30 : i32 scf.yield %3 : i32 } ``` Reviewed By: jpienaar Differential Revision: https://reviews.llvm.org/D136003	2022-10-21 09:21:10 -07:00
Jeff Niu	6005a1d8af	[mlir][scf] Match any constants instead of arith.constant By matching `arith.constant` specifically, SCF canonicalizers/folders are incompatible with other kinds of constants. Use the generic matchers instead. Reviewed By: rriddle Differential Revision: https://reviews.llvm.org/D135517	2022-10-12 18:01:57 -07:00
Jakub Kuderski	abc362a107	[mlir][arith] Change dialect name from Arithmetic to Arith Suggested by @lattner in https://discourse.llvm.org/t/rfc-define-precise-arith-semantics/65507/22. Tested with: `ninja check-mlir check-mlir-integration check-mlir-mlir-spirv-cpu-runner check-mlir-mlir-vulkan-runner check-mlir-examples` and `bazel build --config=generic_clang @llvm-project//mlir:all`. Reviewed By: lattner, Mogball, rriddle, jpienaar, mehdi_amini Differential Revision: https://reviews.llvm.org/D134762	2022-09-29 11:23:28 -04:00
Peiming Liu	52887071ea	[mlir][scf] Support simple symbolic expression without depending on AffineDialect to simplify trivial loops. Remove dependence of AffineDialect Reviewed By: mehdi_amini Differential Revision: https://reviews.llvm.org/D134291	2022-09-20 18:13:05 +00:00
Peiming Liu	d518fc28b6	[mlir][scf] Support simple symbolic expression when simplify loops Reviewed By: aartbik, ThomasRaoux Differential Revision: https://reviews.llvm.org/D134204	2022-09-19 21:50:01 +00:00
Guray Ozen	233de4e808	[mlir] Add map_nested_foreach_thread_to_gpu_threads op to transform dialect This revision adds a new op `map_nested_foreach_thread_to_gpu_threads` to transform dialect. The op searches `scf.foreach_threads` inside the `gpu_launch` and distributes them with `gpu.thread_id` attribute. Loop mapping is explicit and given by the `map_nested_foreach_thread_to_gpu_threads` op. Mapping is done one-to-one, therefore the loops disappear. For the time being, trip counts that are dynamic or bigger than thread sizes are not supported. However, in the future the compiler can indeed generate a loop with static cyclic scheduling to support these cases. Current mechanism allows `scf.foreach_threads` to be siblings or nested. There cannot be interleaving code between the loops when they are nested. Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D133950	2022-09-19 16:27:30 +02:00
Matthias Springer	4cd7362083	[mlir][SCF] foreach_thread: Capture shared output tensors explicitly This change refines the semantics of scf.foreach_thread. Tensors that are inserted into in the terminator must now be passed to the region explicitly via `shared_outs`. Inside of the body of the op, those tensors are then accessed via block arguments. The body of a scf.foreach_thread is now treated as a repetitive region. I.e., op dominance can no longer be used in conflict detection when using a value that is defined outside of the body. Such uses may now be considered as conflicts (if there is at least one read and one write in the body), effectively privatizing the tensor. Shared outputs are not privatized when they are used via their corresponding block arguments. As part of this change, it was also necessary to update the "tiling to scf.foreach_thread", such that the generated tensor.extract_slice ops use the scf.foreach_thread's block arguments. This is implemented by cloning the TilingInterface op inside the scf.foreach_thread, rewriting all of its outputs with block arguments and then calling the tiling implementation. Afterwards, the cloned op is deleted again. Differential Revision: https://reviews.llvm.org/D133114	2022-09-02 14:54:04 +02:00
Jeff Niu	27e8ee208c	[mlir] Remove a not very useful `eraseArguments` overload This overload just wraps a bitvector, and in most cases a bitvector could be used directly instead of a list. Reviewed By: rriddle Differential Revision: https://reviews.llvm.org/D132896	2022-08-29 16:07:32 -07:00
Jeff Niu	58a47508f0	(Reland) [mlir] Switch segment size attributes to DenseI32ArrayAttr This reland includes changes to the Python bindings. Switch variadic operand and result segment size attributes to use the dense i32 array. Dense integer arrays were introduced primarily to represent index lists. They are a better fit for segment sizes than dense elements attrs. Depends on D131801 Reviewed By: rriddle Differential Revision: https://reviews.llvm.org/D131803	2022-08-12 19:44:52 -04:00
Alex Zinenko	e8e718fa4b	Revert "[mlir] Switch segment size attributes to DenseI32ArrayAttr" This reverts commit 30171e76f0e5ea8037bc4d1450dd3e12af4d9938. Breaks Python tests in MLIR, missing C API and Python changes.	2022-08-12 10:22:47 +02:00
Jeff Niu	30171e76f0	[mlir] Switch segment size attributes to DenseI32ArrayAttr Switch variadic operand and result segment size attributes to use the dense i32 array. Dense integer arrays were introduced primarily to represent index lists. They are a better fit for segment sizes than dense elements attrs. Depends on D131738 Reviewed By: mehdi_amini Differential Revision: https://reviews.llvm.org/D131702	2022-08-11 20:56:45 -04:00
Benjamin Kramer	9fa59e7643	[mlir] Use C++17 structured bindings instead of std::tie where applicable. NFCI	2022-08-09 13:34:17 +02:00
Kazu Hirata	c8e6ebd74e	Use value instead of getValue (NFC)	2022-08-06 11:21:39 -07:00
Kazu Hirata	9750648cb4	[mlir, flang] Use has_value instead of hasValue (NFC)	2022-08-06 11:12:47 -07:00
Nicolas Vasilache	7fbf55c927	[mlir][Tensor] Move ParallelInsertSlice to the tensor dialect This is mostly NFC and will allow tensor.parallel_insert_slice to gain rank-reducing semantics by reusing the vast majority of the tensor.insert_slice impl. Depends on D128857 Differential Revision: https://reviews.llvm.org/D128920	2022-07-04 01:53:12 -07:00
Nicolas Vasilache	b994d388ae	[mlir][SCF] Add a ParallelCombiningOpInterface to decouple scf::PerformConcurrently from its contained operations This allows purging references of scf.ForeachThreadOp and scf.PerformConcurrentlyOp from ParallelInsertSliceOp. This will allow moving the op closer to tensor::InsertSliceOp with which it should share much more code. In the future, the decoupling will also allow extending the type of ops that can be used in the parallel combinator as well as semantics related to multiple concurrent inserts to the same result. Differential Revision: https://reviews.llvm.org/D128857	2022-07-01 00:16:02 -07:00
Matthias Springer	04dac2ca7c	[mlir][SCF][bufferize][NFC] Implement resolveConflicts for ParallelInsertSliceOp This was previously implemented as part of the BufferizableOpInterface of ForEachThreadOp. Moving the implementation to ParallelInsertSliceOp to be consistent with the remaining ops and to have a nice example op that can serve as a blueprint for other ops. Differential Revision: https://reviews.llvm.org/D128666	2022-06-28 12:18:22 +02:00
Nicolas Vasilache	a0f843fdaf	[SCF] Add thread_dim_mapping attribute to scf.foreach_thread An optional thread_dim_mapping index array attribute specifies for each virtual thread dimension, how it remaps 1-1 to a set of concrete processing element resources (e.g. a CUDA grid dimension or a level of concrete nested async parallelism). At this time, the specification is backend-dependent and is not verified by the op, beyond being an index array attribute. It is the responsibility of the lowering to interpret the index array in the context of the concrete target the op is lowered to, or to ignore it when the specification is ill-formed or unsupported for a particular target. Differential Revision: https://reviews.llvm.org/D128633	2022-06-27 04:58:36 -07:00
Jacques Pienaar	2d70eff802	[mlir] Flip more uses to prefixed accessor form (NFC). Try to keep the final flip small. Need to flip MemRef as there are many templated cases with it and Tensor.	2022-06-26 19:12:38 -07:00
Kazu Hirata	3b7c3a654c	Revert "Don't use Optional::hasValue (NFC)" This reverts commit aa8feeefd3ac6c78ee8f67bf033976fc7d68bc6d.	2022-06-25 11:56:50 -07:00
Kazu Hirata	aa8feeefd3	Don't use Optional::hasValue (NFC)	2022-06-25 11:55:57 -07:00
Nicolas Vasilache	98dbaed1e6	[mlir][SCF] Fold tensor.cast feeding into scf.foreach_thread.parallel_insert_slice Differential Revision: https://reviews.llvm.org/D128247	2022-06-21 01:19:18 -07:00
Nicolas Vasilache	a489aa745b	[mlir][SCF] Add scf::ForeachThread canonicalization. This revision adds the necessary plumbing for canonicalizing scf::ForeachThread with the `AffineOpSCFCanonicalizationPattern`. In the process the `loopMatcher` helper is updated to take OpFoldResult instead of just values. This allows composing various scenarios without the need for an artificial builder. Differential Revision: https://reviews.llvm.org/D128244	2022-06-21 00:54:46 -07:00
Kazu Hirata	6d5fc1e3d5	[mlir] Don't use Optional::getValue (NFC)	2022-06-20 23:20:25 -07:00
Kazu Hirata	037f09959a	[mlir] Don't use Optional::hasValue (NFC)	2022-06-20 11:22:37 -07:00
Alex Zinenko	8b68da2c7d	[mlir] move SCF headers to SCF/{IR,Transforms} respectively This aligns the SCF dialect file layout with the majority of the dialects. Reviewed By: jpienaar Differential Revision: https://reviews.llvm.org/D128049	2022-06-20 10:18:01 +02:00

39 Commits