This diff adds initial (partial) support for "fastmath" attributes for floating
point operations in the arithmetic dialect. The "fastmath" attributes are
implemented using a default-valued bit enum. The defined flags currently mirror
the fastmath flags in the LLVM dialect (and in LLVM itself). Extending the
set of flags (if necessary) is left as a future task.
In this diff:
- Definition of FastMathAttr as a custom attribute in the Arithmetic dialect
that inherits from the EnumAttr class.
- Definition of ArithFastMathInterface, which is an interface that is
implemented by operations that have an arith::fastmath attribute.
- Declaration of a default-valued fastmath attribute for unary and (some) binary
floating point operations in the Arithmetic dialect.
- Conversion code to lower arithmetic fastmath flags to LLVM fastmath flags
NOT in this diff (but planned or currently in progress):
- Documentation of flag meanings
- Addition of FastMathAttr attributes to other dialects that might lower to the
Arithmetic dialect (e.g. Math and Complex)
- Folding/rewrite implementations that are enabled by fastmath flags
- Specification of fastmath values from Python bindings (pending other in-
progress diffs)
Reviewed By: mehdi_amini, vzakhari
Differential Revision: https://reviews.llvm.org/D126305
Some down-stream libraries only have access to CAPI, setting
default value of mapMemorySpace to true will help them convert IR
to SPIR-V.
Reviewed By: antiagainst
Differential Revision: https://reviews.llvm.org/D136739
The revision specifies more precise argument and result type
constraints for many of the llvm intrinsics. Additionally, add
tests to verify intrinsics with invalid arguments/result result
in a verification error.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D136360
Use the `getReturnTypes()` API (which returns an `ArrayRef<Type>`)
rather than the `getReturnType()` API (which returns a `Type`) to avoid
returning a dangling reference in `LLVMFuncOp::getCallableResults()`.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D136669
This has been a long standing TODO, and actually enables users to generate
debug information for LLVM using the LLVM dialect; as opposed to our
dummy placeholder that generated just enough for line table information.
Differential Revision: https://reviews.llvm.org/D136543
This allows for using the llvm namespace cast methods instead of the ones on the Location class. The Location class method are kept for now, but we'll want to remove these eventually (with a really long lead time).
Related change: https://reviews.llvm.org/D135870
Differential Revision: https://reviews.llvm.org/D136520
This moves the commandline option registration into its own function, so
that users can register translations without registering the options.
Reviewed By: cota
Differential Revision: https://reviews.llvm.org/D136561
Drop the `createPadScalarOp` from Utils.h since it is a duplicate of
the `build` method added here.
Differential Revision: https://reviews.llvm.org/D136493
`getDestinationOperands` was almost a duplicate of `DestinationStyleOpInterface::getOutputOperands`. Now that the interface has been moved to mlir/Interfaces, it is no longer needed.
Differential Revision: https://reviews.llvm.org/D136240
This adds a subset of the necessary metadata for defining
debug info in the LLVM dialect. It doesn't import everything,
but just enough to start actually generating LLVM debug info
the expected way. Export/Import to LLVMIR will be added in a
followup.
Differential Revision: https://reviews.llvm.org/D136542
Prior to this change, the "ExtractSliceFromReshape" pattern would transform
```
%collapsed = tensor.collapse_shape %input [[0, 1], [2]]
: tensor<1x11x100xf32> into tensor<11x100xf32>
%slice = tensor.extract_slice %collapsed [%offt, 0] [%size, 100] [1, 1]
: tensor<11x100xf32> to tensor<?x100xf32>
```
into a loop that iterated over the range `%size - %offt`, that pieces
together multiple sub-slices of `%input` along the first dimension. This
is correct but obviously inefficient. The technical condition is that
collapsing at-most-one non-unit dimension of `%src` will not result in a
subsequent slice along the corresponding dimension of `%collapsed`
mapping across discontinuities in the index space of `%src`. Thus, the
definition of a "linearized dimension" (from the perspective of
`tensor.collapse_shape`) is updated to reflect this condition.
The transform will now generate
```
%slice = tensor.extract_slice %input [0, %offt, 0][1, %size, 100] [1, 1]
: tensor<1x11x100xf32> to tensor<1x?x100xf32>
%result = tensor.collapse_shape [[0, 1], [2]]
: tensor<1x?x100xf32> to tensor<?x100xf32>
```
which can be further canonicalized.
Additional tests are added to check this family of edge cases.
Reviewed By: ThomasRaoux
Differential Revision: https://reviews.llvm.org/D135726
Enum definitions are currently spread throughout the op definitions
file, making it difficult to reason about both where they are, and
where to add new ones when necessary. The attribute definitions are
in a similarish case, where while they have a dedicated .td file, there
definitions/declarations are generated in the main LLVMDialect source
files. This makes it difficult to reason about how to cleanly add new
attributes/enums.
This commit adds a dedicated LLVMEnums.td file for enum definitions,
cleans up the AttrDefs.td file, and adds a new LLVMAttrs.cpp/.h file to
home enum/attr definitions moving forward. This makes it much cleaner to
add new attributes/enums to the LLVM dialect.
Differential Revision: https://reviews.llvm.org/D136409
This greatly simplifies composing enums in attribute/type printers,
which currently reimplement these functions as needed.
Differential Revision: https://reviews.llvm.org/D136407
SubElementInterfaces forbids all mutable types and attributes from
implementing `replaceImmediateSubElements`. However, this prohibits
literal structs, which are immutable, from implementing that function.
This patch defers the decision on whether to support
`replaceImmediateSubElements` to the individual types/attributes.
Depends on D136505
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D136507
This keeps the current parser, however, since the mnemonic `vec` is
overloaded for both of these types.
Depends on D136499
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D136505
This moves the `LLVMArrayType` to a `TypeDef`. The main side-effect of
this change is that the syntax `array<4xi32>` is no longer allowed. It
was previously parsed and then printed as `array<4 x i32>`. Now the
syntax must be the latter.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D136473
This adds a '--no-implicit-module' option, which disables the insertion
of a top-level 'builtin.module' during parsing.
The translation APIs are also updated to take/return 'Operation*'
instead of 'ModuleOp', to allow other operation types to be used. To
simplify translations which are restricted to specific operation types,
'TranslateFromMLIRRegistration' has an overload which performs the
necessary cast and error checking.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D134237
This patch adds a lowering pass to convert `index` dialect ops to LLVM.
Depends on D135694
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D135697
This patch adds folders for `index` dialect ops. Ths folders are
careful to ensure that fold results are valid on both 32-bit and 64-bit
targets.
Depends on D135689
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D135694
This patch adds the definitions for the operations and attributes (just
one enum attribute) for the `index` dialect.
Depends on D135688
Reviewed By: rriddle, jpienaar
Differential Revision: https://reviews.llvm.org/D135689
This patch introduces the `index` dialect and associated boilerplate for
adding ops and enums (comparison predicates).
Reviewed By: rriddle, jpienaar, nicolasvasilache
Differential Revision: https://reviews.llvm.org/D135688
The `scf.index_switch` is a control-flow operation that branches to one of the
given regions based on the values of the argument and the cases. The
argument is always of type `index`.
Example:
```mlir
%0 = scf.index_switch %arg0 -> i32
case 2 {
%1 = arith.constant 10 : i32
scf.yield %1 : i32
}
case 5 {
%2 = arith.constant 20 : i32
scf.yield %2 : i32
}
default {
%3 = arith.constant 30 : i32
scf.yield %3 : i32
}
```
Reviewed By: jpienaar
Differential Revision: https://reviews.llvm.org/D136003
Clarify that sparse_tensor.foreach iterates sparse_tensor in stored dim order.
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D136401
Add an option to dump the pipeline that will be run to stderr. A
dedicated option is needed since the existing `test-dump-pipeline`
pipeline won't be usable with `-pass-pipeline` after D135745.
Reviewed By: rriddle, mehdi_amini
Differential Revision: https://reviews.llvm.org/D135747
Previously a pipeline nested on `anchor-op` would print as just
`'pipeline'`, now it will print as `'anchor-op(pipeline)'`. This ensures
the text form includes all information needed to reconstruct the pass
manager.
Reviewed By: rriddle, mehdi_amini
Differential Revision: https://reviews.llvm.org/D134622
This patch handles native `mma.sync` sizes and enables issuing `ldmatrix` on
largest possible tiles for matrixB. It requires handling
`vector.extract_strided_slice` from vector to ngpu lowering.
Differential Revision: https://reviews.llvm.org/D135749
This is to allow the use of a nop convert to express that the sparse tensor
allocated through bufferization::AllocTensorOp will be expanded to sparse
tensor storage by sparse tensor codegen.
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D136214
This removes another massive source of redundancy, and instead has the Merger.{h,cpp} reuse the SparseTensorEnums library.
Depends On D136005
Reviewed By: Peiming
Differential Revision: https://reviews.llvm.org/D136123
The `SimplifyExtractStridedMetadata` pass features a pattern that forward
statically known information (offset, sizes, strides) to their respective
users.
This patch moves this pattern from this pass to the
`extract_strided_metadata` folding patterns.
Differential Revision: https://reviews.llvm.org/D135797