We've had the ability to define LLVM's `range` attribute through
#llvm.constant_range for some time, and have used this for some GPU
intrinsics. This commit allows using `llvm.range` as a parameter or
result attribute on function declarations and definitions.
The implementation is mostly based on the one existing for the exact
flag.
disjoint means that for each bit, that bit is zero in at least one of
the inputs. This allows the Or to be treated as an Add since no carry
can occur from any bit. If the disjoint keyword is present, the result
value of the or is a [poison
value](https://llvm.org/docs/LangRef.html#poisonvalues) if both inputs
have a one in the same bit position. For vectors, only the element
containing the bit is poison.
This implementation is based on the existing one for the exact flag.
If the nneg flag is set and the argument is negative, the result is a
poison value.
The implementation is mostly based on the one existing for the nsw and
nuw flags.
If the exact flag is present, the corresponding operation returns a
poison value when the result is not exact. (For a division, if rounding
happens; for a right shift, if a non-zero bit is shifted out.)
This patch adds operand bundle support for `llvm.intr.assume`.
This patch actually contains two parts:
- `llvm.intr.assume` now accepts operand bundle related attributes and
operands. `llvm.intr.assume` does not take constraint on the operand
bundles, but obviously only a few set of operand bundles are meaningful.
I plan to add some of those (e.g. `aligned` and `separate_storage` are
what interest me but other people may be interested in other operand
bundles as well) in future patches.
- The definitions of `llvm.call`, `llvm.invoke`, and
`llvm.call_intrinsic` actually define `op_bundle_tags` as an operation
property. It turns out this approach would introduce some unnecessary
burden if applied equally to the intrinsic operations because properties
are not available through `Operation *` but we have to operate on
`Operation *` during the import/export of intrinsics, so this PR changes
it from a property to an array attribute.
This patch relands commit d8fadad07c952c4aea967aefb0900e4e43ad0555.
This patch adds operand bundle support for `llvm.intr.assume`.
This patch actually contains two parts:
- `llvm.intr.assume` now accepts operand bundle related attributes and
operands. `llvm.intr.assume` does not take constraint on the operand
bundles, but obviously only a few set of operand bundles are meaningful.
I plan to add some of those (e.g. `aligned` and `separate_storage` are
what interest me but other people may be interested in other operand
bundles as well) in future patches.
- The definitions of `llvm.call`, `llvm.invoke`, and
`llvm.call_intrinsic` actually define `op_bundle_tags` as an operation
property. It turns out this approach would introduce some unnecessary
burden if applied equally to the intrinsic operations because properties
are not available through `Operation *` but we have to operate on
`Operation *` during the import/export of intrinsics, so this PR changes
it from a property to an array attribute.
The underlying issue was caused by a file included in two different
places which resulted in duplicate definition errors when linking
individual shared libraries. This was fixed in c3201ddaeac02a2c86a38b
[#109874].
Currently, we allow only one DIGlobalVariableExpressionAttr per global.
It is especially evident in import where we pick the first from the list
and ignore the rest. In contrast, LLVM allows multiple
DIGlobalVariableExpression to be attached to the global. They are needed
for correct working of things like DICommonBlock. This PR removes this
restriction in mlir. Changes are mostly mechanical. One thing on which I
went a bit back and forth was the representation inside GlobalOp. I
would be happy to change if there are better ways to do this.
---------
Co-authored-by: Tobias Gysi <tobias.gysi@nextsilicon.com>
This commit fixes a bug in the import of nameless globals. Before this
change, the fake symbol names were only generated during the
transformation of the definition. This caused issues when the symbol was
used before it was defined.
This commit addresses an issue with importing globals that reference
other globals. This case did not properly work due to not considering
that `llvm::GlobalVariables` are derived from `llvm::Constant`.
Add support for the -frecord-command-line option that will produce the
llvm.commandline metadata which will eventually be saved in the object
file. This behavior is also supported in clang. Some refactoring of the
code in flang to handle these command line options was carried out. The
corresponding -grecord-command-line option which saves the command line
in the debug information has not yet been enabled for flang.
* Strip calls to raw_string_ostream::flush(), which is essentially a no-op
* Strip unneeded calls to raw_string_ostream::str(), to avoid excess indirection.
This brings the behavior of flang in line with clang which also adds
this metadata unconditionally.
Co-authored-by: Tarun Prabhu <tarun.prabhu@gmail.com>
LLVM allows nameless globals for private global variables whereas in
MLIR globals must be addressed using symbols. We attach symbols to
nameless globals in order to enable their import.
---------
Co-authored-by: Christian Ulmann <christianulmann@gmail.com>
I would like to mark a call op in LLVM dialect as Musttail. The calling
convention attribute only exposes Tail, not Musttail. I noticed that the
CallInst of LLVM has an additional field to specify the flavor of tail
call kind. I bubbled this up to the LLVM dialect by adding another
attribute that maps to LLVM::CallInst::TailCallKind.
Adds `denormal-fp-math-f32`, `denormal-fp-math`, `fp-contract` to
llvmFuncOp attributes.
`denormal-fp-math-f32` and `denormal-fp-math` can enable the ftz, that
is , flushing denormal to zero.
`fp-contract` can enable the fma fusion such as `mul + add -> fma`
This PR adds -mtune as a valid flang flag and passes the information
through to LLVM IR as an attribute on all functions. No specific
architecture optimizations are added at this time.
The `noinline`, `alwaysinline`, and `optnone` function attributes are
already being used in MLIR code for the LLVM inlining interface and in
some SPIR-V lowering, despite residing in the passthrough dictionary,
which is intended as exactly that -- a pass through MLIR -- and not to
model any actual semantics being handled in MLIR itself.
Promote the `noinline`, `alwaysinline`, and `optnone` attributes out of
the passthrough dictionary on `llvm.func` into first class unit
attributes, updating the import and export accordingly.
Add a verifier to `llvm.func` that checks that these attributes are not
set in an incompatible way according to the LLVM specification.
Update the LLVM dialect inlining interface to use the first class
attributes to check whether inlining is possible.
This commit introduces a flag to allow skipping the potentially
recursive import of DICompositeType elements. This patch is essentially a
bandaid for the still broken recursive debug type import.
Some of our downstream inputs are produced by excessive usage of
template meta programming, and thus contain tens of thousands of types
that all participate in such recursions. Unfortunately, the series of
patches that introduces type support is not easily revertible due to
being around for a while now and Modular depending on it.
We can consider to revert this change once the type importer has show to
be very performant, but for now we are talking second vs hours to import
specific files.
This patch adds the `convertInstruction` and `getSupportedInstructions`
to `LLVMImportInterface`, allowing any non-LLVM dialect to specify how
to import LLVM IR instructions and overriding the default import of LLVM instructions.
Add operation mapping to the LLVM
`llvm.experimental.constrained.fptrunc.*` intrinsic.
The new operation implements the new
`LLVM::FPExceptionBehaviorOpInterface` and
`LLVM::RoundingModeOpInterface` interfaces.
---------
Signed-off-by: Victor Perez <victor.perez@codeplay.com>
This revision handles the case that the translation of a scope fails due
to cyclic metadata. This mainly affects the import of debug intrinsics
that indirectly take such a scope as metadata argument (e.g. via local
variable or label metadata). This commit ensures we drop intrinsics with
such a dependency on cyclic metadata.
Since https://github.com/ARM-software/acle/pull/276 the ACLE
defines attributes to better describe the use of a given SME state.
Previously the attributes merely described the possibility of it being
'shared' or 'preserved', whereas the new attributes have more semantics
and also describe how the data flows through the program.
For ZT0 we already had to add new LLVM IR attributes:
* aarch64_new_zt0
* aarch64_in_zt0
* aarch64_out_zt0
* aarch64_inout_zt0
* aarch64_preserves_zt0
We have now done the same for ZA, such that we add:
* aarch64_new_za (previously `aarch64_pstate_za_new`)
* aarch64_in_za (more specific variation of `aarch64_pstate_za_shared`)
* aarch64_out_za (more specific variation of `aarch64_pstate_za_shared`)
* aarch64_inout_za (more specific variation of
`aarch64_pstate_za_shared`)
* aarch64_preserves_za (previously `aarch64_pstate_za_shared,
aarch64_pstate_za_preserved`)
This explicitly removes 'pstate' from the name, because with SME2 and
the new ACLE attributes there is a difference between "sharing ZA"
(sharing
the ZA matrix register with the caller) and "sharing PSTATE.ZA" (sharing
either the ZA or ZT0 register, both part of PSTATE.ZA with the caller).
Adds unsafe-fp-math, no-infs-fp-math, no-nans-fp-math,
approx-func-fp-math, and no-signed-zeros-fp-math function attributes.
This allows code generators using the LLVMIR dialect to match the
codegen of Clang.
This revision updates the LLVM IR import to support unreachable basic
blocks. An unreachable block may dominate itself and a value defined
inside the block may thus be used before its definition. The import does
not support such dependencies. We thus delete the unreachable basic
blocks before the import. This is possible since MLIR does not have
basic block labels that can be reached using an indirect call and
unreachable blocks can indeed be deleted safely.
Additionally, add a small poison constant import test.
This patch adds the target_cpu attribute to llvm.func MLIR operations
and updates the translation to/from LLVM IR to match "target-cpu"
function attributes.
This revision adds support for importing call site calling conventions.
Additionally, the revision also adds a roundtrip test for an indirect
call with a non-standard calling convention.