878 Commits

Author SHA1 Message Date
Gil Rapaport
d9803841f2
[mlir][emitc] Add op modelling C expressions (#71631)
Add an emitc.expression operation that models C expressions, and provide
transforms to form and fold expressions. The translator emits the body
of
emitc.expression ops as a single C expression.
This expression is emitted by default as the RHS of an EmitC SSA value,
but if
possible, expressions with a single use that is not another expression
are
instead inlined. Specific expression's inlining can be fine tuned by
lowering
passes and transforms.
2023-12-20 15:04:46 +02:00
Oleksandr "Alex" Zinenko
9519e3ecbf
[mlir] support dialect attribute translation to LLVM IR (#75309)
Extend the `amendOperation` mechanism for translating dialect attributes
attached to operations from another dialect when translating MLIR to
LLVM IR. Previously, this mechanism would have no knowledge of the LLVM
IR instructions created for the given operation, making it impossible
for it to perform local modifications such as attaching operation-level
metadata. Collect instructions inserted by the LLVM IR builder and pass
them to `amendOperation`.
2023-12-19 14:18:16 +01:00
Paul Walker
dea16ebd26
[LLVM][IR] Replace ConstantInt's specialisation of getType() with getIntegerType(). (#75217)
The specialisation will not be valid when ConstantInt gains native
support for vector types.

This is largely a mechanical change but with extra attention paid to constant
folding, InstCombineVectorOps.cpp, LoopFlatten.cpp and Verifier.cpp to
remove the need to call `getIntegerType()`.

Co-authored-by: Nikita Popov <github@npopov.com>
2023-12-18 11:58:42 +00:00
Kareem Ergawy
d777504355
[MLIR][OpenMP][Offload] Lower target update op to DeviceRT (#75159)
Adds support for lowring `UpdateDataOp` to the DeviceRT. This reuses the
existing utils used by other device directive.
2023-12-18 11:14:46 +01:00
Kazu Hirata
88d319a29f [mlir] Use StringRef::{starts,ends}_with (NFC)
This patch replaces uses of StringRef::{starts,ends}with with
StringRef::{starts,ends}_with for consistency with
std::{string,string_view}::{starts,ends}_with in C++20.

I'm planning to deprecate and eventually remove
StringRef::{starts,ends}with.
2023-12-13 22:58:30 -08:00
Tom Eccles
79524ba527
[mlir][ArmSME] Add sve streaming compatible attribute (#75222)
Following the same path already used for ArmStreaming and
ArmLocallyStreaming.

This should correspond to clang's __arm_streaming_compatible attribute.
2023-12-13 13:53:01 +00:00
Christian Ulmann
eab62971cd
[MLIR][LLVM] Support nameless and scopeless global constants (#75307)
This commit ensures that we model DI information for global constants
correctly. These constructs can lack scopes, names, and linkage names,
so these parameters were made optional for the DIGlobalVariable
attribute.
2023-12-13 10:47:59 +01:00
Ivan Radanov Ivanov
95dce3e86d Link NVVM translation in the to LLVMIR registration library 2023-12-12 14:02:39 +09:00
Ivan R. Ivanov
d5fb4c0f11
[MLIR][NVVM] Enable nvvm intrinsics import to LLVMIR (#68843)
Co-authored-by: Tobias Gysi <tobias.gysi@nextsilicon.com>
Co-authored-by: Christian Ulmann <christianulmann@gmail.com>
2023-12-12 13:31:55 +09:00
Tom Eccles
e9e1c411b6
[mlir][LLVM] Add nsw and nuw flags (#74508)
The implementation of these are modeled after the existing fastmath
flags for floating point arithmetic.
2023-12-07 10:35:00 +00:00
Billy Zhu
2ea60f4197
[MLIR][LLVM] Fuse Scope into CallsiteLoc Callee (#74546)
There's an issue in the translator today where, for a CallsiteLoc, if
the callee does not have a DI scope (perhaps due to compile options or
optimizations), it may get propagated the DI scope of its callsite's
parent function, which will create a non-existent DILocation combining
line & col number from one file, and the filename from another.

The root problem is we cannot propagate the parent scope when
translating the callee location, as it no longer applies to inlined
locations (see code diff and hopefully this will make sense).

To facilitate this, the importer is also changed so that callee scopes
are fused with the callee FileLineCol loc, instead of on the Callsite
loc itself. This comes with the benefit that we now have a symmetric
Callsite loc representation. If we required the callee scope be always
annotated on the Callsite loc, it would be hard for generic inlining
passes to maintain that, since it would have to somehow understand the
semantics of the fused metadata and pull it out while inlining.
2023-12-06 09:13:12 +01:00
Sang Ik Lee
7fc792cba7
[MLIR] Enable GPU Dialect to SYCL runtime integration (#71430)
GPU Dialect lowering to SYCL runtime is driven by spirv.target_env
attached to gpu.module. As a result of this, spirv.target_env remains as
an input to LLVMIR Translation.
A SPIRVToLLVMIRTranslation without any actual translation is added to
avoid an unregistered error in mlir-cpu-runner.
SelectObjectAttr.cpp is updated to
1) Pass binary size argument to getModuleLoadFn
2) Pass parameter count to getKernelLaunchFn
This change does not impact CUDA and ROCM usage since both
mlir_cuda_runtime and mlir_rocm_runtime are already updated to accept
and ignore the extra arguments.
2023-12-05 16:55:24 -05:00
Benjamin Maxwell
17de468df1
[mlir][llvm] Add llvm.target_features features attribute (#71510)
This patch adds a target_features (TargetFeaturesAttr) to the LLVM
dialect to allow setting and querying the features in use on a function.

The motivation for this comes from the Arm SME dialect where we would
like a convenient way to check what variants of an operation are
available based on the CPU features.

Intended usage:

The target_features attribute is populated manually or by a pass:

```mlir
func.func @example() attributes {
   target_features = #llvm.target_features<["+sme", "+sve", "+sme-f64f64"]>
} {
 // ...
}
```

Then within a later rewrite the attribute can be checked, and used to
make lowering decisions.

```c++
// Finds the "target_features" attribute on the parent
// FunctionOpInterface.
auto targetFeatures = LLVM::TargetFeaturesAttr::featuresAt(op);

// Check a feature.
// Returns false if targetFeatures is null or the feature is not in
// the list.
if (!targetFeatures.contains("+sme-f64f64"))
    return failure();
```

For now, this is rather simple just checks if the exact feature is in
the list, though it could be possible to extend with implied features
using information from LLVM.
2023-12-05 11:29:31 +00:00
Radu Salavat
3257e4ca16
[MLIR] Add support for frame pointers in MLIR (#72145)
Add support for frame pointers in MLIR.

---------

Co-authored-by: Markus Böck <markus.boeck02@gmail.com>
Co-authored-by: Christian Ulmann <christianulmann@gmail.com>
2023-12-05 11:52:13 +01:00
Billy Zhu
fd870c6fa9
[MLIR][LLVM] Translate Debug EmissionKind (#74376)
Translate debug emission kind into LLVM (the importer already supports
this).
2023-12-05 11:05:21 +01:00
Justin Wilson
6da578cec1
[mlir] Add support for DIGlobalVariable and DIGlobalVariableExpression (#73367)
This PR introduces DIGlobalVariableAttr and
DIGlobalVariableExpressionAttr so that ModuleTranslation can emit the
required metadata needed for debug information about global variable.
The translator implementation for debug metadata needed to be refactored
in order to allow translation of nodes based on MDNode
(DIGlobalVariableExpressionAttr and DIExpression) in addition to
DINode-based nodes.

A DIGlobalVariableExpressionAttr can now be passed to the GlobalOp
operation directly and ModuleTranslation will create the respective
DIGlobalVariable and DIGlobalVariableExpression nodes. The compile unit
that DIGlobalVariable is expected to be configured with will be updated
with the created DIGlobalVariableExpression.
2023-12-04 15:52:02 +01:00
Adrian Kuegel
853682cc19 [mlir][LLVIR] Apply ClangTidy finding.
Remove unused using declaration.
2023-12-04 11:20:58 +00:00
Fangrui Song
a3ef858968 [mlir,polly] Replace uses of IRBuilder::getInt8PtrTy with getPtrTy. NFC 2023-11-27 20:58:25 -08:00
Guray Ozen
edf5cae739
[mlir][gpu] Support Cluster of Thread Blocks in gpu.launch_func (#72871)
NVIDIA Hopper architecture introduced the Cooperative Group Array (CGA).
It is a new level of parallelism, allowing clustering of Cooperative
Thread Arrays (CTA) to synchronize and communicate through shared memory
while running concurrently.

This PR enables support for CGA within the `gpu.launch_func` in the GPU
dialect. It extends `gpu.launch_func` to accommodate this functionality.

The GPU dialect remains architecture-agnostic, so we've added CGA
functionality as optional parameters. We want to leverage mechanisms
that we have in the GPU dialects such as outlining and kernel launching,
making it a practical and convenient choice.

An example of this implementation can be seen below:

```
gpu.launch_func @kernel_module::@kernel
                clusters in (%1, %0, %0) // <-- Optional
                blocks in (%0, %0, %0)
                threads in (%0, %0, %0)
```

The PR also introduces index and dimensions Ops specific to clusters,
binding them to NVVM Ops:

```
%cidX = gpu.cluster_id  x
%cidY = gpu.cluster_id  y
%cidZ = gpu.cluster_id  z

%cdimX = gpu.cluster_dim  x
%cdimY = gpu.cluster_dim  y
%cdimZ = gpu.cluster_dim  z
```

We will introduce cluster support in `gpu.launch` Op in an upcoming PR. 

See [the
documentation](https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#cluster-of-cooperative-thread-arrays)
provided by NVIDIA for details.
2023-11-27 11:05:07 +01:00
Kiran Chandramohan
76c4a6e310 [MLIR][OpenMP] NFC: Remove unused variable 2023-11-24 16:22:00 +00:00
Akash Banerjee
6bdeb53ed9
[MLIR][OpenMP] Fix the assertion failure for VariableCaptureKind::ByCopy (#72424) 2023-11-24 11:33:55 +00:00
Akash Banerjee
f1d773863d
[Flang][OpenMP] Remove use of non reference values from MapInfoOp (#72444)
This patch removes the val field from the `MapInfoOp`.

Previously when lowering `TargetOp`, the bounds information for the
`BoxValues` were also being mapped. Instead these ops are now cloned
inside the target region to prevent mapping of non reference typed
values.
2023-11-24 11:33:19 +00:00
Oleksandr "Alex" Zinenko
8735b7dcc9
[mlir] do not inject malloc/free in to-LLVM translation (#73224)
In the early days of MLIR-to-LLVM IR translation, it had to forcefully
inject declarations of `malloc` and `free` functions as then-standard
(now `memref`) dialect ops were unconditionally lowering to libc calls.
This is no longer the case. Even when they do lower to libc calls, the
signatures of those methods are injected at lowering since calls must
target declared functions in valid IR. Don't inject those declarations
anymore.
2023-11-23 13:38:25 +01:00
Benjamin Maxwell
dbb8643333
[mlir][LLVM] Support immargs in LLVM_IntrOpBase intrinsics (#73013)
This extends `LLVM_IntrOpBase` so that it can be passed a list of
`immArgPositions` and a list (of the same length) of `immArgAttrNames`.
`immArgPositions` contains the positions of `immargs` on the LLVM IR
intrinsic, and `immArgAttrNames` maps those to a corresponding MLIR
attribute.

This allows modeling LLVM `immargs` as MLIR attributes, which is the
closest match semantically (and had already been done manually for the
LLVM dialect intrinsics).

This has two upsides:
* It's slightly easier to implement intrinsics with immargs now
(especially if they make use of other features, such as overloads)
* It clearly defines that `immargs` should map to attributes, before
there was no mention of `immargs` in LLVMOpBase.td, so implementing them
was unclear

This works with other features of the `LLVM_IntrOpBase`, so `immargs`
can be marked as overloaded too (which is used in some intrinsics).

As part of this patch (and to test correctness) existing intrinsics have
been updated to use these new parameters.

This also uncovered a few issues with the
`llvm.intr.vector.insert/extract` intrinsics. First, the argument order
for insert did not match the LLVM intrinsic, and secondly, both were
missing a mlirBuilder (so failed to import from LLVM IR). This is
corrected with this patch (and a test case added).
2023-11-23 10:12:12 +00:00
Oleksandr "Alex" Zinenko
8134a8fc3f
[mlir] use TypeSize and uint64_t in DataLayout (#72874)
Data layout queries may be issued for types whose size exceeds the range
of 32-bit integer as well as for types that don't have a size known at
compile time, such as scalable vectors. Use best practices from LLVM IR
and adopt `llvm::TypeSize` for size-related queries and `uint64_t` for
alignment-related queries.

See #72678.
2023-11-21 16:12:27 +01:00
agozillon
9d26c6bd7f
[MLIR][OpenMP] remove now unnecessary getUsedValuesDefinedAbove call from convertTargetOp (#72904)
This block of code was here to create pseudo handling of implicit
captures in target regions to prevent gfortran test regressions and
allow certain pieces of code to function, however, with the introduction
of the IFA patch which adds proper handling of implicits by adding them
to the map operands list alongside explicit mappings at the initial
Fortran -> MLIR generation phase this should no longer be required and
may cause some adverse affects at worse in the future.
2023-11-21 15:33:56 +01:00
Ivan Butygin
b84fe8ff16
[mlir][spirv] Add some op decorations (#72809)
NoSignedWrap, NoUnsignedWrap, FPFastMathMode.
2023-11-21 15:31:23 +03:00
Marius Brehler
c4fd1fd6d4
[mlir][emitc] Rename call op to call_opaque (#72494)
This renames the `emitc.call` op to `emitc.call_opaque` as the existing
call op does not refer to the callee by symbol. The rename allows to
introduce a new call op alongside with a future `emitc.func` op to model
and facilitate functions and function calls.
2023-11-17 10:22:15 +01:00
Billy Zhu
0ab6b20c36
[MLIR] Add DIExpression to LLVM dialect (#72462)
Add initial support for DIExpression in LLVM dialect.

Similar to LLVM IR, DI Expression is encoded as a list of uint64. The
difference is that LLVM IR has helpers for understanding the expression
(e.g. for verification and pretty printing), whereas the current support
added by this PR treats the expression elements as opaque.
2023-11-16 11:32:02 -08:00
Benjamin Maxwell
783ac3b6fb
[mlir][ArmSME] Make use of backend function attributes for enabling ZA storage (#71044)
Previously, we were inserting za.enable/disable intrinsics for functions
with the "arm_za" attribute (at the MLIR level), rather than using the
backend attributes. This was done to avoid a dependency on the SME ABI
functions from compiler-rt (which have only recently been implemented).

Doing things this way did have correctness issues, for example, calling
a streaming-mode function from another streaming-mode function (both
with ZA enabled) would lead to ZA being disabled after returning to the
caller (where it should still be enabled). Fixing issues like this would
require re-doing the ABI work already done in the backend within MLIR.

Instead, this patch switches to use the "arm_new_za" (backend) attribute
for enabling ZA for an MLIR function. For the integration tests, this
requires some way of linking the SME ABI functions. This is done via the
`%arm_sme_abi_shlib` lit substitution. By default, this expands to a
stub implementation of the SME ABI functions, but this can be overridden
by providing the `ARM_SME_ABI_ROUTINES_SHLIB` CMake cache variable
(pointing it at an alternative implementation). For now, the ArmSME
integration tests pass with just stubs, as we don't make use of nested
ZA-enabled calls.

A future patch may add an option to compiler-rt to build the SME
builtins into a standalone shared library to allow easily
building/testing with the actual implementation.
2023-11-14 12:50:38 +00:00
Akash Banerjee
8701b178e0
[MLIR][OpenMP] Changes to function-filtering pass (#71850)
Currently, when deleting the device functions in the second stage of filtering during MLIR to LLVM translation we can end up with invalid calls to these functions. This is because of the removal of the EarlyOutliningPass which would have otherwise gotten rid of any such calls.

This patch aims to alter the function filtering pass in the following way:
	- Any host function is completely removed.
	- Call to the host function are also removed and their uses replaced with Undef values.
	- Any host function with target region code is marked to be removed during the the second stage.
	- Calls to such functions are still removed and their uses replaced with Undef values.

Co-authored-by: Sergio Afonso <sergio.afonsofumero@amd.com>
2023-11-14 12:43:31 +00:00
Shraiysh
c9626e6264
[OpenMP][mlir] Add enter capture attribute to declare target (#72062)
This patch adds support for enter attribute in declare target. As the
enter attribute is a replacement for `to` attribute, it has the same
tests.
2023-11-13 14:51:20 -06:00
David Truby
a72e034f13
[mlir] Add llvm.linker.options operation to the LLVM IR Dialect (#71720)
This patch adds a `llvm.linker.options` operation taking a list of
strings to pass to the linker when the resulting object file is linked.
This is particularly useful on Windows to specify the CRT version to use
for this object file.
2023-11-13 14:13:05 +00:00
Paulo Matos
7b9d73c2f9
[NFC] Remove Type::getInt8PtrTy (#71029)
Replace this with PointerType::getUnqual().
Followup to the opaque pointer transition. Fixes an in-code TODO item.
2023-11-07 17:26:26 +01:00
Akash Banerjee
6bb7c65493 [MLIR][OpenMP] Add check to see if map operand is of PtrType before creating LoadInst
This fixes build error from fbaf2c6cf7b207145dbda0d1cbadd0b446a21199.
2023-11-07 13:26:27 +00:00
Sirraide
65fedb4394
[MLIR] Add support for calling conventions to LLVM::CallOp and LLVM::InvokeOp (#71319)
Despite the fact that the LLVM dialect’s `FuncOp` already supports
calling conventions, there was yet no support for them in the ops that
actually perform function calls, which led to incorrect LLVM IR being
generated if one actually tried setting a `FuncOp`’s calling convention
to anything other than `ccc`.

This commit adds support for calling conventions to `LLVM::CallOp` and
`LLVM::InvokeOp` and makes sure that calling conventions are parsed,
printed, and lowered appropriately.
2023-11-06 19:27:01 +01:00
Simon Camphausen
68b071d9a2
[mlir][emitc] Fix corner case in translation of literal ops (#71375)
Fix a corner case missed in #71296 when operands generated by literals
are mixed with the args attribute of a call op.

Additionally remove a range check that is already handled by the CallOp
verifier.
2023-11-06 16:17:20 +01:00
Akash Banerjee
63752399f8 [OpenMP][MLIR]OMPEarlyOutliningPass removal
This patch removes the OMPEarlyOutliningPass as it is no longer required. The implicit map operand capture has now been moved to the PFT lowering stage.

Depends on #67318.
2023-11-06 13:24:02 +00:00
Akash Banerjee
72e2387c05 [OpenMP][MLIR] Add "IsolatedFromAbove" trait to omp.target
This patch adds the MLIR translation changes required for add the IsolatedFromAbove and OutlineableOpenMPOpInterface traits to omp.target. It links the newly added block arguments to their corresponding llvm values.

Depends on #67164.
2023-11-06 13:24:02 +00:00
Gil Rapaport
6c59f0e1b0
[mlir][emitc] Fix literal translation (#71296)
- Do not emit variables-at-top for literals
- Do not emit an error for a missing name for literals used as call
operands.
2023-11-05 17:06:24 -08:00
Lei Zhang
4a4b8570f7 [mlir][spirv] Fix missing dependency and remove unnecessary headers 2023-11-05 10:44:05 -08:00
Sang Ik Lee
2dace04521
[mlir][spirv] Implement gpu::TargetAttrInterface (#69949)
This commit implements gpu::TargetAttrInterface for SPIR-V target
attribute. The plan is to use this to enable GPU compilation pipeline
for OpenCL kernels later.

The changes do not impact Vulkan shaders using milr-vulkan-runner.
New GPU Dialect transform pass spirv-attach-target is implemented for
attaching attribute from CLI.

gpu-module-to-binary pass now works with GPU module that has SPIR-V
module with OpenCL kernel functions inside.
2023-11-05 08:11:53 -08:00
Jie Fu
ac798eaa96 [mlir] Fix -Wunused-variable in ROCDL/Target.cpp (NFC)
/llvm-project/mlir/lib/Target/LLVM/ROCDL/Target.cpp:181:40: error: unused variable 'targetMachine' [-Werror,-Wunused-variable]
  std::optional<llvm::TargetMachine *> targetMachine =
                                       ^
1 error generated.
2023-11-04 08:15:37 +08:00
Mehdi Amini
d9dadfda85
Refactor ModuleToObject to offer more flexibility to subclass (NFC)
Some specific implementation of the offload may want more customization, and
even avoid using LLVM in-tree to dispatch the ISA translation to a custom
solution. This refactoring makes it possible for such implementation to work
without even configuring the target backend in LLVM.

Reviewers: fabianmcg

Reviewed By: fabianmcg

Pull Request: https://github.com/llvm/llvm-project/pull/71165
2023-11-03 13:41:45 -07:00
Christian Ulmann
4ce93d531d
[MLIR][LLVM] Avoid creating invalid DICompositeTypes in import (#70797)
This commit ensures that the debug info import skips `DICompositeTypes`
that have an array type tag and failed to translate the base type. This
is necessary because array `DICompositeTypes` require a base type to be
valid.
Note that this is currently not verified in LLVM, instead it leads to an
explosion of the `ASMPrinter`.
2023-10-31 14:30:31 +01:00
Andrew Gozillon
68c384676c [Flang][MLIR][OpenMP] Temporarily re-add basic handling of uses in target regions to avoid gfortran test-suite regressions
This was a regression introduced by myself in:

 6a62707c04

where I too hastily removed the basic handling of implicit captures
we have currently. This will be superseded by all implicit captures
being added to target operations map_info entries in a soon landing
series of patches, however, that is currently not the case so we must
continue to do some basic handling of these captures for the time
being. This patch re-adds that behaviour to avoid regressions.

Unfortunately this means some test changes as well as
getUsedValuesDefinedAbove grabs constants used outside
of the target region which aren't handled particularly
well currently.
2023-10-30 15:10:12 -05:00
tsitdikov
8bc4462bc1
Remove unused variable. (#70670)
All usages of the variable have been removed in
https://github.com/llvm/llvm-project/pull/68689, we now need to clean it
up.
2023-10-30 16:37:30 +01:00
agozillon
6a62707c04
[Flang][OpenMP][MLIR] Initial array section mapping MLIR -> LLVM-IR lowering utilising omp.bounds (#68689)
This patch seeks to add initial lowering of OpenMP array sections within
target region map clauses from MLIR to LLVM IR.

This patch seeks to support fixed sized contiguous (don't think OpenMP
supports anything other than contiguous sections from my reading but i
could be wrong) arrays initially, before looking toward assumed size and
shaped arrays. The patch also currently does not include stride, it's
left for future work.

Although, assumed size works in some fashion (dummy arguments) with some
minor alterations to the OMPEarlyOutliner, so it is possible changes
made in the IsolatedFromAbove series may allow this to work with no
further required patches.

It utilises the generated omp.bounds to calculate the size of the mapped
OpenMP array (both for sectioned and un-sectioned arrays) as well as the
offset to be passed to the kernel argument structure.

Alongside these changes some refactoring of how map data is handled is
attempted, using a new MapData structure to keep track of information
utilised in the lowering of mapped values.

The initial addition of a more complex createDeviceArgumentAccessor that
utilises capture kinds similarly to (and loosely based on) Clang to
generate different kernel argument accesses is also added.

A similar function for altering how the kernel argument is passed to the
kernel argument structure on the host is also utilised
(createAlteredByCaptureMap), which allows modification of the
pointer/basePointer based on their capture (and bounds information).
It's of note ByRef, is the default for explicit mappings and ByCopy will
be the default for implicit captures, so the former is currently tested
in this patch and the latter is not for the moment.
2023-10-30 16:00:23 +01:00
Christian Ulmann
a902ca6642
[MLIR][LLVM] Infer export location scope from location, if possible (#70465)
This commit changes the debug location exporter to try to infer the
locations scope from the MLIR location, if possible. This is necessary
when the function containing the operation does not have a DISubprogram
attached to it.

We observed a roundtrip crash with a case where the the subprogram was
missing on a function, but a debug intrinsic referenced a subprogram
non-the-less. This lead to a successful import, but the export silently
dropped the location, which results in invalid IR.
2023-10-30 10:42:04 +01:00
Youngsuk Kim
645b7795d4 [mlir] Remove no-op ptr-to-ptr bitcasts (NFC)
Opaque pointer cleanup effort. NFC.
2023-10-26 13:01:23 -05:00