8 Commits

Author SHA1 Message Date
Zichen Lu
fbffdaa174
[MLIR][GPU] Update serializeToObject to use SerializedObject wrapper and include ISA compiler logs (#176697)
This PR makes the compilation log from ISA compiler available to users
by returning it as part of the `gpu::ObjectAttr` properties, following
the existing pattern like `LLVMIRToISATimeInMs`.

Currently, the compiler log (which contains useful information such as
spill statistics when --verbose is passed) is only accessible in debug
builds via `LLVM_DEBUG`. However, there are good reasons to make this
information available in release builds as well:

1. Both `ptxas` and `libnvptxcompiler` are publicly available
tools/libraries distributed with the CUDA Toolkit. The `--verbose` flag
and its output are documented public features, not internal debug
information.
2. The verbose output provides valuable insights for users.

A new `SerializedObject` class is used to carry the metadata alongside
the binary when returning from `serializeObject`.
2026-01-30 12:56:20 +01:00
Srinivasa Ravi
1e468b2813
[MLIR][GPU][NVVM] Add verify-target-arch option to nvvm-attach-target pass (#176774)
This change adds the `verify-target-arch` option to the
`nvvm-attach-target` to control the `verifyTarget` parameter in the
attached `NVVMTargetAttr` which is used to enable/disable the
verification of the target architecture with respect to the NVVM Ops.
2026-01-22 17:18:22 +05:30
Guray Ozen
837b89fc0f
[MLIR][NVVM] Add ptxas-cmd-options to pass flags to the downstream compiler (#127457)
This PR adds `cmd-options` to the `gpu-lower-to-nvvm-pipeline` pipeline
and the `nvvm-attach-target` pass, allowing users to pass flags to the
downstream compiler, *ptxas*.

Example:
```
mlir-opt -gpu-lower-to-nvvm-pipeline="cubin-chip=sm_80 ptxas-cmd-options='-v --register-usage-level=8'"
```
2025-02-17 12:09:27 +01:00
Kazu Hirata
5262865aac
[mlir] Construct SmallVector with ArrayRef (NFC) (#101896) 2024-08-04 11:43:05 -07:00
Kazu Hirata
b7b337fb91
[mlir] Use llvm::unique (NFC) (#96415) 2024-06-24 11:54:02 -07:00
Adrian Kuegel
93228cff8f [mlir] Apply ClangTidy fix (NFC)
Use .empty() instead of checking for size().
2023-08-22 13:55:09 +02:00
Nicolas Vasilache
7c4e8c6a27 [mlir] Disentangle dialect and extension registrations.
This revision avoids the registration of dialect extensions in Pass::getDependentDialects.

Such registration of extensions can be dangerous because `DialectRegistry::isSubsetOf` is
always guaranteed to return false for extensions (i.e. there is no mechanism to track
whether a lambda is already in the list of already registered extensions).
When the context is already in a multi-threaded mode, this is guaranteed to assert.

Arguably a more structured registration mechanism for extensions with a unique ExtensionID
could be envisioned in the future.

In the process of cleaning this up, multiple usage inconsistencies surfaced around the
registration of translation extensions that this revision also cleans up.

Reviewed By: springerm

Differential Revision: https://reviews.llvm.org/D157703
2023-08-22 00:40:09 +00:00
Fabian Mora
fbbb8adef1 [mlir][gpu] Add passes to attach (NVVM|ROCDL) target attributes to GPU Modules
Adds the passes `nvvm-attach-target` & `rocdl-attach-target for attaching `nvvm.target` & `rocdl.target` attributes to GPU Modules.

These passes search GPU Modules in the immediate region of the Op being acted on, attaching the target attribute to the module.
Modules can be selected using a regex string, allowing fine grain attachment of targets, see the test `attach-target.mlir` for an example.

Depends on D154153

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D157351
2023-08-12 00:45:26 +00:00