[MLIR] Introduce RemarkEngine + pluggable remark streaming (YAML/Bitstream) (#152474)

This PR implements structured, tooling-friendly optimization remarks
with zero cost unless enabled. It implements:
- `RemarkEngine` collects finalized remarks within `MLIRContext`.
- `MLIRRemarkStreamerBase` abstract class streams them to a backend.
- Backends: `MLIRLLVMRemarkStreamer` (bridges to llvm::remarks →
YAML/Bitstream) or your own custom streamer.
- Optional mirroring to DiagnosticEngine (printAsEmitRemarks +
categories).
- Off by default; no behavior change unless enabled. Thread-safe;
ordering best-effort.


## Overview

```
Passes (reportOptimization*)
         │
         ▼
+-------------------+
|  RemarkEngine     |   collects
+-------------------+
     │         │
     │ mirror  │ stream
     ▼         ▼
emitRemark    MLIRRemarkStreamerBase (abstract)
                   │
                   ├── MLIRLLVMRemarkStreamer → llvm::remarks → YAML | Bitstream
                   └── CustomStreamer → your sink
```

## Enable the remark engine and plug in LLVM's remark streamer
```
// Enable once per MLIRContext. This uses `MLIRLLVMRemarkStreamer`
mlir::remark::enableOptimizationRemarksToFile(
    ctx, path, llvm::remarks::Format::YAML, cats);
```

## API to emit remarks
```
// Emit from a pass
remark::passed(loc, categoryVectorizer, myPassname1)
        << "vectorized loop";

remark::missed(loc, categoryUnroll, "MyPass")
        << remark::reason("not profitable at this size")   // Creates structured reason arg
        << remark::suggest("increase unroll factor to >=4");   // Creates structured suggestion arg

remark::passed(loc, categoryVectorizer, myPassname1)
        << "vectorized loop"
        << remark::metric("tripCount", 128);                // Creates structured metric on the fly
```

2025-08-21 16:02:31 +02:00


Remark Infrastructure

Remarks are structured, human- and machine-readable notes emitted by the compiler to explain:

- What was transformed
- What was missed
- Why it happened

The RemarkEngine collects finalized remarks during compilation and sends them to a pluggable streamer. By default, MLIR integrates with LLVM’s llvm::remarks, allowing you to:

- Stream remarks as passes run
- Serialize them to YAML or LLVM bitstream for tooling

Key Points

- Opt-in – Disabled by default; zero overhead unless enabled.
- Per-context – Configured on MLIRContext.
- Formats – LLVM Remark engine (YAML / Bitstream) or custom streamers.
- Kinds – Passed, Missed, Failure, Analysis.
- API – Lightweight streaming interface using << (like MLIR diagnostics).

How It Works

Two main components:

RemarkEngine (owned by MLIRContext): Receives finalized InFlightRemarks, optionally mirrors them to the DiagnosticEngine, and dispatches to the installed streamer.

MLIRRemarkStreamerBase (abstract): Backend interface with a single hook:

virtual void streamOptimizationRemark(const Remark &remark) = 0;

Default backend: MLIRLLVMRemarkStreamer adapts mlir::Remark to LLVM’s remark format and writes YAML/bitstream via llvm::remarks::RemarkStreamer.

Ownership flow: MLIRContext → RemarkEngine → MLIRRemarkStreamerBase
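The collect, mirror, and stream flow described above can be sketched without any MLIR headers. This is a standalone illustration under stated assumptions: `ToyRemark`, `ToyStreamerBase`, `ToyLineStreamer`, and `ToyEngine` are hypothetical stand-ins, not the real classes, and the mirror callback only plays the role of the DiagnosticEngine.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a finalized remark; not an MLIR type.
struct ToyRemark {
  std::string kind; // "Passed", "Missed", "Failure" or "Analysis"
  std::string category;
  std::string message;
};

// Plays the role of the abstract streamer: one hook per finalized remark.
class ToyStreamerBase {
public:
  virtual ~ToyStreamerBase() = default;
  virtual void streamOptimizationRemark(const ToyRemark &remark) = 0;
};

// Example backend: appends each remark as one text line to an external log.
class ToyLineStreamer : public ToyStreamerBase {
public:
  explicit ToyLineStreamer(std::vector<std::string> &log) : sink(log) {}
  void streamOptimizationRemark(const ToyRemark &remark) override {
    sink.push_back(remark.kind + " [" + remark.category + "] " +
                   remark.message);
  }

private:
  std::vector<std::string> &sink;
};

// Plays the role of the engine: owns the streamer and optionally mirrors.
class ToyEngine {
public:
  ToyEngine(std::unique_ptr<ToyStreamerBase> backend,
            std::function<void(const ToyRemark &)> mirrorFn)
      : streamer(std::move(backend)), mirror(std::move(mirrorFn)) {}

  // Called once per finalized remark.
  void report(const ToyRemark &remark) {
    assert(!remark.kind.empty() && "a finalized remark has a kind");
    if (mirror)
      mirror(remark); // mirror to a diagnostic-like callback
    if (streamer)
      streamer->streamOptimizationRemark(remark); // dispatch to the backend
  }

private:
  std::unique_ptr<ToyStreamerBase> streamer;
  std::function<void(const ToyRemark &)> mirror;
};
```

Passing the streamer as a `std::unique_ptr` mirrors the ownership flow: once installed, the engine is the only owner of the backend.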

Emitting Remarks

The remark::* helpers return an in-flight remark. You append strings or key–value metrics using <<.

Remark Options

When constructing a remark, you typically provide four StringRef fields:

- Remark name – identifiable name
- Category – high-level classification
- Sub-category – more fine-grained classification
- Function name – the function where the remark originates

Example

#include "mlir/IR/Remarks.h"

// Pass::runOnOperation() returns void, so the example does not return a
// LogicalResult.
void MyPass::runOnOperation() {
  Location loc = getOperation()->getLoc();

  // MyRemarkName1, categoryVectorizer, fName, and myPassname1 are
  // placeholder StringRef values defined elsewhere in the pass.
  remark::RemarkOpts opts = remark::RemarkOpts::name(MyRemarkName1)
                                .category(categoryVectorizer)
                                .function(fName)
                                .subCategory(myPassname1);

  // PASSED
  remark::passed(loc, opts)
      << "vectorized loop"
      << remark::metric("tripCount", 128);

  // ANALYSIS
  remark::analysis(loc, opts)
      << "Kernel uses 168 registers";

  // MISSED (with reason + suggestion)
  int tripBad = 4, threshold = 256, target = 128;
  remark::missed(loc, opts)
      << remark::reason("tripCount={0} < threshold={1}", tripBad, threshold)
      << remark::suggest("increase unroll to {0}", target);

  // FAILURE
  remark::failed(loc, opts)
      << remark::reason("failed due to unsupported pattern");
}

Metrics and Shortcuts

Helper functions accept LLVM formatv-style strings with positional placeholders such as {0} and {1}. The formatted text is built lazily, so remarks are zero-cost when disabled.
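To illustrate what lazy building means here, the sketch below defers formatting until an enabled remark consumes the argument. It is a standalone approximation, not the MLIR implementation: the `toy` namespace, `expand`, `LazyArg`, `InFlight`, and the `formatCalls` counter are hypothetical, and `expand` is a hand-rolled `{N}` substitution standing in for LLVM's formatting.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <cstddef>
#include <functional>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

namespace toy {

// Counts how often a format string is actually expanded.
static int formatCalls = 0;

// Minimal "{N}" substitution; a stand-in, not LLVM's formatter.
inline std::string expand(const std::string &fmt,
                          const std::vector<std::string> &args) {
  ++formatCalls;
  std::string out;
  for (std::size_t i = 0; i < fmt.size(); ++i) {
    std::size_t close = fmt[i] == '{' ? fmt.find('}', i) : std::string::npos;
    std::string digits = close == std::string::npos
                             ? std::string()
                             : fmt.substr(i + 1, close - i - 1);
    bool isIndex = !digits.empty() &&
                   std::all_of(digits.begin(), digits.end(),
                               [](unsigned char c) { return std::isdigit(c) != 0; });
    if (!isIndex) {
      out += fmt[i]; // ordinary character, copy through
      continue;
    }
    std::size_t index = std::stoul(digits);
    assert(index < args.size() && "placeholder index out of range");
    out += args[index];
    i = close; // skip past the closing brace
  }
  return out;
}

template <typename T> std::string toText(const T &value) {
  std::ostringstream os;
  os << value;
  return os.str();
}

// A deferred key/value pair: the value is rendered only on demand.
struct LazyArg {
  std::string key;
  std::function<std::string()> render;
};

template <typename... Ts> LazyArg reason(std::string fmt, Ts... values) {
  return {"Reason", [fmt = std::move(fmt), values...] {
            std::vector<std::string> args{toText(values)...};
            return expand(fmt, args);
          }};
}

class InFlight {
public:
  explicit InFlight(bool isEnabled) : enabled(isEnabled) {}
  InFlight &operator<<(const LazyArg &arg) {
    if (enabled) // a disabled remark never runs the formatter
      rendered.push_back(arg.key + "=" + arg.render());
    return *this;
  }
  const std::vector<std::string> &getArgs() const { return rendered; }

private:
  bool enabled;
  std::vector<std::string> rendered;
};

} // namespace toy
```

In this sketch a disabled remark still pays for capturing the arguments, but it skips the string formatting, which is usually the expensive part.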

Adding Remarks

remark::add(fmt, ...) – Shortcut for metric("Remark", ...).

Adding Reasons

remark::reason(fmt, ...) – Shortcut for metric("Reason", ...). Used to explain why an optimization was missed or failed.

Adding Suggestions

remark::suggest(fmt, ...) – Shortcut for metric("Suggestion", ...). Used to provide actionable feedback.

Adding Custom Metrics

remark::metric(key, value) – Adds a structured key–value metric.

Example: tracking TripCount. When exported to YAML, it appears under args for machine readability:

remark::metric("TripCount", value)

String Metrics

Passing a plain string (e.g. << "vectorized loop") is equivalent to:

metric("Remark", "vectorized loop")
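The relationship between plain strings, the shortcuts, and key–value metrics can be modelled in a few lines. This is a hypothetical sketch in a `toy` namespace rather than the real `remark::` helpers: it takes plain strings instead of format strings and stores every argument as a (key, value) pair of strings.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

namespace toy {

using Arg = std::pair<std::string, std::string>; // (key, value)

// Structured key/value metric; numeric values are stored as text here.
inline Arg metric(std::string key, std::string value) {
  return {std::move(key), std::move(value)};
}
inline Arg metric(std::string key, long value) {
  return {std::move(key), std::to_string(value)};
}

// Shortcuts that differ only in the key they attach.
inline Arg add(std::string text) { return metric("Remark", std::move(text)); }
inline Arg reason(std::string text) { return metric("Reason", std::move(text)); }
inline Arg suggest(std::string text) {
  return metric("Suggestion", std::move(text));
}

struct Builder {
  std::vector<Arg> args;

  Builder &operator<<(Arg arg) {
    args.push_back(std::move(arg));
    return *this;
  }
  // A plain string is recorded under the "Remark" key.
  Builder &operator<<(const char *text) {
    assert(text && "remark text must not be null");
    return *this << add(text);
  }
};

} // namespace toy
```

With this model, `builder << "vectorized loop"` and `builder << toy::metric("Remark", "vectorized loop")` leave the builder in the same state.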

Enabling Remarks

1. With LLVMRemarkStreamer (YAML or Bitstream)

Persists remarks to a file in the chosen format.

mlir::remark::RemarkCategories cats{/*passed=*/categoryLoopunroll,
                                     /*missed=*/std::nullopt,
                                     /*analysis=*/std::nullopt,
                                     /*failed=*/categoryLoopunroll};

mlir::remark::enableOptimizationRemarksWithLLVMStreamer(
    context, yamlFile, llvm::remarks::Format::YAML, cats);

YAML format – human-readable, easy to diff:

--- !Passed
pass:            Category:SubCategory
name:            MyRemarkName1
function:        myFunc
loc:             myfile.mlir:12:3
args:
  - Remark:          vectorized loop
  - tripCount:       128

Bitstream format – compact binary for large runs.
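For readers writing tools against the YAML output, the sketch below reproduces the layout of the example above from a plain struct. It is illustrative only: `ToyRemark`, `field`, and `toYaml` are hypothetical, the column width is read off the example, and the real serializer in llvm::remarks may emit additional fields or document markers.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a finalized remark; not an MLIR or LLVM type.
struct ToyRemark {
  std::string kind;        // e.g. "Passed"
  std::string category;    // e.g. "Category"
  std::string subCategory; // e.g. "SubCategory"
  std::string name;
  std::string function;
  std::string loc; // "file:line:col"
  std::vector<std::pair<std::string, std::string>> args;
};

// Emits "key:" padded to a fixed column, followed by the value.
inline std::string field(const std::string &key, const std::string &value) {
  const std::size_t width = 17; // column taken from the example above
  std::string label = key + ":";
  label.append(label.size() < width ? width - label.size() : 1, ' ');
  return label + value + "\n";
}

inline std::string toYaml(const ToyRemark &r) {
  assert(!r.kind.empty() && "a finalized remark has a kind");
  std::string out = "--- !" + r.kind + "\n";
  out += field("pass", r.category + ":" + r.subCategory);
  out += field("name", r.name);
  out += field("function", r.function);
  out += field("loc", r.loc);
  out += "args:\n";
  for (const auto &arg : r.args)
    out += "  - " + field(arg.first, arg.second);
  return out;
}
```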

2. With `mlir::emitRemark` (No Streamer)

If no streamer is passed and printAsEmitRemarks is set, remarks are mirrored to the DiagnosticEngine via mlir::emitRemark.

mlir::remark::RemarkCategories cats{/*passed=*/categoryLoopunroll,
                                     /*missed=*/std::nullopt,
                                     /*analysis=*/std::nullopt,
                                     /*failed=*/categoryLoopunroll};
remark::enableOptimizationRemarks(
    /*streamer=*/nullptr, cats,
    /*printAsEmitRemarks=*/true);
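The per-kind category filter used in the snippets above can be pictured as four optional filters, one per remark kind. The sketch below is an assumption-laden model, not the MLIR type: `ToyKind`, `ToyCategories`, and `isEnabled` are hypothetical, and it assumes a remark is kept only when the filter for its kind is set and equals the remark's category exactly. The real matching rule is not described in this document and may be more permissive.

```cpp
#include <cassert>
#include <optional>
#include <string>

enum class ToyKind { Passed, Missed, Analysis, Failure };

// One optional filter per kind; std::nullopt disables that kind entirely.
struct ToyCategories {
  std::optional<std::string> passed, missed, analysis, failed;
};

// Assumed rule: the kind's filter must be set and match the category exactly.
inline bool isEnabled(const ToyCategories &cats, ToyKind kind,
                      const std::string &category) {
  const std::optional<std::string> *filter = nullptr;
  switch (kind) {
  case ToyKind::Passed:
    filter = &cats.passed;
    break;
  case ToyKind::Missed:
    filter = &cats.missed;
    break;
  case ToyKind::Analysis:
    filter = &cats.analysis;
    break;
  case ToyKind::Failure:
    filter = &cats.failed;
    break;
  }
  assert(filter && "unhandled remark kind");
  return filter->has_value() && **filter == category;
}
```

With filters set only for passed and failed, as in the snippets above, missed and analysis remarks are dropped before they reach the mirror or the streamer.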

3. With a Custom Streamer

You can implement a custom streamer by inheriting from MLIRRemarkStreamerBase to consume remarks in any format.

class MyStreamer : public MLIRRemarkStreamerBase {
public:
  void streamOptimizationRemark(const Remark &remark) override {
    // Convert and write remark to your custom format
  }
};

auto myStreamer = std::make_unique<MyStreamer>();
// A std::unique_ptr cannot be copied, so ownership is transferred with std::move.
remark::enableOptimizationRemarks(
    /*streamer=*/std::move(myStreamer), cats,
    /*printAsEmitRemarks=*/true);
