7 Commits

Author SHA1 Message Date
S. VenkataKeerthy
7634a8ed24
[MIR2Vec][llvm-ir2vec] Add MIR2Vec support to llvm-ir2vec tool (#164025)
Add MIR2Vec support to the llvm-ir2vec tool, enabling embedding generation for Machine IR alongside the existing LLVM IR functionality.

(This is an initial integration; other entity/triplet generation for vocabulary generation will follow in separate patches.)
2025-10-22 15:25:16 -07:00
S. VenkataKeerthy
c70d0812ba
[MIR2Vec] Handle Operands (#163281)
Handle operands in the embedding computation.

- Revamped MIR Vocabulary with four sections - `Opcodes`, `Common Operands`, `Physical Registers`, and `Virtual Registers`
- Operands broadly fall into three categories: the generic MO types that are common across architectures, physical register classes, and virtual register classes. MIR2Vec handles these categories separately. (Although physical and virtual registers share the same register classes, their embeddings differ.)
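
The four-section layout described above can be pictured as one flat embedding table with a fixed offset per section. The following is a minimal sketch of that idea, not the actual `MIRVocabulary` code: the names `SectionedVocab`, `Section`, and `flatIndex` are hypothetical, and the per-section entry counts are supplied by the caller. It shows how the same register class index can resolve to different entries depending on whether the register is physical or virtual.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical names throughout; this is not the LLVM MIRVocabulary API.
enum class Section {
  Opcodes,
  CommonOperands,
  PhysicalRegisters,
  VirtualRegisters
};

using Embedding = std::vector<float>;

class SectionedVocab {
public:
  // Counts[i] is the number of entries in section i, in enum order.
  SectionedVocab(const std::array<std::size_t, 4> &Counts, std::size_t Dim) {
    std::size_t Total = 0;
    for (std::size_t I = 0; I < Counts.size(); ++I) {
      Offsets[I] = Total; // First flat index belonging to section I.
      Sizes[I] = Counts[I];
      Total += Counts[I];
    }
    Storage.assign(Total, Embedding(Dim, 0.0f));
  }

  // Flat index of entry `Local` within section `S`.
  std::size_t flatIndex(Section S, std::size_t Local) const {
    const auto I = static_cast<std::size_t>(S);
    assert(Local < Sizes[I] && "index out of range for section");
    return Offsets[I] + Local;
  }

  // Embedding stored for entry `Local` of section `S`.
  Embedding &at(Section S, std::size_t Local) {
    return Storage[flatIndex(S, Local)];
  }

  std::size_t size() const { return Storage.size(); }

private:
  std::array<std::size_t, 4> Offsets{};
  std::array<std::size_t, 4> Sizes{};
  std::vector<Embedding> Storage;
};
```

With counts `{3, 2, 4, 4}`, register class 1 maps to flat index 6 in the physical section and 10 in the virtual section, so the two kinds of register keep separate embeddings even though they share class numbering.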
2025-10-22 10:58:38 -07:00
S. VenkataKeerthy
3c77b49797
[MIR2Vec] Add embedder for machine instructions (#162161)
Implement the MIR2Vec embedder for generating vector representations of Machine IR instructions, basic blocks, and functions. This patch introduces the changes necessary to *embed* machine opcodes. Machine operands will be handled incrementally in upcoming patches.
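
As a rough illustration of opcode-only embedding at the three granularities named above, the sketch below looks up one vector per opcode and aggregates by element-wise summation. Summation is an assumption made for the sake of the example; the commit message does not state the aggregation rule. The types `Instr`, `Block`, and `Function` and the `embed*` helpers are hypothetical stand-ins, not LLVM classes.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Embedding = std::vector<float>;

// Hypothetical stand-ins for machine IR: an instruction is just an opcode ID.
using Instr = unsigned;
using Block = std::vector<Instr>;
using Function = std::vector<Block>;

// Element-wise accumulate Src into Dst; both must have the same dimension.
inline void accumulate(Embedding &Dst, const Embedding &Src) {
  assert(Dst.size() == Src.size());
  for (std::size_t I = 0; I < Dst.size(); ++I)
    Dst[I] += Src[I];
}

// Instruction embedding: the vocabulary entry of its opcode.
inline Embedding embedInstr(const std::vector<Embedding> &OpcodeVocab,
                            Instr Opc) {
  assert(Opc < OpcodeVocab.size());
  return OpcodeVocab[Opc];
}

// Basic block embedding: sum of its instruction embeddings (assumed rule).
inline Embedding embedBlock(const std::vector<Embedding> &OpcodeVocab,
                            const Block &BB) {
  assert(!OpcodeVocab.empty());
  Embedding Result(OpcodeVocab.front().size(), 0.0f);
  for (Instr Opc : BB)
    accumulate(Result, embedInstr(OpcodeVocab, Opc));
  return Result;
}

// Function embedding: sum of its basic block embeddings (assumed rule).
inline Embedding embedFunction(const std::vector<Embedding> &OpcodeVocab,
                               const Function &F) {
  assert(!OpcodeVocab.empty());
  Embedding Result(OpcodeVocab.front().size(), 0.0f);
  for (const Block &BB : F)
    accumulate(Result, embedBlock(OpcodeVocab, BB));
  return Result;
}
```

Because operands are ignored here, two instructions with the same opcode always receive the same vector, which is the gap the later operand-handling patch addresses.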
2025-10-21 10:14:27 -07:00
Rahul Joshi
2a4f5b2751
[NFC][LLVM][CodeGen] Namespace related cleanups (#162999)
2025-10-13 07:54:50 -07:00
S. VenkataKeerthy
b32710a56b
[MIR2Vec] Added create factory methods for Vocabulary (#162569)
Added factory methods for vocabulary creation. This also fixes the UB issue introduced by #161713.
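
For readers unfamiliar with the pattern, a factory method lets construction fail cleanly instead of producing a half-valid object. The sketch below is a generic illustration only: the class `Vocab`, its `create` signature, and the validation rules are hypothetical, and `std::optional` stands in for whatever error-carrying return type the real patch uses. It does not reproduce the specific UB that the commit fixes, which the message does not describe.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <utility>
#include <vector>

// Hypothetical class; not the LLVM MIRVocabulary interface.
class Vocab {
public:
  // Validates the input first and only then constructs the object, so a
  // Vocab that exists is always non-empty with a uniform dimension.
  static std::optional<Vocab> create(std::vector<std::vector<float>> Rows) {
    if (Rows.empty())
      return std::nullopt;
    const std::size_t Dim = Rows.front().size();
    if (Dim == 0)
      return std::nullopt;
    for (const auto &Row : Rows)
      if (Row.size() != Dim)
        return std::nullopt;
    return Vocab(std::move(Rows), Dim);
  }

  std::size_t size() const { return Entries.size(); }
  std::size_t dimension() const { return Dim; }

private:
  // Private: callers must go through create().
  Vocab(std::vector<std::vector<float>> Rows, std::size_t D)
      : Entries(std::move(Rows)), Dim(D) {}

  std::vector<std::vector<float>> Entries;
  std::size_t Dim;
};
```

Keeping the constructor private means every caller has to check the result of `create`, so invalid input is rejected before any member is read.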
2025-10-09 00:12:07 -07:00
S. VenkataKeerthy
566040e135
[MIR2Vec] Refactor MIR vocabulary to use opcode-based indexing (#161713)
Refactor MIRVocabulary to improve opcode lookup and add a Section enum for better organization. This is useful for embedder lookups in upcoming patches.

(Tracking issue - #141817)
2025-10-07 16:44:45 -07:00
S. VenkataKeerthy
879f8616ef
[IR2Vec] Initial infrastructure for MIR2Vec (#161463)
This PR introduces the initial infrastructure and vocabulary necessary for generating embeddings for MIR (discussed briefly in the earlier IR2Vec RFC - https://discourse.llvm.org/t/rfc-enhancing-mlgo-inlining-with-ir2vec-embeddings). The MIR2Vec embeddings are useful in driving target-specific optimizations that work on MIR, such as register allocation.

(Tracking issue - #141817)
2025-10-07 13:45:20 -07:00