MLIR Intermediate Representation
Overview
The Ora compiler includes an MLIR (Multi-Level Intermediate Representation) lowering system that serves as the intermediate representation for analysis, optimization, and alternative code generation paths. It complements the primary Yul backend and gives compiler transformations a structured form to operate on.
What is MLIR?
MLIR (Multi-Level Intermediate Representation) is a compiler infrastructure developed by the LLVM community (https://mlir.llvm.org) that provides:
- Multi-level representation: Support for different abstraction levels
- Extensible dialects: Custom operation and type definitions
- Optimization passes: Pluggable transformation framework
- Analysis infrastructure: Data flow and control flow analysis
The Ora compiler integrates with the existing MLIR framework to provide Ora-specific lowering and analysis capabilities.
Ora MLIR Integration
The Ora compiler integrates with the LLVM MLIR framework (https://mlir.llvm.org) to:
- Represent Ora semantics in a structured intermediate form using MLIR operations and types
- Enable advanced analysis for verification and optimization using MLIR's pass infrastructure
- Support alternative backends beyond Yul by leveraging MLIR's code generation capabilities
- Provide debugging information with source location preservation using MLIR's location tracking
Our implementation provides Ora-specific dialects, operations, and lowering passes that work within the existing MLIR ecosystem.
Type System Mapping
Primitive Types
Ora primitive types are mapped to MLIR as follows:
| Ora Type | MLIR Type | Notes |
|---|---|---|
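| u8 - u256 | iN | N-bit unsigned integers (MLIR iN is signless; signedness is carried by the operations) |
| i8 - i256 | iN | N-bit signed integers (same signless iN type) |
| bool | i1 | Single bit boolean |
| address | i160 | With ora.address attribute |
| string | !ora.string | Ora dialect type |
| bytes | !ora.bytes | Ora dialect type |
| void | () | Empty result list (MLIR has no built-in void type) |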
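The primitive mapping in the table can be sketched as a small lookup. This is an illustrative Python model, not the compiler's actual API: the function name `ora_to_mlir_type` and the accepted set of integer widths (8, 16, 32, 64, 128, 256) are assumptions made for the example.

```python
import re

# Hypothetical helper that mirrors the table above; not the Ora compiler's API.
_INT_PATTERN = re.compile(r"^[ui](\d+)$")
_ASSUMED_WIDTHS = {8, 16, 32, 64, 128, 256}  # assumption: power-of-two widths only

_FIXED = {
    "bool": "i1",
    "address": "i160",
    "string": "!ora.string",
    "bytes": "!ora.bytes",
    "void": "()",
}


def ora_to_mlir_type(ora_type: str) -> str:
    """Map an Ora primitive type name to its MLIR type spelling."""
    if ora_type in _FIXED:
        return _FIXED[ora_type]
    match = _INT_PATTERN.match(ora_type)
    if match and int(match.group(1)) in _ASSUMED_WIDTHS:
        # Signed and unsigned Ora integers share the same signless iN type.
        return f"i{match.group(1)}"
    raise ValueError(f"not a primitive Ora type: {ora_type}")
```

Because both `u256` and `i256` map to `i256`, signedness has to be recovered from the operations that use the value (for example, `ugt` versus `sgt` in a comparison).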
Complex Types
| Ora Type | MLIR Type | Description |
|---|---|---|
| [T; N] | memref<NxT, space> | Fixed-size arrays with memory space |
| slice[T] | !ora.slice<T> | Dynamic slices |
| map[K, V] | !ora.map<K, V> | Key-value mappings |
| doublemap[K1, K2, V] | !ora.doublemap<K1, K2, V> | Nested mappings |
| struct { ... } | !llvm.struct<...> | Struct types |
| enum | !ora.enum<name, repr> | Enumeration types |
| !T | !ora.error<T> | Error types |
| !T1 \| T2 | !ora.error_union<T1, T2> | Error unions |
Memory Region Semantics
Ora's memory regions are represented in MLIR using memory spaces:
Storage (Space 1)
%storage_var = memref.alloca() {ora.region = "storage"} : memref<1xi256, 1>
- Persistent contract state
- Transactional semantics
- Gas costs for access
Memory (Space 0)
%memory_var = memref.alloca() : memref<1xi256, 0>
- Transient execution memory
- Cleared between calls
- Lower gas costs
TStore (Space 2)
%tstore_var = memref.alloca() {ora.region = "tstore"} : memref<1xi256, 2>
- Transient storage window
- Temporary persistence
- Intermediate gas costs
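The region-to-memory-space convention above can be summarized in a few lines. The sketch below is a Python illustration only; the helper name `memref_type` is hypothetical, while the region names and space numbers come from the section above.

```python
# Region names and space numbers follow the section above;
# the helper itself is a hypothetical illustration.
MEMORY_SPACES = {"memory": 0, "storage": 1, "tstore": 2}


def memref_type(region: str, element: str = "i256", size: int = 1) -> str:
    """Spell the memref type for a variable that lives in the given region."""
    space = MEMORY_SPACES[region]  # raises KeyError for an unknown region
    return f"memref<{size}x{element}, {space}>"
```

For example, `memref_type("storage")` produces the type used for `%storage_var` in the Storage snippet.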
Expression Lowering
Arithmetic Operations
Ora arithmetic expressions are lowered to MLIR arith dialect operations:
let result = a + b * c;
%mul = arith.muli %b, %c : i256
%add = arith.addi %a, %mul : i256
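One way to picture this lowering is a post-order walk over the expression tree that emits one operation per node. The Python sketch below is a simplified model under stated assumptions: the `Var` and `BinOp` classes, the `lower` function, and the numeric result names are all hypothetical, and every value is assumed to be `i256`.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Var:
    name: str


@dataclass
class BinOp:
    op: str  # "+" or "*"
    left: "Expr"
    right: "Expr"


Expr = Union[Var, BinOp]

ARITH_OPS = {"+": "arith.addi", "*": "arith.muli"}


def lower(expr: Expr, out: List[str]) -> str:
    """Emit ops for expr into out (operands first) and return its SSA name."""
    if isinstance(expr, Var):
        return f"%{expr.name}"
    lhs = lower(expr.left, out)
    rhs = lower(expr.right, out)
    result = f"%{len(out)}"  # fresh name: one result per emitted op
    out.append(f"{result} = {ARITH_OPS[expr.op]} {lhs}, {rhs} : i256")
    return result


# a + b * c: the multiplication is emitted before the addition that uses it.
ops: List[str] = []
lower(BinOp("+", Var("a"), BinOp("*", Var("b"), Var("c"))), ops)
```

Operator precedence is already encoded in the tree shape, so the emitter itself needs no precedence logic.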
Comparison Operations
let is_greater = x > y;
%cmp = arith.cmpi sgt, %x, %y : i256
Logical Operations
Logical operators use short-circuit evaluation:
let result = condition1 && condition2;
%result = scf.if %condition1 -> i1 {
scf.yield %condition2 : i1
} else {
%false = arith.constant false
scf.yield %false : i1
}
Field Access
Struct field access uses LLVM operations:
let value = my_struct.field;
%value = llvm.extractvalue %my_struct[0] : !llvm.struct<(i256, i256)>
Function Calls
let result = my_function(arg1, arg2);
%result = func.call @my_function(%arg1, %arg2) : (i256, i256) -> i256
Statement Lowering
Control Flow
If Statements
if (condition) {
// then block
} else {
// else block
}
scf.if %condition {
// then region
} else {
// else region
}
While Loops
while (condition) {
// body
}
scf.while (%arg0 = %initial) : (i256) -> i256 {
%cond = // evaluate condition
scf.condition(%cond) %arg0 : i256
} do {
^bb0(%arg1: i256):
// loop body
scf.yield %next_value : i256
}
For Loops
for (array) |item| {
// body
}
%c0 = arith.constant 0 : index
%c1 = arith.constant 1 : index
%len = // array length
scf.for %i = %c0 to %len step %c1 {
%item = memref.load %array[%i] : memref<?xi256>
// loop body
}
Switch Statements
switch (value) {
case 1 => // handle 1
case 2...5 => // handle range
else => // default case
}
cf.switch %value : i256, [
default: ^default,
1: ^case1,
2: ^case2_5,
3: ^case2_5,
4: ^case2_5,
5: ^case2_5
]
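Since the cf.switch above lists one entry per value, a range case such as `2...5` has to be expanded into individual values that share a target block. The following Python sketch shows that expansion; `expand_cases` and the tuple encoding of ranges are assumptions for illustration, and the range is treated as inclusive to match the lowering shown above.

```python
from typing import Dict, List, Tuple, Union

# A case pattern is a single value or an inclusive (low, high) range.
Case = Union[int, Tuple[int, int]]


def expand_cases(cases: List[Tuple[Case, str]]) -> Dict[int, str]:
    """Expand range cases into one (value -> block label) entry per value."""
    table: Dict[int, str] = {}
    for pattern, block in cases:
        low, high = pattern if isinstance(pattern, tuple) else (pattern, pattern)
        for value in range(low, high + 1):
            if value in table:
                raise ValueError(f"duplicate case value {value}")
            table[value] = block
    return table
```

Expanding large ranges this way grows the switch table linearly, so a real lowering might prefer comparison chains for wide ranges; the source does not say which strategy Ora uses in that case.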
Ora-Specific Features
Move Statements
move 100 from sender to receiver;
%amount = arith.constant 100 : i256
%move_op = ora.move %amount from %sender to %receiver {ora.move = true}
Log Statements
log Transfer(from: sender, to: receiver, amount: value);
ora.log "Transfer"(%sender, %receiver, %value) {
indexed = [true, true, false]
} : (i160, i160, i256) -> ()
Try-Catch Blocks
try {
risky_operation();
} catch (error) {
handle_error(error);
}
%result = scf.execute_region -> !ora.error<()> {
%success = func.call @risky_operation() : () -> !ora.error<()>
scf.yield %success : !ora.error<()>
}
%is_error = ora.is_error %result : !ora.error<()>
scf.if %is_error {
%error = ora.unwrap_error %result : !ora.error<()>
func.call @handle_error(%error) : (()) -> ()
}
Verification Features
Old Expressions
ensures balance == old(balance) + amount;
%old_balance = // captured at function entry
%expected = arith.addi %old_balance, %amount : i256
%postcond = arith.cmpi eq, %balance, %expected : i256
ora.assert %postcond {ora.ensures = true, ora.old = %old_balance}
Quantified Expressions
requires forall(i in 0...array.length) array[i] > 0;
%all_positive = ora.forall %i in %range {
%elem = memref.load %array[%i] : memref<?xi256>
%zero = arith.constant 0 : i256
%positive = arith.cmpi sgt, %elem, %zero : i256
ora.yield %positive : i1
} : (index) -> i1
ora.assert %all_positive {ora.requires = true}
Function Contracts
Requires Clauses
fn transfer(amount: u256)
requires(amount > 0)
requires(balance >= amount)
{
// function body
}
func.func @transfer(%amount: i256) {
%zero = arith.constant 0 : i256
%amount_positive = arith.cmpi ugt, %amount, %zero : i256
ora.assert %amount_positive {ora.requires = true}
%balance = // load balance
%sufficient = arith.cmpi uge, %balance, %amount : i256
ora.assert %sufficient {ora.requires = true}
// function body
}
Ensures Clauses
fn transfer(amount: u256)
ensures balance == old(balance) - amount
{
// function body
}
func.func @transfer(%amount: i256) {
%c0 = arith.constant 0 : index
%old_balance = memref.load %balance_ref[%c0] : memref<1xi256>
// function body
%new_balance = memref.load %balance_ref[%c0] : memref<1xi256>
%expected = arith.subi %old_balance, %amount : i256
%postcond = arith.cmpi eq, %new_balance, %expected : i256
ora.assert %postcond {ora.ensures = true}
}
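The ordering of checks in the lowered contracts (preconditions at entry, `old` values captured before the body, postconditions after it) can be modeled in plain Python. This is a behavioral sketch only: the `Token` class is a hypothetical stand-in for contract state, and Python assertions stand in for `ora.assert`.

```python
class Token:
    """Toy stand-in for contract state; not Ora's runtime."""

    def __init__(self, balance: int) -> None:
        self.balance = balance

    def transfer(self, amount: int) -> None:
        # requires clauses become assertions at function entry
        assert amount > 0, "requires(amount > 0)"
        assert self.balance >= amount, "requires(balance >= amount)"
        # old(balance) is captured before the body runs
        old_balance = self.balance
        # function body
        self.balance -= amount
        # the ensures clause is checked after the body
        assert self.balance == old_balance - amount, "ensures failed"
```

Capturing `old_balance` before the body matters: reading the balance after the update would make the postcondition compare the new value with itself.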
Compiler Integration
CLI Usage
The Ora compiler exposes the MLIR pipeline through the following command-line options:
Compile and emit bytecode:
ora contract.ora
ora --emit-bytecode contract.ora
View MLIR IR:
ora --emit-mlir contract.ora
View Yul intermediate code:
ora --emit-yul contract.ora
Advanced options:
# Disable automatic MLIR validation (not recommended)
ora --no-validate-mlir contract.ora
# Use custom MLIR optimization passes
ora --mlir-passes="canonicalize,cse,mem2reg" contract.ora
Automatic MLIR Validation
The compiler automatically validates MLIR correctness before Yul lowering:
$ ora contract.ora
Parsing contract.ora...
Performing semantic analysis...
Lowering to MLIR...
Validating MLIR before Yul lowering...
✅ MLIR validation passed
Lowering to Yul...
Compiling to EVM bytecode...
Successfully compiled to EVM bytecode!
If validation fails, compilation stops immediately:
❌ MLIR validation failed with 3 error(s):
- [TypeMismatch] Expected i256, found i1
- [MalformedAst] Missing operand for arith.addi
- [InvalidRegion] Block has no terminator
Validation is automatic to catch errors early in the compilation pipeline.
Build Integration
The MLIR lowering pipeline:
- Lexing and Parsing: Source code → AST
- Semantic Analysis: Type checking, builtin validation
- MLIR Lowering: AST → MLIR module (with source locations)
- MLIR Validation: Structural and semantic checks (automatic)
- MLIR Optimization: CSE, canonicalization, mem2reg, SCCP, LICM
- Yul Lowering: MLIR → Yul (EVM intermediate language)
- Bytecode Generation: Yul → EVM bytecode
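The key property of this pipeline is that a failed validation stops compilation before any later stage runs. A minimal Python sketch of that control flow follows; `run_pipeline`, `validate`, `ValidationError`, and the string-based artifacts are all hypothetical and only model the ordering, not the real stages.

```python
from typing import Callable, List, Tuple

Stage = Tuple[str, Callable[[str], str]]


class ValidationError(Exception):
    """Raised by the validation stage to abort the pipeline."""


def validate(module: str) -> str:
    # Stand-in check: a real validator would inspect types, regions, terminators.
    if "<error>" in module:
        raise ValidationError("MLIR validation failed")
    return module


def run_pipeline(source: str, stages: List[Stage]) -> Tuple[str, List[str]]:
    """Run stages in order; an exception from any stage stops the pipeline."""
    log: List[str] = []
    artifact = source
    for name, stage in stages:
        log.append(name)
        artifact = stage(artifact)  # later stages never run if this raises
    return artifact, log
```

Placing validation before optimization and Yul lowering means those stages can assume well-formed input.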
SSA Transformation
Ora to SSA
While Ora allows mutable local variables, the MLIR representation uses Static Single Assignment (SSA) form internally:
Ora Code (Mutable Variables):
pub fn calculate(x: u256) -> u256 {
let result = x; // Mutable local variable
result = result + 10;
result = result * 2;
return result;
}
MLIR Representation (SSA with Memory):
func.func @calculate(%arg0: i256) -> i256 {
%0 = memref.alloca() : memref<i256> // Stack allocation (rank-0 memref)
memref.store %arg0, %0[] : memref<i256> // Store initial value
%1 = memref.load %0[] : memref<i256> // Load for +10
%c10 = arith.constant 10 : i256
%2 = arith.addi %1, %c10 : i256
memref.store %2, %0[] : memref<i256> // Store back
%3 = memref.load %0[] : memref<i256> // Load for *2
%c2 = arith.constant 2 : i256
%4 = arith.muli %3, %c2 : i256
memref.store %4, %0[] : memref<i256> // Store back
%5 = memref.load %0[] : memref<i256> // Load final result
return %5 : i256
}
After mem2reg Optimization Pass:
func.func @calculate(%arg0: i256) -> i256 {
%c10 = arith.constant 10 : i256
%0 = arith.addi %arg0, %c10 : i256 // Pure SSA!
%c2 = arith.constant 2 : i256
%1 = arith.muli %0, %c2 : i256 // Pure SSA!
return %1 : i256
}
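The effect of mem2reg on this function can be reproduced with a toy promotion pass. The Python sketch below is a deliberately simplified model, not MLIR's implementation: it handles only straight-line code with slots that never escape, whereas the real pass also deals with control flow by introducing block arguments. The tuple encoding of operations is an assumption made for the example.

```python
from typing import Dict, List, Tuple

# Op encodings (hypothetical):
#   ("alloca", slot) | ("store", value, slot) | ("load", result, slot)
#   ("compute", result, op_name, operands) | ("return", value)
Op = Tuple


def mem2reg(ops: List[Op]) -> List[Op]:
    """Promote straight-line alloca/store/load traffic to direct SSA uses."""
    current: Dict[str, str] = {}  # slot -> SSA value most recently stored
    alias: Dict[str, str] = {}    # load result -> value it stands for
    out: List[Op] = []

    def resolve(value: str) -> str:
        return alias.get(value, value)

    for op in ops:
        kind = op[0]
        if kind == "alloca":
            continue  # the slot disappears entirely
        if kind == "store":
            _, value, slot = op
            current[slot] = resolve(value)
        elif kind == "load":
            _, result, slot = op
            alias[result] = current[slot]
        elif kind == "compute":
            _, result, name, operands = op
            out.append(("compute", result, name, [resolve(v) for v in operands]))
        elif kind == "return":
            out.append(("return", resolve(op[1])))
    return out
```

Fed the load/store sequence of `@calculate` (with constants treated as already defined names), the pass leaves only the add, the multiply, and the return, matching the shape of the optimized function above.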
SSA Benefits
- Optimization: Standard MLIR passes (CSE, SCCP, dead code elimination) work directly
- Analysis: Data flow analysis is straightforward in SSA form
- Type Safety: Each SSA value has a single, well-defined type
- Gas Efficiency: mem2reg eliminates redundant loads/stores
Memory Regions vs SSA
Local Variables (SSA via mem2reg):
- Function-scope variables
- memref.alloca → SSA values
- Optimized away in final code
Storage Variables (NOT SSA):
- Contract state (persistent)
- ora.sload / ora.sstore operations
- Direct EVM SLOAD/SSTORE opcodes
- Cannot be optimized away
// Local variable (SSA):
%local = arith.constant 100 : i256
// Storage variable (NOT SSA):
%slot = arith.constant 0 : i256
%value = ora.sload %slot : i256
ora.sstore %slot, %value : i256
Standard Library Integration
Built-in Lowering
Ora's standard library built-ins are lowered to custom ora.evm.* MLIR operations:
Ora Code:
let timestamp = std.block.timestamp();
let sender = std.msg.sender();
MLIR:
%timestamp = ora.evm.timestamp() : i256 loc("contract.ora":10:20)
%sender = ora.evm.caller() : i160 loc("contract.ora":11:17)
Yul:
let timestamp := timestamp()
let sender := caller()
Zero-Overhead Guarantee
All standard library calls are inlined at the MLIR level - no function call overhead:
| Ora Built-in | MLIR Operation | Yul Opcode | Overhead |
|---|---|---|---|
| std.block.timestamp() | ora.evm.timestamp | timestamp() | Zero |
| std.msg.sender() | ora.evm.caller | caller() | Zero |
| std.constants.U256_MAX | arith.constant -1 | 0xfff...fff | Zero |
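The U256_MAX row relies on two's-complement arithmetic: in a 256-bit signless integer, the bit pattern of -1 is all ones, which is the largest unsigned value. A short Python check of that identity (the helper name `as_unsigned` is hypothetical):

```python
BITS = 256
U256_MAX = (1 << BITS) - 1  # 2**256 - 1, i.e. 64 hex digits of 'f'


def as_unsigned(value: int, bits: int = BITS) -> int:
    """Reinterpret a (possibly negative) integer as an unsigned bit pattern."""
    return value & ((1 << bits) - 1)
```

Here `as_unsigned(-1)` equals `U256_MAX`, which is why a single `arith.constant -1` of type `i256` can represent the constant.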
Semantic Validation
Built-in calls are validated during semantic analysis:
✅ Valid:
let sender = std.msg.sender(); // Correct usage
❌ Invalid (caught at compile time):
let sender = std.msg.sender; // Error: missing ()
let invalid = std.block.fake(); // Error: unknown built-in
Debugging and Analysis
Source Location Preservation
All MLIR operations preserve exact source location information (97% coverage):
%result = arith.addi %a, %b : i256 loc("contract.ora":15:8)
What's tracked:
- Filename (e.g., contract.ora)
- Line number (e.g., 15)
- Column number (e.g., 8)
Use cases:
- Error reporting with exact source position
- Debugging with source-level stepping
- Coverage analysis
- Profiling with source attribution
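Tools that consume the emitted IR can recover these positions from the textual `loc(...)` suffix. The Python sketch below parses the file, line, and column form shown above; `parse_loc` and `SourceLoc` are hypothetical names, and other MLIR location kinds (fused, named, unknown) are intentionally not handled.

```python
import re
from typing import NamedTuple


class SourceLoc(NamedTuple):
    filename: str
    line: int
    column: int


# Matches the file:line:column form, e.g. loc("contract.ora":15:8)
_LOC = re.compile(r'loc\("([^"]+)":(\d+):(\d+)\)')


def parse_loc(op_text: str) -> SourceLoc:
    """Extract the file location attached to one textual MLIR operation."""
    match = _LOC.search(op_text)
    if match is None:
        raise ValueError("no file location found")
    return SourceLoc(match.group(1), int(match.group(2)), int(match.group(3)))
```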
Error Reporting
MLIR validation provides detailed error messages:
❌ MLIR validation failed with 2 error(s):
- [TypeMismatch] Expected i256, found i1
at contract.ora:42:15
- [MalformedAst] Missing terminator in block
at contract.ora:50:1
Optimization Passes
Standard MLIR passes optimize the IR:
| Pass | Purpose | Benefit |
|---|---|---|
| CSE | Common Subexpression Elimination | Reduce duplicate computations |
| Canonicalization | Simplify IR patterns | Cleaner code |
| mem2reg | Convert memory to SSA values | Eliminate loads/stores |
| SCCP | Sparse Conditional Constant Propagation | Constant folding |
| LICM | Loop-Invariant Code Motion | Hoist invariants |
Result: fewer redundant computations and memory accesses, which lowers gas usage in the generated EVM bytecode.
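To make the first row of the table concrete, here is a toy common subexpression elimination over straight-line code. It is a Python sketch under simplifying assumptions (every operation is pure, there is no control flow, and operations are encoded as hypothetical `(result, name, operands)` tuples); MLIR's CSE pass additionally accounts for side effects and dominance.

```python
from typing import Dict, List, Tuple

Op = Tuple[str, str, Tuple[str, ...]]  # (result, op name, operands)


def cse(ops: List[Op]) -> List[Op]:
    """Drop pure ops that recompute an already available (name, operands) pair."""
    seen: Dict[Tuple[str, Tuple[str, ...]], str] = {}
    replaced: Dict[str, str] = {}  # eliminated result -> surviving result
    out: List[Op] = []
    for result, name, operands in ops:
        # Rewrite operands first so chains of duplicates are also detected.
        operands = tuple(replaced.get(v, v) for v in operands)
        key = (name, operands)
        if key in seen:
            replaced[result] = seen[key]
        else:
            seen[key] = result
            out.append((result, name, operands))
    return out
```

In a sequence that multiplies `%a` by `%b` twice, the second multiplication is removed and its uses are redirected to the first result.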
Performance Characteristics
The MLIR lowering system is designed for:
- Fast compilation: Efficient lowering algorithms
- Memory efficiency: Minimal overhead during compilation
- Scalability: Handles large smart contracts
- Deterministic output: Consistent results for testing
Implementation Features
Core Features
| Feature | Coverage |
|---|---|
| AST → MLIR Lowering | Complete |
| Source Location Tracking | 97% |
| Automatic Validation | Yes |
| Type System Mapping | Complete |
| Standard Library | 17 built-ins |
| SSA Transformation | mem2reg pass |
| Optimization Passes | 5 passes (CSE, canonicalization, mem2reg, SCCP, LICM) |
| Storage Operations | sload, sstore, tload, tstore |
| Control Flow | if, while, for, switch |
| Function Calls | Complete |
| Yul Lowering | Complete |
In Development
- Formal verification integration
- Custom Ora dialect registration
- Advanced type features (generics, constraints)
Future
- Advanced analysis (loop, alias, escape)
- Gas cost modeling
- Alternative backends (LLVM IR, WebAssembly)
- IDE integration
Examples
ERC20 Token Contract
contract SimpleToken {
storage totalSupply: u256;
storage balances: map[address, u256];
storage allowances: doublemap[address, address, u256];
pub fn initialize(initialSupply: u256) -> bool {
let deployer = std.msg.sender();
totalSupply = initialSupply;
balances[deployer] = initialSupply;
return true;
}
pub fn transfer(recipient: address, amount: u256) -> bool {
let sender = std.msg.sender();
let senderBalance = balances[sender];
if (recipient == std.constants.ZERO_ADDRESS) {
return false;
}
if (senderBalance < amount) {
return false;
}
balances[sender] = senderBalance - amount;
let recipientBalance = balances[recipient];
balances[recipient] = recipientBalance + amount;
return true;
}
}
This example demonstrates:
- Zero-overhead built-ins (std.msg.sender() → CALLER opcode)
- Storage operations (balances[sender] → ora.sload with keccak256)
- Type safety (all operations type-checked)
- SSA form (local variables via mem2reg)
- Source locations (every operation tagged with source position)
The MLIR representation enables analysis and optimization while maintaining semantic integrity.
MLIR Resources
Learn More About MLIR
- MLIR Official Website: https://mlir.llvm.org
- MLIR Documentation: https://mlir.llvm.org/docs/
- MLIR Tutorials: https://mlir.llvm.org/docs/Tutorials/
- LLVM Project: https://llvm.org
Ora MLIR Implementation
- Source Code: Available in the src/mlir/ directory of the Ora repository
- Technical Documentation: See docs/mlir-lowering.md for implementation details
- API Reference: See docs/mlir-api.md for developer documentation
- Troubleshooting: See docs/mlir-troubleshooting.md for common issues
The Ora MLIR integration leverages the powerful MLIR framework developed by the LLVM community to provide advanced compiler capabilities for smart contract development.