Skip to main content

MLIR Intermediate Representation

Overview

The Ora compiler includes a comprehensive MLIR (Multi-Level Intermediate Representation) lowering system that provides an intermediate representation for advanced analysis, optimization, and alternative code generation paths. This system complements the primary Yul backend and enables sophisticated compiler transformations.

What is MLIR?

MLIR (Multi-Level Intermediate Representation) is a compiler infrastructure developed by the LLVM community (https://mlir.llvm.org) that provides:

  • Multi-level representation: Support for different abstraction levels
  • Extensible dialects: Custom operation and type definitions
  • Optimization passes: Pluggable transformation framework
  • Analysis infrastructure: Data flow and control flow analysis

The Ora compiler integrates with the existing MLIR framework to provide Ora-specific lowering and analysis capabilities.

Ora MLIR Integration

The Ora compiler integrates with the LLVM MLIR framework (https://mlir.llvm.org) to:

  1. Represent Ora semantics in a structured intermediate form using MLIR operations and types
  2. Enable advanced analysis for verification and optimization using MLIR's pass infrastructure
  3. Support alternative backends beyond Yul by leveraging MLIR's code generation capabilities
  4. Provide debugging information with source location preservation using MLIR's location tracking

Our implementation provides Ora-specific dialects, operations, and lowering passes that work within the existing MLIR ecosystem.

Type System Mapping

Primitive Types

Ora primitive types are mapped to MLIR as follows:

Ora TypeMLIR TypeNotes
u8 - u256iNN-bit unsigned integers
i8 - i256iNN-bit signed integers
booli1Single bit boolean
addressi160With ora.address attribute
string!ora.stringOra dialect type
bytes!ora.bytesOra dialect type
void()MLIR void type

Complex Types

Ora TypeMLIR TypeDescription
[T; N]memref<NxT, space>Fixed-size arrays with memory space
slice[T]!ora.slice<T>Dynamic slices
map[K, V]!ora.map<K, V>Key-value mappings
doublemap[K1, K2, V]!ora.doublemap<K1, K2, V>Nested mappings
struct { ... }!llvm.struct<...>Struct types
enum!ora.enum<name, repr>Enumeration types
!T!ora.error<T>Error types
!T1 | T2!ora.error_union<T1, T2>Error unions

Memory Region Semantics

Ora's memory regions are represented in MLIR using memory spaces:

Storage (Space 1)

%storage_var = memref.alloca() {ora.region = "storage"} : memref<1xi256, 1>
  • Persistent contract state
  • Transactional semantics
  • Gas costs for access

Memory (Space 0)

%memory_var = memref.alloca() : memref<1xi256, 0>
  • Transient execution memory
  • Cleared between calls
  • Lower gas costs

TStore (Space 2)

%tstore_var = memref.alloca() {ora.region = "tstore"} : memref<1xi256, 2>
  • Transient storage window
  • Temporary persistence
  • Intermediate gas costs

Expression Lowering

Arithmetic Operations

Ora arithmetic expressions are lowered to MLIR arith dialect operations:

let result = a + b * c;
%mul = arith.muli %b, %c : i256
%add = arith.addi %a, %mul : i256

Comparison Operations

let is_greater = x > y;
%cmp = arith.cmpi sgt, %x, %y : i256

Logical Operations

Logical operators use short-circuit evaluation:

let result = condition1 && condition2;
%result = scf.if %condition1 -> i1 {
scf.yield %condition2 : i1
} else {
%false = arith.constant false
scf.yield %false : i1
}

Field Access

Struct field access uses LLVM operations:

let value = my_struct.field;
%value = llvm.extractvalue %my_struct[0] : !llvm.struct<(i256, i256)>

Function Calls

let result = my_function(arg1, arg2);
%result = func.call @my_function(%arg1, %arg2) : (i256, i256) -> i256

Statement Lowering

Control Flow

If Statements

if (condition) {
// then block
} else {
// else block
}
scf.if %condition {
// then region
} else {
// else region
}

While Loops

while (condition) {
// body
}
scf.while (%arg0 = %initial) : (i256) -> () {
%cond = // evaluate condition
scf.condition(%cond)
} do {
// loop body
scf.yield %next_value : i256
}

For Loops

for (array) |item| {
// body
}
%c0 = arith.constant 0 : index
%c1 = arith.constant 1 : index
%len = // array length
scf.for %i = %c0 to %len step %c1 {
%item = memref.load %array[%i] : memref<?xi256>
// loop body
}

Switch Statements

switch (value) {
case 1 => // handle 1
case 2...5 => // handle range
else => // default case
}
cf.switch %value : i256, [
default: ^default,
1: ^case1,
2: ^case2_5,
3: ^case2_5,
4: ^case2_5,
5: ^case2_5
]

Ora-Specific Features

Move Statements

move 100 from sender to receiver;
%amount = arith.constant 100 : i256
%move_op = ora.move %amount from %sender to %receiver {ora.move = true}

Log Statements

log Transfer(from: sender, to: receiver, amount: value);
ora.log "Transfer"(%sender, %receiver, %value) {
indexed = [true, true, false]
} : (i160, i160, i256) -> ()

Try-Catch Blocks

try {
risky_operation();
} catch (error) {
handle_error(error);
}
%result = scf.execute_region -> !ora.error<()> {
%success = func.call @risky_operation() : () -> !ora.error<()>
scf.yield %success : !ora.error<()>
}
%is_error = ora.is_error %result : !ora.error<()>
scf.if %is_error {
%error = ora.unwrap_error %result : !ora.error<()>
func.call @handle_error(%error) : (()) -> ()
}

Verification Features

Old Expressions

ensures balance == old(balance) + amount;
%old_balance = // captured at function entry
%postcond = arith.cmpi eq, %balance, %expected : i256
ora.assert %postcond {ora.ensures = true, ora.old = %old_balance}

Quantified Expressions

requires forall(i in 0...array.length) array[i] > 0;
%all_positive = ora.forall %i in %range {
%elem = memref.load %array[%i] : memref<?xi256>
%zero = arith.constant 0 : i256
%positive = arith.cmpi sgt, %elem, %zero : i256
ora.yield %positive : i1
} : (index) -> i1
ora.assert %all_positive {ora.requires = true}

Function Contracts

Requires Clauses

fn transfer(amount: u256) 
requires(amount > 0)
requires(balance >= amount)
{
// function body
}
func.func @transfer(%amount: i256) {
%zero = arith.constant 0 : i256
%amount_positive = arith.cmpi sgt, %amount, %zero : i256
ora.assert %amount_positive {ora.requires = true}

%balance = // load balance
%sufficient = arith.cmpi uge, %balance, %amount : i256
ora.assert %sufficient {ora.requires = true}

// function body
}

Ensures Clauses

fn transfer(amount: u256)
ensures balance == old(balance) - amount
{
// function body
}
func.func @transfer(%amount: i256) {
%old_balance = memref.load %balance_ref : memref<1xi256>

// function body

%new_balance = memref.load %balance_ref : memref<1xi256>
%expected = arith.subi %old_balance, %amount : i256
%postcond = arith.cmpi eq, %new_balance, %expected : i256
ora.assert %postcond {ora.ensures = true}
}

Compiler Integration

CLI Usage

The Ora compiler uses a modern, streamlined command interface:

Compile and emit bytecode:

ora contract.ora
ora --emit-bytecode contract.ora

View MLIR IR:

ora --emit-mlir contract.ora

View Yul intermediate code:

ora --emit-yul contract.ora

Advanced options:

# Disable automatic MLIR validation (not recommended)
ora --no-validate-mlir contract.ora

# Use custom MLIR optimization passes
ora --mlir-passes="canonicalize,cse,mem2reg" contract.ora

Automatic MLIR Validation

The compiler automatically validates MLIR correctness before Yul lowering:

$ ora contract.ora
Parsing contract.ora...
Performing semantic analysis...
Lowering to MLIR...
Validating MLIR before Yul lowering...
✅ MLIR validation passed
Lowering to Yul...
Compiling to EVM bytecode...
Successfully compiled to EVM bytecode!

If validation fails, compilation stops immediately:

❌ MLIR validation failed with 3 error(s):
- [TypeMismatch] Expected i256, found i1
- [MalformedAst] Missing operand for arith.addi
- [InvalidRegion] Block has no terminator

Validation is automatic to catch errors early in the compilation pipeline.

Build Integration

The MLIR lowering pipeline:

  1. Lexing and Parsing: Source code → AST
  2. Semantic Analysis: Type checking, builtin validation
  3. MLIR Lowering: AST → MLIR module (with source locations)
  4. MLIR Validation: Structural and semantic checks (automatic)
  5. MLIR Optimization: CSE, canonicalization, mem2reg, SCCP, LICM
  6. Yul Lowering: MLIR → Yul (EVM intermediate language)
  7. Bytecode Generation: Yul → EVM bytecode

SSA Transformation

Ora to SSA

While Ora allows mutable local variables, the MLIR representation uses Static Single Assignment (SSA) form internally:

Ora Code (Mutable Variables):

pub fn calculate(x: u256) -> u256 {
let result = x; // Mutable local variable
result = result + 10;
result = result * 2;
return result;
}

MLIR Representation (SSA with Memory):

func.func @calculate(%arg0: i256) -> i256 {
%0 = memref.alloca() : memref<1xi256> // Stack allocation
memref.store %arg0, %0[] : memref<1xi256> // Store initial value

%1 = memref.load %0[] : memref<1xi256> // Load for +10
%c10 = arith.constant 10 : i256
%2 = arith.addi %1, %c10 : i256
memref.store %2, %0[] : memref<1xi256> // Store back

%3 = memref.load %0[] : memref<1xi256> // Load for *2
%c2 = arith.constant 2 : i256
%4 = arith.muli %3, %c2 : i256
memref.store %4, %0[] : memref<1xi256> // Store back

%5 = memref.load %0[] : memref<1xi256> // Load final result
return %5 : i256
}

After mem2reg Optimization Pass:

func.func @calculate(%arg0: i256) -> i256 {
%c10 = arith.constant 10 : i256
%0 = arith.addi %arg0, %c10 : i256 // Pure SSA!
%c2 = arith.constant 2 : i256
%1 = arith.muli %0, %c2 : i256 // Pure SSA!
return %1 : i256
}

SSA Benefits

  1. Optimization: Standard MLIR passes (CSE, SCCP, dead code elimination) work directly
  2. Analysis: Data flow analysis is straightforward in SSA form
  3. Type Safety: Each SSA value has a single, well-defined type
  4. Gas Efficiency: mem2reg eliminates redundant loads/stores

Memory Regions vs SSA

Local Variables (SSA via mem2reg):

  • Function-scope variables
  • memref.alloca → SSA values
  • Optimized away in final code

Storage Variables (NOT SSA):

  • Contract state (persistent)
  • ora.sload / ora.sstore operations
  • Direct EVM SLOAD/SSTORE opcodes
  • Cannot be optimized away
// Local variable (SSA):
%local = arith.constant 100 : i256

// Storage variable (NOT SSA):
%slot = arith.constant 0 : i256
%value = ora.sload %slot : i256
ora.sstore %slot, %value : i256

Standard Library Integration

Built-in Lowering

Ora's standard library built-ins are lowered to custom ora.evm.* MLIR operations:

Ora Code:

let timestamp = std.block.timestamp();
let sender = std.msg.sender();

MLIR:

%timestamp = ora.evm.timestamp() : i256 loc("contract.ora":10:20)
%sender = ora.evm.caller() : i160 loc("contract.ora":11:17)

Yul:

let timestamp := timestamp()
let sender := caller()

Zero-Overhead Guarantee

All standard library calls are inlined at the MLIR level - no function call overhead:

Ora Built-inMLIR OperationYul OpcodeOverhead
std.block.timestamp()ora.evm.timestamptimestamp()Zero
std.msg.sender()ora.evm.callercaller()Zero
std.constants.U256_MAXarith.constant -10xfff...fffZero

Semantic Validation

Built-in calls are validated during semantic analysis:

Valid:

let sender = std.msg.sender();  // Correct usage

Invalid (caught at compile time):

let sender = std.msg.sender;    // Error: missing ()
let invalid = std.block.fake(); // Error: unknown built-in

Debugging and Analysis

Source Location Preservation

All MLIR operations preserve exact source location information (97% coverage):

%result = arith.addi %a, %b : i256 loc("contract.ora":15:8)

What's tracked:

  • Filename (e.g., contract.ora)
  • Line number (e.g., 15)
  • Column number (e.g., 8)

Use cases:

  • Error reporting with exact source position
  • Debugging with source-level stepping
  • Coverage analysis
  • Profiling with source attribution

Error Reporting

MLIR validation provides detailed error messages:

❌ MLIR validation failed with 2 error(s):
- [TypeMismatch] Expected i256, found i1
at contract.ora:42:15
- [MalformedAst] Missing terminator in block
at contract.ora:50:1

Optimization Passes

Standard MLIR passes optimize the IR:

PassPurposeBenefit
CSECommon Subexpression EliminationReduce duplicate computations
CanonicalizationSimplify IR patternsCleaner code
mem2regConvert memory to SSA valuesEliminate loads/stores
SCCPSparse Conditional Constant PropagationConstant folding
LICMLoop-Invariant Code MotionHoist invariants

Result: Significant gas savings in generated EVM bytecode!

Performance Characteristics

The MLIR lowering system is designed for:

  • Fast compilation: Efficient lowering algorithms
  • Memory efficiency: Minimal overhead during compilation
  • Scalability: Handles large smart contracts
  • Deterministic output: Consistent results for testing

Implementation Features

Core Features

FeatureCoverage
AST → MLIR LoweringComplete
Source Location Tracking97%
Automatic ValidationYes
Type System MappingComplete
Standard Library17 built-ins
SSA Transformationmem2reg pass
Optimization Passes5 passes (CSE, canonicalization, mem2reg, SCCP, LICM)
Storage Operationssload, sstore, tload, tstore
Control Flowif, while, for, switch
Function CallsComplete
Yul LoweringComplete

In Development

  • Formal verification integration
  • Custom Ora dialect registration
  • Advanced type features (generics, constraints)

Future

  • Advanced analysis (loop, alias, escape)
  • Gas cost modeling
  • Alternative backends (LLVM IR, WebAssembly)
  • IDE integration

Examples

ERC20 Token Contract

contract SimpleToken {
storage totalSupply: u256;
storage balances: map[address, u256];
storage allowances: doublemap[address, address, u256];

pub fn initialize(initialSupply: u256) -> bool {
let deployer = std.msg.sender();
totalSupply = initialSupply;
balances[deployer] = initialSupply;
return true;
}

pub fn transfer(recipient: address, amount: u256) -> bool {
let sender = std.msg.sender();
let senderBalance = balances[sender];

if (recipient == std.constants.ZERO_ADDRESS) {
return false;
}

if (senderBalance < amount) {
return false;
}

balances[sender] = senderBalance - amount;
let recipientBalance = balances[recipient];
balances[recipient] = recipientBalance + amount;

return true;
}
}

This example demonstrates:

  • Zero-overhead built-ins (std.msg.sender()CALLER opcode)
  • Storage operations (balances[sender]ora.sload with keccak256)
  • Type safety (all operations type-checked)
  • SSA form (local variables via mem2reg)
  • Source locations (every operation tagged with source position)

The MLIR representation enables analysis and optimization while maintaining semantic integrity.

MLIR Resources

Learn More About MLIR

Ora MLIR Implementation

  • Source Code: Available in the src/mlir/ directory of the Ora repository
  • Technical Documentation: See docs/mlir-lowering.md for implementation details
  • API Reference: See docs/mlir-api.md for developer documentation
  • Troubleshooting: See docs/mlir-troubleshooting.md for common issues

The Ora MLIR integration leverages the powerful MLIR framework developed by the LLVM community to provide advanced compiler capabilities for smart contract development.