MLIR Intermediate Representation

Overview

The Ora compiler lowers its AST to MLIR (Multi-Level Intermediate Representation) and uses that form for analysis, optimization, and alternative code generation paths. From MLIR, the compiler lowers to sensei-ir (SIR), which is the route to EVM bytecode.

What is MLIR?

MLIR (Multi-Level Intermediate Representation) is a compiler infrastructure developed by the LLVM community (https://mlir.llvm.org) that provides:

  • Multi-level representation: Support for different abstraction levels
  • Extensible dialects: Custom operation and type definitions
  • Optimization passes: Pluggable transformation framework
  • Analysis infrastructure: Data flow and control flow analysis

The Ora compiler integrates with the existing MLIR framework to provide Ora-specific lowering and analysis capabilities.

Ora MLIR Integration

The Ora compiler integrates with the LLVM MLIR framework (https://mlir.llvm.org) to:

  1. Represent Ora semantics in a structured intermediate form using MLIR operations and types
  2. Enable advanced analysis for verification and optimization using MLIR's pass infrastructure
  3. Lower to sensei-ir (SIR) for EVM bytecode generation via the sensei-ir backend
  4. Provide debugging information with source location preservation using MLIR's location tracking

Our implementation provides Ora-specific dialects, operations, and lowering passes that work within the existing MLIR ecosystem.

Type System Mapping

Primitive Types

Ora primitive types are mapped to MLIR as follows:

Ora Type     MLIR Type      Notes
u8 - u256    iN             N-bit unsigned integers
i8 - i256    iN             N-bit signed integers
bool         i1             Single-bit boolean
address      i160           With ora.address attribute
string       !ora.string    Ora dialect type
bytes        !ora.bytes     Ora dialect type
void         ()             Empty result list (MLIR has no dedicated void type)
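
The primitive mapping can be modelled as a small lookup. This is a minimal Python sketch, not the compiler's implementation: the name `lower_primitive` is hypothetical, and the accepted width range (8 to 256) is an assumption read off the table.

```python
import re

# Fixed entries copied from the primitive type table.
FIXED_TYPES = {
    "bool": "i1",
    "address": "i160",        # the table also mentions an ora.address attribute
    "string": "!ora.string",
    "bytes": "!ora.bytes",
    "void": "()",
}

def lower_primitive(ora_type: str) -> str:
    """Return the MLIR type spelling for an Ora primitive type name."""
    if ora_type in FIXED_TYPES:
        return FIXED_TYPES[ora_type]
    # u8..u256 and i8..i256 both map to iN: the width survives, the
    # signedness prefix does not appear in the MLIR type.
    match = re.fullmatch(r"[ui]([1-9]\d*)", ora_type)
    if match and 8 <= int(match.group(1)) <= 256:
        return f"i{match.group(1)}"
    raise ValueError(f"not a primitive Ora type: {ora_type!r}")
```

Because `u256` and `i256` both become `i256`, any signedness has to be carried by the operations that consume the value rather than by the type.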

Complex Types

Ora Type                MLIR Type                     Description
[T; N]                  memref<NxT, space>            Fixed-size arrays with memory space
slice[T]                !ora.slice<T>                 Dynamic slices
map[K, V]               !ora.map<K, V>                Key-value mappings
doublemap[K1, K2, V]    !ora.doublemap<K1, K2, V>     Nested mappings
struct { ... }          !llvm.struct<...>             Struct types
enum                    !ora.enum<name, repr>         Enumeration types
!T                      !ora.error<T>                 Error types
!T1 | T2                !ora.error_union<T1, T2>      Error unions
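
Composite types nest, so the mapping is naturally recursive. The following Python sketch illustrates that idea under stated assumptions: the classes `Prim`, `Slice`, `Map`, `DoubleMap`, and `Array` are hypothetical stand-ins for the compiler's type representation, and it assumes that element types inside `!ora.*` types are written in their lowered form.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Prim:
    mlir: str            # an already-lowered primitive such as "i256"

@dataclass(frozen=True)
class Slice:
    elem: object

@dataclass(frozen=True)
class Map:
    key: object
    value: object

@dataclass(frozen=True)
class DoubleMap:
    key1: object
    key2: object
    value: object

@dataclass(frozen=True)
class Array:
    elem: object
    length: int
    space: int           # memory space number of the enclosing region

def lower(t) -> str:
    """Recursively spell a type the way the complex type table does."""
    if isinstance(t, Prim):
        return t.mlir
    if isinstance(t, Slice):
        return f"!ora.slice<{lower(t.elem)}>"
    if isinstance(t, Map):
        return f"!ora.map<{lower(t.key)}, {lower(t.value)}>"
    if isinstance(t, DoubleMap):
        return f"!ora.doublemap<{lower(t.key1)}, {lower(t.key2)}, {lower(t.value)}>"
    if isinstance(t, Array):
        return f"memref<{t.length}x{lower(t.elem)}, {t.space}>"
    raise TypeError(f"unsupported type node: {t!r}")
```

For example, `lower(Map(Prim("i160"), Prim("i256")))` gives `!ora.map<i160, i256>`, the shape a `map[address, u256]` balance table would take under these assumptions.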

Memory Region Semantics

Ora's memory regions are represented in MLIR using memory spaces:

Storage (Space 1)

%storage_var = memref.alloca() {ora.region = "storage"} : memref<1xi256, 1>
  • Persistent contract state
  • Transactional semantics
  • Gas costs for access

Memory (Space 0)

%memory_var = memref.alloca() : memref<1xi256, 0>
  • Transient execution memory
  • Cleared between calls
  • Lower gas costs

TStore (Space 2)

%tstore_var = memref.alloca() {ora.region = "tstore"} : memref<1xi256, 2>
  • Transient storage window
  • Temporary persistence
  • Intermediate gas costs
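
The region-to-space numbering above can be captured in a small table. This is a hedged Python sketch: the helper names `region_memref` and `region_attr` are hypothetical, and the rule that only non-default regions carry an `ora.region` attribute is inferred from the three snippets above.

```python
# Memory space numbers as given in the section headings above.
MEMORY_SPACES = {"memory": 0, "storage": 1, "tstore": 2}

def region_memref(region: str, elem: str = "i256", length: int = 1) -> str:
    """Spell the memref type for a variable living in the given region."""
    return f"memref<{length}x{elem}, {MEMORY_SPACES[region]}>"

def region_attr(region: str) -> str:
    """Spell the attribute dictionary shown on the alloca, if any."""
    if region not in MEMORY_SPACES:
        raise KeyError(region)
    # Assumption: plain memory is the default and carries no attribute.
    return "" if region == "memory" else f'{{ora.region = "{region}"}}'
```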

Expression Lowering

Arithmetic Operations

Ora arithmetic expressions are lowered to MLIR arith dialect operations:

Ora Code:

let result = a + b * c;

MLIR:

%mul = arith.muli %b, %c : i256
%add = arith.addi %a, %mul : i256
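
The ordering of the two operations falls out of a post-order walk over the expression tree: operands are lowered before the operation that uses them. Here is a minimal Python sketch of that walk. The tuple-based tree, the `lower_expr` name, and the numeric SSA names are illustrative assumptions, and every value is assumed to be `i256`.

```python
from itertools import count

# Ora operator -> arith dialect op (integer variants only).
OPS = {"+": "arith.addi", "-": "arith.subi", "*": "arith.muli"}

def lower_expr(expr, out, fresh):
    """Lower a nested (op, lhs, rhs) tuple; append ops to `out`, return the SSA name."""
    if isinstance(expr, str):              # a variable reference such as "a"
        return f"%{expr}"
    op, lhs, rhs = expr
    left = lower_expr(lhs, out, fresh)     # operands first (post-order)
    right = lower_expr(rhs, out, fresh)
    name = f"%{next(fresh)}"
    out.append(f"{name} = {OPS[op]} {left}, {right} : i256")
    return name

# a + b * c, with precedence already resolved by the parser
lines = []
result = lower_expr(("+", "a", ("*", "b", "c")), lines, count())
```

After the call, `lines` holds the multiplication followed by the addition, matching the order in the MLIR above.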

Comparison Operations

Ora Code:

let is_greater = x > y;

MLIR:

// sgt is the signed predicate; unsigned operands (u8 - u256) use ugt instead
%cmp = arith.cmpi sgt, %x, %y : i256
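
Since both `u256` and `i256` lower to `i256`, the `arith.cmpi` predicate is what decides whether the bits are compared as signed or unsigned. The sketch below shows one way to pick the predicate from the Ora operand type and why the choice matters. The function names are hypothetical; the predicate names are the standard `arith.cmpi` ones.

```python
# Ora operator -> (signed predicate, unsigned predicate)
PREDICATES = {
    "==": ("eq", "eq"), "!=": ("ne", "ne"),
    "<": ("slt", "ult"), "<=": ("sle", "ule"),
    ">": ("sgt", "ugt"), ">=": ("sge", "uge"),
}

def cmpi_predicate(op: str, ora_type: str) -> str:
    """Pick the arith.cmpi predicate; only iN Ora types count as signed."""
    signed, unsigned = PREDICATES[op]
    return signed if ora_type.startswith("i") else unsigned

def as_signed(value: int, bits: int = 256) -> int:
    """Reinterpret an unsigned bit pattern as a two's complement integer."""
    return value - (1 << bits) if value >> (bits - 1) else value

big = 1 << 255   # a valid u256 with the top bit set
```

With `big` as the left operand, an unsigned `big > 0` is true, while the signed view `as_signed(big) > 0` is false, so using `sgt` on a `u256` would misjudge every value of 2^255 or more.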

Logical Operations

Logical operators use short-circuit evaluation:

Ora Code:

let result = condition1 && condition2;

MLIR:

%result = scf.if %condition1 -> i1 {
  scf.yield %condition2 : i1
} else {
  %false = arith.constant false
  scf.yield %false : i1
}
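
The effect of the `scf.if` form can be modelled in a few lines of Python. This sketch assumes the operations that compute the right operand are emitted inside the then region (the snippet above shows `%condition2` as an already available value); `lowered_and` and `expensive_check` are hypothetical names.

```python
def lowered_and(lhs: bool, rhs_region) -> bool:
    """Model of the scf.if lowering for `lhs && rhs`."""
    if lhs:
        return rhs_region()   # "then" region: evaluate and yield the right operand
    return False              # "else" region: yield the constant false

evaluated = []

def expensive_check() -> bool:
    evaluated.append("rhs")   # records that the right operand actually ran
    return True

first = lowered_and(False, expensive_check)   # right operand is skipped
second = lowered_and(True, expensive_check)   # right operand runs once
```

Only the second call touches `expensive_check`, which is the behaviour short-circuit evaluation is meant to guarantee.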

Field Access

Struct field access uses LLVM operations:

Ora Code:

let value = my_struct.field;

MLIR:

%value = llvm.extractvalue %my_struct[0] : !llvm.struct<(i256, i256)>

Function Calls

Ora Code:

let result = my_function(arg1, arg2);

MLIR:

%result = func.call @my_function(%arg1, %arg2) : (i256, i256) -> i256

Statement Lowering

Control Flow

If Statements

Ora Code:

if (condition) {
    // then block
} else {
    // else block
}

MLIR:

scf.if %condition {
  // then region
} else {
  // else region
}

While Loops

Ora Code:

while (condition) {
    // body
}

MLIR:

scf.while (%arg0 = %initial) : (i256) -> () {
  %cond = // evaluate condition
  scf.condition(%cond)
} do {
  // loop body
  scf.yield %next_value : i256
}

For Loops

Ora Code:

for (array) |item| {
    // body
}

MLIR:

%c0 = arith.constant 0 : index
%c1 = arith.constant 1 : index
%len = // array length
scf.for %i = %c0 to %len step %c1 {
  %item = memref.load %array[%i] : memref<?xi256>
  // loop body
}

Switch Statements

Ora Code:

switch (value) {
    case 1 => // handle 1
    case 2...5 => // handle range
    else => // default case
}

MLIR:

cf.switch %value : i256, [
  default: ^default,
  1: ^case1,
  2: ^case2_5,
  3: ^case2_5,
  4: ^case2_5,
  5: ^case2_5
]
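
`cf.switch` only accepts individual case values, so a range such as `2...5` has to be expanded into one entry per value, all targeting the same block. A minimal Python sketch of that expansion follows. The helper names are hypothetical, and the range is treated as inclusive because the MLIR above lists 2 through 5.

```python
def expand_cases(cases):
    """Map each case value to its block.

    `cases` is a list of (spec, block) where spec is an int or an
    inclusive (low, high) tuple.
    """
    table = {}
    for spec, block in cases:
        low, high = spec if isinstance(spec, tuple) else (spec, spec)
        for value in range(low, high + 1):
            if value in table:
                raise ValueError(f"duplicate case value {value}")
            table[value] = block
    return table

def emit_switch(value, ty, cases, default):
    """Render the expanded table in cf.switch syntax."""
    entries = [f"  default: {default}"]
    entries += [f"  {v}: {b}" for v, b in sorted(expand_cases(cases).items())]
    return f"cf.switch {value} : {ty}, [\n" + ",\n".join(entries) + "\n]"

text = emit_switch("%value", "i256", [(1, "^case1"), ((2, 5), "^case2_5")], "^default")
```

Wide ranges produce proportionally large tables under this scheme, so a real lowering might prefer comparisons for them; the sketch does not attempt that.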

Ora-Specific Features

Move Statements

Ora Code:

move 100 from sender to receiver;

MLIR:

%amount = arith.constant 100 : i256
%move_op = ora.move %amount from %sender to %receiver {ora.move = true}

Log Statements

Ora Code:

log Transfer(from: sender, to: receiver, amount: value);

MLIR:

ora.log "Transfer"(%sender, %receiver, %value) {
  indexed = [true, true, false]
} : (i160, i160, i256) -> ()

Try-Catch Blocks

Ora Code:

try {
    risky_operation();
} catch (error) {
    handle_error(error);
}

MLIR:

%result = scf.execute_region -> !ora.error<()> {
  %success = func.call @risky_operation() : () -> !ora.error<()>
  scf.yield %success : !ora.error<()>
}
%is_error = ora.is_error %result : !ora.error<()>
scf.if %is_error {
  %error = ora.unwrap_error %result : !ora.error<()>
  func.call @handle_error(%error) : (()) -> ()
}

Verification Features

Old Expressions

Ora Code:

ensures balance == old(balance) + amount;

MLIR:

%old_balance = // captured at function entry
%expected = arith.addi %old_balance, %amount : i256
%postcond = arith.cmpi eq, %balance, %expected : i256
ora.assert %postcond {ora.ensures = true, ora.old = %old_balance}

Quantified Expressions

Ora Code:

requires forall(i in 0...array.length) array[i] > 0;

MLIR:

%all_positive = ora.forall %i in %range {
  %elem = memref.load %array[%i] : memref<?xi256>
  %zero = arith.constant 0 : i256
  %positive = arith.cmpi sgt, %elem, %zero : i256
  ora.yield %positive : i1
} : (index) -> i1
ora.assert %all_positive {ora.requires = true}

Function Contracts

Requires Clauses

Ora Code:

fn transfer(amount: u256)
    requires(amount > 0)
    requires(balance >= amount)
{
    // function body
}

MLIR:

func.func @transfer(%amount: i256) {
  %zero = arith.constant 0 : i256
  // amount is u256, so the comparison uses the unsigned predicate
  %amount_positive = arith.cmpi ugt, %amount, %zero : i256
  ora.assert %amount_positive {ora.requires = true}

  %balance = // load balance
  %sufficient = arith.cmpi uge, %balance, %amount : i256
  ora.assert %sufficient {ora.requires = true}

  // function body
}

Ensures Clauses

Ora Code:

fn transfer(amount: u256)
    ensures balance == old(balance) - amount
{
    // function body
}

MLIR:

func.func @transfer(%amount: i256) {
  %c0 = arith.constant 0 : index
  // old(balance) is captured at function entry
  %old_balance = memref.load %balance_ref[%c0] : memref<1xi256>

  // function body

  %new_balance = memref.load %balance_ref[%c0] : memref<1xi256>
  %expected = arith.subi %old_balance, %amount : i256
  %postcond = arith.cmpi eq, %new_balance, %expected : i256
  ora.assert %postcond {ora.ensures = true}
}
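
The same pattern, a snapshot at entry followed by a check at exit, can be written out in Python to make the timing of `old(...)` concrete. This is an illustrative model only: the `Token` class is hypothetical, and arithmetic is masked to 256 bits to mirror the wrapping behaviour of `arith.subi`.

```python
MASK = (1 << 256) - 1   # values wrap modulo 2**256, like i256 arithmetic

class Token:
    def __init__(self, balance: int):
        self.balance = balance

    def transfer(self, amount: int) -> None:
        old_balance = self.balance                      # old(balance): read before the body runs
        self.balance = (self.balance - amount) & MASK   # function body
        expected = (old_balance - amount) & MASK        # right-hand side of the ensures clause
        # Postcondition check, evaluated after the body has finished.
        assert self.balance == expected, "ensures clause violated"

token = Token(100)
token.transfer(30)
```

Reading `old_balance` after the body instead of before it would make the check compare the new balance against itself minus `amount`, which is why the load has to sit at function entry.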

Compiler Integration

CLI Usage

The ora command selects the pipeline output through flags:

Compile and emit bytecode:

ora contract.ora
ora --emit-bytecode contract.ora

View MLIR IR:

ora --emit-mlir contract.ora

View sensei-ir (SIR) intermediate code:

ora --emit-sir contract.ora

Advanced options:

# Disable automatic MLIR validation (not recommended)
ora --no-validate-mlir contract.ora

# Use custom MLIR optimization passes
ora --mlir-passes="canonicalize,cse,mem2reg" contract.ora

Automatic MLIR Validation

The compiler automatically validates MLIR correctness before sensei-ir lowering:

$ ora contract.ora
Parsing contract.ora...
Performing semantic analysis...
Lowering to MLIR...
Validating MLIR before sensei-ir lowering...
✅ MLIR validation passed
Lowering to sensei-ir (SIR)...
Compiling to EVM bytecode via sensei-ir...
Successfully compiled to EVM bytecode!

If validation fails, compilation stops immediately:

❌ MLIR validation failed with 3 error(s):
- [TypeMismatch] Expected i256, found i1
- [MalformedAst] Missing operand for arith.addi
- [InvalidRegion] Block has no terminator

Validation is automatic to catch errors early in the compilation pipeline.

Build Integration

The MLIR lowering pipeline:

  1. Lexing and Parsing: Source code → AST
  2. Semantic Analysis: Type checking, builtin validation
  3. MLIR Lowering: AST → MLIR module (with source locations)
  4. MLIR Validation: Structural and semantic checks (automatic)
  5. MLIR Optimization: CSE, canonicalization, mem2reg, SCCP, LICM
  6. sensei-ir Lowering: MLIR → sensei-ir (SIR) 🚧 In Development
  7. Bytecode Generation: sensei-ir → EVM bytecode via sensei-ir debug-backend 🚧 In Development
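
The stop-on-failure behaviour of the validation step can be sketched as a small staged runner. This Python model is an assumption-laden illustration: `run_pipeline` and the three stage functions are hypothetical stand-ins, not the compiler's API, and the "module" is just a dictionary.

```python
def run_pipeline(stages, source):
    """Run (name, stage) pairs in order and stop at the first stage that reports errors.

    Each stage takes the previous artifact and returns (artifact, errors).
    """
    artifact, completed = source, []
    for name, stage in stages:
        artifact, errors = stage(artifact)
        if errors:
            return completed, errors
        completed.append(name)
    return completed, []

# Hypothetical stand-ins for three of the stages listed above.
def lower_to_mlir(ast):
    return {"ops": ast, "terminated": False}, []

def validate(module):
    errors = [] if module["terminated"] else ["[InvalidRegion] Block has no terminator"]
    return module, errors

def optimize(module):
    return module, []

done, errors = run_pipeline(
    [("lower", lower_to_mlir), ("validate", validate), ("optimize", optimize)],
    ["arith.addi"],
)
```

Here the run ends after validation reports its error, so the optimization stage never sees the malformed module.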

SSA Transformation

Ora to SSA

While Ora allows mutable local variables, the MLIR representation uses Static Single Assignment (SSA) form internally:

Ora Code (Mutable Variables):

pub fn calculate(x: u256) -> u256 {
    let result = x; // Mutable local variable
    result = result + 10;
    result = result * 2;
    return result;
}

MLIR Representation (SSA with Memory):

func.func @calculate(%arg0: i256) -> i256 {
  // A rank-0 memref holds one scalar and is indexed with an empty list
  %0 = memref.alloca() : memref<i256>      // Stack allocation
  memref.store %arg0, %0[] : memref<i256>  // Store initial value

  %1 = memref.load %0[] : memref<i256>     // Load for +10
  %c10 = arith.constant 10 : i256
  %2 = arith.addi %1, %c10 : i256
  memref.store %2, %0[] : memref<i256>     // Store back

  %3 = memref.load %0[] : memref<i256>     // Load for *2
  %c2 = arith.constant 2 : i256
  %4 = arith.muli %3, %c2 : i256
  memref.store %4, %0[] : memref<i256>     // Store back

  %5 = memref.load %0[] : memref<i256>     // Load final result
  return %5 : i256
}

After mem2reg Optimization Pass:

func.func @calculate(%arg0: i256) -> i256 {
  %c10 = arith.constant 10 : i256
  %0 = arith.addi %arg0, %c10 : i256  // Pure SSA!
  %c2 = arith.constant 2 : i256
  %1 = arith.muli %0, %c2 : i256      // Pure SSA!
  return %1 : i256
}
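
For straight-line code, the promotion amounts to forwarding each stored value to the loads that follow it. The Python sketch below shows only that core idea on a tuple-based instruction list, which is a hypothetical encoding. MLIR's actual mem2reg pass also handles control flow, which this sketch does not, and unlike the output above it keeps the original value names instead of renumbering them.

```python
def promote_slots(instrs):
    """Forward stored values to later loads and drop the memory ops."""
    current = {}    # slot -> value most recently stored into it
    forwarded = {}  # load result -> value that replaces it

    def resolve(name):
        return forwarded.get(name, name)

    out = []
    for ins in instrs:
        kind = ins[0]
        if kind == "alloca":
            current[ins[1]] = None
        elif kind == "store":
            _, value, slot = ins
            current[slot] = resolve(value)
        elif kind == "load":
            _, dst, slot = ins
            if current[slot] is None:
                raise ValueError(f"load from uninitialized slot {slot}")
            forwarded[dst] = current[slot]
        else:
            # Any other op keeps its shape; operands are rewritten in place.
            out.append((kind,) + tuple(resolve(x) for x in ins[1:]))
    return out

# The calculate function, with constants treated as already defined values.
program = [
    ("alloca", "%0"),
    ("store", "%arg0", "%0"),
    ("load", "%1", "%0"),
    ("arith.addi", "%2", "%1", "%c10"),
    ("store", "%2", "%0"),
    ("load", "%3", "%0"),
    ("arith.muli", "%4", "%3", "%c2"),
    ("store", "%4", "%0"),
    ("load", "%5", "%0"),
    ("return", "%5"),
]
promoted = promote_slots(program)
```

Ten instructions shrink to three: the add, the multiply, and the return, with no alloca, load, or store left.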

SSA Benefits

  1. Optimization: Standard MLIR passes (CSE, SCCP, dead code elimination) work directly
  2. Analysis: Data flow analysis is straightforward in SSA form
  3. Type Safety: Each SSA value has a single, well-defined type
  4. Gas Efficiency: mem2reg eliminates redundant loads/stores

Memory Regions vs SSA

Local Variables (SSA via mem2reg):

  • Function-scope variables
  • memref.alloca → SSA values
  • Optimized away in final code

Storage Variables (NOT SSA):

  • Contract state (persistent)
  • ora.sload / ora.sstore operations
  • Direct EVM SLOAD/SSTORE opcodes
  • Cannot be optimized away

// Local variable (SSA):
%local = arith.constant 100 : i256

// Storage variable (NOT SSA):
%slot = arith.constant 0 : i256
%value = ora.sload %slot : i256
ora.sstore %slot, %value : i256

Standard Library Integration

Built-in Lowering

Ora's standard library built-ins are lowered to custom ora.evm.* MLIR operations:

Ora Code:

let timestamp = std.block.timestamp();
let sender = std.msg.sender();

MLIR:

%timestamp = ora.evm.timestamp() : i256 loc("contract.ora":10:20)
%sender = ora.evm.caller() : i160 loc("contract.ora":11:17)

sensei-ir (SIR):

fn main:
entry -> timestamp sender {
timestamp = timestamp
sender = caller
iret
}

Zero-Overhead Guarantee

All standard library calls are inlined at the MLIR level - no function call overhead:

Ora Built-in              MLIR Operation       sensei-ir Operation    Overhead
std.block.timestamp()     ora.evm.timestamp    timestamp              Zero
std.msg.sender()          ora.evm.caller       caller                 Zero
std.constants.U256_MAX    arith.constant -1    0xfff...fff            Zero
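
The last row relies on two's complement: in a 256-bit integer, the bit pattern of -1 is all ones, which is the largest unsigned value. A short Python check of that arithmetic (the helper name `to_unsigned` is illustrative):

```python
BITS = 256

def to_unsigned(value: int, bits: int = BITS) -> int:
    """Reduce an integer modulo 2**bits, i.e. read its bit pattern as unsigned."""
    return value % (1 << bits)

U256_MAX = (1 << BITS) - 1          # 2**256 - 1
minus_one_bits = to_unsigned(-1)    # bit pattern of the constant -1
as_hex = hex(minus_one_bits)        # "0x" followed by 64 hex digits, all f
```

Both spellings therefore denote the same `i256` constant, so the table's `arith.constant -1` and `0xfff...fff` agree.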

Semantic Validation

Built-in calls are validated during semantic analysis:

Valid:

let sender = std.msg.sender();  // Correct usage

Invalid (caught at compile time):

let sender = std.msg.sender;    // Error: missing ()
let invalid = std.block.fake(); // Error: unknown built-in

Debugging and Analysis

Source Location Preservation

Nearly all MLIR operations carry exact source location information (97% coverage):

%result = arith.addi %a, %b : i256 loc("contract.ora":15:8)

What's tracked:

  • Filename (e.g., contract.ora)
  • Line number (e.g., 15)
  • Column number (e.g., 8)

Use cases:

  • Error reporting with exact source position
  • Debugging with source-level stepping
  • Coverage analysis
  • Profiling with source attribution
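
Tools that consume the textual IR can recover the three tracked fields from the trailing `loc(...)`. The following Python sketch handles only the simple file/line/column form shown above; `SourceLoc` and `parse_loc` are hypothetical names, and other MLIR location kinds are out of scope.

```python
import re
from typing import NamedTuple, Optional

class SourceLoc(NamedTuple):
    filename: str
    line: int
    column: int

# Matches loc("file":line:column) anywhere in the operation text.
_LOC = re.compile(r'loc\("([^"]+)":(\d+):(\d+)\)')

def parse_loc(op_text: str) -> Optional[SourceLoc]:
    """Extract the file/line/column location from one line of MLIR, if present."""
    match = _LOC.search(op_text)
    if match is None:
        return None
    return SourceLoc(match.group(1), int(match.group(2)), int(match.group(3)))

loc = parse_loc('%result = arith.addi %a, %b : i256 loc("contract.ora":15:8)')
```

An error reporter could then format `loc` as `contract.ora:15:8`, the same shape used in the validation messages.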

Error Reporting

MLIR validation provides detailed error messages:

❌ MLIR validation failed with 2 error(s):
- [TypeMismatch] Expected i256, found i1
at contract.ora:42:15
- [MalformedAst] Missing terminator in block
at contract.ora:50:1

Optimization Passes

Standard MLIR passes optimize the IR:

Pass                Purpose                                    Benefit
CSE                 Common Subexpression Elimination           Reduce duplicate computations
Canonicalization    Simplify IR patterns                       Cleaner code
mem2reg             Convert memory to SSA values               Eliminate loads/stores
SCCP                Sparse Conditional Constant Propagation    Constant folding
LICM                Loop-Invariant Code Motion                 Hoist invariants
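
To make the first row concrete, here is a minimal Python sketch of common subexpression elimination over a flat list of operations. It is not MLIR's CSE pass: the triple-based encoding is hypothetical, every operation is assumed to be free of side effects, and commutativity is ignored.

```python
def cse(instrs):
    """Remove repeated (op, operands) computations.

    `instrs` is a list of (dst, op, operands) triples. Returns the
    surviving instructions and a map from removed names to survivors.
    """
    seen = {}    # (op, operands) -> dst of the first occurrence
    alias = {}   # removed dst -> surviving dst
    out = []
    for dst, op, operands in instrs:
        # Rewrite operands first so later duplicates are recognised.
        operands = tuple(alias.get(x, x) for x in operands)
        key = (op, operands)
        if key in seen:
            alias[dst] = seen[key]
        else:
            seen[key] = dst
            out.append((dst, op, operands))
    return out, alias

program = [
    ("%0", "arith.muli", ("%b", "%c")),
    ("%1", "arith.muli", ("%b", "%c")),   # same computation as %0
    ("%2", "arith.addi", ("%0", "%1")),
]
optimized, alias = cse(program)
```

The second multiplication disappears and the addition reads `%0` twice, which is the "reduce duplicate computations" benefit listed in the table.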

Result: fewer redundant computations and memory operations in the IR, which is intended to reduce gas use in the generated EVM bytecode.

Performance Characteristics

The MLIR lowering system is designed for:

  • Fast compilation: Efficient lowering algorithms
  • Memory efficiency: Minimal overhead during compilation
  • Scalability: Handles large smart contracts
  • Deterministic output: Consistent results for testing

Implementation Features

Core Features

Feature                     Coverage
AST → MLIR Lowering         Complete
Source Location Tracking    97%
Automatic Validation        Yes
Type System Mapping         Complete
Standard Library            17 built-ins
SSA Transformation          mem2reg pass
Optimization Passes         5 passes (CSE, canonicalization, mem2reg, SCCP, LICM)
Storage Operations          sload, sstore, tload, tstore
Control Flow                if, while, for, switch
Function Calls              Complete
sensei-ir Lowering          🚧 In Development

In Development

  • Formal verification integration
  • Custom Ora dialect registration
  • Advanced type features (generics, constraints)

Future

  • Advanced analysis (loop, alias, escape)
  • Gas cost modeling
  • Alternative backends (LLVM IR, WebAssembly)
  • IDE integration

Examples

ERC20 Token Contract

contract SimpleToken {
    storage totalSupply: u256;
    storage balances: map[address, u256];
    storage allowances: doublemap[address, address, u256];

    pub fn initialize(initialSupply: u256) -> bool {
        let deployer = std.msg.sender();
        totalSupply = initialSupply;
        balances[deployer] = initialSupply;
        return true;
    }

    pub fn transfer(recipient: address, amount: u256) -> bool {
        let sender = std.msg.sender();
        let senderBalance = balances[sender];

        if (recipient == std.constants.ZERO_ADDRESS) {
            return false;
        }

        if (senderBalance < amount) {
            return false;
        }

        balances[sender] = senderBalance - amount;
        let recipientBalance = balances[recipient];
        balances[recipient] = recipientBalance + amount;

        return true;
    }
}

This example demonstrates:

  • Zero-overhead built-ins (std.msg.sender() → CALLER opcode)
  • Storage operations (balances[sender] → ora.sload with keccak256)
  • Type safety (all operations type-checked)
  • SSA form (local variables via mem2reg)
  • Source locations (operations tagged with source position)

The MLIR representation enables analysis and optimization while maintaining semantic integrity.

MLIR Resources

Learn More About MLIR

  • MLIR project site: https://mlir.llvm.org

Ora MLIR Implementation

  • Source Code: Available in the src/mlir/ directory of the Ora repository
  • Technical Documentation: See docs/mlir-lowering.md for implementation details
  • API Reference: See docs/mlir-api.md for developer documentation
  • Troubleshooting: See docs/mlir-troubleshooting.md for common issues

The Ora MLIR integration leverages the powerful MLIR framework developed by the LLVM community to provide advanced compiler capabilities for smart contract development.