MLIR Intermediate Representation

Overview

The Ora compiler includes an MLIR (Multi-Level Intermediate Representation) lowering system. It provides an intermediate representation for analysis, optimization, and alternative code generation paths, and it complements the primary Yul backend.

What is MLIR?

MLIR (Multi-Level Intermediate Representation) is a compiler infrastructure developed by the LLVM community (https://mlir.llvm.org) that provides:

Multi-level representation: Support for different abstraction levels
Extensible dialects: Custom operation and type definitions
Optimization passes: Pluggable transformation framework
Analysis infrastructure: Data flow and control flow analysis

The Ora compiler integrates with the existing MLIR framework to provide Ora-specific lowering and analysis capabilities.

Ora MLIR Integration

The Ora compiler integrates with the LLVM MLIR framework (https://mlir.llvm.org) to:

Represent Ora semantics in a structured intermediate form using MLIR operations and types
Enable advanced analysis for verification and optimization using MLIR's pass infrastructure
Support alternative backends beyond Yul by leveraging MLIR's code generation capabilities
Provide debugging information with source location preservation using MLIR's location tracking

Our implementation provides Ora-specific dialects, operations, and lowering passes that work within the existing MLIR ecosystem.

Type System Mapping

Primitive Types

Ora primitive types are mapped to MLIR as follows:

Ora Type	MLIR Type	Notes
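`u8` - `u256`	`iN`	Signless N-bit integer; unsigned semantics come from the operations applied (e.g. `arith.cmpi ugt`)
`i8` - `i256`	`iN`	Signless N-bit integer; signed semantics come from the operations applied (e.g. `arith.cmpi sgt`)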
`bool`	`i1`	Single bit boolean
`address`	`i160`	With `ora.address` attribute
`string`	`!ora.string`	Ora dialect type
`bytes`	`!ora.bytes`	Ora dialect type
`void`	`()`	Empty result list; MLIR has no builtin void type
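
The table can be read as a small lookup function. The following Python sketch is only an illustration of the mapping, not the compiler's implementation; the function name `ora_primitive_to_mlir` is hypothetical, and it assumes that integer widths are multiples of 8 between 8 and 256. `void` is left out because it maps to an empty result list rather than to a type.

```python
import re

# Fixed mappings taken from the table above.
_FIXED = {
    "bool": "i1",
    "address": "i160",
    "string": "!ora.string",
    "bytes": "!ora.bytes",
}

_INT = re.compile(r"^[ui](\d+)$")


def ora_primitive_to_mlir(name: str) -> str:
    """Return the MLIR type string for an Ora primitive type name."""
    if name in _FIXED:
        return _FIXED[name]
    match = _INT.match(name)
    if match:
        width = int(match.group(1))
        # Assumption: only byte-aligned widths from 8 to 256 are valid.
        if 8 <= width <= 256 and width % 8 == 0:
            return f"i{width}"  # u64 and i64 both become the signless i64
    raise ValueError(f"not a known Ora primitive type: {name}")
```

Because signed and unsigned Ora integers share the same `iN` type, signedness has to be tracked separately and applied when operations are chosen.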

Complex Types

Ora Type	MLIR Type	Description
`[T; N]`	`memref<NxT, space>`	Fixed-size arrays with memory space
`slice[T]`	`!ora.slice<T>`	Dynamic slices
`map[K, V]`	`!ora.map<K, V>`	Key-value mappings
`doublemap[K1, K2, V]`	`!ora.doublemap<K1, K2, V>`	Nested mappings
`struct { ... }`	`!llvm.struct<...>`	Struct types
`enum`	`!ora.enum<name, repr>`	Enumeration types
`!T`	`!ora.error<T>`	Error types
`!T1 | T2`	`!ora.error_union<T1, T2>`	Error unions

Memory Region Semantics

Ora's memory regions are represented in MLIR using memory spaces:

Storage (Space 1)

%storage_var = memref.alloca() {ora.region = "storage"} : memref<1xi256, 1>

Persistent contract state
Transactional semantics
Gas costs for access

Memory (Space 0)

%memory_var = memref.alloca() : memref<1xi256, 0>

Transient execution memory
Cleared between calls
Lower gas costs

TStore (Space 2)

%tstore_var = memref.alloca() {ora.region = "tstore"} : memref<1xi256, 2>

Transient storage (EVM TLOAD/TSTORE)
Persists only until the end of the current transaction
Intermediate gas costs
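
The region-to-memory-space convention above amounts to a small table. As a rough Python sketch (the helper `region_memref` is hypothetical and only builds the type string shown in the snippets above):

```python
# Memory-space numbers from the sections above.
MEMORY_SPACE = {"memory": 0, "storage": 1, "tstore": 2}


def region_memref(region: str, elem: str = "i256", count: int = 1) -> str:
    """Build the memref type string for a variable in the given region."""
    space = MEMORY_SPACE[region]  # raises KeyError for an unknown region
    return f"memref<{count}x{elem}, {space}>"
```

Calling `region_memref("storage")` yields `memref<1xi256, 1>`, matching the storage example.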

Expression Lowering

Arithmetic Operations

Ora arithmetic expressions are lowered to MLIR arith dialect operations:

let result = a + b * c;

%mul = arith.muli %b, %c : i256
%add = arith.addi %a, %mul : i256
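
The order of the emitted operations follows from a post-order walk of the expression tree: operands are lowered before the operation that uses them. The Python sketch below illustrates that idea under simplifying assumptions (every value is `i256`, results are named by position, and the classes `Var` and `BinOp` are hypothetical stand-ins for the compiler's AST nodes).

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Var:
    name: str


@dataclass
class BinOp:
    op: str  # "+", "-" or "*"
    lhs: "Expr"
    rhs: "Expr"


Expr = Union[Var, BinOp]

ARITH_OPS = {"+": "arith.addi", "-": "arith.subi", "*": "arith.muli"}


def lower(expr: Expr, out: List[str]) -> str:
    """Append ops for expr to out (post-order) and return its SSA name."""
    if isinstance(expr, Var):
        return f"%{expr.name}"
    lhs = lower(expr.lhs, out)
    rhs = lower(expr.rhs, out)
    result = f"%{len(out)}"
    out.append(f"{result} = {ARITH_OPS[expr.op]} {lhs}, {rhs} : i256")
    return result


ops: List[str] = []
# a + b * c, already parsed with * binding tighter than +
lower(BinOp("+", Var("a"), BinOp("*", Var("b"), Var("c"))), ops)
print("\n".join(ops))
```

The multiplication is emitted first because it is an operand of the addition, which mirrors the `%mul` / `%add` order shown above.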

Comparison Operations

let is_greater = x > y;

%cmp = arith.cmpi sgt, %x, %y : i256
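
MLIR's `iN` types are signless, so the predicate passed to `arith.cmpi` decides whether the operands are read as signed (`sgt`) or unsigned (`ugt`). The Python sketch below models both predicates on 256-bit patterns to show where they disagree; the helper names are illustrative and not part of the Ora compiler.

```python
BITS = 256
MASK = (1 << BITS) - 1


def as_signed(value: int) -> int:
    """Interpret a 256-bit pattern as a two's-complement signed integer."""
    value &= MASK
    return value - (1 << BITS) if value >> (BITS - 1) else value


def cmp_ugt(x: int, y: int) -> bool:
    """Model of arith.cmpi ugt on i256 bit patterns."""
    return (x & MASK) > (y & MASK)


def cmp_sgt(x: int, y: int) -> bool:
    """Model of arith.cmpi sgt on i256 bit patterns."""
    return as_signed(x) > as_signed(y)


big = 1 << 255  # a valid u256 value, but negative when read as i256
print(cmp_ugt(big, 0), cmp_sgt(big, 0))
```

The two predicates agree for small values and differ once the top bit is set, so `u256` operands call for the unsigned predicates, while `sgt` fits signed operands such as `i256`.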

Logical Operations

Logical operators use short-circuit evaluation:

let result = condition1 && condition2;

%result = scf.if %condition1 -> i1 {
  scf.yield %condition2 : i1
} else {
  %false = arith.constant false
  scf.yield %false : i1
}
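
The `scf.if` form selects the right operand only when the left one is true. For the right operand to be skipped entirely, the operations that compute it would need to sit inside the `then` region; the Python sketch below models that case by passing the right operand as a callable. The function `lower_and` is a hypothetical model, not compiler code.

```python
from typing import Callable, List


def lower_and(cond1: bool, eval_cond2: Callable[[], bool]) -> bool:
    """Mirror the scf.if lowering of `cond1 && cond2`."""
    if cond1:
        return eval_cond2()  # then region: compute and yield cond2
    return False             # else region: yield constant false


evaluated: List[str] = []


def right_operand() -> bool:
    evaluated.append("condition2")  # record that the operand was computed
    return True


first = lower_and(False, right_operand)  # right operand is skipped
second = lower_and(True, right_operand)  # right operand runs once
```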

Field Access

Struct field access uses LLVM operations:

let value = my_struct.field;

%value = llvm.extractvalue %my_struct[0] : !llvm.struct<(i256, i256)>

Function Calls

let result = my_function(arg1, arg2);

%result = func.call @my_function(%arg1, %arg2) : (i256, i256) -> i256

Statement Lowering

Control Flow

If Statements

if (condition) {
    // then block
} else {
    // else block
}

scf.if %condition {
    // then region
} else {
    // else region
}

While Loops

while (condition) {
    // body
}

scf.while (%arg0 = %initial) : (i256) -> () {
    %cond = // evaluate condition
    scf.condition(%cond)
} do {
    // loop body
    scf.yield %next_value : i256
}

For Loops

for (array) |item| {
    // body
}

%c0 = arith.constant 0 : index
%c1 = arith.constant 1 : index
%len = // array length
scf.for %i = %c0 to %len step %c1 {
    %item = memref.load %array[%i] : memref<?xi256>
    // loop body
}

Switch Statements

switch (value) {
    case 1 => // handle 1
    case 2...5 => // handle range
    else => // default case
}

cf.switch %value : i256, [
    default: ^default,
    1: ^case1,
    2: ^case2_5,
    3: ^case2_5,
    4: ^case2_5,
    5: ^case2_5
]
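
Since `cf.switch` matches individual values, the range case `2...5` appears as one entry per value, all targeting the same block. A minimal Python sketch of that expansion follows; the names `expand_cases` and `format_switch` are hypothetical, ranges are assumed to be inclusive, and the flag type is fixed to `i256` for brevity.

```python
from typing import List, Tuple

# (low, high, block label); a single value uses low == high.
Case = Tuple[int, int, str]


def expand_cases(cases: List[Case]) -> List[Tuple[int, str]]:
    """Expand inclusive ranges into one (value, label) entry per value."""
    entries: List[Tuple[int, str]] = []
    seen = set()
    for low, high, label in cases:
        for value in range(low, high + 1):
            if value in seen:
                raise ValueError(f"duplicate case value {value}")
            seen.add(value)
            entries.append((value, label))
    return entries


def format_switch(flag: str, cases: List[Case], default: str) -> str:
    """Render the expanded cases in the cf.switch layout used above."""
    items = [f"default: ^{default}"]
    items += [f"{value}: ^{label}" for value, label in expand_cases(cases)]
    body = ",\n".join(f"    {item}" for item in items)
    return f"cf.switch {flag} : i256, [\n{body}\n]"


print(format_switch("%value", [(1, 1, "case1"), (2, 5, "case2_5")], "default"))
```

Expansion like this is only practical for small ranges; a very wide range would presumably need a comparison-based lowering instead.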

Ora-Specific Features

Move Statements

move 100 from sender to receiver;

%amount = arith.constant 100 : i256
%move_op = ora.move %amount from %sender to %receiver {ora.move = true}

Log Statements

log Transfer(from: sender, to: receiver, amount: value);

ora.log "Transfer"(%sender, %receiver, %value) {
    indexed = [true, true, false]
} : (i160, i160, i256) -> ()

Try-Catch Blocks

try {
    risky_operation();
} catch (error) {
    handle_error(error);
}

%result = scf.execute_region -> !ora.error<()> {
    %success = func.call @risky_operation() : () -> !ora.error<()>
    scf.yield %success : !ora.error<()>
}
%is_error = ora.is_error %result : !ora.error<()>
scf.if %is_error {
    %error = ora.unwrap_error %result : !ora.error<()>
    func.call @handle_error(%error) : (()) -> ()
}

Verification Features

Old Expressions

ensures balance == old(balance) + amount;

%old_balance = // captured at function entry
%expected = arith.addi %old_balance, %amount : i256
%postcond = arith.cmpi eq, %balance, %expected : i256
ora.assert %postcond {ora.ensures = true, ora.old = %old_balance}

Quantified Expressions

requires forall(i in 0...array.length) array[i] > 0;

%all_positive = ora.forall %i in %range {
    %elem = memref.load %array[%i] : memref<?xi256>
    %zero = arith.constant 0 : i256
    %positive = arith.cmpi sgt, %elem, %zero : i256
    ora.yield %positive : i1
} : (index) -> i1
ora.assert %all_positive {ora.requires = true}

Function Contracts

Requires Clauses

fn transfer(amount: u256) 
    requires(amount > 0)
    requires(balance >= amount)
{
    // function body
}

func.func @transfer(%amount: i256) {
    %zero = arith.constant 0 : i256
    %amount_positive = arith.cmpi ugt, %amount, %zero : i256
    ora.assert %amount_positive {ora.requires = true}
    
    %balance = // load balance
    %sufficient = arith.cmpi uge, %balance, %amount : i256
    ora.assert %sufficient {ora.requires = true}
    
    // function body
}

Ensures Clauses

fn transfer(amount: u256)
ensures balance == old(balance) - amount
{
    // function body
}

func.func @transfer(%amount: i256) {
    %c0 = arith.constant 0 : index
    %old_balance = memref.load %balance_ref[%c0] : memref<1xi256>
    
    // function body
    
    %new_balance = memref.load %balance_ref[%c0] : memref<1xi256>
    %expected = arith.subi %old_balance, %amount : i256
    %postcond = arith.cmpi eq, %new_balance, %expected : i256
    ora.assert %postcond {ora.ensures = true}
}

Compiler Integration

CLI Usage

The Ora compiler uses a modern, streamlined command interface:

Compile and emit bytecode:

ora contract.ora
ora --emit-bytecode contract.ora

View MLIR IR:

ora --emit-mlir contract.ora

View Yul intermediate code:

ora --emit-yul contract.ora

Advanced options:

# Disable automatic MLIR validation (not recommended)
ora --no-validate-mlir contract.ora

# Use custom MLIR optimization passes
ora --mlir-passes="canonicalize,cse,mem2reg" contract.ora

Automatic MLIR Validation

The compiler automatically validates MLIR correctness before Yul lowering:

$ ora contract.ora
Parsing contract.ora...
Performing semantic analysis...
Lowering to MLIR...
Validating MLIR before Yul lowering...
✅ MLIR validation passed
Lowering to Yul...
Compiling to EVM bytecode...
Successfully compiled to EVM bytecode!

If validation fails, compilation stops immediately:

❌ MLIR validation failed with 3 error(s):
  - [TypeMismatch] Expected i256, found i1
  - [MalformedAst] Missing operand for arith.addi
  - [InvalidRegion] Block has no terminator

Validation is automatic to catch errors early in the compilation pipeline.

Build Integration

The MLIR lowering pipeline:

Lexing and Parsing: Source code → AST
Semantic Analysis: Type checking, builtin validation
MLIR Lowering: AST → MLIR module (with source locations)
MLIR Validation: Structural and semantic checks (automatic)
MLIR Optimization: CSE, canonicalization, mem2reg, SCCP, LICM
Yul Lowering: MLIR → Yul (EVM intermediate language)
Bytecode Generation: Yul → EVM bytecode
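
The key property of this pipeline is that a validation failure stops every later stage. The Python sketch below models only that control flow; the stages are stand-in functions that tag a string, and `ValidationError`, `run_pipeline`, and `reject` are hypothetical names rather than parts of the Ora compiler.

```python
from typing import Callable, List, Optional, Tuple


class ValidationError(Exception):
    """Hypothetical error type carrying the validator's messages."""

    def __init__(self, errors: List[str]) -> None:
        super().__init__(f"MLIR validation failed with {len(errors)} error(s)")
        self.errors = errors


Stage = Tuple[str, Callable[[str], str]]


def run_pipeline(source: str, stages: List[Stage]) -> Tuple[Optional[str], List[str]]:
    """Run stages in order and stop at the first validation failure."""
    log: List[str] = []
    artifact = source
    for name, stage in stages:
        try:
            artifact = stage(artifact)
        except ValidationError as err:
            log.append(f"{name}: failed with {len(err.errors)} error(s)")
            return None, log  # later stages never run
        log.append(f"{name}: ok")
    return artifact, log


def reject(module: str) -> str:
    # Stand-in validator that always reports one error.
    raise ValidationError(["[TypeMismatch] Expected i256, found i1"])


# Stand-in stages: each one just tags the artifact it receives.
stages: List[Stage] = [
    ("MLIR Lowering", lambda ast: ast + " -> mlir"),
    ("MLIR Validation", reject),
    ("Yul Lowering", lambda mlir: mlir + " -> yul"),
]
result, log = run_pipeline("ast", stages)
print(log)
```

Here the log ends at the validation stage and no Yul is produced, which corresponds to compilation stopping immediately when validation fails.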

SSA Transformation

Ora to SSA

While Ora allows mutable local variables, the MLIR representation uses Static Single Assignment (SSA) form internally:

Ora Code (Mutable Variables):

pub fn calculate(x: u256) -> u256 {
    let result = x;      // Mutable local variable
    result = result + 10;
    result = result * 2;
    return result;
}

MLIR Representation (SSA with Memory):

func.func @calculate(%arg0: i256) -> i256 {
    %0 = memref.alloca() : memref<i256>       // Stack allocation (rank-0, so no indices)
    memref.store %arg0, %0[] : memref<i256>   // Store initial value
    
    %1 = memref.load %0[] : memref<i256>      // Load for +10
    %c10 = arith.constant 10 : i256
    %2 = arith.addi %1, %c10 : i256
    memref.store %2, %0[] : memref<i256>      // Store back
    
    %3 = memref.load %0[] : memref<i256>      // Load for *2
    %c2 = arith.constant 2 : i256
    %4 = arith.muli %3, %c2 : i256
    memref.store %4, %0[] : memref<i256>      // Store back
    
    %5 = memref.load %0[] : memref<i256>      // Load final result
    return %5 : i256
}

After mem2reg Optimization Pass:

func.func @calculate(%arg0: i256) -> i256 {
    %c10 = arith.constant 10 : i256
    %0 = arith.addi %arg0, %c10 : i256       // Pure SSA!
    %c2 = arith.constant 2 : i256
    %1 = arith.muli %0, %c2 : i256           // Pure SSA!
    return %1 : i256
}
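
For straight-line code like this, the effect of the pass can be pictured as store-to-load forwarding: each load is replaced by the value most recently stored to the same slot, and the memory operations are then dropped. The Python sketch below is a toy model of that idea, not MLIR's mem2reg implementation. It assumes a single basic block, slots that never escape, and a store before every load; the tuple encoding of operations is made up for the example.

```python
from typing import Dict, List, Tuple

Op = Tuple[str, ...]


def promote(ops: List[Op]) -> List[Op]:
    """Forward stored values to later loads and drop the memory ops."""
    current: Dict[str, str] = {}  # slot -> value most recently stored
    rename: Dict[str, str] = {}   # load result -> forwarded value
    out: List[Op] = []

    def resolve(name: str) -> str:
        return rename.get(name, name)

    for op in ops:
        kind = op[0]
        if kind == "alloca":      # ("alloca", slot)
            continue
        if kind == "store":       # ("store", value, slot)
            current[op[2]] = resolve(op[1])
        elif kind == "load":      # ("load", result, slot)
            rename[op[1]] = current[op[2]]
        elif kind == "return":    # ("return", value)
            out.append(("return", resolve(op[1])))
        else:                     # (opname, result, operands...)
            out.append((kind, op[1], *(resolve(x) for x in op[2:])))
    return out


ops: List[Op] = [
    ("alloca", "%0"),
    ("store", "%arg0", "%0"),
    ("load", "%1", "%0"),
    ("arith.constant", "%c10", "10"),
    ("arith.addi", "%2", "%1", "%c10"),
    ("store", "%2", "%0"),
    ("load", "%3", "%0"),
    ("arith.constant", "%c2", "2"),
    ("arith.muli", "%4", "%3", "%c2"),
    ("store", "%4", "%0"),
    ("load", "%5", "%0"),
    ("return", "%5"),
]
promoted = promote(ops)
```

The sketch keeps the original value names (`%2`, `%4`) instead of renumbering them, so its output matches the optimized function above up to renaming.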

SSA Benefits

Optimization: Standard MLIR passes (CSE, SCCP, dead code elimination) work directly
Analysis: Data flow analysis is straightforward in SSA form
Type Safety: Each SSA value has a single, well-defined type
Gas Efficiency: mem2reg eliminates redundant loads/stores

Memory Regions vs SSA

Local Variables (SSA via mem2reg):

Function-scope variables
memref.alloca → SSA values
Optimized away in final code

Storage Variables (NOT SSA):

Contract state (persistent)
ora.sload / ora.sstore operations
Direct EVM SLOAD/SSTORE opcodes
Cannot be optimized away

// Local variable (SSA):
%local = arith.constant 100 : i256

// Storage variable (NOT SSA):
%slot = arith.constant 0 : i256
%value = ora.sload %slot : i256
ora.sstore %slot, %value : i256

Standard Library Integration

Built-in Lowering

Ora's standard library built-ins are lowered to custom ora.evm.* MLIR operations:

Ora Code:

let timestamp = std.block.timestamp();
let sender = std.msg.sender();

MLIR:

%timestamp = ora.evm.timestamp() : i256 loc("contract.ora":10:20)
%sender = ora.evm.caller() : i160 loc("contract.ora":11:17)

Yul:

let timestamp := timestamp()
let sender := caller()

Zero-Overhead Guarantee

All standard library calls are inlined at the MLIR level, so they carry no function call overhead:

Ora Built-in	MLIR Operation	Yul Opcode	Overhead
`std.block.timestamp()`	`ora.evm.timestamp`	`timestamp()`	Zero
`std.msg.sender()`	`ora.evm.caller`	`caller()`	Zero
`std.constants.U256_MAX`	`arith.constant -1`	`0xfff...fff`	Zero

Semantic Validation

Built-in calls are validated during semantic analysis:

✅ Valid:

let sender = std.msg.sender();  // Correct usage

❌ Invalid (caught at compile time):

let sender = std.msg.sender;    // Error: missing ()
let invalid = std.block.fake(); // Error: unknown built-in

Debugging and Analysis

Source Location Preservation

Nearly all MLIR operations carry exact source location information (97% coverage):

%result = arith.addi %a, %b : i256 loc("contract.ora":15:8)

What's tracked:

Filename (e.g., contract.ora)
Line number (e.g., 15)
Column number (e.g., 8)

Use cases:

Error reporting with exact source position
Debugging with source-level stepping
Coverage analysis
Profiling with source attribution
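
Tools that consume the emitted IR can recover these fields from the `loc("file":line:column)` suffix. The following Python sketch shows one way to format and parse that annotation; `SourceLoc` and `parse_loc` are hypothetical helpers, and the regular expression only handles the simple file/line/column form shown above.

```python
import re
from typing import NamedTuple


class SourceLoc(NamedTuple):
    filename: str
    line: int
    column: int

    def to_mlir(self) -> str:
        """Render the location in MLIR's file/line/column syntax."""
        return f'loc("{self.filename}":{self.line}:{self.column})'


_LOC = re.compile(r'loc\("([^"]+)":(\d+):(\d+)\)')


def parse_loc(text: str) -> SourceLoc:
    """Extract the first file/line/column location from a line of MLIR text."""
    match = _LOC.search(text)
    if match is None:
        raise ValueError("no loc(...) annotation found")
    return SourceLoc(match.group(1), int(match.group(2)), int(match.group(3)))


loc = parse_loc('%result = arith.addi %a, %b : i256 loc("contract.ora":15:8)')
print(loc.filename, loc.line, loc.column)
```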

Error Reporting

MLIR validation provides detailed error messages:

❌ MLIR validation failed with 2 error(s):
  - [TypeMismatch] Expected i256, found i1
    at contract.ora:42:15
  - [MalformedAst] Missing terminator in block
    at contract.ora:50:1

Optimization Passes

Standard MLIR passes optimize the IR:

Pass	Purpose	Benefit
CSE	Common Subexpression Elimination	Reduce duplicate computations
Canonicalization	Simplify IR patterns	Cleaner code
mem2reg	Convert memory to SSA values	Eliminate loads/stores
SCCP	Sparse Conditional Constant Propagation	Constant folding
LICM	Loop-Invariant Code Motion	Hoist invariants

Result: fewer redundant loads, stores, and computations in the generated code, which reduces gas usage in the resulting EVM bytecode.

Performance Characteristics

The MLIR lowering system is designed for:

Fast compilation: Efficient lowering algorithms
Memory efficiency: Minimal overhead during compilation
Scalability: Handles large smart contracts
Deterministic output: Consistent results for testing

Implementation Features

Core Features

Feature	Coverage
AST → MLIR Lowering	Complete
Source Location Tracking	97%
Automatic Validation	Yes
Type System Mapping	Complete
Standard Library	17 built-ins
SSA Transformation	mem2reg pass
Optimization Passes	5 passes (CSE, canonicalization, mem2reg, SCCP, LICM)
Storage Operations	sload, sstore, tload, tstore
Control Flow	if, while, for, switch
Function Calls	Complete
Yul Lowering	Complete

In Development

Formal verification integration
Custom Ora dialect registration
Advanced type features (generics, constraints)

Future

Advanced analysis (loop, alias, escape)
Gas cost modeling
Alternative backends (LLVM IR, WebAssembly)
IDE integration

Examples

ERC20 Token Contract

contract SimpleToken {
    storage totalSupply: u256;
    storage balances: map[address, u256];
    storage allowances: doublemap[address, address, u256];
    
    pub fn initialize(initialSupply: u256) -> bool {
        let deployer = std.msg.sender();
        totalSupply = initialSupply;
        balances[deployer] = initialSupply;
        return true;
    }
    
    pub fn transfer(recipient: address, amount: u256) -> bool {
        let sender = std.msg.sender();
        let senderBalance = balances[sender];
        
        if (recipient == std.constants.ZERO_ADDRESS) {
            return false;
        }
        
        if (senderBalance < amount) {
            return false;
        }
        
        balances[sender] = senderBalance - amount;
        let recipientBalance = balances[recipient];
        balances[recipient] = recipientBalance + amount;
        
        return true;
    }
}

This example demonstrates:

Zero-overhead built-ins (std.msg.sender() → CALLER opcode)
Storage operations (balances[sender] → ora.sload with keccak256)
Type safety (all operations type-checked)
SSA form (local variables via mem2reg)
Source locations (operations tagged with their source position)

The MLIR representation enables analysis and optimization while maintaining semantic integrity.

MLIR Resources

Learn More About MLIR

MLIR Official Website: https://mlir.llvm.org
MLIR Documentation: https://mlir.llvm.org/docs/
MLIR Tutorials: https://mlir.llvm.org/docs/Tutorials/
LLVM Project: https://llvm.org

Ora MLIR Implementation

Source Code: Available in the src/mlir/ directory of the Ora repository
Technical Documentation: See docs/mlir-lowering.md for implementation details
API Reference: See docs/mlir-api.md for developer documentation
Troubleshooting: See docs/mlir-troubleshooting.md for common issues

The Ora MLIR integration builds on the MLIR framework developed by the LLVM community to provide compiler analysis and optimization for smart contract development.
