Building Interpreters

Author

Ken Pu

1 Language design

1.1 Lexical elements

  • Reserved keywords: if, while, return, etc.
  • Identifier for variables and functions
  • Literals for numbers and strings
  • Operators: +, -, *, /, …
  • Comments
ANTLR lexer rule for STRING
fragment ESC : '\\'
               (["\\/bfnrt] | UNICODE)
             ;
fragment UNICODE : 'u' HEX HEX HEX HEX ;
fragment HEX     : [0-9a-fA-F] ;
STRING           : '"' (ESC | ~["\\])* '"' ;

1.2 Expressions

  • Arithmetic: a * (b + c)
  • Logical: x && y || (!z)
  • Comparison: a <= b
  • Function calls: f(x, y)
expr 
  : expr (MUL | DIV) expr
  | expr (ADD | SUB) expr
  | expr COMPARE expr
  | NOT expr
  | '(' expr ')'
  | funcCall
  | NUMBER
  | STRING
  ;

1.3 Code

  • Assignment: x = 10
  • Flow control:
    • conditions: if-else, switch-case, …
    • loops: while, for
  • Function definitions
  • Blocks of multiple statements
  • Scope handling: local vs global scope
assign
  : IDENTIFIER '=' expr ';'
  ;
ifElse 
  : 'if' '(' expr ')' '{' block '}'
    (
      'else' '{' block '}'
    )?
  ;
block 
  : (statement)*
  ;
statement
  : assign
  | ifElse
  | expr ';'

More on function definitions later.

1.4 Types

  • Primitive types
  • Arrays and records
  • User defined types
  • Type annotations
type
  : primitive
  | record
  ;
primitive
  : 'Integer' | 'Float' | 'Char'
  ;
record
  : 'class'
    '{'
      ( field ) +
    '}'
    (
      'extends' IDENTIFIER
    )?
  ;
field
  : IDENTIFIER type ';'
  ;
arrayType
  : '[]' type
  ;
Note
funcDeclare
  : 'fun'
    IDENTIFIER
    '('
    parameter*
    ')' '{'
      block
    '}'
  ;
  
parameter
  : IDENTIFIER ':' type
  ;

1.5 A complete program

program
  : ( statement )+ EOF
  ;

A complete program is a sentence in the sentence of \(L(\mathrm{program})\).

A complete program
// funcDeclare
fun count(c:Char, message:[] Char) {
  n = 0;
  for(m in message) {
    if(m == c) {
      n = n + 1;
    }
  }
  n;
}

name = "Parsing and Compilation";
println(count('i', name));

2 Runtime Environment

RuntimeEnvironment cluster_GlobalScope Global Scope cluster_LocalScope Local Scope (Function) cluster_IODevices I/O Devices GlobalScope Global Scope x → 10 y → 20 LocalScope Local Scope a → 5 b → 15 GlobalScope->LocalScope BuiltInFunctions Built-in Functions print() input() len() GlobalScope->BuiltInFunctions Disk Disk BuiltInFunctions->Disk Network Network BuiltInFunctions->Network Keyboard Keyboard BuiltInFunctions->Keyboard Monitor Monitor BuiltInFunctions->Monitor

The runtime environment is where the program will be evaluated.

By executing the statements in the program, the interpreter will access the runtime environment in two ways:

  • Modify the runtime environment:
    • create or delete scopes
    • create or modify symnbol bindings in scopes
  • Modify persistent resources:
    • modify memory content
    • modify disk content
  • Data transfer
    • read data from input devices (e.g. keyboard, microphone, networks)
    • write data to output devices (e.g. monitor, speaker, networks)

3 Interpreter

3.1 Interpreter Backend

  • An interpreter backend has direct read/write access to the runtime environment.
  • The backend can construct expressions, evaluate functions, and access I/O such as keyboard and monitor.

Key responsibilities of an interpreter backend include:

  • Expression Evaluation: Computes arithmetic, logical, and function call expressions.
  • Function Execution: Resolves function definitions, manages local scopes, and handles returns.
  • Variable Binding: Reads and writes variables in the runtime environment.
  • Control Flow Handling: Executes conditionals, loops, and recursion.
  • I/O Operations: Reads user input, writes output, and interacts with external devices.
Warning

Unlike compilers, which translate code into machine instructions, an interpreter backend directly processes and executes the program without generating a separate binary.

InterpreterSystem cluster_AST Abstract Syntax Tree (AST) cluster_Backend Interpreter Backend cluster_Runtime Runtime Environment Program Program Stmt1 Assignment (x = 5) Program->Stmt1 Stmt2 Stmt2 Program->Stmt2 PrintStmt Print (x + 2) Stmt2->PrintStmt Expr1 Expression (x + 2) PrintStmt->Expr1 Backend Traverses AST Evaluates Expressions Executes Statements Backend->Program RuntimeEnv Symbol Table Memory Built-in Functions I/O Devices Backend->RuntimeEnv

3.2 Tree Traversal

The evaluation phase in an interpreter involves traversing the Abstract Syntax Tree (AST).

The interpreter performs the following using a technique known as syntax-directed definitions (SDD).

  • computations,
  • variable assignments,
  • function calls,
  • and control flow operations.