@file:DependsOn("/jasmin.jar")
Compiler Basics and Toolchain
1 Design of Compiler Backend
In this section, we will cover the overall design of a compiler backend.
It is important to note that for compiled languages, the compiler backend does not have any runtime information. Namely, it would not be able to predict the data values involved in the computation. All code analysis must rely solely on metadata that can be obtained from the parse tree of the program.
1.1 Variable
A variable is an object created and maintained by the compiler backend for the purpose of code generation. It holds the data type (as a JVM type signature) and the register (local) index.
Interpreted languages:
In the interpreter backend, we had Variable store the actual runtime data. This is because the interpreter backend is responsible for computing all data values, and thus variables are bound to their respective data values.
Compiled languages:
Variables do not hold the actual data, as that is only available at runtime. Recall that the compiler backend only generates bytecode instructions and does not perform any computation. Thus, the compiler backend does not have access to the actual data values. This means a variable can only be bound to the storage location (register index) and data type of its data value.
data class Variable(
val local:Int,
val signature:String = "I",
)
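To make the contrast concrete, the sketch below puts an interpreter-side variable next to the compiler-side one. The names InterpVariable, interpretAssign and compileAssign are hypothetical and are not part of the lecture code; the exact shape of the interpreter's variable class is an assumption.

```kotlin
// Hypothetical interpreter-side variable: it owns the runtime value.
data class InterpVariable(var value: Int)

// Compiler-side variable, as defined above: only location and type.
data class Variable(
    val local: Int,
    val signature: String = "I",
)

// Interpreter: the assignment `x = value` happens right now.
fun interpretAssign(x: InterpVariable, value: Int) {
    x.value = value
}

// Compiler: the assignment is only described as instructions
// that will store the value into local slot x.local at runtime.
fun compileAssign(x: Variable, value: Int): List<String> =
    listOf("ldc $value", "istore ${x.local}")
```

The interpreter version mutates the variable, while the compiler version never sees the value in a register; it only returns text that refers to the register index.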
1.2 Scope
A scope is a collection of Variable objects indexed by their variable names as strings.
import java.util.HashMap
class Scope(initUnusedIndex:Int = 0)
: HashMap<String, Variable>() {
// track next local index to be allocated
var unusedIndex = initUnusedIndex;
// toString
override fun toString():String = this.map {
"${it.key} = ${it.value}"
}.joinToString("\n")
}
- A scope is a hashmap from variable names, known as symbol names, to Variable objects.
- In a scope, we also track the next available register index.
fun Scope.resolveOrPut(symbolName: String):Variable {
return getOrPut(symbolName) {
Variable(this.unusedIndex++)
}
}
Retrieves the variable named symbolName. If it does not exist, a new variable is created with the next unused register index.
Note:
We only consider integer values for now, so the variable type is always "I".
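Once other types are allowed, index allocation can no longer always advance by one: on the JVM, long ("J") and double ("D") values each occupy two consecutive local slots. The following is a minimal sketch of how the allocation could account for that. TypedScope and slotWidth are hypothetical names, and this is not the lecture's Scope class.

```kotlin
data class Variable(val local: Int, val signature: String = "I")

// Number of local slots a JVM type signature occupies:
// long ("J") and double ("D") take two, everything else takes one.
fun slotWidth(signature: String): Int =
    if (signature == "J" || signature == "D") 2 else 1

// Hypothetical typed scope: same idea as Scope, but the next unused
// index advances by the width of the declared type.
class TypedScope(private var unusedIndex: Int = 0) {
    private val table = HashMap<String, Variable>()

    fun resolveOrPut(symbolName: String, signature: String = "I"): Variable =
        table.getOrPut(symbolName) {
            val v = Variable(unusedIndex, signature)
            unusedIndex += slotWidth(signature) // skip the second slot of wide types
            v
        }
}
```

With this sketch, declaring x as an int, d as a double and y as an int would place them at indices 0, 1 and 3.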
1.2.1 Testing scope
Scope().apply {
resolveOrPut("x")
resolveOrPut("y")
}
x = Variable(local=0, signature=I)
y = Variable(local=1, signature=I)
1.3 Code
Code is a generator of JVM bytecode. In order to generate JVM bytecode, we need a scope.
sealed class Code {
abstract fun gen(scope: Scope):List<String>
override fun toString():String
= "Bytecode:\n" +
this.gen(Scope()).mapIndexed {
index, code -> " ${index+1}. ${code}"
}.joinToString("\n")
}
1.4 Statement and expressions
abstract class Stmt: Code()
abstract class Expr: Code()
2 Compiler Backend
2.1 EmptyStatement
object emptyStmt: Stmt() {
override fun gen(scope: Scope):List<String> = listOf()
}
2.2 Class Statement
class ClassStmt(
val className: String,
val body: Stmt,
): Stmt() {
override fun gen(scope: Scope): List<String>
= listOf(
".class public $className",
".super java/lang/Object"
) +
body.gen(scope)
}
2.2.1 Testing ClassStmt
ClassStmt("Hello", emptyStmt)
Bytecode:
1. .class public Hello
2. .super java/lang/Object
2.3 Method statement
class MethodStmt(
val name: String,
val signature: String,
val stackLimit: Int,
val localsLimit: Int,
val body: Stmt,
): Stmt() {
override fun gen(scope: Scope): List<String>
= listOf(
".method public static $name$signature",
".limit stack $stackLimit",
".limit locals $localsLimit",
) +
body.gen(scope) +
listOf(
"return",
".end method",
)
companion object {
fun main(stackLimit:Int, localsLimit:Int, body:Stmt)
= MethodStmt(
"main", "([Ljava/lang/String;)V",
stackLimit, localsLimit, body)
}
}
2.3.1 Testing MethodStmt
ClassStmt(
"Hello",
MethodStmt.main(10, 10, emptyStmt)
)
Bytecode:
1. .class public Hello
2. .super java/lang/Object
3. .method public static main([Ljava/lang/String;)V
4. .limit stack 10
5. .limit locals 10
6. return
7. .end method
2.4 Print statement
class Print(
val e: Expr
): Stmt() {
override fun gen(scope: Scope): List<String>
= e.gen(scope) +
listOf(
"getstatic java/lang/System/out Ljava/io/PrintStream;",
"swap",
"invokevirtual java/io/PrintStream/println(I)V",
)
}
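- e.gen(scope) generates the code that places the value of the expression e on top of the stack.
- Print then appends the bytecode that prints that value: getstatic pushes System.out, swap moves it below the value, and invokevirtual calls println(I)V.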
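The need for swap is easier to see by tracing the operand stack. The sketch below is a symbolic trace that models only the four instructions Print(IntExpr(...)) emits; traceStack is a hypothetical helper, not part of the compiler backend.

```kotlin
// Symbolic operand-stack trace for the instructions emitted by Print.
// Each stack entry is a label; the last element is the top of the stack.
// Returns the stack contents after each instruction.
fun traceStack(code: List<String>): List<List<String>> {
    val stack = ArrayList<String>()
    val trace = ArrayList<List<String>>()
    for (line in code) {
        val parts = line.trim().split(" ")
        when (parts[0]) {
            "ldc" -> stack.add(parts[1])            // push the constant
            "getstatic" -> stack.add("System.out")  // push the PrintStream
            "swap" -> {                             // exchange the top two entries
                val top = stack.removeAt(stack.size - 1)
                val below = stack.removeAt(stack.size - 1)
                stack.add(top)
                stack.add(below)
            }
            "invokevirtual" -> {                    // println(I)V pops argument and receiver
                stack.removeAt(stack.size - 1)
                stack.removeAt(stack.size - 1)
            }
            else -> error("instruction not modelled: $line")
        }
        trace.add(stack.toList())
    }
    return trace
}
```

For the code of Print(IntExpr(123)), the trace is [123], then [123, System.out], then [System.out, 123], then an empty stack. invokevirtual expects the receiver below its argument, which is exactly the order that swap establishes.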
2.5 Integer Expression
class IntExpr(
val value: Int,
): Expr() {
override fun gen(scope: Scope): List<String>
= listOf(
"ldc $value"
)
}
2.5.1 Testing Print and IntExpr
IntExpr(100)
Bytecode:
1. ldc 100
Print(IntExpr(123))
Bytecode:
1. ldc 123
2. getstatic java/lang/System/out Ljava/io/PrintStream;
3. swap
4. invokevirtual java/io/PrintStream/println(I)V
2.6 Assignment statement
class AssignStmt(
val symbolName: String,
val e: Expr,
): Stmt() {
override fun gen(scope: Scope): List<String>
= scope.resolveOrPut(symbolName).let {
e.gen(scope) +
"istore ${it.local} ; $symbolName = ..."
}
}
2.6.1 Testing AssignStmt
AssignStmt("x", IntExpr(100))
Bytecode:
1. ldc 100
2. istore 0 ; x = ...
2.7 Block Statement
We can chain multiple statements together (with a shared scope).
class BlockStmt(
val list:List<Stmt>,
): Stmt() {
override fun gen(scope: Scope):List<String>
= list.map {
it.gen(scope)
}.flatten()
}
2.7.1 Testing BlockStmt
BlockStmt(
listOf(
AssignStmt("x", IntExpr(100)),
AssignStmt("y", IntExpr(200)),
AssignStmt("x", IntExpr(300))
)
)
Bytecode:
1. ldc 100
2. istore 0 ; x = ...
3. ldc 200
4. istore 1 ; y = ...
5. ldc 300
6. istore 0 ; x = ...
2.8 Deref Expression
class DerefExpr(
val symbolName: String,
): Expr() {
override fun gen(scope: Scope): List<String> {
val v = scope[symbolName]
if(v == null) {
throw Exception("$symbolName not in scope.")
}
return listOf(
"iload ${v.local} ; $symbolName"
)
}
}
2.8.1 Testing Deref
Scope().let {scope ->
scope.resolveOrPut("x")
DerefExpr("x").gen(scope)
}
[iload 0 ; x]
try {
Scope().let {scope ->
scope.resolveOrPut("x")
DerefExpr("y").gen(scope)
}
} catch(e:Exception) {
println("error: $e")
}
error: java.lang.Exception: y not in scope.
BlockStmt(
listOf(
AssignStmt("x", IntExpr(100)),
AssignStmt("y", DerefExpr("x")),
)
)
Bytecode:
1. ldc 100
2. istore 0 ; x = ...
3. iload 0 ; x
4. istore 1 ; y = ...
2.9 Arithmetics
class ArithExpr(
val op: ArithExpr.Op,
val left: Expr,
val right: Expr,
): Expr() {
enum class Op {
Add,
Sub,
Mul,
Div,
}
override fun gen(scope: Scope): List<String> = (
left.gen(scope)
+ right.gen(scope)
+ when(op) {
Op.Add -> "iadd"
Op.Sub -> "isub"
Op.Mul -> "imul"
Op.Div -> "idiv"
})
}
2.9.1 Test ArithExpr
BlockStmt(listOf(
AssignStmt("pi", IntExpr(31415)),
AssignStmt("r", IntExpr(100)),
Print(
ArithExpr(ArithExpr.Op.Mul,
DerefExpr("pi"),
ArithExpr(ArithExpr.Op.Mul,
DerefExpr("r"),
DerefExpr("r")
)
)
))
)
Bytecode:
1. ldc 31415
2. istore 0 ; pi = ...
3. ldc 100
4. istore 1 ; r = ...
5. iload 0 ; pi
6. iload 1 ; r
7. iload 1 ; r
8. imul
9. imul
10. getstatic java/lang/System/out Ljava/io/PrintStream;
11. swap
12. invokevirtual java/io/PrintStream/println(I)V
pi = 31415
r = 100
print(pi * (r * r))
2.10 Putting them together
val program = ClassStmt(
"Hello",
MethodStmt(
"main", "([Ljava/lang/String;)V", 10, 5,
BlockStmt(listOf(
AssignStmt("pi", IntExpr(31415)),
AssignStmt("r", IntExpr(100)),
Print(
ArithExpr(ArithExpr.Op.Mul,
DerefExpr("pi"),
ArithExpr(ArithExpr.Op.Mul,
DerefExpr("r"),
DerefExpr("r")
)
)
),
))
)
)
println(program)
Bytecode:
1. .class public Hello
2. .super java/lang/Object
3. .method public static main([Ljava/lang/String;)V
4. .limit stack 10
5. .limit locals 5
6. ldc 31415
7. istore 0 ; pi = ...
8. ldc 100
9. istore 1 ; r = ...
10. iload 0 ; pi
11. iload 1 ; r
12. iload 1 ; r
13. imul
14. imul
15. getstatic java/lang/System/out Ljava/io/PrintStream;
16. swap
17. invokevirtual java/io/PrintStream/println(I)V
18. return
19. .end method
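The stack limit of 10 above was chosen by hand. As a rough illustration of how it could be derived from the generated code instead, the sketch below tracks the net stack effect of each instruction and reports the highest depth reached. It covers only the opcodes generated in this section and assumes straight-line code with no branches; stackEffect and maxStackDepth are hypothetical helpers.

```kotlin
// Net operand-stack effect of the instructions used in this section.
// Only the opcodes generated above are modelled; this is an assumption,
// not a general JVM stack analysis.
fun stackEffect(instruction: String): Int {
    val opcode = instruction.trim().substringBefore(" ")
    return when (opcode) {
        "ldc", "iload", "getstatic" -> 1
        "istore", "iadd", "isub", "imul", "idiv" -> -1
        "swap", "return" -> 0
        "invokevirtual" -> -2 // println(I)V pops the receiver and one int
        else -> error("instruction not modelled: $instruction")
    }
}

// Highest stack depth reached by a straight-line instruction sequence.
fun maxStackDepth(code: List<String>): Int {
    var depth = 0
    var max = 0
    for (line in code) {
        if (line.trim().startsWith(".")) continue // skip assembler directives
        depth += stackEffect(line)
        if (depth > max) max = depth
    }
    return max
}
```

Applied to the 19 lines listed above, this yields 3: the deepest point is just after pi, r and r have all been loaded and before the first imul.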
3 JVM Assembler with JASMIN
3.1 Assembler to ClassFile
import jasmin.ClassFile
import java.io.BufferedReader
import java.io.File
import java.io.StringReader
fun compile(classStmt: ClassStmt) {
val name = classStmt.className
classStmt.gen(Scope())
.joinToString("\n").also {
// save the assembly text for inspection
File("$name.j").writeText(it)
}.run {
BufferedReader(StringReader(this))
}.let { reader ->
// assemble the text into an in-memory class file
ClassFile().apply {
readJasmin(reader, name, false)
}
}.let { classFile ->
// write the class file to disk
File("$name.class")
.outputStream().use { classFile.write(it) }
}
println("Success: $name.class")
}
compile(program)
Success: Hello.class
3.2 Java Class Loader
import java.net.URLClassLoader
fun run(className:String) {
val classLoader = URLClassLoader(arrayOf(File(".").toURI().toURL()))
val classObj = classLoader.loadClass(className)
val mainMethod = classObj.getMethod("main", Array<String>::class.java)
mainMethod.invoke(null, arrayOf<String>())
}
run(program.className)
314150000
4 Conclusion
We have covered:
- Compiler backend for simple arithmetic and variables
- Programs can be formed by composing compiler objects
- Programs generate Java bytecode, which is assembled and executed in the JVM
What about branching and loops?
- See previous lecture on JVM Programming
- We need to design the corresponding code generators for these programming constructs.
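As a hint of what such a generator might look like, here is a minimal sketch for an if/else statement. It works directly on instruction lists rather than on the Stmt and Expr classes, and it assumes the condition leaves an int on the stack where 0 means false. LabelGen and genIf are hypothetical names; the label and jump syntax follows Jasmin (a label is written as its name followed by a colon, and ifeq and goto take a label name).

```kotlin
// Hypothetical label generator: produces unique Jasmin label names.
class LabelGen {
    private var next = 0
    fun fresh(prefix: String): String = "$prefix${next++}"
}

// Sketch of code generation for `if (cond) thenCode else elseCode`,
// assuming cond leaves an int on the stack where 0 means false.
fun genIf(
    cond: List<String>,
    thenCode: List<String>,
    elseCode: List<String>,
    labels: LabelGen,
): List<String> {
    val elseLabel = labels.fresh("Else")
    val endLabel = labels.fresh("EndIf")
    return cond +
        "ifeq $elseLabel" + // jump to the else branch when cond == 0
        thenCode +
        "goto $endLabel" +  // skip the else branch
        "$elseLabel:" +
        elseCode +
        "$endLabel:"
}
```

Fresh labels are needed because several if statements may appear in the same method, and label names must not clash.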
What about function declaration and invocation?
- See previous lecture on JVM Internals
- We need to design the code generators for these programming constructs.
- Scope management needs to support subscopes.
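One possible shape for subscopes is a scope that keeps a reference to its enclosing scope. The sketch below is an assumption about how this could be done, not the design used in later lectures; NestedScope, resolve, declare and subscope are hypothetical names.

```kotlin
data class Variable(val local: Int, val signature: String = "I")

// Hypothetical nested scope: lookups fall back to the enclosing scope,
// while new variables are declared in the innermost scope.
class NestedScope(private val parent: NestedScope? = null) {
    private val table = HashMap<String, Variable>()

    // A subscope continues numbering where its parent stood at creation time.
    private var unusedIndex: Int = parent?.unusedIndex ?: 0

    // Search this scope first, then the enclosing scopes.
    fun resolve(symbolName: String): Variable? =
        table[symbolName] ?: parent?.resolve(symbolName)

    // Declare a variable in this scope (or return the existing one).
    fun declare(symbolName: String): Variable =
        table.getOrPut(symbolName) { Variable(unusedIndex++) }

    fun subscope(): NestedScope = NestedScope(this)
}
```

In this sketch, a variable declared in an inner scope is invisible to the outer scope, while the inner scope can still read outer variables. Declaring further variables in the outer scope after a subscope has been created would reuse indices, so a fuller design would need to coordinate index allocation between the two.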