Backend of Programming Languages
1 Turing Completeness
1.1 TM as the limit of computation
Theorem (Church Turing Thesis)
Every algorithm has a TM that implements it.
Definition: Turing Completeness
A programming language is Turing-complete if it is equality expressive as Turing machines.
1.2 Elements of a Turing-complete language
- Data representation
- Array-like memory storage
- Loop
- Branch control
Java is a Turing complete language.
int[] numbers = new int[]{1, 2, 3, 4, 5};
while(...) {
    ...
}
if(...) {
    ...
} else {
    ...
}1.3 Other Turing-complete languages
1.3.1 Python
numbers = [1,2,3,4]
while(...):
    ...
    
if ...:
    ...
else:
    ...1.3.2 Clojure
(let [numbers [1 2 3 4]]
    ...)
(loop ... (recur ...))
(if ...)1.3.3 SQL
SQL is also Turing-complete using recursive queries.
2 Minimal And Efficient Backend Language
2.1 Why not Turing Machine
- TM is too inefficient.
- TM lacks strong types.
- TM does not support modern processor’s optimization features.
2.2 Java Bytecode
A simple instruction based runtime environment consisting of:
- An operand stack (of 32-bit or 64-bit cells)
- Local variable array
- A heap for dynamic memory allocation
- Constant pool for defining constants
Operand stack with three values
 | <x> |
 | <y> |
 | <z> |
 +-----+Each x, y or z are either:
- integer, float, hot-spot reference (32b)
- long, double, references (64b)
 | <x> |          
 | <y> |   --->   | <w> |
 | <z> |   iadd   | <z> |
 +-----+          +-----+iadd: add the two integers from the top of the stack, and place the result back onto the stack.
2.3 From Java to Bytecode
A minimal Java program.
public class Hello {
    public static void main(String[] args) {
        System.out.println("Hello World");
    }
}In JVM bytecode, this is converted into much simpler (but more verbose) list of instructions.
.class public Hello
.super java/lang/ObjectDeclaration of a Java class named Hello.
.method public <init>()V
    aload_0
    invokenonvirtual java/lang/Object/<init>()V
    return
.end methodDeclares a constructor which invokes the superclass constructor.
.method public static main([Ljava/lang/String;)V
    .limit stack 2
    
    getstatic java/lang/System/out Ljava/io/PrintStream;
    ldc "Hello world"
    invokevirtual java/io/PrintStream/println(Ljava/lang/String;)V
    return
.end methodThe public static void main(String[] args) method that performs stack based computation.
- allocate a stack of maximum depth 2
- places an object reference on the stack (System.out). It also specifies its Java typejava.io.PrintStream.
- places a constant string on top of the stack.
- invokes a method println(...)which consumes two from the stack: the string and thePrintStream.
2.4 Some examples of Bytecode programming
2.4.1 Adding numbers
For example, we omit the class declaration and constructor implmentation in bytecode. Below is the main method implementation.
.method public static main([Ljava/lang/String;)V
    .limit stack 100
    .limit locals 2
    
    ldc 200
    ldc 300
    iadd
    istore_1
    
    getstatic java/lang/System/out Ljava/io/PrintStream;
    iload_1
    invokevirtual java/io/PrintStream/println(I)V
    return
.end methodEquivalent to the code:
x = 200 + 300
print(x)2.4.2 Adding numbers using a loop
For this example, we are only going to show the body of the main method. The complete bytecode would require the class declaration, constructor implementation, and the main method declaration code. (See above).
.limit stack 100
.limit locals 2
    
    We will have a larger stack in case we need more space for computation. We will have two local variables.
- The thisvariable that refers to the runtime object – this is always needed.
- The accumulator integer.
; #1 is the total sum
iconst_0
istore_1This initializes the local variable using the stack.
- Place 0on top of the stack
- Move the top of the stack to variable #1
; load a counter value
ldc 1000Store 1000 to the top of the stack. This will be the count-down value used during the loop.
Important: We will keep the count-down value in the stack until the end of the loop.
LOOP:This is a label that marks the start of the loop.
; break the loop if reach zero
dup
ifeq END_LOOPTest the top of the stack to be zero. If it is zero, then break out of the loop by branching to END_LOOP label.
Important: we duplicate the top of the stack so that that test ifeq does not remove the original count-down value.
; add the counter value to sum
dup
iload_1
iadd
istore_1Add the count-down value (a copy) to the local variable #1, and then store the result back to the local variable #1.
; decrement counter
iconst_1
isubModify the count-down value on the stack by decrementing it by 1.
goto LOOP
END_LOOP:End of the loop
; print the total
iload_1
getstatic java/lang/System/out Ljava/io/PrintStream;
iload_1
invokevirtual java/io/PrintStream/println(I)V
returnAfter the loop, we print the value of the local variable #1.
2.5 Towards a better programming interface
We need a programming language for:
- Semantics: Better abstraction of a runtime environment.
- Syntax: Better symbolic description of the computing process.
- Verification: Safer communication to minimize human errors.
- Compilation: Generate code into Java Bytecode for fast execution.