Optimization Goal:
run-time speed (the most common goal)
code size
number of network messages sent
memory usage
disk access
...
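Intermediate Language (Intermediate Representation): a language between the source and the target that exposes details such as registers, so that more code optimizations become possible. For traditional programming languages, the IL is often a kind of high-level assembly. A compiler can lower the program through several ILs in successive steps.
Three-Address Code: each instruction performs a single binary or unary operation
each operand is either a register (variable) or a constant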
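As a rough illustration of the three-address form described above, the sketch below lowers an arithmetic expression into instructions with at most one operator each. It borrows Python's own `ast` parser for convenience; the function name `to_three_address` and the temporaries `t1`, `t2`, ... are hypothetical choices, not part of any particular compiler.

```python
import ast
import itertools


def to_three_address(expr: str) -> list[str]:
    """Lower an arithmetic expression into three-address instructions."""
    counter = itertools.count(1)
    code: list[str] = []
    ops = {ast.Add: "+", ast.Sub: "-", ast.Mult: "*", ast.Div: "/"}

    def lower(node: ast.expr) -> str:
        # Names and constants are already valid operands.
        if isinstance(node, ast.Name):
            return node.id
        if isinstance(node, ast.Constant):
            return str(node.value)
        # A binary operation becomes one instruction writing a fresh temporary.
        if isinstance(node, ast.BinOp) and type(node.op) in ops:
            left = lower(node.left)
            right = lower(node.right)
            temp = f"t{next(counter)}"
            code.append(f"{temp} := {left} {ops[type(node.op)]} {right}")
            return temp
        # Unary minus is the only unary operation handled in this sketch.
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            operand = lower(node.operand)
            temp = f"t{next(counter)}"
            code.append(f"{temp} := - {operand}")
            return temp
        raise ValueError(f"unsupported expression: {ast.dump(node)}")

    lower(ast.parse(expr, mode="eval").body)
    return code


for line in to_three_address("x + y * z"):
    print(line)
# t1 := y * z
# t2 := x + t1
```

Every nested subexpression gets its own name, which is what later makes it easy to spot repeated computations.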
Where should we optimize code:
AST:
Assembly:
Intermediate Representation (IR)
To simplify the analysis, we define basic blocks: within a basic block, control can only enter at the first instruction and leave at the last one, so its instructions are guaranteed to execute in order and its behavior is predictable.
Basic Block: a maximal sequence of instructions with no labels (except at the first instruction) and no jumps (except at the last instruction)
Control-Flow Graph: a directed graph whose nodes are basic blocks and whose edges are the possible transfers of control between them (jumps and fall-through).
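The two definitions above can be sketched in a few lines. The instruction syntax here is an assumption made for illustration: a label is a string ending in ":", an unconditional jump is "jump L", and a conditional jump is "if ... goto L". The helper names `split_basic_blocks` and `build_cfg` are hypothetical.

```python
def split_basic_blocks(instrs):
    """Split a list of instruction strings into maximal basic blocks."""
    blocks, current = [], []
    for ins in instrs:
        # A label may only appear as the first instruction of a block.
        if ins.endswith(":") and current:
            blocks.append(current)
            current = []
        current.append(ins)
        # A jump may only appear as the last instruction of a block.
        if ins.startswith(("jump ", "if ")):
            blocks.append(current)
            current = []
    if current:
        blocks.append(current)
    return blocks


def build_cfg(blocks):
    """Return the set of edges (i, j) between block indices."""
    label_to_block = {
        b[0][:-1]: i for i, b in enumerate(blocks) if b[0].endswith(":")
    }
    edges = set()
    for i, block in enumerate(blocks):
        last = block[-1]
        if last.startswith(("jump ", "if ")):
            # The jump target is the last word of the instruction.
            edges.add((i, label_to_block[last.split()[-1]]))
        # Everything except an unconditional jump can fall through.
        if not last.startswith("jump ") and i + 1 < len(blocks):
            edges.add((i, i + 1))
    return edges


program = [
    "x := 1",
    "L1:",
    "x := x * 2",
    "if x < 100 goto L1",
    "y := x",
]
blocks = split_basic_blocks(program)
print(blocks)
print(sorted(build_cfg(blocks)))  # [(0, 1), (1, 1), (1, 2)]
```

The loop body forms a block with an edge back to itself, which is exactly the kind of structure that global optimization has to reason about and local optimization can ignore.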
Optimization Levels
Local Optimization: within a single Basic Block
Global Optimization: within a control-flow graph (within method body)
Inter-procedural Optimization: between different methods
Most compilers perform local optimization, many perform global optimization, and only a few perform inter-procedural optimization.
Also, in practice, a conscious decision is often made not to implement the fanciest optimization known. Reasons:
hard to implement
costly in compilation time
low payoff
Possible Optimization:
delete tautological code (x = x)
rewrite arithmetic expressions into cheaper forms (x = y ** 2 becomes x = y * y)
pre-compute constant expressions (x = 1+2)
eliminate unreachable code (if DEBUG then..., unused library code, or code made unreachable by earlier optimizations)
rewrite intermediate code in single assignment form
Constant Folding: evaluate an operation on two constants (say, an addition) at compile time and replace it with the resulting constant.
it can be quite dangerous to do
imagine you are running the compiler on x86-64 and compiling for ARM,
but ARM might implement floating-point arithmetic differently
so folding floating-point operations on the host may not produce the result the target would compute
one solution is to keep floating-point constants as strings and simulate the computation exactly as the target machine would perform it
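A minimal sketch of a conservative folder that follows the caution above: it folds integer constants and deliberately leaves floating-point operations for the target to evaluate. The tuple encoding `(dest, op, a, b)` and the name `fold_constants` are assumptions for illustration. Python integers are unbounded, so this sketch also does not model fixed-width overflow on the target.

```python
import operator

# Operations this sketch is willing to evaluate at compile time.
FOLDABLE = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def fold_constants(block):
    """Replace `dest := c1 op c2` with `dest := c` when both operands are ints."""
    folded = []
    for dest, op, a, b in block:
        # bool is a subclass of int in Python, so compare the exact type.
        if op in FOLDABLE and type(a) is int and type(b) is int:
            folded.append((dest, "const", FOLDABLE[op](a, b), None))
        else:
            # Floats and variable operands are left untouched.
            folded.append((dest, op, a, b))
    return folded


block = [
    ("x", "+", 1, 2),      # folded to x := 3
    ("y", "+", 0.1, 0.2),  # floating point: left for the target machine
    ("z", "*", "x", 3),    # operand is a variable: nothing to fold here
]
for instr in fold_constants(block):
    print(instr)
```

Refusing to fold floats gives up some speed but never changes the program's result, which is the safer default for a cross-compiler.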
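Single assignment form: a variable x is in single assignment form when it is assigned only once in its lifetime. A basic block is in single assignment form when every variable in it is.
If a basic block is in single assignment form, the assignment x := ... is the first occurrence of x in the block, so two assignments with the same right-hand side compute the same value and we can do the following optimization (common subexpression elimination):
x := y + z
...
w := y + z
optimize to
x := y + z
...
w := x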
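The rewrite above can be sketched as a single pass over a basic block that remembers which variable already holds each right-hand side. This assumes the block is in single assignment form, so a remembered value can never be overwritten. The tuple encoding `(dest, op, a, b)`, the "copy" pseudo-operation, and the function name are hypothetical.

```python
def eliminate_common_subexpressions(block):
    """Reuse earlier results for repeated right-hand sides in one basic block."""
    available = {}  # (op, a, b) -> variable that already holds the value
    result = []
    for dest, op, a, b in block:
        key = (op, a, b)
        if op != "copy" and key in available:
            # Same right-hand side seen before: copy the earlier result.
            result.append((dest, "copy", available[key], None))
        else:
            if op != "copy":
                available[key] = dest
            result.append((dest, op, a, b))
    return result


block = [
    ("x", "+", "y", "z"),
    ("t", "*", "x", 2),
    ("w", "+", "y", "z"),
]
for instr in eliminate_common_subexpressions(block):
    print(instr)
# ('x', '+', 'y', 'z')
# ('t', '*', 'x', 2)
# ('w', 'copy', 'x', None)
```

The comparison is purely syntactic, so y + z and z + y are treated as different expressions in this sketch.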
Assuming single assignment form, after a copy a := b we can replace later uses of a with b (copy propagation):
b := z + y
a := b
x := 2 * a
optimize to
b := z + y
a := b // if a is not used anywhere else, this is now a dead statement and can be deleted
x := 2 * b
optimize to
b := z + y
x := 2 * b
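The two steps above, copy propagation followed by dead-statement removal, can be sketched as two small passes. The sketch assumes single assignment form and instructions without side effects; the tuple encoding `(dest, op, a, b)`, the "copy" pseudo-operation, and the `live_out` parameter (the variables still needed after the block) are hypothetical.

```python
def propagate_copies(block):
    """After `a := b`, rewrite later uses of a into b."""
    copies = {}  # copy destination -> original source
    result = []
    for dest, op, a, b in block:
        a = copies.get(a, a)
        b = copies.get(b, b)
        if op == "copy":
            copies[dest] = a
        result.append((dest, op, a, b))
    return result


def remove_dead_code(block, live_out):
    """Drop assignments whose result is never used (backward scan)."""
    live = set(live_out)
    kept = []
    for dest, op, a, b in reversed(block):
        if dest in live:
            kept.append((dest, op, a, b))
            live.discard(dest)
            # Variable operands (strings) are now needed earlier in the block.
            live.update(v for v in (a, b) if isinstance(v, str))
    kept.reverse()
    return kept


block = [
    ("b", "+", "z", "y"),
    ("a", "copy", "b", None),
    ("x", "*", 2, "a"),
]
step1 = propagate_copies(block)
step2 = remove_dead_code(step1, live_out={"x"})
print(step1)
print(step2)  # [('b', '+', 'z', 'y'), ('x', '*', 2, 'b')]
```

Neither pass makes the code faster on its own terms until the other has run, which is why such passes are usually chained.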
Often an optimization does not speed things up by itself but enables more optimization opportunities down the line, so optimizations need to be applied repeatedly.
Example:
move $a $b
move $b $a
optimize to
move $a $b
Example:
addiu $a $a i
addiu $a $a j
optimize to
addiu $a $a i+j
Example:
addiu $a $b 0
optimize to
move $a $b
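Rewrites like the three examples above are commonly called peephole optimizations: a small window slides over the assembly and replaces short instruction patterns with cheaper ones. A minimal sketch follows, assuming instructions are encoded as tuples such as ("move", "$a", "$b") and ("addiu", "$a", "$b", 3). The encoding and the name `peephole` are hypothetical, and the sketch does not check that a combined immediate still fits the instruction's immediate field.

```python
def peephole(code):
    """Apply the three rewrite rules repeatedly until nothing changes."""
    code = list(code)
    changed = True
    while changed:
        changed = False
        out = []
        i = 0
        while i < len(code):
            cur = code[i]
            nxt = code[i + 1] if i + 1 < len(code) else None
            # move $a $b ; move $b $a  ->  move $a $b
            if (nxt is not None and cur[0] == "move" and nxt[0] == "move"
                    and cur[1] == nxt[2] and cur[2] == nxt[1]):
                out.append(cur)
                i += 2
                changed = True
            # addiu $a $a i ; addiu $a $a j  ->  addiu $a $a i+j
            elif (nxt is not None and cur[0] == "addiu" and nxt[0] == "addiu"
                    and cur[1] == cur[2] == nxt[1] == nxt[2]):
                out.append(("addiu", cur[1], cur[1], cur[3] + nxt[3]))
                i += 2
                changed = True
            # addiu $a $b 0  ->  move $a $b
            elif cur[0] == "addiu" and cur[3] == 0:
                out.append(("move", cur[1], cur[2]))
                i += 1
                changed = True
            else:
                out.append(cur)
                i += 1
        code = out
    return code


print(peephole([("addiu", "$a", "$a", 1), ("addiu", "$a", "$a", 2)]))
# [('addiu', '$a', '$a', 3)]
print(peephole([("addiu", "$a", "$a", 2), ("addiu", "$a", "$a", -2)]))
# [('move', '$a', '$a')]
```

The second call needs two passes: merging the additions produces an addiu with immediate 0, which only then matches the third rule. This is a small instance of one optimization enabling another.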
In general, it is impossible for a compiler to produce the optimal code for a given program.