# Lecture 014

## Program Optimization

Optimization goals:

• execution time (the usual goal)

• code size

• network messages sent

• memory usage

• disk access

• ...

### Intermediate Code

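Intermediate Language (Intermediate Representation, IR): a language between the source and the target that exposes low-level details such as registers, which makes better code optimization possible. For traditional programming languages, the IR is often a kind of high-level assembly. A compiler may use several IRs in succession.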

• Three-Address Code: each instruction performs a single operation, either binary (x := y op z) or unary (x := op y)

• each operand is either a register (variable) or a constant
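To make the shape of three-address code concrete, here is a minimal sketch that lowers an arithmetic expression into instructions with at most one operator each. It borrows Python's own `ast` module as the front end. The function name `to_three_address`, the temporaries `t1`, `t2`, ... and the `:=` output syntax are assumptions made for illustration, not part of the lecture.

```python
import ast
import itertools

def to_three_address(expr: str) -> list[str]:
    """Lower an arithmetic expression into three-address instructions."""
    counter = itertools.count(1)
    code: list[str] = []
    ops = {ast.Add: "+", ast.Sub: "-", ast.Mult: "*", ast.Div: "/"}

    def lower(node: ast.expr) -> str:
        # Names and constants are already valid operands.
        if isinstance(node, ast.Name):
            return node.id
        if isinstance(node, ast.Constant):
            return str(node.value)
        # A binary operation becomes one instruction writing a fresh temporary.
        if isinstance(node, ast.BinOp) and type(node.op) in ops:
            left, right = lower(node.left), lower(node.right)
            temp = f"t{next(counter)}"
            code.append(f"{temp} := {left} {ops[type(node.op)]} {right}")
            return temp
        # Unary minus is the only unary operation handled in this sketch.
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            operand = lower(node.operand)
            temp = f"t{next(counter)}"
            code.append(f"{temp} := - {operand}")
            return temp
        raise ValueError(f"unsupported expression: {ast.dump(node)}")

    lower(ast.parse(expr, mode="eval").body)
    return code

for line in to_three_address("x + y * z"):
    print(line)
```

Every subexpression gets its own name, which is what later passes such as common subexpression elimination rely on.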

Where should we optimize the code?

• AST:
  • Pro: machine independent
  • Con: too high level
• Assembly:
  • Pro: exposes optimization opportunities
  • Con: machine dependent; optimizations must be reimplemented when retargeting
• Intermediate Representation (IR):
  • Pro: machine independent
  • Pro: exposes optimization opportunities

To simplify analysis, we define the Basic Block: once execution enters a Basic Block, every instruction in it is guaranteed to execute, in order, so its behavior is predictable.

Basic Block: a maximal sequence of instructions with no labels (except possibly at the first instruction) and no jumps (except possibly as the last instruction).

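Control-Flow Graph: a directed graph whose nodes are Basic Blocks and whose edges are the possible transfers of control. There is an edge from block A to block B if execution can pass from the last instruction of A to the first instruction of B, either by a jump or by falling through.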
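The two definitions above can be sketched in a few lines. This is a minimal illustration, not a production algorithm: the instruction syntax (`L:` for a label, `jump L`, `if ... goto L`) and the function names are assumptions, and every label is assumed to be a possible jump target.

```python
def is_jump(ins: str) -> bool:
    # Hypothetical syntax: "jump L" or "if <cond> goto L".
    return ins.startswith("jump ") or ins.startswith("if ")

def split_basic_blocks(instrs: list[str]) -> list[list[str]]:
    """Partition instructions into maximal label-free, jump-free runs."""
    blocks, current = [], []
    for ins in instrs:
        # A label may only appear as the first instruction of a block.
        if ins.endswith(":") and current:
            blocks.append(current)
            current = []
        current.append(ins)
        # A jump may only appear as the last instruction of a block.
        if is_jump(ins):
            blocks.append(current)
            current = []
    if current:
        blocks.append(current)
    return blocks

def build_cfg(blocks: list[list[str]]) -> set[tuple[int, int]]:
    """Return edges (i, j) meaning control can pass from block i to block j."""
    label_to_block = {b[0][:-1]: i for i, b in enumerate(blocks) if b[0].endswith(":")}
    edges = set()
    for i, block in enumerate(blocks):
        last = block[-1]
        if last.startswith("jump "):
            edges.add((i, label_to_block[last.split()[1]]))
            continue  # an unconditional jump never falls through
        if last.startswith("if "):
            edges.add((i, label_to_block[last.split()[-1]]))
        if i + 1 < len(blocks):
            edges.add((i, i + 1))  # fall through to the next block
    return edges

program = [
    "x := 1",
    "i := 1",
    "L:",
    "x := x * x",
    "i := i + 1",
    "if i < 10 goto L",
    "y := x",
]
blocks = split_basic_blocks(program)
print(blocks)
print(sorted(build_cfg(blocks)))
```

The loop body becomes a single block with an edge back to itself, which is exactly the structure a global optimization has to reason about and a local optimization can ignore.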

Optimization Levels

• Local Optimization: within a single Basic Block

• Global Optimization: within a control-flow graph (within method body)

• Inter-procedural Optimization: between different methods

Most compilers do Local Optimization, many do Global Optimization, and only a few do Inter-procedural Optimization.

Also, in practice, a conscious decision is often made not to implement the fanciest optimization known. Reasons:

• hard to implement

• costly in compilation time

• low payoff

### Local Optimization

Possible optimizations:

• delete no-op statements (x = x)

• simplify arithmetic expressions (x = y ** 2 becomes x = y * y)

• pre-compute constant expressions (x = 1 + 2 becomes x = 3)

• eliminate unreachable code (if DEBUG then ..., unused library code, or code made unreachable by earlier optimizations)

• rewrite intermediate code in single assignment form
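As a rough sketch of the first two items in the list, the function below simplifies a single instruction. The tuple encoding `(dest, op, a, b)`, the `"copy"` opcode for `dest := a`, and the name `simplify` are assumptions chosen for illustration. The rewrites are only claimed to be safe for integer operands.

```python
def simplify(instr):
    """Return a cheaper equivalent of (dest, op, a, b), or None if it is a no-op."""
    dest, op, a, b = instr
    if op == "**" and b == 2:
        # x := y ** 2  ->  x := y * y  (multiplication is cheaper than a power)
        return (dest, "*", a, a)
    if (op == "+" and b == 0) or (op == "*" and b == 1):
        # x := y + 0  and  x := y * 1  both just copy y
        op, b = "copy", None
    if op == "copy" and a == dest:
        # x := x does nothing, so the statement can be deleted
        return None
    return (dest, op, a, b)

print(simplify(("x", "**", "y", 2)))
print(simplify(("x", "*", "y", 1)))
print(simplify(("x", "+", "x", 0)))
```

Returning `None` for a deleted statement keeps the function a pure one-instruction rewrite; a caller would filter those out when rebuilding the block.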

Constant Folding: evaluate an operation whose operands are all constants at compile time and replace it with the resulting constant.

• it can be quite dangerous to do

• imagine you are running the compiler on x86-64 and compiling for ARM (cross-compilation),

• but ARM might implement floating-point arithmetic differently

• so folding floating-point constants on the host might not produce the result the target machine would compute

• one solution is to keep floating-point constants as strings and simulate the target machine's arithmetic at compile time
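A conservative way to act on this warning is to fold only the cases where the compiler can reproduce the target's arithmetic exactly. The sketch below folds integer operations and deliberately leaves floating-point operations alone. The tuple encoding `(dest, op, a, b)`, the `"const"` opcode, and the assumption that the target uses 32-bit two's-complement integers are all hypothetical choices for this example.

```python
import operator

FOLDABLE = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def wrap32(value: int) -> int:
    """Reduce a Python int to a signed 32-bit two's-complement value."""
    value &= 0xFFFF_FFFF
    return value - 0x1_0000_0000 if value & 0x8000_0000 else value

def fold_constants(block):
    """Replace integer constant operations with their precomputed result."""
    folded = []
    for dest, op, a, b in block:
        # Only exact ints are folded; floats are skipped because the host's
        # floating-point results may differ from the target's.
        if op in FOLDABLE and type(a) is int and type(b) is int:
            value = wrap32(FOLDABLE[op](a, b))  # simulate the assumed target width
            folded.append((dest, "const", value, None))
        else:
            folded.append((dest, op, a, b))
    return folded

block = [
    ("x", "+", 1, 2),
    ("y", "+", 1.5, 2.5),
    ("z", "+", 2147483647, 1),
    ("w", "*", "x", 2),
]
print(fold_constants(block))
```

The `wrap32` step is the integer analogue of the floating-point concern: Python integers never overflow, so the folder has to simulate the target's behavior rather than trust the host's.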

#### Common Subexpression Elimination

Single assignment form: a variable x is in single assignment form if it is assigned only once in its lifetime. A Basic Block is in single assignment form if every variable in it is assigned at most once.

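If a Basic Block is in single assignment form, then x := ... is the only assignment to x, and none of its operands can change afterwards. Two assignments with the same right-hand side therefore compute the same value, and we can do the following optimization: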

x := y + z
...
w := y + z

optimize to

x := y + z
...
w := x
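The rewrite above can be sketched as a single forward pass over a block. This sketch assumes instructions are encoded as tuples `(dest, op, a, b)` with a `"copy"` opcode for `dest := a`, and that the block is already in single assignment form; without that property the table of available expressions would have to be invalidated on every reassignment.

```python
def eliminate_common_subexpressions(block):
    """Replace a repeated right-hand side with a copy of its first result."""
    available = {}  # (op, a, b) -> variable that already holds that value
    result = []
    for dest, op, a, b in block:
        key = (op, a, b)
        if op != "copy" and key in available:
            # Same right-hand side seen before: reuse the earlier variable.
            result.append((dest, "copy", available[key], None))
        else:
            available[key] = dest
            result.append((dest, op, a, b))
    return result

block = [
    ("x", "+", "y", "z"),
    ("v", "*", "x", 2),
    ("w", "+", "y", "z"),
]
print(eliminate_common_subexpressions(block))
```

The pass only matches syntactically identical right-hand sides, so `y + z` and `z + y` are treated as different expressions here.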


#### Copy Propagation

Assuming single assignment form, if a := b appears in a block, every later use of a can be replaced with b:

b := z + y
a := b
x := 2 * a

optimize to

b := z + y
a := b     // if a is not used anywhere else, this statement is dead and can be deleted
x := 2 * b

optimize to

b := z + y
x := 2 * b
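Both steps of this example can be sketched as two small passes over a block. The tuple encoding `(dest, op, a, b)`, the `"copy"` opcode, and the `live_out` parameter (the set of variables still needed after the block) are assumptions for illustration; the block is assumed to be in single assignment form.

```python
def propagate_copies(block):
    """After dest := src, rewrite later uses of dest to use src directly."""
    copies = {}  # variable -> the variable it is a copy of
    result = []
    for dest, op, a, b in block:
        a = copies.get(a, a)
        b = copies.get(b, b)
        if op == "copy":
            copies[dest] = a
        result.append((dest, op, a, b))
    return result

def eliminate_dead_code(block, live_out):
    """Drop assignments whose result is never used (backward liveness scan)."""
    live = set(live_out)
    kept = []
    for dest, op, a, b in reversed(block):
        if dest in live:
            kept.append((dest, op, a, b))
            live.discard(dest)
            # Operands of a kept instruction become live; constants are skipped.
            live.update(x for x in (a, b) if isinstance(x, str))
    kept.reverse()
    return kept

block = [
    ("b", "+", "z", "y"),
    ("a", "copy", "b", None),
    ("x", "*", 2, "a"),
]
propagated = propagate_copies(block)
print(propagated)
print(eliminate_dead_code(propagated, live_out={"x"}))
```

Copy propagation alone removes nothing; it only makes `a := b` dead so that the second pass can delete it.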


Often, an optimization that does not speed things up by itself enables more optimization opportunities down the line. So optimizations are applied repeatedly until no further improvement is possible.
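One simple way to organize the repetition is a driver that reruns a list of passes until the block stops changing. The sketch below uses two toy passes over `(dest, op, a, b)` tuples to show how each pass creates work for the other. The names `run_to_fixpoint`, `fold_additions`, and `propagate_constants`, the `"const"` opcode, and the `max_rounds` safety limit are all assumptions for this example.

```python
def fold_additions(block):
    """dest := 1 + 2  ->  dest := const 3 (integer operands only)."""
    return [
        (d, "const", a + b, None)
        if op == "+" and isinstance(a, int) and isinstance(b, int)
        else (d, op, a, b)
        for d, op, a, b in block
    ]

def propagate_constants(block):
    """Replace uses of variables known to hold a constant with that constant."""
    known, out = {}, []
    for dest, op, a, b in block:
        a, b = known.get(a, a), known.get(b, b)
        if op == "const":
            known[dest] = a
        out.append((dest, op, a, b))
    return out

def run_to_fixpoint(block, passes, max_rounds=100):
    """Apply every pass in order, repeating until a full round changes nothing."""
    for _ in range(max_rounds):
        before = block
        for optimize in passes:
            block = optimize(block)
        if block == before:
            break
    return block

block = [
    ("a", "+", 1, 2),
    ("b", "+", "a", 3),
    ("c", "+", "b", 4),
]
print(run_to_fixpoint(block, [fold_additions, propagate_constants]))
```

A single round only folds `a`; it takes further rounds before `b` and then `c` become constant, which is the enabling effect described above.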

### Peephole Optimization

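Peephole optimization works directly on assembly code: it examines a short sequence of (usually contiguous) instructions, the "peephole", and replaces it with an equivalent but faster sequence.

Example: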

move $a $b
move $b $a

optimize to

move $a $b


Example:

addiu $a $a i
addiu $a $a j

optimize to

addiu $a $a i+j


Example:

addiu $a $b 0

optimize to

move $a $b
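The three rewrites above can be sketched as one pass with a two-instruction window. Instructions are encoded as tuples such as `("move", "$a", "$b")` and `("addiu", "$a", "$b", 4)`, which is an assumption made for illustration. The sketch also assumes that the second instruction of each pair is not a jump target and that the combined immediate `i+j` still fits in the instruction's immediate field; a real peephole optimizer would have to check both.

```python
def peephole(instrs):
    """Apply the three example rewrites in a single left-to-right pass."""
    out = []
    for ins in instrs:
        # addiu $a $b 0  ->  move $a $b
        if ins[0] == "addiu" and ins[3] == 0:
            ins = ("move", ins[1], ins[2])
        prev = out[-1] if out else None
        # move $a $b ; move $b $a  ->  move $a $b
        if prev and prev[0] == "move" and ins == ("move", prev[2], prev[1]):
            continue
        # addiu $a $a i ; addiu $a $a j  ->  addiu $a $a i+j
        if (prev and prev[0] == "addiu" and ins[0] == "addiu"
                and prev[1] == prev[2] == ins[1] == ins[2]):
            out[-1] = ("addiu", ins[1], ins[2], prev[3] + ins[3])
            continue
        out.append(ins)
    return out

code = [
    ("move", "$a", "$b"),
    ("move", "$b", "$a"),
    ("addiu", "$t", "$t", 3),
    ("addiu", "$t", "$t", 4),
    ("addiu", "$c", "$d", 0),
]
print(peephole(code))
```

Because one rewrite can expose another (for example, two immediates that sum to zero), a pass like this is typically rerun until the code stops changing.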


In general, it is impossible for a compiler to produce the best possible code for a given program, so optimizers aim to improve code rather than to make it optimal.
