# Lecture 008

## Handle

How do we decide when to shift and when to reduce?

Consider the input int * int + int, with production rules

E -> T + E | T

T -> int * T | int | (E)

If we reduce greedily, we reduce int to T at int | * int + int and reach T | * int + int. Now we are stuck: no production has a right-hand side in which T is followed by *, so no sequence of moves can reduce this stack to E.

Handle: a handle is a sequence of symbols on top of the stack such that reducing it is never a mistake (i.e. after the reduction, the parse can still get back to E).

• There is no known efficient algorithm for recognizing handles in general

• On some context-free grammars, there are heuristics that always guess the handle correctly

Note that handles only appear at the top of the stack, never inside.
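To make the "stuck" claim from the example above concrete, here is a minimal sketch that brute-forces every possible sequence of shift and reduce moves. The function name `can_finish` and the tuple encoding of productions are assumptions made for illustration; a real parser never searches like this.

```python
# Grammar from the example: E -> T + E | T ;  T -> int * T | int | (E)
PRODUCTIONS = [
    ("E", ("T", "+", "E")),
    ("E", ("T",)),
    ("T", ("int", "*", "T")),
    ("T", ("int",)),
    ("T", ("(", "E", ")")),
]

def can_finish(stack, rest):
    """Try every shift/reduce choice; True if some sequence reaches E with no input left."""
    if stack == ["E"] and not rest:
        return True
    # Try every reduction whose right-hand side sits on top of the stack.
    for lhs, rhs in PRODUCTIONS:
        n = len(rhs)
        if len(stack) >= n and tuple(stack[-n:]) == rhs:
            if can_finish(stack[:-n] + [lhs], rest):
                return True
    # Otherwise try shifting the next token, if there is one.
    return bool(rest) and can_finish(stack + [rest[0]], rest[1:])

tokens = ["int", "*", "int", "+", "int"]
print(can_finish([], tokens))           # the input as a whole is parseable
print(can_finish(["int"], tokens[1:]))  # int | * int + int: still fine if we shift next
print(can_finish(["T"], tokens[1:]))    # T | * int + int: the greedy reduction is a dead end
```

The last call returns `False`: once int has been reduced to T in front of *, no sequence of moves recovers, which is exactly why that int was not a handle.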

### Grammar

Grammar classes, from most general to most restrictive:

• All context-free grammars (CFG)

• Unambiguous CFGs

• LR(k) CFGs: deterministic

• LALR: a simplification of LR

• SLR (Simple LR): the class we care about

### Prefix

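Viable Prefix: $\alpha$ is a viable prefix if there is some $\omega$ such that $\alpha | \omega$ is a valid state of the shift-reduce parser. A viable prefix never extends past the right end of the handle; in that sense it is a prefix of the handle.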

For any grammar, the set of viable prefixes is a regular language. We can show this by constructing a DFA that recognizes the viable prefixes.

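Item (LR(0) item): an item is a production with a dot placed somewhere in its right-hand side. A production whose right-hand side has $n$ symbols has $n+1$ items, one for each position of the dot.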

Example: the items for $T \to (E)$ are:

T -> .(E)

T -> (.E)

T -> (E.)

T -> (E).

Example: the only item for $T \to \epsilon$ is $T \to .$
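The dot-placement rule can be sketched in a few lines. The helper name `lr0_items` and the string rendering of items are assumptions chosen for illustration.

```python
def lr0_items(lhs, rhs):
    """Return every LR(0) item of the production lhs -> rhs, one per dot position."""
    return [
        f"{lhs} -> {''.join(rhs[:i])}.{''.join(rhs[i:])}"
        for i in range(len(rhs) + 1)  # n symbols give n + 1 dot positions
    ]

print(lr0_items("T", ["(", "E", ")"]))  # ['T -> .(E)', 'T -> (.E)', 'T -> (E.)', 'T -> (E).']
print(lr0_items("T", []))               # ['T -> .']
```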

The items of a production describe the possible states of a parser that can eventually use that production.

Example: the item $T \to (E.)$ says that the current state of the parser is (E|) and that we could use the production rule $T \to (E)$, so we hope to see $)$ next.

Structure of the stack: the stack contains a sequence of prefixes of right-hand sides of production rules:

$$\text{Prefix}_1 \text{Prefix}_2 \dots \text{Prefix}_n | \dots$$

Observe:

• $\text{Prefix}_i$ is prefix of $X_i \to \alpha_i$

• $\text{Prefix}_i$ will eventually reduce to $X_i$ (if no error)

• The $X_i$ obtained by reducing $\text{Prefix}_i$ will eventually combine with $\text{Prefix}_{i-1}$ to form a longer prefix of $\alpha_{i - 1}$ (i.e. there is a production $X_{i - 1} \to \text{Prefix}_{i - 1} X_i \beta$ for some $\beta$)

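• by induction, $\text{Prefix}_{k+1} \dots \text{Prefix}_n$ will eventually reduce to the missing part of $\alpha_k$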

### Recognizing Handles

To recognize handles, we first recognize viable prefixes: we build an NFA that tries all possible routes and feed it our current stack. The NFA accepts if and only if our current stack is a viable prefix. (similar to DPLab in 15210)

Here are the steps:

1. We add a dummy production $S' \to S$ to the set of all productions $G$
2. The NFA states are the LR(0) items of $G$ (including the items of the added production)
3. For each item $E \to \alpha . X \beta$ (where $X$ is any terminal or non-terminal symbol), we add the transition $(E \to \alpha . X \beta) \to^X (E \to \alpha X . \beta)$ (this represents continuing to check whether the rest of the production can be matched once $X$ is on the stack)
4. For each item $E \to \alpha . X \beta$ (where $X$ is any non-terminal symbol) and each production $X \to \gamma$, we add the $\epsilon$-transition $(E \to \alpha . X \beta) \to^\epsilon (X \to .\gamma)$ (this represents going on to check whether $X$ itself can be matched)
5. Every state is an accepting state (the NFA can still reject by getting stuck on a symbol with no transition)
6. The start state is $S' \to .S$
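The steps above can be sketched as a direct NFA simulation. This is a minimal sketch, assuming the example grammar from the Handle section with $E$ as its start symbol (so the dummy production is $S' \to E$); the names `GRAMMAR`, `moves`, `eps_closure` and `is_viable_prefix` are hypothetical.

```python
# Augmented grammar (step 1): S' -> E ; E -> T + E | T ; T -> int * T | int | (E)
GRAMMAR = {
    "S'": [("E",)],
    "E": [("T", "+", "E"), ("T",)],
    "T": [("int", "*", "T"), ("int",), ("(", "E", ")")],
}

# Step 2: an NFA state is an item, encoded as (lhs, rhs, dot position).
def moves(item):
    """Yield (label, target) edges leaving one item; label None means an epsilon edge."""
    lhs, rhs, dot = item
    if dot < len(rhs):
        x = rhs[dot]
        yield x, (lhs, rhs, dot + 1)      # step 3: move the dot over X
        for gamma in GRAMMAR.get(x, []):  # step 4: only non-terminals have productions
            yield None, (x, gamma, 0)

def eps_closure(items):
    """All items reachable from `items` using epsilon edges only."""
    seen, todo = set(items), list(items)
    while todo:
        for label, target in moves(todo.pop()):
            if label is None and target not in seen:
                seen.add(target)
                todo.append(target)
    return seen

def is_viable_prefix(stack):
    """Run the NFA on the stack; every state accepts (step 5), so any live state means accept."""
    current = eps_closure({("S'", ("E",), 0)})  # step 6: start state
    for symbol in stack:
        current = eps_closure(
            {target for item in current for label, target in moves(item) if label == symbol}
        )
        if not current:
            return False  # stuck: no transition on this symbol
    return True

print(is_viable_prefix(["int", "*"]))  # True
print(is_viable_prefix(["T", "*"]))    # False
```

Tracking the set of live items at each step is the subset construction done lazily, which is also how the DFA for viable prefixes is obtained from this NFA.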

Valid Item: item $X \to \beta . \gamma$ is valid for a viable prefix $\alpha \beta$ if we have

$$S' \to^* \alpha X \omega \to \alpha \beta \gamma \omega$$

In other words, after seeing $\alpha \beta$ on the stack, the production $X \to \beta \gamma$ can still be used later, provided the remaining tokens in the stream supply something that reduces to $\gamma$.

Note that the DFA, run on a viable prefix, terminates in a state that contains exactly the items valid for that prefix.

Example: item $T \to (.E)$ is valid for (, ((, (((, ...
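The example can be checked with the DFA form of the automaton. This is a minimal sketch, assuming the example grammar with $S' \to E$ as the dummy production; `closure`, `goto` and `valid_items` are names chosen here, and items are encoded as `(lhs, rhs, dot)` tuples.

```python
GRAMMAR = {
    "S'": [("E",)],
    "E": [("T", "+", "E"), ("T",)],
    "T": [("int", "*", "T"), ("int",), ("(", "E", ")")],
}

def closure(items):
    """Add X -> .gamma for every non-terminal X that appears right after a dot."""
    items = set(items)
    changed = True
    while changed:
        changed = False
        for lhs, rhs, dot in list(items):
            if dot < len(rhs) and rhs[dot] in GRAMMAR:
                for gamma in GRAMMAR[rhs[dot]]:
                    new_item = (rhs[dot], gamma, 0)
                    if new_item not in items:
                        items.add(new_item)
                        changed = True
    return frozenset(items)

def goto(items, symbol):
    """DFA transition: move the dot over `symbol` wherever possible, then close."""
    return closure({(l, r, d + 1) for l, r, d in items if d < len(r) and r[d] == symbol})

def valid_items(prefix):
    """Items in the DFA state reached on `prefix` (empty if it is not a viable prefix)."""
    state = closure({("S'", ("E",), 0)})
    for symbol in prefix:
        state = goto(state, symbol)
    return state

item = ("T", ("(", "E", ")"), 1)  # the item T -> (.E)
for k in (1, 2, 3):
    print(k, item in valid_items(["("] * k))  # True for each k
```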

## SLR Parsing

LR(0) parsing: at state $\alpha | t$ (stack $\alpha$, next input token $t$)

• reduce by $X \to \beta$: if the DFA, run on $\alpha$, terminates in a state containing $X \to \beta.$

• shift: if the DFA, run on $\alpha$, terminates in a state containing $X \to \beta.t\omega$

However, there might be a reduce/reduce conflict or shift/reduce conflict.

For example, if a state contains both

E -> T. // this tells us to reduce T to E

E -> T. + E // this tells us to shift +

then there is a shift/reduce conflict when the next input is +.

We fix the issue by adding one more condition to the reduce rule.

SLR parsing: at state $\alpha | t$ (stack $\alpha$, next input token $t$)

• reduce by $X \to \beta$: if the DFA, run on $\alpha$, terminates in a state containing $X \to \beta.$ and $t \in \text{Follow}(X)$

• shift: if the DFA, run on $\alpha$, terminates in a state containing $X \to \beta.t\omega$

If there are still conflicts, then the grammar is not SLR.
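To see how the Follow condition removes the conflict between E -> T. and E -> T. + E, here is a minimal sketch that computes Follow sets for the example grammar. It assumes the grammar has no $\epsilon$-productions, which keeps First and Follow simple; the function names are hypothetical.

```python
GRAMMAR = {
    "S'": [("E",)],
    "E": [("T", "+", "E"), ("T",)],
    "T": [("int", "*", "T"), ("int",), ("(", "E", ")")],
}

def first_sets():
    """First(X) for each non-terminal; valid only because no production is empty."""
    first = {nt: set() for nt in GRAMMAR}
    changed = True
    while changed:
        changed = False
        for lhs, rhss in GRAMMAR.items():
            for rhs in rhss:
                head = rhs[0]
                new = first[head] if head in GRAMMAR else {head}
                if not new <= first[lhs]:
                    first[lhs] |= new
                    changed = True
    return first

def follow_sets():
    """Follow(X) for each non-terminal, with $ marking the end of input."""
    first = first_sets()
    follow = {nt: set() for nt in GRAMMAR}
    follow["S'"].add("$")
    changed = True
    while changed:
        changed = False
        for lhs, rhss in GRAMMAR.items():
            for rhs in rhss:
                for i, sym in enumerate(rhs):
                    if sym not in GRAMMAR:
                        continue  # Follow is only defined for non-terminals
                    if i + 1 < len(rhs):
                        nxt = rhs[i + 1]
                        new = first[nxt] if nxt in GRAMMAR else {nxt}
                    else:
                        new = follow[lhs]  # sym ends the production
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True
    return follow

follow = follow_sets()
print(sorted(follow["E"]))   # ['$', ')']
print(sorted(follow["T"]))   # ['$', ')', '+']
print("+" in follow["E"])    # False
```

Since + is not in Follow(E), the SLR rule never reduces by E -> T when the next token is +, so only the shift applies in that state.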

You can see a shift/reduce conflict in operator precedence: take E * E | + E as an example

• we can reduce using E -> E * E.

• we can shift using E -> E. + E

• Declaring precedence is really conflict resolution (here: choose the reduce), but for more complicated grammars, conflict resolution may not behave like precedence

SLR Parsing algorithm:

1. Let $M$ be the DFA for viable prefixes
2. Let $| x_1 \dots x_n \$$ be the initial configuration
3. Repeat until the configuration is $S | \$$
   1. Let $\alpha | \omega$ be the current configuration
   2. Run $M$ on the current stack $\alpha$
   3. (We don't need to check whether $M$ rejects, since that case is covered below)
   4. $M$ ends at a state with items $I$; let $a$ be the next input
      1. Shift if $X \to \beta . a \gamma \in I$
      2. Reduce if $X \to \beta . \in I \land a \in \text{Follow}(X)$
      3. Report a parsing error if neither applies (this includes the situation where $\alpha$ is not a viable prefix)
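The loop above can be sketched as follows. This is a minimal, unoptimized sketch for the example grammar only: it reruns the DFA on the whole stack at every step, as the algorithm is stated, and it assumes the Follow sets below (worked out by hand for this grammar). All names are hypothetical.

```python
GRAMMAR = {
    "S'": [("E",)],
    "E": [("T", "+", "E"), ("T",)],
    "T": [("int", "*", "T"), ("int",), ("(", "E", ")")],
}
# Assumption: Follow sets computed by hand for this grammar.
FOLLOW = {"S'": {"$"}, "E": {"$", ")"}, "T": {"+", "$", ")"}}

def closure(items):
    """Add X -> .gamma for every non-terminal X that appears right after a dot."""
    items = set(items)
    changed = True
    while changed:
        changed = False
        for lhs, rhs, dot in list(items):
            if dot < len(rhs) and rhs[dot] in GRAMMAR:
                for gamma in GRAMMAR[rhs[dot]]:
                    new_item = (rhs[dot], gamma, 0)
                    if new_item not in items:
                        items.add(new_item)
                        changed = True
    return frozenset(items)

def run_dfa(stack):
    """Run the viable-prefix DFA M on the stack and return the items of the final state."""
    state = closure({("S'", ("E",), 0)})
    for symbol in stack:
        state = closure({(l, r, d + 1) for l, r, d in state if d < len(r) and r[d] == symbol})
    return state

def slr_parse(tokens):
    """Return the reductions performed, in order, or raise SyntaxError."""
    stack, rest, reductions = [], list(tokens) + ["$"], []
    while not (stack == ["E"] and rest == ["$"]):
        items, a = run_dfa(stack), rest[0]
        shift = any(d < len(r) and r[d] == a for _, r, d in items)
        reduces = [(l, r) for l, r, d in items
                   if d == len(r) and l != "S'" and a in FOLLOW[l]]
        if shift and not reduces:
            stack.append(rest.pop(0))
        elif len(reduces) == 1 and not shift:
            lhs, rhs = reduces[0]
            del stack[len(stack) - len(rhs):]  # pop the handle
            stack.append(lhs)                  # push the left-hand side
            reductions.append(f"{lhs} -> {' '.join(rhs)}")
        else:
            # Neither move applies (or both do): error, including non-viable stacks.
            raise SyntaxError(f"parse error at {stack} | {rest}")
    return reductions

print(slr_parse(["int", "*", "int", "+", "int"]))
# ['T -> int', 'T -> int * T', 'T -> int', 'E -> T', 'E -> T + E']
```

Rerunning the DFA from the bottom of the stack at every step is wasteful; practical parsers store the DFA state alongside each stack symbol instead, but the version here mirrors the steps as listed.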

LR(1) is more powerful than SLR since the lookahead is built into the item. E.g. [T -> . int * T, $] is an item with $ as its lookahead (the lookahead can be any terminal or $). This allows a more fine-grained lookahead than the entire Follow set.