The First set of a non-terminal contains the First set of the first symbol on the right-hand side of each of its productions:
First(E) contains First(T)
First(T) contains First(() = {(} and First(int) = {int}, so First(T) = {(, int}
Since First(T) does not contain eps, we do not add First(X) to First(E).
So First(E) = First(T) = {(, int}
First(X) = {+, eps}
First(Y) = {*, eps}
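The First sets above can be reproduced with a small fixed-point iteration. The sketch below is only illustrative: the grammar is not shown in this section, so `GRAMMAR` is an assumption chosen to be consistent with the First sets listed above (E -> T X, X -> + E | eps, T -> ( E ) Y | int Y, Y -> * T | eps), and the names `first_of_string` and `compute_first` are hypothetical helpers.

```python
EPS = "eps"

# Assumed grammar (not given in this section); [] stands for the eps production.
GRAMMAR = {
    "E": [["T", "X"]],
    "X": [["+", "E"], []],
    "T": [["(", "E", ")", "Y"], ["int", "Y"]],
    "Y": [["*", "T"], []],
}

def first_of_string(symbols, first):
    """First set of a sequence of grammar symbols."""
    result = set()
    for sym in symbols:
        # A terminal's First set is just the terminal itself.
        sym_first = first[sym] if sym in GRAMMAR else {sym}
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result  # this symbol cannot vanish, so stop here
    result.add(EPS)  # every symbol can derive eps (or the string is empty)
    return result

def compute_first():
    """Repeat the containment rules until no First set grows."""
    first = {nt: set() for nt in GRAMMAR}
    changed = True
    while changed:
        changed = False
        for nt, productions in GRAMMAR.items():
            for rhs in productions:
                new = first_of_string(rhs, first)
                if not new <= first[nt]:
                    first[nt] |= new
                    changed = True
    return first

FIRST = compute_first()
for nt in GRAMMAR:
    print(f"First({nt}) = {sorted(FIRST[nt])}")
```

Under the assumed grammar, First(X) is never added to First(E) because the loop in `first_of_string` returns as soon as it reaches a symbol (here T) whose First set lacks eps.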
Follow Set
Follow set: the set of terminals that can appear immediately after a symbol X in some derivation from the start symbol S.
\text{Follow}(X) = \{t | S \to^* \beta X t \delta\}
Observation:
\text{First}(B) - \{\epsilon\} \subseteq \text{Follow}(A) \land \text{Follow}(X) \subseteq \text{Follow}(B) if X \to AB
\text{Follow}(X) \subseteq \text{Follow}(A) if X \to AB \land B \to^* \epsilon
\$ \in \text{Follow}(S) if S is the start symbol.
Algorithm Sketch
\$ \in \text{Follow}(S)
\text{First}(\beta) - \{\epsilon\} \subseteq \text{Follow}(X) for each production A \to \alpha X \beta
\text{Follow}(A) \subseteq \text{Follow}(X) for each production A \to \alpha X \beta where \epsilon \in \text{First}(\beta)
Follow Set Example
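As a worked illustration of the algorithm sketch, the three rules can be iterated to a fixed point. This is a minimal sketch under stated assumptions: `GRAMMAR` is the same assumed grammar that matches the First sets computed earlier (it is not given in this section), `FIRST` is copied from those results, and `first_of_string` and `compute_follow` are hypothetical helper names.

```python
EPS, END = "eps", "$"

# Assumed grammar (not given in this section); [] stands for the eps production.
GRAMMAR = {
    "E": [["T", "X"]],
    "X": [["+", "E"], []],
    "T": [["(", "E", ")", "Y"], ["int", "Y"]],
    "Y": [["*", "T"], []],
}
START = "E"

# First sets of the non-terminals, as listed earlier in the notes.
FIRST = {"E": {"(", "int"}, "T": {"(", "int"}, "X": {"+", EPS}, "Y": {"*", EPS}}

def first_of_string(symbols):
    """First set of a sequence of symbols; a terminal's First set is itself."""
    result = set()
    for sym in symbols:
        sym_first = FIRST.get(sym, {sym})
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    result.add(EPS)
    return result

def compute_follow():
    follow = {nt: set() for nt in GRAMMAR}
    follow[START].add(END)                      # rule 1: $ is in Follow(S)
    changed = True
    while changed:
        changed = False
        for lhs, productions in GRAMMAR.items():
            for rhs in productions:
                for i, sym in enumerate(rhs):
                    if sym not in GRAMMAR:
                        continue                # only non-terminals are tracked here
                    beta_first = first_of_string(rhs[i + 1:])
                    new = beta_first - {EPS}    # rule 2: First(beta) - {eps}
                    if EPS in beta_first:
                        new |= follow[lhs]      # rule 3: beta can vanish
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True
    return follow

FOLLOW = compute_follow()
for nt in GRAMMAR:
    print(f"Follow({nt}) = {sorted(FOLLOW[nt])}")
```

For the assumed grammar this gives Follow(E) = Follow(X) = {), $} and Follow(T) = Follow(Y) = {+, ), $}.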
Parsing Table Construction
For each production rule A \to \alpha, do:
for each terminal t \in \text{First}(\alpha), T[A, t] = \alpha
if \epsilon \in \text{First}(\alpha), for each t \in \text{Follow}(A), T[A, t] = \alpha
if \epsilon \in \text{First}(\alpha) \land \$ \in \text{Follow}(A), T[A, \$] = \alpha
Parsing Table Building Example
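The construction rules can be sketched in a few lines. Everything below is an illustration under assumptions: `GRAMMAR` is an assumed grammar consistent with the First sets from earlier in the notes, `FOLLOW` was worked out by hand for that assumed grammar, and `build_table` is a hypothetical helper. Each cell holds a list of right-hand sides so that a multiply-defined entry would be visible.

```python
EPS, END = "eps", "$"

# Assumed grammar (not given in this section); [] stands for the eps production.
GRAMMAR = {
    "E": [["T", "X"]],
    "X": [["+", "E"], []],
    "T": [["(", "E", ")", "Y"], ["int", "Y"]],
    "Y": [["*", "T"], []],
}
FIRST = {"E": {"(", "int"}, "T": {"(", "int"}, "X": {"+", EPS}, "Y": {"*", EPS}}
# Follow sets derived by hand for the assumed grammar above.
FOLLOW = {
    "E": {")", END}, "X": {")", END},
    "T": {"+", ")", END}, "Y": {"+", ")", END},
}

def first_of_string(symbols):
    """First set of a sequence of symbols; a terminal's First set is itself."""
    result = set()
    for sym in symbols:
        sym_first = FIRST.get(sym, {sym})
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    result.add(EPS)
    return result

def build_table():
    table = {}
    for lhs, productions in GRAMMAR.items():
        for rhs in productions:
            rhs_first = first_of_string(rhs)
            lookahead = rhs_first - {EPS}       # terminals in First(alpha)
            if EPS in rhs_first:
                lookahead |= FOLLOW[lhs]        # Follow(A), which holds $ when applicable
            for t in lookahead:
                table.setdefault((lhs, t), []).append(rhs)
    return table

TABLE = build_table()
for (nt, t), rhss in sorted(TABLE.items()):
    print(f"T[{nt}, {t}] = {[' '.join(r) or EPS for r in rhss]}")
```

Because `FOLLOW` stores `$` alongside the ordinary terminals, the third rule is covered by the same loop as the second. Every cell ends up with exactly one production, which is what makes this assumed grammar LL(1).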
Note that an LL(1) parsing table can only be built for an LL(1) grammar: for any other grammar, some entry of the table ends up multiply defined.
Example of non-LL(1) invalid Parsing Table
The only mechanical way to check whether a grammar is LL(1) is to build the parsing table and verify that no entry is multiply defined. (Quick checks: a grammar that is ambiguous, left-recursive, or not left-factored is not LL(1), among other cases.)
LL(1) grammars are too weak to describe most modern programming languages.
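To make the table-based check concrete, here is a minimal sketch on a hypothetical toy grammar, S -> S a | b, which is left-recursive. The grammar, the hand-computed `FIRST_OF_RHS`, and the helper `find_conflicts` are all assumptions made for illustration. The grammar has no eps productions, so only the First rule of the table construction applies.

```python
from collections import defaultdict

# Hypothetical toy grammar: S -> S a | b  (left-recursive)
PRODUCTIONS = [("S", ("S", "a")), ("S", ("b",))]

# First set of each right-hand side, worked out by hand:
# every string derived from S starts with b, so both alternatives start with b.
FIRST_OF_RHS = {("S", "a"): {"b"}, ("b",): {"b"}}

def find_conflicts(productions, first_of_rhs):
    """Fill T[A, t] from First(alpha) and return the multiply-defined cells."""
    table = defaultdict(list)
    for lhs, rhs in productions:
        for t in first_of_rhs[rhs]:
            table[(lhs, t)].append(rhs)
    return {cell: rhss for cell, rhss in table.items() if len(rhss) > 1}

CONFLICTS = find_conflicts(PRODUCTIONS, FIRST_OF_RHS)
for (nt, t), rhss in CONFLICTS.items():
    print(f"T[{nt}, {t}] is multiply defined: {[' '.join(r) for r in rhss]}")
```

The cell T[S, b] receives both productions, so with one token of lookahead the parser cannot decide which one to use, and the grammar is not LL(1).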
Bottom-Up Parsing
Bottom-up parsing is the preferred method in practice. It is more general than (deterministic) top-down parsing and can be just as efficient.
Bottom-up parsing reduces a string to the start symbol by inverting production rules. (Each such step is a reduction.)
A bottom-up parser traces a rightmost derivation in reverse.
Bottom-up Parse Tree
Note that the derivation being traced is a rightmost one: the rightmost non-terminal is always expanded first (e.g. in T + E, E is expanded before T).
Shift-Reduce Parsing
Consequence of tracing a rightmost derivation: if the current string is \alpha \beta \omega and the next production rule to apply in reverse is X \to \beta, then \omega must be a string of terminals. (X is the rightmost non-terminal in \alpha X \omega, so nothing to its right can be a non-terminal.)
So \omega can simply be the unread part of the input stream of tokens!
Shift: move the next token from the token stream onto the right end of our working string.
Reduce: apply a production rule in reverse at the right end of our working string.
Example of Shift-Reduce Parsing
How do we know when and where to shift and reduce?
Shift-reduce conflict: the parser could legally either shift or reduce at the next step. (Almost expected.)
Reduce-reduce conflict: more than one reduction is possible at the next step, which usually indicates a problem with the grammar.
Shift pushes a terminal onto the stack. Reduce pops zero or more symbols (the right-hand side of a production) off the stack and pushes the non-terminal on its left-hand side.
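The stack mechanics can be sketched as follows. This does not answer the question of when to shift or reduce: the action sequence is chosen by hand. The grammar (E -> T + E | T, T -> int * T | int | ( E )) is an assumption suggested by the T + E example above, and `shift`, `reduce_by`, and `run` are hypothetical helper names.

```python
def shift(stack, tokens):
    """Push the next input token onto the stack."""
    stack.append(tokens.pop(0))

def reduce_by(stack, lhs, rhs):
    """Pop the right-hand side off the top of the stack and push lhs."""
    start = len(stack) - len(rhs)
    assert start >= 0 and stack[start:] == list(rhs), f"top of stack is not {rhs}"
    del stack[start:]          # pops zero symbols when rhs is empty
    stack.append(lhs)

def run(tokens, actions):
    """Replay a hand-chosen list of actions and record 'stack | input' after each."""
    stack, tokens, trace = [], list(tokens), []
    for action in actions:
        if action == "shift":
            shift(stack, tokens)
        else:
            reduce_by(stack, *action)
        trace.append(" ".join(stack) + " | " + " ".join(tokens))
    return stack, tokens, trace

# Hand-picked moves for the input: int * int + int
ACTIONS = [
    "shift", "shift", "shift",
    ("T", ["int"]),              # T -> int
    ("T", ["int", "*", "T"]),    # T -> int * T
    "shift", "shift",
    ("T", ["int"]),              # T -> int
    ("E", ["T"]),                # E -> T
    ("E", ["T", "+", "E"]),      # E -> T + E
]

STACK, REST, TRACE = run(["int", "*", "int", "+", "int"], ACTIONS)
for line in TRACE:
    print(line)
```

The parse succeeds when the input is exhausted and the stack holds only the start symbol. Read from the last line to the first, the trace is a rightmost derivation of the input.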