Lecture 021

Context-Free Grammar & Parsing


  1. ATO: from input to character stream
  2. Lexical analysis: from character stream to token stream (using regular expressions)
  3. Parsing: from token stream to abstract syntax tree
  4. Type checking: from abstract syntax tree to typed abstract syntax tree
  5. Evaluation: from typed abstract syntax tree to value stream
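Step 2 of the pipeline can be sketched as follows. This is a minimal illustration, not the lecture's implementation: the token kinds (`NUM`, `OP`, `SKIP`) and the names `TOKEN_SPEC` and `tokenize` are assumptions made up for a tiny arithmetic language.

```python
import re

# Hypothetical token kinds for a tiny arithmetic language (not from the lecture).
TOKEN_SPEC = [
    ("NUM", r"\d+"),
    ("OP", r"[+\-*/()]"),
    ("SKIP", r"\s+"),
]
# One combined regular expression with a named group per token kind.
TOKEN_RE = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(text):
    """Turn a character stream into a list of (kind, lexeme) tokens."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character {text[pos]!r} at position {pos}")
        if match.lastgroup != "SKIP":  # whitespace is matched but not emitted
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens
```

The resulting token list is what a parser (step 3) would consume to build the abstract syntax tree.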


Questions:

  1. Context-free: is the "context" the state of an automaton?
  2. Can a programming language be implemented by a regular grammar?
  3. Are there any strings that can't be captured by a context-free grammar?


mixture (my own term): a string made of zero or more terminals and zero or more variables


3 ways to think about computation:

  1. automata
  2. language
  3. functions


Ambiguous: a grammar is ambiguous if some string has more than one leftmost derivation
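As a small illustration, take the grammar E -> E + E | n (an assumed example, not one from the lecture). Each parse tree corresponds to exactly one leftmost derivation, so counting parse trees tells us whether a string is derived ambiguously. The function name `count_parse_trees` is hypothetical.

```python
from functools import lru_cache


def count_parse_trees(tokens):
    """Count parse trees of `tokens` under the assumed grammar E -> E + E | n."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count(lo, hi):
        # Number of ways tokens[lo:hi] can be derived from E.
        if hi - lo == 1:
            return 1 if tokens[lo] == "n" else 0
        total = 0
        # Try every "+" as the top-level operator of the rule E -> E + E.
        for mid in range(lo + 1, hi - 1):
            if tokens[mid] == "+":
                total += count(lo, mid) * count(mid + 1, hi)
        return total

    return count(0, len(tokens))
```

For `list("n+n+n")` the count is 2, matching the two groupings (n+n)+n and n+(n+n), so this grammar is ambiguous.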

Another way to tell the language classes apart:

  Regular: {a^n}
  Context-free: {a^n b^n}
  Not context-free: {a^n b^n c^n}

Bottom-up operator-precedence parser: uses a stack to hold pending operators (and another for operands) until they can be applied
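A minimal sketch of the stack idea, evaluating a flat token list directly instead of building a tree. The precedence table, left associativity, and the names `PRECEDENCE`, `APPLY`, and `evaluate` are assumptions for illustration; the lecture does not fix them.

```python
import operator

# Assumed precedence table (higher binds tighter); all operators left-associative.
PRECEDENCE = {"+": 1, "-": 1, "*": 2}
APPLY = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def evaluate(tokens):
    """Evaluate a flat token list such as [1, "+", 2, "*", 3] with two stacks."""
    values, ops = [], []

    def reduce_top():
        # Pop one operator and its two operands, push the result.
        op = ops.pop()
        right = values.pop()
        left = values.pop()
        values.append(APPLY[op](left, right))

    for token in tokens:
        if isinstance(token, int):
            values.append(token)
        else:
            # Reduce while the stacked operator binds at least as tightly.
            while ops and PRECEDENCE[ops[-1]] >= PRECEDENCE[token]:
                reduce_top()
            ops.append(token)
    while ops:
        reduce_top()
    return values.pop()
```

Replacing the arithmetic in `reduce_top` with tree-node construction would turn this evaluator into a parser that produces a syntax tree.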

Grammar and Trees


Parsing Structure


Parsing Function


Parsing Function 2


Computing Theory


computer: takes a stream of input and produces a stream of output, using its internal state


Deterministic Finite Automata (DFA)

Deterministic: every state has exactly one transition for every possible input symbol
Finite: finite number of states
Automata: machines

Q: finite set of states
sigma: finite input alphabet
delta: the transition function (Q * sigma -> Q, where delta is total)
q0: the start state (a member of Q)
F: the set of accepting states (a subset of Q)

computation: the ordered sequence of states an automaton traverses on a given input
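The definitions above can be sketched as a small simulator. The transition function is a dictionary keyed by (state, symbol), and a set of accepting states (named `accepting` here) decides whether the input is accepted. The example machine, which accepts binary strings with an even number of 1s, and the names `run_dfa` and `DELTA` are assumptions for illustration.

```python
def run_dfa(delta, start, accepting, text):
    """Return (trace, accepted), where trace is the sequence of states visited."""
    state = start
    trace = [state]
    for symbol in text:
        state = delta[(state, symbol)]  # delta is total: every pair has an entry
        trace.append(state)
    return trace, state in accepting


# Example machine (an assumption for illustration): even number of 1s.
DELTA = {
    ("even", "0"): "even",
    ("even", "1"): "odd",
    ("odd", "0"): "odd",
    ("odd", "1"): "even",
}
```

Calling `run_dfa(DELTA, "even", {"even"}, "1011")` returns the computation as the list of visited states together with the accept/reject verdict.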


language of (): maps a machine M to the set of strings that M accepts

language: a set of strings; the language of a machine is the set of strings it accepts

regular language: a language for which there exists some DFA that accepts exactly the strings in it

non-regular language:

Non-Deterministic Finite Automata (NFA)

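Non-Deterministic Finite Automata: can jump from one state to another without consuming any character (an empty-string transition), and can have several possible next states for the same input. (Useful when concatenating two DFAs.)

Note: you can convert any NFA to an equivalent DFA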
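One way to see why the conversion works is to simulate the NFA by tracking the set of states it could currently be in; each such set plays the role of a single DFA state. The sketch below assumes a dictionary encoding of the transitions, and the names `epsilon_closure`, `nfa_accepts`, `DELTA`, and `EPS` are hypothetical. The example machine glues a loop for a* to a loop for b* with one empty-string move.

```python
def epsilon_closure(states, eps):
    """All states reachable from `states` using only empty-string moves."""
    closure = set(states)
    stack = list(states)
    while stack:
        state = stack.pop()
        for nxt in eps.get(state, ()):
            if nxt not in closure:
                closure.add(nxt)
                stack.append(nxt)
    return frozenset(closure)


def nfa_accepts(delta, eps, start, accepting, text):
    """Simulate an NFA by tracking the set of states it could be in."""
    current = epsilon_closure({start}, eps)
    for symbol in text:
        moved = set()
        for state in current:
            moved |= delta.get((state, symbol), set())
        current = epsilon_closure(moved, eps)
    return bool(current & accepting)


# Example (an assumption): state p loops on "a", state q loops on "b",
# and an empty-string move p -> q concatenates the two machines.
DELTA = {("p", "a"): {"p"}, ("q", "b"): {"q"}}
EPS = {"p": {"q"}}
```

With `{"q"}` as the accepting set, this machine accepts exactly the strings of the form a...ab...b, including the empty string.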


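Grammar: a formal set of rules for producing a language (the set of actual strings) by repeatedly replacing variables until only terminals remain

Variables: symbols that get replaced by applying a rule
Terminals: symbols of the final string, which are never replaced; the right-hand side of a rule can contain both terminals and variables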

Regular Grammar

(associated with non-deterministic finite automata)

Start Variable: the variable you start with
Rules: (Variable * Terminal) only 4 types of rules are allowed

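  1. A -> ε (the empty string)
  2. A -> a (where a is a terminal)
  3. A -> B (where B is a variable)
  4. A -> aB (where a is a terminal) OR A -> Ba (where a is a terminal); pick one of the two forms and use it for the whole grammar, since mixing them can produce non-regular languages

Note: every regular grammar generates a regular language, so it can be converted to an NFA and therefore to a DFA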

Context Free Grammar (CFG)

(non-deterministic finite automaton with a stack = push-down automaton; the stack is unbounded, so it can be pushed to without limit)

Rules: (Variable * Terminal) only 5 types of rules are allowed

  1. A -> ε (the empty string)
  2. A -> a (where a is a terminal)
  3. A -> B (where B is a variable)
  4. A -> aB (where a is a terminal)
  5. A -> Ba (where a is a terminal)

More generally, a context-free grammar allows any rule of a single shape, and the five types above are special cases of it:

  1. A -> x (where x is any string of variables and terminals, possibly empty)

Now we can have L = {ε, 01, 0011, 000111, ...} = {0^n 1^n | n >= 0}, which is not regular

Context Free Language: a language generated by some context-free grammar
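One grammar that generates this language is S -> 0S1 | (empty). This grammar is an assumption here, since the notes do not write it out. The sketch below derives strings from it and recognizes them by undoing the rule from the outside in; the function names `derive` and `matches` are hypothetical.

```python
def derive(n):
    """Apply S -> 0S1 n times, then S -> empty, and return the resulting string."""
    sentential = "S"
    for _ in range(n):
        sentential = sentential.replace("S", "0S1")
    return sentential.replace("S", "")


def matches(text):
    """Recognize {0^n 1^n | n >= 0} by undoing S -> 0S1 from the outside in."""
    if text == "":
        return True  # corresponds to S -> empty
    if text[0] == "0" and text[-1] == "1":
        return matches(text[1:-1])  # strip one matching 0...1 pair
    return False
```

The recursion depth in `matches` grows with n, which mirrors the unbounded stack a push-down automaton needs and a finite automaton lacks.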

Turing-equivalent computation

Finite automaton + 2 stacks
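The intuition is that two stacks can emulate a Turing-machine tape: one stack holds the cells to the left of the head, the other holds the head cell and everything to its right. The class below is only a sketch of that idea; the name `TwoStackTape`, its methods, and the blank symbol `_` are assumptions for illustration.

```python
class TwoStackTape:
    """A Turing-machine tape emulated with two stacks (an illustrative sketch)."""

    BLANK = "_"  # assumed blank symbol

    def __init__(self, text):
        self.left = []                     # cells left of the head, nearest on top
        self.right = list(reversed(text))  # head cell on top, then cells to its right
        if not self.right:
            self.right.append(self.BLANK)

    def read(self):
        return self.right[-1]

    def write(self, symbol):
        self.right[-1] = symbol

    def move_right(self):
        self.left.append(self.right.pop())
        if not self.right:
            self.right.append(self.BLANK)  # extend the tape with a blank cell

    def move_left(self):
        self.right.append(self.left.pop() if self.left else self.BLANK)

    def contents(self):
        return "".join(self.left) + "".join(reversed(self.right))
```

A finite control that reads, writes, and moves on such a tape has everything a Turing machine has, which is why a single stack (push-down automaton) is weaker than two.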
