Lecture 002

Memoryless Machine

Computational Model: allowed rules for information processing.

• "Computer" usually refers to a physical computer, but a physical computer runs only one algorithm, the Universal Algorithm (the algorithm that runs all other algorithms). A physical computer (a Universal Machine) is therefore an instantiation of a computational model.

• Computer (Machine, Algorithm): an instantiation of an algorithm (like a machine that only solves addition)

Interesting Algorithm:

• Should take infinitely many different inputs, and not be a lookup table ($\Sigma^*$ is an infinite collection of finite inputs)

• therefore it should have finite memory
• Should not ignore any input: it reads the entire string once

• Should solve a decision problem

Algorithms: contains

• a transition system (a diagram): consists of states and an alphabet, and can be drawn as a labeled digraph.

• formally $\mathcal{T} = (Q, \Sigma, \delta)$
• $Q$: finite set of states
• $\Sigma$: finite alphabet of $\mathcal{T}$
• $\delta$: transition relation $\delta \subseteq Q \times \Sigma \times Q$
• Elements of $\delta$: a transition $p \xrightarrow{a} q$
• complete: we cannot get stuck in any state; every letter can be consumed: $(\forall p \in Q, a \in \Sigma)(\exists q \in Q)\ p \xrightarrow{a} q$
• deterministic: no branching or forking; there is at most one run from a given state for any input: $(\forall p, q, q' \in Q, a \in \Sigma)\ p \xrightarrow{a} q \land p \xrightarrow{a} q' \implies q = q'$
• an acceptance condition: a condition that determines which runs are accepting
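The two properties above can be checked mechanically. Below is a minimal sketch, assuming the transition relation is stored as a Python set of `(p, a, q)` triples; the names `is_complete` and `is_deterministic` and the example system are illustrative, not part of the lecture.

```python
from itertools import product

def is_complete(Q, Sigma, delta):
    """Every (state, letter) pair has at least one outgoing transition."""
    return all(any((p, a, q) in delta for q in Q)
               for p, a in product(Q, Sigma))

def is_deterministic(Q, Sigma, delta):
    """Every (state, letter) pair has at most one outgoing transition."""
    return all(sum((p, a, q) in delta for q in Q) <= 1
               for p, a in product(Q, Sigma))

# Hypothetical two-state transition system over {a, b}.
Q = {0, 1}
Sigma = {"a", "b"}
delta = {(0, "a", 1), (0, "b", 0), (1, "a", 0), (1, "b", 1)}

print(is_complete(Q, Sigma, delta), is_deterministic(Q, Sigma, delta))
```

Adding a second `a`-transition out of state 0 breaks determinism, and deleting any one triple breaks completeness.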

Run($w$): a run on the input word $w$ of $m$ characters is an alternating sequence (a directed, labeled path in the digraph) of states and letters ($p_0, a_1, p_1, a_2, \ldots, p_{m-1}, a_m, p_m$) for $m \geq 0$, where every transition $p_{i-1} \xrightarrow{a_i} p_i$ is valid.

• source of the $i$-th transition: $p_{i-1}$ (might not be the initial state $q_0$)

• target of the $i$-th transition: $p_i$

• short notation: $p_0, p_1, ... p_{m-1}, p_m$

• trace(label): $a_1a_2 ... a_m \in \Sigma^*$

• Source and Target of a Run: Given a run $p_0a_1...a_mp_m$, we call $p_0$ its source, and $p_m$ its target.

• computation path: a sequence $q_0, \ldots, q_n \in Q$ on input $w = w_1 \ldots w_n$ where $q_0$ is the initial state and $\forall i \in \{1, 2, \ldots, n\}$: $\delta(q_{i-1}, w_i) = q_i$

• accepting computation path: $q_n \in F$
• rejecting computation path: $q_n \notin F$
• Nondeterministic: there can be more than one distinct run on the same input

Nondeterministic Transitions: allow both $p \xrightarrow{a} q$ and $p \xrightarrow{a} q'$ (branching)

Valid Transition: $p_{i-1} \xrightarrow{a_i} p_i$ is valid if $(p_{i-1}, a_i, p_i) \in \delta$, that is, $p_{i-1}$ can reach $p_i$ by reading the letter $a_i$
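As a small illustration of runs and valid transitions, the sketch below checks whether an alternating sequence of states and letters is a run. It assumes the relation is a set of `(p, a, q)` triples and that letters are single-character strings; `is_run` and `trace` are hypothetical helper names.

```python
def is_run(seq, delta):
    """seq = [p0, a1, p1, ..., am, pm]; every (p_{i-1}, a_i, p_i) must be in delta."""
    if len(seq) % 2 == 0:
        return False  # m letters and m + 1 states give an odd length
    return all((seq[i], seq[i + 1], seq[i + 2]) in delta
               for i in range(0, len(seq) - 2, 2))

def trace(seq):
    """The word a1 a2 ... am read along the run."""
    return "".join(seq[1::2])

# Hypothetical nondeterministic relation: state 0 branches on "a".
delta = {(0, "a", 0), (0, "a", 1), (1, "b", 1)}

run = [0, "a", 0, "a", 1, "b", 1]
print(is_run(run, delta), trace(run))
```

Both `[0, "a", 0]` and `[0, "a", 1]` are runs with trace `a`, which is exactly the branching that nondeterminism allows.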

Finite State Machine

Finite State Machine (decides | accepts | computes) a language.

Finite State Machine (FSM): $\mathcal{A} = \langle{\mathcal{T}(Q, \Sigma, \delta), \text{acc}(q_0, F)}\rangle$

• $\text{acc}$: an acceptance condition: a function that takes a run and outputs whether the run starts at $q_0$ and ends in an accepting state $q \in F$. So it can be written minimally as $\text{acc}(q_0, F)$ for $q_0 \in Q$, $F \subseteq Q$.

• $\mathcal{T}$: transition system

• $q_0 \in Q$: initial state

• (Acceptance) Language: $\mathcal{L}(\mathcal{A})$ is the set of all words accepted by the automaton $\mathcal{A}$

• Complete Transition Systems: for every state $p \in Q$ and every character $a \in \Sigma$, there is at least one state $q$ such that $p \xrightarrow{a} q$.

• Deterministic Transition Systems: for every state $p \in Q$ and every character $a \in \Sigma$, there is at most one state $q$ such that $p \xrightarrow{a} q$.

• Final (Accepting) States: $F \subseteq Q$, indicated by a double circle, where the algorithm accepts the input

• If a language can be recognized by a DFA, then its complement can also be recognized by a DFA.
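The complement claim has a short justification for complete deterministic machines: every word has exactly one run, so swapping accepting and non-accepting states flips the verdict on every word. A minimal sketch, assuming the transition function is a dict keyed by `(state, letter)`; the example DFA and all names are illustrative.

```python
def accepts(delta, q0, F, word):
    """Run a complete DFA on word and report whether it ends in F."""
    q = q0
    for a in word:
        q = delta[(q, a)]
    return q in F

# Hypothetical DFA over {a, b}: accepts words with an even number of a's.
Q = {"even", "odd"}
delta = {
    ("even", "a"): "odd", ("even", "b"): "even",
    ("odd", "a"): "even", ("odd", "b"): "odd",
}
q0, F = "even", {"even"}

# Complement DFA: same transition system, accepting states swapped.
F_complement = Q - F

for w in ["", "a", "ab", "aba"]:
    print(repr(w), accepts(delta, q0, F, w), accepts(delta, q0, F_complement, w))
```

The argument relies on completeness and determinism; swapping final states in a nondeterministic machine does not complement its language in general.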

Initial State: indicated by arrow pointing to the state from nowhere

Deterministic Finite Automata (DFA): $\mathcal{M} = \langle{\mathcal{T}(Q, \Sigma, \delta), \text{acc}(q_0, F)}\rangle$ where $\mathcal{T}$ is deterministic and complete

• transition relation: $\delta \subseteq Q \times \Sigma \times Q$

• transition function: since $\mathcal{T}$ is deterministic and complete, we can turn the transition relation into a function $\hat{\delta} : Q \times \Sigma \rightarrow Q$ ($\hat{\delta}(p, a) = q \equiv (p, a, q) \in \delta$); below we write $\delta$ for this function

• extended transition function: $\delta^* : Q \times \Sigma^* \rightarrow Q$

• Defined by recursion: $\begin{cases} \delta^*(p, \epsilon) = p\\ \delta^*(p, xa) = \delta(\delta^*(p, x), a)\\ \end{cases} \forall x \in \Sigma^*, a \in \Sigma$
• Acceptance: $\mathcal{M} \text{ accepts word } u \iff \delta^*(q_0, u) \in F$
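The recursion for $\delta^*$, peeling off one letter at a time, translates directly into code. This is a sketch under the assumption that $\delta$ is stored as a dict keyed by `(state, letter)`; the even/even DFA used as the example tracks the parities of the numbers of a's and b's, and its encoding is an illustrative choice.

```python
def delta_star(delta, p, word):
    """delta*(p, eps) = p;  delta*(p, xa) = delta(delta*(p, x), a)."""
    if word == "":
        return p
    x, a = word[:-1], word[-1]
    return delta[(delta_star(delta, p, x), a)]

def accepts(delta, q0, F, word):
    """M accepts word iff delta*(q0, word) is in F."""
    return delta_star(delta, q0, word) in F

# Hypothetical DFA: state (i, j) = (parity of a's, parity of b's).
delta = {}
for i in (0, 1):
    for j in (0, 1):
        delta[((i, j), "a")] = (1 - i, j)
        delta[((i, j), "b")] = (i, 1 - j)
q0, F = (0, 0), {(0, 0)}

print(accepts(delta, q0, F, "abab"), accepts(delta, q0, F, "aab"))
```

The recursive form mirrors the definition; for long inputs an iterative loop over the letters computes the same state without deep recursion.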

DFA Definition: $\mathcal{M} = \langle{\mathcal{T}(Q, \Sigma, \delta), \text{acc}(q_0, F)}\rangle$

• $Q$: finite, non-empty set of states

• $\Sigma$: finite, non-empty alphabet

• $\delta : Q \times \Sigma \rightarrow Q$: transition function that reads the current state and an input letter and outputs the new state

• $\delta^*(q, w)$: a function that, given a start state $q$, returns the state reached after reading the word $w$.
• $q_0 \in Q$: start state

• $F \subseteq Q$: set of accepting states

• Other definitions

• accept: the run starts in the start state and ends in an accepting state
• regular language: $\exists$ DFA $M$ s.t. $L = \mathcal{L}(M)$
• all finite languages are regular
• there are unary languages that are not regular
• all languages: $\mathcal{P}(\Sigma^*)$
• two DFAs are isomorphic if they have the same structure, regardless of how the states are named.

// EXERCISE: prove $\delta^*(p, xy) = \delta^*(\delta^*(p, x), y)$ for all $x, y \in \Sigma^*$

Regular (Recognizable) Language: a language $L \subseteq \Sigma^*$ is regular if $(\exists \text{DFA }\mathcal{M})\ \mathcal{L}(\mathcal{M}) = L$

DFA Example

Trap: a state $p$ in a DFA such that $(\forall a \in \Sigma)\ \delta(p, a) = p$

Sink: a trap that is not final (removing a sink breaks completeness)

• accepting sink state: a trap that is final, so every extension of the input is accepted

• rejecting sink state: a trap that is not final, so every extension of the input is rejected

Partial (Incomplete) DFAs (PDFAs): DFAs with the sink removed, and therefore not complete.

Application DFA

Membership of regular language: grep, emacs, regexp

Non-Empty Acceptance Language: model checking

Searching String

Back Transitions: for a pattern $s_0s_1s_2s_3\ldots$, if the next input symbol does not match $s_n$ and so does not step forward, then instead of creating a separate branch, jump to the node reached by the word $s_1s_2\ldots s_{n-1}$ and read the symbol again from there.
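One common way to realize back transitions is the Knuth-Morris-Pratt style DFA construction, sketched below. It is offered as an illustration rather than the lecture's own construction; it assumes a non-empty pattern and that every text symbol belongs to the given alphabet. The variable `x` is the "restart" state, that is, the state reached by the pattern with its first symbol dropped.

```python
def build_search_dfa(pattern, alphabet):
    """dfa[j][c] = next state; state j means the last j symbols match pattern[:j]."""
    m = len(pattern)
    dfa = [{c: 0 for c in alphabet} for _ in range(m + 1)]
    dfa[0][pattern[0]] = 1
    x = 0  # state reached by reading pattern[1:j]
    for j in range(1, m + 1):
        for c in alphabet:
            dfa[j][c] = dfa[x][c]        # back transition: act like the restart state
        if j < m:
            dfa[j][pattern[j]] = j + 1   # forward transition on a match
            x = dfa[x][pattern[j]]
    return dfa

def find_all(pattern, text, alphabet):
    """Start positions of all (possibly overlapping) occurrences of pattern in text."""
    dfa = build_search_dfa(pattern, alphabet)
    state, hits = 0, []
    for i, c in enumerate(text):
        state = dfa[state][c]
        if state == len(pattern):
            hits.append(i - len(pattern) + 1)
    return hits

print(find_all("aba", "ababa", "ab"))  # [0, 2]
```

Each text symbol is read exactly once, so the search is linear in the text length once the table is built.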

DFA Decision Problems

DFA Membership: Does DFA $\mathcal{M}$ accept input $x \in \Sigma^*$? (Lemma: DFA Membership solvable in linear time)

Emptiness: Does DFA $\mathcal{M}$ accept no input? (no path from $q_0$ to any $q \in F$, tested by DFS or BFS)

Finiteness: Does DFA $\mathcal{M}$ accept only finitely many inputs?

Universality: Does DFA $\mathcal{M}$ accept all inputs?

State Complexity: Find the state complexity of a language. (Because a language is generally an infinite set, the input is usually an arbitrary DFA accepting the language.) Solved by constructing the minimal DFA.

(Lemma: DFA Emptiness, Finiteness, Universality solvable in linear time)
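The emptiness test described above is a plain graph search. A minimal sketch with BFS, assuming a complete DFA whose transition function is a dict keyed by `(state, letter)`; the three-state example is hypothetical.

```python
from collections import deque

def reachable(delta, q0, Sigma):
    """All states reachable from q0, found by breadth-first search."""
    seen = {q0}
    queue = deque([q0])
    while queue:
        p = queue.popleft()
        for a in Sigma:
            q = delta[(p, a)]
            if q not in seen:
                seen.add(q)
                queue.append(q)
    return seen

def is_empty(delta, q0, F, Sigma):
    """The language is empty iff no final state is reachable from q0."""
    return reachable(delta, q0, Sigma).isdisjoint(F)

# Hypothetical DFA over {a}: state 2 is cut off from the start state 0.
Sigma = {"a"}
delta = {(0, "a"): 1, (1, "a"): 0, (2, "a"): 2}

print(is_empty(delta, 0, {2}, Sigma), is_empty(delta, 0, {1}, Sigma))
```

Each state is enqueued at most once and each transition is inspected once, which matches the linear-time bound in the lemma.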

// EXERCISE: show how to deal with Finiteness, Universality

// EXERCISE: prove that a certain DFA accepts only words with an even number of a's and an even number of b's

Equivalent DFA: $\mathcal{M}_1, \mathcal{M}_2$ over $\Sigma$ are equivalent if $\mathcal{L}(\mathcal{M}_1) = \mathcal{L}(\mathcal{M}_2)$ (There exist multiple DFAs that recognize the same language)

State Complexity of DFA ($\text{stc}(\mathcal{A})$): number of states in $\mathcal{A}$

State Complexity of Regular Language $L$ ($\text{stc}(L)$): size of the smallest DFA accepting $L$

• General Abstract Nonsense: always exists because natural numbers are well-ordered (given there exists a DFA accepting $L$) // QUESTION: why it is the case

Potential Problem: // QUESTION: why so?

• There might be several DFAs of minimal size. Difficult to compare languages. (not the case: there exists exactly one minimal DFA, up to isomorphism)

• Larger DFAs for the same language might not have connection to the minimal one. Difficult to obtain smallest machine given arbitrary one. (not the case: larger DFAs are directly related to minimal one via an equivalence relation)

Theorem: For every regular language there is exactly one minimal DFA, unique up to isomorphism.

accessible part of an automaton: all states $p$ such that there exists a run from $q_0$ to $p$. (inaccessible states never occur in any run from $q_0$, so removing them from a DFA does not change its language)

co-accessible part of an automaton: all states $p$ such that there exists a run from $p$ to a final (accepting) state. (the states that are not co-accessible can all be replaced by a single rejecting sink state)

trim: an automaton is trim if all states are accessible and co-accessible. (cut away the parts that are not accessible or not co-accessible by graph search)
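The graph searches behind these three notions can be sketched as follows, assuming the transitions are a set of `(p, a, q)` triples. The function names and the example are illustrative; the naive scan over all triples keeps the code short but is not the linear-time version.

```python
def accessible(delta, q0):
    """States reachable from q0 (forward search)."""
    seen, stack = {q0}, [q0]
    while stack:
        p = stack.pop()
        for (s, _, t) in delta:
            if s == p and t not in seen:
                seen.add(t)
                stack.append(t)
    return seen

def coaccessible(delta, F):
    """States from which some final state is reachable (backward search)."""
    seen, stack = set(F), list(F)
    while stack:
        q = stack.pop()
        for (s, _, t) in delta:
            if t == q and s not in seen:
                seen.add(s)
                stack.append(s)
    return seen

def trim(delta, q0, F):
    """Keep only states that are both accessible and co-accessible."""
    keep = accessible(delta, q0) & coaccessible(delta, F)
    return keep, {(s, a, t) for (s, a, t) in delta if s in keep and t in keep}

# Hypothetical automaton: state 2 is a rejecting sink, state 3 is inaccessible.
delta = {(0, "a", 1), (1, "a", 1), (1, "b", 2),
         (2, "a", 2), (2, "b", 2), (3, "a", 1)}
print(trim(delta, 0, {1}))
```

Note that the trimmed result is generally a partial DFA, since removing the rejecting sink breaks completeness.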

// EXERCISE: try to construct another minimum DFA for even/even. Is it possible? Which states in the larger DFAs correspond to states of minimal one?

Problem: determine if the string has an a at position $k$

• $L_{a, k} = \{x \in \{a, b\}^* \mid x_k = a\}$

• There exists a DFA for every $k \geq 0$ (position counted from the start)

• An NFA can easily handle $k < 0$ (position counted from the end)

// EXERCISE: come up with an upper bound for complexity of $L_{a, k}$ for $k \leq 0$

Divisibility Problem: Is input number $a$ divisible by $k$?

• check if divisible by 5 (input in binary, most significant digit first): $v(xa) \equiv 2 \cdot v(x) + a \pmod 5$

• pre-computation might be heavy, but the resulting DFA is simple

• In base 2: having $k$ states is minimal, because a smaller machine cannot track every change in the number that affects divisibility

• In base $B$: $v(x_kx_{k-1}\ldots x_1x_0) = \sum_{i \leq k}x_iB^i$

• Divisibility by $m$ can be tested by a DFA in any base $B$ (build the Horner automaton with states $Q = \{0, 1, \ldots, m-1\}$, $\delta(p, a) = p \cdot B + a \pmod m$, with initial and final state 0, since $\delta^*(q_0, x) = v(x) \pmod m$) // TODO: understand this part // QUESTION
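A minimal sketch of the Horner automaton may help with the TODO above. It assumes digits are given as integers in `range(B)`, most significant digit first; `horner_dfa` and `divisible` are hypothetical names. The state after reading a prefix is the value of that prefix modulo $m$, which is why appending a digit $a$ maps state $p$ to $(p \cdot B + a) \bmod m$.

```python
def horner_dfa(m, B):
    """States 0..m-1; delta(p, a) = (p * B + a) mod m; start and final state 0."""
    delta = {(p, a): (p * B + a) % m for p in range(m) for a in range(B)}
    return delta, 0, {0}

def divisible(digits, m, B):
    """Run the Horner automaton on digits (most significant first)."""
    delta, q, F = horner_dfa(m, B)
    for a in digits:
        q = delta[(q, a)]
    return q in F

# 1010 in binary is 10, which is divisible by 5; 111 is 7, which is not.
print(divisible([1, 0, 1, 0], 5, 2), divisible([1, 1, 1], 5, 2))
```

For example, reading `1, 0, 1, 0` in base 2 with $m = 5$ visits the states 1, 2, 0, 0, which are the values 1, 2, 5, 10 reduced modulo 5.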

But checking a number $n$ with base $B=n$ is different:

• there are not many useful patterns
• there is a pattern for primes
• there is a pattern for base 2

Proving Existence of DFA

Idea: use the fact that two different strings that end up in the same state can never end up in different states after reading the same stream of symbols

• in general, counting over arbitrarily long strings cannot be done with a finite number of states, because the lengths of possible input strings are unbounded.

Good Papers

S. Kleene
Representation of Events in Nerve Nets and Finite Automata
RAND Corporation, RM 704 (1951).

M. O. Rabin and D. Scott
Finite automata and their decision problems
IBM J. Research and Development, 3 (1959), 114–125.


Nondeterministic Finite Automata

$\mathcal{A}=\langle{Q, \Sigma, \delta, I, F}\rangle$: An NFA accepts an input $x$ if and only if there exists a run on $\delta$ with trace $x$, a source $q\in I$, and a target $q' \in F$

• $Q$: finite, non-empty set of states

• $\Sigma$: finite, non-empty alphabet

• $\delta \subseteq Q \times \Sigma \times Q$: transition relation; a pair $q_i, a$ can map to several states $q_p$ or to none.

• because $\delta$ is a relation, it is perfectly fine for a state to have no transition for a certain letter, or several transitions for the same letter.
• $\epsilon$ can also be allowed as a label: reading nothing (that gives an NFAE)
• $I$: set of initial states (multiple start points)

• $F$: set of final states

• Intuition: at each step the NFA keeps the set $S$ of states it could currently be in; after reading $a$ this becomes $S' = \{q' \in Q \mid (\exists q \in S)\ (q, a, q') \in \delta\}$. When it finishes reading, it checks whether the set contains any final state.

• Powerset construction: For every NFA there is an equivalent DFA.
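The set-tracking intuition and the powerset construction above can be sketched together: the DFA's states are the sets of NFA states reachable from $I$. The code assumes the relation is a set of `(p, a, q)` triples and builds only the reachable subsets; all names and the example NFA ("word ends with ab") are illustrative.

```python
def step(delta, S, a):
    """All states reachable from some state in S by reading a."""
    return frozenset(q2 for (q1, b, q2) in delta if q1 in S and b == a)

def nfa_accepts(delta, I, F, word):
    """Track the set of possible states; accept if it meets F at the end."""
    S = frozenset(I)
    for a in word:
        S = step(delta, S, a)
    return not S.isdisjoint(F)

def powerset_dfa(Sigma, delta, I, F):
    """Reachable part of the powerset automaton."""
    start = frozenset(I)
    dfa_delta, states, todo = {}, {start}, [start]
    while todo:
        S = todo.pop()
        for a in Sigma:
            T = step(delta, S, a)
            dfa_delta[(S, a)] = T
            if T not in states:
                states.add(T)
                todo.append(T)
    dfa_final = {S for S in states if not S.isdisjoint(F)}
    return states, dfa_delta, start, dfa_final

# Hypothetical NFA over {a, b}: accepts words ending with "ab".
delta = {(0, "a", 0), (0, "b", 0), (0, "a", 1), (1, "b", 2)}
states, dfa_delta, start, dfa_final = powerset_dfa("ab", delta, {0}, {2})
print(len(states), nfa_accepts(delta, {0}, {2}, "bab"))
```

Here only three of the eight possible subsets are reachable; in the worst case the construction can produce exponentially many states.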

// TODO: Exercise (NFAs allow for language reversal) // WARNING: you must look at this

nondeterministic finite automata with $\epsilon$-moves (NFAE): an NFA that allows transitions without reading any character

• Epsilon elimination: For every NFAE there is an equivalent NFA.

• closure properties of NFAE-recognizable languages

• Closure of regular languages under concatenation: If $L_1$ and $L_2$ are regular, then $L_1L_2$ is regular, because NFAEs allow for concatenation.
• Closure of regular languages under reversal: If $L$ is regular, then $L^R$ is regular, because NFAs allow for language reversal.
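The conversion from an NFAE to an equivalent NFA mentioned above rests on the $\epsilon$-closure of a state: everything reachable using $\epsilon$-moves only. Below is one standard way to do the conversion, offered as a sketch; it assumes transitions are `(p, a, q)` triples and that the empty string `""` is used as the label for $\epsilon$-moves, and the function names are hypothetical.

```python
EPS = ""  # assumption: the empty string labels epsilon-moves

def eps_closure(delta, S):
    """All states reachable from S using only epsilon-moves."""
    closure, stack = set(S), list(S)
    while stack:
        p = stack.pop()
        for (s, a, t) in delta:
            if s == p and a == EPS and t not in closure:
                closure.add(t)
                stack.append(t)
    return closure

def eliminate_eps(Q, delta, I, F):
    """Equivalent NFA: p -a-> t whenever some s in closure(p) has s -a-> t."""
    new_delta = {(p, a, t)
                 for p in Q
                 for (s, a, t) in delta
                 if a != EPS and s in eps_closure(delta, {p})}
    # p becomes final if it can reach a final state by epsilon-moves alone.
    new_F = {p for p in Q if eps_closure(delta, {p}) & set(F)}
    return new_delta, set(I), new_F

# Hypothetical NFAE for a*b*: loop on a in state 0, epsilon to 1, loop on b in state 1.
Q = {0, 1}
delta = {(0, "a", 0), (0, EPS, 1), (1, "b", 1)}
print(eliminate_eps(Q, delta, {0}, {1}))
```

In the example, the $\epsilon$-move glues a machine for $a^*$ to a machine for $b^*$; after elimination state 0 gains a direct `b`-transition to state 1 and becomes final itself.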

// TODO: Exercise (NFAEs allow for concatenation) // WARNING: you must look at this
