Theory Of Computation

Published 2023-12-04 20:42:20 · Author: emptyset

LN1 Alphabets, Strings, and Languages

Alphabets and Strings

An alphabet is a set of symbols
String: a sequence of symbols from some alphabet
Language: a set of strings
Example: unary numbers use the alphabet \(\Sigma=\{1\}\)
Unary numbers: \(1, 11, 111, \cdots\)

String Operations

  • Concatenation \(wv\)
  • Reverse \(w^R\)
  • String Length \(|w|\)
    Empty String: a string with no letters, denoted \(\varepsilon\) or \(\lambda\)
  • Substring: a subsequence of consecutive characters
  • Prefix and Suffix \(w=uv\) \(u\) is prefix and \(v\) is suffix
  • Exponent Operation: \(w^n=\underbrace{ww\cdots w}_{n}\) and \(w^0=\varepsilon\)
  • The * Operation
    \(\Sigma^*\): the set of all possible strings over alphabet \(\Sigma\)
  • The + Operation
    \(\Sigma^+=\Sigma^*-\{\varepsilon\}\)

Languages

A language over alphabet \(\Sigma\) is any subset of \(\Sigma^*\)

Empty language \(\{\}\) or \(\varnothing\)

Operations on Languages

  • The usual set operations: union, intersection, difference, complement
  • Reverse: \(L^R=\{w^R: w\in L\}\)
  • Concatenation: \(L_1L_2=\{ xy:x\in L_1,y\in L_2 \}\)
    \(L^0=\{\varepsilon\}\)
  • Star-Closure (Kleene *)
    All strings that can be constructed by concatenating strings from \(L\)
    \(L^*=L^0\cup L^1\cup L^2\cup\cdots\)
  • Positive Closure
    \(L^+=L^1\cup L^2\cup\cdots\)
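
These operations can be tried out directly. A small Python sketch (the function names are mine, not from the notes); since \(L^*\) is infinite in general, the star-closure is truncated to a maximum word length:

```python
from itertools import product

def concat(L1, L2):
    """Concatenation: { xy : x in L1, y in L2 }."""
    return {x + y for x, y in product(L1, L2)}

def power(L, n):
    """L^n, with L^0 = {empty string}."""
    result = {""}
    for _ in range(n):
        result = concat(result, L)
    return result

def star_upto(L, max_len):
    """Finite approximation of L*: the words of L* with length <= max_len."""
    result, frontier = {""}, {""}
    while frontier:
        frontier = {w for w in concat(frontier, L)
                    if len(w) <= max_len and w not in result}
        result |= frontier
    return result

L = {"a", "bc"}
print(sorted(power(L, 2)))                                    # ['aa', 'abc', 'bca', 'bcbc']
print(sorted(star_upto(L, 3), key=lambda w: (len(w), w)))     # ['', 'a', 'aa', 'bc', 'aaa', 'abc', 'bca']
```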

LN2 Deterministic Finite Automata

For every state, there is a transition for every symbol in the alphabet

Formal Definition
Deterministic Finite Automaton (DFA)
\(M=(Q,\Sigma,\delta,q_0,F)\)

  • \(Q\): set of states
  • \(\Sigma\): input alphabet, \(\varepsilon\notin\Sigma\) (the input alphabet never contains the empty string)
  • \(\delta\): transition function
    \(\delta:Q\times\Sigma\rightarrow Q, \delta(q_0,a)=q_1\)
    Extended Transition Function \(\delta^*(q,w)=q'\)
    Describes the resulting state after scanning string \(w\) from state \(q\)
  • \(q_0\): initial state
  • \(F\): set of accepting states

Language Accepted by DFA
Language accepted by \(M\)
\(L(M)=\{w\in\Sigma^*:\delta^*(q_0,w)\in F\}\)
Language rejected by \(M\): \(\overline{L(M)}=\{w\in\Sigma^*:\delta^*(q_0,w)\notin F\}\)
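
A minimal Python sketch of these definitions (the particular DFA below, accepting strings over \(\{a,b\}\) with an even number of \(a\)'s, is my own illustration, not an example from the notes):

```python
def delta_star(delta, q, w):
    """Extended transition function: the state reached from q after reading w."""
    for symbol in w:
        q = delta[(q, symbol)]
    return q

def accepts(delta, q0, F, w):
    """w is in L(M) iff delta*(q0, w) is an accepting state."""
    return delta_star(delta, q0, w) in F

# Example DFA over {a, b}: accept strings with an even number of a's.
delta = {("q0", "a"): "q1", ("q0", "b"): "q0",
         ("q1", "a"): "q0", ("q1", "b"): "q1"}
print(accepts(delta, "q0", {"q0"}, "abba"))   # True  (two a's)
print(accepts(delta, "q0", {"q0"}, "ab"))     # False (one a)
```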

Regular Languages
A language \(L\) is regular if there is a DFA \(M\) that accepts it, which means \(L(M)=L\)
The languages accepted by all DFAs form the family of regular languages

LN3 Non-Deterministic Finite Automata

An NFA accepts a string
if there is a computation path of the NFA that accepts the string:
all the symbols of the input string are processed and the last state is an accepting state

An NFA rejects a string
if there is no computation of the NFA that accepts the string.
For each possible computation path:
all the input is consumed and the automaton is in a non-accepting state,
or
the input cannot be consumed

  • \(\delta\): transition function
    \(\delta(q_0,x)=\{q_1,q_2,\cdots,q_k\}\) (can be \(\varnothing\))

Note that for any state \(q\), \(q\in \delta^*(q, \varepsilon)\)


Equivalence of Machines
Machine \(M_1\) is equivalent to machine \(M_2\) if \(L(M_1)=L(M_2)\)
Languages accepted by NFAs = Regular Languages (aka Languages accepted by DFAs)

NFAs and DFAs have the same computation power, namely, they accept the same set of languages.

Proof:

  1. Languages accepted by NFAs \(\supseteq\) Regular Languages, because every DFA is trivially an NFA;
  2. Languages accepted by NFAs \(\subseteq\) Regular Languages, because any NFA can be converted to an equivalent DFA (shown below).

Conversion Procedure Steps

  1. The initial state of the NFA is \(q_0\) and \(\delta^*(q_0, \varepsilon)=\{q_0,\cdots\}\), so the initial state of the DFA is \(\{q_0,\cdots\}\);
  2. For every DFA state \(\{q_i,\cdots, q_j\}\) and symbol \(a\), compute in the NFA \(\delta^*(q_i,a)\cup\cdots\cup\delta^*(q_j,a)= \{q_l',\cdots,q_n'\}\), then add the transition \(\delta(\{q_i,\cdots, q_j\}, a)=\{q_l',\cdots,q_n'\}\) to the DFA;
  3. Repeat Step 2 for every DFA state and every symbol in the alphabet until no more states can be added to the DFA;
  4. For any DFA state \(\{q_i,\cdots, q_j\}\), if some \(q_m\) in the set is an accepting state of the NFA, then \(\{q_i,\cdots , q_j\}\) is an accepting state of the DFA.
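
A sketch of the subset construction in Python (for simplicity it assumes an NFA without \(\varepsilon\)-transitions, so \(\delta^*(q,a)=\delta(q,a)\); the dictionary encoding is my own):

```python
def nfa_to_dfa(nfa_delta, q0, nfa_accept, alphabet):
    """Subset construction: every DFA state is a frozenset of NFA states."""
    start = frozenset({q0})
    dfa_delta, dfa_accept = {}, set()
    todo, seen = [start], {start}
    while todo:                                    # Step 3: until no new states appear
        state = todo.pop()
        if state & nfa_accept:                     # Step 4: contains an NFA accept state
            dfa_accept.add(state)
        for a in alphabet:                         # Step 2: union of the NFA moves
            target = frozenset(q2 for q in state
                               for q2 in nfa_delta.get((q, a), set()))
            dfa_delta[(state, a)] = target
            if target not in seen:
                seen.add(target)
                todo.append(target)
    return start, dfa_delta, dfa_accept

# NFA over {0, 1} accepting strings whose second-to-last symbol is 1.
nfa_delta = {("p", "0"): {"p"}, ("p", "1"): {"p", "q"},
             ("q", "0"): {"r"}, ("q", "1"): {"r"}}
start, delta, accept = nfa_to_dfa(nfa_delta, "p", {"r"}, {"0", "1"})
print(len({s for (s, _) in delta}))   # 4 reachable DFA states
```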

LN4 Properties of Regular Languages

We say Regular languages are closed under

  • Union \(L_1\cup L_2\)
  • Concatenation \(L_1L_2\)
  • Star \(L_1^*\)
  • Reversal \(L_1^R\)
  • Complement \(\overline{L_1}\)
  • Intersection \(L_1\cap L_2\)

Every NFA (including an NFA without accepting states) is equivalent to an NFA with a single accept state.

LN5 Regular Expressions

Regular Expressions

Regular expressions describe regular languages
\((a+b\cdot c)^*\) describes the language \(\{a,bc\}^*=\{\varepsilon,a,bc,aa,abc,bca,bcbc,\cdots \}\)

Primitive regular expressions: \(\varnothing, \varepsilon, a\)
Given regular expressions \(r_1,r_2\), \(r_1+r_2,r_1\cdot r_2, r_1^*, (r_1)\) are regular expressions

Languages of Regular Expressions

\(L(r)\) means the language of regular expression \(r\)
\(L(\varnothing)=\varnothing\)
\(L(\varepsilon)=\{\varepsilon\}\)
\(L(a)=\{a\}\)
\(L(r_1+r_2)=L(r_1)\cup L(r_2)\)
\(L(r_1\cdot r_2)=L(r_1)L(r_2)\)
\(L(r_1^*)=(L(r_1))^*\)
\(L((r_1))=L(r_1)\)
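
These defining equations translate into a small recursive evaluator. A Python sketch (the tuple encoding of expressions is my own, and the star case is truncated to a maximum word length since \(L(r_1^*)\) is infinite):

```python
from itertools import product

def lang(r, max_len=4):
    """Language of a regular expression r, truncated to words of length <= max_len."""
    if r == ("empty",):                     # L(emptyset) = {}
        return set()
    if r == ("eps",):                       # L(epsilon) = {""}
        return {""}
    if r[0] == "sym":                       # L(a) = {a}
        return {r[1]}
    if r[0] == "+":                         # L(r1 + r2) = L(r1) U L(r2)
        return lang(r[1], max_len) | lang(r[2], max_len)
    if r[0] == ".":                         # L(r1 . r2) = L(r1) L(r2)
        return {x + y for x, y in product(lang(r[1], max_len), lang(r[2], max_len))
                if len(x + y) <= max_len}
    if r[0] == "*":                         # L(r1*) = (L(r1))*, truncated
        base, result, frontier = lang(r[1], max_len), {""}, {""}
        while frontier:
            frontier = {x + y for x, y in product(frontier, base)
                        if len(x + y) <= max_len} - result
            result |= frontier
        return result
    raise ValueError(f"unknown expression {r}")

# (a + b.c)* from the example above
r = ("*", ("+", ("sym", "a"), (".", ("sym", "b"), ("sym", "c"))))
print(sorted(lang(r), key=lambda w: (len(w), w)))
# ['', 'a', 'aa', 'bc', 'aaa', 'abc', 'bca', 'aaaa', 'aabc', 'abca', 'bcaa', 'bcbc']
```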

Equivalent Regular Expressions

Theorem: Languages Generated by Regular Expressions = Regular Languages
For example, for a generalized transition graph with two states, where the initial state has self-loop \(r_1\), the final state has self-loop \(r_4\), and \(r_2\) labels the edge from the initial to the final state and \(r_3\) the edge back, the equivalent regular expression is
\(r=r_1^*r_2(r_4+r_3r_1^*r_2)^*\)

When we say we are given a Regular Language \(L\), we mean Language \(L\) is in a standard representation (DFA, NFA, or Regular Expression)

LN6 Regular Grammars

Grammars

Grammar \(G=(V,T,S,P)\)

\(S\rightarrow aSb\)
\(S\rightarrow\lambda\)

  • \(V\): Set of variables \(V=\{S\}\)
  • \(T\): Set of terminal symbols \(T=\{a,b\}\)
  • \(S\): Start variable
  • \(P\): Set of Production rules \(P=\{S\rightarrow aSb, S\rightarrow\lambda\}\)

Sentential Form: a string of terminals and variables that appears in a derivation

Language of a Grammar

For a grammar \(G\) with start variable \(S\)
\(L(G)=\{w:S\overset{*}{\Rightarrow} w\}\), where \(w\) is a string of terminals only
Shorthand notation: \(A\rightarrow aAb\,|\,\lambda\) combines the productions \(A\rightarrow aAb\) and \(A\rightarrow\lambda\)

Linear Grammars

Linear Grammars: Grammars with at most one variable at the right side of a production

A Non-Linear Grammar
\(S\rightarrow SS | \lambda | aSb | bSa\)
\(L(G) = \{w:n_a(w)=n_b(w)\}\), where \(n_a(w)\) is the number of \(a\)'s in string \(w\)

Right-Linear Grammars:
All productions have form \(A\rightarrow xB\) or \(A\rightarrow x\)

Left-Linear Grammars:
All productions have form \(A\rightarrow Bx\) or \(A\rightarrow x\)

Regular Grammars

A regular grammar is any right-linear or left-linear grammar
Regular grammars generate regular languages
Theorem: Languages Generated by Regular Grammars = Regular Languages
For any transition \(q\xrightarrow{a}p\) add production \(q\rightarrow ap\)
For any final state \(q_f\) add production \(q_f\rightarrow \lambda\)
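
A small Python sketch of this construction from an automaton to a right-linear grammar (the helper and the example DFA, the even-number-of-\(a\)'s machine used earlier, are my own):

```python
def dfa_to_right_linear(delta, F):
    """Right-linear productions: q -> a p for every transition delta(q, a) = p,
    and q_f -> lambda for every accepting state q_f."""
    productions = [(q, (a, p)) for (q, a), p in delta.items()]
    productions += [(qf, ()) for qf in F]
    return productions

delta = {("q0", "a"): "q1", ("q0", "b"): "q0",
         ("q1", "a"): "q0", ("q1", "b"): "q1"}
for left, right in dfa_to_right_linear(delta, {"q0"}):
    print(left, "->", " ".join(right) if right else "lambda")
```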

LN7-8 Non-regular languages (Pumping Lemma)

Non-regular languages

\(\{ a^nb^n:n\ge 0\}\)
\(\{vv^R:v\in\{a,b\}^*\}\)

The Pigeonhole Principle and DFAs

Suppose the DFA has \(p\) states. For the walk of a string \(w\) with \(|w|\ge p\), the walk contains at least \(p+1\) states, so by the pigeonhole principle a state is repeated in the walk of \(w\)

The Pumping Lemma

Take an infinite regular language \(L\). There exists a DFA with \(p\) states that accepts \(L\)
Take a string \(w\in L\) with \(|w|\ge p\); then at least one state is repeated in the walk of \(w\)
Take \(q\) to be the first state repeated
We can write \(w=xyz\), where \(y\) corresponds to the substring between the first and second occurrence of \(q\)
Note that \(|xy|\le p\), because the states in the walk of \(xy\) are unique (in \(xy\) no state is repeated except \(q\))
Since there is at least one transition in the loop, \(|y|\ge 1\)
We do not care about the form of string \(z\) (\(z\) may actually overlap with the paths of \(x\) and \(y\))
In general, the string \(xy^iz\) is accepted for \(i=0,1,2,\cdots\)
Therefore, \(xy^iz\in L, i=0,1,2,\cdots\)

The Pumping Lemma:

  • Given an infinite regular language \(L\)
  • there exists an integer \(p\) (critical length)
  • for any string \(w\in L\) with length \(|w|\ge p\)
  • we can write \(w=xyz\)
  • with \(|xy|\le p\) and \(|y|\ge 1\)
  • such that \(xy^iz\in L,i=0,1,2,\cdots\)
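
The argument above is constructive: simulating the DFA on \(w\) and recording when a state first repeats yields the decomposition \(w=xyz\). A Python sketch (reusing the dictionary DFA encoding from the earlier examples; the helper name is my own):

```python
def pump_decompose(delta, q0, w):
    """Split w = x y z, where y is the substring read between the first and
    second visit to the first repeated state."""
    q, first_seen = q0, {q0: 0}     # state -> position in the walk where it was first visited
    for i, symbol in enumerate(w):
        q = delta[(q, symbol)]
        if q in first_seen:         # first repeated state: the loop read so far is y
            start = first_seen[q]
            return w[:start], w[start:i + 1], w[i + 1:]
        first_seen[q] = i + 1
    return None                     # |w| < number of states: no repetition guaranteed

# The "even number of a's" DFA again (p = 2 states)
delta = {("q0", "a"): "q1", ("q0", "b"): "q0",
         ("q1", "a"): "q0", ("q1", "b"): "q1"}
x, y, z = pump_decompose(delta, "q0", "abab")
print(x, y, z)   # a b ab -- and a(b^i)ab is accepted for every i >= 0
```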

Applications of the Pumping Lemma

Every language of finite size has to be regular
Therefore, every non-regular language has to be of infinite size

\(L=\{ a^nb^n:n\ge 0\}\)
Assume for contradiction that \(L\) is a regular language
Since \(L\) is infinite we can apply the Pumping Lemma
Let \(p\) be the critical length for \(L\)
We pick \(w=a^pb^p\)
Since \(|xy|\le p\) and \(|y|\ge 1\), we can write \(y=a^k\) with \(1\le k\le p\)
The pumping lemma then gives \(xy^2z=a^{p+k}b^p\in L\)
But \(a^{p+k}b^p\notin L\) since \(k\ge 1\). CONTRADICTION!

LN9 Context-Free Languages

Context-Free Grammars

Grammar \(G=(V,T,S,P)\)

\(G: S\rightarrow aSb\,|\,SS\,|\,\varepsilon\)
With \(a\) read as "(" and \(b\) as ")", \(L(G)\) describes matched parentheses: \(((())())(())\)
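
A Python sketch that enumerates strings of \(L(G)\) by expanding the leftmost variable in each sentential form (my own encoding: variables are upper-case, terminals lower-case; the search is bounded by the length of the sentential forms, so it only approximates \(L(G)\)):

```python
from collections import deque

def generate(productions, start, max_len):
    """Strings of L(G) reachable through sentential forms of length <= max_len."""
    results, seen = set(), {start}
    queue = deque([start])
    while queue:
        form = queue.popleft()
        i = next((k for k, s in enumerate(form) if s.isupper()), None)
        if i is None:                      # no variables left: a generated string
            results.add(form)
            continue
        for rhs in productions[form[i]]:   # expand the leftmost variable
            new = form[:i] + rhs + form[i + 1:]
            if len(new) <= max_len and new not in seen:
                seen.add(new)
                queue.append(new)
    return results

# S -> aSb | SS | epsilon (think of a as "(" and b as ")")
productions = {"S": ["aSb", "SS", ""]}
print(sorted(generate(productions, "S", 8), key=len))
# includes '', 'ab', 'abab', 'aabb', 'aaabbb', ...
```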

Derivation Order and Derivation Trees

\(S\rightarrow AB, A\rightarrow aaA | \varepsilon, B\rightarrow Bb | \varepsilon\)

| derivation | 1 | 2 | 3 | 4 | 5 | 6 |
| --- | --- | --- | --- | --- | --- | --- |
| leftmost | \(S\) | \(AB\) | \(aaAB\) | \(aaB\) | \(aaBb\) | \(aab\) |
| rightmost | \(S\) | \(AB\) | \(ABb\) | \(Ab\) | \(aaAb\) | \(aab\) |

They give the same derivation tree

Ambiguity

A context-free grammar \(G\) is ambiguous if there is a string \(w\in L(G)\) which has:
two different derivation trees or two different leftmost derivations (two different derivation trees give two different leftmost derivations and vice versa)

In general, ambiguity is bad and we want to remove it
Sometimes it is possible to find a non-ambiguous grammar for a language
But, in general, it is difficult to achieve this

LN10 Simplifications of Context-Free Grammars

Techniques for simplifying a grammar:

  • The substitution rule
  • Removing nullable variables (\(\varepsilon\)-productions)
  • Removing unit-productions
  • Removing useless productions

Normal Forms for Context-free Grammars

Chomsky Normal Form: each production has the form \(A\rightarrow BC\) or \(A\rightarrow a\)
Conversion to Chomsky Normal Form
It is easy to find the Chomsky normal form for any context-free grammar (which doesn’t generate \(\varepsilon\))

Greibach Normal Form: all productions have the form \(A\rightarrow aV_1V_2\cdots V_k,\ k\ge 0\)
Greibach normal forms are very good for parsing strings (better than Chomsky Normal Form)
However, it is difficult to find the Greibach normal form of a grammar
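
As a small illustration (my own example, not one from the notes): the grammar \(S\rightarrow aSb\,|\,ab\) can be written in Chomsky Normal Form as
\(S\rightarrow AX\,|\,AB,\quad X\rightarrow SB,\quad A\rightarrow a,\quad B\rightarrow b\)
and in Greibach Normal Form as
\(S\rightarrow aSB\,|\,aB,\quad B\rightarrow b\)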

LN11 Properties of Context-Free Languages

Closed under

  • Union \(S\rightarrow S_1 | S_2\)
  • Concatenation \(S\rightarrow S_1S_2\)
  • Star Operation \(S\rightarrow S_1S | \lambda\)

Negative Properties of Context-Free Languages (not closed)

  • Intersection
  • Complement

\(L_1=\{a^nb^nc^m\},L_2=\{a^nb^mc^m\}\): both are context-free, but \(L_1\cap L_2=\{a^nb^nc^n\}\) is not context-free; since context-free languages are closed under union, closure under complement would imply closure under intersection, so they are not closed under complement either

Intersection of Context-Free Languages and Regular Languages (Applications of Regular Closure)

The intersection of a context-free language and a regular language is a context-free language
Example: prove that \(L=\{w:n_a(w)=n_b(w)=n_c(w)\}\) is not context-free.
\(L_2=L(a^*b^*c^*)\) is regular, and \(L\cap L_2=\{a^nb^nc^n\}\) is not context-free.
So, by regular closure, it is impossible that \(L\) is context-free

LN12 Pushdown Automata (PDAs)

Pushdown Automaton -- PDA

Initial Stack Symbol: the symbol initially on the stack (written \$ below)
A transition \(a,b\rightarrow c\) reads input \(a\), pops \(b\) from the stack, and pushes \(c\)

  • replace: \(a,b\rightarrow c\)
  • push: \(a,\varepsilon\rightarrow c\)
  • pop: \(a,b\rightarrow\varepsilon\)
  • no change: \(a,\varepsilon\rightarrow\varepsilon\)

PDAs are non-deterministic: non-deterministic transitions are allowed

A string is accepted if there is a computation such that:
All the input is consumed AND The last state is an accepting state
we do not care about the stack contents at the end of the accepting computation

Formalities for PDAs

Pushdown Automaton (PDA)
\(M = (Q, \Sigma, \Gamma, \delta, q_0, z, F)\)

  • \(Q\): set of states
  • \(\Sigma\): input alphabet
  • \(\Gamma\): stack alphabet
  • \(\delta\): transition function, e.g. \(\delta(q_1,a,w_1)=\{(q_2,w_2),(q_3,w_3)\}\)
  • \(q_0\): initial state
  • \(z\): stack start symbol
  • \(F\): set of accept states

A configuration \((q,u,s)\) records:

  • \(q\): current state
  • \(u\): remaining input
  • \(s\): current stack content

A move: \((q_1, bbb, aaa\$) \vdash (q_2, bb, aa\$)\)

Language \(L(M)\) accepted by PDA \(M\):

\(L(M)=\{w:(q_0,w,z)\overset{*}{\vdash} (q_f,\varepsilon,s), q_f\in F\}\)
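
A Python sketch of this acceptance condition (my own simplified encoding: every transition pops the stack top and pushes a string, so the "push" form \(a,\varepsilon\rightarrow c\) has to be written as \(a,b\rightarrow cb\); the search over non-deterministic choices is bounded to guard against infinite \(\varepsilon\)-loops):

```python
def pda_accepts(delta, q0, z, F, w, max_steps=10_000):
    """delta[(state, input or '', stack_top)] = set of (next_state, push_string).
    Explore configurations (state, remaining input, stack) non-deterministically."""
    todo = [(q0, w, z, 0)]
    while todo:
        q, u, s, steps = todo.pop()
        if u == "" and q in F:            # all input consumed, accepting state
            return True
        if steps >= max_steps or not s:
            continue
        moves = [("", u)] if u == "" else [("", u), (u[0], u[1:])]
        for a, rest in moves:             # epsilon-move or consume one input symbol
            for q2, push in delta.get((q, a, s[0]), ()):
                todo.append((q2, rest, push + s[1:], steps + 1))
    return False

# PDA for { a^n b^n : n >= 0 }, stack start symbol "$"
delta = {("q0", "a", "$"): {("q0", "A$")}, ("q0", "a", "A"): {("q0", "AA")},
         ("q0", "b", "A"): {("q1", "")},   ("q1", "b", "A"): {("q1", "")},
         ("q0", "", "$"): {("q2", "$")},   ("q1", "", "$"): {("q2", "$")}}
print(pda_accepts(delta, "q0", "$", {"q2"}, "aaabbb"))   # True
print(pda_accepts(delta, "q0", "$", {"q2"}, "aab"))      # False
```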

LN13 PDAs Accept Context-Free Languages

Theorem: Context-Free Languages (Grammars) = Languages Accepted by PDAs

Convert Context-Free Grammars to PDAs

For each terminal \(a\) in \(G\), add \(a,a\rightarrow\varepsilon\)
For each production \(A\rightarrow w\) in \(G\), add \(\varepsilon, A\rightarrow w\)
e.g. \(G: S\rightarrow aSTb|b, T\rightarrow Ta|\varepsilon\)
PDA: \(\varepsilon,S\rightarrow aSTb;\varepsilon,S\rightarrow b;\varepsilon,T\rightarrow Ta; \varepsilon,T\rightarrow \varepsilon;a,a\rightarrow \varepsilon;b, b\rightarrow \varepsilon\)
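
The conversion is mechanical enough to write down directly. A Python sketch that produces the transition list for the single working state of the PDA (the tuple encoding is my own):

```python
def grammar_to_pda_transitions(productions, terminals):
    """For each production A -> w add the epsilon-input transition (epsilon, A -> w);
    for each terminal a add the matching transition (a, a -> epsilon)."""
    transitions = [("", head, body)
                   for head, bodies in productions.items() for body in bodies]
    transitions += [(a, a, "") for a in sorted(terminals)]
    return transitions

G = {"S": ["aSTb", "b"], "T": ["Ta", ""]}
for read, pop, push in grammar_to_pda_transitions(G, {"a", "b"}):
    print(f"{read or 'eps'}, {pop} -> {push or 'eps'}")
```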

Grammar Leftmost Derivation
\(\begin{array}{l} S\\ \Rightarrow aSTb \\ \Rightarrow abTb \\ \Rightarrow abTab \\ \Rightarrow abab\end{array}\)

PDA Computation

\(\begin{array}{l} (q_0, abab, \$) \\ \vdash (q_1, abab, S\$) \\ \vdash (q_1, abab, aSTb\$) \\ \vdash (q_1, bab, STb\$) \\ \vdash (q_1, bab, bTb\$) \\ \vdash (q_1, ab, Tb\$) \\ \vdash (q_1, ab, Tab\$) \\ \vdash (q_1, ab, ab\$) \\ \vdash (q_1, b, b\$) \\ \vdash (q_1, \varepsilon, \$) \\ \vdash (q_2, \varepsilon, \$) \\ \end{array}\)

Grammar \(G\) generates string \(w\) (\(S\overset{*}{\Rightarrow} w\)) if and only if PDA \(M\) accepts \(w\) (\((q_0,w,\$)\overset{*}{\vdash} (q_2,\varepsilon,\$)\))

Convert PDAs to Context-Free Grammars

Too long to write down here

LN14 Deterministic PDAs (DPDAs)

Definition

\(L=\{a^nb^n:n\ge 0\}\) is accepted by a DPDA
\(L=\{vv^R:v\in\{a,b\}^*\}\) is not accepted by any DPDA,
because two transitions such as \(?,a\rightarrow b\) and \(?,a\rightarrow c\) from the same state, with the same input symbol and stack top, are not allowed in a DPDA

PDAs Have More Power than DPDAs

We will show that there exists a context-free language \(L\) which is not accepted by any DPDA,
which means Deterministic Context-Free Languages (DPDAs) \(\subsetneq\) Context-Free Languages (PDAs)

e.g. \(L=\{a^nb^n\}\cup \{a^nb^{2n}\}\)
Because if there were a DPDA \(M\) accepting \(L\), we could obtain a machine accepting \(\{a^nb^nc^n\}\) by modifying \(M\): after \(a^nb^n\) has been read, let the machine read \(c\) wherever \(M\) would read \(b\)
The language \(\{a^nb^nc^n\}\) is not context-free (proved with the pumping lemma for context-free languages below)

LN15-16 Pumping Lemma for Context-free Languages

The Pumping Lemma:
For any infinite context-free language \(L\)
there exists an integer \(p\) such that
for any string \(w\in L,|w|\ge p\)
we can write \(w=uvxyz\)
with length \(|vxy|\le p\) and \(|vy|\ge 1\)
and it must be that: \(uv^ixy^iz\in L,\forall i\in \mathbb{N}\)
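
As a sketch of the argument referenced in LN14 (standard, but worth spelling out): to show \(\{a^nb^nc^n:n\ge 0\}\) is not context-free, assume it is and let \(p\) be its pumping length. Pick \(w=a^pb^pc^p\) and write \(w=uvxyz\) with \(|vxy|\le p\) and \(|vy|\ge 1\). Since \(|vxy|\le p\), the substring \(vxy\) cannot contain both \(a\)'s and \(c\)'s. If \(v\) or \(y\) contains two different symbols, then \(uv^2xy^2z\) has symbols out of order; otherwise pumping changes the count of at most two of the three symbols while the third stays at \(p\). Either way \(uv^2xy^2z\notin L\), a contradiction.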

LN17 Turing Machines

Turing Machines are deterministic
The tape has infinite length
The head at each transition (time step):

  1. Reads a symbol
  2. Writes a symbol
  3. Moves Left or Right

Turing Machines are deterministic (no \(\varepsilon\)-transitions allowed)
Having no transition for a given state and symbol is allowed (in that case the machine halts)

Halting

The machine halts in a state if there is no transition to follow
Accepting states have no outgoing transitions
The machine halts and accepts
Accept Input String = If machine halts in an accept state
Reject Input String = If machine halts in a non-accept state or If machine enters an infinite loop
In order to accept an input string, it is not necessary to scan all the symbols of the input string

Formal Definitions for Turing Machines

\(M=(Q,\Sigma,\Gamma,\delta,q_0,\lozenge,F)\)
\(\lozenge\): Blank
A move: \(q_0 xayb\vdash x q_1 ayb\)
\(L(M)=\{w:q_0w\overset{*}{\vdash} x_1q_fx_2, q_f\in F\}\)
If a language \(L\) is accepted by a Turing machine, then we say that \(L\) is:

  • Turing Recognizable
  • Turing Acceptable
  • Recursively Enumerable
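
A minimal Python sketch of the definition above (dictionary transition table and a sparse tape are my own encoding; the example machine, which just scans right over \(a\)'s and accepts \(L(a^*)\), is my own; the step limit is only a guard against infinite loops):

```python
def run_tm(delta, q0, accept, input_string, blank="#", max_steps=10_000):
    """delta[(state, symbol)] = (next_state, write_symbol, move) with move in {'L', 'R'}.
    Run until no transition applies (halt) or the step limit is hit,
    then accept iff the current state is accepting."""
    tape = dict(enumerate(input_string))        # position -> symbol
    q, head, steps = q0, 0, 0
    while (q, tape.get(head, blank)) in delta and steps < max_steps:
        q, write, move = delta[(q, tape.get(head, blank))]
        tape[head] = write
        head += 1 if move == "R" else -1
        steps += 1
    return q in accept

# Example: scan right over a's; on reaching a blank, move to the accept state.
delta = {("q0", "a"): ("q0", "a", "R"),
         ("q0", "#"): ("qacc", "#", "R")}
print(run_tm(delta, "q0", {"qacc"}, "aaa"))   # True
print(run_tm(delta, "q0", {"qacc"}, "aab"))   # False: halts in q0 (no transition for b)
```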

LN18 Turing’s Thesis

Turing’s thesis (1930s):
Any computation carried out by mechanical means can be performed by a Turing Machine
An algorithm for a problem is a Turing Machine which solves the problem

Variations of the Turing Machine

Turing machines with:

  • Stay-Option
  • Semi-Infinite Tape
  • Multitape
  • Multidimensional
  • Nondeterministic

Same Power of two machine classes:
both classes accept the same set of languages
Each of these variations has the same power as the Standard Turing Machine (they accept exactly the Turing-Recognizable Languages)

Turing Machines with Stay-Option