You are on page 1of 13

Abstraction of Problems

Data: abstracted as a word in a given alphabet.


Σ: alphabet, a finite, non-empty set of symbols. Regular Languages
Σ∗ : all the words of finite length built up using Σ:
Hantao Zhang
Conditions: abstracted as a set of words, called http://www.cs.uiowa.edu/∼ hzhang/c135
language
Any subset L ⊆ Σ∗ is a formal language. The University of Iowa
Department of Computer Science
Unknown: Implicitly a Boolean variable: true if a word is
the language; no, otherwise.
Given w ∈ Σ∗ and L ⊆ Σ∗ , does w ∈ L?

Theory of Computation – p.2/?? Theory of Computation – p.1/??

Formal Definition of Finite Automata Finite Automata


A finite automaton is a 5-tuple (Q, Σ, δ, q0 , F ) where: The simplest computational model is called a finite state
Q is a finite set called the states machine or a finite automaton

Σ is a finite set called the alphabet Representations:


Graphical
δ : Q × Σ → Q is the transition function
Tabular
q0 ∈ Q is the start state also called initial state Mathematical.
F ⊆ Q is the set of accept states, also called the final
states

Theory of Computation – p.4/?? Theory of Computation – p.3/??


Computation of a Finite Automaton Applications
The automaton receives the input symbols one by one Finite automata are popular in parser construction of
from left to right compilers.
After reading each symbol, M1 moves from one state to Finite automata and their probabilistic counterparts,
another along the transition that has that symbol as its Markov chain, are useful tools for pattern recognition
label used in speech processing and optical character recognition
When M1 reads the last symbol of the input it produces Markov chains have been used to model and predict
the output: accept if M1 is in an accept state, or reject if price changes in financial applications
M1 is not in an accept state

Theory of Computation – p.6/?? Theory of Computation – p.5/??

Formal Definition of Acceptance Language Recognized by an Automaton


Let M = (Q, Σ, δ, q0 , F ) be a finite automaton and If L is the set of all strings that a finite automaton M
w = a1 a2 . . . an be a string over Σ. accepts, we say that L is the language of the machine
Then M accepts w if a sequence of states r0 , r1 , . . . , rn exist M and write L(M ) = L.
in Q such that the following hold: An automaton may accept several strings, but it always
1. r0 = q0 recognizes only one language
2. δ(ri , ai+1 ) = ri+1 for i = 0, 1, . . . , n − 1 If a machine accepts no strings, it still recognizes one
language, namely the empty language ∅
3. rn ∈ F

Condition (1) says where the machine starts.


Condition (2) says that the machine goes from state to state
according to its transition function δ.
Condition (3) says when the machine accepts its input: if it ends up in
an accept state.

Theory of Computation – p.8/?? Theory of Computation – p.7/??


Designing Finite Automata Regular Languages
Whether it be of automaton or artwork, design is a We say that a finite automaton M recognizes the language
creative process. Consequently it cannot be reduced to L if L = {w | M accepts w}
a simple recipe or formula.
Definition A language is called regular language if there exists
The approach:
Identify the finite pieces of information you need to a finite automaton that recognizes it.
solve the problem. These are the states.
Identify the condition (alphabet) to change from one
state to another.
Identify the start and final states.
Add missing transitions.

Theory of Computation – p.10/?? Theory of Computation – p.9/??

Regular Expressions Operations on Regular Languages


Three base cases: Let A and B be languages. We define regular operations
∅ is a regular expression denoting the language ∅; union, concatenation, and star as follows:
ǫ is a regular expression denoting the language {ǫ}; Union: A ∪ B = {x | x ∈ A ∨ x ∈ B}

For any a ∈ Σ, a is a regular expression denoting the Concatenation: A ◦ B = {xy | x ∈ A ∧ y ∈ B}

language {a}; Star: A∗ = {x1 x2 . . . xk | k ≥ 0 ∧ xi ∈ A, 1 ≤ i ≤ k}


Note:
Three recursive cases: If r1 and r2 are regular (1) Because “any number" includes 0, ǫ ∈ A∗ , no matter
expressions denoting languages L1 and L2 , respectively, what A is.
then (2) A+ denotes A ◦ A∗ .
Union: r1 ∪ r2 denotes L1 ∪ L2 ;
Concatenation: r1 r2 denotes L1 L2 ;
Star: r1∗ denotes L∗1 .

Theory of Computation – p.12/?? Theory of Computation – p.11/??


Closure under Complementation Closure under Intersection
TheoremThat class of regular languages is closed under Theorem That class of regular languages is closed under
complementation. intersection.
ProofFor that we will first show that if M is a DFA that Proof Use cross-product construction of states.
recognizes a language B , swapping the accept and
non-accept states in M yields a new DFA that
recognizes the complement of B .

Theory of Computation – p.14/?? Theory of Computation – p.13/??

Two ways to Introduce Nondeterminism Nondeterminism


More choices for the next state: Zero, one, or many So far in our discussion, every step of a finite
arrows may exit from each state. automaton computation follows in a unique way from
A state may change to the next state without spending the preceding step.
an input symbol: ǫ-transitions. We call this a deterministic computation. In a
nondeterministic computation, choices may exist for the
next state at any point.
Nondeterminism is a generalization of determinism;
hence, every finite automaton is a nondeterministic
finite automaton (NFA).

Theory of Computation – p.16/?? Theory of Computation – p.15/??


Tree Computation of NFA Formal Definition of NFA
A way to think of a nondeterministic computation is as a An NFA is a 5-tuple (Q, Σ, δ, q0 , F ) where:
tree of possibilities Q is a finite set called the states
The root of the tree corresponds to the start of the Σ is a finite set called the alphabet
computation δ : Q × (Σ ∪ {ǫ}) → P(Q)
Every branching point in the tree corresponds to a point where P(Q) is the power set of Q.
in the computation at which the machine has multiple q0 ∈ Q is the start state also called initial state
choices F ⊆ Q is the set of accept states, also called the final
The machine accepts if at least one of the computation states
branches ends in an accept state. In a DFA transition function is δ : Q × Σ → Q.
notation: For any alphabet Σ, Σǫ = Σ ∪ {ǫ}

Theory of Computation – p.18/?? Theory of Computation – p.17/??

Computation by an NFA Why NFA?


Let N = (Q, Σ, δ, q0 , F ) be an NFA and w a string over Σ. We Constructing NFA is sometimes easier than
say that N accepts w if w = a1 a2 . . . am , ai ∈ Σǫ , 1 ≤ i ≤ m, constructing DFA
and a sequence of states r0 , r1 , . . . , rm exists in Q such that: An NFA may be much smaller than a DFA that performs
1. r0 = q0 the same task.
2. ri+1 ∈ δ(ri , ai+1 ), for i = 0, 1, . . . , m − 1 Computation of NFA is usually more expensive than
3. rm ∈ F that of DFA.
Every NFA can be converted into an equivalent DFA.
Condition 1 says the machine’s starting state.
NFA provides good introduction to nondeterminism in
Condition 2 says that state ri+1 is one of the allowable more powerful computational models.
new states when N is in state ri and reads ai+1 . Note
that δ(ri , ai+1 ) is a set
Condition 3 says that the machine accepts the input if
the last state is in the accept state set.
Theory of Computation – p.20/?? Theory of Computation – p.19/??
Application of NFA Equivalence of NFA and DFA
Now we use the NFA to show that collection of regular Theorem Every NFA automation has an equivalent DFA to
languages is closed under regular operations union, recognize the same language.
concatenation, and star.
DFAs and NFAs recognize the same class of languages
The class of regular languages is
Theorem 1.25, 1.45
closed under the union operation. This equivalence is both surprising and useful.
It is surprising because NFAs appears to have more power than
The class of regular languages is
Theorem 1.26, 1.47
DFA, so we might expect that NFA recognizes more languages
closed under concatenation operation.
It is useful because describing an NFA for a given language
Theorem 1.49The class of regular languages is closed
sometimes is much easier than describing a DFA
under star operation.

Theory of Computation – p.22/?? Theory of Computation – p.21/??

Equivalence of NFA and DFA Question


The Formal Proof Is the class of languages recognized by NFAs closed under
Let N = (Q, Σ, δ, q0 , F ) be the NFA recognizing the language
A. We construct the DFA M recognizing A. complementation?

Before doing the full construction, consider first the


easier case when N has no ǫ transitions.

Theory of Computation – p.24/?? Theory of Computation – p.23/??


Corollary 1.40 Equivalence of NFA and DFA
A language is regular iff some NFA recognizes it The Formal Proof
Let N = (Q, Σ, δ, q0 , F ) be the NFA recognizing the language
Proof:
A. We construct the DFA M recognizing A.
If a language A is recognized by an NFA then A is
Before doing the full construction, consider first the
recognized by the DFA equivalent, hence, A is regular
easier case when N has no ǫ transitions.
If a language A is regular, it means that it is recognized
Then we consider the ǫ transitions.
by a DFA. But any DFA is also an NFA, hence, the
language is recognized by an NFA Notation: For any R ⊆ Q define E(R) to be the collection
of states that can be reached from R by going only
along ǫ transitions, including the members of R
themselves. Formally:
E(R) = R ∪ {q ∈ Q | ∃r1 ∈ R, r2 , ..., rk ∈ Q, ri+1 ∈
δ(ri , ǫ), rk = q}

Theory of Computation – p.26/?? Theory of Computation – p.25/??

Intuition may fail us Nonregular languages


Just because a language appears to require Consider the language B = {0n 1n |n ≥ 0}.
unbounded memory in order to be recognized, it If we attempt to find a DFA that recognizes B we discover that such a
doesn’t mean that it is necessarily so machine needs to remember how many 0s have been seen so far as
Example: it reads the input
C = {w|w has an equal number of 0s and 1s} is not regular Because the number of 0s isn’t limited, the machine needs to keep
D = {w|w has an equal number of 01 and 10 as substrings} is track of an unlimited number of possibilities
regular This cannot be done with any finite number of states

Theory of Computation – p.28/?? Theory of Computation – p.27/??


Pumping property Language nonregularity
All strings in the language can be “pumped" if they are at The technique for proving nonregularity of some
least as long as a certain special value, called the pumping languages is provided by a theorem about regular
length languages called pumping lemma
Pumping lemma states that all regular languages have
Meaning: each such string in the language contains a section
a special property
that can be repeated any number of times with the resulting If we can show that a language does not have this
string remaining in the language. property we are guaranteed that it is not regular

Theory of Computation – p.30/?? Theory of Computation – p.29/??

Proof idea Theorem 1.37


Let M = (Q, Σ, δ, q1 , F ) be a DFA that recognizes A Pumping Lemma: If A is a regular language, then there is a
Assign a pumping length p to be the number of states of M number p (the pumping length) where, if s is any string in A
of length at least p, then s may be divided into three pieces,
Show that any string s ∈ A, |s| ≥ p may be broken into three pieces
s = xyz , satisfying the following conditions:
xyz satisfying the pumping lemma’s conditions
1. for each i ≥ 0, xy i z ∈ A
If there are no strings in A of length at least p then theorem becomes
vacuously true because all three conditions hold for all strings of 2. |y| > 0
length at least p if there are no such strings 3. |xy| ≤ p

Theory of Computation – p.32/?? Theory of Computation – p.31/??


Note Pumping lemma’s proof
As x takes M from r1 to rj , y takes M from rj to rj , and Let M = (Q, Σ, δ, q1 , F ) by a DFA recognizing A and p be the
z takes M from rj to rn+1 , which is an accept state, M number of states of M ; let s = s1 s2 . . . sn be a string over Σ
must accept xy i z , for i ≥ 0 with n ≥ p and r1 , r2 , . . . , rn+1 be the sequence of states
while processing s, i.e., ri+1 = δ(ri , si ), 1 ≤ i ≤ n
We know that j 6= k , so |y| > 0;
n + 1 ≥ p + 1 and among the first p + 1 elements in r1 , r2 , . . . , rn+1
We also know that k ≤ p + 1, so |xy| ≤ p two must be the same state, say rj , rk .
Because rk occurs among the first p + 1 places in the sequence
Thus, all conditions are satisfied and lemma is proven starting at r1 , we have k ≤ p + 1
Now let x = s1 . . . sj−1 , y = sj . . . sk−1 , z = sk . . . sn .

Theory of Computation – p.34/?? Theory of Computation – p.33/??

Applications Using pumping lemma


Example 1.38 prove that B = {0n 1n |n ≥ 0} is not regular Prove that a language A is not regular using pumping
lemma:
Assume that B is regular and let p be the pumping length of 1. Assume that A is regular in order to obtain a contradiction
B . Choose s = 0p 1p ∈ B ; obviously |0p 1p | > p. By pumping 2. The pumping lemma guarantees the existence of a pumping length p
lemma s = xyz such that for any i ≥ 0, xy i z ∈ B such that all strings of length p or greater in A can be pumped
3. Find s ∈ A, |s| ≥ p, that cannot be pumped: demonstrate that s
cannot be pumped by considering all ways of dividing s into x,y,z,
showing that for each division one of the pumping lemma conditions,
(1) xy i z 6∈ B, (2) |y| > 0, (3) |xy| ≤ p, fails.

Theory of Computation – p.36/?? Theory of Computation – p.35/??


Example 1.39 Example, continuation
Prove that C = {w|w has an equal number of 0s and 1s} is Consider the cases:
not regular 1. y consists of 0s only. In this case xyyz has more 0s than 1s and so it
is not in B, violating condition 1
Proof: assume that C is regular and p is its pumping length.
2. y consists of 1s only. This leads to the same contradiction
Let s = 0p 1p with s ∈ C . Then pumping lemma guarantees
3. y consists of 0s and 1s. In this case xyyz may have the same number
that s = xyz , where xy i z ∈ C for any i ≥ 0. of 0s and 1s but they are out of order with some 1s before some 0s
hence it cannot be in B either

The contradiction is unavoidable if we make the assumption


that B is regular so B is not regular

Theory of Computation – p.38/?? Theory of Computation – p.37/??

Other selections Note


Selecting s = (01)p leads us to trouble because this string If we take the division x = z = ǫ, y = 0p 1p it seems that
indeed, no contradiction occurs. However:
can be pumped by the division: x = ǫ, y = 01, z = (01)p−1 .
Condition 3 states that |xy| ≤ p, and in our case
Then xy i z ∈ C for any i ≥ 0 xy = 0p 1p and |xy| > p. Hence, 0p 1p cannot be pumped
If |xy| ≤ p then y must consists of only 0s, so xyyz 6∈ C
because there are more 0-s than 1-s. This gives us the
desired contradiction

Theory of Computation – p.40/?? Theory of Computation – p.39/??


Example 1.40 An alternative method
Show that F = {ww|w ∈ {0, 1}∗ } is nonregular using Use the fact that B is nonregular.
pumping lemma If C were regular then C ∩ 0∗ 1∗ would also be regular because 0∗ 1∗ is
regular and intersection of regular languages is a regular language.
Proof: Assume that F is regular and p is its pumping length.
But C ∩ 0∗ 1∗ = B which is not regular.
Consider s = 0p 10p 1 ∈ F . Since |s| > p s = xyz and satisfies
Hence, C is not regular either.
the conditions of the pumping lemma.

Theory of Computation – p.42/?? Theory of Computation – p.41/??

Example 1.41 Note


2
Show that D = {1n |n ≥ 0} is nonregular. Condition 3 is again crucial because without it we could
pump s if we let x = z = ǫ, so xyyz ∈ F
Proof: by contradiction. Assume that D is regular and let p be Choosing s = 0p 10p 1 is the string that exhibits the
2
its pumping length. Consider s = 1p ∈ D, |s| ≥ p. Pumping essence of the nonregularity of F . If we chose, say
0p 0p ∈ F we fail because this string can be pumped
lemma guarantees that s can be split, s = xyz , where for all
i ≥ 0, xy i z ∈ D

Theory of Computation – p.44/?? Theory of Computation – p.43/??


Turning this idea into a proof Searching for a contradiction
Calculate the value of i that gives us the contradiction. The elements of D are strings whose lengths are perfect
squares. Looking at first perfect squares we observe that
If m = n2 , calculating the difference we obtain
√ they are: 0, 1, 4, 9, 25, 36, 49, 64, 81, . . .
(n + 1)2 − n2 = 2n + 1 = 2 m + 1
Note the growing gap between these numbers: large members
By pumping lemma |xy i z| and |xy i+1 z| are both perfect cannot be near each other
squares. But letting |xy i z| = m we can see that they
p Consider two strings xy i z and xy i+1 z which differ from each other by
cannot be both perfect square if |y| < 2 |xy i z| + 1, a single repetition of y.
because they would be too close together.
If we chose i very large the lengths of xy i z and xy i+1 z cannot be
both perfect square because they are too close to each other.

Theory of Computation – p.46/?? Theory of Computation – p.45/??

Example 1.42 Value of i for contradiction


Sometimes “pumping down" is useful when we apply To calculate the value for i that leads to contradiction we
pumping lemma. observe that:
We illustrate this using pumping lemma to prove that |y| ≤ |s| = p2
E = {0i 1j |i > j} is not regular
Let i = p4 . p
Then p
Proof:by contradiction using pumping lemma. Assume that E is 2
p
|y| ≤ p = p4 < 2 p4 + 1 ≤ 2 |xy i z| + 1
regular and its pumping length is p.

Theory of Computation – p.48/?? Theory of Computation – p.47/??


Try something else Searching for a contradiction
Since xy i z ∈ E even when i = 0, consider i = 0 and Let s = 0p+1 1p ; From decomposition s = xyz , from
xy 0 z = xz ∈ E . condition 3, |xy| ≤ p it results that y consists only of 0s.
This decreases the number of 0s in s. Let us examine xyyz to see if it is in E . Adding an
Since s has just one more 0 than 1 and xz cannot have extra-copy of y increases the n numbers of zeros. Since
more 0s than 1s, E contains all strings 0∗ 1∗ that have more 0s than 1s will
(xyz = 0p+1 1p and |y| =
6 0)
still give a string in E
xz cannot be in E .

This is the required contradiction

Theory of Computation – p.50/?? Theory of Computation – p.49/??

You might also like