A finite automaton is a 5-tuple (q, s, d, q0, f) where q is a set called the states. F is the set of accept states, also called the final states. Finite automata and their probabilistic counterparts, Markov chain, are useful tools for pattern recognition used in speech processing and optical character recognition.
A finite automaton is a 5-tuple (q, s, d, q0, f) where q is a set called the states. F is the set of accept states, also called the final states. Finite automata and their probabilistic counterparts, Markov chain, are useful tools for pattern recognition used in speech processing and optical character recognition.
A finite automaton is a 5-tuple (q, s, d, q0, f) where q is a set called the states. F is the set of accept states, also called the final states. Finite automata and their probabilistic counterparts, Markov chain, are useful tools for pattern recognition used in speech processing and optical character recognition.
Σ: alphabet, a finite, non-empty set of symbols. Regular Languages Σ∗ : all the words of finite length built up using Σ: Hantao Zhang Conditions: abstracted as a set of words, called http://www.cs.uiowa.edu/∼ hzhang/c135 language Any subset L ⊆ Σ∗ is a formal language. The University of Iowa Department of Computer Science Unknown: Implicitly a Boolean variable: true if a word is the language; no, otherwise. Given w ∈ Σ∗ and L ⊆ Σ∗ , does w ∈ L?
Theory of Computation – p.2/?? Theory of Computation – p.1/??
Formal Definition of Finite Automata Finite Automata
A finite automaton is a 5-tuple (Q, Σ, δ, q0 , F ) where: The simplest computational model is called a finite state Q is a finite set called the states machine or a finite automaton
Σ is a finite set called the alphabet Representations:
Graphical δ : Q × Σ → Q is the transition function Tabular q0 ∈ Q is the start state also called initial state Mathematical. F ⊆ Q is the set of accept states, also called the final states
Theory of Computation – p.4/?? Theory of Computation – p.3/??
Computation of a Finite Automaton Applications The automaton receives the input symbols one by one Finite automata are popular in parser construction of from left to right compilers. After reading each symbol, M1 moves from one state to Finite automata and their probabilistic counterparts, another along the transition that has that symbol as its Markov chain, are useful tools for pattern recognition label used in speech processing and optical character recognition When M1 reads the last symbol of the input it produces Markov chains have been used to model and predict the output: accept if M1 is in an accept state, or reject if price changes in financial applications M1 is not in an accept state
Theory of Computation – p.6/?? Theory of Computation – p.5/??
Formal Definition of Acceptance Language Recognized by an Automaton
Let M = (Q, Σ, δ, q0 , F ) be a finite automaton and If L is the set of all strings that a finite automaton M w = a1 a2 . . . an be a string over Σ. accepts, we say that L is the language of the machine Then M accepts w if a sequence of states r0 , r1 , . . . , rn exist M and write L(M ) = L. in Q such that the following hold: An automaton may accept several strings, but it always 1. r0 = q0 recognizes only one language 2. δ(ri , ai+1 ) = ri+1 for i = 0, 1, . . . , n − 1 If a machine accepts no strings, it still recognizes one language, namely the empty language ∅ 3. rn ∈ F
Condition (1) says where the machine starts.
Condition (2) says that the machine goes from state to state according to its transition function δ. Condition (3) says when the machine accepts its input: if it ends up in an accept state.
Theory of Computation – p.8/?? Theory of Computation – p.7/??
Designing Finite Automata Regular Languages Whether it be of automaton or artwork, design is a We say that a finite automaton M recognizes the language creative process. Consequently it cannot be reduced to L if L = {w | M accepts w} a simple recipe or formula. Definition A language is called regular language if there exists The approach: Identify the finite pieces of information you need to a finite automaton that recognizes it. solve the problem. These are the states. Identify the condition (alphabet) to change from one state to another. Identify the start and final states. Add missing transitions.
Theory of Computation – p.10/?? Theory of Computation – p.9/??
Regular Expressions Operations on Regular Languages
Three base cases: Let A and B be languages. We define regular operations ∅ is a regular expression denoting the language ∅; union, concatenation, and star as follows: ǫ is a regular expression denoting the language {ǫ}; Union: A ∪ B = {x | x ∈ A ∨ x ∈ B}
For any a ∈ Σ, a is a regular expression denoting the Concatenation: A ◦ B = {xy | x ∈ A ∧ y ∈ B}
language {a}; Star: A∗ = {x1 x2 . . . xk | k ≥ 0 ∧ xi ∈ A, 1 ≤ i ≤ k}
Note: Three recursive cases: If r1 and r2 are regular (1) Because “any number" includes 0, ǫ ∈ A∗ , no matter expressions denoting languages L1 and L2 , respectively, what A is. then (2) A+ denotes A ◦ A∗ . Union: r1 ∪ r2 denotes L1 ∪ L2 ; Concatenation: r1 r2 denotes L1 L2 ; Star: r1∗ denotes L∗1 .
Theory of Computation – p.12/?? Theory of Computation – p.11/??
Closure under Complementation Closure under Intersection TheoremThat class of regular languages is closed under Theorem That class of regular languages is closed under complementation. intersection. ProofFor that we will first show that if M is a DFA that Proof Use cross-product construction of states. recognizes a language B , swapping the accept and non-accept states in M yields a new DFA that recognizes the complement of B .
Theory of Computation – p.14/?? Theory of Computation – p.13/??
Two ways to Introduce Nondeterminism Nondeterminism
More choices for the next state: Zero, one, or many So far in our discussion, every step of a finite arrows may exit from each state. automaton computation follows in a unique way from A state may change to the next state without spending the preceding step. an input symbol: ǫ-transitions. We call this a deterministic computation. In a nondeterministic computation, choices may exist for the next state at any point. Nondeterminism is a generalization of determinism; hence, every finite automaton is a nondeterministic finite automaton (NFA).
Theory of Computation – p.16/?? Theory of Computation – p.15/??
Tree Computation of NFA Formal Definition of NFA A way to think of a nondeterministic computation is as a An NFA is a 5-tuple (Q, Σ, δ, q0 , F ) where: tree of possibilities Q is a finite set called the states The root of the tree corresponds to the start of the Σ is a finite set called the alphabet computation δ : Q × (Σ ∪ {ǫ}) → P(Q) Every branching point in the tree corresponds to a point where P(Q) is the power set of Q. in the computation at which the machine has multiple q0 ∈ Q is the start state also called initial state choices F ⊆ Q is the set of accept states, also called the final The machine accepts if at least one of the computation states branches ends in an accept state. In a DFA transition function is δ : Q × Σ → Q. notation: For any alphabet Σ, Σǫ = Σ ∪ {ǫ}
Theory of Computation – p.18/?? Theory of Computation – p.17/??
Computation by an NFA Why NFA?
Let N = (Q, Σ, δ, q0 , F ) be an NFA and w a string over Σ. We Constructing NFA is sometimes easier than say that N accepts w if w = a1 a2 . . . am , ai ∈ Σǫ , 1 ≤ i ≤ m, constructing DFA and a sequence of states r0 , r1 , . . . , rm exists in Q such that: An NFA may be much smaller than a DFA that performs 1. r0 = q0 the same task. 2. ri+1 ∈ δ(ri , ai+1 ), for i = 0, 1, . . . , m − 1 Computation of NFA is usually more expensive than 3. rm ∈ F that of DFA. Every NFA can be converted into an equivalent DFA. Condition 1 says the machine’s starting state. NFA provides good introduction to nondeterminism in Condition 2 says that state ri+1 is one of the allowable more powerful computational models. new states when N is in state ri and reads ai+1 . Note that δ(ri , ai+1 ) is a set Condition 3 says that the machine accepts the input if the last state is in the accept state set. Theory of Computation – p.20/?? Theory of Computation – p.19/?? Application of NFA Equivalence of NFA and DFA Now we use the NFA to show that collection of regular Theorem Every NFA automation has an equivalent DFA to languages is closed under regular operations union, recognize the same language. concatenation, and star. DFAs and NFAs recognize the same class of languages The class of regular languages is Theorem 1.25, 1.45 closed under the union operation. This equivalence is both surprising and useful. It is surprising because NFAs appears to have more power than The class of regular languages is Theorem 1.26, 1.47 DFA, so we might expect that NFA recognizes more languages closed under concatenation operation. It is useful because describing an NFA for a given language Theorem 1.49The class of regular languages is closed sometimes is much easier than describing a DFA under star operation.
Theory of Computation – p.22/?? Theory of Computation – p.21/??
Equivalence of NFA and DFA Question
The Formal Proof Is the class of languages recognized by NFAs closed under Let N = (Q, Σ, δ, q0 , F ) be the NFA recognizing the language A. We construct the DFA M recognizing A. complementation?
Before doing the full construction, consider first the
easier case when N has no ǫ transitions.
Theory of Computation – p.24/?? Theory of Computation – p.23/??
Corollary 1.40 Equivalence of NFA and DFA A language is regular iff some NFA recognizes it The Formal Proof Let N = (Q, Σ, δ, q0 , F ) be the NFA recognizing the language Proof: A. We construct the DFA M recognizing A. If a language A is recognized by an NFA then A is Before doing the full construction, consider first the recognized by the DFA equivalent, hence, A is regular easier case when N has no ǫ transitions. If a language A is regular, it means that it is recognized Then we consider the ǫ transitions. by a DFA. But any DFA is also an NFA, hence, the language is recognized by an NFA Notation: For any R ⊆ Q define E(R) to be the collection of states that can be reached from R by going only along ǫ transitions, including the members of R themselves. Formally: E(R) = R ∪ {q ∈ Q | ∃r1 ∈ R, r2 , ..., rk ∈ Q, ri+1 ∈ δ(ri , ǫ), rk = q}
Theory of Computation – p.26/?? Theory of Computation – p.25/??
Intuition may fail us Nonregular languages
Just because a language appears to require Consider the language B = {0n 1n |n ≥ 0}. unbounded memory in order to be recognized, it If we attempt to find a DFA that recognizes B we discover that such a doesn’t mean that it is necessarily so machine needs to remember how many 0s have been seen so far as Example: it reads the input C = {w|w has an equal number of 0s and 1s} is not regular Because the number of 0s isn’t limited, the machine needs to keep D = {w|w has an equal number of 01 and 10 as substrings} is track of an unlimited number of possibilities regular This cannot be done with any finite number of states
Theory of Computation – p.28/?? Theory of Computation – p.27/??
Pumping property Language nonregularity All strings in the language can be “pumped" if they are at The technique for proving nonregularity of some least as long as a certain special value, called the pumping languages is provided by a theorem about regular length languages called pumping lemma Pumping lemma states that all regular languages have Meaning: each such string in the language contains a section a special property that can be repeated any number of times with the resulting If we can show that a language does not have this string remaining in the language. property we are guaranteed that it is not regular
Theory of Computation – p.30/?? Theory of Computation – p.29/??
Proof idea Theorem 1.37
Let M = (Q, Σ, δ, q1 , F ) be a DFA that recognizes A Pumping Lemma: If A is a regular language, then there is a Assign a pumping length p to be the number of states of M number p (the pumping length) where, if s is any string in A of length at least p, then s may be divided into three pieces, Show that any string s ∈ A, |s| ≥ p may be broken into three pieces s = xyz , satisfying the following conditions: xyz satisfying the pumping lemma’s conditions 1. for each i ≥ 0, xy i z ∈ A If there are no strings in A of length at least p then theorem becomes vacuously true because all three conditions hold for all strings of 2. |y| > 0 length at least p if there are no such strings 3. |xy| ≤ p
Theory of Computation – p.32/?? Theory of Computation – p.31/??
Note Pumping lemma’s proof As x takes M from r1 to rj , y takes M from rj to rj , and Let M = (Q, Σ, δ, q1 , F ) by a DFA recognizing A and p be the z takes M from rj to rn+1 , which is an accept state, M number of states of M ; let s = s1 s2 . . . sn be a string over Σ must accept xy i z , for i ≥ 0 with n ≥ p and r1 , r2 , . . . , rn+1 be the sequence of states while processing s, i.e., ri+1 = δ(ri , si ), 1 ≤ i ≤ n We know that j 6= k , so |y| > 0; n + 1 ≥ p + 1 and among the first p + 1 elements in r1 , r2 , . . . , rn+1 We also know that k ≤ p + 1, so |xy| ≤ p two must be the same state, say rj , rk . Because rk occurs among the first p + 1 places in the sequence Thus, all conditions are satisfied and lemma is proven starting at r1 , we have k ≤ p + 1 Now let x = s1 . . . sj−1 , y = sj . . . sk−1 , z = sk . . . sn .
Theory of Computation – p.34/?? Theory of Computation – p.33/??
Applications Using pumping lemma
Example 1.38 prove that B = {0n 1n |n ≥ 0} is not regular Prove that a language A is not regular using pumping lemma: Assume that B is regular and let p be the pumping length of 1. Assume that A is regular in order to obtain a contradiction B . Choose s = 0p 1p ∈ B ; obviously |0p 1p | > p. By pumping 2. The pumping lemma guarantees the existence of a pumping length p lemma s = xyz such that for any i ≥ 0, xy i z ∈ B such that all strings of length p or greater in A can be pumped 3. Find s ∈ A, |s| ≥ p, that cannot be pumped: demonstrate that s cannot be pumped by considering all ways of dividing s into x,y,z, showing that for each division one of the pumping lemma conditions, (1) xy i z 6∈ B, (2) |y| > 0, (3) |xy| ≤ p, fails.
Theory of Computation – p.36/?? Theory of Computation – p.35/??
Example 1.39 Example, continuation Prove that C = {w|w has an equal number of 0s and 1s} is Consider the cases: not regular 1. y consists of 0s only. In this case xyyz has more 0s than 1s and so it is not in B, violating condition 1 Proof: assume that C is regular and p is its pumping length. 2. y consists of 1s only. This leads to the same contradiction Let s = 0p 1p with s ∈ C . Then pumping lemma guarantees 3. y consists of 0s and 1s. In this case xyyz may have the same number that s = xyz , where xy i z ∈ C for any i ≥ 0. of 0s and 1s but they are out of order with some 1s before some 0s hence it cannot be in B either
The contradiction is unavoidable if we make the assumption
that B is regular so B is not regular
Theory of Computation – p.38/?? Theory of Computation – p.37/??
Other selections Note
Selecting s = (01)p leads us to trouble because this string If we take the division x = z = ǫ, y = 0p 1p it seems that indeed, no contradiction occurs. However: can be pumped by the division: x = ǫ, y = 01, z = (01)p−1 . Condition 3 states that |xy| ≤ p, and in our case Then xy i z ∈ C for any i ≥ 0 xy = 0p 1p and |xy| > p. Hence, 0p 1p cannot be pumped If |xy| ≤ p then y must consists of only 0s, so xyyz 6∈ C because there are more 0-s than 1-s. This gives us the desired contradiction
Theory of Computation – p.40/?? Theory of Computation – p.39/??
Example 1.40 An alternative method Show that F = {ww|w ∈ {0, 1}∗ } is nonregular using Use the fact that B is nonregular. pumping lemma If C were regular then C ∩ 0∗ 1∗ would also be regular because 0∗ 1∗ is regular and intersection of regular languages is a regular language. Proof: Assume that F is regular and p is its pumping length. But C ∩ 0∗ 1∗ = B which is not regular. Consider s = 0p 10p 1 ∈ F . Since |s| > p s = xyz and satisfies Hence, C is not regular either. the conditions of the pumping lemma.
Theory of Computation – p.42/?? Theory of Computation – p.41/??
Example 1.41 Note
2 Show that D = {1n |n ≥ 0} is nonregular. Condition 3 is again crucial because without it we could pump s if we let x = z = ǫ, so xyyz ∈ F Proof: by contradiction. Assume that D is regular and let p be Choosing s = 0p 10p 1 is the string that exhibits the 2 its pumping length. Consider s = 1p ∈ D, |s| ≥ p. Pumping essence of the nonregularity of F . If we chose, say 0p 0p ∈ F we fail because this string can be pumped lemma guarantees that s can be split, s = xyz , where for all i ≥ 0, xy i z ∈ D
Theory of Computation – p.44/?? Theory of Computation – p.43/??
Turning this idea into a proof Searching for a contradiction Calculate the value of i that gives us the contradiction. The elements of D are strings whose lengths are perfect squares. Looking at first perfect squares we observe that If m = n2 , calculating the difference we obtain √ they are: 0, 1, 4, 9, 25, 36, 49, 64, 81, . . . (n + 1)2 − n2 = 2n + 1 = 2 m + 1 Note the growing gap between these numbers: large members By pumping lemma |xy i z| and |xy i+1 z| are both perfect cannot be near each other squares. But letting |xy i z| = m we can see that they p Consider two strings xy i z and xy i+1 z which differ from each other by cannot be both perfect square if |y| < 2 |xy i z| + 1, a single repetition of y. because they would be too close together. If we chose i very large the lengths of xy i z and xy i+1 z cannot be both perfect square because they are too close to each other.
Theory of Computation – p.46/?? Theory of Computation – p.45/??
Example 1.42 Value of i for contradiction
Sometimes “pumping down" is useful when we apply To calculate the value for i that leads to contradiction we pumping lemma. observe that: We illustrate this using pumping lemma to prove that |y| ≤ |s| = p2 E = {0i 1j |i > j} is not regular Let i = p4 . p Then p Proof:by contradiction using pumping lemma. Assume that E is 2 p |y| ≤ p = p4 < 2 p4 + 1 ≤ 2 |xy i z| + 1 regular and its pumping length is p.
Theory of Computation – p.48/?? Theory of Computation – p.47/??
Try something else Searching for a contradiction Since xy i z ∈ E even when i = 0, consider i = 0 and Let s = 0p+1 1p ; From decomposition s = xyz , from xy 0 z = xz ∈ E . condition 3, |xy| ≤ p it results that y consists only of 0s. This decreases the number of 0s in s. Let us examine xyyz to see if it is in E . Adding an Since s has just one more 0 than 1 and xz cannot have extra-copy of y increases the n numbers of zeros. Since more 0s than 1s, E contains all strings 0∗ 1∗ that have more 0s than 1s will (xyz = 0p+1 1p and |y| = 6 0) still give a string in E xz cannot be in E .
This is the required contradiction
Theory of Computation – p.50/?? Theory of Computation – p.49/??