
First and Follow Sets

When I learnt about first and follow sets at university I found them difficult to follow, so I have tried to rewrite the rules I was taught for creating them so that they would be easier to understand. I hope it helps :) If you are worried about whether these rules are actually correct, I have had a lecturer ask if he can use them in his class, so I am assuming they are correct... Please feel free to contact me if you have any queries about or suggestions for this page. My email address is james@jambe.cjb.net.

Rules for First Sets


1. If X is a terminal then First(X) is just X!
2. If there is a Production X → ε then add ε to First(X)
3. If there is a Production X → Y1Y2..Yk then add First(Y1Y2..Yk) to First(X)
4. First(Y1Y2..Yk) is either
   1. First(Y1) (if First(Y1) doesn't contain ε)
   2. OR (if First(Y1) does contain ε) then First(Y1Y2..Yk) is everything in First(Y1) <except for ε> as well as everything in First(Y2..Yk)
   3. If First(Y1) First(Y2)..First(Yk) all contain ε then add ε to First(Y1Y2..Yk) as well.
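The four rules above can be run as a loop that keeps applying them until no set grows any more. The following is only a minimal sketch of that idea, using the expression grammar from the worked example on this page. The names EPS, GRAMMAR, first_of_string and compute_first are made up for this illustration, and representing an ε-production as an empty list is an assumption of the sketch.

```python
EPS = "ε"  # stands for the empty string (assumed representation)

# Expression grammar from the worked example. Each body is a list of symbols;
# an empty list is an ε-production. Anything that is not a key is a terminal.
GRAMMAR = {
    "E":  [["T", "E'"]],
    "E'": [["+", "T", "E'"], []],
    "T":  [["F", "T'"]],
    "T'": [["*", "F", "T'"], []],
    "F":  [["(", "E", ")"], ["id"]],
}


def first_of_string(symbols, first):
    """Rule 4: First(Y1 Y2 .. Yk), using the First sets found so far."""
    result = set()
    for sym in symbols:
        # Rule 1: the First set of a terminal is just the terminal itself.
        sym_first = first[sym] if sym in first else {sym}
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result  # this symbol cannot vanish, so stop here
    # Every symbol can vanish (this also covers the empty body, i.e. rule 2).
    result.add(EPS)
    return result


def compute_first(grammar):
    """Apply rules 2 and 3 to every production until nothing changes."""
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:
        changed = False
        for nt, bodies in grammar.items():
            for body in bodies:
                new = first_of_string(body, first)
                if not new <= first[nt]:
                    first[nt] |= new
                    changed = True
    return first


FIRST = compute_first(GRAMMAR)
for nt in GRAMMAR:
    print(f"FIRST({nt}) = {sorted(FIRST[nt])}")
```

The loop order does not matter for the result: sets only ever grow, so repeating until nothing changes reaches the same answer as applying the rules by hand.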

Rules for Follow Sets


1. First put $ (the end of input marker) in Follow(S) (S is the start symbol)
2. If there is a production A → aBb, (where a can be a whole string) then everything in FIRST(b) except for ε is placed in FOLLOW(B).
3. If there is a production A → aB, then everything in FOLLOW(A) is in FOLLOW(B)
4. If there is a production A → aBb, where FIRST(b) contains ε, then everything in FOLLOW(A) is in FOLLOW(B)

Here is an example for you to follow through.


The Grammar

E → TE'
E' → +TE'
E' → ε
T → FT'
T' → *FT'
T' → ε
F → (E)
F → id

First Sets

We want to make First sets, so first we list the sets we need.
FIRST(E) = {}  FIRST(E') = {}  FIRST(T) = {}  FIRST(T') = {}  FIRST(F) = {}

First we apply rule 2 to T' → ε and E' → ε.
FIRST(E) = {}  FIRST(E') = {ε}  FIRST(T) = {}  FIRST(T') = {ε}  FIRST(F) = {}

Next we apply rule 3 to T' → *FT'. This rule tells us that we can add everything in First(*FT') into First(T'). Since First(*) using rule 1 is *, we can add * to First(T'). Applying rule 3 to E' → +TE' in the same way adds + to First(E').
FIRST(E) = {}  FIRST(E') = {+,ε}  FIRST(T) = {}  FIRST(T') = {*,ε}  FIRST(F) = {}

Two more productions begin with terminals: F → (E) and F → id. If we apply rule 3 to these we get...
FIRST(E) = {}  FIRST(E') = {+,ε}  FIRST(T) = {}  FIRST(T') = {*,ε}  FIRST(F) = {'(',id}

Next we apply rule 3 to T → FT'. Once again this tells us that we can add First(FT') to First(T). Since First(F) doesn't contain ε, that means that First(FT') is just First(F).
FIRST(E) = {}  FIRST(E') = {+,ε}  FIRST(T) = {'(',id}  FIRST(T') = {*,ε}  FIRST(F) = {'(',id}

Lastly we apply rule 3 to E → TE'. Once again this tells us that we can add First(TE') to First(E). Since First(T) doesn't contain ε, that means that First(TE') is just First(T).
FIRST(E) = {'(',id}  FIRST(E') = {+,ε}  FIRST(T) = {'(',id}  FIRST(T') = {*,ε}  FIRST(F) = {'(',id}

Doing anything else doesn't change the sets so we are done!

Follow Sets

We want to make Follow sets, so first we list the sets we need.
FOLLOW(E) = {}  FOLLOW(E') = {}  FOLLOW(T) = {}  FOLLOW(T') = {}  FOLLOW(F) = {}

The first thing we do is add $ to the Follow set of the start symbol 'E'.
FOLLOW(E) = {$}  FOLLOW(E') = {}  FOLLOW(T) = {}  FOLLOW(T') = {}  FOLLOW(F) = {}

Next we apply rule 2 to E' → +TE'. This says that everything in First(E') except for ε should be in Follow(T).
FOLLOW(E) = {$}  FOLLOW(E') = {}  FOLLOW(T) = {+}  FOLLOW(T') = {}  FOLLOW(F) = {}

Next we apply rule 3 to E → TE'. This says that we should add everything in Follow(E) into Follow(E').
FOLLOW(E) = {$}  FOLLOW(E') = {$}  FOLLOW(T) = {+}  FOLLOW(T') = {}  FOLLOW(F) = {}

Next we apply rule 3 to T → FT'. This says that we should add everything in Follow(T) into Follow(T').
FOLLOW(E) = {$}  FOLLOW(E') = {$}  FOLLOW(T) = {+}  FOLLOW(T') = {+}  FOLLOW(F) = {}

Now we apply rule 2 to T' → *FT'. This says that everything in First(T') except for ε should be in Follow(F).
FOLLOW(E) = {$}  FOLLOW(E') = {$}  FOLLOW(T) = {+}  FOLLOW(T') = {+}  FOLLOW(F) = {*}

Now we apply rule 2 to F → (E). This says that everything in First(')') should be in Follow(E).
FOLLOW(E) = {$,)}  FOLLOW(E') = {$}  FOLLOW(T) = {+}  FOLLOW(T') = {+}  FOLLOW(F) = {*}

Next we apply rule 3 to E → TE' again. This says that we should add everything in Follow(E) into Follow(E').
FOLLOW(E) = {$,)}  FOLLOW(E') = {$,)}  FOLLOW(T) = {+}  FOLLOW(T') = {+}  FOLLOW(F) = {*}

Next we apply rule 4 to E' → +TE'. This says that we should add everything in Follow(E') into Follow(T) (because First(E') contains ε).
FOLLOW(E) = {$,)}  FOLLOW(E') = {$,)}  FOLLOW(T) = {+,$,)}  FOLLOW(T') = {+}  FOLLOW(F) = {*}

Next we apply rule 3 to T → FT' again. This says that we should add everything in Follow(T) into Follow(T').
FOLLOW(E) = {$,)}  FOLLOW(E') = {$,)}  FOLLOW(T) = {+,$,)}  FOLLOW(T') = {+,$,)}  FOLLOW(F) = {*}

Finally we apply rule 4 to T' → *FT'. This says that we should add everything in Follow(T') into Follow(F).
FOLLOW(E) = {$,)}  FOLLOW(E') = {$,)}  FOLLOW(T) = {+,$,)}  FOLLOW(T') = {+,$,)}  FOLLOW(F) = {*,+,$,)}
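If you would like to check the Follow walkthrough mechanically, here is a rough sketch that applies the four Follow rules over and over until nothing changes. It takes the First sets from the example as given rather than recomputing them. The names GRAMMAR, FIRST, first_of_string and compute_follow are invented for this sketch, and writing ε-productions as empty lists is an assumption.

```python
EPS = "ε"  # the empty string (assumed representation)
END = "$"  # end of input marker

# Expression grammar from the example; an empty list is an ε-production.
GRAMMAR = {
    "E":  [["T", "E'"]],
    "E'": [["+", "T", "E'"], []],
    "T":  [["F", "T'"]],
    "T'": [["*", "F", "T'"], []],
    "F":  [["(", "E", ")"], ["id"]],
}

# First sets as worked out in the example (taken as given here).
FIRST = {
    "E":  {"(", "id"},
    "E'": {"+", EPS},
    "T":  {"(", "id"},
    "T'": {"*", EPS},
    "F":  {"(", "id"},
}


def first_of_string(symbols):
    """First set of a string of symbols; {ε} for the empty string."""
    result = set()
    for sym in symbols:
        sym_first = FIRST.get(sym, {sym})  # a terminal's First set is itself
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    result.add(EPS)
    return result


def compute_follow(grammar, start):
    follow = {nt: set() for nt in grammar}
    follow[start].add(END)  # rule 1
    changed = True
    while changed:
        changed = False
        for head, bodies in grammar.items():
            for body in bodies:
                for i, sym in enumerate(body):
                    if sym not in grammar:
                        continue  # only nonterminals get Follow sets
                    rest = first_of_string(body[i + 1:])
                    new = rest - {EPS}        # rule 2
                    if EPS in rest:           # rules 3 and 4
                        new |= follow[head]
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True
    return follow


FOLLOW = compute_follow(GRAMMAR, "E")
for nt in GRAMMAR:
    print(f"FOLLOW({nt}) = {sorted(FOLLOW[nt])}")
```

Rule 3 needs no separate case in this sketch: when nothing follows B in the production, the rest of the body is the empty string, whose First set is {ε}, so the rule 4 branch handles it.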

Example 4
Using the assignment grammar from Example 2 with the following abbreviations:
P = Program
U = Statements
S = Statement
A = Assignment Statement
E = Expression
T = Term
F = Factor

and with left-recursion and common prefixes removed:


P → U
U → S U'
U' → ε | U
S → A ;
A → Id := E
E → T E'
E' → + T E' | ε
T → F T'
T' → * F T' | ε
F → (E) | Id

The (magically created) table contains a row for each nonterminal and a column for each terminal. Here, we show only the part of the token needed to make the decision. All Id tokens are the same, but operator tokens have to be shown individually, since the parse is different for different operators.

The input string is:


a := b * c + d ;

Following the steps in the algorithm: (1) P is first pushed onto the stack. Since there is now a nonterminal on the top of the stack, the choice of nonterminal in the Case statement is taken. The current input symbol is a, which is an Id. We therefore consult the table at Table[P, Id], which contains the production P → U. In (2), we replace the P at the top of the stack with U:

     Stack             Input               Production
(1)  $ P               a := b * c + d ; $  P → U
(2)  $ U               a := b * c + d ; $

With the top of stack, U, we consult the table at Table[U, Id] (the input symbol is still a). The production there is U → S U'. We pop the top of the stack and replace it with S U' (with S on the "top"):

(2)  $ U               a := b * c + d ; $  U → S U'
(3)  $ U' S            a := b * c + d ; $

Continuing,

     $ U' S            a := b * c + d ; $  S → A ;
     $ U' ; A          a := b * c + d ; $  A → Id := E
     $ U' ; E := Id    a := b * c + d ; $

Now, the terminal option in the CASE statement is chosen since the top of the stack contains a terminal. This is matched with the first terminal in the input, and the stack is popped as the input is advanced:

     $ U' ; E :=       := b * c + d ; $

The top of the stack is a terminal which matches the input. The stack and input become:

     $ U' ; E          b * c + d ; $

Continuing (the remaining input is b * c + d ; $, and each row shows the stack, the input, and the production applied, if any):

     $ U' ; E          b * c + d ; $       E → T E'
     $ U' ; E' T       b * c + d ; $       T → F T'
     $ U' ; E' T' F    b * c + d ; $       F → Id
     $ U' ; E' T' Id   b * c + d ; $
     $ U' ; E' T'      * c + d ; $         T' → * F T'
     $ U' ; E' T' F *  * c + d ; $
     $ U' ; E' T' F    c + d ; $           F → Id
     $ U' ; E' T' Id   c + d ; $
     $ U' ; E' T'      + d ; $             T' → ε
     $ U' ; E'         + d ; $             E' → + T E'
     $ U' ; E' T +     + d ; $
     $ U' ; E' T       d ; $               T → F T'
     $ U' ; E' T' F    d ; $               F → Id
     $ U' ; E' T' Id   d ; $
     $ U' ; E' T'      ; $                 T' → ε
     $ U' ; E'         ; $                 E' → ε
     $ U' ;            ; $
     $ U'              $                   U' → ε
     $                 $                   Accept!
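The trace above can be reproduced with a small table-driven loop. What follows is a sketch only: the TABLE dictionary is not the table from this page (which is not reproduced here) but one derived by hand from the grammar above, and the names TABLE, NONTERMINALS, kind and parse are made up for the illustration. The stack is a Python list with its top at the end, matching the way the trace is written.

```python
# Hand-derived LL(1) table for the assignment grammar (an assumption of this
# sketch). Keys are (nonterminal, token kind); an empty list is an ε-production.
TABLE = {
    ("P", "Id"): ["U"],
    ("U", "Id"): ["S", "U'"],
    ("U'", "Id"): ["U"],
    ("U'", "$"): [],
    ("S", "Id"): ["A", ";"],
    ("A", "Id"): ["Id", ":=", "E"],
    ("E", "Id"): ["T", "E'"],
    ("E", "("): ["T", "E'"],
    ("E'", "+"): ["+", "T", "E'"],
    ("E'", ";"): [],
    ("E'", ")"): [],
    ("T", "Id"): ["F", "T'"],
    ("T", "("): ["F", "T'"],
    ("T'", "*"): ["*", "F", "T'"],
    ("T'", "+"): [],
    ("T'", ";"): [],
    ("T'", ")"): [],
    ("F", "Id"): ["Id"],
    ("F", "("): ["(", "E", ")"],
}
NONTERMINALS = {nt for nt, _ in TABLE}
OPERATORS = {":=", "+", "*", "(", ")", ";", "$"}


def kind(token):
    """Operators stand for themselves; every other token counts as an Id."""
    return token if token in OPERATORS else "Id"


def parse(tokens):
    """Return the productions applied, in order, or raise SyntaxError."""
    tokens = tokens + ["$"]
    stack = ["$", "P"]  # top of the stack is the end of the list
    pos = 0
    applied = []
    while True:
        top = stack.pop()
        look = kind(tokens[pos])
        if top in NONTERMINALS:
            body = TABLE.get((top, look))
            if body is None:
                raise SyntaxError(f"no table entry for ({top}, {look})")
            applied.append((top, body))
            stack.extend(reversed(body))  # push so the first symbol is on top
        elif top == look:
            if top == "$":
                return applied  # accept: stack and input are both exhausted
            pos += 1  # terminal matched, advance the input
        else:
            raise SyntaxError(f"expected {top}, found {tokens[pos]}")


steps = parse("a := b * c + d ;".split())
for head, body in steps:
    print(head, "->", " ".join(body) or "ε")
```

The input is split on spaces here purely to keep the sketch short; a real scanner would produce the tokens instead.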
