
CS424

Compiler Construction

LR Parsing

Recall the following terms

Rightmost Derivation

Reduction

Handle

Shift-Reduce Parsing

We now look at the class of grammars that can be parsed using shift-reduce techniques.

LR Parsing

The basic structure of an LR parser looks as follows.

LR Parsing

The LR parsing algorithm is driven by a finite automaton (FA).
GOTO is the transition function of the FA.
ACTION is a function that tells the parser what to do given the current state of the FA and the next input symbol.
ACTION at any step is one of Shift, Reduce, Accept, or Error.

LR Parsing

ACTION and GOTO together form the parsing table of the LR parser.
There are multiple ways of constructing ACTION and GOTO, called:

SLR or Simple LR

Canonical LR, or simply LR

LALR or Lookahead LR

Each of these corresponds to a different construction of the finite automaton.

SLR Parsing

We first look at the SLR method of constructing parsing tables.
The corresponding automaton is called the LR(0) automaton.
The states of this automaton are sets of LR(0) items, which we describe next.

LR(0) Items

An item is a production plus a position in its right-hand side, which we mark with a dot.
The production A->XYZ yields the following items:

A->.XYZ

A->X.YZ

A->XY.Z

A->XYZ.
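
The enumeration of items can be sketched in a few lines of Python. This is only an illustration: the function name `lr0_items` and the string representation of items are assumptions made here, not something the slides define.

```python
def lr0_items(lhs, rhs):
    """Return every LR(0) item of the production lhs -> rhs as a string.

    rhs is a list of grammar symbols; the dot can sit at any of the
    len(rhs) + 1 positions, from before the first symbol to after the last.
    """
    items = []
    for dot in range(len(rhs) + 1):
        before = "".join(rhs[:dot])
        after = "".join(rhs[dot:])
        items.append(f"{lhs}->{before}.{after}")
    return items


for item in lr0_items("A", ["X", "Y", "Z"]):
    print(item)
```

A production with n symbols on its right-hand side therefore always yields n + 1 items.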

LR(0) Items

The dot in an item keeps track of where we are in a parse.
The part of the production before the dot corresponds to the part of the input we have already seen.
The part after the dot is what we expect to see next.

LR(0) Items

For example, take the item A->X.YZ
This means that the parser has already seen some input that can be derived from X.
Now it is waiting to see some string that can be derived from Y. Once that happens, it moves to the item A->XY.Z
Next it expects to see something derivable from Z, after which it moves to A->XYZ.
At that point it has seen a string derived from XYZ, which can be reduced to A.

Augmented Grammar

To construct the LR(0) automaton, we first augment the grammar by adding a new start symbol.
Given a CFG with start symbol S, we add a new start symbol S' and a new production

S'->S

This tells us when to stop parsing and accept: the parser accepts when it is about to reduce by S'->S.

Closure of Items

Suppose we have an item A->X.YZ and a production Y->ABC (here the A in ABC is simply the first symbol of the right-hand side).
The item tells us that we are waiting to see something derived from Y, and the production tells us that a string derived from Y can begin with something derived from A.
So we are also waiting to see something derived from A, and we add the item Y->.ABC to the closure of the original item.

Closure of Items

Given I, a set of items, compute Closure(I) as follows:
Add everything in I to Closure(I).
If A->x.By is in Closure(I) and B->z is a production, then add B->.z to Closure(I), if it is not already there.
Repeat the previous step until no more items can be added.

Closure Example

Consider the augmented expression grammar.

E'->E
E->E+T | T
T->T*F | F
F->(E) | id

Given the item set I = {E'->.E}, compute Closure(I).

Closure Example
Closure(I) contains the items

E'->.E

E->.E+T

E->.T

T->.T*F

T->.F

F->.(E)

F->.id
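
The closure procedure and the example above can be sketched in Python as follows. This is a minimal sketch under stated assumptions: items are represented as `(lhs, rhs, dot)` triples, the names `GRAMMAR`, `closure`, and `show` are chosen here for illustration, and the grammar is the augmented expression grammar as inferred from the items listed above.

```python
# Augmented expression grammar as (left side, right side) pairs.
# Assumption: this is the grammar the example refers to.
GRAMMAR = [
    ("E'", ("E",)),
    ("E", ("E", "+", "T")),
    ("E", ("T",)),
    ("T", ("T", "*", "F")),
    ("T", ("F",)),
    ("F", ("(", "E", ")")),
    ("F", ("id",)),
]


def closure(items, grammar=GRAMMAR):
    """Compute Closure(I) for a set of (lhs, rhs, dot) items."""
    result = set(items)          # step 1: everything in I is in Closure(I)
    worklist = list(items)
    while worklist:              # repeat until no more items can be added
        lhs, rhs, dot = worklist.pop()
        if dot == len(rhs):
            continue             # dot at the end: nothing expected next
        expected = rhs[dot]      # the symbol B in A->x.By
        for prod_lhs, prod_rhs in grammar:
            new_item = (prod_lhs, prod_rhs, 0)   # the item B->.z
            if prod_lhs == expected and new_item not in result:
                result.add(new_item)
                worklist.append(new_item)
    return frozenset(result)


def show(item):
    """Render an item in the A->x.y notation used in the slides."""
    lhs, rhs, dot = item
    return f"{lhs}->{''.join(rhs[:dot])}.{''.join(rhs[dot:])}"


start = closure({("E'", ("E",), 0)})
for text in sorted(show(item) for item in start):
    print(text)
```

The worklist is one of several ways to implement "repeat until nothing changes"; a simple loop that rescans the whole set would behave the same on a grammar this small.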

Closure

Closures of item sets will be the states in the finite automaton.
Next we see how to compute the transitions.
These correspond to the GOTO function.

GOTO Function

The intuitive idea is as follows.
Given an item A->X.YZ and grammar symbol Y, the next item is A->XY.Z plus everything in the closure of A->XY.Z
This means that we just saw a Y and now we expect to see something derived from Z.

GOTO Function

Formally, given a set of items I and a grammar symbol X, GOTO(I, X) is the closure of the set of all items [A->xX.y] such that [A->x.Xy] is in I.

GOTO Example

If I = {[E'->E.], [E->E.+T]}, then compute GOTO(I, +).
Only the second item has + immediately after the dot, so the result is the closure of {[E->E+.T]}.
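
The definition and the example can be checked with a short Python sketch. As before, the `(lhs, rhs, dot)` item representation and the names `GRAMMAR`, `closure`, `goto`, and `show` are assumptions made for illustration, and the grammar is the augmented expression grammar inferred from the slides.

```python
GRAMMAR = [
    ("E'", ("E",)),
    ("E", ("E", "+", "T")),
    ("E", ("T",)),
    ("T", ("T", "*", "F")),
    ("T", ("F",)),
    ("F", ("(", "E", ")")),
    ("F", ("id",)),
]


def closure(items, grammar=GRAMMAR):
    """Compute Closure(I) for a set of (lhs, rhs, dot) items."""
    result, worklist = set(items), list(items)
    while worklist:
        _, rhs, dot = worklist.pop()
        if dot < len(rhs):
            for prod_lhs, prod_rhs in grammar:
                new_item = (prod_lhs, prod_rhs, 0)
                if prod_lhs == rhs[dot] and new_item not in result:
                    result.add(new_item)
                    worklist.append(new_item)
    return frozenset(result)


def goto(items, symbol, grammar=GRAMMAR):
    """GOTO(I, X): move the dot over X wherever possible, then close."""
    moved = {
        (lhs, rhs, dot + 1)
        for lhs, rhs, dot in items
        if dot < len(rhs) and rhs[dot] == symbol
    }
    return closure(moved, grammar)


def show(item):
    lhs, rhs, dot = item
    return f"{lhs}->{''.join(rhs[:dot])}.{''.join(rhs[dot:])}"


I = {("E'", ("E",), 1), ("E", ("E", "+", "T"), 1)}
for text in sorted(show(item) for item in goto(I, "+")):
    print(text)
```

When no item in I has the given symbol after its dot, `goto` returns the empty set, which corresponds to a missing transition in the automaton.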

LR(0) Automaton For Expressions

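The start state is the closure of [E'->.E]
Together with Closure and GOTO, we can now build the automaton: starting from the start state, apply GOTO to every state and every grammar symbol until no new states appear.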
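
One way to carry out this construction is sketched below. It is a hedged illustration, not the slides' own code: `build_automaton` and the data layout are assumptions, and the state numbers it assigns depend on the order in which symbols are tried, so they need not match the numbering used in any figure.

```python
GRAMMAR = [
    ("E'", ("E",)),
    ("E", ("E", "+", "T")),
    ("E", ("T",)),
    ("T", ("T", "*", "F")),
    ("T", ("F",)),
    ("F", ("(", "E", ")")),
    ("F", ("id",)),
]


def closure(items, grammar):
    result, worklist = set(items), list(items)
    while worklist:
        _, rhs, dot = worklist.pop()
        if dot < len(rhs):
            for prod_lhs, prod_rhs in grammar:
                new_item = (prod_lhs, prod_rhs, 0)
                if prod_lhs == rhs[dot] and new_item not in result:
                    result.add(new_item)
                    worklist.append(new_item)
    return frozenset(result)


def goto(items, symbol, grammar):
    moved = {(l, r, d + 1) for l, r, d in items if d < len(r) and r[d] == symbol}
    return closure(moved, grammar)


def build_automaton(grammar=GRAMMAR):
    """Return (states, transitions) of the LR(0) automaton.

    states is a list of item sets; transitions maps (state index, symbol)
    to a state index. The first production must be the augmented one.
    """
    symbols = sorted({sym for _, rhs in grammar for sym in rhs})
    start = closure({(grammar[0][0], grammar[0][1], 0)}, grammar)
    states, index, transitions = [start], {start: 0}, {}
    current = 0
    while current < len(states):          # states grows while we iterate
        for sym in symbols:
            target = goto(states[current], sym, grammar)
            if not target:
                continue                  # no transition on this symbol
            if target not in index:
                index[target] = len(states)
                states.append(target)
            transitions[(current, sym)] = index[target]
        current += 1
    return states, transitions


states, transitions = build_automaton()
print(len(states), "states,", len(transitions), "transitions")
```

Because each state is a frozen set of items, two GOTO results with the same items are recognized as the same state, which is what keeps the construction finite.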

Using the LR(0) Automaton for Parsing

Note that every state except the start state corresponds to a unique grammar symbol: all transitions into a given state are labeled with the same symbol.
So we can use states to stand for grammar symbols.
State 0 is the start state of the automaton.

Using the LR(0) Automaton for Parsing

Parsing with the LR(0) automaton is done with a stack, which holds states.
The top of the stack is the current state.
If we are currently in state j with input symbol a and GOTO(j, a) is k, then we shift a: we push state k and advance to the next input symbol.
Otherwise we reduce using the item in state j that has its dot at the end.
To reduce by a production A->x, we pop one state for each symbol of the right-hand side x and then push GOTO(t, A), where t is the state now on top of the stack.

Parsing Example

SLR Parsing Table

Rather than use the LR(0) automaton directly, as just described, we code the information into a parsing table.
The parsing table has two parts, ACTION and GOTO.
GOTO gives the next state.
ACTION tells us whether to shift or to reduce using some production.
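
A table-driven parser for the expression grammar can be sketched as below. Several things here are assumptions rather than content from these slides: the production numbering (1 to 6), the state numbering (0 to 11), and the table entries follow the common textbook presentation of the SLR table for this grammar, and the names `PRODUCTIONS`, `ACTION`, `GOTO`, and `parse` are chosen for illustration.

```python
# Assumed numbering: 1 E->E+T, 2 E->T, 3 T->T*F, 4 T->F, 5 F->(E), 6 F->id.
# Each entry is (left side, length of right side).
PRODUCTIONS = {1: ("E", 3), 2: ("E", 1), 3: ("T", 3),
               4: ("T", 1), 5: ("F", 3), 6: ("F", 1)}

# "sN" = shift and go to state N, "rN" = reduce by production N.
ACTION = {
    0: {"id": "s5", "(": "s4"},
    1: {"+": "s6", "$": "acc"},
    2: {"+": "r2", "*": "s7", ")": "r2", "$": "r2"},
    3: {"+": "r4", "*": "r4", ")": "r4", "$": "r4"},
    4: {"id": "s5", "(": "s4"},
    5: {"+": "r6", "*": "r6", ")": "r6", "$": "r6"},
    6: {"id": "s5", "(": "s4"},
    7: {"id": "s5", "(": "s4"},
    8: {"+": "s6", ")": "s11"},
    9: {"+": "r1", "*": "s7", ")": "r1", "$": "r1"},
    10: {"+": "r3", "*": "r3", ")": "r3", "$": "r3"},
    11: {"+": "r5", "*": "r5", ")": "r5", "$": "r5"},
}

# GOTO on nonterminals only; terminal moves are folded into the shifts.
GOTO = {(0, "E"): 1, (0, "T"): 2, (0, "F"): 3,
        (4, "E"): 8, (4, "T"): 2, (4, "F"): 3,
        (6, "T"): 9, (6, "F"): 3, (7, "F"): 10}


def parse(tokens):
    """Return the production numbers used, in order, or raise SyntaxError."""
    tokens = list(tokens) + ["$"]      # "$" marks the end of the input
    stack, pos, reductions = [0], 0, []
    while True:
        action = ACTION[stack[-1]].get(tokens[pos])
        if action is None:             # blank table entry means Error
            raise SyntaxError(f"unexpected {tokens[pos]!r} at position {pos}")
        if action == "acc":
            return reductions
        number = int(action[1:])
        if action[0] == "s":           # Shift: push the new state
            stack.append(number)
            pos += 1
        else:                          # Reduce: pop the right side, then GOTO
            lhs, length = PRODUCTIONS[number]
            del stack[len(stack) - length:]
            stack.append(GOTO[(stack[-1], lhs)])
            reductions.append(number)


print(parse(["id", "*", "id", "+", "id"]))
```

The returned list is the sequence of reductions, which is a rightmost derivation of the input in reverse.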

SLR Parsing Table

Label the productions in the expression grammar as follows.

SLR Parsing Table

SLR Parsing Example

LR Parsing Algorithm

SLR Parsing Table Construction

SLR Parsing Table Construction

SLR Parsing

Any grammar for which the previous algorithm produces a parsing action conflict (shift/reduce or reduce/reduce) is not SLR.
No ambiguous grammar is SLR.
There are also unambiguous grammars that cannot be parsed by SLR techniques.
