9. Parsing
2/24/2018
John Roberts
• Grammars
• X Language Grammar
Parsing
Tokens (stream of lexical units) → Parsing (grammars, recursive descent processing) → Abstract Syntax Tree (AST)
Overview
• Parsing
• Grammars
• X Language Grammar
• Wiki link
Grammars
• A grammar G consists of
• G = ( N, T, P, S )
Nonterminals
• G = ( N, T, P, S )
• blocks
• programs
• expressions
• etc.
Terminals
• G = ( N, T, P, S )
Production Rules
• if E then BLOCK else BLOCK is the rule’s right hand side (RHS)
Start symbol
• G = ( N, T, P, S )
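As a rough illustration of the tuple G = ( N, T, P, S ), the sketch below models a grammar as plain Java data and checks that every production only uses symbols from N and T. The class names (`Production`, `Grammar`, `GrammarDemo`) are hypothetical and are not part of the course compiler. Using S as the left hand side of the if rule is an assumption; the slides only give its right hand side.

```java
import java.util.List;
import java.util.Set;

// Hypothetical illustration classes; not part of the course compiler.
final class Production {
    final String lhs;        // left hand side: a single nonterminal
    final List<String> rhs;  // right hand side: sequence of terminals and nonterminals

    Production(String lhs, List<String> rhs) {
        this.lhs = lhs;
        this.rhs = rhs;
    }
}

final class Grammar {
    final Set<String> nonterminals;       // N
    final Set<String> terminals;          // T
    final List<Production> productions;   // P
    final String start;                   // S

    Grammar(Set<String> n, Set<String> t, List<Production> p, String s) {
        this.nonterminals = n;
        this.terminals = t;
        this.productions = p;
        this.start = s;
    }

    // The start symbol and every LHS must be nonterminals, and every RHS
    // symbol must belong to N or T.
    boolean isWellFormed() {
        if (!nonterminals.contains(start)) return false;
        for (Production p : productions) {
            if (!nonterminals.contains(p.lhs)) return false;
            for (String sym : p.rhs) {
                if (!nonterminals.contains(sym) && !terminals.contains(sym)) return false;
            }
        }
        return true;
    }
}

class GrammarDemo {
    // Grammar fragment containing only the if rule; the LHS "S" is assumed.
    static Grammar ifFragment() {
        return new Grammar(
            Set.of("S", "E", "BLOCK"),
            Set.of("if", "then", "else"),
            List.of(new Production("S",
                List.of("if", "E", "then", "BLOCK", "else", "BLOCK"))),
            "S");
    }

    public static void main(String[] args) {
        System.out.println(ifFragment().isWellFormed());
    }
}
```

Keeping N and T as separate sets makes the distinction between the two kinds of symbols explicit: only members of N may appear on a left hand side.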
D = Declarations
TYPE → ‘int’
TYPE → ‘boolean’
Production              Category
S → ‘return’ E          return
S → BLOCK
Grammar for X
SE = Simple Expression
Production              Category
E → SE
E → SE ‘==’ SE          =
E → SE ‘!=’ SE          !=
E → SE ‘<’ SE           <
E → SE ‘<=’ SE          <=
Grammar for X
Production              Category
SE → T
SE → SE ‘+’ T           +
SE → SE ‘-’ T           -
SE → SE ‘|’ T           |
Grammar for X
T = Term
Production              Category
T → F
T → T ‘*’ F             *
T → T ‘/’ F             /
T → T ‘&’ F             &
Grammar for X
F = Factor
Production              Category
F → ‘(’ E ‘)’
F → NAME
F → <int>
NAME → <id>
Parser Notes
Building ASTs for Grammar
• Assignment 3 will add these into the grammar, and provide a textual
(next slide) and graphical (following slide) display of the generated
ASTs
ASTs built from Source
Source program:
    program { int i int j
        i = i + j + 7
        j = write(i)
    }
Textual AST display (excerpt):
    3: IntType
    4: Id: i
    8: Decl
    6: IntType
    7: Id: j
    10: Assign
    9: Id: i
    12: AddOp: +
    11: Id: i
    13: Id: j
    15: Int: 7
    17: Assign
    16: Id: j
    19: Call
    18: Id: write
    20: Id: i
[Figure: graphical AST for the same program. Node labels: program, block, decl, decl, assign, assign, int i, int j, +, 7, call, write, i, j]
Build an AST
Time permitting
• Answer: https://gist.github.com/jrob8577/7401c004e6c78040a15e9944e2536db2
Compiler Packages
AST - Methods (abbreviated)
    // return the ith kid of this node (kids are indexed from 1);
    // returns null when i is out of range
    public AST getKid(int i) {
        if ((i <= 0) || (i > kidCount())) {
            return null;
        }
        return kids.get(i - 1);
    }
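To make the 1-based indexing concrete, here is a minimal sketch of a tree node with the same `getKid` behavior, used to build the subtree for `i = i + j`. `SketchAST` is a hypothetical, simplified stand-in for the compiler's AST class; its label field and the chaining return value of `addKid` are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-in for the compiler's AST class.
class SketchAST {
    private final String label;
    private final List<SketchAST> kids = new ArrayList<>();

    SketchAST(String label) {
        this.label = label;
    }

    String getLabel() {
        return label;
    }

    int kidCount() {
        return kids.size();
    }

    // Returning this (to allow chaining) is an assumption of this sketch.
    SketchAST addKid(SketchAST kid) {
        kids.add(kid);
        return this;
    }

    // Kids are indexed from 1; an out-of-range index yields null.
    SketchAST getKid(int i) {
        if (i <= 0 || i > kidCount()) {
            return null;
        }
        return kids.get(i - 1);
    }
}

class SketchASTDemo {
    // Builds the subtree for the statement: i = i + j
    static SketchAST buildAssign() {
        SketchAST add = new SketchAST("AddOp: +")
            .addKid(new SketchAST("Id: i"))
            .addKid(new SketchAST("Id: j"));
        return new SketchAST("Assign")
            .addKid(new SketchAST("Id: i"))
            .addKid(add);
    }

    public static void main(String[] args) {
        SketchAST assign = buildAssign();
        System.out.println(assign.getKid(1).getLabel()); // first kid: assignment target
        System.out.println(assign.getKid(2).getLabel()); // second kid: the expression
        System.out.println(assign.getKid(0));            // index 0 is out of range
    }
}
```

Because `getKid` returns null instead of throwing, callers that walk the tree need to check the result before dereferencing it.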
• IdTree
• RelOpTree
Parser
• Before we get started - what are some strategies for figuring out how the Parser works?
Read the code, trace execution, step through with a debugger. We’re going to start by reading the code through a trace of a simple program.
program { int i
int f( int j ) { int i return j + 5 }
i = f( 7 )
}
Parser - members
We’re going to see currentToken in the next few slides, as we scan…
Parser - Constructor
Instantiate the Lexer given the sourceProgram input, and scan the first token (recall currentToken is a private member of the Parser class).
The constructor’s job is to place the class in a valid, default state; for the Parser, this means with the first token set.
    /**
     * Construct a new Parser;
     *
     * @param sourceProgram - source file name
     * @exception Exception - thrown for any problems at startup (e.g. I/O)
     */
    public Parser(String sourceProgram) throws Exception {
        try {
            lex = new Lexer(sourceProgram);
            scan();
        } catch (Exception e) {
            System.out.println("********exception*******" + e.toString());
            throw e;
        }
    }
Parser - execute
    /**
     * Execute the parse command
     *
     * @return the AST for the source program
     * @exception Exception - pass on any type of exception raised
     */
    public AST execute() throws Exception {
        try {
            return rProgram();
        } catch (SyntaxError e) {
            e.print();
            throw e;
        }
    }
Parser - rProgram
We need to check out expect next, but first, why are we invoking rBlock here? (Because of the grammar; more on this in a minute.)
    /**
     * <pre>
     * Program -> 'program' block ==> program
     * </pre>
     *
     * @return the program tree
     * @exception SyntaxError - thrown for any syntax error
     */
    public AST rProgram() throws SyntaxError {
        // note that rProgram actually returns a ProgramTree; we use the
        // principle of substitutability to indicate it returns an AST
        AST t = new ProgramTree();
        expect(Tokens.Program);
        t.addKid(rBlock());
        return t;
    }
Parser - So far
• Check that the next token is the program token and scan
past it, reporting a SyntaxError if the scanned token
doesn’t match the program token
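The check-then-scan step described above can be sketched as follows. This is a simplified illustration and not the course's Parser. `Kind`, `ExpectError`, and `ExpectSketch` are hypothetical names, and a list of token kinds stands in for the Lexer. Only the names `expect`, `scan`, `isNextTok`, and `currentToken` echo the slides, and their bodies here are assumptions.

```java
import java.util.Iterator;
import java.util.List;

// Hypothetical token kinds standing in for the compiler's Tokens enum.
enum Kind { PROGRAM, LEFT_BRACE, RIGHT_BRACE, EOF }

// Hypothetical stand-in for SyntaxError.
class ExpectError extends Exception {
    final Kind found;
    final Kind expected;

    ExpectError(Kind found, Kind expected) {
        super("Expected: " + expected + ", found: " + found);
        this.found = found;
        this.expected = expected;
    }
}

class ExpectSketch {
    private final Iterator<Kind> tokens; // stands in for the Lexer
    private Kind currentToken;

    ExpectSketch(List<Kind> input) {
        this.tokens = input.iterator();
        scan(); // start in a valid state: the first token is already loaded
    }

    // Advance to the next token, or EOF when the input is exhausted.
    private void scan() {
        currentToken = tokens.hasNext() ? tokens.next() : Kind.EOF;
    }

    boolean isNextTok(Kind kind) {
        return currentToken == kind;
    }

    // Check that the current token has the expected kind, then scan past it.
    void expect(Kind kind) throws ExpectError {
        if (!isNextTok(kind)) {
            throw new ExpectError(currentToken, kind);
        }
        scan();
    }

    public static void main(String[] args) throws ExpectError {
        ExpectSketch p = new ExpectSketch(
            List.of(Kind.PROGRAM, Kind.LEFT_BRACE, Kind.RIGHT_BRACE));
        p.expect(Kind.PROGRAM);                           // matches, scans past 'program'
        System.out.println(p.isNextTok(Kind.LEFT_BRACE)); // the brace is now current
    }
}
```

Keeping the comparison and the advance together in one method means each grammar procedure can consume a required terminal with a single call.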
Operation of each Parser method
Operation of each Parser method, continued
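    // Class header and fields reconstructed from the constructor below;
    // extending Exception is an assumption based on how SyntaxError is
    // thrown and caught elsewhere in the Parser.
    class SyntaxError extends Exception {
        private Token tokenFound;
        private Tokens kindExpected;

        /**
         * record the syntax error just encountered
         *
         * @param tokenFound is the token just found by the parser
         * @param kindExpected is the token we expected to find based on the
         * current context
         */
        public SyntaxError(Token tokenFound, Tokens kindExpected) {
            this.tokenFound = tokenFound;
            this.kindExpected = kindExpected;
        }

        void print() {
            System.out.println("Expected: " + kindExpected);
        }
    }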
Parser - rBlock
    // true when the current token can begin a statement
    boolean startingStatement() {
        return isNextTok(Tokens.If) || isNextTok(Tokens.While)
            || isNextTok(Tokens.Return) || isNextTok(Tokens.LeftBrace)
            || isNextTok(Tokens.Identifier);
    }
Parser - rDecl
• D → TYPE NAME
• SE → SE ‘+’ T
• SE → SE ‘-‘ T
• SE → SE ‘|’ T
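The SE productions above are left recursive: a recursive descent method for SE that began by calling itself would never consume a token and would recurse forever. A common way to handle this is to parse one T and then loop while an operator follows, folding each new operand into the tree built so far, which also gives left associativity. The sketch below shows the idea on a list of string tokens. `LeftAssocSketch` and its methods are hypothetical; they are not the course's rExpr/rTerm code. The parenthesized strings stand in for AST nodes, and the ‘(’ E ‘)’ factor is omitted for brevity.

```java
import java.util.List;

// Hypothetical sketch; strings with parentheses stand in for AST nodes.
class LeftAssocSketch {
    private final List<String> tokens;
    private int pos = 0;

    LeftAssocSketch(List<String> tokens) {
        this.tokens = tokens;
    }

    // Current token, or "" at end of input.
    private String peek() {
        return pos < tokens.size() ? tokens.get(pos) : "";
    }

    // SE → SE ('+' | '-' | '|') T, rewritten as: T followed by zero or more (op T)
    String simpleExpr() {
        String tree = term();
        while (peek().equals("+") || peek().equals("-") || peek().equals("|")) {
            String op = tokens.get(pos++);
            // the tree built so far becomes the LEFT operand of the new node
            tree = "(" + tree + " " + op + " " + term() + ")";
        }
        return tree;
    }

    // T → T ('*' | '/' | '&') F, rewritten the same way
    String term() {
        String tree = factor();
        while (peek().equals("*") || peek().equals("/") || peek().equals("&")) {
            String op = tokens.get(pos++);
            tree = "(" + tree + " " + op + " " + factor() + ")";
        }
        return tree;
    }

    // F → NAME | <int>; any single token is accepted as a leaf in this sketch
    String factor() {
        if (pos >= tokens.size()) {
            throw new IllegalStateException("unexpected end of input");
        }
        return tokens.get(pos++);
    }

    static String parse(String... input) {
        return new LeftAssocSketch(List.of(input)).simpleExpr();
    }

    public static void main(String[] args) {
        System.out.println(parse("i", "-", "j", "-", "7")); // ((i - j) - 7)
        System.out.println(parse("i", "+", "j", "*", "7")); // (i + (j * 7))
    }
}
```

The first call shows left associativity (i - j is grouped first). The second shows that putting ‘*’ one level lower in the grammar makes it bind tighter than ‘+’.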
Associativity (Wiki)
• rExpr
• rTerm
• getMultOperTree
• getRelationTree
• rFactor
• rStatement
Parser
program { int i
int f( int j ) { int i return j + 5 }
i = f( 7 )
}
• Result: https://gist.github.com/jrob8577/d4581f99d5006c944ecc88238f391276