You are on page 1of 94

Chapter 5

Compilers

Source
Macro Expanded Compiler or obj
Code
(with macro)
Processor Code Assembler

1
Terminology
 Statement (敘述)
 Declaration, assignment containing expression (運算式)
 Grammar (文法)
 A set of rules specify the form of legal statements
 Syntax (語法) vs. Semantics (語意)
 Example: assuming I, J, K:integer and X,Y:float
 I:=J+K vs. I:= X+Y
 Compilation (編譯)
 Matching statements written by the programmer to
structures defined by the grammar and generating the
appropriate object code.

2
Basic Compiler
 Lexical analysis - scanner
 Scanning the source statement, recognizing and
classifying the various tokens
 Syntactic analysis - parser
 Recognizing the statement as some language construct.
 Construct a parser tree (syntax tree)
 Code generation – code generator
 Generate assembly language codes
 Generate machine codes (Object codes)

3
PROGRAM
Scanner SUM
STATS
:=
VAR
0
SUM
;
,
SUMSQ
SUMSQ
:=
,
I

READ
(
VALUE
)
;

4
Parser
 Grammar: a set of rules
 Backus-Naur Form (BNF)
 Ex: Figure 5.2

 Terminology
 Define symbol ::=
 Nonterminal symbols <>
 Alternative symbols |
 Terminal symbols

5
Simplified Pascal Grammar

6
Parser
 READ(VALUE)  <read> ::= READ (<id-list>)
 <id-list>::= id | <id-list>,id

 <assign>::= id := <exp>
 SUM := 0  <exp> ::= <term> |
<exp>+<term> |
 SUM := SUM + <exp>-<term>
VALUE  <term>::=<factor> |
<term>*<factor> | <term> DIV
<factor>
 MEAN := SUM DIV
100  <factor>::= id | int | (<exp>)

7
Syntax Tree

8
Syntax Tree for Program 5.1

9
10
Lexical Analysis
 Function
 Scanning the program to be compiled and recognizing
the tokens that make up the source statements.

 Tokens
 Tokens can be keywords, operators, identifiers, integers,
floating-point numbers, character strings, etc.
 Each token is usually represented by some fixed-length
code, such as an integer, rather than as a variable-length
character string (see Figure 5.5)
 Token type, Token specifier (value) (see Figure 5.6)

11
Scanner Output
 Token specifier
 Identifier name, integer value, (type)
 Token coding scheme
 Figure 5.5

12
13
Token Recognizer
 By grammar
 <ident> ::= <letter> | <ident> <letter> | <ident> <digit>
 <letter>::= A | B | C | D | … | Z
 <digit> ::= 0 | 1 | 2 | 3 | … | 9
 By scanner - modeling as finite automata
 Figure 5.8 (a)

14
Recognizing Identifier
 Identifiers allowing
underscore (_)
 Figure 5.8 (b)

15
Recognizing Identifier

16
Recognizing Integer
 Allowing leading zeroes
 Figure 5.8 (c)
 Disallowing leading zeroes
 Figure 5.8 (d)

17
Scanner - Implementation
 Figure 5.10 (a)
 Algorithmic code for identifier
recognition
 Tabular representation of
finite automaton for Figure
5.9.
State A-Z 0-9 ;,+-*() : = .
1 2 4 5 6
2 2 2 3
3
4 4
5
6 7
7

18
Syntactic Analysis
 Recognize source statements as language
constructs or build the parse tree for the
statements.
 Bottom-up
 Operator-precedence parsing
 Shift-reduce parsing
 LR(0) parsing
 LR(1) parsing
 SLR(1) parsing
 LALR(1) parsing
 Top-down
 Recursive-descent parsing
 LL(1) parsing
19
Operator-Precedence Parsing
 Operator
 Any terminal symbol (or any token)
 Precedence
 * »+
 +«*
 Operator-precedence
 Precedence relations between operators

20
Precedence Matrix for the Fig. 5.2

21
Operator-Precedence Parse Example
BEGIN READ ( VALUE ) ;

22
Operator-Precedence Parse Example

23
Operator-Precedence Parse Example

24
Operator-Precedence Parse Example

25
Operator-Precedence Parsing
 Bottom-up parsing
 Generating precedence matrix
 Aho et al. (1988)

26
Shift-reduce Parsing with Stack
 Figure 5.14

27
Recursive-Descent Parsing
 Each nonterminal symbol in the grammar is
associated with a procedure.
 <read> ::= READ (<id-list>)
 <stmt> ::= <assign> | <read> | <write> | <for>

 Left recursion
 <dec-list> ::= <dec> | <dec-list>;<dec>
 Modification
 <dec-list> ::= <dec> {;<dec>}

28
Recursive-Descent Parsing (cont’d.)

29
Recursive-Descent Parsing of READ

30
Recursive-Descent Parsing of IDLIST

31
Recursive-Descent Parsing (cont’d.)

32
Recursive-Descent Parsing of ASSIGN

33
Recursive-Descent Parsing of EXP

34
Recursive-Descent Parsing of TERM

35
Recursive-Descent Parsing of FACTOR

36
Recursive-Descent Parsing (cont’d.)

37
Recursive-Descent Parsing (cont’d.)

38
Recursive-Descent Parsing (cont’d.)

39
Code Generation
Add S(id) to LIST and LISTCOUNT++

40
Code Generation (cont’d.)

41
Code Generation (cont’d.)

42
Code Generation (cont’d.)

Save to Register A 1. <term> already in Register A, or


2
2. <factor> already in Register A, or
3. <term>2 and <factor>both are not in Register A

43
Code Generation (cont’d.)

Node Specifier S(<term>1)


save to rA
44
45
Code Generation (cont’d.)

46
Code Generation (cont’d.) 9 and 10

S(<exp>) := S(<term>)
S(<exp>) <> rA =S(MEAN)

47
Code Generation (cont’d.) 10

48
Code Generation (cont’d.) 10

S(<exp>2) <> rA
Call GETA(<exp>2)
Generate [SUB S(MEAN)=T2]
S(<exp>1) := rA
REGA := <exp>1

49
Code Generation (cont’d.) 11

S(<term>) := S(SUMSQ)
S(<term>) <> rA
S(<term>) := S(MEAN)

50
Code Generation (cont’d.) 11

S(<term>2) <> rA
S(<factor>) <> rA
Call GETA(MEAN)
Generate [MUL MEAN]
S(MEAN) := rA
REGA:= MEAN

51
Code Generation (cont’d.) 11

S(<term>2) <> rA
Call GETA(SUMSQ)
Generate [DIV #100]
S(<term>1) := rA
REGA := <term>1

52
Code Generation (cont’d.) 12

S(<factor>) := S(SUMSQ)
S(<factor>) := S(MEAN)
S(<factor>) := S(MEAN)

S(<factor>) := S(#100)

53
Code Generation (cont’d.) REGA <> NULL = SUMSQ
S(MEAN) <> rA = S(SUMSQ)
Generate [STA T1]
S(SUMSQ) = T1
Generate [LDA MEAN]
S(MEAN) := rA
REGA := MEAN
REGA = NULL
Generate [LDA SUMSQ]
S(SUMSQ) := rA
REGA := SUMSQ
REGA <> NULL = MEAN
S(SUMSQ) <> rA = S(MEAN)
Generate [STA T2]
S(MEAN) = T2
Generate [LDA S(SUMSQ)=T1]
S(SUMSQ) := rA
REGA := SUMSQ 54
Code Generation (cont’d.)

55
Code Generation (cont’d.) 1, 2, and 3

56
Code Generation (cont’d.) 4, 5, and 7

57
8, 14, 15

58
Code Generation (cont’d.) 16 and 17

59
Code Generation (cont’d.)

60
Code Generation (cont’d.)

61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
3*(6-1)=3*5=15

76
77
78
79
80
81
82
83
Machine-Independent Compiler Features

84
Machine-Independent Compiler Features
 Storage Allocation
 Static Allocation: cannot be used for recursive call
 Dynamic Allocation: Activation Record

85
86
87
88
89
90
91
92
P-Code Compliers

93
Compiler-Compilers

94

You might also like