
CS423 Lecture Notes

Chapter 1. Introduction

I. Basics and History of PL (Handout)

II. Compilation Process


a) Phases of Compilation Process
b) Supporting System
c) Pass

III. Writing A Compiler


a) Choice of a language
b) Bootstrapping
c) Writing Retargetable Compilers

I. Basics
a) Terminology

Compiler: a program that translates a program written in a source language into equivalent code in an object language.

Source Language: the programming language that the compiler accepts as its input (e.g., Pascal, C, C++, ...)

Object Language: the particular machine (or assembly) language that the compiler generates as its output (the object code).

Target Machine: the computer on which the object code is to be run

Cross Compiler: a compiler that generates object code for a machine that is different from the machine on which the compiler runs

e.g., compiling a C source on a PC while the output (object code) runs on a different machine, such as a Unix workstation or a VAX

Run-time library: a collection of modules (functions) for common computations and services, such as sqrt(x), sin(x), scanf(), etc.

Linker: combines the object code with the run-time library modules it uses and generates executable code

Source --> [Compiler] --> Object Code --> [Linker] --> Executable
                                             ^
                                             |
                                     Run-Time Library

b) Interpreter vs. Compiler

Interpreter: executes the source program directly, without creating object code

E.g., Basic, Lisp => the source itself is executed

Adv: Immediate response

Run-time allocation of resources is possible (e.g., changing the size of a matrix)

Disadv: Slower for some programming constructs (e.g., loops, where a pure interpreter re-analyzes each statement every time it is executed)

II. Compilation Process
a) Phases of compilation Process:
i) Lexical Analysis
ii) Syntax Analysis
iii) Intermediate Code Generation
iv) Optimization
v) Object Code Generation

i) Lexical Analysis (LA): the phase where the source code is broken up into meaningful units (tokens)

Ex: source code: if (x == y) a = b-5;

LA: if, (, x, ==, y, ), a, =, b, -, 5, ;

Note: The meaningful units may differ from language to language

In addition, the lexical analyzer:

- removes excess white space (blanks, tabs, etc.)

- skips comments

- converts case if needed
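The lexical analysis example above can be sketched in a few lines of Python. This is only an illustration: the token categories (KEYWORD, IDENT, NUMBER, OP), the keyword list, and the C-style /* ... */ comment syntax are assumptions made for the sketch, not a token set defined in these notes.

```python
import re

# Hypothetical token categories for a small C-like language (assumed for
# this sketch). Order matters: comments before "/", keywords before identifiers.
TOKEN_SPEC = [
    ("COMMENT", r"/\*.*?\*/"),              # comments are skipped
    ("SPACE",   r"\s+"),                    # white space is skipped
    ("NUMBER",  r"\d+"),
    ("KEYWORD", r"\b(?:if|else|while)\b"),
    ("IDENT",   r"[A-Za-z_]\w*"),
    ("OP",      r"==|:=|[-+*/=();]"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC),
                    re.DOTALL)

def tokenize(source):
    """Break the source text into (category, text) tokens."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at position {pos}")
        if match.lastgroup not in ("COMMENT", "SPACE"):
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens

for token in tokenize("if (x == y) a = b-5;  /* example from the notes */"):
    print(token)
```

Case conversion is left out here; for a case-insensitive language it could be applied to identifiers and keywords before they are stored.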

ii) Syntax Analysis (SA): the phase where the structure of the program and of its individual constructs is determined, and a parse tree is constructed as the output.

Ex. (determination) functions, procedures, declarations, loops, conditions, etc.

Ex. (construction) according to the grammar

<Statement> -> <identifier> := <expression> ;

<expression> -> <expression> + <expression>

<expression> -> <expression> - <expression>

<expression> -> <identifier>

Ex. source: a := b + c;

<identifier>  :=          <expression>           ;
     a                         |
                <expression>   +   <expression>
                     |                  |
                <identifier>       <identifier>
                     b                  c
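As a rough sketch of how a parser could build this tree, the following Python function handles the small assignment grammar above by recursive descent. Two assumptions are made for the sketch: the input is already a list of token strings, and the left-recursive <expression> rules are handled with a loop that groups operators left to right (the grammar as written does not fix the grouping). The nested-tuple tree format is hypothetical.

```python
def parse_statement(tokens):
    """Parse  <identifier> := <expression> ;  and return a nested-tuple parse tree."""
    pos = 0

    def current():
        return tokens[pos] if pos < len(tokens) else "end of input"

    def expect(symbol):
        nonlocal pos
        if current() != symbol:
            raise SyntaxError(f"expected {symbol!r}, found {current()!r}")
        pos += 1

    def identifier():
        nonlocal pos
        if pos >= len(tokens) or not tokens[pos].isidentifier():
            raise SyntaxError(f"expected an identifier, found {current()!r}")
        name = tokens[pos]
        pos += 1
        return ("identifier", name)

    def expression():
        nonlocal pos
        left = identifier()
        # <expression> -> <expression> (+|-) <expression>, grouped left to right
        while pos < len(tokens) and tokens[pos] in ("+", "-"):
            op = tokens[pos]
            pos += 1
            right = identifier()
            left = ("expression", left, op, right)
        return left

    target = identifier()
    expect(":=")
    value = expression()
    expect(";")
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {current()!r} after ';'")
    return ("statement", target, ":=", value)

print(parse_statement(["a", ":=", "b", "+", "c", ";"]))
```

A statement such as a := b + ; is rejected with a SyntaxError, which is the point where the error handler would take over.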

iii) Intermediate Code Generation (ICG): the phase where an internal representation of the source code is created that reflects the information uncovered during the syntax analysis phase.

Internal representations: Abstract Syntax Tree (AST), Three-Address Code (3AC), Postfix notation

Ex. a := b + c;

AST:
      :=
     /  \
    a    +
        / \
       b   c

3AC: t1 = b + c;
     a = t1;

Postfix: a b c + :=
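The relation between the three representations can be illustrated with a short Python sketch that starts from the AST and derives the other two forms. The tuple encoding of the AST, (operator, left, right), and the function names are assumptions made for this example.

```python
def to_postfix(node):
    """Post-order walk: operands first, then the operator."""
    if isinstance(node, str):
        return [node]
    op, left, right = node
    return to_postfix(left) + to_postfix(right) + [op]

def to_three_address(assignment):
    """Emit one 3AC instruction per operator, using temporaries t1, t2, ..."""
    code = []
    counter = 0

    def emit(node):
        nonlocal counter
        if isinstance(node, str):
            return node                      # a plain name needs no instruction
        op, left, right = node
        left_name = emit(left)
        right_name = emit(right)
        counter += 1
        temp = f"t{counter}"
        code.append(f"{temp} = {left_name} {op} {right_name}")
        return temp

    op, target, value = assignment
    if op != ":=":
        raise ValueError("expected an assignment node")
    code.append(f"{target} = {emit(value)}")
    return code

ast = (":=", "a", ("+", "b", "c"))           # a := b + c
print(" ".join(to_postfix(ast)))
print("\n".join(to_three_address(ast)))
```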

iv) Optimization: the process of identifying and removing unnecessary or redundant operations from the intermediate code.

Ex. Source: x = a * b + c

3AC:  t1 = a * b
      t2 = t1 + c
      x = t2
=>
      t1 = a * b
      x = t1 + c
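The transformation in this example, folding a temporary that is only copied once, could be sketched as follows. The instruction format (dest, left, op, right), with op set to None for a plain copy, and the convention that temporaries are named t1, t2, ... are assumptions of this sketch. Real optimizers apply many more transformations than this one.

```python
def is_temp(name):
    # Assumed naming convention: compiler temporaries look like t1, t2, ...
    return name.startswith("t") and name[1:].isdigit()

def used_later(name, rest):
    return any(name in (left, right) for _, left, _, right in rest)

def remove_redundant_copies(code):
    """Fold `t = a op b` followed by `x = t` into `x = a op b`
    when the temporary t is not used by any later instruction."""
    result = []
    i = 0
    while i < len(code):
        dest, left, op, right = code[i]
        nxt = code[i + 1] if i + 1 < len(code) else None
        if (nxt is not None
                and nxt[2] is None           # next instruction is a plain copy
                and nxt[1] == dest           # ... of the value just computed
                and is_temp(dest)
                and not used_later(dest, code[i + 2:])):
            result.append((nxt[0], left, op, right))
            i += 2
        else:
            result.append(code[i])
            i += 1
    return result

three_address = [
    ("t1", "a", "*", "b"),
    ("t2", "t1", "+", "c"),
    ("x", "t2", None, None),
]
for instruction in remove_redundant_copies(three_address):
    print(instruction)
```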

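v) Object Code Generation (OCG): the phase where the intermediate code is translated into object code

Ex. Source: x = a*b + c

Obj: L   4,a    (load a into register 4)
     M   4,b    (multiply register 4 by b)
     A   4,c    (add c to register 4)
     STO 4,x    (store register 4 into x)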
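A very small code generator that produces the instruction sequence above from optimized 3AC might look like the sketch below. It rests on strong assumptions: a single working register, the tuple format (dest, left, op, right) for 3AC, temporaries named t1, t2, ..., and each temporary being consumed as the left operand of the very next instruction. The mnemonics L, M, A and STO come from the example; S and D for subtract and divide are hypothetical additions.

```python
# L, M, A, STO follow the example in the notes; S and D are assumed mnemonics.
OPCODES = {"+": "A", "-": "S", "*": "M", "/": "D"}

def is_temp(name):
    # Assumed naming convention: compiler temporaries look like t1, t2, ...
    return name.startswith("t") and name[1:].isdigit()

def generate_object_code(code, reg=4):
    """Translate 3AC tuples (dest, left, op, right) into one-register object code."""
    out = []
    in_reg = None                            # name whose value is in the register
    for dest, left, op, right in code:
        if in_reg is not None and right == in_reg:
            raise NotImplementedError("right operand is held in the register")
        if left != in_reg:
            out.append(f"L {reg},{left}")    # bring the left operand into the register
        if op is not None:
            out.append(f"{OPCODES[op]} {reg},{right}")
        if not is_temp(dest):
            out.append(f"STO {reg},{dest}")  # only program variables go to memory
        in_reg = dest
    return out

optimized = [("t1", "a", "*", "b"), ("x", "t1", "+", "c")]
print("\n".join(generate_object_code(optimized)))
```

Keeping temporaries in the register instead of storing them is what makes the output as short as the hand-written example; a realistic generator would also need register allocation for several registers.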

Source --> [LA] --> tokens --> [SA] --> parse tree --> [ICG] --> IC --> [OPT] --> optimized IC --> [OCG] --> object code

b) Supporting Systems
Two additional components are needed throughout the compilation process:

i) Symbol Table Handler: The symbol table is the central repository of information about the names (variables, functions, types, etc.) created by the program. The handler inserts symbols into the table and looks them up.

ii) Error Handler: deals with errors that may occur during compilation. Its tasks are detection, message reporting, and recovery.
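The two operations of the symbol table handler described in i), insertion and lookup, can be sketched with a dictionary. The class name, the stored attributes (kind, type) and the duplicate-declaration check are assumptions for illustration; a real compiler would typically also track scopes and storage locations.

```python
class SymbolTable:
    """Minimal symbol table: one flat scope, names mapped to their attributes."""

    def __init__(self):
        self._entries = {}

    def insert(self, name, kind, type_name):
        """Record a newly declared name; a second declaration is reported as an error."""
        if name in self._entries:
            raise KeyError(f"duplicate declaration of {name!r}")
        self._entries[name] = {"kind": kind, "type": type_name}

    def lookup(self, name):
        """Return the attributes of a name, or None if it was never declared."""
        return self._entries.get(name)

table = SymbolTable()
table.insert("x", kind="variable", type_name="integer")
table.insert("sqrt", kind="function", type_name="real")
print(table.lookup("x"))
print(table.lookup("y"))       # None: an undeclared name for the error handler to report
```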

c) Pass
A pass reads one version of a program (initially the source) from a file and writes a new version.

Ex. One-Pass Compiler: reads the source and writes the object code to a file

Multiple-Pass Compiler: two or more passes are needed to create the object code.
Reasons: 1. Some information is NOT available during the initial pass
         2. There is NOT enough memory to hold all intermediate results
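Reason 1 can be illustrated with forward references: a jump to a label that is defined later cannot be resolved the first time the jump is read. The sketch below uses a hypothetical mini assembly language (a line ending in ":" defines a label, JMP is an assumed jump mnemonic) and works on in-memory lists rather than files to stay short.

```python
def two_pass_resolve(lines):
    """Replace label names in JMP instructions by instruction addresses."""
    # Pass 1: record the address of every label definition.
    labels = {}
    instructions = []
    for line in lines:
        line = line.strip()
        if line.endswith(":"):
            labels[line[:-1]] = len(instructions)
        elif line:
            instructions.append(line)

    # Pass 2: every label address is known now, so jumps can be rewritten.
    resolved = []
    for instruction in instructions:
        parts = instruction.split()
        if parts[0] == "JMP":
            resolved.append(f"JMP {labels[parts[1]]}")
        else:
            resolved.append(instruction)
    return resolved

program = [
    "JMP end",      # forward reference: 'end' is not yet known in pass 1
    "ADD x",
    "end:",
    "STO y",
]
print(two_pass_resolve(program))
```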

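III. Writing a Compiler
a) Choice of a language: Any high-level language (HLL) can be used to write a compiler

Q: Can we write a compiler in its own language? (e.g., an XYZ compiler in XYZ?)

A: Yes and no. Not directly, because no XYZ compiler exists yet to compile it; but yes, in stages, by bootstrapping.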

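Bootstrapping:

Step 1: Minimal XYZ compiler (for a subset of XYZ), written in Pascal
            --> [Pascal Compiler] --> Minimal XYZ compiler (executable)

Step 2: Full XYZ compiler, written in Minimal XYZ
            --> [Minimal XYZ Compiler] --> Full XYZ compiler (executable)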

b) Writing Retargetable Compilers:

Retargetable compilers are compilers that can be used with different machines

Q: How do we write such a compiler?

i) Distinguish the front end and the back end of the compilation process

Front End: LA, SA, ICG (Machine Independent)
Back End: OPT, OCG (Machine Dependent)

Write front ends for different languages    Write back ends for different machines
    C                                           PC
    C++                                         MAC

ii) Write a compiler for an imaginary machine (P-Code)

XYZ compiler written in C --> [C compiler for P-Code] --> XYZ compiler in P-Code

The XYZ compiler in P-Code then runs on a P-Code interpreter, written once for each of the different machines.
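To give an idea of what the interpreter in approach ii) does, here is a sketch of an interpreter for a tiny stack-based imaginary machine. The instruction names (LOAD, STORE, ADD, SUB, MUL) are hypothetical and are not the actual P-Code instruction set; only this small interpreter would need to be rewritten for each real machine.

```python
import operator

# Hypothetical instruction set for an imaginary stack machine (not real P-Code).
BINARY = {"ADD": operator.add, "SUB": operator.sub, "MUL": operator.mul}

def run(program, memory):
    """Execute a list of instructions; `memory` maps variable names to values."""
    stack = []
    for op, *args in program:
        if op == "LOAD":
            stack.append(memory[args[0]])    # push the value of a variable
        elif op == "STORE":
            memory[args[0]] = stack.pop()    # pop the result into a variable
        elif op in BINARY:
            right = stack.pop()
            left = stack.pop()
            stack.append(BINARY[op](left, right))
        else:
            raise ValueError(f"unknown instruction {op!r}")
    return memory

# x = a*b + c, compiled by hand for the imaginary machine
program = [("LOAD", "a"), ("LOAD", "b"), ("MUL",),
           ("LOAD", "c"), ("ADD",), ("STORE", "x")]
print(run(program, {"a": 2, "b": 3, "c": 4})["x"])
```

The instruction sequence is the postfix form of the expression, which is one reason stack machines are a convenient imaginary target.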
