
Refresher

Feb 17


Compilers

The compilation process converts the characters of the source code into tokens through lexical
analysis, parses the tokens into an Abstract Syntax Tree (AST), checks types against the symbol
table, and generates bytecodes that an interpreter executes using a runtime stack.

Lexical Analysis takes in source code and converts it into a stream of lexical units (tokens) with
assigned meanings. Lexical analysis also removes whitespace and comments.
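
As a rough illustration of the paragraph above, here is a minimal lexer sketch. The class name LexerSketch, the Kind categories, and the choice of // for comments are assumptions made for this example and are not taken from the compiler described in these notes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical minimal lexer; all names here are illustrative assumptions.
class LexerSketch {
    enum Kind { IDENTIFIER, NUMBER, OPERATOR }

    // A token pairs a lexeme (the characters read) with its assigned meaning.
    static final class Tok {
        final Kind kind;
        final String lexeme;
        Tok(Kind kind, String lexeme) { this.kind = kind; this.lexeme = lexeme; }
        @Override public String toString() { return kind + "(" + lexeme + ")"; }
    }

    static List<Tok> tokenize(String src) {
        List<Tok> out = new ArrayList<>();
        int i = 0;
        while (i < src.length()) {
            char c = src.charAt(i);
            if (Character.isWhitespace(c)) {
                i++; // whitespace is dropped, never emitted as a token
            } else if (c == '/' && i + 1 < src.length() && src.charAt(i + 1) == '/') {
                // assumed comment syntax: skip everything up to the end of the line
                while (i < src.length() && src.charAt(i) != '\n') i++;
            } else if (Character.isLetter(c)) {
                int start = i;
                while (i < src.length() && Character.isLetterOrDigit(src.charAt(i))) i++;
                out.add(new Tok(Kind.IDENTIFIER, src.substring(start, i)));
            } else if (Character.isDigit(c)) {
                int start = i;
                while (i < src.length() && Character.isDigit(src.charAt(i))) i++;
                out.add(new Tok(Kind.NUMBER, src.substring(start, i)));
            } else {
                // any other single character is treated as an operator in this sketch
                out.add(new Tok(Kind.OPERATOR, String.valueOf(c)));
                i++;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("x = 42 // answer\ny = x + 1"));
    }
}
```

The comment and the blanks in the sample input produce no tokens; only identifiers, numbers, and operators reach the token stream.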

Parsing analyzes the token stream according to the grammar of the language and builds an Abstract
Syntax Tree (AST), a tree representation of the abstract syntactic structure of the source code.
In an AST, parentheses are implied by the tree structure (for example, a block node groups its
children) rather than stored explicitly.
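
To make the point about implied parentheses concrete, the following is a small recursive-descent sketch for a toy expression grammar with +, *, and parentheses. The grammar, the class AstSketch, and its node types are assumptions for illustration only; the language in these notes has its own grammar.

```java
// Hypothetical sketch; the grammar and all names are illustrative assumptions.
class AstSketch {
    // AST node: either a number leaf or a binary operator with two children.
    static abstract class Node {}

    static final class Num extends Node {
        final int value;
        Num(int value) { this.value = value; }
        @Override public String toString() { return Integer.toString(value); }
    }

    static final class BinOp extends Node {
        final char op;
        final Node left, right;
        BinOp(char op, Node left, Node right) { this.op = op; this.left = left; this.right = right; }
        // Prefix form shows the tree shape; the brackets are print notation, not stored nodes.
        @Override public String toString() { return op + "[" + left + ", " + right + "]"; }
    }

    private final String src;
    private int pos = 0;

    private AstSketch(String src) { this.src = src.replace(" ", ""); }

    static Node parse(String src) {
        AstSketch p = new AstSketch(src);
        Node n = p.expr();
        if (p.pos != p.src.length()) throw new IllegalArgumentException("unexpected input at " + p.pos);
        return n;
    }

    // expr := term ('+' term)*
    private Node expr() {
        Node left = term();
        while (pos < src.length() && src.charAt(pos) == '+') {
            pos++;
            left = new BinOp('+', left, term());
        }
        return left;
    }

    // term := factor ('*' factor)*
    private Node term() {
        Node left = factor();
        while (pos < src.length() && src.charAt(pos) == '*') {
            pos++;
            left = new BinOp('*', left, factor());
        }
        return left;
    }

    // factor := NUMBER | '(' expr ')'
    private Node factor() {
        if (pos < src.length() && src.charAt(pos) == '(') {
            pos++;
            Node inner = expr();
            if (pos >= src.length() || src.charAt(pos) != ')') throw new IllegalArgumentException("expected ')'");
            pos++;
            return inner; // no node is created for the parentheses themselves
        }
        int start = pos;
        while (pos < src.length() && Character.isDigit(src.charAt(pos))) pos++;
        if (start == pos) throw new IllegalArgumentException("expected number at " + pos);
        return new Num(Integer.parseInt(src.substring(start, pos)));
    }
}
```

Parsing "(1 + 2) * 3" and "1 + 2 * 3" yields trees of different shapes, and neither tree contains a parenthesis node: the grouping survives only as the position of the + node.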

Constraining checks types and decorates the AST with references to each identifier's entry in the
symbol table. The symbol table is built from information about the identifiers in the source code,
including where each is declared or used. It relates human-readable names to memory locations.
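
One possible way to picture constraining is sketched below: a table of declarations, an identifier use that gets decorated with a link to its declaration, and a type check on assignment. ConstrainSketch, Decl, IdUse, and the two types are hypothetical names chosen for this example, and scoping is left out entirely.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of constraining; names and types are illustrative assumptions.
class ConstrainSketch {
    enum Type { INT, BOOL }

    // Symbol table entry created when an identifier is declared.
    static final class Decl {
        final String name;
        final Type type;
        Decl(String name, Type type) { this.name = name; this.type = type; }
    }

    // An identifier appearing in the AST; 'decl' is the decoration added by constraining.
    static final class IdUse {
        final String name;
        Decl decl;
        IdUse(String name) { this.name = name; }
    }

    private final Map<String, Decl> table = new HashMap<>();

    void declare(String name, Type type) {
        if (table.putIfAbsent(name, new Decl(name, type)) != null) {
            throw new IllegalStateException("already declared: " + name);
        }
    }

    // Decorate a use with a reference to its declaration in the symbol table.
    Decl resolve(IdUse use) {
        Decl d = table.get(use.name);
        if (d == null) throw new IllegalStateException("undeclared: " + use.name);
        use.decl = d;
        return d;
    }

    // Type check: the assigned value must have the declared type of the target.
    void checkAssign(IdUse target, Type valueType) {
        Decl d = resolve(target);
        if (d.type != valueType) {
            throw new IllegalStateException("type mismatch: " + d.name + " is " + d.type + ", got " + valueType);
        }
    }
}
```

After checkAssign succeeds, the IdUse node carries a direct reference to its declaration, which is the information a later code generation pass would need.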

Code generation walks the decorated AST to generate bytecodes and to compute and track the
runtime stack offset (the address field) of each variable declared in the code.
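
The offset tracking described above might look roughly like the sketch below. The mnemonics LIT, LOAD, and STORE and the class CodegenSketch are assumptions for illustration; the actual bytecode set is not given in these notes, and the AST walk is replaced by direct method calls to keep the example short.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical code generator fragment; bytecode mnemonics are illustrative assumptions.
class CodegenSketch {
    private final Map<String, Integer> offsets = new LinkedHashMap<>();
    private final List<String> code = new ArrayList<>();
    private int nextOffset = 0; // next free slot in the current runtime stack frame

    // Variable declaration: reserve the next stack slot and remember its offset.
    void declare(String name) {
        if (offsets.containsKey(name)) throw new IllegalStateException("redeclared: " + name);
        offsets.put(name, nextOffset++);
        code.add("LIT 0 " + name); // push an initial value so the slot exists at run time
    }

    // Assignment of a constant: push the value, then store it at the variable's offset.
    void assignConstant(String name, int value) {
        code.add("LIT " + value);
        code.add("STORE " + offsetOf(name) + " " + name);
    }

    // Variable use: push the value found at the variable's offset.
    void load(String name) {
        code.add("LOAD " + offsetOf(name) + " " + name);
    }

    int offsetOf(String name) {
        Integer off = offsets.get(name);
        if (off == null) throw new IllegalStateException("undeclared: " + name);
        return off;
    }

    List<String> code() { return code; }
}
```

Declaring x and then y assigns them offsets 0 and 1, so a later assignment to y emits a store to offset 1 regardless of where in the program the assignment appears.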

Function code -> run time stack
Program counter -> location pointer.


Lexical Analysis

Read and convert a stream of characters into tokens of lexical units (lexemes).
SourceReader.java reads the source file and generates output. Use toString() for printing.

The tokens file defines the tokens; each line consists of the symbolic constant used in the
compiler and the actual token.

TokenSetup.java reads the tokens file and generates the files Tokens.java and TokenType.java.

Token.java contains the token's String, its TokenType, and its starting and ending columns.

Symbol.java contains the token's String and its TokenType (value). Each token in the source file
is placed exactly once in the Symbol hash table.

Lexical Analysis creates symbol instances for reserved words and operators at initialization. In
processing the source program, new symbols of identifiers and numbers are created and added
to the hash map.

tokens.put( Tokens.program, Symbol.symbol( "program", Tokens.program ) );
and Symbol.symbols.get( "program" ) yields Symbol( "program", Tokens.program )
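
The "placed once" behaviour can be sketched as a factory method backed by a hash map. SymbolSketch and its Kind enum below are hypothetical stand-ins, not the real Symbol.java or the generated Tokens.java, whose exact contents are not shown in these notes.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for Symbol.java; names and fields are illustrative assumptions.
final class SymbolSketch {
    enum Kind { PROGRAM, IDENTIFIER, INTEGER } // stand-in for the generated token constants

    // One shared table: each distinct spelling maps to exactly one instance.
    private static final Map<String, SymbolSketch> symbols = new HashMap<>();

    private final String name;
    private final Kind kind;

    private SymbolSketch(String name, Kind kind) {
        this.name = name;
        this.kind = kind;
    }

    // Return the existing symbol for this spelling, or create and register a new one.
    static SymbolSketch symbol(String name, Kind kind) {
        SymbolSketch s = symbols.get(name);
        if (s == null) {
            s = new SymbolSketch(name, kind);
            symbols.put(name, s);
        }
        return s;
    }

    // Look up a spelling without creating anything; returns null if it was never added.
    static SymbolSketch lookup(String name) { return symbols.get(name); }

    String name() { return name; }
    Kind kind() { return kind; }

    @Override public String toString() { return "Symbol(" + name + ", " + kind + ")"; }
}
```

Because the constructor is private, every symbol has to go through symbol(), so reserved words registered at initialization and identifiers met later in the source all share the same table and can be compared by reference.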
