
A

PROJECT REPORT

ON

LEXICAL ANALYZER

SESSION: 2008 - 2009

Guided By: Ms. Deepti Arora

Submitted By: Ankita Verma (B.E. VIIth Sem, I.T.)

Submitted to:

Department of Information Technology

Laxmi Devi Institute of Engineering & Technology, Alwar (Raj.)

University of Rajasthan

INDEX
1. Project description
1.1 Objective of the project

2. Project contents

2.1 Software development life cycle

2.2 Study and formulation

2.3 Project category

2.4 Platform (technology/tools)

2.5 Software and hardware used

2.6 Feasibility

2.7 System design

2.8 Data flow diagram

2.9 Entity relationship diagram

2.10 Testing

2.10.1 Testing methodology

2.10.2 Testing strategy

2.11 System security

2.12 Implementation and maintenance

2.13 Evaluation

2.14 Code

3. Advantages and disadvantages

4. Conclusion

5. References
ACKNOWLEDGEMENT

We thank Mr. Sudhir Pathak, Head of Department,


Department of Computer Science and Information
Technology, LIET Alwar (Raj.) for his guidance and
co-operation. We also acknowledge the advice and
help given to us by Ms. Deepti Arora. We would like
to extend our gratitude to the entire faculty and
staff of the Department of CS & IT, LIET Alwar
(Raj.), who stood by us through every difficulty we
faced during the development phase of this
project.
CERTIFICATE OF APPROVAL
SESSION: 2008-2009

This is to certify that Ms. Ankita Verma, Ms. Harshi Yadav, Ms. Sameeksha
Chauhan have successfully completed the project entitled

“LEXICAL ANALYZER”
Under the able guidance of Ms. Deepti Arora towards the partial
fulfillment of the Bachelor’s degree course in Information Technology.

Head of Department: Mr. Sudhir Pathak

Guided By: Ms. Deepti Arora


Preface
The lexical analyzer is responsible for scanning the source input file and
translating lexemes into small objects that the compiler can easily
process. These small objects are often called “tokens”. The lexical
analyzer is also responsible for converting sequences of digits into their
numeric form as well as processing other literal constants, for removing
comments and whitespace from the source file, and for taking care of
many other mechanical details.

The lexical analyzer reads a string of characters and checks whether it
forms a valid token in the grammar.

Lexical analysis terminology:

• Token:

 - A terminal symbol in a grammar

 - A class of character sequences with a collective meaning

 - Examples: constants, operators, punctuation, keywords.

• Lexeme:

 - The character sequence matched by an instance of the token.
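For example (an illustration of our own, not taken from the project's test
input), consider the single C statement below and the tokens the lexical
analyzer would report for it; each matched character sequence is the
lexeme of its token.

    Input statement:  count = count + 10;

    Token stream:     IDENTIFIER(count)  OPERATOR(=)  IDENTIFIER(count)
                      OPERATOR(+)  NUMBER(10)  SPECIAL CHARACTER(;)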


Project
Description
The lexical analyzer converts a stream of input characters into a stream of
tokens. The different tokens that our lexical analyzer identifies are as
follows:

KEYWORDS: int, char, float, double, if, for, while, else, switch, struct,
printf, scanf, case, break, return, typedef, void

IDENTIFIERS: main, fopen, getch, etc.

NUMBERS: positive and negative integers, positive and negative floating
point numbers.

OPERATORS: +, ++, -, --, ||, *, ?, /, >, >=, <, <=, =, ==, &, &&.

BRACKETS: [ ], { }, ( ).

STRINGS: sets of characters enclosed within quotes.

COMMENT LINES: single-line and multi-line comments (these are ignored).

For tokenizing into identifiers and keywords we incorporate a symbol
table which initially consists of the predefined keywords. The tokens are
read from an input file. If the encountered token is an identifier or a
keyword, the lexical analyzer looks it up in the symbol table to check
whether the token already exists. If an entry does exist, we proceed to
the next token. If not, that particular token, along with its token value,
is written into the symbol table. The rest of the tokens are directly
displayed by writing them into an output file.

The output file will consist of all the tokens present in our input file along
with their respective token values.
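The following is a minimal sketch, in C, of the symbol-table handling
described above. The table layout and the names lookup() and install()
are illustrative assumptions made for this report, not the exact routines
of the project (the actual source appears in the Code section).

    #include <stdio.h>
    #include <string.h>

    #define MAX_SYMBOLS 100

    struct symbol {
        char name[32];
        int  token_value;            /* value written to the output file */
    };

    static struct symbol table[MAX_SYMBOLS];
    static int sym_count = 0;

    /* Return the index of name in the symbol table, or -1 if absent. */
    int lookup(const char *name)
    {
        int i;
        for (i = 0; i < sym_count; i++)
            if (strcmp(table[i].name, name) == 0)
                return i;
        return -1;
    }

    /* Install a new keyword or identifier and return its index. */
    int install(const char *name, int token_value)
    {
        strcpy(table[sym_count].name, name);
        table[sym_count].token_value = token_value;
        return sym_count++;
    }

Keywords such as int and while would be installed once at start-up; every
identifier read from the input file is first passed to lookup(), and
install() is called only when the lookup fails.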
SYSTEM DESIGN:
Process:

The lexical analyzer is the first phase of a compiler. Its main task is to
read the input characters and produce as output a sequence of tokens
that the parser uses for syntax analysis. This interaction is summarized
schematically in fig. a.

Upon receiving a “get next token” command from the parser, the lexical
analyzer reads the input characters until it can identify the next token.

Sometimes, lexical analyzers are divided into a cascade of two phases,


the first called “scanning”, and the second “lexical analysis”.

The scanner is responsible for doing simple tasks, while the lexical
analyzer proper does the more complex operations.

The lexical analyzer which we have designed takes the input from an
input file. It reads one character at a time from the input file, and
continues to read until the end of the file is reached. It recognizes the
valid identifiers and keywords and specifies the token values of the keywords.

It also identifies the header files, #define statements, numbers, special


characters, various relational and logical operators, ignores the white
spaces and comments. It prints the output in a separate file specifying
the line number.
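A minimal sketch of this character-by-character reading loop is given
below. The file names and the handling shown (counting lines and skipping
white space) are assumptions used only for illustration, not the project's
exact code.

    #include <stdio.h>
    #include <ctype.h>

    int main(void)
    {
        FILE *in  = fopen("input.c", "r");     /* source file to scan     */
        FILE *out = fopen("output.txt", "w");  /* token listing goes here */
        int ch, line = 1;

        if (in == NULL || out == NULL)
            return 1;

        while ((ch = fgetc(in)) != EOF) {      /* read until end of file  */
            if (ch == '\n')
                line++;                        /* keep track of line numbers */
            else if (isspace(ch))
                continue;                      /* ignore other white space */
            else
                fprintf(out, "line %d: read '%c'\n", line, ch);
                /* a real scanner would collect characters into a lexeme
                   here and classify it before writing to the output file */
        }

        fclose(in);
        fclose(out);
        return 0;
    }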
BLOCK DIAGRAM:
OBJECTIVE OF
THE PROJECT

AIM OF THE PROJECT


The aim of the project is to develop a lexical analyzer
that can generate tokens for further processing
by the compiler.

PURPOSE OF THE PROJECT

The lexical features of a language can be specified
using a type-3 (regular) grammar. The job of the lexical
analyzer is to read the source program one
character at a time and produce as output a
stream of tokens. The tokens produced by the
lexical analyzer serve as input to the next phase,
the parser. Thus, the lexical analyzer’s job is to
translate the source program into a form more
conducive to recognition by the parser.

GOALS

To create tokens from the given input stream.


SCOPE OF PROJECT
The lexical analyzer converts the input program into
a stream of the valid words of the language,
known as tokens.

The parser looks at the sequence of these tokens
and identifies the language constructs occurring in the
input program. The parser and the lexical analyzer
work hand in hand: whenever the parser needs
further tokens to proceed, it requests them from
the lexical analyzer. The lexical analyzer in turn
scans the remaining input stream and returns the
next token occurring there. Apart from that, the
lexical analyzer also participates in the creation and
maintenance of the symbol table. This is because the
lexical analyzer is the first module to identify the
occurrence of a symbol. If a symbol is being
defined for the first time, it needs to be installed
into the symbol table, and the lexical analyzer is
usually the module responsible for doing so.
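The hand-in-hand working of the parser and the lexical analyzer can be
pictured with the small, self-contained sketch below. The token structure,
the stub getNextToken() and the token codes are hypothetical names chosen
for illustration; they show only the calling pattern, not the project's
real interface.

    #include <stdio.h>

    typedef struct {
        int  type;           /* 0 = end of input, 1 = identifier, 2 = number */
        char lexeme[16];
    } Token;

    /* Stub lexer: hands out a fixed token stream for demonstration. */
    Token getNextToken(void)
    {
        static Token stream[] = { {1, "x"}, {2, "42"}, {0, ""} };
        static int next = 0;
        return stream[next++];
    }

    int main(void)
    {
        /* The "parser" keeps requesting tokens until the input is exhausted. */
        Token t = getNextToken();
        while (t.type != 0) {
            printf("type %d, lexeme \"%s\"\n", t.type, t.lexeme);
            t = getNextToken();
        }
        return 0;
    }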
PROJECT
CONTENTS

SOFTWARE DEVELOPMENT LIFE CYCLE


Systems Development Life Cycle (SDLC), or Software Development Life
Cycle, in systems engineering and software engineering refers to the
process of developing systems, and to the models and methodologies that
people use to develop these systems, generally computer or information
systems.

In software engineering, this SDLC concept underpins many kinds of
software development methodologies. These methodologies form the
framework that is used to structure, plan, and control the process of
developing an information system: the software development process.

Overview

Systems Development Life Cycle (SDLC) is any logical process used by a


systems analyst to develop an information system, including
requirements, validation, training, and user ownership. An SDLC should
result in a high quality system that meets or exceeds customer
expectations, within time and cost estimates, works effectively and
efficiently in the current and planned Information Technology
infrastructure, and is cheap to maintain and cost-effective to enhance.[2]

Computer systems have become more complex and usually (especially


with the advent of Service-Oriented Architecture) link multiple
traditional systems often supplied by different software vendors. To
manage this, a number of system development life cycle (SDLC) models
have been created: waterfall, fountain, spiral, build and fix, rapid
prototyping, incremental, and synchronize and stabilize. Although in the
academic sense, SDLC can be used to refer to various models, SDLC is
typically used to refer to a waterfall methodology.

In project management a project has both a life cycle and a "systems


development life cycle" during which a number of typical activities occur.
The project life cycle (PLC) encompasses all the activities of the project,
while the systems development life cycle (SDLC) is focused on
accomplishing the product requirements.

Systems Development Phases

Systems Development Life Cycle (SDLC) adheres to important phases that


are essential for developers, such as planning, analysis, design, and
implementation, and are explained in the section below. There are
several Systems Development Life Cycle Models in existence. The oldest
model, that was originally regarded as "the Systems Development Life
Cycle" is the waterfall model: a sequence of stages in which the output of
each stage becomes the input for the next. These stages generally follow
the same basic steps but many different waterfall methodologies give the
steps different names and the number of steps seems to vary between 4
and 7. There is no definitively correct Systems Development Life Cycle
model, but the steps can be characterized and divided into several phases.

Phases
Initiation Phase

The Initiation Phase begins when a business sponsor identifies a need or


an opportunity. The purpose of the Initiation Phase is to:

• Identify and validate an opportunity to improve business


accomplishments of the organization or a deficiency related to a
business need.

• Identify significant assumptions and constraints on solutions to that


need.

• Recommend the exploration of alternative concepts and methods to


satisfy the need including questioning the need for technology, i.e.,
will a change in the business process offer a solution?

• Assure executive business and executive technical sponsorship.

System Concept Development Phase

The System Concept Development Phase begins after a business need or


opportunity is validated by the Agency/Organization Program Leadership
and the Agency/Organization CIO. The purpose of the System Concept
Development Phase is to:

• Determine the feasibility and appropriateness of the alternatives.

• Identify system interfaces.

• Identify basic functional and data requirements to satisfy the business
need.

• Establish system boundaries; identify goals, objectives, critical success
factors, and performance measures.

• Evaluate costs and benefits of alternative approaches to satisfy the
basic functional requirements.

• Assess project risks.

• Identify and initiate risk mitigation actions.

• Develop high-level technical architecture, process models, data
models, and a concept of operations.

Planning Phase

During this phase, a plan is developed that documents the approach to be


used and includes a discussion of methods, tools, tasks, resources,
project schedules, and user input. Personnel assignments, costs, project
schedule, and target dates are established. A Project Management Plan is
created with components related to acquisition planning, configuration
management planning, quality assurance planning, concept of operations,
system security, verification and validation, and systems engineering
management planning.

Requirements Analysis Phase

This phase formally defines the detailed functional user requirements


using high-level requirements identified in the Initiation, System
Concept, and Planning phases. It also delineates the requirements in
terms of data, system performance, security, and maintainability
requirements for the system. The requirements are defined in this phase
to a level of detail sufficient for systems design to proceed. They need to
be measurable, testable, and relate to the business need or opportunity
identified in the Initiation Phase. The requirements that will be used to
determine acceptance of the system are captured in the Test and
Evaluation Master Plan.
The purposes of this phase are to:

• Further define and refine the functional and data requirements and
document them in the Requirements Document.

• Complete business process reengineering of the functions to be supported
(i.e., verify what information drives the business process, what
information is generated, who generates it, where the information
goes, and who processes it).

• Develop detailed data and process models (system inputs, outputs, and
processes).

• Develop the test and evaluation requirements that will be used to
determine acceptable system performance.

Design Phase

During this phase, the system is designed to satisfy the functional


requirements identified in the previous phase. Since problems in the
design phase could be very expensive to solve in the later stage of the
software development, a variety of elements are considered in the design
to mitigate risk. These include:

• Identifying potential risks and defining mitigating design


features.

• Performing a security risk assessment.

• Developing a conversion plan to migrate current data to the new


system.

• Determining the operating environment.

• Defining major subsystems and their inputs and outputs.

• Allocating processes to resources.


• Preparing detailed logic specifications for each software module.

Development Phase

Effective completion of the previous stages is a key factor in the success


of the Development phase. The Development phase consists of:
• Translating the detailed requirements and design into system
components.
• Testing individual elements (units) for usability.
• Preparing for integration and testing of the IT system.

Integration and Test Phase


Subsystem integration, system, security, and user acceptance testing are
conducted during the integration and test phase. The user, together with those
responsible for quality assurance, validates that the functional
requirements, as defined in the functional requirements document, are
satisfied by the developed or modified system. OIT Security staff
assesses the system security and issues a security certification and
accreditation prior to installation/implementation. Multiple levels of
testing are performed, including:

• Testing at the development facility by the contractor and possibly


supported by end users

• Testing as a deployed system with end users working together with


contract personnel

• Operational testing by the end user alone performing all functions.

Implementation Phase
This phase is initiated after the system has been tested and accepted by
the user. In this phase, the system is installed to support the intended
business functions. System performance is compared to performance
objectives established during the planning phase. Implementation
includes user notification, user training, installation of hardware,
installation of software onto production computers, and integration of
the system into daily work processes.

This phase continues until the system is operating in production in


accordance with the defined user requirements.

Operations and Maintenance Phase

The system operation is ongoing. The system is monitored for continued


performance in accordance with user requirements and needed system
modifications are incorporated. Operations continue as long as the
system can be effectively adapted to respond to the organization’s needs.
When modifications or changes are identified, the system may reenter
the planning phase. The purpose of this phase is to:

• Operate, maintain, and enhance the system.

• Certify that the system can process sensitive information.

• Conduct periodic assessments of the system to ensure the


functional requirements continue to be satisfied.

• Determine when the system needs to be modernized, replaced, or


retired.

Disposition Phase
Disposition activities ensure the orderly termination of the system and
preserve the vital information about the system so that some or all of the
information may be reactivated in the future if necessary. Particular
emphasis is given to proper preservation of the data processed by the
system, so that the data can be effectively migrated to another system or
archived for potential future access in accordance with applicable
records management regulations and policies. Each system should have
an interface control document defining inputs and outputs and data
exchange. Signatures should be required to verify that all dependent
users and impacted systems are aware of disposition.
Summary

The purpose of a Systems Development Life Cycle methodology is to


provide IT Project Managers with the tools to help ensure successful
implementation of systems that satisfy Agency strategic and business
objectives. The documentation provides a mechanism to ensure that
executive leadership, functional managers and users sign-off on the
requirements and implementation of the system. The process provides
Agency managers and the Project Manager with the visibility of design,
development, and implementation status needed to ensure delivery on-
time and within budget.
SDLC OBJECTIVES

The objectives of the SDLC approach are to:


• Deliver quality systems which meet or exceed customer
expectations when promised and within cost estimates
• Develop quality systems using an identifiable, measurable, and
repeatable process.
• Establish an organizational and project management structure with
appropriate levels of authority to ensure that each system
development project is effectively managed throughout its life
cycle.
• Identify and assign the roles and responsibilities of all affected
parties including functional and technical managers throughout the
system development life cycle.
• Ensure that system development requirements are well defined and
subsequently satisfied.
• Provide visibility to functional and technical
managers for major system development resource requirements and
expenditures.
• Establish appropriate levels of management authority to provide
timely direction, coordination, control, review, and approval of the
system development project.
• Ensure project management accountability.
• Ensure that projects are developed within the current and planned
information technology infrastructure.
• Identify project risks early and manage them before they become
problems.
SYSTEM STUDY
&
PROBLEM FORMULATION

A Software Requirements Specification (SRS) is a complete description


of the behavior of the software of the system to be developed. It includes
a set of use cases that describe all the interactions the users will have
with the software. Use cases are also known as functional requirements.
In addition to use cases, the SRS also contains nonfunctional (or
supplementary) requirements. Non-functional requirements are
requirements which impose constraints on the design or implementation
(such as performance engineering requirements, quality standards, or
design constraints).

Purpose

The purpose of this software requirements specification (SRS) is to
establish the major requirements necessary to develop the software
system.
PROJECT CATEGORY

Category of this project is Compiler Design based.

COMPILER

To define what a compiler is one must first define what a translator is. A
translator is a program that takes another program written in one
language, also known as the source language, and outputs a program
written in another language, known as the target language.

Now that the translator is defined, a compiler can be defined as a
translator whose source language is a high-level language such as Java or
Pascal and whose target language is a low-level language such as machine
code or assembly.

There are five parts of compilation (or phases of the compiler):

1.) Lexical Analysis
2.) Syntax Analysis
3.) Semantic Analysis
4.) Code Optimization
5.) Code Generation

Lexical Analysis is the act of taking an input source program and


outputting a stream of tokens. This is done with the Scanner. The Scanner
can also place identifiers into something called the symbol table or place
strings into the string table. The Scanner can report trivial errors such as
invalid characters in the input file.
Syntax Analysis is the act of taking the token stream from the scanner and
comparing it against the rules and patterns of the specified language.
Syntax Analysis is done with the Parser. The Parser produces a tree, which
can come in many formats, but is referred to as the parse tree. It reports
errors when the tokens do not follow the syntax of the specified
language. Errors that the Parser can report are syntactical errors such as
missing parentheses, semicolons, and keywords.

Semantic Analysis is the act of determining whether or not the parse tree
is relevant and meaningful. The output is intermediate code, also known
as an intermediate representation (or IR). Most of the time, this IR is
closely related to assembly language, but it is machine independent.
Intermediate code allows different code generators for different
machines and promotes abstraction and portability away from specific
machine types and languages. (The most famous example is probably Java's
bytecode and the JVM.) Semantic Analysis finds more meaningful errors such as
undeclared variables, type compatibility, and scope resolution.

Code Optimization makes the IR more efficient. Code optimization is
usually done in a sequence of steps. Some optimizations include code
hoisting (moving constant values to better places within the code),
redundant code discovery, and removal of useless code.

Code Generation is the final step in the compilation process. The input to
the Code Generator is the IR and the output is machine language code.
PLATFORM (TECHNOLOGY/TOOLS)

In computing, C is a general-purpose computer programming language


originally developed in 1972 by Dennis Ritchie at the Bell Telephone
Laboratories to implement the Unix operating system.

Although C was designed for writing architecturally independent system


software, it is also widely used for developing application software.

Worldwide, C is the first or second most popular language in terms of


number of developer positions or publicly available code. It is widely used
on many different software platforms, and there are few computer
architectures for which a C compiler does not exist. C has greatly
influenced many other popular programming languages, most notably
C++, which originally began as an extension to C, and Java and C# which
borrow C lexical conventions and operators.

Characteristics

Like most imperative languages in the ALGOL tradition, C has facilities for
structured programming and allows lexical variable scope and recursion,
while a static type system prevents many unintended operations. In C, all
executable code is contained within functions. Function parameters are
always passed by value. Pass-by-reference is achieved in C by explicitly
passing pointer values. Heterogeneous aggregate data types (struct) allow
related data elements to be combined and manipulated as a unit. C
program source text is free-format, using the semicolon as a statement
terminator (not a delimiter).
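Since parameters are always passed by value, the usual C idiom for
“pass by reference” is to pass a pointer explicitly. The short example
below (written for this report, not taken from any library) shows both a
struct used as an aggregate and a pointer used to let a function modify
the caller's variable.

    #include <stdio.h>

    struct point { int x, y; };          /* heterogeneous aggregate (struct) */

    /* The caller's variable cannot be changed directly, so its address
       is passed instead.                                                */
    void move_right(struct point *p, int dx)
    {
        p->x += dx;                      /* modifies the caller's struct */
    }

    int main(void)
    {
        struct point pt = { 1, 2 };
        move_right(&pt, 5);              /* explicit pointer = "by reference" */
        printf("%d %d\n", pt.x, pt.y);   /* prints: 6 2 */
        return 0;
    }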

C also exhibits the following more specific characteristics:

• non-nestable function definitions

• variables may be hidden in nested blocks


• partially weak typing; for instance, characters can be used as
integers

• low-level access to computer memory by converting machine


addresses to typed pointers

• function and data pointers supporting ad hoc run-time


polymorphism

• array indexing as a secondary notion, defined in terms of pointer


arithmetic

• a preprocessor for macro definition, source code file inclusion, and


conditional compilation

• complex functionality such as I/O, string manipulation, and


mathematical functions consistently delegated to library routines

• A relatively small set of reserved keywords (originally 32, now 37 in


C99)

• A lexical structure that resembles B more than ALGOL, for example:

• { ... } rather than ALGOL's begin ... end

• the equal sign is used for assignment (copying), much like Fortran

• two consecutive equal signs are used to test for equality (compare to

.EQ. in Fortran or the equal sign in BASIC)

• && and || in place of ALGOL's "and" and "or" (these are semantically

distinct from the bit-wise operators & and | because they will never
evaluate the right operand if the result can be determined from the
left alone (short-circuit evaluation); see the short example after this list)

• a large number of compound operators, such as +=, ++, and so on.
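A short sketch of the short-circuit behaviour mentioned above (the
variable names are arbitrary):

    #include <stdio.h>

    int main(void)
    {
        int *p = NULL;

        /* && evaluates its right operand only when the left is true, so
           the dereference below is never attempted while p is NULL.     */
        if (p != NULL && *p > 0)
            printf("positive\n");
        else
            printf("null or not positive\n");

        /* The bitwise & operator, by contrast, always evaluates both sides. */
        return 0;
    }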


Features

The relatively low-level nature of the language affords the programmer


close control over what the computer does, while allowing special
tailoring and aggressive optimization for a particular platform. This
allows the code to run efficiently on very limited hardware, such as
embedded systems.

C does not have some features that are available in some other
programming languages:

• No assignment of arrays or strings (copying can be done via standard


functions; assignment of objects having struct or union type is
supported)

• No automatic garbage collection

• No requirement for bounds checking of arrays

• No operations on whole arrays

• No syntax for ranges, such as the A..B notation used in several


languages

• No separate Boolean type: zero/nonzero is used instead[6]

• No formal closures or functions as parameters (only function and


variable pointers)

• No generators or coroutines; intra-thread control flow consists of


nested function calls, except for the use of the longjmp or
setcontext library functions
• No exception handling; standard library functions signify error
conditions with the global errno variable and/or special return
values

• Only rudimentary support for modular programming

• No compile-time polymorphism in the form of function or operator


overloading

• Only rudimentary support for generic programming

• Very limited support for object-oriented programming with regard


to polymorphism and inheritance

• Limited support for encapsulation

• No native support for multithreading and networking

• No standard libraries for computer graphics and several other


application programming needs

A number of these features are available as extensions in some compilers,


or can be supplied by third-party libraries, or can be simulated by
adopting certain coding disciplines.

Operators

C supports a rich set of operators, which are symbols used within an
expression to specify the manipulations to be performed while evaluating
that expression. C has operators for:

• arithmetic (+, -, *, /, %)

• equality testing (==, !=)

• order relations (<, <=, >, >=)

• boolean logic (!, &&, ||)

• bitwise logic (~, &, |, ^)

• bitwise shifts (<<, >>)

• assignment (=, +=, -=, *=, /=, %=, &=, |=, ^=, <<=, >>=)

• increment and decrement (++, --)

• reference and dereference (&, *, [ ])

• conditional evaluation (? :)

• member selection (., ->)

• type conversion (( ))

• object size (sizeof)

• function argument collection (( ))

• sequencing (,)

• subexpression grouping (( ))

C has a formal grammar, specified by the C standard.

Data structures

C has a static weak typing type system that shares some similarities with
that of other ALGOL descendants such as Pascal. There are built-in types
for integers of various sizes, both signed and unsigned, floating-point
numbers, characters, and enumerated types (enum). C99 added a boolean
datatype. There are also derived types including arrays, pointers, records
(struct), and untagged unions (union).

C is often used in low-level systems programming where escapes from the


type system may be necessary. The compiler attempts to ensure type
correctness of most expressions, but the programmer can override the
checks in various ways, either by using a type cast to explicitly convert a
value from one type to another, or by using pointers or unions to
reinterpret the underlying bits of a value in some other way.

Arrays

Array types in C are traditionally of a fixed, static size specified at


compile time. (The more recent C99 standard also allows a form of
variable-length arrays.) However, it is also possible to allocate a block of
memory (of arbitrary size) at run-time, using the standard library's malloc
function, and treat it as an array. C's unification of arrays and pointers
(see below) means that true arrays and these dynamically-allocated,
simulated arrays are virtually interchangeable. Since arrays are always
accessed (in effect) via pointers, array accesses are typically not checked
against the underlying array size, although the compiler may provide
bounds checking as an option. Array bounds violations are therefore
possible and rather common in carelessly written code, and can lead to
various repercussions, including illegal memory accesses, corruption of
data, buffer overruns, and run-time exceptions.

C does not have a special provision for declaring multidimensional arrays,


but rather relies on recursion within the type system to declare arrays of
arrays, which effectively accomplishes the same thing. The index values
of the resulting "multidimensional array" can be thought of as increasing
in row-major order.

Although C supports static arrays, it is not required that array indices be


validated (bounds checking). For example, one can try to write to the
sixth element of an array with five elements, generally yielding
undesirable results. This type of bug, called a buffer overflow or buffer
overrun, is notorious for causing a number of security problems. On the
other hand, since bounds checking elimination technology was largely
nonexistent when C was defined, bounds checking came with a severe
performance penalty, particularly in numerical computation. A few years
earlier, some Fortran compilers had a switch to toggle bounds checking on
or off; however, this would have been much less useful for C, where array
arguments are passed as simple pointers.
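A short example of the run-time allocation described above; the block
returned by malloc is indexed exactly like an array, and (as the text
warns) nothing stops an out-of-bounds index from compiling.

    #include <stdio.h>
    #include <stdlib.h>

    int main(void)
    {
        int i, n = 5;
        int *a = (int *) malloc(n * sizeof(int));  /* sized at run time */

        if (a == NULL)
            return 1;

        for (i = 0; i < n; i++)       /* indexed just like a true array */
            a[i] = i * i;

        printf("a[4] = %d\n", a[4]);
        /* a[5] = 0;   would compile, but writes past the allocated block */

        free(a);
        return 0;
    }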

Deficiencies

Although the C language is extremely concise, C is subtle, and expert


competency in C is not common—taking more than ten years to
achieve.[11] C programs are also notorious for security vulnerabilities due
to the unconstrained direct access to memory of many of the standard C
library function calls.

In spite of its popularity and elegance, real-world C programs commonly


suffer from instability and memory leaks, to the extent that any
appreciable C programming project will have to adopt specialized
practices and tools to mitigate spiraling damage. Indeed, an entire
industry has been born merely out of the need to stabilize large source-
code bases.

Although C was developed for Unix, Microsoft adopted C as the core


language of its operating systems. Although all standard C library calls are
supported by Windows, there is only ad-hoc support for Unix functionality
side-by-side with an inordinate number of inconstant Windows-specific
API calls. There is currently no document in existence that can explain
programming practices that work well across both Windows and Unix.

It is perhaps inevitable that C chose not to fix the size or endianness of its
types; for example, each compiler is free to choose the size of an int
type as anything of 16 bits or more, according to what is efficient on the
current platform. Many programmers work based on size and endianness
assumptions, leading to code that is not portable.

Also inevitable is that the C standard defines only a very limited gamut of
functionality, excluding anything related to network communications,
user interaction, or process/thread creation. Its parent document, the
POSIX standard, includes such a wide array of functionality that no
operating system appears to support it exactly, and only UNIX systems
have even attempted to support substantial parts of it.

Therefore the kinds of programs that can be portably written are


extremely restricted, unless specialized programming practices are
adopted.

SOFTWARE AND HARDWARE TOOLS

Windows XP

Windows XP is a line of operating systems produced by Microsoft for use


on personal computers, including home and business desktops, notebook
computers, and media centers. The name "XP" is short for "experience".
Windows XP is the successor to both Windows 2000 Professional and
Windows Me, and is the first consumer-oriented operating system
produced by Microsoft to be built on the Windows NT kernel and
architecture. Windows XP was first released on 25 October 2001, and over
400 million copies were in use in January 2006, according to an estimate
in that month by an IDC analyst. It is succeeded by Windows Vista, which
was released to volume license customers on 8 November 2006 and
worldwide to the general public on 30 January 2007. Direct OEM and
retail sales of Windows XP ceased on 30 June 2008, although it is still
possible to obtain Windows XP from System Builders (smaller OEMs who
sell assembled computers) until 31 July 2009 or by purchasing Windows
Vista Ultimate or Business and then downgrading to Windows XP.

Windows XP introduced several new features to the Windows line,


including:

• Faster start-up and hibernation sequences

• The ability to discard a newer device driver in favor of the previous


one (known as driver rollback), should a driver upgrade not produce
desirable results

• A new, arguably more user-friendly interface, including the


framework for developing themes for the desktop environment

• Fast user switching, which allows a user to save the current state
and open applications of their desktop and allow another user to log
on without losing that information

• The ClearType font rendering mechanism, which is designed to


improve text readability on Liquid Crystal Display (LCD) and similar
monitors

• Remote Desktop functionality, which allows users to connect to a


computer running Windows XP Pro from across a network or the
Internet and access their applications, files, printers, and devices

• Support for most DSL modems and wireless network connections, as


well as networking over FireWire, and Bluetooth.
Turbo C++

Turbo C++ is a C++ compiler and integrated development environment


(IDE) from Borland. The original Turbo C++ product line was put on hold
after 1994, and was revived in 2006 as an introductory-level IDE,
essentially a stripped-down version of their flagship C++ Builder. Turbo
C++ 2006 was released on September 5, 2006 and is available in 'Explorer'
and 'Professional' editions. The Explorer edition is free to download and
distribute while the Professional edition is a commercial product. The
professional edition is no longer available for purchase from Borland.
Turbo C++ 3.0 was released in 1991 (shipping on November 20), and
came in amidst expectations of the coming release of Turbo C++ for
Microsoft Windows. Initially released as an MS-DOS compiler, 3.0
supported C++ templates, Borland's inline assembler, and generation of
MS-DOS mode executables for both 8086 real mode and 286 protected mode
(as well as the Intel 80186). Version 3.0 implemented AT&T C++ 2.1, the
most recent specification at the time. The separate Turbo Assembler
product was no longer included, but the inline assembler could stand in
as a reduced-functionality version.

Starting with version 3.0, Borland segmented their C++ compiler into two
distinct product-lines: "Turbo C++" and "Borland C++". Turbo C++ was
marketed toward the hobbyist and entry-level compiler market, while
Borland C++ targeted the professional application development market.
Borland C++ included additional tools, compiler code-optimization, and
documentation to address the needs of commercial developers. Turbo C++
3.0 could be upgraded with separate add-ons, such as Turbo Assembler
and Turbovision 1.0.
HARDWARE REQUIREMENT

Processor : Pentium (IV)

RAM : 256 MB

Hard Disk : 40 GB

FDD : 4 GB

Monitor : LG

SOFTWARE REQUIREMENT

Platform Used : Turbo C++ 3.0

Operating System : Windows XP and other versions

Languages : C
FEASIBILITY STUDY

Feasibility study: The feasibility study is a general examination of the


potential of an idea to be converted into a business. This study focuses
largely on the ability of the entrepreneur to convert the idea into a
business enterprise. The feasibility study differs from the viability study
as the viability study is an in-depth investigation of the profitability of
the idea to be converted into a business enterprise.

Types of Feasibility Studies

The following sections describe various types of feasibility studies.

• Technology and System Feasibility

This involves questions such as whether the technology needed for


the system exists, how difficult it will be to build, and whether the
firm has enough experience using that technology. The assessment
is based on an outline design of system requirements in terms of
Input, Processes, Output, Fields, Programs, and Procedures. This
can be quantified in terms of volumes of data, trends, frequency of
updating, etc in order to estimate if the new system will perform
adequately or not.

• Resource Feasibility

This involves questions such as how much time is available to build


the new system, when it can be built, whether it interferes with
normal business operations, type and amount of resources required,
dependencies, etc. Contingency and mitigation plans should also be
stated here so that if the project does over run the company is
ready for this eventuality.

• Schedule Feasibility

A project will fail if it takes too long to be completed before it is


useful. Typically this means estimating how long the system will
take to develop, and if it can be completed in a given time period
using some methods like payback period.

• Economic Feasibility

Economic analysis is the most frequently used method for


evaluating the effectiveness of a candidate system. More commonly
known as cost/benefit analysis, the procedure is to determine the
benefits and savings that are expected from a candidate system and
compare them with costs. If benefits outweigh costs, then the
decision is made to design and implement the system.

• Operational feasibility

Do the current work practices and procedures support a new

system? Social factors are also considered, i.e., how the organizational
changes will affect the working lives of those affected by the system.

• Technical feasibility

This centers on the existing computer system and the extent to which it

can support the proposed addition.
SYSTEM DESIGN
A lexical analyzer generator creates a lexical analyzer using a set of
specifications usually in the format

p1 {action 1}

p2 {action 2}

............

pn {action n}

where each pi is a regular expression and each actioni is a program
fragment that is to be executed whenever a lexeme matched by pi is
found in the input. If more than one pattern matches, then the longest
lexeme matched is chosen. If there are two or more patterns that match
the longest lexeme, the first listed matching pattern is chosen.
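To make the pattern/action format concrete, a small specification might
look like the fragment below. The patterns, token names and helper calls
(install_id(), install_num()) are illustrative assumptions, not the
generator input used for this project.

    if                       { return KEYWORD_IF; }
    [a-zA-Z_][a-zA-Z0-9_]*   { install_id();  return IDENTIFIER; }
    [0-9]+                   { install_num(); return NUMBER; }
    "=="                     { return EQ_OP; }
    [ \t\n]+                 { /* white space: no token returned */ }

With these rules the input "ifx" is reported as a single IDENTIFIER
(longest match wins), while "if" on its own matches both the first and the
second pattern and is reported as KEYWORD_IF because that pattern is
listed first.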

This is usually implemented using a finite automaton. There is


an input buffer with two pointers to it, a lexeme-beginning and a forward
pointer. The lexical analyzer generator constructs a transition table for a
finite automaton from the regular expression patterns in the lexical
analyzer generator specification. The lexical analyzer itself consists of a
finite automaton simulator that uses this transition table to look for the
regular expression patterns in the input buffer.

This can be implemented using an NFA or a DFA. The transition


table for an NFA is considerably smaller than that for a DFA, but the DFA
recognizes patterns faster than the NFA.

Using NFA
The transition table for the NFA N is constructed for the
composite pattern p1|p2|. . .|pn. The NFA recognizes the longest prefix of
the input that is matched by a pattern. In the final NFA, there is an
accepting state for each pattern pi. The set of states the NFA
can be in after seeing each input character is constructed. The NFA is
simulated until it reaches termination or it reaches a set of states from
which there is no transition defined for the current input symbol. The
specification for the lexical analyzer generator is written so that a valid
source program cannot entirely fill the input buffer without having the NFA
reach termination. To find a correct match two things are done. Firstly,
whenever an accepting state is added to the current set of states, the
current input position and the pattern pi corresponding to this accepting
state are recorded. If the current set of states already contains an accepting
state, then only the pattern that appears first in the specification is
recorded. Secondly, the transitions are recorded until termination is
reached. Upon termination, the forward pointer is retracted to the
position at which the last match occurred. The pattern making this match
identifies the token found, and the lexeme matched is the string between
the lexeme-beginning and forward pointers. If no pattern matches, the
lexical analyzer should transfer control to some default recovery routine.

Using DFA

Here a DFA is used for pattern matching. This method is a


modified version of the method using NFA. The NFA is converted to a DFA
using a subset construction algorithm. Here there may be several
accepting states in a given subset of nondeterministic states. The
accepting state corresponding to the pattern listed first in the lexical
analyzer generator specification has priority. Here also state transitions
are made until a state is reached which has no next state for the current
input symbol. The last input position at which the DFA entered an
accepting state gives the lexeme.
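The table-driven matching described above reduces to a very small loop.
The fragment below is a hedged sketch (not the project code): it assumes a
transition table next_state[state][character] and an accepting[] array
that the generator would have filled in.

    #define NSTATES 16
    #define NCHARS  128
    #define REJECT  (-1)

    int next_state[NSTATES][NCHARS];   /* filled in by the generator        */
    int accepting[NSTATES];            /* pattern number, or 0 if not final */

    /* Scan buf starting at *pos: return the pattern recognized (0 = none)
       and advance *pos past the longest matching lexeme.                  */
    int scan(const char *buf, int *pos)
    {
        int state = 0, i = *pos;
        int last_pattern = 0, last_end = *pos;

        while (buf[i] != '\0' &&
               next_state[state][(unsigned char) buf[i]] != REJECT) {
            state = next_state[state][(unsigned char) buf[i]];
            i++;
            if (accepting[state]) {    /* remember the most recent match */
                last_pattern = accepting[state];
                last_end = i;
            }
        }
        *pos = last_end;               /* retract the forward pointer    */
        return last_pattern;
    }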
DATA-FLOW DIAGRAM

A data flow diagram (DFD) is a graphical representation of the "flow" of


data through an information system. It differs from the flowchart as it
shows the data flow instead of the control flow of the program.

A data flow diagram can also be used for the visualization of data
processing (structured design).

Context Level Diagram (Level 0)

A context level Data flow diagram created using Select SSADM.

This level shows the overall context of the system and its operating
environment and shows the whole system as just one process. It does not
usually show data stores, unless they are "owned" by external systems,
e.g. are accessed by but not maintained by this system, however, these
are often shown as external entities.
Level 1

A Level 1 Data flow diagram for the same system.

This level shows all processes at the first level of numbering, data stores,
external entities and the data flows between them. The purpose of this
level is to show the major high level processes of the system and their
interrelation. A process model will have one, and only one, level 1
diagram. A level 1 diagram must be balanced with its parent context level
diagram, i.e. there must be the same external entities and the same data
flows, these can be broken down to more detail in the level 1, e.g. the
"enquiry" data flow could be split into "enquiry request" and "enquiry
results" and still be valid.
Level 2

A Level 2 Data flow diagram showing the "Process Enquiry" process for the
same system.

This level is a decomposition of a process shown in a level 1 diagram, as


such there should be level 2 diagrams for each and every process shown
in a level 1 diagram. In this example processes 1.1, 1.2 & 1.3 are all
children of process 1, together they wholly and completely describe
process 1, and combined must perform the full capacity of this parent
process. As before, a level 2 diagram must be balanced with its parent
level 1 diagram.
ENTITY-RELATIONSHIP DIAGRAM

An entity-relationship model (ERM) in software engineering is an


abstract and conceptual representation of data. Entity-relationship
modeling is a relational schema database modeling method, used to
produce a type of conceptual schema or semantic data model of a
system, often a relational database, and its requirements in a top-down
fashion.

The first stage of information system design uses these models during the
requirements analysis to describe information needs or the type of
information that is to be stored in a database. The data modeling
technique can be used to describe any ontology (i.e. an overview and
classifications of used terms and their relationships) for a certain universe
of discourse (i.e. area of interest). In the case of the design of an
information system that is based on a database, the conceptual data
model is, at a later stage (usually called logical design), mapped to a
logical data model, such as the relational model; this in turn is mapped to
a physical model during physical design. Note that sometimes, both of
these phases are referred to as "physical design".
FLOW CHART

A flowchart is a common type of chart that represents an algorithm or

process, showing the steps as boxes of various kinds, and their order by
connecting these with arrows. Flowcharts are used in analyzing,
designing, documenting or managing a process or program in various
fields.

Flowcharts are used in designing and documenting complex processes.


Like other types of diagram, they help visualize what is going on and
thereby help the viewer to understand a process, and perhaps also find
flaws, bottlenecks, and other less-obvious features within it. There are
many different types of flowcharts, and each type has its own repertoire
of boxes and notational conventions. The two most common types of
boxes in a flowchart are:

• A processing step, usually called activity, and denoted as a


rectangular box

• A decision usually denoted as a diamond.

Flow chart building blocks

• Symbols

A typical flowchart from older Computer Science textbooks may


have the following kinds of symbols:

• Start and end symbols

Represented as lozenges, ovals or rounded rectangles, usually


containing the word "Start" or "End", or another phrase signaling the
start or end of a process, such as "submit enquiry" or "receive
product".

• Arrows

Showing what's called "flow of control" in computer science. An


arrow coming from one symbol and ending at another symbol
represents that control passes to the symbol the arrow points to.

• Processing steps

Represented as rectangles. Examples: "Add 1 to X"; "replace


identified part"; "save changes" or similar.

• Input/Output

Represented as a parallelogram. Examples: Get X from the user;


display X.

• Conditional or decision

Represented as a diamond (rhombus). These typically contain a


Yes/No question or True/False test. This symbol is unique in that it
has two arrows coming out of it, usually from the bottom point and
right point, one corresponding to Yes or True, and one corresponding
to No or False. The arrows should always be labeled. More than two
arrows can be used, but this is normally a clear indicator that a
complex decision is being taken, in which case it may need to be
broken-down further, or replaced with the "pre-defined process"
symbol.

A number of other symbols have less universal currency, such

as:
• A Document represented as a rectangle with a wavy base;

• A Manual input represented by parallelogram, with the top


irregularly sloping up from left to right. An example would be to
signify data-entry from a form;

• A Manual operation represented by a trapezoid with the longest


parallel side at the top, to represent an operation or adjustment to
process that can only be made manually.

• A Data File represented by a cylinder

Flowcharts may contain other symbols, such as connectors, usually


represented as circles, to represent converging paths in the flow chart.
Circles will have more than one arrow coming into them but only one
going out. Some flow charts may just have an arrow point to another
arrow instead. These are useful to represent an iterative process (what in
Computer Science is called a loop). A loop may, for example, consist of a
connector where control first enters, processing steps, a conditional with
one arrow exiting the loop, and one going back to the connector. Off-
page connectors are often used to signify a connection to a (part of
another) process held on another sheet or screen. It is important to
remember to keep these connections logical in order. All processes should
flow from top to bottom and left to right.
TESTING METHODOLOGY

Software Testing is an empirical investigation conducted to provide


stakeholders with information about the quality of the product or service
under test, with respect to the context in which it is intended to operate.
This includes, but is not limited to, the process of executing a program or
application with the intent of finding software bugs.

Static vs. dynamic testing

There are many approaches to software testing. Reviews, walkthroughs or


inspections are considered as static testing, whereas actually executing
programmed code with a given set of test cases is referred to as dynamic
testing. The former can be, and unfortunately in practice often is,
omitted, whereas the latter takes place when programs begin to be used
for the first time - which is normally considered the beginning of the
testing stage. This may actually begin before the program is 100%
complete in order to test particular sections of code (modules or discrete
functions). For example, Spreadsheet programs are, by their very nature,
tested to a large extent "on the fly" during the build process as the result
of some calculation or text manipulation is shown interactively
immediately after each formula is entered.

Software verification and validation

Software testing is used in association with verification and validation:

• Verification: Have we built the software right (i.e., does it match


the specification?)? It is process based.

• Validation: Have we built the right software (i.e., is this what the
customer wants?)? It is product based.
Testing methods

Software testing methods are traditionally divided into black box testing
and white box testing. These two approaches are used to describe the
point of view that a test engineer takes when designing test cases.

Black box testing

Black box testing treats the software as a black box without any
knowledge of internal implementation. Black box testing methods include
equivalence partitioning, boundary value analysis, all-pairs testing, fuzz
testing, model-based testing, traceability matrix, exploratory testing and
specification-based testing.

Specification-based testing

Specification-based testing aims to test the functionality according to the


requirements. Thus, the tester inputs data and only sees the output from
the test object. This level of testing usually requires thorough test cases
to be provided to the tester who then can simply verify that for a given
input, the output value (or behavior), is the same as the expected value
specified in the test case.

Specification-based testing is necessary but insufficient to guard against


certain risks.

Advantages and disadvantages

The black box tester has no "bonds" with the code, and a tester's
perception is very simple: a code MUST have bugs. Using the principle,
"Ask and you shall receive," black box testers find bugs where
programmers don't. BUT, on the other hand, black box testing is like a
walk in a dark labyrinth without a flashlight, because the tester doesn't
know how the back end was actually constructed.
That's why there are situations when

1. A black box tester writes many test cases to check something that can
be tested by only one test case and/or

2. Some parts of the back end are not tested at all

Therefore, black box testing has the advantage of an unaffiliated opinion


on the one hand and the disadvantage of blind exploring on the other.

White box testing

White box testing, by contrast to black box testing, is when the tester has
access to the internal data structures and algorithms (and the code that
implements these).

Types of white box testing

The following types of white box testing exist:

• Code coverage - creating tests to satisfy some criteria of code


coverage. For example, the test designer can create tests to cause
all statements in the program to be executed at least once.

• Mutation testing methods.

• Fault injection methods.

• Static testing - White box testing includes all static testing.

Code completeness evaluation

White box testing methods can also be used to evaluate the completeness
of a test suite that was created with black box testing methods. This
allows the software team to examine parts of a system that are rarely
tested and ensures that the most important function points have been
tested.
Two common forms of code coverage are:

• function coverage, which reports on functions executed and

• Statement coverage, which reports on the number of lines executed


to complete the test.

They both return a coverage metric, measured as a percentage.


TESTING STRATEGY

A software testing strategy is a well-planned series of steps that results in

the successful construction of the software. It should be able to uncover
errors introduced in the specification, design and coding phases of software
development. A software testing strategy always starts with coding and
moves in an upward direction. Thus a testing strategy can be divided into
four phases:

• Unit Testing : Used for coding

• Integration Testing : Used for design phase

• System Testing : For system engineering

• Acceptance testing : For user acceptance

Unit Testing

In computer programming, unit testing is a method of testing that


verifies that individual units of source code are working properly. A unit is
the smallest testable part of an application. In procedural programming a
unit may be an individual program, function, procedure, etc., while in
object-oriented programming, the smallest unit is a method, which may
belong to a base/super class, abstract class or derived/child class.

Ideally, each test case is independent from the others; test doubles
such as stubs, mocks or fake objects, as well as test harnesses, can be used
to assist in testing a module in isolation. Unit testing is typically done by
software developers to ensure that the code meets software requirements
and behaves as the developer intended.
Benefits

The goal of unit testing is to isolate each part of the program and show
that the individual parts are correct. A unit test provides a strict, written
contract that the piece of code must satisfy. As a result, it affords several
benefits. Unit tests find problems early in the development cycle.
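As a hedged illustration of what a unit test for this project could look
like in C: is_keyword() stands in for one small unit of the lexical
analyzer, and assert-based checking is just one simple way to express the
expected behaviour.

    #include <assert.h>
    #include <string.h>
    #include <stdio.h>

    /* Unit under test: does the string name a keyword known to the lexer? */
    int is_keyword(const char *s)
    {
        static const char *kw[] = { "int", "char", "float", "if", "while" };
        int i;
        for (i = 0; i < (int)(sizeof kw / sizeof kw[0]); i++)
            if (strcmp(s, kw[i]) == 0)
                return 1;
        return 0;
    }

    int main(void)
    {
        /* Each assertion checks one expected behaviour of the unit. */
        assert(is_keyword("int")   == 1);
        assert(is_keyword("while") == 1);
        assert(is_keyword("main")  == 0);   /* identifier, not a keyword */
        printf("all unit tests passed\n");
        return 0;
    }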

Integration Testing

'Integration testing' (sometimes called Integration and Testing,


abbreviated I&T) is the phase of software testing in which individual
software modules are combined and tested as a group. It follows unit
testing and precedes system testing.

Integration testing takes as its input modules that have been unit tested,
groups them in larger aggregates, applies tests defined in an integration
test plan to those aggregates, and delivers as its output the integrated
system ready for system testing.

Purpose

The purpose of integration testing is to verify functional, performance


and reliability requirements placed on major design items. These "design
items", i.e. assemblages (or groups of units), are exercised through their
interfaces using Black box testing, success and error cases being
simulated via appropriate parameter and data inputs. Simulated usage of
shared data areas and inter-process communication is tested and
individual subsystems are exercised through their input interface. Test
cases are constructed to test that all components within assemblages
interact correctly, for example across procedure calls or process
activations, and this is done after testing individual modules, i.e. unit
testing.

The overall idea is a "building block" approach, in which verified


assemblages are added to a verified base which is then used to support
the integration testing of further assemblages.

Some different types of integration testing are big bang, top-down, and
bottom-up.

System Testing

System testing of software or hardware is testing conducted on a


complete, integrated system to evaluate the system's compliance with its
specified requirements. System testing falls within the scope of black box
testing, and as such, should require no knowledge of the inner design of
the code or logic.

As a rule, system testing takes as its input all of the "integrated" software
components that have successfully passed integration testing, as well as the
software system itself integrated with any applicable hardware system(s). The
purpose of integration testing is to detect any inconsistencies between the
software units that are integrated together (called assemblages) or between
any of the assemblages and the hardware. System testing is a more limited
type of testing; it seeks to detect defects both within the
"inter-assemblages" and within the system as a whole.
Acceptance Testing

In engineering and its various sub-disciplines, acceptance testing is
black-box testing performed on a system (e.g. software, lots of manufactured
mechanical parts, or batches of chemical products) prior to its delivery. In
some engineering sub-disciplines, it is known as functional testing,
black-box testing, release acceptance, QA testing, application testing,
confidence testing, final testing, validation testing, usability testing, or
factory acceptance testing.

In most environments, acceptance testing by the system provider is
distinguished from acceptance testing by the customer (the user or client)
prior to accepting transfer of ownership. In such environments, acceptance
testing performed by the customer is known as beta testing, user acceptance
testing (UAT), end user testing, site (acceptance) testing, or field
(acceptance) testing.
System security
One might think that there is little reason to be concerned about security in
an intranet. After all, by definition an intranet is internal to one's
organization; outsiders cannot access it. There are strong arguments for the
position that an intranet should be completely open to its users, with little
or no security.

Information security
Information security means protecting information and information
systems from unauthorized access, use, disclosure, disruption,
modification, or destruction.

The terms information security, computer security and information assurance
are frequently, and incorrectly, used interchangeably. These fields are often
interrelated and share the common goals of protecting the confidentiality,
integrity and availability of information; however, there are some subtle
differences between them.

These differences lie primarily in the approach to the subject, the
methodologies used, and the areas of concentration. Information security is
concerned with the confidentiality, integrity and availability of data
regardless of the form the data may take: electronic, print, or other forms.

Computer security can focus on ensuring the availability and correct
operation of a computer system without concern for the information stored or
processed by the computer.

Security classification for information


An important aspect of information security and risk management is
recognizing the value of information and defining appropriate procedures
and protection requirements for the information. Not all information is
equal and so not all information requires the same degree of protection.
This requires information to be assigned a security classification.

The first step in information classification is to identify a member of
senior management as the owner of the particular information to be
classified. Next, develop a classification policy. The policy should describe
the different classification labels, define the criteria for information to
be assigned a particular label, and list the required security controls for
each classification.

Identification
Identification is an assertion of who someone is or what something is. If a
person makes the statement "Hello, my name is John Doe," they are making a
claim of who they are. However, their claim may or may not be true. Before
John Doe can be granted access to protected information it will be necessary
to verify that the person claiming to be John Doe really is John Doe.

Authentication
Authentication is the act of verifying a claim of identity. When John Doe
goes into a bank to make a withdrawal, he tells the bank teller he is John
Doe (a claim of identity). The bank teller asks to see a photo ID, so he
hands the teller his driver's license. The bank teller checks the license to
make sure it has John Doe printed on it and compares the photograph on
the license against the person claiming to be John Doe. If the photo and
name match the person, then the teller has authenticated that John Doe
is who he claimed to be.
Authorization
Authorization to access information and other computing services begins
with administrative policies and procedures. The policies prescribe what
information and computing services can be accessed, by whom, and
under what conditions. The access control mechanisms are then
configured to enforce these policies.

Different computing systems are equipped with different kinds of access
control mechanisms - some may even offer a choice of different access control
mechanisms. The access control mechanism a system offers will be based upon
one of three approaches to access control, or it may be derived from a
combination of the three approaches.
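
As a hedged illustration only (the user names and permissions below are
invented, and a real mechanism would follow one of the three approaches
mentioned above), a minimal access-control check might look like this:

/* Access-control sketch (invented data): a tiny access control list
   mapping users to the operations they are authorised to perform. */
#include<stdio.h>
#include<string.h>

struct acl_entry { const char *user; const char *permission; };

static const struct acl_entry acl[] = {
    { "john",  "read"  },
    { "admin", "write" },
};

/* Returns 1 only if the (user, permission) pair appears in the list. */
int is_authorised(const char *user, const char *permission)
{
    int k;
    for(k = 0; k < 2; k++)
        if(strcmp(acl[k].user, user) == 0 &&
           strcmp(acl[k].permission, permission) == 0)
            return 1;
    return 0;
}

int main(void)
{
    printf("john write: %s\n", is_authorised("john", "write") ? "allowed" : "denied");
    printf("john read : %s\n", is_authorised("john", "read")  ? "allowed" : "denied");
    return 0;
}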
Implementation & maintenance

Implementation
The final phase of the development process is the implementation of the new
system. This phase is the culmination of the previous phases and is performed
only after each of the prior phases has been successfully completed to the
satisfaction of both the user and quality assurance. The tasks that comprise
the implementation phase include the installation of hardware, proper
scheduling of the resources needed to put the system into production, and a
complete set of instructions that support both the users and the IS
environment.

Coding
This means that program construction from the procedural specification has
finished and the coding of the program begins:

Once the design phase was over, coding commenced.

Coding is a natural consequence of design.

The coding step translates a detailed design representation of the software
into a programming-language realization.

The main emphasis while coding was on style, so that the end result was
optimized code.

The following points were kept in consideration while coding.


Coding style
The structured programming method was used in all the modules of the project.

It incorporated the following features.

The code has been written so that the definition and implementation of each
function are contained in one file.

A group of related functions was clubbed together in one file so that it
could be included when needed, saving us the labor of writing the code again
and again.

Naming convention
As the project size grows, so does the complexity of recognizing the purpose
of each variable. Thus the variables were given meaningful names, which help
in understanding the context and the purpose of the variable.

The functions were also given meaningful names that can be easily understood
by the user.
Indentation
Judicious use of indentation can make the task of reading and understanding a
program much simpler. Indentation is an essential part of a good program. If
code is indented without thought, it will seriously affect the readability of
the program.

The higher-level statements, like the definitions of variables, constants and
functions, are indented, with each nested block indented further, stating
their purpose in the code.

A blank line is also left between function definitions to make the code look
neat.

A comment for each source file stating the purpose of the file is also
provided.
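
The fragment below is an invented example, not taken from the project code,
showing the naming and indentation conventions described above:

#include<stdio.h>
#include<ctype.h>

#define MAX_TOKEN_LENGTH 30

/* Meaningful names: the function says what it counts, the variables say
   what they hold; each nested block is indented one further level. */
int count_identifiers(char tokens[][MAX_TOKEN_LENGTH], int token_count)
{
    int identifier_count = 0;
    int index;

    for(index = 0; index < token_count; index++)
    {
        /* an identifier starts with a letter or an underscore */
        if(isalpha(tokens[index][0]) || tokens[index][0] == '_')
            identifier_count++;
    }
    return identifier_count;
}

int main(void)
{
    char tokens[3][MAX_TOKEN_LENGTH] = { "x1", "123", "_tmp" };
    printf("identifiers found: %d\n", count_identifiers(tokens, 3));
    return 0;
}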
Maintenance
Maintenance testing is testing performed either to identify equipment
problems, to diagnose equipment problems, or to confirm that repair measures
have been effective. It can be performed at the system level (e.g., the HVAC
system), the equipment level (e.g., the blower in an HVAC line), or the
component level (e.g., a control chip in the control box for the blower in
the HVAC line).

Preventive maintenance

Preventive maintenance is the care and servicing by personnel for the purpose
of maintaining equipment and facilities in satisfactory operating condition,
by providing for systematic inspection, detection, and correction of
incipient failures either before they occur or before they develop into major
defects.

Such maintenance, including tests, measurements, adjustments, and parts
replacement, is performed specifically to prevent faults from occurring.

To make it simple:

Preventive maintenance is conducted to keep equipment working and/or extend
the life of the equipment.

Corrective maintenance, sometimes called "repair", is conducted to get
equipment working again.
The primary goal of maintenance is to avoid or mitigate the consequences of
equipment failure. This may be done by preventing the failure before it
actually occurs, which preventive maintenance and condition-based maintenance
help to achieve. It is designed to preserve and restore equipment reliability by
replacing worn components before they actually fail. Preventive
maintenance activities include partial or complete overhauls at specified
periods, oil changes, lubrication and so on. In addition, workers can
record equipment deterioration so they know to replace or repair worn
parts before they cause system failure. The ideal preventive maintenance
program would prevent all equipment failure before it occurs.

Corrective maintenance
The idle time for production machines in a factory is mainly due to the
following reasons:

Lack of materials

Machine fitting, cleaning, tools replacement etc.

Breakdowns

Considering only the breakdown idle time, it can be split into the following
components:

Operator's inspection time - the time required by the machine operator to
check the machine in order to detect the reason for the breakdown, before
calling the maintenance department.

Operator's repairing time - the time required by the machine operator to fix
the machine himself, in case he is able to do so.

Maintenance dead time - the time lost by the machine operator waiting for the
machine to be repaired by maintenance personnel, from the time they start the
repair until the moment they finish their task.

In the corrective environment the system has been conceived to reduce the
breakdown detection and diagnosis times and to supply the information
required to perform the repair operations.

Different sensors are connected to every machine in the workshop to detect
any change in the various parameters when they run out of their normal
performance range or when a shutdown occurs.
EVALUATION
Lexical analyzer converts stream of input characters into a stream of
tokens. The different tokens that our lexical analyzer identifies are as
follows:

KEYWORDS: int, char, float, double, if, for, while, else, switch, struct,
printf, scanf, case, break, return, typedef, void

IDENTIFIERS: main, fopen, getch etc

NUMBERS: positive and negative integers, positive and negative floating point
numbers.

OPERATORS: +, ++, -, --, ||, *, ?, /, >, >=, <, <=, =, ==, &, &&.

BRACKETS: [ ], { }, ( ).

STRINGS: Set of characters enclosed within the quotes

COMMENT LINES: Ignores single line, multi line comments

For tokenizing into identifiers and keywords we incorporate a symbol table
which initially consists of predefined keywords. The tokens are read from an
input file. If the encountered token is an identifier or a keyword the
lexical analyzer will look up in the symbol table to check the existence of
the respective token. If an entry does exist then we proceed to the next
token. If not then that particular token along with the token value is
written into the symbol table. The rest of the tokens are directly displayed
by writing into an output file.

The output file will consist of all the tokens present in our input file along
with their respective token values.
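
A minimal sketch of this symbol-table behaviour is given below; the table
layout, its size and the token values are assumptions made for illustration,
since the program listed in the next section implements the scanner as a
state machine rather than through an explicit table.

/* Symbol-table sketch (assumed layout): the table is pre-loaded with
   keywords; an identifier that is not yet present is appended with the
   next free token value, as described above. */
#include<stdio.h>
#include<string.h>

#define TABLE_SIZE 100
#define NAME_LEN   30

struct entry { char name[NAME_LEN]; int token_value; };

static struct entry table[TABLE_SIZE] = {
    { "int", 1 }, { "char", 2 }, { "float", 3 }, { "if", 4 }, { "for", 5 }
};
static int entries = 5;   /* number of pre-loaded keywords */

/* Returns the token value, inserting the lexeme if it is not present. */
int lookup_or_insert(const char *lexeme)
{
    int k;
    for(k = 0; k < entries; k++)
        if(strcmp(table[k].name, lexeme) == 0)
            return table[k].token_value;

    strcpy(table[entries].name, lexeme);
    table[entries].token_value = entries + 1;
    return table[entries++].token_value;
}

int main(void)
{
    printf("for  -> %d\n", lookup_or_insert("for"));    /* existing keyword */
    printf("main -> %d\n", lookup_or_insert("main"));   /* new identifier   */
    return 0;
}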
CODE
/* Program to make lexical analyzer that generates the tokens......

   Created by: Ankita Verma, Harshi Yadav, Sameeksha Chauhan */

#include<stdio.h>
#include<conio.h>
#include<string.h>
#include<ctype.h>

#define MAX 30

void main()
{
    char str[MAX];
    int state=0;
    int i=0, j, startid=0, endid, startcon, endcon;

    clrscr();

    for(j=0; j<MAX; j++)
        str[j]=NULL;                              //Initialise to NULL

    printf("*** Program on Lexical Analysis ***");
    printf("\n\nEnter the string: ");
    gets(str);                                    //Accept input string
    str[strlen(str)]=' ';

    printf("\n\nAnalysis:");

    while(str[i]!=NULL)
    {
        while(str[i]==' ')                        //To eliminate spaces
            i++;

        switch(state)
        {
            case 0: if(str[i]=='i') state=1;               //if
                    else if(str[i]=='w') state=3;          //while
                    else if(str[i]=='d') state=8;          //do
                    else if(str[i]=='e') state=10;         //else
                    else if(str[i]=='f') state=14;         //for
                    else if(isalpha(str[i]) || str[i]=='_')
                    {
                        state=17;                          //identifiers
                        startid=i;
                    }
                    else if(str[i]=='<') state=19;         //relational '<' or '<='
                    else if(str[i]=='>') state=21;         //relational '>' or '>='
                    else if(str[i]=='=') state=23;         //relational '==' or assignment '='
                    else if(isdigit(str[i]))
                    {
                        state=25;                          //constant
                        startcon=i;
                    }
                    else if(str[i]=='(') state=26;         //special character '('
                    else if(str[i]==')') state=27;         //special character ')'
                    else if(str[i]==';') state=28;         //special character ';'
                    else if(str[i]=='+') state=29;         //operator '+'
                    else if(str[i]=='-') state=30;         //operator '-'
                    break;

            //States for 'if'
            case 1: if(str[i]=='f') state=2;
                    else { state=17; startid=i-1; i--; }
                    break;

            case 2: if(str[i]=='(' || str[i]==NULL)
                    {
                        printf("\n\nif : Keyword");
                        state=0;
                        i--;
                    }
                    else { state=17; startid=i-2; i--; }
                    break;

            //States for 'while'
            case 3: if(str[i]=='h') state=4;
                    else { state=17; startid=i-1; i--; }
                    break;

            case 4: if(str[i]=='i') state=5;
                    else { state=17; startid=i-2; i--; }
                    break;

            case 5: if(str[i]=='l') state=6;
                    else { state=17; startid=i-3; i--; }
                    break;

            case 6: if(str[i]=='e') state=7;
                    else { state=17; startid=i-4; i--; }
                    break;

            case 7: if(str[i]=='(' || str[i]==NULL)
                    {
                        printf("\n\nwhile : Keyword");
                        state=0;
                        i--;
                    }
                    else { state=17; startid=i-5; i--; }
                    break;

            //States for 'do'
            case 8: if(str[i]=='o') state=9;
                    else { state=17; startid=i-1; i--; }
                    break;

            case 9: if(str[i]=='{' || str[i]==' ' || str[i]==NULL || str[i]=='(')
                    {
                        printf("\n\ndo : Keyword");
                        state=0;
                        i--;
                    }
                    break;

            //States for 'else'
            case 10: if(str[i]=='l') state=11;
                     else { state=17; startid=i-1; i--; }
                     break;

            case 11: if(str[i]=='s') state=12;
                     else { state=17; startid=i-2; i--; }
                     break;

            case 12: if(str[i]=='e') state=13;
                     else { state=17; startid=i-3; i--; }
                     break;

            case 13: if(str[i]=='{' || str[i]==NULL)
                     {
                         printf("\n\nelse : Keyword");
                         state=0;
                         i--;
                     }
                     else { state=17; startid=i-4; i--; }
                     break;

            //States for 'for'
            case 14: if(str[i]=='o') state=15;
                     else { state=17; startid=i-1; i--; }
                     break;

            case 15: if(str[i]=='r') state=16;
                     else { state=17; startid=i-2; i--; }
                     break;

            case 16: if(str[i]=='(' || str[i]==NULL)
                     {
                         printf("\n\nfor : Keyword");
                         state=0;
                         i--;
                     }
                     else { state=17; startid=i-3; i--; }
                     break;

            //States for identifiers
            case 17: if(isalnum(str[i]) || str[i]=='_')
                     {
                         state=18;
                         i++;
                     }
                     else if(str[i]==NULL || str[i]=='<' || str[i]=='>' || str[i]=='(' ||
                             str[i]==')' || str[i]==';' || str[i]=='=' || str[i]=='+' ||
                             str[i]=='-') state=18;
                     i--;
                     break;

            case 18: if(str[i]==NULL || str[i]=='<' || str[i]=='>' || str[i]=='(' ||
                        str[i]==')' || str[i]==';' || str[i]=='=' || str[i]=='+' ||
                        str[i]=='-')
                     {
                         endid=i-1;
                         printf("\n\n");
                         for(j=startid; j<=endid; j++)
                             printf("%c", str[j]);
                         printf(" : Identifier");
                         state=0;
                         i--;
                     }
                     break;

            //States for relational operator '<' & '<='
            case 19: if(str[i]=='=') state=20;
                     else if(isalnum(str[i]) || str[i]=='_')
                     {
                         printf("\n\n< : Relational operator");
                         i--;
                         state=0;
                     }
                     break;

            case 20: if(isalnum(str[i]) || str[i]=='_')
                     {
                         printf("\n\n<= : Relational operator");
                         i--;
                         state=0;
                     }
                     break;

            //States for relational operator '>' & '>='
            case 21: if(str[i]=='=') state=22;
                     else if(isalnum(str[i]) || str[i]=='_')
                     {
                         printf("\n\n> : Relational operator");
                         i--;
                         state=0;
                     }
                     break;

            case 22: if(isalnum(str[i]) || str[i]=='_')
                     {
                         printf("\n\n>= : Relational operator");
                         i--;
                         state=0;
                     }
                     break;

            //States for relational operator '==' & assignment operator '='
            case 23: if(str[i]=='=') state=24;
                     else
                     {
                         printf("\n\n= : Assignment operator");
                         i--;
                         state=0;
                     }
                     break;

            case 24: if(isalnum(str[i]))
                     {
                         printf("\n\n== : Relational operator");
                         state=0;
                         i--;
                     }
                     break;

            //States for constants
            case 25: if(isalpha(str[i]))
                     {
                         printf("\n\n*** ERROR ***\n");
                         puts(str);
                         for(j=0; j<i; j++)
                             printf(" ");
                         printf("^");
                         printf("\n\nError at position %d\nAlphabet cannot follow digit", i);
                         state=99;
                     }
                     else if(str[i]=='(' || str[i]==')' || str[i]=='<' || str[i]=='>' ||
                             str[i]==NULL || str[i]==';' || str[i]=='=')
                     {
                         endcon=i-1;
                         printf("\n\n");
                         for(j=startcon; j<=endcon; j++)
                             printf("%c", str[j]);
                         printf(" : Constant");
                         state=0;
                         i--;
                     }
                     break;

            //State for special character '('
            case 26: printf("\n\n( : Special character");
                     startid=i;
                     state=0;
                     i--;
                     break;

            //State for special character ')'
            case 27: printf("\n\n) : Special character");
                     state=0;
                     i--;
                     break;

            //State for special character ';'
            case 28: printf("\n\n; : Special character");
                     state=0;
                     i--;
                     break;

            //State for operator '+'
            case 29: printf("\n\n+ : Operator");
                     state=0;
                     i--;
                     break;

            //State for operator '-'
            case 30: printf("\n\n- : Operator");
                     state=0;
                     i--;
                     break;

            //Error state
            case 99: goto END;
        }
        i++;
    }

    printf("\n\nEnd of program");

END:
    getch();
}

/* Output

Correct input

-------------

*** Program on Lexical Analysis ***

Enter the string: for(x1=0; x1<=10; x1++);

Analysis:

for : Keyword

( : Special character

x1 : Identifier

= : Assignment operator

0 : Constant

; : Special character

x1 : Identifier
<= : Relational operator

10 : Constant

; : Special character

x1 : Identifier

+ : Operator

+ : Operator

) : Special character

; : Special character

End of program

Wrong input

-----------

*** Program on Lexical Analysis ***

Enter the string: for(x1=0; x1<=19x; x++);

Analysis:
for : Keyword

( : Special character

x1 : Identifier

= : Assignment operator

0 : Constant

; : Special character

x1 : Identifier

<= : Relational operator

Token cannot be generated

*/
ADVANTAGES AND DISADVANTAGES OF LEXICAL ANALYZER

ADVANTAGES
• Easier and faster development.
• More efficient and compact code.

DISADVANTAGES

• Has to be written by hand.
• Development is complicated.
CONCLUSION
Lexical analysis is a stage in the compilation of any program. In this phase
we generate tokens from the input stream of data. To perform this task we
need a lexical analyzer.

So we are designing a lexical analyzer that will generate tokens from the
given input.

In the end, we would really like to thank our H.O.D. Mr. Sudhir Pathak from
the bottom of our hearts for giving us such a fruitful opportunity to enhance
our technical skills.
REFERENCE
• www.google.co.in

• www.wikipedia.com

• Let Us C : Yashwant Kanetkar

• Software Engineering : Roger Pressman

• System Software Engineering : D. S. Dhamdhere
