
PUNE INSTITUTE OF COMPUTER TECHNOLOGY

DHANKAWADI, PUNE – 43.

LAB MANUAL

ACADEMIC YEAR: 2010-2011

DEPARTMENT: COMPUTER ENGINEERING

CLASS: B.E SEMESTER: I

SUBJECT: COMPUTER LABORATORY-I

INDEX OF LAB EXPERIMENTS

EXPT. NO.    PROBLEM STATEMENT    REVISED ON
PART I: Principles of Compiler Design

1 Write a LEX program to count the number of characters, words and lines 02/07/2010
and remove the C and C++ comments from a given input text file.
Create an output text file that consists of the contents of the input file
with line numbers, and display the total number of characters, words and lines.
2 Implement a lexical analyser for a subset of C using LEX. 22/06/2009
Implementation should support error handling.
3 Implement a natural language parser using LEX & YACC. 22/06/2009
4 Write an ambiguous CFG to recognise an infix expression and 22/06/2009
implement a parser that recognises the infix expression using YACC.
Provide the details of all conflicting entries in the parser table
generated by LEX and YACC and how they have been resolved.
(can take calculator as an application)
5 Write an attributed translation grammar to recognise declarations of 22/06/2009
simple variables, “for”, assignment, if, if-else statements as per syntax
of C or Pascal and generate equivalent three address code for the
given input made up of constructs mentioned above using LEX and
YACC. Write a code to store the identifiers from the input in a symbol
table and also to record other relevant information about the
identifiers. Display all records stored in the symbol table.
6 Laboratory Project 22/06/2009
For a small subset of C with essential programming constructs, write a
compiler using LEX and YACC (to be carried out in a group of 4 to 6
students).

PART II: Operating System 22/06/2009
7 Study of Various Commands in Unix/Linux 22/06/2009
This assignment includes:
General commands like grep, locate, chmod, chown, ls, cp etc.
It also includes the various system calls like:
File System Calls: read(), write(), open() etc.
Process System Calls: fork(), execv(), execl() etc.
Inter-process System Calls: pipe(), popen(), mkfifo(), signal() etc.
Each command should be written as per the format specified below.

COMMAND NAME   command name
FORMAT         command [option(s)] argument(s)
DESCRIPTION    A brief description of what the command does.
OPTIONS        A list of the most useful options and a brief description of each.
ARGUMENTS      Mandatory or optional arguments.
EXAMPLE        A simple example of how to use the command.
8 Using the fork system call, create a child process, suspend it using the wait 22/06/2009
system call and transfer it into the zombie state.
9 Write a program for Client-Server communication using the following 22/06/2009
inter-process communication mechanisms.
1. Unnamed pipe
2. Named pipe
3. Semaphore (General)
10 File management using low level file access system calls such as 22/06/2009
write, read, open, lseek, fstat
11 Implement an Alarm clock application using signals 22/06/2009
12 Create a program which has three threads: 22/06/2009
1. Display Seconds 2. Display Minutes 3. Display Hours.
13 Write and insert a module in the Linux kernel. 22/06/2009
PART III: Design and Analysis of Algorithm 22/06/2009
Note: Compute the time and space complexity for the following
assignments.

14 Implement using the divide and conquer strategy (any one) 22/06/2009


• Merge Sort and Randomized Quick Sort (recursive and non-recursive),
and compare the recursive and non-recursive versions
• Multiplication of two 'n'-bit numbers, where 'n' is a power of 2.

15 Implement using Greedy Method 22/06/2009
Minimal spanning tree/Job scheduling

16 Find shortest path for multistage graph problem (single source shortest 22/06/2009
path and all pair shortest path)
17 Implement the 0/1 Knapsack problem using Dynamic Programming, 22/06/2009
Backtracking and Branch & Bound strategies.
Analyse the problem with all three methods.
18 Implement with Backtracking (any one) 22/06/2009
1. Implement ‘n’ queens problem with backtracking. Calculate
the no. of solutions and no. of nodes generated in the state
space tree
2. For the assignment problem of ‘n’ people to ‘n’ jobs with cost
of assigning as C( i, j), find the optimal assignment of every
job to a person with minimum cost.
19 Implement the following using Branch and Bound 22/06/2009
Traveling salesperson problem

Head of Department Subject Coordinator


(Computer Engineering) (Prof. Archana Ghotkar)

STUDENT ACTIVITY FLOW-CHART

START

1. Get imparted knowledge from the lab teacher.

2. Design the application.

3. Consult the lab teacher. If the design is not accepted, make the suggested
modifications and go back to step 2; if it is accepted, continue.

4. Write the program, then execute and test it for different inputs.

5. Demonstrate it to the lab teacher for different inputs. If the work is not
accepted as complete, make the suggested modifications and go back to step 4.

END

Revised on: 02/07/2010
TITLE: Using LEX

PROBLEM STATEMENT/DEFINITION: Write a LEX program to count the number of characters, words and lines in a given input text file. Create an output text file that consists of the contents of the input file as well as line numbers.

OBJECTIVE:
• Understand the importance and usage of the LEX automated tool

S/W PACKAGES AND HARDWARE APPARATUS USED: Windows 2000; PC with the configuration: Pentium IV 1.7 GHz, 128 MB RAM, 40 GB HDD, 15'' color monitor, keyboard, mouse; Linux with support for the LEX utility

REFERENCES:
1. A. V. Aho, R. Sethi, J. D. Ullman, "Compilers: Principles, Techniques, and Tools", Pearson Education, ISBN 81-7758-590-8
2. J. R. Levine, T. Mason, D. Brown, "Lex & Yacc", O'Reilly, 2000, ISBN 81-7366-061-X
3. K. Louden, "Compiler Construction: Principles and Practice", Thomson Brooks/Cole (ISE), 2003, ISBN 981-243-694-4

STEPS: Refer to student activity flow chart, theory, algorithm, test input, test output

INSTRUCTIONS FOR WRITING JOURNAL:
• Title
• Problem Definition
• Theory
• Source Code
• Output
• Conclusion

Theory:

Lex is a tool for generating programs that perform pattern-matching on text. It is a tool
for generating scanners, i.e. programs which recognize lexical patterns in text. The
description of the scanner takes the form of pairs of regular expressions and C code,
called rules. Lex generates as output a C source file called lex.yy.c.

Format of the input file :

The general format of lex source shall be:

Definitions
%%
Rules
%%
UserSubroutines

• The definition section is the place to define macros and to import header files
written in C. It is also possible to write any C code here, which will be copied
verbatim into the generated source file.
• The rules section is the most important section; it associates patterns with C
statements. Patterns are simply regular expressions. When the lexer sees some text in
the input matching a given pattern, it executes the associated C code. This is the
basis of how lex operates.
• The C code section contains C statements and functions that are copied verbatim
to the generated source file. These statements presumably contain code called by the
rules in the rules section. In large programs it is more convenient to place this code
in a separate file and link it in at compile time.

How the input is matched :

When the generated scanner is run, it analyzes its input looking for strings which match
any of its patterns. If it finds more than one match, it takes the one matching the most text
(for trailing context rules, this includes the length of the trailing part, even though it will
then be returned to the input). If it finds two or more matches of the same length, the rule
listed first in the flex input file is chosen.
Once the match is determined, the text corresponding to the match (called the token) is
made available in the global character pointer yytext, and its length in the global
integer yyleng. The action corresponding to the matched pattern is then executed (a
more detailed description of actions follows), and then the remaining input is scanned for
another match.
If no match is found, then the default rule is executed: the next character in the input is
considered matched and copied to the standard output. Thus, the simplest legal flex
input is a file containing nothing but

%%

which generates a scanner that simply copies its input (one character at a time) to its
output.

Actions in lex :
The action to be taken when an ERE is matched can be a C program fragment or the
special actions described below; the program fragment can contain one or more C
statements, and can also include special actions.
Four special actions shall be available:

|        The action '|' means that the action for the next rule is the action for
         this rule.

ECHO     Write the contents of the string yytext on the output.

REJECT   Usually only a single expression is matched by a given string in the input.
         REJECT means "continue to the next expression that matches the current
         input", and shall cause whatever rule was the second choice after the
         current rule to be executed for the same input. Thus, multiple rules can
         be matched and executed for one input string or overlapping input strings.

BEGIN    The action BEGIN newstate; switches the state (start condition) of the
         scanner to newstate, so that subsequent input is matched against the
         rules qualified with that start condition.

Algorithm

1. Write a lex input file and save it as <filename.l>


2. Generate a C file using the command 'lex <filename.l>'. This creates a .c file
named lex.yy.c.
3. Compile the .c file using the command 'gcc lex.yy.c -lfl'. The -lfl option links
the lex library.
4. Execute the program using the command './a.out <arguments to be passed>'.
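
As a concrete illustration of the steps above, here is a minimal sketch of such a lex
specification (assumed to be saved as first.l; the counting rules are one possible
arrangement, and the comment-handling rules of the full assignment are omitted for
brevity):

%{
#include <stdio.h>
int chars = 0, words = 0, lines = 0;       /* global counters */
%}
%%
\n          { lines++; chars++; }          /* count each newline */
[^ \t\n]+   { words++; chars += yyleng; }  /* a run of non-blanks is a word */
.           { chars++; }                   /* any other single character */
%%
int main(int argc, char *argv[])
{
    if (argc > 1)
        yyin = fopen(argv[1], "r");        /* scan the file named on the command line */
    yylex();
    printf("Total characters = %d, words = %d, lines = %d\n", chars, words, lines);
    return 0;
}

Building it with 'lex first.l' and 'gcc lex.yy.c -lfl', then running './a.out myinput.c',
should print the three totals for the given file.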

Test Input:
[root@localhost Lex&Yacc]# lex first.l
[root@localhost Lex&Yacc]# cc lex.yy.c -ll
[root@localhost Lex&Yacc]# ./a.out myinput.c

//myinput.c---

/* hello world!!!! this is a program to test my first lex assignment


*/

#include<stdio.h>
#include<conio.h>

main()
{
// this is a single line comment
int num,i;
printf("\nEnter the number: ");
scanf("%d",&num);
if(num<10)
printf("\nNumber Less than 10!!!");
else
printf("\nNumber Greater than 10!!!");
return(0);
}

Test Output:

/* HELLO WORLD!!!! THIS IS A PROGRAM TO TEST MY FIRST LEX


ASSIGNMENT
PICT
*/

#include<stdio.h>
#include<conio.h>

main()
{
// THIS IS A SINGLE LINE COMMENT
int num,i;
printf("\nEnter the number: ");
scanf("%d",&num);
if(num<10)
printf("\nNumber Less than 10!!!");
else
printf("\nNumber Greater than 10!!!");
return(0);
}


Revised on: 22/06/2009

TITLE: Lexical analyzer using LEX

PROBLEM STATEMENT/DEFINITION: Implement a lexical analyser for a subset of C using LEX. The implementation should support error handling.

OBJECTIVE:
• Appreciate the role of lexical analysis in compilation
• Understand the theory behind the design of lexical analyzers and lexical analyzer generators
• Be able to use LEX to generate lexical analyzers

S/W PACKAGES AND HARDWARE APPARATUS USED: Windows 2000; PC with the configuration: Pentium IV 1.7 GHz, 128 MB RAM, 40 GB HDD, 15'' color monitor, keyboard, mouse; Linux with the LEX utility

REFERENCES:
1. A. V. Aho, R. Sethi, J. D. Ullman, "Compilers: Principles, Techniques, and Tools", Pearson Education, ISBN 81-7758-590-8
2. J. R. Levine, T. Mason, D. Brown, "Lex & Yacc", O'Reilly, 2000, ISBN 81-7366-061-X
3. K. Louden, "Compiler Construction: Principles and Practice", Thomson Brooks/Cole (ISE), 2003, ISBN 981-243-694-4

STEPS: Refer to student activity flow chart, theory, algorithm, test input, test output

INSTRUCTIONS FOR WRITING JOURNAL:
• Title
• Problem Definition
• Theory
• Source Code
• Output
• Conclusion

Theory:

Regular Expressions in lex

"..." - Any string enclosed in double-quotes shall represent the characters within the
double-quotes as themselves, except that backslash escapes.
<state>r, <state1,state2,...>r - The regular expression r shall be matched only when the
program is in one of the start conditions indicated by
state, state1, and so on.
r/x - The regular expression r shall be matched only if it is followed by an occurrence of
regular
expression x

* - matches zero or more occurrences of the preceding expressions


[] - a character class which matches any character within the brackets. If the first
character is a
circumflex '^' it changes the meaning to match any character except the ones within
brackets.
^ - matches the beginning of a line as the first characters of a regular expression. Also
used for negation
within square brackets.
{} - Indicates how many times the previous pattern is allowed to match, when containing
one or two
numbers.
$ - matches the end of a line as the last character of a regular expressions
\ - used to escape metacharacters and a part of usual c escape sequences e.g '\n' is a
newline character
while '\*' is a literal asterisk.
+ - matches one more occurrences of the preceding regular expression.
| - matches either the prceding regular expression or the following regular expression.
() - groups a series of regular expressions together, into a new regular expression.

Examples :

DIGIT [0-9]+
IDENTIFIER [a-zA-Z][a-zA-Z0-9]*

The functions or macros that are accessible to user code :

int yylex(void)

Performs lexical analysis on the input; this is the primary function generated
by the lex utility. The function shall return zero when the end of input is
reached; otherwise, it shall
return non-zero values (tokens) determined by the actions that are selected.

int yymore(void)

When called, indicates that when the next input string is recognized, it is to be
appended to the current value of yytext rather than replacing it; the value in
yyleng shall be adjusted accordingly.

int yyless(int n)

Retains n initial characters in yytext, NUL-terminated, and treats the remaining
characters as if they had not been read; the value in yyleng shall be adjusted
accordingly.

int yywrap(void)

Called by yylex() at end-of-file; the default yywrap() shall always return 1. If
the application requires yylex() to continue processing with another source of
input, then the application can include a function yywrap() which associates
another file with the external variable FILE *yyin and shall return a value of zero.

int main(int argc, char *argv[])

Calls yylex() to perform lexical analysis, then exits. The user code can contain
main() to perform application-specific operations, calling yylex() as applicable.

Algorithm :

1. Accept a C input filename as a command line argument.


2. Separate the tokens as identifiers, constants, keywords etc. and fill the generic
symbol table.
3. Check for syntax errors of expressions and give error messages if needed.
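
A minimal sketch of how such a lexer might look is given below (the token classes and
messages are illustrative assumptions; the symbol-table filling and error checks of
steps 2 and 3 are reduced to simple printouts):

%{
#include <stdio.h>
%}
DIGIT   [0-9]
ID      [a-zA-Z_][a-zA-Z0-9_]*
%%
"int"|"char"|"if"|"else"|"while"|"return"  { printf("KEYWORD    : %s\n", yytext); }
{DIGIT}+                                   { printf("CONSTANT   : %s\n", yytext); }
{ID}                                       { printf("IDENTIFIER : %s\n", yytext); }
"=="|"!="|"<="|">="|[-+*/=<>;,(){}]        { printf("OPERATOR   : %s\n", yytext); }
[ \t\n]+                                   ;  /* skip whitespace */
.                                          { printf("ERROR: illegal character '%s'\n", yytext); }
%%
int main(int argc, char *argv[])
{
    if (argc > 1)
        yyin = fopen(argv[1], "r");   /* lex the C file given as argument */
    yylex();
    return 0;
}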

Test Input:

#include<stdio.h>
void main()
{
int a,b;
char b;
a=3;b=5;c=d;
c=a+d;
}

Test Output:

#include<stdio.h>
void main()
{
int a,b,c;
char b; Error : Redefinition of symbol b;
a=3;b=5;c=d;
c=a+d; Error : Undefined symbol d;
}

Symbol Table :a int 3, b int 5 , c int NULL

Revised on: 22/06/2009

TITLE: Natural Language Parser

PROBLEM STATEMENT/DEFINITION: Implement a natural language parser using LEX & YACC.

OBJECTIVE:
• Be proficient in writing grammars to specify syntax
• Understand the theories behind different parsing strategies, their strengths and limitations
• Understand how the generation of parsers can be automated
• Be able to use YACC to generate parsers

S/W PACKAGES AND HARDWARE APPARATUS USED: Windows 2000; PC with the configuration: Pentium IV 1.7 GHz, 128 MB RAM, 40 GB HDD, 15'' color monitor, keyboard, mouse; Linux with the LEX utility

REFERENCES:
1. A. V. Aho, R. Sethi, J. D. Ullman, "Compilers: Principles, Techniques, and Tools", Pearson Education, ISBN 81-7758-590-8
2. J. R. Levine, T. Mason, D. Brown, "Lex & Yacc", O'Reilly, 2000, ISBN 81-7366-061-X
3. K. Louden, "Compiler Construction: Principles and Practice", Thomson Brooks/Cole (ISE), 2003, ISBN 981-243-694-4

STEPS: Refer to student activity flow chart, theory, algorithm, test input, test output

INSTRUCTIONS FOR WRITING JOURNAL:
• Title
• Problem Definition
• Theory
• Source Code
• Output
• Conclusion

Theory:

Lex recognises regular expressions, whereas YACC recognises an entire grammar. Lex
divides the input stream into tokens, while YACC takes these tokens and groups them
together logically.

The syntax of a YACC file :

%{
declaration section
%}
rules section
%%
user defined functions

Declaration section :
Here, the definitions section is the same as that of lex, where we can define all tokens and
include header files. The declarations section is used to define the symbols used to define
the target language and their relationship with each other. In particular, much of the
additional information required to resolve ambiguities in the context-free grammar for the
target language is provided here.

Grammar Rules in yacc


The rules section defines the context-free grammar to be accepted by the function
yacc generates, and associates with those rules C-language actions and additional
precedence information. The grammar is described below, and a formal definition
follows.

The rules section consists of one or more grammar rules. A grammar rule has
the form:

A : BODY ;

The symbol A represents a non-terminal name, and BODY represents a sequence of


zero or more names, literals, and semantic actions that can then be followed by
optional precedence rules. Only the names and literals participate in the formation of
the grammar; the semantic actions and precedence rules are used in other ways. The
colon and the semicolon are yacc punctuation. If there are several successive grammar
rules with the same left-hand side, the vertical bar ’|’ can be used to avoid rewriting
the left-hand side; in this case the semicolon appears only after the last rule. The BODY
part can be empty.

Programs Section
The programs section can include the definition of the lexical analyzer yylex(),

and any other functions; for example, those used in the actions specified in the
grammar rules. It is unspecified whether the programs section precedes or follows
the semantic actions in the output file; therefore, if the application contains any macro
definitions and declarations intended to apply to the code in the semantic actions, it
shall place them within "%{ ... %}" in the declarations section.

Interface to the Lexical Analyzer


The yylex() function is an integer-valued function that returns a token number
representing the kind of token read. If there is a value associated with the
token returned by yylex() (see the discussion of tag above), it shall be
assigned to the external variable yylval.

Running lex and yacc :

• Generate the y.tab.c and y.tab.h files using 'yacc -d <filename.y>'


• Generate a .c file using 'lex <filename.l>'
• Link the two files together using 'gcc y.tab.c lex.yy.c -lfl'

Algorithm :

1. Accept a sentence from the user


2. Write lex code to separate out the tokens.
3. Write YACC code to check the validity of the sentence.
4. If valid print 'accepted' else 'rejected'
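
A minimal sketch of the YACC side of such a parser follows (the token names and the tiny
subject-verb-object grammar are illustrative assumptions; a companion lex file would look
words up in small word lists and return PRONOUN, VERB, ARTICLE, NOUN or ENDMARK):

%{
#include <stdio.h>
int yylex(void);
void yyerror(const char *s) { printf("Statement is wrong\n"); }
%}
%token PRONOUN VERB ARTICLE NOUN ENDMARK
%%
sentence : subject VERB object ENDMARK  { printf("Statement is correct\n"); }
         ;
subject  : PRONOUN
         | ARTICLE NOUN
         ;
object   : NOUN
         | ARTICLE NOUN
         ;
%%
int main(void)
{
    return yyparse();   /* parse one sentence from standard input */
}

For the test sentence "I am a boy.", the lexer would return PRONOUN VERB ARTICLE NOUN
ENDMARK, which matches the sentence rule; "I boy am." matches no rule, so yyerror()
reports the statement as wrong.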

Test Input:
I am a boy.

Test Output:
Statement is correct

Test Input:
I boy am.

Test Output:
Statement is wrong

Revised on: 22/06/2009

TITLE: Calculator program handling ambiguous grammar using LEX & YACC

PROBLEM STATEMENT/DEFINITION: Write an ambiguous CFG to recognise an infix expression and implement a parser that recognises the infix expression using YACC. Provide the details of all conflicting entries in the parser table generated by LEX and YACC and how they have been resolved. (A calculator can be taken as an application.)

OBJECTIVE:
• Be proficient in writing grammars to specify syntax
• Understand the theories behind different parsing strategies, their strengths and limitations
• Understand how the generation of parsers can be automated
• Be able to use YACC to generate parsers
• Understand conflicting entries in the parser table
• Understand the resolution of conflicts by the parser

S/W PACKAGES AND HARDWARE APPARATUS USED: Windows 2000; PC with the configuration: Pentium IV 1.7 GHz, 128 MB RAM, 40 GB HDD, 15'' color monitor, keyboard, mouse; Linux with the LEX utility

REFERENCES:
1. A. V. Aho, R. Sethi, J. D. Ullman, "Compilers: Principles, Techniques, and Tools", Pearson Education, ISBN 81-7758-590-8
2. J. R. Levine, T. Mason, D. Brown, "Lex & Yacc", O'Reilly, 2000, ISBN 81-7366-061-X
3. K. Louden, "Compiler Construction: Principles and Practice", Thomson Brooks/Cole (ISE), 2003, ISBN 981-243-694-4

STEPS: Refer to student activity flow chart, theory, algorithm, test input, test output

INSTRUCTIONS FOR WRITING JOURNAL:
• Title
• Problem Definition
• Theory
• Source Code
• Output
• Conclusion

Theory:

Ambiguity :

A grammar is said to be an ambiguous grammar if there is some string that it can


generate in more than one way (i.e., the string has more than one parse tree or more than
one leftmost derivation).
The context-free grammar

A → A + A | A − A | a

is ambiguous, since there are two leftmost derivations for the string a + a + a:

A → A + A            A → A + A
  → a + A              → A + A + A
  → a + A + A          → a + A + A
  → a + a + A          → a + a + A
  → a + a + a          → a + a + a

Conflicts may arise because of mistakes in the input or logic, or because the grammar
rules are ambiguous; ambiguity typically results in shift/reduce conflicts. When there
are shift/reduce or reduce/reduce conflicts, Yacc still produces a parser. It does this by
selecting one of the valid steps wherever it has a choice. A rule describing which choice
to make in a given situation is called a disambiguating rule.
Yacc invokes two disambiguating rules by default:
• In a shift/reduce conflict, the default is to do the shift.
• In a reduce/reduce conflict, the default is to reduce by the earlier grammar rule (in
the input sequence).
There is one common situation where the rules given above for resolving conflicts are not
sufficient; this is in the parsing of arithmetic expressions. Most of the commonly used
constructions for arithmetic expressions can be naturally described by the notion of
precedence levels for operators, together with information about left or right
associativity.

The precedences and associativities are attached to tokens in the declarations section.
This is done by a series of lines beginning with a Yacc keyword: %left, %right, or
%nonassoc, followed by a list of tokens. All of the tokens on the same line are assumed
to have the same precedence level and associativity; the lines are listed in order of
increasing precedence or binding strength.

Thus,
%left '+' '-'
%left '*' '/'

describes the precedence and associativity of the four arithmetic operators. Plus and

minus are left associative, and have lower precedence than star and slash, which are also
left associative. The keyword %right is used to describe right associative operators, and
the keyword %nonassoc is used to describe operators that may not associate with
themselves.

Passing Values between Actions


To get values generated by other actions, an action can use the yacc parameter keywords
that begin with a dollar sign ($1, $2, ...). These keywords refer to the values
returned by the components of the right side of a rule, reading from left to right. For
example, if the rule is:
A : B C D ;

then $1 has the value returned by the rule that recognized B, $2 has the value returned
by the rule that recognized C, and $3 the value returned by the rule that recognized D.
To return a value, the action sets the pseudo-variable $$ to some value. For example, the
following action returns a value of 1:
{ $$ = 1;}

Algorithm :

1. Accept expression from user.


2. Write a yacc file to evaluate the expression.
3. Print the result.
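
A minimal sketch of such a yacc file is given below (one possible grammar; the companion
lex file is assumed to return NUMBER tokens with the value in yylval, and to pass the
operators and the newline through as single-character tokens). Every expr : expr op expr
production is ambiguous, so yacc reports shift/reduce conflicts for this grammar, and the
%left lines resolve them exactly as described in the theory; running yacc with the -v
option writes the conflicting parser-table entries and their resolution to the y.output file.

%{
#include <stdio.h>
int yylex(void);
void yyerror(const char *s) { fprintf(stderr, "%s\n", s); }
%}
%token NUMBER
%left '+' '-'            /* lowest precedence, left associative */
%left '*' '/'            /* higher precedence, left associative */
%%
input : /* empty */
      | input line
      ;
line  : expr '\n'        { printf("= %d\n", $1); }
      ;
expr  : expr '+' expr    { $$ = $1 + $3; }
      | expr '-' expr    { $$ = $1 - $3; }
      | expr '*' expr    { $$ = $1 * $3; }
      | expr '/' expr    { $$ = $1 / $3; }
      | '(' expr ')'     { $$ = $2; }
      | NUMBER
      ;
%%
int main(void) { return yyparse(); }

Extending this sketch with an assignment rule and a small symbol table yields the
variable-handling behaviour (A=2, B=3, C=A+B) shown in the test output below.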

Test Input:
none

Test Output:
>A=2
>B=3
>C=A+B
>C
=5

Revised on: 22/06/2009

TITLE: Three address code generation using LEX & YACC

PROBLEM STATEMENT/DEFINITION: Write an attributed translation grammar to recognise declarations of simple variables, "for", assignment, if and if-else statements as per the syntax of C or Pascal, and generate equivalent three address code for the given input made up of the constructs mentioned above using LEX and YACC. Write code to store the identifiers from the input in a symbol table and also to record other relevant information about the identifiers. Display all records stored in the symbol table.

OBJECTIVE:
• Understand the intermediate code generation phase
• Understand the need for various static semantic analyses such as declaration processing and type checking
• Be able to perform such analyses in a syntax-directed fashion through the use of attributed definitions

S/W PACKAGES AND HARDWARE APPARATUS USED: Windows 2000; PC with the configuration: Pentium IV 1.7 GHz, 128 MB RAM, 40 GB HDD, 15'' color monitor, keyboard, mouse; Linux with the LEX and YACC utilities; GCC

REFERENCES:
1. A. V. Aho, R. Sethi, J. D. Ullman, "Compilers: Principles, Techniques, and Tools", Pearson Education, ISBN 81-7758-590-8
2. J. R. Levine, T. Mason, D. Brown, "Lex & Yacc", O'Reilly, 2000, ISBN 81-7366-061-X
3. K. Louden, "Compiler Construction: Principles and Practice", Thomson Brooks/Cole (ISE), 2003, ISBN 981-243-694-4

STEPS: Refer to student activity flow chart, theory, algorithm, test input, test output

INSTRUCTIONS FOR WRITING JOURNAL:
• Title
• Problem Definition
• Theory
• Source Code
• Output
• Conclusion

Theory:

Basically there are three representations of 3-address code:


1. Quadruples
2. Triples
3. Indirect triples

There are the following types of 3-address statements:

• Assignment statements of the form x := y op z, where op is a binary arithmetic or
logical operation.
• Assignment instructions of the form x := op y, where op is a unary operation such as
unary minus, logical negation, a shift operator or a conversion operation.
• Copy statements of the form x := y, where the value of y is assigned to x.
• The unconditional jump goto L: the 3-address statement with label L is the next
statement to be executed.
• Conditional jumps such as if x relop y goto L. This instruction applies a relational
operator (<=, >=, etc.) to x and y; if the condition is false, the next statement in
sequence is executed.
• param x and call p, n for procedure calls, and return y, where y represents an
optional return value. Their typical use is the sequence of 3-address statements

param x1
param x2
...
param xn
call p, n

generated as part of a call to the procedure p(x1, x2, ..., xn).

• Address and pointer assignments of the form x := &y, x := *y and *x := y.

Quadruples --
A quadruple is a record structure with four fields, which we will call op, arg1, arg2 and
result. The op field contains an internal code for the operator. The 3-address statement
x := y op z is represented by placing y in arg1, z in arg2 and x in result. Statements with
unary operators, such as x := -y or x := y, do not use arg2.

The quadruples for the assignment

a := b * -c + b * (-c)

are:

Op          Arg1    Arg2    Result
uminus      c               t1
*           b       t1      t2
uminus      c               t3
*           b       t3      t4
+           t2      t4      t5
:=          t5              a

Algorithm:
1. Write a lex file to detect various tokens like if, for, brackets, identifiers etc.
2. Write the yacc file for checking syntactically correct statements.
3. Generate the lex.yy.c & y.tab.c files and link them.
4. Print the 3-address code & the symbols in the symbol table.
5. Stop.
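
A minimal sketch of the data structures the yacc actions could fill in while generating
the quadruples and the symbol table (field sizes and helper names are illustrative
assumptions):

#include <stdio.h>
#include <string.h>

struct quad   { char op[8], arg1[16], arg2[16], result[16]; };
struct symbol { char name[32], type[8], value[32]; };

struct quad   quads[256];     /* the generated three-address code */
struct symbol symtab[128];    /* identifiers and their attributes */
int nquads = 0, nsyms = 0, ntemps = 0;

/* record one quadruple and return its address (index) */
int emit(const char *op, const char *a1, const char *a2, const char *res)
{
    struct quad *q = &quads[nquads];
    strcpy(q->op, op);   strcpy(q->arg1, a1);
    strcpy(q->arg2, a2); strcpy(q->result, res);
    return nquads++;
}

/* generate a fresh temporary name t0, t1, ... into buf */
void newtemp(char *buf)
{
    sprintf(buf, "t%d", ntemps++);
}

The action for an addition rule, for example, would call newtemp() and then
emit("+", left, right, temp); the quadruple and symbol tables would be printed
at the end of the parse.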

Test Input:
int main(int a,char b)
{
int a=6,b=3,c,d,e;
c=5;
if (a<c)
{
a=b+c*d-e;
}
while (a<b && c!=3)
{
b=c+3;
}
}

Test Output:
Program Syntactically Correct

Quadruples Produced :

Address Operator Operand1 Operand2 Result


0 = a 6
1 = b 3
2 = 5 c
3 < a c t0
4 IF t0 1 (6)
5 (n0)
6 * c d t1
7 + b t1 t2
8 - t2 e t3
9 = t3 a
10 (n0)
11 < a b t4
12 != c 3 t5
13 && t4 t5 t6
14 WHILE t6 1 (16)
15 (n1)

16 + c 3 t7
17 = t7 b
18 (prev)
19 (n1)

Revised on: 22/06/2009
TITLE: Commands in UNIX/LINUX Operating Systems

PROBLEM DEFINITION: Study of commands in UNIX/LINUX operating systems; customizing the operating system environment.

OBJECTIVE:
• To get familiar with the operating environment

REFERENCES:
• INTERACTIVE UNIX Operating System Primer, Version 3.0
• man commands
• Online Linux Documentation Project

STEPS: Refer to theory

INSTRUCTIONS FOR WRITING JOURNAL:
• Title
• Problem Definition
• Theory
• Algorithm
• Source code
• Compilation steps
• Output
• Conclusion

Theory:

NAME

grep, egrep, fgrep, rgrep - print lines matching a pattern

SYNOPSIS

grep [options] PATTERN [FILE...]


grep [options] [-e PATTERN | -f FILE] [FILE...]

DESCRIPTION

grep searches the named input FILEs (or standard input if no files are named, or the
file name - is given) for lines containing a match to the given PATTERN. By default, grep
prints the matching lines.

In addition, three variant programs egrep, fgrep and rgrep are available. egrep is the
same as grep -E. fgrep is the same as grep -F. rgrep is the same as grep -r.

OPTIONS

-A NUM, --after-context=NUM
Print NUM lines of trailing context after matching lines. Places a
line containing -- between contiguous groups of matches.

-a, --text
Process a binary file as if it were text; this is equivalent to the --binary-
files=text option.

-B NUM, --before-context=NUM
Print NUM lines of leading context before matching lines. Places
a line containing -- between contiguous groups of matches.

-b, --byte-offset
Print the byte offset within the input file before each line of output.

EXAMPLE

ls -l | grep <pattern>

NAME

locate - list files in databases that match a pattern

SYNOPSIS

locate [-d path | --database=path] [-e | -E | --[non-]existing] [-i |
--ignore-case] [-0 | --null] [-c | --count] [-w | --wholename] [-b |
--basename] [-l N | --limit=N] [-S | --statistics] [-r | --regex] [-P
| -H | --nofollow] [-L | --follow] [--version] [-A | --all] [-p |
--print] [--help] pattern...

DESCRIPTION

This manual page documents the GNU version of locate. For each given pattern,
locate searches one or more databases of file names and displays the file names that
contain the pattern. Patterns can contain shell-style metacharacters: `*', `?', and `[]'.
The metacharacters do not treat `/' or `.' specially. Therefore, a pattern `foo*bar' can
match a file name that contains `foo3/bar', and a pattern `*duck*' can match a file name
that contains `lake/.ducky'. Patterns that contain metacharacters should be quoted to
protect them from expansion by the shell.

If a pattern is a plain string -- it contains no metacharacters -- locate displays all
file names in the database that contain that string anywhere. If a pattern does contain
metacharacters, locate only displays file names that match the pattern exactly. As a
result, patterns that contain metacharacters should usually begin with a `*', and will
most often end with one as well. The exceptions are patterns that are intended to
explicitly match the beginning or end of a file name.

The file name databases contain lists of files that were on the system
when the databases were last updated. The system administrator can
choose the file name of the default database, the frequency with which
the databases are updated, and the directories for which they contain
entries; see updatedb(1).

If locate's output is going to a terminal, unusual characters in the


output are escaped in the same way as for the -print action of the find
command. If the output is not going to a terminal, file names are
printed exactly as-is.

OPTIONS

-A, --all
Print only names which match all non-option arguments, not those

matching one or more non-option arguments.

-c, --count
Instead of printing the matched filenames, just print the total
number of matches we found, unless --print (-p) is also present.

-d path, --database=path
Instead of searching the default file name database, search the
file name databases in path, which is a colon-separated list of
database file names. You can also use the environment variable
LOCATE_PATH to set the list of database files to search. The
option overrides the environment variable if both are used.
Empty elements in the path are taken to be synonyms for the file
name of the default database. A database can be supplied on
stdin, using `-' as an element of path. If more than one element
of path is `-', later instances are ignored (and a warning message
is printed).

The file name database format changed starting with GNU find and
locate version 4.0 to allow machines with different byte orderings
to share the databases. This version of locate can automatically
recognize and read databases produced for older versions of GNU
locate or Unix versions of locate or find. Support for the old
locate database format will be discontinued in a future release.

-e, --existing
Only print out such names that currently exist (instead of such
names that existed when the database was created). Note that
this may slow down the program a lot, if there are many matches
in the database. If you are using this option within a program,
please note that it is possible for the file to be deleted after
locate has checked that it exists, but before you use it.

-E, --non-existing
Only print out such names that currently do not exist (instead
of such names that existed when the database was created). Note
that this may slow down the program a lot, if there are many
matches in the database.

-L, --follow
If testing for the existence of files (with the -e or -E
options), consider broken symbolic links to be non-existing.
This is the default.

-P, -H, --nofollow
If testing for the existence of files (with the -e or -E
options), treat broken symbolic links as if they were existing
files. The -H form of this option is provided purely for simi-
larity with find; the use of -P is recommended over -H.

-i, --ignore-case
Ignore case distinctions in both the pattern and the file names.

-l N, --limit=N
Limit the number of matches to N. If a limit is set via this
option, the number of results printed for the -c option will
never be larger than this number.

-m, --mmap
Accepted but does nothing, for compatibility with BSD locate.

-0, --null
Use ASCII NUL as a separator, instead of newline.

-p, --print
Print search results when they normally would not, because of
the presence of --statistics (-S) or --count (-c).

-w, --wholename
Match against the whole name of the file as listed in the
database. This is the default.

-b, --basename
Results are considered to match if the pattern specified matches
the final component of the name of a file as listed in the
database. This final component is usually referred to as the
`base name'.

-r, --regex
The pattern specified on the command line is understood to be a
regular expression, as opposed to a glob pattern. Regular
expressions work in the same way as in emacs and find, except
for the fact that "." will match a newline. Filenames whose
full paths match the specified regular expression are printed
(or, in the case of the -c option, counted). If you wish to
anchor your regular expression at the ends of the full path
name, then as is usual with regular expressions, you should use
the characters ^ and $ to signify this.

-s, --stdio
Accepted but does nothing, for compatibility with BSD locate.

-S, --statistics
Print various statistics about each locate database and then
exit without performing a search, unless non-option arguments
are given. For compatibility with BSD, -S is accepted as a
synonym for --statistics. However, the output of locate -S is
different for the GNU and BSD implementations of locate.

--help Print a summary of the options to locate and exit.

--version
Print the version number of locate and exit
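
EXAMPLE

A couple of usage examples built from the options above (the patterns are arbitrary):

locate -i '*.conf'     (case-insensitive search for names matching *.conf)
locate -c passwd       (print only the number of names containing passwd)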

NAME
chmod -- change file modes or Access Control Lists

SYNOPSIS
chmod [-fv] [-R [-H | -L | -P]] mode file ...
chmod [-fv] [-R [-H | -L | -P]] [-a | +a | =a] ACE file ...
chmod [-fv] [-R [-H | -L | -P]] [-E] file ...
chmod [-fv] [-R [-H | -L | -P]] [-C] file ...

DESCRIPTION
The chmod utility modifies the file mode bits of the listed files as
specified by the mode operand. It may also be used to modify the Access
Control Lists (ACLs) associated with the listed files.

The generic options are as follows:

-H If the -R option is specified, symbolic links on the command line


are followed. (Symbolic links encountered in the tree traversal
are not followed by default.)

-L If the -R option is specified, all symbolic links are followed.

-P If the -R option is specified, no symbolic links are followed.


This is the default.

-R Change the modes of the file hierarchies rooted in the files


instead of just the files themselves.

-f Do not display a diagnostic message if chmod could not modify the
mode for file.

-v Cause chmod to be verbose, showing filenames as the mode is
modified. If the -v flag is specified more than once, the old and
new modes of the file will also be printed, in both octal and
symbolic notation.

The -H, -L and -P options are ignored unless the -R option is specified.
In addition, these options override each other and the command's actions
are determined by the last one specified.

Only the owner of a file or the super-user is permitted to change the


mode of a file.

EXAMPLES OF VALID MODES


644 make a file readable by anyone and writable by the owner
only.

go-w deny write permission to group and others.

=rw, +X set the read and write permissions to the usual defaults,
but retain any execute permissions that are currently set.

+X make a directory or file searchable/executable by everyone


if it is already searchable/executable by anyone.

755, u=rwx,go=rx, u=rwx,go=u-w
make a file readable/executable by everyone and writable by
the owner only.

go= clear all mode bits for group and others.

g=u-w set the group bits equal to the user bits, but clear the
group write bit.
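
EXAMPLE

A few usage examples built from the modes above (the file names are arbitrary):

chmod 755 myscript.sh      (owner: read/write/execute; group and others: read/execute)
chmod go-w notes.txt       (deny write permission to group and others)
chmod -R u+x ~/bin         (recursively add execute permission for the owner)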

NAME

ls - list directory contents

SYNOPSIS

ls [OPTION]... [FILE]...

DESCRIPTION

List information about the FILEs (the current directory by default).


Sort entries alphabetically if none of -cftuSUX nor --sort.

Mandatory arguments to long options are mandatory for short options


too.

-a, --all
do not ignore entries starting with .

-A, --almost-all
do not list implied . and ..

--author
with -l, print the author of each file

-b, --escape
print octal escapes for nongraphic characters

--block-size=SIZE
use SIZE-byte blocks

-B, --ignore-backups
do not list implied entries ending with ~

-c with -lt: sort by, and show, ctime (time of last modification of
file status information) with -l: show ctime and sort by name
otherwise: sort by ctime

-C list entries by columns

--color[=WHEN]
control whether color is used to distinguish file types. WHEN may be
`never', `always', or `auto'

-d, --directory
list directory entries instead of contents, and do not dereference
symbolic links
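
EXAMPLE

Usage examples combining the options above (the directory names are arbitrary):

ls -al          (long listing of all entries, including hidden files)
ls -C /bin      (list the contents of /bin in columns)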
NAME
read, readv, pread -- read input

LIBRARY
Standard C Library (libc, -lc)

SYNOPSIS
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

ssize_t
read(int d, void *buf, size_t nbytes);

ssize_t
readv(int d, const struct iovec *iov, int iovcnt);

ssize_t
pread(int d, void *buf, size_t nbytes, off_t offset);

DESCRIPTION
Read() attempts to read nbytes of data from the object referenced by the
descriptor d into the buffer pointed to by buf. Readv() performs the
same action, but scatters the input data into the iovcnt buffers specified
by the members of the iov array: iov[0], iov[1], ..., iov[iovcnt-1].
Pread() performs the same function, but reads from the specified position
in the file without modifying the file pointer.

For readv(), the iovec structure is defined as:

struct iovec {
char *iov_base; /* Base address. */
size_t iov_len; /* Length. */
};

Each iovec entry specifies the base address and length of an area in
memory where data should be placed. Readv() will always fill an area
completely before proceeding to the next.

On objects capable of seeking, the read() starts at a position given by


the pointer associated with d (see lseek(2)). Upon return from read(),
the pointer is incremented by the number of bytes actually read.

Objects that are not capable of seeking always read from the current

position. The value of the pointer associated with such an object is
undefined.

Upon successful completion, read(), readv(), and pread() return the
number of bytes actually read and placed in the buffer. The system
guarantees to read the number of bytes requested if the descriptor
references a normal file that has that many bytes left before the
end-of-file, but in no other case.

RETURN VALUES
If successful, the number of bytes actually read is returned. Upon reading end-of-file,
zero is returned. Otherwise, a -1 is returned and the global variable errno is set to
indicate the error.
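
A minimal sketch showing read() in use, paired with write() (described next); it copies
standard input to standard output in 512-byte chunks:

#include <unistd.h>

int main(void)
{
    char buf[512];
    ssize_t n;

    /* file descriptor 0 is standard input, 1 is standard output */
    while ((n = read(0, buf, sizeof(buf))) > 0)
        write(1, buf, n);
    return 0;
}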

Name
write - write to a file descriptor

Synopsis
#include <unistd.h>

ssize_t write(int fd, const void *buf, size_t count);

Description
write() writes up to count bytes to the file referenced by the file descriptor fd from the
buffer starting at buf. POSIX requires that a read() which can be proved to occur after a
write() has returned returns the new data. Note that not all file systems are POSIX
conforming.

Return Value
On success, the number of bytes written is returned (zero indicates nothing was written).
On error, -1 is returned, and errno is set appropriately. If count is zero and the file
descriptor refers to a regular file, 0 may be returned, or an error could be detected. For a
special file, the results are not portable.

Errors
EAGAIN
Non-blocking I/O has been selected using O_NONBLOCK and the write would
block.
EBADF
fd is not a valid file descriptor or is not open for writing.
EFAULT

buf is outside your accessible address space.
EFBIG
An attempt was made to write a file that exceeds the implementation-defined
maximum file size or the process' file size limit, or to write at a position past the
maximum allowed offset.
EINTR
The call was interrupted by a signal before any data was written.
EINVAL
fd is attached to an object which is unsuitable for writing; or the file was opened
with the O_DIRECT flag, and either the address specified in buf, the value
specified in count, or the current file offset is not suitably aligned.
EIO
A low-level I/O error occurred while modifying the inode.
ENOSPC
The device containing the file referred to by fd has no room for the data.
EPIPE
fd is connected to a pipe or socket whose reading end is closed. When this
happens the writing process will also receive a SIGPIPE signal. (Thus, the write
return value is seen only if the program catches, blocks or ignores this signal.)
NAME

fork, fork1 - create a new process

SYNOPSIS

#include <sys/types.h>
#include <unistd.h>

pid_t fork(void);

pid_t fork1(void);

DESCRIPTION

The fork() and fork1() functions create a new process. The new process (child
process) is an exact copy of the calling process (parent process). The child process
inherits the following attributes from the parent process:
o real user ID, real group ID, effective user ID, effective group ID

o environment
o open file descriptors
o close-on-exec flags (see exec(2))
o signal handling settings (that is, SIG_DFL, SIG_IGN,
SIG_HOLD, function address)
o supplementary group IDs

o set-user-ID mode bit
o set-group-ID mode bit
o profiling on/off status
o nice value (see nice(2))
o scheduler class (see priocntl(2))
o all attached shared memory segments (see shmop(2))
o process group ID
o memory mappings (see mmap(2))
o session ID (see exit(2))
o current working directory
o root directory
o file mode creation mask (see umask(2))


o resource limits (see getrlimit(2))
o controlling terminal
o saved user ID and group ID
o task ID and project ID
o processor bindings (see processor_bind(2))
o processor set bindings (see pset_bind(2))

Scheduling priority and any per-process scheduling parameters that are specific to
a given scheduling class may or may not be inherited according to the policy of that
particular class (see priocntl(2)). The child process differs from the parent process in the
following ways:

o The child process has a unique process ID which does not match any active
process group ID.

o The child process has a different parent process ID (that is, the process ID of the
parent process).

o The child process has its own copy of the parent's file descriptors and
directory streams. Each of the child's file descriptors shares a common file pointer with
the corresponding file descriptor of the parent.

o Each shared memory segment remains attached and the value of shm_nattach is
incremented by 1.

o All semadj values are cleared (see semop(2)).

o Process locks, text locks, data locks, and other memory locks are not
inherited by the child (see plock(3C) and memcntl(2)).

o The child process's tms structure is cleared:
tms_utime, tms_stime, tms_cutime, and tms_cstime are set to 0 (see times(2)).

o The child process's resource utilizations are set to 0; see getrlimit(2). The
it_value and it_interval values for the ITIMER_REAL timer are reset to 0; see
getitimer(2).

o The set of signals pending for the child process is initialized to the empty set.

o Timers created by timer_create(3RT) are not inherited by the child process.
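
A minimal sketch of fork() in use, relying only on the behaviour described above
(fork() returns 0 in the child and the child's process ID in the parent):

#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    pid_t pid = fork();

    if (pid == 0) {                /* child branch */
        printf("child : pid=%d, parent=%d\n", getpid(), getppid());
    } else if (pid > 0) {          /* parent branch */
        printf("parent: pid=%d, child=%d\n", getpid(), pid);
        wait(NULL);                /* reap the child */
    } else {
        perror("fork");            /* fork failed */
    }
    return 0;
}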

NAME
pipe -- create descriptor pair for interprocess communication

SYNOPSIS
#include <unistd.h>

int
pipe(int *fildes);

DESCRIPTION
The pipe() function creates a pipe, which is an object allowing unidirectional
data flow, and allocates a pair of file descriptors. The first
descriptor connects to the read end of the pipe, and the second connects
to the write end, so that data written to fildes[1] appears on (i.e., can
be read from) fildes[0]. This allows the output of one program to be
sent to another program: the source's standard output is set up to be the
write end of the pipe, and the sink's standard input is set up to be the
read end of the pipe. The pipe itself persists until all its associated
descriptors are closed.

A pipe whose read or write end has been closed is considered widowed.
Writing on such a pipe causes the writing process to receive a SIGPIPE
signal. Widowing a pipe is the only way to deliver end-of-file to a
reader: after the reader consumes any buffered data, reading a widowed
pipe returns a zero count.

RETURN VALUES
On successful creation of the pipe, zero is returned. Otherwise, a value
of -1 is returned and the variable errno set to indicate the error.

ERRORS
The pipe() call will fail if:

[EMFILE] Too many descriptors are active.

[ENFILE] The system file table is full.

[EFAULT] The fildes buffer is in an invalid area of the


process's address space.

Revised On: 22/06/2009
TITLE: Client-Server communication using Inter-process Communication

PROBLEM STATEMENT/DEFINITION: Write a program for Client-Server communication using the following IPC mechanisms:
1. Unnamed pipe
2. Named pipe
3. Semaphore (general)

OBJECTIVE:
• To understand inter-process communication using pipes
• To implement client-server communication using unnamed and named pipes

S/W PACKAGES AND HARDWARE APPARATUS USED: Linux Fedora 4; PC with the configuration: Pentium IV 1.7 GHz, 128 MB RAM, 40 GB HDD, 15'' color monitor, keyboard, mouse

REFERENCES:
• Advanced Unix Programming by W. Richard Stevens
• Vijay Mukhi's "The C Odyssey: UNIX", Gandhi et al.

STEPS: Refer to student activity flow chart, theory, algorithm, test input, test output

INSTRUCTIONS FOR WRITING JOURNAL:
• Title
• Problem Definition
• Theory
• Algorithm
• Source code
• Compilation steps
• Output
• Conclusion

Theory:

A pipe is a form of redirection that is used in Linux and other Unix-like operating
systems to send the output of one program to another program for further processing.

Redirection is the transferring of standard output to some other destination, such as


another program, a file or a printer, instead of the display monitor (which is its default
destination). Standard output, sometimes abbreviated stdout, is the destination of the
output from command line (i.e., all-text mode) programs in Unix-like operating systems.

Pipes are used to create what can be visualized as a pipeline of commands, which is a
temporary direct connection between two or more simple programs. This connection
makes possible the performance of some highly specialized task that none of the
constituent programs could perform by themselves. A command is merely an instruction
provided by a user telling a computer to do something, such as launch a program. The
command line programs that do the further processing are referred to as filters.

This direct connection between programs allows them to operate simultaneously and
permits data to be transferred between them continuously rather than having to pass it
through temporary text files or through the display screen and having to wait for one
program to be completed before the next program begins.

History

Pipes rank alongside the hierarchical file system and regular expressions as one of the
most powerful yet elegant features of Unix-like operating systems. The hierarchical file
system is the organization of directories in a tree-like structure which has a single root
directory (i.e., a directory that contains all other directories). Regular expressions are a
pattern matching system that uses strings (i.e., sequences of characters) constructed
according to pre-defined syntax rules to find desired patterns in text.

Pipes were first suggested by M. Doug McIlroy, when he was a department head in the
Computing Science Research Center at Bell Labs, the research arm of AT&T (American
Telephone and Telegraph Company), the former U.S. telecommunications monopoly.
McIlroy had been working on macros since the latter part of the 1950s, and he was a
ceaseless advocate of linking macros together as a more efficient alternative to series of
discrete commands. A macro is a series of commands (or keyboard and mouse actions)
that is performed automatically when a certain command is entered or key(s) pressed.

McIlroy's persistence led Ken Thompson, who developed the original UNIX at Bell Labs
in 1969, to rewrite portions of his operating system in 1973 to include pipes. This
implementation of pipes was not only extremely useful in itself, but it also made possible
a central part of the Unix philosophy, the most basic concept of which is modularity (i.e.,

a whole that is created from independent, replaceable parts that work together
efficiently).

Examples

A pipe is designated in commands by the vertical bar character, which is located on the
same key as the backslash on U.S. keyboards. The general syntax for pipes is:

command_1 | command_2 [| command_3 . . . ]

This chain can continue for any number of commands or programs.

A very simple example of the benefits of piping is provided by the dmesg command,
which repeats the startup messages that scroll through the console (i.e., the all-text, full-
screen display) while Linux is booting (i.e., starting up). dmesg by itself produces far too
many lines of output to fit into a single screen; thus, its output scrolls down the screen at
high speed and only the final screenful of messages is easily readable. However, by
piping the output of dmesg to the filter less, the startup messages can conveniently be
viewed one screenful at a time, i.e.,

dmesg | less

less allows the output of dmesg to be moved forward one screenful at a time by pressing
the SPACE bar and back one screenful at a time by pressing the b key. The command can
be terminated by pressing the q key. (The more command could have been used here
instead of less; however, less is newer than more and has additional functions, including
the ability to return to previous pages of the output.)

The same result could be achieved by first redirecting the output of dmesg to a temporary
file and then displaying the contents of that file on the monitor. For example, the
following set of two commands uses the output redirection operator (designated by a
rightward facing angle bracket) to first send the output of dmesg to a text file called
tempfile1 (which will be created by the output redirection operator if it does not already
exist), and then uses less to display the contents of tempfile1 on the screen:

dmesg > tempfile1


less tempfile1

However, redirection to a file as an intermediate step is clearly less efficient, both


because two separate commands are required and because the second command must
await the completion of the first command before it can begin.

The use of two pipes to chain three commands together could make the above example
even more convenient for some situations. For example, the output of dmesg could first
be piped to the sort filter to arrange it into alphabetic order before piping it to less:

dmesg | sort -f | less

The -f option tells sort to disregard case (i.e., whether letters are lower case or upper
case) while sorting.

Likewise, the output of the ls command (which is used to list the contents of a directory)
is commonly piped to the less (or more) command to make the output easier to read,
i.e.,

ls -al | less

or

ls -al | more

ls reports the contents of the current directory (i.e., the directory in which the user is
currently working) in the absence of any arguments (i.e., input data in the form of the
names of files or directories). The -l option tells ls to provide detailed information about
each item, and the -a option tells ls to include all files, including hidden files (i.e., files
that are normally not visible to users). Because ls returns its output in alphabetic order by
default, it is not necessary to pipe its output to the sort command (unless it is desired to
perform a different type of sorting, such as reverse sorting, in which case sort's -r option
would be used).

This could just as easily be done for any other directory. For example, the following
would list the contents of the /bin directory (which contains user commands) in a
convenient paged format:

ls -al /bin | less

The following example employs a pipe to combine the ls and the wc (i.e., word count)
commands in order to show how many filesystem objects (i.e., files, directories and links)
are in the current directory:

ls | wc -l

ls lists each object, one per line, and this list is then piped to wc, which, when used with
its -l option, counts the number of lines and writes the result to standard output (which, as
usual, is by default the display screen).

The output from a pipeline of commands can be just as easily redirected to a file (where it
is written to that file) or a printer (where it is printed on paper). In the case of the above
example, the output could be redirected to a file named, for instance, count.txt:

ls | wc -l > count.txt

The output redirection operator will create count.txt if it does not exist or overwrite it if it
already exists. (The file does not, of course, require the .txt extension, and it could have
just as easily been named count, lines or anything else.)

The following is a slightly more complex example of combining a pipe with redirection
to a file:

echo -e "orange \npeach \ncherry" | sort > fruit

The echo command tells the computer to send the text that follows it to standard output,
and its -e option tells the computer to interpret each \n as the newline symbol (which is
used to start a new line in the output). The pipe redirects the output from echo -e to the
sort command, which arranges it alphabetically, after which it is redirected by the output
redirection operator to the file fruit.

As a final example, and to further illustrate the great power and flexibility that pipes can
provide, the following uses three pipes to search the contents of all of the files in current
directory and display the total number of lines in them that contain the string Linux but
not the string UNIX:

cat * | grep "Linux" | grep -v "UNIX" | wc -l

In the first of the four segments of this pipeline, the cat command, which is used to read
and concatenate (i.e., string together) the contents of files, concatenates the contents of
all of the files in the current directory. The asterisk is a wildcard that represents all items
in a specified directory, and in this case it serves as an argument to cat to represent all
objects in the current directory.

The first pipe sends the output of cat to the grep command, which is used to search text.
The Linux argument tells grep to return only those lines that contain the string Linux. The
second pipe sends these lines to another instance of grep, which, in turn, with its -v
option, eliminates those lines that contain the string UNIX. Finally, the third pipe sends
this output to wc -l, which counts the number of lines and writes the result to the display
screen.

Algorithm:

1. Unnamed Pipe (a code sketch follows this list):
   1. Create the two pipe ends using the pipe() system call.
   2. Create a child process using the fork() system call.
   3. In the client (i.e., the child), close the read end and write data on the write end.
   4. In the server (i.e., the parent), close the write end and read the data written by
      the client from the read end.
2. Named Pipe:
   1. Create two FIFOs using the mkfifo() call.
   2. Open one for reading (FIFO1) and the other for writing (FIFO2).
   3. Send data from the client.
   4. Wait for data from the client and print the same.
3. Semaphore:
   Server:
   1. Obtain a semaphore set with the semget() function using some specific
      semaphore key, and set its value to 1.
   2. While not reset by the client, continue; else read the data from the file written by
      the client.

   Client:
   1. Obtain the semaphore set with the semget() function using the server's specific
      semaphore key.
   2. Write data to the file, reset the semaphore value and break.
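The following is a minimal sketch of the unnamed-pipe exchange; it assumes the parent
writes and the child reads (matching the test output further below), and the message text
and buffer size are illustrative choices, not part of the assignment.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    int fd[2];
    char buf[64];
    const char *msg = "hellopictsctr";

    if (pipe(fd) < 0) {            /* fd[0] is the read end, fd[1] the write end */
        perror("pipe");
        exit(1);
    }
    if (fork() == 0) {             /* child: reader */
        close(fd[1]);              /* close the unused write end */
        read(fd[0], buf, sizeof(buf));
        printf("Pipe read successfully: %s\n", buf);
        close(fd[0]);
        exit(0);
    }
    close(fd[0]);                  /* parent: writer; close the unused read end */
    write(fd[1], msg, strlen(msg) + 1);
    close(fd[1]);
    wait(NULL);                    /* reap the child */
    return 0;
}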

Test Input:
1. Named Pipe Client:
[root@localhost Programs]# gcc -o client namedclient.c
[root@localhost Programs]# ./client hellopictsctr
Named Pipe Server:
[root@localhost Programs]# gcc -o server namedserver.c
[root@localhost Programs]# ./server

2. Unnamed Pipe:
[root@localhost Programs]# gcc unnamed.c
[root@localhost Programs]# ./a.out

3. Semaphore Client:
[root@localhost Programs]# gcc semclient.c
[root@localhost Programs]# ./a.out
Semaphore Server
[root@localhost Programs]# gcc semserver.c
[root@localhost Programs]# ./a.out

Test Output:
1. Named Pipe Server:
Half duplex Server: Read from Pipe: hellopictsctr
Half duplex Server: Converting string: HELLOPICTSCTR

2. Unnamed Pipe:
Parent process:
Enter the data to the pipe: hellopictsctr
Child process:
Pipe read successfully: hellopictsctr

3. Semaphore Client:
Enter data: hellopictsctr
Semaphore Server:
Waiting for clients to update…

Updated : hellopictsctr

Revised On: 22/06/2009

TITLE: System Call
PROBLEM STATEMENT/DEFINITION: Using the fork system call, create a child process, suspend it using the wait system call and transfer it into the Zombie state.
OBJECTIVE:
• To understand the concept of the Zombie state and learn the fork and wait system calls.
• To implement the fork system call to create a child process and transfer it into the Zombie state.
S/W PACKAGES AND HARDWARE APPARATUS USED: Linux Fedora 4; PC with the configuration: Pentium IV 1.7 GHz, 128 MB RAM, 40 GB HDD, 15'' Color Monitor, Keyboard, Mouse
REFERENCES: Advanced Unix Programming by Richard Stevens; Vijay Mukhi's The 'C' Odyssey: UNIX - Gandhi
STEPS: Refer to student activity flow chart, theory, algorithm, test input, test output
INSTRUCTIONS FOR WRITING JOURNAL:
• Title
• Problem Definition
• Theory
• Algorithm
• Source code
• Compilation steps
• Output
• Conclusion

Theory:

On Unix and Unix-like computer operating systems, a zombie process or defunct
process is a process that has completed execution but still has an entry in the process
table, this entry being still needed to allow the process that started the zombie process to
read its exit status. The term zombie process derives from the common definition of
zombie—an undead person. In the term's colorful metaphor, the child process has died
but has not yet been reaped.

When a process ends, all of the memory and resources associated with it are deallocated
so they can be used by other processes. However, the process's entry in the process table
remains. The parent can read the child's exit status by executing the wait system call, at
which stage the zombie is removed. The wait call may be executed in sequential code,
but it is commonly executed in a handler for the SIGCHLD signal, which the parent is
sent whenever a child has died.

After the zombie is removed, its process ID and entry in the process table can then be
reused. However, if a parent fails to call wait, the zombie will be left in the process
table. In some situations this may be desirable, for example if the parent creates another
child process it ensures that it will not be allocated the same process ID. As a special
case, under Linux, if the parent explicitly ignores the SIGCHLD (sets the handler to
SIG_IGN, rather than simply ignoring the signal by default), all child exit status
information will be discarded and no zombie processes will be left.

A zombie process is not the same as an orphan process. An orphan process is a process
that is still executing, but whose parent has died. They don't become zombie processes;
instead, they are adopted by init (process ID 1), which waits on its children.

Zombies can be identified in the output from the Unix ps command by the presence of a
"Z" in the STAT column. Zombies that exist for more than a short period of time
typically indicate a bug in the parent program. As with other leaks, the presence of a few
zombies isn't worrisome in itself, but may indicate a problem that would grow serious
under heavier loads. Since there is no memory allocated to zombie processes except for
the process table entry itself, the primary concern with many zombies is not running out
of memory, but rather running out of process ID numbers.
To remove zombies from a system, the SIGCHLD signal can be sent to the parent
manually, using the kill command. If the parent process still refuses to reap the zombie,
the next step would be to remove the parent process. When a process loses its parent, init
becomes its new parent. Init periodically executes the wait system call to reap any
zombies with init as parent.

Algorithm:

1. Include <sys/wait.h>, <sys/types.h> and <unistd.h> along with other header files.
2. Call the fork system call.
3. In the child process, print the child process ID and the parent process ID.
4. In the parent process, print the parent process ID.
5. Execute the wait() system call in the parent.
6. While the program executes, use the ps command on another terminal to see the
   Zombie child.
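A minimal sketch of such a program follows; the 20-second sleep is an illustrative
choice that keeps the parent alive long enough for ps -al on another terminal to show
the <defunct> child.

#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    pid_t pid = fork();                  /* create the child process */

    if (pid < 0) {
        perror("fork");
        exit(1);
    }
    if (pid == 0) {                      /* child */
        printf("Child process.\nPID = %d Parent PID = %d\n",
               getpid(), getppid());
        exit(0);                         /* child exits and becomes a zombie */
    }
    printf("Parent process.\nPID = %d\n", getpid());
    sleep(20);                           /* window in which ps -al shows the zombie */
    wait(NULL);                          /* reap the child; zombie entry is removed */
    return 0;
}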

Test Input: None

Test Output:

[root@localhost Programs]# gcc -o zombie zombie.c


[root@localhost Programs]# ./zombie

Child process.
PID = 3952 Parent PID = 3951

Parent process.
PID = 3951

//On terminal 2

[root@localhost Programs]# ps -al

F S UID PID PPID C PRI NI ADDR SZ WCHAN TTY TIME CMD


0 S 0 3951 3037 0 82 0 - 396 - pts/0 00:00:00 zombie
1 Z 0 3952 3951 0 82 0 - 0 exit pts/0 00:00:00 zomb <defunct>
4 R 0 3953 3099 0 77 0 - 1069 - pts/1 00:00:00 ps

Revised On: 22/06/2009
TITLE: File Management
PROBLEM STATEMENT/DEFINITION: File management using low level file access system calls such as write, read, open, lseek, fstat.
OBJECTIVE:
• To understand and learn the system calls read, write, open, lseek, fstat.
• To implement the above system calls for file management.
S/W PACKAGES AND HARDWARE APPARATUS USED: Linux Fedora 4; PC with the configuration: Pentium IV 1.7 GHz, 128 MB RAM, 40 GB HDD, 15'' Color Monitor, Keyboard, Mouse
REFERENCES: The Design of the UNIX Operating System by Maurice Bach; Vijay Mukhi's The 'C' Odyssey: UNIX - Gandhi
STEPS: Refer to student activity flow chart, theory, algorithm, test input, test output
INSTRUCTIONS FOR WRITING JOURNAL:
• Title
• Problem Definition
• Theory
• Algorithm
• Source code
• Compilation steps
• Output
• Conclusion

Theory:

Various File access system calls are described below:


1. write(): The write() system call is used to write data to a file or other object identified
by a file descriptor. The prototype is
#include <unistd.h>
ssize_t write (int fildes, const void *buf, size_t nbyte);
fildes is the file descriptor, buf is the base address of the area of memory that data is
copied from, and nbyte is the amount of data to copy. The return value is the actual
amount of data written; if this differs from nbyte then something has gone wrong.

2. read(): The read() system call is used to read data from a file or other object identified
by a file descriptor. The prototype is
#include <unistd.h>
ssize_t read (int fildes, void *buf, size_t nbyte);
fildes is the descriptor, buf is the base address of the memory area into which the data is
read and nbyte is the maximum amount of data to read. The return value is the actual
amount of data read from the file. The file pointer is advanced by the amount of data read.

3. open (): The open() system call is usually used with two parameters although an extra
parameter can be used under certain circumstances. The prototype is
#include <fcntl.h>
int open (const char *path,int oflag);
The return value is the descriptor, -1 if the file could not be opened. The first parameter is
path name of the file to be opened and the second parameter is the opening mode
specified by bitwise ORing one or more of the following values O_RDONLY,
O_WRONLY, O_RDWR etc.

4. lseek(): The lseek() system call allows programs to manipulate the read/write pointer
directly, providing the facility for direct access to any part of the file. It has three
parameters and the prototype is
#include <sys/types.h>
#include <unistd.h>
off_t lseek (int fildes, off_t offset, int whence);
fildes is the file descriptor, offset is the required new value or alteration to the offset, and
whence has one of the three values SEEK_SET, SEEK_CUR and SEEK_END.

5. fstat(): The fstat() system call obtains the same information about an open file known
by the file descriptor fd. The prototype is
int fstat (int fd, struct stat *sb);

Algorithm:

1. Include <fcntl.h>, <sys/stat.h> and <unistd.h> along with other header files.
2. Accept filename and choice of operation from the user.
3. If the choice is read, open the file in mode O_RDONLY, read and display the
contents and close the file.
4. If the choice is write, open the file in mode O_WRONLY, accept the data to be
written, write it into the file and close the file.
5. If the choice is append, open the file in mode O_RDWR. Using lseek, position the
pointer to the end of the file, accept the data to be written, write it into the file and
close the file.
6. If the choice is check file status, open the file in any mode, use fstat() to get the file
status and display the same.
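A minimal sketch of the append and file-status steps follows; the file name test.c and the
appended text are illustrative choices, and error handling is kept brief.

#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

int main(void)
{
    const char *msg = "I am studying in BE.\n";
    struct stat sb;
    int fd;

    fd = open("test.c", O_RDWR);      /* open for reading and writing */
    if (fd < 0) {
        perror("open");
        return 1;
    }

    lseek(fd, 0, SEEK_END);           /* position the pointer at the end of the file */
    write(fd, msg, strlen(msg));      /* append the data */

    if (fstat(fd, &sb) == 0) {        /* fetch and display the file status */
        printf("Size in bytes: %ld\n", (long) sb.st_size);
        printf("File serial number: %ld\n", (long) sb.st_ino);
    }

    close(fd);
    return 0;
}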

Test Input:
[root@localhost Programs]# gcc fileop.c
[root@localhost Programs]#./a.out
Enter the filename: /root/programs/test.c
Enter choice:

Test Output:

Choice: Write
Enter data: I am student of Pict, Pune.
Data written

Choice: Append
Enter data: I am studying in BE.
Data written

Choice: Read
File contents are:
I am student of Pict, Pune. I am studying in BE.

Choice File Status:


File Status is:
ID of the device containing the file: 2056
File serial number: 133702
Mode of file: 33188
Size in bytes: 80
Last access: Wed Sep 26 22:26:05 2007

Revised On: 22/06/2009
TITLE: Signals
PROBLEM STATEMENT/DEFINITION: Implement an Alarm clock application using signals.
OBJECTIVE:
• To understand the concept of signals.
• To implement an Alarm clock using the SIGALRM signal.
S/W PACKAGES AND HARDWARE APPARATUS USED: Linux Fedora 4; PC with the configuration: Pentium IV 1.7 GHz, 128 MB RAM, 40 GB HDD, 15'' Color Monitor, Keyboard, Mouse
REFERENCES: Advanced Unix Programming by Richard Stevens; Vijay Mukhi's The 'C' Odyssey: UNIX - Gandhi
STEPS: Refer to student activity flow chart, theory, algorithm, test input, test output
INSTRUCTIONS FOR WRITING JOURNAL:
• Title
• Problem Definition
• Theory
• Algorithm
• Source code
• Compilation steps
• Output
• Conclusion

Theory:

Signals, to be short, are various notifications sent to a process in order to notify it of
various "important" events. By their nature, they interrupt whatever the process is doing
at this minute, and force it to handle them immediately. Each signal has an integer
number that represents it (1, 2 and so on), as well as a symbolic name that is usually
defined in the file /usr/include/signal.h or one of the files included by it directly or
indirectly (HUP, INT and so on. Use the command 'kill -l' to see a list of signals
supported by your system).

Each signal may have a signal handler, which is a function that gets called when the
process receives that signal. The function is called in "asynchronous mode", meaning that
nowhere in your program do you have code that calls this function directly. Instead, when
the signal is sent to the process, the operating system stops the execution of the process,
and "forces" it to call the signal handler function. When that signal handler function
returns, the process continues execution from wherever it happened to be before the
signal was received, as if this interruption never occurred.

Note for "hardwarists": If you are familiar with interrupts (you are, right?), signals are
very similar in their behavior. The difference is that while interrupts are sent to the
operating system by the hardware, signals are sent to the process by the operating system,
or by other processes. Note that signals have nothing to do with software interrupts,
which are still sent by the hardware (the CPU itself, in this case).

Signals are usually used by the operating system to notify processes that some event
occurred, without these processes needing to poll for the event. Signals should then be
handled, rather than used to create an event notification mechanism for a specific
application.

When we say that "Signals are being handled", we mean that our program is ready to
handle such signals that the operating system might be sending it (such as signals
notifying that the user asked to terminate it, or that a network connection we tried writing
into, was closed, etc). Failing to properly handle various signals, would likely cause our
application to terminate, when it receives such signals.

The most common way of sending signals to processes is using the keyboard. There are
certain key presses that are interpreted by the system as requests to send signals to the
process with which we are interacting:

Ctrl-C
Pressing this key causes the system to send an INT signal (SIGINT) to the
running process. By default, this signal causes the process to immediately
terminate.

Ctrl-Z
Pressing this key causes the system to send a TSTP signal (SIGTSTP) to the
running process. By default, this signal causes the process to suspend execution.

Ctrl-\
Pressing this key causes the system to send a ABRT signal (SIGABRT) to the
running process. By default, this signal causes the process to immediately
terminate. Note that this redundancy (i.e. Ctrl-\ doing the same as Ctrl-C) gives us
some better flexibility. We'll explain that later on.

Another way of sending signals to processes is done using various commands, usually
internal to the shell:

kill
The kill command accepts two parameters: a signal name (or number), and a
process ID. Usually the syntax for using it goes something like:

kill -<signal> <PID>

For example, in order to send the INT signal to process with PID 5342, type:

kill -INT 5342

This has the same effect as pressing Ctrl-C in the shell that runs that process. If no
signal name or number is specified, the default is to send a TERM signal to the
process, which normally causes its termination, and hence the name of the kill
command.
fg
On most shells, using the 'fg' command will resume execution of the process (that
was suspended with Ctrl-Z), by sending it a CONT signal.

A third way of sending signals to processes is by using the kill system call. This is the
normal way of sending a signal from one process to another. This system call is also used
by the 'kill' command or by the 'fg' command. Here is an example code that causes a
process to suspend its own execution by sending itself the STOP signal:

#include <unistd.h>    /* standard unix functions, like getpid() */
#include <sys/types.h> /* various type definitions, like pid_t */
#include <signal.h>    /* signal name macros, and the kill() prototype */

/* first, find my own process ID */
pid_t my_pid = getpid();

/* now that i got my PID, send myself the STOP signal. */
kill(my_pid, SIGSTOP);

An example of a situation when this code might prove useful, is inside a signal handler
that catches the TSTP signal (Ctrl-Z, remember?) in order to do various tasks before
actually suspending the process. We will see an example of this later on.

Most signals may be caught by the process, but there are a few signals that the process
cannot catch, and cause the process to terminate. For example, the KILL signal (-9 on
all unices I've met so far) is such a signal. This is why you usually see a process being
shut down using this signal if it gets "wild". One process that uses this signal is a system
shutdown process. It first sends a TERM signal to all processes, waits a while, and after
allowing them a "grace period" to shut down cleanly, it kills whichever are left using the
KILL signal.

STOP is also a signal that a process cannot catch, and forces the process's suspension
immediately. This is useful when debugging programs whose behavior depends on
timing. Suppose that process A needs to send some data to process B, and you want to
check some system parameters after the message is sent, but before it is received and
processed by process B. One way to do that would be to send a STOP signal to process B,
thus causing its suspension, and then running process A and waiting until it sends its oh-
so important message to process B. Now you can check whatever you want to, and later
on you can use the CONT signal to continue process B's execution, which will then
receive and process the message sent from process A.

Now, many other signals are catchable, and this includes the famous SEGV and BUS
signals. You probably have seen numerous occasions when a program has exited with a
message such as 'Segmentation Violation - Core Dumped', or 'Bus Error - core dumped'.
In the first occasion, a SEGV signal was sent to your program due to accessing an illegal
memory address. In the second case, a BUS signal was sent to your program, due to
accessing a memory address with invalid alignment. In both cases, it is possible to catch
these signals in order to do some cleanup - kill child processes, perhaps remove
temporary files, etc. Although in both cases, the memory used by your process is most
likely corrupt, it's probable that only a small part of it was corrupt, so cleanup is still
usually possible.

Algorithm:

1. Write the function to be invoked on the receipt of the signal.
2. Use the "signal" system call with SIGALRM in the signum field and the address
   of the function written in (1) as the function argument, to register the user
   function as the handler.
3. Accept the number of seconds after which to sound the alarm.
4. Use the alarm function with the number of seconds accepted in (3) to invoke the
   function written in (1) after the specified time interval.
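A minimal sketch of the alarm clock follows; the handler name and the one-second
countdown loop are illustrative choices. (Strictly speaking, calling printf inside a signal
handler is not guaranteed safe, but it is adequate for this exercise.)

#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

void on_alarm(int signum)                 /* invoked when SIGALRM is delivered */
{
    printf("******** ALARM ********\n");
    exit(0);
}

int main(void)
{
    int seconds, i;

    printf("Enter the alarm interval in seconds: ");
    scanf("%d", &seconds);

    signal(SIGALRM, on_alarm);            /* register the user function */
    alarm(seconds);                       /* deliver SIGALRM after the interval */

    for (i = 1; ; i++) {                  /* count off the elapsed seconds */
        printf("%d\n", i);
        sleep(1);
    }
    return 0;                             /* never reached */
}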

Test Input:
[root@localhost Programs]# gcc -o signal signals.c
[root@localhost Programs]# ./signal
Enter the alarm interval in seconds: 5

Test Output:
1
2
3
4
******** ALARM ********

Revised On: 22/06/2009
TITLE: Multithreading
PROBLEM STATEMENT/DEFINITION: Create a program which has three threads: 1. Display Seconds 2. Display Minutes 3. Display Hours, and synchronize them.
OBJECTIVE:
• To understand the concept of multithreading.
• To implement a digital clock by creating three threads and joining them.
S/W PACKAGES AND HARDWARE APPARATUS USED: Linux Fedora 4; PC with the configuration: Pentium IV 1.7 GHz, 128 MB RAM, 40 GB HDD, 15'' Color Monitor, Keyboard, Mouse
REFERENCES: Advanced Unix Programming by Richard Stevens; The Design of the UNIX Operating System by Maurice Bach
STEPS: Refer to student activity flow chart, theory, algorithm, test input, test output
INSTRUCTIONS FOR WRITING JOURNAL:
• Title
• Problem Definition
• Theory
• Algorithm
• Source code
• Compilation steps
• Output
• Conclusion

Theory:

We can think of a thread as basically a lightweight process. In order to understand this,
let us consider the two main characteristics of a process:

Unit of resource ownership -- A process is allocated:
• a virtual address space to hold the process image
• control of some resources (files, I/O devices...)

Unit of dispatching -- A process is an execution path through one or more programs:
• execution may be interleaved with other processes
• the process has an execution state and a dispatching priority
If we treat these two characteristics as being independent (as does modern OS theory):

• The unit of resource ownership is usually referred to as a process or task.
Processes have:
o a virtual address space which holds the process image.
o protected access to processors, other processes, files, and I/O resources.
• The unit of dispatching is usually referred to as a thread or a lightweight process.
Thus a thread:
o Has an execution state (running, ready, etc.)
o Saves thread context when not running
o Has an execution stack and some per-thread static storage for local
variables
o Has access to the memory address space and resources of its process --
all threads of a process share this: when one thread alters a (non-private) memory
item, all other threads (of the process) see the change, and a file opened by one
thread is available to the others.

Benefits of Threads vs Processes

If implemented correctly, then threads have some advantages over (multi) processes. They
take:

• Less time to create a new thread than a process, because the newly created thread
uses the current process address space.
• Less time to terminate a thread than a process.
• Less time to switch between two threads within the same process, partly because
the threads share the current process address space.

• Less communication overheads -- communicating between the threads of one
process is simple because the threads share everything: address space, in
particular. So, data produced by one thread is immediately available to all the
other threads.

Example: A file server on a LAN

• It needs to handle several file requests over a short period
• Hence it is more efficient to create (and destroy) a thread for each request
• Multiple threads can possibly be executing simultaneously on different processors

Thread Levels

There are two broad categories of thread implementation:

• User-Level Threads -- Thread Libraries.
• Kernel-Level Threads -- System Calls.

There are merits to both, in fact some OSs allow access to both levels (e.g. Solaris).
User-Level Threads (ULT)

At this level, the kernel is not aware of the existence of threads -- all thread management
is done by the application using a thread library. Thread switching does not require
kernel mode privileges (no mode switch), and scheduling is application specific.

Kernel activity for ULTs:

• The kernel is not aware of thread activity but it is still managing process activity
• When a thread makes a system call, the whole process will be blocked, but for the
thread library that thread is still in the running state
• So thread states are independent of process states

Advantages and inconveniences of ULT

Advantages:

• Thread switching does not involve the kernel -- no mode switching
• Scheduling can be application specific -- choose the best algorithm.
• ULTs can run on any OS -- only needs a thread library

Disadvantages:

• Most system calls are blocking and the kernel blocks processes -- So all threads
within the process will be blocked

• The kernel can only assign processes to processors -- Two threads within the
same process cannot run simultaneously on two processors

Kernel-Level Threads (KLT)

At this level, all thread management is done by the kernel. There is no thread library, but
an API (system calls) to the kernel thread facility exists. The kernel maintains context
information for the process and the threads; switching between threads requires the
kernel. Scheduling is performed on a thread basis.

Advantages and inconveniences of KLT

Advantages

• the kernel can simultaneously schedule many threads of the same process on
many processors; blocking is done on a thread level
• kernel routines can be multithreaded

Disadvantages:

Thread switching within the same process involves the kernel; e.g., if we have 2 mode
switches per thread switch, this results in a significant slowdown.

Algorithm:

1. Create three threads pertaining to the display of hours, minutes and seconds.
2. Set the variables hh, mm and ss to the values corresponding to the local time.
3. In each thread-handling function, invoke the sleep function with an argument
   that depends on the type of thread (1, 60 or 3600 seconds).
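A minimal sketch of the three-thread clock follows, assuming POSIX threads (compile
with -lpthread); the function names are illustrative choices, and the seconds thread does
the printing.

#include <pthread.h>
#include <stdio.h>
#include <time.h>
#include <unistd.h>

int hh, mm, ss;                  /* shared clock variables */

void *run_seconds(void *arg)
{
    for (;;) {
        sleep(1);
        ss = (ss + 1) % 60;
        printf("%d:%d:%d\n", hh, mm, ss);
    }
    return NULL;
}

void *run_minutes(void *arg)
{
    for (;;) {
        sleep(60);
        mm = (mm + 1) % 60;
    }
    return NULL;
}

void *run_hours(void *arg)
{
    for (;;) {
        sleep(3600);
        hh = (hh + 1) % 24;
    }
    return NULL;
}

int main(void)
{
    pthread_t t1, t2, t3;
    time_t now = time(NULL);
    struct tm *t = localtime(&now);     /* seed hh, mm, ss from the local time */

    hh = t->tm_hour;
    mm = t->tm_min;
    ss = t->tm_sec;

    pthread_create(&t1, NULL, run_seconds, NULL);
    pthread_create(&t2, NULL, run_minutes, NULL);
    pthread_create(&t3, NULL, run_hours, NULL);

    pthread_join(t1, NULL);             /* join so main does not exit */
    pthread_join(t2, NULL);
    pthread_join(t3, NULL);
    return 0;
}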

Test Input:
[root@localhost Programs]# gcc gthread.c -lpthread
[root@localhost Programs]# ./a.out

Test Output:

15:2:17
15:2:18
15:2:19
15:2:20
15:2:21

Revised On: 22/06/2009
TITLE: Insertion of a Module in the Kernel
PROBLEM STATEMENT/DEFINITION: Write and insert a module into the Linux Kernel.
OBJECTIVE: To implement a program by writing a module and inserting it into the kernel by using a make file.
S/W PACKAGES AND HARDWARE APPARATUS USED: Linux Fedora 4; PC with the configuration: Pentium IV 1.7 GHz, 128 MB RAM, 40 GB HDD, 15'' Color Monitor, Keyboard, Mouse
REFERENCES: Linux Kernel Programming by Michael Beck
STEPS: Refer to student activity flow chart, theory, algorithm, test input, test output
INSTRUCTIONS FOR WRITING JOURNAL:
• Title
• Problem Definition
• Theory
• Algorithm
• Source code
• Compilation steps
• Output
• Conclusion

Theory:

What exactly is a kernel module? Modules are pieces of code that can be loaded and
unloaded into the kernel upon demand. They extend the functionality of the kernel
without the need to reboot the system. For example, one type of module is the device
driver, which allows the kernel to access hardware connected to the system. Without
modules, we would have to build monolithic kernels and add new functionality directly
into the kernel image. Besides having larger kernels, this has the disadvantage of
requiring us to rebuild and reboot the kernel every time we want new functionality.

You can see what modules are already loaded into the kernel by running lsmod, which
gets its information by reading the file /proc/modules.

How do these modules find their way into the kernel? When the kernel needs a feature
that is not resident in the kernel, the kernel module daemon kmod execs modprobe to
load the module in. modprobe is passed a string in one of two forms:

// modprobe is the command used to load a single module into the kernel.

// modprobe will automatically load all base modules needed in a module stack, as described
by the dependency file modules.dep. If the loading of one of these modules fails, the whole
current stack of modules loaded in the current session will be unloaded automatically.

• A module name like softdog or ppp.


• A more generic identifier like char-major-10-30.

If modprobe is handed a generic identifier, it first looks for that string in the file
/etc/modprobe.conf. If it finds an alias line like:

alias char-major-10-30 softdog

it knows that the generic identifier refers to the module softdog.ko.

Next, modprobe looks through the file /lib/modules/version/modules.dep, to see if other
modules must be loaded before the requested module may be loaded. This file is created
by depmod -a and contains module dependencies. For example, msdos.ko requires the
fat.ko module to be already loaded into the kernel. The requested module has a
dependency on another module if the other module defines symbols (variables or
functions) that the requested module uses.

Lastly, modprobe uses insmod to first load any prerequisite modules into the kernel, and
then the requested module. modprobe directs insmod to /lib/modules/version/, the
standard directory for modules. insmod is intended to be fairly dumb about the location
of modules, whereas modprobe is aware of the default location of modules, knows how to
figure out the dependencies and load the modules in the right order. So for example, if
you wanted to load the msdos module, you'd have to either run:

insmod /lib/modules/2.6.11/kernel/fs/fat/fat.ko
insmod /lib/modules/2.6.11/kernel/fs/msdos/msdos.ko

or:

modprobe msdos

What we've seen here is: insmod requires you to pass it the full pathname and to insert
the modules in the right order, while modprobe just takes the name, without any
extension, and figures out all it needs to know by parsing
/lib/modules/version/modules.dep.

Linux distros provide modprobe, insmod and depmod as a package called module-init-
tools. In previous versions that package was called modutils. Some distros also set up
some wrappers that allow both packages to be installed in parallel and do the right thing
in order to be able to deal with 2.4 and 2.6 kernels. Users should not need to care about
the details, as long as they're running recent versions of those tools.

Algorithm:

1. Write a .c program which contains the functionality that is to be implemented.
2. Write a make file.
3. Go to the directory where both these files are stored and run the make command.
   The .ko file will then be created.
4. To insert the module, use insmod.
5. To remove it, use rmmod.
6. To check the messages, see the file /var/log/messages or use the dmesg command.

Test Input:
.c file (a minimal sketch follows)
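The following is a minimal sketch of such a .c file, assuming a 2.6-series kernel with its
build tree installed; the file name hello.c and the printed messages are illustrative
choices, not part of the assignment.

#include <linux/init.h>
#include <linux/kernel.h>
#include <linux/module.h>

MODULE_LICENSE("GPL");

static int __init hello_init(void)
{
    printk(KERN_INFO "hello: module inserted\n");   /* visible via dmesg */
    return 0;                                       /* 0 means successful load */
}

static void __exit hello_exit(void)
{
    printk(KERN_INFO "hello: module removed\n");
}

module_init(hello_init);
module_exit(hello_exit);

The accompanying make file then typically contains the single object line
obj-m += hello.o together with a rule that runs
make -C /lib/modules/$(shell uname -r)/build M=$(PWD) modules,
so that the kernel build system compiles the module into hello.ko.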

Revised On: 22/06/2009
TITLE: DIVIDE AND CONQUER
PROBLEM STATEMENT/DEFINITION: Implement using the Divide and Conquer strategy (any one):
1. Merge Sort and Randomized Quicksort (recursive and non-recursive) and compare the recursive and non-recursive versions
2. Multiplication of 2 'n' bit numbers where 'n' is a power of 2
OBJECTIVE:
• To understand the divide and conquer algorithmic strategy
• To implement searching and sorting using the divide and conquer strategy and an application of divide and conquer
• Analyze the above algorithms and verify with execution of the program on different inputs
S/W PACKAGES AND HARDWARE APPARATUS USED: Windows 2000, Turbo C++; PC with the configuration: Pentium IV 1.7 GHz, 128 MB RAM, 40 GB HDD, 15'' Color Monitor, Keyboard, Mouse
REFERENCES:
• Fundamentals of Algorithms by Brassard
• Fundamentals of Algorithms by Horowitz / Sahani, Galgotia
• Introduction to Algorithms by Cormen / Charles, PHI
STEPS: Refer to student activity flow chart, theory, algorithm, test input, test output
INSTRUCTIONS FOR WRITING JOURNAL:
• Title
• Problem Definition
• Theory (covering the concept of Divide and Conquer)
• Algorithms
• Analysis of the above for time and space complexity
• Program code
• Output for different inputs comparing time complexity
• Conclusion

Theory

Divide and conquer Strategy

In general, divide and conquer is based on the following idea. The whole problem we
want to solve may be too big to understand or solve at once. We break it up into smaller
pieces, solve the pieces separately, and combine the separate pieces together.
We analyze this in some generality: suppose we break the problem into a pieces, each of
size n/b, and merging the results takes time f(n). (In the heapification example a=b=2 and
f(n)=O(log n), but it will not always be true that a=b -- sometimes the pieces will overlap.)
The easiest way to understand what's going on here is to draw a tree with nodes
corresponding to subproblems (labeled with the size of the sub-problem)
                 n
          /      |      \
        n/b     n/b     n/b
        /|\     /|\     /|\
         .       .       .
         .       .       .
         .       .       .
For simplicity, let's assume n is a power of b, and that the recursion stops when n is 1.
Notice that the size of a node depends only on its level:
size(i) = n/(b^i).
What is the time taken by a node at level i?
time(i) = f(n/b^i)
How many levels can we have before we get down to n=1? For the bottom level, n/b^i=1,
so n=b^i and i=(log n)/(log b). How many items at level i? a^i. So putting these together
we have

T(n) = sum_{i=0}^{(log n)/(log b)} a^i f(n/b^i)

This looks messy, but it's not too bad. There are only a few terms (logarithmically many)
and often the sum is dominated by the terms at one end (f(n)) or the other (n^(log a/log
b)). In fact, you will generally only be a logarithmic factor away from the truth if you
approximate the solution by the sum of these two, O(f(n) + n^(log a/log b)).

Let's use this to analyze heapification. By plugging in parameters a=b=2, f(n)=log n, we
get

T(n) = 2 sum_{i=0}^{log n} 2^i log(n/2^i)

Rewriting the same terms in the opposite order, this turns out to equal

T(n) = 2 sum_{i=0}^{log n} (n/2^i) log(2^i)
     = 2n sum_{i=0}^{log n} i/2^i
    <= 2n sum_{i=0}^{infinity} i/2^i
     = 4n

So heapification takes at most 4n comparisons and heapsort takes at most n log n + 4n.
(There's an n log n - 1.44n lower bound so we're only within O(n) of the absolute best
possible.)

Merge sort

According to Knuth, merge sort was one of the earliest sorting algorithms, invented by
John von Neumann in 1945.
merge(L1,L2)
{
list X = empty
while (neither L1 nor L2 empty)
{
compare first items of L1 & L2
remove smaller of the two from its list
add to end of X
}
catenate remaining list to end of X
return X
}
Time analysis: in the worst case both lists become empty at about the same time, so
everything has to be compared. Each comparison adds one item to X, so the worst case is
|X|-1 = |L1|+|L2|-1 comparisons. One can do a little better sometimes, e.g. if L1 is
smaller than most of L2.

Once we know how to combine two sorted lists, we can construct a divide and conquer
sorting algorithm that simply divides the list in two, sorts the two recursively, and merges
the results:
merge sort(L)
{
if (length(L) < 2) return L
else {
split L into lists L1 and L2, each of n/2 elements
L1 = merge sort(L1)
L2 = merge sort(L2)
return merge(L1,L2)
}
}
This is simpler than heapsort (so easier to program) and works pretty well. How many
comparisons does it use? We can use the analysis of the merge step to write down a
recurrence:
C(n) <= n-1 + 2C(n/2)
As you saw in homework 1.31, for n = power of 2, the solution to this is n log n - n + 1.
For other n, it's similar but more complicated. To prove this (at least the power of 2
version), you can use the formula above to produce

C(n) <= sum_{i=0}^{log n} 2^i (n/2^i - 1)
      = sum_{i=0}^{log n} (n - 2^i)
      = n(log n + 1) - (2n - 1)
      = n log n - n + 1
So the number of comparisons is even less than heapsort.

Quicksort

Quicksort, invented by Tony Hoare, follows a very similar divide and conquer idea:
partition into two lists and put them back together again. It does more work on the divide
side, and less on the combine side.
Quicksort uses a simple idea: pick one object x from the list, and split the rest into those
before x and those after x.
quicksort(L)
{
if (length(L) < 2) return L

else {
pick some x in L
L1 = { y in L : y < x }
L2 = { y in L : y > x }
L3 = { y in L : y = x }
quicksort(L1)
quicksort(L2)
return concatenation of L1, L3, and L2
}
}
(We don't need to sort L3 because everything in it is equal).
Quicksort analysis
The partition step of quicksort takes n-1 comparisons. So we can write a recurrence for
the total number of comparisons done by quicksort:
C(n) = n-1 + C(a) + C(b)
where a and b are the sizes of L1 and L2, generally satisfying a+b=n-1. In the worst case,
we might pick x to be the minimum element in L. Then a=0, b=n-1, and the recurrence
simplifies to C(n)=n-1 + C(n-1) = O(n^2). So this seems like a very bad algorithm.
Why do we call it quicksort? How can we make it less bad? Randomization!
Suppose we pick x=a[k] where k is chosen randomly. Then any value of a is equally
likely from 0 to n-1. To do average case analysis, we write out the sum over possible
random choices of the probability of that choice times the time for that choice. Here the
choices are the values of k, the probabilities are all 1/n, and the times can be described by
formulas involving the time for the recursive calls to the algorithm. So average case
analysis of a randomized algorithm gives a randomized recurrence:

C(n) = sum_{a=0}^{n-1} (1/n)[n - 1 + C(a) + C(n-a-1)]

To simplify the recurrence, note that if C(a) occurs one place in the sum, the same
number will occur as C(n-a-1) in another term -- we rearrange the sum to group the two
together. We can also take the (n-1) parts out of the sum, since the sum of n copies of
1/n times (n-1) is just n-1.

C(n) = n - 1 + sum_{a=0}^{n-1} (2/n) C(a)
The book gives two proofs that this is O(n log n). Of these, induction is easier.
One useful idea here: we want to prove f(n) is O(g(n)). The O() hides too much
information; instead we need to prove f(n) <= a g(n), but we don't know what value a
should take. We work it out with a left as a variable, then use the analysis to see what
values of a work.
We have C(1) = 0 = a (1 log 1) for all a. Suppose C(i) <= a i log i for some a, all i<n.
Then

C(n) = n-1 + sum_{i=2}^{n-1} (2/n) C(i)
     <= n-1 + sum_{i=2}^{n-1} (2/n) a i log i
     = n-1 + (2a/n) sum_{i=2}^{n-1} i log i
     <= n-1 + (2a/n) integral_{2}^{n} i log i di
     = n-1 + (2a/n)(n^2 log n / 2 - n^2/4 - 2 ln 2 + 1)
     = n-1 + a n log n - an/2 - O(1)

and this will work if n-1 < an/2, and in particular if a=2. So we can conclude that
C(n) <= 2 n log n.
Note that this is worse than either merge sort or heapsort, and requires a random number
generator to avoid being really bad. But it's pretty commonly used, and can be tuned in
various ways to work better. (For instance, let x be the median of three randomly chosen
values rather than just one value.)
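A minimal sketch of recursive randomized quicksort follows; it uses a standard in-place
two-way partition around a randomly chosen pivot rather than the three-list partition of
the pseudocode above, which keeps the code short.

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

void swap(int *a, int *b) { int t = *a; *a = *b; *b = t; }

void quicksort(int a[], int lo, int hi)
{
    int k, x, i, j;

    if (lo >= hi) return;

    k = lo + rand() % (hi - lo + 1);   /* pick the pivot index at random */
    swap(&a[k], &a[hi]);               /* park the pivot at the end */

    x = a[hi];
    i = lo;
    for (j = lo; j < hi; j++)          /* partition around x */
        if (a[j] < x)
            swap(&a[i++], &a[j]);
    swap(&a[i], &a[hi]);               /* move the pivot into its final place */

    quicksort(a, lo, i - 1);
    quicksort(a, i + 1, hi);
}

int main(void)
{
    int arr[] = {12, 2, 3, 1, 6};      /* the test input shown below */
    int n = sizeof arr / sizeof arr[0];
    int i;

    srand((unsigned) time(NULL));
    quicksort(arr, 0, n - 1);
    for (i = 0; i < n; i++)
        printf("%d ", arr[i]);
    printf("\n");
    return 0;
}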

Algorithms :

1. Divide: Divide the input into sub-sets, unless the input size is small enough to solve
easily.
2. Recur: Recursively solve the sub-sets problems.
3. Conquer: Merge the solution from the sub-problems.

Merge Sort

Assume the problem is a sequence S of n unordered objects and we want to return S
sorted. Using the Divide and Conquer design pattern:
1. Divide: If S has one element then trivial to solve and return S. Otherwise divide S into
two sequences S1 and S2 with ceiling(n/2) and floor(n/2) elements respectively.
2. Recur: Recursively divide S1 and S2.
3. Conquer: Put back the elements into S by merging the sorted sequences S1 and S2 into
a sorted sequence.

Algorithm merge(S1, S2, S):


// input: ordered sequences S1 and S2
// output: ordered sequence S containing S1 and S2 elements.
while S1 is not empty and S2 is not empty do
if S1.first().element() <= S2.first().element() then
S.insertLast(S1.remove(S1.first()))
else
S.insertLast(S2.remove(S2.first()))
while S1 is not empty do
S.insertLast(S1.remove(S1.first()))
while S2 is not empty do
S.insertLast(S2.remove(S2.first()))
Note the time is O(n1 + n2)
Cost
Assume that inserting into the first and last position is O(1); this is true for a circular
array or a linked list.

Let n be the size of S, the initial unsorted sequence. Let v be a node of the merge-sort
tree, and i the depth of the node. Then the size of the sequence at the node is n/2^i. The
time spent at both the divide and conquer phases is proportional to the size of the
sequence at the node, O(n/2^i). Note that at each depth there are 2^i nodes. So the total
time spent at the ith depth is (time at each node, O(n/2^i)) * (number of nodes, 2^i),
equal to n. Therefore the total time is (time at each depth, n) * (height of the merge-sort
tree, lg n), equal to O(n lg n).
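A minimal sketch of recursive merge sort on an array follows; the temporary buffer
allocated in each merge is an implementation convenience, not part of the analysis above.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

void merge(int a[], int lo, int mid, int hi)
{
    int n = hi - lo + 1;
    int *tmp = malloc(n * sizeof(int));
    int i = lo, j = mid + 1, k = 0;

    while (i <= mid && j <= hi)             /* take the smaller head element */
        tmp[k++] = (a[i] <= a[j]) ? a[i++] : a[j++];
    while (i <= mid) tmp[k++] = a[i++];     /* copy any leftovers */
    while (j <= hi)  tmp[k++] = a[j++];

    memcpy(a + lo, tmp, n * sizeof(int));
    free(tmp);
}

void merge_sort(int a[], int lo, int hi)
{
    int mid;
    if (lo >= hi) return;                   /* one element: already sorted */
    mid = lo + (hi - lo) / 2;               /* divide */
    merge_sort(a, lo, mid);                 /* recur on both halves */
    merge_sort(a, mid + 1, hi);
    merge(a, lo, mid, hi);                  /* conquer: merge the sorted halves */
}

int main(void)
{
    int arr[] = {23, -4, 2, 1, 7, 34, 0};   /* the merge sort test input below */
    int n = sizeof arr / sizeof arr[0];
    int i;

    merge_sort(arr, 0, n - 1);
    for (i = 0; i < n; i++)
        printf("%d ", arr[i]);
    printf("\n");
    return 0;
}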

Test input

Recursive Quick Sort

Enter the number of elements : 5

Now enter the elements :-


Enter arr[0] : 12
Enter arr[1] : 2
Enter arr[2] : 3
Enter arr[3] : 1
Enter arr[4] : 6

The Sorted Array is :


1 2 3 6 12

Non Recursive Quick Sort

Enter the number of elements : 6


Now enter the elements :-
Enter arr[0] : 2
Enter arr[1] : 4
Enter arr[2] : 1
Enter arr[3] : -1
Enter arr[4] : 0
Enter arr[5] : 3

The Sorted Array is :


-1 0 1 2 3 4

Merge Sort
Enter the number of elements : 7

Now enter the elements :-
Enter arr[0] : 23
Enter arr[1] : -4
Enter arr[2] : 2
Enter arr[3] : 1
Enter arr[4] : 7
Enter arr[5] : 34
Enter arr[6] : 0

The Sorted Array is :


-4 0 1 2 7 23 34
Revised On: 22/06/2009

TITLE: GREEDY METHOD
PROBLEM STATEMENT/DEFINITION: Implement the following using the Greedy Method: Minimal spanning tree / Job scheduling problem.
OBJECTIVE:
• To understand the Greedy algorithmic strategy
• To map the above problems to the Greedy Method
• Implement the algorithms
• Analyze the above algorithms
S/W PACKAGES AND HARDWARE APPARATUS USED: Windows 2000, Turbo C++; PC with the configuration: Pentium IV 1.7 GHz, 128 MB RAM, 40 GB HDD, 15'' Color Monitor, Keyboard, Mouse
REFERENCES:
• Fundamentals of Algorithms by Brassard
• Fundamentals of Algorithms by Horowitz / Sahani, Galgotia
• Introduction to Algorithms by Cormen / Charles, PHI
STEPS: Refer to student activity flow chart
INSTRUCTIONS FOR WRITING JOURNAL:
• Title
• Problem Definition
• Theory (covering the concept of the Greedy Method)
• Map the given problem to the greedy method
• Algorithms
• Analysis of the above for time and space complexity
• Program code
• Output for different inputs
• Conclusion

Theory:

Greedy Method
In an optimisation problem, one will have, in the context of greedy algorithms, the
following:

• A collection (set, list, etc) of candidates, e.g. nodes, edges in a graph, etc.
• A set of candidates which have already been `used'.
• A predicate (solution) to test whether a given set of candidates give a solution
(not necessarily optimal).
• A predicate (feasible) to test if a set of candidates can be extended to a (not
necessarily optimal) solution.
• A selection function (select) which chooses some candidate which has not yet
been used.
• An objective function which assigns a value to a solution.

An optimisation problem involves finding a subset, S, from a collection of candidates, C;
the subset, S, must satisfy some specified criteria, i.e. be a solution and be such that the
objective function is optimised by S. `Optimised' may mean minimised or maximised,
depending on the precise problem being solved. Greedy methods are distinguished by the
fact that the selection function assigns a numerical value to each candidate, x, and
chooses that candidate for which:

SELECT( x ) is largest
or SELECT( x ) is smallest

All Greedy Algorithms have exactly the same general form. A Greedy Algorithm for a
particular problem is specified by describing the predicates `solution' and `feasible'; and
the selection function `select'.

Algorithm:

function select (C : candidate_set) return candidate;
function solution (S : candidate_set) return boolean;
function feasible (S : candidate_set) return boolean;
--***************************************************
function greedy (C : candidate_set) return candidate_set is
x : candidate;
S : candidate_set;
begin
S := {};
while (not solution(S)) and C /= {} loop
x := select( C );
C := C - {x};
if feasible( S union {x} ) then
S := S union { x };
end if;
end loop;
if solution( S ) then
return S;
else
return {}; -- no solution found
end if;
end greedy;

Job Sequencing Problem

We have n jobs to execute, each one of which takes a unit time to process. At any time
instant we can do only one job. Doing job i earns a profit pi. The deadline for job i is di.
Suppose n = 4; p = [50, 10, 15, 30]; d = [2, 1, 2, 1]. It should be clear that we can process
no more than two jobs by their respective deadlines. The set of feasible sequences are:
Sequence   Profit
1          50
2          10
3          15
4          30
1, 3       65
2, 1       60
2, 3       25
3, 1       65
4, 1       80  (optimal)
4, 3       45

A set of jobs is feasible if there is a sequence that is feasible for this set. The greedy
method considers jobs in decreasing order of their profits. If the inclusion of the next job
does not affect feasibility, we include it; else we do not include the job that was just
considered. It can be shown that this algorithm provides an optimal solution. We also
need to consider the implementation that is most efficient.
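A minimal sketch of this greedy selection follows, with the jobs of the example above
already sorted in decreasing order of profit; the simple O(n^2) slot search is an
illustrative choice.

#include <stdio.h>

#define N 4

int main(void)
{
    /* jobs sorted in decreasing order of profit; from the example:
       p = [50, 30, 15, 10], d = [2, 1, 2, 1] */
    int p[N] = {50, 30, 15, 10};
    int d[N] = {2, 1, 2, 1};
    int slot[N] = {0};               /* slot[t] != 0 when time slot t+1 is taken */
    int profit = 0;
    int i, t;

    for (i = 0; i < N; i++) {
        /* place job i in the latest free slot on or before its deadline */
        for (t = d[i] - 1; t >= 0; t--) {
            if (!slot[t]) {
                slot[t] = 1;
                profit += p[i];
                printf("Job with profit %d scheduled at time %d\n", p[i], t + 1);
                break;
            }
        }
    }
    printf("Total profit: %d\n", profit);   /* 80, matching the optimal sequence */
    return 0;
}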

Test Input
Enter the number of JObs : 5

Job 1 :
Enter Deadline d[1] : 3
Enter Profit p[1] : 1

Job 2 :
Enter Deadline d[2] : 3
Enter Profit p[2] : 5

Job 3 :
Enter Deadline d[3] : 1
Enter Profit p[3] : 10

Job 4 :
Enter Deadline d[4] : 2
Enter Profit p[4] : 15

Job 5 :
Enter Deadline d[5] : 2
Enter Profit p[5] : 20

The Input Job Details Sorted in Decreasing Order of Profits :-

JOB Deadline Profit

1 2 20
2 2 15
3 1 10
4 3 5
5 3 1

The no of Jobs selected is : 3

The Sequence of Jobs and their Profits :-

JOB Profit

1 20
2 15
4 5

The Total Profit earned is : 40

Revised on: 22/06/2009

TITLE: MULTISTAGE GRAPH PROBLEM
PROBLEM STATEMENT/DEFINITION: Finding the shortest path for the multistage graph problem (single source shortest path and all pairs shortest path).
OBJECTIVE:
• To understand the dynamic programming algorithmic strategy
• To implement the above problem
• Analyze the above algorithms
S/W PACKAGES AND HARDWARE APPARATUS USED: Windows 2000, Turbo C++; PC with the configuration: Pentium IV 1.7 GHz, 128 MB RAM, 40 GB HDD, 15'' Color Monitor, Keyboard, Mouse
REFERENCES:
• Fundamentals of Algorithms by Brassard
• Fundamentals of Algorithms by Horowitz / Sahani, Galgotia
• Introduction to Algorithms by Cormen / Charles, PHI
STEPS: Refer to student activity flow chart, theory, algorithm, test input, test output
INSTRUCTIONS FOR WRITING JOURNAL:
• Title
• Problem Definition
• Theory (covering the concept of Dynamic Programming)
• State space formulation
• Algorithms
• Analysis of the above for time and space complexity
• Program code
• Output for different inputs
• Conclusion

Theory:

Shortest Path Problem

In graph theory, the shortest path problem is the problem of finding a path between two
vertices such that the sum of the weights of its constituent edges is minimized. An
example is finding the quickest way to get from one location to another on a road map; in
this case, the vertices represent locations and the edges represent segments of road and
are weighted by the time needed to travel that segment.

Formally, given a weighted graph (that is, a set V of vertices, a set E of edges, and a real-
valued weight function f : E → R), and one element v of V, find a path P from v to each
v' of V so that

sum_{e in P} f(e)

is minimal among all paths connecting v to v'.

Sometimes it is called the single-pair shortest path problem, to distinguish it from the
following generalizations:

The single-source shortest path problem is a more general problem, in which we have
to find shortest paths from a source vertex v to all other vertices in the graph.The all-
pairs shortest path problem is an even more general problem, in which we have to find
shortest paths between every pair of vertices v, v' in the graph.Both these generalizations
have significantly more performant algorithms in practice than simply running a single-
pair shortest path algorithm on all relevant pairs of vertices.

Multistage Graphs

A multistage graph is a graph

G=(V,E) with V partitioned into K >= 2 disjoint subsets such that if (a,b) is in E,
then a is in Vi and b is in Vi+1 for some subsets in the partition; and |V1| = |VK| = 1.
The vertex s in V1 is called the source; the vertex t in VK is called the sink.
G is usually assumed to be a weighted graph.
The cost of a path from node v to node w is the sum of the costs of the edges in the path.
The "multistage graph problem" is to find the minimum cost path from s to t.
[Cf. the "network flow problem".]
Each set Vi is called a stage in the graph.
Consider the resource allocation problem:
Given n units of resources to be allocated to k projects.
For 1 <= i <= k, 0 <= j <= n,

P(i,j) = profit obtained by allocating "j" units of the resource to project i.


Transform this to instance of "multistage graph problem".
Create a multistage graph:
V = {s}, and denote s = V(1,0) -- read: we are at node 1, having allocated 0
units of the resource.
Stages 1 to k are such that stage i consists of the set:
{ V(i+1,j) }, j = 0 .. n
[we could denote the vertices in this set as v(i+1,j), or could instead call them
vj of set Vi]
The edges are weighted with C(i,j) = -P(i,j) [the negative of the profit] to make
it a minimization problem.

Dynamic Programming solution:

Let path(i,j) be some specification of the minimal path from vertex j in set i to
vertex t; C(i,j) is the cost of this path; c(j,t) is the weight of the edge from j to t.

C(i,j) = min { c(j,l) + C(i+1,l) : l in Vi+1 and (j,l) in E }

To write a simple algorithm, assign numbers to the vertices so those in stage Vi have
lower number those in stage Vi+1.

int[] MStageForward(Graph G)
{
// returns vector of vertices to follow through the graph
// let c[i][j] be the cost matrix of G
int n = G.n (number of nodes);
int k = G.k (number of stages);
float[] C = new float[n];

int[] D = new int[n];
int[] P = new int[k];
for (i = 1 to n) C[i] = 0.0;
for j = n-1 to 1 by -1 {
r = vertex such that (j,r) in G.E and c[j][r]+C[r] is minimum
C[j] = c[j][r]+C[r];
D[j] = r;
}
P[1] = 1; P[k] = n;
for j = 2 to k-1 {
P[j] = D[P[j-1]];
}
return P;
}

ALL PAIRS SHORTEST PATHS

Enter the no of Nodes : 3

Vertex 1
Enter the no of edges from Vertex 1 : 2

Enter the second vertex of the Edge : 2


Enter the weight of the Edge : 4
Enter the second vertex of the Edge : 3
Enter the weight of the Edge : 11

Vertex 2
Enter the no of edges from Vertex 2 : 2

Enter the second vertex of the Edge : 1


Enter the weight of the Edge : 6
Enter the second vertex of the Edge : 3
Enter the weight of the Edge : 2

Vertex 3
Enter the no of edges from Vertex 3 : 1

Enter the second vertex of the Edge : 1


Enter the weight of the Edge : 3

A(0) | 1    2    3
------------------
 1   | 0    4    11
 2   | 6    0    2
 3   | 3    999  0

A(1) | 1    2    3
------------------
 1   | 0    4    11
 2   | 6    0    2
 3   | 3    7    0

A(2) | 1    2    3
------------------
 1   | 0    4    6
 2   | 6    0    2
 3   | 3    7    0

A(3) | 1    2    3
------------------
 1   | 0    4    6
 2   | 5    0    2
 3   | 3    7    0

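The matrices A(0)..A(3) above follow the Floyd-Warshall pattern, where A(k) allows
the first k vertices as intermediates; a minimal sketch of that computation follows,
assuming 999 stands for "no edge" as in the printed output.

#include <stdio.h>

#define N   3
#define INF 999

int main(void)
{
    int a[N][N] = {          /* A(0): the direct edge weights entered above */
        {0,   4,   11},
        {6,   0,   2},
        {3,   INF, 0}
    };
    int i, j, k;

    for (k = 0; k < N; k++)          /* allow vertex k+1 as an intermediate */
        for (i = 0; i < N; i++)
            for (j = 0; j < N; j++)
                if (a[i][k] + a[k][j] < a[i][j])
                    a[i][j] = a[i][k] + a[k][j];

    for (i = 0; i < N; i++) {        /* prints A(3), the final distance matrix */
        for (j = 0; j < N; j++)
            printf("%5d", a[i][j]);
        printf("\n");
    }
    return 0;
}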

SINGLE SOURCE

Enter the no of Nodes : 7

Vertex 1

Enter the no of edges from Vertex 1 : 3

Enter the second vertex of the Edge : 2


Enter the weight of the Edge : 6
Enter the second vertex of the Edge : 3
Enter the weight of the Edge : 5
Enter the second vertex of the Edge : 4
Enter the weight of the Edge : 5

Vertex 2

Enter the no of edges from Vertex 2 : 1

Enter the second vertex of the Edge : 5
Enter the weight of the Edge : -1

Vertex 3

Enter the no of edges from Vertex 3 : 2

Enter the second vertex of the Edge : 2


Enter the weight of the Edge : -2
Enter the second vertex of the Edge : 5
Enter the weight of the Edge : 1

Vertex 4

Enter the no of edges from Vertex 4 : 2

Enter the second vertex of the Edge : 3


Enter the weight of the Edge : -2
Enter the second vertex of the Edge : 6
Enter the weight of the Edge : -1

Vertex 5

Enter the no of edges from Vertex 5 : 1

Enter the second vertex of the Edge : 7


Enter the weight of the Edge : 3

Vertex 6

Enter the no of edges from Vertex 6 : 1

Enter the second vertex of the Edge : 7


Enter the weight of the Edge : 3

Enter the Source vertex : 1


Shortest paths from the source vertex to all vertices, using at most 6 edges:
0 1 3 5 0 4 3

Revised On: 22/06/2009

TITLE: DYNAMIC PROGRAMMING
PROBLEM STATEMENT/DEFINITION: Implement the following using Dynamic Programming, Backtracking and Branch & Bound strategies: 0/1 Knapsack problem.
OBJECTIVE:
• To understand the Dynamic Programming, Backtracking and Branch & Bound strategies
• To map the above problem to all three methods
• Analyze the above algorithms
S/W PACKAGES AND HARDWARE APPARATUS USED: Windows 2000, Turbo C++; PC with the configuration: Pentium IV 1.7 GHz, 128 MB RAM, 40 GB HDD, 15'' Color Monitor, Keyboard, Mouse
REFERENCES:
• Fundamentals of Algorithms by Brassard
• Fundamentals of Algorithms by Horowitz / Sahani, Galgotia
• Introduction to Algorithms by Cormen / Charles, PHI
STEPS: Refer to student activity flow chart, theory, algorithm, test input, test output
INSTRUCTIONS FOR WRITING JOURNAL:
• Title
• Problem Definition
• Theory (covering the concept of Dynamic Programming)
• Map the given problem to Dynamic Programming
• Algorithms
• Analysis of the above for time and space complexity
• Program code
• Output for different inputs
• Conclusion

Theory:
Knapsack problem

Definition: Given items of different values and volumes, find the most valuable set of
items that fit in a knapsack of fixed volume.

Formal Definition: There is a knapsack of capacity c > 0 and N items. Each item has
value vi > 0 and weight wi > 0. Find the selection of items (δi = 1 if selected, 0 if not)
that fits, i.e. sum_{i=1..N} δi wi <= c, such that the total value, sum_{i=1..N} δi vi, is
maximized.

The 0-1 Knapsack

Consider a knapsack of size K. We want to select, from a set of n objects, where the ith
object has size si and value vi, a subset of these objects to maximize the value contained
in the knapsack, with the total size of the contents less than or equal to K.

We will construct the state space where each node contains the total current value in the
knapsack, the total current size of the contents of the knapsack, and maximum potential
value that the knapsack can hold. In the algorithm, we will also keep a record of the
maximum value of any node (partially or completely filled knapsack) found so far.
When we perform the depth-first traversal of the state-space tree, a node is "promising" if
its maximum potential value is greater than this current best value.

We begin the state space tree with the root consisting of the empty knapsack. The
current weight and value are obviously 0. To find the maximum potential value we treat
the problem as if it were the fractional knapsack problem and we were using the greedy
algorithmic solution to that problem. We have shown that the greedy approach to the
fractional knapsack problem yields an optimal solution. We place each of the remaining
objects, in turn, into the knapsack until the next selected object is too big to fit into the

knapsack. We then use the fractional amount of that object that could be placed in the
knapsack to determine the maximum potential value.

totalSize = currentSize + size of remaining objects that can be fully placed

bound (maximum potential value) = currentValue
                                + value of remaining objects fully placed
                                + (K - totalSize) * (value density of item that is partially placed)

In general, for a node at level i in the state space tree (the first i items have been
considered for selection) and for the kth object as the one that will not completely fit into
the remaining space in the knapsack, these formulae can be written:

totalSize = currentSize + sum_{j=i+1}^{k-1} sj

bound = currentValue + sum_{j=i+1}^{k-1} vj + (K - totalSize) * (vk/sk)

For the root node, currentSize = 0, currentValue = 0.

From the root, we add two children at level 1 -- the node where the first item is included
in the knapsack and the node where it is not. For the child where the first item is not
included in the knapsack, the bound is computed with the same formulae, skipping that
item.

The state space traversed by the backtracking algorithm is described below. When the
bound of a node is less than or equal to the current maximum value, or adding the current
item to the node causes the size of the contents to exceed the capacity of the knapsack,
the subtrees rooted at that node are pruned, and the traversal backtracks to the previous
parent in the state space tree.

import java.util.*; //LinkedList, List

public class KnapsackBacktrack {

    private double maxValue;
    private double K;                 // knapsack capacity
    private double[] s;               // array of sizes
    private double[] v;               // array of values (both ordered by value density)
    private List<Integer> bestList;   // members of solution set for current best value
    private int numItems;             // number of items in set to select from
                                      // (item 0 is a dummy)

    public KnapsackBacktrack(double capacity, double[] size, double[] value,
                             int num) {
        maxValue = 0.0;
        K = capacity;
        s = size;
        v = value;
        numItems = num;
        bestList = null;
    }

    // Depth-first traversal of the state space tree: at each node, try
    // including item index+1 (left child) and excluding it (right child).
    private void knapsack(int index, double currentSize, double currentValue,
                          List<Integer> cList) {
        if (currentSize <= K && currentValue > maxValue) {
            maxValue = currentValue;
            bestList = new LinkedList<Integer>(cList);
        }
        if (promising(index, currentSize, currentValue)) {
            // left child: include item index+1
            List<Integer> leftList = new LinkedList<Integer>(cList);
            leftList.add(Integer.valueOf(index + 1));
            knapsack(index + 1, currentSize + s[index + 1],
                     currentValue + v[index + 1], leftList);

            // right child: exclude item index+1
            List<Integer> rightList = new LinkedList<Integer>(cList);
            knapsack(index + 1, currentSize, currentValue, rightList);
        }
    }

    // A node is promising only if its bound (the maximum potential value,
    // computed greedily as in the fractional knapsack) beats the best so far.
    private boolean promising(int item, double size, double value) {
        double bound = value;
        double totalSize = size;
        int k = item + 1;
        if (size > K) return false;
        while (k < numItems && totalSize + s[k] <= K) {
            bound += v[k];
            totalSize += s[k];
            k++;
        }
        if (k < numItems)
            bound += (K - totalSize) * (v[k] / s[k]);
        return bound > maxValue;
    }

    public void findSolution() {
        List<Integer> currentList = new LinkedList<Integer>(); // empty list
        knapsack(0, 0.0, 0.0, currentList);
        System.out.print("The solution set is: ");
        for (int i = 0; i < bestList.size(); i++)
            System.out.print(" " + bestList.get(i));
        System.out.println();
        System.out.println("The value contained in the knapsack is: $" + maxValue);
    }
}
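
A small driver for this class might look as follows -- a sketch (the class name
KnapsackDemo is ours) using the test input given below, with the items reordered by
non-increasing value density v[i]/s[i] and a dummy item at index 0, as the class expects:

public class KnapsackDemo {
    public static void main(String[] args) {
        // K = 12; original objects (weight, value): (1,1) (2,6) (5,18) (6,22) (7,28),
        // reordered by value density, so objects 5,4,3,2,1 sit at indices 1..5
        double[] s = {0, 7, 6, 5, 2, 1};
        double[] v = {0, 28, 22, 18, 6, 1};
        new KnapsackBacktrack(12, s, v, s.length).findSolution();
        // Prints the solution set {1, 3} (in density order), i.e. original
        // objects 5 and 3, and the value $46.0 -- matching the test output below.
    }
}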

Run-time Efficiency of the Backtracking Algorithm

For n objects, there are 2^n possible solution sets -- the ith object is either in or out of the
solution set. The state space, therefore, is represented as a complete binary tree with 2^n
leaves. Such a tree has a total of 2^(n+1) - 1 nodes. The backtracking algorithm,
therefore, has a worst-case bound that is O(2^n). With pruning, the actual number of nodes
"visited" by the algorithm is much smaller.

Branch and bound (BB) is a general algorithm for finding optimal solutions of various
optimization problems, especially in discrete and combinatorial optimization. It consists
of a systematic enumeration of all candidate solutions, where large subsets of fruitless
candidates are discarded en masse, by using upper and lower estimated bounds of the
quantity being optimized.

A branch-and-bound procedure requires two tools. The first one is a splitting procedure
that, given a set S of candidates, returns two or more smaller sets S1, S2, ... whose
union covers S. Note that the minimum of f(x) over S is min{v1, v2, ...}, where each
vi is the minimum of f(x) within Si. This step is called branching, since its recursive
application defines a tree structure (the search tree) whose nodes are the subsets of S.

Another tool is a procedure that computes upper and lower bounds for the minimum
value of f(x) within a given subset S. This step is called bounding.

The key idea of the BB algorithm is: if the lower bound for some tree node (set of
candidates) A is greater than the upper bound for some other node B, then A may be
safely discarded from the search. This step is called pruning, and is usually implemented
by maintaining a global variable m that records the minimum upper bound seen among all
subregions examined so far. Any node whose lower bound is greater than m can be
discarded.

The recursion stops when the current candidate set S is reduced to a single element; or
also when the upper bound for set S matches the lower bound. Either way, any element of
S will be a minimum of the function within S.

The efficiency of the method depends strongly on the node-splitting procedure and on the
upper and lower bound estimators. All other things being equal, it is best to choose a
splitting method that provides non-overlapping subsets.

Ideally the procedure stops when all nodes of the search tree are either pruned or solved.
At that point, all non-pruned subregions will have their upper and lower bounds equal to
the global minimum of the function. In practice the procedure is often terminated after a
given time; at that point, the minimum lower bound and the maximum upper bound,
among all non-pruned sections, define a range of values that contains the global
minimum. Alternatively, within an overriding time constraint, the algorithm may be
terminated when some error criterion, such as (max-min)/(min + max), falls below a
specified value.

The efficiency of the method depends critically on the effectiveness of the branching and
bounding algorithms used; bad choices could lead to repeated branching, without any
pruning, until the sub-regions become very small. In that case the method would be
reduced to an exhaustive enumeration of the domain, which is often impractically large.
There is no universal bounding algorithm that works for all problems, and there is little
hope that one will ever be found; therefore the general paradigm needs to be implemented
separately for each application, with branching and bounding algorithms that are
specially designed for it.

Branch and bound methods may be classified according to the bounding methods and
according to the ways of creating/inspecting the search tree nodes.

The branch-and-bound design strategy is very similar to backtracking in that a state space
tree is used to solve a problem. The differences are that the branch-and-bound method (1)
does not limit us to any particular way of traversing the tree and (2) is used only for
optimization problems.
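
To make the paradigm concrete, here is a minimal best-first sketch in Java; the abstract
methods split(), lowerBound(), isLeaf() and cost() are hypothetical stand-ins for the
problem-specific branching and bounding procedures described above:

import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// A best-first branch-and-bound skeleton for a minimization problem.
abstract class BranchAndBound<N> {
    protected abstract List<N> split(N node);       // branching: smaller sets covering the node
    protected abstract double lowerBound(N node);   // bounding: lower bound on f(x) within the node
    protected abstract boolean isLeaf(N node);      // node reduced to a single candidate
    protected abstract double cost(N node);         // exact objective value at a leaf

    public double solve(N root) {
        double m = Double.POSITIVE_INFINITY;        // minimum upper bound seen so far
        PriorityQueue<N> live = new PriorityQueue<>(
                Comparator.comparingDouble(this::lowerBound));
        live.add(root);
        while (!live.isEmpty()) {
            N node = live.poll();
            if (lowerBound(node) >= m) continue;    // pruning: cannot beat the incumbent
            if (isLeaf(node)) {
                m = Math.min(m, cost(node));        // update the incumbent m
            } else {
                for (N child : split(node))
                    if (lowerBound(child) < m) live.add(child);
            }
        }
        return m;
    }
}

Inspecting live nodes in lower-bound order gives a best-first search; replacing the
priority queue with a stack would give the depth-first variant -- one of the "ways of
creating/inspecting the search tree nodes" mentioned above.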

Test Input

Enter the maximum weight of the knapsack : 12
Enter number of items : 5
Enter weight and value of item 1 : 1 1
Enter weight and value of item 2 : 2 6
Enter weight and value of item 3 : 5 18
Enter weight and value of item 4 : 6 22
Enter weight and value of item 5 : 7 28

Test Output

optimal includes object 5 having weight = 7 value = 28
optimal includes object 3 having weight = 5 value = 18

Revised on: 22/06/2009

TITLE: BACKTRACKING

PROBLEM STATEMENT/DEFINITION: Implement with Backtracking (any one)
1. Implement the 'n' queens problem with backtracking. Calculate the number of
   solutions and the number of nodes generated in the state space tree.
2. For the assignment problem of 'n' people to 'n' jobs with cost of assigning
   C(i,j), find the optimal assignment of every job to a person with minimum cost.

OBJECTIVE:
• To understand Backtracking
• To implement the above problems using backtracking
• To analyze the above algorithms

S/W PACKAGES AND HARDWARE APPARATUS USED: Windows 2000, Turbo C++;
PC with the configuration: Pentium IV 1.7 GHz, 128 MB RAM, 40 GB HDD,
15'' Color Monitor, Keyboard, Mouse

REFERENCES:
• Fundamentals of Algorithms by Brassard
• Fundamentals of Algorithms by Horowitz/Sahni, Galgotia
• Introduction to Algorithms by Cormen/Charles, PHI

STEPS: Refer to student activity flow chart, theory, algorithm, test input, test output

INSTRUCTIONS FOR WRITING JOURNAL:
• Title
• Problem Definition
• Theory (covering the concept of Backtracking)
• Algorithms
• Analysis of the above for time and space complexity
• Program code
• Output for different inputs
• Conclusion

Backtracking Strategy for N-Queens and Assignment Problem

Backtracking is used to solve problems with tree structures. Even problems seemingly
remote from trees, such as walking a maze, are actually trees when the decision
'back-left-straight-right' is considered a node in a tree.

Backtracking is the approach to find a path in a tree. There are several different aims to
be achieved :

• just a path
• all paths
• the shortest path

Depending on the algorithm and the problem, 'just a path' is the first solution found. The
shortest path can only be determined when all paths are known, hence the order of the listing.

Backtracking Algorithm

As usual in a recursion, the recursive function has to contain all the knowledge. The
standard implementation is:

1. check if the goal is achieved
REPEAT
  2. check if the next step is possible at all
  3. check if the next step leads to a known position - prevent circles
  4. do this next step
UNTIL (the goal is achieved) or (this position failed)

So the function has to report success -- hence the use of a function rather than a
procedure. A prototype may look like:

function step(..): boolean;   // true if goal reached
begin
  result := false;
  if goal(..) then            // is this the position of the goal?
    result := true
  else begin
    // check all possible next steps
    if step_possible(next_A) and step_new(next_A) then
      result := step(next_A);
    if (not result) and step_possible(next_B) and step_new(next_B) then
      result := step(next_B);
    if (not result) and step_possible(next_C) and step_new(next_C) then
      result := step(next_C);
    if (not result) and step_possible(next_D) and step_new(next_D) then
      result := step(next_D);
    ..
    ..
  end;
end;
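
The same prototype can be rendered in Java; a minimal sketch, where goal(),
successors(), stepPossible() and stepNew() are hypothetical problem-specific helpers
to be supplied by a subclass:

import java.util.List;

// A Java rendering of the backtracking prototype above.
abstract class Backtracker<P> {
    protected abstract boolean goal(P p);            // is this the goal position?
    protected abstract List<P> successors(P p);      // candidate next steps
    protected abstract boolean stepPossible(P p);    // is the step possible at all?
    protected abstract boolean stepNew(P p);         // prevents circles

    public boolean step(P p) {                       // true if goal reached
        if (goal(p)) return true;
        for (P next : successors(p)) {
            if (stepPossible(next) && stepNew(next) && step(next))
                return true;                         // 'just a path': stop at the first
        }
        return false;                                // this position failed -- backtrack
    }
}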

N-Queens Problem

The n-queens problem consists in placing n non-attacking queens on an n-by-n chess
board. A queen can attack another queen vertically, horizontally, or diagonally. E.g.
placing a queen on a central square of the board blocks the row and column where it is
placed, as well as the two diagonals (rising and falling) at whose intersection the queen
was placed.

The algorithm to solve this problem uses backtracking, but we will unroll the recursion.
The basic idea is to place queens column by column, starting at the left. New queens must
not be attacked by the ones to the left that have already been placed on the board. We
place another queen in the next column if a consistent position is found. All rows in the
current column are checked. We have found a solution if we placed a queen in the
rightmost column.

Basic Idea

A possible position on the grid is given by a pair of indices (i, j), where 1 <= i, j <= n;
i is the column number and j is the row number. So far, for the same i there are n valid
values of j. In a candidate solution, however, only one queen can be in each column, that
is, only one value j = V(i). The solutions are therefore represented by the n values of
the vector V = [V(1), ..., V(n)]. All solutions for which V(i) = V(j) are rejected,
because two queens cannot be on the same row. The solutions are now the permutations of
the n indices, of which there are n! -- still a forbiddingly big number. Out of all these,
the correct solutions are those which satisfy the last requirement: no two queens may
belong to the same diagonal, which is:

V(j)-V(i)<>±(i-j) for i<>j. (5.8-1)

A backtracking algorithm for this problem constructs the permutations [V(1), ..., V(n)]
of the indices {1, ..., n} and examines them against property (5.8-1). For example, there
are (n-2)! permutations of the form [3, 4, ...]. These will not be produced and examined
one by one, because the systematic construction groups them in a subtree with root [3, 4]:
that root is rejected by condition (5.8-1), and all (n-2)! permutations under it are
rejected with it. By contrast, the same produce-and-examine process will go further into
the examination of more permutations of the form p = [1, 4, 2, ...], since so far the
condition is satisfied. The next value to be inserted, j := V(4), must also satisfy
j - 1 <> 3, j - 4 <> 2, j - 4 <> -2, j - 2 <> 1 and j - 2 <> -1. All values of j
satisfying these requirements produce permutations [1, 4, 2, j, ...] which connect to the
tree as children of p. Meanwhile, large sets of permutations such as [1, 4, 2, 6, ...]
have already been rejected.

A typical formulation of this algorithm: the root of the tree of all solutions has as
children the n nodes [1], ..., [n], where [j] represents all permutations starting with j
(and whose number is (n-1)! for every j). Inductively, if a node contains the k values
{j1, ..., jk}, we attempt to extend it with another value jk+1 to {j1, ..., jk, jk+1} so
that condition (5.8-1) is still fulfilled.

For n = 4 this method produces only a small pruned state space tree, and does not produce
all 4! = 24 leaves of the candidate solutions.
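
A minimal Java sketch of this scheme (our illustrative code, not the manual's reference
program); V[i] holds the row of the queen in column i, and the inner check is condition
(5.8-1):

public class NQueens {
    private static int count = 0;

    public static void main(String[] args) {
        int n = 4;                        // board size; 4 matches the test input below
        place(new int[n], 0, n);
        System.out.println("Total solutions for n = " + n + ": " + count);
    }

    // try every row j for column i; recurse only when (5.8-1) holds
    private static void place(int[] V, int i, int n) {
        if (i == n) { count++; print(V); return; }
        for (int j = 0; j < n; j++) {
            boolean ok = true;
            for (int q = 0; q < i; q++) {
                // reject same row, or same diagonal: V(q) - j = +/-(i - q)
                if (V[q] == j || Math.abs(V[q] - j) == i - q) { ok = false; break; }
            }
            if (ok) { V[i] = j; place(V, i + 1, n); }
        }
    }

    // print the board as the 0/1 grid used in the test output below
    private static void print(int[] V) {
        System.out.println("Solution No." + count + " :");
        for (int row = 0; row < V.length; row++) {
            StringBuilder line = new StringBuilder();
            for (int col = 0; col < V.length; col++)
                line.append(V[col] == row ? "1 " : "0 ");
            System.out.println(line.toString().trim());
        }
        System.out.println();
    }
}

For n = 4 this prints the same two solutions shown in the test output below (possibly in
a different order) and reports 2 in total.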

Assignment Problem

This problem considers assigning n jobs to n workers such that the total cost of carrying
out the jobs is minimum. It is solved using a branch and bound technique to obtain an
optimal solution: we fix the values of some variables, make sure the constraints are
satisfied, and go on exploring the tree, pruning partial assignments that cannot improve
on the best complete assignment found so far.
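
A compact Java sketch of this approach (illustrative code, not the full journal program),
using the 5x5 cost matrix from the test input below; the lower bound for a partial
assignment is the sum of the cheapest still-available person for each remaining job:

public class AssignmentBB {
    static int n;
    static int[][] C;                      // C[job][person]
    static int best = Integer.MAX_VALUE;

    public static void main(String[] args) {
        C = new int[][] {
            {10, 13, 17, 15, 21},
            {11, 16, 21, 23, 32},
            {17, 21, 31, 12, 16},
            {21, 22, 23, 24, 25},
            {12, 13, 14, 15, 16}
        };
        n = C.length;
        branch(0, 0, new boolean[n]);
        System.out.println("Minimum Cost of Job Assignment: " + best);
    }

    // assign a person to each job in turn, pruning hopeless partial assignments
    static void branch(int job, int cost, boolean[] used) {
        if (job == n) { best = Math.min(best, cost); return; }
        if (cost + lowerBound(job, used) >= best) return;   // prune this subtree
        for (int p = 0; p < n; p++) {
            if (!used[p]) {
                used[p] = true;
                branch(job + 1, cost + C[job][p], used);
                used[p] = false;
            }
        }
    }

    // sum of the cheapest still-available person for each remaining job
    static int lowerBound(int job, boolean[] used) {
        int lb = 0;
        for (int j = job; j < n; j++) {
            int min = Integer.MAX_VALUE;
            for (int p = 0; p < n; p++)
                if (!used[p] && C[j][p] < min) min = C[j][p];
            lb += min;
        }
        return lb;
    }
}

For brevity the sketch reports only the minimum cost (75 for this matrix); recording the
best permutation as well would recover the per-job assignment listed in the test output.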

Test Input
Enter Number of Jobs: 5
For Job No. 1 Enter Cost of Assigning...
Person No.1 : 10
Person No.2 : 13
Person No.3 : 17
Person No.4 : 15
Person No.5 : 21
For Job No. 2 Enter Cost of Assigning...
Person No.1 : 11
Person No.2 : 16
Person No.3 : 21
Person No.4 : 23
Person No.5 : 32
For Job No. 3 Enter Cost of Assigning...
Person No.1 : 17
Person No.2 : 21
Person No.3 : 31
Person No.4 : 12
Person No.5 : 16
For Job No. 4 Enter Cost of Assigning...
Person No.1 : 21
Person No.2 : 22
Person No.3 : 23
Person No.4 : 24
Person No.5 : 25
For Job No. 5 Enter Cost of Assigning...
Person No.1 : 12
Person No.2 : 13
Person No.3 : 14
Person No.4 : 15
Person No.5 : 16
Test Output
Minimum Cost of Job Assignment: 75
Job No. 1 Will be given to Person : 2
Job No. 2 Will be given to Person : 1
Job No. 3 Will be given to Person : 4
Job No. 4 Will be given to Person : 3
Job No. 5 Will be given to Person : 5

Test Input for N-Queens
Enter Number of Queens: 4
Test Output
Solution No.1 :
0 1 0 0
0 0 0 1
1 0 0 0
0 0 1 0
Do You Want to See More Solutions?( [Y]es/[N]o )y

Solution No.2 :
0 0 1 0
1 0 0 0
0 0 0 1
0 1 0 0
Do You Want to See More Solutions?( [Y]es/[N]o )y

Revised On: 22/06/2009

TITLE: BRANCH AND BOUND

PROBLEM STATEMENT/DEFINITION: Implement the following using Branch and Bound
Traveling salesperson problem

OBJECTIVE:
• To understand the Branch and Bound algorithmic strategy
• To implement the above problem using Branch and Bound
• To analyze the above algorithms

S/W PACKAGES AND HARDWARE APPARATUS USED: Windows 2000, Turbo C++;
PC with the configuration: Pentium IV 1.7 GHz, 128 MB RAM, 40 GB HDD,
15'' Color Monitor, Keyboard, Mouse

REFERENCES:
• Fundamentals of Algorithms by Brassard
• Fundamentals of Algorithms by Horowitz/Sahni, Galgotia
• Introduction to Algorithms by Cormen/Charles, PHI

STEPS: Refer to student activity flow chart, theory, algorithm, test input, test output

INSTRUCTIONS FOR WRITING JOURNAL:
• Title
• Problem Definition
• Theory (covering the concept of Branch and Bound)
• State space tree formulation
• Algorithms
• Analysis of the above for time and space complexity
• Program code
• Output for different inputs
• Conclusion

Theory:

Travelling Salesman Problem using Branch and Bound Technique

The Travelling Salesman Problem (TSP) is a deceptively simple combinatorial problem.


It can be stated very simply:

A salesman spends his time visiting n cities (or nodes) cyclically. In one tour he visits
each city just once, and finishes up where he started. In what order should he visit them
to minimise the distance travelled?
Many TSPs are symmetric - that is, for any two cities A and B, the distance from A to B
is the same as that from B to A. In this case you will get exactly the same tour length if
you reverse the order in which the cities are visited - so there is no need to distinguish
between a tour and its reverse, and you can leave the arrows off the tour diagram.

If there are only 2 cities then the problem is trivial, since only one tour is possible. For
the symmetric case a 3 city TSP is also trivial. If all links are present then there are (n-1)!
different tours for an n city asymmetric TSP. To see why this is so, pick any city as the
first - then there are n-1 choices for the second city visited, n-2 choices for the third, and
so on. For the symmetric case there are half as many distinct solutions - (n-1)!/2 for an n
city TSP. In either case the number of solutions becomes extremely large for large n, so
that an exhaustive search is impracticable.

The problem has some direct importance, since quite a lot of practical applications can be
put in this form. It also has a theoretical importance in complexity theory, since the TSP
is one of the class of "NP Complete" combinatorial problems. NP Complete problems

P:F-LTL-UG/03/R1 Page 98 of 99
have intractable in the sense that no one has found any really efficient way of solving
them for large n. They are also known to be more or less equivalent to each other; if you
knew how to solve one kind of NP Complete problem you could solve the lot.

The holy grail is to find a solution algorithm that gives an optimal solution in a time that
has a polynomial variation with the size n of the problem. If you could find a method
whose solution time varies like a quadratic expression, for example, then doubling n
multiplies the solution time by 4 for large n. Exponential-time algorithms, by contrast,
run out of puff at a certain level of n, more or less independently of computing power.
If computation varies as 2^n,
say, then a thousand-fold increase in computing power will only allow you to add another
10 nodes.

Alternatively there are algorithms that seem to come up with a good solution quite
quickly, but you cannot say just how far it is from being optimal.

Branch and Bound Strategy

The first algorithm looks for a true optimum tour by doing a truncated search of the entire
solution space - known as the Branch and Bound technique. It reaches an initial tour very
quickly, and then continues to search, finding better tours and/or establishing that no
shorter tour exists. As each tour is found it is drawn on the screen, and details are entered
in a file called tsp.log. This algorithm is quite fast up to about 25 nodes, but becomes
painfully slow beyond 40.

The good news is that you can truncate the search at any time by hitting any key. You can
also set a % bound margin; setting this to 10 means that you wind up with a tour whose
length is within 10% of the shortest - but you get there rather more quickly than if you
insist on a true optimum (margin=0). If you like you can set the margin at 100, so it gives
up immediately after finding the solution.
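
To make the strategy concrete, here is a small self-contained Java sketch (our own
illustrative code and 4-city symmetric distance matrix, not the program described above).
It searches partial tours depth-first and prunes with a simple lower bound: the tour cost
so far plus the cheapest outgoing edge of every city still to be left:

public class TspBranchBound {
    static final int n = 4;
    static final int[][] d = {
        {0, 10, 15, 20},
        {10, 0, 35, 25},
        {15, 35, 0, 30},
        {20, 25, 30, 0}
    };
    static int best = Integer.MAX_VALUE;

    public static void main(String[] args) {
        boolean[] visited = new boolean[n];
        visited[0] = true;                 // start (and end) the tour at city 0
        dfs(0, 1, 0, visited);
        System.out.println("Shortest tour length: " + best);
    }

    static void dfs(int city, int count, int cost, boolean[] visited) {
        if (count == n) {                  // all cities placed: close the tour
            best = Math.min(best, cost + d[city][0]);
            return;
        }
        if (cost + remainingBound(visited) >= best) return;   // prune
        for (int next = 0; next < n; next++) {
            if (!visited[next]) {
                visited[next] = true;
                dfs(next, count + 1, cost + d[city][next], visited);
                visited[next] = false;
            }
        }
    }

    // every unvisited city must still be left exactly once, so its cheapest
    // outgoing edge is a valid underestimate of the remaining cost
    static int remainingBound(boolean[] visited) {
        int bound = 0;
        for (int i = 0; i < n; i++) {
            if (visited[i]) continue;
            int min = Integer.MAX_VALUE;
            for (int j = 0; j < n; j++)
                if (j != i && d[i][j] < min) min = d[i][j];
            bound += min;
        }
        return bound;
    }
}

For this matrix the optimal tour is 0-1-3-2-0 with length 10 + 25 + 30 + 15 = 80. A bound
margin, as described above, would loosen this pruning test so the search can stop sooner
with a near-optimal tour.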

Test Input
Enter No. of Nodes in Graph: 5
Enter the No. of Outgoing Edges From vertex 1 :2
Enter the target vertex no: 2
Enter the weight of the edge: 5
Enter the target vertex no: 3
Enter the weight of the edge: 6
Enter the No. of Outgoing Edges From vertex 2 : 1
Enter the target vertex no: 4
Enter the weight of the edge: 6
Enter the No. of Outgoing Edges From vertex 3 :2
Enter the target vertex no: 2
Enter the weight of the edge: 4
Enter the target vertex no: 4
Enter the weight of the edge: 3
Enter the No. of Outgoing Edges From vertex 4 : 1
Enter the target vertex no: 5
Enter the weight of the edge: 10
Enter the No. of Outgoing Edges From vertex 5 : 0

Test Output
Total Cost: 26
Sequence of Traversal: 1 3 2 4 5 1

Head of Department Subject Coordinator

(COMPUTER ENGINEERING) (PROF. ARCHANA GHOTKAR)
