
July 2011 Master of Computer Science (MSCCS) Semester 1 MC0067 Database Management System (DBMS and Oracle 9i)

4 Credits (Book ID: B0716 & B0717) Assignment Set 1 (40 Marks)
Answer all Questions. Each Question carries ten Marks.

Book ID: B0716

Q.1) Write about:
a) Linear Search
b) Collision Chain

Answer: 1(a) Linear Search

Linear search, also known as sequential search, is an algorithm for finding a particular value in a list of data. It starts at the beginning of the data and checks each item in turn until either the desired item is found or the end of the data is reached.

The implementation is a straightforward loop that compares every element in the array with the key. As soon as an equal value is found, it returns. If the loop finishes without finding a match, the search has failed and -1 is returned.

Linear search is not very efficient. If the item to be found is at the end of the list, then all previous items must be read and checked before the match is found. For small arrays, linear search is a good solution because it is so straightforward. In an array of a million elements, however, linear search takes on average 500,000 comparisons to find the key. For a much faster search, take a look at binary search, which requires the data to be sorted.

Algorithm:
For each item in the database
    if the item matches the wanted info
        exit with this item
Continue loop
Wanted item is not in database
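The algorithm above can be sketched in Python as follows. This is a minimal illustration, not part of the original answer; the function name linear_search and the sample list are assumptions chosen for the example.

```python
def linear_search(items, key):
    """Return the index of the first element equal to key, or -1 if absent."""
    for index, item in enumerate(items):
        if item == key:
            return index      # exit with this item
    return -1                 # wanted item is not in the data


# Sample data (names borrowed from the Employee example; any list would do).
names = ["Shekhar", "Raj", "Sharma", "Vani"]
print(linear_search(names, "Sharma"))   # 2
print(linear_search(names, "Gupta"))    # -1
```

In the worst case the loop inspects every element, which is why the cost grows in proportion to the length of the list.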

Answer: 1(b) Collision Chain

In computer science, a hash table or hash map is a data structure that uses a hash function to map identifying values, known as keys (e.g., a person's name), to their associated values (e.g., their telephone number). Thus, a hash table implements an associative array. The hash function transforms the key into the index (the hash) of an array element (the slot or bucket) where the corresponding value is to be sought.

Ideally, the hash function should map each possible key to a unique slot index, but this ideal is rarely achievable in practice (unless the hash keys are fixed, i.e. new entries are never added to the table after it is created). Instead, most hash table designs assume that hash collisions (different keys that map to the same hash value) will occur and must be accommodated in some way.

Q.2) Write about:
a) Integrity Rules
b) Relational Operators with examples for each
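For answer 1(b): one common way to accommodate hash collisions is separate chaining, where each bucket holds a chain (list) of the entries that hashed to it. The answer does not spell out a particular scheme, so the sketch below is an assumption about what "collision chain" refers to; the class name ChainedHashTable and its methods are hypothetical.

```python
class ChainedHashTable:
    """Hash table that resolves collisions by chaining entries per bucket."""

    def __init__(self, num_buckets=8):
        self._buckets = [[] for _ in range(num_buckets)]

    def _chain_for(self, key):
        # The hash function maps the key to a bucket index.
        return self._buckets[hash(key) % len(self._buckets)]

    def put(self, key, value):
        chain = self._chain_for(key)
        for i, (existing_key, _) in enumerate(chain):
            if existing_key == key:
                chain[i] = (key, value)   # overwrite an existing key
                return
        chain.append((key, value))        # colliding keys share the chain

    def get(self, key, default=None):
        for existing_key, value in self._chain_for(key):
            if existing_key == key:
                return value
        return default


# A single bucket forces every key to collide, so the chain does all the work.
table = ChainedHashTable(num_buckets=1)
table.put("Shekhar", "120001")
table.put("Raj", "120002")
print(table.get("Raj"))   # 120002
```

Lookups within one bucket are a linear search of its chain, so performance depends on the hash function spreading keys evenly.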


Roll No. 521126647


Answer: 2(a) Integrity Rules

Integrity rules are the rules which a relational database follows in order to stay accurate and accessible. They govern which operations can be performed on the data and on the structure of the database. There are three integrity rules defined for a relational database:

1. Distinct rows in a table: all the rows of a table should be distinct, to avoid ambiguity while accessing the rows of that table. Most modern database management systems can be configured to avoid duplicate rows.
2. Entity integrity (a primary key, or any part of it, cannot be null): 'null' is a special value in a relational database and does not mean blank or zero. It means that the data is unavailable, and hence a 'null' primary key would not be a complete identifier.
3. Referential integrity: if a foreign key is defined on a table, then a value matching that foreign key value must exist as the primary key of a row in the referenced table.

The following integrity rules must be satisfied by any relation:
o No component of the primary key can be null.
o The database must not contain any unmatched foreign key values. This is called the referential integrity rule.
o Unlike the case of primary keys, there is no integrity rule saying that no component of the foreign key can be null.

The last point can be explained with the help of the following example. Consider the relations Employee and Account given below.

Table: Employee
Emp#   EmpName   EmpCity   EmpAcc#
X101   Shekhar   Bombay    120001
X102   Raj       Pune      120002
X103   Sharma    Nagpur    Null
X104   Vani      Bhopal    120003

Table: Account
ACC#     OpenDate      BalAmt
120001   30-Aug-1998   5000
120002   29-Oct-1998   1200
120003   01-Jan-1999   3000
120004   04-Mar-1999   500

EmpAcc# in the Employee relation is a foreign key creating a reference from Employee to Account. Here, a Null value in the EmpAcc# attribute is logically possible if an employee does not have a bank account. If the business rules allow an employee to exist in the system without opening an account, a Null value can be allowed for EmpAcc# in the Employee relation.

In the case example given, Cust# in Ord_Aug cannot accept Null if the business rule insists that the Customer No. needs to be stored for every order placed.
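The entity integrity and referential integrity rules can be checked mechanically. The sketch below is an illustration only: it uses the Employee and Account rows from the example above, represents Null as Python's None, and the helper names null_primary_keys and unmatched_foreign_keys are assumptions. The extra row X105 is hypothetical and exists only to show a violation.

```python
EMPLOYEE = [
    {"Emp#": "X101", "EmpName": "Shekhar", "EmpCity": "Bombay", "EmpAcc#": "120001"},
    {"Emp#": "X102", "EmpName": "Raj", "EmpCity": "Pune", "EmpAcc#": "120002"},
    {"Emp#": "X103", "EmpName": "Sharma", "EmpCity": "Nagpur", "EmpAcc#": None},
    {"Emp#": "X104", "EmpName": "Vani", "EmpCity": "Bhopal", "EmpAcc#": "120003"},
]
ACCOUNT = [
    {"ACC#": "120001", "OpenDate": "30-Aug-1998", "BalAmt": 5000},
    {"ACC#": "120002", "OpenDate": "29-Oct-1998", "BalAmt": 1200},
    {"ACC#": "120003", "OpenDate": "01-Jan-1999", "BalAmt": 3000},
    {"ACC#": "120004", "OpenDate": "04-Mar-1999", "BalAmt": 500},
]


def null_primary_keys(rows, pk):
    """Entity integrity: return rows whose primary key is null."""
    return [row for row in rows if row[pk] is None]


def unmatched_foreign_keys(child_rows, fk, parent_rows, pk):
    """Referential integrity: return rows whose non-null foreign key
    matches no primary key in the parent relation."""
    parent_keys = {row[pk] for row in parent_rows}
    return [
        row for row in child_rows
        if row[fk] is not None and row[fk] not in parent_keys
    ]


print(null_primary_keys(EMPLOYEE, "Emp#"))                           # []
print(unmatched_foreign_keys(EMPLOYEE, "EmpAcc#", ACCOUNT, "ACC#"))  # []

# Hypothetical row with an account number that does not exist in ACCOUNT.
bad_row = {"Emp#": "X105", "EmpName": "Test", "EmpCity": "Pune", "EmpAcc#": "129999"}
violations = unmatched_foreign_keys(EMPLOYEE + [bad_row], "EmpAcc#", ACCOUNT, "ACC#")
print([row["Emp#"] for row in violations])                           # ['X105']
```

Note that the null foreign key of X103 is skipped rather than reported, which mirrors the rule that a foreign key component may be null.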

Answer: 2(b) Relational Operators with examples for each

Relational Operators: In the relational model, the database objects seen so far have specific names:

Name                Meaning
Relation            Table
Tuple               Record (Row)
Attribute           Field (Column)
Cardinality         Number of Records (Rows)
Degree (or Arity)   Number of Fields (Columns)
View                Query/Answer table

On these objects, a set of operators (relational operators) is provided to manipulate them:
1. Restrict
2. Project
3. Union
4. Difference
5. Product
6. Intersection
7. Join
8. Divide

The relational model is based on the solid foundation of Relational Algebra, which consists of a collection of operators that operate on relations. Each operator takes one or two relations as its input and produces a new relation as its output.

SELECT or RESTRICT: This operator is used to extract data that satisfies a given condition. The lowercase Greek letter sigma, σ, is used to denote selection. Restrict simply extracts records from a table. It is also known as Select, but it is not the same as the SELECT statement defined in SQL.

PROJECT: This operator is used to project certain details of a relational table. It displays only the required details, leaving out certain columns. The PROJECT operator is denoted by the Greek letter pi, π. Project selects zero or more fields from a table and generates a new table that contains all of the records and only the selected fields (with no duplications).

PRODUCT: This operator is denoted by ×. It helps combine information from two relational tables. The product of two tables is a third table which contains every record of the first table combined with every record of the second.

UNION: The UNION operator collects data from different tables and presents a unified version of the complete data. The union operation is represented by the symbol ∪. Union creates a new table by adding the records of one table to those of another. The tables must be compatible: they must have the same number of fields, and each pair of corresponding fields must take values from the same domain.

INTERSECT: This operator generates data that holds true in all the tables it is applied on. It is based on set intersection and is represented by the symbol ∩. The intersection of two tables is a third table which contains the records which are common to both.

DIFFERENCE: This operator is symbolized as -. It generates data which holds true in one table and not in the other. The difference of two tables is a third table which contains the records which appear in the first BUT NOT in the second.

JOIN: The join operation is an enhancement to the product operation. It allows a selection to be performed on the product of tables. The join of two tables is a third table which contains all of the records in the first and the second which are related.

DIVIDE: The division operation, denoted by ÷, is suited to queries that include the phrase "for all". Dividing a table by another table gives all the records in the first which have values in their fields matching ALL the records in the second.

The eight relational algebra operators are:

1. SELECT: To retrieve specific tuples/rows from a relation.

Ord#   OrdDate    Cust#
101    02-08-94   002
104    18-09-94   002

2. PROJECT: To retrieve specific attributes/columns from a relation.

Description    Price
Power Supply   4000
101-Keyboard   2000
Mouse          800
MS-DOS 6.0     5000
MS-Word 6.0    8000
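SELECT (restrict) and PROJECT can be sketched over a relation held as a list of dictionaries. This is an illustration under assumptions: the function names restrict and project are hypothetical, and the rows are the August orders (ORD_AUG) as they appear in the UNION and JOIN examples of this answer.

```python
ORD_AUG = [
    {"Ord#": "101", "OrdDate": "02-08-94", "Cust#": "002"},
    {"Ord#": "102", "OrdDate": "11-08-94", "Cust#": "003"},
    {"Ord#": "103", "OrdDate": "21-08-94", "Cust#": "003"},
    {"Ord#": "104", "OrdDate": "28-08-94", "Cust#": "002"},
    {"Ord#": "105", "OrdDate": "30-08-94", "Cust#": "005"},
]


def restrict(relation, predicate):
    """SELECT/RESTRICT: keep only the rows that satisfy the condition."""
    return [row for row in relation if predicate(row)]


def project(relation, attributes):
    """PROJECT: keep only the named columns and drop duplicate rows."""
    seen = set()
    result = []
    for row in relation:
        values = tuple(row[name] for name in attributes)
        if values not in seen:
            seen.add(values)
            result.append(dict(zip(attributes, values)))
    return result


orders_of_002 = restrict(ORD_AUG, lambda row: row["Cust#"] == "002")
print([row["Ord#"] for row in orders_of_002])   # ['101', '104']

customers = project(ORD_AUG, ["Cust#"])
print([row["Cust#"] for row in customers])      # ['002', '003', '005']
```

Restrict reduces the number of rows, while project reduces the number of columns and may also reduce rows once duplicates are removed.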

3. PRODUCT: To obtain all possible combinations of tuples from two relations.

Ord#   OrdDate    O.Cust#   C.Cust#   CustName     City
101    02-08-94   002       001       Shah         Bombay
101    02-08-94   002       002       Srinivasan   Madras
101    02-08-94   002       003       Gupta        Delhi
101    02-08-94   002       004       Banerjee     Calcutta
101    02-08-94   002       005       Apte         Bombay
102    11-08-94   003       001       Shah         Bombay
102    11-08-94   003       002       Srinivasan   Madras

4. UNION: To retrieve tuples appearing in either or both the relations participating in the UNION.

Ord#   OrdDate    Cust#
101    03-07-94   001
102    27-07-94   003
101    02-08-94   002
102    11-08-94   003
103    21-08-94   003
104    28-08-94   002
105    30-08-94   005

Note: The union operation shown above logically implies retrieval of records of orders placed in July or in August.

5. INTERSECT: To retrieve tuples appearing in both the relations participating in the INTERSECT.

Eg: To retrieve Cust# of Customers who've placed orders in July and in August

Cust#
003
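Because relations are sets of tuples, UNION, INTERSECT and DIFFERENCE map directly onto Python's set operators. The sketch below is illustrative: the July rows are read off the first two rows of the union table above, the variable names ORD_JUL and ORD_AUG follow the document's naming for the August relation, and the helper cust_numbers is an assumption.

```python
# Each tuple is (Ord#, OrdDate, Cust#); both relations have the same
# number of fields, so they are union-compatible.
ORD_JUL = {
    ("101", "03-07-94", "001"),
    ("102", "27-07-94", "003"),
}
ORD_AUG = {
    ("101", "02-08-94", "002"),
    ("102", "11-08-94", "003"),
    ("103", "21-08-94", "003"),
    ("104", "28-08-94", "002"),
    ("105", "30-08-94", "005"),
}


def cust_numbers(orders):
    """Project a set of order tuples onto the Cust# field."""
    return {cust for _, _, cust in orders}


all_orders = ORD_JUL | ORD_AUG                                 # UNION
both_months = cust_numbers(ORD_JUL) & cust_numbers(ORD_AUG)    # INTERSECT
july_only = cust_numbers(ORD_JUL) - cust_numbers(ORD_AUG)      # DIFFERENCE

print(len(all_orders))   # 7
print(both_months)       # {'003'}
print(july_only)         # {'001'}
```

The intersection and difference are taken on the projected Cust# values, matching the examples that ask for customers rather than whole order rows.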

6. DIFFERENCE: To retrieve tuples appearing in the first relation participating in the DIFFERENCE but not the second.

Eg: To retrieve Cust# of Customers who've placed orders in July but not in August

Cust#
001

7. JOIN: To retrieve combinations of tuples in two relations based on a common field in both the relations.

ORD_AUG join CUSTOMERS (here, the common column is Cust#)

Ord#   OrdDate    Cust#   CustNames    City
101    02-08-94   002     Srinivasan   Madras
102    11-08-94   003     Gupta        Delhi
103    21-08-94   003     Gupta        Delhi
104    28-08-94   002     Srinivasan   Madras
105    30-08-94   005     Apte         Bombay

Note: The above join operation logically implies retrieval of details of all orders and the details of the corresponding customers who placed the orders. Such a join operation, where only those rows having corresponding rows in both the relations are retrieved, is called the natural join or inner join. This is the most common join operation.

Consider the example of the EMPLOYEE and ACCOUNT relations.

Relation: EMPLOYEE
Emp#   EmpName   EmpCity   Acc#
X101   Shekhar   Bombay    120001
X102   Raj       Pune      120002
X103   Sharma    Nagpur    Null
X104   Vani      Bhopal    120003

Relation: ACCOUNT
Acc#     OpenDate        BalAmt
120001   30. Aug. 1998   5000
120002   29. Oct. 1998   1200
120003   1. Jan. 1999    3000
120004   4. Mar. 1999    500

A join can be formed between the two relations based on the common column Acc#. The result of the (inner) join is:

Emp#   EmpName   EmpCity   Acc#     OpenDate        BalAmt
X101   Shekhar   Bombay    120001   30. Aug. 1998   5000
X102   Raj       Pune      120002   29. Oct. 1998   1200
X104   Vani      Bhopal    120003   1. Jan. 1999    3000

Note that, from each table, only those records which have corresponding records in the other table appear in the result set. This means that the result of the inner join shows the details of those employees who hold an account, along with the account details.

The other type of join is the outer join, which has three variations: the left outer join, the right outer join and the full outer join. These three joins are explained as follows.

The left outer join retrieves all rows from the left-side (of the join operator) table. If there are corresponding or related rows in the right-side table, the correspondence will be shown. Otherwise, columns of the right-side table will take null values.

EMPLOYEE left outer join ACCOUNT gives:

Emp#   EmpName   EmpCity   Acc#     OpenDate        BalAmt
X101   Shekhar   Bombay    120001   30. Aug. 1998   5000
X102   Raj       Pune      120002   29. Oct. 1998   1200
X103   Sharma    Nagpur    NULL     NULL            NULL
X104   Vani      Bhopal    120003   1. Jan. 1999    3000

The right outer join retrieves all rows from the right-side (of the join operator) table. If there are corresponding or related rows in the left-side table, the correspondence will be shown. Otherwise, columns of the left-side table will take null values.


EMPLOYEE right outer join ACCOUNT gives:

Emp#   EmpName   EmpCity   Acc#     OpenDate        BalAmt
X101   Shekhar   Bombay    120001   30. Aug. 1998   5000
X102   Raj       Pune      120002   29. Oct. 1998   1200
X104   Vani      Bhopal    120003   1. Jan. 1999    3000
NULL   NULL      NULL      120004   4. Mar. 1999    500

(Assume that Acc# 120004 belongs to someone who is not an employee, and hence the details of the account holder are not available here.)

The full outer join retrieves all rows from both the tables. If there is a correspondence or relation between rows from the tables of either side, the correspondence will be shown. Otherwise, related columns will take null values.

EMPLOYEE full outer join ACCOUNT gives:

Emp#   EmpName   EmpCity   Acc#     OpenDate        BalAmt
X101   Shekhar   Bombay    120001   30. Aug. 1998   5000
X102   Raj       Pune      120002   29. Oct. 1998   1200
X103   Sharma    Nagpur    NULL     NULL            NULL
X104   Vani      Bhopal    120003   1. Jan. 1999    3000
NULL   NULL      NULL      120004   4. Mar. 1999    500
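The four joins on EMPLOYEE and ACCOUNT can be reproduced with a short sketch. It is an illustration under assumptions: Null is represented as None, the inner join is written as a restriction of the product (as the JOIN description states), and the helper names inner_join and unmatched_rows are hypothetical. Row order in the results may differ from the tables above.

```python
from itertools import product

EMPLOYEE = [
    {"Emp#": "X101", "EmpName": "Shekhar", "EmpCity": "Bombay", "Acc#": "120001"},
    {"Emp#": "X102", "EmpName": "Raj", "EmpCity": "Pune", "Acc#": "120002"},
    {"Emp#": "X103", "EmpName": "Sharma", "EmpCity": "Nagpur", "Acc#": None},
    {"Emp#": "X104", "EmpName": "Vani", "EmpCity": "Bhopal", "Acc#": "120003"},
]
ACCOUNT = [
    {"Acc#": "120001", "OpenDate": "30. Aug. 1998", "BalAmt": 5000},
    {"Acc#": "120002", "OpenDate": "29. Oct. 1998", "BalAmt": 1200},
    {"Acc#": "120003", "OpenDate": "1. Jan. 1999", "BalAmt": 3000},
    {"Acc#": "120004", "OpenDate": "4. Mar. 1999", "BalAmt": 500},
]
EMPLOYEE_COLUMNS = ["Emp#", "EmpName", "EmpCity", "Acc#"]
ACCOUNT_COLUMNS = ["Acc#", "OpenDate", "BalAmt"]


def inner_join(left, right, key):
    """Product of the two relations, restricted to pairs that agree on key."""
    return [
        {**l, **r}
        for l, r in product(left, right)
        if l[key] is not None and l[key] == r[key]
    ]


def unmatched_rows(rows, other, key, other_columns):
    """Rows with no partner in `other`, padded with None for its columns."""
    other_keys = {row[key] for row in other}
    return [
        {**dict.fromkeys(other_columns), **row}
        for row in rows
        if row[key] not in other_keys
    ]


inner = inner_join(EMPLOYEE, ACCOUNT, "Acc#")
left_only = unmatched_rows(EMPLOYEE, ACCOUNT, "Acc#", ACCOUNT_COLUMNS)
right_only = unmatched_rows(ACCOUNT, EMPLOYEE, "Acc#", EMPLOYEE_COLUMNS)

left_outer = inner + left_only
right_outer = inner + right_only
full_outer = inner + left_only + right_only

print(len(inner), len(left_outer), len(right_outer), len(full_outer))   # 3 4 4 5
```

Building the outer joins from the inner join plus the unmatched rows of one or both sides makes the relationship between the four variants explicit.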

8. DIVIDE Consider the following three relations:

R1 divide by R2 per R3 gives:

a

Thus the result contains those values from R1 whose corresponding R2 values in R3 include all R2 values.
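For the DIVIDE operator, the three relations of the example are not reproduced in this text, so the sketch below uses entirely hypothetical data (suppliers S1 to S3 and parts P1, P2) to illustrate the rule just stated. The function name divide and the representation of R3 as a set of pairs are assumptions.

```python
def divide(r1, r2, r3):
    """R1 divide by R2 per R3: values x of R1 such that the pair (x, y)
    is present in R3 for every value y of R2."""
    return {x for x in r1 if all((x, y) in r3 for y in r2)}


# Hypothetical data, invented for illustration only.
R1 = {"S1", "S2", "S3"}                                   # candidate values
R2 = {"P1", "P2"}                                         # values required "for all"
R3 = {("S1", "P1"), ("S1", "P2"), ("S2", "P1"), ("S3", "P2")}

print(divide(R1, R2, R3))   # {'S1'}
```

Only S1 is paired with every value of R2, which is the "for all" condition that division expresses.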


Book ID: B0717

Q.3) Discuss the correspondences between the ER model constructs and the relational model constructs. Show how each ER model construct can be mapped to the relational model, and discuss any alternative mappings.

Answer 3)

Relational Data Model: The model uses the concept of a mathematical relation, which looks somewhat like a table of values, as its basic building block, and has its theoretical basis in set theory and first-order predicate logic. The relational model represents the database as a collection of relations. Each relation resembles a table of values or, to some extent, a flat file of records. When a relation is thought of as a table of values, each row in the table represents a collection of related data values. In the relational model, each row in the table represents a fact that typically corresponds to a real-world entity or relationship. The table name and column names are used to help in interpreting the meaning of the values in each row. In formal relational model terminology, a row is called a tuple, a column header is called an attribute, and the table is called a relation. The data type describing the types of values that can appear in each column is represented by a domain of possible values.

ER Model: An entity-relationship model (ERM) is an abstract and conceptual representation of data. Entity-relationship modeling is a database modeling method, used to produce a type of conceptual schema or semantic data model of a system, often a relational database, and its requirements in a top-down fashion. Diagrams created by this process are called entity-relationship diagrams, ER diagrams, or ERDs. The first stage of information system design uses these models during the requirements analysis to describe information needs or the type of information that is to be stored in a database.

In the design of an information system that is based on a database, the conceptual data model is, at a later stage (usually called logical design), mapped to a logical data model, such as the relational model; this in turn is mapped to a physical model during physical design. Sometimes, both of these phases are referred to as "physical design". In this way we create a relational schema from an entity-relationship (ER) schema. Key elements of the ER model are entities, attributes, identifiers and relationships.

Correspondence between ER and Relational Models:

ER Model                       Relational Model
Entity type                    Entity relation
1:1 or 1:N relationship type   Foreign key
M:N relationship type          Relationship relation and two foreign keys
n-ary relationship type        Relationship relation and n foreign keys
Simple attributes              Attributes
Composite attributes           Set of simple component attributes
Multi-valued attributes        Relation and foreign key
Value set                      Domain
Key attribute                  Primary key or secondary key

Let's take the COMPANY database as an example:

[Figure: the COMPANY ER schema. Entity types: EMPLOYEE, DEPARTMENT, PROJECT, DEPENDENT. Relationship types: WORKS_FOR, MANAGES, SUPERVISION, CONTROLS, WORKSON, DEPENDENTS_OF. Attribute labels legible in the figure include Fname, Initial, Lname, Name, address, sex, salary, DOB, StartDate, NoOfEmployee, Number, Location, Hours and Relationship.]

Result of mapping the COMPANY ER schema into a relational database schema:

EMPLOYEE (FNAME, INITIAL, LNAME, ENO, DOB, ADDRESS, SEX, SALARY, SUPERENO, DNO)
DEPARTMENT (DNAME, DNUMBER, MGRENO, MGRSTARTDATE)
DEPT_LOCATIONS (DNUMBER, DLOCATION)
PROJECT (PNAME, PNUMBER, PLOCATION, DNUM)
WORKS_ON (EENO, PNO, HOURS)
DEPENDENT (EENO, DEPENDENT_NAME, SEX, DOB, RELATIONSHIP)
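A few rows of the correspondence table can be made concrete with Python's standard sqlite3 module. The sketch below is an assumption-laden illustration, not the schema of the original answer: it covers only a subset of the COMPANY relations, the column types and the sample rows are invented, and the function name build_company_schema is hypothetical. It shows a key attribute becoming a primary key, a secondary key becoming UNIQUE, the 1:N SUPERVISION relationship becoming a foreign key, and the M:N WORKS_ON relationship becoming a relationship relation with two foreign keys.

```python
import sqlite3


def build_company_schema():
    """Create an in-memory database holding part of the COMPANY schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")   # SQLite enforces FKs only when enabled
    conn.executescript("""
        CREATE TABLE EMPLOYEE (
            ENO      INTEGER PRIMARY KEY,                -- key attribute
            LNAME    TEXT NOT NULL,                      -- simple attribute
            SUPERENO INTEGER REFERENCES EMPLOYEE(ENO)    -- 1:N SUPERVISION
        );
        CREATE TABLE PROJECT (
            PNUMBER INTEGER PRIMARY KEY,                 -- key attribute
            PNAME   TEXT UNIQUE                          -- secondary key
        );
        CREATE TABLE WORKS_ON (                          -- M:N relationship relation
            EENO  INTEGER REFERENCES EMPLOYEE(ENO),
            PNO   INTEGER REFERENCES PROJECT(PNUMBER),
            HOURS REAL,
            PRIMARY KEY (EENO, PNO)
        );
    """)
    return conn


conn = build_company_schema()
# Sample rows are hypothetical.
conn.execute("INSERT INTO EMPLOYEE VALUES (1, 'Shah', NULL)")
conn.execute("INSERT INTO EMPLOYEE VALUES (2, 'Gupta', 1)")
conn.execute("INSERT INTO PROJECT VALUES (10, 'ProductX')")
conn.execute("INSERT INTO WORKS_ON VALUES (2, 10, 7.5)")

try:
    # Project 99 does not exist, so this row has an unmatched foreign key.
    conn.execute("INSERT INTO WORKS_ON VALUES (2, 99, 1.0)")
except sqlite3.IntegrityError as error:
    print("rejected:", error)
```

The null SUPERENO of the first employee is accepted, while the unmatched PNO is rejected, which is the same referential integrity behaviour described in answer 2(a).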

Mapping of regular entity types: For each regular entity type E in the ER schema, create a relation R that includes all the simple attributes of E. Include only the simple component attributes of a composite attribute. Choose one of the key attributes of E as the primary key for R. If the chosen key of E is composite, the set of simple attributes that form it will together form the primary key of R.

If multiple keys were identified for E during the conceptual design, the information describing the attributes that form each additional key is kept in order to specify secondary (unique) keys of relation R. Knowledge about keys is also kept for indexing purposes and other types of analyses.

We create the relations EMPLOYEE, DEPARTMENT, and PROJECT to correspond to the regular entity types EMPLOYEE, DEPARTMENT, and PROJECT. The foreign key and relationship attributes, if any, are not included yet; they will be added during subsequent steps. These include the attributes SUPERENO and DNO of EMPLOYEE, MGRENO and MGRSTARTDATE of DEPARTMENT, and DNUM of PROJECT. We choose ENO, DNUMBER, and PNUMBER as primary keys for the relations EMPLOYEE, DEPARTMENT, and PROJECT, respectively. Knowledge that DNAME of DEPARTMENT and PNAME of PROJECT are secondary keys is kept for possible use later in the design.

The relations that are created from the mapping of entity types are sometimes called entity relations because each tuple represents an entity instance.

Q.4) Define the following terms: disk, disk pack, track, block, cylinder, sector, interblock gap, read/write head.

Answer:

Disk: Disks are used for storing large amounts of data. The most basic unit of data on the disk is a single bit of information. By magnetizing an area on disk in certain ways, one can make it represent a bit value of either 0 or 1. To code information, bits are grouped into bytes. Byte sizes are typically 4 to 8 bits, depending on the computer and the device. We assume that one character is stored in a single byte, and we use the terms byte and character interchangeably. The capacity of a disk is the number of bytes it can store, which is usually very large. Small floppy disks used with microcomputers typically hold from 400 Kbytes to 1.5 Mbytes; hard disks for micros typically hold from several hundred Mbytes up to a few Gbytes. Whatever their capacity, disks are all made of magnetic material shaped as a thin circular disk and protected by a plastic or acrylic cover. A disk is single-sided if it stores information on only one of its surfaces and double-sided if both surfaces are used.

Disk Pack: To increase storage capacity, disks are assembled into a disk pack, which may include many disks and hence many surfaces. A disk pack is a layered grouping of hard disk platters (circular, rigid discs coated with a magnetic data storage surface). The disk pack is the core component of a hard disk drive. In modern hard disks, the disk pack is permanently sealed inside the drive. In many early hard disks, the disk pack was a removable unit, and would be supplied with a protective canister featuring a lifting handle.

Track and Cylinder: Information is stored on a disk surface in concentric circles of small width, each having a distinct diameter. Each circle is called a track; it is the (circular) area on a disk platter which can be accessed without moving the access arm of the drive. For disk packs, the tracks with the same diameter on the various surfaces are called a cylinder because of the shape they would form if connected in space. Equivalently, a cylinder is the set of tracks of a disk drive which can be accessed without changing the position of the access arm. The number of tracks on a disk ranges from a few hundred to a few thousand, and the capacity of each track typically ranges from tens of Kbytes to 150 Kbytes.

Sector: A sector is a fixed-size physical data block on a disk drive. Because a track usually contains a large amount of information, it is divided into smaller blocks or sectors. The division of a track into sectors is hard-coded on the disk surface and cannot be changed. One type of sector organization calls a portion of a track that subtends a fixed angle at the center a sector. Several other sector organizations are possible, one of which is to have the sectors subtend smaller angles at the center as one moves away, thus maintaining a uniform density of recording.

Block and Interblock Gap: A block is a physical data record, separated on the medium from other blocks by interblock gaps. The division of a track into equal-sized disk blocks is set by the operating system during disk formatting. Block size is fixed during initialization and cannot be changed dynamically. Typical disk block sizes range from 512 to 4096 bytes. A disk with hard-coded sectors often has the sectors subdivided into blocks during initialization. An interblock gap is an area between data blocks which contains no data and which separates the blocks. Blocks are separated by fixed-size interblock gaps, which include specially coded control information written during disk initialization. This information is used to determine which block on the track follows each interblock gap.

Read/Write Head: On a disk drive, the read/write head is the component, carried on the access arm, that reads data from or writes data to the track passing beneath it. Tape uses the same mechanism: a tape drive is required to read the data from or to write the data to a tape reel, and a read/write head is used to read or write data on tape. Usually, each group of bits that forms a byte is stored across the tape, and the bytes themselves are stored consecutively on the tape. Data records on tape are also stored in blocks, although the blocks may be substantially larger than those for disks, and interblock gaps are also quite large. With typical tape densities of 1600 to 6250 bytes per inch, a typical interblock gap of 0.6 inches corresponds to 960 to 3750 bytes of wasted storage space.
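The wasted-space figures quoted for tape follow from multiplying the recording density by the gap length. The small sketch below reproduces that arithmetic; the function name gap_bytes is an assumption, and the result is rounded to a whole number of bytes.

```python
def gap_bytes(density_bytes_per_inch, gap_inches=0.6):
    """Bytes that could have been stored in the space taken by one interblock gap."""
    return round(density_bytes_per_inch * gap_inches)


print(gap_bytes(1600))   # 960
print(gap_bytes(6250))   # 3750
```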
