
PRESENTATION ON FILE ORGANISATION
SUBMITTED TO: MRS. SONAL BENIWAL
BY: JYOTI (C-2657), KRITIKA (30), POOJA (10), MANISHA (16)
CSE-3rd sem (Dec-10)
FILE ORGANISATION
A technique for physically arranging the records
of a file on secondary storage.
FACTORS OF SELECTING
FILE ORGANISATION
Fast data retrieval and throughput.
Efficient storage space utilization.
Protection from failure and data loss.
Minimizing need for reorganization.
Security from unauthorized use.
TYPES
Sequential file organization
Indexed file organization
Hashed file organization
SEQUENTIAL FILE
ORGANIZATION
It contains records organized in the order in which they are
entered. The order of the records is fixed.
Records in a sequential file can be read or written only sequentially.
After you have placed a record in a sequential file, you cannot
shorten, lengthen, or delete the record. However, you can update
(rewrite) a record if its length doesn't change. New records are
added at the end of the file.
If the order in which you keep records in a file is not important,
sequential organization is a good choice whether there are many
records or only a few. Sequential output is also useful for printing
reports.
THE SEQUENTIAL FILE
Fixed format used for records.
Records are of the same length.
All fields are of the same order and length.
Field names and lengths are attributes of the file.
One field is the key field, which uniquely identifies each record.
Records are stored in key sequence.
New records are placed in a log file or transaction file.
A batch update is performed to merge the log file with the master file.
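The batch update can be sketched as a merge of two key-ordered files. This is a minimal Python sketch, not a definitive implementation: it assumes records are (key, data) tuples, both files are already sorted by key with no duplicate keys inside either file, and a transaction whose key already exists replaces the master record. The function name batch_update and the sample data are hypothetical.

```python
def batch_update(master, transactions):
    """Merge a key-sorted transaction log into a key-sorted master file.

    Both inputs are lists of (key, data) tuples sorted by key.
    Assumption: a transaction whose key already exists in the master
    replaces that record; otherwise it is inserted in key order.
    """
    merged = []
    i = j = 0
    while i < len(master) and j < len(transactions):
        m_key, t_key = master[i][0], transactions[j][0]
        if m_key < t_key:
            merged.append(master[i])
            i += 1
        elif m_key > t_key:
            merged.append(transactions[j])
            j += 1
        else:
            # Same key: the transaction supersedes the master record.
            merged.append(transactions[j])
            i += 1
            j += 1
    # At most one of these tails is non-empty.
    merged.extend(master[i:])
    merged.extend(transactions[j:])
    return merged


# Made-up sample data for illustration only.
master = [(1, "Asha"), (3, "Ravi"), (5, "Meena")]
log = [(2, "Kiran"), (3, "Ravi K."), (6, "Tara")]
new_master = batch_update(master, log)
```

Both files are read once from start to end, which is why the transactions must be sorted before processing.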
SEQUENTIAL FILE
[Figure: a sequential file shown as a table with columns Serial no., Name, Roll no., Marks and rows Record1 to Record7]
Fixed-length records.
Fixed set of fields in fixed order.
Sequential order based on key field.
[Figure: file layout from the beginning of the file: Record1, Record2, ..., Record(n-1), Record(n), followed by the record terminator and end of file]
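Because every record has the same length and field layout, the byte offset of record i is simply i times the record size. The following Python sketch illustrates this with the standard struct module. The field widths (a 20-byte name, 4-byte integers) and the sample values are assumptions made for illustration; the slides do not specify them.

```python
import io
import struct

# Assumed layout: serial no. (uint32), name (20 bytes), roll no. (uint32),
# marks (uint32). "<" means little-endian with no padding between fields.
RECORD = struct.Struct("<I20sII")


def pack_record(serial, name, roll, marks):
    # struct pads the 20-byte name field with null bytes if the name is shorter.
    return RECORD.pack(serial, name.encode("ascii"), roll, marks)


def read_record(f, index):
    """Read the record at position `index` (0-based) from file object `f`."""
    f.seek(index * RECORD.size)
    serial, name, roll, marks = RECORD.unpack(f.read(RECORD.size))
    return serial, name.rstrip(b"\x00").decode("ascii"), roll, marks


# An in-memory file stands in for a file on secondary storage.
f = io.BytesIO()
for rec in [(1, "Asha", 101, 88), (2, "Ravi", 102, 91), (3, "Meena", 103, 79)]:
    f.write(pack_record(*rec))

second = read_record(f, 1)
```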
ADVANTAGES OF
SEQUENTIAL FILE

• Very efficient when most of the records must be processed, e.g. payroll.
• Can be stored on inexpensive devices like magnetic tape.
• Very efficient if the data has a natural order.
DISADVANTAGES
The entire file must be processed even if a single record is to be searched.
Transactions have to be sorted before processing.
Overall processing is slow.
INDEXED SEQUENTIAL
FILE
The index provides a lookup capability to quickly
reach the vicinity of the desired record.
Each index entry contains a key field and a pointer into the main
file.
The index is searched to find the highest key value
that is equal to or precedes the desired key value.
The search continues in the main file at the location
indicated by the pointer.
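The lookup steps above can be sketched in Python as follows. This is an illustration under stated assumptions: the main file is a list of (key, data) tuples sorted by key, the index holds one entry for every step-th record, and the index is searched with binary search from the standard bisect module rather than a linear scan. The names build_index and lookup are hypothetical.

```python
from bisect import bisect_right


def build_index(main_file, step):
    """Sparse index: one (key, position) entry for every `step`-th record."""
    return [(main_file[pos][0], pos) for pos in range(0, len(main_file), step)]


def lookup(main_file, index, key):
    index_keys = [k for k, _ in index]
    # Highest index key that is equal to or precedes the desired key.
    slot = bisect_right(index_keys, key) - 1
    if slot < 0:
        return None  # key is smaller than every key in the file
    # Continue sequentially in the main file from the pointed-to position.
    pos = index[slot][1]
    while pos < len(main_file) and main_file[pos][0] <= key:
        if main_file[pos][0] == key:
            return main_file[pos]
        pos += 1
    return None


# Made-up main file with keys 10, 20, ..., 200.
main_file = [(k, f"record-{k}") for k in range(10, 210, 10)]
index = build_index(main_file, step=5)  # index keys: 10, 60, 110, 160
found = lookup(main_file, index, 130)
```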
COMPARISON OF
SEQUENTIAL AND INDEXED
SEQUENTIAL

Example: a file contains 1 million records.
On average, 500,000 accesses are required to find a
record in a sequential file.
If an index contains 1000 entries, it will take on
average 500 accesses to find the key in the index, followed by
500 accesses in the main file. The average is now
1000 accesses.
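The arithmetic behind this comparison can be checked with a few lines of Python. The sketch assumes both the index and the main file are scanned linearly, and that the index entries divide the file into equal-sized sections; the function names are hypothetical.

```python
def avg_sequential(n_records):
    """Average accesses for a linear scan: half the file."""
    return n_records / 2


def avg_indexed(n_records, n_index_entries):
    """Average accesses: half the index, then half of one file section."""
    records_per_entry = n_records / n_index_entries
    return n_index_entries / 2 + records_per_entry / 2


n = 1_000_000
sequential_cost = avg_sequential(n)      # 500,000 accesses
indexed_cost = avg_indexed(n, 1000)      # 500 + 500 = 1,000 accesses
```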
ADVANTAGES OF INDEXED
SEQUENTIAL FILE
Provides flexibility for users
who need both types of access (sequential and direct)
with the same file.
Faster than the sequential file.
DISADVANTAGES

Extra storage space for the index is required.
DIRECT(RANDOM) FILE
ORGANISATION
Records are read directly from, or written directly to, the file.
The records are stored at known addresses.
The address is calculated by applying a mathematical function to
the key field.
A random file has to be stored on a direct-access backing
storage medium, e.g. magnetic disk, CD, DVD.
Example: any information retrieval system (e.g. a train timetable
system).
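A minimal Python sketch of direct access follows. It assumes fixed-size slots, integer keys, and the division-remainder method as the address function (one common choice; the slides do not name a specific function). Collisions between keys that map to the same slot are deliberately ignored here. All names, sizes, and sample values are hypothetical.

```python
import io

RECORD_SIZE = 16   # bytes per slot (assumed)
NUM_SLOTS = 100    # number of slots in the file (assumed)


def address(key):
    """Byte offset of the slot for `key` (division-remainder method)."""
    return (key % NUM_SLOTS) * RECORD_SIZE


def write_slot(f, key, text):
    # Jump straight to the computed address; no other record is touched.
    f.seek(address(key))
    f.write(text.encode("ascii")[:RECORD_SIZE].ljust(RECORD_SIZE, b" "))


def read_slot(f, key):
    f.seek(address(key))
    return f.read(RECORD_SIZE).decode("ascii").rstrip()


# An in-memory, pre-allocated file stands in for a disk file.
f = io.BytesIO(b" " * (RECORD_SIZE * NUM_SLOTS))
write_slot(f, 12045, "Delhi 06:15")    # made-up timetable entries
write_slot(f, 12002, "Bhopal 15:00")
entry = read_slot(f, 12045)
```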
ADVANTAGES
Any record can be directly accessed
Speed of record processing is very fast
Up-to-date file because of online updating
Concurrent processing is possible
DISADVANTAGES

More complex than sequential.
Does not fully use memory locations.
More security and backup problems.
HASH FILE
ORGANIZATION
The address of the disk block containing a desired
record is computed by applying a function (the HASH
FUNCTION) to the search key.
Let K denote the set of all search keys and B denote the
set of all bucket addresses. A hash function h is a
function that maps K to B.
A bucket is typically a disk block.
OPERATIONS
To insert a record with a given search key, compute h(key),
which gives the address of the bucket for that
record. If there is space in the bucket, the record is
stored there; otherwise an overflow mechanism such as chaining is needed.
To look up a record with a given search key, compute h(key) and check
every record in that bucket to find the desired
record.
To delete a record, hash its key in the same way, find the record in the bucket, and delete it.
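The three operations can be sketched in Python as below. This is an illustration only: it assumes integer search keys, a simple modulo hash function, a small fixed bucket capacity, and overflow chaining as the overflow mechanism. The class name HashFile and the capacity value are hypothetical.

```python
BUCKET_CAPACITY = 2  # records per bucket (assumed; a real bucket is a disk block)


class HashFile:
    def __init__(self, num_buckets):
        # Each bucket is a chain of blocks; blocks after the first are overflow.
        self.buckets = [[[]] for _ in range(num_buckets)]

    def h(self, key):
        """Hash function: maps an integer search key to a bucket address."""
        return key % len(self.buckets)

    def insert(self, key, record):
        chain = self.buckets[self.h(key)]
        for block in chain:
            if len(block) < BUCKET_CAPACITY:
                block.append((key, record))
                return
        # No space anywhere in the chain: add an overflow block.
        chain.append([(key, record)])

    def lookup(self, key):
        # Check every record in the bucket and its overflow blocks.
        for block in self.buckets[self.h(key)]:
            for k, record in block:
                if k == key:
                    return record
        return None

    def delete(self, key):
        for block in self.buckets[self.h(key)]:
            for i, (k, _) in enumerate(block):
                if k == key:
                    del block[i]
                    return True
        return False


hf = HashFile(num_buckets=4)
for key in (1, 5, 9):          # all three keys hash to bucket 1
    hf.insert(key, f"rec-{key}")
```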
HASH FUNCTIONS
Hash functions should be chosen so that
the distribution of records across buckets is uniform
and random.
Handling bucket overflows:
Overflow may occur due to an insufficient number of buckets.
It may also occur due to bucket skew.
Solutions: overflow bucket chaining, double hashing,
linear probing, quadratic probing.
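Two of the listed solutions, linear and quadratic probing, differ only in the step added to the home slot on each retry. The Python sketch below shows both on a small in-memory table. It assumes integer keys, one key per slot, and a modulo hash function; the function names and sample keys are hypothetical, and chaining and double hashing are not covered here.

```python
def probe_sequence(key, table_size, method="linear"):
    """Yield the slots tried for `key` under linear or quadratic probing."""
    home = key % table_size
    for i in range(table_size):
        step = i if method == "linear" else i * i
        yield (home + step) % table_size


def insert(table, key, method="linear"):
    """Place `key` in the first free slot of its probe sequence."""
    for slot in probe_sequence(key, len(table), method):
        if table[slot] is None:
            table[slot] = key
            return slot
    raise RuntimeError("no free slot found")


# Keys 10, 17 and 24 all have home slot 3 in a table of size 7.
linear_table = [None] * 7
linear_slots = [insert(linear_table, k, "linear") for k in (10, 17, 24)]

quad_table = [None] * 7
quad_slots = [insert(quad_table, k, "quadratic") for k in (10, 17, 24)]
```

Linear probing places the colliding keys in adjacent slots, while quadratic probing spreads them further apart.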
HASH INDICES
Hashing can be used for organizing indices. A hash index organizes
search keys with their associated pointers.
Typically only secondary indices need to be organized using hashing.
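A hash index can be sketched with a Python dictionary, which is itself a hash table. The example below builds a secondary index on a non-key field and stores, for each search-key value, the positions of the matching records as pointers. The record layout, the field choice, and the function name build_hash_index are assumptions made for illustration.

```python
from collections import defaultdict

# Made-up records: (roll no., name, department).
records = [
    (101, "Asha", "CSE"),
    (102, "Ravi", "ECE"),
    (103, "Meena", "CSE"),
]


def build_hash_index(records, field):
    """Map each value of `field` to the positions of records holding it."""
    index = defaultdict(list)
    for position, record in enumerate(records):
        index[record[field]].append(position)
    return index


dept_index = build_hash_index(records, field=2)
# Follow the pointers stored under "CSE" back to the records.
cse_records = [records[p] for p in dept_index["CSE"]]
```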
FILE ORGANIZATION
No | FACTORS | SEQUENTIAL | INDEXED | HASHED
1. | Storage space | No wasted space | No wasted space for data, but extra space for index | Extra space may be needed for addition and deletion of records
2. | Sequential retrieval on primary key | Very fast | Moderately fast | Impractical
3. | Random retrieval on primary key | Impractical | Moderately fast | Very fast
4. | Multiple key retrieval | Possible, but requires scanning whole file | Very fast with multiple indexes | Not possible
5. | Deleting rows | Can create wasted space or require reorganizing | If space can be dynamically allocated, this is easy, but requires maintenance of indexes | Very easy
6. | Adding rows | Requires rewriting files | If space can be dynamically allocated, this is easy, but requires maintenance of indexes | Very easy, except multiple keys with same address require extra work
7. | Updating rows | Usually requires rewriting files | Easy, but requires maintenance of indexes | Very easy
THANK
YOU…
