
Chapter 7 & 8

File Systems

Computer Department SPIT, Piludara


File Systems

• File:
   • A logical storage unit.
   • A collection of related information.


File Systems (1)

Essential requirements for long-term information storage:
1. It must be possible to store a very large amount of information.
2. The information must survive the termination of the process using it.
3. Multiple processes must be able to access the information concurrently.

File Systems (2)

Think of a disk as a linear sequence of fixed-size blocks that supports reading and writing of blocks.
Questions that quickly arise:
1. How do you find information?
2. How do you keep one user from reading another user's data?
3. How do you know which blocks are free?


File Naming
• Files are used to store information on the disk and read it back later.
• To access a file later, or from another process, we need its name, so file naming is an important characteristic of a file system.
• When a process or user creates a file, it gives the file a name. After that process terminates, the file continues to exist and can be accessed by other processes or users through its name.
• Many operating systems support two-part file names:
1. Name of the file
2. Extension of the file
Example: prog.c
• Some common file extensions are listed in the following figure.
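The two-part naming convention can be illustrated with a short Python sketch. The helper name split_name is hypothetical; os.path.splitext is the standard-library call that splits off the last extension.

```python
import os.path

def split_name(filename):
    """Split a two-part file name into (name, extension without the dot)."""
    name, ext = os.path.splitext(filename)
    return name, ext.lstrip(".")

print(split_name("prog.c"))   # ('prog', 'c')
print(split_name("README"))   # ('README', '')
```

A name without a dot simply yields an empty extension, which matches systems where the extension is only a convention.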


File Structure

Figure 4-2. Three kinds of files. (a) Byte sequence. (b) Record sequence. (c) Tree of records.
• (a) Byte sequence: the file is an unstructured sequence of bytes.

• (b) Record sequence: the file is a sequence of fixed-length records, each with some internal structure.
-- All records must be the same length.

• (c) Tree: the file consists of a tree of records.
-- The records need not all be the same length.
-- Each record contains a key, which is used to search for a particular record in the file.
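Because all records in the record-sequence model have the same length, record i starts at byte i times the record length. The sketch below illustrates this with an assumed 16-byte record layout (a 4-byte id and a 12-byte name); the layout and the function names are illustrative only.

```python
import io
import struct

# Hypothetical record layout: 4-byte unsigned id + 12-byte name = 16 bytes.
RECORD = struct.Struct("<I12s")

def write_records(f, rows):
    """Append each (id, name) pair as one fixed-length record."""
    for rec_id, name in rows:
        f.write(RECORD.pack(rec_id, name.encode("ascii")))

def read_record(f, i):
    """Read record i directly: its offset is i * record length."""
    f.seek(i * RECORD.size)
    rec_id, raw = RECORD.unpack(f.read(RECORD.size))
    return rec_id, raw.rstrip(b"\0").decode("ascii")

f = io.BytesIO()   # an in-memory stand-in for a file on disk
write_records(f, [(1, "alpha"), (2, "beta"), (3, "gamma")])
print(read_record(f, 2))   # (3, 'gamma')
```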



File Attributes
• Every file has a name and its data.

• In addition, the operating system associates other information with each file, such as the date and time the file was created, the size of its data, and its protection.

• These extra items are known as file attributes.


• File attributes fall into five main categories:
1. Protection: the first four attributes in the figure: protection, password, creator, and owner.
• They determine who can access the file and who cannot.
2. Flags: eight attributes: read-only flag, hidden flag, system flag, archive flag, ASCII/binary flag, random-access flag, temporary flag, and lock flags.
• Flags are bits, so each is either 1 or 0; they enable or disable specific properties of the file.
3. Key: three attributes: record length, key position, and key length.
• These are used in files whose records are looked up by key; they provide the information needed to find the keys.
4. Date and time: three attributes: creation time, last access time, and last change time.
• They record when the file was created, when it was last accessed, and when it was last changed.
5. Size: two attributes: current size and maximum size.
• Current size tells how big the file is at present. Maximum size was used by some older operating systems, which required it to be specified when the file was created so that the maximum amount of storage could be reserved in advance.
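As a rough illustration, several of these attributes can be inspected from Python with os.stat. Which attributes exist depends on the operating system; the dictionary keys below are labels chosen for this sketch, and st_mtime is taken here as the "last change" time of the file's data.

```python
import os
import stat
import tempfile
import time

# Create a small scratch file so that the example is self-contained.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello")
    path = tmp.name

info = os.stat(path)
attrs = {
    "size": info.st_size,                        # current size in bytes
    "owner_uid": info.st_uid,                    # owner (numeric user id)
    "protection": stat.filemode(info.st_mode),   # e.g. '-rw-------'
    "last_access": time.ctime(info.st_atime),
    "last_change": time.ctime(info.st_mtime),    # last data modification
}
print(attrs)
os.remove(path)
```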
File Operations
The most common system calls relating to files:
1. Create
2. Delete
3. Open
4. Close
5. Read
6. Write
7. Append
8. Seek
9. Get Attributes
10. Set Attributes
11. Rename


• 1. Create: the file is created with no data.
• 2. Delete: when the file is no longer needed, it is deleted to free up disk space.
• 3. Open: before using a file, a process must open it. This lets the system fetch the attributes and the list of disk addresses into main memory for rapid access on later calls.
• 4. Close: when all accesses are finished, the attributes and disk addresses are no longer needed, so the file should be closed to free up internal table space.
• 5. Read: data are read from the file, usually from the current position. The caller must specify how much data is needed.
• 6. Write: data are written to the file at the current position. If the current position is the end of the file, the file size increases. If the current position is in the middle of the file, existing data are overwritten.
• 7. Append: a restricted form of write that can only add data to the end of the file.
• 8. Seek: for random-access files, a method is needed to specify where to take the data from. Seek sets the file pointer to a specific position; after that, data can be read from or written to that position.
• 9. Get attributes: returns the attributes of the file.
• 10. Set attributes: some of the attributes can be changed with this operation.
• 11. Rename: changes the name of an existing file. It is not strictly necessary, because the file can usually be copied to a new file with the new name and the old file then deleted.
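The operations above map closely onto the low-level calls in Python's os module. The following is a minimal sketch on a scratch file; the file names are made up, and the mapping of each call to a slide item is noted in the comments.

```python
import os
import tempfile

workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "demo.txt")

fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o600)  # create + open
os.write(fd, b"hello world")                       # write at position 0
os.lseek(fd, 6, os.SEEK_SET)                       # seek to byte 6
os.write(fd, b"WORLD")                             # overwrite in the middle
os.lseek(fd, 0, os.SEEK_END)                       # move to the end of file
os.write(fd, b"!")                                 # write at the end: file grows
os.lseek(fd, 0, os.SEEK_SET)
data = os.read(fd, 100)                            # read: caller gives a byte count
os.close(fd)                                       # close

new_path = path + ".bak"
os.rename(path, new_path)                          # rename
size = os.stat(new_path).st_size                   # get attributes
os.remove(new_path)                                # delete
os.rmdir(workdir)
print(data, size)   # b'hello WORLD!' 12
```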



Files Access
• Sequential access

• Random access

• Keyed (or indexed) access


Sequential Access Method
• Read all bytes or records in order, starting from the beginning.
• Cannot jump around.
• May be able to rewind or back up.
• Appropriate for certain media or systems:
   • Magnetic tape or punched cards
   • Video tape (VHS, etc.)
   • Unix, Linux, and Windows pipes
   • Network streams
Random Access Method
• Bytes or records can be read or written in any order.
   • Replace existing bytes or records.
   • Append to the end of the file.
   • Cannot insert data between existing bytes!
• The seek operation moves the current file pointer.
   • After a seek, the file can be read sequentially from the current position.
   • The pointer is discarded on close.
• Typical of most modern information storage:
   • Database systems
   • Randomly accessible multimedia (CD, DVD, etc.)
   • …
Keyed (or indexed) Access Methods
• Access items in a file based on the contents of (part of) an item in the file.
• That is, a record is located by a key value through an index, rather than by its position.
• Provided in older commercial operating systems (IBM ISAM).
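A toy version of keyed access can be sketched as an in-memory index that maps each key to the byte offset of its record. This is only an illustration of the idea, not how ISAM is implemented; the record size, key size, and function names are assumptions.

```python
import io

def build_index(f, record_size, key_size):
    """Map each record's key (its first key_size bytes) to its byte offset."""
    index = {}
    f.seek(0)
    offset = 0
    while True:
        record = f.read(record_size)
        if len(record) < record_size:
            break
        index[record[:key_size]] = offset
        offset += record_size
    return index

def lookup(f, index, key, record_size):
    """Fetch a record by key: one index probe, then one seek and read."""
    f.seek(index[key])
    return f.read(record_size)

# Three 8-byte records; the first 3 bytes of each are the key.
f = io.BytesIO(b"ant00001" b"cow00002" b"bee00003")
index = build_index(f, record_size=8, key_size=3)
print(lookup(f, index, b"bee", 8))   # b'bee00003'
```

The record length, key position, and key length used here are the same three values listed under the Key attributes earlier.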
File Types
• Many operating systems support several types of files.
• Three common types are:
1. Regular files: contain user information.
2. Character special files: related to I/O and used to model serial devices such as printers, terminals, and networks.
3. Block special files: used to model disks.

• Regular files are discussed below.
• A regular file is either (1) an ASCII file or (2) a binary file.
• 1.) ASCII files consist of lines of text.
• The advantage of ASCII files is that they can be displayed and printed as is, and they can be edited with any text editor.
• If a large number of programs use ASCII files for input and output, it is easy to connect the output of one program to the input of another, as in pipelines.
• 2.) Binary files: files that are not ASCII files. They are not readable as text, but they usually have an internal structure known to the programs that use them.

• Two examples of binary files, an executable file and an archive, are shown in the following figure.
Figure 4-3. (a) An executable file. (b) An archive.
• As shown in figure (a), an executable file contains five sections: header, text, data, relocation bits, and symbol table.
• The header contains the magic number, text size, data size, BSS size, symbol table size, and entry point.
• The magic number identifies the file as an executable file, which prevents the accidental execution of a file that is not in this format.
• Next come the sizes of the various pieces of the file.

• As shown in figure (b), an archive consists of a collection of library procedures (modules).
• Each module is prefaced by a header giving its name, creation date, and size.
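The role of a magic number can be sketched by checking the first bytes of a file against a small table. The two values below (the ELF prefix and the Unix ar archive prefix) are real magic numbers but are used here only as stand-ins; the header format in the figure is an older one, and the function name identify is hypothetical.

```python
# Leading bytes that identify two well-known binary formats.
MAGIC = {
    b"\x7fELF": "ELF executable or object file",
    b"!<arch>\n": "Unix ar archive",
}

def identify(header):
    """Return a description if the leading bytes match a known magic number."""
    for magic, kind in MAGIC.items():
        if header.startswith(magic):
            return kind
    return "unknown"

print(identify(b"\x7fELF\x02\x01\x01"))   # ELF executable or object file
print(identify(b"hello, world"))          # unknown
```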
Directories
• To keep track of files, file systems use directories or folders, which in many operating systems are themselves files.
• There are three kinds of directory systems:
1. Single-level directory systems
2. Two-level directory systems
3. Hierarchical directory systems

1. Single-level directory systems:
• This is the simplest directory structure.
• All files are stored in a single directory.
• This single directory is also known as the root directory.
• In the figure, there is one root directory, and it contains four files.
• Limitations: every file name in the single directory must be different, and it can be difficult to remember all the file names.
2. Two-level directory systems:
• To avoid the problem of unique file names, a two-level directory can be used.
• Files may have the same name as long as they are in different directories.
• This design can be used on a multiuser computer or on a network.
[Figure: a two-level directory system, with the root directory at the top, the user directories below it, and the files at the bottom.]

• As shown in the figure, the root directory is at the first level and the user directories are at the second level.
• On a network, some kind of login procedure is needed, in which the user specifies a login name to access his or her directory.
• Limitation: each user has only one directory and cannot create subdirectories in it.
3. Hierarchical directory systems:
• This is the most useful structure, particularly on multiuser and networked systems.
• A user can create any number of subdirectories.

• As shown in the figure, there can be any number of levels.
• At the first level is the root directory.
• At the second level are the user directories.
• At the other levels are the users' subdirectories.
Directory Operations
System calls for managing directories:
1. Create
2. Delete
3. Opendir
4. Closedir
5. Readdir
6. Rename
7. Link
8. Unlink


1. Create: an empty directory is created.
• It is empty except for dot and dotdot, which are put there automatically by the system.
• s = mkdir(name, mode)
2. Delete: a directory is deleted.
• Only an empty directory can be deleted.
• s = rmdir(name)
3. Opendir: a directory is opened for reading.
• Before a directory can be read, it must be opened.
4. Closedir: after a directory has been read, it should be closed to free up internal table space.
5. Readdir: returns the next entry in an open directory.
• Formerly, directories could be read with the ordinary read system call, but that forces the programmer to know the internal structure of directories.
• Readdir always returns one entry in a standard format, no matter which directory structure is in use.
6. Rename: changes the name of a directory, just as a file can be renamed.
• s = rename(oldname, newname)
• Note: s = chdir(dirname) changes the working directory; it does not rename anything.
7. Link: a technique that allows a file to appear in more than one directory.
• s = link(name1, name2) creates a new entry, name2, that points to name1.
• The call specifies an existing file and a path name, and creates a link from the path name to the existing file.
• A link of this kind, which increments the counter in the file's i-node, is called a hard link.
8. Unlink: a directory entry is removed.
• If the file being unlinked is present in only one directory, it is removed from the file system. If it is present in several directories, only the path name specified is removed and the others remain.
• s = unlink(name) removes a directory entry.
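The directory calls above have close counterparts in Python's os module. The sketch below assumes a POSIX-like system whose temporary directory supports hard links; the directory and file names are made up.

```python
import os
import tempfile

base = tempfile.mkdtemp()
d = os.path.join(base, "docs")
os.mkdir(d, 0o755)                        # create (mkdir)

a = os.path.join(d, "a.txt")
with open(a, "w") as f:
    f.write("data")

b = os.path.join(d, "b.txt")
os.link(a, b)                             # hard link: a second name for the same file
links_after_link = os.stat(a).st_nlink    # the i-node's link counter

with os.scandir(d) as entries:            # opendir / readdir / closedir
    names = sorted(e.name for e in entries)

os.unlink(b)                              # remove one directory entry
links_after_unlink = os.stat(a).st_nlink  # the file itself still exists

os.unlink(a)                              # last link gone: the file is removed
os.rmdir(d)                               # delete: only works on an empty directory
os.rmdir(base)
print(names, links_after_link, links_after_unlink)
```

On such a system the link count goes from 1 to 2 after link and back to 1 after unlink, which is the counter behaviour described for hard links.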
File System Implementation



File System Layout

Figure 4-9. A possible file system layout.



• As shown in the figure:
• File systems are stored on disks.
• Most disks can be divided into one or more partitions, with an independent file system on each partition (for example NTFS, FAT, or ext).
• A disk is divided into sectors. Sector 0 of the disk is called the MBR (Master Boot Record) and is used to boot the computer.
• The end of the MBR contains the partition table.
• This table gives the starting and ending addresses of each partition.
• One of the partitions in the table is marked as active.
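To make the partition table concrete, here is a sketch that parses the four table entries of a classic PC MBR sector. The offsets (table at byte 446, 16-byte entries, 0x80 active flag, 0x55 0xAA signature) come from the conventional MBR layout rather than from the slides, and the sample sector is synthetic. In that layout each entry stores a starting sector and a sector count, from which the ending address follows.

```python
import struct

SECTOR_SIZE = 512
TABLE_OFFSET = 446      # partition table position in the classic MBR
ENTRY_SIZE = 16
ACTIVE_FLAG = 0x80

def parse_partition_table(mbr):
    """Return (active, type, first_sector, sector_count) for each used entry."""
    if len(mbr) != SECTOR_SIZE or mbr[510:512] != b"\x55\xaa":
        raise ValueError("not a valid MBR sector")
    partitions = []
    for i in range(4):
        start = TABLE_OFFSET + i * ENTRY_SIZE
        entry = mbr[start:start + ENTRY_SIZE]
        status, ptype = entry[0], entry[4]
        first_sector, count = struct.unpack_from("<II", entry, 8)
        if ptype != 0:                      # type 0 marks an unused entry
            partitions.append((status == ACTIVE_FLAG, ptype, first_sector, count))
    return partitions

# Build a synthetic MBR with one active partition (all values are made up).
mbr = bytearray(SECTOR_SIZE)
mbr[TABLE_OFFSET] = ACTIVE_FLAG
mbr[TABLE_OFFSET + 4] = 0x83
struct.pack_into("<II", mbr, TABLE_OFFSET + 8, 2048, 204800)
mbr[510:512] = b"\x55\xaa"
print(parse_partition_table(bytes(mbr)))   # [(True, 131, 2048, 204800)]
```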
• When the computer is booted, the BIOS (Basic Input Output System) reads in and executes the MBR.
• The MBR program then locates the active partition, reads in its first block, called the boot block, and executes it.
• The program in the boot block loads the operating system contained in that partition.

Note: for uniformity, every partition starts with a boot block, even if it does not contain a bootable operating system.


• Beyond the boot block, the layout of a partition depends on the file system. It often contains the following items:

1. Superblock: the superblock contains the following items:

a) Key parameters: these are read into memory when the computer is booted or when the file system is first touched.

b) Magic number: identifies the file system type.

c) The number of blocks in the file system.


2. Free space management: records how many and which blocks in the file system are free.

3. I-nodes: an array of data structures, one per file, telling all about the file.

4. Root directory: the top of the file system tree, at the first level.

5. Files and directories: the remainder of the disk, which contains all the other files, user directories, and their subdirectories.
• There are three methods for allocating disk blocks to files:

1. Contiguous allocation

2. Linked list allocation

3. I-node (index node) allocation


Contiguous Allocation
• The contiguous allocation method requires each file to occupy a set of contiguous blocks on the disk.
• A single run of blocks is allocated to a file at the time of creation.
• Advantages:
• Only a single entry per file is needed in the allocation table:
– the starting block and the length of the file.
• It suits both sequential and direct access.
– For sequential access, the file system remembers the address of the last block referenced and then reads the next block.
– For direct access to block i of a file that starts at block b, we can go straight to block b + i.
• Read performance is excellent, because the entire file can be read from the disk in a single operation. Only one seek is needed (to the first block); after that, the blocks are read sequentially.
• Disadvantages:
• Finding free space for a new file is difficult.
– The free-space management system must find a hole of contiguous free blocks that is large enough, and this search can be slow.
• The method suffers from external fragmentation.
– The problem arises as files are created and removed over time.
– Free storage is broken up into a number of holes (chunks) between the remaining files.
– There may be enough free space in total, yet no single hole is large enough to store a new file, so the space is wasted.
– Even when a hole is large enough, choosing one requires knowing the final size of the new file, which is difficult to estimate.
• Determining how much space a file will need is a problem.
– At creation time, the user must specify the total space required so that it can be allocated.
– If too little space is allocated, the file cannot be extended in place.
– If too much space is allocated, the unused part is wasted.

• Today, contiguous allocation is used on CD-ROMs and DVDs.
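The trade-offs of contiguous allocation can be seen in a small simulation. The class below is a sketch, not a real file system: it uses a first-fit search over a list of free flags, and the class and method names are invented for the example.

```python
class ContiguousDisk:
    def __init__(self, n_blocks):
        self.free = [True] * n_blocks
        self.files = {}            # name -> (start block, length)

    def create(self, name, length):
        """First fit: allocate the first run of `length` free blocks."""
        run = 0
        for i, is_free in enumerate(self.free):
            run = run + 1 if is_free else 0
            if run == length:
                start = i - length + 1
                for b in range(start, i + 1):
                    self.free[b] = False
                self.files[name] = (start, length)
                return start
        raise RuntimeError("no hole large enough")

    def delete(self, name):
        start, length = self.files.pop(name)
        for b in range(start, start + length):
            self.free[b] = True

    def block_of(self, name, i):
        """Direct access: block i of a file starting at block b is b + i."""
        start, length = self.files[name]
        if not 0 <= i < length:
            raise IndexError(i)
        return start + i

disk = ContiguousDisk(10)
disk.create("A", 3)   # blocks 0-2
disk.create("B", 3)   # blocks 3-5
disk.create("C", 4)   # blocks 6-9
disk.delete("A")
disk.delete("C")
# 7 blocks are free in total, but the largest hole is only 4 blocks.
try:
    disk.create("D", 5)
except RuntimeError as e:
    print("D:", e)              # D: no hole large enough
print(disk.block_of("B", 2))    # 5
```

The failed creation of D is external fragmentation in miniature: enough free blocks exist, but not in one run.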


Linked List Allocation
• Each file is a linked list of disk blocks.
• As shown in the figure, the directory contains a pointer to the first block and the last block of the file.
• For example, a file of five blocks might start at block 9, then continue to blocks 16, 1, and 10, and finally block 25.
• Each block contains a pointer to the next block, hence the name linked list allocation.
• Advantages:
• No external fragmentation.
– Unlike contiguous allocation, every disk block can be used.
• Finding free space for a new file is easy.
– A file can start in any free block, so there is no need to search for a large enough hole.
• The amount of space a new file will need is not a problem.
– The size of the file does not have to be declared at creation time; blocks are allocated dynamically.
• A file can grow as long as free blocks are available.
• Disadvantages:
• It is efficient only for sequential access.
– To find the i-th block of a file, we must start at the beginning of the file and follow the pointers until we reach the i-th block.
• Direct access is therefore slow.
– Each pointer on the way to the wanted block requires a disk read.
• Each file requires slightly more space than its data alone.
– Part of every block is used by the pointer. If a pointer takes 4 bytes of a 512-byte block, about 0.78 percent of the disk is used for pointers rather than for information.
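The cost of direct access can be sketched with the five-block example above (blocks 9, 16, 1, 10, 25). The dictionary stands in for the disk, and the payload strings and the END marker are assumptions made for the illustration.

```python
END = -1
# Hypothetical disk: block number -> (payload, number of the next block).
disk = {
    9:  ("part0", 16),
    16: ("part1", 1),
    1:  ("part2", 10),
    10: ("part3", 25),
    25: ("part4", END),
}

def read_block(first_block, i):
    """Return (payload, blocks_read) for the i-th block of the file."""
    block = first_block
    reads = 1
    for _ in range(i):
        block = disk[block][1]   # a block must be read to learn its successor
        if block == END:
            raise IndexError(i)
        reads += 1
    return disk[block][0], reads

print(read_block(9, 3))    # ('part3', 4)

# Pointer overhead from the text: 4 bytes out of each 512-byte block.
print(f"{4 / 512:.2%}")    # 0.78%
```

Reaching block i costs i + 1 block reads, which is why this scheme suits sequential access but not direct access.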

Linked List Allocation Using a Table in Memory:


• The disadvantages of linked list allocation can be reduced by using a FAT (File Allocation Table).
• In this scheme, the pointer for each block is stored in a table rather than in the block itself.
• A section of the disk at the beginning of each partition is reserved for the FAT.
• As shown in the figure, the table contains one entry for each disk block and is indexed by block number.
• The directory entry contains only the block number of the first block of the file.
• The table entry indexed by that block number contains the block number of the next block in the file.
• The chain continues until the last block, which is marked with a special end-of-file value.
• Unused blocks are indicated by a table value of 0.
• Advantages:
• Direct access is much easier and faster.
– The chain must still be followed to reach a given block, but the whole table is in memory, so it can be followed without any disk reads.
• The entire block is available for data, because the pointers are kept in the table.

• Disadvantages:
• The entire table must be in main memory all the time it is in use, so memory overhead grows with the size of the disk.
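A table-based version of the same five-block file (blocks 9, 16, 1, 10, 25) might look like the sketch below. The table size of 32 entries and the end-of-file marker of -1 are assumptions for the example; the value 0 for unused blocks follows the text.

```python
FREE = 0
EOF = -1
# Table indexed by block number; each entry holds the next block of the file.
fat = [FREE] * 32
for block, nxt in [(9, 16), (16, 1), (1, 10), (10, 25), (25, EOF)]:
    fat[block] = nxt

def chain(first_block):
    """Follow the table from the directory's first block to the end marker."""
    blocks = []
    block = first_block
    while block != EOF:
        blocks.append(block)
        block = fat[block]
    return blocks

def free_blocks():
    """Unused blocks are the entries whose value is 0."""
    return [b for b, entry in enumerate(fat) if entry == FREE]

print(chain(9))             # [9, 16, 1, 10, 25]
print(len(free_blocks()))   # 27
```

Following the chain touches only the in-memory list, never the data blocks, which is the source of the speedup for direct access.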
I-nodes Allocation
Implementing Directories (1)

Figure 4-14. (a) A simple directory containing fixed-size entries with the disk addresses and
attributes in the directory entry. (b) A directory in which each entry just refers to an i-node.



Implementing Directories (2)

Figure 4-15. Two ways of handling long file names in a directory. (a) In-line. (b) In a heap.
Shared Files (1)

Figure 4-16. File system containing a shared file.


Shared Files (2)

Figure 4-17. (a) Situation prior to linking. (b) After the link is created. (c) After the original owner removes the file.
Journaling File Systems
Operations required to remove a file in UNIX:

1. Remove the file from its directory.
2. Release the i-node to the pool of free i-nodes.
3. Return all the disk blocks to the pool of free disk blocks.


Virtual File Systems (1)

Figure 4-18. Position of the virtual file system.



Virtual File Systems (2)

Figure 4-19. A simplified view of the data structures and code used by the VFS and concrete file system to do a read.
Disk Space Management
Block Size (1)

Figure 4-20. Percentage of files smaller than a given size (in bytes).


Disk Space Management
Block Size (2)

Figure 4-20. Percentage of files smaller than a given size (in bytes).


Disk Space Management
Block Size (3)

Figure 4-21. The solid curve (left-hand scale) gives the data rate of a disk. The
dashed curve (right-hand scale) gives the disk space efficiency. All files are 4 KB.



Keeping Track of Free Blocks (1)

Figure 4-22. (a) Storing the free list on a linked list. (b) A bitmap.
Keeping Track of Free Blocks (2)

Figure 4-23. (a) An almost-full block of pointers to free disk blocks in memory and three blocks of
pointers on disk. (b) Result of freeing a three-block file. (c) An alternative strategy for handling
the three free blocks. The shaded entries represent pointers to free disk blocks.



Disk Quotas

Figure 4-24. Quotas are kept track of on a per-user basis in a quota table.


File System Backups (1)

Backups to tape are generally made to handle one of two potential problems:

1. Recover from disaster.
2. Recover from stupidity.


File System Backups (2)

Figure 4-25. A file system to be dumped. Squares are directories, circles are files. Shaded items
have been modified since last dump. Each directory and file is labeled by its i-node number.



File System Backups (3)

Figure 4-26. Bitmaps used by the logical dumping algorithm.



File System Consistency

Figure 4-27. File system states. (a) Consistent. (b) Missing block.
(c) Duplicate block in free list. (d) Duplicate data block.
Caching (1)

Figure 4-28. The buffer cache data structures.



Caching (2)
• Some blocks, such as i-node blocks, are rarely referenced two times within a short interval.
• Consider a modified LRU scheme, taking two factors into account:

1. Is the block likely to be needed again soon?
2. Is the block essential to the consistency of the file system?
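One way to read the two questions is sketched below: blocks unlikely to be needed again soon go to the front of the LRU list, where they are reused first, and blocks essential to consistency are written to disk at once. This is an illustrative interpretation, not the scheme of any particular system; the class, the parameter names, and the write_block callback are all hypothetical.

```python
from collections import OrderedDict

class BufferCache:
    """LRU list: the front is evicted first, the rear is evicted last."""

    def __init__(self, capacity, write_block):
        self.capacity = capacity
        self.write_block = write_block   # hypothetical disk-write callback
        self.blocks = OrderedDict()      # block number -> (data, dirty)

    def put(self, number, data, needed_soon=True, critical=False):
        if critical:
            self.write_block(number, data)   # essential block: write at once
            dirty = False
        else:
            dirty = True                     # ordinary block: write on eviction
        self.blocks.pop(number, None)
        self.blocks[number] = (data, dirty)  # rear of the list by default
        if not needed_soon:
            self.blocks.move_to_end(number, last=False)  # front: reused first
        self._evict()

    def _evict(self):
        while len(self.blocks) > self.capacity:
            number, (data, dirty) = self.blocks.popitem(last=False)
            if dirty:
                self.write_block(number, data)

written = []
cache = BufferCache(2, lambda n, d: written.append(n))
cache.put(1, b"data")
cache.put(7, b"inode", needed_soon=False, critical=True)
cache.put(2, b"more")
print(list(cache.blocks), written)   # [1, 2] [7]
```

In the run above, the i-node block 7 reaches the disk immediately and is also the first buffer to be reused, while the ordinary data blocks stay cached.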



Reducing Disk Arm Motion

Figure 4-29. (a) I-nodes placed at the start of the disk. (b) Disk divided into cylinder groups, each with its own blocks and i-nodes.


The ISO 9660 File System

Figure 4-30. The ISO 9660 directory entry.



Rock Ridge Extensions

Rock Ridge extension fields:


1. PX - POSIX attributes.
2. PN - Major and minor device numbers.
3. SL - Symbolic link.
4. NM - Alternative name.
5. CL - Child location.
6. PL - Parent location.
7. RE - Relocation.
8. TF - Time stamps.
Joliet Extensions

Joliet extension fields:


1. Long file names.
2. Unicode character set.
3. Directory nesting deeper than eight levels.
4. Directory names with extensions



The MS-DOS File System (1)

Figure 4-31. The MS-DOS directory entry.



The MS-DOS File System (2)

Figure 4-32. Maximum partition size for different block sizes. The
empty boxes represent forbidden combinations.
The UNIX V7 File System (1)

Figure 4-33. A UNIX V7 directory entry.



The UNIX V7 File System (2)

Figure 4-34. A UNIX i-node.


The UNIX V7 File System (3)

Figure 4-35. The steps in looking up /usr/ast/mbox.
