
CHAPTER SIX

FILES MANAGEMENT
Files
Directories
Secondary storage management

File
A file is a collection of similar records.
The file is treated as a single entity by users and applications and may be referred to by name.
Files have unique file names and may be created and deleted.
Access control restrictions usually apply at the file level.
A file is a container for a collection of information.
Related files can be stored in a directory.
A directory is a means of organizing and grouping files.
A directory can store both files and subdirectories.
A subdirectory is a directory that is located within another directory.
The directory at the topmost level of a disk is called the root directory.

File management
The way in which an operating system organizes, structures, names, accesses, uses, protects, and implements files is called a file management system.
The file manager provides a protection mechanism that allows users to administer how processes executing on behalf of different users can access the information in a file.
File protection is a fundamental property of files because it allows different people to store their information on a shared computer.

File Management
All computer applications need to store and retrieve information.
While a process is running, it can store a limited amount of information within its own address space.
1st problem: the storage capacity is restricted to the size of the virtual address space.
2nd problem: when the process terminates, the information kept in its address space is lost.
3rd problem: it is frequently necessary for multiple processes to access (parts of) the information at the same time.

FILE SYSTEM
Requirements for long-term file storage:
  it must be possible to store large amounts of data
  stored information must survive the termination of the process using it
  multiple processes must be able to access the information concurrently
Files are managed by the operating system. How they are structured, named, accessed, used, protected, and implemented are major topics in operating system design.

File Names
Files are an abstraction mechanism.
Many operating systems support two-part file names.
The file extension usually indicates something about the file, e.g. prog.c is C source code.
Unix
  long file names (255 bytes)
  names may include special characters
  case sensitive (myshell.c != Myshell.c)
MS-DOS
  file name = 8 chars, ".", 3-char extension
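The two-part naming rules can be tried out with a short Python sketch. The helper names split_name and fits_8_3 are made up for this illustration, and fits_8_3 checks only the 8 + 3 length limits, not the full set of MS-DOS naming rules.

```python
import os.path

def split_name(file_name):
    """Split a two-part file name into (base, extension without the dot)."""
    base, ext = os.path.splitext(file_name)
    return base, ext.lstrip(".")

def fits_8_3(file_name):
    """Length check only: base of 1 to 8 chars, extension of at most 3 chars."""
    base, ext = split_name(file_name)
    return 1 <= len(base) <= 8 and len(ext) <= 3

print(split_name("prog.c"))            # base name and extension
print("myshell.c" == "Myshell.c")      # Unix names are compared case-sensitively
print(fits_8_3("PROG.C"), fits_8_3("longfilename.text"))
```

On a case-sensitive file system such as those typically used on Unix, myshell.c and Myshell.c are two different files, which the plain string comparison mirrors.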

File Structure
Unstructured:
The OS does not know or care what is in the file.
All it sees are bytes.
Any meaning must be imposed by user-level programs.
Both UNIX and Windows use this approach

File Types
Regular files: contain user information.
Directories: system files for maintaining the structure of the file system.
Character special files: related to input/output and used to model serial I/O devices such as terminals, printers, and networks.
Block special files: used to model disks.

Regular Files
ASCII files
ASCII files consist of lines of text
They can be displayed and printed as is
They can be edited with any text editor
It is easy to connect the output of one program to the input of another

Binary files
They are not ASCII files
Examples: executables, archives, etc.
They have some internal structure known to the programs that use them

[Figure: Regular Files, showing an executable file and an archive]

File Access
Sequential access
  read all bytes/records in order, starting from the beginning
  cannot jump around, but can rewind or back up
  convenient when the medium was magnetic tape
Random access
  bytes/records can be read in any order
  essential for database systems
  a read can be done in two ways:
    move the file marker (seek), then read, or
    read and search
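The difference between the two access methods can be sketched in Python on a scratch file. The record layout (ten fixed-size 4-byte records) is an assumption chosen only to make the positions easy to compute.

```python
import os
import tempfile

# Build a small scratch file of ten fixed-size records: b"R000" .. b"R009".
RECORD_SIZE = 4
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    for i in range(10):
        f.write(b"R%03d" % i)

with open(path, "rb") as f:
    # Sequential access: read records in order from the beginning.
    first = f.read(RECORD_SIZE)
    second = f.read(RECORD_SIZE)

    # Random access: move the file marker (seek), then read.
    f.seek(7 * RECORD_SIZE)
    eighth = f.read(RECORD_SIZE)

    # Rewind: move the file marker back to the start of the file.
    f.seek(0)
    again = f.read(RECORD_SIZE)

os.remove(path)
print(first, second, eighth, again)
```

With fixed-size records, the position of record k is simply k times the record size, which is what makes seek-then-read cheap.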

File Operations
Any file system provides not only a means to store data organized as files, but also a collection of functions that can be performed on files.
Typical operations include the following:
Create: A new file is defined and positioned within the structure of files.
Delete: A file is removed from the file structure and destroyed.
Open: An existing file is declared to be "opened" by a process, allowing the process to perform functions on the file.
Close: The file is closed with respect to a process, so that the process may no longer perform functions on the file until it opens the file again.
Read: A process reads all or a portion of the data in a file.
Write: A process updates a file, either by adding new data that expands the size of the file or by changing the values of existing data items in the file.
Some other operations: Append, Seek, Get Attributes, Set Attributes, Rename, Lock
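As a rough illustration, most of the operations listed above map onto Python's standard library as follows. The file names notes.txt and renamed.txt are hypothetical, and everything happens inside a temporary directory.

```python
import os
import tempfile

workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "notes.txt")

# Create + Open + Write: mode "w" creates the file and opens it for writing.
with open(path, "w") as f:
    f.write("hello\n")
# Leaving the "with" block closes the file (Close).

# Append: add new data at the end, which expands the file.
with open(path, "a") as f:
    f.write("world\n")

# Read: read all of the data in the file.
with open(path) as f:
    contents = f.read()

# Rename, then Delete.
new_path = os.path.join(workdir, "renamed.txt")
os.rename(path, new_path)
os.remove(new_path)
os.rmdir(workdir)

print(repr(contents))
```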

File Attributes
The attributes of a file are: name, type, location, size, protection, time, date, and user identification.
Name: a file is named for the convenience of the user and is referred to by its name
Type: there are many types of files; the type is usually indicated by the file extension
E.g.:
  .exe  executable file
  .obj  object file
  .src  source file
Location: a pointer to a device and to the location of the file on that device
Size: the current size of the file (in bytes or blocks)
Protection: specifies who may access the file and who may not
Time: the time of creation
Date: the date on which the file was created
User identification: useful for protection, security, and usage monitoring
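Several of these attributes can be inspected from a program. The sketch below uses Python's os.stat on a scratch file; it reports the last-modification time rather than the creation time, because creation time is not available portably, and the dictionary keys are names chosen for this example.

```python
import os
import stat
import tempfile
import time

# Create a 5-byte scratch file with a ".src" extension.
fd, path = tempfile.mkstemp(suffix=".src")
with os.fdopen(fd, "w") as f:
    f.write("12345")

info = os.stat(path)
attributes = {
    "name": os.path.basename(path),
    "type": os.path.splitext(path)[1],          # taken from the extension
    "size_bytes": info.st_size,
    "protection": stat.filemode(info.st_mode),  # e.g. a string like -rw-------
    "modified": time.ctime(info.st_mtime),
    "user_id": info.st_uid,
}
os.remove(path)

for key, value in attributes.items():
    print(key, "=", value)
```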

[Figure: File Attributes]


Directories
Each directory is a collection of entries, one entry per file.
Among the attributes stored in an entry is the address on disk where the file starts.
Opening a file:
  search the directory for the corresponding file entry
  read the attributes and disk addresses into main memory

Directory system
Single level
Two level
Hierarchical

Path Names
Absolute path name:
  Unix: /usr/home/desu/teaching/lec22.ppt
  Windows: C:\Users\tedo\Desktop\Chapter6 new
Relative path name:
  uses the current directory as a starting point
  . (dot) for this directory
  .. (dotdot) for the parent of this directory
  if the current directory is /usr/home/desu/research:
    ../teaching/lec22.ppt.txt
    is the same file as
    /usr/home/desu/teaching/lec22.ppt.txt
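How a relative Unix path name is turned into an absolute one can be sketched with Python's posixpath module. The helper name resolve and the file notes.txt are invented for the example, and the resolution is purely textual: it does not touch the disk or follow links.

```python
import posixpath

def resolve(current_dir, relative_path):
    """Join a relative path to the current directory and collapse . and .. parts."""
    return posixpath.normpath(posixpath.join(current_dir, relative_path))

cwd = "/usr/home/desu/research"
print(resolve(cwd, "../teaching/lec22.ppt.txt"))
print(resolve(cwd, "./notes.txt"))   # notes.txt is a hypothetical file name
```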

[Figure: A Unix Directory Tree]

Directory Operations
Create
Delete
Opendir
Closedir
Readdir
Rename
Link
Unlink
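The directory operations above have close counterparts in Python's os module, sketched here in a temporary directory. The names docs, papers, a.txt, and b.txt are hypothetical, and the hard-link step assumes a file system that supports hard links.

```python
import os
import tempfile

root = tempfile.mkdtemp()
docs = os.path.join(root, "docs")

os.mkdir(docs)                                   # Create
with open(os.path.join(docs, "a.txt"), "w") as f:
    f.write("data")

# Opendir / Readdir / Closedir: os.scandir opens the directory, iterating
# reads its entries, and leaving the "with" block closes it.
with os.scandir(docs) as entries:
    names = sorted(entry.name for entry in entries)

papers = os.path.join(root, "papers")
os.rename(docs, papers)                          # Rename

# Link: a second directory entry (hard link) for the same file.
original = os.path.join(papers, "a.txt")
alias = os.path.join(papers, "b.txt")
os.link(original, alias)
link_count = os.stat(original).st_nlink

# Unlink: remove directory entries; the file disappears with its last link.
os.unlink(alias)
os.unlink(original)
os.rmdir(papers)                                 # Delete (directory must be empty)
os.rmdir(root)

print(names, link_count)
```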


Secondary Storage Management
Most disks can be divided up into one or more partitions, with independent file systems on each partition.
Sector 0 of the disk is called the MBR (Master Boot Record) and is used to boot the computer.
When the computer is booted, the BIOS reads in and executes the MBR.
MBR program:
  locate the active partition
  read in its first block, called the boot block
  execute it
Every partition starts with a boot block, even if it does not contain a bootable operating system, since it might contain one in the future.
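The "locate the active partition" step can be sketched in Python on a synthetic 512-byte sector. The slides do not give the on-disk layout, so the sketch assumes the classic PC MBR format: a partition table of four 16-byte entries at byte offset 446, a status byte of 0x80 marking the active partition, the starting sector in bytes 8 to 11 of each entry, and the signature 0x55 0xAA in the last two bytes. The sector numbers 63 and 2048 are made-up test values.

```python
import struct

SECTOR_SIZE = 512
TABLE_OFFSET = 446      # assumed: the partition table follows the boot code
ENTRY_SIZE = 16
ACTIVE_FLAG = 0x80

def find_active_partition(mbr):
    """Return (index, first_sector) of the active partition, or None."""
    if len(mbr) != SECTOR_SIZE or mbr[510:512] != b"\x55\xaa":
        raise ValueError("not a valid MBR sector")
    for index in range(4):
        start = TABLE_OFFSET + index * ENTRY_SIZE
        entry = mbr[start:start + ENTRY_SIZE]
        # Bytes 8..11 of an entry: first sector of the partition (little-endian).
        first_sector = struct.unpack_from("<I", entry, 8)[0]
        if entry[0] == ACTIVE_FLAG:
            return index, first_sector
    return None

def put_entry(sector, index, status, first_sector):
    """Fill in one partition table entry of a sector under construction."""
    offset = TABLE_OFFSET + index * ENTRY_SIZE
    sector[offset] = status
    struct.pack_into("<I", sector, offset + 8, first_sector)

# Synthetic MBR: partition 0 inactive at sector 63, partition 1 active at 2048.
sector0 = bytearray(SECTOR_SIZE)
put_entry(sector0, 0, 0x00, 63)
put_entry(sector0, 1, ACTIVE_FLAG, 2048)
sector0[510:512] = b"\x55\xaa"

print(find_active_partition(bytes(sector0)))
```

A real MBR program would next read the sector at the returned address (the boot block) and jump to it; that part cannot be shown in a user-level script.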

[Figure: Secondary Storage Management]

[Figure: Contiguous Allocation]

Contiguous Allocation
Advantages:
  it is simple to implement
  read performance is excellent, because the entire file can be read from the disk in a single operation
Disadvantages:
  disk fragmentation
Widely used on CD-R, DVD-R, and other write-once media
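The fragmentation problem can be seen in a toy model. The sketch below assumes a first-fit placement policy and a 10-block disk, neither of which comes from the slides; the class name ContiguousDisk is made up.

```python
class ContiguousDisk:
    """Toy disk where each file occupies one run of consecutive blocks."""

    def __init__(self, num_blocks):
        self.blocks = [None] * num_blocks      # None marks a free block
        self.directory = {}                    # name -> (start, length)

    def create(self, name, length):
        """First fit: place the file in the first hole that is large enough."""
        run = 0
        for i, owner in enumerate(self.blocks):
            run = run + 1 if owner is None else 0
            if run == length:
                start = i - length + 1
                self.blocks[start:i + 1] = [name] * length
                self.directory[name] = (start, length)
                return start
        raise OSError("no hole of %d contiguous blocks" % length)

    def delete(self, name):
        start, length = self.directory.pop(name)
        self.blocks[start:start + length] = [None] * length

disk = ContiguousDisk(10)
disk.create("A", 3)   # blocks 0-2
disk.create("B", 4)   # blocks 3-6
disk.create("C", 3)   # blocks 7-9
disk.delete("A")
disk.delete("C")
free = disk.blocks.count(None)
try:
    disk.create("D", 5)       # 6 blocks are free, but in two holes of 3
    placed = True
except OSError:
    placed = False
print(free, placed)
```

Six blocks are free, yet a 5-block file cannot be stored because no single hole is big enough. On write-once media files are never deleted, so such holes never appear.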

[Figure: Linked List Allocation]

Linked List Allocation
Advantages:
  no disk fragmentation
  only a little internal fragmentation, in the last block
  it is sufficient for the directory entry to store the address of the first block
  efficient for sequential file reading
Disadvantages:
  the amount of data stored in a block is no longer a power of two, because the pointer to the next block takes up a few bytes
  random access is extremely slow: the chain must be read and searched block by block to reach a given block
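Both disadvantages show up in a small simulation. The sketch assumes 512-byte blocks with a 4-byte next-block pointer at the start of each block, uses a dictionary as a stand-in for the disk, and picks arbitrary block numbers; none of these details come from the slides.

```python
import struct

BLOCK_SIZE = 512
POINTER_SIZE = 4                              # assumed size of the next-block pointer
DATA_PER_BLOCK = BLOCK_SIZE - POINTER_SIZE    # 508: no longer a power of two
END_OF_FILE = 0xFFFFFFFF

disk = {}      # block number -> 512 raw bytes (stands in for the disk)
reads = 0      # counts simulated disk reads

def write_file(data, block_numbers):
    """Store data in the given blocks, chaining each block to the next."""
    chunks = [data[i:i + DATA_PER_BLOCK] for i in range(0, len(data), DATA_PER_BLOCK)]
    for k, chunk in enumerate(chunks):
        nxt = block_numbers[k + 1] if k + 1 < len(chunks) else END_OF_FILE
        disk[block_numbers[k]] = struct.pack("<I", nxt) + chunk.ljust(DATA_PER_BLOCK, b"\0")
    return block_numbers[0]

def read_block(first_block, n):
    """Return the data of the n-th block (0-based) by walking the chain."""
    global reads
    current = first_block
    for _ in range(n):
        reads += 1                            # must read a block just to find the next
        current = struct.unpack_from("<I", disk[current])[0]
    reads += 1
    return disk[current][POINTER_SIZE:]

first = write_file(b"x" * 2000, [4, 7, 2, 10])   # scattered, hypothetical block numbers
data = read_block(first, 3)
print(DATA_PER_BLOCK, reads)
```

Reaching the fourth block of the file costs four disk reads, because every earlier block must be fetched to learn where the next one is.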

Linked List Allocation Using a Table in Memory
Both disadvantages of linked list allocation can be eliminated by taking the pointer word from each disk block and putting it in a table in memory.
Such a table in main memory is called a FAT (File Allocation Table).
Advantages:
  the entire block is available for data
  random access is much easier: the chain is entirely in memory, so it can be followed without making any disk references
  it is sufficient for the directory entry to keep the starting block number
Disadvantages:
  the entire table must be in memory all the time to make it work
  for a large disk, the table occupies a great deal of memory
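A minimal sketch of the table-in-memory idea in Python follows. The 16-entry table, the two files, and their block numbers are hypothetical; the point is only that a chain can be followed with list lookups instead of disk reads.

```python
END_OF_CHAIN = -1

# fat[b] holds the number of the block that follows block b in its file.
fat = [None] * 16

def link_blocks(chain):
    """Record a file's blocks in the table, in order."""
    for current, following in zip(chain, chain[1:]):
        fat[current] = following
    fat[chain[-1]] = END_OF_CHAIN

link_blocks([4, 7, 2, 10, 12])    # hypothetical file A
link_blocks([6, 3, 11, 14])       # hypothetical file B
directory = {"A": 4, "B": 6}      # a directory entry keeps only the first block

def blocks_of(name):
    """List all blocks of a file by following the chain in memory."""
    chain = []
    block = directory[name]
    while block != END_OF_CHAIN:
        chain.append(block)
        block = fat[block]
    return chain

def nth_block(name, n):
    """Random access: find the n-th block (0-based) without touching the disk."""
    block = directory[name]
    for _ in range(n):
        block = fat[block]
    return block

print(blocks_of("A"), nth_block("B", 2))
```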

i-nodes
Another method for keeping track of which blocks belong to which file is to associate with each file a data structure called an i-node (index-node).
Advantage:
  the i-node need only be in memory when the corresponding file is open
Disadvantage:
  difficult to accommodate growing files, since an i-node has room for only a fixed number of disk addresses
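The advantage and the disadvantage can both be sketched in a few lines of Python. The model is deliberately simplified and rests on assumptions: each i-node has 8 address slots, two dictionaries stand in for the on-disk i-nodes and the in-memory table, and the file names and block numbers are invented. Real systems extend the i-node with further address blocks, which is part of the reading assignment and is not modelled here.

```python
from dataclasses import dataclass, field

ADDRESSES_PER_INODE = 8     # assumed fixed number of block-address slots

@dataclass
class INode:
    size: int = 0
    blocks: list = field(default_factory=list)

    def add_block(self, block_number):
        if len(self.blocks) >= ADDRESSES_PER_INODE:
            raise OSError("i-node is full")
        self.blocks.append(block_number)

on_disk = {"report": INode(), "log": INode()}   # stands in for i-nodes stored on disk
in_memory = {}                                   # i-nodes of open files only

def open_file(name):
    """Bring the file's i-node into memory."""
    in_memory[name] = on_disk[name]
    return in_memory[name]

def close_file(name):
    """The i-node no longer needs to be kept in memory."""
    del in_memory[name]

inode = open_file("log")
for block_number in range(100, 100 + ADDRESSES_PER_INODE):
    inode.add_block(block_number)
try:
    inode.add_block(200)        # the file tries to grow past the last slot
    grew = True
except OSError:
    grew = False
loaded_while_open = sorted(in_memory)
close_file("log")
print(loaded_while_open, grew, len(in_memory))
```

Only the i-node of the open file is held in memory, in contrast to a file allocation table, which must be resident for the whole disk.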

[Figure: i-nodes]

Teowdos A

READING ASSIGNMENT
Directory disk allocation
Inode file structure


Disk Space Management
Two general strategies are possible for storing an n-byte file:
  n consecutive bytes of disk space are allocated
  the file is split up into a number of (not necessarily contiguous) blocks
The same tradeoff is present in memory management systems, between pure segmentation and paging.
Nearly all file systems chop files up into fixed-size blocks that need not be adjacent.

Block Size
With a large allocation unit, such as a cylinder:
  every file, even a 1-byte file, ties up an entire cylinder
With a small allocation unit:
  each file will consist of many blocks
  reading each block normally requires a seek and a rotational delay
  reading a file consisting of many small blocks will be slow
Small blocks are bad for performance but good for disk space utilization.
A compromise size is needed.
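The trade-off can be made concrete with a little arithmetic. The sketch below uses a hypothetical 10,000-byte file and three example block sizes; the function names are made up.

```python
def blocks_needed(file_size, block_size):
    """Number of blocks a file occupies (at least one, rounding up)."""
    return max(1, (file_size + block_size - 1) // block_size)

def wasted_bytes(file_size, block_size):
    """Allocated space that holds no file data (internal fragmentation)."""
    return blocks_needed(file_size, block_size) * block_size - file_size

FILE_SIZE = 10_000        # hypothetical 10,000-byte file
for block_size in (512, 4096, 65536):
    print(block_size, blocks_needed(FILE_SIZE, block_size), wasted_bytes(FILE_SIZE, block_size))
```

The smallest block size wastes the least space but splits the file into the most blocks, each of which may cost a seek and a rotational delay to read.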

Keeping Track of Free Blocks
Once a block size has been chosen, the next issue is how to keep track of free blocks.
Two methods are widely used:
  a linked list of disk blocks, with each block holding as many free disk block numbers as will fit
  a bitmap: a disk with n blocks requires a bitmap with n bits
    free blocks are represented by 1s in the map, allocated blocks by 0s (or vice versa)
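A minimal sketch of the bitmap method in Python, using the convention 1 = free and 0 = allocated. The class name, the 20-block disk, and the lowest-numbered-first allocation policy are assumptions made for the example.

```python
class FreeBlockBitmap:
    """One bit per disk block; here 1 = free, 0 = allocated."""

    def __init__(self, num_blocks):
        self.num_blocks = num_blocks
        self.bits = bytearray([0xFF] * ((num_blocks + 7) // 8))

    def is_free(self, block):
        return bool(self.bits[block // 8] & (1 << (block % 8)))

    def allocate(self):
        """Hand out the lowest-numbered free block and clear its bit."""
        for block in range(self.num_blocks):
            if self.is_free(block):
                self.bits[block // 8] &= ~(1 << (block % 8)) & 0xFF
                return block
        raise OSError("disk full")

    def release(self, block):
        """Mark a block as free again by setting its bit."""
        self.bits[block // 8] |= 1 << (block % 8)

bitmap = FreeBlockBitmap(20)
first, second, third = bitmap.allocate(), bitmap.allocate(), bitmap.allocate()
bitmap.release(second)
reused = bitmap.allocate()
print(first, second, third, reused, len(bitmap.bits))
```

A disk of 20 blocks needs 20 bits, which are rounded up here to 3 bytes.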

The MS-DOS file system
File names: 8+3 characters (UPPERCASE)
No ownership: all files are accessible to the user
Maintained via a File Allocation Table (FAT)
Attributes:
  read-only
  hidden
  system
  archive

The MS-DOS directory entry
Time is inaccurate: a 2-byte field has only 65,536 values, but there are 86,400 seconds per day, so the time is stored to the nearest 2 seconds
Date uses 7 bits for the year, counted from 1980, so it runs out in 2107
First-block-number: an index into the FAT, which has 64K entries
10 bytes (of 32) are unused!
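The limits of the time and date fields follow from how the bits are packed. The sketch below assumes the usual MS-DOS packing, which the slide does not spell out: the time word holds 5 bits of hours, 6 bits of minutes, and 5 bits of seconds divided by two, and the date word holds 7 bits of years since 1980, 4 bits of month, and 5 bits of day.

```python
def decode_time(value):
    """16-bit time: 5 bits hours, 6 bits minutes, 5 bits seconds/2."""
    return (value >> 11) & 0x1F, (value >> 5) & 0x3F, (value & 0x1F) * 2

def decode_date(value):
    """16-bit date: 7 bits years since 1980, 4 bits month, 5 bits day."""
    return 1980 + ((value >> 9) & 0x7F), (value >> 5) & 0x0F, value & 0x1F

def encode_time(hours, minutes, seconds):
    """Pack a time of day; odd seconds are lost because only seconds/2 fits."""
    return (hours << 11) | (minutes << 5) | (seconds // 2)

print(decode_time(encode_time(13, 45, 31)))   # the odd second is rounded down
print(decode_date(0xFFFF)[0])                 # largest year that can be stored
```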

FAT-12/16
A block (also called a cluster) is a multiple of 512 bytes
FAT-12: 12-bit block addresses, 512-byte blocks
  largest partition: 4096 x 512 bytes = 2MB, OK for floppy disks
  for hard disks, MS allowed blocks of 1KB, 2KB, and 4KB; largest partition: 16MB
FAT-16: switch to 16-bit addresses, block size up to 32KB
  largest partition: 2GB
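The partition limits quoted above are simply the number of addressable blocks times the block size, which can be checked with a few lines of Python. The function name max_partition_bytes is made up for this check.

```python
KB, MB, GB = 2**10, 2**20, 2**30

def max_partition_bytes(address_bits, block_size):
    """Largest partition = number of addressable blocks x block size."""
    return 2**address_bits * block_size

print(max_partition_bytes(12, 512) // MB, "MB")       # FAT-12, 512-byte blocks
print(max_partition_bytes(12, 4 * KB) // MB, "MB")    # FAT-12, 4KB blocks
print(max_partition_bytes(16, 32 * KB) // GB, "GB")   # FAT-16, 32KB blocks
```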

FAT-32
Win95 2nd Edition / Win98 / Win ME
Really FAT-28: 28-bit block addresses
Potentially 2^28 x 2^15 bytes per partition, but in reality only 2^41 bytes = 2TB
The FAT itself now occupies a large amount of RAM:
  for a 2GB disk with 4KB blocks, there are 512K blocks
  the FAT uses 2MB of RAM
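The RAM figure can be reproduced with a short calculation, assuming each FAT-32 table entry occupies 4 bytes (the slide implies this but does not state it). The function name fat_size is made up.

```python
KB, MB, GB, TB = 2**10, 2**20, 2**30, 2**40

def fat_size(disk_bytes, block_bytes, entry_bytes=4):
    """Return (number of table entries, table size in bytes)."""
    entries = disk_bytes // block_bytes
    return entries, entries * entry_bytes

entries, table_bytes = fat_size(2 * GB, 4 * KB)
print(entries // KB, "K blocks,", table_bytes // MB, "MB of RAM")

# Theoretical limit with 28-bit addresses and 32KB (2^15-byte) blocks.
print((2**28 * 2**15) // TB, "TB potential,", 2**41 // TB, "TB in practice")
```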

NTFS
Designed from scratch
Not compatible with Win95 / Win98
Usually 4KB blocks (clusters)
Blocks are referred to by 64-bit numbers
Main data structure: Master File Table (MFT)
Each MFT entry describes a file or directory
MFT entry = 1KB
The MFT is itself a file and can be anywhere on disk

Block runs
Idea: the blocks of a file are often sequential on disk
A run is a set of consecutive blocks that belong to the same file
No need to keep a pointer to each block: it is enough to keep the start and length of each run
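The run idea amounts to run-length encoding of a file's block list. A minimal Python sketch follows; the block numbers are hypothetical and the function names are made up, and this is not a description of the actual NTFS on-disk encoding.

```python
def to_runs(block_numbers):
    """Compress an ordered block list into (start, length) runs."""
    runs = []
    for block in block_numbers:
        if runs and runs[-1][0] + runs[-1][1] == block:
            # The block directly follows the last run, so extend that run.
            runs[-1] = (runs[-1][0], runs[-1][1] + 1)
        else:
            runs.append((block, 1))
    return runs

def from_runs(runs):
    """Expand (start, length) runs back into the full block list."""
    return [start + i for start, length in runs for i in range(length)]

blocks = [20, 21, 22, 23, 64, 65, 80, 81, 82]   # hypothetical 9-block file
runs = to_runs(blocks)
print(runs)
```

Nine block numbers shrink to three (start, length) pairs, and from_runs recovers the original list when individual blocks are needed.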

[Figure: An MFT record for a 3-run, 9-block file]

THE END