Professional Documents
Culture Documents
Introduction to Systems
Lecture 15
Filesystems
Software
Result
Media Time
Queue
(Seek+Rot+Xfer)
(Device Driver)
Highest Bandwidth:
transfer large group of blocks sequentially from one track
10/17/07 Joseph CS194-3/16x UCB Fall 2007 Lec 15.2
Review: Building a File System
File System: Layer of OS that transforms block
interface of disks (or other block devices) into Files,
Directories, etc.
File System Components
Disk Management: collecting disk blocks into files
Naming: Interface to find files by name, not by blocks
Protection: Layers to keep data secure
Reliability/Durability: Keeping of files durable despite
crashes, media failures, attacks, etc
How do users access files?
Sequential Access: bytes read in order
Random Access: read/write element from middle
Content-based Access
What are file sizes?
Most files are small, with a few big files
Large files use up most of the disk space and BW
A few enormous files equivalent to a huge # of small files
File Systems
Structure, Naming, Directories
DBMS storage management
Data Durability
File Header
Null
Maintains two
copies of FAT
for durability
Project 2 posted
Online E-Book Store
Open source Hadoop running in a virtual machine
Using MapReduce to compute line indicies, word
counts, inverted index, search
Track Buffer
(Holds complete track)
Solution1: Skip sector positioning (interleaving)
Place the blocks from one file on every other block of a
track: give time for processing to overlap rotation
Solution2: Read ahead: read next block right after first,
even if application hasnt asked for it yet.
This can be done either by OS (read ahead)
By disk itself (track buffers). Many disk controllers have
internal RAM that allows them to read a complete track
Important Aside: Modern disks+controllers do many
complex things under the covers
Track buffers, elevator algorithms, bad block filtering
10/17/07 Joseph CS194-3/16x UCB Fall 2007 Lec 15.18
DBMS Structure: Files of Records
Blocks are the interface for I/O, but
A DBMS operates on records, and files of records
FILE: A collection of pages, each containing a
collection of records, and supporting:
Insert/delete/modify record
Fetch a particular record (specified using record id)
Scan all records (possibly with some conditions on the
records to be retrieved)
Unordered (Heap) Files
Simplest file structure contains records in no
particular order
As file grows and shrinks, (de-)allocate disk pages
To support record level operations, we must:
Keep track of the pages in a file, free space on pages,
and the records on a page
Two simple alternatives list or page directory
10/17/07 Joseph CS194-3/16x UCB Fall 2007 Lec 15.19
Heap File Implemented as a List
Data
Header Page 1
Page
Data
Page 2
Data
DIRECTORY Page N
single board
host array disk
CPU
adapter controller controller
single board
disk
Some systems duplicate controller
all hardware, namely
controllers, busses, etc. often piggy-backed
in small format devices
recovery
group
Each disk is fully duplicated onto its "shadow
For high I/O rate, high availability environments
Most expensive solution: 100% capacity overhead
Bandwidth sacrificed on write:
Logical write = two physical writes
Highest bandwidth when disk heads and rotation fully
synchronized (hard to do exactly)
Reads may be optimized
Can have two independent reads to same data
Recovery:
Disk failure replace disk and copy data to new disk
Hot Spare: idle disk already attached to system to be
used for immediate replacement
10/17/07 Joseph CS194-3/16x UCB Fall 2007 Lec 15.34
RAID 5+: High I/O Rate Parity
Stripe
Data stripped across Unit
multiple disks
D0 D1 D2 D3 P0
Successive blocks
stored on successive Increasing
(non-parity) disks D4 D5 D6 P1 D7 Logical
Disk
Increased bandwidth Addresses
over single disk D8 D9 P2 D10 D11
Parity block (in green)
constructed by XORing D12 P3 D13 D14 D15
data bocks in stripe
P0=D0D1D2D3 P4 D16 D17 D18 D19
Can destroy any one
disk and still
reconstruct data D20 D21 D22 D23 P5
Suppose D3 fails,
then can reconstruct: Disk 1 Disk 2 Disk 3 Disk 4 Disk 5
D3=D0D1D2P0
Later in term: talk about spreading information widely
across internet for durability.
10/17/07 Joseph CS194-3/16x UCB Fall 2007 Lec 15.35
Summary
File System transforms blocks into Files and Directories
Maximize sequential access, allow efficient random access
File (and directory) defined by header (inode)
Cray DEMOS: optimization for sequential access
Inode holds set of disk ranges, similar to segmentation
Multilevel Indexed Scheme (4.1, 4.2 BSD)
Inode holds file info, direct/indirect/doubly indirect pointers
Optimizations for sequential access and rotation
DBMS file layer keeps track of pages in a file, and
supports abstraction of a collection of records
Pages with free space identified using linked list or directory
structure (similar to how we track pages in files)
Naming: act of translating from user-visible names to
actual system resources (such as files)
Important system properties
Availability: how often is the resource available?
Durability: how well is data preserved against faults?
Reliability: how often is resource performing correctly?
RAID RAID1 (mirroring), RAID5 (parity block)
10/17/07 Joseph CS194-3/16x UCB Fall 2007 Lec 15.36