Physical database architecture

Training Division
New Delhi
Pages
• The fundamental unit of data storage in Microsoft SQL Server is the page.
• In SQL Server version 7.0 the size of a page is 8 KB, which means there are 128 pages per megabyte.
• Each page starts with a 96-byte header used to store information such as:
  • the type of page,
  • the amount of free space on the page,
  • the object ID of the object owning the page.
• The body of the page is 8096 bytes.

Figure: an 8K page consisting of (A) a 96-byte page header and (B) an 8096-byte body.
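The page arithmetic above can be checked with a few lines of Python. This is only a sketch of the numbers quoted on the slide; the constant and function names are illustrative and are not SQL Server identifiers.

```python
# Page geometry for SQL Server 7.0 as stated on the slide.
PAGE_SIZE = 8 * 1024                 # 8 KB page
HEADER_SIZE = 96                     # page header
BODY_SIZE = PAGE_SIZE - HEADER_SIZE  # what is left for the body

PAGES_PER_MB = (1024 * 1024) // PAGE_SIZE


def pages_needed(n_bytes: int) -> int:
    """Whole pages needed to cover n_bytes of file space (ceiling division)."""
    return -(-n_bytes // PAGE_SIZE)


print(BODY_SIZE)     # 8096
print(PAGES_PER_MB)  # 128
```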
Types of Pages
Page type               Contents
Data                    Data rows with all data except text, ntext, and image data
Index                   Index entries
Text/Image              text, ntext, and image data
Global Allocation Map   Information about allocated extents
Page Free Space         Information about free space available on pages
Index Allocation Map    Information about extents used by a table or index
Data Pages

• Data pages contain all the data in data rows except text, ntext, and image data, which are stored in separate pages.
• Data rows are placed serially on the page starting immediately after the header.
• Rows cannot span pages in SQL Server.
• In SQL Server 7.0, the maximum amount of data contained in a single row is 8060 bytes, not including text, ntext, and image data.
Row offset table

• Starts at the end of the page and determines the location of a row within a page.
• Contains one entry for each row on the page; each entry records how far the first byte of the row is from the start of the page.
• The entries in the row offset table are in reverse sequence from the sequence of the rows on the page.
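The layout described above can be illustrated with a toy page in Python. This is a minimal sketch, not SQL Server's actual on-disk format: the `Page` class, its method names and the 2-byte entry size are assumptions made for the illustration.

```python
import struct

PAGE_SIZE = 8192
HEADER_SIZE = 96
SLOT_SIZE = 2  # assumption: each offset entry is a 2-byte unsigned integer


class Page:
    """Toy page: rows grow forward from the header, offsets grow backward from the end."""

    def __init__(self) -> None:
        self.buf = bytearray(PAGE_SIZE)
        self.next_free = HEADER_SIZE  # first byte after the header
        self.row_count = 0

    def insert(self, row: bytes) -> int:
        """Place the row after the previous one and record its offset at the page end."""
        slot_pos = PAGE_SIZE - SLOT_SIZE * (self.row_count + 1)
        if self.next_free + len(row) > slot_pos:
            raise ValueError("row does not fit on this page")
        offset = self.next_free
        self.buf[offset:offset + len(row)] = row
        struct.pack_into("<H", self.buf, slot_pos, offset)
        self.next_free += len(row)
        self.row_count += 1
        return offset

    def row_offset(self, row_number: int) -> int:
        """Read the offset entry for the given row (0-based)."""
        slot_pos = PAGE_SIZE - SLOT_SIZE * (row_number + 1)
        return struct.unpack_from("<H", self.buf, slot_pos)[0]


page = Page()
for name in (b"Name1", b"Name2", b"Name3"):
    page.insert(name)

# Offsets in row order, then the raw entries as they sit at the end of the page.
print([page.row_offset(i) for i in range(3)])                          # [96, 101, 106]
print(struct.unpack_from("<3H", page.buf, PAGE_SIZE - 3 * SLOT_SIZE))  # (106, 101, 96)
```

Reading the last bytes of the page front to back yields the offsets in the opposite order to the rows, which is the "reverse sequence" the slide refers to.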
Figure: row offset table examples (panels: Deleting Name1, Inserting Name4, Inserting Name5).

Index Pages

• Index pages store the index entries.
• An index page has the same layout as a data page.
• Each row in an index page consists of the index key and a pointer to a page at the next lower level.
Extents

• Extents are the basic unit in which space is allocated to tables and indexes.
• An extent is 8 contiguous pages, or 64 KB (databases have 16 extents per MB).
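As a quick check of the extent arithmetic, the following Python sketch derives the extent size and maps a page number to its extent. It assumes pages are numbered from 0 within a file and that extents are aligned on multiples of eight pages; the function names are illustrative only.

```python
PAGES_PER_EXTENT = 8
PAGE_SIZE_KB = 8

EXTENT_SIZE_KB = PAGES_PER_EXTENT * PAGE_SIZE_KB  # 64 KB
EXTENTS_PER_MB = 1024 // EXTENT_SIZE_KB           # 16


def extent_of(page_number: int) -> int:
    """Extent that contains the given page (assumes 0-based, aligned numbering)."""
    return page_number // PAGES_PER_EXTENT


def pages_in_extent(extent_number: int) -> range:
    """The eight contiguous page numbers that make up an extent."""
    first = extent_number * PAGES_PER_EXTENT
    return range(first, first + PAGES_PER_EXTENT)


print(EXTENT_SIZE_KB, EXTENTS_PER_MB)  # 64 16
print(extent_of(17))                   # 2
print(list(pages_in_extent(2)))        # [16, 17, 18, 19, 20, 21, 22, 23]
```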
SQL Server 7.0 has two types of extents:

• Uniform extents are owned by a single object; all eight pages in the extent can only be used by the owning object.
• Mixed extents are shared by up to eight objects.
Log Data

• Stored in a physically separate location from the data.
• No longer stored in a system table.
• Therefore, it does not compete for memory resources.
• Physically stored as one or more log files, which SQL Server treats as a series of log records.
Text and Image data

• Stored using the text, ntext, and image data types.
• Each column of these types can store up to 2 GB per row.
• In the data page, there is a 16-byte pointer which points to the location of the text or image data.
• A table has one collection of pages to hold the text and image data. (It is recorded in sysindexes for the table with indid = 255.)
Page Free Space Pages
Page Free Space (PFS) pages record:

• Whether an individual page has been allocated.
• The amount of space free on each page.

Each PFS page covers 8,000 pages.

For each page, the PFS has a bitmap recording whether the page is:

• empty,
• 1-50% full,
• 51-80% full,
• 81-95% full,
• or 96-100% full.
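The fullness bands above can be expressed as a small classifier. The Python sketch below is illustrative only: the function names are hypothetical, and the handling of fractional percentages (for example, 50.5% falling into the 51-80% band) is an assumption, since the slide lists whole-number ranges.

```python
PAGES_PER_PFS = 8000  # each PFS page covers 8,000 pages, per the slide


def pfs_category(percent_full: float) -> str:
    """Map a page's fullness percentage to the band recorded for it."""
    if not 0 <= percent_full <= 100:
        raise ValueError("percent_full must be between 0 and 100")
    if percent_full == 0:
        return "empty"
    if percent_full <= 50:
        return "1-50% full"
    if percent_full <= 80:
        return "51-80% full"
    if percent_full <= 95:
        return "81-95% full"
    return "96-100% full"


def pfs_pages_needed(total_pages: int) -> int:
    """Number of PFS pages required to cover a file of total_pages pages."""
    return -(-total_pages // PAGES_PER_PFS)  # ceiling division


print(pfs_category(0))          # empty
print(pfs_category(75))         # 51-80% full
print(pfs_pages_needed(8001))   # 2
```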
Global Allocation Map pages

GAM pages record:

• What extents have been allocated.
• Whether they have been allocated to objects and indexes.
• Whether the allocation has been for uniform or mixed extents.
There are two types of Global Allocation Maps:

Global Allocation Map (GAM)
• Keeps track of allocated extents, irrespective of whether the allocation is for a mixed or a uniform extent.
• The GAM has one bit for each extent in the interval it covers.
• If the bit is 1, the extent is free; if the bit is 0, the extent is allocated.

Shared Global Allocation Map (SGAM)
• Records what extents are currently used as mixed extents and have at least one unused page.
• The SGAM has one bit for each extent in the interval it covers.
• If the bit is 1, the extent is being used as a mixed extent and has free pages.
• If the bit is 0, the extent is not being used as a mixed extent, or it is a mixed extent whose pages are all in use.

Each GAM and SGAM covers 64,000 extents, or nearly 4 GB of data.


Extent usage                GAM bit   SGAM bit
Free                        1         0
Uniform                     0         0
Mixed, with no free pages   0         0
Mixed, with free pages      0         1
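The bit table above can be read as a lookup from a (GAM bit, SGAM bit) pair to the state of an extent. The Python sketch below encodes exactly the combinations listed; the function name is hypothetical, and the size calculation simply multiplies the slide's figures (64,000 extents of 64 KB each).

```python
EXTENTS_PER_MAP = 64_000   # extents covered by one GAM or SGAM, per the slide
EXTENT_SIZE_KB = 64

# (GAM bit, SGAM bit) -> extent usage, as listed in the table
USAGE = {
    (1, 0): "free",
    (0, 0): "uniform, or mixed with no free pages",
    (0, 1): "mixed with free pages",
}


def extent_usage(gam_bit: int, sgam_bit: int) -> str:
    """Decode the pair of bits kept for one extent."""
    try:
        return USAGE[(gam_bit, sgam_bit)]
    except KeyError:
        raise ValueError("bit combination is not listed in the table") from None


# Data covered by one map, in GB: 64,000 extents * 64 KB
COVERED_GB = EXTENTS_PER_MAP * EXTENT_SIZE_KB / (1024 * 1024)

print(extent_usage(1, 0))  # free
print(extent_usage(0, 1))  # mixed with free pages
print(COVERED_GB)          # 3.90625
```

Note that the two bits alone cannot distinguish a uniform extent from a mixed extent whose pages are all in use; both read 0/0.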


• A new table or index is allocated pages from mixed extents.
• When the table or index grows to the point that it has eight pages, it is switched to uniform extents.

To keep its space allocation efficient, SQL Server 7.0 does not allocate entire extents to tables with small amounts of data.
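The switch from mixed to uniform extents can be sketched as a simple rule. The Python below is a simplified model under stated assumptions: the first eight pages of an object come singly from mixed extents and stay there, and every later page comes from a uniform extent. The function names are hypothetical.

```python
PAGES_PER_EXTENT = 8


def allocation_source(pages_already_owned: int) -> str:
    """Which kind of extent the next page is taken from under this simplified model."""
    return "mixed" if pages_already_owned < PAGES_PER_EXTENT else "uniform"


def uniform_extents_needed(total_pages: int) -> int:
    """Uniform extents reserved once the object has grown to total_pages pages."""
    beyond_mixed = max(0, total_pages - PAGES_PER_EXTENT)
    return -(-beyond_mixed // PAGES_PER_EXTENT)  # ceiling division


print(allocation_source(3))        # mixed
print(allocation_source(8))        # uniform
print(uniform_extents_needed(17))  # 2
```

Under this model a three-page table occupies only three pages in mixed extents instead of reserving a full 64 KB extent, which is the efficiency the slide describes.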
Database Files and Filegroups
• A database is mapped over a set of operating system files.
• These files are created at the same time as the database is created.
• A minimum of two operating system files are created for each database:
  • Primary data file
  • Log file
SQL Server 7.0 allows the following three types of database files:

Primary data files:
Every database has one primary data file that keeps track of all the rest of the files in the database, in addition to storing data. By convention, the name of a primary data file has the extension MDF.

Secondary data files:
A database might have zero or more secondary data files. By convention, the name of a secondary data file has the extension NDF.

Log files:
Every database has at least one log file that contains the information necessary to recover all the transactions in the database. By convention, a log file has the extension LDF.
• Maximum size of a database file: 32 TB
• Maximum size of a log file: 4 TB
SQL Server 7.0 databases have three types of files:

• Primary data files
  • Are the starting point of the database.
  • Point to the rest of the files in the database.
  • Every database has one primary data file.
  • The recommended file extension for primary data files is .mdf.

• Secondary data files
  • Comprise all of the data files other than the primary data file.
  • Some databases may not have any secondary data files, while others have multiple secondary data files.
  • The recommended file extension for secondary data files is .ndf.

• Log files
  • Hold all of the log information used to recover the database.
  • There must be at least one log file for each database, although there can be more than one.
  • The recommended file extension for log files is .ldf.


When a database is created, for example "Training", the two files that are created are:

C:\MSSQL7\data\training_Data.MDF   (A)
C:\MSSQL7\data\training_Log.LDF    (B)

where A is the primary data file and B is the log file.

Information about the database files is held in the table called "sysfiles".
Sysfiles table
Column     Description
fileid     File identification number, unique within each database
groupid    Identification of the filegroup to which the file belongs
size       Size of the file in pages
maxsize    Maximum size of the file: 0 means no growth, -1 means the file grows until the disk is full
growth     Autogrowth increment, in pages or as a percentage of the file size
perf       Reserved for future use
name       Logical name of the file
filename   Physical name of the file, including the path


• SQL Server 7.0 files can grow automatically from their originally specified size.
• When you define a file, you can specify a growth increment.
• Each time the file fills, it increases its size by the growth increment.
• If there are multiple files in a filegroup, they do not autogrow until all the files are full.
• Each file can also have a maximum size specified.
• If a maximum size is not specified, the file can continue to grow until it has used all available space on the disk.
• The user can let the files autogrow as needed to lessen the administrative burden of monitoring the amount of free space in the database and allocating additional space manually.
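The growth rules above can be modelled with a short Python sketch. It is an illustration rather than SQL Server's actual algorithm: the function names are hypothetical, sizes are in any consistent unit (for example MB), and capping the final increment at the maximum size is an assumption of this model.

```python
from typing import List, Optional


def grow_once(size: int, growth: int,
              maxsize: Optional[int] = None, percent: bool = False) -> int:
    """Return the file size after one autogrow step."""
    increment = size * growth // 100 if percent else growth
    new_size = size + increment
    if maxsize is not None:
        new_size = min(new_size, maxsize)  # assumption: growth stops at maxsize
    return new_size


def growth_history(size: int, growth: int, maxsize: Optional[int],
                   percent: bool = False, steps: int = 10) -> List[int]:
    """Sizes reached each time the file fills, until it can no longer grow."""
    history = [size]
    for _ in range(steps):
        nxt = grow_once(history[-1], growth, maxsize, percent)
        if nxt == history[-1]:  # no further growth possible
            break
        history.append(nxt)
    return history


# A 50-unit file growing by 10 units up to a maximum of 100 units.
print(growth_history(50, 10, 100))        # [50, 60, 70, 80, 90, 100]
# A 50-unit file growing by 20 percent, with no maximum: one step.
print(grow_once(50, 20, percent=True))    # 60
```

A growth increment of 0 leaves the size unchanged in this model, which corresponds to a file that is not allowed to grow.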
Points to remember :
• If the database must never be allowed to grow beyond its initial size, set the maximum growth size of the database to zero.
• This will prevent the database files from growing. If the database files fill with data, no more data can be added until more data files are added to the database or existing files are expanded.
Fragmentation of Files
• Allowing files to grow automatically can cause fragmentation
of those files if a large number of files share the same disk.
• Therefore, it is recommended that files or filegroups be
created on as many different available local physical disks as
possible.
• Place objects that compete heavily for space in different
filegroups.
Disk Management Techniques

SQL Server can:

• Allow the database file to grow automatically.
• Shrink the size of the database if the space is not needed.
Creating a database that specifies the primary, secondary, and log files with the autogrowth feature:

-- Sizes are in megabytes when no unit suffix is given.
CREATE DATABASE training
ON
  ( NAME = 'training_data1',                     -- primary data file
    FILENAME = 'c:\sql_data\training1.mdf',
    SIZE = 50,
    MAXSIZE = 100,
    FILEGROWTH = 10 ),
  ( NAME = 'training_data2',                     -- secondary data file
    FILENAME = 'd:\sql_data\training2.ndf',
    SIZE = 100,
    FILEGROWTH = 20 )
LOG ON
  ( NAME = 'training_log',                       -- log file, grows by 20 percent
    FILENAME = 'e:\sql_data\training_log.ldf',
    SIZE = 50,
    FILEGROWTH = 20% )
GO
Shrinking of databases
• Each file within a database can be shrunk to remove unused pages.
• Both data and transaction log files can be shrunk.
• The database files can be shrunk manually, either as a group or individually, and can also be set to shrink automatically at given intervals.
• Shrinking activity occurs in the background and does not affect any user activity within the database.
Shrinks the size of the data files in the specified database.
DBCC SHRINKDATABASE
( database_name [, target_percent]
[, {NOTRUNCATE | TRUNCATEONLY}]
)
Shrinking size of database file

Shrinks the size of the specified data file or log file for the
related database.

DBCC SHRINKFILE
( {file_name | file_id }
{ [, target_size]
| [, {EMPTYFILE | NOTRUNCATE | TRUNCATEONLY}]
}
)
Database filegroups
A database comprises:
• A primary filegroup,
• Any user-defined filegroups,
• A default filegroup.

The primary filegroup contains:
• The primary data file, and
• Any other files that are not put into another filegroup.

All pages for the system tables are allocated in the primary filegroup.
User-defined filegroups

These are filegroups that are specified using the FILEGROUP keyword in a CREATE DATABASE or ALTER DATABASE statement, or on the property page within SQL Server Enterprise Manager.

Default filegroup

The default filegroup contains the pages for all tables and indexes that do not have a filegroup specified when they are created. In each database, only one filegroup at a time can be the default filegroup. If no default filegroup is specified, it defaults to the primary filegroup.
Some important facts about file groups:
• No file can be a member of more than one filegroup.
• Log files are never a part of a filegroup.
• Files in a filegroup will not autogrow unless there is no space available on any of the files in the filegroup.
• A maximum of 256 filegroups can be created per database, and filegroups can contain only data files.
• It is not possible to move files to a different filegroup once the files have been added to the database.
Advantages of filegroups:
• Filegroups allow files to be grouped together for administrative and data allocation/placement purposes.
• For example, three files (data1.ndf, data2.ndf, and data3.ndf) can be created on three disk drives, respectively, and assigned to the filegroup fgroup1.
• A table can then be created specifically on the filegroup fgroup1. Queries for data from the table will be spread across the three disks, thereby improving performance.
• The same performance improvement can be accomplished with a single file created on a RAID (redundant array of independent disks) stripe set.
• Files and filegroups, however, allow you to easily add new files on new disks.
• Additionally, if your database exceeds the maximum size for a single Microsoft Windows file, you can use secondary data files to allow your database to continue to grow.
By creating a filegroup on a specific disk or RAID device, you can control where tables and indexes in your database are physically located.

Reasons for placing tables and indexes on specific disks include:
• Improved query performance.
• Parallel queries.
The following example creates a database with a primary data file, a user-defined filegroup, and a log file. The primary data file is in the primary filegroup and the user-defined filegroup has two secondary data files. An ALTER DATABASE statement makes the user-defined filegroup the default. A table is then created specifying the user-defined filegroup.
CREATE DATABASE Trg
ON PRIMARY
  ( NAME = 'Trg_Primary',                        -- primary data file
    FILENAME = 'c:\mssql7\data\Trg_Prm.mdf',
    SIZE = 4,
    MAXSIZE = 10,
    FILEGROWTH = 1 ),
FILEGROUP Trg_FG1                                -- user-defined filegroup
  ( NAME = 'Trg_FG1_Dat1',
    FILENAME = 'c:\mssql7\data\Trg_FG1_1.ndf',
    SIZE = 1MB,
    MAXSIZE = 10,
    FILEGROWTH = 1 ),
  ( NAME = 'Trg_FG1_Dat2',
    FILENAME = 'c:\mssql7\data\Trg_FG1_2.ndf',
    SIZE = 1MB,
    MAXSIZE = 10,
    FILEGROWTH = 1 )
LOG ON
  ( NAME = 'trg_log',
    FILENAME = 'c:\mssql7\data\Trg.ldf',
    SIZE = 1,
    MAXSIZE = 10,
    FILEGROWTH = 1 )
GO
An ALTER DATABASE statement makes the user-defined filegroup the default.

ALTER DATABASE Trg
MODIFY FILEGROUP Trg_FG1 DEFAULT
GO

A table is then created specifying the user-defined filegroup.

USE Trg
CREATE TABLE TrgTable
(par_id int PRIMARY KEY,
par_nm char(8) )
ON Trg_FG1
GO
Indexes
Indexes can be of two types:

1. Clustered index:

• Data rows are sorted and stored in the table based on their key values.
• There can only be one clustered index per table because the data rows themselves can only be sorted in one order.
• Clustered indexes are efficient for finding rows.
• The data rows form the lowest level of the clustered index.
2. Nonclustered index:

• Nonclustered indexes have a structure that is completely separate from the data rows.
• The lowest-level rows of a nonclustered index contain the nonclustered index key values, and each key value entry has pointers to the data rows containing the key value.
• The data rows are not stored in order based on the nonclustered key.
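The separation between index and data rows can be illustrated with a small Python analogy. It is not SQL Server's B-tree implementation: a sorted list of (key, row position) pairs stands in for the lowest level of a nonclustered index, and a plain list in arrival order stands in for the data rows. All names and sample values are hypothetical.

```python
import bisect

# Data rows in arrival order, not sorted by key.
rows = [(30, "Carol"), (10, "Alice"), (20, "Bob"), (20, "Dave")]

# Separate index structure: key values in order, each with a pointer (row position).
index = sorted((row[0], pos) for pos, row in enumerate(rows))


def lookup(key: int) -> list:
    """Find matching entries in the index, then follow the pointers to the rows."""
    start = bisect.bisect_left(index, (key, -1))  # first entry with this key
    found = []
    for k, pos in index[start:]:
        if k != key:
            break
        found.append(rows[pos])
    return found


print(index)       # [(10, 1), (20, 2), (20, 3), (30, 0)]
print(lookup(20))  # [(20, 'Bob'), (20, 'Dave')]
print(lookup(99))  # []
```

The rows themselves never move; only the index is kept in key order, which is why a table can have many such indexes but only one ordering of the data itself.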
Two types of tables:

1. Clustered tables

Are tables that have a clustered index. The data rows are stored in order based on the clustered index key. The data pages are linked in a doubly-linked list. The index is implemented as a B-tree index structure that supports fast retrieval of the rows based on their clustered index key values.

2. Heaps

Are tables that have no clustered index. The data rows are not stored in any particular order, and there is no particular order to the sequence of the data pages. The data pages are not linked in a linked list.
Maximum Capacity Specifications
This table specifies the maximum sizes and numbers of various objects
defined in Microsoft SQL Server databases or referenced in Transact-SQL
statements.

Object                                              SQL Server 7.0

Batch size                                          65,536 * network packet size
Bytes per short string column                       8000
Bytes per text, ntext, or image column              2 GB-2
Bytes per GROUP BY, ORDER BY                        8060
Bytes per index                                     900
Bytes per foreign key                               900
Bytes per primary key                               900
Bytes per row                                       8060
Bytes in source text of a stored procedure          Lesser of batch size or 250 MB
Clustered indexes per table                         1
Columns in GROUP BY, ORDER BY                       Limited only by number of bytes
Columns or expressions in a GROUP BY WITH CUBE
  or WITH ROLLUP statement                          10
Columns per index                                   16
Columns per foreign key                             16
Columns per primary key                             16
Columns per base table                              1024
Columns per SELECT statement                        4096
Columns per INSERT statement                        1024
Connections per client                              Max. value of configured connections
Database size                                       1,048,516 TB
Databases per server                                32,767
Filegroups per database                             256
Files per database                                  32,767
File size (data)                                    32 TB
File size (log)                                     4 TB
Foreign key table references per table              253
Identifier length (in characters)                   128
Locks per connection                                Max. locks per server
Locks per server                                    2,147,483,647 (static);
                                                    40% of SQL Server memory (dynamic)
Nested stored procedure levels                      32
Nested subqueries                                   32
Nested trigger levels                               32
Nonclustered indexes per table                      249
Objects concurrently open in a server*              2,147,483,647
Objects in a database*                              2,147,483,647
Parameters per stored procedure                     1024
REFERENCES per table                                63
Rows per table                                      Limited by available storage
SQL string length (batch size)                      128 * TDS packet size
Tables per database                                 Limited by number of objects in a database
Tables per SELECT statement                         256
Triggers per table                                  Limited by number of objects in a database
UNIQUE indexes or constraints per table             249 nonclustered and 1 clustered

* Database objects include all tables, views, stored procedures, extended stored
procedures, triggers, rules, defaults, and constraints. The sum of the number of
all these objects in a database cannot exceed 2,147,483,647.