You are on page 1of 54

L02: DDBMS Architecture

DDBMS Architecture
 Architectural Models of DDBMS
 More on Transparency

H.Lu/HKUST
DDBMS Architecture
 Architecture defines a system’s structure with
 Componets
 Functions of components, and
 Their interactions
 Purpose of “reference architecture”:

 A framework for discussion


 Standardiztion

H.Lu/HKUST L02: Introduction -- 2


Approaches of Reference Models
 Based on components
 +: modularity -: unclear functionality
 Based on functions (e.g. ISO/OSI model)
 +: clear interfaces -: unclear complexity
 Based on data (e.g. ANSI/SPARC model)

 +: data-centric - : unclear functionality


Need an integration of the three approaches

H.Lu/HKUST L02: Introduction -- 3


ANSI/SPARC Architecture

Users

External External External External


Schema View View View

Conceptual Conceptual
Schema View

Internal Internal
Schema View

H.Lu/HKUST L02: Introduction -- 4


Conceptual Schema Definition (1)

RELATION EMP [ RELATION PAY [


KEY = {ENO} KEY = {TITLE}
ATTIBUTES = { ATTIRBUTES = {
ENO : CHAR(9) TITLE : CHAR (10)
ENAME : CHAR(15) SAL: NUMERIC(6)
TITLE : CHAR(10) }
} ]
]

H.Lu/HKUST L02: Introduction -- 5


Conceptual Schema Definition (2)

RELATION PROJ [ RELATION ASG [


KEY = {PNO} KEY = {ENO, PNO}
ATTIBUTES = { ATTIRBUTES = {
PNO : CHAR(7) ENO : CHAR(9)
PNAME : CHAR(20) PNO : CHAR(7)
BUDGET : NUMERIC(7) RESP : CHAR (10)
LOC : CHAR(15) DUR: NUMERIC(3)
} }
] ]

H.Lu/HKUST L02: Introduction -- 6


Internal Schema Definition

RELATION EMP [ INTERNA_REL E [


KEY = {ENO} INDEX ON E#
ATTIBUTES = { FIELD = {
ENO : CHAR(9) E# : BYTE(9)
ENAME : CHAR(15) ENAME: BYTE(15)
TITLE : CHAR(10) TIT: BYTE(10)
} }
] ]

H.Lu/HKUST L02: Introduction -- 7


External View Definition
Create a BUDGET view from PROJ :

CREATE VIEW BUDGET (PNAME,BUD)


AS SELECT PNAME, BUDGET
FROM PROJ

Create a PAYROLL view from EMP and PAY:

CREATE VIEW PAYROLL ( EMP_NO, EMP_NAME, SAL)


AS SELECT EMP.ENO, EMP.ENAME, PAY.SAL
FROM EMP, PAY
WHERE EMP.TITLE = PAY.TITLE

H.Lu/HKUST L02: Introduction -- 8


The Classical DDBMS Architecture
Global Schema

Fragmentation Site Independen


Schema Schemas
Allocation
Schema

Local Mapping Schema


Local Mapping Schema Other sites

DBMS 1 DBMS 2

Site 1 LOCA Site 2


LOCA
L L
DB 1 DB 2
H.Lu/HKUST L02: Introduction -- 9
DDBMS Schemas
 Global Schema: a set of global relations as if database were
not distributed at all
 Fragmentation Schema: global relation is split into “non-
overlapping” (logical) fragments. 1:n mapping from relation
R to fragments Ri.
 Allocation Schema: 1:1 or 1:n (redundant) mapping from
fragments to sites. All fragments corresponding to the same
relation R at a site j constitute the physical image Rj. A copy
of a fragment is denoted by Rji.
 Local Mapping Schema: a mapping from physical images to
physical objects, which are manipulated by local DBMSs.

H.Lu/HKUST L02: Introduction -- 10


Global Relations, Fragments and Physical Images
R
R1 R 11
R1
R 12 (Site 1)
R2

R 21
R2
(Site2)
R3 R 22

R 32
Global R3
Fragments (Site3)
Relation
R 33

Physical Images
H.Lu/HKUST L02: Introduction -- 11
Motivation for this Architecture
 Separating the concept of data fragmentation from
the concept of data allocation
 fragmentation transparency
 location transparency
 Explicit control of redundancy
 Independence from local databases: allows local
mapping transparency

H.Lu/HKUST L02: Introduction -- 12


Some Aspects of the Classical Architecture
 Distributed database technology is an “add-on”
technology, most users already have populated
centralized DBMSs. Whereas top down design
assumes implementation of new DDBMS from
scratch.
 In many application environments, such as semi-
structured databases, continuous multimedia data, the
notion of fragment is difficult to define.
 Current relational DBMS products provide for some
form of location transparency (such as, by using
nicknames).

H.Lu/HKUST L02: Introduction -- 13


L02: DDBMS Architecture

 DDBMS Architecture
Architectural Models of DDBMS
 More on Transparency

H.Lu/HKUST
Bottom Up Architecture - Present & Future
 Possible ways in which multiple databases may be put together for sharing
by multiple DBMSs, which are characterized according to
 Autonomy (A) degree to which individual DBMSs can operate
independently.
 0 - Tightly coupled - integrated (A0),
 1 - Semiautonomous - federated (A1),
 2 - Total Isolation - multidatabase systems(A2)
 Distribution (D)
 0 - no distribution - single site (D0),
 1 - client-server - distribution of DBMS functionality (D1),
 2 - full distribution – P2P distributed architecture(D2)
 Heterogeneity (H)
 0 - homogeneous (H0)
 1 - heterogeneous (H1)

H.Lu/HKUST L02: Introduction -- 15


Architectural Alternatives
Distribution
(A0, D2, H0) (A1, D2, H0) (A2, D2, H0)

(A0, D2, H1) (A2, D2, H1) (A2, D1, H0)

(A0, D1, H1) (A2, D1, H1) (A2, D0, H0)


(A0, D0, H0)
Autonomy
(A0, D0, H1)
(A1, D0, H1) (A2, D0, H1)
Heterogeneity
H.Lu/HKUST L02: Introduction -- 16
Autonomy
 Autonomy refers to the distribution of control, not of data. It indicates the
degree to which individual DBMSs can operate independently
 Exchanging information among sites?
 Independently executing transactions?
 Modifying the component systems?
 Requirements of an autonomous system
 The local operations of the individual DBMSs are not affected by their
participation in the DDBS.
 The individual DBMSs query processing and optimization should not
be affected by the execution of global queries that access multiple
databases
 System consistency or operation should not be compromised when
individual DBMSs join or leave the distributed database
confederation.

H.Lu/HKUST L02: Introduction -- 17


Dimensions of Autonomy
 Design Autonomy
 Freedom for individual DBMSs to use data
models and transaction management techniques
they prefer
 Communication Autonomy
 Freedom for individual DBMSs to decide what
information (data & control) is to be exported
 Execution Autonomy
 Freedom for individual DBMSs to execute
transactions submitted in any way that it wants to

H.Lu/HKUST L02: Introduction -- 18


Alternatives of Autonomy
 A0: Tight integration
 A single image of the entire database to any user
 A1: Semiautonomous
 Can operate independently
 Have made some parts of their local data shareable
 Need to be modified to enable exchanging information
 A2: Total isolation
 Stand alone DBMS at each site, no global control
 Know nothing about other databases

H.Lu/HKUST L02: Introduction -- 19


Distribution
 Deals with data, not of control
 Physical distribution of data over sites
 Two classes of distribution alternatives

 Client/Server (different from network C/S)


• Clients – Apps, Server – Data management
 Peer-to-Peer (or full)
• Each machine has full DBMS functionality
• Main focus of the textbook

H.Lu/HKUST L02: Introduction -- 20


Heterogeneity
 Various forms of heterogeneity
 Hardware, networking protocol, data manager
 Focus of this course

 Data models, query languages, xact mgmt

H.Lu/HKUST L02: Introduction -- 21


DDBMSs Implementation Alternatives
P2P P2P
P2P homogeneous P2P
homogeneous
Heterogeneous homogene
D DBMS federated system
Federated DBMS ous
P2P multidata
heterogeneous base
DBMS system
P2P
heterogeneo
Logically integrated
us
and homogeneous
Multidataba
multiple DBMSs
se
A
system
Heterogeneous
Integrated DBMS Multidatabase
System
H Single site Single Site Heterogeneous
homogeneous heterogeneous
multidatabase
federated DBMS
federated system
H.Lu/HKUST L02: Introduction -- 22
Taxonomy of Distributed Databases
 Composite DBMSs -tight integration
 single image of entire database is available to any user
 can be single or multiple sites
 can be homogeneous or heterogeneous
 Federated DBMSs - semiautonomous
 DBMSs that can operate independently, but have decided to make
some parts of their local data shareable
 can be single or multiple sites.
 they need to be modified to enable them to exchange information
 Multidatabase Systems - total isolation
 individual systems are stand alone DBMSs, which know neither the
existence of other databases or how to communicate with them
 no global control over the execution of individual DBMSs.
 can be single or multiple sites
 homogeneous or heterogeneous

H.Lu/HKUST L02: Introduction -- 23


Client Server Architectures
 Client/Server (Ax, D1, Hy)
 Multiple clients, single server
• Similar data management as in centralized db
 Multiple clients, multiple servers
• Clients connect to servers as needed, or
• Clients connect to their “home” server only

H.Lu/HKUST L02: Introduction -- 24


Components of Client/Server System
UI Application

Operati

System
Program
Client
DBMS
ng
Communication
SQL Queries software
Result
Communication Relation
software
Semantic Data
Operatin

Controller
Query
Optimizer
Transaction
g

Manager
Recovery
Manager
Runtime Support
Processor
System

Database
H.Lu/HKUST L02: Introduction -- 25
DDBS Reference Architecture
ES1 ES2 ESn External
Schema
GCS Global Conceptual Schem

LCS1 LCS2 LCSLocal


n Conceptual Schema

LIS1 LIS2 LISnLocal Internal Schema

It is logically integrated. Provides for the levels


of transparency
H.Lu/HKUST L02: Introduction -- 26
Components of a DDBS Site
USER PROCESSOR DATA PROCESSOR
Local Local
External Global
GD/D Conceptual log Internal
Schema Schema Schema
Schema

Local Recovery
Semantic Data
User Interface

Global Query

Local Query
Processor

Runtime
Controller

Optimizer

Support
Execution

Manager
Monitor
Handler

Global

H.Lu/HKUST L02: Introduction -- 27


DDBMS Components: USER PROCESSOR
 User interface handler interprets user commands and
formats the result data as it is sent to the user
 Semantic data controller checks the integrity
constraints and authorization requirements
 Global query optimizer and decomposer determines
execution strategy, translates global queries to local
queries, and generates strategy for distributed join
operations
 Global execution monitor (distributed transaction
manager) coordinates the distributed execution of the
user request

H.Lu/HKUST L02: Introduction -- 28


DDBMS Components: DATA PROCESSOR
 Local query processor selects the access path and is
involved in local query optimization and join
operations
 Local recovery manager maintains local database
consistency
 Run-time support processor physically accesses the
database. It is the interface to the OS and contains
database buffer manager

H.Lu/HKUST L02: Introduction -- 29


MDBS Architecture
 Heterogeneous MDBS-Unilingual
 Requires users to utilize different data models and
languages when both a local and global database need to
be accessed
 Applications accessing multiple databases should use
external views defined on GCS
 A single application can have a LES on LCS and GES on
GCS
 Heterogeneous MDBS-Multilingual
 Users access global database by means of external schema
defined using the language of user's local DBMS
 Queries against global databases are made using language
of local DBMS
 Deal with translation of queries at runtime
H.Lu/HKUST L02: Introduction -- 30
MDBS Architecture (with Global Schema)

GES1 GES2 GES3 Global External Schema

LES11 GCS LESn1 Local External Schema

LES12 LCS1 LCSn LESn2


(A2, Dx, Hy) –
LES13 LESn3 GCS generated
LIS1 LISn bottom-up

•The GCS is generated by integrating LES's or LCS's


•The users of a local DBMS can maintain their autonomy
•Design of GCS is bottom-up
H.Lu/HKUST L02: Introduction -- 31
MDBS with GCS
 Uni-lingual Heterogeneous MDBS
 Accessing multiple databases through GES on GCS
 LES on LCS and GES on GCS co-exist
 Multilingual Heterogeneous MDBS
 GES defined using the language of local DBMS
 Queries over GES in the language of local DBMS
 Easier for users to query multiple databases
 The system deals with translation of queries at runtime

H.Lu/HKUST L02: Introduction -- 32


MDBS without Global Conceptual Schema
Multidatabase ES1 ES2 ES3
Layer

Local Database
LCS1 LCS2 LCS3
System Layer

LIS1 LIS2 LIS3


 Local system layer consists of several
DBMSs which present to multidatabase
layer part of their databases
 The shared database is either local
conceptual schema or external schema (Not
shown in the figure above)
 External views on one or more LCSs.
H.Lu/HKUST L02: Introduction -- 33
MDBS without Global Schema
 Federated Database Systems
 Do not use global conceptual schema
 Each local DBMS defines export schema
 Global database is a union of export schemas
 Each application accesses global database through import
schema (external view)
 MDBSs components architecture
 Existence of full fledged local DBMSs
 MDBS is a layer on top of individual DBMSs that support
access to different databases
 The complexity of the layer depends on existence of GCS
and heterogeneity

H.Lu/HKUST L02: Introduction -- 34


MDBS Components Model
 Full-fledged DBMS at each site
 A layer runs on top of individual DBMSs
 The complexity of the layer depends on

 Existence of GCS and heterogeneity

H.Lu/HKUST L02: Introduction -- 35


Components of an MDBMS
User
User System Responses User Requests User
Multi-DBMS Layer

User Transaction Transaction User


Interface Manager Manager Interface
Query Scheduler Query
Scheduler
Processor Processor
Recovery Recovery
Query Query
Manager
Optimizer Manager Optimizer
Runtime Sup. Runtime Sup.
Processor Processor

Database Database

H.Lu/HKUST L02: Introduction -- 36


Global Directory
 Directory is itself a database that contains meat-data
about the actual data stored in the database. It
includes the support for fragmentation transparency
for the classical DDBMS architecture.
 Directory can be local or distributed.
 Directory can be replicated and/or partitioned.
 Directory issues are very important for large multi-
database applications, such as digital libraries.

H.Lu/HKUST L02: Introduction -- 37


L02: DDBMS Architecture

 DDBMS Architecture based on data


 Architectural Models

More on Transparency

H.Lu/HKUST
More on Transparency
 X transparency means the existence of X is not
known to users.
 Closely related to architecture issues.
 The level of transparency is inevitably a compromise
between ease of use and the difficulty and overhead
cost of providing high levels of transparency
 DDBMS provides location transparency and
fragmentation transparency, OS provides network
transparency, and DBMS provides data
independence

H.Lu/HKUST L02: Introduction -- 39


On Distribution Transparency
 Higher levels of distribution transparency require appropriate
DDBMS support, but makes end-application developers work
easy.
 Less distribution transparency the more the end-application
developer needs to know about fragmentation and allocation
schemes, and how to maintain database consistency.
 There are tough problems in query optimization and
transaction management that need to be tackled (in terms of
system support and implementation) before fragmentation
transparency can be supported.

H.Lu/HKUST L02: Introduction -- 40


Levels of Distribution Transparency
 Fragmentation Transparency
 Just like using global relations.
 Location Transparency
 Need to know fragmentation schema; but no need not know where
fragments are located
 Applications access fragments (no need to specify sites where
fragments are located).
 Local Mapping Transparency
 Need to know both fragmentation and allocation schema; no need to
know what the underlying local DBMSs are.
 Applications access fragments explicitly specifying where the
fragments are located.
 No Transparency
 Need to know local DBMS query languages, and write applications
using functionality provided by the Local DBMS

H.Lu/HKUST L02: Introduction -- 41


Distribution Transparency
 We analyze the different levels of distribution
transparency which can be provided by DDBMS for
applications.
A Simple Application
Supplier(SNum, Name, City)
Horizontally fragmented into:
Supplier 1 = σ City = ``HK'' Supplier at Site1
Supplier2 = σ City != “HK” Supplier at Site2, Site3
Application:
Read the supplier number from the terminal and return
the name of the supplier with that number
H.Lu/HKUST L02: Introduction -- 42
Types of Data Fragmentation
Vertical Fragmentation
• Projection on relation
(subset of attributes)
• Reconstruction by join
• Updates require no tuple
migration
Horizontal
Fragmentation
• Selection on relation
(subset of tuples)
• Reconstruction by union
• Updates may requires
tuple migration
Mixed Fragmentation
H.Lu/HKUST L02: Introduction -- 43
Horizontal Fragmentation
 Partitioning the tuples of a global relation into
subsets
 Example:
• Supplier(SNum, Name, City)
 Horizontal Fragmentation can be:
• Supplier 1 = σ City != “HK” Supplier
• Supplier2 = σ City != “HK” Supplier
 Reconstruction is possible:
• Supplier = Supplier1 ∪ Supplier2
 The set of predicates defining all the fragments must
be complete, and mutually exclusive
H.Lu/HKUST L02: Introduction -- 44
Derived Horizontal Fragmentation
 The horizontal fragmentation is derived from the horizontal
fragmentation of another relation
Example:
Supply (SNum, PNum, DeptNum, Quan) is the
SNum is a supplier number semi-join
Supply1 = Supply SNum=SNum Supplier1 operation.
Supply2 = Supply SNum=SNum Supplier2
The predicates defining derived horizontal fragments are:
Supply.SNum = Supplier.SNum and Supplier. City = ``HK''
Supply.SNum = Supplier.SNum and Supplier. City != ``HK''

H.Lu/HKUST L02: Introduction -- 45


Vertical Fragmentation
 The vertical fragmentation of a global relation is the
subdivision of its attributes into groups; fragments are
obtained by projecting the global relation over each
group
Example
EMP (ENum,Name,Sal,Tax,MNum,DNum)
A vertical fragmentation can be
EMP1 = Π ENum, Name, MNum, DNum EMP
EMP2 = Π ENum, Sal, Tax EMP
Reconstruction:
EMP = EMP1 EMP2
ENum = ENum

H.Lu/HKUST L02: Introduction -- 46


Rules for Data Fragmentation
 Completeness
All the data of the global relation must be mapped into the
fragments
 Reconstruction
It must always be possible to reconstruct each global relation
from its fragments
 Disjointedness
it is convenient that fragments be disjoint, so that the
replication of data can be controlled explicitly at the
allocation level

H.Lu/HKUST L02: Introduction -- 47


Level 4 of Distribution Transparency
 No Transparency
read(terminal, $SNum);
$SupIMS($Snum,$Name,$Found) at S1;
If not FOUND then
$SupCODASYL($Snum,$Name,$Found) at S3;
write(terminal, $Name).

DDBMS

Codasyl IMS

Supplier2 Supplier1

H.Lu/HKUST S3 S1 L02: Introduction -- 48


Level 3 of Distribution Transparency
Local Mapping Transparency
read(terminal, $SNum);
Select Name into $Name Supplier1 S1
from S1.Supplier1
where SNum = $SNum;
If not FOUND then Supplier2 S2
Select Name into $Name
from S3.Supplier2 Supplier2 S3
where SNum = $SNum;
write(terminal, $Name). DDBMS

The applications have to specify both the fragment names and the sites
where they are located. The mapping of database operations specified in
applications to those in DBMSs at sites is transparent

H.Lu/HKUST L02: Introduction -- 49


Level 2 of Distribution Transparency
Location Transparency
read(terminal, $SNum);
Select Name into $Name Supplier1 S1
from Supplier1
where SNum = $SNum;
S2
If not FOUND then Supplier2
Select Name into $Name
from Supplier2
where SNum = $SNum;
Supplier2 S3
write(terminal, $Name). DDBMS

The application is independent from changes in allocation


schema, but not from changes to fragmentation schema
H.Lu/HKUST L02: Introduction -- 50
Level 1 of Distribution Transparency
Fragmentation transparency:
Supplier1 S1
read(terminal, $SNum);
Select Name into $Name
from Supplier Supplier2 S2
where SNum = $SNum;
write(terminal, $Name).
Supplier2 S3

DDBMS

The DDBMS interprets the database operation by accessing


the databases at different sites in a way which is completely
determined by the system
H.Lu/HKUST L02: Introduction -- 51
Distribution Transparency for Updates
EMP1 = Π ENum,Name,Sal,Taxσ DNum≤10 (EMP)
Difficult
• broadcasting EMP2 = Π ENum,MNum,DNumσ DNum≤10 (EMP)
updates to all EMP3 = Π ENum,Name,DNumσ Dnum>10 (EMP)
copies EMP4 = Π ENum,MNum,Sal,Taxσ Dnum>10 (EMP)
EMP1 EMP2
• migration of EnumName Sal Tax EnumMnumDnum
tuples because 100 Ann 100 10 100 20 3
of change of
Update Dnum=15
fragment- for Employee with
defining
EMP3 Enum=100 EMP4
attributes
EnumName Dnum EnumMnum Sal Tax
100 Ann 15 100 20 100 10
H.Lu/HKUST L02: Introduction -- 52
An Update Application
With Level 2
UPDATE Emp Location
SET DNum = 15 Transparency only
WHERE ENum = 100;
Select Name, Tax, Sal into $Name, $Sal, $Tax
From EMP 1
With Level 1 Where ENum = 100;
Fragmentation Select MNum into $MNum
Transparency From Emp 2
Where ENum = 100;
Insert into EMP 3 (ENum, Name, DNum)
(100, $Name, 15);
Insert into EMP 4 (ENum, Sal, Tax, MNum)
(100, $Sal, $Tax, $MNum);
Delete EMP 1 where ENum = 100;
Delete EMP 2 where ENum = 100;
H.Lu/HKUST L02: Introduction -- 53
Summary
 Approaches of Reference Architectures
 Data-based Architecture of DBMS

 External, Conceptual, and Internal Schema


 DDBMS Models

 Autonomy, Distribution, and Heterogeneity


 Architectural Alternatives of DDBMS

 Client/Server, Peer-to-Peer, and Multi-database

H.Lu/HKUST L02: Introduction -- 54

You might also like