03 Architecture - Harish.vit - TPT

L02: DDBMS Architecture
DDBMS Architecture
 Architectural Models of DDBMS
 More on Transparency
H.Lu/HKUST
DDBMS Architecture
 Architecture defines a system’s structure with
 Componets
 Functions of components, and
 Their interactions
 Purpose of “reference architecture”:
 A framework for discussion

 Standardiztion
H.Lu/HKUST L02: Introduction -- 2

Approaches of Reference Models
 Based on components
 +: modularity -: unclear functionality
 Based on functions (e.g. ISO/OSI model)
 +: clear interfaces -: unclear complexity
 Based on data (e.g. ANSI/SPARC model)
 +: data-centric - : unclear functionality

Need an integration of the three approaches

ANSI/SPARC Architecture
Users
External External External External

Schema View View View
Conceptual Conceptual
Schema View
Internal Internal
Schema View

Conceptual Schema Definition (1)
RELATION EMP [ RELATION PAY [

KEY = {ENO} KEY = {TITLE}
ATTIBUTES = { ATTIRBUTES = {
ENO : CHAR(9) TITLE : CHAR (10)
ENAME : CHAR(15) SAL: NUMERIC(6)
TITLE : CHAR(10) }
} ]
]

Conceptual Schema Definition (2)
RELATION PROJ [ RELATION ASG [

KEY = {PNO} KEY = {ENO, PNO}
ATTIBUTES = { ATTIRBUTES = {
PNO : CHAR(7) ENO : CHAR(9)
PNAME : CHAR(20) PNO : CHAR(7)
BUDGET : NUMERIC(7) RESP : CHAR (10)
LOC : CHAR(15) DUR: NUMERIC(3)
} }
] ]

Internal Schema Definition
RELATION EMP [ INTERNA_REL E [

KEY = {ENO} INDEX ON E#
ATTIBUTES = { FIELD = {
ENO : CHAR(9) E# : BYTE(9)
ENAME : CHAR(15) ENAME: BYTE(15)
TITLE : CHAR(10) TIT: BYTE(10)
} }
] ]

External View Definition
Create a BUDGET view from PROJ :
CREATE VIEW BUDGET (PNAME,BUD)

AS SELECT PNAME, BUDGET
FROM PROJ
Create a PAYROLL view from EMP and PAY:
CREATE VIEW PAYROLL ( EMP_NO, EMP_NAME, SAL)

AS SELECT EMP.ENO, EMP.ENAME, PAY.SAL
FROM EMP, PAY
WHERE EMP.TITLE = PAY.TITLE

The Classical DDBMS Architecture
Global Schema
Fragmentation Site Independen

Schema Schemas
Allocation
Schema
Local Mapping Schema

Local Mapping Schema Other sites
DBMS 1 DBMS 2
Site 1 LOCA Site 2

LOCA
L L
DB 1 DB 2
DDBMS Schemas
 Global Schema: a set of global relations as if database were
not distributed at all
 Fragmentation Schema: global relation is split into “non-
overlapping” (logical) fragments. 1:n mapping from relation
R to fragments Ri.
 Allocation Schema: 1:1 or 1:n (redundant) mapping from
fragments to sites. All fragments corresponding to the same
relation R at a site j constitute the physical image Rj. A copy
of a fragment is denoted by Rji.
 Local Mapping Schema: a mapping from physical images to
physical objects, which are manipulated by local DBMSs.

Global Relations, Fragments and Physical Images
R
R1 R 11
R1
R 12 (Site 1)
R2
R 21
R2
(Site2)
R3 R 22
R 32
Global R3
Fragments (Site3)
Relation
R 33
Physical Images
Motivation for this Architecture
 Separating the concept of data fragmentation from
the concept of data allocation
 fragmentation transparency
 location transparency
 Explicit control of redundancy
 Independence from local databases: allows local
mapping transparency

Some Aspects of the Classical Architecture
 Distributed database technology is an “add-on”
technology, most users already have populated
centralized DBMSs. Whereas top down design
assumes implementation of new DDBMS from
scratch.
 In many application environments, such as semi-
structured databases, continuous multimedia data, the
notion of fragment is difficult to define.
 Current relational DBMS products provide for some
form of location transparency (such as, by using
nicknames).

 DDBMS Architecture
Architectural Models of DDBMS
 More on Transparency
H.Lu/HKUST
Bottom Up Architecture - Present & Future
 Possible ways in which multiple databases may be put together for sharing
by multiple DBMSs, which are characterized according to
 Autonomy (A) degree to which individual DBMSs can operate
independently.
 0 - Tightly coupled - integrated (A0),
 1 - Semiautonomous - federated (A1),
 2 - Total Isolation - multidatabase systems(A2)
 Distribution (D)
 0 - no distribution - single site (D0),
 1 - client-server - distribution of DBMS functionality (D1),
 2 - full distribution – P2P distributed architecture(D2)
 Heterogeneity (H)
 0 - homogeneous (H0)
 1 - heterogeneous (H1)

Architectural Alternatives
Distribution
(A0, D2, H0) (A1, D2, H0) (A2, D2, H0)
(A0, D2, H1) (A2, D2, H1) (A2, D1, H0)
(A0, D1, H1) (A2, D1, H1) (A2, D0, H0)

(A0, D0, H0)
Autonomy
(A0, D0, H1)
(A1, D0, H1) (A2, D0, H1)
Heterogeneity
Autonomy
 Autonomy refers to the distribution of control, not of data. It indicates the
degree to which individual DBMSs can operate independently
 Exchanging information among sites?
 Independently executing transactions?
 Modifying the component systems?
 Requirements of an autonomous system
 The local operations of the individual DBMSs are not affected by their
participation in the DDBS.
 The individual DBMSs query processing and optimization should not
be affected by the execution of global queries that access multiple
databases
 System consistency or operation should not be compromised when
individual DBMSs join or leave the distributed database
confederation.

Dimensions of Autonomy
 Design Autonomy
 Freedom for individual DBMSs to use data
models and transaction management techniques
they prefer
 Communication Autonomy
 Freedom for individual DBMSs to decide what
information (data & control) is to be exported
 Execution Autonomy
 Freedom for individual DBMSs to execute
transactions submitted in any way that it wants to

Alternatives of Autonomy
 A0: Tight integration
 A single image of the entire database to any user
 A1: Semiautonomous
 Can operate independently
 Have made some parts of their local data shareable
 Need to be modified to enable exchanging information
 A2: Total isolation
 Stand alone DBMS at each site, no global control
 Know nothing about other databases

Distribution
 Deals with data, not of control
 Physical distribution of data over sites
 Two classes of distribution alternatives
 Client/Server (different from network C/S)

• Clients – Apps, Server – Data management
 Peer-to-Peer (or full)
• Each machine has full DBMS functionality
• Main focus of the textbook

Heterogeneity
 Various forms of heterogeneity
 Hardware, networking protocol, data manager
 Focus of this course
 Data models, query languages, xact mgmt

DDBMSs Implementation Alternatives
P2P P2P
P2P homogeneous P2P
homogeneous
Heterogeneous homogene
D DBMS federated system
Federated DBMS ous
P2P multidata
heterogeneous base
DBMS system
P2P
heterogeneo
Logically integrated
us
and homogeneous
Multidataba
multiple DBMSs
se
A
system
Heterogeneous
Integrated DBMS Multidatabase
System
H Single site Single Site Heterogeneous
homogeneous heterogeneous
multidatabase
federated DBMS
federated system
Taxonomy of Distributed Databases
 Composite DBMSs -tight integration
 single image of entire database is available to any user
 can be single or multiple sites
 can be homogeneous or heterogeneous
 Federated DBMSs - semiautonomous
 DBMSs that can operate independently, but have decided to make
some parts of their local data shareable
 can be single or multiple sites.
 they need to be modified to enable them to exchange information
 Multidatabase Systems - total isolation
 individual systems are stand alone DBMSs, which know neither the
existence of other databases or how to communicate with them
 no global control over the execution of individual DBMSs.
 can be single or multiple sites
 homogeneous or heterogeneous

Client Server Architectures
 Client/Server (Ax, D1, Hy)
 Multiple clients, single server
• Similar data management as in centralized db
 Multiple clients, multiple servers
• Clients connect to servers as needed, or
• Clients connect to their “home” server only

Components of Client/Server System
UI Application
Operati
System
Program
Client
DBMS
ng
Communication
SQL Queries software
Result
Communication Relation
software
Semantic Data
Operatin
Controller
Query
Optimizer
Transaction
g
Manager
Recovery
Manager
Runtime Support
Processor
System
Database
DDBS Reference Architecture
ES1 ES2 ESn External
Schema
GCS Global Conceptual Schem
LCS1 LCS2 LCSLocal

n Conceptual Schema
LIS1 LIS2 LISnLocal Internal Schema
It is logically integrated. Provides for the levels

of transparency
Components of a DDBS Site
USER PROCESSOR DATA PROCESSOR
Local Local
External Global
GD/D Conceptual log Internal
Schema Schema Schema
Schema
Local Recovery
Semantic Data
User Interface
Global Query
Local Query
Processor
Runtime
Controller
Optimizer
Support
Execution
Manager
Monitor
Handler
Global

DDBMS Components: USER PROCESSOR
 User interface handler interprets user commands and
formats the result data as it is sent to the user
 Semantic data controller checks the integrity
constraints and authorization requirements
 Global query optimizer and decomposer determines
execution strategy, translates global queries to local
queries, and generates strategy for distributed join
operations
 Global execution monitor (distributed transaction
manager) coordinates the distributed execution of the
user request

DDBMS Components: DATA PROCESSOR
 Local query processor selects the access path and is
involved in local query optimization and join
operations
 Local recovery manager maintains local database
consistency
 Run-time support processor physically accesses the
database. It is the interface to the OS and contains
database buffer manager

MDBS Architecture
 Heterogeneous MDBS-Unilingual
 Requires users to utilize different data models and
languages when both a local and global database need to
be accessed
 Applications accessing multiple databases should use
external views defined on GCS
 A single application can have a LES on LCS and GES on
GCS
 Heterogeneous MDBS-Multilingual
 Users access global database by means of external schema
defined using the language of user's local DBMS
 Queries against global databases are made using language
of local DBMS
 Deal with translation of queries at runtime
MDBS Architecture (with Global Schema)
GES1 GES2 GES3 Global External Schema
LES11 GCS LESn1 Local External Schema
LES12 LCS1 LCSn LESn2

(A2, Dx, Hy) –
LES13 LESn3 GCS generated
LIS1 LISn bottom-up
•The GCS is generated by integrating LES's or LCS's

•The users of a local DBMS can maintain their autonomy
•Design of GCS is bottom-up
MDBS with GCS
 Uni-lingual Heterogeneous MDBS
 Accessing multiple databases through GES on GCS
 LES on LCS and GES on GCS co-exist
 Multilingual Heterogeneous MDBS
 GES defined using the language of local DBMS
 Queries over GES in the language of local DBMS
 Easier for users to query multiple databases
 The system deals with translation of queries at runtime

MDBS without Global Conceptual Schema
Multidatabase ES1 ES2 ES3
Layer
Local Database
LCS1 LCS2 LCS3
System Layer
LIS1 LIS2 LIS3

 Local system layer consists of several
DBMSs which present to multidatabase
layer part of their databases
 The shared database is either local
conceptual schema or external schema (Not
shown in the figure above)
 External views on one or more LCSs.
MDBS without Global Schema
 Federated Database Systems
 Do not use global conceptual schema
 Each local DBMS defines export schema
 Global database is a union of export schemas
 Each application accesses global database through import
schema (external view)
 MDBSs components architecture
 Existence of full fledged local DBMSs
 MDBS is a layer on top of individual DBMSs that support
access to different databases
 The complexity of the layer depends on existence of GCS
and heterogeneity

MDBS Components Model
 Full-fledged DBMS at each site
 A layer runs on top of individual DBMSs
 The complexity of the layer depends on
 Existence of GCS and heterogeneity

Components of an MDBMS
User
User System Responses User Requests User
Multi-DBMS Layer
User Transaction Transaction User

Interface Manager Manager Interface
Query Scheduler Query
Scheduler
Processor Processor
Recovery Recovery
Query Query
Manager
Optimizer Manager Optimizer
Runtime Sup. Runtime Sup.
Processor Processor
Database Database

Global Directory
 Directory is itself a database that contains meat-data
about the actual data stored in the database. It
includes the support for fragmentation transparency
for the classical DDBMS architecture.
 Directory can be local or distributed.
 Directory can be replicated and/or partitioned.
 Directory issues are very important for large multi-
database applications, such as digital libraries.

 DDBMS Architecture based on data

 Architectural Models
More on Transparency
H.Lu/HKUST
More on Transparency
 X transparency means the existence of X is not
known to users.
 Closely related to architecture issues.
 The level of transparency is inevitably a compromise
between ease of use and the difficulty and overhead
cost of providing high levels of transparency
 DDBMS provides location transparency and
fragmentation transparency, OS provides network
transparency, and DBMS provides data
independence

On Distribution Transparency
 Higher levels of distribution transparency require appropriate
DDBMS support, but makes end-application developers work
easy.
 Less distribution transparency the more the end-application
developer needs to know about fragmentation and allocation
schemes, and how to maintain database consistency.
 There are tough problems in query optimization and
transaction management that need to be tackled (in terms of
system support and implementation) before fragmentation
transparency can be supported.

Levels of Distribution Transparency
 Fragmentation Transparency
 Just like using global relations.
 Location Transparency
 Need to know fragmentation schema; but no need not know where
fragments are located
 Applications access fragments (no need to specify sites where
fragments are located).
 Local Mapping Transparency
 Need to know both fragmentation and allocation schema; no need to
know what the underlying local DBMSs are.
 Applications access fragments explicitly specifying where the
fragments are located.
 No Transparency
 Need to know local DBMS query languages, and write applications
using functionality provided by the Local DBMS

Distribution Transparency
 We analyze the different levels of distribution
transparency which can be provided by DDBMS for
applications.
A Simple Application
Supplier(SNum, Name, City)
Horizontally fragmented into:
Supplier 1 = σ City = ``HK'' Supplier at Site1
Supplier2 = σ City != “HK” Supplier at Site2, Site3
Application:
Read the supplier number from the terminal and return
the name of the supplier with that number
Types of Data Fragmentation
Vertical Fragmentation
• Projection on relation
(subset of attributes)
• Reconstruction by join
• Updates require no tuple
migration
Horizontal
Fragmentation
• Selection on relation
(subset of tuples)
• Reconstruction by union
• Updates may requires
tuple migration
Mixed Fragmentation
Horizontal Fragmentation
 Partitioning the tuples of a global relation into
subsets
 Example:
• Supplier(SNum, Name, City)
 Horizontal Fragmentation can be:
• Supplier 1 = σ City != “HK” Supplier
• Supplier2 = σ City != “HK” Supplier
 Reconstruction is possible:
• Supplier = Supplier1 ∪ Supplier2
 The set of predicates defining all the fragments must
be complete, and mutually exclusive
Derived Horizontal Fragmentation
 The horizontal fragmentation is derived from the horizontal
fragmentation of another relation
Example:
Supply (SNum, PNum, DeptNum, Quan) is the
SNum is a supplier number semi-join
Supply1 = Supply SNum=SNum Supplier1 operation.
Supply2 = Supply SNum=SNum Supplier2
The predicates defining derived horizontal fragments are:
Supply.SNum = Supplier.SNum and Supplier. City = ``HK''
Supply.SNum = Supplier.SNum and Supplier. City != ``HK''

Vertical Fragmentation
 The vertical fragmentation of a global relation is the
subdivision of its attributes into groups; fragments are
obtained by projecting the global relation over each
group
Example
EMP (ENum,Name,Sal,Tax,MNum,DNum)
A vertical fragmentation can be
EMP1 = Π ENum, Name, MNum, DNum EMP
EMP2 = Π ENum, Sal, Tax EMP
Reconstruction:
EMP = EMP1 EMP2
ENum = ENum

Rules for Data Fragmentation
 Completeness
All the data of the global relation must be mapped into the
fragments
 Reconstruction
It must always be possible to reconstruct each global relation
from its fragments
 Disjointedness
it is convenient that fragments be disjoint, so that the
replication of data can be controlled explicitly at the
allocation level

Level 4 of Distribution Transparency
 No Transparency
read(terminal, $SNum);
$SupIMS($Snum,$Name,$Found) at S1;
If not FOUND then
$SupCODASYL($Snum,$Name,$Found) at S3;
write(terminal, $Name).
DDBMS
Codasyl IMS
Supplier2 Supplier1
H.Lu/HKUST S3 S1 L02: Introduction -- 48

Local Mapping Transparency
Select Name into $Name Supplier1 S1
from S1.Supplier1
where SNum = $SNum;
If not FOUND then Supplier2 S2
Select Name into $Name
from S3.Supplier2 Supplier2 S3
where SNum = $SNum;
write(terminal, $Name). DDBMS
The applications have to specify both the fragment names and the sites
where they are located. The mapping of database operations specified in
applications to those in DBMSs at sites is transparent

Location Transparency
Select Name into $Name Supplier1 S1
from Supplier1
where SNum = $SNum;
S2
If not FOUND then Supplier2
from Supplier2
where SNum = $SNum;
Supplier2 S3
write(terminal, $Name). DDBMS
The application is independent from changes in allocation

schema, but not from changes to fragmentation schema
Fragmentation transparency:
Supplier1 S1
from Supplier Supplier2 S2
where SNum = $SNum;
write(terminal, $Name).
Supplier2 S3
DDBMS
The DDBMS interprets the database operation by accessing

the databases at different sites in a way which is completely
determined by the system
Distribution Transparency for Updates
EMP1 = Π ENum,Name,Sal,Taxσ DNum≤10 (EMP)
Difficult
• broadcasting EMP2 = Π ENum,MNum,DNumσ DNum≤10 (EMP)
updates to all EMP3 = Π ENum,Name,DNumσ Dnum>10 (EMP)
copies EMP4 = Π ENum,MNum,Sal,Taxσ Dnum>10 (EMP)
EMP1 EMP2
• migration of EnumName Sal Tax EnumMnumDnum
tuples because 100 Ann 100 10 100 20 3
of change of
Update Dnum=15
fragment- for Employee with
defining
EMP3 Enum=100 EMP4
attributes
EnumName Dnum EnumMnum Sal Tax
100 Ann 15 100 20 100 10
An Update Application
With Level 2
UPDATE Emp Location
SET DNum = 15 Transparency only
WHERE ENum = 100;
Select Name, Tax, Sal into $Name, $Sal, $Tax
From EMP 1
With Level 1 Where ENum = 100;
Fragmentation Select MNum into $MNum
Transparency From Emp 2
Where ENum = 100;
Insert into EMP 3 (ENum, Name, DNum)
(100, $Name, 15);
Insert into EMP 4 (ENum, Sal, Tax, MNum)
(100, $Sal, $Tax, $MNum);
Delete EMP 1 where ENum = 100;
Delete EMP 2 where ENum = 100;
Summary
 Approaches of Reference Architectures
 Data-based Architecture of DBMS
 External, Conceptual, and Internal Schema

 DDBMS Models
 Autonomy, Distribution, and Heterogeneity

 Architectural Alternatives of DDBMS
 Client/Server, Peer-to-Peer, and Multi-database

03 Architecture - Harish.vit - TPT

Uploaded by

Document Information

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

03 Architecture - Harish.vit - TPT

Uploaded by

Copyright:

Available Formats

L02: DDBMS Architecture

 A framework for discussion

H.Lu/HKUST L02: Introduction -- 2

 +: data-centric - : unclear functionality

H.Lu/HKUST L02: Introduction -- 3

External External External External

H.Lu/HKUST L02: Introduction -- 4

RELATION EMP [ RELATION PAY [

H.Lu/HKUST L02: Introduction -- 5

RELATION PROJ [ RELATION ASG [

H.Lu/HKUST L02: Introduction -- 6

RELATION EMP [ INTERNA_REL E [

H.Lu/HKUST L02: Introduction -- 7

CREATE VIEW BUDGET (PNAME,BUD)

Create a PAYROLL view from EMP and PAY:

CREATE VIEW PAYROLL ( EMP_NO, EMP_NAME, SAL)

H.Lu/HKUST L02: Introduction -- 8

Fragmentation Site Independen

Local Mapping Schema

Site 1 LOCA Site 2

H.Lu/HKUST L02: Introduction -- 10

H.Lu/HKUST L02: Introduction -- 12

H.Lu/HKUST L02: Introduction -- 13

H.Lu/HKUST L02: Introduction -- 15

(A0, D2, H1) (A2, D2, H1) (A2, D1, H0)

(A0, D1, H1) (A2, D1, H1) (A2, D0, H0)

H.Lu/HKUST L02: Introduction -- 17

H.Lu/HKUST L02: Introduction -- 18

H.Lu/HKUST L02: Introduction -- 19

 Client/Server (different from network C/S)

H.Lu/HKUST L02: Introduction -- 20

 Data models, query languages, xact mgmt

H.Lu/HKUST L02: Introduction -- 21

H.Lu/HKUST L02: Introduction -- 23

H.Lu/HKUST L02: Introduction -- 24

LCS1 LCS2 LCSLocal

LIS1 LIS2 LISnLocal Internal Schema

It is logically integrated. Provides for the levels

H.Lu/HKUST L02: Introduction -- 27

H.Lu/HKUST L02: Introduction -- 28

H.Lu/HKUST L02: Introduction -- 29

GES1 GES2 GES3 Global External Schema

LES11 GCS LESn1 Local External Schema

LES12 LCS1 LCSn LESn2

•The GCS is generated by integrating LES's or LCS's

H.Lu/HKUST L02: Introduction -- 32

LIS1 LIS2 LIS3

H.Lu/HKUST L02: Introduction -- 34

 Existence of GCS and heterogeneity

H.Lu/HKUST L02: Introduction -- 35

User Transaction Transaction User

H.Lu/HKUST L02: Introduction -- 36

H.Lu/HKUST L02: Introduction -- 37

 DDBMS Architecture based on data

H.Lu/HKUST L02: Introduction -- 39

H.Lu/HKUST L02: Introduction -- 40

H.Lu/HKUST L02: Introduction -- 41

H.Lu/HKUST L02: Introduction -- 45

H.Lu/HKUST L02: Introduction -- 46

H.Lu/HKUST L02: Introduction -- 47

H.Lu/HKUST S3 S1 L02: Introduction -- 48

H.Lu/HKUST L02: Introduction -- 49

The application is independent from changes in allocation