
Database Management Systems

Chapter 10 Distributed Databases


Jerry Post
Copyright 2003

D A T A B A S E

Distributed Databases
Definition
Advantages / Uses
Problems / Complications
Client-Server / SQL Server
Microsoft Access

SELECT Sales FROM Britain.Sales
UNION
SELECT Sales FROM France.Sales
UNION
SELECT Sales FROM Italy.Sales

Figure: sites in Germany, Britain, France, and Italy.
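The UNION query on this slide can be tried on one machine. The following is a minimal sketch, assuming SQLite stands in for the distributed DBMS and that each attached in-memory database plays the role of one site. The sample rows are hypothetical.

```python
import sqlite3

# One connection stands in for the user's local DBMS. Each ATTACH adds a
# separate in-memory database that plays the role of a remote site.
con = sqlite3.connect(":memory:")
for site in ("Britain", "France", "Italy"):
    con.execute(f"ATTACH DATABASE ':memory:' AS {site}")
    con.execute(f"CREATE TABLE {site}.Sales (Sales INTEGER)")

# Hypothetical sample rows; the value 200 appears at two sites on purpose.
con.executemany("INSERT INTO Britain.Sales VALUES (?)", [(100,), (200,)])
con.executemany("INSERT INTO France.Sales VALUES (?)", [(200,), (300,)])
con.executemany("INSERT INTO Italy.Sales VALUES (?)", [(400,)])

QUERY = """
SELECT Sales FROM Britain.Sales
UNION
SELECT Sales FROM France.Sales
UNION
SELECT Sales FROM Italy.Sales
"""

# UNION removes duplicate rows; UNION ALL keeps every row from every site.
union_rows = sorted(row[0] for row in con.execute(QUERY))
union_all_rows = sorted(
    row[0] for row in con.execute(QUERY.replace("UNION", "UNION ALL"))
)
print(union_rows)
print(union_all_rows)
```

Note that UNION drops the duplicate value, so UNION ALL is probably the safer choice when every sale at every site should be counted.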

Distributed Database Definition


Multiple independent databases
  Each DBMS is a complete DBMS (engine, queries, locking, transactions, etc.)
  Usually on different machines.
  Usually in different locations.

Connected by a network.

Might be different environments
  Hardware
  Operating System
  DBMS Software

Figure labels: Database Zeus, Database Apollo, Database Athena; England, France, United States.


Distributed Database Rules


C.J. Date
Rule 0: Transparency: the user should not know or care that the database is distributed.
  1. Local autonomy.
  2. No reliance on a central site.
  3. Continuous operation.
  4. Location independence.
  5. Fragmentation independence (physical storage).
  6. Replication independence.
  7. Distributed query processing.
  8. Distributed transaction management.
  9. Hardware independence.
  10. Operating system independence.
  11. Network independence.
  12. DBMS independence.


Distributed Features
Each database can continue to run even if a portion fails.
Data and hardware can be moved without affecting operations or users.
  Expanding operations.
  Performance issues.
System expansion and upgrades.
  Add new section without affecting others.
  Upgrade hardware, network and DBMS.


Advantages and Applications


Business operations are often distributed
  Work and data are segmented by department.
  Work and data are segmented by geographical location.
Improved performance
  Most updates and queries are performed locally.
  Maintain local control and responsibility over data.
  Can still combine data across the system.
Scalability and expansion
  Add on, not replacement.

Figure labels: local transactions; future expansion.

Creating a Distributed Database


Design administration plan.
Choose hardware and DBMS vendor, and network.
Set up network and DBMS connections.
Choose locations for data.
Choose replication strategy.
Create backup plan and strategy.
Create local views and synonyms.
Perform stress test: loads and failures.

Distributed Query Processing


Networks are slow
  Drives: 20 - 60 MB per sec.
  LANs: 1 - 10 MB per sec (10 - 100 Mbps).
  WANs: 0.01 - 5 MB per sec.
  Faster is possible but expensive!
  SANs: 10 - 100 MB per sec.

Goal is to minimize transmissions.
  Each system must be capable of evaluating queries, preferably SQL.
  Results depend heavily on how the system joins tables.

Figure labels: WAN 0.1 - 5 MB; disk drive 10 - 20 MB; LAN 10 - 100 MB.
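To make the speed gap concrete, the sketch below converts the transfer rates listed on this slide into the time needed to move a table. The 500 MB table size is a hypothetical value chosen only for illustration.

```python
# Transfer rates in MB per second as (low, high) ranges, taken from the
# text of the slide.
RATES_MB_PER_SEC = {
    "disk drive": (20, 60),
    "LAN": (1, 10),
    "WAN": (0.01, 5),
    "SAN": (10, 100),
}


def transfer_seconds(size_mb, rate_mb_per_sec):
    """Seconds needed to move size_mb megabytes at the given rate."""
    return size_mb / rate_mb_per_sec


SIZE_MB = 500  # hypothetical table size, not from the slide

for medium, (low, high) in RATES_MB_PER_SEC.items():
    fastest = transfer_seconds(SIZE_MB, high)
    slowest = transfer_seconds(SIZE_MB, low)
    print(f"{medium:10s} {fastest:10.1f} s to {slowest:10.1f} s")
```

Under these assumptions, a transfer that takes seconds from a local drive can take hours over a slow WAN, which is why the goal is to minimize transmissions.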


Distributed Query Processing Example

Sites
  NY: Customers: 1 M rows
    Customers(C#, ...) 1,000,000
  LA: Production: 10 M rows
    Products(P#, Color) 10,000,000
  Chicago: Sales: 20 M rows
    Sales(S#, C#, Sdate) 20,000,000
    SaleItem(S#, P#, ...) 50,000,000

Query: List customers who bought blue products on March 1

Bad idea #1
  Transfer all rows to Chicago
  Then JOIN and select.
Better idea #2 (probably)
  Transfer blue products from LA to Chicago
Better idea #3
  Get sale items on March 1
  Get blue products from LA
  Send C# to NY

Figure labels: P# sold on March 1 (Chicago to LA); Blue P# sold on March 1 (LA to Chicago); C# list from desired P# (Chicago to NY); Matching Customer data (NY to Chicago).
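A rough way to compare the three ideas is to count the rows that cross the network under each one. The sketch below uses the table sizes from the slide. The selectivity values (share of blue products, products sold on March 1, matching customers) are hypothetical assumptions that do not appear in the original, so the totals only illustrate the relative ordering.

```python
# Row counts stated on the slide.
CUSTOMERS_NY = 1_000_000
PRODUCTS_LA = 10_000_000

# Hypothetical selectivities, NOT from the slide; change them to explore.
BLUE_PERCENT = 10                # share of products that are blue
PRODUCTS_SOLD_MARCH_1 = 100_000  # distinct P# in SaleItem for March 1
MATCHING_CUSTOMERS = 5_000       # distinct C# who bought a blue product


def rows_shipped():
    """Rows crossing the network under each idea from the slide."""
    blue_products = PRODUCTS_LA * BLUE_PERCENT // 100
    blue_sold = PRODUCTS_SOLD_MARCH_1 * BLUE_PERCENT // 100
    # C# list sent to NY plus the matching customer rows sent back.
    customer_lookup = 2 * MATCHING_CUSTOMERS
    return {
        "idea 1": CUSTOMERS_NY + PRODUCTS_LA,
        "idea 2": blue_products + customer_lookup,
        "idea 3": PRODUCTS_SOLD_MARCH_1 + blue_sold + customer_lookup,
    }


for idea, rows in rows_shipped().items():
    print(f"{idea}: {rows:,} rows shipped")
```

With these assumed values, idea 3 ships far fewer rows than idea 1, but the ranking of ideas 2 and 3 depends on the real selectivities, which matches the "probably" on the slide.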

Data Replication
Goals
  Minimize transmissions
  Improve performance
  Support heavy multiuser access.

Problems
  Updating copies
    Bulk transmissions
    Site unavailable
  Concurrency
    Easier for two people to change the same data at the same time.

Decision support systems. Data warehouse.

Figure: the Britain and Spain sites each hold copies of Britain: Customers & Sales, France: Customers & Sales, and Spain: Customers & Sales. Labels: Periodic updates; Market research & data corrections; Update data.


Concurrency and Locks


Each DBMS must maintain lock facility.
To update, each DBMS must utilize and recognize other lock mechanisms and return codes.
Each DBMS must have a deadlock resolution protocol that recognizes the distributed databases.
  Random wait.
  Optimistic updates.
  Two-phase commit.

Figure:
  DBMS #1 Accounts (Jones, 8898): Transaction A Locked, Transaction B Waiting.
  DBMS #2 Accounts (Jones, 3561): Transaction A Waiting, Transaction B Locked.
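The figure shows why the deadlock protocol has to recognize the distributed databases: neither DBMS sees a cycle by itself. The sketch below illustrates this with a wait-for graph, a detection technique that is not named on the slide. The function name and the simplification that each transaction waits on at most one other are assumptions made for the example.

```python
def find_cycle(waits_for):
    """Return a list of transactions forming a wait cycle, or None.

    waits_for maps each waiting transaction to the transaction that holds
    the lock it needs (at most one per transaction in this sketch).
    """
    for start in waits_for:
        path = [start]
        current = waits_for.get(start)
        while current is not None and current not in path:
            path.append(current)
            current = waits_for.get(current)
        if current == start:
            return path
    return None


# Local views, following the figure: at DBMS #1 transaction B waits for A,
# and at DBMS #2 transaction A waits for B.
dbms1_waits = {"B": "A"}
dbms2_waits = {"A": "B"}

# Only the combined (global) graph contains the cycle.
global_waits = {**dbms1_waits, **dbms2_waits}

print(find_cycle(dbms1_waits))
print(find_cycle(dbms2_waits))
print(find_cycle(global_waits))
```

Each local check finds nothing, while the combined graph reveals the deadlock between transactions A and B.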


Transactions & Two-Phase Commit


Two (or more) separate lock managers.
DBMS initiating update serves as the coordinator.
Two phases
  1. Prepare to commit.
    Coordinator sends message and data to all machines to get ready.
    Local machines save data in logs, verify update status and return message.
  2. Commit
    If all locals report OK, then coordinator writes log and instructs others to proceed.
    If any fail, it sends Rollback message.

Figure: Database 1 initiates the transaction (1. Prepare to commit. All agree? 2. Commit). Database 2 and Database 3: Lock tables. Save log. Update all tables.
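The two phases can be sketched in a few lines. This is a simplified in-memory illustration, not a production protocol: the class and function names are hypothetical, there is no network or durable storage, and coordinator failure is not handled.

```python
class Participant:
    """A local DBMS taking part in a distributed transaction."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit  # False simulates a local failure
        self.log = []
        self.state = "idle"

    def prepare(self, data):
        # Phase 1: save the data in the local log and report status.
        self.log.append(("prepare", data))
        self.state = "ready" if self.can_commit else "failed"
        return self.can_commit

    def commit(self):
        self.log.append(("commit",))
        self.state = "committed"

    def rollback(self):
        self.log.append(("rollback",))
        self.state = "rolled back"


def two_phase_commit(participants, data):
    """Coordinator logic: return True if the transaction committed."""
    # Phase 1: ask every participant to prepare and collect all replies.
    votes = [p.prepare(data) for p in participants]
    if all(votes):
        # Phase 2: a real coordinator would write its own log entry here
        # before instructing the others to proceed.
        for p in participants:
            p.commit()
        return True
    # If any participant failed, everyone receives a rollback message.
    for p in participants:
        p.rollback()
    return False


good = [Participant("Database 2"), Participant("Database 3")]
bad = [Participant("Database 2"), Participant("Database 3", can_commit=False)]
print(two_phase_commit(good, {"Jones": 8898}))
print(two_phase_commit(bad, {"Jones": 8898}))
```

In the second call one participant reports a failure, so even the participant that was ready ends up rolled back.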

Distributed Transaction Managers


The distributed transaction coordinator/transaction processing monitor handles the transaction decisions and coordinates across the participating systems.

Figure: a Transaction Processing Monitor connected to three DBMSs, each with its own Transaction Manager and Resource Manager.


Distributed Design Questions


Question                                           Concurrent        Replication
What level of data consistency is needed?          High              Low - Medium
How expensive is storage?                          Medium - High     Low
What are the shared access requirements?           Global            Local
How often are the tables updated?                  Often             Seldom
Required speed of updates (transactions)?          Fast              Slow
How important are predictable transaction times?   High              Low
DBMS support for concurrency and locking?          Good - Excellent  Poor
Can shared access be avoided?                      No                Yes


Distributed Databases In Oracle


Database Links
  Full database names.
  CONNECT command.
Linking through synonyms.
  CREATE SYNONYM
  Central control over permissions.
Linking through Views/queries.
  CREATE VIEW AS
  Can assign local permissions.
Linking through stored procedures.
  DELETE
  Strong control over actions.

Figure:
  Server database: Schema.Table@Location, for example Scott.Emp@hq.acme.com
  Synonym: Employee
  View: user permissions
  Procedure: DELETE FROM Employee WHERE ...
    User can only run procedure. No other access.
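The idea behind a synonym or view, giving a remote table a local name so that queries do not mention the location, can be imitated outside Oracle. The sketch below is an analogy in SQLite, not Oracle syntax: an attached in-memory database named hq stands in for the server database, and the rows are hypothetical.

```python
import sqlite3

con = sqlite3.connect(":memory:")

# The attached database "hq" stands in for the remote server database.
con.execute("ATTACH DATABASE ':memory:' AS hq")
con.execute("CREATE TABLE hq.Emp (EmpID INTEGER PRIMARY KEY, Name TEXT)")
con.executemany(
    "INSERT INTO hq.Emp VALUES (?, ?)", [(1, "Smith"), (2, "Jones")]
)

# A TEMP view gives the remote table a local name. SQLite only allows
# TEMP views to reference tables in another attached database.
con.execute("CREATE TEMP VIEW Employee AS SELECT EmpID, Name FROM hq.Emp")

# The application query uses the local name and never mentions "hq".
names = [
    row[0] for row in con.execute("SELECT Name FROM Employee ORDER BY EmpID")
]
print(names)
```

If the table moved to a different database, only the view definition would need to change, which is the location independence the slide describes.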


Client-Server
Figure: servers hold the shared database; clients provide the front-end user interface.


LAN File Server


File Server
  Not a distributed database.
    Data file stored on server.
    Server is passive, appears as giant disk drive to PC.
    PC processes all data.
    Retrieves all needed data across the network.
  Performance improvements.
    Indexes are crucial.
    Store some data on each PC (replication).
    Store applications on PC (graphics & forms).
    Convert to SQL-Server

Figure labels: DBMS, data file, Application, Shared Data.

All data from all tables are read by PC, which performs JOIN and WHERE test. If available, reads index first.

SELECT Name, SaleDate
FROM Customer INNER JOIN Sales ON Customer.C# = Sales.C#
WHERE SaleDate BETWEEN #1-Mar-97# AND #9-Mar-97#;

LAN File Server: Slow


File Server: MyFile.mdb
  CustID  Name
  115     Jenkins
  125     Juarez
  ...
  Order ...
  Forms

Application and query transferred.
DBMS software transferred.
One row at a time transferred, until all rows are examined.

SELECT * FROM Customer WHERE City = 'Sandy'


Client-Server Databases
One machine is dominant (server) and handles data for many clients.
Client machines handle front-end tasks and small data tables that are not shared.

Figure: the server (labeled File Server in the figure) runs the DBMS (SQL Server) and holds the shared data. The client application sends an SQL statement; the server returns matching data.
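The difference from the file-server approach is where the WHERE test runs. The sketch below counts how many rows would cross the network in each case, with SQLite standing in for the server DBMS. The table contents are hypothetical: 1,000 customers, 10 of them in Sandy.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE Customer (CustID INTEGER, Name TEXT, City TEXT)")

# Hypothetical data: every 100th customer lives in Sandy.
rows = [
    (i, f"Customer {i}", "Sandy" if i % 100 == 0 else "Elsewhere")
    for i in range(1, 1001)
]
con.executemany("INSERT INTO Customer VALUES (?, ?, ?)", rows)

# File-server style: every row is transferred and the PC applies the test.
all_rows = con.execute("SELECT * FROM Customer").fetchall()
file_server_result = [r for r in all_rows if r[2] == "Sandy"]

# Client-server style: the server applies the WHERE clause and returns
# only the matching rows.
client_server_result = con.execute(
    "SELECT * FROM Customer WHERE City = ?", ("Sandy",)
).fetchall()

print(len(all_rows), "rows transferred by the file-server approach")
print(len(client_server_result), "rows transferred by the client-server approach")
```

Both approaches produce the same answer, but under these assumptions the client-server version moves 10 rows instead of 1,000.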


ADO and Direct Connections


The database vendor provides its own data transport (e.g., Oracle or SQL Server) installed on the server and the client.
ADO provides a driver that connects your application to the transport services.
ODBC can serve as the data transport if nothing else is available.

Figure:
  Server Computer: Database Server, DBMS transport
  Client Computer: DBMS transport, ADO, Visual Basic application

Three-Tier Client-Server
Server Databases
Client front-end
Middle
  Locate databases
  Business rules
  Program code

Figure:
  Database Servers: Databases. Transactions. Legacy applications.
  Middleware: Database links. Business rules. Program code.
  Client: Application. Front-end. User Interface.


Database Independence on the Client


Figure: the Application connects through ADO to the Original DBMS, and through ADO to the New DBMS.


Database Independence with Queries


Independent Application Query: works with any DBMS
  SELECT SaleID, SaleDate, CustomerID, CustomerName
  FROM SaleCustomer

Saved Oracle Query
  SELECT SaleID, SaleDate, Sale.CustomerID, LastName || ', ' || FirstName AS CustomerName
  FROM Sale, Customer
  WHERE Sale.CustomerID = Customer.CustomerID

Saved SQL Server Query
  SELECT SaleID, SaleDate, Sale.CustomerID, LastName + ', ' + FirstName AS CustomerName
  FROM Sale INNER JOIN Customer ON Sale.CustomerID = Customer.CustomerID
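The same pattern can be demonstrated with a third DBMS. In this sketch SQLite plays the role of the database, the saved query is stored as a view named SaleCustomer, and the application query is the DBMS-independent one from the slide. The sample row and the date format are hypothetical.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE Customer (
    CustomerID INTEGER PRIMARY KEY, LastName TEXT, FirstName TEXT);
CREATE TABLE Sale (
    SaleID INTEGER PRIMARY KEY, SaleDate TEXT, CustomerID INTEGER);
INSERT INTO Customer VALUES (1, 'Jones', 'Martha');
INSERT INTO Sale VALUES (1015, '2004-08-12', 1);

-- The saved query is the only DBMS-specific piece. SQLite concatenates
-- strings with ||, as Oracle does.
CREATE VIEW SaleCustomer AS
SELECT SaleID, SaleDate, Sale.CustomerID AS CustomerID,
       LastName || ', ' || FirstName AS CustomerName
FROM Sale INNER JOIN Customer ON Sale.CustomerID = Customer.CustomerID;
""")

# The application query never changes when the DBMS changes.
APPLICATION_QUERY = (
    "SELECT SaleID, SaleDate, CustomerID, CustomerName FROM SaleCustomer"
)
result = con.execute(APPLICATION_QUERY).fetchall()
print(result)
```

Moving to another DBMS would mean rewriting only the view definition, while APPLICATION_QUERY stays untouched.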

The Internet as Client-Server


Figure: the client browser sends a request (http://server.location/page) through routers across the Internet to the web server. The server returns information: HTML pages, forms, graphics.

HTML Limited Clients

<HTML>
<HEAD>
<TITLE>My main page</TITLE>
</HEAD>
<BODY BACKGROUND=graphics/back0.jpg>
<P>My text goes in paragraphs.</P>
<P>Additional tags set <B>boldface</B> and <I>Italic</I>.</P>
<P>Tables are more complicated and use a set of tags for rows and columns.</P>
<TABLE BORDER=1>
<TR><TD>First cell</TD><TD>Second cell</TD></TR>
<TR><TD>Next row</TD><TD>Second column</TD></TR>
</TABLE>
<P>There are form tags to create input forms for collecting data. But you need CGI program code to convert and use the input data.</P>
</BODY>
</HTML>


HTML Output

My text goes in paragraphs.
Additional tags set boldface and Italic.
Tables are more complicated and use a set of tags for rows and columns.

First cell    Second cell
Next row      Second column

There are form tags to create input forms for collecting data. But you need CGI program code to convert and use the input data.


Web Server Database Fundamentals


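Figure: numbered flow between the Client/Browser, the Web Server, and the DBMS with its Database.
  0. The browser requests Server/Form.html.
  1. The web server returns the HTML form (Form.html).
  2. The submitted form goes to the program code (query template + code) on the web server, which sends a query to the DBMS.
  3. The DBMS returns the data, and the web server sends the result page to the browser.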

Database Example: Client Side


Figure: client-side view. (0) Request Server/Form.html. (1) Initial form. (2) Form sent to the server. (3) Results.


Client-Server Data Transfer


Order Form
  Order ID: 1015
  Customer: Jones, Martha
  Order Date: 12-Aug

What if there are 10,000 customers? How much time to load the combo box?

How do you refresh/reload the combo box?


Alternatives?
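One possible alternative, offered here as an illustration rather than as the answer the slide intends, is to stop loading every customer and instead ask the server for a short list that matches what the user has typed. The sketch assumes SQLite as the server DBMS; the table contents, index name, and function name are hypothetical.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE Customer (CustomerID INTEGER PRIMARY KEY, Name TEXT)")
con.executemany(
    "INSERT INTO Customer (Name) VALUES (?)",
    [(f"Customer{i:05d}",) for i in range(10_000)],
)
con.execute("CREATE INDEX idx_customer_name ON Customer (Name)")


def matching_customers(prefix, limit=20):
    """Return at most `limit` names that start with the typed prefix.

    Note: % and _ in the prefix act as LIKE wildcards in this sketch.
    """
    cursor = con.execute(
        "SELECT Name FROM Customer WHERE Name LIKE ? ORDER BY Name LIMIT ?",
        (prefix + "%", limit),
    )
    return [row[0] for row in cursor]


print(len(matching_customers("Customer")))      # capped by the limit
print(len(matching_customers("Customer0012")))  # only the close matches
```

Each keystroke then costs one small query instead of a 10,000-row transfer, at the price of an extra round trip to the server.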

Latency
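Figure: timeline between server and client. The server generates the form; after a transmission delay the client receives it. After the user delay, the client sends the form data, which the server receives after a second transmission delay.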


XML: Transferring Data

Order: OrderID, OrderDate, ShippingCost, Comment
  Item: ItemID, Description, Quantity, Cost
  Item: ItemID, Description, Quantity, Cost
  Item: ItemID, Description, Quantity, Cost

Many XML files contain hierarchical data.

XML: Schema Definition xsd

Partial file, generated by .NET xsd.exe

<?xml version="1.0" encoding="utf-8"?>
<xs:schema id="OrderList" xmlns="" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:msdata="urn:schemas-microsoft-com:xml-msdata">
  <xs:element name="OrderList" msdata:IsDataSet="true">
    <xs:complexType>
      <xs:choice maxOccurs="unbounded">
        <xs:element name="Order">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="OrderID" type="xs:string" minOccurs="0" />
              <xs:element name="OrderDate" type="xs:date" minOccurs="0" />
              <xs:element name="ShippingCost" type="xs:string" minOccurs="0" />
              <xs:element name="Comment" type="xs:string" minOccurs="0" />
              <xs:element name="Items" minOccurs="0" maxOccurs="unbounded">
                <xs:complexType>
                  <xs:sequence>
                    <xs:element name="ItemID" nillable="true" minOccurs="0" maxOccurs="unbounded">
                      <xs:complexType>
                        <xs:simpleContent msdata:ColumnName="ItemID_Text" msdata:Ordinal="0">
                          <xs:extension base="xs:string">
                          </xs:extension>
                        </xs:simpleContent>
                      </xs:complexType>
                    </xs:element>
                    <xs:element name="Description" nillable="true" minOccurs="0" maxOccurs="unbounded">
                      <xs:complexType>
                        <xs:simpleContent msdata:ColumnName="Description_Text" msdata:Ordinal="0">
                          <xs:extension base="xs:string">
                          </xs:extension>
                        </xs:simpleContent>
                      </xs:complexType>
                    </xs:element>


XML Data Example

<?xml version="1.0"?>
<!DOCTYPE OrderList SYSTEM "orderlist.dtd">
<OrderList>
  <Order>
    <OrderID>1</OrderID>
    <OrderDate>3/6/2004</OrderDate>
    <ShippingCost>$33.54</ShippingCost>
    <Comment>Need immediately.</Comment>
    <Items>
      <ItemID>30</ItemID>
      <Description>Flea Collar-DogMedium</Description>
      <Quantity>208</Quantity>
      <Cost>$4.42</Cost>
      <ItemID>27</ItemID>
      <Description>Aquarium Filter &amp; Pump</Description>
      <Quantity>8</Quantity>
      <Cost>$24.65</Cost>
    </Items>
  </Order>
</OrderList>

XML: extensible markup language
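A receiving program has to turn this document back into rows. The sketch below reads the order with Python's standard xml.etree.ElementTree module. It assumes the same data as the example, with the DOCTYPE line left out so that no external orderlist.dtd is involved.

```python
import xml.etree.ElementTree as ET

ORDER_XML = """<OrderList>
  <Order>
    <OrderID>1</OrderID>
    <OrderDate>3/6/2004</OrderDate>
    <ShippingCost>$33.54</ShippingCost>
    <Comment>Need immediately.</Comment>
    <Items>
      <ItemID>30</ItemID>
      <Description>Flea Collar-DogMedium</Description>
      <Quantity>208</Quantity>
      <Cost>$4.42</Cost>
      <ItemID>27</ItemID>
      <Description>Aquarium Filter &amp; Pump</Description>
      <Quantity>8</Quantity>
      <Cost>$24.65</Cost>
    </Items>
  </Order>
</OrderList>"""

root = ET.fromstring(ORDER_XML)
order = root.find("Order")
items_node = order.find("Items")

# The items are not wrapped in their own elements, so rebuild each item by
# pairing up the repeated child elements in document order.
fields = ("ItemID", "Description", "Quantity", "Cost")
columns = [[e.text for e in items_node.findall(name)] for name in fields]
items = [dict(zip(fields, values)) for values in zip(*columns)]

print(order.findtext("OrderID"), order.findtext("OrderDate"))
for item in items:
    print(item)
```

Because the individual items have no enclosing element, the pairing relies on element order; wrapping each item in its own element would likely make the file easier to process.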


XML Example in Explorer


Java and JDBC


import java.sql.*;

// myLogin and myPassword are String variables defined elsewhere.
Connection con = DriverManager.getConnection(
    "jdbc:myDriver:myDBName", myLogin, myPassword);
Statement smt = con.createStatement();
ResultSet rst = smt.executeQuery(
    "SELECT AnimalID, Name, Category, Breed FROM Animal");
while (rst.next()) {
    int iAnimal = rst.getInt("AnimalID");
    String sName = rst.getString("Name");
    String sCategory = rst.getString("Category");
    String sBreed = rst.getString("Breed");
    // Now do something with these four variables
}
// Release the database resources when finished.
rst.close();
smt.close();
con.close();
