
The computers would have to be connected in a communications network, or network. A distributed database management system (DDBMS) is a DBMS capable of supporting and manipulating distributed databases. A fixed amount of time, sometimes called the access delay, is required for every message. The formula for message transmission time is as follows:

Communication time = access delay + (data volume / transmission rate)

Heterogeneous DDBMSs are more complex than homogeneous DDBMSs and, consequently, have more problems and are more difficult to manage. Users can access data at a remote site (a site other than the one at which the user is located) just as easily as they access data from the local site (the site at which the user is located). Location transparency is the characteristic of a DDBMS by which users do not need to be aware of the location of data in a distributed database. Data replication creates update problems that can lead to data inconsistencies. The steps taken by the DDBMS to update the various copies of data should be done behind the scenes; users should be unaware of the steps. This DDBMS characteristic is called replication transparency. Users should not be aware of the fragmentation; they should feel as if they are using a single central database. When users are unaware of fragmentation, the DDBMS has fragmentation transparency.

Advantages:
- Local control of data.
- Increased database capacity.
- Improved system availability.
- Improved performance.

Disadvantages:
- Update of replicated data.
- More complex query processing.
- More complex treatment of concurrent update.
- More complex recovery measures.
- More difficult management of the data dictionary.
- More complicated security and backup requirements.
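As a rough sketch, the transmission-time formula can be applied directly in code. The Python function below and its sample numbers (a 0.5-second access delay, a 100,000-bit message, and a 1,000,000-bits-per-second line) are illustrative assumptions, not values from the text.

    def communication_time(access_delay_s, data_volume_bits, rate_bps):
        """Communication time = access delay + (data volume / transmission rate)."""
        return access_delay_s + data_volume_bits / rate_bps

    # Hypothetical values: 0.5 s access delay, 100,000-bit message,
    # 1,000,000 bits-per-second transmission rate.
    print(communication_time(0.5, 100_000, 1_000_000))  # 0.6 seconds

Because the access delay is paid once per message, the formula also shows why minimizing the number of messages matters as much as minimizing their size.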

The file server stores the files required by the users on the network. In client/server terminology, the server is a computer providing data to the clients, which are the computers that are connected to a network and that people use to access data stored on the server. A server is also called a back-end processor or a back-end machine, and a client is also called a front-end processor or a front-end machine.

Because the clients and the server perform different functions and can run different operating systems, this arrangement of client/server architecture is called a two-tier architecture. When the clients perform the business functions (each client is called a fat client in this arrangement), you have a client maintenance problem. When the server performs the business functions instead, clients perform only the presentation functions, and each client is called a thin client. Scalability is the ability of a computer system to continue to function well as utilization of the system increases. In a three-tier architecture, the clients perform the presentation functions, a database server performs the database functions, and separate computers (called application servers) perform the business functions and serve as an interface between clients and the database server. A three-tier architecture is sometimes referred to as an n-tier architecture because additional application servers can be added for scalability without impacting the design for the client or the database server.

The Internet, which is a worldwide collection of millions of interconnected computers and computer networks that share resources, is used daily by most people and is an essential portal for all organizations. The World Wide Web (or the Web) is a vast collection of digital documents available on the Internet. Each digital document on the Web is called a Web page, each computer on which an individual or organization stores Web pages for access on the Internet is called a Web server, and each computer requesting a Web page from a Web server is called a Web client. A Web server requires special software to receive and respond to requests for Web pages from Web clients. The dominant Web server software packages are Apache HTTP Server and IIS. Apache HTTP Server is a free, open-source package that runs with most operating systems, while Internet Information Services (IIS) is a Microsoft package that comes with many versions of its operating systems. Each Web page is assigned an Internet address called a Uniform Resource Locator (URL); the URL identifies where the Web page is stored (both the location of the Web server and the name and location of the Web page on that server). Hypertext Transfer Protocol (HTTP) is the data communication method used by Web clients and Web servers to exchange data on the Internet, and Transmission Control Protocol/Internet Protocol (TCP/IP) is the standard protocol for all communication on the Internet.
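As a rough illustration of this exchange, the sketch below uses Python's standard http.client module to issue an HTTP GET request over a TCP/IP connection; the host and page names are placeholders, not addresses from the text.

    import http.client

    # Open a TCP/IP connection to a (placeholder) Web server.
    conn = http.client.HTTPConnection("www.example.com", 80)
    # Send an HTTP request for a Web page, identified by its path on the server.
    conn.request("GET", "/index.html")
    # Receive the Web server's HTTP response.
    response = conn.getresponse()
    print(response.status, response.reason)  # e.g., 200 OK
    page = response.read()                   # the Web page itself (HTML)
    conn.close()

Note that each request/response pair stands alone: HTTP itself keeps no memory between requests, which is the stateless behavior that cookies and session management (discussed in the summary) work around.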

Local deadlock is deadlock that occurs at a single site in a distributed database. Global deadlock involves one transaction that requires a record held by a second transaction at one site, while the second transaction requires a record held by the first transaction at a different site. A DDBMS usually prevents the potential inconsistency of distributed updates through the use of two-phase commit. The basic idea of two-phase commit is that one site, often the site initiating the update, acts as coordinator (a sketch follows the list of rules below).

Rules for Distributed Databases
1. Local autonomy. No site should depend on another site to perform its database functions.
2. No reliance on a central site. No single site should be responsible for central operations; these operations include data dictionary management, query processing, update management, database recovery, and concurrent update.
3. Continuous operation.
4. Location transparency.
5. Fragmentation transparency.
6. Replication transparency.
7. Distributed query processing.
8. Distributed transaction management.
9. Hardware independence.
10. Operating system independence.
11. Network independence.
12. DBMS independence.
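Returning to two-phase commit, here is a minimal sketch of the coordinator's voting and completion phases. The Site class and its can_commit/commit/rollback operations are hypothetical stand-ins, not a real DDBMS interface.

    class Site:
        """A hypothetical participant site in a distributed update."""
        def __init__(self, name, ready=True):
            self.name, self.ready = name, ready
        def can_commit(self):          # phase 1: vote yes or no
            return self.ready
        def commit(self):              # phase 2: make the update permanent
            print(f"{self.name}: committed")
        def rollback(self):            # phase 2: undo the update
            print(f"{self.name}: rolled back")

    def two_phase_commit(participants):
        # Phase 1 (voting): the coordinator asks every site whether it can commit.
        if all(site.can_commit() for site in participants):
            # Phase 2 (completion): every site voted yes, so commit everywhere.
            for site in participants:
                site.commit()
            return "committed"
        # Any "no" vote makes the coordinator roll back everywhere,
        # so all copies of the data remain consistent.
        for site in participants:
            site.rollback()
        return "rolled back"

    print(two_phase_commit([Site("Boston"), Site("Chicago", ready=False)]))

Because Chicago votes no in this run, every site rolls back; the update either happens at all sites or at none of them.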

Summary

A distributed database is a single logical database that is physically divided among computers at several sites on a network. A user at any site can access data at any other site. A DDBMS is a DBMS capable of supporting and manipulating distributed databases. Computers in a network communicate through messages. Minimizing the number of messages is important for rapid access to distributed databases. A homogeneous DDBMS is one that has the same local DBMS at each site, whereas a heterogeneous DDBMS is one that does not.


Location transparency, replication transparency, and fragmentation transparency are important characteristics of DDBMSs. DDBMSs permit local control of data, increased database capacity, improved system availability, and added efficiency. DDBMSs are more complicated than non-DDBMSs in the areas of updating replicated data, processing queries, treating concurrent update, providing measures for recovery, managing the data dictionary, designing databases, and managing security and backup requirements. Two-phase commit usually uses a coordinator to manage concurrent update. C. J. Date presented 12 rules that serve as a benchmark against which you can measure DDBMSs. These rules are local autonomy, no reliance on a central site, continuous operation, location transparency, fragmentation transparency, replication transparency, distributed query processing, distributed transaction management, hardware independence, operating system independence, network independence, and DBMS independence.

A file server stores the files required by users and sends entire files to the users. In a two-tier client/server architecture, a DBMS runs on a file server and the server sends only the requested data to the clients. The server performs database functions, and the clients perform presentation functions. A fat client can perform the business functions, or the server can perform the business functions in a thin client arrangement. In a three-tier client/server architecture, the clients perform the presentation functions, database servers perform the database functions, and application servers perform business functions. A three-tier architecture is more scalable than a two-tier architecture. The advantages of client/server systems are lower network traffic; improved processing distribution; thinner clients; greater processing transparency; increased network, hardware, and software transparency; improved security; decreased costs; and increased scalability.

Web servers interact with Web clients using HTTP and TCP/IP to display HTML Web pages on Web clients' screens. Dynamic Web pages, not static Web pages, are used in e-commerce; server-side and client-side extensions provide the dynamic capability, including the capability to interact with databases. Cookies and session management techniques are used to counteract the stateless nature of HTTP.

XML was developed in response to the need for data exchange between organizations and due to the inability of HTML to specify the structure and meaning of its data. XML is a metalanguage designed for the exchange of data on the Web. The W3C has developed recommendations for other languages related to XML. These languages include XHTML, a markup language based on XML and a stricter version of HTML; DTD and XML schema, both used to specify the structure and meaning of data in an XML document; XSL, a language for creating stylesheets; XSLT, which transforms an XML document into another document; and XQuery, which is an XML query language.

OLTP is used with relational database management systems, and OLAP is used with data warehouses. A data warehouse is a subject-oriented, integrated, time-variant, nonvolatile collection of data in support of management's decision-making process. A typical data warehouse data structure is a star schema consisting of a central fact table surrounded by dimension tables. Users perceive the data in a data warehouse as a multidimensional database in the shape of a data cube.
OLAP software lets users slice and dice data, drill down data, and roll up data.
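As a loose illustration only, the sketch below mimics roll up, slice, and drill down on a tiny hypothetical fact table using the pandas library; real OLAP software exposes these operations through its own interfaces.

    import pandas as pd

    # A tiny hypothetical fact table: one row per sale, with dimension columns
    # (region, product, year) and one measure (amount).
    sales = pd.DataFrame({
        "region":  ["East", "East", "West", "West"],
        "product": ["Chair", "Desk", "Chair", "Desk"],
        "year":    [2023, 2023, 2024, 2024],
        "amount":  [100, 250, 120, 300],
    })

    # Roll up: aggregate the measure to a coarser level (region totals).
    print(sales.groupby("region")["amount"].sum())

    # Slice: fix one dimension (year = 2023) and examine the rest.
    print(sales[sales["year"] == 2023])

    # Drill down: move from region totals to region-and-product detail.
    print(sales.groupby(["region", "product"])["amount"].sum())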

Data mining consists of uncovering new knowledge, patterns, trends, and rules from the data stored in a data warehouse. E. F. Codd presented 12 rules that serve as a benchmark against which you can measure OLAP systems. These rules are multidimensional conceptual view; transparency; accessibility; consistent reporting performance; client/server architecture; generic dimensionality; dynamic sparse matrix handling; multiuser support; unrestricted, cross-dimensional operations; intuitive data manipulation; flexible reporting; and unlimited dimensions and aggregation levels.

Object-oriented DBMSs deal with data as objects. An object is a set of related attributes along with the actions that are associated with the set of attributes. An OODBMS is a database management system in which data and the actions that operate on the data are encapsulated into objects. A domain is the set of values that are permitted for an attribute. The term class refers to the general structure, and the term object refers to a specific occurrence of a class. Methods are the actions defined for a class, and a message is a request to execute a method. A subclass inherits the structure and methods of its superclass.

The UML is an approach to model all the various aspects of software development for object-oriented systems. The class diagram represents the design of an object-oriented database. Relationships are called associations, and visibility symbols indicate whether other classes can view or change the value in an attribute. Multiplicity indicates the number of objects that can be related to an individual object at the other end of the relationship. Generalization is the relationship between a superclass and a subclass.

Rules that serve as a benchmark against which you can measure object-oriented systems are complex objects, object identity, encapsulation, information hiding, types or classes, inheritance, late binding, computational completeness, extensibility, persistence, performance, concurrent update support, recovery support, and query facility.
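To tie the object-oriented terms together, here is a minimal Python sketch; the Vehicle and Truck classes are hypothetical examples, not from the text.

    class Vehicle:                      # class: the general structure plus its methods
        def __init__(self, make):
            self.make = make            # attribute
        def describe(self):             # method: an action defined for the class
            return f"{self.make} vehicle"

    class Truck(Vehicle):               # subclass: inherits structure and methods
        def describe(self):             # overrides a superclass method
            return f"{self.make} truck"

    t = Truck("Acme")                   # object: a specific occurrence of a class
    print(t.describe())                 # sending a message: a request to execute a method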
