Professional Documents
Culture Documents
This article needs attention from an expert on the subject. See the talk page for details. Consider associating this request with a WikiProject. (July 2010) Grid computing is a term referring to the combination of computer resources from multiple administrative domains to reach a common goal. The grid can be thought of as a distributed system with non-interactive workloads that involve a large number of files. What distinguishes grid computing from conventional high performance computing systems such as cluster computing is that grids tend to be more loosely coupled, heterogeneous, and geographically dispersed. Although a grid can be dedicated to a specialized application, it is more common that a single grid will be used for a variety of different purposes. Grids are often constructed with the aid of general-purpose grid software libraries known as middleware. Grid size can vary by a considerable amount. Grids are a form of distributed computing whereby a super virtual computer is composed of many networked loosely coupled computers acting together to perform very large tasks. Furthermore, distributed or grid computing, in general, is a special type of parallel computing that relies on complete computers (with onboard CPUs, storage, power supplies, network interfaces, etc.) connected to a network (private, public or the Internet) by a conventional network interface, such as Ethernet. This is in contrast to the traditional notion of a supercomputer, which has many processors connected by a local highspeed computer bus.
Contents
1 Overview 2 Comparison of grids and conventional supercomputers 3 Design considerations and variations 4 Market segmentation of the grid computing market o 4.1 The provider side o 4.2 The user side 5 CPU scavenging 6 History 7 Fastest virtual supercomputers 8 Current projects and applications 9 Definitions 10 See also o 10.1 Concepts and related technology o 10.2 Alliances and organizations o 10.3 Production grids o 10.4 International projects
10.5 National projects 10.6 Standards and APIs 10.7 Software implementations and middleware 10.8 Visualization frameworks 11 References o 11.1 Notes o 11.2 Bibliography 12 External links
o o o o
[edit] Overview
Grid computing combines computers from multiple administrative domains to reach a common goal,[1] to solve a single task, and may then disappear just as quickly. One of the main strategies of grid computing is to use middleware to divide and apportion pieces of a program among several computers, sometimes up to many thousands. Grid computing involves computation in a distributed fashion, which may also involve the aggregation of largescale cluster computing-based systems. The size of a grid may vary from smallconfined to a network of computer workstations within a corporation, for exampleto large, public collaborations across many companies and networks. "The notion of a confined grid may also be known as an intra-nodes cooperation whilst the notion of a larger, wider grid may thus refer to an inter-nodes cooperation".[2] Grids are a form of distributed computing whereby a super virtual computer is composed of many networked loosely coupled computers acting together to perform very large tasks. This technology has been applied to computationally intensive scientific, mathematical, and academic problems through volunteer computing, and it is used in commercial enterprises for such diverse applications as drug discovery, economic forecasting, seismic analysis, and back office data processing in support for e-commerce and Web services.
a small number of custom supercomputers. The primary performance disadvantage is that the various processors and local storage areas do not have high-speed connections. This arrangement is thus well-suited to applications in which multiple parallel computations can take place independently, without the need to communicate intermediate results between processors.[citation needed] The high-end scalability of geographically dispersed grids is generally favorable, due to the low need for connectivity between nodes relative to the capacity of the public Internet.[citation
needed]
There are also some differences in programming and deployment. It can be costly and difficult to write programs that can run in the environment of a supercomputer, which may have a custom operating system, or require the program to address concurrency issues. If a problem can be adequately parallelized, a thin layer of grid infrastructure can allow conventional, standalone programs, given a different part of the same problem, to run on multiple machines. This makes it possible to write and debug on a single conventional machine, and eliminates complications due to multiple instances of the same program running in the same shared memory and storage space at the same time.[citation needed]
placing applications in virtual machines.[citation needed] ___ Public systems or those crossing administrative domains (including different departments in the same organization) often result in the need to run on heterogeneous systems, using different operating systems and hardware architectures. With many languages, there is a trade off between investment in software development and the number of platforms that can be supported (and thus the size of the resulting network). Cross-platform languages can reduce the need to make this trade off, though potentially at the expense of high performance on any given node (due to run-time interpretation or lack of optimization for the particular platform).[citation needed] Various middleware projects have created generic infrastructure to allow diverse scientific and commercial projects to harness a particular associated grid or for the purpose of setting up new grids. BOINC is a common one for various academic projects seeking public volunteers;[citation needed] more are listed at the end of the article. In fact, the middleware can be seen as a layer between the hardware and the software. On top of the middleware, a number of technical areas have to be considered, and these may or may not be middleware independent. Example areas include SLA management, Trust and Security, Virtual organization management, License Management, Portals and Data Management. These technical areas may be taken care of in a commercial solution, though the cutting edge of each area is often found within specific research projects examining the field.[citation needed]
Software as a service (SaaS) is software that is owned, delivered and managed remotely by one or more providers. (Gartner 2007) Additionally, SaaS applications are based on a single set of common code and data definitions. They are consumed in a one-to-many model, and SaaS uses a Pay As You Go (PAYG) model or a subscription model that is based on usage. Providers of SaaS do not necessarily own the computing resources themselves, which are required to run their SaaS. Therefore, SaaS providers may draw upon the utility computing market. The utility computing market provides computing resources for SaaS providers.
[edit] History
The term grid computing originated in the early 1990s as a metaphor for making computer power as easy to access as an electric power grid in Ian Foster's and Carl Kesselman's seminal work, "The Grid: Blueprint for a new computing infrastructure" (2004). CPU scavenging and volunteer computing were popularized beginning in 1997 by distributed.net and later in 1999 by http://en.wikipedia.org/wiki/SETI@home to harness the power of networked PCs worldwide, in order to solve CPU-intensive research problems.[citation needed] The ideas of the grid (including those from distributed computing, object-oriented programming, and Web services) were brought together by Ian Foster, Carl Kesselman, and Steve Tuecke, widely regarded as the "fathers of the grid".[4] They led the effort to create the Globus Toolkit incorporating not just computation management but also storage management, security
provisioning, data movement, monitoring, and a toolkit for developing additional services based on the same infrastructure, including agreement negotiation, notification mechanisms, trigger services, and information aggregation. While the Globus Toolkit remains the de facto standard for building grid solutions, a number of other tools have been built that answer some subset of services needed to create an enterprise or global grid. In 2007 the term cloud computing came into popularity, which is conceptually similar to the canonical Foster definition of grid computing (in terms of computing resources being consumed as electricity is from the power grid). Indeed, grid computing is often (but not always) associated with the delivery of cloud computing systems as exemplified by the AppLogic system from 3tera.[citation needed]
BOINC 5.128PFLOPS as of Apr 24th 2010.[5] http://en.wikipedia.org/wiki/Folding@Home 5 PFLOPS, as of March 17, 2009 [6] As of April 2010, http://en.wikipedia.org/wiki/MilkyWay@Home computes at over 1.6 PFLOPS, with a large amount of this work coming from GPUs.[7] As of April 2010, http://en.wikipedia.org/wiki/SETI@Home computes data averages more than 730 TFLOPS.[8] As of April 2010, http://en.wikipedia.org/wiki/Einstein@Home is crunching more than 210 TFLOPS.[9] As of April 2010, GIMPS is sustaining 44 TFLOPS.[10]
The European Union has been a major proponent of grid computing. Many projects have been funded through the framework programme of the European Commission. Many of the projects are highlighted below, but two deserve special mention: BEinGRID and Enabling Grids for EsciencE.[citation needed] BEinGRID (Business Experiments in Grid) is a research project partly funded by the European commission[citation needed] as an Integrated Project under the Sixth Framework Programme (FP6) sponsorship program. Started on June 1, 2006, the project will run 42 months, until November 2009. The project is coordinated by Atos Origin. According to the project fact sheet, their mission is to establish effective routes to foster the adoption of grid computing across the EU and to stimulate research into innovative business models using Grid technologies. To extract best practice and common themes from the experimental implementations, two groups of consultants are analyzing a series of pilots, one technical, one business. The project is significant not only for its long duration, but also for its budget, which at 24.8 million Euros, is the largest of any FP6 integrated project. Of this, 15.7 million is provided by the European commission and the remainder by its 98 contributing partner companies. The Enabling Grids for E-sciencE project, which is based in the European Union and includes sites in Asia and the United States, is a follow-up project to the European DataGrid (EDG) and is arguably the largest computing grid on the planet. This, along with the LHC Computing Grid[12] (LCG), has been developed to support the experiments using the CERN Large Hadron Collider. The LCG project is driven by CERN's need to handle huge amounts of data, where storage rates of several gigabytes per second (10 petabytes per year) are required. A list of active sites participating within LCG can be found online[13] as can real time monitoring of the EGEE infrastructure.[14] The relevant software and documentation is also publicly accessible.[15] There is speculation that dedicated fiber optic links, such as those installed by CERN to address the LCG's data-intensive needs, may one day be available to home users thereby providing internet services at speeds up to 10,000 times faster than a traditional broadband connection.[16] Another well-known project is distributed.net, which was started in 1997 and has run a number of successful projects in its history. The NASA Advanced Supercomputing facility (NAS) has run genetic algorithms using the Condor cycle scavenger running on about 350 Sun and SGI workstations. Until April 27, 2007, United Devices operated the United Devices Cancer Research Project based on its Grid MP product, which cycle-scavenges on volunteer PCs connected to the Internet. As of June 2005, the Grid MP ran on about 3.1 million machines.[17]
[edit] Definitions
Today there are many definitions of grid computing:
In his article What is the Grid? A Three Point Checklist,[1] Ian Foster lists these primary attributes: o Computing resources are not administered centrally.
o o
Plaszczak/Wellner[18] define grid technology as "the technology that enables resource virtualization, on-demand provisioning, and service (resource) sharing between organizations." IBM defines grid computing as the ability, using a set of open standards and protocols, to gain access to applications and data, processing power, storage capacity and a vast array of other computing resources over the Internet. A grid is a type of parallel and distributed system that enables the sharing, selection, and aggregation of resources distributed across multiple administrative domains based on their (resources) availability, capacity, performance, cost and users' quality-of-service requirements.[19] An earlier example of the notion of computing as utility was in 1965 by MIT's Fernando Corbat. Corbat and the other designers of the Multics operating system envisioned a computer facility operating like a power company or water company. [20] Buyya/Venugopal[21] define grid as "a type of parallel and distributed system that enables the sharing, selection, and aggregation of geographically distributed autonomous resources dynamically at runtime depending on their availability, capability, performance, cost, and users' quality-of-service requirements". CERN, one of the largest users of grid technology, talk of The Grid: a service for sharing computer power and data storage capacity over the Internet. [22]
Grids can be categorized with a three stage model of departmental grids, enterprise grids and global grids. These correspond to a firm initially utilising resources within a single group i.e. an engineering department connecting desktop machines, clusters and equipment. This progresses to enterprise grids where nontechnical staff's computing resources can be used for cycle-stealing and storage. A global grid is a connection of enterprise and departmental grids that can be used in a commercial or collaborative manner.
Cloud computing Computer cluster Computon Distributed computing e-Science Edge computing Grid File System High-performance computing List of distributed computing projects Metacomputing Network Agility Render farm Scientific workflow system
Semantic grid Space based architecture (SBA) Tuple space Supercomputer Wireless Parallel computing Peer-to-peer
Open Grid Forum (Formerly Global Grid Forum) Object Management Group
Enabling Grids for E-sciencE INFN Production Grid NorduGrid OurGrid Sun Grid Techila Xgrid
March 2004 March 2006 April 2006 April 2008 May 2008 September 2005 January 2008 September 2010 April 2010 September 2008
December 2009
September 2012
and Latin America (EELA) Business Experiments in GRID (BEinGRID) BREIN KnowARC Nordic Data Grid Facility DataTAG European DataGrid (EDG) BalticGrid project / BalticGrid-II project EUFORIA (EU Fusion fOR Iter Applications) World Community Grid SORMA (Self-Organizing ICT Resource Management) XtreemOS GridEcon OurGrid High Performance and Grid Computing Research Group
America Europe Europe Europe Scandinavia and Finland Europe and North America Europe Europe (Baltic States) Europe Global Europe Europe Europe Brazil US
2006 June 2006 September 2006 June 2006 June 2006 January 2001 March 2001 November 2005 January 2008 November 2004 August 2006 June 2006 June 2006 December 2004 December 2010 November 2009 January 2010 November 2009 December 2010 January 2003 March 2004 April 2010 December 2010 active July 2009 (May 2010) ext. to September 2010 April 2009 active active
CNGrid (China) EGEE (EU) France Grilles (France) D-Grid (German) GARUDA (Indian) VECC (Calcutta, India) IsraGrid (Israel) INFN Grid (Italian) PL-Grid (Poland) National Grid Service (UK) Open Science Grid (USA) TeraGrid (USA)
GLUE, a technology-agnostic information model for a uniform representation of Grid resources A Simple API for Grid Applications (SAGA) Distributed Resource Management Application API (DRMAA) Grid Security Infrastructure (GSI) Open Grid Services Architecture (OGSA) Open Grid Services Infrastructure (OGSI) Web Services Resource Framework (WSRF)
Advanced Resource Connector (NorduGrid's ARC) Berkeley Open Infrastructure for Network Computing (BOINC) Globus Toolkit Platform LSF Platform Symphony Message Passing Interface (MPI) OurGrid Simple Grid Protocol Sun Grid Engine ProActive Portable Batch System UNICORE SDSC Storage resource broker (data grid) GridWay ZeroC ICE IceGrid gLite Xgrid Techila Grid
GStat
[edit] References
[edit] Notes
1. ^ a b "What is the Grid? A Three Point Checklist". http://dlib.cs.odu.edu/WhatIsTheGrid.pdf. 2. ^ "Pervasive and Artificial Intelligence Group :: publications [Pervasive and Artificial Intelligence Research Group]". Diuf.unifr.ch. May 18, 2009. http://diuf.unifr.ch/pai/wiki/doku.php?id=Publications&page=publication&kind=single& ID=276. Retrieved July 29, 2010.
3. ^ "Finetuning room temperature with computer". http://michkhoo.blogspot.com/2009/11/control-room-temperature-with-computer.html. Retrieved June 6, 2010. 4. ^ "Father of the Grid". http://magazine.uchicago.edu/0404/features/index.shtml. 5. ^ "BOINCstats BOINC combined credit overview." Retrieved on June 21, 2009. 6. ^ [1]. Retrieved 17 March 2009. 7. ^ http://boincstats.com/stats/project_graph.php?pr=milkyway. BOINC. http://boincstats.com/stats/project_graph.php?pr=milkyway. Retrieved April 21, 2010. 8. ^ http://www.boincstats.com/stats/project_graph.php?pr=sah. BOINC. http://www.boincstats.com/stats/project_graph.php?pr=sah. Retrieved April 21, 2010. 9. ^ http://de.boincstats.com/stats/project_graph.php?pr=einstein. BOINC. http://de.boincstats.com/stats/project_graph.php?pr=einstein. Retrieved April 21, 2010. 10. ^ "Internet PrimeNet Server Parallel Technology for the Great Internet Mersenne Prime Search". GIMPS. http://www.mersenne.org/primenet. Retrieved April 21, 2010 11. ^ [2][dead link] 12. ^ Large Hadron Collider Computing Grid official homepage 13. ^ "GStat 2.0 Summary View GRID EGEE". Goc.grid.sinica.edu.tw. http://goc.grid.sinica.edu.tw/gstat/. Retrieved July 29, 2010. 14. ^ "Real Time Monitor". Gridportal.hep.ph.ic.ac.uk. http://gridportal.hep.ph.ic.ac.uk/rtm/. Retrieved July 29, 2010. 15. ^ "LCG Deployment". Lcg.web.cern.ch. http://lcg.web.cern.ch/LCG/activities/deployment.html. Retrieved July 29, 2010. 16. ^ "Coming soon: superfast internet" 17. ^ [3][dead link] 18. ^ P Plaszczak, R Wellner, Grid computing, 2005, Elsevier/Morgan Kaufmann, San Francisco 19. ^ IBM Solutions Grid for Business Partners: Helping IBM Business Partners to Gridenable applications for the next phase of e-business on demand 20. ^ http://www.multicians.org/fjcc3.html 21. ^ "A Gentle Introduction to Grid Computing and Technologies" (PDF). http://www.buyya.com/papers/GridIntro-CSI2005.pdf. Retrieved May 6, 2005. 22. ^ "The Grid Caf The place for everybody to learn about grid computing". CERN. http://www.gridcafe.org/. Retrieved December 3, 2008.
[edit] Bibliography
Buyya, Rajkumar; Kris Bubendorfer (2009). Market Oriented Grid and Utility Computing. Wiley. ISBN 9780470287682. http://www.wiley.com/WileyCDA/WileyTitle/productCd-0470287683,descCdtableOfContents.html. Benedict, Shajulin; Vasudevan (2008). "A Niched Pareto GA approach for scheduling scientific workflows in wireless Grids". Journal of Computing and Information Technology 16: 101. Davies, Antony (June 2004). "Computational Intermediation and the Evolution of Computation as a Commodity" (PDF). Applied Economics 36: 1131.
doi:10.1080/0003684042000247334. http://www.business.duq.edu/faculty/davies/research/EconomicsOfComputation.pdf. Foster, Ian; Carl Kesselman (1999). The Grid: Blueprint for a New Computing Infrastructure. Morgan Kaufmann Publishers. ISBN 1-55860-475-8. http://www.mkp.com/grids/. Plaszczak, Pawel; Rich Wellner, Jr (2006). Grid Computing "The Savvy Manager's Guide". Morgan Kaufmann Publishers. ISBN 0-12-742503-9. http://savvygrid.com/. Berman, Fran; Anthony J. G. Hey, Geoffrey C. Fox (2003). Grid Computing: Making The Global Infrastructure a Reality. Wiley. ISBN 0-470-85319-0. http://www.grid2002.org/. Li, Maozhen; Mark A. Baker (2005). The Grid: Core Technologies. Wiley. ISBN 0-47009417-6. http://coregridtechnologies.org/. Catlett, Charlie; Larry Smarr (June 1992). "Metacomputing". Communications of the ACM 35 (6). http://www.acm.org/pubs/cacm/. Smith, Roger (2005). "Grid Computing: A Brief Technology Analysis" (PDF). CTO Network Library. http://www.ctonet.org/documents/GridComputing_analysis.pdf. Buyya, Rajkumar (July 2005). "Grid Computing: Making the Global Cyberinfrastructure for eScience a Reality" (PDF). CSI Communications (Mumbai, India: Computer Society of India (CSI)) 29 (1). ISSN 0970-647X. http://www.gridbus.org/~raj/papers/CSICommunicationsJuly2005.pdf. Berstis, Viktors. "Fundamentals of Grid Computing". IBM. http://www.redbooks.ibm.com/abstracts/redp3613.html. Ferreira, Luis; et al.. "Grid Computing Products and Services". IBM. http://www.redbooks.ibm.com/abstracts/sg246650.html. Ferreira, Luis; et al.. "Introduction to Grid Computing with Globus". IBM. http://www.redbooks.ibm.com/abstracts/sg246895.html?Open. Jacob, Bart; et al.. "Enabling Applications for Grid Computing". IBM. http://www.redbooks.ibm.com/abstracts/sg246936.html?Open. Ferreira, Luis; et al.. "Grid Services Programming and Application Enablement". IBM. http://www.redbooks.ibm.com/abstracts/sg246100.html?Open. Jacob, Bart; et al.. "Introduction to Grid Computing". IBM. http://www.redbooks.ibm.com/abstracts/sg246778.html?Open. Ferreira, Luis; et al.. "Grid Computing in Research and Education". IBM. http://www.redbooks.ibm.com/abstracts/sg246649.html?Open. Ferreira, Luis; et al.. "Globus Toolkit 3.0 Quick Start". IBM. http://www.redbooks.ibm.com/abstracts/redp3697.html?Open. Surridge, Mike; et al.. "Experiences with GRIA Industrial applications on a Web Services Grid" (PDF). IEEE. http://www.gria.org/docs/experiences%20with%20gria%20paper.pdf. Stockinger, Heinz; et al. (to be published in 2007). "Defining the Grid: A Snapshot on the Current View" (PDF). Supercomputing 42: 3. doi:10.1007/s11227-006-0037-9. http://hst.web.cern.ch/hst/publications/DefiningTheGrid-1.1.pdf. Global Grids and Software Toolkits: A Study of Four Grid Middleware Technologies The Grid Technology Cookbook Francesco Lelli, Eric Frizziero, Michele Gulmini, Gaetano Maron, Salvatore Orlando, Andrea Petrucci and Silvano Squizzato. The many faces of the integration of instruments
and the grid. International Journal of Web and Grid Services 2007 Vol. 3, No.3 pp. 239 266 Electronic Edition Poess, Meikel; Nambiar, Raghunath (2005). Large Scale Data Warehouses on Grid. http://www.vldb2005.org/program/paper/tue/p1055-poess.pdf.
GridCafea beginner's guide about making a grid SuGI-Portalmore on grids. The Open Source GridTask Project link v d eParallel computing topics Cloud computing High-performance computing Cluster computing Distributed computing Grid computing Bit Instruction Data Task Superthreading Hyperthreading Amdahl's law Gustafson's law Cost efficiency KarpFlatt metric slowdown speedup
Process Thread Fiber PRAM Instruction window Multiprocessing Multithreading (computer architecture) Memory coherency Coordination Cache coherency Cache invalidation Barrier Synchronization Application checkpointing Models (Implicit parallelism Explicit parallelism Concurrency) Flynn's Programming taxonomy (SISD SIMD MISD MIMD (SPMD)) Thread (computer science) Nonblocking algorithm Hardware Multiprocessing (Symmetric Asymmetric) Memory (NUMA COMA distributed shared distributed shared) SMT MPP Superscalar Vector processor Supercomputer Beowulf Ateji PX POSIX Threads OpenMP PVM MPI UPC Intel Threading Building Blocks Boost.Thread Global Arrays Charm++ Cilk Co-array Fortran OpenCL CUDA
APIs
Embarrassingly parallel Grand Challenge Software lockout Scalability Problems Race conditions Deadlock Livelock Deterministic algorithm Parallel slowdown Categories: Grid computing
Navigation
Main page Contents Featured content Current events Random article Donate to Wikipedia
Interaction Toolbox
What links here Related changes Upload file Special pages Permanent link Cite this page
Print/export
Languages
Bahasa Indonesia Interlingua Italiano Magyar Nederlands Polski Portugus Ting Vit This page was last modified on 7 March 2011 at 17:00. Text is available under the Creative Commons Attribution-ShareAlike License; additional terms may apply. See Terms of Use for details. Wikipedia is a registered trademark of the Wikimedia Foundation, Inc., a non-profit organization. Contact us Privacy policy About Wikipedia Disclaimers