
First step in a PC cluster development with openMosix

JAVIER BILBAO GORKA GARATE


Department of Applied Mathematics
University of the Basque Country
School of Engineering, Alda. Urkijo, s/n 48013 - Bilbao
SPAIN

Abstract: - High calculation capacity machines can be built by interconnecting computers over high-speed
networks. Such systems are called distributed systems, and they are currently the most widely used
technology for solving problems that require high calculation capacity. Computer clusters are being
developed for this purpose. Clustering allows multiple computers to work together to solve common computing
problems. This paper presents one of the different types of cluster, openMosix, and evaluates it in two
case studies in which the execution time of the solved problems is reduced.

Key-Words: - pc cluster, parallel computation, efficiency, Linux

1 Introduction

The great advances in computer technology in recent years have allowed the development of
computers with calculation capacities that were inconceivable some years ago.

In parallel with the advances in computers, communication networks have developed
spectacularly. Nowadays it is usual for universities to have high-speed networks that
interconnect their devices, which allows a very large amount of information to be transferred
among them in a short period of time.

Efficient machines with high calculation capacity are demanded by many areas, such as
oceanography, astrophysics, aerodynamic calculation, meteorological calculation, human genome
analysis, finite element analysis, medical image processing, and the creation of films,
animations, and computer graphics.

To meet these needs, a great research effort has been invested in the design of high
calculation capacity machines. First came the so-called multiprocessors, machines that have a
large number of processors. These multiprocessor machines approach each problem with a
divide-and-conquer philosophy:

  - Each processor runs its own sequence of instructions.
  - Each processor works on a different part of the problem.
  - All processors communicate with each other.

A processor may be delayed if it needs the result of an operation that is carried out by
another processor. Programs that run on this type of computer should therefore be optimized to
minimize the dependence among the different processors. Otherwise, calculation capacity is
wasted.

These systems supply a high calculation capacity, but they are very expensive and not easily
scalable. It therefore does not seem advisable to invest a large amount of money in machines
whose calculation capacity will fall behind as time goes on. Strictly speaking, their capacity
does not decrease; rather, the problems to be solved become more and more complex and need
more calculation power.

The progress of the technologies mentioned above, computers and high-speed networks, together
with the development of efficient software tools for distributed calculation, has made it
possible to solve this scalability problem by means of computer clusters.

High calculation capacity machines can be built by interconnecting computers, not necessarily
powerful ones, over high-speed networks. Such systems are called distributed systems, and they
are currently the most widely used technology for solving problems that require high
calculation capacity.

Financially, the difference between supercomputers and computer clusters is very large. The
cost per Gflop (one thousand million floating-point operations per second) of one
supercomputer is estimated at about 10,000 dollars, and certain clusters have achieved a cost
per Gflop of around 650 dollars.

2 Clustering

In general, clustering is a technology or a set of technologies that allows multiple computers
to work together to solve common computing problems. The computing problems in question can be
anything from complex CPU-intensive scientific computations to a horde of miscellaneous
processes with no underlying commonality.

Initially, clusters were developed to solve supercomputing problems, but nowadays that is not
their only use. The growth of web technology has led to clusters being deployed on servers
that must serve a large number of clients. Services such as web servers, email, e-commerce,
and high performance databases have adopted this type of technology.

A cluster system may be considered as being made up of four major components, two hardware
and two software. The two hardware components are the nodes that perform the work and the
network that interconnects the nodes to form a single system. The two software components are
the collection of tools used to develop user parallel application programs and the software
environment for managing the parallel resources of the cluster [3].

Clusters of computers can be classified into two groups: high availability clusters (such as
web servers) and high performance clusters (supercomputing) [1].

2.1 System level or kernel clusters

System level clusters, or kernel clusters, are those in which the cluster is formed at the
operating system level. It is not necessary to specify the parallelism in the applications or
tasks because the parallelism is implicit [2] [7]. They can be classified as high performance
clusters.

The operating system itself takes care of distributing the running processes among the
different nodes. Nevertheless, by means of system calls, a given process can be explicitly
assigned to run on a particular node.

openMosix clusters are an example of this type of cluster. Clusters of this kind are developed
on Unix operating systems.

3 openMosix

openMosix is a project that allows the implementation of SSI (Single System Image) clusters so
that they can be managed as symmetric multiprocessors (SMP). openMosix appeared in 1999 from
the Mosix project and uses Linux.

Fig. 1. Logo of the openMosix project

Unlike other similar projects such as VMS from Digital, Sysplex from IBM, or Mosix, openMosix
is developed under the GPL license, whereas those projects are proprietary.

openMosix is a Linux kernel extension for single-system image clustering. This kernel
extension turns a network of ordinary computers into a supercomputer for Linux applications.

Once you have installed openMosix, the nodes in the cluster start talking to one another and
the cluster adapts itself to the workload. Processes originating from any one node, if that
node is too busy compared to others, can migrate to any other node. openMosix continuously
attempts to optimize the resource allocation.

openMosix is thus presented as a kernel patch for Linux, creating a reliable, fast, and
cost-efficient SSI clustering platform that is linearly scalable and adaptive. Moreover, with
the openMosix Auto Discovery feature, a new node can be added while the cluster is running and
the cluster will automatically begin to use the new resources.

Node discovery in the cluster can be dynamic or static, and openMosix uses the oMFS
(openMosix File System) file system to access the hard disks of remote nodes.

When applications are developed to solve problems on this kind of system, the fork-and-forget
philosophy is used. The application is divided into several processes by means of the fork()
system call, which creates a child process. This child process may run on a different node
from the one that created it.

There is no need to program applications specifically for openMosix. Since all openMosix
extensions are inside the kernel, every Linux application automatically and transparently
benefits from the distributed computing concept of openMosix. The cluster behaves much as does
a Symmetric Multi-Processor, but this solution scales
to well over a thousand nodes, which can themselves be SMPs.

Probably the best-known type of Linux-based cluster is the Beowulf cluster [9]. Beowulf
clusters are scalable performance clusters based on commodity hardware, on a private system
network, with open source software (Linux) infrastructure. In order for these systems to be
able to pool their computing resources, special cluster-enabled applications must be written
using clustering libraries. The most popular clustering libraries are PVM and MPI; both are
very mature and work very well. By using PVM or MPI, programmers can design applications that
span an entire cluster's computing resources rather than being confined to the resources of a
single machine. For many applications, PVM and MPI allow computing problems to be solved at a
rate that scales almost linearly in relation to the number of machines in the cluster.

The real advantage of openMosix is that it can turn a bunch of Linux machines into something
like a large virtual SMP system. However, there are a few differences. First, on a "real" SMP
system, two or more CPUs can exchange data very quickly; with openMosix, the speed at which
nodes can communicate with one another is determined by the speed of your LAN. Using Gigabit
Ethernet or some other kind of high-bandwidth networking technology will allow you to increase
the effectiveness of your openMosix cluster [9].

On the other hand, openMosix provides a number of benefits over traditional multiprocessor
systems. With openMosix, you can create clusters consisting of tens or even hundreds of nodes
using inexpensive PC hardware. In contrast, SMP systems that contain large numbers of
processors can be prohibitively expensive, depending on your budget. For many applications,
openMosix will give you better value for money than a traditional supercomputer or mainframe.
And of course, there is no reason why you cannot run openMosix on a bunch of high-end
multiprocessor systems. It is even possible to use openMosix together with existing MPI or PVM
programs in order to further optimize the performance of your cluster-aware applications.

openMosix, like an SMP system, cannot execute a single process on multiple physical CPUs at
the same time. This means that openMosix will not be able to speed up a single process such as
Mozilla, except by migrating it to a node where it can execute most efficiently. In addition,
openMosix does not currently offer support for allowing multiple cooperating threads to be
separated from one another.

openMosix will allow for extremely scalable parallel execution at the process level.
openMosix can migrate most standard Linux processes between nodes with no problem. If an
application forks many child processes, each of which performs work, then openMosix will be
able to migrate each one of these processes to an appropriate node in the cluster. You can
take advantage of this ability even if a particular application is not designed to use
multiple sub-processes that can be migrated independently. For example, if you wanted to
compress 12 digital audio tracks using your cluster, you could simply start all 12 audio
encoding processes simultaneously. After a few seconds, openMosix would migrate each process
to an appropriate node in your cluster. If you happened to have a 12-node cluster, your audio
encoding job would complete nearly 12 times faster than it would have otherwise. If the number
of processes that you plan to run simultaneously is greater than the number of nodes in your
cluster, multiprocessing on the individual nodes can provide additional performance gains.

4 Cluster evaluation

Some case studies have been carried out to evaluate the performance of the cluster. In this
paper we present two of them: for loops and file compression from .wav to .mp3.

4.1 For loops

For this case study, a program with two nested for loops was used. Both loops had a high
iteration count: the first ran 1000000 (one million) times and the second 100 times.

For the performance evaluation, ten executions were run in parallel, each of them launched in
the background.

Alternatively, the program itself could launch n processes by calling the fork() system call
n times. Each call creates a child process to which the execution of the two for loops can be
assigned. The result is the same in both cases.

These processes were run on a 100 MHz Pentium computer, and the execution time was 2 minutes
9 seconds.

When the same operation is performed on a cluster of two computers, the previous 100 MHz
Pentium and one 500 MHz Pentium III, the execution time was 48 seconds.

[Bar chart: Pentium 100 vs. Cluster]
Fig. 2. Execution time for a simple cluster and a 100 MHz Pentium to solve for loops

The same evaluation was carried out with four Pentium IV computers with 512 MB of RAM. One of
these computers executed the task in 35.446 seconds, and the cluster (the four Pentium IV
computers) took 19.674 seconds for the same operation.

[Bar chart: Pentium IV vs. Cluster]
Fig. 3. Execution time for a cluster of four computers and one Pentium IV to solve for loops

4.2 File compression

For this test, the .wav files of a music CD were copied to the hard disk. The compression was
then performed with a free software encoder, BladeEnc.

The process was set up to compress all the .wav files of the CD in parallel, that is to say,
each .wav file was handled by its own BladeEnc process. The following shell script was written
for this task:

    #!/bin/bash
    # Launch one BladeEnc process per .wav file, each in the background (&),
    # so that openMosix can migrate the processes to other nodes.
    # The glob and the quoted "$i" keep file names with spaces intact.
    for i in *.wav; do
        time bladeenc -quit -quiet "$i" -256 -copy -crc &
    done

Placing the time command before a program causes the execution time of the process to be
shown on the screen when the program finishes.

When this process was carried out by a single Pentium IV computer, it took 14 minutes
5 seconds. When it was carried out by a cluster of 4 Pentium IV computers, the execution time
was 10 minutes 26 seconds:

[Bar chart: Pentium IV vs. Cluster]
Fig. 4. Execution time for a cluster of four computers and one Pentium IV for file compression

This last run was monitored by means of the openMosixview software. The next figure shows how
various processes migrated to the rest of the nodes of the cluster:

Fig. 5. Monitoring of the working nodes while the processes are migrating among the nodes of
the cluster
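
As a rough check on the figures reported for both case studies, the measured times can be
converted into a speedup factor (single-machine time divided by cluster time). The sketch
below is not part of the original experiments: the helper names to_seconds and speedup are
hypothetical, and awk is assumed to be available for the floating-point division.

```bash
#!/bin/bash
# Hypothetical helpers, not from the paper: convert the reported timings
# to seconds and compute speedup = single-machine time / cluster time.

# to_seconds MIN SEC -> total whole seconds
to_seconds() {
    echo $(( $1 * 60 + $2 ))
}

# speedup T_SINGLE T_CLUSTER -> ratio rounded to two decimals
speedup() {
    LC_ALL=C awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", a / b }'
}

# For loops: 100 MHz Pentium alone vs. two-node cluster (2 min 9 s vs. 48 s)
echo "for loops, 2 nodes:   $(speedup "$(to_seconds 2 9)" 48)"
# For loops: one Pentium IV vs. four Pentium IV (35.446 s vs. 19.674 s)
echo "for loops, 4 nodes:   $(speedup 35.446 19.674)"
# File compression: one Pentium IV vs. four (14 min 5 s vs. 10 min 26 s)
single=$(to_seconds 14 5)
cluster=$(to_seconds 10 26)
echo "compression, 4 nodes: $(speedup "$single" "$cluster")"
echo "compression saving:   $(( single - cluster )) s"
```

With the reported times, the compression test saves 219 seconds on the cluster, and its
speedup is clearly smaller than that of either for-loop test.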
The cluster took approximately 4 minutes less to carry out the operation. The time difference
between a single computer and the cluster (with 4 computers) grows as the number of files to
be compressed increases. Conversely, if the number of files to compress is very low, the
execution time of the cluster may even be higher than that of the single computer.

The reason is that compressing files requires constant access to the hard disk, whereas the
previous task (for loops) requires only mathematical calculations.

Therefore, when the compression is carried out by the cluster, the other nodes constantly
access the hard disk of the computer that launched the processes. Although openMosix has the
oMFS file system, which allows any node of the cluster to access the file systems of the other
nodes, this access takes longer than a computer needs to access its own hard disk.

5 Conclusion

This paper presents the development of a computer cluster based on openMosix. openMosix is a
project that allows the implementation of SSI clusters so that they can be managed as
symmetric multiprocessors.

Some evaluations were carried out with the aim of comparing the solution of a problem on a
single computer and on a cluster. In all of these evaluations, the execution time needed to
solve the problem was lower when a cluster was used.

Acknowledgements

This work was supported by the Aula Iberdrola of the School of Engineering of Bilbao and the
Spanish utility Iberdrola S.A. in the academic year 2003-04.

We are also grateful to Maria Alonso and Jon Viguera for their collaboration in the presented
evaluations.

References:
[1] M. Catalán i Coït, Manual para Clustering con openMosix
[2] O. Pino, R. F. Arroyo, F. J. Nievas, Los clusters como plataforma de procesamiento paralelo
[3] M. A. Perez, Arquitecturas paralelas
[4] R. M. Yáñez, Introducción a las tecnologías clustering
[5] D. Santo, El proyecto de cluster SSI openMosix
[6] E. Plaza, Cluster heterogéneo de computadoras
[7] M. Colomer, Clustering con openMosix
[8] T. Sterling, Beowulf cluster computing with Linux, MIT Press, Cambridge, 2001
[9] D. Robbins, openMosix, http://www.intel.com
[10] LAM/MPI Users Guide. http://www.lam-mpi.org