
Availability Modelling of the 3GPP R99 Telecommunication Networks

Dhananjay Kumar
Nokia Research Center, Nokia House, Summit Avenue, Farnborough, Hampshire, GU150NG, UK

Akiyoshi Miyabayashi Ojala Kari

Nokia Networks, Severo Ochoa, s/n, Edif. de Inst. Universitarios, Pl. 3, Parque Tecnológico de Andalucía, Campanillas, 29590 Malaga, Spain
Nokia Research Center, PO Box 407, FIN-00045, Nokia Group, Finland

ABSTRACT: In this paper, availability modelling for the 3GPP R99 network architecture is presented. An analytical availability modelling approach has been used. Analytical availability models can be broadly classified into two groups: non-state space models (e.g. reliability block diagrams) and state space models (e.g. Markov models). These models are briefly discussed. The reliability block diagram method is easier to use and has been applied to model the availability of the 3GPP R99 network architecture. An example network with a number of different network elements is considered for availability modelling, and the numerical results are presented. The reliability block diagram method is suitable for capturing the overall availability of a network. However, in order to model features such as software failure, reconfiguration, and fault tolerance, a state space modelling approach is needed.

1 INTRODUCTION

During the last decade, rapid technical evolution, market pressures and the complexity of telecommunication networks have put a very high demand on performance and availability modelling. The third generation (3G) telecommunication networks are still in the development phase. Because of the very high demands on quality of service and the very high costs of installation and operation, telecommunication equipment manufacturers have to make special efforts to assure high reliability and availability of 3G networks. For operators, compatibility issues, such as interworking with legacy networks, are among the most important factors. Therefore, we have also considered second generation (2G) networks in our modelling.

A variety of measures for network reliability and availability have been proposed. These may be classified broadly into three categories: network survivability, network vulnerability, and network availability. The first two measures are rooted in graph-theoretic concepts, but have found their way into telecommunication systems. The third one concerns not only the various failure modes of network elements, but also the degraded performance of a network due to faults in network elements. This paper deals with network availability models, which can also be extended to network service performance measures.

2 BASIC CONCEPTS OF RELIABILITY & AVAILABILITY

Recommendation E.800 of the International Telecommunications Union (ITU-T) defines reliability as follows: "The ability of an item to perform a required function under given conditions for a given time interval." In this definition, an item may be a circuit board, a component on a circuit board, a module consisting of several circuit boards, a base transceiver station with several modules, a fiber optic transport system, or a mobile switching center with all of its subtending network elements. The definition includes systems with software. The reliability of an item is a probabilistic measure and is defined mathematically for a time interval t by:

$$R(t) = P(X > t) = 1 - F(t), \tag{1}$$

where X is the time to failure, F(t) is the distribution function of the item's lifetime, and P(X > t) is the probability that the time to failure is greater than time t. In practice, Mean Time Between Failure (MTBF) is used as a measure of reliability. The MTBF and reliability are related mathematically as follows:

$$\mathrm{MTBF} = \int_{0}^{\infty} R(t)\,dt \tag{2}$$
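As a short worked example of Equation (2), assume (this is an illustrative assumption, not a claim made in the paper) that the time to failure is exponentially distributed with a constant failure rate $\lambda$. Under that assumption the MTBF reduces to the reciprocal of the failure rate:

```latex
% Assumption: exponentially distributed lifetime with constant failure rate \lambda
F(t) = 1 - e^{-\lambda t}
\quad\Longrightarrow\quad
R(t) = 1 - F(t) = e^{-\lambda t},
\qquad
\mathrm{MTBF} = \int_{0}^{\infty} e^{-\lambda t}\,dt = \frac{1}{\lambda}.
% Illustrative value: \lambda = 10^{-4} failures per hour gives MTBF = 10^{4} hours.
```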

Availability is closely related to reliability, and is also defined in the ITU-T Recommendation E.800 as follows: "The ability of an item to be in a state to perform a required function at a given instant of time or at any instant of time within a given time interval, assuming that the external resources, if required, are provided." The availability at any point t in time, denoted by A(t), is sometimes called pointwise availability, instantaneous availability, or transient availability. However, in practice, the steady state availability denoted by A is often used and is given by

$$A = \frac{\mathrm{MTBF}}{\mathrm{MTBF} + \mathrm{MTTR}}, \tag{3}$$

where MTTR is the Mean Time To Repair. An important difference between reliability and availability is that reliability refers to failure-free operation during an interval, while availability refers to failure-free operation at a given instant of time, usually the time when a device or system is first accessed to provide a required function or service. MTBF gives a measure of reliability, while MTBF and MTTR together provide a measure of availability. Availability modelling is more useful because it also considers the repair time. More details on these topics can be found in the books by Trivedi, 2002, and by Ross, 1989.

3 AVAILABILITY MODELLING APPROACHES

Approaches to evaluate a system's availability can be broadly categorised as measurement-based and model-based. Measurement-based evaluation is expensive, as it requires building a real system, taking measurements and, finally, statistically analysing the data. Model-based evaluation, on the other hand, is inexpensive and relatively easy to perform. Although easier to perform, model-based availability analysis poses problems such as the size and complexity of the models, which make the models difficult to solve.

Model-based availability evaluation can be made through discrete-event simulation, analytic models, or hybrid models combining simulation and analytic parts. A discrete-event simulation model can depict the detailed system behaviour, as it is essentially a program whose execution simulates the dynamic behaviour of the system and evaluates the required measures. An analytic model consists of a set of equations describing the system's behaviour. The evaluation measures are obtained by solving these equations. In simple cases, closed-form solutions are obtained, but in large real-life cases, numerical solutions of the equations are necessary.

The main benefit of discrete-event simulation is the ability to depict detailed system behaviour in the models. Its main drawback is the long execution time, particularly when tight confidence bounds are required in the solutions obtained. Analytical models are more of an abstraction of the real system than a discrete-event simulation model. In general, analytic models tend to be easier to develop and faster to solve than a simulation model. Their main drawback is the set of assumptions that are often necessary to make analytic models tractable. Recent advances in model generation and solution techniques, as well as in computing power, make analytic models more attractive. Therefore, availability modelling based only on analytic techniques has been used.

3.1 Analytical Models

Analytical models can be broadly classified into two types, non-state space and state space models, depending on the constitutive elements and solution techniques.

3.1.1 Non-state space models

Non-state space models do not require the enumeration of the system states. They allow a concise description of the system under study, they can be evaluated efficiently, and a large number of algorithms are available for solving such models (Sun et al., 1999). The main constraint in using these models is their basic assumptions. All failure dependencies must be shown explicitly; that is, a component failure that leads to a system failure must not leave the system operational through the activation of a backup success path. These models cannot represent the dependencies between components that occur in real systems. The reliability block diagram and the fault tree are two non-state space models used for availability prediction.

A reliability block diagram (RBD) is a network diagram of a system that depicts the relationship of the subsystems that are required for successful operation of a system/network. In the RBD, each component/element of the system is represented as a block. The blocks are then connected in series, parallel, or k-out-of-n configurations based on the operational dependency between the components. If all of the blocks are needed for the system to function, the blocks are connected in series, which means that a failure in any of the blocks leads to a failure of the whole system. If the system can function with at least one block, the blocks are connected in parallel, which means that only simultaneous failures in all of the blocks lead to a system failure.

Fault trees, unlike RBDs, represent the probability-of-failure approach to availability modelling. A fault tree is a pictorial representation of the sequence of events/conditions that must be satisfied for a failure to occur. A fault tree uses Boolean gates (such as AND, OR, and k-of-n gates) to represent the operational dependency of the system on its components/elements. When a component fails, the corresponding input to the gate becomes TRUE. If any input to an OR gate becomes TRUE, then its output also becomes TRUE. The inputs to an OR gate are those components, all of which are required to function for the (sub)system to function. The inputs to an AND gate, on the other hand, are those components, all of which must fail for the (sub)system to fail. Whenever the output of the topmost gate becomes TRUE, the system is considered to have failed. To represent situations where one failure event propagates failure along multiple paths in the fault tree, fault trees can have repeated nodes. Several algorithms for solving fault trees exist (see Luo & Trivedi, 1998, and Doyle & Dugan, 1995).

3.1.2 State Space models

Many complex systems cannot be represented by non-state space models because of interdependencies and sharing of some of the system's functions, multiple failure modes, fault coverage etc. In reconfigurable systems, the effectiveness of the dynamic reconfiguration process often becomes the critical factor. Normally, Markov modelling or Petri Net modelling techniques are used. The availability modelling approaches are summarised in Figure 1.
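As a minimal illustration of a state space model, a single repairable element can be described by a two-state Markov chain (up/down). The sketch below assumes exponentially distributed failure and repair times with rates 1/MTBF and 1/MTTR; the function names and the numerical values are hypothetical and are not taken from the paper. Under these assumptions the steady-state probability of the "up" state coincides with Equation (3), and the pointwise availability A(t) decays from 1 towards that value.

```python
import math


def steady_state_availability(mtbf_hours, mttr_hours):
    """Steady-state availability, Equation (3): MTBF / (MTBF + MTTR)."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


def transient_availability(t_hours, mtbf_hours, mttr_hours):
    """Pointwise availability A(t) of a two-state Markov model.

    Assumptions (not from the paper): the element starts in the 'up' state,
    fails at constant rate lam = 1/MTBF and is repaired at constant
    rate mu = 1/MTTR.
    """
    lam = 1.0 / mtbf_hours
    mu = 1.0 / mttr_hours
    total = lam + mu
    return mu / total + (lam / total) * math.exp(-total * t_hours)


# Hypothetical element: MTBF of 10,000 hours and MTTR of 4 hours.
a_steady = steady_state_availability(10_000.0, 4.0)
a_after_one_hour = transient_availability(1.0, 10_000.0, 4.0)
```

For large t the exponential term vanishes, so `transient_availability` converges to `steady_state_availability`; this is why the steady-state value is normally the one quoted in practice.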

[Figure 1: tree diagram. Availability modelling approaches divide into measurement-based and model-based approaches. Model-based approaches divide into discrete-event simulation, hybrid models, and analytical models. Analytical models divide into non-state space models (reliability block diagram, fault tree) and state space models (Markov models, Petri Net).]

Figure 1. A summary of the availability approaches for telecommunication networks

4 3GPP R99 NETWORK ARCHITECTURE

The 3rd Generation Partnership Project (3GPP) is a collaboration agreement among a number of telecommunications standards bodies. The original scope of the 3GPP was to produce globally applicable Technical Specifications and Technical Reports for a 3rd Generation Mobile System based on evolved GSM core networks and on the radio access technologies that they support. The scope was subsequently amended to include the maintenance and development of the Global System for Mobile communication (GSM) Technical Specifications and Technical Reports, including evolved radio access technologies, such as the General Packet Radio Service (GPRS) and the Enhanced Data rates for GSM Evolution (EDGE). More information can be found at the homepage (www.3GPP.org) of the 3GPP.

The third generation (3G) network architecture involves not only technical evolution, but also an expansion of the network architecture and services. The Public Land Mobile Network infrastructure is logically divided into a Core Network (CN) infrastructure and an Access Network (AN) infrastructure. Our availability analysis is based on the 3GPP R99 architecture. The generic 3GPP R99 architecture is shown in Figure 2. The general details of the architecture, mobility, and services can be found in the book by Kaaranen, 2001.

4.1 Access Network (AN)

The AN is the radio access network, which is mainly in charge of controlling the use and the integrity of the radio resources and radio channels. Two different types of access are defined: the Base Station Subsystem (BSS) and the Radio Network System (RNS). The BSS offers Time Division Multiple Access (TDMA) based radio technology (such as GSM and/or GPRS), whereas the RNS offers Wideband Code Division Multiple Access (WCDMA) based radio technology (such as the Universal Mobile Telecommunication System, UMTS). BSS and RNS are also called GERAN (GSM/EDGE Radio Access Network) and UTRAN (UMTS Terrestrial Radio Access Network), respectively.

Each RNS contains a varying number of Node Bs (base stations) and Radio Network Controllers (RNCs). Similarly, a BSS contains a varying number of Base Transceiver Stations (BTSs) and Base Station Controllers (BSCs). The main function of the Node B is to perform the air interface L1 processing (channel coding and interleaving, rate adaptation, spreading etc.). It also performs some basic radio resource management operations. It logically corresponds to the BTS in GERAN. The RNC is the switching and controlling element of the RNS and interfaces with the CN. The RNC logically corresponds to the BSC in GERAN. The 3GPP R99 does not define interconnection of Radio Access Network (RAN) nodes to multiple CN nodes. This means that any particular RNC/BSC is connected to a predefined CN node.

[Figure 2: architecture diagram showing the Core Network elements (GMSC, GGSN, AuC, HLR, EIR, VLR, MSC, SGSN) with connections to the PSTN, the access networks (BSS with BSC and BTS; RNS with RNC and Node B), the mobile station (ME, SIM, USIM), and the interfaces between them (B, C, D, F, Gb, Gc, Gf, Gi, Gn, Gp, Gr, Gs, IuCS, IuPS, Abis, Iubis, Iur, Um, Uu, Cu).]

Figure 2: 3GPP R99 architecture. A detailed description can be found in the technical specification document 3GPP TS-TS 23.002, 2002 V3.5.0 (www.3GPP.org).

4.2 Core Network (CN)

The core network consists of a Circuit Switched (CS) domain and a Packet Switched (PS) domain. These two domains differ in the way that they support user traffic. The entities specific to the CS domain are the Mobile Switching Centre (MSC), the Gateway MSC (GMSC) and the Visitor Location Register (VLR). The entities specific to the PS domain are the Serving GPRS Support Node (SGSN) and the Gateway GPRS Support Node (GGSN). The rest of the CN network elements (NEs), e.g. the Home Location Register (HLR), the Equipment Identity Register (EIR), and the Authentication Center (AuC), are common to both the CS and PS domains.

In 3GPP R99, the BSC is connected to the MSC (CS domain) via an A interface (as in the basic 2G GSM network). In the PS domain, the BSC is connected to the 2G-SGSN via the Gb interface. The RNC is connected to the MSC via the Iu-CS interface and to the 3G-SGSN via the Iu-PS interface. The HLR is connected to the SGSN, GGSN, MSC and GMSC.

The MSC is responsible for CS connection management, paging and security activities. It also performs call control and mobility management. The GMSC is the MSC acting as a bridge between the mobile network and the fixed network. The VLR is a database that temporarily stores information about the subscribers in the MSC area.

In the PS domain, the SGSN functions in 2G and 3G are different. In 2G, protocol conversion, ciphering, compression and mobility management are the major tasks. In 3G, packet processing is the major task. The SGSN is in charge of mobility management, session management, packet transfer, charging, and admission control. The GGSN is the interface to external data networks. It has router functionality and charging functions.

The HLR is located in the user's home network and contains subscription data and routing information. The EIR handles security functions related to the verification and identification of the mobile equipment. The AuC handles security functions related to the verification of the identity of the user.

The simplified network architecture used in our availability analysis is shown in Figure 3.

[Figure 3: block diagram. Access Network (AN): GERAN (BTS, BSC) and UTRAN (Node B, RNC). Core Network (CN): CS domain (MSC), PS domain (2G/3G SGSN, GGSN), and the HLR, with interfaces between the elements and a connection to other networks.]

Figure 3: A simplified version of the CS and PS Network scenarios for 3GPP R99 networks.

In order to simplify our availability analysis, the following assumptions related to the network architecture are made:

- A simplified 3GPP R99 network architecture is considered for modelling (see Figure 3).
- No transport elements or transmission failures have been considered.
- SGSN and GGSN belong to the same network.
- VLR is integrated in the MSC.
- AuC and EIR are integrated in the HLR.
- HLR is connected to only SGSN.
- MSC and GMSC are part of the same physical NE called MSC. In reality the GMSC functions may be supported by another available MSC in the network.
- No call blockages, drops or handovers.

5 AVAILABILITY OF THE 3GPP R99 NETWORK

Availability analysis based on reliability block diagrams (RBD) is the simplest approach. However, it masks the maintainability, reconfiguration, process delay, and resilience aspects of a modern system. Therefore, one should adopt a hierarchical modelling approach, where the top level is analyzed using an RBD, and the blocks in the RBD are analyzed using state-space models if needed. The following assumptions have been made for availability modelling:

- The 3G R99 architecture is completely represented by the RBD.
- There is no dependency between the NEs represented by blocks.
- All network elements exist in one of two states: failed or operational.
- There is no reconfiguration, processing delay or resilience in the network.

Similar approaches for the CS and PS networks have been adopted. The availability modelling of the PS network is discussed in detail. The generic reliability block diagram for the PS network is shown in Figure 4. This RBD is generic in order to represent any number of NEs. The RBD of the CS network is similar, except that the SGSN is replaced by the MSC and there is no GGSN. The number of network elements in Block B may be the same as shown in Block A. The symbols and number of elements in Block A are explained below.

[Figure 4: reliability block diagram. Blocks A and B each contain groups of Node B1 ... Node Bn (k out of n required) attached to RNC1 ... RNCn (j out of n required); these are followed by SGSN1 ... SGSNn (i out of n required), the HLR and the GGSN.]

Figure 4: Reliability Block Diagram showing series and parallel operational relationship between different network elements of the 3GPP PS R99 network. A similar diagram will represent the CS network, except that Node B is replaced by BTS, RNC by BSC, SGSNs by MSC, and there will be no GGSN.

The Node Bs are shown in a parallel combination, where k out of n are required for the system to be available (k = 1, 2, ..., n). If all n Node Bs are required for the system to be considered operational, the corresponding block diagram becomes a series system. Similarly, if j out of n RNCs are required, the block diagram shows a parallel combination (j = 1, 2, ..., n), and if all of the RNCs are required for the system to be available, the corresponding block diagram becomes a series system. Likewise, if i out of n 2G/3G SGSNs are required, the block diagram shows a parallel combination (i = 1, 2, ..., n), and if all of the 2G/3G SGSNs are required for the system to be available, the corresponding block diagram becomes a series system.

The mathematical equations for the availability of the subnetworks were derived on the basis of series, parallel or k-out-of-n combinations of the blocks (see Trivedi, 2002, and Ross, 1989). The availability of the GERAN, where k out of n BTS are needed for the GERAN to be considered available, is given by

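$$A_{GERAN} = \prod_{j=1}^{n} A_{BSC_j} \left[ \sum_{l=k}^{n} \frac{n!}{l!\,(n-l)!}\, A_{BTS_j}^{\,l} \left(1 - A_{BTS_j}\right)^{n-l} \right] \tag{4}$$

where l counts the BTSs that are operational and $A_{BTS_j}$ is the availability of each BTS attached to BSC j; the binomial form treats the BTSs within a group as having identical availability.

The availability of the UTRAN, where k out of n Node Bs are needed for the UTRAN to be considered available, is given by

$$A_{UTRAN} = \prod_{j=1}^{n} A_{RNC_j} \left[ \sum_{l=k}^{n} \frac{n!}{l!\,(n-l)!}\, A_{NodeB_j}^{\,l} \left(1 - A_{NodeB_j}\right)^{n-l} \right] \tag{5}$$

The availability equations for GERAN and UTRAN for the CS network are the same as equations (4) and (5) above. The availability of the CN for the PS network, where i out of n SGSNs are needed for the CN to be considered available, is given by

$$A_{CN_{PS}} = A_{HLR}\, A_{GGSN} \sum_{l=i}^{n} \frac{n!}{l!\,(n-l)!}\, A_{SGSN}^{\,l} \left(1 - A_{SGSN}\right)^{n-l} \tag{6}$$

The 2G-service availability for the PS is given by

$$A_{2G\,service} = A_{HLR} \prod_{i=1}^{n} \left[ A_{MSC_i}\, A_{GERAN_i} \right] \tag{7}$$

The 3G-service availability for the PS is given by

$$A_{3G\,service} = A_{GGSN}\, A_{HLR} \prod_{i=1}^{n} \left[ A_{SGSN_i}\, A_{UTRAN_i} \right] \tag{8}$$

The availability for maintenance may also be of importance to operators for their resource planning. The availability for maintenance for the PS network is

$$A_{PS\,maintenance} = A_{GGSN}\, A_{HLR} \prod_{i=1}^{n} \left[ A_{SGSN_i}\, A_{UTRAN_i} \right] \prod_{i=1}^{n} \left[ A_{MSC_i}\, A_{GERAN_i} \right] \tag{9}$$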
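The series and k-out-of-n combinations used in the equations of this section can be sketched in a few lines of Python. This is only an illustrative sketch under the RBD assumptions stated above (independent elements, identical availability within a group); the function names `k_out_of_n`, `access_network` and `core_network_ps` are hypothetical and do not come from the paper.

```python
from math import comb, prod


def k_out_of_n(k, n, a):
    """Probability that at least k of n independent, identical elements are up.

    Implements the binomial sum used in Equations (4) to (6).
    k = n gives a series system, k = 1 gives a parallel system.
    """
    return sum(comb(n, l) * a**l * (1.0 - a) ** (n - l) for l in range(k, n + 1))


def access_network(controller_avails, k, n, element_avail):
    """Equations (4)/(5): every controller (BSC or RNC) is in series with
    its own k-out-of-n group of BTSs or Node Bs."""
    return prod(a_ctrl * k_out_of_n(k, n, element_avail) for a_ctrl in controller_avails)


def core_network_ps(a_hlr, a_ggsn, i, n, a_sgsn):
    """Equation (6): HLR and GGSN in series with an i-out-of-n SGSN group."""
    return a_hlr * a_ggsn * k_out_of_n(i, n, a_sgsn)


# Illustrative values only: two controllers, each with two elements of
# availability 0.9, of which at least one must be up.
example = access_network([0.99, 0.99], k=1, n=2, element_avail=0.9)
```

Setting k equal to n collapses the sum to a single term, a to the power n, which is the series case described in the text, while k = 1 gives the fully parallel case.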

6 AN EXAMPLE OF THE 3GPP R99 NETWORK

Let us consider a network in which the GERAN consists of 100 BTSs and 20 BSCs, and the UTRAN consists of 250 Node Bs and 20 RNCs. In the case of the PS CN, 2 SGSNs (2G/3G), one GGSN and one HLR are considered. Similarly, in the case of the CS CN, 2 MSCs (2G/3G) and one HLR are considered. The 2G and 3G SGSNs are physically two different NEs, while the 2G and 3G MSCs are physically in one NE. For the sake of simplicity, it is assumed that each of these NEs has an availability of 99.9999%. The hypothetical network for the PS case is represented in Figure 5. Note that i = 1, 2; j = 1, 2, ..., 10; k = 1, 2, ..., 25 for the Node Bs; and k = 1, 2, ..., 5 for the BTSs. The 3G and 2G service availabilities obtained using the equations in Section 5 are listed in Table 1. In the case of the CS network, the availability of the CN was calculated by replacing the SGSN with the MSC and eliminating the term for the GGSN in Equation (6).

The numerical results show that by allowing at least one of the Node Bs/BTSs to fail, the down time per year decreases many times compared to when no failure is allowed (for the 2G PS service in Table 1, from 65.2 to 12.6 minutes). However, the down time does not decrease further if a larger number of the Node Bs/BTSs are allowed to fail. The operator can plan their network based on the subscribers for their 2G and 3G services. The operator can plan maintenance resources based on the maintenance availability.

[Figure 5: network diagram. The GGSN and the HLR connect to 2G/3G SGSN1 and 2G/3G SGSN2. SGSN1 serves RNC1 to RNC10 and BSC1 to BSC10; SGSN2 serves RNC11 to RNC20 and BSC11 to BSC20. Each RNC has a dedicated group of 25 Node Bs and each BSC a dedicated group of 5 BTSs.]

Figure 5. An example of a 3G R99 packet switched network. A group of Node Bs/BTSs is dedicated to a particular RNC/BSC, which itself is dedicated to a particular SGSN. In the case of the CS network, the network element SGSN is replaced by an MSC and there is no GGSN.

Table 1. The availability values of the network shown in Figure 5 and of its corresponding network for the CS. The values in parentheses are the down time per year in minutes, calculated using the relationship downtime = 8760 × 60 × (1 − availability) min/year.

| Measure | All NEs needed | One Node B/BTS from each group needed and all other NEs needed | 80% of BTS/Node B from each group needed and all other NEs needed |
| Service availability for the 2G PS network | 99.987601% (65.2 min) | 99.997600% (12.6 min) | 99.997600% (12.6 min) |
| Service availability for the 3G PS network | 99.947614% (275.3 min) | 99.997600% (12.6 min) | 99.997600% (12.6 min) |
| Availability for maintenance of the PS network | 99.935421% (339.4 min) | 99.995400% (24.2 min) | 99.995400% (24.2 min) |
| Service availability for the 2G CS network | 99.987701% (64.64 min) | 99.997700% (12.09 min) | 99.997700% (12.09 min) |
| Service availability for the 3G CS network | 99.947714% (274.82 min) | 99.997700% (12.09 min) | 99.997700% (12.09 min) |
| Availability for maintenance of the CS network | 99.935721% (337.85 min) | 99.995700% (22.60 min) | 99.995700% (22.60 min) |
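The downtime relationship in the caption of Table 1 can be checked with a short calculation. The sketch below is a plausibility check for the 2G PS service only, and the element counts are an assumption made here rather than a statement from the paper: 100 BTSs, 20 BSCs, 2 SGSNs, one GGSN and one HLR (124 elements) in series when all NEs are needed, and only the 24 non-BTS elements when one BTS per group suffices (each 1-out-of-5 BTS group is then treated as always available, which is accurate to many decimal places at 99.9999% per BTS).

```python
MINUTES_PER_YEAR = 8760 * 60  # hours per year times minutes per hour


def downtime_minutes_per_year(availability):
    """Down time per year in minutes: 8760 x 60 x (1 - availability)."""
    return MINUTES_PER_YEAR * (1.0 - availability)


A_NE = 0.999999  # availability assumed for every network element (Section 6)

# Assumption: 2G PS service with all 124 elements in series.
all_needed = A_NE ** 124

# Assumption: one BTS per group needed, so only the 24 non-BTS elements
# (20 BSCs, 2 SGSNs, GGSN, HLR) contribute noticeably.
one_bts_per_group = A_NE ** 24

all_needed_percent = 100.0 * all_needed
all_needed_downtime = downtime_minutes_per_year(all_needed)
one_bts_percent = 100.0 * one_bts_per_group
one_bts_downtime = downtime_minutes_per_year(one_bts_per_group)
```

Under these assumptions the computed values agree, to the precision shown, with the 2G PS entries of Table 1 (99.987601% with about 65.2 minutes, and 99.997600% with about 12.6 minutes).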

7 CONCLUSIONS

At a high level, the availability of telecommunication networks can be modelled by using a Reliability Block Diagram (RBD). Other components of the network can be easily modelled as long as their operational relationships do not violate the assumptions of the RBD. In order to include network features such as resilience, reconfiguration etc., state space models are necessary. It is suggested that the initial availability modelling be carried out using simple methods such as the RBD. In order to capture the complexity of the network, state space models may be used for each block in the RBD. Operators may use the results of availability modelling to plan their maintenance resources and to estimate revenue from each type of service provided to their subscribers.

8 ACKNOWLEDGEMENT

We are thankful to Mr. Veikko Juusola, Mr. Erik Salo, and Mr. Heikki Almay from Nokia/Networks, Dr. Jukka Rantala from Nokia Research Center, and Ms. Raquel Sanchez from University of Malaga for their support during the research work.

REFERENCES

3GPP TS-TS 23.002, 2002: 3rd Generation Partnership Project: Technical specification group services and systems aspects: Network architecture V3.5.0 (Release 1999), www.3GPP.ORG.
Doyle, S. A., and Dugan, J. B., 1995, Dependability assessment using binary decisions diagram, Proceedings of 5th international symposium on fault tolerant computing, pp. 249-258.
ITU-T, 1994, International Telecommunications Union Telecommunication Standardization Sector Recommendations E.800.
Kaaranen, H. et al., 2001, UMTS Networks, UK, Wiley & Sons.
Luo, T., and Trivedi, K. S., 1998, An improved algorithm for coherent-system reliability, IEEE Transactions on Reliability, Vol. 47, No. 1, pp. 73-78.
Ross, S. M., 1989, Introduction to Probability Models, Academic Press Inc., New York.
Sun, H. R., Cao, Y., Han, J. J., and Trivedi, K. S., 1999, Availability and performance evaluation of automatic protection switching in TDMA wireless systems, Pacific Rim dependence Conference.
Trivedi, K. S., 2002, Probability and Statistics with reliability, queuing, and computer science applications, John Wiley, New York.