Issue 01
Date 2016-02-29
and other Huawei trademarks are trademarks of Huawei Technologies Co., Ltd.
All other trademarks and trade names mentioned in this document are the property of their respective
holders.
Notice
The purchased products, services and features are stipulated by the contract made between Huawei and the
customer. All or part of the products, services and features described in this document may not be within the
purchase scope or the usage scope. Unless otherwise specified in the contract, all statements, information,
and recommendations in this document are provided "AS IS" without warranties, guarantees or
representations of any kind, either express or implied.
The information in this document is subject to change without notice. Every effort has been made in the
preparation of this document to ensure accuracy of the contents, but all statements, information, and
recommendations in this document do not constitute a warranty of any kind, express or implied.
Website: http://www.huawei.com
Email: support@huawei.com
Contents
2 Overview
2.1 Introduction
2.2 Benefits
2.3 Architecture
3 Reliability Specifications
4 Planned Service Interruption
4.1 Overview
4.2 BSC/RNC Software Management
5 Redundancy Design
5.1 Overview
5.2 RNC Redundancy Design
5.2.1 Resource Management Plane
5.2.2 Control plane
5.2.3 User Plane
5.2.4 Transport Plane
5.3 BSC Redundancy Design
5.3.1 Resource Management Plane
5.3.2 Control Plane
5.3.3 User Plane
5.3.4 Transport Plane
5.4 BSC/RNC Resource Sharing
6 Network Redundancy
7 Fault Management
8 Flow Control
9 Operation and Maintenance Reliability
9.1 Overview
9.2 Technical Description
10 Hardware Reliability
11 Related Features
12 Network Impact
13 Engineering Guidelines
13.1 When to Use Operation & Maintenance System One-Key Recovery
13.2 Deployment
13.2.1 Process
13.2.2 Requirements
13.2.3 Activation
13.2.4 Activation Observation
13.2.5 Deactivation
13.3 Performance Monitoring
13.4 Troubleshooting
14 Parameters
15 Counters
16 Glossary
17 Reference Documents
1.1 Scope
This document describes the Base Station Controller Equipment Reliability feature, including
its technical principles, related features, network impact, and engineering guidelines.
l Feature change:
Changes in features of a specific product version
l Editorial change:
Changes in wording or addition of information that was not described in the earlier
version
SRAN11.1 01 (2016-02-29)
This issue does not include any changes.
2 Overview
2.1 Introduction
Reliability designs enable the controller to continue providing services even when it
experiences a fault, thereby maintaining high system reliability. Objectives of reliability
include:
l Decreasing the number of accidents
l Minimizing the scope of fault influence
l Shortening the duration of service interruption
Controller reliability designs include system availability, planned service interruption,
redundancy design, network redundancy, fault management, flow control, operation and
maintenance reliability, and hardware reliability.
2.2 Benefits
Reliability designs, which include redundancy design and hardware reliability design,
eliminate or reduce the impact of equipment faults on services, thereby improving system
reliability.
2.3 Architecture
Table 2-1 lists the controller equipment reliability-related features and functions that are
supported by GSM and UMTS.
Table 2-1 Controller equipment reliability-related features and functions that are supported by
GSM and UMTS
Reliability Category | Feature/Function | Radio Access Technology | Feature ID/Feature Name | Remarks
3 Reliability Specifications
4 Planned Service Interruption
4.1 Overview
Planned service interruption aims to reduce the duration of service interruption caused by
upgrades, minimize the impact of planned maintenance on live networks, and improve
equipment availability. Planned service interruption supports hot patches.
4.2 BSC/RNC Software Management
Huawei controllers support the uniform software management of GSM base station system
(GBSS) and radio access network (RAN), facilitating the remote management of the
controller software and improving the efficiency of software upgrades and downloads.
With this feature, users can implement the following operations on the U2000.
In addition, users can manage the programs, patches, licenses, and logs using the Web LMT.
The controller supports the software integrity check. The controller performs the software
integrity check after software loading and before software operation, and then completes
digital signature verification.
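As an illustration only, the integrity check can be thought of as a digest comparison performed before the loaded software is allowed to run. This document does not state which hash or signature algorithm the controller uses. SHA-256 and the function name `verify_image` are assumptions, and the public-key step of digital signature verification is not modeled.

```python
import hashlib
import hmac


def verify_image(image: bytes, expected_sha256_hex: str) -> bool:
    """Return True if the loaded image matches the expected digest.

    Hypothetical helper: the algorithm and interface are assumptions,
    not the controller's actual implementation.
    """
    actual = hashlib.sha256(image).hexdigest()
    # compare_digest avoids leaking the mismatch position through timing.
    return hmac.compare_digest(actual, expected_sha256_hex.lower())
```

In this sketch, software whose digest does not match is simply rejected; how the controller reacts to a failed check is not described here.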
The BSC/RNC is upgraded remotely by the dedicated upgrade tool, which consists of the
upgrade client and the upgrade server. Figure 4-1 shows the BSC/RNC remote upgrade
process.
Step 1 Upload the upgrade server program and the version files required for the upgrade (such as a
major release or patch version) to a specified directory of the active OMU. In addition,
synchronize the upgrade directory of the standby OMU with the specified directory of the
active OMU.
Step 2 Conduct the pre-upgrade health check, back up data and files, and upgrade the program and
data files in the standby workspaces of the active OMU and standby OMU.
Step 3 Load the host program, BootROM, operating system (OS), and data files in the standby
workspace of the active OMU onto the standby workspaces of the FAM boards so that the
standby workspaces of the FAM boards are synchronized with that of the OMU.
Step 4 After the synchronization is successful, switch over the active and standby workspaces of the
active OMU so that the active OMU is upgraded to the latest version.
Step 5 Switch over the active and standby workspaces of the FAM boards. When the platform host
program, BootROM, OS, or data files are upgraded, the FAM boards are reset. When a cold
patch is loaded to a type of FAM board, only boards of this type are reset, and they
automatically load the program and data files from their flash memories to complete the
upgrade. Hot patches are installed with one click.
Step 6 After the service verification is successful, switch over the active and standby workspaces of
the standby OMU so that the standby OMU is upgraded to the latest version. After the
workspace switchover is complete for the standby OMU, the standby OMU automatically
synchronizes its data with the active OMU.
----End
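The ordering of Steps 2 to 6 can be sketched as follows. This is a simplified model, not the upgrade tool itself: the `Node` class, the `remote_upgrade` function, and the `verify` callback are hypothetical names, and board resets, health checks, and data backup are not modeled.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """An OMU or FAM board with two workspaces (illustrative model)."""
    name: str
    active_ws: str          # version currently running
    standby_ws: str = ""    # version staged for the next switchover

    def stage(self, version: str) -> None:
        self.standby_ws = version

    def switch_workspaces(self) -> None:
        self.active_ws, self.standby_ws = self.standby_ws, self.active_ws


def remote_upgrade(active_omu, standby_omu, fam_boards, new_version, verify):
    # Step 2: upgrade the standby workspaces of both OMUs.
    active_omu.stage(new_version)
    standby_omu.stage(new_version)
    # Step 3: synchronize the FAM standby workspaces with the active OMU.
    for board in fam_boards:
        board.stage(active_omu.standby_ws)
    # Step 4: the active OMU starts running the new version.
    active_omu.switch_workspaces()
    # Step 5: the FAM boards switch workspaces (a reset in the real system).
    for board in fam_boards:
        board.switch_workspaces()
    # Step 6: the standby OMU switches only after service verification.
    if not verify():
        return False
    standby_omu.switch_workspaces()
    return True
```

In this model, the standby OMU keeps running the previous version until service verification succeeds, which mirrors the order of Steps 4 to 6.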
Key Specifications
Table 4-1 lists key specifications for BSC/RNC software management.
5 Redundancy Design
5.1 Overview
This section describes the MRFD-210101 System Redundancy feature and the GBFD-111701
Board Switchover feature. Huawei base station controllers adopt system redundancy designs,
such as active/standby switchovers and load sharing, to ensure the reliable operation of the
system.
l Active/standby switchovers
In active/standby mode, the active board processes services while the standby board acts
as a backup for the active one. When the active board is faulty or needs to be replaced,
services on the active board are switched over to the standby board to ensure normal
service operations.
There are two types of switchovers:
Automatic switchover: automatically triggered by the system if the active board is
faulty.
Manual switchover: performed by maintenance personnel on the LMT.
Maintenance personnel use the immediate switchover command to switch over the
active and standby boards.
A successful active/standby switchover requires the following:
The standby board works normally.
No major or critical alarm is reported on the standby board.
When the standby board is switched over to the active state, the previously active board
is reset automatically. If this board restarts normally, it is switched over to the standby
state.
l Load sharing
In resource pool mode, load sharing is performed among processing units in the pool.
When one or multiple processing units are faulty, new service requests are allocated to
the normal processing units in the resource pool.
l Other reliability designs
Other reliability designs include the redundancy configuration of power and fan units. In
addition, software versions and important data configuration files are backed up so that
the system works normally even if an exception occurs in the software versions and files.
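The switchover preconditions listed above can be expressed as a small check. This is a minimal sketch under stated assumptions: the `Board` and `Severity` types and the function names are hypothetical, and the reset of the previously active board is only noted in a comment.

```python
from dataclasses import dataclass, field
from enum import Enum


class Severity(Enum):
    MINOR = 1
    MAJOR = 2
    CRITICAL = 3


@dataclass
class Board:
    slot: int
    working: bool = True
    alarms: list = field(default_factory=list)  # list of Severity values


def can_take_over(standby: Board) -> bool:
    """The standby board must work normally and have no major or critical alarm."""
    blocking = {Severity.MAJOR, Severity.CRITICAL}
    return standby.working and not any(a in blocking for a in standby.alarms)


def switch_over(active: Board, standby: Board):
    """Return (new_active, new_standby); unchanged if the preconditions fail."""
    if not can_take_over(standby):
        return active, standby
    # In the real system the previously active board is reset and becomes
    # the standby board only if it restarts normally (not modeled here).
    return standby, active
```

The same check applies to both automatic and manual switchovers in this sketch, because only the trigger differs between the two types.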
5.2 RNC Redundancy Design
5.2.2 Control plane
NOTE
CP stands for the control plane and UP stands for the user plane.
The redundancy design is similar for the BSC6900 and BSC6910 control planes. The
differences are as follows:
l A pair of BSC6900 active and standby control plane boards must be installed in adjacent
slots.
l The BSC6910 control plane uses the process backup mechanism. All active CP
processes on a UCUP board have backups evenly distributed on the two adjacent UCUP
boards.
When the BSC6900 or BSC6910 is running, the cell status, NodeB status, and online UE
information on the active subsystem are sent to the standby subsystem through the backup
channel. The standby subsystem then backs up the data. If the active subsystem is faulty, the
standby subsystem takes over services on the active subsystem to avoid service interruptions.
In addition to the redundancy design, the BSC6910 control plane also supports process
preemption. If a pair of active and standby CP processes of the BSC6910 are both faulty for a
certain period of time (less than 5 minutes), the BSC6910 preempts another standby CP
process to start the active CP process, thereby restoring services promptly.
5.2.3 User Plane
If a user plane subsystem is faulty, the common channels for the cells carried on the
subsystem are reestablished, and services are interrupted for less than 5 seconds and then
restored. During the service interruption, CS services are released, and PS services are
interrupted and then reconnected. The user-plane processing capability decreases, but the
other functional user plane subsystems still work in resource pool mode.
The redundancy design is the same for the BSC6900 and BSC6910 user planes. Neither
supports user-plane service backup.
5.2.4 Transport Plane
Table 5-1 describes the reliability indexes for transmission interface boards in different
scenarios.
Table 5-1 Reliability indexes for transmission interface boards in different scenarios
Scenario | Availability | Average Downtime (Minute/Year) | Quantitative Reliability Analysis
Switchover for interface boards: The impact on ongoing services lasts no longer than 3s.
NOTE
The preceding indexes do not account for the delay caused by protocol negotiation with the peer
equipment. For example, if Link Aggregation Control Protocol (LACP) is enabled, the impact of a
switchover between interface boards on ongoing services lasts no longer than 9s.
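The Availability and Average Downtime (Minute/Year) columns of Table 5-1 are related by simple arithmetic. The following sketch assumes that availability is expressed as a fraction of total time and that a year has 365 days. The function names and the example figure of 0.99999 are illustrative and are not values from this document.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes; leap years are ignored


def downtime_minutes_per_year(availability: float) -> float:
    """Convert an availability fraction (0 to 1) to average downtime."""
    if not 0.0 <= availability <= 1.0:
        raise ValueError("availability must be between 0 and 1")
    return (1.0 - availability) * MINUTES_PER_YEAR


def availability_from_downtime(minutes_per_year: float) -> float:
    """Convert average downtime in minutes per year to availability."""
    return 1.0 - minutes_per_year / MINUTES_PER_YEAR


# Illustrative value only: an availability of 0.99999 corresponds to
# roughly 5.26 minutes of downtime per year.
example = downtime_minutes_per_year(0.99999)
```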
5.3 BSC Redundancy Design
5.3.1 Resource Management Plane
The BSC6910 resource management plane consists of the central layer and local layer.
l The central layer manages global resources, including the control plane, user plane, and
transport plane resources, and troubleshoots system faults.
The BSC6900 is configured with a pair of boards whose logical type is resource
management processing (RMP). The boards are responsible for managing global
resources and troubleshooting system faults.
l The local layer manages board-level resources.
NOTE
The boards whose logical type is GCUP or GMCP (referred to as GCUP or GMCP boards) manage
board-level resources. GCUP or GMCP is short for GSM BSC Control plane and User plane Processing.
The boards whose logical type is RMP (referred to as RMP boards) have the following
characteristics:
l The CPU usage does not increase noticeably with the increase in the Busy Hour Call
Attempt (BHCA) or throughput. A sudden increase in the CPU usage is allowed within a
short period of time.
l A temporary fault in an RMP board does not interrupt ongoing services. This is because
only global resource scheduling is interrupted if an RMP board is faulty.
5.3.2 Control Plane
NOTE
CP stands for the control plane and UP stands for the user plane.
The redundancy design is similar for the BSC6900 and BSC6910 control planes. The
differences are as follows:
l A pair of BSC6900 active and standby control plane boards must be installed in adjacent
slots.
l The BSC6910 control plane uses the process backup mechanism. All active CP
processes on a board whose logical type is GCUP or GMCP have backups evenly
distributed on the two adjacent boards whose logical type is GCUP or GMCP.
When the BSC6900 or BSC6910 is running, the cell status, BTS status, and online MS
information on the active subsystem are sent to the standby subsystem through the backup
channel. The standby subsystem then backs up the data. If the active subsystem is faulty, the
standby subsystem takes over services on the active subsystem to avoid service interruptions.
In addition to the redundancy design, the BSC6910 control plane also supports process
preemption. If a pair of active and standby CP processes of the BSC6910 are both faulty for a
certain period of time (less than 5 minutes), the BSC6910 preempts another standby CP
process to start the active CP process, thereby restoring services promptly.
5.3.3 User Plane
NOTE
CP stands for the control plane and UP stands for the user plane.
If a user plane subsystem is faulty, CS services are released, and PS services are interrupted
and then reconnected. The user-plane processing capability decreases, but the other functional
user plane subsystems still work in resource pool mode.
The redundancy design is the same for the BSC6900 and BSC6910 user planes. Neither
supports user-plane service backup.
5.4 BSC/RNC Resource Sharing
Control plane resource sharing is used to share the CPU usage and memory. When the CPU
usage of a certain control-plane processing unit is too high or the memory of a certain control-
plane processing unit is insufficient, new calls are forwarded to other control-plane processing
units with light load.
The RNC implements dynamic resource sharing based on the resource pool and load
balancing. If a certain user-plane processing unit is overloaded, new services are forwarded to
other user-plane processing units with light load.
For details on load sharing, see Flow Control Feature Parameter Description.
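The forwarding rule for new calls can be sketched as follows. The thresholds (`cpu_limit`, `min_free_mb`), the `ProcessingUnit` type, and the function name are assumptions chosen for illustration; the actual criteria are described in Flow Control Feature Parameter Description.

```python
from dataclasses import dataclass


@dataclass
class ProcessingUnit:
    name: str
    cpu_usage: float      # fraction, 0.0 to 1.0
    free_memory_mb: int


def pick_unit(units, preferred, cpu_limit=0.8, min_free_mb=256):
    """Return the unit that should handle a new call, or None."""

    def overloaded(u):
        return u.cpu_usage > cpu_limit or u.free_memory_mb < min_free_mb

    # Keep the call on its preferred unit while that unit has headroom.
    if not overloaded(preferred):
        return preferred
    # Otherwise forward the call to the most lightly loaded healthy unit.
    candidates = [u for u in units if not overloaded(u)]
    if not candidates:
        return None  # no unit has headroom in this sketch
    return min(candidates, key=lambda u: u.cpu_usage)
```

Only new calls are redirected in this sketch; ongoing calls stay where they are, which matches the description above.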
The BSC6910 dynamically adjusts the numbers of multi-core DSPs allocated to the control
plane and user plane based on service requirements. These adjustments improve hardware
utilization by balancing the control-plane and user-plane processing capabilities.
The BSC6910 introduces a new service processing board: GPU. The GPU board can
simultaneously process user-plane and control-plane data. The BSC6910 monitors the user-
plane and control-plane resource usage and adjusts resources (multi-core DSPs) for each
plane proportionately. For details, see the RNC User Plane and Control Plane Resource
Sharing Feature Parameter Description.
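A proportional adjustment of this kind can be sketched as follows. This is an assumption-laden illustration: the function name `split_dsps`, the load inputs, and the `min_each` floor are hypothetical, and the actual algorithm is described in the RNC User Plane and Control Plane Resource Sharing Feature Parameter Description.

```python
def split_dsps(total: int, cp_load: float, up_load: float, min_each: int = 1):
    """Return (cp_dsps, up_dsps) in proportion to the two measured loads."""
    if total < 2 * min_each:
        raise ValueError("not enough DSPs to serve both planes")
    demand = cp_load + up_load
    # With no measured demand, fall back to an even split.
    share = 0.5 if demand == 0 else cp_load / demand
    cp = round(total * share)
    # Neither plane may drop below its minimum allocation.
    cp = max(min_each, min(total - min_each, cp))
    return cp, total - cp
```

For example, with eight DSPs and a control-plane load three times the user-plane load, this sketch assigns six DSPs to the control plane and two to the user plane.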
The BSC6910 automatically allocates a new base station or cell to an EGPUa or EGPUb
board. When configuring a base station or cell on the BSC6910, telecom operators do not
need to specify the subrack, slot, or subsystem. In addition, the BSC6910 monitors the
distribution of base stations and cells on the EGPUa or EGPUb boards. When EGPUa or
EGPUb boards experience a load imbalance because there are hotspot base stations or cells,
the BSC6910 adjusts the distribution of base stations or cells on the EGPUa or EGPUb boards
to achieve load balancing.
Dynamic reallocation of cells can be performed during peak hours, whereas dynamic
reallocation of base stations must be performed during off-peak hours. During cell
reallocation, UEs in the CELL_DCH state in the cell do not drop from the network. During
base station reallocation, services carried by the base station are interrupted, and UEs
controlled by the base station experience call drops. Operators can schedule the time for base
station reallocation. For details, see Controller Resource Sharing Feature Parameter
Description.
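The cell redistribution described above can be illustrated with a greedy rebalancing loop. The algorithm, the `tolerance` parameter, and the board labels used in the usage example are assumptions made for this sketch. The actual procedure is described in Controller Resource Sharing Feature Parameter Description. Only cells are moved here, because base station reallocation interrupts services.

```python
def rebalance(boards: dict, tolerance=10, max_moves: int = 100):
    """Greedy sketch: boards maps board name -> {cell id: load}.

    Cells are moved from the most loaded to the least loaded board until
    the load gap is within tolerance. Returns the list of moves made.
    """
    moves = []
    for _ in range(max_moves):
        totals = {b: sum(cells.values()) for b, cells in boards.items()}
        hot = max(totals, key=totals.get)
        cold = min(totals, key=totals.get)
        gap = totals[hot] - totals[cold]
        if gap <= tolerance:
            break
        # A move narrows the gap only if the cell load is below the gap.
        movable = [c for c, load in boards[hot].items() if 0 < load < gap]
        if not movable:
            break
        cell = max(movable, key=boards[hot].get)
        boards[cold][cell] = boards[hot].pop(cell)
        moves.append((cell, hot, cold))
    return moves
```

A usage example with hypothetical board labels and load figures:

```python
boards = {"EGPUa_0": {"c1": 40, "c2": 30, "c3": 20}, "EGPUa_1": {"c4": 10}}
moves = rebalance(boards)
```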
6 Network Redundancy
7 Fault Management
Fault management detects and records device faults and notifies users of the detected faults
and associated troubleshooting methods. This helps maintenance personnel quickly locate and
rectify faults, minimizing the impact of faults on network running. For details, see Fault
Management Feature Parameter Description.
8 Flow Control
Flow control is a protective measure for communications between the BSC or RNC and its
peer equipment. Flow control provides protection in the following ways:
l It restricts incoming traffic to:
Protect equipment from overload, thereby maintaining system stability.
Ensure that equipment can properly process services even under heavy traffic.
l It restricts outgoing traffic to reduce the load on the peer equipment.
For details about flow control for BSCs/RNCs, see Flow Control Feature Parameter
Description.
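Restricting incoming traffic can be illustrated with a token bucket, a common rate-limiting technique. This document does not state which algorithm the BSC or RNC uses, so the token bucket, the `TokenBucket` class, and its parameters are assumptions for illustration only.

```python
class TokenBucket:
    """Admit at most rate_per_s requests per second, with a burst allowance."""

    def __init__(self, rate_per_s: float, burst: float):
        self.rate = rate_per_s
        self.burst = burst
        self.tokens = burst   # start with a full bucket
        self.last = 0.0       # timestamp of the previous request, in seconds

    def allow(self, now: float) -> bool:
        # Refill in proportion to the elapsed time, capped at the burst size.
        elapsed = max(0.0, now - self.last)
        self.last = now
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False  # request rejected to protect the equipment
```

The same structure could model the restriction of outgoing traffic by applying the bucket on the sending side.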
9 Operation and Maintenance Reliability
9.1 Overview
The Operation & Maintenance System One-Key Recovery feature simplifies the backup and
recovery of the OS and the configuration of OMU data. In addition, this feature minimizes
the duration of service disruption caused by operation and maintenance operations.
This feature is applicable only to the DOPRA Linux OS and is mainly
used in the following scenarios:
Scheme 1
The USB creator is used to create the USB disk for installing the DOPRA Linux OS and the
OMU applications. The USB installation disk is plugged into the USB port on the OMU
board. The OMU board is then reset. Five to ten minutes later, the OS or OMU applications
on the OMU board are recovered.
Note that the OS, OMU applications, and the respective configuration information must be
stored onto the USB installation disk during the creation of the USB installation disk. Then,
Bootstrap scripts are generated on the USB installation disk to facilitate the start-up of the
OMU board through the USB installation disk.
The Bootstrap scripts first install the DOPRA Linux OS and configure the information for the
OS. Then, the Bootstrap scripts install the OMU applications and configure the information
for the OMU applications. Figure 9-1 shows the OMU board software recovery process.
When the OS on an existing OMU board is switched from non-DOPRA Linux to DOPRA
Linux, the USB creator is used to obtain the configuration information of that OMU board,
especially the network configuration information, the OMU applications configuration
information, and the NE confirmation information. Based on the
information obtained, the USB creator creates a USB installation disk for installing the
DOPRA Linux OS. The USB installation disk is plugged into the USB port on the OMU
board. The OMU board is then reset. Five to ten minutes later, the switchover of the OS is
complete.
Scheme 2
When OMU hardware is not damaged, the files are backed up through the existing OS on the
OMU board. In this way, users can recover the OMU OS without using an external storage
medium.
Before recovering the OMU OS, connect a keyboard and a monitor to the OMU board and
then reset the OMU board. When the system boot menu is displayed, select the system
recovery option using the keyboard. The OMU board starts to install the DOPRA Linux OS
automatically. Five to ten minutes later, the OS on the OMU board is recovered. Figure 9-2
shows the OS recovery process for the OMU board.
If no keystroke is detected after the boot menu is displayed, the OMU board boots the default
OS and does not perform OS recovery.
The Operation & Maintenance System One-Key Recovery feature is activated by default for
the newly delivered OMUs, and the OS backup and the system recovery program are preset.
For the existing OMUs, this feature can be activated through an OS switchover or upgrade.
10 Hardware Reliability
There are two types of BSC or RNC board redundancy: board backup and resource pool.
Table 10-1 and Table 10-2 describe the BSC6910 and BSC6900 board redundancy, respectively.
NOTE
The BSC or RNC interface boards have an effective mechanism for fault detection and automatic
recovery. When the BSC or RNC detects that a certain proportion of resources of an interface board are
unavailable for a specified period of time, the BSC or RNC resets the interface board. If the faulty board
is the active one in a pair of active and standby boards, the BSC or RNC switches over the active and
standby boards. For example:
l The BSC or RNC resets an Iub interface board if a certain proportion of cells under the Iub interface
board are unavailable for a specified period of time because of a failure in Iub transmission links.
l The BSC or RNC resets an Iub interface board under the following conditions: The RRC connection
setup success rate in a cell is lower than a predefined threshold because of a failure in Iub
transmission links, the proportion of such cells under the Iub interface board reaches a predefined
cell threshold, the proportion of NodeBs having such cells reaches a predefined NodeB threshold,
and this situation persists for a specified period of time.
l If the BSC or RNC detects any transmission fault, the BSC or RNC reports an alarm instead of
resetting the interface board.
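The detection rule in this note can be sketched as a threshold that must be exceeded continuously for a hold time. The sample format, the function name, and the string return values are hypothetical. The actual thresholds and durations are not given in this document.

```python
def interface_board_action(samples, proportion_threshold, hold_seconds,
                           transmission_fault=False):
    """Decide what to do with an interface board.

    samples: list of (timestamp_s, unavailable_cells, total_cells),
    ordered from oldest to newest. Returns "alarm", "reset", or "none".
    """
    # A detected transmission fault is reported as an alarm, not a reset.
    if transmission_fault:
        return "alarm"
    breach_start = None
    for ts, unavailable, total in samples:
        if total > 0 and unavailable / total >= proportion_threshold:
            if breach_start is None:
                breach_start = ts
            # Reset only if the condition has persisted long enough.
            if ts - breach_start >= hold_seconds:
                return "reset"
        else:
            breach_start = None  # the condition cleared; start over
    return "none"
```

If the board to be reset is the active one of an active/standby pair, the description above indicates that a switchover takes place; that step is outside this sketch.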
11 Related Features
Prerequisite Features
None
Impacted Features
None
12 Network Impact
System Capacity
None
Network Performance
None
13 Engineering Guidelines
13.2 Deployment
13.2.1 Process
l New sites
The feature has been activated for the delivered OMU boards by default.
l Existing sites
Install the latest DOPRA Linux OS using the USB installation disk, or upgrade the
DOPRA Linux OS to the latest version using the controller upgrade tool.
13.2.2 Requirements
l New sites
N/A
l Existing sites
If the USB installation disk is used to install the DOPRA Linux OS, a USB disk
with a capacity of at least 2 GB must be ready.
If the controller upgrade tool is used to upgrade the DOPRA Linux OS, the
controller must run on the DOPRA Linux OS.
13.2.3 Activation
l Using the USB installation disk to install the latest DOPRA Linux OS
Prepare the USB installation disk for switching the OMU OS from non-DOPRA Linux
to DOPRA Linux. Next, use the USB installation disk to install DOPRA Linux. For
detailed operations, see Operation Guide to Switching OMU Operating System Through
USB Disks.
l Upgrading the DOPRA Linux OS to the latest version using the controller upgrade tool
Confirm the controller software version required by DOPRA Linux. For details, see
Guide to Dopra Linux Operating System Remote Patch Upgrade.
Upgrade the controller software version by referring to the controller upgrade
guide.
Upgrade the DOPRA Linux OS. For details, see Guide to Dopra Linux Operating
System Remote Patch Upgrade.
13.2.5 Deactivation
N/A
13.4 Troubleshooting
N/A
14 Parameters
15 Counters
16 Glossary
For the acronyms, abbreviations, terms, and definitions, see the Glossary.
17 Reference Documents
1. Operation and Maintenance Feature Parameter Description for GSM BSS or WCDMA
RAN
2. Controller Resource Sharing Feature Parameter Description for WCDMA RAN
3. Flow Control Feature Parameter Description for GSM BSS or WCDMA RAN
4. RNC in Pool Feature Parameter Description for WCDMA RAN
5. RNC Node Redundancy Feature Parameter Description for WCDMA RAN
6. BSC Node Redundancy Feature Parameter Description for GSM BSS
7. MSC Pool Feature Parameter Description for GSM BSS
8. SGSN Pool Feature Parameter Description for GSM BSS
9. TC Pool Feature Parameter Description for GSM BSS
10. Call Admission Control Feature Parameter Description for WCDMA RAN
11. Load Control Feature Parameter Description for WCDMA RAN
12. Overload Control Feature Parameter Description for WCDMA RAN
13. E2E Flow Control Feature Parameter Description for WCDMA RAN
14. Fault Management Feature Parameter Description for SingleRAN