customers", we keep striving to create new value for our customers. To
provide users with the latest and most practical maintenance experience and
methods, ZTE maintenance experts compiled this issue of the pocket book of
MSTP product maintenance experience. All the troubleshooting cases in this
pocket book come from maintenance practice and can help you maintain ZTE
MSTP products. We appreciate your comments and suggestions. Thanks!
Mismatching
3.3 AU not Configured with Service in 10G Optical Board of ZXMP
Unsuccessful
5.5 ZXMP S360’s OL1 Board Fault Causes Path Ring Switching Failure
10.1 10G Optical Boards of ZTE S390 Equipment and Marconi MSH64
Hardware Faults
After the O4CSD board is replaced on site, the network element (NE)
drops out of management and the service is interrupted. When the former
O4CSD board is reinserted, NE monitoring and the service both recover.
However, the RS and MS error codes still exist.
Cause Analysis
Troubleshooting
MSTP Routine Troubleshooting Manual
Based on the above, the fault is caused by a low NCP software version
and by an NM configuration that does not meet the configuration
requirements of the new O4CSD board version. As a result, the equipment
operates abnormally.
2. Reinsert the former O4CSD board, and confirm that NE monitoring
recovers.
4. Modify the version setting of the O4CSD board to 200 through the
NM software.
5. Pull out the O4CSD board, and then insert the new board.
Conclusion
During fault handling, when replacing the O4CSD board or the NCP board,
check whether the new board version is consistent with the on-site board
version.
Cause Analysis
The neighbor NEs on the ring did not report LOS or MS-RDI alarms,
indicating that the optical path is normal.
Troubleshooting
Command format: if -a
(1) Remove and reinsert the NCP board on site to see whether the problem
is solved. If not, proceed to the next step.
Conclusion
All 2M services of the 2# EP1 board in one site’s S360 equipment report
AIS, down time, and remote defect indication. The related channels on
the ring also report AIS and down time. The corresponding service at
the central site reports AIS and remote defect indication.
Cause Analysis
Troubleshooting
1. Reset the 2# EP1 board. If the fault still exists, proceed to the next
step.
6. Perform a loopback at the terminal side of the optical port for the
site that drops services. If the alarm disappears, the fault is in the
optical board of this site.
7. Replace this optical board on site and restore the previous data.
The problem is solved.
Conclusion
Cause Analysis
The red light and the green light go out simultaneously after POST,
indicating that the time division module of the CSB board fails to detect
the clock signal sent by the clock board. The cause is a crystal
oscillator failure on the clock board.
Troubleshooting
2. Replace it with the CSC board (with or without the time division
module) to see if it works normally.
3. In the test, the CSB board can work normally without the time
division module.
4. Debug with the CSB board with no time division module. The
devices connected to both ends of the equipment are found
reporting Loss of Frame (LOF), and the self-loop local end’s
5. Replace the clock board, and the problem is solved. Insert the time
division module into the CSB board, and the board works normally.
Conclusion
The software runs this test only on a CSB board fitted with the time
division module. If the cross-connect board works normally without the
time division module, or with a CSC board (with or without the time
division module), use the alarm of the time division module to locate
the fault, which is in the clock board. This eliminates a hidden fault
in the equipment.
Cause Analysis
All optical boards report LOS or LOF alarms, even after the optical
boards are self-looped. It is unlikely that all optical boards are
damaged.
First, check the NCP board, which could be reporting the alarms in
error.
Then, check whether the clock board is faulty, because a faulty clock
board can make the framing clock unusable across the whole system, so
that the signals transmitted by the optical board cannot form frames.
Troubleshooting
Conclusion
The optical boards in self-loop give out the alarm. The problem may be
caused by the self-loop optical boards, or by the NCP board or clock
board.
Checking NE A:
Checking NE B:
Checking NE C:
Cause Analysis
Analyze the performance data of the line first. Three kinds of error
codes are monitored through overhead bytes on the line: B1, B2, and B3.
They monitor the regenerator section, the multiplex section, and the
higher-order path respectively, each between its own start point and
end point.
Troubleshooting
2. Self-loop the 5# optical board of the local site A. If there are still
error codes in the local site, the fault is in NE A.
Conclusion
If there are B1 error codes, the fault lies between two adjacent points.
If the optical power is normal, the fault is in the optical board. Ignore
the B2, B3, and V5 error codes at first. After the B1 error codes are
cleared, if the problem still exists, handle the B2, B3, and V5 error
codes in turn.
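The order of work described in this conclusion can be sketched as a small helper. This is only an illustration: the function name and the shape of the input (a dictionary of error counts read from the NM performance data) are assumptions, not part of any ZTE tool.

```python
# Assumed priority order, taken from the conclusion above: clear B1 first,
# then B2, B3, and V5.
PRIORITY = ("B1", "B2", "B3", "V5")


def next_error_to_handle(counts):
    """Return the error type to work on first, or None if the line is clean.

    `counts` is a hypothetical dict such as {"B1": 12, "B3": 4}; missing
    keys are treated as zero.
    """
    for kind in PRIORITY:
        if counts.get(kind, 0) > 0:
            return kind
    return None


print(next_error_to_handle({"B1": 12, "B2": 30, "V5": 7}))  # B1
print(next_error_to_handle({"B2": 0, "B3": 4}))              # B3
```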
Note:
Cause Analysis
Troubleshooting
3. Remove and reinsert the CSC board. The fault still exists.
4. Pull out the TCS 16x16 module inserted in the CSC board, and
reinsert it. The fault disappears.
Conclusion
Note:
When replacing the CSC board, make sure the TCS time division module on
it is inserted firmly. Otherwise, the board may not function normally.
Performance Faults
Cause Analysis
A fault with only B1 errors is relatively easy to handle. B1 error
codes have the following causes:
Clock failure
Troubleshooting
If the service switching is normal, switch the clock board. The fault
still exists.
Check whether the received light is normal. If not, check whether the
internal or external connections of the ODF rack are loose. If it is
normal, check the optical interface inside the optical board. Although
this situation is very rare, it needs to be checked.
Conclusion
The fault analysis lists four possible causes. In normal use, it is very
rare for the received light to drop suddenly below the sensitivity of
the optical board, so the third cause is the rarest. Between the optical
board at the transmitting end and the one at the receiving end, most
faults occur at the transmitting end. Therefore, start troubleshooting
at the transmitting end.
Data Configuration Faults
Cause Analysis
LFD: When the GFP frame header cannot be locked (the receiver is in the
search or pre-sync state), the LFD alarm is reported. Once the frame
header is locked, the alarm disappears. These alarms are caused by
inconsistent encapsulation protocols at the two ends. Therefore, both
ends should select the same GFP encapsulation. If a V1.0 SFE board is
used at one end, it should be upgraded to V2.0.
Troubleshooting
Conclusion
In case of this fault, first check whether the encapsulation protocols
of the Ethernet boards at the two ends are consistent. If not, set both
ends to the same encapsulation protocol.
Note:
Cause Analysis
Troubleshooting
Conclusion
The equipment data is compared with the NM database, and the two are
consistent.
Cause Analysis
Check the history alarms and the NM settings by restoring the field data.
The Idle AU detection item under the Alarm menu in the field NM data is
found to be enabled, which causes an AU that carries no service to
report the AU-AIS alarm.
In the SNCI mode, the system sends AU-AIS on idle channels by default.
For instance, suppose site A and site B are interconnected. Site A sends
AU-AIS on the idle channel. If site B is configured with Idle AU Channel
Detection, site B detects the AU-AIS and reports it.
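The interaction between the two sites can be written down as a tiny truth function. This is a sketch of the reasoning in the paragraph above, not NM software behaviour taken from ZTE documentation; the function and parameter names are hypothetical.

```python
def idle_au_raises_ais(peer_sends_ais_on_idle, idle_au_detection):
    """Return True if the local site reports AU-AIS on an AU without service.

    peer_sends_ais_on_idle: the far end inserts AU-AIS into idle channels
        (the default in the mode described above).
    idle_au_detection: the local "Idle AU detection" item is enabled.
    """
    return peer_sends_ais_on_idle and idle_au_detection


# Site A sends AU-AIS on the idle channel, site B has detection enabled.
print(idle_au_raises_ais(True, True))   # True
# Same link with detection disabled at site B: no alarm is reported.
print(idle_au_raises_ais(True, False))  # False
```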
Troubleshooting
Conclusion
Power Faults
Service boards in some slots report channel alarms or pointer loss
alarms.
Take the expansion subrack as an example: it holds the following
boards, as shown in Figure 4-1.
However, the service that the first AUG of the 7# OL4 optical board
delivers to the two EP1 boards in slots 22# and 23# always reports the
TU12 channel alarm indication signal and TU12 loss of pointer. The
services of the tributary boards in other slots are all normal.
Cause Analysis
The subrack holds two power boards. Because the output current of one
power board is too low, that board cannot supply power and instead
becomes a load. As a result, the power supplied to the boards in
specific slots is too low, and those boards cannot work normally.
Troubleshooting
Remove the power clock board whose output is too low, or replace it.
Conclusion
The stability of the power clock board’s output current affects the
normal operation of all boards. Therefore, when more than one board
malfunctions, first check the operation status of the power clock
board.
The S360 equipment holds multiple boards but only one power board,
which may lead to service failure on some boards.
Cause Analysis
The subrack holds multiple boards. Although they can work, the power
supply and voltage are insufficient, so some high-power chips on certain
boards cannot work normally.
Troubleshooting
Conclusion
When many boards are inserted in the subrack, the S360 equipment
should be configured with dual power clock boards.
Protection Faults
Cause Analysis
Read the board versions through the NM software; a difference is found.
Refer to the table below for details.
NE Name NE A NE B
The software release dates of NE A’s NCP board and LP16 board are too
far apart, so the version of the new LP16 board is inconsistent with
that of the old NCP board.
Troubleshooting
Upgrade the software of the NCP, LP16 and CSC boards, and the fault
disappears.
Conclusion
A final software version was released before production of the S360
device was discontinued. Boards running old versions in the existing
network should be upgraded to this final version where possible, to
avoid faults caused by excessive version differences between the boards
of a device.
The E1 lines on the BTS side and the BSC DDF side are already completed,
and the site is in commercial service. Therefore, path protection must
be completed in the shortest possible time without remaking the E1
lines.
Cause Analysis
As long as the E1 ports on the BSC side and the BTS side are unchanged,
the path protection configuration can be completed by adjusting the
slot configuration at the BSC side and the BTS side.
Note:
Troubleshooting
1. Calculate the sites working normally over the ring and observe the
performance of the network formed in ring, including 15-minute
4. Export the current slot and port configuration information, that is,
the report in “related service query”, and save it.
5. Take all NEs on the ring off-line (the configuration for this step
can be checked outside the equipment room).
10. Download the updated slot configuration to all NCP boards over
the ring.
12. Check the operation status of the sites with the BSC engineers to
make sure that the path protection has been configured.
Conclusion
1. Pay attention to the slot configuration mode and method at the initial
stage of an engineering project. Carry out periodic checks and patrols,
so that problems can be detected and handled in time.
3. Fully understand the relationship between the port and the slot, as
well as the configuration method of path protection.
[Figure: network diagram; surviving labels are NEs B and C and board slots 11# and 5#]
Cause Analysis
Troubleshooting
Conclusion
Cause Analysis
Troubleshooting
3. Then, read the K1K2 bytes of each site. Read the 30000 byte of the
ROM register of the LP16 board in slot 7. The read-out data is
4 bytes long. The bytes are, in order, the K1 byte received in the
east direction, the K2 byte received in the east direction, the K1
byte received in the west direction, and the K2 byte received in
the west direction. In normal operation, the first four bits of the
K1 byte and the last four bits of the K2 byte should all be 0.
However, at one site the first four bits of the K1 byte and the
last four bits of the K2 byte are not all 0. Therefore, the 7# LP16
board of this site is suspect.
(2) Suspend the APS protocol of the two sites whose optical fiber is
disconnected.
(3) Set the 77777 register of the two sites’ cross-connect board to
01.
(5) When the LP16 board is normal, start up the APS protocol of
the two sites.
(6) Set the 77777 register of the two sites’ cross-connect board to
00.
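The K1/K2 check in step 3 above can be sketched as follows. The sketch assumes that the four bytes arrive in the order K1 east, K2 east, K1 west, K2 west, and that bit 1 is the most significant bit, so that the "first four bits" are the high nibble and the "last four bits" the low nibble. The function names are hypothetical, and the register read itself is not shown.

```python
def decode_k1k2(raw):
    """Split the 4-byte register read into the received K bytes.

    Assumed byte order: K1 east, K2 east, K1 west, K2 west.
    """
    if len(raw) != 4:
        raise ValueError("expected exactly 4 bytes")
    k1_east, k2_east, k1_west, k2_west = raw
    return {"east": (k1_east, k2_east), "west": (k1_west, k2_west)}


def is_idle(k1, k2):
    """True when the first four bits of K1 and the last four bits of K2 are 0.

    Assumes MSB-first bit numbering (high nibble of K1, low nibble of K2).
    """
    return (k1 >> 4) == 0 and (k2 & 0x0F) == 0


def suspicious_directions(raw):
    """Return the directions whose K1/K2 values are not in the idle pattern."""
    decoded = decode_k1k2(raw)
    return [d for d, (k1, k2) in decoded.items() if not is_idle(k1, k2)]


print(suspicious_directions(bytes([0x00, 0x00, 0x00, 0x00])))  # []
print(suspicious_directions(bytes([0x00, 0x00, 0xB1, 0x0A])))  # ['west']
```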
Conclusion
The network structure is shown in Figure 5-4, and the central office
is located at NE A.
Cause Analysis
1. For the path ring protection failure, first check whether the
service configuration is correct. The protecting path is checked
and found to be normal. To ensure that the NM data and the NE data
are consistent, timeslots are re-delivered to each site of the
protecting path, yet the fault still exists.
4. To locate the faulty site, perform a terminal-side AU loopback on
NE B’s 10# optical board; the tributary board alarm of NE B still
exists. Perform a terminal-side loopback on NE C’s 7# optical
board, and the alarm of NE A corresponding to NE B’s service
disappears. Thus, the fault is in NE B.
Troubleshooting
2. Then, replace the EP1 board; the problem persists.
3. Finally, replace the OL1 board, and the fault disappears. In this
way, the fault is judged to be in the 10# OL1 board of NE B.
Conclusion
During an MS switching test on the ring, after the switching between
NE D and NE E, some services suffer a short break every 3 to 5 seconds.
After switching back, the service recovers.
Cause Analysis
Troubleshooting
4. Replace the 24# LP16 board, and the fault disappears. The fault of
the 24# LP16 board is likely caused by high temperature.
Conclusion
Clean the dust screen of the equipment periodically and check whether
the fans of the equipment work normally. If a fan is faulty, replace it
promptly.
Chapter 6 NM Faults
6.1 E300 NM Alerts “Database Disconnected”
Fault Description
While the E300 V3.18R2 NM software is in use, a sudden power failure restarts the
computer and the NM software, and login to the NM client then fails. The detailed
information table displays “Database disconnected”.
On the dbman tool page, execute 3 and then 1. Check the operation
status of each database. The config. database is found to be in the
suspend status, and its object status is unknown, as shown in
Figure 6-1.
Cause Analysis
1. Analyze the log file and find that the config. database is suspended
in the Sybase database, so the dbsvr.exe process cannot start up
normally.
The faulty section in ..\db\dbsvr.log shows that the dbsvr.exe process
keeps restarting but fails every time.
DBSVR exits
DBSVR exits
Database 'TransDB' has not been recovered yet - please wait and try
again.
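A log like the excerpt above can be summarized with a few lines of script. This is a minimal sketch: only the two phrases quoted in the excerpt are taken from the source, while the overall log layout, the function name, and the sample text are assumptions.

```python
import re

# Phrases quoted in the log excerpt above. The rest of the log format is
# not shown in the text, so each line is matched by content only.
EXIT_MARK = "DBSVR exits"
NOT_RECOVERED = re.compile(r"Database '([^']+)' has not been recovered yet")


def summarize_dbsvr_log(text):
    """Count dbsvr exits and collect databases reported as not recovered."""
    exits = 0
    stuck = set()
    for line in text.splitlines():
        if EXIT_MARK in line:
            exits += 1
        match = NOT_RECOVERED.search(line)
        if match:
            stuck.add(match.group(1))
    return exits, sorted(stuck)


sample = """DBSVR exits
DBSVR exits
Database 'TransDB' has not been recovered yet - please wait and try again."""
print(summarize_dbsvr_log(sample))  # (2, ['TransDB'])
```

Repeated exits together with a database that is never recovered match the pattern described in the cause analysis.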
4. To sum up, the config. database in Sybase fails after the sudden
power failure and is suspended. The cause is that, in E300 V3.18R2
and later NM software, the internal database write mode of Sybase
is changed to asynchronous. Although this makes writing to the
database more efficient, it greatly increases the risk that the
database is suspended if power fails suddenly during a write.
Troubleshooting
Use the DBMAN tool to clear the suspended state of Sybase.
The steps for re-creating the database after recovery are as follows:
2>go
1>use master
2>go
2>where name="database_name"
3>go
2>go
(3)Restart Sybase by using the dbman tool (execute 2 and then 1) for
Conclusion
Cause Analysis
Troubleshooting
Conclusion
For some S320 NEs in the E300 NM software, all board indicator lights
are shown in grey in the boards view, unlike the boards view of other
S320 NEs (where they normally flash slowly in green).
Cause Analysis
Troubleshooting
4. After the version problem is confirmed, there are two solutions:
(1) leave the status unchanged, since the board indicator light
function does not affect normal maintenance; (2) upgrade the NCP
version of the S320 equipment so that it is consistent with the
other NCP versions. The second solution is recommended, because it
restores the indicator light function.
Conclusion
ECC Faults
Cause Analysis
Troubleshooting
Conclusion
Sites A and B use the ZXMP S360 equipment. Sites C, D, and E use the
ZXMP S320 equipment. Site A is the access NE, as shown in Figure 7-1.
The NM software at site A can monitor all other sites except site B.
Cause Analysis
1. Telnet to the NCP board at site A. Check the connection status of
the port and find that the route in the optical direction of site B
is already established.
4. Telnet to the NCP board at site A and check the ECC route. If it is
normal, the optical board has no fault.
Troubleshooting
Reset the NCP board of site A, and the problem is solved. Keep
observing. If the problem recurs, replace the board.
Conclusion
Get familiar with the ECC-related commands and their usage, and
diagnose the problem from several angles.
Clock Sync Faults
Cause Analysis
The clock board feeds the external clock or the extracted line clock
into the phase-locked circuit for phase comparison. Therefore, the
quality of this board’s crystal oscillator affects the quality of the
clock.
Troubleshooting
2. Get the clock status of each site; all sites are locked. However,
sites B, C, and D extract the clock from the direction of site A,
site H extracts the clock of site I, and sites A and I use external
clocks. After analysis, the clock instability should be caused by
a configuration error.
Note:
The S1 byte is not configured only in the ring network. Here, the cause
of line clocks being extracted from both sides is that the S1 byte is not
enabled.
4. Enable the S1 byte and do not change the clock setting. Extract the
clock from the equipment adjacent to site A (here is the G.811
clock) as the external clock of site A.
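Once the S1 byte is enabled, each NE can compare the quality of its candidate clock sources instead of extracting line clocks blindly from both sides. The sketch below illustrates that selection using the SSM codes defined for the S1 byte in ITU-T G.707; the function name and the input mapping are hypothetical and do not describe the ZXMP implementation.

```python
# SSM quality codes carried in bits 5-8 of the S1 byte (ITU-T G.707).
# A lower rank means a better clock.
SSM_RANK = {
    0x02: 1,  # G.811 primary reference clock
    0x04: 2,  # G.812 transit (SSU-A)
    0x08: 3,  # G.812 local (SSU-B)
    0x0B: 4,  # G.813 SDH equipment clock (SEC)
}
DO_NOT_USE = 0x0F


def pick_clock_source(candidates):
    """Pick the best source from a hypothetical {name: S1 byte} mapping.

    Sources marked "do not use" or carrying an unknown code are skipped.
    Returns None when no candidate is usable.
    """
    usable = {}
    for name, s1 in candidates.items():
        ssm = s1 & 0x0F  # bits 5-8 are the low nibble
        if ssm != DO_NOT_USE and ssm in SSM_RANK:
            usable[name] = SSM_RANK[ssm]
    if not usable:
        return None
    return min(usable, key=usable.get)


print(pick_clock_source({"east": 0x02, "west": 0x0B}))  # east
print(pick_clock_source({"east": 0x0F, "west": 0x0B}))  # west
```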
Conclusion
Adopt SSM
(1) Ring network, which has the access of two external clocks to
achieve active/standby protection.
ASON Faults
Cause Analysis
1. Check the connection status of the call whose set-up failed; no
abnormal attribute configuration or limiting policy is found.
Troubleshooting
2. Check whether the service path can take another route, and free up
AU4 resources where possible.
Conclusion
2. When the service is interrupted, check the lines that the service
passes through to see whether there is idle bandwidth for a newly
established connection. If the bandwidth is insufficient, adjust
some connections manually and give priority to recovering the
service. It is suggested to use no more than 60% of the network
resources, so that the remaining resources are reserved for
recovery.
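The 60% guideline is simple arithmetic on link occupancy. The following sketch assumes that occupancy is counted in AU4 units per link; the function names and the sample figures are illustrative only.

```python
RECOMMENDED_MAX = 0.60  # ceiling suggested in the conclusion above


def utilization(used_au4, total_au4):
    """Fraction of AU4 resources in use on a link."""
    if total_au4 <= 0:
        raise ValueError("total_au4 must be positive")
    return used_au4 / total_au4


def has_recovery_headroom(used_au4, total_au4):
    """True when the link stays at or below the recommended ceiling."""
    return utilization(used_au4, total_au4) <= RECOMMENDED_MAX


# Hypothetical link with 64 AU4s.
print(has_recovery_headroom(38, 64))  # True  (about 59.4% used)
print(has_recovery_headroom(45, 64))  # False (about 70.3% used)
```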
Cause Analysis
1. Check the call attribute settings of the SNCP service and the TE
links that the protective connection passes through. The route
policy set for this call is found to have both the “node
irrelevant” and “link irrelevant” items selected.
Troubleshooting
1. Edit the route policy of this 1+1 SNCP and deselect the “node
irrelevant” item. Then optimize re-routing for protective
connection, to set up protective connection automatically.
2. Or keep the present route policy unchanged and wait for the
recovery of lines and protective route.
Conclusion
1. When configuring 1+1 SNCP service, take the route in the existing
network into consideration, including work/protective route and
the possible recovery route. If a section of optical path is broken,
the work/protective connection should be able to find the third
independent route as the recovery route.
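Whether a candidate protective route satisfies the route policy can be checked before the service is configured. The sketch below reads “node irrelevant” as node-disjoint and “link irrelevant” as link-disjoint, which is an interpretation rather than a definition from the source; paths are given as hypothetical lists of node names.

```python
def links_of(path):
    """Return the set of undirected links along a path given as a node list."""
    return {frozenset(pair) for pair in zip(path, path[1:])}


def is_link_disjoint(work, protect):
    """True when the two paths share no link."""
    return links_of(work).isdisjoint(links_of(protect))


def is_node_disjoint(work, protect):
    """True when the two paths share no intermediate node.

    The common end points are ignored.
    """
    return set(work[1:-1]).isdisjoint(protect[1:-1])


work = ["A", "B", "C", "D"]
protect = ["A", "E", "C", "F", "D"]
print(is_link_disjoint(work, protect))  # True
print(is_node_disjoint(work, protect))  # False, both pass through C
```

In this example the protective route would be rejected while the node constraint is selected, and accepted once only the link constraint remains, which mirrors the first troubleshooting option.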
Chapter 10 Interconnection Faults
10.1 10G Optical Boards of ZTE S390 Equipment and Marconi MSH64 Equipment Fail in Interconnection
Fault Description
Cause Analysis
Test the 10G optical board of the ZTE S390 equipment with an SDH
analyzer (ONT50); no problem is found. With either the port self-loop
or the VC4 timeslot self-loop, the meter shows no problem and no alarm
appears. The error code count is 0.
Hint:
streams correctly. That is, first locate the starting position of each
STM-N frame, and then identify the position of the corresponding
overhead and payload in each frame. The A1 and A2 bytes perform the
framing function. Through them, the receiving end can locate and
separate the STM-N frame from the information flow, and then find a VC
information packet in the frame through the location of the pointer.
How does the receiving end locate the frame through the A1 and A2 bytes?
A1 and A2 have fixed values, that is, fixed bit patterns: A1: 11110110
(F6H), A2: 00101000 (28H). The receiving end checks each byte in the
signal flow. When 3N consecutive A1 (F6H) bytes appear, followed by 3N
consecutive A2 (28H) bytes (an STM-1 frame has three A1 bytes and three
A2 bytes), it judges that it has received one STM-N frame. The receiving
end distinguishes different STM-N frames by locating the starting point
of each frame, and in this way separates the frames from one another.
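The framing search described above can be sketched in a few lines. The A1 and A2 values come from the text; the function name and the sample stream are hypothetical, and the sketch assumes the stream is already byte-aligned, whereas real framers hunt for the pattern bit by bit.

```python
A1 = 0xF6  # 11110110
A2 = 0x28  # 00101000


def find_frame_start(stream, n=1):
    """Return the offset of the first 3N x A1 + 3N x A2 pattern, or -1.

    `stream` is a bytes object and `n` is the N of STM-N.
    """
    pattern = bytes([A1]) * (3 * n) + bytes([A2]) * (3 * n)
    return stream.find(pattern)


# Hypothetical STM-1 byte stream: two stray bytes, the framing word,
# then the start of the remaining overhead.
stream = bytes([0x00, 0x11]) + bytes([A1] * 3 + [A2] * 3) + bytes([0x01, 0x02])
print(find_frame_start(stream, n=1))  # 2
```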
Troubleshooting
VC4-RDI error, and the ATM data service fails, as shown in Figure
10-4.
Cause Analysis
Troubleshooting
Conclusion
Note:
Explanation on C2 byte:
For ET1, TT1, and the data board using TU11 and TU12, the
transmit value of C2 byte is fixed as 02.
For ET3, TT3, and VC4, the transmit value of C2 is fixed as 02,
and VC3’s C2 value is fixed as 04.
For the data board using VC4, the transmit value of C2 is 0x16
(HDLC/PPP encapsulated), 0x18 (LAPS encapsulated), or
0x1B (GFP encapsulated).
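When two vendors interconnect, comparing the C2 value that one end sends with the value the other end expects is a quick first check. The lookup below uses only the VC4 data-board values listed above; the function names are hypothetical.

```python
# C2 values for data boards using VC4, as listed in the explanation above.
VC4_DATA_C2 = {
    0x16: "HDLC/PPP",
    0x18: "LAPS",
    0x1B: "GFP",
}


def describe_vc4_data_c2(value):
    """Return the encapsulation named by a C2 value, or None if not listed."""
    return VC4_DATA_C2.get(value)


def c2_mismatch(received, expected):
    """True when the received C2 differs from what the local end expects."""
    return received != expected


print(describe_vc4_data_c2(0x1B))  # GFP
print(c2_mismatch(0x16, 0x1B))     # True
```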