IKE S. GABRIEL
C. ESCALATION PLANNING
Guide the Crisis Management team on whom to call and where to seek help when facing a catastrophic outage or event, with the aim of containing the crisis.
Provide instructions on what initial action to take and after how much time the next level of help must be contacted.
COLLABORATE WITH VENDOR LOCAL SUPPORT AND GLOBAL TAC (R&D)
Require Third Level of Support?
CONTACT VENDOR LOCAL SUPPORT
COORDINATE FAULTS/OUTAGE WITH RESOLVING UNITS (RAFO, NTBN, DNS, CORE, ETC.)
Immediate Assignment of Fault to Resolving Unit
OCCURRENCE OF CATASTROPHIC OUTAGE AND EVENT
NOC
Detects the outage and starts isolating the problem. Notifies the Head of NMC.
- Monitors clearing of alarms and traffic normalization.
- Continues coordination with the Resolving Team. Sends updates.
- Monitors clearing of alarms and traffic normalization.
- Monitors clearing of alarms and traffic normalization.
Collaborates with Vendor Local Support to contain the crisis. Correlates events and KPIs with outages.
Vendor Local TAC takes over responsibility to neutralize the fault and assigns a product expert to solve the problem.
Deploy the Quick Reaction Team to the location of the crisis. Contact the Vendor Support Team for assistance if necessary.
O&M Support investigates the problem and attempts to neutralize it, seeking support from Vendor experts if required. The Vendor O&M team provides remote support while the Technical Expert is on the way to the NOC or the affected sites.
Collaborate with Vendor Support to contain the crisis. Investigate workarounds and trigger the contingency plan.
Collaborate with Global TAC/R&D to neutralize the fault. Work with DMPI/DTPI O&M support to resolve the problem.
Vendor Global TAC takes over responsibility to neutralize the fault. Works with the local team to resolve the problem.
- Collaborate with the Heads of the O&M Resolving Team.
- Get updates from NMC and brief the Head of NOC/NOAT.
- Await instructions from the Head of NOC/NOAT on what further action to take.
Catastrophic Event/Outage Duty Officer

Duty Officer     | 1st Duty Week | 2nd Duty Week | 3rd Duty Week | 4th Duty Week
Alex Galzote     | Week 1        | Week 14       | Week 27       | Week 40
Arnold Melgarejo | Week 2        | Week 15       | Week 28       | Week 41
Richard Cadungog | Week 3        | Week 16       | Week 29       | Week 42
Tante Valdez     | Week 4        | Week 17       | Week 30       | Week 43
Arnold Pedro     | Week 5        | Week 18       | Week 31       | Week 44
PJ Capiral       | Week 6        | Week 19       | Week 32       | Week 45
Melai Sabidong   | Week 7        | Week 20       | Week 33       | Week 46
Alex Galzote     | Week 8        | Week 21       | Week 34       | Week 47
Arnold Melgarejo | Week 9        | Week 22       | Week 35       | Week 48
Richard Cadungog | Week 10       | Week 23       | Week 36       | Week 49
Tante Valdez     | Week 11       | Week 24       | Week 37       | Week 50
Arnold Pedro     | Week 12       | Week 25       | Week 38       | Week 51
PJ Capiral       | Week 13       | Week 26       | Week 39       | Week 52
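The roster above repeats every 13 weeks, so the officer on duty can be looked up from the week number. The following is a minimal sketch, assuming the names map to Week 1 through Week 13 in the order listed; the function name `duty_officer` is hypothetical and not part of the plan.

```python
# Names in the order they appear in the roster. Mapping each position to
# Week 1..13 is an assumption based on that order.
ROSTER = [
    "Alex Galzote", "Arnold Melgarejo", "Richard Cadungog", "Tante Valdez",
    "Arnold Pedro", "PJ Capiral", "Melai Sabidong", "Alex Galzote",
    "Arnold Melgarejo", "Richard Cadungog", "Tante Valdez", "Arnold Pedro",
    "PJ Capiral",
]


def duty_officer(week: int) -> str:
    """Return the duty officer for a duty week numbered 1..52."""
    if not 1 <= week <= 52:
        raise ValueError("week must be between 1 and 52")
    # The 13-slot roster repeats four times over the 52 duty weeks.
    return ROSTER[(week - 1) % len(ROSTER)]


print(duty_officer(20))  # Melai Sabidong
```

Week 20 falls in the second duty cycle, in the same roster slot as Week 7.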
CHINA R and D
Customer Support Head- Gary Cai Service Delivery Mgr. Peter Zhang
Maintenance Manager
Nelson Villoria
Name
Tel./Mobile Nos.
02 8190532 0922 8482934 0922 8016301 0917 8306301 0922 8006189 0922 8990212 0922 8990968 0917 8679888 0922 8341001 0917 5954485 0922 3613941 0916 4188028 0922 8850125 0922 9508457 0917 9017797 0922 8991619 0917 8513210
Emails
Name
Emails
HOTLINE
PH-TAC       | phil_support@huawei.com
liwei        | li.wei@huawei.com
maxin        | maxin@huawei.com
xujiaxiang   | xujiaxiang@huawei.com
ruanjiahai   | jhruan@huawei.com
zhangligang  | zhangligang@huawei.com
caigaoyang   | caigaoyang@huawei.com
huang zhan   | huangzhan@huawei.com
Panda Huang  | huangxiarong@huawei.com
xue shi jun  | xueshijun@huawei.com
TSD Director Service Delivery Manager Customer Support Manager Technical Director Maintenance manager Digitel Network CTO Core Network-Team leader
Wang Guodong
Joel Sabidong Liu Dongbo Shu Peng Fang Yongliang
0922 3861982
0922 8115635 0922 5306045 0933 9471514 0908 1577115
Wangguodongph@huawei.com
joelvs@huawei.com liudongbo@huawei.com shupeng1@huawei.com fangyongliang@huawei.com
HW Maintenance Team
Wireless-Team leader Data Com-Team leader A&S-Team leader Optical Network-Team leader
Report Time
Final Solution Time (For Bug) System patch (software update) service starting time <45 days
Next update
Next update
Note: Maintenance service for DMPI will end on 2012-12-31, while 80% of the maintenance service for DTPI has already expired.
5 Mins
HW HQ Expert Group
HW Maintenance Manager
10 Mins
5 Mins
Info Sharing
DMPI Maintenance leader
HW Maintenance leader
5 Mins
HW Engineer
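The timed hand-offs in this chart can be modelled as a chain of support levels, each contacted once a set number of minutes has passed without resolution. The sketch below is illustrative only: the order of the chain and the cumulative minute thresholds are assumptions, since the chart gives the level names and the 5-minute and 10-minute intervals but not an unambiguous mapping between them. `CHAIN` and `levels_to_contact` are hypothetical names.

```python
# Hypothetical chain: (support level, minutes after the fault report at which
# that level is contacted). Order and thresholds are assumptions, not values
# confirmed by the plan.
CHAIN = [
    ("HW Engineer", 0),
    ("HW Maintenance leader", 5),
    ("HW Maintenance Manager", 10),
    ("HW HQ Expert Group", 20),
]


def levels_to_contact(elapsed_minutes: float, chain=CHAIN) -> list:
    """Return every level that should have been contacted by now."""
    return [name for name, after in chain if elapsed_minutes >= after]


print(levels_to_contact(7))
```

Keeping the thresholds in one table means the escalation timing can be changed without touching the lookup logic.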
Rudi Sponsor
Jenny
Rhoda Campos NOC IN & VAS Team
IN & VAS Engineer Yuzhenhua/Jo are/Shi wei/Errol/Ryan/Lloyed Optical & MW Engineer Mark Rey
If this situation occurs, NMC can deal with it in the following ways:
o Redirecting traffic through unaffected parts of the network.
o Reducing the demand on the network by blocking lower-priority users.
o Implementing traffic controls such as call gapping, SS7 link distribution, BSC/RNC blocking, and activation of the MSC congestion reduction mechanism.
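Call gapping, one of the traffic controls named above, limits load by admitting at most one call attempt per fixed time gap and rejecting the rest. The following is a simplified sketch of that idea, not the mechanism of any particular MSC; the class name `CallGapper` and the 2-second gap are assumptions chosen for illustration.

```python
class CallGapper:
    """Admit at most one call per gap interval; reject calls in between."""

    def __init__(self, gap_seconds: float):
        self.gap_seconds = gap_seconds
        self._last_admitted = None  # time of the most recently admitted call

    def admit(self, now: float) -> bool:
        """Return True if a call arriving at time `now` (seconds) is admitted."""
        if self._last_admitted is None or now - self._last_admitted >= self.gap_seconds:
            self._last_admitted = now
            return True
        return False


gapper = CallGapper(gap_seconds=2.0)
arrivals = [0.0, 0.5, 1.9, 2.0, 3.0, 4.1]
decisions = [gapper.admit(t) for t in arrivals]
print(decisions)  # [True, False, False, True, False, True]
```

A longer gap sheds more load, so the gap can be widened as congestion worsens and narrowed as traffic normalizes.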
In case of disaster, for instance, priority might be given to police and other emergency services. In cases of national emergency (war), government and military would be given priority.
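The prioritization described here can be sketched as an admission check that, during congestion, lets through only the user classes prioritized for the current situation. This is a minimal sketch under stated assumptions: the situation keys, the user class labels, and the rule that non-priority users are blocked only while the network is congested are all hypothetical illustrations, not settings from the plan.

```python
# Hypothetical mapping of situation -> user classes given priority.
PRIORITY_CLASSES = {
    "disaster": {"police", "emergency_services"},
    "national_emergency": {"government", "military"},
}


def admit(user_class: str, situation: str, congested: bool) -> bool:
    """Decide whether a call from `user_class` is admitted."""
    if not congested:
        # No demand reduction is needed, so nobody is blocked.
        return True
    return user_class in PRIORITY_CLASSES.get(situation, set())


print(admit("police", "disaster", congested=True))    # True
print(admit("ordinary", "disaster", congested=True))  # False
```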
CRISIS CLASSIFICATION
DESCRIPTION OF EMERGENCY SCENARIO Class Total Breakdown of Interconnection Links within other Operators Interconnection network such as SMART, GLOBE & PLDT, IGF Total Breakdown of BSC/RNC Due To Hardware Fault BSS/RNC Breakdown of the Media Gateway Hardware Affecting Multiple BSS/RNC BSC/RNC Breakdown of the SGSN/GGSN Hardware BSS/RNC Corrupted CDR data -Too many erroneous CDRs (50% of the total CDR Collector CDRs); Corrupt data - Too many duplicated CDR's (50% of the total CDR Collector CDRs) on counter reports Total failure of all Processes in the CDR Collector CDR Collector Total break down of Clustered Server or Charging Gateway Charging Gateway Total Breakdown of MSC Rectifier ENVR_NSS Total Breakdown of MSC Inverter ENVR_NSS Total Breakdown of BSS /RNC DC Power Distribution ENVR_NSS Total Breakdown of Transmission Backbone DC Power ENVR_NSS System/Distribution Failure of UPS Emergency Power ENVR_NSS Building Fire Alarm ENVR_NSS Breakdown of Fire Suppression System ENVR_NSS MSC High Room Temperature ENVR_NSS MSC Door Intrusion Alert ENVR_NSS Breakdown of more than 2 MSC Air-conditioning Unit ENVR_NSS Breakdown of Genset at the MSC ENVR_NSS Day Tank Fuel Critical Level at MSC ENVR_NSS AC Main Failure to MSC Rectifier ENVR_NSS Charging function stop from any GSN. 3G No transfer of CDRs to CDR Collector 3G Breakdown of one CDR Collector server 3G Breakdown of one CG LAN router/switch 3G Total break down of Ethernet switch on Gi interface. 3G Total break down of DNS/DHCP 3G Total break down of Border GW 3G Data transfer between HLR and SGSN not possible 3G Breakdown of 3G IP backbone LAN Switch 3G Total failure of Iub traffic to RNC 3G Breakdown of SS7 signaling links to HLR 3G Type
Quality of Service Hardware Hardware Hardware Application Application Application Hardware Hardware Hardware Hardware
Escalation Up to
Alert Code
Yellow Orange Orange Orange Orange red orange Orange Red Red Yellow
Hardware
Hardware Hardware Hardware Hardware Hardware Hardware Hardware Hardware Hardware Application Application Application Application Hardware Hardware Hardware Hardware Hardware Transmission Transmission
Yellow
Yellow Red Yellow Yellow Yellow Orange Orange Yellow Yellow Yellow Yellow Orange Orange Orange yellow yellow yellow yellow orange yellow
CRISIS CLASSIFICATION
Description of Emergency Scenario | Class | Type | Alert Code
Corruption of Vital Service Profile Data | IN-PPS | Application | Yellow
Unable to perform deduction/refund via PMC | IN-PPS | Application | Orange
Periodic fee is not functioning | IN-PPS | Application | Orange
No generation of call detail records in the IN | IN-PPS | Application | Orange
Failure of Oracle database | IN-PPS | Database | Orange
Retrieval from backup not possible | IN-PPS | Database | Orange
Total breakdown of IN platform | IN-PPS | Hardware | Red
IN Platform shared hard drive failure | IN-PPS | Hardware | Red
Loss of a non-redundant element | IN-PPS | Hardware | Yellow
Repetitive changeover to standby platform | IN-PPS | Hardware | Orange
Detection of multiple changeover to standby platform | IN-PPS | Hardware | Yellow
100% of the Bearer Connection is down | IN-PPS | Network | Red
100% Signaling link Failure on PCM/Trunk | IN-PPS | Network | Red
Layer 3 Switch routing/link Failure | IN-PPS | Network | Red
Total loss of call processing | IN-PPS | Services | Red
Total failure of recharging functions for all prepaid accounts | IN-PPS | Services | Red
Impossibility to carry out a basic operation function | IN-PPS | Services | Orange
Unable to perform outgoing calls | IN-PPS | Services | Orange
Incorrect generation or loss of call records | Core Network | Application | Orange
Loss of connection to CDR Collector | Core Network | Application | Orange
Two consecutive switchovers for HLR | Core Network | Hardware | Red
Data transfer between HLR and MSC/VLR not possible | Core Network | Hardware | Red
Total breakdown of HLR | Core Network | Hardware | Red
Total breakdown of MSC or MGW | Core Network | Hardware | Red
Two consecutive switchovers for MSC | Core Network | Hardware | Red
SG M3UA Overload for Multiple MSC | Core Network | Hardware | Red
Total loss of MSC call handling functions | Core Network | Transmission | Red
No location update for more than 70% of booked subscribers in VLR | Core Network | Transmission | Red
Total loss of the signaling links (SS7) MSC/HLR to SGW | Core Network | Transmission | Red
Breakdown of Transmission to DIGITEL LEC (PSTN/IGF) | Core Network | Transmission | Orange
Breakdown of Transmission to other PLMN (GLOBE & SMART) | Core Network | Transmission | Red
More than 50% of calls are rejected | Core Network | Transmission | Yellow
Total Loss of Connection between HLR and CDR Collector | Core Network | Transmission | Yellow
CRISIS CLASSIFICATION
Description of Emergency Scenario | Class | Type | Alert Code
Total Loss of Network Supervision to all MSC/MGW/SGW | OMC | Application | Yellow
Total Loss of Connection to more than 50% of total number of Transmission Backbone Nodes | OMC | Application | Yellow
Total Loss of Connection to GGSN or SGSN | OMC | Application | Yellow
Total Loss of Connection to OMC-R by more than one BSC | OMC | Application | Yellow
No Network Supervision for entire BSS network element | OMC | Application | Yellow
Total breakdown of one or more OMC-x or INMS server | OMC | Hardware | Yellow
Total Loss of an application critical to monitoring | OMC | Hardware | Yellow
IP backbone down | OMC | Hardware | Yellow
Database Corruption | OMC | Hardware | Yellow
More Than 10% of the Total BTS Isolated Due To Transmission Failure | Transmission | Network | Yellow
One (1) BSC Isolated Due To Transmission Failure | Transmission | Network | Yellow
One (1) RNC Isolated Due To Transmission Failure | Transmission | Network | Orange
One (1) MSC Isolated Due To Transmission Failure | Transmission | Network | Red
Database crash | VAS-SMSC | Database | Red
Loss of billing data | VAS-SMSC | Database | Red
Breakdown of 2 or more SMSC Front-ends | VAS-SMSC | Hardware | Red
Total loss of SMSC interconnections to the SG | VAS-SMSC | Interface | Yellow
Total Loss of SMSC interconnections to the Layer 3 switch | VAS-SMSC | Interface | Orange
Total Failure of SMSC to handle SMS processing | VAS-SMSC | Services | Red
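When several classified scenarios are active at once, the crisis would presumably be handled at the most severe alert code among them. The sketch below illustrates that selection. It assumes the severity order Yellow < Orange < Red, and the three example entries reflect one reading of the flattened classification table, so treat both as assumptions; `highest_alert` is a hypothetical helper.

```python
# Assumed severity order: Yellow is least severe, Red is most severe.
SEVERITY = {"yellow": 1, "orange": 2, "red": 3}


def highest_alert(codes):
    """Return the most severe alert code in `codes`, or None if it is empty."""
    return max(codes, key=lambda code: SEVERITY[code.lower()], default=None)


# Example entries; the alert codes are an assumed reading of the table.
active = {
    "One (1) BSC Isolated Due To Transmission Failure": "Yellow",
    "One (1) RNC Isolated Due To Transmission Failure": "Orange",
    "One (1) MSC Isolated Due To Transmission Failure": "Red",
}
print(highest_alert(active.values()))  # Red
```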
Yellow: The mean time for the Crisis Management Team to respond to an emergency at this alert level shall be within 4 hours from the time the crisis is escalated.
Orange: The mean time for the Crisis Management Team to respond to an emergency at this alert level shall be within 2 hours from the time the crisis is escalated.
Red: The mean time for the Crisis Management Team to respond to an emergency at this alert level shall be within 1 hour from the time the crisis is escalated.
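The response targets above can be turned into a latest response time for a given escalation. This is a minimal sketch, assuming the 4-hour, 2-hour, and 1-hour targets correspond to the Yellow, Orange, and Red alert codes in that order. The plan states these as mean response times, so treating them as a hard deadline for a single incident is a simplification, and `response_deadline` is a hypothetical helper.

```python
from datetime import datetime, timedelta

# Assumed pairing of alert code to response target, in hours.
RESPONSE_HOURS = {"yellow": 4, "orange": 2, "red": 1}


def response_deadline(alert_code: str, escalated_at: datetime) -> datetime:
    """Return the latest time by which the team should have responded."""
    return escalated_at + timedelta(hours=RESPONSE_HOURS[alert_code.lower()])


escalated = datetime(2012, 1, 1, 10, 0)
print(response_deadline("Red", escalated))  # 2012-01-01 11:00:00
```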
The Head of the Crisis Management Team, who is responsible for emergency planning, should prepare an emergency training plan for all members of the Quick Reaction Team and the O&M personnel. The plan should be followed and its execution tracked.
Additionally, the Head of the Crisis Management Team can schedule a CRISIS DRILL to determine the responsiveness and readiness of the team and to validate the effectiveness of the crisis management plan.
END OF PRESENTATION