
Disaster Recovery Plan for the Office of Information Technologies

2010 Draft 3

TABLE OF CONTENTS

1.0 INTRODUCTION
1.1 Mission Statement
1.2 Disaster Recovery Planning
1.3 Recovery Objectives
1.4 Assumptions of the Plan
1.5 Overview of the Disaster Recovery Plan
2.0 DISASTER RISKS AND PREVENTION
2.1 Earthquake
2.2 Fire
2.3 Smoke
2.4 Flood or Water Damage
2.5 Power Outage
2.6 Terrorist Activity or Sabotage
2.7 Sudden Loss of Key Personnel
3.0 DISASTER PREPARATION
3.1 Disaster Recovery Planning
3.2 Warm Sites for the Fourth Avenue Building Data Center
3.3 Replacement Equipment
3.4 General Backup Information
3.5 Backup Procedures
3.6 Offsite Storage Agreement
3.7 Documentation of Current Systems
3.8 Storage of DRP
4.0 DISASTER DETECTION AND INITIATION
4.1 Red Cross Information
4.2 Disaster Recovery Teams
4.3 Disaster Detection and Determination
4.4 Disaster Notification
4.5 When to Activate the Plan
4.6 What to Do When a Crisis Erupts
5.0 ACTIVATING THE PLAN
6.0 DISASTER RECOVERY STRATEGY
6.1 Recovery Procedures
6.2 Team Plans Defined
6.3 Team Plans by Service
6.5 Central Services Recovery Plan
7.0 PLAN MAINTENANCE AND TESTING
8.0 Appendix A: EMERGENCY CONTACT LIST
9.0 Appendix B: VENDOR LIST
10.0 Appendix C: NETWORK DIAGRAMS
11.0 Appendix D: WARM SITE AGREEMENT
12.0 Appendix E: AGREEMENT WITH IRON MOUNTAIN OFFSITE STORAGE
13.0 Appendix F: WARM SITE EQUIPMENT AND CONFIG
    Hardware
    Network & Cabling
    Storage
14.0 Appendix G: DATA CENTER STARTUP/SHUTDOWN PROCEDURES
15.0 Appendix H: OIT CONTINUITY OF OPERATIONS PLAN 2009


1.0 INTRODUCTION
Portland State University depends heavily upon information and the ability to process and
analyze it. The university increasingly depends on computer-supported information
processing and services. Technology and automated systems are often used to process and
analyze information, and their disruption for even a few hours could severely affect the
overall performance of the institution. This dependency on IT services will only continue
to grow.

1.1 Mission Statement


The Office of Information Technologies (OIT) is a support organization dedicated to
providing state-of-the-art information technology and communications to students,
faculty, researchers, and staff for instruction, general research, administration, and public
service in support of the University's plan for excellence.
Towards that end, OIT provides a myriad of computing and communication services to
the University and some other educational entities. The continuing goals of OIT are to
meet the changing and expanding computing and communications needs of the University
and to provide outstanding service to all University constituents.

1.2 Disaster Recovery Planning


A disaster is an adverse incident that in some way causes the loss of the ability to perform
a specific business function, or a group of business functions or activities.
This incident could be the result of a natural event, a human mistake, or willful damage. A
disaster recovery plan for Technology Infrastructure Services (TIS) must respond to each
of these events.
The Technology Infrastructure Services Disaster Recovery Plan is a comprehensive
statement of actions to be taken before, during and after a disaster. This plan is designed
to reduce the risk to an acceptable level by ensuring the restoration of critical functions
and services within a short time frame, and all essential production within a longer, but
permissible, time frame. This plan identifies the critical functions and services for the
university and the resources required to support them. Guidelines and recommendations
are provided for ensuring that needed personnel and resources are available for disaster
preparation, assessment and response to permit the timely restoration of services.


1.3 Recovery Objectives


The reason for having a disaster recovery plan is to minimize business interruption. A
plan also minimizes the risk and size of the impact on university services and,
subsequently, the cost implications of an interruption in those services to students,
faculty and staff.
This Disaster Recovery Plan protects the university in the event that all or part of its
information technology operations is rendered unusable. The objectives of this document
are:

- present a course of action for restoring critical systems to Portland State University within a minimum number of days of initiation of the plan
- minimize the disruption of IT operations and services
- describe an organizational structure for carrying out the plan
- identify the equipment, procedures and other items necessary for recovery
- ensure an orderly recovery after a disaster occurs
- minimize the risk of lost production or services
- provide a standard for testing the plan
- minimize decision-making during a disaster

This recovery plan and the associated documents provide a measure of security for the
services, information and other non-computer assets of Technology Infrastructure
Services. The activities associated with the preparation of this plan include:

- Identification of the risks to the Fourth Avenue Building which may affect the critical functions
- Identification of the likely impacts should a disaster occur, and the likelihood of their occurrence
- After identification, determination of a reasonable level of expenditure on the business recovery planning process, and on prevention and recovery
- Determination of suitable prevention and protection processes
- Demonstration of the validity of the plan by testing


1.4 Assumptions of the Plan


This document plans for the major, worst-case disaster; however, if an outage of services
occurs to a lesser degree, this plan still covers the incident. NOTE: This plan does not
guarantee zero data loss! Recovery efforts in this plan are targeted at getting the critical
systems functional using the last available off-site backup tapes or other sources.
Considerable effort will be required after critical systems are restored to restore data
integrity to the point of the disaster and to synchronize that data with any new data
produced from the point of the disaster forward.
In addition, this recovery plan is predicated on the following three assumptions:

- The situation that caused the disaster is localized to the Fourth Avenue Building and is not a general disaster affecting a major portion of the greater Portland metropolitan area. It should be noted that this Plan will still be functional and effective even in an area-wide disaster. Even though the basic priorities for restoration of essential services to the community will normally take precedence over the recovery of Portland State University's IT services, this plan still outlines ways in which the services can be brought back online quickly.
- The Plan is based on the availability of a warm site. The accessibility of this site is a significant requirement.
- The technical teams tasked with Data Center management and Networking will provide the baseline infrastructure needs in the event of an emergency. These teams will ensure a warm site is activated or another location on campus is made available to house sensitive equipment. The Data Center/Network Teams will ensure that the Fourth Avenue Building or any other site has proper electrical, networking and security provisions for housing systems that will process sensitive University information. The teams and their plans are detailed later in this document.

1.5 Overview of the Disaster Recovery Plan


This plan will address the following areas in the event of a disaster that destroys or
severely cripples the main computing center for Portland State University:

- data center recovery planning
- warm site recovery planning
- network recovery planning
- central services recovery planning
- recovery of Banner processes

Personnel
Immediately following a disaster, a planned sequence of events begins. Key TIS
personnel are notified and recovery teams are grouped to implement the plan. Personnel
currently employed are listed in the plan. However, the plan has been designed to be
effective if some or all of the personnel are unavailable.
Portland State University must take special pains to ensure that the recovery workers are
provided with resources to meet their physical and emotional needs. If the disaster is one
that affects the greater metropolitan area, many local support agencies such as the Police
and Fire Departments and the Red Cross will be involved. PSU will make efforts to work
with any or all of these outside agencies to protect life and property and to ensure
security.
Salvage Operations at Disaster Site
Early efforts are targeted at protecting and preserving computer equipment. In particular,
any storage media (hard drives and backup tapes) are identified and either protected from
the elements or removed to a clean, dry environment away from the disaster site.
Designation and Activation of a Warm Site
A survey of the disaster scene is done by the appropriate personnel to determine which
warm site will be activated. If the disaster is contained strictly to the Fourth Avenue
Building, TIS will move critical systems/functions to the alternate locations on campus. If
the disaster is campus wide, the Disaster Recovery Manager will determine whether to
activate the warm site agreement with Western Washington University. If so, the
Associate CIO for Technology Infrastructure Services will contact WWU to alert them to
PSU's status. Key personnel will begin to restore and activate our critical systems that are
physically located at WWU. This may require that one or more employees travel to
Western Washington University and perform the restoration on site.
During this emergency restoration of critical systems, a survey of the disaster scene is
done by appropriate personnel to estimate the amount of time required to relocate all OIT
operations of FAB to a location some distance away from the scene of the disaster where
computing and networking capabilities can be temporarily restored until the primary site
is ready. Work begins almost immediately at repairing or rebuilding the primary site. This
may take months, the details of which are beyond the scope of this document.


Purchase New Equipment


The recovery process relies heavily upon vendors to quickly provide replacements for the
resources that cannot be salvaged. The University will need to provide expedited
procurement procedures (approved by the University's purchasing office and the Office of
State Purchasing) to quickly place orders for equipment, supplies, software, and any other
needs.
Begin Reassembly at Recovery Site
Salvaged and new components are reassembled at the recovery site according to the
instructions contained in this plan. If vendors cannot provide a certain piece of
equipment on a timely basis, it may be necessary for the recovery personnel to make
last-minute substitutions. After the equipment reassembly phase is complete, attention
turns to the data recovery procedures.
Restore Data from Backups
Data recovery relies entirely upon the use of backups stored in locations off-site from
FAB. Early data recovery efforts focus on restoring application and university data
(financial, student information, etc.) from the backup tapes or other sources. Some
applications and/or data may be available only to a limited few key personnel. Specific
departmental information will be restored as appropriate and may require involvement
with outside administrators to ensure that data is restored properly.
Move Back to Restored Permanent Facility
If the recovery process has taken place at the warm site, physical restoration of the Fourth
Avenue Building Data Center (or an alternate facility) will have begun. When that facility
is ready for occupancy, the systems assembled at the warm site are to be moved back to
their permanent home. This plan does not attempt to address the logistics of this move,
which should be vastly less complicated than the recovery work performed at a warm
site.

2.0 DISASTER RISKS AND PREVENTION


As important as having a disaster recovery plan is, taking measures to prevent a disaster
or to mitigate its effects beforehand is even more important. This portion of the plan
reviews the various threats that can lead to a disaster, where our vulnerabilities are, and
steps we should take to minimize our risk. There are many forms of catastrophic loss that
can occur. This section lists some of the events and situations that are considered when
determining what to include in the plan.

2.1 Earthquake
Earthquakes could result in partial or total loss of data for an extended period. Recovery
could be slow or impossible. The probability of an earthquake in the greater Portland area
is low but the severity of loss and damage in the event of an earthquake is high.


Preventive Measures
Building construction makes all the difference in whether the facility will survive or not.
Even if the building survives, earthquakes can interrupt power and other utilities for an
extended period of time. The Fourth Avenue Building, where the Data Center is located,
is served by redundant power feeds. PGE serves the building from the North and South
from separate substations.
In the event of a failure of both feeds, the building is serviced by a turbine that can run
the complex for several days before requiring refueling. The cooling systems are run by
well water drawn from sources internal to the building, a benefit in the event the external
water supply to the buildings is compromised.

2.2 Fire
Fire can also result in partial or total loss of data for an extended period. The probability
of fire within the Fourth Avenue Building Data Center is high based on the high power
consumption requirements of the equipment and heat generation in the room.
Preventive Measures
The Fourth Avenue Building is equipped with a sophisticated fire alarm system, with
ceiling-mounted smoke detectors scattered widely throughout the building. Hand-held
fire extinguishers are available. The Data Center is equipped with an air-sampling VESDA
system to detect the early onset of smoke particles. Temperature sensors are also
available to trigger on heat thresholds. These alert systems trigger the building alarm
panels and notify campus facilities. IT systems also monitor the alarms and notify OIT
personnel. If the fire is not dealt with in time, overhead sprinkler systems are deployed.
Building management personnel perform periodic maintenance checks of the fire alarm
systems.

2.3 Smoke
Smoke particles on magnetic media can render it useless. The damage from smoke occurs
much faster than damage from the actual fire or water. A relatively small amount of
smoke can cause a huge degree of loss in terms of data. It is imperative that smoke be
contained to the smallest possible area.
Preventative Measures
The preventative measures for smoke detection are the same as those for fire detection,
described in Section 2.2: ceiling-mounted smoke detectors scattered widely throughout the
building, hand-held fire extinguishers, the Data Center's air-sampling VESDA system and
temperature sensors, alerts to the building alarm panels, campus facilities and OIT
personnel, overhead sprinkler systems, and periodic maintenance checks of the fire alarm
systems by building management personnel.

2.4 Flood or Water Damage


The possibility of floods from natural causes is small for the Fourth Avenue Building.
However, there is the risk of broken water and sewer lines causing major water or flood
damage. Being able to detect flooding and the presence of water in the data center could
help curtail serious damage to some costly pieces of equipment.
Preventative Measures:
Humidity levels in the Data Center are monitored by the CRAC units and reported to
facilities. Water sensors are embedded under the raised floor and monitored by a NetBot,
which alerts IT personnel. Cramer Hall has water sensor contacts installed in the central
campus router room to enable quick detection of flooding.

2.5 Power Outage


The likelihood of a power outage in the downtown Portland area is high. A three-unit
UPS system provides clean power to the rooms and protects against power surges.
Preventative Measures:
The Fourth Avenue Building, where the Data Center is located, is served by redundant
power feeds. PGE serves the building from the North and South from separate
substations.
In the event of a failure of both feeds, the building is serviced by a turbine that can run
the complex for several days before requiring refueling. The UPS system dedicated to the
Data Center can keep things running for up to an hour while the turbine spins up. In the
event of a turbine system failure during an emergency, the one hour of UPS time allows
IT personnel an opportunity to shut systems down in an orderly fashion.
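
The one-hour UPS window makes an automated, orderly shutdown practical even when no operator is on site. The plan does not name the monitoring tooling, so the following Python sketch is purely illustrative: it assumes a NUT (Network UPS Tools) daemon exposing the UPS as ups@localhost, passwordless SSH to the target hosts, and a hypothetical shutdown list.

    import subprocess
    import time

    # Hypothetical shutdown order: least critical systems first,
    # core database hosts last.
    SHUTDOWN_ORDER = ["print-srv", "web-test", "file-srv", "banner-db"]

    def ups_on_battery(ups="ups@localhost"):
        """Return True if the UPS reports it is running on battery ("OB")."""
        out = subprocess.run(["upsc", ups, "ups.status"],
                             capture_output=True, text=True, check=True)
        return "OB" in out.stdout.split()

    def main():
        on_battery_since = None
        while True:
            if ups_on_battery():
                on_battery_since = on_battery_since or time.time()
                # Shut down well inside the one-hour UPS window if the
                # turbine has not picked up the load (15 minutes assumed).
                if time.time() - on_battery_since > 15 * 60:
                    for host in SHUTDOWN_ORDER:
                        subprocess.run(["ssh", host, "sudo", "shutdown", "-h", "now"])
                    break
            else:
                on_battery_since = None
            time.sleep(60)

    if __name__ == "__main__":
        main()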

2.6 Terrorist Activity or Sabotage


It is a reality that irrational acts can be committed that would adversely affect Technology
Infrastructure Services' ability to provide IT services to the campus community. Physical
damage to the Fourth Avenue Building Data Center's data or facilities by disgruntled
employees (or students, or hackers) can pose a serious threat to data integrity. Effective
guidelines for the handling of human relations issues and labor disputes, in conjunction
with good data protection procedures, will minimize exposure to these risks.


Preventative Measures
Portland State University uses an HID proximity card-based access system for secure
areas. Every employee of OIT requiring access to the Data Center is issued a proximity
card with a number that is unique to that person. The Fourth Avenue Building Data
Center is protected by having these card readers on all doors into the facility. Every entry
to any door that is used with these cards is logged to a database. Card access reports are
reviewed once a month for any odd activity and to ensure that only key personnel have
access to the data center. There are also two cameras, within the data center and on its
perimeter, that monitor the center at all times. The images are kept for a period of 30 days
before they are deleted. The Banner system that houses all student information and financial
information for the university has been placed behind a tightly controlled firewall. Most
systems in the Data Center are also firewalled to some extent. All systems and services
are consistently monitored through Nagios. Any unusual activity or high-bandwidth
traffic pages system administrators so that someone can investigate the activity.
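
The monthly card-access review lends itself to light automation. The access database's schema is not described in this plan, so the sketch below assumes a hypothetical CSV export with timestamp, card_id and door columns, and a hypothetical roster of authorized card IDs; it flags entries made by unknown cards or outside business hours.

    import csv
    from datetime import datetime

    # Hypothetical inputs: an access-log export and the authorized roster.
    AUTHORIZED = {"1001", "1002", "1003"}
    BUSINESS_HOURS = range(7, 19)  # 7 AM to 7 PM

    def flag_odd_entries(log_path="access_log.csv"):
        """Yield log rows made by unknown cards or outside business hours."""
        with open(log_path, newline="") as f:
            for row in csv.DictReader(f):
                ts = datetime.fromisoformat(row["timestamp"])
                if row["card_id"] not in AUTHORIZED or ts.hour not in BUSINESS_HOURS:
                    yield row

    for entry in flag_odd_entries():
        print(entry["timestamp"], entry["card_id"], entry["door"])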

2.7 Sudden Loss of Key Personnel


The loss of key personnel through death, unexpected departure or job termination is a
valid and serious exposure in Technology Infrastructure Services. This type of loss can be
minimized by cross training in each team and through the documentation of the services
and processes TIS provides to the university.
Preventative Measures
If needed, consulting can be acquired from the university's major vendors, such as
SungardHE Banner and Microsoft.
Recommendations
Documentation of critical systems and services should be kept in an offsite location. All
critical system passwords should be documented and taken to the offsite storage location.
These passwords should be updated in the document on a regular basis. Cross training in
each team in TIS should be a high priority for the staff.

3.0 DISASTER PREPARATION


In order to facilitate recovery from a disaster that destroys all or part of the data center in
the Fourth Avenue Building, certain preparations have been made in advance. This document
describes what has been done to lay the groundwork for a quick and orderly restoration of
the facilities that TIS operates.


3.1 Disaster Recovery Planning

The first thing to do is to have a plan. This document is part of an overall plan that
Portland State University will use in response to a disaster. The extent to which a
business continuity plan can be effective, however, depends on disaster recovery plans by
other departments and units within the University.
Every other business unit within the university should develop a plan on how they will
conduct business, both in the event of a disaster in their own building or a disaster in OIT
that removes their access to data for a period of time. Those business units need means to
function while the computers and networks are down, plus they need a plan to
synchronize the data that is restored on the central computers with the current state of
affairs.

3.2 Warm Sites for the Fourth Avenue Building Data Center
If the Fourth Avenue Building is either totally or partially destroyed in a disaster, repair or
rebuilding of the building and data center may take an extended period of time. In the
interim it will be necessary to restore computer and network services at an alternate site.
The university has a number of options for alternate sites. Each option has a cost
associated with it.
Western Washington University
There is a warm site agreement between Portland State University and Western
Washington University. This agreement stipulates that Banner related equipment is held
in the other's facility. In the event of a major catastrophe that affects the entire Portland
State campus, the financial data for Portland State will be recovered at the Western
Washington site. The formal agreement between Portland State and Western Washington
has been renewed. The details of the agreement can be found in Appendix D. A new set
of equipment has been placed in this remote location (see Appendix F), a snapshot of the
Oracle and Banner software trees are periodically transferred to the remote systems and
an incremental data backup is performed nightly.
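
The plan does not specify the transfer mechanism for the software trees. One common approach, sketched here purely as an illustration, is a scheduled rsync over SSH to the warm-site host; the paths and hostname below are hypothetical.

    import subprocess

    # Hypothetical source trees and warm-site destination.
    TREES = ["/u01/app/oracle", "/u01/app/banner"]
    WARM_SITE = "warmsite.wwu.example.edu:/backup/psu"

    def sync_tree(tree):
        """Mirror one software tree to the warm site, deleting stale files."""
        subprocess.run(["rsync", "-az", "--delete", tree, WARM_SITE], check=True)

    for tree in TREES:
        sync_tree(tree)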
Cramer Hall
Development of this on-site warm site has resumed. This local warm site will be built with
a focus on dealing with a localized disaster in the Fourth Avenue Building that leaves the
rest of the campus intact. This planned local warm site is located in the Urban Center
Building. The expansion to this area will provide two racks' worth of capacity, with
related power and linkage to the data center network infrastructure. Some infrastructure
components for this site have been purchased. The Urban Center warm site is designed to
be a temporary measure until the main data center is brought back on line. There isn't
enough space in this location to re-build the entire data center capacity.


Disaster Partnerships
One of the most critical issues involved in the recovery process is the availability of
qualified personnel to oversee and carry out the recovery. This is often where disaster
partnerships can have their greatest benefit. Through cooperative agreement, if one
partner loses key personnel in the disaster, the other partner can provide skilled workers
to carry out recovery and restoration tasks until the disabled partner can hire replacements
for its staff.
There is an informal agreement with the University of Oregon (UO) to host DNS services
for Portland State University in the event of an emergency. UO currently hosts one of
PSU's secondary DNS servers. When possible, formal agreements should be made
between the departments in OIT and outside partners.

3.3 Replacement Equipment


This plan contains a complete inventory of the components of each of the servers and
network systems that must be restored after a disaster. Where possible, agreements have
been made with vendors to supply replacements on an emergency basis. To avoid
problems and delays in the recovery, every attempt should be made to replicate the
current system configuration. However, there will likely be cases where components are
not available or the delivery timeframe is unacceptably long. Although some changes
may be required to the procedures documented in the plan, using different models of
equipment or equipment from a different vendor may be suitable to expediting the
recovery process.

3.4 General Backup Information


New hardware can be purchased. New buildings can be built. New employees can be
hired. However, the data that was stored on the old equipment cannot be bought at any
price. It must be restored from a copy that was not affected by the disaster. There are a
number of options available to help ensure that such a copy of OIT's critical data,
currently residing in the Data Center, survives a disaster at the primary facility.
Design of the Current Backup Systems
OIT backs up all servers hosted in the data center to two Sun StorageTek towers. Each
tape backup library tower can hold 84 LTO-3 tapes. Each tape can hold approximately
400-800 gigabytes of data, depending on the types of files to be saved. Currently, backup
compression occurs on the backup host.
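
For rough capacity planning, these figures imply a range that is easy to compute. The following back-of-the-envelope sketch uses only the numbers quoted above; it is illustrative, not an official sizing.

    TAPES_PER_TOWER = 84
    GB_PER_TAPE = (400, 800)  # depends on file compressibility

    low, high = (TAPES_PER_TOWER * gb / 1000 for gb in GB_PER_TAPE)
    print(f"Per tower: {low:.1f}-{high:.1f} TB")        # 33.6-67.2 TB
    print(f"Both towers: {2*low:.1f}-{2*high:.1f} TB")  # 67.2-134.4 TB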
Windows Server Backups
Windows servers are backed up using CommVault backup software. Most servers have
an incremental backup each night and a full backup every two weeks. The current
retention is at least 45 days for all servers. To speed up recovery in a disaster situation,
CommVault runs a disaster recovery backup each night at 5 PM. This information is sent
to the WWU site and contains a copy of the CommVault database and associated settings
for faster restore in a disaster situation.
UNIX/Linux Server Backups
Unix/Linux servers, which hold all of Portland State University's financial and other
data from Banner, are backed up using Legato. There are nightly incremental
backups and a full backup every two weeks. A copy of the Legato configuration is also
periodically included in the backup tapes that are sent off site. The current retention is 2
to 3 months.
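
With a full backup every two weeks and nightly incrementals, restoring a server requires the most recent full backup plus every incremental taken since it. The sketch below illustrates that selection logic only; the catalog entries are hypothetical and do not reflect Legato's or CommVault's actual catalog formats.

    from datetime import date

    # Hypothetical backup catalog: (date, "full" | "incremental")
    CATALOG = [
        (date(2010, 6, 1), "full"),
        (date(2010, 6, 2), "incremental"),
        (date(2010, 6, 3), "incremental"),
        (date(2010, 6, 15), "full"),
        (date(2010, 6, 16), "incremental"),
    ]

    def restore_set(target):
        """Return the tapes needed to restore to the given date."""
        usable = [b for b in CATALOG if b[0] <= target]
        last_full = max(d for d, kind in usable if kind == "full")
        return [(d, k) for d, k in usable if d >= last_full]

    print(restore_set(date(2010, 6, 17)))
    # [(date(2010, 6, 15), 'full'), (date(2010, 6, 16), 'incremental')]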

3.5 Backup Procedures


Well documented backup procedures help to ensure that recovery time is kept to a
minimum. The different teams in TIS have documented procedures for backing up the
critical data sets for OIT and Portland State University.

3.6 Offsite Storage Agreement


OIT has contracted with Iron Mountain to store the backup tapes offsite in a secure
location in a different area of Portland. Full backup tape sets for all servers are currently
sent offsite every week, which ensures that the most recent offsite data set is no more
than 7 days old in the event that restores need to occur.

3.7 Documentation of Current Systems


Maintaining current documentation of the file and directory systems will also ease the
process of recovery. In the event that key personnel are lost or injured, alternates or
replacements will be able to understand the configuration of the systems. Each team in
TIS maintains documentation on the structure of the services provided to the campus. The
IS Team maintains their own documentation about the structure and recovery procedures
for Oracle and SungardHE Banner.

3.8 Storage of DRP


An up-to-date paper copy of the Disaster Recovery Plan will be stored at the offsite Iron
Mountain location. A copy of this entire document, burned onto CDs, will be given to
each member of the Disaster Recovery Management team as well as to the Western
Washington University IT Director.


4.0 DISASTER DETECTION AND INITIATION


In almost any disaster, hazards and dangers can abound. While survival of the disaster
itself can be a harrowing experience, further injury or death following the disaster
stemming from carelessness or negligence is senseless. Safety of personnel will be the top
priority of TIS. This section will discuss the possible hazards to be aware of during an
emergency.

4.1 Red Cross Information


As disaster workers seek to meet the needs of victims and communities following any
type of disaster, they are surrounded by and exposed to disorganization, confusion, scenes
of destruction, and the tears and the pain of victims.
Disaster workers have the potential to become "secondary victims," as they work long,
hard hours under poor conditions. In some cases, physical dangers exist for responders.
Worker accommodations may be poor when they are near or within the affected area, or
may require an hour or more of travel when located outside the affected area. Personal
support systems are left at home, and new supports must be formed while on the
operation and while time is scarce. Supervisory styles are different from person to person;
administrative organization and regulation often must change with little warning, adding
additional stressors as workers try to satisfy the needs of the clients and of the
organization.
Most disaster workers are dedicated individuals who also tend to be perfectionists.
Because of this, they are at risk of pushing themselves too hard and of not being satisfied
with what they have accomplished. With so much yet to do, they often fail to take credit
for the amount of work completed and the effort contributed to the operation.
Frustration is common, and our usual sense of humor is often stretched beyond limits.
Workers become exhausted, and anger comes easily to the surface. The anger of others --
workers, victims, and media -- becomes difficult to deal with, and may be seen as a
personal attack on the worker rather than as a normal response to exhaustion. Survivor
guilt may emerge as workers see the losses of others when they have suffered none
themselves.
COPING: Remember that you are giving those victimized by the disaster a gift of
yourself -- your time and your caring -- a gift you could not give if you were also a
victim.
This may be your first experience with scenes of great destruction or high levels of injury
and death. These are realities we don't often face, and methods of coping with these are
not developed overnight. In each of us, there is an unconscious fear that a victim could be
you or a loved one. You need to understand and appreciate the intensity of your emotions,
and talk about your feelings to others.


Although we may function in superhuman ways during a disaster operation, the stress
associated with our jobs takes its toll. We get tired . . . and confused . . . and hurt . . . and
scared. It is critical both for ourselves and those we try to help that we understand the
effects of stress and make every effort to deal with it.
Stress-relieving activities are not as difficult or time consuming as we may think. A
15-minute walk during a lunch or coffee break; talking to a co-worker, supervisor, or mental
health worker; going out to dinner or a movie; or just learning and using deep breathing
exercises can significantly reduce stress.
During the operation, it's important to eat nutritional foods, avoid drinking large amounts
of caffeine and alcohol, get some exercise whenever possible, and get as much sleep as
you can. That way you'll be better able to continue meeting the challenges of your job.
Your supervisors will be attempting to juggle schedules so that you can have some time
off to yourself to sleep, read, or just sit in the sunshine. If you feel that you need this time
off before you're scheduled for it, just ask. If you need a change of assignment or setting,
just ask. And, hard as it may be to turn over your duties to someone else, when it is time
for your shift to be over, leave and take time to recharge.

4.2 Disaster Recovery Teams


To function in an efficient manner and to allow independent tasks to proceed
simultaneously, the recovery process will be handled by teams. This plan calls for teams
that work together but are each assigned specific portions of the recovery.
Disaster Recovery Management Team
The Disaster Recovery Management Team oversees the whole recovery process. This
team should be composed of personnel who are extremely familiar with
the structure, systems, and services that Technology Infrastructure Services provides to
Portland State University. The DR Manager leads the DR Management Team. In the case
of TIS, the DR Manager will be the current Associate CIO for Technology Infrastructure
Services. The DR Manager has the final authority on technical decisions that must be
made during the recovery but works closely with the Incident Commander dedicated to
OIT, typically the CIO, to ensure organizational goals are being met while dealing with
the disaster. The DR Manager is responsible for appointing the other members of the
Recovery Management Team. If appropriate, the DR Manager will ask additional
university staff from Facilities or other areas to participate on the team. If the University
Disaster Response Team has been mobilized, the DR Manager will take direction from
the Incident Command center.
Damage Assessment Team


The Damage Assessment Team will be composed of personnel who are knowledgeable
about the hardware and equipment located in the Fourth Avenue Building Data Center.
Likely choices for this team would be members from Physical Plant and the NTS, CIS,
and IS teams. The primary thrust for this team is to do two things: provide the information
the Recovery Management Team needs to choose the recovery site, and provide an
assessment of the recoverability of major hardware components. This team will also be
the main group involved with salvaging any equipment in the data center. Based on this
assessment, the DR Management Team can begin the process of acquiring replacement
equipment for the recovery.
Facility Recovery Team
The Facility Recovery Team should be led by a member in Facilities but will also need to
include members from TIS. This team will be responsible for the details of preparing the
recovery site to accommodate the hardware, supplies, and personnel necessary for
recovery. They will be responsible for the oversight of the activities for the repair and/or
rebuilding of Fourth Avenue Building or a secondary site. It is anticipated that the major
responsibility for this will lie within Facilities and contractors. However, this team must
oversee these operations to ensure that any facility is repaired to properly support data
center operations.
All infrastructure recovery (networking, power, AC, UPS, generator, security, etc.) will
be the responsibility of Facilities and relevant TIS Teams. The recovery of specific
services and data will be the responsibility of the individual teams in OIT (NTS, CIS and
IS). Each team has separate recovery plans for restoring services quickly. Overall Data
Center restart strategy can be found in Appendix G.

4.3 Disaster Detection and Determination


The detection of an event which could result in a disaster affecting production or
information processing systems at Portland State University is the responsibility of the
Associate CIO for Technology Infrastructure Services, CIS, NTS or whoever first
discovers or receives information about an emergency situation developing in one of the
functional areas of Technology Infrastructure Services.

4.4 Disaster Notification


Whoever detects the disaster should notify the Associate CIO for Technology
Infrastructure Services or the Associate Director for Computing Infrastructure Services
(CIS), who is responsible for the Data Center. In addition to providing some fault
tolerance in initial response, this role sharing enables effective use of shifts during the
disaster recovery process. The Associate CIO for TIS or Associate Director for CIS will
monitor the evolving situation and, if appropriate, will then notify the Disaster Recovery
Teams in TIS. The complete emergency contact list for the university is included in
Appendix A.

4.5 When to Activate the Plan


The plan should be activated if any of the following circumstances occur: any damage to
the Fourth Avenue Building for any reason, including but not limited to fire, water, and
acts of terrorism and/or sabotage. Furthermore, this Disaster Recovery Plan could be
activated whenever an action occurs that hampers Technology Infrastructure Services'
ability to provide service to the campus community for a period greater than 48 hours.

4.6 What to Do When a Crisis Erupts


As soon as a potential crisis situation develops, the first person to be alerted should be the
Associate CIO for Technology Infrastructure Services, who will also be the DR Manager.
If it is determined that a crisis situation has occurred, that person will alert TIS personnel
in order to help activate the Disaster Recovery Plan.
The first phase begins with the initial response to a disaster and activation of the plan.
During this phase, the existing emergency plans and procedures of Portland State's
Campus Public Safety Office direct efforts to protect life and property, the primary goal
of initial response. Security over the area is established as local support services such as
the Police and Fire Departments are enlisted through existing mechanisms.
Once access to the facility is permitted, an assessment of the damage is made to
determine the estimated length of the outage. If access to the facility is precluded, then
the estimate includes the time until the effect of the disaster on the facility can be
evaluated.
If the estimated outage is less than 48 hours, recovery will be initiated under normal
operational recovery procedures. Use of facilities in the Urban Center or Cramer Hall
should be investigated to see if systems can be brought online in that location. If the
service outage is longer than 48 hours, the DR Manager will decide upon the appropriate
warm site for recovery (on-site at another location, Western Washington or elsewhere).


5.0 ACTIVATING THE PLAN


The DR Manager sets the plan into motion. Early steps to take are as follows:
The Recovery Manager should retrieve the Disaster Recovery Plan. Copies of the plan
should be made and handed out at the first meeting of the DR Management
Team if possible.
Determine Personnel Status
One of the Recovery Manager's important early duties is to determine the status of
personnel working at the time of the disaster. Safety personnel on site after the disaster
will effect any rescues or first aid necessary for people caught in the disaster. However, the
Recovery Manager should produce a list of the able-bodied people who will be available
to aid in the recovery process. Taking care of people is a very important task and should
receive the highest priority immediately following the disaster. While we will have a
huge technical task of restoring computer and network operations ahead of us, we can't
lose sight of the human interests at stake.
Equipment Protection and Salvage
A primary goal of the recovery process is to restore all computer operations without the
loss of any data. The Damage Assessment Team can immediately set about the task of
assessing the damage to the data center, and protecting and salvaging any equipment or
hardware, especially those on which data may be stored. This document contains
information on procedures to be used immediately following an incident to preserve and
protect resources in the area damaged.
It is important that any equipment or hardware in the Fourth Avenue Building be
protected from the elements to avoid any further damage. Some hardware may be
salvageable or repairable, which can save time in restoring operations. The Damage Assessment
team should cover all computer equipment to avoid water damage. Ask the police to post
security guards at the primary site to prevent further damage. All salvageable equipment
will need to be moved to a secure location or the warm site.
As soon as practical a complete inventory of all recovered equipment must be taken,
along with estimates about when the equipment will be ready for use (in the case that
repairs or refurbishment is required). This inventory list should be given to the DR
Manager who will use it to determine which items from the disaster recovery hardware
and supplies lists must be purchased to begin building the recovery systems.
Establish the Recovery Control Center
The Recovery Control Center is the location from which the disaster recovery process is
coordinated. The Recovery Manager should designate where the Recovery Control Center
is to be established. Depending on the extent of damage, the center may be located off of
the campus.
Initial Steps of the Disaster Recovery Management Team


The DR Manager is to call a meeting of the Recovery Management Team at the Recovery
Control Center or a designated alternate site. Each member of the team is to review the
status of their respective areas of responsibility. The DR Manager briefly reviews the plan
with the team. Any adjustments to the Disaster Recovery Plan to accommodate special
circumstances are to be discussed and decided upon.
Each member of the team is charged with fulfilling his/her respective role in the recovery
and with beginning work as scheduled in the DR Plan.
Each member of the team is to review the makeup of their respective recovery teams. If
key personnel on any recovery team are unavailable, the DR Manager is to assist in
locating others who have the skills and experience necessary, including locating outside
help from other OUS institutions or vendors.
The next meeting of the Recovery Management Team is scheduled. The DR Management
team should meet at least once each day for the first week of the recovery process. An
assessment can be made at the end of the first week to decide the frequency of additional
meetings.
The DR Management Team members are to immediately start the process of calling
teams together to begin the recovery process.
Cell phones and two-way radios will be important during the early phases of the recovery
process. Some departments in OIT have two-way radio units that may be available if
damage is not severe to the Fourth Avenue Building.

6.0 DISASTER RECOVERY STRATEGY


The disaster recovery strategy pertains specifically to a disaster disabling the main
computing facility in the Fourth Avenue Building. This functional area provides the
infrastructure and major server support to Portland State's administrative applications.

6.1 Recovery Procedures


The time required for recovery of the functional area and the eventual restoration of
normal processing depends on the damage caused by the disaster. The time frame for
recovery can vary from several days to several months. In either case, the recovery
process begins immediately after the disaster and takes place in parallel with back-up
operations at the designated warm site. The primary goal is to restore critical operations
as soon as possible.

6.2 Team Plans Defined


Each team under the TIS umbrella, as well as IS, has documented the configuration and
installation details for the different services and applications provided to the campus. This
enables them to work on recovery and restore procedures as needed for various scenarios.
Each team documents the list of services it provides, additional hazards specific to its
particular services and applications, the equipment necessary for recovery (assuming that
all infrastructure will be provided either at the Fourth Avenue facility or rebuilt at another
facility), and the restore procedures for recouping data. This also includes activating the
limited warm site at WWU.
These plans also define the various levels of criticality for the services and applications
OIT provides to the university. These levels of criticality will be used to define the order
in which TIS and IS services will be brought back online in the event of an emergency.
Each team plan uses these definitions when assigning importance to the recovery of the
services and applications provided to the campus. The levels are listed below; a brief
sketch after the list shows how they can drive a restore queue.

- Critical (Category 1): the University cannot run without these applications, information, or services, and they cannot be run without identical capabilities being set up in another location (Western Washington University or other).
- Vital (Category 2): this information, these applications, or these services can tolerate interruptions, but only for very brief amounts of time.
- Sensitive (Category 3): this information, these applications, or these services can be run by manual means, or not at all, for a longer period of time, with the knowledge that once restored there will be significant amounts of catch-up to be done.
- Noncritical (Category 4): this information, application, or service can be interrupted for a significant amount of time without loss to business. There is no catch-up to be done once the application has been restored.
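
As a minimal illustration of how these categories translate into a restore queue, the sketch below sorts services by category number. The service-to-category assignments shown are hypothetical examples; the real assignments live in the individual team plans.

    # Hypothetical category assignments; see the team plans for real ones.
    SERVICES = {
        "Banner":         1,  # Critical
        "DNS":            1,  # Critical
        "Authentication": 2,  # Vital
        "E-mail":         2,  # Vital
        "File services":  3,  # Sensitive
        "Print services": 4,  # Noncritical
    }

    def restore_order(services):
        """Order services for recovery: lowest category number first."""
        return sorted(services, key=services.get)

    for svc in restore_order(SERVICES):
        print(f"Category {SERVICES[svc]}: {svc}")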

6.3 Team Plans by Service


The plan is presented in three parts:
I. Services provided by the TIS Teams
II. Disaster Conditions
III. Disaster Recovery Strategies
I. Services provided by the TIS Teams

A. Network Connectivity within the Data Center


This encompasses Fast Ethernet or higher connectivity for OIT servers, such as the
central UNIX/Linux and Windows servers and systems that comprise Banner. This
service focuses only on connectivity between these servers, and not necessarily to the rest
of the PSU campus or to the Internet.
This service must be available at all times that Banner needs to be functional.
B. Network Connectivity on PSU Campus

This service is Gigabit or Fast Ethernet connectivity between PSU buildings and sites on
campus. It includes connectivity between the Data Center and the rest of the PSU campus.
This service should be available almost all the time; however, it is slightly less critical
than service A.
C. Internet Connectivity

This service provides a connection to the Internet for the PSU Campus. Any connectivity
to sites not on the PSU campus falls into this service.
This service should be available nearly all the time. It is slightly less critical than service
B.
D. VPN (and dialup)

Provides remote access to the PSU campus network.
This service should be available as often as possible; however, business is not interrupted
without this service.
E. Tape Backup/Restore and Offsite Storage

Provides tape backup of all systems and regular rotation of tapes to an offsite storage
location. Also provides the ability to restore from previous backups.
While important, this service will likely begin to operate mostly in a Restore mode after a
disaster. If systems are damaged or destroyed, recovery may be necessary from tapes on
the PSU campus or located at the offsite storage facility. Operation of this service in a
restore mode may be vital to restoring the business functions of PSU.
F. Authentication Services

LDAP and Active Directory are provided as authentication mechanisms for users. E-mail,
desktop logins, wifi authentication and access to storage are some services that use one of
these authentication services.


This is a critical service to maintain while users are being served. The services are
delivered in a redundant manner to enhance uptime. In the event of a disaster where
recovery of core systems takes precedence, bringing up authentication services for all
users may be of secondary importance.
G. Domain Name Services

Web sites within the University, e-mail addresses and logins to various systems require
DNS to be functioning.
This is a critical service, even during a disaster, as external connections such as e-mail
would still need to resolve to PSU domain names, even though the sites may not be
available. This is a critical service and to that end, the service is delivered with some
redundancy (including external DNS servers).
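
Since this redundancy only helps if each advertised server actually answers, a periodic check of every authoritative server is a reasonable supplement to Nagios monitoring. A minimal sketch, assuming the third-party dnspython package; the server IPs below are placeholders, not PSU's real nameservers.

    import dns.resolver  # third-party: pip install dnspython

    # Placeholder nameserver IPs; substitute the real primary/secondaries.
    NAMESERVERS = ["192.0.2.1", "192.0.2.2", "198.51.100.1"]

    def check_servers(name="www.pdx.edu"):
        """Query each nameserver directly and report whether it answers."""
        for ip in NAMESERVERS:
            resolver = dns.resolver.Resolver(configure=False)
            resolver.nameservers = [ip]
            try:
                resolver.resolve(name, "A", lifetime=5)
                print(f"{ip}: OK")
            except Exception as exc:
                print(f"{ip}: FAILED ({exc})")

    check_servers()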
H. File Services

File services provide individual and shared access to data storage to users on various
platforms using various access methods. Access to user files on these servers must be at
least 95% available at all times during normal operations.
Storage for file services is delivered using redundantly configured equipment whenever
possible. In the event of a disaster, recovering file services may be prioritized based on
business needs that dictate which services are brought on line first.
I. Print Services

Print Services include components such as print servers, print queues, print devices and
printing configuration in a network environment. Print services are provided either by
Windows or UNIX servers. Access to print services must be at least 95% available during
normal operations. During a disaster, they may have a reduced priority.
J. Electronic Messaging Services

Electronic Messaging services refer to the comprehensive enterprise messaging system
that provides electronic mail, scheduling and other groupware components.
This service must be available 95% of the time during normal operation. During a
disaster, bringing up e-mail services is critical but may be constrained by the time taken
to restore the data before bringing the service online.
K. Banner Systems

The Banner ERP system is delivered by a series of servers. During normal operation, this
service must be available 95% of the time, and during a disaster scenario it is one of the
first large systems that must be brought back online. This will require personnel from CIS
who deal with bringing up the servers and personnel from IS who deal with the
configuration of the Banner software.
L. Web Services

PSU provides a University-level web presence in addition to those for other academic and
administrative departments. During normal operation, this service needs to be delivered
with a high degree of reliability.
During disaster conditions, bringing up a makeshift University-level web presence on the
warm site as a means of communicating with constituents is vital. Due to the complex
nature of the existing technologies used for the websites, it may not be possible to deliver
the existing websites in their entirety on the warm site. As PSU continues to develop the
new system for campus-wide website delivery, it may be possible to mirror this system
offsite.

II. Disaster Conditions

- Major Disaster (Earthquakes, Fires, Floods)
- Power Outage
- Cooling Unit failure
- KVM switch or console access failure
- Server hardware component failure
- Server operating system errors
- Server enterprise application/program errors
- Data Corruption
- Denial of Service
- Unauthorized access of network resources

III. Disaster Recovery Strategies

A. Major Disaster (Earthquakes, Fires, Floods)
Warm sites for recovery (Western Washington, local warm site if available, etc.)


In the event of such a major disaster, priority will be given to restoring service A
(connectivity between servers for critical services). Once Banner and its supporting
services have been restored, attention can be given to assessing damage and the
possibility of reconnecting PSU campus sites/buildings to the campus network. Network
hardware, whether previously in production or not, from other PSU sites/buildings may
be used in order to restore service A if necessary.
Restoration from Backups either at a warm site or at the offsite storage facility may be
necessary in the event of a major disaster. Thus, it is vital that tape drives be available in
order to restore data. A tape drive and other necessary hardware and software should be
located at our warm site or some other off-site location.
A tape subsystem has not yet been procured for the WWU warm site, pending the results
of a backup system redesign that is ongoing at PSU.
B.1 Power Outage (External)
The Fourth Avenue Building is driven from two redundant power feeds originating from
separate substations. In the event of the loss of one feed, the building switches to the
remaining feed in a matter of seconds. If the building loses both power feeds, the Data
Center UPS system picks up the load until the building turbine spins up and starts
delivering power to the facility. The UPS system for the Data Center is able to keep
systems running for about one hour. This provides a comfortable margin for the turbine to
spin up. The turbine is able to drive the building for up to three days before requiring
refueling. The redundant systems ensure the Data Center is well protected from external
power outages.
B.2 Power Outage (Internal)
The Fourth Avenue Building could experience an internal power event that cuts off power
to the Data Center. This could be the result of a failure of the redundant internal power
systems or a manual activation of the Emergency Power Off (EPO) controls. When
power is lost to the Data Center in this manner, all inbound power must be de-activated
(typically at the PDUs) so that a restoration of power does not cause unsynchronized
reboots of systems. After the power event has been examined and power restored to the
Data Center, the systems need to be powered back up according to the Data Center
startup/shutdown sequence. (See Appendix G.)
C. Cooling Unit failure
Failure of air conditioning units in the Fourth Avenue Building Data Center or the CH
switchroom could cause extensive shutdowns and/or equipment damage if not dealt with.
The FAB Data Center has four separate CRAC units and can continue to sustain
acceptable temperatures with the loss of one unit. If more than one unit fails, load
shedding would be instituted by shutting down non-essential systems to lower the heat
output into the room, and to increase ventilation and air exchange as much as possible.


Problems with the CRAC units are handled by PSU Facilities.
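As a minimal sketch of how a load-shedding order might be scripted, the following Python fragment ranks systems by criticality and emits the order in which non-essential systems would be shut down. The system names, ranks and temperature threshold are purely hypothetical illustrations, not OIT's actual configuration or tooling.

    # Illustrative load-shedding sketch (hypothetical names, ranks and
    # threshold; not OIT's actual tooling or configuration).
    SHED_THRESHOLD_F = 85  # assumed room-temperature trigger, degrees Fahrenheit

    # Lower rank = more critical = shut down last.
    CRITICALITY = {
        "test-box-1": 9,
        "dev-web": 8,
        "reporting": 5,
        "email": 2,
        "banner-db": 1,
    }

    def shutdown_order(room_temp_f):
        """Least-critical systems first; operators work down the list only
        until room temperature stabilizes."""
        if room_temp_f < SHED_THRESHOLD_F:
            return []
        return sorted(CRITICALITY, key=CRITICALITY.get, reverse=True)

    print(shutdown_order(88))
    # ['test-box-1', 'dev-web', 'reporting', 'email', 'banner-db']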


D. KVM switch failure
KVM failure would be especially crippling when coupled with another outage such as a
network outage, rendering systems completely inaccessible. A number of legacy KVM
switches have been kept by OIT, along with their cabling components, in the case of
failure of the entire CAT5-based KVM switching system. Also, overflow space exists on
the head-end KVM switch in case of failure of any single switch. Critical or necessary
systems would be attached to an alternate KVM switch if such was necessary in order to
regain control of the system or administer it. With the increased support for network
based remote console access in newer server purchases (like Dells DRAC and SUNs
ALOM for example), console access can be achieved directly from a management station
on the network.
E. Network hardware component failure
Hardware failure affecting network equipment, such as switches or routers, would likely
bring down a portion of the campus network, but no single failure could bring down all
network services. The Data Center network is engineered with redundant network
switches and dual connects to many critical systems.
Spares exist for most types of Cisco switching modules, and critical Cisco equipment is
also covered under 24x7x4 service contracts. If no spare exists for a failed component
and the 4-hour response would be delayed or insufficient, other equipment could be
repurposed, potentially altering the network topology near the failure, in order to bring
systems back up to some level of connectivity.
F. Server hardware component failure
Examples of server hardware components that could fail are the following:

Hardware components in Servers and Storage Area Network systems such as power
supplies, hard drives, hard drive controllers, RAID arrays, processors, network cards,
memory modules and SAN switches;
Tape libraries;
External hard drive arrays;
Backup tape drives or towers;

Server and storage system failures are dealt with by placing critical (and unique)
equipment under a maintenance plan with a quick delivery of spares and/or diagnostic
services. In the case of server models that are in widespread use in the data center, a
failure would be dealt with by pulling a less critical unit into service to replace the failed
unit. As the purchasing of new servers moves increasingly toward blade servers, a
higher level of built-in redundancy provides systems that are able to recover more easily
from component failures.


G. Networking system errors


A major software error or failure on a network component is almost always resolved by
rebooting the affected device, causing minimal downtime. If a reboot does not solve the
problem, it is likely that an active attack is occurring, or some hardware component has
failed or become unreliable. In this case, advanced troubleshooting/diagnosis will be
required to resolve the issue.
H. Server operating system errors
Errors may appear due to operating system bugs, configuration issues, or interactions
with software or the network. Diagnostic procedures are platform-dependent. The system
administrators need to ensure patches are being applied in a timely fashion.
I. Data Corruption
Data corruption would be especially disastrous if it occurred on the backup tapes and
backups were necessary due to another disaster. It is therefore vital to test restoration
from backup tapes regularly to ensure PSU can restore from backups.
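A sketch of what such a periodic restore test might look like is shown below; the "restore-tool" command and both file paths are placeholders standing in for whatever the backup software actually provides, not PSU's real tooling.

    # Sketch of a periodic restore-verification test (command and paths
    # are placeholders, not PSU's actual backup software).
    import hashlib
    import pathlib
    import subprocess

    SAMPLE = pathlib.Path("/srv/files/sample.dat")           # known production file
    RESTORED = pathlib.Path("/tmp/restore-test/sample.dat")  # scratch restore target

    def sha256(path):
        h = hashlib.sha256()
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(1 << 20), b""):
                h.update(chunk)
        return h.hexdigest()

    # Step 1: restore the sample from tape into a scratch area.
    subprocess.run(["restore-tool", str(SAMPLE), str(RESTORED)], check=True)

    # Step 2: compare checksums; a mismatch means the backup copy is bad.
    if sha256(SAMPLE) == sha256(RESTORED):
        print("restore verification passed")
    else:
        raise SystemExit("restore verification FAILED")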
J. Denial of Service
This type of attack is common and can almost be expected to occur at some time on a
major scale at PSU. By observing traffic patterns on the network, using a network sniffer,
and other troubleshooting steps, the source of the attack can usually be identified and
either filtered or disconnected from the network. If the attack is distributed, with many
clients both internal and external to PSU, and network filtering is unable to block the
attack, systematic disconnection of affected systems is the only method to ensure
restoration of critical services. This may take time, so if the attack is severe and
distributed enough, it may be necessary to disconnect the majority of the network and
only reconnect small sections at a time as critical services are needed.
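The "observe traffic patterns" step amounts to counting packets per source and looking for outliers. The sketch below assumes a hypothetical flow log with the source IP as the first whitespace-separated field on each line; it is illustrative only.

    # Sketch: count "top talker" source addresses in a traffic log to help
    # locate an attack source. The log format is an assumption.
    from collections import Counter

    def top_talkers(log_path, n=10):
        counts = Counter()
        with open(log_path) as f:
            for line in f:
                fields = line.split()
                if fields:
                    counts[fields[0]] += 1
        return counts.most_common(n)

    # Sources far above the normal baseline are candidates to be filtered
    # or disconnected, as described above.
    for src, hits in top_talkers("/var/log/flows.log"):
        print(src, hits)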
K. Unauthorized access of network resources
This should only be classified as a disaster if the unauthorized access cannot easily be
stopped, or if data vital to the ongoing security of PSU systems or Enterprise data has
been compromised. If unauthorized access is ongoing and the data are critical, the
machine should be unplugged from the network, or even from power, to stop the access.
If it becomes known that vital Enterprise data such as student records, financial
information, etc. has been compromised, law enforcement must be involved in the
investigation, and OIT resources should be dedicated to discovering the extent of the
access and compromised data, ascertaining how the data was accessed, and ensuring the
data will be secure in the future.


6.5 Central Services Recovery Plan

Urban Center (construction of this localized warm site has now resumed):


We plan to construct a limited warm site in this building, collocated with the network
infrastructure in place in the building entry terminal. Two racks with power and network
connectivity will be provided. This location will serve as the localized warm site,
containing a mixture of directory services, a basic University web presence and a
functional version of Banner that can be used by administrators in the event of a disaster.

Western Washington University


PSU has an Inter-Governmental Agreement with WWU, giving each institution access to
two racks' worth of data center capacity on a reciprocal basis. We have installed a
complement of equipment that is capable of running a limited instance of Banner. In the
event of a catastrophic failure at PSU that disables the FAB Data Center, the WWU warm
site will be brought online.
NOTE: if Banner is in use at WWU, we cannot access or use Banner here at PSU, as there
is only one master mode. Once Banner is brought back up at PSU, we will need to ensure
that all transaction logs are copied back to local systems here at PSU before going live
with Banner locally.
Currently, the Banner warm site is implemented by two Sun servers, one for running the
database and the other for running the Banner applications servers (Banweb and inb).
Oracle db data updates are copied from PSU to WWU nightly. Periodically, a copy of the
Oracle and Banner application trees are also copied over to WWU. This process allows us
to keep a current copy of Banner at an offsite location without having to use backup tapes
to recover from in the event of an emergency. In the event of warm site activation, IS
personnel will access the systems remotely through a VPN (located at WWU), configure
and load the database and then bring up the application servers. The activation process
could take up to 48 hours.
In the future, we intend to speed up warm site activation by using Oracle Data Guard to
keep the Banner database synchronized.
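The plan does not name the mechanism used for the nightly copy of Oracle data to WWU; a minimal sketch of such a push job is shown below using rsync over ssh, which is one common choice. The destination host is the warm-site database server from Appendix F, but both directory paths and the remote account are assumptions.

    # Sketch of a nightly push of database export files to the warm-site DB
    # host (laga.oit.pdx.edu, per Appendix F). Copy mechanism and paths are
    # assumptions for illustration.
    import datetime
    import subprocess
    import sys

    SRC = "/u02/banner/exports/"                            # hypothetical PSU export dir
    DEST = "oracle@laga.oit.pdx.edu:/u02/banner/incoming/"  # hypothetical warm-site dir

    def nightly_sync():
        result = subprocess.run(["rsync", "-az", "--delete", SRC, DEST],
                                capture_output=True, text=True)
        stamp = datetime.datetime.now().isoformat(timespec="seconds")
        if result.returncode != 0:
            # A failed sync means the warm site is falling behind current data.
            print(stamp, "warm-site sync FAILED:", result.stderr, file=sys.stderr)
            sys.exit(1)
        print(stamp, "warm-site sync OK")

    if __name__ == "__main__":
        nightly_sync()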
General Information:
System Administration Procedures Related to Disaster Recovery
The CIS UNIX and Windows teams document and maintain system administration
procedures related to backups and restores, service setup and recovery, and other
processes that may be used during disaster recovery. Since this information changes
frequently, it will not be copied into this document. Instead, a monthly copy of system
administration procedures will be placed with the offsite copy of the Disaster Recovery
Plan. A copy is also placed at the warm site.

7.0 PLAN MAINTENANCE AND TESTING


Having a disaster recovery plan is critical, but the plan will rapidly become obsolete if a
workable procedure for maintaining it is not also developed and implemented. This
section describes the maintenance procedures necessary to keep the plan up to date.
Basic Maintenance
The plan will be evaluated and updated annually. All portions of the plan will be
reviewed by the Associate CIO for Technology and the Associate Director for CIS. If it is
deemed that portions need to be changed, rewritten, or reviewed by other TIS teams, the
Associate CIO for Technology will assign that task to the appropriate team. In addition,
the plan will be tested on a regular basis and any faults will be corrected. The Data Center
Management Group, comprised of the Associate Director of CIS along with select
members of the team, has the responsibility of overseeing the individual components and
files and ensuring that they meet standards consistent with the rest of the plan.
Change-Driven Maintenance
It is inevitable in the changing environment of the computer industry that this disaster
recovery plan will become outdated if not evaluated annually. Changes that will likely
affect the plan fall into several categories:

Hardware changes
Software or Application changes
Facility changes
Procedural changes
Personnel changes

As changes occur in any of the areas mentioned above, TIS management will determine if
changes to the plan are necessary. This decision will require that the managers be familiar
with the plan in some detail. A document referencing common changes that will require
plan maintenance will be made available and updated when required.
Changes Requiring Plan Maintenance
The following lists some of the types of changes that may require revisions to the disaster
recovery plan. Any change that can potentially affect whether the plan can be used to
successfully restore the operations in OIT systems should be reflected in the plan.
Hardware
Additions, deletions, or upgrades to hardware platforms.
Software or Application
Additions, deletions, or upgrades to system software.
Changes to system configuration.
Changes to applications software affected by the plan.
Facilities
Changes that affect the availability/usability of any warm sites.
Personnel
Changes to personnel identified in the plan.
Changes to organizational structure of the department.
Procedural
Changes to off-site backup procedures, locations, etc.
Changes to application backups.
Changes to vendor lists maintained for acquisition and support purposes.
Testing
This plan will be tested on an annual basis to ensure that the procedures will work in the
event of a disaster. A report detailing the success and/or failure of the test will be
submitted to the TIS management staff after the completion of the test. Management will
then discuss any improvements to the plan, and any revisions based upon the results of
the test and that discussion will be integrated into the document.
The current warm site configuration, with new hardware and software configurations, is
awaiting its first test.


8.0 Appendix A: EMERGENCY CONTACT LIST


Current OIT Contact List:

NAME
ARC Academic
Research
Consultants
Arnold, Rick
Atalig, David
Avery, Terell
Bass, Ryan
(Director)
Beall, Scott
Bechdoldt, Inna
Birch-Wheeles,
Tamarack
Blanton,
Sharon (CIO)
Booth, Jeremy
Bowen, Rick
Bowen, Sandy
Broderick, Kirby
Buono, Nick
Burt, Jason
CAMPUS
OPERATOR
Charbonneau,
Michelle
Chen,
Christopher
CIS - SERVER
OPERATIONS
Cookus, Rocky
Cooley, Will
Cox, Bryan
Daffron,
Clayton
Ehlert, Michael
Evanoff, Anni
Everall, Brian
Fetter, David
Foster, Brad
Freeman, Mark

E-MAIL

OFFICE

HOME #

CELLULAR

consultants@pdx.edu
arnoldr@pdx.edu
datalig@pdx.edu
terell@pdx.edu

5.9112
5.9145
5.9115
5.9523

503.557.3990
503.482.5800
503.560.1497

971.222.8247
*503.853.4341

bass@pdx.edu
sbeall@pdx.edu
inna2@pdx.edu

5.4759
5.3268
5.3267

503.287.5141

*503.577.8958
503.789.5275
503.707.5930

tamarack@pdx.edu

5.3201

971.255.0479

*503.317.6499

sblanton@pdx.edu
jbooth@pdx.edu
bowenr@pdx.edu
bowens@pdx.edu
kirbyb@pdx.edu
buono@pdx.edu
jason@pdx.edu

5.9144
5.5907
5.3399
5.3278
5.4367
5.9160
5.3270

503.206.8728

*503.320.3787

503.654.4617
503.654.4617

PAGER

*503.204.2713
503.317.4769
503.784.6484
805.776.3709

5.6411
mtangen@pdx.edu

5.6202

chchen@pdx.edu
server-operationsrequests@pdx.edu
cookusr@pdx.edu
wcooley@pdx.edu
bryan.cox@pdx.edu

5.8424

cdaffron@pdx.edu
mehlert@pdx.edu
aevanoff@pdx.edu
brian@pdx.edu
dfetter@pdx.edu
fosterbk@pdx.edu
markf@pdx.edu

5.6201
5.3498
5.3294
5.9182
5.9154
5.9119
5.8493

5.9151
5.4369
5.8479
5.8490


503.481.6119

*503.481.6119
415.710.6139

503.789.3077
503.788.4688

971.235.5105

971.409.6582
*503.327.3542
971.404.8264
971.238.0793
503.531.3489


503.202.1814
206.235.3408
*503.720.5006


Frey, Gordon
Garrick, Will
Gilbert, Dennis
Giles, Shem
Gostomski,
Michael
Guyette, David
Harris, Ann
(Director)
Hartig, Ben
Harvey,
Morgan
Henderson,
Beverly
Henry, Nate

freygo@pdx.edu
garrickw@pdx.edu
dennis@pdx.edu
giless@pdx.edu

5.3480
5.3235
5.3250
5.3255

503.283.5610
503.236.3915
503.384.0289
503.294.9999

503.309.8552
*503.539.6669
*503.348.3143
*503.572.9893

mjg@pdx.edu
guyette@pdx.edu

5.9153
5.4366

harrisa@pdx.edu
bhartig@pdx.edu

5.3448
5.9112

mharvey@pdx.edu

5.9112

hendersonb@pdx.edu
nhenry@pdx.edu

5.3141
5.3488

503.231.9132
503.282.4732

Hison, Tudor
Hoover, Lance
IS - BANNER
TECHNICAL
HOTLINE
IS/CIS/ARC
ADMIN
OFFICE
ITS CLASSROOM
AUDIO VISUAL
SUPPORT
ITS - DLC
OPERATIONS
ITS - VIDEO
PRODUCTION
SERVICES
Jayawardena,
Janaka (ACIO)
Johnston, Tim
(Director)
King, Todd
Kutch, Brenna
La Tourrette,
Tyson
Lee, Carolyn
Linton, Thom
McCartney,
Doug
(Director)
McElroy, Kenny
Miller, Leslie
Miranda,

tudor@pdx.edu
hooverl@pdx.edu

5.3284
5.5894

503.257.8492

*503.740.7503
*503.319.2065

503.475.5314
734.904.0040
503.786.7935

*503.348.2601
503.957.5257
503.953.3976
503.804.9686

5.9560
5.4441

av@pdx.edu

5.9100
5.9146
5.2630

janaka@cat.pdx.edu

5.5410

503.336.3750

*503.941.0374

johnstont@pdx.edu
kingt@pdx.edu
brennak@pdx.edu

5.2776
5.5430
5.8522

503.588.0955
360.896.8587

*971.645.2216

tyson@pdx.edu
leec@pdx.edu
tlinton@pdx.edu

5.9166
5.4358
5.9100

dmccart@pdx.edu
mkenny@pdx.edu
lesliej@pdx.edu
ade@pdx.edu

5.9110
5.3368
5.6420
5.3289


*503.929.0485

503.547.8408

*503.381.6828
919.302.1103

503.262.8102

*503.890.8751
503.267.6150
*503.481.6326
*971.285.5004

503.284.0150



503.921.1385

Adrian
Mitarnowski,
Matthew
Moore, James
Morillas,
Monica
Moskal, Mary
Kay
Myers, Brian
Naanee, Lisa
Nimura, Alison
NTS NETWORK
OPERATIONS
CENTER
NTS TELEPHONE
REPAIR
AFTER
HOURS
Oreste, Joe
Owens,
Thomas
Oxman, Max
Parmer, Max
Powell, Shari
Richeson, Rod
Robbins, Ward
Schiller, Craig
Schmierbach,
James
Shapiro, David
Stapleton, Jim
Sukhun,
Jahed
(Director)
Thomas, Erica
Thomas, Jerrod
Tuggle, Jim
USS - HELP
DESK
USS - LABS
AND
CLASSROOMS
Vo, Jacquelyn
Tran
Waisanen, Ben
Walker, Mark
Walsh, Dan

mmit@pdx.edu
moorej@pdx.edu

5.6204
5.8467

morillas@pdx.edu

5.9104

moskalmk@pdx.edu
myers@pdx.edu
naaneel@pdx.edu
animura@pdx.edu

5.5544
5.9143
5.4067
5.9868

nts@lists.pdx.edu

5.3280

orestej@pdx.edu

5.6400
5.4359

503.888.9481

owenst@pdx.edu
oxman@pdx.edu
maxp@pdx.edu
powells@pdx.edu
rricheso@pdx.edu
robbinw@pdx.edu
craigs@pdx.edu

5.9529
5.8580
5.9157
5.3394
5.6203
5.4218
5.9107

503.560.4193
503.880.7535
503.380.7455
*971.544.1984
503.267.1320
*971.998.7950
*503.330.3162

jschmie@pdx.edu
davidsh@pdx.edu
stapletonj@pdx.edu

5.9158
5.3370
5.8492

sukhunj@pdx.edu
emthomas@pdx.edu
jerrodt@pdx.edu
tugglej@pdx.edu

5.3323
5.9147
5.8558
5.4466

helpdesk@lists.pdx.edu

5.4357

dst@lists.pdx.edu

5.8725

tranj@pdx.edu
waisanenb@pdx.edu
walkerm@pdx.edu
walshd@pdx.edu

5.3588
5.8461
5.8280
5.3310


503.929.3368
503.473.6260
503.740.5257
360.885.0510

360.989.0792
*503.756.6625

503.866.0414
503.515.8691

503.632.2336

503.381.8256
503.238.4580
503.774.0880
503.848.5921

503.599.7749
*503.753.3782
*971.998.4162
*859.327.7711
971.285.6044

360.636.2754

503.775.1371
503.238.5217
503.570.7941
503.654.7890


503.860.1935
503.921.1364
*503.704.8393


503.599.2979

Weeks, Ellen
Weltin, Markus
Williams, Matt
Wortman, Inge
Wrate, Timothy
Wright, Jeffery
Zaw-Tun,
Naing
FAB 83 (Large
IS) Conference
Room
FAB 83-09
(Small IS)
Conference
Room
FAB 84
(Telecom)
Conference
Room
FAB 87-01
Conference
Room
FAB 90-01
Conference
Room
SMSU 18N
Conference
Room
Help Desk
(SMSU 18) Fax
FAB 83 Fax
SMSU 18P Fax

weekse@pdx.edu
mweltin@pdx.edu
matw@pdx.edu
wortmai@pdx.edu
twrate@pdx.edu
jefferyw@pdx.edu

5.2345
5.9112
5.8344
5.5483
5.4201
5.9108

naingz@pdx.edu

5.5893

*503.231.1838
971.207.8632
*971.544.0842
*503.740.2727
503.803.8986
503.980.3422

503.329.6581

5.9179

5.5720

5.8075
5.9014
5.2968
5.9130
5.3360
5.6487
5.3476


503.921.1583

9.0 Appendix B: VENDOR LIST


Below is the contact information for the vendors of all components in this recovery plan.
This list will be updated for all vendors used by TIS and IS as it relates to disaster
recovery efforts.
CommVault (Backup Software): (877) 780-3077
HP (Comark) (Servers): 1-800-234-1490
Dell (Servers): 1-800-633-3600. Support ID: F4B30
Sun (Servers and Software): 800-USA-4SUN
Microsoft (Software): 800-936-4900
Sungard (Banner): 800-825-2518
Cisco (Network Switches, Routers): 1-800-553-2447; 1-408-526-7209
Avaya (Campus Phone System): OUS INOC (1-541-713-3331); Avaya (1-800-242-2121)
Oracle (Database): 800-223-1711
Qwest (Local PBX Trunks, QMOE link to Pittock): 1-800-214-8043; 503-425-5214;
QMOE: 1-800-227-2218. Customer ID: #0003033700
Juniper (Network Switch/Router): 1-888-314-JTAC (1-888-314-5822)
Iron Mountain (Offsite Backup Storage): 888-365-4766. Customer ID: 102932
Black Box Network Services (BBNS)/AVST (Campus Voice Mail System):
Normal Bus. Hrs.: 1-888-565-2400 (press 2); Emergency After Hrs.: 1-800-895-2400
(press 1); If Reqd., Escalation: 1-800-895-2400, ext. 255
Integra Telecom (Link to CapCenter, Link to PBA, Modem Pool Lines): 503-477-4541
USA Mobility Wireless Inc. (Pagers)
AT&T Mobility (Cell Phones): 425-580-5565; Cellular: 425-495-9118
City of Portland IRNE (Main campus link to Pittock): 503-823-1000


10.0 Appendix C: NETWORK DIAGRAMS


PSU Network Connectivity (internal and external)


WWU Warm site connectivity with PSU


11.0 Appendix D: WARM SITE AGREEMENT


12.0 Appendix E: AGREEMENT WITH IRON MOUNTAIN


OFFSITE STORAGE
Offsite Storage for Critical Data Processing Records
Description of Services:
Iron Mountain agrees to provide offsite storage and security of data processing records
for Portland State University. This service includes storage, handling, pick-up and
delivery of media at the customer's location.
Each Iron Mountain facility is designed and constructed specifically for the storage of
magnetic media (no bulk paper) and conforms to ANSI and NFPA requirements. The
entire facility is constructed with building features designed to minimize risk of fire and
unauthorized entry. Protective measures include:

Halon-protected, climate-controlled vaults


Building is seismically upgraded
Building located outside of the 100-year flood plain
Concrete vaults are built independent of exterior structure, a building within a
building
Building alarmed with Sonitrol alarm systems. Each is monitored by both fire and
police departments
Iron Mountain employees only interact with authorized Customer personnel
Vehicles are: independently climate-controlled and halon-protected, locked and
alarmed at all times, with cellular phones for communications


13.0 Appendix F: WARM SITE EQUIPMENT AND CONFIG


HARDWARE
The following equipment is located at the Western Washington University Warm Site
(333 32nd St, Bellingham, WA 98225) Contact Number 360-650-3000:

Firewall: Cisco ASA 5550 w/VPN Premium License


Switch: Cisco WS-C3750G-24TS
Server: Sun Netra T2000, 4 core, 16 thread 1.2Ghz Niagara T1, 32GB RAM
(laga.oit.pdx.edu Banner DB)
Server: Sun Netra T2000, 8 core, 32 thread 1.2Ghz Niagara T1, 32GB RAM
(nott.oit.pdx.edu Banner App)
Storage Array: Sun StorageTek 2500 Array
Server: HP DL380 G3 (aah.psu.ds.pdx.edu)
Cyclades terminal server
Misc: Netbotz

NETWORK & CABLING

Cyclades Port Addressing

Port 3: Laga or Nott


Port 4: Laga or Nott
Port 5: Warmsite_SAN Controller A

Network Addressing
All subnets are /24, with a gateway at .1 (see the sketch after this list).
DNS: 131.252.120.128, 131.252.120.129.

laga.oit.pdx.edu
o e1000g0: 10.140.20.45 (prod VLAN)
o e1000g3: 10.140.0.47 (mgmt VLAN)
o LOM: 10.140.0.45 (mgmt VLAN)
nott.oit.pdx.edu
o e1000g0: 10.140.20.46
o e1000g3: 10.140.0.48
o LOM: 10.140.0.46


Sun StorageTek 2500 Array (Warmsite_SAN)


o Controller A: 10.140.0.49
o Controller B: 10.140.0.50
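Because the addressing above follows a simple convention (/24 networks, gateway at .1), it can be sanity-checked with a few lines of standard-library Python. The host labels are taken from the lists above; the check itself is an illustrative sketch, not part of the warm-site tooling.

    # Consistency check of the warm-site addressing using only the standard
    # library; labels come from the lists in this section.
    import ipaddress

    HOSTS = {
        "laga e1000g0 (prod)": "10.140.20.45",
        "laga e1000g3 (mgmt)": "10.140.0.47",
        "laga LOM (mgmt)":     "10.140.0.45",
        "nott e1000g0 (prod)": "10.140.20.46",
        "nott e1000g3 (mgmt)": "10.140.0.48",
        "nott LOM (mgmt)":     "10.140.0.46",
        "SAN Controller A":    "10.140.0.49",
        "SAN Controller B":    "10.140.0.50",
    }

    for name, addr in HOSTS.items():
        ip = ipaddress.ip_address(addr)
        net = ipaddress.ip_network(addr + "/24", strict=False)
        gateway = net.network_address + 1  # the ".1" convention noted above
        assert ip in net.hosts(), name     # valid host address in its /24
        print(name + ":", addr, "in", str(net), "gateway", str(gateway))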

Fiber Channel Cabling

Laga HBA port 1 to Warmsite_SAN Controller A port 1


Laga HBA port 2 to Warmsite_SAN Controller B port 1
Nott HBA port 1 to Warmsite_SAN Controller A port 2
Nott HBA port 2 to Warmsite_SAN Controller B port 2
External Network Connectivity via Firewall when Warm Site is Activated

All external access requires VPN (ipsec with local account auth) except as otherwise
stated.
Ports 22, 80 and 443 for nott.oit.pdx.edu (80/443 must be open to the world)
Ports 22 and 1526 for laga.oit.pdx.edu
Port 3389 for aah.psu.ds.pdx.edu
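The activation-time firewall policy above can also be expressed as data so it can be reviewed or checked mechanically. In the sketch below the hosts and ports come from the list above, while the data structure and helper function are our own illustration, not the firewall's actual configuration format.

    # Activation-time firewall policy as data (a sketch; structure is ours,
    # hosts/ports come from the list above).
    RULES = [
        # (host, port, open_to_world); anything not world-open requires VPN
        ("nott.oit.pdx.edu",   22,   False),
        ("nott.oit.pdx.edu",   80,   True),   # web open to the world
        ("nott.oit.pdx.edu",   443,  True),
        ("laga.oit.pdx.edu",   22,   False),
        ("laga.oit.pdx.edu",   1526, False),  # database listener
        ("aah.psu.ds.pdx.edu", 3389, False),
    ]

    def allowed(host, port, via_vpn):
        """True if a connection to host:port is permitted given VPN status."""
        for h, p, world_open in RULES:
            if h == host and p == port:
                return world_open or via_vpn
        return False

    assert allowed("nott.oit.pdx.edu", 443, via_vpn=False)       # public web
    assert not allowed("laga.oit.pdx.edu", 1526, via_vpn=False)  # needs VPN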
STORAGE
Of the 12 drives in the enclosure, 9 are allocated to laga, and 3 are allocated to nott.


14.0 Appendix G: DATA CENTER STARTUP/SHUTDOWN PROCEDURES

Due to dependencies caused by the interconnected nature of the systems in the Data
Center, care must be exercised in starting up and shutting down systems. Bringing up a
system providing a service before the services it depends on are already online can
adversely affect the service and, at worst, corrupt data.
The following tables list the order in which broad categories of services are brought
online and taken offline. Specific details of configuration management can be found in
the technical documentation of the respective teams.
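As a minimal sketch of the sequencing logic behind the tables: groups start strictly in order, and tasks within a group run in parallel (one administrator per task), so each group takes as long as its slowest task. The grouping and times below are taken from the startup table that follows; the code itself is illustrative, not operational tooling.

    # Sketch of the group-ordered startup logic; times from the table below.
    STARTUP = [
        [("Lenel Card Access", 20), ("DNS", 20), ("SAN", 45),
         ("LDAP", 30), ("Active Directory", 20)],                 # group 1
        [("Banner", 45), ("Luminis", 45), ("OAM", 30),
         ("pdx.edu Drupal", 45), ("Email", 60),
         ("File Servers", 30), ("ESX", 45)],                      # group 2
        [("Co-Lo Customers", 90)],                                # group 3
        [("Backups", 60)],                                        # group 4
    ]

    total = 0
    for number, tasks in enumerate(STARTUP, start=1):
        # Parallel tasks: the group takes as long as its slowest task.
        group_time = max(minutes for _, minutes in tasks)
        total += group_time
        print("group", number, "takes", group_time, "min")
    print("grand total:", total, "min")  # 45 + 60 + 90 + 60 = 255, matching the table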
Startup Sequence:

Order  System             Estimated Time (Minutes)
1      Lenel Card Access  20
1      DNS                20
1      SAN                45
1      LDAP               30
1      Active Directory   20
       TOTAL TIME for part 1, assuming one administrator per task: 45
2      Banner             45
2      Luminis            45
2      OAM                30
2      pdx.edu Drupal     45
2      Email              60
2      File Servers       30
2      ESX                45
       TOTAL TIME for part 2, assuming one administrator per task: 60
3      Co-Lo Customers    90
4      Backups            60
       GRAND TOTAL TIME: 255

Shutdown Sequence:

Order  System             Estimated Time (Minutes)
1      Backups            20
2      Co-Lo Customers    30
3      ESX                30
3      File Servers       10
3      Email              20
3      Banner             30
3      Luminis            20
3      OAM                20
3      pdx.edu Drupal     30
3      Misc Servers       30
       TOTAL TIME for part 3, assuming one administrator per task: 30
4      Active Directory   15
4      LDAP               30
4      SAN                20
4      DNS                15
       TOTAL TIME for part 4, assuming one administrator per task: 60
       Lenel Card Access  15
       GRAND TOTAL TIME: 155

15.0 Appendix H: OIT CONTINUITY OF OPERATIONS PLAN 2009


Continuity of Operations Plan (COOP)
Portland State University
Instructions: As part of PSU emergency response preparations, all PSU departments and
units are required to complete this form to document their Continuity of Operations Plan
(COOP), describing how your department will operate during a natural disaster, pandemic,
or other emergency situation and recover afterwards to be fully operational. This is your
Plan; feel free to augment this template to meet your needs. For more information, call
Campus Public Safety Director Mike Soto at (503) 725-4782, PSU Emergency
Management Coordinator Bryant Haley at (503) 725-2220, or Don Johansen, Risk
Manager, at (503) 725-5340.
Department/Unit: OIT
Developer: Sharon Blanton
Date Plan Finalized: 9/23/09
Head of Operations: Sharon Blanton
Phone Number: 5-9144
Alt Phone Number: 503-320-3787
Email address: sblanton@pdx.edu

A: Your Department's Objectives

Considering your department's unique mission, describe your teaching, research and
service objectives:
What We Do
PSU-OIT serves campus customers by:
Providing leadership and direction for the optimal integration of information technology in all
University endeavors
Designing and deploying an efficient and effective campus technology infrastructure including
networks, telecommunications, servers and data storage, e-mail and web services, lab and
classroom technologies, etc.
Leading the campus in the development and use of enabling information technologies that support
instruction, research and public service
Developing, implementing and supporting quality information systems for administrative and
academic uses
Supporting the campus community of technology users through assistance, training, and
troubleshooting


B: Emergency Communication Systems


All PSU employees are responsible for keeping informed of emergencies by monitoring
news media reports, PSU's website home page and/or Campus Alert phone messages.
To rapidly communicate with employees in an emergency, we encourage all departments
to prepare and maintain a call tree.
Note below the system(s) you will use to contact your employees in an emergency.
Departments should identify multiple communication systems that can be used for
backup, after hours, when not on campus, or for other contingencies.
Phone
Email
Text messaging
Call tree
Departmental web site
Pager
Instant messaging
Other (describe): __all__________________

C: Emergency Access to Information and Systems


If access to your department's information and systems is essential in an emergency,
describe your emergency access plan below. This may include remote access (or
authorization to allow remote access), contacting IT support, Blackboard, off-site data
backup, backup files on flash drives, hard copies, Blackberry/Treo or use of PSU's
webmail system.
All OIT staff use VPN to securely access PSU systems. We have a private IRC channel for
emergency communications. We also make extensive use of the OIT wiki that is housed off
site. We will file telecommuting forms for all staff. We have also contracted with an external
provider for 24x7 helpdesk support.

D: Your Department's Essential Functions

List below your department's functions that are essential to operational continuity and/or
recovery, and who is responsible to perform them. Also indicate if this function can be
performed off-site. Make sure that alternates are sufficiently cross-trained to assume
responsibilities.
Essential Function: User Support Services
Primary: Jahed Sukhun (5-3323; 971-998-4162)
Alternate: Heather Gregg (5-3466; 503-860-7039)
Second Alternate: Max Oxman (5-8580; 503-880-7535)
Remote Capabilities?

Essential Function: Network and Telecomm
Primary: Tim Johnston (5-2776; 971-645-2216)
Alternate: Clayton Daffron (5-6201; 971-409-6582)
Second Alternate: Dan Walsh (5-3310; 503-704-8393)
Remote Capabilities?

Essential Function: Enterprise Information Systems
Primary: Ann Harris (5-3448; 503-348-2601)
Alternate: Joe Oreste (5-4359; 503-888-9481)
Second Alternate: Scott Beall (5-3268; 503-789-5275)
Remote Capabilities?

Essential Function: Enterprise Infrastructure Services
Primary: Ryan Bass (5-4759; 503-577-8958)
Alternate: Wil Cooley (5-8479; 971-235-5105)
Second Alternate: Janaka Jayawardena (5-5410; 503-476-5541)
Remote Capabilities?

Essential Function: Instructional Technology Services
Primary: Doug McCartney (5-9110; 503-890-8751)
Alternate: Mark Walker (5-8280; 503-449-9733)
Second Alternate: Rick Arnold (5-9145; 971-222-8247)
Remote Capabilities?

Essential Function: CIO Administrative Operations
Primary: Sharon Blanton (5-9144; 503-320-3787)
Alternate: David Atalig (5-9115; 503-853-4341)
Second Alternate: Jackie Vo (5-3588; 503-860-1935)
Remote Capabilities?

Review your department's key personnel, leaders, heads and those responsible for the
above essential functions to identify your department's Emergency Operations Personnel
(EOP). Your department's Human Resources contact can help you identify EOP. For
more information on EOP, see Section L below. We strongly encourage all employees to
update their contact information in the PSU Emergency Notification System through
Employee Link at http://www.pdx.edu/cpso/psu-alert-notification-system, which is kept
as private information by default. This contact information may be important in an
emergency.

E: Your Department's Leadership Succession

List the people who can make operational decisions if the head of your department or
unit is absent.

Head of Operations: Sharon Blanton, 5-9144, 503-320-3787 (cell)
First Successor: Janaka Jayawardena, 5-5410, 503-476-5541 (cell)
Second Successor: Mark Gregory, 5-3281, 503-348-2672 (cell)
Third Successor: Jahed Sukhun, 5-3323, 971-998-4162 (cell)

F: Key Internal (Within PSU) Dependencies

All PSU departments rely on the Office of Information Technology, Payroll, Purchasing,
Finance, Business Affairs, CPSO, Human Resources, Risk Management, and Facilities
Management. List below the other products and services upon which your department
depends and the internal (PSU) departments or units that provide them.

Dependency (product or service): Contracts
Provider (PSU department): Karen Preston
Dependency (product or service): Legal
Provider (PSU department): David Reece

G: Key External Dependencies

List below the products, services, suppliers and providers upon which your department
depends. We recommend that you encourage them to prepare continuity of operations
plans.

Dependency (product or service): Disaster recovery site
Supplier/Provider (Primary): Western Washington University
Phone Numbers: 360-650-3000

Dependency (product or service): Data storage
Supplier/Provider: Iron Mountain
Phone Numbers: 800-899-4766

H: Mitigation Strategies

Considering your objectives, dependencies and essential functions, describe below the
steps you can take now to minimize an emergency's impact on your operations. For
example, you may wish to stock up on your critical supplies and develop contingency
work-at-home procedures. This may be the most important step of your emergency
planning process. Formulation of your mitigation strategies may require reevaluation of
your objectives and functions.

OIT implemented Elluminate, which provides an outsourced capability to hold meetings
online.
OIT is in the process of significantly expanding the VPN capability.
OIT is reviewing and renewing telecommuting authorizations.
OIT also has a disaster recovery plan.

I: Exercising Your Plan & Informing Your Staff

Share your completed Plan with your staff. Hold exercises to test the Plan and maintain
awareness. Note below the type of exercises you will use and their scheduled dates. For
assistance in exercising your Plan, contact PSU Emergency Management Coordinator
Bryant Haley at (503) 725-2220.

Staff orientation meeting
Emergency communication test
Call tree drill
Off-site information access test
Tabletop exercise
Unscheduled work-at-home day
Emergency assembly drill
Interdepartmental exercise
Other drill (describe): ___________________________________

Exercise Dates:
Staff Distribution Date:

J: Resumption of Operations
Describe your Plan to fully resume operations as soon as possible after the emergency event
has passed. Identify and address resumption/scheduling of normal activities and services,
work backlog, resupply of inventories, continued absenteeism, the use of earned time off, and
emotional needs.
See attached DR plan.


K: Special Considerations for Your Department


Describe here any additional or unique considerations that your department may face in the
case of forced closure.
Even if we have to shut down operations, we still have to maintain services for the rest of
campus. If we have to activate our DR site at Western Washington University, then travel and
lodging will be needed.


L: Additional Resources and Policy Summaries


Emergency Employee Selection Guidelines
Departments should identify as Emergency Operations Personnel (EOP) those who are
responsible for performing functions that are absolutely essential to the continuation of core
university operations (e.g., protection of property or safety, support of campus health
services, payroll, etc.) during a multi-week public health or other type of emergency when
classes and most other university activities are suspended. Emergency Operations Personnel
must satisfactorily perform their responsibilities in any type of emergency. Visit the
Emergency Management Unit website for more information on training:
http://www.pdx.edu/cpso/emergency-management-unit-emu.

M: More Information Regarding Your Department

Please note below information for your department's contact.

COOP Contact Name: David Atalig
Phone Number: 5-9115
Campus PO Box: 751 OIT
Email address: datalig@pdx.edu
Dept. locations: SMSU, FAB, Cramer, Neuberger, Urban

Please indicate below the principal nature of your department's operations (check all that
apply):
Instruction
Student life support
Laboratory research
Research support
Other research
Facilities support
Administration
Other (describe):
____all____________________________

N: COOP Submission

Thank you for completing your department's Continuity of Operations Plan (COOP).
Please submit an electronic copy of this Plan to the University's Emergency Management
Coordinator, Bryant Haley, at bhaley@pdx.edu.

The PSU Emergency Management Unit (EMU)
May 1, 2009
