
STRATEGIES

FOR DATA
PROTECTION
FIRST EDITION
A strategic approach to
comprehensive data protection
TOM CLARK
This book is dedicated to the memory of Kent Hanson.
Returned too soon to stardust and golden, he is sorely
missed by his workmates and friends.
Copyright © 2008 Brocade Communications Systems, Inc. All Rights Reserved.
Brocade, Fabric OS, File Lifecycle Manager, MyView, and StorageX are
registered trademarks and the Brocade B-wing symbol, DCX, and SAN Health
are trademarks of Brocade Communications Systems, Inc., in the United
States and/or in other countries. All other brands, products, or service names
are or may be trademarks or service marks of, and are used to identify,
products or services of their respective owners.
Notice: This document is for informational purposes only and does not set
forth any warranty, expressed or implied, concerning any equipment,
equipment feature, or service offered or to be offered by Brocade. Brocade
reserves the right to make changes to this document at any time, without
notice, and assumes no responsibility for its use. This informational document
describes features that may not be currently available. Contact a Brocade
sales office for information on feature and product availability. Export of
technical data contained in this document may require an export license from
the United States government.
Brocade Bookshelf Series designed by Josh Judd
Strategies for Data Protection
Written by Tom Clark
Reviewed by Brook Reams
Edited by Victoria Thomas
Design and Production by Victoria Thomas
Illustrations by Jim Heuser, David Lehmann, and Victoria Thomas
Content for Chapters 9 through 12 based on the Brocade corporate Web site
(www.brocade.com), edited by Doug Wesolek
Content for Chapter 13 provided by the S3 team
Printing History
First Edition, eBook, June 2008
Important Notice
Use of this book constitutes consent to the following conditions. This book is
supplied AS IS for informational purposes only, without warranty of any kind,
expressed or implied, concerning any equipment, equipment feature, or
service offered or to be offered by Brocade. Brocade reserves the right to
make changes to this book at any time, without notice, and assumes no
responsibility for its use. This informational document describes features that
may not be currently available. Contact a Brocade sales office for information
on feature and product availability. Export of technical data contained in this
book may require an export license from the United States government.
Brocade Corporate Headquarters
San Jose, CA USA
T: (408) 333 8000
info@brocade.com
Brocade European Headquarters
Geneva, Switzerland
T: +41 22 799 56 40
emea-info@brocade.com
Brocade Asia Pacific Headquarters
Singapore
T: +65 6538 4700
apac-info@brocade.com
Acknowledgements
Many thanks to Victoria Thomas for her meticulous copyediting and superb
organization at pulling this project together. Thanks also to Brook Reams for
reviewing the final manuscript and providing technical insight into many of the
issues raised by data protection. Finally, thanks to Tom Buiocchi and the entire
Brocade Marketing team for creating such a supportive and intelligent working
environment.
About the Author
Tom Clark is a resident SAN evangelist for Brocade. He represents Brocade in
industry associations, conducts seminars and tutorials at conferences and
trade shows, promotes Brocade storage networking solutions, and acts as a
customer liaison. A noted author and industry advocate of storage networking
technology, he is a board member of the Storage Networking Industry
Association (SNIA) and Chair of the SNIA Green Storage Initiative. Clark has
published hundreds of articles and white papers on storage networking and
is the author of Designing Storage Area Networks, Second Edition
(Addison-Wesley, 2003), IP SANs: A Guide to iSCSI, iFCP and FCIP Protocols
for Storage Area Networks (Addison-Wesley, 2001), and Storage Virtualization:
Technologies for Simplifying Data Storage and Management (Addison-Wesley,
2005).
Prior to joining Brocade, Clark was Director of Solutions and Technologies
for McDATA Corporation and the Director of Technical Marketing for Nishan
Systems, the innovator of storage over IP technology. As a liaison between
marketing, engineering, and customers, he has focused on customer
education and defining features that ensure productive deployment of SANs.
With more than 20 years of experience in the IT industry, Clark has held technical
marketing and systems consulting positions with storage networking and other
data communications companies.
Contents
Introduction .................................................................................................. i
Part One .......................................................................................................1
Chapter 1: Building the Foundation ........................................................3
Storage-Centric vs. Network-Centric SAN Architectures ..................................... 4
Flat SAN Topologies ...................................................................................... 4
Mesh SAN Topologies ................................................................................... 7
Core-Edge SAN Topologies ........................................................................... 9
Inter-Fabric Routing ....................................................................................11
Virtual Fabrics .............................................................................................13
Additional SAN Design Considerations .....................................................14
Highly Available Storage .....................................................................................16
Local Mirroring (RAID 1) .............................................................................16
Other RAID Levels .......................................................................................18
RAID as a Form of Storage Virtualization ..................................................20
Alternate Pathing and Failover ...................................................................20
Additional High Availability Storage Features ...........................................22
Storage and Fabric Consolidation .....................................................................22
SAN Security ........................................................................................................ 24
Securing the SAN Data Transport ..............................................................25
Securing Storage Data Placement ............................................................ 31
Securing the Management Interface ........................................................34
Going to the Next Level: The Brocade Data Center Fabric ...............................35
Chapter 2: Backup Strategies ............................................................... 37
Conventional Local Backup ................................................................................ 37
Backup Fabrics ...........................................................................................42
Disk-to-Disk (D2D) Tape Emulation ...........................................................43
Disk-to-Disk-to-Tape (D2D2T) .....................................................................44
Remote Backup ..................................................................................................45
Data Restoration from Tape .......................................................................49
Chapter 3: Disaster Recovery ............................................................... 51
Defining the Scope of Disaster Recovery Planning ..........................................52
Defining RTO and RPO for Each Application .....................................................53
Synchronous Data Replication ..........................................................................55
Metro DR .....................................................................................................56
Leveraging High Speed ISLs ......................................................................58
Asynchronous Data Replication .........................................................................59
Going the Distance .....................................................................................60
Disaster Recovery Topologies ............................................................................70
Three-Tier DR ..............................................................................................70
Round Robin DR ......................................................................................... 71
SAN Routing for DR .............................................................................................73
Disaster Recovery for SMBs ............................................................................... 74
Chapter 4: Continuous Data Protection .............................................. 75
Defining the Scope of CDP ................................................................................. 76
Near CDP .............................................................................................................78
True CDP ..............................................................................................................78
Integrating CDP with Tape Backup and Disaster Recovery ..............................80
Chapter 5: Information Lifecycle Management ................................. 81
Tiered SAN Architectures ...................................................................................83
Classes of Storage Containers ..................................................................83
Classes of Storage Transport .....................................................................84
Aligning Data Value and Data Protection ..........................................................86
Leveraging Storage Virtualization ...................................................................... 87
Storage Virtualization Mechanics ..............................................................89
Convergence of Server and Storage Virtualization ...................................92
Fabric-Based Storage Services ..........................................................................92
Fabric Application Interface Standard (FAIS) ............................................93
Brocade Data Migration Manager (DMM) .................................................95
Chapter 6: Infrastructure Lifecycle Management .............................. 97
Leased versus Purchased Storage .................................................................... 97
The Data Deletion Dilemma ...............................................................................98
Bad Tracks ...................................................................................................98
Data Remanence ........................................................................................99
Software-based Data Sanitation ............................................................ 100
Hardware-based Data Sanitation ........................................................... 100
Physical Destruction of Storage Assets ...........................................................101
Chapter 7: Extending Data Protection to Remote Offices ..............103
The Proliferation of Distributed Data .............................................................. 103
Centralizing Remote Data Assets ................................................................... 106
Remote Replication and Backup .............................................................107
Leveraging File Management Technology for Data Protection ............. 108
Protecting Data with Brocade StorageX ................................................. 110
Brocade File Management Engine ......................................................... 112
Part Two ..................................................................................................113
Chapter 8: Foundation Products ........................................................115
Brocade DCX Backbone .................................................................................. 116
Brocade 48000 Director ................................................................................. 119
Brocade Mi10K Director .................................................................................. 121
Brocade M6140 Director ................................................................................ 122
Brocade FC4-16IP iSCSI Blade ....................................................................... 123
Brocade FC10-6 Blade .................................................................................... 124
Brocade 5300 Switch ...................................................................................... 125
Brocade 5100 Switch ...................................................................................... 126
Brocade 300 Switch ........................................................................................ 127
Brocade Fibre Channel HBAs .......................................................................... 128
Brocade 825/815 FC HBA ...................................................................... 128
Brocade 425/415 FC HBA ...................................................................... 129
Brocade SAN Health ........................................................................................ 130
Chapter 9: Distance Products .............................................................133
Brocade 7500 Extension Switch .................................................................... 133
FR4-18i Extension Blade ................................................................................. 134
Brocade Edge M3000 ..................................................................................... 135
Brocade USD-X ................................................................................................. 136
Chapter 10: Backup and Data Protection Products ........................137
Brocade FA4-18 Fabric Application Blade ......................................................137
Brocade Data Migration Manager Solution ................................................... 139
EMC RecoverPoint Solution ............................................................................ 140
Chapter 11: Branch Office and File Management Products ..........143
Brocade File Management Engine ................................................................. 143
Brocade StorageX ............................................................................................ 145
Brocade File Insight ......................................................................................... 146
Chapter 12: Advanced Fabric Services and Software Products ....149
Brocade Fabric OS ........................................................................................... 149
Brocade Advanced Performance Monitoring ......................................... 150
Brocade Access Gateway .........................................................................151
Brocade Fabric Watch ............................................................................. 152
Brocade Inter-Switch Link Trunking ........................................................ 153
Brocade Extended Fabrics ...................................................................... 154
Brocade Enterprise Fabric Connectivity Manager ......................................... 156
Brocade Basic EFCM ............................................................................... 156
Brocade EFCM Standard and Enterprise ............................................... 156
Brocade Fabric Manager ................................................................................. 158
Brocade Web Tools .......................................................................................... 160
Chapter 13: Solutions Products .........................................................163
Backup and Recovery Services ........................................................ 163
Brocade Virtual Tape Library Solution ............................................................ 164
Appendix A: The Storage Networking Industry Association (SNIA) .167
Overview ........................................................................................................... 167
Board of Directors ............................................................................................ 168
Executive Director and Staff ........................................................................... 169
Board Advisors ................................................................................................. 169
Technical Council ............................................................................................. 169
SNIA Technology Center .................................................................................. 169
End User Council ..............................................................................................170
Committees .......................................................................................................170
Technical Work Groups ..................................................................................... 171
SNIA Initiatives .................................................................................................. 171
The SNIA Storage Management Initiative ............................................... 171
The SNIA XAM Initiative ............................................................................ 171
The SNIA Green Storage Initiative ........................................................... 171
Industry Forums ........................................................................................172
SNIA Data Management Forum ...............................................................172
SNIA IP Storage Industry Forum ..............................................................172
SNIA Storage Security Industry Forum ....................................................173
Regional Affiliates .............................................................................................173
Summary ...........................................................................................................173
Figures
Figure 1. A simplified flat SAN architecture with no ISLs .................................. 4
Figure 2. Expanding a flat SAN architecture via the addition of
switch elements .................................................................................................... 6
Figure 3. A mesh SAN topology with redundant pathing .................................. 7
Figure 4. A core-edge SAN topology with classes of storage and servers ....... 9
Figure 5. A three-tier core-edge SAN topology with the core servicing
ISLs to fabric .......................................................................................................10
Figure 6. Using inter-fabric routing to provide device connectivity
between separate SANs .....................................................................................12
Figure 7. Sharing a common SAN infrastructure via virtual fabrics ...............14
Figure 8. Array-based (top) and server-based (bottom) disk mirroring ..........17
Figure 9. Array-based mirroring between separate enclosures .....................18
Figure 10. RAID 5 with distributed parity blocks .............................................19
Figure 11. Providing alternate paths from servers to storage .......................21
Figure 12. Simplifying the fabric and storage management
via consolidation .................................................................................................23
Figure 13. Establishing zones between groups of initiators and
targets to segregate traffic ................................................................................26
Figure 14. Creating secure device connectivity via port binding ...................27
Figure 15. Securing the fabric with fabric ISL binding ....................................28
Figure 16. Restricting visibility of storage Logical Units via
LUN masking .......................................................................................................32
Figure 17. The Brocade DCF provides the infrastructure to optimize
the performance and availability of upper-layer business applications .........36
Figure 18. LAN-based tape backup transports both data and metadata
over the LAN ........................................................................................................39
Figure 19. LAN-free tape backup separates the metadata and data
paths to offload the LAN transport and optimize backup streams .................40
Figure 20. Server-free backup removes the production server from the data
path, freeing CPU cycles for applications instead of backup operations .......41
Figure 21. A dedicated tape SAN isolates the backup process from
the production SAN ............................................................................................42
Figure 22. Disk-to-disk tape emulation requires no changes to
backup software .................................................................................................43
Figure 23. Combining disk-to-disk tape emulation with conventional
tape backup ........................................................................................................45
Figure 24. Consolidating remote tape backup places all data under
the control and best practices of the data center ............................................46
Figure 25. Tape vaulting centralizes all data backup to a secure
location dedicated to protecting all corporate data .........................................47
Figure 26. Without tape pipelining, performance falls dramatically
during the first 10 miles. ....................................................................................48
Figure 27. Array-based synchronous replication over distance .....................55
Figure 28. Maximizing utilization of large storage systems for
bi-directional replication ....................................................................................56
Figure 29. Leveraging metro SONET for native Fibre Channel
disaster recovery ................................................................................................57
Figure 30. Using Brocade trunking to build high performance metro
disaster recovery links .......................................................................................58
Figure 31. Asynchronous data replication buffers multiple I/Os
while providing immediate local acknowledgement ........................................59
Figure 32. Larger port buffers avoid credit starvation ....................................62
Figure 33. Using Brocade rate limiting to avoid congestion and
erratic performance ............................................................................................65
Figure 34. A standard SCSI write operation over distance requires
significant protocol overhead ............................................................................67
Figure 35. FastWrite dramatically reduces the protocol overhead
across the WAN link by proxying for both initiator and target .........................68
Figure 36. A three-tier DR topology provides an extra layer of data
protection in the event of regional disruption ..................................................71
Figure 37. In a round-robin DR topology, each data center acts
as the recovery site for its neighbor ..................................................................72
Figure 38. SAN Routing reinforces stability of the DR implementation
by maintaining the autonomy of each site. ......................................................73
Figure 39. Continuous data protection provides finer granularity for
data restoration when corruption occurs. .........................................................76
Figure 40. Aged snapshots are rotated on a configurable interval
to conserve disk space on the CDP store. ........................................................78
Figure 41. The CDP engine manages metadata on the location and
time stamp of data copies on the CDP store. ...................................................79
Figure 42. Aligning cost of storage to business value of data .......................82
Figure 43. Aligning classes of storage transport to classes of
storage and applications ....................................................................................85
Figure 44. Conventional LUN allocation between servers and storage .........87
Figure 45. Logically binding servers to virtual LUNs drawn from the
storage pool ........................................................................................................88
Figure 46. The virtualization engine maintains a metadata mapping
to track virtual and physical data locations ......................................................90
Figure 47. FAIS block diagram with split data path controllers and
control path processor .......................................................................................94
Figure 48. Cylinder, head, and sector geometry of disk media ......................98
Figure 49. Traces of original data remain even if the specific
sector has been erased or overwritten .............................................................99
Figure 50. Remote office processing compounds the growth of
remote servers and storage and data vulnerability ...................................... 104
Figure 51. Decentralization of data storage has inherent cost and
data protection issues ..................................................................................... 105
Figure 52. Centralized file access replaces remote server and storage
assets with appliances optimized for high-performance file serving ........... 109
Figure 53. Brocade StorageX provides a global namespace to virtualize
file access across heterogeneous OSs and back-end storage elements .... 111
Figure 54. Brocade File Management Engine components and
architecture ...................................................................................................... 112
Figure 55. Brocade DCX Backbone with all slots populated (no door) ....... 116
Figure 56. Brocade 48000 Director with all slots populated ...................... 119
Figure 57. Brocade Mi10K Director .............................................................. 121
Figure 58. Brocade M6140 Director ............................................................. 122
Figure 59. FC4-16IP iSCSI Blade ................................................................... 123
Figure 60. Brocade 5300 Switch .................................................................. 125
Figure 61. Brocade 5100 Switch .................................................................. 126
Figure 62. Brocade 300 Switch ..................................................................... 127
Figure 63. Brocade 825 FC 8 Gbit/sec HBA (dual ports shown) ................ 128
Figure 64. Brocade 415 FC 4 Gbit/sec HBA (single port shown) ................ 129
Figure 65. SAN Health topology display ........................................................ 130
Figure 66. SAN Health reporting screen ....................................................... 132
Figure 67. Brocade 7500 Extension Switch ................................................. 133
Figure 68. FR4-18i Extension Blade ............................................................. 134
Figure 69. Brocade Edge M3000 .................................................................. 135
Figure 70. Brocade USD-X, 12-slot and 6-slot versions ............................... 136
Figure 71. Brocade FA4-18 ............................................................................ 137
Figure 72. EMC RecoverPoint on Brocade scenario .................................... 141
Figure 73. Brocade File Management Engine (FME) ................................... 143
Figure 74. Overview of Brocade File Insight ................................................. 147
Figure 75. Access Gateway on blades and the Brocade 300 Switch ......... 152
Figure 76. Brocade EFCM interface .............................................................. 157
Figure 77. Brocade Fabric Manager displays a topology-centric
view of SAN environments .............................................................................. 159
Figure 78. Brocade Web Tools Switch Explorer View of the
Brocade 48000 Director ................................................................................. 161
Figure 79. Storage Networking Industry Association organizational
structure ........................................................................................................... 168
Introduction
Data protection is an umbrella term that covers a wide range of tech-
nologies for safeguarding data assets. Data generated and
manipulated by upper-layer applications is the raw material of useful
information. Regardless of their individual products or service offer-
ings, institutions and enterprises today depend on information for their
livelihood. Loss of data can quickly result in loss of revenue, which in
turn could result in loss of the enterprise itself.
Because data is so essential for the viability of an organization, finding
the means to protect access to data and ensure the integrity of the
data itself is central to an IT strategy. Data ultimately resides on some
form of storage media: solid state disk, tape, optical media, and in par-
ticular disk media in the form of storage arrays. The dialect of data
protection is therefore necessarily storage-centric. Layers of data pro-
tection and access mechanisms, ranging from high-availability block
access to distributed file systems, are built on a foundation of fortified
storage and extend up to the application layer. Network-attached stor-
age (NAS), for example, serves files to upper-layer applications, but
cannot do so reliably without underlying safeguards at the block level,
including redundant array of inexpensive disks (RAID), alternate path-
ing, data replication, and block-based tape backup.
A strategic approach to comprehensive data protection includes a par-
fait of solutions that on the surface may seem unrelated, but in reality
are essential parts of a collaborative ecosystem. Safeguarding data
through data replication or backup has little value if access to data is
impeded or lost through bad network design or network outage. Con-
sequently, it is as important to ensure data access as it is to protect
data integrity. For storage area networks (SANs), alternate pathing with
failover mechanisms are essential for providing highly available access
to data, and high availability (HA) enables consistent implementation
of data replication, snapshot, backup, and other data protection
services.
In this book we will examine the key components of an enterprise-wide
data protection strategy, including data center SAN design within the
framework of Brocade's data center fabric (DCF) architecture and
securing data assets in remote sites and branch offices. For most
enterprises, data is literally all over the place. Typically, more than 70
percent of all corporate data is generated and housed outside the cen-
tral data center. Data dispersed in remote offices is often unprotected
and creates vulnerability for both business operations and regulatory
compliance.
In the central data center, the most mission-critical applications are
run on high-performance Fibre Channel (FC) SANs. The data generated
by these first-tier applications typically benefits from a high degree of
protection through periodic disk-to-disk data replication and tape
backup (locally or remotely via a disaster recovery site). Even large
data centers, however, may have hundreds of standalone servers sup-
porting less critical, second-tier applications. Because they lack the
centrally managed services provided by a SAN, securing the data on
those servers is often difficult and requires additional administrative
overhead. Creating an enterprise-wide solution for protecting all local
and remote corporate data while keeping overall costs under control is
therefore a significant challenge for IT administrators.
Over the past twenty years, a hierarchy of data protection technologies
has evolved to safeguard data assets from device failures, system fail-
ures, operator errors, data corruption, and site outages. RAID, for
example, was developed in the late 1980s to provide data protection
against disk drive failures. Continuous data protection (CDP) is a more
recent technology that provides protection against malicious or inad-
vertent data corruption. At a very granular level, even cyclic
redundancy checks (CRCs) performed by SAN switches and end
devices provide data protection against bit corruption in the data
stream. Data is, after all, sacrosanct and no single technology can pro-
vide comprehensive protection against all potential hazards.
Data protection solutions are differentiated by the scope of defense
they provide. Lower-level solutions offer protection against component,
link, or device failure; while higher-level solutions protect against sys-
tem, business application, or site failure, as shown in Table 1.
In addition, different layers of data protection may satisfy very different
RTOs and RPOs. The recovery time objective (RTO) defines how quickly
access to data can be restored in the event of a device, system, or site
failure. The recovery point objective (RPO) defines the point in time at
which the last valid data transaction was captured, thereby measuring the
level of protection against data loss. The chronic complaint against tape
backup, for example, is that data transactions that occur after the backup
was performed are not secured, and restoration from tape may take hours
or days. Despite its poor RTO and RPO, the enduring strength of tape is
that it provides long-term storage of data on economical, non-spinning
media and is not subject to head crashes or drive failures.
Table 1. Block-based data protection mechanisms

Type of Data Protection      Protection Against            Recovery Time Objective  Recovery Point Objective
RAID                         Disk drive failure            Instantaneous            No data loss
Mirroring                    Link, disk, or array failure  Instantaneous            No data loss
True CDP                     Data corruption               Seconds to minutes       No data loss
Near CDP/Snapshot            Data corruption               Seconds to minutes       Some data loss
Synchronous Replication      System/site failure           Seconds to minutes       No data loss
Asynchronous Replication     System/site failure           Seconds to minutes       Some data loss
Disk-to-Disk Tape Emulation  Array failure                 Minutes                  Some data loss*
Local Tape Backup            Array failure                 Minutes to hours         Some data loss*

* Since the last backup

The scope of data protection also differentiates between recovery from
data loss and recovery from data corruption. Although RAID protects
against data loss due to disk failure, it offers no defense against data
corruption of inbound streams. A virus attack, for example, may corrupt
data as it is written to disk, in which case RAID will simply secure
the already altered data. Likewise, synchronous and asynchronous
replications have no way to verify the integrity of the data on the
source array. Once data corruption has been identified, other means
must be used for restoration to a known good point in time. Restora-
tion from tape works, but is time consuming and useless for
transactions that occurred since the last backup. Continuous data pro-
tection (CDP) is a preferred solution, since it can enable immediate
restoration to the point just prior to data corruption (true CDP) or
within some short time frame prior to the event (near CDP).
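
To make the trade-offs in Table 1 concrete, mechanism selection can be
treated as a simple filter on the RPO and RTO an application can tolerate.
The following Python sketch is illustrative only: the numeric recovery
times are rough stand-ins for the qualitative categories in Table 1
("instantaneous," "seconds to minutes," and so on), not vendor
specifications, and a real selection must also match the failure modes
each mechanism actually protects against.

MECHANISMS = [
    # (name, protects against, approx. RTO in seconds, approx. RPO in seconds)
    ("RAID",                     "disk drive failure",      0,     0),
    ("Mirroring",                "link/disk/array failure", 0,     0),
    ("True CDP",                 "data corruption",         300,   0),
    ("Near CDP / snapshot",      "data corruption",         300,   900),
    ("Synchronous replication",  "system/site failure",     300,   0),
    ("Asynchronous replication", "system/site failure",     300,   900),
    ("D2D tape emulation",       "array failure",           3600,  86400),
    ("Local tape backup",        "array failure",           14400, 86400),
]

def shortlist(max_rto_s, max_rpo_s):
    """Mechanisms whose recovery time and data loss fit the application's limits."""
    return [name for name, _, rto, rpo in MECHANISMS
            if rto <= max_rto_s and rpo <= max_rpo_s]

# Example: an application that can tolerate ten minutes of downtime but no data loss.
print(shortlist(max_rto_s=600, max_rpo_s=0))
# ['RAID', 'Mirroring', 'True CDP', 'Synchronous replication']
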
Expanding in concentric circles from centralized SAN storage, the fab-
ric and server layers provide protected and continuous access to data.
Fabric zoning and logical unit number (LUN) masking, for example,
can prevent servers from accessing and potentially corrupting data on
unauthorized storage arrays. Because Windows in particular wants to
own every storage asset it sees, it is imperative to zone or mask
visibility of Windows servers to UNIX storage volumes. Likewise, use of
zoning or virtual fabrics can ensure that one department's data is
unreachable by another, unrelated department. Enforcing fabric
connections between authorized initiators and targets, between physical
ports, and between switches that compose the fabric is meant to
prevent illicit access to storage and prevent fabric disruptions that
would impair data access.
At the server level, clustering facilitates scale-up of data access by
more clients and provides high availability using failover in the event of
a single server failure. Global clustering extends this concept across
geographical distances so that remote servers can participate in a
high-availability collaboration delivering application and data protec-
tion in the event of a site-wide disaster. At the transport layer,
individual SAN-attached servers are typically configured with redun-
dant host bus adapters (HBAs) for connectivity to parallel primary and
secondary fabrics. The failure of an HBA, port connection, switch port,
switch, or storage port triggers a failover to the alternate path and
thus ensures continuous data access.
At a more granular level, the Fibre Channel transport protocol protects
data integrity and availability through a number of mechanisms,
including CRC checks against the frame contents to guard against bit
errors, frame sequencing to ensure in-order delivery of frames, and
recovery from frame loss. iSCSI likewise provides a CRC digest to verify
packet contents, while relying on Transmission Control Protocol (TCP)
algorithms to provide discrete packet recovery.
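
As a rough illustration of this lowest layer of protection, the short
Python sketch below attaches a checksum to a payload and verifies it on
receipt. It uses the general-purpose CRC-32 from Python's zlib module
purely for illustration; the actual Fibre Channel frame CRC and iSCSI
CRC-32C digests are defined by their respective standards and computed in
hardware by HBAs and switches, not in application code.

import zlib

def protect(payload: bytes) -> tuple[bytes, int]:
    # Sender computes a CRC over the payload before transmission.
    return payload, zlib.crc32(payload)

def verify(payload: bytes, crc: int) -> bool:
    # Receiver recomputes the CRC and compares it to the transmitted value.
    return zlib.crc32(payload) == crc

data, crc = protect(b"block 2048 of the customer database")
assert verify(data, crc)                         # intact payload passes the check

corrupted = bytes([data[0] ^ 0x01]) + data[1:]   # a single bit flipped in transit
assert not verify(corrupted, crc)                # corruption is detected, so the
                                                 # frame can be discarded and recovered
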
At every level, from entire storage sites to individual data frames, the
prime directive of storage technology is to safeguard data integrity and
ensure availability. This objective is fulfilled by engineering the many
facets of data protection into each component of the storage ecosys-
tem. The challenge for storage architects is to use these building
blocks in a coherent design that meets organizational and budget
goals. As with any construction project, quality building materials do
not guarantee a quality result. Developing a comprehensive strategy,
defining the business requirements, establishing guiding principles
based on those requirements, and creating a coherent design in
advance help ensure that all layers of protection and accessibility are
fully leveraged and work in concert to safeguard your data assets.
In the following chapters, we will explore the different strata of data
protection technologies, including data center design and operations,
disaster recovery, storage virtualization solutions, remote tape vault-
ing, SAN extension, and remote office data consolidation via file
management. In this process we will define the best practices applica-
ble to each technology and explain how Brocade products and services
can be leveraged to create a complete solution.
Although storage technologies are commonly available to the entire
market, each enterprise and institution is unique. Customizing an
implementation to suit your specific needs therefore requires an
understanding of your organization's primary business requirements.
Business requirements drive the guiding principles of what a solution
should provide, and those principles establish the parameters of the
final design. Characteristically, the first step is the hardest. The pro-
cess of collecting business requirements from corporate stakeholders
may result in conflicting needs: for example, the requirement to
centralize storage assets to reduce costs and management overhead versus
the requirement to accommodate a rapid proliferation of remote retail
sites. Fortunately, harmonizing these requirements is facilitated by the
much broader offering of technologies from the storage networking
industry today. As will be detailed in the following chapters, Brocade
provides a wide spectrum of solutions and cost points to fulfill a diver-
sity of business needs.
Part One
The following chapters are included in Part One:
Chapter 1: Building the Foundation starting on page 3
Chapter 2: Backup Strategies starting on page 37
Chapter 3: Disaster Recovery starting on page 51
Chapter 4: Continuous Data Protection starting on page 75
Chapter 5: Information Lifecycle Management starting on
page 81
Chapter 6: Infrastructure Lifecycle Management starting on
page 97
Chapter 7: Extending Data Protection to Remote Offices starting
on page 103
Chapter 1: Building the Foundation
Implementing a comprehensive data protection strategy begins with
building a firm foundation at the data transport layer to ensure high
availability access to storage data. A typical data center, for example,
may have multiple, large storage RAID arrays, high-availability Fibre
Channel directors, fabric switches, and high-end servers running criti-
cal business applications. The data center SAN may be configured with
redundant pathing (Fabrics A and B) to guard against link, port, or
switch failures. Many companies have experienced such explosive
growth in data, however, that the original data center SAN design can-
not accommodate the rapid increase in servers, storage traffic, and
arrays. The foundation begins to crumble when administrators go into
reactive mode in response to sudden growth and scramble to integrate
new ports and devices into the SAN. As a consequence, data access
may be disrupted and data protection undermined.
NOTE: In this chapter and throughout the book, the terms "switch" and
"director" refer to a SAN platform, which may be a standalone switch,
an embedded switch module, a director, or a backbone device.
Ideally, a data center SAN design should be flexible enough to accom-
modate both current and anticipated (typically looking out three years)
needs. Although business expansion is rarely linear, it is helpful to
compare an organization's current storage infrastructure to the one it
had three years ago. For most companies, that historical reality check
reveals a substantial increase in storage capacity, servers, tape
backup loads, and complexity of the fabric. That growth may be due to
natural business expansion or simply to the proliferation of compute
resources to more parts of the organization. In either case, the steady
growth of data assets increases the delta between the sheer quantity
of storage data and the amount that is adequately protected. A care-
fully considered SAN design can help close this gap.
Storage-Centric vs. Network-Centric SAN Architectures
A SAN architecture is characterized by the relationship between serv-
ers and storage that is enabled by the fabric topology of switches and
directors. A storage-centric architecture places storage assets at the
core of the SAN design with all fabric connectivity devoted to facilitat-
ing access to storage LUNs by any attached server. A network-centric
architecture, by contrast, borrows from conventional LAN networking
and promotes any-to-any peer connectivity. The impact of each
approach becomes clear when we look at practical examples of SAN
designs in flat, mesh, and core-edge variations.
Flat SAN Topologies
The flat SAN topology has been a popular starting point for SAN design
because it simplifies connectivity and can accommodate redundant
pathing configurations for high availability. As illustrated in Figure 1,
initiators (servers) and targets (storage arrays) are directly connected
to fabric switches or directors, and there is no need for inter-switch
links (ISLs) to create data paths between switches and directors.
Figure 1. A simplified flat SAN architecture with no ISLs
This is a storage-centric design in that storage connectivity is central-
ized to the fabric, and servers (with proper zoning) can attach to any
storage LUN. With redundant A and B pathing, storage transactions
can survive the loss of any single HBA, link, switch port, switch ele-
ment, or storage port. Because each switch element provides
independent paths to each storage array, there is no need for ISLs to
route traffic between switches.
Depending on the traffic load generated by each server, the fan-in
ratio of servers to storage ports (also known as oversubscription)
can be increased. Typically, for 1 Gbit/sec links, a fan-in ratio of 7:1
can be used, although that ratio can be increased to 12:1 at 2 Gbit/
sec and 18:1 or greater at 4 Gbit/sec. In the example in Figure 1, the
oversubscription would occur in the switch or director, with many more
ports devoted to server attachment and fewer ports for storage con-
nections. If the server fan-in ratio cannot accommodate the collective
traffic load of each server group, however, congestion will occur at the
switch storage port and lead to a loss of performance and transaction
stability.
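
The fan-in ratios above are rules of thumb rather than hard limits;
whether a given ratio is safe depends on how busy the attached servers
actually are. The sketch below, using hypothetical port counts and
utilization figures chosen only for illustration, shows how fan-in and
average server utilization combine into the offered load on a storage port:

def fan_in_ratio(server_ports, storage_ports):
    # Servers sharing each storage port, e.g. 96 servers on 8 storage ports = 12:1.
    return server_ports / storage_ports

def storage_port_load(server_link_gbps, fan_in, avg_utilization, storage_link_gbps):
    # Offered load at the storage port as a fraction of its capacity.
    offered_gbps = server_link_gbps * fan_in * avg_utilization
    return offered_gbps / storage_link_gbps

# Example: 96 servers on 2 Gbit/sec links, each about 15% utilized on average,
# fanned in to 8 storage ports running at 4 Gbit/sec.
ratio = fan_in_ratio(96, 8)                     # 12:1 fan-in
load = storage_port_load(2, ratio, 0.15, 4)     # roughly 0.9
print(f"{ratio:.0f}:1 fan-in, storage port at {load:.0%} of capacity")

At roughly 90 percent of port capacity there is little headroom left for
bursts, which is exactly the congestion scenario described above.
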
In practice, the flat SAN topology can be expanded by adding more
switch elements, as shown in Figure 2.
Figure 2. Expanding a flat SAN architecture via the addition of switch
elements
Although this design is entirely adequate for moderate-sized SANs, it
becomes difficult to scale beyond about 600 ports. Three 256-port
directors on each A and B side, for example, would provide 768 ports
for direct server and storage connections. Adding a fourth or fifth
director to each side, however, would increase costs, complicate the
cable plant, and increase the complexity of the SAN and its
management.
In addition, the flat SAN topology is perhaps too egalitarian in applying
an equal cost to all server connectivity regardless of the traffic require-
ments of different applications. Particularly for flat SANs based on
Fibre Channel directors, high-usage servers may benefit from dedi-
cated 4 Gbit/sec connections, but that bandwidth and director real
estate are squandered on low-usage servers. Likewise, a flat SAN
topology cannot accommodate variations in cost and performance
attributes of different classes of storage devices, and so offers the
same connectivity cost to high-end arrays and lower-cost JBODs (just a
bunch of disks) alike. Consequently, even medium-sized SANs with
varying server requirements and classes of storage are better served
by a more hierarchical core-edge SAN design.
Mesh SAN Topologies
In conventional local area networks (LANs) and wide area networks
(WANs), the network is composed of multiple switches and routers
wired in a mesh topology. With multiple links connecting groups of
switches and routers and routing protocols to determine optimum
paths through the network, the network can withstand an outage of an
individual link or switch and still deliver data from source to destina-
tion. This network-centric approach assumes that all connected end
devices are peers and that the role of the network is simply to provide
any-to-any connectivity between peer devices.
Figure 3. A mesh SAN topology with redundant pathing
In a SAN environment, a mesh topology provides any-to-any connectiv-
ity by using inter-switch links between each switch or director in the
fabric, as shown in Figure 3. As more device ports are required, addi-
tional switches and their requisite ISLs are connected. Because each
switch has a route to every other switch, the mesh configuration offers
multiple data paths in the event of congestion or failure of a link, port
or switch. The trade-off for achieving high availability in the fabric, how-
ever, is the consumption of switch ports for ISLs and increased
complexity of the fabric cable plant.
Mesh topologies are inherently difficult to scale and manage as the
number of linked switches increases. A mesh topology with 8 switches,
for example, would require 28 ISLs (56 if 2 links are used per ISL). As
the switch count goes higher, a disproportionate number of ports must
be devoted to building a more complex and expensive fabric. Conse-
quently, as a best practice recommendation, mesh topologies for SANs
should be limited to 4 switches.
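
The 28-ISL figure follows from the fact that a full mesh of n switches
requires a link between every pair of switches, or n(n-1)/2 ISLs. The
short Python sketch below, assuming a hypothetical fabric of 32-port
switches with 2 links per ISL, shows how quickly the mesh consumes ports
and why the 4-switch guideline is prudent:

def mesh_isls(switches, links_per_isl=1):
    # A full mesh needs one ISL per pair of switches: n * (n - 1) / 2.
    return switches * (switches - 1) // 2 * links_per_isl

def port_fraction_for_isls(switches, ports_per_switch, links_per_isl=1):
    # Each ISL consumes one port on each of the two switches it connects.
    isl_ports = 2 * mesh_isls(switches, links_per_isl)
    return isl_ports / (switches * ports_per_switch)

for n in (4, 8, 16):
    isls = mesh_isls(n, links_per_isl=2)
    share = port_fraction_for_isls(n, ports_per_switch=32, links_per_isl=2)
    print(f"{n} switches: {isls} ISLs, {share:.0%} of all ports used for ISLs")
# 4 switches: 12 ISLs, 19% of ports; 8 switches: 56 ISLs, 44%; 16 switches: 240 ISLs, 94%
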
A more fundamental problem with mesh topologies, though, is the
assumption that storage networks need any-to-any connectivity
between peers. Although this model may be valid for messaging net-
works, it does not map directly to storage relationships. SAN end
devices can be active participants (initiators) or passive participants
(targets). Initiators do not typically communicate with one another as
peers across the SAN, but with storage targets in a master/slave rela-
tionship. Storage arrays, for example, do not initiate sessions with
servers, but passively wait for servers to instigate transactions with
them. The placement of storage targets on the SAN, then, should be to
optimize accessibility of targets by initiators and not to provide univer-
sal, any-to-any connectivity. This goal is more readily achieved with a
core-edge design.
Core-Edge SAN Topologies
Core-edge SAN topologies enable a storage-centric, scalable infra-
structure that avoids the complexities of mesh topologies and limited
capacity of flat SAN topologies. The core of the fabric is typically pro-
vided by one or more director-class switches which provide centralized
connectivity to storage. The edge of the fabric is composed of fabric
switches or directors with ISL connections to the core.
Figure 4. A core-edge SAN topology with classes of storage and
servers
As shown in Figure 4, the heavy lifting of storage transactions is sup-
ported by the core director since it is the focal point for all storage
connections, while the edge switches provide fan-in for multiple serv-
ers to core resources. This design allows for connectivity of different
classes of servers on paths that best meet the bandwidth require-
ments of different applications. Bandwidth-intensive servers, for
example, can be connected as core hosts with dedicated 4 Gbit/sec
links to the core director. Standard production servers can share band-
width through edge switches via ISLs to the core, and second-tier
servers can be aggregated through lower-cost edge switches or iSCSI
gateways to the core.
Storage placement in a core-edge topology is a balance between man-
ageability and application requirements. Placing all storage assets on
the core, for example, simplifies management and assignment of
LUNs to diverse application servers. Some departmental applications,
however, could be serviced by grouping servers and local storage on
the same switch, while still maintaining access to core assets. An engi-
neering department, for example, may have sufficient data volumes
and high-performance requirements to justify local storage for depart-
mental needs, in addition to a requirement to access centralized
storage resources. The drawback of department-based storage is
that dispersed storage capacity may not be efficiently utilized. Conse-
quently, most large data centers implement centralized storage to
maximize utilization and reduce overall costs.
Figure 5. A three-tier core-edge SAN topology with the core servicing
ISLs to fabric
As shown in Figure 5, a three-tier, core-edge design inserts a distribu-
tion layer between the core and edge. In this example, the core is used
to connect departmental or application-centric distribution switch ele-
ments via high-performance ISLs. Brocade, for example, offers 10
Gbit/sec ISLs as well as ISL Trunking to provide a very high-perfor-
mance backbone at the core. This tiered approach preserves the
ability to assign storage LUNs to any server, while facilitating expan-
sion of the fabric to support additional storage capacity and server
connections.
For simplicity, the figures shown above do not detail alternate or dual
pathing between servers, switches, and storage. The fabric illustrated
in Figure 4, for example, could be the A side of a dual-path configura-
tion. If directors are used, however, the full redundancy and 99.999
percent availability characteristic of enterprise-class switches provide
another means to implement dual pathing. A server with dual HBAs
could have one link connected to a director port on one blade, and a
redundant link connected to a director port on a different blade. Like-
wise, storage connections can be provided from storage ports to
different blades on the same director chassis. As in Fabric A and B,
this configuration provides failover in the event of loss of an HBA, link,
port, blade, or storage port.
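
For perspective, the sketch below converts the 99.999 percent availability
figure into downtime per year and shows why two independent paths
(Fabric A and Fabric B) improve matters further. The single-path
availability used for the A/B calculation is hypothetical, chosen only to
illustrate the arithmetic:

MINUTES_PER_YEAR = 365 * 24 * 60

def downtime_minutes_per_year(availability):
    # Unavailability multiplied by the minutes in a year.
    return (1 - availability) * MINUTES_PER_YEAR

print(round(downtime_minutes_per_year(0.99999), 1))   # about 5.3 minutes per year

# With redundant A and B fabrics, access is lost only if both paths fail at once.
path_availability = 0.999                              # hypothetical single-path figure
dual_path = 1 - (1 - path_availability) ** 2
print(dual_path)                                       # 0.999999 -- six nines
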
Inter-Fabric Routing
Fibre Channel is a link layer (Layer 2) protocol. When two or more Fibre
Channel switches are connected to form a fabric, the switches engage
in a fabric-building process to ensure that there are no duplicate
addresses in the flat network address space. The fabric shortest path
first (FSPF) protocol is used to define optimum paths between the fab-
ric switches. In addition, the switches exchange Simple Name Server
(SNS) data, so that targets on one switch can be identified by initiators
attached to other switches. Zoning is used to enforce segregation of
devices, so that only authorized initiators can access designated tar-
gets. Analogous to bridged Ethernet LANs, a fabric is a subnet with a
single address space, which grows in population as more switches and
devices are added.
At some point, however, a single flat network may encounter problems
with stability, performance, and manageability if the network grows too
large. When a fabric reaches an optimum size, it is time to begin build-
ing a separate fabric instead of pushing a single fabric beyond its
limits. The concept of a manageable unit of SAN is a useful tool for
determining the maximum number of switches and devices that will
have predictable behavior and performance and can be reasonably
maintained in a single fabric.
Enterprise data centers may have multiple large fabrics or SAN conti-
nents. Previously, it was not possible to provide connectivity between
separate SANs without merging SANs into a single fabric via ISLs. With
inter-fabric routing (IFR), it is now possible to share assets among mul-
tiple manageable units of SANs without creating a single unwieldy
fabric. As shown in Figure 6, IFR SAN routers provide both connectivity
and fault isolation among separate SANs. In this example, a server on
SAN A can access a storage array on SAN B via the SAN router. From
the perspective of the server, the storage array is a local resource on
SAN A. The SAN router performs Network Address Translation (NAT) to
proxy the appearance of the storage array and to conform to the
address space of each SAN. Because each SAN is autonomous, fabric
reconfigurations or Registered State Change Notification (RSCN)
broadcasts on one SAN do not impact the others.
Figure 6. Using inter-fabric routing to provide device connectivity
between separate SANs
IFR thus provides the ability to build very large data center storage
infrastructures, the data center fabric, while keeping each fabric a
manageable SAN unit. In combination with Fibre Channel over IP
(FCIP), IFR can be used to scale enterprise-wide storage transport
across multiple geographies to further streamline storage operations
without merging the remote fabrics over WAN networks.
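To make the address translation described above more concrete, the following Python sketch models how a SAN router might present a remote target under a proxy address in the local fabric's space. The class name, address values, and proxy pool are illustrative assumptions, not an actual router implementation.

```python
class IfrRouter:
    """Toy model of inter-fabric routing with address translation (NAT)."""

    def __init__(self):
        # Maps (remote_fabric, real_fcid) -> proxy FCID presented locally.
        self.proxy_table = {}
        self.next_proxy = 0xEF0100  # hypothetical pool of proxy addresses

    def export_device(self, remote_fabric, real_fcid):
        """Assign a local proxy address to a device in another fabric."""
        key = (remote_fabric, real_fcid)
        if key not in self.proxy_table:
            self.proxy_table[key] = self.next_proxy
            self.next_proxy += 1
        return self.proxy_table[key]

    def translate(self, proxy_fcid):
        """Resolve a proxy address back to its real fabric and FCID."""
        for (fabric, real), proxy in self.proxy_table.items():
            if proxy == proxy_fcid:
                return fabric, real
        raise KeyError("no route for proxy address 0x%06X" % proxy_fcid)


router = IfrRouter()
# A storage port on SAN B appears to hosts on SAN A under a proxy address.
proxy = router.export_device("SAN_B", 0x010200)
print("SAN A sees the remote target as 0x%06X" % proxy)
print("Router resolves it to", router.translate(proxy))
```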
Virtual Fabrics
It is also possible to segregate departmental or business unit applica-
tions on a shared SAN infrastructure by dividing the physical fabric into
multiple logical fabrics. Each virtual fabric (VF) behaves as a separate
autonomous fabric with its own SNS and RSCN broadcast domain,
even if the virtual fabric spans multiple fabric switches, as shown in
Figure 7. To isolate frame routing between the virtual fabrics on the
same physical ISL, VF tagging headers are applied to the appropriate
frames as they are issued, and the headers are removed by the switch
before they are sent on to the designated initiator or target. Theoreti-
cally, the VF tagging header would allow for 4,096 virtual fabrics in a
single physical fabric configuration, although in practice only a few are
typically used.
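As a rough sketch of the tagging mechanism, the snippet below prepends a virtual fabric identifier to each frame crossing a shared ISL and strips it again at the egress switch. The 12-bit field size follows from the 4,096 value cited above; the frame layout and function names are simplified assumptions.

```python
VF_ID_BITS = 12                      # 2**12 = 4,096 possible virtual fabrics
MAX_VF_ID = (1 << VF_ID_BITS) - 1

def tag_frame(frame, vf_id):
    """Prepend a minimal tagging header carrying the virtual fabric ID."""
    if not 0 <= vf_id <= MAX_VF_ID:
        raise ValueError("VF_ID must fit in 12 bits")
    return vf_id.to_bytes(2, "big") + frame

def untag_frame(tagged):
    """Strip the tag at the egress switch before delivery to the end device."""
    vf_id = int.from_bytes(tagged[:2], "big")
    return vf_id, tagged[2:]

# Frames from two virtual fabrics share one physical ISL but stay segregated.
isl_traffic = [tag_frame(b"payload-engineering", 10),
               tag_frame(b"payload-sales", 20)]
for tagged in isl_traffic:
    vf, frame = untag_frame(tagged)
    print("deliver", frame, "within virtual fabric", vf)
```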
Virtual fabrics are a means to consolidate SAN assets, while reducing
management complexity to enforce manageable SAN units. In the
example shown in Figure 7, each of the three virtual fabrics could be
administered by a separate department with different storage, secu-
rity, and bill-back policies. Although the total SAN configuration can be
quite large, the division into separately-managed Virtual Fabrics simpli-
fies administration, while leveraging the data center investment in SAN
technology.
Figure 7. Sharing a common SAN infrastructure via virtual fabrics
Additional SAN Design Considerations
Whether you are implementing a SAN for the first time or expanding an
existing SAN infrastructure, the one unavoidable constant in data stor-
age is growth. The steady growth in storage capacity needs, in
additional servers and applications and in data protection require-
ments, is so predictable that anticipated growth must be an integral
part of any SAN design and investment. A current requirement for 50
attached servers and 4 storage arrays, for example, could be satisfied
with two 32-port switches (four, if redundant pathing is required) or a
256-port director chassis populated with two 32-port blades (four for
redundancy).
Which solution is better depends on the projected growth in both stor-
age capacity and server attachment, as well as availability needs.
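The port arithmetic behind such a decision can be sketched quickly. Only the starting requirement of 50 servers and 4 arrays comes from the example above; the growth rate, planning horizon, and dual-path assumption are illustrative.

```python
def ports_needed(servers, arrays, dual_pathed=True,
                 growth_per_year=0.25, years=3):
    """Estimate fabric ports required now and after projected growth."""
    per_fabric = servers + arrays            # one port per device per fabric
    fabrics = 2 if dual_pathed else 1
    now = per_fabric * fabrics
    future = int(now * (1 + growth_per_year) ** years)
    return now, future

now, future = ports_needed(servers=50, arrays=4)
print(f"Ports required today: {now}")                      # 108 across A and B
print(f"Ports required in 3 years at 25% growth: {future}")
```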
Unfortunately, some customers have inherited complex meshed SAN
topologies due to the spontaneous acquisition of switches to satisfy
growing port requirements. At some point, fabric consolidation may be
required to simplify cabling and management and to provide stability
for storage operations. Without a solid foundation of a well-designed
managed unit of SAN, higher-level data protection solutions are always
at risk.
A managed unit of SAN can also be characterized by its intended func-
tionality; and functionality, in turn, can drive a specific SAN topology. A
high-availability SAN, for example, requires redundancy in switch ele-
ments and pathing, as well as management tools to monitor and
enforce continuous operation. However, a SAN designed for second-
tier applications may not justify full redundancy and be adequately
supported on a more streamlined topology. In addition, a SAN
designed specifically for tape backup has very different requirements
compared to a production SAN. Tape is characterized by large block,
bandwidth-intensive transactions, while production disk access is typi-
cally distinguished by small block and I/O-intensive transactions.
Because tape operations consume bandwidth for extended periods of
time and are sensitive to fabric events, customers can implement two
separate SANs or leverage Virtual Fabrics to isolate production disk
access from backup operations. As a separate tape SAN, a flat SAN
topology that avoids potential ISL oversubscription is recommended.
An optimized SAN topology can also be affected by the server technol-
ogy used to host applications. Blade servers and blade SAN switches,
in particular, can rapidly consume switch addresses, or Domain IDs,
and limit the total number of switches
allowable in a SAN unit. A new standard for N_Port ID Virtualization
(NPIV) has been created to address this problem. An NPIV-enabled
gateway presents logical hosts to the SAN and thus eliminates the
addition of another switch element, Domain ID assignment, and
interoperability or switch management issues. Brocade Access Gate-
way, for example, leverages NPIV to bring blade servers into the SAN
without requiring administrative overhead to monitor Domain ID usage
and potential interoperability conflicts. As long as the edge SAN
switches are NPIV aware, larger populations of blade servers can be
accommodated without limiting the scalability of the SAN topology.
Highly Available Storage
Data protection solutions are dependent on a stable underlying SAN
transport that is both predictable and manageable. The most carefully
crafted SAN, however, cannot ensure the availability and integrity of
data if storage targets are vulnerable to data loss or corruption. For
enterprise-class applications in particular, storage systems must be
designed to provide performance, capacity, data integrity, and high
availability. Therefore, storage array architectures can include resil-
iency features to maximize availability of the array itself and to protect
against data loss due to failed disk components.
Local Mirroring (RAID 1)
Spinning disk technology is mechanical and will eventually wear out
and fail. As one of the first storage solutions to guard against disk fail-
ure and data loss, simple mirroring of data between two different disks
or disk sets is easy to deploy, but it doubles the cost per data block
stored. Mirroring is also known as "RAID 1" and was one of the first
data protection solutions at the disk level. As shown in Figure 8, disk
mirroring can be implemented within a single array enclosure. In the
top example, data is written once by the server to the storage array.
The array controller assumes responsibility for mirroring and so writes
the data to both primary and secondary mirror disk sets. If, however,
data corruption occurs in the controller logic, the data integrity of the
primary and/or mirror may be compromised.
In the bottom example in Figure 8, the volume manager running on the
server is responsible for mirroring and writes the data twice: once to
the primary and once to the secondary mirror. In both examples, if a
disk failure occurs on the primary disk set, either the volume manager
or the array controller logic must execute a failover from primary to the
mirror to redirect I/O and maintain continuity of data operations.
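A minimal sketch of the server-based approach is shown below: the volume manager issues every write twice and falls back to the mirror if the primary is lost. The block-device interface is a deliberate simplification for illustration.

```python
class MirroredVolume:
    """Simplified RAID 1 volume: every write goes to both disk sets."""

    def __init__(self, primary, mirror):
        self.primary = primary          # dicts stand in for toy block devices
        self.mirror = mirror
        self.primary_failed = False

    def write(self, lba, data):
        if not self.primary_failed:
            self.primary[lba] = data
        self.mirror[lba] = data         # second copy keeps the mirror in sync

    def read(self, lba):
        # Fail over to the mirror if the primary disk set is lost.
        source = self.mirror if self.primary_failed else self.primary
        return source[lba]

vol = MirroredVolume(primary={}, mirror={})
vol.write(0, b"payroll record")
vol.primary_failed = True               # simulate loss of the primary disk set
print(vol.read(0))                      # data is still served from the mirror
```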
Figure 8. Array-based (top) and server-based (bottom) disk mirroring
Although simple mirroring accomplishes the goal of protecting data
against disk failure, additional utilities are required to reconstitute the
primary disk set and re-establish the mirror operation. Once the failed
primary has been serviced, for example, the data on the primary must
be rebuilt and synchronized to the new production mirror. For array-
based mirroring, this is typically performed as an automatic back-
ground operation and once synchronization has been achieved, the
primary is reinstated. This automated process, however, can have
unintended consequences. In one customer case study, a service call
to replace a drive on a mirror inadvertently resulted in a drive on the
primary being swapped. Instead of failing over to the mirror image, the
mirror was rebuilt to the now-corrupted primary image. It is no great
mystery that tape backup endures as a data protection insurance pol-
icy against potential array failures.
The primary drawback to mirroring within an array is that the entire
array is subject to failure or outage. Consequently, data centers may
physically isolate primary and mirror arrays, placing them in separate
areas with separate power sources.
Figure 9. Array-based mirroring between separate enclosures
As illustrated in Figure 9, separating production and mirror arrays pro-
vides protection against loss of disks, the array controller, and the
array enclosure. The mirroring function can be provided by the array
controller or the server. For switches that implement application ser-
vices, the mirroring intelligence may be provided by the fabric itself. In
some vendor offerings, the mirroring operation can be bidirectional so
that two storage arrays can mutually act as mirrors for each other. This
helps to reduce the overall cost and avoids dedicating an entire stor-
age array as a mirror.
As a data protection element, mirroring offers the advantage of near-
zero recovery time and immediate recovery point. Given that storage
systems are the most expensive components of a storage network,
however, mirroring comes at a price. In addition, unless mirroring is
combined with data striping across disks, it may lack the performance
required for high volume data center applications.
Other RAID Levels
In addition to mirroring, data protection at the array can be enforced by
alternate RAID algorithms. RAID 0+1, for example, combines data
striping (RAID 0) with mirroring to enhance performance and availabil-
ity. In RAID 0+1, data is first striped across multiple disks and those
disks in turn are mirrored to a second set of disks. RAID 0+1 boosts
performance, but it retains the additional cost of redundant arrays
characteristic of RAID 1. The inverse of RAID 0+1 is RAID 10, in which
case the mirroring occurs first as a virtual disk before striping is
executed.
Other RAID techniques attempt to integrate the performance advan-
tage of data striping with alternative means to reconstruct data in the
event of disk failure. The most commonly deployed is RAID 5, which
stripes data across a disk set and uses block parity instead of mirror-
ing to rebuild data. As data blocks are striped across multiple disks, a
parity block is calculated using an eXclusive OR (XOR) algorithm and
written to disk. If a disk fails, the data can be reconstructed on a new
disk from the parity blocks. In RAID 4, the parity blocks are written to a
single dedicated disk. This creates some vulnerability if the parity disk
itself fails and incurs a write penalty, since every write must be parity
processed on a single drive. RAID 5 reduces the write penalty by plac-
ing the parity information across multiple disks in the RAID set. As the
parity data is generated, the array controller does not have to wait for
the availability of a dedicated disk. As shown in Figure 10, RAID 5
arrays typically house spare disks that can automatically be brought
online and reconstructed in the event of disk failure. In this example, if
the third disk in the set fails, block C can be recreated from the
surviving blocks A, B, and D together with the parity block on the fifth
disk (P abcd), and the lost parity block (P efgh) for blocks E, F, G, and
H can be recalculated from those data blocks.
Figure 10. RAID 5 with distributed parity blocks
The primary benefit of RAID 5 is its ability to protect block data while
minimizing the number of disks required to guard against failure. On
the other hand, the write penalty generated by parity calculation needs
hardware acceleration to improve performance and avoid an adverse
impact to upper-layer applications.
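The XOR parity arithmetic itself is simple enough to demonstrate directly. The short sketch below rebuilds a lost data block from the surviving blocks in its stripe plus the parity block; the four-block stripe and block contents are assumptions chosen for illustration.

```python
def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks together."""
    result = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            result[i] ^= byte
    return bytes(result)

# A stripe of four data blocks; the parity block is their XOR.
stripe = [b"AAAA", b"BBBB", b"CCCC", b"DDDD"]
parity = xor_blocks(stripe)

# The disk holding "CCCC" fails. XOR of the survivors plus parity
# reproduces it, because identical terms cancel (X ^ X = 0).
survivors = [stripe[0], stripe[1], stripe[3]]
rebuilt = xor_blocks(survivors + [parity])
assert rebuilt == b"CCCC"
print("rebuilt block:", rebuilt)
```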
With parity distributed across multiple disks, RAID 5 provides protec-
tion against a single disk failure. RAID 6 offers additional protection by
duplicating the parity blocks across different disks. With multiple cop-
ies of the parity blocks distributed over more disks, RAID 6 can
withstand the failure of two disks and still rebuild disk images from
spares.
In addition to standard RAID types, storage vendors may offer propri-
etary RAID options to optimize performance and reliability. Because
the RAID function occurs in the array enclosure, the fact that the par-
ticular RAID level is proprietary or open systems has no practical
interoperability implication. The only requirement is that the disks in a
RAID set are of the same technology (Fibre Channel, SATA, or SAS) and
have equivalent capacity and performance characteristics.
RAID as a Form of Storage Virtualization
Just as a volume manager on a server presents a logical view of stor-
age capacity that can exist on separate physical disks, a RAID
controller hides the complexity of multiple disks and the back-end
RAID execution. Binding to a LUN on a RAID array, a server simply sees
a single disk resource for reading and writing data. This abstraction
from the physical to logical views places an immense responsibility on
the RAID controller logic for maintaining the integrity of data on the
RAID set(s) and automatically recovering from back-end faults.
Today's storage virtualization takes the logical abstraction of physical
assets to a new level. Instead of simply masking the appearance of
physical disks in an enclosure, storage virtualization masks the
appearance of entire RAID arrays. Creating a single logical pool of sep-
arate physical storage systems facilitates capacity utilization and
dynamic assignment of storage to upper-layer applications. As with
basic RAID, however, this places significant responsibility on the virtu-
alization engine to map the logical location of data to its actual
physical distribution across multiple arrays. Every successive level of
abstraction that simplifies and automates storage administration must
be accompanied by a robust data protection mechanism working
behind the scenes.
Alternate Pathing and Failover
High-availability storage must provide both internal mechanisms for
data redundancy and data integrity via RAID, in addition to continuous
accessibility by external clients. This requires the appropriate SAN
design as outlined in Storage-Centric vs. Network-Centric SAN Archi-
tectures on page 4 to build dual pathing through the fabric and multi-
port connectivity on the array for each server. As illustrated in
Figure 11, alternate pathing can be configured as Fabrics A and B,
which provide each server with a primary and secondary path to stor-
age assets.
Figure 11. Providing alternate paths from servers to storage
In this example, the failure of a storage port on the array or any link or
port through Fabric A would still allow access through Fabric B. With
both sides active in normal operation, though, each individual server
sees two separate images of the same storage target: one from the A
side and one from the B side. A mechanism is therefore required to
reconcile this side effect of dual pathing and present a single image of
storage to the initiator. Typically, this reconciliation is performed by a
device driver installed on the host. The driver may include the addi-
tional ability to load balance between alternate paths to maximize
utilization of all fabric connectivity.
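The reconciliation and failover behavior of such a driver can be sketched as follows. The path names and the simple round-robin policy are illustrative assumptions; production multipath drivers implement this logic at the SCSI device layer.

```python
import itertools

class MultipathDevice:
    """Presents one logical LUN over several fabric paths, with failover."""

    def __init__(self, paths):
        self.paths = {p: "online" for p in paths}
        self._rr = itertools.cycle(paths)    # simple round-robin load balancing

    def fail_path(self, path):
        self.paths[path] = "failed"

    def next_path(self):
        # Skip failed paths; raise only if no path to storage remains.
        for _ in range(len(self.paths)):
            candidate = next(self._rr)
            if self.paths[candidate] == "online":
                return candidate
        raise IOError("all paths to the LUN have failed")

lun = MultipathDevice(["fabric_A/port_3", "fabric_B/port_7"])
print(lun.next_path())                   # I/O balanced across both fabrics
lun.fail_path("fabric_A/port_3")
print(lun.next_path())                   # all I/O now flows through Fabric B
```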
Additional High Availability Storage Features
High-end storage systems are further fortified with fault-tolerant fea-
tures that enable 99.999 percent availability. Redundant
uninterruptible power supplies, redundant fans, hot-swappable disk
drives, redundant RAID controllers, and non-disruptive microcode
updates guard against loss of data access due to any individual com-
ponent failure. These high-availability features add to the complexity
and total cost of the array, of course, and the selection of storage ele-
ments should therefore be balanced against the value of the data
being stored. The reality is that not all data merits first-class handling
throughout its lifetime. Designing a SAN infrastructure with multiple
classes of storage containers provides more flexibility in migrating
data from one storage asset to another, and thus aligning the value of
storage to the current business value of data.
Storage and Fabric Consolidation
For many data centers, the steady growth of data is reflected in the
spontaneous acquisition of more servers, switches, and storage
arrays. As this inventory grows, it becomes increasingly difficult to
manage connectivity and to provide safeguards for data access and
data integrity. In addition, the proliferation of storage arrays inevitably
leads to under-utilization of assets for some applications and over-utili-
zation for others. To reduce the number of storage components and
maximize utilization of assets, it may be necessary to re-architect the
SAN on the basis of larger but fewer components.
Storage and fabric consolidation are a means to streamline storage
administration and achieve a higher return on investment on SAN
infrastructure. Previously, consolidation strategies were limited to
replacing dispersed assets with larger centralized ones. Today, the
concentration of resources can be further enhanced by new technolo-
gies for virtualizing the fabric (discussed in Virtual Fabrics on
page 13) and virtualizing storage capacity.
As shown in Figure 12, a SAN that is the result of a reactive addition of
switch and storage elements to accommodate growth quickly
becomes unmanageable. More switches means more units to manage,
more ISLs, complex cabling, longer convergence times, and greater
vulnerability to fabric instability. While initially it may seem more eco-
nomical to simply connect an additional switch to support more ports,
in the long run complexity incurs its own costs. Collapsing the SAN
infrastructure into one or more directors or backbones simplifies man-
agement and the cabling plant and promotes stability and
predictability of operation.
Figure 12. Simplifying the fabric and storage management via
consolidation
Likewise, reactively adding storage arrays to accommodate increasing
capacity requirements often leads to inefficient utilization of storage
and increased management overhead. For the small SAN configura-
tion illustrated here, storage consolidation requires an investment in a
larger centralized storage system and data migration from dispersed
assets to the consolidated array. For large data center SANs, servicing
thousands of devices, the next step in storage consolidation may be to
virtualize designated storage systems to optimize capacity utilization
and facilitate data lifecycle management via different classes of virtu-
alized storage.
Storage and fabric consolidation projects can now take advantage of
enhanced features that streamline connectivity. Large storage arrays,
for example, not only provide high availability and capacity but more
ports for the SAN interconnect. Large arrays typically provide 128 to
256 ports at 2, 4 or 8 Gbit/sec Fibre Channel speeds. Brocade's intro-
duction of 8 Gbit/sec support enables a much higher fan-in ratio of
clients per storage port. In addition, Brocade directors provide 8 Gbit/
sec ISLs to both increase bandwidth for switch-to-switch traffic and
simplify cabling.
Storage consolidation also includes technologies to centralize data
geographically dispersed in remote sites and offices. As will be dis-
cussed in more detail in Chapter 7, centralizing data in the data center
is a prerequisite for safeguarding all corporate data assets, meeting
enterprise-wide regulatory compliance goals and reducing the cost of
IT support for remote locations. Implementing remote office data con-
solidation has been contingent on the arrival of new technologies for
accelerating data transactions over fairly low-speed WANs and innova-
tive means to reduce protocol overhead and to efficiently monitor data
changes.
SAN Security
Security for storage area networks incorporates three primary aspects:
• Secure data transport
• Secure data placement
• Secure management interfaces
Securing the data transport requires multiple levels of protection,
including authorization of access, segregation of storage traffic
streams, maintaining the integrity of network (fabric) connectivity, and
encryption/decryption of the data in flight across the SAN.
Securing data placement must ensure that application data is written
to the appropriate storage area (LUN) in a specified storage system,
that data copies are maintained via mirroring or point in time copy,
and that sensitive data is encrypted as it is written to disk or tape.
Securing the management interface must include means to validate
authorized access to SAN hardware, such as SAN switches and stor-
age systems, to prevent an intruder from reconfiguring network
connections.
These three components are interdependent and a failure to secure
one may render the others inoperable. Safeguards can be imple-
mented for data transport and placement, for example, but an
exposed management interface can allow an intruder to redirect the
storage transport or deny access to data assets.
Securing the SAN Data Transport
The fact that the majority of SANs are based on Fibre Channel instead
of TCP/IP has created a false sense of security for data center storage
networks. Hacking Fibre Channel data streams would require very
expensive equipment and a high degree of expertise. In addition, the
physical security of data center environments is often assumed to pro-
vide sufficient protection against malfeasance. As SAN technology has
become ubiquitous in data centers, however, no one should assume
that the SANs are inherently secure. Simply reconfiguring a server so it
now has access to designated storage assets could enable unautho-
rized access to valuable corporate information.
Although Fibre Channel has relied on the physical separation of com-
munication networks and storage networks to provide a rudimentary
security barrier, modern business practices require a much higher
assurance of data defense. Physical isolation alone does not provide
security against internal attacks or inadvertent configuration errors.
The storage industry has therefore responded with a spectrum of
security capabilities to provide a high degree of data protection, while
still maintaining the performance required for storage applications.
Zoning
At a low level, zoning of resources in the SAN provides authorized
access between servers and storage ports through the Fibre Channel
network or fabric as illustrated in Figure 13. Zoning can be port based,
restricting access by authorizing only designated Fibre Channel switch
ports and attached devices to communicate to each other. Alternately,
zoning can be based on a 64-bit Fibre Channel World Wide Name
(WWN). Since each Fibre Channel device has a unique WWN, it is pos-
sible to authorize connections based on the unique identity of each
device.
Figure 13. Establishing zones between groups of initiators and targets
to segregate traffic
Port-based zoning is fairly secure, since it cannot be spoofed by manip-
ulating frame headers. If a device is moved from port to port, however,
the zone stays with the port, not the device. This makes hard or port-
based zoning more difficult to manage as adds, moves, and changes
are made to the fabric. Soft zoning based on WWN provides the flexi-
bility to have zones follow the device itself, but can be spoofed if
someone inserts a valid WWN into a frame to redirect storage data.
Zoning alone provides no means to monitor these sorts of intrusions
and has no integrated data encryption support.
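As a rough illustration of how WWN-based (soft) zoning authorizes connections, the sketch below checks whether an initiator and a target share an active zone before a session is allowed. The zone names and WWN values are invented for the example and do not reflect any particular switch configuration syntax.

```python
# Hypothetical zone set: each zone lists the WWNs allowed to communicate.
zone_set = {
    "payroll_zone": {"10:00:00:05:1e:aa:bb:01",   # payroll server HBA
                     "50:06:01:60:3b:e0:12:34"},  # payroll storage port
    "backup_zone":  {"10:00:00:05:1e:aa:bb:02",
                     "50:06:01:60:3b:e0:56:78"},
}

def access_allowed(initiator_wwn, target_wwn):
    """True only if some zone contains both the initiator and the target."""
    return any(initiator_wwn in members and target_wwn in members
               for members in zone_set.values())

print(access_allowed("10:00:00:05:1e:aa:bb:01",
                     "50:06:01:60:3b:e0:12:34"))   # True: same zone
print(access_allowed("10:00:00:05:1e:aa:bb:01",
                     "50:06:01:60:3b:e0:56:78"))   # False: blocked
```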
Port Binding
Port binding establishes a fixed connection between a switch port and
the attached server or storage device. With port binding, only desig-
nated devices are allowed on specified ports and a substitution of
devices on a port results in port blocking of communications from the
substituted end device, as shown in Figure 14.
Figure 14. Creating secure device connectivity via port binding
Port binding thus locks in the authorized connection between the fab-
ric and the device, ensuring that the link between the device and the
fabric is secure. This mechanism prevents both deliberate and inad-
vertent changes in connectivity that might allow an unauthorized
server or workstation to gain access to storage data.
Fabric Binding
At a higher level, it may also be desirable to secure connections
between multiple fabric switches. Fibre Channel fabric switches are
designed to automatically extend the fabric as new switches are intro-
duced. When two fabric switches are connected via ISLs, they
automatically exchange fabric-building protocols, zoning information,
and routing tables. While this is acceptable in some environments, it
creates a security concern. Someone wishing to probe the fabric could
simply attach an additional switch and use it to gain entrance into the
SAN.
Figure 15. Securing the fabric with fabric ISL binding
As shown in Figure 15, fabric binding establishes fixed relationships
between multiple switches in the network. Only authorized ISLs are
allowed to communicate as a single fabric and any arbitrary attempts
to create new ISLs to new switches are blocked. Fabric binding
ensures that established switch-to-switch connections are locked into
place and that any changes to the SAN can occur only through secure
administrative control.
Use of Inter-Fabric Routing to Secure the Storage Network
An additional layer for securing storage operations is provided by Inter-
Fabric Routing technology. As discussed in Inter-Fabric Routing on
page 11, Inter-Fabric Routing can be applied in the data center to build
large, stable storage networks, or used for storage over distance appli-
cations such as disaster recovery. In addition, Inter-Fabric Routing is a
means to block denial of service attacks if someone were to deliber-
ately initiate faults to cause disruptive fabric reconfigurations.
SAN Routing technology prevents SAN-wide disruptions and reconfigu-
rations by providing fault isolation between fabric switches. Acting as a
router between SAN segments, the SAN router passes only authorized
storage traffic between each attached SAN. Each SAN segment main-
tains its autonomy from the others, and a disruption in one segment is
not allowed to propagate to other switches. Faults are therefore con-
tained at the segment level, and other fabric switches continue normal
operations. Denial of service attempts are restricted and not allowed
to impact the entire storage network.
SAN Routing products may support multi-vendor interoperability and
be extensible over any distance. For mission-critical data applications
such as disaster recovery, SAN Routing ensures that the underlying
transport aligns with the customer's requirement for continuous, non-
disruptive storage operation.
Virtual Fabrics
Large data centers often support a wide variety of storage applications
for different business units such as manufacturing, sales, marketing,
engineering, and human resources. While it is possible to deploy a
separate physical fabric for each business unit, this solution adds sig-
nificant costs, reduces storage utilization and adds ongoing
administrative overhead. Storage administrators may therefore
attempt to reduce costs by running multiple storage applications
across a larger unified SAN.
In order to segregate storage traffic over a single large fabric and pre-
vent, for example, sales applications from disrupting engineering
applications, some means is needed to isolate the fabric resources
supporting each application. For Fibre Channel SANs, this functionality
is provided by virtual fabric protocols. Frames for a specific application
are tagged with identifiers that enable that application data to traverse
its own path through the fabric. Consequently a large SAN switch with
hundreds of ports can host multiple virtual fabrics (or virtual SANs).
Similar to inter-fabric routing, disruptions or broadcast storms in one
virtual fabric are not allowed to propagate to other virtual fabrics.
Security for IP SAN Transport via IEEE Standards
For iSCSI and other IP-based storage protocols, conventional Ethernet
standards can be implemented to safeguard storage data transport.
IEEE 802.1Q virtual LAN (VLAN) tagging, for example, can be used to
create over 4,000 virtual LANs to separate traffic flows and ensure
that only members of the same VLAN can communicate. Like virtual
fabrics in Fibre Channel, this mechanism enables multiple storage
applications to share the same infrastructure while gaining the protec-
tion of segregated data streams. Access control lists (ACLs) commonly
supported in gigabit Ethernet switches and IP routers can be used to
restrict access to only designated network devices.
IPSec for SAN Transport
IP standards also provide a range of security features collectively
known as IPSec (IP security) standards. IPSec includes both authenti-
cation and data encryption standards, and IPSec functionality is
currently available from a community of IP network and security
vendors.
For IP storage data in flight, data encryption can be implemented
through conventional Data Encryption Standard (DES) or Advanced
Encryption Standard (AES). DES uses a 56-bit key, allowing for as many
as 72 quadrillion possible keys that could be applied to an IP data-
gram. The triple-DES algorithm passes the data payload through three
DES keys for even more thorough encryption. AES provides richer
encryption capability through the use of encryption keys of 128 to 256
bits.
IPSec authentication and encryption technologies are integrated into
the iSCSI protocol and can be used in conjunction with storage over
distance applications, such as disaster recovery. Use of FCIP for stor-
age extension over untrusted network WAN segments mandates data
encryption if data security is required.
Although DES and AES were originally developed for IP networking, the
same key-based encryption technologies can be applied to payload
encryption of native Fibre Channel frames in SANs. With some vendor
offerings, data may only be encrypted as it traverses the fabric and
decrypted before being written to disk or tape. In other products, the
data can remain in an encrypted state as it is written to disk and
decrypted only as it is retrieved by a server or workstation.
Securing Storage Data Placement
In addition to securing storage data as it crosses the fabric between
initiator (server) and target (storage array), it may also be necessary to
secure storage data at rest. Safeguarding data at the storage system
has two components. First, the application data must be written to its
specified storage location in a storage array. The authorized relation-
ship (binding) between a server application and its designated storage
location ensures that an unauthorized server cannot inadvertently or
deliberately access the same storage data. Second, additional data
security can be provided by payload encryption as the data is written to
disk or tape. Unauthorized access to or removal of disk drives or tape
cartridges would thereby render the data unintelligible without the
appropriate encryption keys.
LUN Masking
LUN masking restricts access to storage resources by making visible to
a server only those storage locations or logical units (LUNs) behind a
zoned storage port that a server is authorized to access. Both fabric
zoning and LUN masking are needed to fully enforce access controls.
Zoning defines server to storage port access control while LUN mask-
ing defines which storage LUNs behind the storage port are available
to the server and its applications. If a large storage array, for example,
supports 10 LUNs, a server may see only 1 available LUN. The other 9
have been masked from view and are typically assigned to different
servers.
Figure 16. Restricting visibility of storage Logical Units via LUN
masking
LUN masking provides access control between storage assets and
authorized servers, preventing a server from inadvertently or deliber-
ately attaching to unauthorized resources, as shown in Figure 16.
Without LUN masking, a Windows server, for example, could query the
fabric for available resources and attach to storage LUNs previously
assigned to a Solaris server. Since Windows writes a disruptive signa-
ture to its attached LUNs, this would render the Solaris data
unreadable. Although LUN masking can be implemented on an HBA at
the host, it is typically performed on the storage array after initial
configuration.
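Conceptually, the masking relationship is a table on the array controller keyed by the initiator's identity. The sketch below is an illustration of that idea only, not a vendor configuration format.

```python
# Hypothetical masking table kept by the array: initiator WWN -> visible LUNs.
lun_masking = {
    "10:00:00:05:1e:aa:bb:01": {0},          # Windows server sees only LUN 0
    "10:00:00:05:1e:aa:bb:02": {1, 2, 3},    # Solaris server sees LUNs 1-3
}

def report_luns(initiator_wwn, all_luns=range(10)):
    """Return only the LUNs this initiator is authorized to see."""
    visible = lun_masking.get(initiator_wwn, set())
    return sorted(lun for lun in all_luns if lun in visible)

print(report_luns("10:00:00:05:1e:aa:bb:01"))   # [0]; the other 9 are masked
print(report_luns("10:00:00:05:1e:aa:bb:02"))   # [1, 2, 3]
```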
iSCSI LUN Mapping
iSCSI LUN mapping is an additional technique to extend control of stor-
age assets and create authorized connectivity across IP SANs. With
LUN mapping, the administrator can reassign LUNs to meet the stor-
age requirements of specific servers. A LUN 5 on the disk array, for
example, can be represented as a LUN 0 to an iSCSI server, enabling it
to boot from disk under tighter administrative control. Centralized
management and iSCSI LUN mapping can ensure that servers load
only their authorized system parameters and applications, and in com-
bination with LUN masking, attach only to designated storage
resources.
Internet Simple Name Server (iSNS)
The Internet Storage Name Service (iSNS) is an IETF-approved protocol
for device discovery and management in iSCSI networks. iSNS com-
bines features from Fibre Channel SNS with IP Domain Name Server
(DNS) capability. As an integral part of the protocol definition, iSNS
includes support for public/private key exchange, so that storage
transactions in IP SANs can be authenticated and payload secured.
iSNS has been endorsed by Microsoft and other vendors as the man-
agement solution of choice for iSCSI and IP storage environments.
Encryption of Data at Rest
Recent publicity on the theft or loss of tape backup cartridge sets and
disk drives in large corporations highlights the inherent vulnerability of
removable media. Retrieving storage data on tape or disk may require
expensive equipment, but the proliferation of SAN technology has low-
ered the threshold for this type of data theft. The highest level of
security for storage data at rest is therefore provided by encryption of
data as it is written to disk or tape. Previously, data encryption in the
SAN imposed a significant performance penalty. With current SAN
security technology, however, encrypting and decrypting data as it
moves to and from storage devices can be achieved with minimal
impact on production. As in any encryption solution, management of
encryption keys places an additional obligation on storage
administration.
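As a hedged illustration of encrypting blocks on their way to media, the sketch below uses AES in GCM mode from the third-party Python cryptography package; in a real deployment the key would live in a key management system rather than in memory next to the data, which is precisely the administrative obligation noted above.

```python
import os
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

key = AESGCM.generate_key(bit_length=256)   # 256-bit AES key
cipher = AESGCM(key)

def write_encrypted(block):
    """Encrypt a block before it is committed to disk or tape."""
    nonce = os.urandom(12)                  # unique nonce per block
    return nonce + cipher.encrypt(nonce, block, None)

def read_decrypted(stored):
    """Decrypt a block as it is retrieved by an authorized host."""
    nonce, ciphertext = stored[:12], stored[12:]
    return cipher.decrypt(nonce, ciphertext, None)

on_media = write_encrypted(b"customer records")
print(read_decrypted(on_media))   # plaintext recoverable only with the key
```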
Securing the Management Interface
Management of a SAN infrastructure is typically performed out of band
via Ethernet and TCP/IP. A Fibre Channel fabric switch, for example,
provides Fibre Channel ports for attachment to servers, storage sys-
tems, and other fabric switches (via ISLs), while also providing an
Ethernet port for configuration and diagnostics of the switch itself.
Unauthorized access to the management port of a fabric switch is
therefore an extreme liability. Deliberate or inadvertent configuration
changes to a switch can result in unauthorized access to storage
assets or loss of access altogether (also known as denial of service).
In some implementations, fabric management is performed in band,
over the Fibre Channel infrastructure. This approach provides addi-
tional protection by making it more difficult for an intruder to tap into
the management data stream. However, if a Fibre Channel connection
is down, both production data and management data are blocked. For
large enterprises, redundant pathing through the fabric is used to
ensure that both production and management data have alternate
routes if a link failure occurs.
Whether in band or out of band, ultimately an administrative interface
must be provided at a console. As in mainstream data communica-
tions, it is therefore critical that the operator at that console has
authorization to monitor fabric conditions or make configuration
changes. Standard management security mechanisms, such as CHAP
(Challenge-Handshake Authentication Protocol), SSL (Secure Sockets
Layer), SSH (Secure Shell), and RADIUS (Remote Authentication Dial-In
User Service) are typically used to enforce access authorization to the
fabric and attached storage systems.
Going to the Next Level: The Brocade Data Center
Fabric
The foundation elements of resilient storage systems and robust and
secure fabrics are prerequisites for implementing a coherent data pro-
tection strategy. The next phase in SAN evolution, however, must
extend the coverage of data to the upper-layer applications that gener-
ate and process data. This new application-centric approach is
embodied by the Brocade data center fabric (DCF) architecture and its
supporting products, including the Brocade DCX Backbone, launched
in January 2008.
The unique application focus of the Brocade DCF design aligns the
entire storage infrastructure to the more dynamic requirements of
today's business operations. For both server platforms and storage,
rigid physical connections between applications and data are being
replaced with more flexible virtual relationships and shared resource
pools. Enhanced data mobility, protection, and security are now key to
preserving data integrity and fulfilling regulatory requirements. By
combining enhanced connectivity with advanced storage and applica-
tion-aware services, the Brocade DCF is centrally positioned to
coordinate new capabilities in both server and storage platforms and
maximize data center productivity.
To minimize disruption and cost, the Brocade DCF architecture, shown
at a high level in Figure 17, is designed to interoperate with existing
storage and fabric elements while providing enhanced services where
needed. The Brocade DCX Backbone, for example, integrates with
existing Brocade and third-party fabrics and extends their value by pro-
viding Adaptive Networking services, multi-protocol connectivity, data
migration services, storage virtualization, data encryption for data at
rest, and other advanced services throughout the data center fabric.
To simplify administration, these advanced services can be automated
via policy-based rules that align to upper-layer application
requirements.
Figure 17. The Brocade DCF provides the infrastructure to optimize
the performance and availability of upper-layer business applications
Chapter 2: Backup Strategies
Tape backup for data centers has been one of the original drivers for
the creation of SAN technology. Before the advent of SANs, backing up
open systems storage data over 100 Mbit/sec Ethernet LANs was sim-
ply too slow and did not allow sufficient time to safeguard all data
assets. As the first gigabit network transport, Fibre Channel provided
the bandwidth and an alternate storage network infrastructure to off-
load backup operations from the LAN. Subsequently, the development
of SCSI Extended Copy (third-party copy or TPC) technology also freed
individual servers from backup operations and enabled direct SAN-
based backup from disk to tape.
Although obituaries for the demise of tape have been written repeat-
edly over the past few years, tape endures as the principal mainstay of
data protection. Unlike spinning disk media, once data is committed to
tape it can be transported offsite and vaulted, and has a reasonable
shelf life. Even data centers that use newer disk-to-disk tape emula-
tion for primary backup also often implement a final backup to tape.
Conventional Local Backup
Tape backup operations and best practices date back to mainframe
and midrange computing environments. Backup processes are thus
well defined for both proprietary and open systems applications, and
technology innovation has largely focused on higher performance and
greater storage density of tape cartridge formats and robotics. Even
second-generation initiatives, such as virtual tape libraries (VTLs), rely
on established practices honed over the years by conventional tape
backup operations.
Tape backup routines are shaped by an organization's recovery point
objective (RPO) and recovery time objective (RTO). The recovery point,
or the amount of data loss that can be tolerated in the event of data
corruption or outage, determines how frequently backups are exe-
cuted. The recovery time objective is determined by how the backups
are performed (incremental, differential, or full) and the severity of the
outage itself. A minor event, for example, may have a shorter recovery
time if an incremental backup can be used to restore data. A bare
metal restore (for example, required when an entire storage array
fails), by contrast, may have a much longer recovery time, since both
full and incremental backups must be restored to rebuild the most
recent data state.
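A back-of-the-envelope way to reason about the two objectives is sketched below. The change rate, restore throughput, tape mount overhead, and backup interval are all illustrative assumptions.

```python
def worst_case_rpo_gb(backup_interval_hours, change_rate_gb_per_hour):
    """Data at risk between backups: the effective recovery point exposure."""
    return backup_interval_hours * change_rate_gb_per_hour

def estimated_rto_hours(data_to_restore_gb, restore_rate_gb_per_hour,
                        tape_mounts=1, mount_minutes=3):
    """Restore time from the data volume plus tape handling overhead."""
    return (data_to_restore_gb / restore_rate_gb_per_hour
            + tape_mounts * mount_minutes / 60)

# Nightly incrementals on a database changing 20 GB per hour:
print(f"Exposure between backups: {worst_case_rpo_gb(24, 20)} GB")

# Bare metal restore of 4 TB at 400 GB/hour from 12 cartridges:
print(f"Estimated restore time: "
      f"{estimated_rto_hours(4000, 400, tape_mounts=12):.1f} hours")
```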
For many companies today the RPO for mission-critical applications is
at or near zero. The loss of any data transaction is unacceptable. For
tape backup schedules that rely on daily incremental backups, then,
additional utilities such as snapshots or continuous data protection
are required to protect against data loss that may occur between incre-
mental backups. Not all data is essential for a company's survival,
however, and the RPO can vary from one application to another. Peri-
odic tape backup on a daily basis is therefore the lowest common
denominator for safeguarding all data assets, while more advanced
options should be implemented selectively for the highest-value data.
In addition to RPO and RTO criteria, tape backup operations are
bounded by the dimensions of the backup window, or the allowable
time to complete backups for all servers. Typically, applications must
be quiesced so that files or records can be closed and in a static state
for backup. For enterprises with global or other 24/7 operations, however,
there may be no opportunity to quiesce applications and thus no
backup window at all. Although backup software can be used for copy-
ing open files, the files themselves may change content as the backup
occurs.
Conventional tape backup architectures for shared open systems envi-
ronments are typically LAN-based, LAN-free (SAN-based), or server-free
(SAN-based with Extended Copy). Although LAN-based backup configu-
rations are still common for small and medium-sized businesses,
today's enterprise data centers normally perform backup operations
across a storage network.
Figure 18. LAN-based tape backup transports both data and meta-
data over the LAN
As shown in Figure 18 a LAN-based tape backup configuration
requires a backup server that acts as the repository for metadata
(information on the structure of files and which files or records have
been copied) and the gatekeeper of the target tape subsystem.
Although metadata may incur little overhead on the LAN, the continu-
ous streaming of gigabytes of data from the production server to the
backup server can seriously impact other LAN-based applications.
Traditional LAN-based tape backup operates at the file level. Each
server on the LAN may have gigabytes of direct-attached storage that
needs to be secured through backup. The backup server instructs
each server to initiate a backup, with the data sent over the LAN from
server to backup server. This type of backup involves multiple conver-
sions. Upon launching a backup, the target server must read blocks of
SCSI data from disk, assemble the blocks into files, and packetize the
files for transfer over the LAN. At the backup server, the inbound pack-
ets must be rebuilt into files, while the files are, in turn, disassembled
into blocks to be written to tape. The original data blocks that reside
on the target storage therefore undergo four steps of conversion
before reappearing at the destination as blocks: blocks > file > pack-
ets > file > blocks. Both the server and backup server must devote
considerable Central Processing Unit (CPU) cycles to both SCSI and
network protocol overhead.
In addition, the limited bandwidth of the LAN (typically 1 Gbit/sec
Ethernet) can impose a much longer backup window. Simply moving
the data path off the LAN and onto a higher performance storage net-
work can alleviate the dual problem of LAN traffic load and backup
window constraints. This was one of the initial issues that accelerated
the adoption of SANs in enterprise data centers.
Figure 19. LAN-free tape backup separates the metadata and data
paths to offload the LAN transport and optimize backup streams
across the SAN
Figure 19 illustrates a LAN-free, SAN-based tape backup scheme. In
this case, the target tape subsystem is deployed on the storage net-
work to create a more direct path between the production server and
tape. As in LAN-based backup, the backup server is responsible for
maintaining metadata on the backup process, but the production
server can now request data from storage and copy it directly to the
tape target. With the LAN transport no longer a bottleneck for streams
of backup data, the backup window becomes more manageable. Still,
in both LAN-based and LAN-free solutions, the server remains in the
data path, reading data from storage and writing data to tape.
Figure 20. Server-free backup removes the production server from the
data path, freeing CPU cycles for applications instead of backup
operations
Server-free backup takes a more direct path between storage and tape
by eliminating the production server from the backup process. As
shown in Figure 20 an extended copy engine in the SAN assumes both
initiator and target roles on behalf of the server to perform the reads
and writes of data for the backup operation. The extended copy engine
can be resident in a SAN director or switch, an appliance attached to
the SAN, or embedded in the tape subsystem. The backup server is
still required to host metadata and monitor backup status, but the
metadata path can now be across the SAN or via a LAN-attached
extended copy controller.
While the high-performance SAN infrastructure and advanced utilities,
such as extended copy, facilitate efficient backup of storage data, the
application software that initiates and manages backup processes var-
ies in capabilities from vendor to vendor. Although every storage
administrator recognizes the necessity of data backup, it is sometimes
difficult to verify that a backup operation was completed and that the
tapes can actually be used to restore data. In addition, regular backup
operations may repeatedly copy data that is unchanged over time,
which adds to the volume and duration of the backup process. Ven-
dors of backup software may provide additional utilities for
verification, point-in-time (snapshot) backup for active databases,
changed-block-only backup, data de-duplication, or other value-added
backup services. As the volume of storage data grows, the task of
securely backing up data in a reasonable time frame is increasingly
difficult.
Backup Fabrics
Data traffic on a production SAN is typically characterized by high I/O
of fairly short transactions. With the exception of streaming video or
large image data applications (for example, medical or geophysical
imaging), the brevity of normal business transactions across a SAN
makes those transactions more tolerant of transient fabric issues
such as congestion or disruption. Tape backup, by contrast, is charac-
terized by the continuous streaming of blocks of data from the initiator
to the tape target. Any fabric disruption in the backup stream can
abort the entire backup operation. Data centers can therefore elect to
build a separate and dedicated fabric for tape backup, both to mini-
mize disruption to the backup process and to offload the tape traffic
from the production SAN.
Figure 21. A dedicated tape SAN isolates the backup process from the
production SAN
As shown in Figure 21, a dedicated tape SAN can be implemented in
parallel with the production SAN to isolate backup traffic from other
storage transactions. Because most Fibre Channel-to-SCSI bridges for
tape attachment were originally based on Fibre Channel Arbitrated
Loop (FCAL) protocol, the tape SAN would employ FCAL-capable
switches and FCAL HBAs for server attachment. Today, Fibre Channel
ports are typically integrated into tape subsystems and thus eliminate
the need for bridge products. Although implementing a separate tape
SAN may require additional hardware and management, it does
enhance stability of tape operations to ensure backup completion.
While Brocade enterprise-class platforms are commonly used for pro-
duction SAN connectivity, Brocade SAN switches, such as the Brocade
5000 Switch, are often used to build dedicated tape backup SAN
infrastructures.
Disk-to-Disk (D2D) Tape Emulation
One of the persistent complaints made against tape backup stems
from the disparity between disk array speeds and tape speeds. Disk
media simply spins at much higher rates than tape media, making
tape the inevitable bottleneck in the backup process. In addition, tape
backup is a linear process that is protracted by the constant reposi-
tioning of the tape media to the read/write head. The "shoe shine"
motion of tape is essential for accurately positioning the tape media to
mark the beginning of a backup stream, but necessarily incurs latency
(as well as wear on the media itself).
Figure 22. Disk-to-disk tape emulation requires no changes to backup
software
Because tape backup processes and software are so ubiquitous in
data centers, it has been difficult to replace tape backup with an alter-
native technology. Consequently, vendors have developed tape
emulation products that enable disk arrays to behave as conventional
tape targets. In addition, some tape emulation devices can assume
the personality of different types of tape subsystems and so enable a
single emulation device to service multiple tape backup solutions.
Because disk-to-disk tape emulation eliminates the bottleneck posed
by tape mechanics, it is possible to dramatically reduce backup win-
dows. Data retrieval from D2D is also expedited for either partial or full
data restorations. As shown in Figure 22, disk-to-disk tape emulation
can be configured with an external appliance or be embedded in a
specialized disk array controller. From the standpoint of the backup
application, the target device appears as a conventional tape sub-
system. This makes it possible to drop in a D2D solution with no major
changes to backup operations.
Disk-to-Disk-to-Tape (D2D2T)
Not all customers are comfortable, however, with committing their
backup data entirely to spinning media. Consequently, a disk-to-disk
tape emulation installation may be supplemented by a conventional
tape subsystem for long-term data archiving, as shown in Figure 23.
Once data is backed up to the D2D array, it can be spooled to the
downstream tape subsystem and cartridges can be shipped offsite for
safekeeping. In this case, the tape device no longer imposes a bottle-
neck to the backup process, since the initial backup has already been
executed to disk. D2D2T does not eliminate tape, but helps overcome
the limitations of tape in terms of performance for both backup and
restore of data. With ever-increasing volumes of data to safeguard via
backup and with regulatory compliance pressures on both data preser-
vation and retrieval, D2D2T provides a means to both expedite
processes and ensure long-term data protection.
Figure 23. Combining disk-to-disk tape emulation with conventional
tape backup
Remote Backup
By leveraging storage networking, large enterprise data centers can
centralize backup operations for local storage systems and replace
multiple dispersed tape devices with larger, higher-performance tape
silos. In addition to the main data center, large enterprises may also
have several smaller satellite data centers or regional offices with their
own storage and backup systems. Gaining control over all enterprise
data assets is difficult when backup processes can vary from one
remote location to another and when verifying the integrity of remotely
executed backups is not possible. The trend towards data center con-
solidation is therefore expanding to remote facilities, so that at
minimum corporate data can be centrally managed and safeguarded.
Previously, the limitations of WAN bandwidth excluded the possibility
of centralizing storage data backup operations from remote locations
to the main data center. Today, the combination of readily available
bandwidth and new storage technologies to optimize block data trans-
port over WANs enables the centralization of tape backup operations
throughout the enterprise.
Figure 24. Consolidating remote tape backup places all data under
the control and best practices of the data center
As shown in Figure 24, remote sites can now leverage dark fiber,
Dense Wave Division Multiplexing (DWDM), SONET, IP, or other WAN
transports and protocols to direct backup streams to the central data
center. SAN routers such as the Brocade 7500E, Brocade Edge
M3000, and Brocade USD-X, as well as the FR4-18i Extension Blade
for the Brocade 48000 Director and Brocade DCX Backbone, provide
high-performance storage connectivity over WANs and optimize block
data transport for backup and other storage applications.
Consolidating backup operations to the main data center enables cus-
tomers to extend data center best practices to all corporate data,
including verification of scheduled backups and restorability of tape
sets. If the primary data center implements disk-to-disk or D2D2T
technology, accelerated backup and data retrieval are likewise
extended to remotely generated data assets. In addition, the offload-
ing of backup operations to the data center reduces the requirement
for remote support personnel, remote tape hardware, and remote tape
handling and offsite transport.
Tape Vaulting
The introduction of WAN optimization technology for block storage
data and increased availability of WAN bandwidth offer additional
strategies for data protection, including the shifting of all backup oper-
ations to centralized tape backup facilities. In this case, even data
center backup operations are offloaded, with the additional advantage
that even the failure of one or more data centers would still leave cor-
porate data accessible for restoration to a surviving data center or
third-party service.
Figure 25. Tape vaulting centralizes all data backup to a secure loca-
tion dedicated to protecting all corporate data
As illustrated in Figure 25, tape vaulting further centralizes data pro-
tection by hosting all backup operations in a secure, typically hardened
remote facility. In the event of a catastrophic failure at one or more
production sites, the most recent backups can be restored from the
tape vault to resume business operations. As with centralized tape
backup, tape vaulting can provide enhanced protection for all corpo-
rate data and facilitate higher levels of security, such as encryption of
data as it is being written to tape. Larger enterprises may implement
their own tape vaulting sites, but third-party services by companies
such as Iron Mountain are also available.
Tape Pipelining
In the remote tape backup examples above, the transmission latencies
associated with long-distance networking were not factored in. Speed
of light latency results in about 1 millisecond (ms) of latency per 100
miles each way, or 2 ms for the round trip. Depending on the quality of
the wide area network service, additional latencies incurred by net-
work routers may be significant. As will be discussed in more detail in
Chapter 3, Disaster Recovery, transmission latency over long dis-
tances has a direct impact on storage applications and in particular on
tape backup. Although nothing can be done about the speed of light
(other than quantum tunneling, perhaps), Brocade has addressed the
problem posed by latency for remote tape backup by introducing tape
pipelining technology.
Tape pipelining is used in the Brocade USD-X and Edge M3000 to
expedite the delivery of tape backup streams over very long distances.
Without tape pipelining, every tape I/O must wait for acknowledge-
ment from the receiving end before the next I/O can be executed.
Figure 26. Without tape pipelining, performance falls dramatically dur-
ing the first 10 miles.
As shown in Figure 26, unassisted tape backup over distance slows
dramatically over the first few miles as both the transmission and
acknowledgement encounter longer latencies. Tape pipelining
resolves this problem by providing local acknowledgement to tape I/
Os. The Brocade USD-X and Edge M3000 buffer the I/Os issued by the
local backup server, provide immediate acknowledgments for each
one, and then stream the backup data across the WAN link. At the
receiving end, they buffer the received data and spool it to the tape
controller. Because neither storage router empties its buffers until the
target tape device acknowledges that the data has been received, a
temporary disruption in the WAN link will not result in loss of data or
abort of the tape backup session.
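The buffering and local-acknowledgement behavior described above can be modeled with a short, simplified sketch. The Python below is purely illustrative; the class and method names (PipeliningRouter, local_write, flush_to_remote) are invented for this example and do not represent Brocade firmware behavior, which operates on SCSI/FCP exchanges rather than Python objects.

```python
# Simplified model of tape pipelining: the local router acknowledges each
# tape I/O immediately, buffers it, and discards a buffered I/O only after
# the remote end confirms the target tape device has received it.
# Hypothetical sketch, not Brocade firmware or any vendor API.

class PipeliningRouter:
    def __init__(self):
        self.buffer = []          # I/Os held until remote tape ack

    def local_write(self, io):
        """Backup server issues a tape I/O; ack is returned immediately."""
        self.buffer.append(io)
        return "ACK"              # local acknowledgement, no WAN round trip

    def flush_to_remote(self, wan_send, remote_ack_for):
        """Stream buffered I/Os across the WAN; drop each one only when
        the remote router reports the tape target acknowledged it."""
        delivered = []
        for io in list(self.buffer):
            wan_send(io)
            if remote_ack_for(io):            # tape target confirmed write
                self.buffer.remove(io)
                delivered.append(io)
        return delivered                      # anything left is retried later


# Usage: a transient WAN drop leaves unacknowledged I/Os in the buffer,
# so the backup session can resume without data loss.
router = PipeliningRouter()
for block in ["blk1", "blk2", "blk3"]:
    assert router.local_write(block) == "ACK"
sent = router.flush_to_remote(wan_send=lambda io: None,
                              remote_ack_for=lambda io: io != "blk3")
print(sent, router.buffer)   # ['blk1', 'blk2'] ['blk3']
```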
Tape pipelining is the enabling technology for enterprise-wide consoli-
dated tape backup and remote tape vaulting for both open systems
and FICON (Fiber Connectivity). It is currently supported on a wide vari-
ety of WAN interfaces, including SONET, dark fiber, DWDM, ATM,
Ethernet, and IP networks. In combination with IP networking, in partic-
ular, tape pipelining offers an economical means to span thousands of
miles for centralized backup. Companies that were previously limited
to metropolitan distances can now place their data protection and
archiving sites in safe havens far from potential natural or social
disruptions.
Data Restoration from Tape
The elephant that is always in the room with tape backup is restoration
of data from tape to disk in the event of a data corruption or data cen-
ter disaster. No one wants to think about it and consequently many
companies do not test the viability of their tape backup cartridges for
restorability. As a result, tape backup is sometimes treated as a rote
process driven by good intentions. It may mark the check box of regula-
tory compliance, but without periodic testing cannot ensure data
protection.
Although the backup window is critical for committing all disk data to
tape, it is the restoration window that will determine the length of out-
age and the resulting loss of revenue. The recovery time
objective should therefore be realistically calculated on these basic
variables to the restoration process:
The total volume of data to be restored
The number of tape mounts required for that volume
The speed of the tape subsystem
The speed of the backup network
The configuration of the target disk array
If tape restore is performed over Gigabit Ethernet, for example, a tape
subsystem capable of 140 MB/sec will encounter a bottleneck at the
roughly 100 MB/sec effective limit of the network. By contrast, a Brocade Fibre
Channel backup SAN can provide 2 Gbit/sec to 8 Gbit/sec throughput
and support multiple tape restore streams concurrently.
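To make those variables concrete, the following sketch estimates a restore window from the volume of data, tape throughput, network throughput, target array throughput, and a fixed per-mount penalty. The function name and all of the sample figures (including the 140 MB/sec tape subsystem, the roughly 100 MB/sec effective Gigabit Ethernet path, and the per-cartridge capacity implied by the mount count) are illustrative assumptions, not measured values.

```python
# Rough restore-window estimate: the slowest element in the path
# (tape subsystem, network, or target array) gates the transfer,
# and each tape mount adds fixed handling time. Illustrative only.

def restore_hours(data_gb, tape_mbps, network_mbps, array_mbps,
                  tape_mounts=1, mount_minutes=2.0):
    bottleneck = min(tape_mbps, network_mbps, array_mbps)   # MB/sec
    transfer_sec = (data_gb * 1024) / bottleneck
    mount_sec = tape_mounts * mount_minutes * 60
    return (transfer_sec + mount_sec) / 3600

# 10 TB restore (13 mounts assumes roughly 800 GB per cartridge):
# Gigabit Ethernet (~100 MB/sec effective) vs. a 4 Gbit/sec Fibre
# Channel path (~400 MB/sec), both feeding a fast target array.
print(round(restore_hours(10_000, tape_mbps=140, network_mbps=100,
                          array_mbps=500, tape_mounts=13), 1))  # ~28.9 hours
print(round(restore_hours(10_000, tape_mbps=140, network_mbps=400,
                          array_mbps=500, tape_mounts=13), 1))  # ~20.8 hours
```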
The design of a comprehensive and viable tape backup and restore
solution will determine whether data recovery takes hours, days, or
even weeks. Even the best design and implementation, however, is
incomplete without periodic testing for restorability and resumption of
normal business operations.
Chapter 3: Disaster Recovery
Disaster Recovery (DR) is often viewed as an insurance policy. No one
likes to pay the premiums but everyone fears the repercussions of not
being covered. For today's enterprise data centers, Disaster Recovery
is virtually a mandatory requirement, if not for regulatory compliance
then for company survival. Whether downtime costs thousands or mil-
lions of dollars per hour, a prolonged data outage leaves a company
vulnerable to competition, depreciation of brand, and loss of custom-
ers. One of the persistent challenges for IT administrators, then, is to
create a workable disaster recovery plan under constant
pressure from budgetary constraints and the steady growth of data
requiring protection.
Over the past decade storage networking technology has developed a
new set of products and protocols that facilitate practical implementa-
tion of today's disaster recovery requirements. We are no longer
bounded by distance or bandwidth restrictions and it is now possible
to deploy disaster recovery solutions that span thousands of miles.
Brocade SAN Routers, for example, are supporting DR installations
that link sites in Japan to recovery centers on the US east coast and
others that span the Atlantic from Europe to the US. These extremely
long-distance data protection solutions were unthinkable 10 years
ago. In addition, high performance DR is now possible for metro or
regional sites. Brocade directors provide sufficient buffering to drive
10 Gbit/sec performance for over 50 miles with maximum link utiliza-
tion. Along with the technological innovations discussed below, these
new capabilities are breaking the boundaries for implementing enter-
prise-wide DR solutions and giving customers the flexibility to tailor
solutions to their own specific requirements.
Defining the Scope of Disaster Recovery Planning
In terms of data storage, Disaster Recovery represents an essential
aspect of the much broader scope of business continuity. Business
continuity planning must include personnel, facilities, remote offices,
power, transportation, telephone, and communications networks, in
addition to the data center infrastructure. The narrower scope of DR
planning focuses on data accessibility and so must consider servers,
storage networks, and the data center physical plant. This includes
providing additional features, such as standby diesel power generators
or redundant systems, to support a primary data center and provision-
ing dedicated recovery sites should the primary data center fail
completely.
Disaster Recovery planning can be as streamlined as implementing
periodic tape backup and then relying on a service provider for data
recovery and access or as complex as designing for multiple levels of
data protection at the primary data center and cloning the entire infra-
structure at one or more recovery sites. Data centers represent such a
substantial investment, however, that duplicating servers, storage
infrastructure, cooling, and facilities for standby operation is difficult to
justify to non-IT upper management. Enterprises are therefore often
dual-purposing recovery sites for both DR and production or applica-
tion development processing.
As recent history has shown, both natural and man-made disasters
have severe social and economic repercussions. No geographical loca-
tion is immune from potential disruption, but clearly some geographies
are more vulnerable than others. Coastal areas vulnerable to hurri-
canes, earthquakes, and tsunamis have an inherently higher risk
factor compared to inland areas, but even inland sites may be vulnera-
ble to tornados or periodic flooding. Disaster Recovery planning should
factor in the inherent risk of a specific data center location and that
assessment in turn drives selection of appropriate technologies and
safe havens. A DR plan that uses Oakland as a recovery site for a data
center in San Francisco, for example, probably does not adequately
protect against the potential effects of the San Andreas fault.
How far does data have to travel to be safe? Prior to 9/11, companies
in Manhattan commonly relied on recovery sites in New Jersey. New
Jersey itself suffered disruption, however, with the anthrax attacks the
month following the World Trade Center (WTC) attacks. During the cas-
cading Northeast power blackout in August, 2003, data center
managers discovered that locating recovery sites hundreds of miles
apart still cannot protect against severe failures of regional utilities.
A similar realization occurred in New Orleans in the fall of 2005, when
companies whose recovery sites were in Houston were hit by both Hur-
ricanes Katrina and Rita within a month's time. Previously, the
selection of a recovery site was limited by technology. It simply was not
possible to transport storage data beyond a metropolitan circumfer-
ence. With current technologies now able to send storage data
thousands of miles, companies can locate their recovery centers far
from regional vulnerabilities.
Defining RTO and RPO for Each Application
While all corporate data hopefully has some value, not all data needs
to be instantly accessible for the immediate resumption of business in
case of disaster or outage. One of the first steps in implementing an
effective Disaster Recovery strategy is to prioritize corporate data and
applications and match data types to levels of recovery. Online trans-
action processing, for example, may need a current and full copy of
data available in the event of disruption. This requirement is generally
met through synchronous disk-to-disk data replication over a suitably
safe distance. For other data, by contrast, it may be sufficient to have
tape backups available, with restoration to disk within two to three
days' time. The Recovery Point Objective (the amount of data loss that
could reasonably be accepted) and the Recovery Time Objective (the
allowable time after an outage before business is seriously impacted)
can both vary from one application to another. Sizing the recovery tac-
tic to business requirements helps keep costs under control while
streamlining a recovery process.
The IBM user group SHARE (founded in 1955, the world's first organi-
zation of computing professionals) has defined multiple tiers of
Disaster Recovery protection, ranging from no protection to continuous
protection and availability:
Tier 0. No offsite data backup
No offsite data or means to recover from local disaster
Tier 1. Data backup with no hot site
Offsite backup with no recovery site (CTAM), or remote disk/tape but
no remote processors/servers
Tier 2. Data backup with hot site
Offsite backup with bare metal recovery site (data must be reloaded
and processors initialized)
Tier 3. Electronic vaulting
Electronic transmission of most current mission-critical data with tape
restore for remainder
Tier 4. Point-in-time copy
Snapshot copy of current volumes for streaming to remote disk
Tier 5. Transaction integrity
Application-dependent data consistency between the production and
the remote DR site
Tier 6. Zero or little data loss
Asynchronous or synchronous disk-to-disk copy with independent data
consistency
Tier 7. Highly automated, business-integrated solution
Synchronous disk-to-disk copy / automatic recovery of systems and
applications
The significance of this Disaster Recovery ranking is not that a com-
pany must choose a single tier for all applications, but that different
applications may merit different tiers. For example, a retail chain may
be able to sustain a lengthy data center outage for applications relat-
ing to inventory or point-of-sale statistics. The individual stores, after
all, can continue to transact business, sell goods, and accumulate
income for weeks before shelf inventory becomes critical. A weeks-
long outage of Microsoft Exchange, however, would be unacceptable,
given that e-mail today is critical to the information flow in all compa-
nies. In this example, Exchange would qualify for Tier 6 or 7 handling,
while inventory applications might adequately be served by Tier 1 or 2
solutions.
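One lightweight way to capture this application-by-application decision is a simple table recording each application's RPO target, RTO target, and resulting SHARE tier. The Python below is a hypothetical illustration; the applications, targets, and tier assignments are examples only, not recommendations.

```python
# Hypothetical mapping of applications to recovery tiers. The tier numbers
# follow the SHARE ranking described above; RPO/RTO targets are examples.

dr_plan = {
    # application          (RPO target,          RTO target,     SHARE tier)
    "email (Exchange)":    ("near zero",          "minutes",      7),
    "online transactions": ("zero",               "minutes",      6),
    "inventory reports":   ("24 hours",           "2-3 days",     2),
    "dev/test code":       ("last nightly tape",  "best effort",  1),
}

def applications_needing_replication(plan, min_tier=4):
    """Return applications whose tier implies disk-to-disk replication."""
    return [app for app, (_, _, tier) in plan.items() if tier >= min_tier]

print(applications_needing_replication(dr_plan))
# ['email (Exchange)', 'online transactions']
```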
Prioritizing business applications and data and then pairing different
applications to different tiers of recovery are probably the most diffi-
cult but essential steps in formulating a cost-effective DR plan. If you
asked individual business units if their data is critical to the survival of
the company, of course, they would all say yes. An objective assess-
ment of the business value of application data is therefore required to
both contain costs and to ensure that truly mission-critical data gets
priority during recovery. The alternative approach is to simply give all
corporate data equal value and priority, but this simpler solution is
also the most expensive. Synchronous data replication of inventory
projection data or program development code can certainly be done
(and storage vendors will gladly sell you the requisite additional stor-
age and software licenses), but such data is better served by
traditional backup and offsite tape transport.
Synchronous Data Replication
Synchronous data replication is often used for application data that
requires a zero or near-zero recovery point objective. It is typically imple-
mented at the disk array controller: every write of data to disk is
duplicated and sent to the (typically remote) secondary or recovery
array. As shown in Figure 27, the local write complete status is not
returned to the initiating server until the secondary array has com-
pleted its write operation.
Figure 27. Array-based synchronous replication over distance
Because every transaction must be confirmed by the secondary stor-
age array, synchronous data replication provides a zero RPO
and a near-immediate RTO. In the event of a failure of the primary array or data center,
operations can be immediately resumed at the recovery site with no
data loss. As the distance between primary and recovery sites
increases, however, transmission latency can adversely impact server
performance. Synchronous data replication is therefore typically
restricted by the supplying vendor to about 150 miles or less. For
longer distances, asynchronous replication can be used.
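The defining property of synchronous replication is that the host's write is not acknowledged until both arrays have committed it, which is why added distance lengthens every I/O. A minimal sketch of that ordering follows; the function and variable names are invented for illustration and do not correspond to any array vendor's replication API.

```python
import time

# Toy model of array-based synchronous replication: the local array must
# wait for the remote array's commit before acknowledging the host, so
# every write pays the full round-trip latency to the recovery site.

def synchronous_write(block, commit_local, commit_remote, one_way_latency_s):
    commit_local(block)                       # write to primary array
    time.sleep(one_way_latency_s)             # copy travels to secondary
    commit_remote(block)                      # secondary commits
    time.sleep(one_way_latency_s)             # secondary's acknowledgement returns
    return "WRITE COMPLETE"                   # only now is the host acknowledged

primary, secondary = [], []
start = time.time()
synchronous_write("txn-1", primary.append, secondary.append,
                  one_way_latency_s=0.001)    # ~100 miles at ~1 ms each way
print(f"host saw ack after {time.time() - start:.4f} s; "
      f"both copies present: {primary == secondary}")
```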
Conventional array-based synchronous data replication is typically pro-
prietary and requires the same vendor products on both ends. For
customers who prefer a single vendor solution (or sometimes a single
neck to choke) this may not be an issue, but it does present a chal-
lenge to customers who have heterogeneous storage systems either
through mergers and acquisitions or vendor changes over time. How-
ever, proprietary solutions are often accompanied by unique value-
added services optimized for the vendor's architecture.
Figure 28. Maximizing utilization of large storage systems for bi-direc-
tional replication
In Figure 28, for example, the primary and secondary storage arrays
can be partitioned so that each array serves as the recovery system for
the other. This active-active configuration enables both primary
and secondary sites to function as full production centers and as zero-
data-loss recovery sites should either array or site fail.
Metro DR
Given the distance limitations of synchronous data replication, it is
normally deployed within metropolitan environments. A financial insti-
tution with several sites in a city, for example, would implement
synchronous data replication to safeguard every data transaction,
even though all of the sites are vulnerable to potential disruptions
inherent to that geography. The risk that one or all sites may fail simul-
taneously (for example, in the event of a major earthquake) must be
balanced against the likelihood of failures due to less disruptive
events. The vast majority of data outages, after all, are due to operator
error or the unintended consequences of upgrades or periodic service
calls. As is discussed below, companies can implement a tiered DR
plan that combines synchronous data replication as primary protection
with asynchronous replication as a safeguard against true disasters.
Today customers have a variety of options for Metropolitan Area Net-
work (MAN) services to support synchronous data replication.
Companies can install or lease dark fiber between primary and recov-
ery sites and use DWDM or Coarse Wavelength
Division Multiplexing (CWDM) to maximize utilization of the fiber optic
cable plant. DWDM currently supports up to 64 channels on a single
fiber optic cable while CWDM, as the name implies, supports fewer at
8 to 16 channels per fiber. Both DWDM and CWDM are protocol agnos-
tic and so can support native Fibre Channel, Gigabit Ethernet, or IP
over Ethernet. In combination with Brocade directors, switches and
SAN Routers, DWDM/CWDM can easily accommodate metro storage
applications, including resource sharing and Disaster Recovery for
both open systems and FICON.
In many metropolitan areas, MAN service providers have built exten-
sive Synchronous Optical NETwork (SONET) rings around primary
business districts. Packet Over SONET (POS) enables encapsulation of
IP and so can be used for IP storage protocols, such as FCIP or iSCSI.
In addition, some vendors provide interfaces for bringing native Fibre
Channel traffic into SONET.
Figure 29. Leveraging metro SONET for native Fibre Channel disaster
recovery
As shown in Figure 29, Brocade directors at both primary and recovery
sites are Fibre Channel-attached to FC-SONET interfaces to connect to
the metropolitan SONET ring. With speeds from OC-3 (155 Mbit/sec) to
OC-48 (2.5 Gbit/sec), SONET is a viable option for metro disaster recov-
ery solutions.
Carriers are also providing Gigabit and 10 Gigabit Ethernet transports
for metropolitan data applications. Metro Ethernet services are mar-
keted primarily for Internet broadband connectivity but can support
any IP traffic including FCIP for DR traffic. In the future, metropolitan
10 Gigabit services will also be able to support Fibre Channel over
Ethernet (FCoE) once that protocol has achieved maturity.
Leveraging High Speed ISLs
For enterprise-class metropolitan DR applications, use of native Fibre
Channel Inter-Switch Links (ISLs) for connectivity between primary and
recovery sites eliminates the overhead of protocol conversion and sim-
plifies deployment and management. A single ISL, however, may not
be sufficient to support the total volume of DR traffic, particularly if
data replication for some applications is running concurrently with
tape backup streams for other applications. To address this issue, Bro-
cade has pioneered trunking technology that enables multiple ISLs to
be treated as a single logical ISL or trunk.
Figure 30. Using Brocade trunking to build high performance metro
disaster recovery links
As illustrated in Figure 30, up to eight 4 Gbit/sec ISLs can be com-
bined to create a single logical ISL capable of up to 32 Gbit/sec
throughput. Brocade trunking maintains in-order delivery of frames to
ensure data reliability. Because all links are treated as a single logical
ISL, the loss of a single ISL may reduce the total available bandwidth
but will not disrupt availability. Trunking is further enhanced with Bro-
cade Dynamic Path Selection (DPS), which provides exchange-based
load balancing when multiple ISL trunks are configured between multi-
ple switches.
The example in Figure 30 shows a maximum configuration, but
in practice two to four trunked ISLs at 4 or 8 Gbit/sec would be sufficient
for most metro DR applications. In addition, because each ISL is con-
nected to a different DWDM channel, the transmission length deltas
between the channels must be considered. Typically a metro distance
of 50 miles or less is suitable for trunked ISLs over DWDM.
Brocade has also introduced high-performance 10 Gbit/sec Fibre
Channel ISLs to further simplify the cabling scheme. The Brocade FC-
10-6 blade, for example, supports six 10 Gbit/sec FC ports and up to
forty-eight 10 Gbit/sec ports can be configured in a single Brocade
48000 Director chassis. As with all extension technologies, the band-
width-to-distance ratio dictates that the higher the bandwidth, the
shorter the distance that can be supported. The 10 Gbit/sec Fibre
Channel port speed, however, is still adequate for most metro dis-
tances. If longer metro distances are required, trunked ISLs at lower
speeds can be provisioned.
Asynchronous Data Replication
For data replication beyond the 150-mile radius supported by synchro-
nous applications, asynchronous data replication can be used.
Asynchronous data replication maintains optimum server performance
by immediately issuing write complete status as soon as the data is
committed to the local disk array. Multiple write operations are buff-
ered locally and then sent en masse to the remote secondary array. As
shown in Figure 31, the remote array sends back its own write com-
pletes as they are executed. The primary array can then flush its
buffers for the previous transactions and issue additional I/Os.
Figure 31. Asynchronous data replication buffers multiple I/Os while
providing immediate local acknowledgement
Asynchronous data replication cannot guarantee a zero RPO if the pri-
mary array suffers a sudden failure. There is always the risk that one
or more transactions will be lost. For transitory WAN disruptions, how-
ever, most asynchronous schemes can resume operations by re-
issuing frames still held in the array buffers. In addition, if Brocade
SAN Routers are used to provide WAN connectivity, they will also keep
the most recent transactions buffered until acknowledgment by the
receiving SAN router, so that recovery of operations can be ini-
tiated independently of the storage arrays.
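In contrast to the synchronous case, an asynchronous scheme acknowledges the host as soon as the local commit completes and drains a buffer of pending writes toward the secondary array in the background. The sketch below models only that ordering; the names are hypothetical, and real implementations also handle write ordering, consistency groups, and resynchronization.

```python
from collections import deque

# Toy model of asynchronous replication: host writes are acknowledged after
# the local commit; copies queue up and are shipped to the secondary later.
# If the primary is lost before the queue drains, the queued writes are the
# exposure that makes a guaranteed zero RPO impossible.

class AsyncReplicator:
    def __init__(self):
        self.primary = []
        self.secondary = []
        self.pending = deque()          # buffered writes awaiting remote commit

    def host_write(self, block):
        self.primary.append(block)      # local commit
        self.pending.append(block)      # queue the remote copy
        return "WRITE COMPLETE"         # immediate ack, no WAN wait

    def drain(self, max_batch=8):
        """Ship one batch across the WAN; remote acks let the buffer shrink."""
        batch = [self.pending.popleft()
                 for _ in range(min(max_batch, len(self.pending)))]
        self.secondary.extend(batch)    # remote commits
        return len(self.pending)        # writes still at risk

rep = AsyncReplicator()
for i in range(10):
    rep.host_write(f"txn-{i}")
print("still buffered after one drain:", rep.drain())   # 2
```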
Asynchronous data replication can be array based, appliance based or
driven by a storage virtualization engine in a standalone product or
director blade. Because asynchronous data replication is transparent
to server performance, it can drive over much longer latencies and
support DR configurations spanning thousands of miles. Long-dis-
tance WAN services are expensive, however, and the technical
challenge for SAN extension for long haul DR is to optimize utilization
of the available bandwidth and get more data across in less time.
Going the Distance
Bandwidth and latency are distinct and unrelated variables. Band-
width can determine how much data can be issued across a link, but
has no effect on how long it takes to get to the other side. Latency is
determined by transmission distance as well as intervening network
equipment, and mitigating its effects requires clever engineering
techniques. Transaction latency over distance must account for both
transmission of data and receipt of acknowledgment.
As shown in Table 2, transmission latency is about 1 millisecond (ms)
per 100 miles each way, or about 2 ms per 100 miles round trip. Because asynchronous trans-
actions are largely immune to latency, 80 ms or more round trip is
acceptable. Still, if the latency of a certain distance is fixed by the laws
of nature and network equipment, it is always desirable to maximize
the amount of data that is delivered within the latency period.
Table 2. Transaction latency over distance

Point-to-Point     Point-to-Point     Latency each     Round-trip
Distance (km)      Distance (mi)      way (ms)         latency (ms)
  893                555                 5                10
1,786              1,110                10                20
2,679              1,664                15                30
3,572              2,219                20                40
4,465              2,774                25                50
5,357              3,329                30                60
6,250              3,884                35                70
7,143              4,439                40                80
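The figures in Table 2 follow from basic propagation delay. A quick sanity check in Python: light in optical fiber travels at roughly 200,000 km/s (about 5 microseconds per kilometer), which lands slightly below the table's rounded values; real links add router and equipment delay on top of this. The constant used here is a common approximation, not a value from the text.

```python
# Sanity check on Table 2: one-way latency ~= distance / speed of light in
# fiber (~200,000 km/s). The table rounds up to whole milliseconds, which
# also leaves margin for intervening network equipment.

def one_way_ms(distance_km, fiber_km_per_s=200_000):
    return distance_km / fiber_km_per_s * 1000

for km in (893, 1786, 3572, 7143):
    ow = one_way_ms(km)
    print(f"{km:>5} km  ->  {ow:4.1f} ms one way, {2 * ow:5.1f} ms round trip")
# 893 km -> ~4.5 / 8.9 ms; 7,143 km -> ~35.7 / 71.4 ms (Table 2 rounds to 40 / 80)
```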
Brocade SAN extension products employ a number of innovations to
reduce the negative impact of transmission latency on upper-layer
applications. Current Brocade SAN extension products leverage the
availability and longer reach of TCP/IP networks by encapsulating
Fibre Channel in IP. FCIP and the Internet Fibre Channel Protocol (iFCP) enable stor-
age traffic to take advantage of IP-based technologies such as jumbo
frames, data compression, and IP Security (IPSec) to both expedite
data delivery and secure storage data as it traverses the network. The
Brocade enhancements discussed below include both IP-based and
Fibre Channel-based mechanisms that work in concert to optimize link
utilization and boost performance.
Credit Starvation
Because the Fibre Channel architecture was originally designed for
local data center application, support for long-distance deployment
was never a priority. SAN connectivity is measured in feet or meters
and only occasionally in miles or kilometers. Consequently, the stan-
dard switch ports used for device attachment do not require large
buffers to accommodate long-distance transmission. The Brocade
5000 Switch, for example, provides long-haul connectivity up to about
25 miles at 4 Gbit/sec and about 50 miles at 2 Gbit/sec using
Extended Long-Wavelength Laser (ELWL) Small Form-factor Pluggable
(SFP) optics. That makes it suitable for metro applications, but it is not
designed to support transmissions of hundreds or thousands of miles.
Without enhanced port buffering, a standard switch port transmits the
contents of its buffer and then waits for buffer credit renewal from its
partner at the other end of the WAN link, as shown at the top of
Figure 32. As the distance between the two switches is extended,
more of the WAN link is idle while the initiator waits for credit replen-
ishment. Additional idle time is incurred while the receiving
switch sends credits back to the initiator. This credit starvation results
in wasted WAN bandwidth and further delays in data transmission at
the application layer.
Figure 32. Larger port buffers avoid credit starvation
To address this issue, Brocade SAN extension products such as the
Brocade 7500E, 7500, and Brocade Edge M3000 SAN Routers, the
FR4-18i Routing Blade, and the Brocade USD-X are designed with
large port buffers to support long-distance SAN and DR applications.
As shown at the bottom of Figure 32, enhanced port buffers enable
Brocade SAN extension solutions to fill the WAN pipe with productive
traffic. As the receiving SAN router processes the data and hands it off
to the downstream SAN, it can issue a steady stream of credits back to
its partner as new data continues to arrive. Maximizing utilization of
the WAN link improves both performance and the return on invest-
ment. The WAN provider, after all, charges for the link whether it is
used efficiently or not.
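The amount of port buffering needed to keep a long link full can be estimated from the link's bandwidth-delay product: the number of full-size Fibre Channel frames "in flight" during a round trip. The sketch below uses that generic relationship; it is not a Brocade sizing formula, and the frame size and fiber speed are common approximations.

```python
# Buffer-to-buffer credits needed to keep a Fibre Channel link busy over
# distance ~= round-trip time x line rate / frame size (the bandwidth-delay
# product expressed in frames). Generic approximation, not a vendor formula.

def credits_needed(distance_km, gbit_per_s, frame_bytes=2112,
                   fiber_km_per_s=200_000):
    round_trip_s = 2 * distance_km / fiber_km_per_s
    bytes_in_flight = round_trip_s * gbit_per_s * 1e9 / 8
    return int(bytes_in_flight / frame_bytes) + 1

for speed in (2, 4):
    print(f"{speed} Gbit/sec over 80 km: ~{credits_needed(80, speed)} credits")
# 2 Gbit/sec over 80 km: ~95 credits; 4 Gbit/sec over 80 km: ~190 credits
```

The point of the arithmetic is simply that required buffering grows with both link speed and distance, which is why standard device-attachment ports are not provisioned for long-haul use.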
Data Compression
Compression technology identifies repetitive patterns in a data stream
and represents the same information in a more compact and efficient
manner. By compressing the data stream, more data can be sent
across the network, even if slower link speeds are used. At the destina-
tion, compressed data is returned to its original form and delivered
intact to the receiving device. Brocade implements lossless compres-
sion to ensure that the exact information is reproduced from the
compressed data. Only the payload of a packet is compressed and not
the Transmission Control Protocol (TCP) header. Packets with sizes
less than 512 bytes are not compressed.
The compression ratio compares the size of the original uncom-
pressed data to the compressed data. A compression ratio of 2:1, for
example, means that the compressed data stream is half the size of
the original data stream. With a 2:1 ratio, a customer could therefore
achieve roughly twice the effective throughput over the same network
links.
Compression is especially useful when transmitting storage data over
a slow link such as a T1 (1.5 Mbit/sec) or 10 Mbit/sec Ethernet. By
enabling compression on a Brocade SAN router, a customer could
achieve 2 MB/sec data throughput on a T1 link and 11 MB/sec data
throughput on a standard 10 Mbit/sec Ethernet link. Data compres-
sion thus enables use of slower, less expensive link speeds for such
storage applications as asynchronous remote mirroring, remote tape
backup, and remote content distribution.
Brocade data compression is recommended for use with T3 (45 Mbit/
sec) and higher-speed WAN links. Without data compression, a T3 link
can deliver approximately 4.6 MB/sec of storage data. With data com-
pression enabled, the T3 link can support 25 MB/sec of storage data,
more than a fivefold increase in link utilization. Likewise, an OC-3 (155
Mbit/sec) WAN link that would normally drive 16 MB/sec throughput
can, using compression, deliver 35 MB/sec throughput, a twofold gain
in storage data throughput. Disaster Recovery implementations that
typically use T3 or higher speed WAN links can thus maximize use of
their wide area services to safeguard more data more quickly.
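The gains quoted above amount to the link's raw payload rate multiplied by the achieved compression ratio. A small illustration follows; the protocol-overhead factor and the compression ratios used to reproduce the text's figures are assumptions chosen for the example, and the optional processing ceiling is likewise hypothetical.

```python
# Effective storage throughput over a compressed WAN link is roughly the
# link's payload rate multiplied by the achieved compression ratio, up to
# whatever the compression engine itself can process. Illustrative only;
# the ~18% protocol-overhead allowance is an assumption.

def effective_mb_per_s(link_mbit, ratio, engine_cap_mb=None):
    raw_mb = link_mbit / 8 * 0.82          # assumed protocol overhead
    boosted = raw_mb * ratio
    return min(boosted, engine_cap_mb) if engine_cap_mb else boosted

print(round(effective_mb_per_s(45, 5.4), 1))    # T3 at ~5.4:1  -> ~25 MB/sec
print(round(effective_mb_per_s(155, 2.2), 1))   # OC-3 at ~2.2:1 -> ~35 MB/sec
```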
The efficiency of data compression depends on the data itself and the
bandwidth of the WAN link. Not all data is compressible. Graphic and
video data, for example, does not have the same data characteristics
as database records, which tend to have repetitive bit patterns. In
addition, data compression is most efficient when there is a greater
delta between ingress and egress speeds. The lower the WAN link
speed, the more opportunity there is to examine the data held in the
SAN router buffers and to apply the appropriate compression algo-
rithms if the data is compressible. If, for example, the ingress speed is
1 Gbit/sec Fibre Channel and the egress is Gigabit Ethernet, it is more
expeditious to simply hand the data to the WAN without compression.
This explains why in the examples provided above, compression on a
T3 link can enhance performance by a factor of 5:1, while compres-
sion on a higher speed OC3 link is only a factor of 2:1.
Jumbo Frames
In encapsulating Fibre Channel storage data in TCP/IP for transmis-
sion over conventional WANs, it is necessary to address the disparity
between Ethernet and Fibre Channel frame sizes. A typical Ethernet
frame is 1518 bytes. A typical Fibre Channel frame is about 2112
bytes. Wrapping Fibre Channel frames in Ethernet, therefore, requires
segmentation of frames on the sending side and reassembly on the
receiving side. This, in turn, incurs more processing overhead and
undermines performance end to end.
To align Fibre Channel and Ethernet frame sizes, a larger Ethernet
frame is needed. Although not an official IEEE standard, a de facto
standard called "jumbo frames" allows for Ethernet frames up to
about 9 KB in length. The caveat for use of jumbo frames is that all
intervening Ethernet switches, network routers, and SAN routers must
support a common jumbo frame format.
Use of a maximum jumbo frame size of 9 KB allows four Fibre
Channel frames to be encapsulated in a single Ethernet frame. This
would, however, complicate Fibre Channel link layer recovery as well as
buffer flow control. Instead, Brocade SAN routers encapsulate a com-
plete Fibre Channel frame into one jumbo Ethernet frame. Because
Fibre Channel frames may include extended and optional headers or
virtual fabric tagging information, the jumbo Ethernet frame size is not
fixed and varies depending on the requirements of the encapsulated
Fibre Channel frame.
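The frame-size arithmetic is easy to check: a full Fibre Channel frame does not fit in a standard 1500-byte Ethernet payload, so it must be split, whereas a jumbo payload can carry it whole. The sketch below uses nominal sizes only and ignores FCIP/TCP/IP header detail, which varies by implementation.

```python
import math

# How many Ethernet frames does one Fibre Channel frame need?
# A standard ~1500-byte Ethernet payload forces segmentation and
# reassembly; a ~9000-byte jumbo payload carries the FC frame intact.
# Nominal sizes only; encapsulation headers are ignored for simplicity.

FC_FRAME = 2112          # maximum Fibre Channel frame, approximately
STANDARD_MTU = 1500
JUMBO_MTU = 9000

def ethernet_frames_needed(fc_bytes, mtu):
    return math.ceil(fc_bytes / mtu)

print(ethernet_frames_needed(FC_FRAME, STANDARD_MTU))  # 2 -> segmentation required
print(ethernet_frames_needed(FC_FRAME, JUMBO_MTU))     # 1 -> sent whole
```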
Jumbo frames help expedite packet processing by increasing the pay-
load of every frame transmission and eliminating the continuous
overhead of segmentation and reassembly of Fibre Channel frames
from smaller 1500-byte Ethernet frames. If all network equipment
between source and destination supports jumbo frames, this is
another option that provides an incremental improvement in perfor-
mance and link utilization.
Rate Limiting
The TCP layer above IP is an end-to-end insurance policy against data
loss. Because the available bandwidth through a network may be vari-
able and traffic loads unpredictable, congestion and buffer overruns in
the intervening network equipment can occur. In IP environments, the
response to congestion is to simply throw away frames, a reaction that
is horrifying to storage administrators. Packets may be lost, but thanks
to the TCP layer they will be recovered and retransmitted. Packet
recovery, however, has a performance penalty. The TCP layer must
identify the missing packets and generate retransmissions. TCP,
in turn, does not simply resume at full speed but incrementally ramps
up the transmission rate until congestion again occurs.
Early adopters of SAN extension over IP soon learned of this behavior
when curious sawtooth performance patterns occurred. Levels of
reasonably high performance were periodically punctuated with sud-
den drops, as illustrated in the middle of Figure 33.
Figure 33. Using Brocade rate limiting to avoid congestion and erratic
performance
This constant cycle of congestion and recovery severely impacts per-
formance and results in wasted bandwidth on the WAN link.
As shown at the bottom of Figure 33, Brocade avoids the erratic
behavior caused by congestion, packet loss, recovery, and TCP window
ramping by pacing the load delivered to the WAN link. By restricting the
traffic offered to the WAN to the designated bandwidth (in this exam-
ple, a T3 at 45 Mbit/sec), Brocade SAN routers can minimize potential
congestion and recovery latencies and help ensure the uninterrupted
delivery of data that storage applications expect.
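Rate limiting of this kind is commonly implemented as a token bucket: traffic is admitted only as fast as tokens accumulate at the configured WAN rate, so the router never offers the network more than the provisioned link can carry. The implementation below is a generic textbook sketch, not Brocade's algorithm.

```python
import time

# Generic token-bucket rate limiter: tokens accrue at the configured WAN
# rate (e.g. a T3 at 45 Mbit/sec) and each transmitted byte consumes one
# token, so the offered load never exceeds the provisioned bandwidth.

class TokenBucket:
    def __init__(self, rate_bytes_per_s, burst_bytes):
        self.rate = rate_bytes_per_s
        self.capacity = burst_bytes
        self.tokens = burst_bytes
        self.last = time.monotonic()

    def try_send(self, nbytes):
        now = time.monotonic()
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if nbytes <= self.tokens:
            self.tokens -= nbytes
            return True          # frame goes onto the WAN now
        return False             # hold the frame; avoids congestion drops

t3 = TokenBucket(rate_bytes_per_s=45_000_000 // 8, burst_bytes=64_000)
print(t3.try_send(9000), t3.try_send(60_000))   # True False (second must wait)
```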
FastWrite
The SCSI protocol includes commands and status exchanges that facil-
itate moving large blocks of data in an orderly fashion between servers
and storage. When servers and storage are separated by distance,
however, the normal SCSI exchange may lead to inefficient use of the
bandwidth available in the WAN link. Brocade SAN routers incorporate
a FastWrite option to address this problem. FastWrite preserves stan-
dards-based SCSI protocol exchanges, while enabling full utilization of
the available bandwidth across wide area connections and a 10x or
greater performance increase for storage applications.
Pioneered by Nishan Systems in 2001, FastWrite is now an integral
part of Brocade SAN extension technology. In order to understand how
FastWrite works, it is useful to review standard SCSI write operations
as illustrated in Figure 34. There are two steps to a SCSI write. First,
the write command is sent across the WAN to the target. The first
round trip is essentially asking permission of the storage array to send
data. The target responds with an acceptance (FCP_XFR_RDY). The ini-
tiator waits until it receives this response from the target before
starting the second step, sending the data (FCP_DATA_OUT). For large
I/Os, the initiator sends multiple FCP_DATA_OUTs sequentially, but
must wait for an FCP_XFR_RDY for each one as shown in Figure 34.
When all the data has finally been received by the target and commit-
ted to disk, the target responds with a write complete status
(FCP_STATUS). In this example, the SAN routers are simply passing
SCSI commands and data across the WAN between the initiator and
the target.
As the distance and accompanying latency between the initiator and
target increases, more and more transaction time is consumed by
SCSI protocol overhead. This appears to be an inevitable result of
transmission latency over long WAN links, and that would indeed be the
case if the SAN routers provided only protocol conversion between
Fibre Channel and IP. Brocade SAN routers, however, are intelligent
devices that can support more sophisticated applications, and Fast-
Write can behave as a proxy target to the initiator and a proxy initiator
to the real target.
Figure 34. A standard SCSI write operation over distance requires sig-
nificant protocol overhead
As shown in Figure 35, when the initiator issues a write command to
the target (in this example for 1 MB of data), the local SAN router prox-
ies for the remote target and immediately responds with a transfer
ready for the entire amount to be written. As the initiator responds with
a series of DATA_OUTs, the local SAN router buffers the write data and
issues a FCP_CMD_WRT to its partner SAN router on the far side of the
WAN link. After an acknowledgment from the remote SAN router, the
local SAN router begins streaming the entire payload across the WAN
in a single write operation.
At the receiving end, the remote SAN router proxies as an initiator to
the remote target and issues an FCP_CMD_WRT to it. The remote tar-
get responds with an XFR_RDY specifying the amount that can be sent
with each DATA_OUT. On both sides of the WAN link, the SCSI protocol
overhead functions normally but is localized to each side. When all the
data has finally been committed to the remote disk array, the target
responds with a write complete FCP_STATUS, which is relayed by the
SAN routers back to the initiator.
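The practical effect of proxying the transfer-ready locally is to collapse many WAN round trips into essentially one per write command. The sketch below simply counts round trips for a large write broken into fixed-size transfer bursts; the burst size and the single-round-trip simplification for FastWrite are assumptions made for illustration, not a description of the actual protocol exchange.

```python
import math

# Compare WAN time spent on SCSI protocol round trips for a large write.
# Without FastWrite each FCP_XFR_RDY/DATA_OUT burst crosses the WAN; with
# FastWrite the transfer-ready handshake is proxied locally, so the write
# pays roughly one WAN round trip of protocol overhead. Assumed figures.

def protocol_overhead_ms(write_mb, burst_kb, rtt_ms, fastwrite=False):
    bursts = math.ceil(write_mb * 1024 / burst_kb)
    round_trips = 1 if fastwrite else bursts + 1    # +1 for the initial command
    return round_trips * rtt_ms

rtt = 40                                  # ~2,000 miles round trip, in ms
print(protocol_overhead_ms(1, 64, rtt))                   # 680 ms without FastWrite
print(protocol_overhead_ms(1, 64, rtt, fastwrite=True))   # 40 ms with FastWrite
```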
Figure 35. FastWrite dramatically reduces the protocol overhead
across the WAN link by proxying for both initiator and target
Because there is no spoofing of the write complete, there is no risk
that the write operation will inadvertently be confirmed if a WAN dis-
ruption occurs during this process. For transient WAN outages, the
Brocade SAN routers keep TCP sessions active and resume opera-
tions once the link is restored. In the event of a hard failure of the WAN
link during the FastWrite operation, the sessions will terminate and the
initiator, having not received a write complete, will know the write was
unsuccessful. This ensures data integrity and safeguards the immortal
souls of SAN router design engineers. The prime directive of storage
networking technology, after all, is to preserve the sanctity of customer
data.
FastWrite has been used in customer deployments for over five years
and has repeatedly demonstrated substantial performance improve-
ments for Disaster Recovery and data migration applications.
Customers have seen a 10x or better performance boost and have
been able to compress data migration projects from weeks to days. In
combination with large port buffers, data compression, jumbo frames,
and rate limiting, FastWrite enables Brocade SAN routers to deliver
enterprise-class SAN extension that fully utilizes WAN bandwidth and
expedites data delivery over long-haul DR installations. As detailed in
Table 3, Brocade FastWrite provides sustained high performance over
extremely long distances spanning thousands of miles.
IP Security (IPSec)
Data moving over any link poses a potential security risk. The security
mechanisms discussed in Chapter 1 help secure the data center SAN
against internal and external intrusions as well as inadvertent disrup-
tions due to operator error or system upgrades. Long-haul DR using
FCIP or iFCP protocols can also be secured through established IETF
IPSec algorithms. The Brocade 7500 SAN router and FR4-18i Exten-
sion Blade, for example, provide hardware-based IPSec data
encryption for enforcing high-performance security over untrusted net-
work segments. In combination with the WAN optimization facilities
discussed above, Brocade's IPSec implementation ensures both the
security and expeditious delivery of storage data across the network.
Table 3. Comparison of performance over long distances with and
without FastWrite

Without FastWrite                        With FastWrite
ms      km        Average Throughput     ms      km        Average Throughput
 0          0     55                      0          0     55
 1        200     37                      1        200     55
 2        400     30                      2        400     55
 5      1,000     18                      5      1,000     55
10      2,000     10                     10      2,000     55
15      3,000      7                     15      3,000     55
20      4,000      5.7                   20      4,000     55
25      5,000      5.01                  25      5,000     55
30      6,000      4.3                   30      6,000     43
35      7,000      3.5                   35      7,000     40
40      8,000      3.5                   40      8,000     39
Disaster Recovery Topologies
Although Disaster Recovery scenarios share the common elements
of source, transport, and destination, the profiles of practical DR con-
figurations can vary widely from one customer to another. A small or
medium enterprise, for example, can have a single disk array at its pro-
duction site and perform synchronous or asynchronous data
replication to a remote array. Large enterprises can have dozens of
arrays distributed over multiple data centers and replicate to one or
more strategically located DR facilities. In addition, remote data repli-
cation may be only one element of a more complex DR strategy,
incorporating continuous data protection mechanisms and centralized
tape vaulting. Disaster recovery topologies are thus more streamlined
or more complex depending on the business requirements of the
enterprise and the amount and variation of data types to be secured
against loss.
Three-Tier DR
Because synchronous data replication is bounded by WAN latency, it is
typically deployed within a 150-mile radius from the primary data cen-
ter. Synchronous replication has excellent RPO and RTO
characteristics, but still cannot protect storage data if a region-wide
disaster or outage occurs. Some enterprises therefore have moved to
a three-tier DR model that incorporates both synchronous and asyn-
chronous replication schemes.
Figure 36. A three-tier DR topology provides an extra layer of data pro-
tection in the event of regional disruption
As shown in Figure 36, conventional synchronous replication can be
implemented within a metropolitan circumference to provide recovery
for a failure of the primary data center. This two-tier scenario is aug-
mented by an additional WAN link to provide asynchronous replication
to a third site. Because asynchronous replication is highly tolerant of
latency, the third remote recovery site can be situated thousands of
miles from the primary data center and therefore well beyond the
reach of a regional disruption. If a regional failure were to occur, there
is always the possibility that one or more transactions would be lost.
This potential loss, however, is minuscule compared to the potential
data loss if both primary and secondary sites were to fail
simultaneously.
Round Robin DR
Large enterprises with multiple data centers have yet another option to
provide data protection for all locations while minimizing costs. As
illustrated in Figure 37, a round-robin DR topology circumvents the
need to build a dedicated disaster recovery center by leveraging exist-
ing data centers and WAN connectivity. Depending on the
geographical distribution of the data centers, each location can use its
downstream neighbor as a data replication site, while also acting as a
recovery site for an upstream neighbor.
Figure 37. In a round-robin DR topology, each data center acts as the
recovery site for its neighbor
There are multiple variations on this theme. Two data centers in the
same metropolitan area, for example, could act as mutual synchro-
nous replication sites to each other, while both asynchronously
replicate to a more distant partner. In addition, all data centers could
implement centralized tape vaulting as a safeguard against the failure
of two or more data centers. In this example, if data centers B and C
failed simultaneously, data center D could assume the work of C, and
only data center B's data would be inaccessible until restoration from
tape is completed.
Before the advent of WAN optimization technologies and storage pro-
tocols over IP, these types of topologies were cost prohibitive due to
the lease rates for WAN bandwidth. Today, however, more storage data
can be transported over less expensive WAN services and at much
longer distances, making three-tier and round-robin configurations far
more affordable.
SAN Routing for DR
As we discussed in Chapter 1, Inter-Fabric Routing technology provides
fault isolation when connecting two or more fabrics either locally or
over distance. Also known as SAN Routing, IFR enables devices on
different fabrics to communicate but blocks potentially disruptive Reg-
istered State Change Notification (RSCN) broadcasts and fabric-
building protocols. SAN Routing is thus an ideal complement to DR
over distance. The goal of Disaster Recovery, after all, is to provide
continuous or near-continuous access to storage data and SAN Rout-
ing contributes to this goal by minimizing potential disruptions to fabric
stability.
Figure 38. SAN Routing reinforces stability of the DR implementation
by maintaining the autonomy of each site.
As shown in Figure 38, Brocade SAN Routers provide connectivity
between the resources that have been authorized to communicate
across the WAN link. Instead of merging both fabrics into a single SAN,
SAN Routers maintain the autonomy of each fabric. A disruption in the
DR fabric, for example, would not propagate to the production fabric as
would be the case if standard ISL links were used. In the example
shown in Figures 37 and 38 above, SAN Routing is a prerequisite for
connecting multiple sites over distance. Deploying a single extended
fabric across multiple locations simply poses too much risk and under-
mines the central goal of Disaster Recovery.
Disaster Recovery for SMBs
Although large enterprises have long recognized the necessity of a
comprehensive DR plan, Small and Medium Businesses (SMBs) also
appreciate the value of protecting their data assets from natural or
man-made disruptions. Hurricane Katrina, for example, did not dis-
criminate on the basis of gross annual receipts and impacted all
businesses equally. The ability to recover and resume business opera-
tions, however, hinges on the level of preparedness and the ability to
execute against the DR plan.
SMBs depend on their IT operations as much as any large, multi-
national enterprise, albeit on a smaller scale. This smaller scale, how-
ever, works to the advantage of SMBs, because there is typically much
less data to secure and far simpler infrastructures to clone for DR
sites. Large enterprises have essentially funded the research and
development of SAN and DR technologies by being the early adopters
and largest clients for shared storage technology. Once the
technology is proven and in production, however, costs typically decline, bringing
more sophisticated storage products into the price range of SMBs. The
Brocade 7500E SAN Router, for example, incorporates WAN and proto-
col optimization features designed to meet the demanding
requirements of large enterprises but is now an affordable DR element
for the tighter budgets of many SMBs. Likewise, Brocade switches and
Brocade 8 Gbit/sec 815 and 825 Host Bus Adapters (HBAs) are eco-
nomical SAN building blocks that maintain enterprise-class
functionality and performance for both production and DR
applications.
Vendors of storage networking products offer tiered solutions that
meet high-end, mid-range, and low-end requirements. A mid-range
storage array, for example, can still provide enterprise-class RAID on
the front end but use more economical Serial ATA (SATA) or Serial
Attached SCSI (SAS) disks on the back end. The mid-tier systems also provide enter-
prise-class DR functionality, such as synchronous and asynchronous
disk-to-disk data replication, but at a lower cost than first-tier storage
arrays. In addition, vendors may provide storage appliances which sup-
port asynchronous replication between heterogeneous storage arrays,
eliminating the need to pair production and DR arrays from a single
vendor.
Chapter 4: Continuous Data Protection
The tape backup and data replication technologies discussed in the
previous chapters provide varying degrees of data protection and
recovery for standard business applications. These mechanisms
alone, however, have proven inadequate for more demanding mission-
critical applications. Synchronous data replication, for example, cap-
tures every transaction and allows resumption of operations with no
data loss. It does not, however, maintain a history of
those transactions and cannot be used to restore operations to a
known good point in time if data corruption occurs. A virus attack on
an e-mail server is simply replicated to the recovery array. Consequently,
a new class of data protection mechanisms is required for tracking
changes to data and enabling restoration from variable recovery
points.
Among its other tasks, the Data Management Forum (DMF) of the Stor-
age Networking Industry Association (SNIA) is defining a new set of
technologies for continuous data protection (CDP). The DMF defines
CDP as "a methodology that continuously captures or tracks data
modifications and stores changes independent of the primary data,
enabling recovery points from any point in the past." The phrase "any
point in the past" is figurative here, given that the CDP change history
itself takes additional storage capacity and that capacity is not infinite.
CDP solutions can be block based, file based or application based.
Compared to tape backup or data replication, CDP offers much finer
granularity and the ability to move the recovery point objective selec-
tively backward in time.
Defining the Scope of CDP
Tape backup and remote data replication provide protection against
the loss of a storage array, a system outage, or loss of the entire data
center. CDP, by contrast, is not primarily designed to recover from cata-
strophic physical events but is focused on the more subtle risks posed
by data corruption as transaction data is modified over time. CDP
therefore lies closer to the application layer, and in a large data center,
multiple CDP instances may be running against multiple applications
concurrently.
As shown in Figure 39, the recovery points for tape backup and data
replication are fixed in time. For tape, the recovery point is the last
incremental backup. For asynchronous data replication, the recovery
point is the last completed write of buffered I/Os to the secondary
array. For synchronous data replication, the recovery point is the last
transaction written to both primary and secondary arrays, even if that
transaction wrote corrupted data. The recovery times are also fixed to
the extent that restoration from tape takes a set time depending on
the volume of data to be restored (hours or days), and both asynchro-
nous and synchronous mechanisms require a cutover from primary to
secondary array access.
Figure 39. Continuous data protection provides finer granularity for
data restoration when corruption occurs.
Because true continuous data protection is driven by changes to data
instead of fixed points in time, the recovery point is variable. The fre-
quency of monitoring and logging data changes can differ from one
CDP solution to another but all CDP utilities provide a sliding recovery
point that not only facilitates recovery but ensures the integrity of the
data once the application resumes.
The data changes that CDP tracks on a primary array are stored on a
separate storage system, which is either co-located in the data center
or remote at a secondary or DR site. The amount of additional storage
required by CDP is determined by the rate of data changes and the fre-
quency of monitoring those changes. Periodic monitoring based on
snapshot technology is known as near CDP and is described as fre-
quent monitoring and change tracking but not actually continuous.
Near CDP is thus more accurately described as periodic data protec-
tion (PDP). True CDP, by contrast, continuously monitors and tracks
data changes and so is constantly updating the CDP store.
Near CDP
Near CDP solutions may use a number of different snapshot or point-
in-time copy mechanisms to capture the state of a storage volume at
any given moment. Snapshot-based near CDP triggers on a predefined
interval to create a recovery point. If, for example, a snapshot is taken
every 10 minutes, the snapshots would contain 6 recovery points per
hour. If data corruption is detected, the restore point would be 1 of the
6 recovery points or possibly more, depending on the total number of
snapshots allowed. A system allowing 40 revision points, for example,
could accommodate recovery points reaching back more than 6 hours prior to detection of
data corruption, but with a granularity of only 10-minute intervals.
Depending on the vendor implementation, some products provide for
hundreds of recovery points. Once the designated number of recovery
points has been reached, a rotation algorithm replaces the older snap-
shots with new ones, as shown in Figure 40.
Figure 40. Aged snapshots are rotated on a configurable interval to
conserve disk space on the CDP store.
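The rotation behavior in Figure 40 amounts to a fixed-length ring of snapshots: once the configured limit is reached, the oldest recovery point is discarded to make room for the newest. A minimal sketch of that bookkeeping follows; the interval and retention count are the example values used above, and the class name is invented.

```python
from collections import deque
from datetime import datetime, timedelta

# Near-CDP snapshot rotation: keep at most max_points recovery points;
# when a new snapshot arrives, the oldest one is aged out. With a
# 10-minute interval and 40 retained points, the recovery window is
# roughly 40 x 10 minutes, or a bit under 7 hours.

class SnapshotRotation:
    def __init__(self, max_points=40):
        self.points = deque(maxlen=max_points)   # oldest drops off automatically

    def take_snapshot(self, timestamp):
        self.points.append(timestamp)

    def oldest_recovery_point(self):
        return self.points[0]

rotation = SnapshotRotation(max_points=40)
t0 = datetime(2008, 6, 1, 0, 0)
for i in range(60):                              # 10 hours of 10-minute snapshots
    rotation.take_snapshot(t0 + timedelta(minutes=10 * i))
print(rotation.oldest_recovery_point())          # 2008-06-01 03:20 -> ~6.5 h back
```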
True CDP
True CDP (or simply, CDP) takes granularity to a finer level by monitor-
ing and tracking every data change as it occurs. This eliminates the
possibility of losing transactions during a snapshot interval but it
requires a more sophisticated mechanism for accurately managing
change metadata. CDP can operate at the file or block level, and in
both cases triggers on the write (that is, change) of data to primary
storage. Copy-on-write, for example, copies an original data location to
the CDP store just prior to the new write execution. If the write to the
primary array contains corrupted data, there is still a copy of the origi-
nal data on the CDP volume to restore from.
Figure 41. The CDP engine manages metadata on the location and
time stamp of data copies on the CDP store.
To accurately track data changes, a CDP engine must maintain meta-
data on the location of copies and the time stamps used to
differentiate one revision from another, as shown in Figure 41. A
graphical interface is typically provided to simplify identification of
recovery points via a slider or dial to roll back to a designated point in
time. Block-based CDP is data-type agnostic and so can operate
against structured, semi-structured, or unstructured data.
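Conceptually, a block-level CDP engine keeps a journal of (timestamp, block address, prior contents) entries created by copy-on-write, plus enough metadata to roll the volume back to any recorded point. The sketch below is a deliberately simplified illustration of that idea; the names and structures are invented, and no consistency-group or application coordination is modeled.

```python
# Simplified block-level CDP: before each new write, the current contents
# of the block are copied to a journal with a timestamp (copy-on-write).
# Rolling back replays the journal in reverse down to the chosen time.
# Invented, minimal illustration -- not any vendor's CDP engine.

class CDPVolume:
    def __init__(self):
        self.blocks = {}                      # primary volume: lba -> data
        self.journal = []                     # (time, lba, previous data)

    def write(self, t, lba, data):
        self.journal.append((t, lba, self.blocks.get(lba)))  # save prior copy
        self.blocks[lba] = data

    def roll_back_to(self, t):
        while self.journal and self.journal[-1][0] > t:
            _, lba, old = self.journal.pop()
            if old is None:
                self.blocks.pop(lba, None)    # block did not exist at time t
            else:
                self.blocks[lba] = old

vol = CDPVolume()
vol.write(1, lba=7, data="good record")
vol.write(2, lba=7, data="corrupted record")   # e.g. virus or operator error
vol.roll_back_to(1)
print(vol.blocks[7])                           # 'good record'
```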
At the application layer, however, it may be necessary to coordinate
CDP metadata with the application to maintain data consistency. An
Oracle or SQL Server transaction, for example, may issue multiple
writes to update a record. Restoring to a known good transaction state
requires coherence between what the application expects and the req-
uisite copies that CDP metadata can recover. Application-based CDP is
thus tightly integrated with the application's specific file or record
requirements via application programming interfaces (APIs) or as a
component of the application itself.
Integrating CDP with Tape Backup and Disaster
Recovery
Although there has been marketing-inspired confusion over near and
true CDP, the technology has proven value for addressing issues that
simple tape backup and data replication alone cannot resolve. Appli-
cation or operator errors that result in data corruption, accidental
deletion of files, or virus attacks on e-mail systems can bypass conven-
tional data protection solutions. On the other hand, CDP alone is
insufficient to protect against system outages or disasters. Some ven-
dors are therefore combining CDP with traditional tape backup and DR
to provide more comprehensive coverage for data assets.
Snapshot recovery points, for example, can be used as static volume
images for tape backup, leaving the production storage array free to
service ongoing transactions while the backup occurs. In addition, the
CDP store and metadata manager can be located at a remote DR site
to protect against both data corruption and outage at the primary data
center.
5
Information Lifecycle
Management
The introduction of information lifecycle management (ILM) technolo-
gies over the past few years has marked the maturity of the networked
storage infrastructure and its ascent toward the application layer. Prior
to ILM, data was treated as having constant value that required uni-
form treatment until it was finally retired to a tape archive. Typically
that uniform treatment consisted of high-availability transport, robust
failover mechanisms, and high-end storage. As the volume of data
increased over time, larger fabrics and additional storage arrays were
required, often straining the capacity and the budget of the data
center.
The tendency to accommodate the growth of data via constant expan-
sion of the storage infrastructure, however, is not sustainable as long
as all data is weighted equally. There is simply not enough floor space,
cooling plant, and power to contain growing data volumes and not
enough budget to provide first-class handling for all application data.
Fortunately, the reality is that not all application data is equal in value,
and even a single data set may have varying value through its lifetime.
Information lifecycle management translates this reality into a strategy
for tracking the business value of data and migrating data from one
class of storage to another, depending on the value of data at a given
point in time. Each class of storage represents a specific cost and service point in terms of performance, availability, cost per gigabyte, and associated power consumption. Data with high value gets first-class treatment, but as that value declines over time it is more efficient to move the data to a more economical storage container.
An order entry, for example, has high value as long as it is tied to pend-
ing revenue. Once the order is assembled, shipped, and most
importantly, billed, the transaction declines in value and may have only
historical significance (for example, for data mining). However, if sev-
eral months later the customer places an identical order, the original
transaction may regain value as a reference for the detail of the initial
order, customer information, and so on. As shown in Figure 42, ILM
can migrate data from high-end to mid-tier and from mid-tier to an
even lower tier or tape, while still being able to promote the data to a
higher class when needed.
One of the major challenges of ILM is to determine the current value of
a given data set. Using time stamps in file metadata is one approach.
If data is rarely accessed, it is legitimate to assume it has less immedi-
ate value. Another method is to manipulate file metadata or create
separate metadata on block data to assign a priority or value rating
that can be monitored and changed over time. A value-tracking mech-
anism is key, though, for automating the ILM process and avoiding
operator intervention to manually migrate data.
Figure 42. Aligning cost of storage to business value of data
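As a simple illustration of time-stamp-based valuation, the hypothetical Python sketch below classifies a file into a storage class by the age of its last access. The thresholds and tier names are assumptions chosen for the example, not a recommended policy.

import os
import time

# Hypothetical policy: days since last access -> storage tier
TIER_POLICY = [
    (30,   "tier1-fc-raid"),     # touched in the last 30 days: keep on first-tier storage
    (180,  "tier2-sata"),        # 30 to 180 days: mid-tier array
    (1095, "tier3-maid"),        # 180 days to 3 years: low-power, low-cost disk
]
ARCHIVE_TIER = "tape-archive"    # older than all thresholds: retire to tape

def propose_tier(path, now=None):
    """Suggest a storage class for a file based on its last-access timestamp."""
    now = now or time.time()
    age_days = (now - os.stat(path).st_atime) / 86400
    for threshold, tier in TIER_POLICY:
        if age_days <= threshold:
            return tier
    return ARCHIVE_TIER

A real ILM engine would combine such access-age rules with explicit value ratings in metadata so that data can also be promoted back to a higher class when it regains value.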
Although it would appear to be much simpler to deploy a single class of
storage for all data, that is not feasible for large data centers with
space, power, cooling, and budget constraints. In addition, large data
centers may already have different classes of storage installed to ser-
vice less-mission-critical applications. By reserving space on second-
or third-tier storage for ILM-migrated data, storage managers can free
space on their first-tier arrays and maximize utilization of their lower-
tier systems.
Tiered SAN Architectures
Tiered SAN architectures are predicated on two basic concepts, classes
of storage and classes of storage transport, which reflect different
cost, availability, and performance points. To maximize the
value of a storage infrastructure, both storage and the storage inter-
connect (or fabric) should be aligned. A JBOD, for example, is far more
economical than a high-end RAID array but typically lacks the high
availability, recoverability, and performance of top-tier systems. Conse-
quently, fabric connectivity to a JBOD may not merit the higher speed,
alternate pathing, and 99.999 percent availability provided by top-tier
platforms, such as the Brocade DCX Backbone or Brocade 48000
Director. In a core/edge SAN design, the JBOD is more appropriately
positioned toward the edge on a more economical SAN switch. Aligning
the class of storage to the appropriate class of transport maximizes
the cost effectiveness of each tier without squandering capacity or
bandwidth.
Classes of Storage Containers
Storage systems are characterized by the front-end services they pro-
vide and back-end disk capacity and I/O performance. First-tier arrays,
for example, offer multiple storage ports for SAN connectivity, config-
urable RAID levels, alternate pathing, large cache memory, and
possibly virtualization services on the front end and provide high-per-
formance and high-capacity disks (typically Fibre Channel) on the back
end. Second-tier systems may provide fewer SAN ports, fixed RAID lev-
els, less caching, and alternate pathing on the front end and use less
expensive SATA or SAS disks on the back end. A third-tier storage sys-
tem may provide no caching or RAID controller logic and lower-
performance back-end disks. In addition, some systems are deliber-
ately designed for lower-performance applications, such as MAID
(massive array of idle disks) systems that expect infrequent I/O.
Storage can be classified in a hierarchy that spans a range of systems,
from high-performance, high-availability arrays to much lower-performance
tape and optical storage:
Class 1. High-availability, high-performance RAID systems
Class 2. Moderate-performance RAID systems
Class 3. Fibre Channel JBODs
Class 4. Custom disk-to-disk-to-tape systems
Class 5. High-performance tape libraries
Class 6. Moderate-performance tape subsystems and devices
Class 7. Optical jukeboxes
Each tier or class of storage performs the basic function of storing
data, but with distinctly different levels of performance, availability,
reliability, and (most importantly) cost. When ILM migrates data from
one class of spinning media to another, the underlying assumption is
that the data still has sufficient value that it needs to be accessible or
referenced on demand. Otherwise, the data eventually retires to the
lower storage classes: tape or optical media. Data can be retrieved
from tape, but because tape is a linear storage media, data retrieval is
a much longer process.
Classes of Storage Transport
Corresponding to different classes of storage, the SAN transport can
be configured with different classes of bandwidth, security, and avail-
ability characteristics. As shown in Figure 43, the scalability of Fibre
Channel from 1 Gbit/sec to 10 Gbit/sec and iSCSI from 1 Gbit/sec to
subgigabit speeds enables the transport to align to different classes of
storage and applications and thus optimize fabric resources.
Figure 43. Aligning classes of storage transport to classes of storage
and applications
In this example, 8 and 10 Gbit/sec ISLs and future storage connec-
tions represent the top tier of the storage transport. For the Brocade
DCX Backbone and Brocade 48000 Director, 8 and 10 Gbit/sec ISLs
can be deployed in the data center to create a high-performance SAN
backbone as well as extended to metropolitan distances. The 4 and 8
Gbit/sec ports represent the next tier, with connectivity to high-end
and/or mid-tier storage and high-performance servers. The 2 and 4
Gbit/sec ports can support second-tier storage and servers and 1
Gbit/sec Fibre Channel to drive legacy FC servers.
The addition of iSCSI to the configuration provides more tiers of con-
nectivity. When connected via Brocade iSCSI-to-FC ports, iSCSI can
drive lower-tier iSCSI servers at 1 Gbit/sec Ethernet as well as subgiga-
bit remote iSCSI servers across a campus or WAN link.
In addition to proportional bandwidth allocation, the storage infra-
structure can be configured to provide higher or lower levels of
availability through dual- or single-path connectivity. When data has
higher value, accessibility is reinforced by alternate pathing and
failover through the SAN. When its value declines and the data is less
frequently accessed, single-path connectivity may be sufficient. Like-
wise, fabric security features can be judiciously allocated to more
critical storage assets depending on the level of security they merit.
Aligning Data Value and Data Protection
Ideally, the value of data should determine the level of data protection
that is provided for it. This is difficult to achieve in single-tier systems
because there is no means to differentiate high-value data from low-
value data. In a tiered storage architecture, however, the class of stor-
age itself defines the level of data protection. Top-tier storage may
require synchronous replication, snapshots, continuous data protec-
tion, or disk-to-disk-to-tape backup. For second- or third-tier storage,
tape backup alone is probably sufficient.
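One way to express this alignment is a simple policy table that maps each storage class to the protection services it receives. The mapping below is a hypothetical Python sketch; the tier names and service choices are assumptions for illustration only.

# Hypothetical mapping of storage class to the data protection services applied.
PROTECTION_BY_TIER = {
    "tier1": {"replication": "synchronous", "snapshots": True,
              "cdp": True, "backup": "disk-to-disk-to-tape"},
    "tier2": {"replication": "asynchronous", "snapshots": True,
              "cdp": False, "backup": "nightly tape"},
    "tier3": {"replication": None, "snapshots": False,
              "cdp": False, "backup": "weekly tape"},
}

def protection_policy(tier):
    """Return the protection services that a given storage class receives."""
    return PROTECTION_BY_TIER.get(tier, PROTECTION_BY_TIER["tier3"])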
ILM surfaces another data protection issue, though. As data is aged
and archived onto tape, the retention period may no longer be the con-
ventional 10 to 15 years that was previously assumed. In addition to
business data that may be subject to regulatory compliance and long-
term retention requirements, the fact that today virtually all knowledge
is in digital format is raising concerns about much longer data protec-
tion and retention. In surveys conducted by the Storage Networking
Industry Association's Data Management Forum, for example, 80 percent
of respondents reported information retention requirements of greater
than 50 years, and 68 percent reported data retention requirements
in excess of 100 years. This poses significant chal-
lenges not only for durable long-term physical media but for logical
formatting of data that can be read by applications of the future. The
failure to migrate archived data to more current formats and media
periodically could make today's enormous repository of information
inaccessible to future generations. John Webster, founder of the Data
Mobility Group, has called this a potential digital dark age.
With IT administrators currently struggling to provide data protection
for the diversity of data under their charge, the idea of safeguarding
data and making it accessible in the future is somewhat overwhelm-
ing. The hierarchy of data value that drives ILM should help in
prioritizing the types of data that are the most likely candidates for
very-long-term retention.
Leveraging Storage Virtualization
Although storage virtualization is not an absolute prerequisite for ILM,
virtualizing storage can facilitate creation of classes of storage that
optimize capacity utilization and use of heterogeneous storage sys-
tems. Storage virtualization is an abstraction layer that sits between
the consumers of storage (that is, servers) and the physical storage
arrays. Instead of binding to a LUN on a particular storage array, stor-
age virtualization enables a server to bind to a LUN created from a
storage pool. The pool of storage capacity is actually drawn from multi-
ple physical storage systems but appears as a single logical storage
resource. As was discussed in Chapter 1, even RAID is a form of stor-
age virtualization. RAID presents the appearance of a single logical
resource that hides the complexity of the multiple disk drives that com-
pose a RAID set. At a higher level, storage virtualization hides the
complexity of multiple RAID systems.
Figure 44. Conventional LUN allocation between servers and storage
As illustrated in Figure 44, in traditional configurations storage capac-
ity in individual arrays is carved into LUNs, which in turn are bound to
individual servers. During the normal course of operations, some LUNs
may become over-utilized (LUN 55 in this example), while others are
under-utilized (LUN 22). In conventional LUN allocation, however, it is
not possible to simply transfer excess capacity from one array to
another. In this example, Array C would need additional banks of disk
drives to increase overall capacity or a new array would have to be
added and data migrated from one array to another.
Storage virtualization enables optimum use of storage capacity across
multiple arrays by combining all capacity into a common storage pool.
As shown in Figure 45, each storage system contributes its capacity to
the pool and each server is bound to virtual LUNs created from the
pool. There are a number of benefits from basic storage pooling as
well as risks that must be considered for data protection. By pooling
storage capacity, it is now possible to fully utilize the capacity of each
storage system and avoid the under- and over-utilization shown in
Figure 44. In addition, LUNs can be dynamically sized without concern for
the capacity limitations of any individual storage array. Because stor-
age virtualization inserts an abstraction layer between servers and
physical storage, it also frees individual servers from the vendor-spe-
cific attributes of individual arrays. Shared storage thus assumes a
more generic character and can accommodate heterogeneous arrays
in a single pool.
Figure 45. Logically binding servers to virtual LUNs drawn from the
storage pool
On the other hand, there is no longer a direct correlation between a
server's assigned LUNs and the underlying storage arrays. In fact, the
total capacity of a virtualized LUN could be drawn from multiple arrays.
Data protection mechanisms, such as disk-to-disk data replication,
might therefore be inoperable. A series of writes to a virtualized LUN
might span multiple physical arrays, and the replication software at
the array level would have no means to recognize that local writes are
only part of a virtualized transaction. To understand the implications of
storage virtualization for data protection, it is necessary to examine
the internal mechanics of the technology.
Storage Virtualization Mechanics
All permutations of storage virtualization technology operate on a com-
mon algorithm that maps virtual storage locations to physical ones.
The virtualization software or engine creates two virtual entities that
intervene between real servers and real storage. From the storage per-
spective, the virtualization engine creates a virtual initiator that poses
as a server to the storage controller. From the server perspective, the
virtualization engine creates a virtual target that poses as a storage
controller to the real initiator or server. The virtualization engine must
track every transaction from real initiators to virtual targets and then
translate them into downstream transactions from virtual initiators to
real targets.
Figure 46. The virtualization engine maintains a metadata mapping to
track virtual and physical data locations
As shown in Figure 46, the virtualization engine maintains metadata
mapping that associates the logical block address (LBA) range of a vir-
tual LUN to actual logical block address ranges from the contributing
storage arrays. A virtual LUN of 200 GB, for example, would have 400
million contiguous logical blocks of 512 bytes each. Those blocks
could be drawn from a single physical storage target, or be spread over
multiple storage targets:
Virtual Volume (200 GB VLUN 0)                       Physical Storage Targets
Start LBA 0 - End LBA 119,999,999                =   FCID 000400 LUN 0 (60 GB), Start LBA 0 - End LBA 119,999,999
Start LBA 120,000,000 - End LBA 199,999,999      =   FCID 001100 LUN 3 (40 GB), Start LBA 600 - End LBA 80,000,599
Start LBA 200,000,000 - End LBA 399,999,999      =   FCID 00600 LUN 1 (100 GB), Start LBA 100,000,000 - End LBA 299,999,999
In this example, the 200 GB virtual LUN is composed of 60 GB from
one array, 40 GB from another, and 100 GB from a third array.
Although the LBA range of the virtual LUN appears to be contiguous, it
is actually eclectically assembled from multiple, non-contiguous
sources. A write of 10 GB of data to the virtual LUN beginning at LBA
115,000,000 would begin on one array and finish on another.
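The translation step can be sketched as a lookup against an extent map. The Python fragment below mirrors the 200 GB example above (the FCID and LBA values are taken from that illustration) and splits a virtual I/O into per-array physical I/Os; it is a simplified model, not an actual virtualization engine.

# Each extent maps a contiguous virtual LBA range to a physical target and offset.
# The entries mirror the 200 GB VLUN example above (values are illustrative).
EXTENT_MAP = [
    # (virtual start LBA, virtual end LBA, FCID,    LUN, physical start LBA)
    (0,           119_999_999, "000400", 0, 0),
    (120_000_000, 199_999_999, "001100", 3, 600),
    (200_000_000, 399_999_999, "00600",  1, 100_000_000),
]

def translate(virtual_lba, length):
    """Split a virtual I/O into per-array physical I/Os using the metadata map."""
    ios = []
    remaining = length
    lba = virtual_lba
    while remaining > 0:
        for v_start, v_end, fcid, lun, p_start in EXTENT_MAP:
            if v_start <= lba <= v_end:
                chunk = min(remaining, v_end - lba + 1)
                ios.append((fcid, lun, p_start + (lba - v_start), chunk))
                lba += chunk
                remaining -= chunk
                break
        else:
            raise ValueError("virtual LBA outside the mapped range")
    return ios

# A 10 GB write (roughly 20 million 512-byte blocks) starting at LBA 115,000,000
# is split across the first two physical arrays, as described in the text.
print(translate(115_000_000, 20_000_000))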
In terms of data protection, storage virtualization introduces two new
issues:
First, the metadata map itself must be protected, since without
the map there is no way to know where the data actually resides.
Vendors of storage virtualization solutions safeguard metadata by
maintaining redundant copies and synchronizing updates
between them.
Second, data protection mechanisms such as snapshots, CDP, or
replication must operate against virtual initiators and virtual tar-
gets and not their real and physical counterparts. If a virtual LUN
spans multiple arrays, conventional disk-based data replication
will capture only a portion of the total transactions between the
virtual initiator and the physical target. Therefore, virtualization
vendors typically package snapshot or replication utilities in their
solutions in addition to basic storage pooling.
Although storage virtualization adds a layer of underlying complexity to
storage configurations, it simplifies upper-layer management and
resource allocation. Like any abstraction layer, storage virtualization
masks complexity from an administrative standpoint but does not
make that complexity go away. Instead, the virtualization entity
assumes responsibility for maintaining the illusion of simplicity and
providing safeguards for incidents or failures on the back end. As with
graphical user interfaces that mask the complexity of underlying oper-
ating systems, files systems, and I/O, the key to success is resiliency
and transparent operation. In storage environments in particular, blue
screens are impermissible.
Convergence of Server and Storage Virtualization
ILM and storage virtualization have evolved in parallel with the devel-
opment of blade server platforms and server virtualization software.
The common goal of these technologies is to maximize productive utili-
zation of IT assets, while simplifying administration and reducing
ongoing operational costs. The combination of server virtualization
and blade servers in particular delivers more processing power and
simplified administration on a smaller footprint. On the storage side,
ILM and storage virtualization likewise facilitate greater efficiencies in
data storage, capacity utilization, and streamlined management.
Collectively, these trends are leading to a utility environment for both
data processing and data storage that will enable much higher levels
of automation of data processes on more highly optimized infrastruc-
tures. Brocade is an active contributor to utility computing and storage
and has already provided enabling elements for virtualized blade
server environments, such as the Brocade Access Gateway with NPIV
support, as discussed in Chapter 1, and fabric-based advanced stor-
age services for data migration, tiered storage infrastructures, and
storage virtualization. Future Brocade products will provide other
advanced storage services to enable customers to fully leverage their
SAN investment.
Fabric-Based Storage Services
ILM, data migration, and storage virtualization are being delivered on
a variety of platforms including dedicated servers, appliances, and
array-based intelligence. Because the fabric sits at the heart of stor-
age relationships, however, directors and switches that compose the
fabric are in a prime position to deliver advanced services efficiently
without extraneous elements. Fabric-based storage services are also
largely agnostic to the proprietary features of vendor-specific hosts
and storage targets. The combination of centrality and support for het-
erogeneous environments makes the fabric the preferred delivery
mechanism for advanced storage services, either independently or in
concert with other solutions.
The Brocade DCX Backbone, for example, uses the Brocade FA4-18
Fabric Application Blade to support a variety of fabric-based storage
services, including storage virtualization, volume management, repli-
cation, and data migration. Because the Brocade DCX provides the
core connectivity for the SAN, the intelligent services of the Brocade
FA4-18 can be applied throughout the fabric. In addition, the 99.999
percent availability and low power consumption engineered into the
Brocade DCX extends to the blade and provides resiliency and energy
efficiency for the advanced services it supports.
As with all other Brocade products, the Brocade FA4-18 is designed for
standards compliance. For fabric-based virtualization services, the
ANSI T11 Fabric Application Interface Standard (FAIS) defines a split-
path architecture that separates command data from storage data
and enables the fabric to maximize throughput for storage virtualiza-
tion applications. In the execution of FAIS, the Brocade FA4-18 delivers
enhanced performance of 1 million virtual I/Os per second (IOPS) and
an aggregate 64 Gbit/sec throughput. The functionality and perfor-
mance of the Brocade FA4-18 is also available in a standalone
product, the Brocade 7600 Fabric Application Platform.
Fabric Application Interface Standard (FAIS)
FAIS is an open systems project of the ANSI/INCITS T11.5 task group
and defines a set of common APIs to be implemented within fabrics.
The APIs make it easier to integrate storage applications that were
originally developed as host-, array-, or appliance-based utilities so that
they can be supported within fabric switches and directors.
The FAIS initiative separates control information from the data path. In
practice, this division of labor is implemented as two different types of
processors, as shown in Figure 47. The control path processor (CPP)
supports some form of operating system, the FAIS application inter-
face, and the storage virtualization application. The CPP is therefore a
high-performance CPU with auxiliary memory, centralized within the
switch architecture. It supports multiple instances of SCSI initiator and
SCSI target modes, and via the supported storage virtualization appli-
cation, presents the virtualized view of storage to the servers.
Allocation of virtualized storage to individual servers and management
of the storage metadata is the responsibility of the storage application
running on the CPP.
Figure 47. FAIS block diagram with split data path controllers and con-
trol path processor
The data path controller (DPC) may be implemented at the port level in
the form of an ASIC or dedicated CPU. The DPC is optimized for low
latency and high bandwidth to execute basic SCSI read/write transac-
tions under the management of one or more control path processors
(CPPs). Metadata mapping for storage pooling, for example, can be
executed by a DPC, but the DPC relies on control information from the
CPP to define the map itself. The Brocade FA4-18 and Brocade 7600,
for example, receive metadata mapping information from an external
CPP processor and then execute the translation of every I/O based on
the map contents.
Although the block diagram in Figure 47 shows the CPP co-located
with the data fastpath logic, the CPP can reside anywhere in the stor-
age network. A server or appliance, for example, can provide the CPP
function and communicate across the SAN to the enclosure or blade
housing the DPC function. Because the APIs that provide control infor-
mation and metadata are standardized, the DPC function of the
Brocade FA4-18 and Brocade 7600 can work in concert with a variety
of storage virtualization applications.
To safeguard the metadata mapping, redundant CPP servers can be
deployed. The FAIS standard allows for the DPC engine to be managed
by multiple CPPs, and the CPPs in turn can synchronize metadata
information to maintain consistency.
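The division of labor between the control and data paths can be sketched as follows. In this hypothetical Python model, the ControlPathProcessor owns the extent map, pushes it to one or more DataPathController instances, and mirrors it to a peer CPP; the DPC performs only the per-I/O lookup. This illustrates the split-path concept and is not the FAIS API itself.

class DataPathController:
    """DPC sketch: executes per-I/O translation against a map pushed by a CPP."""

    def __init__(self):
        self.extent_map = []              # installed by the control path

    def load_map(self, extent_map):
        self.extent_map = list(extent_map)

    def handle_write(self, virtual_lba, blocks):
        # Fast path: pure lookup, no policy decisions are made here.
        for v_start, v_end, target, p_start in self.extent_map:
            if v_start <= virtual_lba <= v_end:
                return (target, p_start + (virtual_lba - v_start), blocks)
        raise LookupError("no extent for LBA %d" % virtual_lba)


class ControlPathProcessor:
    """CPP sketch: owns the metadata, distributes it to DPCs and a peer CPP."""

    def __init__(self, dpcs, peer=None):
        self.extent_map = []
        self.dpcs = dpcs
        self.peer = peer                  # redundant CPP holding a synced copy

    def update_map(self, extent_map):
        self.extent_map = list(extent_map)
        for dpc in self.dpcs:             # push the new map to every fast path
            dpc.load_map(self.extent_map)
        if self.peer is not None:         # keep the standby CPP consistent
            self.peer.extent_map = list(self.extent_map)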
Brocade Data Migration Manager (DMM)
In converting from single-tier storage infrastructures to multi-tier, ILM-
friendly configurations, it is often difficult to migrate data from one
class of storage to another due to vendor proprietary features. Bro-
cade has proactively addressed this problem with the Brocade Data
Migration Manager (DMM) solution, which runs on the Brocade FA4-18
Fabric Application Blade or the Brocade 7600 Fabric Application
Platform.
Optimized for heterogeneous storage environments, Brocade DMM
supports both online and offline data migrations to minimize disrup-
tion to upper-layer applications. With throughput of terabytes per hour,
this solution enables rapid migration of data assets to accelerate
implementation of ILM for ongoing operations.
6
Infrastructure Lifecycle
Management
One of the often overlooked components of data protection is the
requirement to safeguard storage data once the storage system itself
has been retired. It is commonly assumed that once a storage system
has reached the end of its useful life, data will be migrated to a new
array and the old array erased. Simply reformatting the old system,
however, does not guarantee that the data is irretrievable. If the data is
particularly sensitive or valuable (for example, financial or personnel
records), the retired system can become a candidate for new technolo-
gies such as magnetic force scanning tunneling microscopy (STM) that
can retrieve the original data even if it has been overwritten.
Major vendors of content management solutions typically offer utilities
and secure deletion services for information lifecycle management to
migrate data from one asset to another. Aside from these specialized
services, though, forethought is required to establish best practices for
dealing with corporate data during an infrastructure technology
refresh.
Leased versus Purchased Storage
With purchased storage there is more flexibility in dealing with storage
systems that are being replaced or upgraded. The systems can be
repurposed into other departments, other facilities, or integrated as
secondary storage into a tiered storage architecture. With leased sys-
tems, however, at end of lease the equipment is expected to be
returned to the leasing agency or vendor. Consequently, data on those
systems should be migrated to new storage and then thoroughly
deleted on the retired system before it is returned.
External regulatory compliance or internal storage best practices may
dictate more extreme data deletion methods, including magnetic
degaussing, grinding or sanding of the disk media, acid treatment, or
high temperature incineration of disk drives. Some government and
military storage practices, in particular, require the complete destruc-
tion of disk drives that have failed or outlived their useful lives. Clearly,
physical destruction of storage media implies that the storage asset
cannot be repurposed or returned, and that aside from the frame and
controller logic the unit is thoroughly depreciated.
The Data Deletion Dilemma
Migrating data from one storage system to another can readily be
accomplished with advanced software, such as Brocade Data Migra-
tion Manager, and service offerings. This ensures non-disruptive
transfer of data from an old system to a new one with no loss of perfor-
mance for upper-layer applications. Once the migration is complete,
however, deleting data on the retired system requires more than a sim-
ple reformat of the disk set for a number of reasons.
Bad Tracks
During the normal course of disk drive operation, data blocks are writ-
ten to specific logical block addresses, which the disk drive logic, in
turn, translates into physical cylinder, head, and sector locations, as
illustrated in Figure 48.
Figure 48. Cylinder, head, and sector geometry of disk media
If a track (cylinder) begins to fail or become marginal in read/write
response, the drive logic may attempt to copy the data to another loca-
tion and mark the track as bad. Bad track marking makes the
particular track unusable, but does not delete the data that was previ-
ously written there. In addition, when reformatting a disk drive, the
drive logic simply skips over the flagged bad tracks. Consequently,
even if the usual capacity of the disk is overwritten through reformat-
ting, the bad tracks may continue to hold sensitive data. It does not
take that many bytes to encode a Social Security number, a bank
account number, or a personal identification number (PIN), and tech-
niques do exist to reconstruct data from virtually any disk media.
Data Remanence
The writing of data bits on individual tracks is never so precise that
overwriting the data with new bit patterns will completely obliterate the
original data. The term data remanence refers to the detectable
presence of original data once it has been erased or overwritten. With
the right diagnostic equipment it may be possible to reconstruct the
original data, and in fact third-party companies specialize in this type
of data retrieval, typically for disk data that has been inadvertently
erased.
Figure 49. Traces of original data remain even if the specific sector
has been erased or overwritten
As symbolically illustrated in Figure 49, variations in magnetic flux or
slight changes in media sensitivity or magnetic field strength can leave
traces of the original data even when a disk sector has been erased or
overwritten with new data. This data remanence (the magnetic induc-
tion remaining in a magnetized substance no longer under external
magnetic influence) is detectable with magnetic force microscopy
(MFM) and more recently developed magnetic force STM. This technol-
ogy is relatively affordable, and given the availability of used or
discarded disk drives creates an opportunity for reconstruction of
potentially sensitive information.
Software-based Data Sanitation
Aside from physical destruction of the disk media, data remanence
can be addressed by implementing an erasure algorithm that makes
multiple passes over every disk track.
The Department of Defense, for example, requires a three-pass
sequence to ensure that tracks are completely overwritten:
A first pass write of a fixed value (for example, 0x00)
A second pass write of another fixed value (for example, 0xff)
A third pass write of a randomly selected value
This technique is also known as shredding and is analogous to paper
shredding of physical documents. In some sanitation algorithms, a
dozen or more passes may be implemented.
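As a simplified illustration of the three-pass sequence, the hypothetical Python function below overwrites an ordinary file with 0x00, 0xFF, and then random data. It works at the file level only: it cannot reach remapped bad tracks or perform off-track overwrites, so it is not a substitute for drive-level Secure Erase or certified sanitation tools.

import os

def three_pass_overwrite(path, chunk=1024 * 1024):
    """Overwrite a file with 0x00, then 0xFF, then pseudo-random data."""
    size = os.path.getsize(path)
    passes = [b"\x00" * chunk, b"\xff" * chunk, None]   # None means a random pass
    with open(path, "r+b") as f:
        for pattern in passes:
            f.seek(0)
            remaining = size
            while remaining > 0:
                block = os.urandom(min(chunk, remaining)) if pattern is None \
                        else pattern[:min(chunk, remaining)]
                f.write(block)
                remaining -= len(block)
            f.flush()
            os.fsync(f.fileno())          # push each pass out of the OS cache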
A final read pass can verify the overwrites. It is also possible to
completely eliminate data remanence by overwriting tracks with a
low-frequency magnetic field: the lower frequency generates a broader
magnetic field that spills out on both sides of the track and consequently
obliterates the original data traces detectable by STM technology.
Hardware-based Data Sanitation
Because Advanced Technology Attachment (ATA, typically IDE or EIDE
disk drives) disks are commonly used in portable, and therefore theft-
prone, laptops and PCs, the ATA standard includes a disk-based mech-
anism for Secure Erase. As with software data sanitation, Secure
Erase may execute multiple passes of overwrites. Because the opera-
tion is driven at a low level by the disk logic, however, it is possible to
also overwrite bad track areas and perform calculated offtrack over-
writing. In addition, because the process is disk based, it is possible to
bypass the upper-layer operating system and execute the erasure via
BIOS configuration.
Currently, an equivalent low-level secure erase procedure is unavail-
able for Fibre Channel drives, and so software-based data sanitation is
required to thoroughly cleanse disk media. Unlike ATA disks, Fibre
Channel drives for data center applications are typically deployed in
more complex RAID configurations. Data does not reside on a single
disk, but is striped across multiple disks in a RAID set. On the surface,
this might seem to inherently reduce the security vulnerability, since
reconstructing data via STM would require data retrieval of small por-
tions of remanence scattered across multiple disk drives. A single
sector of a drive in a RAID set, however, could still yield sensitive or
proprietary records, Social Security numbers, or names and
addresses.
Physical Destruction of Storage Assets
Although physical destruction of disks has been common practice for
government, military, and security sectors, there are obvious environ-
mental implications. There is not only the issue of which landfill the
discarded disk drives go into or the emissions and energy consump-
tion from incineration, but the fact that retired storage assets may still
have productive application for other departments or organizations.
Slower drives may be replaced by faster units with more capacity, but
even slow drives can be repurposed for mid-tier applications.
Although degaussing disk media with a powerful magnetic field erases
sensitive data, it also erases the sync bytes and other low-level infor-
mation required for reformatting. If the drive is then unusable, it is
simply another candidate for landfill. As with acid treatment, sanding
or grinding of disk media, and passing disk drives through a physical
shredder, the goal of data security and protection may be accomplished,
but at the cost of squandering limited resources and increasing
environmental impact. Data sanitation that destroys the digital
information but maintains the viability of the physical storage unit is
therefore the preferred solution for storage asset lifecycle
management.
7
Extending Data Protection
to Remote Offices
One of the major gaps in corporate data protection is the vulnerability
of data assets that are geographically dispersed over remote offices
and facilities. While server consolidation and SAN technology have
helped customers streamline processes and reduce costs in the data
center, the bulk of data assets of most large companies are outside
the data center, dispersed in remote offices and regional sites. Accord-
ing to some industry analysts, up to 75 percent of corporate data
resides in remote locations. The majority of that remote data is stored
on remote storage arrays for servers hosting local productivity applica-
tions and e-mail.
Recent regulatory requirements highlight the cost and difficulty of
securing, protecting, and retrieving this data. Further, these remote
offices often lack personnel with the technical skill sets and rigorous
processes pioneered in data center environments to provide adequate
data protection. Consequently, even companies that have made signif-
icant investments in central data centers have been unable to
guarantee data accessibility and preservation of all corporate data
assets. With so much business information in a vulnerable state, com-
panies may be unable to meet regulatory compliance for customer
data or provide business continuity in the event of social or natural
disruptions.
The Proliferation of Distributed Data
In the early evolution of IT processing, all information access was cen-
tralized in data center mainframes. Remote offices lacked the
resources to independently generate and modify their own data. Dumb
terminals connected remote locations to the data center over low-
speed telecommunication links, all remote business transactions were
executed centrally, and data-center-based backup processes ensured
data protection and availability. The hegemony of the data center,
though, was broken first by the introduction of minicomputers for
departments and next by microprocessors, PC-based business appli-
cations, local area networks, and client/server applications, such as e-
mail and file serving. These new tools enabled remote sites to run their
own applications, generate and analyze their own data, and be more
responsive to local client needs. If the mainframe or telecommunica-
tions links were down, business could still be transacted locally. This
allowed business units to leverage their own IT resources to be more
flexible and competitive.
The decentralization of application processing power, however, also
marks a steady increase in IT spending. Each remote site requires its
own file and application servers, program licenses, intelligent worksta-
tions, and LAN infrastructure. It also requires local data storage
resources to house the volumes of locally generated business informa-
tion, as illustrated in Figure 50. For companies with only a few remote
locations, this shift from centralized to decentralized IT assets may be
manageable. For companies with hundreds or thousands of remote
offices, though, decentralization has resulted in significantly increased
costs and a loss of control and management of vital corporate infor-
mation. This has been exacerbated by the explosion in storage
capacity required to hold the increase in files, e-mail, and other
unstructured data.
Figure 50. Remote office processing compounds the growth of remote
servers and storage and data vulnerability
Remote offices are now accustomed to the many benefits that local
processing and data storage provide. Applications can be tailored to
local business requirements. Using local servers and storage, transac-
tion response times are at LAN speed and not subject to the latencies
of remote telecommunication links. PC workstations and laptops offer
additional productivity tools and mobility that were previously unavail-
able in the monolithic mainframe model.
Remote offices, however, are also notoriously problematic in terms of
IT best practices and operations. Companies cannot afford to staff IT
personnel in every remote location. Backup processes are difficult to
monitor, and restore capability is rarely tested. Laptop data, for exam-
ple, may include essential business information but may lack the
safeguard of periodic tape backup. Data storage may be bound to indi-
vidual servers, requiring acquisition and management of additional
servers simply to meet growing storage capacity requirements. As a
successful company opens more branch offices, these problems are
compounded, as shown in Figure 51.
Figure 51. Decentralization of data storage has inherent cost and
data protection issues
Without some means to bring remote data assets under control, a
company faces the double burden of steadily increasing operational
expense and exposure to data loss.
Centralizing Remote Data Assets
Some companies have attempted to reverse data decentralization by
bringing business applications, servers, and storage back into the
data center. As in the previous mainframe paradigm, workstations at
remote offices access applications and data over telecommunication
links, and data center best practices for data availability and backup
can be performed centrally.
Typically, the first issue this reversal encounters is bandwidth. The
communication links to remote offices are simply not large enough to
accommodate all business traffic. Consequently, bottlenecks occur as
multiple users in remote locations attempt to access and modify data
simultaneously. This situation is aggravated by the fact that the appli-
cations themselves may engender megabytes of traffic per transaction
(for example, attaching a Microsoft PowerPoint presentation or graphic
to an e-mail) or require significant protocol overhead across a remote
link. The net result is that response times for opening or storing data
files are unacceptable for normal business operations. Without signifi-
cant enhancements, wide area links simply cannot deliver the LAN-like
performance expected (and often demanded) by remote clients.
Increasing bandwidth to remote offices may fix the bottleneck issue
but it cannot overcome the basic limits of wide area networks. Even
with unlimited bandwidth, network latency from the data center to a
remote site imposes its own transaction delay. At roughly 1 millisecond
per hundred miles (2x for a round-trip acknowledgment), network
latency negatively impacts response time as the distance increases.
Because of transmission delay over long distances, centralizing data
processing and storage inevitably imposes a tradeoff between control
of data assets and performance for day-to-day remote business
transactions.
Network latency is especially evident in chatty communication proto-
cols, which require constant acknowledgements and handshaking
between source and destination. When a remote user updates a file,
for example, the new data payload is not simply delivered as a continu-
ous data stream. Instead, protocol handshaking between the data
center server and the remote client workstation is interspersed in the
transaction, further exacerbating the effect of latency through the net-
work. Given that network latency is beyond our control, this problem
cannot be addressed without some means to dramatically reduce pro-
tocol overhead.
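A back-of-the-envelope calculation makes the point. Assuming the rough figure of 1 millisecond per hundred miles each way cited above, the hypothetical Python function below adds protocol round trips to the serialization time of a transfer; the file size, distance, and round-trip counts are illustrative only.

def transfer_time(file_mb, distance_miles, round_trips, bandwidth_mbps):
    """Rough time to move a file over a WAN: serialization plus protocol round trips."""
    one_way_latency_s = (distance_miles / 100.0) * 0.001     # roughly 1 ms per 100 miles
    rtt_s = 2 * one_way_latency_s
    serialization_s = (file_mb * 8) / bandwidth_mbps
    return serialization_s + round_trips * rtt_s

# A 10 MB file over 2,000 miles on a 45 Mbit/sec link:
# a streaming transfer (a handful of round trips) versus a chatty protocol
# that exchanges hundreds of acknowledgements.
print(transfer_time(10, 2000, 5, 45))      # roughly 2 seconds
print(transfer_time(10, 2000, 500, 45))    # roughly 22 seconds

With hundreds of round trips, the added latency dwarfs the time needed to actually move the data, which is why reducing protocol chattiness matters more than adding bandwidth.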
Even with these constraints, the trend toward remote office consolidation
back to the data center is driven by the recognition that a
company's position is untenable if 75 percent of its business data is
at risk. Reducing costs for remote office IT infrastructure, gaining con-
trol of an enterprise's total data assets, implementing enterprise-wide
best practices for data replication and backup, and ensuring compli-
ance to new government regulations are essential requirements for
today's business operations. At the same time, however, solutions to
fix the remote office conundrum must maintain reasonable perfor-
mance and reliability for remote data transactions, both to provide
adequate response time for business operations and to minimize side
effects to remote users.
Remote Replication and Backup
For regional centers with significant local processing needs, consoli-
dating all server and storage assets in the corporate data center may
not be an option. At a minimum, the data housed in larger remote sites
must be protected against loss. A few years ago, the common practice
for safeguarding remote data was to perform periodic tape backups
locally and use the Chevy truck access method (CTAM) protocol to
physically move tapes offsite or to the central data center. Tape sets,
however, can get lost, misplaced, mislabeled, or intercepted by miscre-
ants. In addition, the ability to restore from tape is rarely verified
through testing. Consequently, data protection for larger remote loca-
tions is now typically performed using synchronous or asynchronous
disk-to-disk data replication.
Block-based, disk-to-disk replication over distance must obey the laws
of physics, and network latency determines whether synchronous or
asynchronous methods can be used. Synchronous disk-to-disk replica-
tion for remote sites is operational inside a metropolitan
circumference, roughly 150 miles from the central data center. Every
write operation at the remote storage resource is simultaneously per-
formed at the data center, guaranteeing that every business
transaction is captured and preserved. Beyond 150 miles, however,
network latency imposes too great a delay in block level write opera-
tions and adversely impacts application performance. Asynchronous
block data replication can extend to thousands of miles, but since mul-
tiple write operations are buffered before being sent back to the data
center, there is always the possibility that a few transactions may be
lost in the event of WAN outage or other disruption.
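The trade-off can be sketched as a simple write buffer. In the hypothetical Python model below, writes are acknowledged locally and drained to the remote copy in the background; whatever is still queued when the WAN or the site fails is the exposure window. This illustrates the concept and is not a replication product.

from collections import deque

class AsyncReplicator:
    """Async replication sketch: local writes are acknowledged before shipping."""

    def __init__(self):
        self.pending = deque()            # writes not yet applied at the DR site
        self.remote = {}                  # state of the remote copy

    def write(self, block_no, data):
        self.pending.append((block_no, data))
        return "ack"                      # the application continues immediately

    def drain(self, max_writes):
        """Background task: ship buffered writes across the WAN."""
        for _ in range(min(max_writes, len(self.pending))):
            block_no, data = self.pending.popleft()
            self.remote[block_no] = data

    def exposure(self):
        """Writes that would be lost if the WAN or primary site failed right now."""
        return len(self.pending)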
Larger enterprises may use a combination of synchronous and asyn-
chronous methods to maximize protection of their corporate data. A
remote site, for example, may perform synchronous disk-to-disk repli-
cation to a nearby location, and secondarily asynchronous replication
to the data center. This solution imposes greater cost, but helps
ensure that any potential data loss is minimized.
In addition to disk-to-disk replication, companies may centralize
backup operations to the data center with remote backup techniques.
Remote backup provides only periodic preservation of dispersed data,
but at least it enables the data center to centralize control of data
management. If a regional site becomes inoperable, the vast majority
of its transactions can be reconstructed centrally to provide business
continuity.
The efficiency of disk-to-disk data replication and remote tape backup
technologies depends on the ability of telecommunications services to
deliver adequate performance for the volume of data involved. For
remote tape backup, as in data center backup operations, the window
of time required to perform backup must be sufficient to accommo-
date multiple backup operations concurrently. Finding methods to
expedite block data delivery across wide area links is therefore essen-
tial to meet backup window requirements and reduce costs for WAN
services.
As discussed in Chapters 2 and 3, Brocade technology for remote tape
backup and remote data replication leverages WAN optimization and
storage protocols to fully utilize WAN bandwidth and deliver the maxi-
mum amount of data in the least time. Brocade SAN extension
technology such as data compression, data encryption, rate limiting,
FastWrite, and tape pipelining enable secure data protection for
remote storage assets and extension of data center best practices to
all corporate data.
Leveraging File Management Technology for Data
Protection
Brocade file management technology includes a suite of solutions to
optimize file-level access throughout the corporate network. Although
files ultimately reside as block data on disk storage, the client or user
interface to business applications is typically at the file level. For clas-
sic remote office configurations, client workstations create, retrieve,
modify, and store files on servers attached to the local LAN. The serv-
ers, in turn, perform the file-to-block and block-to-file conversions
required for data storage. The organization of individual files into file
systems is typically executed on a per-server basis. A client is therefore
required to attach to multiple servers if broader file access is required,
with the file system structure of those servers represented as addi-
tional drive identifiers (for example, M: or Z: drives).
A key component of file management technology, wide area file ser-
vice (WAFS) technology, enables companies with multiple remote sites
to consolidate their storage assets at the central data center while pre-
serving local LAN-like response time for file access.
Figure 52. Centralized file access replaces remote server and storage
assets with appliances optimized for high-performance file serving
from the data center to the branch
As shown in Figure 52, wide area file access technologies enable cen-
tralization of remote data assets back to the main data center.
Formerly, remote clients would access files on their local file servers
and storage. In the wide area file solution, the remote client requests
are now directed to the edge appliance. The edge appliance communi-
cates across the WAN to the core appliance at the central data center.
LAN-like response times are maintained by a combination of technolo-
gies, including remote caching, compression, storage caching over IP
(SC-IP), and WAN optimization algorithms. Collectively, these technolo-
gies overcome the latency issues common to earlier attempts at
centralization and so satisfy the response time expectations of remote
users.
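A highly simplified model of the edge/core interaction is sketched below in Python. Reads are served from the edge cache after the first WAN round trip, while writes pass through to the core so the authoritative copy stays in the data center. The class names are hypothetical and the model omits locking, coherency, and WAN optimization.

class CoreAppliance:
    """Data center side: owns the authoritative copy of every file."""

    def __init__(self):
        self.files = {}

    def read(self, path):
        return self.files.get(path, b"")

    def write(self, path, data):
        self.files[path] = data


class EdgeAppliance:
    """Branch side: caches reads locally, writes through to the core."""

    def __init__(self, core):
        self.core = core
        self.cache = {}

    def read(self, path):
        if path not in self.cache:                    # cache miss: one WAN round trip
            self.cache[path] = self.core.read(path)
        return self.cache[path]                       # later reads stay on the LAN

    def write(self, path, data):
        self.cache[path] = data
        self.core.write(path, data)                   # authoritative copy updated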
With data manipulated at remote locations now centralized at the data
center, best practices for data protection, backup, and disaster recov-
ery can be applied to all corporate data. In addition, management of
all corporate data can be streamlined on the basis of consolidated
storage management and advanced storage services, such as infor-
mation lifecycle management, extended to data generated by remote
users.
Although the primary impetus for remote office consolidation may be
to gain control over corporate-wide data assets, wide area file access
provides additional benefits in terms of rationalizing management of
file, print, network, and Web caching services. It dramatically reduces
the amount of hardware and software that has to be supported at
each remote location and reduces the administrative overhead of
maintaining dispersed assets. Wide area file access technology is also
a green IT solution in that the energy inefficiencies of hundreds or
thousands of dispersed servers and storage arrays can be replaced by
more centralized and energy efficient data center elements.
Wide area file access is designed for native integration with Microsoft
platforms in order to support secure and consistent file access poli-
cies. Key support includes Common Internet File System (CIFS)
protocol management, security mechanisms, such as Active Directory,
Server Message Block (SMB) signing, Kerberos authentication, and
Systems Management Server (SMS) distribution services. To help
organizations comply with their internal business objectives and indus-
try regulations, wide area file access technology is typically designed
to survive common WAN outages, and thus to help guarantee data
coherency and consistency.
Protecting Data with Brocade StorageX
Data protection technologies such as replication, snapshot, CDP, and
data archiving are essentially back-end processes operating between
servers and storage. A key consideration for any data protection
scheme, though, is to minimize the impact on ongoing front-end pro-
duction and in particular the end-user applications. In complex
heterogeneous environments that must support multiple operating
systems and different file systems, implementing consistent data pro-
tection strategies non-disruptively is often a challenge.
Brocade StorageX facilitates non-disruptive storage management by
presenting a unified view of file data across heterogeneous systems.
By pooling multiple file systems into a single logical file system, the
StorageX global namespace virtualizes file system access and hides
the back-end complexity of physical storage, as illustrated in
Figure 53. This enables storage administrators to harmonize diverse
storage elements, streamline data management, and implement data
protection technologies transparently to end user access.
Figure 53. Brocade StorageX provides a global namespace to virtual-
ize file access across heterogeneous operating systems and back-end
storage elements
As an integrated suite of file-oriented services, Brocade StorageX facil-
itates data protection by enabling transparent migration of data from
one storage element to another, replication of file data between heter-
ogeneous systems, and simplification of file management, even when
storage elements are still dispersed. In addition, StorageX enables
optimization of storage capacity utilization and so helps ensure that
user applications are allocated adequate storage without disrupting
ongoing operations.
The Brocade StorageX global namespace eliminates the need for indi-
vidual servers to attach to specific storage arrays through separate
drive letter or path designations. Instead, the global namespace pre-
sents a unified view of file structures that may be dispersed over
multiple arrays and presents a single drive letter or path. From the
standpoint of the client, it no longer matters where particular subdirec-
tories or folders reside, and this in turn makes it possible to migrate
file structures from one physical array to another without disrupting
user applications.
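The idea can be illustrated with a small mapping from logical namespace paths to physical shares. In the hypothetical Python sketch below, clients always resolve the same logical path, and migrating a folder to another array is just an update to the map; the paths and helper names are assumptions, not the StorageX implementation.

# Logical folders in the global namespace -> current physical location.
NAMESPACE = {
    r"\\corp\files\engineering": r"\\filer-a\eng_share",
    r"\\corp\files\finance":     r"\\filer-b\fin_share",
}

def resolve(logical_path):
    """Map a client-visible path to the physical server share that backs it."""
    for prefix, physical in NAMESPACE.items():
        if logical_path.lower().startswith(prefix.lower()):
            return physical + logical_path[len(prefix):]
    raise LookupError("path not in the global namespace")

def migrate(logical_prefix, new_physical):
    """Move a folder to another array: clients keep using the same path."""
    NAMESPACE[logical_prefix] = new_physical

print(resolve(r"\\corp\files\engineering\specs\dcx.doc"))
migrate(r"\\corp\files\engineering", r"\\filer-c\eng_share")
print(resolve(r"\\corp\files\engineering\specs\dcx.doc"))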
Brocade File Management Engine
In combination with the StorageX global namespace, Brocade File
Management Engine (FME) provides the ability to automate file lifecy-
cle management. As with ILM techniques for block storage data, file-
level lifecycle management monitors the frequency of file access; as
file data ages and declines in immediate value, it can be migrated to
secondary storage, retired to tape, or simply deleted, depending on
data retention requirements. The clustered, highly available FME is
built on a Windows Storage Server platform. It leverages and integrates
the following technology standards: the CIFS protocol, Active
Directory, and Microsoft security protocols. FME architecture ensures
that access to network resources is always available, protects against
data loss, and allows you to easily scale the management of a file
environment.
Figure 54. Brocade File Management Engine components and
architecture
Part Two
The following chapters are included in Part Two:
Chapter 8: Foundation Products starting on page 115
Chapter 9: Distance Products starting on page 133
Chapter 10: Backup and Data Protection Products starting on
page 137
Chapter 11: Branch Office and File Management Products start-
ing on page 143
Chapter 12: Advanced Fabric Services and Software Products
starting on page 149
8
Foundation Products
This chapter provides brief descriptions of the following Brocade foun-
dation product offerings:
Brocade DCX Backbone on page 116
Brocade 48000 Director on page 119
Brocade Mi10K Director on page 121
Brocade M6140 Director on page 122
Brocade FC4-16IP iSCSI Blade on page 123
Brocade FC10-6 Blade on page 124
Brocade 5300 Switch on page 125
Brocade 5100 Switch on page 126
Brocade 300 Switch on page 127
Brocade Fibre Channel HBAs on page 128
Brocade SAN Health on page 130
The best place to obtain current information about Brocade products and
services is to visit www.brocade.com > Resources > Documentation >
Data Sheets & Solutions Briefs, or to make choices from the Products,
Solutions, or Services main menus.
Brocade DCX Backbone
The Brocade DCX offers flexible management capabilities as well as
Adaptive Networking services and fabric-based applications to help
optimize network and application performance. To minimize risk and
costly downtime, the platform leverages the proven five-nines (99.999
percent) reliability of hundreds of thousands of Brocade SAN
deployments.
Figure 55. Brocade DCX Backbone with all slots populated (no door)
The Brocade DCX facilitates the consolidation of server-to-server,
server-to-storage, and storage-to-storage networks with highly avail-
able, lossless connectivity. In addition, it operates natively with
Brocade and Brocade M-Series components, extending SAN invest-
ments for maximum ROI. It is designed to support a broad range of
current and emerging network protocols to form a unified, high-perfor-
mance data center fabric.
Table 4. Brocade DCX Capabilities

Industry-leading capabilities for large enterprises
- Industry-leading performance: 8 Gbit/sec per-port, full-line-rate performance
- 13 Tbit/sec aggregate dual-chassis bandwidth (6.5 Tbit/sec for a single chassis)
- 1 Tbit/sec of aggregate ICL bandwidth
- More than five times the performance of competitive offerings

High scalability
- High-density, bladed architecture
- Up to 384 8 Gbit/sec Fibre Channel ports in a single chassis
- Up to 768 8 Gbit/sec Fibre Channel ports in a dual-chassis configuration
- 544 Gbit/sec aggregate bandwidth per slot plus local switching
- Fibre Channel Integrated Routing
- Specialty blades for 10 Gbit/sec connectivity (Brocade FC10-6 Blade on page 124), Fibre Channel Routing over IP (FR4-18i Extension Blade on page 134), and fabric-based applications (Brocade FA4-18 Fabric Application Blade on page 137)

Energy efficiency
- Less than one-half Watt per Gbit/sec
- Ten times more energy efficient than competitive offerings

Ultra-High Availability
- Designed to support 99.99 percent uptime
- Passive backplane, separate and redundant control processor and core switching blades
- Hot-pluggable components, including redundant power supplies, fans, WWN cards, blades, and optics

Fabric services and applications
- Adaptive Networking services, including Quality of Service (QoS), Ingress Rate Limiting, Traffic Isolation, and Top Talkers
- Plug-in services for fabric-based storage virtualization, continuous data protection and replication, and online data migration

Multiprotocol capabilities and fabric interoperability
- Support for Fibre Channel, FICON, FCIP, and IPFC
- Designed for future 10 Gigabit Ethernet, Converged Enhanced Ethernet (CEE), and Fibre Channel over Ethernet (FCoE)
- Native connectivity in Brocade and Brocade M-Series fabrics, including backward and forward compatibility

Intelligent management and monitoring
- Full utilization of the Brocade Fabric OS embedded operating system
- Flexibility to utilize a CLI, Brocade EFCM, Brocade Fabric Manager, Brocade Advanced Web Tools, and Brocade Advanced Performance Monitoring
- Integration with third-party management tools
Brocade 48000 Director
Delivering industry-leading 4, 8, and 10 Gbit/sec Fibre Channel and FICON
performance, the Brocade 48000 provides high availability (HA), multiprotocol
connectivity, and broad investment protection for Brocade FOS and Brocade M-EOS
fabrics. It scales non-disruptively from 32 to as many as 384 concurrently
active 4 or 8 Gbit/sec full-duplex ports in a single domain.
Figure 56. Brocade 48000 Director with all slots populated
The Brocade 48000 provides industry-leading power and cooling effi-
ciency, helping to reduce the total cost of ownership. It supports
blades for Fibre Channel Routing, FCIP SAN extension, and iSCSI, and
is designed to support a wide range of fabric-based applications. It
also supports the Brocade FC10-6 blade, providing 10 Gbit/sec Fibre
Channel data transfer for specific types of data-intensive storage
applications.
With its fifth-generation, high-performance architecture, the Brocade
48000 is a reliable foundation for core-to-edge SANs, enabling fabrics
capable of supporting thousands of hosts and storage devices. To pro-
vide even higher performance, enhanced Brocade ISL Trunking
combines up to eight 8 Gbit/sec ports between switches into a single,
logical high-speed trunk running at up to 64 Gbit/sec. Other services
provide additional QoS and Traffic Management capabilities to opti-
mize fabric performance.
Utilizing Brocade Fabric OS, the Brocade 48000 also supports native
connectivity with existing Brocade M-EOS fabrics.
The Brocade 48000 is designed to integrate with heterogeneous envi-
ronments that include IBM mainframe and open platforms with
multiple operating systems such as Microsoft Windows, Linux, Sun
Solaris, HP-UX, AIX, and i5/OS. These capabilities help make it ideal for
enterprise management and high-volume transaction processing
applications such as:
Enterprise resource planning (ERP)
Data warehousing
Data backup
Remote mirroring
HA clustering
Designed for use in the Brocade 48000 Director, the FR4-18i Exten-
sion Blade (see page 134) provides performance-optimized FCIP as
well as Fibre Channel Routing services. The Brocade FR4-18i offers a
wide range of benefits for inter-SAN connectivity, including long-dis-
tance SAN extension, greater resource sharing, and simplified
management. The Brocade 48000 also supports the Brocade FC4-
16IP (see page 123), which enables cost-effective, easy-to-manage
Ethernet connectivity so low-cost servers can access high-perfor-
mance Fibre Channel storage resources.
The Brocade 48000 supports the Brocade FA4-18 Fabric Application
Blade (see page 137) for a variety of fabric-based applications
increasing flexibility, improving operational efficiency, and simplifying
SAN management. This includes Brocade OEM and ISV Partner appli-
cations for storage virtualization and volume management,
replication, and data mobility, as well as the Brocade Data Migration
Manager (see page 139).
Brocade directors are the most power-efficient in the industry, with the
lowest documented power draw. They require less power per port
(under 4 watts per port) and less power per unit bandwidth than any
other director. Brocade is the only vendor to require less than one watt
per Gbit/sec of bandwidth.
Brocade Mi10K Director
With the Brocade Mi10K, organizations can securely and efficiently
consolidate large and geographically distributed networks, supporting
the most demanding open systems and mainframe environments. Pro-
viding up to 256 Fibre Channel or FICON ports in a compact 14U
chassis, the Brocade Mi10K delivers broad scalability advantages.
Organizations can natively connect Brocade 8 Gbit/sec switches, the
Brocade 48000 Director, and Brocade DCX Backbones to the Brocade
Mi10K without disruption, enabling improved utilization of shared
storage resources with complete Brocade Mi10K functionality. The
ability to protect M-Series investments helps reduce costs, streamline
deployment in expanding SANs, and provide a seamless path for
future infrastructure migration.
Figure 57. Brocade Mi10K Director
Brocade M6140 Director
The Brocade M6140 Director is a reliable, high-performance solution
for small to midsize data centers using Brocade M-Series SAN fabric
devices. Designed to support 24x7, mission-critical open systems and
System z environments, the Brocade M6140 enables IT organizations
to further consolidate and simplify their storage networks while keep-
ing pace with rapid data growth and changing business requirements.
Providing up to 140 Fibre Channel or FICON ports, the Brocade M6140
supports 1, 2, and 4 Gbit/sec transfer speeds to address a broad
range of application performance needs. For data replication and
backup to remote sites, the Brocade M6140 provides 10 Gbit/sec
Fibre Channel transfer speeds over dark fiber using DWDM. To help
ensure uninterrupted application performance, the Brocade M6140
features extensive component redundancy to achieve 99.999 percent
system reliability.
The Brocade M6140 utilizes special port cards in up to 35 slots,
enabling organizations to scale their SAN environments in small 4-port
increments for cost-effective flexibility. Organizations can also natively
connect Brocade 8 Gbit/sec switches, the Brocade 48000 Director,
and Brocade DCX Backbones to the Brocade M6140 without disruption,
enabling improved utilization of shared storage resources with
complete Brocade M6140 functionality.
Figure 58. Brocade M6140 Director
Brocade FC4-16IP iSCSI Blade
Today's IT organizations face financial and operational challenges,
such as the growing need to better protect data, both for mission-critical
applications and for second-tier servers such as e-mail servers.
Business demands faster provisioning of storage in a more service-ori-
ented, granular fashion. The centralization of data has also become
increasingly important for these organizations as they deploy new initi-
atives to comply with industry regulations.
All of these challenges can be addressed
by allowing lower-cost iSCSI servers to
access valuable, high-performance Fibre
Channel SAN resources. The Brocade
FC4-16IP blade for the Brocade 48000
Director is a cost-effective solution that
enables this type of connectivity. The Bro-
cade FC4-16IP provides a wide range of
performance, scalability, availability, and
investment protection benefits to help
increase storage administrator productiv-
ity and application performance while
continuing to reduce capital and opera-
tional costs.
The blade features eight GbE ports for
iSCSI connectivity and eight full-speed 1,
2, and 4 Gbit/sec FC ports. The Fibre
Channel ports provide the same perfor-
mance features available in all Brocade
switches.
Figure 59. FC4-16IP iSCSI Blade
Brocade FC10-6 Blade
The Brocade FC10-6 enables organizations with 10 Gbit/sec long-distance
links over dark fiber or DWDM to fully utilize those links (Ciena and
Adva 10 Gbit/sec DWDM equipment has been tested and works with the
Brocade FC10-6). In many environments, a leased 10 Gbit/sec link is
underutilized because organizations can transmit only 4 Gbit/sec Fibre
Channel traffic over a 10 Gbit/sec connection.
The Brocade FC10-6 Blade has six 10 Gbit/sec FC ports that use 10
Gigabit Small Form Factor Pluggable (XFP) optical transceivers. The
ports on the FC10-6 blade operate only in E_Port mode to create ISLs.
The FC10-6 blade has buffering to drive 10 Gbit/sec up to 120 km per
port, which exceeds the capabilities of 10 Gbit/sec XFPs that are avail-
able in short-wave and 10 km, 40 km, and 80 km long-wave versions.
The Brocade FC10-6 is managed with the same tools and CLI com-
mands that are used for Brocade FOS-based products. The CLI,
Brocade Enterprise Fabric Connectivity Manager (EFCM), Brocade Fab-
ric Manager, and Brocade Web Tools all support 10 Gbit/sec utilizing
the same commands used for other Fibre Channel links.
Brocade 5300 Switch
As the value and volume of business data continue to rise, organiza-
tions need technology solutions that are easy to implement and
manage and that can grow and change with minimal disruption. The
Brocade 5300 Switch is designed to consolidate connectivity in rapidly
growing mission-critical environments, supporting 1, 2, 4, and 8 Gbit/
sec technology in configurations of 48, 64, or 80 ports in a 2U chassis.
The combination of density, performance, and pay-as-you-grow scal-
ability increases server and storage utilization, while reducing
complexity for virtualized servers and storage.
Figure 60. Brocade 5300 Switch
Used at the fabric core or at the edge of a tiered core-to-edge infra-
structure, the Brocade 5300 operates seamlessly with existing
Brocade switches through native E_Port connectivity into Brocade FOS
or M-EOS environments. The design makes it very efficient in power,
cooling, and rack density to help enable midsize and large server and
storage consolidation. The Brocade 5300 also includes Adaptive Net-
working capabilities to more efficiently manage resources in highly
consolidated environments. It supports Fibre Channel Integrated Rout-
ing for selective device sharing and maintains remote fabric isolation
for higher levels of scalability and fault isolation.
The Brocade 5300 utilizes ASIC technology featuring eight 8-port
groups. Within these groups, an inter-switch link trunk can supply up to
68 Gbit/sec of balanced data throughput. In addition to reducing con-
gestion and increasing bandwidth, enhanced Brocade ISL Trunking
utilizes ISLs more efficiently to preserve the number of usable switch
ports. The density of the Brocade 5300 uniquely enables fan-out from
the core of the data center fabric with less than half the number of
switch devices to manage compared to traditional 32- or 40-port edge
switches.
Brocade 5100 Switch
The Brocade 5100 Switch is designed for rapidly growing storage
requirements in mission-critical environments combining 1, 2, 4, and
8 Gbit/sec Fibre Channel technology in configurations of 24, 32, or 40
ports in a 1U chassis. As a result, it provides low-cost access to indus-
try-leading SAN technology and pay-as-you-grow scalability for
consolidating storage and maximizing the value of virtual server
deployments.
Figure 61. Brocade 5100 Switch
Similar to the Brocade 5300, the Brocade 5100 features a flexible
architecture that operates seamlessly with existing Brocade switches
through native E_Port connectivity into Brocade FOS or M-EOS environ-
ments. With the highest port density of any midrange enterprise
switch, it is designed for a broad range of SAN architectures, consum-
ing less than 2.5 watts of power per port for exceptional power and
cooling efficiency. It features consolidated power and fan assemblies
to improve environmental performance. The Brocade 5100 is a cost-
effective building block for standalone networks or the edge of enter-
prise core-to-edge fabrics.
Additional performance capabilities include the following:
32 Virtual Channels on each ISL enhance QoS traffic prioritization
and anti-starvation capabilities at the port level to avoid perfor-
mance degradation.
Exchange-based Dynamic Path Selection optimizes fabric-wide
performance and load balancing by automatically routing data to
the most efficient available path in the fabric. It augments ISL
Trunking to provide more effective load balancing in certain con-
figurations. In addition, DPS can balance traffic between the
Brocade 5100 and Brocade M-Series devices enabled with Bro-
cade Open Trunking.
Brocade 300 Switch
The Brocade 300 Switch provides small to midsize enterprises with
SAN connectivity that simplifies IT management infrastructures,
improves system performance, maximizes the value of virtual server
deployments, and reduces overall storage costs. The 8 Gbit/sec Fibre
Channel Brocade 300 provides a simple, affordable, single-switch
solution for both new and existing SANs. It delivers up to 24 ports of 8
Gbit/sec performance in an energy-efficient, optimized 1U form factor.
Figure 62. Brocade 300 Switch
To simplify deployment, the Brocade 300 features the EZSwitchSetup
wizard and other ease-of-use and configuration enhancements, as
well as the optional Brocade Access Gateway mode of operation (sup-
ported with 24-port configurations only). Access Gateway mode
enables connectivity into any SAN by utilizing NPIV switch standards to
present Fibre Channel connections as logical devices to SAN fabrics.
Attaching through NPIV-enabled switches and directors, the Brocade
300 in Access Gateway mode can connect to FOS-based, M-EOS-
based, or other SAN fabrics.
Organizations can easily enable Access Gateway mode (see page 151)
via the FOS CLI, Brocade Web Tools, or Brocade Fabric Manager. Key
benefits of Access Gateway mode include:
Improved scalability for large or rapidly growing server and virtual
server environments
Simplified management through the reduction of domains and
management tasks
Fabric interoperability for mixed vendor SAN configurations that
require full functionality
Brocade Fibre Channel HBAs
In mid-2008, Brocade released a family of Fibre Channel HBAs with both
8 Gbit/sec and 4 Gbit/sec models.
Highlights of these new Brocade FC HBAs include:
Maximizes bus throughput with a Fibre Channel-to-PCIe 2.0a
Gen2 (x8) bus interface with intelligent lane negotiation
Prioritizes traffic and minimizes network congestion with target
rate limiting, frame-based prioritization, and 32 Virtual Channels
per port with guaranteed QoS
Enhances security with Fibre Channel-Security Protocol (FC-SP) for
device authentication and hardware-based AES-GCM; ready for in-
flight data encryption
Supports virtualized environments with NPIV for 255 virtual ports
Uniquely enables end-to-end (server-to-storage) management in
Brocade Data Center Fabric environments
Brocade 825/815 FC HBA
The Brocade 815 (single port) and Brocade 825 (dual ports) 8 Gbit/
sec Fibre Channel-to-PCIe HBAs provide a new level of server connec-
tivity through unmatched hardware capabilities and unique software
configurability. This new class of HBAs is designed to help IT organiza-
tions deploy and manage true end-to-end SAN service across next-
generation data centers.
Figure 63. Brocade 825 FC 8 Gbit/sec HBA (dual ports shown)
The Brocade 8 Gbit/sec FC HBA also:
Maximizes I/O transfer rates with up to 500,000 IOPS per port at
8 Gbit/sec
Utilizes N_Port Trunking capabilities to create a single logical
16 Gbit/sec high-speed link
Brocade 425/415 FC HBA
The Brocade 4 Gbit/sec FC HBA has capabilities similar to those
described for the 8 Gbit/sec version. The Brocade 4 Gbit/sec FC HBA
also:
Maximizes I/O transfer rates with up to 500,000 IOPS per port at
4 Gbit/sec
Utilizes N_Port Trunking capabilities to create a single logical
8 Gbit/sec high-speed link
Figure 64. Brocade 415 FC 4 Gbit/sec HBA (single port shown)
Brocade SAN Health
The Brocade SAN Health family of offerings provides the most compre-
hensive tools and services for analyzing and reporting on storage
networking environments. These practical, easy-to-use solutions help
automate time-consuming tasks to increase administrator productivity,
simplify management, and streamline operations throughout the
enterprise.
Figure 65. SAN Health topology display
The SAN Health family ranges from a free diagnostic capture utility to
optional fee-based add-on modules and customized Brocade Services.
The family of offerings includes:
Brocade SAN Health Diagnostics Capture (Free data capture
utility). By capturing raw data about SAN fabrics, directors,
switches, and connected devices, this utility provides a practical,
fast way to keep track of networked storage environments. SAN
Health Diagnostics Capture collects diagnostic data, checks it for
problems, analyzes it against best-practice criteria, and then pro-
duces an Excel-based report containing detailed information on all
fabric and device elements. This report provides views that are
specifically designed for open systems or mainframe users, and
serves as the basis for all the SAN Health family products and ser-
vices. In addition, it generates a comprehensive Visio topology
diagram that provides a graphical representation of networked
storage environments.
Brocade SAN Health Professional (Free data analysis framework
that supports optional advanced functionality modules). Brocade
SAN Health Professional provides a framework for loading the orig-
inal report data generated by SAN Health Diagnostics Capture.
This framework supports extended functionality beyond the capa-
bilities of an Excel report and Visio topology diagram. Capabilities
such as searching, comparing, custom report generation, and
change analysis are all available in an easy-to-use GUI.
Using SAN Health Professional, organizations can quickly and eas-
ily search their SAN Health reports to find common attributes from
the channel adapters (HBA firmware and driver levels), director/
switch firmware, and specific error counter information.
Brocade SAN Health Professional Change Analysis (Optional fee-
based module with sophisticated change analysis
capabilities). SAN Health Professional Change Analysis is an
optional subscription-based add-on module for SAN Health Profes-
sional that enables organizations to compare two SAN Health
reports run at different times to visually identify what items have
changed from one audit to the next. Organizations can compare
two SAN Health reports with all the detailed changes highlighted
in an easy-to-understand format. The changes are easily search-
able, and organizations can quickly produce a change report.
Brocade SAN Health Expert (Subscription-based Brocade Services
offering featuring detailed analysis and quarterly consultations
with Brocade consultants). The Brocade SAN Health Expert Ser-
vice engagement is a subscription service designed for
organizations that want additional analysis and advice from a Bro-
cade consultant. As an extension of the SAN Health Diagnostics
Capture utility, this service entitles subscribers to four 1-hour live
consultations on a quarterly basis during a 365-day period.
As part of the service, a Brocade consultant prepares for each
telephone consultation by downloading and reviewing the subscriber's
SAN Health reports and preparing architectural and
operational recommendations. This preparation serves as the dis-
cussion agenda for the live consultations. During the
consultations, subscribers also can ask specific questions about
their SAN environments. The quarterly consultations provide a
cost-effective way to build an ongoing plan for improving uptime
and continually fine-tuning SAN infrastructures.
By utilizing the free versions of the SAN Health Diagnostics Capture
utility and SAN Health Professional framework, organizations can
quickly gain an accurate view of their storage infrastructure, including
director and switch configurations along with all of the devices
attached to the network. They can then opt for the fee-based modules
that build on the SAN Health Professional framework if they want addi-
tional search, filtering, or reporting capabilities. Regardless, IT
organizations of all sizes can utilize these products and services to
perform critical tasks such as:
Taking inventory of devices, directors, switches, firmware versions,
and fabrics
Capturing and displaying historical performance data
Comparing zoning and switch configurations to best practices
Assessing performance statistics and error conditions
Producing detailed graphical reports and diagrams
Figure 66. SAN Health reporting screen
In addition to these capabilities, mainframe users can utilize a new
FICON-enhanced tool to model potential configurations and manage
change in a simplified format. Specifically, the tool reformats
Input/Output Configuration Program (IOCP) configuration files into
easy-to-understand Microsoft Excel spreadsheets.
9
Distance Products
Brocade has a number of highly optimized distance extension prod-
ucts, including:
Brocade 7500 Extension Switch on page 133
FR4-18i Extension Blade on page 134
Brocade Edge M3000 on page 135
Brocade USD-X on page 136
Brocade 7500 Extension Switch
The Brocade 7500 combines 4 Gbit/sec Fibre Channel switching and
routing capabilities with powerful hardware-assisted traffic forwarding
for FCIP. It features 16 Fibre Channel ports and two 1 GbE ports,
delivering high performance to run storage applications at line-rate speed with either
protocol. By integrating these services in a single platform, the Bro-
cade 7500 offers a wide range of benefits for storage and SAN
connectivity, including SAN scaling, long-distance extension, greater
resource sharing (either locally or across geographical areas), and sim-
plified management.
Figure 67. Brocade 7500 Extension Switch
The Brocade 7500 provides an enterprise building block for consolida-
tion, data mobility, and business continuity solutions that improve
efficiency and cost savings:
Combines FCIP extension with Fibre Channel switching and rout-
ing to provide local and remote storage and SAN connectivity while
isolating SAN fabrics and IP WAN networks
Optimizes application performance with features such as Fast
Write, Brocade Accelerator for FICON (including Emulation and
Read/Write Tape Pipelining), and hardware-based compression
Maximizes bandwidth utilization with Adaptive Networking ser-
vices, including QoS and Traffic Isolation, trunking, and network
load balancing
Enables secure connections across IP WANs through IPSec
encryption
Interoperates with Brocade switches, routers, and the Brocade
DCX Backbone, enabling new levels of SAN scalability, perfor-
mance, and investment protection
Simplifies interconnection and support for heterogeneous SAN
environments
FR4-18i Extension Blade
The Brocade FR4-18i, integrating into either the
Brocade 48000 Director or the Brocade DCX
Backbone, combines Fibre Channel switching
and routing capabilities with powerful hardware-
assisted traffic forwarding for FCIP. The blade fea-
tures 16 x 4 Gbit/sec Fibre Channel ports and two
1 GbE ports, delivering high performance to run
storage applications at line-rate speed with either
protocol. By integrating these services in a single
platform, the Brocade FR4-18i offers a wide
range of benefits for storage and SAN connectiv-
ity, including SAN scaling, long-distance
extension, greater resource sharing (either locally
or across geographical areas), and simplified
management.
Figure 68. FR4-18i Extension Blade
Brocade Edge M3000
The Brocade Edge M3000 interconnects Fibre Channel and FICON
SANs over IP or ATM infrastructures. As a result, it enables many of the
most cost-effective, enterprise-class data replication solutions,
including disk mirroring and remote tape backup/restore, to maximize
business continuity. Moreover, the multipoint SAN routing capabilities
of the Brocade Edge M3000 provide a highly flexible storage
infrastructure for a wide range of remote storage applications.
Figure 69. Brocade Edge M3000
The Brocade Edge M3000 enables the extension of mission-critical
storage networking applications in order to protect data and extend
access to the edges of the enterprise. The ability to extend both main-
frame and open systems tape and disk storage provides cost-effective
options for strategic storage infrastructure plans as well as support for
the following applications:
Synchronous or asynchronous disk mirroring
Data backup/restore, archive/retrieval, and migration
Extended tape or virtual tape
Extended disk
Content distribution
Storage sharing
Brocade USD-X
The Brocade USD-X is a high-performance platform that connects and
extends mainframe and open systems storage-related data replication
applications for both disk and tape, along with remote channel net-
working for a wide range of device types.
Figure 70. Brocade USD-X, 12-slot and 6-slot versions
This multi-protocol gateway and extension platform interconnects host-
to-storage and storage-to-storage systems across the enterprise,
regardless of distance, to create a high-capacity, high-performance
storage network using the latest high-speed interfaces.
In short, the Brocade USD-X:
Supports Fibre Channel, FICON, ESCON, Bus and Tag, or mixed-environment
systems
Fully exploits Gigabit Ethernet services
Delivers industry-leading throughput over thousands of miles
Provides hardware-based compression to lower bandwidth costs
Offers one platform for all remote storage connectivity needs
Shares bandwidth across multiple applications and sites
There are two versions of the Brocade USD-X: the 12-slot version
(shown on the left in Figure 70) and the 6-slot version (shown on the right).
10
Backup and Data
Protection Products
The Brocade DCX Backbone and the Brocade 48000 Director, with the
Brocade FA4-18 Fabric Application Blade running Brocade or third-party
applications, provide a robust data protection solution.
NOTE: The functionality described for the FA4-18 Fabric Application
Blade is also available in the Brocade 7600 standalone platform.
Brocade FA4-18 Fabric Application Blade
The Brocade FA4-18 blade installed in a Bro-
cade DCX Backbone or a Brocade 48000
Director is a high-performance platform for fab-
ric-based storage applications. Delivering
intelligence in SANs to perform fabric-based
storage services, including online data migra-
tion, storage virtualization, and continuous
data replication and protection, this blade pro-
vides high-speed, highly reliable fabric-based
services throughout heterogeneous data center
environments.
Figure 71. Brocade FA4-18
The Brocade FA4-18 is tightly integrated with a wide range of enter-
prise storage applications that leverage Brocade Storage Application
Services (SAS, an implementation of the T11 FAIS standard) to provide
wirespeed data movement and offload server resources. These appli-
cations include:
Brocade Data Migration Manager (page 139) provides an ultra-
fast, non-disruptive, and easy-to-manage solution for migrating
data in heterogeneous server and storage environments. It helps
organizations reduce overhead while accelerating data center
relocation or consolidation, array replacements, and Information
Lifecycle Management (ILM) activities.
EMC RecoverPoint on Brocade (page 141) is designed to provide
continuous remote replication and continuous data protection
across heterogeneous IT environments, enabling organizations to
protect critical applications from data loss and improve business
continuity. (EMC sells the Brocade FA4-18 for RecoverPoint solu-
tions under the EMC Connectrix Application Platform brand.)
EMC Invista on Brocade is designed to virtualize heterogeneous
storage in networked storage environments, enabling organiza-
tions to simplify and expand storage provisioning, and move data
seamlessly between storage arrays without costly downtime. (EMC
sells the Brocade FA4-18 for Invista solutions under the EMC Con-
nectrix Application Platform brand.)
Fujitsu ETERNUS VS900 virtualizes storage across Fibre Channel
networks, enabling organizations to allocate any storage to any
application with ease, simplify data movement across storage
tiers, and reduce storage costs.
The Brocade FA4-18 blade provides a high-performance platform for
tightly integrated storage applications that leverage the Brocade Stor-
age Application Services (SAS) API. Highlights of the FA4-18 include:
Provides 16 auto-sensing 1, 2, and 4 Gbit/sec Fibre Channel ports
with two auto-sensing 10/100/1000 Mbit/sec Ethernet ports for
LAN-based management
Leverages a fully pipelined, multi-CPU RISC and memory system, up to
64 Gbit/sec of throughput, and up to 1 million IOPS to meet the most
demanding data center environments
Performs split-path hardware acceleration using partitioned port
processing and distributed control and data path processors,
enabling wire-speed data movement without compromising host
application performance
Helps ensure highly reliable storage solutions through failover-
capable data path processors combined with the high component
redundancy of the Brocade DCX or Brocade 48000
Brocade Data Migration Manager Solution
Brocade Data Migration Manager (DMM) provides a fast, non-disrup-
tive, and easy-to-manage migration solution for heterogeneous
environments.
As the need for block-level data migration becomes increasingly com-
mon, many IT organizations need to migrate data from one type of
storage array to another and from one vendor array to another. As
such, data migration carries an element of risk and often requires
extensive planning. Powerful, yet easy to use, Brocade DMM enables
these organizations to efficiently migrate block-level data and avoid
the high cost of application downtime.
Because it is less disruptive, more flexible, and easier to plan for than
traditional data migration offerings, Brocade DMM provides a wide
range of advantages. Residing on the SAN-based Brocade Application
Platform, Brocade DMM features a migrate-and-remove architecture
as well as wire-once setup that enables fast, simplified deployment
in existing SANs. This approach helps organizations implement and
manage data migration across SANs or WANs with minimal time and
resource investment.
Utilizing the 4 Gbit/sec port speed and 1 million IOPS performance of
the Brocade Application Platform, Brocade DMM migrates up to 128
volumes in parallel at up to five terabytes per hour. For maximum flexi-
bility, it supports both offline and online data migration in Windows,
HP-UX, Solaris, and AIX environments for storage arrays from EMC, HP,
Hitachi, IBM, Network Appliance, Sun, and other vendors.
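As a rough sizing illustration only (the 5-terabyte-per-hour figure is a peak rate, and actual throughput depends on fabric, array, and host load), the arithmetic for planning a migration window looks like the following Python sketch; the 60 TB dataset size is an assumed example, not a product figure.

# Back-of-the-envelope migration time at the peak rate quoted above.
data_tb = 60.0              # assumed dataset size in terabytes
peak_rate_tb_per_hour = 5.0
hours = data_tb / peak_rate_tb_per_hour
print(f"~{hours:.0f} hours at peak rate")  # ~12 hours for 60 TB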
Key features and benefits include:
Simplifies and accelerates block data migration during data cen-
ter relocation or consolidation, array replacements, or ILM
activities
Migrates up to 128 LUNs in parallel at up to 5 terabytes per hour
Performs online (as well as offline) migration without impacting
applications, eliminating costly downtime
Moves data between heterogeneous storage arrays from EMC,
Hitachi, HP, IBM, NetApp, Sun, and other leading vendors
Enables fast, seamless deployment in existing SAN fabrics
through a migrate-and-remove architecture
Automates multiple migration operations with easy start, stop,
resume, and throttle control
Utilizes an intuitive Windows management console or CLI scripting
EMC RecoverPoint Solution
EMC RecoverPoint on Brocade provides continuous remote replication
and continuous local data protection across heterogeneous IT environ-
ments, as shown in Figure 72. By leveraging the intelligence in
Brocade SAN fabrics and utilizing existing WAN connectivity, this inte-
grated solution helps IT organizations protect their critical applications
against data loss for improved business continuity.
Figure 72. EMC RecoverPoint on Brocade scenario
This solution includes advanced features that provide robust
performance and heterogeneous implementations:
Brocade SAS API for reliable, scalable, and highly available
storage applications
Fully pipelined, multi-CPU RISC (reduced instruction set comput-
ing) and memory system, providing inline processing capabilities
for optimum performance and flexibility
Partitioned port processing, which utilizes distributed control and
data path processors for wirespeed data transfer
A compact, cost-effective deployment footprint
Investment protection through non-disruptive interoperability with
existing SAN fabrics
Available for Microsoft Windows, AIX, HP-UX, Sun Solaris, Linux,
and VMware server environments, utilizing storage devices
residing in a Fibre Channel SAN
11
Branch Office and File
Management Products
With the unprecedented growth of file data across the enterprise,
today's IT organizations face ever-increasing file management
challenges: greater numbers of files, larger files, rising user
expectations, and shorter maintenance windows. This chapter describes
the following Brocade products:
Brocade File Management Engine on page 143
Brocade StorageX on page 145
Brocade File Insight on page 146
Brocade File Management Engine
Brocade File Management Engine (FME) creates a logical abstraction
layer between how files are accessed and the underlying physical stor-
age. Because file access is no longer bound to physical storage
devices, organizations can move or migrate files without disrupting
users or applications.
Figure 73. Brocade File Management Engine (FME)
Brocade FME utilizes sophisticated technology for true open file migra-
tionsimplifying file management and enabling organizations to
virtualize their files and manage resources more efficiently. As a
result, organizations can manage file data whenever they want, saving
time, money, and resources. Moreover, the automation of labor-inten-
sive tasks reduces the potential for errors and business disruption.
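The following minimal Python sketch illustrates the general idea of that abstraction layer: clients resolve a stable logical path, and only a mapping table changes when data moves. The class, methods, and path names are hypothetical and do not represent FME internals or APIs.

class Namespace:
    """Toy logical-to-physical mapping that illustrates the abstraction idea."""
    def __init__(self):
        self._map = {}

    def publish(self, logical_path, physical_location):
        self._map[logical_path] = physical_location

    def resolve(self, logical_path):
        return self._map[logical_path]

    def migrate(self, logical_path, new_location):
        # The migration engine copies the data; clients keep using the
        # same logical path and are simply redirected afterwards.
        self._map[logical_path] = new_location

ns = Namespace()
ns.publish(r"\\corp\eng\specs", r"\\filer1\share1\specs")
ns.migrate(r"\\corp\eng\specs", r"\\filer2\tier2\specs")
print(ns.resolve(r"\\corp\eng\specs"))  # clients still open \\corp\eng\specs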
Brocade FME combines non-disruptive file movement with policy-
driven automation for:
Transparent file migration, including open and locked files
File, server, and storage consolidation
Asset deployment and retirement
Tiered file classification and placement
File and directory archiving
Brocade FME provides a number of powerful features, some of which
are unique in the industry:
Open file migration. Enables non-disruptive movement of open or
locked files, supporting on-demand or scheduled movement
Redirection for logical migration. Logically links users to physical file
locations to avoid disruption
Transparency. Does not alter server, network, and storage resources
or client access and authentication
Automated policies. Saves time by simplifying file classification and
management while improving integrity by automatically monitoring file
placement
Scalable and granular namespace. Supports the management of bil-
lions of files and petabytes of data at the share, directory, or file level
Heterogeneous resource support. Abstracts servers, networks, and
storage for easier management, including common management of
SMB and CIFS data
Brocade StorageX
Brocade StorageX is an integrated suite of applications that logically
aggregates distributed files across heterogeneous storage environ-
ments and across CIFS- and NFS-based files while providing policies to
automate file management functions. It supports tasks for key areas
such as:
Centralized network file management with location-independent
views of distributed files
File management agility and efficiency through transparent high-
speed file migration, consolidation, and replication
Security, regulatory, and corporate governance compliance with
reporting and seamless preservation of file permissions during
migration
Disaster recovery and enhanced business continuity with 24x7
file access, utilizing replicas across multiple heterogeneous, dis-
tributed locations
Centralized and automated key file management tasks for greater
productivity, including failover and remote site file management
Information Lifecycle Management (ILM) policies to automate
tiered file migration from primary storage to secondary devices
based on specified criteria
File data classification and reporting
Brocade StorageX provides administrators with powerful policies to
efficiently manage distributed files throughout an enterprise. More-
over, it directly addresses the needs of both administrators and users
by increasing data availability, optimizing storage capacity, and simpli-
fying storage management for filesall leading to significantly lower
costs for enterprise file data infrastructures.
Brocade StorageX integrates and extends innovative Microsoft Win-
dows-based technologies such as DFS to provide seamless integration
with Windows infrastructures. Rather than managing data through pro-
prietary technologies or file systems that must mediate access, it
enables file access through established mechanisms.
Brocade StorageX leverages Microsoft technology to:
Build upon the DFS namespace with a global namespace that
aggregates files and centralizes management across the
enterprise
Simplify Windows Server 2003 and Storage Server 2003 adoption
and migration from legacy operating systems, including Novell
Provide cost-effective, seamless failover across geographically dis-
tributed sites by centralizing management of the global failover
process
Brocade File Insight
Brocade File Insight is a free Windows-based reporting utility that pro-
vides a fast and easy way to understand SMB/CIFS file share
environments. It collects file metadata and produces meaningful
reports on file age, size, types, and other metadata statistics. Unlike
traditional manual data collection and reporting methods, File Insight
is easy to use, non-intrusive, and fast. It enables administrators to opti-
mize network-based file availability, movement, and access while
lowering the cost of ownership.
The file storage world today is increasingly networked and distributed,
and file storage management has become both complex and costly. IT
organizations often struggle to find answers to questions such as:
What is the percentage of files being managed that have not
changed in the past year?
How many files have not been accessed in the past six months?
What file types are most common?
What file types consume the most space?
To address these challenges, File Insight helps organizations assess
and better understand highly distributed file environments. Leveraging
this free file analysis utility, organizations can scan SMB/CIFS network
shares and use the resulting metadata to better understand their file
environments.
The File Insight console is an intuitive, task-based interface that is sim-
ple to install and use. It enables organizations to create and run File
Insight scans, and view the results. A File Insight scan collects meta-
data about the files stored on the network shares included in the scan,
and stores the scan results in a CSV file for local reporting and a Zip
file for Brocade-based report generation, as shown in Figure 74.
Figure 74. Overview of Brocade File Insight
File Insight provides reports with the following types of information:
The number of files in an environment
File age and file size
How many files have not been accessed in two or more years
The most common file types by aggregate file count and file size
As a result, File Insight provides the information organizations need to
more confidently manage their network-based file storage and opti-
mize file data availability, movement, access, and cost.
If you have questions, contact fiadmin@brocade.com.
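To show how scan output of this kind can be put to work, the following Python sketch answers two of the questions above from a metadata CSV. The file name and the column names (path, size_bytes, last_access) are assumptions made for illustration; the actual File Insight CSV schema may differ.

import csv
from collections import Counter
from datetime import datetime, timedelta

stale_cutoff = datetime.utcnow() - timedelta(days=365)
stale_count = 0
space_by_type = Counter()

with open("file_insight_scan.csv", newline="") as f:  # assumed file name
    for row in csv.DictReader(f):
        if datetime.fromisoformat(row["last_access"]) < stale_cutoff:
            stale_count += 1
        ext = row["path"].rsplit(".", 1)[-1].lower()  # crude extension parse
        space_by_type[ext] += int(row["size_bytes"])

print(f"Files untouched for a year or more: {stale_count}")
print("Top space consumers by file type:", space_by_type.most_common(5))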
12
Advanced Fabric Services
and Software Products
Brocade ships its flagship proprietary operating system, Brocade
Fabric OS (FOS), on all B-Series platforms.
NOTE: Also supported for M-Series (formerly McDATA) platforms is Bro-
cade M-Enterprise OS.
The following optionally licensed Advanced Fabric Services are avail-
able to enhance the capabilities of FOS:
Brocade Advanced Performance Monitoring on page 150
Brocade Access Gateway on page 151
Brocade Fabric Watch on page 152
Brocade Inter-Switch Link Trunking on page 153
Brocade Extended Fabrics on page 154
Brocade offers a suite of manageability software products:
Brocade Enterprise Fabric Connectivity Manager on page 156
Brocade Fabric Manager on page 158
Brocade Web Tools on page 160
Brocade Fabric OS
Brocade Fabric OS is the operating system firmware that provides the
core infrastructure for deploying robust SANs. As the foundation for the
Brocade family of FC SAN switches and directors, it helps ensure the
reliable and high-performance data transport that is critical for scal-
able SAN fabrics interconnecting thousands of servers and storage
devices. With ultra-high-availability features such as non-disruptive hot
code activation, FOS is designed to support mission-critical enterprise
environments. A highly flexible solution, it is built with field-proven fea-
tures such as fabric auditing, continuous port monitoring, advanced
diagnostics and recovery, and data management/fault isolation.
In addition, FOS capabilities include:
Maximizes flexibility by integrating high-speed access, infrastruc-
ture scaling, long-distance connectivity, and multiservice
intelligence into SAN fabrics
Enables highly resilient, fault-tolerant multiswitch Brocade SAN
fabrics
Supports multiservice application platforms for the most demand-
ing business environments
Features 1, 2, 4, 8, and 10 Gbit/sec capabilities for Fibre Channel
and FICON connectivity and 1 Gbit/sec Ethernet for long-distance
networking and iSCSI connectivity
Maximizes port usage with NPIV technology
Provides data management and fault isolation capabilities for fab-
rics via Administrative Domain, Advanced Zoning, and Logical SAN
(LSAN) zoning technologies
Supports IPv6 and IPv4 addressing for system management
interfaces
Brocade Advanced Performance Monitoring
Based on Brocade Frame Filtering technology and a unique perfor-
mance counter engine, Brocade Advanced Performance Monitoring is
a comprehensive tool for monitoring the performance of networked
storage resources. This tool helps reduce total cost of ownership and
over-provisioning while enabling SAN performance tuning, reporting of
service level agreements, and greater administrator productivity.
Advanced Performance Monitoring supports direct-attached, loop, and
switched fabric Fibre Channel SAN topologies by:
Monitoring transaction performance from source to destination
Monitoring ISL performance
Measuring device performance by port, Arbitrated Loop Physical
Address (ALPA), and LUN
Reporting Cyclic Redundancy Check error measurement statistics
Measuring ISL Trunking performance and resource usage
Utilizing Top Talker reports, which rank the highest-bandwidth
data flows in the fabric for F_Ports and E_Ports (ISL)
Comparing IP versus SCSI traffic on each port
Brocade Access Gateway
Blade servers are experiencing explosive growth and acceptance in
today's data center IT environments. A critical part of this trend is con-
necting blade servers to SANs, which provide highly available and
scalable storage solutions. IT organizations that want to connect blade
server enclosures to SANs in this manner typically utilize one of two
methods: Fibre Channel SAN pass-through solutions or blade server
SAN switches.
Brocade offers blade server SAN switches from all leading blade man-
ufacturers, providing significant advantages over Fibre Channel SAN
pass-through solutions. With fewer cables and related components,
Brocade blade server SAN switches provide lower cost and greater reli-
ability by eliminating potential points of failure. Brocade has expanded
upon these blade server SAN switch benefits with the introduction of
the Brocade Access Gateway. Specifically for blade server SAN
switches, the Brocade Access Gateway simplifies server and storage
connectivity in blade environments. By enabling increased fabric con-
nectivity, greater scalability, and reduced management complexity, the
Brocade Access Gateway provides a complete solution for connecting
blade servers to any SAN fabric.
This unique solution protects investments in existing blade server SAN
switches by enabling IT organizations to use them as traditional Bro-
cade full-fabric SAN switches or operate them in Brocade Access
Gateway mode via Brocade Web Tools or the Brocade command line
interface. As a result, the Brocade Access Gateway provides a reliable
way to integrate state-of-the-art blade servers into heterogeneous
Fibre Channel SAN environments, as shown in Figure 75.
Figure 75. Access Gateway on blades and the Brocade 300 Switch
Highlights of the Brocade Access Gateway include:
Simplifies the connectivity of blade servers to any SAN fabric,
using hardware that is qualified by industry-leading OEMs
Increases scalability of blade server enclosures within SAN fabrics
Helps eliminate fabric disruption resulting from increased blade
server switch deployments
Simplifies deployment and change management utilizing standard
Brocade FOS
Provides extremely flexible port connectivity
Features fault-tolerant external ports for mission-critical high
availability
Brocade Fabric Watch
Brocade Fabric Watch is an optional SAN health monitor for Brocade
switches. Fabric Watch enables each switch to constantly watch its
SAN fabric for potential faultsand automatically alert network manag-
ers to problems before they become costly failures. Fabric Watch
tracks a variety of SAN fabric elements, events, and counters. Monitor-
ing fabric-wide events, ports, transceivers, and environmental
parameters permits early fault detection and isolation as well as per-
formance measurement. Unlike many systems monitors, Fabric Watch
is easy to configure. Network administrators can select custom fabric
elements and alert thresholds, or they can choose from a selection of
preconfigured settings.
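As a simple illustration of the threshold concept only (this is not Fabric Watch syntax or its alerting pipeline), a monitor of this kind compares a counter against a configured limit and raises an alert when the limit is crossed:

# Minimal threshold check; counter names and limits are assumed examples.
def check_threshold(counter_name, value, threshold):
    if value >= threshold:
        return f"ALERT: {counter_name} = {value} (threshold {threshold})"
    return None

print(check_threshold("port7.crc_errors", 14, threshold=10))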
In addition, it is easy to integrate Fabric Watch with enterprise systems
management solutions. By implementing Fabric Watch, storage and
network managers can rapidly improve SAN availability and perfor-
mance without installing new software or system administration tools.
For a growing number of organizations, SAN fabrics are a mission-criti-
cal part of their systems architecture. These fabrics can include
hundreds of elements, such as hosts, storage devices, switches, and
ISLs. Fabric Watch can optimize SAN value by tracking fabric events
such as:
Fabric resources: fabric reconfigurations, zoning changes, and
new logins
Switch environmental functions: temperature, power supply, and
fan status, along with security violations and HA metrics
Port state transitions, errors, and traffic information for multiple
port classes as well as operational values for supported models of
transceivers
A wide range of performance information
Brocade Inter-Switch Link Trunking
Brocade ISL Trunking is available for all Brocade 2, 4, and 8 Gbit/sec
Fibre Channel switches, FOS-based directors, and the Brocade DCX
Backbone. This technology is ideal for optimizing performance and
simplifying the management of multi-switch SAN fabrics containing
Brocade switches and directors and the latest 8 Gbit/sec solutions.
When two or more adjacent ISLs in a port group are used to connect
two switches with trunking enabled, the switches automatically group
the ISLs into a single logical ISL, or trunk. The throughput of the
resulting trunk can range from 4 Gbit/sec to as much as 68 Gbit/sec.
Highlights of Brocade ISL Trunking include:
Combines up to eight ISLs into a single logical trunk that provides
up to 68 Gbit/sec data transfers (with 8 Gbit/sec solutions)
Optimizes link usage by evenly distributing traffic across all ISLs at
the frame level
Maintains in-order delivery to ensure data reliability
Helps ensure reliability and availability even when a link in the
trunk fails
Optimizes fabric-wide performance and load balancing with
Dynamic Path Selection
Simplifies management by reducing the number of ISLs required
Provides a high-performance solution for network- and data-inten-
sive applications
To further optimize network performance, Brocade 4 and 8 Gbit/sec
platforms support optional DPS. Available as a standard feature in Bro-
cade FOS (starting in Fabric OS 4.4), exchange-based DPS optimizes
fabric-wide performance by automatically routing data to the most effi-
cient available path in the fabric. DPS augments ISL Trunking to
provide more effective load balancing in certain configurations, such
as routing data between multiple trunk groups, or in Native Connectiv-
ity configurations with Brocade M-EOS products. This approach
provides transmit ISL Trunking from FOS to M-EOS products while M-
EOS products provide transmit trunking via Open Trunking, thereby
enabling bidirectional trunking support. As a result, this combination
of technologies provides the greatest design flexibility and the highest
degree of load balancing.
Depending on the number of links and link speeds employed, trunks
can operate at various distance/bandwidth combinations. For exam-
ple, trunking can support distances of 345 km for a 2 Gbit/sec, 5-link
trunk providing over 10 Gbit/sec of trunk bandwidth, or 210 km for a 4
Gbit/sec, 4-link trunk providing 17 Gbit/sec of trunk bandwidth.
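One reasonable way to read the trunk bandwidth figures in this chapter is as the sum of the member links' nominal line rates (2.125, 4.25, and 8.5 Gbit/sec for 2, 4, and 8 Gbit/sec Fibre Channel). The short Python sketch below reproduces the numbers quoted above under that assumption.

# Trunk bandwidth as the sum of member-link nominal line rates.
line_rate_gbps = {"2G": 2.125, "4G": 4.25, "8G": 8.5}
print(5 * line_rate_gbps["2G"])  # ~10.6 Gbit/sec for a 5-link 2G trunk
print(4 * line_rate_gbps["4G"])  # 17 Gbit/sec for a 4-link 4G trunk
print(8 * line_rate_gbps["8G"])  # 68 Gbit/sec for an 8-link 8G trunk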
Brocade Extended Fabrics
Fibre Channel-based networking technology has revitalized the reliabil-
ity and performance of server and storage environments, providing a
robust infrastructure to meet the most demanding business require-
ments. In addition to improving reliability and performance, Fibre
Channel provides the capability to distribute server and storage con-
nections over distances up to 30 km using enhanced long-wave optics
and dark fiber, enabling SAN deployment in campus environments.
However, todays organizations often require SAN deployment over dis-
tances well beyond 30 km to support distributed facilities and stricter
business continuance requirements. To address these extended dis-
tance SAN requirements, Brocade offers Extended Fabrics software.
Brocade Extended Fabrics enables organizations to leverage the
increased availability of DWDM equipment in major metropolitan
areas (see Figure 24). The most effective configuration for implement-
ing extended-distance SAN fabrics is to deploy Fibre Channel switches
at each location in the SAN. Each switch handles local interconnectiv-
ity and multiplexes traffic across long-distance DWDM links while the
Extended Fabrics software enables SAN management over extended
distances.
In this type of configuration, the Extended Fabrics software enables:
Fabric interconnectivity over Fibre Channel at longer distances.
ISLs or IFLs use dark fiber or DWDM connections to transfer data.
As Fibre Channel speeds increase, the maximum distance
decreases for each switch. However, the latest Brocade 8 Gbit/sec
technology sets a new benchmark for extended distances, up to
3400 km at 1 Gbit/sec and 425 km at 8 Gbit/sec, to move more
data over longer distances at a lower cost.
Simplified management over distance. Each device attached to
the SAN appears as a local device, an approach that simplifies
deployment and administration.
A comprehensive management environment. All management
traffic flows through internal SAN connections, so the fabric can
be managed from a single administrator console using Brocade
Enterprise Fabric Connectivity Manager (EFCM), Fabric Manager,
or the Web Tools switch management utility.
Table 5 provides distance data for Brocade Extended Fabrics.
Table 5. Extended Fabrics distances for 8 Gbit/sec platforms

Connection type: Native Fibre Channel
Line speed: 1, 2, 4, and 8 Gbit/sec
Maximum distance for the Brocade 5100 Switch: up to 3400 km at 1 Gbit/sec, 1700 km at 2 Gbit/sec, 850 km at 4 Gbit/sec, and 425 km at 8 Gbit/sec
Maximum distance for the Brocade 5300 Switch: up to 600 km at 1 Gbit/sec, 300 km at 2 Gbit/sec, 150 km at 4 Gbit/sec, and 75 km at 8 Gbit/sec
Maximum distance for the Brocade 300 Switch: up to 984 km at 1 Gbit/sec, 492 km at 2 Gbit/sec, 246 km at 4 Gbit/sec, and 123 km at 8 Gbit/sec
Maximum distance for Brocade 8 Gbit/sec blades: up to 2792 km at 1 Gbit/sec, 1396 km at 2 Gbit/sec, 698 km at 4 Gbit/sec, and 349 km at 8 Gbit/sec
Interconnect distance: extended long-wave transceivers; Fibre Channel repeaters; DWDM
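A pattern worth noting in Table 5 is that, for a given platform, the supported distance halves each time the line rate doubles. The standard explanation is that each platform has a fixed pool of buffer-to-buffer credits, and a faster link drains that pool over a shorter stretch of fiber; that credit-pool reasoning is general background here, not a statement of platform specifications. The Python sketch below simply checks the pattern against the Brocade 5100 row of the table.

# Distance x line rate is roughly constant for a given platform.
brocade_5100_km = {1: 3400, 2: 1700, 4: 850, 8: 425}  # Gbit/sec -> km (Table 5)
for speed, km in brocade_5100_km.items():
    print(f"{speed} Gbit/sec x {km} km = {speed * km}")  # product stays at 3400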
Brocade Enterprise Fabric Connectivity Manager
Brocade EFCM manages M-EOS fabrics and is available in Basic,
Standard, and Enterprise versions.
Brocade EFCM Basic
Brocade EFCM Basic is an intuitive, browser-based SAN management
tool for simple and straightforward configuration and management of
Brocade fabric switches. Ideal for small to mid-sized businesses, the
software is complimentary with every Brocade fabric switch and is
perfect for companies migrating from direct-attached storage to a SAN
or maintaining small switch SANs. It is recommended for fabrics with
one to three switches and is accessed via a standard Web browser.
Brocade EFCM Standard and Enterprise
Brocade EFCM is a powerful and comprehensive SAN management
application. It helps organizations consolidate, optimize, and protect
their storage networks to reduce costs, meet their data protection
requirements, and improve their service levels through unprecedented
ease of use, scalability, global visualization, and intelligent automa-
tion. In particular, Brocade EFCM reduces the complexity and cost of
storage networks through centralized management of global SAN envi-
ronments as shown in Figure 76.
With enterprise-class reliability, proactive monitoring/alert notification,
and unprecedented scalability, it helps organizations maximize avail-
ability while enhancing security for their storage network
infrastructures.
Figure 76. Brocade EFCM interface
Highlights include:
Centralizes the management of multiple Brocade M-EOS and Bro-
cade Fabric OS SAN fabrics
Facilitates configuration and asset tracking with end-to-end visual-
ization of extended SANs, including HBAs, routers, switches, and
extension devices
Displays, configures, and zones Brocade HBAs, switches, direc-
tors, and the Brocade DCX Backbone
Adds, removes, and modifies remote devices with easy-to-use
functions that simplify management tasks
Provides industry-leading support for FICON mainframe environ-
ments, including FICON CUP, FICON CUP zoning, and NPIV
Enables integration with third-party management applications and
SRM tools for storage-wide management
Displays multiple geographically dispersed SANs through a local
Brocade EFCM instance
Brocade EFCM is available in Standard or Enterprise versions:
Brocade EFCM Standard provides advanced functionality that
small and mid-sized organizations can easily deploy and use to
simplify SAN ownership
Brocade EFCM Enterprise is ideal for large, multi-fabric, or multi-
site SANs and is upgradable with optional advanced functionality.
In addition, Brocade EFCM enables third-party product integration
through the Brocade SMI Agent.
Brocade Fabric Manager
Brocade Fabric Manager is a powerful application that manages multi-
ple Brocade FOS SAN switches and fabrics in real time. In particular, it
provides the essential functions for efficiently configuring, monitoring,
dynamically provisioning, and managing Brocade SAN fabrics on a
daily basis.
Through its single-point SAN management platform and integrated
Brocade Web Tools element manager, Brocade Fabric Manager facili-
tates the global integration and execution of management tasks
across multiple fabrics. It is tightly integrated with Brocade FOS and
Brocade Fabric Watch, an optional monitoring and troubleshooting
module. In addition, it integrates with third-party products through
built-in menu functions and the Brocade SMI Agent.
Figure 77. Brocade Fabric Manager displays a topology-centric view of
SAN environments
Brocade Fabric Manager provides unique methods for managing
SANs, including:
Device troubleshooting analysis. Utilizes a diagnostics wizard to
identify device miscommunication, reducing fault determination
time.
Offline zone management. Enables administrators to edit zone
information on a host without affecting the fabric, and then pre-
view the impact of changes before committing them (see the
sketch after this list).
Change management. Provides a configurable fabric snapshot/
compare feature that tracks changes to fabric objects and
membership.
Call home support. Performs automatic data collection and notifi-
cation in case of support issues, facilitating fault isolation,
diagnosis, and remote support.
Streamlined workflow. Utilizes wizards to streamline tasks such
as zoning and the setup of secure and routed fabrics.
Real-time and historical performance monitoring. Collects, dates,
and displays port and end-to-end monitoring data to facilitate
problem determination and capacity planning.
Customized views. Enables administrators to import customized
naming conventions and export information for customized
views, with full integration for Microsoft Office and Crystal
Reports.
Advanced reporting. Includes GUI-based functions for exporting
configuration, performance monitoring, and physical asset data in
a spreadsheet format.
Profiling, backup, and cloning. Enables administrators to capture,
back up, and compare switch configuration profiles, and use clon-
ing to distribute switch profiles within the fabric.
Managing long-distance FCIP tunnels. Provides a wizard to sim-
plify the task of configuring, monitoring, and optimizing FCIP
tunnels and WAN bandwidth usage, including Quality of Service
(QoS) and FICON emulation parameters.
FICON/CUP. Configures and manages FICON and cascaded
FICON environments concurrently in Fibre Channel environments.
Scalable firmware download and repository. Supports firmware
upgrades across logical groups of switches, providing fabric pro-
files and recommendations for appropriate firmware, with
reporting facilities for a SAN-wide firmware inventory.
SAN security. Supports standards-based security features for
access controls and SAN protection, providing support for IPv6,
wizards to enable sec mode, policy editors, and HTTPS communi-
cation between servers and switches.
Launching of third-party management applications. Provides a
configurable menu item to launch management applications from
any switch in a fabric.
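Several of these functions, notably offline zone management and the snapshot/compare feature, come down to the same idea: compute the difference between a staged configuration and the live one so that an administrator can review exactly what a commit would change. The sketch below illustrates that preview step with plain set arithmetic; the zone names, members, and function are invented for illustration and do not represent the Brocade Fabric Manager data model or API.

```python
# Conceptual sketch of an "edit offline, preview, then commit" zoning workflow.
# Zone names, members, and this data model are invented for illustration; they
# do not represent the Brocade Fabric Manager implementation or its API.

def diff_zonesets(active, staged):
    """Summarize what committing `staged` would change relative to `active`."""
    report = {
        "added_zones": sorted(staged.keys() - active.keys()),
        "removed_zones": sorted(active.keys() - staged.keys()),
        "modified_zones": {},
    }
    for name in active.keys() & staged.keys():
        added, removed = staged[name] - active[name], active[name] - staged[name]
        if added or removed:
            report["modified_zones"][name] = {"add": sorted(added), "remove": sorted(removed)}
    return report

# Hypothetical active configuration and an offline edit staged on the host
active = {"backup_zone": {"host_a", "tape_1"}, "oracle_zone": {"host_b", "array_1"}}
staged = {"backup_zone": {"host_a", "tape_1", "tape_2"},
          "oracle_zone": {"host_b", "array_1"},
          "exchange_zone": {"host_c", "array_2"}}

print(diff_zonesets(active, staged))
# Review the preview, then push `staged` to the fabric only if the changes are intended.
```

In practice the preview would be reviewed, and perhaps exported for change control, before the staged zone set is activated on the fabric.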
Brocade Web Tools
Brocade Web Tools, an intuitive and easy-to-use interface, enables
organizations to monitor and manage single Brocade Fibre Channel
switches and small Brocade SAN fabrics. Administrators can perform
tasks by using a Java-capable Web browser from standard laptops,
desktop PCs, or workstations at any location within the enterprise. In
addition, Web Tools access is available from Web browsers through a
secure channel via HTTPS.
To increase the level of detail for management tasks, Web Tools
enables organizations to configure and administer individual ports or
switches as well as small SAN fabrics. User name and password login
procedures protect against unauthorized actions by limiting access to
configuration features. Web Tools provides an extensive set of features
that enable organizations to quickly and easily perform key administra-
tive tasks such as:
Configuring individual switches' IP addresses, switch names, and
Simple Network Management Protocol (SNMP) settings
Rebooting a switch from a remote location
Upgrading switch firmware and controlling switch boot options
Maintaining administrative user logins and passwords
Managing license keys, multiple user accounts, and RADIUS sup-
port for switch logins
Enabling Ports on Demand capabilities
Choosing the appropriate routing strategies for maximum perfor-
mance (dynamic routes)
Configuring links and managing ISL Trunking over extended
distances
Accessing other switches in the fabric that have similar
configurations
Figure 78. Brocade Web Tools Switch Explorer View of the Brocade
48000 Director
Chapter 13: Solutions Products
In late 2007, Brocade created a number of divisions to achieve focus
in the following areas:
Data Center Infrastructure
Server Connectivity
File Management (see Chapter 11: Branch Office and File Man-
agement Products starting on page 143)
Services, Support, and Solutions (S3)
The sections in this chapter reflect relevant services and solutions
from the S3 Division.
Backup and Recovery Services
Corporate data is growing at a dramatic rate. Databases are doubling,
sometimes tripling, every 12 months, while IT resources remain
unchanged. Internet applications and global business practices have
established the 24-hour business day, severely restricting the amount
of downtime available to perform regular data backup procedures.
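The squeeze is easy to quantify: the sustained throughput a backup needs is simply the data volume divided by the available window, so doubling the data against a fixed window doubles the required rate. A minimal sketch, with the data sizes and window lengths as illustrative assumptions:

```python
# Back-of-the-envelope backup-window arithmetic; data sizes and windows are
# illustrative assumptions, not figures from any particular environment.

def required_throughput_mb_s(data_tb, window_hours):
    """Sustained MB/s needed to move `data_tb` terabytes within `window_hours`."""
    return data_tb * 1_000_000 / (window_hours * 3600)

for data_tb, window_hours in [(5, 8), (10, 8), (10, 4), (20, 4)]:
    rate = required_throughput_mb_s(data_tb, window_hours)
    print(f"{data_tb:>3} TB in {window_hours} h -> ~{rate:,.0f} MB/s sustained")
# 10 TB in a 4-hour window already demands roughly 700 MB/s end to end; if the
# database doubles next year and the window stays fixed, the requirement doubles too.
```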
Not long ago, backing up business data was a simple process. Backup
tapes were trucked offsite each night, while a backup administrator
ensured that the software and hardware environment was kept up and
running. In the event of a recovery effort, tapes were trucked back to
the site, loaded into tape drives, and accessed by the backup
administrator.
Today, backup and recovery is very different. The practice of backing
up and recovering data has evolved into a complex, demanding disci-
pline requiring continuous information availability, adherence to
regulatory requirements, and networked data centers. As a result,
many companies are not able to maintain processes that assure the
degree of protection and recoverability they need for their growing
data, much less do so efficiently.
Brocade offers a lifecycle of Backup and Recovery services to help cus-
tomers meet their business challenges:
Backup and Recovery Workshop
Backup and Recovery Assessment and Design Services
Backup HealthCheck Services
Backup and Recovery Implementation Services
Reporting Tool Services
Brocade's Backup and Recovery practice focuses on providing enter-
prise-class backup and recovery solutions that combine hardware,
software, and services with Brocade's best practices for design and
implementation. Brocade consultants have deep knowledge of IBM
Tivoli Storage Manager (TSM) and Veritas NetBackup (NBU), along
with real-world experience and proven practices for planning and
implementing enterprise backup and recovery.
Brocade Virtual Tape Library Solution
To augment Brocade's Backup and Recovery Services, Brocade offers
the Brocade Virtual Tape Library (VTL) Solution. This solution, featuring
a combination of Brocade products, services and support along with
VTL technology from FalconStor, provides customers a cost-effective
way to reduce backup windows, improve backup over the WAN and
enhance disaster recovery capabilities.
The Brocade VTL Solution is a disk-to-disk-to-tape virtualization solu-
tion that complements existing backup and recovery environments,
allowing customers to decrease backup and recovery windows while
leveraging existing infrastructure. It utilizes VTL technology to virtualize
disk and make it appear as a tape library within the SAN, enabling cus-
tomers to re-deploy lower-performing tape devices in remote locations
as an archival tool and leverage higher-performing VTLs as the primary
backup and restore vehicle. With features such as incremental
backup, hierarchical storage, disk-to-disk-to-tape backup via storage
pools, and more, this solution addresses large-scale data backup,
recovery and retention needs.
The Brocade VTL Solution supports:
Integration with backup tape copy: It integrates with existing enter-
prise backup environments, enabling backup applications to
control and monitor all copies of the backup volumes for simpli-
fied management
Remote replication and archiving: It enables organizations to
remotely copy/archive data through FCIP by utilizing Brocade
extension products. In addition, Brocade Tape Pipelining increases
throughput and read and write performance over standard replica-
tion methods (see the sketch after this list), enabling organizations
to redeploy existing tape resources to remote sites for archiving
purposes, over virtually unlimited distances.
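The benefit of pipelining over distance can be seen with a simple latency model. If each tape write must be acknowledged across the WAN before the next is issued, round-trip time alone caps throughput regardless of link bandwidth; keeping many operations in flight hides that latency. The sketch below is a conceptual model only, with an assumed per-exchange transfer size and fiber delay; it is not a description of how Brocade Tape Pipelining is implemented.

```python
# Conceptual model: round-trip latency caps a stop-and-wait tape protocol.
# Figures are illustrative assumptions, not measured or published values.

FIBER_DELAY_S_PER_KM = 5e-6      # ~5 microseconds per km of fiber, one way

def stop_and_wait_mb_s(distance_km, transfer_mb=0.25):
    """Throughput ceiling if each 256 KB exchange waits for an acknowledgment."""
    round_trip_s = 2 * distance_km * FIBER_DELAY_S_PER_KM
    return transfer_mb / round_trip_s

for km in (100, 1000, 3000):
    print(f"{km:>5} km: ~{stop_and_wait_mb_s(km):6.1f} MB/s without pipelining")
# At 3000 km the round trip limits this model to under 10 MB/s no matter how
# fast the link is; keeping many writes in flight restores usable throughput.
```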
To determine the right solution for each customer environment, Bro-
cade backup experts assess the existing customer environment for
overall performance and potential gaps. From that assessment and
recommendation, Brocade can then deploy the most appropriate prod-
ucts, technology and solution for that environment.
Appendix A: The Storage Networking Industry Association (SNIA)
Industry associations embody the contradiction between competitive
interests of vendors and their recognition that the success of individ-
ual vendors is tied to the success of the industry as a whole. The
appropriate homily for industry associations is "rising waters raise all
ships," although occasionally a gunboat will appear as a vendor's com-
petitive drive goes unchecked. An industry association may focus
primarily on marketing campaigns to raise end-user awareness of the
industry's technology, or combine marketing and technical initiatives
to promote awareness and to formulate standards requirements. The
Fibre Channel Industry Association, for example, has organized promo-
tional activity for out-bound messaging through Networld+Interop and
other venues as well as technical work on the SANmark program for
standards compliance. For standardization, the FCIA has worked pri-
marily through the NCITS T11 Committee, to the extent of holding FCIA
meetings and NCITS T11 sessions concurrently.
Overview
The umbrella organization for all storage networking technologies is
the Storage Networking Industry Association, or SNIA. The SNIA has
over 400 member companies and over 7,000 individuals, represent-
ing vendors and customers from a wide variety of storage disciplines
including management software, storage virtualization, NAS, Fibre
Channel, IP storage, disk and tape, and solution providers who offer
certified configurations and support. As with other industry associa-
tions, the SNIA is a volunteer organization with only a few paid staff
positions. Its activity is funded by the monetary and personnel contri-
butions of the membership. The general mission of the SNIA is to
promote the adoption of storage networking technology as a whole,
with the membership itself providing the momentum to accomplish
this goal. The more the membership invests in terms of finances and
volunteer resources, the more the organization can accomplish. The
SNIA's outbound advocacy includes co-sponsorship of Storage Net-
working World conferences, the Storage Developers Conference and
other venues.
Board of Directors
As shown in the organizational chart below, the governing body of the
SNIA is the Board of Directors. Board members are elected by the
membership for two-year terms. The ten elected board members are
supplemented by three at-large board members appointed by the
board itself. The board is responsible for establishing policies and
managing resources of the organization to fulfill the SNIA's mission
and provides oversight to the SNIA committees, industry forums, Initia-
tives, Technical Council, End User Council, the Technical Director and
the SNIA Technology Center.
Figure 79. Storage Networking Industry Association organizational
structure
To ensure involvement in wider SNIA activity, Board members are
encouraged to chair or provide leadership in SNIA committees and
subgroups. This volunteer activity represents a substantial contribu-
tion of time and resources for member companies who participate at
the board level and reveals their commitment to the industry as a
whole. Of course, Board representation also provides an opportunity to
promote specific vendor agendas, although Board representation is
sufficiently diverse to discourage overt vendor-driven initiatives.
Executive Director and Staff
Board activity is supported by a salaried Executive Director and staff.
The Executive Director conducts the day-to-day operations of the orga-
nization and provides logistical support for SNIA meetings and conference
participation. In addition to the Executive Director, SNIA staff includes
the Technical Director, Technology Center Director, Marketing Man-
ager, Membership Manager and other operations and support
personnel.
Board Advisors
The board may receive counsel on industry-related issues from the
Board Advisory Council (BAC), typically former Board members and
interested parties who may attend board meetings and provide input
into Board discussions. Board Advisors can play a critical role in provid-
ing viewpoints on storage networking issues and in helping to promote
the SNIA within the industry.
Technical Council
The technical activity and strategic technical vision of the SNIA is man-
aged by the SNIA Technical Council. The Technical Council is
composed of nine of the top experts within the storage networking
community who volunteer their time and expertise to maintaining the
integrity of SNIA's technical initiatives. In 2001, the Technical Council
produced the SNIA Shared Storage Model as a guide to understanding
storage networking technologies. The Technical Council also oversees
the activity of the technical work groups in cooperation with the Techni-
cal Director.
SNIA Technology Center
The SNIA Technology Center in Colorado Springs was launched in the
spring of 2001 as a multi-purpose facility. The Technology Center was
made possible by a $3.5M grant from Compaq Computer Corporation
to the SNIA. It provides 14,000 square feet of lab and classroom
space and is operated as a vendor-neutral facility by the SNIA. Uses of
the Technology Center include interoperability demonstrations, stan-
dards compliance testing, proof of concept and evaluation
configurations, technology development in support of SNIA technical
work group activity, and training in storage networking technology.
As with other SNIA activities, the Technology Center is dependent on
contributions of money and equipment by member companies. Net-
work Appliance was one of the first vendors to contribute over half a
million dollars worth of equipment in the form of fully configured
NetApp filers, and other vendors have been contributing sponsorships
and equipment to get the center operational. The Technology Center is
a significant and practical step for the SNIA in providing its members
and the customer community a venue for accelerating storage net-
working adoption.
End User Council
Since vendors alone do not determine the useful purposes to which
technology will be put, the SNIA has organized an End User Council
(EUC) to solicit customer representation within the SNIA and customer
input into storage networking strategies. The EUC is composed of
administrators, SAN engineers, architects and support personnel who
have practical, day-to-day responsibility for shared storage operations.
The EUC can thus provide both strategic and tactical input into the
SNIA to help establish priorities and shape the future of storage
networking.
Committees
Much of the non-technical activity of the SNIA is conducted through
Committees. Committees may be chaired by SNIA board members or
other volunteers, with volunteer participation by member companies.
Committees are chartered with various tasks that must be performed
within the vendor-neutral culture of the mother organization. Commit-
tees and work groups have face-to-face meetings at least four times a
year, plus periodic conference calls to track their progress and assign
tasks. Current committees include the Executive, Channel, Standards,
Marketing, Education, International, Interoperability and Strategic Alli-
ances committees.
The Education Committee, for example, is responsible for creating
training and certification programs for the SNIA and creation of SNIA
technical tutorials presented at SNW and other venues. This activity
ranges from training classes held at the SNIA Technology Center to
technology certification through various partnerships. The Education
Committee has also produced the SNIA Dictionary of Storage Network-
ing Terminology.
Depending on time and resources, SNIA member companies may par-
ticipate in any or all of the SNIA committees. Although committee
activity is vendor-neutral and focused on the industry as a whole, par-
ticipation is a means to ensure that a company is adequately
represented in the creation of policies, processes and events that pro-
vide visibility in the market. Committee participation is also a means to
monitor the state of the industry and thus shape vendor strategies to
the consensus of industry peers.
Technical Work Groups
The SNIA technical work groups have been instrumental in formulating
requirements for technology standards that may then be forwarded to
the appropriate standards body for further work. Additional detail on
the activity of each technical work group may be found on the SNIA
web site. Most recently, SNIA work groups have produced the SMI-S
standard and advanced it through ISO as an international standard
benefiting the global community. Technical work groups support a
diversity of interests, from management and backup to security issues.
The Green Storage Technical Working Group, for example, is develop-
ing metrics for monitoring the energy efficiency of storage networking
infrastructure.
SNIA Initiatives
The SNIA currently has three major initiatives to promote the develop-
ment of standards for key areas of storage networking technology.
The SNIA Storage Management Initiative
The Storage Management Initiative (SMI) was created by the SNIA to
develop and standardize interoperable storage management technolo-
gies and aggressively promote them to the storage, networking and
end-user communities. This work has resulted in the approval of the
SMI Specification and the adoption of SMI-S as a common manage-
ment framework by all major storage networking vendors.
The SNIA XAM Initiative
The eXtensible Access Method (XAM) Initiative was formed to serve a
XAM community that includes storage vendors, independent software
vendors, and end users to ensure that a XAM specification fulfills mar-
ket needs for a fixed content data management interface standard.
These needs include interoperability, information assurance (security),
storage transparency, long-term records retention and automation for
Information Lifecycle Management (ILM)-based practices.
The SNIA Green Storage Initiative
The SNIA Green Storage Initiative (GSI) is dedicated to advancing
energy efficiency and conservation in all networked storage technolo-
gies and minimizing the environmental impact of data storage
operations. The GSI's mission is to conduct research on power and
cooling issues confronting storage administrators, educate the vendor
and user community about the importance of power conservation in
shared storage environments, and to provide input to the SNIA Green
Storage TWG on requirements for green storage metrics and
standards.
Industry Forums
To accommodate new storage networking trends within the SNIA
umbrella, the SNIA has created a category of SNIA Industry Forums as
a vehicle for organization and marketing. SNIA Industry Forums enjoy
some autonomy within SNIA, but are chartered within the general
guidelines of SNIA policy. The forum concept enables emergent tech-
nologies and services to leverage the SNIA infrastructure and thus
accelerate development without the need to create a separate indus-
try association.
SNIA Data Management Forum
The Data Management Forum (DMF) is a cooperative initiative of IT
professionals, integrators and vendors working to define, implement,
qualify and teach improved and reliable methods for the protection,
retention and lifecycle management of electronic data and informa-
tion. The DMF is currently operating three initiative-based workgroups:
The Data Protection Initiative (DPI), Information Lifecycle Management
Initiative (ILMI), and The Long Term Archive and Compliance Storage
Initiative (LTACSI). Each initiative is chartered with the development
and deployment of best practices for a specific subset of data man-
agement functions.
SNIA IP Storage Industry Forum
The first forum created under the Industry Forum definition was the IP
Storage Forum. After some initial discussion on its scope, the IP Stor-
age Forum now represents all vendors who are developing block
storage data over IP solutions. Currently, subgroups have been created
for FCIP, iFCP and iSCSI protocols. Over 40 SNIA member companies
are enrolled in the Forum, including new IP storage vendors as well as
established storage networking vendors who are developing IP-based
interfaces for their products. The focus of the IP Storage Forum is mar-
keting and promotion of IP SAN technology. It thus complements the
technical work of the IP Storage Work Group.
SNIA Storage Security Industry Forum
The SNIA Storage Security Industry Forum is tasked with promoting
secure solutions for storage networks, including authentication and
data encryption mechanisms for both Fibre Channel and IP storage
networks. The establishment of this forum is an indicator of the steady
penetration of storage networks into enterprise environments and the
security concerns that have accompanied more widespread
deployment.
Regional Affiliates
Since its formation ten years ago, the SNIA has become an interna-
tional organization with affiliates in over ten geographies including
Australia, New Zealand, Canada, China, Europe, India, Japan, and
South Asia. The SNIA regional affiliates support storage networking
technology development and promotion through local committee and
conference activities.
Summary
The SNIA represents a diversity of technologies that meet on the com-
mon ground of storage networking. Software vendors, hardware
vendors, solutions providers, integrators, consultants, and customers
committed to shared storage can work within the SNIA to advance
their individual and collective interests. As a volunteer organization,
the SNIA solicits involvement by its members and interested individu-
als for committee and work group activity. Additional information on
membership and services of the SNIA is available at www.snia.org.
TOM CLARK
Brocade Bookshelf
www.brocade.com/bookshelf
$39.95
STRATEGIES FOR
DATA PROTECTION
A strategic approach to comprehensive data protection includes
a spectrum of solutions that are essential parts of a coherent
ecosystem. Safeguarding data through data replication or
backup has little value if access to data is impeded or lost
through bad network design or network outage. Consequently,
it is as important to protect data access as it is to protect data
integrity. In this book we examine the key components of an
enterprise-wide data protection strategy, including data center
SAN design and securing data assets in remote sites and
branch offices.
FIRST EDITION