
Hyper-V Live Migration over Distance

Reference Architecture Guide


By Hitachi Data Systems in collaboration with Microsoft, Brocade and Ciena

June 2010

Summary
Hitachi Data Systems, Microsoft, Brocade and Ciena are partnering to architect a robust business continuity
and disaster recovery solution using best-in-class technologies for implementing Microsoft Hyper-V Live
Migration over Distance.
A comprehensive business continuity and disaster recovery plan mandates the deployment of multiple data
centers located far enough apart to protect against regional power failures and disasters. Synchronous remote
data replication is the appropriate solution for organizations seeking the fastest possible data recovery,
minimal data loss and protection against database integrity problems. However, application performance is
affected by the distance and latency between the data centers, which might restrict the location of the data
centers.
Deploying best-in-class storage replication technologies with the Hitachi Universal Storage Platform family,
Hitachi Storage Cluster, Hyper-V Live Migration over Distance and data center interconnect products from
Brocade and Ciena creates a highly available and scalable solution for business continuity where synchronous
replication is a requirement. This document defines a tested reference architecture that supports Live
Migration over Distance.

Feedback
Hitachi Data Systems welcomes your feedback. Please share your thoughts by sending an email message to
SolutionLab@hds.com. Be sure to include the title of this white paper in your email message.

Table of Contents
Proactive Data Center Management
Solution Overview
    Hyper-V Live Migration over Distance Requirements
    Solution Components
Tested Deployment
    Storage Configuration
    Storage Area Network
    Wide Area Network
    Private Fiber Network
    Operating System
    Management Software
Deployment Considerations
    Storage Bandwidth
    Storage Replication Paths
    Storage Redundancy
    Storage System Processing Capacity
    Professional Services
Lab Validated Results
Conclusion
Appendix A: Bill of Materials
Appendix B: References
    Hitachi
    Brocade
    Ciena

Hyper-V Live Migration over Distance


Reference Architecture
Hitachi Data Systems, Microsoft, Brocade and Ciena are partnering to architect a robust business continuity
and disaster recovery solution using best-in-class technologies for implementing Microsoft Hyper-V Live
Migration over Distance.
A comprehensive business continuity and disaster recovery plan mandates the deployment of multiple data
centers located far enough apart to protect against regional power failures and disasters. Synchronous remote
data replication is the appropriate solution for organizations seeking the fastest possible data recovery, minimal
data loss and protection against database integrity problems. However, application performance is affected by
the distance and latency between the data centers, which might restrict the location of the data centers.
The main drawback to synchronous replication is its distance limitation. Fibre Channel, the primary enterprise
storage transport protocol, is limited only by its physical layer flow control mechanism. However, application
response time becomes a problem as propagation delays lengthen with increased distance. Propagation
delays can significantly slow servers by forcing them to wait for confirmation of each storage operation at local
and remote sites. This means that the practical distance limit for synchronous replication is about 200
kilometers, or 125 miles, depending on the application response time tolerance and other factors.

Deploying best-in-class storage replication technologies with the Hitachi Universal Storage Platform family,
Hitachi Storage Cluster, Hyper-V Live Migration over Distance and data center interconnect products from
Brocade and Ciena creates a highly available and scalable solution for business continuity where synchronous
replication is a requirement. This document defines a tested reference architecture that supports Live Migration
over Distance.

Planning and implementation of this solution requires professional services from Hitachi Data Systems Global
Solutions Services.
This white paper is written for storage and data center administrators charged with disaster recovery and
business continuity planning. It assumes the reader has general knowledge of Microsoft Failover Clustering,
local and wide area networking and storage area networks.

Proactive Data Center Management


Implementing this solution enables you to proactively manage your data center: for example, to move workloads ahead of a known impending disaster such as a hurricane, to balance workloads among multiple data centers, or to ease consolidation of data centers.
This solution provides flexibility and agility for a high-availability computing environment. It enhances the
benefits derived through server virtualization:

Perceived zero data center downtime for maintenance: Perform virtually any data center maintenance task during normal working hours without affecting end users by simply moving the affected applications either within the data center or to a remote site.

Workload balance: Dynamically and non-disruptively move workloads between data centers.

Disaster avoidance: Move applications in the case of an impending disaster.

Data center consolidation and migration: Move workloads between data centers.

Ease of management: Relieve storage and data center administrators of the need to learn many different tools or products. Hitachi Storage Cluster and Hyper-V live migration use standard cluster management interfaces to execute all operations such as application failover and failback between sites. Live migration is performed using the Failover Cluster GUI, Virtual Machine Manager (VMM) or PowerShell scripting, as illustrated in the sketch below.
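As a hedged illustration of the PowerShell option, the following sketch uses the FailoverClusters module included with Windows Server 2008 R2 to start a live migration from the command line. The cluster, virtual machine and node names are hypothetical placeholders, not names from the tested environment.

    # Minimal sketch, assuming the FailoverClusters PowerShell module is available
    # and the VM is already configured as a highly available clustered role.
    Import-Module FailoverClusters

    # Live migrate the clustered VM "SQLVM01" to the remote-site node "HVNODE09".
    Move-ClusterVirtualMachineRole -Cluster "HVCLUSTER" -Name "SQLVM01" `
        -Node "HVNODE09" -MigrationType Live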

Solution Overview
A highly available, highly scalable Hyper-V failover cluster that supports Live Migration over Distance requires
highly available, highly scalable storage, storage fabric and networks. Because this solution uses Hitachi
Storage Cluster, Hitachi TrueCopy Synchronous software and Hyper-V Failover Clusters, virtual machines
can be migrated between storage systems across distance with minimal intervention.
This solution supports up to 16 Hyper-V host nodes in a multi-site failover cluster, with high availability
achieved at the local and remote sites with redundant physical paths enabled via multiple host bus adapters
(HBAs) from the servers. Proper zoning within the storage fabric and the use of multipathing software allows for
continued operation in the event of a hardware component failure. Redundant high-speed network and fabric
interconnects enable continued operation in the event of a hardware failure, ensuring high availability and
performance across geographically separated sites.
For this reference architecture, a distance of 200 kilometers was tested by using spools of optical fiber between the DWDMs at each site.
This reference architecture uses the Hitachi Universal Storage Platform VM as the storage platform, Hyper-V
Failover Clustering supporting live migration of VMs, and a Brocade network and fabric architecture to provide
connectivity across data centers. Note that although this reference architecture was tested on a Universal
Storage Platform VM, it can be deployed on a Universal Storage Platform V as well.
Figure 1 illustrates the reference architecture described by this white paper.

Figure 1. Reference Architecture

In this reference architecture, each site hosted eight Hyper-V servers connected to the Universal Storage
Platform VM via a Brocade DCX Backbone director at the local site and a DCX-4S director at the remote site.
To support the storage replication and rapid movement of VMs between the local and remote sites, Hitachi
Storage Cluster was implemented. A 1/10GbE Brocade TurboIron switch and a 10GbE NetIron XMR router at
each site, along with a Ciena 4200 DWDM, provided the network infrastructure to support both storage
replication traffic and Hyper-V Live Migration over Distance traffic.
To support the bandwidth and latency requirements of the storage replication and Live Migration over Distance traffic, both Fibre Channel over IP (FCIP) inter-switch links (ISLs) and native Fibre Channel ISLs were configured. ISLs connect the switches into a single switched fabric. This was done for the following reasons:

To provide redundancy in case of an ISL failure, each FCIP and Fibre Channel ISL was composed of multiple physical links. If a physical link within an ISL fails, the availability of that ISL is not affected. The FCIP ISL comprised three physical 1Gb links and the Fibre Channel ISL comprised two physical 4Gb links, for both availability and performance.

To validate this reference architecture for storage replication traffic using Fibre Channel over IP.

To validate this reference architecture for Fibre Channel ISL links over distance.

Testing used SQL Server transaction workloads along with Iometer workloads to validate this solution in terms
of bandwidth and latency capabilities. This ensured that live migrations occurred in a timely fashion and with no
perceivable outage to end users.

A total of 3Gb of bandwidth was configured to support the FCIP links, and validation of throughput and latency was performed on these links. In the lab, Hitachi Data Systems was able to move, on average, 400MB/sec of write traffic across the replication links with average response times of less than 20ms. Brocade's hardware compression on the FCIP ISLs within the DCX directors accounted for the increased throughput.
A total of 8Gb of bandwidth was configured to support the Fibre Channel ISLs, and validation of throughput and latency was performed on these links. Testing showed that Hitachi Data Systems was able to move, on average, 480MB/sec of write traffic across the replication links with average response times of less than 20ms. Additional bandwidth was still available across the Fibre Channel ISLs, but increasing the link utilization requires additional resources within the Universal Storage Platform VM storage systems, which were not available in the test environment.
This solution validated that multiple parallel live migrations completed successfully across distance between nodes in the Hyper-V cluster. Because live migration is restricted to pairs of nodes in the cluster, this reference architecture's 16-node cluster design is limited to eight parallel live migrations. Hitachi Data Systems successfully conducted eight simultaneous migrations in its testing of this reference architecture.
For more information about validation testing, see the Lab Validated Results section of this paper.
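As a hedged sketch of how several migrations between distinct node pairs could be started in parallel from PowerShell (the testing itself used VMM), the following example launches each move as a background job. The VM-to-node mapping is a hypothetical placeholder.

    # Minimal sketch, assuming the FailoverClusters module; VM and node names are
    # hypothetical. Each migration targets a different destination node so that the
    # one-migration-per-node-pair restriction of Windows Server 2008 R2 is respected.
    Import-Module FailoverClusters

    $migrations = @{ "VM01" = "RemoteNode01"; "VM02" = "RemoteNode02"; "VM03" = "RemoteNode03" }

    $jobs = foreach ($vmName in $migrations.Keys) {
        Start-Job -ScriptBlock {
            param($name, $targetNode)
            Import-Module FailoverClusters
            Move-ClusterVirtualMachineRole -Name $name -Node $targetNode -MigrationType Live
        } -ArgumentList $vmName, $migrations[$vmName]
    }

    # Wait for every migration to finish and display the results.
    $jobs | Wait-Job | Receive-Job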

Hyper-V Live Migration over Distance Requirements


Hyper-V Live Migration over Distance has the following infrastructure requirements:

An IP network that can support the bandwidth requirements of the virtual machines that will be migrated.
This requirement can vary based on the number of modified pages that might need to be moved across the
IP network for a particular virtual machine.

A Fibre Channel IP or Fibre Channel network that can support the bandwidth and latency requirements for
storage replication across distance.

The source and destination Hyper-V hosts are required to have a private live migration IP network on the same subnet and broadcast domain.

The IP subnet that is utilized by the virtual machines must be accessible from both the local and remote servers. When live migrating a virtual machine between the local and remote site, the virtual machine must retain its IP address so that TCP communication continues during and after the migration, as illustrated in the sketch below.
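The requirement that the live migration network present the same subnet at both sites can be reviewed from any cluster node. The following is a minimal sketch using the FailoverClusters module; the cluster name is a hypothetical placeholder.

    # Minimal sketch: list every cluster network with its role and address range.
    # The private live migration network must expose the same subnet and broadcast
    # domain to the Hyper-V nodes at both the local and remote sites.
    Import-Module FailoverClusters

    Get-ClusterNetwork -Cluster "HVCLUSTER" |
        Format-Table Name, Role, Address, AddressMask, State -AutoSize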
The following variables affect live migration speed:

The number of modified pages on the VM to be migrated; the larger the number of modified pages, the
longer the VM remains in a migrating state

Available network bandwidth between source and destination physical computers

Hardware configuration of source and destination physical computers

Load on source and destination physical hosts

Available bandwidth (network or Fibre Channel) between Hyper-V physical hosts and shared storage

Solution Components
The following sections describe the components that make up this solution.

Hitachi Universal Storage Platform VM


This solution used two Hitachi Universal Storage Platform VMs as reliable, flexible, scalable and cost effective
storage systems for the Live Migration over Distance architecture. The Hitachi Universal Storage Platform VM
brings performance and ease of management to organizations of all sizes that are dealing with an increasing
number of virtualized business-critical applications. It is ideal for a failover clustering and storage replication
environment that demands high availability, scalability and ease-of-use.

This reference architecture uses Hitachi Dynamic Provisioning software to provision virtual machines. The
Universal Storage Platform VM with Hitachi Dynamic Provisioning software supports both internal and external
virtualized storage, simplifies storage administration and improves performance to help reduce overall power
and cooling costs.
Although this solution was tested on a Universal Storage Platform VM, it is also appropriate for use with
Universal Storage Platform V.

Hitachi TrueCopy Synchronous Software


Hitachi TrueCopy Synchronous software allows you to create and maintain duplicate copies of all user data
stored on a Hitachi Universal Storage Platform storage system for data duplication, backup and disaster
recovery scenarios. Data is replicated from a primary Universal Storage Platform storage system to a
secondary Universal Storage Platform storage system either in a local data center or across geographically
dispersed data centers. With TrueCopy Synchronous software, the remote copy of the data is always identical
to the local copy, which allows for fast restart and recovery of the data at the remote site.
During normal TrueCopy Synchronous software operations the primary volumes remain online to all hosts and
continue to process both read and write operations. In the event of a planned or unplanned outage, write
access to the secondary copy of the data can be invoked to allow recovery or migration with complete data
integrity.

Hitachi Storage Cluster


Hitachi Storage Cluster for Microsoft Hyper-V is a business continuity and disaster recovery solution for Hyper-V virtualized environments. Hitachi Storage Cluster enables the replication of virtual machines and their
associated data either locally or across geographically dispersed sites. Hitachi Storage Cluster provides for
automated or manual failover and failback of virtual machines and data resynchronization using either live or
quick migration.
Data replication and control are handled by the Hitachi Storage Cluster software and the storage system
controllers. This fully automated process has little effect on the applications running in the VM guest partitions.
Consistency groups and time-stamped writes ensure database integrity. This solution consists of LU replication
between the two geographically dispersed sites, with automated failover of VMs resources to the secondary
site in the event that the local site goes down, or a failover is initiated manually either through Virtual Machine
Manager (VMM) or the Failover Cluster GUI.
Through the implementation of a generic resource script in the cluster resource group for a particular VM, Hitachi Storage Cluster (HSC) ensures that the replication traffic is moving in the correct direction based on the owning node of the VM in the cluster. In the event of a failover, HSC controls cluster access to the disk resources by ensuring that the disks are ready for access and by controlling the order in which resources come online within the cluster group.
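Hitachi Storage Cluster supplies its own resource script as part of the product; the following is only a hedged sketch of the underlying cluster mechanism, showing how a Generic Script resource can be registered in a VM's cluster group and made a dependency of the virtual machine configuration. The resource, group and script path names are hypothetical.

    # Minimal sketch of the generic cluster mechanism only; the actual Hitachi Storage
    # Cluster script is delivered with the HSC product and is not reproduced here.
    Import-Module FailoverClusters

    # Register a Generic Script resource inside the VM's cluster group (names are hypothetical).
    Add-ClusterResource -Name "HSC Replication Control" -ResourceType "Generic Script" `
        -Group "SQLVM01 Group" -Cluster "HVCLUSTER"

    # Point the resource at the control script (hypothetical path).
    Get-ClusterResource -Name "HSC Replication Control" -Cluster "HVCLUSTER" |
        Set-ClusterParameter -Name ScriptFilepath -Value "C:\HSC\ReplicationControl.vbs"

    # Bring the VM configuration online only after the script resource reports that
    # the replicated disks are ready for access.
    Add-ClusterResourceDependency -Resource "SQLVM01 Configuration" `
        -Provider "HSC Replication Control" -Cluster "HVCLUSTER"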
Virtual machines run as cluster resources within the Hyper-V cluster. If a node within the cluster that is hosting
the virtual machine fails, the virtual machine automatically fails over to an available node. The virtual machines
can be quickly moved between cluster nodes to allow for planned and unplanned outages. With Hitachi
Storage Cluster, the replicated LUs and the virtual machines are automatically brought online.

Figure 2 illustrates how multiple virtual machines and their associated applications can be made highly
available using Live Migration over Distance with Hitachi Storage Cluster.
Figure 2. Hitachi Storage Cluster for Hyper-V

Microsoft Windows Server 2008 R2


Windows Server 2008 R2 adds powerful enhancements to Hyper-V including increased availability, improved
management and simplified deployments. Windows Server 2008 R2 provides a new feature called live
migration, which allows for the movement of virtual machines across physical hosts in the datacenter with no
perceived downtime by the application and its users. In this reference architecture, live migration is used to
move virtual machines across extended distance to physical hosts located in a geographically separate
datacenter.
Live migration and quick migration both move running virtual machines from one Hyper-V physical host to
another. Quick migration saves, moves, and restores a VM, which results in some downtime to the end user.
Live migration uses a different process for moving the VM to another Hyper-V host:
1. Transfer all VM memory pages from the source Hyper-V host to the destination Hyper-V host over the
network.
While these pages are being transferred, all modifications to the VM's memory are tracked.
2. Transfer memory pages that were modified while Step 1 was executed to the destination Hyper-V host.
3. Move storage resources to the destination Hyper-V host computer.
4. Bring the destination VM online on the destination Hyper-V host computer.

Live migration produces significantly less downtime than quick migration for the VM being migrated. This
makes live migration the preferred method when users require uninterrupted access to the VM that is being
migrated. Because a live migration completes in less time than the TCP timeout for the migrating VM, users
will not experience any outage even during steps 3 and 4 of the migration.
Because live migration moves virtual machines over the Ethernet network, the following networking features within Microsoft Windows Server 2008 R2 enhance Live Migration over Distance:

The ability to specify, at the NIC level, which physical network live migration uses when moving a virtual machine's configuration and memory pages across the network. For this reference architecture, the 10GbE Brocade network hosts the live migration network traffic for both performance and throughput reasons.

Support for Jumbo Frames, which allows for larger payloads per network packet, improving overall throughput and reducing CPU utilization for large transfers. This reference architecture uses Broadcom NICs that support Jumbo Frames. A sketch for verifying jumbo frame support follows this list.

Support for VM Chimney, which allows a virtual machine to offload its network processing load onto the NIC of the physical Hyper-V host computer. This improves CPU and overall network throughput performance and is fully supported by live migration. This reference architecture uses Broadcom NICs that support VM Chimney.
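As a hedged sketch (not part of the tested procedure), the commands below check and raise the interface MTU and then confirm that jumbo frames actually traverse the path to the remote site. The interface name and remote address are hypothetical, and the NIC driver's own jumbo frame setting must also be enabled through its advanced properties.

    # Show the current MTU of each IP subinterface, including the live migration NIC.
    netsh interface ipv4 show subinterfaces

    # Raise the MTU of the live migration interface (hypothetical name) to 9000 bytes.
    netsh interface ipv4 set subinterface "Live Migration" mtu=9000 store=persistent

    # Send a non-fragmenting 8972-byte ping (9000 bytes minus 28 bytes of IP and ICMP
    # headers) to a remote-site host to confirm the path carries jumbo frames end to end.
    ping -f -l 8972 192.168.10.20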
For more information about Hyper-V Live Migration, see the Hyper-V Live Migration Overview and Architecture
white paper.

Microsoft Virtual Machine Manager 2008 R2


Virtual Machine Manager 2008 R2 (VMM) is Microsoft's management solution for the virtualized data center. VMM enables the consolidation of multiple physical servers onto Hyper-V host servers, allowing them to run as guest virtual machines. VMM also provides for the rapid provisioning of virtual machines and unified management of the virtual infrastructure through one console. This reference architecture uses VMM to manage the virtual machines in the failover cluster and to initiate and track live migrations.
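As a hedged sketch of initiating a migration through the VMM 2008 R2 PowerShell snap-in (cmdlet names as documented for that release; server, VM and host names are hypothetical), moving a highly available VM to a remote-site host looks roughly like the following.

    # Minimal sketch, assuming the VMM 2008 R2 administrator console and its
    # PowerShell snap-in are installed; all names below are hypothetical.
    Add-PSSnapin Microsoft.SystemCenter.VirtualMachineManager

    # Connect to the VMM server that manages the stretched Hyper-V cluster.
    $vmmServer = Get-VMMServer -ComputerName "vmm01.example.local"

    # Select the running VM and the destination Hyper-V host at the remote site,
    # then move the VM; for clustered 2008 R2 hosts this results in a live migration.
    $vm          = Get-VM -Name "SQLVM01" -VMMServer $vmmServer
    $destination = Get-VMHost -ComputerName "HVNODE09" -VMMServer $vmmServer
    Move-VM -VM $vm -VMHost $destination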

Brocade Data Center Fabric Manager


Brocade Data Center Fabric Manager (DCFM) is a comprehensive network management application that enables end-to-end management of data
center fabrics. Virtualization requires the careful management of the application stack rather than hardware
components to enable seamless mobility of applications and data throughout the data center fabric. Brocade
fabric management software centralizes the management of large, multi-fabric or multi-site storage networks,
improving visibility throughout the data center fabric and the virtual connections between servers and storage.
With enterprise-class reliability and scalability and advanced features such as proactive monitoring and alert
notification, Brocade management solutions help optimize storage resources in virtual environments and
maximize the performance of the data center fabric.

Brocade DCX Backbone and DCX-4S Directors


Brocade DCX and DCX-4S Backbone directors are a new class of fabric infrastructure platform. The DCX and
DCX-4S fit the requirements for extraordinary performance (bandwidth, port count, and power efficiency), non-disruptive scalability, continuous availability and end-to-end security, as required by this architecture. The DCX and DCX-4S improve infrastructure utilization; simplify provisioning, capacity planning and management; and
reduce infrastructure costs. The Brocade FX8-24 extension blade used in this solution provides Fibre Channel
over IP connectivity between sites.

Brocade Host Bus Adapters


The Brocade 8Gb Host Bus Adapters (HBAs) used in this reference architecture lay the foundation for
extending fabric intelligence to servers and through the network to virtual machines, applications, and services,
enabling end-to-end storage network management. This approach provides tighter integration across the
enterprise, including both physical and virtual infrastructure. Brocade HBAs provide robust and powerful
storage connectivity for virtual servers, helping to ensure that flexibility does not come at the price of
performance, reliability, or scalability. They also provide up to 8Gbit per second performance along with greater
data protection for both physical and virtualized environments. The flexible Brocade architecture simplifies management of virtual connections, providing the ability to guarantee service levels, monitor I/O history and isolate traffic per virtual machine.

Brocade NetIron Routers


The Brocade NetIron XMR series of routers features Brocade Direct Routing (BDR) technology for full
forwarding information base (FIB) programming in hardware, together with hardware-based, wire-speed access
control lists (ACLs) and policy-based routing (PBR) for robust, high performance IPv4, IPv6 and Layer 3 VPN
routing.
NetIron XMR routers provide the high availability, advanced failure detection and network traffic protection and
restoration schemes required to support Hyper-V Live Migration over Distance. The routers provide complete
hardware redundancy with resilient software featuring hitless failover and hitless software upgrades with
graceful restarts for maximizing router uptime. The multi-service IronWare operating system, powering the
NetIron XMR routers, offers advanced capabilities for rapid detection and bypass of link and node failures.

Brocade TurboIron Switches


This reference architecture uses the Brocade TurboIron 24X switch because it is a compact, high-performance,
high-availability and high-density 10GbE switch that meets the mission-critical data center requirements for
Live Migration over Distance. With an ultra-low-latency, cut-through, non-blocking architecture, the TurboIron
24X provides cost-effective server connectivity.
The TurboIron 24X can support 1GbE servers until they are upgraded to 10GbE-capable network interface
cards (NICs), simplifying migration to 10GbE server farms. The servers deployed in this architecture were
equipped with 1GbE network cards. The TurboIron 24X was deployed to save rack space, power and cooling in
the data center while delivering 24-hour-a-day, seven-day-a-week service through its high-availability design.
The TurboIron 24X can be configured with internal power redundancy features, which are usually available only in a modular chassis form factor. Every TurboIron 24X has a single AC power supply, but for this reference architecture an additional AC power supply was added for redundancy. The AC power supplies are hot-swappable and load-sharing with auto-sensing and auto-switching capabilities, which are critical for power redundancy and deployment flexibility.

Ciena 4200 DWDM


This reference architecture uses Ciena's Virtual Optical WAN architecture with the CN 4200 FlexSelect Advanced Services Platform and associated modules to transpond or multiplex client ports over the network. This Ciena Virtual Optical WAN provides the highest bandwidth and lowest latency available today over Fibre Channel networks.
This reference architecture uses the Optical Transport Network (OTN/709) to support long distance virtual
machine migration and storage replication. With inherent Layer 1 protection and deterministic performance,
OTN is more reliable and easier to manage than native Ethernet solutions. This solution provides the most
efficient and flexible consumption of metro bandwidth for connectivity services with right-sized transport
tunnels. This approach enables flexible configurations to allocate only as much capacity to each application
and protocol channel as needed within a wavelength.
Ciena's OTN implementation also benefits SAN fabric connectivity. Ciena can accommodate Fibre Channel 8Gb or 10Gb ISLs on individual 10Gb wavelengths. However, Ciena recommends a more efficient transport using 4Gb Fibre Channel links, because three 4Gb Fibre Channel ISLs can be efficiently transported using one 10Gb wavelength.
The optical-fiber-based bandwidth solutions deployed in this reference architecture are well suited to business continuity applications, providing the ideal transport for storage networking and for migration of virtual machines between data centers.

Tested Deployment
The following sections describe the tested deployment of Hyper-V Live Migration over Distance in the Hitachi
Data Systems laboratory.

Storage Configuration
Hitachi TrueCopy Synchronous software requires the use of a Hitachi Universal Storage Platform V or
Universal Storage Platform VM at the local site that contains the primary volumes and a Universal Storage
Platform V or Universal Storage Platform VM at the remote site for the secondary volumes. Testing conducted
to develop this reference architecture used a Universal Storage Platform VM.
The Universal Storage Platform VM storage system at the local site is known as the Main Control Unit (MCU)
and the Universal Storage Platform VM at the remote site is known as the Remote Control Unit (RCU). Remote
paths connect the two Universal Storage Platform VM storage systems over distance. The tested deployment
used two 4Gb Fibre Channel remote copy connections.
Table 1 lists the configuration specifications for the two Universal Storage Platform VMs deployed in this
reference architecture.
Table 1. Deployed Storage System Configuration

Storage system: Hitachi Universal Storage Platform VM
Microcode level: 60-06-21
RAID group type: RAID-5 (3+1)
Cache memory: 128GB
Front-end ports: 32 Fibre Channel ports
TrueCopy ports: 2 x 4Gb Fibre Channel ports
Drive capacity: 300GB
Drive type and number: 24 Fibre Channel 15K RPM
Number of Dynamic Provisioning pools:
Number of drives in pool: 24
Number of RAID-5 (3+1) groups in pool:
Number of LUs in pool: 88
Size of each virtual LU: 50GB
Number of VMs deployed:
Number of LUs per VM: 10

Storage Area Network


For this solution, Hitachi Data Systems connected the Hyper-V servers and the Hitachi Universal Storage
Platform VM through Brocade DCX enterprise-class directors, a DCX Backbone at the local site and a DCX-4S
at the remote site. Multiple links between the directors across distance were created to form inter-switch links
(ISLs). To support the storage replication traffic, two 4Gb Fibre Channel ISLs were trunked together to create
a single logical trunk that provides up to 8Gb/sec throughput. This was done to evenly distribute traffic across
all the ISLs, to ensure high availability and reliability if an ISL within the trunk fails, and to provide load
balancing by using Dynamic Path Selection (DPS).

In addition, to provide additional throughput, availability and performance for the storage replication traffic,
multiple Fibre Channel over IP links were configured between the directors across distance. Fibre Channel
over IP trunking was implemented to combine multiple FCIP links into a high bandwidth FCIP trunk spanning
multiple physical ports to provide load balancing and network failure resiliency. For more information, see the
Wide Area Network section of this paper.
This solution uses two redundant paths from each Hyper-V host to the Universal Storage Platform VM. Each Hyper-V host had dual-path Brocade HBAs configured for high availability. Microsoft's MPIO software provided a round-robin load balancing algorithm that automatically selects a path by rotating through all available paths, thus balancing the load across all available paths and optimizing IOPS and response time.
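As a hedged sketch of reviewing and setting the MPIO policy with the in-box mpclaim.exe utility on Windows Server 2008 R2: the policy index used for round robin (2) is an assumption to verify against mpclaim's built-in help before applying it.

    # Show the MPIO disks and the load balance policy currently in effect.
    mpclaim -s -d

    # Set the default load balance policy for MPIO disks to Round Robin
    # (policy index 2 is assumed; confirm with "mpclaim -?").
    mpclaim -l -m 2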
Figure 3 illustrates the storage area network configuration for the stretched 16-node Hyper-V failover cluster
used for this reference architecture.
Figure 3. Live Migration over Distance SAN Configuration

Table 2 lists the Brocade director configuration and firmware levels for the HBAs and the directors deployed in
this solution.
Table 2. Deployed Brocade Configuration and Firmware Levels

Brocade DCX Backbone: 22 Fibre Channel ports; firmware 6.3.0.b
Brocade DCX-4S: 22 Fibre Channel ports; firmware 6.3.0.b
Brocade HBA 825: Storport Miniport driver 2.1.0.2; firmware 2.1.0.2


Wide Area Network


Multiple networks were deployed in the reference architecture to support the high-speed performance and throughput requirements of live migration. The following sections describe the architecture for extending the LAN over distance and the network architecture deployed to support storage replication over distance.

LAN Extension over Distance


The Brocade TurboIron 24X 1/10GbE switches at the local and remote sites were connected to the NetIron XMR 10GbE Layer 2/3 switches and then to the Ciena CN 4200 DWDM.
During live migration of a virtual machine, IP traffic for migrating the virtual machine flows over the 10GbE
network, as shown in Figure 4.
Figure 4. Live Migration Network

Storage Replication over the Network


For this reference architecture, both Fibre Channel over IP links and Fibre Channel links were configured to support storage replication across distance. The Brocade FX8-24 extension blade provides the Fibre Channel and IP connections for distance routing for the Fibre Channel over IP and Fibre Channel links.


Figure 5 shows the network configuration implemented for both the IP network that supports cluster
management traffic and live migration traffic and the Fibre Channel over IP and Fibre Channel links that
support storage replication.
Figure 5. Storage Replication Network


Private Fiber Network


The Ciena CN 4200 switches deployed at both the local and remote sites were configured to transport the 10GbE connections from Brocade's NetIron routers, while also transporting multiple 4Gb Fibre Channel ISLs from the DCX Fibre Channel directors, as shown in Figure 6. Two principal modules were configured in the CN 4200, the F10-T and the FC4-T, to support transport of the IP and Fibre Channel traffic. The Ciena 4200 DWDM network saves scarce fiber resources and avoids the cost of leased lines to support this architecture.
Figure 6 shows the DWDM deployed for this reference architecture.
Figure 6. DWDM Configuration

The F10-T Transponder module provides transponding and regeneration of various 10Gb signals. For this
solution, the 10GbE connections from the NetIron XMRs connected directly into the F10-T.
The FC4-T Muxponder aggregates up to three 4Gb Fibre Channel ports across a 10Gb wavelength. The FC4-T is a Fibre Channel aggregation card that can be provisioned to handle FC400 clients. For this architecture, the Fibre Channel links from the Brocade DCXs were plugged directly into the FC4-T.

Operating System
Microsoft Windows Server 2008 R2 was deployed on 16 Hyper-V servers across geographically dispersed
datacenters, with eight servers deployed on the local site and eight servers on the remote site. Multiple
network connections were deployed to support the cluster management network and the live migration
network. Table 3 lists the deployed server hardware and software.


Table 3. Deployed Servers and Operating Systems

Dell 2950: Hyper-V host server, quantity 16; Windows Server 2008 R2; 4 x Quad-Core AMD Opteron processor, 1.9 GHz; 12GB RAM
HP DL585: Domain controller and DNS; Windows Server 2008 R2; 2 x Intel Xeon, 1.9 GHz; 2GB RAM
HP DL585: Management server for Hitachi Storage Navigator software and Brocade DCFM; Windows Server 2008 R2; 2 x Intel Xeon, 1.9 GHz; 2GB RAM

Management Software
This section describes the software deployed to support the Hyper-V Live Migration over Distance architecture.
Table 4 lists the software used in this reference architecture.
Table 4. Deployed Management Software

Microsoft Virtual Machine Manager 2008: Release 2
Hitachi Storage Navigator: 7.0
Hitachi Performance Monitor: 7.0
Microsoft MPIO: 006.0001.7600.6385
Brocade Data Center Fabric Manager: 10.4.0

Deployment Considerations
The following sections describe key considerations for planning a deployment of this solution.

Storage Bandwidth
To maintain a continuous replica copy, the bandwidth available across the TrueCopy links must be greater than the average write workload that occurs during any recovery point objective (RPO) interval. This means that if an organization wants to maintain an RPO of twenty minutes, the twenty-minute interval with the greatest write activity must be identified. With that information, the bandwidth required to keep up with this traffic can be calculated, as in the sketch below.
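The following is a minimal arithmetic sketch of that calculation in PowerShell; the workload figures are hypothetical examples, not measurements from this reference architecture.

    # Hypothetical example: 36,000MB written during the busiest twenty-minute interval.
    $rpoMinutes  = 20
    $peakWriteMB = 36000

    # Average write rate the replication links must sustain over that interval.
    $requiredMBps = $peakWriteMB / ($rpoMinutes * 60)

    # Convert to link bandwidth in Gb/sec (8 bits per byte, 1000MB per GB assumed).
    $requiredGbps = ($requiredMBps * 8) / 1000

    "{0:N1} MB/sec ({1:N2} Gb/sec) of replication bandwidth required" -f $requiredMBps, $requiredGbps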

Storage Replication Paths


In many respects, replication traffic is processed much like any other workload on the storage subsystem. Write I/O on the primary storage system is transferred across the replication links to a secondary storage system. For Hitachi TrueCopy Synchronous software, this traffic uses SAN Fibre Channel connections from initiator (MCU) ports on the local storage system to RCU target ports on the remote storage system. These Fibre Channel paths have a specific bandwidth capacity, and multiple connections might be required to ensure sufficient capacity.
It is also important to ensure that the Brocade directors are properly sized to carry the storage replication traffic
over the ISL links.

Storage Redundancy
It is important to identify redundancy requirements, and to ensure that multiple links are available over the
network to carry replication traffic in case of failure.


Storage System Processing Capacity


TrueCopy software operations incur overhead on the storage system. Additional processor cycles are required to service the additional workload for remote copy operations. On Hitachi storage systems, this additional workload occurs within the processors for the front-end director ports. Allocate sufficient front-end director ports to accommodate the production write workload alongside the additional requirements for handling remote copy operations.

Professional Services
Planning and implementation of this solution requires professional services from Hitachi Data Systems Global
Solutions Services.

Hitachi Remote Copy Planning and Design Service


This required service assists you with costly bandwidth decisions and distance data replication challenges. Using Hitachi Data Systems remote replication best practices, consultants produce a detailed study of the current workload environment and make the bandwidth recommendations necessary to support and improve the customer's remote copy environment.
This service provides customers with a high-level design for the remote-replication solution, as well as a
detailed analysis of workload and performance characteristics to help support potentially expensive bandwidth
decisions. This service includes the following reports:

Bandwidth recommendation

High-level strategy for implementing distance data replication

Hardware and software audits of the host and storage environments identified for inclusion in the
replication environment

Workload characteristics

Objectives for the replication environment

Documentation of mechanisms to support the customer's replication objectives

Strategic recommendations

Copy-group configuration recommendations

Hitachi Storage Cluster for Microsoft Environments Service


This required service allows customers to improve mission-critical availability and reliability in their Microsoft
Windows Server 2008 Hyper-V environments by leveraging Microsoft clustering with Hitachi replication
software. This turnkey solution provides the following services:

Discovery and assessment of your existing Microsoft and replication environments

Implementation planning

Process-based automation of Hitachi TrueCopy Remote Replication software

Production implementation assistance, including Microsoft Hyper-V host configuration

Testing

Knowledge transfer (including hardware and software configuration, command scripts, host operations, and
control mechanisms)


Lab Validated Results


The goals of the joint testing were to measure the throughput and response times across the Fibre Channel over IP and Fibre Channel extended distance links, to measure the duration of a live migration between sites, and to test the live migration of an application between sites while clients were accessing the application. For the last goal, testing used SQL Server with clients accessing an AdventureWorks database.
The throughput and response time testing consisted of migrating live VMs under a test workload from a source
server on the local site to a server on the remote site. Hitachi Data Systems initiated the test by using a load
generator to direct reads and writes to the VMs running on the local host. The size of the memory allocated to
the virtual machine was also modified to understand the effect on migration times as the memory size
increased.
After the targeted loading level was attained on the local host, live migration to the remote host was initiated
manually using VMM. Hitachi Data Systems monitored the tests until live migration completed successfully.
This included verifying that migration completed successfully without VM outage and that application operation
was uninterrupted during the live migration. VMM started and monitored the migration during each test.
In each test, one VM was migrated from the source to the destination server; migrating two VMs in parallel from the source to the destination server was also tested. For each server pair, migration was successful in both directions.
Table 5 lists the test results reported by VMM. The live migration times increased as the I/O workload and the
size of the memory allocated to the virtual machine increased. All response times were less than 20ms for
writes.
Table 5. Fibre Channel over IP Links for I/O Profile 75% Write, 25% Read (50% Random, 50% Sequential)

Total IOPS | Read (MB/sec) | Write (MB/sec) | Response Time (ms) | VM Memory Size (GB) | Live Migration Time
18447.31   | 159.32        | 465.94         | 15.82              |                     | 00:01:16
19772.26   | 156.43        | 469.28         | 15.02              |                     | 00:02:05
20727.91   | 158.69        | 472.06         | 15.52              |                     | 00:02:25
21647.11   | 164.69        | 484.11         | 16.78              | 10                  | 00:02:47

Table 6 lists the test results. The live migration times increased as the I/O workload and the size of the memory allocated to the virtual machine increased. All response times were less than 20ms for writes.
Table 6. Fibre Channel Links for I/O Profile 75% Write, 25% Read (50% Random, 50% Sequential)

Total IOPS | Read (MB/sec) | Write (MB/sec) | Response Time (ms) | VM Memory Size (GB) | Live Migration Time
20755.56   | 156.26        | 468.77         | 15.84              |                     | 00:01:17
21059.35   | 159.52        | 478.56         | 15.14              |                     | 00:01:44
20627.91   | 160.77        | 482.30         | 15.50              |                     | 00:02:47
22167.84   | 165.12        | 495.71         | 15.49              | 10                  | 00:03:14


Table 7 lists the test results for migrating two virtual machines in parallel.
Table 7. Multiple Live Migrations for I/O Profile 75% Write, 25% Read (50% Random, 50% Sequential)

Virtual Machine | Total IOPS | Read (MB/sec) | Write (MB/sec) | Response Time (ms) | VM Memory Size (GB) | Live Migration Time
VM1             | 8077.11    | 62.95         | 188.83         | 9.91               |                     | 00:01:25
VM2             | 6613.14    | 51.69         | 155.06         | 12.08              |                     | 00:01:34

For testing application workloads, the SQLStress tool from Microsoft was used against an AdventureWorks database. The SQLStress tool is a load driver for SQL Server that can be used to execute large batches of queries or updates against one or more SQL Server databases. It can be used for stress testing, performance testing or research purposes. The tool is controlled by configuring a simple XML file that contains the connection information as well as descriptions of the types of workload to run.
The SQLStress tool works by creating a workload consisting of one or more work items; each work item contains a T-SQL query. The tool then spawns a number of worker threads. Each thread executes a number of iterations; with each iteration, it chooses a work item and its associated query and executes it against the database.
To ensure that the server memory on the SQL Server virtual machine was being heavily used, Transact-SQL
Stored Procedures generated recursive read queries to objects within the SQL Server database. For this test,
200 worker threads were executed against the database. Table 8 lists the test results. All live migrations were
initiated using VMM and the SQL queries continued uninterrupted after the migration.
Table 8. SQL Server Live Migration Times

Transactions per Second | Read Response Time (ms) | VM Memory Size (GB) | Live Migration Time
2368                    | 4.53                    |                     | 00:00:38
2401                    | 4.22                    |                     | 00:00:56
2590                    | 4.02                    |                     | 00:01:22
2644                    | 3.86                    |                     | 00:01:46
2740                    | 2.67                    | 10                  | 00:02:10


SQLStress generated update writes to the database. Table 9 lists the test results. All live migrations were
initiated using VMM and the SQL updates continued uninterrupted after the migration.
Table 9. SQL Server Live Migration Times (Update Workload)

Transactions per Second | Write Response Time (ms) | VM Memory Size (GB) | Live Migration Time
2192                    | 20.12                    |                     | 00:00:39
2280                    | 19.23                    |                     | 00:01:02
2230                    | 19.01                    |                     | 00:01:21
2251                    | 18.60                    |                     | 00:01:46
2343                    | 18.02                    | 10                  | 00:02:06

Conclusion
This white paper describes a reference architecture that delivers an end-to-end high availability and business continuity solution using Hyper-V live migration, advanced storage replication technologies from Hitachi Data Systems, and network and storage extensions from Brocade and Ciena. This solution enables the creation of a highly flexible and easy-to-manage virtualized environment for critical applications.


Appendix A: Bill of Materials


Table 10 lists the Hitachi components deployed in this reference architecture.
Table 10. Hitachi Components

7846540.P: Hitachi 42Ux600x1050 Universal Storage Platform VM Rack US
DKC-F615I-16FS.P: Fibre 16-Port Adapter (4Gbps) (2)
DKC-F615I-450KS.P: 1 HDD Canister (DKS2C-K450FC) (48)
DKC-F615I-B2.P: Disk Chassis (1)
DKC-F615I-C16G.P: Cache Memory Module (16GB) (8)
DKC-F615I-S4GQ.P: Shared Memory Module (4GB) (4)
DKC-F615I-CX.P: Cache Memory Adapter (1)
DKC-F615I-DKA.P: Disk Adapter
DKC-F615I-LGAB.P: Additional Battery (2)
DKC-F615I-LGAB.P: Device I/F Cable connects DKC and R0-DKU
DKC-F615I-UC0.P: Disk Control Frame
DKC615I-5.P: DKU Power Cord Kit (1-Phase 10A USA) (1)
DKC-F615I-PHUC.P: DKC Power Cord Kit (1-Phase 10A USA) (1)

Table 11 lists the Brocade components deployed in this reference architecture.


Table 11. Brocade Components

HD-DCX-0001: DCX,2PS,0P,2CP,2 CORE,0 SFP, Rack Mount, OS, WT, Z, DCX Enterprise Software Bundle, EGM
BR-DCX-0102: PORT BLADE,32P,0SFP,DCX,BR
XBR-000148: FRU,SFP,SWL,8G,8-PK, BR
HD-FX824-0001: FX8-24 Blade, 22P, 12 8G SWL SFPs, 0 1GE SFP
XBR-000190: FRU,SFP,1GE COPPER,1-PK, ROHS
HD-DCX4S-0002: DCX4S Chassis, 0 SFP, EGM, Enterprise SW Bundle (FW, APM, TRK, AN, SAO)
BR-DCX-0102: PORT BLADE,32P,0SFP,DCX,BR
XBR-000148: FRU,SFP,SWL,8G,8-PK, BR
HD-FX824-0001: FX8-24 Blade, 22P, 12 8G SWL SFPs, 0 1GE SFP
XBR-000190: FRU,SFP,1GE COPPER,1-PK, ROHS
HD-825-0010: Brocade 825 HBA, 2 port, 8Gb, FC
BR-DCFM-ENT: DCFM Enterprise Management Software, Version 10.x
NI-XMR-4-AC: NI-XMR 4-SLOT CHASSIS


NI-XMR-MR: NI XMR MANAGEMENT MODULE
NI-XMR-1GX20-SFP: NI XMR 20-PORT 100/1000 FIBER
NI-XMR-10GX4: NI XMR 4-PORT 10GE
NI-X-ACPWR-A: NI-XMR/MLX 4-SLOT CHASSIS AC POWER SUPPLY
NI-X-SF1: NI XMR/MLX SWITCH FABRIC FOR 4 SLOT CHASS
TI-24X-AC: 24P 10GBE/1GBE,SFP+,TI,BR
RPS11: 300W AC PWR SUPPLY FOR TURBOIRON
10G-SFPP-SR: 10GBASE-SR, SFP+ OPTIC
E1MG-SX-OM: 1000BASE-SX, SFP, OPTIC, MMF, LC CONN
E1MG-TX: MODULE, MINI-GBIC, TX, 1000BASE, RJ45
10G-XFP-SR: OPTIC, 10GBE, SR, XFP, MMF, LC

Table 12 lists the Ciena components deployed in this reference architecture.


Table 12. Ciena Components

B-820-0007-001: CN4200-SCH-EW CHASSIS FULL C, HALF D (2)
B-820-0007-002: CN4200-SCH-EW CHASSIS HALF C, HALF D (1)
B-720-0020-004: CN-200-B4L 4-CH 200GHZ DWDM MODULE ASSY (2)
B-720-0015-001: CN-ALM-STD - ALARM MODULE (3)
166-0011-900: CN-PSM-AC450T-450W AC PWR (6)
500-4200-101: MOUNTING KIT CHASSIS 19INCH/ETSI CN4200 (3)
800-4200-KIT: CN4200 TURN-UP MATERIAL KIT (3)
S42-0001-71C: CN4200 BASE SW LICENSE CLASSIC CHASSIS; R7.1.X (3)
431-1133-001: POWER CORD AC NORTH AMERICA LEFT ANGLE 6FT (6)
B-720-0025-200: FC4-T90-TN - 3XFC400/FC200 (2)
B-720-1086-300: F10-T90-TN - TP;UNPRT (2)
B-720-0042-001: OAF-00-1-C - OPTICAL AMPLIFIER (4)
B-955-0003-007: CN 2110-T0-70-DCM Type 0 DISPERSION COMPENSATION
B-720-0017-001: MODULE (4)
B-720-0016-001: CN-HALF-BLK - BLANK CARD (2)
B-730-0008-001: CN-FULL-BLK - BLANK CARD (3)
130-4901-900: OPT-4G-SX SFP FOR FC400 (4)
XFP-OPT-SR XFP 850NM (2)


Appendix B: References
The following sections provide links to more information about the products mentioned in this white paper.

Hitachi

Hitachi Universal Storage Platform Family Best Practices with Hyper-V

Hitachi Storage Cluster for Microsoft Hyper-V Solution and Partnership Overview

Hitachi Storage Cluster for Microsoft Hyper-V: Optimizing Business Continuity and Disaster Recovery in
Microsoft Hyper-V Environments

Building a Scalable Microsoft Hyper-V Architecture on the Hitachi Universal Storage Platform Family

Optimizing High Availability and Disaster Recovery with Hitachi Storage Cluster in Microsoft Virtualized
Environments

Top 5 Business Reasons to use Hitachi Enterprise Storage in Virtualized Server Environments

Brocade

Brocade DCX Backbone Family Data Sheet

Brocade Data Center Fabric Manager Data Sheet

Brocade FX8-24 Extension Blade Data Sheet

Brocade 415, 425, 815 and 825 Fibre Channel HBA Data Sheet

Brocade TurboIron 24x Switch Data Sheet

Brocade NetIron XMR 4000, 8000, 16000, 32000 Data Sheet

Ciena

Ciena Data Center Connectivity Solutions

The CN 4200 FlexSelect Advanced Services Platform Family

FC4-T Fibre Channel Muxponder Module

F10-T 10G Transponder Module


Corporate Headquarters 750 Central Expressway, Santa Clara, California 95050-2627 USA
Contact Information: + 1 408 970 1000 www.hds.com / info@hds.com
Asia Pacific and Americas 750 Central Expressway, Santa Clara, California 95050-2627 USA
Contact Information: + 1 408 970 1000 www.hds.com / info@hds.com
Europe Headquarters Sefton Park, Stoke Poges, Buckinghamshire SL2 4HD United Kingdom
Contact Information: + 44 (0) 1753 618000 www.hds.com / info.uk@hds.com
Hitachi is a registered trademark of Hitachi, Ltd., in the United States and other countries. Hitachi Data Systems is a registered trademark and service mark of
Hitachi, Ltd., in the United States and other countries.
All other trademarks, service marks and company names mentioned in this document are properties of their respective owners.
Notice: This document is for informational purposes only, and does not set forth any warranty, expressed or implied, concerning any equipment or service offered
or to be offered by Hitachi Data Systems. This document describes some capabilities that are conditioned on a maintenance contract with Hitachi Data Systems
being in effect and that may be configuration dependent, and features that may not be currently available. Contact your local Hitachi Data Systems sales office for
information on feature and product availability.
© Hitachi Data Systems Corporation 2010. All Rights Reserved.
AS-049-00 June 2010
