
Microsoft – Collaboration Brief

June 2010

SAP Applications on Windows Server 2008 R2 High Availability Reference Guide

Authors
Josef Stelzel, Sr. Developer Evangelist, Microsoft Corporation,
jstelzel@microsoft.com

Summary
This paper describes how to implement a high availability solution for SAP applications
on Microsoft® Windows Server® 2008 R2. It is written for developers, technical
consultants, and solution architects. This paper introduces the technologies and
architecture used, describes various high availability scenarios, and discusses the
implementation process. This paper also contains links to advanced features and
technical topics including disaster recovery methods.
Note: Access to some of the linked information, such as SAP notes available at the SAP
Service Marketplace at https://service.sap.com, might be restricted. Access to this Web
site is available only to registered SAP customers and partners, and requires a user
name and password.

The information contained in this document represents the current view of Microsoft Corporation on the
issues discussed as of the date of publication. Because Microsoft must respond to changing market
conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot
guarantee the accuracy of any information presented after the date of publication.

This White Paper is for informational purposes only. MICROSOFT MAKES NO WARRANTIES, EXPRESS,
IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS DOCUMENT.

Complying with all applicable copyright laws is the responsibility of the user. Without limiting the rights under
copyright, no part of this document may be reproduced, stored in or introduced into a retrieval system, or
transmitted in any form or by any means (electronic, mechanical, photocopying, recording, or otherwise), or
for any purpose, without the express written permission of Microsoft Corporation.

Microsoft may have patents, patent applications, trademarks, copyrights, or other intellectual property rights
covering subject matter in this document. Except as expressly provided in any written license agreement
from Microsoft, the furnishing of this document does not give you any license to these patents, trademarks,
copyrights, or other intellectual property.

© 2010 Microsoft Corporation. All rights reserved.

Microsoft, Windows, Windows Server, the Windows logo, SQL Server, and Active Directory are either
registered trademarks or trademarks of Microsoft Corporation in the United States and/or other countries.

The names of actual companies and products mentioned herein may be the trademarks of their respective
owners.

Applies To
• SAP NetWeaver 7.0
• SAP NetWeaver 2004
• SAP Business Suite (mySAP ERP)
• SAP Application Server
• SAP Replicated Enqueue
• SAP System Central Services

Keywords
SAP NetWeaver, disaster recovery, high availability, SAP Application Server, SAP
Replicated Enqueue, planned downtime, unplanned downtime, SQL Server 2005/2008
R2, Windows Server 2008 R2

Contact
This document is provided by Microsoft Corporation. Please check the SAP
interoperability area at www.microsoft.com/sap and the .NET interoperability area in the
SAP Developer Network at http://sdn.sap.com for updates or additional information.

Contents
Applies To
Executive Summary
High Availability Considerations
  Critical application availability requirements
  Classes of availability problems
    Loss of physical resources
    Logical errors and inconsistencies
    Disasters
    Planned downtime
  Service level agreements
    Availability measures
  High availability solution risks and side effects
    Increased complexity
    Higher costs
  Hyper-V virtualization and availability
    Guest clustering
SAP Architecture and Requirements
  SAP NetWeaver and its components
  SAP Application Server architecture
    ABAP system architecture
    Dual-stack system architecture
    Java system architecture
    SAP system single points of failure
  SAP standalone engines
    The SAP Web Dispatcher
    SAP standalone gateway
    TREX
    SAP liveCache
    SAP Content Server
Unplanned Downtime Avoidance Strategies
  Hierarchy of high availability solutions
    Data storage protection
    Server protection
    Network high availability
    Application specific configurations
  Simple cluster for a single SAP system
  Using multiple clusters for SAP instances and databases
  SAP Replicated Enqueue
  Multi-SID cluster
  Multi-node cluster
  SAP application servers
  IT infrastructure protection
  Hyper-V host cluster
Planned Downtime Minimization Solutions
  Planning ahead for minimizing planned downtime
    Change management strategy deployment
  Backup and patching solutions
    Snapshot backup
  Optimized server maintenance system architecture
    Server and operating system maintenance
    SQL Server instance maintenance
  SAP application planned downtime reduction
  Hyper-V Live Migration
Data Inconsistency Protection Solutions
  Logical error reasons
    Database data inconsistencies
    Sabotage and accidental data deletion
    Data loss through viruses and worms
  Backup and recovery
    Database backup strategies
  Database log shipping
  Snapshots
    Database snapshots with SQL Server 2005/2008 R2
    Snapshots with storage solutions
    Hyper-V snapshots for virtual machines
  Database consistency checks
    Large database consistency
Disaster Recovery Solutions
  SAP system protection in a geographically dispersed cluster
    Storage replication
    Cluster quorum configuration
    Majority Node Set configuration for Windows Server 2003
    File share witness for Windows Server 2003
    Network configuration
  Microsoft SQL Server database log shipping
  Database mirroring with SQL Server 2005/2008 R2
    Asynchronous database mirroring
    Synchronous mirroring with automatic failover in case of error
    SAP database mirroring configurations
  Disaster recovery solutions for virtual machines

Executive Summary
Business applications are central to a corporate IT operation. All corporate business
processes are supported by software solutions that help to better plan, process, or
communicate in all business related tasks. Consequently, any service failure has an
immediate and direct impact on corporate business results. This often decreases
revenue and can damage the corporate image.
This is especially true for SAP applications as corporations increase their dependencies
on a productive IT environment. Enterprise Service Architecture (ESA) and the global
network of interacting companies have increased both uptime requirements and
the number of IT components that are ultimately needed to fulfill business requirements.
As an increasing number of companies join global networks, there is always a time zone
that utilizes a computing service. While centralized application systems like
SAP R/3 were used in the past, ESA orchestrates the use of service providers in order to
achieve a larger task. Those services can be distributed inside or outside a company
and need to be available.
High availability of mission critical applications has always been the focus for SAP
infrastructures. The starting point for increasing availability traditionally has been to
address the loss of a critical hardware resource that could generate downtime until the
computer system is available again. More solutions have been developed over time to
address other problems like downtime due to operating system defects, downtime
caused by data inconsistencies, or downtime caused by disasters like earthquakes,
floods, or terrorism. Even planned downtime, which is needed to upgrade systems or
install patches, is contrary to the requirement to have an application service consistently
available. However, planned downtime does reduce system vulnerability and increases
reliability.
This guide describes the solutions that address the various areas of availability for SAP
on the Windows® platform. It helps to identify the cause of potential downtime and
provides the technical strategy to reduce or eliminate it. In addition, this guide provides
solution description references that help the reader understand the technology and
quickly find assistance.
Microsoft® has a long history of providing a comprehensive portfolio of solutions for
protecting enterprise class applications like SAP. Microsoft Windows Server® 2008 R2
offers even more functionality than previous versions in the areas of clustering, geographic
distribution, and operating system security. Improved network configuration functionality,
performance enhancements, and storage subsystem management included with
Windows Server 2008 R2 make it easier to work with the latest technology from
hardware partners. As a central component of Windows Server 2008 R2, high availability
makes managing the complexity of modern infrastructures both effective and affordable.

High Availability Considerations


High availability refers to all technical or conceptual solutions that are used to improve
the application availability. For the purpose of this paper, availability is defined as:
usable for the intended purpose of supporting corporate business processes.
Issues with business application availability can have many causes. These causes range
from hardware problems to planned downtime due to patching or installing upgrades, to
disasters that happen periodically on a larger scale. How to achieve optimal high
availability is not always easy to answer. A solution that protects against hardware
problems, for example, might not protect against relational database inconsistencies. The
goal of this white paper is to identify the various reasons for the lack of SAP application
availability and describe the solutions available to safeguard these points of failure.
The intent of this paper is not only to emphasize the benefits of high availability
solutions, but to also identify the potential risks and side effects. When several options
are available to solve the same problem, this white paper will help the reader decide
which solution is optimal.

Critical application availability requirements


SAP applications are typically used for business critical processes or processes that are
essential for maintaining the company workflow. Additionally, SAP applications often
directly exchange data with external companies. For example, this is done to process
orders electronically or to validate external data such as with credit card validation. The
availability requirement for such applications has increased to nearly 24x7x365.
It is primarily corporate globalization and Internet communication that have generated the
increased availability requirements. No company can afford to manually track goods,
maintain accounting, or control the financial streams. Having the application services
available whenever the business process needs them is a necessity. Unavailability
directly impacts revenue and reputation, and can even threaten the existence of a
company. In addition, some business sectors like the financial sector have laws and
regulations that require the corporations to operate their core applications in a failsafe
and reliable manner.

Classes of availability problems


While the consequences of losing a mission critical application service for a company
are always the same, the causes of an outage are very diverse. They range from
planned downtime for maintenance reasons to a disaster striking an entire geographic
area.
In order to implement rational protection against a potential problem, the nature of the
problem must be identified. The typical problem types are listed in the following sections.
The associated class of solutions with each problem type is discussed in more detail in
subsequent sections.
Loss of physical resources
One of the more obvious reasons for an application service loss is the failure of a
required hardware resource. Physical resources are not only the servers, storage, and
network infrastructure of a company, but also the facilities that support the computing
environment by providing shelter, air conditioning, and electrical power. While hardware
resources are more directly related to the application environment, the supporting facility
is also related to disaster recovery considerations.

Logical errors and inconsistencies


While many computer system hardware failures directly and immediately generate
unplanned application downtime, there are also problems that cause logical errors,
glitches and spikes during operation, and data inconsistencies. A random memory
problem might, for example, cause data block corruption that is only discovered the
next time this data is accessed. Accidental data deletion or file corruption
caused by a computer virus is another example of logical errors.
In many cases, the effect of these problems is not a hardware failure, but system
degradation. However, since there is no way to predict when the system will use the data
again, a potential problem might arise at any time during normal operation. Real
application downtime is most often created during the process of recovering, such as
when restoring the last backup or cleaning database problems manually. Since the data
exists only once, maintaining data consistency is a crucial part of the availability concept.
Disasters
Disasters like fire, flood, hurricanes, earthquakes, and terrorism can instigate the loss of
all IT systems in a data center. Problems of this scale might not be sufficiently addressed
by having enough redundant hardware in one location. Besides having a proper
geographical distribution of computer systems for the continuation of a critical
application, typical questions include how to synchronize the data between the different
sites and how to plan for the real event. In addition to a technical solution, the complete
solution requires good planning skills and an in-depth knowledge of the applications and
organizational requirements.
Solutions that address the problems discussed are considered disaster recovery
solutions rather than high availability solutions. Although disaster recovery solutions
might employ typical high availability techniques like clustering or database mirroring as
part of a recovery plan, they clearly have a different scope than solutions that protect
against a simple server hardware failure.
Planned downtime
Planned downtime occurs at an intended and often appropriate time, most likely at a
time of low application usage. Planned downtime is typically implemented for server and
software maintenance, upgrades or migrations, and changes to or the testing of critical
configurations. Ironically, this maintenance helps to improve the computer system
stability and security by eliminating known problems and by maintaining system
resources. However, this maintenance does require application downtime.
This white paper describes some architectural concepts that can help to minimize or
eliminate planned downtime for typical SAP application tasks. However, proper planning
and change management are still the main reasons for planned downtime. Therefore, it
might not be possible to eliminate all planned downtime.

Service level agreements


As shown in the following table, availability measures throughout the IT infrastructure
help administrators to determine which application service requires what level of
protection. While the loss of a test or a QA system might impact the work in an IT
department, the loss of a productive system almost always impacts company processes
and outbound communication. Since improving availability is not without cost, it makes
sense to focus on the most critical applications.

Note: To avoid user interruption, all dependencies must be protected as well as the
primary application services. If a productive system is integrated into an IT infrastructure,
this infrastructure is also critical, as is any system that acts as a data provider or data
consumer for the productive system. Any downtime associated with these dependent
systems will interrupt the primary application services as well.

Availability level    Application

Standard              Application services and infrastructure components that can
                      fail for a short period (typically one to two days) without
                      business impact. Standard often also implies a definition of
                      minimum protection like using reliable servers, hot pluggable
                      components, and so on.

High availability     Applies to applications that need to be available even when a
                      critical hardware resource is lost. Logical errors and loss or
                      inconsistency of data need to be addressed, and a planned
                      procedure for making the application service available again
                      in case of a problem must exist. The duration of the outage
                      has to be minimized.

Mission critical      The application service is absolutely critical for the
                      business processes. A service loss, even for a short period of
                      time, might have a high financial impact on the company. All
                      measures for protection must be taken.

Table 1. Service level agreements
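
The note on dependencies can be made concrete with a small calculation. The following sketch is an illustration only and is not part of the original guidance. It assumes that the components fail independently of each other, that the service needs every component to be up, and that availability is expressed as a fraction of planned time. The function name and the sample values are hypothetical.

```python
# Illustrative sketch: availability of a service that depends on several components.
# Assumption (not from this paper): components fail independently of each other.
from math import prod


def chain_availability(component_availabilities):
    """Return the availability (as a fraction) of a service that needs every component."""
    return prod(component_availabilities)


if __name__ == "__main__":
    # Hypothetical values for an SAP system, its database, and one interface partner.
    components = [0.999, 0.999, 0.99]
    print(f"Combined availability: {chain_availability(components):.4%}")
```

Under these assumptions, the combined availability is always lower than that of the weakest component, which is why an unprotected dependent system limits what the primary application service can achieve.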

Availability measures
In order to measure and quantify computer system or application availability, the
following formula is used:

Availability = 100% * achieved availability / planned availability

Availability is defined as the percentage of the planned time during which the application
was usable for its intended purpose. Defined availability values like 99.999% are often used
in marketing as a solution quality indicator. The following table shows the assumed
unavailability for various typical values.

Availability   Achieved     Planned   Maximum possible downtime
(percent)      (days)       (days)    Days       Hours      Minutes

95             346.75       365       18.25      438        26280
98             357.7        365       7.3        175.2      10512
99             361.35       365       3.65       87.6       5256
99.9           364.635      365       0.365      8.76       525.6
99.99          364.9635     365       0.0365     0.876      52.56
99.999         364.99635    365       0.00365    0.0876     5.256

Table 2. Assumed unavailability
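
The downtime columns of Table 2 follow directly from the availability formula. The following minimal sketch shows the arithmetic. The function names and the default planning period of 365 days are assumptions made for this example.

```python
# Illustrative sketch: derives maximum downtime from a target availability.
# Function names and the 365-day default are assumptions made for this example.


def max_downtime(availability_percent, planned_days=365.0):
    """Return the maximum downtime as (days, hours, minutes) for a target availability."""
    unavailable_fraction = 1.0 - availability_percent / 100.0
    days = planned_days * unavailable_fraction
    return days, days * 24.0, days * 24.0 * 60.0


def achieved_availability(achieved_days, planned_days=365.0):
    """Apply the formula: Availability = 100% * achieved availability / planned availability."""
    return 100.0 * achieved_days / planned_days


if __name__ == "__main__":
    for target in (95, 98, 99, 99.9, 99.99, 99.999):
        days, hours, minutes = max_downtime(target)
        print(f"{target:>7}%  {days:10.5f} d  {hours:10.4f} h  {minutes:10.3f} min")
```

The same function can be used with a shorter planning period, for example planned_days=30, to translate a service level agreement into a monthly downtime budget.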

High availability solution risks and side effects


Improving the availability of an application service is a complex and intense task. The
higher the requirement is, the higher the effort, cost, and complexity of the solution.
When the system design requires the implementation of a high availability solution, the
solution itself becomes a factor in the high availability assessment. It can cause downtime
of its own through new processes such as failover testing or disaster emergency training.
Increased complexity
Protecting critical applications against loss by using hardware redundancy, mirroring
technologies, snapshots, clusters, and monitoring solutions always increases the
complexity of an IT system. The disadvantage of complexity is twofold: It costs more to
maintain and operate, and it bears additional risks. Well-trained staff must be
available around the clock in case potential problems arise. Good planning, proper IT
processes, and good communication between the teams in the IT department are also
critical for reliable and secure operation.
Higher costs
Improved application service availability always requires more effort than standard
solutions. Hardware failures might only be addressed by providing redundant servers
that can be used in case of a failure. High availability software solutions need to be
purchased, installed, and maintained. IT personnel need to be trained and IT processes
must reflect the extended capabilities. For example, IT personnel need to periodically
test and verify that the high availability functionality works. Even support from external
providers often must be structured toward the increased requirements. Shorter response
times and dedicated support offerings for improved availability carry higher price tags
than standard offerings.
The cost of implementing high availability solutions increases exponentially as the
expected level of protection rises. While server clustering or database mirroring are
standard high availability technologies today that can address problems of a single
computer system, extending those solutions into a disaster recovery concept adds more
costs. These additional costs include wide area networking, additional facilities and staff,
as well as the additional associated operational costs.
Generally, it is relatively easy for organizations to have inappropriate expectations
regarding availability targets. It is also easy for organizations to demand higher levels of
availability than they are actually willing to pay for before the cost implications are
understood.
The cost implications of most availability solutions include, but are not limited to, the
following:
• Hardware
• Software
• Network infrastructure
• Training
• Serviceability and support
• Operational costs

Hyper-V virtualization and availability


As server virtualization technology and supporting hardware have matured to enterprise-
level reliability, performance, and functionality, businesses are moving more and more of
their critical applications, such as SAP applications, to virtualized environments.
With this move, new storage and IT requirements as well as opportunities to significantly
improve overall application availability in planned or unplanned scenarios emerge.
Although a full discussion on virtualization is outside the scope of this document, various
virtualization techniques for high availability will be introduced as appropriate.
Microsoft Virtualization provides a new way to install SAP applications on a physical
server using the Windows Server 2008 R2 Hyper-V™ role. Rather than using a physical
server for each application, multiple applications can be consolidated onto a single
physical server, each running in its own virtual machine (VM). With this setup,
clustering techniques can be used to provide high availability.
However, with virtualization the requirement for maintaining the highest level of
availability is even more important. This is because a physical server in a Microsoft
virtualized environment typically holds many virtual machines. Therefore, if the server
were to fail, many applications would fail with it. To deal with this issue, several new
solutions for virtualized infrastructures have been developed based on existing methods
to reduce downtime. These virtualized high availability solutions discussed later in the
paper include:
• Unplanned downtime solution: The VM unplanned downtime scenario is still
addressed by Windows Server Failover Cluster (WSFC). The only difference exists
in the virtualization layer where the agents can now migrate the VM VHD files and
then restart the VM on a new server after a failover has occurred.
• Planned downtime solution: The implementation of Live Migration has significantly
improved the planned downtime scenario as maintenance downtime can now be
avoided altogether.
• Logical errors solution: Logical errors that occur inside a VM such as unintended
file deletion and data corruption are addressed with the Hyper-V snapshot feature.
• Disaster recovery solution: Disaster recovery solutions for Hyper-V now
incorporate storage replication and WSFC in geographically dispersed installations.
Guest clustering
Also available with Hyper-V is the ability to configure a WSFC between two VMs so that
the cluster service runs in the guest operating system. One advantage of this
configuration is that it provides the ability for an entire test lab for cluster services to exist
on one physical server. Because only one physical server is required, this configuration
would reduce costs.
While the VMs on a guest cluster could feasibly be located on a single physical server,
this setup would create problems if the high availability of an application inside this
cluster is important. Since the application cannot survive the failure of a single server if
both of the VMs in a guest cluster reside there, a configuration with the VMs located on
two physical servers is required for high availability. Please note that when using guest
clustering, the type of storage used for the cluster disks is restricted to iSCSI.
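
The effect of VM placement on a guest cluster can be illustrated with a simple probability model. The following sketch is not taken from this paper and is not a sizing tool. It assumes independent failures, instant failover, and no shared component other than the physical server, and it ignores storage and network. The function names and sample values are hypothetical.

```python
# Illustrative model only: independent failures and instant failover are assumptions.


def guest_cluster_on_one_host(host, guest):
    """Both cluster VMs share one physical server, which is a single point of failure."""
    return host * (1.0 - (1.0 - guest) ** 2)


def guest_cluster_on_two_hosts(host, guest):
    """Each cluster VM runs on its own physical server."""
    node = host * guest              # a VM is usable only while its host is up
    return 1.0 - (1.0 - node) ** 2   # the service is down only if both nodes are down


if __name__ == "__main__":
    host_availability = 0.99    # hypothetical availability of a physical server
    guest_availability = 0.99   # hypothetical availability of a VM on a running host
    print("one host :", guest_cluster_on_one_host(host_availability, guest_availability))
    print("two hosts:", guest_cluster_on_two_hosts(host_availability, guest_availability))
```

In this model, the single-host configuration can never be more available than the physical server itself, which matches the recommendation to place the two VMs on separate physical servers.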
The following figure shows the configuration of a Hyper-V guest cluster on a single
physical server and on two standalone physical servers.

Figure 1

More information about support for SQL Server® in a guest cluster environment can be
found at:
http://support.microsoft.com/kb/956893
A detailed description for how to configure a Hyper-V guest cluster can be found at:
http://blogs.technet.com/b/mghazai/archive/2009/12/12/hyper-v-guest-clustering-step-by-step-guide.aspx

SAP Architecture and Requirements


High availability always requires a comprehensive analysis of potential risks and the
implementation of appropriate measures to protect against those risks. Technical
solutions that protect the application against a loss of critical hardware resources require
detailed knowledge about the workflow and requirements of the application itself. The
following section describes the general SAP infrastructure architecture options and the
basic protection requirements provided by the architecture against the loss of availability.

SAP NetWeaver and its components


As shown in the following figure, SAP NetWeaver™ is an application and integration
platform that consists of several individual components. In most cases, the SAP
Application Server (AS) is the technical base for the individual functions of SAP
NetWeaver.

Figure 2. SAP NetWeaver framework

For user and information integration, SAP NetWeaver uses the SAP Enterprise Portal
(EP) and SAP Business Warehouse (BW). Data is also integrated by the SAP Master
Data Management (MDM). By using the SAP Mobile Infrastructure, user integration can
be extended to a wide variety of remote devices. Process integration is performed by SAP
Process Integration (PI), formerly known as SAP Exchange Infrastructure (XI).
Enterprise service architectures are made possible by the integration of people,
information, and processes, and are the foundation of a new breed of applications.
Composite applications are composed from a variety of individual functions already
available in the application infrastructure, and demonstrate how to develop faster and
more flexible solutions for future business requirements.
A downside of this flexibility is that it increases the dependency on a greater number of
components in the infrastructure in order to make a service available.
Note: Because of the composition of enterprise services to business applications, all
service providers in use must fulfill the same level of protection in order to make the
composite service highly available.
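The effect described in this note can be illustrated with a small calculation. The sketch below assumes that the service providers fail independently of each other, which this paper does not state, and it uses invented availability figures. Under that assumption, the availability of a composite service that needs all of its providers is the product of the individual availabilities.

```python
from math import prod

def composite_availability(provider_availabilities):
    """Availability of a service that needs every provider to be up.

    Assumes independent failures; the figures passed in are illustrative.
    """
    return prod(provider_availabilities)

# Hypothetical example: two well protected providers and one weaker one.
providers = {"portal": 0.999, "process_integration": 0.999, "backend": 0.99}
total = composite_availability(providers.values())
print(f"composite availability: {total:.4f}")
```

In this example the composite figure is lower than that of the weakest provider, which is why a single less protected provider limits the whole composite service.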
The following figure shows the SAP NetWeaver platform from a technical perspective in
order to show how high availability could be implemented. Besides the SAP AS for the
Enterprise Portal, Master Data Management, Business Warehouse, Process
Integration, or Mobile Infrastructure, there are also standalone engines and
surrounding support systems. While the Internet Transaction Server has today mostly been
replaced by the Internet Communication Manager, a component of the SAP AS, there
are often standalone gateways for RFC communication or the TREX search engine.
Typically, SAP NetWeaver is supplemented by an installation of the SAP Solution
Manager, a SAP NetWeaver administrator, and the SAP NetWeaver development
environment.

Figure 3. SAP NetWeaver software development environment

SAP Application Server architecture


The SAP AS is the technical base for SAP applications. While the former R/3 AS only
supported the ABAP programming language, the SAP AS supports Java as well.
The AS delivers the transactional power for business applications and must be extremely
stable, scalable, and secure. Since the application servers are used as the execution
layer for the business logic coded in ABAP or Java, they are required for the fulfillment of
the business process which in turn creates high availability requirements. Solutions for
optimized availability are supported by the SAP AS architecture, but always depend on
additional components such as redundant servers and the monitoring and control
processes that are typically found in high availability clusters.
Before going deeper into the SAP AS architecture, it is worth reviewing the general
features. Every SAP system consists of at least one central database and a
central SAP instance that provides unique services for the SAP system. If more
transactional performance is required by the SAP system, additional application servers
can be added to the SAP system. A SAP system, which is identified by a unique System
Identifier (SID), might consist of many SAP instances and the common database.
Depending on the type of application, a SAP AS can be installed for ABAP, Java, or for
both workload types as shown in the following figure:

Figure 4. SAP Web application processing options

ABAP system architecture


The layout and structure of SAP AS 7.00 has changed from version 6.40. The following
figure shows the structure of a pure ABAP system as used up to SAP AS version
6.40.

Figure 5. A pure ABAP system using SAP AS version 6.40

This figure shows two instance types: a central instance and one or more dialog
instances. Processes like Dialog, Batch, Update, Spool, or the Dispatcher process exist
many times in a SAP system and are therefore redundant. Each installation of an ABAP
instance also has one gateway process configured that is used for communication
through the Remote Function Call (RFC) protocol. Also, each instance has its own
Internet Communication Manager (ICM) process for HTTP-based communication. The
Internet Graphics Server (IGS) only supports the creation of bitmaps for browser-based
clients.
To register all the instances of a SAP system and to support the communication between
the various components of a distributed SAP system, a single message server is
configured in the central instance. Also specific to the SAP system is the central
Enqueue server that manages the lock entries in a distributed SAP system in a lock table
inside of the shared memory of the server. Because of these two unique processes, the
term central instance was used for this installation. A central instance is the smallest
working unit of a SAP system, and its performance can be extended by adding additional
application servers.
The following figure takes a closer look at the directory structure of a SAP AS 6.40
installation.

Figure 6. SAP central instance

All profiles and executables of a distributed SAP system are made available from the
central instance to all dialog instances through the share SAPMNT. In order to support a
simple patch process for executables, there is one master copy of the executables on
the central instance. Any time a dialog instance starts, the SAP utility SAPCPE checks
for the availability of a newer executable version. When available, this executable is
copied to the AS local runtime directory before it is used.
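The copy step can be pictured with the following minimal sketch. It is not the actual logic of SAPCPE: the function name `sync_executables`, the use of file modification times to decide what counts as newer, and the directory arguments are all assumptions made for illustration.

```python
import shutil
from pathlib import Path

def sync_executables(master_dir, local_dir):
    """Copy files from master_dir to local_dir when missing or older locally.

    Hypothetical stand-in for the idea behind SAPCPE; "newer" is judged by
    modification time here, which is an assumption.
    """
    master, local = Path(master_dir), Path(local_dir)
    local.mkdir(parents=True, exist_ok=True)
    copied = []
    for src in sorted(master.iterdir()):
        if not src.is_file():
            continue
        dst = local / src.name
        if not dst.exists() or src.stat().st_mtime > dst.stat().st_mtime:
            shutil.copy2(src, dst)  # copy2 also carries over the modification time
            copied.append(src.name)
    return copied
```

Called with the master copy on the share as `master_dir` and the local runtime directory as `local_dir`, the function returns the names of the files it refreshed, and a second call directly afterwards copies nothing.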
Changes to ABAP reports in the SAP system are distributed by using the transport system.
SAP systems can be configured to be a member of a transport domain. For each
transport domain, there is one directory that is shared by all members of the domain.
The directory is: <Drive>:\usr\sap\trans. Because of the central character of this shared
directory, it can be considered a single point of failure for the operation of more than one
SAP system.
With the introduction of SAP AS 7.0, there was a major change in the layout of the
central instance. Similar to the structures in pure Java systems, the unique Message and
Enqueue server processes have been moved to a separate SAP instance: the ABAP
System Central Services (ASCS) instance. Therefore, a typical AS no longer has any
system-wide functions. The following figure shows the SAP landscape simplification:

Figure 7. Simplified ABAP system setup using SAP AS version 7.0

Subsequently, the file system of an ASCS instance would look like the following figure:

Figure 8. ASCS instance directory structure



SAP installations of SAP AS 7.0 consisting of an ASCS instance and a dialog instance
will continue to use the name format D<instnr> for the instance directory. This combined
installation structure is shown in the following figure:

Figure 9. ASCS instance and dialog instance

Regardless of this combined installation structure, the Enqueue and Message server
processes are now in the ASCS instance. The naming convention for the instance
directory was not changed for reasons of compatibility with older versions.
Dual-stack system architecture
With the introduction of J2EE as a possible SAP system component in version 6.40, SAP
AS can be installed for ABAP, Java, or for both types of workloads. There is a
considerable difference in architecture between ABAP and Java platforms as seen in the
following figure:

Figure 10. A dual-stack system using SAP AS version 6.40

As shown in this figure, both the ABAP and the Java part of the SAP AS have their own
Message and Enqueue server as critical components. The Java AS is primarily made up
of Java server processes. The software deployment manager (SDM) is used for the
installation and management of software versions. The server operating system must
also have a Java development kit (JDK) installed to configure the Java virtual machine
(JVM). The JDK for Windows is available from Sun Microsystems.
While in the ABAP Application Server version 6.40 the Enqueue and Message servers are
still part of the central instance, the Java AS always uses the system central services
(SCS) instance concept. This means that every 6.40 dual-stack system must, at a
minimum, consist of two instances.
As with the pure ABAP configuration, the hybrid system still has a central database that
separates the ABAP and Java application data by using a schema. In the hybrid structure,
the ABAP and Java functions are shut down simultaneously as if they were a single
instance. The parameters for both are also configured in a single instance profile. The
Java SCS instance in this installation is a complete unit and has its own profile. It can be
started or stopped independently.
For the purpose of maintaining distributed installations of SAP instances, all profiles and
executables of the SAP system are shared from one central server on a network share.
This server is typically the server that holds the central instance or the SCS instance.
Altogether, the dual-stack directory structures are naturally more complex than those of a
pure ABAP or a Java instance. The SAP AS 6.40 dual-stack system structure is shown in the
following figure:

Figure 11. SAP AS 6.40 dual-stack file structure

As seen previously with the pure ABAP AS, the system structure for dual-stack systems
became simpler with the introduction of SAP AS 7.0. The only difference between
the otherwise identical instances is the Software Deployment Manager (SDM), which is
installed in only one instance. The SDM is required to install and patch Java programs
and is only needed when new programs are installed or during software maintenance.
Therefore, there is no need to configure the SDM in a cluster solution. To secure SDM
service availability, a backup copy can be installed on any AS when needed. As of
SAP AS version 7.10, the SDM is completely removed from the installation and
replaced with a new Java Support Package Manager (JSPM) function. The software
maintenance functionality is then an integrated part of every AS and is therefore
redundant.
The following figure shows the SAP AS 7.0 dual-stack system structure. As shown in the
figure, the ABAP Message server and Enqueue server have now been moved to a new,
separate ASCS instance that simplifies the dual-stack SAP AS setup.

Figure 12. SAP AS 7.0 dual-stack system structure

The following figure shows the SAP AS 7.0 dual-stack directory structure with an ABAP,
Java, and a SCS instance. In this file system layout, there is a clear distinction of the
different components described.

Figure 13. SAP AS 7.0 dual-stack directory structure

Java system architecture


In addition to the installation variants for ABAP and dual-stack systems, there is a third
variant: the pure Java system. In this case, there is no difference between the SAP AS
versions 6.40 and 7.0. This configuration typically uses a SAP Web dispatcher or a
hardware load balancer to distribute the HTTP connection load. The Java system
application servers are all constructed in the same way with the addition of the Software
Deployment Manager (SDM) on one instance. This functionality, however, is removed in
SAP AS version 7.10 and will no longer appear in subsequent Java system versions.
The following are the three main components of a Java system:
• The central Enqueue and Message services used by all Java instances
• Java and dispatcher processes to handle the workload
• A central database for persistent data storage

These components are shown in the following figure:



Figure 14. Java system main components

Multiple J2EE instances placed on several physical servers create a Java cluster. The
basic rule is that a Java instance can only be configured once per physical server. A
Java instance must consist of at least one Java server process and a
dispatcher, but can also have multiple Java server processes. The central SCS instance
can also be placed together with a regular Java instance on one physical server.
Similar to the ABAP installation, the profiles and executables of a distributed Java
system also reside on one physical server and are shared there. Because of the central
character of these files, this server is the server that holds the SCS instance of the SAP
system.
SAP system single points of failure
Single points of failure are SAP system elements that are critical to operating the
system and must be protected against loss or failure in a highly available SAP system.

Single points of failure assessment


As mentioned previously, every SAP system has the following central components that
are required to be available at all times:
• A central database for data storage – one per system
• Separate message servers and Enqueue servers for ABAP and Java systems
• The SAPMNT-share for profiles, executables, and Java Secure Store files of a SAP
system. There is one SAPMNT-share per SAP system.
The purpose of the database is to provide persistent data storage for SAP system data
and the runtime environment. Databases work with a series of internal mechanisms
that provide the ACID properties (atomicity, consistency, isolation, and durability).
These mechanisms ensure
data consistency at all times. For example, there is a mechanism that logs all changes
executed during a transaction. If a database operation fails in the middle of a
transaction, the logs are used to restore the previous condition. The transaction logs can
also be used to reapply transactions to a database image. For example, a database
image restored from a backup would not reflect the latest transactional state since the
transactions have most likely been executed after the backup was created. The latest
transactions would be lost due to the restore if there was no transaction log available to
reapply them.
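The rollback behavior described above can be observed with any transactional database. The following sketch uses SQLite from the Python standard library purely for illustration; it is not one of the databases used by SAP systems, and the table, the values, and the `transfer` helper are invented. It covers only the rollback of a failed transaction, not the reapplication of logs to a restored backup.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO account VALUES (1, 100), (2, 50)")
conn.commit()

def transfer(conn, src, dst, amount, fail_midway=False):
    """Move amount between two accounts as a single transaction."""
    try:
        conn.execute(
            "UPDATE account SET balance = balance - ? WHERE id = ?", (amount, src))
        if fail_midway:
            raise RuntimeError("simulated failure in the middle of the transaction")
        conn.execute(
            "UPDATE account SET balance = balance + ? WHERE id = ?", (amount, dst))
        conn.commit()
    except RuntimeError:
        conn.rollback()  # the database restores the state before the transaction

def balances(conn):
    """Return the balances ordered by account id."""
    return [row[0] for row in conn.execute("SELECT balance FROM account ORDER BY id")]

transfer(conn, 1, 2, 30, fail_midway=True)
after_failure = balances(conn)  # the half-finished transfer left no trace
transfer(conn, 1, 2, 30)
after_success = balances(conn)
```

After the simulated failure the balances are unchanged, and only the completed transfer becomes visible.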
Databases are central application components and are often protected by high
availability clusters or other technologies like Microsoft SQL Server® 2005/2008 R2
Database Mirroring. High availability clusters use the same database image that is
accessed from two servers (shared disk) for server redundancy. Database mirroring, on
the other hand, is able to maintain a physically independent copy of the critical data. The
main purpose of all these technologies is to protect the database service against loss
since it is the most critical component of a SAP system.
The SAP System Message Server registers all SAP system instances and balances
the user load by connecting new users to the most available server in the
system. Existing connections remain intact if a Message Server goes offline;
however, no new connections can be made through that server. This makes the Message
Server an ideal cluster solution candidate.
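As a rough picture of the two roles of the Message Server, the following sketch keeps a registry of instances and connects each new user to the instance with the most free capacity. The class name, the capacity metric, and the selection rule are assumptions for illustration and do not reflect the actual Message Server protocol.

```python
class MessageServerSketch:
    """Toy registry of instances; not the SAP Message Server protocol."""

    def __init__(self):
        # instance name -> free capacity (a hypothetical metric)
        self.free_capacity = {}

    def register(self, instance, free_capacity):
        self.free_capacity[instance] = free_capacity

    def unregister(self, instance):
        self.free_capacity.pop(instance, None)

    def pick_instance_for_logon(self):
        """Return the registered instance with the most free capacity."""
        if not self.free_capacity:
            raise LookupError("no instance registered")
        chosen = max(self.free_capacity, key=self.free_capacity.get)
        self.free_capacity[chosen] -= 1  # the new user consumes one unit
        return chosen
```

If the registry itself is unavailable, no instance can be picked for a new logon, while users who are already connected to an instance are not affected by this code path.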
The Enqueue Server is part of the SAP lock concept. The purpose of the SAP lock
concept is to synchronize data access in order to protect the consistency of SAP data
objects. This is one of the most important functions of a SAP system. It keeps SAP data
consistent by not allowing two users to make changes to the same data object at the
same time. Instead, the data would be locked for the first user.
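The lock concept can be reduced to a very small sketch. The class and method names below are hypothetical, and only exclusive locks are modeled; the real Enqueue Server is considerably richer than this.

```python
class LockTable:
    """Toy lock table: one owner per data object (exclusive locks only)."""

    def __init__(self):
        self.entries = {}  # object key -> owner

    def enqueue(self, obj, owner):
        """Try to lock obj for owner; return True on success."""
        holder = self.entries.get(obj)
        if holder is not None and holder != owner:
            return False  # already locked by another user
        self.entries[obj] = owner
        return True

    def dequeue(self, obj, owner):
        """Release the lock if owner holds it."""
        if self.entries.get(obj) == owner:
            del self.entries[obj]
```

A second user who requests the same object is refused until the first user releases the lock.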
The Enqueue Server in the following figure consists of a work process and a lock table in
the shared memory of the server that is used to store the lock information for an entire
SAP system. The Enqueue work process is needed in distributed systems to insert or
verify lock information on behalf of the dialog instances. Local work processes can
directly access the lock table and do not need this Enqueue work process. If, however,
the lock table is lost because of a server failure, lock information can no longer be
verified. In a distributed system, this would cause a transaction reset and rollback of all
pending transactions, even on dialog instances that would normally resume working, and all
session contexts would be lost. An example of a SAP AS 6.40 ABAP with a single point
of failure (SPOF) is shown in the following figure:

Figure 15. SAP AS 6.40 ABAP SPOF

This figure shows only the critical SAP AS components and is therefore not complete.
Another critical point resides in the file systems and network shares of the SAP
installation on Windows. It is important that the SAP system executables and profiles are
always installed with the central instance or SCS instance in newer systems. Access to
these files is provided through the SAPMNT share that is present only once per SAP
system. Executables available on this share are copied to the local machine before an
instance starts through the SAPCPE SAP program. This is done to improve the stability
of the SAP instance. However, the profiles are only read through this share.
The following figure shows the infrastructure of two servers: Server Alpha has the central
instance and Server Beta is a SAP application server. Server Alpha hosts the central
instance and therefore the SAPMNT share. Both instances have the share SAPLOC that
is used to access the local environment of a SAP instance. Both servers have two
environment variables: SAPGLOBALHOST and SAPLOCALHOST. The UNC names
\\SAPGLOBALHOST\sapmnt and \\SAPLOCALHOST\saploc are derived from these
variables. These names are used in the SAP kernel to locate the SAP system profiles
and system directories. Server Alpha has both variables set to the name of the local
server so all access points are local. However, Server Beta is directed to the central
server when accessing SAPGLOBALHOST. SAPLOCALHOST is used for all instance
specific operations and therefore is accessed again through a local path.
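The way the two UNC names follow from the environment variables can be sketched as shown below. The helper name `sap_unc_paths` is hypothetical, and the host names alpha and beta simply stand for Server Alpha and Server Beta from the example; the SAP kernel performs this resolution internally.

```python
import os

def sap_unc_paths(env=None):
    """Build the two UNC names from SAPGLOBALHOST and SAPLOCALHOST."""
    env = os.environ if env is None else env
    return {
        "sapmnt": rf"\\{env['SAPGLOBALHOST']}\sapmnt",
        "saploc": rf"\\{env['SAPLOCALHOST']}\saploc",
    }

# Server Alpha: both variables point to the local server.
alpha = sap_unc_paths({"SAPGLOBALHOST": "alpha", "SAPLOCALHOST": "alpha"})
# Server Beta: global files come from Alpha, instance files stay local.
beta = sap_unc_paths({"SAPGLOBALHOST": "alpha", "SAPLOCALHOST": "beta"})
```

On Server Beta the sapmnt name points to Server Alpha, which is why the loss of Server Alpha also affects every other instance of the system.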

Figure 16. Critical SAP AS components

The mentioned directory structure and the SAPMNT share must be protected within a
high availability solution because of their central significance for the SAP system. Since
the access to the UNC path is derived from the variable SAPGLOBALHOST, these files
are also called global files.
Prior to SAP AS version 7.0, in all ABAP or ABAP + Java systems, the central instance
was protected in a cluster. The reason was simple: It was not possible to separate the
Enqueue and the Message server from the rest of the SAP central instance. Together
with the central instance, the Enqueue server and the Message server, the global files
and the SAPMNT share were implicitly protected in a cluster as well.
With the development of the SAP Standalone Enqueue, it became possible for the first
time in SAP AS 6.40 to extract the central Message server and Enqueue
server components into a single instance. By doing so, the cluster configuration for
critical SAP services was significantly simplified. In version 6.40, the SAP System Central
Services (SCS) instance was introduced only for Java-based systems. SCS configurations
became available for ABAP-based systems with SAP AS 7.0. These configurations are
called ASCS instances.
Note: All high availability configurations of SAP systems today are based on the SCS
instances, the protection of the SAPMNT share, and the GLOBAL files in a failover
cluster.

One of the main benefits of this configuration compared to the protection of a complete
central instance in older versions is the fact that only two relatively lightweight services
need to be moved and restarted. SCS instances lead to shorter failover times and more
stability in the cluster implementation. Since there are no SAP users connected to a SCS
instance, the effect of a failover is also much smaller in the SAP system. Using the SAP
Replicated Enqueue in addition to SCS high availability configurations enables
enterprises to minimize application server interruptions. For more information, see the
Measures to Avoid Unplanned Downtime section.
The following list shows which configuration is supported by which version.
Pure ABAP systems:
• Up to version 6.40, the central instance is clustered.
• During an upgrade of an existing 6.40 central instance to 7.0, the established
architecture remains intact. SAP has documented the migration steps to support the
new ASCS structure in SAP note 1011190.
• When initially installing SAP AS 7.0, only the SCS/ASCS instance will be clustered.
Pure Java systems:
• Since the SAP AS 6.40 SR1 release, only the SCS is clustered. No changes are
needed to upgrade to 7.0.
ABAP + Java systems:
• Since the SAP AS 6.40 SR1, the Java SCS instance together with the ABAP central
instance is clustered.
• With the new installation of SAP AS 7.0, only the ASCS instance and the SCS
instance are clustered.

SAP Enqueue process special requirements


The dependency on the single lock table was not completely resolved by the introduction
of the SAP Standalone Enqueue. To address this issue, the SAP Standalone Enqueue
was combined with a SAP Replicated Enqueue on another server. The following figure
shows the new system design with SCS instances to host the central and critical SAP
system services. There would be one for ABAP and one independent SCS instance for
Java. Any other component of the SAP System, such as the SAP AS that handles the
user workload, would not be considered critical because they are implicitly redundant if
more than one is configured in the system. Because of this redundancy, only the SCS
instances require high availability protection measures.

Figure 17. New system design with SCS instances

If one combines the SAP Standalone Enqueue in the SCS instance with a SAP
Replicated Enqueue running on a second server, one can continuously replicate the lock
table. In a larger SAP system with several SAP application servers, this allows operation
with minimal interruptions, even when the central services must
be transferred to another server due to a hardware failure.
The SAP Replicated Enqueue can only be used for lock table replication and cannot
function as a regular Enqueue server of a SAP system. The lock table in the SAP
Replicated Enqueue, which holds the replicated lock entries, cannot be used directly for
the Enqueue service. During the process of a failover of the Enqueue server to the
replication site, the standard Enqueue process is first started and a new, empty lock
table is created. The replicated data in the shadow lock table is then read and
transferred to the original lock table before the system is operational again.
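The failover sequence described above can be sketched as follows. This is a toy model with hypothetical names; both tables live in one Python object here, whereas in a real installation they reside in the shared memory of two different servers.

```python
class ReplicatedEnqueueSketch:
    """Toy model of lock table replication and failover."""

    def __init__(self):
        self.lock_table = {}    # used by the active Enqueue server
        self.shadow_table = {}  # kept by the replication server on a second host

    def enqueue(self, obj, owner):
        """Grant an exclusive lock and replicate it; return True on success."""
        if obj in self.lock_table and self.lock_table[obj] != owner:
            return False
        self.lock_table[obj] = owner
        self.shadow_table[obj] = owner  # every granted lock is replicated
        return True

    def dequeue(self, obj, owner):
        """Release a lock on both the active and the shadow table."""
        if self.lock_table.get(obj) == owner:
            del self.lock_table[obj]
            self.shadow_table.pop(obj, None)

    def fail_over(self):
        """Simulate the loss of the first server and the restart on the second."""
        self.lock_table = {}                       # new, empty lock table
        self.lock_table.update(self.shadow_table)  # read the replicated entries
```

After `fail_over()`, locks granted before the failure are still honored, so a competing request for the same object is still refused.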
The following figure shows the configuration of several SAP application servers and a
central instance in combination with a SAP Replicated Enqueue:

Figure 18. A SAP Replicated Enqueue

A SAP Replicated Enqueue should be combined with a high availability cluster solution.
One reason for this is to enable the administrator to switch the Message server from one
server to another in case of a severe failure. Another reason for this setup is that the
SAP Replicated Enqueue is not a fully functional Enqueue server. Instead, it is only used
to replicate the lock table. During regular operation, the SAP Replicated Enqueue only
inserts lock requests into a standby lock table on a second server. If the original
server fails, the normal Enqueue Server needs to fail over to this server and resume work
with this replicated lock table. High availability cluster solutions are also
used to protect the database against hardware failures.
Since the Message and Enqueue servers in the SCS instance have very low resource
requirements on a server within a high availability cluster, it is possible to install
additional local application servers. In this context, local means that they are not
managed by cluster management and are lost if the respective server
hardware fails.

SAP standalone engines


The typical AS for all kinds of SAP applications is the SAP AS. One of the benefits of the
SAP AS is the common architecture that can be reused to provide the runtime
environment for the majority of requirements. There are, however, a few special
requirements that call for a more tailored design or special software components. In
SAP terminology, the components that meet them are called SAP standalone engines.

The SAP Web Dispatcher


The SAP Web Dispatcher is a software load balancer for HTTP or HTTPS connections.
Typically, it will be installed in a demilitarized zone (DMZ) between the SAP backend
systems and public Internet access.
Connection requests from the Internet are passed by the SAP Web Dispatcher to the
available application servers of the SAP system in turn. The routing algorithms review
the capacity and load of the various instances to determine which server to connect
to. With ABAP instances, the number of configured dialog work processes will be
evaluated. With Java instances, the number of available server processes determines
which server gets the next connection request.
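One way to picture capacity-based distribution is the sketch below. It is not the actual algorithm of the SAP Web Dispatcher: the function name, the rule of choosing the instance with the lowest ratio of assigned requests to capacity, and the capacity figures are assumptions. The capacity stands for the number of dialog work processes of an ABAP instance or the number of server processes of a Java instance.

```python
def make_dispatcher(capacities):
    """Return a function that picks the instance for the next request.

    capacities maps an instance name to its capacity figure. The instance
    whose assigned requests are lowest relative to its capacity is chosen.
    """
    assigned = {name: 0 for name in capacities}

    def next_instance():
        name = min(assigned, key=lambda n: (assigned[n] + 1) / capacities[n])
        assigned[name] += 1
        return name

    return next_instance

# Hypothetical landscape: abap1 has twice the capacity of abap2.
next_instance = make_dispatcher({"abap1": 2, "abap2": 1})
picks = [next_instance() for _ in range(6)]
```

Over these six requests, the instance with twice the capacity receives twice as many connections.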
The architecture of the SAP Web Dispatcher corresponds to that of the SAP Internet
Communication Manager (ICM), which is a component of every ABAP instance. While
an ICM forwards incoming connection requests directly to a dialog work process of an
ABAP instance or to the Java dispatcher of a dual-stack installation, the SAP Web
Dispatcher passes those requests first to the respective ICM of a SAP instance which in
turn further processes the request. The SAP Web Dispatcher basically acts as a
software router for the incoming HTTP requests.
Because of its central function for Internet communications, the SAP Web Dispatcher
is also a critical component in a system landscape that needs to be protected against
hardware failures. Since the SAP Web Dispatcher looks like a SAP ABAP instance, it
can be integrated relatively easily into a high availability cluster solution and therefore be
protected against hardware failure. The typical structure of a SAP landscape using a
Web Dispatcher is shown in the following figure:

Figure 19. SAP landscape using a Web dispatcher



In contrast to most high availability installations, the installation of a SAP Web
Dispatcher in a cluster is not supported by SAPINST. SAP note 834184 provides the
steps to manually configure a WSFC for the SAP Web Dispatcher in detail. SAP notes
can be downloaded from the SAP Service Marketplace at https://service.sap.com.
Note: Access to this Web site is available only to registered SAP customers and
partners and requires a user name and password.
Additional SAP Web Dispatcher administration information is available at:
http://help.sap.com

To access this information, do the following:


• From the left menu pane, click SAP NetWeaver.
• Choose English under SAP NetWeaver 7.0 Library.
• Open Technical Operations Manual for SAP NetWeaver.
• Open Administration of Standalone Engines.
• Follow the SAP Web Dispatcher link.
SAP standalone gateway
The SAP gateway enables SAP systems and external programs to communicate with
one another. The protocol for the communication is the Common Programming Interface
Communication (CPI-C) which is also used by the Remote Function Call (RFC) interface.
Subsequently, all RFC connections in a SAP system rely on the SAP gateway process.
By default, each SAP AS has one gateway process configured.
In certain cases, it is also possible to configure a standalone gateway process. One
example would be to configure a standalone gateway for the System Landscape
Directory (SLD). As the SLD is a component of the Java AS, the standalone gateway
acts as a bridge to allow ABAP systems to read and write data in the SLD through RFC.
Another typical use case is the installation of a standalone gateway on a single database
instance with no ABAP engine. In this case, the gateway is needed in order to make the
database calendar (Transaction DB13) functional. This configuration is also described in
SAP note 657999.
To configure a standalone gateway in a failover cluster, the simplest approach is to add
the gateway to the Enqueue and Message server processes of a SCS instance. This
configuration is described in SAP note 1010990.
TREX
TREX is an abbreviation for Text Retrieval and Extraction and is a search engine
designed to search for structured and unstructured data. TREX provides SAP
applications with numerous services for searching, classifying, and text mining in large
document collections or unstructured data. In addition, TREX provides SAP applications
with services for searching and aggregating business objects or structured data.
This search engine is used as a standalone engine in combination with the SAP
Enterprise Portal (EP) or the Knowledge Management (KM) application. For access to
the TREX search engine, each SAP AS has ABAP or Java components that support the
communication with the engine. The most simplified form of a TREX installation is shown
in the following figure:

Figure 20. Basic TREX installation

TREX is one example of a SAP solution that does not rely on a standard SAP AS, but
runs on a special server architecture. TREX installations can also be implemented as
master/slave configurations spanning several physical servers.
Additional information about the distribution and implementation of a TREX engine is
available at http://service.sap.com/instguidesnw70.
Note: Access to this Web site is available only to registered SAP customers and
partners and requires a user name and password.
To access this information after logging on to the site, do the following:
• From the left menu pane, click SAP NetWeaver.
• Choose English under SAP NetWeaver 7.0 Library.
• Open Technical Operations Manual for SAP NetWeaver.
• Open Administration of Standalone Engines.
• Follow the Search and Classification (TREX) link.
General information about TREX is available at http://help.sap.com.
SAP liveCache
SAP liveCache is a component of the Advanced Planning and Optimization (APO)
application that supports the SAP SCM solution: an application for supply chain
management in the mySAP suite. SAP liveCache is a memory resident database for
rapid access. The foundation of this technology is derived from SAP MaxDB,
formerly known as SAP DB. In addition to this memory resident database, each APO
system has a normal database for the APO data and programs. In order to access data
objects in the liveCache rapidly during operation, those objects are loaded into the
liveCache at startup. A special logging mechanism writes savepoints to the disk every
few minutes; these savepoints do not reflect the transactional state of the system.
APO systems consist of a SAP AS and a liveCache as standalone engine. From the
perspective of high availability, there are two solutions possible to protect the liveCache:
• A failover cluster for the APO system and the liveCache. LiveCache is supported in
the WSFC as of SAP NetWeaver 7.0 SR1.
• A hot standby liveCache where the database log files are exported to a standby
server and constantly applied to a database in recovery mode. This log shipping
solution works with two independent servers that do not share common storage.
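The log shipping principle behind the hot standby option can be sketched as follows. This is a minimal illustration in Python, not liveCache code: the class name StandbyDatabase, the sequence-numbered log files, and the key/value data model are assumptions made only for this example.

```python
from dataclasses import dataclass, field


@dataclass
class StandbyDatabase:
    """Hypothetical standby that stays in recovery mode and replays shipped logs."""
    data: dict = field(default_factory=dict)
    last_applied: int = 0  # sequence number of the last log file applied

    def apply_log(self, sequence: int, changes: dict) -> bool:
        """Apply one shipped log file; refuse gaps so the replay stays in order."""
        if sequence != self.last_applied + 1:
            return False  # duplicate or out of order: wait for the missing file
        self.data.update(changes)
        self.last_applied = sequence
        return True


standby = StandbyDatabase()
# Log file 3 never arrived, so log file 4 must not be applied yet.
shipped_logs = [
    (1, {"order_1": "planned"}),
    (2, {"order_1": "released"}),
    (4, {"order_2": "planned"}),
]
applied = [standby.apply_log(seq, changes) for seq, changes in shipped_logs]
```

The standby only moves forward when the next log file in sequence is available, which is why both servers can work without shared storage.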
Additional information about the installation of SAP liveCache and cluster configurations
in WSFC is available at http://service.sap.com/instguidesnw70.
Note: Access to this Web site is available only to registered SAP customers and
partners and requires a user name and password.
In addition, see the following SAP note about the configuration of liveCache in WSFC at
https://service.sap.com/notes.
Note: Access to this Web site is available only to registered SAP customers and
partners and requires a user name and password.
• SAP note 780795: "SAP liveCache 7.5: WSFC Installation"
General information about the administration of the SAP liveCache is available at
http://help.sap.com.
To access this information after logging on to the site, do the following:
• From the left menu pane, click SAP NetWeaver.
• Choose English under SAP NetWeaver 7.0 Library.
• Open Technical Operations Manual for SAP NetWeaver.
• Open Administration of Standalone Engines.
• Follow the SAP liveCache Technology link.

SAP Content Server


The SAP Content Server is an independent server instance for temporary data and Web
documents that can be requested by the SAP AS through the Internet. By using the
Content Server, large document volumes can be maintained for cached access.
A SAP Content Server can be installed together with a SAP AS on a physical server or
as standalone instance. It is possible to install this server with or without its own
database. When installing this server with its own database, MAXDB is typically used.
The simultaneous use of a SAP AS database and a Content Server is not supported. In
order to protect the SAP Content Server against loss, it can be configured in a failover
cluster.
The following SAP notes located at https://service.sap.com/notes provide information
about the installation and clustering of the SAP Content Server.
Note: Access to this Web site is available only to registered SAP customers and
partners and requires a user name and password.
• SAP note 175096: "SAP Content Server installation guide"
• SAP note 1039401: "SAP Content Server Clustering with Windows 2003"

To access this information, do the following:

• From the left menu pane, click SAP NetWeaver.
• Choose English under SAP NetWeaver 7.0 Library.
• Open Technical Operations Manual for SAP NetWeaver.
• Open Administration of Standalone Engines.
• Follow the SAP Content Server link.

Additional SAP Content Server administration information is available at
http://help.sap.com.

Unplanned Downtime Avoidance Strategies


Unplanned downtime is one of the biggest concerns in any IT operation. Especially with
the dependencies companies have on SAP applications, downtime of such services at
undesired times would block the normal execution of work and could generate severe
financial impacts due to lost revenue. The reason for unplanned downtime is most often
a failure in hardware components that affects application service availability.

Hierarchy of high availability solutions


Unplanned application interruptions often coincide with the loss of a hardware resource
that is important for the application operation. In addition to the importance of
resource redundancy, supervision and control of application operations is critical for high
availability as well. For highly available SAP system operation, several solutions have
evolved that either independently or in combination assure better protection against the
unplanned downtime of a SAP application. The following figure shows the technical
solution hierarchy:

Figure 21. Technical solution hierarchy

Securing a SAP application against interruption due to the loss of hardware resources
generally requires applying several techniques. Applying these techniques will lead to
better application protection.

Data storage protection


The lowest level of the high availability hierarchy manages the way data is stored and
made available in a secure and reliable way. Data storage protection is typically widely
implemented in a SAP data center. Most of the storage devices offer some level of
protection by default.
However, storage subsystems still have a number of challenges for a SAP data center.
First, the amount of data grows rapidly over time. In addition, the data needs to be
constantly protected against hardware failure and against loss of data access
performance, which directly affects overall SAP system performance. SAP system performance is
typically measured by user transaction response time.

SAN infrastructures
A Storage Area Network (SAN) provides a centralized approach to maintaining the
storage resources needed in a computer system. Traditionally, Direct Attached Storage
(DAS) has been used for the computer system local storage requirements. The use of
DAS has high space requirements and administration costs. By centralizing data storage
into a scalable, network type architecture, administrative costs are lowered, and space is
managed more efficiently. SANs can be built on Fibre Channel connections using fiber
optic cables and built on the SCSI protocol for block-oriented data transfer.
In addition, iSCSI devices are now available that use normal TCP/IP networks for the
transport. The SCSI protocol for the data transfer is packaged into the TCP/IP transport.
From the high availability perspective, the use of SAN infrastructures in data centers is
recommended. Redundancy of physical disks and protection against the single disk
failure is maintained in the storage subsystem itself and follows the hierarchical
approach shown at the beginning of this section. Depending on the vendor and the type
of storage subsystems in a SAN, even data replication over larger distances can be
achieved with SAN-based storage.
Additional information on the concepts of the SAN infrastructure for highly available
Windows systems can be found in the Server Cluster: Storage Area Networks white
paper by searching for the title at http://technet.microsoft.com/en-us/default.aspx.

Multipathing
Data storage protection against unplanned downtime always includes connection
protection between a server and its storage. If there is only one storage subsystem host
adapter and subsequently only one storage cable connection, any host adapter, cable,
or controller failure in the storage array would create an application interruption. The use
of a WSFC could help protect the server components such as the host adapter.
However, it is preferable to avoid connection failovers in a cluster. These failover types
can be avoided by using a redundant host adapter and two cable connections to a
storage device that in turn has two storage controllers. This configuration is called
multipathing.
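The effect of a second path can be illustrated with a short Python sketch. The component names (hba1, cable1, controller1, and so on) and the function storage_reachable are hypothetical and only model the reasoning above: storage stays reachable as long as at least one path contains no failed component.

```python
def storage_reachable(paths, failed):
    """Return True if at least one path has no failed component."""
    return any(not (set(path) & failed) for path in paths)


# Each path lists the components an I/O request has to pass through.
single_path = [("hba1", "cable1", "controller1")]
multipath = [
    ("hba1", "cable1", "controller1"),
    ("hba2", "cable2", "controller2"),
]

failed = {"cable1"}
single_ok = storage_reachable(single_path, failed)
multi_ok = storage_reachable(multipath, failed)
```

With a single path, the failed cable interrupts the application. With two independent paths, the same failure is absorbed without a cluster failover.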
The Windows operating system supports multipathing through the MPIO driver.
Additional information for MPIO configurations is found in the Multipathing and the
Microsoft MPIO Driver Architecture white paper at:
http://download.microsoft.com/download/3/0/4/304083f1-11e7-44d9-92b9-
2f3cdbf01048/mpio.doc

Server protection
Servers host the individual components such as the SAP instances and services that
compose a SAP system. The server role and importance depends on its function. A
database server for a productive SAP system typically has the highest requirements in
availability, stability, and performance. While SAP specific solutions are discussed later
in this paper, there are a number of general server recommendations that incorporate
high availability.
With high availability, redundancy is the method to protect servers against downtime.
Inside of a server this could mean that the server has two independent power supplies
with two power cords. Of course, each power cord needs to supply enough energy to
sustain the operation in case the other one fails. It might also include redundant host
adapters for storage or network access. Finally, a conceptually well designed system
with hot pluggable components is always valuable.
However, there are server components that cannot be easily configured to be redundant.
Main memory or CPUs are examples of these critical components as well as the server
operating system that also exists only once. There are two solutions that are typically
used to address this. One solution would be to use fault tolerant systems built to recover
from memory or CPU hardware failures. However, the disadvantage to this solution is
the limited performance range and higher prices.
The second solution for protecting servers against failures is high availability clusters like
the WSFC. With WSFC, two or more servers share storage subsystem access and can
take over the storage volumes and restart applications automatically in case a server
fails. This concept even maintains redundancy at the operating system level as each
server has its own operating system. However, clusters depend on additional software
components and need a proper configuration and a change management policy. We will
discuss the possible cluster implementations with SAP applications later in this section.
Network high availability
Networks are the backbone of all corporate communication, both internally and
externally. The SAP application network implementation has multiple communication
layers based on different functionalities including:
• A server network that interconnects SAP application servers and the database
server.
• A client network for local users using the SAP GUI or a browser.
• A demilitarized zone for connection to the public Internet.
• A provider for access to the public Internet.
Again, component redundancy is the key factor for high availability solutions. However,
the architecture of a real implementation reflects additional considerations. For example,
public access to the Internet immediately raises security concerns and has more
requirements than the internal and isolated server network. In the server network,
performance might be an issue in addition to availability. The following figure shows the
different SAP network aspects.

Figure 22. Different SAP network aspects

Important considerations for highly available SAP networks include:


• Router and switch redundancy: The server network is redundant up to the used
routers or switches. Redundancy can be accomplished by network teaming of
Network Interface Cards (NIC). Routers need to monitor each other and take over
the functionality of a failed device.
• Client separation: Clients are typically not connected in a redundant way. However,
there needs to be a separation of clients between different switches so that only a
single client group can be affected by a hardware failure.
• DMZ redundancy: In the DMZ, there is typically a redundant SAP Web dispatcher
configuration or hardware load balancers.
• Internet access redundancy: Redundant access to the public Internet is necessary.
Additional information about the SAP systems high availability network requirements can
be found in the SAP Help pages at http://help.sap.com.

To access the information, do the following:

• In the left menu pane, click SAP NetWeaver.
• Under SAP NetWeaver 7.0 Library, choose English.
• Search for Network High Availability.

A description of the SAP landscape and SAP system network requirements can be found
at http://sdn.sap.com.

Application specific configurations


High availability clusters are a classic solution for protecting SAP applications against
critical hardware resource failure. In the Windows-based SAP installations, SAP
supports the database and the SAP SCS instance installation in a WSFC.
SAP components can be installed into the cluster by using the SAP installation tool, SAPINST. In the
simplest configuration, the cluster consists of two servers and a storage subsystem that
the application components are installed on. The storage subsystem has to be
accessible by both servers. A SAP system with a SQL Server database is shown in the
following figure:

Figure 23. SAP system with a SQL Server database

Each of the cluster nodes has its own local operating system with the SQL Server engine
installed locally. Each node must be capable of accessing the external storage
subsystem where the application components are installed. Supported storage systems
include Serial Attached SCSI (SAS), Fibre Channel, or iSCSI-based systems.
Every WSFC cluster needs to maintain a copy of the cluster database that contains
cluster configuration information. This information determines which cluster node can
take ownership of the cluster resource group for the SAP application and database in
case the communication between the nodes is interrupted. When two servers compete
for the cluster resource group, this is known as Split-Brain syndrome and can generate a
deadlock.
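One common way to avoid such a deadlock is majority voting. The following Python sketch is a simplified model and not the WSFC implementation: it assumes that every node carries one vote and that an optional Disk Witness carries one additional vote. The function name has_quorum and its parameters are illustrative.

```python
def has_quorum(reachable_nodes: int, total_nodes: int,
               witness_configured: bool = False,
               holds_witness: bool = False) -> bool:
    """Return True if this partition holds a strict majority of all votes."""
    total_votes = total_nodes + (1 if witness_configured else 0)
    partition_votes = reachable_nodes + (
        1 if witness_configured and holds_witness else 0
    )
    return partition_votes > total_votes / 2


# Two-node cluster whose node-to-node link is cut: each node reaches only itself.
node_a_continues = has_quorum(1, 2, witness_configured=True, holds_witness=True)
node_b_continues = has_quorum(1, 2, witness_configured=True, holds_witness=False)
no_witness_continues = has_quorum(1, 2)
```

In this model only the node that holds the witness keeps the cluster resource group, so the two servers never run the same group at the same time. Without a witness, neither half of a two-node cluster has a majority.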
In the simple cluster configuration shown in the previous figure, the cluster database is
stored on each node. If the cluster uses a Disk Witness, the Disk Witness will also store
a copy. Applications that are protected in a WSFC cluster are configured in cluster
groups. A cluster group contains the application resources like the shared disk storage
volume that contains the SAP installation file system. In the case that a cluster group
needs to be transferred from one server to another, such as during a hardware failure,
these resources must become available on the second server before the cluster service
can start the application there. Cluster resources can be configured to handle
dependencies on other resources. For example, it makes no sense to start a SAP SCS
instance before the SAP system database is available.
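Resource dependencies determine the order in which resources are brought online. The sketch below uses the Python standard library module graphlib (Python 3.9 or later) to derive a valid start order. The resource names and the dependency map are hypothetical and only mirror the example in the text; they are not taken from an actual cluster configuration.

```python
from graphlib import TopologicalSorter

# Map each resource to the set of resources that must be online before it.
dependencies = {
    "shared_disk": set(),
    "virtual_ip": set(),
    "network_name": {"virtual_ip"},
    "database": {"shared_disk", "network_name"},
    "scs_instance": {"shared_disk", "network_name", "database"},
}

# static_order() yields every resource after all of its dependencies.
start_order = list(TopologicalSorter(dependencies).static_order())
```

Stopping the resources for a move to another server would use the reverse of this order.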
For the exchange of status information between the members of the WSFC cluster, a
private network is required. Since the status information that is periodically sent out is
similar to a heartbeat, the network is called the cluster heartbeat network.
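The heartbeat principle can be sketched in a few lines of Python. The timestamps, the five-second timeout, and the function name failed_nodes are assumptions chosen for illustration; the real cluster service uses its own intervals and thresholds.

```python
def failed_nodes(last_heartbeat: dict, now: float, timeout: float) -> list:
    """Return the nodes whose last heartbeat is older than the timeout."""
    return sorted(
        node for node, seen in last_heartbeat.items() if now - seen > timeout
    )


# Time of the last heartbeat received from each node, in seconds.
last_heartbeat = {"node_a": 100.0, "node_b": 94.0}
suspects = failed_nodes(last_heartbeat, now=101.0, timeout=5.0)
```

A node that misses heartbeats for longer than the timeout is treated as failed, and its cluster groups become candidates for a failover.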
Every SAP application network connection in the cluster is assigned a virtual IP address
that is activated on a server by the cluster service when starting the SAP cluster group.
While the virtual IP addresses are activated only on the server that runs the application,
all network cards also have configured local IP addresses that are permanently
assigned.
Additional information about Windows Server 2008 R2 failover clusters can be found at:
http://www.microsoft.com/windowsserver2008/en/us/clustering-home.aspx

Simple cluster for a single SAP system


High availability solutions should be as simple and controllable as possible. To simplify
and reduce dependencies of SAP applications in a cluster, the System Central Services
(SCS) instance was created. This instance only contains critical SAP system
components.
The simplest variant of a SAP system in a high availability cluster is a two-node cluster
with a SAP SCS instance on one side and a database on the other. In case of a failure,
the respective surviving server must be able to start up the cluster group it took over in
addition to the existing cluster group running on this server. This means when
determining the sizing of the WSFC nodes, enough CPU power and main memory must
be available to run both cluster groups simultaneously at any time.
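The sizing rule can be written as a simple check. The following Python sketch uses invented resource figures and the hypothetical names Demand and node_can_host_all; it only illustrates that a node must cover the sum of both cluster groups, not just the group it normally runs.

```python
from dataclasses import dataclass


@dataclass
class Demand:
    cpu_cores: int
    memory_gb: int


def node_can_host_all(node: Demand, groups: list) -> bool:
    """Return True if the node covers the combined demand of all cluster groups."""
    return (
        sum(g.cpu_cores for g in groups) <= node.cpu_cores
        and sum(g.memory_gb for g in groups) <= node.memory_gb
    )


# Hypothetical figures for one SCS cluster group and one database cluster group.
scs_group = Demand(cpu_cores=2, memory_gb=8)
db_group = Demand(cpu_cores=12, memory_gb=96)

well_sized = node_can_host_all(Demand(16, 128), [scs_group, db_group])
undersized = node_can_host_all(Demand(16, 96), [scs_group, db_group])
```

The second node would be sufficient for the database alone, but not for both groups after a failover.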
Under normal circumstances, however, the cluster groups are distributed on the cluster
nodes. Since SCS instances are not used for the transactional load of a SAP system,
there must be additional SAP application servers installed inside the cluster or on
additional servers outside the cluster.
If such SAP application servers are installed on WSFC nodes, these application servers
are not configured in a cluster group and will not switch to another WSFC node in case
of a hardware failure. The proper server sizing for the additional application servers is a
bit more complicated as these resources also have to be taken into consideration. The
following figure shows a SAP system with the SCS instance, the database in the WSFC
cluster, and additional SAP application servers on separate servers.

Figure 24. Simple cluster for single SAP systems

The installation of a SAP system in WSFC cluster solutions is described in the SAP
Installation Guide for the respective SAP NetWeaver release at:
https://service.sap.com/instguides

A user name and password are required to access this Web site. To access this
information after logging on to the site, do the following:
• From the left menu, open SAP NetWeaver.
• Select SAP NetWeaver 7.0 (2004s).
• Select Installation.
• Select Installation Guide - SAP NetWeaver 7.0 SR3 or Installation Guide -
SAP NetWeaver 7.0 SR2.
• Select Windows and the installation type (ABAP, ABAP + Java, or Java).
There are a number of SAP notes that provide additional information about WSFC
related issues at https://service.sap.com/notes.
Note: Access to this Web site is available only to registered SAP customers and
partners and requires a user name and password.

The following table lists the most important notes:

SAP note   Description
106275     Availability of SAP components on Microsoft Cluster Server
139513     Merge transports for high availability systems
779253     Clustering your Java Add-In Systems on Windows
941092     MSCS: Post-Upgrade Steps for systems upgraded to NW 7.0 SR<x>
962955     Use of virtual TCP/IP host names
967123     SAP NetWeaver 7.0 / Business Suite 2005 SR2: Windows
1010990    Configuring a Standalone Gateway in a HA ASCS Instance
1011190    MSCS: Splitting the central instance after upgrading to 700
1043592    MSCS: Cluster Resource Monitor Crashes on W2K3 SP2
1172679    Troubleshooting MSCS Issues
Table 3. WSFC related SAP notes

Using multiple clusters for SAP instances and databases


A possible scenario for protecting multiple SAP systems in high availability
configurations is to construct separate clusters for the SCS instances and the databases.
However, the method chosen to protect the database does not have to be a cluster
solution. SQL Server database mirroring would be an alternate solution.
Since an SCS instance is a very lightweight service, the active server CPU utilization is
typically minimal. In order to maximize the use of the available computer power, it is
possible to install local SAP application servers on each WSFC node. This installation
type has been supported since SAP NetWeaver 7.0 SR2. A possible layout in a
landscape with several SAP systems is shown in the following figure:

Figure 25. SAP instance and database cluster

The description of a local SAP application server installation inside of a WSFC cluster is
the same as the standard SAP application server description. The SAP installation
guides for NetWeaver 7.0 describe the setup starting with the version SR2. These
guides are available at https://service.sap.com/instguides.
Note: Access to this Web site is available only to registered SAP customers and
partners and requires a user name and password.
To access this information after logging on to the site, do the following:
• From the left menu, open SAP NetWeaver.
• Select SAP NetWeaver 7.0 (2004s).
• Select Installation.
• Select Installation Guide - SAP NetWeaver 7.0 SR3 or Installation Guide -
SAP NetWeaver 7.0 SR2.
• Select Windows and the installation type.
An overview of the supported WSFC configurations is available from SAP in the
MSCS Configuration and Support Information for SAP NetWeaver ’04 and the SAP
NetWeaver 7.0 Systems white paper at http://sdn.sap.com/irj/sdn/windows.

SAP Replicated Enqueue


As described in the SAP architecture and the Requirements section, the Enqueue
service, or more precisely the lock data in the Enqueue lock table, plays an important role
for the overall SAP system operation. Especially in distributed installations with
additional SAP AS, losing the lock table always aborts all
pending transactions. In pure Java systems, the effect of losing the lock data is even
more severe than in ABAP-based installations.
Since the introduction of the SAP SCS instances for Java in SAP AS version 6.40 and
for ABAP in SAP AS version 7.00, SAP Replicated Enqueue is an option to solve this
issue. The SAP Replicated Enqueue requires the Standalone Enqueue for proper
function. Since the SAP Standalone Enqueue is only used in the SCS instance, the
installation of a SCS instance is required. The following figure shows two SAP systems
where the SCS instance and the database of each system were each configured in an
independent WSFC cluster.

Figure 26. SAP Replicated Enqueue

The regular Enqueue service and its lock table are on the server from which the SCS
instance was started. The second server in the cluster has a SAP Replicated Enqueue in
addition to the active database. Additional application servers are located on servers that
are not in a cluster formation. All lock requests from the active Enqueue servers will be
mirrored onto the Replicated Enqueue.
In case of a severe SCS server hardware problem, the SCS instance will be transferred
to the database server and started from there. During this process, the SAP Replicated
Enqueue is stopped and the lock information from the mirrored lock table is copied into
the new lock table of the regular server. Therefore, the SAP AS outside these clusters
does not lose any information and their running transactions are not influenced.
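The replication and takeover sequence can be modeled in a few lines of Python. This is a conceptual sketch only: the class EnqueueServer, the function take_over, and the dictionary used as the replicated lock table are illustrative assumptions and do not reflect the actual SAP implementation or its protocol.

```python
class EnqueueServer:
    """Toy lock server that mirrors every change into an optional replica table."""

    def __init__(self, replica=None):
        self.lock_table = {}
        self.replica = replica  # dictionary standing in for the Replicated Enqueue

    def lock(self, obj, owner):
        if obj in self.lock_table:
            return False  # object is already locked
        self.lock_table[obj] = owner
        if self.replica is not None:
            self.replica[obj] = owner
        return True

    def unlock(self, obj, owner):
        if self.lock_table.get(obj) != owner:
            return False
        del self.lock_table[obj]
        if self.replica is not None:
            self.replica.pop(obj, None)
        return True


def take_over(replica):
    """Start a new lock server on the surviving node from the mirrored table."""
    new_server = EnqueueServer()
    new_server.lock_table = dict(replica)
    return new_server


replica_table = {}
active = EnqueueServer(replica=replica_table)
active.lock("order_4711", "user_1")
active.lock("order_4712", "user_2")
active.unlock("order_4712", "user_2")

# The SCS node fails; the Enqueue service restarts where the replica lives.
restarted = take_over(replica_table)
```

After the takeover, the lock held by user_1 is still in place, so a competing request for the same object is rejected exactly as before the failure.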
The SAP Installation Guides for SAP NetWeaver 7.0 SR2 and SR3 describe the cluster
setup for the Enqueue Replication and the Enqueue Replication server installation.

SAP note 524816 gives detailed information about the SAP Standalone Enqueue. SAP
note 804078 describes the concept of the SAP Replicated Enqueue and how it can be
used to protect a SAP system. Attached to this note is also an installation guide for the
Enqueue Replication server in a WSFC cluster. In addition, the SAP lock concept and
high availability solutions are described at http://help.sap.com.

To access this information, do the following:

• From the left menu pane, click SAP NetWeaver.
• Choose English under SAP NetWeaver 7.0 Library.
• Open Technical Operations Manual for SAP NetWeaver.
• Open Administration of Standalone Engines.
• Follow the Standalone Enqueue Server link.

Multi-SID cluster
A limitation of older Windows-based cluster configurations was that only one SAP
system per cluster could be configured. The reason for the restriction was
the SAPMNT share. Any access to the SAP system global files in a distributed
installation has to use this share. Since the share is configured on the <Drive>:\usr\sap
directory, there is only one unique location in the file system.
Underneath this path, there is a <SID> directory that hosts all the data for a specific SAP
system. The consequence of this structure is that if there is more than one SAP system
installed on the server, the share would contain the global data for all SAP systems.
Since this share has to be relocated to another server in the WSFC cluster in case of a
failover, that operation would impact all SAP systems. Because of this, SAP does not
support this configuration.
This restriction is resolved by using a new SAP
installation method. With this method, the SAP system disks are linked with the <SID>
directory under <Drive>:\usr\sap by using junctions. Junctions are similar to symbolic
links in UNIX. They are a file system redirection that allows access to a
designated directory to be automatically transferred to another directory. For example, the
following figure shows the principle setup of a WSFC cluster with three SAP systems
with AAA, BBB, and CCC designations. Each SAP system has its own hard disk that can
be accessed on shared drives from both the servers in the cluster. SAP system AAA and
system BBB run on Server A and system CCC runs on Server B.

Figure 27. Principle junction architecture.

The SAPMNT share previously was configured as a cluster resource inside the cluster
configuration. Now it is in the local operating system of the respective server. The share
is stationary and is no longer managed through the cluster. Under the C:\usr\sap
directory path are three directories: AAA, BBB, and CCC. These directories have been
created in both servers.
Depending on system type, the directories in the following table are created on the
shared drives:
SAP system type       Shared drive directory
All system variants   \usr\sap\<SID>\SYS
Java                  \usr\sap\<SID>\SCS<InstanceNr>
ABAP                  \usr\sap\<SID>\ASCS<InstanceNr>
ABAP + Java add-in    \usr\sap\<SID>\SCS<InstanceNrJava>
                      \usr\sap\<SID>\ASCS<InstanceNrABAP>

Table 4. Directories per system type



Next, all the junctions are created from the local hard drive of every server. To create
junctions, the executable linkd.exe from Microsoft is available. The executable is a part
of the Microsoft Windows resource kit. The syntax for the commands is:

linkd.exe <Argument1> <Argument2>

Depending on the system type, the arguments are listed in the following table:

SAP system type       Junction creation arguments
All system variants   <localdrive>\usr\sap\<SID>\SYS
                      <shareddrive>\usr\sap\<SID>\SYS
Java                  <localdrive>\usr\sap\<SID>\SCS<InstanceNr>
                      <shareddrive>\usr\sap\<SID>\SCS<InstanceNr>
ABAP                  <localdrive>\usr\sap\<SID>\ASCS<InstanceNr>
                      <shareddrive>\usr\sap\<SID>\ASCS<InstanceNr>
ABAP + Java add-in    <localdrive>\usr\sap\<SID>\ASCS<InstanceNrABAP>
                      <shareddrive>\usr\sap\<SID>\ASCS<InstanceNrABAP>
                      <localdrive>\usr\sap\<SID>\SCS<InstanceNrJava>
                      <shareddrive>\usr\sap\<SID>\SCS<InstanceNrJava>

Table 5. Junction creation arguments
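The table can be turned into concrete command lines with a small helper. The Python sketch below only builds the linkd.exe command strings and does not execute them. The function name junction_commands, the drive letters C: and S:, the SID AAA, and the instance number 00 are hypothetical example values.

```python
def junction_commands(sid, local_drive, shared_drive, ascs_nr=None, scs_nr=None):
    """Build one linkd.exe command per directory listed for the system type."""
    subdirs = ["SYS"]  # required for all system variants
    if ascs_nr is not None:
        subdirs.append(f"ASCS{ascs_nr}")  # ABAP central services instance
    if scs_nr is not None:
        subdirs.append(f"SCS{scs_nr}")  # Java central services instance
    commands = []
    for sub in subdirs:
        local = f"{local_drive}\\usr\\sap\\{sid}\\{sub}"
        shared = f"{shared_drive}\\usr\\sap\\{sid}\\{sub}"
        # Argument 1 is the local junction, argument 2 the shared drive directory.
        commands.append(f"linkd.exe {local} {shared}")
    return commands


# Example: an ABAP system AAA with ASCS instance number 00.
for command in junction_commands("AAA", "C:", "S:", ascs_nr="00"):
    print(command)
```

The same commands have to be run on every server in the cluster, because each server needs its own local junctions to the shared disks.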

As seen in the following figure, after the sample cluster installation, the cluster groups
AAA and BBB were activated on server A, and CCC on server B. All file system accesses
of the SAP instances were redirected to the respective shared disk. The
external access takes place as usual in the cluster through the cluster group virtual IP
address.
With this configuration, if Server A crashes due to a hardware failure, two things will
happen. First, the shared disks of both applications AAA and BBB will be activated on
server B. Next, the virtual IP address of cluster group AAA and BBB will be activated on
server B. By using the junctions that point from a local hard drive of a
cluster server to the shared disk, a client is able to resume its work as usual and can resolve all data.
Clients who had already been working with the SAP application CCC on server B
are not affected. The following figure shows the situation after the failover:

Figure 28. Junction configuration after failover
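The failover of the sample cluster can be traced with a short Python sketch. The function fail_over and the dictionary of cluster group owners are illustrative only and model the example with the systems AAA, BBB, and CCC described above.

```python
def fail_over(owners: dict, failed_node: str, surviving_node: str) -> dict:
    """Return the new group ownership after one node has failed."""
    return {
        group: (surviving_node if node == failed_node else node)
        for group, node in owners.items()
    }


# Ownership of the cluster groups before the failure.
owners = {"AAA": "Server A", "BBB": "Server A", "CCC": "Server B"}
after_failover = fail_over(owners, failed_node="Server A", surviving_node="Server B")
```

Only the groups that ran on the failed server change their owner. The group that was already running on the surviving server is not touched.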

Using the junction configuration, all the SCS instances of a larger SAP landscape can
now be configured as one cluster. In general, Multi-SID clusters can also protect the
database instances. Because of the varying resource requirements of databases
compared to a SAP SCS instance, the sizing could be more difficult. Therefore, a better
design would be to place the databases and the SCS instances on two different clusters.
The following figure shows this system structure:

Figure 29. Separate database and SCS instance clusters for simplified sizing

The database servers would have an additional SAP Standalone Gateway configured.
This is required as a local service for administration. Finally, each of the database
servers would also get their own SAPOSCOL service installed for performance
monitoring.
Multi-SID clusters demand a different approach during a cluster installation, but require
no changes in the application operation. As a minimum requirement, the use of SAP
SCS instances is required. For a pure Java system, it is already possible in version 6.40.
With ABAP or dual-stack systems, version 7.0 must be employed.
The installation of multi-SID cluster solutions will be described in a separate installation
guide for the NetWeaver 7.0 SR3 release. SAP note 106275 describes how SAP
supports a multi-SID cluster for the SAP AS 7.0.
The Multi-SID WSFC Installation for SAP NetWeaver 7.0 compact disc master is
available at http://service.sap.com/swdc.
Note: Access to this Web site is available only to registered SAP customers and
partners and requires a user name and password.

To access this information after logging on to the site, do the following:


• From the left menu pane, click on Download.
• From the left menu pane, select Installations and Upgrades.
• From the left menu pane, click on Entry by Application group.
• In the list that appears in the right pane, click SAP NetWeaver.
• From the right menu pane, select SAP NetWeaver.
• From the right menu pane, select SAP NetWeaver 7.0.
• From the right menu pane, select Installation and Upgrade.
• From the right menu pane, select Windows Server.
• Select MS SQL SERVER as the database, and then scroll down to the list of
downloadable objects.
The SAP Installation Guide is available at https://service.sap.com/instguides.
Note: Access to this Web site is available only to registered SAP customers and
partners and requires a user name and password.
To access this guide after logging on to the site, do the following:
• Select Installation Guide - SAP NetWeaver 7.0 SR3.
• Scroll down to the Multi-SID Installation on Windows WSFC section.
• Select Windows as the platform and SQL Server as the database type.

Multi-node cluster
Besides the previous limitation of only one SAP system configuration per cluster, there
was also a restriction in the number of cluster member nodes supported for SAP
clusters. While Windows Server 2003 could support up to eight servers and Windows
Server 2008 R2 could support up to 16 servers in a WSFC cluster, SAP only supported
two-node clusters before SAP NetWeaver 7.0 SR2. Because these limitations no longer
exist as of SAP NetWeaver 7.0 SR2, multi-node clusters are now possible. However, if
Replicated Enqueue is used, SCS must still be configured to run on two nodes.
The following figure shows a cluster with three servers. Two of the servers actively run a
SCS instance and the SAP system database while the third server is a backup in case
an error occurs on either of the first two servers. With proper sizing of the main storage
and the CPUs in the middle server, it is possible for both SAP instances to run in the
middle server at the same time.

Figure 30. A cluster with three servers

Multi-node clusters are supported with SAP NetWeaver 7.0 SR2 using the SAP
installation tool, SAPINST. The installation of additional nodes in a WSFC cluster is
described in the SAP Installation Guide.

SAP application servers


High availability through redundant hardware and automatic failover is standard when it
involves the SAP system single points of failure. The SAP system architecture also has
components, such as the SAP application server (AS), that provide redundant services
inside of a SAP system if more than one is available. In a redundant configuration, more
than one SAP AS exists in each SAP logon group.
Any user connected to a SAP AS is affected if the server fails. All pending transactions
are lost and the user is logged off of the SAP system. If there are additional
SAP application servers in the failed SAP logon group, the user can immediately log on
again. The SAP message server, which routes connections during the logon process,
provides a load balancing mechanism that tracks which servers are reachable and have
the least workload. The user is therefore reconnected to the next available server
in the group and can resume work immediately. This limits the damage to repeating
the last failed transaction, which would be necessary in a cluster solution as well.
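The selection logic described above can be illustrated with a minimal Python sketch. It is not the SAP message server implementation; the AppServer type, the load metric, and the server names are hypothetical and only show the idea of choosing the reachable server with the least workload within a logon group.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class AppServer:
    name: str         # hypothetical instance name
    reachable: bool   # whether the server currently answers
    load: float       # hypothetical workload metric, lower is better


def pick_server(logon_group: List[AppServer]) -> Optional[AppServer]:
    """Return the reachable server with the lowest load, or None if none is reachable."""
    candidates = [s for s in logon_group if s.reachable]
    return min(candidates, key=lambda s: s.load, default=None)


group = [
    AppServer("as01", reachable=True, load=0.70),
    AppServer("as02", reachable=True, load=0.35),
    AppServer("as03", reachable=True, load=0.50),
]

first = pick_server(group)    # least loaded server in the group
first.reachable = False       # simulate a failure of that server
second = pick_server(group)   # the user logs on again and lands here
```

With a single server in the group, the second call would return None, which corresponds to the case where users cannot log on again until the failed server is restored.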
In general, every SAP system in a critical business application must have more than one
SAP AS to be able to continue to work in case of a server error.

IT infrastructure protection
Applications always have a direct relationship to the servers in a data center. These
servers provide the necessary resources such as CPU, memory, and disk storage. At the
same time, applications and server operating systems are consumers of
central IT services. These services include:
• Centralized backup processes
• File and print services
• Active Directory®
• Deployment services
• Patch server
• Network services such as DNS or DHCP
Not only does the server that the application is running on need to be protected against
interruptions, but all data center services and resources that are significant to application
operations must be protected as well. This fact is especially important because the data
center central services serve all applications and could cause an interruption on a larger
scale than the failure of a single AS.
For example, during a DNS service interruption, no name can be resolved to an IP address
anywhere in the data center. The following list contains some critical services that
might require protection:
• DNS
• DHCP
• WINS
• NFS server
• Fileserver with SMB/CIFS
• Print server
• Authentication
• Time synchronization
• Backup functions
• Central monitoring service
Since there are many critical services that require protection, a detailed discussion of
these services is beyond the scope of this white paper. Additional documentation is
available at http://technet.microsoft.com.
If third-party solutions are being used, the high availability discussion should incorporate
the vendor perspective as well.

Hyper-V host cluster


In order to protect SAP installations against unplanned downtime and to help minimize
planned downtime in general, Hyper-V servers can be configured in a Hyper-V host
cluster. This implementation consists of a WSFC configuration on the Hyper-V parent. A
Hyper-V host cluster requires all VHD files for VMs to be stored on a SAN that is
accessible from all the Hyper-V servers in a specific setup.
In Windows Server 2008 R2, the WSFC was extended to manage VM failover in case of
an unplanned physical server outage. The VMs on either server in a Hyper-V host
cluster can be configured as highly available and become a cluster resource. To
implement high availability for VMs, Hyper-V host clustering configuration is
recommended. The following figure shows an example of a Hyper-V host cluster.

Figure 31. Hyper-V host cluster

Additional Hyper-V host cluster implementation information is available at:
http://technet.microsoft.com/en-us/library/cc732181(WS.10).aspx.

Planned Downtime Minimization Solutions


Planned downtime refers to the period of time that an application is not available due to
maintenance work or system upgrades. In contrast to unplanned downtime, the time
period for this work is planned in advance and affected users are typically notified.
Planned downtime is required for the following work:
• Hardware maintenance and resource extensions
• Offline backups
• Operating system and application patches
• Operating system and application upgrades
• Unicode migration
• Configuration changes that require restarts
• Deployments, transports and upgrades
• Failover or disaster recovery tests
Planned downtime is crucial for the reliable and safe operation of applications,
computer systems, and their supporting environment. By applying the recommended
fixes for known software bugs, increasing the computer system resources, or testing the
data center high availability solutions, the vulnerability of an application service to
unplanned downtime is greatly reduced. At the same time, 24x7 application
availability is becoming increasingly necessary, so the duration and frequency of
application service unavailability must be minimized. In fact, planned downtime
generally accounts for more application service unavailability than unplanned
downtime. Reducing planned downtime should therefore be part of any high availability
strategy.
When determining possible solutions for planned downtime, it is important to consider
how frequently the downtime occurs. If, for example, an offline database backup requires
downtime once a week, it is a prime candidate for a technical solution that helps to
eliminate this downtime. Many activities, such as an operating system or application
upgrade, occur only once a month or once a year. The less frequent the downtime, the
higher the probability of finding a convenient time slot where this work can be
performed. In any case, proper planning and change management are essential to
minimize and manage planned downtime.
On the technical level, there are many options available to minimize planned downtime
for dedicated tasks including the creation of backups and the operating system or
database engine patching process. In any case, the appropriate hardware architecture in
addition to proper planning is required to achieve this goal.

Planning ahead for minimizing planned downtime


One of the key components of each data center operation is the definition of the Service
Level Agreement (SLA) that defines the operational hours and application uptime. Since
applications are services that are provided by the IT department to the internal and
external users in a company, the term SLA describes the expected availability. A
productive system SLA example might appear as follows:
• Operating hours: 8:00 to 20:00 CST, 5 days per week, Monday through Friday.
• System availability:
   • 22 hours, 7 days a week
   • 99.5 percent annual availability during the defined operating hours
In the above example, the IT department would have a maintenance window of two
hours per day. They would need to take additional measures to ensure that
unplanned downtime does not exceed 0.5 percent of the uptime or 22 hours.
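The relationship between an availability percentage and the resulting downtime budget is simple arithmetic, sketched below in Python. Which hours count as the base period is an assumption here (a 52-week year of operating hours in one case, a 365-day year of 22-hour days in the other), so the results are illustrative and may differ from the figure quoted above.

```python
def downtime_budget_hours(base_hours_per_year: float, availability_pct: float) -> float:
    """Hours of downtime per year allowed by an availability percentage."""
    return base_hours_per_year * (100.0 - availability_pct) / 100.0


# Assumption: operating hours 8:00 to 20:00, Monday through Friday, 52 weeks.
operating_hours = 12 * 5 * 52          # 3120 hours per year

# Assumption: system availability window of 22 hours on each of 365 days.
availability_window = 22 * 365         # 8030 hours per year

budget_operating = downtime_budget_hours(operating_hours, 99.5)
budget_window = downtime_budget_hours(availability_window, 99.5)

print(f"Budget based on operating hours:     {budget_operating:.2f} h/year")
print(f"Budget based on availability window: {budget_window:.2f} h/year")
```

Stating the base period explicitly in the SLA avoids ambiguity about how much unplanned downtime is actually tolerated.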
Change management strategy deployment
Having a limited time for maintenance requires the IT staff to get the most out of the
available time. Typically, the work flow during any maintenance action involves creating a
backup copy of the existing state, performing the required work, and testing thoroughly
before the system is returned to production. Proper preparation is one of the key factors
for success. Changes must first be tested on test systems in order to identify the side
effects of the change. This process also generates information regarding the time
required to perform the task. An additional benefit is that the IT staff learns about
the required steps while working with the test systems. This also helps to minimize the
downtime when the same work has to be done on the productive server.
Another important task is planning ahead to have enough resources like disk storage or
main memory for the future growth of the SAP system. By providing enough resources,
the SAP system stability and quality of service improve and frequent shutdowns for the
installation of additional components can be avoided. In productive SAP systems, a
common strategy is to oversize the required hardware resources at the start date of the
productive use and maintain enough headroom for at least six months of growth. Any
further extension should also reflect this principle. Besides adding resources, there are
also strategies such as archiving that can be incorporated to minimize the storage
requirements of a SAP database.
Planning the operating system and application software maintenance is another
operational aspect. It is essential to know the software vulnerabilities and install fixes in a
timely manner. Typically, the installation of fixes needs to be coordinated and carried
out sequentially. For example, test and QA systems are updated first to work out the
installation issues. The production systems are updated only after the issues are
resolved. The number of security vulnerabilities in a system can be minimized by a
process called hardening. Hardening a SAP system means configuring the SAP system with
only the minimum platform functions that are necessary for operating the system.
Additional information about IT landscape hardening can be found by searching for the
SAP Hardening and Patch Management Guide for Windows Server white paper at:
https://www.sdn.sap.com/irj/sdn

Backup and patching solutions


As mentioned, the tasks of creating backups or the patching cycle of the operating
system and application layer are by far the most frequent reasons for planned downtime.
Fortunately, there are technical and architectural solutions for the IT landscape available
that help IT departments to avoid or minimize this aspect of planned downtime.

Snapshot backup
Backing up a large database might take a long time. The primary issue when creating a
backup is that it must be transactionally consistent in order to use it for a potential restore.
Transactional consistency means that all transactions are either finished or not
contained in the backup. SQL Server database backups are created by using the BACKUP
DATABASE command. This command first executes a checkpoint, which means that all
pages that have been changed since the last checkpoint and still reside in memory are
flushed from the database server main memory to the storage subsystem. After this
operation is complete, the database files are backed up by copying the data to another
disk or a tape device. To maintain the transactional consistency, the transaction log file
is also copied during this process. The transaction log is used to roll back or undo
transactions that were not finished at the time the backup was made.
Despite the fact that SQL Server backups are online, the backups produce an additional
load for the storage subsystem. Therefore, one usually tries to minimize the time of an
online backup. In order to minimize the time a backup will impact normal system
operation, it is possible to use the snapshot feature of SQL Server 2005/2008 R2 to
create the backup. Snapshot backups reduce unavailability of the SQL Server
2005/2008 R2 database during a backup to a couple of seconds. This is especially
useful for moderate to very large databases where availability is very important.
SQL Server snapshot backup is accomplished in cooperation with third-party hardware
or software vendors, or both. These vendors use SQL Server 2005/2008 R2 features
that are designed for this purpose. The underlying backup technology creates a point-in-
time copy of the database image that is being backed up. The instantaneous copying is
typically accomplished by splitting a mirrored set of disks or by creating a copy of a disk
block when it is written. This preserves the original. At restore time, the original is made
available immediately and the synchronization of the underlying disks occurs in the
background if necessary. This restores operations almost instantaneously.
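The copy-on-write variant mentioned above can be illustrated with a toy Python model. It is a sketch of the general technique only, not of any vendor's storage implementation; the Volume class and its methods are hypothetical. Creating a snapshot copies no data, and an original block is preserved only when it is first overwritten.

```python
class Volume:
    """Toy block device with copy-on-write snapshots (illustrative only)."""

    def __init__(self, blocks):
        self.blocks = list(blocks)
        self.snapshots = []  # one dict per snapshot: block index -> original content

    def create_snapshot(self) -> int:
        # Nothing is copied at creation time, so this is nearly instantaneous.
        self.snapshots.append({})
        return len(self.snapshots) - 1

    def write(self, index, data):
        # Before overwriting, preserve the original block in every snapshot
        # that has not saved this block yet.
        for saved in self.snapshots:
            saved.setdefault(index, self.blocks[index])
        self.blocks[index] = data

    def read_snapshot(self, snap_id, index):
        # Unchanged blocks are read from the live volume.
        return self.snapshots[snap_id].get(index, self.blocks[index])

    def restore(self, snap_id):
        # Put back the preserved originals and drop snapshots taken later.
        for index, original in self.snapshots[snap_id].items():
            self.blocks[index] = original
        del self.snapshots[snap_id + 1:]


vol = Volume(["a0", "b0", "c0"])
snap = vol.create_snapshot()
vol.write(1, "b1")
vol.write(1, "b2")
view = [vol.read_snapshot(snap, i) for i in range(3)]  # point-in-time image
live_before_restore = list(vol.blocks)
vol.restore(snap)
```

Only the blocks that changed after the snapshot have to be written back during a restore, which is why the operation can complete quickly in this model.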
The following figure shows an example of snapshot technology with a NetApp FAS storage
system and the NetApp SnapManager for Microsoft SQL Server and SnapDrive for
Windows solution. In this example, the time required for backups and restores can be
reduced to seconds by using SnapManager.

Figure 32. Snapshot backup configuration

Detailed information about the SQL Server 2005/2008 R2 Snapshot Backup feature is
available at http://msdn.microsoft.com/en-us/library/ms189548.aspx.

Optimized server maintenance system architecture


Through a suitable system design, the downtime due to patch installation or hardware
component installation can be significantly reduced. The concept here follows the
principle of a rolling upgrade where single components in a server landscape can be
isolated and stopped while other components continue to run the application service.
Server and operating system maintenance
SAP system users are typically affected during the process of SAP component patching.
However, the SAP system can be specially configured to enable Windows operating system
patching during operation without affecting users.
The basic principle is to have sufficient system redundancy in order to take out a portion
of the running SAP system and still be able to continue the operation. This is achieved
by having groups of AS instances configured in logon groups. SAP users will be
assigned to the AS instances depending on the user profile. For example, all human
resource users would be in one group while all financial users would reside in another.
Should one of the instances in the group need to have a patch installed, this server can
be removed from its logon group and no more users will get assigned to this server when
logging on.
This is done in the transaction SMLG that manages the SAP logon groups. Similarly, the
server has to be removed from the group of batch servers (Transaction SM61), update
server group (Transaction SM14) and RFC server group (Transaction RZ12). In case the
server is used as a spool server, an alternate spool and logical print server need to be
activated (Transaction SPAD). Another prerequisite would be to define the parameter
rdisp/gui_auto_logout in the instance profile of this server (Transaction RZ10). With
this parameter set, all inactive users are automatically logged off of this instance after
the specified amount of time has expired. With this configuration, the instance is made
idle quickly and can be shut down.
It is required that all the remaining SAP AS instances in a logon group are able to handle
the workload sufficiently. In larger systems, it might be appropriate to prepare one universal
AS instance that can join several groups. This can be achieved by installing several SAP
AS instances on a server and starting the respective instance when needed. Such a standby
AS could be used temporarily to maintain the transactional performance of a SAP
system. The following figure shows the setup of this landscape:

Figure 33. Hot-standby server for AS maintenance

There are still the central elements of the SAP system including the servers for the SCS
instance and database. If both components are in a WSFC cluster, a server can be
isolated using a planned failover to the respective standby server. This switch can
actually happen at a convenient time with little effort. The empty server can
subsequently be patched and restored to operation.
SQL Server instance maintenance
Expanding on the previous maintenance concept by using SQL Server Database
Mirroring adds the option to patch the database engine installed on a server while the
SAP system continues to work. The basic principle is to switch the database to the mirror
copy when a patch needs to be installed on the database engine of the original server.
After successful installation, the database is switched back and the same process is
executed on the mirror side as well. See the following figure for an example of this:

Figure 34. Database mirroring for SQL Server instance maintenance

More information on SQL Server database mirroring can be found in the Disaster
Recovery Solutions section.
An example of a patch cycle for Windows and SQL Server patches by using the above
concept could look like:
Patch schedule:
• Windows and SQL Server patches applied monthly (if applicable)
• SAP service packs applied during quarterly release
Patch sequence:
• Patch first in sandbox and test. Only place in production after a few days of
successful testing.
Patch process:
• If no reboot is required, apply the patches and the patch process is finished.
• If a reboot is required, perform the following steps on each AS:
   • Isolate the Dialog/Batch server from:
      • Logon group
      • RFC group
      • Update group
      • Batch group
      • Spool (or have a redundant spool server)
   • Drain connections, patch, reboot, then add the server back into the respective
     groups and proceed to the next server.
   • If required, take the temporary AS into the respective group.
• Perform the following steps on the mirrored database servers:
   • Suspend mirroring, patch and reboot the secondary server, resynchronize, fail over
     to the secondary, and then patch and reboot the primary server. A short SAP
     downtime during failover has to be planned.
   • There is no need to fail back.
• Perform the following steps on the SAP central instance server:
   • Relocate the SAP central instance in the WSFC cluster to the database
     server.
   • Patch and reboot the inactive node.
   • Fail over the database and CI, and then patch and reboot the other node. A short
     SAP downtime during failover has to be planned.
   • Distribute the database and central instance on the two nodes as before for
     better performance.
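The application server part of this procedure follows a rolling pattern that can be sketched in Python. The sketch below is a simplified model under stated assumptions: group memberships are plain sets, the patch callback stands in for draining connections, patching, and rebooting, and none of the names correspond to SAP transactions or APIs.

```python
from typing import Callable, Dict, List, Set


def rolling_patch(servers: List[str],
                  groups: Dict[str, Set[str]],
                  patch: Callable[[str], None]) -> None:
    """Patch servers one at a time while the remaining group members keep working."""
    for server in servers:
        # Remember which groups (logon, RFC, update, batch, spool) the server is in.
        memberships = [kind for kind, members in groups.items() if server in members]

        # Never isolate the last member of a group; a temporary AS is needed first.
        for kind in memberships:
            if len(groups[kind]) < 2:
                raise RuntimeError(f"{server} is the last member of the {kind} group")

        # Isolate the server so that no new work is assigned to it.
        for kind in memberships:
            groups[kind].discard(server)

        # Hypothetical callback: drain connections, patch, and reboot.
        patch(server)

        # Add the server back into its groups before proceeding to the next one.
        for kind in memberships:
            groups[kind].add(server)


groups = {
    "logon": {"as01", "as02"},
    "rfc": {"as01", "as02"},
    "batch": {"as01", "as02"},
}
patched: List[str] = []


def patch_and_reboot(server: str) -> None:
    # At this point the server must not be a member of any group.
    assert all(server not in members for members in groups.values())
    patched.append(server)


rolling_patch(["as01", "as02"], groups, patch_and_reboot)
```

The guard against isolating the last group member reflects the requirement that the remaining instances must be able to carry the workload while one server is being patched.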

SAP application planned downtime reduction


SAP also works on measures to reduce planned downtime by providing enhancements
on the application layer. For example, with R/3, any SAP profile parameter
change previously required a SAP instance stop and restart in order to activate the
change, but today, most parameters can be changed online.
There are still issues with the installation of SAP support packs. For example, if the
executables of a SAP kernel have to be patched, the SAP instance also needs to be
restarted. SAP is currently working on a rolling kernel upgrade procedure to avoid
downtime during the exchange of kernel executables.
There are many less frequent tasks such as application upgrades or migrations where
SAP can optimize the required downtime. See the following section for more information.
Additional SAP application planned downtime information can be found in the following
resources:
Strategies to avoid or minimize planned downtime on the SAP application level are
described in the High Availability for mySAP.com Solutions white paper available at:
https://www.sdn.sap.com/irj/sdn/go/portal/prtroot/docs/library/uuid/8f87a790-0201-0010-558e-bcf2096ff33b
A collection of planned downtime information regarding SAP upgrades, database
reorganization, and database backups is available at http://help.sap.com.
To access this information, do the following:
• From the left menu pane, click SAP NetWeaver.
• Choose English under SAP NetWeaver 7.0 Library.
• Search for Planned and Unplanned Downtime.
Information about SAP upgrades can be found at https://service.sap.com.
Note: Access to this Web site is available only to registered SAP customers and partners
and requires a user name and password.
After logging on to the SAP support portal, click the Quick Links menu and search for
/upgrade.

More information on SAP upgrades can be found on SDN by searching for upgrade at:
https://www.sdn.sap.com/irj/sdn
The SAP Service Marketplace has the following related notes available at:
https://service.sap.com/notes
Note: Access to this Web site is available only to registered SAP customers and
partners and requires a user name and password.
• SAP note 139513: "Merge transports for high availability systems"
• SAP note 361735: "Inactive import of reports"

Hyper-V Live Migration


While a Hyper-V host cluster can manage a VM failover from one physical server to
another, this process always involves an interruption of the application inside of the VM.
However, if the migration of a VM is planned, the downtime caused by the relocation of
the VM can be avoided. The following figure illustrates the Live Migration process.

Figure 35. Live Migration


The Hyper-V host cluster can perform a Live Migration of the VMs without application
service downtime. The following example describes this process in more detail.
Assume that a Hyper-V host cluster is configured for Live Migration and that a VM running
a SAP application is actively used by clients. At some point, the administrator must
migrate this VM to another server in the Hyper-V host cluster because the server that the
VM resides on must go into maintenance.
Initially, while the VM is still actively used on the primary server, an empty VM is created
on the second server and the memory image of this VM is copied to the second server. If
the memory pages on the primary server are changed during this process, Hyper-V
detects this and copies those pages again. Eventually, the number of pages that are
different between the two servers is significantly reduced.
When the difference is small enough, Hyper-V pauses the VM on the primary server and
copies the last set of changed pages to the new server. Subsequently, the client access
is re-routed to the new server and the VM on the primary server can be deleted. Since
the final state transfer happens very quickly and no TCP timeout occurs, the client does
not recognize this transfer.
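The iterative copy phase described above can be approximated with a toy Python model. This is not the Hyper-V algorithm; the fixed dirty percentage, the pause threshold, and the round limit are assumptions chosen only to show why the set of changed pages shrinks from round to round.

```python
def simulate_pre_copy(total_pages: int, dirty_percent: int,
                      pause_threshold: int, max_rounds: int = 20):
    """Toy model of iterative memory pre-copy.

    Returns (rounds, final_pages): the number of copy rounds performed while
    the VM keeps running, and the number of pages left to copy while the VM
    is paused.
    """
    pending = total_pages  # the first round transfers the complete memory image
    rounds = 0
    while pending > pause_threshold and rounds < max_rounds:
        rounds += 1
        # Assumption: while `pending` pages are being transferred, the running
        # VM changes dirty_percent of that amount, which must be sent again.
        pending = pending * dirty_percent // 100
    return rounds, pending


rounds, final_pages = simulate_pre_copy(
    total_pages=1_000_000, dirty_percent=10, pause_threshold=1_000)
print(f"{rounds} rounds while running, {final_pages} pages copied during the pause")
```

In this model, a VM that changes memory as fast as it can be copied never converges, which is why the sketch caps the number of rounds before the final pause.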
Note: Live Migration does not work for unplanned downtime.
In the case of a server failure, the VMs will fail over using failover clusters. The Live
Migration process must be planned and requires an active primary system for the
duration of the migration.
Since the Hyper-V host cluster and Live Migration use the same setup, this solution is an
extension of the high availability solution with WSFC. Live Migration provides the
capabilities for minimizing planned downtime in a virtual environment. These capabilities
are not available for applications that must be installed directly on the physical server.
More details on how to set up a Live Migration cluster are available in the following
documents:
Windows Server® 2008 R2 Hyper-V™ Live Migration white paper available at:
http://download.microsoft.com/download/C/C/7/CC7A50C9-7C10-4D70-A427-
C572DA006E61/LiveMigrationWhitepaper.xps.
Best Practices for SAP on Hyper-V white paper available at:
http://www.microsoft.com/virtualization/en/us/solution-business-apps.aspx.
Hyper-V: Live Migration Network Configuration Guide available at:
http://technet.microsoft.com/en-us/library/ff428137(WS.10).aspx.

Data Inconsistency Protection Solutions


Logical errors always affect business data. The loss of critical hardware
resources can almost always be detected immediately and addressed
through immediate and automatic actions, such as an application failover
inside a cluster. The issue with logical errors is somewhat more complicated. Typically,
data inconsistencies are only discovered when the data is accessed. If problems are
discovered at this point, typically the only resolution is to restore the last backup copy of
the undamaged data.
The extent and the duration of this rescue operation, as well as the effect on the running
productive operation, are totally dependent on the extent of the damage and the data
relevance for the operation. In worst case scenarios, however, data inconsistencies can
cause production system downtime while complete restoration of consistent data is
performed. Additionally, if a restore to a consistent point in time is required, work that
has been performed after the last backup was created might get lost.

Logical error reasons


There are many reasons for inconsistencies. They might be caused by technical
problems or hardware errors where the data is unintentionally overwritten. For example,
storage adapter problems are a typical reason for data corruption. Another reason could
be a programming error in an ABAP report or Java program that accidentally changes or
deletes the data in the database. Errors can also happen quite often through faulty
human operation such as accidental data deletion. Finally, there is always a probability
for data damage through sabotage or viruses that should not be overlooked.
The following figure shows the reasons for computer system downtime and their relative
importance — source ZDNet, October 2002:

Figure 36. Computer downtime reasons


Database data inconsistencies


Data consistency in the SAP system central data repository is one of the most
fundamental requirements of a stable SAP operation. After all, data only exists once in
an application. As we have discussed in the previous section, there are numerous
triggers that can cause data corruption at any given time during operation. What is
especially difficult in this class of problems is the early recognition of an error condition
before it causes damage.
Database consistency checks do not typically take place during normal operation, or at
least during normal workload. A procedure to perform such a consistency check with a
SQL Server database is discussed in the Data Inconsistency Protection Solutions
section in this white paper. Other database vendors have developed similar procedures.
These procedures typically can be found in the respective database vendor’s
documentation.
Sabotage and accidental data deletion
Accidental data deletion in the database or on the file system level can have very
serious consequences for application operations. It can cause a serious disruption to the
normal course of operations, or even bring them to a total standstill. In order to avoid
such problems, one must try to address IT operation security aspects through a concept
of authorization in which only designated persons have permissions to work in their area
of expertise. Of course, especially in the group of administrators, there is overlapping
authorization needed for their daily work. Indeed, it appears that accidental deletion of
data is not a rare problem. From past experiences with large data center operations of
SAP infrastructures, there are many reports about SAP application downtime due to
deleted data or tables in the database.
Once the damage has occurred, the only way to recover is to restore the missing data
from the last backup. This can be challenging since in most cases only the missing data
needs to be restored. If for example only a single table is affected, it does not make
sense to restore the complete database from the last backup. This procedure would
cause the data in all other tables not affected by the issue to be set to the same
backward state as the affected table. Therefore, restoring these tables would potentially
cause more data to be lost. Databases with snapshot technology can be especially
helpful to minimize the downtime duration.
A snapshot is a transactionally consistent point in time after which all changes are directed
to a different physical location. In other words, it is an exact image of the database
at the point of its generation and can be put into operation without recovery. During the
system recovery process, it is possible to export tables from the snapshot and import
them into the active system. Alternatively, in case of a very serious problem, the complete
state of a snapshot could be restored in the active database. However, under
these circumstances, changes in other tables might be lost too.
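The difference between restoring a single table and restoring the complete database can be shown with a small Python sketch. SQLite from the Python standard library is used here purely as a stand-in for the productive database, and the attached in-memory database named snap plays the role of the snapshot; the table names and rows are hypothetical.

```python
import sqlite3

# Active database with two tables (hypothetical sample data).
active = sqlite3.connect(":memory:")
active.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, item TEXT);
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    INSERT INTO orders VALUES (1, 'pump'), (2, 'valve');
    INSERT INTO customers VALUES (1, 'Contoso');
""")

# Stand-in for a snapshot: a point-in-time copy in a second, attached database.
active.execute("ATTACH DATABASE ':memory:' AS snap")
active.executescript("""
    CREATE TABLE snap.orders AS SELECT * FROM main.orders;
    CREATE TABLE snap.customers AS SELECT * FROM main.customers;
""")

# Normal work continues after the snapshot, then one table is emptied by accident.
active.execute("INSERT INTO customers VALUES (2, 'Fabrikam')")
active.execute("DELETE FROM orders")
active.commit()

# Restore only the damaged table from the snapshot; other tables stay untouched.
with active:
    active.execute("DELETE FROM main.orders")
    active.execute("INSERT INTO main.orders SELECT * FROM snap.orders")

orders = active.execute("SELECT id, item FROM orders ORDER BY id").fetchall()
customers = active.execute("SELECT name FROM customers ORDER BY id").fetchall()
```

The customer row added after the snapshot survives because only the orders table is copied back. Restoring the whole snapshot instead would also discard that row, which is the loss described above.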
Data loss through viruses and worms
Viruses, worms and Trojans are an unfortunate reality that every business must address.
Malicious code that tries to secretly enter computer systems might start activities which
range from espionage to data erasure. This attack is not necessarily external. For
example, employees who work with their laptop at home or in a hotel might bring an
unwanted guest back when they plug their computer into the company network. Another
reason that viruses might appear is because infected personal software is installed on a
PC attached to the company network.
Security measures that are taken to minimize data loss or espionage of confidential
information are significant. The measures taken include the implementation of firewalls,
virus scanners, and surveillance tools, as well as employee policies. Optimal security
requires that appropriate measures are taken on all levels of the IT operation including:
• Virus scanners on the computer level.
• Demilitarized zone for outbound communication.
• Firewalls at the network level.
• Well-developed authentication and audit procedures at the application level.
Security measures also include operational tasks, such as the timely installation of
security patches to close any possible gaps immediately after such vulnerabilities have
been published. A detailed discussion of the threats, as well as possible concepts and
measures are outside the scope of this white paper. The Microsoft TechNet library
provides detailed information about Microsoft products and technology for IT
professionals. The Microsoft TechNet library can be found at:
http://technet.microsoft.com/en-us/default.aspx

Backup and recovery


Backup of all application data, configuration information, operating system installation,
and file systems is one of the most critical tasks in the daily operation of data centers
and the first measure for protecting against the loss of critical data. The main purpose of
a backup is to maintain a copy of the critical data and configurations that enables swift
application service restoration in the case of a severe error that damages this data or the
runtime environment. The basic backup hierarchy is shown in the following figure:

Figure 37. Basic backup hierarchy

Appropriate backups can be used to restore individual files in case a single file has been
deleted. However, these backups can also help to recover a complete system if a severe
failure happens. Even in the case where a disaster has destroyed the original computer
systems, backups can be applied to a second computer and operation can be restored. A
backup and restore strategy is the last step in a recovery from an unforeseen event that
will return a database to some predefined point, most likely to the last completed
transaction prior to the failure.
All aspects of the backup and restore strategy should be well documented and reviewed
regularly. Most importantly, they should be tested regularly to ensure that the data and
the media for backups are valid and that the processes work as expected.
Database backup strategies
The backup and restore components provided with Microsoft SQL Server 2000 and later
enable the administrator to reproduce a database to an exact replica of the original
database at any point in the database history from the time an appropriate backup
strategy was implemented. There are several backup types available:
• Full backup: Makes a complete backup of the database, up to the last transaction
completed during the backup process.
• Differential backup: Makes a copy of the database pages changed since the last
full backup. It is a useful backup mechanism to back up a database without
consuming as many resources as a full backup. In a restore operation, this is used in
conjunction with a full backup.
• Transaction log backup: Makes a backup of all the completed transactions that
have taken place since the log was last backed up. A transaction log backup is used
in conjunction with a full backup, and potentially differential backups, to enable an
administrator to restore a database to a specific point in time or to the last completed
transaction that was backed up.
• File backup: When a database consists of multiple files, each file can be backed up
individually. This provides an accelerated backup process as well as a faster restore
process. File backups are used in conjunction with transaction log backups.
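The way these backup types combine in a restore can be illustrated with a small sketch. It is not SQL Server code; the `Backup` record and the `restore_chain` function are hypothetical names introduced here, times are plain minutes, and a log backup finished at time t is assumed to cover all transactions up to t. Under these assumptions, the sketch selects the last full backup before the target time, at most one differential backup, and the transaction log backups that follow.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Backup:
    kind: str      # "full", "diff", or "log"
    finished: int  # completion time in minutes (illustrative unit)


def restore_chain(history: List[Backup], target: int) -> List[Backup]:
    """Return the backups to restore, in order, to reach time `target`."""
    history = sorted(history, key=lambda b: b.finished)
    fulls = [b for b in history if b.kind == "full" and b.finished <= target]
    if not fulls:
        raise ValueError("no full backup exists before the target time")
    base = fulls[-1]
    chain = [base]
    # At most one differential is needed: the newest one taken after `base`.
    diffs = [b for b in history
             if b.kind == "diff" and base.finished < b.finished <= target]
    start = base
    if diffs:
        start = diffs[-1]
        chain.append(start)
    # Add every later log backup, up to and including the first one that
    # finishes at or after the target, because that one covers the target.
    for b in history:
        if b.kind == "log" and b.finished > start.finished:
            chain.append(b)
            if b.finished >= target:
                break
    return chain
```

For a history of a full backup at 0, a log backup at 60, a differential at 120, and log backups at 180 and 240, a restore to minute 200 uses the full backup, the differential, and the last two log backups. The log backup at 60 is skipped because the differential already contains its changes.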
Additional backup solution information is available in the following resources:
• Step-by-Step Guide for Windows Server Backup in Windows Server white paper
available at http://technet.microsoft.com/de-de/default.aspx.
Note: This document can be found by searching for the title.
• SAP Backup and Restore Information MS SQL Server help documentation available
at http://help.sap.com.

To access this information, do the following:
▪ From the left menu pane, click SAP NetWeaver.
▪ Choose English under SAP NetWeaver 7.0 Library.
▪ Choose Technical Operations Manual.
▪ Choose General Administration Tasks.
▪ Choose Database Administrations.
▪ Follow the MS SQL Server link.
▪ Follow the Monitor for Backup and Restore Information link.
• Blog entry: How does Microsoft perform backups in their SAP system landscape
available at:
http://blogs.msdn.com/saponsqlserver/archive/2008/03/28/how-does-microsoft-perform-backups-in-their-sap-system-landscape.aspx

Database log shipping


Microsoft SQL Server log shipping is a technology that has been available since the
release of SQL Server 2000. The basic concept is also described in the Disaster
Recovery Solutions section for the purpose of maintaining a database for disaster
recovery. While the idea of disaster recovery is to have a second remote database copy
to continue operations in case of a catastrophic event, log shipping is also used to
maintain a database copy in the event of a severe logical error or inconsistency.
Log shipping is essentially a fully automated, continuous backup of the
transaction logs of an active database to a remote computer and the application of these
backups to a standby database. The time interval at which the transaction log backups
are applied to the standby system is a configurable parameter. If there is a safe delay
between the time a transaction is performed at the productive database and the time the
transaction log backup is applied at the standby database, the application process to the
standby database can be stopped when a problem occurs at the productive system.
The standby database is then kept in a consistent and usable state up to the point
where the error occurred. It can be used to continue productive work: the
standby database only needs to be activated and users diverted to the new server.
Alternatively, the data can be extracted from the standby database and applied to the primary
database in order to compensate for a user error. In practical implementations, the delay
time between a standby and a productive system should not be less than one hour.
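The effect of the apply delay can be sketched in a few lines. This is an illustration of the idea only, not the SQL Server log shipping implementation: the function name `logs_to_apply` and its parameters are hypothetical, all times are in minutes, and a log backup taken at time t is assumed to contain changes up to t.

```python
from typing import List, Optional


def logs_to_apply(log_backup_times: List[int], now: int,
                  delay: int = 60,
                  error_time: Optional[int] = None) -> List[int]:
    """Pick the transaction log backups the standby may restore at `now`."""
    eligible = []
    for t in sorted(log_backup_times):
        if t > now - delay:
            break  # still inside the safety delay, so hold it back
        if error_time is not None and t >= error_time:
            break  # may contain the faulty change, so do not apply it
        eligible.append(t)
    return eligible
```

With log backups every 15 minutes from minute 0 to minute 90 and a delay of 60 minutes, the standby at minute 120 has applied the backups up to minute 60. If an error is known to have happened at minute 50, only the backups up to minute 45 are applied, which leaves the standby in a state from before the error.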
SQL Server log shipping provides a very efficient data replication method and protects
against downtime caused by logical errors or inconsistencies. The principal functional
setup is shown in the following figure. SAP transactions are executed at the primary
database server. Every transaction is recorded in the local transaction log that is copied
to local disk at the primary system (1). After the log has been created on the primary
side, it is copied over the network to the secondary database server (2). The secondary
database server then reads the transaction log, incorporates it into its own log buffer,
and applies it to the standby database (3). During application of the transaction log, all
recorded transactions in the log are executed on the standby server.

Figure 38. SQL Server log shipping

For more information about SQL Server log shipping, please refer to the Disaster
Recovery Solutions section.

Snapshots
A snapshot is an image of information that has been frozen at a certain point in time.
The snapshot delivers an accurate picture of the information at a precisely defined
point in time. Snapshots are typically taken of fast-changing data, such as a file system
or a database. Technically, snapshots typically use the copy-on-write principle. In this
principle, all data on a storage medium is represented as a set of data blocks. Access to
data blocks is provided by pointers. Each block has an individual pointer that
describes where this block resides on the medium.
A snapshot first takes all the pointers at a certain point in time and saves them. Any time
a data block is changed, the data block is first copied into a snapshot file and the system
uses a new pointer for the changed block. By copying only the pointers to data initially
and copying data blocks only if changes occur, snapshots are very fast and require
relatively little disk space. However, copying changed blocks to the snapshot has some
performance impact.
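The copy-on-write principle can be illustrated with a toy model. The `Volume` class below is a hypothetical sketch, not a storage product interface. Instead of real block pointers, it keeps one dictionary per snapshot that holds the preserved old blocks, which is an assumption made to keep the example short.

```python
class Volume:
    """Toy block device with copy-on-write snapshots."""

    def __init__(self, blocks):
        self.blocks = list(blocks)  # live data
        self.snapshots = []         # one dict per snapshot: index -> old block

    def snapshot(self) -> int:
        # Creating a snapshot copies no data; it only opens an empty store.
        self.snapshots.append({})
        return len(self.snapshots) - 1

    def write(self, index: int, data) -> None:
        # Preserve the old block once per snapshot before overwriting it.
        for saved in self.snapshots:
            saved.setdefault(index, self.blocks[index])
        self.blocks[index] = data

    def read_snapshot(self, snap_id: int):
        # Unchanged blocks are read from the live data, changed ones
        # from the preserved copies.
        saved = self.snapshots[snap_id]
        return [saved.get(i, live) for i, live in enumerate(self.blocks)]
```

Creating a snapshot is cheap because nothing is copied. The cost appears on the first write to each block afterwards, which matches the performance impact described above.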
Database snapshots with SQL Server 2005/2008 R2
With SQL Server 2005/2008 R2 Enterprise Edition, database snapshots can be created.
A database snapshot is a read-only, transactionally consistent view of a database.
Transactionally consistent means that only those transactions that have been finished
by a commit work statement are included in the snapshot. Snapshots can be generated
automatically, at any point in time, and can also be used for read access during daily
operations, such as report generation. The snapshot copy of the database can be
queried by client applications and, if the original database becomes
damaged or unusable, the database can be reverted to the state it was in when the
snapshot was created.
Since every new snapshot requires storage space, it is recommended that older
snapshots are deleted after a certain time. The optimal retention time
depends on the individual requirements, but retaining snapshots for one or two
days is typically sufficient.

Additional information about SQL Server 2005/2008 R2 database snapshots is available by
searching for snapshots in the SQL Server 2005 Books Online at:
http://msdn.microsoft.com/en-us/library/ms130214.aspx.
Further information can be found on the SQLCAT Web site at:
http://sqlcat.com/whitepapers/archive/2008/02/11/database-snapshot-performance-considerations-under-i-o-intensive-workloads.aspx.
Snapshots with storage solutions
As we have seen, the snapshot feature of SQL Server 2005 provides an opportunity to
create a transactionally consistent snapshot on the database level. In addition, there is
another snapshot feature on the storage level. This kind of snapshot works on the
physical storage layer and is not restricted to certain data types like database data. From
the technical perspective, a snapshot is represented by pointers to the blocks of data on
the storage volume. During a write to a specific data block, this block is copied into a
snapshot file and the pointer in use by the system points to the new block while a
copy of the old pointer is maintained. To go back to the time
when the snapshot was created, the old pointer is re-activated. In other words,
the snapshot is reverted. Another possibility is to delete the snapshot, which
merges all copied blocks with the original data so that the recorded
changes become the new baseline.
In order to recover a system after a severe error, volume snapshot copies can be
activated as read-only and any damaged data can be extracted from this copy. Since the
creation of the snapshot during normal operation might be done online and without
severe performance impact on the system, it is possible to maintain several snapshot
copies per day. This process can be automated and might run without human
intervention.
A very important aspect of snapshots is scheduled backup execution. Snapshots do not
replace backups. Since snapshot copies of the data can reside on the same physical
media as the original data, it is possible to lose this data if the physical volume becomes
defective. Snapshots are a great and efficient way to maintain a consistent state in time
that can be reverted if needed.
Hyper-V snapshots for virtual machines
With Hyper-V snapshots, administrators can capture point-in-time images of a VM that can be
accessed at any later stage. Since a consistent VM state can be created by using snapshots,
this feature can be used before any critical VM change such as applying patches,
changing configurations, or upgrading applications. If any of these steps fails, the VM can
be easily reverted to the previous state.
VM snapshots can be created inside the Hyper-V administration GUI or by using System
Center Virtual Machine Manager. Hyper-V enables users to create a hierarchy of
snapshots. When a snapshot is created, the existing VM VHD file is frozen in its current
state and any change that occurs inside a VM after a snapshot is created is transferred
to a new file that is called an AVHD file. If another snapshot is created after the first one,
the first AVHD file is frozen and any change made afterward is transferred to another
AVHD file.

Database consistency checks


Microsoft SQL Server provides the DBCC CHECKDB command to check data
consistency and repair detected inconsistencies. Since the execution of
DBCC CHECKDB can impact the transactional performance of an SAP system, such
checks should take place during low volume times or in a maintenance window. As a rule
of thumb, the examination of a 70 GB SAP database on a four
processor computer takes approximately one hour as a single-threaded task, that is, with the
SQL Server instance parameter 'max degree of parallelism' set to 1.
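The rule of thumb can be turned into a rough planning aid. The following sketch assumes that the runtime scales linearly with the database size, which is a simplification and not a statement from SQL Server documentation. The function names are hypothetical.

```python
def estimate_checkdb_hours(db_size_gb: float,
                           gb_per_hour: float = 70.0) -> float:
    """Linear extrapolation of the 70 GB per hour rule of thumb."""
    if db_size_gb < 0 or gb_per_hour <= 0:
        raise ValueError("size must not be negative and rate must be positive")
    return db_size_gb / gb_per_hour


def fits_window(db_size_gb: float, window_hours: float) -> bool:
    """Check whether the estimated runtime fits a maintenance window."""
    return estimate_checkdb_hours(db_size_gb) <= window_hours
```

Under this assumption, a 500 GB database needs a little over seven hours and still fits an eight-hour window, while a 1 TB database needs more than 14 hours. Actual runtimes depend on hardware, I/O throughput, and the parallelism setting, so the estimate should be validated with a test run.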
Database consistency checks examine the entire database contents for inconsistencies.
All table and index data is checked and verified as readable. In order to automate this
task, consistency checks can be scheduled in the DBA planning calendar to be executed
when the overall transactional load is low such as during the weekend. In the following
figure, the DBA Planning Calendar screen is shown.

Figure 39. DBA Planning Calendar screen

It is a good strategy to run the DBCC check outside downtime and with the normal
system workload during low volume operation such as over the weekend. Based on the
SQL instance configuration, the DBCC check command only requires one CPU core.
Therefore, with a modern multi-core server, there are still enough cores available to
maintain the SAP operation.

Large database consistency


With very large databases over a terabyte in size, consistency checks using the DBCC
CHECKDB command in a maintenance window or during low volume times might impact
production for too long. To be able to regularly examine the database consistency for
such systems, one can choose another approach.
It is possible, for example, to restore the last backup of a productive database on a test
system periodically and then carry out the consistency check on this server. The
advantage of this approach is that the command runtime does not negatively affect the
productive SAP system performance anymore since it is executed on the test system.
A second advantage would be the fact that the test system gets frequently updated with
the latest productive data. Finally, restoring the last backup of the productive SAP
database to a test system is a very efficient check of whether the backup itself is usable.
Discovering that backups are unreadable or do not contain the right data only when the
backup is actually needed is a real disaster.
The process of restoring a backup also helps the administrators to train in the procedure.
Proper skills in the backup and restore process are very helpful if needed in a real
emergency. The training advantage should not be underestimated. SAP and Microsoft
support deal with dozens of cases every single year where database backups prove not
to be restorable due to mechanical issues such as tape errors, human error, or simply
lost tapes.
Additional data consistency check information is available in the following locations:
• SAP note 142731: DBCC Checks of SQL Server available at:
https://service.sap.com/notes
Note: Access to this Web site is available only to registered SAP customers and
partners and requires a user name and password.
• SAP Help also has information about database server checks at http://help.sap.com.
To access this information, do the following:
▪ From the left menu pane, click SAP NetWeaver.
▪ Under SAP NetWeaver 7.0 Library, choose English.
▪ Choose Technical Operations Manual.
▪ Choose General Administration.
▪ Choose Database Administration.
▪ Follow the MS SQL Server link.
▪ Under Periodic Tasks, follow the Database Server Checks link.
• SQL Server 2005 books online:
http://msdn.microsoft.com/en-us/library/ms130214.aspx
Note: Search for DBCC CHECKDB.

Disaster Recovery Solutions


Disasters are events that significantly impact the local computer system that hosts an
application as well as the entire IT environment. Examples include earthquakes, fire,
floods, and acts of terrorism. Consequently, damages impact not only the data center
hardware, but also impact infrastructures such as air conditioning, the power grid, or
communication lines.
It is clear that any instance of IT systems that exists physically only on one site assumes
the risk of long outages if such an event happens. Even worse, if storage subsystems
are damaged and backup copies reside in the same geographic neighborhood, there is a
possibility of not only losing the application service, but the data as well. The
consequences of such an event would be catastrophic to the enterprise.
The only protection against events of this scale is a geographic distribution of the IT systems
that host the applications over a long distance, with a redundant copy of the
essential data maintained at each site. This concept needs to be supplemented with automated
failover solutions for the quick recovery of failed applications, proper planning, and a well
prepared and educated staff.
Disaster recovery consists of the measures taken to recover from a catastrophic event.
During this phase, the complete infrastructure including the hardware is unusable. From
the perspective of business continuity, there are two important definitions that help to
define the optimal solution:
• Recovery Time Objective (RTO): Describes the maximum time interval until the
application service has to be available again. This time is measured between the
time the event occurred and the time the application service is usable again. In
practical implementations, it might range from minutes to several hours and has a
direct impact on the data replication technology and network bandwidth requirements
between two sites.
• Recovery Point Objective (RPO): Describes how much data might be lost in case a
catastrophic failure occurs. The amount of data is expressed as time going back to
the last transactionally and functionally consistent state. In the optimal case, an RPO of
zero requires synchronous replication of every transaction between two
sites. This definition again has consequences for the technology used and the network
bandwidth.
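How an RPO translates into a replication setup can be sketched as follows. The model is an assumption for illustration: with periodic log transfer, the worst-case data loss is taken as the log backup interval plus the time to copy a backup to the remote site. The function names are hypothetical.

```python
def replication_mode(rpo_minutes: float) -> str:
    """An RPO of zero can only be met by synchronous replication."""
    return "synchronous" if rpo_minutes == 0 else "asynchronous"


def worst_case_loss(interval_minutes: float, copy_minutes: float) -> float:
    """Assumed worst case: the disaster strikes just before the next
    backup arrives, so one interval plus one copy time is lost."""
    return interval_minutes + copy_minutes


def meets_rpo(rpo_minutes: float, interval_minutes: float,
              copy_minutes: float) -> bool:
    if rpo_minutes == 0:
        return False  # periodic log transfer can never reach zero loss
    return worst_case_loss(interval_minutes, copy_minutes) <= rpo_minutes
```

With a 15-minute log backup interval and a 5-minute copy time, an RPO of 30 minutes is met, but an RPO of 15 minutes is not. In that case the interval must be shortened or the network bandwidth increased.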
Besides these two considerations from the business process perspective, there are two
more prerequisites that influence the decision for a disaster recovery solution. These
are:
• The distance between the primary and the disaster recovery site.
• The available network bandwidth between the primary site and the disaster recovery
site.
The following sections describe in detail the technical components available with
Windows-based systems to achieve geographical distribution. The solutions
for maintaining data copies over large distances based on SQL Server are also covered.
Disaster recovery solutions are typically combined with other solutions to protect against
outages. For example, it is typically not desirable that a local server failure results in a
complete site transfer. Here, a local cluster protects against hardware failures while
the geographic dispersion is only used in case of a real disaster. The decision when a
site transfer is required is typically not automated and requires human interaction.

SAP system protection in a geographically dispersed cluster


In order to protect the Single Points of Failure (SPOF) of an SAP instance and database,
WSFC implementations are used in a geographically dispersed installation. Applications in a
multi-site cluster are typically set up to fail over just like in a single-site cluster. WSFC itself
provides health monitoring and failure detection of the applications, the nodes, and the
communication links.
SAP supports the use of the WSFC service in geographically dispersed configurations, but
requires that the behavior is identical to a local WSFC installation. In other words,
Windows clustering does not detect the extended nature of these types of clusters. This
can be achieved with the following settings:
• Storage arrays that are visible on both sides for the SAP file systems and database
• Changed quorum implementation from shared disk to majority node set cluster
• VLAN configurations for a single subnet because of the SAP system requirements
Note: Because of the complexity of geographically dispersed clusters, the hardware
vendor must be involved with the design, setup, configuration, and subsequent support
of the implementation. SAP support is limited to the standard WSFC cluster
implementation that does not recognize geographic dispersion. From the perspective of
a SAP application, it looks as if the cluster is local.
Storage replication
In order to enable a fast failover on a secondary site in case a catastrophic event occurs,
a synchronous copy of the file system in use by the SAP system has to be maintained on
each site. This block-level replication can be achieved with hardware or software-based
replication.

Hardware-based replication
With this method, the complete replication task is done at the storage level. The
following figure shows the basic setup:

Figure 40. Hardware-based replication

The advantage of this solution is that it works completely independently of the
application. However, as the replication is performed by the storage controller, the SAN
storage devices have to be from the same vendor and there is a high bandwidth
requirement for the replication. Additionally, software components from the storage
vendors are required to enable WSFC to appropriately use this configuration. Examples
of storage-based replication providers are:
• EMC with SRDF
• HP Storage Works Business Copy EVA
• NetApp MetroCluster with SyncMirror
• IBM GDPS with PPRC
• Hitachi Storage Clusters
Note: For hardware or software-based replication solutions to work, they are required to
replicate SQL Server write I/Os in exactly the same order as originally issued by the
database.

Software-based replication
With this method, any change on the active side is copied over the network to the
secondary side and replicated there. This requires the use of a software product that is
not part of the initial cluster setup. While these software components increase the
implementation cost, the advantage is that different storage devices can be used. It is
even possible to have SAN storage on one side and a Direct Attached Storage (DAS) on
the other side. Examples of vendors for software-based replication products include:
• NSI Double-Take
• Legato RepliStor
• Symantec Storage Replicator
• SteelEye DataKeeper
• Neverfail ClusterProtector

The following figure shows the basic setup when using this method.

Figure 41. Software-based replication

Cluster quorum configuration


In simple terms, quorum for a cluster is the number of elements in a cluster that must be
online in order to enable proper cluster function. If one or more nodes in a cluster can no
longer communicate to the other nodes in the cluster because of a split situation – for
example interrupted network connections – there must be a voting mechanism that
determines which side has the majority (quorum) to actively hold the applications in the
cluster.
Each WSFC cluster has a special resource known as the quorum resource. With
Windows Server 2003, almost all server clusters used a disk in cluster storage as the
quorum resource: if a node
could communicate with the specified disk, the node could function as a part of a cluster,
and otherwise it could not. This made the quorum resource a potential single point of
failure. Windows Server 2008 R2 uses a different approach, in which a majority of votes
determines whether a cluster achieves quorum. Nodes can vote and, where appropriate, either
a disk in cluster
storage, called a disk witness, or a file share, called a file share witness, can vote. There
is also a quorum mode called No Majority: Disk Only that functions like the disk-based
quorum in Windows Server 2003. Aside from that mode, there is no single point of failure
with the quorum modes since what matters is the number of votes, not whether a
particular element is available to vote.
There is a comprehensive description about the available quorum options for Windows
Server 2008 R2 available at http://technet.microsoft.com/en-us/library/cc770620.aspx.

Majority Node Set configuration for Windows Server 2003


In a Majority Node Set (MNS) cluster, each node in the cluster maintains a copy of the
quorum data locally on its system disk. This MNS quorum is constantly synchronized
and kept consistent by the cluster itself. If the configuration of the cluster changes, that
change is reflected across the different member nodes. The change is only considered
to have been committed if it has been successfully distributed to:

(Number of nodes configured in the cluster/2) + 1

This ensures that a majority of the nodes have an up-to-date copy of the data. The
cluster service itself will only start up and therefore bring resources online if a majority of
the nodes configured as part of the cluster are up and running the cluster service. If
there are fewer nodes, the cluster will not have the quorum and therefore, the cluster
service waits to restart until more nodes join.
In the case of a failure or split-brain, all partitions that do not contain a majority of nodes
are terminated. This ensures that a running partition that contains a majority of
the nodes can safely start up any resources that are not yet running on that partition,
because it is the only partition in the cluster that is running resources.
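The majority rule can be expressed in a few lines. The sketch below assumes that the division in the formula above is an integer division, so that a three-node cluster needs two nodes and a four-node cluster needs three. The function names are hypothetical and do not correspond to a cluster API.

```python
def votes_needed(configured_nodes: int) -> int:
    """(Number of nodes configured in the cluster / 2) + 1, integer division."""
    return configured_nodes // 2 + 1


def has_quorum(configured_nodes: int, reachable_nodes: int) -> bool:
    """A partition may run resources only if it holds a majority."""
    return reachable_nodes >= votes_needed(configured_nodes)
```

In a four-node cluster that splits evenly, neither partition of two nodes reaches the required three votes, so both sides stop. This is why an odd number of voting members, for example one additional node at a third location, is preferable for geographically dispersed clusters.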
MNS quorum implementations are recommended for geographically dispersed clusters.
By having a single MSCS member node in a separate location, split-brain situations can
be avoided by using one node as an arbiter. See the following figure for an example of
this:

Figure 42. MSCS member node in a separate location

SAP supports a Majority Node Set Cluster if it is part of a cluster solution offered by the
Original Equipment Manufacturer (OEM), or Independent Hardware Vendor (IHV).
File share witness for Windows Server 2003
The file share witness feature is an improvement to the current Majority Node Set (MNS)
quorum model of Windows Server 2003. This feature enables the use of a file share that
is external to the cluster as an additional vote to determine the status of the cluster in a
two-node MNS quorum cluster deployment.
One of the disadvantages of a two-node MNS cluster is that it cannot sustain the failure
of any cluster node without losing the majority of nodes. In other words, it cannot
continue operation. Without a file share witness, the only way to overcome this problem is
to configure at least
three nodes in an MNS cluster. The three cluster nodes need to be continuously available
and should be in different physical locations.
With the File Share Witness feature, it is possible to use an external file share, also
referred to as the witness, instead of the third cluster node. By using the File Share Witness, a
two-node MNS cluster can be configured that remains operational even if one cluster
node fails. The file share acts as an additional vote to determine which node takes
ownership of the configured cluster resources.
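The effect of the witness vote can be sketched by extending the majority calculation with one additional vote. This is a simplified model with hypothetical names, not the cluster service logic. It assumes that the witness counts as exactly one vote and is reachable by at most one partition.

```python
def partition_has_quorum(total_nodes: int, nodes_in_partition: int,
                         witness_configured: bool = False,
                         witness_reachable: bool = False) -> bool:
    """Count node votes plus an optional file share witness vote."""
    total_votes = total_nodes + (1 if witness_configured else 0)
    votes = nodes_in_partition
    if witness_configured and witness_reachable:
        votes += 1
    return votes >= total_votes // 2 + 1
```

For a two-node cluster without a witness, a single surviving node has one of two votes and loses quorum. With a witness, the surviving node that can reach the file share holds two of three votes and continues to run the cluster resources.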
Additional information about the File Share Witness feature is available in the Microsoft
Knowledge Base Article 921181 at http://support.microsoft.com/kb/921181.

Network configuration
Any WSFC configuration requires at least two network adapters for the following
purposes:
• A public network that is used for the communication between the SAP central
instance, SAP AS, and SAP system client connections.
• A cluster private network that is used internally for status exchange and WSFC
cluster heartbeat information between the member nodes.
Each of these network adapters is required to have its own physical IP address and
corresponding host name. The cluster service in a WSFC cluster is unaware of a
possible geographical dispersion and assumes that its public and private network
interfaces still exist in the same network segment with the same IP subnet. This is
because cluster software is unable to determine network topology and because it
operates on IP failover that only functions within the same subnet. To accommodate
these restrictions for geographic dispersion, organizations can implement VLAN
technology.
Virtual LANs (VLANs) can be viewed as a group of devices on different physical LAN
segments that can communicate with each other as if they were all on the same physical
LAN segment. Even though some of the cluster service network communication
limitations have been removed in Windows Server 2008 R2, a single subnet is still
required. This is still true for SAP component communications as well.
With Windows Server 2003, the limit for the heartbeat roundtrip time is 500
milliseconds. Whether this fixed limit can be met depends directly on the latency and
bandwidth of
the network connections used between the two sites. With Windows Server 2008 R2,
this parameter became configurable between 250 and 2000 ms on the same subnet.
Theoretically, even different subnets are possible with Windows Server 2008 R2, but due
to the requirements of the SAP instances, SAP installations are only possible in a single
subnet configuration. Additional geographically dispersed cluster information is available
in the following resources:
• White paper: Geographically Dispersed Clusters in Windows Server 2003
http://www.microsoft.com/windowsserver2003/techinfo/overview/clustergeo.mspx
• White paper: Server Cluster Quorum Options in Windows Server 2003
http://technet.microsoft.com (Note: Search for the title.)
• White paper: Stretching Microsoft Cluster with Geo-Dispersion
http://www.microsoft.com/technet/prodtechnol/windows2000serv/maintain/optimize/geoclust.mspx
• White paper: Server Clusters: Majority Node Set Quorum
http://technet.microsoft.com (Note: Search for the title.)
• Microsoft Storage solutions
http://www.microsoft.com/windowsserversystem/storage/default.aspx
• Microsoft Knowledge base article: Microsoft Cluster Services Installation Resources
http://support.microsoft.com/kb/259267
• Multi-Site clustering with Windows Server 2008 R2
https://www.microsoft.com/windowsserver2008/en/us/clustering-multisite.aspx

Microsoft SQL Server database log shipping


SQL Server log shipping was first implemented in SQL Server 2000 and provides a
convenient way to maintain a standby database, even over a geographical distance. Its
basic functionality is the automatic transfer of transaction logs from the primary database
to a second database on another server. A log shipping cycle consists of three
operations:
• Back up the transaction log at the primary database server.
• Copy the transaction log file backup to the secondary database server.
• Restore the log backup on the secondary database server.
The following figure shows the general setup:

Figure 43. Database log shipping with Microsoft SQL Server

Transaction log backups on the primary database server are written to a local disk on
this server and transferred over the network to the standby server at the configured
time interval. Transaction log backups received on the standby server are applied to
the standby database. It is also possible to transfer the transaction log backups from the
primary to multiple standby servers.
The process of changing the database role from primary to secondary or to bring the
secondary database online in the event of the primary database becoming unavailable is
not an automatic process. The secondary database can be brought online manually.
During the process of setting up SQL Server log shipping, initially a database backup
copy is restored on the standby server. With log shipping in place, every transactional
change is reproduced on the standby side.

By design, SQL Server log shipping maintains only the SAP system database in a
geographically dispersed way. As the complete functionality of an SAP system requires an
SAP central instance with the network shares and possibly an AS, these have to be
maintained separately. SQL Server log shipping is therefore not considered a full
disaster recovery solution, but is a simple method of maintaining a copy of the database
of an SAP system and can be combined with other technologies like database mirroring or
WSFC clusters.
The following figure shows the general setup with a local WSFC cluster:

Figure 44. Local WSFC cluster

Additional SQL Server log shipping information is available in the following resources:
• SAP Service Marketplace URL:
https://service.sap.com/notes
Note: Access to this Web site is available only to registered SAP customers and
partners and requires a user name and password.
   - SAP note 493290: "Configuring SQL Server Log shipping"
   - SAP note 1101017: "Log shipping on SQL Server 2005"
• SQL Server 2000 high availability series
http://www.microsoft.com/technet/prodtechnol/sql/2000/deploy/harag05.mspx
• White paper: SAP with SQL Server 2005
http://www.microsoft.com/sql/techinfo/whitepapers/sap-with-sql-server.mspx
• White paper: Using SQL Server 2005 with SAP R/3
http://www.microsoft.com/technet/itsolutions/msit/operations/sql2005sap.mspx
• SAP Help documentation: SAP High Availability
http://help.sap.com

To access this information, do the following:
   - From the left menu pane, click SAP NetWeaver.
   - Choose English under SAP NetWeaver 7.0 Library.
   - Choose Technical Operations Manual.
   - Choose General Administration Tasks.
   - Choose High Availability.
   - Follow the SAP High Availability documentation link.
   - In the left menu pane, choose Database High Availability.
   - Choose High Availability for the MS SQL Server Database.

Database mirroring with SQL Server 2005/2008 R2


Database mirroring is a database feature developed as part of Microsoft SQL Server
2005. Conceptually, database mirroring consists of a database, called the principal
database, that resides on a Microsoft SQL Server 2005/2008 R2 database instance, and
a mirror that resides on a different Microsoft SQL Server 2005/2008 R2 instance. With
SQL Server 2005/2008 R2 database mirroring, all the database transactions of a
productive database are replicated on a standby database. As with the log shipping
procedure, the transaction logs of the database play a major role.
As seen in the following figure, transaction logs are used in all SQL Server databases to
record data changes during transactions. The records first go into a memory area
allocated to the database log buffer. From there, the data is written as quickly as
possible into a log file. In systems with active database mirroring, the log buffer content
is transferred to the mirror server at the same time.

Figure 45. Database Mirroring with SQL Server 2005

The mirror server that receives these transaction log records writes them into the mirror
database log buffer before it writes them into a local transaction log file. The received
transaction log records are then applied to the mirror database. During the transaction
log application, all transactions executed on the active database are then executed on
the mirror side. Therefore, both databases can be maintained on the same transactional
level.
Database mirroring always requires the active database and the mirror database. In
addition, there is an optional third role: the witness. With a witness configured, an
automatic failover to the mirror database can take place in case of an error. When
database mirroring is used in such a high availability configuration, the mirror server
can take on the role of the active server within seconds. The mirror database becomes
available once the witness confirms the failover.
Database mirroring assures the availability of a consistent standby database in case of a
productive database interruption. Because the data packets are encrypted during
network transport, the data is also protected in transit. SQL Server 2005/2008 R2
supports three different mirror configurations:
• Asynchronous mirroring
• Synchronous mirroring
• Synchronous mirroring with automatic failover
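The difference between the three configurations can be expressed through the T-SQL settings applied on the principal once a mirroring session exists. The following Python sketch composes these statements under that assumption; the enum, the function name, the database name, and the witness address are hypothetical, while `SET PARTNER SAFETY` and `SET WITNESS` are standard `ALTER DATABASE` options.

```python
from enum import Enum
from typing import List, Optional


class MirrorMode(Enum):
    ASYNCHRONOUS = "asynchronous"
    SYNCHRONOUS = "synchronous"
    SYNCHRONOUS_AUTO_FAILOVER = "synchronous with automatic failover"


def mirroring_mode_sql(database: str, mode: MirrorMode,
                       witness_url: Optional[str] = None) -> List[str]:
    """Return the T-SQL statements that select the given mirroring mode."""
    if mode is MirrorMode.ASYNCHRONOUS:
        # Transaction safety off: commits do not wait for the mirror.
        return [f"ALTER DATABASE [{database}] SET PARTNER SAFETY OFF"]
    # Transaction safety full: commits wait for the mirror's acknowledgement.
    statements = [f"ALTER DATABASE [{database}] SET PARTNER SAFETY FULL"]
    if mode is MirrorMode.SYNCHRONOUS_AUTO_FAILOVER:
        if witness_url is None:
            raise ValueError("automatic failover requires a witness")
        statements.append(f"ALTER DATABASE [{database}] SET WITNESS = '{witness_url}'")
    return statements
```

For example, `mirroring_mode_sql("PRD", MirrorMode.SYNCHRONOUS_AUTO_FAILOVER, "TCP://witness.example.com:5022")` returns the safety statement followed by the witness statement.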

Asynchronous database mirroring


Asynchronous data transfer between a productive database and a standby database
means that the productive database does not wait for the transfer to be acknowledged
before the pending transaction is concluded with the commit work statement. The
primary advantage of this mode is that transaction processing time is minimized. In
synchronous operation, by contrast, the time required for the mirror server to
acknowledge each transaction can become a significant performance bottleneck when
network bandwidth is low.
Though the transaction performance is improved with asynchronous database mirroring,
there is a significant disadvantage: it cannot be guaranteed that all transactions have
been safely transported to the mirror server at any given point in time. In case of
an error, this situation can lead to a loss of committed transactions. With
asynchronous mirroring, SQL Server 2005/2008 R2 cannot switch automatically to a
standby server in case of an error without an additional Microsoft partner solution.
The standby database is still available, but only with the potential loss of
committed transactions. The standby database can continue productive operation, but
the failover needs to be initiated manually.
Asynchronous database mirroring is best applied in disaster recovery scenarios.
Because of the greater distances between mirror servers, network bandwidths are often
limited in this case. Presently, the log shipping technology introduced with SQL Server
2000 is often deployed in this scenario.

Synchronous mirroring with automatic failover in case of error

The advantage of synchronous database mirroring is that a database transaction is
considered complete only when the write on the mirror side is complete. In this
type of operation, it is guaranteed that the mirror copy always has exactly the same
transactional level as the original. Because of this increased data security, automatic
switching of the database operation in case of an error is possible.
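The difference in commit behavior between the two modes can be illustrated with a small model. This is a simplified Python sketch, not SQL Server's implementation: the class and method names are hypothetical, and the network is reduced to a list of records that are on their way to the mirror.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class MirrorSession:
    synchronous: bool
    principal_log: List[str] = field(default_factory=list)  # committed on the principal
    mirror_log: List[str] = field(default_factory=list)     # written on the mirror
    in_flight: List[str] = field(default_factory=list)      # sent, not yet written

    def commit(self, txn: str) -> None:
        if self.synchronous:
            # The commit completes only after the mirror has written the record.
            self.mirror_log.append(txn)
        else:
            # The commit completes immediately; the record is still in transit.
            self.in_flight.append(txn)
        self.principal_log.append(txn)

    def network_delivers(self) -> None:
        """Model the mirror catching up in asynchronous mode."""
        self.mirror_log.extend(self.in_flight)
        self.in_flight.clear()

    def lost_on_principal_failure(self) -> List[str]:
        """Committed transactions that the mirror would not have."""
        return [t for t in self.principal_log if t not in self.mirror_log]
```

In this model a synchronous session never reports lost transactions, whereas an asynchronous session reports every transaction that was committed after the last delivery to the mirror.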
The prerequisite for the automatic failover configuration, however, is the installation of an
additional database server, the witness. The witness is an additional instance
of SQL Server that is needed only to determine which mirror site is able to take over.
This can be basically any SQL Server instance. Even the free SQL Server Express
Edition would work. If the active database server fails, the mirror server
and the witness form the majority (quorum) that defines who can actively hold the
database. Even if the primary server recovers, the active database role is not
accidentally returned to the primary server. This is because the quorum defines the
mirror server as the active database owner after a failover occurs.
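The majority rule can be sketched as follows. This Python example is a simplified model of the quorum idea described above, not SQL Server's actual failover logic; the role names and function names are hypothetical.

```python
from typing import FrozenSet, Set

PARTNERS: FrozenSet[str] = frozenset({"principal", "mirror", "witness"})


def has_quorum(me: str, reachable: Set[str]) -> bool:
    """True if `me` plus the partners it can reach form a majority of the three."""
    members = ({me} | set(reachable)) & PARTNERS
    return len(members) > len(PARTNERS) // 2


def mirror_may_fail_over(reachable_from_mirror: Set[str]) -> bool:
    """The mirror takes over only if it has lost the principal but still
    sees the witness, so that the two of them hold the majority."""
    return ("principal" not in reachable_from_mirror
            and has_quorum("mirror", reachable_from_mirror))
```

In this model a mirror that has lost both partners does not take over, and a former primary server that is cut off from both partners has no quorum and therefore cannot continue to hold the database on its own.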
Because synchronous database mirroring waits for every transaction to be acknowledged
by the mirror, fast network connections with low latency are required. This typically also
determines the maximum distance that two sites can be apart from each other. With
current technologies, this distance is about 50 km.
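The relationship between distance and latency can be estimated with a short calculation. The following Python sketch assumes that a signal travels roughly 200,000 km per second in optical fiber; this figure is an assumption that is not taken from this guide, and the constant and function names are hypothetical.

```python
# Assumption: light travels roughly 200,000 km/s in optical fiber,
# which corresponds to 200 km per millisecond.
FIBER_KM_PER_MS = 200.0


def round_trip_ms(distance_km: float) -> float:
    """Minimum propagation delay for one send/acknowledge round trip."""
    return 2 * distance_km / FIBER_KM_PER_MS
```

Under this assumption, `round_trip_ms(50)` yields 0.5 milliseconds, which is added to every synchronously mirrored commit. Real links add delays from network equipment, and cable routes are usually longer than the direct distance between sites.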

SAP database mirroring configurations

Detailed information about database mirroring installation and configuration for
SAP is available in SAP note 965908. This note is also important when determining
how to combine database mirroring with other technologies like log shipping.

Additional information about SQL Server database mirroring is available in the following
resources:
• White paper: SAP with SQL Server 2005
http://www.microsoft.com/sql/techinfo/whitepapers/sap-with-sql-server.mspx
• White paper: Using SQL Server 2005 with SAP R/3
http://www.microsoft.com/technet/itsolutions/msit/operations/sql2005sap.mspx
• Books online: Microsoft SQL Server 2005
http://www.microsoft.com/technet/prodtechnol/sql/2005/dbmirror.mspx
• White paper: SQL Server 2008 Technologies for SAP Solutions
https://www.sdn.sap.com/irj/sdn/go/portal/prtroot/docs/library/uuid/60a236a2-8104-2b10-5ebe-8fef61cc82fd

Disaster recovery solutions for virtual machines


As described in the Hyper-V host cluster section, the foundation for all Hyper-V high
availability solutions is the WSFC. As with disaster recovery solutions for physical
installations, Hyper-V disaster recovery is based mainly on geographically dispersed
WSFC clusters and SAN storage with storage replication.
The significant aspect here is that the SAN storage vendor has to provide integration
components for the cluster so that the cluster can handle the VHD files as cluster
resources on this type of storage. Currently, many storage vendors support disaster
recovery configurations for Hyper-V, including Live Migration across two
geographically dispersed sites. For more disaster recovery information, see
http://www.microsoft.com/virtualization/en/us/solution-continuity.aspx.
