June 2010
Authors
Josef Stelzel, Sr. Developer Evangelist, Microsoft Corporation,
jstelzel@microsoft.com
Summary
This paper describes how to implement a high availability solution for SAP applications
on Microsoft® Windows Server® 2008 R2. It is written for developers, technical
consultants, and solution architects. This paper introduces the technologies and
architecture used, describes various high availability scenarios, and discusses the
implementation process. This paper also contains links to advanced features and
technical topics including disaster recovery methods.
Note: Access to some of the linked information might be restricted. For example, SAP
notes are available at the SAP Service Marketplace at https://service.sap.com. Access to
this Web site is available only to registered SAP customers and partners, and requires a
user name and password.
SAP Applications on Windows Server 2008 R2 High Availability Reference Guide ii
The information contained in this document represents the current view of Microsoft Corporation on the
issues discussed as of the date of publication. Because Microsoft must respond to changing market
conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot
guarantee the accuracy of any information presented after the date of publication.
This White Paper is for informational purposes only. MICROSOFT MAKES NO WARRANTIES, EXPRESS,
IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS DOCUMENT.
Complying with all applicable copyright laws is the responsibility of the user. Without limiting the rights under
copyright, no part of this document may be reproduced, stored in or introduced into a retrieval system, or
transmitted in any form or by any means (electronic, mechanical, photocopying, recording, or otherwise), or
for any purpose, without the express written permission of Microsoft Corporation.
Microsoft may have patents, patent applications, trademarks, copyrights, or other intellectual property rights
covering subject matter in this document. Except as expressly provided in any written license agreement
from Microsoft, the furnishing of this document does not give you any license to these patents, trademarks,
copyrights, or other intellectual property.
Microsoft, Windows, Windows Server, the Windows logo, SQL Server, and Active Directory are either
registered trademarks or trademarks of Microsoft Corporation in the United States and/or other countries.
The names of actual companies and products mentioned herein may be the trademarks of their respective
owners.
Applies To
• SAP NetWeaver 7.0
• SAP NetWeaver 2004
• SAP Business Suite (mySAP ERP)
• SAP Application Server
• SAP Replicated Enqueue
• SAP System Central Services
Keywords
SAP NetWeaver, disaster recovery, high availability, SAP Application Server, SAP
Replicated Enqueue, planned downtime, unplanned downtime, SQL Server 2005/2008
R2, Windows Server 2008 R2
Contact
This document is provided by Microsoft Corporation. Please check the SAP
interoperability area at www.microsoft.com/sap and the .NET interoperability area in the
SAP Developer Network at http://sdn.sap.com for updates or additional information.
Contents
Applies To
Executive Summary
High Availability Considerations
  Critical application availability requirements
  Classes of availability problems
    Loss of physical resources
    Logical errors and inconsistencies
    Disasters
    Planned downtime
  Service level agreements
    Availability measures
  High availability solution risks and side effects
    Increased complexity
    Higher costs
  Hyper-V virtualization and availability
    Guest clustering
SAP Architecture and Requirements
  SAP NetWeaver and its components
  SAP Application Server architecture
    ABAP system architecture
    Dual-stack system architecture
    Java system architecture
    SAP system single points of failure
  SAP standalone engines
    The SAP Web Dispatcher
    SAP standalone gateway
    TREX
    SAP liveCache
    SAP Content Server
Unplanned Downtime Avoidance Strategies
  Hierarchy of high availability solutions
    Data storage protection
    Server protection
    Network high availability
    Application specific configurations
  Simple cluster for a single SAP system
  Using multiple clusters for SAP instances and databases
  SAP Replicated Enqueue
  Multi-SID cluster
  Multi-node cluster
  SAP application servers
  IT infrastructure protection
  Hyper-V host cluster
Planned Downtime Minimization Solutions
  Planning ahead for minimizing planned downtime
    Change management strategy deployment
  Backup and patching solutions
    Snapshot backup
  Optimized server maintenance system architecture
    Server and operating system maintenance
    SQL Server instance maintenance
  SAP application planned downtime reduction
Executive Summary
Business applications are central to a corporate IT operation. All corporate business
processes are supported by software solutions that help to better plan, process, or
communicate in all business related tasks. Consequently, any service failure has an
immediate and direct impact on corporate business results. This often decreases
revenue and can damage the corporate image.
This is especially true for SAP applications as corporations increase their dependencies
on a productive IT environment. Enterprise Service Architecture (ESA) and the global
network of interacting companies have increased both uptime requirements and
the number of IT components that are ultimately needed to fulfill business requirements.
As an increasing number of companies join global networks, there is always a time zone
in which a computing service is being used. Whereas centralized application systems like
SAP R/3 were used in the past, ESA orchestrates multiple service providers in order to
accomplish a larger task. Those services can be distributed inside or outside a company
and all of them need to be available.
High availability of mission critical applications has always been the focus for SAP
infrastructures. The starting point for increasing availability traditionally has been to
address the loss of a critical hardware resource that could generate downtime until the
computer system is available again. More solutions have been developed over time to
address other problems like downtime due to operating system defects, downtime
caused by data inconsistencies, or downtime caused by disasters like earthquakes,
floods, or terrorism. Even planned downtime, which is needed to upgrade systems or
install patches, is contrary to the requirement to have an application service consistently
available. However, planned downtime does reduce system vulnerability and increase
reliability.
This guide describes the solutions that address the various areas of availability for SAP
on the Windows® platform. It helps to identify the cause of potential downtime and
provides the technical strategy to reduce or eliminate it. In addition, this guide provides
solution description references that help the reader understand the technology and
quickly find assistance.
Microsoft® has a long history of providing a comprehensive portfolio of solutions for
protecting enterprise class applications like SAP. Microsoft Windows Server® 2008 R2
offers even more functionality than previous versions with clustering, geographic
distribution, and operating system security. Improved network configuration functionality,
performance enhancements, and storage subsystem management included with
Windows Server 2008 R2 make it easier to work with the latest technology from
hardware partners. As a central component of Windows Server 2008 R2, high availability
makes managing the complexity of modern infrastructures both effective and affordable.
Note: To avoid user interruption, all dependencies must be protected as well as the
primary application services. If a productive system is integrated into an IT infrastructure,
this infrastructure is also critical because it is a potential data provider or data consumer
for the productive system. Any downtime associated with these dependent systems will
interrupt the primary application services as well.
Availability measures
In order to measure and quantify computer system or application availability, the
following formula is used:
Availability is defined as the percentage of time during which the application could be
used for its intended purpose. Defined availability values like 99.999 percent are often
used in marketing as a solution quality indicator. The following table shows the assumed
unavailability for various typical values.
Figure 1
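As a rough illustration of how availability percentages translate into downtime, the following Python sketch computes availability from uptime and downtime and converts an availability target into the maximum downtime per year. It assumes the commonly used definition, availability = uptime / (uptime + downtime), and a 365-day year with 24x7 operation. The function names are illustrative and are not taken from this guide.

```python
HOURS_PER_YEAR = 365 * 24  # assumption: non-leap year, 24x7 operation


def availability_percent(uptime_hours: float, downtime_hours: float) -> float:
    """Availability as a percentage of total observed time (assumed definition)."""
    total = uptime_hours + downtime_hours
    if total <= 0:
        raise ValueError("total observed time must be positive")
    return 100.0 * uptime_hours / total


def max_downtime_hours_per_year(availability: float) -> float:
    """Maximum yearly downtime in hours allowed by an availability target in percent."""
    if not 0.0 <= availability <= 100.0:
        raise ValueError("availability must be between 0 and 100")
    return HOURS_PER_YEAR * (1.0 - availability / 100.0)


if __name__ == "__main__":
    for target in (99.0, 99.9, 99.99, 99.999):
        minutes = max_downtime_hours_per_year(target) * 60
        print(f"{target}% -> about {minutes:.1f} minutes of downtime per year")
```

Under these assumptions, each additional "nine" reduces the tolerated downtime by a factor of ten, which is why very high availability targets usually require redundancy at every layer.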
More information about support for SQL Server® in a guest cluster environment can be
found at:
http://support.microsoft.com/kb/956893
A detailed description for how to configure a Hyper-V guest cluster can be found at:
http://blogs.technet.com/b/mghazai/archive/2009/12/12/hyper-v-guest-clustering-step-by-step-guide.aspx
For user and information integration, SAP NetWeaver uses the SAP Enterprise Portal
(EP) and SAP Business Warehouse (BW). Data is also integrated by the SAP Master
Data Management (MDM). By using the SAP Mobile Infrastructure, user integration can
be extended to a wide variety of remote devices. Process integration is performed by SAP
Process Integration (PI), formerly known as SAP Exchange Infrastructure (XI).
Enterprise service architectures are made possible by the integration of people,
information, and processes, and are the foundation of a new breed of applications.
Composite applications are composed from a variety of individual functions already
available in the application infrastructure, and demonstrate how to develop faster and
more flexible solutions for future business requirements.
layer for the business logic coded in ABAP or Java, they are required for the fulfillment of
the business process, which in turn creates high availability requirements. Solutions for
optimized availability are supported by the SAP AS architecture, but always depend on
additional components such as redundant servers and the monitoring and control
processes that are typically found in high availability clusters.
Before going deeper into the SAP AS architecture, the general features should be
discussed. Every SAP AS installation consists of at least one central database and a
central SAP instance that provides unique services for the SAP system. If more
transactional performance is required by the SAP system, additional application servers
can be added to the SAP system. A SAP system, which is identified by a unique System
Identifier (SID), might consist of many SAP instances and the common database.
Depending on the type of application, a SAP AS can be installed for ABAP, Java, or for
both workload types as shown in the following figure:
This figure shows two instance types: a central instance and one or more dialog
instances. Processes like Dialog, Batch, Update, Spool, or the Dispatcher process exist
multiple times in a SAP system and are therefore redundant. Each installation of an ABAP
instance also has one gateway process configured that is used for communication
through the Remote Function Call (RFC) protocol. Also, each instance has its own
Internet Communication Manager (ICM) process for HTTP-based communication. The
Internet Graphics Server (IGS) only supports the creation of bitmaps for browser-based
clients.
To register all the instances of a SAP system and to support the communication between
the various components of a distributed SAP system, a single message server is
configured in the central instance. Also unique within the SAP system is the central
Enqueue server, which manages the lock entries of a distributed SAP system in a lock
table inside the shared memory of the server. Because of these two unique processes, the
term central instance is used for this installation. A central instance is the smallest
working unit of a SAP system; performance can be extended by adding further
application servers.
The following figure takes a closer look at the directory structure of this SAP system,
using an installation of SAP AS 6.40 as the example.
All profiles and executables of a distributed SAP system are made available from the
central instance to all dialog instances through the SAPMNT share. In order to support a
simple patch process for executables, there is one master copy of the executables on
the central instance. Any time a dialog instance starts, the SAP utility SAPCPE checks
for the availability of a newer executable version. When one is available, this executable
is copied to the local runtime directory of the AS before it is used.
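The copy-on-start idea described above can be sketched in a few lines of Python. This is not how SAPCPE is implemented; it is a minimal illustration that assumes "newer" can be decided by comparing file modification times, and the function name and file names are hypothetical.

```python
import shutil
import tempfile
from pathlib import Path


def sync_newer_executables(master_dir: Path, local_dir: Path) -> list:
    """Copy files from master_dir to local_dir when missing or older locally.

    Returns the names of the files that were copied.
    """
    local_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for source in sorted(master_dir.iterdir()):
        if not source.is_file():
            continue
        target = local_dir / source.name
        # Copy if the local file is absent or has an older modification time.
        if not target.exists() or target.stat().st_mtime < source.stat().st_mtime:
            shutil.copy2(source, target)  # copy2 also preserves the timestamp
            copied.append(source.name)
    return copied


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        master = Path(tmp) / "master"  # stands in for the master copy on the share
        local = Path(tmp) / "local"    # stands in for the local runtime directory
        master.mkdir()
        (master / "example.exe").write_text("patched kernel")
        print(sync_newer_executables(master, local))  # first start: file is copied
        print(sync_newer_executables(master, local))  # second start: nothing to copy
```

Keeping a single master copy and pulling it at instance start means a patch only has to be applied in one place, while each instance still runs from local files.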
Changes to the ABAP reports of a SAP system are distributed by using the transport system.
SAP systems can be configured to be a member of a transport domain. For each
transport domain, there is one directory that is shared by all members of the domain.
The directory is: <Drive>:\usr\sap\trans. Because of the central character of this shared
directory, it can be considered a single point of failure for the operation of more than one
SAP system.
With the introduction of SAP AS 7.0, there was a major change in the layout of the
central instance. Similar to the structures in pure Java systems, the unique Message and
Enqueue server processes have been moved to a separate SAP instance: the ABAP
System Central Services (ASCS) instance. Therefore, a typical AS no longer has any
system-wide functions. The following figure shows the SAP landscape simplification:
Subsequently, the file system of an ASCS instance would look like the following figure:
SAP installations of SAP AS 7.0 consisting of an ASCS instance and a dialog instance
will continue to use the name format D<instnr> for the instance directory. This combined
installation structure is shown in the following figure:
Regardless of this combined installation structure, the Enqueue and Message server
processes are now in the ASCS instance. The naming convention was not changed
for reasons of compatibility with older versions.
Dual-stack system architecture
With the introduction of J2EE as a possible SAP system component in version 6.40, SAP
AS can be installed for ABAP, Java, or for both types of workloads. There is a
considerable difference in architecture between ABAP and Java platforms as seen in the
following figure:
As shown in this figure, both the ABAP and the Java part of the SAP AS have their own
Message and Enqueue server as critical components. The Java AS is primarily made up
of Java server processes. The software deployment manager (SDM) is used for the
installation and management of software versions. The server operating system must
also have a Java development kit (JDK) installed to configure the Java virtual machine
(JVM). The JDK for Windows is available from Sun Microsystems.
While in the ABAP Application Server version 6.40 the Enqueue and Message server are
still a part of the central instance, the Java AS always uses the system central services
(SCS) instance concept. This means that every 6.40 dual-stack system must, at a
minimum, consist of two instances.
As with the pure ABAP configuration, the hybrid system still has a central database that
separates the respective application data types by using a schema. In the hybrid
structure, the ABAP and Java functions are shut down simultaneously as if they were a
single instance. The parameters of both are also configured in a single instance profile.
The Java SCS instance in this installation is a complete unit and has its own profile. It
can be started or stopped independently.
For the purpose of maintaining distributed installations of SAP instances, all profiles and
executables of the SAP system are shared from one central server through a network
share. This is typically the server that holds the central instance or the SCS instance.
Altogether, the dual-stack directory structures are naturally more complex than those of
a pure ABAP or a pure Java instance. The SAP AS 6.40 dual-stack system structure is
shown in the following figure:
As seen previously with the pure ABAP AS, the system structure for dual-stack systems
became simpler with the introduction of the SAP AS 7.0. The only difference between
the otherwise identical system instances is the Software Deployment Manager (SDM),
which is installed in only one instance. The SDM is required to install and patch Java
programs and is only needed when new programs are installed or during software
maintenance. Therefore, there is no need to configure the SDM in a cluster solution. To
secure SDM service availability, a backup copy can be installed on any AS when needed.
As of SAP AS version 7.10, the SDM will be completely removed from the installation
and replaced with a new Java Support Package Manager (JSPM) function. The software
maintenance functionality will then be an integrated part of every AS, so this function
will be available redundantly.
The following figure shows the SAP AS 7.0 dual-stack system structure. As shown in the
figure, the ABAP Message server and Enqueue server have now been moved to a new,
separate ASCS instance, which simplifies the dual-stack SAP AS setup.
The following figure shows the SAP AS 7.0 dual-stack directory structure with an ABAP,
Java, and a SCS instance. In this file system layout, there is a clear distinction of the
different components described.
Multiple J2EE instances placed on several physical servers create a Java cluster. The
basic rule is that a Java instance can only be configured once per physical server. A
Java instance must consist of at least one Java server process and a dispatcher, but
can also have multiple Java server processes. The central SCS instance can also be
placed together with a regular Java instance on one physical server.
Similar to the ABAP installation, the profiles and executables of a distributed Java
system also reside on one physical server and are shared from there. Because of the
central character of these files, this is the server that holds the SCS instance of the SAP
system.
SAP system single points of failure
Single points of failure are SAP system elements that are critical in order to operate a
system and that must be protected against loss or failure in a highly available SAP system.
data consistency at all times. For example, there is a mechanism that logs all changes
executed during a transaction. If a database operation fails in the middle of a
transaction, the logs are used to restore the previous condition. The transaction logs can
also be used to reapply transactions to a database image. For example, a database
image restored from a backup would not reflect the latest transactional state since the
transactions have most likely been executed after the backup was created. The latest
transactions would be lost due to the restore if there was no transaction log available to
reapply them.
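The roll-forward idea can be illustrated with a small Python sketch. It models the database as a dictionary and the transaction log as a list of records carrying a sequence number; both the record layout and the names used here are assumptions made for illustration and do not reflect the internal format of any database product.

```python
from copy import deepcopy


def restore_and_roll_forward(backup_image: dict, log: list, backup_seq: int) -> dict:
    """Restore a backup image and reapply logged changes made after the backup.

    Each log record is a tuple (sequence_number, key, new_value).
    A new_value of None represents a delete.
    """
    database = deepcopy(backup_image)  # the restored image reflects backup time
    for seq, key, new_value in sorted(log, key=lambda record: record[0]):
        if seq <= backup_seq:
            continue  # this change is already contained in the backup image
        if new_value is None:
            database.pop(key, None)
        else:
            database[key] = new_value
    return database


if __name__ == "__main__":
    backup = {"order_1": "created"}  # image taken after log record 1
    log = [
        (1, "order_1", "created"),
        (2, "order_1", "shipped"),
        (3, "order_2", "created"),
    ]
    print(restore_and_roll_forward(backup, log, backup_seq=1))
```

Without the log records 2 and 3, the restored image would silently lose the shipment and the second order, which is the loss scenario the paragraph above describes.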
Databases are central application components and are often protected by high
availability clusters or other technologies like Microsoft SQL Server® 2005/2008 R2
Database Mirroring. High availability clusters use the same database image that is
accessed from two servers (shared disk) for server redundancy. Database mirroring, on
the other hand, is able to maintain a physically independent copy of the critical data. The
main purpose of all these technologies is to protect the database service against loss
since it is the most critical component of a SAP system.
The SAP System Message Server registers all SAP system instances and balances the
user load by connecting new users to the most available server in the system. Existing
connections will remain intact if a message server goes offline; however, no new
connections can be made through that server. This makes the Message Server an ideal
cluster solution candidate.
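The registration and load-balancing role can be pictured with the following toy model in Python. It is only a sketch: the class name is hypothetical, and using the number of connected users as the load metric is an assumption made for illustration rather than the criterion the real Message Server applies.

```python
class MessageServerSketch:
    """Toy model: registers instances and routes new logons to the least-loaded one."""

    def __init__(self) -> None:
        self._load = {}  # instance name -> number of connected users (assumed metric)

    def register(self, instance: str) -> None:
        """Register an instance; an already registered instance keeps its load."""
        self._load.setdefault(instance, 0)

    def unregister(self, instance: str) -> None:
        """Remove an instance, for example after it has been stopped."""
        self._load.pop(instance, None)

    def logon(self) -> str:
        """Assign a new user to the instance with the fewest connected users."""
        if not self._load:
            raise RuntimeError("no application server instance is registered")
        # Ties are broken alphabetically so the result is deterministic.
        target = min(sorted(self._load), key=self._load.__getitem__)
        self._load[target] += 1
        return target


if __name__ == "__main__":
    server = MessageServerSketch()
    server.register("D01")
    server.register("D02")
    print([server.logon() for _ in range(3)])
```

The sketch also shows why losing this service only blocks new logons: users who are already assigned to an instance no longer need the routing step.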
The Enqueue Server is part of the SAP lock concept. The purpose of the SAP lock
concept is to synchronize data access in order to protect the consistency of SAP data
objects. This is one of the most important functions of a SAP system. It keeps SAP data
consistent by not allowing two users to make changes to the same data object at the
same time. Instead, the data would be locked for the first user.
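A minimal sketch of such a lock table, written in Python, may help to make the concept concrete. It assumes a single exclusive lock per data object and uses hypothetical class and method names; the real SAP lock concept supports more lock modes and lock arguments than are shown here.

```python
class LockTableSketch:
    """Toy lock table: at most one owner per data object (exclusive lock)."""

    def __init__(self) -> None:
        self._owners = {}  # data object -> user holding the lock

    def enqueue(self, data_object: str, user: str) -> bool:
        """Try to lock data_object for user; return True if the lock is granted."""
        owner = self._owners.get(data_object)
        if owner is not None and owner != user:
            return False  # another user already holds the lock
        self._owners[data_object] = user
        return True

    def dequeue(self, data_object: str, user: str) -> None:
        """Release the lock, but only if it is held by user."""
        if self._owners.get(data_object) == user:
            del self._owners[data_object]


if __name__ == "__main__":
    table = LockTableSketch()
    print(table.enqueue("MATERIAL 4711", "alice"))  # first user obtains the lock
    print(table.enqueue("MATERIAL 4711", "bob"))    # second user is rejected
    table.dequeue("MATERIAL 4711", "alice")
    print(table.enqueue("MATERIAL 4711", "bob"))    # lock is free again
```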
The Enqueue Server in the following figure consists of a work process and a lock table in
the shared memory of the server that is used to store the lock information for an entire
SAP system. The Enqueue work process is needed in distributed systems to insert or
verify lock information on behalf of the dialog instances. Local work processes can
directly access the lock table and do not need this Enqueue work process. If, however,
the lock table is lost through a server failure, lock information can no longer be verified.
In a distributed system, this would cause a reset and rollback of all pending
transactions, even on dialog instances that would normally resume working, and all
session contexts would be lost. An example of a SAP AS 6.40 ABAP system with a single
point of failure (SPOF) is shown in the following figure:
This figure shows only the critical SAP AS components and is therefore not complete.
Another critical point resides in the file systems and network shares of the SAP
installation on Windows. It is important that the SAP system executables and profiles are
always installed with the central instance or, in newer systems, with the SCS instance.
Access to these files is provided through the SAPMNT share that is present only once
per SAP system. Executables available on this share are copied to the local machine by
the SAP program SAPCPE before an instance starts. This is done to improve the stability
of the SAP instance. However, the profiles are only read through this share.
The following figure shows the infrastructure of two servers: Server Alpha hosts the
central instance, and therefore the SAPMNT share, and Server Beta is a SAP application
server. Both instances have the share SAPLOC that is used to access the local
environment of a SAP instance. Both servers have two environment variables:
SAPGLOBALHOST and SAPLOCALHOST. The UNC names
\\SAPGLOBALHOST\sapmnt and \\SAPLOCALHOST\saploc are derived from these
variables. These names are used in the SAP kernel to locate the SAP system profiles
and system directories. Server Alpha has both variables set to the name of the local
server so all access points are local. However, Server Beta is directed to the central
server when accessing SAPGLOBALHOST. SAPLOCALHOST is used for all instance
specific operations and therefore is accessed again through a local path.
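The way the two UNC names follow from the two variables can be shown with a short Python sketch. The function name and the host names ALPHA and BETA are illustrative assumptions that mirror the two servers in the example; the variables are passed in as a dictionary instead of being read from the real process environment.

```python
UNC_PREFIX = "\\\\"  # two literal backslashes, as at the start of a UNC name


def sap_unc_paths(environment: dict) -> dict:
    """Build the sapmnt and saploc UNC names from the two host variables."""
    global_host = environment["SAPGLOBALHOST"]
    local_host = environment["SAPLOCALHOST"]
    return {
        "sapmnt": f"{UNC_PREFIX}{global_host}\\sapmnt",
        "saploc": f"{UNC_PREFIX}{local_host}\\saploc",
    }


if __name__ == "__main__":
    # Server Alpha hosts the central instance: both variables name the local server.
    alpha = sap_unc_paths({"SAPGLOBALHOST": "ALPHA", "SAPLOCALHOST": "ALPHA"})
    # Server Beta is an application server: global files come from Server Alpha.
    beta = sap_unc_paths({"SAPGLOBALHOST": "ALPHA", "SAPLOCALHOST": "BETA"})
    print(alpha)
    print(beta)
```

Because only SAPGLOBALHOST decides where the global files are found, pointing it at a name that can move between servers is what allows the share to follow a failover.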
The directory structure described above and the SAPMNT share must be protected within
a high availability solution because of their central significance for the SAP system.
Since the access to the UNC path is derived from the variable SAPGLOBALHOST, these
files are also called global files.
Prior to SAP AS version 7.0, in all ABAP or ABAP + Java systems, the central instance
was protected in a cluster. The reason was simple: It was not possible to separate the
Enqueue and the Message server from the rest of the SAP central instance. Together
with the central instance, the Enqueue server and the Message server, the global files
and the SAPMNT share were implicitly protected in a cluster as well.
With the development of the SAP Standalone Enqueue, it became possible for the first
time in the SAP AS 6.40 to extract the central components, the Message server and the
Enqueue server, into a separate instance. By doing so, the cluster configuration for
critical SAP services was significantly simplified. In version 6.40, the SAP System
Central Services (SCS) instance was introduced only for Java-based systems. SCS
configurations also became available for ABAP-based systems with SAP AS 7.0. These
configurations are called ASCS instances.
Note: All high availability configurations of SAP systems today are based on the SCS
instances, the protection of the SAPMNT share, and the GLOBAL files in a failover
cluster.
One of the main benefits of this configuration compared to the protection of a complete
central instance in older versions is the fact that only two relatively lightweight services
need to be moved and restarted. SCS instances lead to shorter failover times and more
stability in the cluster implementation. Since there are no SAP users connected to a SCS
instance, the effect of a failover is also much smaller in the SAP system. Using the SAP
Replicated Enqueue in addition to SCS high availability configurations enables
enterprises to minimize application server interruptions. For more information, see the
Measures to Avoid Unplanned Downtime section.
The information below summarizes which configuration is supported by which version.
ABAP systems:
• Up to version 6.40, the central instance is clustered.
• During an upgrade of an existing 6.40 central instance to 7.0, the established
architecture remains intact. SAP has documented the migration steps to support the
new ASCS structure in SAP note 1011190.
• When initially installing SAP AS 7.0, only the SCS/ASCS instance will be clustered.
Pure Java systems:
• Since the SAP AS 6.40 SR1 release, only the SCS is clustered. No changes are
needed to upgrade to 7.0.
ABAP + Java systems:
• Since the SAP AS 6.40 SR1, the Java SCS instance together with the ABAP central
instance is clustered.
• With the new installation of SAP AS 7.0, only the ASCS instance and the SCS
instance are clustered.
If the SAP Standalone Enqueue in the SCS instance is combined with a SAP
Replicated Enqueue running on a second server, the lock table can be replicated
continuously. In a larger SAP system with several SAP application servers, this provides
operation with minimal interruptions, even when the central services must
be transferred to another server due to a hardware failure.
The SAP Replicated Enqueue can only be used for lock table replication and cannot
function as a regular Enqueue server of a SAP system. The lock table in the SAP
Replicated Enqueue, which holds the replicated lock entries, cannot be used directly for
the Enqueue service. During the process of a failover of the Enqueue server to the
replication site, the standard Enqueue process is first started and a new, empty lock
table is created. The replicated data in the shadow lock table is then read and
transferred to this new lock table before the system is operational again.
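The failover sequence described above can be sketched as a toy model in Python. The class and attribute names are hypothetical, and replication is simplified to a synchronous copy of every granted lock; the sketch only illustrates the order of the steps, not the actual replication protocol.

```python
class EnqueueWithReplicaSketch:
    """Toy model of a standalone Enqueue server with a replicated lock table."""

    def __init__(self) -> None:
        self.lock_table = {}    # active lock table on the primary server
        self.shadow_table = {}  # replicated copy held on the second server

    def enqueue(self, data_object: str, user: str) -> bool:
        """Grant an exclusive lock and replicate it; return False if already held."""
        owner = self.lock_table.get(data_object)
        if owner is not None and owner != user:
            return False
        self.lock_table[data_object] = user
        self.shadow_table[data_object] = user  # replicate each granted lock
        return True

    def fail_over(self) -> None:
        """Simulate loss of the primary and a restart on the replication server."""
        self.lock_table = {}                       # restarted server: empty lock table
        self.lock_table.update(self.shadow_table)  # refill it from the shadow table


if __name__ == "__main__":
    enqueue_server = EnqueueWithReplicaSketch()
    enqueue_server.enqueue("MATERIAL 4711", "alice")
    enqueue_server.fail_over()
    # The lock survived the failover, so a second user is still rejected.
    print(enqueue_server.enqueue("MATERIAL 4711", "bob"))
```

Because the locks survive the restart, the application servers do not have to reset their pending transactions, which is the benefit that the replication is meant to provide.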
The following figure shows the configuration of several SAP application servers and a
central instance in combination with a SAP Replicated Enqueue:
A SAP Replicated Enqueue should be combined with a high availability cluster solution.
One reason for this is to enable the administrator to switch the Message server from one
server to another in case of a severe failure. Another reason for this setup is that the
SAP Replicated Enqueue is not a fully functional Enqueue server. Instead, it is only used
to replicate the lock table. During regular operation, the SAP Replicated Enqueue only
inserts lock requests into a standby lock table on a second server. If the original
server fails, the regular Enqueue server needs to fail over to this server and resume work
with this replicated lock table. High availability cluster solutions are also
used to protect the database against hardware failures.
Since the Message and Enqueue servers in the SCS instance place very low resource
requirements on a server within a high availability cluster, it is possible to install
additional local application servers there. In this context, local means that they are not
managed by cluster management and are lost if the respective server hardware
fails.
In contrast to most high availability installations, the installation of a SAP Web
Dispatcher in a cluster is not supported by SAPINST. SAP note 834184 describes in detail
the steps to manually configure a WSFC for the SAP Web Dispatcher. SAP notes
can be downloaded from the SAP Service Marketplace at https://service.sap.com.
Note: Access to this Web site is available only to registered SAP customers and
partners and requires a user name and password.
Additional SAP Web Dispatcher administration information is available at:
http://help.sap.com
TREX is one example of a SAP solution that does not rely on a standard SAP AS, but is
run on special server architecture. TREX installations can also be implemented as
master/slave configurations spanning several physical servers.
Additional information about the distribution and implementation of a TREX engine is
available at http://service.sap.com/instguidesnw70.
Note: Access to this Web site is available only to registered SAP customers and
partners and requires a user name and password.
To access this information after logging on to the site, do the following:
From the left menu pane, click SAP NetWeaver.
Choose English under SAP NetWeaver 7.0 Library.
Open Technical Operations Manual for SAP NetWeaver.
Open Administration of Standalone Engines.
Follow the Search and Classification (TREX) link.
General information about TREX is available at http://help.sap.com.
SAP liveCache
SAP liveCache is a component of the Advanced Planning and Optimization (APO)
application that supports the SAP SCM solution: an application for supply chain
management in the mySAP suite. SAP liveCache is a memory resident database for
rapid access. The foundation of this technology is derived from the SAP MAXDB,
formerly known as SAP DB. In addition to this memory resident database, each APO
system has a normal database for the APO data and programs. In order to access data
objects in the liveCache rapidly during operation, those objects are loaded into the
liveCache at startup. A special logging mechanism writes savepoints to disk every
few minutes; these savepoints do not reflect the transactional state of the system.
APO systems consist of a SAP AS and a liveCache as a standalone engine. From the
perspective of high availability, there are two possible solutions to protect the liveCache:
• A failover cluster for the APO system and the liveCache. LiveCache is supported in
the WSFC as of SAP NetWeaver 7.0 SR1.
• A hot standby liveCache where the database log files are exported to a standby
server and constantly applied to a database in recovery mode. This log shipping
solution works with two independent servers that do not share common storage.
Additional information about the installation of SAP liveCache and cluster configurations
in WSFC is available at http://service.sap.com/instguidesnw70.
Note: Access to this Web site is available only to registered SAP customers and
partners and requires a user name and password.
In addition, see the following SAP note about the configuration of liveCache in WSFC at
https://service.sap.com/notes.
Note: Access to this Web site is available only to registered SAP customers and
partners and requires a user name and password.
SAP note 780795: "SAP liveCache 7.5: WSFC Installation"
General information about the administration of the SAP liveCache is available at
http://help.sap.com.
To access this information after logging on to the site, do the following:
From the left menu pane, click SAP NetWeaver.
Choose English under SAP NetWeaver 7.0 Library.
Open Technical Operations Manual for SAP NetWeaver.
Open Administration of Standalone Engines.
Follow the SAP liveCache Technology link.
Securing a SAP application against interruption due to the loss of hardware resources
generally requires applying several techniques. Applying these techniques will lead to
better application protection.
SAN infrastructures
A Storage Area Network (SAN) provides a centralized approach to maintaining the
storage resources needed in a computer system. Traditionally, Direct Attached Storage
(DAS) has been used for the computer system local storage requirements. The use of
DAS has high space requirements and administration costs. By centralizing data storage
into a scalable, network type architecture, administrative costs are lowered, and space is
managed more efficiently. SANs can be built on Fibre Channel connections that use fiber
optic cables and carry the SCSI protocol for block-oriented data transfer.
In addition, iSCSI devices are now available that use normal TCP/IP networks for the
transport. In this case, the SCSI protocol for the data transfer is encapsulated in TCP/IP.
From the high availability perspective, the use of SAN infrastructures in data centers is
recommended. Redundancy of physical disks and protection against the single disk
failure is maintained in the storage subsystem itself and follows the hierarchical
approach shown at the beginning of this section. Depending on the vendor and the type
of storage subsystems in a SAN, even data replication over larger distances can be
achieved with SAN-based storage.
Additional information on the concepts of the SAN infrastructure for highly available
Windows systems can be found in the Server Cluster: Storage Area Networks white
paper by searching for the title at http://technet.microsoft.com/en-us/default.aspx.
Multipathing
Data storage protection against unplanned downtime always includes connection
protection between a server and its storage. If there is only one storage subsystem host
adapter and subsequently only one storage cable connection, any host adapter, cable,
or controller failure in the storage array would create an application interruption. The use
of a WSFC could help protect the server components such as the host adapter.
However, it is preferable to avoid connection failovers in a cluster. These failover types
can be avoided by using a redundant host adapter and two cable connections to a
storage device that in turn has two storage controllers. This configuration is called
multipathing.
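The benefit of a second path can be illustrated with a short simulation. This is a conceptual sketch only and does not describe how a multipath driver works internally; the names StoragePath, PathFailedError, and read_with_failover, as well as the path labels, are hypothetical.

```python
class PathFailedError(Exception):
    """Raised when an I/O request cannot be served over a path."""


class StoragePath:
    """One physical route: host adapter, cable, and storage controller."""

    def __init__(self, name, healthy=True):
        self.name = name
        self.healthy = healthy

    def read(self, block):
        if not self.healthy:
            raise PathFailedError(self.name)
        return f"block {block} via {self.name}"


def read_with_failover(paths, block):
    """Try each path in order and return the first successful result."""
    for path in paths:
        try:
            return path.read(block)
        except PathFailedError:
            continue  # this path is down, try the next one
    raise PathFailedError("all paths to the storage device failed")


paths = [StoragePath("hba0-controllerA"), StoragePath("hba1-controllerB")]
paths[0].healthy = False  # simulate a failed host adapter or cable
result = read_with_failover(paths, 42)
print(result)
```

With a single path, the same failure would surface as an error to the application; with two paths, the request is served over the remaining path and no cluster failover is needed.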
The Windows operating system supports multipathing through the MPIO driver.
Additional information for MPIO configurations is found in the Multipathing and the
Microsoft MPIO Driver Architecture white paper at:
http://download.microsoft.com/download/3/0/4/304083f1-11e7-44d9-92b9-
2f3cdbf01048/mpio.doc
Server protection
Servers host the individual components such as the SAP instances and services that
compose a SAP system. The server role and importance depends on its function. A
database server for a productive SAP system typically has the highest requirements in
availability, stability, and performance. While SAP specific solutions are discussed later
in this paper, there are a number of general server recommendations that incorporate
high availability.
With high availability, redundancy is the method to protect servers against downtime.
Inside of a server this could mean that the server has two independent power supplies
with two power cords. Of course, each power cord needs to supply enough energy to
sustain the operation in case the other one fails. It might also include redundant host
adapters for storage or network access. Finally, a conceptually well designed system
with hot pluggable components is always valuable.
However, there are server components that cannot be easily configured to be redundant.
Main memory or CPUs are examples of these critical components as well as the server
operating system that also exists only once. There are two solutions that are typically
used to address this. One solution would be to use fault tolerant systems built to recover
from memory or CPU hardware failures. However, the disadvantage to this solution is
the limited performance range and higher prices.
The second solution for protecting servers against failures is high availability clusters like
the WSFC. With WSFC, two or more servers share storage subsystem access and can
take over the storage volumes and restart applications automatically in case a server
fails. This concept even maintains redundancy at the operating system level as each
server has its own operating system. However, clusters depend on additional software
components and need a proper configuration and a change management policy. We will
discuss the possible cluster implementations with SAP applications later in this section.
Network high availability
Networks are the backbone of all corporate communication, both internally and
externally. The SAP application network implementation has multiple communication
layers based on different functionalities including:
• A server network that interconnects SAP application servers and the database
server.
• A client network for local users using the SAP GUI or a browser.
• A demilitarized zone for connection to the public Internet.
• A provider for access to the public Internet.
Again, component redundancy is the key factor for high availability solutions. However,
the architecture of a real implementation reflects additional considerations. For example,
public access to the Internet immediately raises security concerns and has more
requirements than the internal and isolated server network. In the server network, by
contrast, performance might be a concern in addition to availability. The following figure shows the
different SAP network aspects.
A description of the SAP landscape and SAP system network requirements can be found
at http://sdn.sap.com.
Each of the cluster nodes has its own local operating system with the SQL Server engine
installed locally. Each node must be capable of accessing the external storage
subsystem where the application components are installed. Supported storage systems
include Serial Attached SCSI (SAS), Fibre Channel, or iSCSI-based systems.
Every WSFC cluster needs to maintain a copy of the cluster database that contains
cluster configuration information. This information determines which cluster node can
take ownership of the cluster resource group for the SAP application and database in
case the communication between the nodes is interrupted. When two servers compete
for the cluster resource group, this is known as Split-Brain syndrome and can generate a
deadlock.
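A common way to avoid the split-brain situation is majority voting. The following sketch assumes that every node carries one vote and that an optional disk witness carries one additional vote, as in the Node and Disk Majority quorum mode; the function name has_quorum and its parameters are hypothetical.

```python
def has_quorum(reachable_nodes, total_nodes, has_witness=False, witness_reachable=False):
    """Return True if a partition holds more than half of all configured votes."""
    total_votes = total_nodes + (1 if has_witness else 0)
    votes = reachable_nodes + (1 if has_witness and witness_reachable else 0)
    return votes > total_votes // 2


# Two-node cluster with a disk witness, heartbeat network interrupted:
# the node that owns the witness keeps quorum, the other node does not.
node_with_witness = has_quorum(1, 2, has_witness=True, witness_reachable=True)
node_without_witness = has_quorum(1, 2, has_witness=True, witness_reachable=False)
print(node_with_witness, node_without_witness)
```

Because only one partition can hold a majority, only one side is allowed to take ownership of the cluster resource group.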
In the simple cluster configuration shown in the previous figure, the cluster database is
stored on each node. If the cluster uses a Disk Witness, the Disk Witness will also store
a copy. Applications that are protected in a WSFC cluster are configured in cluster
groups. A cluster group contains the application resources like the shared disk storage
volume that contains the SAP installation file system. In the case that a cluster group
needs to be transferred from one server to another, such as during a hardware failure,
these resources must become available on the second server before the cluster service
can start the application there. Cluster resources can be configured to handle
dependencies on other resources. For example, it makes no sense to start a SAP SCS
instance before the SAP system database is available.
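The effect of resource dependencies on the start order can be sketched with a topological sort. The resource names and the dependency map below are hypothetical and only illustrate the principle; the actual resources and dependencies are defined during the SAP cluster installation.

```python
from graphlib import TopologicalSorter

# Each resource maps to the set of resources that must be online before it starts.
dependencies = {
    "shared_disk": set(),
    "virtual_ip": set(),
    "network_name": {"virtual_ip"},
    "database": {"shared_disk", "network_name"},
    "sap_scs_instance": {"shared_disk", "network_name", "database"},
}

# static_order() yields every resource after all of its dependencies.
start_order = list(TopologicalSorter(dependencies).static_order())
print(start_order)
```

In any order produced this way, the shared disk and the database appear before the SCS instance, which reflects the example in the text.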
For the exchange of status information between the members of the WSFC cluster, a
private network is required. Since the status information that is periodically sent out is
similar to a heartbeat, the network is called the cluster heartbeat network.
Every SAP application network connection in the cluster is assigned a virtual IP address
that is activated on a server by the cluster service when starting the SAP cluster group.
While the virtual IP addresses are activated only on the server that runs the application,
all network cards also have configured local IP addresses that are permanently
assigned.
Additional information about Windows Server 2008 R2 failover clusters can be found at:
http://www.microsoft.com/windowsserver2008/en/us/clustering-home.aspx
The installation of a SAP system in WSFC cluster solutions is described in the SAP
Installation Guide for the respective SAP NetWeaver release at:
https://service.sap.com/instguides
A user name and password are required to access this Web site. To access this
information after logging on to the site, do the following:
From the left menu, open SAP NetWeaver.
Select SAP NetWeaver 7.0 (2004s).
Select Installation.
Select Installation Guide - SAP NetWeaver 7.0 SR3 or Installation Guide -
SAP NetWeaver 7.0 SR2.
Select Windows and the installation type (ABAP, ABAP + Java, or Java).
There are a number of SAP notes that provide additional information about WSFC
related issues at https://service.sap.com/notes.
Note: Access to this Web site is available only to registered SAP customers and
partners and requires a user name and password.
The description of a local SAP application server installation inside of a WSFC cluster is
the same as the standard SAP application server description. The SAP installation
guides for NetWeaver 7.0 describe the setup starting with the version SR2. These
guides are available at https://service.sap.com/instguides.
Note: Access to this Web site is available only to registered SAP customers and
partners and requires a user name and password.
To access this information after logging on to the site, do the following:
From the left menu, open SAP NetWeaver.
Select SAP NetWeaver 7.0 (2004s).
Select Installation.
Select Installation Guide - SAP NetWeaver 7.0 SR3 or Installation Guide -
SAP NetWeaver 7.0 SR2.
Select Windows and the installation type.
An overview of the supported WSFC configurations is available from SAP in the
MSCS Configuration and Support Information for SAP NetWeaver ’04 and the SAP
NetWeaver 7.0 Systems white paper at http://sdn.sap.com/irj/sdn/windows.
The regular Enqueue service and its lock table are on the server from which the SCS
instance was started. The second server in the cluster has a SAP Replicated Enqueue in
addition to the active database. Additional application servers are located on servers that
are not in a cluster formation. All lock requests from the active Enqueue servers will be
mirrored onto the Replicated Enqueue.
In case of a severe SCS server hardware problem, the SCS instance will be transferred
to the database server and started from there. During this process, the SAP Replicated
Enqueue is stopped and the lock information from the mirrored lock table is copied into
the new lock table of the regular Enqueue server. Therefore, the SAP application servers
outside the cluster do not lose any lock information and their running transactions are
not affected.
The SAP Installation Guides for SAP NetWeaver 7.0 SR2 and SR3 describe the cluster
setup for the Enqueue Replication and the Enqueue Replication server installation.
SAP note 524816 gives detailed information about the SAP Standalone Enqueue. SAP
note 804078 describes the concept of the SAP Replicated Enqueue and how it can be
used to protect a SAP system. Attached to this note is also an installation guide for the
Enqueue Replication server in a WSFC cluster. In addition, the SAP lock concept and
high availability solutions are described at http://help.sap.com.
Multi-SID cluster
A limitation of older Windows-based cluster configurations was that only one SAP
system per cluster could be configured. The reason for this restriction was
the SAPMNT share. Any access to the SAP system global files in a distributed
installation has to use this share. Since the share is configured on the <Drive>:\usr\sap
directory, there is only one unique location in the file system.
Underneath this path, there is a <SID> directory that hosts all the data for a specific SAP
system. The consequence of this structure is that if there is more than one SAP system
installed on the server, the share would contain the global data for all SAP systems.
Since this share has to be relocated to another server in the WSFC cluster in case of a
failover, that operation would impact all SAP systems. Because of this, SAP does not
support this configuration.
This restriction is removed by a new SAP
installation method. With this method, the SAP system disks are linked with the <SID>
directory under <Drive>:\usr\sap by using junctions. Junctions are similar to symbolic
links in the various UNIX versions. They are a file system redirection that automatically
transfers access to a designated directory to another directory. For example, the
following figure shows the basic setup of a WSFC cluster with three SAP systems
with AAA, BBB, and CCC designations. Each SAP system has its own hard disk that can
be accessed on shared drives from both the servers in the cluster. SAP system AAA and
system BBB run on Server A and system CCC runs on Server B.
The SAPMNT share was previously configured as a cluster resource inside the cluster
configuration. Now it is in the local operating system of the respective server. The share
is stationary and is no longer managed through the cluster. Under the C:\usr\sap
directory path are three directories: AAA, BBB, and CCC. These directories have been
created in both servers.
Depending on system type, the directories in the following table are created on the
shared drives:
SAP system type       Shared drive directory
All system variants   \usr\sap\<SID>\SYS
Java                  \usr\sap\<SID>\SCS<InstanceNr>
ABAP                  \usr\sap\<SID>\ASCS<InstanceNr>
ABAP + Java add-in    \usr\sap\<SID>\SCS<InstanceNrJava>
                      \usr\sap\<SID>\ASCS<InstanceNrABAP>
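The mapping in the table above can be expressed as a small helper that lists the shared drive directories for a given system. This is a sketch for illustration; the function name shared_directories, the system type keywords, and the instance numbers used in the example are hypothetical.

```python
def shared_directories(sid, system_type, abap_nr=None, java_nr=None):
    """Return the shared drive directories for one SAP system, following the table."""
    if system_type not in ("java", "abap", "abap+java"):
        raise ValueError(f"unknown system type: {system_type}")
    base = "\\usr\\sap\\" + sid
    directories = [base + "\\SYS"]  # created for all system variants
    if system_type in ("java", "abap+java"):
        if java_nr is None:
            raise ValueError("java_nr is required for Java systems")
        directories.append(base + "\\SCS" + java_nr)
    if system_type in ("abap", "abap+java"):
        if abap_nr is None:
            raise ValueError("abap_nr is required for ABAP systems")
        directories.append(base + "\\ASCS" + abap_nr)
    return directories


# Example with assumed instance numbers 00 (ABAP) and 01 (Java).
for directory in shared_directories("AAA", "abap+java", abap_nr="00", java_nr="01"):
    print(directory)
```

Such a list can serve as a checklist when preparing the shared disks for each SID before the junctions are created.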
Next, all the junctions are created from the local hard drive of every server. Junctions
can be created with the linkd.exe executable from Microsoft, which is part
of the Microsoft Windows Resource Kit. The syntax for the commands is:
Depending on the system type, the arguments can be accessed from the following table:
As seen in the following figure, after the installation of the sample cluster, the cluster
groups AAA and BBB were activated on server A, and CCC on server B. All file system
accesses of the SAP instances are redirected to the respective shared disk. The
external access takes place as usual in the cluster through the cluster group virtual IP
address.
With this configuration, if Server A crashes due to a hardware failure, two things will
happen. First, the shared disks of both applications AAA and BBB will be activated on
server B. Next, the virtual IP addresses of cluster groups AAA and BBB will be activated on
server B. Because the junctions on the local hard drive of each cluster server point to
the shared disks, a client is able to resume its work as usual and can resolve all data.
Clients that were already working with the SAP application CCC on server B
are not affected. The following figure shows the situation after the failover:
Using the junction configuration, all the SCS instances of a larger SAP landscape can
now be configured as one cluster. In general, Multi-SID clusters can also protect the
database instances. Because of the varying resource requirements of databases
compared to a SAP SCS instance, the sizing could be more difficult. Therefore, a better
design would be to place the databases and the SCS instances on two different clusters.
The following figure shows this system structure:
Figure 29. Separate database and SCS instance clusters for simplified sizing
The database servers would have an additional SAP Standalone Gateway configured.
This is required as a local service for administration. Finally, each of the database
servers would also get their own SAPOSCOL service installed for performance
monitoring.
Multi-SID clusters demand a different approach during a cluster installation, but require
no changes in the application operation. At a minimum, SAP SCS instances must be
used. For a pure Java system, this is already possible in version 6.40.
With ABAP or dual-stack systems, version 7.0 must be employed.
The installation of multi-SID cluster solutions will be described in a separate installation
guide for the NetWeaver 7.0 SR3 release. SAP note 106275 describes how SAP
supports a multi-SID cluster for the SAP AS 7.0.
The Multi-SID WSFC Installation for SAP NetWeaver 7.0 compact disc master is
available at http://service.sap.com/swdc.
Note: Access to this Web site is available only to registered SAP customers and
partners and requires a user name and password.
Multi-node cluster
In addition to the previous limitation of only one SAP system per cluster, there
was also a restriction on the number of cluster member nodes supported for SAP
clusters. While Windows Server 2003 could support up to eight servers and Windows
Server 2008 R2 could support up to 16 servers in a WSFC cluster, SAP only supported
two-node clusters before SAP NetWeaver 7.0 SR2. Because this limitation no longer
exists as of NetWeaver 7.0 SR2, multi-node clusters are now possible. However, if
Replicated Enqueue is used, SCS must still be configured to run on two nodes.
The following figure shows a cluster with three servers. Two of the servers actively run a
SCS instance and the SAP system database while the third server is a backup in case
an error occurs on either of the first two servers. With proper sizing of the main memory
and the CPUs in the middle server, it is possible for both SAP instances to run on the
middle server at the same time.
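The sizing consideration for the shared backup server can be sketched as a simple capacity check. All figures and names below are hypothetical and serve only to illustrate the idea; real sizing has to follow the SAP sizing guidelines for the systems involved.

```python
def can_host(node, workloads):
    """Return True if the node has enough memory and CPU cores for all workloads."""
    memory_needed = sum(w["memory_gb"] for w in workloads)
    cpu_needed = sum(w["cpu_cores"] for w in workloads)
    return memory_needed <= node["memory_gb"] and cpu_needed <= node["cpu_cores"]


# Assumed capacity of the backup server and assumed demand of the two active servers.
backup_node = {"memory_gb": 64, "cpu_cores": 16}
system_1 = {"name": "SCS and database, server 1", "memory_gb": 24, "cpu_cores": 6}
system_2 = {"name": "SCS and database, server 2", "memory_gb": 28, "cpu_cores": 8}

both_fit = can_host(backup_node, [system_1, system_2])
print(both_fit)
```

If the check fails, the backup server can still take over one of the two servers, but a double failure would leave one workload without a suitable host.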
Multi-Node clusters are supported with SAP NetWeaver 7.0 SR2 using the SAP
installation tool, SAPINST. The installation of additional nodes in a WSFC cluster is
described in the SAP Installation Guide.
IT infrastructure protection
Applications always have a direct relationship to the servers in a data center. These
servers provide the necessary resources such as CPU, memory, or disk storage. At the
same time, applications and server operating systems are consumers of
central IT services. These services include:
• Centralized backup processes
• File and print services
• Active Directory®
• Deployment services
• Patch server
• Network services such as DNS or DHCP
Not only does the server that the application is running on need to be protected against
interruptions, but all data center services and resources that are significant to application
operations must be protected as well. This fact is especially important because the data
center central services serve all applications and could cause an interruption on a larger
scale than the failure of a single AS.
For example, after a DNS service interruption, names can no longer be resolved to IP
addresses anywhere in the data center. The following list contains some critical services
that might require protection:
• DNS
• DHCP
• WINS
• NFS server
• Fileserver with SMB/CIFS
• Print server
• Authentication
• Time synchronization
• Backup functions
• Central monitoring service
Because so many critical services need protection, a detailed discussion of these services
is beyond the scope of this white paper. Additional documentation is available at
http://technet.microsoft.com.
If third party solutions are being used, the high availability discussion should incorporate
the vendor perspective as well.
• Operating hours: 8:00 to 20:00 CST, 5 days per week, Monday through Friday.
• System availability:
22 hours, 7 days a week
99.5 percent annual availability during the defined operating hours
In the above example, the IT department would have a maintenance window of two
hours per day. They would need to take additional measures to ensure that
unplanned downtime does not exceed 0.5 percent of the uptime, or 22 hours.
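The downtime budget can be checked with simple arithmetic. The sketch below assumes 12 operating hours per day (8:00 to 20:00); counting these hours on all 365 days reproduces a value of roughly 22 hours, while counting only the five weekdays listed in the operating hours gives a smaller budget. The function name downtime_budget_hours is hypothetical.

```python
def downtime_budget_hours(hours_per_day, days_per_year, availability):
    """Return the allowed unplanned downtime per year in hours."""
    return hours_per_day * days_per_year * (1 - availability)


every_day = downtime_budget_hours(12, 365, 0.995)         # 12 h on 365 days
weekdays_only = downtime_budget_hours(12, 5 * 52, 0.995)  # 12 h on 260 weekdays

print(round(every_day, 1), round(weekdays_only, 1))
```

Under these assumptions the two results are about 21.9 and 15.6 hours per year, so the basis on which availability is measured should be stated explicitly in the service level agreement.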
Change management strategy deployment
Having a limited time for maintenance requires the IT staff to get the most out of the
available time. Typically, the workflow during any maintenance action involves making a
backup copy of the existing state, performing the required work, and testing thoroughly
before the system is returned to production. Proper preparation is one of the key factors
for success. Changes must first be tested on test systems in order to identify any side
effects. This process also yields information about the time
required to perform the task. An additional benefit is that the IT staff learns about
the required steps while working with the test systems. This also helps to minimize the
downtime when the same work has to be done on the productive server.
Another important task is planning ahead to have enough resources like disk storage or
main memory for the future growth of the SAP system. By providing enough resources,
the SAP system stability and quality of service improve and frequent shut downs for the
installation of additional components can be avoided. In productive SAP systems, a
common strategy is to oversize the hardware resources at the start of
productive use and maintain enough headroom for at least six months of growth. Any
further extension should also reflect this principle. Besides adding resources, there are
also strategies such as archiving that can be incorporated to minimize the storage
requirements of a SAP database.
Planning the operating system and application software maintenance is another
operational aspect. It is essential to know the software vulnerabilities and install fixes in a
timely manner. Typically, fixes need to be coordinated and installed
sequentially. For example, test and QA systems are updated first to work out the
installation issues. The production systems are updated only after the issues are
resolved. The number of security vulnerabilities in a system can be minimized by a
process called hardening. Hardening a SAP system means configuring it with
only the minimum platform functions that are necessary for operating the system.
Additional information about IT landscape hardening can be found by searching for the
SAP Hardening and Patch Management Guide for Windows Server white paper at:
https://www.sdn.sap.com/irj/sdn
Snapshot backup
Backing up a large database might take a long time. The primary issue when creating a
backup is that it must be transactionally consistent in order to be usable for a potential
restore. Transactional consistency means that all transactions are either finished or not
contained in the backup. SQL Server database backups are created by using the BACKUP
DATABASE command. This command first executes a checkpoint, which means that all
pages that have been changed since the last checkpoint and still reside in memory are
flushed from the database server main memory to the storage subsystem. After this
operation is complete, the database files are backed up by copying the data to another
disk or a tape device. To maintain the transactional consistency, the transaction log file
is also copied during this process. The transaction log is used to roll back or undo
transactions that were not finished at the time the backup was made.
Despite the fact that SQL Server backups are online, the backups produce an additional
load for the storage subsystem. Therefore, one usually tries to minimize the time of an
online backup. In order to minimize the time a backup will impact normal system
operation, it is possible to use the snapshot feature of SQL Server 2005/2008 R2 to
create the backup. Snapshot backups reduce unavailability of the SQL Server
2005/2008 R2 database during a backup to a couple of seconds. This is especially
useful for moderate to very large databases where availability is very important.
SQL Server snapshot backup is accomplished in cooperation with third party hardware
or software vendors, or both. These vendors use SQL Server 2005/2008 R2 features
that are designed for this purpose. The underlying backup technology creates a point-in-
time copy of the database image that is being backed up. The instantaneous copying is
typically accomplished by splitting a mirrored set of disks or by creating a copy of a disk
block when it is written. This preserves the original. At restore time, the original is made
available immediately and the synchronization of the underlying disks occurs in the
background if necessary. This restores operations almost instantaneously.
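The copy-on-write variant mentioned above can be sketched in a few lines. This is a conceptual model only and not the implementation of any storage vendor or of SQL Server; the class name Volume and its methods are hypothetical.

```python
class Volume:
    """A block device with a single copy-on-write snapshot."""

    def __init__(self, blocks):
        self.blocks = list(blocks)
        self.snapshot = None  # maps block index to the original content

    def create_snapshot(self):
        self.snapshot = {}  # instantaneous: no data is copied at this point

    def write(self, index, data):
        # Preserve the original block the first time it is overwritten.
        if self.snapshot is not None and index not in self.snapshot:
            self.snapshot[index] = self.blocks[index]
        self.blocks[index] = data

    def read_snapshot(self):
        """Return the point-in-time image: preserved originals plus unchanged blocks."""
        if self.snapshot is None:
            raise RuntimeError("no snapshot has been created")
        return [self.snapshot.get(i, block) for i, block in enumerate(self.blocks)]

    def restore(self):
        """Copy the preserved originals back over the changed blocks."""
        if self.snapshot is None:
            raise RuntimeError("no snapshot has been created")
        for index, original in self.snapshot.items():
            self.blocks[index] = original
        self.snapshot = {}


volume = Volume(["a", "b", "c"])
volume.create_snapshot()
volume.write(1, "B")
image = volume.read_snapshot()   # still the state at snapshot time
current = list(volume.blocks)    # contains the new write
volume.restore()
print(image, current, volume.blocks)
```

Creating the snapshot costs almost nothing because only blocks that change afterward are copied, which is why the impact on the running database is limited to a very short interval.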
The following figure shows an example of snapshot technology with NetApp FAS storage
system and the NetApp SnapManager for Microsoft SQL Server and SnapDrive for
Windows solution. In this example, the time required for backups and restore can be
reduced to seconds by using SnapManager.
Detailed information about the SQL Server 2005/2008 R2 Snapshot Backup feature is
available at http://msdn.microsoft.com/en-us/library/ms189548.aspx.
the specified amount of time has expired. With this configuration, the instance is made
idle quickly and can be shut down.
All the remaining SAP AS instances in a logon group must be able to handle the
workload sufficiently. In larger systems, it might be appropriate to prepare one universal
AS instance that can join several groups. This can be achieved by installing several SAP
AS instances on a server and starting the respective instance when needed. Such a
standby AS could be used temporarily to maintain the transactional performance of a SAP
system. The following figure shows the setup of this landscape:
There are still the central elements of the SAP system including the servers for the SCS
instance and database. If both components are in a WSFC cluster, a server can be
isolated using a planned failover to the respective standby server. This switch can
actually happen at a convenient time with little effort. The empty server can
subsequently be patched and restored to operation.
SQL Server instance maintenance
Expanding on the previous maintenance concept by using SQL Server Database
Mirroring adds the option to patch the database engine installed on a server while the
SAP system continues to work. The basic principle is to switch the database to the mirror
copy when a patch needs to be installed on the database engine of the original server.
After successful installation, the database is switched back and the same process is
repeated on the mirror side. See the following figure for an example of this:
More information on SQL Server database mirroring can be found in the Disaster
Recovery Solutions section.
An example of a patch cycle for Windows and SQL Server patches by using the above
concept could look like:
Patch schedule:
• Windows and SQL Server patches applied monthly (if applicable)
• SAP service packs applied during quarterly release
Patch sequence:
• Patch first in sandbox and test. Only place in production after a few days of
successful testing.
Patch process:
• If no reboot is required, apply the patches and the patch process is finished.
• If reboot is required, perform the following steps on each AS:
• Isolate Dialog/Batch server from:
Logon group
RFC group
Update group
Batch group
Spool (or have redundant spool server)
• Drain connections, patch, reboot, then add back into the respective group
and proceed to next server.
• If required, add the temporary AS to the respective group.
Perform the following steps on the mirrored database servers:
• Suspend mirroring, patch and reboot the secondary server, resynchronize, fail over to
the secondary, and then patch and reboot the primary server. A short SAP downtime
during the failover has to be planned.
• There is no need to fail back.
Perform the following steps on the SAP central instance server:
• Relocate the SAP central instance in the WSFC cluster to the database
server.
• Patch and reboot inactive node.
• Fail over the database and CI, then patch and reboot the other node. A short SAP
downtime during the failover has to be planned.
• Distribute database and central instance on the two nodes as before for
better performance.
More information on SAP upgrades can be found on SDN by searching for upgrade at:
https://www.sdn.sap.com/irj/sdn
The SAP Service Marketplace has the following related notes available at:
https://service.sap.com/notes
Note: Access to this Web site is available only to registered SAP customers and
partners and requires a user name and password.
SAP note 139513: "Merge transports for high availability systems"
SAP note 361735: "Inactive import of reports"
The Hyper-V host cluster can perform a Live Migration of the VMs without application
service downtime. The following example describes this process in more detail.
Consider a Hyper-V host cluster that is configured for Live Migration and hosts a VM
running an SAP application that is actively used by clients. At some point, the
administrator must migrate this VM to another server in the Hyper-V host cluster because
the server that the VM resides on must go into maintenance.
Initially, while the VM is still actively used on the primary server, an empty VM is created
on the second server and the memory image of this VM is copied to the second server. If
the memory pages on the primary server are changed during this process, Hyper-V
detects this and copies those pages again. Eventually, the number of pages that are
different between the two servers is significantly reduced.
When the difference is small enough, Hyper-V pauses the VM on the primary server and
copies the last set of changed pages to the new server. Subsequently, the client access
is re-routed to the new server and the VM on the primary server can be deleted. Since
the final state transfer happens very quickly and no TCP timeout occurs, the client does
not notice this transfer.
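The iterative copy process described above can be illustrated with a small simulation. The following Python sketch is not Hyper-V code. The function name, the parameters, and the assumption that a fixed percentage of the pages copied in one round is changed again during that round are all hypothetical and only serve to show why the set of differing pages shrinks from round to round.

```python
def simulate_precopy(total_pages, dirty_percent, pause_threshold, max_rounds=20):
    """Simulate the iterative memory copy of a live migration.

    total_pages:     number of memory pages of the VM (hypothetical value)
    dirty_percent:   assumed share (0-100) of the pages copied in a round that
                     the running VM changes again while the round is in progress
    pause_threshold: the VM is paused once this few pages still differ
    max_rounds:      safety cap in case the pages change too quickly to converge
    """
    remaining = total_pages          # the first round copies the whole memory image
    rounds = []
    while remaining > pause_threshold and len(rounds) < max_rounds:
        rounds.append(remaining)     # pages copied while the VM keeps running
        # Pages changed during this round have to be copied again.
        remaining = remaining * dirty_percent // 100
    # 'remaining' is the last set of changed pages, copied while the VM is paused.
    return rounds, remaining


rounds, final_pages = simulate_precopy(1000, 10, 5)
print(rounds, final_pages)
```

With these example values, each round has to copy only a tenth of the pages of the previous round, so the last copy that is made while the VM is paused is very small. This is the reason the final state transfer can complete before a TCP timeout occurs.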
Note: Live Migration does not work for unplanned downtime.
In the case of a server failure, the VMs will fail over using failover clusters. The Live
Migration process must be planned and requires an active primary system for the
duration of the migration.
Since the Hyper-V host cluster and Live Migration use the same setup, this solution is an
extension of the high availability solution with WSFC. Live Migration provides the
capabilities for minimizing planned downtime in a virtual environment. These capabilities
are not available for applications that must be installed directly on the physical server.
More details on how to set up a Live Migration cluster are available in the following
documents:
Windows Server® 2008 R2 Hyper-V™ Live Migration white paper available at:
http://download.microsoft.com/download/C/C/7/CC7A50C9-7C10-4D70-A427-C572DA006E61/LiveMigrationWhitepaper.xps.
Best Practices for SAP on Hyper-V white paper available at:
http://www.microsoft.com/virtualization/en/us/solution-business-apps.aspx.
Hyper-V: Live Migration Network Configuration Guide available at:
http://technet.microsoft.com/en-us/library/ff428137(WS.10).aspx.
Security measures that minimize the risk of data loss or espionage of confidential
information are essential. These measures include the implementation of firewalls,
virus scanners, and surveillance tools, as well as employee policies. Optimal security
requires that appropriate measures be taken on all levels of the IT operation including:
• Virus scanners on the computer level.
• Demilitarized zone for outbound communication.
• Firewalls at the network level.
• Well-developed authentication and audit procedures at the application level.
Security measures also include operational tasks, such as the timely installation of
security patches to close any possible gaps immediately after such vulnerabilities have
been published. A detailed discussion of the threats, as well as of possible concepts and
measures, is outside the scope of this white paper. The Microsoft TechNet library
provides detailed information about Microsoft products and technology for IT
professionals. The Microsoft TechNet library can be found at:
http://technet.microsoft.com/en-us/default.aspx
Appropriate backups can be used to restore individual files in case a single file has been
deleted. However, these backups can also help to recover a complete system if a severe
failure happens. Even if a disaster destroys the original computer
systems, backups can be applied to a second computer to restore operation. A
backup and restore strategy is the last step in a recovery from an unforeseen event and
returns a database to some predefined point, most likely the last completed
transaction prior to the failure.
All aspects of the backup and restore strategy should be well documented and reviewed
regularly. Most importantly, they should be tested regularly to ensure that the data and
the media for backups are valid and that the processes work as expected.
Database backup strategies
The backup and restore components provided with Microsoft SQL Server 2000 and later
enable the administrator to restore a database as an exact replica of the original
database at any point in the database history after an appropriate backup
strategy was implemented. There are several backup types available:
• Full backup: Makes a complete backup of the database, up to the last transaction
completed during the backup process.
• Differential backup: Makes a copy of the database pages changed since the last
full backup. It is a useful backup mechanism to back up a database without
consuming as many resources as a full backup. In a restore operation, this is used in
conjunction with a full backup.
• Transaction log backup: Makes a backup of all the completed transactions that
have taken place since the log was last backed up. A transaction log backup is used
in conjunction with a full backup, and potentially differential backups, to enable an
administrator to restore a database to a specific point in time or to the last completed
transaction that was backed up.
• File backup: When a database consists of multiple files, each file can be backed up
individually. This provides an accelerated backup process as well as a faster restore
process. File backups are used in conjunction with transaction log backups.
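How the full, differential, and transaction log backups work together in a restore can be sketched as follows. This Python example is an illustration only and does not call any SQL Server API. The Backup class, the restore_chain function, and the use of plain integers as timestamps are assumptions made for this sketch. It selects the latest full backup before the target time, the latest differential backup based on it, and then every later log backup up to and including the first one that covers the target time.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Backup:
    kind: str       # "full", "diff", or "log"
    finished: int   # completion time, e.g. minutes since an arbitrary start


def restore_chain(backups, target_time):
    """Return the backups to restore, in order, to reach target_time."""
    ordered = sorted(backups, key=lambda b: b.finished)
    fulls = [b for b in ordered if b.kind == "full" and b.finished <= target_time]
    if not fulls:
        raise ValueError("no full backup available before the target time")
    base = fulls[-1]
    chain = [base]
    start = base.finished
    # Only the most recent differential backup after the full backup is needed.
    diffs = [b for b in ordered
             if b.kind == "diff" and base.finished < b.finished <= target_time]
    if diffs:
        chain.append(diffs[-1])
        start = diffs[-1].finished
    # Log backups are applied in sequence until the target time is covered.
    for b in ordered:
        if b.kind == "log" and b.finished > start:
            chain.append(b)
            if b.finished >= target_time:
                break
    return chain


history = [Backup("full", 0), Backup("log", 60), Backup("diff", 120),
           Backup("log", 180), Backup("log", 240), Backup("log", 300)]
for b in restore_chain(history, target_time=200):
    print(b.kind, b.finished)
```

In this example, the log backup taken at minute 60 is skipped because the differential backup at minute 120 already contains its changes, and the log backup at minute 300 is not needed because the target time is already covered.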
Additional backup solution information is available in the following resources:
• Step-by-Step Guide for Windows Server Backup in Windows Server white paper
available at http://technet.microsoft.com/de-de/default.aspx.
Note: This document can be found by searching for the title.
• SAP Backup and Restore Information MS SQL Server help documentation available
at http://help.sap.com.
• Blog entry: How does Microsoft perform backups in their SAP system landscape
available at:
http://blogs.msdn.com/saponsqlserver/archive/2008/03/28/how-does-microsoft-perform-backups-in-their-sap-system-landscape.aspx
For more information about SQL Server log shipping, please refer to the Disaster
Recovery Solutions section.
Snapshots
A snapshot is an image of information that has been frozen at a certain point in time.
The snapshot delivers an accurate picture of the information at a precisely defined
point in time. Snapshots are typically taken from fast-changing data, such as a file system
or a database. Technically, snapshots typically use the copy-on-write principle. In this
principle, all data on a storage medium is represented as a set of data blocks. Access
to the data blocks is provided by pointers. Each block has an individual pointer that
describes where this block resides on the medium.
A snapshot first takes all the pointers at a certain point in time and saves them. Any time
a data block is changed, the data block is first copied into a snapshot file and the system
uses a new pointer for the changed block. By copying only the pointers to data initially
and copying data blocks only if changes occur, snapshots are very fast and require
relatively little disk space. However, copying changed blocks to the snapshot has some
performance impact.
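The copy-on-write principle can be illustrated with a minimal Python sketch. The Volume class and its method names are hypothetical and do not correspond to any storage vendor's implementation. As a simplification, the snapshot stores only the original content of blocks that were changed after the snapshot was taken, and all unchanged blocks are read from the live volume.

```python
class Volume:
    """Toy block device with copy-on-write snapshots (illustration only)."""

    def __init__(self, blocks):
        self.blocks = list(blocks)   # live data blocks
        self.snapshots = []          # one dict per snapshot

    def snapshot(self):
        # Taking a snapshot copies no data; it only starts tracking changes.
        snap = {}                    # block index -> preserved original content
        self.snapshots.append(snap)
        return snap

    def write(self, index, data):
        # Before a block is changed for the first time after a snapshot,
        # its original content is copied into that snapshot.
        for snap in self.snapshots:
            if index not in snap:
                snap[index] = self.blocks[index]
        self.blocks[index] = data

    def read_snapshot(self, snap, index):
        # Changed blocks come from the snapshot, unchanged ones from the volume.
        return snap.get(index, self.blocks[index])


volume = Volume(["a", "b", "c"])
snap = volume.snapshot()
volume.write(1, "B")
volume.write(1, "BB")
print(volume.blocks)
print([volume.read_snapshot(snap, i) for i in range(3)])
print(snap)
```

Only the first write to a block after the snapshot causes a copy, which is why the snapshot stays small. The extra copy on that first write is the performance impact mentioned above.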
Database snapshots with SQL Server 2005/2008 R2
With SQL Server 2005/2008 R2 Enterprise Edition, database snapshots can be created.
A database snapshot is a read-only, transactionally consistent view of a database.
Transactionally consistent means that only those transactions which have been finished
by a commit work statement are taken into the snapshot. Snapshots can be generated
automatically, at any point in time, and can also be used for read access during daily
operations, such as report generation. The snapshot copy of the database can be
queried by client applications and, in the event of the original database becoming
damaged or unusable, it can be reverted to the state it was in when the snapshot was
created.
Since every new snapshot requires storage space, it is recommended that older
snapshots are deleted after a certain time. The optimal retention time for
snapshots depends on the individual requirements, but a time interval of one or two
days is typically sufficient.
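A retention rule of this kind can be sketched in a few lines of Python. The function name, the snapshot names, and the way creation times are supplied are assumptions for this example. The sketch only selects the snapshots that have exceeded the retention period and leaves the actual deletion to the administrator's tooling.

```python
from datetime import datetime, timedelta


def expired_snapshots(snapshots, now, keep=timedelta(days=2)):
    """Return the names of snapshots older than 'keep', oldest first.

    snapshots: dict that maps a snapshot name to its creation time
    """
    cutoff = now - keep
    expired = [name for name, created in snapshots.items() if created < cutoff]
    return sorted(expired, key=lambda name: snapshots[name])


# Hypothetical snapshot names and creation times.
existing = {
    "PRD_snap_0607": datetime(2010, 6, 7, 12, 0),
    "PRD_snap_0609": datetime(2010, 6, 9, 0, 0),
    "PRD_snap_0608": datetime(2010, 6, 8, 11, 0),
}
print(expired_snapshots(existing, now=datetime(2010, 6, 10, 12, 0)))
```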
It is a good strategy to run the DBCC check without downtime, alongside the normal
system workload, during low-volume operation such as over the weekend. Depending on the
SQL instance configuration, the DBCC check command only requires one CPU core.
Therefore, with a modern multi-core server, there are still enough cores available to
maintain the SAP operation.
Hardware-based replication
With this method, the complete replication task is done at the storage level. The
following figure shows the basic setup:
The advantage of this solution is that it works completely independently of the
application. However, as the replication is performed by the storage controller, the SAN
storage devices have to be from the same vendor and there is a high bandwidth
requirement for the replication. Additionally, software components from the storage
vendors are required to enable WSFC to appropriately use this configuration. Examples
of storage-based replication providers are:
• EMC with SRDF
• HP Storage Works Business Copy EVA
• NetApp MetroCluster with SyncMirror
• IBM GDPS with PPRC
• Hitachi Storage Clusters
Note: For hardware or software-based replication solutions to work, they are required to
replicate SQL Server write I/Os in exactly the same order as originally issued by the
database.
Software-based replication
With this method, any change on the active side is copied over the network to the
secondary side and replicated there. This requires the use of a software product that is
not part of the initial cluster setup. While these software components increase the
implementation cost, the advantage is that different storage devices can be used. It is
even possible to have SAN storage on one side and a Direct Attached Storage (DAS) on
the other side. Examples of vendors for software-based replication products include:
• NSI Double-Take
• Legato RepliStor
• Symantec Storage Replicator
• SteelEye DataKeeper
• Neverfail ClusterProtector
The following figure shows the basic setup when using this method.
storage, called a disk witness, or a file share, called a file share witness, can vote. There
is also a quorum mode called No Majority: Disk Only that functions like the disk-based
quorum in Windows Server 2003. Aside from that mode, there is no single point of failure
with the quorum modes since what matters is the number of votes, not whether a
particular element is available to vote.
A comprehensive description of the quorum options available for Windows
Server 2008 R2 can be found at http://technet.microsoft.com/en-us/library/cc770620.aspx.
This ensures that a majority of the nodes have an up-to-date copy of the data. The
cluster service itself will only start up and therefore bring resources online if a majority of
the nodes configured as part of the cluster are up and running the cluster service. If
there are fewer nodes, the cluster will not have the quorum and therefore, the cluster
service waits to restart until more nodes join.
In the case of a failure or split-brain, all partitions that do not contain a majority of nodes
are terminated. As a result, a partition that contains a majority of the nodes can safely
start up any resources that are not yet running on it, because it is guaranteed to be
the only partition in the cluster that is running resources.
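The majority rule can be expressed in a few lines. The following Python sketch is a simplified model and not the cluster service's actual algorithm. The function names and node names are hypothetical. It shows that at most one partition can hold more than half of the configured nodes, and that an even split leaves no partition with quorum.

```python
def has_quorum(nodes_present, nodes_configured):
    """A partition has quorum if it holds more than half of all configured nodes."""
    return nodes_present > nodes_configured // 2


def surviving_partition(partitions, configured_nodes):
    """Return the one partition that may run resources, or None if there is none."""
    total = len(configured_nodes)
    for partition in partitions:
        if has_quorum(len(partition), total):
            return set(partition)
    return None


five_nodes = {"n1", "n2", "n3", "n4", "n5"}
# A network failure splits the cluster into two partitions.
print(surviving_partition([{"n1", "n2"}, {"n3", "n4", "n5"}], five_nodes))

four_nodes = {"n1", "n2", "n3", "n4"}
# An even split: neither side has a majority, so no partition runs resources.
print(surviving_partition([{"n1", "n2"}, {"n3", "n4"}], four_nodes))
```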
MNS quorum implementations are recommended for geographically dispersed clusters.
Placing a single MSCS member node in a separate location allows that node to act as
an arbiter, which avoids split-brain situations. See the following figure for an example of
this:
SAP supports a Majority Node Set Cluster if it is part of a cluster solution offered by the
Original Equipment Manufacturer (OEM), or Independent Hardware Vendor (IHV).
File share witness for Windows Server 2003
The file share witness feature is an improvement to the current Majority Node Set (MNS)
quorum model of Windows Server 2003. This feature enables the use of a file share that
is external to the cluster as an additional vote to determine the status of the cluster in a
two-node MNS quorum cluster deployment.
One of the disadvantages of a two-node MNS cluster is that it cannot sustain the failure
of any cluster node without losing the majority of nodes. In other words, it cannot
continue operation. Without the file share witness, the only way to overcome this problem
is to configure at least three nodes in an MNS cluster. The three cluster nodes need to be
continuously available and should be in different physical locations.
With the File Share Witness feature, it is possible to use an external file share, also
referred to as the witness, instead of the third cluster node. By using the File Share
Witness, a two-node MNS cluster can be configured that remains operational even if one
cluster node fails. The file share used acts as an additional vote to determine which node
takes ownership of the configured cluster resources.
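The effect of the additional vote can be shown with a small vote count. This Python sketch is a simplified model under the assumption that every node and the witness contribute exactly one vote each. The function and parameter names are hypothetical and are not part of any cluster API.

```python
def cluster_stays_up(nodes_up, nodes_total, witness_configured=False,
                     witness_reachable=False):
    """Return True if the running nodes, plus the witness vote, form a majority."""
    total_votes = nodes_total + (1 if witness_configured else 0)
    votes = nodes_up + (1 if witness_configured and witness_reachable else 0)
    return votes > total_votes // 2


# Two-node MNS cluster without a witness: one node failure stops the cluster.
print(cluster_stays_up(nodes_up=1, nodes_total=2))

# Same cluster with a reachable file share witness: 2 of 3 votes remain.
print(cluster_stays_up(nodes_up=1, nodes_total=2,
                       witness_configured=True, witness_reachable=True))
```

The sketch also shows the limit of this setup: if one node fails while the file share is unreachable, only one of three votes remains and the cluster cannot continue.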
Additional information about the File Share Witness feature is available in the Microsoft
Knowledge Base Article 921181 at http://support.microsoft.com/kb/921181.
Network configuration
Any WSFC configuration requires at least two network adapters for the following
purposes:
• A public network that is used for the communication between the SAP central
instance, SAP AS, and SAP system client connections.
• A cluster private network that is used internally for status exchange and WSFC
cluster heartbeat information between the member nodes.
Each of these network adapters is required to have its own physical IP address and
corresponding host name. The cluster service in a WSFC cluster is unaware of a
possible geographical dispersion and assumes that its public and private network
interfaces still exist in the same network segment with the same IP subnet. This is
because cluster software is unable to determine network topology and because it
operates on IP failover that only functions within the same subnet. To accommodate
these restrictions for geographic dispersion, organizations can implement VLAN
technology.
Virtual LANs (VLANs) can be viewed as a group of devices on different physical LAN
segments that can communicate with each other as if they were all on the same physical
LAN segment. Even though some of the cluster service network communication
limitations have been removed in Windows Server 2008 R2, a single subnet is still
required. This is still true for SAP component communications as well.
With Windows Server 2003, the heartbeat round-trip time is limited to 500
milliseconds. Whether this fixed limit can be met depends directly on the latency and
bandwidth of the network connections used between the two sites. With Windows Server 2008 R2,
this parameter became configurable between 250 and 2000 ms on the same subnet.
Theoretically, even different subnets are possible with Windows Server 2008 R2, but due
to the requirements of the SAP instances, SAP installations are only possible in a single
subnet configuration. Additional geographically dispersed cluster information is available
in the following resources:
• White paper: Geographically Dispersed Clusters in Windows Server 2003
http://www.microsoft.com/windowsserver2003/techinfo/overview/clustergeo.mspx
• White paper: Server Cluster Quorum Options in Windows Server 2003
http://technet.microsoft.com (Note: Search for the title.)
• White paper: Stretching Microsoft Cluster with Geo-Dispersion
http://www.microsoft.com/technet/prodtechnol/windows2000serv/maintain/optimize/geoclust.mspx
• White paper: Server Clusters: Majority Node Set Quorum
http://technet.microsoft.com (Note: Search for the title.)
• Microsoft Storage solutions
http://www.microsoft.com/windowsserversystem/storage/default.aspx
• Microsoft Knowledge base article: Microsoft Cluster Services Installation Resources
http://support.microsoft.com/kb/259267
• Multi-Site clustering with Windows Server 2008 R2
https://www.microsoft.com/windowsserver2008/en/us/clustering-multisite.aspx
Transaction log backups on the primary database server are copied to a local disk on
this server and transferred over the network to the standby database in the configured
time interval. Transaction log backups received on the standby server are applied to
the database. It is also possible to transfer the transaction log backups from the
primary to multiple standby servers.
Changing the database role from primary to secondary, or bringing the
secondary database online when the primary database becomes unavailable, is
not an automatic process. The secondary database has to be brought online manually.
During the process of setting up SQL Server log shipping, initially a database backup
copy is restored on the standby server. With log shipping in place, every transactional
change is reproduced on the standby side.
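The log shipping flow described above can be modeled in a short Python sketch. It is a simplified illustration and does not use SQL Server. The Standby class, the ship_logs function, and the use of consecutive integers as log backup sequence numbers are assumptions made for this example. The sketch shows three properties from the text: log backups are applied in order, several standby servers can be fed from one primary, and a standby is brought online only by an explicit manual step.

```python
class Standby:
    """Toy standby database that accepts log backups in strict sequence."""

    def __init__(self, restored_through=0):
        # Sequence number of the last log backup contained in the restored copy.
        self.restored_through = restored_through
        self.online = False

    def apply(self, log_seq):
        if self.online:
            raise RuntimeError("database is online; no more logs can be applied")
        if log_seq != self.restored_through + 1:
            raise ValueError("gap in the log backup chain")
        self.restored_through = log_seq

    def bring_online(self):
        # Manual step performed by the administrator; never called automatically.
        self.online = True


def ship_logs(log_backups, standbys):
    """Apply the log backups, in sequence order, to every standby that needs them."""
    for seq in sorted(log_backups):
        for standby in standbys:
            if seq > standby.restored_through:
                standby.apply(seq)


site_a = Standby(restored_through=0)
site_b = Standby(restored_through=2)   # was restored from a later backup
ship_logs([3, 1, 2], [site_a, site_b])
print(site_a.restored_through, site_b.restored_through)
```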
By design, SQL Server log shipping maintains only the SAP system database in a
geographically dispersed way. As the complete functionality of an SAP system requires an
SAP central instance with the network shares and possibly an AS, these have to be
maintained separately. SQL Server log shipping is therefore not considered a full
disaster recovery solution, but is a simple method of maintaining a copy of the database
of an SAP system and can be combined with other technologies like database mirroring or
WSFC clusters.
The following figure shows the general setup with a local WSFC cluster:
Additional SQL Server log shipping information is available in the following resources:
• SAP Service Marketplace URL:
https://service.sap.com/notes
Note: Access to this Web site is available only to registered SAP customers and
partners and requires a user name and password.
SAP note 493290: "Configuring SQL Server Log shipping"
SAP note 1101017: "Log shipping on SQL Server 2005"
• SQL Server 2000 high availability series
http://www.microsoft.com/technet/prodtechnol/sql/2000/deploy/harag05.mspx
• White paper: SAP with SQL Server 2005
http://www.microsoft.com/sql/techinfo/whitepapers/sap-with-sql-server.mspx
The mirror server that receives these transaction log records writes them into the mirror
database log buffer before it writes them into a local transaction log file. The received
transaction log records are then applied to the mirror database. During the transaction
log application, all transactions executed on the active database are then executed on
the mirror side. Therefore, both databases can be maintained on the same transactional
level.
While, for database mirroring, the active and the mirror database must always be
available, there is an optional third role: the witness. With this optional configuration, in
case of error, an automatic failover to a mirror database can take place. When the
database mirroring is used in a high availability configuration, within seconds, the mirror
server can take on the role of the active server. The mirror database becomes available
in cases where the witness confirms the failover.
Database mirroring assures the availability of a consistent standby database in case of an
interruption of the production database. Encrypting the data packets during network
transport provides good data security. SQL Server 2005/2008 R2 supports three different
mirroring configurations:
• Asynchronous mirroring
• Synchronous mirroring
• Synchronous mirroring with automatic failover
Additional information about SQL Server database mirroring is available in the following
resources:
• White paper: SAP with SQL Server 2005
http://www.microsoft.com/sql/techinfo/whitepapers/sap-with-sql-server.mspx
• White paper: Using SQL Server 2005 with SAP R/3
http://www.microsoft.com/technet/itsolutions/msit/operations/sql2005sap.mspx
• Books online: Microsoft SQL Server 2005
http://www.microsoft.com/technet/prodtechnol/sql/2005/dbmirror.mspx
• White paper: SQL Server 2008 Technologies for SAP Solutions
https://www.sdn.sap.com/irj/sdn/go/portal/prtroot/docs/library/uuid/60a236a2-8104-2b10-5ebe-8fef61cc82fd