
Mission-Critical Availability for Windows Server Environments

Stratus delivers automatic uptime assurance that exceeds 99.999%

by Stratus Technologies, March 2011

Mission-Critical Availability for Microsoft Windows Server Environments

Uptime. All the Time. For more than 30 Years.


Stratus systems deliver greater than 99.999% uptime automatically, right out of the box. That's why the world's largest companies and government organizations rely on ftServer systems to run their most critical Microsoft Windows applications. In contrast to traditional high-availability alternatives, Stratus ftServer systems provide continuous availability. Uptime, based on real operational data, exceeds 99.999% for the server and Windows Server operating system.

Stratus fault-tolerant platforms are used for mission- and business-critical applications across a wide range of industries:

Financial services: trading, credit authorization, Internet banking, ATM management, transaction processing
Manufacturing: manufacturing execution systems, enterprise resource planning
Public sector: federal aviation and transportation systems, computer-aided dispatch, and record management
Telecommunications: IP PBX, call center applications
Healthcare: electronic medical records, hospital information systems, imaging
Gaming: casino management, slot management, rewards program tracking

In addition to many line-of-business applications, ftServer systems are also used to protect valuable application infrastructure and horizontal applications including Microsoft SQL Server, Microsoft Exchange, directory and security services, and Hyper-V virtualization.

Stratus sells and supports its systems across the world, either directly or through authorized resellers. Our ftServer system customers include many of the world's largest corporations along with smaller companies with critical applications that cannot afford downtime. Customers include those looking to improve the reliability of their existing or new Windows applications as well as those looking to decrease costs by moving critical applications from UNIX or proprietary operating systems to Windows-based platforms.
Stratus servers are used both in datacenter environments, where they fit seamlessly into existing infrastructures, and in remote environments, where the unmatched simplicity and powerful remote service capabilities of ftServer systems allow flawless operation with little or no on-site IT expertise. Stratus systems are standards-based, using the latest off-the-shelf components including the new, high-performance Intel Xeon hex-core processors.

Figure 1: Stratus and Microsoft: A Measurable Difference

"Leveraging more than 30 years' experience in fault-tolerant computing, Stratus has collaborated with Microsoft to provide valuable insight and validate fault tolerant-aware functionality in the Windows Server operating system. This collaboration is designed to enable uninterrupted operation of Windows Server under the most extreme circumstances."
Bill Laing, Corporate Vice President, Windows Server and Tools, Microsoft Corporation

Advanced technology that prevents planned and unplanned downtime


Industry-leading uptime of 99.999% or greater ensures comprehensive availability protection
Affordable open-systems family offers significant cost advantages compared to UNIX or proprietary systems
System simplicity and application transparency reduce operational costs compared to clusters or other software-based, high-availability solutions
Active Upgrade technology improves software maintenance and reduces planned downtime during routine maintenance such as upgrades, patches and hot fixes
Built-in fault-detection, isolation and call-home features reduce service times and improve service effectiveness
Online manageability and serviceability protect lights-out and remote locations
Worldwide ActiveService network delivers 24/7 online service and support

Stratus and Microsoft: partners in availability


Stratus has been designated as a Microsoft Gold Certified partner. This signifies our alignment with Microsoft and their continued commitment to our success, as well as the reliability and experience of Stratus in providing Microsoft-compatible products and services. All Stratus ftServer systems have passed the rigorous Windows Hardware Compatibility Tests (HCT), are listed in the Windows Server Catalog, and are qualified for the Microsoft Windows logo. This is your assurance that standard Windows-compatible software will run flawlessly on ftServer models.

Engineering groups within Microsoft Corporation and Stratus Technologies have worked closely for more than ten years to enhance the robustness of the Windows operating systems. Most recently, this ongoing collaboration contributed to enhanced enterprise-class reliability, availability, and performance of the Windows Server 2008 R2 operating systems. Microsoft tested Windows Server 2008 R2 on Stratus ftServer systems in its labs during the entire beta testing period. Using Stratus servers and tapping the company's expertise in technology for continuously available processing, Microsoft was able to:

Conduct automated device-failure testing to study operating system response.
Develop specifications and prototype implementations for memory mirroring code.
Improve surprise-removal support for PCI devices in storage, USB, PCI control, and plug-and-play.
Identify and accelerate resolution of software issues affecting availability during various test phases.

Stratus is working with Microsoft to develop additional requirements and testing for fault-tolerant and high-availability systems in conjunction with the Windows Server 2008 operating system.

The industry's highest measured uptime


Stratus Technologies' uncompromising commitment to uptime is visible every day. Stratus is the first and only server vendor to report the dependability of our installed base of systems worldwide. The Stratus Uptime Meter℠ is displayed prominently on the products page at www.stratus.com. It is refreshed daily from actual field data that includes hardware, software, and Windows operating system incidents. The results report that ftServer systems, from entry-level to enterprise class, surpass five nines of uptime.
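To put "five nines" in concrete terms, the short sketch below converts an availability percentage into the downtime it allows per year. It is only an illustration: the function name and the 365-day year are assumptions made here, and the calculation is generic arithmetic, not a description of how the Uptime Meter computes its figures.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # assumption: a 365-day year


def max_downtime_minutes_per_year(availability_percent: float) -> float:
    """Return the yearly downtime budget (in minutes) implied by an availability level."""
    unavailable_fraction = 1.0 - availability_percent / 100.0
    return unavailable_fraction * MINUTES_PER_YEAR


# Compare three common availability levels.
for level in (99.9, 99.99, 99.999):
    minutes = max_downtime_minutes_per_year(level)
    print(f"{level}% uptime allows about {minutes:.2f} minutes of downtime per year")
```

Under these assumptions, 99.999% availability leaves a budget of a little more than five minutes of downtime per year.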

Fundamentals of Continuous Processing design


To provide the most complete protection for uptime, a comprehensive server solution must address the areas of hardware, software, and service. All aspects of the Continuous Processing design work together to prevent downtime, not simply minimize it. Preventing downtime is a key design point that differentiates the ftServer family from robust traditional servers and high-availability clusters (which use multiple servers to quickly recover from downtime when one of the servers in the cluster fails). Unlike reliability-enhancing approaches that are not integral to a server's design, built-in continuous availability helps limit exposure to the operator error that industry experts identify as a leading cause of unplanned downtime.

Notably, off-the-shelf Windows-based applications need not be modified in any way to benefit from these exceptional availability safeguards. This advantage represents a considerable improvement compared with clusters that require failover scripting, repeated test procedures, and software changes to make applications cluster-aware. Stratus enables Continuous Processing capabilities in ftServer systems through three fundamental elements: Lockstep Technology, Failsafe Software, and ActiveService Architecture.

Lockstep technology prevents hardware downtime and data loss


Lockstep technology uses replicated, fault-tolerant hardware components (essentially two servers packaged as one) that process the same instructions at the same time. In the event of a component malfunction, the partner component acts as an active spare that continues normal operation and averts system downtime. The system also detects and corrects transient hardware errors that could cause software failures if left unchecked.

Automatic error handling: Hardware faults are handled automatically by the system, without failover delay or data loss. Using Stratus lockstep technology, ftServer systems maintain multiple CPU-memory units in precise synchronization, executing the same instructions at exactly the same clock cycle. Lockstep processing ensures that any errors, even transient errors, are detected and that the system can survive any CPU-memory unit error without interrupting processing and without losing any data or state.

While many servers offer duplicated power supplies, fans, and disk drives, only Stratus provides protection for core system components that include motherboards, processors, memory, I/O buses, and I/O adapters. Another advantage of this approach is that an ftServer system presents a single-system view and runs a single copy of all software, which typically reduces software licensing costs and simplifies administration as compared with multi-node cluster alternatives.
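The "active spare" idea can be illustrated with a small software analogy. The sketch below is not how ftServer hardware is implemented: the class and function names, the fault-injection parameter, and the use of an exception to stand in for hardware error detection are all assumptions made for illustration. It simply shows two replicas performing the same work at each step, with the surviving replica carrying on when its partner faults.

```python
from dataclasses import dataclass
from typing import Callable, List


class UnitFault(Exception):
    """Stands in for a hardware error detected inside one simulated unit."""


@dataclass
class ReplicaUnit:
    name: str
    healthy: bool = True
    fail_on_step: int = -1  # hypothetical fault injection: step at which this unit fails

    def execute(self, step: int, operation: Callable[[int], int], operand: int) -> int:
        if step == self.fail_on_step:
            raise UnitFault(f"{self.name} failed at step {step}")
        return operation(operand)


def run_in_lockstep(units: List[ReplicaUnit],
                    operation: Callable[[int], int],
                    operands: List[int]) -> List[int]:
    """Run every operand through all healthy units; a faulted unit is isolated."""
    results = []
    for step, operand in enumerate(operands):
        step_results = []
        for unit in units:
            if not unit.healthy:
                continue
            try:
                step_results.append(unit.execute(step, operation, operand))
            except UnitFault:
                unit.healthy = False  # isolate the faulted unit; its partner keeps running
        if not step_results:
            raise RuntimeError("all replica units have failed")
        results.append(step_results[0])
    return results


units = [ReplicaUnit("unit-A", fail_on_step=2), ReplicaUnit("unit-B")]
outputs = run_in_lockstep(units, lambda x: x * x, [1, 2, 3, 4])
print(outputs)
print([unit.name for unit in units if unit.healthy])
```

In this toy run, unit-A faults partway through, yet every operand still produces a result because unit-B was already executing the same work and no state has to be rebuilt.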

Fault-tolerant I/O: Stratus implements fault-tolerant I/O through the use of replicated PCI buses, replicated I/O adapters, and replicated devices. All critical PCI adapters are duplicated as well: SCSI, SAS/SATA, Ethernet, remote management, and Fibre Channel. Internal SAS and SATA disk storage, along with expansion ftScalable Storage, is configured as RAID and connected via two independent storage buses. Connections to external Fibre Channel hardware RAID arrays are also duplicated to ensure full fault-tolerant operation. Multiple paths are therefore available to any logical I/O operation, including both internal and external storage operations. Any I/O operation failure will result in a retry using an alternate path that ensures successful completion of the I/O operation.

Figure 2: Stratus Lockstep Processing Architecture

Redundant components within the ftServer system operate in lockstep, processing the same instructions at the same time to ensure maximum uptime protection for critical Windows environments.
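The alternate-path retry described for fault-tolerant I/O follows a familiar pattern that can be sketched in a few lines. This is a generic illustration only: the function names, the exception type, and the two simulated storage buses are hypothetical and do not represent Stratus driver interfaces.

```python
from typing import Callable, Sequence


class PathError(Exception):
    """Raised when an I/O attempt on one path fails."""


def perform_io(paths: Sequence[Callable[[bytes], str]], payload: bytes) -> str:
    """Attempt the operation on each path in order and return the first success."""
    errors = []
    for path in paths:
        try:
            return path(payload)
        except PathError as exc:
            errors.append(str(exc))  # remember the failure, then try the next path
    raise PathError(f"all {len(paths)} paths failed: {errors}")


def failed_bus(payload: bytes) -> str:
    # Simulated path that is currently unavailable.
    raise PathError("storage bus 0 unavailable")


def working_bus(payload: bytes) -> str:
    # Simulated path that completes the write.
    return f"wrote {len(payload)} bytes via storage bus 1"


result = perform_io([failed_bus, working_bus], b"record")
print(result)
```

The caller sees a single successful operation; the failure on the first path is absorbed by the retry on the second.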

Failsafe Software addresses planned and unplanned downtime


Failsafe software works in concert with lockstep technology to prevent many software errors from escalating into outages and to reduce planned downtime during routine maintenance. Unlike typical servers or clusters, ftServer hardware and software handle most errors transparently, shielding the operating system, middleware, and application software. Another advantage of the Stratus approach is that it constantly protects and maintains in-memory data. Management and diagnostic features capture and analyze any software issues and notify Stratus of them. This allows support personnel to take a proactive approach to correcting software problems before they recur. In addition, hardened device drivers add considerable reliability to the Windows environment on Stratus ftServer systems.

Active Upgrade technology: Adding a new dimension of downtime protection for Microsoft Windows applications, Stratus Active Upgrade technology addresses the major causes of planned downtime. This unique technology allows customers to apply upgrades and patches to the Windows operating system, system software or applications (including Windows service packs and hot fixes) without taking the server or Windows applications offline for extended periods. Active Upgrade works by splitting the normally lockstepped system into two independent halves: the upgrade side and the production side. Programs can be patched or upgraded on the upgrade side without impacting the production side. After applying the updates, the upgrade side can be rebooted and the new functionality tested while the production side continues to run the production software without interruption. After the upgrade side has passed its tests, it is merged with the production side to resume fault-tolerant operation. However, if the upgrade is unsuccessful for any reason, the system simply continues to run the original production side, which is merged back into fault-tolerant operation without any interruption to the production applications. When merging a successful upgrade, the applications are briefly halted on the production side and then restarted on the upgraded system once the merge completes. Unlike rolling upgrades in a cluster configuration, which require at least two upgrade procedures and two application stops and restarts (one for each cluster node), Active Upgrade requires only a single upgrade procedure and a single application stop and restart.

Built-in reliability from two industry leaders: Along with Hyper-V virtualization, Windows Server 2008 R2 also brought new reliability features to the Windows platform. The ftServer family running Windows has demonstrated hardware and operating system availability levels beyond 99.999%, as measured by actual production system data. Stratus failsafe software addresses known sources of system and application failures, and minimizes downtime during repair or maintenance:

Software is shielded from transient hardware errors, adding to overall software reliability.
Hardened device drivers prevent software failures caused by driver errors, a common cause of Windows software problems.
Embedded open-driver technology allows qualified third-party device drivers to take advantage of hardened-driver capabilities without the need to modify existing driver code.
Software issues, including Windows operating system issues, are reliably captured, analyzed, and corrected.
In-memory data is maintained in the event of hardware component failure.
Extensive integration and error-insertion testing finds and resolves difficult errors.

All of these software enhancements are implemented without affecting the Windows core operating system code. As a result, systems maintain 100% compatibility with Windows Server operating systems. All ftServer systems pass the same rigorous Windows Hardware Compatibility Tests (HCT) as other servers, ensuring that Windows applications run without modification on Stratus ftServer systems. All current models are listed on the Microsoft Hardware Compatibility List (HCL).
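The Active Upgrade sequence described earlier in this section (split, update, test, then merge or fall back) can be summarized as a simple workflow. The sketch below is a conceptual model only: the function, the outcome record, and the version labels are hypothetical and are not part of any Stratus tool or API.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class UpgradeOutcome:
    running_version: str
    application_restarts: int
    steps: List[str] = field(default_factory=list)


def active_upgrade(current_version: str,
                   new_version: str,
                   tests_pass: Callable[[str], bool]) -> UpgradeOutcome:
    """Model the split, update, test, and merge-or-fall-back sequence."""
    steps = [
        "split lockstepped system into production side and upgrade side",
        f"install {new_version} on upgrade side while production runs {current_version}",
        "reboot and test upgrade side",
    ]
    if tests_pass(new_version):
        # Successful upgrade: one brief application stop and restart at merge time.
        steps.append("briefly halt applications, merge, restart on upgraded system")
        outcome = UpgradeOutcome(new_version, 1, steps)
    else:
        # Failed upgrade: production side was never interrupted.
        steps.append("abandon upgrade; merge back onto original production side")
        outcome = UpgradeOutcome(current_version, 0, steps)
    steps.append("resume fault-tolerant operation")
    return outcome


succeeded = active_upgrade("v1", "v2", lambda version: True)
failed = active_upgrade("v1", "v2", lambda version: False)
print(succeeded.running_version, succeeded.application_restarts)
print(failed.running_version, failed.application_restarts)
```

In this model a successful upgrade costs a single application restart and a failed one costs none, which mirrors the comparison with rolling cluster upgrades that need a stop and restart for each node.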

ActiveService Architecture improves service response and effectiveness


The Stratus ActiveService Architecture is designed to detect and resolve problems before they cause system downtime. Key service features include automatic fault detection, automatic fault isolation, integrated call-home remote support, and online component replacement to provide serviceability that is unequaled by other servers.

Automatic fault detection, isolation and call-home features: Stratus ftServer systems constantly monitor their own operation. When a fault is detected, the server correctly isolates the condition and automatically opens a call that tells the Stratus support center exactly what action to take. Remote support capabilities enable Stratus service engineers to troubleshoot and resolve problems online more than 98% of the time. If necessary, the system automatically orders its own hot-swappable replacement part and ensures the correct part is delivered, within 24 hours, to major locations worldwide. Users can install these components easily while the ftServer system continues to run without interruption. The server automatically reports any problem condition to a Stratus Customer Assistance Center (CAC) via a secure dial or optional Internet connection. The global ActiveService Network provides a worldwide infrastructure that enables remote access (with customer permission) to every customer system. Authorized support professionals are able to remotely investigate critical problems 24/7, without the need to visit the customer site.

Hot-pluggable components and online maintenance: Stratus ftServer systems are designed to allow most hardware maintenance operations, including hardware upgrades, reconfiguration and repair, to be performed online. The ftServer modular architecture supports user replacement of failed components without any special tools or training. Indicator lights, clear labeling, blind-mate connectors and other design features make service simple and error-free.
Components can be replaced with no operator commands needed to inform the system that a component will be removed or inserted, eliminating another potential cause of error. The result is that downtime associated with hardware maintenance operations or errors resulting from those operations is eliminated.

Figure 3: A System that Diagnoses Problems and Orders its Own Replacement Part

Component fails.
The ftServer system ISOLATES the fault and notifies Stratus that a CPU has failed.
The ftServer system then automatically orders the CORRECT replacement part.
Next Day Delivery Service.
Hot-pluggable components are EASY to replace without specialized training or tools while the system continues to run.
The system automatically synchronizes with the new replacement component.

Users experience no downtime or degradation in performance throughout this entire process.

Industry-leading 24/7 service and support


Stratus has been providing fault-tolerant computer systems worldwide for over 28 years, and delivering industry-leading 24/7 support has been a key element of Stratus' availability advantage since the company's start. The vast majority of Stratus customers, including Windows customers, today choose one of Stratus' 24/7 support offerings. Stratus customer service personnel have an average of 12 years' experience and include engineers with the capability of troubleshooting Windows problems to the source code level. Statistically, 50% of customer support calls involve non-Stratus products; however, Stratus is committed to assisting in troubleshooting and root-causing these problems. If root cause analysis reveals that an availability issue may be related to the Windows Server operating system kernel, Stratus teams with Microsoft to resolve the situation. Because Microsoft is authorized to access the Stratus ActiveService Network (ASN) with the customer's consent, the Stratus service engineer and a Microsoft engineer can work together to electronically examine the issue, analyze data, and find a solution.

Top industry awards validate Stratus availability leadership


InfoWorld designated a Stratus system Best Fault-Tolerant Server in its Technology of the Year Awards. Drawn from hundreds of IT products evaluated by the InfoWorld Test Center, the systems were ranked on six criteria: availability, performance, scalability, manageability, serviceability and value. Reviewers credited Stratus' fault-tolerant architecture for being "as redundant as a server can get.... If it absolutely has to be up, no matter what, the ftServer delivers."

Stratus Active Upgrade software was selected by the editors of SQL Server Magazine as a Gold Award winner in the high availability category. The Active Upgrade selection was based on the product's strategic importance to the market, its competitive advantages and value to the customer. Today, Stratus customers continue to rely on this unique technology to dramatically reduce planned downtime associated with software patches and upgrades.

Figure 4: Award-winning Servers and Availability Technology

Industry recognition validates Stratus' continuous availability innovations and technology leadership

An unbeatable combination for mission-critical Windows environments


Stratus ftServer systems provide a continuous-availability option for Microsoft customers and prospects that is unique in the market. High-availability clustering and new virtualization technologies can improve availability for many applications, but for those customers that require availability levels of 99.999% or greater, that cannot afford to lose in-memory data, or that need the highest level of 24/7 support, Stratus ftServer systems, along with Stratus' industry-leading support, provide a compelling solution.

Intel-based ftServer systems use standard hardware components, are listed in the Windows Server Catalog, and share standard Windows operation and management processes with commodity servers. This means that ftServer systems can fit seamlessly into existing Windows environments and leverage existing Windows staff skills. Along with significant cost advantages, these compatibility features can provide a significant advantage when competing with UNIX or proprietary solutions that have been used for the bulk of mission-critical and business-critical application deployments in the past.

About Stratus Technologies


Stratus delivers uptime for the applications its customers depend on most for their success. With its ultra-reliable servers, software and services, Stratus products help to save lives and to protect the business and reputations of companies, institutions, and governments the world over. To learn more about worry-free computing, visit www.stratus.com.

Stratus, ftServer, the ftServer logo, and Continuous Processing are registered trademarks; ActiveService, Active Upgrade, and the Stratus Technologies logo are trademarks; and Uptime Meter is a service mark of Stratus Technologies Bermuda Ltd. Microsoft, Windows, and Windows Server are either trademarks or registered trademarks of Microsoft Corporation in the United States and/or other countries. Intel, the Intel logos, Intel Inside and Xeon are trademarks or registered trademarks of the Intel Corporation in the United States and other countries. UNIX is a registered trademark of the Open Group in the United States and other countries.

Specifications and descriptions are summary in nature and subject to change without notice. Copyright X960-A 2011 Stratus Technologies Bermuda Ltd. All rights reserved.
