
Agenda

• What is Availability
• Road to Active / Active
• Conflicts
• Migrations
• Today’s Technologies
• Case Studies
• Questions
HP & GoldenGate Software Partnership Highlights
• GoldenGate’s First Product on HP NSK Delivered 1996

• Success across all geographic regions and verticals, including:
– Banking, financial services, healthcare, retail & government

• The majority of HP NonStop customers use GoldenGate solutions today.

• HP customers drove GoldenGate to support open systems.

• HP customers brought us to Active/Active.

• Currently engaged in other areas of HP: HP-UX, HP Neoview, and Blades.


What is Availability?
Three States of Availability

• Operational applications
– Banking Transaction Processing
– Retail POS / Order Processing
– Healthcare Physician Order Entry
– Clinical Information Systems
– Customer-facing Applications
– Telecommunications & Billing

• #1: Active
– Performance, Latency, Scalability

• #2: Planned Outage
– Migrations
– Upgrades
– Maintenance

• #3: Unplanned Outage
– System Failure
– Data Failure
Road to Active / Active
Goals of an Active/Active Implementation

• Better use of existing hardware


– Put your backup system to use

• Continually test backup system


– It is working right now

• Reduce response time
– Handle peaks, with each system processing a portion of the load
– Maintain your system with planned switchovers

• Allows phased Migrations/Upgrades (no downtime)!
– Once you can process on two systems, you can perform phased migrations
How GoldenGate TDM Works: Modular “Building Blocks”

Capture: Committed changes are captured (and can be filtered) as they occur by reading the transaction logs.

Trail files: Universal data format enables heterogeneity.

Route: No distance constraints via TCP/IP. Compression & encryption.

Delivery: Applies transactional data with guaranteed integrity.
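
The four building blocks can be pictured as a small pipeline. The sketch below is only an illustration of the idea, not GoldenGate code: the record layout, the JSON-lines trail format, and every function name are assumptions made for this example, and `zlib` stands in for the compression step (encryption is omitted).

```python
import json
import zlib

def capture(log, table_filter):
    """Yield only changes from committed transactions, filtered by table."""
    committed = {rec["txn"] for rec in log if rec["op"] == "commit"}
    for rec in log:
        if (rec["op"] != "commit"
                and rec["txn"] in committed
                and rec["table"] in table_filter):
            yield rec

def write_trail(changes):
    """Serialize changes to a platform-neutral trail (JSON lines here)."""
    return "\n".join(json.dumps(c, sort_keys=True) for c in changes).encode("utf-8")

def route(trail_bytes):
    """Compress on the sending side, decompress on the receiving side."""
    wire = zlib.compress(trail_bytes)
    return zlib.decompress(wire)

def deliver(trail_bytes, target):
    """Apply trail records to the target in their original order."""
    for line in trail_bytes.decode("utf-8").splitlines():
        rec = json.loads(line)
        key = (rec["table"], rec["key"])
        if rec["op"] == "delete":
            target.pop(key, None)
        else:  # insert or update
            target[key] = rec["row"]
    return target

# Hypothetical transaction log: txn 2 never commits, txn 3 touches a filtered-out table.
log = [
    {"txn": 1, "op": "insert", "table": "accounts", "key": 10, "row": {"balance": 100}},
    {"txn": 2, "op": "insert", "table": "accounts", "key": 11, "row": {"balance": 50}},
    {"txn": 1, "op": "commit"},
    {"txn": 3, "op": "update", "table": "audit", "key": 1, "row": {"note": "x"}},
    {"txn": 3, "op": "commit"},
]

target_db = deliver(route(write_trail(capture(log, {"accounts"}))), {})
print(target_db)
```

Only the committed insert into `accounts` reaches the target; the uncommitted change and the filtered table are dropped at capture time, before anything is written to the trail.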

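[Diagram: Bi-directional configuration. Capture reads the Source Database and writes a Source Trail, which travels over LAN / WAN / Internet to a Target Trail that Deliver applies to the Target Database. A mirrored Capture, trail, and Deliver path runs in the reverse direction.]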
Uni-Directional Plus Live Reporting

When you need:
• Current up-to-the-minute reporting information
• Reduce impact of reporting demands on your production system
• Verification of your failover data readiness

Under Normal Operating Conditions:
• PRIMARY SYSTEM AVAILABLE for BOTH READ and WRITE
• SECONDARY SYSTEM AVAILABLE for ONLY READ operations
Live Standby (Active – Passive)

When you need:
• Live reporting+
• Fastest possible recovery & switchover
• Reverse direction replication ready
• Next best thing to Active-Active
• Backup that can be used for reporting

Under Normal Operating Conditions:
• PRIMARY SYSTEM AVAILABLE for BOTH READ and WRITE
• SECONDARY SYSTEM AVAILABLE for ONLY READ operations
Active / Active – Data Routed to Avoid Data Collision

When you need:
• Continuous availability
• Transaction load distribution
• Performance scalability

Under Normal Operating Conditions:
• BOTH SYSTEMS AVAILABLE for BOTH READ and WRITE
Active / Active – With Data Collisions

When you need:
• Continuous availability
• Transaction load distribution
• Performance scalability
• Conflict detection & resolution

Under Normal Operating Conditions:
• BOTH SYSTEMS AVAILABLE for BOTH READ and WRITE
Conflicts: Avoidance, Detection, and Resolution
Active/Active - Considerations
Loop Detection
• Detecting whether an operation was performed by the replication component or by the application
• Sometimes referred to as ping-pong detection

Conflict Avoidance
• Building an environment where conflicts are avoided under normal processing conditions

Conflict Detection
• Detecting whether the same row was updated on both the source and the target before the changes were applied by data replication

Conflict Resolution
• Determining business rules for handling collisions
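
Loop detection and conflict detection can each be reduced to a one-line check. The sketch below is an assumption-laden illustration rather than any product's mechanism: it supposes that every logged operation records which account applied it, that the replication component uses a dedicated account (`REPLICATOR_USER`, a made-up name), and that each replicated change carries the before-image the source saw.

```python
REPLICATOR_USER = "replicator"  # hypothetical account used only by the apply process

def changes_to_capture(log_records):
    """Loop detection: skip operations the replication component applied,
    so they are not sent back to the site they came from (ping-pong)."""
    return [rec for rec in log_records if rec["applied_by"] != REPLICATOR_USER]

def is_conflict(change, target_row):
    """Conflict detection: the target row no longer matches the before-image
    the source saw, so both sites updated the row independently."""
    return target_row != change["before"]

site_a_log = [
    {"key": 1, "value": "entered by the application on A", "applied_by": "app_user"},
    {"key": 2, "value": "replicated from B", "applied_by": REPLICATOR_USER},
]
outbound = changes_to_capture(site_a_log)
print([rec["key"] for rec in outbound])

change_from_b = {"key": 7, "before": {"city": "Reno"}, "after": {"city": "Tahoe"}}
print(is_conflict(change_from_b, {"city": "Reno"}))   # target untouched since B read it
print(is_conflict(change_from_b, {"city": "Vegas"}))  # target was also updated locally
```

Only the application's own change on site A is eligible for capture, and the second `is_conflict` call flags a collision that a resolution rule would then have to settle.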
Conflict Avoidance

• Application partitioning
– User-based
– Account number based
– Geographic
– …

• Database Key partitioning
– Even vs. Odd
– Increments by server count (1,4,7,10…) (2,5,8,11…) (3,6,9,12…)
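
Key partitioning by server count is simple to sketch: each server starts at its own index and steps by the number of servers, so no two servers ever generate the same key. The function names below are made up for the illustration, and a real deployment would more likely configure this in the database's own sequence or identity settings.

```python
from itertools import count, islice

def key_sequence(server_index, server_count):
    """Keys for one server: start at its 1-based index, step by the server count."""
    return count(start=server_index, step=server_count)

def owner_of(key, server_count):
    """Recover which server (1-based) generated a given key."""
    return (key - 1) % server_count + 1

for server in (1, 2, 3):
    print(server, list(islice(key_sequence(server, 3), 4)))
```

With three servers this reproduces the sequences on the slide, and the even vs. odd scheme is the same function with a server count of two.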
Conflict Scenarios
• Database Design
– Key Sequencing

• Application Logic
– Account Balance
– Inventory
– Customer address

• Network Outage
– What do you do?
Conflict Resolution Approaches

• Exception handling / management


– Human intervention
– Automated approaches

• Simple automated approaches


– Timestamp
– Trusted source / site priority
– Merge approach

• Complex automated approaches


– Quantitative resolution
– Complex rules-based resolution
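
Three of the automated approaches above can be sketched in a few lines each. This is a minimal illustration under assumed row layouts (`updated_at`, `site`, and the `priority` mapping are invented for the example), not the resolution logic of any particular product.

```python
from datetime import datetime

def resolve_by_timestamp(local, incoming):
    """Timestamp: the most recent update wins; ties keep the local row."""
    return incoming if incoming["updated_at"] > local["updated_at"] else local

def resolve_by_site_priority(local, incoming, priority):
    """Trusted source: the row from the more trusted site wins
    (a lower number means more trusted)."""
    return incoming if priority[incoming["site"]] < priority[local["site"]] else local

def resolve_quantitative(current_value, incoming_before, incoming_after):
    """Quantitative: apply the incoming change as a delta instead of overwriting."""
    return current_value + (incoming_after - incoming_before)

row_a = {"site": "A", "address": "1 Main St", "updated_at": datetime(2008, 5, 1, 9, 0)}
row_b = {"site": "B", "address": "9 Oak Ave", "updated_at": datetime(2008, 5, 1, 9, 5)}
print(resolve_by_timestamp(row_a, row_b)["address"])
print(resolve_by_site_priority(row_a, row_b, {"A": 1, "B": 2})["address"])

# Both sites start at a balance of 100. A withdraws 30 (now 70), B deposits 50 (now 150).
balance_at_a = resolve_quantitative(70, incoming_before=100, incoming_after=150)
balance_at_b = resolve_quantitative(150, incoming_before=100, incoming_after=70)
print(balance_at_a, balance_at_b)
```

Timestamp and site-priority rules discard one of the two updates, which may be acceptable for a customer address. For an account balance or inventory count, the delta approach keeps both changes, and the two sites converge on the same value.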
Migrations
Migration Challenges

• Maintaining SLA during planned outage
– Revenue Impact
– Customer Expectations
– Interdependencies, Integration

• Data issues
– Instantiating Terabytes/Petabytes
– Staging areas
– Change Management
– Special Handling

• Synchronization issues
– Incremental data movement
– Source database impact

• Failback strategy
– System/Application verification
– Continued data growth
• Application Availability
– High Availability: Zero database downtime and minimal application downtime during the project
– Low Impact: Non-intrusive on the source database and OLTP activity

• Data Issues
– Real Time: Real-time incremental synchronization of data transactions during the migration

• Risk Mitigation
– Verification: Verification of data between the databases before the cutover
– Failback: Failback solution in the event of unexpected issues on the new environment
If it ain’t broken… Why do they migrate critical systems?

• Their hardware or operating system is at “end-of-life”


– Tru64, OpenVMS, old hardware …

• Their application version is no longer supported


– Siebel 6.x, GE Carecast, etc
– Take advantage of new features

• Data center consolidation / virtualization


– Operating old servers becomes increasingly expensive
– TCO reduction, MIPS reduction

• Change in vendor / strategy


– Mainframe to HP-UX
Three Flavors of Migrations
Unidirectional Migration

• Eliminate downtime during the data migration


– Data on target is at near-zero lag from source data
• Big-bang cutover with no fail-back

[Diagram: Big-Bang Cutover. Source Database, Capture, Source Trail, Target Trail, Deliver, Target Database, with Verify comparing the source and target databases.]
Unidirectional Migration with Failback Option

• Eliminate downtime during the data migration


• Big-bang cutover with failback
– Capture transactions on the new system; if something goes wrong, bring the old system up to date (failback requires downtime)

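[Diagram: Big-Bang Cutover with Fail-back Contingency. Forward path: Source Database, Capture, Source Trail, Target Trail, Deliver, Target Database. Fail-back path: Capture on the Target Database, Failback Trail, Delivery to the Source Database. Verify compares the two databases.]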
Bidirectional Migration
• Eliminate downtime during the data migration
• Gradual cutover with two active systems
• Switch users back and forth on a schedule
• Not Trivial – Need Application knowledge (Packaged Solutions for BASE24, GE Carecast, Siebel)

[Diagram: Phased Cutovers. Capture, trail, and Deliver paths run in both directions between the Source Database and the Target Database, with Verify comparing the two.]
Migration Validation
How Confident Are You: Does Node A = Node B?

Visibility to act on discrepancies sooner
Why Veridata? Data Discrepancies are a Reality…

• User errors
– Input errors
– Unintended use
– Malicious intent

• Application errors
– Faulty logic
– Failed upgrades
– Latent bugs

• Infrastructure errors
– System failure
– Disk corruption
– Network outage

• Configuration errors
– Applications
– Replication
– Network

“Although redundancy in a data architecture will be added value in some cases and required in others,
redundancy introduces the risk of discrepancies when all related copies of data are not kept in sync
and current.”
-- Ted Friedman, Gartner, January 2004
GoldenGate Veridata: How it Works

• The user chooses tables or files on the source and target databases
• The comparison is initiated from the Veridata web-based UI or command line
• As the databases continue to change, GoldenGate Veridata reports:
– Persistent discrepancies
– In-flight data discrepancies (user configurable)
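
One way to picture the distinction between persistent and in-flight discrepancies is a two-pass comparison. The sketch below illustrates that idea only; it is not a description of how Veridata is implemented. The row-hash scheme, the in-memory tables, and all function names are assumptions made for the example.

```python
import hashlib

def row_hashes(table):
    """Map primary key -> hash of the row's column values."""
    return {
        key: hashlib.sha256(repr(sorted(row.items())).encode("utf-8")).hexdigest()
        for key, row in table.items()
    }

def compare(source, target):
    """Return the keys whose rows differ or exist on only one side."""
    src, tgt = row_hashes(source), row_hashes(target)
    return {k for k in src.keys() | tgt.keys() if src.get(k) != tgt.get(k)}

def classify(first_pass, second_pass):
    """Rows out of sync in both passes are persistent; rows that matched
    by the second pass were only in flight during the first."""
    return {"persistent": first_pass & second_pass,
            "in_flight": first_pass - second_pass}

source = {1: {"name": "Ann", "balance": 10},
          2: {"name": "Bob", "balance": 25},
          3: {"name": "Cy", "balance": 40}}

# First pass: row 2 has not replicated yet, row 3 is wrong on the target.
target_then = {1: {"name": "Ann", "balance": 10},
               2: {"name": "Bob", "balance": 20},
               3: {"name": "Cy", "balance": 0}}
# Second pass, after replication has caught up: row 3 is still wrong.
target_now = {1: {"name": "Ann", "balance": 10},
              2: {"name": "Bob", "balance": 25},
              3: {"name": "Cy", "balance": 0}}

report = classify(compare(source, target_then), compare(source, target_now))
print(report)
```

Row 2 is reported as in flight because replication lag explains it, while row 3 is a persistent discrepancy that needs repair.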
Today’s Technologies
Hardware Redundancies
• Hardware / Operating System Redundancies
– Tandem
– Stratus
– Clustering

• Database Server Redundancies


– Oracle RAC
– DB2 Sysplex/Datasharing

• Storage Redundancies
– Storage Mirroring
– Host-based Mirroring
– RAID

• Backup Technology
– Backups
– Snapshots
Hardware Redundancies

• Pros
– Non-intrusive
– Easy to implement
– Complementary strategy

• Cons
– No heterogeneous support
– Exact environments
– Inflexible
– Recovery is not instantaneous
– Distance constraints
Replication Technology

• Physical Replication
– EMC
– Fujitsu
– Hitachi
– Veritas

• Logical Replication
– DRNet
– GoldenGate
– RDF
– Shadowbase
Physical Replication

• Pros
– Non-intrusive
– Easy to implement
– Complementary strategy

• Cons
– No heterogeneous support
– Exact environments
– Inflexible
– Recovery is all or nothing
– Distance constraints
Logical Replication

• Pros
– Selective
– Filtering
– Mapping
– Transformation
– Active/Active
– Targeted repair
– No distance constraints
– Flexible topologies (one-to-many)

• Cons
– Not a black box implementation
Logical Replication – Further Breakdown

• Tightly Coupled/Peer to Peer


– Pros
• Fewer processes
– Cons
• Trouble with outages
• Hard to scale for high volumes
• Inflexible topologies
• Harder to implement heterogeneous capabilities

• Decoupled Architecture
– Pros
• Handle outages by design
• Create non-equal source and target pairs for better scalability
• Easy to add new platforms
• Easy to add new databases
– Cons
• More processes
Change Data Capture - Techniques
• Shadow Tables
– Pros
• Custom tailored capture
• Real-Time capture
– Cons
• Application intrusive
• Increased I/O in commit path
• Inflexible to Application changes
• Second toughest to code

• Timestamp Based
– Pros
• No modifications to the Application
• No increased I/O in commit path
• Easiest to code
– Cons
• Batch capture
• Impact on Source system
• Scripts and timestamp management

• Trigger Based
– Pros
• Custom tailored capture
• No modifications to the Application
• Real-Time capture
• Second easiest to code
– Cons
• Increased I/O in commit path
• Inflexible to Application changes

• Log Based
– Pros
• No modifications to the Application
• No increased I/O in commit path
• Custom tailored capture
• Real-Time capture
– Cons
• Toughest to code
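
The timestamp-based technique is the easiest of the four to show in code, which also makes its drawbacks visible. The following is a minimal sketch using an in-memory SQLite database; the `orders` table, its `last_modified` column, and the `capture_since` helper are all hypothetical names chosen for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT, last_modified INTEGER)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "new", 100), (2, "new", 105), (3, "shipped", 110)],
)

def capture_since(conn, checkpoint):
    """Return rows changed after the checkpoint, plus the new checkpoint."""
    rows = conn.execute(
        "SELECT id, status, last_modified FROM orders "
        "WHERE last_modified > ? ORDER BY last_modified",
        (checkpoint,),
    ).fetchall()
    new_checkpoint = rows[-1][2] if rows else checkpoint
    return rows, new_checkpoint

# First batch picks up everything; the checkpoint advances to the latest timestamp.
batch1, checkpoint = capture_since(conn, 0)

# The application changes one row; only that row appears in the next batch.
conn.execute("UPDATE orders SET status = 'shipped', last_modified = 120 WHERE id = 1")
batch2, checkpoint = capture_since(conn, checkpoint)
print(batch1)
print(batch2)
```

Each poll is a query against the source system, changes arrive in batches rather than in real time, and the script has to manage its own checkpoint. A deleted row, or an intermediate value overwritten between two polls, never shows up at all, which is part of why log-based capture is preferred despite being the hardest to build.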
GoldenGate TDM: Heterogeneity Supports Applications Running On…

Databases

Capture:
• Oracle
• DB2 UDB
• Microsoft SQL Server
• Sybase ASE
• Teradata
• Enscribe
• SQL/MP
• SQL/MX
• Ingres

Delivery:
• All listed above
• MySQL and any ODBC compatible databases

O/S and Platforms
• HP NonStop (S series, Itanium, Blades, Neoview)
• HP-UX
• HP TRU64
• Windows 2000, 2003, XP
• Linux
• Sun Solaris
• IBM AIX
• IBM z/OS
• OpenVMS
Customer Case Studies
Case Study: Bank of America
Zero Downtime for 18,000 ATMs

Business Challenges:
• 100% availability for systems supporting 18,000 ATMs
• Disaster Tolerance: Reduce switchover time
• Consolidate data from 4 geographically dispersed Data Centers into a single system
• Support active-active for HA and fraud detection
• Synchronize thousands of transactions per second, millions per day

GoldenGate Solution:
• High availability, dual-active solution with advanced conflict resolution capabilities
• Live Standby into data centers
• Enables zero downtime migrations, system upgrades

Results:
• Reduced application recovery time by 90%
• Eliminated outages for application, database, and OS upgrades

“GoldenGate offered us benefits that would also enable us to meet our long term goals.”
- Michele Schwappach, SVP Senior Technology Manager, Bank of America

[Diagram: 18,000 ATMs Continuously Available. Labels: Fraud Detection Application; Dual-Active ACI BASE24 on HP Nonstop (SF, VA); ACI Base 24 (LA, TX); Hot Backup Site: Kansas City Data Center.]
Case Study: US Bank
Active/Active for Continuous Uptime

Business Challenges:
• 100% availability for systems supporting 2,500 branches & 5,000 ATMs in the US
• Zero downtime during critical application upgrades/migrations
• Scalability as systems grow
• Load balancing and improved response times and performance
• Ability to handle data conflicts

GoldenGate Solution:
• High availability, dual-active solution with advanced conflict resolution capabilities
• Enables zero downtime migrations, system upgrades
• Started with Active/Passive and moved to an Active/Active environment
• US Bank created its own user-exits to handle data collisions

Results: Continuous uptime
• US Bank’s customers are happy. More casino customers now!

“Active-active implementations can seem like a daunting task but this should not discourage you from pursuing such a solution because the benefits are tremendous”
- Rich Rosales, Development Manager, US Bancorp

[Diagram: 5,000 ATMs & 2,500 Branches Continuously Available. Labels: Dual-Active ACI Base24 on HP Nonstop (St Paul, MN and Portland, OR); MS SQL Server Data Warehouse.]
Case Study: MGM Mirage
No Gamble for High Availability & Real-Time Data Warehousing

Business Challenges:
• Improve availability for casino marker & money management systems
• Integrate data in real-time from cage/money mgmt systems, property mgmt & players club to enterprise data warehouse (EDW)
• Improve customer service and business intelligence for marketing & customer service

GoldenGate Solution:
• GoldenGate Live Standby for real-time copies of production systems with no downtime
• GoldenGate real-time data feeds into EDW increase the value of MGM’s consolidated customer view
• Migrate Players Club system from SQL Server 2000 to 2005 & upgrade hardware (future)

Results:
• No downtime for mission critical systems
• Real-time consolidated view of customer in EDW

[Diagram: Continuously Available Applications & Single View of the Customer. Labels: Cage & Marker Mgmt. and Cage & Marker Mgmt. & Property Mgmt systems with Backups on HP Nonstop and Stratus (Bellagio, Treasure Island, MGM); Opera Property Management System (Oracle); Enterprise Data Warehouse (SQL Server 2000); Players Club Program (SQL Server 2005, SQL Server 2000).]
Thank You
cmcallister@goldengate.com
dmahon@goldengate.com

Questions?
