Power HA Workshop
Part 1
Ing. Luciano Báez
Power HA - Workshop
Instructor
Ing. Luciano Martín BÁEZ MOYANO
lucianobaez@ar.ibm.com
luciano.baez
http://www.luchonet.com.ar
http://www.linkedin.com/in/lucianobaez
https://www.facebook.com/lucianobaez
Resources
http://ibmurl.hursley.ibm.com/NUMX
http://ibmurl.hursley.ibm.com/NUOH
May 2016
What is a cluster?
How many kinds of clusters are there?
Causes Of Downtime
Downtime refers to a period of time, or a percentage of a time span, during which a machine or system (usually a computer server) is offline or not functioning, usually as a result of either system failure (such as a crash) or routine maintenance.
Depending on the cause, the solution required is Disaster Recovery, High Availability, or Continuous Operations.
Typical user reactions to an outage, from least to most disruptive: "Huh? Did something happen?", "Checkpoint restart. Not too bad.", "Start over. Where's all my work?"
Uptime and Availability are not synonymous. A system can be up, but not available, as
in the case of a network outage.
Diagram: cluster nodes connected by a heartbeat network and shared SAN storage.
Some concepts
Not every application can run in a high-availability cluster environment, and the necessary design
decisions need to be made early in the software design phase. In order to run in a high-availability
cluster environment, an application must satisfy at least the following technical requirements:
There must be a relatively easy way to start, stop, force-stop, and check the status of the application. In practical terms, this means the application must have a command line interface or scripts to control the application, including support for multiple instances of the application (a minimal control-script sketch follows this list).
The application must be able to use shared storage (NAS/SAN).
Most importantly, the application must store as much of its state on non-volatile shared storage
as possible. Equally important is the ability to restart on another node at the last state before
failure using the saved state from the shared storage.
The application must not corrupt data if it crashes or when it restarts from the saved state.
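A minimal sketch of such a control interface, written for a hypothetical application called appA; the path, stop option, and process name are placeholders, not part of any real product:

#!/bin/ksh
# appA_ctl - hypothetical start/stop/force-stop/status wrapper for application "appA"
APP_BIN=/opt/appA/bin/appA            # placeholder path to the application binary
case "$1" in
start)      $APP_BIN ;;                                                  # start the application
stop)       $APP_BIN -shutdown ;;                                        # placeholder graceful stop
force-stop) kill -9 $(ps -ef | grep '[a]ppA' | awk '{print $2}') 2>/dev/null ;;
status)     ps -ef | grep '[a]ppA' >/dev/null && echo running || { echo stopped; exit 1; } ;;
*)          echo "usage: $0 start|stop|force-stop|status"; exit 2 ;;
esac

PowerHA calls the start and stop entry points from an application server (application controller) definition, and the status check can back an application monitor.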
Failover: If a node running a clustered resource crashes, HA clustering remedies this situation by immediately restarting the application on another node without requiring administrative intervention.
Failback: The process of moving the resource back to its original node after a failover.
Heartbeat: A connection between nodes that is used to monitor the health and status of each node in the cluster.
Split-brain: Occurs when all private links go down simultaneously but the cluster nodes are still running. If that happens, each node in the cluster may mistakenly decide that every other node has gone down and attempt to start services that other nodes are still running. Having duplicate instances of services may cause data corruption on the shared storage.
Diagram: active/passive cluster. Application A runs on the active node; the passive node monitors it over the heartbeat. When the active node fails, the FAILOVER process restarts Application A on the passive node and service access follows it.
Diagram: active/active (mutual takeover) cluster. Each node runs its own application (A and B). When one node fails, the FAILOVER process moves its application to the surviving node, which then runs both A and B.
Diagram: failover with workload prioritization. When the node running the production Application A fails, the surviving node unloads its low-importance development workload (Application B and its development database) and takes over Application A; requests for Application B are no longer served.
2 HACMP
High Availability Cluster Multiprocessing
(Now called IBM PowerHA SystemMirror)
HACMP History
HACMP 4.2.2: Introduced HAES, based on RSCT monitoring, topology and event management services from PSSP.
HACMP 4.3.1: 32-node support, node-by-node migration, Fast Connect support.
HACMP 4.4.1: Integration with Tivoli, application monitoring, cascading without fallback option, integration of HANFS functionality, selective fallover.
HACMP 4.5: Introduction of IP aliasing, persistent IP address, monitoring and recovery from loss of VG quorum.
HACMP 5.1: HAS (Classic) dropped, custom resource groups, heartbeating over IP aliases, disk heartbeating.
HACMP 5.2
HACMP 5.3: OEM volume and file system support, location dependencies, startup verification, geographic LV mirroring, IP distribution policies.
HACMP 5.4.0 / 5.4.1: Non-disruptive upgrades, fast failure detection, IPAT on XD networks, Linux on Power support, Oracle Smart Assistant, GPFS 2.3 integration, DSCLI support, intermix of DS enclosures.
PowerHA 5.5
PowerHA SystemMirror 7.1
HACMP History
From single-server systems to clusters (1989-1992), split-site clusters and multi-site disaster recovery (2000-2004), and on to third-party storage DR, HyperSwap, active-active sites and 3-site deployments (2010-2013).
1992 - HACMP cluster: active-passive failover, resource group management, planned and unplanned outage handling.
2000-2004: Disaster recovery with storage (DS8K, SVC), framework for OEM disk and file system support, location dependencies, low-cost host mirroring, file collections, fast failure detection, capacity-optimized failovers, GPFS integration, two-node rapid deployment assistant, WPAR HA management, browser-based UI, resource group dependencies, health monitoring and verification framework.
2010-2013: Flexible and uniform failover policies for one or two sites, single point of control, end-to-end integration, application-level granularity, DR with EMC, Hitachi and XIV storage, distributed server hardware management, non-disruptive upgrades, RPO of 0-3 seconds and RTO under 1 hour, self-healing.
Enterprise Edition: integrated heartbeat, Smart Assists.
Highlights:
Editions to optimize software value capture
Standard Edition targeted at datacenter HA
Enterprise Edition targeted at multi-site HA/DR
- Stretched Clusters
- Linked Clusters
Priced per processor core used, with a tiered pricing structure
- Small / Medium / Large
Diagram: HyperSwap cluster. The Application/LVM/Middleware stack on each node accesses its disks (/dev/hdiskX, /dev/hdiskY) on a primary DS8K that is Metro Mirrored to a secondary DS8K; with HyperSwap, I/O can be switched to the secondary DS8K transparently to the application.
A mirror group contains information about the disk pairs across the sites. This information is used to configure mirroring between the sites.
Mirror groups can contain a set of logical volume manager (LVM) volume groups and a set of raw disks that are not managed by the AIX operating system.
All the disk devices that are associated with the LVM volume groups and raw disks that are part of a mirror group are configured for consistency. For example, the IBM DS8800 views a mirror group as one entity regarding consistency management during replication.
The following types of mirror groups are supported:
User mirror group: Represents the middleware-related disk devices. The HyperSwap function is prioritized internally by PowerHA SystemMirror and is considered low priority.
System mirror group: Represents a critical set of disks for system operation, such as rootvg disks and paging space disks. These types of mirror groups are used for mirroring a copy of data that is not used by any node or site other than the node that hosts these disks.
Repository mirror group: Represents the cluster repository disks that are used by Cluster Aware AIX (CAA).
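On PowerHA levels that support HyperSwap, mirror groups are managed through clmgr (or smit). The object-class name below is an assumption to verify against the clmgr built-in help on your release:

clmgr query cluster                 # basic cluster attributes (clmgr ships with PowerHA 7.1 and later)
clmgr query mirror_group            # assumed object class; would list the defined mirror groups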
Pros:
Automated action on acquisition of resources (bound to the PowerHA application server)
HMC verification: checking for connectivity to the HMC
Ability to grow the LPAR on failover
Save money on PowerHA SystemMirror licensing
Cons:
Requires connectivity to the HMC
Potentially slower failover to the full system specs (it can take a lot of time)
Diagram: LPAR A and LPAR B communicate over ssh with an HMC and a backup HMC.
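A quick way to confirm the ssh path from each cluster node to the HMCs, and the kind of DLPAR operation that can be driven on failover; the HMC user, hostnames, managed-system and LPAR names below are placeholders:

ssh hscroot@hmc1 lssyscfg -r sys -F name                      # list managed systems visible to the HMC
ssh hscroot@hmc1 lssyscfg -r lpar -m P780_A -F name,state     # list LPARs on one managed system
# Example DLPAR operation: add one dedicated processor to the takeover LPAR
ssh hscroot@hmc1 chhwres -r proc -m P780_A -o a -p lpar_nodeB --procs 1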
PowerHA/HACMP topology
Networking components
PowerHA/HACMP topology
Resource components
Resource components include service IP labels/addresses, volume groups, applications, NFS exports, and file systems.
Diagram: resource groups RG A, RG B, RG C, and RG D distributed across the cluster nodes.
Availability components
Not just PowerHA: The final high availability solution goes beyond PowerHA. A high availability solution comprises a reliable OS (AIX), applications that are tested to work in an HA cluster, storage devices, appropriate selection of hardware, trained administrators, and thorough design and planning.
So what is PowerHA/HACMP?
It is an application which acts as a topology manager, a resource manager, an event manager, and an SNMP manager. The SNMP side works together with snmpd and feeds clinfoES, which the clstat utility uses to display cluster status.
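On a node with PowerHA installed, these pieces can be observed with standard commands; the paths below are the usual PowerHA install locations:

lssrc -g cluster                                  # SRC subsystems of the cluster group (clstrmgrES, clinfoES, ...)
lssrc -ls clstrmgrES                              # detailed state of the cluster manager
/usr/es/sbin/cluster/clstat -a                    # ASCII cluster status display (uses clinfoES)
/usr/es/sbin/cluster/utilities/clRGinfo           # resource group state per node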
PowerHA topology
The cluster topology represents the physical view of the cluster and how the hardware cluster components are connected using networks (IP and non-IP). To understand the operation of PowerHA, you need to understand the underlying topology of the cluster, the role each component plays, and how PowerHA interacts with them (a quick way to display the configured topology is shown after the list below). In this section we describe:
PowerHA cluster
Nodes
Sites
Policies (Split and Merge)
Networks (physical, logical, labels, aliases, multicasting, unicasting, etc.)
Communication interfaces / devices
Persistent and Service node IP labels / addresses
Network modules (NIMs)
Topology and group services
Clients
etc.
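On an existing cluster, the configured topology can be displayed with two utilities that ship with PowerHA under /usr/es/sbin/cluster/utilities:

/usr/es/sbin/cluster/utilities/cltopinfo      # cluster, node, network, and interface summary
/usr/es/sbin/cluster/utilities/cllsif         # per-node listing of interfaces, networks, and IP labels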
PowerHA topology
Networks
In PowerHA, the term network is used to define a logical entity that groups the communication interfaces and
devices used for communication between the nodes in the cluster, and for client access. The networks in
PowerHA can be defined as IP networks and non-IP networks. The following terms are used to describe
PowerHA networking:
PowerHA topology
IP Address takeover mechanism
One of the key roles of PowerHA is to keep the service IP labels/addresses highly available. PowerHA does this by starting and stopping each service IP address as required on the appropriate interface. When a resource group is active on a node, PowerHA supports two methods of activating the service IP addresses: IP address takeover (IPAT) via IP aliases and IPAT via IP replacement.
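PowerHA performs the aliasing itself, but the underlying operation is an ordinary AIX interface alias; the interface name and address below are placeholders, shown only to illustrate the mechanism:

ifconfig en0 alias 192.168.100.10 netmask 255.255.255.0 up    # add a service IP as an alias on en0
netstat -in                                                   # list every address configured per interface
ifconfig en0 delete 192.168.100.10                            # remove the alias again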
PowerHA topology
Persistent IP Label or Address
A persistent node IP label is an IP alias that can be assigned to a network for a specified node. A
persistent node IP label is a label that:
Always stays on the same node (is node-bound)
Co-exists with other IP labels present on the same interface
Does not require installation of an additional physical interface on that node
Is not part of any resource group
Assigning a persistent node IP label for a network on a node allows you to have a highly available
node-bound address on a cluster network. This address can be used for administrative purposes
because it always points to a specific node regardless of whether PowerHA is running.
Note: It is only possible to configure one persistent node IP label per network per node. For
example, if you have a node connected to two networks defined in PowerHA, that node can be
identified via two persistent IP labels (addresses), one for each network.
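Persistent labels are usually defined through smit or clmgr. The clmgr form below is only a sketch, with placeholder label, network, and node names; the object class and attribute names should be verified against the clmgr help on your release:

clmgr add persistent_ip nodeA_pers NETWORK=net_ether_01 NODE=nodeA    # assumed syntax
clmgr query persistent_ip                                              # list the defined persistent labels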
PowerHA topology
Device based or serial networks
Serial networks are designed to provide an alternative method for exchanging information using
heartbeat packets between cluster nodes. In case of IP subsystem or physical network failure,
PowerHA can still differentiate between a network failure and a node failure when an independent
path is available and functional.
Serial networks are point-to-point networks, and therefore, if there are more than two nodes in
the cluster, the serial links should be configured as a ring, connecting each node in the cluster.
Even though each node is only aware of the state of its immediate neighbors, the RSCT daemons
ensure that the group leader is aware of any changes in state of any of the nodes.
Even though it is possible to configure a PowerHA cluster without non-IP networks, we strongly
recommend that you use at least one non-IP connection between each node in the cluster.
The following devices are supported for non-IP (device-based) networks in PowerHA:
Serial RS232 (rs232)
Target mode SCSI (tmscsi)
Target mode SSA (tmssa)
Disk heartbeat (diskhb)
Multi-node disk heartbeat (mndhb)
PowerHA topology
Split policy
A cluster split event can occur between sites when a group of nodes cannot communicate with the
remaining nodes in a cluster. For example, in a linked cluster, a split occurs if all communication
links between the two sites fail. A cluster split event splits the cluster into two or more partitions.
The following options are available for configuring a split policy:
None: A choice of None indicates that no action will be taken when a cluster split event is detected. Each partition that is created by the cluster split event becomes an independent cluster. Each partition can start a workload independent of the other partition. If shared volume groups are in use, this can potentially lead to data corruption. This option is the default setting, since manual configuration is required to establish an alternative policy. Do not use this option if your environment is configured to use HyperSwap for PowerHA SystemMirror.
Tie breaker: A choice of Tie Breaker indicates that a disk will be used to determine which
partitioned site is allowed to continue to operate when a cluster split event occurs. Each partition
attempts to acquire the tie breaker by placing a lock on the tie breaker disk. The tie breaker is a
SCSI disk that is accessible to all nodes in the cluster. The partition that cannot lock the disk is
rebooted. If you use this option, the merge policy configuration must also use the tie breaker
option.
PowerHA topology
Merge policy
Depending on the cluster split policy, the cluster might have two partitions that run independently of
each other. You can use PowerHA SystemMirror Version 7.1.2, or later, to configure a merge
policy that allows the partitions to operate together again after communications are restored
between the partitions.
The following options are available for configuring a merge policy:
Majority: The partition with the highest number of nodes remains online. If each partition has
the same number of nodes, then the partition that has the lowest node ID is chosen. The
partition that does not remain online is rebooted, as specified by the chosen action plan. This
option is available for linked clusters. For stretched clusters to use the majority option, your environment must be running one of the following versions of the AIX operating system:
*IBM AIX 7 with Technology Level 4, or later
*AIX Version 7.2, or later.
Tie breaker: Each partition attempts to acquire the tie breaker by placing a lock on the tie
breaker disk. The tie breaker is a SCSI disk that is accessible to all nodes in the cluster. The
partition that cannot lock the disk is rebooted, or has cluster services restarted, as specified by
the chosen action plan. If you use this option, your split policy configuration must also use the
tie breaker option.
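Split and merge policies are set at the cluster level, through smit or clmgr. The attribute names in the sketch below are assumptions based on recent clmgr levels and should be verified with the clmgr built-in help; the tie breaker disk name is a placeholder:

clmgr modify cluster SPLIT_POLICY=tiebreaker MERGE_POLICY=tiebreaker TIEBREAKER=hdisk10   # assumed attribute names
clmgr query cluster | grep -i -E 'split|merge|tiebreak'                                   # review the resulting settings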
PowerHA topology
Highly available NFS server
The highly available NFS server functionality is included in the PowerHA SystemMirror product
subsystem.
A highly available NFS server allows a backup processor to recover current NFS activity should the
primary NFS server fail. The NFS server special functionality includes highly available modifications
and locks on network file systems (NFS).
You can do the following:
Use the reliable NFS server capability that preserves locks and dupcache (2-node
clusters only if using NFS version 2 and version 3)
Specify a network for NFS cross-mounting
Define NFS exports and cross-mounts at the directory level
Specify export options for NFS-exported directories and file systems
Configure two nodes to use NFS.
PowerHA SystemMirror clusters can contain up to 16 nodes. Clusters that use NFS version 2 and
version 3 can have a maximum of two nodes, and clusters that use NFS version 4 can have a
maximum of 16 nodes.
PowerHA topology
PowerHA SystemMirror common cluster configurations
Standby configurations: Standby configurations are the traditional redundant hardware configurations where
one or more standby nodes stand idle, waiting for a server node to leave the cluster. (Standby configurations with
online on home node only startup policy, Standby configurations with online using distribution policy startup)
Takeover configurations: In the takeover configurations, all cluster nodes do useful work, processing part of the
cluster's workload. There are no standby nodes. Takeover configurations use hardware resources more efficiently
than standby configurations since there is no idle processor. Performance can degrade after node detachment,
however, since the load on remaining nodes increases. (One-sided takeover, Mutual takeover, Two-node mutual
takeover configuration, Eight-node mutual takeover configuration)
Cluster configurations with multitiered applications: A typical cluster configuration that could utilize parent
and child dependent resource groups is the environment in which an application such as WebSphere depends on
another application such as DB2.
Cluster configurations with resource group location dependencies: You can configure the cluster so that
certain applications stay on the same node, or on different nodes not only at startup, but during fallover and
fallback events. To do this, you configure the selected resource groups as part of a location dependency set.
Cross-site LVM mirror configurations for disaster recovery: You can set up disks that are located at two
different sites for remote LVM mirroring, using a storage area network (SAN).
Cluster configurations with dynamic LPARs: The advanced partitioning features of AIX provide the ability to
dynamically allocate system CPU, memory, and I/O slot resources (dynamic LPAR).
PowerHA topology
Tasks to configure the cluster infrastructure, and ownership:
Network Admin: plan out IP addresses, hard-set interface IPs, document DNS names, update /etc/hosts.
Storage Admin: share storage, drivers/filesets, assign LUNs, alter the SAN infrastructure, zoning.
Application Admin: install applications, start/stop scripts, space requirements, optimize the configuration for performance.
HA Admin: HA cluster installation/deployment, topology and resource setup, fallover testing, monitoring the environment.
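Part of the network-admin task of updating /etc/hosts is keeping identical entries on every node for the base, persistent, and service addresses; a minimal sketch with placeholder addresses and labels:

# /etc/hosts excerpt (placeholder addresses and IP labels, identical on all nodes)
10.1.1.1    nodeA_base     # base (boot) address of node A
10.1.1.2    nodeB_base     # base (boot) address of node B
10.1.2.1    nodeA_pers     # persistent address of node A
10.1.2.2    nodeB_pers     # persistent address of node B
10.1.3.10   app_svc        # service IP label of the application resource group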
Questions