Sept 2002
Revision 5
Steve Strutt,
Tivoli Software, IBM UK
steve_strutt@uk.ibm.com
August 2003
Agenda
SAN Exploitation - LAN-Free backup
Performance characteristics
Requirements
LAN, Hardware, Software, Device, dependencies
SAN considerations
Device fail-over
HBA considerations
SAN design considerations
Device addressing considerations
Going Live
Testing, Diagnosing Problems
Hints and Tips
LAN-free Backup
Advantages:
- client data can be local or SAN-attached
- transparent to application/database
- takes backup traffic off the LAN
- reduces CP cycles on backup server (no I/O)
- faster speed (usually)
- only one backup server needs administration
Disadvantages:
- still requires CP cycles on client for backup I/O
- careful scheduling to avoid tape drive contention
(or exploit disk pooling)
[Diagram: client DATA sent through an FC device to DISK or TAPE: direct to tape, or disk pool staging]
Performance characteristics
LAN-Free is not necessarily faster
Only network eliminated as bottleneck
Could be other bottlenecks
Tape drives, Disk subsystem
Data types
good performance for large files and databases
small files, performance limited by file system and TSM architecture
LAN-Free to tape
Potentially better performance for large files, as bottleneck becomes file system or tape device
Small files cause tape drives to stop-start more, and drives drop out of streaming mode.
LAN-Free to disk
Ideal for small files, no stop-start overhead
Large UK Bank
TDP for Exchange to 3583 LTO
Backup 52GB/h (14.4MB/s) to single drive
Restore 51GB/h (14.2MB/s) from single drive
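
The per-second figures in brackets follow from the hourly rates by unit conversion. A minimal sketch, assuming decimal units (1 GB = 1000 MB), which is an assumption that matches the rounded figures quoted; the function name is illustrative only:

```python
def gb_per_hour_to_mb_per_sec(gb_per_hour, mb_per_gb=1000):
    """Convert a rate in GB/h to MB/s (decimal units are an assumption)."""
    return gb_per_hour * mb_per_gb / 3600


# Rates quoted on the slide for a single LTO drive.
for label, rate in (("Backup", 52), ("Restore", 51)):
    print(f"{label}: {rate} GB/h = {gb_per_hour_to_mb_per_sec(rate):.1f} MB/s")
```

The same conversion can be used to compare a measured hourly rate against the rated streaming speed of a drive, which is normally quoted in MB/s.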
Agenda - Requirements
Hardware
LAN
Library support for LAN-Free
SAN device support
Software
Evolving TSM support for LAN-Free
TSM code dependencies
Requirements - LAN
LAN-Free still requires LAN for metadata
For large files and databases
Minimal usage
Small files
Metadata on the LAN may equal or exceed the data on the SAN if files are very small.
LAN performance and loading still important
NT/W2K
Supported from 4.1.0, NT/W2K server and NT/W2K TDPs
4.2.1 for Backup Archive Client
HP-UX
Supported from 5.1.0, Full TSM device driver support
TSM 5.2
TSM Server and Storage Agent code only dependent at version and release level
Independent of PTF level
Easier to deploy and install maintenance
Clustering
Redundancy
Multiple paths
[Diagram: TSM Server in HACMP cluster with shared SCSI bus, shared disk and shared tape; TSM Clients connected over the IP network]
Failure scenario
Server running Storage Agent goes down (hardware failure, Fibre loss) while using a tape device
TSM server issues a series of error messages when it tries to dismount tapes belonging to the failed Storage Agent:
ANR8925W Drive DRIVE0 in library ATLP1000 has not been confirmed for use by server UKSAN1_SA for over 1200 seconds. Drive will be reclaimed for use by others.
ANR8336I Verifying label of DLT volume 00157D in drive DRIVE0 (MT6.1.0.1).
ANR8311E An I/O error occurred while accessing drive DRIVE0 (MT6.1.0.1) for SETMODE operation, errno = 1.
ANR8355E I/O error reading label for volume 00157D in drive DRIVE0 (MT6.1.0.1).
ANR8311E An I/O error occurred while accessing drive DRIVE0 (MT6.1.0.1) for OFFL operation, errno = 1.
ANR8469E Dismount of DLT volume 00157D from drive DRIVE0 (MT6.1.0.1) in library ATLP1000 failed.
ANR9999D mmsscsi.c(1647): ThreadId<48> Volume may still be in the drive DRIVE0 (MT6.1.0.1).
ANR8446I Manual intervention required for library ATLP1000.
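
When working through a sequence like this, it can help to pick out the warnings and errors from the informational messages. A minimal sketch, assuming each message starts with an ID of the form seen above (ANR, four digits, then a type letter); the severity names and the `classify` helper are illustrative and not part of TSM:

```python
import re

# Message IDs in the excerpt look like ANR8925W: prefix, four digits, type letter.
MESSAGE_ID = re.compile(r"^(ANR\d{4})([IWED])\s+(.*)$")
# Labels chosen for this sketch; only the letters seen in the excerpt are covered.
SEVERITY = {"I": "information", "W": "warning", "E": "error", "D": "diagnostic"}


def classify(lines):
    """Return (message_id, severity, text) for each line that starts with an ANR ID."""
    results = []
    for line in lines:
        match = MESSAGE_ID.match(line.strip())
        if match:
            number, letter, text = match.groups()
            results.append((number + letter, SEVERITY[letter], text))
    return results


sample = [
    "ANR8336I Verifying label of DLT volume 00157D in drive DRIVE0 (MT6.1.0.1).",
    "ANR8355E I/O error reading label for volume 00157D in drive DRIVE0 (MT6.1.0.1).",
    "ANR8469E Dismount of DLT volume 00157D from drive DRIVE0 (MT6.1.0.1) in library ATLP1000 failed.",
    "ANR8446I Manual intervention required for library ATLP1000.",
]

# Show only the lines that need attention.
for message_id, severity, text in classify(sample):
    if severity in ("warning", "error"):
        print(message_id, severity, text)
```

Note that ANR8446I is informational by type letter yet still calls for manual intervention, so filtering on severity alone is not enough to decide what needs an operator.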
NT/W2K
Max Scatter-Gather must be set to 65 or greater
Unable to write to new tapes on Storage Agent
W2K creates tapes which cannot be read
TSM will check for this in 4.2.1.11 and higher levels.
Issues message, unable to use drive.
W2K and LTO: use Ultrium LTO driver at level 5.0.2.4 or higher.
HBA Sharing
Sharing of disk and tape on same HBA not always supported by hardware vendors
Disk OK
Access to tape drives lost under high workload conditions: drives time out, go offline
IBM supports disk and 3590 on AIX with 6227/8 adapter under moderate workloads
IBM-SSG do not recommend sharing disk and tape in other configurations, such as LTO
Some evidence that it is OK in low workload environments, such as previous NT/SQL server LAN-Free environment.
SAN design
Most SANs designed for disk access
data flow is optimized for hosts <==> disks
stovepipe design, separate SAN islands
[Diagram: disk array, ED5000, 4300 etc., and tape devices]
Power up sequence
SAN, tape devices, then TSM Server and Storage Agents
[Diagram: tape drives behind an FC device (WWN1, WWN2), accessed by an AIX Storage Agent and a Solaris Storage Agent]

TSM Definition
Library  Lib1    lb1.0.1.3
Drive    Drive0  //./tape0
Drive    Drive1  //./tape1

AIX Storage Agent (devices /dev/rmt0, /dev/rmt1)
Path  Drive0  /dev/rmt0
Path  Drive1  /dev/rmt1

Solaris Storage Agent (devices /dev/rmt/0st, /dev/rmt/1st)
Path  Drive0  /dev/rmt/0st
Path  Drive1  /dev/rmt/1st
[Diagram: device addressing layers from TSM down to the tape drives: TSM, OS, device driver, HBA (WWN), SAN, gateway/router (WWN), SCSI bus, tape drives (ID 1, ID 2, ID 3)]
Mappings shown:
OS device name to TSM device name
SCSI ID to OS device name
Device WWN to SCSI ID
SCSI ID to LUN
Solution
Use HBA Persistent Naming
Fixes SCSI address to device WWN
Static device name mapping
Device Names remain unchanged
Fixed device name to SCSI address mapping
TSM 5.2
Automatic device tracking
Platform          Emulex                                 Qlogic
AIX               Not Applicable (use 6227/8 adapter)    Not Applicable (use 6227/8 adapter)
Windows NT/W2K    YES                                    YES (from 8.1.3 with SANblade Manager)
Solaris           YES                                    YES
Solaris
Static device naming convention
Uses symbolic link to map device name to SCSI address
ls -l /dev/rmt/*
lrwxrwxrwx 1 root other 45 Jan 3 14:22 /dev/rmt/0mt -> ../../devices/pci@1f,0/pci@1/scsi@2/mt@5,1:mt
At start of each operation TSM server and SA will check the device
is the one it expects it to be:
Windows
Initiates a search for the device and changes mapping to point to
new device and then continues operation.
UNIX
Issues message and fails operation on that device
[Diagram: tape drives behind an FC device (WWN1, WWN2), accessed by AIX and Solaris Storage Agents]
AIX Storage Agent paths: Drive0 /dev/rmt0, Drive1 /dev/rmt1
Solaris Storage Agent paths: Drive0 /dev/rmt/0st, Drive1 /dev/rmt/1st
[Diagram: Drive1 identified by drive WWN/serial number and library element number; host device names for the TSM Server, Storage Agent1, Storage Agent2 and Storage Agent3]
Solaris
Relate device name to WWN using SCSI and LUN addresses
ls -l shows device name and SCSI/LUN mapping
dmesg output shows SCSI Target address to WWN mapping
ls -l /dev/rmt/*
lrwxrwxrwx 1 root other 45 Jan 3 14:22 /dev/rmt/0mt -> ../../devices/pci@1f,0/pci@1/scsi@2/mt@5,1:mt
dmesg (/var/adm/messages)
dmesg (/var/adm/messages)
......
qla2200-hba0-SCSI-target-id-5-fibre-channel-name="100000e00201d0d7";
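
The two outputs can be tied together by matching the SCSI target number. A minimal sketch, assuming the last `@target,lun:` component of the symlink gives the SCSI target and LUN (read here as hexadecimal, which is an assumption) and that the dmesg line has the qla2200 format shown above; the helper names are illustrative:

```python
import re

# Sample lines copied from the slide; real output differs per system.
LS_LINE = ("lrwxrwxrwx 1 root other 45 Jan 3 14:22 /dev/rmt/0mt -> "
           "../../devices/pci@1f,0/pci@1/scsi@2/mt@5,1:mt")
DMESG_LINE = 'qla2200-hba0-SCSI-target-id-5-fibre-channel-name="100000e00201d0d7";'


def parse_ls(line):
    """Return (device_name, target, lun) from an 'ls -l' symlink line, or None."""
    match = re.search(r"(/dev/rmt/\S+) -> \S*@([0-9a-f]+),([0-9a-f]+):", line)
    if match is None:
        return None
    # Unit address digits are treated as hexadecimal (assumption).
    return match.group(1), int(match.group(2), 16), int(match.group(3), 16)


def parse_dmesg(line):
    """Return (target, wwn) from a qla2200 target-to-WWN line, or None."""
    match = re.search(
        r'SCSI-target-id-(\d+)-fibre-channel-name="([0-9a-fA-F]+)"', line)
    if match is None:
        return None
    return int(match.group(1)), match.group(2)


device, target, lun = parse_ls(LS_LINE)
dmesg_target, wwn = parse_dmesg(DMESG_LINE)
if target == dmesg_target:
    print(f"{device}: target {target}, LUN {lun}, WWN {wwn}")
```

Recording this mapping for every drive before going live gives a baseline to compare against if device names change after a reboot or SAN reconfiguration.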
Diagnosing Problems
Storage agent messages
TSM device utilities
Testing
Check tape hardware works reliably with TSM server in LAN configuration first
Check TDPs and B/A Client work on LAN first
Test each drive with each Storage Agent to check they are properly
defined and accessible.
Use BA Client
ANR8779E (Session: 7, Origin: UKSAN4_SA) Unable to open drive /dev/mt1, error number=2.
Diagnosing problems
Storage Agent can be run in foreground session, to see all messages.
All Storage Agent messages should be logged centrally in the server Activity Log
Can issue commands from TSM server console
storage_agent1: QUERY SESSION