[Figure: troubleshooting flowchart with YES / NO decision branches leading to Section 1 (Storage Unit Configuration Issue), Section 2 (Policy configuration), and Section 3 (Drives are down).]

Logs:
Master Server
    Unix: /usr/openv/netbackup/logs/bpsched
    Windows: <install_path>\NetBackup\logs\bpsched
Media Server
    Unix: /usr/openv/netbackup/logs/bpcd
    Windows: <install_path>\NetBackup\logs\bpcd
Client: None
Table of Contents
1 Storage Unit(s) are not configured correctly
    1.1 Maximum Concurrent Drives / Jobs
    1.2 Media Server Specification
    1.3 Robot Number
    1.4 On Demand Only
    1.5 Multiplexing
2 Policy configuration
3 Drives are down
    3.1 Checking drive status and bringing drives up
        3.1.1 Checking and upping the drives from the GUI:
        3.1.2 Checking and upping the drives from the Command Line:
        3.2.1 OS configuration issues
        3.2.2 SCSI Path issues
        3.3.3 Device issues
4 Verify connection to bpcd on the media server
    4.1 Verify that bpcd is listening
    4.2 Test bpcd connection with Telnet
5 Name Resolution between Master and Media Server
6 Links
1.5 Multiplexing
In an SSO environment, where drives are shared, a job will typically requeue with a Status Code 134 if a drive is over-committed. In some situations, due to timing, this scenario may instead result in a Status Code 219 (if a specific Storage Unit is designated in the policy). To help minimize this, try increasing the Maximum Multiplexing per drive within the Storage Unit. Allowing more streams per drive increases drive availability. (This also applies to non-SSO environments.) Note that Multiplexing can result in significantly longer restore times. See the Media Manager System Administrator's Guide for more information on this feature.
2 Policy configuration
As mentioned above, Storage Units are targeted from a policy. However, it is possible to override the policy's Storage Unit designation from within the schedule itself. If a different Storage Unit is being used than expected, ensure that no schedules within the policy are set to override the Storage Unit.
Figure 1. The Override policy storage unit option as shown in the Windows Administration Console.
3.1.2 Checking and upping the drives from the Command Line:
To determine the status of the drives, run the vmoprcmd command below on the media server in question:
    UNIX: /usr/openv/volmgr/bin/vmoprcmd -d
    Windows: <install_path>\VERITAS\volmgr\bin\vmoprcmd -d
Drive status is DOWN as shown by the vmoprcmd command:

                              DRIVE STATUS
Drv Type   Control  User  Label  RecMID  ExtMID  Ready  Wr.Enbl.  ReqId
  0 4mm    DOWN                                  No     -

Up the drives by running the following:
    UNIX: /usr/openv/volmgr/bin/vmoprcmd -up <drive_index>
    Windows: <install_path>\VERITAS\volmgr\bin\vmoprcmd -up <drive_index>
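When many drives are configured, the drive status listing can be scanned with a short script instead of by eye. The following Python sketch is only an illustration: the function name find_down_drives is hypothetical, the row layout (drive index, type, then control mode) is assumed from the DRIVE STATUS listing above, and the second row of the sample text is made up for contrast.

```python
from typing import List


def find_down_drives(vmoprcmd_output: str) -> List[int]:
    """Return the indexes of drives whose Control column starts with DOWN.

    Assumes each drive row looks like: <index> <type> <control> ...
    """
    down = []
    for line in vmoprcmd_output.splitlines():
        fields = line.split()
        # Header and title lines are skipped because their first field is not a number.
        if len(fields) >= 3 and fields[0].isdigit() and fields[2].upper().startswith("DOWN"):
            down.append(int(fields[0]))
    return down


# Sample text: the first drive row follows the listing above; the second row is invented.
sample = """\
                              DRIVE STATUS
Drv Type   Control  User  Label  RecMID  ExtMID  Ready  Wr.Enbl.  ReqId
  0 4mm    DOWN                                  No     -
  1 4mm    AVR                                   No     -
"""

for index in find_down_drives(sample):
    # Only print the command to run; the script does not change drive state itself.
    print("vmoprcmd -up %d" % index)
```

The sketch only prints the vmoprcmd -up command for each down drive, so the operator still decides whether to bring a drive up.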
*.debug and *.error log file. If it is commented out (that is, if there is a # in front of *.debug and *.error), then syslog is not recording errors at the OS level. To troubleshoot drive issues, turn on syslog logging.

The AIX errpt command can also be used to look for hardware errors:
    # errpt -a
(See Appendix for more detailed AIX logging instructions)

Linux: /var/log/syslog

Windows:
1. Right-click My Computer.
2. Select Manage.
3. Expand System Tools and then Event Viewer.
4. View the Event Viewer Application and System logs for more details on the failure.

As an application, NetBackup has no direct access to a device; instead, it relies on the operating system (OS) to handle all communication with the device. This means that during a write operation, NetBackup asks the OS to write to the device and report the success or failure of that operation. If the drives are down, there are three likely causes, described below:
May 14 16:25:51 server unix: Vendor: QUANTUM Serial Number: qj 6 i O
May 14 16:25:51 server unix: Sense Key: Aborted Command
May 14 16:25:51 server unix: ASC: 0x47 (scsi parity error), ASCQ: 0x0, FRU: 0x0

Jun 20 02:20:09 server scsi: [ID 107833 kern.warning] WARNING: /pci@8,600000/JNI,FCR@2/st@14,0 (st15):
Jun 20 02:20:09 server SCSI transport failed: reason 'tran_err': giving up
Jun 20 02:26:09 server bptm[29663]: [ID 832037 daemon.error] scsi command failed, may be timeout, scsi_pkt.us_reason = 3
Jun 20 02:26:42 server jnic146x: [ID 362195 kern.notice] jnic146x1: Link not operational. Performing reset.
Jun 20 02:26:42 server jnic146x: [ID 133166 kern.notice] jnic146x1: Link Down

Dec 18 21:11:59 server vmunix: 0/1/0/0: Unable to access previously accessed device at nport ID 0x11900.
Example of SCSI communication errors from a Windows Event Viewer System log:
20040830 10:36:30 aic78xx E9 NA The device, \Device\Scsi\aic78xx1, did not respond within the timeout period.
20040830 10:46:33 aic78xx E9 NA The device, \Device\Scsi\aic78xx1, did not respond within the timeout period.
Example of tape drive and media errors from a UNIX system (syslog, messages):
May 15 16:41:40 server unix: Error for Command: write Error Level: Fatal
May 15 16:41:40 server unix: Requested Block: 2181 Error Block: 2181
May 15 16:41:40 server unix: Vendor: QUANTUM Serial Number: qj 6 i O
May 15 16:41:40 server unix: Sense Key: Media Error
May 15 16:41:40 server unix: ASC: 0xc (write error), ASCQ: 0x0, FRU: 0x0
Nov 5 21:08:47 server avrd[21163]: Tape drive QUANTUMDLT70001 (device 1, /dev/rmt/c9t0d0BESTnb) needs cleaning. Attempting to auto-clean...
Example of tape drive and media errors from a Windows Event Viewer System log:
20040804 22:47:53 dlttape-VRTS E7 NA The device, \Device\Tape2, has a bad block.
20040806 17:52:32 dlttape-VRTS E11 NA The driver detected a controller error on \Device\Tape0.
20040806 17:52:47 dlttape-VRTS E11 NA The driver detected a controller error on \Device\Tape0.
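On a busy server, OS log messages like the examples in this section can be hard to spot by hand. The Python sketch below groups log lines into two rough buckets, SCSI path errors and drive or media errors. It is an illustration only: the function name classify, the bucket names, and the keyword lists are assumptions taken from the sample messages in this section and are not an exhaustive catalogue.

```python
import re
from collections import Counter
from typing import Iterable

# Keywords copied from the sample messages in this section; extend as needed.
PATTERNS = {
    "scsi_path": re.compile(
        r"SCSI transport failed|scsi parity error|Link Down|"
        r"did not respond within the timeout",
        re.IGNORECASE,
    ),
    "drive_or_media": re.compile(
        r"Sense Key: Media Error|write error|bad block|"
        r"needs cleaning|controller error",
        re.IGNORECASE,
    ),
}


def classify(lines: Iterable[str]) -> Counter:
    """Count log lines per bucket; a line is counted once, in the first bucket it matches."""
    counts = Counter()
    for line in lines:
        for label, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[label] += 1
                break
    return counts


sample_lines = [
    "May 14 16:25:51 server unix: ASC: 0x47 (scsi parity error), ASCQ: 0x0, FRU: 0x0",
    "Jun 20 02:20:09 server SCSI transport failed: reason 'tran_err': giving up",
    "May 15 16:41:40 server unix: Sense Key: Media Error",
    "Nov 5 21:08:47 server avrd[21163]: Tape drive QUANTUMDLT70001 (device 1, "
    "/dev/rmt/c9t0d0BESTnb) needs cleaning. Attempting to auto-clean...",
    "May 14 16:25:51 server unix: Vendor: QUANTUM Serial Number: qj 6 i O",
]

print(classify(sample_lines))
```

A high count in one bucket only suggests where to look first; the individual log entries still need to be read to identify the failing path or device.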
Run the command below on the Media Server to check whether the bpcd port is listening:
    Windows: netstat -a > c:\netstat.txt
    UNIX: netstat -a > /tmp/netstat.out
The text file that gets created should list the NetBackup services that are running (bpcd, vnetd, vopied, bpjava-msvc). Search this file to determine whether bpcd is in a LISTEN or LISTENING state:
    Windows: TCP hostname:bpcd hostname.domain.com:0 LISTENING
    UNIX: *bpcd *.* 0 0 49152 LISTEN
If bpcd isn't listening, it may need to be stopped and restarted. On Windows, bpcd can be cycled by stopping and restarting the Client service. If bpcd is listening and the drives are up, yet a 219 is still encountered, create bpsched and bpcd activity log directories and retry the operation. Check the resulting activity logs for records of an earlier failure.
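The search of the saved netstat file can also be scripted. The Python sketch below is a minimal illustration, assuming the state (LISTEN or LISTENING) is the last field on the line, as in the two sample lines above. The function name bpcd_is_listening is hypothetical, and 13782 is the bpcd port number used elsewhere in this document.

```python
def bpcd_is_listening(netstat_text: str, port: int = 13782) -> bool:
    """Return True if any LISTEN/LISTENING line refers to bpcd by name or port number."""
    # The local address may show the service name (bpcd) or the numeric port.
    suffixes = ("bpcd", ":%d" % port, ".%d" % port)
    for line in netstat_text.splitlines():
        fields = line.split()
        # Assumption: the connection state is the last field on the line.
        if not fields or not fields[-1].upper().startswith("LISTEN"):
            continue
        if any(field.lower().endswith(suffixes) for field in fields[:-1]):
            return True
    return False


# Sample lines copied from the netstat examples above.
windows_line = "TCP hostname:bpcd hostname.domain.com:0 LISTENING"
unix_line = "*bpcd *.* 0 0 49152 LISTEN"

print(bpcd_is_listening(windows_line))
print(bpcd_is_listening(unix_line))
```

To check a real capture, read c:\netstat.txt or /tmp/netstat.out into a string and pass it to the function.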
Successful telnet output example from UNIX:
    # telnet alaska bpcd
    Trying 10.10.100.20...
    Connected to alaska.min.veritas.com.
    Escape character is '^]'.
Status Code: 219 The required Storage Unit is unavailable
Press Ctrl+] and then type quit to end the telnet session.

Unsuccessful telnet output example from UNIX:
    # telnet alaska bpcd
    Trying 10.10.100.20...
    telnet: Unable to connect to remote host: Connection refused

Successful telnet output example from Windows:
    telnet alaska bpcd
    < If successful, no messages are displayed >
To stop telnet, hold down the Ctrl key and press the ] key, then release both and type quit to end the telnet session.

Unsuccessful telnet output example from Windows:
    Connecting to Alaska...Could not open a connection to host on port 13782 : Connect failed

If a telnet test fails, check again that the bpcd process (the Client service on Windows) is running and listening. If it is, it may be necessary to analyze the network configuration to determine why the bpcd port (13782) cannot be reached.
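Where telnet is not installed, the same reachability test can be approximated with a plain TCP connection attempt. The Python sketch below is only an illustration: the function name can_connect is hypothetical, and it checks only that the port accepts a connection, not that the service behind it is a healthy bpcd.

```python
import socket


def can_connect(host: str, port: int = 13782, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be opened within the timeout."""
    try:
        # create_connection resolves the host name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution failures.
        return False


# Example call (host name taken from the telnet examples above):
#     can_connect("alaska")
```

A False result matches the unsuccessful telnet cases above and calls for the same follow-up: confirm that bpcd is running and listening, then examine the network path to port 13782.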
Using bpclntcmd
bpclntcmd is a useful utility that can be run between the Master and Media Server. It helps determine whether name resolution is working properly from NetBackup's perspective. It can be run from the Media Server to the Master, or from the Master Server to the Media Server.
    Windows command location: %install_path%\VERITAS\NetBackup\bin\
    UNIX command location: /usr/openv/netbackup/bin
Switches and variations:
    bpclntcmd -pn
    bpclntcmd -self
    bpclntcmd -hn <hostname_of_masterserver>
    bpclntcmd -hn <hostname_of_mediaserver>
    bpclntcmd -ip <ip_of_masterserver>
    bpclntcmd -ip <ip_of_mediaserver>
The goal of these commands is to make sure the hostname is seen the same way after each command. Below is an explanation of what each switch does:
    -pn - The client process on the host connects to the Master Server and asks the question "Who am I?". The second line of the output is the result. This is how the client process on the host is being seen by the Master Server.
    -self - Checks how this host can be resolved. Ideally, there should be only 1 unique hostname and 1 unique IP address.
    -hn - Checks the given hostname and returns an IP.
    -ip - Checks the given IP and returns a hostname.
If there are any inconsistencies or errors in the results of these commands, correct the hostname-to-IP resolution by either editing the local hosts table or updating the DNS database.
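The consistency idea behind the -hn and -ip checks (resolve a name to an IP, then resolve that IP back to a name) can be sketched with the operating system resolver. This Python sketch is not a replacement for bpclntcmd, which reports NetBackup's own view. The function name check_round_trip is hypothetical, and comparing only the short host name (the part before the first dot) is an assumption made for this illustration. The demonstration uses lookup tables built from the host name and address in the telnet example so that it runs without a network.

```python
import socket
from typing import Callable, Tuple


def _reverse_lookup(ip: str) -> str:
    """Default reverse resolver: return the primary host name for an IP address."""
    return socket.gethostbyaddr(ip)[0]


def check_round_trip(
    hostname: str,
    forward: Callable[[str], str] = socket.gethostbyname,
    reverse: Callable[[str], str] = _reverse_lookup,
) -> Tuple[str, str, bool]:
    """Resolve hostname to an IP and back; report whether the short names agree."""
    ip = forward(hostname)
    name_back = reverse(ip)
    # Assumption: only the short name (before the first dot) is compared.
    same = name_back.lower().split(".")[0] == hostname.lower().split(".")[0]
    return ip, name_back, same


# Offline demonstration with stand-in resolvers built from the telnet example above.
forward_table = {"alaska": "10.10.100.20"}
reverse_table = {"10.10.100.20": "alaska.min.veritas.com"}

print(check_round_trip("alaska", forward_table.__getitem__, reverse_table.__getitem__))
```

Calling check_round_trip with only a host name uses the system resolver, which reads the local hosts table and DNS according to the host's configuration.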
6 Links
Click here to Search for other documents on Status Code 219
Click below to perform a search on the following relevant items:
Storage unit