Contents
Sources of troubleshooting information
  Where LEDs appear
  Where messages are displayed
  AutoSupport email messages help with troubleshooting
  Forms and use of diagnostic tools
  Where to find documentation
Storage system LEDs
  20xx and SA200 system LEDs
    Location and meaning of LEDs on the front of 20xx and SA200 chassis
    Location and meaning of LEDs on the back of 20xx and SA200 controller modules
    Location and meaning of 20xx and SA200 PSU LEDs
  FAS22xx system LEDs
    Location and meaning of LEDs on the front of 22xx chassis
    Location and meaning of LEDs on the back of 22xx controllers
    Location and meaning of 22xx internal drive LEDs
    Location and meaning of 22xx PSU LEDs
    Location and meaning of 22xx internal FRU LEDs
  FAS25xx system LEDs
    Location and meaning of LEDs on the front of FAS2520, FAS2552, and FAS2554 chassis
    Location and meaning of FAS25xx internal drive LEDs
    Location and meaning of LEDs on the back of FAS2520 controllers
    Location and meaning of LEDs on the back of FAS255x controllers
    Location and meaning of FAS25xx PSU LEDs
    Location and meaning of FAS25xx internal FRU LEDs
  SA300 system LEDs
    Location and meaning of LEDs on the front of SA300 controllers
    Location and meaning of LEDs on the back of SA300 controllers
    Location and meaning of SA300 fan LEDs
    Location and meaning of SA300 PSU LEDs
  31xx system LEDs
4 | Hardware Platform Monitoring Guide
Your system also logs messages. See the System Administration Guide for the version of Data
ONTAP that your system is running for information about message logs.
Additional information about messages that appear on your system console or in logs may be
available through the Syslog Translator on the NetApp Support Site at
support.netapp.com/eservice/ems.
System-level diagnostics
You can find system-level diagnostics on FAS22xx, FAS25xx, 32xx, 62xx, and FAS80xx systems by
entering sldiag commands at the Maintenance mode prompt. The sldiag commands enable you to
specify devices, tests, and options; run diagnostics based on the command; and then view the
results. These commands are documented in the relevant man pages and in the command reference
documents on the NetApp Support Site at mysupport.netapp.com.
Additional information about system-level diagnostics is available in the System-Level
Diagnostics Guide on the NetApp Support Site at mysupport.netapp.com.
SYSDIAG tool
The SYSDIAG tool is available on older systems by entering the boot_diags command at the boot
environment prompt and then navigating menu options. The command boots the diagnostic program
and then displays the Diagnostic Monitor, the interface providing access to diagnostic menus.
After you select and run a test, the SYSDIAG tool generates a message and displays it on the
system console if the test finds an error.
Additional information about the SYSDIAG tool is available in the Diagnostics Guide on the
NetApp Support Site at mysupport.netapp.com.
Location and meaning of LEDs on the front of 20xx and SA200 chassis
You can check the LEDs on the front of the system to learn whether the power is turned on, whether
there is activity on the controller, whether the system is halted, or whether there is a fault in the
chassis.
The following illustration shows the LEDs on the front of the 20xx and SA200 chassis:
Storage system LEDs | 31
1 Power LED
2 Fault LED
The following table explains what the LEDs on the front of the chassis mean:
Note: If an internal disk drive fails or is disabled, the fault light on the front of the chassis turns on.
When you remove the faulty or disabled disk drive, the fault light turns off. However, the failure
of disk drives in expansion disk shelves does not affect the fault light on the front of the chassis.
Location and meaning of LEDs on the back of 20xx and SA200 controller
modules
You can check the LEDs on the back of the controller module to learn whether the controller module
is functioning properly, or to learn the status of the system network, disk shelf connections, or
NVMEM.
The following LEDs are on the back of the controller module:
Fibre Channel port
Remote management port
Ethernet port
NVMEM
Controller module fault
The following illustration shows the location of LEDs on the rear of 2050 and SA200 controller
modules:
The LEDs on the back of 2020 controller modules are the same as on the back of 2050 and SA200
controller modules, except for the placement of some labels.
The following illustration shows the location of LEDs on the back of 2040 controller modules:
The following table explains what the LEDs on the back of the controller modules mean:
Attention: Do not replace DIMMs or any other system hardware when the NVMEM LED is
blinking. Doing so might cause you to lose data. Always flush NVMEM contents to disk by
entering a halt command at the system prompt before replacing the hardware.
Attention: To protect critical data in NVMEM, you cannot update BIOS or BMC firmware when
NVMEM is in use. Before updating firmware, ensure that NVMEM no longer contains critical
data by performing a halt command to cleanly shut down Data ONTAP. When the system
reboots to the boot environment prompt, you can update your firmware.
1 AC LED
2 Fault LED
1 LEDs
2240-4 systems have 4U chassis, but the placement and function of the LEDs are the same as on
2220 and 2240-2 systems.
The following table shows what the LED labels look like and explains what the LEDs mean:
The shelf ID digital display shows the shelf ID of the chassis, which contains disk drives.
Note: If the 2220 or 2240 system has no attached disk shelves, then the chassis can have any ID
number. However, if disk shelves are attached, the chassis shelf and attached disk shelves must
have unique ID numbers.
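The uniqueness rule in the preceding note can be expressed as a small check. The following Python sketch is illustrative only: the function name find_duplicate_shelf_ids and the representation of shelf IDs as plain integers are assumptions made for this example and are not part of any NetApp tool.

```python
from collections import Counter


def find_duplicate_shelf_ids(chassis_id, attached_shelf_ids):
    """Return the sorted shelf IDs that are used more than once.

    chassis_id: ID shown on the chassis shelf ID display (hypothetical input).
    attached_shelf_ids: IDs of the attached disk shelves; may be empty.
    """
    # With no attached disk shelves, the chassis can have any ID number.
    if not attached_shelf_ids:
        return []
    # Otherwise the chassis shelf and every attached shelf must be unique.
    counts = Counter([chassis_id, *attached_shelf_ids])
    return sorted(shelf_id for shelf_id, n in counts.items() if n > 1)


if __name__ == "__main__":
    print(find_duplicate_shelf_ids(0, []))          # no shelves attached
    print(find_duplicate_shelf_ids(0, [1, 2]))      # all IDs unique
    print(find_duplicate_shelf_ids(1, [1, 2, 2]))   # conflicting IDs
```

An empty result means that the ID assignment satisfies the rule; any returned ID must be changed on one of the shelves that share it.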
When the bezel is removed, a third LED, indicating activity, is revealed below the fault LED. The
following table shows what the activity LED label looks like and explains what the LED means.
2 SAS ports
5 Optional mezzanine card LEDs (either 2/4/8 Gbps FC or 10 GbE) (2240 systems only)
6 Optional mezzanine card ports (either 2/4/8 Gbps FC or 10 GbE) (2240 systems only)
7 Serial port
8 USB port
If you have a 2240 system, the optional mezzanine card provides one of the following sets of ports:
Two 2/4/8 Gbps FC ports, each with one LNK LED
Two 10-GbE ports, each with one activity LED and one LNK LED
The following table describes the meaning of the LEDs on the back of the controller:
1 Activity LED
2 Fault LED
The following illustration shows the front of a disk drive carrier in 2240-2 systems and the location
of its two LEDs:
1 Activity LED
2 Fault LED
Although the drive carriers differ in appearance, the behavior of the LEDs is the same. The following
table explains what the LEDs mean:
1 PSU OK
2 DC fault
3 AC fault
4 Fan fault
The following illustration shows the location of PSU LEDs on the back of the 2240-4 system:
1 Fan fault
2 AC fault
3 PSU OK
4 DC fault
The following table describes what the PSU LEDs on 22xx systems mean:
The following illustration shows the LEDs on the front of a FAS2554 system with the bezel in place:
1 LEDs
The shelf ID LCD display shows the shelf ID of the chassis, which contains disk drives.
Note: If the system has no attached disk shelves, the chassis can have any ID number. However, if
disk shelves are attached, the chassis shelf and attached disk shelves must have unique ID
numbers.
When the bezel is removed, a third LED, indicating activity, is revealed below the fault LED. The
following table shows what the activity LED label looks like and explains what the LED means:
1 Activity LED
2 Fault LED
The following illustration shows the front of a disk drive carrier in FAS2552 systems and the
location of its two LEDs:
1 Activity LED
2 Fault LED
Although the drive carriers differ in appearance, the behavior of the LEDs is the same. The following
table explains what the LEDs mean:
1 SAS ports
7 Serial port
The following table describes the meaning of the LEDs on the back of the controller:
1 SAS ports
7 Serial port
The following table describes the meaning of the LEDs on the back of the controller:
1 PSU OK
2 DC fault
3 AC fault
4 Fan fault
The following illustration shows the location of PSU LEDs on the back of the FAS2554 system:
1 Fan fault
2 AC fault
3 PSU OK
4 DC fault
The following table describes what the PSU LEDs on FAS25xx systems mean:
1 Activity LED
2 Status LED
3 Power LED
1 FC port LEDs
3 RLM LEDs
The following table explains what the LEDs on the back of the controller mean:
The fan module FRU LED is amber and illuminates when a problem occurs in the fan. If you see an
error message that indicates a fan problem, you can remove the bezel and use the illuminated fan
FRU LED to locate the FRU in which the problem occurred.
1 PSU 1
2 PSU 2
3 PSU LEDs
When the bezel is in place, the LEDs are arranged horizontally in the following left-to-right order:
Power
Fault
Controller A activity
Controller B activity
Controller A is the controller in the top of the chassis, and Controller B is the controller in the bottom
of the chassis.
Note: When the bezel is removed, the LEDs are arranged vertically in the following top-to-bottom
order:
Power
Fault
Controller A activity
Controller B activity
The following table shows what the LED labels look like and explains what the LEDs mean:
The following table explains the behavior of the LEDs on the back of the controller:
The fan module FRU LED is amber and turns on when a problem occurs in the fan. If you see error
messages indicating a fan problem, you can remove the bezel and use the illuminated fan FRU LED
to locate the FRU where the problem occurred.
1 Fault LED
2 Power LED
The following table describes what the AC PSU and DC PSU LEDs mean:
Location and meaning of LEDs on the front of 32xx and SA320 chassis
You can check the LEDs on the front of the chassis to learn whether the power is turned on, the
controller is active, the system is halted, or a fault in the chassis has occurred.
The following illustration shows the LEDs on the front of the chassis:
1 LEDs
When the bezel is in place, the LEDs are arranged horizontally in the following left-to-right order:
Power
Fault
Controller A activity
Controller B activity
When two controllers are installed in the chassis, Controller A is the controller in the top bay and
Controller B is the controller in the bottom bay. When a controller and an I/O expansion module are
installed in the chassis, the controller is always in the top bay and the I/O expansion module is
always in the bottom bay.
The following table shows what the LED labels look like and explains what the LEDs mean:
Location and meaning of LEDs on the back of 32xx and SA320 controllers
You can check the LEDs on the back of the controller to learn the status of its network or disk shelf
connections, or, in an HA pair, to identify the controller where a fault occurred.
The following illustration shows the ports and LEDs on the back of the controller:
2 SAS ports
3 HA port LEDs (LEDs pointing up belong to the upper port; LEDs pointing down belong to the lower port)
4 HA ports
5 Fibre Channel port LEDs (the LED pointing up belongs to the upper port; the LED pointing down belongs to the lower port)
8 1-GbE ports
11 USB (top) and serial console (bottom) ports (External USB devices are not currently supported)
13 NVMEM LED
The following table describes the meaning of the LEDs on the back of the controller:
Location and meaning of LED on the back of 32xx and SA320 I/O expansion
modules
You can check the back of the I/O expansion module to detect whether a fault has occurred.
The following illustration shows the ports and LEDs on the back of an I/O expansion module:
2 Fault LED
The following table describes the meaning of the LED on the I/O expansion module:
1 LED
The fan module FRU LED is amber and illuminates when a problem occurs in the fan. If you see
error messages indicating a fan problem, you can remove the bezel and use the illuminated fan FRU
LED to locate the FRU where the problem occurred.
1 Fault LED
2 Power LED
Location and meaning of LEDs on the front of 60xx and SA600 controllers
You can check the LEDs on the front of the controller to learn whether the power is turned on,
whether the system is active, whether the system is halted, or whether there is a fault in the chassis.
The following illustration shows the LEDs on the front of the controller.
1 Activity LED
2 Status LED
3 Power LED
The following table explains what the LEDs on the front of the controller mean:
Location and meaning of LEDs on the back of 60xx and SA600 controllers
You can check the LEDs on the back of the controller to learn the status of network and disk shelf
connections.
The following illustration shows the location of LEDs on the back of the controller:
The following table explains what the LEDs on the rear of the controller mean:
1 Fan
2 LEDs
1 LEDs
2 Power supply
Location and meaning of LEDs on the front of 62xx and SA620 chassis
You can check the LEDs on the front of the chassis to learn whether the power is turned on, the
controller is active, the system is halted, or a fault in the chassis has occurred.
The following illustration shows the LEDs on the front of the 62xx and SA620 chassis:
1 Chassis LEDs
When the bezel is in place, the LEDs are arranged horizontally in the following left-to-right order:
Power
Fault
Controller A activity
Controller B activity
When two controllers are installed in the chassis, Controller A is the controller in the top bay, and
Controller B is the controller in the bottom bay. When a controller and an I/O expansion module are
installed in the chassis, the controller is always in the top bay and the I/O expansion module is
always in the bottom bay.
Note: When the bezel is removed, the LEDs are arranged vertically in the following top-to-bottom
order:
Power
Fault
Controller A activity
Controller B activity
The following table shows what the LED labels look like and explains what the LEDs mean:
Location and meaning of LEDs on the back of 62xx and SA620 controllers
You can check the LEDs on the back of the controller to learn the status of its network or disk shelf
connections, or, in an HA pair, to identify the controller where a fault occurred.
The following illustration shows the LEDs on the left side of the back of the 62xx and SA620
controllers:
7 GbE port
8 10-GbE ports
The following table describes the meaning of the LEDs on the left side of the back of the controller:
The following illustration shows the location of ports and LEDs on the right side of the back of the
controller.
1 USB port
4 Serial port
The following table describes the meaning of the LEDs on the right side of the back of the controller:
Location and meaning of the 62xx and SA620 I/O expansion module LED
You can check the back of the I/O expansion module to check whether a fault has occurred.
The following illustration shows the ports and LED on the back of a 62xx and SA620 I/O expansion
module:
1 Fault LED
2 PCIe slots
The following table describes the meaning of the LED on the I/O expansion module:
1 LED
The fan module FRU LED is amber and illuminates when a problem occurs in the fan. If you see
error messages indicating a fan problem, you can remove the bezel and use the illuminated fan FRU
LED to locate the FRU where the problem occurred.
1 Fault LED
2 Power LED
10-GbE slot
I/O slots (2)
The following FRU LEDs are in the I/O expansion module:
PCIe slots
I/O slots
FRU LEDs are off when the FRU is functioning normally and turn amber when a problem occurs.
They stay lit for at least 10 minutes even after you remove the controller or I/O expansion module
from the chassis.
1 Chassis LEDs
When the bezel is in place, the LEDs are arranged horizontally in the following left-to-right order:
Power
Attention
Controller A activity
Controller B activity
When two controllers are installed in the chassis, Controller A is the controller in the top bay, and
Controller B is the controller in the bottom bay.
Note: When the bezel is removed, the LEDs are arranged vertically in the following top-to-bottom
order:
Power
Attention
Controller A activity
Controller B activity
The following table shows what the chassis LED labels look like and explains what the LEDs mean:
1 Chassis LEDs
When the bezel is in place, the LEDs are arranged horizontally in the following left-to-right order:
Power
Attention
Controller A activity
Controller B activity
Note: The Controller B activity LED does not illuminate on FAS80xx systems equipped with I/O
expansion modules (IOXM).
Controller A is the controller in the top bay, and Controller B is the controller in the bottom bay.
When the bezel is removed, the LEDs are arranged vertically in the following top-to-bottom order:
Power
Attention
Controller A activity
Controller B activity
Note: The Controller B activity LED does not illuminate on FAS80xx systems equipped with I/O
expansion modules (IOXM).
The following table shows what the chassis LED labels look like and explains what each one means:
2 SAS ports
3 10GbE port LEDs (LEDs pointing up belong to the upper port; LEDs pointing down belong to the lower port)
4 10GbE ports
5 UTA2 (CNA) data network port LEDs (LEDs pointing up belong to the upper port; LEDs pointing down belong to the lower port)
9 Management port LEDs: remote management (top) and private management (bottom)
11 USB (top) and serial (bottom) ports (External USB devices not currently supported)
12 NVRAM LED
The following table describes the meaning of the LEDs on the back of the controller:
2 SAS ports
4 10GbE ports
7 NVRAM LED
The following table describes the meaning of the LEDs on the left side of the back of the controller:
The following illustration shows the location of ports and LEDs on the right side of the back of the
controller:
2 1000Base-T ports
8 Serial port
The following table describes the meaning of the LEDs on the right side of the back of the controller:
1 PCIe slots
2 Attention LED
3 HA interconnect ports
The following table describes the meaning of the LEDs on the I/O expansion module:
1 LED
The fan module FRU LED is amber and illuminates when a problem occurs in the fan. If you see
error messages indicating a fan problem, you can remove the bezel and use the illuminated fan FRU
LED to locate the FRU where the problem occurred.
1 LED
The fan module FRU LED is amber and illuminates when a problem occurs in the fan. If you see
error messages indicating a fan problem, you can remove the bezel and use the illuminated fan FRU
LED to locate the FRU where the problem occurred.
1 Attention LED
2 Power LED
RTC battery
NVRAM battery
FRU LEDs are off when the FRU is functioning normally and turn amber when a problem occurs.
They remain illuminated for at least 10 minutes even after you remove the controller from the
chassis.
NVRAM7 31xx
NVRAM8 62xx
NVRAM9 FAS80xx
The following table explains what the LEDs on an NVRAM5 or NVRAM6 adapter mean:
1 LED
2 Media converter
The following table explains what the LED on NVRAM5 and NVRAM6 media converters means:
One LED is near the right rear corner of the motherboard. It is labeled D87.
You can see this LED through the rear grille of the controller as shown in the following
illustration:
1 NVRAM7 LED
Attention: NVRAM7 LEDs flash red if unwritten data is being held in the NVRAM when power
to the controller is turned off. If you remove the NVRAM7 battery or NVRAM7 DIMM when the
red LEDs are flashing, you lose data that is being held in the NVRAM.
Note: In an HA pair, each node continually monitors its partner and mirrors its partner's NVRAM
data. Therefore, if you remove a controller from a 31xx system in an HA pair without first shutting
it down, you can disregard the illuminated NVRAM LEDs on the motherboard of the removed
controller.
Port 0 link and activity LEDs are relevant when port 0 of the controller is connected to a partner in an
HA pair. The following table explains the meaning of the port 0 LEDs:
Port 1 LEDs reflect the state of the port 1 connector used between two controllers installed in
different chassis or the state of the internal InfiniBand connection used between two controllers
installed in the same chassis. The following table explains the meaning of the port 1 LEDs:
Port 1 LEDs depend on the state of the Internal link select LED, which in HA pair configurations
depends on how the controllers are connected. The following table explains the meaning of the
internal link select LED:
A destage status LED, located on the top of the adapter board, is visible through the grille of the
faceplate halfway between the top of the faceplate and the InfiniBand port 0 LEDs. The LED shows
the status of NVRAM8 data after an unexpected loss of system power.
Data might need to be destaged, that is, saved from active DRAM to nonvolatile flash memory, after an
unexpected power loss. Destaging lasts about one minute. After data has been destaged, it must be
restaged, that is, restored from nonvolatile flash memory to active DRAM, during system initialization.
The destage LED might be lit as red or green. Its behavior depends on whether the system power is
on or off. When the system power is off, the LED behavior depends on whether the NVRAM8
adapter is running on battery power. The battery automatically turns off after data is destaged.
The following table explains the meaning of the destage status LED when the NVRAM8 adapter is in
the controller:
You can use the destage status LED when the adapter is removed from the system to determine
whether destage data is in the NVRAM8 adapter.
The following illustration shows the location of the destage status LED:
You activate the destage status LED when the NVRAM8 adapter is removed from the controller by
pressing and holding the button marked both SW6 and STATUS on the bottom of the adapter board.
The following illustration shows the location of the button:
The LED can light up red or green and also show both colors simultaneously, creating a light that
appears amber. The following table explains the meaning of the destage status LED when the button
is pressed:
The external NVRAM9 status LED is found on the rear face of the chassis, as seen in the
following illustration:
This LED is labeled NV, as shown in the following illustration:
The NVRAM9 DIMM attention LED is found behind the DIMM and near the edge of the
NVRAM9 adapter. This LED is labeled FRU LED5.
The internal NVRAM9 status LED is found on the corner of the NVRAM9 adapter. This LED is
labeled Destage LED3.
You can see LED3 and LED5 only after you remove the controller from the chassis, as shown in
the following illustration:
Attention: If you remove the NVRAM9 battery or the NVRAM9 DIMM when the green LEDs are
flashing, you lose data that is being held in the NVRAM. The NVRAM9 battery can be removed
after the destage is completed without loss of data.
Note: In an HA pair, each node continually monitors its partner and mirrors its partner's NVRAM
data. If you remove a controller from an FAS80xx system in an HA pair without first shutting it
down, you can disregard the illuminated NVRAM LEDs on the motherboard of the removed
controller.
Adapter card LEDs | 113
3 Port a
4 Port b
The ports in the preceding illustration are labeled a and b because Data ONTAP identifies ports
alphabetically. The physical ports are labeled Port 1 for Port a and Port 2 for Port b.
Note: These HBAs are supported only in target mode and single system image controller failover
cfmode. You cannot use this HBA as an initiator to connect to disks or tape, and you cannot use it
for fabric MetroCluster interconnect configurations.
The following table explains what the LEDs on a dual-port, 10-Gb, FCoE HBA mean:
Port | SAN traffic green LED | LAN traffic green LED | Hardware state
a | Off | Off | Power is off
a | Slow flashing (unison) | Slow flashing (unison) | Power is on/no link
a | On | On | Power is on/link established, no activity
a | On | Flashing | Power is on/link established, Rx/Tx Ethernet activity only
a | Flashing | On | Power is on/link established, Rx/Tx storage activity only
a | Flashing | Flashing | Power is on/link established, Rx/Tx Ethernet and storage activity
a | Slow flashing, alternating with other LED | Slow flashing, alternating with other LED | Beaconing
b | Off | Off | Power is off
b | Slow flashing (unison) | Slow flashing (unison) | Power is on/no link
b | On | On | Power is on/link established, no activity
b | On | Flashing | Power is on/link established, Rx/Tx Ethernet activity only
b | Flashing | On | Power is on/link established, Rx/Tx storage activity only
b | Flashing | Flashing | Power is on/link established, Rx/Tx Ethernet and storage activity
b | Slow flashing, alternating with other LED | Slow flashing, alternating with other LED | Beaconing
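Because both ports use the same LED combinations, the preceding table reduces to a lookup keyed on the pair of LED states. The following Python sketch shows one way to encode it, for example in a site runbook script. The dictionary name, the function name, and the lowercase state strings are assumptions made for this example; only the LED combinations and hardware states come from the table.

```python
# (SAN traffic LED, LAN traffic LED) -> hardware state, transcribed from the
# FCoE HBA table. The key spellings are a convention chosen for this sketch.
FCOE_LED_STATES = {
    ("off", "off"): "Power is off",
    ("slow flashing (unison)", "slow flashing (unison)"): "Power is on/no link",
    ("on", "on"): "Power is on/link established, no activity",
    ("on", "flashing"): "Power is on/link established, Rx/Tx Ethernet activity only",
    ("flashing", "on"): "Power is on/link established, Rx/Tx storage activity only",
    ("flashing", "flashing"): (
        "Power is on/link established, Rx/Tx Ethernet and storage activity"
    ),
    ("slow flashing (alternating)", "slow flashing (alternating)"): "Beaconing",
}


def describe_port_state(san_led, lan_led):
    """Return the hardware state for one port, given what both LEDs are doing."""
    key = (san_led.strip().lower(), lan_led.strip().lower())
    # Combinations that the table does not list are reported as such
    # rather than guessed at.
    return FCOE_LED_STATES.get(key, "Combination not listed in the table")


if __name__ == "__main__":
    print(describe_port_state("On", "Flashing"))
    print(describe_port_state("Flashing", "On"))
```

The lookup is applied per port: observe the SAN and LAN LEDs for port a or port b, then pass the two observed states to the function.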
The ports in the preceding illustration are labeled a and b because Data ONTAP identifies ports
alphabetically. The physical ports are labeled Port 1 for Port a and Port 2 for Port b. Port a is shown
empty, as used with SFP+ copper cables. Port b is shown with an SFP+ optical module installed, as
used with LC fiber cables.
You must connect to the UTA2 ports using LC fiber-optic cables with supported SFP+ optical
modules, or by using supported copper SFP+ cables (in 10-GbE mode only).
Note: Both ports must operate in the same mode (FC or 10-GbE).
The LEDs on the UTA2 provide information about traffic to the ports as well as information about
their status and their connections. The LED color and activity vary depending on the mode in which
the ports are configured as well as the ports' current status.
Note: By default, the UTA2 ships configured in Fibre Channel, target mode. To change the
personality and operational mode of the card, you must use the ucadmin command for systems
running Data ONTAP in 7-Mode, and the system node hardware unified-connect
command for systems running clustered Data ONTAP. See the man pages for details.
The following table explains what the LEDs on single-port copper GbE NICs mean:
The following table explains what the LEDs on single-port fiber GbE NICs mean:
The following illustration shows the location of LEDs on copper quad-port GbE NICs:
1 ACT LED
2 LNK LED
3 Port a
4 Port b
5 Port c
6 Port d
The following table explains what the LEDs on a copper multiport GbE NIC mean:
The following table explains what the LEDs on the fiber multiport GbE NICs might indicate:
The following illustration shows the location of LEDs on optical quad-port GbE NICs:
Note: The ports might not be labeled. For convenience, they are identified in the following table as
ports a, b, c, and d:
The following table explains what the LEDs on optical multiport GbE NICs might indicate:
Location and meaning of LEDs on the dual-port 10-GbE NIC that supports
fiber optic cables with SFP+ modules or copper SFP+ cables
You can check the LEDs on your dual-port 10-GbE NIC that supports fiber optic cables and SFP+
optical modules or copper SFP+ cables to learn whether there is a network connection and whether
there is data activity.
The following illustration shows the location of LEDs and ports on the NIC:
3 Port a
4 Port b
The following table explains what the LEDs on the NIC mean:
Location and meaning of LEDs on the dual-port 10-GbE NIC that supports
fiber optic cables with X6569 SFP+ modules or copper SFP+ cables
You can check the LEDs on your dual-port 10-GbE NIC that supports fiber optic cables and X6569
SFP+ optical modules or copper SFP+ cables to learn whether there is a network connection, whether
there is data activity, and whether the card is operating at 10-Gb speed.
The following illustration shows the location of LEDs and ports on the NIC:
Labels shown in the illustration: GRN=10G, ACT/LNK A
6 Port b ACT/Link
The following table explains what the LEDs on the card mean:
Location and meaning of single-port, 10-GbE NIC LEDs (2050 systems only)
You can check the LEDs on your single-port, 10-GbE NIC to learn whether there is a network
connection and whether there is data activity. This NIC is used only in 2050 systems.
The following illustration shows the location of LEDs on the single-port, 10-GbE NIC:
1 LINK/ACT LED
2 Port a
The following table explains what the LEDs on the single-port, 10-Gb NIC mean:
Labels shown in the illustration: 10G=GRN, 1G=YLW, 100M=OFF, ACT/LNK
1 Port a
2 Port b
3 Speed LED (one per port)
4 Activity/Link LED (one per port)
The following table explains what the LEDs on the card mean:
The following table shows how the type and length of Ethernet cable used with the card determine
the speed at which it can perform.
Note: The color displayed by the card's speed LEDs is not affected by Ethernet cable type and
length.
Ethernet cable Maximum length for 10GBASE-T Maximum length for 1000BASE-T
type support support
Category 6a 100 meters 100 meters
Category 6 55 meters 100 meters
Category 5e Not supported 100 meters
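The cable table above can be expressed as a small lookup. The following is a minimal sketch, not part of the product software; the names MAX_LENGTH_M and fastest_supported_standard are hypothetical and exist only for illustration. The limits are transcribed from the table.

```python
# Maximum supported cable length in meters, transcribed from the table above.
# None means the table lists the combination as "Not supported".
MAX_LENGTH_M = {
    "6a": {"10GBASE-T": 100, "1000BASE-T": 100},
    "6": {"10GBASE-T": 55, "1000BASE-T": 100},
    "5e": {"10GBASE-T": None, "1000BASE-T": 100},
}


def fastest_supported_standard(category, length_m):
    """Return the fastest standard the table allows for this cable, or None.

    category is the cable category as a string ("6a", "6", or "5e").
    length_m is the cable length in meters.
    """
    limits = MAX_LENGTH_M[category]
    for standard in ("10GBASE-T", "1000BASE-T"):  # check the fastest first
        limit = limits[standard]
        if limit is not None and length_m <= limit:
            return standard
    return None


if __name__ == "__main__":
    print(fastest_supported_standard("6", 80))
```

For example, an 80-meter Category 6 run exceeds the 55-meter limit for 10GBASE-T, so the lookup falls back to 1000BASE-T.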
LED Description
Green Power ready indicator: Replace the card if the LED is off
Blinking blue Indicates the presence of the card; LED dims slightly on heavy loads
Replace the card if it does not blink after you boot Data ONTAP.
The following table explains what the LEDs on the module mean:
1 Fault
2 Activity
HBA LEDs
HBAs have LEDs that you can check to learn whether the adapter has power, whether a link is
established, and whether an error has occurred.
Storage systems might have Fibre Channel or iSCSI host bus adapters installed and configured on
them.
1 Green LED
2 Amber LED
The following table explains what the LEDs on a dual-port Fibre Channel HBA mean:
1 Amber
2 Green
3 Yellow
4 Port a
5 Port b
6 Yellow
7 Green
8 Amber
9 TX
10 RX
11 TX
12 RX
Location and meaning of quad-port, 4-Gb, Fibre Channel HBA LEDs: four-
LED version
You can check the LEDs on the HBA to learn the status of the storage system Fibre Channel link and
whether data is being transferred.
The following illustration shows the location of LEDs:
5 Port a LED
6 Port c LED
7 Port b LED
8 Port d LED
Location and meaning of quad-port, 4-Gb, Fibre Channel HBA LEDs: 12-
LED version
You can check the LEDs on the HBA to learn the status of the Fibre Channel connection and whether
data is being transferred.
The following illustration shows the location of LEDs:
Location and meaning of quad-port, 8-Gb, Fibre Channel HBA LEDs: 12-
LED version
You can check the LEDs on the HBA to learn the status of the Fibre Channel connection and whether
data is being transferred.
The following illustration shows the location of LEDs:
1 Port a
2 Port b
3 Port c
4 Port d
5 TX (Transmit)
6 RX (Receive)
7 Yellow LED
8 Green LED
9 Amber LED
1 LINK LED
2 ACT LED
3 Port 2
4 Port 1
The following table explains what the LEDs on a fiber optic, iSCSI, target HBA mean:
1 Speed LED
2 ACT LED
3 Port 2
4 Port 1
The following table explains what the LEDs on a copper iSCSI target HBA mean:
1 Port a
2 Port b
1 Port a
2 Port b
3 Port c
4 Port d
3 Port a
3 Port b
1 Amber LED
2 Green LED
3 Yellow LED
4 Port a
5 Port b
6 Yellow LED
7 Green LED
8 Amber LED
9 Transmitter port
10 Receiver port
11 Transmitter port
12 Receiver port
1 Amber LED
2 Green LED
3 Yellow LED
4 Port a
5 Port b
6 Transmitter port
7 Receiver port
Labels shown in the illustration: PCIe x8, 16G FC/10GbE, 16G FCVI, PORT 1, PORT 2
The ports in the preceding illustration are labeled a and b because Data ONTAP identifies ports
alphabetically. The physical ports are labeled Port 1 for Port a and Port 2 for Port b. Port a is shown
empty, as used with SFP+ copper cables. Port b is shown with an SFP+ optical module installed, as
used with LC fiber cables.
You must connect to the adapter ports using LC fiber-optic cables with supported SFP+ optical
modules, or by using supported copper SFP+ cables.
The LEDs on the adapter provide information about traffic to the ports as well as information about
their status and their connections.
LED ID    16 Gbs link up/activity    8 Gbs link up/activity    4 Gbs link up/activity
LED 0     Off                        Off                        Flashing amber
2 LINK LED
3 ACT LED
1 Activity LEDs: LED 1 corresponds to port a, LED 2 corresponds to port b, and so on.
2 Port a
3 Port b
4 Port c
5 Port d
6 Activity LEDs: LED 1 corresponds to port a, LED 2 corresponds to port b, and so on.
7 Port d
8 Port c
9 Port b
10 Port a
The following table explains what the LEDs on the TOE NIC mean:
The following illustration shows the location of LEDs on the TOE NIC:
1 LINK/ACT LED a
2 Port a
3 LINK/ACT LED b
4 Port b
The following table explains what the LEDs on the TOE NIC mean:
Startup messages
When you apply power to your system, it verifies the hardware that is in the system, loads the
operating system, and displays startup informational and error messages on the system console.
There are two types of startup error messages:
POST error messages
Boot error messages
Both error message types are displayed on the system console, and an e-mail notification is sent out
by the remote management subsystem, if it is configured to do so.
POST messages
POST is a series of tests run from the motherboard PROM. These tests check the hardware on the
motherboard and differ depending on your system configuration.
POST messages appear on the system console before Data ONTAP software is loaded.
The following text is an example of a POST message on the console on a system that uses the
LOADER boot environment. Systems using the CFE boot environment display similar messages.
Note: If your system has an LCD, it displays POST messages without a header.
Startup messages | 155
Boot messages
After the boot is successfully completed, your system loads the operating system. Messages provide
information about your system and alert you to errors that occur during boot.
Note: The exact boot messages that appear on your system console depend on your system
configuration.
The following message is an example of the start of a boot message that appears on the system
console of a FAS6030 storage system at first boot.
Reserved
BIOS Version 3.0
...................
Boot Loader version 1.3
Copyright (C) 2000,2001,2002,2003 Broadcom Corporation.
Portions Copyright (C) 2002-2005 Network Appliance Inc.
CPU Type: Mobile Intel(R) Celeron(R) CPU 2.20GHz
Starting AUTOBOOT press Ctrl-C to abort...
The dots or plus signs are a progress indicator to show that the BIOS is not hung. If the system
restarts after a fault, the dots are replaced by plus signs to indicate that the system NVMEM is armed,
or being protected, during the boot process.
The BIOS should begin loading Data ONTAP within about 25 seconds after the initial greeting.
In the sensors show output, the BIOS Status sensor displays one of three states: Normal, Hung, or
Error. In the Reading column, the sensor displays BIOS and boot loader progress. In the example
output, the BIOS Status sensor displays a state of Normal and a reading of Loader #20, indicating
that the boot loader is running normally.
The following table lists the BIOS and boot loader progress values.
Status Description
0x00 System software has cleanly shut down. (Sent only by Data ONTAP.)
0x01 Memory initialization is in progress.
0x02 NVMEM initialization is in progress (when NVMEM is armed).
0x05 User has entered setup.
0x13 Booting to Data ONTAP (or boot loader).
0x1F BIOS is starting up. (Special message to the BMC.) This is the first BIOS status message.
It might be quickly followed by another.
0x20 Boot loader is running.
0x21 Boot loader is programming the primary firmware hub. The BMC does not allow the
system to be powered down at this time.
0x22 Boot loader is programming the alternate firmware hub. The BMC does not allow the
system to be powered down at this time.
0x2F Boot loader has transferred control to Data ONTAP. Data ONTAP might send this
periodically to inform the BMC that Data ONTAP is running, if the BMC has rebooted.
0x60 BMC has shut power off.
0x61 BMC has turned power on.
0x62 BMC has reset the system.
0x63 BMC Watchdog power cycle.
0x64 BMC Watchdog cold reset.
The BIOS Status sensor also displays BIOS and boot loader error codes. If the BIOS status sensor
displays a Hung or Error state, contact technical support for interpretation of the codes.
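When collecting sensor readings in a script, the progress values in the table above can be decoded with a simple lookup. The following is a minimal sketch under that assumption; the names BIOS_STATUS, describe_bios_status, and power_down_blocked are hypothetical and are not part of Data ONTAP or the BMC. The descriptions are transcribed from the table.

```python
# BIOS and boot loader progress values, transcribed from the table above.
BIOS_STATUS = {
    0x00: "System software has cleanly shut down",
    0x01: "Memory initialization is in progress",
    0x02: "NVMEM initialization is in progress (when NVMEM is armed)",
    0x05: "User has entered setup",
    0x13: "Booting to Data ONTAP (or boot loader)",
    0x1F: "BIOS is starting up",
    0x20: "Boot loader is running",
    0x21: "Boot loader is programming the primary firmware hub",
    0x22: "Boot loader is programming the alternate firmware hub",
    0x2F: "Boot loader has transferred control to Data ONTAP",
    0x60: "BMC has shut power off",
    0x61: "BMC has turned power on",
    0x62: "BMC has reset the system",
    0x63: "BMC Watchdog power cycle",
    0x64: "BMC Watchdog cold reset",
}

# Per the table, the BMC does not allow a power-down during these two states.
POWER_DOWN_BLOCKED = {0x21, 0x22}


def describe_bios_status(code):
    """Return the table description for a progress value, or a fallback string."""
    return BIOS_STATUS.get(code, "Not listed in the progress table (0x%02X)" % code)


def power_down_blocked(code):
    """Return True only for the states in which the table says power-down is blocked."""
    return code in POWER_DOWN_BLOCKED


if __name__ == "__main__":
    print(describe_bios_status(0x20))
```

Values that are not in the table fall through to a neutral fallback string rather than a guess, because error codes require interpretation by technical support.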
Description The BIOS cannot initialize the system memory or a DIMM has failed.
Corrective action Check the DIMMs and replace any bad ones by completing the following steps:
1. Make sure that each DIMM is seated properly, and then power-cycle the
system.
2. If the problem persists, run the diagnostics to determine which DIMMs failed
by entering the following command at the boot environment prompt:
boot_diags
Description The BIOS cannot initialize the system memory or a DIMM has failed.
Corrective action Check the DIMMs and replace any bad ones by completing the following steps:
1. Make sure that each DIMM is seated properly, and then power-cycle the
system.
2. If the problem persists, run the diagnostics to determine which DIMMs failed
by entering the following command at the boot environment prompt:
boot_diags
Description The BIOS cannot initialize the system memory or a DIMM has failed.
Corrective action Check the DIMMs and replace any bad ones by completing the following steps:
1. Make sure that each DIMM is seated properly, and then power-cycle the
system.
2. If the problem persists, run the diagnostics to determine which DIMMs failed
by entering the following command at the boot loader prompt:
boot_diags
Description The BIOS cannot initialize the system memory or a DIMM has failed.
Corrective action Check the DIMMs and replace any bad ones by completing the following steps:
1. Make sure that each DIMM is seated properly, and then power-cycle the
system.
2. If the problem persists, run the diagnostics to determine which DIMMs failed
by entering the following command at the boot loader prompt:
boot_diags
Description The BIOS cannot initialize the system memory or a DIMM has failed.
Corrective action Check the DIMMs and replace any bad ones by completing the following steps:
1. Make sure that each DIMM is seated properly, and then power-cycle the
system.
2. If the problem persists, run the diagnostics to determine which DIMMs failed
by entering the following command at the boot loader prompt:
boot_diags
Description A bad DIMM was detected, which causes BIOS to disable node interleaving.
Corrective action Check the DIMMs and replace any bad ones by completing the following steps:
1. Make sure that each DIMM is seated properly, and then power-cycle the
system.
2. If the problem persists, run the diagnostics to determine which DIMMs failed
by entering the following command at the boot loader prompt:
boot_diags
Description Timeout occurs when BIOS tries to read or write information through System
Management Bus (SMBUS) or Inter-Integrated Circuit (I2C).
Description The information from the field-replaceable unit (FRU) Electrically Erasable
Programmable Read-Only Memory (EEPROM) is invalid.
Corrective action 1. Enter the following command at the boot environment prompt:
boot_diags
2. To determine the FRU involved, select the following tests: mb and 74.
3. Check whether the FRU's model name, serial number, part number, and
revision are correct in one of the following ways:
Visually inspect the FRU.
Look for error messages indicating that the FRU information is invalid or
could not be read.
4. Contact technical support if you suspect a misprogrammed FRU.
Description CMOS checksum is bad, possibly because the system was reset during BIOS
boot or because of a dead RTC battery.
Corrective action 1. Reboot the system.
Description The previous boot was incomplete, and the default configuration was used.
Description No valid boot loader is found in system flash memory while the option to Halt
For Invalid Boot Loader is disabled in setup. As a result, the system still can
boot from CompactFlash if it has a valid boot loader.
Corrective action Enter the update_flash command two times to place a good boot loader in the
system flash.
Description No valid boot loader is found in system flash memory while the option to Halt
For Invalid Boot Loader is enabled in setup. As a result, the system halts. You
should take corrective action.
Corrective action Place a valid version of the boot loader in the system flash by completing either of
the following series of steps:
1. Boot from the backup boot image.
2. Enter the update_flash command.
or
1. Enter BIOS setup and disable boot from system flash.
2. Save the setting.
3. Reboot to the boot environment prompt, and then enter the update_flash
command two times.
Description The Field Programmable Gate Array (FPGA) jumper was installed on the
motherboard.
Corrective action 1. Remove the FPGA jumper.
Description The watchdog times out while BIOS is doing PCI initialization.
Corrective action 1. Power-cycle the system a few times or reset the system through the RLM.
2. If the problem persists, check the PCI interface by entering the following
command at the boot environment prompt:
boot_diags
Description The watchdog times out while BIOS is testing the extended memory.
Corrective action 1. Power-cycle the system a few times or reset the system through the RLM.
2. If the problem persists, check the memory interface by entering the following
command at the boot loader prompt:
boot_diags
Description The watchdog times out while BIOS is setting up the HT link speed.
Corrective action 1. Power-cycle the system a few times or reset the system through the Remote LAN
Module (RLM).
2. If the problem persists, replace the motherboard.
No message on console
Note: Always power-cycle your system when you receive this message. If the system repeats the
error message, follow the corrective action for the error message.
Description The BIOS cannot initialize the system memory or a DIMM has failed.
Corrective action Check and replace the bad DIMM modules.
SP error code 030h
Description The BIOS cannot initialize the system memory or a DIMM has failed.
Corrective action Check and replace the bad DIMM modules.
SP error code 031h
Description The BIOS cannot initialize the system memory or a DIMM has failed.
Corrective action Check and replace the bad DIMM modules.
SP error code 032h
Description Data ONTAP detected a bad DIMM and disabled it in the displayed DIMM
slot.
Corrective action Check and replace the bad DIMM modules.
SP error code 03Ah
Description The system BIOS detected an SPD (serial presence detect) checksum error in
the specified DIMM slot.
Corrective action Check and replace the bad DIMM modules.
SP error code 03Bh
Description A timeout occurs when the BIOS tries to read or write information through the
System Management Bus (SMBUS) or the Inter-Integrated Circuit (I2C).
Corrective action Run system-level diagnostics to check the SMBUS.
SP error code 041h
Description The information from the field-replaceable unit (FRU) Electrically Erasable
Programmable Read-Only Memory (EEPROM) is invalid.
Corrective action Program the FRU information through the SP or system-level diagnostics.
SP error code 042h
Description The CMOS checksum is bad, possibly because the system was reset during a
BIOS boot or because of a dead RTC battery.
Corrective action None. The BIOS corrects the error automatically, and the system continues its
normal boot.
SP error code 051h
Description The previous boot attempt was incomplete, causing the system to boot with the
default BIOS configuration.
Corrective action Reboot the system.
SP error code 080h
Description The Service Processor fails to respond to the FRU ID read request.
Corrective action Check and replace the Service Processor.
SP error code 0A3h
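The SP error codes listed so far can be collected into a lookup table for use in a support script. The following is a minimal sketch; the names SP_ERROR_CODES, normalize_sp_code, and lookup_sp_error are hypothetical, and the table covers only the codes documented above. Descriptions and corrective actions are transcribed from the entries above.

```python
_DIMM_INIT = "The BIOS cannot initialize the system memory or a DIMM has failed."
_REPLACE_DIMM = "Check and replace the bad DIMM modules."

# SP error code -> (description, corrective action), transcribed from the entries above.
SP_ERROR_CODES = {
    0x030: (_DIMM_INIT, _REPLACE_DIMM),
    0x031: (_DIMM_INIT, _REPLACE_DIMM),
    0x032: (_DIMM_INIT, _REPLACE_DIMM),
    0x03A: ("Data ONTAP detected a bad DIMM and disabled it in the displayed DIMM slot.",
            _REPLACE_DIMM),
    0x03B: ("The system BIOS detected an SPD checksum error in the specified DIMM slot.",
            _REPLACE_DIMM),
    0x041: ("A timeout occurs when the BIOS reads or writes through the SMBUS or I2C.",
            "Run system-level diagnostics to check the SMBUS."),
    0x042: ("The information from the FRU EEPROM is invalid.",
            "Program the FRU information through the SP or system-level diagnostics."),
    0x051: ("The CMOS checksum is bad.",
            "None. The BIOS corrects the error automatically, and the system continues "
            "its normal boot."),
    0x080: ("The previous boot attempt was incomplete; the default BIOS configuration "
            "was used.",
            "Reboot the system."),
    0x0A3: ("The Service Processor fails to respond to the FRU ID read request.",
            "Check and replace the Service Processor."),
}


def normalize_sp_code(text):
    """Convert '03Ah', '0x3A', or '3a' to an integer."""
    cleaned = text.strip().lower()
    if cleaned.startswith("0x"):
        cleaned = cleaned[2:]
    elif cleaned.endswith("h"):
        cleaned = cleaned[:-1]
    return int(cleaned, 16)


def lookup_sp_error(text):
    """Return (description, corrective action), or None if the code is not in the table."""
    return SP_ERROR_CODES.get(normalize_sp_code(text))


if __name__ == "__main__":
    print(lookup_sp_error("03Bh"))
```

A code that is missing from the table returns None, so the caller can escalate to technical support instead of acting on a guess.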
Description No valid boot loader is found in the system flash memory while the option to
Halt For Invalid Boot Loader is disabled in setup. As a result, the system still
can boot from the boot media if it has a valid boot loader.
Description No valid boot loader is found in the system flash memory while the option to
Halt For Invalid Boot Loader is enabled in setup. As a result, the system
halts. Users should take corrective action.
Corrective action Place a valid version of the boot loader in the system flash by completing the
following steps:
1. Boot the system from the backup boot image.
2. Enter the following command:
flash
Description The system BIOS detected a pattern write/read mismatch in the displayed
DIMM slot. Read/write mismatches indicate defective memory modules.
Corrective action Check and replace the bad DIMM modules identified.
SP error code 03Ch
Description The BIOS detected an uncorrectable ECC error in the displayed DIMM slot.
Corrective action Check and replace the bad DIMM modules.
SP error code 035h
Description The system BIOS detected unknown errors in the displayed DIMM. These
errors might indicate defective memory modules.
Corrective action Check and replace the bad DIMM modules.
SP error code 038h
Fatal Error! All DIMM failed and system can not continue boot!
Message Fatal Error! All DIMM failed and system can not continue boot!
Description All DIMMs are mapped out either as bad or having the disable flag set. The
system has no memory to continue.
Corrective action Complete the following steps:
1. Clear the CMOS.
2. Power-cycle the system.
3. If the problem persists, replace all DIMMs.
Description The registered dual inline memory modules (RDIMMs) and unregistered dual
inline memory modules (UDIMMs) are mixed in the system.
Corrective action Make sure that the RDIMMs and UDIMMs are not mixed. For information
about the correct memory for your system, contact technical support.
SP error code 0EDh
Description An unregistered dual inline memory module (UDIMM) is populated in the third
slot.
Corrective action Make sure that an unregistered dual inline memory module (UDIMM) is not
plugged into the third slot.
SP error code 0EEh
Fatal Error: No DIMM detected and system can not continue boot!
Message Fatal Error: No DIMM detected and system can not continue boot!
Description All DIMM serial presence detect (SPD) EEPROMs are inaccessible due to the
hanging of the Inter-Integrated Circuit (I2C) switch for System Management
Bus (SMBUS). The system regards the condition as if there were no DIMMs on
the system.
Corrective action Complete the following steps:
1. If the message persists, try to power-cycle the system.
2. If the problem persists after power-cycling the system, replace the
motherboard.
Description The software memory test failed in memory reference code (MRC) checking.
Corrective action Check and replace the bad DIMM modules.
SP error code 0EBh
Corrective action Usually, you do not need to create and initialize a file system; do so only after
consulting technical support.
Corrective action 1. Verify that all expansion adapters in your system are supported.
2. Contact technical support for help. Have a list ready of all expansion
adapters installed in your system.
Description This message occurs when other error messages occur at the same time.
Corrective action See the other error messages and their respective corrective actions. If the
problem persists, contact technical support.
2. Turn off the power on your system and verify that the adapter is properly
seated in the expansion slot.
3. Verify that all Fibre Channel cables are connected.
No /etc/rc
Message No /etc/rc
Description The /etc/rc file is corrupted.
Corrective action 1. At the hostname> prompt, enter
setup
No disk controllers
Message No disk controllers
Description The system cannot detect any Fibre Channel-Arbitrated Loop (FC-AL) disk
controllers.
Corrective action 1. Turn off your system power.
2. Verify that all NICs are properly seated in the appropriate expansion slots.
No disks
Message No disks
Description The system cannot detect any Fibre Channel-Arbitrated Loop (FC-AL) disks.
Corrective action Verify that all disks are properly seated in the drive bays.
No network interfaces
Message No network interfaces
Description The system cannot detect any network interfaces.
Corrective action 1. Turn off the system and verify that all network interface cards (NICs) are
seated properly in the appropriate expansion slots.
2. Run diagnostics to check the onboard Ethernet port.
3. If the problem persists, contact technical support.
No NVRAM present
Message No NVRAM present
NVRAM #n downrev
Message NVRAM #n downrev
Description n is the serial number of the nonvolatile RAM (NVRAM) adapter. The
NVRAM adapter is an early revision that cannot be used with the system.
Corrective action Check the console for information about which revision of the NVRAM adapter
is required. Replace the NVRAM adapter.
Description The system cannot detect the nonvolatile RAM (NVRAM) adapter.
Corrective action For a stand-alone 3020 or FAS3050 system, make sure that the NVRAM
adapter is in slot 1.
For a 3020 or FAS3050 system in an HA pair, make sure that the NVRAM
adapter is in slot 2.
Description This platform is not supported on this release. Please consult the release notes
for your software.
Corrective action You must downgrade your software version to a compatible release.
Verify that you have the correct URL for software download.
Watchdog error
Message Watchdog error
Description An error occurred during the testing of the watchdog timer.
Corrective action Replace the motherboard.
Watchdog failed
Message Watchdog failed
Description Your system watchdog reset hardware, used to reset your system from a system
hang condition, is not functioning properly.
Corrective action Replace the motherboard.
Note: In 31xx systems, both controllers in a chassis share the power supplies. As a result, the
system is never shut down because of a single power supply failure. Removing one power supply
does not shut down the system.
Description This message occurs when the system is in a warning state. The system shuts
down immediately.
Corrective action Your action depends on whether the power supply is present.
If the power supply is not inserted, insert it.
If the power supply is inserted, power-cycle your system and run
diagnostics on the identified power supply. If the problem persists, replace
the identified power supply.
Note: In 31xx systems, both controllers in a chassis share the power supplies. As a result, the
system is never shut down because of a single power supply failure. Removing one power supply
does not shut down the system.
Description This message occurs when the power supply unit is removed from the system.
Corrective action Your action depends on whether the power supply is present:
If the power supply is not inserted, insert it.
If the power supply is inserted, power-cycle your system and run
diagnostics on the identified power supply.
If the problem persists, replace the identified power supply.
SNMP trap ID #394: I/O expansion module is not present in the chassis
monitor.chassisFan.degraded
Message monitor.chassisFan.degraded
Severity ALERT
Description This message is issued when a chassis fan is degraded.
Corrective action The fan unit should be replaced.
SNMP trap ID #412 Chassis fan is degraded: %s
monitor.chassisFan.ok
Message monitor.chassisFan.ok
Severity NOTICE
Description This message occurs when the chassis fans are OK.
Corrective action N/A
SNMP trap ID #366 Chassis FRU is OK
monitor.chassisFan.removed
Message monitor.chassisFan.removed
Severity ALERT
Description This message occurs when a chassis fan is removed.
Corrective action Replace the fan unit.
SNMP trap ID #363 Chassis FRU is removed
monitor.chassisFan.slow
Message monitor.chassisFan.slow
Severity ALERT
Description This message occurs when a chassis fan is spinning too slowly.
Corrective action Replace the fan unit.
SNMP trap ID #365 Chassis FRU contains at least one fan spinning slowly
monitor.chassisFan.stop
Message monitor.chassisFan.stop
Severity ALERT
Description This message occurs when a chassis fan is stopped.
Corrective action Replace the fan unit.
SNMP trap ID #364 Chassis FRU contains at least one stopped fan
monitor.chassisFan.warning
Message monitor.chassisFan.warning
Severity ALERT
Description This message is issued when a chassis fan is spinning either too slowly or too
fast. This is a warning message.
Corrective action The fan unit should be replaced.
SNMP trap ID #415 Chassis fan is in warning state
monitor.chassisFanFail.xMinShutdown
Message monitor.chassisFanFail.xMinShutdown
Severity EMERG
Description This message indicates that multiple chassis fans have failed and the system
will shut down in a few minutes unless corrected.
Corrective action Make sure the system fans are working.
SNMP trap ID #511 Multiple Chassis Fan failure: System will shut down in 2 minutes.
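Each entry in this catalog ends with an "SNMP trap ID" line that pairs a numeric trap ID with its message text. The following is a minimal sketch of how such lines could be split into their parts when processing this catalog as text; the names TRAP_LINE and parse_trap_line are hypothetical, and the pattern is an assumption based only on the line layouts that appear in this guide.

```python
import re

# Matches the catalog layouts "SNMP trap ID #412 text", "SNMP trap ID # 397 text",
# and "SNMP trap ID #394: text". Lines such as "SNMP trap ID N/A" do not match.
TRAP_LINE = re.compile(r"^SNMP trap ID\s*#\s*(\d+):?\s*(.*)$")


def parse_trap_line(line):
    """Return (trap_id, message_text), or None if the line carries no numeric trap ID."""
    match = TRAP_LINE.match(line.strip())
    if match is None:
        return None
    return int(match.group(1)), match.group(2).strip()


if __name__ == "__main__":
    print(parse_trap_line("SNMP trap ID #412 Chassis fan is degraded: %s"))
```

Placeholders such as %s are kept verbatim in the message text, because the catalog does not say how they are filled in.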
monitor.chassisPower.degraded
Message monitor.chassisPower.degraded
Severity NOTICE
Description This message indicates that a power supply is degraded.
Corrective action 1. If spare power supplies are available, try replacing them to see whether
that alleviates the problem.
EMS and operational messages | 191
monitor.chassisPower.ok
Message monitor.chassisPower.ok
Severity NOTICE
Description This message indicates that the motherboard power is OK.
Corrective action N/A
SNMP trap ID #406 Normal operation
monitor.chassisPowerSupplies.ok
Message monitor.chassisPowerSupplies.ok
Severity INFO
Description This message indicates that all power supplies are OK.
Corrective action N/A
SNMP trap ID #396 Normal operation
monitor.chassisPowerSupply.degraded
Message monitor.chassisPowerSupply.degraded
Severity INFO
Description This message indicates that a power supply is degraded.
Corrective action A replacement power supply might be required. Contact technical support for
further instruction.
SNMP trap ID #392 Chassis power supply is degraded
monitor.chassisPowerSupply.notPresent
Message monitor.chassisPowerSupply.notPresent
Severity NOTICE
Description This message indicates that a power supply is not present.
Corrective action Replace the power supply.
SNMP trap ID #394 Power supply not present
monitor.chassisPowerSupply.off
Message monitor.chassisPowerSupply.off
Severity NOTICE
Description This message indicates that a power supply is turned off.
Corrective action Turn on the power supply.
SNMP trap ID #395 Power supply not present
monitor.chassisPowerSupply.ok
Message monitor.chassisPowerSupply.ok
Severity INFO
Description This message indicates that the power supply is OK.
Corrective action None.
SNMP trap ID #397 Chassis power supply (%id) is OK
monitor.chassisTemperature.cool
Message monitor.chassisTemperature.cool
Severity ALERT
Description This message occurs when the chassis temperature is too cool.
Corrective action Raise the temperature around the system.
SNMP trap ID #372 Chassis temperature is too cool
monitor.chassisTemperature.ok
Message monitor.chassisTemperature.ok
Severity NOTICE
Description This message occurs when the chassis temperature is normal.
Corrective action N/A
SNMP trap ID #376 Normal operation
monitor.chassisTemperature.warm
Message monitor.chassisTemperature.warm
Severity ALERT
Description This message occurs when the chassis temperature is too warm.
Corrective action Check to see whether air conditioning units are needed, or whether they are
functioning properly.
SNMP trap ID #372 Chassis temperature is too warm
monitor.cpuFan.degraded
Message monitor.cpuFan.degraded
Severity NOTICE
Description This message indicates that a CPU fan is degraded.
Corrective action 1. Replace the identified fan.
2. Power-cycle the system and run diagnostics on the system.
monitor.cpuFan.failed
Message monitor.cpuFan.failed
Severity NOTICE
Description This message indicates that a CPU fan has failed.
Corrective action 1. Replace the identified fan.
2. Power-cycle the system and run diagnostics on the system.
monitor.cpuFan.ok
Message monitor.cpuFan.ok
Severity INFO
Description This message indicates that a CPU fan is OK.
Corrective action N/A
SNMP trap ID #386 Normal operation
monitor.ioexpansion.unpresent
Message monitor.ioexpansion.unpresent
Severity NOTICE
Description This message occurs when the I/O expansion module is not inserted into the
chassis.
Corrective action None.
SNMP trap ID #394: I/O expansion module is not present in the chassis.
monitor.ioexpansionPower.degraded
Message monitor.ioexpansionPower.degraded
Severity NOTICE
Description This message indicates that power on the I/O expansion module is degraded.
Corrective action Degraded power might be caused by bad power supplies, bad wall power, or
bad components on the motherboard. If spare power supplies are available, try
exchanging them to see whether the problem is resolved. Otherwise, contact
technical support.
SNMP trap ID #403 Power on IO expansion is degraded:
monitor.ioexpansionPower.ok
Message monitor.ioexpansionPower.ok
Severity NOTICE
Description This message indicates that power on the I/O expansion module is OK.
Corrective action None.
SNMP trap ID #406 Power on IO expansion module is OK
monitor.ioexpansionTemperature.cool
Message monitor.ioexpansionTemperature.cool
Severity ALERT
Description This warning message occurs when the I/O expansion module is too cold.
Corrective action The system cannot function in an environment that is too cold; find ways to
warm the system.
monitor.ioexpansionTemperature.ok
Message monitor.ioexpansionTemperature.ok
Severity NOTICE
Description This message occurs when the temperature of the I/O expansion module is
normal. It can occur in the following two cases: 1) as a LOG_NOTICE message, to show
that a bad condition has reverted to normal; 2) as an hourly LOG_INFO message, to
indicate that the temperature is OK.
Corrective action None.
SNMP trap ID #376 Temperature of the I/O expansion module is OK.
monitor.ioexpansionTemperature.warm
Message monitor.ioexpansionTemperature.warm
Severity ALERT
Description This warning message occurs when the I/O expansion module is too warm.
Corrective action Evaluate the environment in which the system is functioning: Are air
conditioning units needed or is the current air conditioning not functioning
properly?
SNMP trap ID #372 I/O expansion module is too warm:
monitor.nvmembattery.warninglow
Message monitor.nvmembattery.warninglow
Severity WARNING
Description This message occurs when the NVMEM (nonvolatile memory) lithium battery
is low on power.
Corrective action Replace the NVMEM battery as soon as practical.
SNMP trap ID #63 NVMEM battery is low on power and should be replaced as soon as
practical.
monitor.nvramLowBattery
Message monitor.nvramLowBattery
Severity NODE_ERROR
Description This message occurs when the NVRAM batteries are discovered to be at a
dangerously low power level.
Corrective action Contact technical support.
SNMP trap ID N/A
monitor.power.unreadable
Message monitor.power.unreadable
Severity INFO
Description This message occurs when a power sensor in the controller module is not
readable.
Corrective action Shut down the system and power-cycle the controller module. If the sensor is
still not readable, replace the controller module.
SNMP trap ID N/A
monitor.shutdown.cancel
Message monitor.shutdown.cancel
Severity WARNING
Description This message is issued when an automatic shutdown sequence has been
canceled.
Corrective action None.
SNMP trap ID #6 Automatic shutdown sequence canceled
monitor.shutdown.cancel.nvramLowBattery
Message monitor.shutdown.cancel.nvramLowBattery
Severity WARNING
Description This message is issued when an automatic shutdown sequence has been
postponed due to RAID reconstruction.
Corrective action Unknown
SNMP trap ID #6 NVRAM battery is dangerously Low. Halt delayed until %s finishes.
monitor.shutdown.chassisOverTemp
Message monitor.shutdown.chassisOverTemp
Severity CRIT
Description This message occurs just before shutdown, indicating that the chassis
temperature is too hot.
Corrective action Check to see if air conditioning units are needed, or whether they are
functioning properly.
SNMP trap ID #371 Chassis temperature is too hot
monitor.shutdown.chassisUnderTemp
Message monitor.shutdown.chassisUnderTemp
Severity CRIT
Description This message occurs just before shutdown, indicating that the chassis
temperature is too cold.
Corrective action Raise the temperature around the system.
SNMP trap ID #371 Chassis temperature is too cold
monitor.shutdown.emergency
Message monitor.shutdown.emergency
Severity NODE_FAULT
Description This message is issued when an emergency shutdown is initiated.
Corrective action None.
SNMP trap ID #6 Emergency shutdown: %s
monitor.shutdown.ioexpansionOverTemp
Message monitor.shutdown.ioexpansionOverTemp
Severity CRIT
Description This message occurs when the I/O expansion module is too hot. This message
is sent just before shutdown.
Corrective action The system environment is too hot; cool the environment.
SNMP trap ID #371 I/O expansion module is too hot:
monitor.shutdown.nvramLowBattery.pending
Message monitor.shutdown.nvramLowBattery.pending
Severity WARNING
Description This message is issued when an automatic shutdown sequence is pending due to
a low battery.
Corrective action Replace the battery.
SNMP trap ID #62 Emergency shutdown: NVRAM battery dangerously low in degraded
mode. Replace the battery immediately!
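Several of the monitor.* events above share one SNMP trap ID, so a trap number alone does not identify the event. The following minimal sketch records the trap IDs exactly as listed in the entries above and groups events by ID. The dictionary layout and the helper name events_for_trap are illustrative assumptions, not part of Data ONTAP.

```python
# Event name -> SNMP trap ID, copied from the entries in this section.
# Entries documented as "N/A" are recorded as None.
EVENT_TRAP_ID = {
    "monitor.ioexpansionTemperature.warm": 372,
    "monitor.nvmembattery.warninglow": 63,
    "monitor.nvramLowBattery": None,
    "monitor.power.unreadable": None,
    "monitor.shutdown.cancel": 6,
    "monitor.shutdown.cancel.nvramLowBattery": 6,
    "monitor.shutdown.chassisOverTemp": 371,
    "monitor.shutdown.chassisUnderTemp": 371,
    "monitor.shutdown.emergency": 6,
    "monitor.shutdown.ioexpansionOverTemp": 371,
    "monitor.shutdown.nvramLowBattery.pending": 62,
}


def events_for_trap(trap_id):
    """Return the sorted event names that are documented with this trap ID."""
    return sorted(name for name, tid in EVENT_TRAP_ID.items() if tid == trap_id)


if __name__ == "__main__":
    # Trap #371 is shared by three shutdown events, so the trap text
    # is needed to tell them apart.
    for name in events_for_trap(371):
        print(name)
```

A trap receiver that sees #371 or #6 would therefore also need the trap text (for example, "Chassis temperature is too cold") to decide which corrective action applies.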
monitor.temp.unreadable
Message monitor.temp.unreadable
Severity INFO
Description This message occurs when the controller module temperature is not readable.
The system does not automatically shut down if it becomes too hot for reliable
operation.
Corrective action Shut down the system and power-cycle the controller module. If the
temperature is still not readable, replace the controller module.
SNMP trap ID N/A
nvmem.battery.capacity.low
Message nvmem.battery.capacity.low
Severity NODE_ERROR
Description This message occurs when the NVMEM battery lacks the capacity to preserve
the NVMEM contents for the required minimum of 72 hours. The system is at
risk of data loss if the power fails. This message repeats every hour while the
problem continues and the system shuts down in 24 hours if automatic
recharging of the battery does not restore its charge.
nvmem.battery.capacity.low.warn
Message nvmem.battery.capacity.low.warn
Severity INFO
Description This message occurs when the NVMEM battery capacity is below normal.
nvmem.battery.capacity.normal
Message nvmem.battery.capacity.normal
Severity INFO
Description This message occurs when the NVMEM battery capacity is normal.
Corrective action None.
SNMP trap ID N/A
nvmem.battery.current.high
Message nvmem.battery.current.high
Severity NODE_ERROR
Description This message occurs when the NVMEM battery current is excessively high and
the system will shut down.
Corrective action First, correct any environmental problems, such as chassis overtemperature. If
the NVMEM battery current is still too high, replace the battery pack. If the
problem persists, replace the controller module.
SNMP trap ID N/A
nvmem.battery.current.high.warn
Message nvmem.battery.current.high.warn
Severity INFO
Description This message occurs when the NVMEM battery current is above normal.
Corrective action None.
SNMP trap ID N/A
nvmem.battery.sensor.unreadable
Message nvmem.battery.sensor.unreadable
Severity INFO
Description This message occurs when the battery state of the battery-backed memory
(NVMEM) is unknown. One of the battery sensors is not readable.
Corrective action Shut down the system and power-cycle the controller module. If the problem is
not corrected, replace the battery. If the sensor is still not readable, replace the
controller module.
SNMP trap ID N/A
nvmem.battery.temp.high
Message nvmem.battery.temp.high
Severity NODE_ERROR
Description This message occurs when the NVMEM battery is too hot and the system is at a
high risk of data loss if power fails.
Corrective action If the system is excessively warm, allow it to cool gradually. If the NVMEM
battery temperature reading is still too high, replace the battery pack. If the
problem persists, replace the controller module.
SNMP trap ID N/A
nvmem.battery.temp.low
Message nvmem.battery.temp.low
Severity NODE_ERROR
Description This message occurs when the NVMEM battery is too cold and the system is at
a high risk of data loss if power fails.
Corrective action If the system is excessively cold, allow it to warm gradually. If the NVMEM
battery temperature reading is still too low, replace the battery pack. If the
problem persists, replace the controller module.
SNMP trap ID N/A
nvmem.battery.temp.normal
Message nvmem.battery.temp.normal
Severity INFO
Description This message occurs when the NVMEM battery temperature is normal.
Corrective action None.
SNMP trap ID N/A
nvmem.battery.voltage.high
Message nvmem.battery.voltage.high
Severity NODE_ERROR
Description This message occurs when the NVMEM battery voltage is excessively high and
the system will shut down.
Corrective action First, correct any environmental problems, such as chassis overtemperature. If
the NVMEM battery voltage is still too high, replace the battery pack. If the
problem persists, replace the controller module.
SNMP trap ID N/A
nvmem.battery.voltage.high.warn
Message nvmem.battery.voltage.high.warn
Severity INFO
Description This message occurs when the NVMEM battery voltage is above normal.
Corrective action None.
SNMP trap ID N/A
nvmem.battery.voltage.normal
Message nvmem.battery.voltage.normal
Severity INFO
Description This message occurs when the NVMEM battery voltage is normal.
Corrective action None.
SNMP trap ID N/A
nvmem.voltage.high
Message nvmem.voltage.high
Severity NODE_ERROR
Description This message occurs when the NVMEM supply voltage is high and the system
is at a high risk of data loss if power fails.
Corrective action First, correct any environmental or battery problems. If the problem continues,
replace the controller module.
nvmem.voltage.high.warn
Message nvmem.voltage.high.warn
Severity INFO
Description This message occurs when the NVMEM supply voltage is above normal.
Corrective action None.
SNMP trap ID N/A
nvmem.voltage.normal
Message nvmem.voltage.normal
Severity INFO
Description This message occurs when the NVMEM supply voltage is normal.
Corrective action None.
SNMP trap ID N/A
nvram.bat.missing.error
Message nvram.bat.missing.error
Severity NODE_ERROR
Description This message occurs when the battery in the chassis is degrading.
Corrective action Contact technical support.
SNMP trap ID N/A
nvram.battery.capacity.low
Message nvram.battery.capacity.low
Severity NODE_ERROR
Description This message occurs when the NVRAM battery lacks the capacity to preserve
the NVRAM contents for the required minimum of 72 hours. The system is at
risk of data loss if the power fails. This message repeats every hour while the
problem continues, and the system shuts down in 24 hours if automatic
recharging of the battery does not restore its charge.
nvram.battery.capacity.low.critical
Message nvram.battery.capacity.low.critical
Severity NODE_ERROR
Description This message occurs when the NVRAM battery capacity is dangerously low.
To prevent data loss, the system will shut down in 20 minutes.
Corrective action Correct any environmental problems, such as chassis over-temperature. The
battery charges automatically. If the capacity is not restored automatically,
replace the battery pack. If the problem persists, replace the controller module.
SNMP trap ID N/A
nvram.battery.capacity.low.warn
Message nvram.battery.capacity.low.warn
Severity INFO
Description This message occurs when the NVRAM battery capacity is below normal.
Corrective action None.
SNMP trap ID N/A
nvram.battery.capacity.normal
Message nvram.battery.capacity.normal
Severity INFO
Description This message occurs when the NVRAM battery capacity is normal.
Corrective action None.
SNMP trap ID N/A
nvram.battery.charging.nocharge
Message nvram.battery.charging.nocharge
Severity NODE_ERROR
Description This message occurs when the NVRAM battery is requesting to be charged but
the charger is not charging the battery. To prevent data loss, the system will
shut down in 20 minutes.
Corrective action Replace the NVRAM battery/card. If the problem persists, replace the
controller module.
SNMP trap ID N/A
nvram.battery.charging.normal
Message nvram.battery.charging.normal
Severity INFO
Description This message occurs when the NVRAM battery charging status is normal.
Corrective action None.
SNMP trap ID N/A
nvram.battery.charging.wrongcharge
Message nvram.battery.charging.wrongcharge
Severity NODE_ERROR
Description This message occurs when the NVRAM battery charger is charging the battery
even though the battery is not requesting to be charged. To prevent data loss,
the system will be shut down in 20 minutes.
Corrective action Replace the NVRAM battery. If the problem persists, replace the NVRAM
card.
SNMP trap ID N/A
nvram.battery.current.high
Message nvram.battery.current.high
Severity NODE_ERROR
Description This message occurs when the NVRAM battery current is excessively high and
the system will shut down.
Corrective action First, correct any environmental problems, such as chassis over-temperature. If
the NVRAM battery current is still too high, replace the battery pack. If the
problem persists, replace the controller module.
SNMP trap ID N/A
nvram.battery.current.high.warn
Message nvram.battery.current.high.warn
Severity INFO
Description This message occurs when the NVRAM battery current is above normal.
Corrective action None.
SNMP trap ID N/A
nvram.battery.current.low
Message nvram.battery.current.low
Severity NODE_ERROR
Description This message occurs when the NVRAM battery has a short circuit.
Corrective action Replace the NVRAM battery/card. If the problem persists, replace the
controller module.
SNMP trap ID N/A
nvram.battery.current.low.warn
Message nvram.battery.current.low.warn
Severity NODE_ERROR
Description This message occurs when the NVRAM battery current is below normal.
Corrective action First, correct any environmental problems. If the NVRAM battery current is
still below normal, replace the NVRAM battery/card. If the problem persists,
replace the controller module.
SNMP trap ID N/A
nvram.battery.current.normal
Message nvram.battery.current.normal
Severity INFO
Description This message occurs when the NVRAM battery current is normal.
Corrective action None.
SNMP trap ID N/A
nvram.battery.end_of_life.high
Message nvram.battery.end_of_life.high
Severity INFO
Description This message occurs when the NVRAM battery-cycle count indicates that the
battery has reached its anticipated life expectancy.
Corrective action None.
SNMP trap ID N/A
nvram.battery.end_of_life.normal
Message nvram.battery.end_of_life.normal
Severity INFO
Description This message occurs when the NVRAM battery-cycle count indicates that the
battery is well below its anticipated life expectancy.
Corrective action None.
SNMP trap ID N/A
nvram.battery.fault
Message nvram.battery.fault
Severity NODE_ERROR
Description This message occurs when the NVRAM battery is reporting a fatal fault
condition. To prevent data loss, the system will shut down in 2 minutes.
Corrective action Correct any environmental problems, such as chassis over-temperature. If the
battery still reports a fatal fault condition, replace the NVRAM battery/card. If
the problem persists, replace the controller module.
SNMP trap ID N/A
nvram.battery.fault.warn
Message nvram.battery.fault.warn
Severity INFO
Description This message occurs when the NVRAM battery is reporting a non-fatal fault
condition.
Corrective action Correct any environmental problems, such as chassis over-temperature.
nvram.battery.fcc.low
Message nvram.battery.fcc.low
Severity NODE_ERROR
Description This message occurs when the NVRAM battery full-charge capacity is low. To
prevent data loss, the system will shut down in 24 hours.
Corrective action First, correct any environmental problems, such as chassis over-temperature. If
the NVRAM full-charge capacity is still dangerously low, replace the NVRAM
battery/card. If the problem persists, replace the controller module.
SNMP trap ID N/A
nvram.battery.fcc.low.critical
Message nvram.battery.fcc.low.critical
Severity NODE_ERROR
Description This message occurs when the NVRAM battery full-charge capacity is
dangerously low. To prevent data loss, the system will shut down in 20
minutes.
Corrective action First, correct any environmental problems, such as chassis over-temperature. If
the NVRAM full-charge capacity is still dangerously low, replace the NVRAM
battery/card. If the problem persists, replace the controller module.
SNMP trap ID N/A
nvram.battery.fcc.low.warn
Message nvram.battery.fcc.low.warn
Severity INFO
Description This message occurs when the NVRAM battery full-charge capacity is below
normal.
Corrective action Replace the NVRAM battery/card during your next scheduled down-time
(within 3 months).
SNMP trap ID N/A
nvram.battery.fcc.normal
Message nvram.battery.fcc.normal
Severity INFO
Description This message occurs when the NVRAM battery full-charge capacity is normal.
Corrective action None.
SNMP trap ID N/A
nvram.battery.power.fault
Message nvram.battery.power.fault
Severity NODE_ERROR
Description This message occurs when the NVRAM battery is not receiving power.
Corrective action Correct any environmental problems such as chassis over-temperature. If the
NVRAM battery is still not getting power, replace the NVRAM battery/card. If
the problem persists, replace the controller module.
SNMP trap ID N/A
nvram.battery.power.normal
Message nvram.battery.power.normal
Severity INFO
Description This message occurs when the NVRAM battery power is normal.
Corrective action None.
SNMP trap ID N/A
nvram.battery.sensor.unreadable
Message nvram.battery.sensor.unreadable
Severity INFO
Description This message occurs when the battery state of the battery-backed memory
(NVRAM) is unknown. One of the battery sensors is not readable.
Corrective action Shut down the system and power-cycle the controller module. If the problem is
not corrected, replace the NVRAM battery/card. If the sensor is still not
readable, replace the controller module.
SNMP trap ID N/A
nvram.battery.temp.high
Message nvram.battery.temp.high
Severity NODE_ERROR
Description This message occurs when the NVRAM battery is too hot and the system is at a
high risk of data loss if power fails.
Corrective action If the system is excessively warm, allow it to cool gradually. If the NVRAM
battery temperature reading is still too high, replace the battery pack. If the
problem persists, replace the controller module.
SNMP trap ID N/A
nvram.battery.temp.high.warn
Message nvram.battery.temp.high.warn
Severity INFO
Description This message occurs when the NVRAM battery temperature is high.
Corrective action None.
SNMP trap ID N/A
nvram.battery.temp.low
Message nvram.battery.temp.low
Severity NODE_ERROR
Description This message occurs when the NVRAM battery is too cold and the system is at
a high risk of data loss if power fails.
Corrective action If the system is excessively cold, allow it to warm gradually. If the NVRAM
battery temperature reading is still too low, replace the battery pack. If the
problem persists, replace the controller module.
SNMP trap ID N/A
nvram.battery.temp.low.warn
Message nvram.battery.temp.low.warn
Severity INFO
Description This message occurs when the NVRAM battery temperature is low.
Corrective action None.
nvram.battery.temp.normal
Message nvram.battery.temp.normal
Severity INFO
Description This message occurs when the NVRAM battery temperature is normal.
Corrective action None.
SNMP trap ID N/A
nvram.battery.voltage.high
Message nvram.battery.voltage.high
Severity NODE_ERROR
Description This message occurs when the NVRAM battery voltage is excessively high and
the system will shut down.
Corrective action First, correct any environmental problems, such as chassis over-temperature. If
the NVRAM battery voltage is still too high, replace the battery pack. If the
problem persists, replace the controller module.
SNMP trap ID N/A
nvram.battery.voltage.high.warn
Message nvram.battery.voltage.high.warn
Severity INFO
Description This message occurs when the NVRAM battery voltage is above normal.
Corrective action None.
SNMP trap ID N/A
nvram.battery.voltage.low
Message nvram.battery.voltage.low
Severity NODE_ERROR
Description This message occurs when the NVRAM battery voltage is critically low. To
prevent data loss, the system will shut down in 2 minutes.
Corrective action First, correct any environmental problems, such as chassis over-temperature. If
the NVRAM battery voltage is still critically low, replace the NVRAM battery/
card. If the problem persists, replace the controller module.
SNMP trap ID N/A
nvram.battery.voltage.low.warn
Message nvram.battery.voltage.low.warn
Severity INFO
Description This message occurs when the NVRAM battery voltage is below normal. To
prevent data loss, the system will shut down in 24 hours.
Corrective action First, correct any environmental problems such as chassis over-temperature. If
the NVRAM battery voltage is still below normal, replace the NVRAM battery/
card. If the problem persists, replace the controller module.
SNMP trap ID N/A
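The battery entries above state different shutdown delays (2 minutes, 20 minutes, or 24 hours). The following minimal sketch collects those documented delays in one table and picks the most urgent event from a set of logged event names. The table and the helper name most_urgent are illustrative assumptions. The delays are copied from the descriptions above, and events whose descriptions give no delay are omitted.

```python
from datetime import timedelta

# Documented time until shutdown, as stated in each event's description.
SHUTDOWN_DELAY = {
    "nvmem.battery.capacity.low": timedelta(hours=24),
    "nvram.battery.capacity.low": timedelta(hours=24),
    "nvram.battery.capacity.low.critical": timedelta(minutes=20),
    "nvram.battery.charging.nocharge": timedelta(minutes=20),
    "nvram.battery.charging.wrongcharge": timedelta(minutes=20),
    "nvram.battery.fault": timedelta(minutes=2),
    "nvram.battery.fcc.low": timedelta(hours=24),
    "nvram.battery.fcc.low.critical": timedelta(minutes=20),
    "nvram.battery.voltage.low": timedelta(minutes=2),
    "nvram.battery.voltage.low.warn": timedelta(hours=24),
}


def most_urgent(events):
    """Return (event, delay) for the shortest documented delay, or None.

    Events without a documented delay are ignored. Ties are broken by
    event name so that the result is deterministic.
    """
    known = [(SHUTDOWN_DELAY[e], e) for e in events if e in SHUTDOWN_DELAY]
    if not known:
        return None
    delay, event = min(known)
    return event, delay


if __name__ == "__main__":
    seen = ["nvram.battery.fcc.low", "nvram.battery.fault"]
    print(most_urgent(seen))
```

Sorting by delay in this way is only a triage aid. The corrective action for each event still comes from its own entry.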
nvram.battery.voltage.normal
Message nvram.battery.voltage.normal
Severity INFO
Description This message occurs when the NVRAM battery voltage is normal.
Corrective action None.
SNMP trap ID N/A
nvram.hw.initFail
Message nvram.hw.initFail
Severity ERR
Description This message occurs when the Data ONTAP NVRAM hardware fails to
initialize.
Corrective action Typically, this type of error is unexpected and indicates that the NVRAM
hardware is failing and should be replaced. Contact technical support for
assistance with the replacement.
SNMP trap ID N/A
ispcna.mpi.dump
Message ispcna.mpi.dump
Severity SVC_ERROR
Description This message occurs when an unexpected event or illegal condition is detected
by the CNA (Converged Network Adapter) Management Port Interface (MPI)
driver and the contents of the adapter's Static RAM and memory must be
dumped. After the dump, the adapter is reset and the contents of the dump are
stored in a file in the /etc/log/ql8mpi directory.
Corrective action None; the adapter was reset.
ispcna.mpi.dump.saved
Message ispcna.mpi.dump.saved
Severity SVC_ERROR
Description This message occurs when an unexpected event or illegal condition is detected
by the CNA (Converged Network Adapter) Management Port Interface (MPI)
driver and the contents of the adapter's Static RAM and memory are saved. The
dump files are stored on the system's root volume in the /etc/log/ql8mpi
directory, with the following file name format: mpi[adapter]_[date]_[time].bin
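The file name format above can be split into its three fields with a regular expression. The following is a minimal sketch that assumes the square brackets are placeholders and that none of the adapter, date, and time fields contains an underscore. The exact encoding of each field is not stated here, so the fields are returned as unparsed strings. The function name and the sample file name are hypothetical.

```python
import re

# mpi[adapter]_[date]_[time].bin, with each field treated as an opaque token.
DUMP_NAME = re.compile(r"mpi(?P<adapter>[^_]+)_(?P<date>[^_]+)_(?P<time>[^_]+)\.bin")


def parse_dump_name(file_name):
    """Return a dict with adapter, date, and time, or None if the name does not match."""
    match = DUMP_NAME.fullmatch(file_name)
    return match.groupdict() if match else None


if __name__ == "__main__":
    # Hypothetical file name, used only to exercise the pattern.
    print(parse_dump_name("mpi1a_20120304_101500.bin"))
```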
ispcna.mpi.initFailed
Message ispcna.mpi.initFailed
Severity NODE_ERROR
Description This message occurs when the CNA (Converged Network Adapter) fails to
initialize.
Corrective action Take corrective actions based on the indicated reason for the failure.
callhome.flash.cache.failed
Message callhome.flash.cache.failed
Severity NODE_ERROR
Description This message occurs when Flash Management Module (FMM) detects that a
caching module has suffered a failure. Typically, this is the result of a hardware
failure on the caching module itself. FMM monitors all flash devices in the system.
If your system is configured to do so, it generates and transmits an AutoSupport (or
"call home") message to customer support and to the configured destinations.
Successful delivery of an AutoSupport message significantly improves problem
determination and resolution.
Corrective action A caching module has failed. This is either an indication or the cause of a
performance degradation. The exact impact cannot be estimated. This caching
module needs to be repaired or replaced. Contact customer support for more
details.
extCache.io.BlockChecksumError
Message extCache.io.BlockChecksumError
Severity NODE_ERROR
Description This message occurs when the external cache detects a block checksum
verification error while performing a read operation. The operation will be
retried from persistent storage (RAID).
Corrective action Contact technical support.
extCache.io.cardError
Message extCache.io.cardError
Severity NODE_ERROR
Description This message occurs when the external cache detects a card failure on read or
write I/O. If the I/O was a read, the operation will be retried from persistent
storage (RAID).
Corrective action Contact technical support.
extCache.io.readError
Message extCache.io.readError
Severity NODE_ERROR
Description This message occurs when the external cache detects an I/O error on a read.
The operation will be retried from persistent storage (RAID).
Corrective action Contact technical support.
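The read-path entries above share one behavior: when the external cache cannot serve a read, the operation is retried from persistent storage (RAID). The following minimal sketch shows that fallback pattern only. It is not the Data ONTAP implementation. CacheReadError, read_block, and the three callables are hypothetical names introduced for illustration.

```python
class CacheReadError(Exception):
    """Hypothetical error raised when the cache cannot serve a block."""


def read_block(block_id, cache_read, raid_read, log_event):
    """Read a block from the cache, falling back to persistent storage.

    cache_read and raid_read are callables that take a block ID and return
    the block data. log_event is a callable that records an event name.
    """
    try:
        return cache_read(block_id)
    except CacheReadError:
        # Record the failure, then retry the read from persistent storage.
        log_event("extCache.io.readError")
        return raid_read(block_id)


if __name__ == "__main__":
    def failing_cache(block_id):
        raise CacheReadError("simulated cache failure")

    logged = []
    data = read_block(7, failing_cache, lambda b: f"raid:{b}", logged.append)
    print(data, logged)
```

The read still completes, which is consistent with these events pointing to technical support rather than to an immediate recovery step.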
extCache.io.writeError
Message extCache.io.writeError
Severity NODE_ERROR
Description This message occurs when the external cache detects an I/O error on a write.
This causes the external cache component to be disabled and might result in
degraded performance until the problem is corrected.
Corrective action Contact technical support.
extCache.offline
Message extCache.offline
Severity SVC_ERROR
Description This message occurs when the external cache is automatically taken offline and
disabled. This can happen after an I/O error on the external cache and might
result in degraded performance until the problem is corrected. Check the Event
Management System (EMS) log for earlier errors.
Corrective action Contact technical support.
extCache.ReconfigComplete
Message extCache.ReconfigComplete
Severity NODE_ERROR
Description This message occurs when the Write Anywhere File Layout (WAFL) external
cache has detected a failure of one or more cache memory cards, and was able
to successfully reconfigure to continue operation with the remaining cards.
Corrective action None.
extCache.ReconfigFailed
Message extCache.ReconfigFailed
Severity NODE_ERROR
Description This message occurs when an attempt to reconfigure the external cache has
failed. The message identifies what step of the reconfiguration failed.
Corrective action Contact technical support.
extCache.ReconfigStart
Message extCache.ReconfigStart
Severity NODE_ERROR
Description This message occurs when the Write Anywhere File Layout (WAFL) external
cache has detected a failure of one or more cache memory cards. An attempt will
be made to restart the cache with the remaining card(s). Even if the cache is
restarted, performance may be degraded due to the reduced size of cache
available. See related EMS messages for details of the failing unit.
Corrective action Contact technical support.
extCache.UECCerror
Note: If you have a 16-GB Performance Acceleration Module, and if more than 10 uncorrectable
ECC memory errors occur per day for three consecutive days, replace the module.
Message extCache.UECCerror
Severity NODE_ERROR
Description This message occurs when an uncorrectable multi-bit ECC memory error is
reported to the Write Anywhere File Layout (WAFL) file system external cache.
When this event occurs the data will be re-read from persistent storage (RAID)
and operation continues. See related EMS messages for details about the failing
unit.
Corrective action If multiple uncorrectable multi-bit ECC errors are issued, this indicates that a
hardware component might be failing and should be considered for replacement.
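The note for this event gives a concrete replacement rule for a 16-GB Performance Acceleration Module: more than 10 uncorrectable ECC memory errors per day for three consecutive days. The following minimal sketch applies that rule to a list of daily error counts. The function name and the input format (one count per day, oldest first) are assumptions, and collecting the counts from the EMS log is left out.

```python
def should_replace_module(daily_uecc_counts, limit=10, days=3):
    """Return True if the count exceeds `limit` on `days` consecutive days.

    daily_uecc_counts holds one uncorrectable ECC error count per day,
    oldest first. The defaults follow the note: more than 10 errors per
    day for three consecutive days.
    """
    run = 0
    for count in daily_uecc_counts:
        # Extend the streak on a day over the limit, otherwise reset it.
        run = run + 1 if count > limit else 0
        if run >= days:
            return True
    return False


if __name__ == "__main__":
    print(should_replace_module([11, 12, 15]))
    print(should_replace_module([11, 10, 12, 13]))
```

A day with exactly 10 errors does not count toward the streak, because the note says "more than 10".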
extCache.UECCmax
Note: If you have a 16-GB Performance Acceleration Module, and if more than 10 uncorrectable
ECC memory errors occur per day for three consecutive days, replace the module.
Message extCache.UECCmax
Severity NODE_ERROR
Description This message occurs when the Write Anywhere File Layout (WAFL) file
system external cache has detected excessive multi-bit uncorrectable ECC
memory errors in a recent period. When too many multi-bit ECC errors are
reported, WAFL disables the external cache until the failing component is
replaced, resulting in degraded performance. See related EMS messages for
details about the failing unit.
Corrective action Contact technical support.
fal.chan.offline.comp
Message fal.chan.offline.comp
Severity INFO
Description This message occurs when the FAL (Flash Adaptation Layer) finishes taking a
channel offline.
Corrective action None.
fal.chan.online.erase.warn
Message fal.chan.online.erase.warn
Severity INFO
Description This message occurs when an erase of a label block fails while attempting to
bring online a channel of a card. This could lead to a failure to read the label
(see the fal.chan.online.read.warn event).
Corrective action None.
fal.chan.online.fail
Message fal.chan.online.fail
Severity SVC_ERROR
Description This message occurs when the FAL (Flash Adaptation Layer) fails to bring
online a channel of a card for the mentioned reason.
Corrective action None.
fal.chan.online.read.warn
Message fal.chan.online.read.warn
Severity INFO
Description This message occurs when the read of a label fails while attempting to bring online
a channel of a module. This is expected on the first boot with a Flash Cache
module. Otherwise, it means existing FAL (Flash Adaptation Layer) label
information is lost. The current version of software does not depend on label
information, so this loss is not a problem right now. However, future versions of
software might store cache data persistently. If persistent data is stored on a card
and this version of software is booted on such a system, failure to read the label
might lead to loss of some cached data.
Corrective action None.
fal.chan.online.rep.fail
Message fal.chan.online.rep.fail
Severity SVC_ERROR
Description This message occurs when the FAL (Flash Adaptation Layer) fails to bring
online all channels in a caching module. The reasons for failure are listed in the
accompanying fal.chan.online.fail events.
Corrective action Contact technical support.
fal.chan.online.rep.part
Message fal.chan.online.rep.part
Severity SVC_ERROR
Description This message occurs when the FAL (Flash Adaptation Layer) fails to bring
online some channels in a caching module. The reasons for failure are listed in
the accompanying fal.chan.online.fail events.
Corrective action Contact technical support.
fal.chan.online.rep.succ
Message fal.chan.online.rep.succ
Severity INFO
Description This message occurs when the FAL (Flash Adaptation Layer) successfully
brings online all channels in a card.
Corrective action None.
fal.chan.online.rep.ver.err
Message fal.chan.online.rep.ver.err
Severity SVC_ERROR
Description This message occurs when the FAL (Flash Adaptation Layer) fails to bring
online all channels in a caching module because of version mismatch.
Corrective action Follow the documented revert procedure.
fal.chan.online.write.warn
Message fal.chan.online.write.warn
Severity INFO
Description This message occurs when a write of a label block fails while attempting to
bring online a channel of a module. This could lead to a failure to read the label
(see the fal.chan.online.read.warn event).
Corrective action None.
fal.init.failed
Message fal.init.failed
Severity SVC_ERROR
Description This message occurs when the FAL (Flash Adaptation Layer) fails to initialize.
This error likely indicates a software bug.
Corrective action Contact technical support.
fmm.bad.block.detected
Message fmm.bad.block.detected
Severity DEBUG
Description This message occurs when Flash Management Module (FMM) gets a message
from a flash device driver reporting that a bad block is detected.
Corrective action None.
fmm.device.stats.missing
Message fmm.device.stats.missing
Severity DEBUG
Description This message occurs when the onboard copy of statistics maintained by Flash
Management Module (FMM) is missing. This can happen when a device is
initially activated in the controller.
Corrective action None.
fmm.domain.card.failure
Message fmm.domain.card.failure
Severity SVC_ERROR
Description This message occurs when the Flash Management Module (FMM) detects that
a flash device failed. Typically, this is the result of a hardware failure on the
flash device itself.
Corrective action Repair or replace the failed flash device.
fmm.domain.core.failure
Message fmm.domain.core.failure
Severity DEBUG
Description This message occurs when Flash Management Module (FMM) detects that a
core domain on a flash device managed by FMM has failed. Typically, this is
the result of a hardware failure on the flash device itself. Core failure is not
considered to be fatal.
Corrective action None.
fmm.domain.lun.failure
Message fmm.domain.lun.failure
Severity DEBUG
Description This message occurs when Flash Management Module (FMM) detects that a
LUN domain on a flash device managed by FMM has failed. Typically, this is
the result of a hardware failure on the flash device itself. LUN failure is not
considered fatal.
Corrective action None.
fmm.hourly.device.report
Message fmm.hourly.device.report
Severity DEBUG
Description This message is sent by Flash Management Module (FMM) every hour, to
report the status of a flash device that FMM manages.
Corrective action None.
fmm.log.bb
Message fmm.log.bb
Severity DEBUG
Description This message occurs when Flash Management Module (FMM) gets a message
from a flash device driver reporting that a bad block is detected.
Corrective action None.
fmm.threshold.bank.degraded
Message fmm.threshold.bank.degraded
Severity DEBUG
Description This message occurs when Flash Management Module (FMM) detects that in a
flash device, the percentage of a bank that is offline is above a warning
threshold. FMM responds with the action described by the action parameter.
Corrective action None.
fmm.threshold.bank.offline
Message fmm.threshold.bank.offline
Severity DEBUG
Description This message occurs when Flash Management Module (FMM) detects that in a
flash device, a critical percentage of a bank is offline, beyond which the bank
cannot operate. FMM responds with the action described by the action
parameter.
fmm.threshold.card.degraded
Message fmm.threshold.card.degraded
fmm.threshold.card.failure
Message fmm.threshold.card.failure
Severity SVC_ERROR
Description This message occurs when Flash Management Module (FMM) detects the
offline percentage of a flash device exceeds a specified critical threshold beyond
which the device cannot operate. FMM responds with the action described by
the action parameter.
Corrective action This flash device can no longer operate and will be taken offline. Repair or
replace the flash device.
fmm.threshold.core.offline
Message fmm.threshold.core.offline
Severity DEBUG
Description This message occurs when Flash Management Module (FMM) detects that an
excessive number of blocks in a core of a flash device have gone bad. The
threshold for a core is defined as a percentage of bad blocks, and when that
threshold is exceeded, FMM responds with the action described by the action
parameter.
Corrective action None.
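The percentage-based threshold check described for fmm.threshold.core.offline can be sketched as follows. This is a minimal illustration only: the function names and the 5.0 percent default are assumptions, because this guide does not state the actual threshold values or how FMM implements the check.

```python
def bad_block_percentage(bad_blocks: int, total_blocks: int) -> float:
    """Return the share of bad blocks as a percentage of all blocks."""
    if total_blocks <= 0:
        raise ValueError("total_blocks must be positive")
    if not 0 <= bad_blocks <= total_blocks:
        raise ValueError("bad_blocks must be between 0 and total_blocks")
    return 100.0 * bad_blocks / total_blocks


def threshold_exceeded(bad_blocks: int, total_blocks: int,
                       threshold_pct: float = 5.0) -> bool:
    """Return True when the bad-block percentage is strictly above the threshold.

    threshold_pct is a hypothetical value, not one taken from this guide.
    """
    return bad_block_percentage(bad_blocks, total_blocks) > threshold_pct
```

In this sketch the comparison is strict, so a core that sits exactly at the threshold is not yet reported; whether FMM treats the boundary the same way is not documented here.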
fmm.threshold.lun.offline
Message fmm.threshold.lun.offline
Severity DEBUG
Description This message occurs when Flash Management Module (FMM) detects that an
excessive number of blocks in a flash device LUN have gone bad. The threshold
for a LUN is defined as a percentage of bad blocks, and when that threshold is
exceeded, FMM responds with the action described by the action parameter.
Corrective action None.
iomem.bbm.bbtl.overflow
Message iomem.bbm.bbtl.overflow
Severity NODE_ERROR
Description This message occurs when the caching module driver detects that the Bad
Block Transaction Log has overflowed.
Corrective action None.
iomem.bbm.init.failed
Message iomem.bbm.init.failed
Severity NODE_ERROR
Description This message occurs when the caching module driver detects that an operation
to a NOR flash memory has failed.
Corrective action None.
iomem.bbm.new.flash
Message iomem.bbm.new.flash
Severity DEBUG
Description This message occurs when the caching module driver detects that a NAND
flash package has been replaced.
Corrective action None.
iomem.card.disable
Message iomem.card.disable
Severity WARNING
Description This message occurs when the caching module has been disabled as a result of
an explicit diagnostic command.
iomem.card.enable
Message iomem.card.enable
Severity INFO
Description This message occurs when the caching module has been enabled as a result of
an explicit diagnostic command.
Corrective action None.
iomem.card.fail.cecc
Note: If you have a 16-GB Performance Acceleration Module, and if more than 10 uncorrectable
ECC memory errors occur per day for three consecutive days, replace the module.
Message iomem.card.fail.cecc
Severity NODE_ERROR
Description This message occurs when the caching module driver takes an acceleration card
offline due to an excessive number of correctable memory errors.
Corrective action Replace the caching module.
iomem.card.fail.data.crc
Message iomem.card.fail.data.crc
Severity NODE_ERROR
Description This message occurs when the caching module driver takes a caching module
offline due to an excessive number of detected data cyclic redundancy check
(CRC) errors.
Corrective action Replace the caching module.
iomem.card.fail.desc.crc
Message iomem.card.fail.desc.crc
Severity NODE_ERROR
Description This message occurs when the caching module driver takes a caching module
offline due to an excessive number of detected descriptor cyclic redundancy
check (CRC) errors.
Corrective action Replace the caching module.
iomem.card.fail.dimm
Message iomem.card.fail.dimm
Severity NODE_ERROR
Description This message occurs when the caching module driver takes a caching module
offline due to failure of a memory DIMM.
Corrective action Replace the caching module.
iomem.card.fail.firmware.primary
Message iomem.card.fail.firmware.primary
Severity NODE_ERROR
Description This message occurs when the caching module driver detects that the module is
not running on the primary firmware image. The card does not function unless it is
running on the primary image.
Corrective action Note: The following steps are for systems that use the SYSDIAG diagnostic tool.
32xx and 62xx systems use system-level diagnostics, which is a different
diagnostic tool. For details about using system-level diagnostics, see the
System-Level Diagnostics Guide on the NetApp Support Site at mysupport.netapp.com.
iomem.card.fail.fpga
Message iomem.card.fail.fpga
Severity NODE_ERROR
Description This message occurs when the caching module driver detects a fatal operational
error with the onboard field programmable gate array (FPGA) hardware and is
taking the caching module offline.
iomem.card.fail.fpga.primary
Message iomem.card.fail.fpga.primary
Severity NODE_ERROR
Description This message occurs when the acceleration card driver detects that the card is not
running on the primary firmware image. The card does not function unless it is
running on the primary image.
Corrective action Note: The following steps are for systems that use the SYSDIAG diagnostic
tool. 32xx and 62xx systems use system-level diagnostics, which is a different
diagnostic tool. For details about using system-level diagnostics, see the
System-Level Diagnostics Guide on the NetApp Support Site at mysupport.netapp.com.
Take one of the following actions:
If you have a 16-GB Performance Acceleration Module, complete the following
steps:
1. Enter the following command at the boot environment prompt:
boot_diags
iomem.card.fail.fpga.rev
Message iomem.card.fail.fpga.rev
Severity NODE_ERROR
Description This message occurs when the caching module driver detects that the field
programmable gate array (FPGA) firmware image is a revision not supported by
the driver.
Corrective action Note: The following steps are for systems that use the SYSDIAG diagnostic
tool. 32xx and 62xx systems use system-level diagnostics, which is a different
diagnostic tool. For details about using system-level diagnostics, see the
System-Level Diagnostics Guide on the NetApp Support Site at mysupport.netapp.com.
Take one of the following actions:
iomem.card.fail.internal
Message iomem.card.fail.internal
Severity NODE_ERROR
Description This message occurs when the caching module driver detects a fatal internal
error on the caching module and is taking the module offline.
Corrective action Contact technical support.
iomem.card.fail.pci
Message iomem.card.fail.pci
Severity NODE_ERROR
Description This message occurs when the caching module driver detects a fatal PCI error
on the caching module and is taking the module offline.
Corrective action Contact technical support.
iomem.card.fail.uecc
Note: If you have a 16-GB Performance Acceleration Module, and if more than 10 uncorrectable
ECC memory errors occur per day for three consecutive days, replace the module.
Message iomem.card.fail.uecc
Severity NODE_ERROR
Description This message occurs when the caching module driver takes a caching module
offline due to an excessive number of uncorrectable memory errors.
Corrective action Replace the caching module.
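The replacement rule in the note above (more than 10 uncorrectable ECC memory errors per day for three consecutive days) can be expressed as a short check. The sketch below assumes you already have one error count per calendar day in chronological order; the function name and the input format are hypothetical and are not part of Data ONTAP.

```python
from typing import Sequence


def should_replace_module(daily_uecc_counts: Sequence[int],
                          per_day_limit: int = 10,
                          consecutive_days: int = 3) -> bool:
    """Return True if the daily count exceeds per_day_limit on
    consecutive_days days in a row.

    daily_uecc_counts holds one uncorrectable-ECC error count per day,
    oldest first. How the counts are collected is outside this sketch.
    """
    streak = 0
    for count in daily_uecc_counts:
        # A day at or below the limit breaks the run of bad days.
        streak = streak + 1 if count > per_day_limit else 0
        if streak >= consecutive_days:
            return True
    return False
```

A day with exactly 10 errors does not count toward the streak, because the note says "more than 10".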
iomem.dimm.log.checksum
Message iomem.dimm.log.checksum
Severity NODE_ERROR
Description This message occurs when the caching module driver detects a checksum error
in the error log for a DIMM on the caching module.
Corrective action Replace the caching module.
iomem.dimm.log.init
Message iomem.dimm.log.init
Severity INFO
Description This message occurs when the caching module driver initializes the error log
for a DIMM.
Corrective action None.
iomem.dimm.log.read
Message iomem.dimm.log.read
Severity NODE_ERROR
Description This message occurs when the caching module driver fails to read the error log
for a DIMM on the caching module.
Corrective action Replace the caching module.
iomem.dimm.log.sync
Message iomem.dimm.log.sync
Severity INFO
Description This message occurs when the caching module driver is writing the error log for
a DIMM to persistent storage.
Corrective action None.
iomem.dimm.log.write
Message iomem.dimm.log.write
Severity NODE_ERROR
Description This message occurs when the caching module driver fails to write the error log
for a DIMM on the caching module.
Corrective action Replace the caching module.
iomem.dimm.mismatch.banks
Message iomem.dimm.mismatch.banks
Severity NODE_ERROR
Description This message occurs when the caching module driver detects a DIMM with a
number of banks that does not match that of the other installed DIMMs on the
caching module.
Corrective action Replace the caching module.
iomem.dimm.mismatch.burst
Message iomem.dimm.mismatch.burst
Severity NODE_ERROR
Description This message occurs when the caching module driver detects a DIMM with a
burst size that does not match that of the other installed DIMMs on the caching
module.
Corrective action Replace the caching module.
iomem.dimm.mismatch.casLatency
Message iomem.dimm.mismatch.casLatency
Severity NODE_ERROR
Description This message occurs when the caching module driver detects a DIMM with a
column address select (CAS) latency that does not match that of the other installed
DIMMs on the caching module.
Corrective action Replace the caching module.
iomem.dimm.mismatch.columns
Message iomem.dimm.mismatch.columns
Severity NODE_ERROR
Description This message occurs when the caching module driver detects a DIMM with a
number of columns that does not match that of the other installed DIMMs on
the caching module.
iomem.dimm.mismatch.dataWidth
Message iomem.dimm.mismatch.dataWidth
Severity NODE_ERROR
Description This message occurs when the caching module driver detects a DIMM with a
data synchronous dynamic RAM (SDRAM) width that does not match that of
the other installed DIMMs on the caching module.
Corrective action Replace the caching module.
iomem.dimm.mismatch.eccWidth
Note: If you have a 16-GB Performance Acceleration Module, and if more than 10 uncorrectable
ECC memory errors occur per day for three consecutive days, replace the module.
Message iomem.dimm.mismatch.eccWidth
Severity NODE_ERROR
Description This message occurs when the caching module driver detects a DIMM with an
ECC synchronous dynamic RAM (SDRAM) width that does not match that of
the other installed DIMMs on the caching module.
Corrective action Replace the caching module.
iomem.dimm.mismatch.ranks
Message iomem.dimm.mismatch.ranks
Severity NODE_ERROR
Description This message occurs when the caching module driver detects a DIMM with a
number of ranks that does not match that of the other installed DIMMs on the
caching module.
Corrective action Replace the caching module.
iomem.dimm.mismatch.rows
Message iomem.dimm.mismatch.rows
Severity NODE_ERROR
Description This message occurs when the caching module driver detects a DIMM with a
number of rows that does not match that of the other installed DIMMs on the
caching module.
iomem.dimm.mismatch.vendor
Message iomem.dimm.mismatch.vendor
Severity NODE_ERROR
Description This message occurs when the caching module driver detects a DIMM with a
manufacturer ID that does not match that of the other installed DIMMs on the
caching module.
Corrective action Replace the caching module.
iomem.dimm.spd.banks
Message iomem.dimm.spd.banks
Severity NODE_ERROR
Description This message occurs when the caching module driver detects a DIMM with a
number of banks incompatible with the memory controller of the caching
module.
Corrective action Replace the caching module.
iomem.dimm.spd.burst
Message iomem.dimm.spd.burst
Severity NODE_ERROR
Description This message occurs when the caching module driver detects a DIMM with a
burst size incompatible with the memory controller of the caching module.
Corrective action Replace the caching module.
iomem.dimm.spd.casLatency
Message iomem.dimm.spd.casLatency
Severity NODE_ERROR
Description This message occurs when the caching module driver detects a DIMM with a
column address select (CAS) latency incompatible with the memory controller
of the caching module.
Corrective action Replace the caching module.
iomem.dimm.spd.checksum
Message iomem.dimm.spd.checksum
Severity NODE_ERROR
Description This message occurs when the caching module driver detects a checksum error
for the identifying information read from the serial presence detect (SPD)
electrically erasable programmable read-only memory (EEPROM) of a
DIMM installed on the caching module.
Corrective action Replace the caching module.
iomem.dimm.spd.columns
Message iomem.dimm.spd.columns
Severity NODE_ERROR
Description This message occurs when the caching module driver detects a DIMM with a
number of columns incompatible with the memory controller of the caching
module.
Corrective action Replace the caching module.
iomem.dimm.spd.dataWidth
Message iomem.dimm.spd.dataWidth
Severity NODE_ERROR
Description This message occurs when the caching module driver detects a DIMM with a
data synchronous dynamic RAM (SDRAM) width incompatible with the
memory controller of the caching module.
Corrective action Replace the caching module.
iomem.dimm.spd.detect
Message iomem.dimm.spd.detect
Severity INFO
Description This message occurs when the caching module driver detects the presence of an
installed DIMM during initialization.
Corrective action None.
iomem.dimm.spd.eccWidth
Note: If you have a 16-GB Performance Acceleration Module, and if more than 10 uncorrectable
ECC memory errors occur per day for three consecutive days, replace the module.
Message iomem.dimm.spd.eccWidth
Severity NODE_ERROR
Description This message occurs when the caching module driver detects a DIMM with an
ECC synchronous dynamic RAM (SDRAM) width incompatible with
the memory controller of the caching module.
Corrective action Replace the caching module.
iomem.dimm.spd.ranks
Message iomem.dimm.spd.ranks
Severity NODE_ERROR
Description This message occurs when the acceleration card driver detects a DIMM with a
number of ranks incompatible with the memory controller of the acceleration
card.
Corrective action Replace the acceleration card.
iomem.dimm.spd.read
Message iomem.dimm.spd.read
Severity NODE_ERROR
Description This message occurs when the caching module driver fails to read the
identifying information from the synchronous dynamic RAM (SDRAM)
electrically erasable programmable read-only memory (EEPROM) of a DIMM
installed on the caching module.
Corrective action Replace the acceleration card.
iomem.dimm.spd.rows
Message iomem.dimm.spd.rows
Severity NODE_ERROR
Description This message occurs when the caching module driver detects a DIMM with a
number of rows incompatible with the memory controller of the caching
module.
iomem.dma.crc.data
Message iomem.dma.crc.data
Severity WARNING
Description This message occurs when the caching module driver detects a data checksum
error for data in transit across the PCI link between the system and the caching
module.
Corrective action Contact technical support.
iomem.dma.crc.desc
Message iomem.dma.crc.desc
Severity WARNING
Description This message occurs when the caching module driver detects a descriptor
checksum error for data in transit across the PCI link between the system and
the caching module.
Corrective action Contact technical support.
iomem.dma.internal
Message iomem.dma.internal
Severity WARNING
Description This message occurs when the caching module driver detects an internal direct
memory access (DMA) error during data transfer.
Corrective action Contact technical support.
iomem.dma.stall
Message iomem.dma.stall
Severity WARNING
Description This message occurs when the acceleration card driver detects that a direct memory
access (DMA) channel has unexpectedly stalled and is attempting to restart the
DMA channel for normal operation.
Corrective action None.
iomem.ecc.cecc
Note: If you have a 16-GB Performance Acceleration Module, and if more than 10 uncorrectable
ECC memory errors occur per day for three consecutive days, replace the module.
Message iomem.ecc.cecc
Severity WARNING
Description This message occurs when a correctable ECC memory error is detected while
accessing the memory of a caching module. Frequent correctable ECC errors
usually indicate that a hardware memory component of the caching module is
failing.
Corrective action None.
iomem.ecc.correct.off
Message iomem.ecc.correct.off
Severity WARNING
Description This message occurs when the error correction code (ECC) memory error
correction has been disabled for a caching module.
Corrective action ECC error correction should never be disabled for the caching module under
normal operating conditions. The only way that this can occur is if it has been
explicitly disabled through a private diagnostic interface. If this message is
encountered under normal operating conditions, contact technical support.
iomem.ecc.correct.on
Message iomem.ecc.correct.on
Severity INFO
Description This message occurs when the error correction code (ECC) memory error
correction has been enabled for a caching module.
Corrective action None.
iomem.ecc.detect.off
Message iomem.ecc.detect.off
Severity WARNING
Description This message occurs when the error correction code (ECC) memory error
detection has been disabled for an acceleration card.
Corrective action ECC error detection should never be disabled for the caching module under
normal operating conditions. The only way that this can occur is if the
functionality has been explicitly disabled via a private diagnostic interface. If
this message is encountered under normal operating conditions, contact
technical support.
iomem.ecc.detect.on
Message iomem.ecc.detect.on
Severity INFO
Description This message occurs when the error correction code (ECC) memory error
detection has been enabled for a caching module.
Corrective action None.
iomem.ecc.inject
Message iomem.ecc.inject
Severity WARNING
Description This message occurs when an error correction code (ECC) memory error is
manually injected into the memory of a caching module. This injection event
will only occur during diagnostic testing.
Corrective action None.
iomem.ecc.summary
Message iomem.ecc.summary
Severity WARNING
Description This message occurs when the caching module driver makes its periodic error
summary report indicating that uncorrectable memory errors have been detected
on the acceleration card.
Corrective action Replace the acceleration card.
iomem.ecc.uecc
Message iomem.ecc.uecc
Severity NODE_ERROR
Description This message occurs when an uncorrectable ECC memory error is detected while
accessing the memory of a caching module. Uncorrectable ECC errors indicate
iomem.fail.stripe
Message iomem.fail.stripe
Severity INFO
Description An erase stripe is being failed.
Corrective action None.
iomem.firmware.package.access
Message iomem.firmware.package.access
Severity NODE_ERROR
Description This message occurs when the caching module driver encounters a problem
while accessing the firmware package. The caching module might continue to
function, but it is recommended that you follow the corrective action at the
earliest opportunity.
Corrective action Reinstall the Data ONTAP software package or service image.
iomem.firmware.primary
Message iomem.firmware.primary
Severity WARNING
Description This message occurs when the caching module driver detects that the card is not
running on the primary firmware image. The card does not function unless it is
running on the primary image.
Corrective action None.
iomem.firmware.program.complete
Message iomem.firmware.program.complete
Severity INFO
Description This message occurs when the caching module driver finishes the programming
procedure for the caching module firmware.
Corrective action None.
iomem.firmware.program.fail
Message iomem.firmware.program.fail
Severity NODE_ERROR
Description This message occurs when the caching module driver fails to program the card
firmware.
Corrective action Contact technical support.
iomem.firmware.program.reboot
Message iomem.firmware.program.reboot
Severity INFO
Description This message occurs when the caching module driver triggers a reboot due to
programming firmware on one or more caching modules.
iomem.firmware.program.start
Message iomem.firmware.program.start
Severity INFO
Description This message occurs when the caching module driver begins the programming
procedure for the module firmware.
Corrective action None.
iomem.firmware.rev
Message iomem.firmware.rev
Severity WARNING
Description This message occurs when the caching module driver detects that the field
programmable gate array (FPGA) firmware image is a revision not supported
by the driver.
Corrective action None.
iomem.flash.mismatch.id
Message iomem.flash.mismatch.id
Severity NODE_ERROR
Description This message occurs when the caching module driver detects a flash device
with an identifier that does not match the identifier contained in the field-
replaceable unit (FRU) information. The caching module is not functional until
you resolve this issue.
Corrective action Contact technical support.
iomem.fru.badInfo
Message iomem.fru.badInfo
Severity WARNING
Description This message occurs when the caching module driver detects invalid
information in the field-replaceable unit (FRU) electrically erasable
programmable read-only memory (EEPROM) of the caching module.
Corrective action Replace the caching module.
iomem.fru.checksum
Message iomem.fru.checksum
Severity WARNING
Description This message occurs when the caching module driver detects a checksum error
in the card field-replaceable unit (FRU) information for the caching module.
Corrective action Replace the caching module.
iomem.fru.read
Message iomem.fru.read
Severity WARNING
Description This message occurs when the caching module driver encounters an error
reading the field-replaceable unit (FRU) electrically erasable programmable
read-only memory (EEPROM) of the caching module.
Corrective action Replace the caching module.
iomem.fru.write
Message iomem.fru.write
Severity WARNING
Description This message occurs when the caching module driver encounters an error
writing the field-replaceable unit (FRU) electrically erasable programmable
read-only memory (EEPROM) of the caching module.
Corrective action Replace the caching module.
iomem.i2c.link.down
Message iomem.i2c.link.down
Severity WARNING
Description This message occurs when the caching module driver detects the failure of the
Inter-Integrated Circuit (I2C) serial link on the caching module.
Corrective action Replace the caching module.
iomem.i2c.read.addrNACK
Message iomem.i2c.read.addrNACK
Severity WARNING
Description This message occurs when the caching module driver detects an address
negative acknowledgment (NACK) error condition when reading data from an
Inter-Integrated Circuit (I2C) device on the caching module.
Corrective action Replace the caching module.
iomem.i2c.read.dataNACK
Message iomem.i2c.read.dataNACK
Severity WARNING
Description This message occurs when the caching module driver detects a data negative
acknowledgment (NACK) error condition when reading data from an Inter-
Integrated Circuit (I2C) device on the caching module.
Corrective action Replace the caching module.
iomem.i2c.read.timeout
Message iomem.i2c.read.timeout
Severity WARNING
Description This message occurs when the caching module driver times out while trying to
read data from an Inter-Integrated Circuit (I2C) device on the caching module.
Corrective action Replace the caching module.
iomem.i2c.write.addrNACK
Message iomem.i2c.write.addrNACK
Severity WARNING
Description This message occurs when the caching module driver detects an address
negative acknowledgment (NACK) error condition when writing data to an
Inter-Integrated Circuit (I2C) device on the caching module.
Corrective action Replace the caching module.
iomem.i2c.write.dataNACK
Message iomem.i2c.write.dataNACK
Severity WARNING
Description This message occurs when the caching module driver detects a data negative
acknowledgment (NACK) error condition when writing data to an
Inter-Integrated Circuit (I2C) device on the caching module.
Corrective action Replace the caching module.
iomem.i2c.write.timeout
Message iomem.i2c.write.timeout
Severity WARNING
Description This message occurs when the caching module driver times out while trying to
write data to an Inter-Integrated Circuit (I2C) device on the caching module.
Corrective action Replace the caching module.
iomem.init.detect.fpga
Message iomem.init.detect.fpga
Severity INFO
Description This message occurs when the field-programmable gate array (FPGA) on a
caching module is detected and initialized for use by the driver.
Corrective action None.
iomem.init.detect.pci
Message iomem.init.detect.pci
Severity INFO
Description This message occurs when a caching module is detected in a PCI slot and is
being initialized for use by the system.
Corrective action None.
iomem.init.fail
Message iomem.init.fail
Severity NODE_ERROR
Description This message occurs when the caching module driver fails to initialize a
caching module.
Corrective action Look for the specific failure log messages in the EMS log prior to this message;
they identify the reason for the failure.
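The corrective action above amounts to looking backward in the log from iomem.init.fail. A minimal sketch of that search follows, assuming the message names have already been extracted from the EMS log into a chronological list; the function name and the list-of-names input are hypothetical, and parsing a real EMS log is not shown.

```python
from typing import List


def iomem_events_before_init_fail(event_names: List[str]) -> List[str]:
    """Return the iomem.* message names logged before the most recent
    iomem.init.fail, in their original order.

    event_names is a chronological list of EMS message names (oldest first).
    Returns an empty list if iomem.init.fail never occurs.
    """
    target = "iomem.init.fail"
    if target not in event_names:
        return []
    # Index of the last occurrence of the target message.
    last = len(event_names) - 1 - event_names[::-1].index(target)
    return [name for name in event_names[:last]
            if name.startswith("iomem.") and name != target]
```

The result still includes informational messages such as iomem.init.detect.pci; compare each returned name with its Severity in this guide to find the entries that explain the failure.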
iomem.memory.flash.syndrome
Message iomem.memory.flash.syndrome
Severity DEBUG
Description This message occurs when the caching module driver detects a syndrome code
associated with a flash memory access.
Corrective action None.
iomem.memory.none
Message iomem.memory.none
Severity NODE_ERROR
Description This message occurs when the caching module driver cannot detect any
installed memory on a caching module.
Corrective action Replace the caching module.
iomem.memory.power.high
Message iomem.memory.power.high
Severity WARNING
Description This message occurs when the memory of the caching module has been
configured to operate in high power mode.
Corrective action Memory high power mode should never be enabled for the caching module
under normal operating conditions. The only way that this can occur is if it has
been explicitly enabled via a private diagnostic interface. If this message is
encountered under normal operating conditions, contact technical support.
iomem.memory.power.low
Message iomem.memory.power.low
Severity INFO
Description This message occurs when the memory DIMMs of the caching module have
been configured to operate in low power mode.
Corrective action None.
iomem.memory.scrub.start
Message iomem.memory.scrub.start
Severity INFO
Description This message occurs when the background error correction code (ECC)
memory scrubbing process on a caching module is starting.
Corrective action None.
iomem.memory.size
Message iomem.memory.size
Severity INFO
Description This message occurs when the caching module driver has determined the
amount of memory installed on a caching module.
Corrective action None.
iomem.memory.zero.complete
Message iomem.memory.zero.complete
Severity INFO
Description This message occurs when the boot-time zeroing of the memory of a caching
module is complete.
Corrective action None.
iomem.memory.zero.start
Message iomem.memory.zero.start
Severity INFO
Description This message occurs when the boot-time zeroing of the memory of a caching
module is starting.
Corrective action None.
iomem.nor.op.failed
Message iomem.nor.op.failed
Severity NODE_ERROR
Description This message occurs when the caching module driver detects that an operation
to a NOR flash memory has failed.
Corrective action None.
iomem.pci.error.config.bar
Message iomem.pci.error.config.bar
Severity NODE_ERROR
Description This message occurs when the caching module driver detects a misconfigured
Base Address Register (BAR) on the caching hardware.
Corrective action Boot into diagnostics and use the applicable menu option to reprogram the
primary field-programmable gate array (FPGA) image on the caching module.
If the problem persists, replace the caching module.
iomem.pio.op.failed
Message iomem.pio.op.failed
Severity NODE_ERROR
Description This message occurs when the caching module driver detects that a
programmed I/O (PIO) NAND flash access failed.
Corrective action None.
iomem.remap.block
Message iomem.remap.block
Severity INFO
Description This message occurs when a bad erase block is being remapped to a spare
block.
Corrective action None.
iomem.remap.target.bad
Message iomem.remap.target.bad
Severity INFO
Description This message occurs when the target of a remap is found to be bad.
Corrective action None.
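To make the two remap messages above more concrete, the following sketch shows one generic way a bad block can be remapped to a spare while skipping spares that are themselves bad. It illustrates the general bad-block remapping technique only; the function, its parameters, and the data structures are assumptions and do not describe the caching module driver's actual algorithm.

```python
from typing import Dict, List, Set


def remap_block(bad_block: int, spares: List[int], bad_spares: Set[int],
                remap_table: Dict[int, int]) -> int:
    """Map bad_block to the first usable spare and record it in remap_table.

    Spares that are known to be bad are consumed and skipped, which is the
    kind of situation iomem.remap.target.bad reports. Raises RuntimeError
    when no usable spare remains.
    """
    while spares:
        candidate = spares.pop(0)
        if candidate in bad_spares:
            continue  # the remap target is bad; try the next spare
        remap_table[bad_block] = candidate
        return candidate
    raise RuntimeError("no usable spare blocks left")
```

For example, with spares `[100, 101, 102]` and spare 100 marked bad, remapping block 7 records the mapping 7 to 101 and leaves 102 as the only remaining spare.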
iomem.temp.report
Message iomem.temp.report
Severity INFO
Description This message occurs periodically to report the operating temperature of the
field-programmable gate array (FPGA) on the caching module.
Corrective action None.
iomem.train.complete
Message iomem.train.complete
Severity INFO
Description This message occurs when the caching module driver has successfully trained
one of the memory controllers for a memory DIMM bank to report the
calibrated idelay setting.
Corrective action None.
iomem.train.fail
Message iomem.train.fail
Severity NODE_ERROR
Description This message occurs when the caching module driver detects that the card
memory controllers have failed to train for the installed DIMMs.
Corrective action Replace the caching module.
iomem.train.notReady
Message iomem.train.notReady
Severity NODE_ERROR
Description This message occurs when the caching module driver detects that a caching
module memory controller has failed to become ready for operation after
calibration.
Corrective action Replace the caching module.
iomem.train.start
Message iomem.train.start
Severity INFO
Description This message occurs when the caching module driver initiates training of the
memory controllers on the acceleration card to calibrate them to the installed
memory modules.
Corrective action None.
iomem.vmargin.high
Message iomem.vmargin.high
Severity WARNING
Description This message occurs when the acceleration card driver has been configured to
margin a voltage level high for testing purposes.
Corrective action None.
iomem.vmargin.low
Message iomem.vmargin.low
Severity WARNING
Description This message occurs when the caching module driver has been configured to
margin a voltage level low for testing purposes.
Corrective action None.
iomem.vmargin.nominal
Message iomem.vmargin.nominal
Severity INFO
Description This message occurs when voltage margining has been returned to nominal
level on the caching module.
Corrective action None.
monitor.extCache.failed
Message monitor.extCache.failed
Severity LOG_WARNING
Description This message occurs if the monitor detects the Write Anywhere File Layout
(WAFL) external cache subsystem (FlexScale) has failed and is no longer
available for use.
Corrective action Consult the system logs to determine the original cause of the error.
monitor.flexscale.noLicense
Message monitor.flexscale.noLicense
Severity INFO
Description This message occurs if the monitor detects that the caching module is present
but the FlexScale product is not licensed. FlexScale requires a license for use.
Corrective action Obtain a license for the FlexScale product, or remove the caching module.
ds.sas.config.warning
Message ds.sas.config.warning
Severity WARNING
Description This message occurs when the system detects a configuration problem on the
shelf I/O module.
Corrective action 1. Reseat the disk shelf I/O module.
2. If that does not fix the problem, replace the disk shelf I/O module.
ds.sas.crc.err
Message ds.sas.crc.err
Severity DEBUG
Description This message occurs when a serial-attached SCSI (SAS) cyclic redundancy
check (CRC) error is detected.
Corrective action N/A
SNMP trap ID N/A
ds.sas.drivephy.disableErr
Message ds.sas.drivephy.disableErr
Severity ERR
Description This message occurs when a physical layer device (PHY) on a serial-attached
SCSI (SAS) I/O module is disabled because of one of the following reasons:
Manually bypassed
Exceeded loss of double word synchronization threshold
Exceeded running disparity threshold
Transmitter fault
Exceeded cyclic redundancy check (CRC) error threshold
Exceeded invalid double word threshold
Exceeded PHY reset problem threshold
Exceeded broadcast change threshold
Mirroring disabled on the other I/O module
ds.sas.element.fault
Message ds.sas.element.fault
Severity ERR
Description This message indicates a transport error.
Corrective action 1. Check cabling to the disk shelf.
2. Check the status LED on the disk shelf and make sure that fault LEDs are
not on.
3. Clear any fault condition, if possible.
4. See the quick reference card beneath the disk shelf for information about the
meanings of the LEDs.
ds.sas.element.xport.error
Message ds.sas.element.xport.error
Severity ERR
Description This message indicates a transport error.
Corrective action 1. Check cabling to the disk shelf.
2. Check the status LED on the disk shelf and make sure that fault LEDs are
not on.
3. Clear any fault condition, if possible.
4. See the quick reference card beneath the disk shelf for information about the
meanings of the LEDs.
ds.sas.hostphy.disableErr
Message ds.sas.hostphy.disableErr
Severity ERR
Description This message occurs when a host physical layer device (PHY) on a serial-
attached SCSI (SAS) I/O module is disabled because of one of the following
reasons:
Manually bypassed
Exceeded loss of double word synchronization threshold
Exceeded running disparity threshold
Transmitter fault
Exceeded cyclic redundancy check (CRC) error threshold
Corrective action Replace the disk shelf module to which the host physical layer device belongs.
SNMP trap ID N/A
ds.sas.invalid.word
Message ds.sas.invalid.word
Severity DEBUG
Description This message occurs when a serial-attached SCSI (SAS) word error is detected
in a SAS primitive. These errors can be caused by the disk drive, the cable, the
host bus adapter (HBA), or the shelf I/O module.
Corrective action The SAS specification allows for a certain bit error rate, so these errors can
occur during normal operation. No action is needed if individual errors appear only
occasionally.
SNMP trap ID N/A
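One way to judge whether such errors are only occasional is to count how many fall inside a sliding time window. The following Python sketch is illustrative only: the function name, the one-hour window, and the limit of five events are assumptions, not values taken from Data ONTAP or the SAS specification.

```python
from datetime import datetime, timedelta


def exceeds_rate(timestamps, window=timedelta(hours=1), max_events=5):
    """Return True if more than max_events fall inside any sliding window.

    The window length and event limit are hypothetical defaults.
    """
    ordered = sorted(timestamps)
    start = 0
    for end, ts in enumerate(ordered):
        # Move the window start forward until it is within `window` of ts.
        while ts - ordered[start] > window:
            start += 1
        if end - start + 1 > max_events:
            return True
    return False
```

A result of False suggests that the logged errors are sparse enough to be treated as occasional under the chosen limits.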
ds.sas.loss.dword
Message ds.sas.loss.dword
Severity DEBUG
Description This message occurs when a serial-attached SCSI (SAS) loss of double word
synchronization error is detected in a SAS primitive.
Corrective action N/A
SNMP trap ID N/A
ds.sas.multPhys.disableErr
Message ds.sas.multPhys.disableErr
Severity ERR
Description This message occurs when physical layer devices (PHYs) are disabled on
multiple disk drives in a serial-attached SCSI (SAS) disk shelf.
Corrective action 1. Check whether the problems on the physical layer devices are valid.
2. If multiple physical layer devices are disabled at the same time, replace the
disk shelf module.
ds.sas.phyRstProb
Message ds.sas.phyRstProb
Severity DEBUG
Description This message occurs when a serial-attached SCSI (SAS) physical layer device
(PHY) reset error is detected in a SAS primitive.
Corrective action N/A
SNMP trap ID N/A
ds.sas.running.disparity
Message ds.sas.running.disparity
Severity DEBUG
Description This message occurs when a serial-attached SCSI (SAS) running disparity error
is detected in a SAS primitive. These errors occur when the counts of
logical 1s and 0s are too far out of balance.
Corrective action N/A
SNMP trap ID N/A
ds.sas.ses.disableErr
Message ds.sas.ses.disableErr
Severity NODE_ERROR
Description This message occurs when a virtual SCSI Enclosure Services (SES) physical
layer device (PHY) on a serial-attached SCSI (SAS) I/O module is disabled due
to one of the following reasons:
Manually bypassed
Exceeded loss of double word synchronization threshold
Exceeded running disparity threshold
Transmitter fault
Exceeded cyclic redundancy check (CRC) error threshold
Exceeded invalid double word threshold
Exceeded PHY reset problem threshold
Exceeded broadcast change threshold
Corrective action Replace the shelf module to which the concerned SES physical layer device
belongs.
SNMP trap ID N/A
ds.sas.xfer.element.fault
Message ds.sas.xfer.element.fault
Severity ERR
Description This message indicates that an element had a fault during an I/O request. It
might be because of a transient condition in link connectivity.
Corrective action 1. Check cabling to the shelf.
2. Check the status LED on the shelf, and make sure that fault LEDs are not
on.
3. Clear any fault condition, if possible.
4. See the quick reference card beneath the shelf for information about the
meanings of the LEDs.
ds.sas.xfer.export.error
Message ds.sas.xfer.export.error
Severity ERR
Description This message indicates a transport error during an I/O request. It might be due
to a transient condition in link activity.
Corrective action 1. Check cabling to the shelf.
ds.sas.xfer.not.sent
Message ds.sas.xfer.not.sent
Severity ERR
Description This message indicates that an I/O transfer could not be sent. It might be
because of a transient condition in link connectivity.
Corrective action 1. Check cabling to the shelf.
2. Check the status LED on the shelf, and make sure that fault LEDs are not
on.
3. Clear any fault condition, if possible.
4. See the quick reference card beneath the shelf for information about the
meanings of the LEDs.
ds.sas.xfer.unknown.error
Message ds.sas.xfer.unknown.error
Severity ERR
Description This message indicates that an unknown error occurred during an I/O request.
Corrective action N/A
SNMP trap ID N/A
sas.adapter.bad
Message sas.adapter.bad
Severity ALERT
Description This message occurs when the serial-attached SCSI (SAS) adapter fails to
initialize.
Corrective action 1. Reseat the adapter.
sas.adapter.bootarg.option
Message sas.adapter.bootarg.option
Severity INFO
Description The serial-attached SCSI (SAS) adapter driver is setting an option based on the
setting of a bootarg/environment variable.
Corrective action None
sas.adapter.debug
Message sas.adapter.debug
Severity INFO
Description This message occurs during the serial-attached SCSI (SAS) adapter driver
debug event.
Corrective action None
SNMP trap ID N/A
sas.adapter.exception
Message sas.adapter.exception
Severity WARNING
Description This message occurs when the serial-attached SCSI (SAS) adapter driver
encounters an error with the adapter. The adapter is reset to recover.
Corrective action None.
SNMP trap ID N/A
sas.adapter.failed
Message sas.adapter.failed
Severity ERR
Description This message occurs when the serial-attached SCSI (SAS) adapter driver
cannot recover the adapter after resetting it multiple times. The adapter is put
offline.
Corrective action 1. If the adapter is in use, check the cabling.
2. If connected to disk shelves, check the seating of IOM cards and disks.
3. If the problem persists, try replacing the adapter.
4. If the issue is still not resolved, contact technical support.
sas.adapter.firmware.download
Message sas.adapter.firmware.download
Severity INFO
Description This message occurs when firmware is being updated on the serial-attached
SCSI (SAS) adapter.
Corrective action None.
SNMP trap ID N/A
sas.adapter.firmware.fault
Message sas.adapter.firmware.fault
Severity WARNING
Description This message occurs when a firmware fault is detected on the serial-attached
SCSI (SAS) adapter and it is being reset to recover.
Corrective action None.
SNMP trap ID N/A
sas.adapter.firmware.update.failed
Message sas.adapter.firmware.update.failed
Severity CRIT
Description This message occurs when firmware on the serial-attached SCSI (SAS) adapter
cannot be updated.
Corrective action Replace the adapter as soon as possible. The SAS adapter driver attempts to
continue using the adapter without updating the firmware image.
SNMP trap ID N/A
sas.adapter.not.ready
Message sas.adapter.not.ready
Severity ERR
Description This message occurs when the serial-attached SCSI (SAS) adapter does not
become ready after being reset.
Corrective action The SAS adapter driver automatically attempts to recover from this error. If the
error keeps occurring, the adapter might need to be replaced.
SNMP trap ID N/A
sas.adapter.offline
Message sas.adapter.offline
Severity INFO
Description This message indicates the name of the associated serial-attached SCSI (SAS)
host bus adapter (HBA).
Corrective action None.
SNMP trap ID N/A
sas.adapter.offlining
Message sas.adapter.offlining
Severity INFO
Description This message occurs when the serial-attached SCSI (SAS) adapter is going
offline after all outstanding I/O requests have finished.
Corrective action None.
SNMP trap ID N/A
sas.adapter.online
Message sas.adapter.online
Severity INFO
Description This message indicates that the serial-attached SCSI (SAS) adapter is now
online.
Corrective action None.
SNMP trap ID N/A
sas.adapter.online.failed
Message sas.adapter.online.failed
Severity LOG_ERR
Description This message indicates the name of the associated serial-attached SCSI (SAS)
host bus adapter (HBA).
Corrective action 1. If the HBA is in use, check the cabling.
2. If the HBA is connected to disk shelves, check the seating of IOM cards.
sas.adapter.onlining
Message sas.adapter.onlining
Severity INFO
Description This message indicates that the serial-attached SCSI (SAS) adapter is in the
process of going online.
Corrective action None.
SNMP trap ID N/A
sas.adapter.reset
Message sas.adapter.reset
Severity INFO
Description This message occurs when the Data ONTAP serial-attached SCSI (SAS) driver
is resetting the specified HBA. This can occur during normal error handling or
by user request.
Corrective action None.
SNMP trap ID N/A
sas.adapter.unexpected.status
Message sas.adapter.unexpected.status
Severity WARNING
Description This message occurs when the serial-attached SCSI (SAS) adapter returns an
unexpected status and is reset to recover.
Corrective action None.
SNMP trap ID N/A
sas.cable.error
Message sas.cable.error
Severity WARNING
Description The system failed to retrieve information about the cable attached to the
serial-attached SCSI (SAS) adapter port.
sas.cable.pulled
Message sas.cable.pulled
Severity INFO
Description The cable attached to the serial-attached SCSI (SAS) adapter port was pulled
out.
Corrective action None.
SNMP trap ID N/A
sas.cable.pushed
Message sas.cable.pushed
Severity INFO
Description The cable attached to the serial-attached SCSI (SAS) adapter port was pushed
in.
Corrective action None.
SNMP trap ID N/A
sas.config.mixed.detected
Message sas.config.mixed.detected
Severity WARNING
Description This message occurs when a serial-attached SCSI (SAS) disk shelf contains a
mixture of SAS drives, serial advanced technology attachment (SATA) drives
or bridged SAS drives. Mixing drive types within a disk shelf is not supported.
Corrective action Ensure that each SAS disk shelf is populated with drives of only one type.
SNMP trap ID N/A
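The rule above (one drive type per SAS disk shelf) can be checked against an inventory list. This is a minimal Python sketch; the function name and the (shelf ID, drive type) input format are assumptions for illustration and do not correspond to any Data ONTAP command output.

```python
from collections import defaultdict


def find_mixed_shelves(drives):
    """Return {shelf_id: [drive types]} for shelves holding more than one type.

    `drives` is an iterable of (shelf_id, drive_type) pairs (hypothetical format).
    """
    types_by_shelf = defaultdict(set)
    for shelf_id, drive_type in drives:
        types_by_shelf[shelf_id].add(drive_type)
    return {
        shelf: sorted(types)
        for shelf, types in types_by_shelf.items()
        if len(types) > 1
    }
```

An empty result means that every shelf in the inventory is populated with a single drive type.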
sas.device.invalid.wwn
Message sas.device.invalid.wwn
Severity ERR
Description This message occurs when the serial-attached SCSI (SAS) device responds with
an invalid worldwide name.
Corrective action Power-cycling the device might allow it to recover from this problem.
SNMP trap ID N/A
sas.device.quiesce
Message sas.device.quiesce
Severity INFO
Description This message indicates that at least one command to the specified device has not
completed in the normally expected time. In this case, the driver stops sending
additional commands to the device until all outstanding commands have had an
opportunity to be completed. This condition is automatically handled by the
Data ONTAP serial-attached SCSI (SAS) driver.
Corrective action This condition by itself does not mean that the target device is problematic. High
workloads might cause link saturation leading to device contention for the bus.
Transport issues might also cause link throughput to decrease, thereby causing
I/Os to take longer than normal.
If you see this message only on occasion, no action is required. The system
handles the condition automatically.
sas.device.resetting
Message sas.device.resetting
Severity WARNING
Description This message indicates device level error recovery has escalated to resetting the
device. It is usually seen in association with error conditions such as device
level timeouts or transmission errors.
This message reports the recovery action taken by the Data ONTAP serial-
attached SCSI (SAS) driver when evaluating associated device-related or link-
related error conditions.
sas.device.timeout
Message sas.device.timeout
Severity ERR
Description This message occurs when not all outstanding commands to the specified device
were completed within the allotted time. As part of the standard error handling
sequence managed by the Data ONTAP serial-attached SCSI (SAS) driver, all
commands to the device are aborted and reissued.
Corrective action Device level timeouts are a common indication of a SAS link stability problem.
In some cases, the link is operating normally and the specified device is having
trouble processing I/O requests in a timely manner. In such cases, the specified
device should be evaluated for possible replacement.
Quite often the problem results from the partial failure of a component involved
in the SAS transport. Common things to check include the following:
Complete seating of drive carriers in enclosure bays
Properly secured cable connections
IOM seating
Crimped or otherwise damaged cables
sas.initialization.failed
Message sas.initialization.failed
Severity ERR
Description This message occurs when the serial-attached SCSI (SAS) adapter fails to
initialize the link and appears to be unattached or disconnected.
Corrective action 1. If the adapter is in use, check the cabling.
2. If the adapter is connected to disk shelves, check the seating of IOM cards.
sas.link.error
Message sas.link.error
Severity ERR
Description This message occurs when the serial-attached SCSI (SAS) adapter cannot
recover the link and is going offline.
2. If the adapter is connected to disk shelves, check the seating of IOM cards
and disks.
3. If this does not resolve the issue, contact technical support.
sas.port.disabled
Message sas.port.disabled
Severity WARNING
Description The serial-attached SCSI (SAS) adapter port went down because the operator
disabled it.
Corrective action None.
SNMP trap ID N/A
sas.port.down
Message sas.port.down
Severity WARNING
Description The serial-attached SCSI (SAS) adapter port went down through no action by
the operator.
Corrective action None.
SNMP trap ID N/A
sas.shelf.conflict
Message sas.shelf.conflict
Severity ERR
Description This message occurs when the system detects that two or more serial-attached
SCSI (SAS) disk shelves have the same shelf ID. The SAS domain is
functional, but references to disk shelves will be based on disk shelf serial
numbers, not disk shelf IDs.
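The conflict condition described above amounts to finding shelf IDs that are shared by more than one shelf serial number. The following Python sketch shows the idea; the function name and the (serial number, shelf ID) input format are assumptions for illustration.

```python
from collections import defaultdict


def find_shelf_id_conflicts(shelves):
    """Return {shelf_id: [serial numbers]} for shelf IDs used by several shelves.

    `shelves` is an iterable of (serial_number, shelf_id) pairs (hypothetical format).
    """
    serials_by_id = defaultdict(list)
    for serial_number, shelf_id in shelves:
        serials_by_id[shelf_id].append(serial_number)
    return {
        shelf_id: sorted(serials)
        for shelf_id, serials in serials_by_id.items()
        if len(serials) > 1
    }
```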
sasmon.adapter.phy.disable
Message sasmon.adapter.phy.disable
Severity ERR
Description This message occurs when a serial-attached SCSI (SAS)
transceiver (physical layer device) attached to a SAS host bus adapter (HBA) is
disabled due to one of the following reasons:
Exceeded loss of double word synchronization error threshold
Exceeded running disparity error threshold
Exceeded invalid double word error threshold
Exceeded physical layer device reset problem threshold
Exceeded broadcast change threshold
sasmon.adapter.phy.event
Message sasmon.adapter.phy.event
Severity DEBUG
Description This message occurs when a serial-attached SCSI (SAS)
transceiver (physical layer device) attached to a SAS host bus adapter (HBA)
experiences a transient error. These errors are observed on a received double
word (dword) or when resetting a PHY.
These errors include disparity errors, invalid dword errors, physical layer
device (PHY) reset problem errors, loss of dword synchronization errors, and
PHY change events. The SAS specification allows for a certain bit error rate, so
these errors can occur under normal operating conditions.
There is no cause for concern if individual errors appear only occasionally.
Corrective action None.
SNMP trap ID N/A
sasmon.disable.module
Message sasmon.disable.module
Severity INFO
Description This message occurs when the Data ONTAP module responsible for monitoring
transient errors in the serial-attached SCSI (SAS) domains is
disabled because the environment variable disable-sasmon? is set to
true.
Corrective action Set the environment variable disable-sasmon? to false to enable this
monitor module.
SNMP trap ID N/A
shm.threshold.spareBlocksConsumed
Message shm.threshold.spareBlocksConsumed
Severity NOTICE
Description This message occurs when the spares consumed value exceeds the first
threshold on an SSD.
Corrective action None.
shm.threshold.spareBlocksConsumedMax
Message shm.threshold.spareBlocksConsumedMax
Severity WARNING
Description This message occurs when the spares consumed value exceeds the second
threshold on an SSD.
Corrective action None.
ses.access.noEnclServ
Message ses.access.noEnclServ
Severity NODE_ERROR
Description This message occurs when SCSI Enclosure Services (SES) in the storage system
cannot establish contact with the enclosure monitoring process in any disk shelf on
the channel. Some disk shelves require that disks be installed and functioning in
particular shelf bays.
Corrective action Note: This message applies to DS14mk2 or DS14mk4 disk shelves that are not
AT-type shelves. DS14mk4 FC disk shelves are used in this message as an
example.
1. In disk shelves that require certain disk placement, verify that disks are
installed in the indicated bays: DS14mk4 FC: bays 0 and/or 1
Note: SCSI-based shelves, serial-attached SCSI (SAS) shelves, and
DS14mk2 AT shelves do not rely on disk placement for SES.
SES in the storage system tries periodically to reestablish contact with the disk
shelf.
2. If disks are placed correctly but the error persists for more than an hour, halt the
storage system, power-cycle the disk shelf, and reboot.
3. If the error persists, then SES hardware (for example, VEM, LRC, or IOM)
might need to be replaced; in SCSI-based shelves, replace the shelf.
ses.access.noMoreValidPaths
Message ses.access.noMoreValidPaths
Severity NODE_ERROR
Description This message occurs when SCSI Enclosure Services (SES) in the storage system
loses access to the enclosure monitoring process in the disk shelf. Some disk
shelves require that disks be installed and functioning in particular shelf bays.
Corrective action Note: This message applies to DS14mk2 or DS14mk4 disk shelves that are not
AT-type shelves. DS14mk4 is used in this message as an example.
1. In disk shelves that require certain disk placement, verify that disks are
installed and functioning in the indicated bays:
DS14mk4 FC: bays 0 and/or 1
Note: SCSI-based shelves, serial-attached SCSI (SAS) shelves, and
DS14mk2 AT shelves do not rely on disk placement for SES.
SES in the storage system tries periodically to reestablish contact with the
disk shelf.
2. If disks are placed correctly but the error persists for more than an hour, halt the
storage system, power-cycle the disk shelf, and reboot.
3. If the error persists, then SES hardware (for example, VEM, LRC, or IOM)
might need to be replaced.
In SCSI-based shelves, replace the shelf.
ses.access.noShelfSES
Message ses.access.noShelfSES
Severity NODE_ERROR
Description This message occurs when SCSI Enclosure Services (SES) in the storage system
cannot establish contact with the SES process in the indicated disk shelf. Some
disk shelves require that disks be installed and functioning in particular disk shelf
bays.
Corrective action Note: This message applies to DS14mk2 or DS14mk4 disk shelves that are not
AT-type shelves. DS14mk4 is used in this message as an example.
1. In disk shelves that require certain disk placement, verify that disks are
installed in the indicated bays:
DS14mk4 FC: bays 0 and/or 1
Note: SCSI-based shelves, serial-attached SCSI (SAS) shelves, and
DS14mk2 AT shelves do not rely on disk placement for SES.
SES in the storage system tries periodically to reestablish contact with the
disk shelf.
2. If disks are placed correctly but the error persists for more than an hour, halt the
storage system, power-cycle the disk shelf, and reboot.
3. If the error persists, then SES hardware (for example, VEM, LRC, or IOM)
might need to be replaced.
In SCSI-based shelves, replace the shelf.
ses.access.sesUnavailable
Message ses.access.sesUnavailable
Severity NODE_ERROR
Description This message occurs when SCSI Enclosure Services (SES) in the storage system
cannot establish contact with the enclosure monitoring process in one or more disk
shelves on the channel. Some disk shelves require that disks be installed and
functioning in particular disk shelf bays.
Corrective action Note: This message applies to DS14mk2 or DS14mk4 disk shelves that are not
AT-type shelves. DS14mk4 is used in this message as an example.
1. In disk shelves that require certain disk placements, verify that disks are
installed in the indicated bays:
DS14mk4 FC: bays 0 and/or 1
Note: SCSI-based shelves, serial-attached SCSI (SAS) shelves, and
DS14mk2 AT shelves do not rely on disk placement for SES.
SES in the storage system tries periodically to reestablish contact with the disk
shelf.
2. If disks are placed correctly but the error persists for more than an hour, halt the
storage system, power-cycle the disk shelf, and reboot.
3. If the error persists, then SES hardware (for example, VEM, LRC, or IOM)
might need to be replaced. In SCSI-based shelves, replace the shelf.
ses.badShareStorageConfigErr
Message ses.badShareStorageConfigErr
Severity NODE_ERROR
Description This message occurs when a disk shelf module that is not supported in a
SharedStorage system, such as an LRC module, is detected in a SharedStorage
system.
Corrective action Replace the unsupported module with one that is supported, such as an ESH,
ESH2, or AT-FCX module.
ses.bridge.fw.getFailWarn
Message ses.bridge.fw.getFailWarn
Severity WARNING
Description This message occurs when the bridge firmware revision cannot be obtained.
Corrective action Check the connection to the bank of Maxtor drives.
ses.bridge.fw.mmErr
Message ses.bridge.fw.mmErr
Severity SVC_ERROR
Description This message occurs when the bridge firmware revision is inconsistent.
Corrective action Check the firmware revision numbers and make sure that they are consistent.
You might have to update the firmware.
ses.channel.rescanInitiated
Message ses.channel.rescanInitiated
Severity INFO
Description This message identifies the name of the adapter port or switch port being
rescanned; for example, 7a or myswitch:5.
Corrective action None.
ses.config.drivePopError
Message ses.config.drivePopError
Severity WARNING
Description This message occurs when the channel has more disk drives on it than are
allowed.
Systems using synchronous mirroring allow more disk drives per channel than
other systems.
Corrective action Your action depends on whether you intend to use synchronous mirroring.
If you intend to use synchronous mirroring, make sure that the license is
installed.
If you do not intend to use synchronous mirroring, reduce the number of disk
drives on the channel to no more than the maximum allowed.
ses.config.IllegalEsh270
Message ses.config.IllegalEsh270
Severity NODE_ERROR
Description This message occurs when Data ONTAP detects one or more ESH disk shelf
modules in a disk shelf that is attached to a FAS270 system. This is not a
supported configuration.
Corrective action Replace the ESH modules with ESH2 modules.
ses.config.shelfMixError
Message ses.config.shelfMixError
Severity NODE_ERROR
Description This message occurs when the channel has a mixture of ATA and Fibre Channel
disk shelves; this is not a supported configuration.
Corrective action Mixed-mode operation of ATA and Fibre Channel disks on the system is only
supported on separate loops. Move all Fibre Channel-based disk shelves to one
loop and place all Fibre Channel-to-ATA-based disk shelves on another loop.
ses.config.shelfPopError
Message ses.config.shelfPopError
Severity NODE_ERROR
Description This message occurs when the channel has more shelves on it than are allowed.
Corrective action Reduce the number of disk shelves on the channel to the number specified.
ses.disk.configOk
Message ses.disk.configOk
Severity INFO
Description This message occurs when there are no longer any drives in FAS2050 or SA200
system slots between 20 and 23.
Corrective action None.
ses.disk.illegalConfigWarn
Message ses.disk.illegalConfigWarn
Severity WARNING
Description This message occurs when disk drives are inserted into the bottom row of a
FAS2050 or an SA200 system. Disk drives are not supported in those slots.
Corrective action None.
ses.disk.pctl.timeout
Message ses.disk.pctl.timeout
Severity DEBUG
Description This message occurs when a power control request submitted to the specified
SCSI Enclosure Services (SES) module is not completed within 60 seconds.
Corrective action Normally, there is no corrective action required for this error because the
timeout might be due to a transient error. However, if you see this message
frequently, there might be an issue with the I/O module in the shelf, which
might need to be replaced.
ses.download.powerCyclingChannel
Message ses.download.powerCyclingChannel
Severity INFO
Description This message occurs when the power-cycling channel event is issued after a
disk shelf firmware download to disk shelves that require a power-cycle to
activate the new code.
Corrective action None.
ses.download.shelfToReboot
Message ses.download.shelfToReboot
Severity INFO
Description This message occurs after the completion of shelf firmware transfer to the
DS14mk2 AT disk shelf. At this point, the disk shelf requires about another five
minutes to transfer the new firmware to its nonvolatile program memory,
whereupon it reboots to begin to execute the new firmware. During this reboot,
an FC loop reinitialization occurs, temporarily interrupting the loop.
Corrective action None.
ses.download.suspendIOForPowerCycle
Message ses.download.suspendIOForPowerCycle
Severity INFO
Description This message occurs when the suspending I/O event signals that the storage
subsystem is temporarily stopping I/O to disks while one or more disk shelves
have their power cycled after a download, if required by the disk shelf design.
Corrective action None.
ses.drive.PossShelfAddr
Message ses.drive.PossShelfAddr
Severity WARNING
Description This message occurs in conjunction with the message ses.drive.shelfAddr.mm
when there are devices that have apparently taken a wrong address; the adapter
shows device addresses that SCSI Enclosure Services (SES) indicates should not
exist, and vice versa.
This error is not a fatal condition. It means that SES cannot perform certain
operations on the affected disk drives, such as setting failure LEDs, because it is not
certain which disk shelf the affected disk drive is in.
Corrective action 1. If the problem is throughout the disk shelf, replace the disk shelf.
2. If the error is only one disk drive per disk shelf, the drive might have taken an
incorrect address at power-on.
3. Arrange to make this disk drive a spare, and then reseat it to cause it to take its
address again.
4. If the problem persists, insert a different spare disk drive into the slot. If the
error then clears, replace the original disk drive.
5. If the problem persists, there is a hardware problem with the individual disk
bay. Replace the disk shelf.
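The mismatch described above, where the adapter shows addresses that SES does not expect and SES expects addresses that the adapter does not show, is a two-way set difference. The following Python sketch shows that comparison; the function name and the plain integer address lists are assumptions for illustration.

```python
def address_mismatches(adapter_addresses, ses_addresses):
    """Compare device addresses seen by the adapter with those SES expects.

    Returns (seen_only_by_adapter, expected_only_by_ses) as sorted lists.
    """
    adapter = set(adapter_addresses)
    ses = set(ses_addresses)
    seen_only_by_adapter = sorted(adapter - ses)
    expected_only_by_ses = sorted(ses - adapter)
    return seen_only_by_adapter, expected_only_by_ses
```

A single entry in each list for one disk shelf is consistent with one disk drive having taken an incorrect address at power-on.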
ses.drive.shelfAddr.mm
Message ses.drive.shelfAddr.mm
Severity NODE_ERROR
Description This message occurs when there is a mismatch between the position of the drives
detected by the disk shelf and the address of the drives detected by the FC loop or
SCSI bus.
This error indicates that a disk drive took an address other than what the disk shelf
should have provided, or that SCSI Enclosure Services (SES) in a disk shelf cannot
be contacted for address information, or that a disk drive unexpectedly does not
participate in device discovery on the loop or bus.
If the message EMS_ses_drive_possShelfAddr subsequently appears, follow
the corrective actions in that message.
In this condition, the SES process in the system might be unable to perform certain
operations on the disk, such as setting failure LEDs or detecting disk swaps.
Corrective action Note: This message applies to DS14mk2 or DS14mk4 disk shelves that are not
AT-type shelves. DS14mk4 is used in this message as an example.
1. If this occurs to multiple disk drives on the same loop, check the I/O modules at
the back of the disk shelves on that loop for errors.
2. In disk shelves that require certain disk placement, verify that disks are installed
in the indicated bays:
DS14mk4 FC: bays 0 and/or 1
Note: SCSI-based disk shelves and DS14mk2 AT disk shelves do not rely on
disk placement for SES.
ses.exceptionShelfLog
Message ses.exceptionShelfLog
Severity DEBUG
Description This message occurs when an I/O module encounters an exception condition.
Corrective action 1. Check the system logs to see whether any disk errors recently occurred.
2. Pull an AutoSupport message file that contains the latest copy of the shelf
log information from each disk shelf.
3. Try to correlate the date and time from the errors in the message file with the
date and time of events in the shelf log file.
ses.extendedShelfLog
Message ses.extendedShelfLog
Severity DEBUG
Description This message occurs when a disk encounters an error and the system requests that
additional log information be obtained from both modules in the disk shelf
reporting the error to aid in debugging problems.
Corrective action 1. Check the system logs to see whether any disk errors recently occurred.
2. Pull an AutoSupport message file that contains the latest copy of the shelf log
information from each disk shelf.
3. Try to correlate the date and time from the errors in the message file with the
date and time of events in the shelf log file.
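The correlation in step 3 can be sketched as a timestamp match between two event lists. The following Python example assumes that the disk errors and shelf log events have already been parsed into (datetime, text) pairs. The function name, the input format, and the five-minute tolerance are assumptions, not part of any AutoSupport or shelf log format.

```python
from datetime import datetime, timedelta


def correlate(disk_errors, shelf_events, tolerance=timedelta(minutes=5)):
    """Pair each disk error with shelf log events close to it in time.

    Both inputs are lists of (datetime, text) pairs (hypothetical format).
    Returns a list of (error_text, [nearby shelf event texts]).
    """
    pairs = []
    for error_time, error_text in disk_errors:
        nearby = [
            text
            for event_time, text in shelf_events
            if abs(event_time - error_time) <= tolerance
        ]
        if nearby:
            pairs.append((error_text, nearby))
    return pairs
```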
ses.fw.emptyFile
Message ses.fw.emptyFile
Severity WARNING
Description This message occurs when a firmware file is found to be empty during a disk
shelf firmware update.
Corrective action Obtain the correct firmware file and place it in the etc/shelf_fw directory.
You can download the firmware file from the NetApp Support Site at
support.netapp.com/NOW/download/tools/diskshelf/.
ses.fw.resourceNotAvailable
Message ses.fw.resourceNotAvailable
Severity ERR
Description This message occurs when there is not enough contiguous memory available to
download disk shelf firmware.
Corrective action 1. Reduce the amount of system activity before performing a manual disk
shelf firmware update.
2. If the disk shelf firmware update fails again, reboot the storage system.
ses.giveback.restartAfter
Message ses.giveback.restartAfter
Severity INFO
Description This message occurs when SCSI Enclosure Services (SES) is restarted after
giveback.
Corrective action None.
ses.giveback.wait
Message ses.giveback.wait
Severity INFO
Description This message occurs when SCSI Enclosure Services (SES) information is not
available because the system is waiting for giveback.
Corrective action None.
ses.psu.coolingReqError
Message ses.psu.coolingReqError
Severity LOG_CRIT
Description This message occurs when the installed power supplies are placed so that air-flow
requirements of the disk shelf are not met. The power supply chassis and their
power supplies are an integral part of the disk shelf cooling and air-flow design.
Corrective action Verify that the power supplies are placed in the locations required to provide
proper air flow according to the disk shelf specifications.
DS14-style shelves always require both power supplies. SAS-Shelf24 requires
power supplies in power supply bays 1 and 4 for proper air flow and cooling.
ses.psu.powerReqError
Message ses.psu.powerReqError
Severity LOG_CRIT
Description This message occurs when too few power supplies are installed to redundantly
satisfy the current-draw requirements of the disk drives in the disk shelf. This
might occur if a power supply is removed or fails. Some disk drive models require
more power than others. If the disk shelf specifications for the installed drive
models specify more power supplies to support that disk type, then this condition
can also occur at disk swap or insertion in some disk shelves.
Corrective action Verify that the number of power supplies installed satisfies the power requirements
of the installed disk drives.
DS14-style shelves always require both power supplies. SAS-Shelf24 requires
power supplies in power supply bays 1 and 4 for proper cooling and air flow. If
any disk drives are 10K RPM or faster, then power supply bays 2 and 3 must also
have power supplies.
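The placement rules above can be expressed as a small check. The following Python sketch encodes only the rules stated in this entry. The function names, the shelf type strings, and the numbering of the two DS14 bays as 1 and 2 are assumptions made for illustration.

```python
def required_psu_bays(shelf_type, max_drive_rpm):
    """Return the set of power supply bays that must be populated.

    shelf_type strings and the DS14 bay numbers (1 and 2) are assumptions
    for this sketch; the rules themselves come from the entry above.
    """
    if shelf_type == "DS14":
        # DS14-style shelves always require both power supplies.
        return {1, 2}
    if shelf_type == "SAS-Shelf24":
        # Bays 1 and 4 are always required for cooling and air flow.
        bays = {1, 4}
        # Any drive at 10K RPM or faster also requires bays 2 and 3.
        if max_drive_rpm >= 10000:
            bays |= {2, 3}
        return bays
    raise ValueError(f"no rule recorded for shelf type {shelf_type!r}")

def missing_psu_bays(shelf_type, max_drive_rpm, populated_bays):
    """Return the required bays that currently have no power supply."""
    return required_psu_bays(shelf_type, max_drive_rpm) - set(populated_bays)
```

For example, missing_psu_bays("SAS-Shelf24", 15000, [1, 4]) returns {2, 3}, which corresponds to the condition that this message reports.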
ses.remote.configPageError
Message ses.remote.configPageError
Severity INFO
Description This message occurs when a request to another system in a SharedStorage
configuration fails. This request was for a specific disk shelf's SCSI Enclosure
Services (SES) configuration page.
Corrective action Contact technical support.
274 | Hardware Platform Monitoring Guide
ses.remote.elemDescPageError
Message ses.remote.elemDescPageError
Severity INFO
Description This message occurs when a request to another system in a SharedStorage
configuration fails. This request was for the element descriptor pages that the
other system has local access to.
Corrective action Contact technical support.
ses.remote.faultLedError
Message ses.remote.faultLedError
Severity INFO
Description This message occurs when a request to another system to have it set the fault
LED of a disk drive on a disk shelf fails.
Corrective action Contact technical support.
ses.remote.flashLedError
Message ses.remote.flashLedError
Severity INFO
Description This message occurs when a request to another system to have it flash the LED
of a disk drive on a disk shelf fails.
Corrective action Contact technical support.
ses.remote.shelfListError
Message ses.remote.shelfListError
Severity INFO
Description This message occurs when a request to another system in a SharedStorage
configuration fails. This request was for a list of the disk shelves that the other
system has local access to.
Corrective action Contact technical support.
ses.remote.statPageError
Message ses.remote.statPageError
Severity INFO
Description This message occurs when a request to another system in a SharedStorage
configuration fails. This request was for the SCSI Enclosure Services (SES)
status pages that the other system has local access to.
Corrective action Contact technical support.
ses.shelf.changedID
Message ses.shelf.changedID
Severity WARNING
Description This message occurs on a SAS disk shelf when the disk shelf ID changes after
power is applied to the disk shelf.
Corrective action 1. Verify that the disk shelf ID displayed in this message is the same as the disk
shelf ID shown on the disk shelf.
2. If they are different, perform one of the following steps:
• If the disk shelf ID displayed in this message is the one you want, reset the
disk shelf ID on the thumbwheel to match it.
• If you want the new disk shelf ID instead of the disk shelf ID displayed in
the message, verify that the disk shelf ID you want does not conflict with
other disk shelves in the domain.
3. Power-cycle the disk shelf chassis. You can wait to perform this procedure
until your next maintenance window.
4. If the warning persists on both disk shelf modules after you complete the
procedure, replace the disk shelf chassis. If it persists on only one disk shelf
module, replace the disk shelf module.
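The replacement decision in step 4 can be summarized as a short Python sketch. The function name and the boolean inputs are hypothetical; they stand for whether the warning persists on each of the two disk shelf modules after the power-cycle.

```python
def part_to_replace(warning_on_module_a, warning_on_module_b):
    """Map where the warning persists to the part named in step 4.

    Returns None when the warning has cleared on both modules.
    """
    if warning_on_module_a and warning_on_module_b:
        return "disk shelf chassis"
    if warning_on_module_a or warning_on_module_b:
        return "disk shelf module"
    return None
```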
ses.shelf.ctrlFailErr
Message ses.shelf.ctrlFailErr
Severity SVC_ERROR
Description This message occurs when the adapter and loop ID of the SCSI Enclosure
Services (SES) target for which the SES has control fail.
Corrective action 1. Check the LEDs on the disk shelf and the disk shelf modules on the back of
the disk shelf to see whether there are any abnormalities. If the modules
appear to be problematic, replace the applicable module.
2. If the SES target is a disk drive, check to see whether the disk drive failed. If
it failed, replace the disk drive.
ses.shelf.em.ctrlFailErr
Message ses.shelf.em.ctrlFailErr
Severity SVC_ERROR
Description This message occurs when SCSI Enclosure Services (SES) control to the
internal disk drives of a system fails.
Corrective action 1. Enter the environment shelf command to see whether that disk shelf is still being
actively monitored.
2. If the environment shelf command indicates a failure, there is a
hardware failure in the system's internal disk shelf.
ses.shelf.IdBasedAddr
Message ses.shelf.IdBasedAddr
Severity WARNING
Description This message occurs on a serial-attached SCSI (SAS) disk shelf when the SAS
addresses of the devices are based on the disk shelf ID instead of the disk shelf
backplane serial number. This indicates problems communicating with the disk
shelf backplane.
Corrective action 1. Reseat the master disk shelf module, as indicated by the output of the
environment shelf command.
ses.shelf.invalNum
Message ses.shelf.invalNum
Severity WARNING
Description This message occurs when Data ONTAP detects that a serial-attached SCSI
(SAS) shelf connected to the system has an invalid shelf number.
Corrective action 1. Power-cycle the shelf.
ses.shelf.mmErr
Message ses.shelf.mmErr
Severity SVC_FAULT
Description This message occurs when there is a disk shelf that is not supported by the
platform it was booted on.
Corrective action 1. Check whether the current version of Data ONTAP supports the disk shelf.
2. If the current version of Data ONTAP does not support the disk shelf, install
a version that does support the disk shelf.
If the disk shelf is supported, the error might be cleared by hourly attempts by
Data ONTAP to establish proper contact with the disk shelf.
ses.shelf.OSmmErr
Message ses.shelf.OSmmErr
Severity SVC_ERROR
Description This message occurs when there are incompatible Data ONTAP versions in a
SharedStorage configuration that would cause SCSI Enclosure Services (SES)
not to function properly.
Corrective action Update the system that has an earlier Data ONTAP version to match the one
that has the latest Data ONTAP version.
ses.shelf.powercycle.done
Message ses.shelf.powercycle.done
Severity INFO
Description This message occurs when a disk shelf power-cycle finishes.
Corrective action None.
ses.shelf.powercycle.start
Message ses.shelf.powercycle.start
Severity INFO
Description This message occurs when a disk shelf is power-cycled and SCSI Enclosure
Services (SES) needs to wait for it to finish.
Corrective action None.
ses.shelf.sameNumReassign
Message ses.shelf.sameNumReassign
Severity WARNING
Description This message occurs when Data ONTAP detects more than one serial-attached
SCSI (SAS) disk shelf connected to the same adapter with the same shelf
number.
Corrective action 1. Change the shelf number on the shelf to one that does not conflict with other
shelves attached to the same adapter. Halt the system and reboot the shelf.
2. If the problem persists, contact technical support.
ses.shelf.unsupportAllowErr
Message ses.shelf.unsupportAllowErr
Severity SVC_FAULT
Description This message occurs when a disk shelf is not supported by Data ONTAP. Data
ONTAP will continue to use the disk shelf, but environmental monitoring of the
disk shelf is not possible.
Corrective action 1. Check whether the current version of Data ONTAP supports the disk shelf.
2. If the current version of Data ONTAP does not support the disk shelf, install a
version that does support the disk shelf.
If the disk shelf is supported, the error might be cleared by hourly attempts by
Data ONTAP to establish proper contact with the disk shelf.
ses.shelf.unsupportedErr
Message ses.shelf.unsupportedErr
Severity SVC_FAULT
Description This message occurs when there is a disk shelf that is not supported by Data
ONTAP.
Corrective action Check whether this disk shelf is supported by a newer version of Data ONTAP.
If it is, upgrade to the appropriate version.
ses.startTempOwnership
Message ses.startTempOwnership
Severity DEBUG
Description This message occurs when SCSI Enclosure Services (SES) is starting
temporary ownership acquisition of disks owned by other nodes. This involves
removing the disk reservations while the SES operations are in progress.
Corrective action Contact technical support.
ses.status.ATFCXError
Message ses.status.ATFCXError
Severity NODE_ERROR
Description This message occurs when the reporting disk shelf detects an error in the
indicated AT-FCX module. The module might not be able to perform I/O to
disks within the disk shelf.
Corrective action 1. Verify that the AT-FCX module is fully seated and secured.
ses.status.ATFCXInfo
Message ses.status.ATFCXInfo
Severity INFO
Description This message occurs when a previously reported error in the AT-FCX module
is corrected, or the system reports other information that does not necessarily
require customer action.
Corrective action None.
ses.status.currentError
Message ses.status.currentError
Severity NODE_ERROR
Description This message occurs when a critical condition is detected in the indicated
storage shelf current sensor. The shelf might be able to continue operation.
Corrective action 1. Verify that the power supply and the AC line are supplying power.
ses.status.currentInfo
Message ses.status.currentInfo
Severity INFO
Description This message occurs when an error or warning condition previously reported by
or about the disk shelf current sensor is corrected, or the system reports other
information about the current in the disk shelf that does not necessarily require
customer action.
Corrective action None.
ses.status.currentWarning
Message ses.status.currentWarning
Severity WARNING
Description This message occurs when a warning condition is detected in the indicated
storage shelf current sensor. The shelf might be able to continue operation.
Corrective action 1. Verify that the power supply and the AC line are supplying power.
ses.status.displayError
Message ses.status.displayError
Severity NODE_ERROR
Description This message occurs when the SCSI Enclosure Services (SES) module in the disk
shelf detects an error in the disk shelf display panel. The disk shelf might be
unable to provide correct addresses to its disks.
Corrective action 1. If possible, verify that the connection between the disk shelf and the display is
secure.
2. Verify that the SES module or modules are fully seated; replacing them might
solve the problem.
3. If the problem persists, the SES module that detected the warning condition
might be faulty.
4. If the problem persists after the module or modules are replaced, replace the
disk shelf.
5. If the problem persists, contact technical support.
ses.status.displayInfo
Message ses.status.displayInfo
Severity INFO
Description This message occurs when a previous condition in the display panel is
corrected.
Corrective action None.
ses.status.displayWarning
Message ses.status.displayWarning
Severity WARNING
Description This message occurs when the SCSI Enclosure Services (SES) module detects a
warning condition for the disk shelf display panel. The disk shelf might be unable
to provide correct addresses to its disks.
Corrective action 1. If possible, verify that the connection between the disk shelf and the display is
secure.
2. Verify that the SES module or modules are fully seated; replacing them might
solve the problem.
3. If the problem persists, the SES module that detected the warning condition
might be faulty.
4. If the problem persists after the module or modules are replaced, replace the
disk shelf.
5. If the problem persists, contact technical support.
ses.status.driveError
Message ses.status.driveError
Severity NODE_ERROR
Description This message occurs when a critical condition is detected for the disk drive in
the shelf. The drive might fail.
Corrective action 1. Make sure that the drive is not running on a degraded volume. If it is, then
add as many spares as necessary into the system, up to the specified level.
2. After the volume is no longer in degraded mode, replace the drive that is
failing.
ses.status.driveOk
Message ses.status.driveOk
Severity INFO
Description This message occurs when a disk drive that was previously experiencing
a problem returns to normal operation.
Corrective action None.
ses.status.driveWarning
Message ses.status.driveWarning
Severity NODE_ERROR
Description This message occurs when a non-critical condition is detected for the disk drive
in the shelf. The drive might fail.
Corrective action 1. Make sure that the drive is not running on a degraded volume. If it is, then
add as many spares as necessary into the system, up to the specified level.
2. After the volume is no longer in degraded mode, replace the drive that is
failing.
ses.status.electronicsError
Message ses.status.electronicsError
Severity NODE_ERROR
Description This message occurs when a failure has been detected in the module that
provides disk SCSI Enclosure Services (SES) monitoring capability.
Corrective action Replace the module. In some disk shelf types, this function is integrated into the
Fibre Channel, SCSI, or serial-attached SCSI (SAS) interface modules.
ses.status.electronicsInfo
Message ses.status.electronicsInfo
Severity INFO
Description This message occurs when a problem previously reported about the disk shelf
SCSI Enclosure Services (SES) electronics is corrected or when other
information about the enclosure electronics that does not necessarily require
customer action is reported.
Corrective action None.
ses.status.electronicsWarn
Message ses.status.electronicsWarn
Severity WARNING
Description This message occurs when a non-fatal condition is detected in the module that
provides disk SCSI Enclosure Services (SES) monitoring capability.
Corrective action Replace the module. In some disk shelf types, this function is integrated into the
Fibre Channel, SCSI, or serial-attached SCSI (SAS) interface modules.
ses.status.ESHPctlStatus
Message ses.status.ESHPctlStatus
Severity DEBUG
Description This message occurs when a change in the power control status is detected in
the indicated disk shelf.
Corrective action None.
ses.status.fanError
Message ses.status.fanError
Severity NODE_ERROR
Description This message occurs when the indicated disk shelf cooling fan or fan module
fails, and the shelf or its components are not receiving required cooling airflow.
Corrective action 1. Verify that the fan module is fully seated and secured. (The fan is integrated
into the power supply module in some disk shelves.)
2. If the problem persists, replace the fan module.
ses.status.fanInfo
Message ses.status.fanInfo
Severity INFO
Description This message occurs when a condition previously reported about the disk shelf
cooling fan or fan module is corrected or when other information about the fans
that does not necessarily require customer action is reported.
Corrective action None.
ses.status.fanWarning
Message ses.status.fanWarning
Severity WARNING
Description This message occurs when a disk shelf cooling fan is not operating to
specification, or a component of a fan module has stopped functioning. The disk
shelf components continue to receive cooling airflow but might eventually reach
temperatures that are out of specification.
Corrective action 1. Verify that the fan module is fully seated and secured. (The fan is integrated
into the power supply module in some disk shelves.)
2. If the problem persists, replace the fan module.
3. If the problem persists, contact technical support.
ses.status.ModuleError
Message ses.status.ModuleError
Severity NODE_ERROR
Description This message occurs when the reporting disk shelf detects an error in the
indicated disk shelf module.
Corrective action 1. Verify that the shelf module is fully seated and secure.
ses.status.ModuleInfo
Message ses.status.ModuleInfo
Severity INFO
Description This message occurs when a previously reported error in the shelf module is
corrected or when other information that does not necessarily require customer
action is reported.
Corrective action None.
ses.status.ModuleWarn
Message ses.status.ModuleWarn
Severity WARNING
Description This message occurs when the reporting disk shelf detects a warning in the
indicated disk shelf module.
Corrective action 1. Verify that the shelf module is fully seated and secure.
ses.status.psError
Message ses.status.psError
Severity NODE_ERROR
Description This message occurs when a critical condition is detected in the indicated storage
shelf power supply. The power supply might fail.
Corrective action 1. Verify that power input to the shelf is correct. If separate events of this type
are reported simultaneously, the common power distribution point might be at
fault.
2. If the shelf is in a cabinet, verify that the power distribution unit is ON and
functioning properly. Make sure that the shelf power cords are fully inserted
and secured, the supply is fully seated and secured, and the supply is switched
ON.
3. Verify that power supply fans, if any, are functioning. If the problem persists,
replace the power supply.
4. If the problem persists, contact technical support.
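Step 1 notes that separate events reported at the same time can point to a shared power distribution fault. The following Python sketch shows one way to spot such clusters in a list of already-parsed events. The function name, the 60-second window, and the source labels are hypothetical; this guide does not define what interval counts as simultaneous.

```python
from datetime import datetime, timedelta

def simultaneous_sources(events, window=timedelta(seconds=60)):
    """Find groups of power supply events from different sources that occur together.

    events is a list of (datetime, source) tuples, where source is any label
    identifying the reporting power supply. A group starts at its first event
    and collects every event logged within `window` of that first event.
    Only groups with more than one distinct source are returned.
    """
    groups = []
    current = []
    for ts, source in sorted(events):
        if current and ts - current[0][0] > window:
            groups.append(current)
            current = []
        current.append((ts, source))
    if current:
        groups.append(current)
    clusters = []
    for group in groups:
        sources = {source for _, source in group}
        if len(sources) > 1:
            clusters.append(sources)
    return clusters

# Hypothetical sample data, for illustration only.
events = [
    (datetime(2014, 3, 1, 10, 0, 0), "shelf 1 PSU 1"),
    (datetime(2014, 3, 1, 10, 0, 20), "shelf 2 PSU 1"),
    (datetime(2014, 3, 1, 11, 0, 0), "shelf 3 PSU 2"),
]
clusters = simultaneous_sources(events)
```

A non-empty result suggests checking the power distribution point that the listed power supplies share before replacing any individual power supply.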
ses.status.psInfo
Message ses.status.psInfo
Severity INFO
Description This message occurs when a condition previously reported about the disk shelf
power supply is corrected or when other information about the power supply
that does not necessarily require customer action is reported.
Corrective action None.
ses.status.psWarning
Message ses.status.psWarning
Severity WARNING
Description This message occurs when a warning condition is detected in the indicated storage
shelf power supply. The power supply might be able to continue operation.
Corrective action 1. Verify that the disk shelf is receiving power. If separate events of this type are
reported simultaneously, the common power distribution point might be at
fault.
2. If the disk shelf is in a cabinet, verify that the power distribution unit is
ON and functioning properly. Make sure that the disk shelf power cords are
fully inserted and secured, the power supply is fully seated and secured, and
the power supply is switched on.
3. If the problem persists, replace the power supply.
4. If the problem persists, contact technical support.
ses.status.temperatureError
Message ses.status.temperatureError
Severity NODE_ERROR
Description This message occurs when the indicated disk shelf temperature sensor reports a
temperature that exceeds the specifications for the disk shelf or its components.
Corrective action 1. Use the environment shelf [adapter] command to verify that the ambient
temperature where the shelf is installed is within equipment specifications, and
verify that airflow clearances are maintained.
2. If the same disk shelf also reports fan or fan module failures, correct that
problem now. If the problem is reported by the ambient temperature sensor
(located on the operator panel), verify that the connection between the disk
shelf and the panel is secure, if possible.
3. If the problem persists, and if the shelf has multiple temperature sensors of
which only one exhibits the problem, replace the module that contains the
sensor that reports the error. If the problem persists, contact technical support
for assistance.
Note: You can display temperature thresholds for each shelf through the
environment shelf command.
ses.status.temperatureInfo
Message ses.status.temperatureInfo
Severity INFO
Description This message occurs when an error or warning condition previously reported by
or about the disk shelf temperature sensor is corrected or when other
information about the temperature in the disk shelf that does not necessarily
require customer action is reported.
Corrective action None.
ses.status.temperatureWarning
Message ses.status.temperatureWarning
Severity WARNING
Description This message occurs when the indicated disk shelf temperature sensor reports a
temperature that is close to exceeding the specifications for the disk shelf or its
components.
Corrective action 1. Use the environment shelf [adapter] command to verify that the ambient
temperature where the disk shelf is installed is within equipment specifications, and
verify that airflow clearances are maintained.
2. If this disk shelf also reports fan or fan module errors or warnings, correct
those problems now.
3. If the problem persists, and the shelf has multiple temperature sensors and only
one of them exhibits the problem, replace the module that contains the sensor.
4. If the problem persists, contact technical support.
Note: Temperature thresholds for each shelf can be displayed through the
environment shelf command.
ses.status.upsError
Message ses.status.upsError
Severity NODE_ERROR
Description This message occurs when the disk shelf detects a failure in the uninterruptible
power supply (UPS) attached to it. This might occur, for example, if power to
the UPS is lost.
Corrective action 1. Restore power to the UPS.
2. Verify that the connection from the UPS to the disk shelf is in place and
secured and that the UPS is enabled.
3. If the problem persists, contact technical support.
ses.status.upsInfo
Message ses.status.upsInfo
Severity INFO
Description This message occurs when a condition previously reported about the
uninterruptible power supply (UPS) attached to the disk shelf is corrected or
when other information about the UPS that does not necessarily require
customer action is reported.
Corrective action None.
ses.status.volError
Message ses.status.volError
Severity NODE_ERROR
Description This message occurs when a critical condition is detected in the indicated disk
storage shelf voltage sensor. The shelf might be able to continue operation.
Corrective action 1. Verify that the power supply and the AC line are supplying power.
2. Monitor the power grid for abnormalities.
3. Replace the power supply.
4. If the problem persists, contact technical support.
ses.status.volWarning
Message ses.status.volWarning
Severity WARNING
Description This message occurs when a warning condition is detected in the indicated
storage shelf voltage sensor. The shelf might be able to continue operation.
Corrective action 1. Verify that the power supply and the AC line are supplying power.
ses.system.em.mmErr
Message ses.system.em.mmErr
Severity NODE_FAULT
Description This message occurs when Data ONTAP does not support this system with
internal disk drives.
Corrective action Check whether this system is currently supported. If it is, upgrade to the
appropriate Data ONTAP version.
ses.tempOwnershipDone
Message ses.tempOwnershipDone
Severity DEBUG
Description This message occurs when SCSI Enclosure Services (SES) completes
temporary ownership acquisition.
Corrective action Contact technical support.
sfu.adapterSuspendIO
Message sfu.adapterSuspendIO
Severity INFO
Description This message occurs during a disk shelf firmware update on a disk shelf that
cannot perform I/O while updating firmware. Typically, the shelves involved
are bridge-based as opposed to ESH-based.
Corrective action None.
sfu.auto.update.off.impact
Message sfu.auto.update.off.impact
Severity WARNING
Description This message occurs when the automated disk shelf firmware update cannot be
completed on a downrev disk shelf enclosure because the (hidden) global
option shelf.fw.auto.update is set to off.
Corrective action Use the storage download shelf command to update the disk shelf firmware
manually. To enable automatic updates, set the hidden option
shelf.fw.auto.update to on.
sfu.ctrllerElmntsPerShelf
Message sfu.ctrllerElmntsPerShelf
Severity INFO
Description This message occurs when a disk shelf firmware download determines the
number of controller elements per shelf that can be downloaded.
Corrective action None.
sfu.downloadCtrllerBridge
Message sfu.downloadCtrllerBridge
Severity INFO
Description This message occurs when a disk shelf firmware download starts on a particular
disk shelf.
Corrective action None.
sfu.downloadError
Message sfu.downloadError
Severity ERR
Description This message occurs when a disk shelf firmware update fails to successfully
download firmware to a disk shelf or shelves in the system.
Corrective action 1. Download the latest disk shelf firmware again from the NetApp Support
Site at support.netapp.com/NOW/download/tools/diskshelf/.
2. Attempt to download disk shelf firmware again by using the storage
download shelf command.
sfu.downloadingController
Message sfu.downloadingController
Severity INFO
Description This message occurs when a disk shelf firmware download starts on a particular
disk shelf.
sfu.downloadingCtrllerR1XX
Message sfu.downloadingCtrllerR1XX
Severity INFO
Description This message occurs when a disk shelf firmware download starts on a particular
disk shelf.
Corrective action None.
sfu.downloadStarted
Message sfu.downloadStarted
Severity INFO
Description This message occurs when a disk shelf firmware update starts to download disk
shelf firmware.
Corrective action None.
sfu.downloadSuccess
Message sfu.downloadSuccess
Severity INFO
Description This message occurs when disk shelf firmware is updated successfully.
Corrective action None.
sfu.downloadSummary
Message sfu.downloadSummary
Severity INFO
Description This message occurs when a disk shelf firmware update is completed
successfully.
Corrective action None.
sfu.downloadSummaryErrors
Message sfu.downloadSummaryErrors
Severity ERR
Description This message occurs when a disk shelf firmware update is completed without
successfully downloading to all shelves it attempted.
Corrective action Issue the storage download shelf command again.
sfu.FCDownloadFailed
Message sfu.FCDownloadFailed
Severity ERR
Description This message occurs when a disk shelf firmware update fails to successfully
download shelf firmware to a Fibre Channel or an ATA shelf.
Corrective action 1. Download the latest disk shelf firmware again from the NetApp Support
Site at support.netapp.com/NOW/download/tools/diskshelf/.
2. Attempt to download disk shelf firmware again by using the storage
download shelf command.
sfu.firmwareDownrev
Message sfu.firmwareDownrev
Severity WARNING
Description This message occurs when disk shelf firmware is downrev and therefore cannot
be updated automatically.
Corrective action 1. Copy updated disk shelf firmware into the /etc/shelf_fw directory on the
storage appliance.
2. Manually issue the storage download shelf command.
sfu.firmwareUpToDate
Message sfu.firmwareUpToDate
Severity INFO
Description This message occurs when a disk shelf firmware update is requested but the
system determines that all shelves are already updated to the latest
version of firmware available.
Corrective action None.
sfu.partnerInaccessible
Message sfu.partnerInaccessible
Severity ERR
Description This message occurs in an HA pair in which communication between partner
nodes cannot be established.
Corrective action 1. Verify that the HA pair interconnect is operational.
sfu.partnerNotResponding
Message sfu.partnerNotResponding
Severity ERR
Description This message occurs in an HA pair in which one node does not respond to
firmware download requests from another node. In this case, the other node
cannot download disk shelf firmware.
Corrective action Verify that the HA pair interconnect is up and running on both nodes of the
configuration, and then attempt to download the disk shelf firmware again by using the
storage download shelf command.
sfu.partnerRefusedUpdate
Message sfu.partnerRefusedUpdate
Severity ERR
Description This message occurs in an HA pair in which one node refuses firmware
download requests from its partner node. In this case, the partner node cannot
download disk shelf firmware.
Corrective action 1. Verify that both partners are running the same version of Data ONTAP
and that the HA pair interconnect is up and running on all
nodes of the configuration.
2. Attempt the storage download shelf command again.
sfu.partnerUpdateComplete
Message sfu.partnerUpdateComplete
Severity INFO
Description This message occurs in an HA pair in which a partner downloads disk shelf
firmware and the download is completed. At this point, this notification is sent
and SCSI Enclosure Services (SES) are resumed by the partner.
sfu.partnerUpdateTimeout
Message sfu.partnerUpdateTimeout
Severity INFO
Description This message occurs in an HA pair in which a partner downloads disk shelf
firmware but the download times out. At this point, this notification is sent and
SCSI Enclosure Services (SES) are resumed by the partner.
Corrective action 1. Verify that the HA pair interconnect is operational.
sfu.rebootRequest
Message sfu.rebootRequest
Severity INFO
Description This message occurs when the disk shelf firmware update is completed. The
disk shelf reboots to run the new code.
Corrective action None.
sfu.rebootRequestFailure
Message sfu.rebootRequestFailure
Severity ERR
Description This message occurs when an attempt to issue a reboot request after
downloading shelf firmware fails, indicating a software error.
Corrective action Reboot the storage system, if possible, and try the firmware update again.
sfu.resumeDiskIO
Message sfu.resumeDiskIO
Severity INFO
Description This message occurs when a disk shelf firmware update is completed and disk
I/O is resumed.
Corrective action None.
sfu.SASDownloadFailed
Message sfu.SASDownloadFailed
Severity ERR
Description This message occurs when a disk shelf firmware update fails to successfully
download shelf firmware to a shelf.
Corrective action 1. Download the latest disk shelf firmware again from the NetApp Support
Site at support.netapp.com/NOW/download/tools/diskshelf/.
2. Download disk shelf firmware again by using the storage download
shelf command.
sfu.statusCheckFailure
Message sfu.statusCheckFailure
Severity ERR
Description This message occurs when the storage download shelf command
encounters a failure while attempting to read the status of the firmware update
in progress.
Corrective action Retry the storage download shelf command.
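Several sfu messages in this section share the same final step, which is to run the storage download shelf command again. The following Python sketch collects those entries into one lookup table for triage scripts. The table contents paraphrase the corrective actions listed in this guide; the names PREREQUISITES and recovery_steps are hypothetical.

```python
RETRY_COMMAND = "storage download shelf"

# Steps this guide lists before re-running the command, keyed by message name.
# An empty list means the guide only says to run the command again.
PREREQUISITES = {
    "sfu.downloadError": [
        "Download the latest disk shelf firmware from the NetApp Support Site",
    ],
    "sfu.downloadSummaryErrors": [],
    "sfu.FCDownloadFailed": [
        "Download the latest disk shelf firmware from the NetApp Support Site",
    ],
    "sfu.firmwareDownrev": [
        "Copy updated disk shelf firmware into the /etc/shelf_fw directory",
    ],
    "sfu.partnerNotResponding": [
        "Verify that the HA pair interconnect is up and running on both nodes",
    ],
    "sfu.partnerRefusedUpdate": [
        "Verify that both partners are running the same version of Data ONTAP",
        "Verify that the interconnect is up and running on all nodes",
    ],
    "sfu.SASDownloadFailed": [
        "Download the latest disk shelf firmware from the NetApp Support Site",
    ],
    "sfu.statusCheckFailure": [],
}

def recovery_steps(message_name):
    """Return the ordered steps for a message, or None if it is not in the table."""
    if message_name not in PREREQUISITES:
        return None
    return PREREQUISITES[message_name] + [f"Run the {RETRY_COMMAND} command"]
```

Messages that are absent from the table, such as the INFO notifications, return None so that a caller can fall back to the full entry in this guide.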
sfu.suspendDiskIO
Message sfu.suspendDiskIO
Severity INFO
Description This message occurs when a disk shelf firmware update is started and disk I/O
is suspended.
Corrective action None.
sfu.suspendSES
Message Suspending enclosure services -- partner is updating disk shelf firmware.
Severity INFO
Description This message occurs when a disk shelf firmware update is requested in an HA
pair environment. In this case, one partner node updates the firmware on the
disk shelf module, and the other partner node temporarily disables SCSI
Enclosure Services (SES) while the firmware update is in progress.
usb.adapter.debug
Message usb.adapter.debug
Severity INFORMATION
Description This message indicates a Data ONTAP universal serial bus (USB) adapter
driver debug event.
Corrective action None.
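Severity labels in this guide are not spelled uniformly: the sfu messages use ERR and INFO, the usb messages use ERROR and INFORMATION, and some RLM messages use mixed case. A script that filters messages by severity might first normalize the labels, as in the following Python sketch. It assumes that ERR and ERROR, and INFO and INFORMATION, denote the same level; the names SEVERITY_ALIASES and normalize_severity are hypothetical.

```python
# Assumption: the short and long spellings refer to the same severity level.
SEVERITY_ALIASES = {"ERR": "ERROR", "INFO": "INFORMATION"}

def normalize_severity(label):
    """Upper-case a severity label and expand the short spellings."""
    key = label.strip().upper()
    return SEVERITY_ALIASES.get(key, key)
```

Labels without an alias, such as SVC_ERROR or ALERT, pass through unchanged apart from case.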
usb.adapter.exception
Message usb.adapter.exception
Severity WARNING
Description This message occurs when the Data ONTAP universal serial bus (USB) adapter
driver encounters an error with the adapter. The adapter is reset to recover.
Corrective action None.
usb.adapter.failed
Message usb.adapter.failed
Severity ERROR
Description This message occurs when the Data ONTAP universal serial bus (USB) adapter
driver cannot recover the adapter after resetting it multiple times. The adapter
and the devices attached to it will not be used anymore.
Corrective action Take the following actions:
1. If the adapter is in use, verify that all attached devices are supported devices
and that they are seated correctly.
2. If the problem persists, replace the attached devices.
3. If the problem still persists, contact technical support for help in diagnosing a
USB issue.
usb.adapter.reset
Message usb.adapter.reset
Severity INFORMATION
Description This message occurs when the Data ONTAP universal serial bus (USB) driver
resets the specified adapter. This can occur during normal error handling.
Corrective action If the problem persists, then contact technical support.
usb.device.failed
Message usb.device.failed
Severity ERROR
Description This message occurs when multiple consecutive commands to the specified
universal serial bus (USB) device are not completed within the allotted time. All
recovery actions have been taken and the device cannot be used anymore.
usb.device.initialize.failed
Message usb.device.initialize.failed
Severity ERROR
Description This message occurs when the Data ONTAP universal serial bus (USB) adapter
driver fails to initialize the device attached to the associated port in the associated
adapter for one of the following reasons: Cannot set a unique address for the
device; device descriptor is invalid or contains incorrect data; cannot set an active
configuration for the device; or the device had multiple interfaces. Note that the
Data ONTAP USB driver only supports USB 2.0 bulk-only mass storage devices.
Corrective action Take one of the following actions:
1. If the device is connected to an external USB port, try reinserting the device.
2. If that fails, try replacing the device with a device from a different product
family.
3. If the device is connected to the motherboard and the problem persists, contact
technical support for help in diagnosing a USB issue.
usb.device.maximum.connected
Message usb.device.maximum.connected
Severity WARNING
Description This message occurs when the Data ONTAP universal serial bus (USB) adapter
driver detects a new USB device inserted into the associated port in the associated
adapter. This new device cannot be initialized because the maximum number of
USB devices supported by the Data ONTAP USB adapter driver is already
connected to the system.
Corrective action Take the following actions:
1. Remove a USB device that is already connected but is not being used.
2. Wait for 10 seconds, then reinsert the new device.
usb.device.protocol.mismatch
Message usb.device.protocol.mismatch
Severity ERROR
Description This message occurs when the Data ONTAP universal serial bus (USB) adapter
driver detects a protocol mismatch in the device attached to the associated port in
the associated adapter. It can be due to one of the following reasons:
Unsupported interface.
Unsupported device class or device subclass.
Does not support the required pipes.
Does not support required end points.
Does not support the required maximum transfer packet size.
Note that the Data ONTAP USB driver only supports USB 2.0 bulk-only mass
storage devices.
Corrective action Take one of the following actions:
If the device is connected to an external USB port, try replacing the device with
a device from a different product family.
If the device is connected to the motherboard, contact technical support for help
in diagnosing a USB issue.
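Because the driver supports only USB 2.0 bulk-only mass storage devices, a device can be screened by its interface descriptors before it is inserted. The following Python sketch uses the standard USB class code for mass storage (0x08) and the protocol code for Bulk-Only Transport (0x50), and rejects devices with more than one interface. The Interface type and the looks_supported function are hypothetical illustrations, not the driver's actual logic.

```python
from dataclasses import dataclass
from typing import List

MASS_STORAGE_CLASS = 0x08  # standard USB class code for mass storage
BULK_ONLY_PROTOCOL = 0x50  # standard protocol code for Bulk-Only Transport

@dataclass
class Interface:
    """Hypothetical holder for two fields of a USB interface descriptor."""
    interface_class: int
    interface_protocol: int

def looks_supported(interfaces: List[Interface]) -> bool:
    """True for a single-interface, bulk-only mass storage device."""
    if len(interfaces) != 1:
        return False
    iface = interfaces[0]
    return (iface.interface_class == MASS_STORAGE_CLASS
            and iface.interface_protocol == BULK_ONLY_PROTOCOL)
```

A device that passes this check can still be rejected for other reasons listed above, such as an unsupported maximum transfer packet size.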
usb.device.removed
Message usb.device.removed
Severity INFORMATION
Description This message occurs when the Data ONTAP universal serial bus (USB) adapter
driver successfully detects and handles the removal of the associated device,
and the device is no longer accessible.
Corrective action None.
usb.device.timeout
Message usb.device.timeout
Severity ERROR
Description This message occurs when an outstanding command to the specified universal
serial bus (USB) device is not completed within the allotted time. As part of the
standard error handling sequence managed by the Data ONTAP USB adapter
driver, this command to the device is aborted and reissued.
Corrective action Device-level timeouts are a common indication of a USB link stability problem. In
some cases, the link is operating normally and the specified device is having
internal trouble processing I/O requests in a timely manner. In such cases, evaluate
the specified device for possible replacement. Quite often the problem results from
the partial failure of a component involved in the USB transport. The most
common thing to check is the seating of the USB device into the USB port or the
header.
Take one of the following actions:
If the device is connected to an external USB port, try replacing the device with
a device from a different product family.
If the device is connected to the motherboard, contact technical support for help
in diagnosing the USB issue.
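The relationship between usb.device.timeout (a single command timed out and is reissued) and usb.device.failed (multiple consecutive commands timed out) can be pictured as a counter of consecutive timeouts. The following Python sketch is an illustration only. The class name TimeoutTracker and the threshold max_consecutive are hypothetical, because this guide says only that "multiple" consecutive timeouts lead to failure.

```python
class TimeoutTracker:
    """Counts consecutive command timeouts for one USB device."""

    def __init__(self, max_consecutive=3):
        # Hypothetical threshold; the actual driver value is not documented here.
        self.max_consecutive = max_consecutive
        self.consecutive = 0

    def record(self, completed_in_time):
        """Return the message name to expect for this command, or None."""
        if completed_in_time:
            self.consecutive = 0  # a completed command breaks the run
            return None
        self.consecutive += 1
        if self.consecutive >= self.max_consecutive:
            return "usb.device.failed"
        return "usb.device.timeout"
```

A command that completes in time resets the count, which is why isolated timeouts do not by themselves mark the device as failed in this model.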
usb.device.unsupported
Message usb.device.unsupported
Severity ERROR
Description This message occurs when the Data ONTAP universal serial bus (USB) adapter
driver detects an unsupported device attached to the default boot device port on
the motherboard.
Corrective action Contact technical support for a replacement USB boot device.
usb.device.unsupported.speed
Message usb.device.unsupported.speed
Severity ERROR
Description This message occurs when the Data ONTAP universal serial bus (USB) adapter
driver detects a non high-speed device in the associated port.
Corrective action Remove all non high-speed devices attached to the system because the Data
ONTAP USB adapter driver does not support non high-speed devices.
usb.external.device.not.used
Message usb.external.device.not.used
Severity WARNING
Description This message occurs when the Data ONTAP universal serial bus (USB) adapter
driver detects a USB device connected to the external port.
Corrective action Remove the external USB device connected to the system.
usb.externalHub.notSupported
Message usb.externalHub.notSupported
Severity WARNING
Description This message occurs when the Data ONTAP universal serial bus (USB) adapter
driver detects a USB hub device.
Corrective action Remove all hub devices attached to the system because the USB adapter driver
does not support USB hub devices.
usb.port.error
Message usb.port.error
Severity ERROR
Description This message occurs when the Data ONTAP universal serial bus (USB) adapter
driver detects an unrecoverable error on the associated port.
Corrective action Take the following actions:
1. If a device is attached to the associated port, try reinserting the device.
2. If the problem persists, try replacing the device.
usb.port.reset
Message usb.port.reset
Severity INFORMATION
Description This message occurs when the Data ONTAP universal serial bus (USB) adapter
driver resets the specified port on the associated adapter. This can occur during
normal error handling.
Corrective action If the problem persists, contact technical support.
usb.port.state.indeterminate
Message usb.port.state.indeterminate
Severity WARNING
Description This message occurs when the Data ONTAP universal serial bus (USB) adapter
driver cannot determine the status of the associated port.
Corrective action Take the following actions:
1. If a device is attached to the associated port, try reinserting the device.
2. If the problem persists, try replacing the device.
3. If the problem still persists, contact technical support for assistance in
diagnosing a USB issue.
usb.port.status.inconsistent
Message usb.port.status.inconsistent
Severity ERROR
Description This message occurs when the Data ONTAP universal serial bus (USB) adapter
driver detects an inconsistent state of the associated port and cannot
communicate with the attached device.
Corrective action If a device is attached to the associated port, try reinserting the device. If that
fails, try replacing the device. If the problem persists, contact technical support
for assistance in diagnosing a USB issue.
usbmon.boot.device.failed
Message usbmon.boot.device.failed
Severity ERROR
Description This message occurs when the Data ONTAP module that is responsible for
monitoring the health of the universal serial bus (USB) boot devices determines
that the associated boot device will fail all writes to the media.
Corrective action Take the following actions:
1. Replace the device.
2. If the problem persists, contact technical support for help in diagnosing the
USB issue.
usbmon.boot.device.pfa
Message usbmon.boot.device.pfa
Severity WARNING
Description This message occurs when the Data ONTAP universal serial bus (USB) boot
device health monitor PFA (predictive failure analysis) determines that failure
is forthcoming for the associated boot device.
Corrective action Take the following actions:
1. Replace the device.
2. If the problem persists, contact technical support for help in diagnosing the
USB issue.
usbmon.disable.module
Message usbmon.disable.module
Severity INFORMATION
Description This message occurs when the Data ONTAP module that is responsible for
monitoring the health of the universal serial bus (USB) boot devices is disabled.
Corrective action 1. Halt the system by entering the following command at the system prompt:
halt
2. After the system boots to the LOADER prompt, enter the following command:
setenv disable-usbmon? false
usbmon.unable.to.monitor
Message usbmon.unable.to.monitor
Severity WARNING
Description This message occurs when the Data ONTAP module that is responsible for
monitoring the health of the universal serial bus (USB) boot devices cannot
extract health information from the monitored device.
Disk n is broken
Message Disk n is broken
Description n is the RAID group disk number. The solution depends on whether you have a
hot spare in the system.
Fatal? No.
Corrective action See the appropriate system administration guide for information about how to
locate a disk based on the RAID group disk number and how to replace a faulty
disk.
Dumping core
Message Dumping core
Description The system is dumping core after a system crash.
Fatal? Yes.
Corrective action Write down the system crash message on the system console and report the
problem to technical support.
FC-AL LINK_FAILURE
Message FC-AL LINK_FAILURE
Description Fibre Channel arbitrated loop has link failures.
Fatal? No.
Corrective action Report the problem to technical support.
Description Fibre Channel arbitrated loop has been determined to be unreliable. The link
errors are recoverable in the sense that the system is still up and running.
Fatal? No.
Corrective action Report the problem to technical support.
Panicking
Message Panicking
Description The system is crashing. If the system does not hang while crashing, the message
Dumping core appears.
Fatal? Yes.
Corrective action Report the problem to technical support.
Fatal? Yes.
Corrective action Harness script filters them and creates a case.
Contact technical support.
Cause: When you select an invalid type.
Example: When you try to change the type but select an invalid type:
node> ucadmin modify -t asdf 3a
Error message:
ucadmin modify: Invalid argument -- asdf
Usage: ucadmin modify [-m <mode>] [-t <type>] [-f] <adapter>
Modifies Fibre Channel and converged network adapter configuration
adapter -- adapter name
-m mode -- fc | cna
-t type -- initiator | target
-f -- force change without confirmation

Issue: Port or adapter is not offline.
Cause: When you attempt to make changes while the port or adapter is online.
Example: When you started in FC target mode and attempt to change to another mode or type:
cluster::> system node hardware unified-connect modify -node node1 -adapter 5a -mode cna -type target
Error message:
Error: command failed: Adapter 5a must be offline before changing configuration; use the
"fcp adapter modify -node node1 -adapter 5a -state down" command to offline the adapter and try again

Cause: When you select an invalid type.
Example: When you try to change the type but select an invalid type:
cluster::> system node hardware unified-connect modify -node node1 -adapter 5a -mode cna -type asdf
Error message:
Error: "asdf" is an invalid value for field "-type <initiator|target>"
cluster::> system node hardware unified-connect modify ?
-node <nodename> Node
[-adapter] <text> Adapter
[-mode {fc|uta}]
[-type {initiator|target}] Configured FC4 type
[[-force|-f] [true]] Force Configuration Changes
Messages are sent to recipients that you designate when you configure AutoSupport in Data ONTAP.
Note: The SP must be properly configured to send AutoSupport messages. For information about
configuring the SP, see the System Administration Guide and the Software Setup Guide for the
version of Data ONTAP that your system is running.
HEARTBEAT_LOSS
Message HEARTBEAT_LOSS
Description This message is sent by the Service Processor (SP) when it detects loss of
heartbeat from Data ONTAP, possibly because the system has stopped serving
data.
Corrective action If this was a manually triggered or expected reboot, no action is needed.
Otherwise, complete the following steps:
REBOOT (abnormal)
Message REBOOT (abnormal)
Description This message is sent by the Service Processor (SP) when it detects an abnormal
reboot of the system.
Corrective action If this was a manually triggered or expected reboot, no action is needed.
Otherwise, complete the following steps:
1. Check the status of the system and determine the cause of reboot.
2. If the system fails to boot, contact technical support.
Description This message is sent by the Service Processor (SP) when the system firmware
has a Power On Self Test (POST) failure and cannot load and run Data
ONTAP.
Corrective action 1. Run diagnostics on your system.
Description This message is sent by the Service Processor (SP) when the sp test
autosupport command is run from the Data ONTAP CLI. This is a test
mechanism to verify the SP configuration.
Corrective action None.
Description This message is sent by the Service Processor (SP) when a user issues a system
core dump (NMI) SP command.
Corrective action None.
Description This message is sent by the Service Processor (SP) when a user power-cycles
the system using SP.
Corrective action None.
Description This message is sent by the Service Processor (SP) when a user powers off the
system using the SP.
Corrective action None.
Description This message is sent by the Service Processor (SP) when a user resets the
system using the SP.
Corrective action None.
sp.firmware.upgrade.reqd
Message sp.firmware.upgrade.reqd
Severity WARNING
Description This message occurs when the Service Processor (SP) firmware version and the
Data ONTAP software version are incompatible and cannot communicate
correctly about a particular capability.
Corrective action Update the firmware version of the SP to the version recommended for your
version of Data ONTAP. The firmware and update instructions are available on
the NetApp Support Site. After you update the firmware, this message should no
longer occur. If the message occurs again, contact technical support and explain
that you already updated the firmware to the recommended version.
sp.firmware.version.unsupported
Message sp.firmware.version.unsupported
Severity WARNING
Description This message occurs when the firmware on the Service Processor (SP) is an
unsupported version and must be upgraded.
Corrective action The firmware and instructions are available on the NetApp Support Site at
mysupport.netapp.com. After the SP is running the new firmware, this message
should no longer occur. If the message occurs again, contact technical support and
explain that you already updated the firmware to the recommended version.
sp.heartbeat.resumed
Message sp.heartbeat.resumed
Severity INFO
Description This message occurs when the system detects resumption of Service Processor
(SP) heartbeat notifications indicating that the SP is now available. The earlier
issue indicated by the sp.heartbeat.stopped event has been resolved.
Corrective action None.
sp.heartbeat.stopped
Message sp.heartbeat.stopped
Severity WARNING
Description This message occurs when Data ONTAP does not receive expected Service
Processor (SP) heartbeat notifications. The SP and Data ONTAP exchange
heartbeat messages so that they can detect when one or the other is unavailable.
This event is generated when Data ONTAP has not received an expected
heartbeat message from the SP.
Corrective action 1. Connect to the SP CLI and enter the following commands:
sp version
sp log debug
sp log messages
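The sp.heartbeat.stopped and sp.heartbeat.resumed messages form a pair: the first is raised when an expected heartbeat is overdue, and the second when heartbeats are seen again. The following Python sketch models that pairing with explicit timestamps. The class name HeartbeatWatch and the timeout_s parameter are hypothetical; this guide does not state the actual heartbeat interval.

```python
class HeartbeatWatch:
    """Tracks whether heartbeats from a peer are arriving on time."""

    def __init__(self, timeout_s, start=0.0):
        self.timeout_s = timeout_s  # hypothetical allowance between heartbeats
        self.last_seen = start
        self.stopped = False

    def heartbeat(self, now):
        """Record a heartbeat; report a resume if one was overdue before."""
        self.last_seen = now
        if self.stopped:
            self.stopped = False
            return "sp.heartbeat.resumed"
        return None

    def check(self, now):
        """Report a stop once, the first time the heartbeat is overdue."""
        if not self.stopped and now - self.last_seen > self.timeout_s:
            self.stopped = True
            return "sp.heartbeat.stopped"
        return None
```

Passing the time in as an argument keeps the sketch deterministic; a monitoring script would typically supply a monotonic clock reading.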
sp.network.link.down
Message sp.network.link.down
Severity WARNING
Description This message occurs when the Service Processor (SP) detects a link error on the
SP network port. This can happen if a network cable is not plugged into the SP
network port. It can also happen if the network that the SP is connected to cannot
run at 10/100 Mbps.
Corrective action 1. Check whether the network cable is correctly plugged into the SP network
port.
2. Check the link status LED on the SP.
3. Verify that the network that the SP is connected to supports autonegotiation to
10/100 Mbps or is running at one of those speeds; otherwise, SP network
connectivity does not work.
The SP supports a 10/100 Mbps Ethernet network in autonegotiation mode.
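The speed requirement in step 3 amounts to checking that the network offers at least one speed the SP port can use. The following Python sketch expresses that check. The function name sp_link_possible and its input, a collection of speeds in Mbps that the connected switch port offers, are illustrative assumptions.

```python
SP_SPEEDS_MBPS = frozenset({10, 100})  # speeds this guide lists for the SP port

def sp_link_possible(network_speeds_mbps):
    """True if the network offers at least one speed the SP port supports."""
    return bool(SP_SPEEDS_MBPS & set(network_speeds_mbps))
```

For example, a switch port fixed at 1000 Mbps fails the check, while a port that autonegotiates among 10, 100, and 1000 Mbps passes it.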
sp.notConfigured
Message sp.notConfigured
Severity WARNING
Description This message occurs weekly to remind you to configure the Service Processor
(SP). The SP is a physical device that is incorporated into your system to provide
remote access and remote management capabilities. To use the full functionality
of SP, you must configure it first.
Corrective action Ensure that AutoSupport mailhosts and recipients are properly configured in Data
ONTAP, and then take the following actions:
1. Configure the SP by entering the following command:
sp setup
If necessary, use the sp status command to obtain the SP's MAC address.
2. Verify the SP network configuration by entering the following command:
sp status
3. Verify that the SP can send AutoSupport messages by entering the following
command:
sp test autosupport
sp.orftp.failed
Message sp.orftp.failed
Severity WARNING
Description This message occurs when there is a communication error while sending
information to or receiving information from the Service Processor (SP). This
error could be due to the following reasons:
Communication error while the information is being sent or received.
SP is nonoperational.
3. If this message persists after you reboot the SP, contact technical support.
sp.snmp.traps.off
Message sp.snmp.traps.off
Severity INFO
Description This message occurs each time a system boots, if the advanced privilege level in
Data ONTAP was used to disable the SNMP Trap feature of the Service
Processor (SP).
This message also occurs when the SNMP Trap capability is disabled and a user
invokes a Data ONTAP command to use the SP to send an SNMP trap.
Corrective action SP SNMP Trap support is currently disabled. To enable this feature, set the
sp.snmp.traps option to On.
sp.userlist.update.failed
Message sp.userlist.update.failed
Severity WARNING
Description This message occurs when there is an error updating user information for the
Service Processor (SP). When user information is updated on Data ONTAP, the SP
is also updated with the new changes. This enables users to log in to the SP.
User information update for the Service Processor (SP) may have failed due to the
following reasons:
Communication error with the SP.
SP might not be operational.
Corrective action 1. Check whether the SP is operational by entering the following command at the
Data ONTAP prompt:
sp status
2. If the SP is operational and this message persists, reboot the SP by entering the
following command at the Data ONTAP prompt:
sp reboot
spmgmt.driver.hourly.stats
Message spmgmt.driver.hourly.stats
Severity WARNING
Description This message occurs when the system encounters an error while trying to get
hourly statistics from the Service Processor (SP). The error could be due to the
following reasons:
Communication error with the SP.
SP is not operational.
Corrective action 1. Check whether the SP is online by entering the following command at the Data
ONTAP prompt:
sp status
2. If the SP is online and this message persists, reboot the SP by entering the
following command at the Data ONTAP prompt:
sp reboot
3. If this message persists after you reboot the SP, contact technical support.
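The three steps above form a short escalation ladder that several SP messages share. The following Python sketch picks the next step while the message is still being reported. The function name next_step and its two inputs are hypothetical, and the case of an SP that is offline before any reboot is left undecided because the steps above do not cover it.

```python
def next_step(sp_online, rebooted):
    """Pick the next corrective step while the message keeps occurring."""
    if rebooted:
        # Step 3: the message persists after an SP reboot.
        return "contact technical support"
    if sp_online:
        # Step 2: the SP is online but the message persists.
        return "sp reboot"
    return None  # SP offline before any reboot: not covered by the steps above
```

The sp_online input corresponds to what the sp status command reports in step 1.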
spmgmt.driver.mailhost
Message spmgmt.driver.mailhost
Severity WARNING
Description This message occurs when the Service Processor (SP) setup attempts to verify
whether a mailhost specified in Data ONTAP can be reached. In this case, SP
setup cannot connect to the specified mailhost.
Corrective action 1. Verify that a valid mailhost is configured in Data ONTAP by checking the
system AutoSupport configuration.
2. Ensure that Data ONTAP can successfully connect to the specified mailhost
by invoking a test command to invoke AutoSupport.
spmgmt.driver.network.failure
Message spmgmt.driver.network.failure
Severity WARNING
Description This message occurs when the system encounters a failure during network
configuration of the Service Processor (SP). The system cannot assign the SP a
DHCP (Dynamic Host Configuration Protocol) or fixed IP address.
Corrective action 1. Check whether the network cable is correctly plugged into the SP network port.
2. Check the link status LED on the SP.
3. Verify that the network that the SP is connected to supports autonegotiation to
10/100 speed or is running at one of those speeds; otherwise, SP network
connectivity does not work.
The SP supports a 10/100 Ethernet network in autonegotiation mode.
spmgmt.driver.timeout
Message spmgmt.driver.timeout
Severity WARNING
Description This message occurs when there is a failure during communication with the
Service Processor (SP) firmware. The failure could be due to the following
reasons:
Communication error with the SP.
SP is not operational.
Corrective action 1. Check whether the SP is online by entering the following command at the Data
ONTAP prompt:
sp status
2. If the SP is operational and this message persists, reboot the SP by entering the
following command at the Data ONTAP prompt:
sp reboot
After the reboot, this message should no longer occur. If the message occurs
again, contact technical support and explain that you already performed the
preceding steps.
RLM messages
The RLM provides remote management capabilities for some storage systems and continuously
monitors system health. Two types of messages are associated with the RLM and can help you
monitor your system and troubleshoot problems.
The following systems contain RLMs:
30xx and SA300 systems
31xx systems
60xx and SA600 systems
The RLM sends AutoSupport messages when certain problems occur with the system. These might
include a reboot failure or a user-triggered power cycle.
Data ONTAP generates EMS messages when RLM events and errors occur. These might include a
firmware update failure or a communication error.
Note: For more information about what the RLM does, see the System Administration Guide for
the version of Data ONTAP that your system is running.
Messages are sent to recipients that you designate when you configure AutoSupport in Data ONTAP.
Note: The RLM must be properly configured to send AutoSupport messages. For information
about configuring the RLM, see the System Administration Guide and the Software Setup Guide
for the version of Data ONTAP that your system is running.
Reboot warning
Message Reboot warning
Description The Remote LAN Module (RLM) detects an abnormal system reboot.
Corrective action If this was a manually triggered or expected reboot, no action is necessary.
Otherwise, complete the following steps.
1. Check the status of the system and determine the cause of the reboot.
2. Contact technical support if the system fails to reboot.
Corrective action 1. Connect to the RLM command-line interface (CLI) to check whether the
RLM is operational.
2. Contact technical support if the problem persists.
2. Contact technical support if running diagnostics does not detect any faulty
components.
rlm.driver.hourly.stats
Message rlm.driver.hourly.stats
Severity Warning
Description The system encountered an error while trying to get hourly statistics from the
Remote LAN Module (RLM).
Corrective action 1. Check whether the RLM is online by entering the following command at the
Data ONTAP prompt:
rlm status
2. If the RLM is operational and the problem persists, enter the following
command to reboot the RLM:
rlm reboot
rlm.driver.mailhost
Message rlm.driver.mailhost
Severity Warning
Description This message occurs when Remote LAN Module (RLM) setup verifies whether
a mailhost specified in Data ONTAP can be reached. In this case, RLM setup cannot
connect to the specified mailhost.
Corrective action 1. Verify that a valid mailhost is configured in Data ONTAP by checking the
system AutoSupport configuration.
2. Ensure that Data ONTAP can successfully connect to the specified mailhost by
entering a test AutoSupport command.
rlm.driver.network.failure
Message rlm.driver.network.failure
Severity Warning
Description A failure occurred during the network configuration of the Remote LAN Module
(RLM). The system could not assign the RLM a Dynamic Host Configuration
Protocol (DHCP) or fixed IP address.
Corrective action 1. Check whether the RLM is online by entering the following command at the
Data ONTAP prompt:
rlm status
2. If the RLM is operational and the problem persists, enter the following
command to reboot the RLM:
rlm reboot
rlm.driver.timeout
Message rlm.driver.timeout
Severity Warning
Description A failure occurred during communication with the Remote LAN Module
(RLM).
Corrective action 1. Check whether the RLM is online by entering the following command at the
Data ONTAP prompt:
rlm status
2. If the RLM is operational and the problem persists, enter the following
command to reboot the RLM:
rlm reboot
rlm.firmware.update.failed
Message rlm.firmware.update.failed
Severity SVC_ERROR
Description An error occurred during an update to the Remote LAN Module (RLM) firmware.
The firmware might have failed due to the following reasons:
An incorrect RLM firmware image or a corrupted image file
A communication error while sending new firmware to the RLM
An update failure while applying new firmware at the RLM
A system reset or loss of power during an update
Corrective action 1. Download the firmware image by entering the commands appropriate to your
system:
2. Make sure that the RLM is still operational by entering the command
appropriate to your system:
rlm.firmware.upgrade.reqd
Message rlm.firmware.upgrade.reqd
Severity WARNING
Description The Remote LAN Module (RLM) firmware version and the version of Data
ONTAP are incompatible and cannot communicate correctly about a particular
capability.
Corrective action Update the firmware version of the RLM to the version recommended for your
version of Data ONTAP.
For more information, see the section on upgrading RLM firmware in the
System Administration Guide.
rlm.firmware.version.unsupported
Message rlm.firmware.version.unsupported
Severity WARNING
Description The firmware on the Remote LAN Module (RLM) is an unsupported version
and must be upgraded.
Corrective action Update the firmware version of the RLM to the version recommended for your
version of Data ONTAP.
For more information, see the section on upgrading RLM firmware in the
System Administration Guide.
rlm.heartbeat.bootFromBackup
Message rlm.heartbeat.bootFromBackup
Severity WARNING
Description The system rebooted the Remote LAN Module (RLM) from its backup firmware
to restore RLM availability. The RLM is considered unavailable when the system
stops receiving heartbeat notifications from the RLM. To restore availability, the
system tries to reboot the RLM from the RLM's primary firmware. If that fails, the
system tries to reboot the RLM from the RLM's backup firmware. This message is
generated if the reboot from backup firmware restores availability.
Corrective action Update the firmware version of the RLM to the version recommended for your
version of Data ONTAP.
For more information, see the section on upgrading RLM firmware in the System
Administration Guide.
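The recovery order described for rlm.heartbeat.bootFromBackup (primary firmware first, then backup firmware) can be sketched as a simple fallback loop. In the following Python sketch, the reboot argument is a hypothetical callable that attempts a reboot from the named image and returns whether RLM availability was restored; the function names are illustrative and do not correspond to Data ONTAP interfaces.

```python
def restore_rlm(reboot):
    """Try the primary image, then the backup; return the one that worked."""
    for image in ("primary", "backup"):
        if reboot(image):
            return image
    return None  # neither image restored availability

def event_for(image):
    """Message generated when the backup firmware restored availability."""
    return "rlm.heartbeat.bootFromBackup" if image == "backup" else None
```

For example, if only the backup image boots, restore_rlm returns "backup" and event_for maps that result to the message described above.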
rlm.heartbeat.resumed
Message rlm.heartbeat.resumed
Severity WARNING
Description The system detected the resumption of Remote LAN Module (RLM) heartbeat
notifications, indicating that the RLM is now available. The earlier issue
indicated by the rlm.heartbeat.stopped message was resolved.
Corrective action None needed.
rlm.heartbeat.stopped
Message rlm.heartbeat.stopped
Severity WARNING
Description The system did not receive an expected heartbeat message from the Remote LAN
Module (RLM). The RLM and the system exchange heartbeat messages, which
they use to detect when one or the other is unavailable.
Corrective action 1. Connect to the RLM CLI.
2. Collect debugging information by entering the following commands:
rlm version
rlm config
rlm.network.link.down
Message rlm.network.link.down
Severity WARNING
Description The Remote LAN Module (RLM) detected a link error on the RLM network port.
This can happen if a network cable is not plugged into the RLM network port. It
can also happen if the network that the RLM is connected to cannot run at 10/100
Mbps.
Corrective action 1. Check whether the network cable is correctly plugged into the RLM network
port.
2. Check the link status LED on the RLM.
3. Verify that the network that the RLM is connected to supports autonegotiation
to 10/100 Mbps or is running at one of those speeds; otherwise, RLM network
connectivity does not work.
rlm.notConfigured
Message rlm.notConfigured
Severity WARNING
Description This message occurs weekly to remind you to configure the Remote LAN Module
(RLM). The RLM is a physical device that is incorporated into your system to
provide remote access and remote management capabilities. To use the full
functionality of RLM, you need to configure it first.
rlm.orftp.failed
Message rlm.orftp.failed
Severity WARNING
Description A communication error occurred while sending or receiving information from
the Remote LAN Module (RLM).
Corrective action 1. Check whether the RLM is operational by entering the following command
at the Data ONTAP prompt:
rlm status
2. If the RLM is operational and this error persists, enter the following
command to reboot the RLM:
rlm reboot
3. If this message persists after you reboot the RLM, contact technical support.
rlm.snmp.traps.off
Message rlm.snmp.traps.off
Severity INFO
Description The advanced privilege level in Data ONTAP was used to disable the SNMP
trap feature of the Remote LAN Module (RLM). This message occurs at boot.
This message also occurs when the SNMP trap capability was disabled and a
user invokes a Data ONTAP command to use the RLM to send an SNMP trap.
Corrective action To enable RLM SNMP trap support, set the rlm.snmp.traps option to On.
rlm.systemDown.alert
Message rlm.systemDown.alert
Severity ALERT
Description System remote management detected a system down event.
This is only an SNMP trap that is sent out by the Remote LAN Module (RLM)
firmware. The trap includes a string describing the specific event that triggered
the trap. The string is structured in the following form with key=value pairs:
Corrective action 1. Check the system to verify that it has power and is operational.
2. If your system is operational, run diagnostics on your entire system.
3. Contact technical support if the system is not serving data.
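The description of this trap mentions a string of key=value pairs, but the exact layout is not reproduced here. As a minimal sketch, assuming the pairs are separated by whitespace and the values contain no spaces, a monitoring script could split such a string as follows. The function name, the sample string, and its keys (`event`, `node`) are hypothetical and are not taken from the RLM firmware.

```python
def parse_trap_string(text):
    """Split a string of key=value pairs into a dict.

    Assumption: pairs are separated by whitespace and values contain no
    spaces. The real RLM trap layout may differ.
    """
    pairs = {}
    for token in text.split():
        key, sep, value = token.partition("=")
        if sep and key:  # skip tokens that are not key=value
            pairs[key] = value
    return pairs


# Hypothetical sample; the keys are illustrative only.
sample = "event=system_down node=filer1"
print(parse_trap_string(sample))
```

Tokens without an equals sign are ignored rather than treated as errors, which keeps the sketch tolerant of free text mixed into the string.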
rlm.systemDown.notice
Message rlm.systemDown.notice
Severity NOTICE
Description System remote management detected a system down event.
This is only an SNMP trap that is sent out by the Remote LAN Module (RLM)
firmware. The trap includes a string describing the specific event that triggered the
trap. The string is structured in the following form with key=value pairs:
Corrective action 1. Check the system to verify that it has power and is operational.
2. If your system is operational, run diagnostics on your entire system.
3. Consult technical support if the system is not serving data.
rlm.systemDown.warning
Message rlm.systemDown.warning
Severity WARNING
Description System remote management detected a system down event.
This is only an SNMP trap that is sent out by the Remote LAN Module (RLM)
firmware. The trap includes a string describing the specific event that triggered the
trap. The string is structured in the following form with key=value pairs:
Corrective action 1. Check the system to verify that it has power and is operational.
2. If your system is operational, run diagnostics on your entire system.
3. Consult technical support if the system is not serving data.
rlm.systemPeriodic.keepAlive
Message rlm.systemPeriodic.keepAlive
Severity INFO
Description System remote management sent a periodic keep-alive event.
This is only an SNMP trap that is sent out by the Remote LAN Module (RLM)
firmware. The trap includes a string describing the specific event that triggered
the trap. The string is structured in the following form with key=value pairs:
rlm.systemTest.notice
Message rlm.systemTest.notice
Severity NOTICE
Description System remote management detected a test event.
This is only an SNMP trap that is sent out by the Remote LAN Module (RLM)
firmware. The trap includes a string describing the specific event that triggered
the trap. The string is structured in the following form with key=value pairs:
rlm.userlist.update.failed
Message rlm.userlist.update.failed
Severity WARNING
Description There was an error while updating user information for the Remote LAN Module
(RLM). When user information is updated on Data ONTAP, the RLM is also
updated with the new changes. This enables users to log in to the RLM.
Corrective action 1. Check whether the RLM is operational by entering the following command at
the Data ONTAP prompt:
rlm status
2. If the RLM is operational and this error persists, reboot the RLM by entering
the following command:
rlm reboot
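When several RLM messages arrive together, it can help to handle them in order of severity. The following sketch records the severities exactly as listed for the messages from rlm.network.link.down through rlm.userlist.update.failed. The ranking of the severity levels and the function name are assumptions, because this guide does not define an ordering.

```python
# Severities as listed for the RLM messages in this section.
RLM_SEVERITY = {
    "rlm.network.link.down": "WARNING",
    "rlm.notConfigured": "WARNING",
    "rlm.orftp.failed": "WARNING",
    "rlm.snmp.traps.off": "INFO",
    "rlm.systemDown.alert": "ALERT",
    "rlm.systemDown.notice": "NOTICE",
    "rlm.systemDown.warning": "WARNING",
    "rlm.systemPeriodic.keepAlive": "INFO",
    "rlm.systemTest.notice": "NOTICE",
    "rlm.userlist.update.failed": "WARNING",
}

# Assumed ranking, most urgent first; the guide does not define an order.
SEVERITY_RANK = ["ALERT", "WARNING", "NOTICE", "INFO"]


def sort_by_urgency(message_names):
    """Return the known RLM message names, most urgent first."""
    known = [m for m in message_names if m in RLM_SEVERITY]
    return sorted(known, key=lambda m: SEVERITY_RANK.index(RLM_SEVERITY[m]))
```

Names that do not appear in the table are dropped, so the caller can pass in a mixed list of messages without filtering it first.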
BMC messages
The BMC provides remote platform management capabilities on FAS20xx and SA200 systems.
BMC capabilities include remote access, monitoring, troubleshooting, logging, and alerting features.
The BMC sends AutoSupport messages through its independent management interface, regardless of
the state of the system.
BMC_ASUP_UNKNOWN
Message BMC_ASUP_UNKNOWN
Description Unknown Baseboard Management Controller (BMC) error.
Corrective action Report the problem to technical support.
REBOOT (abnormal)
Message REBOOT (abnormal)
Explanation An abnormal reboot occurred.
Corrective action Verify that the system has returned to operation.
SYSTEM_POWER_OFF (environment)
Message SYSTEM_POWER_OFF (environment)
Description An environmental sensor entered a critical, nonrecoverable state, and Data
ONTAP has been requested to power off the system.
Corrective action Verify the environmental conditions of the system.
bmc.asup.crit
Message bmc.asup.crit
Description This message occurs when the Baseboard Management Controller (BMC) sends
an AutoSupport message of a CRITICAL priority.
Corrective action The action you take depends on whether the operating environment for the
system, storage, or associated cabling has changed.
If the operating environment has changed, shut down and power off the
system until the environment is restored to normal operations.
If the operating environment has not changed, check for previous errors and
warnings. Also check for hardware statistics from Fibre Channel, SCSI, disk
drives, other communications mechanisms, and previous administrative
activities.
bmc.asup.error
Message bmc.asup.error
Description This message occurs when the Baseboard Management Controller (BMC) fails
to construct the necessary attachments of an AutoSupport message.
Corrective action This message indicates an internal error with the BMC's AutoSupport
processing. Contact technical support.
bmc.asup.init
Message bmc.asup.init
Description This message occurs when the Baseboard Management Controller (BMC) fails
to initialize its AutoSupport subsystem due to a lack of resources.
Corrective action This message indicates an internal error with the BMC's AutoSupport
processing. Contact technical support.
bmc.asup.queue
Message bmc.asup.queue
Description This message occurs when the Baseboard Management Controller (BMC) has
too many outstanding AutoSupport messages and no longer has enough
resources to service them.
Corrective action This message might indicate an issue with your AutoSupport configuration.
1. Ensure that your system is configured to use the correct AutoSupport SMTP
mail host, and that the mail host is properly configured to handle
AutoSupport messages originating from the BMC.
2. For additional help, contact technical support.
bmc.asup.send
Message bmc.asup.send
Description This message occurs when the Baseboard Management Controller (BMC) sends
an AutoSupport message.
Corrective action 1. Follow the corrective action recommended for the AutoSupport message
that was sent.
2. For additional help, contact technical support.
bmc.asup.smtp
Message bmc.asup.smtp
Description This message occurs when the Baseboard Management Controller (BMC) fails
to contact the mailhost when attempting to send an AutoSupport message.
Corrective action This message indicates an issue with your AutoSupport configuration.
1. Ensure that your system is configured to use the correct AutoSupport SMTP
mail host and that the mail host is properly configured to handle AutoSupport
messages originating from the BMC.
2. For additional help, contact technical support.
bmc.batt.id
Message bmc.batt.id
Description This message occurs when the Baseboard Management Controller (BMC)
cannot read the part number information stored in the battery configuration
firmware.
Corrective action Contact technical support for the current procedure to determine whether the
battery failed.
bmc.batt.invalid
Message bmc.batt.invalid
Description This message occurs when the Baseboard Management Controller (BMC)
determines that the battery installed is not the correct model for your system.
Corrective action Contact technical support to request the appropriate replacement battery for
your model of system.
bmc.batt.mfg
Message bmc.batt.mfg
Description This message occurs when the Baseboard Management Controller (BMC)
cannot read the manufacturer information stored in the battery configuration
firmware.
Corrective action Contact technical support for the current procedure to determine whether the
battery failed.
bmc.batt.rev
Message bmc.batt.rev
Description This message occurs when the Baseboard Management Controller (BMC)
cannot read the revision code stored in the battery configuration firmware.
Corrective action Contact technical support for the current procedure to determine whether the
battery failed.
bmc.batt.seal
Message bmc.batt.seal
Description This message occurs when the Baseboard Management Controller (BMC)
cannot seal the battery's configuration information after a battery upgrade.
Corrective action Contact technical support for the current procedure to determine whether the
battery failed.
bmc.batt.unknown
Message bmc.batt.unknown
Description This message occurs when the Baseboard Management Controller (BMC)
determines that the installed battery is not a recognized part that is approved for
use in your system.
Corrective action Contact technical support to request the appropriate replacement battery for
your model of system.
bmc.batt.unseal
Message bmc.batt.unseal
Description This message occurs when the Baseboard Management Controller (BMC)
cannot unseal the battery's configuration information to determine whether the
battery firmware requires an upgrade.
Corrective action Contact technical support for the current procedure to determine whether the
battery failed.
bmc.batt.upgrade
Message bmc.batt.upgrade
Description This message occurs before the Baseboard Management Controller (BMC)
upgrades the battery's configuration firmware. It reports the present and new
revisions of the battery configuration.
Corrective action None.
bmc.batt.upgrade.busy
Message bmc.batt.upgrade.busy
Description This message occurs when the Baseboard Management Controller (BMC)
determines that the battery configuration firmware requires an upgrade, but that
the BMC is too busy to perform the upgrade.
Corrective action It is normal to get this message one time after a BMC upgrade. However, if this
message is issued more than once, it indicates a problem with your system.
Contact technical support for the current procedure to determine whether your
system needs to be replaced.
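The rule above (one occurrence is normal, a repeat is not) can be sketched as a simple count. The function name and the input, a list of logged message names, are hypothetical and are not a Data ONTAP interface.

```python
def busy_needs_support(logged_messages):
    """Return True if bmc.batt.upgrade.busy was logged more than once.

    `logged_messages` is a list of message names collected by hand or by
    a log scraper (hypothetical input format).
    """
    return logged_messages.count("bmc.batt.upgrade.busy") > 1
```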
bmc.batt.upgrade.failed
Message bmc.batt.upgrade.failed
Description This message occurs when the Baseboard Management Controller (BMC) cannot
upgrade the battery configuration firmware to the latest revision.
Corrective action In most cases, this error does not impact the functionality of your system, but
replacing the battery might be advised at your next maintenance window.
Contact technical support for the current procedure to determine whether the
battery needs to be replaced.
bmc.batt.upgrade.failure
Message bmc.batt.upgrade.failure
Description This message occurs once for every configuration item in the battery
configuration firmware that the Baseboard Management Controller (BMC) could
not update during a battery upgrade.
Corrective action 1. Remove and reinsert the controller module. In most cases, this forces the
BMC to reattempt and successfully upgrade the battery.
2. If you see this message more than once, contact technical support for the
current procedure to determine whether the battery needs to be replaced.
bmc.batt.upgrade.ok
Message bmc.batt.upgrade.ok
Description This message occurs when the entire battery upgrade process is complete.
Corrective action None.
bmc.batt.upgrade.power-off
Message bmc.batt.upgrade.power-off
Description This message occurs in the rare event where the Baseboard Management
Controller (BMC) cannot turn on system power, and the battery has not been
checked to determine whether it requires a configuration upgrade.
Corrective action 1. Remove and reinsert the controller module.
2. If you continue to see this message, contact technical support for the current
procedure to determine whether the controller module needs to be replaced.
bmc.batt.upgrade.voltagelow
Message bmc.batt.upgrade.voltagelow
Description This message occurs when the battery requires a configuration firmware
update but the Baseboard Management Controller (BMC) detects that the battery
is discharged to below 6.0V.
Corrective action This message is printed every 10 minutes until the battery is recharged. If you
continue to see this message after one hour, contact technical support for the
current procedure to determine whether the battery needs to be replaced.
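The one-hour rule above can be checked against the times at which the message was logged. This is a minimal sketch, assuming the timestamps all belong to one continuous episode and have already been parsed into `datetime` objects. The function name and the input format are hypothetical.

```python
from datetime import datetime, timedelta


def voltagelow_needs_support(timestamps, limit=timedelta(hours=1)):
    """Return True if the message is still being logged after `limit`.

    `timestamps` holds one datetime per logged
    bmc.batt.upgrade.voltagelow message (hypothetical input format).
    Assumption: all timestamps belong to a single discharge episode.
    """
    times = sorted(timestamps)
    if len(times) < 2:
        return False
    return times[-1] - times[0] >= limit
```

Because the message repeats every 10 minutes, seven consecutive occurrences span a full hour and satisfy the check.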
bmc.batt.voltage
Message bmc.batt.voltage
Description This message occurs in the rare event where the Baseboard Management
Controller (BMC) determines that the battery configuration firmware requires
an update and the battery is successfully prepared for the update, but the BMC
cannot read the battery voltage sensor.
Corrective action Contact technical support for the current procedure to determine whether the
battery needs to be replaced.
bmc.config.asup.off
Message bmc.config.asup.off
Description This message occurs in the rare event that the Baseboard Management
Controller (BMC) detects corruption in the BMC's internal cached copy of the
AutoSupport mail host and/or configured destinations. AutoSupport messages
from the BMC are disabled until the system boots.
Corrective action Boot the system to ensure that the BMC's cache of the AutoSupport
configuration is correct.
bmc.config.corrupted
Message bmc.config.corrupted
Description This message occurs in the rare event that the Baseboard Management Controller
(BMC) internal configuration is corrupted and is being reset to defaults. Notably,
the SSH service on the BMC LAN interface is disabled until the system boots.
Corrective action 1. Boot the system. Upon boot, the Secure Shell (SSH) host keys for the BMC
are regenerated. The previous host keys for the BMC are no longer valid and
cannot be used for logins.
2. Contact technical support to determine whether your system needs
maintenance.
bmc.config.default
Message bmc.config.default
Description This message occurs in the rare event that the Baseboard Management Controller
(BMC) internal configuration is corrupted and is being reset to defaults. Notably,
the Secure Shell (SSH) service on the BMC LAN interface is disabled until the
system boots.
Corrective action 1. Boot the system. Upon boot, the SSH host keys for the BMC are regenerated.
The previous host keys for the BMC are no longer valid and cannot be used
for logins.
2. Contact technical support to determine whether your system needs
maintenance.
bmc.config.default.pef.filter
Message bmc.config.default.pef.filter
Description This message occurs in the rare event that the Baseboard Management Controller
(BMC) internal configuration is corrupted and is being reset to defaults. Notably,
the BMC's Platform Event Filter (PEF) tables are being cleared to factory defaults.
Corrective action Most users need to take no action. However, if you want to use custom Intelligent
Platform Management Interface (IPMI) PEF tables, you need to reenable the
BMC IPMI LAN interface, and reload any custom PEF tables that might be
defined for your site.
bmc.config.default.pef.policy
Message bmc.config.default.pef.policy
Description This message occurs in the rare event that the Baseboard Management Controller
(BMC) internal configuration is corrupted and is being reset to defaults. Notably,
the BMC's Platform Event Filter (PEF) tables are being cleared to factory defaults.
Corrective action Most users need to take no action. However, if you want to use custom IPMI PEF
tables, you need to reenable the BMC Intelligent Platform Management Interface
(IPMI) LAN interface, and reload any custom PEF tables that might be defined for
your site.
bmc.config.fru.systemserial
Message bmc.config.fru.systemserial
Description This message occurs when the Baseboard Management Controller (BMC)
detects an invalid System Serial Number field in the system's field-replaceable
unit (FRU) configuration area.
Corrective action Contact technical support to determine the maintenance procedure for your
system.
bmc.config.mac.error
Message bmc.config.mac.error
Description This message occurs when the Baseboard Management Controller (BMC)
Ethernet Media Access Control (MAC) identifier is invalid.
Corrective action Contact technical support to determine the corrective procedure for your
system.
bmc.config.net.error
Message bmc.config.net.error
Description This message occurs when the Baseboard Management Controller (BMC)
cannot start networking support on the BMC LAN interface.
Corrective action Contact technical support to determine the corrective procedure for your
system.
bmc.config.upgrade
Message bmc.config.upgrade
Description This message occurs when the Baseboard Management Controller (BMC)
internal configuration defaults are updated.
Corrective action None.
bmc.power.on.auto
Message bmc.power.on.auto
Description This message occurs when, upon power up, the Baseboard Management
Controller (BMC) detects that the system was previously soft powered-off.
Corrective action None.
bmc.reset.ext
Message bmc.reset.ext
Description This message occurs when the Baseboard Management Controller (BMC)
detects that a bmc reboot command was issued on the system previously.
Corrective action None.
bmc.reset.int
Message bmc.reset.int
Description This message occurs when the Baseboard Management Controller (BMC) was
reset through the BMC command sequence ngs smash; set reboot=1;
priv set diag.
bmc.reset.power
Message bmc.reset.power
Description This message occurs when the Baseboard Management Controller (BMC)
detects a system power up, or after the BMC is upgraded.
bmc.reset.repair
Message bmc.reset.repair
Description This message occurs when the Baseboard Management Controller (BMC)
detects and corrects an internal BMC error.
Corrective action If you receive this message frequently, contact technical support to determine
the corrective procedure for your system.
bmc.reset.unknown
Message bmc.reset.unknown
Description This message occurs when the Baseboard Management Controller (BMC)
cannot determine why it was reset.
Corrective action This message usually indicates a BMC internal error. Contact technical support
to determine the corrective procedure for your system.
bmc.sensor.batt.charger.off
Message bmc.sensor.batt.charger.off
Description This message occurs when the Baseboard Management Controller (BMC)
detects that the battery charger cannot be disabled for the hourly battery load
test.
Corrective action Contact technical support to determine the corrective procedure for your
system.
bmc.sensor.batt.charger.on
Message bmc.sensor.batt.charger.on
Description This message occurs when the Baseboard Management Controller (BMC)
cannot reenable the battery charger after the hourly battery load test.
Corrective action Contact technical support to determine the corrective procedure for your
system.
bmc.sensor.batt.time.run.invalid
Message bmc.sensor.batt.time.run.invalid
Description This message occurs when the Baseboard Management Controller (BMC)
detects that the battery's calculated run time differs substantially from the
battery's run-time sensor.
Corrective action None.
bmc.ssh.key.missing
Message bmc.ssh.key.missing
Description This message occurs when the Baseboard Management Controller (BMC)
detects that the Secure Shell (SSH) host keys for the BMC are corrupted or
missing.
Corrective action Reboot the system. The boot sequence regenerates the host key and makes the
BMC SSH service available again.
Steps
1. Determine which protocol licenses are installed on the system by using the license command.
Because protocol licenses should match on both nodes in an HA pair, be sure to check each node.
2. For each protocol license installed, confirm that the protocol is enabled and configured:
In the following example, the cifs command reports that CIFS is not configured:
Node1> cifs
CIFS not configured. Use "cifs setup" to configure
3. If necessary, refer to the Data ONTAP File Access and Protocols Management Guide for 7-Mode
for directions on configuring the protocol.
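Step 1 notes that protocol licenses should match on both nodes in an HA pair. As a minimal sketch, assuming the license names from each node have been copied by hand into Python lists, the comparison could look like this. The function name and the sample lists are hypothetical, and the sketch does not parse real license command output.

```python
def license_mismatches(node1_licenses, node2_licenses):
    """Return the protocols licensed on only one node of an HA pair.

    The result is a pair of sorted lists: licenses found only on the
    first node, then licenses found only on the second node.
    """
    only_node1 = set(node1_licenses) - set(node2_licenses)
    only_node2 = set(node2_licenses) - set(node1_licenses)
    return sorted(only_node1), sorted(only_node2)


# Hypothetical license lists, entered by hand from each node.
print(license_mismatches(["nfs", "cifs", "iscsi"], ["nfs", "cifs"]))
# prints (['iscsi'], [])
```

Two empty lists mean that the nodes agree, and any name that is returned identifies a license to review on the other node.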
Copyright information
Copyright 1994-2014 NetApp, Inc. All rights reserved. Printed in the U.S.
No part of this document covered by copyright may be reproduced in any form or by any means,
graphic, electronic, or mechanical, including photocopying, recording, taping, or storage in an
electronic retrieval system, without prior written permission of the copyright owner.
Software derived from copyrighted NetApp material is subject to the following license and
disclaimer:
THIS SOFTWARE IS PROVIDED BY NETAPP "AS IS" AND WITHOUT ANY EXPRESS OR
IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE,
WHICH ARE HEREBY DISCLAIMED. IN NO EVENT SHALL NETAPP BE LIABLE FOR ANY
DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE
GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER
IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR
OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF
ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
NetApp reserves the right to change any products described herein at any time, and without notice.
NetApp assumes no responsibility or liability arising from the use of products described herein,
except as expressly agreed to in writing by NetApp. The use or purchase of this product does not
convey a license under any patent rights, trademark rights, or any other intellectual property rights of
NetApp.
The product described in this manual may be protected by one or more U.S. patents, foreign patents,
or pending applications.
RESTRICTED RIGHTS LEGEND: Use, duplication, or disclosure by the government is subject to
restrictions as set forth in subparagraph (c)(1)(ii) of the Rights in Technical Data and Computer
Software clause at DFARS 252.277-7103 (October 1988) and FAR 52-227-19 (June 1987).
Trademark information
NetApp, the NetApp logo, Network Appliance, the Network Appliance logo, Akorri,
ApplianceWatch, ASUP, AutoSupport, BalancePoint, BalancePoint Predictor, Bycast, Campaign
Express, ComplianceClock, Customer Fitness, Cryptainer, CryptoShred, CyberSnap, Data Center
Fitness, Data ONTAP, DataFabric, DataFort, Decru, Decru DataFort, DenseStak, Engenio, Engenio
logo, E-Stack, ExpressPod, FAServer, FastStak, FilerView, Fitness, Flash Accel, Flash Cache, Flash
Pool, FlashRay, FlexCache, FlexClone, FlexPod, FlexScale, FlexShare, FlexSuite, FlexVol, FPolicy,
GetSuccessful, gFiler, Go further, faster, Imagine Virtually Anything, Lifetime Key Management,
LockVault, Manage ONTAP, Mars, MetroCluster, MultiStore, NearStore, NetCache, NOW (NetApp
on the Web), Onaro, OnCommand, ONTAPI, OpenKey, PerformanceStak, RAID-DP, ReplicatorX,
SANscreen, SANshare, SANtricity, SecureAdmin, SecureShare, Select, Service Builder, Shadow
Tape, Simplicity, Simulate ONTAP, SnapCopy, Snap Creator, SnapDirector, SnapDrive, SnapFilter,
SnapIntegrator, SnapLock, SnapManager, SnapMigrator, SnapMirror, SnapMover, SnapProtect,
SnapRestore, Snapshot, SnapSuite, SnapValidator, SnapVault, StorageGRID, StoreVault, the
StoreVault logo, SyncMirror, Tech OnTap, The evolution of storage, Topio, VelocityStak, vFiler,
VFM, Virtual File Manager, VPolicy, WAFL, Web Filer, and XBB are trademarks or registered
trademarks of NetApp, Inc. in the United States, other countries, or both.
IBM, the IBM logo, and ibm.com are trademarks or registered trademarks of International Business
Machines Corporation in the United States, other countries, or both. A complete and current list of
other IBM trademarks is available on the web at www.ibm.com/legal/copytrade.shtml.
Apple is a registered trademark and QuickTime is a trademark of Apple, Inc. in the United States
and/or other countries. Microsoft is a registered trademark and Windows Media is a trademark of
Microsoft Corporation in the United States and/or other countries. RealAudio, RealNetworks,
RealPlayer, RealSystem, RealText, and RealVideo are registered trademarks and RealMedia,
RealProxy, and SureStream are trademarks of RealNetworks, Inc. in the United States and/or other
countries.
All other brands or products are trademarks or registered trademarks of their respective holders and
should be treated as such.
NetApp, Inc. is a licensee of the CompactFlash and CF Logo trademarks.
NetApp, Inc. NetCache is certified RealSystem compatible.
352 | Hardware Platform Monitoring Guide
Index
LEDs on the back of the controller module 32
0200: Failure Fixed Disk LEDs on the front of the chassis 30
error message 157, 166 NVMEM LED 32
0230: System RAM Failed at offset power LED 30
error message 158 PSU LEDs 34
0231: Shadow RAM failed at offset remote management port LEDs 32
error message 158 22xx system POST error messages
0232: Extended RAM failed at address line 0231: Shadow RAM Failed at offset 166
error message 159 22xx systems
0235: Multiple-bit ECC error occurred chassis fault LED 35
error message 159 controller activity LED 35
023C: Bad DIMM found in slot # controller fault LED 36
error message 159 Fibre Channel port LEDs 36
023E: Node Memory Interleaving disabled GbE port LEDs 36
error message 160 internal drive LEDs 39
0241: Agent Read Timeout internal FRU LEDs 43
error message 160 introduction to LEDs on 35
0242: Invalid FRU information introduction to POST error messages 166
error message 161 LEDs on the back of the controller 36
0250: System battery is dead LEDs on the front of the chassis 35
error message 161 management port LEDs 36
0251: System CMOS checksum bad mezzanine card 36
error message 162 NVMEM LED 36
0253: Clear CMOS jumper detected power LED 35
error message 162 PSU LEDs 41
0260: System timer error SAS port LEDs 36
error message 162 serial port 36
0280: Previous boot incomplete 2520 systems
error message 162 10/100/1000Base-T port LEDs 47
02C2: No valid Boot Loader in System FlashNon Fatal 1000Base-T port LEDs 47
error message 163 10GBase-T port LEDs 47
02C3: No valid Boot Loader in System FlashFatal controller attention LED 47
error message 163 LEDs on the back of the controller 47
02F9: FPGA jumper detected management port LEDs 47
POST error message 163 NVMEM LED 47
02FA: Watchdog Timer Reboot (PciInit) SAS port LEDs 47
error message 164 2520, 2552, and 2554 systems
02FB: Watchdog Timer Reboot (MemTest) chassis attention LED 43
POST error message 164 controller activity LED 43
02FC: LDTStop Reboot (HTLinkInit) LEDs on the front of the chassis 43
eror message 165 power LED 43
20xx systems 255x systems
controller module fault LED 32 10-GbE port LEDs 50
controller module LEDs 30 10/100/1000Base-T port LEDs 50
Ethernet port LEDs 32 1000Base-T port LEDs 50
fault LED 30 controller attention LED 50
Fibre Channel port LEDs 32
354 | Hardware Platform Monitoring Guide
C bmc.batt.upgrade.failure 342
bmc.batt.upgrade.ok 343
clustered system error messages bmc.batt.upgrade.power-off 343
UTA2 (CNA) 308 bmc.batt.upgrade.voltagelow 343
comments bmc.batt.voltage 343
how to send feedback about documentation 352 bmc.config.asup.off 344
bmc.config.corrupted 344
bmc.config.default 344
D bmc.config.default.pef.filter 344
diagnostic tools bmc.config.default.pef.policy 345
boot_diags command 28 bmc.config.fru.systemserial 345
forms and use of 28 bmc.config.mac.error 345
sldiag commands 28 bmc.config.net.error 345
doccomments bmc.config.upgrade 346
how to send feedback about documentation by using bmc.power.on.auto 346
352 bmc.reset.ext 346
documentation bmc.reset.int 346
how to send feedback about 352 bmc.reset.power 346
where to find platform troubleshooting 28 bmc.reset.repair 347
bmc.reset.unknown 347
bmc.sensor.batt.charger.off 347
E bmc.sensor.batt.charger.on 347
bmc.sensor.batt.time.run.invalid 347
EMS messages
bmc.ssh.key.missing 348
Chassis power supply removed: PS# 187
EMS messages about the RLM
information provided in 182
rlm.driver.hourly.stats 325
introduction to environmental 182
rlm.driver.mailhost 326
No network interfaces 178
rlm.driver.network.failure 326
rlm.firmware.update.failed 327
rlm.driver.timeout 326
ses.access.noMoreValidPaths 264
rlm.firmware.upgrade.reqd 328
ses.access.noShelfSES 265
rlm.firmware.version.unsupported 328
ses.disk.configOk 268
rlm.heartbeat.bootFromBackup 329
ses.download.shelfToReboot 269
rlm.heartbeat.resumed 329
ses.drive.shelfAddr.mm 270
rlm.heartbeat.stopped 329
EMS messages about the BMC
rlm.network.link.down 330
bmc.asup.crit 338
rlm.notConfigured 330
bmc.asup.error 339
rlm.orftp.failed 331
bmc.asup.init 339
rlm.snmp.traps.off 331
bmc.asup.queue 339
rlm.systemDown.alert 331
bmc.asup.send 339
rlm.systemDown.notice 332
bmc.asup.smtp 340
rlm.systemDown.warning 332
bmc.batt.id 340
rlm.systemPeriodic.keepAlive 333
bmc.batt.invalid 340
rlm.systemTest.notice 333
bmc.batt.mfg 340
rlm.userlist.update.failed 334
bmc.batt.rev 341
EMS messages about the SP
bmc.batt.seal 341
sp.firmware.upgrade.reqd 314
bmc.batt.unknown 341
sp.firmware.version.unsupported 315
bmc.batt.unseal 341
sp.heartbeat.resumed 315
bmc.batt.upgrade 341
sp.heartbeat.stopped 315
bmc.batt.upgrade.busy 342
sp.network.link.down 316
bmc.batt.upgrade.failed 342
Index | 357
LEDs on the back of the controller 74 location and meaning of FAS8020 system internal
LEDs on the back of the controller module 32 FRU 100
LEDs on the front of the chassis 60 location and meaning of FAS8040, FAS8060, and
LEDS on the front of the chassis 65 FAS8080 system fan LEDs 98
LEDs on the front of the controller 73 location and meaning of FAS8040, FAS8060, and
location and meaning of 2050 single-port 10-GbE FAS8080 system internal FRU 101
NIC 124 location and meaning of FAS80xx system PSU 99
location and meaning of 25xx system internal FRU location and meaning of fiber-optic iSCSI target
55 HBA 137
location and meaning of 31xx system LEDs on the location and meaning of Flash Cache module 128
back of the controller 62 location and meaning of multiport GbE NIC 118
location and meaning of 60xx system PSU 76
location and meaning of 8020 system fan LEDs 98
location and meaning of 8020 system internal FRU 100
location and meaning of 8040, 8060, and 8080 system fan LEDs 98
location and meaning of 8040, 8060, and 8080 system internal FRU 101
location and meaning of 80xx system PSU 99
location and meaning of copper iSCSI target HBA 138
location and meaning of dual-port 10-GbE NIC 125
location and meaning of dual-port Fibre Channel HBA 129
location and meaning of dual-port GbE NICs 122, 123
location and meaning of dual-port, 10-Gb, FCoE CNA HBA 112
location and meaning of dual-port, 10-Gb, FCoE unified target HBA 112
location and meaning of dual-port, 10GBase-CX4 TOE NICs 152
location and meaning of dual-port, 10GBase-SR TOE NIC 151
location and meaning of dual-port, 16-Gb FC, 10-GbE/FCoE UTA2 115
location and meaning of dual-port, 16-Gb MetroCluster adapter
dual-port, 16-Gb FCVI adapter 147
location and meaning of dual-port, 2-Gb MetroCluster adapter 142
location and meaning of dual-port, 4-Gb MetroCluster adapter 143
location and meaning of dual-port, 8-Gb MetroCluster adapter 145
location and meaning of FAS25xx system internal FRU 55
location and meaning of FAS8020 system fan LEDs 98
location and meaning of PAM 127
location and meaning of PSU 76
location and meaning of quad-port, 4-Gb, 12-LED Fibre Channel HBA 133
location and meaning of SA300 controller front 56
location and meaning of SA600 system PSU 76
location and meaning of single-port GbE NICs 116
location and meaning of, on the back of the controller 62
management port LEDs 66, 89, 92
nonvolatile memory (NVMEM) 32
NVMEM LED 66
NVRAM LED 89, 92
NVRAM5 adapter 102
NVRAM5 and NVRAM6 media converter 103
NVRAM6 adapter 102
NVRAM7 adapter 103
NVRAM8 adapter 104
NVRAM9 adapter 109
on 25xx internal drive carriers 45
on FAS25xx internal drive carriers 45
on the back of the 80xx I/O expansion module 96
on the back of the controller 66, 89
on the back of the I/O expansion module 69
on the front of the chassis 30
onboard drive failures, 20xx systems 30
power LED 65
private management port 79
PSU 34, 41, 71
PSU LEDs 63, 84
PSU, 20xx systems 34
PSU, SA200 systems 34
quad-port TOE NICs 149
quad-port, 4-Gb, Fibre Channel HBA, four-LED version 132
quad-port, 8-Gb, Fibre Channel HBA, 12-LED version 135
remote management port 32, 79
SA200 system LEDs on the back of the controller module 32
SA200 system LEDs on the front of the chassis 30
SA300 PSU 59
SA300 system fan 58
SA300 system LEDs on the back of the controller 57
SA320 system fan LEDs 70
SA320 system internal FRU LEDs 72
SA320 system LED on the back of the I/O expansion module 69
SA320 system LEDs on the back of the controller 66
SA320 system PSU LEDs 71
SA600 system fan LEDs 75
SA600 system LEDs on the back of the controller 74
SA600 system LEDs on the front of the controller 73
SA620 LEDs on the back of the controller 79
SA620 PSU LEDs 84
SA620 system fan LEDs 84
SA620 system internal FRU LEDs 85
SA620 system LEDs on front of chassis 77
SA620 system LEDs on the back of the I/O expansion module 83
SAS port LEDs 66, 89, 92
single-port TOE NICs 148
UTA2/CNA port LEDs 89, 92
visible from front of system 77
LEDs on the back of the I/O expansion module
I/O expansion module fault 83
private management port LEDs 83
M
maintenance mode error messages
UTA2 (CNA) 306
MetroCluster adapter LEDs
introduction to 142
location and meaning of dual-port, 16-Gb 147
location and meaning of dual-port, 2-Gb 142
location and meaning of dual-port, 4-Gb 143
location and meaning of dual-port, 8-Gb 145
N
NIC LEDs
location and meaning of 2050 single-port 10-GbE 124
location and meaning of dual-port 10-GbE 125
location and meaning of dual-port GbE 122, 123
location and meaning of dual-port, 10GBase-CX4 TOE 152
location and meaning of dual-port, 10GBase-SR TOE 151
location and meaning of multiport GbE 118
location and meaning of single-port GbE 116
No message on console
error message 165
NVRAM5 adapter
LEDs 102
NVRAM6 adapter
LEDs 102
which systems support 101
NVRAM7 adapter
LEDs 103
which systems support 101
NVRAM8 adapter
destage status 104
HA pair 104
LEDs 104
which systems support 101
NVRAM9 adapter
LEDs 109
which systems support 101
O
operational error messages
Disk hung during swap 303
Disk n is broken 304
Dumping core 304
Error dumping core 304
FC-AL LINK_FAILURE 304
FC-AL RECOVERABLE ERRORS 304
information provided in 182
Panicking 305
RMC Alert: Boot Error 305
RMC Alert: Down Appliance 305
RMC Alert: OFW POST Error 305
P
platform troubleshooting
where to find documentation for 28
POST error messages
0200: Failure Fixed Disk 157, 166
0230: System RAM Failed at offset 158, 166
0231: Shadow RAM failed at offset 158
0231: Shadow RAM Failed at offset 166
0232: Extended RAM failed at address line 159
0232: Extended RAM Failed at address line 166
0235: Multiple-bit ECC error occurred 159
023A: ONTAP Detected Bad DIMM in slot 167
023C: Bad DIMM found in slot # 159
023E: Node Memory Interleaving disabled 160, 167
0241: Agent Read Timeout 160
0241: SMBus Read Timeout 167
0242: Invalid FRU information 161, 167
0250: System battery is dead 161
0250: System battery is dead - Replace and run SETUP 168
0251: System CMOS checksum bad 162, 168
0253: Clear CMOS jumper detected 162
0260: System timer error 162, 168
0271: Check date and time settings 168
0280: Previous boot incomplete 162
0280: Previous boot incomplete - Default configuration used 169
02A3: No Response From SP To FRU ID Read Request 169
02C2: No valid Boot Loader in System Flash - Non Fatal 169
02C2: No valid Boot Loader in System Flash - Non Fatal 163
02C3: No valid Boot Loader in System Flash - Fatal 170
02C3: No valid Boot Loader in System Flash - Fatal 163
02F9: FPGA jumper detected 163
02FA: Watchdog Timer Reboot (PciInit) 164
02FB: Watchdog Timer Reboot (MemTest) 164
02FC: LDTStop Reboot (HTLinkInit) 165
BIOS detected pattern write/read mismatch in DIMM slot: 170
BIOS detected uncorrectable ECC error in DIMM slot: 171
BIOS detected unknown errors in DIMM slot 171
Fatal Error: No DIMM detected and system can not continue boot! 172
Fatal Error! All channels are disabled! 171
Fatal Error! All DIMM failed and system can not continue boot! 171
Fatal Error! RDIMMs and UDIMMs are mixed! 172
Fatal Error! UDIMM in 3rd slot is not supported! 172
No message on console 165
No message on the console 173
Software memory test failed! 173
POST error messages, 22xx systems
0200: Failure Fixed Disk 166
0230: System RAM Failed at offset 166
0232: Extended RAM Failed at address line 166
023A: ONTAP Detected Bad DIMM in slot 167
023B: BIOS detected SPD checksum error in DIMM slot: 167
023E: Node Memory Interleaving disabled 167
0241: SMBus Read Timeout 167
0242: Invalid FRU information 167
0250: System battery is dead - Replace and run SETUP 168
0251: System CMOS checksum bad 168
0260: System timer error 168
0271: Check date and time settings 168
0280: Previous boot incomplete - Default configuration used 169
02A2: System Error Log (SEL) Full 169
02A3: No Response From SP To FRU ID Read Request 169
02C2: No valid Boot Loader in System Flash - Non Fatal 169
02C3: No valid Boot Loader in System Flash - Fatal 170
BIOS detected pattern write/read mismatch in DIMM slot: 170
BIOS detected uncorrectable ECC error in DIMM slot: 171
BIOS detected unknown errors in DIMM slot 171
Fatal Error: No DIMM detected and system can not continue boot! 172
Fatal Error! All channels are disabled! 171
Fatal Error! All DIMM failed and system can not continue boot! 171
Fatal Error! RDIMMs and UDIMMs are mixed! 172
No message on the console 173
Software memory test failed! 173
POST error messages, 25xx systems
0230: System RAM Failed at offset 166
0232: Extended RAM Failed at address line 166
023A: ONTAP Detected Bad DIMM in slot 167
023B: BIOS detected SPD checksum error in DIMM slot: 167
023E: Node Memory Interleaving disabled 167
0241: SMBus Read Timeout 167
0242: Invalid FRU information 167
0250: System battery is dead - Replace and run SETUP 168
0251: System CMOS checksum bad 168
0260: System timer error 168
0271: Check date and time settings 168
0280: Previous boot incomplete - Default configuration used 169
02A2: System Error Log (SEL) Full 169
02A3: No Response From SP To FRU ID Read Request 169
02C2: No valid Boot Loader in System Flash - Non Fatal 169
02C3: No valid Boot Loader in System Flash - Fatal 170
BIOS detected pattern write/read mismatch in DIMM slot: 170
BIOS detected uncorrectable ECC error in DIMM slot: 171
BIOS detected unknown errors in DIMM slot 171
Fatal Error: No DIMM detected and system can not continue boot! 172
Fatal Error! All channels are disabled! 171
Fatal Error! All DIMM failed and system can not continue boot! 171
Fatal Error! RDIMMs and UDIMMs are mixed! 172
No message on the console 173
Software memory test failed! 173
POST error messages, 31xx systems
0200: Failure Fixed Disk 157
0230: System RAM Failed at offset: 158
0231: Shadow RAM failed at offset 158
0232: Extended RAM failed at address line 159
0235: Multiple-bit ECC error occurred 159
023C: Bad DIMM found in slot # 159
023E: Node Memory Interleaving disabled 160
0241: Agent Read Timeout 160
0242: Invalid FRU information 161
0250: System battery is dead 161
0251: System CMOS checksum bad 162
0260: System timer error 162
0280: Previous boot incomplete 162
02C2: No valid Boot Loader in System Flash - Non Fatal 163
02C3: No valid Boot Loader in System Flash - Fatal 163
02FA: Watchdog Timer Reboot (PciInit) 164
02FC: LDTStop Reboot (HTLinkInit) 165
No message on console 165
POST error messages, 32xx and SA320 systems
0200: Failure Fixed Disk 166
0230: System RAM Failed at offset 166
0232: Extended RAM Failed at address line 166
023A: ONTAP Detected Bad DIMM in slot 167
023B: BIOS detected SPD checksum error in DIMM slot: 167
023E: Node Memory Interleaving disabled 167
0241: SMBus Read Timeout 167
0242: Invalid FRU information 167
0250: System battery is dead - Replace and run SETUP 168
0251: System CMOS checksum bad 168
0260: System timer error 168
0271: Check date and time settings 168
0280: Previous boot incomplete - Default configuration used 169
02A2: System Error Log (SEL) Full 169
02A3: No Response From SP To FRU ID Read Request 169
02C2: No valid Boot Loader in System Flash - Non Fatal 169
02C3: No valid Boot Loader in System Flash - Fatal 170
BIOS detected pattern write/read mismatch in DIMM slot: 170
BIOS detected uncorrectable ECC error in DIMM slot: 171
BIOS detected unknown errors in DIMM slot 171
Fatal Error: No DIMM detected and system can not continue boot! 172
Fatal Error! All channels are disabled! 171
Fatal Error! All DIMM failed and system can not continue boot! 171
Fatal Error! RDIMMs and UDIMMs are mixed! 172
Fatal Error! UDIMM in 3rd slot is not supported! 172
No message on the console 173
Software memory test failed! 173
POST error messages, 60xx and SA600 systems
0200: Failure Fixed Disk 157
0230: System RAM Failed at offset: 158
0231: Shadow RAM failed at offset 158
0232: Extended RAM failed at address line 159
0235: Multiple-bit ECC error occurred 159
023C: Bad DIMM found in slot # 159
023E: Node Memory Interleaving disabled 160
0241: Agent Read Timeout 160
0242: Invalid FRU information 161
0250: System battery is dead 161
0251: System CMOS checksum bad 162
0253: Clear CMOS jumper detected 162
0260: System timer error 162
0280: Previous boot incomplete 162
02C2: No valid Boot Loader in System Flash - Non Fatal 163
02C3: No valid Boot Loader in System Flash - Fatal 163
02FA: Watchdog Timer Reboot (PciInit) 164
02FC: LDTStop Reboot (HTLinkInit) 165
0260: System timer error 162
0280: Previous boot incomplete 162
02C2: No valid Boot Loader in System Flash - Non Fatal 163
02C3: No valid Boot Loader in System Flash - Fatal 163
02FA: Watchdog Timer Reboot (PciInit) 164
02FC: LDTStop Reboot (HTLinkInit) 165
No message on console 165
PSU LEDs
20xx systems 34
22xx systems 41
25xx systems 53
31xx systems 63
32xx systems 71
60xx systems 76
62xx systems 84
FAS25xx systems 53
location and meaning of 80xx systems 99
location and meaning of FAS80xx systems 99
SA200 systems 34
SA300 system 59
SA320 systems 71
SA600 systems 76
SA620 systems 84
Q
quality documentation
how to send feedback about improving 352
R
RLM
AutoSupport e-mail contents 322
types of messages 321
when AutoSupport messages are sent 321
when RLM EMS messages are sent 322
RLM EMS messages
rlm.firmware.update.failed 327
RLM-generated messages
Heartbeat loss warning 322
Reboot (power loss) critical 323
Reboot (watchdog reset) warning 323
Reboot warning 323
RLM heartbeat loss 323
RLM heartbeat stopped 324
System boot failed (POST failed) 324
User triggered (RLM test) 324
User_triggered (system nmi) 324
User_triggered (system power cycle) 325
User_triggered (system power off) 325
User_triggered (system power on) 325
User_triggered (system reset) 325
S
SA200 systems
controller module fault LED 32
controller module LEDs 30
Ethernet port LEDs 32
fault LED 30
Fibre Channel port LEDs 32
LEDs on the back of the controller module 32
LEDs on the front of the chassis 30
NVMEM LED 32
power LED 30
PSU LEDs 34
remote management port LEDs 32
startup progress, viewing 155
SA300 systems
fan LED 58
FC port LEDs 57
GbE port LEDs 57
introduction to LEDs on 56
introduction to POST error messages 157
LEDs on the back of the controller 57
location and meaning of controller front LEDs 56
RLM LEDs 57
SA320 system POST error messages
0231: Shadow RAM Failed at offset 166
SA320 systems
chassis fault LED 65
controller activity LED 65
controller fault LED 66
controller-I/O expansion module configuration 65
dual-controller configuration 65
fan LED 70
Fibre Channel port LEDs 66
GbE port LEDs 66
I/O expansion module fault LED 69
internal FRU LEDs 72
introduction to POST error messages 166
LED on the back of the I/O expansion module 69
LEDs on the back of the controller 66
LEDs on the front of the chassis 65
management port LEDs 66, 69
NVMEM LED 66
power LED 65
PSU LEDs 71