bullion S

General training

Prepared by Patrick LEMESLE (Field Support)

April 30th 2015

© For internal use


Big Data & Security | Legacy Systems | GCOS7 & Bullion Support
Index

bullion S family and environment
General views and indicators
bullion S architecture and components
Configuring and accessing management module
Management pages information
iCare
BPM
Firmware and BIOS upgrade
BSMHW_NG_CLI
Memory decoding
Replacing CIX board

bullion S family
– Global stack
– Features
– Global view

Bullion S family

Bullion S2
– 1 module, 2 CPUs, up to 3 TB memory, 7 PCI slots.
Bullion S4
– 2 modules, 4 CPUs, up to 6 TB memory, 14 PCI slots.
Bullion S8
– 4 modules, 8 CPUs, up to 12 TB memory, 28 PCI slots.
Bullion S16
– 8 modules, 16 CPUs, up to 24 TB memory, 56 PCI slots.
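
The four models listed above scale linearly with the module count. The sketch below reproduces the listed figures from the single-module (S2) values; the class and function names are illustrative and are not part of any Bull tool.

```python
from dataclasses import dataclass

# Per-module figures, taken from the Bullion S2 entry in the list above.
CPUS_PER_MODULE = 2
MAX_MEMORY_TB_PER_MODULE = 3
PCI_SLOTS_PER_MODULE = 7


@dataclass(frozen=True)
class BullionConfig:
    name: str
    modules: int
    cpus: int
    max_memory_tb: int
    pci_slots: int


def bullion_config(modules: int) -> BullionConfig:
    """Scale the single-module figures to a given module count."""
    if modules not in (1, 2, 4, 8):
        raise ValueError("only 1, 2, 4 or 8 modules are listed")
    cpus = modules * CPUS_PER_MODULE
    return BullionConfig(
        name=f"Bullion S{cpus}",
        modules=modules,
        cpus=cpus,
        max_memory_tb=modules * MAX_MEMORY_TB_PER_MODULE,
        pci_slots=modules * PCI_SLOTS_PER_MODULE,
    )


if __name__ == "__main__":
    for count in (1, 2, 4, 8):
        print(bullion_config(count))
```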

bullion: global stack for RAS features

[Figure: global stack; layers shown include Hypervisors and Memory & CPUs]

bullion S servers: designed for RAS features
Reliability
Easy to maintain

Xeon E5v2

Intel Xeon E5-xxxx
1. Memory Single Device Data Correction

Run Sure Features ®
1. Enhanced Memory Single Device Data Correction (Device Tagging)
2. Dynamic Memory Migration
3. Enhanced Memory Double Device Data Correction
4. Failed DIMM Identification
5. Intel® SMI2 Packet Retry
6. Intel® SMI2 Memory Half-Width Fail over
7. Memory Address Parity Protection
8. Memory Demand and Patrol Scrubbing
9. Memory Rank Sparing
10. Memory Thermal Throttling

bullion RAS specific features
1. Hot-Add Memory (Linux)
2. Hot-Add/Remove PCIe (ESXi/Linux)
3. Hot-Add partition w/ new module
4. Predictive Failure Analysis (ESXi, Linux)
5. Memory hot swap (Linux, ESXi)
- Memory blade hot remove / hot add
- On-demand Memory Migration (in development)
6. 586 Hardware Circuit Sensors reported in management console (Linux, ESXi)
7. Memory Mapout – reported in BMC logs
- Allows the system to automatically work around faulty DIMMs

bullion S servers : designed for RAS
Patented mechanics for easy replacement (hot-plug) without opening the server
– Memory blades
– PCIe blades
– Disks (w/ PCIe RAID adapter)
– Fans (5+1 redundancy)
– PSUs: 1+1 redundancy in active/passive mode
– UCM
Views & indicators
– Exploded view
– Components & Indicators
– Memory blade indicators

Basic 2S drawer : exploded view
– Middle Top Cover
– CPU Air Duct
– Memory Riser Module (RM3D3)
– CIX board (Mother Board)
– Power Distribution Board (PD16)
bullion S physical views
Rear view
– VGA connector
– Serial port COM2
– 6 fan blocks
– Drawer containing server information
– PSU
– MRLB board (USB + LAN1G)
– 3rd HDD (GCOS7)
– RP16 module (for 16x boards)
– RP8 module (for 8x boards)
– Intermodule XQPI
Front view
– 2 HDD
– UCM module
– 8 memory blades
– LCP module
Leds and indicators
Rear view
– Fan "fault" LED
– Drawer containing the serial number and management MAC address
– Identification blue LED
– PSU fault LED
Front view
– Disk LEDs
– UCM LEDs
– 8 memory blades (detailed in the next section)
– Blue identification button LED
Memory blades leds (CRU)
RM3D3/RM2D3

Mark  LED/button        Color  Description                                    In RM3D3 / RM3D4
A     Attention Button         Indication to BIOS for hot plugging
B     Power LED         Green  Dim on successful MRC
C     Attention LED     Amber  Dim when RM3D3 or RM3D4 requires attention
D     CH0 DIMM2         Red    Faulty CH0 DIMM2                               D0
E     CH0 DIMM1         Red    Faulty CH0 DIMM1                               D1
F     CH0 DIMM0         Red    Faulty CH0 DIMM0                               D2
G     CH1 DIMM2         Red    Faulty CH1 DIMM2                               D3
H     CH1 DIMM1         Red    Faulty CH1 DIMM1                               D4
I     CH1 DIMM0         Red    Faulty CH1 DIMM0                               D5
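
The mark-to-DIMM mapping in the table above can be kept as a small lookup when writing up an intervention. This is only an illustrative sketch: the dictionary copies rows D to I of the table, and the function name is hypothetical.

```python
# Mark -> (channel/DIMM location, RM3D3 / RM3D4 designator), copied from the table.
DIMM_FAULT_LEDS = {
    "D": ("CH0 DIMM2", "D0"),
    "E": ("CH0 DIMM1", "D1"),
    "F": ("CH0 DIMM0", "D2"),
    "G": ("CH1 DIMM2", "D3"),
    "H": ("CH1 DIMM1", "D4"),
    "I": ("CH1 DIMM0", "D5"),
}


def faulty_dimms(lit_marks):
    """Return the DIMM locations matching the red LEDs that are lit."""
    result = []
    for mark in lit_marks:
        key = mark.upper()
        if key not in DIMM_FAULT_LEDS:
            raise KeyError(f"{mark!r} is not a DIMM fault LED mark (expected D to I)")
        result.append(DIMM_FAULT_LEDS[key])
    return result


if __name__ == "__main__":
    # Example: red LEDs E and I are lit on a memory blade.
    for location, designator in faulty_dimms(["E", "I"]):
        print(f"Faulty {location} ({designator})")
```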

Architecture & components
– Block Diagram
– Packaging
– CIX - IMB
– Multi-modules cabling
– UCM

Mesca2 : block diagram

Mesca2 packaging overview
Mesca2 basic module
– 3U drawer, 19” rack standard

Motherboard
– 2-socket module or node (2S)
– CPU : IvyBridge-EX and then Haswell-EX and Broadwell-EX
– Maximum memory slots : 24 DIMMs per socket DDR3 and then DDR4
– Maximum use of available PCIe lanes offering low profile slots:
– seven (7) 8x PCIe slots
– or three (3) 16x PCIe slots and one (1) 8x PCIe slot
– BCS2, PCH, BMC, Gigabit Ethernet controller

Accessories
– Hot-plug PSU : 2N redundancy (2x700W or 2 x 1400W option)
– Hot-plug Fans : 5+1
– Hot-plug Front Disks : 0 to 2
– LCP
– Hot-plug UCM (option)
– Hot-plug Memory risers : 2 to 8
– Hot-plug PCIe risers : 0 to 7
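
The two PCIe slot layouts listed under Motherboard use the same number of lanes, which is consistent with the stated goal of using all available PCIe lanes. A quick check, with illustrative names:

```python
def total_lanes(slots):
    """Sum the lane widths over a {lane_width: slot_count} layout."""
    return sum(width * count for width, count in slots.items())


SEVEN_X8 = {8: 7}                 # seven 8x PCIe slots
THREE_X16_ONE_X8 = {16: 3, 8: 1}  # three 16x PCIe slots and one 8x PCIe slot

if __name__ == "__main__":
    print(total_lanes(SEVEN_X8), total_lanes(THREE_X16_ONE_X8))
```

Both layouts add up to 56 lanes per module.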

CIX mother board
– BCS2 socket
– IMB connector
– Memory boards connectors
– CPU sockets
– I/O boards connectors
IMB backplanes overview
Cable assemblies :
– 4 different cabling versions
• 8 modules (24U height)
• 4 modules (12U height)
• 2 modules (6U height)
– Same Eth switch board
• Star cabling topology
• Remote redundant power
• SPI bus to CIX FPGA
– XQPI :
• 16 diff pairs per module
• 8TX/8RX
• All-to-all topology
• 14 Gbps per differential pair
• 14 GBps max BW per BCS2
• XQPI side band signals
– CATER, CLK, PMSYNC
Tyco 24U 8 modules IMB version
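
One reading that reconciles the XQPI figures above: of the 16 differential pairs, 8 carry traffic in each direction, and 8 pairs at 14 Gbps give 112 Gbps, i.e. 14 GBps of raw bandwidth per direction. The sketch below only performs that arithmetic; it assumes 8 bits per byte with no encoding overhead, which the slide does not state.

```python
PAIRS_PER_DIRECTION = 8     # 8TX/8RX out of 16 differential pairs
GBIT_PER_S_PER_PAIR = 14    # 14 Gbps per differential pair


def raw_bandwidth_gbytes_per_s(pairs, gbit_per_s_per_pair):
    """Raw one-direction bandwidth in GB/s, ignoring any encoding overhead."""
    return pairs * gbit_per_s_per_pair / 8


if __name__ == "__main__":
    print(raw_bandwidth_gbytes_per_s(PAIRS_PER_DIRECTION, GBIT_PER_S_PER_PAIR))
```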
XQPI links configuration: octo-module

[Figure: eight modules, m0 to m7, each shown with XQPI ports 0 to 6]

XQPI links configuration: quadri-module

[Figure: four modules, m0 to m3, each shown with XQPI ports 0 to 6]

XQPI links configuration: bi-module

[Figure: two modules, m0 and m1, each shown with XQPI ports 0 to 6]
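
In an all-to-all topology every module is connected to every other module. The sketch below enumerates the module pairs for the three configurations above; in the octo-module case each module has 7 peers, which matches the 7 ports (0 to 6) shown per module. How the ports are shared between fewer peers in the quadri- and bi-module cases is not stated in the source, so the code only counts peer pairs.

```python
from itertools import combinations


def all_to_all_pairs(module_count):
    """List every unordered pair of modules in an all-to-all topology."""
    modules = [f"m{i}" for i in range(module_count)]
    return list(combinations(modules, 2))


if __name__ == "__main__":
    for count in (2, 4, 8):
        pairs = all_to_all_pairs(count)
        print(f"{count} modules: {len(pairs)} module pairs, "
              f"{count - 1} peers per module")
```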

MUCM control board (not CRU)

– Ultra capacitors
– Discharge resistors
– VCAP I/O
– Charger Controller
– Transfer Switch
– PD16 connector

BMC configuration
– Psetup
– Connection to BMC/EMM
– EMM page descriptions
– System console (java)

How to set management IP address (BMC)
Launch Psetup
– Psetup is provided at delivery on the resources DVD
1. Select Management (BMC)
– The MAC address is located on a sticker in the box at the rear of the server
2. Click on "Query Device"
3. Set the IP address, subnet mask and gateway
4. Set the Super User login "super" and the Super User password "pass"
5. To save the settings, click on the "Setup Device" button
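
Before entering the IP address, subnet mask and gateway in Psetup, it can help to check that the three values are consistent. The sketch below uses Python's standard `ipaddress` module and is independent of Psetup; the function name and the addresses (taken from the documentation ranges) are illustrative.

```python
import ipaddress


def check_bmc_network(ip, netmask, gateway):
    """Return the BMC's network if the gateway is usable from it, else raise."""
    iface = ipaddress.ip_interface(f"{ip}/{netmask}")
    gw = ipaddress.ip_address(gateway)
    if gw == iface.ip:
        raise ValueError("gateway and BMC address are identical")
    if gw not in iface.network:
        raise ValueError(f"gateway {gw} is outside {iface.network}")
    return iface.network


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    print(check_bmc_network("192.0.2.10", "255.255.255.0", "192.0.2.1"))
```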

How to connect to the BMC

To connect to the Server Hardware Console (SHC):

– Open any browser (IE, Firefox, Chrome, etc.)
1. Enter the BMC IP address in the URL field
2. Default user "super", default password "pass"
3. Click on "Log On"

Welcome to Management home page

System Control
– To manage server power sequences
– To launch server console to access BIOS, EFI, OS
Monitoring
– To view logs and sensors
Configuration
– To customize management parameters according to customer
environment
Maintenance
– List and update firmware and BIOS
– List internal server components
– Useful to reset the BMC, exclude/include configuration components, and set
the blue identification LED on/off.

System Control Tab

1. Standard Power Operations
– To be used when the server is in normal status.
2. Emergency or Unresponsive System Power Operations
– Must be used when the server is frozen or when no OS is running
(i.e. server under EFI).
3. Hard Reset
– Reloads the BIOS without a power cycle; not all registers are reset.
Error Management
– Uncorrectable and fatal error
Monitoring Tab

1. Sensor status
– 10 tabs categorize the different sensor types
2. System Event Log (SEL)
– Displays 1024 entries
3. Message logs
– Message log built by the management interface (EMM)

To be done at installation time
Maintenance folder 1/2

Maintenance folder 2/2

System console launching (requires Java) 1/2

2 possibilities :
– Click on ‘’Launch’’
– Click in ‘’preview’’ window represented in blue

System console launching (requires Java) 2/2
Depending on the workstation configuration, one or two warning
windows may appear when Java starts:
– The first one is to allow Java to start (may be absent)
– The next one (or the only one) is a Java security prompt about pop-ups; you
must answer NO to allow pop-ups (pop-ups are required for
remote media usage).

Several system console views

iCare 1.6.5
– insight
– Care

Set of tools for platform management
Bull Platform Manager (BPM)
Web interface over BSMCLI for system administrators
iCare
Web interface for maintenance and support teams

iCare purpose and architecture

The purpose of the iCare application is to improve the maintainability of Bull servers.
Overall architecture

iCare: Discovery function 1/2

iCare is free software provided by Atos on the resources DVD. iCare must be
installed and configured during the first installation.
At each configuration upgrade or downgrade, iCare must be updated.

iCare: Discovery function 2/2

1. Apply
– Select the servers and click on "Apply"
2. Get XML Template
– Can be used to plan for several servers that are not yet
connected to the network. Take care, because there is no
configuration control.

iCare: Autocall function 1/2

1. Set "enable autocalls"
2. Complete the destination email
– Several emails can be set, separated by (;)
3. Enter the mail server
4. Click on "Apply" to save
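
The destination field accepts several addresses separated by semicolons. The sketch below shows one way to prepare and sanity-check such a list before pasting it into iCare; it is not iCare code, the addresses are placeholders, and the check is deliberately minimal (one "@" with text on both sides).

```python
def parse_recipients(field):
    """Split a ';'-separated address field and reject obviously malformed entries."""
    recipients = [part.strip() for part in field.split(";")]
    recipients = [r for r in recipients if r]  # ignore empty segments
    for address in recipients:
        local, sep, domain = address.partition("@")
        if not (sep and local and domain) or "@" in domain:
            raise ValueError(f"malformed address: {address!r}")
    return recipients


def format_recipients(recipients):
    """Join addresses back into the ';'-separated form."""
    return ";".join(recipients)


if __name__ == "__main__":
    # Placeholder addresses for illustration only.
    print(parse_recipients("support@example.com; oncall@example.com"))
```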
iCare: Autocall function 2/2

Site engineer name
– Used to identify the sender.
(See)

iCare: ‘’Monitoring’’ folder

iCare: How to test communication with targets

iCare: How to save BMC configuration files

Folder contains
– C:\Program Files (x86)\bull\Bull insight Care\core\ammsite\data\bmc_backup\bullionS4_sup\XAN-LX7-00010\2014-09-28_11-21-23
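
The last component of the backup folder shown above looks like a date and time stamp. Assuming that format (year-month-day_hour-minute-second), which the source does not document explicitly, the backup time can be recovered as follows; the function name is illustrative.

```python
from datetime import datetime
from pathlib import PureWindowsPath

# Backup folder exactly as shown in the slide.
BACKUP_PATH = (
    r"C:\Program Files (x86)\bull\Bull insight Care\core\ammsite\data"
    r"\bmc_backup\bullionS4_sup\XAN-LX7-00010\2014-09-28_11-21-23"
)


def backup_timestamp(path):
    """Parse the final folder name as YYYY-MM-DD_HH-MM-SS (assumed format)."""
    stamp = PureWindowsPath(path).name
    return datetime.strptime(stamp, "%Y-%m-%d_%H-%M-%S")


if __name__ == "__main__":
    print(backup_timestamp(BACKUP_PATH).isoformat())
```

Parsing the stamp this way makes it easy to pick the most recent backup when several folders exist.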

BPM 2.4.4
– Bull
– Platform
– Manager

BPM: Partitioning

To change the partitioning, select the modules and enter a partition name
in the dialog box.
To save, click on "Change"

Update Firmware
– System Hardware Console (SHC)

– Bull Platform Manager (BPM)

– BSM_HW_NG_CLI

Firmware and BIOS update using SHC

Firmware and BIOS update using BPM

The top picture shows the update component by component.

The bottom picture shows the update for the global partition (2 modules for S4, 4 modules
for S8, 8 modules for S16)
– Enter the resources DVD mounting point (e.g. e:\)

Firmware and BIOS update using BPM

Click on the Diff button to obtain the difference between the
DVD and the current technical status.
BSMHW_NG_CLI
– Connection to Cygwin
– Script examples

BSM_HW_NG_CLI
Meaning: Bull System Management HardWare New Generation Command
Line Interface
The BSMHW_NG package contains scripts usable on a Windows workstation.
How to use it on Windows:
– Open a DOS shell in ADMINISTRATOR mode
• Accessories > cmd (right click > Run as administrator)
• cd %BSMHW_NG_HOME%/engine/bin
• bash --login -i
• At the prompt, enter "cd /bin"
• You are now in the Cygwin Linux shell
• All BSMHW_CLI commands can be launched as on a Linux workstation

First go to the executable folder /bin
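
The first step above relies on the BSMHW_NG_HOME environment variable. The sketch below resolves the engine/bin folder from it and fails with a clear message if the variable is missing; it does not launch anything. The function name is illustrative, and the example value of BSMHW_NG_HOME is an assumption based on the install path quoted elsewhere in this training.

```python
import os
from pathlib import PureWindowsPath


def engine_bin_dir(env=None):
    """Return %BSMHW_NG_HOME%\\engine\\bin as a Windows path."""
    env = os.environ if env is None else env
    home = env.get("BSMHW_NG_HOME")
    if not home:
        raise RuntimeError("BSMHW_NG_HOME is not set; is the BSMHW_NG package installed?")
    return PureWindowsPath(home) / "engine" / "bin"


if __name__ == "__main__":
    # Assumed value, for illustration only.
    example_env = {"BSMHW_NG_HOME": r"C:\Program Files (x86)\bull\BSM HW NG CLI"}
    print(engine_bin_dir(example_env))
```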

BSM_HW_NG_CLI command example
To save BMC configuration files « bsmBMCcfg.sh »

To restore BMC configuration files « bsmBMCcfg.sh »

To check VMWARE memory hole setting

BSM_HW_NG_CLI command example

To list BIOS traces « bsmBioslog.sh »

To get last BIOS trace log

– The file can then be opened in C:\Program Files (x86)\bull\BSM HW NG CLI\engine\tmp

For all other usages, refer to:
86 A1 43FL 04 bullion S Remote Hardware Management CLI

Memory management
– RAS features BIOS setting
– MCElog example
– Memory architecture

RAS BIOS Setup configuration

Three modes are available in BIOS Setup:
Manual
– Manual settings
RAS (default usage for VMware, Windows, Red Hat)
– Leaky Bucket, Device Tagging, SDDC, DDDC, Lockstep, patrol scrubbing, demand scrubbing
Perf (default usage for HPC)
– Leaky Bucket, Device Tagging, SDDC, patrol scrubbing, demand scrubbing
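
Comparing the two feature lists above shows what Perf mode gives up relative to RAS mode. A small sketch, with the feature names copied from the lists:

```python
RAS_MODE = {
    "Leaky Bucket", "Device Tagging", "SDDC", "DDDC",
    "Lockstep", "patrol scrubbing", "demand scrubbing",
}
PERF_MODE = {
    "Leaky Bucket", "Device Tagging", "SDDC",
    "patrol scrubbing", "demand scrubbing",
}

only_in_ras = sorted(RAS_MODE - PERF_MODE)
only_in_perf = sorted(PERF_MODE - RAS_MODE)

if __name__ == "__main__":
    print("Only in RAS mode:", only_in_ras)
    print("Only in Perf mode:", only_in_perf)
```

The difference is DDDC and Lockstep; every feature listed for Perf mode is also listed for RAS mode.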

DIMM failure alert & localization with RAS feature activated
Same format as for a correctable memory error

Intel Machine Check Architecture

• Corrected
– Generates a CMCI (Corrected Machine Check Error Interrupt) to the OS, logged in /var/log/mcelog:

CPU 0 BANK 13 (Memory Controller)
MISC 4900000080008200
TIME 1422285945 Mon Jan 26 16:25:45 2015
MCG status:
MCi status:
Corrected error
MCi_MISC register valid
MCA: MEMORY CONTROLLER RD_CHANNEL0_ERR
Transaction: Memory read error
MemCtrl: Corrected memory read error
STATUS 8800004100800090 MCGSTATUS 0
MCGCAP 5000c20 APICID 0 SOCKETID 0
CPUID Vendor Intel Family 6 Model 62
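
The STATUS value in the mcelog record above can be decoded by hand. The sketch below follows the architectural IA32_MCi_STATUS layout from the Intel Software Developer's Manual (VAL bit 63, OVER 62, UC 61, EN 60, MISCV 59, ADDRV 58, corrected error count in bits 52:38, MCA error code in bits 15:0) and the memory controller error code pattern 0000 0000 1MMM CCCC. It is an illustrative decoder, not part of mcelog, and it covers only these fields.

```python
MEMORY_TRANSACTIONS = {
    0b000: "generic",
    0b001: "read",
    0b010: "write",
    0b011: "address/command",
    0b100: "scrub",
}


def decode_mci_status(status):
    """Decode selected architectural fields of an IA32_MCi_STATUS value."""
    mca_code = status & 0xFFFF
    fields = {
        "valid": bool((status >> 63) & 1),
        "overflow": bool((status >> 62) & 1),
        "uncorrected": bool((status >> 61) & 1),
        "enabled": bool((status >> 60) & 1),
        "misc_valid": bool((status >> 59) & 1),
        "addr_valid": bool((status >> 58) & 1),
        "corrected_count": (status >> 38) & 0x7FFF,
        "mca_error_code": mca_code,
    }
    # Memory controller errors use the pattern 0000 0000 1MMM CCCC.
    if mca_code & 0xFF80 == 0x0080:
        transaction = (mca_code >> 4) & 0x7
        channel = mca_code & 0xF
        fields["memory_transaction"] = MEMORY_TRANSACTIONS.get(transaction, "reserved")
        fields["channel"] = None if channel == 0xF else channel
    return fields


if __name__ == "__main__":
    for name, value in decode_mci_status(0x8800004100800090).items():
        print(f"{name}: {value}")
```

For the value in the record, this gives a valid, corrected (UC clear) memory read error on channel 0 with MISC valid, which agrees with the "Corrected error", "MCi_MISC register valid" and "RD_CHANNEL0_ERR" lines printed by mcelog.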

DDR Lockstep (default in RAS mode)

Replacing CIX board
– Before starting (1)
– Replace board (2)
– Update firmware (5)
– Restore config file (9)
– Fructose usage

Replacing CIX mother board on a multi module
1. If not done before, save the BMC/EMM configuration file using the
bsmBMCcfg script, and note the BMC IP address.

2. Replace the board following the procedure in 86A735FL02 - Field Service
Guide, starting page 1-60, chapter 1.5 "Servicing the Server Drawer", but
do not insert the module back into the XQPI interconnect box yet.

3. Insert the module back into the rack and push it in, but leave it around
2 cm (1 inch) out, short of the XQPI connection.

4. Connect the AC power cables then, using PSETUP, set the module BMC IP address.

5. Check and, if needed, update the module firmware (EMM, BIOS,
FPGA, PCPLD, MPLD, etc.).

6. Using FRUCTOSE, set the System Serial Number and the Chassis Serial
Number. When done, power up the repaired module to check that it
operates correctly, then power the module down with the FORCE POWER OFF button.

7. When done, remove the AC power cords.

8. Push the module to the fully inserted position and lock it using the front
captive screws.

9. Connect the AC power cables and, when the BMCs are up, restore the
BMC/EMM configuration file using the bsmBMCcfg script.

10. If necessary, reset the BMCs on all modules of the partition.

11. Power up the server using the master System Hardware Console (SHC).

Fructose usage

Thanks
For more information please contact:
T+ 33 1 30803058
M+ 33 6 73182550
patrick-a.lemesle@atos.net

Atos, the Atos logo, Atos Consulting, Atos Worldgrid, Worldline,
BlueKiwi, Canopy the Open Cloud Company, Yunano, Zero Email, Zero
Email Certified and The Zero Email Company are registered
trademarks of Atos. July 2014. © 2014 Atos. Confidential information
owned by Atos, to be used by the recipient only. This document, or
any part of it, may not be reproduced, copied, circulated and/or
distributed nor quoted without prior written approval from Atos.
