ipmitool


ipmitool - utility for controlling IPMI-enabled devices and querying power supply, voltage, and power sensors


To display sensor status

ipmitool sensor

ipmitool sensor | grep -i vps
/SYS/VPS         | 210.000    | Watts      | ok    | na        | na        | na        | 680.000   | na        | na       

T_AMB            | 22.000     | degrees C  | ok    | 5.000     | na        | na        | na        | na        | 37.000   

To monitor the inlet ambient temperature (T_AMB) across all cells

ipmitool sensor get T_AMB | grep -i Reading
Sensor Reading        : 22 (+/- 0) degrees C

dcli -l root -g all_group ipmitool sensor get T_AMB | grep -i Reading
Sensor Reading        : 29 (+/- 0) degrees C
Sensor Reading        : 29 (+/- 0) degrees C
Sensor Reading        : 29 (+/- 0) degrees C
Sensor Reading        : 29 (+/- 0) degrees C
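A reading sweep like the one above can be turned into a simple threshold check. A minimal sketch, assuming a 35 degrees C warning threshold (an illustrative value, not an Oracle-documented limit) and using sample dcli output in place of the live command:

```shell
# Sketch: flag any cell whose inlet ambient temperature exceeds a threshold.
# The threshold and the embedded sample output are assumptions for illustration;
# on a live system, pipe the dcli command's output in instead.
THRESHOLD=35
dcli_output='Sensor Reading        : 29 (+/- 0) degrees C
Sensor Reading        : 36 (+/- 0) degrees C'
warnings=$(printf '%s\n' "$dcli_output" | awk -v t="$THRESHOLD" -F': ' '
  /Sensor Reading/ { split($2, a, " "); if (a[1] + 0 > t)
    print "WARNING: T_AMB " a[1] " C exceeds threshold " t " C" }')
echo "$warnings"
```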

To query the BMC for sensor data records (SDR) and check whether the power supplies are OK:
ipmitool sdr | grep PS0
  
  30 |PS0/PRSNT        | 0x02              | ok
    37 |PS0/FAN_FAULT    | 0x01              | ok
    38 |PS0/TEMP_FAULT   | 0x01              | ok
  
    3c |PS0/I_IN         | 0.50 Amps         | ok
    3d |PS0/V_OUT        | 12 Volts          | ok
    3e |PS0/I_OUT        | 7.50 Amps         | ok
    3f |PS0/IN_POWER     | 100 Watts         | ok
    40 |PS0/OUT_POWER    | 90 Watts          | ok
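The same SDR query can be checked programmatically by counting rows whose status column is not "ok". A minimal sketch using sample rows from above (on a live server, pipe `ipmitool sdr | grep PS0` in instead):

```shell
# Sketch: count PS0 sensor rows whose status field is not "ok".
# The here-string stands in for live `ipmitool sdr | grep PS0` output.
sdr_out='30 |PS0/PRSNT        | 0x02              | ok
37 |PS0/FAN_FAULT    | 0x01              | ok
38 |PS0/TEMP_FAULT   | 0x01              | ok'
bad=$(printf '%s\n' "$sdr_out" | awk -F'|' \
  '{ gsub(/ /, "", $4); if ($4 != "ok") n++ } END { print n + 0 }')
if [ "$bad" -eq 0 ]; then echo "PS0 healthy"; else echo "$bad PS0 sensors not ok"; fi
```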

Exadata Software Locations and Executables


Compute Server Software Locations and Executables

/u01 - Grid Infrastructure (GI) and RDBMS binaries
/opt/oracle.cellos - cell OS configuration and executables
/opt/oracle.SupportTools - support and configuration files
/opt/oracle.SupportTools/onecommand - OneCommand files used for install/config
/etc/oracle/cell/network-config/cellip.ora - InfiniBand IP addresses of each cell
/etc/oracle/cell/network-config/cellinit.ora - IP address of the cell
/opt/oracle.oswatcher - OS Watcher
/etc/rc.d/rc.Oracle.Exadata - init script for Exadata


Cell Server Software Locations and Executables
/opt/oracle
/opt/oracle.cellos
/opt/oracle.cellos/cell.conf - cell configuration
/opt/oracle.cellos/Check - validate hardware profile
/opt/oracle.cellos/ExadataDiagCollector.sh - collect cell diagnostic data
/opt/oracle.cellos/functions_cellos

/opt/oracle.cellos/imageinfo
dcli -l root -g cell_group imageinfo

/opt/oracle.cellos/imagehistory
                dcli -l root -g ~/cell_group imagehistory | grep Active

/opt/oracle.cellos/i - display/configure the cell network environment
/opt/orac - kernel ISO image
/opt/oracle.cellos/MegaCli64
/opt/oracle.cellos/patch
/opt/oracle.cellos/restore_cellboot.sh
/opt/oracle.cellos/validations_cell
/opt/oracle.cellos/vldconfig
/opt/oracle.cellos/vldrun - configure cell validation
/opt/oracle/cell[VERSION] - current cell software
/opt/oracle/cell[VERSION]/cellsrv/deploy/config - deployment files for the active image
/opt/oracle/cell[VERSION]/cellsrv/deploy/config/cellinit.ora - cell init parameters

/var/log/oracle
/var/log/oracle/cellos - trace files for cell services, cell server startup/shutdown, and others
/var/log/oracle/diag/asm - storage cell related events

Backup Exadata Storage and Compute Servers

Backup Storage Servers:
Oracle does not support the installation of any third-party software on the storage servers. You are not required to back up your storage servers; Oracle automatically creates backups of the OS and cell services software. Oracle automatically maintains copies of the latest cell boot images and cell server software in /opt/oracle.cellos/iso/lastGoodConfig. In addition, there are active and inactive system volume partitions, which allow Oracle to perform both in-partition and out-of-partition patches. During patching, Oracle applies out-of-partition patches to the inactive partitions first and marks those partitions as active once the patch succeeds.

cd /opt/oracle.cellos/iso/lastGoodConfig

To display its contents, you need to mount the CELLBOOT USB flash drive partition. Log in as root, run fdisk -l to find your internal USB drive partition, then mount it (mount /dev/sdm1 /mnt/usb). You can also create a cell boot image on an external USB drive for backup redundancy purposes.
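The lookup described above can be sketched as follows; the sample fdisk -l lines and /dev/sdm are assumptions for illustration, so confirm the actual device on your own cell:

```shell
# Sketch: locate the CELLBOOT USB partition in fdisk -l output.
# The here-string and /dev/sdm are illustrative assumptions.
fdisk_out='Disk /dev/sdm: 4009 MB, 4009754624 bytes
/dev/sdm1   *           1         977     3984384   83  Linux'
usb_part=$(printf '%s\n' "$fdisk_out" | awk '$1 ~ /^\/dev\/sdm[0-9]/ { print $1; exit }')
echo "found partition: $usb_part"
# Then, as root on the cell:
#   mkdir -p /mnt/usb && mount "$usb_part" /mnt/usb
```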

Backup the Compute Node:
You can install third-party backup software and agents on the Exadata compute nodes. Your backups will need to run over 1GbE, 10GbE, or high-speed InfiniBand. You can also use NET3 as the backup network.

To see the compute node's free storage capacity:

vgdisplay | egrep '(VGExaDb|Alloc PE|Free  PE)'
  VG Name               VGExaDb
  Alloc PE / Size       59904 / 234.00 GB
  Free  PE / Size       153827 / 600.89 GB
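The free capacity can be derived from the PE counts directly: free space = Free PE x PE size. A minimal sketch of the arithmetic, assuming the 4096 KB PE size that pvdisplay reports on these systems:

```shell
# Sketch of the free-space arithmetic (4096 KB PE size assumed, as reported
# by pvdisplay on this system).
free_pe=153827       # Free PE from vgdisplay
pe_size_kb=4096      # PE Size in KB
free_gb=$(awk -v pe="$free_pe" -v sz="$pe_size_kb" \
  'BEGIN { printf "%.2f", pe * sz / 1024 / 1024 }')
echo "$free_gb GB free"   # matches the 600.89 GB reported by vgdisplay
```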

InfiniBand verification

Verify IB cards are present
# lspci | grep Infi
Should be 1 card per system.

Verify IB ports are UP
# ibstatus | grep phys
        phys state:      5: LinkUp
        phys state:      5: LinkUp

Should show 2 ports up per system.

Verify IB ports are at QDR
# ibstatus | grep rate
        rate:            40 Gb/sec (4X QDR)
        rate:            40 Gb/sec (4X QDR)
Should show 2 ports at 40 Gb/sec (4x QDR) per system.
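The port-state and rate checks above can be combined into one quick validation. A minimal sketch using sample ibstatus output (replace the embedded text with the live command's output):

```shell
# Sketch: verify both IB ports are up at QDR. The here-string stands in
# for `ibstatus` output on a live system.
ib_out='        phys state:      5: LinkUp
        rate:            40 Gb/sec (4X QDR)
        phys state:      5: LinkUp
        rate:            40 Gb/sec (4X QDR)'
up=$(printf '%s\n' "$ib_out" | grep -c 'LinkUp')
qdr=$(printf '%s\n' "$ib_out" | grep -c '4X QDR')
if [ "$up" -eq 2 ] && [ "$qdr" -eq 2 ]; then
  echo "IB OK: $up ports up at QDR"
else
  echo "IB DEGRADED: $up ports up, $qdr at QDR"
fi
```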

ibcheckerrors (to check for network errors)


ibdiagnet -r (to check cables)

List Flash Storage on Exadata Servers

You can query SCSI flash device information using lsscsi, or list your PCI flash modules using flash_dom, to understand how flash storage is configured and presented on an Exadata Storage Server.

lsscsi -v | grep MARVEL

Note:  Flash devices show up under four SCSI host numbers (8, 9, 10, 11) because each of the four flash cards has four FMods. Therefore there are 16 flash devices in total, and you can use flash_dom to display the details of the PCI flash devices.

flash_dom -l

HBA# Port Name         Chip Vendor/Type/Rev    MPT Rev  Firmware Rev  IOC     WWID                 Serial Number

1.  /proc/mpt/ioc0    LSI Logic SAS1068E C0     105      011b5c00     0       50310020001116a80     465769T+1209A4ADFA

        Current active firmware version is 011b5c00 (1.27.92)
        Firmware image's version is MPTFW-01.27.92.00-IT
        x86 BIOS image's version is MPTBIOS-6.26.00.00 (2008.10.14)
        FCode image's version is MPT SAS FCode Version 1.00.49 (2007.09.21)


          D#  B___T  Type       Vendor   Product          Rev    Operating System Device Name
          1.  0   0  Disk       ATA      MARVELLXXXXXXXD21Y   /dev/sdn    [8:0:0:0]
          2.  0   1  Disk       ATA      MARVELLXXXXXXXD21Y   /dev/sdo    [8:0:1:0]
          3.  0   2  Disk       ATA      MARVELLXXXXXXXD21Y   /dev/sdp    [8:0:2:0]
          4.  0   3  Disk       ATA      MARVELLXXXXXXXD21Y   /dev/sdq    [8:0:3:0]


Validate FMods

You can see the flash devices grouped into sets of four on PCI slots 1, 2, 4, and 5, with each device per PCI slot residing in FDOM 0, 1, 2, or 3.  Each FMod is a 24 GB slice of storage, and 16 x 24 GB = 384 GB.  In X2 models, there are 4 x 96 GB PCI flash cards per storage cell.  Each PCI flash card has a device partitioned per FDOM, so there are 16 flash devices.  These flash devices are the flash disks used for Smart Flash Cache and Smart Flash Logging.
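The FMod arithmetic works out as follows (a sketch of the capacity math for an X2 storage cell, using the values from the text):

```shell
# Capacity math for an X2 storage cell.
cards=4              # PCI flash cards per cell (slots 1, 2, 4, 5)
fmods_per_card=4     # FDOMs 0-3 on each card
fmod_gb=24           # each FMod is a 24 GB slice
devices=$((cards * fmods_per_card))   # 16 flash devices per cell
total_gb=$((devices * fmod_gb))       # 384 GB of raw flash per cell
echo "$devices flash devices, $total_gb GB raw flash"
```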

You should see 16 FMODs.  To validate all FMODs are present at the OS level, you can perform the following steps:

cellcli > list physicaldisk where disktype='FlashDisk' attributes name, disktype, physicalSize, slotNumber

flash_dom -l

cellcli > list flashcache detail
         id:                     cf20a022-a381-428e-bf6f-ee54dbcdcb56
         size:                   364.75G
         status:                 normal

cellcli > list flashlog detail

id:                     e7cbe8c7-d2f3-4d01-9f87-da83861c21d1
         size:                   512M
         status:                 normal

To verify the hardware profile:
/opt/oracle.SupportTools/CheckHWnFWProfile -d


Flash Storage

You can query SCSI flash devices using lsscsi and list your PCI flash modules using flash_dom.

lsscsi -v | grep MARVEL                 (to view your flash devices)

 [8:0:0:0]    disk    ATA      MARVELL SD88SA02 D20Y  /dev/sdn
[11:0:3:0]   disk    ATA      MARVELL SD88SA02 D20Y  /dev/sdac

flash_dom -l      (flash devices are split into groups of four: 8, 9, 10, 11).  Each flash card has 4 FMods, so each storage cell has 16 flash devices (4 x 4).

1.  /proc/mpt/ioc0    LSI Logic SAS1068E C0     105      011b5c00     0       5080020001116a80     465769T+1209A40TXF

        Current active firmware version is 011b5c00 (1.27.92)
        Firmware image's version is MPTFW-01.27.92.00-IT
        x86 BIOS image's version is MPTBIOS-6.26.00.00 (2008.10.14)
        FCode image's version is MPT SAS FCode Version 1.00.49 (2007.09.21)


          D#  B___T  Type       Vendor   Product          Rev    Operating System Device Name
          1.  0   0  Disk       ATA      MARVELL SD88SA02 D20Y   /dev/sdn    [8:0:0:0]
          2.  0   1  Disk       ATA      MARVELL SD88SA02 D20Y   /dev/sdo    [8:0:1:0]
          3.  0   2  Disk       ATA      MARVELL SD88SA02 D20Y   /dev/sdp    [8:0:2:0]
          4.  0   3  Disk       ATA      MARVELL SD88SA02 D20Y   /dev/sdq    [8:0:3:0]


Display physical disk and MegaRAID device information, and the mdadm software RAID configuration

lsscsi      can show both physical SAS drives and flash devices.  There are 12 SAS devices and 16 flash devices per storage server.
lsscsi -v | grep MARVEL

[8:0:0:0]    disk    ATA      MARVELL SD88SA02 D20Y  /dev/sdn
[8:0:1:0]    disk    ATA      MARVELL SD88SA02 D20Y  /dev/sdo
….
[11:0:3:0]   disk    ATA      MARVELL SD88SA02 D20Y  /dev/sdac

lsscsi -v | grep LSI             (shows the 12 physical disk devices)

/opt/MegaRAID/MegaCli/MegaCli64 -ShowSummary -aALL  (you should see 12 disks)


                Connector          : Port 0 - 3<Internal><Encl Pos 0 >: Slot 0
                Vendor Id          : HITACHI
                Product Id         : HQWERASDFA.0T
                State              : Online
                Disk Type          : SAS,Hard Disk Device
                Capacity           : 2.727 TB
                Power State        : Active

Each Exadata Storage Server has 12 physical SAS disks and 4 x 96 GB PCIe Sun Flash Accelerator flash cards.  Each card is partitioned into 4 x 24 GB partitions.

On the compute node, there are 4 x 300 GB SAS disks: 1 hot spare (slot 3) and 3 active (slots 0, 1, 2).
/opt/MegaRAID/MegaCli/MegaCli64 -ShowSummary -aALL


[root@r22dwdbs06h-adm MegaCli]# pvdisplay
  --- Physical volume ---
  PV Name               /dev/sda2
  VG Name               VGExaDb
  PV Size               834.89 GB / not usable 3.84 MB
  Allocatable           yes
  PE Size (KByte)       4096
  Total PE              213731
  Free PE               153827
  Allocated PE          59904
  PV UUID               TMKLyW-2e2u-2BFq-xf4l-jdne-Vnb7-LOmcHd
  
[root@r22dwdbs06h-adm MegaCli]# lvdisplay
  --- Logical volume ---
  LV Name                /dev/VGExaDb/LVDbSys1
  VG Name                VGExaDb
  --- Logical volume ---
  LV Name                /dev/VGExaDb/LVDbSys2
  VG Name                VGExaDb
 
  --- Logical volume ---
  LV Name                /dev/VGExaDb/LVDbSwap1
  VG Name                VGExaDb

  --- Logical volume ---
  LV Name                /dev/VGExaDb/LVDbOra1
  VG Name                VGExaDb
 
  --- Logical volume ---
  LV Name                /dev/VGExaDb/sysback
  VG Name                VGExaDb
  LV UUID                kAKBpc-2I2n-xeJh-8sZy-OPxD-hklZ-ykaBsv

lvdisplay -v /dev/VGExaDb/LVDbSys1 (shows attributes of that logical volume)


ls -ltr /dev/VGExaDb/LVDb* (show the map to /dev/mapper devices)

Display Storage and Compute Server architecture

You're assigned to manage the Exadata machine. You want to display the overall architecture, storage configuration, network, and OS.
uname -a            (to display the kernel version of Oracle Enterprise Linux)
dmidecode -s system-product-name    (to display the server model; use -s system-serial-number for the serial number)
[root@glscell01 ~]# dmidecode -s system-product-name
SUN FIRE X4170 M2 SERVER
[root@glscell01 ~]# dmidecode -s system-serial-number
1308FMM0MG 
imageinfo            (to see the cell version, kernel version, and active system partition)
imagehistory      (images installed/patched over time)
[root@r22dwdbs06h-adm ~]# imagehistory
Version                              : 11.2.3.2.0.120713
Image activation date                : 2012-1-17 15:23:49 -0500
Imaging mode                         : fresh
Imaging status                       : success

Version                              : 11.2.3.2.1.130302
Image activation date                : 2013-12-04 21:02:09 -0600
Imaging mode                         : patch
Imaging status                       : success
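The imagehistory output can be reduced to the most recent image with a short filter. A minimal sketch using the sample entries from this section (on a live node, pipe imagehistory directly):

```shell
# Sketch: extract the most recent image version from imagehistory output.
# The here-string carries the sample entries shown above.
hist='Version                              : 11.2.3.2.0.120713
Imaging status                       : success
Version                              : 11.2.3.2.1.130302
Imaging status                       : success'
latest=$(printf '%s\n' "$hist" | awk -F': ' '/^Version/ { v = $2 } END { print v }')
echo "Most recent image: $latest"
```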

Each storage cell contains either:
12 physical 600 GB 15,000 RPM High Performance SAS disks, delivering up to 1.8 GB per second of raw data bandwidth per cell, or
12 physical 3 TB 7,200 RPM High Capacity SAS disks, delivering up to 1.3 GB per second of raw data bandwidth per cell

Compute server networks:
NET0      allows SSH; it uses the eth0 interface
NET1, NET2, NET1-2        RAC VIP and SCAN
IB            connects two ports on each compute server to both InfiniBand leaf switches in the rack.  All storage servers and the RAC interconnect use this network

NET3      built on eth3; used for backup/external traffic