

smartctl -x on an HP Smart Array E200i RAID controller

Once, somewhere on the vast expanses of the internet, I came across advice to run smartctl with the -x option.

Naturally, like any homo sapiens, I read the man page first:

-x, --xall
     Prints all SMART and non-SMART information about the device. For ATA
     devices this is equivalent to ´-H -i -g all -c -A -f brief
     -l xerror,error -l xselftest,selftest -l selective -l directory
     -l scttemp -l scterc -l devstat -l sataphy´.
     and for SCSI, this is equivalent to
     ´-H -i -A -l error -l selftest -l background -l sasphy´.

Seeing nothing alarming there, I ran this command:

# smartctl -x -a -d cciss,0 /dev/cciss/c0d0
smartctl 5.43 2012-06-30 r3573 [x86_64-linux-2.6.32-642.13.1.el6.x86_64] (local build)
Copyright (C) 2002-12 by Bruce Allen, http://smartmontools.sourceforge.net
 
/dev/cciss/c0d0 [cciss_disk_00] [SCSI]: Device open changed type from 'sat,auto' to 'cciss'
Vendor:               SEAGATE 
Product:              ST91000640SS    
Revision:             0001
User Capacity:        1,000,204,886,016 bytes [1.00 TB]
Logical block size:   512 bytes
Logical Unit id:      0x5000c50025fd7283
Serial number:        9XG02CLM00009126234W
Device type:          disk
Transport protocol:   SAS
Local Time is:        Tue Jan 31 15:29:39 2017 UTC
Device supports SMART and is Enabled
Temperature Warning Enabled
SMART Health Status: OK
 
Current Drive Temperature:     22 C
Drive Trip Temperature:        68 C
Manufactured in week  of year 20
Specified cycle count over device lifetime:  10000
Accumulated start-stop cycles:  36
Specified load-unload count over device lifetime:  300000
Accumulated load-unload cycles:  36
Elements in grown defect list: 3
Vendor (Seagate) cache information
  Blocks sent to initiator = 791069177
  Blocks received from initiator = 8147385
  Blocks read from cache and sent to initiator = 6510918
  Number of read and write commands whose size <= segment size = 1294551
  Number of read and write commands whose size > segment size = 0
Vendor (Seagate/Hitachi) factory information
  number of hours powered up = 37972.70
  number of minutes until next internal SMART test = 12
 
Error counter log:
           Errors Corrected by           Total   Correction     Gigabytes    Total
               ECC          rereads/    errors   algorithm      processed    uncorrected
           fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors
read:    8169902        0         0   8169902          0       2604.051           0
write:         0        0         0         0          0          4.359           0
 
Non-medium error count:        1
 
[GLTSD (Global Logging Target Save Disable) set. Enable Save with '-S on']
No self-tests have been logged
Long (extended) Self Test duration: 12198 seconds [203.3 minutes]
Segmentation fault (core dumped)

And... the console froze, the connection to the server dropped, and it stopped answering pings. Thank Hank, the server was not part of the production cluster, and a couple of minutes later it came back up on its own.

Worth noting: the command smartctl -a -d cciss,0 /dev/cciss/c0d0 (the same thing, but without -x) had been run several times on that same machine a couple of minutes earlier without any problems. OS: CentOS 6.8 x86_64; RAID controller: HP Smart Array E200i.
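
A side note on what -x actually adds here. The small Python sketch below compares the SCSI option list quoted from the man page above with the list documented for -a. The -a list (-H -i -A -l error -l selftest) is an assumption: it comes from the same man page as I remember it and is not quoted in this post. The helper names parse_options and extra_options are made up for illustration.

```python
# Option sets for SCSI devices.
# X_SCSI is quoted from the man page excerpt in this post.
# A_SCSI is an ASSUMPTION: the list given for -a in the same man page.
X_SCSI = "-H -i -A -l error -l selftest -l background -l sasphy"
A_SCSI = "-H -i -A -l error -l selftest"


def parse_options(spec):
    """Split an option string into (flag, argument) pairs.

    Only -l takes an argument in these two lists.
    """
    tokens = spec.split()
    pairs = []
    i = 0
    while i < len(tokens):
        if tokens[i] == "-l":
            pairs.append(("-l", tokens[i + 1]))
            i += 2
        else:
            pairs.append((tokens[i], None))
            i += 1
    return pairs


def extra_options(full, base):
    """Return options present in `full` but not in `base`, keeping order."""
    base_set = set(parse_options(base))
    return [pair for pair in parse_options(full) if pair not in base_set]


if __name__ == "__main__":
    for flag, arg in extra_options(X_SCSI, A_SCSI):
        print(flag, arg)
```

If that assumption holds, the only additions are -l background and -l sasphy, and the output above stops right after the self-test section. That is suggestive, not proof of which request the controller choked on.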

Moral: be careful with smartctl. You have been warned.

Posted in *nix.



One Response


  1. депозитчик says

    By the way, this looks like a firmware bug.
    The firmware update notes here
    http://h20564.www2.hpe.com/hpsc/swd/public/detail?sp4ts.oid=3924068&swItemId=MTX_5e52f965d84f41c2bb65d33b58&swEnvOid=4103#tab3

    say the bug has been fixed:

    Problems Fixed:
    Running SMARTCTL (smartmontools) on HP Proliant G6/G7 (Px1x) Smart Array controllers that have firmware version 5.70 to 6.62 installed with SATA drives attached may result in system not responding or reboot. When reboot occurred, a reboot 1719 POST error message with lockup 0x15 displayed.
