Detect faulty drive in RAID 10 array

I’ve been told that I can only verify that my hardware RAID array is working properly through KVM. However, I want my server to notify me automatically when there is a problem.

Is there a command I can run over SSH (it will be called via system() in PHP) that can detect that a drive is having problems? I don’t need to identify which drive.

I have thought of one approach, but I don’t know if it will work in practice. If I were to run a PHP script that calls fopen('/dev/[filesystem]', 'r') and then seeks to a position every x GB and reads 1 byte, it should return an error when it hits a part of the filesystem that’s having problems. Is this idea sound?

I use the XFS filesystem. I have heard of xfs_check, but it says it needs to be run in read-only mode, which is inconvenient.

I use a 3ware RAID controller.

Answer

Install the 3Ware tools (tw_cli) on your machine.

After you have installed them, get the id # of the controller (I’ve never understood the system behind it, for all I know it might be random):

$ tw_cli show

Ctl   Model        (V)Ports  Drives   Units   NotOpt  RRate   VRate  BBU
------------------------------------------------------------------------
c0    9550SXU-4LP  4         2        1       0       1       1      -

You can then query the array status with

$ tw_cli /c0 show

Unit  UnitType  Status         %RCmpl  %V/I/M  Stripe  Size(GB)  Cache  AVrfy
------------------------------------------------------------------------------
u0    RAID-1    OK             -       -       -       74.4951   ON     OFF

Port   Status           Unit   Size        Blocks        Serial
---------------------------------------------------------------
p0     NOT-PRESENT      -      -           -             -
p1     NOT-PRESENT      -      -           -             -
p2     OK               u0     74.53 GB    156301488     9QZ07NP2
p3     OK               u0     74.53 GB    156301488     9QZ08DS2

Obviously, this will look different on your machine. These examples were lifted from here.

To actively verify (scrub) your drives, use

$ tw_cli /c0/u0 start verify

For automatic notifications, you should set up a monitoring system, e.g. Nagios or Icinga, and use a plugin that checks the health of the array with the help of tw_cli. These plugins work nicely without Nagios/Icinga as well and can easily be used in a minimal monitoring system in the form of a cron job that sends a mail if the plugin doesn’t return 0.
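
If you would rather not install a full plugin, the cron-job idea can be sketched in a few lines of shell. This is only a rough illustration, not a replacement for a maintained plugin. It assumes the controller is c0, that the output of tw_cli /c0 show has the column layout shown above, that empty ports read NOT-PRESENT as in that sample, and that a working mail command can deliver to root. The function name raid_problems is made up for this example.

```shell
#!/bin/sh
# Sketch: flag unhealthy units/ports in the output of "tw_cli /c0 show".
# raid_problems (hypothetical helper) reads that output on stdin, prints
# every unit or port row that does not look healthy, and returns 1 if it
# printed anything, 0 otherwise.
raid_problems() {
    awk '
        BEGIN { bad = 0 }
        # Unit rows (u0, u1, ...) carry their status in column 3.
        $1 ~ /^u[0-9]+$/ && $3 != "OK" { print; bad = 1 }
        # Port rows (p0, p1, ...) carry their status in column 2.
        # Assumption: NOT-PRESENT just means an empty slot, as in the sample.
        $1 ~ /^p[0-9]+$/ && $2 != "OK" && $2 != "NOT-PRESENT" { print; bad = 1 }
        END { exit bad }
    '
}

# Live check: only attempted when tw_cli is installed.
# Assumptions: controller id c0, and a configured "mail" command.
if command -v tw_cli >/dev/null 2>&1; then
    if ! report=$(tw_cli /c0 show | raid_problems); then
        printf '%s\n' "$report" | mail -s "RAID problem on $(hostname)" root
    fi
else
    echo "tw_cli not found; nothing was checked" >&2
fi
```

Run from cron (as root, since tw_cli usually needs it), this stays silent while everything reads OK and mails the offending rows otherwise. Two caveats: a unit that is in the middle of a verify or rebuild will presumably show a status other than OK and be reported too, so add any statuses you consider healthy to the awk conditions. And if tw_cli itself fails and prints nothing, the sketch sees no bad rows and sends no mail, which is one of the reasons a real plugin is the safer choice.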

Attribution
Source : Link , Question Author : user3786834 , Answer Author : Sven
