Replacing 3 drives out of 8 in a server [closed]

We have a Dell R510 server running SQL Server 2008 R2 with 8 x 300GB drives in RAID 5.

We just noticed that we had three bad drives with blinking lights, so we powered down the server and replaced them with new ones.

When the server came back up the lights were green (but not flashing).

The server only shows XXXX GB of space, so it is not reading the drives. Did we miss a step to bring the new drives online?

Does the RAID array need time to rebuild, or should we have swapped them one at a time?

We have a copy of the data, so restoring it is not a major issue.

Answer

Why would you ask the internet about this?

There’s so much WTF here that I don’t know where to start!

This question shows a fundamental lack of understanding of hardware, RAID arrays, storage, monitoring, and general IT best practices.

I read this question and can’t help but think:

  • Who is actually responsible for this server hardware? Where is the sysadmin/consultant/IT professional?

  • Why would you turn off a server to replace hot-swappable disks in a hardware RAID array? It’s not necessary to do so and it substantially increases your risk if you already suspect bad disks.

  • Did you understand what the “blinking lights” meant? What color were the lights? Perhaps they were indicating disk pre-failure instead of a complete failure.

  • You replaced the drives without knowing the impact of doing so. If anything, these actions made the situation worse and you may have destroyed your data.

  • Why would you expect the size of the disk array to change following a drive replacement? What the hell does “XXXX GB” mean, and why is it pertinent to your question? How about relaying details like the capacity and type of disks, as well as the size of the array presented to the OS?

  • You just noticed a disk failure? You have spare disks available but no form of monitoring to actually identify failures? Your server monitoring should have TOLD you this. Even a basic visual check of the servers would help recognize problems. I doubt that the disks failed at the same time.

  • Did anyone check the system logs? What does the hardware RAID controller say when you boot the system? What do the Dell DRAC logs say? What does the operating system say?

  • Finally, if you have questions about the operations of your manufacturer-supported, brand-name hardware, and don’t understand what’s happening, wouldn’t it have made more sense to assess your situation (check logs, data and backups) and contact Dell?
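
To make the capacity and redundancy points concrete, here is a minimal sketch of the RAID 5 arithmetic. It assumes the eight 300GB disks from the question form a single RAID 5 group, which the question does not actually confirm, and the function names are purely illustrative. It uses nominal sizes, so the figure an operating system reports will differ somewhat.

```python
def raid5_usable_gb(disk_count: int, disk_size_gb: int) -> int:
    """Nominal usable capacity of one RAID 5 group.

    One disk's worth of space is consumed by parity, so the usable
    size is (N - 1) * disk size. Assumes all disks are the same size.
    """
    if disk_count < 3:
        raise ValueError("RAID 5 needs at least three disks")
    return (disk_count - 1) * disk_size_gb


def raid5_survives(failed_disks: int) -> bool:
    """RAID 5 stays online with at most one failed disk at a time."""
    return failed_disks <= 1


# Hypothetical layout taken from the question: 8 disks of 300 GB each.
print(raid5_usable_gb(8, 300))  # 2100
print(raid5_survives(1))        # True
print(raid5_survives(3))        # False
```

Swapping a failed disk for a new one of the same size leaves the usable figure unchanged, which is why the array size is not expected to grow after a replacement. If three members really were lost at once, a single RAID 5 group could not have stayed online, so the state reported by the controller matters more than the lights.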

I understand the consumerization of technology means that people are often tasked with responsibilities and placed in situations that they’re not qualified for, but the lack of basic troubleshooting skills exhibited here is appalling. It’s unfortunate that people are paid to provide this level of service.

Attribution
Source : Link , Question Author : Pico , Answer Author : ewwhite
