Losing 2 disks on a RAID-5 array

Today I was quite disappointed to see that my RAID-5 array had suddenly lost 2 of its 4 drives. As you may know, losing 1 drive on RAID-5 is OK, but losing 2 is not OK at all: it usually means that you have lost all your data.

In fact, my failure today was due to an electrical problem. If you follow this blog, you know that my RAID drives are plugged into the server (Debian GNU/Linux) over USB, which is an extremely bad idea (don't do this at home 😉). To add to my stupidity, in order to reduce power consumption I had replaced my hard drives with laptop hard drives powered through the USB hub… which was not plugged into the UPS. So today there was a power failure at home, and since the server's USB ports alone could not provide enough power, two drives went off.

Since nothing was being written when it happened, I knew that the content of every drive was still good, but mdadm reported the array as degraded and reading from it was not really possible anymore.

So, what to do in that case? From what I have seen, the first thing is to stop the array, then try to reassemble it with various options (but do not try to re-add the "failing" drives). Obviously I made a mistake somewhere… So, if at some point mdadm --assemble does not work whatever options you give it, re-creating the array might be your last resort. At least it worked for me.
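The stop-then-reassemble sequence can be sketched as below. This is only a dry run that prints the commands instead of executing them. The device names are the ones used later in this post, and --force is just one of the "various options" documented in the mdadm man page; whether it would have been enough in this case is not something the post can confirm.

```shell
#!/bin/sh
# Dry-run sketch: print (do not execute) a stop-then-reassemble sequence.
# ARRAY and MEMBERS are taken from this post; adjust them to your own setup.
ARRAY=/dev/md0
MEMBERS="/dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sde1"

print_reassemble_steps() {
    # 1. Stop the broken array so that its member devices are released.
    echo "mdadm --stop $ARRAY"
    # 2. Try a plain assemble first.
    echo "mdadm --assemble $ARRAY $MEMBERS"
    # 3. If that fails, --force asks mdadm to assemble even when the
    #    metadata on some members appears to be out of date.
    echo "mdadm --assemble --force $ARRAY $MEMBERS"
}

print_reassemble_steps
```

Run the printed commands by hand, as root, one at a time, and check /proc/mdstat after each step before going any further.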

But be careful: when re-creating the array, you have to provide the same options (chunk size…) as before, and you have to keep the drives in the SAME ORDER. And when your drives are on USB, the order is a bit random (maybe I should have looked at each disk's UUID and written the order down somewhere).
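One way to write that information down ahead of time is to save the output of mdadm --examine for every member while the array is still healthy. The helper below is a minimal sketch: the function name save_raid_layout and the output path in the example are hypothetical, and it assumes that the superblock dump contains what you need (on the mdadm versions I am aware of it shows the array UUID, the chunk size and the role of each device, but check your own output).

```shell
#!/bin/sh
# save_raid_layout OUTFILE DEVICE...
# Append the md superblock dump of each member device to OUTFILE,
# in the order the devices are given on the command line.
save_raid_layout() {
    out=$1
    shift
    for dev in "$@"; do
        echo "=== $dev ===" >> "$out"
        # --examine prints the md superblock stored on the device.
        mdadm --examine "$dev" >> "$out" 2>&1
    done
}

# Example (as root, while the array is still healthy):
# save_raid_layout /root/md0-layout.txt /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sde1
```

Keep the resulting file somewhere other than on the array itself.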

So I re-created the array with the following command (DON'T FORGET --assume-clean, otherwise mdadm will start re-synchronizing your disks, which is something you may not want):

# mdadm --create --verbose /dev/md0 --level=5 --raid-devices=4 --assume-clean /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sde1

After doing so… it was still not working. Why? Because I didn't give the right disk order! With 4 drives there are 24 possible orders (4! = 4 × 3 × 2 × 1). How do you find out whether you have the right one? Well, that's quite easy: you should be able to mount the disk 😉. Running fsck might also be a good idea (don't forget the -n option, as you don't want to write to the drive until you are sure that the order is correct).
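To avoid losing track of which orders have already been tried, the 24 candidates can be listed up front. The script below is a sketch that only prints one mdadm --create command per order and runs nothing itself. The helper names permute and candidate_commands are made up for this example, and the command line simply reuses the options shown above; add your own chunk size and other options if they differ from the defaults.

```shell
#!/bin/sh
# permute PREFIX ITEM...
# Print every ordering of ITEM..., one per line. PREFIX holds the items
# already picked and must be "" on the first call. The body runs in a
# subshell, so each recursion level has its own variables.
# Items must not contain spaces (device paths do not).
permute() (
    prefix=$1
    shift
    if [ "$#" -eq 0 ]; then
        echo "$prefix"
    else
        n=$#
        i=1
        while [ "$i" -le "$n" ]; do
            pick=
            rest=
            j=1
            for item in "$@"; do
                if [ "$j" -eq "$i" ]; then
                    pick=$item
                else
                    rest="$rest $item"
                fi
                j=$((j + 1))
            done
            # $rest is left unquoted on purpose so that it splits into words.
            permute "${prefix:+$prefix }$pick" $rest
            i=$((i + 1))
        done
    fi
)

# candidate_commands DEVICE...
# Print one "mdadm --create" command per possible device order.
candidate_commands() {
    permute "" "$@" | while read -r order; do
        echo "mdadm --create --verbose /dev/md0 --level=5 --raid-devices=$# --assume-clean $order"
    done
}

candidate_commands /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sde1
```

Try the printed commands one at a time: stop the array before each new attempt, and check the result with fsck -n before moving on to the next order.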

I was quite lucky since I found the right one on the second try.
