December 03, 2003

RAID Death and Resurrection

My blog has been down.  The software RAID array that holds my directory failed after a brief power outage knocked out two drives at once, and I’ve been busy fixing things.  (RAID level 5 can tolerate at most one drive going bad.)  Both drives fell out of sync with the third, though one of them was physically fine.  The other apparently developed more bad blocks on one track than the drive's hardware bad block remapping could handle, so it appeared to be in a failed state.  Unfortunately, the md software RAID driver can't deal with bad blocks on its disks, unlike filesystems such as ext2.  Fortunately, I was able to get the data off the array using a tool called mdadm, which can give an out-of-sync disk the same event count as another, so that it appears to md to be in sync.  For most (99.99999%+) of the data this works fine, and I was able to tar everything up and copy it to our Windows XP machine.  But that was only the beginning.
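For anyone wondering why RAID level 5 survives one dead drive but not two, here is a minimal Python sketch of the parity idea.  It is only an illustration, not how md is implemented: the block contents and the xor_blocks helper are made up for the example, and a real array rotates the parity block across the disks from stripe to stripe.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# One stripe on a three-disk array: two data blocks plus one parity block.
# (Illustrative contents; real blocks are much larger.)
data_a = b"blog"
data_b = b"data"
parity = xor_blocks([data_a, data_b])

# One disk lost: XOR of the two survivors gives back the missing block.
recovered_b = xor_blocks([data_a, parity])
recovered_a = xor_blocks([data_b, parity])

# Two disks lost: a single surviving block has nothing to be XORed with,
# so the other two blocks cannot be reconstructed from it.
```

With two of the three blocks gone, any value for the missing data block is consistent with what is left, which is why the array refuses to start.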

Firstly, I felt I needed a solution that would mostly eliminate the power instability that plagues us twice a year, when tree limbs fall on power lines during the high winds of Fall and Spring.  We sometimes suffer brief brownouts, which leave electronic equipment in an unstable state, or power surges, which fry unprotected electronics, or blackouts that last hours.  (The last year has been particularly bad: we lost a high-end editing VCR, a bread machine, several surge protection strips, and a microwave.)  So we bought a new Belkin 1200VA UPS (Uninterruptible Power Supply), which immediately supplies power to the networking equipment, Linux machine, and XP machine for 5-10 minutes after an outage begins, and then tells them to shut down gracefully.  It also has surge protection and helps filter the signal.  It works great, and I wish we had bought one years ago.

Secondly, since bad blocks eventually build up in any working hard drive, our RAID setup needed a way of remapping them.  I believe I found a solution in EVMS (Enterprise Volume Management System), an open source project funded by IBM that has a bad-block-remapping "plug-in."  It also sports a nice GUI and provides many other features, such as snapshots, LVM, multipath setups, and clustering.  The only drawback is that it hasn't been available very long, so it probably has a few bugs.

To complicate matters, the old system disk we were using was only 10 GB, and after putting on swap space, home directories, and tons of software packages, it was nearly full.  I was growing tired of keeping the Red Hat packages up to date, and dealing with the dependencies was a (slow and manual) pain.  Perhaps if we paid them it would not be a nuisance, but that seemed unnecessary, since I had been reading that other distributions provided free and nearly automated ways of keeping packages current.  To top it off, Red Hat decided to stop catering to desktop platforms and focus on their server market.  So I wanted to give another distro a try.  But first we needed more system disk space.

Our Asus A7n8x Deluxe motherboard can support two SATA (Serial Advanced Technology Attachment) drives, so we went to PC-Club in Bellevue and bought a couple of 120GB Western Digital SATA drives (WD1200JD) for around $270.  Serial-ATA is supposed to eventually replace the widespread Parallel-ATA IDE, and though its cables and connectors are smaller and more manageable, SATA currently isn't much faster.
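To put a rough number on "isn't much faster," here is a small Python calculation comparing peak interface rates.  The figures are not from our hardware's documentation: I am assuming first-generation SATA at 1.5 Gbit/s with 8b/10b encoding, and the fastest Parallel-ATA mode, Ultra ATA/133, as the point of comparison.  These are bus limits only; the drives themselves sustain less.

```python
# Assumed figures: first-generation SATA signals at 1.5 Gbit/s and uses
# 8b/10b encoding, so only 8 of every 10 bits on the wire carry data.
SATA_LINE_RATE_BITS_PER_S = 1_500_000_000
sata_mb_per_s = SATA_LINE_RATE_BITS_PER_S * 8 // 10 // 8 // 1_000_000

# Assumed comparison point: Ultra ATA/133, the fastest Parallel-ATA mode.
PATA_MB_PER_S = 133

speedup = sata_mb_per_s / PATA_MB_PER_S
print(f"SATA {sata_mb_per_s} MB/s vs PATA {PATA_MB_PER_S} MB/s: {speedup:.2f}x")
```

Under those assumptions the interface is only about 13% faster on paper.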

After the drives were installed, I spent several days reading about different Linux distributions; I considered Suse, Mandrake, Knoppix, Debian, Linux from Scratch, Gentoo, and others.  I decided on Gentoo, since the package management seemed well respected, there were many active developers, it's conducive to a better understanding of how things work "under the hood," and it is source code-based, allowing for custom compilation optimizations for the processor used and hacking of packages' code.  If you have a slow processor, you may not appreciate the hours required to compile everything from scratch, but fast modern processors can make quick work of it, and future ones will be even faster.

There are many different Linux kernels available for use with Gentoo.  Unfortunately, not all of them support SATA.  The 2.6 kernel, which is still in beta testing, the -ac (Alan Cox) versions of the 2.4 kernel, and probably a few others do.  EVMS version 2, which has the bad block replacement, requires a component called Device Mapper, which is included in the 2.6 kernels but must be added to 2.4 kernel versions.  I didn't have much luck persuading the 2.4 kernel versions to work: each one I tried had either SATA or EVMS 2 working, but not both.  Only the vanilla 2.6 kernel was readily able to handle both, after a few patches.  Finally, after several attempts, I have the 2.6 (beta 9) kernel and EVMS 2 working.

Each of the five 120GB disks has four partitions: 800MB, 11GB, 50GB, and 50GB.  About 2.4GB of swap is spread across three of the 800MB partitions, while the other two are used as a boot partition and a backup boot partition.  The OS is on one of the 11GB partitions and backed up to another.  Two more 11GB partitions are combined into a fast 22GB RAID level 0 (striped) volume for scratch space that is not backed up.  The 50GB partitions are used to create two 200GB volumes, each with RAID level 5.  The reason for two volumes rather than a single large one is to guard against filesystem corruption; files on one volume will be backed up to the other until I’m confident that the Reiser filesystem is stable.  For all volumes except one of the big RAIDs and the boot partitions, I am using Reiser 3.6.8, which is fast, but I don’t trust it after reading about others who were burned by earlier versions of Reiser.  The boot partitions are ext3, because I know that our bootloader, Grub, can deal with it.  The other big RAID volume is formatted with JFS (IBM’s Journaling File System, ported from AIX), which is good for large files, much like XFS.  It is still green, and not many people have used it yet, so I don’t trust it either.
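The sizes above can be checked with a little arithmetic.  This Python sketch assumes that each RAID level 5 volume takes one 50GB partition from each of the five disks, which is the arrangement that yields 200GB; the variable and function names are only for illustration.

```python
DISKS = 5

# Swap: three of the five 800MB partitions.
swap_mb = 3 * 800
boot_partitions = DISKS - 3      # the remaining 800MB partitions

# RAID level 0 stripes with no redundancy, so capacities simply add up.
raid0_gb = 2 * 11

def raid5_usable_gb(members, member_gb):
    """RAID level 5 spends one member's worth of space on parity."""
    return (members - 1) * member_gb

# Assumption: one 50GB partition per disk in each of the two volumes.
raid5_gb = raid5_usable_gb(DISKS, 50)

print(swap_mb, boot_partitions, raid0_gb, raid5_gb)
```

This gives 2400MB of swap, two boot partitions, a 22GB scratch volume, and 200GB usable in each big volume, matching the numbers above.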

At this point I'm feeling a combination of relief and paranoia.  I'm happy with Gentoo and I'm glad to have things working, but if an IDE controller goes out, leaving the RAID partitions unsynchronized, I don't know if EVMS can deal with it without losing data, and the mdadm tool can't help.

Posted by seander at December 3, 2003 09:20 AM