Sunday, December 2, 2012

migrating the raid

Every few years it's time to migrate that old disk array to a new set of disks.  My current array-- or perhaps I should say "filesystem"-- started life as four 200GB drives in a raid5 configuration.  Later I upgraded to two 1TB disks in a raid1 configuration.  Today, I'm moving to a raid5 configuration of three 2TB disks.  Finally, a respectable amount of storage!

Software raid certainly has its merits.  I've moved these disks around between several motherboards, never worrying much about drivers or compatibility.  It just works.  So here I go, documenting this upgrade experience.

The current two drives are raid1, as I mentioned, so it is really pretty simple.  /dev/sda1 and /dev/sdb1 serve as the root fs.  /dev/sda2 and /dev/sdb2 are my swap partitions-- no reason to use raid for that.

Going back to raid5 I'm going to have to re-introduce a bit of complexity: an additional raid1 volume for /boot, because only the newest grub2 bootloader can deal with booting from raid5.  Rather than go bleeding edge, I think I will sacrifice 1GB or so of my storage to serve this need.  Let's get started.

First thing is to figure out how you're going to get the drives into one computer.  You could do this with two computers and use a network, but let's keep it simple, and hopefully high-performance.  My motherboard has four sata ports.  Let's see, two plus three is five!  No problem though, we could unplug one of the raid1 drives and have just enough ports.

Even better-- let's leave the raid1 array intact, and set up our raid5 array with only two drives.  This will give us maximum read performance from the old raid1 and maximum write performance to the new raid5.  After we're all done, we'll add the third drive to the raid5 array and let it build its parity information.  This is going to be so fast! (not really).

So now we have our four drives plugged in.  How should we boot this thing?  You could probably boot up into single user mode, then remount / as read-only and it should be safe to copy... but... let's just keep it super clean.  Create a bootable usb thumb drive (I used a debian netinst installer image copied to a usb stick).
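If you haven't made one of those before: assuming the image is one of the hybrid ones that can be written straight to a stick, dd does the job (the image filename here is just a stand-in for whatever you downloaded, and triple check that /dev/sdX really is the thumb drive and not one of your data disks):

dd if=debian-netinst.iso of=/dev/sdX bs=4M
sync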

Boot up the debian installer.  Instead of installation, choose rescue mode.  At some point it will ask you if you want to assemble a raid, choose that.  It should get /dev/md0 ready to roll, but we don't want to mount it.  It will ask you if you want to execute a shell-- execute one in the installer's environment, not on md0.

Ok so now you have a root shell and if you cat /proc/mdstat you will see your md0 active.  To your horror it might say:
md0: active raid1 sdb1 sde1

sdb, sde?  What?  It doesn't matter what the dev names are; all the raid metadata is stored on each partition, so it just figures it out.  sda in this case is the usb thumbdrive.
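If you want to be extra sure which device is which before you touch anything, the by-id names spell out each drive's model and serial number:

ls -l /dev/disk/by-id/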

So my new drives are sdc and sdd.  I fdisk both of these drives and create a 1GB partition of type fd, a 1GB partition of type 82, and the remainder of the drive as another of type fd.  Fd is linux raid, and 82 is swap.
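So each of the new drives ends up laid out roughly like this (sdd matching sdc):

/dev/sdc1   ~1GB       type fd (linux raid)   -> raid1 for /boot
/dev/sdc2   ~1GB       type 82 (linux swap)
/dev/sdc3   the rest   type fd (linux raid)   -> raid5 member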

Let's get our /boot raid1 array up and going:

mdadm --create /dev/md1 -l1 -n2 /dev/sdc1 /dev/sdd1
Done.  But we don't need to do anything with this yet.  Now let's build our new raid5 array.
mdadm --create /dev/md2 -l5 -n3 /dev/sdc3 /dev/sdd3 missing
missing?  That is the trick to getting the array going with your missing drive!  Now we can start the copy.  Be extremely careful here.  Well, you should have been extremely careful this whole time.
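Before firing off the copy, it doesn't hurt to double check that md2 really is the degraded three-disk array you meant to build:

mdadm --detail /dev/md2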

dd if=/dev/md0 of=/dev/md2 bs=10M

"if" is your source, your good precious data.  "of" is the destination, your empty raid5 array.  This might have been a good time to check those backups.  Or, this is a good argument for degrading the raid1 array earlier.  Integrity should trump performance.  I usually do a blocksize of at least 1M for performance, more than that really won't do anything though.

So, copying 1TB like this might take quite a while.  A long while.  Go see a movie, get some dinner, then hit a coffee shop for a cup of coffee.  Ok by now it should be done.  No?  Go to bed.
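One thing that helps the waiting: dd doesn't print progress on its own, but if you send it SIGUSR1 from another console (the installer usually has a spare shell on alt-F2) it will report how far it has gotten:

kill -USR1 $(pidof dd)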

Ok, 7 hours later it is done.  Let's do a fs check:
e2fsck -fy /dev/md2
and let's resize it to fill up the actual raid we just put together:
resize2fs /dev/md2
Man, this takes forever.  You might be wondering why I'm preserving the filesystem at all instead of just tarring the whole contents over.  That would work, but I have an enormous number of files due to my backuppc backup software (tons of hard links).  Plus, I've had this filesystem for over a decade.  We have a relationship.

Ok, finally that is resized.  Now we can format the raid1 volume for /boot and try to get this thing bootable.
mkfs.ext3 /dev/md1
cd /mnt
mkdir md1 md2
mount /dev/md2 md2
mv md2/boot md2/oldboot
mkdir md2/boot
mount /dev/md1 md2/boot

This part is pretty foggy because I was beating my head against a wall for a while.  All or some of these commands are needed.

So now md1 is mounted under /boot on md2
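If memory serves, you also want the usual bind mounts in place before chrooting, so grub and the initramfs tools can see the real devices (this assumes you're still sitting in /mnt):

mount --bind /dev md2/dev
mount --bind /proc md2/proc
mount --bind /sys md2/sys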
chroot md2
cp -a oldboot/* boot
grub-mkdevicemap
vi /etc/initramfs-tools/modules
add raid1 and raid5
dpkg-reconfigure your kernel
update-grub
update-initramfs -u -k all
grub-install /dev/md1
vi /etc/fstab
/dev/md2 /
/dev/md1 /boot
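Spelled out, those fstab entries probably end up looking something like this-- adjust the filesystem type if yours isn't ext3, and add swap lines for sdc2/sdd2 if you want them:

/dev/md2  /      ext3  defaults,errors=remount-ro  0  1
/dev/md1  /boot  ext3  defaults                    0  2

One more thing worth checking while still in the chroot: make sure the new arrays are listed in /etc/mdadm/mdadm.conf, otherwise they can come up under different names (or not at all) at boot.  If you have to add them, re-run the update-initramfs step afterwards:

mdadm --detail --scan >> /etc/mdadm/mdadm.conf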

Aaaaaand there you go! 

No, wait, you want to add the 3rd drive, right?  Shut it down.  Plug it in.  Fdisk it the same as the other disks.
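(If you'd rather not re-do the fdisk dance by hand, cloning the partition table from one of the disks already in the array should also work: sfdisk -d /dev/sdX | sfdisk /dev/sdY, where sdX is an existing member and sdY is the new disk-- triple check those letters first.)  Then add it to your raid and wait for the parity to build: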

mdadm /dev/md2 -a /dev/sdc3
mdadm /dev/md1 -a /dev/sdc1

cat /proc/mdstat
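If you want it to refresh itself while the rebuild chews along, watch does the trick:

watch -n 60 cat /proc/mdstat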

Yahoo.  Ok, I screwed up and sdc1 is a hot-spare for md1.  I should have done the missing drive trick if I wanted 3 disks in my raid1.  Not too worried about it though (there's a note on fixing it after the benchmarks).  Here are some quick benchmarks:
Before (raid1, two WD green low-rpm drives):  Block Write: 42704 KB/s.  Block Read: 66728 KB/s
Midway (raid5 degraded 2/3, Seagate 7200RPM Barracudas):  Block Write: 73796 KB/s.  Block Read: 171882 KB/s
Post (raid5, all 3 drives):  Block Write: 86499 KB/s.  Block Read: 229297 KB/s
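About that hot-spare: if you decide you do want all three disks active in the /boot mirror, you shouldn't need to rebuild anything from scratch-- growing the array to three devices should promote the spare and kick off a resync (keep an eye on /proc/mdstat to be sure):

mdadm --grow /dev/md1 --raid-devices=3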