The Debian Woody/Sid 2.4 Kernel RAID 1 DevFS ReiserFS HOWTO

Or, #include nifty-feature-set.h

James Bromberger, September 3rd, 2001

This document should help you get your hardware running Debian GNU/Linux on software RAID 1. It was born out of two weeks' worth of frustration and a lack of documentation on this specific combination. I will write this as a step-by-step guide, and point out issues that arise, and the reasons for them, as we go.

I recommend that you read and re-read the Software RAID HOWTO and the Boot + Root + RAID + LILO HOWTO before going any further. I found that these two documents explained enough to get me started. However, using them alone I repeatedly got kernel panics when I tried to move to the new RAID 1 root filesystem, right after the initrd image ran (the actual failure for me was while the kernel tried to mount the root filesystem; see the ROOT= setting below).

My aim was to use standard packages, with no recompilation, and to keep all packages upgradeable by standard (apt) means.

Hardware Requirements

I am going to discuss booting on i386 hardware, because that is what I used. Some of what I discuss may be relevant to other architectures, but I don't know.

I was using a 1 GHz PIII machine, with two 80 GB IDE hard disk drives, on a Soltek SL-65KV2 motherboard, with 512 MB of RAM, and a floppy drive, in a reasonable case, plus an Intel Etherpro 100 NIC, and a snazzy Adaptec 29160 SCSI card. You don't need the SCSI card, but a network card for which there is already a Linux driver is useful. The important bit is that I had two identical hard drives.

Getting Started

Boot Disks

Create a set of boot disks: the files you will need are available on your closest Debian mirror. At this point in time, you need the Rescue Disk (boot.bin), Root File System (root.bin), and the four (4) driver disks. All these images are located in $debian/dists/testing/disks-i386/. Refer to the documentation on using "dd" or "rawrite2" for putting these onto your media.
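If you are writing the images out from an existing Linux box, the usual dd invocation looks something like this (the image name and floppy device here are examples only):

dd if=boot.bin of=/dev/fd0 bs=1024 conv=sync    # repeat for each image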

Hardware Assembly

Everything plugs together like normal, except your hard drives are attached one to each of your IDE controllers. Most motherboards support up to four (4) hard drives: two on the first controller, and two on the second. Each controller supports two drives: one as a master, and one as a slave. There are generally "jumpers" on hard drives to set them as master or slave: both the drives you have should be set to master. Since they are equivalent, it doesn't matter which one is plugged into the first IDE controller, and which is on the second.

Initial Install

To get started, you need to get a base system set up on one disk. Our plan is to have everything installed on one partition, then create our first RAID device, boot onto it, and then create our other RAID partitions and migrate data to them as needed. We aim to end up with one partition for /boot and the root filesystem, and others as needed. You can do the whole thing on one RAID partition if you prefer; that's your choice. We will also use DevFS and ReiserFS.

Boot from the rescue disk, then the root disk, then follow the installation instructions. When you get prompted to partition, think about how big your disk is, and how you would like it carved up in the long run. You can change this later, but starting out with a plan is simpler.

For example, I had 80 GB to play with (two 80 GB drives under RAID 1 gives 80 GB of space at the end). I chose to have 4 GB for my root partition, and divide the rest up later on for /home, /usr, /var, and /usr/local (plus a little bit for swap).

Don't worry about running dselect or installing any "task-" packages. We just want a simple system set up on a vanilla ext2 filesystem. We should end up with a self-booting system on ext2, running whatever standard kernel was chosen.

Step up to RAID Capability

Now we need to prepare for running a RAID setup. Our packages need an update. Use apt, because it rocks, and install the following:
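(The list here is my reconstruction, based on the tools used later in this HOWTO; substitute a real 2.4 kernel-image package of the right flavour for the 2.4.x placeholder.)

apt-get install kernel-image-2.4.x-686 raidtools2 reiserfsprogs devfsd initrd-tools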

Edit /etc/modules and add the following modules:
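(Again a reconstruction; for the setup described here the modules that matter are the RAID 1 and ReiserFS drivers. Add the md core module as well if your kernel does not build it in.)

raid1
reiserfs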

Edit /etc/mkinitrd/modules, and add the same modules to this list. Your initrd image needs to be able to read and write your RAID array before your root filesystem is mounted; the initrd is the trick here. You probably also want to check whether you need to edit /etc/mkinitrd/mkinitrd.cfg and change the variable ROOT=probe to ROOT=/dev/md0, or, if using DevFS, ROOT=/dev/md/0.
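The two files then end up carrying roughly this (a sketch; check against the file names your version of initrd-tools actually installs):

# /etc/mkinitrd/modules -- modules that must be inside the initrd image
raid1
reiserfs

# /etc/mkinitrd/mkinitrd.cfg -- where the initrd should look for root
# (use ROOT=/dev/md/0 instead if you boot with devfs=mount)
ROOT=/dev/md0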

Regenerate your initrd image for your new kernel with mkinitrd -o /tmp/initrd-new /lib/modules/2.4.x-... . If all is good, move this to /boot/initrd-2.4.x-... and edit your /etc/lilo.conf to add an initrd= line pointing at this image against the "Linux" kernel entry. Run lilo, and you should see an asterisk next to the boot image "Linux".
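Gathered in one place (substitute your actual kernel version for the 2.4.x-... placeholder):

mkinitrd -o /tmp/initrd-new /lib/modules/2.4.x-...   # build the new initrd
mv /tmp/initrd-new /boot/initrd-2.4.x-...            # put it where lilo.conf points
lilo                                                 # rewrite the boot blocks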

Reboot into your new kernel

Create the RAID partitions

You now have a system that can use RAID on root. Fire up cfdisk, and partition the disk that will be one half of your RAID 1 array (the one your current system is not installed on; /dev/hdc in my case). Make sure you set the partition type to fd (Linux raid autodetect) on all of them, and mark the first one as bootable.

Create your raidtab. Copy /usr/share/doc/raidtools2/examples/raid1.conf or similar, and define your RAID devices. /dev/md0 will be our first one; you can leave the rest for the moment. Define it as having two devices: /dev/hdc1 and /dev/hda1, in that order. Be sure to mark /dev/hda1 as a failed-disk! You are still using this disk, and don't want it trampled on.
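At this stage my entry for md0 looked roughly like this (compare the full raidtab further down; note the failed-disk directive protecting the disk the running system still lives on):

raiddev			/dev/md0
raid-level		1
nr-raid-disks		2
nr-spare-disks		0
chunk-size		4

device			/dev/hdc1
raid-disk		0

device			/dev/hda1
failed-disk		1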

Now make the md0 device: mkraid /dev/md0. Format it: mkfs.reiserfs /dev/md0. Mount it somewhere: mount /dev/md0 /mnt. Copy your current system to it: cd /; find . -xdev | cpio -pdm /mnt. Edit the new system's /etc/fstab, which will currently be located at /mnt/etc/fstab: change the root partition to /dev/md0, and the filesystem type to reiserfs.
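Step by step, that works out to something like this (the /mnt mount point is just where I chose to put it):

mkraid /dev/md0                        # assemble the (still degraded) mirror
mkfs.reiserfs /dev/md0                 # put a ReiserFS filesystem on it
mount /dev/md0 /mnt                    # mount it somewhere convenient
cd / ; find . -xdev | cpio -pdm /mnt   # copy the running system across
vi /mnt/etc/fstab                      # root becomes /dev/md0, type reiserfs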

Edit your lilo.conf (in both places: on your current root, and the copy on the new RAID root under /mnt) and update the entry for your Linux image with append="md=0,/dev/hdc1,/dev/hda1" and root=/dev/md0. Run lilo again to update it.
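The relevant image entry looks something like this at this stage of the migration (my complete lilo.conf, as it ended up, is listed further down):

image=/vmlinuz
	label=Linux
	root=/dev/md0
	append="md=0,/dev/hdc1,/dev/hda1"
	read-only
	initrd=/initrd.img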

Reboot now onto your RAID 1 ReiserFS root partition. Check that it worked with df.

Partition the rest of hdc now if you hadn't already. You may want swap on RAID 1, but you'll be far better off purchasing a bit of extra memory instead. That's up to you.

Now it is time to bring hda back into the fold. You can duplicate your hdc partition table to hda using sfdisk -d /dev/hdc | sfdisk /dev/hda.

You can bring hda online now by editing your /etc/raidtab and replacing the failed-disk directive for hda1 with raid-disk, and then running raidhotadd /dev/md0 /dev/hda1. The drive should start to sync up: you can watch /proc/mdstat for details on how this is going. You can continue while it does this.
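A convenient way to keep an eye on the rebuild while you carry on working:

watch -n 2 cat /proc/mdstat     # refresh the RAID status every two seconds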

Adding the other RAID partitions

So, now you can move the other partitions to RAID 1. You don't need to muck around with failed-disk directives any more. You already have the partitions you need, so all you need to do is edit /etc/raidtab and define them, then mkraid /dev/mdN, format it as above, mount it somewhere, and copy your current filesystem tree to it. A quick way of activating it, using /usr as an example: mv /usr /usr2; mkdir /usr; mount /dev/mdN /usr, and if all has gone well, then remove /usr2. Lather, rinse, repeat; the sequence is sketched below.
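Using /usr and /dev/md1 (which is where /usr ends up in my layout) as the example, the whole dance is roughly:

mkraid /dev/md1                           # create the array defined in /etc/raidtab
mkfs.reiserfs /dev/md1
mount /dev/md1 /mnt
cd /usr ; find . -xdev | cpio -pdm /mnt   # copy the existing tree across
umount /mnt
mv /usr /usr2 ; mkdir /usr                # keep the old copy until we are sure
mount /dev/md1 /usr
# add the new filesystem to /etc/fstab, and once happy: rm -rf /usr2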

Features

You now have a modern filesystem. I recommend that you stick to a filesystem that is in the standard kernel tree and available as a package. Your entire system can then be updated to newer kernels automatically: your new initrds will be generated with the required modules and ROOT definition. Just apt-get your new kernel, and all *should* be well. However, I make no promises.

Other Notes

You may wish to set your hard disks up with hdparm to use multiple-sector transfers. Install the hdparm package, and read the man page. I edited /etc/init.d/raid2 and, against the "start" option, added hdparm -m 16 /dev/hda; hdparm -m 16 /dev/hdc. Your options to -m may vary: read the manual page.

I ran into problems using a boot parameter of "devfs=mount". I'm not sure, but DevFS may be moving the traditional /dev/md0 to /dev/md/0 too early, and since the DevFS daemon is not running at that point, there is no symbolic link to follow. Hence, devfs=mount may require that your initrd image have ROOT=/dev/md/0 in place of ROOT=/dev/md0. This should have been fixed as of version 0.1.12 of the initrd-tools package.

phobe:~> bonnie -s 1000
Writing with putc()...done
Writing intelligently...done
Rewriting...done
Reading with getc()...done
Reading intelligently...done
start 'em...done...done...done...
Create files in sequential order...done.
Stat files in sequential order...done.
Delete files in sequential order...done.
Create files in random order...done.
Stat files in random order...done.
Delete files in random order...done.
Version  1.02       ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
phobe         1000M 10898  90 29458  63  9559  13 10947  77 31891  14 317.4   1
                    ------Sequential Create------ --------Random Create--------
                    -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 16  9265  70 +++++ +++ 15273  99 12506  99 +++++ +++ 12694  99
phobe,1000M,10898,90,29458,63,9559,13,10947,77,31891,14,317.4,1,16,9265,70,+++++,+++,15273,99,12506,99,+++++,+++,12694,99
phobe:~> df -k
Filesystem           1k-blocks      Used Available Use% Mounted on
/dev/md0               4003520     91092   3912428   3% /
/dev/md1               4003584    304068   3699516   8% /usr
/dev/md2               4003584    232564   3771020   6% /var
/dev/md3               4003520   1071384   2932136  27% /home
/dev/md5              62035860  11357460  50678400  19% /usr/local

phobe:~> cat /proc/mdstat 
Personalities : [raid1] 
read_ahead 1024 sectors
md5 : active raid1 ide/host0/bus1/target0/lun0/part7[1] ide/host0/bus0/target0/lun0/part7[0]
62037760 blocks [2/2] [UU]

md4 : active raid1 ide/host0/bus1/target0/lun0/part6[1] ide/host0/bus0/target0/lun0/part6[0]
97664 blocks [2/2] [UU]

md3 : active raid1 ide/host0/bus1/target0/lun0/part5[1] ide/host0/bus0/target0/lun0/part5[0]
4003648 blocks [2/2] [UU]

md2 : active raid1 ide/host0/bus1/target0/lun0/part3[1] ide/host0/bus0/target0/lun0/part3[0]
4003712 blocks [2/2] [UU]

md1 : active raid1 ide/host0/bus1/target0/lun0/part2[1] ide/host0/bus0/target0/lun0/part2[0]
4003712 blocks [2/2] [UU]

md0 : active raid1 ide/host0/bus0/target0/lun0/part1[1] ide/host0/bus1/target0/lun0/part1[0]
4003648 blocks [2/2] [UU]

unused devices: <none>

My Raidtab

Remember, you need to have this in /etc/raid/raidtab, and a symlink from /etc/raidtab.

raiddev			/dev/md/0
raid-level		1
nr-raid-disks		2
nr-spare-disks		0
chunk-size		4

device			/dev/hdc1
raid-disk		0

device			/dev/hda1
#failed-disk		1
raid-disk		1

#/usr
raiddev			/dev/md1
raid-level		1
nr-raid-disks		2
nr-spare-disks		0
chunk-size		4

device			/dev/hda2
raid-disk		0
device			/dev/hdc2
raid-disk		1


#/var
raiddev			/dev/md2
raid-level		1
nr-raid-disks		2
nr-spare-disks		0
chunk-size		4

device			/dev/hda3
raid-disk		0
device			/dev/hdc3
raid-disk		1


#/home
raiddev			/dev/md3
raid-level		1
nr-raid-disks		2
nr-spare-disks		0
chunk-size		4

device			/dev/hda5
raid-disk		0
device			/dev/hdc5
raid-disk		1

# swap
raiddev			/dev/md4
raid-level		1
nr-raid-disks		2
nr-spare-disks		0
chunk-size		4

device			/dev/hda6
raid-disk		0
device			/dev/hdc6
raid-disk		1

# /usr/local
raiddev			/dev/md5
raid-level		1
nr-raid-disks		2
nr-spare-disks		0
chunk-size		4

device			/dev/hda7
raid-disk		0
device			/dev/hdc7
raid-disk		1

My lilo.conf

# Support LBA for large hard disks.
lba32

# Specifies the boot device.  This is where Lilo installs its boot
# block.  It can be either a partition, or the raw device, in which
# case it installs in the MBR, and will overwrite the current MBR.
boot=/dev/md0
raid-extra-boot="/dev/hda,/dev/hdc"

# Specifies the device that should be mounted as root. (`/')
root=/dev/md0

# Installs the specified file as the new boot sector
install=/boot/boot.b

# Specifies the location of the map file
map=/boot/map

# Specifies the number of deciseconds (0.1 seconds) LILO should
# wait before booting the first image.
delay=20

vga=normal
default=LinuxNoDevFS

image=/vmlinuz
	label=Linux
	root=/dev/md0
	append="md=0,/dev/hda1,/dev/hdc1 devfs=mount"
	read-only
	initrd=/initrd.img

image=/vmlinuz
	label=LinuxNoDevFS
	root=/dev/md0
	append="md=0,/dev/hda1,/dev/hdc1"
	read-only
	initrd=/initrd.img

image=/vmlinuz.old
	label=LinuxOLD
	root=/dev/md0
	append="md=0,/dev/hda1,/dev/hdc1 devfs=mount"
	read-only
	initrd=/initrd.img.old


image=/vmlinuz.old
	label=LinuxOLDNoDevFS
	read-only
	append="md=0,/dev/hda1,/dev/hdc1"
	optional
	root=/dev/hda1
	initrd=/initrd.img.old

In an emergency, break glass

So what to do if you can't get your root RAID1 filesystem to boot? Here is a straightforward way to get to your md0:
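One rough way back in from a rescue boot (a sketch of my own, not gospel): because the old-style RAID superblock lives at the end of each member partition, either half of the mirror is also readable as a plain filesystem.

# From a rescue system whose kernel knows about raid1 and reiserfs:
modprobe raid1 ; modprobe reiserfs
# Either assemble and mount the array (raidstart needs a raidtab to read)...
raidstart /dev/md0
mount -t reiserfs /dev/md0 /mnt
# ...or, in a pinch, mount one half of the mirror directly, read-only:
mount -t reiserfs -o ro /dev/hda1 /mnt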

Still ticking over: May 2003

Just to make sure everyone is convinced this works: the system I installed this on is still working fine today, 8th May 2003 (touch wood). Over this time I have had only one failed disk, and many peaceful nights of sleep knowing that one disk failing at any time isn't a huge problem.