Monday, March 12, 2007

Why ZFS for home


I’m getting annoyed at people who keep saying that ZFS is fine for servers but that they don’t need it at home. ZFS scales from a single drive to a practically unlimited number of drives, and it has benefits at every size.

Let’s take a look at the average home computer: a single drive holding a mix of files, and drives of up to 300GB are common. That is a lot of data to lose, and it’s getting easier to lose data these days. Furthermore, new hard drives aren’t getting any more reliable with time. Of course, you can also lose files on a perfectly healthy drive just by misplacing them in one of the thousands of directories you create in an attempt to organize your files.

What do other operating systems and file systems provide to fight this situation? In Linux you can use RAID (redundant array of inexpensive disks), so that if a hard drive fails your data is safe. The fun part begins when you try to enable RAID. The obvious choices are RAID 1 (mirroring your data) or RAID 5 (which uses part of your drives as parity to protect your data; it spends less space on redundancy than a mirror but requires a minimum of 3 drives to work). I won’t bore you with the technical details; I will just show a small sample of the commands to create a RAID 1, a mirror image of one drive onto a second drive.

These instructions were taken for

You have two devices of approximately same size, and you want the two to be mirrors of each other. Eventually you have more devices, which you want to keep as stand-by spare-disks, that will automatically become a part of the mirror if one of the active devices break.

Set up the /etc/raidtab file like this:

raiddev /dev/md0
raid-level 1
nr-raid-disks 2
nr-spare-disks 0
persistent-superblock 1
device /dev/sdb6
raid-disk 0
device /dev/sdc5
raid-disk 1

If you have spare disks, you can add them to the end of the device specification like

device /dev/sdd5
spare-disk 0

Remember to set the nr-spare-disks entry correspondingly.

Ok, now we're all set to start initializing the RAID. The mirror must be constructed, eg. the contents (however unimportant now, since the device is still not formatted) of the two devices must be synchronized.

Issue the

mkraid /dev/md0

command to begin the mirror initialization.

Check out the /proc/mdstat file. It should tell you that the /dev/md0 device has been started, that the mirror is being reconstructed, and an ETA of the completion of the reconstruction.

Reconstruction is done using idle I/O bandwidth. So, your system should still be fairly responsive, although your disk LEDs should be glowing nicely.

The reconstruction process is transparent, so you can actually use the device even though the mirror is currently under reconstruction.

Try formatting the device, while the reconstruction is running. It will work. Also you can mount it and use it while reconstruction is running. Of course, if the wrong disk breaks while the reconstruction is running, you're out of luck.

Looks like fun, right? Before ZFS the situation wasn't much better in Solaris. A typical home user will see this and say, "I will do this next week." Then next week never comes. Of course, doing RAID 5 only gets more complex, in Linux at least; for ZFS it's just a slight change to the commands used to create a mirror. In ZFS we execute two or three commands and we are done.

# zpool create data mirror c0t0d0 c0t1d0
# zfs create data/filesystem

Done. The only complex part is getting the two device names at the end of the zpool command, and you can find those by running the Solaris format command.

# format < /dev/null
Searching for disks...done

0. c0t0d0
1. c1t2d0
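The RAID 5-style setup mentioned earlier really is only a small change to the mirror command: the keyword mirror becomes raidz and a third disk is listed. Here is a minimal sketch of that. The helper name pool_create_cmd is made up for illustration, and so is the third disk name c0t2d0; the helper only prints the command so the device names can be checked before anything is run as root.

```shell
#!/bin/sh
# Hypothetical helper: print a zpool create command instead of running it,
# so the device names can be double-checked first.
# Usage: pool_create_cmd POOL LAYOUT DISK...
pool_create_cmd() {
    pool=$1
    layout=$2      # "mirror" or "raidz"
    shift 2
    echo "zpool create $pool $layout $*"
}

# The mirror from the post, then a single-parity raidz pool on three disks.
# c0t2d0 is an assumed name; use whatever format reports on your machine.
pool_create_cmd data mirror c0t0d0 c0t1d0
pool_create_cmd data raidz c0t0d0 c0t1d0 c0t2d0
```

Once the printed line looks right, it can be pasted at the root prompt as is.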

That takes care of drive failure. Another problem is accidental deletion, accidentally installing a broken application, or any other change you would like to undo. Linux’s answer to this is backups, on optical media, tape, or perhaps another hard drive. This is expensive or time consuming; choose one. So the typical home user will most likely put this off until another day and won’t have a backup of their data.

ZFS has snapshots. They are easy and painless and cost very little in resources to create. A snapshot is basically a picture of your data at a point in time; in ZFS it is taken on the live file system and is nearly instant. To get this in any other OS you need to buy expensive RAID hardware or an expensive software package, something that no home user will want to buy.

For example, I want to protect my mp3 collection, so I put it on a file system all its own.

# du -sh /mp3
17G /mp3

And then I took a snapshot of it for protection.

# time zfs snapshot data/mp3@may-1-2006
real 0m0.317s
user 0m0.017s
sys 0m0.030s

Not bad: 1/3 of a second to protect 17 gigabytes of data. It can easily be restored should I make a mistake and delete or corrupt a file, or all of them.

And here is a little script I created that takes a snapshot of each of my ZFS file systems and puts a date stamp on each one. Each snapshot takes very little space, so you can make as many as you need to be safe.

#!/bin/sh
# Snapshot every ZFS file system, stamping each snapshot with today's date.
date=`date +%b-%d-%Y`

for i in `/usr/sbin/zfs list -H -t filesystem -o name`
do
    /usr/sbin/zfs snapshot "$i@$date"
done

Spend a few minutes in crontab, or your desktop's graphical crontab creator, and you can have this script execute daily with no user intervention. Below is a sample line to add to your crontab that takes snapshots at 3:15 am.

15 3 * * * /export/home/username/bin/snapshot_all
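One design choice worth a second look once snapshots pile up daily: a stamp in the %b-%d-%Y style reads nicely but does not sort chronologically in a directory listing, while year-month-day does. Below is a rough sketch of that variation, not a replacement for the script above. The function names snapshot_name and print_snapshot_cmds are hypothetical, and the sketch only prints the zfs commands for names fed to it on standard input, so it can be tried without touching a pool.

```shell
#!/bin/sh
# Hypothetical helper: build a snapshot name such as data/mp3@2007-03-12.
# Year-month-day stamps sort chronologically in a plain ls listing.
snapshot_name() {
    fs=$1
    stamp=${2:-$(date +%Y-%m-%d)}   # default to today's date
    echo "$fs@$stamp"
}

# Read file system names (one per line) and print the snapshot commands.
# Real use would pipe in: /usr/sbin/zfs list -H -t filesystem -o name
print_snapshot_cmds() {
    while read fs
    do
        echo "/usr/sbin/zfs snapshot $(snapshot_name "$fs" "$1")"
    done
}

# Dry run with two of the file systems used in this post.
printf 'data/mp3\ndata/test1\n' | print_snapshot_cmds 2007-03-12
```

Leaving off the date argument makes the helper fall back to today's date, which is what a cron job would want.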

Seeing your snapshots is easy: just look in the .zfs/snapshot directory at the top of each ZFS file system. You can even see the individual files that make up a snapshot by changing directories further into the snapshot. This even works if the file system is shared via NFS.

Now let’s take a look at how to recover from mistakes using snapshots. First let’s create a file system and populate it with a few files.

# zfs create data/test1
# cd /data/test1
# mkfile 100m file1 file2 file3 file4 file5
# ls
file1 file2 file3 file4 file5

We now have 5 files, each 100 megabytes. Let's take a snapshot and then delete a couple of files.

# zfs snapshot data/test1@backup
# rm file2 file3
# ls
file1 file4 file5

The files are gone. Oops: a day later, or a month later, I realize I need those files.

# cd ..
# zfs rollback data/test1@backup

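So all we do is roll back using the saved snapshot, and the files are back. Keep in mind that a rollback also discards every other change made to the file system since the snapshot was taken.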

# ls test1
file1 file2 file3 file4 file5
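A rollback brings back everything at once. When only a couple of files are needed, they can instead be copied out of the .zfs/snapshot directory described earlier, which leaves the rest of the file system alone. Here is a minimal sketch of that idea. The function name restore_file is made up for illustration, and it assumes the file system is mounted at the path you pass in.

```shell
#!/bin/sh
# Hypothetical helper: copy one file out of a snapshot back into the live
# file system, leaving everything else untouched.
# Usage: restore_file MOUNTPOINT SNAPSHOT RELATIVE_PATH
restore_file() {
    mnt=$1
    snap=$2
    path=$3
    src="$mnt/.zfs/snapshot/$snap/$path"
    if [ ! -f "$src" ]; then
        echo "not in snapshot: $src" >&2
        return 1
    fi
    # -p keeps the original timestamps and permissions
    cp -p "$src" "$mnt/$path"
}

# For the example in this post the calls would be:
#   restore_file /data/test1 backup file2
#   restore_file /data/test1 backup file3
```

Since this is an ordinary cp from a read-only directory, it can be run as the regular user who owns the files.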

ZFS makes it easy to create lots of filesystems. In Linux you are limited to 16 file systems per drive (yes, I know you can use the Linux volume manager, but of course that adds even more complexity to the RAID setup outlined above). As drives get bigger you end up with hundreds or even thousands of files and directories per drive, making it easy to lose files in the levels of directories. With ZFS there is no real limit to the number of filesystems, they all share the storage pool, and they are quick and easy to create.

# time zfs create data/word_processor_files
real 0m0.571s
user 0m0.019s
sys 0m0.040s

A little over half a second to create a filesystem, and you can create as many as you like.

The next problem the home user may face is running out of space. Typically the user heads down to the local electronics or computer shop and gets another hard drive, or two if they want to be safe and use RAID, which of course means heading back to the RAID setup guide. Depending on the filesystem, you may be able to grow it with more cryptic commands, turning your setup into a RAID 1+0, but it's pretty complicated. So most people keep things simple and move files back and forth between filesystems to get the space they need.

With ZFS it is only one command to add the drive(s) to the pool of storage.

# zpool add data mirror drive3 drive4

Afterward all your filesystems have access to the additional space. If money is a little tight, you can turn on compression on any filesystem you like with a simple command; all files added to that filesystem from then on are compressed, possibly using less space. Note that this usually doesn’t slow down I/O at all; on some systems and workloads it actually speeds up data access.

# zfs set compression=on data/filesystem

ZFS is so simple that you can talk your grandmother through the process of creating filesystems or restoring data. This is just a small sample of what ZFS can do, but it’s all just as simple as what I have shown you in this document. Even if you are more advanced, you can still benefit from ZFS’s ease of use. No more hitting the web to study how-tos on setting up RAID or LVM. Even if you can't afford two drives in your home box, ZFS will be perfectly happy with one drive. You do lose hardware redundancy, but snapshots are still there to take care of software or user-introduced filesystem problems.
