Home file server
I’ve been meaning to do something more sophisticated (and rather higher-capacity!) than the current setup for ages. Right now it’s an Ubuntu box with a 320GB disk running Samba.
At a hardware level the plan is to nab a pair of 1TB SATA disks. They can be had for ~AU$250 each. But I’d like to get more than 1TB of storage out of them while also having some data duplicated. Some data matters enough to want it mirrored, some is essentially throwaway stuff: nice to have, but if I lose it I really don’t care that much.
The traditional way of doing this on a UNIX-like system would be to allocate part of the disks to a mirror, and part to a stripe or concat volume. The downsides of doing this are reasonably obvious: if you lose one disk you lose all the data on the concat/stripe, and you have to allocate the storage up front.
(Gross simplification, I know, as you can probably recover data if it was a concat rather than a stripe, but there’s a certain amount of stuffing about involved and I wouldn’t exactly guarantee it working.)
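To put rough numbers on the up-front allocation problem, here is a back-of-the-envelope sketch. It assumes two equal disks, each split into a mirror slice and a stripe/concat slice; the function name and the 300GB figure are just illustrations, not a plan.

```python
def traditional_layout(disk_gb, mirror_gb_per_disk):
    """Usable space when two equal disks are each split into
    a mirror slice and a stripe/concat slice (illustrative only)."""
    if not 0 <= mirror_gb_per_disk <= disk_gb:
        raise ValueError("mirror slice must fit on the disk")
    # The mirror holds two copies, so only one slice's worth is usable.
    mirrored = mirror_gb_per_disk
    # The stripe/concat spans what is left on both disks, unprotected.
    unprotected = 2 * (disk_gb - mirror_gb_per_disk)
    return {
        "mirrored": mirrored,
        "unprotected": unprotected,
        "total": mirrored + unprotected,
    }


# Hypothetical split: 300GB of each 1000GB disk goes to the mirror.
print(traditional_layout(1000, 300))
# {'mirrored': 300, 'unprotected': 1400, 'total': 1700}
```

The catch is that the 300 has to be picked before you know how much data actually deserves mirroring, and changing it later means repartitioning.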
So I’ve been reading about Windows Home Server with some interest. They have a sneaky trick: there’s no mirroring at a disk level, instead you can tell it to mirror individual files or directories and those will then be copied to two drives.
So there’s no need to allocate storage up-front — and in fact WHS won’t let you do that anyway — just mark stuff to be duplicated and it takes care of things. If you lose a disk, you lose anything that was stored on it that wasn’t duplicated, but you don’t lose access to stuff that was stored on other disks because it’s not doing the concat/stripe thing.
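A toy model of that idea, to make the failure behaviour concrete. This is emphatically not how WHS is implemented: the placement rule (emptiest disk first), the function names and the file list are all made up for illustration. The only property it shares with the description above is that duplicated files land on two different disks and everything else on one.

```python
def place_files(files, n_disks=2):
    """files: list of (name, size_gb, duplicate) tuples.
    Returns {name: set of disk indexes holding a copy}.
    Assumes n_disks >= 2 so duplicates can go to different disks."""
    used = [0] * n_disks
    placement = {}
    for name, size_gb, duplicate in files:
        copies = 2 if duplicate else 1
        # Pick the emptiest disks; copies of one file never share a disk.
        targets = sorted(range(n_disks), key=lambda d: used[d])[:copies]
        for d in targets:
            used[d] += size_gb
        placement[name] = set(targets)
    return placement


def survivors(placement, failed_disk):
    """Names of files that still have a copy after one disk dies."""
    return sorted(
        name for name, disks in placement.items() if disks - {failed_disk}
    )


# Hypothetical contents: two things worth duplicating, two throwaways.
files = [
    ("photos", 50, True),
    ("tv-episode", 2, False),
    ("tax-records", 1, True),
    ("random-video", 4, False),
]
placement = place_files(files)
print(survivors(placement, failed_disk=0))
# ['photos', 'random-video', 'tax-records']
print(survivors(placement, failed_disk=1))
# ['photos', 'tax-records', 'tv-episode']
```

Either way the duplicated files survive, and the only throwaway file lost is the one that happened to live on the dead disk. With a stripe, both throwaways would be gone.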
There’s a known data corruption bug in the current release version of WHS, so I wouldn’t trust that. But there’s an update being tested at the moment (they’re up to RC4) which is supposed to fix that and improve write performance. That it also introduces an exciting new problem when the disks are approaching full is uninspiring but the development group seems to be aware that it’s a problem and planning to address it, so…
I’ll be grabbing the 2×1TB disks soonish and will at least give WHS a try. The fallback position is the “traditional UNIX” approach. I don’t see that I have anything but UNIX-geek cred to lose by trying Microsoft’s tool.
(Incidentally, this granular duplication model is something the Linux weenies miss when they ridicule WHS and start in on how people should just use a Linux box instead.)



Think maybe WHS is a rebadged Sun ZFS? Certainly _sounds_ a lot like it, warts and all, eh?
Nah. ZFS doesn’t do this sort of trick.
WHS is a modified Win2k3 Server with some extra bits to do the storage-fu. You don’t get one big filesystem, you get one filesystem (plus one for the OS) per physical disk and it keeps track of where it’s stashed the files by maintaining what sound like symlinks (they call them “tombstones”, which is very encouraging) on the “primary” disk.
The “tombstones” can be rebuilt if the primary disk fails, and if one of the other disks fails you only lose whatever data was on there and wasn’t duplicated. It’s a bit of a bodge job though and relies on you never accessing data directly, always going through the network-share layer (i.e., use the UNC paths, not the drive letters) even when you’re directly manipulating stuff on the machine.
ZFS does do some tricks with duplication but that’s within a single filesystem and with no guarantee of the data going to different physical disks. It’s more about guarding against disk blocks going bad than anything else.
Anthony: While ZFS will let you keep multiple copies of the same file, if you lose an entire vdev your pool is toast.
Western Digital 1TB disks $215/each @ iStore on Lonsdale St. Recent price surveying shows this place to be The Cheapest In Melbourne.
However, why not just buy 3 x 750 and do a RAIDZ? Same price, y’know. Actually, you might find 4 x 750 for the same price. The difference in power consumption is trivial.
Oh, and how do you boot your not-mirrored machine after disk0 fails?
Yeah, I figured the disks would be available cheaper than that.
When the boot disk dies, you get a new one and do a “recovery reinstall” from the OS media. That reinstalls the OS and then reconstructs its idea of where the data actually is. Not as good as a mirrored boot disk, but the goal isn’t maximum availability, it’s to have the data survive so it’s not a trade-off that bothers me (much).
I haven’t made a firm decision yet, but I do want to at least eval WHS. From reading up on it, it looks like it makes more efficient use of disk than even RAIDZ can, simply because it has that very granular redundancy going on. Most of what I’m storing doesn’t need duplication or even recoverability in the face of a disk dying, it’s just random video I’ll probably never watch a second time.
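Rough arithmetic behind that efficiency claim, as a sketch. The assumptions are mine: single-parity RAIDZ costs about one disk’s worth of space, filesystem overhead is ignored on both sides, and the 20% duplicated fraction is a guess rather than a measurement.

```python
def raidz_usable(n_disks, disk_gb):
    """Approximate usable space for single-parity RAIDZ:
    roughly one disk's worth of capacity goes to parity."""
    return (n_disks - 1) * disk_gb


def selective_dup_usable(raw_gb, dup_fraction):
    """Usable space when only dup_fraction of the stored data is kept
    twice: L GB of data needs L * (1 + dup_fraction) GB of raw disk."""
    if not 0 <= dup_fraction <= 1:
        raise ValueError("dup_fraction must be between 0 and 1")
    return raw_gb / (1 + dup_fraction)


print(raidz_usable(3, 750))                     # 1500
print(round(selective_dup_usable(2000, 0.2)))   # 1667
print(round(selective_dup_usable(2000, 1.0)))   # 1000
```

On those numbers 2×1TB with selective duplication comes out ahead of 3×750 RAIDZ as long as less than about a third of the data is duplicated. The comparison isn’t like for like, of course: RAIDZ protects everything against a single disk failure, where this only protects what was marked.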
FreeBSD can boot from ZFS! Which is actually better than Solaris can manage.
Me, what I have is a 3×500 Linux setup with a 3-way mirror for boot and RAID-5 for the rest.
Dave: Well, SXCE can boot from ZFS, and Solaris 10u6 will be able to boot from ZFS, and I know which one I’d want to back in a stability fight.
Frankly, that recovery strategy gives me the creeps. Disk is cheaper than time, effort and lost data. Still, can’t hurt to give anything a try, and it’s pretty easy to yank a power cord and zorch the contents of a disk to approximate a failure.