Saturday, August 11, 2007

Project #1 (Long, Geekish, Possibly Educational)

Alright, project #1 is multi-tasking.

1) Build & Test RAID 5 Array utilizing MDADM RAID Software under Linux. Take apart, go to #2
2) Build & Test (and perhaps set up for external exposure) RAID 1 under Redhat Antique, upgrading to utilize MDADM 2.6.2, which requires glibc 2.3
3) With spare HDD generated under 1 & 2, build text>voice PC for Dad, first under Linux, then under Windows XP or 2k.

Why, you ask, this rather complex approach? And what is RAID 1 or 5, and why do I care? And what's this about Linux? You're writing this on an XP box after all!

Well, the answer is complex. Suffice to say I need to first learn how to build a software-controlled RAID 5 array (I'll explain what that is in a minute) utilizing whatever flavor of Linux comes to hand (I like Ubuntu, a LOT - at least partly because it's an easy install on even moderately modern hardware), just so I know how and can say I've done it.

RAID translates to "Redundant Array of Inexpensive Disks" and comes in six main flavors with a variety of toppings; the main goal is to increase the speed at which the computer can get at stored data, and at the same time make that data safer (ideally) by hedging one's bets against a hard drive crash. It's an approach mainly found on servers, but with HDD prices coming down, a RAID 5 array can give one a warm/fuzzy/safe feeling when it comes to one's data - and while no substitute for regular/religious backups, it can get you back up and in business a great deal faster than restoring from tape/CD/DVD.

In other words, if you're a belt'n'suspenders sort who scatters fire extinguishers and firearms in safe and strategic locations throughout your life - RAID may just give you that same warm'n'fuzzy feeling about your data when combined with good backup practices ("I've done MY part, the rest is up to the Fates!")

"So what's this RAID thing you go on and on about?"

RAID Level 0 splits all your data as it is written and puts an equal portion on each disk. This is/was a speed trick: by sharing out the data being read/written across several hard drives' read/write heads, each head does less work, so the whole job gets done faster. Think of thirty folks writing "I will never use this" a total of a hundred times between them vs. one person writing the same sentence a hundred times - the group finishes the job first. Speeds things up, but if one drive crashes, every shred of data on the group of hard drives is gone forever - even if 9 out of 10 are just fine.
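For the code-minded, here's a toy sketch of that striping idea in Python. It's only an illustration: the "disks" are plain lists, and the `stripe`/`unstripe` names and the 4-byte chunk size are made up for the example, not anything a real RAID driver uses.

```python
def stripe(data, n_disks, chunk=4):
    """Deal fixed-size chunks of data out to the disks, round-robin."""
    disks = [[] for _ in range(n_disks)]
    for i in range(0, len(data), chunk):
        disks[(i // chunk) % n_disks].append(data[i:i + chunk])
    return disks


def unstripe(disks):
    """Read the chunks back in the same round-robin order."""
    out = []
    rows = max(len(d) for d in disks)
    for row in range(rows):
        for d in disks:
            if row < len(d):
                out.append(d[row])
    return b"".join(out)


sentence = b"I will never use this"
disks = stripe(sentence, n_disks=3)
restored = unstripe(disks)

# Simulate one drive crashing: what the survivors hold is only fragments.
crashed = [disks[0], [], disks[2]]
damaged = unstripe(crashed)
```

With all three "disks" present the sentence comes back whole; blank out any one of them and what's left is Swiss cheese, which is the whole problem with RAID 0.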

RAID Level 1 makes exact copies of data on several (2+) hard drives - if you have 1 primary disk, you have 1 secondary/mirror. If you have 10 primaries under this scheme, you have 10 secondaries. If a primary drive croaks, the secondary picks up the slack till you throw a replacement for the deader in, and then the "secondary" writes back to the new primary. Not cheap, due to the 1:1 ratio of hard drives.

RAID Level 2 is obsolete and seldom if ever used in the real world. It requires 39 drives to pull off (not 38, not 40). Not especially practical.

RAID Levels 3 and 4 take your data when it's being written and schmear it in equal portions, in what are called "stripes", across several hard drives, and keep a running checksum of that data (parity) on a special drive set aside for ONLY that kind of thing. Should a non-parity drive melt down (one of the ones with data on it), the parity drive, together with the surviving data drives, can rebuild the lost data. Downside is that this approach slows things down, as the parity is figured out and written when the data is written: more steps = more time. Bonus is that if you do have a data drive drop dead, and you've put a spare drive on in advance, the array can pick it up and roll it in as a replacement without your intervention - but don't forget to pull the deader and replace it with a shiny new spare.
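If you're wondering how a parity drive pulls off that rebuild trick, the usual scheme is XOR: the parity block is the XOR of the matching blocks on the data drives, so XOR-ing the survivors with the parity gives back whatever went missing. A minimal Python sketch, assuming three tiny 4-byte "drives" (the `xor_blocks` helper and the sample contents are invented for illustration):

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, position by position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three pretend data drives, each holding one 4-byte block.
data_drives = [b"AAAA", b"BBBB", b"CCCC"]

# The dedicated parity drive stores the XOR of all the data blocks.
parity = xor_blocks(data_drives)

# Drive 1 melts down; rebuild it from the other data drives plus parity.
survivors = [data_drives[0], data_drives[2], parity]
rebuilt = xor_blocks(survivors)
```

Here `rebuilt` ends up identical to the dead drive's contents. Lose two drives at once, though, and there isn't enough left to XOR your way back.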

RAID Level 5 is for big kids and kids that like really cool toys. It takes your data as it's being written and splats it AND the parity data across several hard drives (3+, all identical). Since it can read and write at the same time, speed picks up, and again, if one drive drops, life is good. Two or more drives dropping at once, and you're in deep kimchee with total data loss (unless you've been backing up regularly). A bonus on this approach is that you use multiple identical drives, and the storage you get boils out to "n-1" where n represents the total number of drives - thus, if I have 3 100GB HDD in a RAID 5, I effectively have 200GB of HDD storage available to me; conversely, if I'm flying at RAID 1, if I have two 100GB HDD, I've 100GB of effective HDD space. Ick.
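The storage arithmetic above is simple enough to sketch as a little Python helper. This is just back-of-the-envelope math under a few assumptions: all drives are identical, RAID 1 means one mirrored set where every drive holds the same copy, and the `usable_gb` name is made up for the example.

```python
def usable_gb(level, drives, size_gb):
    """Rough usable capacity for an array of identical drives."""
    if level == 0:
        # Pure striping: every drive holds unique data.
        return drives * size_gb
    if level == 1:
        # Mirroring: every drive holds the same copy.
        if drives < 2:
            raise ValueError("RAID 1 needs at least 2 drives")
        return size_gb
    if level in (3, 4, 5):
        # One drive's worth of space goes to parity: n - 1.
        if drives < 3:
            raise ValueError("parity RAID needs at least 3 drives")
        return (drives - 1) * size_gb
    raise ValueError("level not covered by this sketch")


raid5_space = usable_gb(5, drives=3, size_gb=100)  # three 100GB drives
raid1_space = usable_gb(1, drives=2, size_gb=100)  # two 100GB drives
```

That gives 200GB for the three-drive RAID 5 and 100GB for the two-drive mirror, matching the numbers above.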

The reason that RAID is cool at home or in other environments is that it's dependable (like all things, if done right), and in levels 1/3/4/5 (additional levels exist, I've just burbled enough) recovering from a dead drive is a whole bunch faster than restoring the hard way (from CD/DVD/tape/etc.), as you'll know if you've ever had to do it. And for the bad children among us who don't back up, RAID can save your bacon (not "will", but "can") if all goes well after a hard drive crash.

For just a nasty moment or two, think of what's on your PC - and how much fun it would be to lose it all. Now, think back-up and RAID and see how warm'n'fuzzy it leaves ya feeling.

Now as a less-evil techie guy, I *like* to test out stuff on my own gear before trying it on other folks. I tend to think this approach adds a certain...stability and peace of mind to my little world. Cuts down on the number of embarrassing surprises.

That brings us to building that Redhat Antique box. A buddy has an Antique box that he uses in his business - someone back in the late '90s built it for him and hung what was, for the time, a cool new thing called RAID 1 on it. It's getting old'n'sick, but he's asked me to limp it along and keep an eye on it for him. So...before setting my alleged evil genius loose, I've kidnapped an old PC of mine and am using it as a disposable box: if I succeed, I'll turn it into a back-up file server here at the house - but if I fail, I've only trashed my own equipment.

We'll see how it goes. G'nite all.
