5 THINGS: on Shared Storage Part 2: Drives, Size, Spindles, and Protection
Have you checked out the prior episode on shared storage? You should!
This 3 part series consists of:
Part 1: SANs, NASes, Bandwidth, and Connections.
Part 2: Drives: Size, Spindles & Protection (RAID)
Part 3: Management, Permissions & Support
On this episode, we’ll be continuing our exciting storage adventure and examining Drives: Size, Spindles & Protection (RAID), in part 2 of our 3 part series. We’ll be holding off on ODA, LTO and Cloud storage for a future episode. If you’ve missed part 1 of 5 THINGS on storage, I highly suggest you go back and get your tech on.
1. What do I need to know about drives?
At any trade show and any electronics store, you’re going to drown in a sea of storage options. Each of these solutions (and I use the term solution lightly), at its core, relies on the ability of a storage medium to receive, save, and deliver data as fast and as reliably as possible. But what separates the bargain bin drives from the enterprise ones? Let’s take a look at some of the factors.
The bigger the better, folks. More storage equals more space for your important media. 2, 3, and 4 TB drives are pretty common nowadays. What you need to be aware of is that a larger capacity does not necessarily mean faster storage. Four 1 TB drives striped together can deliver more throughput than a single 4 TB drive. Don’t let the size fool you.
Each drive I/O interface has its own quirks and its own thresholds, but for our discussion, its main limiting factor is how much data the connection allows to flow through it at any one time. SATA and SAS are the most common interfaces nowadays for single drives or small arrays. These interface types typically allow for more throughput than almost any single drive could ever deliver. Thus, a SAS or SATA connection is rarely your bottleneck, until you get into many “striped” drives (an array). Once you get into this realm, we move to other, more robust I/O interface solutions. However, as we discussed in Part 1, a fatter pipeline to and from the array may NOT be what you need.
Aside from capacity, the most common spec is RPM, or the rotational speed of the platters inside the hard drive. The faster the RPM, the better. Commonly, 5400 RPM, 7200 RPM, 10,000 RPM and 15,000 RPM are the speeds you’re most likely to come across. 5400 RPM drives are never recommended for media usage; the speed just isn’t there for a pleasurable editing experience. 7200 RPM is the generally accepted baseline for media usage.
Disk Buffer / Disk Cache
Next, we have the Disk Buffer – also known as the Disk Cache or Cache Buffer.
This is where data is stored on the drive temporarily before it’s read from or written to the spinning platters. This allows time for the disk to “catch up” if the requests for reading and writing cannot be met immediately, or if the data is being frequently accessed. Mainly, though, it makes reading from and writing to a drive more efficient and organized, and causes less wear and tear on the spinning platters. Bigger is usually better.
MTBF (Mean Time Between Failures)
Lastly, we have MTBF (Mean Time Between Failures). This is muy importante. The lower the MTBF, the less robust the drive is, compared to others with “enterprise” branding. This means the drive may fail earlier in its lifetime of expected use, compared to an enterprise-class drive with a higher MTBF. A higher MTBF also comes at a price premium – better parts, tighter tolerances, and stricter QC. If you want the steak, you gotta pay for it, lest you get ground chuck.
Now, for those of you who are using Solid State Drives or, as is the case with the new Mac Pro, Flash Memory, many of the terms we’ve discussed are not applicable. However, spinning disks are still the overwhelming choice for additional storage, due to their large capacity and lower price point. We’ll dive into SSDs on a future episode.
Many of these terms can be confusing, and many manufacturers have given various grades of drives consumer-friendly names to more easily get you the drive you need. Let’s examine one such naming scheme, the one used by Western Digital.
Western Digital uses colors to signify what drives are best used for each scenario.
Black drives are enterprise-class, are usually the fastest, and have the highest MTBF. They can operate in warmer temperatures (drive arrays can get toasty) and due to these extras, are the most expensive in the bunch. While I use Western Digital as an example, drives with specs which are similar to WD’s Black classification are what you WANT in a mass storage solution. These are not your father’s hard drives, and are typically NOT the ones you see in the weekly electronics flyer. You need to seek them out. These are excellent as single drives and also work well in a striped environment.
2. How do I house my drives?
Now, we need a chassis to house the drives. There is a sea of options out there. However, when you buy a shared storage SOLUTION, the solution provider (manufacturer or vendor) has already factored in all of these drive variables and incorporated them into building or tweaking a drive chassis in order to create a turnkey package *just* for you. More generic chassis that come without drives are somewhat less optimized, so as to minimize compatibility issues. Normally, creating your own JBOD (Just a Bunch Of Drives) involves using these universal chassis, and is typically not the first choice in a demanding production atmosphere.
These solutions, if done right, rely on a battle-tested combination of hardware components. Off the shelf components, put together because the cables fit, will never deliver the performance a tuned system can. If possible, avoid product scattershot.
3. What is a RAID and why should I care?
RAID, RAID, RAID. Oh, how you complicate thee.
A RAID is a Redundant Array of Independent (sometimes Inexpensive) Disks. It’s a way of combining multiple physical drives to appear as 1 volume to your computer. More disk spindles RAIDed together equates to more options for performance, throughput, and data redundancy.
Take this scenario: suppose a drive fails once in 1 million operations (think of that as its MTBF). We don’t know *when* it will happen, we just know it probably will – and before 1 million total operations. When this happens is up to chance, environment, and usage. Now, let’s say we RAID two drives together as one because that would yield twice as much space and speed. This obviously increases the risk of a failure, since either drive can be the one that dies. Plus, if ONE (yes, ONE) drive starts to smoke – you’ve lost all of your data because you striped it across the disks. You can’t edit with half of every bit and byte gone. Given this truth, now multiply this by 4 drives. How about 16 drives or more? Russian Roulette, geek-style. Combining drives in this manner is known as RAID 0. This RAID level delivers the largest amount of throughput and capacity, but absolutely no support for preserving data if one drive dies. This absolutely bites when it comes to DATA AVAILABILITY.
The loss of one drive in a RAID 0 array could be a massive problem for the video editor (you just lost the entire movie, no problem… right?!?).
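To put rough numbers on the Russian Roulette, here is a minimal sketch. It assumes every drive fails independently and with the same probability over some period, which is a simplification, and the 3% figure is a made-up illustration rather than a measured failure rate. The function name is mine, not anything from a vendor tool.

```python
def raid0_loss_probability(per_drive_failure: float, drives: int) -> float:
    """Chance that at least one drive in a RAID 0 stripe fails.

    Assumes every drive fails independently with the same probability
    over the period in question (a simplification).
    """
    if not 0.0 <= per_drive_failure <= 1.0:
        raise ValueError("per_drive_failure must be between 0 and 1")
    if drives < 1:
        raise ValueError("drives must be at least 1")
    # A RAID 0 array survives only if every single drive survives.
    return 1.0 - (1.0 - per_drive_failure) ** drives


if __name__ == "__main__":
    ASSUMED_FAILURE_RATE = 0.03  # hypothetical 3% per drive, illustration only
    for count in (1, 2, 4, 16):
        risk = raid0_loss_probability(ASSUMED_FAILURE_RATE, count)
        print(f"{count:>2} drives striped: {risk:.1%} chance of losing everything")
```

Whatever per-drive number you plug in, the risk of losing the whole array climbs with every drive you add to the stripe.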
Since this is not acceptable in most circumstances, other RAID data-protection and performance formats have been developed that ensure there is some REDUNDANT distribution of data across multiple disks. While there are as many as the day is long, let’s examine those you will probably find out in the wild when dealing with video shared storage solutions:
If RAID 0 doubles your storage, then conversely, RAID 1 cuts the cumulative size in half. Why? RAID 1 (AKA “mirroring”) ensures that if the rapture comes and half of all of the drives in your array blow chunks, you don’t lose any of your data, because a 1:1 copy of the data has been made on that extra storage. This can cause a slight hit in throughput (after all, your data is being written twice), and as outlined, a massive hit in storage space. Those of you who have used Avid’s Unity (not ISIS) have used RAID 0 or RAID 1 for years – it’s all Unity supported. Most other shared storage solutions also offer RAID 0 or RAID 1.
RAID 2, 3, & 4
Outdated, or rendered unnecessary by RAID 5. Move along. BTW, whatever happened to Leonard Parts 1-5?
RAID 5 is probably the most popular RAIDset out there. It balances throughput and redundancy, with minimal overhead. This is achieved through parity. What is parity, you ask? When data is written, parity data is calculated and written with your video data across the drives. In the event a drive fails, this parity data is combined with the surviving media to recreate the lost media. As one can imagine, this decreases performance slightly: not only while the initial parity data is created and written, but also if a drive dies, because the array needs to recreate the media for usage in real time. All this being said, RAID 5 allows the speed benefit afforded by multiple drives acting as one, along with good redundancy. As a bonus, if a drive dies, most shared storage chassis can rebuild the lost media once a correct drive is inserted into the chassis to replace the dead drive – restoring your array to its former glory. Just give it until tomorrow; it usually takes a bit of time and you will see a modest performance hit during the rebuild process. But hey, it’s not gone!
RAID 6 is very similar to RAID 5, although the user has one more guard at the gate: 2 drives’ worth of parity are written, instead of 1. If a drive does die, there is still 1 drive’s worth of parity keeping the array protected, so it can ride out a second failure. The performance hit is in the same ballpark as RAID 5, and the storage hit is a bit bigger because of the extra parity. RAID 6 is slightly less common when seeking out Video RAIDs.
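If parity sounds like magic, here is a minimal sketch of the idea using XOR, which is the operation single-parity RAID is built on. This is a toy: the drive contents are made-up 4-byte chunks, the helper name is mine, and a real RAID 5 controller rotates the parity across all the drives rather than parking it on one.

```python
from functools import reduce


def xor_blocks(blocks: list[bytes]) -> bytes:
    """XOR equal-length byte blocks together, column by column."""
    if not blocks:
        raise ValueError("need at least one block")
    if len({len(block) for block in blocks}) != 1:
        raise ValueError("all blocks must be the same length")
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


if __name__ == "__main__":
    # Three hypothetical data drives, each holding one chunk of a stripe.
    data = [b"VID1", b"VID2", b"VID3"]
    parity = xor_blocks(data)  # stored on a fourth drive in this toy layout

    # The second drive dies: rebuild its chunk from the survivors plus parity.
    survivors = [data[0], data[2], parity]
    rebuilt = xor_blocks(survivors)
    print("rebuilt chunk matches the lost one:", rebuilt == data[1])
```

The rebuild has to read every surviving drive to recreate the missing chunk, which is why an array feels slower while it is running degraded or rebuilding.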
I usually ballpark a 12-20% hit on storage space AND throughput for your shared storage solution to handle a RAID 5. This varies by manufacturer, but the 12-20% plays heavily into my storage formula at the end of this article.
I know; no one wants to lose space, but it’s better than losing half of your space with RAID 1 or having no redundancy, a la RAID 0.
Other less popular RAID formats include RAID 0+1, RAID 1+0 (AKA RAID 10), RAID 0+3, RAID 3+0, etc. Consult your local storage geek if you *really* want to delve into these.
It should be noted that a RAIDSET can be created at either the hardware level or at the software level. In Windows or on OS X, for example, you can RAID drives (usually RAID 0 or 1, rarely RAID 5) from within the OS. This is a software RAID, and while it works, it’s usually not as fast or bullet-proof as a hardware RAID 0 or 1, which (if available) is done on the chassis which contains your drives or the host card in your computer. Most hardware RAID controllers are designed specifically for RAID 1, 5 and 6. Hardware RAID is generally faster than software RAID at managing the layout of data and parity bits.
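For a side-by-side look at the space trade-off, here is a rough sketch of the textbook capacity math for the RAID levels covered above. It assumes identical drives, treats RAID 1 as mirrored pairs, and ignores any formatting or vendor overhead, so real products will land a bit lower. The function name and the 8 x 4 TB example are mine.

```python
def usable_capacity_tb(level: int, drives: int, drive_tb: float) -> float:
    """Textbook usable capacity for an array of identical drives."""
    if level == 0:
        minimum, data_drives = 2, drives        # striping, nothing held back
    elif level == 1:
        minimum, data_drives = 2, drives / 2    # every drive has a mirror
        if drives % 2:
            raise ValueError("this sketch mirrors drives in pairs")
    elif level == 5:
        minimum, data_drives = 3, drives - 1    # one drive's worth of parity
    elif level == 6:
        minimum, data_drives = 4, drives - 2    # two drives' worth of parity
    else:
        raise ValueError(f"RAID {level} is not covered by this sketch")
    if drives < minimum:
        raise ValueError(f"RAID {level} needs at least {minimum} drives")
    return data_drives * drive_tb


if __name__ == "__main__":
    for level in (0, 1, 5, 6):
        capacity = usable_capacity_tb(level, drives=8, drive_tb=4.0)
        print(f"RAID {level}: {capacity:.0f} TB usable out of 32 TB raw")
```

Note that single parity costs one drive out of the set, so the percentage you give up shrinks as the array grows: one drive in five is 20%, one drive in eight is 12.5%.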
4. How much storage do I need?
Now for my patent pending formula.
Let’s take a 1 TB drive. Small, and easy on my math-challenged brain.
As you probably know, a marketed 1TB does not equate to 1TB of usable storage. This loss is due to the drive maker counting in base 10 while your computer counts in base 2. A marketed terabyte is a trillion bytes, but the computer divides by 1,024 at each step instead of 1,000. Thus, when you work your way up through bytes, kilobytes, megabytes, gigabytes, etc… you end up with less than 1TB. And of course, marketing wins: 1TB is easier to sell than 930GB. So, we’re saddled with a 7% loss. Keep that number written down.
We now need to initialize the drives and format them into a RAID. Let’s say we go with RAID 5, the best balance of speed and redundancy. RAID 5 in hardware can mean a 12-20% loss of space due to the aforementioned redundancy. Again, this is different for each manufacturer, so no need for the math hate mail. Let’s use 15%, and subtract that from 930GB. This comes out to approx. 790GB. So, now we’re down 210GB from the advertised size.
Performance (throughput, in this case) can decrease as the drive fills up, if the data is written sequentially on the disk(s). Some shared storage manufacturers (Facilis comes to mind) scatter the data across the drives, so a user never sees a performance hit, as performance is equal regardless of the amount of free space. That’s in the minority, so the magic number before a noticeable loss in throughput seems to be around 80% full. Thus, we subtract another 20% from the 790GB. This comes out to 632GB.
That’s right. Of that shiny new 1TB drive, once introduced into a RAID 5 RAIDSET, and given some room for performance, we have a space loss of roughly 37%. Call it nearly 40%.
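Here is where the 930GB comes from, as a quick sketch. It assumes the drive maker counts a terabyte as 10^12 bytes and that your computer reports capacity in base 2 units (2^30 bytes per "GB"); some operating systems report in base 10 instead, in which case you won't see this gap. The function name is mine.

```python
def marketed_tb_to_reported_gb(marketed_tb: float) -> float:
    """Convert a marketed (base 10) terabyte figure to base 2 'GB'."""
    total_bytes = marketed_tb * 10**12  # drive makers: 1 TB = 10^12 bytes
    return total_bytes / 2**30          # base 2 reporting: 1 "GB" = 2^30 bytes


if __name__ == "__main__":
    reported = marketed_tb_to_reported_gb(1.0)
    loss = 1 - reported / 1000
    print(f"1 TB marketed shows up as about {reported:.0f} GB ({loss:.1%} less)")
```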
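The whole formula fits in a few lines if you want to plug in your own numbers. This is a minimal sketch using the three deductions from above as defaults: 7% for base 2 math, 15% for RAID 5, and 20% of headroom for performance. The function and parameter names are mine, and the percentages are ballparks that vary by manufacturer.

```python
def usable_storage_gb(
    marketed_gb: float,
    base2_loss: float = 0.07,     # marketed size vs. what the computer reports
    raid_overhead: float = 0.15,  # RAID 5 ballpark, somewhere in the 12-20% range
    headroom: float = 0.20,       # free space kept so throughput doesn't sag
) -> float:
    """Apply the three deductions in order and return what is left."""
    after_formatting = marketed_gb * (1 - base2_loss)
    after_raid = after_formatting * (1 - raid_overhead)
    return after_raid * (1 - headroom)


if __name__ == "__main__":
    marketed = 1000.0  # the 1 TB drive from the example
    usable = usable_storage_gb(marketed)
    lost = 1 - usable / marketed
    print(f"{marketed:.0f} GB marketed leaves about {usable:.0f} GB ({lost:.0%} lost)")
```

To size a purchase, work it backwards: divide the space you actually need by what is left over per marketed terabyte.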
5. What solutions do you recommend?
I should preface this again with the disclaimer that these storage solutions are uninfluenced decisions based on my experience.
As I mentioned earlier, I prefer the one-throat-to-choke philosophy when choosing solutions, especially storage. Thus, on a desktop level, I am a fan of G-Tech and Sonnet, with CalDigit as a distant 3rd. All are relatively inexpensive, have decent support, and are readily available.
I’m often asked about Drobo, Synology, Qnap and other, more generic storage. I like these solutions when ease of use is paramount, you’re looking for storage that can be used in many different scenarios, and you can tolerate less than optimal performance. These solutions tend to sacrifice peak performance and Quality of Service for ease of use and for trying to work in every use case, not just the media space. They typically don’t have management software to handle multiple editors using the same media at the same time, which can cause havoc when projects and media are shared. That being said, I’ve had a Synology unit at home for several years, and use it on a daily basis as a home media server, and as a repository for high res media files when I’m onlining.
Once we get into the more enterprise storage solutions, the playing field fills up quickly. To complicate things further, many of the factors in choosing these storage solutions rely on the software or sharing ability on top of the drives we’ve just spent time discussing… so, we’ll save that for the next episode.
Did I miss anything about Size, Spindles & Protection? Have another Red ale to recommend? Let me know in the comments, via Twitter, or on Facebook, and please, share this series with your friends. Stay tuned for Part 3: Management, Permissions & Support. Same bat time, same bat channel.
As always, thanks for watching.