With predictable regularity someone surfaces on the Web, claiming they have discovered a way to turn slow SATA arrays into high performance storage. Their method usually involves adding complex and sophisticated software to reallocate and optimize system resources. While there may be a few circumstances where this might work, in reality it is usually just the opposite.
The problem with this concept is similar to the kit car world several decades ago. At the time, kit-build sports cars were all the rage. Automobile enthusiasts were intrigued by the idea of building a phenomenal sports car by mounting a sleek fiberglass body on the chassis of a humble Volkswagen Beetle. Done properly, the results were amazing! As long as their workmanship was good, the end results would rival the appearance of a Ferrari, Ford GT-40, or Lamborghini!
However, this grand illusion disappeared the minute its proud owner started the engine. Despite its stunning appearance, the kit car was still built on top of an anemic VW bug chassis, power train, and suspension!
Today we see a similar illusion being promoted by vendors claiming to offer “commodity storage” capable of delivering the same high performance as complex SAN and NAS systems. Overly enthusiastic suppliers push the virtues of cheap “commodity” storage arrays with amazing capabilities as a differentiator in this highly competitive market. The myth is perpetuated within the industry by a general lack of understanding of the underlying disk technology characteristics, and a desperate need to manage shrinking IT budgets, coupled with a growing demand for storage capacity.
According to this technical fantasy, underlying hardware limitations don’t count. In theory, if you simply run a bunch of complex software functions on the storage array controllers, you somehow repeal the laws of physics and get “something for nothing”.
That sounds appealing, but it unfortunately just doesn’t work that way. Like the kit car’s Achilles heel, hardware limitations of underlying disk technology govern the array’s capabilities, throughput, reliability, scalability, and price.
• Drive Latencies – the inherent latency incurred to move read/write heads and rotate disks until the appropriate sector address is available can vary significantly.
For example, comparing performance of a 300GB, 15K RPM SAS disk to a 3TB 7200 RPM SATA disk produces the following results:
• Controller Overhead – Masking SATA performance by adding processor capabilities may not be the answer either. Call it what you will – Controller, SP, NAS head, or something else. A storage controller is simply a dedicated server performing specialized storage operations. This means controllers can become overburdened by loading multiple sophisticated applications on them. More complex processes also mean the controller consumes additional internal resources (memory, bandwidth, cache, I/O queues, etc.). As real-time capabilities like thin provisioning, automated tiering, deduplication and data compression applications are added, the array’s throughput will diminish.
• “Magic” Cache – This is another area where lots of smoke-and-mirrors can be found. Regardless of the marketing hype, cache is still governed by the laws of physics and has predictable characteristics. If you put large amounts of cache in front of slow SATA disk, your systems will run really fast – as long as requested data is already located in cache. When it isn’t, you must go out to slow SATA disk and utilize the same data retrieval process as every disk access. The same is true when cache is periodically flushed to disk to protect data integrity. Cache is a great tool that can significantly enhance the performance of a storage array. However, it is expensive, and will never act as a “black box” that somehow makes slow SATA disk perform like 15K RPM SAS disks.
• Other Differences – Additional differentiators between “commodity storage” and high performance storage include available I/Os per second, disk latency, RAID level selected, IOPS per GB capability, MTBF reliability, and the Bit Error Rate.
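The latency and cache points in the list above can be sketched with a little arithmetic. The following is a rough illustration, not a benchmark: average rotational latency is half of one revolution, and the effective access time behind a cache is a weighted average of cache and disk access times. The seek times and the cache access time are assumed, illustrative values (they do not come from any vendor data sheet), and the function names are hypothetical.

```python
# Rough sketch: rotational latency and the effect of cache hit ratio.
# SEEK_MS and CACHE_MS are ASSUMED illustrative values, not measured data.

RPM = {"15K SAS": 15_000, "7200 SATA": 7_200}
SEEK_MS = {"15K SAS": 3.5, "7200 SATA": 8.0}  # assumed average seek times
CACHE_MS = 0.1                                  # assumed cache access time


def avg_rotational_latency_ms(rpm: float) -> float:
    """Average rotational latency in ms: half of one full revolution."""
    return 0.5 * 60_000.0 / rpm


def disk_access_ms(drive: str) -> float:
    """Assumed average seek time plus average rotational latency."""
    return SEEK_MS[drive] + avg_rotational_latency_ms(RPM[drive])


def effective_access_ms(hit_ratio: float, cache_ms: float, disk_ms: float) -> float:
    """Weighted average access time for a given cache hit ratio (0.0 to 1.0)."""
    return hit_ratio * cache_ms + (1.0 - hit_ratio) * disk_ms


if __name__ == "__main__":
    for drive in RPM:
        print(f"{drive}: ~{disk_access_ms(drive):.2f} ms per random access")
    for hit_ratio in (0.99, 0.90, 0.50):
        sata = effective_access_ms(hit_ratio, CACHE_MS, disk_access_ms("7200 SATA"))
        print(f"SATA behind cache, {hit_ratio:.0%} hits: ~{sata:.2f} ms")
```

Under these assumptions, every cache miss still pays the full mechanical cost of the SATA drive, so the average access time climbs quickly as the hit ratio falls.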
When citing the benefits of “tricked out” commodity storage, champions of this approach usually point to obscure white papers written by social media providers, universities, and research labs. These may serve as interesting reading, but seldom have much in common with production IT operations and “the real world”. Most universities and research labs struggle with restricted funding, and must turn to highly creative (and sometimes unusual) methods to achieve specific functions from less-than-optimal equipment. Large social media providers seldom suffer from budget constraints, but create non-standard solutions to meet highly specialized, stable, and predictable user scenarios. This may illustrate an interesting use of technology, but it has little value for mainstream IT operations.
As with most things in life, “you can’t get something for nothing”, and the idea of somehow enhancing commodity storage to meet all enterprise data requirements is no exception.
It’s hard to retire a perfectly good storage array. Budgets are tight, there’s a backlog of new projects in the queue, people are on vacation, and migration planning can be difficult. As long as there is not a compelling reason to take it out of service, it’s far easier to simply leave it alone and focus on more pressing issues.
While this may be the path of least resistance, it can come at a high price. There are a number of good reasons why upgrading storage arrays to modern technology may yield superior results and possibly save money too!
Capacity – When your aging disk array was installed several years ago, 300 GB, 10K RPM, FC disk drives were mainstream technology. It was amazing to realize you could squeeze up to 45 TB in a single 42U equipment rack! Times have changed. The same 10K RPM disk drive has tripled in capacity, providing 900 GB in the same 3.5 inch disk drive “footprint”. It’s now possible to get 135 TB (three times the capacity) into the same equipment rack configuration. Since data center rack space currently costs around $3000 per month, that upgrade alone will dramatically increase capacity without incurring any increase in floor-space cost.
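The capacity arithmetic above can be checked with a short sketch. It assumes decimal units (1 TB = 1000 GB), raw capacity with no RAID or spare overhead, and the same number of drive slots before and after the upgrade; the function names are hypothetical.

```python
# Sketch of the rack capacity arithmetic, using the figures quoted in the text.
# Assumes decimal units, raw (unprotected) capacity, and an unchanged slot count.

RACK_COST_PER_MONTH = 3000.0  # dollars, figure quoted in the text
OLD_DRIVE_GB = 300
NEW_DRIVE_GB = 900
OLD_RACK_TB = 45


def drives_per_rack(rack_tb: int, drive_gb: int) -> int:
    """Number of drives needed to reach the stated rack capacity."""
    return rack_tb * 1000 // drive_gb


def rack_capacity_tb(drives: int, drive_gb: int) -> float:
    """Raw rack capacity in TB for a given drive count and drive size."""
    return drives * drive_gb / 1000


def floor_cost_per_tb_month(rack_tb: float) -> float:
    """Monthly floor-space cost per TB for one rack."""
    return RACK_COST_PER_MONTH / rack_tb


if __name__ == "__main__":
    slots = drives_per_rack(OLD_RACK_TB, OLD_DRIVE_GB)
    new_rack_tb = rack_capacity_tb(slots, NEW_DRIVE_GB)
    print(f"Drive slots per rack: {slots}")
    print(f"Capacity with {NEW_DRIVE_GB} GB drives: {new_rack_tb:.0f} TB")
    print(f"Floor cost per TB per month, old: ${floor_cost_per_tb_month(OLD_RACK_TB):.2f}")
    print(f"Floor cost per TB per month, new: ${floor_cost_per_tb_month(new_rack_tb):.2f}")
```

On these figures the rack holds three times the data for the same monthly floor-space charge, so the floor-space cost per TB falls to one third.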
Density – Previous generation arrays packaged from (12) to (15) 3.5 inch FC or SATA disk drives into a single rack-mountable 4U array. Modern disk arrays support from (16) 3.5 inch disks per 3U tray, to (25) 2.5 inch disks in a 2U tray. Special ultra-high density configurations may house up to (60) FC, SAS, or SATA disk drives in a 4U enclosure. As above, increasing storage density within an equipment rack significantly increases capacity while requiring no additional data center floor-space.
Energy Efficiency – Since the EPA’s IT energy efficiency study in 2007 (Report to Congress on Server and Data Center Energy Efficiency, Public Law 109-431), IT manufacturers have increased efforts to improve the energy efficiency of their products. This has resulted in disk drives that consume from 25% to 33% less energy, and storage array controllers lowering power consumption by up to 30%. That has had a significant impact on energy costs, including not only the power to run the equipment, but also power to operate the cooling systems needed to purge residual heat from the environment.
Controller Performance – Storage array controllers are little more than specialized servers designed specifically to manage such functions as I/O ports, disk mapping, RAID and cache operations, and execution of array-centric internal applications (such as thin provisioning and snapshots). Like any other server, storage controllers have benefited from advances in technology over the past few years. The current generation of disk arrays contains storage controllers with 3 to 5 times the processing power of their predecessors.
Driver Compatibility – As newer technologies emerge, their developers tend to focus on software compatibility with the most recently released products and systems on the market. With the passage of time, it becomes less likely that an aging storage array will be supported by the latest technology. This may not impact daily operations, but it creates challenges when a need arises to integrate aging arrays with state-of-the-art systems.
Reliability – Common wisdom used to be that disk failure characteristics could be accurately represented by a “bathtub graph”. The theory was that the potential for failure was high when a disk was new. It then flattened out at a low probability throughout the disk’s useful life, then took a sharp upswing as it approached end-of-life. This model implied that extending disk service life had no detrimental effects until the disks approached end-of-life.
However, over the past decade, detailed studies by Google and other large organizations with massive disk farms have proven the “bathtub graph” model incorrect. Actual failure rates in the field indicate the probability of a disk failure increases by 10% – 20% for every year the disk is in service. In other words, the probability of failure rises in a roughly linear fashion over the disk’s service life. Extending disk service life therefore greatly increases the risk of disk failure.
Service Contracts – Many popular storage arrays are covered by standard three-year warranties. This creates a dilemma, since the useful service life of most storage equipment is considered to be either four or five years. When the original warranty expires, companies must decide whether to extend the existing support contract (at a significantly higher cost), or transition to a time & materials basis for support (which can result in some very costly repairs).
Budgetary Impact – For equipment like disk arrays, it is far too easy to fixate on replacement costs (CAPEX), and ignore the ongoing cost of operational expenses (OPEX). This may avoid large upfront expenditures, but it slowly bleeds the IT budget to death by having to maintain increasingly inefficient, fault-prone, and power-hungry equipment.
The solution is to establish a program of rolling equipment replenishment on a four- or five-year cycle. By regularly upgrading 20% to 25% of all systems each year, the IT budget is more manageable, equipment failures are controlled, and technical obsolescence remains in check.
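As a rough illustration of the rolling replacement idea, the sketch below spreads a fleet evenly across the replacement cycle, so that a four-year cycle replaces 25% of the systems each year and a five-year cycle replaces 20%. The fleet sizes are hypothetical, and the function names are illustrative only.

```python
# Sketch of a rolling refresh plan: spread replacements evenly over the cycle.
# Fleet sizes used below are hypothetical examples.
from typing import List


def annual_refresh_fraction(cycle_years: int) -> float:
    """Fraction of the fleet replaced each year for a given cycle length."""
    return 1.0 / cycle_years


def refresh_schedule(fleet_size: int, cycle_years: int) -> List[int]:
    """Units to replace in each year so the whole fleet turns over once per cycle.

    Any remainder is front-loaded into the earliest years.
    """
    base, extra = divmod(fleet_size, cycle_years)
    return [base + (1 if year < extra else 0) for year in range(cycle_years)]


if __name__ == "__main__":
    for cycle in (4, 5):
        share = annual_refresh_fraction(cycle)
        print(f"{cycle}-year cycle: {share:.0%} per year, "
              f"40 arrays -> {refresh_schedule(40, cycle)}")
```

Spreading replacements this way keeps the annual capital outlay roughly constant and caps the age of the oldest system at the cycle length.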
Getting rid of familiar things can be difficult. But unlike your favorite slippers, the LazyBoy recliner, or your special coffee cup, keeping outdated storage arrays in service well beyond their prime can cost your organization plenty.
Over the past couple of years we’ve heard an enthusiastic debate over whether Tape is dead, obsolete, or simply relegated to some secondary role like regulatory compliance or litigation response. A lot of the “uproar” has been caused by highly vocal deduplication companies marketing their products by creating fear, uncertainty, and doubt (FUD) among users. And why not, since most of these vendors do not have a significant presence in the tape backup market and therefore little to lose?
However, there are companies with legitimate concerns about managing their backup process and the direction they should take. They need to understand what the facts are and why one approach would be superior to another.
With this goal in mind, let’s take a look at two popular backup & recovery devices – the LTO-5 tape drive and the 3.0 TB SATA disk, and see how they compare.
|Specification||LTO-5 tape drive||3.0 TB SATA disk|
|Average drive cost-per-unit||$1,450||$495|
|Typical media cost-per-unit||$54 per tape||N/A|
|Native formatted capacity||1500 GB||3000 GB|
|Native sustained transfer rate||140 MB/s||155 MB/s|
|Data buffer size||256 MB||64 MB|
|Average file access time||56 sec.||12.16 ms|
|Interfaces available||6 Gb/s SAS||6 Gb/s SAS|
|Typical duty cycle||8-hrs/day||24-hrs/day|
|Encryption||AES 256-bit encryption||AES 256-bit encryption|
|Power consumption – Idle||6.7 Watts||7.4 Watts|
|Power consumption – Typical||23.1 Watts||11.3 Watts|
|Drive MTBF||50,000 hours at 100% duty cycle||1,200,000 hours|
|Non-recoverable Error Rate||1 in 1 × 10^17 bits||1 sector per 1 × 10^15 bits|
|5-year total for 1.0 Petabyte||$37,495||$165,330|
No surprises here. Simply doing the math indicates the cost to store 1 Petabyte of data for 5 years would be more than four times higher on spinning disk than on tape media. Granted, there are other factors involved in the process, but most offset each other. Both a tape library and a disk array take data center floor space and infrastructure resources. Both consume power and require cooling. Each system must be managed by skilled IT specialists. Deduplication may reduce disk capacity requirements (reducing cost) but so will tape compression and/or increasing the tape drive’s duty cycle from 8 to 12 hours per day. Surprisingly, the only major variable over time is the cost of the media, which is heavily weighted in favor of tape.
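The math behind the 1 Petabyte comparison can be approximated as follows. This is a sketch under stated assumptions: a decimal Petabyte (1,000,000 GB), native (uncompressed, non-deduplicated) capacity, a single tape drive, and hardware and media costs only. With those assumptions the disk total reproduces the table’s $165,330. The tape total comes out at $37,468, slightly below the table’s $37,495, so the original calculation presumably rounded differently. The function names are hypothetical.

```python
# Sketch of the 1 PB media-cost comparison using the prices in the table.
# Assumes a decimal petabyte, native capacity, one tape drive, and no
# power, floor-space, or staffing costs.
import math

PETABYTE_GB = 1_000_000


def media_units(total_gb: int, unit_gb: int) -> int:
    """Whole tapes or disks needed to hold total_gb at native capacity."""
    return math.ceil(total_gb / unit_gb)


def tape_total_cost(total_gb: int, drive_price: int = 1450,
                    tape_price: int = 54, tape_gb: int = 1500,
                    drives: int = 1) -> int:
    """Tape drive(s) plus enough cartridges to hold total_gb."""
    return drives * drive_price + media_units(total_gb, tape_gb) * tape_price


def disk_total_cost(total_gb: int, disk_price: int = 495,
                    disk_gb: int = 3000) -> int:
    """Enough SATA disks to hold total_gb (the drive is the media)."""
    return media_units(total_gb, disk_gb) * disk_price


if __name__ == "__main__":
    tape = tape_total_cost(PETABYTE_GB)
    disk = disk_total_cost(PETABYTE_GB)
    print(f"Tape: ${tape:,}  Disk: ${disk:,}  Ratio: {disk / tape:.1f}x")
```

Changing the default prices or capacities makes it easy to see how larger disks or a newer tape generation would shift the ratio.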
In the foreseeable future the 4TB SATA disk will make the above calculations somewhat more favorable for the disk drive. However, we expect to see the LTO-6 tape drive in production in the second half of 2012, increasing the tape drive’s sustained transfer rate by 30% and tape media capacity by 47%. This will bring the above tape vs. disk comparison back into close alignment.
The sensible strategy is to develop a backup and recovery system that incorporates both technologies, to capitalize on the strengths of both. Using disk “pools” to aggregate nightly backups (whether deduplicated or not) ensures backup windows can be met, and greatly improves data restoration time. Backing up directly to tape from the “disk pools” allows streaming data to be sustained for maximum performance and transfers data to the lowest-cost media available for long-term archiving, disaster recovery, regulatory compliance, and litigation response.
It’s time to put this argument to bed. Both tape drives and SATA disk should play a role in a well-designed, highly optimized backup and recovery system. The “war” is over, and for once both combatants won!