10.14
Disclaimer: I work for a company that offers cloud services.
Henry Newman’s October 9th article at EnterpriseStorageForum.com wants to steer you away from cloud storage for reasons I don’t grasp. It’s filled with numbers and doomsday scenarios, but what it doesn’t say is that these numbers and scenarios apply equally whether you’re looking at cloud storage or a traditional, on-premises storage solution. I have several problems with the article. Let’s start with this statement about cloud storage’s potential:
“There are two reasons for this [limited potential]: bandwidth limitations and the data integrity issues posed by the commodity drives that are typically used in cloud services.”
This is followed by three tables about disk error rates with all kinds of numbers. Disk error rates are what they are. Until manufacturers build more reliable devices, the only thing you can do is build in redundancy and error checking to compensate. This affects cloud storage and traditional solutions alike, so it’s really a non-factor. Oh yeah, he does point out that commodity (consumer-grade) drives tend to have worse hard error rates and, thus, higher potential for catastrophic loss. There’s no arguing this. What I will argue with is his “commodity drives typically used in cloud services.” Yes, some providers use commodity drives to keep costs low. Other cloud providers use enterprise-grade drives. Some enterprise storage vendors use commodity drives in their enterprise products, so is it fair for me to say “commodity drives typically used in enterprise storage”? Let’s get real here: the devil is in the details. Talk to your storage provider, be it an “enterprise” vendor or a cloud vendor. Make informed decisions based on fact, not conjecture.
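For a sense of scale on the error-rate point, here is a back-of-envelope sketch. The two ratings are assumed, datasheet-style values (one unrecoverable read error per 10^14 bits for a consumer drive, one per 10^15 bits for an enterprise drive). They are not taken from the article’s tables, and the function name is just for illustration.

```python
def expected_read_errors(bytes_read: float, bits_per_error: float) -> float:
    """Expected unrecoverable read errors when reading `bytes_read` bytes from
    drives rated at one error per `bits_per_error` bits read."""
    return bytes_read * 8 / bits_per_error


PETABYTE = 10**15  # decimal petabyte, in bytes

# Assumed ratings for illustration only; check your vendor's datasheet.
CONSUMER_BITS_PER_ERROR = 1e14
ENTERPRISE_BITS_PER_ERROR = 1e15

if __name__ == "__main__":
    for label, rating in (("consumer", CONSUMER_BITS_PER_ERROR),
                          ("enterprise", ENTERPRISE_BITS_PER_ERROR)):
        errors = expected_read_errors(PETABYTE, rating)
        print(f"{label}: about {errors:.0f} expected errors per petabyte read")
```

Under these assumptions neither number is zero, which is the point: whoever runs the drives, cloud provider or in-house team, has to layer redundancy and checksums on top. The arithmetic doesn’t care where the drive is racked.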
Here’s another quote from the article: “one of the biggest issues is simply that hardware is going to break.” How is this going to limit cloud storage’s potential for enterprises? I’m sure he doesn’t mean to suggest that a traditional, on-premises enterprise storage environment has hardware that doesn’t break. Beyond that, I’m not sure why this is even brought up.
Okay, let’s look at these “bandwidth limitations.” I’m still trying to figure this one out. How is bandwidth limiting for cloud and not for a traditional solution? SAN fabrics have limited bandwidth just like the Internet. His “point” is made using a worst-case scenario – a complete rebuild of a large data set. Yes, bandwidth is going to be a huge bottleneck – for both cloud and traditional solutions. If you’re trying to rebuild a data set in San Francisco using data in New York, it doesn’t matter what technology you’re using. The same data is going over the same wire at the same speed. Replicating 1 petabyte over an OC-768 connection (roughly 40 Gbit/s) is going to take more than two days for either cloud or traditional solutions, making bandwidth a non-factor when comparing the two approaches.
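The replication arithmetic is easy to check for yourself. The sketch below assumes a decimal petabyte (10^15 bytes) and the nominal OC-768 line rate of 39,813.12 Mbit/s. The `efficiency` parameter is a hypothetical knob for protocol overhead, not a measured value.

```python
OC768_BITS_PER_SECOND = 39_813_120_000  # nominal OC-768 line rate
PETABYTE = 10**15                       # decimal petabyte, in bytes
SECONDS_PER_DAY = 86_400


def transfer_days(size_bytes: float, link_bits_per_second: float,
                  efficiency: float = 1.0) -> float:
    """Days needed to move `size_bytes` over a link, where `efficiency` is the
    fraction of the line rate actually available for payload (assumed)."""
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    seconds = size_bytes * 8 / (link_bits_per_second * efficiency)
    return seconds / SECONDS_PER_DAY


if __name__ == "__main__":
    best_case = transfer_days(PETABYTE, OC768_BITS_PER_SECOND)
    print(f"1 PB over OC-768 at full line rate: {best_case:.1f} days")
```

Notice that nothing in the function asks whether the bytes are headed to a cloud provider or to your own second data center. Size and link speed are the only inputs, and at full line rate this is a best case; real transfers will be slower.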
This statement really makes me scratch my head:
“That’s why, at least for the biggest enterprise storage environments, a centralized disaster recovery site that you can move operations to until everything is restored will be a requirement for the foreseeable future.”
So, how is all that data going to get to the recovery site? That’s right, replication over limited bandwidth. The same “reason” suggesting that bandwidth is going to limit cloud storage’s potential is going to limit your traditional enterprise storage environment.
Let’s cut to the chase. First, “cloud” is like “cluster” – it means different things to different people. There are many ways to build a “cloud storage” solution. You can use enterprise-grade hardware, RAID, multiple replicas – everything you might use in a traditional storage environment. Or you can cut costs and go with cheaper hardware, eliminate RAID, and keep only one replica of the data. It’s all in how it’s engineered. Second, cloud storage has its limitations, just like a traditional storage environment. It can be used very effectively or it can be used very poorly. If you use something in a way it wasn’t designed for and it performs poorly, you really can’t blame the technology, only yourself. Yes, you can drive in a nail using a wrench, but I’d rather use a hammer.
One of the key things that makes “cloud” storage what it is, is the technology used to access it. For example, Amazon’s S3 storage uses an API built on either REST or SOAP calls (technically, you can use plain HTTP requests and parse some XML yourself; it really is that simple). This is a world of difference from traditional, file/directory-based storage accessed via open()/fopen() calls. This is what gives cloud storage its great flexibility. From an application perspective, the storage is “everywhere and nowhere, all at once.” I don’t have to connect to a particular data center or use weird protocols or hardware connectors. It’s all done using the familiar HTTP protocol or an application library that wraps REST/SOAP calls in convenient functions or object methods. And since cloud storage is built on TCP/IP protocols, I can leverage firewalls, IDS solutions and, in the case where REST/SOAP calls are used, load balancers and caching systems to improve performance and reliability.
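To make the contrast concrete, here is a minimal sketch of the two access styles using only Python’s standard library. The endpoint, bucket and key names are made up, and the object read assumes a publicly readable object. A real S3 request normally needs signed authentication headers, which a client library would handle for you.

```python
from urllib.parse import quote
from urllib.request import urlopen


def object_url(endpoint: str, bucket: str, key: str) -> str:
    """Build a path-style URL for an object (hypothetical endpoint layout).
    Spaces and other unsafe characters are percent-encoded; '/' is kept."""
    return f"{endpoint.rstrip('/')}/{quote(bucket)}/{quote(key)}"


def read_local(path: str) -> bytes:
    """Traditional access: the data lives at a path on a mounted filesystem."""
    with open(path, "rb") as f:
        return f.read()


def read_object(url: str, fetch=urlopen) -> bytes:
    """Object-store access: the data is whatever an HTTP GET on the URL returns.
    `fetch` is injectable so the function can be exercised without a network."""
    with fetch(url) as response:
        return response.read()


if __name__ == "__main__":
    # Placeholder names; substitute a real endpoint, bucket and key to try it.
    print(object_url("https://storage.example.com", "reports", "2009/q3 summary.csv"))
```

Because the second style is just HTTP, everything that already understands HTTP (proxies, caches, load balancers) can sit between the application and the data without the application knowing.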
In summary, it’s how you engineer a solution. Garbage in, garbage out. If you engineer an enterprise-grade solution, you get enterprise-grade benefits and results. Remember, it wasn’t very long ago that GNU/Linux was considered a toy operating system and not enterprise-capable. Today, GNU/Linux powers 88% of the most powerful compute clusters, as reported by TOP500.