The hottest thing in storage today is software-defined storage (SDS). Every storage vendor is jumping on the SDS bandwagon.
The presentation titled Industrial-Strength SDS for the Cloud, by Sven Oehme, IBM Senior Research Scientist, drew a packed audience at Edge 2014 and touched on many of the sexiest acronyms in IBM’s storage portfolio. These included not just GPFS but also GSS (GPFS Storage Server), GNR (GPFS Native RAID), and LROC (local read-only cache); it even worked in the Linear Tape File System (LTFS).
The session promised to outline the customer problems SDS solves and show how to deploy it in large-scale OpenStack environments with IBM GPFS. Industrial strength generally refers to large-scale, highly secure and available multi-platform environments.
The session abstract explained that the session would show how GPFS enables resilient, robust, reliable storage deployed on low-cost, industry-standard hardware, delivering limitless scalability, high performance, and automatic policy-based storage tiering from flash to disk to tape, further lowering costs. It also promised to provide examples of how GPFS provides a single, unified, scale-out data plane for cloud developers across multiple data centers worldwide. GPFS unifies OpenStack VM images, block devices, objects, and files with support for Nova, Cinder, Swift and Glance (OpenStack components), along with POSIX interfaces for integrating legacy applications. C’mon, if you have even a bit of IT geekiness, doesn’t that sound tantalizing?
One disclaimer before jumping into some of the details: despite having written white papers on SDS and cloud, your blogger can only hope to approximate the rich context provided at the session.
Let’s start with the simple stuff: the expectations and requirements for cloud storage:
- Elasticity, within and across sites
- Secure isolation between tenants
- Non-disruptive operations
- No degradation when parts fail, since components fail routinely at scale
- Different tiers for different workloads
- Converged platform to handle boot volumes as well as file/object workload
- Locality awareness and acceleration for exceptional performance
- Multiple forms of data protection
Of course, affordable hardware and maintenance are expected, as are quota/usage and workload accounting.
Things start getting serious with IBM’s General Parallel File System (GPFS). This is what IBMers really mean when they refer to Elastic Storage: a single namespace provided across individual storage resources, platforms, and operating systems. Add in different classes of storage devices (fast or slow disk, SSD, flash, even LTFS tape), storage pools, and policies to control data placement, and you’ve got the ability to do storage tiering. You can even geographically distribute the data through IBM’s Active Cloud Engine, initially a SONAS capability sometimes referred to as Active File Management (AFM). Now you have a situation where users can access data by the same name regardless of where it is located. And since the system keeps distributed copies of the latest data, it can handle a temporary loss of connectivity between sites.
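To make the idea of policy-driven tiering concrete, here is a minimal Python sketch. It is a toy model, not GPFS's own policy language: the pool names, the age thresholds, and the `FileInfo`, `choose_pool`, and `apply_policy` names are all hypothetical. The point it illustrates is that a policy moves data between pools while the file name users see never changes.

```python
from dataclasses import dataclass


@dataclass
class FileInfo:
    name: str               # the name users see; it never changes when data moves
    days_since_access: int
    pool: str = "flash"     # hypothetical pool label, not a GPFS identifier


def choose_pool(days_since_access: int, disk_after: int = 30, tape_after: int = 365) -> str:
    """Pick a tier from file age alone; the thresholds are illustrative assumptions."""
    if days_since_access >= tape_after:
        return "tape"
    if days_since_access >= disk_after:
        return "disk"
    return "flash"


def apply_policy(files):
    """Move each file to the tier the policy selects and return the moves made."""
    moves = []
    for f in files:
        target = choose_pool(f.days_since_access)
        if target != f.pool:
            moves.append((f.name, f.pool, target))
            f.pool = target
    return moves


files = [
    FileInfo("vm-image.qcow2", 2),
    FileInfo("q3-report.csv", 90),
    FileInfo("2009-archive.tar", 1200),
]
for name, src, dst in apply_policy(files):
    print(f"{name}: {src} -> {dst}")
```

A real policy would weigh more than age (pool fill levels, file size, ownership), but the shape is the same: rules select files, and the system relocates them behind an unchanged name.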
To protect the data, add in declustered software RAID, aka GNR, or even GSS (GPFS Storage Server). The beauty of this is that declustered parity avoids the space overhead of replication (about 80% usable capacity vs. 33% with replication) while delivering extremely fast rebuild. In the process you can remove hardware storage controllers from the picture by doing the migration and RAID management in software on your commodity servers.
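The 80% and 33% figures fall out of simple arithmetic, sketched below. The session summary gives only the percentages; the specific layouts used here (three full copies for replication, 8 data strips plus 2 parity strips for parity RAID) are assumptions chosen because they reproduce those numbers.

```python
def replication_utilization(copies: int) -> float:
    """Usable fraction of raw capacity when every block is stored `copies` times."""
    return 1 / copies


def parity_utilization(data_strips: int, parity_strips: int) -> float:
    """Usable fraction of raw capacity for a data+parity stripe."""
    return data_strips / (data_strips + parity_strips)


# Assumed layouts: 3-way replication and an 8+2 parity stripe.
print(f"3-way replication: {replication_utilization(3):.0%}")  # 33%
print(f"8+2 parity:        {parity_utilization(8, 2):.0%}")    # 80%
```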
In the above graphic, focus on everything below the elongated blue triangle. Since it is being done in software, you can add an Object API for object storage. Throw in encryption software. Want Hadoop? Add that too. The power of SDS. Sweet.
The architecture Oehme lays out utilizes generic servers with direct-attached switched JBOD (SBOD). It also makes ample use of LROC, which provides a large read cache that benefits many workloads, including SPECsfs, VMware, OpenStack, other virtualization, and database workloads.
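For readers who want a feel for what a local read cache does, here is a small Python sketch. It is not LROC's implementation: the least-recently-used eviction, the `ReadOnlyCache` class, and the `slow_disk_read` stand-in are all assumptions made for illustration. What it shows is the general pattern of serving repeat reads locally and going to slower shared storage only on a miss.

```python
from collections import OrderedDict


class ReadOnlyCache:
    """Toy read cache in front of a slower backing store (LRU eviction assumed)."""

    def __init__(self, backing_read, capacity):
        self._read = backing_read       # function that fetches a block from slow storage
        self._capacity = capacity       # maximum number of cached blocks
        self._blocks = OrderedDict()    # block_id -> data, oldest entry first
        self.hits = 0
        self.misses = 0

    def read(self, block_id):
        if block_id in self._blocks:
            self._blocks.move_to_end(block_id)   # mark as most recently used
            self.hits += 1
            return self._blocks[block_id]
        self.misses += 1
        data = self._read(block_id)
        self._blocks[block_id] = data
        if len(self._blocks) > self._capacity:
            self._blocks.popitem(last=False)     # evict the least recently used block
        return data

    def invalidate(self, block_id):
        """Drop a block after it is written elsewhere; the cache itself never holds writes."""
        self._blocks.pop(block_id, None)


def slow_disk_read(block_id):
    # Hypothetical stand-in for a read from shared storage.
    return f"data-{block_id}"


cache = ReadOnlyCache(slow_disk_read, capacity=2)
for block in [1, 2, 1, 3, 2]:
    cache.read(block)
print(cache.hits, cache.misses)  # 1 4
```

Because the cache holds only copies of data that lives safely elsewhere, losing the cache device costs performance, not data.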
A key element in Oehme’s SDS for the cloud is OpenStack. From a storage standpoint, OpenStack Cinder, which provides access to block storage as if it were local, enables the efficient sharing of data between services. Cinder supports advanced features such as snapshots, cloning, and backup. On the back end, Cinder supports Linux servers with iSCSI and LVM; storage controllers; shared filesystems like GPFS, NFS, and GlusterFS; and more.
Since Oehme’s goal is to produce industrial-strength SDS for the cloud, it needs to protect data. Data protection is delivered through backups, snapshots, cloning, replication, file-level encryption, and declustered RAID, which spans all disks in the declustered array and results in faster RAID rebuild (because more disks are available to share the rebuild work).
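Why more disks mean a faster rebuild can be shown with a back-of-the-envelope model. This is an idealized sketch, not a measurement: the disk size, per-disk throughput, and array width below are made-up illustrative numbers, and the model ignores foreground I/O, parity computation, and any rebuild throttling.

```python
def rebuild_hours(disk_tb: float, per_disk_mb_s: float, disks_sharing_work: int) -> float:
    """Idealized time to reconstruct one failed disk's worth of data.

    Assumes the rebuild is limited only by the combined throughput of the
    disks that share the work.
    """
    total_mb = disk_tb * 1_000_000
    seconds = total_mb / (per_disk_mb_s * disks_sharing_work)
    return seconds / 3600


# Conventional RAID: a single spare disk absorbs all rebuild writes.
conventional = rebuild_hours(disk_tb=4, per_disk_mb_s=100, disks_sharing_work=1)

# Declustered RAID: spare space is spread out, so (in this assumed 48-disk
# array) the 47 surviving disks each take a small share of the work.
declustered = rebuild_hours(disk_tb=4, per_disk_mb_s=100, disks_sharing_work=47)

print(f"conventional: {conventional:.1f} h, declustered: {declustered * 60:.0f} min")
```

With these assumed numbers the conventional rebuild takes roughly 11 hours and the declustered one roughly a quarter of an hour. Real arrays will not hit the ideal figure, but the direction of the effect is the one the session described.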
The result is highly virtualized, industrial-strength SDS for deployment in the cloud. Can you bear one more small image that promises to put this all together? I will try to leave it as big as will fit. Notice it includes a lot of OpenStack components connecting storage elements. Here it is.
DancingDinosaur is Alan Radding. Follow DancingDinosaur on Twitter @mainframeblog
Learn more about Alan Radding at technologywriter.com