Even before IBM introduced the zEnterprise as a hybrid mainframe, it was thinking about Hadoop on the mainframe as part of its Blue Cloud initiative in 2007. Blue Cloud included Xen- and PowerVM-virtualized Linux operating system images along with Hadoop parallel workload scheduling.
Although Blue Cloud wasn't specifically a mainframe initiative, even in 2007 the mainframe running Linux and z/VM could act as a Hadoop platform. More recently, IBM turned to Hadoop for its InfoSphere BigInsights, an analytics platform built on top of the open Apache Hadoop framework for storing, managing, and gaining insights from Internet-scale data.
Hadoop uses a programming model and software framework called MapReduce for writing applications that rapidly process vast amounts of data in parallel on large clusters of compute nodes. The second major piece is the Hadoop Distributed File System (HDFS).
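The MapReduce idea can be illustrated without Hadoop itself. Below is a minimal, single-process Python sketch of the classic word-count job; the function names (`map_phase`, `shuffle`, `reduce_phase`) are illustrative labels for the three stages, not Hadoop APIs, and a real job would run the map and reduce tasks in parallel across cluster nodes.

```python
from collections import defaultdict

def map_phase(document):
    # Mapper: emit a (word, 1) pair for every word in the input split.
    return [(word, 1) for word in document.split()]

def shuffle(pairs):
    # Shuffle/sort: group all intermediate values by key,
    # as the framework does between the map and reduce stages.
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(key, values):
    # Reducer: combine the values for one key -- here, sum the counts.
    return key, sum(values)

documents = ["hadoop on the mainframe", "hadoop in the cloud"]
intermediate = [pair for doc in documents for pair in map_phase(doc)]
counts = dict(reduce_phase(k, v) for k, v in shuffle(intermediate).items())
print(counts["hadoop"])  # 2
```

The point of the model is that the mapper and reducer see only their own slice of the data, which is what lets the framework fan the work out across hundreds of nodes.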
As Apache explains it, HDFS creates multiple replicas of data blocks and distributes them on compute nodes throughout a cluster to enable reliable, extremely rapid computations. When a program initiates a data search, each node looks at its data and processes it as instructed.
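The replication scheme Apache describes can be sketched as a toy placement simulation. This is an assumption-laden simplification: real HDFS placement is rack-aware and tracked by the NameNode, while the round-robin scheme below only shows the core invariant that each block's replicas land on distinct nodes.

```python
import itertools

def place_replicas(blocks, nodes, replication=3):
    # Toy placement: assign each block's replicas to `replication`
    # distinct nodes, advancing round-robin through the cluster.
    # Real HDFS also weighs rack topology and node load.
    placement = {}
    ring = itertools.cycle(range(len(nodes)))
    for block in blocks:
        start = next(ring)
        placement[block] = [nodes[(start + i) % len(nodes)]
                            for i in range(replication)]
    return placement

nodes = ["node1", "node2", "node3", "node4"]
placement = place_replicas(["blk_0", "blk_1"], nodes)
for replicas in placement.values():
    # Losing any single node still leaves two copies of every block.
    assert len(set(replicas)) == 3
```

Because every block lives on several nodes, a query can be pushed to whichever node holds a local copy, which is what makes the "each node looks at its data" search pattern both fast and fault-tolerant.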
Hadoop needs some form of grid computing. By 2010, grid computing had morphed into cloud computing. At IOD a few weeks back, IBM introduced a development test cloud for trying out Hadoop applications, which, according to IBM, fit well with cloud principles and technology.
With the z196 the possibilities of Hadoop for mainframe shops become much more interesting. To begin, the z196 has the resource capacity to run hundreds, if not thousands, of virtualized Linux servers, each with designated storage capacity. That lays the foundation for a Hadoop platform.
Through a flexible, fault-resistant, Hadoop-based infrastructure, organizations can run many different types of workloads. Analytics of massive amounts of data has emerged as the primary enterprise workload for Hadoop on the mainframe.
For analytics, the z196 can run Cognos on Linux on z. Better yet, you could attach the zBX extension cabinet and populate it with IBM Smart Analytics Optimizer cards running against Hadoop data sets. Or use some other POWER-based analytics running on POWER cards inside the zBX.
For z196 shops, the plan would be to use the machine as the platform for a private cloud that captured, stored, managed, and analyzed massive amounts of data generated from web applications, meters and sensors, POS systems, clickstreams, and such using Hadoop. Already, one midsize z196 user has deployed the machine to serve images and video to online shoppers. It is not too big a stretch to imagine tapping Hadoop capabilities to do more with the data.
Whether any of this happens depends on pricing. To begin, IBM has to reduce the cost of a z196 Hadoop environment, because companies today build Hadoop clusters out of the cheapest commodity components.
The necessary cost reductions probably won't happen until late 2011, when IBM plans to introduce Solution Edition discounts for the z196 comparable to last year's z10 Enterprise Linux Solution Edition discounts. Much also depends on how pricing shakes out for the various zBX blades. To date, IBM has talked a good Hadoop game but has given no hint that it is ready to cut pricing enough to make Hadoop practical on the z196.