Earlier this week IBM announced new technologies intended to help companies and governments tackle Big Data by making it simpler, faster and more economical to analyze massive amounts of data. Its latest innovations, IBM suggested, would drive reporting and analytics results as much as 25 times faster.
The biggest of IBM’s innovations is BLU Acceleration, targeted initially at DB2. It combines a number of techniques to dramatically improve analytical performance and simplify administration. A second innovation, referred to as the enhanced Big Data Platform, improves the use and performance of the InfoSphere BigInsights and InfoSphere Streams products. Finally, IBM announced the new IBM PureData System for Hadoop, designed to make it easier and faster to deploy Hadoop in the enterprise.
BLU Acceleration is the most innovative of the announcements, probably a bona fide industry first, although others, notably Oracle, are scrambling to do something similar. BLU Acceleration enables much faster access to information by extending the capabilities of in-memory systems. It loads data into RAM for faster performance rather than leaving it on hard disks, and it dynamically moves unused data back to storage. It even works, according to IBM, when data sets exceed the size of the memory.
Another innovation included in BLU Acceleration is data skipping, which allows the system to skip over irrelevant data that doesn’t need to be analyzed, such as duplicate information. Other innovations include the ability to analyze data in parallel across different processors; the ability to analyze data transparently to the application, without the need to develop a separate layer of data modeling; and actionable compression, where data no longer has to be decompressed to be analyzed because the data order has been preserved. Finally, it leverages parallel vector processing, which enables multi-core and SIMD (Single Instruction Multiple Data) parallelism.
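To make two of these ideas concrete, here is a minimal sketch of data skipping combined with order-preserving compression. It is not IBM's implementation, which the announcement does not detail. It assumes a simple sorted-dictionary encoding and a per-block min/max summary, a common way to get these effects, and every function name in it is hypothetical. The point is only that when codes keep the same order as the values they stand for, a range filter can run on the compressed codes directly, and whole blocks whose min/max range cannot match can be skipped without being read.

```python
from bisect import bisect_left, bisect_right


def build_dictionary(values):
    """Order-preserving dictionary: sorted distinct values; a value's code is its position."""
    return sorted(set(values))


def encode(values, dictionary):
    """Replace each value with its integer code (smaller value -> smaller code)."""
    return [bisect_left(dictionary, v) for v in values]


def build_synopsis(codes, block_size):
    """Record (start offset, min code, max code) for each block of rows."""
    synopsis = []
    for start in range(0, len(codes), block_size):
        block = codes[start:start + block_size]
        synopsis.append((start, min(block), max(block)))
    return synopsis


def count_in_range(codes, synopsis, block_size, dictionary, low, high):
    """Count rows with low <= value <= high without decoding any row.

    Returns (matching rows, number of blocks actually scanned).
    """
    # Translate the predicate into code space once; order is preserved,
    # so comparisons on codes give the same answer as on raw values.
    lo_code = bisect_left(dictionary, low)
    hi_code = bisect_right(dictionary, high) - 1

    matches = 0
    blocks_scanned = 0
    for start, block_min, block_max in synopsis:
        # Data skipping: the block's range cannot overlap the predicate.
        if block_max < lo_code or block_min > hi_code:
            continue
        blocks_scanned += 1
        block = codes[start:start + block_size]
        matches += sum(1 for c in block if lo_code <= c <= hi_code)
    return matches, blocks_scanned


if __name__ == "__main__":
    # Made-up sample column, roughly clustered so that skipping pays off.
    amounts = [10, 12, 11, 15, 40, 42, 41, 45, 90, 95, 91, 99]
    dictionary = build_dictionary(amounts)
    codes = encode(amounts, dictionary)
    synopsis = build_synopsis(codes, block_size=4)
    print(count_in_range(codes, synopsis, 4, dictionary, 40, 50))  # (4, 1)
```

In this toy run only one of the three blocks is touched. The benefit depends on how well clustered the data is: if matching values were scattered evenly across all blocks, nothing could be skipped.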
During testing, IBM reported, some queries in a typical analytics workload ran more than 1000x faster when using the combined innovations of BLU Acceleration. The technology also delivered 10x storage space savings during beta tests. BLU Acceleration will be used first in DB2 10.5 and Informix 12.1 TimeSeries for reporting and analytics. It will be extended to other data workloads and to other products in the future.
BLU Acceleration promises to be as easy to use as load-and-go. BLU tables coexist with traditional row tables, using the same schema, storage, and memory. You can query any combination of row or BLU (columnar) tables, and IBM promises easy conversion of conventional tables to BLU tables.
DancingDinosaur likes seeing the System z included as an integral part of the BLU Acceleration program. The z has been a DB2 workhorse and apparently will continue to be as organizations move into the emerging era of big data analytics. On top of its vast processing power and capacity, the z brings its unmatched quality of service.
Specifically, IBM has called out the z for:
- InfoSphere BigInsights via the zEnterprise zBX for data exploration and online archiving
- IDAA (in-memory Netezza technology) for reporting and analytics as well as operational analytics
- DB2 for SQL and NoSQL transactions with enhanced Hadoop integration in DB2 11 (beta)
- IMS for highest performance transactions with enhanced Hadoop integration in IMS 13 (beta)
Of course, the zEnterprise is a full player in hybrid computing through the zBX, so zEnterprise shops have a few options to tap when they want to leverage BLU Acceleration and IBM’s other big data innovations.
Finally, IBM announced the new IBM PureData System for Hadoop, which should simplify and streamline the deployment of Hadoop in the enterprise. Hadoop has become the de facto open systems approach to organizing and analyzing vast amounts of unstructured as well as structured data, such as posts to social media sites, digital pictures and videos, online transaction records, and cell phone location data. The problem with Hadoop is that it is not intuitive for IT staff trained on conventional relational DBMSs. Vendors everywhere are scrambling to overlay a familiar SQL approach on Hadoop’s MapReduce method.
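To show why relational staff find the map/reduce style unfamiliar, here is a toy sketch in plain Python. It does not use Hadoop's actual API; the function names and the sample records are made up for illustration. It counts posts per site in three explicit phases, work that a SQL user would write as a single statement: SELECT site, COUNT(*) FROM posts GROUP BY site.

```python
from collections import defaultdict


def map_phase(records):
    """Map: emit a (key, 1) pair for every input record."""
    for site, _text in records:
        yield site, 1


def shuffle(pairs):
    """Shuffle: group all emitted values by key (a real framework does this between phases)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce: collapse each key's list of values into one result."""
    return {key: sum(values) for key, values in groups.items()}


def count_posts_per_site(records):
    """Run the three phases in sequence on a single machine."""
    return reduce_phase(shuffle(map_phase(records)))


if __name__ == "__main__":
    # Hypothetical sample data: (site, post text)
    posts = [
        ("twitter", "big data is here"),
        ("facebook", "mainframes still matter"),
        ("twitter", "hadoop on the rise"),
    ]
    print(count_posts_per_site(posts))
```

The SQL version states what result is wanted, while the map/reduce version spells out how to compute it. That gap is what the SQL-on-Hadoop overlays try to close.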
The new IBM PureData System for Hadoop promises to reduce from weeks to minutes the ramp-up time organizations need to adopt enterprise-class Hadoop technology with powerful, easy-to-use analytic tools and visualization for both business analysts and data scientists. It also provides enhanced big data tools for management, monitoring, development, and integration with many more enterprise systems. The product represents the next step forward in IBM’s overall strategy to deliver a family of systems with built-in expertise that leverages its decades of experience in reducing the cost and complexity associated with information technology.
Tags: BLU Acceleration, DB2, hadoop, hybrid computing, IBM, IBM Big Data Platform, IDAA, IMS, Informix Time Series, InfoSphere BigInsights, InfoSphere Streams, mainframe, Netezza, parallelism, PureData System for Hadoop, System z, zEnterprise