Posts Tagged ‘DB2’

IBM Big Data Innovations Heading to System z

April 4, 2013

Earlier this week IBM announced new technologies intended to help companies and governments tackle Big Data by making it simpler, faster and more economical to analyze massive amounts of data. Its latest innovations, IBM suggested, would drive reporting and analytics results as much as 25 times faster.

The biggest of IBM’s innovations is BLU Acceleration, targeted initially for DB2. It combines a number of techniques to dramatically improve analytical performance and simplify administration. A second innovation, referred to as the enhanced Big Data Platform, improves the use and performance of the InfoSphere BigInsights and InfoSphere Streams products. Finally, IBM announced the new IBM PureData System for Hadoop, designed to make it easier and faster to deploy Hadoop in the enterprise.

BLU Acceleration is the most innovative of the announcements, probably a bona fide industry first, although others, notably Oracle, are scrambling to do something similar. BLU Acceleration enables much faster access to information by extending the capabilities of in-memory systems. It loads data into RAM for faster performance, rather than leaving it on hard disks, and dynamically moves unused data to storage. It even works, according to IBM, when data sets exceed the size of the memory.

Another innovation included in BLU Acceleration is data skipping, which allows the system to skip over irrelevant data that doesn’t need to be analyzed, such as duplicate information. Other innovations include the ability to analyze data in parallel across different processors; the ability to analyze data transparently to the application, without the need to develop a separate layer of data modeling; and actionable compression, where data no longer has to be decompressed to be analyzed because the data order has been preserved. Finally, BLU Acceleration leverages parallel vector processing, which enables multi-core and SIMD (Single Instruction Multiple Data) parallelism.
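
To make data skipping and actionable compression a little more concrete, here is a minimal Python sketch. It is a toy illustration of the general ideas, not IBM's implementation: the function names, the block size, and the per-block min/max summary are all assumptions made for the example. Values are replaced by small integer codes that keep the original sort order, so a "greater than" test can run on the codes without decompressing them, and any block whose largest code falls below the cutoff is skipped outright.

```python
from bisect import bisect_right


def build_dictionary(values):
    """Order-preserving dictionary: a larger value always gets a larger code."""
    symbols = sorted(set(values))
    codes = {value: code for code, value in enumerate(symbols)}
    return symbols, codes


def encode_blocks(values, codes, block_size):
    """Store the column as blocks of codes, each with a min/max summary."""
    blocks = []
    for start in range(0, len(values), block_size):
        chunk = [codes[v] for v in values[start:start + block_size]]
        blocks.append({"min": min(chunk), "max": max(chunk), "codes": chunk})
    return blocks


def count_greater_than(blocks, symbols, threshold):
    """Count values > threshold by comparing codes only (no decompression)."""
    # First code whose original value is strictly greater than the threshold.
    limit = bisect_right(symbols, threshold)
    matched = 0
    scanned = 0
    for block in blocks:
        if block["max"] < limit:
            continue  # data skipping: nothing in this block can qualify
        scanned += 1
        matched += sum(1 for code in block["codes"] if code >= limit)
    return matched, scanned


# Hypothetical column of sales amounts, stored in blocks of four values.
sales = [10, 12, 11, 10, 55, 60, 58, 57, 13, 12, 11, 10]
symbols, codes = build_dictionary(sales)
blocks = encode_blocks(sales, codes, block_size=4)
matched, scanned = count_greater_than(blocks, symbols, threshold=50)
print(f"{matched} matching values, {scanned} of {len(blocks)} blocks scanned")
```

In this made-up column only the middle block holds values above 50, so the other two blocks are never touched. The comparison works on codes because the dictionary preserves order, which is the property the announcement credits for analyzing compressed data directly.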

During testing, IBM reported, some queries in a typical analytics workload ran more than 1000x faster when using the combined innovations of BLU Acceleration. BLU Acceleration also delivered 10x storage space savings during beta tests. It will be used first in DB2 10.5 and Informix 12.1 TimeSeries for reporting and analytics, and will be extended to other data workloads and other products in the future.

BLU Acceleration promises to be as easy to use as load-and-go. BLU tables coexist with traditional row tables, using the same schema, storage, and memory. You can query any combination of row or BLU (columnar) tables, and IBM assures easy conversion of conventional tables to BLU tables.

DancingDinosaur likes seeing the System z included as an integral part of the BLU Acceleration program.  The z has been a DB2 workhorse and apparently will continue to be as organizations move into the emerging era of big data analytics. On top of its vast processing power and capacity, the z brings its unmatched quality of service.

Specifically, IBM has called out the z for:

  • InfoSphere BigInsights via the zEnterprise zBX for data exploration and online archiving
  • IDAA (in-memory Netezza technology) for reporting and analytics as well as operational analytics
  • DB2 for SQL and NoSQL transactions with enhanced Hadoop integration in DB2 11 (beta)
  • IMS for highest performance transactions with enhanced Hadoop integration  in IMS 13 (beta)

Of course, the zEnterprise is a full player in hybrid computing through the zBX, so zEnterprise shops have a few options to tap when they want to leverage BLU Acceleration and IBM’s other big data innovations.

Finally, IBM announced the new IBM PureData System for Hadoop, which should simplify and streamline the deployment of Hadoop in the enterprise. Hadoop has become the de facto open systems approach to organizing and analyzing vast amounts of unstructured as well as structured data, such as posts to social media sites, digital pictures and videos, online transaction records, and cell phone location data. The problem with Hadoop is that it is not intuitive for conventional relational DBMS staff and IT. Vendors everywhere are scrambling to overlay a familiar SQL approach on Hadoop’s map/reduce method.

The new IBM PureData System for Hadoop promises to reduce from weeks to minutes the ramp-up time organizations need to adopt enterprise-class Hadoop technology with powerful, easy-to-use analytic tools and visualization for both business analysts and data scientists. It also provides enhanced big data tools for management, monitoring, development, and integration with many more enterprise systems.  The product represents the next step forward in IBM’s overall strategy to deliver a family of systems with built-in expertise that leverages its decades of experience in reducing the cost and complexity associated with information technology.

IBM PureData Brings New Analytics Platform

October 18, 2012

IBM has finally started to expand its PureSystems family of systems with the introduction of the PureData System. The system promises to let organizations more efficiently manage and quickly analyze petabytes of data and then intelligently apply those insights in addressing business issues across their organization.

This is not a surprise. From the start, IBM talked about a family of PureSystems beyond the initial PureFlex and PureApplications. When the PureSystems family was introduced last spring, DancingDinosaur expected IBM to quickly add new expert servers starting with something it guessed would be called PureAnalytics and maybe another called PureTransactions.  PureData isn’t that far off. The new systems are being optimized specifically for transactional operations and data analytics workloads.

Specifically, PureData System for Transactions has been integrated and optimized as a ready-to-run database platform designed and tuned specifically for transactional data workloads. It supports both DB2 applications unchanged and Oracle database applications with only minimal changes. The machines come as three workload-specific models, optimized for transactional, operational, or big data analytics workloads. They are:

  • PureData System for Transactions: Aimed at retail and credit card processing environments that depend on rapid handling of transactions and interactions. These transactions may be small, but the volume and frequency require fast and efficient processing. The new system provides hardware and software configurations integrated and optimized for flexibility, integrity, availability and scalability for any transaction workload.
  • PureData System for Analytics: Enables organizations to quickly and easily analyze and explore big data, up to multiple petabytes in volume. The new system simplifies and optimizes performance of data warehouse services and analytics applications. Powered by Netezza technology (in-memory analytics), the new system aims to accelerate analytics and boasts what IBM describes as the largest library of in-database analytic functions on the market today. Organizations can use it to predict and avoid customer churn in seconds, create targeted advertising and promotions using predictive and spatial analysis, and prevent fraud.
  • PureData System for Operational Analytics: Here organizations can receive actionable insights concurrently on more than 1,000 business operations to support real-time decision making. Operational warehouse systems are used for fraud detection during credit card processing, to deliver customer insights to call center operations (while the customer is still on the call or online), and to track and predict real-time changes in supply and demand.

All the systems include PureSystems pattern-based expertise and automation. From a configuration standpoint, the full rack system can be pretty rich: 386 x86 processor cores, 6.2 TB DRAM, 19.2 TB flash (SSD), 128 TB disk (HDD), advanced storage tiering, up to 10x compression, a high speed RDMA interconnect, and dual internal 10 GB network links. Systems, however, can range from 96 cores to 386 cores. IBM reports early customer results of 10-100x faster performance over traditional custom-built systems and 20x greater concurrency and throughput for tactical queries resulting, in part, from IBM’s patented MPP hardware acceleration.

IBM hasn’t disclosed pricing, which is highly subject to the particular configuration anyway. However, the company is quick to tout its introductory deals: Credit-qualified clients that elect IBM financing can see immediate benefits with PureData System by deferring their first payment until January 2013 or obtaining a zero percent (interest-free) loan for 12, 24 or 36 months.

PureData may be better thought of as a data appliance delivering data services fed by applications that generate the data and reside elsewhere. With its factory built-in expertise, patterns, and appliance nature, organizations can have, according to IBM, a PureData system up and running in hours, not days or weeks; run complex analytics in minutes, not hours; and handle more than 100 databases on a single system. PureData can be deployed in one step simply by specifying the cluster name, description, and applicable topology pattern. Built-in expertise handles the rest.

Now the game is to guess what the next PureSystems expert server will be. DancingDinosaur’s guess: a highly scalable implementation of VDI, maybe called PureDesktop.

IBM zEnterprise—the Software Difference

April 4, 2011

You could argue that there is no high end server available today to match the IBM zEnterprise/zBX in processing power, reliability, and scalability. It holds its own in terms of speeds and feeds, number of cores, and memory. Software, however, may turn out to be the biggest differentiator among high end servers, and IBM has optimized a ton of software for the z, something others mainly just talk about.

The high end server market has suddenly entered a period of change. In March Oracle announced that it will no longer support Itanium processors. HP immediately countered with a statement of support for Itanium. SGI announced a 256-core Xeon Windows system. Also in March, Quanta Computer, a Chinese operation, reported squeezing 512 cores into a pizza box server running the Tilera multi-core processor. Tilera’s roadmap goes out to 2013, when it expects to pack 200 cores onto a processor. Of course, IBM launched the first hybrid server, the zEnterprise, consisting of the multi-core z196 coupled with the zBX, last summer.

This recent flurry of server activity at the large-scale, multi-core end of the market leaves server buyers somewhat confused. One writes to DancingDinosaur asking: What will be the ultimate retail price per core?  What’s the current price per core of, perhaps, a chassis full of 8-core IBM System p blades, an HP Superdome, or an SGI UV 1000 running Windows or Linux?

Fair questions, for sure. The published OEM price last year was $900 per chip for a 64-core Tilera processor, which rounds to $14 per core. SGI reports that the Altix UV starts at $50,000 with Microsoft software an additional $2,999 per four sockets (32 cores). A buyer could end up facing different vendors and technologies competing at the $50, $100, $500, $1,000, $5,000 and $10,000 per core price points. Each vendor will be promoting a different architecture, configuration, memory optimization, performance, and even form factor (multi u, pizza box, blades).
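
The per-core arithmetic behind those figures can be checked with a few lines of Python. This is only a sketch: the helper name is made up for illustration, and the only inputs are the prices and core counts quoted above. Since no core count is given for the $50,000 Altix UV starting price, it is left out rather than guessed.

```python
def price_per_core(price_usd, cores):
    """Spread a quoted price evenly across the cores it covers."""
    return price_usd / cores


# Tilera: published OEM price of $900 for a 64-core chip.
tilera_per_core = price_per_core(900, 64)

# Microsoft software on the SGI Altix UV: $2,999 per four sockets (32 cores).
ms_software_per_core = price_per_core(2999, 32)

print(f"Tilera chip:        ${tilera_per_core:.2f} per core")
print(f"Microsoft software: ${ms_software_per_core:.2f} per core")
```

The Tilera figure works out to just over $14 per core, matching the rounded number above, while the Microsoft software alone adds a little under $94 per core on the SGI system. That gap hints at how quickly software can outweigh the silicon in a per-core comparison.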

This is not just about price but integration, internal communication speeds, optimization, and more. At this point all the vendors need to be more forthcoming and transparent.

But this may not turn out to be a hardware, processor, memory, speeds and feeds battle. It may not even turn into a price-per-core battle or a total cost of ownership (TCO) vs. total cost of acquisition (TCA) battle. Ultimately, it has to come down to workloads supported and delivered, and that means software. And when it comes to workload optimization and software IBM already has an advantage, especially when compared to Oracle and HP.

A quick peek at IBM’s software lineup suggests the company has a lot of topnotch software to run on its hardware.  Factor in the ISV ecosystem and the IBM picture gets even better.

Let’s start with Gartner naming IBM the worldwide market share leader overall in the application infrastructure and middleware software segment.  If you drill down into the various sub-markets, IBM often comes up as leader there too. For example, IBM leads the business process management (BPM) market, with better than double the share of its closest competitor. IBM also leads in the message oriented middleware market, the transaction processing monitor market, and the combined markets for Enterprise Service Bus (ESB) and integration appliances.

Critical segments for sure, but businesses need more. For that IBM offers DB2, a powerful enterprise database management system that can rival Oracle. WebSphere goes far beyond being just an application server; it encompasses a wide range of functionality including portals and commerce. With Rational, IBM can cover the entire application development lifecycle, and with Lotus, IBM nails down communication and collaboration. And don’t forget Cognos, a proven BI tool, plus all the IBM Smart Analytics tools. Finally, IBM provides the Tivoli product set to manage both systems and storage.

The point: when it comes to high end servers it is not just about processor cores. It’s about systems optimized for the software you need to run your workloads. With enterprise data centers, that will often be IBM.

 

