Posts Tagged ‘BigInsights’

Mainframe Appeal Continues in 9th BMC Survey

October 30, 2014

With most of the over 1100 respondents (91%) reporting that the mainframe remains a viable long-term platform for them, and a clear majority (60%) expecting to increase MIPS due to the normal growth of legacy applications and new application workloads, the z remains well entrenched. Check out the results for yourself here.

Maybe even more reassurance comes from almost half the respondents, who reported that they expect the mainframe to attract and grow new workloads. Most likely these will be Java and Linux workloads, but one-third of the respondents listed cloud as a priority, jumping it up to sixth on the list of mainframe priorities. Mobile was cited as a priority by 27% of the respondents, followed by big data with 26%.

Apparently IBM’s steady promotion of cloud, mobile, and big data for the z over the past year is working. At Enterprise2014 IBM even made big news with real-time analytics and Hadoop on the z, along with a slew of related announcements.

That new workloads like cloud, mobile, and big data made it into the respondents’ top 10 IT priorities for the year didn’t surprise Jonathan Adams, BMC vice president/general manager for z solutions.  The ease of developing in Java and its portability make it a natural for new workloads today, he noted.

In the survey IT cost reduction/optimization tops the list of IT priorities for 2014 by a large margin, 70% of respondents, followed by application availability, 52%.  Rounding out the top five are application modernization with 48%, data privacy, 47%, and business/IT alignment, 44%. Outsourcing finished out the top 10 priorities with 16%.

When asked to look ahead in terms of MIPS growth, the large majority of respondents expected growth to continue or at least remain steady. Only 9% expected MIPS to decline and 6% expected to eliminate the mainframe.  This number has remained consistent for years, noted Adams. DancingDinosaur periodically checks in with shops that announce plans to eliminate their mainframe and finds that a year later many have barely made any progress.

The top mainframe advantages shouldn’t surprise you:  availability (53%); security (51%); centralized data serving (47%) and transaction throughput (42%). More interesting results emerged when the respondents addressed new workloads. The mainframe’s cloud role includes data access (33%), cloud management from Linux on z (22%) and dynamic test environments via self-service (15%). Surprisingly, when it comes to big data analytics, 34% report that the mainframe acts as their analytics engine. This wasn’t supposed to be the case, at least not until BigInsights and Hadoop on z gained more traction.

Meanwhile, 28% say they move data off platform for analytics, and 14% report they federate mainframe data to an off-platform analytics engine. Yet, more than 81% now incorporate the mainframe into their Big Data strategy, up from 70% previously. The non-finance industries are somewhat more likely to use the mainframe as the big data engine, BMC noted. Those concerned with cost should seriously consider doing their analytics on the z, where the data is. It is costly to keep moving data around.

In terms of mobility, making existing applications accessible for mobile ranked as the top issue, followed by developing new mobile applications and securing corporate data on mobile devices. Increased transaction volume from mobile processing came in at the bottom of mobility issues, but that will likely change when mobile transactions start impacting peak workload volumes and trigger increased costs. Again, those concerned about costs should consider IBM’s mobile transaction discount, which was covered by DancingDinosaur here in the spring.

Since cost reduction is such a big topic again, the survey respondents offered their cost reduction priorities.  Reducing resource usage during peak led the list.  Other cost reduction priorities included consolidating mainframe software vendors, exploiting zIIP and specialty engines (which have distinctly lower cost/MIPS), and moving workloads to Linux on z.

So, judging from the latest BMC survey the mainframe is far from dead. But at least one recent IT consultant and commentator, John Appleby, seems to think so. This prediction has proven wrong so often that DancingDinosaur has stopped bothering to refute it.

BTW, change came to BMC last year  in the form of an acquisition by a venture capital group. Adams reports that the new owners have already demonstrated a commitment to continued investment in mainframe technology products, and plans already are underway for next year’s survey.

DancingDinosaur is Alan Radding. You can follow him on Twitter, @mainframeblog. Or see more of his writing at Technologywriter.com or in wide-ranging blogs here.

Hadoop Brings Big Data Analytics to the IBM System z

October 16, 2014

In a previous blog, DancingDinosaur reported on IBM’s initial announcement of Hadoop and other analytic products, like InfoSphere BigInsights, coming to the z. The IBM announcement itself can be found here.

Subsequent sessions at IBM Enterprise2014 delved more deeply into big data, analytics, and real-time analytics. A particularly good series of sessions was offered by Karen Durward, an IBM InfoSphere software product manager specializing in System z data integration. As Durward noted, BigInsights is Apache Hadoop wrapped up to make it easier to use for general IT and business managers.

Specifically, the real-time analytics package for z includes IBM InfoSphere BigInsights for Linux on System z, which combines open-source Apache Hadoop with enhancements to make Hadoop System z enterprise-ready. The solution also includes IBM DB2 Analytics Accelerator (IDAA), which improves data security while delivering a 2000x faster response time for complex data queries.

In her Hadoop on z session, Durward started with the Hadoop framework, which consists of four components:

  1. Common Core—the basic modules (libraries and utilities) on which all components are built
  2. Hadoop Distributed File System (HDFS)—stores data on multiple machines to provide very high aggregate bandwidth across a cluster of machines
  3. MapReduce—the programming model that supports high-volume data processing by the cluster
  4. YARN (Yet Another Resource Negotiator)—the platform used to manage the cluster’s compute resources including scheduling users’ applications. In effect, YARN decouples Hadoop workload and resource management.

The typical Hadoop process sounds deceptively straightforward.  Simply load data into an HDFS cluster, analyze the data in the cluster using MapReduce, write the resulting analysis back into the HDFS cluster. Then just read it.
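To make the map, shuffle, and reduce steps concrete, here is a minimal sketch in plain Python. It is not Hadoop code and does not touch HDFS. The sample records and function names are made up for illustration, and the in-memory list simply stands in for data already loaded into the cluster.

```python
from collections import defaultdict
from itertools import chain


def map_phase(record):
    """Map: emit a (word, 1) pair for every word in one input record."""
    return [(word.lower(), 1) for word in record.split()]


def shuffle(pairs):
    """Shuffle: group emitted values by key, the step that sits between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: collapse all values for one key into a single result."""
    return key, sum(values)


def run_job(records):
    """Run map, shuffle, and reduce over the records and return the final counts."""
    mapped = chain.from_iterable(map_phase(r) for r in records)
    grouped = shuffle(mapped)
    return dict(reduce_phase(k, v) for k, v in grouped.items())


# Hypothetical records standing in for data already loaded into the cluster.
records = ["z mainframe hadoop", "hadoop on z", "z"]
word_counts = run_job(records)
print(word_counts)
```

In a real cluster the map and reduce functions run in parallel on many nodes and the results are written back to HDFS; the sketch only shows the shape of the programming model.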

Sounds easy enough until you try it. Then you need to deal with client nodes and name nodes, exchange metadata, and more. In addition, Hadoop is an evolving technology. Apache continues to add pieces to the environment in an effort to simplify it. For instance, Hive provides the Apache data warehouse framework, accessible using HiveQL, and HBase brings Apache’s Hadoop database. Writing MapReduce code is a challenge, so there is Pig, Apache’s platform for creating long and deep Hadoop source programs, and the list goes on. In short, Hadoop is not easy, especially for IT groups accustomed to relational databases and SQL. That’s why you need tools like BigInsights. The table below is how Durward sees the Hadoop tool landscape.

Software Needs                         Other Hadoop Products   BigInsights
Open Source Apache Hadoop              Y                       Y
Rich SQL on Hadoop (Big SQL)           some                    Y
Tools for Business Users (BigSheets)   NA                      Y
Advanced text analytics                NA                      Y
In-Hadoop analytics                    NA                      Y
Rich developer tools                   NA                      Y
Enterprise workload & storage mgt.     NA                      Y
Comprehensive suite                    NA                      Y

In fact, you need more than BigInsights. “We don’t know how to look at unstructured data,” said Durward. That’s why IBM layers on tools like Big SQL, which helps you query Hadoop’s HBase using industry-standard SQL. You can migrate a relational table to HBase using Big SQL or connect Big SQL via JDBC to run business intelligence and reporting tools, such as Cognos, which also runs on Linux on z. Similarly IBM offers BigSheets, a cloud application that performs ad hoc analytics at web-scale on unstructured and structured content using the familiar spreadsheet format.

Lastly, Hadoop queries often produce free-form text, which requires text analytics to make sense of the results. Not surprisingly, IBM offers BigInsights Text Analytics, a fast, declarative rule-based information extraction (IE) system that extracts insights from unstructured content. This system consists of a fast, efficient runtime that exploits numerous optimization techniques across extraction programs written in Annotation Query Language (AQL), an English-like declarative language for rule-based information extraction.

Hadoop for the z is more flexible than z data center managers may think. You can merge Hadoop data with z transactional data sources and analyze it all together through BigInsights.

So how big will big data be on the z? DancingDinosaur thought it could scale to hundreds of terabytes, even petabytes. Not so. You should limit Hadoop on the z to moderate volumes—from hundreds of gigabytes to tens of terabytes, Durward advises, adding “after that it gets expensive.”

Still, there are many advantages to running Hadoop on the z. To begin, the z brings rock solid security, is fast to deploy, and, through BigInsights, brings an easy-to-use data ingestion process. It also has proven to be easy to set up and run, taking just a few hours, with conversions handled automatically. Lastly, the data never leaves the platform, which avoids the expense and delay of moving data between platforms. But maybe most importantly, by wrapping Hadoop in a set of familiar, comfortable tools and burying its awkwardness out of sight, Hadoop now becomes something every z shop can leverage.

DancingDinosaur is Alan Radding. Follow this blog on Twitter, @mainframeblog. Check out my work at Technologywriter.com

Software Licensing for IBM System z Distributed Linux Middleware

October 10, 2014

DancingDinosaur can’t attend a mainframe conference without checking out at least one session on mainframe software pricing by David Chase, IBM’s mainframe pricing guru. At IBM Enterprise2014, which wraps up today, the topic of choice was software licensing for Linux middleware. It’s sufficiently complicated to merit an entire session.

In case you think Linux on z is not in your future, maybe you should think again.  Linux is gaining momentum in even the largest z data centers. Start with IBM bringing new apps like InfoSphere, BigInsights (Hadoop), and OpenStack to z. Then there are apps from ISVs that just weren’t going to get their offerings to z/OS. Together it points to a telltale sign something is happening with Linux on z. And, the queasiness managers used to have about the open source nature of Linux has long been put to rest.

At some point, you will need to think about IBM’s software pricing for Linux middleware. Should you find yourself getting too lost in the topic, check out these links recommended by Chase:

To begin, software for Linux on z is treated differently than traditional mainframe software in terms of pricing. With Linux on z you think in terms of IFLs. The quantity of IFLs represents the number of Linux engines subject to IBM’s IPLA-based pricing.

Also think in terms of Processor Value Units (PVUs) rather than MSUs. For pricing purposes, PVUs are analogous to MSUs although the values are different. A key point to keep in mind: distributed PVUs for Linux are not related to System z IPLA value units used for z/VM products. As is typical of IBM, those two different kinds of value units are NOT interchangeable.

Chase, however, provides a few ground rules:

  • Dedicated partition
    • Processors are always allocated in whole increments
    • Resources are only moved between partitions “explicitly” (e.g. by an operator or a scheduled job)
  • Shared pool:
    • Pool of processors shared by partitions (including virtual machines)
    • System automatically dispatches processor resources between partitions as needed
  • Maximum license requirements:
    • Customer does not have to purchase more licenses for a product than the number of processors on the machine (e.g. maximum DB2 UDB licenses on a 12-way machine is 12)
    • Customer does not have to purchase more “shared pool” licenses for a product than the number of processors assigned to the shared pool (e.g. maximum of 7 MQSeries licenses for a shared pool with 7 processors). Note: This limit does not affect the additional licenses that might be required for dedicated partitions.
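The two caps in these ground rules can be sketched as a small calculation. This is one reading of the rules, not IBM's official method: the function name and parameters are hypothetical, and treating shared-pool demand as the sum of each partition's virtual processors is an assumption made for illustration.

```python
def licenses_required(machine_processors, dedicated_processors,
                      shared_virtual_processors, shared_pool_size):
    """Estimate licenses for one product under the two caps described above.

    dedicated_processors: whole processors per dedicated partition running the product.
    shared_virtual_processors: virtual processors per shared-pool partition
        running the product (assumed demand measure, not from the source).
    """
    # Dedicated partitions are licensed for every processor allocated to them.
    dedicated = sum(dedicated_processors)
    # Shared-pool licenses never exceed the number of processors in the pool.
    shared = min(sum(shared_virtual_processors), shared_pool_size)
    # The total never exceeds the number of processors on the machine.
    return min(dedicated + shared, machine_processors)


# Shared pool of 7 processors with 10 virtual processors defined: capped at 7.
pool_only = licenses_required(12, [], [4, 4, 2], 7)
# Adding dedicated partitions of 2 and 3 processors: 5 + 7 = 12 on a 12-way machine.
with_dedicated = licenses_required(12, [2, 3], [4, 4, 2], 7)
print(pool_only, with_dedicated)
```

The shared-pool cap is applied before the dedicated processors are added, which mirrors the note that the pool limit does not reduce the licenses needed for dedicated partitions.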

With that, as Chase explains it, Linux middleware pricing turns out to be relatively straightforward, determined by:

  • Processor Value Unit (PVU) rating for each kind of core
  • Any difference for different processor technologies (p, i, x, z, Sun, HP, AMD, etc.). Notice that the z is just one of many choices, not handled differently from the others
  • Number of processor cores which must be licensed (z calls them IFLs)
  • Price per PVU (constant per product, not different based upon technology)

Then it becomes a case of doing the basic arithmetic. The formula: PVUs per core x the # of cores required x the price ($) per PVU = your total cost. Given this formula it is to your advantage to plan your Linux use to minimize IFLs and cores. You can’t do anything about the price per PVU.

Distributed PVUs are the basis for licensing middleware on IFLs and are determined by the type of machine processor. The zEC12, z196, and z10 are rated at 120 PVUs per IFL. All others are rated at 100 PVUs per IFL. For example, for any distributed middleware running on Linux on z this works out to:

  • z114—1 IFL, 100 PVUs
  • z196—4 IFLs, 480 PVUs
  • zEC12—8 IFLs, 960 PVUs
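The arithmetic behind these figures can be checked with a short sketch. The PVU ratings come from the text above; the price per PVU is a made-up number for illustration only, since actual prices vary by product, and the function names are hypothetical.

```python
# PVU ratings per IFL as given in the text; any other machine is rated at 100.
PVUS_PER_IFL = {"zEC12": 120, "z196": 120, "z10": 120}
DEFAULT_PVUS_PER_IFL = 100


def total_pvus(machine, ifls):
    """PVUs to license: the rating per IFL times the number of IFLs."""
    return PVUS_PER_IFL.get(machine, DEFAULT_PVUS_PER_IFL) * ifls


def license_cost(machine, ifls, price_per_pvu):
    """Total cost: PVUs per IFL x number of IFLs x price per PVU."""
    return total_pvus(machine, ifls) * price_per_pvu


for machine, ifls in [("z114", 1), ("z196", 4), ("zEC12", 8)]:
    print(machine, ifls, total_pvus(machine, ifls))

# Hypothetical price of $50 per PVU, not an IBM list price.
example_cost = license_cost("z196", 4, 50.0)
print(example_cost)
```

Because the price per PVU is fixed per product, the only levers in this calculation are the machine type and the number of IFLs that have to be licensed.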

Also, distributed systems Linux middleware offerings are eligible for sub-capacity licensing. Specifically, sub-capacity licensing is available for all PVU-priced software offerings that run on:

  • UNIX (AIX, HP-UX, and Sun Solaris)
  • i5/OS, OS/400
  • Linux (System i, System p, System z)
  • x86 (VMware ESX Server, VMware GSX Server, Microsoft Virtual Server)

IBM’s virtualization technologies also are included in the Passport Advantage sub-capacity licensing offering, including LPAR, z/VM virtual machines in an LPAR, CPU Pooling support introduced in z/VM 6.3 APAR VM65418, and native z/VM (on machines which still support basic mode).

And in true z style, since this can seem more complicated than it should seem, there are tools available to do the job. In fact Chase doesn’t advise doing this without a tool. The current tool is the IBM License Metric Tool V9.0.1. You can find more details on it here.

If you are considering distributed Linux middleware software or are already wrestling with the pricing process, DancingDinosaur recommends you check out Chase’s links at the top of this piece. Good luck.

DancingDinosaur is Alan Radding. Follow DancingDinosaur on Twitter, @mainframeblog. You can check out more of my work at Technologywriter.com

