Posts Tagged ‘machine learning’

Can IBM find a place for Watson?

September 7, 2018

After beating 2 human Jeopardy game champions three times in a row in 2011 IBM’s Watson has been hard pressed to come up with a comparable winning streak. Initially IBM appeared to expect its largest customers to buy richly configured Power Servers to run Watson on prem. When they didn’t get enough takers the company moved Watson to the cloud where companies could lease it for major knowledge-driven projects. When that didn’t catch on IBM started to lease Watson’s capabilities by the drink, promising to solve problems in onesies and twosies.

Jeopardy champs lose to Watson

Today Watson is promising to speed AI success through IBM’s Watson Knowledge Catalog. As IBM puts it: IBM Watson Knowledge Catalog powers intelligent, self-service discovery of data, models, and more; activating them for artificial intelligence, machine learning, and deep learning. Access, curate, categorize and share data, knowledge assets, and their relationships, wherever they reside.

DancingDinosaur has no doubt that Watson is stunning technology and has been rooting for its success since that first Jeopardy round seven years ago. Over that time, Watson and IBM have become a case study in how not to price, package, and market powerful yet expensive technology. The Watson Knowledge Catalog is yet another pricing and packaging experiment.

Based on the latest information online, Watson Knowledge Catalog is priced according to number of provisioned catalogs and discovery connections. There are two plans available: Lite and Professional. The Lite plan allows 1 catalog and 5 free discovery connections while the Professional plan provides unlimited of both. Huh? This statement begs for clarification and there probably is a lot of information and fine print required to answer the numerous questions the above description raises, but life is too short for DancingDinosaur to rummage around on the Watson Knowledge Catalog site to look for answers. Doesn’t this seem like something Watson itself should be able to clarify with a single click?

But no, that is too easy. Instead IBM takes the high road, which DancingDinosaur calls the education track.  Notes Jay Limburn, Senior Technical Staff Member and IBM Offering Manager: there are two main challenges that might impede you from realizing the true value of your data and slowing your journey to adopting artificial intelligence (AI). They are 1) inefficient data management and 2) finding the right tools for all data users.

Actually, the issues start even earlier. In attempting AI most established organizations start at a disadvantage, notes IBM. For example:

  • Most enterprises do not know what and where their data is
  • Data science and compliance teams are handicapped by the lack of data accessibility
  • Enterprises with legacy data are even more handicapped than digitally savvy startups
  • AI projects will expose problems with limited data and poor quality; many will simply fail just due to that.
  • The need to differentiate through monetization increases in importance with AI

These are not new. People have been whining about this since the most rudimentary data mining attempts were made decades ago. If there is a surprise it is that they have not been resolved by now.

Or maybe they finally have with the IBM Watson Knowledge Catalog. As IBM puts it, the company will deliver what promises to be the ultimate data Catalog that actually illuminates data:

  • Knows what data your enterprise has
  • Where it resides
  • Where it came from
  • What it means
  • Provide quick access to it
  • Ensure protection of use
  • Exploit Machine Learning for intelligence and automation
  • Enable data scientists, data engineers, stewards and business analysts
  • Embeddable everywhere for free, with premium features available in paid editions

OK, after 7 years Watson may be poised to deliver and it has little to do with Jeopardy but with a rapidly growing data catalog market. According to a Research and Markets report, the data catalog market is expected to grow from $210 million in 2017 to $620 million by 2022. How many sales of the Professional version gets IBM a leading share.

DancingDinosaur is Alan Radding, a veteran information technology analyst, writer, and ghost-writer. Follow DancingDinosaur on Twitter, @mainframeblog. See more of his work at technologywriter.com and here.

IBM AI Reference Architecture Promises a Fast Start

August 10, 2018

Maybe somebody in your organization has already fooled around with a PoC for an AI project. Maybe you already want to build it out and even put it into production. Great! According to IBM:  By 2020, organizations across a wide array of different industries that don’t deploy AI will be in trouble. So those folks already fooling around with an AI PoC will probably be just in time.

To help organization pull the complicated pieces of AI together, IBM, with the help of IDC, put together its AI Infrastrucure Reference Architecture. This AI reference architecture, as IBM explains, is intended to be used by data scientists and IT professionals who are defining, deploying and integrating AI solutions into an organization. It describes an architecture that will support a promising proof of concept (PoC), experimental application, and sustain growth into production as a multitenant system that can continue to scale to serve a larger organization, while integrating into the organization’s existing IT infrastructure. If this sounds like you check it out. The document runs short, less than 30 pages, and free.

In truth, AI, for all the wonderful things you’d like to do with it, is more a system vendor’s dream than yours.  AI applications, and especially deep learning systems, which parse exponentially greater amounts of data, are extremely demanding and require powerful parallel processing capabilities. Standard CPUs, like those populating racks of servers in your data center, cannot sufficiently execute AI tasks. At some point, AI users will have to overhaul their infrastructure to deliver the required performance if they want to achieve their AI dreams and expectations.

Therefore, IDC recommends businesses developing AI capabilities or scaling existing AI capabilities, should plan to deliberately hit this wall in a controlled fashion. Do it knowingly and in full possession of the details to make the next infrastructure move. Also, IDC recommends you do it in close collaboration with a server vendor—guess who wants to be that vendor—who can guide them from early stage to advanced production to full exploitation of AI capabilities throughout the business.

IBM assumes everything is going to AI as quickly as it can, but that may not be the case for you. AI workloads include applications based on machine learning and deep learning, using unstructured data and information as the fuel to drive the next results. Some businesses are well on their way with deploying AI workloads, others are experimenting, and a third group is still evaluating what AI applications can mean for their organization. At all three stages the variables that, if addressed properly, together make up a well-working and business-advancing solution are numerous.

To get a handle on these variables, executives from IT and LOB managers often form a special committee to actively consider their organization’s approach to the AI. Nobody wants to invest in AI for the sake of AI; the vendors will get rich enough as it is. Also, there is no need to reinvent the wheel; many well-defined use cases exist that are applicable across industries. Many already are noted in the AI reference guide.

Here is a sampling:

  • Fraud analysis and investigation (banking, other industries)
  • Regulatory intelligence (multiple industries)
  • Automated threat intelligence and prevention systems (many industries)
  • IT automation, a sure winner (most industries)
  • Sales process recommendation and automation
  • Diagnosis and treatment (healthcare)
  • Quality management investigation and recommendation (manufacturing)
  • Supply and logistics (manufacturing)
  • Asset/fleet management, another sure winner (multiple industries)
  • Freight management (transportation)
  • Expert shopping/buying advisory or guide

Notes IDC: Many can be developed in-house, are available as commercial software, or via SaaS in the cloud.

Whatever you think of AI, you can’t avoid it. AI will penetrate your company embedded in the new products and services you buy.

So where does IBM hope your AI effort end up? Power9 System, hundreds of GPUs, and PowerAI. Are you surprised?

DancingDinosaur is Alan Radding, a veteran information technology analyst, writer, and ghost-writer. Follow DancingDinosaur on Twitter, @mainframeblog. See more of his work at technologywriter.com and here.

Get a Next-Gen Datacenter with IBM-Nutanix POWER8 System

July 14, 2017

First announced by IBM on May 16 here, this solution, driven by client demand for a simplified hyperconverged—combined server, network, storage, hardware, software—infrastructure, is designed for data-intensive enterprise workloads.  Aimed for companies increasingly looking for the ease of deployment, use, and management that hyperconverged solutions promise. It is being offered as an integrated hardware and software offering in order to deliver on that expectation.

Music made with IBM servers, storage, and infrastructure

IBM’s new POWER8 hyperconverged solutions enable a public cloud-like experience through on-premises infrastructure with top virtualization and automation capabilities combined with Nutanix’s public and on-premises cloud capabilities. They provide a combination of reliable storage, fast networks, scalability and extremely powerful computing in modular, scalable, manageable building blocks that can be scaled simply by adding nodes when needed.

Over time, IBM suggests a roadmap of offerings that will roll out as more configurations are needed to satisfy client demand and as feature and function are brought into both the IBM Cognitive Systems portfolio and the Nutanix portfolio. Full integration is key to the value proposition of this offering so more roadmap options will be delivered as soon as feature function is delivered and integration testing can be completed.

Here are three immediate things you might do with these systems:

  1. Mission-critical workloads, such as databases, large data warehouses, web infrastructure, and mainstream enterprise apps
  2. Cloud native workloads, including full stack open source middleware, enterprise databases
    and containers
  3. Next generation cognitive workloads, including big data, machine learning, and AI

Note, however, the change in IBM’s pricing strategy. The products will be priced with the goal to remain neutral on total cost of acquisition (TCA) to comparable offerings on x86. In short, IBM promises to be competitive with comparable x86 systems in terms of TCA. This is a significant deviation from IBM’s traditional pricing, but as we have started to see already and will continue to see going forward IBM clearly is ready to play pricing flexibility to win the deals on products it wants to push.

IBM envisions the new hyperconverged systems to bring data-intensive enterprise workloads like EDB Postgres, MongoDB and WebSphere into a simple-to-manage, on-premises cloud environment. Running these complex workloads on IBM Hyperconverged Nutanix POWER8 system can help an enterprise quickly and easily deploy open source databases and web-serving applications in the data center without the complexity of setting up all of the underlying infrastructure plumbing and wrestling with hardware-software integration.

And maybe more to IBM’s ultimate aim, these operational data stores may become the foundational building blocks enterprises will use to build a data center capable of taking on cognitive workloads. These ever-advancing workloads in advanced analytics, machine learning and AI will require the enterprise to seamlessly tap into data already housed on premises. Soon expect IBM to bring new offerings to market through an entire family of hyperconverged systems that will be designed to simply and easily deploy and scale a cognitive cloud infrastructure environment.

Currently, IBM offers two systems: the IBM CS821 and IBM CS822. These servers are the industry’s first hyperconverged solutions that marry Nutanix’s one-click software simplicity and scalability with the proven performance of the IBM POWER architecture, which is designed specifically for data-intensive workloads. The IBM CS822 (the larger of the two offerings) sports 22 POWER8 processor cores. That’s 176 compute threads, with up to 512 GB of memory and 15.36 TB of flash storage in a compact server that meshes seamlessly with simple Nutanix Prism management.

This server runs Nutanix Acropolis with AHV and little endian Linux. If IBM honors its stated pricing policy promise, the cost should be competitive on the total cost of acquisition for comparable offerings on x86. DancingDinosaur is not a lawyer (to his mother’s disappointment), but it looks like there is considerable wiggle room in this promise. IBM Hyperconverged-Nutanix Systems will be released for general availability in Q3 2017. Specific timelines, models, and supported server configurations will be announced at the time of availability.

DancingDinosaur is Alan Radding, a veteran information technology analyst, writer, and ghost-writer. Please follow DancingDinosaur on Twitter, @mainframeblog. See more of his IT writing at technologywriter.com and here.

 

IBM On-Premises Cognitive Means z Systems Only

February 16, 2017

Just in case you missed the incessant drumbeat coming out of IBM, the company committed to cognitive computing. But that works for z data centers since IBM’s cognitive system is available on-premises only for the z. Another z first: IBM just introduced Machine Learning (key for cognitive) for the private cloud starting with the z.

ibm-congitive-graphic

There are three ways to get IBM cognitive computing solutions: the IBM Cloud, Watson, or the z System, notes Donna Dillenberger, IBM Fellow, IBM Enterprise Solutions. The z, however, is the only platform that IBM supports for cognitive computing on premises (sorry, no Power). As such, the z represents the apex of programmatic computing, at least as IBM sees it. It also is the only IBM platform that supports cognitive natively; mainly in the form of Hadoop and Spark, both of which are programmatic tools.

What if your z told you that a given strategy had a 92% of success. It couldn’t do that until now with IBM’s recently released cognitive system for z.

Your z system today represents the peak of programmatic computing. That’s what everyone working in computers grew up with, going all the way back to Assembler, COBOL, and FORTRAN. Newer languages and operating systems have arrived since; today your mainframe can respond to Java or Linux and now Python and Anaconda. Still, all are based on the programmatic computing model.

IBM believes the future lies in cognitive computing. Cognitive has become the company’s latest strategic imperative, apparently trumping its previous strategic imperatives: cloud, analytics, big data, and mobile. Maybe only security, which quietly slipped in as a strategic imperative sometime 2016, can rival cognitive, at least for now.

Similarly, IBM describes itself as a cognitive solutions and cloud platform company. IBM’s infatuation with cognitive starts with data. Only cognitive computing will enable organizations to understand the flood of myriad data pouring in—consisting of structured, local data but going beyond to unlock the world of global unstructured data; and then to decision tree-driven, deterministic applications, and eventually, probabilistic systems that co-evolve with their users by learning along with them.

You need cognitive computing. It is the only way, as IBM puts it: to move beyond the constraints of programmatic computing. In the process, cognitive can take you past keyword-based search that provides a list of locations where an answer might be located to an intuitive, conversational means to discover a set of confidence-ranked possibilities.

Dillenberger suggests it won’t be difficult to get to the IBM cognitive system on z . You don’t even program a cognitive system. At most, you train it, and even then the cognitive system will do the heavy lifting by finding the most appropriate training models. If you don’t have preexisting training models, “just use what the cognitive system thinks is best,” she adds. Then the cognitive system will see what happens and learn from it, tweaking the models as necessary based on the results and new data it encounters. This also is where machine learning comes in.

IBM has yet to document payback and ROI data. Dillenberger, however, has spoken with early adopters.  The big promised payback, of course, will come from the new insights uncovered and the payback will be as astronomical or meager as you are in executing on those insights.

But there also is the promise of a quick technical payback for z data centers managers. When the data resides on z—a huge advantage for the z—you just run analytics where the data is. In such cases you can realize up to 3x the performance, Dillenberger noted.  Even if you have to pull data from some other location too you still run faster, maybe 2x faster. Other z advantages include large amounts of memory, multiple levels of cache, and multiple I/O processors get at data without impacting CPU performance.

When the data and IBM’s cognitive system resides on the z you can save significant money. “ETL consumed huge amounts of MIPS. But when the client did it all on the z, it completely avoided the costly ETL process,” Dillenberger noted. As a result, that client reported savings of $7-8 million dollars a year by completely bypassing the x-86 layer and ETL and running Spark natively on the z.

As Dillenberger describes it, cognitive computing on the z is here now, able to deliver a payback fast, and an even bigger payback going forward as you execute on the insights it reveals. And you already have a z, the only on-premises way to IBM’s Cognitive System.

DancingDinosaur is Alan Radding, a veteran information technology analyst, writer, and ghost-writer. Please follow DancingDinosaur on Twitter, @mainframeblog. See more of his IT writing at technologywriter.com and here.

 

IBM Continues Open Source Commitment with Apache Spark

June 18, 2015

If anyone believes IBM’s commitment to open source is a passing fad, forget it. IBM has invested billions in Linux, open Power through the Open Power Foundation, and more. Its latest is the announcement of a major commitment to Apache Spark, a fast open source and general cluster computing system for big data.

spark VGN8668

Courtesy of IBM: developers work with Spark at Galvanize Hackathon

As IBM sees it, Spark brings essential advances to large-scale data processing. Specifically, it dramatically improves the performance of data dependent-apps and is expected to play a big role in the Internet of Things (IoT). In addition, it radically simplifies the process of developing intelligent apps, which are fueled by data. It does so by providing high-level APIs in Scala, Java, and Python, and an optimized engine that supports general computation graphs for data analysis. It also supports a rich set of higher-level tools including Spark SQL for SQL and DataFrames, MLlib for machine learning, GraphX for graph processing, and Spark Streaming for stream processing.

IBM is contributing its breakthrough IBM SystemML machine learning technology to the Spark open source ecosystem. Spark brings essential advances to large-scale data processing, such as improvements in the performance of data dependent apps. It also radically simplifies the process of developing intelligent apps, which are fueled by data. But maybe the biggest advantage is that it can handle data coming from multiple, disparate sources.

What IBM likes in Spark is that it’s agile, fast, and easy to use. It also likes it being open source, which ensures it is improved continuously by a worldwide community. That’s also some of the main reasons mainframe and Power Systems data centers should pay attention to Spark.  Spark will make it easier to connect applications to data residing in your data center. If you haven’t yet noticed an uptick in mobile transactions coming into your data center, they will be coming. These benefit from Spark. And if you look out just a year or two, expect to see IoT applications adding to and needing to combine all sorts of data, much of it ending up on the mainframe or Power System in one form or another. So make sure Spark is on your radar screen.

Over the course of the next few months, IBM scientists and engineers will work with the Apache Spark open community to accelerate access to advanced machine learning capabilities and help drive speed-to-innovation in the development of smart business apps. By contributing SystemML, IBM hopes data scientists iterate faster to address the changing needs of business and to enable a growing ecosystem of app developers who will apply deep intelligence to everything.

To ensure that happens, IBM will commit more than 3,500 researchers and developers to work on Spark-related projects at more than a dozen labs worldwide, and open a Spark Technology Center in San Francisco for the Data Science and Developer community to foster design-led innovation in intelligent applications. IBM also aims to educate more than 1 million data scientists and data engineers on Spark through extensive partnerships with AMPLab, DataCamp, MetiStream, Galvanize, and Big Data University MOOC (Massive Open Online Course).

Of course, Spark isn’t going to be the end of tools to expedite the latest app dev. With IoT just beginning to gain widespread interest expect a flood of tools to expedite developing IoT data-intensive applications and more tools to facilitate connecting all these coming connected devices, estimated to number in the tens of billions within a few years.

DancingDinosaur applauds IBM’s decade-plus commitment to open source and its willingness to put real money and real code behind it. That means the IBM z System mainframe, the POWER platform, Linux, and the rest will be around for some time. That’s good; DancingDinosaur is not quite ready to retire.

DancingDinosaur is Alan Radding, a veteran IT analyst and writer. Please follow DancingDinosaur on Twitter, @mainframeblog. See more of his IT writing on Technologywriter.com and here.


%d bloggers like this: