Posts Tagged ‘GPU’

IBM AI Reference Architecture Promises a Fast Start

August 10, 2018

Maybe somebody in your organization has already fooled around with a PoC for an AI project. Maybe you already want to build it out and even put it into production. Great! According to IBM:  By 2020, organizations across a wide array of different industries that don’t deploy AI will be in trouble. So those folks already fooling around with an AI PoC will probably be just in time.

To help organization pull the complicated pieces of AI together, IBM, with the help of IDC, put together its AI Infrastrucure Reference Architecture. This AI reference architecture, as IBM explains, is intended to be used by data scientists and IT professionals who are defining, deploying and integrating AI solutions into an organization. It describes an architecture that will support a promising proof of concept (PoC), experimental application, and sustain growth into production as a multitenant system that can continue to scale to serve a larger organization, while integrating into the organization’s existing IT infrastructure. If this sounds like you check it out. The document runs short, less than 30 pages, and free.

In truth, AI, for all the wonderful things you’d like to do with it, is more a system vendor’s dream than yours.  AI applications, and especially deep learning systems, which parse exponentially greater amounts of data, are extremely demanding and require powerful parallel processing capabilities. Standard CPUs, like those populating racks of servers in your data center, cannot sufficiently execute AI tasks. At some point, AI users will have to overhaul their infrastructure to deliver the required performance if they want to achieve their AI dreams and expectations.

Therefore, IDC recommends businesses developing AI capabilities or scaling existing AI capabilities, should plan to deliberately hit this wall in a controlled fashion. Do it knowingly and in full possession of the details to make the next infrastructure move. Also, IDC recommends you do it in close collaboration with a server vendor—guess who wants to be that vendor—who can guide them from early stage to advanced production to full exploitation of AI capabilities throughout the business.

IBM assumes everything is going to AI as quickly as it can, but that may not be the case for you. AI workloads include applications based on machine learning and deep learning, using unstructured data and information as the fuel to drive the next results. Some businesses are well on their way with deploying AI workloads, others are experimenting, and a third group is still evaluating what AI applications can mean for their organization. At all three stages the variables that, if addressed properly, together make up a well-working and business-advancing solution are numerous.

To get a handle on these variables, executives from IT and LOB managers often form a special committee to actively consider their organization’s approach to the AI. Nobody wants to invest in AI for the sake of AI; the vendors will get rich enough as it is. Also, there is no need to reinvent the wheel; many well-defined use cases exist that are applicable across industries. Many already are noted in the AI reference guide.

Here is a sampling:

  • Fraud analysis and investigation (banking, other industries)
  • Regulatory intelligence (multiple industries)
  • Automated threat intelligence and prevention systems (many industries)
  • IT automation, a sure winner (most industries)
  • Sales process recommendation and automation
  • Diagnosis and treatment (healthcare)
  • Quality management investigation and recommendation (manufacturing)
  • Supply and logistics (manufacturing)
  • Asset/fleet management, another sure winner (multiple industries)
  • Freight management (transportation)
  • Expert shopping/buying advisory or guide

Notes IDC: Many can be developed in-house, are available as commercial software, or via SaaS in the cloud.

Whatever you think of AI, you can’t avoid it. AI will penetrate your company embedded in the new products and services you buy.

So where does IBM hope your AI effort end up? Power9 System, hundreds of GPUs, and PowerAI. Are you surprised?

DancingDinosaur is Alan Radding, a veteran information technology analyst, writer, and ghost-writer. Follow DancingDinosaur on Twitter, @mainframeblog. See more of his work at technologywriter.com and here.

IBM Jumps into the Next Gen Server Party with POWER9

February 15, 2018

IBM re-introduced its POWER9 lineup of servers  this week starting with 2-socket and 4-socket systems and more variations coming in the months ahead as IBM, along with the rest of the IT vendor community grapples with how to address changing data center needs. The first, the AC922, arrived last fall. DancingDinosaur covered it here. More, the S922/S914/S924 and H922/H924/L922, are promised later this quarter.

The workloads organizations are running these days are changing, often dramatically and quickly. One processor, no matter how capable or flexible or efficient will be unlikely to do the job going forward. It will take an entire family of chips.  That’s as true for Intel and AMR and the other chip players as IBM.

In some ways, IBM’s challenge is even qwerkier. Its chips will not only need to support Linux and Windows, but also IBMi and AIX. IBM simply cannot abandon its IBMi and AIX customer bases. So chips supporting IBMi and AIX are being built into the POWER9 family.

For IBMi the company is promising POWER9 exploitation for:

  • Expanding the secure-ability of IBMi with TLS, secure APIs, and logs for SIEM solutions
  • Expanded Install options with an installation process using USB 3.0 media
  • Encryption and compression for cloud storage
  • Increasing the productivity of developers and administrators

This may sound trivial to those who have focused on the Linux world and work with x86 systems too, but it is not for a company still mired in productive yet aging IBMi systems.

IBM also is promising POWER9 goodies for AIX, its legacy Unix OS, including:

  • AIX Security: PowerSC and PowerSC MFA updates for malware intrusion prevention and strong authentication
  • New workload acceleration with shared memory communications over RDMA (SMC-R)
  • Improved availability: AIX Live Update enhancements; GDR 1.2; PowerHA 7.2
  • Improved Cloud Mgmt: IBM Cloud PowerVC Manager for SDI; Import/Export;
  • AIX 7.2 native support for POWER9 – e.g. enabling NVMe

Again, if you have been running Linux on z or LinuxONE this may sound antiquated, but AIX has not been considered state-of-the-art for years. NVMe alone gives is a big boost.

But despite all the nice things IBM is doing for IBMi and AIX, DancingDinosaur believes the company clearly is betting POWER9 will cut into Intel x86 sales. But that is not a given. Intel is rolling out its own family of advanced x86 Xeon machines under the Skylake code name. Different versions will be packaged and tuned to different workloads. They are rumored, at the fully configured high end, to be quite expensive. Just don’t expect POWER9 systems to be cheap either.

And the chip market is getting more crowded. As Timothy Prickett Morgan, analyst at The Next Platform noted, various ARM chips –especially ThunderX2 from Cavium and Centriq 2400 from Qualcomm –can boost non-X86 numbers and divert sales from IBM’s POWER9 family. Also, AMD’s Epyc X86 processors have a good chance of stealing some market share from Intel’s Skylake. So the POWER9 will have to fight for every sale IBM wants.

Morgan went on: IBM differentiated the hardware and the pricing with its NVLink versions, depending on the workload and the competition, with its most aggressive pricing and a leaner and cheaper microcode and hypervisor stack reserved for the Linux workloads that the company is chasing. IBM very much wants to sell its Power-Linux combo against Intel’s Xeon-Linux and also keep AMD’s Epyc-Linux at bay. Where the Power8 chip had the advantage over the Intel’s Haswell and Broadwell Xeon E5 processors when it came to memory capacity and memory bandwidth per socket, and could meet or beat the Xeons when it came to performance on some workloads that is not yet apparent with the POWER9.

With the POWER9, however, IBM will likely charge a little less for companies buying its Linux-only variants, observes Morgan, effectively enabling IBM to win Linux deals, particularly where data analytics and open source databases drive the customer’s use case. Similarly, some traditional simulation and modeling workloads in the HPC and machine learning areas are ripe for POWER9.

POWER9 is not one chip. Packed into the chip are next-generation NVIDIA NVLink and OpenCAPI to provide significantly faster performance for attached GPUs. The PCI-Express 4.0 interconnect will be twice the speed of PCI-Express 3.0. The open POWER9 architecture also allows companies to mix a wide range of accelerators to meet various needs. Meanwhile, OpenCAPI can unlock coherent FPGAs to support varied accelerated storage, compute, and networking workloads. IBM also is counting on the 300+ members of the OpenPOWER Foundation and OpenCAPI Consortium to launch innovations for POWER9. Much is happening: Stay tuned to DancingDinosaur

DancingDinosaur is Alan Radding, a veteran information technology analyst, writer, and ghost-writer. Follow DancingDinosaur on Twitter, @mainframeblog. See more of his work at technologywriter.com and here.

Meltdown and Spectre Attacks Require IBM Mitigation

January 12, 2018

The chip security threats dubbed Meltdown and Spectre revealed last month apparently will require IBM threat mitigation in the form of code and patching. IBM has been reticent to make a major public announcement, but word finally is starting to percolate publicly.

Courtesy: Preparis Inc.

On January 4, one day after researchers disclosed the Meltdown and Spectre attack methods against Intel, AMD and ARM processors the Internet has been buzzing.  Wrote Eduard Kovacs on Wed.; Jan. 10, IBM informed customers that it had started analyzing impact on its own products. The day before IBM revealed its POWER processors are affected.

A published report from Virendra Soni, January 11, on the Consumer Electronics Show (CES) 2018 in Las Vegas where Nvidia CEO Jensen Huang revealed how the technology leaders are scrambling to find patches to the Spectre and Meltdown attacks. These attacks enable hackers to steal private information off users’ CPUs running processors from Intel, AMD, and ARM.

For DancingDinosaur readers, that puts the latest POWER chips and systems at risk. At this point, it is not clear how far beyond POWER systems the problem reaches. “We believe our GPU hardware is immune. As for our driver software, we are providing updates to help mitigate the CPU security issue,” Nvidia wrote in their security bulletin.

Nvidia also reports releasing updates for its software drivers that interact with vulnerable CPUs and operating systems. The vulnerabilities take place in three variants: Variant 1, Variant 2, and Variant 3. Nvidia has released driver updates for Variant 1 and 2. The company notes none of its software is vulnerable to Variant 3. Nvidia reported providing security updates for these products: GeForce, Quadro, NVS Driver Software, Tesla Driver Software, and GRID Driver Software.

IBM has made no public comments on which of their systems are affected. But Red Hat last week reported IBM’s System Z, and POWER platforms are impacted by Spectre and Meltdown. IBM may not be saying much but Red Hat is, according to Soni: “Red Hat last week reported that IBM’s System Z, and POWER platforms are exploited by Spectre and Meltdown.”

So what is a data center manager with a major investment in these systems to do?  Meltdown and Spectre “obviously are a very big problem, “ reports Timothy Prickett Morgan, a leading analyst at The Last Platform, an authoritative website following the server industry. “Chip suppliers and operating systems and hypervisor makers have known about these exploits since last June, and have been working behind the scenes to provide corrective countermeasures to block them… but rumors about the speculative execution threats forced the hands of the industry, and last week Google put out a notice about the bugs and then followed up with details about how it has fixed them in its own code. Read it here.

Chipmakers AMD and AMR put out a statement saying only Variant 1 of the speculative execution exploits (one of the Spectre variety known as bounds check bypass), and by Variant 2 (also a Spectre exploit known as branch target injection) affected them. AMD, reports Morgan, also emphasized that it has absolutely no vulnerability to Variant 3, a speculative execution exploit called rogue data cache load and known colloquially as Meltdown.  This is due, he noted, to architectural differences between Intel’s X86 processors and AMD’s clones.

As for IBM, Morgan noted: its Power chips are affected, at least back to the Power7 from 2010 and continuing forward to the brand new Power9. In its statement, IBM said that it would have patches out for firmware on Power machines using Power7+, Power8, Power8+, and Power9 chips on January 9, which passed, along with Linux patches for those machines; patches for the company’s own AIX Unix and proprietary IBM i operating systems will not be available until February 12. The System z mainframe processors also have speculative execution, so they should, in theory, be susceptible to Spectre but maybe not Meltdown.

That still leaves a question about the vulnerability of the IBM LinuxONE and the processors spread throughout the z systems. Ask your IBM rep when you can expect mitigation for those too.

Just patching these costly systems should not be sufficiently satisfying. There is a performance price that data centers will pay. Google noted a negligible impact on performance after it deployed one fix on Google’s millions of Linux systems, said Morgan. There has been speculation, Googled continued, that the deployment of KPTI (a mitigation fix) causes significant performance slowdowns. As far as is known, there is no fix for Spectre Variant 1 attacks, which have to be fixed on a binary-by-binary basis, according to Google.

Red Hat went further and actually ran benchmarks. The company tested its Enterprise Linux 7 release on servers using Intel’s “Haswell” Xeon E5 v3, “Broadwell” Xeon E5 v4, and “Skylake,” the upcoming Xeon SP processors, and showed impacts that ranged from 1-19 percent. You can demand these impacts be reflected in reduced system prices.

DancingDinosaur is Alan Radding, a veteran information technology analyst, writer, and ghost-writer. Please follow DancingDinosaur on Twitter, @mainframeblog. See more of his IT writing at technologywriter.com and here.

 


%d bloggers like this: