Posts Tagged ‘TensorFlow’

Is Your Enterprise Ready for AI?

May 11, 2018

According to IBM’s gospel of AI “we are in the midst of a global transformation and it is touching every aspect of our world, our lives, and our businesses.”  IBM has been preaching its gospel of AI of the past year or longer, but most of its clients haven’t jumped fully aboard. “For most of our clients, AI will be a journey. This is demonstrated by the fact that most organizations are still in the early phases of AI adoption.”

AC922 with NIVIDIA Tesla V100 and Enhanced NVLink GPUs

The company’s latest announcements earlier this week focus POWER9 squarely on AI. Said Tim Burke, Engineering Vice President, Cloud and Operating System Infrastructure, at Red Hat. “POWER9-based servers, running Red Hat’s leading open technologies offer a more stable and performance optimized foundation for machine learning and AI frameworks, which is required for production deployments… including PowerAI, IBM’s software platform for deep learning with IBM Power Systems that includes popular frameworks like Tensorflow and Caffe, as the first commercially supported AI software offering for [the Red Hat] platform.”

IBM insists this is not just about POWER9 and they may have a point; GPUs and other assist processors are taking on more importance as companies try to emulate the hyperscalers in their efforts to drive server efficiency while boosting power in the wake of declines in Moore’s Law. ”GPUs are at the foundation of major advances in AI and deep learning around the world,” said Paresh Kharya, group product marketing manager of Accelerated Computing at NVIDIA. [Through] “the tight integration of IBM POWER9 processors and NVIDIA V100 GPUs made possible by NVIDIA NVLink, enterprises can experience incredible increases in performance for compute- intensive workloads.”

To create an AI-optimized infrastructure, IBM announced the latest additions to its POWER9 lineup, the IBM Power Systems LC922 and LC921. Characterized by IBM as balanced servers offering both compute capabilities and up to 120 terabytes of data storage and NVMe for rapid access to vast amounts of data. IBM included HDD in the announcement but any serious AI workload will choke without ample SSD.

Specifically, these new servers bring an updated version of the AC922 server, which now features recently announced 32GB NVIDIA V100 GPUs and larger system memory, which enables bigger deep learning models to improve the accuracy of AI workloads.

IBM has characterized the new models as data-intensive machines and AI-intensive systems, LC922 and LC921 Servers with POWER9 processors. The AC922, arrived last fall. It was designed for the what IBM calls the post-CPU era. The AC922 was the first to embed PCI-Express 4.0, next-generation NVIDIA NVLink, and OpenCAPI—3 interface accelerators—which together can accelerate data movement 9.5x faster than PCIe 3.0 based x86 systems. The AC922 was designed to drive demonstrable performance improvements across popular AI frameworks such as TensorFlow and Caffe.

In the post CPU era, where Moore’s Law no longer rules, you need to pay as much attention to the GPU and other assist processors as the CPU itself, maybe even more so. For example, the coherence and high-speed of the NVLink enables hash tables—critical for fast analytics—on GPUs. As IBM noted at the introduction of the new machines this week: Hash tables are fundamental data structure for analytics over large datasets. For this you need large memory: small GPU memory limits hash table size and analytic performance. The CPU-GPU NVLink2 solves 2 key problems: large memory and high-speed enables storing the full hash table in CPU memory and transferring pieces to GPU for fast operations; coherence enables new inserts in CPU memory to get updated in GPU memory. Otherwise, modifications on data in CPU memory do not get updated in GPU memory.

IBM has started referring to the LC922 and LC921 as big data crushers. The LC921 brings 2 POWER9 sockets in a 1U form factor; for I/O it comes with both PCIe 4.0 and CAPI 2.0.; and offers up to 40 cores (160 threads) and 2TB RAM, which is ideal for environments requiring dense computing.

The LC922 is considerably bigger. It offers balanced compute capabilities delivered with the P9 processor and up to 120TB of storage capacity, again advanced I/O through PCIe 4.0/CAPI 2.0, and up to 44 cores (176 threads) and 2TB RAM. The list price, notes IBM is ~30% less.

If your organization is not thinking about AI your organization is probably in the minority, according to IDC.

  • 31 percent of organizations are in [AI] discovery/evaluation
  • 22 percent of organizations plan to implement AI in next 1-2 years
  • 22 percent of organizations are running AI trials
  • 4 percent of organizations have already deployed AI

Underpinning both servers is the IBM POWER9 CPU. The POWER9 enjoys a nearly 5.6x improved CPU to GPU bandwidth vs x86, which can improve deep learning training times by nearly 4x. Even today companies are struggling to cobble together the different pieces and make them work. IBM learned that lesson and now offers a unified AI infrastructure in PowerAI and Power9 that you can use today.

DancingDinosaur is Alan Radding, a veteran information technology analyst, writer, and ghost-writer. Please follow DancingDinosaur on Twitter, @mainframeblog. See more of his IT writing at technologywriter.com and here.

IBM’s POWER9 Races to AI

December 7, 2017

IBM is betting the future of its Power Systems on artificial intelligence (AI). The company introduced its newly designed POWER9 processor publicly this past Tuesday. The new machine, according to IBM, is capable of shortening the training of deep learning frameworks by nearly 4x, allowing enterprises to build more accurate AI applications, faster.

IBM engineer tests the POWER9

Designed for the post-CPU era, the core POWER9 building block is the IBM Power Systems AC922. The AC922, notes IBM, is the first to embed PCI-Express 4.0, next-generation NVIDIA NVLink, and OpenCAPI—3 interface accelerators—which together can accelerate data movement 9.5x faster than PCIe 3.0 based x86 systems. The AC922 is designed to drive demonstrable performance improvements across popular AI frameworks such as Chainer, TensorFlow and Caffe, as well as accelerated databases such as Kinetica.

More than a CPU under the AC922 cover

Depending on your sense of market timing, POWER9 may be coming at the best or worst time for IBM.  Notes industry observer Timothy Prickett Morgan, The Next Platform: “The server market is booming as 2017 comes to a close, and IBM is looking to try to catch the tailwind and lift its Power Systems business.”

As Morgan puts it, citing IDC 3Q17 server revenue figures, HPE and Dell are jockeying for the lead in the server space, and for the moment, HPE (including its H3C partnership in China) has the lead with $3.32 billion in revenues, compared to Dell’s $3.07 billion, while Dell was the shipment leader, with 503,000 machines sold in Q3 2017 versus HPE’s 501,400 machines shipped. IBM does not rank in the top five shippers but thanks in part to the Z and big Power8 boxes, IBM still holds the number three server revenue generator spot, with $1.09 billion in sales for the third quarter, according to IDC. The z system accounted for $673 million of that, up 63.8 percent year-on year due mainly to the new Z. If you do the math, Morgan continued, the Power Systems line accounted for $420.7 million in the period, down 7.2 percent from Q3 2016. This is not surprising given that customers held back knowing Power9 systems were coming.

To get Power Systems back to where it used to be, Morgan continued, IBM must increase revenues by a factor of three or so. The good news is that, thanks to the popularity of hybrid CPU-GPU systems, which cost around $65,000 per node from IBM, this isn’t impossible. Therefore, it should take fewer machines to rack up the revenue, even if it comes from a relatively modest number of footprints and not a huge number of Power9 processors. More than 90 percent of the compute in these systems is comprised of GPU accelerators, but due to bookkeeping magic, it all accrues to Power Systems when these machines are sold. Plus IBM reportedly will be installing over 10,000 such nodes for the US Department of Energy’s Summit and Sierra supercomputers in the coming two quarters, which should provide a nice bump. And once IBM gets the commercial Power9 systems into the field, sales should pick up again, Morgan expects.

IBM clearly is hoping POWER9 will cut into Intel x86 sales. But that may not happen as anticipated. Intel is bringing out its own advanced x86 Xeon machine, Skylake, rumored to be quite expensive. Don’t expect POWER9 systems to be cheap either. And the field is getting more crowded. Morgan noted various ARM chips –especially ThunderX2 from Cavium and Centriq 2400 from Qualcomm –can boost non-X86 numbers and divert sales from IBM’s Power9 system. Also, AMD’s Epyc X86 processors have a good chance of stealing some market share from Intel’s Skylake. So the Power9 will have to fight for every sale IBM wants and take nothing for granted.

No doubt POWER9 presents a good case and has a strong backer in Google, but even that might not be enough. Still, POWER9 sits at the heart of what is expected to be the most powerful data-intensive supercomputers in the world, the Summit and Sierra supercomputers, expected to knock off the world’s current fastest supercomputers from China.

Said Bart Sano, VP of Google Platforms: “Google is excited about IBM’s progress in the development of the latest POWER technology;” adding “the POWER9 OpenCAPI bus and large memory capabilities allow further opportunities for innovation in Google data centers.”

This really is about deep learning, one of the latest hot buzzwords today. Deep learning emerged as a fast growing machine learning method that extracts information by crunching through millions of processes and data to detect and rank the most important aspects of the data. IBM designed the POWER9 chip to manage free-flowing data, streaming sensors, and algorithms for data-intensive AI and deep learning workloads on Linux.  Are your people ready to take advantage of POWER9?

DancingDinosaur is Alan Radding, a veteran information technology analyst, writer, and ghost-writer. Please follow DancingDinosaur on Twitter, @mainframeblog. See more of his IT writing at technologywriter.com and here.


%d bloggers like this: