Posts Tagged ‘Caffe’

IBM’s POWER9 Races to AI

December 7, 2017

IBM is betting the future of its Power Systems on artificial intelligence (AI). The company introduced its newly designed POWER9 processor publicly this past Tuesday. The new machine, according to IBM, is capable of cutting training times for deep learning frameworks by nearly 4x, allowing enterprises to build more accurate AI applications, faster.

IBM engineer tests the POWER9

Designed for the post-CPU era, the core POWER9 building block is the IBM Power Systems AC922. The AC922, notes IBM, is the first to embed PCI-Express 4.0, next-generation NVIDIA NVLink, and OpenCAPI, three interface accelerators that together can move data 9.5x faster than PCIe 3.0-based x86 systems. The AC922 is designed to drive demonstrable performance improvements across popular AI frameworks such as Chainer, TensorFlow and Caffe, as well as accelerated databases such as Kinetica.

More than a CPU under the AC922 cover

Depending on your sense of market timing, POWER9 may be coming at the best or worst time for IBM. Notes industry observer Timothy Prickett Morgan of The Next Platform: “The server market is booming as 2017 comes to a close, and IBM is looking to try to catch the tailwind and lift its Power Systems business.”

As Morgan puts it, citing IDC 3Q17 server revenue figures, HPE and Dell are jockeying for the lead in the server space. For the moment, HPE (including its H3C partnership in China) has the revenue lead with $3.32 billion, compared to Dell’s $3.07 billion, while Dell was the shipment leader, with 503,000 machines sold in Q3 2017 versus HPE’s 501,400 machines shipped. IBM does not rank in the top five shippers, but thanks in part to the Z and big POWER8 boxes, it still holds the number three spot in server revenue, with $1.09 billion in sales for the third quarter, according to IDC. The Z system accounted for $673 million of that, up 63.8 percent year-on-year due mainly to the new Z. If you do the math, Morgan continued, the Power Systems line accounted for $420.7 million in the period, down 7.2 percent from Q3 2016. This is not surprising given that customers held back knowing POWER9 systems were coming.
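For readers who want to check Morgan’s math, here is a minimal Python sketch that uses only the figures quoted above. The names are illustrative assumptions, the total assumes IBM’s server revenue is just Z plus Power Systems, and the year-ago numbers it backs out are implied by the quoted percentage changes rather than reported by IDC or Morgan.

```python
# Figures quoted in the paragraph above, in millions of US dollars.
Z_REVENUE_Q3_2017 = 673.0
POWER_REVENUE_Q3_2017 = 420.7


def prior_year_value(current, pct_change):
    """Back out the year-ago value implied by a year-on-year percentage change."""
    return current / (1 + pct_change / 100)


# Z plus Power should land near the $1.09 billion IDC total for IBM.
ibm_total_billions = (Z_REVENUE_Q3_2017 + POWER_REVENUE_Q3_2017) / 1000

# Implied Q3 2016 revenue: Z was up 63.8 percent, Power was down 7.2 percent.
z_q3_2016 = prior_year_value(Z_REVENUE_Q3_2017, 63.8)
power_q3_2016 = prior_year_value(POWER_REVENUE_Q3_2017, -7.2)

print(f"IBM total: ${ibm_total_billions:.2f} billion")
print(f"Implied Q3 2016 Z revenue: ${z_q3_2016:.1f} million")
print(f"Implied Q3 2016 Power revenue: ${power_q3_2016:.1f} million")
```

The two product lines sum to roughly the $1.09 billion total, which is presumably how the $420.7 million Power Systems figure was derived.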

To get Power Systems back to where it used to be, Morgan continued, IBM must increase revenues by a factor of three or so. The good news is that, thanks to the popularity of hybrid CPU-GPU systems, which cost around $65,000 per node from IBM, this isn’t impossible. At that price, it should take fewer machines to rack up the revenue, even if it comes from a relatively modest number of footprints and not a huge number of POWER9 processors. More than 90 percent of the compute in these systems comes from GPU accelerators, but due to bookkeeping magic, it all accrues to Power Systems when these machines are sold. Plus IBM reportedly will be installing over 10,000 such nodes for the US Department of Energy’s Summit and Sierra supercomputers in the coming two quarters, which should provide a nice bump. And once IBM gets the commercial POWER9 systems into the field, sales should pick up again, Morgan expects.
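To put rough numbers on that argument, the sketch below estimates how many $65,000 nodes it would take to triple the $420.7 million quarter quoted earlier. It assumes, unrealistically, that every dollar of Power Systems revenue comes from such nodes at that price. The names are hypothetical, and actual pricing for the Summit and Sierra nodes is not given here.

```python
NODE_PRICE = 65_000                  # dollars per hybrid CPU-GPU node, as quoted above
POWER_REVENUE_Q3_2017 = 420_700_000  # dollars, Morgan's Q3 2017 Power Systems figure
GROWTH_FACTOR = 3                    # "a factor of three or so"


def nodes_needed(target_revenue, node_price):
    """Smallest whole number of nodes whose combined price reaches the target."""
    return -(-target_revenue // node_price)  # integer ceiling division


target = GROWTH_FACTOR * POWER_REVENUE_Q3_2017
print(f"Target quarterly revenue: ${target:,}")
print(f"Nodes needed at ${NODE_PRICE:,} each: {nodes_needed(target, NODE_PRICE):,}")
```

Under those simplifying assumptions the answer is a little under 20,000 nodes per quarter, which gives a sense of scale for the 10,000-plus nodes reportedly headed to Summit and Sierra.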

IBM clearly is hoping POWER9 will cut into Intel x86 sales. But that may not happen as anticipated. Intel is bringing out its own advanced x86 Xeon machine, Skylake, rumored to be quite expensive. Don’t expect POWER9 systems to be cheap either. And the field is getting more crowded. Morgan noted that various ARM chips, especially ThunderX2 from Cavium and Centriq 2400 from Qualcomm, can boost non-x86 numbers and divert sales from IBM’s POWER9 systems. Also, AMD’s Epyc x86 processors have a good chance of stealing some market share from Intel’s Skylake. So POWER9 will have to fight for every sale IBM wants and take nothing for granted.

No doubt POWER9 presents a good case and has a strong backer in Google, but even that might not be enough. Still, POWER9 sits at the heart of what are expected to be the most powerful data-intensive supercomputers in the world, Summit and Sierra, which are expected to knock off the world’s current fastest supercomputers from China.

Said Bart Sano, VP of Google Platforms: “Google is excited about IBM’s progress in the development of the latest POWER technology,” adding that “the POWER9 OpenCAPI bus and large memory capabilities allow further opportunities for innovation in Google data centers.”

This really is about deep learning, one of today’s hottest buzzwords. Deep learning has emerged as a fast-growing machine learning method that extracts information by crunching through millions of processes and data to detect and rank the most important aspects of the data. IBM designed the POWER9 chip to manage free-flowing data, streaming sensors, and algorithms for data-intensive AI and deep learning workloads on Linux. Are your people ready to take advantage of POWER9?

DancingDinosaur is Alan Radding, a veteran information technology analyst, writer, and ghost-writer. Please follow DancingDinosaur on Twitter, @mainframeblog. See more of his IT writing at technologywriter.com and here.

IBM Power System S822LC for HPC Beat Sort Record by 3.3x

November 17, 2016

The new IBM Power System S822LC for High Performance Computing servers set a new benchmark for sorting by taking less than 99 seconds (98.8 seconds) to finish sorting 100 terabytes of data in the Indy GraySort category, improving on last year’s best result, 329 seconds, by a factor of 3.3. The win was a victory not only for the S822LC but for the entire OpenPOWER community. The team of Tencent, IBM, and Mellanox has been named the Winner of the Sort Benchmark annual global computing competition for 2016.
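A quick way to relate those elapsed times to the TB/min figures in the results table further down is to convert them directly. This is a minimal sketch, assuming the 100 TB data set and the two times quoted above; the function name is hypothetical.

```python
def throughput_tb_per_min(terabytes, seconds):
    """Convert a sort of `terabytes` finished in `seconds` to TB per minute."""
    return terabytes / seconds * 60


DATA_TB = 100  # Indy GraySort data set size quoted above

rate_2016 = throughput_tb_per_min(DATA_TB, 98.8)  # this year's run
rate_2015 = throughput_tb_per_min(DATA_TB, 329)   # last year's best result

print(f"2016: {rate_2016:.1f} TB/min")
print(f"2015: {rate_2015:.1f} TB/min")
```

Rounded to one decimal place, these come out to the 60.7 and 18.2 TB/min listed for Indy GraySort.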

Power System S822LC for HPC

Specifically, the machine, an IBM Power S822LC for High Performance Computing (HPC), features NVIDIA NVLink technology optimized for the Power architecture and NVIDIA’s latest GPU technology. The new system supports emerging computing methods of artificial intelligence, particularly deep learning. The combination, newly dubbed IBM PowerAI, provides a continued path for Watson, IBM’s cognitive solutions platform, to extend its artificial intelligence expertise in the enterprise by using several deep learning methods to train Watson.

Actually, Tencent Cloud Data Intelligence (the distributed computing platform of Tencent Cloud) won each category in both the GraySort and MinuteSort benchmarks, establishing four new world records and outperforming the 2015 best speeds by 2-5x. Said Zeus Jiang, Vice President of Tencent Cloud and General Manager of Tencent’s Data Platform Department: “In the future, the ability to manage big data will be the foundation of successful Internet businesses.”

To get this level of performance, Tencent runs 512 IBM OpenPOWER LC servers and Mellanox’s 100Gb interconnect technology, improving the performance of Tencent Cloud big data products with the infrastructure. Online prices for the S822LC start at about $9,600 for a 2-socket, 2U system with up to 20 cores (2.9-3.3 GHz), 1 TB memory (32 DIMMs), 230 GB/sec sustained memory bandwidth, 2x SFF (HDD/SSD), 2 TB storage, 5 PCIe slots, 4 CAPI enabled, and up to 2 NVIDIA K80 GPUs. Be sure to shop for volume discounts.

The 2016 Sort Benchmark Results below (apologies in advance if this table breaks apart)

Sort Benchmark Competition | 2016 Records (Tencent Cloud) | 2015 World Records | 2016 Improvement
Daytona GraySort | 44.8 TB/min | 15.9 TB/min | 2.8X greater performance
Indy GraySort | 60.7 TB/min | 18.2 TB/min | 3.3X greater performance
Daytona MinuteSort | 37 TB/min | 7.7 TB/min | 4.8X greater performance
Indy MinuteSort | 55 TB/min | 11 TB/min | 5X greater performance
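The improvement column is simply the 2016 record divided by the 2015 record. Here is a minimal sketch that recomputes it from the TB/min figures above; the variable and function names are illustrative, not part of any benchmark tooling.

```python
# (category, 2016 record, 2015 record), all in TB/min, copied from the table above.
RESULTS = [
    ("Daytona GraySort", 44.8, 15.9),
    ("Indy GraySort", 60.7, 18.2),
    ("Daytona MinuteSort", 37.0, 7.7),
    ("Indy MinuteSort", 55.0, 11.0),
]


def improvement(new, old):
    """Ratio of the new record to the old one."""
    return new / old


# Round to one decimal place to match the precision used in the table.
factors = {name: round(improvement(new, old), 1) for name, new, old in RESULTS}

for name, factor in factors.items():
    print(f"{name}: {factor}x")
```

Each ratio matches the published improvement once rounded to one decimal place.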

Pretty impressive, huh? As IBM explains it: Tencent Cloud used 512 IBM OpenPOWER servers and Mellanox’s 100Gb interconnect technology, improving the performance of Tencent Cloud big data products with the infrastructure. Then Tom Rosamilia, IBM Senior VP, weighed in: “Industry leaders like Tencent are helping IBM and our OpenPOWER partners push performance boundaries for a cognitive era defined by big data and advanced analytics.” The computing record achieved by Tencent Cloud on OpenPOWER turned out to be an important milestone for the OpenPOWER Foundation too.

Added Amir Prescher, Sr. Vice President, Business Development, at Mellanox Technologies: “Real-time-analytics and big data environments are extremely demanding, and the network is critical in linking together the extra high performance of IBM POWER-based servers and Tencent Cloud’s massive amounts of data.” In effect, Tencent Cloud developed an optimized hardware/software platform to achieve new computing records while demonstrating that Mellanox’s 100Gb/s Ethernet technology can deliver total infrastructure efficiency and improve application performance, which should make it a favorite for big data applications.

Behind all of this were the new IBM Power System S822LC for High Performance Computing servers. Currently the servers feature a new IBM POWER8 chip designed for demanding workloads including artificial intelligence, deep learning and advanced analytics. However, a new POWER9 chip has already been previewed and is expected next year. Whatever the S822LC can do running POWER8, just imagine how much more it will do running POWER9, which IBM describes as a premier acceleration platform. DancingDinosaur covered POWER9 in early Sept. here.

To capitalize on the hardware, IBM is making a new deep learning software toolkit available, PowerAI, which runs on the recently announced IBM Power S822LC server built for artificial intelligence that features NVIDIA NVLink interconnect technology optimized for IBM’s Power architecture. The hardware-software combination provides more than 2X performance over comparable servers with 4 GPUs running AlexNet with Caffe. The same 4-GPU Power-based configuration running AlexNet with BVLC Caffe can also outperform 8 M40 GPU-based x86 configurations, making it the world’s fastest commercially available enterprise systems platform on two versions of a key deep learning framework.

Deep learning is a fast-growing machine learning method that extracts information by crunching through millions of pieces of data to detect and rank the most important aspects of the data. Publicly supported among leading consumer web and mobile application companies, deep learning is quickly being adopted by more traditional enterprises across a wide range of industry sectors: in banking to advance fraud detection through facial recognition; in automotive for self-driving automobiles; and in retail for fully automated call centers with computers that can better understand speech and answer questions. Is your data center ready for deep learning?


 

 

