Posts Tagged ‘Caffe’

IBM Power System S822LC for HPC Beat Sort Record by 3.3x

November 17, 2016

The new IBM Power System S822LC for High Performance Computing servers set a new benchmark for sorting by taking less than 99 seconds (98.8 seconds) to finish sorting 100 terabytes of data in the Indy GraySort category, improving on last year’s best result, 329 seconds, by a factor of 3.3. The win proved a victory not only for the S822LC but for the entire OpenPOWER community. The team of Tencent, IBM, and Mellanox has been named the Winner of the Sort Benchmark annual global computing competition for 2016.

rack-of-new-ibm-power-systems-s822lc-for-high-performance-computing-servers-1Power System S822LC for HPC

Specifically, the machine, an IBM Power S822LC for High Performance Computing (HPC), features NVIDIA NVLink technology optimized for the Power architecture and NVIDIA’s latest GPU technology. The new system supports emerging computing methods of artificial intelligence, particularly deep learning. The combination, newly dubbed IBM PowerAI, provides a continued path for Watson, IBM’s cognitive solutions platform, to extend its artificial intelligence expertise in the enterprise by using several deep learning methods to train Watson.

Actually Tencent Cloud Data Intelligence (the distributed computing platform of Tencent Cloud) won each category in both the GraySort and MinuteSort benchmarks, establishing four new world records with its performance, outperforming the 2015 best speeds by 2-5x. Said Zeus Jiang, Vice President of Tencent Cloud and General Manager of Tencent’s Data Platform Department: “In the future, the ability to manage big data will be the foundation of successful Internet businesses.”

To get this level of performance Tencent runs 512 IBM OpenPOWER LC servers and Mellanox’100Gb interconnect technology, improving the performance of Tencent Cloud big data products with the infrastructure. Online prices for the S822LC starts at about $9600 for 2-socket, 2U with up to 20 cores (2.9-3.3Ghz), 1 TB memory (32 DIMMs), 230 GB/sec sustained memory bandwidth, 2x SFF (HDD/SSD), 2 TB storage, 5 PCIe slots, 4 CAPI enabled, up to 2 NVidia K80 GPU. Be sure to shop for volume discounts.

The 2016 Sort Benchmark Results below (apologies in advance if this table breaks apart)

Sort Benchmark Competition 20 Records (Tencent Cloud ) 2015 World Records 2016 Improvement
Daytona GraySort 44.8 TB/min 15.9 TB/min 2.8X greater performance
Indy GraySort 60.7 TB/min 18.2 TB/min 3.3X greater performance
Daytona MinuteSort 37 TB/min 7.7 TB/min 4.8X greater performance
Indy MinuteSort 55 TB/min 11 TB/min 5X greater performance

Pretty impressive, huh. As IBM explains it: Tencent Cloud used 512 IBM OpenPOWER servers and Mellanox’100Gb interconnect technology, improving the performance of Tencent Cloud big data products with the infrastructure. Then Tom Rosamilia, IBM Senior VP weighed in: “Industry leaders like Tencent are helping IBM and our OpenPOWER partners push performance boundaries for a cognitive era defined by big data and advanced analytics.” The computing record achieved by Tencent Cloud on OpenPOWER turned out to be an important milestone for the OpenPOWER Foundation too.

Added Amir Prescher, Sr. Vice President, Business Development, at Mellanox Technologies: “Real-time-analytics and big data environments are extremely demanding, and the network is critical in linking together the extra high performance of IBM POWER-based servers and Tencent Cloud’s massive amounts of data,” In effect, Tencent Cloud developed an optimized hardware/software platform to achieve new computing records while demonstrating that Mellanox’s 100Gb/s Ethernet technology can deliver total infrastructure efficiency and improve application performance, which should make it a favorite for big data applications.

Behind all of this was the new IBM Power System S822LC for High Performance Computing servers. Currently the servers feature a new IBM POWER8 chip designed for demanding workloads including artificial intelligence, deep learning and advanced analytics.  However, a new POWER9 chips has already been previewed and is expected next year.  Whatever the S822LC can do running POWER8 just imagine how much more it will do running POWER9, which IBM describes as a premier acceleration platform. DancingDinosaur covered POWER9 in early Sept. here.

To capitalize on the hardware, IBM is making a new deep learning software toolkit available, PowerAI, which runs on the recently announced IBM Power S822LC server built for artificial intelligence that features NVIDIA NVLink interconnect technology optimized for IBM’s Power architecture. The hardware-software combination provides more than 2X performance over comparable servers with 4 GPUs running AlexNet with Caffe. The same 4-GPU Power-based configuration running AlexNet with BVLC Caffe can also outperform 8 M40 GPU-based x86 configurations, making it the world’s fastest commercially available enterprise systems platform on two versions of a key deep learning framework.

Deep learning is a fast growing, machine learning method that extracts information by crunching through millions of pieces of data to detect and ranks the most important aspects of the data. Publicly supported among leading consumer web and mobile application companies, deep learning is quickly being adopted by more traditional enterprises across a wide range of industry sectors; in banking to advance fraud detection through facial recognition; in automotive for self-driving automobiles; and in retail for fully automated call centers with computers that can better understand speech and answer questions. Is your data center ready for deep learning?

DancingDinosaur is Alan Radding, a veteran information technology analyst, writer, and ghost-writer. Please follow DancingDinosaur on Twitter, @mainframeblog. See more of his IT writing at and here.



%d bloggers like this: