Posts Tagged ‘GPU accelerator’

IBM Shows Off POWER and NVIDIA GPU Setting High Performance Record 

May 4, 2017

The record achievement used 60 POWER processors and 120 GPU accelerators to shatter the previous supercomputer record, which used more than 700,000 processors. The results point to how dramatically the capabilities of high performance computing (HPC) have increased while the cost of HPC systems has declined. Or put another way: the effort demonstrates the ability of NVIDIA GPUs to simulate one-billion-cell models in a fraction of the time, while delivering 10x the performance and efficiency.

Courtesy of IBM: Takes a lot of processing to take you into a tornado

In short, the combined success of IBM and NVIDIA puts the power of cognitive computing within the reach of mainstream enterprise data centers. Specifically, the project performed reservoir modeling to predict the flow of oil, water, and natural gas in the subsurface of the earth before producers attempt to extract the maximum oil in the most efficient way. The effort, in this case, involved a billion-cell simulation, which took just 92 minutes using 30 HPC servers equipped with 60 POWER processors and 120 NVIDIA Tesla P100 GPU accelerators.

“This calculation is a very salient demonstration of the computational capability and density of solution that GPUs offer. That speed lets reservoir engineers run more models and ‘what-if’ scenarios than previously,” according to Vincent Natoli, President of Stone Ridge Technology, as quoted in the IBM announcement. “By increasing compute performance and efficiency by more than an order of magnitude, we’re democratizing HPC for the reservoir simulation community,” he added.

“The milestone calculation illuminates the advantages of the IBM POWER architecture for data-intensive and cognitive workloads,” said Sumit Gupta, IBM Vice President, High Performance Computing, AI & Analytics, in the IBM announcement. “By running Stone Ridge’s ECHELON on IBM Power Systems, users can achieve faster run-times using a fraction of the hardware,” Gupta continued. “The previous record used more than 700,000 processors in a supercomputer installation that occupies nearly half a football field, while Stone Ridge did this calculation on two racks of IBM Power Systems that could fit in the space of half a ping-pong table.”
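To put the quoted figures side by side, here is a minimal Python sketch. It uses only the numbers reported in the post; the function name `summarize` and the choice of ratios are illustrative assumptions, and the raw device-count ratio is not a like-for-like comparison, since CPUs and GPUs are very different kinds of processors.

```python
# Figures quoted in the post; nothing here is independently measured.
SERVERS = 30
POWER_CPUS = 60
GPUS = 120
CELLS = 1_000_000_000
# The post says "more than 700,000", so the ratio below is a lower bound.
PREVIOUS_PROCESSORS = 700_000


def summarize():
    """Return simple ratios derived from the quoted figures (illustrative only)."""
    return {
        "cpus_per_server": POWER_CPUS / SERVERS,
        "gpus_per_server": GPUS / SERVERS,
        "cells_per_gpu": CELLS / GPUS,
        # Raw device count only; says nothing about per-device capability.
        "device_count_ratio": PREVIOUS_PROCESSORS / (POWER_CPUS + GPUS),
    }


if __name__ == "__main__":
    for name, value in summarize().items():
        print(f"{name}: {value:,.1f}")
```

On these numbers each server carries two POWER processors and four GPUs, and each GPU handles a little over 8 million cells of the billion-cell model.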

This latest advance challenges the misconception that GPUs cannot be efficient on complex application codes like reservoir simulation and are better suited to simple, more naturally parallel applications such as seismic imaging. The scale, speed, and efficiency of the reported result disprove this misconception. Achieving the milestone calculation with a relatively small server infrastructure enables small and medium-size oil and energy companies to take advantage of computer-based reservoir modeling and optimize production from their asset portfolios.

Billion-cell simulations are rare in practice in the industry, but the calculation was performed to highlight the performance differences between new fully GPU-based codes like the ECHELON reservoir simulator and equivalent legacy CPU codes. ECHELON scales from the cluster to the workstation: while it can simulate a billion cells on 30 servers, it can also run smaller models on a single server or even on a single NVIDIA P100 board in a desktop workstation. The latter two use cases are more in the sweet spot for the energy industry, according to IBM.

Just as important, the company notes, this latest breakthrough showcases the ability of IBM Power Systems with NVIDIA GPUs to achieve similar performance leaps in other fields, such as computational fluid dynamics, structural mechanics, and climate modeling, that are widely used throughout the manufacturing and scientific community. By taking advantage of POWER and GPUs, organizations can literally do more with less, which often is an executive’s impossible demand.

DancingDinosaur is Alan Radding, a veteran information technology analyst, writer, and ghost-writer. Please follow DancingDinosaur on Twitter, @mainframeblog.


OpenPOWER Starts Delivering the Goods

March 13, 2015

Are you leery of multi-vendor consortiums? DancingDinosaur as a rule is skeptical of the grand promises they make until they actually start delivering results. That was the case with OpenPOWER last spring when you read here that the OpenPOWER Foundation was introduced and almost immediately forgotten.

IBM POWER8 processor, courtesy of IBM

But then last fall DancingDinosaur reported on NVIDIA and its new GPU accelerator integrated directly into the server here. This too was an OpenPOWER Foundation-based initiative. Suddenly, DancingDinosaur is thinking the OpenPOWER Foundation might actually produce results.

For example, IBM introduced a new range of systems capable of handling massive amounts of computational data faster, at nearly 20 percent better price/performance than comparable Intel Xeon v3 processor-based systems. The result: a superior alternative to closed, commodity-based data center servers. Better performance at a lower price. What’s not to like?

The first place you probably want to apply this improved price/performance is big data: 2.5 quintillion bytes of data are generated across the planet every day. Even the minuscule portion of this amount that you actually generate will very quickly challenge your organization to build a sufficiently powerful technology infrastructure to gain actionable insights from this data fast enough and at a price you can afford.

The commodity x86 servers used today by most organizations are built on proprietary Intel processor technology and are increasingly stretched to their limits by workloads related to big data, cloud, and mobile. By contrast, IBM is designing a new data-centric approach to systems that leverages the building blocks of the OpenPOWER Foundation.

This is plausible given the success of NVIDIA with its GPU accelerator. And just this past week Altera demonstrated its OpenPOWER-based FPGA, now being used by several other Foundation members who are collaborating to develop high-performance compute solutions that integrate IBM POWER chips with Altera’s FPGA-based acceleration technologies.

Formed in late 2013, the OpenPOWER Foundation has grown quickly from 5 founders to over 100 members today. All are collaborating in various ways to leverage the IBM POWER processor’s open architecture for broad industry innovation.

IBM is looking to offer the POWER8 core and other future cores under the OpenPOWER initiative, but it is also making previous designs available for licensing. Partners are required to contribute intellectual property to the OpenPOWER Foundation to gain high-level status. The earliest successes have been around accelerators, some based on POWER8’s CAPI (Coherent Accelerator Processor Interface) expansion bus, built specifically to integrate easily with external coprocessors like GPUs, ASICs, and FPGAs. DancingDinosaur will know the OpenPOWER Foundation is truly on the path to acceptance when a member introduces a non-IBM POWER8 server. DancingDinosaur has been told that may happen in 2015.

In the meantime, IBM itself is capitalizing on the OpenPOWER Foundation. Its new IBM Power S824L servers are built on IBM’s POWER8 processor and tightly integrate other OpenPOWER technologies, including NVIDIA’s GPU accelerator. Built on the OpenPOWER stack, the Power S824L lets organizations run data-intensive tasks on the POWER8 processor while offloading other compute-intensive workloads to GPU accelerators, which are capable of running millions of data computations in parallel and are designed to significantly speed up compute-intensive applications.

Further leveraging the OpenPOWER Foundation, IBM announced at the start of March that SoftLayer will offer OpenPOWER servers as part of its portfolio of cloud services. Organizations will then be able to select OpenPOWER bare metal servers when configuring their cloud-based IT infrastructure from SoftLayer, an IBM company. The servers were developed to help organizations better manage data-intensive workloads on public and private clouds, effectively extending their existing infrastructure inexpensively and quickly. This is possible because OpenPOWER servers leverage IBM’s licensable POWER processor technology and feature innovations resulting from open collaboration among OpenPOWER Foundation members.

Due in the second quarter, the SoftLayer bare metal servers run Linux applications and are based on the IBM POWER8 architecture. The offering, according to IBM, also will leverage the rapidly expanding community of developers contributing to the POWER ecosystem as well as independent software vendors that support Linux on Power and are migrating applications from x86 to the POWER architecture. Built on open technology standards that begin at the chip level, the new bare metal servers are built to assist a wide range of businesses interested in building custom hybrid, private, and public cloud solutions based on open technology.

BTW, it is time to register for IBM Edge2015 in Las Vegas May 10-15. Edge2015 combines all of IBM’s infrastructure products with both a technical track and an executive track. You can be sure DancingDinosaur will be there. Watch for upcoming posts here that will highlight some of the more interesting sessions.

IBM Builds Out POWER8 Systems

October 3, 2014

Just in time for IBM Enterprise2014, which starts on Monday in Las Vegas, IBM announced some new POWER8 systems and a slew of new capabilities. Much of this actually was first telegraphed earlier in September here, but now it is official. Expect the full unveiling at IBM Enterprise2014.

The new systems are the Power E870 and the Power E880. The E870 includes up to 80 POWER8 cores, in nodes of 32 or 40 cores, and as much as 4TB of memory. The Power E880 will scale up to 128 POWER8 cores and promises even more in the next rev. It also sports up to 16TB of memory, again with more coming. This should be more than sufficient to perform analytics on significant workloads and deliver insights in real time. The E880 also offers enterprise storage pools to absorb varying shifts in workloads and handles up to 20 virtual machines per core.

Back in December, DancingDinosaur referred to the Power System 795 as a RISC mainframe. It clearly has been superseded by the POWER8 E880 in terms of sheer performance, although the E880 is architected primarily for data analytics. There has been no hint of a refresh of the Power 795, which hasn’t even gotten the POWER7+ chip yet. Only two sessions at Enterprise2014 address the Power System 795. Hmmm.

The new POWER8 machines boast some impressive benchmarks (as of Sept. 12, 2014: SAP SD 2-tier, SPECjbb2013, SPECint_rate2006, and SPECfp_rate2006). Specifically, IBM is boasting of the fastest performing core in the industry: 1.96x or better than the best Intel Xeon Ivy Bridge and 2.29x better than the best Oracle SPARC. In each test the new POWER8 machine ran with two-thirds or fewer of the cores of the competing machine, 10 vs. 15 or 16 respectively.
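The core counts quoted above can be checked with a few lines of Python. This is only a sketch of the arithmetic; pairing 15 cores with the Xeon and 16 with the SPARC follows the "respectively" in the text, and the names `COMPETITOR_CORES` and `core_fractions` are illustrative.

```python
from fractions import Fraction

POWER8_CORES = 10
# Pairing assumed from the "respectively" in the post.
COMPETITOR_CORES = {"Intel Xeon Ivy Bridge": 15, "Oracle SPARC": 16}


def core_fractions():
    """Return POWER8 cores as an exact fraction of each competitor's cores."""
    return {
        name: Fraction(POWER8_CORES, cores)
        for name, cores in COMPETITOR_CORES.items()
    }


if __name__ == "__main__":
    for name, frac in core_fractions().items():
        print(f"{name}: {frac} of the cores ({float(frac):.3f})")
```

The POWER8 machine used exactly two-thirds of the cores against the 15-core system and five-eighths against the 16-core system.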

In terms of value, IBM says the new POWER8 machines cost less than competing systems, delivering 1000 users per core, double its nearest competitor. When pressed by DancingDinosaur on its cost analysis, IBM experts explained they set up new Linux apps on an enterprise-class POWER8 system and priced out a comparably configured system from HP based on its published prices. For the new POWER8 systems IBM was able to hold the same price point, which turned out to be 30% less expensive for comparable power given the chip’s increased performance. By factoring in the increase in POWER8 performance and the unchanged price, IBM calculated it had the lowest cost for comparable performance. DancingDinosaur recommends you run your own numbers.
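If you do run your own numbers, one way to read the 30% figure is sketched below in Python. It assumes, and this is an interpretation rather than anything IBM stated, that "30% less expensive for comparable power" means cost per unit of performance fell by 30% while the system price stayed the same. The function name `implied_performance_gain` is hypothetical.

```python
def implied_performance_gain(cost_reduction: float) -> float:
    """Performance multiple implied by a drop in cost per unit of performance
    when the system price is held constant.

    cost_reduction is a fraction, e.g. 0.30 for "30% less expensive".
    Assumes cost per unit of performance = price / performance.
    """
    if not 0 <= cost_reduction < 1:
        raise ValueError("cost_reduction must be in the range [0, 1)")
    # Same price, so price/perf_new = (1 - r) * price/perf_old
    # which gives perf_new / perf_old = 1 / (1 - r).
    return 1.0 / (1.0 - cost_reduction)


if __name__ == "__main__":
    gain = implied_performance_gain(0.30)
    print(f"Implied performance multiple at constant price: {gain:.2f}x")
```

Under that assumption a 30% lower cost for comparable performance corresponds to roughly a 1.4x performance gain at the same price. Substitute your own prices and measured throughput before drawing conclusions.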

The recent announcement also included the first fruits of the OpenPOWER Foundation, an accelerator from NVIDIA. The new GPU accelerator, integrated directly into the server, is aimed at larger users of big data analytics, especially those using NoSQL databases. The accelerator is incorporated into a new server, the Power System S824L, which includes up to 24 POWER8 cores, 1 TB of memory, and up to 2 NVIDIA K40 GPU accelerators. It also includes a bare metal version of Ubuntu Linux. IBM reports it runs pattern extraction 8x faster for a variety of analytics, big data, and technical computing workloads involving large amounts of data.

Another new goodie, one based on OpenStack, is IBM Power Virtualization Center (PowerVC), billed as new advanced virtualization management that promises to simplify the creation and management of virtual machines on IBM Power Systems servers using PowerVM or PowerKVM hypervisors. By leveraging OpenStack, it should enable IBM Power System servers to integrate into a Software Defined Environment (SDE) and provide the necessary foundation required for the delivery of Infrastructure as a Service (IaaS) within the Cloud.

Finally, as part of the POWER8 announcements, IBM unveiled Power Enterprise Pools, a slick capacity-on-demand technology also called Power Systems Pools. It offers a highly resilient and flexible IT environment to support large-scale server consolidation and meet demanding business application requirements. Power Enterprise Pools allow for the aggregation of compute resources, including processors and memory, across a number of Power systems. Previously available for the Power 780 and 795, it is now available on large POWER8 machines.

Am off to IBM Enterprise2014 this weekend. Hope to see you there. When not in sessions look for me wherever the bloggers hang out (usually where there are ample power outlets to recharge laptops and smartphones). Also find me at the three evenings of live performances: 2 country rock groups, Delta Rae and The Wild Feathers and then, Rock of Ages. Check out all three here.

Alan Radding is DancingDinosaur. You can follow this blog and more on Twitter, @mainframeblog.
