POWER8 did not disappoint. IBM unveiled its latest generation of systems built on its new POWER8 technology on Wednesday, April 23.
DancingDinosaur sees three important takeaways from this announcement:
First, the OpenPOWER Foundation. It was introduced months ago and almost immediately forgotten. DancingDinosaur covered it at the time here. It had a handful of small partners. Only one was significant, Google, and it was hard to imagine Google bringing out open source POWER servers. Now the Foundation has several dozen members and it still is not clear what Google is doing there, but the Foundation clearly is gaining traction. You can expect more companies to join the Foundation in the coming weeks and months.
With the Foundation, IBM swears it is committed to a true open ecosystem, one where even competitors can license the technology and bring out their own systems. At some point don’t be surprised to see white box Power systems priced below IBM’s. More likely in the short term will be specialized Power appliances. What you get as a Foundation member is the Power SOC design, bus specifications, reference designs, and open source firmware, OS, and hypervisor. It also includes access to Little Endian Linux, which will ease the migration of software to POWER. BTW, Google is listed as a member focusing on open source firmware and on the cloud and high performance computing.
Second, the POWER8 processor itself and the new family of systems. The processor, designed for big data, will run more concurrent queries and run them up to 50x faster than x86, with 4x more threads per core than x86. Its I/O bandwidth is 5x faster than POWER7. It can handle 1TB of memory with 4-6x more memory bandwidth and more than 3x more on-chip cache than an x86. The processor itself will utilize 22nm circuits and run at 2.5 to 5 GHz.
POWER8 sports eight-threaded cores. That means each of the 12 cores in the CPU will coordinate the processing of eight sets of instructions at a time, for a total of 96 threads. Each thread consists of a set of related instructions making up a discrete unit of work within a program. By designating sections of an application that can run as separate threads and coordinating the results, a chip can accomplish more work than a single-threaded chip, IBM explains. By comparison, IBM reports Intel’s Ivy Bridge E5 Xeon CPUs have double-threaded cores, with up to eight cores, handling 16 threads at a time (compared to 96 with POWER8). Yes, there is some coordination overhead incurred as more threads are added. Still, the POWER8 chip should attract interest among white box manufacturers and users of large numbers of servers processing big data.
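The thread counts above are simple multiplication, and a few lines of Python make the comparison explicit. This is only a sketch of the arithmetic using the core and thread figures quoted above; the function name `hardware_threads` is made up for illustration, and the result says nothing about real-world performance or coordination overhead.

```python
def hardware_threads(cores: int, threads_per_core: int) -> int:
    """Return how many instruction streams a chip can work on at once."""
    return cores * threads_per_core


# Figures as quoted in the post (IBM's numbers, not independent measurements).
power8_threads = hardware_threads(cores=12, threads_per_core=8)
xeon_threads = hardware_threads(cores=8, threads_per_core=2)

# How many times more simultaneous threads POWER8 offers per chip.
thread_ratio = power8_threads / xeon_threads

print(f"POWER8: {power8_threads} threads")
print(f"Ivy Bridge E5 Xeon (as reported by IBM): {xeon_threads} threads")
print(f"Ratio: {thread_ratio:.0f}x")
```

The 6x ratio is a ceiling on simultaneous threads per chip, not a speedup figure; how much of it an application sees depends on how well its work splits into threads.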
Third is CAPI, your newest acronym. If something is going to be a game-changer, this will be it. The key is to watch for adoption. Coherent Accelerator Processor Interface (CAPI) sits directly on the POWER8 and works with the same memory addresses that the processor uses. Pointers are dereferenced the same way as in the host application. CAPI, in effect, removes OS and device driver overhead by presenting an efficient, robust, durable interface. In the process, it offloads complexity.
CAPI can reduce the typical seven-step I/O model flow to three steps (shared memory/notify accelerator, acceleration, and shared memory completion). The advantages revolve around virtual addressing and data caching through shared memory and reduced latency for highly referenced data. [see accompanying graphic] It also enables an easier, natural programming model with traditional thread level programming and eliminates the need to restructure the application to accommodate long latency I/O. Finally it enables apps otherwise not possible, such as those requiring pointer chasing.
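For readers unfamiliar with the term, pointer chasing means following a chain of references where the address of the next item is known only after the current item has been read. The sketch below is plain Python and does not use any CAPI interface; the list layout and the function name `chase` are hypothetical, chosen only to show why such workloads suffer when every hop carries long I/O latency.

```python
from typing import List, Optional


def chase(next_index: List[Optional[int]], start: int) -> List[int]:
    """Follow 'pointers' from start until the chain ends; return the visit order."""
    visited: List[int] = []
    current: Optional[int] = start
    while current is not None:
        visited.append(current)
        # The next location is unknown until this lookup completes,
        # so the hops cannot be overlapped or batched.
        current = next_index[current]
    return visited


# Hypothetical five-node chain scattered through memory: 0 -> 3 -> 1 -> 4 -> 2 -> end
links: List[Optional[int]] = [3, 4, None, 1, 2]
order = chase(links, start=0)
print(order)
```

Because each hop waits on the one before it, total time grows with the number of hops multiplied by the latency of a single access, which is why lower-latency shared memory matters for this kind of application.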
It’s too early to determine if CAPI is a game changer but IBM has already started to benchmark some uses. For example, it ran NoSQL on POWER8 with CAPI and achieved a 5x cost reduction. When combined with IBM’s TMI flash it found it could:
- Attack problem sets otherwise too big for the memory footprint
- Deliver fast access to small chunks of data
- Achieve high throughput for data or simplify object addressing through memory semantics.
CAPI brings programming efficiency and simplicity. It uses the PCIe physical interface for the easiest programming and fastest, most direct I/O performance. It enables better virtual addressing and data caching. Although it was intended for acceleration it works well for I/O caching. And it has been shown to deliver a 5x cost reduction with equivalent performance when attaching to flash. In summary, CAPI enables you to regain infrastructure control and rein in costs to deliver services otherwise not feasible.
It will take time for CAPI to catch on. Developers will need to figure out where and how best to use it. But with CAPI as part of the OpenPOWER Foundation, expect to see work taking off in a variety of directions. At a pre-briefing a few weeks ago, DancingDinosaur was able to walk through some very interesting CAPI demos.
As for the new POWER8 Systems lineup, IBM introduced six one- or two-socket systems, some for Linux, others for all systems. The systems, reportedly, will start below $8000.
You can follow Alan Radding/DancingDinosaur on Twitter: @mainframeblog. Also, please join me at IBM Edge2014, this May 19-23 at the Venetian in Las Vegas. Find me in the bloggers lounge.