Last week IBM became the first major global cloud provider to make the NVIDIA Tesla P100 GPU accelerator available on the cloud. By combining NVIDIA’s acceleration technology with IBM’s Cloud platform, businesses can expect to run compute-heavy workloads more quickly and efficiently.
A few compute-heavy workloads, however, will barely be noticed in the aggregate volume of data that analyst firms like IDC see coming by 2025. IDC analysts David Reinsel, John Gantz and John Rydning, in their latest Digital Universe study, conclude: The 163 zettabyte (ZB) global datasphere projected in Data Age 2025 is only the beginning.
Today, IBM and NVIDIA see cognitive analytics as the big driver of all those zettabytes, but that’s just one part and not even the largest. As IBM explains: The combination of IBM’s connectivity and bare metal servers with the Tesla P100 GPUs enables higher throughput than traditional virtualized servers. This high level of performance can allow organizations to deploy fewer, more powerful cloud servers to more quickly deliver increasingly complex simulations and big data workloads. Even before 2025, you can’t deploy enough NVIDIA technology to matter.
The IDC analysts take a more visionary view: Imagine being awoken and tended to by a virtual personal assistant that advises you on what clothing from your wardrobe is best suited to the weather report and your schedule for the day or being transported by your self-driving car. Or maybe you won’t need to commute to an office at all as technology will allow you to conjure workspaces out of thin air using interactive surfaces, and holographic teleconferencing becomes the norm for communicating virtually with colleagues.
Behind this vision are data and processing. As the IDC analysts put it: In earlier periods, data growth stemmed largely from the rise of the personal computer and the consumption of digital entertainment. The world today, however, contains more consumer devices (PCs, phones, game consoles, and music players) than human beings, and all these devices require data.
The embedding of computing power in a large number of endpoint devices also contributes to data growth. Today, the number of embedded system devices feeding into datacenters is less than one per person globally, but over the next 10 years, that number will increase to more than four per person. And the number of files generated will be unimaginable, measured in quintillions per year. BTW, it would take Niagara Falls 210,000 years to move one quintillion gallons of water.
Don’t worry about storing all these zettabytes. As the analysts note: The vast majority of the global datasphere is used and discarded rather than stored. This is primarily a reflection of the fact that most data is fundamentally disposable. Here is where metadata comes into play. From the huge amount of data created, you can prioritize which data has sufficient value to be stored through metadata, classification, tagging, and such.
The key: apply cognition to the selection of data to store. Let cognition determine the smart criteria to identify which data to retain, in what form, and for how long. That way you can hang onto critical information without the need to store all the data produced.
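To make the idea of metadata-driven retention concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration, not anything IDC or IBM describes: the `Record` fields, the `RETENTION_DAYS` policy table, the classification names, and the `legal-hold` tag are all hypothetical. In a real system the classification and tags would come from some cognitive or analytic process; here they are simply given.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional

# Hypothetical policy table: classification -> days to retain.
# None means "use and discard, never store".
RETENTION_DAYS = {
    "critical": 3650,
    "operational": 365,
    "transient": None,
}


@dataclass
class Record:
    """Metadata describing one piece of data (all fields are illustrative)."""
    name: str
    classification: str
    tags: frozenset = frozenset()
    created: date = date(2025, 1, 1)


def expiration(record: Record) -> Optional[date]:
    """Return the date the record expires, or None if it should not be stored."""
    # A hypothetical "legal-hold" tag overrides the classification entirely.
    if "legal-hold" in record.tags:
        return date.max
    # Unknown classifications fall through to None and are treated as disposable.
    days = RETENTION_DAYS.get(record.classification)
    if days is None:
        return None
    return record.created + timedelta(days=days)


def should_store(record: Record, today: date) -> bool:
    """Store only records that have an expiration date still in the future."""
    expires = expiration(record)
    return expires is not None and today < expires


if __name__ == "__main__":
    today = date(2025, 6, 1)
    for rec in (
        Record("sensor-ping", "transient"),
        Record("invoice", "operational"),
        Record("chat-log", "transient", frozenset({"legal-hold"})),
    ):
        print(rec.name, should_store(rec, today), expiration(rec))
```

The point of the sketch is that the store-or-discard decision is made from metadata alone, so the bulk of the data never has to be kept just to decide whether it matters.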
Even so, the quantity of data will outpace any reasonable expectation to store all the data. For example, it would take roughly 16 billion of today’s largest 12TB enterprise HDDs to store the 163ZB of data expected to be created in 2025. But over the past 20 years, the entire disk drive industry shipped only 8 billion HDDs amounting to nearly 4ZB of storage.
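The drive-count arithmetic is easy to check with a short Python sketch. The source does not say whether its zettabytes are decimal (10^21 bytes) or binary (2^70 bytes), so both readings are computed here as an assumption; drive capacity is taken as decimal terabytes, which is how HDDs are normally marketed. The function name and constants are illustrative only.

```python
# Byte counts for the two possible readings of "ZB" (an assumption; the
# source does not say which one it uses).
ZB_DECIMAL = 10**21
ZB_BINARY = 2**70
TB_DECIMAL = 10**12  # drive capacities are assumed to be decimal terabytes


def drives_needed(datasphere_zb: int, drive_tb: int, zb_bytes: int) -> int:
    """Number of drives needed to hold the datasphere, rounded up."""
    total_bytes = datasphere_zb * zb_bytes
    bytes_per_drive = drive_tb * TB_DECIMAL
    return -(-total_bytes // bytes_per_drive)  # integer ceiling division


if __name__ == "__main__":
    decimal_count = drives_needed(163, 12, ZB_DECIMAL)
    binary_count = drives_needed(163, 12, ZB_BINARY)
    print(f"decimal ZB: {decimal_count / 1e9:.1f} billion drives")
    print(f"binary ZB:  {binary_count / 1e9:.1f} billion drives")

    # Average capacity implied by 8 billion drives holding about 4ZB (decimal).
    avg_tb = 4 * ZB_DECIMAL / 8e9 / TB_DECIMAL
    print(f"average shipped drive: {avg_tb:.1f} TB")
```

Under the decimal reading the count comes to about 13.6 billion drives; the quoted figure of roughly 16 billion is consistent with the binary reading. Either way the conclusion holds: the requirement is well above the 8 billion drives shipped over the past 20 years, and those drives averaged only about half a terabyte each.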
Obviously, you can’t store 163 ZB of data or even a minuscule piece of it. The solution points to intelligence embedded throughout the system. Start with the application of smart criteria described above. And don’t ignore intelligent automation; people can’t possibly handle this without assistance.
And the 163ZB global datasphere projected in Data Age 2025 is only the beginning. Another decade in technology years will likely bring about unforeseen advancements, use cases, businesses, and life-changing services that rely on yet more data. As IDC notes, the quantity of data generated will continue to outpace any notion of storing all of the data. Start incorporating cognition now to determine metadata, file tagging, data classification, and expiration dating.
DancingDinosaur is Alan Radding, a veteran information technology analyst, writer, and ghost-writer. Please follow DancingDinosaur on Twitter, @mainframeblog. See more of his IT writing at technologywriter.com.