Are the enterprise users still here? Oh, hey!

GPU acceleration throws large batches of similar calculations at thousands of simple cores. That architecture makes GPUs very cheap and power-efficient for the amount of work they do. Gamers, obviously, enjoy that efficiency in tasks such as calculating pixels on a screen or modifying thousands of vertex positions. The technology has since evolved well beyond graphics, and enterprise and research applications have been taking notice over the years.

GPU discussion, specifically, starts around 16 minutes.

Java, a friend of scientific and "big-data" developers, is also evolving in a few directions including "offload".

IBM's CTO of Java, John Duimovich, discussed a few experiments they created when optimizing the platform to use new hardware. Sorting arrays, a common task, saw between a 2-fold and 48-fold increase in performance. Including the latency of moving data and initializing GPU code, a 32,000-entry array took less than 1.5 ms to sort, compared to about 3 ms on the CPU, which works out to roughly a 2-fold gain at that size. The sample code was programmed in CUDA.

The goal of these tests is, as far as I can tell, to (eventually) automatically use specialized hardware for Java's many built-in libraries. The pitch is free performance. Of course, there is only so much you can get for free. Still, optimizing the few usual suspects is an obvious advantage, especially if it just translates ordinary library calls into calls to existing, better-suited libraries.
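To make the "free performance" idea a bit more concrete, here is a minimal sketch of what that kind of transparent dispatch could look like from the library side. To be clear, this is my own illustration and not IBM's code: the `SortBackend` interface, the `OFFLOAD_THRESHOLD` value, and the stand-in backend (which just calls the JDK's `Arrays.parallelSort` instead of a real GPU) are all assumptions made up for the example.

```java
import java.util.Arrays;
import java.util.Random;

// Hypothetical illustration only: not IBM's implementation or any real JDK offload API.
final class OffloadSortSketch {

    // Hypothetical backend interface. A real one might wrap a CUDA sort through JNI.
    interface SortBackend {
        boolean isAvailable();
        void sort(int[] data);
    }

    // Assumed cut-off: below this size, moving data to the accelerator and
    // starting the kernel would likely cost more than the sort itself.
    static final int OFFLOAD_THRESHOLD = 20_000;

    // Stand-in for an accelerator. It only calls Arrays.parallelSort on the CPU,
    // so the sketch runs on any machine without GPU libraries.
    static final SortBackend STAND_IN = new SortBackend() {
        @Override public boolean isAvailable() { return true; }
        @Override public void sort(int[] data) { Arrays.parallelSort(data); }
    };

    // Sorts in place and reports which path was taken ("offload" or "cpu").
    // The caller writes the same call either way; the library picks the route.
    static String sort(int[] data, SortBackend accelerator) {
        boolean offload = accelerator != null
                && accelerator.isAvailable()
                && data.length >= OFFLOAD_THRESHOLD;
        if (offload) {
            accelerator.sort(data);
            return "offload";
        }
        Arrays.sort(data); // plain CPU fallback
        return "cpu";
    }

    public static void main(String[] args) {
        // Same array size as the experiment described above.
        int[] data = new Random(42).ints(32_000).toArray();
        String path = sort(data, STAND_IN);
        System.out.println("32,000 entries sorted via: " + path);
    }
}
```

The point of the threshold is that offloading is only worth it once the array is large enough to hide the transfer and start-up latency. Where that line actually sits would depend on the hardware, so treat the number here as a placeholder.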

Hopefully they choose to support more than just CUDA whenever they take it beyond experimentation. The OpenPOWER Consortium, responsible for many of these changes, currently consists of IBM, Mellanox, TYAN, Google, and NVIDIA.