To a crowd of press and developers at their GTC summit, NVIDIA announced the GeForce GTX Titan Z add-in board (AIB). Each of its two fully unlocked GK110 GPUs would have access to 6GB of GDDR5 memory (12GB total). The card was expected to be available on May 8th but has yet to surface. As NVIDIA has yet to comment on the situation, many question whether the card will ever arrive.

And then we get what we think are leaked benchmarks (note: two pictures).

One concern about the Titan Z was its rated 8 TeraFLOPs of compute performance. This is a fairly sizable reduction from the theoretical maximum of 10.24 TeraFLOPs for two Titan Black processors, and even less than two first-generation Titans (9 TeraFLOPs combined). We assumed that this was due to reduced clock rates. What we did not expect was for benchmarks to show the GPUs boosting well above those advertised levels, and even beyond the advertised boost clocks of the Titan Black and the 780 Ti. The card was seen pushing 1058 MHz in some sections, which leads to a theoretical compute performance of 12.2 TeraFLOPs (6.1 TeraFLOPs per GPU) in single precision. That is a lot.
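For anyone who wants to check the arithmetic, the figures above follow from the usual peak-throughput estimate: shader cores, times two floating-point operations per clock (one fused multiply-add), times clock rate. The Python sketch below is only a back-of-the-envelope calculator. The 2880-core count for a fully unlocked GK110 is my assumption (it is not stated above), and the function names are made up for illustration.

```python
# Rough peak single-precision throughput estimate.
# Assumption (not from the article): a fully unlocked GK110 has 2880 CUDA cores,
# and each core retires one fused multiply-add (2 FLOPs) per clock.
GK110_CUDA_CORES = 2880
FLOPS_PER_CORE_PER_CLOCK = 2


def peak_tflops(clock_mhz, cores=GK110_CUDA_CORES, gpus=1):
    """Theoretical peak in TeraFLOPs for `gpus` chips running at `clock_mhz`."""
    flops = gpus * cores * FLOPS_PER_CORE_PER_CLOCK * clock_mhz * 1e6
    return flops / 1e12


def implied_clock_mhz(tflops, cores=GK110_CUDA_CORES, gpus=1):
    """Clock rate (MHz) that a rated TeraFLOPs figure would imply."""
    return tflops * 1e12 / (gpus * cores * FLOPS_PER_CORE_PER_CLOCK) / 1e6


if __name__ == "__main__":
    # Observed 1058 MHz boost: per GPU, then for the whole dual-GPU card.
    print(round(peak_tflops(1058), 1))
    print(round(peak_tflops(1058, gpus=2), 1))
    # Clock implied by the rated 8 TeraFLOPs across two GPUs.
    print(round(implied_clock_mhz(8.0, gpus=2)))
```

Under those assumptions, 1058 MHz works out to roughly 6.1 TeraFLOPs per GPU and 12.2 for the card, while the rated 8 TeraFLOPs corresponds to a clock just under 700 MHz. These are theoretical ceilings, not measured performance.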

These benchmarks also show that NVIDIA has a slight lead over AMD's R9 295X2 in many games, except Battlefield 4 and Sleeping Dogs (plus 3DMark and Unigine). Of course, these benchmarks measure the software-reported frame rates and frame times, which may or may not be indicative of actual performance. I would say that the Titan Z appears to have a slight performance lead over the R9 295X2, although a solid argument for an AMD performance win exists. Even so, it gets that lead at double the cost (at its expected $3000 USD price point). That is not up for debate.

Whichever card is faster, AMD's is half the price and available for purchase right now.

So, until NVIDIA says anything, the Titan Z is in limbo. I am sure there exist CUDA developers who await its arrival. Personally, I would just get three Titan Blacks, since you are going to need to manually schedule your workloads across multiple processors anyway (or 780 Tis if 32-bit arithmetic is enough precision). That is, of course, unless you cannot physically fit enough GeForce Titan Blacks in your motherboard and, as such, you require two GK110 chips per AIB (but not so many that you would bother writing a cluster scheduling application).