In the past, when NVIDIA launched a new GPU architecture, they would make a few chip designs to span their market segments. Every SKU would be based on one of those chips, with portions disabled or clocks adjusted to hit multiple price points. The mainstream enthusiast (GTX -70/-80) chip of each generation is typically around 300mm2, and the high-end enthusiast (Titan / -80 Ti) chip is often around 600mm2.

Kepler spent quite a bit of that die space on FP64 units, but that did not happen with the consumer versions of Pascal. Instead, GP100 supported a 1:2:4 FP64:FP32:FP16 performance ratio. This is great for the compute community, such as scientific researchers, but games are focused on FP32. Shortly thereafter, NVIDIA released GP102, which had the same number of FP32 cores (3840) as GP100 but much-reduced 64-bit performance… and a much smaller die. GP100 was 610mm2, but GP102 was just 471mm2.
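
For a rough sense of what those ratios mean in practice, here is a minimal back-of-envelope sketch of peak throughput from core count, clock, and rate ratio. Only the 3840-core counts and GP100’s 1:2:4 ratio come from above; the clock speeds and the consumer-chip FP64/FP16 ratios (1/32 and 1/64) are illustrative assumptions, not quoted specs.

```python
# Back-of-envelope peak throughput from core count, clock, and rate ratios.
def peak_tflops(fp32_cores, clock_ghz, fp64_ratio, fp16_ratio):
    """Peak TFLOPS assuming one FMA (two flops) per core per cycle."""
    fp32 = fp32_cores * 2 * clock_ghz / 1000.0
    return {"FP64": fp32 * fp64_ratio, "FP32": fp32, "FP16": fp32 * fp16_ratio}

# GP100: 3840 FP32 cores, 1:2:4 FP64:FP32:FP16 (per the article); clock assumed.
print("GP100:", peak_tflops(3840, clock_ghz=1.4, fp64_ratio=1/2, fp16_ratio=2))

# GP102: same 3840 cores, but "much-reduced" FP64/FP16 rates,
# assumed here as 1/32 and 1/64 purely for illustration.
print("GP102:", peak_tflops(3840, clock_ghz=1.5, fp64_ratio=1/32, fp16_ratio=1/64))
```

Under those assumptions, both chips land around 11 TFLOPS of FP32, while GP100 keeps an order-of-magnitude lead in FP64 and FP16.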

At this point, I’m thinking that NVIDIA is pulling scientific computing chips away from the common user to increase the value of their Tesla parts. There was no reason to make a cheap 6XXmm2 card available to the public when a 471mm2 part could take the performance crown, so why not reap extra dies from your wafer (and clock them higher thanks to better binning)?
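
For a rough sense of the “extra dies” argument, here is the common first-order dies-per-wafer approximation applied to the two die sizes above. It ignores scribe lines, defect density, and partial-die recovery, so treat the outputs as ballpark candidate counts only.

```python
import math

def dies_per_wafer(die_area_mm2, wafer_diameter_mm=300):
    """First-order candidate-die count: wafer area over die area,
    minus an edge-loss term. Ignores scribe lines and defects."""
    radius = wafer_diameter_mm / 2
    return int(math.pi * radius**2 / die_area_mm2
               - math.pi * wafer_diameter_mm / math.sqrt(2 * die_area_mm2))

print(dies_per_wafer(610))  # GP100-sized die: ~88 candidates per 300 mm wafer
print(dies_per_wafer(471))  # GP102-sized die: ~119 candidates, roughly a third more
```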

And then Volta came out. And it was massive (815mm2).

At this point, you really cannot manufacture a larger integrated circuit. You are at the reticle limit, the largest area that TSMC (and other fabs) can expose onto your silicon in a single pass. Again, it’s a 1:2:4 FP64:FP32:FP16 ratio. Again, there is no consumer version in sight. Again, it looked as if NVIDIA was going to fragment their market and leave consumers behind.

And then Turing was announced. Apparently, NVIDIA still plans on making big chips for consumers… just not with 64-bit performance. The big draw of this 754mm2 chip is its dedicated hardware for raytracing. We knew raytracing was coming, and we knew that the next generation would have hardware to make it useful. I figured that meant a consumer Volta, and that NVIDIA had somehow found a way to use Tensor cores to cast rays. Apparently not… but, don’t worry, Turing has Tensor cores too… they’re just aimed at machine-learning features in games. Those sit above and beyond the dedicated raytracing cores, and the CUDA cores, and the ROPs, and the texture units, and so forth.

But, raytracing hype aside, let’s think about the product stack:

  1. NVIDIA now has two ~800mm2 chips… and
  2. They serve two completely different markets.

In fact, I cannot see either FP64 or raytracing going anywhere any time soon. As such, it’s my assumption that NVIDIA will maintain two different GPU architectures going forward. The only way that I can see this changing is if they figure out a multi-die solution, because neither design can get any bigger. And even then, what workload would it even serve? (Moment of silence for 10km x 10km video game maps.)

What do you think? Will NVIDIA keep two architectures going forward? If not, how will they serve all of their customers?