NVIDIA GeForce 8800 GTX Review - DX10 and Unified Architecture

Author: Ryan Shrout
Manufacturer: NVIDIA

DirectX 10

So the move to unified shaders is a requirement of DX10, as I mentioned before, but that's not all that has changed in the move to double digits.  The main goal of Microsoft's DX10 designers was to make programming easier and let developers implement improved graphics and effects with less effort.  Shader Model 4.0 is being introduced with some big enhancements, and geometry shaders make their debut as well.  Another important note about DX10 is that it is much stricter on hardware specifications -- with no more "cap bits," hardware vendors can't simply leave out some DX10 features and still qualify as DX10 hardware, as Intel did in many cases with DX9.

The updated DX10 pipeline looks a little something like this; new features are listed off to the right there.  Geometry shaders and stream output are two very important additions to the specification.  Geometry shaders will allow dynamic modification of objects on the GPU (rather than the CPU) and will give game creators a boost in creativity.  Stream output lets the GPU write intermediate results out to shared memory, effectively allowing the pipelines to communicate with each other.

The geometry shader is placed right after the vertex shader in the DX10 pipeline, allowing the vertex data to set up the geometry that may then be modified by the new processing unit.  The input to and output from the geometry shader must use the same basic primitive types (points, lines or triangle strips), but the data can be modified inside the shader to produce various effects.
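To make the "primitive in, primitives out" idea concrete, here is a minimal CPU-side sketch in Python.  It is not HLSL and not NVIDIA's or Microsoft's code; the names `geometry_stage` and `run_pipeline` are hypothetical, and splitting each triangle into four is just one assumed example of the kind of modification a geometry shader could make.

```python
from typing import List, Tuple

Vec3 = Tuple[float, float, float]
Triangle = Tuple[Vec3, Vec3, Vec3]


def midpoint(a: Vec3, b: Vec3) -> Vec3:
    """Return the point halfway between a and b."""
    return ((a[0] + b[0]) / 2, (a[1] + b[1]) / 2, (a[2] + b[2]) / 2)


def geometry_stage(tri: Triangle) -> List[Triangle]:
    """Hypothetical stand-in for a geometry shader: one triangle in, four out."""
    a, b, c = tri
    ab, bc, ca = midpoint(a, b), midpoint(b, c), midpoint(c, a)
    # Three corner triangles plus the center one, all keeping the input winding.
    return [(a, ab, ca), (ab, b, bc), (ca, bc, c), (ab, bc, ca)]


def run_pipeline(triangles: List[Triangle]) -> List[Triangle]:
    """Apply the geometry stage to every primitive leaving the vertex stage."""
    out: List[Triangle] = []
    for tri in triangles:
        out.extend(geometry_stage(tri))
    return out
```

The point of the sketch is only the shape of the stage: it sees a whole primitive at once and may emit more (or fewer) primitives than it received, which a vertex shader cannot do.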

Here are some examples of what designers might be able to do with geometry shaders; automatic shadow box generation and physical simulations are among the most powerful.  Real-time environment creation is a fantastic idea that could make a game's replay value potentially limitless; imagine games where the world really DOESN'T end and can be dynamically generated on the fly by the gamer's GPU to be different from any other gamer's.

Stream output is the other new addition to the DX10 pipeline, and its importance will be seen in great detail throughout this review and in the coming months as more about NVIDIA's processing potential is revealed.  Strictly for gaming, stream output allows for a mid-phase memory write by the processing engine to store geometry or vertex shader data.  By enabling multi-pass operations, programmers can now do recursion (the bane of my undergraduate years...) as well as numerous other "tricks" to improve their software.
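The multi-pass idea is easier to see in code.  The Python sketch below is a rough CPU-side analogy, assuming a simple falling-particle update: each pass writes its results to an output buffer, and that buffer becomes the input of the next pass.  The function names, the `(height, velocity)` layout and the gravity constant are all illustrative assumptions, not part of the DX10 API.

```python
from typing import List, Tuple

Particle = Tuple[float, float]  # (height, vertical velocity)


def simulate_pass(particles: List[Particle], dt: float,
                  gravity: float = -9.8) -> List[Particle]:
    """One pass: read every particle, write its updated state to a new buffer."""
    out: List[Particle] = []
    for height, velocity in particles:
        velocity += gravity * dt
        height += velocity * dt
        if height < 0.0:  # particle hit the ground: stop it there
            height, velocity = 0.0, 0.0
        out.append((height, velocity))
    return out


def run_passes(particles: List[Particle], passes: int,
               dt: float) -> List[Particle]:
    """Feed each pass's output back in as the next pass's input."""
    buffer_in = particles
    for _ in range(passes):
        buffer_out = simulate_pass(buffer_in, dt)  # the "stream output" write
        buffer_in = buffer_out                     # swap buffers for the next pass
    return buffer_in
```

On real hardware the win is that the buffer never has to travel back to the CPU between passes; in this sketch the swap of `buffer_in` and `buffer_out` stands in for that.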

Maybe you remember a thing called physics?  It's all the rage in gaming recently, and with the ability to output data to memory mid-phase in the processing pipeline, NVIDIA is going to be able to compete with AGEIA's hardware more directly as that ability to communicate between pipes is what gave AGEIA's PhysX engine a performance advantage.

Another example of stream output at work is improved instancing -- where before you were limited to instancing items like grass that would all look the same and follow the same paths, with stream output you can have instanced items with individual "states" and attributes, which lets programmers use them for unique characters and items.
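As a rough illustration of instancing with individual state, the Python sketch below shares one blade-of-grass mesh between all instances while each instance carries its own attributes.  Everything here (`BLADE_MESH`, `BladeState`, `place_blade`, the 0.1 bend factor) is hypothetical and only meant to show the idea, not how any particular game or driver does it.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple

# One shared mesh for every blade: (x, y) points from root to tip.
BLADE_MESH: List[Tuple[float, float]] = [(0.0, 0.0), (0.0, 0.5), (0.0, 1.0)]


@dataclass
class BladeState:
    """Per-instance attributes; each blade gets its own copy."""
    offset_x: float  # where this blade is planted
    phase: float     # shifts the sway so blades don't move in lockstep
    height: float    # vertical scale of this blade


def place_blade(state: BladeState, time: float) -> List[Tuple[float, float]]:
    """Transform the shared mesh using one instance's state."""
    sway = math.sin(time + state.phase)
    points = []
    for x, y in BLADE_MESH:
        bend = 0.1 * sway * y  # root stays fixed, tip moves the most
        points.append((state.offset_x + x + bend, y * state.height))
    return points
```

Two blades drawn from the same mesh at the same moment end up with different shapes and positions purely because their `BladeState` values differ.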

For data junkies, this table summarizes the changes moving from DX8 and Shader Model 1.0 to DX10 and the new SM4.0 specs.

All of this new DX10 ability will allow programmers to do more than ever on the GPU, removing the CPU as a gaming bottleneck in many cases.  Here are a couple of examples: above we see an algorithm for human hair simulation.  Before, the majority of the physics and setup work was done on the CPU but now with the new options DX10 provides it can all be moved to the GPU.

Another example is using a stencil shadow algorithm (a popular method in current games) where most of the work previously relegated to the CPU can be moved to the GPU using DX10.
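For a feel of the kind of work involved, one typical per-frame step in stencil shadow techniques is finding the silhouette edges of a mesh as seen from the light, since those are the edges the shadow volume gets extruded from.  The Python sketch below shows that step on the CPU under a few assumptions: a closed triangle mesh with counter-clockwise faces, a directional light, and hypothetical helper names throughout.  It is not taken from NVIDIA's slides.

```python
from typing import Dict, List, Tuple

Vec3 = Tuple[float, float, float]
Face = Tuple[int, int, int]  # indices into the vertex list, counter-clockwise
Edge = Tuple[int, int]


def sub(a: Vec3, b: Vec3) -> Vec3:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def cross(a: Vec3, b: Vec3) -> Vec3:
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a: Vec3, b: Vec3) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def is_lit(vertices: List[Vec3], face: Face, light_dir: Vec3) -> bool:
    """A face is lit when its outward normal opposes the light's travel direction."""
    a, b, c = (vertices[i] for i in face)
    normal = cross(sub(b, a), sub(c, a))
    return dot(normal, light_dir) < 0.0


def silhouette_edges(vertices: List[Vec3], faces: List[Face],
                     light_dir: Vec3) -> List[Edge]:
    """Return edges shared by one lit face and one unlit face."""
    edge_lit: Dict[Edge, List[bool]] = {}
    for face in faces:
        lit = is_lit(vertices, face, light_dir)
        for k in range(3):
            i, j = face[k], face[(k + 1) % 3]
            edge_lit.setdefault((min(i, j), max(i, j)), []).append(lit)
    return sorted(edge for edge, flags in edge_lit.items()
                  if len(flags) == 2 and flags[0] != flags[1])
```

A loop like this has to touch every face and every edge each time the light or the object moves, which is exactly the sort of job that benefits from running next to the geometry on the GPU instead of on the CPU.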

As an example of DX10 at work, this NVIDIA demo shows a landscape being created on the fly using geometry shaders with a particle system running only on the GPU to simulate the water running down the rock.  Oh, and the graphics are rendered on it too.