Realtime Raytracing Commit Spotted in Unity GitHub

Subject: General Tech | September 14, 2018 - 10:32 PM |
Tagged: rtx, Unity, ray tracing, directx raytracing, DirectX 12

As Ken covered in a separate post, NVIDIA has made its Turing architecture details public; the architecture will bring real-time ray tracing to PC gaming later this month. When it was announced, NVIDIA showed some demos in Unreal Engine 4, and a few partnered games (Battlefield V, Shadow of the Tomb Raider, and Metro Exodus) showed off their implementations.

As we expected, Unity is working on supporting it too.


Not ray tracing, but from the same project at Unity.

The first commit showed up earlier today on Unity's GitHub for their Scriptable Render Pipelines project. Looking through the changes, it appears to just generate the acceleration structure from the Renderer-type objects in the current scene (as well as define the toggle properties, of course). It looks like we are still a long way out.

I'm looking forward to ray tracing implementations, though. I tend to like art styles with anisotropic metal trim and soft shadows, which are difficult to get right with rasterization alone because both effects depend on other objects in the scene. In the case of metal, reflections dominate the look and feel of the material. In the case of soft shadows, you really need to keep track of how much of a light has been blocked between the rendered fragment and the non-point light.
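A rough sketch of that bookkeeping: to soften a shadow, you estimate what fraction of an area light is visible from the shaded point by casting many shadow rays at sampled points on the light and testing each against the occluders in between. The function names and the sphere-only scene below are illustrative assumptions, not Unity's (or anyone's) actual API.

```python
import random

def blocked(origin, to_light, center, radius):
    """True if the segment from origin to the light sample intersects a
    sphere occluder strictly between the two endpoints."""
    oc = tuple(origin[i] - center[i] for i in range(3))
    a = sum(d * d for d in to_light)
    b = 2.0 * sum(oc[i] * to_light[i] for i in range(3))
    c = sum(x * x for x in oc) - radius * radius
    disc = b * b - 4.0 * a * c
    if disc < 0.0:
        return False
    t = (-b - disc ** 0.5) / (2.0 * a)  # nearest hit, in units of the segment
    return 1e-6 < t < 1.0               # only count occluders between point and light

def light_visibility(point, light_center, light_size, occluders, samples=512):
    """Fraction of a square area light visible from `point`:
    1.0 = fully lit, 0.0 = fully shadowed, in between = penumbra."""
    rng = random.Random(0)  # fixed seed keeps the estimate deterministic
    visible = 0
    for _ in range(samples):
        # Sample a point on a square light lying in the XZ plane.
        lx = light_center[0] + (rng.random() - 0.5) * light_size
        lz = light_center[2] + (rng.random() - 0.5) * light_size
        sample = (lx, light_center[1], lz)
        seg = tuple(sample[i] - point[i] for i in range(3))
        if not any(blocked(point, seg, c, r) for c, r in occluders):
            visible += 1
    return visible / samples
```

With no occluders this returns 1.0; a large sphere between the point and the light returns 0.0; a small one returns something in between, which is exactly the per-fragment "how much of the light is blocked" quantity described above.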

And yes, it will depend on the art style, but mine just happens to be computationally expensive.

September 14, 2018 | 11:32 PM - Posted by NoSuchAnimalExistsJustYet (not verified)

There is no such animal as Realtime Ray-Tracing! Nvidia is using hybrid ray tracing plus AI-accelerated denoising to make use of the limited number of rays the RT cores can cast within the 16.67 ms to 33.33 ms frame times of 60 FPS down to 30 FPS gaming.

There is still mostly rasterization going on, mixed with some limited ray-traced output. You can fully ray trace any scene, but that's not going to be done in 16.67 ms or 33.33 ms; it will still take minutes per frame.

It's LIMITED Hybrid Ray Tracing, and any other gaming "Real Time" banter is just marketing and nothing more! And that marketing is heavily funded this time around, because Imagination Technologies did a similar thing with their PowerVR Wizard IP but did not have the funds to buy their way into the hearts of the online "press"!

The more interesting part of Turing is the trained AI running on Turing's Tensor Cores, which is what really makes that limited ray tracing possible. The trained AI doing DLAA is also interesting.

September 15, 2018 | 04:22 PM - Posted by Scott Michaud

Rays are cast, and they are cast in real time, so the term "real-time ray tracing" is valid.

Yes, the actual geometry is rasterized in a typical video game, but rays are cast into the environment to gather auxiliary data that rasterization cannot gather (because it doesn't know much about the world around it). As such, I'm going to keep calling it real-time ray tracing.

This is especially true since the API, as far as I can tell, allows you to fully ray trace if you like. The only thing stopping you would be performance, and it's better to spend those rays on things like metal highlights and shadows. (We might see, like, one or two zero-bounce, unlit, fully ray traced demos for 360 video or something, though. Maybe even ray traced audio / AI / etc.?)

September 16, 2018 | 01:56 PM - Posted by DeepPurpleNeuralNetDenoising (not verified)

No, Real Time Ray Tracing (the Holy Grail kind) is when the entire scene is traced 100% using only rays! Both Nvidia and Imagination Technologies (PowerVR Wizard) are using some form of Limited/Hybrid Ray Tracing, with the ray-traced output mixed with rasterized output.

And how many rays does that leave Nvidia's RTX 2080 Ti (10 GigaRays/second) or RTX 2080 (8 GigaRays/second)? Divide by 1000 (1000 ms per second) and then multiply by the frame time: between 16.67 ms (60 FPS) and 33.33 ms (30 FPS), for example.

And those millisecond frame times are not really the full amount of time available for ray calculation, because the ray calculation step has to finish before the denoising step can take place, in whatever order the mix-down with the rasterization pipeline steps and any other post-processing steps have to be done. So a lot of other tasks also have to be completed inside the 16.67 ms to 33.33 ms frame times that most games are targeting.

So, for example, at 60 FPS that's only about 166,700,000 rays (10 billion / 1000, times 16.67 ms) for the RTX 2080 Ti, minus the time it takes for all of the non-parallel steps to finish within that same 16.67 ms. Any steps that cannot be completed in parallel will push the actual ray count lower in order to make the 16.67 ms allotment for 60 FPS.
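Spelling out that arithmetic as a back-of-the-envelope upper bound only (it deliberately ignores the denoising, raster, and post-processing work the comment rightly says must share the frame):

```python
def rays_per_frame(gigarays_per_second, fps):
    """Peak rays available in one frame if the RT cores ran the whole time."""
    frame_time_ms = 1000.0 / fps                      # 16.67 ms at 60 FPS
    rays_per_ms = gigarays_per_second * 1e9 / 1000.0  # rays per millisecond
    return rays_per_ms * frame_time_ms

# RTX 2080 Ti, 10 GigaRays/s:
#   60 FPS -> ~166.7 million rays per frame
#   30 FPS -> ~333.3 million rays per frame
# RTX 2080, 8 GigaRays/s:
#   60 FPS -> ~133.3 million rays per frame
```

Note this is just peak-rate-divided-by-frame-rate; the real per-frame ray budget is smaller by however long the serial steps take.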

I'm guessing that Nvidia has things properly pipelined, with each of the steps overlapped and things rendered by tile to help with cache locality and to reduce the latency-inducing VRAM memory calls that could disrupt the very delicate timing required to pull this all off. But even at the tile level Nvidia has probably broken things down further, if one looks at Turing's Streaming Multiprocessors (SMs) and their allotment of Tensor Cores, FP and INT shader cores, and Ray Tracing Core / other SM units.

Look at how Nvidia defines VRS (Variable Rate Shading) in the Turing microarchitecture whitepaper; that's probably how Nvidia is letting game makers maintain their frame rate and frame variance targets for gameplay.

"Variable Rate Shading (VRS)

VRS allows developers to control shading rate dynamically, shading as little as once per sixteen pixels or as often as eight times per pixel. The application specifies shading rate using a combination of a shading-rate surface and a per-primitive (triangle) value. VRS is a very powerful tool that allows developers to shade more efficiently, reducing work in regions of the screen where full resolution shading would not give any visible image quality benefit, and therefore improving frame rate. Several classes of VRS-based algorithms have already been identified, which can vary shading work based on content level of detail (Content Adaptive Shading), rate of content motion (Motion Adaptive Shading), and for VR applications, lens resolution and eye position (Foveated Rendering)." (1)

Here is something else aimed at reducing unnecessary re-processing:

"Texture-Space Shading

With texture-space shading, objects are shaded in a private coordinate space (a texture space) that is saved to memory, and pixel shaders sample from that space rather than evaluating results directly. With the ability to cache shading results in memory and reuse/resample them, developers can eliminate duplicate shading work or use different sampling approaches that improve quality." (1)

The Turing whitepaper defines a Turing Frame in "APPENDIX C RTX-OPS DESCRIPTION". Under that heading there is Figure 43 (Figure 44 uses the same graphic), titled:

"Figure 43. Workload Distribution Over One Turing Frame Time" (1) [see Figure 43 on page 72 of the Turing whitepaper]

So One Turing Frame in that graphic shows Turing's general, rough, per-frame parallel workload pipeline structure. Nvidia's marketing likes to call it Real Time Ray Tracing, but they are splitting hairs: it's actually Limited Real Time Ray Tracing (Hybrid Ray Tracing), and anywhere the whitepaper says Real Time Ray Tracing one can properly substitute Limited Real Time Ray Tracing (Hybrid Ray Tracing)!

Rays (both light and shadow rays) are cast, yes, but not for the entirety of the process. If one looks at Figure 43, the process also includes a fair bit of the nominal raster type of shading: the yellow bar (FP32 raster shading) and the dark green bar (INT32 raster shading) workloads, in addition to the DNN (Deep Neural Net) workload shaded in purple. So the ray tracing part (in light green) makes up somewhat less of the overall rendering, and it's actually the neural-net-based processing (denoising) that allows that limited ray tracing's very grainy output to be of use for gaming.

Nvidia's whitepaper lists the breakdown:


"To compute RTX-OPs, the peak operations of each type are derated based on how often they are used.

In particular:

• Tensor operations are used 20% of the time

• CUDA cores are used 80% of the time

• RT cores are used 40% of the time (half of 80%)

• INT32 pipes are used 28% of the time (35% of 80%)

For example, RTX-OPS = TENSOR * 20% + FP32 * 80% + RTOPS * 40% + INT32 * 28%

Figure 44 shows an illustration of the peak operations of each type for RTX 2080 Ti. Plugging in those peak operation counts results in a total RTX-OPs number of 78. For example, 14 * 80% + 14 * 28% + 100 * 40% + 114 * 20%." (1)
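Plugging the quoted weights and peak numbers into code reproduces the 78 figure (the function name is my own; the weights and the RTX 2080 Ti peaks are taken from the quote above):

```python
def rtx_ops(tensor_tops, fp32_tflops, rt_ops, int32_tips):
    """RTX-OPS: each engine's peak rate derated by its assumed duty cycle."""
    return (tensor_tops * 0.20 +   # Tensor cores busy 20% of the time
            fp32_tflops * 0.80 +   # CUDA (FP32) cores busy 80%
            rt_ops      * 0.40 +   # RT cores busy 40% (half of 80%)
            int32_tips  * 0.28)    # INT32 pipes busy 28% (35% of 80%)

# RTX 2080 Ti peaks per the whitepaper: 114 Tensor TOPS, 14 FP32 TFLOPS,
# 100 RT-OPS, 14 INT32 TIPS.
print(rtx_ops(114, 14, 100, 14))   # ~77.9, which NVIDIA rounds to 78
```

The weighted sum is 22.8 + 11.2 + 40 + 3.92 = 77.92, hence the marketing number of 78 RTX-OPS.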

The ray tracing core itself really needs a whitepaper all its own, so hopefully more will be made known.

The Imagination Technologies (PowerVR Wizard) folks were the first to do this Limited/Hybrid Ray Tracing, not Nvidia, though Nvidia's neural network denoising is a definite innovation in this hybrid ray tracing process, no question about that. But others can get their own tensor cores and do the same, and even CPUs have neural-net sorts of branch prediction built into their cores.

But Nvidia sure has the funds to purchase mindshare articles from some in the press (Tom's Hardware USA, others) that state otherwise!

P.S. Nvidia! Whitepapers need a Glossary of Terms and Acronyms, and you need to completely label your SMs! What is an SFU(?)

Wikipedia's Fermi (microarchitecture) entry defines them as Special Function Units (SFUs):

"Special Functions Units (SFUs): Execute transcendental instructions such as sin, cosine, reciprocal, and square root. Each SFU executes one instruction per thread, per clock; a warp executes over eight clocks. The SFU pipeline is decoupled from the dispatch unit, allowing the dispatch unit to issue to other execution units while the SFU is occupied." (2)

So maybe that, updated for Turing, needs to be in the whitepaper's Glossary of Terms. DNN also! And that glossary does NOT exist in your Nvidia Turing whitepaper. What's up with that, Nvidia? The mainframe computer folks did glossaries, damn nice ones at that, and still do; why are the microprocessor and graphics processor markets any different?


(1) "Graphics Reinvented", NVIDIA Turing GPU architecture whitepaper

(2) "Fermi (microarchitecture)", Wikipedia

September 16, 2018 | 09:55 PM - Posted by Nottoopc (not verified)

Tldr. Real-time rays woooo!! Real-time approximated shadows, real-time reflections!

September 17, 2018 | 11:36 AM - Posted by WoWithSomeHooBackAtYa (not verified)

Whatever! Nvidia has more money than Imagination Technologies, enough to pay (literally, with paid Nvidia employees) to assist game developers in adopting Nvidia's "Real Time" ray tracing, and the secret sauce is really the money it takes to get the technology adopted by the wider games / game-engine ecosystem. Programmers (game developers) are not pachyderms; they won't work for peanuts!

Imagination Technologies (IT) has ray tracing IP that it can license to others, so Nvidia's competition, both AMD and Intel (in 2020, they say), can also get on that "Real Time" ray tracing bandwagon. AMD can probably engineer its own ray tracing solution, but Intel has to license anyway, so IT's ray tracing IP could be licensed. One would think that Vulkan will also begin supporting ray tracing in hardware, and failing that, implement it in software until specific GPU hardware support arrives. That is how it's done with both DX12 and Vulkan for any make/model of GPU: just query the OS, and whatever hardware/software support the GPU has for ray tracing is enumerated, usually via Plug and Play on Windows and similar methods on other OSs.

That's what was done for Nvidia's lack of fully-in-hardware async compute compared to AMD's hardware ACE units. So AMD does ray tracing also; it's just not as fully hardware-based as Nvidia's RT cores. The graphics APIs (DX12, Vulkan, others) are extensible by design, with built-in methods to register custom extensions that target custom hardware and software methods for ray tracing and any other compute/AI functionality done on specialized hardware or in software, including AI-related workloads.

So Yes "Real-time rays woooo!! Realtime aproxinated shadows, real-time reflections" and Real Time Ray Traced Ambient Occlusion, environment lighting, and refractions also.

And do not forget that there are sound rays that can be traced on the very same hardware, along with other physics-related processing and even game collision detection via the ray tracing engines. There is a whole bunch of emergent technological use for the BVH (bounding volume hierarchy) hardware IP in the ray tracing cores of Nvidia's RTX-based SKUs. Ditto for the Tensor Cores doing other than AI processing, if needed!

Every GPU uses a BVH for ray tracing, implemented either in software or hardware, but Nvidia's hardware methods were already done by Imagination Technologies on their PowerVR Wizard GPUs.
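For the curious, the BVH idea itself fits in a few lines: group the primitives' bounding boxes into a tree so a ray can reject whole subtrees with a single box test instead of testing every object. The sketch below (median split on x, slab intersection test) is a toy illustration of the concept, not NVIDIA's or Imagination's hardware scheme, and all names are made up.

```python
def slab_hit(o, inv_d, lo, hi):
    """Ray vs axis-aligned box (slab method). inv_d holds 1/direction per
    axis, so directions here must have no zero components."""
    tmin, tmax = 0.0, float('inf')
    for a in range(3):
        t1 = (lo[a] - o[a]) * inv_d[a]
        t2 = (hi[a] - o[a]) * inv_d[a]
        if t1 > t2:
            t1, t2 = t2, t1
        tmin, tmax = max(tmin, t1), min(tmax, t2)
    return tmin <= tmax

def bounds_union(a, b):
    """Smallest box enclosing boxes a and b, each given as (lo, hi)."""
    return (tuple(min(a[0][i], b[0][i]) for i in range(3)),
            tuple(max(a[1][i], b[1][i]) for i in range(3)))

def build_bvh(items):
    """items: list of (index, (lo, hi)) boxes. A node is (bounds, payload);
    the payload is a primitive index at a leaf, or a (left, right) pair."""
    if len(items) == 1:
        return (items[0][1], items[0][0])
    items = sorted(items, key=lambda it: it[1][0][0] + it[1][1][0])  # x centroid
    mid = len(items) // 2
    left, right = build_bvh(items[:mid]), build_bvh(items[mid:])
    return (bounds_union(left[0], right[0]), (left, right))

def traverse(node, o, inv_d, hits):
    """Collect indices of boxes the ray touches, culling missed subtrees."""
    bounds, payload = node
    if not slab_hit(o, inv_d, *bounds):
        return
    if isinstance(payload, tuple):
        traverse(payload[0], o, inv_d, hits)
        traverse(payload[1], o, inv_d, hits)
    else:
        hits.append(payload)
```

A ray fired down a row of boxes reports every index, while a ray that misses the root bounds is rejected with a single test; hardware RT cores accelerate exactly this kind of box-and-triangle testing.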

This ray tracing move by Nvidia will force both AMD and Intel to also include on-GPU ray tracing acceleration hardware, and that can be good for the non-gaming professional graphics users too, since that group can really dial up the number of rays traced for their non-realtime, non-FPS-constrained workloads. The more rays, the more realistic it gets!

Here's Some More TLDR for that single enfeebled cell of grey floating in that vasr SEA of Lipids beneath that think crust of bone that is your numbskull!

Woo and some Hoo For Them Rays!

September 17, 2018 | 11:40 AM - Posted by WoWithSomeHooBackAtYa (not verified)

Edit: vasr

To: vast

Fat fingers flailing across chicklet keys, and crappy spelling and proofreading, and it's all just to Troll with Walls-O-Text the TLDRs out there!

September 15, 2018 | 09:33 PM - Posted by ipoopwhenifart (not verified)

The rays that are traced by RTX cards are done in real time. What's so hard to understand about that?

September 16, 2018 | 02:10 PM - Posted by DeepPurpleNeuralNetDenoising (not verified)

That's marketing nonsense, and I'm attacking it because Imagination Technologies beat Nvidia to market with their PowerVR Wizard IP, which also used hybrid ray tracing.

I'm not knocking Nvidia's engineers, only their marketing department, and Nvidia's marketing monkeys need to keep their smelly little fingers out of Nvidia's whitepaper content.

It's still impressive that Nvidia's neural-network-based denoising (a trained AI running on the Tensor Cores) can make use of that crappy hybrid ray tracing output. The AI cores are what allow Nvidia's grainy hybrid ray tracing output to be usable for gaming. Props go out to the Imagination Technologies engineers also; some of them are now employed by Nvidia, Apple, and others after Imagination Technologies was sliced up and sold off to different owners.

BUT it's STILL Real Time Hybrid Ray Tracing and not "Real Time Ray Tracing"!

September 17, 2018 | 06:15 PM - Posted by PhoroNIXED (not verified)

OK, it's time to BOYCOTT Phoronix if they cannot fix their annoying ads problem!

And what's up with the CPU usage on Phoronix? It's like I'm doing a Blender CPU render. It's getting as bad on Phoronix as it has gotten at AnandTech with the number of annoying ads lately.

Ad block and Boycott that ad madness!
