Imagination PowerVR Ray Tracing with UE4 & Vulkan Demo

Subject: Graphics Cards, Mobile | June 2, 2017 - 02:23 AM |
Tagged: Imagination Technologies, PowerVR, ray tracing, ue4, vulkan

Imagination Technologies has published another video that demonstrates ray tracing with their PowerVR Wizard GPU. The test system, today, is a development card that is running on Ubuntu, and powering Unreal Engine 4. Specifically, it is using UE4’s Vulkan renderer.

The demo highlights two major advantages of ray traced images. The first is that, rather than applying a baked cubemap with screen-space reflections to simulate metallic objects, this demo calculates reflections with secondary rays. From there, it’s just a matter of hooking up the gathered information into the parameters that the shader requires and doing the calculations.
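As a rough illustration of what "calculating reflections with secondary rays" involves, the sketch below computes the direction of a mirror-reflected ray. It is not Imagination's implementation; the function name `reflect` is hypothetical, and the surface normal is assumed to be unit length.

```python
def reflect(direction, normal):
    """Mirror an incoming direction about a unit surface normal: r = d - 2(d.n)n."""
    # Dot product of the incoming direction with the (assumed unit-length) normal.
    d_dot_n = sum(d * n for d, n in zip(direction, normal))
    # Subtract twice the normal component to flip the direction about the surface.
    return tuple(d - 2.0 * d_dot_n * n for d, n in zip(direction, normal))
```

A ray heading down and to the right, `(1.0, -1.0, 0.0)`, that strikes a floor with normal `(0.0, 1.0, 0.0)` leaves heading up and to the right. The renderer would then trace that new ray and feed whatever it hits into the material's shader inputs.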

The second advantage is that it can do arbitrary lens effects, like distortion and equirectangular, 360 projections. Rasterization, which projects 3D world coordinates into 2D coordinates on a screen, assumes that edges are still straight, and that causes problems as FoV gets very large, especially full circle. Imagination Technologies acknowledges that workarounds exist, like breaking up the render into six faces of a cube, but the best approximation is casting a ray per pixel and seeing what it hits.
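To make the "ray per pixel" idea concrete, here is a minimal sketch of how a ray direction could be generated for each pixel of an equirectangular, 360-degree image. The function name and the axis convention (y up, z forward) are assumptions for illustration, not details from the demo.

```python
import math

def equirect_ray(px, py, width, height):
    """Unit ray direction through the centre of pixel (px, py) of an equirectangular image."""
    # Longitude sweeps -pi..pi left to right; latitude sweeps +pi/2..-pi/2 top to bottom.
    lon = ((px + 0.5) / width) * 2.0 * math.pi - math.pi
    lat = math.pi / 2.0 - ((py + 0.5) / height) * math.pi
    # Convert spherical angles to a direction (assumed convention: y up, z forward).
    x = math.cos(lat) * math.sin(lon)
    y = math.sin(lat)
    z = math.cos(lat) * math.cos(lon)
    return (x, y, z)
```

Because every pixel simply gets its own direction, nothing in this mapping requires straight edges to stay straight, which is the assumption that trips up a single rasterized projection at very wide fields of view.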

The demo was originally for GDC 2017, back in February, but the videos have just been released.

MWC 16: Imagination Technologies Ray Tracing Accelerator

Subject: Graphics Cards, Mobile, Shows and Expos | February 23, 2016 - 08:46 PM |
Tagged: raytracing, ray tracing, PowerVR, mwc 16, MWC, Imagination Technologies

For the last couple of years, Imagination Technologies has been pushing hardware-accelerated ray tracing. One of the major problems in computer graphics is knowing what geometry and material correspond to a specific pixel on the screen. Several methods exist, although typical GPUs crush a 3D scene into the virtual camera's 2D space and do a point-in-triangle test on it. Once they know whether the pixel is inside the triangle, and where, it can be colored by a pixel shader.
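The point-in-triangle test mentioned above can be sketched with edge functions, which also produce the barycentric weights that say where in the triangle the pixel lies. This is a simplified 2D illustration with hypothetical function names, not how any particular GPU implements it.

```python
def edge(a, b, p):
    """Signed area term: which side of the edge a->b the point p falls on."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])

def barycentric(tri, p):
    """Return barycentric weights of p inside triangle tri, or None if p is outside."""
    a, b, c = tri
    area = edge(a, b, c)
    if area == 0:
        return None  # Degenerate triangle covers no pixels.
    # Dividing by the signed area makes the test independent of winding order.
    w0 = edge(b, c, p) / area
    w1 = edge(c, a, p) / area
    w2 = edge(a, b, p) / area
    if w0 < 0 or w1 < 0 or w2 < 0:
        return None  # The point lies outside at least one edge.
    return (w0, w1, w2)
```

The three weights sum to one and are what a pixel shader would use to interpolate per-vertex attributes such as color or texture coordinates.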


Another method is casting light rays into the scene, and assigning a color based on the material that it lands on. This is ray tracing, and it has a few advantages. First, it is much easier to handle reflections, transparency, shadows, and other effects where information is required beyond what the affected geometry and its material provides. There are usually ways around this, without resorting to ray tracing, but they each have their own trade-offs. Second, it can be more efficient for certain data sets. Rasterization, since it's based around a “where in a triangle is this point” algorithm, needs geometry to be made up of polygons.
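One way to see why ray tracing is not tied to polygons: a ray can be intersected with a surface defined by an equation, such as a sphere, without triangulating it first. The sketch below is a textbook ray-sphere test with a hypothetical function name; it assumes the ray direction is unit length.

```python
import math

def ray_sphere(origin, direction, center, radius):
    """Distance along a unit-length ray to the nearest hit on a sphere, or None on a miss."""
    oc = [o - c for o, c in zip(origin, center)]
    # Solve t^2 + 2*b*t + c = 0 for the ray parameter t.
    b = sum(d * o for d, o in zip(direction, oc))
    c = sum(o * o for o in oc) - radius * radius
    disc = b * b - c
    if disc < 0:
        return None  # The ray misses the sphere entirely.
    root = math.sqrt(disc)
    t = -b - root  # Nearer intersection first.
    if t < 0:
        t = -b + root  # Ray origin is inside the sphere; use the exit point.
    return t if t >= 0 else None
```

A rasterizer would need that sphere approximated by many triangles, while the ray test works on the exact surface.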

It also has the appeal of being what the real world sort-of does (assuming we don't need to model Gaussian beams). That doesn't necessarily mean anything, though.

At Mobile World Congress, Imagination Technologies once again showed off their ray tracing hardware, embodied in the PowerVR GR6500 GPU. This graphics processor has dedicated circuitry to calculate rays, and they use it in a couple of different ways. They presented several demos that modified Unity 5 to take advantage of their ray tracing hardware. One particularly interesting one was their quick, seven-second video that added ray traced reflections atop an otherwise rasterized scene. It was a little too smooth, creating reflections that were too glossy, but that could probably be downplayed in the material. (Update: Feb 24th @ 5pm. Car paint is actually that glossy. It's a different issue.) Back when I was working on a GPU-accelerated software renderer, before Mantle, Vulkan, and DirectX 12, I was hoping to use OpenCL-based ray traced highlights on idle GPUs, if I didn't have any other purposes for them. Now though, those can be exposed to graphics APIs directly, so they might not be so idle.

The downside of dedicated ray tracing hardware is that, well, the die area could have been used for something else. Extra shaders, for compute, vertex, and material effects, might be more useful in the real world... or maybe not. Add in the fact that fixed-function circuitry already exists for rasterization, and it makes you balance gain for cost.

It could be cool, but it has its trade-offs, like anything else.

Ray Tracing is back? That's Wizard!

Subject: General Tech, Shows and Expos | March 19, 2014 - 01:20 PM |
Tagged: Imagination Technologies, gdc 14, wizard, ray tracing

The Tech Report visited Imagination Technologies' booth at GDC where they were showing off a new processor, the Wizard GPU.  It is based on the PowerVR Series6XT Rogue graphics processor which is specifically designed to accelerate ray tracing performance, a topic we haven't heard much about lately.  They describe the performance as capable of processing 300 million rays and 100 million dynamic triangles per second which translates to 7 to 10 rays per pixel at 720p and 30Hz or 3 to 5 rays a pixel at 1080p and 30Hz.  That is not bad, though Imagination Technologies estimates movies display at a rate of 16 to 32 rays per pixel so it may be a while before we see a Ray Tracing slider under Advanced Graphics Options.
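The rays-per-pixel figures follow from dividing the ray budget by the number of pixels drawn each second. A quick sketch of that arithmetic (the function name is just for illustration):

```python
def rays_per_pixel(rays_per_second, width, height, fps):
    """Theoretical ceiling on rays available for each pixel of each frame."""
    return rays_per_second / (width * height * fps)

# 300 million rays per second at 30Hz.
at_720p = rays_per_pixel(300e6, 1280, 720, 30)    # roughly 10.9
at_1080p = rays_per_pixel(300e6, 1920, 1080, 30)  # roughly 4.8
```

Those ceilings line up with the top of the quoted 7 to 10 and 3 to 5 ranges; the lower ends presumably leave headroom for real-world overhead.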


"When we visited Imagination Technologies at CES, they were showing off some intriguing hardware that augments their GPUs in order to accelerate ray-traced rendering. Ray tracing is a well-known and high-quality form of rendering that relies on the physical simulation of light rays bouncing around in a scene. Although it's been used in movies and in static scene creation, ray tracing has generally been too computationally intensive to be practical for real-time graphics and gaming. However, Imagination Tech is looking to bring ray-tracing to real-time graphics—in the mobile GPU space, no less—with its new family of Wizard GPUs."


GTC 2013: Jen-Hsun Huang Takes the Stage to Discuss NVIDIA's Future, New Hardware

Subject: General Tech, Graphics Cards | March 19, 2013 - 02:55 PM |
Tagged: unified virtual memory, ray tracing, nvidia, GTC 2013, grid vca, grid, graphics cards

Today, NVIDIA's CEO Jen-Hsun Huang stepped on stage to present the GTC keynote. In the presentation (which was live streamed on the GTC website and later archived), NVIDIA discussed five major points, looking back over 2013 and into the future of its mobile and professional products. In addition to the product roadmap, NVIDIA discussed the state of computer graphics and GPGPU software. Remote graphics and GPU virtualization were also on tap. Finally, towards the end of the keynote, the company revealed its first appliance, the NVIDIA GRID VCA. The culmination of NVIDIA's GRID and GPU virtualization technology, the VCA is a device that hosts up to 16 virtual machines, each of which can tap into one of 16 Kepler-based graphics processors (8 cards, 2 GPUs per card) to fully hardware accelerate software running on the VCA. Three new mobile Tegra parts and two new desktop graphics processors were also hinted at, with improvements to power efficiency and performance.


On the desktop side of things, NVIDIA's roadmap included two new GPUs. Following Kepler, NVIDIA will introduce Maxwell and Volta. Maxwell will feature a new virtualized memory technology called Unified Virtual Memory. This tech will allow both the CPU and GPU to read from a single (virtual) memory store. Much as with the promise of AMD's Kaveri APU, Unified Virtual Memory will result in speed improvements in heterogeneous applications because data will not have to be copied between the CPU and GPU in order to be processed. Server applications will really benefit from the shared memory tech. NVIDIA did not provide details, but from the sound of it, the CPU and GPU both continue to write to their own physical memory, but there is a layer of virtualized memory on top of that which will allow the two (or more) different processors to read from each other's memory store.

Following Maxwell, Volta will be a physically smaller chip with more transistors (likely on a smaller process node). In addition to the power efficiency improvements over Maxwell, it steps up the memory bandwidth significantly. NVIDIA will use TSV (through-silicon via) technology to physically mount the graphics DRAM chips over the GPU (attached to the same silicon substrate electrically). According to NVIDIA, this new TSV-mounted memory will achieve up to 1 terabyte per second of memory bandwidth, which is a notable increase over existing GPUs.


NVIDIA continues to pursue the mobile market with its line of Tegra chips that pair an ARM CPU, NVIDIA GPU, and SDR modem. Two new mobile chips called Logan and Parker will follow Tegra 4. Both new chips will support the full CUDA 5 stack and OpenGL 4.3 out of the box. Logan will feature a Kepler-based graphics processor on the chip that can do “everything a modern computer ought to do” according to NVIDIA. Parker will have a yet-to-be-revealed graphics processor (a Kepler successor). This mobile chip will utilize 3D FinFET transistors. It will have a greater number of transistors in a smaller package than previous Tegra parts (it will be about the size of a dime), and NVIDIA also plans to ramp up the frequency to wrangle more performance out of the mobile chip. NVIDIA has stated that Logan silicon should be completed towards the end of 2013, with the mobile chips entering production in 2014.


Interestingly, Logan has a sister chip that NVIDIA is calling Kayla. This mobile chip is capable of running ray tracing applications and features OpenGL geometry shaders. It can support GPGPU code and will be compatible with Linux.

NVIDIA has been pushing CUDA for several years, now. The company has seen some respectable adoption rates, by growing from 1 Tesla supercomputer in 2008 to its graphics cards being used in 50 supercomputers, with 500 million CUDA processors on the market. There are now allegedly 640 universities working with CUDA and 37,000 academic papers on CUDA.


Finally, NVIDIA's hinted-at new product announcement was the NVIDIA VCA, which is a GPU virtualization appliance that hooks into the network and can deliver up to 16 virtual machines running independent applications. These GPU-accelerated workspaces can be presented to thin clients over the network by installing the GRID client software on users' workstations. The specifications of the GRID VCA are rather impressive, as well.

The GRID VCA features:

  • 2 x Intel Xeon processors with 16 threads each (32 total threads)
  • 192GB to 384GB of system memory
  • 8 Kepler-based graphics cards, with two GPUs each (16 total GPUs)
  • 16 x GPU-accelerated virtual machines

The GRID VCA fits into a 4U case. It can deliver remote graphics to workstations, and is allegedly fast enough to deliver GPU-accelerated software that is equivalent to having it run on the local machine (at least over LAN). The GRID Visual Computing Appliance will come in two flavors at different price points. The first will have 8 Kepler GPUs with 4GB of memory each, 16 CPU threads, and 192GB of system memory for $24,900. The other version will cost $34,900 and feature 16 Kepler GPUs (4GB of memory each), 32 CPU threads, and 384GB of system memory. On top of the hardware cost, NVIDIA is also charging licensing fees. While both GRID VCA devices can support unlimited devices, the licenses cost $2,400 and $4,800 per year respectively.


Overall, it was an interesting keynote, and the proposed graphics cards look to be offering up some unique and necessary features that should help hasten the day of ubiquitous general purpose GPU computing. The Unified Virtual Memory was something I was not expecting, and it will be interesting to see how AMD responds. AMD is already promising shared memory in its Kaveri APU, but I am interested to see the details of how NVIDIA and AMD will accomplish shared memory with dedicated graphics cards (and whether CrossFire/SLI setups will all have a single shared memory pool).

Stay tuned to PC Perspective for more GTC 2013 Coverage!

CES 2013: Caustic, now part of Imagination, Shows Series2 Ray Tracing Accelerators

Subject: Graphics Cards, Shows and Expos | January 12, 2013 - 11:38 AM |
Tagged: series2, ray tracing, imagination, ces 2013, CES, caustic

We have talked with Caustic on several occasions over the past couple of years about their desire to build a ray tracing accelerator.  Back in April of 2009 we first met with Caustic, learning who they were and what the goals of the company were; we saw early models of the CausticOne and CausticTwo and a demonstration of the capabilities of the hardware and software model.

While at CES this year we found the group at a new place - the Imagination Technologies booth - having been acquired since we last talked.  Now named the Caustic Series2 OpenRL accelerator boards, we are looking at fully integrated ASICs rather than demonstration FPGAs. 


This is the Caustic 2500 and it will retail for $1495 and includes a pair of the RT2 chips and 16GB of memory.  One of the benefits of the Caustic technology is that while you need a lot of memory, you do not need expensive, fast memory like GDDR5 used in today's graphics cards.  By utilizing DDR2 memory Imagination is able to put a whopping 16GB on the 2500 model.


A key benefit of the Caustic ray tracing accelerators comes with the simple software integration.  You can see above that Autodesk Maya 2013 is utilizing the Caustic Visualizer as a simple viewport into the project, just as you would with any other RT or preview rendering technique.  The viewport software is also available for 3ds Max.

There is a lower cost version of the hardware, the Caustic 2100, that uses a single chip and has half the memory for a $795 price tag.  They are shipping this month and we are interested to see how quickly, and how eagerly, developers adopt this technology.

Coverage of CES 2013 is brought to you by AMD!


Intel in the Cloud with Ray-traces

Subject: General Tech, Graphics Cards, Processors, Mobile | March 8, 2012 - 04:02 AM |
Tagged: ray tracing, tablet, tablets, knight's ferry, Intel

Intel looks to bring ray-tracing from their Many Integrated Core (Intel MIC) architecture to your tablet… by remotely streaming from a server loaded with one or more Knight’s Ferry cards.

Anticipation of ray-tracing has spanned almost the entire history of 3D video gaming. Reasonable support for ray-tracing is very seductive for games as it enables easier access to effects such as global illumination, reflections, and so forth. Ray-tracing is well deserving of its status as a buzzword.


Render yourself in what Knight’s Ferry delivered… with scaling linearly and ray-traced Wolfenstein

Screenshot from Intel Blogs.

Obviously Intel would love to make headway into the graphics market. In the past Intel has struggled to put forth an acceptable offering for graphics. It is my personal belief that Intel did not take graphics seriously when they were content selling cheap GPUs to be packed in with PCs. While the short term easy money flowed in, the industry slipped far enough ahead of them that they could not just easily pounce back into contention with a single huge R&D check.

Intel obviously cares about graphics now, and has been relentless in its research into the field. Their CPUs are far ahead of any competition in terms of serial performance -- and power consumption is getting plenty of attention itself.

Intel long ago acknowledged the importance of massively parallel computing but was never quite able to bring products like Larrabee to bear against anything that the companies it once ignored could retaliate with. This brings us back to ray-tracing: what is the ultimate advantage of ray-tracing?


Ray-tracing is a dead simple algorithm.


A ray-trace renderer is programmed very simply and elegantly. Effects are often added directly and without much approximation necessary. No hacking around is required in the numerous caveats within graphics APIs in order to get a functional render on screen. If you can keep throwing enough coal on the fire, it will burn without much effort -- so to speak. Intel just needs to put a fast enough processor behind it, and away they go.

Throughout the article, Daniel Pohl discusses numerous enhancements that they have made to their ray-tracing engine to improve performance. One of the most interesting improvements is their approach to antialiasing. If the rays from two neighboring pixels strike different meshes, or strike the same mesh where the surface direction (denoted by color) changes sharply between the pixels, then those pixels are flagged for supersampling. The combination of that shortcut with MLAA will also be explored by Intel at some point.
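The flagging rule can be sketched in a few lines. This is only an illustration of the idea as described, not Intel's code: the function name, the use of surface normals to stand in for "direction", and the `cos_threshold` cutoff are all assumptions.

```python
def flag_for_supersampling(mesh_ids, normals, cos_threshold=0.9):
    """Mark pixels whose right or lower neighbor hit a different mesh or a sharply different normal."""
    height, width = len(mesh_ids), len(mesh_ids[0])
    flags = [[False] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Compare each pixel with its right and lower neighbors only,
            # which visits every adjacent pair exactly once.
            for dx, dy in ((1, 0), (0, 1)):
                nx, ny = x + dx, y + dy
                if nx >= width or ny >= height:
                    continue
                different_mesh = mesh_ids[y][x] != mesh_ids[ny][nx]
                # Dot product of unit normals; a low value means a sharp change in direction.
                dot = sum(a * b for a, b in zip(normals[y][x], normals[ny][nx]))
                if different_mesh or dot < cos_threshold:
                    flags[y][x] = True
                    flags[ny][nx] = True
    return flags
```

Only the flagged pixels would then receive extra rays, so flat interior regions keep their single-sample cost.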


A little behind-the-scenes trickery...

Screenshot from Intel Blogs.

Intel claims that they were able to achieve 20-30 FPS at 1024x600 resolutions streaming from a server with a single Knight’s Ferry card installed to an Intel Atom-based tablet. They were able to scale to within a couple percent of theoretical 8x performance with 8 Knight’s Ferry cards installed.

I very much dislike trusting my content to online streaming services as I am an art nut. I value the preservation of content which just is not possible if you are only able to access it through some remote third party -- can you guess my stance on DRM? That aside, I understand that Intel and others will regularly find ways to push content to where there just should not be enough computational horsepower to accept it.

Ray-tracing might be Intel’s attempt to circumvent all of the years of research that they ignored with conventional real-time rendering technologies. Either way, gaming engines are going the way of simpler rendering algorithms as GPUs become more generalized and less reliant on fixed-function hardware assigned to some arbitrary DirectX or OpenGL specification.

Intel just hopes that they can have a compelling product at that destination whenever the rest of the industry arrives.

Source: Intel Blog

IDF 2011: Knights Ferry Shown 8-Deep Running Ray Tracing

Subject: Graphics Cards, Processors, Shows and Expos | September 15, 2011 - 06:17 PM |
Tagged: ray tracing, knights ferry, idf 2011, idf

Very few things impress like a collection of 256 processor cores in a box.  But that is exactly what we saw on our last visit to the floor at the Intel Developer Forum this year when I stopped by to visit friend-of-the-site Daniel Pohl to discuss updates to the ray tracing research he has been doing for many years now.  This is what he showed us:


What you see there is a dual-Xeon server running a set of 8 (!!) Knights Ferry many-core processor discrete cards.  Each card holds a chip with 32 Intel Architecture cores running at 1.2 GHz on it and each core can handle 4 threads for a total of 1024 threads in flight at any given time!  Keep in mind these are all modified x86 cores with support for 16-wide vector processing so they are pumping through a LOT of FLOPS.  Pohl did note that only 31-32 of the cores are actually doing ray tracing at any given time though as they reserve a couple for scheduling tasks, operating system interaction, etc.
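For anyone checking the math, the core and thread totals fall out of simple multiplication (variable names are just for illustration):

```python
cards = 8
cores_per_card = 32
threads_per_core = 4

total_cores = cards * cores_per_card            # 256 cores in the box
total_threads = total_cores * threads_per_core  # 1024 threads in flight
```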


Each of the eight cards in the system is using a pair of 6-pin PCIe power connectors and they are jammed in there pretty tight.  Pohl noted this was the only case they could find that would fit 8 dual-slot add-in cards into it so I'll take a note of that for when I build my own system around them.  Of course there are no display outputs on the Knights Ferry cards as they were never really turned into GPUs in the traditional sense.  They are essentially development and research platforms for exascale computing and HPC workloads for servers, though the plan is to bring the power to consumers eventually.


To run the demo the Knights Ferry ray tracing server was communicating over a Gigabit Ethernet connection with this workstation, which was running game processing, interaction processing and more, and which passed data about the movements of the camera and objects in the ray traced world to the server.  The eight Knights Ferry cards then render the frame, the Xeon CPUs compress the image (8:1 using a standard Direct3D format) and send the data across the network.  All of this happens in real time with basically no latency issues when compared to direct PC gaming.


While the ray tracing game engine projects might seem a little less exciting since the demise of Larrabee, Pohl and his team have been spending a lot of time on learning how to take advantage of the x86 cores available.  The Wolfenstein demo we have seen in past events has been improved to add things like HDR lighting, anti-aliasing and more.


Though these features have obviously been around in rasterization-based solutions for quite a long time, the demo was meant to showcase the fact that ray tracing doesn't inherently have difficulty performing those kinds of tasks as long as the processing power is there and allotted to it.


I am glad to see the ray tracing research continuing at Intel as I think that in the long-term future, that is the route that gaming and other graphics-based applications will take for rendering.  And I am not alone - id Software founder and Doom/Quake creator John Carmack agreed in a recent interview we held with him.

Source: PCPer