Looks Like Vega Nano is GO!

Subject: Graphics Cards | July 30, 2017 - 10:07 PM |
Tagged: Vega, Siggraph, Nano

This doesn't look like it was really meant to happen, but it is in the wild now!  Twitter user Drew has posted a picture of Chris Hook holding up a Vega Nano card outside the show.  It draws its design from the previous Vega products that we have seen with the shroud and the red cube in the top right corner.  No specifications were included with this post, but we can see that the card is significantly shorter than the RX Vega FE that Ryan had reviewed.


TDPs should be in the sub-200 watt range for such a design.  The original Nano was a 150 watt TDP part that performed quite well at the time.  Pricing is again not included, but we will be able to guess once the rest of the Vega lineup is announced later.

Source: Twitter

AMD FireRender Technology Now ProRender, Part of GPUOpen

Subject: General Tech, Graphics Cards | July 25, 2016 - 09:48 PM |
Tagged: siggraph 2016, Siggraph, capsaicin, amd, 3D rendering

At their Capsaicin Siggraph event tonight, AMD announced that the rendering engine previously previewed as FireRender is officially launching as AMD Radeon ProRender, and that it is becoming open source as part of AMD's GPUOpen initiative.


From AMD's press release:

AMD today announced its powerful physically-based rendering engine is becoming open source, giving developers access to the source code.

As part of GPUOpen, Radeon ProRender (formerly previewed as AMD FireRender) enables creators to bring ideas to life through high-performance applications and workflows enhanced by photorealistic rendering.

GPUOpen is an AMD initiative designed to assist developers in creating ground-breaking games, professional graphics applications and GPU computing applications with much greater performance and lifelike experiences, at no cost and using open development tools and software.

Unlike other renderers, Radeon ProRender can simultaneously use and balance the compute capabilities of multiple GPUs and CPUs – on the same system, at the same time – and deliver state-of-the-art GPU acceleration to produce rapid, accurate results.

Radeon ProRender plugins are available today for many popular 3D content creation applications, including Autodesk® 3ds Max®, SOLIDWORKS by Dassault Systèmes and Rhino®, with Autodesk® Maya® coming soon. Radeon ProRender works across Windows®, OS X and Linux®, and supports AMD GPUs, CPUs and APUs as well as those of other vendors.

Source: AMD

AMD Announces Radeon Pro WX Series Graphics Cards

Subject: Graphics Cards | July 25, 2016 - 09:30 PM |
Tagged: siggraph 2016, Siggraph, Radeon Pro WX Series, Radeon Pro WX 7100, Radeon Pro WX 5100, Radeon Pro WX 4100, radeon, capsaicin, amd

AMD has announced new Polaris-based professional graphics cards at Siggraph 2016 this evening, with the Radeon Pro WX 4100, WX 5100, and WX 7100 GPUs.


The AMD Radeon Pro WX 7100 GPU (Image credit: AMD)

From AMD's official press release:

AMD today unveils powerful new solutions to address modern content creation and engineering: the new Radeon Pro WX Series of professional graphics cards, which harness the award-winning Polaris architecture and is designed to deliver exceptional capabilities for the immersive computing era.

Radeon Pro solutions and the new Radeon Pro WX Series of professional graphics cards represent a fundamentally different approach for professionals rooted in a commitment to open, non-proprietary software and performant, feature-rich hardware that empowers people to create the “art of the impossible”.

The new Radeon Pro WX series graphics cards deliver on the promise of this new era of creation, are optimized for open source software, and are designed for creative professionals and those pushing the boundaries of science, technology and engineering.


The AMD Radeon Pro WX 5100 GPU (Image credit: AMD)

Radeon Pro WX Series professional graphics cards are designed to address specific demands of the modern content creation era:

  • Radeon Pro WX 7100 GPU is capable of handling demanding design engineering and media and entertainment workflows and is AMD’s most affordable workstation solution for professional VR content creation.
  • Radeon Pro WX 5100 GPU is the ideal solution for product development, powered by the impending game-engine revolution in design visualization.
  • Radeon Pro WX 4100 GPU provides great performance in a half-height design, finally bringing mid-range application performance demanded by CAD professionals to small form factor (SFF) workstations


The AMD Radeon Pro WX 4100 GPU (Image credit: AMD)

A breakdown of the known specifications for these new GPUs was provided by AnandTech in their report on the WX Series:


Chart credit: AnandTech

Source: AMD

SIGGRAPH 2016 -- NVIDIA Announces Pascal Quadro GPUs: Quadro P5000 and Quadro P6000

Subject: Graphics Cards | July 25, 2016 - 04:48 PM |
Tagged: siggraph 2016, Siggraph, quadro, nvidia

SIGGRAPH is the big, professional graphics event of the year, bringing together tens of thousands of attendees. They include engineers from Adobe, AMD, Blender, Disney (including ILM, Pixar, etc.), NVIDIA, The Khronos Group, and many, many others. Not only are new products announced, but many technologies are explained in detail, down to the specific algorithms that are used, so colleagues can advance their own research and share in kind.

But new products will indeed be announced.


The NVIDIA Quadro P6000

NVIDIA, having just launched a few Pascal GPUs to other markets, decided to announce updates to their Quadro line at the event. Two cards have been added, the Quadro P5000 and the Quadro P6000, both at the top end of the product stack. Interestingly, both use GDDR5X memory, meaning that neither will be based on the GP100 design, which is built around HBM2 memory.


The NVIDIA Quadro P5000

The lower end one, the Quadro P5000, should look somewhat familiar to our readers. Exact clocks are not specified, but the chip has 2560 CUDA cores. This is identical to the GTX 1080, but with twice the memory: 16GB of GDDR5X.

Above it sits the Quadro P6000. This chip has 3840 CUDA cores, paired with 24GB of GDDR5X. We have not seen a GPU with exactly these specifications before. It has the same number of FP32 shaders as a fully unlocked GP100 die, but it doesn't have HBM2 memory. On the other hand, the new Titan X uses GP102, combining 3584 CUDA cores with GDDR5X memory, although only 12GB of it. This means that the Quadro P6000 has 256 more (single-precision) shader units than the Titan X, but otherwise very similar specifications.

Both graphics cards have four DisplayPort 1.4 connectors, as well as a single DVI output. These five connectors can be used to drive up to four, 4K, 120Hz monitors, or four, 5K, 60Hz ones. It would be nice if all five connections could be used at once, but what can you do.


Pascal has other benefits for professional users, too. For instance, Simultaneous Multi-Projection (SMP) is used in VR applications to essentially double the GPU's geometry processing ability. NVIDIA will be pushing professional VR at SIGGRAPH this year, also launching Iray VR. This uses light fields, rendered on devices like the DGX-1, with its eight GP100 chips connected by NVLink, to provide accurately lit environments. This is particularly useful for architectural visualization.

No price is given for either of these cards, but they will launch in October of this year.

Source: NVIDIA

Qualcomm Introduces Adreno 5xx Architecture for Snapdragon 820

Subject: Graphics Cards, Processors, Mobile | August 12, 2015 - 07:30 AM |
Tagged: snapdragon 820, snapdragon, siggraph 2015, Siggraph, qualcomm, adreno 530, adreno

Despite the success of the Snapdragon 805 and even the 808, Qualcomm’s flagship Snapdragon 810 SoC had a tumultuous lifespan. Rumors and stories about the chip's inability to run in phone form factors without overheating and/or draining battery life were rampant, despite the company’s insistence that the problem was fixed with a very quick second revision of the part. Very few devices used the 810; instead, we saw more of the flagship smartphones use the slightly cut-back SD 808 or the SD 805.

Today at Siggraph Qualcomm starts the reveal of a new flagship SoC, Snapdragon 820. As the event coinciding with launch is a graphics-specific show, QC is focusing on a high level overview of the graphics portion of the Snapdragon 820, the updated Adreno 5xx architecture and associated designs and a new camera image signal processor (ISP) aiming to improve quality of photos and recording on our mobile devices.


A modern SoC from Qualcomm features many different processors working in tandem to impact the user experience on the device. While the only details we are getting today focus around the Adreno 530 GPU and Spectra ISP, other segments like connectivity (wireless), DSP, video processing and digital signal processing are important parts of the computing story. And we are well aware that Qualcomm is readying its own 64-bit processor architecture for the Kryo CPU rather than implementing the off-the-shelf cores from ARM used in the 810.

We also know that Qualcomm is targeting a “leading edge” FinFET process technology for SD 820 and though we haven’t been able to confirm anything, it looks very likely that this chip will be built on the Samsung 14nm line that also built the Exynos 7420.

But over half of the processing on the upcoming Snapdragon 820 will focus on visual processing. From graphics to gaming to UI animations to image capture and video output, this chip’s die will be dominated by high performance visuals.

Qualcomm’s list of target goals for SD 820 visuals reads as you would expect: wanting perfection in every area. Wouldn’t we all love a phone or tablet that takes perfect photos each time, always focusing on the right things (or everything) with exceptional low light performance? Though a lesser known problem for consumers, having accurate color reproduction from capture, through processing and to the display would be a big advantage. And of course, we all want graphics performance that impresses and a user interface that is smooth and reliable while enabling NEW experiences that we haven’t even thought of in the mobile form factor. Qualcomm thinks that Snapdragon 820 will be able to deliver on all of that.

Continue reading about the new Adreno 5xx architecture!!

Source: Qualcomm

Khronos Group at SIGGRAPH 2015

Subject: Graphics Cards, Processors, Mobile, Shows and Expos | August 10, 2015 - 09:01 AM |
Tagged: vulkan, spir, siggraph 2015, Siggraph, opengl sc, OpenGL ES, opengl, opencl, Khronos

When the Khronos Group announced Vulkan at GDC, they mentioned that the API is coming this year, and that this date is intended to under promise and over deliver. Recently, fans were hoping that it would be published at SIGGRAPH, which officially began yesterday. Unfortunately, Vulkan has not been released. It does hold a significant chunk of the news, however. Also, it's not like DirectX 12 is holding a commanding lead at the moment. The headers were public only for a few months, and the code samples are less than two weeks old.


The organization made announcements for six products today: OpenGL, OpenGL ES, OpenGL SC, OpenCL, SPIR, and, as mentioned, Vulkan. They wanted to make their commitment to all of their standards clear. Vulkan is urgent, but some developers will still want the framework of OpenGL. Bind what you need to the context, then issue a draw and, if you do it wrong, the driver will often clean up the mess for you anyway. The briefing was structured to make it evident that OpenGL is still on their minds, which is likely why they made sure three OpenGL logos greeted me in their slide deck as early as possible. They are also taking and closely examining feedback about who wants to use Vulkan or OpenGL, and why.

As for Vulkan, confirmed platforms have been announced. Vendors have committed to drivers on Windows 7, 8, 10, Linux, including Steam OS, and Tizen (OSX and iOS are absent, though). Beyond all of that, Google will accept Vulkan on Android. This is a big deal, as Google, despite its open nature, has been avoiding several Khronos Group standards. For instance, Nexus phones and tablets do not have OpenCL drivers, although Google isn't stopping third parties from rolling it into their devices, like Samsung and NVIDIA. Direct support of Vulkan should help cross-platform development as well as, and more importantly, target the multi-core, relatively slow threaded processors of those devices. This could even be of significant use for web browsers, especially in sites with a lot of simple 2D effects. Google is also contributing support from their drawElements Quality Program (dEQP), which is a conformance test suite that they bought back in 2014. They are going to expand it to Vulkan, so that developers will have more consistency between devices -- a big win for Android.


While we're not done with Vulkan, one of the biggest announcements is OpenGL ES 3.2 and it fits here nicely. At around the time that OpenGL ES 3.1 brought Compute Shaders to the embedded platform, Google launched the Android Extension Pack (AEP). This absorbed OpenGL ES 3.1 and added Tessellation, Geometry Shaders, and ASTC texture compression to it. It also created more tension between Google and cross-platform developers, who felt like Google was trying to pull its developers away from Khronos Group. Today, OpenGL ES 3.2 was announced and includes each of the AEP features, plus a few more (like “enhanced” blending). Better yet, Google will support it directly.

Next up are the desktop standards, before we finish with a resurrected embedded standard.

OpenGL has a few new extensions added. One interesting one is the ability to assign locations to multi-samples within a pixel. There is a whole list of sub-pixel layouts, such as rotated grid and Poisson disc. Apparently this extension allows developers to choose the layout, as certain algorithms work better or worse for certain geometries and structures. There were probably vendor-specific extensions for a while, but now it's a ratified one. Another extension allows “streamlined sparse textures”, which helps manage data where the number of unpopulated entries outweighs the number of populated ones.
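To see why the choice of sub-pixel layout can matter, here is a minimal Python sketch comparing two 4-sample layouts. The sample positions are illustrative assumptions chosen for this example (an ordered 2x2 grid and a commonly cited rotated-grid arrangement), not values taken from the extension itself, and the function names are hypothetical.

```python
# Illustrative 4-sample layouts inside one pixel, in [0, 1) pixel coordinates.
# These positions are assumptions for the example, not values from the extension.
ORDERED_GRID = [(0.25, 0.25), (0.75, 0.25), (0.25, 0.75), (0.75, 0.75)]
ROTATED_GRID = [(0.375, 0.125), (0.875, 0.375), (0.125, 0.625), (0.625, 0.875)]


def distinct_offsets(samples):
    """Count the distinct x positions and distinct y positions in a layout."""
    xs = {x for x, _ in samples}
    ys = {y for _, y in samples}
    return len(xs), len(ys)


def coverage(samples, edge_x):
    """Fraction of samples lying to the left of a vertical edge at x = edge_x."""
    inside = sum(1 for x, _ in samples if x < edge_x)
    return inside / len(samples)


def coverage_levels(samples, steps=16):
    """Sweep a vertical edge across the pixel and collect the coverage values seen."""
    return sorted({coverage(samples, step / steps) for step in range(steps + 1)})


if __name__ == "__main__":
    for name, layout in (("ordered", ORDERED_GRID), ("rotated", ROTATED_GRID)):
        print(name, distinct_offsets(layout), coverage_levels(layout))
```

In this toy setup, a near-vertical edge sweeping across the pixel produces more intermediate coverage steps with the rotated layout than with the ordered one, which is one reason a layout might suit some geometry better than another.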

OpenCL 2.0 was given a refresh, too. It contains a few bug fixes and clarifications that will help it be adopted. C++ headers were also released, although I cannot comment much on them. I do not know the state that OpenCL 2.0 was in before now.

And this is when we make our way back to Vulkan.


SPIR-V, the code that runs on the GPU (or other offloading device, including the other cores of a CPU) in OpenCL and Vulkan, is seeing a lot of community support. Projects are under way to allow developers to write GPU code in several interesting languages: Python, .NET (C#), Rust, Haskell, and many more. The slide lists nine that Khronos Group knows about, but those four are pretty interesting. Again, this is saying that you can write code in the aforementioned languages and have it run directly on a GPU. Curiously missing is HLSL, and the President of Khronos Group agreed that it would be a useful language. The ability to cross-compile HLSL into SPIR-V means that shader code written for DirectX 9, 10, 11, and 12 could be compiled for Vulkan. He expects that it won't take long for a project to start, and might already be happening somewhere outside his Google abilities. Regardless, those who are afraid to program in the C-like GLSL and HLSL shading languages might find C# and Python to be a bit more their speed, and they seem to be happening through SPIR-V.

As mentioned, we'll end on something completely different.


For several years, the OpenGL SC has been on hiatus. This group defines standards for graphics (and soon GPU compute) in “safety critical” applications. For the longest time, this meant aircraft. The dozens of planes (which I assume meant dozens of models of planes) that adopted this technology were fine with a fixed-function pipeline. It has been about ten years since OpenGL SC 1.0 launched, which was based on OpenGL ES 1.0. SC 2.0 is planned to launch in 2016, which will be based on the much more modern OpenGL ES 2 and ES 3 APIs that allow pixel and vertex shaders. The Khronos Group is asking for participation to direct SC 2.0, as well as a future graphics and compute API that is potentially based on Vulkan.

The devices that this platform intends to target are: aircraft (again), automobiles, drones, and robots. There are a lot of ways that GPUs can help these devices, but they need a good API to certify against. It needs to withstand more than an Ouya, because crashes could be much more literal.

Khronos Announces "Next" OpenGL & Releases OpenGL 4.5

Subject: General Tech, Graphics Cards, Shows and Expos | August 15, 2014 - 08:33 PM |
Tagged: siggraph 2014, Siggraph, OpenGL Next, opengl 4.5, opengl, nvidia, Mantle, Khronos, Intel, DirectX 12, amd

Let's be clear: there are two stories here. The first is the release of OpenGL 4.5 and the second is the announcement of the "Next Generation OpenGL Initiative". They both appear in the same press release, but they are two different statements.

OpenGL 4.5 Released

OpenGL 4.5 expands the core specification with a few extensions. Compatible hardware, with OpenGL 4.5 drivers, will be guaranteed to support these. This includes features like direct_state_access, which allows accessing objects in a context without binding to it, and support of OpenGL ES3.1 features that are traditionally missing from OpenGL 4, which allows easier porting of OpenGL ES3.1 applications to OpenGL.


It also adds a few new extensions as an option:

ARB_pipeline_statistics_query lets a developer ask the GPU what it has been doing. This could be useful for "profiling" an application (list completed work to identify optimization points).

ARB_sparse_buffer allows developers to perform calculations on pieces of generic buffers, without loading it all into memory. This is similar to ARB_sparse_textures... except that those are for textures. Buffers are useful for things like vertex data (and so forth).

ARB_transform_feedback_overflow_query is apparently designed to let developers choose whether or not to draw objects based on whether the buffer is overflowed. I might be wrong, but it seems like this would be useful for deciding whether or not to draw objects generated by geometry shaders.

KHR_blend_equation_advanced allows new blending equations between objects. If you use Photoshop, this would be "multiply", "screen", "darken", "lighten", "difference", and so forth. On NVIDIA's side, this will be directly supported on Maxwell and Tegra K1 (and later). Fermi and Kepler will support the functionality, but the driver will perform the calculations with shaders. AMD has yet to comment, as far as I can tell.
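For readers who have not used these Photoshop-style modes, the following Python sketch shows the textbook per-channel formulas for the five modes named above. It is a simplified illustration on normalized color values in the range 0 to 1 and deliberately ignores alpha and coverage, which the actual extension also has to handle; the function and dictionary names are hypothetical.

```python
def multiply(src, dst):
    """Darkens: the product of the two channel values."""
    return src * dst


def screen(src, dst):
    """Lightens: the inverse of multiplying the inverted values."""
    return src + dst - src * dst


def darken(src, dst):
    """Keeps the darker of the two channel values."""
    return min(src, dst)


def lighten(src, dst):
    """Keeps the lighter of the two channel values."""
    return max(src, dst)


def difference(src, dst):
    """Absolute difference between the two channel values."""
    return abs(src - dst)


# Hypothetical lookup table for this example only.
BLEND_MODES = {
    "multiply": multiply,
    "screen": screen,
    "darken": darken,
    "lighten": lighten,
    "difference": difference,
}


def blend(mode, src_rgb, dst_rgb):
    """Apply one blend mode channel by channel to two RGB tuples in [0, 1]."""
    op = BLEND_MODES[mode]
    return tuple(op(s, d) for s, d in zip(src_rgb, dst_rgb))


if __name__ == "__main__":
    source = (0.5, 1.0, 0.0)
    destination = (0.5, 0.5, 0.5)
    for mode in BLEND_MODES:
        print(mode, blend(mode, source, destination))
```

With fixed-function blending, equations like these are not available, which is why a driver without hardware support would fall back to computing them in shaders.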


Image from NVIDIA GTC Presentation

If you are a developer, NVIDIA has launched 340.65 (340.23.01 for Linux) beta drivers for developers. If you are not looking to create OpenGL 4.5 applications, do not get this driver. You really should not have any use for it, at all.

Next Generation OpenGL Initiative Announced

The Khronos Group has also announced "a call for participation" to outline a new specification for graphics and compute. They want it to allow developers explicit control over CPU and GPU tasks, be multithreaded, have minimal overhead, have a common shader language, and "rigorous conformance testing". This sounds a lot like the design goals of Mantle (and what we know of DirectX 12).


And really, from what I hear and understand, that is what OpenGL needs at this point. Graphics cards look nothing like they did a decade ago (or over two decades ago). They each have very similar interfaces and data structures, even if their fundamental architectures vary greatly. If we can draw a line in the sand, legacy APIs can be supported but not optimized heavily by the drivers. After a short time, available performance for legacy applications would be so high that it wouldn't matter, as long as they continue to run.

Add to it, next-generation drivers should be significantly easier to develop, considering the reduced error checking (and other responsibilities). As I said on Intel's DirectX 12 story, it is still unclear whether it will lead to enough performance increase to make most optimizations, such as those which increase workload or developer effort in exchange for queuing fewer GPU commands, unnecessary. We will need to wait for game developers to use it for a bit before we know.

Richard Huddy Discusses FreeSync Availability Timeframes

Subject: General Tech, Displays | August 14, 2014 - 04:59 PM |
Tagged: amd, freesync, g-sync, Siggraph, siggraph 2014

At SIGGRAPH, Richard Huddy of AMD announced the release windows of FreeSync, their adaptive refresh rate technology, to The Tech Report. Compatible monitors will begin sampling "as early as" September. Actual products are expected to ship to consumers in early 2015. Apparently, more than one display vendor is working on support, although names and vendor-specific release windows are unannounced.


As for cost of implementation, Richard Huddy believes that the added cost should be no more than $10-20 USD (to the manufacturer). Of course, the final price to end-users cannot be derived from this - that depends on how quickly the display vendor expects to sell product, profit margins, their willingness to push new technology, competition, and so forth.

If you want to take full advantage of FreeSync, you will need a compatible GPU (look for "gaming" support in AMD's official FreeSync compatibility list). All future AMD GPUs are expected to support the technology.

Source: Tech Report

Intel and Microsoft Show DirectX 12 Demo and Benchmark

Subject: General Tech, Graphics Cards, Processors, Mobile, Shows and Expos | August 13, 2014 - 09:55 PM |
Tagged: siggraph 2014, Siggraph, microsoft, Intel, DirectX 12, directx 11, DirectX

Along with GDC Europe and Gamescom, Siggraph 2014 is going on in Vancouver, BC. At it, Intel had a DirectX 12 demo at their booth. This scene, containing 50,000 asteroids, each in its own draw call, was developed on both Direct3D 11 and Direct3D 12 code paths and could apparently be switched while the demo is running. Intel claims to have measured both power as well as frame rate.


Variable power to hit a desired frame rate, DX11 and DX12.

The test system is a Surface Pro 3 with an Intel HD 4400 GPU. Doing a bit of digging, this would make it the i5-based Surface Pro 3. Removing another shovel-load of mystery, this would be the Intel Core i5-4300U with two cores, four threads, 1.9 GHz base clock, up to 2.9 GHz turbo clock, 3MB of cache, and (of course) based on the Haswell architecture.

While not top-of-the-line, it is also not bottom-of-the-barrel. It is a respectable CPU.

Intel's demo on this processor shows a significant power reduction in the CPU, and even a slight decrease in GPU power, for the same target frame rate. If power was not throttled, Intel's demo goes from 19 FPS all the way up to a playable 33 FPS.
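To put those two frame rates in perspective, here is a small Python sketch that converts them to frame times and a relative gain. Only the 19 FPS and 33 FPS figures come from Intel's demo; the helper names are made up for this example.

```python
def frame_time_ms(fps):
    """Time spent on one frame, in milliseconds, at a given frame rate."""
    return 1000.0 / fps


def relative_gain(before_fps, after_fps):
    """Fractional increase in frame rate going from before_fps to after_fps."""
    return after_fps / before_fps - 1.0


if __name__ == "__main__":
    dx11_fps, dx12_fps = 19, 33  # figures reported for Intel's demo
    print(f"DX11 frame time: {frame_time_ms(dx11_fps):.1f} ms")
    print(f"DX12 frame time: {frame_time_ms(dx12_fps):.1f} ms")
    print(f"Frame rate gain: {relative_gain(dx11_fps, dx12_fps):.0%}")
```

In other words, each frame goes from roughly 53 ms to roughly 30 ms, which is about a 74% higher frame rate.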

Intel will discuss more during a video interview, tomorrow (Thursday) at 5pm EDT.


Maximum power in DirectX 11 mode.

For my contribution to the story, I would like to address the first comment on the MSDN article. It claims that this is just an "ideal scenario" of a scene that is bottlenecked by draw calls. The thing is: that is the point. Sure, a game developer could optimize the scene to (maybe) instance objects together, and so forth, but that is unnecessary work. Why should programmers, or worse, artists, need to spend so much of their time developing art so that it could be batched together into fewer, bigger commands? Would it not be much easier, and all-around better, if the content could be developed as it most naturally comes together?

That, of course, depends on how much performance improvement we will see from DirectX 12, compared to theoretical max efficiency. If pushing two workloads through a DX12 GPU takes about the same time as pushing one, double-sized workload, then it allows developers to, literally, perform whatever solution is most direct.


Maximum power when switching to DirectX 12 mode.

If, on the other hand, pushing two workloads is 1000x slower than pushing a single, double-sized one, but DirectX 11 was 10,000x slower, then it could be less relevant because developers will still need to do their tricks in those situations. The closer it gets, the fewer the occasions on which strict optimization is necessary.

If there are any DirectX 11 game developers, artists, and producers out there, we would like to hear from you. How much would a (let's say) 90% reduction in draw call latency (which is around what Mantle claims) give you, in terms of fewer required optimizations? Can you afford to solve problems "the naive way" now? Some of the time? Most of the time? Would it still be worth it to do things like object instancing and fewer, larger materials and shaders? How often?
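One rough way to frame that question is a toy cost model in which CPU submission time is simply the number of draw calls multiplied by a fixed per-call cost. In the Python sketch below, the 50,000 draw calls come from Intel's asteroid demo and the 90% reduction from the question above; the batched call count and the per-call cost in microseconds are hypothetical numbers picked purely for illustration, and real drivers will not behave this linearly.

```python
def cpu_submit_ms(draw_calls, per_call_us):
    """CPU time to submit a frame, assuming a fixed cost per draw call."""
    return draw_calls * per_call_us / 1000.0


def batching_benefit_ms(naive_calls, batched_calls, per_call_us):
    """CPU time per frame saved by batching, under the same fixed-cost model."""
    return (cpu_submit_ms(naive_calls, per_call_us)
            - cpu_submit_ms(batched_calls, per_call_us))


NAIVE_CALLS = 50_000             # one draw call per asteroid, as in Intel's demo
BATCHED_CALLS = 500              # hypothetical count after instancing and batching
OLD_COST_US = 1.0                # hypothetical per-call CPU cost, in microseconds
NEW_COST_US = OLD_COST_US * 0.1  # the 90% reduction posed in the question

if __name__ == "__main__":
    for label, cost in (("old API", OLD_COST_US), ("new API", NEW_COST_US)):
        naive = cpu_submit_ms(NAIVE_CALLS, cost)
        saved = batching_benefit_ms(NAIVE_CALLS, BATCHED_CALLS, cost)
        print(f"{label}: naive {naive:.2f} ms, batching saves {saved:.2f} ms")
```

Under these assumed numbers, the time that batching buys back shrinks by the same factor as the per-call cost, so the payoff from hand-optimizing a scene drops accordingly. Whether the remaining saving is still worth the developer effort is exactly the open question.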

Unreal Engine 4 on Mobile Kepler at SIGGRAPH

Subject: General Tech, Graphics Cards, Mobile, Shows and Expos | July 24, 2013 - 05:15 PM |
Tagged: Siggraph, kepler, mobile, tegra, nvidia, unreal engine 4

SIGGRAPH 2013 is wrapping up in the next couple of days but, now that NVIDIA removed the veil surrounding Mobile Kepler, people are chatting about what is to follow Tegra 4. Tim Sweeney, founder of Epic Games, wrote on NVIDIA Blogs about the ways that certain attendees can experience Unreal Engine 4 at the show. As it turns out, NVIDIA engineers have displayed the engine both on Mobile Kepler as well as behind closed doors on desktop PCs.

Not from SIGGRAPH, this is a leak from, I believe, GTC late last March.

Also, this is Battlefield 3, not Unreal Engine 4.

Tim, obviously taking the developer standpoint, is very excited about OpenGL 4.3 support within the mobile GPU. In all, he did not say too much of note. They are targeting Unreal Engine 4 at a broad range of platforms: mobile, desktop, console, and, while absent from this editorial, web standards. Each of these platforms is settling on the same set of features, albeit with huge gaps in performance, allowing developers to focus on a scale of performance instead of a flowchart of capabilities.

Unfortunately for us, there have yet to be leaks from the trade show. We will keep you up-to-date if we find any, however.

Source: NVIDIA Blogs