Intro and NNEF 1.0 Finalization
SIGGRAPH 2018 is a huge computer graphics expo that occurs in a seemingly random host city around North America. (Asia has a sister event, called SIGGRAPH Asia, which likewise shuffles around.) In the last twenty years, the North American SIGGRAPH seems to like Los Angeles, which hosted the event nine times over that period, but Vancouver won out this year. As you would expect, the maintainers of OpenGL and Vulkan are there, and they have a lot to talk about.
- NNEF 1.0 has been finalized and released!
- The first public demo of OpenXR is available and on the show floor.
- glTF Texture Transmission Extension is being discussed.
- OpenCL Ecosystem Roadmap is being discussed.
- Khronos Educators Program has launched.
I will go through each of these points. Feel free to skip around between the sections that interest you!
Don't Call It SPIR of the Moment
Vulkan 1.0 released a little over two years ago. The announcement, with conformant drivers, conformance tests, tools, and a patch for The Talos Principle, made for a successful launch for the Khronos Group. Of course, games weren’t magically three times faster or anything like that, but it got the API out there; it also redrew the line between game and graphics driver.
The Khronos Group repeats this “hard launch” with Vulkan 1.1.
First, the specifications for both Vulkan 1.1 and SPIR-V 1.3 have been published. We will get into the details of those two standards later. Second, a suite of conformance tests has also been included with this release, which helps prevent an implementation bug from becoming an implied API that software relies upon ad infinitum. Third, several developer tools have been released, mostly by LunarG, into the open-source ecosystem.
Fourth – conformant drivers. The following companies have Vulkan 1.1-certified drivers:
There are two new additions to the API:
The first is Protected Content. This allows developers to restrict access to rendering resources (DRM). Moving on!
The second is Subgroup Operations. We mentioned that they were added to SPIR-V back in 2016 when Microsoft announced HLSL Shader Model 6.0, and some of the instructions were available as OpenGL extensions. They are now a part of the core Vulkan 1.1 specification. This allows the individual threads of a GPU in a warp or wavefront to work together on specific instructions.
Shader compilers can use these intrinsics to speed up operations such as:
- Finding the min/max of a series of numbers
- Shuffling and/or copying values between lanes of a group
- Adding several numbers together
- Multiplying several numbers together
- Evaluating whether any, all, or which lanes of a group evaluate true
In other words, shader compilers can do more optimizations, which boosts the speed of several algorithms and should translate to higher performance when shader-limited. It also means that DirectX titles using Shader Model 6.0 should be able to compile into their Vulkan equivalents when using the latter API.
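To make the list above a little more concrete, here is a minimal CPU-side sketch in Python of what those operations compute. It is only a model: a "subgroup" is a plain list with one value per lane, and the function names (`subgroup_reduce`, `subgroup_shuffle`, `subgroup_ballot`) are hypothetical stand-ins of my own, not calls from Vulkan, SPIR-V, or any shading language. On real hardware, the lanes exchange these values directly rather than looping.

```python
from functools import reduce
from operator import add, mul

# One "subgroup": each list element is the value held by one lane (thread).
# Every function name below is a hypothetical stand-in, not a real GPU API.

def subgroup_reduce(lanes, op):
    """Combine the values of all lanes with `op`; every lane sees the result."""
    return reduce(op, lanes)

def subgroup_shuffle(lanes, source_ids):
    """Lane i reads the value currently held by lane source_ids[i]."""
    return [lanes[src] for src in source_ids]

def subgroup_ballot(predicates):
    """Return a bitmask with bit i set when lane i's predicate is true."""
    mask = 0
    for lane_id, flag in enumerate(predicates):
        if flag:
            mask |= 1 << lane_id
    return mask

lanes = [3, 1, 4, 1, 5, 9, 2, 6]

total = subgroup_reduce(lanes, add)      # adding several numbers together
product = subgroup_reduce(lanes, mul)    # multiplying several numbers together
smallest = subgroup_reduce(lanes, min)   # min of the series
largest = subgroup_reduce(lanes, max)    # max of the series

# Shuffle: every lane fetches the value of its mirror-image lane.
mirrored = subgroup_shuffle(lanes, list(range(len(lanes) - 1, -1, -1)))

# Any / all / which lanes satisfy a condition.
is_big = [value > 3 for value in lanes]
any_big = any(is_big)
all_big = all(is_big)
which_big = subgroup_ballot(is_big)
```

The point of the hardware versions is that none of this needs a round trip through memory, which is where the speed-up comes from.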
This leads us to SPIR-V 1.3. (We’ll circle back to Vulkan later.) SPIR-V is the intermediate shader language that Vulkan relies upon. Its predecessor, SPIR, was based on a subset of LLVM IR, but SPIR-V is a standalone binary format. SPIR-V is the code that the driver compiles to run on the GPU hardware – Vulkan just deals with how to get this code onto the silicon as efficiently as possible. In a video game, this would be whatever code the developer chose to represent lighting, animation, particle physics, and almost anything else done on the GPU.
The Khronos Group is promoting the fact that code for the SPIR-V ecosystem can be written in GLSL, OpenCL C, or even HLSL. In other words, the developer will not need to rewrite their DirectX shaders to operate on Vulkan. This isn’t particularly new – Unity has done this sort-of HLSL to SPIR-V conversion ever since they added Vulkan – but it’s good to mention that it’s a promoted workflow. OpenCL C will also be useful for developers who want to move existing OpenCL code into Vulkan on platforms where the latter is available but the former rarely is, such as Android.
Speaking of which, that’s exactly what Google, Codeplay, and Adobe are doing. Adobe wrote a lot of OpenCL C code for their Creative Cloud applications, and they want to move it elsewhere. This ended up being a case study for an OpenCL to Vulkan run-time API translation layer and the Clspv OpenCL C to SPIR-V compiler. The latter is open source, and the former might become open source in the future.
Now back to Vulkan.
The other major change with this new version is the absorption of several extensions into the core 1.1 specification.
The first is Multiview, which allows multiple projections to be rendered at the same time, as seen in the GTX 1080 launch. This can be used for rendering VR, stereoscopic 3D, cube maps, and curved displays without extra draw calls.
The second is device groups, which allows multiple GPUs to work together.
The third allows data to be shared between APIs and even whole applications. The Khronos Group specifically mentions that Steam VR SDK uses this.
The fourth is 16-bit data types. While most GPUs operate on 32-bit values, it might be beneficial to pack data into 16-bit values in memory for algorithms that are limited by bandwidth. It also helps Vulkan be used in non-graphics workloads.
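As a rough illustration of the bandwidth argument, here is a small Python sketch that packs the same four numbers as 32-bit and as 16-bit floats. It only uses the standard `struct` module (the `e` format code is IEEE half precision); the sample values are made up, and nothing here touches Vulkan itself.

```python
import struct

# Hypothetical sample data; each value is exactly representable in 16 bits.
values = [0.5, 1.5, -2.0, 1024.0]

packed32 = struct.pack("<4f", *values)   # four 32-bit floats
packed16 = struct.pack("<4e", *values)   # four 16-bit (half) floats

# Reading the 16-bit data back gives the same numbers for these inputs.
roundtrip = list(struct.unpack("<4e", packed16))

# The trade-off: values such as 0.1 lose precision at 16 bits.
approx_tenth = struct.unpack("<e", struct.pack("<e", 0.1))[0]

print(len(packed32), "bytes at 32-bit vs", len(packed16), "bytes at 16-bit")
```

Half the bytes means half the memory traffic for the same number of elements, at the cost of range and precision, so it is a per-algorithm judgment call.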
We already discussed HLSL support, but that’s an extension that’s now core.
The sixth extension is YCbCr support, which is required by several video codecs.
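For readers who haven’t dealt with YCbCr before, here is a minimal sketch of the kind of conversion involved. It assumes one specific variant, full-range BT.601 with 8-bit channels (the one used by JPEG); video codecs also use other matrices and ranges, and this is not a description of how Vulkan implements it. The function names are my own.

```python
def clamp8(x):
    """Round to the nearest integer and clamp into the 0..255 range."""
    return max(0, min(255, int(round(x))))

def ycbcr_to_rgb(y, cb, cr):
    """Convert one full-range BT.601 YCbCr pixel (8-bit) to 8-bit RGB.

    Assumption: full-range BT.601 coefficients, chroma centered on 128.
    """
    r = y + 1.402 * (cr - 128)
    g = y - 0.344136 * (cb - 128) - 0.714136 * (cr - 128)
    b = y + 1.772 * (cb - 128)
    return clamp8(r), clamp8(g), clamp8(b)

# Neutral chroma (Cb = Cr = 128) leaves only brightness, so the result is gray.
mid_gray = ycbcr_to_rgb(128, 128, 128)
```

Doing this in the sampler means that a decoded video frame can be read as a texture without a separate conversion pass.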
The last thing that I would like to mention is the Public Vulkan Ecosystem Forum. The Khronos Group has regularly mentioned that they want to get the open-source community more involved in reporting issues and collaborating on solutions. In this case, they are working on a forum where both members and non-members will collaborate, as well as the usual GitHub issues tab and so forth.
You can check out the details here.
Subject: General Tech | December 10, 2017 - 06:51 PM | Scott Michaud
Tagged: Khronos, SYCL, sycl 1.2, sycl 1.2.1, opencl 1.2, opencl
The specification for SYCL 1.2.1, which is based on OpenCL 1.2, has been finalized and released on the Khronos website. They describe it as a major update over the previous standard, SYCL 1.2, and it is. Since May 2015, when SYCL 1.2 was finalized, The Khronos Group added features from C++11, C++14, and C++17, including the ISO C++17 Parallel Standard Template Library (STL).
In other words, you can create C++17 Parallel STL applications with SYCL 1.2.1, single-source, that are able to offload to OpenCL 1.2 devices.
Beyond that, the specification changes also help machine learning. The Khronos Group mentions that Google’s TensorFlow supports SYCL, bringing the framework to OpenCL devices. They want to continue updating the specification in this area, along with Safety Critical applications, such as automotive. They also want to keep updating the standard with ISO C++ features. In other words? SYCL is being adopted, and they intend ongoing support to match.
Subject: General Tech | November 26, 2017 - 09:19 PM | Scott Michaud
The Khronos Group has added a couple of new partners from China: The China Academy of Information and Communication Technology (CAICT) and Tencent. The former is a research institute for China’s Ministry of Industry and Information Technology, which should significantly help adoption of open standards across several Chinese companies. The latter is a huge Chinese telecom with huge investments in software and hardware; for instance, they own about 40% of Epic Games, makers of Unreal Engine.
The goal of this is to gain a huge amount of conformant software using these APIs. China is a huge market in several ways; not only would this push the technology into several products and middleware, but it should also help contribute back from China to the international standards from The Khronos Group. If there’s something that could be done to help an implementation become conformant, then that line of communication should be open rather than just encouraging them to fork-away a semi-but-not-quite-compliant standard, which apparently was an issue with OpenCL.
You can read the official press release at their website.
Subject: General Tech | November 26, 2017 - 07:39 PM | Scott Michaud
Tagged: Khronos, openvx, deep neural network, computer vision
OpenVX is an API that enables computer vision in a range of applications, from gesture tracking to surveillance to robotics. This version includes the neural network nodes (convolution, deconvolution, etc.) as well as import and export for compiling graphs offline. This is a part of the updated Adopters Program for OpenVX, which is, as usual for the Khronos Group, cross-platform and royalty-free.
Obviously, this only affects a subset of our readers directly, but cross-platform, royalty-free APIs for advanced computing functions will eventually lead to interesting technology. At the same time, for the developers in our audience, the tools are now available to test your code and hardware. The Khronos Group expects that early implementations will ship in 2018.
You can read the full press release at their website.
Subject: Graphics Cards | August 2, 2017 - 07:01 AM | Scott Michaud
Tagged: spir-v, opengl, Khronos
While Vulkan has been getting a lot of mindshare recently, OpenGL is still in active development. This release, OpenGL 4.6, adds a bunch of extensions into the core specification, making them more reliably available to engines. There are a lot of them this time, many of which seem to borrow design elements from the work done on Vulkan.
The headlining feature is SPIR-V support as an ARB extension, which frees OpenGL programs from having their shaders written in GLSL. Many engines write their shaders in HLSL and use a transpiler to generate the corresponding GLSL, which may not support all features. The extension might also help titles target both OpenGL and Vulkan, although I’m not sure why we would see a driver that supports OpenGL 4.6 but not Vulkan.
Another extension is GL_KHR_no_error, which tells graphics drivers that they do not need to generate errors at runtime. This will save a bit of driver overhead. GL_ARB_indirect_parameters also helps with CPU overhead by allowing draws to pass parameters to other GPU-initiated draws, although this is a bit out of my domain. Also, if you’re not working in SPIR-V, GL_KHR_parallel_shader_compile will allow the driver to compile your GLSL shaders across multiple worker threads.
NVIDIA has a beta driver for developers, which is a couple of versions back compared to their consumer version, so you don’t want to install it unless you intend on developing OpenGL 4.6 applications. Mesa says that they shouldn’t be too far behind.
Subject: General Tech | July 10, 2017 - 07:24 AM | Scott Michaud
Tagged: Khronos, gltf, Blender
As we reported about a month ago, The Khronos Group has finalized glTF 2.0, which is a 3D format designed for whole scenes. Since then, Khronos have published an exporter for Blender that implements what appears to be all core features, as well as specular-gloss PBR (Extension), lights (Experimental), “materials common” (Experimental), and “materials displace” (Experimental). It is implemented as a whole bunch of Python scripts.
Apparently they provide their own PBR shader nodes for Cycles, rather than using the new Disney-based one in Blender 2.79. I’m not sure whether this was to make the export easier, or if development schedules just couldn’t align. Either way, both metallic/roughness and specular/gloss workflows have been provided, so that should make exporting either workflow relatively straightforward.
A Data Format for Whole 3D Scenes
The Khronos Group has finalized the glTF 2.0 specification, and they recommend that interested parties integrate this 3D scene format into their content pipeline starting now. It’s ready.
glTF is a format to deliver 3D content, especially full scenes, in a compact and quick-loading data structure. These features differentiate glTF from other 3D formats, like Autodesk’s FBX and even the Khronos Group’s Collada, which are more like intermediate formats between tools, such as 3D editing software (ex: Maya and Blender) and game engines. They don’t see a competing format for final scenes that are designed to be ingested directly, quick and small.
glTF 2.0 makes several important changes.
The previous version of glTF was based on a defined GLSL material, which limited how it could be used, although it did align with WebGL at the time (and that spurred some early adoption). The new version switches to Physically Based Rendering (PBR) workflows to define their materials, which has a few advantages.
First, PBR can represent a wide range of materials with just a handful of parameters. Rather than dictating a specific shader, the data structure can just... structure the data. The industry has settled on two main workflows, metallic-roughness and specular-gloss, and glTF 2.0 supports them both. (Metallic-roughness is the core workflow, but specular-gloss is provided as an extension, and they can be used together in the same scene. Also, during the briefing, I noticed that transparency was not explicitly mentioned in the slide deck, but the Khronos Group confirmed that it is stored as the alpha channel of the base color, and thus supported.) Because the format is now based on existing workflows, the implementation can be programmed in OpenGL, Vulkan, DirectX, Metal, or even something like a software renderer. In fact, Microsoft was a specification editor on glTF 2.0, and they have publicly announced using the format in their upcoming products.
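To show what “just structure the data” looks like in practice, here is a small sketch that builds two metallic-roughness materials as glTF 2.0 JSON from Python. The property names (`pbrMetallicRoughness`, `baseColorFactor`, `metallicFactor`, `roughnessFactor`, `alphaMode`) come from the glTF 2.0 specification; the material names and the numbers are made up for illustration, and a real file would also contain meshes, nodes, and so on.

```python
import json

# A fragment of a glTF 2.0 document holding only materials.
# Material names and all numeric values are hypothetical examples.
gltf = {
    "asset": {"version": "2.0"},
    "materials": [
        {
            "name": "scuffed_gold",
            "pbrMetallicRoughness": {
                "baseColorFactor": [1.0, 0.77, 0.34, 1.0],  # RGBA
                "metallicFactor": 1.0,                      # fully metallic
                "roughnessFactor": 0.4,
            },
        },
        {
            "name": "tinted_glass",
            "pbrMetallicRoughness": {
                # Transparency lives in the alpha of the base color.
                "baseColorFactor": [0.6, 0.8, 1.0, 0.35],
                "metallicFactor": 0.0,
                "roughnessFactor": 0.05,
            },
            "alphaMode": "BLEND",  # ask the renderer to honor that alpha
        },
    ],
}

text = json.dumps(gltf, indent=2)
print(text)
```

Notice that there is no shader code anywhere in there, which is why a renderer written against any graphics API can consume it.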
The original GLSL material, from glTF 1.0, is available as an extension (for backward compatibility).
A second advantage of PBR is that it is lighting-independent. When you define a PBR material for an object, it can be placed in any environment and it will behave as expected. Noticeable, albeit extreme, examples of where this would have been useful are the outdoor scenes of Doom 3, and the indoor scenes of Battlefield 2. It also simplifies asset creation. Some applications, like Substance Painter and Quixel, have artists stencil materials onto their geometry, like gold, rusted iron, and scuffed plastic, and automatically generate the appropriate textures. It also aligns well with deferred rendering, see below, which performs lighting as a post-process step and thus skips pixels (fragments) that are overwritten.
PBR Deferred Buffers in Unreal Engine 4 Sun Temple.
Lighting is applied to these completed buffers, not every fragment.
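A toy model may help with the “completed buffers, not every fragment” idea. The Python sketch below counts lighting evaluations for a tiny, made-up, three-pixel scene. It compares a deliberately naive approach that lights every fragment as it is drawn against a deferred approach that first keeps only the nearest surface per pixel. Everything here is hypothetical, and real forward renderers have their own tricks (such as depth pre-passes) that this ignores.

```python
# Each fragment is (pixel, depth, material); smaller depth is closer.
# The scene and the light names are hypothetical.
fragments = [
    (0, 0.9, "wall"), (0, 0.5, "crate"), (0, 0.2, "barrel"),
    (1, 0.9, "wall"), (1, 0.4, "crate"),
    (2, 0.9, "wall"),
]
lights = ["sun", "lamp"]

def naive_lighting_cost(fragments, lights):
    """Light every fragment as it is drawn, even ones overwritten later."""
    return len(fragments) * len(lights)

def build_gbuffer(fragments):
    """Keep only the closest surface for each pixel (a depth test)."""
    gbuffer = {}
    for pixel, depth, material in fragments:
        if pixel not in gbuffer or depth < gbuffer[pixel][0]:
            gbuffer[pixel] = (depth, material)
    return gbuffer

def deferred_lighting_cost(gbuffer, lights):
    """Light each finished pixel once per light, as a post-process."""
    return len(gbuffer) * len(lights)

gbuffer = build_gbuffer(fragments)
naive_cost = naive_lighting_cost(fragments, lights)
deferred_cost = deferred_lighting_cost(gbuffer, lights)
print(naive_cost, "lighting evaluations vs", deferred_cost)
```

The material parameters that PBR standardizes are exactly the sort of thing that gets written into those per-pixel buffers.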
glTF 2.0 also improves support for complex animations by adding morph targets. Most 3D animations, beyond just moving, rotating, and scaling whole objects, are based on skeletal animations. This method works by binding vertices to bones, and moving, rotating, and scaling a hierarchy of joints. This works well for humans, animals, hinges, and other collections of joints and sockets, and it was already supported in glTF 1.0. Morph targets, on the other hand, allow the artist to directly control individual vertices between defined states. This is often demonstrated with a facial animation, interpolating between smiles and frowns, but, in an actual game, this is often approximated with skeletal animations (for performance reasons). Regardless, glTF 2.0 now supports morph targets, too, letting the artists make the choice that best suits their content.
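Here is a minimal sketch of the math behind morph targets, written in Python. glTF 2.0 stores each target as per-vertex displacements that are scaled by a weight and added to the base mesh, which is what the function below does. The two-vertex “mesh”, the target names, and the function name are all made up for illustration.

```python
def apply_morph_targets(base, targets, weights):
    """Return base positions plus the weighted sum of target displacements.

    base:    list of (x, y, z) vertex positions
    targets: list of displacement lists, each the same length as `base`
    weights: one blend weight per target
    """
    result = []
    for index, vertex in enumerate(base):
        blended = list(vertex)
        for target, weight in zip(targets, weights):
            for axis in range(3):
                blended[axis] += weight * target[index][axis]
        result.append(tuple(blended))
    return result

# Hypothetical two-vertex "mouth" with two targets stored as displacements.
base = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
smile = [(0.0, 1.0, 0.0), (0.0, 1.0, 0.0)]
frown = [(0.0, -1.0, 0.0), (0.0, -1.0, 0.0)]

# Halfway to a smile, no frown at all.
half_smile = apply_morph_targets(base, [smile, frown], [0.5, 0.0])
```

Animating a face is then just a matter of changing the weights over time, with no skeleton involved.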
Speaking of performance, the Khronos Group is also promoting “enhanced performance” as a benefit of glTF 2.0. I asked whether they have anything to elaborate on, and they responded with a little story. While glTF 1.0 validators were being created, one of the engineers compiled a list of design choices that would lead to minor performance issues. The fixes for these were originally supposed to be embodied in a glTF 1.1 specification, but PBR workflows and Microsoft’s request to abstract the format away from GLSL led to glTF 2.0, which is where the performance optimization finally ended up. Basically, there weren’t just one or two changes that made a big impact; it was the result of many tiny changes that add up.
Also, the binary version of glTF is now a core feature in glTF 2.0.
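For the curious, the binary container (GLB) is simple enough to sketch in a few lines of Python. The layout below follows the glTF 2.0 specification as I understand it: a 12-byte header (magic, version, total length) followed by chunks, the first of which holds the JSON padded with spaces to a four-byte boundary. The function names are my own, and this sketch skips the optional binary buffer chunk that a real asset would normally carry.

```python
import json
import struct

GLB_MAGIC = 0x46546C67    # the ASCII bytes "glTF" read as little-endian
CHUNK_JSON = 0x4E4F534A   # the ASCII bytes "JSON" read as little-endian

def build_glb(gltf):
    """Wrap a glTF JSON document in a GLB container (JSON chunk only)."""
    payload = json.dumps(gltf, separators=(",", ":")).encode("utf-8")
    payload += b" " * (-len(payload) % 4)   # pad with spaces to 4 bytes
    chunk = struct.pack("<II", len(payload), CHUNK_JSON) + payload
    header = struct.pack("<III", GLB_MAGIC, 2, 12 + len(chunk))
    return header + chunk

def read_glb_json(blob):
    """Check the GLB header and return the parsed JSON chunk."""
    magic, version, length = struct.unpack_from("<III", blob, 0)
    if magic != GLB_MAGIC or version != 2 or length != len(blob):
        raise ValueError("not a glTF 2.0 binary container")
    chunk_length, chunk_type = struct.unpack_from("<II", blob, 12)
    if chunk_type != CHUNK_JSON:
        raise ValueError("first chunk must be JSON")
    return json.loads(blob[20:20 + chunk_length].decode("utf-8"))

blob = build_glb({"asset": {"version": "2.0"}})
document = read_glb_json(blob)
```

Having this in core means a loader can count on one file, rather than a JSON file plus a handful of sidecar buffers.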
The slide looks at the potential future of glTF, after 2.0.
Looking forward, the Khronos Group has a few items on their glTF roadmap. These did not make glTF 2.0, but they are current topics for future versions. One potential addition is mesh compression, via the Google Draco team, to further decrease file size of 3D geometry. Another roadmap entry is progressive geometry streaming, via Fraunhofer SRC, which should speed up runtime performance.
Yet another roadmap entry is “Unified Compression Texture Format for Transmission”, specifically Basis by Binomial, for texture compression that remains as small as possible on the GPU. Graphics processors can only natively operate on a handful of formats, like DXT and ASTC, so textures need to be converted when they are loaded by an engine. Often, when a texture is loaded at runtime (rather than imported by the editor) it will be decompressed and left in that state on the GPU. Some engines, like Unity, have a runtime compress method that converts textures to DXT, but the developer needs to explicitly call it and the documentation says it’s lower quality than the algorithm used by the editor (although I haven’t tested this). Suffice it to say, having a format that can circumvent all of that would be nice.
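Some back-of-the-envelope arithmetic shows why leaving a texture decompressed on the GPU hurts. The Python sketch below compares a 1024x1024 texture as uncompressed 8-bit RGBA against two block-compressed layouts: DXT1, which stores each 4x4 block in 8 bytes, and ASTC with an 8x8 footprint, which stores each block in 16 bytes. The texture size and the choice of ASTC footprint are just examples, and mipmaps are ignored.

```python
def uncompressed_bytes(width, height, bytes_per_pixel=4):
    """Size of a raw texture, defaulting to 8-bit RGBA."""
    return width * height * bytes_per_pixel

def block_compressed_bytes(width, height, block_bytes, block_w=4, block_h=4):
    """Size of a block-compressed texture; partial blocks round up."""
    blocks_x = -(-width // block_w)   # ceiling division
    blocks_y = -(-height // block_h)
    return blocks_x * blocks_y * block_bytes

rgba8 = uncompressed_bytes(1024, 1024)
dxt1 = block_compressed_bytes(1024, 1024, block_bytes=8)
astc_8x8 = block_compressed_bytes(1024, 1024, block_bytes=16,
                                  block_w=8, block_h=8)

for name, size in [("RGBA8", rgba8), ("DXT1", dxt1), ("ASTC 8x8", astc_8x8)]:
    print(f"{name}: {size // 1024} KiB")
```

That difference is multiplied across every texture in a scene, which is why a transmission format that lands directly in a GPU-native compressed format is attractive.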
Again, if you’re interested in adding glTF 2.0 to your content pipeline, then get started. It’s ready. Microsoft is doing it, too.
Subject: General Tech | May 25, 2017 - 11:12 AM | Ryan Shrout
Tagged: vulkan, video, Surface Pro, SolidScale, seasonic, ps4 pro, podcast, opencl, micon, macbook pro, Khronos, fsp, Eisbaer, Chromebook, Alphacool, aimpad
PC Perspective Podcast #451 - 05/25/17
Join us for talk about the new Surface Pro, analog keyboards, water cooled PSUs and more!
The URL for the podcast is: http://pcper.com/podcast - Share with your friends!
Hosts: Ryan Shrout, Jeremy Hellstrom, Josh Walrath, Allyn Malventano
Peanut Gallery: Alex Lustenberg, Jim Tanous, Ken Addison
Podcast topics of discussion:
The Right People to Interview
Last week, we reported that OpenCL’s roadmap would be merging into Vulkan, and OpenCL would, starting at some unspecified time in the future, be based “on an extended version of the Vulkan API”. This was based on quotes from several emails between myself and the Khronos Group.
Since that post, I had the opportunity to have a phone interview with Neil Trevett, president of the Khronos Group and chairman of the OpenCL working group, and Tom Olson, chairman of the Vulkan working group. We spent a little over a half hour going over Neil’s International Workshop on OpenCL (IWOCL) presentation, discussing the decision, and answering a few lingering questions. This post will present the results of that conference call in a clean, readable way.
First and foremost, while OpenCL is planning to merge into the Vulkan API, the Khronos Group wants to make it clear that “all of the merging” is coming from the OpenCL working group. The Vulkan API roadmap is not affected by this decision. Of course, the Vulkan working group will be able to take advantage of technologies that are dropping into their lap, but those discussions have not even begun yet.
Neil: Vulkan has its mission and its roadmap, and it’s going ahead on that. OpenCL is doing all of the merging. We’re kind-of coming in to head in the Vulkan direction.
Does that mean, in the future, that there’s a bigger wealth of opportunity to figure out how we can take advantage of all this kind of mutual work? The answer is yes, but we haven’t started those discussions yet. I’m actually excited to have those discussions, and are many people, but that’s a clarity. We haven’t started yet on how Vulkan, itself, is changed (if at all) by this. So that’s kind-of the clarity that I think is important for everyone out there trying to understand what’s going on.
Tom also prepared an opening statement. It’s not as easy to abbreviate, so it’s here unabridged.
Tom: I think that’s fair. From the Vulkan point of view, the way the working group thinks about this is that Vulkan is an abstract machine, or at least there’s an abstract machine underlying it. We have a programming language for it, called SPIR-V, and we have an interface controlling it, called the API. And that machine, in its full glory… it’s a GPU, basically, and it’s got lots of graphics functionality. But you don’t have to use that. And the API and the programming language are very general. And you can build lots of things with them. So it’s great, from our point of view, that the OpenCL group, with their special expertise, can use that and leverage that. That’s terrific, and we’re fully behind it, and we’ll help them all we can. We do have our own constituency to serve, which is the high-performance game developer first and foremost, and we are going to continue to serve them as our main mission.
So we’re not changing our roadmap so much as trying to make sure we’re a good platform for other functionality to be built on.
Neil then went on to mention that the decision to merge OpenCL’s roadmap into the Vulkan API took place only a couple of weeks ago. The purpose of the press release was to reach OpenCL developers and get their feedback. According to him, they did a show of hands at the conference, with a room full of a hundred OpenCL developers, and no-one was against moving to the Vulkan API. This gives them confidence that developers will accept the decision, and that their needs will be served by it.
Next up is the why. Read on for more.