GDC 15: Khronos Acknowledges Mantle's Start of Vulkan

Subject: General Tech, Graphics Cards, Shows and Expos | March 3, 2015 - 03:37 PM |
Tagged: vulkan, Mantle, Khronos, glnext, gdc 15, GDC, amd

khronos-group-logo.png

Neil Trevett, the current president of Khronos Group and a vice president at NVIDIA, made an on-the-record statement to acknowledge the start of the Vulkan API. The quote came to me via Ryan, but I think it is a copy-paste of an email, so it should be verbatim.

Many companies have made great contributions to Vulkan, including AMD who contributed Mantle. Being able to start with the Mantle design definitely helped us get rolling quickly – but there has been a lot of design iteration, not the least making sure that Vulkan can run across many different GPU architectures. Vulkan is definitely a working group design now.

So in short, the Vulkan API was definitely started with Mantle and grew from there as more stakeholders added their opinions. Vulkan is obviously different from Mantle in significant ways now, such as its use of SPIR-V, an intermediate representation, as the shader format it consumes (rather than HLSL). To see a bit more information, check out our article on the announcement.

Update: AMD has independently released a statement on Mantle's role in Vulkan.

EVGA and Inno3D Announce the First 4GB NVIDIA GeForce GTX 960 Cards

Subject: Graphics Cards | March 3, 2015 - 02:44 PM |
Tagged: video cards, nvidia, gtx 960, geforce, 4GB

They said it couldn't be done, but where there are higher density chips there's always a way. Today EVGA and Inno3D have both announced new versions of GTX 960 graphics cards with 4GB of GDDR5 memory, placing the cards in a more favorable mid-range position depending on the launch pricing.

960_evga.PNG

EVGA's new 4GB NVIDIA GTX 960 SuperSC

Along with the expanded memory capacity, EVGA's card features the company's ACX 2.0+ cooler, which promises low noise and better cooling. The SuperSC is joined by a standard ACX and the higher-clocked FTW variant, which pushes Base/Boost clocks to 1304/1367MHz out of the box.

960_evga_2.PNG

Inno3D's press release provides fewer details, and the company appears to be launching a single new model featuring 4GB of memory which looks like a variant of their existing GTX 960 OC card.

inno3d_960.jpg

The existing Inno3D GTX 960 OC card

The current 2GB version of the GTX 960 can be found starting at $199, so expect these expanded versions to include a price bump. The GTX 960, with only 1024 CUDA cores (half the count of a GTX 980) and a 128-bit memory interface, has been a very good performer nonetheless with much better numbers than last year's GTX 760, and is very competitive with AMD's R9 280/285. (It's a great overclocker, too.) The AMD/NVIDIA debate rages on, and NVIDIA's partners adding another 4GB offering to the mix will certainly add to the conversation, particularly as an upcoming 4GB version of the GTX 960 was originally said to be unlikely.

Source: EVGA

ARM and Geomerics Show Enlighten 3 Lighting, Integrate with Unity 5

Subject: Graphics Cards, Mobile | March 3, 2015 - 12:00 PM |
Tagged: Unity, lighting, global illumination, geomerics, GDC, arm

Back in 2013 ARM picked up a company called Geomerics, responsible for one of the industry’s most advanced dynamic lighting engines, used in games ranging from mobile to console to PC. Called Enlighten, it is the lighting engine in many major games in a variety of markets. Battlefield 3 and Need for Speed: The Run both use it, and The Bureau: XCOM Declassified and Quantum Conundrum are another pair of major games that depend on Geomerics technology.

geo-3.jpg

Great, but what does that have to do with ARM, and why would the company invest in software that serves such a wide array of markets, most of which are not dominated by ARM processors? There are two answers, the first of which is directional: ARM is using the minds and creative talent behind Geomerics to help point the Cortex and Mali teams in the correct direction for CPU and GPU architecture development. By designing hardware that better addresses the advanced software and lighting systems Geomerics builds, Cortex and Mali should gain at least some advantage in specific gaming titles as well as a potential “general purpose” advantage. NVIDIA employs hundreds of gaming and software developers for this exact reason: what better way to make sure you are always at the forefront of the gaming ecosystem than getting high-level gaming programmers to point you to that edge? Qualcomm also started employing game and engine developers in-house back in 2012 with the same goals.

ARM also believes it will be beneficial to bring publishers, developers and middleware partners to the ARM ecosystem through deployment of the Enlighten engine. It would be feasible to think console vendors like Microsoft and Sony would be more willing to integrate ARM SoCs (rather than the x86 used in the PS4 and Xbox One) when shown the technical capabilities brought forward by technologies like Geomerics Enlighten.

geomerics-1.jpg

It’s best to think of the Geomerics acquisition as a kind of insurance program for ARM, making sure both its hardware and software roadmaps are in line with industry goals and directives.

At GDC 2015 Geomerics is announcing the release of the Enlighten 3 engine, a new version that brings cinematic-quality real-time global illumination to market. Some of the biggest new features include additional accuracy on indirect lighting, color separated directional output (enables individual RGB calculations), better light map baking for higher quality output, and richer material properties to support transparency and occlusion.

All of this technology will be showcased in a new Subway demo that includes real-time global illumination simulation, dynamic transparency and destructible environments.

Geomerics Enlighten 3 Subway Demo

Enlighten 3 will also ship with Forge, a new lighting editor and pipeline tool for content creators looking to streamline the building process. Forge can import from Autodesk 3ds Max and Maya, making interoperability easier. Forge uses a technology called YEBIS 3 to show estimated final quality without the time-consuming final-build processing.

geo-1.jpg

Finally, maybe the biggest news for ARM and Geomerics is that the Unity 5 game engine will be using Enlighten as its default lighting engine, giving ARM/Mali a potential advantage for gaming experiences in the near term. Of course Enlighten is available as an option for Unreal Engine 3 and 4 for developers using that engine in mobile, console and desktop projects as well as in an SDK form for custom integrations.

Who Should Care? Thankfully, Many People

The Khronos Group has made three announcements today: Vulkan (their competitor to DirectX 12), OpenCL 2.1, and SPIR-V. Because there is actually significant overlap, we will discuss them in a single post rather than splitting them up. Each has a role in the overall goal to access and utilize graphics and compute devices.

khronos-Vulkan-700px-eventpage.png

Before we get into what everything is and does, let's give you a little tease to keep you reading. First, Khronos designs their technologies to be self-reliant. As such, while there will be some minimum hardware requirements, the OS pretty much just needs to have a driver model. Vulkan will not be limited to Windows 10 and similar operating systems. If a graphics vendor wants to go through the trouble, which is a gigantic if, Vulkan can be shimmed into Windows 8.x, Windows 7, possibly Windows Vista despite its quirks, and maybe even Windows XP. The words “and beyond” came up after Windows XP, but don't hold your breath for Windows ME or anything. Again, the further back in Windows versions you get, the larger the “if” becomes but at least the API will not have any “artificial limitations”.

Outside of Windows, the Khronos Group is the dominant API curator. Expect Vulkan on Linux, Mac, mobile operating systems, embedded operating systems, and probably a few toasters somewhere.

On that topic: there will not be a “Vulkan ES”. Vulkan is Vulkan, and it will run on desktop, mobile, VR, consoles that are open enough, and even cars and robotics. From a hardware side, the API requires a minimum of OpenGL ES 3.1 support. This is fairly high-end for mobile GPUs, but it is the first mobile spec to require compute shaders, which are an essential component of Vulkan. The presenter did not state a minimum hardware requirement for desktop GPUs, but he treated it like a non-issue. Graphics vendors will need to be the ones making the announcements in the end, though.
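As a rough illustration of that baseline, a version check might look like the sketch below. It assumes the only criterion is the OpenGL ES version a GPU supports, which simplifies whatever requirements graphics vendors end up announcing; the function name and the tuple representation are hypothetical.

```python
# Minimum OpenGL ES level cited by the presenter for Vulkan-capable hardware.
VULKAN_MIN_GLES = (3, 1)


def meets_vulkan_baseline(gles_version: tuple[int, int]) -> bool:
    """Return True if a GPU's OpenGL ES version is at or above the cited minimum.

    Hypothetical helper: real support will depend on vendor drivers and
    announcements, not on this comparison alone.
    """
    # Tuples compare element by element, so (3, 0) < (3, 1) < (3, 2).
    return gles_version >= VULKAN_MIN_GLES


for version in [(2, 0), (3, 0), (3, 1)]:
    label = "meets" if meets_vulkan_baseline(version) else "is below"
    print(f"OpenGL ES {version[0]}.{version[1]} {label} the cited baseline")
```

Only the (3, 1) entry passes here, which matches the point that the requirement sits at the high end of mobile GPUs.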

Before we go further, some background is necessary. Read on for that and lots more!

GDC 15: AMD Mantle Might Be Dead as We Know It: No Public SDK Planned

Subject: Graphics Cards | March 2, 2015 - 02:31 PM |
Tagged: sdk, Mantle, dx12, API, amd

The Game Developers Conference in San Francisco starts today, and you can expect to see more information about DirectX 12 than you could ever possibly want, so be prepared. But what about the original low-level API, AMD Mantle? Announced alongside the Radeon R9 290X/290, utilized in Battlefield 4 and Thief, and integrated into the Crytek engine (announced last year), Mantle was truly the instigator that pushed Microsoft into moving DX12's development along at a faster pace.

Since DX12's announcement, AMD has claimed that Mantle would live on, bringing performance advantages to AMD GPUs and acting as the sounding board for new API features for AMD and game development partners. And, as has been trumpeted since the very beginning of Mantle, it would become an open API, available to all once it outgrew the beta phase that it (still) resides in.

mantle1.jpg

Something might have changed there.

A post over on the AMD Gaming blog from Robert Hallock has some news about Mantle to share as GDC begins. First, the good news:

AMD is a company that fundamentally believes in technologies unfettered by restrictive contracts, licensing fees, vendor lock-ins or other arbitrary hurdles to solving the big challenges in graphics and computing. Mantle was destined to follow suit, and it does so today as we proudly announce that the 450-page programming guide and API reference for Mantle will be available this month (March, 2015) at www.amd.com/mantle.
 
This documentation will provide developers with a detailed look at the capabilities we’ve implemented and the design decisions we made, and we hope it will stimulate more discussion that leads to even better graphics API standards in the months and years ahead.

That's great! We will finally be able to read about the API and how it functions, getting access to the detailed information we have wanted from the beginning. But then there is this portion:

AMD’s game development partners have similarly started to shift their focus, so it follows that 2015 will be a transitional year for Mantle. Our loyal customers are naturally curious what this transition might entail, and we wanted to share some thoughts with you on where we will be taking Mantle next:

AMD will continue to support our trusted partners that have committed to Mantle in future projects, like Battlefield™ Hardline, with all the resources at our disposal.

  1. Mantle’s definition of “open” must widen. It already has, in fact. This vital effort has replaced our intention to release a public Mantle SDK, and you will learn the facts on Thursday, March 5 at GDC 2015.
     
  2. Mantle must take on new capabilities and evolve beyond mastery of the draw call. It will continue to serve AMD as a graphics innovation platform available to select partners with custom needs.
     
  3. The Mantle SDK also remains available to partners who register in this co-development and evaluation program. However, if you are a developer interested in Mantle "1.0" functionality, we suggest that you focus your attention on DirectX® 12 or GLnext.

Essentially, AMD's Mantle API in its "1.0" form is at the end of its life: it is only supported for current partners, and the planned public SDK will never be posted. Honestly, at this point, this isn't so much of a letdown as it is a necessity. DX12 and GLnext have already superseded Mantle in terms of market share and mind share with developers, and any more work AMD put into getting devs on board with Mantle would be wasted effort.

mantle-2.jpg

Battlefield 4 is likely to be the only major title to use AMD Mantle

AMD claims to have future plans for Mantle, though it will continue to be available only to select partners with "custom needs." I would imagine this could expand outside the world of games, but it could also mean game consoles are the target, where developers are only concerned with AMD GPU hardware.

So, from our perspective, Mantle as we know it is pretty much gone. It served its purpose, making NVIDIA and Microsoft pay attention to the CPU bottlenecks in DX11, but it appears the dream was a bit bigger than the product could become. AMD shouldn't be chastised for this shift, nor for lofty goals that we kind-of-always knew were too steep a hill to climb. Just revel in the news that pours from GDC this week about DX12.

Source: AMD

So Long Adware, and Thanks for All the Fish!

Subject: Graphics Cards | March 1, 2015 - 07:30 AM |
Tagged: superfish, Lenovo, bloatware, adware

Obviously, this does not erase the controversy that Lenovo got itself into, but it is certainly the correct response (if they act how they imply). Adware and bloatware are common on consumer PCs, making the slowest of devices even more sluggish as demos and sometimes straight-up advertisements claim their share of your resources. This does not even begin to discuss the security issues that some of these hitchhikers drag in. Again, I refer you to the aforementioned controversy.

lenovo-do.png

In response, albeit a delayed one, Lenovo has announced that, by the launch of Windows 10, they will only pre-install the OS and “related software”. Lenovo classifies this related software as drivers, security software, Lenovo applications, and applications for “unique hardware” (ex: software for an embedded 3D camera).

It looks to be a great step, but I need to call out “security software”. Windows 10 should ship with Microsoft's security applications in many regions, which raises the question of why a laptop provider would include an alternative. If the problem is that people expect McAfee or Symantec, then advertise pre-loaded Microsoft anti-malware and keep it clean. Otherwise, it feels like keeping a single finger in the adware take-a-penny dish.

At least it is not as bad as trying to install McAfee every time you update Flash Player. I consider Adobe's tactic the greater of two evils on that one. I mean, unless Adobe just thinks that Flash Player is so insecure that you would be crazy to install it without a metaphorical guard watching over your shoulder.

And then of course we reach the divide between “saying” and “doing”. We will need to see Lenovo's actual Windows 10 devices to find out if they kept their word, and followed its implications to a tee.

Source: Lenovo

Imagination Launches PowerVR GT7900, "Super-GPU" Targeting Consoles

Subject: Graphics Cards, Mobile | February 26, 2015 - 02:15 PM |
Tagged: super-gpu, PowerVR, Imagination Technologies, gt7900

As a preview to announcements and releases being made at both Mobile World Congress (MWC) and the Game Developers Conference (GDC) next week, Imagination Technologies took the wraps off of a new graphics product they are calling a "super-GPU". The PowerVR GT7900 is the new flagship GPU in its Series7XT family and targets a growing category called "affordable game consoles." Think about Android-powered set-top devices like the Ouya or maybe Amazon's Fire TV.

gt7900-1.png

PowerVR breaks up its GPU designs into unified shading clusters (USCs) and the GT7900 has 16 of them for a total of 512 ALU cores. Imagination has previously posted a great overview of its USC architecture design and how you can compare its designs to other GPUs on the market. Imagination wants to claim that the GT7900 will offer "PC-class gaming experiences" though that is as ambiguous as the idea of a work load of a "console-level game." But with rated peak performance levels hitting over 800 GFLOPS in FP32 and 1.6 TFLOPS in FP16 (half-precision) this GPU does have significant theoretical capability.

Spec         | PowerVR GT7900           | Tegra X1
Vendor       | Imagination Technologies | NVIDIA
FP32 ALUs    | 512                      | 256
FP32 GFLOPS  | 800                      | 512
FP16 GFLOPS  | 1600                     | 1024
GPU Clock    | 800 MHz                  | 1000 MHz
Process Tech | 16nm FinFET+             | 20nm TSMC
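
The peak figures in the table can be reproduced with a common rule of thumb: each FP32 ALU retires one fused multiply-add (two floating-point operations) per clock, and both parts are listed with FP16 at twice their FP32 rate. A minimal sketch of that arithmetic follows; the two-FLOPs-per-clock rule is an assumption rather than a vendor-confirmed formula, and `peak_gflops` is a hypothetical helper name.

```python
def peak_gflops(alus: int, clock_mhz: float, flops_per_alu_per_clock: int = 2) -> float:
    """Theoretical peak throughput in GFLOPS.

    Assumes every ALU issues one fused multiply-add (2 FLOPs) per clock,
    which is a rule of thumb, not a figure confirmed by either vendor.
    """
    return alus * flops_per_alu_per_clock * clock_mhz / 1000.0


# ALU counts and clocks are taken from the comparison table.
gt7900_fp32 = peak_gflops(alus=512, clock_mhz=800)
tegra_x1_fp32 = peak_gflops(alus=256, clock_mhz=1000)

# Both parts are listed with FP16 at twice their FP32 rate.
gt7900_fp16 = 2 * gt7900_fp32
tegra_x1_fp16 = 2 * tegra_x1_fp32

print(f"GT7900:   {gt7900_fp32:.1f} GFLOPS FP32, {gt7900_fp16:.1f} GFLOPS FP16")
print(f"Tegra X1: {tegra_x1_fp32:.1f} GFLOPS FP32, {tegra_x1_fp16:.1f} GFLOPS FP16")
```

Under these assumptions the GT7900 works out to roughly 819 GFLOPS in FP32, consistent with the "over 800 GFLOPS" rating, while the Tegra X1 lands on the 512 GFLOPS listed.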

Imagination also believes that PowerVR offers a larger portion of its peak performance for a longer period of time than the competition thanks to the tile-based deferred rendering (TBDR) approach that has been "refined over the years to deliver unmatched efficiency."

gt7900-2.png

The FP16 performance number listed above is useful as an extreme power savings option where the half-precision compute operates in a much more efficient manner. A fair concern is how many applications, GPGPU or gaming, actually utilize the FP16 data type but having support for it in the GT7900 allows developers to target it.

Other key features of the GT7900 include support for OpenGL ES 3.1 + AEP (Android Extension Pack), hardware tessellation and ASTC LDR and HDR texture compression standards. The GPU also can run in a multi-domain virtualization mode that would allow multiple operating systems to run in parallel on a single platform.

gt7900-3.png

Imagination believes that this generation of PowerVR will "usher a new era of console-like gaming experiences" and will showcase a new demo at GDC called Dwarf Hall.

I'll be at GDC next week and have already set up a meeting with Imagination to talk about the GT7900, so I should have some hands-on experience to report back with soon. I remain curious about the market for these types of high-end "mobile" GPUs, given how limited the Android console market currently is. Imagination does claim that the GT7900 beats products with performance levels as high as the GeForce GT 730M discrete GPU - no small feat.

Author:
Manufacturer: Asus

Quiet, Efficient Gaming

The last few weeks have been dominated by talk about the memory controller of the Maxwell based GTX 970.  There are some very strong opinions about that particular issue, and certainly NVIDIA was remiss in not properly informing consumers about how that particular product handles its memory.  While that debate rages, we have somewhat lost track of other products in the Maxwell range.  The GTX 960 was released during this particular firestorm and, while it shares the outstanding power/performance qualities of the Maxwell architecture, it is considered a little overpriced for its performance when compared to other cards in its price class.

It is easy to forget that the original Maxwell based product to hit shelves was the GTX 750 series of cards.  They were released a year ago to some very interesting reviews.  The board is one of the first mainstream cards in recent memory to have a power draw that is under 75 watts, but can still play games with good quality settings at 1080P resolutions.  Ryan covered this very well and it turned out to be a perfect gaming card for many pre-built systems that do not have extra power connectors (or a power supply that can support 125+ watt graphics cards).  These are relatively inexpensive cards and very easy to install, producing a big jump in performance as compared to the integrated graphics components of modern CPUs and APUs.

strix_01.jpg

The GTX 750 and GTX 750 Ti have proven to be popular cards due to their overall price, performance, and extremely low power consumption.  They also tend to produce a relatively low amount of heat, due to solid cooling combined with that low power consumption.  The Maxwell architecture has also introduced some new features, but the major changes are to the overall design of the architecture as compared to Kepler.  Instead of 192 cores per SMX, there are now 128 cores per SMM.  NVIDIA has done a lot of work to improve performance per core as well as lower power in a fairly dramatic way.  An interesting side effect is that the CPU hit with Maxwell is a couple of percentage points higher than with Kepler.  NVIDIA does lean a bit more on the CPU to improve overall GPU power, but most of this performance hit is covered up by some really good realtime compiler work in the driver.

Asus has taken the GTX 750 Ti and applied their STRIX design and branding to it.  While there are certainly faster GPUs on the market, there are none that exhibit the power characteristics of the GTX 750 Ti.  The combination of this GPU and the STRIX design should result in an extremely efficient, cool, and silent card.

Click to read the rest of the review of the Asus STRIX GTX 750 Ti!

NVIDIA Faces Class Action Lawsuit for the GeForce GTX 970

Subject: Graphics Cards | February 23, 2015 - 04:12 PM |
Tagged: nvidia, geforce, GTX 970

So apparently NVIDIA and a single AIB partner, Gigabyte, are facing a class action lawsuit because of the GeForce GTX 970 4GB controversy. I am not sure why they singled out Gigabyte, but I guess that is the way things go in the legal world. Unlucky for them, and seemingly lucky for the rest.

nvidia-970-architecture.jpg

For those who are unaware, the controversy is based on NVIDIA claiming that the GeForce GTX 970 has 4GB of RAM, 64 ROPs, and 2048KB of L2 Cache. In actuality, it has 56 ROPs and 1792KB of L2 Cache. The main talking point is that the RAM is segmented into two partitions, one that is 3.5GB and another that is 0.5GB. All 4GB are present on the card though, and accessible (unlike the disabled L2 Cache and ROPs). Then again, I cannot find an instance in that class action lawsuit's exhibits which claims an incorrect number of ROPs or amount of L2 Cache.
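
The figures quoted above line up neatly: each actual value is the same fraction of what was advertised. A quick check of that arithmetic, using only the numbers in the paragraph (the variable names are illustrative):

```python
from fractions import Fraction

# Advertised vs. actual GTX 970 figures quoted in the paragraph above.
advertised = {"rops": 64, "l2_cache_kb": 2048}
actual = {"rops": 56, "l2_cache_kb": 1792}

# Exact ratio of actual to advertised for each resource.
ratios = {name: Fraction(actual[name], advertised[name]) for name in advertised}

# Memory is fully present but split into two partitions (sizes in GB).
fast_partition_gb = Fraction(7, 2)  # 3.5 GB
slow_partition_gb = Fraction(1, 2)  # 0.5 GB
total_gb = fast_partition_gb + slow_partition_gb
fast_share = fast_partition_gb / total_gb

for name, ratio in ratios.items():
    print(f"{name}: {actual[name]} of {advertised[name]} = {ratio}")
print(f"memory: {float(total_gb)} GB total, fast partition share = {fast_share}")
```

The ROP count and L2 cache each come to 7/8 of the advertised figure, and the 3.5GB partition is the same 7/8 share of the full 4GB.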

Again, the benchmarks that you saw when the GeForce GTX 970 launched are still valid. Since the issue came up, Ryan has also tried various configurations of games in single- and multi-GPU systems to find conditions that would make the issue appear.

Source: Court Filing

Windows Update Installs GeForce 349.65 with WDDM 2.0

Subject: General Tech, Graphics Cards | February 21, 2015 - 04:23 PM |
Tagged: wddm 2.0, nvidia, geforce 349.65, geforce, dx12

Update 2: Outside sources have confirmed to PC Perspective that this driver contains DirectX 12 as well as WDDM 2.0. They also claim that Intel and AMD have DirectX 12 drivers available through Windows Update as well. After enabling iGPU graphics on my i7-4790K, the Intel HD 4600 received a driver update, which also reports as WDDM 2.0 in DXDIAG. I do not have a compatible AMD GPU to test against (just a couple of old Windows 7 laptops) but the source is probably right and some AMD GPUs will be updated to DX12 too.

So it turns out that if your motherboard dies during a Windows Update reboot, then you are going to be spending several hours reinstalling software and patches, but that is not important. What is interesting is the installed version number for NVIDIA's GeForce Drivers when Windows Update was finished with its patching: 349.65. These are not available on NVIDIA's website, and the Driver Model reports WDDM 2.0.

nvidia-34965-driver.png

It looks like Microsoft pushed out NVIDIA's DirectX 12 drivers through Windows Update. Update 1 Pt. 1: The "Runtime" reporting 11.0 is confusing, though. Perhaps this is just DX11 with WDDM 2.0?

nvidia-34965-dxdiag.png

I am hearing online that these drivers support the GeForce 600 series and later GPUs, and that there are later, non-public drivers available (such as 349.72 whose release notes were leaked online). NVIDIA has already announced that DirectX 12 will be supported on GeForce 400-series and later graphics cards, so Fermi drivers will be coming at some point. For now, it's apparently Kepler-and-later, though.

So with OS support and, now, released graphics drivers, all that we are waiting on is software and an SDK (plus any NDAs that may still be in effect). With Game Developers Conference (GDC 2015) coming up in a little over a week, I expect that we will get each of these very soon.

Update 1 Pt. 2: I should note that the release notes for 349.72 specifically mention DirectX 12. As mentioned above, it is possible that 349.65 contains just WDDM 2.0 and not DX12, but it contains at least WDDM 2.0.