Manufacturer: NVIDIA

93% of a GP100 at least...

NVIDIA has announced the Tesla P100, the company's newest (and most powerful) accelerator for HPC. Based on the Pascal GP100 GPU, the Tesla P100 is built on 16nm FinFET and uses HBM2.


NVIDIA provided a comparison table, to which we have added what we know about a full GP100:

                         | Tesla K40      | Tesla M40       | Tesla P100     | Full GP100
GPU                      | GK110 (Kepler) | GM200 (Maxwell) | GP100 (Pascal) | GP100 (Pascal)
SMs                      | 15             | 24              | 56             | 60
TPCs                     | 15             | 24              | 28             | (30?)
FP32 CUDA Cores / SM     | 192            | 128             | 64             | 64
FP32 CUDA Cores / GPU    | 2880           | 3072            | 3584           | 3840
FP64 CUDA Cores / SM     | 64             | 4               | 32             | 32
FP64 CUDA Cores / GPU    | 960            | 96              | 1792           | 1920
Base Clock               | 745 MHz        | 948 MHz         | 1328 MHz       | TBD
GPU Boost Clock          | 810/875 MHz    | 1114 MHz        | 1480 MHz       | TBD
FP64 GFLOPS              | 1680           | 213             | 5304           | TBD
Texture Units            | 240            | 192             | 224            | 240
Memory Interface         | 384-bit GDDR5  | 384-bit GDDR5   | 4096-bit HBM2  | 4096-bit HBM2
Memory Size              | Up to 12 GB    | Up to 24 GB     | 16 GB          | TBD
L2 Cache Size            | 1536 KB        | 3072 KB         | 4096 KB        | TBD
Register File Size / SM  | 256 KB         | 256 KB          | 256 KB         | 256 KB
Register File Size / GPU | 3840 KB        | 6144 KB         | 14336 KB       | 15360 KB
TDP                      | 235 W          | 250 W           | 300 W          | TBD
Transistors              | 7.1 billion    | 8 billion       | 15.3 billion   | 15.3 billion
GPU Die Size             | 551 mm2        | 601 mm2         | 610 mm2        | 610 mm2
Manufacturing Process    | 28 nm          | 28 nm           | 16 nm          | 16 nm

This table is designed for developers who are interested in GPU compute, so a few variables (like ROP count) are still unknown, but it still gives us a great deal of insight into the “big Pascal” architecture. The jump to 16nm allows for roughly twice the transistor count, 15.3 billion versus 8 billion on GM200, in about the same die area (610 mm2 versus 601 mm2).


A full GP100 processor will have 60 streaming multiprocessors (SMs), compared to GM200's 24, although each Pascal SM contains half as many FP32 CUDA cores (64 versus 128). The GP100 part listed in the table above is actually partially disabled, cutting four of the sixty SMs. This leads to 3584 single-precision (32-bit) CUDA cores, which is up from 3072 in GM200. (The full GP100 will have 3840 of these FP32 CUDA cores -- but we don't know when or where we'll see that.) The base clock is also significantly higher than Maxwell's, 1328 MHz versus ~1000 MHz for the Titan X and 980 Ti, although Ryan has overclocked those GPUs to ~1390 MHz with relative ease. This is interesting, because even though 10.6 TeraFLOPS (at the 1480 MHz boost clock) is amazing, it's only about 20% more than what GM200 could pull off with an overclock.
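For reference, those throughput figures fall straight out of the table: each CUDA core can retire one fused multiply-add (two floating-point operations) per clock, so peak throughput is simply cores × clock × 2. A quick sketch of that arithmetic follows; the full GP100's clock is still TBD, so the P100's 1480 MHz boost clock is assumed for it purely as an illustration.

```cpp
#include <cstdio>

// Peak throughput = cores x clock x 2 FLOPs (one fused multiply-add per core per clock).
static double peak_tflops(int cores, double clock_ghz) {
    return cores * clock_ghz * 2.0 / 1000.0;  // GFLOPS -> TFLOPS
}

int main() {
    const double boost_ghz = 1.480;  // Tesla P100 boost clock from the table above
    std::printf("Tesla P100 FP32: %.1f TFLOPS\n", peak_tflops(3584, boost_ghz));  // ~10.6
    std::printf("Tesla P100 FP64: %.1f TFLOPS\n", peak_tflops(1792, boost_ghz));  // ~5.3
    std::printf("Full GP100 FP32: %.1f TFLOPS\n", peak_tflops(3840, boost_ghz));  // ~11.4 (assumed clock)
    return 0;
}
```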

Continue reading our preview of the NVIDIA Pascal architecture!!

The Status of Windows Phone

Subject: Mobile, Shows and Expos | March 31, 2016 - 05:52 PM |
Tagged: BUILD, build 2016, microsoft, windows 10, windows phone

If you watched the opening keynote of Microsoft's Build conference, then you probably didn't see much Windows Phone (unless you were looking at your own). The Verge talked to Terry Myerson about this, and Microsoft confirmed that they are leading with non-Windows, 4-inch devices, and they want to “generate developer interest” on those platforms for this year.

PC World interpreted this conversation to mean that Windows Phone has been put on hold.


That might be a little hasty, though. Microsoft is still building Windows 10 for Mobile. In fact, since Microsoft updated “Windows OneCore” and jumped to 14xxx-level build numbers with Windows 10 build 14251, Windows 10 Mobile and Windows 10 for PCs have been kept in lockstep. As far as I know, that is still the plan, and Windows Insiders should continue to receive these builds on compatible devices.

That said, Microsoft has basically admitted that Windows Phone would just be a distraction for developers this year. At the very least, they don't believe that the platform will be ready for them until next year's Build conference, which means that consumers are probably even further out than that, because there would be no applications for them until developers come on board. Yes, Windows Phone could be slowly shimmying out of the spotlight, but it could also just be delayed until Microsoft can make a good impression, with the PC, Xbox, HoloLens, and other ecosystems secure enough to lift it up.

Source: The Verge

Microsoft's Phil Spencer Discusses UWP Concerns at Build

Subject: General Tech, Shows and Expos | March 30, 2016 - 05:14 PM |
Tagged: windows 10, uwp, microsoft, build 2016, BUILD

When a platform vendor puts up restrictions, it can be scary, and with good cause. Microsoft's Universal Windows Platform (UWP) is the successor to WinRT, which, in the Windows 8 era, forced web browsers to be reskins of Internet Explorer, forced developers to get both their software and themselves certified before publishing, and so forth. Microsoft still allowed the traditional, more open Win32 API, but locked those applications into “the Desktop App”.

Naturally, UWP carries similar concerns, which some developers (like Tim Sweeney of Epic Games) voiced publicly. It's more permissive, but in a brittle way. We don't want Microsoft, or someone like a government who has authority over them, to flip a switch and prevent individuals from developing software, ban content that some stakeholder finds offensive (like art with LGBT characters in Russia, the Middle East, or even North America), or ban entire categories of software like encryption suites or third-party web browsers.


This is where we get to today's announcement.

Microsoft's Phil Spencer, essentially responding to Tim Sweeney's concerns and to the PC gaming community at large, announced changes to UWP to make it more open. I haven't had too much time to think about it, and some necessary details don't translate well to a keynote segment, but we'll relay what we know. First, they plan to enable disabling VSync, along with FreeSync and G-Sync support, in May. I find this kind of odd: since Windows 10 will not receive its next significant update (the “Anniversary Update”) until July, I'm not sure how they would deliver this. It seems a little big for a simple Windows Update patch. I mean, they have yet to even push new versions of their Edge web browser outside of Windows 10 builds.

The second change is more interesting. Microsoft announced plans to allow modding and overlays in UWP applications, albeit without committing to a solid release date or window. This means that software will be able to, somehow, hook into a UWP application's process, and users will be encouraged to, somehow, access the file system of UWP applications. Currently, you need to jump through significant hoops to access the contents of Windows Store applications.

They still did not address the issue of side-loading and developing software without a certificate. Granted, you can do both of those things in Windows 10, but in a way that seems like it could be easily removed in a future build, if UWP gains enough momentum and whoever runs Microsoft at the time decides to. Remember, this would not need to be an insidious choice by malicious people. UWP is alluring to Microsoft because it could change the “Windows gets viruses” stigma that is associated with PCs. The problem is that it can be abused, or even unintentionally harm creators and potential users.

On the other hand, they are correcting some major issues. I'm just voicing concerns.

Source: Microsoft

Meet the new Intel Skulltrail NUC; Changing the Game

Subject: Shows and Expos | March 17, 2016 - 01:00 AM |
Tagged: skulltrail, Skull Canyon, nuc, Intel, GDC


No, we are not talking about the motherboard from 2008 which was going to compete with AMD's Quad FX platform and worked out just as well.  We are talking about a brand new Skull Canyon NUC powered by an i7-6770HQ with Iris Pro 580 graphics and up to 32GB of DDR4-2133.  The NUC6i7KYK will also be the first system we have seen with a fully capable USB Type-C port: it will offer Thunderbolt 3, USB 3.1, and DisplayPort 1.2 connectivity; not simultaneously, but the flexibility is nothing less than impressive.  It will also sport a full-size HDMI 2.0 port and a Mini DisplayPort 1.2 output, so you can still send video while using the Type-C port for data transfer.  The Type-C port will also support external graphics card enclosures if you plan on using this as a gaming machine.


The internal storage subsystem is equally impressive: dual M.2 slots will give you great performance, while the SD card slot is not as quick but is still a handy feature.  Connectivity is supplied by Intel Dual Band Wireless-AC 8260 (802.11ac) and Bluetooth 4.2, and an infrared sensor will let you use your favourite remote control if you set up the Skull Canyon NUC as a media server.  All of these features are in a device less than 0.7 litres in size, with your choice of two covers and support for your own if you desire to personalize your system.  The price is not unreasonable: the MSRP for a barebones system is $650, and one with 16GB of memory, a 256GB SSD, and Windows 10 should retail for about $1000.  You can expect to see these for sale on Newegg in April, shipping in May.

All this and more can be found on Intel's news room, and you can click here for the full system specs.

Source: Intel

Shedding a little light on Monday's announcement

Most of our readers should have some familiarity with GameWorks, which is a series of libraries and utilities that help game developers (and others) create software. While many hardware and platform vendors provide samples and frameworks, taking the brunt of the work required to solve complex problems, this is NVIDIA's branding for their suite of technologies. Their hope is that it pushes the industry forward, which in turn drives GPU sales as users see the benefits of upgrading.


This release, GameWorks SDK 3.1, contains three complete features and two “beta” ones. We will start with the first three, each of which targets a portion of the lighting and shadowing problem. The last two, which we will discuss at the end, are the experimental ones and fall under the umbrella of physics and visual effects.


The first technology is Volumetric Lighting, which simulates the way light scatters off dust in the atmosphere. Game developers have been approximating this effect for a long time. In fact, I remember a particular section of Resident Evil 4 where you walk down a dim hallway that has light rays spilling in from the windows. GameCube-era graphics could only do so much, though, and certain camera positions show that the effect was just a translucent, one-sided, decorative plane. It was a cheat that was hand-placed by a clever artist.


GameWorks' Volumetric Lighting goes after the same effect, but with a much different implementation. It looks at the generated shadow maps and, using hardware tessellation, extrudes geometry from the unshadowed portions toward the light. These little bits of geometry are blended additively, so the accumulated depth of the volume translates into the strength of the highlight. Also, since it's hardware tessellated, it probably has a smaller impact on performance, because the GPU only needs to store enough information to generate the geometry, not store (and update) the geometry data for all possible light shafts themselves -- and it needs to store those shadow maps anyway.
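To give a sense of the quantity being computed, here is a deliberately naive CPU sketch of the same single-scattering effect. It marches from the camera toward the shaded surface and accumulates a contribution for every lit segment, which is what NVIDIA's extruded, additively blended geometry encodes far more cheaply on the GPU. The function and the `isLit` callback are hypothetical names standing in for a shadow-map lookup; this is not the GameWorks implementation.

```cpp
#include <cmath>
#include <functional>

struct Vec3 { float x, y, z; };

// Naive single-scattering ray march from the camera to a surface point.
// Every step the shadow test reports as lit adds a small in-scattering term,
// so the result grows with the unshadowed depth the view ray passes through.
float lightShaftIntensity(Vec3 camera, Vec3 surface,
                          const std::function<bool(const Vec3&)>& isLit,
                          int steps = 64, float density = 0.05f) {
    Vec3 d{surface.x - camera.x, surface.y - camera.y, surface.z - camera.z};
    float len = std::sqrt(d.x * d.x + d.y * d.y + d.z * d.z);
    float stepLen = len / steps;
    float intensity = 0.0f;
    for (int i = 0; i < steps; ++i) {
        float t = (i + 0.5f) / steps;  // sample the middle of each segment
        Vec3 p{camera.x + d.x * t, camera.y + d.y * t, camera.z + d.z * t};
        if (isLit(p))
            intensity += density * stepLen;  // accumulate scattering in lit segments only
    }
    return intensity;
}
```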


Even though it seemed like this effect was independent of render method, since it basically just adds geometry to the scene, I asked whether it was locked to deferred rendering methods. NVIDIA said that it should be unrelated, as I suspected, which is good for VR. Forward rendering is easier to anti-alias, which makes the uneven pixel distribution (after lens distortion) appear smoother.

Read on to see the other four technologies, and a little announcement about source access.

MWC 16: Imagination Technologies Ray Tracing Accelerator

Subject: Graphics Cards, Mobile, Shows and Expos | February 24, 2016 - 01:46 AM |
Tagged: raytracing, ray tracing, PowerVR, mwc 16, MWC, Imagination Technologies

For the last couple of years, Imagination Technologies has been pushing hardware-accelerated ray tracing. One of the major problems in computer graphics is knowing what geometry and material corresponds to a specific pixel on the screen. Several methods exist, although typical GPUs project a 3D scene into the virtual camera's 2D space and perform a point-in-triangle test on it. Once they know where in the triangle the pixel is, if it is in the triangle at all, it can be colored by a pixel shader.
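As a rough sketch of that inner loop, the coverage test most rasterizers use boils down to three signed-area ("edge function") evaluations per pixel, and those same values double as the barycentric weights used to interpolate attributes for the pixel shader. The function names below are illustrative, not from any particular GPU or API.

```cpp
#include <array>
#include <optional>

struct Vec2 { float x, y; };

// Signed area of the parallelogram spanned by (b - a) and (c - a); the classic
// "edge function" used for rasterization coverage tests.
float edge(const Vec2& a, const Vec2& b, const Vec2& c) {
    return (b.x - a.x) * (c.y - a.y) - (b.y - a.y) * (c.x - a.x);
}

// Point-in-triangle test for a pixel center 'p' against a screen-space triangle
// (v0, v1, v2) with counter-clockwise winding. Returns the barycentric weights
// used to interpolate vertex attributes, or nothing if the pixel is outside.
std::optional<std::array<float, 3>> coverage(Vec2 p, Vec2 v0, Vec2 v1, Vec2 v2) {
    float area = edge(v0, v1, v2);
    if (area <= 0.0f) return std::nullopt;  // degenerate or back-facing triangle
    float w0 = edge(v1, v2, p);
    float w1 = edge(v2, v0, p);
    float w2 = edge(v0, v1, p);
    if (w0 < 0.0f || w1 < 0.0f || w2 < 0.0f) return std::nullopt;  // outside
    return std::array<float, 3>{w0 / area, w1 / area, w2 / area};
}
```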


Another method is casting light rays into the scene, and assigning a color based on the material that each ray lands on. This is ray tracing, and it has a few advantages. First, it is much easier to handle reflections, transparency, shadows, and other effects where information is required beyond what the affected geometry and its material provide. There are usually ways around this without resorting to ray tracing, but they each have their own trade-offs. Second, it can be more efficient for certain data sets. Rasterization, since it's based around a “where in a triangle is this point” algorithm, needs geometry to be made up of polygons.
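The per-ray workload that dedicated hardware like Imagination's accelerates is, at its core, a very large number of intersection tests (plus the acceleration-structure traversal that avoids testing every triangle against every ray). A minimal sketch of one such test, the well-known Möller-Trumbore ray/triangle intersection, is shown below; the helper names are my own.

```cpp
#include <cmath>
#include <optional>

struct Vec3 { float x, y, z; };

Vec3 sub(Vec3 a, Vec3 b)   { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
Vec3 cross(Vec3 a, Vec3 b) { return {a.y * b.z - a.z * b.y,
                                     a.z * b.x - a.x * b.z,
                                     a.x * b.y - a.y * b.x}; }
float dot(Vec3 a, Vec3 b)  { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Moller-Trumbore ray/triangle intersection: returns the distance 't' along the
// ray (origin + t * dir) at which it hits triangle (v0, v1, v2), or nothing on a miss.
std::optional<float> intersect(Vec3 origin, Vec3 dir, Vec3 v0, Vec3 v1, Vec3 v2) {
    const float eps = 1e-7f;
    Vec3 e1 = sub(v1, v0), e2 = sub(v2, v0);
    Vec3 p = cross(dir, e2);
    float det = dot(e1, p);
    if (std::fabs(det) < eps) return std::nullopt;  // ray is parallel to the triangle
    float invDet = 1.0f / det;
    Vec3 s = sub(origin, v0);
    float u = dot(s, p) * invDet;
    if (u < 0.0f || u > 1.0f) return std::nullopt;  // outside the first barycentric bound
    Vec3 q = cross(s, e1);
    float v = dot(dir, q) * invDet;
    if (v < 0.0f || u + v > 1.0f) return std::nullopt;  // outside the triangle
    float t = dot(e2, q) * invDet;
    if (t > eps) return t;                          // hit in front of the ray origin
    return std::nullopt;
}
```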

It also has the appeal of being what the real world sort-of does (assuming we don't need to model Gaussian beams). That doesn't necessarily mean anything, though.

At Mobile World Congress, Imagination Technologies once again showed off their ray tracing hardware, embodied in the PowerVR GR6500 GPU. This graphics processor has dedicated circuitry to calculate rays, and they use it in a couple of different ways. They presented several demos that modified Unity 5 to take advantage of their ray tracing hardware. One particularly interesting example was a quick, seven-second video that added ray traced reflections atop an otherwise rasterized scene. It was a little too smooth, creating reflections that were too glossy, but that could probably be downplayed in the material (Update, Feb 24th @ 5pm: Car paint is actually that glossy; it's a different issue). Back when I was working on a GPU-accelerated software renderer, before Mantle, Vulkan, and DirectX 12, I was hoping to use OpenCL-based ray traced highlights on idle GPUs, if I didn't have any other purposes for them. Now, though, those GPUs can be exposed to graphics APIs directly, so they might not be so idle.

The downside of dedicated ray tracing hardware is that, well, the die area could have been used for something else. Extra shaders, for compute, vertex, and material effects, might be more useful in the real world... or maybe not. Add in the fact that fixed-function circuitry already exists for rasterization, and you have to weigh the gain against the cost.

It could be cool, but it has its trade-offs, like anything else.

MWC 16: LG G5 Hands-on. Performance and Modularity

Subject: Mobile, Shows and Expos | February 22, 2016 - 10:09 AM |
Tagged: video, snapdragon 820, snapdragon, qualcomm, MWC 2016, MWC, LG, G5

The new LG G5 flagship smartphone offers a unique combination of form factor, performance and modularity that no previous smartphone design has had. But will you want to buy in?


I had a feeling that the Snapdragon 820 SoC from Qualcomm would make an impression at Mobile World Congress this year, and it appears the company has improved on the previous flagship processor quite a bit. Both Samsung and LG have implemented it in their 2016 models, including the new G5, offering up a combination of performance and power efficiency that is dramatically better than the Snapdragon 810, which was hindered by heat and process technology concerns.

Along with the new processor, the G5 includes 4GB of RAM, 32GB of on-board storage with microSD expansion, a 2,800 mAh battery and Android 6.0 out of the box. The display is 5.3 inches and uses LG IPS technology with a 2560x1440 resolution, resulting in an impressive 554 PPI. LG has updated the USB connection to Type-C, a move that Samsung brushed off as unnecessary at this time.
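That pixel density follows directly from the panel specs: divide the diagonal resolution in pixels by the diagonal size in inches. A trivial check:

```cpp
#include <cmath>
#include <cstdio>

// PPI = sqrt(width^2 + height^2) / diagonal size in inches.
int main() {
    double w = 2560, h = 1440, diagonal = 5.3;  // LG G5 panel
    std::printf("%.0f PPI\n", std::sqrt(w * w + h * h) / diagonal);  // ~554
    return 0;
}
```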

The phone's design is pretty standard and will look very familiar to anyone who has handled a G4 or similar flagship smartphone in recent months. It was bigger in the hand than the iPhone 6s but, considering the panel size differences, it was more compact than expected.


Modularity is the truly unique addition to the G5, though. The battery is replaceable by sliding out a bottom portion of the phone, released with a tab on the left side. This allows LG to maintain the metal body construction but still offer flexibility for power users who are used to having extra batteries in their bag. This mechanism also means LG can offer add-on modules for the phone.


The first two available will be the LG Cam Plus and the LG Hi-Fi Plus. The Cam Plus gives the phone a camera grip as well as dedicated buttons for the shutter, video recording and zoom. Including an extra 1,200 mAh of battery is a nice touch too. The Hi-Fi Plus module has a DAC and headphone amplifier embedded in it and can also be used with a PC through the USB Type-C connection, which is a welcome bonus.


I was overall pretty impressed with what LG had to offer with the G5. Whether or not the modular design gains any traction remains to be seen; I have concerns over the public's desire to carry around modules or to change the form factor of their phones so dramatically.

MWC 16: HTC Vive Launches in April for $799 USD

Subject: Displays, Shows and Expos | February 22, 2016 - 01:27 AM |
Tagged: MWC, mwc 16, valve, htc, vive, Oculus

Valve and HTC announced that the Vive consumer edition will be available in April for $799 USD, with pre-orders beginning on February 29th. Leave it to Valve to open pre-orders on a date that doesn't always exist. The system comes with the headset, two VR controllers, and two sensors. The unit will have “full commercial availability” when it launches in April, but that means little if it sells out instantly. There's no way to predict that.

The announcement blog post drops a subtle jab at Oculus. “Vive will be delivered as a complete kit” seems to refer to the Oculus Touch controllers being delayed (and thus not in the hands of every user). This also makes me think about the price. The HTC Vive costs $200 more than the Oculus Rift. That said, it also includes motion controllers in the box, which could shrink that gap. It does not, however, come with a standard gamepad like the Rift does, although that's just wasted money if you already have one.


Unlike the Oculus Rift, which has its own SDK, the Vive is powered by SteamVR. Most engines and middleware that support one seem to support both, so I'm not sure whether this will matter. It could end up blocking content in an HD DVD vs. Blu-ray fashion. Hopefully Valve/HTC and Oculus/Facebook, or every software vendor on an individual basis, will work through these interoperability concerns and create an open platform. Settling on a standard tends to commoditize industries, but that will eventually happen to VR at some point anyway. Hopefully, if it doesn't happen sooner, cross-compatibility at least arrives then.

MWC 16: Epic Games Unveils ProtoStar Demo on Galaxy S7

Subject: Mobile, Shows and Expos | February 21, 2016 - 10:14 PM |
Tagged: Samsung, epic games, unreal engine 4, vulkan, galaxy s7, MWC, mwc 16

Mobile World Congress starts with a big bang... ... ... :3

Okay, not really; it starts with the formation of a star, which happens on a continual basis across the universe. I won't let facts get in the way of a pun, though.

As for the demo, it is powered by Unreal Engine 4 and runs on a Samsung Galaxy S7 with the Vulkan API. The setting seems to be some sort of futuristic laboratory that combines objects until it builds up into a star. It is bright and vibrant, with many particles, full-scene anti-aliasing, reflections, and other visual effects. The exact resolution when running on the phone was never stated, but the YouTube video was running at 1080p30, and the on-stage demo looked fairly high resolution, too.


Epic Games lists the features they added to mobile builds of Unreal Engine 4 for this demo:

  • Dynamic planar reflections
  • “Full” GPU particle support, which includes vector fields.
  • Temporal Anti-Aliasing, which blends neighboring frames to smooth jaggies in motion.
  • ASTC texture compression (created by ARM and AMD for OpenGL and OpenGL ES)
  • Full scene dynamic cascaded shadows
  • Chromatic aberration
  • Dynamic light refraction
  • Filmic tonemapping curve, which scales frames rendered in HDR to a presentable light range (see the sketch after this list)
  • Improved static reflections
  • High-quality depth of field
  • Vulkan API for thousands of onscreen, independent objects.
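On the tonemapping point, a filmic curve compresses HDR scene values into the displayable range while keeping a gentle toe and shoulder. Epic has not published which curve this build uses, so the sketch below uses John Hable's widely cited "Uncharted 2" operator purely as a generic example of the technique; the constants and function names are from that public reference, not from Unreal Engine 4.

```cpp
#include <algorithm>
#include <cmath>

// John Hable's "Uncharted 2" filmic curve, shown as a generic example of a
// filmic tonemapping operator (not necessarily what UE4 ships).
float hable(float x) {
    const float A = 0.15f, B = 0.50f, C = 0.10f, D = 0.20f, E = 0.02f, F = 0.30f;
    return ((x * (A * x + C * B) + D * E) / (x * (A * x + B) + D * F)) - E / F;
}

// Map an HDR value to a displayable [0, 1] range, then gamma-encode it.
float tonemap(float hdr, float exposure = 2.0f, float whitePoint = 11.2f) {
    float mapped = hable(hdr * exposure) / hable(whitePoint);
    return std::pow(std::clamp(mapped, 0.0f, 1.0f), 1.0f / 2.2f);
}
```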

The company has not stated which version of Unreal Engine 4 will receive these updates. I doubt that it will land in 4.11, which is planned for March, but they tend to release a full dot-version every one to three months. They also provide preview builds for those who wish to try new features early, some compiled in the lead-up to launch, and others that need to be built from GitHub.

Source: Epic Games

Unreal Editor for Unreal Engine 4 in VR

Subject: General Tech, Shows and Expos | February 5, 2016 - 12:47 AM |
Tagged: GDC, gdc 2016, epic games, ue4, VR, vive vr

Epic Games released Unreal Engine 4 at GDC two years ago, and removed its subscription fee at the following year's show. This year, one of the things they will show is Unreal Editor in VR with the HTC Vive. Using the system's motion controllers, you will be able to move objects and access UI panels in the virtual environment. They open the video by declaring that this is not an experimental project.


Without using this technology, it's hard to comment on its usability. It definitely looks interesting, and might be useful for VR experiences. You can see what your experience will look like as you create it, and you probably even save a bit of time in rapid iteration by not continually putting on and removing the headset. I wonder how precise it will be, though, since the laser pointers and objects seemed to snap and jitter a bit. That said, it might be just as precise, and even if it isn't, what really matters is how the result looks and behaves; it shouldn't prevent minor tweaks after the fact anyway.

Epic Games expects to discuss the release plans at the show.

Source: Epic Games