Podcast #362 - Benchmarking a Voodoo 3, Flash Memory Summit 2015, Skylake Delidding and more!

Subject: General Tech | August 13, 2015 - 01:14 PM |
Tagged: podcast, video, amd, nvidia, GTX 970, Zotac GTX 970 AMP! Extreme Core Edition, dx12, 3dfx, voodoo 3, Intel, SSD 750, NVMe, Samsung, R9 Fury, Fiji, gtx 950

PC Perspective Podcast #362 - 08/13/2015

Join us this week as we discuss Benchmarking a Voodoo 3, Flash Memory Summit 2015, Skylake Delidding and more!

You can subscribe to us through iTunes and you can still access it directly through the RSS page HERE.

The URL for the podcast is: http://pcper.com/podcast - Share with your friends!

  • iTunes - Subscribe to the podcast directly through the iTunes Store
  • RSS - Subscribe through your regular RSS reader
  • MP3 - Direct download link to the MP3 file

Hosts: Ryan Shrout, Jeremy Hellstrom, Josh Walrath, and Sebastian Peak

Subscribe to the PC Perspective YouTube Channel for more videos, reviews and podcasts!!

AMD Radeon R9 Fury Unlocked as Fury X, Overclocked to 1 GHz HBM

Subject: Graphics Cards | August 12, 2015 - 05:29 PM |
Tagged: STRIX R9 Fury, Radeon R9 Fury, overclocking, oc, LN2, hbm, fury x, asus, amd

What happens when you unlock an AMD Fury to have the Compute Units of a Fury X, and then overclock the snot out of it using LN2? User Xtreme Addict in the HWBot forums has created a comprehensive guide to do just this, and the results are incredible.

fury_ln2_01.JPG

Not for the faint of heart (image credit: Xtreme Addict)

"The steps include unlocking the Compute Units to enable Fury X grade performance, enabling the hotwire soldering pads, a 0.95v Rail mod, and of course the trimpot/hotwire VGPU, VMEM, VPLL (VDDCI) mods.

The result? A GPU frequency of 1450 MHz and HBM frequency of 1000 MHz. For the HBM that's a 100% overclock."

Beginning with a stock ASUS R9 Fury STRIX card, Xtreme Addict performed some surgery to fully unlock the voltage, and unlocked the Compute Units using a tool from this Overclock.net thread.

fury_ln2_02.jpg

The results? Staggering. HBM at 1000 MHz is double the rate of the stock Fury X, and a GPU core of 1450 MHz is a 400 MHz increase. So what kind of performance did this heavily overclocked card achieve?

"The performance goes up from 6237 points at default to 6756 after unlocking the CUs, then 8121 points after overclock on air cooling, to eventually end up at 9634 points when fully unleashed with liquid nitrogen."

Apparently they were able to push the card even further, ending up with a whopping 10033 score in 3DMark Fire Strike Extreme.
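To put those scores in perspective, here is a quick sketch of the relative gain at each step. The numbers are the ones quoted above; the labels and the helper name percent_gain are just ours for illustration.

```python
# Benchmark scores quoted in the article, in the order they were achieved.
scores = [
    ("stock R9 Fury", 6237),
    ("CUs unlocked", 6756),
    ("overclocked on air", 8121),
    ("overclocked on LN2", 9634),
    ("best LN2 run", 10033),
]


def percent_gain(new, base):
    """Relative improvement of new over base, in percent."""
    return (new - base) / base * 100.0


baseline = scores[0][1]
for label, score in scores:
    gain = percent_gain(score, baseline)
    print(f"{label:20s} {score:6d}  {gain:+6.1f}% vs stock")
```

By this math the CU unlock alone is worth roughly 8%, while the final LN2 run lands about 61% above the stock card.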

fury_ln2_03.JPG

While this method is far too extreme for 99% of enthusiasts, the idea of unlocking a retail Fury to the level of a Fury X through software/BIOS mods is much more accessible, as is the possibility of reaching much higher clocks through advanced cooling methods.

Unfortunately, if reading through this makes you want to run out and grab one of these STRIX cards, availability is still limited. Hopefully supply catches up to demand in the near future.

fury_strix.PNG

A quick look at stock status on Newegg for the featured R9 Fury card

Source: HWBot

3dfx Voodoo 3 2000 PCI Unboxing - What year is it??!?

Subject: Graphics Cards | August 12, 2015 - 04:43 PM |
Tagged: what year is it, voodoo 3, voodoo, video, unboxing, pci, 3dfx

What do you do when you have a new-in-box 3dfx Voodoo 3 2000 graphics card that gets some water damage? You do a classic unboxing and then try to get that PCI graphics card from 1999 up and running and playing some Unreal Tournament.
 
pic1.jpg
 
Were we successful?
 

The Steam Boy gets a new name and pre-order price

Subject: General Tech | August 12, 2015 - 04:06 PM |
Tagged: gaming, Steam Machine, valve, Smach Zero

The portable Steam machine previously referred to as the Steam Boy is now called the Smach Zero and you can pre-order it starting November 10th for $300. The device will feature a 5-inch 720p touch screen powered by an AMD Steppe Eagle SoC with a Jaguar-based CPU and GCN-based Radeon graphics. It will have 4GB of RAM onboard and 32GB of internal storage, with more available via an SD card slot, plus support for USB OTG. HEXUS was told the device should be able to handle Half-Life 2, Civilization V, Dota 2, Tropico 5, BioShock Infinite or Cities: Skylines on its integral display or output via the HDMI port. Check out more on the Smach Zero here.

692933fe-c1d2-4637-a4a5-2e32f13db182.jpg

"Smach Zero Steam Machine pre-order availability and pricing have both been confirmed by the device maker. Smach published a press release yesterday saying that the handheld will be available on pre-order from 10th November at a special introductory price of $299."

Source: HEXUS

GIGABYTE GTX 980 Ti G1 GAMING loves it when you overclock

Subject: Graphics Cards | August 12, 2015 - 02:44 PM |
Tagged: GTX 980 Ti G1 GAMING, gigabyte, GTX 980 Ti, factory overclocked

The Gigabyte GTX 980 Ti G1 GAMING card comes with a 1152MHz Base Clock and 1241MHz Boost Clock straight out of the box and uses two 8-pin power connectors as opposed to an 8 and a 6-pin. That extra power and the WINDFORCE 3X custom cooler help you when overclocking the card beyond the frequencies it ships at. [H]ard|OCP used OC GURU II to up the voltage provided to this card and reached an overclock that hit 1367MHz in game with a 7GHz clock for the VRAM. Manually they managed to go even further: the VRAM could reach 8GHz and the GPU clock was measured at 1535MHz in game, a rather significant increase. The overclock increased performance by around 10% in most of the tests, which makes this card impressive even before you consider some of the other beneficial features which you can read about at [H]ard|OCP.

1439202407lsYxTzB0s7_1_15_l.jpg

"Today we review a custom built retail factory overclocked GIGABYTE GTX 980 Ti G1 GAMING video card. This video card is built to overclock in every way. We'll take this video card, compare it to the AMD Radeon R9 Fury X and overclock the GIGABYTE GTX 980 Ti G1 GAMING to its highest potential. The overclocking potential is amazing."

Source: [H]ard|OCP

Looking at that Ubuntu phone? Hope you don't live in North America

Subject: General Tech | August 12, 2015 - 01:58 PM |
Tagged: ubuntu, smartphone, HSPA+, Aquaris E4.5, Aquaris E5 HD

The new Ubuntu powered Aquaris E4.5 and the Aquaris E5 HD are now available but thanks to North America's carriers not supporting HSPA+ properly, or in many cases at all, the best you could hope for on this side of the pond is a 2G connection. The chips inside the phones are quad-core ARM Cortex A7s running at 1.3GHz with Mali 400 graphics. The E5 has a 5" screen with a resolution of 720 x 1280, while the E4.5 is 4.5" in size with a 540 x 960 resolution. Overall the specs are not awe inspiring and the prices of roughly $190 and $220 seem a bit high but are certainly lower than what you would pay for a new Samsung or Apple product without a contract. If you are interested then follow the links from The Register to order one.

ubuntu_phones.jpg

"In a Tuesday blog post, Ubuntu maker Canonical said that BQ, its Spanish hardware partner, has opened a new online store where customers around the world can order the Aquaris E4.5 and the Aquaris E5 HD, the two current Ubuntu models."

Source: The Register

Qualcomm Introduces Adreno 5xx Architecture for Snapdragon 820

Subject: Graphics Cards, Processors, Mobile | August 12, 2015 - 07:30 AM |
Tagged: snapdragon 820, snapdragon, siggraph 2015, Siggraph, qualcomm, adreno 530, adreno

Despite the success of the Snapdragon 805 and even the 808, Qualcomm’s flagship Snapdragon 810 SoC had a tumultuous lifespan. Rumors and stories about the chip and an inability to run in phone form factors without overheating and/or draining battery life were rampant, despite the company’s insistence that the problem was fixed with a very quick second revision of the part. There are very few devices that used the 810 and instead we saw more of the flagship smartphones use the slightly cut back SD 808 or the SD 805.

Today at Siggraph Qualcomm starts the reveal of a new flagship SoC, Snapdragon 820. As the event coinciding with launch is a graphics-specific show, QC is focusing on a high level overview of the graphics portion of the Snapdragon 820, the updated Adreno 5xx architecture and associated designs and a new camera image signal processor (ISP) aiming to improve quality of photos and recording on our mobile devices.

sd820-1.jpg

A modern SoC from Qualcomm features many different processors working in tandem to impact the user experience on the device. While the only details we are getting today focus on the Adreno 530 GPU and Spectra ISP, other segments like connectivity (wireless), video processing and digital signal processing (DSP) are important parts of the computing story. And we are well aware that Qualcomm is readying its own 64-bit processor architecture for the Kryo CPU rather than implementing the off-the-shelf cores from ARM used in the 810.

We also know that Qualcomm is targeting a “leading edge” FinFET process technology for SD 820 and though we haven’t been able to confirm anything, it looks very likely that this chip will be built on the Samsung 14nm line that also built the Exynos 7420.

But over half of the processing on the upcoming Snapdragon 820 will focus on visual processing. From graphics to gaming to UI animations to image capture and video output, this chip’s die will be dominated by high performance visuals.

Qualcomm’s list of target goals for SD 820 visuals reads as you would expect: wanting perfection in every area. Wouldn’t we all love a phone or tablet that takes perfect photos each time, always focusing on the right things (or everything) with exceptional low light performance? Though a lesser known problem for consumers, having accurate color reproduction from capture, through processing and to the display would be a big advantage. And of course, we all want graphics performance that impresses and a user interface that is smooth and reliable while enabling new experiences that we haven’t even thought of in the mobile form factor. Qualcomm thinks that Snapdragon 820 will be able to deliver on all of that.

Continue reading about the new Adreno 5xx architecture!!

Source: Qualcomm

FMS 2015: Toshiba Announces QLC (4-bit MLC) 3D Archival Flash

Subject: Storage | August 11, 2015 - 08:40 PM |
Tagged: toshiba, ssd, FMS 2015, flash, BiCS, Archive, Archival, 3d

We occasionally throw around the '3-bit MLC' (Multi Level Cell) term in place of 'TLC' (Triple Level Cell) when talking about flash memory. Those terms are interchangeable, but some feel it is misleading as the former still contains the term MLC. At Toshiba's keynote today, they showed us why the former is important:

toshiba-keynote-3d-nand-fms-2015-custom-pc-review-6.jpg

Photo source: Sam Chen of Custom PC Review

That's right - QLC (Quadruple Level Cell), which is also 4-bit MLC, has been mentioned by Toshiba. As you can see at the right of that slide, storing four bits in a single flash cell means there are *sixteen* very narrow voltage ranges representing the stored data. That is a very hard thing to do, and even harder to do with high performance (programming/writing would take a relatively long time as the circuitry nudges the voltages to such a precise level). This is why Toshiba pitched this flash as a low cost solution for archival purposes. You wouldn't want to use this type of flash in a device that was written constantly, since the channel materials wearing out would have a much more significant effect on endurance. Suiting this flash to be written only a few times would keep it in a 'newer' state that would be effective for solid state data archiving.
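The relationship between bits per cell and the number of voltage ranges is simple enough to sketch. This is only the basic 2^n arithmetic; the function name is ours, and the "share of the window" figure ignores guard bands and everything else a real flash design has to account for.

```python
def voltage_levels(bits_per_cell):
    """Number of distinct voltage ranges needed to store the given bits."""
    return 2 ** bits_per_cell


cell_types = {"SLC": 1, "MLC": 2, "TLC (3-bit MLC)": 3, "QLC (4-bit MLC)": 4}

for name, bits in cell_types.items():
    levels = voltage_levels(bits)
    # Idealized: each level gets an equal slice of the same voltage window.
    print(f"{name}: {bits} bit(s) -> {levels} levels, 1/{levels} of the window each")
```

Each extra bit halves the room available to every level, which is why going from TLC's eight ranges to QLC's sixteen is such a jump.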

The 1x / 0.5x / 6x figures appearing in the slide are meant to compare relative endurance to Toshiba's own planar 15nm flash. The figures suggest that Toshiba's BiCS 3D flash is efficient enough to go to QLC (4-bit) levels and still maintain a higher margin than their current MLC (2-bit) 2D flash.

More to follow as we continue our Flash Memory Summit coverage!

We hear you like Skylake-U news

Subject: Processors | August 11, 2015 - 06:39 PM |
Tagged: skylake-u, Intel

Fanless Tech just posted slides of Skylake-U, the ultraportable version of Skylake, all of which have an impressively low TDP of 15W which can be reduced to either 10W or in some cases all the way down to 7.5W. As they have done previously, all use BGA packaging, which means you will not be able to upgrade them, nor are you likely to see them in desktops; not necessarily a bad thing for this segment of the mobile market but certainly worth noting.

sku1.png

There will be two i7 models and two i5 along with a single i3 version, the top models of which, the Core i7-6600U and Core i5-6300U, sport a slightly increased frequency and support for vPro. Those two models, along with the i7-6500U and i5-6200U, will have the Intel HD graphics 520 with frequencies of 300/1050 for the i7's and 300/1000 for the i5 and i3 chips.

sku2.png

Along with the Core models will come a single Pentium chip, the 4405U, and a pair of Celerons, the 3955U and 3855U. They will have HD510 graphics, clocks of 300/950 or 300/900 for the Celerons and you will see slight reductions in PCIe and storage subsystems on the 4405U and 3855U. The naming scheme is less confusing than some previous generations, a boon for those with family or friends looking for a new laptop who are perhaps not quite as obsessed with processors as we are.

sku3.png

 

Source: Fanless Tech

FMS 2015: Samsung's New 256Gbit VNAND Enables 16TB PM1633a Datacenter SSD

Subject: Storage | August 11, 2015 - 04:59 PM |
Tagged: Samsung, vnand, 48-layer, tlc, 16TB, FMS 2015

I get these emails and comments all the time - "I want a larger capacity SSD". Ok, here ya go:

DSC04114_DxO.jpg

Samsung's earlier 48-layer VNAND announcement was exciting, but we already knew about it going into the keynote. What we did not know was that Samsung was going to blow the doors off of their keynote when they dropped this little gem. It's not just the largest capacity SSD, as this thing is denser than any HDD available today as well. That's 16TB of 48-layer TLC VNAND packed into a 2.5" form factor SAS-connected SSD.

...now what do you do once you have such a high density device? Well, you figure out how many you can cram into a 2U chassis of course!

DSC04155_DxO.jpg

Yup, that's 48 of those new SSDs, making for a capacity of 768TB in a 2U chassis. Samsung described this as a "JBOF" (Just a Bunch Of Flash), so processing the 2 million IOPS this array is capable of will have to be left to the connected system.
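For those who want to check the math, here is a back-of-the-envelope sketch. The chassis figure comes straight from the numbers above; the die count is our own rough estimate, assuming raw capacity only (no over-provisioning or spare area) and binary prefixes, so treat it as a ballpark rather than a spec.

```python
DRIVE_CAPACITY_TB = 16    # per PM1633a, as announced
DRIVES_PER_CHASSIS = 48   # as shown in the 2U demo
DIE_CAPACITY_GBIT = 256   # new 48-layer TLC VNAND die


def chassis_capacity_tb(drives, tb_per_drive):
    """Total capacity of a chassis full of identical drives."""
    return drives * tb_per_drive


def dies_per_drive(drive_tb, die_gbit):
    """Rough die count, assuming raw capacity and 1 TB = 1024 GB."""
    die_gb = die_gbit // 8  # 256 Gbit -> 32 GB
    return (drive_tb * 1024) // die_gb


print(chassis_capacity_tb(DRIVES_PER_CHASSIS, DRIVE_CAPACITY_TB), "TB per 2U chassis")
print(dies_per_drive(DRIVE_CAPACITY_TB, DIE_CAPACITY_GBIT), "dies per drive (estimate)")
```

Under those assumptions, each 16TB drive needs on the order of 512 of the new 256Gbit dies.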

No word on pricing, but I'd think we are in 'mortgage the house' territory if you want to put this into your home PC.

There is more to follow from Flash Memory Summit, but for now I've got to run to another meeting!

FMS 2015: *UPDATED* Samsung Adds Layers to its 3D VNAND, Doubling Capacity While Reducing Power Consumption

Subject: Storage | August 11, 2015 - 04:39 PM |
Tagged: vnand, tlc, Samsung, FMS 2015, 48-layer, 32GB, 32-layer, 256Gbit

Samsung recently added 2TB capacity parts to their 850 EVO SATA SSDs, but today’s announcement may double that. Today at Flash Memory Summit, Samsung has announced a new iteration on their 3D VNAND technology.

Picture5.png

Cross section of Samsung 32-layer VNAND. (TechInsights)

The announcement is a new TLC 3D VNAND (the type present in the 850 EVO Series). The new parts consist of an updated die with the following improvements:

  • 48 layer VNAND - up from 32 layers of the previous generation
  • 256Gbit (32GB) capacity - up from 128Gbit (16GB) capacity of 32-layer VNAND
  • 30% reduction in power consumption over 32-layer VNAND

48_Main.jpg

Samsung’s new 48-layer VNAND.

I suspected Samsung would go this route in order to compete with the recent announcements from Intel/Micron and SanDisk. Larger die capacities may not be the best thing for keeping performance high in smaller capacity SSDs (a higher number of smaller capacity dies helps there), but it is definitely a good capability to have since higher capacity per die translates to more efficient flash die production.

The Samsung keynote is at noon today (Pacific), and I will update this piece with any photos relevant to the announcement after that keynote.

*UPDATE*

I just got out of the Samsung keynote. There were some additional slides with data relevant to this post:

DSC04064_DxO.jpg

This image simply shows the additional vertical stacking, but adds that Samsung has this new flash in production right now.

DSC04062_DxO.jpg

The new higher capacity dies enable 1.4x greater density per wafer (realize that this does not mean more dies per wafer, as the image incorrectly suggests).

DSC04071_DxO.jpg

The power consumption improvements (right) were in the press release, but the speed improvements (left) were not. A 2x improvement in per-die speeds means that Samsung should not see a performance hit if they migrate their existing 128Gbit TLC VNAND SSDs over to these new 256Gbit parts. Speaking of which...

DSC04075_DxO.jpg

Not only is this new VNAND being produced *this month*, Samsung is retrofitting their 850 EVO line with the new parts. Again, we expect no performance delta but will likely retest these new versions just to double check for any outliers.

There was some more great info from the keynote, but that will appear in another post later today.

Samsung’s press blast appears after the break.

Source: Samsung

Is this the GTX 950?

Subject: Graphics Cards | August 11, 2015 - 02:54 PM |
Tagged: rumour, nvidia, gtx 950

Rumours of the impending release of a GTX 950 and perhaps even a GTX 950 Ti continue to spread, most recently at Videocardz who have developed a reputation for this kind of report. Little is known at this time and the specifications have not been confirmed, but they have found a page showing an ASUS STRIX GTX 950 with 2GB of memory and a DirectCUII cooler. The prices shown are unlikely to represent the actual retail price, even in Finland where the capture is from.

PNY-GeForce-GTX-950.jpg

Also spotted is a PNY GTX 950 retail box which shows us little in the way of details; the power plug is facing away from the camera so we are still unsure how many power plugs will be needed. Videocardz also reiterates their belief from the first leak that the card will use 75% of a GM206 Maxwell graphics processor, with 768 CUDA cores and a 128-bit interface.

Source: Videocardz

It's been a busy year for phones so far ... who is coming out on top?

Subject: Mobile | August 11, 2015 - 01:39 PM |
Tagged: smartphones, Moto G, galaxy s6, LG G4, iphone 6, HTC One M9, blackphone

The Inquirer has taken a look back at the past year's smartphone releases with an eye towards providing a resource to help you compare them. So far there are 11 phones in their round up, including the somewhat maligned Blackphone which was intended to be completely secure but turned out to be a little less invulnerable than advertised. An overview of each phone is provided covering basic statistics such as screen size and resolution and often the processor inside. As you would expect they also include a link to their reviews of the phone and they plan on updating the article as new phones are released.

blackphone-new-imagery-540x334.png

"THE SMARTPHONE MARKET is becoming increasingly competitive, make it harder and harder for buyers to choose which handset is right for them."

Source: The Inquirer

Windows 10 for everything arrives

Subject: General Tech | August 11, 2015 - 12:52 PM |
Tagged: windows 10, iot, raspberry pi 2

The slimmed down version of Windows 10 for devices such as the Raspberry Pi 2 has arrived and it is royalty free for makers, available right here. The Register describes some problems with the current version, mostly incompatibility with certain peripherals but also including occasional video crashes or networking issues. Seeing as how this particular incarnation of the OS is designed for creative minds tinkering on custom hardware, the issues are not unexpected, nor should you consider them proof the OS is not usable if you plan on tinkering with it. You will need a full PC for development with Windows 10 and Visual Studio 2015 to start using the slimmed down Windows 10, nothing new but certainly worth noting. Check out more on the Universal Windows Platform and Windows 10 for the IoT at The Register.

RPi2_0.png

"Microsoft has shipped the public release of Windows 10 IoT Core, the pared-down version of Windows 10 for embedded devices, including the Intel MinnowBoard Max and the Raspberry Pi 2."

Source: The Register

Overclock any NVIDIA GPU on Desktop and Mobile with a New Utility

Subject: Graphics Cards | August 10, 2015 - 06:14 PM |
Tagged: overclocking, overclock, open source, nvidia, MSI Afterburner, API

An author called "2PKAQWTUQM2Q7DJG" (likely not a real name) has published a fascinating little article today on his/her Wordpress blog entitled, "Overclocking Tools for NVIDIA GPUs Suck. I Made My Own". What it contains is a full account of the process of creating an overclocking tool beyond the constraints of common utilities such as MSI Afterburner.

By probing MSI's OC utility using Ollydbg (an x86 "assembler level analysing debugger") the author was able to track down how Afterburner was working.

nvapiload.png

“nvapi.dll” definitely gets loaded here using LoadLibrary/GetModuleHandle. We’re on the right track. Now where exactly is that lib used? ... That’s simple, with the program running and the realtime graph disabled (it polls NvAPI constantly adding noise to the mass of API calls). we place a memory breakpoint on the .Text memory segment of the NVapi.dll inside MSI Afterburner’s process... Then we set the sliders in the MSI tool to get some negligible GPU underclock and hit the “apply” button. It breaks inside NvAPI… magic!

After further explaining the process and his/her source code for an overclocking utility, the user goes on to show the finished product in the form of a command line utility.

overclock.png

There is a link to the finished version of this utility at the end of the article, as well as the entire process with all source code. It makes for an interesting read (even for the painfully inept at programming, such as myself), and the provided link to download this mysterious overclocking utility (disguised as a JPG image file, no less) makes it both tempting and a little dubious. Does this really allow overclocking any NVIDIA GPU, including mobile? What could be the harm in trying?? In all seriousness, however, since some of what was seemingly uncovered in the article is no doubt proprietary, how long will this information be available?

It would probably be wise to follow the link to the Wordpress page ASAP!

Source: Wordpress

Another look at Shuttle's DS57U barebones SFF system

Subject: Systems | August 10, 2015 - 03:52 PM |
Tagged: Celeron 3205U, DS57U, shuttle, SFF

Madshrimps have just wrapped up testing the Intel Celeron 3205U powered Shuttle DS57U, a SFF system which can be VESA-mounted to the back of a monitor or placed beside your monitor in the included stand. The presence of two serial ports, WOL and resume after power outage means this little system could also be used for industrial or POS duties. It is worth noting that this system only supports 1.35V SODIMMs; make sure to choose the proper RAM to avoid disappointment. Check out the full review here; if you like the case but not the CPU there are i3, i5 and even an i7 model for you to consider.

intro.jpg

"Shuttle has built the DS57U inside a proven chassis, which takes quite little space and succeeds to cool the internal components without the need of extra fans; one of the case laterals is acting like a huge heatsink and in this case it only remains warm even when the system is stressed to the max."

Source: Mad Shrimps

Meet the ADATA XPG SX930 family of SSDs

Subject: Storage | August 10, 2015 - 03:07 PM |
Tagged: adata, XPG SX930, JMF670H

ADATA's new XPG SX930 series is aimed at enthusiasts on a budget: the 120GB is about $65, the 240GB $110 and the 480GB $200. The SSDs use the JMicron JMF670H controller, not one we have seen before, and they also have a pseudo SLC cache which grows with the size of the drive, from 4GB to 8GB to 16GB for the 480GB model. The SSD Review tested out all three drives and found that the advertised speeds of 550MB/s read and 460MB/s write were more or less accurate and the drives did fairly well in their other tests as well. If you need more speedy storage and are on a budget you should check out their full review.
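If you are shopping on price, a quick cost-per-gigabyte comparison is handy. This little sketch uses the approximate prices quoted above; the helper name is ours and actual street prices will vary.

```python
# Approximate prices from the article: capacity in GB -> price in USD.
sx930_prices = {120: 65, 240: 110, 480: 200}


def price_per_gb(price_usd, capacity_gb):
    """Cost per gigabyte in dollars."""
    return price_usd / capacity_gb


for capacity, price in sorted(sx930_prices.items()):
    print(f"{capacity}GB: ${price} -> ${price_per_gb(price, capacity):.2f}/GB")
```

At these prices the 480GB model works out cheapest per gigabyte, at roughly 42 cents versus about 54 cents for the 120GB drive.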

Adata-XPG-SX930-SSD-Family-1024x683.jpg

"ADATA has memory products for all sections of the market, from consumer to industrial. As of late they have released a new consumer SSD, the XPG SX930. It is marketed towards the gamer and overclocker crowd at a pretty competitive price point."

Running a small Win7 Domain and having bandwidth issues today?

Subject: General Tech | August 10, 2015 - 12:58 PM |
Tagged: windows 10, oops, microsoft

Microsoft promised that Windows 10 would not be pushed out to computers on a Domain, or at least that you would be able to block the update; a claim which has turned out to be slightly less than accurate. If you are running a Windows 7 Domain which still relies on Microsoft Update as opposed to WSUS you may have noticed some serious traffic spikes this morning. That is because some, perhaps all, of your computers are slurping down the 3GB Windows 10 update. Check the Register for links to Microsoft and consider blocking Microsoft Update on your firewall until this has been sorted, unless you like a slow network and living dangerously.

images.jpg

"The problem is affecting domain-attached Windows 7 PCs not signed up to Windows Server Update Services (WSUS) for patches and updates, but looking for a Microsoft update instead."

Source: The Register

Manufacturer: PC Perspective

It's Basically a Function Call for GPUs

Mantle, Vulkan, and DirectX 12 all claim to reduce overhead and provide a staggering increase in “draw calls”. As mentioned in the previous editorial, the way tasks are loaded onto the graphics card changes drastically in these new APIs. With DirectX 10 and earlier, applications would assign attributes to (what they are told is) the global state of the graphics card. After everything is configured and bound, one of a few “draw” functions is called, which queues the task in the graphics driver as a “draw call”.

While this suggests that just a single graphics device is to be defined, which we also mentioned in the previous article, it also implies that one thread needs to be the authority. This limitation was known about for a while, and it contributed to the meme that consoles can squeeze all the performance they have, but PCs are “too high level” for that. Microsoft tried to combat this with “Deferred Contexts” in DirectX 11. This feature allows virtual, shadow states to be loaded from secondary threads, which can be appended to the global state, whole. It was a compromise between each thread being able to create its own commands, and the legacy decision to have a single, global state for the GPU.

Some developers experienced gains, while others lost a bit. It didn't live up to expectations.

pcper-2015-dx12-290x.png

The paradigm used to load graphics cards is the problem. It doesn't make sense anymore. A developer might not want to draw a primitive with every poke of the GPU. At times, they might want to shove a workload of simple linear algebra through it, while other requests could simply be pushing memory around to set up a later task (or to read the result of a previous one). More importantly, any thread could want to do this to any graphics device.
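To make the idea concrete, here is a toy model in plain Python. It is not DirectX 12, Vulkan, or Mantle code, and every class and function name in it is made up for illustration. Each thread records its own list of commands (copies, compute, draws) with no shared global state, then hands the finished list to a common queue for submission.

```python
import threading
from queue import Queue


class CommandList:
    """Hypothetical per-thread recording buffer; no shared global state."""

    def __init__(self, name):
        self.name = name
        self.commands = []

    def record(self, kind, payload):
        self.commands.append((kind, payload))


def record_work(thread_id, submit_queue):
    """Each thread builds its own command list, then submits it whole."""
    cl = CommandList(f"thread-{thread_id}")
    cl.record("copy", f"upload buffer {thread_id}")       # push memory around
    cl.record("compute", f"matrix multiply {thread_id}")  # linear algebra
    cl.record("draw", f"mesh {thread_id}")                # classic draw
    submit_queue.put(cl)


def run(num_threads=4):
    submit_queue = Queue()
    threads = [
        threading.Thread(target=record_work, args=(i, submit_queue))
        for i in range(num_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    submitted = []
    while not submit_queue.empty():
        submitted.append(submit_queue.get())
    return submitted


for cl in run():
    print(cl.name, [kind for kind, _ in cl.commands])
```

The order in which the lists arrive is not fixed, which is the point: no single thread has to be the authority over what the GPU is told to do.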

pcper-2015-dx12-980.png

The new graphics APIs allow developers to submit their tasks quicker and smarter, and they allow the drivers to schedule compatible tasks better, even simultaneously. In fact, the driver's job has been massively simplified altogether. When we tested 3DMark back in March, two interesting things were revealed:

  • Both AMD and NVIDIA are only a two-digit percentage of draw call performance apart
  • Both AMD and NVIDIA saw an order of magnitude increase in draw calls

Read on to see what this means for games and game development.

Khronos Group at SIGGRAPH 2015

Subject: Graphics Cards, Processors, Mobile, Shows and Expos | August 10, 2015 - 09:01 AM |
Tagged: vulkan, spir, siggraph 2015, Siggraph, opengl sc, OpenGL ES, opengl, opencl, Khronos

When the Khronos Group announced Vulkan at GDC, they mentioned that the API is coming this year, and that this date is intended to under promise and over deliver. Recently, fans were hoping that it would be published at SIGGRAPH, which officially began yesterday. Unfortunately, Vulkan has not been released. It does hold a significant chunk of the news, however. Also, it's not like DirectX 12 is holding a commanding lead at the moment. The headers were public only for a few months, and the code samples are less than two weeks old.

khronos-2015-siggraph-sixapis.png

The organization made announcements for six products today: OpenGL, OpenGL ES, OpenGL SC, OpenCL, SPIR, and, as mentioned, Vulkan. They wanted to make their commitment to all of their standards clear. Vulkan is urgent, but some developers will still want the framework of OpenGL. Bind what you need to the context, then issue a draw and, if you do it wrong, the driver will often clean up the mess for you anyway. The briefing was structured to make it evident that OpenGL is still on their mind, which is likely why they made sure three OpenGL logos greeted me in their slide deck as early as possible. They are also taking and closely examining feedback about who wants to use Vulkan or OpenGL, and why.

As for Vulkan, confirmed platforms have been announced. Vendors have committed to drivers on Windows 7, 8, 10, Linux, including Steam OS, and Tizen (OSX and iOS are absent, though). Beyond all of that, Google will accept Vulkan on Android. This is a big deal, as Google, despite its open nature, has been avoiding several Khronos Group standards. For instance, Nexus phones and tablets do not have OpenCL drivers, although Google isn't stopping third parties from rolling it into their devices, like Samsung and NVIDIA. Direct support of Vulkan should help cross-platform development as well as, and more importantly, target the multi-core, relatively slow threaded processors of those devices. This could even be of significant use for web browsers, especially in sites with a lot of simple 2D effects. Google is also contributing support from their drawElements Quality Program (dEQP), which is a conformance test suite that they bought back in 2014. They are going to expand it to Vulkan, so that developers will have more consistency between devices -- a big win for Android.

google-android-opengl-es-extensions.jpg

While we're not done with Vulkan, one of the biggest announcements is OpenGL ES 3.2 and it fits here nicely. At around the time that OpenGL ES 3.1 brought Compute Shaders to the embedded platform, Google launched the Android Extension Pack (AEP). This absorbed OpenGL ES 3.1 and added Tessellation, Geometry Shaders, and ASTC texture compression to it. It also created more tension between Google and cross-platform developers, who felt like Google was trying to pull its developers away from Khronos Group. Today, OpenGL ES 3.2 was announced and includes each of the AEP features, plus a few more (like “enhanced” blending). Better yet, Google will support it directly.

Next up are the desktop standards, before we finish with a resurrected embedded standard.

OpenGL has a few new extensions added. One interesting one is the ability to assign locations to multi-samples within a pixel. There is a whole list of sub-pixel layouts, such as rotated grid and Poisson disc. Apparently this extension allows developers to choose it, as certain algorithms work better or worse for certain geometries and structures. There were probably vendor-specific extensions for a while, but now it's a ratified one. Another extension allows “streamlined sparse textures”, which helps manage data where the number of unpopulated entries outweighs the number of populated ones.
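For a feel of what a sub-pixel layout looks like, here is a small sketch comparing an ordered 2x2 grid with one common 4-sample rotated-grid arrangement. It is plain Python, not the extension's API, and the function names are ours. The rotated layout comes from rotating the ordered grid by atan(1/2) and scaling it by sqrt(5)/2, which reduces to the simple linear map in the code.

```python
def regular_grid_2x2():
    """Offsets from the pixel center for an ordered 2x2 sample grid."""
    return [(-0.25, -0.25), (0.25, -0.25), (-0.25, 0.25), (0.25, 0.25)]


def rotate_and_scale(offsets):
    """Rotate by atan(1/2) and scale by sqrt(5)/2, folded into one linear map."""
    return [(x - 0.5 * y, 0.5 * x + y) for x, y in offsets]


def to_pixel_coords(offsets):
    """Shift center-relative offsets into the [0, 1] pixel square."""
    return [(0.5 + dx, 0.5 + dy) for dx, dy in offsets]


ordered = to_pixel_coords(regular_grid_2x2())
rotated = to_pixel_coords(rotate_and_scale(regular_grid_2x2()))

print("ordered grid:", ordered)
print("rotated grid:", rotated)
```

In the ordered grid the samples share only two distinct x and two distinct y positions, while in the rotated grid every sample sits on its own row and column. That is the usual argument for why rotated grids tend to handle near-horizontal and near-vertical edges better.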

OpenCL 2.0 was given a refresh, too. It contains a few bug fixes and clarifications that will help it be adopted. C++ headers were also released, although I cannot comment much on them. I do not know the state that OpenCL 2.0 was in before now.

And this is when we make our way back to Vulkan.

khronos-2015-siggraph-spirv.png

SPIR-V, the code that runs on the GPU (or other offloading device, including the other cores of a CPU) in OpenCL and Vulkan, is seeing a lot of community support. Projects are under way to allow developers to write GPU code in several interesting languages: Python, .NET (C#), Rust, Haskell, and many more. The slide lists nine that Khronos Group knows about, but those four are pretty interesting. Again, this is saying that you can write code in the aforementioned languages and have it run directly on a GPU. Curiously missing is HLSL, and the President of Khronos Group agreed that it would be a useful language. The ability to cross-compile HLSL into SPIR-V means that shader code written for DirectX 9, 10, 11, and 12 could be compiled for Vulkan. He expects that it won't take long for a project to start, and that one might already be happening somewhere outside his Google abilities. Regardless, those who are afraid to program in the C-like GLSL and HLSL shading languages might find C# and Python to be a bit more their speed, and they seem to be happening through SPIR-V.

As mentioned, we'll end on something completely different.

khronos-2015-siggraph-sc.png

For several years, the OpenGL SC working group has been on hiatus. This group defines standards for graphics (and soon GPU compute) in “safety critical” applications. For the longest time, this meant aircraft. The dozens of planes (which I assume meant dozens of models of planes) that adopted this technology were fine with a fixed-function pipeline. It has been about ten years since OpenGL SC 1.0 launched, which was based on OpenGL ES 1.0. SC 2.0 is planned to launch in 2016, which will be based on the much more modern OpenGL ES 2 and ES 3 APIs that allow pixel and vertex shaders. The Khronos Group is asking for participation to direct SC 2.0, as well as a future graphics and compute API that is potentially based on Vulkan.

The devices that this platform intends to target are: aircraft (again), automobiles, drones, and robots. There are a lot of ways that GPUs can help these devices, but they need a good API to certify against. It needs to withstand more than an Ouya, because crashes could be much more literal.