Khronos Announces "Next" OpenGL & Releases OpenGL 4.5

Subject: General Tech, Graphics Cards, Shows and Expos | August 15, 2014 - 08:33 PM |
Tagged: siggraph 2014, Siggraph, OpenGL Next, opengl 4.5, opengl, nvidia, Mantle, Khronos, Intel, DirectX 12, amd

Let's be clear: there are two stories here. The first is the release of OpenGL 4.5 and the second is the announcement of the "Next Generation OpenGL Initiative". Both appear in the same press release, but they are two different announcements.

OpenGL 4.5 Released

OpenGL 4.5 expands the core specification with a few extensions. Compatible hardware, with OpenGL 4.5 drivers, is guaranteed to support them. This includes features like ARB_direct_state_access, which allows modifying objects without first binding them to the context, and support for OpenGL ES 3.1 features that were traditionally missing from OpenGL 4, which makes it easier to port OpenGL ES 3.1 applications to OpenGL.

opengl_logo.jpg

It also adds a few new extensions as an option:

ARB_pipeline_statistics_query lets a developer ask the GPU what it has been doing. This could be useful for profiling an application (listing completed work to identify optimization points).

ARB_sparse_buffer allows developers to perform calculations on pieces of generic buffers without committing physical memory for the whole thing. This is similar to ARB_sparse_texture... except that extension is for textures. Buffers are useful for things like vertex data (and so forth).

ARB_transform_feedback_overflow_query is apparently designed to let developers choose whether or not to draw objects based on whether a buffer has overflowed. I might be wrong, but it seems like this would be useful for deciding whether to draw objects generated by geometry shaders.

KHR_blend_equation_advanced adds new blending equations between objects. If you use Photoshop, these would be "multiply", "screen", "darken", "lighten", "difference", and so forth. On NVIDIA's side, this will be directly supported on Maxwell and Tegra K1 (and later). Fermi and Kepler will support the functionality, but the driver will perform the calculations with shaders. AMD has yet to comment, as far as I can tell.

nvidia-opengl-debugger.jpg

Image from NVIDIA GTC Presentation

If you are a developer, NVIDIA has launched 340.65 (340.23.01 for Linux) beta drivers with OpenGL 4.5 support. If you are not looking to create OpenGL 4.5 applications, do not get this driver; you really should not have any use for it at all.

Next Generation OpenGL Initiative Announced

The Khronos Group has also announced "a call for participation" to outline a new specification for graphics and compute. They want it to give developers explicit control over CPU and GPU tasks, be multithreaded, have minimal overhead, use a common shader language, and undergo "rigorous conformance testing". This sounds a lot like the design goals of Mantle (and what we know of DirectX 12).

amd-mantle-queues.jpg

And really, from what I hear and understand, that is what OpenGL needs at this point. Graphics cards look nothing like they did a decade ago (let alone over two decades ago). They now expose very similar interfaces and data structures, even if their fundamental architectures vary greatly. If we draw a line in the sand, legacy APIs can still be supported, just not heavily optimized by the drivers. Before long, available performance would be so high that it wouldn't matter for legacy applications, as long as they continue to run.

On top of that, next-generation drivers should be significantly easier to develop, considering the reduced error checking (and other responsibilities). As I said in Intel's DirectX 12 story, it is still unclear whether this will yield enough of a performance increase to make most optimizations unnecessary, such as those that increase workload or developer effort in exchange for queuing fewer GPU commands. We will need to wait for game developers to use it for a while before we know.

Intel and Microsoft Show DirectX 12 Demo and Benchmark

Subject: General Tech, Graphics Cards, Processors, Mobile, Shows and Expos | August 13, 2014 - 09:55 PM |
Tagged: siggraph 2014, Siggraph, microsoft, Intel, DirectX 12, directx 11, DirectX

Along with GDC Europe and Gamescom, Siggraph 2014 is going on in Vancouver, BC. At it, Intel had a DirectX 12 demo at their booth. The scene, containing 50,000 asteroids, each drawn with its own draw call, was developed on both Direct3D 11 and Direct3D 12 code paths, which could apparently be switched while the demo is running. Intel claims to have measured both power and frame rate.

intel-dx12-LockedFPS.png

Variable power to hit a desired frame rate, DX11 and DX12.

The test system is a Surface Pro 3 with an Intel HD 4400 GPU. Doing a bit of digging, this would make it the Core i5-based Surface Pro 3. Removing another shovel-load of mystery, that would be the Intel Core i5-4300U: two cores, four threads, 1.9 GHz base clock, up to 2.9 GHz turbo clock, 3 MB of cache, and (of course) the Haswell architecture.

While not top-of-the-line, it is also not bottom-of-the-barrel. It is a respectable CPU.

Intel's demo on this processor shows a significant power reduction in the CPU, and even a slight decrease in GPU power, for the same target frame rate. When power is not throttled, Intel's demo goes from 19 FPS all the way up to a playable 33 FPS.

Intel will discuss more during a video interview, tomorrow (Thursday) at 5pm EDT.

intel-dx12-unlockedFPS-1.jpg

Maximum power in DirectX 11 mode.

For my contribution to the story, I would like to address the first comment on the MSDN article. It claims that this is just an "ideal scenario" of a scene that is bottlenecked by draw calls. The thing is: that is the point. Sure, a game developer could optimize the scene to (maybe) instance objects together, and so forth, but that is unnecessary work. Why should programmers, or worse, artists, need to spend so much of their time developing art so that it can be batched together into fewer, bigger commands? Would it not be much easier, and all-around better, if the content could be developed as it most naturally comes together?

That, of course, depends on how much performance improvement we will see from DirectX 12 compared to theoretical maximum efficiency. If pushing two workloads through a DX12 GPU takes about the same time as pushing one double-sized workload, then developers can, literally, implement whatever solution is most direct.

intel-dx12-unlockedFPS-2.jpg

Maximum power when switching to DirectX 12 mode.

If, on the other hand, pushing two workloads is 1,000x slower than pushing a single double-sized one, but DirectX 11 was 10,000x slower, then it could matter less, because developers will still need their tricks in those situations. The closer the two get, the fewer occasions on which strict optimization is necessary.

If there are any DirectX 11 game developers, artists, or producers out there, we would like to hear from you. How much would a (let's say) 90% reduction in draw call latency (which is around what Mantle claims) buy you in terms of fewer required optimizations? Could you afford to solve problems "the naive way"? Some of the time? Most of the time? Would it still be worth it to do things like object instancing and fewer, larger materials and shaders? How often?

FMS 2014: Silicon Motion announces new SM2256 controller driving 1xnm TLC NAND

Subject: Storage, Shows and Expos | August 7, 2014 - 05:37 PM |
Tagged: ssd, SM2256, silicon motion, sata, FMS 2014, FMS

Silicon Motion has announced their SM2256 controller. We caught a glimpse of this new controller on the Flash Memory Summit show floor:

DSC04256.JPG

The big deal here is the fact that this controller is a complete drop-in solution that can drive multiple different types of flash, as seen below:

DSC04258.JPG

The SM2256 can drive all variants of TLC flash.

The controller itself looks to have decent specs, considering it is meant to drive 1xnm TLC flash: just under 100k random 4K IOPS. Writes, at 400 MB/sec, are understandably below the saturation point of SATA 6Gb/sec (writing to TLC is tricky!). There is also mention of Silicon Motion's NANDXtend Technology, which claims to add extra ECC and DSP tech to improve the ability to correct bit errors in the flash (errors become more likely as you venture into three-bit-per-cell territory, with its eight voltage states).

Press blast after the break:

FMS 2014: Phison announces new quad-core PS3110 SATA 6Gb/s SSD controller

Subject: Storage, Shows and Expos | August 7, 2014 - 05:25 PM |
Tagged: ssd, sata, PS5007, PS3110, phison, pcie, FMS 2014, FMS

At the Flash Memory Summit, Phison has updated their SSD controller lineup with a new quad-core SSD controller.

DSC04264.JPG

The PS3110 is capable of handling TLC as well as MLC flash, and the added horsepower lets it push as high as 100k IOPS.

DSC04260.JPG

Also seen was an upcoming PS5007 controller, capable of pushing PCIe 3.0 x4 SSDs at 300k IOPS and close to 3GB/sec sequential throughputs. While there were no actual devices on display of this new controller, we did spot the full specs:

DSC04263.JPG

Full press blast on the PS3110 appears after the break:

Source: Phison

FMS 2014: HGST Claims 3 Million IOPS and 1.5us Access Time SSD - updated with pics

Subject: General Tech, Storage, Shows and Expos | August 7, 2014 - 02:17 PM |
Tagged: ssd, phase change memory, PCM, hgst, FMS 2014, FMS

According to an HGST press release, the company will bring an SSD based on phase change memory to the 2014 Flash Memory Summit in Santa Clara, California. They claim that it will actually be at their booth, on the show floor, for two days (August 6th and 7th).

The device, which is not branded, connects via PCIe 2.0 x4. It is designed for speed. It is allegedly capable of 3 million IOPS, with just 1.5 microseconds required for a single access. For comparison, the 800GB Intel SSD DC P3700, recently reviewed by Allyn, had a dominating lead over the competitors he tested at just shy of 250 thousand IOPS. This new device is, supposedly, about twelve times faster.

HGST_CompanyLogo.png

While it is based on a different technology than NAND, and thus not directly comparable, the PCM chips are apparently manufactured at 45nm. Regardless, that is a significantly larger lithography than competing products: Intel is manufacturing its flash at 20nm, while Samsung managed a 30nm-class process for its recent V-NAND launch.

What does concern me is the capacity per chip. According to the press release, it is 1Gb per chip, about two orders of magnitude smaller than what NAND is pushing. That is also the only reference to capacity in the entire press release. It makes me wonder how small the total drive capacity will be, especially compared to RAM drives.

Of course, because this does not seem to be a marketed product yet, there is nothing about pricing or availability. It will almost definitely be aimed at the enterprise market, though (especially given HGST's track record).

*** Update from Allyn ***

I'm hijacking Scott's news post with photos of the actual PCM SSD, from the FMS show floor:

DSC04122.JPG

DSC04124.JPG

In case you all are wondering, yes, it does in fact work:

DSC04125.JPG

DSC04126.JPG

DSC04127.JPG

One of the advantages of PCM is that it can be addressed in smaller units than typical flash memory. This means you can see ~700k *single sector* random IOPS at QD=1. You can only pull off that sort of figure with extremely low IO latency. They only showed this output at their display, but ramping QD above 1 should reasonably lead to the 3 million figure claimed in their release.

Source: HGST

FMS 2014: Marvell announces new 88SS1093 PCIe SSD controller

Subject: Storage, Shows and Expos | August 6, 2014 - 03:03 PM |
Tagged: ssd, pcie, NVMe, Marvell, FMS 2014, FMS, controller, 88SS1093

Marvell is notable for being the first to bring a 6Gb/sec SATA controller to market, and they continue to do very well in that area. Their very capable 88SS9189 controller powers the Crucial MX100 and M550, as well as the ADATA SP920.

chip-shot-88SS1093.jpg

Today they announced a newer controller, the 88SS1093. Despite the confusing numbering, the 88SS1093 has a PCIe 3.0 x4 host interface and will support the full NVMe protocol. The provided specs are on the light side, as performance of this controller will ultimately depend on the speed and parallelism of the attached flash, but it's sure to be a decent performer. I suspect it would behave like their SATA part, only no longer bottlenecked by SATA 6Gb/sec speeds.

More to follow, as I hope to see this controller in person in the exhibition hall (which opens to press in a few hours). Full press blast after the break.

*** Update ***

Apologies, as there was no photo to be taken - Marvell had no booth in the exhibition space at FMS.

Source: Marvell

FMS 2014: Samsung announces 3D TLC VNAND, Storage Intelligence initiative

Subject: Storage, Shows and Expos | August 5, 2014 - 04:19 PM |
Tagged: FMS, vnand, tlc, ssd, Samsung, FMS 2014, Flash Memory Summit

Just minutes ago at the Flash Memory Summit, Samsung announced the production of 32-layer TLC VNAND:

DSC03974.JPG

This is the key to production of a soon-to-be-released 850 EVO, which should combine the excellent performance of the 850 Pro with the reduced cost we saw with the previous generation 840 EVO. Here's what the progression to 3D VNAND looks like:

progression slide.png

3D TLC VNAND will look identical to the rightmost image in the above slide; the difference is that each cell must distinguish more stored-charge levels. Given that Samsung's VNAND tech has more volume to store electrons compared to competing 2D planar flash, it's a safe bet that this new TLC will come with higher endurance ratings than those other technologies. There is much more information on Samsung's VNAND technology on page 1 of our 850 Pro review. Be sure to check that out if you haven't already!

Another announcement was more of an initiative, but a very interesting one at that. SSDs are generally dumb when it comes to coordinating with the host - in that there is virtually no coordination. An SSD has no idea which pieces of files were meant to be grouped together, etc. (top half of this slide):

DSC04016.JPG

Data comes into the SSD, and the controller places it where it can, based on its best guess as to how to optimize those writes. What you'd want, ideally, is a more intelligent method of coordination between the host system and the SSD (more like the bottom half of the above slide). Samsung has been dabbling in the possibilities here and has seen some demonstrable gains. In a system where they made the host software aware of the SSD flash layout, and vice versa, they were able to significantly reduce write latency during high-IOPS activity.

DSC04014.JPG

The key is that if the host software has more control over where and how data is stored on the SSD, the end result is a much more optimized write pattern, which ultimately boosts overall throughput and IOPS. Storage Intelligence is still in the experimentation stage, with more to follow as standards are developed and the industry pushes forward.

It might be a while before we see Storage Intelligence go mainstream, but I'm definitely eager to see 3D TLC VNAND hit the market, and now we know it's coming! More to follow in the coming days as we continue our live coverage of the Flash Memory Summit!

PC Perspective Hardware Workshop 2014 @ Quakecon 2014 in Dallas, TX

Subject: Editorial, General Tech, Shows and Expos | July 23, 2014 - 04:43 PM |
Tagged: workshop, video, streaming, quakecon, prizes, live, giveaways

UPDATE: The event is over, but the video is embedded below if you want to see the presentations! Thanks again to everyone who attended and to all of our sponsors!

It is that time of year again: another installment of the PC Perspective Hardware Workshop!  Once again we will be presenting on the main stage at Quakecon 2014, held in Dallas, TX, July 17-20th.

logo-1500px.jpg

Main Stage - Quakecon 2014

Saturday, July 19th, 12:00pm CT

Our thanks go out to the organizers of Quakecon for allowing us and our partners to put together a show that we are proud of every year.  We love giving back to the community of enthusiasts and gamers that drive us to do what we do!  Get ready for two hours of prizes, games, and raffles - and the chances are pretty good that you'll take something home with you. Really, they are pretty good!

Our primary partners at the event are those that threw in for our ability to host the workshop at Quakecon and for the hundreds of shirts we have ready to toss out!  Our thanks to NVIDIA, Seasonic, and Logitech!!

nvidia_logo_small.png

seasonic-transparent.png

logitech-transparent.png

Live Streaming

If you can't make it to the workshop - don't worry!  You can still watch the workshop live on our live page as we stream it over one of several online services.  Just remember this URL: http://pcper.com/live and you will find your way!


PC Perspective LIVE Podcast and Meetup

We are planning on hosting any fans that want to watch us record our weekly PC Perspective Podcast (http://pcper.com/podcast) on Wednesday or Thursday evening in our meeting room at the Hilton Anatole.  I don't yet know exactly WHEN or WHERE it will be, but I will update this page accordingly on Wednesday, July 16th, when we get the details.  You might also consider following me on Twitter for updates on that status as well.

After the recording, we'll hop over to the hotel bar for a couple drinks and hang out.  We have room for at least 50-60 people to join us in the room, but we'll still be recording even if just ONE of you shows up.  :)

Prize List (will continue to grow!)

Continue reading to see the list of prizes for the workshop!!!

John Carmack's Replacement at id: Tiago Sousa (Crytek)

Subject: General Tech, Shows and Expos | July 19, 2014 - 05:13 PM |
Tagged: quakecon 2014, quakecon, id, crytek

Tiago Sousa was "Lead R&D Graphics Engineer" at Crytek, according to his now-defunct Twitter account, "@CRYTEK_TIAGO". According to his new Twitter account, "@idSoftwareTiago", he will be joining id Software to help with DOOM and idTech 6.

id-logo.jpg

A little less DOOM and gloom.

I find this more interesting because idTech 5 has not exactly seen much usage outside of RAGE. Wolfenstein: The New Order was also released on the technology two months ago. There is one other game planned -- and that is it. Sure, RAGE is almost three years old, and the engine was first revealed in 2007, making it basically seven-year-old technology. Still, that is a significant investment with almost no return, especially considering that RAGE's sales figures were not too impressive (Steam and other digital delivery services excluded).

I also cannot tell whether this looks positive for id, after mixed comments from current and former employees (or people who claim to be), or bad for Crytek. The latter company has been rumored to be hurting for cash since 2011 and has seen the departure of many employees. I expect that there will be more to this story in the coming months and years.

Source: Twitter

Win a BYOC Seat at Quakecon 2014!

Subject: Shows and Expos | July 9, 2014 - 05:27 PM |
Tagged: workshop, quakecon, contest, byoc

Are you interested in attending Quakecon 2014 next weekend in Dallas, TX but just can't swing the BYOC spot? Well, thanks to our friends at Quakecon and at PC Part Picker, we have two BYOC spots up for grabs for fans of PC Perspective!

qconlogo-small.jpg

While we are excited to be hosting our PC Perspective Hardware Workshop, with thousands of dollars in giveaways to pass out on Saturday the 19th, I know that the big draw is the chance to spend Thursday, Friday, and Saturday at North America's largest LAN party.

logo-1500px.jpg

The giveaway is simple. 

  1. Fill out the form below with your name and email address.
     
  2. Make sure you are able and willing to attend Quakecon from July 17th - July 20th. There is no point in winning a free BYOC spot that you cannot use!
     
  3. We'll pick a winner on Friday, July 11th so you'll have enough time to make plans.

There you have it. Get to it, guys, and we'll see you in Dallas!