Subject: Storage, Shows and Expos | September 9, 2014 - 06:00 PM | Allyn Malventano
Tagged: ssd, SMR, pcie, NVMe, idf 2014, idf, hgst, hdd, 10TB
It's the first day of IDF, so it's only natural that we see a bunch of non-IDF news start pouring out :). I'll kick them off with a few announcements from HGST. First item up is their new SN100 line of PCIe SSDs:
These are NVMe-capable PCIe SSDs, available in capacities from 800GB to 3.2TB, in both 2.5" (PCIe-based, not SATA) and half-height PCIe card form factors.
Next up is an expansion of their HelioSeal (Helium filled) drive line:
Through the use of Shingled Magnetic Recording (SMR), HGST can make an even bigger improvement in storage densities. This does not come completely free: because of the way SMR writes to the disk, it is primarily meant to be a sequential write / random access read storage device. Picture roofing shingles, but for hard drives. The tracks are slightly overlapped as they are written to disk. This increases density greatly, but writing to the middle of a shingled section is not possible without potentially overwriting two shingled tracks simultaneously. Think of it as CD-RW writing, but for hard disks. This tech is primarily geared towards 'cold storage', or data that is not actively being written. Think archival data. The ability to still read that data randomly and on demand makes these drives more appealing than retrieving that same data from tape-based archival methods.
Further details on the above releases are scarce at present, but we will keep you posted as they develop.
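To make the sequential-write restriction concrete, here is a toy model of a single shingled band. It is only a sketch under simplifying assumptions: the `ShingledZone` class, its method names, and the idea of treating a band as a simple list of tracks are all made up for illustration and say nothing about how HGST's firmware actually manages the drive.

```python
class ShingledZone:
    """Toy model of one shingled band (hypothetical, for illustration only)."""

    def __init__(self, num_tracks):
        self.tracks = [None] * num_tracks
        self.write_pointer = 0  # next track that can be written safely

    def append(self, data):
        """Sequential write: safe, because nothing overlaps the new track yet."""
        if self.write_pointer >= len(self.tracks):
            raise IOError("zone is full")
        self.tracks[self.write_pointer] = data
        self.write_pointer += 1

    def read(self, index):
        """Random reads work at any already-written track."""
        if index >= self.write_pointer:
            raise IndexError("track not written yet")
        return self.tracks[index]

    def rewrite(self, index, data):
        """Change a track mid-band by rewriting it and everything after it.

        Returns the number of tracks that had to be written.
        """
        if index >= self.write_pointer:
            raise IndexError("track not written yet")
        tail = self.tracks[index + 1:self.write_pointer]  # save downstream tracks
        self.write_pointer = index                         # rewind the band
        self.append(data)
        for old in tail:                                   # lay the shingles back down
            self.append(old)
        return 1 + len(tail)


if __name__ == "__main__":
    zone = ShingledZone(8)
    for i in range(6):
        zone.append(f"track-{i}")
    print("tracks rewritten for one mid-band update:", zone.rewrite(2, "new"))
```

Appending costs one track write, while updating a track in the middle costs a write for every track after it in the band, which is why the tech suits archival data.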
Subject: Processors, Shows and Expos | September 9, 2014 - 03:02 PM | Ryan Shrout
Tagged: idf, idf 2014, Intel, keynote, live blog
Today is the beginning of the 2014 Intel Developer Forum in San Francisco! Join me at 9am PT for the first of our live blogs of the main Intel keynote where we will learn what direction Intel is taking on many fronts!
Subject: General Tech, Shows and Expos | September 2, 2014 - 09:51 PM | Scott Michaud
Tagged: nvidia, game24, pc gaming
At 6PM PDT on September 18th, 2014, NVIDIA and partners will be hosting GAME24. The event will start at that time, all around the world, and finish 24 hours later. The three main event locations are Los Angeles, California, USA; London, England; and Shanghai, China. Four smaller events will be held in Chicago, Illinois, USA; Indianapolis, Indiana, USA; Mission Viejo, California, USA; and Stockholm, Sweden. It will also be live streamed on the official website.
Registration and attendance are free. If you will be in the area and want to join, sign up. Registration closes an hour before the event, but it is first-come, first-served. Good luck. Have fun. Good game.
Subject: General Tech, Shows and Expos | August 22, 2014 - 08:53 PM | Jeremy Hellstrom
Tagged: richard huddy, kick ass, amd
Join AMD’s Chief Gaming Scientist, Richard Huddy on Saturday, Aug. 23, 2014 at 10:00 AM EDT/7:00 AM PDT to celebrate 30 Years of Graphics and Gaming. The event will feature interviews with Raja Koduri, AMD’s Corporate VP, Visual Computing; John Byrne, AMD’s Senior VP and General Manager, Computing and Graphics Business Group; and several special guests. You can also expect new product announcements along with stories covering the history of AMD. You can watch the twitch.tv livestream below once the festivities kick off!
There is also a contest for those who follow @AMDRadeon and retweet their tweet of "Follow @AMDRadeon Tune into #AMD30Live 8/23/14 at 9AM CT www.amd.com/AMD30Live – Follow & Retweet for a chance to win! www.amd.com/AMD30Live"
Subject: General Tech, Graphics Cards, Shows and Expos | August 16, 2014 - 12:33 AM | Scott Michaud
Tagged: siggraph 2014, Siggraph, OpenGL Next, opengl 4.5, opengl, nvidia, Mantle, Khronos, Intel, DirectX 12, amd
Let's be clear: there are two stories here. The first is the release of OpenGL 4.5 and the second is the announcement of the "Next Generation OpenGL Initiative". They both appear in the same press release, but they are two different statements.
OpenGL 4.5 Released
OpenGL 4.5 expands the core specification with a few extensions. Compatible hardware, with OpenGL 4.5 drivers, will be guaranteed to support these. This includes features like direct_state_access, which allows objects to be accessed and modified without first binding them to the context, and support for the OpenGL ES 3.1 features that were traditionally missing from OpenGL 4, which makes it easier to port OpenGL ES 3.1 applications to OpenGL.
It also adds a few new extensions as an option:
ARB_pipeline_statistics_query lets a developer ask the GPU what it has been doing. This could be useful for "profiling" an application (list completed work to identify optimization points).
ARB_sparse_buffer allows developers to perform calculations on pieces of generic buffers, without loading them entirely into memory. This is similar to ARB_sparse_texture... except that one is for textures. Buffers are useful for things like vertex data (and so forth).
ARB_transform_feedback_overflow_query is apparently designed to let developers choose whether or not to draw objects based on whether the buffer is overflowed. I might be wrong, but it seems like this would be useful for deciding whether or not to draw objects generated by geometry shaders.
KHR_blend_equation_advanced allows new blending equations between objects. If you use Photoshop, this would be "multiply", "screen", "darken", "lighten", "difference", and so forth. On NVIDIA's side, this will be directly supported on Maxwell and Tegra K1 (and later). Fermi and Kepler will support the functionality, but the driver will perform the calculations with shaders. AMD has yet to comment, as far as I can tell.
Image from NVIDIA GTC Presentation
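For anyone who has not met the Photoshop-style blend modes that KHR_blend_equation_advanced is compared to, the commonly cited per-channel formulas are sketched below. This is not taken from the extension text: it ignores alpha entirely, which real blending has to account for, and it assumes color channels normalized to the 0.0 to 1.0 range. The function names are just labels for this example.

```python
def multiply(src, dst):
    """Darkens: the product of two values in [0, 1] is never brighter than either."""
    return src * dst


def screen(src, dst):
    """Brightens: the inverse of multiplying the inverted colors."""
    return src + dst - src * dst


def darken(src, dst):
    """Keeps whichever channel value is darker."""
    return min(src, dst)


def lighten(src, dst):
    """Keeps whichever channel value is brighter."""
    return max(src, dst)


def difference(src, dst):
    """Absolute difference between the two channel values."""
    return abs(src - dst)


if __name__ == "__main__":
    src, dst = 0.5, 0.5
    for mode in (multiply, screen, darken, lighten, difference):
        print(mode.__name__, mode(src, dst))
```

Each function works on a single channel, so a full-color blend would apply it to red, green, and blue separately.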
If you are a developer, NVIDIA has launched 340.65 (340.23.01 for Linux) beta drivers for developers. If you are not looking to create OpenGL 4.5 applications, do not get this driver. You really should not have any use for it, at all.
Next Generation OpenGL Initiative Announced
The Khronos Group has also announced "a call for participation" to outline a new specification for graphics and compute. They want it to allow developers explicit control over CPU and GPU tasks, be multithreaded, have minimal overhead, have a common shader language, and "rigorous conformance testing". This sounds a lot like the design goals of Mantle (and what we know of DirectX 12).
And really, from what I hear and understand, that is what OpenGL needs at this point. Graphics cards look nothing like they did a decade ago (or over two decades ago). They each have very similar interfaces and data structures, even if their fundamental architectures vary greatly. If we can draw a line in the sand, legacy APIs can be supported but not optimized heavily by the drivers. After a short time, available performance for legacy applications would be so high that it wouldn't matter, as long as they continue to run.
On top of that, next-generation drivers should be significantly easier to develop, considering the reduced error checking (and other responsibilities). As I said on Intel's DirectX 12 story, it is still unclear whether it will lead to enough of a performance increase to make most optimizations, such as those which increase workload or developer effort in exchange for queuing fewer GPU commands, unnecessary. We will need to wait for game developers to use it for a bit before we know.
Subject: General Tech, Graphics Cards, Processors, Mobile, Shows and Expos | August 14, 2014 - 01:55 AM | Scott Michaud
Tagged: siggraph 2014, Siggraph, microsoft, Intel, DirectX 12, directx 11, DirectX
Along with GDC Europe and Gamescom, Siggraph 2014 is going on in Vancouver, BC. At it, Intel had a DirectX 12 demo at their booth. This scene, containing 50,000 asteroids, each in its own draw call, was developed on both Direct3D 11 and Direct3D 12 code paths and could apparently be switched between them while the demo was running. Intel claims to have measured both power and frame rate.
Variable power to hit a desired frame rate, DX11 and DX12.
The test system is a Surface Pro 3 with an Intel HD 4400 GPU. Doing a bit of digging, this would make it the i5-based Surface Pro 3. Removing another shovel-load of mystery, this would be the Intel Core i5-4300U with two cores, four threads, a 1.9 GHz base clock, up to a 2.9 GHz turbo clock, 3MB of cache, and (of course) based on the Haswell architecture.
While not top-of-the-line, it is also not bottom-of-the-barrel. It is a respectable CPU.
Intel's demo on this processor shows a significant power reduction in the CPU, and even a slight decrease in GPU power, for the same target frame rate. When power is not throttled, Intel's demo goes from 19 FPS all the way up to a playable 33 FPS.
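For a sense of scale, the frame rates above can be turned into a per-draw-call time budget. This is only back-of-the-envelope arithmetic: it assumes the whole frame is spent issuing the 50,000 draw calls, so it gives an upper bound on the cost of each one, not a measurement. The function names are made up for this example.

```python
DRAW_CALLS = 50_000  # asteroids in the demo scene, one draw call each


def frame_time_ms(fps):
    """Time available for one frame, in milliseconds."""
    return 1000.0 / fps


def max_cost_per_draw_us(fps, draw_calls=DRAW_CALLS):
    """Upper bound on per-draw-call time in microseconds.

    Assumes the entire frame is spent on draw calls (a deliberate simplification).
    """
    return frame_time_ms(fps) * 1000.0 / draw_calls


if __name__ == "__main__":
    for label, fps in (("DX11", 19), ("DX12", 33)):
        print(f"{label}: {frame_time_ms(fps):.1f} ms/frame, "
              f"at most {max_cost_per_draw_us(fps):.2f} us per draw call")
    print(f"frame rate ratio: {33 / 19:.2f}x")
```

Under that assumption, each draw call has roughly a microsecond or less to get through the CPU side, which is why the per-call overhead of the API matters so much in this scene.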
Intel will discuss more during a video interview, tomorrow (Thursday) at 5pm EDT.
Maximum power in DirectX 11 mode.
For my contribution to the story, I would like to address the first comment on the MSDN article. It claims that this is just an "ideal scenario" of a scene that is bottlenecked by draw calls. The thing is: that is the point. Sure, a game developer could optimize the scene to (maybe) instance objects together, and so forth, but that is unnecessary work. Why should programmers, or worse, artists, need to spend so much of their time developing art so that it can be batched together into fewer, bigger commands? Would it not be much easier, and all-around better, if the content could be developed as it most naturally comes together?
That, of course, depends on how much performance improvement we will see from DirectX 12, compared to theoretical max efficiency. If pushing two workloads through a DX12 GPU takes about the same time as pushing one, double-sized workload, then it allows developers to, literally, perform whatever solution is most direct.
Maximum power when switching to DirectX 12 mode.
If, on the other hand, pushing two workloads is 1000x slower than pushing a single, double-sized one, but DirectX 11 was 10,000x slower, then it could be less relevant because developers will still need to do their tricks in those situations. The closer the two costs get, the fewer occasions there are where strict optimization is necessary.
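The trade-off in the last few paragraphs can be sketched with a very simple cost model: each submission pays a fixed overhead, plus the actual work. Everything here is hypothetical, including the function names, the linear model itself, and the example numbers, which are not measurements of DirectX 11, DirectX 12, or Mantle.

```python
def submit_cost(num_submissions, overhead_per_submission, total_work):
    """Total cost when the same work is split across num_submissions commands."""
    return num_submissions * overhead_per_submission + total_work


def naive_penalty(num_objects, overhead_per_submission, total_work):
    """How many times slower one-command-per-object is than one batched command."""
    naive = submit_cost(num_objects, overhead_per_submission, total_work)
    batched = submit_cost(1, overhead_per_submission, total_work)
    return naive / batched


if __name__ == "__main__":
    objects = 50_000   # one command per object, as in the asteroid demo
    work = 10_000.0    # made-up units of real GPU/CPU work for the whole scene
    for label, overhead in (("high-overhead API", 1.0), ("low-overhead API", 0.1)):
        print(f"{label}: naive path is "
              f"{naive_penalty(objects, overhead, work):.2f}x the batched cost")
```

In this toy model, shrinking the per-submission overhead pulls the naive path toward the batched one, and once the ratio is close to 1, batching stops being worth the effort.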
If there are any DirectX 11 game developers, artists, and producers out there, we would like to hear from you. How much would a (let's say) 90% reduction in draw call latency (which is around what Mantle claims) give you, in terms of fewer required optimizations? Can you afford to solve problems "the naive way" now? Some of the time? Most of the time? Would it still be worth it to do things like object instancing and fewer, larger materials and shaders? How often?
Subject: Storage, Shows and Expos | August 7, 2014 - 09:37 PM | Allyn Malventano
Tagged: ssd, SM2256, silicon motion, sata, FMS 2014, FMS
Silicon Motion has announced their SM2256 controller. We caught a glimpse of this new controller on the Flash Memory Summit show floor:
The big deal here is the fact that this controller is a complete drop-in solution that can drive multiple different types of flash, as seen below:
The SM2256 can drive all variants of TLC flash.
The controller itself looks to have decent specs, considering it is meant to drive 1xnm TLC flash: just under 100k random 4k IOPS. Writes, at 400MB/sec, are understandably below the saturation point of SATA 6Gb/sec (writing to TLC is tricky!). There is also mention of Silicon Motion's NANDXtend Technology, which claims to add some extra ECC and DSP tech to improve the ability to correct bit errors in the flash (which become more likely as you venture into 8-level-per-cell territory).
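Two bits of arithmetic sit behind those numbers, sketched here. The SATA figure assumes only the 8b/10b line encoding is subtracted from the 6Gb/sec line rate (real drives lose a little more to protocol overhead), and the cell math relies on TLC storing 3 bits per cell. The function names are made up for this example.

```python
def sata_payload_mb_per_s(line_rate_gbps=6.0):
    """Payload ceiling of a SATA link after 8b/10b encoding.

    8b/10b puts 10 bits on the wire for every payload byte.
    """
    return line_rate_gbps * 1000 / 10


def levels_per_cell(bits_per_cell):
    """Distinct voltage levels a flash cell must resolve to store that many bits."""
    return 2 ** bits_per_cell


if __name__ == "__main__":
    ceiling = sata_payload_mb_per_s()
    print(f"SATA 6Gb/sec payload ceiling: {ceiling:.0f} MB/sec")
    print(f"400 MB/sec writes use {400 / ceiling:.0%} of that")
    for name, bits in (("SLC", 1), ("MLC", 2), ("TLC", 3)):
        print(f"{name}: {bits} bit(s) per cell, {levels_per_cell(bits)} levels")
```

Every extra bit doubles the number of levels squeezed into the same voltage window, which is why error correction gets more important with TLC.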
Subject: Storage, Shows and Expos | August 7, 2014 - 09:25 PM | Allyn Malventano
Tagged: ssd, sata, PS5007, PS3110, phison, pcie, FMS 2014, FMS
At the Flash Memory Summit, Phison has updated their SSD controller lineup with a new quad-core SSD controller.
The PS3110 is capable of handling TLC as well as MLC flash, and the added horsepower lets it push as high as 100k IOPS.
Also seen was an upcoming PS5007 controller, capable of pushing PCIe 3.0 x4 SSDs at 300k IOPS and close to 3GB/sec sequential throughput. While there were no actual devices using this new controller on display, we did spot the full specs:
Full press blast on the PS3110 appears after the break:
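To put the PS5007's "close to 3GB/sec" in context, the raw link ceiling can be worked out from the per-lane transfer rate and line encoding of each PCIe generation. This sketch only subtracts encoding overhead, so real-world throughput will land somewhat lower once packet and protocol overhead are included. The function name is made up for this example.

```python
# Per-lane transfer rate in GT/s and line-encoding efficiency per PCIe generation.
PCIE_GENERATIONS = {
    1: (2.5, 8 / 10),     # 8b/10b encoding
    2: (5.0, 8 / 10),     # 8b/10b encoding
    3: (8.0, 128 / 130),  # 128b/130b encoding
}


def pcie_link_gb_per_s(generation, lanes):
    """Link ceiling in GB/sec after line encoding (no packet overhead modeled)."""
    rate_gt_per_s, efficiency = PCIE_GENERATIONS[generation]
    return rate_gt_per_s * efficiency / 8 * lanes


if __name__ == "__main__":
    ceiling = pcie_link_gb_per_s(3, 4)
    print(f"PCIe 3.0 x4 ceiling: {ceiling:.2f} GB/sec")
    print(f"3 GB/sec is {3 / ceiling:.0%} of that")
```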
Subject: General Tech, Storage, Shows and Expos | August 7, 2014 - 06:17 PM | Scott Michaud
Tagged: ssd, phase change memory, PCM, hgst, FMS 2014, FMS
According to an HGST press release, the company will bring an SSD based on phase change memory to the 2014 Flash Memory Summit in Santa Clara, California. They claim that it will actually be at their booth, on the show floor, for two days (August 6th and 7th).
The device, which is not branded, connects via PCIe 2.0 x4. It is designed for speed. It is allegedly capable of 3 million IOPS, with just 1.5 microseconds required for a single access. For comparison, the 800GB Intel SSD DC P3700, recently reviewed by Allyn, had a dominating lead over the competitors that he tested. It was just shy of 250 thousand IOPS. This is, supposedly, about twelve times faster.
While it is based on a different technology than NAND, and thus not directly comparable, the PCM chips are apparently manufactured at 45nm. Regardless, that is significantly larger lithography than competing products. Intel is manufacturing their flash at 20nm, while Samsung managed to use a 30nm process for their recent V-NAND launch.
What does concern me is the capacity per chip. According to the press release, it is 1Gb per chip. That is about two orders of magnitude smaller than what NAND is pushing. That is, also, the only reference to capacity in the entire press release. It makes me wonder how small the total drive capacity will be, especially compared to RAM drives.
Of course, because it does not seem to be a marketed product yet, nothing was said about pricing or availability. It will almost definitely be aimed at the enterprise market, though (especially given HGST's track record).
*** Update from Allyn ***
I'm hijacking Scott's news post with photos of the actual PCM SSD, from the FMS show floor:
In case you all are wondering, yes, it does in fact work:
One of the advantages of PCM is that it is addressed in smaller sections than typical flash memory. This means you can see ~700k *single sector* random IOPS at QD=1. You can only pull off that sort of figure with extremely low IO latency. They only showed this output at their display, but ramping up QD > 1 should reasonably lead to the 3 million figure claimed in their release.
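The relationship between latency, queue depth, and IOPS here is just Little's law, and a quick sketch shows the quoted figures hang together. It assumes the 1.5 microsecond access time from the press release stays constant as queue depth rises (which real devices rarely manage) and that a "single sector" is 512 bytes; both are assumptions, and the function names are made up for this example.

```python
def iops_at_queue_depth(latency_us, queue_depth=1):
    """Little's law: throughput = requests in flight / time per request."""
    return queue_depth / (latency_us * 1e-6)


def queue_depth_needed(target_iops, latency_us):
    """Requests that must be in flight to reach target_iops at a fixed latency."""
    return target_iops * latency_us * 1e-6


def throughput_gb_per_s(iops, io_bytes=512):
    """Data rate implied by an IOPS figure (512-byte sectors assumed)."""
    return iops * io_bytes / 1e9


if __name__ == "__main__":
    print(f"QD=1 at 1.5 us: {iops_at_queue_depth(1.5):,.0f} IOPS")
    print(f"queue depth for 3M IOPS: {queue_depth_needed(3_000_000, 1.5):.1f}")
    print(f"3M IOPS of 512B sectors: {throughput_gb_per_s(3_000_000):.3f} GB/sec")
```

At 1.5 microseconds per access, QD=1 works out to roughly two thirds of a million IOPS, in line with the ~700k seen on the show floor, and a queue depth of about 4.5 would be enough for 3 million.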
Subject: Storage, Shows and Expos | August 6, 2014 - 07:03 PM | Allyn Malventano
Tagged: ssd, pcie, NVMe, Marvell, FMS 2014, FMS, controller, 88SS1093
Marvell is well known for being the first to bring a 6Gb/sec SATA controller to market, and they continue to do very well in that area. Their very capable 88SS9189 controller powers the Crucial MX100 and M550, as well as the ADATA SP920.
Today they have announced a newer controller, the 88SS1093. Despite the confusing numbering, the 88SS1093 has a PCIe 3.0 x4 host interface and will support the full NVMe protocol. The provided specs are on the light side, as performance of this controller will ultimately depend on the speed and parallelism of the attached flash, but it's sure to be a decent performer. I suspect it would behave like their SATA part, only no longer bottlenecked by SATA 6Gb/sec speeds.
More to follow as I hope to see this controller in person in the exhibition hall (which opens to press in a few hours). Full press blast after the break.
*** Update ***
Apologies, as there was no photo to be taken - Marvell had no booth in the exhibition space at FMS.