AM3+ Keeps Chugging Along
Consumers cannot say that MSI has not attempted to keep the AM3+ market interesting with a handful of new products based upon that socket. Throughout this past year MSI has released three different products addressing multiple price points and feature sets. The 970 Gaming was the first, the 970 KRAIT introduced USB 3.1 to the socket, and the latest 990FXA-Gaming board provides the most feature-rich implementation of the socket plus USB 3.1.
AMD certainly has not done the platform any real favors as of late in terms of new CPUs and architectures to inhabit that particular socket. The last refresh we had was around a year ago with the release of the FX-8370 and 8370e. These are still based on the Piledriver based Vishera core that was introduced three years ago. Unlike the GPU market, the CPU market has certainly not seen the leaps and bounds in overall performance that we had enjoyed in years past.
MSI has taken the now geriatric 990FX (based upon the 890FX chipset released in 2010; I think AMD might have gotten their money out of this particular chipset iteration) and implemented it in a new design that embraces many of the top-end features that are desired by enthusiasts. AMD still has a solid following and their products are very competitive from a price/performance standpoint (check out Ryan’s price/perf graphs from his latest Intel CPU review).
The packing material is pretty basic. Just cardboard and no foam. Still, fits nicely and is quite snug.
The idea behind the 990FXA-Gaming is to provide a very feature-rich product that appeals to gamers and enthusiasts. The key is to provide those features at a price point that will not scare away the budget enthusiasts. Just as MSI has done with the 970 Gaming, there were decisions made to keep costs down. We will get into these tradeoffs shortly.
To the Max?
Much of the PC enthusiast internet, including our comments section, has been abuzz with “Asynchronous Shader” discussion. Normally, I would explain what it is and then outline the issues that surround it, but I would like to swap that order this time. Basically, the Ashes of the Singularity benchmark utilizes Asynchronous Shaders in DirectX 12, but they disable it (by Vendor ID) for NVIDIA hardware. They say that this is because, while the driver reports compatibility, “attempting to use it was an unmitigated disaster in terms of performance and conformance”.
AMD's Robert Hallock claims that NVIDIA GPUs, including Maxwell, cannot support the feature in hardware at all, while all AMD GCN graphics cards do. NVIDIA has yet to respond to our requests for an official statement, although we haven't poked every one of our contacts yet. We will certainly update and/or follow up if we hear from them. For now though, we have no idea whether this is a hardware or software issue. Either way, it seems more than just politics.
So what is it?
Simply put, Asynchronous Shaders allows a graphics driver to cram workloads in portions of the GPU that are idle, but not otherwise available. For instance, if a graphics task is hammering the ROPs, the driver would be able to toss an independent physics or post-processing task into the shader units alongside it. Kollock from Oxide Games used the analogy of HyperThreading, which allows two CPU threads to be executed on the same core at the same time, as long as it has the capacity for it.
Kollock also notes that compute is becoming more important in the graphics pipeline, and it is possible to completely bypass graphics altogether. The fixed-function bits may never go away, but it's possible that at least some engines will completely bypass them -- maybe even their engine, several years down the road.
But, like always, you will not get an infinite amount of performance by reducing your waste. You are always bound by the theoretical limits of your components, and you cannot optimize past that (except for obviously changing the workload itself). The interesting part is: you can measure that. You can absolutely observe how long a GPU is idle, and represent it as a percentage of a time-span (typically a frame).
And, of course, game developers profile GPUs from time to time...
According to Kollock, he has heard of some console developers getting up to 30% increases in performance using Asynchronous Shaders. Again, this is on console hardware, so this amount may increase or decrease on the PC. I also had an informal chat with a developer at Epic Games, so a massive grain of salt is required: his late-night, “totally speculative” ballpark guesstimate is that, on the Xbox One, the GPU could theoretically accept a maximum of ~10-25% more work in Unreal Engine 4, depending on the scene. He also said that memory bandwidth gets in the way, which Asynchronous Shaders would be fighting against. It is something that they are interested in and investigating, though.
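The idle-time measurement described above can be turned into a rough ceiling on how much extra work a GPU unit could absorb. This is a minimal sketch, not a profiler: the function names and the 20 ms / 16 ms timings are hypothetical, and it assumes the idle time could be filled perfectly, which the memory bandwidth caveat above suggests is optimistic.

```python
def idle_fraction(busy_ms: float, frame_ms: float) -> float:
    """Share of the frame during which the unit did no work."""
    return 1.0 - busy_ms / frame_ms


def max_extra_work(busy_ms: float, frame_ms: float) -> float:
    """Upper bound on additional work, relative to the work already done.

    Assumes every idle millisecond could be filled with useful work,
    which real hardware will not achieve.
    """
    return (frame_ms - busy_ms) / busy_ms


# Hypothetical profile: the unit is busy for 16 ms of a 20 ms frame.
frame_ms, busy_ms = 20.0, 16.0
print(f"idle: {idle_fraction(busy_ms, frame_ms):.0%}")
print(f"at most {max_extra_work(busy_ms, frame_ms):.0%} more work fits")
```

Note that the two percentages differ: a unit that sits idle for a fifth of the frame could, at best, take on a quarter more work than it currently does.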
This is where I speculate on drivers. When Mantle was announced, I looked at its features and said “wow, this is everything that a high-end game developer wants, and a graphics developer absolutely does not”. From the OpenCL-like multiple GPU model taking much of the QA out of SLI and CrossFire, to the memory and resource binding management, this should make graphics drivers so much easier.
It might not be free, though. Graphics drivers might still have a bunch of games to play to make sure that work is stuffed through the GPU as tightly packed as possible. We might continue to see “Game Ready” drivers in the coming years, even though much of that burden has been shifted to the game developers. On the other hand, maybe these APIs will level the whole playing field and let all players focus on chip design and efficient ingestion of shader code. As always, painfully always, time will tell.
That is a lotta SKUs!
The slow, gradual release of information about Intel's Skylake-based product portfolio continues forward. We have already tested and benchmarked the desktop variant flagship Core i7-6700K processor and also have a better understanding of the microarchitectural changes the new design brings forth. But today Intel's 6th Generation Core processors get a major reveal, with all the mobile and desktop CPU variants, from 4.5 watts up to 91 watts, getting detailed specifications. Not only that, but it also marks the first day that vendors can announce and begin selling Skylake-based notebooks and systems!
All indications are that vendors like Dell, Lenovo and ASUS are still some weeks away from having any product available, but expect to see your feeds and favorite tech sites flooded with new product announcements. And of course with a new Apple event coming up soon...there should be Skylake in the new MacBooks this month.
Since I have already talked about the architecture and the performance changes from Haswell/Broadwell to Skylake in our 6700K story, today's release is just a bucket of specifications and information surrounding 46 different 6th Generation Skylake processors.
Intel's 6th Generation Core Processors
At Intel's Developer Forum in August, the media learned quite a bit about the new 6th Generation Core processor family including Intel's stance on how Skylake changes the mobile landscape.
Skylake is being broken up into 4 different lines of Intel processors: S-series for desktop DIY users, H-series for mobile gaming machines, U-series for your everyday Ultrabooks and all-in-ones, and Y-series for tablets and 2-in-1 detachables. (Side note: Intel does not reference an "Ultrabook" anymore. Huh.)
As you would expect, Intel has some impressive gains to claim with the new 6th Generation processor. However, it is important to put them in context. All of the claims above, including 2.5x performance, 30x graphics improvement and 3x longer battery life, are comparing Skylake-based products to CPUs from 5 years ago. Specifically, Intel is comparing the new Core i5-6200U (a 15 watt part) against the Core i5-520UM (an 18 watt part) from mid-2010.
Introduction and First Impressions
The Enthoo Pro M is the new mid-tower version of the Enthoo Pro, previously a full-tower ATX enclosure from the PC cooler and enclosure maker. This new enclosure adds another option to the $79 case market, which already has a number of solid options. Let's see how it stacks up!
I was very impressed by the Phanteks Enthoo EVOLV ATX enclosure, which received our Editor’s Choice award when reviewed earlier this year. The enclosure was very solidly made and had a number of excellent features, and even with a primarily aluminum construction and premium design it can be found for $119, rather unheard-of for this combination in the enclosure market. So what changes from that design might we expect to see with the $79 Enthoo Pro M?
The Pro M is a very businesslike design, constructed of steel and plastic, and with a very understated appearance. Not exactly “boring”, as it does have some personality beyond the typical rectangular box, with a brushed finish to the front panel which also features a vented front fan opening, and a side panel window to show off your build. But I think the real story here is the intelligent internal design, which is nearly identical to that of the EVOLV ATX.
Introduction and Technical Specifications
Courtesy of ASUS
The Z170-A motherboard is among the initial offerings in ASUS' channel line of Intel Z170 chipset boards. The board features ASUS' new channel line aesthetics, with white and black coloration to differentiate the line from their gold-themed Z97 offerings. ASUS uses the Z170-A to redefine the baseline motherboard, integrating many "upper-tier style" features not normally found on lower-tier offerings. The board's Intel Z170 chipset supports the latest Intel LGA1151 Skylake processor line as well as dual-channel DDR4 memory. Offered at a price-competitive MSRP of $165, the Z170-A threatens to give the rest of the Z170-based boards a run for their money.
The Z170-A shares the same DIGI+ style power system as its higher-priced siblings, featuring an 8-phase digital power delivery system. ASUS integrated the following features into the Z170-A board: four SATA 3 ports; one SATA-Express port; one M.2 PCIe x4 capable port; an Intel I219-V Gigabit NIC; three PCI-Express x16 slots; two PCI-Express x1 slots; one PCI slot; on-board power and MemOK! buttons; EZ XMP and TPU switches; Crystal Sound 3 audio subsystem; integrated DisplayPort, HDMI, DVI, and VGA video ports; and USB 2.0, 3.0, and 3.1 Type-A and Type-C port support.
The Z170-A motherboard comes standard with ASUS' latest iteration of their sound technology, dubbed Crystal Sound 3. Like its predecessors, Crystal Sound 3 places the audio components on a PCB section isolated from the other main board components, minimizing noise caused by those other integrated devices. ASUS designed the audio subsystem with high-quality Japanese-sourced audio and power circuitry for a top-notch audio experience.
The Tiniest Fiji
Way back on June 16th, AMD held a live stream event during E3 to announce a host of new products. In that group was the AMD Radeon R9 Fury X, R9 Fury and the R9 Nano. Of the three, the Nano was the most intriguing to most of the online press as it was the one we knew the least about. AMD promised a full Fiji GPU in a package with a 6-in PCB and a 175 watt TDP. Well today, AMD is, uh, re-announcing (??) the AMD Radeon R9 Nano with more details on specifications, performance and availability.
First, let’s get this out of the way: AMD is making this announcement today because they publicly promised the R9 Nano for August. And with the final days of summer creeping up on them, rather than answer questions about another delay, AMD is instead going the route of a paper launch, but one with a known end date. We will apparently get our samples of the hardware in early September with reviews and the on-sale date following shortly thereafter. (Update: AMD claims the R9 Nano will be on store shelves on September 10th and should have "critical mass" of availability.)
Now let’s get to the details that you are really here for. And rather than start with the marketing spin on the specifications that AMD presented to the media, let’s dive into the gory details right now.
| | R9 Nano | R9 Fury | R9 Fury X | GTX 980 Ti | TITAN X | GTX 980 | R9 290X |
|---|---|---|---|---|---|---|---|
| GPU | Fiji XT | Fiji Pro | Fiji XT | GM200 | GM200 | GM204 | Hawaii XT |
| Rated Clock | 1000 MHz | 1000 MHz | 1050 MHz | 1000 MHz | 1000 MHz | 1126 MHz | 1000 MHz |
| Memory Clock | 500 MHz | 500 MHz | 500 MHz | 7000 MHz | 7000 MHz | 7000 MHz | 5000 MHz |
| Memory Interface | 4096-bit (HBM) | 4096-bit (HBM) | 4096-bit (HBM) | 384-bit | 384-bit | 256-bit | 512-bit |
| Memory Bandwidth | 512 GB/s | 512 GB/s | 512 GB/s | 336 GB/s | 336 GB/s | 224 GB/s | 320 GB/s |
| TDP | 175 watts | 275 watts | 275 watts | 250 watts | 250 watts | 165 watts | 290 watts |
| Peak Compute | 8.19 TFLOPS | 7.20 TFLOPS | 8.60 TFLOPS | 5.63 TFLOPS | 6.14 TFLOPS | 4.61 TFLOPS | 5.63 TFLOPS |
AMD wasn’t fooling around: the Radeon R9 Nano graphics card does indeed include a full implementation of the Fiji GPU and HBM, including 4096 stream processors, 256 texture units and 64 ROPs. The GPU core clock is rated “up to” 1.0 GHz, nearly the same as the Fury X (1050 MHz), and the only difference that I can see in the specifications on paper is that the Nano is rated at 8.19 TFLOPS of theoretical compute performance while the Fury X is rated at 8.60 TFLOPS.
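The compute and bandwidth figures in the table follow from simple arithmetic, sketched here as a sanity check. The function names are made up for illustration, and two assumptions are baked in: each stream processor retires 2 floating-point operations per clock (one fused multiply-add), and HBM transfers data on both clock edges, so the listed 500 MHz is passed in as an effective 1000 MHz. The GDDR5 memory clocks in the table are treated as already-effective rates.

```python
def peak_tflops(stream_processors: int, clock_mhz: float,
                flops_per_clock: int = 2) -> float:
    """Theoretical single-precision throughput in TFLOPS.

    flops_per_clock=2 assumes one fused multiply-add per processor per clock.
    """
    return stream_processors * flops_per_clock * clock_mhz * 1e6 / 1e12


def bandwidth_gbs(bus_width_bits: int, effective_rate_mhz: float) -> float:
    """Theoretical memory bandwidth in GB/s from bus width and data rate."""
    return bus_width_bits * effective_rate_mhz * 1e6 / 8 / 1e9


# Full Fiji (4096 stream processors) at the two rated clocks.
print(round(peak_tflops(4096, 1000), 2))  # R9 Nano
print(round(peak_tflops(4096, 1050), 2))  # R9 Fury X

# HBM: 4096-bit bus, 500 MHz clock, assumed double data rate.
print(bandwidth_gbs(4096, 2 * 500))
# GDDR5 on a 384-bit bus at the listed 7000 MHz effective rate.
print(bandwidth_gbs(384, 7000))
```

Because both cards use the same 4096 stream processors, the gap between 8.19 and 8.60 TFLOPS comes entirely from the 50 MHz difference in rated clock.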
Retail Card Design
AMD is in an interesting spot right now. The general consensus is that both the AMD Radeon R9 Fury X and the R9 Fury graphics cards had successful launches into the enthusiast community. We found that the performance of the Fury X was slightly under that of the GTX 980 Ti from NVIDIA, but also that the noise levels and power draw were so improved on Fiji over Hawaii that many users would dive head first into the new flagship from the red team.
The launch of the non-X AMD Fury card was even more interesting – here was a card with a GPU performing better than the competition at a price point where NVIDIA didn’t have an exact answer. The performance gap between the GTX 980 and GTX 980 Ti resulted in a $550 graphics card that AMD had a victory with. Add in the third Fiji-based product due out in a few short weeks, the R9 Nano, and you have a robust family of products that don’t exactly dominate the market but do put AMD in a positive position unlike any it has seen in recent years.
But there are some problems. First and foremost for AMD: continuing drops in market share. With the most recent reports from multiple sources claiming that AMD’s Q2 2015 share has dropped to 18%, an all-time low in the last decade or so, AMD needs some growth and they need it now. Here’s the catch: AMD can’t make enough of the Fiji chip to affect that number at all. The Fury X, Fury and Nano are going to be hard to find for the foreseeable future thanks to production limits on the HBM (high bandwidth memory) integration, the same feature that helps make Fiji the compelling product it is. I have been keeping an eye on the stock of the Fury and Fury X products and found that they often can’t be found anywhere in the US for purchase. Maybe even more damning is the fact that the Radeon R9 Fury, the card that is supposed to be the model customizable by AMD board partners, still only has two options available: the Sapphire, which we reviewed when it launched, and the ASUS Strix R9 Fury that we are reviewing today.
AMD’s product and financial issues aside, the fact is that the Radeon R9 Fury 4GB and the ASUS Strix iteration of it are damned good products. ASUS has done its usual job of improving on the design of the reference PCB and cooler, added in some great features and packaged it up at a price that is competitive and well worth the investment for enthusiast gamers. Our review today will only lightly touch on out-of-box performance of the Strix card, mostly because it is so similar to that of the initial Fury review we posted in July. Instead I will look at the changes to the positioning of the AMD Fury product (if any) and how the cooler and design of the Strix product help it stand out. Overclocking, power consumption and noise will all be evaluated as well.
Introduction, Specifications, and Packaging
We have reviewed a lot of Variable Refresh Rate displays over the past several years now, and for the most part, these displays have come with some form of price premium attached. NVIDIA’s G-Sync tech requires an additional module that adds some cost to the parts list for those displays. AMD took a while to get their FreeSync tech pushed through the scaler makers, and with the added effort needed to implement these new parts, display makers naturally pushed the new features into their higher end displays first. Just look at the specs of these displays:
- ASUS PG278Q 27in TN 1440P 144Hz G-Sync
- Acer XB270H 27in TN 1080P 144Hz G-Sync
- Acer XB280HK 28in TN 4K 60Hz G-Sync
- Acer XB270HU 27in IPS 1440P 144Hz G-Sync
- LG 34UM67 34in IPS 2560x1080 21:9 48-75Hz FreeSync
- BenQ XL2730Z 27in TN 1440P 40-144Hz FreeSync
- Acer XG270HU 27in TN 1440P 40-144Hz FreeSync
- ASUS MG279Q 27in IPS 1440P 144Hz FreeSync (35-90Hz)
Most of the reviewed VRR panels are 1440P or higher, and the only 1080P display currently runs $500. This unfortunately leaves VRR technology at a price point that is simply out of reach of gamers unable to drop half a grand on a display. What we needed was a good 1080P display with a *full* VRR range. Bonus points for high refresh rates and, in the case of a FreeSync display, a minimum refresh rate low enough that a typical game will not run below it. This shouldn’t be too hard since 1080P is not that demanding on even lower cost hardware these days. Who was up to this challenge?
Nixeus has answered this call with their new Nixeus Vue display. This is a 24” 1080P 144Hz FreeSync display with a VRR bottom limit of 30 FPS. It comes in two models, distinguished by a trailing letter in the model. The NX-VUE24B contains a ‘base’ model stand with only tilt support, while the NX-VUE24A contains a ‘premium’ stand with full height, rotation, and tilt support.
Does the $330-350 Nixeus Vue 24" FreeSync monitor fit the bill?
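To make the "full VRR range" point concrete, here is a minimal sketch that checks whether a given frame rate lands inside a display's variable refresh window. The `VrrRange` type and `in_vrr_window` helper are hypothetical names; the ranges are the ones quoted above for the Nixeus Vue (30-144Hz) and the ASUS MG279Q (35-90Hz). What a display or driver actually does outside the window is not modeled here.

```python
from typing import NamedTuple


class VrrRange(NamedTuple):
    """Variable refresh window of a display, in Hz."""
    min_hz: int
    max_hz: int


def in_vrr_window(fps: float, vrr: VrrRange) -> bool:
    """True when the frame rate can be matched by the panel's refresh rate."""
    return vrr.min_hz <= fps <= vrr.max_hz


NIXEUS_VUE = VrrRange(min_hz=30, max_hz=144)
ASUS_MG279Q = VrrRange(min_hz=35, max_hz=90)

for fps in (32, 60, 120):
    print(fps, in_vrr_window(fps, NIXEUS_VUE), in_vrr_window(fps, ASUS_MG279Q))
```

A game dipping to 32 FPS or running at 120 FPS stays inside the Vue's window but falls outside the MG279Q's, which is the gap the wider range is meant to close.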
The Dell Venue 10 7000 Series tablet features a stunning 10.5" OLED screen and is designed to mate perfectly with the optional keyboard. So how does it perform as both a laptop and a tablet? Read on for the full review!
To begin with I will simply say the keyboard should not be an optional accessory. There, I've said it. As I used the Venue 10 7000, which arrived bundled with the keyboard, I was instantly excited about this design. The Venue 10 is a device that is as remarkable for its incredible screen as for any other feature, but once coupled with the magnetically attached keyboard it becomes something more, and quite different from existing implementations of the transforming tablet. More than a simple accessory, the keyboard felt like it was really a part of the device when connected, and made it feel like a real laptop.
I'm getting way ahead of myself here so let's go back to the beginning, and back to a world where one might consider purchasing this tablet by itself. At $499 for the 16GB model you might reasonably ask how it compares to the identically-priced Apple iPad Air 2. Well, most of the comparison is going to be software/app related as the Venue 10 7000 is running Android 5.1 Lollipop, and of course the iPad runs iOS. The biggest difference between these tablets (besides the keyboard integration) becomes the 10.5-inch, 2560x1600 OLED screen, and oh what a screen it is!
A third primary processor
As the Hot Chips conference begins in Cupertino this week, Qualcomm is set to divulge another set of information about the upcoming Snapdragon 820 processor. Earlier this month the company revealed details about the Adreno 5xx GPU architecture, showcasing improved performance and power efficiency while also adding a new Spectra 14-bit image processor. Today we shift to what Qualcomm calls the “third pillar in the triumvirate of programmable processors” that make up the Snapdragon SoC. The Hexagon DSP (digital signal processor), introduced initially by Qualcomm in 2004, has gone through a massive architecture shift and even programmability shift over the last 10 years.
Qualcomm believes that building a balanced SoC for mobile applications is all about heterogeneous computing with no one processor carrying the entire load. The majority of the work that any modern Snapdragon processor must handle goes through the primary CPU cores, the GPU or the DSP. We learned about upgrades to the Adreno 5xx series for the Snapdragon 820 and we are promised information about Kryo CPU architecture soon as well. But the Hexagon 600-series of DSPs actually deals with some of the most important functionality for smartphones and tablets: audio, voice, imaging and video.
Interestingly, Qualcomm opened up the DSP to programmability just four years ago, giving developers the ability to write custom code and software to take advantage of the specific performance capabilities that the DSP offers. Custom photography, videography and sound applications could benefit greatly in terms of performance and power efficiency by using the Qualcomm DSP rather than the primary system CPU or GPU. As of this writing, Qualcomm claims there are “hundreds” of developers actively writing code targeting its family of Hexagon processors.
The Hexagon DSP in Snapdragon 820 consists of three primary partitions. The main compute DSP works in conjunction with the GPU and CPU cores and will do much of the heavy lifting for encompassed workloads. The modem DSP aids the cellular modem in communication throughput. The new guy here is the lower power DSP in the Low Power Island (LPI) that shifts how always-on sensors can communicate with the operating system.
Introduction and First Impressions
The ASUS PB258Q is a "frameless" monitor with a full 2560x1440 resolution from a fairly compact 25-inch size, and at first glance it might appear to be a bare LCD panel affixed to a stand. This attractive design also features 100% sRGB coverage and full height/tilt/swivel and rotation adjustment. The price? Less than $400. We'll put it to the test to see just what kind of value to expect here.
A beautiful looking monitor even with nothing on the display
The ASUS PB258Q came out of nowhere one day when I was looking to replace a smaller 1080p display on my desk. Given some pretty serious size constraints I was hesitant to move up to the 27 - 30 inch range for 2560x1440 monitors, but I didn't want to settle for 1920x1080 again. The ASUS PB258Q intrigued me immediately not only due to its interesting size/resolution of 25-inch/1440p, but also for the claimed 100% sRGB coverage and fully adjustable stand. And then I looked over at the price. $376.99 shipped from Amazon with Prime shipping? Done.
The pricing (and compact 25-inch size) made it a more compelling choice to me than the PB278Q, ASUS's "professional graphics monitor" which uses a PLS panel, though this larger display has recently dropped in price to the $400 range. When the PB258Q arrived a couple of days later I was first struck by how compact it is, and how nice the monitor looked without even being powered up.
Another Maxwell Iteration
The mainstream end of the graphics card market is about to get a bit more complicated with today’s introduction of the GeForce GTX 950. Based on a slightly cut down GM206 chip, the same used in the GeForce GTX 960 that was released almost 8 months ago, the new GTX 950 will fill a gap in the product stack for NVIDIA, resting right at $160-170 MSRP. Until today that next-down spot from the GTX 960 was filled by the GeForce GTX 750 Ti, the very first iteration of Maxwell (we usually call it Maxwell 1) that came out in February of 2014!
Even though that is a long time to go without refreshing the GTX x50 part of the lineup, NVIDIA was likely hesitant to do so based on the overwhelming success of the GM107 for mainstream gaming. It was low cost, incredibly efficient and didn’t require any external power to run. That led us down the path of upgrading OEM PCs with GTX 750 Ti, an article and video that still gets hundreds of views and dozens of comments a week.
The GTX 950 has some pretty big shoes to fill. I can tell you right now that it uses more power than the GTX 750 Ti, and it requires a 6-pin power connector, but it does so while increasing gaming performance dramatically. The primary competition from AMD is the Radeon R7 370, a Pitcairn GPU that is long in the tooth and missing many of the features that Maxwell provides.
And NVIDIA is taking a secondary angle with the GTX 950 launch, targeting the MOBA players (DOTA 2 in particular) directly and aggressively. With the success of this style of game over the last several years, and the impressive $18M+ purse for the largest DOTA 2 tournament just behind us, there isn’t a better area of PC gaming to be going after today. But are the tweaks and changes to the card and software really going to make a difference for MOBA gamers or is it just marketing fluff?
Let’s dive into everything GeForce GTX 950!
Core and Interconnect
The Skylake architecture is Intel’s first to get a full release on the desktop in more than two years. While that might not seem like a long time in the grand scheme of technology, for our readers and viewers that is a noticeable shift from the recent cadence that Intel has created with the tick-tock model of releases. Yes, Broadwell was released last year and was a solid product, but Intel focused almost exclusively on the mobile platforms (notebooks and tablets) with it. Skylake will become much more ubiquitous, and much more quickly, than even Haswell did.
Skylake represents Intel’s most scalable architecture to date. I don’t mean only frequency scaling, though that is an important part of this design, but rather in terms of market segment scaling. Thanks to brilliant engineering and design from Intel’s Israeli group, Intel will be launching Skylake designs ranging from 4.5 watt TDP Core M solutions all the way up to the 91 watt desktop processors that we have already reviewed in the Core i7-6700K. That’s a range that we really haven’t seen before; in the past Intel has depended on the Atom architecture to make up ground on the lowest power platforms. While I don’t know for sure if Atom is finally trending towards the dodo once Skylake’s reign is fully implemented, it does make me wonder how much life is left there.
Scalability also refers to the package size – something that ensures that the designs the engineers created can actually be built and run in the platform segments they are targeting. Starting with the desktop designs for LGA platforms (the DIY market), which use a 1400 mm² package on the 91 watt TDP implementation, Intel is scaling all the way down to 330 mm² in a BGA1515 package for the 4.5 watt TDP designs. Only with a total product size like that can you hope to get Skylake in a form factor like the Compute Stick – which is exactly what Intel is doing. And note that the smaller packages require the inclusion of the platform IO chip as well, something that H- and S-series CPUs can depend on the motherboard to integrate.
Finally, scalability will also include performance scaling. Clearly the 4.5 watt part will not offer the user the same performance with the same goals as the 91 watt Core i7-6700K. The screen resolution, attached accessories and target applications allow Intel to be selective about how much power they require for each series of Skylake CPUs.
The fundamental design theory in Skylake is very similar to what exists today in Broadwell and Haswell, with a handful of significant changes and hundreds of minor ones that make Skylake a large step ahead of previous designs.
This slide from Julius Mandelblat, Intel Senior Principal Engineer, shows a higher level overview of the entirety of the consumer integration of Skylake. You can see that Intel’s goals included a bigger and wider core design, higher frequency, improved ring architecture and fabric design and more options for eDRAM integration. Readers of PC Perspective will already know that Skylake supports both DDR3L and DDR4 memory technologies but the inclusion of the camera ISP is new information for us.
I knew that the move to DirectX 12 was going to be a big shift for the industry. Since the introduction of the AMD Mantle API along with the Hawaii GPU architecture we have been inundated with game developers and hardware vendors talking about the potential benefits of lower level APIs, which give more direct access to GPU hardware and enable more flexible threading for CPUs to game developers and game engines. The results, we were told, would mean that your current hardware would be able to take you further and future games and applications would be able to fundamentally change how they are built to enhance gaming experiences tremendously.
I knew that the reader interest in DX12 was outstripping my expectations when I did a live blog of the official DX12 unveil by Microsoft at GDC. In a format that consisted simply of my text commentary and photos of the slides that were being shown (no video at all), we had more than 25,000 live readers that stayed engaged the whole time. Comments and questions flew into the event – more than my staff or I could possibly handle in real time. It turned out that gamers were indeed very much interested in what DirectX 12 might offer them with the release of Windows 10.
Today we are taking a look at the first real world gaming benchmark that utilized DX12. Back in March I was able to do some early testing with an API-specific test that evaluates the overhead implications of DX12, DX11 and even AMD Mantle from Futuremark and 3DMark. This first look at DX12 was interesting and painted an amazing picture about the potential benefits of the new API from Microsoft, but it wasn’t built on a real game engine. In our Ashes of the Singularity benchmark testing today, we finally get an early look at what a real implementation of DX12 looks like.
And as you might expect, not only are the results interesting, but there is a significant amount of created controversy about what those results actually tell us. AMD has one story, NVIDIA another and Stardock and the Nitrous engine developers, yet another. It’s all incredibly intriguing.
It comes after 8, but before 10
As the week of Intel’s Developer Forum (IDF) begins, you can expect to see a lot of information about Intel’s 6th Generation Core architecture, codenamed Skylake, finally revealed. When I posted my review of the Core i7-6700K, the first product based on that architecture to be released in any capacity, I was surprised that Intel was willing to ship product without the normal amount of background information for media and developers. Rather than give us the details and then ship product, which has happened for essentially every consumer product release I have been a part of, Intel did the reverse: ship a consumer friendly CPU and then promise to tell us how it all works later in the month at IDF.
Today I came across a document posted on Intel’s website that dives into very specific detail on the new Gen9 graphics and compute architecture of Skylake. Details on the Core architecture changes are not present, and instead we are given details on how the traditional GPU portion of the SoC has changed. To be clear: I haven’t had any formal briefing from Intel on this topic or anything surrounding the architecture of Skylake or the new Gen9 graphics system but I wanted to share the details we found available. I am sure we’ll learn more this week as IDF progresses so I will update this story where necessary.
What Intel calls Processor Graphics is what we used to call simply integrated graphics for the longest time. The purpose and role of processor graphics have changed drastically over the years, and it is now responsible not only for 3D graphics rendering but also for the compute, media and display capabilities of the Intel Skylake SoC (when discrete add-in graphics is not used). The architecture document used to source this story focuses on Gen9 graphics, the compute architecture utilized in the latest Skylake CPUs. The Intel HD Graphics 530 on the Core i7-6700K / Core i5-6600K is the first product released and announced using Gen9 graphics and is also the first to adopt Intel’s new 3-digit naming scheme.
This die shot of the Core i7-6700K shows the increased size and prominence of the Gen9 graphics in the overall SoC design. Containing four traditional x86 CPU cores and 1 “slice” implementation of Gen9 graphics (with three visible sub-slices we’ll describe below), this is not likely to be the highest performing iteration of the latest Intel HD Graphics technology.
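As a rough illustration of how the slice and sub-slice counts translate into execution resources, the sketch below multiplies them out. The figure of 8 execution units (EUs) per sub-slice is an assumption drawn from Intel's public Gen9 documentation, not from the text above, and the function name is purely illustrative.

```python
# Assumption (not stated in the text above): each Gen9 sub-slice
# contains 8 execution units (EUs).
EUS_PER_SUBSLICE = 8


def total_eus(slices, subslices_per_slice, eus_per_subslice=EUS_PER_SUBSLICE):
    """Return the total EU count for a given slice configuration."""
    return slices * subslices_per_slice * eus_per_subslice


# One slice with three sub-slices, as seen in the Core i7-6700K die shot.
hd530_eus = total_eus(slices=1, subslices_per_slice=3)
print(f"Estimated EUs in this configuration: {hd530_eus}")
```

Larger configurations would simply scale the slice count, which is why this single-slice part is unlikely to be the fastest Gen9 implementation.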
Like the Intel processors before it, the Skylake design utilizes a ring bus architecture to connect the different components of the SoC. This bi-directional interconnect has a 32-byte wide data bus and connects to multiple “agents” on the CPU. Each individual CPU core is considered its own agent while the Gen9 compute architecture is considered one complete agent. The system agent bundles the DRAM memory, the display controller, PCI Express and other I/O interfaces that communicate with the rest of the PC. Any off-chip memory requests and transactions occur through this bus while on-chip data transfers tend to be handled differently.
It's Basically a Function Call for GPUs
Mantle, Vulkan, and DirectX 12 all claim to reduce overhead and provide a staggering increase in “draw calls”. As mentioned in the previous editorial, the way graphics cards are loaded with tasks changes drastically in these new APIs. With DirectX 10 and earlier, applications would assign attributes to (what they are told is) the global state of the graphics card. After everything is configured and bound, one of a few “draw” functions is called, which queues the task in the graphics driver as a “draw call”.
While this suggests that just a single graphics device is to be defined, which we also mentioned in the previous article, it also implies that one thread needs to be the authority. This limitation was known about for a while, and it contributed to the meme that consoles can squeeze all the performance they have, but PCs are “too high level” for that. Microsoft tried to combat this with “Deferred Contexts” in DirectX 11. This feature allows virtual, shadow states to be loaded from secondary threads, which can be appended to the global state, whole. It was a compromise between each thread being able to create its own commands, and the legacy decision to have a single, global state for the GPU.
Some developers experienced gains, while others lost a bit. It didn't live up to expectations.
The paradigm used to load graphics cards is the problem. It doesn't make sense anymore. A developer might not want to draw a primitive with every poke of the GPU. At times, they might want to shove a workload of simple linear algebra through it, while other requests could simply be pushing memory around to set up a later task (or to read the result of a previous one). More importantly, any thread could want to do this to any graphics device.
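The contrast between a single global state and per-thread command recording can be sketched as a toy model. This is not Direct3D, Mantle or Vulkan code; every class and function name here (`GlobalStateDevice`, `CommandList`, `build_frame_multithreaded`) is hypothetical and exists only to illustrate the idea.

```python
import threading
from dataclasses import dataclass, field


@dataclass
class GlobalStateDevice:
    """Legacy-style model: one shared state, so one thread must own it."""
    state: dict = field(default_factory=dict)
    queue: list = field(default_factory=list)

    def set_state(self, key, value):
        self.state[key] = value

    def draw(self):
        # A draw call captures whatever the global state holds right now.
        self.queue.append(dict(self.state))


@dataclass
class CommandList:
    """Newer-style model: each thread records into its own private list."""
    commands: list = field(default_factory=list)

    def record_draw(self, **state):
        self.commands.append(dict(state))


def record_worker(cmd_list, mesh_ids):
    # Each worker touches only its own command list, so no locking is needed.
    for mesh in mesh_ids:
        cmd_list.record_draw(mesh=mesh, shader="basic")


def build_frame_multithreaded(batches):
    """Record one command list per batch in parallel, then submit in order."""
    lists = [CommandList() for _ in batches]
    threads = [
        threading.Thread(target=record_worker, args=(cl, batch))
        for cl, batch in zip(lists, batches)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    submitted = []
    for cl in lists:  # submission order is fixed, recording was parallel
        submitted.extend(cl.commands)
    return submitted


# Legacy path: a single thread mutates the global state, then draws.
device = GlobalStateDevice()
device.set_state("mesh", 1)
device.draw()

# Newer path: two threads record independently.
frame = build_frame_multithreaded([[1, 2], [3]])
print(len(device.queue), len(frame))
```

In the first model every state change funnels through one owner; in the second, recording scales across threads and only the final submission is serialized.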
The new graphics APIs allow developers to submit their tasks quicker and smarter, and it allows the drivers to schedule compatible tasks better, even simultaneously. In fact, the driver's job has been massively simplified altogether. When we tested 3DMark back in March, two interesting things were revealed:
- AMD and NVIDIA are only a two-digit percentage apart in draw call performance
- Both AMD and NVIDIA saw an order of magnitude increase in draw calls
Killing those end of summer blues
As we approach the end of summer and the beginning of the life of Windows 10, PC Perspective and Gigabyte (along with Thermaltake and Kingston) have teamed up to bring our readers a system build guide and giveaway that is sure to get your gears turning. If you think that an X99-based system with an 8-core Intel Extreme processor, SLI graphics, 480GB SSD and 32GB of memory sounds up your alley...pay attention.
Deep in thought...
Even with the dawn of Skylake nearly upon us, there is no debate that the Haswell-E platform will continue to be the basis of the enthusiast's dream system for a long time. Lower power consumption is great, but nothing is going to top 8 cores, 16 threads and all the PCI Express lanes you could need for expansion to faster storage and accessories. With that in mind Gigabyte has partnered with PC Perspective to showcase the power of X99 and what a builder today can expect when putting together a system with a fairly high budget, but with lofty goals in mind as well.
Let's take a look at the components we are using today.
|Gigabyte X99 System Build|
|Processor||Intel Core i7-5960X - $1048|
|Motherboard||Gigabyte X99 Gaming 5P - $309|
|Memory||Kingston HyperX Fury DDR4-2666 32GB - $325|
|Graphics Card||2 x Gigabyte G1 Gaming GTX 960 2GB - $199 each|
|Storage||Kingston HyperX Savage 480GB SSD - $194|
|Case||Thermaltake Core V51 - $82|
|Power Supply||Thermaltake Toughpower Grand 850 watt - $189|
|CPU Cooler||Thermaltake Water 3.0 Extreme S - $94|
|Total Price||$1591 - Amazon Full Cart (except CPU); $1048 - Amazon Intel Core i7-5960X; Grand Total: $2639|
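For readers who want to check the totals or swap in their own parts, the arithmetic can be sketched as follows. The prices are those listed in the table; the dictionary layout and variable names are just one illustrative way to organize them, and the sketch assumes the graphics card price is per card.

```python
# (unit price in USD, quantity) for everything except the CPU,
# using the prices listed in the table above.
cart = {
    "Motherboard": (309, 1),
    "Memory": (325, 1),
    "Graphics Card": (199, 2),  # two cards in SLI, assumed priced per card
    "Storage": (194, 1),
    "Case": (82, 1),
    "Power Supply": (189, 1),
    "CPU Cooler": (94, 1),
}
cpu_price = 1048  # Intel Core i7-5960X, listed separately

cart_total = sum(price * qty for price, qty in cart.values())
grand_total = cart_total + cpu_price

print(f"Cart (except CPU): ${cart_total}")
print(f"Grand total:       ${grand_total}")
```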
Going Beyond the Reference GTX 970
Zotac has been an interesting company to watch for the past few years. It is a company that has made a name for itself in the small form factor community with some really interesting designs and products. They continue down that path, but they have increasingly focused on high quality graphics cards that address a pretty wide market. They provide unique products from the $40 level up through the latest GTX 980 Ti with hybrid water and air cooling for $770. The company used to focus on reference designs, but some years ago they widened their appeal by applying their own design decisions to the latest NVIDIA products.
Catchy looking boxes for people who mostly order online! Still, nice design.
The beginning of this year saw Zotac introduce their latest “Core” brand products that aim to provide high end features to more modestly priced parts. The Core series makes some compromises to hit price points that are more desirable for a larger swath of consumers. The cards often rely on more reference style PCBs with good quality components and advanced cooling solutions. This equation has been used before, but Zotac is treading some new ground by offering very highly clocked cards right out of the box.
Overall Zotac has a very positive reputation in the industry for quality and support.
Plenty of padding in the box to protect your latest investment.
Zotac GTX 970 AMP! Extreme Core Edition
The product we are looking at today is the somewhat long-named AMP! Extreme Core Edition. This is based on the NVIDIA GTX 970 chip which features 56 ROPs, 1.75 MB of L2 cache, and 1664 CUDA cores. The GTX 970 has of course been scrutinized heavily due to the unique nature of its memory subsystem. While it does physically have a 256-bit bus, the last 512 MB (out of 4GB) is addressed by a significantly slower unit due to shared memory controller capacity. In theory the reference card design supports up to 224 GB/sec of memory bandwidth. There are obviously some very unhappy people out there about this situation, but much of this could have been avoided if NVIDIA had disclosed the exact nature of the GTX 970 configuration.
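The 224 GB/sec theoretical figure follows directly from the bus width and the memory data rate. The sketch below shows the arithmetic; the 7.0 Gbps effective GDDR5 data rate is an assumption based on NVIDIA's reference specification and is not stated in the text above. It also ignores the slower 512 MB segment, so it describes a best case only.

```python
def memory_bandwidth_gb_s(bus_width_bits, data_rate_gbps):
    """Peak bandwidth in GB/sec: bytes moved per transfer times transfers per second."""
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * data_rate_gbps


# 256-bit bus (from the text) at an assumed 7.0 Gbps effective data rate.
peak = memory_bandwidth_gb_s(bus_width_bits=256, data_rate_gbps=7.0)
print(f"Theoretical peak: {peak:.0f} GB/sec")
```

A factory-overclocked card with a higher memory clock would plug a larger data rate into the same formula.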
A quick look at storage
** This piece has been updated to reflect changes since first posting. See page two for PCIe RAID results! **
Our Intel Skylake launch coverage is intense! Make sure you hit up all the stories and videos that are interesting for you!
- The Intel Core i7-6700K Review - Skylake First for Enthusiasts (Video)
- Skylake vs. Sandy Bridge: Discrete GPU Showdown (Video)
- ASUS Z170-A Motherboard Preview
- Intel Skylake / Z170 Rapid Storage Technology Tested - PCIe and SATA RAID
When I saw the small amount of press information provided with the launch of Intel Skylake, I was both surprised and impressed. The new Z170 chipset was going to have an upgraded DMI link, nearly doubling throughput. DMI has, for a long time, been suspected as the reason Intel SATA controllers have pegged at ~1.8 GB/sec, which limits the effectiveness of a RAID with more than 3 SSDs. Improved DMI throughput could enable the possibility of a 6-SSD RAID-0 that exceeds 3GB/sec, which would compete with PCIe SSDs.
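To see why the DMI ceiling matters, here is a minimal sketch of how a link cap limits a RAID-0 array. The ~1.8 GB/sec ceiling comes from the paragraph above; the 0.55 GB/sec per-SSD sequential figure and the 3.6 GB/sec "doubled" ceiling are illustrative assumptions, not measured values.

```python
def raid0_throughput_gb_s(num_ssds, per_ssd_gb_s, link_cap_gb_s):
    """Ideal RAID-0 scaling, clamped by the chipset uplink."""
    return min(num_ssds * per_ssd_gb_s, link_cap_gb_s)


PER_SSD = 0.55   # assumed sequential throughput of one SATA SSD, GB/sec
OLD_CAP = 1.8    # observed ceiling cited in the text, GB/sec
NEW_CAP = 3.6    # hypothetical ceiling if the old cap simply doubled

for n in (3, 4, 6):
    old = raid0_throughput_gb_s(n, PER_SSD, OLD_CAP)
    new = raid0_throughput_gb_s(n, PER_SSD, NEW_CAP)
    print(f"{n} SSDs: {old:.2f} GB/sec (old cap) vs {new:.2f} GB/sec (new cap)")
```

Under these assumptions the array stops scaling past three drives on the old link, while six drives would clear 3 GB/sec on the faster one.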
Speaking of PCIe SSDs, that’s the other big addition to Z170. Intel’s Rapid Storage Technology was going to be expanded to include PCIe (even NVMe) SSDs, with the caveat that they must be physically connected to PCIe lanes falling under the DMI-connected chipset. This is not as big of an issue as you might think, as Skylake does not have 28 or 40 PCIe lanes as seen with X99 solutions. Z170 motherboards only have to route 16 PCIe lanes from the CPU to either two (8x8) or three (8x4x4) PCIe slots, and the remaining slots must all hang off of the chipset. This includes the PCIe portion of M.2 and SATA Express devices.
Light on architecture details
The Intel Skylake architecture has been on our radar for quite a long time as Intel's next big step in CPU design. Through leaks and some official information discussed by Intel over the past few months, we know at least a handful of details: DDR4 memory support, 14nm process technology, modest IPC gains and impressive GPU improvements. But the details have remained a mystery on how the "tock" of Skylake on the 14nm process technology will differ from Broadwell and Haswell.
Interestingly, due to some shifts in how Intel is releasing Skylake, we are going to be doing a review today with very little information on the Skylake architecture and design (at least officially). While we are very used to the company releasing new information at the Intel Developer Forum along with the launch of a new product, Intel has instead decided to time the release of the first Skylake products with Gamescom in Cologne, Germany. Parts will go on sale today (August 5th) and we are reviewing a new Intel processor without the background knowledge and details that will be needed to really explain any of the changes or differences in performance that we see. It's an odd move honestly, but it has some great repercussions for the enthusiasts that read PC Perspective: Skylake will launch first as an enthusiast-class product for gamers and DIY builders.
For many of you this won't change anything. If you are curious about the performance of the new Core i7-6700K, power consumption, clock for clock IPC improvements and anything else that is measurable, then you'll get exactly what you want from today's article. If you are a gear-head that is looking for more granular details on how the inner-workings of Skylake function, you'll have to wait a couple of weeks longer - Intel plans to release that information on August 18th during IDF.
So what does the addition of DDR4 memory, full range base clock manipulation and a 4.0 GHz base clock on a brand new 14nm architecture mean for users of current Intel or AMD platforms? Also, is it FINALLY time for users of the Core i7-2600K or older systems to push that upgrade button? (Let's hope so!)