Subject: Storage
Manufacturer: Lexar

The Need for Speed

Around here storage is Allyn’s territory, but I decided to share my experience with a new $20 flash drive I picked up that promised some impressive speeds via USB 3.0. The drive is the Lexar JumpDrive P20, and I bought the 32GB version, which is the lowest capacity of the three drives in the series. 64GB and 128GB versions of the JumpDrive P20 are available, with advertised read speeds of up to 400 MB/s from all three, and up to 270 MB/s writes - if you buy the largest capacity.

DSC_0897.jpg

My humble 32GB model still boasts up to 140 MB/s writes, which would be faster than any USB drive I’ve ever owned (my SanDisk Extreme USB 3.0 16GB drive is limited to 60 MB/s writes, and can hit about 190 MB/s reads), and the speeds of the P20 even approach that of some lower capacity SATA 3 SSDs - if it lives up to the claims. The price was right, so I took the plunge. (My hard-earned $20 at stake!)

DSC_0903.jpg

Size comparison with other USB flash drives on hand (P20 on far right)

First we'll look at the features from Lexar:


  • Among the fastest USB flash drives available, with speeds up to 400MB/s read and 270MB/s write
  • Sleek design with metal alloy base and high-gloss mirror finish top
  • Securely protects files using EncryptStick Lite software, an advanced security solution with 256-bit AES encryption
  • Reliably stores and transfers files, photos, videos, and more
  • High-capacity options to store more files on the go
  • Compatible with PC and Mac systems
  • Backwards compatible with USB 2.0 devices
  • Limited lifetime warranty

Continue reading our review of the Lexar JumpDrive P20 USB drive!!

Author:

Introduction and Features

Introduction

2-Banner.jpg

SilverStone continues to push the envelope of power density with the release of their new SX800-LTI small form factor power supply. Following close on the heels of the SX700-LPT, the new unit now packs 800 watts into a small chassis. SFX form factor cases and power supplies continue to grow in popularity and market share, and as one of the original manufacturers of SFX power supplies, SilverStone Technology Co. is striving to meet customer demand.

SX800-LTI

(SX=SFX Form Factor, 800=800W, L=Lengthened, TI=Titanium certified)

SilverStone has a long-standing reputation for providing a full line of high quality enclosures, power supplies, cooling components, and accessories for PC enthusiasts. With a continued focus on smaller physical size and support for small form-factor enthusiasts, SilverStone added the new SX800-LTI to their SFX form factor series. There are now eight power supplies in the SFX Series, ranging in output capacity from 300W to 800W. The SX800-LTI is the third SilverStone unit to feature a lengthened SFX chassis. The SX800-LTI enclosure is 30mm (1.2”) longer than a standard SFX power supply case, which allows using a quieter 120mm cooling fan rather than the typical 80mm fan used in most SFX power supplies.

In addition to its small size, the SX800-LTI features very high efficiency (80 Plus Titanium certified), all modular flat ribbon-style cables, and provides up to 800W of continuous DC output (850W peak). The SX800-LTI also operates in semi-fanless mode and incorporates a very quiet 120mm cooling fan.
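To put the Titanium badge in perspective, here is a quick back-of-the-envelope sketch (purely illustrative) of what the 80 Plus Titanium minimums at 115V - 90% efficiency at 10% load, 92% at 20%, 94% at 50%, and 90% at 100% - imply for wall draw on an 800W unit:

```cpp
// Illustrative arithmetic only: wall draw = DC output / efficiency.
// Thresholds are the published 80 Plus Titanium (115V internal) minimums.
#include <cstdio>

int main() {
    const double capacity = 800.0;  // rated DC output in watts
    const struct { double load, minEff; } pts[] = {
        {0.10, 0.90}, {0.20, 0.92}, {0.50, 0.94}, {1.00, 0.90}};
    for (const auto& p : pts) {
        const double dc = capacity * p.load;
        std::printf("%4.0fW DC -> at most %5.1fW from the wall (%4.1fW lost as heat)\n",
                    dc, dc / p.minEff, dc / p.minEff - dc);
    }
}
```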

3-SX800-diag.jpg

SilverStone SX800-LTI PSU Key Features:

•    Small Form Factor (SFX-L) design
•    800W continuous power output rated for 24/7 operation
•    80 Plus Titanium certified for very high efficiency
•    Quiet operation with semi-fanless operation
•    120mm cooling fan optimized for low noise
•    Powerful single +12V rail with 66A capacity
•    All-modular, flat ribbon-style cables
•    High quality construction with all Japanese capacitors
•    Strict ±3% voltage regulation and low AC ripple and noise
•    Support for high-end GPUs with four PCI-E 8/6-pin connectors
•    Safety Protections: OCP, OPP, OVP, UVP, SCP, and OTP

4-SX800-cables.jpg

Here is what SilverStone has to say about their new SX800-LTI power supply:

Since its launch in 2015, the SFX-L form factor has garnered popular recognition and support among enthusiasts with its larger 120mm fan able to achieve better balance of power and quietness in small form factor PCs than what was possible with standard SFX. And as a leader in power supply miniaturization, SilverStone has continued its efforts in advancing the SFX-L forward to reach ever higher limit.

The SX800-LTI not only has unprecedented 800 watts of power output but also has the highest level of 80 PLUS efficiency with a Titanium rating. It includes all features available from top of the line SilverStone PSUs such as flexible flat cables, all Japanese capacitors and advanced semi-fanless capability. For those looking to build the most efficient small form factor systems possible with great quality and power, the SX800-LTI is definitely the top choice.

Please continue reading our review of the SilverStone SX800-LTI PSU!!!

Manufacturer: Phononic

Introduction: A Hybrid Approach

The Hex 2.0 from Phononic is not your typical CPU cooler. It functions as both a thermoelectric cooler (TEC) - which you may also know as a Peltier cooler - and as a standard heatsink/fan, depending on CPU load. It offers a small footprint for placement in all but the lowest-profile systems, yet it boasts cooling potential beyond other coolers of its size. Yes, it is expensive, but this is a far more complex device than a standard air or even all-in-one liquid cooler - and obviously much smaller than even the most compact AiO liquid coolers.

DSC_0758.jpg

“The HEX 2.0 combines a proprietary state-of-the-art high performance thermoelectric module with an innovative heat exchanger. The small form factor CPU cooler pioneers a new category of cooling technology. The compact design comfortably fits in small chassis, including mini-ITX cases, while delivering cooling capacity beyond that of much larger coolers.”

Even though it does not always need to function as such, the Hex 2.0 is a thermoelectric cooling device, and that alone makes it interesting from a PC hardware enthusiast point of view (at least mine, anyway). The 'active-passive' approach taken by Phononic with the Hex 2.0 allows for greater performance potential than would otherwise be possible from a smaller TEC device, though our testing will of course reveal how effective it is in actual use.

PhononicHex20_Fig1.png

HEX 2.0 features an Active-Passive design (Credit: Phononic)

The goal for the HEX 2.0 CPU cooler was to provide similar cooling performance to all-in-one (AIO) liquid coolers or the very largest fan-heat sinks in a package that could fit into the smallest PC form factors (like miniITX). The active-passive design is what makes this possible. By splitting the CPU heat into two paths, as shown in Figure 1 (Ed. the above image), the thermoelectric device can be sized at an optimal point where it can provide the most benefit for lowering CPU temperature without having to be large enough to pump the entire CPU thermal load. We also designed electronic controls to turn off the thermoelectric heat pump at times of low CPU load, making for an energy efficient cooler that provides adequate cooling with zero power draw at low CPU loads. However, when the CPU is stressed and the CPU heat load increases, the electronic controls energize the thermoelectric heat pump, lowering the temperature of the passive base plate and the CPU itself. The active-passive design has one further benefit – when used in conjunction with the electronic controls, this design virtually eliminates the risk of condensation for the HEX 2.0.
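To make the control scheme described above concrete, here is a hypothetical sketch of the kind of hysteresis logic involved; the thresholds, polling, and TEC temperature delta are invented for illustration, as Phononic has not published its firmware logic:

```cpp
// Hypothetical sketch of the "active-passive" control idea (not Phononic's code).
#include <cstdio>

struct CoolerState { bool tecOn = false; };

// Simple hysteresis: energize the TEC under heavy load, drop back to passive
// operation once the CPU cools, and never let the cold side reach dew point.
void update(CoolerState& s, double cpuTempC, double dewPointC) {
    const double kOn = 70.0, kOff = 55.0;        // hypothetical thresholds
    if (!s.tecOn && cpuTempC > kOn) s.tecOn = true;
    else if (s.tecOn && cpuTempC < kOff) s.tecOn = false;
    // Condensation guard: keep the cold-side target above the ambient dew point.
    if (s.tecOn && cpuTempC - 15.0 < dewPointC)  // 15C of hypothetical TEC delta
        s.tecOn = false;
}

int main() {
    CoolerState s;
    for (double t : {40.0, 72.0, 68.0, 50.0}) {
        update(s, t, 12.0);
        std::printf("CPU %.0fC -> TEC %s\n", t, s.tecOn ? "on" : "off");
    }
}
```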

Continue reading our review of the Phononic HEX 2.0 Thermoelectric CPU Cooler!

Author:
Subject: Processors, Mobile
Manufacturer: Qualcomm

A new start

Qualcomm is finally ready to show the world how the Snapdragon 835 Mobile Platform performs. After months of teases and previews, including the reveal that it was the first processor built on Samsung’s 10nm process technology and a mostly in-depth look at the architectural changes to the CPU and GPU portions of the SoC, the company let a handful of media get some hands-on time with a development reference platform and run some numbers.

To frame the discussion as best I can, I am going to include some sections from my technology overview. This should give some idea of what to expect from Snapdragon 835 and which areas Qualcomm sees providing the widest variation from the previous SD 820/821 products.

Qualcomm frames the story around the Snapdragon 835 processor with what they call the “five pillars” – five different aspects of mobile processor design that they have addressed with updates and technologies. Qualcomm lists them as battery life (efficiency), immersion (performance), capture, connectivity, and security.

slides1-6.jpg

Starting where they start, on battery life and efficiency, the SD 835 has a unique focus that might surprise many. Rather than talking up the improvements in performance of the new processor cores, or the power of the new Adreno GPU, Qualcomm is firmly planted on looking at Snapdragon through the lens of battery life. Qualcomm claims that Snapdragon 835 uses half the power of Snapdragon 801.

slides2-11.jpg

Since we already knew that the Snapdragon 835 was going to be built on the 10nm process from Samsung, the first such high performance part to do so, I was surprised to learn that Qualcomm doesn’t attribute much of the power efficiency improvements to the move from 14nm to 10nm. It makes sense – most in the industry see this transition as modest in comparison to what we’ll see at 7nm. Unlike the move from 28nm to 14/16nm for discrete GPUs, where the process technology was a huge reason for the dramatic power drop we saw, the Snapdragon 835 changes come from a combination of advancements in the power management system and offloading of work from the primary CPU cores to other processors like the GPU and DSP. The more a workload takes advantage of heterogeneous computing systems, the more it benefits from Qualcomm technology as opposed to process technology.

slides2-22.jpg

Continue reading our preview of Qualcomm Snapdragon 835 performance!

Author:
Subject: Mobile
Manufacturer: Lenovo

Overview

If you look at the current 2-in-1 notebook market, it is clear that the single greatest influence is the Lenovo Yoga. Despite initial efforts to differentiate convertible notebook-tablet designs, newly released machines such as the HP Spectre x360 series and the Dell XPS 13" 2-in-1 make it clear that the 360-degree "Yoga-style" hinge is the preferred method.

DSC02488.JPG

Today, we are looking at a unique application of the 360-degree hinge, the Lenovo Yoga Book. Will this new take on the 2-in-1 concept prove just as influential?

The Lenovo Yoga Book is a 10.1" tablet that aims to find a unique way to implement a stylus on a modern touch device. The device itself is a super thin clamshell-style design, featuring an LCD on one side of the device, and a large touch-sensitive area on the opposing side.

lenovo-yoga-book-feature-notetaking-android-full-width.jpg

This large touch area serves two purposes. Primarily, it acts as a surface for the included stylus that Lenovo is calling the Real Pen. Using the Real Pen, users can do things such as sketching in Adobe Photoshop and Illustrator or taking notes in an application such as Microsoft OneNote.

The Real Pen has more tricks up its sleeve than just a normal stylus. It can be converted from a pen with a stylus tip to a full ballpoint pen. When paired with the "Create Pad" included with the Yoga Book, you can write on top of a piece of actual paper using the ballpoint pen, and still have the device pick up what you are drawing.

Click here to continue reading our review of the Lenovo Yoga Book.

Author:
Subject: General Tech
Manufacturer: ARM

New "Fabric" for ARM

It is not much of a stretch to say that ARM has had a pretty impressive run for the past 10 years since we started paying attention to the company from a consumer point of view.  It took 22 years for ARM to ship its first 50 billion chips.  It took another four years to hit the next 50 billion.  Now ARM expects to ship around 100 billion chips in the next four years.
arm_01.png
 
Last year we saw the introduction of multiple technologies from ARM in the shape of the latest Cortex-A CPUs and a new generation of Mali GPUs.  ARM has been near the forefront of applying their designs to the latest, cutting edge process technologies offered by Samsung and TSMC.  This change of pace has been refreshing considering that a few years ago they would announce a new architecture and expect to see it in new phones and devices about 3 years from that point.  Intel attempted a concerted push into mobile and ARM responded by tightening up their portfolio and aggressively pushing release dates.
 
This year appears no different for ARM as we expect new technologies to be announced again later this year that will update their offerings as well as process technology partnerships with the major pure-play foundries.  The first glimpse of what we can expect is ARM's announcement today of their DynamIQ technology.
 
arm_02.png
 
DynamIQ can be viewed as a portfolio of technologies that will power the next generation of ARM CPUs, GPUs, and potentially accelerators.  This encompasses power delivery, power control, connectivity, and topologies.
 
 
Subject: General Tech
Manufacturer: Topre

Ultimate Topre

There are cars that get you from point A to point B, and then there are luxurious grand touring cars which will get you there with power, comfort, and style - for a price. Based on the cost alone ($269.99 MSRP!) it seems like a safe bet to say that the REALFORCE RGB keyboard will be a similarly premium experience. Let’s take a look!

DSC_0965.jpg

There is as much personal taste at issue when considering a keyboard (or dream car!) as almost any other factor, and regardless of build quality or performance a keyboard is probably not going to work out for you if it doesn’t feel right. Mechanical keyboards are obviously quite popular, and more companies than ever offer their own models, many using Cherry MX key switches (or generic ‘equivalents’ - which vary in quality). Topre switches are different: each is a capacitive key with a rubber dome and metal spring, and they have a very smooth, fast feel - not clicky at all.

img_keyswitch.png

“Topre capacitive key switches are a patented hybrid between a mechanical spring based switch, a rubber dome switch, and a capacitive sensor which, combined, provide tactility, comfort, and excellent durability. The unique electrostatic design of Topre switches requires no physical mechanical coupling and therefore key switch bounce/chatter is eliminated.”

DSC_0962.jpg

Continue reading our review of the Topre REALFORCE RGB Keyboard!

Author:
Subject: Processors
Manufacturer: AMD

Here Comes the Midrange!

Today AMD is announcing the upcoming Ryzen 5 CPUs.  A little was known about them several weeks ago, when AMD talked about their upcoming 6 core processors, but official specifications were lacking.  Today we get to see what Ryzen 5 is mostly about.

ryzen5_01.png

There are four initial SKUs that AMD is talking about this evening.  These encompass quad core and six core products.  There are two “enthusiast” level SKUs with the X designation while the other two are aimed at a less edgy crowd.

The two six core CPUs are the 1600 and 1600X.  The X version features the higher extended frequency range when combined with performance cooling.  That unit is clocked at a base 3.6 GHz and achieves a boost of 4 GHz.  This compares well to the top end R7 1800X, but it is short two cores and four threads.  The price of the R5 1600X is a very reasonable $249.  The 1600 does not feature the extended range, but it does come in at a 3.2 GHz base and 3.6 GHz boost.  The R5 1600 has a MSRP of $219.

ryzen5_04.png

When we get to the four core, eight thread units we see much the same stratification.  The top end 1500X comes in at $189 and features a base clock of 3.5 GHz and a boost of 3.7 GHz.  What is interesting about this model is that the XFR is raised by 100 MHz vs. other XFR CPUs.  So instead of an extra 100 MHz boost when high end cooling is present we can expect to see 200 MHz.  In theory this could run at 3.9 GHz in the extended state.  The lowest priced R5 is the 1400 which comes in at a very modest $169.  This features a 3.2 GHz base clock and a 3.4 GHz boost.

The 1400, 1500X, and 1600 CPUs come with Wraith cooling solutions.  The 1600X comes bare as it is assumed that users will want to use something a bit more robust.  The R5 1400 comes with the lower end Wraith Stealth cooler while the R5 1500X and R5 1600 come with the bigger Wraith Spire.  The bottom three SKUs are all rated at 65 watts TDP.  The 1600X comes in at the higher 95 watt rating.  Each of the CPUs is unlocked for overclocking.

ryzen5_03.png

These chips will provide a more fleshed out pricing structure for the Ryzen processors and provide users and enthusiasts with lower cost options for those wanting to invest in AMD again.  These chips all run on the new AM4 platform, which is pretty strong in terms of features and I/O performance.

ryzen5_02.png

AMD is not shipping these parts today, but rather announcing them.  Review samples are not in hand yet and AMD expects world-wide availability by April 11.  This is likely a very necessary step for AMD as current AM4 motherboard availability is not at the level we were expecting to see.  We also are seeing some pretty quick firmware updates from motherboard partners to address issues with these first AM4 boards.  By April 11 I would expect to see most of the issues solved and a healthy supply of motherboards on the shelves to handle the influx of consumers waiting to buy these more midrange priced CPUs from AMD.

What they did not cover or answer is how the four core products are configured.  Would each be a single CCX with only 8 MB of L3 cache, or would AMD disable two cores in each CCX and present 16 MB of L3?  We currently do not have the answer to this.  Considering the latency of accessing different CCX units, we can surely hope they only keep one CCX active.

ryzen5_05.png

Ryzen has certainly been a success for AMD and I have no doubt that their quarter will be pretty healthy with the estimated sales of around 1 million Ryzen CPUs since launch.  Announcing these new chips will give the mainstream and budget enthusiasts something to look forward to and plan their purchases around.  AMD is not announcing the Ryzen 3 products at this time.

Update: AMD got back to me this morning about a question I asked them about the makeup of cores, CCX units, and L3 cache.  Here is their response.

1600X: 3+3 with 16MB L3 cache. 1600: 3+3 with 16MB L3 cache. 1500X: 2+2 with 16MB L3 cache. 1400: 2+2 with 8MB L3 cache. As with Ryzen 7, each core still has 512KB local L2 cache.

Author:
Manufacturer: Various

Background and setup

A couple of weeks back, during the excitement surrounding the announcement of the GeForce GTX 1080 Ti graphics card, NVIDIA announced an update to its performance reporting project known as FCAT to support VR gaming. The updated iteration, FCAT VR as it is now called, gives us not only the first true ability to capture the performance of VR games and experiences, but also the tools with which to measure and compare.

Watch this video walk through of FCAT VR with me and NVIDIA's Tom Petersen

I already wrote an extensive preview of the tool and how it works during the announcement. I think it’s likely that many of you overlooked it with the noise from a new GPU, so I’m going to reproduce some of it here, with additions and updates. Everyone that attempts to understand the data we will be presenting in this story and all VR-based tests going forward should have a baseline understanding of the complexity of measuring VR games. Previous tools don’t tell the whole story, and even the part they do tell is often incomplete.

If you already know how FCAT VR works from reading the previous article, you can jump right to the beginning of our results here.

Measuring and validating those claims has proven to be a difficult task. Tools that we used in the era of standard PC gaming just don’t apply. Fraps is a well-known and well-understood tool for measuring frame rates and frame times utilized by countless reviewers and enthusiasts, but Fraps lacked the ability to tell the complete story of gaming performance and experience. NVIDIA introduced FCAT and we introduced Frame Rating back in 2013 to expand the capabilities that reviewers and consumers had access to. Using a more sophisticated technique that includes direct capture of the graphics card output in uncompressed form, a software-based overlay applied to each frame being rendered, and post-process analysis of that data, we could communicate the smoothness of a gaming experience, better articulating it to help gamers make purchasing decisions.
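As a rough illustration of the Frame Rating idea (this is not NVIDIA's actual FCAT code, and the numbers are invented), the analysis step boils down to counting how many scanlines of each captured 60 Hz frame a given overlay color occupies:

```cpp
// Hedged sketch of the Frame Rating concept: each rendered frame carries a
// unique overlay color; in the captured 60 Hz video, the scanlines each color
// occupies tell you how long that rendered frame was on screen.
#include <cstdio>
#include <vector>

struct Run { int colorIndex; int scanlines; };  // consecutive scanlines of one color

int main() {
    const double kCaptureMs = 1000.0 / 60.0;  // one captured frame at 60 Hz
    const int kHeight = 1080;                 // scanlines per captured frame
    // Invented example: three rendered frames spread across two captured frames.
    std::vector<Run> runs = {{0, 540}, {1, 540}, {1, 200}, {2, 880}};

    std::vector<double> frameMs(3, 0.0);
    for (const Run& r : runs)
        frameMs[r.colorIndex] += kCaptureMs * r.scanlines / kHeight;  // on-screen time

    for (size_t i = 0; i < frameMs.size(); ++i)
        std::printf("frame %zu displayed for %.2f ms\n", i, frameMs[i]);
}
```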

vrpipe1.png

For VR though, those same tools just don’t cut it. Fraps is a non-starter as it measures frame rendering from the GPU point of view and completely misses the interaction between the graphics system and the VR runtime environment (OpenVR for Steam/Vive and OVR for Oculus). Because the rendering pipeline is drastically changed in the current VR integrations, what Fraps measures is completely different than the experience the user actually gets in the headset. Previous FCAT and Frame Rating methods were still viable but the tools and capture technology needed to be updated. The hardware capture products we used since 2013 were limited in their maximum bandwidth and the overlay software did not have the ability to “latch in” to VR-based games. Not only that but measuring frame drops, time warps, space warps and reprojections would be a significant hurdle without further development. 

vrpipe2.png

vrpipe3.png

NVIDIA decided to undertake the task of rebuilding FCAT to work with VR. And while the company is obviously hoping that it will prove its claims of performance benefits for VR gaming, the investment of time and money in a project that is to be open sourced and freely available to the media and the public should not be overlooked.

vlcsnap-2017-02-27-11h31m17s057.png

NVIDIA FCAT VR consists of two different applications. The FCAT VR Capture tool runs on the PC being evaluated and has a similar appearance to other performance and timing capture utilities. It uses data from Oculus event tracing (part of Windows ETW) and SteamVR’s performance API, along with NVIDIA driver stats when used on NVIDIA hardware, to generate performance data. It works perfectly well on any GPU vendor’s hardware, though, using the timing results from the VR vendor-specific APIs.

fcatvrcapture.jpg

Continue reading our first look at VR performance testing with FCAT VR!!

Manufacturer: RockIt Cool

Introduction

Introduction

With the introduction of the Intel Kaby Lake processors and Intel Z270 chipset, unprecedented overclocking became the norm. The new processors easily hit a core speed of 5.0GHz with little more than CPU core voltage tweaking. This overclocking performance increase came with a price tag. The Kaby Lake processor runs significantly hotter than previous generation processors, a seeming reversal of the temperature trends in recent Intel CPUs. At stock settings, the individual cores in the CPU hit up to 65C in testing - and that's with a high performance water loop cooling the processor. Per reports from various enthusiast sites, Intel used inferior TIM (thermal interface material) between the CPU die and the underside of the CPU heat spreader, leading to increased temperatures when compared with previous CPU generations (in particular Skylake). This temperature increase does not affect overclocking much, since the CPU will hit 5.0GHz easily, but it does impact the means necessary to hit those performance levels.

As with the previous generation Haswell CPUs, a few of the more adventurous enthusiasts used known methods to address the heat concerns of the Kaby Lake processor by delidding it. Unlike in the initial days of the Haswell processor, the delidding process is now much more streamlined thanks to the availability of delidding kits from several vendors. The delidding process still involves physically removing the heat spreader from the CPU and exposing the CPU die. However, instead of cooling the die directly, the "safer" approach is to clean the die and underside of the heat spreader, apply new TIM (thermal interface material), and re-affix the heat spreader to the CPU. Going this route instead of direct-die cooling is considered safer because no additional or exotic support mechanisms are needed to keep the CPU cooler from crushing your precious die. However, calling it safe is a bit of an overstatement: you are physically separating the heat spreader from the CPU surface and voiding your CPU warranty at the same time. Although if that were a concern, you probably wouldn't be reading this article in the first place.

Continue reading our Kaby Lake Relidding article!

Subject: Processors
Manufacturer: AMD

** UPDATE 3/13 5 PM **

AMD has posted a follow-up statement that officially clears up much of the conjecture this article was attempting to clarify. Relevant points from their post that relate to this article as well as many of the requests for additional testing we have seen since its posting (emphasis mine):

  • "We have investigated reports alleging incorrect thread scheduling on the AMD Ryzen™ processor. Based on our findings, AMD believes that the Windows® 10 thread scheduler is operating properly for “Zen,” and we do not presently believe there is an issue with the scheduler adversely utilizing the logical and physical configurations of the architecture."

  • "Finally, we have reviewed the limited available evidence concerning performance deltas between Windows® 7 and Windows® 10 on the AMD Ryzen™ CPU. We do not believe there is an issue with scheduling differences between the two versions of Windows.  Any differences in performance can be more likely attributed to software architecture differences between these OSes."

So there you have it, straight from the horse's mouth. AMD does not believe the problem lies within the Windows thread scheduler. SMT performance in gaming workloads was also addressed:

  • "Finally, we have investigated reports of instances where SMT is producing reduced performance in a handful of games. Based on our characterization of game workloads, it is our expectation that gaming applications should generally see a neutral/positive benefit from SMT. We see this neutral/positive behavior in a wide range of titles, including: Arma® 3, Battlefield™ 1, Mafia™ III, Watch Dogs™ 2, Sid Meier’s Civilization® VI, For Honor™, Hitman™, Mirror’s Edge™ Catalyst and The Division™. Independent 3rd-party analyses have corroborated these findings.

    For the remaining outliers, AMD again sees multiple opportunities within the codebases of specific applications to improve how this software addresses the “Zen” architecture. We have already identified some simple changes that can improve a game’s understanding of the "Zen" core/cache topology, and we intend to provide a status update to the community when they are ready."

We are still digging into the observed differences of toggling SMT compared with disabling the second CCX, but it is good to see AMD issue a clarifying statement here for all of those out there observing and reporting on SMT-related performance deltas.

** END UPDATE **

Editor's Note: The testing you see here was a response to many days of comments and questions to our team on how and why AMD Ryzen processors are seeing performance gaps in 1080p gaming (and other scenarios) in comparison to Intel Core processors. Several outlets have posted that the culprit is the Windows 10 scheduler and its inability to properly allocate work across the logical vs. physical cores of the Zen architecture. As it turns out, we can prove that isn't the case at all. -Ryan Shrout

Initial reviews of AMD’s Ryzen CPU revealed a few inefficiencies in some situations, particularly in gaming workloads running at the more common resolutions like 1080p, where the CPU becomes more of a bottleneck when paired with modern GPUs. Lots of folks have theorized about what could possibly be causing these issues, and most recent attention appears to have been directed at the Windows 10 scheduler and its supposed inability to properly place threads on the Ryzen cores for the most efficient processing.

I typically have Task Manager open while running storage tests (they are boring to watch otherwise), and I naturally had it open during Ryzen platform storage testing. I’m accustomed to how the IO workers are distributed across reported threads, and in the case of SMT capable CPUs, distributed across cores. There is a clear difference when viewing our custom storage workloads with SMT on vs. off, and it was dead obvious to me that core loading was working as expected while I was testing Ryzen. I went back and pulled the actual thread/core loading data from my testing results to confirm:

SMT on usage.png

The Windows scheduler has a habit of bouncing processes across available processor threads. This naturally happens as other processes share time with a particular core, with the heavier process not necessarily switching back to the same core. As you can see above, the single IO handler thread was spread across the first four cores during its run, but the Windows scheduler was always hitting just one of the two available SMT threads on any single core at one time.

My testing for Ryan’s Ryzen review consisted of only single threaded workloads, but we can make things a bit clearer by loading down half of the CPU while toggling SMT off. We do this by increasing the worker count (4) to be half of the available threads on the Ryzen processor, which is 8 with SMT disabled in the motherboard BIOS.

smtoff4workers.png

SMT OFF, 8 cores, 4 workers

With SMT off, the scheduler is clearly not giving priority to any particular core and the work is spread throughout the physical cores in a fairly even fashion.

Now let’s try with SMT turned back on and doubling the number of IO workers to 8 to keep the CPU half loaded:

smton8workers.png

SMT ON, 16 (logical) cores, 8 workers

With SMT on, we see a very different result. The scheduler is clearly loading only one thread per core. This could only be possible if Windows was aware of the 2-way SMT (two threads per core) configuration of the Ryzen processor. Do note that sometimes the workload will toggle around every few seconds, but the total loading on each physical core will still remain at ~50%. I chose a workload that saturated its thread just enough for Windows to not shift it around as it ran, making the above result even clearer.

Synthetic Testing Procedure

While the storage testing methods above provide a real-world example of the Windows 10 scheduler working as expected, we do have another workload that can help demonstrate core balancing with Intel Core and AMD Ryzen processors. A quick and simple custom-built C++ application can be used to generate generic worker threads and monitor for core collisions and resolutions.

This test app has a very straightforward workflow. Every few seconds it generates a new thread, capping at N/2 threads total, where N is equal to the reported number of logical cores. If the OS scheduler is working as expected, it should load 8 threads across 8 physical cores, though the division between the specific logical core per physical core will be based on very minute parameters and conditions going on in the OS background.

By monitoring the APIC_ID through the CPUID instruction, the first application thread monitors all threads and detects and reports on collisions - when a thread from our app is running on the same core as another thread from our app. That thread also reports when those collisions have been cleared. In an ideal and expected environment where Windows 10 knows the boundaries of physical and logical cores, you should never see more than one thread of a core loaded at the same time.
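To illustrate the approach, here is a hedged sketch of the same idea - not the actual tool shown below. It assumes a GCC/Clang toolchain (for the cpuid.h intrinsics) and that SMT siblings share all but the low bit of the x2APIC ID, which holds for 2-way SMT parts like Ryzen:

```cpp
// Sketch of a core-collision monitor: spawn N/2 saturating threads, record each
// one's physical core via CPUID, and flag any two workers sharing a core.
#include <cpuid.h>
#include <atomic>
#include <chrono>
#include <cstdio>
#include <thread>
#include <vector>

// Physical core ID, assuming SMT siblings differ only in the low APIC ID bit.
static unsigned physicalCore() {
    unsigned eax, ebx, ecx, edx;
    __get_cpuid_count(0x0B, 0, &eax, &ebx, &ecx, &edx);  // x2APIC ID in EDX
    return edx >> 1;
}

int main() {
    const unsigned workers = std::thread::hardware_concurrency() / 2;  // N/2 threads
    std::vector<std::atomic<unsigned>> core(workers);
    std::atomic<bool> run{true};
    std::vector<std::thread> pool;

    for (unsigned i = 0; i < workers; ++i) {
        pool.emplace_back([&core, &run, i] {
            while (run) core[i] = physicalCore();  // saturate the thread, record core
        });
        std::this_thread::sleep_for(std::chrono::seconds(2));  // stagger spawns
    }

    for (int tick = 0; tick < 30; ++tick) {  // watch for collisions/resolutions
        for (unsigned i = 0; i < workers; ++i)
            for (unsigned j = i + 1; j < workers; ++j)
                if (core[i] == core[j])
                    std::printf("collision: workers %u and %u share core %u\n",
                                i, j, core[i].load());
        std::this_thread::sleep_for(std::chrono::seconds(1));
    }
    run = false;
    for (auto& t : pool) t.join();
}
```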

app01.png

Click to Enlarge

This screenshot shows our app working on the left and the Windows Task Manager on the right with logical cores labeled. While it may look like all logical cores are being utilized at the same time, in fact they are not. At any given point, only LCore 0 or LCore 1 are actively processing a thread. Need proof? Check out the modified view of the task manager where I copy the graph of LCore 1/5/9/13 over the graph of LCore 0/4/8/12 with inverted colors to aid viewability.

app02-2.png

If you look closely, by overlapping the graphs in this way, you can see that the threads migrate from LCore 0 to LCore 1, LCore 4 to LCore 5, and so on. The graphs intersect and fill in to consume ~100% of the physical core. This pattern is repeated for the other 8 logical cores on the right two columns as well. 

Running the same application on a Core i7-5960X Haswell-E 8-core processor shows a very similar behavior.

app03.png

Click to Enlarge

Each pair of logical cores shares a single thread and when thread transitions occur away from LCore N, they migrate perfectly to LCore N+1. It does appear that in this scenario the Intel system is showing a more stable threaded distribution than the Ryzen system. While that may in fact incur some performance advantage for the 5960X configuration, the penalty for intra-core thread migration is expected to be very minute.

The fact that Windows 10 is balancing the 8 thread load specifically between matching logical core pairs indicates that the operating system is perfectly aware of the processor topology and is selecting distinct cores first to complete the work.

Information from this custom application, along with the storage performance tool example above, clearly show that Windows 10 is attempting to balance work on Ryzen between cores in the same manner that we have experienced with Intel and its HyperThreaded processors for many years.

Continue reading our look at AMD Ryzen and Windows 10 scheduling!

Author:
Manufacturer: NVIDIA

Flagship Performance Gets Cheaper

UPDATE! If you missed our launch day live stream, you can find the replay below:

It’s a very interesting time in the world of PC gaming hardware. We just saw the release of AMD’s Ryzen processor platform that shook up the processor market for the first time in a decade, AMD’s Vega architecture has been given the brand name Radeon RX Vega, and the anticipation for the first high-end competitive part from AMD since Hawaii grows as well. AMD was seemingly able to take advantage of Intel’s slow innovation pace on the processor and it was hoping to do the same to NVIDIA on the GPU. NVIDIA’s product line has been dominant in the mid and high-end gaming market since the 900-series with the 10-series products further cementing the lead.

box1.jpg

The most recent high end graphics card release came in the form of the updated Titan X based on the Pascal architecture. That was WAY back in August of 2016 – a full seven months ago! Since then we have seen very little change at the top end of the product lines and what little change we did see came from board vendors adding in technology and variation on the GTX 10-series.

Today we see the release of the new GeForce GTX 1080 Ti, a card that offers only a handful of noteworthy technological changes, but one that is able to shake up the market by instigating pricing adjustments - making the performance it offers more appealing and lowering the price of everything else.

The GTX 1080 Ti GP102 GPU

I already wrote about the specifications of the GPU in the GTX 1080 Ti when it was announced last week, so here’s a simple recap.

  GTX 1080 Ti Titan X (Pascal) GTX 1080 GTX 980 Ti TITAN X GTX 980 R9 Fury X R9 Fury R9 Nano
GPU GP102 GP102 GP104 GM200 GM200 GM204 Fiji XT Fiji Pro Fiji XT
GPU Cores 3584 3584 2560 2816 3072 2048 4096 3584 4096
Base Clock 1480 MHz 1417 MHz 1607 MHz 1000 MHz 1000 MHz 1126 MHz 1050 MHz 1000 MHz up to 1000 MHz
Boost Clock 1582 MHz 1480 MHz 1733 MHz 1076 MHz 1089 MHz 1216 MHz - - -
Texture Units 224 224 160 176 192 128 256 224 256
ROP Units 88 96 64 96 96 64 64 64 64
Memory 11GB 12GB 8GB 6GB 12GB 4GB 4GB 4GB 4GB
Memory Clock 11000 MHz 10000 MHz 10000 MHz 7000 MHz 7000 MHz 7000 MHz 500 MHz 500 MHz 500 MHz
Memory Interface 352-bit 384-bit G5X 256-bit G5X 384-bit 384-bit 256-bit 4096-bit (HBM) 4096-bit (HBM) 4096-bit (HBM)
Memory Bandwidth 484 GB/s 480 GB/s 320 GB/s 336 GB/s 336 GB/s 224 GB/s 512 GB/s 512 GB/s 512 GB/s
TDP 250 watts 250 watts 180 watts 250 watts 250 watts 165 watts 275 watts 275 watts 175 watts
Peak Compute 10.6 TFLOPS 10.1 TFLOPS 8.2 TFLOPS 5.63 TFLOPS 6.14 TFLOPS 4.61 TFLOPS 8.60 TFLOPS 7.20 TFLOPS 8.19 TFLOPS
Transistor Count 12.0B 12.0B 7.2B 8.0B 8.0B 5.2B 8.9B 8.9B 8.9B
Process Tech 16nm 16nm 16nm 28nm 28nm 28nm 28nm 28nm 28nm
MSRP (current) $699 $1,200 $599 $649 $999 $499 $649 $549 $499

The GTX 1080 Ti looks a whole lot like the TITAN X launched in August of last year. Based on the 12B transistor GP102 chip, the new GTX 1080 Ti will have 3,584 CUDA cores with a 1.60 GHz Boost clock. That gives it the same processor count as Titan X but with a slightly higher clock speed, which should make the new GTX 1080 Ti faster by at least a few percentage points; it holds a 4.7% edge in base clock compute capability. It has 28 SMs, 28 geometry units, and 224 texture units.

GeForce_GTX_1080_Ti_Block_Diagram.png

Interestingly, the memory system on the GTX 1080 Ti gets adjusted – NVIDIA has disabled a single 32-bit memory controller to give the card a 352-bit wide bus and an odd-sounding 11GB memory capacity. The ROP count also drops to 88 units. Speaking of 11, the memory clock on the G5X implementation on the GTX 1080 Ti will now run at 11 Gbps, a boost available to NVIDIA thanks to a chip revision from Micron and improvements to equalization and reverse signal distortion.
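The table's figures check out with simple arithmetic; a quick sketch (purely illustrative, using the numbers above):

```cpp
// Sanity check: bandwidth = per-pin data rate x bus width / 8 bits per byte,
// and Pascal's GP102 packs 128 CUDA cores per SM.
#include <cstdio>

int main() {
    constexpr double ti    = 11.0 * 352 / 8;  // 11 Gbps x 352-bit = 484 GB/s
    constexpr double titan = 10.0 * 384 / 8;  // 10 Gbps x 384-bit = 480 GB/s
    constexpr int cores    = 28 * 128;        // 28 SMs x 128/SM   = 3584
    std::printf("1080 Ti: %.0f GB/s, Titan X: %.0f GB/s, CUDA cores: %d\n",
                ti, titan, cores);
}
```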

The move from 12GB of memory on the GP102-based Titan X to 11GB on the GTX 1080 Ti is an interesting move, and evokes memories of the GTX 970 fiasco, where NVIDIA disabled a portion of that memory controller but left the memory that would have resided on it ON the board. The result - memory that behaved as 3.5GB at one speed and 500MB at another - was the wrong move to make. But releasing the GTX 970 with "3.5GB" of memory would have seemed odd too. NVIDIA is not making the same mistake, instead building the GTX 1080 Ti with 11GB out of the gate.

Continue reading our review of the NVIDIA GeForce GTX 1080 Ti 11GB graphics card!

Author:
Subject: Processors
Manufacturer: AMD

The right angle

While many in the media and enthusiast communities are still trying to fully grasp the importance and impact of the recent AMD Ryzen 7 processor release, I have been trying to complete my review of the 1700X and 1700 processors, in between testing the upcoming GeForce GTX 1080 Ti and preparing for more hardware to show up at the offices very soon. There is still much to learn and understand about the first new architecture from AMD in nearly a decade, including analysis of the memory hierarchy, power consumption, overclocking, gaming performance, etc.

During my Ryzen 7 1700 testing, I went through some overclocking evaluation and thought the results might be worth sharing sooner rather than later. This quick article is just a preview of what we are working on, so don’t expect to find the answers to Ryzen power management here, only a recounting of how I was able to get stellar performance from the lowest priced Ryzen part on the market today.

The system specifications for this overclocking test were identical to our original Ryzen 7 processor review.

Test System Setup
CPU AMD Ryzen 7 1800X
AMD Ryzen 7 1700X
AMD Ryzen 7 1700
Intel Core i7-7700K
Intel Core i5-7600K
Intel Core i7-6700K
Intel Core i7-6950X
Intel Core i7-6900K
Intel Core i7-6800K
Motherboard ASUS Crosshair VI Hero (Ryzen)
ASUS Prime Z270-A (Kaby Lake, Skylake)
ASUS X99-Deluxe II (Broadwell-E)
Memory 16GB DDR4-2400
Storage Corsair Force GS 240 SSD
Sound Card On-board
Graphics Card NVIDIA GeForce GTX 1080 8GB
Graphics Drivers NVIDIA 378.49
Power Supply Corsair HX1000
Operating System Windows 10 Pro x64

Of note is that I am still utilizing the Noctua U12S cooler that AMD provided for our initial testing – all of the overclocking and temperature reporting in this story is air cooled.

DSC02643.jpg

First, let’s start with the motherboard. All of this testing was done on the ASUS Crosshair VI Hero with the latest 5704 BIOS installed. As I began to explore the different overclocking capabilities (BCLK adjustment, multipliers, voltage) I came across one of the ASUS presets. These presets offer pre-defined collections of settings that ASUS feels will offer simple overclocking capabilities. An option for higher BCLK existed, but the one that caught my eye was straightforward – 4.0 GHz.

asusbios.jpg

With the Ryzen 1700 installed, I thought I would give it a shot. Keep in mind that this processor has a base clock of 3.0 GHz, a rated maximum boost clock of 3.7 GHz, and is the only 65-watt TDP variant of the three Ryzen 7 processors released last week. Because of that, I didn’t expect its overclocking capability to match what the 1700X and 1800X could offer. Based on previous processor experience, when a chip is binned at a lower power draw than the rest of a family it will often have properties that make it disadvantageous for running at HIGHER power. Based on my results here, that doesn’t seem to be the case.

4.0.PNG

By simply enabling that option in the ASUS UEFI and rebooting, our Ryzen 1700 processor was running at 4.0 GHz on all cores! For this piece, I won’t be going into the drudge and debate on what settings ASUS changed to get to this setting or if the voltages are overly aggressive – the point is that it just works out of the box.

Continue reading our look at overclocking the new Ryzen 7 1700 processor!

Subject: General Tech
Manufacturer: Logitech

Introduction and Specifications

The G533 Wireless headset is the latest offering from Logitech, combining the company’s premium Pro-G drivers, 15-hour battery life, and a new, more functional style. Obvious comparisons can be made to last year’s G933 Artemis Spectrum, since both are wireless headsets using Logitech’s Pro-G drivers; but this new model comes in at a lower price while offering much of the same functionality (while dropping the lighting effects). So does the new headset sound any different? What about the construction? Read on to find out!

DSC_0393.jpg

The G533 exists alongside the G933 Artemis Spectrum in Logitech’s current lineup, but it takes most of the features from that high-end wireless model, while paring it down to create a lean, mean option for gamers who don’t need (or want) RGB lighting effects. The 40 mm Pro-G drivers are still here, and the new G533 offers a longer battery life (15 hours) than the G933 could manage, even with its lighting effects disabled (12 hours). 7.1-channel surround effects and full EQ and soundfield customization remain, though only DTS effects are present (no Dolby this time).

What do these changes translate to? First of all, the G533 headset is being introduced with a $149 MSRP, which is $50 lower than the G933 Artemis Spectrum at $199. I think many of our readers would trade RGB effects for lower cost, making this a welcome change (especially considering lighting effects don’t really mean much when you are wearing the headphones). Another difference is the overall weight of the headset at 12.5 oz, which is 0.5 oz lighter than the G933 at 13 oz.

DSC_0405.jpg

Continue reading our review of the Logitech G533 Wireless 7.1 Surround Gaming Headset!

Author:
Manufacturer: Riotoro

Introduction and Features

Introduction

Riotoro is a new player in the already crowded PC power supply market. Formed in 2014 and based in California, Riotoro originally started their PC hardware business with a focus on cases, mice, and LED fans targeted towards the gaming community. Now they are expanding their product offerings to include two new power supply lines, the Enigma and Onyx Series, along with two liquid CPU coolers and several RGB gaming keyboards. We will be taking a detailed look at Riotoro’s new Enigma 850W power supply in this review.

2-R850-diag.jpg

Riotoro announced the introduction of three power supplies at Computex 2016: the Enigma 850W, Onyx 750W, and Onyx 650W. All three power supplies were developed in partnership with Great Wall and are based on new platforms designed to hit the sweet spot for practical real-world performance, reliability, and price. The Onyx line will initially be available in 650W and 750W models. The more upscale Enigma line will kick off with the 850W model.

The Riotoro Enigma 850W power supply is certified to comply with the 80 Plus Gold criteria for high efficiency, comes with semi-modular cables, and uses a quiet 140mm variable speed fan for cooling.

3-R850-front-cables.jpg

Riotoro Enigma 850W PSU Key Features:

•    850W Continuous DC output at up to 40°C
•    80 PLUS Gold certified for high efficiency
•    Semi-modular cables
•    Quiet 140mm cooling fan
•    Japanese made bulk (electrolytic) capacitors
•    Compatible with Intel and AMD processors and motherboards
•    Active Power Factor correction with Universal AC input (100 to 240 VAC)
•    Safety protections: OVP, UVP, OCP, OPP, and SCP
•    5-Year warranty
•    MSRP: $119.99 USD

Please continue reading our review of the Riotoro Enigma 850W PSU!!!

Author:
Subject: Processors
Manufacturer: AMD

AMD Ryzen 7 Processor Specifications

It’s finally here and it’s finally time to talk about it. The AMD Ryzen processor is being released onto the world and, based on the buildup of excitement over the last week or so since pre-orders began, details on just how Ryzen performs relative to Intel’s mainstream and enthusiast processors are a hot commodity. While leaks have been surfacing for months and details seem to be streaming out from those not bound to the same restrictions we have been, I think you are going to find our analysis of the Ryzen 7 1800X processor to be quite interesting and maybe a little different as well.

Honestly, there isn’t much that has been left to the imagination about Ryzen, its chipsets, pricing, etc. with the slow trickle of information that AMD has been sending out since before CES in January. We know about the specifications, we know about the architecture, we know about the positioning; and while I will definitely recap most of that information here, the real focus is going to be on raw numbers. Benchmarks are what we are targeting with today’s story.

Let’s dive right in.

The Zen Architecture – Foundation for Ryzen

Actually, as it turns out, in typical Josh Walrath fashion, he wrote too much about the AMD Zen architecture to fit into this page. So, instead, you'll find his complete analysis of AMD's new baby right here: AMD Zen Architecture Overview: Focus on Ryzen

ccx.png

AMD Ryzen 7 Processor Specifications

Though we have already detailed the most important specifications for the new AMD Ryzen processors when the preorders went live, it’s worth touching on them again and reemphasizing the important ones.

  Ryzen 7 1800X Ryzen 7 1700X Ryzen 7 1700 Core i7-6900K Core i7-6800K Core i7-7700K Core i5-7600K Core i7-6700K
Architecture Zen Zen Zen Broadwell-E Broadwell-E Kaby Lake Kaby Lake Skylake
Process Tech 14nm 14nm 14nm 14nm 14nm 14nm+ 14nm+ 14nm
Cores/Threads 8/16 8/16 8/16 8/16 6/12 4/8 4/4 4/8
Base Clock 3.6 GHz 3.4 GHz 3.0 GHz 3.2 GHz 3.4 GHz 4.2 GHz 3.8 GHz 4.0 GHz
Turbo/Boost Clock 4.0 GHz 3.8  GHz 3.7 GHz 3.7 GHz 3.6 GHz 4.5 GHz 4.2 GHz 4.2 GHz
Cache 20MB 20MB 20MB 20MB 15MB 8MB 8MB 8MB
Memory Support DDR4-2400 DDR4-2400 DDR4-2400 DDR4-2400 DDR4-2400 DDR4-2400 DDR4-2400 DDR4-2400
Memory Channels Dual Dual Dual Quad Quad Dual Dual Dual
TDP 95 watts 95 watts 65 watts 140 watts 140 watts 91 watts 91 watts 91 watts
Price $499 $399 $329 $1050 $450 $350 $239 $309

All three of the currently announced Ryzen processors are 8-core, 16-thread designs, matching the Core i7-6900K from Intel in that regard. Though Intel does have a 10-core part branded for consumers, it comes in at a significantly higher price point (over $1500 still). The clock speeds of Ryzen are competitive with the Broadwell-E platform options, though they are clearly behind the curve when it comes to the clock capabilities of Kaby Lake and Skylake. With admittedly lower IPC than Kaby Lake, Zen will struggle in any purely single threaded workload with as much as a 500 MHz deficit in clock rate.

One interesting deviation from Intel's designs is Ryzen's more granular boost capability. AMD Ryzen CPUs will be able to move between processor states in 25 MHz increments while Intel is currently limited to 100 MHz. If implemented correctly and effectively through SenseMI, this allows Ryzen to gain 25-75 MHz of additional performance in a scenario where it is too thermally constrained to hit the next 100 MHz step.
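A toy illustration of why the finer step matters (the ceiling value here is hypothetical): quantize a thermally sustainable clock speed to each vendor's step size and compare what is left on the table.

```cpp
// Toy example: with a finer P-state step, the highest sustainable clock sits
// closer to the thermal ceiling. Numbers are invented for illustration.
#include <cstdio>

int quantize(int mhz, int step) { return mhz / step * step; }  // round down to a step

int main() {
    const int ceilingMHz = 3975;  // hypothetical thermally sustainable speed
    std::printf("100 MHz steps: %d MHz\n", quantize(ceilingMHz, 100));  // 3900
    std::printf(" 25 MHz steps: %d MHz\n", quantize(ceilingMHz, 25));   // 3975
}
```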

DSC02636.jpg

XFR (Extended Frequency Range), supported on the Ryzen 7 1800X and 1700X (hence the "X"), "lifts the maximum Precision Boost frequency beyond ordinary limits in the presence of premium systems and processor cooling." The story goes that if you have better than average cooling, the 1800X will be able to scale up to 4.1 GHz in some instances for some undetermined amount of time. The better the cooling, the longer it can operate in XFR. While this was originally pitched to us as a game-changing feature that would bring extreme advantages to water cooling enthusiasts, it seems it was scaled back for the initial release. Only getting a 100 MHz performance increase in the best case seems a bit more like technology for technology's sake rather than offering new capabilities for consumers.

cpu2.jpg

Ryzen integrates a dual channel DDR4 memory controller with speeds up to 2400 MHz, matching what Intel can do on Kaby Lake. Broadwell-E has the advantage with a quad-channel controller, but how useful that ends up being will be interesting to see as we step through our performance testing.
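For reference, the peak bandwidth those controller configurations work out to (simple arithmetic, illustrative only):

```cpp
// Peak theoretical bandwidth: channels x 8 bytes per channel x transfer rate.
#include <cstdio>

int main() {
    const double mts = 2400e6;  // DDR4-2400 transfers per second
    std::printf("dual channel: %.1f GB/s\n", 2 * 8 * mts / 1e9);  // Ryzen, Kaby Lake
    std::printf("quad channel: %.1f GB/s\n", 4 * 8 * mts / 1e9);  // Broadwell-E
}
```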

One area of interest is the TDP ratings. AMD and Intel have very different views on how this is calculated. Intel has made this the maximum power draw of the processor while AMD sees it as a target for thermal dissipation over time. This means that under stock settings the Core i7-7700K will not draw more than 91 watts and the Core i7-6900K will not draw more than 140 watts. And in our testing, they are well under those ratings most of the time, whenever AVX code is not running. AMD’s 95-watt rating on the Ryzen 1800X, though, will very often be exceeded, and our power testing proves that out. The logic is that a cooler with a 95-watt rating and the behavior of thermal propagation give the cooling system time to catch up. (Interestingly, this is the philosophy Intel has taken with its Kaby Lake mobile processors.)

lisa-29.jpg

Obviously the most important line here for many of you is the price. The Core i7-6900K is the lowest priced 8C/16T option from Intel for consumers at $1050. The Ryzen R7 1800X has a sticker price less than half of that, at $499. The R7 1700X vs Core i7-6800K match is interesting as well, where the AMD CPU will sell for $399 versus $450 for the 6800K. However, the 6800K only has 6 cores and 12 threads, giving the Ryzen part a 33% advantage in core count for multi-threaded work. The 7700K and R7 1700 battle will be interesting as well, with a 4-core difference in capability and a roughly $20 price advantage to AMD.

Continue reading our review of the new AMD Ryzen 7 1800X processor!!

Author:
Subject: Processors
Manufacturer: AMD

What Makes Ryzen Tick

We have been exposed to details about the Zen architecture for the past several Hot Chips conventions as well as other points of information directly from AMD.  Zen was a clean sheet design that borrowed some of the best features from the Bulldozer and Jaguar architectures, as well as integrating many new ideas that had not been executed in AMD processors before.  The fusion of ideas from higher performance cores, lower power cores, and experience gained in APU/GPU design have all come together in a very impressive package that is the Ryzen CPU.

zen_01.jpg

It is well known that AMD brought back Jim Keller to head the CPU group after the slow downward spiral that AMD entered in CPU design.  While the Athlon 64 was a tremendous part for its time, the subsequent CPUs offered by the company did not retain that leadership position.  The original Phenom had problems right off the bat and could not compete well with Intel’s latest dual and quad cores.  The Phenom II shored up their position a bit, but in the end could not keep pace with the products that Intel continued to introduce with their newly minted “tick-tock” cycle.  Bulldozer had issues out of the gate and did not have performance numbers significantly greater than the previous generation “Thuban” 6 core Phenom II product, much less the latest Intel Sandy Bridge and Ivy Bridge products it would compete with.

AMD attempted to stop the bleeding by iterating and evolving the Bulldozer architecture with Piledriver, Steamroller, and Excavator.  The final products based on this design arc seemed to do fine for the markets they were aimed at, but certainly did not regain any market share as AMD’s desktop numbers kept shrinking.  No matter what AMD did, the base architecture just could not overcome some of the basic properties that impeded strong IPC performance.

52_perc_design_opt.png

The primary goal of this new architecture is to increase IPC to a level consistent with what Intel has to offer.  AMD aimed to increase IPC by at least 40% over the previous Excavator core.  This is a pretty aggressive goal considering where AMD was with the Bulldozer architecture, which was focused on good multi-threaded performance and high clock speeds.  AMD claims that it has in fact increased IPC by an impressive 54% over the previous Excavator based core.  Not only has AMD seemingly hit its performance goals, it has exceeded them.  AMD also plans on using the Zen architecture to power products from mobile all the way up to the highest TDP parts offered.

 

The Zen Core

The basis for Ryzen is the CCX module.  These modules contain four Zen cores along with 8 MB of shared L3 cache.  Each core has 64 KB of L1 instruction cache and 32 KB of L1 data cache, plus 512 KB of L2 cache.  The L2 caches are inclusive of L1, while the L3 acts as a victim cache that holds lines evicted from the L2 caches.  AMD has improved the performance of their caches to a very large degree as compared to previous architectures.  The arrangement here allows the individual cores to quickly snoop any changes in the caches of the others for shared workloads.  So if a cache line is changed on one core, other cores requiring that data can quickly snoop the shared L3 and read it.  Doing this allows the CPU doing the actual work to not be interrupted by cache read requests from other cores.

ccx.png

l2_cache.png

l3_cache.png
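Those capacity numbers can be made visible from software with a generic pointer-chasing microbenchmark - this is not AMD's tooling, just a hedged sketch of the standard technique, where average load latency steps up as the working set spills past each cache level:

```cpp
// Generic cache-latency microbenchmark: latency jumps as the working set
// exceeds the 32 KB L1D, 512 KB L2, and 8 MB L3 described above.
#include <chrono>
#include <cstddef>
#include <cstdio>
#include <numeric>
#include <random>
#include <utility>
#include <vector>

int main() {
    std::mt19937_64 rng{42};
    for (size_t kb : {16, 32, 64, 256, 512, 2048, 4096, 8192, 32768}) {
        const size_t n = kb * 1024 / sizeof(size_t);
        std::vector<size_t> next(n);
        std::iota(next.begin(), next.end(), size_t{0});
        // Sattolo's shuffle builds one full-length cycle, defeating the prefetchers.
        for (size_t i = n - 1; i > 0; --i) {
            std::uniform_int_distribution<size_t> pick(0, i - 1);
            std::swap(next[i], next[pick(rng)]);
        }
        size_t idx = 0;
        const size_t hops = 20000000;
        const auto t0 = std::chrono::steady_clock::now();
        for (size_t i = 0; i < hops; ++i) idx = next[idx];  // dependent loads, no ILP
        const double ns = std::chrono::duration<double, std::nano>(
                              std::chrono::steady_clock::now() - t0).count();
        std::printf("%6zu KB: %5.2f ns/load (sink %zu)\n", kb, ns / hops, idx);
    }
}
```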

Each core can handle two threads, but unlike Bulldozer it has a single integer core.  Bulldozer modules featured two integer units and a shared FPU/SIMD unit.  Zen gets rid of CMT for good: we have a single integer unit and FPU for each core.  The core can address two threads by utilizing AMD’s version of SMT (simultaneous multi-threading).  There is a primary thread that gets higher priority while the second thread has to wait until resources are freed up.  This works far better in the real world than how I explained it, as resources are constantly being shuffled about and the primary thread will not monopolize all resources within the core.

Click here to read more about AMD's Zen architecture in Ryzen!

Linked Multi-GPU Arrives... for Developers

The Khronos Group has released the Vulkan 1.0.42.0 specification, which includes experimental (more on that in a couple of paragraphs) support for VR enhancements, sharing resources between processes, and linking similar GPUs. This spec was released alongside a LunarG SDK and NVIDIA drivers that fully implement these extensions; both are intended for developers, not gamers.

I would expect that the most interesting feature is experimental support for linking similar GPUs together, similar to DirectX 12’s Explicit Linked Multiadapter, which Vulkan calls a “Device Group”. The idea is that the physical GPUs hidden behind this layer can do things like share resources, such as rendering a texture on one GPU and consuming it in another, without the host code being involved. I’m guessing that some studios, like maybe Oxide Games, will decide to not use this feature. While it’s not explicitly stated, I cannot see how this (or DirectX 12’s Explicit Linked mode) would be compatible in cross-vendor modes. Unless I’m mistaken, that would require AMD, NVIDIA, and/or Intel restructuring their drivers to inter-operate at this level. Still, the assumptions that could be made with grouped devices are apparently popular with enough developers for both the Khronos Group and Microsoft to bother.
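For the curious, here is a rough sketch of how an application might enumerate device groups against the 1.0.42-era headers. The KHX names below are provisional by definition and taken on the assumption that they match that SDK's headers, so treat this as illustrative rather than shipping code:

```cpp
// Hedged sketch using the then-experimental VK_KHX_device_group_creation
// instance extension to enumerate linked-GPU "device groups".
#include <vulkan/vulkan.h>
#include <cstdio>
#include <vector>

int main() {
    VkInstanceCreateInfo ici{VK_STRUCTURE_TYPE_INSTANCE_CREATE_INFO};
    const char* ext[] = {"VK_KHX_device_group_creation"};
    ici.enabledExtensionCount = 1;
    ici.ppEnabledExtensionNames = ext;
    VkInstance inst;
    if (vkCreateInstance(&ici, nullptr, &inst) != VK_SUCCESS) return 1;

    // KHX entry points must be fetched at runtime.
    auto enumGroups = (PFN_vkEnumeratePhysicalDeviceGroupsKHX)
        vkGetInstanceProcAddr(inst, "vkEnumeratePhysicalDeviceGroupsKHX");
    if (!enumGroups) { std::puts("extension not supported"); return 1; }

    uint32_t count = 0;
    enumGroups(inst, &count, nullptr);
    std::vector<VkPhysicalDeviceGroupPropertiesKHX> groups(
        count, {VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_GROUP_PROPERTIES_KHX});
    enumGroups(inst, &count, groups.data());

    for (uint32_t i = 0; i < count; ++i)
        std::printf("group %u: %u physical device(s)\n",
                    i, groups[i].physicalDeviceCount);
    vkDestroyInstance(inst, nullptr);
}
```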

microsoft-dx12-build15-linked.png

A slide from Microsoft's DirectX 12 reveal, long ago.
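
To give a flavor of what the Vulkan side of this looks like, here is a rough sketch based on the KHX names in the 1.0.42 spec (VK_KHX_device_group_creation and friends). Since KHX is explicitly experimental, treat every identifier here as provisional, and note that real code would fetch the KHX entry point through vkGetInstanceProcAddr rather than calling it directly:

// device_groups.cpp - sketch of enumerating linked-GPU groups via KHX.
// Provisional KHX identifiers from the Vulkan 1.0.42 spec; subject to change.
#include <vulkan/vulkan.h>
#include <iostream>
#include <vector>

void create_linked_device(VkInstance instance) {
    // Ask the loader how many physical-device groups exist, then fetch them.
    uint32_t count = 0;
    vkEnumeratePhysicalDeviceGroupsKHX(instance, &count, nullptr);
    std::vector<VkPhysicalDeviceGroupPropertiesKHX> groups(count);
    for (auto& g : groups)
        g.sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_GROUP_PROPERTIES_KHX;
    vkEnumeratePhysicalDeviceGroupsKHX(instance, &count, groups.data());

    for (const auto& g : groups) {
        // A group with more than one member is a set of linkable, similar GPUs.
        if (g.physicalDeviceCount < 2)
            continue;

        // Chain the group into an otherwise ordinary vkCreateDevice call so
        // the resulting VkDevice spans every GPU in the group.
        VkDeviceGroupDeviceCreateInfoKHX group_info = {};
        group_info.sType = VK_STRUCTURE_TYPE_DEVICE_GROUP_DEVICE_CREATE_INFO_KHX;
        group_info.physicalDeviceCount = g.physicalDeviceCount;
        group_info.pPhysicalDevices = g.physicalDevices;

        VkDeviceCreateInfo device_info = {};
        device_info.sType = VK_STRUCTURE_TYPE_DEVICE_CREATE_INFO;
        device_info.pNext = &group_info;  // queue and extension setup omitted
        VkDevice device = VK_NULL_HANDLE;
        vkCreateDevice(g.physicalDevices[0], &device_info, nullptr, &device);
        std::cout << "Created a logical device spanning "
                  << g.physicalDeviceCount << " GPUs\n";
    }
}

Resources created on such a device can then be owned by or broadcast to individual GPUs in the group, which is how the render-on-one-GPU, consume-on-another scenario above is expressed without host involvement.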

As for the “experimental” comment that I made in the introduction... I was expecting to see this news around SIGGRAPH, which occurs in late-July / early-August, alongside a minor version bump (to Vulkan 1.1).

I might still be right, though.

The major new features of Vulkan 1.0.42.0 are implemented as a new classification of extensions: KHX. In the past, vendors like NVIDIA and AMD would add new features as vendor-prefixed extensions. Games could query the graphics driver for these abilities and enable them if available. If a feature became popular enough for multiple vendors to ship their own implementations, a committee would consider an EXT extension. This would behave the same across all implementations (give or take) but not be officially adopted by the Khronos Group. If the group did take it under its wing, it would be given a KHR extension (or added as a required feature).

The Khronos Group has now added a new layer: KHX. This level of extension sits below KHR and is not intended for production code. You might see where this is headed. The VR multiview, multi-GPU, and cross-process extensions are not supposed to be used in released video games until they leave KHX status. Unlike a vendor extension, which NVIDIA might own and keep around for 20 years after its usable lifespan just so old games behave as expected, the Khronos Group wants old KHX standards to drop out of existence at some point after they graduate to full KHR status.
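
In code, these tiers show up in how an application discovers extensions. A minimal sketch of the query described above (the prefix conventions are Khronos’, the filtering is mine):

// list_khx.cpp - query the driver's extension list, as a game would,
// and flag anything still in the experimental KHX tier.
#include <vulkan/vulkan.h>
#include <cstring>
#include <iostream>
#include <vector>

int main() {
    uint32_t count = 0;
    vkEnumerateInstanceExtensionProperties(nullptr, &count, nullptr);
    std::vector<VkExtensionProperties> exts(count);
    vkEnumerateInstanceExtensionProperties(nullptr, &count, exts.data());

    for (const auto& e : exts) {
        // Prefix indicates the tier: VK_NV_/VK_AMD_ = vendor-specific,
        // VK_EXT_ = multi-vendor, VK_KHR_ = ratified, VK_KHX_ = experimental.
        const bool khx = std::strncmp(e.extensionName, "VK_KHX_", 7) == 0;
        std::cout << e.extensionName
                  << (khx ? "  <-- experimental, do not ship against this" : "")
                  << "\n";
    }
}

Device-level extensions, such as the device group feature itself, get the same treatment through vkEnumerateDeviceExtensionProperties.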

khronos-group-logo.png

How long will that take? No idea. I’ve already mentioned my logical but uneducated guess a few paragraphs ago, but I’m not going to repeat it; I have literally zero facts to base it on, and I don’t want our readers to think that I do. I don’t. It’s just based on what the Khronos Group typically announces at certain trade shows, and the length of time since their first announcement.

The benefit that KHX does bring us is that, whenever these features make it to public release, developers will have already been using them... internally... since around now. When an extension hits KHR, it’s done, and anyone can theoretically be ready for it when that time comes.

Author:
Manufacturer: NVIDIA

VR Performance Evaluation

Even though virtual reality hasn’t taken off with the momentum that many in the industry expected on the heels of the HTC Vive and Oculus Rift launches last year, it remains one of the fastest growing segments of PC hardware. More importantly for many, VR is also one of the key inflection points for performance moving forward; it demands more hardware performance, scalability, and innovation than any other sub-category, including 4K gaming.  As such, NVIDIA, AMD, and even Intel continue to push the performance benefits of their own hardware and technology.

Measuring and validating those claims has proven to be a difficult task. Tools that we used in the era of standard PC gaming just don’t apply. Fraps is a well-known and well-understood tool for measuring frame rates and frame times, utilized by countless reviewers and enthusiasts. But Fraps lacked the ability to tell the complete story of gaming performance and experience. NVIDIA introduced FCAT and we introduced Frame Rating back in 2013 to expand the capabilities that reviewers and consumers had access to. Using a more sophisticated technique that includes direct capture of the graphics card output in uncompressed form, a software-based overlay applied to each frame being rendered, and post-process analysis of that data, we were able to measure the smoothness of a gaming experience and better articulate it to help gamers make purchasing decisions.
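
As a toy illustration of that post-processing step (the timestamps are made up, and this is nothing like FCAT’s actual analysis code, just the shape of the math):

// frame_times.cpp - toy post-processing pass over captured frame timestamps.
#include <algorithm>
#include <iostream>
#include <vector>

int main() {
    // Hypothetical per-frame presentation timestamps, in milliseconds.
    std::vector<double> stamps = {0.0, 16.6, 33.4, 50.1, 100.1, 116.8, 133.4};

    std::vector<double> frame_times;
    for (size_t i = 1; i < stamps.size(); ++i)
        frame_times.push_back(stamps[i] - stamps[i - 1]);

    // Average FPS hides stutter; the slowest frames are what a player feels.
    std::vector<double> sorted = frame_times;
    std::sort(sorted.begin(), sorted.end());
    const double avg_ms = stamps.back() / frame_times.size();
    // Crude percentile pick: fine for a toy, not for production analysis.
    const double p99_ms = sorted[(sorted.size() * 99) / 100];

    std::cout << "average frame time: " << avg_ms << " ms\n";
    std::cout << "99th percentile:    " << p99_ms << " ms\n";
    for (double ft : frame_times)
        if (ft > 2.0 * avg_ms)
            std::cout << "stutter: one frame took " << ft << " ms\n";
}

The capture side’s job is to produce those timestamps faithfully from the uncompressed video, using the colored overlay bars to identify each rendered frame.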

pipe1.jpg

VR pipeline when everything is working well.

For VR though, those same tools just don’t cut it. Fraps is a non-starter, as it measures frame rendering from the GPU point of view and completely misses the interaction between the graphics system and the VR runtime environment (OpenVR for Steam/Vive and OVR for Oculus). Because the rendering pipeline is drastically changed in the current VR integrations, what Fraps measures is completely different from the experience the user actually gets in the headset. Previous FCAT and Frame Rating methods were still viable, but the tools and capture technology needed to be updated. The hardware capture products we had used since 2013 were limited in their maximum bandwidth, and the overlay software did not have the ability to “latch in” to VR-based games. Not only that, but measuring frame drops, time warps, space warps, and reprojections would be a significant hurdle without further development.

pipe2.jpg

VR pipeline with a frame miss.
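
Conceptually, the analysis then becomes a classification problem against the headset’s fixed 90 Hz refresh. A toy sketch (the numbers and the three-way split are purely illustrative; FCAT VR derives the real verdicts from the runtimes’ own events):

// vr_frame_budget.cpp - toy classifier for VR frame delivery at 90 Hz.
#include <iostream>
#include <vector>

int main() {
    const double refresh_ms = 1000.0 / 90.0;  // ~11.1 ms budget per frame

    // Hypothetical render times for successive frames, in milliseconds.
    std::vector<double> render_ms = {9.8, 10.4, 12.6, 25.0, 10.9};

    for (size_t i = 0; i < render_ms.size(); ++i) {
        if (render_ms[i] <= refresh_ms) {
            std::cout << "frame " << i << ": delivered on time\n";
        } else if (render_ms[i] <= 2.0 * refresh_ms) {
            // The runtime can reproject the previous image for one interval.
            std::cout << "frame " << i << ": missed vsync, reprojected\n";
        } else {
            std::cout << "frame " << i << ": dropped entirely\n";
        }
    }
}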

NVIDIA decided to undertake the task of rebuilding FCAT to work with VR. And while the company is obviously hoping the tool will prove its claims of performance benefits for VR gaming, the investment of time and money in a project that is to be open sourced and made freely available to the media and the public should not be overlooked.

vlcsnap-2017-02-27-11h31m17s057.png

NVIDIA FCAT VR consists of two different applications. The FCAT VR Capture tool runs on the PC being evaluated and has a similar appearance to other performance and timing capture utilities. It generates performance data from Oculus event tracing (part of Windows ETW) and SteamVR’s performance API, along with NVIDIA driver stats when run on NVIDIA hardware. It works perfectly well on any GPU vendor’s hardware, though, since it has access to the VR vendors’ own timing results.

fcatvrcapture.jpg

Continue reading our preview of the new FCAT VR tool!

Author:
Subject: Editorial
Manufacturer: AMD

Zen vs. 40 Years of CPU Development

Zen is nearly upon us.  AMD is releasing its next generation CPU architecture to the world this week, and we saw CPU demonstrations and upcoming AM4 motherboards at CES in early January.  We have been shown tantalizing glimpses of the performance and capabilities of the “Ryzen” products that will presumably fill the desktop market from $150 to $499.  I have yet to be briefed on the product stack that AMD will be offering, but we know enough to start thinking about how positioning and placement will be addressed by these new products.

zen_01.jpg

To get a better understanding of how Ryzen will stack up, we should probably take a look back at what AMD has accomplished in the past and how Intel has responded to some of its stronger products.  AMD has been in business for 47 years now and has been a major player in semiconductors for most of that time.  But it has really only been since the 1990s, when AMD started to battle Intel head to head, that people have become passionate about the company and its products.

The industry is a complex and ever-shifting one, and AMD and Intel have been two of its stalwarts over the years.  Even though AMD has had more than a few challenging years over the past decade, it still moves forward and expects to compete at the highest level with its much larger and better funded competitor.  2017 could very well be a breakout year for the company, with a return to solid profitability in both CPU and GPU markets.  I am not the only one who thinks this, considering that AMD shares that traded around the $2 mark ten months ago are now sitting around $14.

AMD Through 1996

AMD became a force in the CPU industry due to IBM’s requirement of a second source for its PC business.  Intel originally entered into a cross-licensing agreement with AMD that allowed AMD to produce x86 chips based on Intel designs.  AMD eventually started to produce its own versions of these parts and became a favorite in the PC clone market.  Intel later tightened down on this agreement and then cancelled it, but through near-endless litigation AMD ended up with an x86 license deal with Intel.

AMD’s own Am286 chip was the first real break from the second-sourcing agreement with Intel.  Intel balked at sharing its 386 design with AMD and eventually forced the company to develop its own clean-room version.  The Am386 was released in the early 90s, well after Intel had been producing those chips for years.  AMD then developed its own 486, the Am486, which later morphed into the Am5x86.  The company made good inroads with these speedy parts, typically clocking them faster than their Intel counterparts (e.g. the 40 MHz and 80 MHz Am486 vs. Intel’s 33 MHz and 66 MHz 486s).  AMD also priced these parts lower, so users could achieve better performance per dollar using the same chipsets and motherboards.

zen_02.jpg

Intel released its first Pentium chips in 1993.  The initial version ran hot and featured the infamous FDIV bug.  AMD made some inroads against these parts by introducing faster Am486 and Am5x86 parts that would achieve clock speeds from 133 MHz to 150 MHz at the very top end.  The 150 MHz part was very comparable in overall performance to the 75 MHz Pentium, and it was here that we saw the introduction of the dreaded “P-rating” on processors.

There is no denying that Intel continued its dominance throughout this time by being the gold standard in x86 manufacturing and design.  Still, AMD slowly chipped away at its larger rival and continued to profit from the lucrative x86 market.  William Sanders III set the bar high for where he wanted the company to go, and he started down a much more aggressive path than many expected the company to take.

Click here to read the rest of the AMD processor editorial!