Subject: General Tech | August 25, 2016 - 10:51 AM | Ryan Shrout
Tagged: Zen, video, seasonic, Polaris, podcast, Omen, nvidia, market share, Lightning, hp, gtx 1060 3gb, gpu, brix, Audeze, asus, architecture, amd
PC Perspective Podcast #414 - 08/25/2016
Join us this week as we discuss the newly released architecture details of AMD Zen, Audeze headphones, AMD market share gains and more!
The URL for the podcast is: http://pcper.com/podcast - Share with your friends!
- iTunes - Subscribe to the podcast directly through the Store (audio only)
- Google Play - Subscribe to our audio podcast directly through Google Play!
- RSS - Subscribe through your regular RSS reader (audio only)
- MP3 - Direct download link to the MP3 file
Hosts: Ryan Shrout, Allyn Malventano, Josh Walrath and Jeremy Hellstrom
Week in Review:
News items of interest:
Hardware/Software Picks of the Week
Subject: Graphics Cards | August 24, 2016 - 10:34 AM | Ryan Shrout
Tagged: nvidia, market share, jpr, jon peddie, amd
As reported by both Mercury Research and now by Jon Peddie Research, in a graphics add-in card market that dropped dramatically in Q2 2016 in terms of total units shipped, AMD has gained significant market share against NVIDIA.
| GPU Supplier | Market share this QTR | Market share last QTR | Market share last year |
Source: Jon Peddie Research
Last year at this time, AMD was sitting at 18% market share in terms of units sold, an absolutely dismal result compared to NVIDIA's dominating 81.9%. Over the last couple of quarters we have seen AMD gain in this space, and keeping in mind that Q2 2016 does not include sales of AMD's new Polaris-based graphics cards like the Radeon RX 480, the jump to 29.9% is a big move for the company. As a result, NVIDIA falls back to 70% market share for the quarter, which is still a significant lead over AMD.
Numbers like that shouldn't be taken lightly - for AMD to gain 7 points of market share in a single quarter indicates a substantial shift in the market. This includes all add-in cards: budget, mainstream, enthusiast and even workstation class products. One report I received says that NVIDIA card sales specifically dropped off in Q2, though the exact reason why isn't known, and as a kind of de facto result, AMD gained sales share.
There are several other factors to watch with this data, however. First, the quarterly drop in graphics card sales was 20% in Q2 when compared to Q1. That is well above the average seasonal Q1-Q2 drop, which JPR claims to be 9.7%. Much of this sell-through decrease is likely due to consumers expecting releases of both NVIDIA Pascal GPUs and AMD Polaris GPUs and delaying their purchases as a result.
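To put the share shift and the market drop together, here is a rough back-of-the-envelope sketch in Python. It uses only the figures quoted above, plus one loud assumption: that the "7 points" gain is exact, which would put AMD's Q1 share at about 22.9% and NVIDIA's at about 77.1%. Those Q1 values are inferred, not reported numbers.

```python
# Figures quoted in the article (Jon Peddie Research, Q2 2016).
amd_share_q2 = 29.9       # percent of add-in card units
nvidia_share_q2 = 70.0    # percent, as rounded in the article
market_change = -20.0     # percent change in total units, Q1 -> Q2
seasonal_change = -9.7    # percent, JPR's average Q1 -> Q2 change

# ASSUMPTION (not a reported figure): the "7 points" gain is exact,
# so the Q1 shares are inferred from the Q2 numbers.
amd_share_q1 = amd_share_q2 - 7.0
nvidia_share_q1 = 100.0 - amd_share_q1


def unit_change(share_before, share_after, market_change_pct):
    """Percent change in one vendor's unit shipments.

    Units are proportional to share x total market, so the vendor's
    change is the ratio of (share x market) after versus before.
    """
    market_factor = 1.0 + market_change_pct / 100.0
    return (share_after * market_factor / share_before - 1.0) * 100.0


amd_units = unit_change(amd_share_q1, amd_share_q2, market_change)
nvidia_units = unit_change(nvidia_share_q1, nvidia_share_q2, market_change)
excess_drop = market_change - seasonal_change  # beyond the seasonal norm

print(f"AMD unit change (inferred):    {amd_units:+.1f}%")
print(f"NVIDIA unit change (inferred): {nvidia_units:+.1f}%")
print(f"Market drop beyond seasonal:   {excess_drop:+.1f} points")
```

Under that assumption, AMD's unit shipments would have been roughly flat to slightly up while NVIDIA's fell by more than a quarter, which fits the report that NVIDIA sales specifically dropped off. Treat the output as illustrative only, since the Q1 shares are inferred.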
The NVIDIA GeForce GTX 1080 launched on May 17th and the GTX 1070 on May 29th. The company has made very bold claims about product sales of Pascal parts, so I am honestly very surprised that the overall market would drop the way it did in Q2 and that NVIDIA would lose as much ground to AMD as it has. Q3 2016 may be the defining time for both GPU vendors, however, as it will show the results of the work put into both new architectures and both new product lines. NVIDIA reported record profits recently, so it will be interesting to see how that matches up to unit sales.
Clean Sheet and New Focus
It is no secret that AMD has been struggling for some time. The company has had success through the years, but it seems that the last decade has been somewhat bleak in terms of competitive advantages. The company has certainly made an impact throughout the decades with their 486 products, K6, the original Athlon, and the industry changing Athlon 64. Since that time we have had a couple of bright spots with the Phenom II being far more competitive than expected, and the introduction of very solid graphics performance in their APUs.
Sadly for AMD their investment in the “Bulldozer” architecture was misplaced for where the industry was heading. While we certainly see far more software support for multi-threaded CPUs, IPC is still extremely important for most workloads. The original Bulldozer was somewhat rushed to market and was not fully optimized. While the “Piledriver” based Vishera products fixed many of these issues, we have not seen the non-APU products updated to the latest Steamroller and Excavator architectures. The non-APU desktop market has been served for the past four years with 32nm PD-SOI based parts that utilize a rebranded chipset base that has not changed since 2010.
Four years ago AMD decided to change course entirely with their desktop and server CPUs. Instead of evolving the “Bulldozer” style architecture featuring CMT (Clustered Multi-Threading) they were going to do a clean sheet design that focused on efficiency, IPC, and scalability. While Bulldozer certainly could scale the thread count fairly effectively, the overall performance targets and clockspeeds needed to compete with Intel were just not feasible considering the challenges of process technology. AMD brought back Jim Keller to lead this effort, an industry veteran with a huge amount of experience across multiple architectures. Zen was born.
Hot Chips 28
This year’s Hot Chips is the first deep dive that we have received about the features of the Zen architecture. Mike Clark is taking us through all of the changes and advances that we can expect with the upcoming Zen products.
Zen is a clean sheet design that borrows very little from previous architectures. This is not to say that concepts that worked well in previous architectures were not revisited and optimized, but the overall floorplan has changed dramatically from what we have seen in the past. AMD did not stand still with their Bulldozer products, and the latest Excavator core does improve upon the power consumption and performance of the original. This evolution was simply not enough considering market pressures and Intel’s steady improvement of their core architecture year upon year. Zen was designed to significantly improve IPC and AMD claims that this product has a whopping 40% increase in IPC (instructions per clock) from the latest Excavator core.
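For a sense of scale, here is a minimal sketch of what a 40% IPC gain would mean at equal clocks. It assumes the simplest possible model, where performance is just IPC times clock speed and the instruction count of the workload is fixed. Real workloads will not scale this cleanly, and `relative_runtime` is a hypothetical helper written for this illustration.

```python
def relative_runtime(ipc_gain, clock_ratio=1.0):
    """Runtime of a fixed instruction count relative to the baseline core.

    Simplified model: performance = IPC x clock, so runtime is the inverse.
    ipc_gain is fractional (0.40 means +40% IPC); clock_ratio is the new
    clock divided by the baseline clock.
    """
    return 1.0 / ((1.0 + ipc_gain) * clock_ratio)


# AMD's claimed 40% IPC uplift over Excavator, at the same clock speed.
same_clock = relative_runtime(0.40)
print(f"Runtime vs. Excavator at equal clocks: {same_clock:.3f}x")

# Clock ratio Excavator would need to match that purely on frequency.
print(f"Equivalent clock advantage: {1.0 / same_clock:.2f}x")
```

In this model the same work finishes in roughly 71% of the time, or put the other way, Excavator would need about 1.4x the clock speed to keep up.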
AMD also has focused on scaling the Zen architecture from low power envelopes up to server level TDPs. The company looks to have pushed down the top end power envelope of Zen from the 125+ watts of Bulldozer/Vishera into the more acceptable 95 to 100 watt range. This also has allowed them to scale Zen down to the 15 to 25 watt TDP levels without sacrificing performance or overall efficiency. Most architectures have sweet spots where they tend to perform best. Vishera for example could scale nicely from 95 to 220 watts, but the design did not translate well into sub-65 watt envelopes. Excavator based “Carrizo” products on the other hand could scale from 15 watts to 65 watts without real problems, but became terribly inefficient above 65 watts with increased clockspeeds. Zen looks to address these differences by being able to scale from sub-25 watt TDPs up to 95 or 100. In theory this should allow AMD to simplify their product stack by offering a common architecture across multiple platforms.
Subject: Graphics Cards | August 23, 2016 - 01:43 PM | Jeremy Hellstrom
Tagged: amd, nvidia, Tilt Brush, VR
[H]ard|OCP continues their foray into testing VR applications, this time moving away from games to try out the rather impressive Tilt Brush VR drawing application from Google. If you have yet to see this software in action it is rather incredible, although you do still require an artist's talent and practical skills to create true 3D masterpieces.
Artistic merit may not be [H]'s strong suit but testing how well a GPU can power VR applications certainly lies within their bailiwick. Once again they tested five NVIDIA GPUs and a pair of AMD's for dropped frames and reprojection caused by a drop in FPS.
"We are changing gears a bit with our VR Performance coverage and looking at an application that is not as GPU-intensive as those we have looked at in the recent past. Google's Tilt Brush is a virtual reality application that makes use of the HTC Vive head mounted display and its motion controllers to allow you to paint in 3D space."
Here are some more Graphics Card articles from around the web:
- PowerColor Red Devil RX 470 Overclocking @ [H]ard|OCP
- MSI GeForce GTX 1060 OC 6 GB @ techPowerUp
- ASUS STRIX GAMING GTX 1070 OC @ eTeknix
- EVGA GeForce GTX 1070 FTW GAMING ACX 3.0 @ Bjorn3d
Why Two 4GB GPUs Isn't Necessarily 8GB
We're trying something new here at PC Perspective. Some topics are fairly difficult to explain cleanly without accompanying images. We also like to go fairly deep into specific topics, so we're hoping that we can provide educational cartoons that explain these issues.
This pilot episode is about load-balancing and memory management in multi-GPU configurations. There seems to be a lot of confusion around what was (and was not) possible with DirectX 11 and OpenGL, and even more confusion about what DirectX 12, Mantle, and Vulkan allow developers to do. It highlights three different load-balancing algorithms, and even briefly mentions what LucidLogix was attempting to accomplish almost ten years ago.
If you like it, and want to see more, please share and support us on Patreon. We're putting this out not knowing if it's popular enough to be sustainable. The best way to see more of this is to share!
Subject: Processors | August 22, 2016 - 05:37 PM | Jeremy Hellstrom
Tagged: amd, a10-7870K
Leaving aside the questionable naming, let's instead focus on the improved cooler on this ~$130 APU from AMD. Neoseeker fired up the fun sized, 125W rated cooler on top of the A10-7870K and were pleasantly surprised at the lack of noise even under load. Encouraged by the performance they overclocked the chip by 500MHz to 4.4GHz and were rewarded with a stable and still very quiet system. The review focuses more on the improvements the new cooler offers as opposed to the APU itself, which has not changed. Check out the review if you are considering a lower cost system that only speaks when spoken to.
"In order to find out just how much better the 125W thermal solution will perform, I am going to test the A10-7870K APU mounted on a Gigabyte F2A88X-UP4 motherboard provided by AMD with a set of 16 GB (2 x 8) DDR3 RAM modules set at 2133 MHz speed. I will then run thermal and fan speed tests so a comparison of the results will provide a meaningful data set to compare the near-silent 125W cooler to an older model AMD cooling solution."
Here are some more Processor articles from around the web:
Subject: Graphics Cards | August 18, 2016 - 07:58 PM | Scott Michaud
Tagged: amd, TrueAudio, trueaudio next
Using a GPU for audio makes a lot of sense. That said, the original TrueAudio was not really about that, and it didn't really take off. The API was only implemented in a handful of titles, and it required dedicated hardware that AMD has since removed from its latest architectures. It was not about using the extra horsepower of the GPU to simulate sound, although AMD did have ideas for “sound shaders” in the original TrueAudio.
TrueAudio Next, on the other hand, is an SDK that is part of AMD's LiquidVR package. It is based around OpenCL; specifically, it uses AMD's open-source FireRays library to trace the ways that audio can move from source to receiver, including reflections. Treating sound as rays is a good assumption for high-frequency audio, and that range of frequencies is more useful for positional awareness in VR, anyway.
Basically, TrueAudio Next has very little to do with the original.
Interestingly, AMD is providing an interface for TrueAudio Next to reserve compute units, but optionally (and under NDA). This allows audio processing to be unhooked from the video frame rate, provided that the CPU can keep both fed with actual game data. Since audio is typically a secondary thread, it could be ready to send sound calls at any moment. Various existing portions of asynchronous compute could help with this, but allowing developers to wholly reserve a fraction of the GPU should remove the issue entirely. That said, when I was working on a similar project in WebCL, I was looking to the integrated GPU, because it's there and it's idle, so why not? I would assume that, in actual usage, CU reservation would only be enabled if an AMD GPU is the only device installed.
Anywho, if you're interested, then be sure to check out AMD's other post on it, too.
Gunning for Broadwell-E
As I walked away from the St. Regis in downtown San Francisco tonight, I found myself wandering through the streets towards my hotel with something unique in tow. It was a smile. I was smiling, thinking about what AMD had just demonstrated at its latest Zen processor reveal. The importance of this product launch literally cannot be overstated for a company struggling to find a foothold in a market in which it once had a definitive lead. It’s been many years since I left a conference call, or a meeting, or a press conference feeling genuinely hopeful and enthusiastic about what AMD has shown me. Tonight I had that.
AMD’s CEO Lisa Su, and CTO Mark Papermaster, took the stage down the street from the Intel Developer Forum to roll out a handful of new architectural details about the Zen architecture while also showing the first performance results comparing it to competing parts from Intel. The crowd in attendance, a mix of media and analysts, were impressed. The feeling was palpable in the room.
It’s late as I write this, and while there are some interesting architecture details to discuss, I think it is in everyone’s best interest that we touch on them lightly for now, and instead refocus on the deep-dive once the Hot Chips information comes out early next week. What you really want to know is clear: can Zen make Intel work again? Can Zen make that $1700 price tag on the Broadwell-E 6950X seem even more ludicrous? Yes.
The Zen Architecture
Much of what was discussed about the Zen architecture is a re-release of what has been out in recent months. This is a completely new, from the ground up, microarchitecture and not a revamp of the aging Bulldozer design. It integrates SMT (simultaneous multi-threading), a first for an AMD CPU, to take better advantage of a longer pipeline. Intel has had HyperThreading for a long time now and AMD is finally joining the fold. A high bandwidth and low latency caching system is used to “feed the beast” as Papermaster put it, and utilizing 14nm process technology (starting at GlobalFoundries) gives efficiency and scaling a significant bump while enabling AMD to scale from notebooks to desktops to servers with the same architecture.
By far the most impressive claim from AMD thus far was that of a 40% increase in IPC over previous AMD designs. That’s a HUGE claim and is key to the success or failure of Zen. AMD proved to me today that the claims are real and that we will see the immediate impact of that architecture bump from day one.
Press was told of a handful of high level changes to the new architecture as well. Branch prediction gets a complete overhaul. This marks the first AMD processor to have a micro-op cache. Wider execution and broader instruction schedulers are integrated, all of which adds up to much higher instruction level parallelism to improve single threaded performance.
Performance improvements aside, throughput and efficiency go up with Zen as well. AMD has integrated an 8MB L3 cache and improved prefetching for up to 5x the cache bandwidth available per core on the CPU. SMT makes sure the pipeline stays full to prevent “bubbles” that introduce latency and lower efficiency, while region-specific power gating means that we’ll see Zen in notebooks as well as enterprise servers in 2017. It truly is an impressive design from AMD.
Summit Ridge, the enthusiast platform that will be the first product available with Zen, is based on the AM4 platform and processors will go up to 8-cores and 16-threads. DDR4 memory support is included, PCI Express 3.0 and what AMD calls “next-gen” IO – I would expect a quick leap forward for AMD to catch up on things like NVMe and Thunderbolt.
The Real Deal – Zen Performance
As part of today’s reveal, AMD is showing the first true comparison between Zen and Intel processors. Sure, AMD showed a Zen-powered system with a Fury X running the upcoming Deus Ex at 4K, but the really impressive results were shown when comparing Zen to a Broadwell-E platform.
Using Blender to measure the performance of a rendering workload (a Zen CPU mockup of course), AMD ran an 8-core / 16-thread Zen processor at 3.0 GHz against an 8-core / 16-thread Broadwell-E processor at 3.0 GHz (likely a fixed clocked Core i7-6900K). The point of the demonstration was to showcase the IPC improvements of Zen and it worked: the render completed on the Zen platform a second or two faster than it did on the Intel Broadwell-E system.
Not much to look at, but Zen on the left, Broadwell-E on the right...
Of course there are lots of caveats: we didn’t set up the systems, I don’t know for sure that GPUs weren’t involved, we don’t know the final clocks of the Zen processors releasing in early 2017, etc. But I took two things away from the demonstration that are very important.
- The IPC of Zen is on-par or better than Broadwell.
- Zen will scale higher than 3.0 GHz in 8-core configurations.
AMD obviously didn’t state what specific SKUs were going to launch with the Zen architecture, what clock speeds they would run at, or even what TDPs they were targeting. Instead we were left with a vague but understandable remark of “comparable TDPs to Broadwell-E”.
Pricing? Overclocking? We’ll just have to wait a bit longer for that kind of information.
There is clearly a lot more for AMD to share about Zen but the announcement and showcase made this week with the early prototype products have solidified for me the capability and promise of this new microarchitecture. We have asked for, and needed, as an industry, a competitor to Intel in the enthusiast CPU space – something we haven’t legitimately had since the Athlon X2 days. Zen is what we have been pining over, what gamers and consumers have needed.
AMD’s processor stars might finally be aligning for a product that combines performance, efficiency and scalability at the right time. I’m ready for it. Are you?
It always feels a little odd when covering NVIDIA’s quarterly earnings due to how they present their financial calendar. No, we are not reporting from the future. Yes, it can be confusing when comparing results and getting your dates mixed up. Regardless of the date before the earnings, NVIDIA did exceptionally well in a quarter that is typically the second weakest after Q1.
NVIDIA reported revenue of $1.43 billion. This is a jump from an already strong Q1 where they took in $1.30 billion. Compare this to the $1.027 billion of its competitor AMD, which provides CPUs as well as GPUs. NVIDIA sold a lot of GPUs as well as other products. Their primary money makers were the consumer space GPUs and the professional and compute markets, where they have a virtual stranglehold at the moment. The company’s GAAP net income is a very respectable $253 million.
The release of the latest Pascal based GPUs was the primary mover for the gains for this latest quarter. AMD has had a hard time competing with NVIDIA for marketshare. The older Maxwell based chips performed well against the entire line of AMD offerings and typically did so with better power and heat characteristics. Even though the GTX 970 was somewhat limited in its memory configuration as compared to the AMD products (3.5 GB + .5 GB vs. a full 4 GB implementation) it was a top seller in its class. The same could be said for the products up and down the stack.
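For those who like to see the ratios spelled out, the quick arithmetic on those figures can be sketched as follows. Every input comes from the paragraph above; the variable names are just labels chosen for this example.

```python
# Figures from the paragraph above, in billions of US dollars.
nvidia_q2_revenue = 1.43
nvidia_q1_revenue = 1.30
amd_revenue = 1.027
nvidia_net_income = 0.253   # GAAP net income ($253 million)

# Sequential revenue growth, Q1 -> Q2.
growth_pct = (nvidia_q2_revenue - nvidia_q1_revenue) / nvidia_q1_revenue * 100.0

# GAAP net income as a share of revenue.
net_margin_pct = nvidia_net_income / nvidia_q2_revenue * 100.0

# NVIDIA's revenue as a multiple of AMD's for the comparable quarter.
revenue_vs_amd = nvidia_q2_revenue / amd_revenue

print(f"Quarter-over-quarter growth: {growth_pct:.1f}%")
print(f"GAAP net margin:             {net_margin_pct:.1f}%")
print(f"Revenue vs. AMD:             {revenue_vs_amd:.2f}x")
```

That works out to about 10% sequential growth, a GAAP net margin a little under 18%, and revenue roughly 1.4 times AMD's, keeping in mind that AMD's number covers CPUs as well as GPUs.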
Pascal was released at the end of May, but the company had been shipping chips to its partners as well as creating the “Founder’s Edition” models to its exacting specifications. These were strong sellers throughout the end of May until the end of the quarter. NVIDIA recently unveiled their latest Pascal based Quadro cards, but we do not know how much of an impact those have had on this quarter. NVIDIA has also been shipping, in very limited quantities, the Tesla P100 based units to select customers and outfits.
Subject: Graphics Cards | August 12, 2016 - 05:44 PM | Jeremy Hellstrom
Tagged: rx 470, LatencyMon, dpc, amd
When The Tech Report first conducted their review of the RX 470 they saw benchmark behaviour very different from any other GPU in that family but could not figure out what it was and resolve it before the mob arrived with pitchforks and torches demanding they publish or die.
As it turns out there was indeed something rotten in benchmark; incredibly high DPC on the test machine. Investigation determined the culprit to be the beta BIOS on their ASRock Z170 Extreme7+, specifically the BIOS which allowed you to overclock locked Intel CPUs. They have just released their new findings along with a look at LatencyMon and DPC in general. Take a look at the new benchmarks and information about DPC, but also absorb the consequences of demanding articles arrive picoseconds after the NDA expires; if there is a delay in publishing there might just be a damn good reason why.
"We retested our RX 470 to account for this issue, and we also updated our review with DirectX 12 benchmarks for Rise of the Tomb Raider and Hitman, plus full OpenGL and Vulkan benchmarks for Doom."
Here are some more Graphics Card articles from around the web:
- AMD & NVIDIA GPU VR Performance in Trials on Tatooine @ [H]ard|OCP
- AMD's Radeon RX 460 @ The Tech Report
- 18-Way GPU Linux Benchmarks, Including The Radeon RX 460 & RX 470 On Open-Source @ Phoronix
- ASUS Radeon RX 460 STRIX OC 4 GB @ techPowerUp
- MSI RX 470 Gaming X 8G @ Kitguru
- MSI GTX 1060 6GB Gaming X @ Kitguru
- MSI GeForce GTX 1070 Gaming Z @ Modders-Inc
- Nvidia Titan X (Pascal) Extended Overclock Guide @ Guru of 3D
- Nvidia Titan X @ Kitguru
- MSI GeForce GTX 1080 Gaming Z 8G Review @ HiTech Legion
- Zotac GTX 1080 AMP! Edition 8 GB @ techPowerUp