Author:
Subject: Processors
Manufacturer: AMD

Clean Sheet and New Focus

It is no secret that AMD has been struggling for some time.  The company has had success through the years, but it seems that the last decade has been somewhat bleak in terms of competitive advantages.  The company has certainly made an impact throughout the decades with their 486 products, K6, the original Athlon, and the industry-changing Athlon 64.  Since that time we have had a couple of bright spots, with the Phenom II being far more competitive than expected and the introduction of very solid graphics performance in their APUs.

Sadly for AMD, their investment in the “Bulldozer” architecture was misplaced for where the industry was heading.  While we certainly see far more software support for multi-threaded CPUs, IPC is still extremely important for most workloads.  The original Bulldozer was somewhat rushed to market and was not fully optimized.  While the “Piledriver” based Vishera products fixed many of these issues, we have not seen the non-APU products updated to the latest Steamroller and Excavator architectures.  The non-APU desktop market has been served for the past four years with 32nm PD-SOI based parts that utilize a rebranded chipset base that has not changed since 2010.

hc_03.png

Four years ago AMD decided to change course entirely with their desktop and server CPUs.  Instead of evolving the “Bulldozer” style architecture featuring CMT (Clustered Multi-Threading), they were going to do a clean sheet design that focused on efficiency, IPC, and scalability.  While Bulldozer certainly could scale the thread count fairly effectively, the overall performance targets and clockspeeds needed to compete with Intel were just not feasible considering the challenges of process technology.  AMD brought back Jim Keller to lead this effort, an industry veteran with a huge amount of experience across multiple architectures.  Zen was born.

 

Hot Chips 28

This year’s Hot Chips is the first deep dive that we have received about the features of the Zen architecture.  Mike Clark is taking us through all of the changes and advances that we can expect with the upcoming Zen products.

Zen is a clean sheet design that borrows very little from previous architectures.  This is not to say that concepts that worked well in previous architectures were not revisited and optimized, but the overall floorplan has changed dramatically from what we have seen in the past.  AMD did not stand still with their Bulldozer products, and the latest Excavator core does improve upon the power consumption and performance of the original.  This evolution was simply not enough considering market pressures and Intel’s steady improvement of their core architecture year upon year.  Zen was designed to significantly improve IPC and AMD claims that this product has a whopping 40% increase in IPC (instructions per clock) from the latest Excavator core.
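
To put the 40% claim in perspective, single-thread throughput can be approximated to a first order as IPC multiplied by clock speed.  The sketch below is only an illustration under stated assumptions: Excavator's IPC is normalized to 1.0, the 3.5 GHz clock is a hypothetical value rather than an AMD figure, and the function names are made up for this example.

```python
def relative_performance(ipc: float, clock_ghz: float) -> float:
    """First-order single-thread throughput: instructions per clock times clock rate."""
    return ipc * clock_ghz


def clock_to_match(target_performance: float, ipc: float) -> float:
    """Clock speed (GHz) a core with the given IPC needs to reach a target throughput."""
    return target_performance / ipc


EXCAVATOR_IPC = 1.0              # normalized baseline (assumption)
ZEN_IPC = EXCAVATOR_IPC * 1.40   # AMD's claimed 40% IPC uplift
CLOCK_GHZ = 3.5                  # hypothetical clock, not an AMD figure

excavator = relative_performance(EXCAVATOR_IPC, CLOCK_GHZ)
zen = relative_performance(ZEN_IPC, CLOCK_GHZ)

print(f"Speedup at the same clock: {zen / excavator:.2f}x")
print(f"Zen clock needed to match Excavator at {CLOCK_GHZ} GHz: "
      f"{clock_to_match(excavator, ZEN_IPC):.2f} GHz")
```

Under these assumptions a 40% IPC gain means either 1.4x the throughput at the same clock, or the same throughput at roughly 2.5 GHz instead of 3.5 GHz, which is why IPC matters so much for power efficiency as well as raw speed.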

hc_04.png

AMD also has focused on scaling the Zen architecture from low power envelopes up to server level TDPs.  The company looks to have pushed down the top end power envelope of Zen from the 125+ watts of Bulldozer/Vishera into the more acceptable 95 to 100 watt range.  This also has allowed them to scale Zen down to the 15 to 25 watt TDP levels without sacrificing performance or overall efficiency.  Most architectures have sweet spots where they tend to perform best.  Vishera for example could scale nicely from 95 to 220 watts, but the design did not translate well into sub-65 watt envelopes.  Excavator based “Carrizo” products on the other hand could scale from 15 watts to 65 watts without real problems, but became terribly inefficient above 65 watts with increased clockspeeds.  Zen looks to address these differences by being able to scale from sub-25 watt TDPs up to 95 or 100.  In theory this should allow AMD to simplify their product stack by offering a common architecture across multiple platforms.

Creatively testing GPUs with Google's Tilt Brush

Subject: Graphics Cards | August 23, 2016 - 01:43 PM |
Tagged: amd, nvidia, Tilt Brush, VR

[H]ard|OCP continues their foray into testing VR applications, this time moving away from games to try out the rather impressive Tilt Brush VR drawing application from Google.  If you have yet to see this software in action it is rather incredible, although you do still require an artist's talent and practical skills to create true 3D masterpieces. 

Artistic merit may not be [H]'s strong suit but testing how well a GPU can power VR applications certainly lies within their bailiwick.  Once again they tested five NVIDIA GPUs and a pair from AMD for dropped frames and reprojection caused by a drop in FPS.

1471635809gU37bh4rad_6_1.jpg

"We are changing gears a bit with our VR Performance coverage and looking at an application that is not as GPU-intensive as those we have looked at in the recent past. Google's Tilt Brush is a virtual reality application that makes use of the HTC Vive head mounted display and its motion controllers to allow you to paint in 3D space."

Source: [H]ard|OCP
Manufacturer: PC Perspective

Why Two 4GB GPUs Isn't Necessarily 8GB

We're trying something new here at PC Perspective. Some topics are fairly difficult to explain cleanly without accompanying images. We also like to go fairly deep into specific topics, so we're hoping that we can provide educational cartoons that explain these issues.

This pilot episode is about load-balancing and memory management in multi-GPU configurations. There seems to be a lot of confusion around what was (and was not) possible with DirectX 11 and OpenGL, and even more confusion about what DirectX 12, Mantle, and Vulkan allow developers to do. It highlights three different load-balancing algorithms, and even briefly mentions what LucidLogix was attempting to accomplish almost ten years ago.
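
As a rough illustration of the headline, consider how much unique data fits across several GPUs when part of it must be duplicated on every card. This is a minimal sketch, not the video's own model: the function name and parameters are hypothetical, and full duplication is assumed to correspond to the common alternate-frame rendering case, where each GPU renders whole frames and therefore needs its own copy of the assets.

```python
def usable_memory_gb(per_gpu_gb: float, gpu_count: int, mirrored_gb: float) -> float:
    """Unique data that fits when `mirrored_gb` must be duplicated on every GPU.

    Mirrored data counts once no matter how many cards hold a copy; whatever
    is left on each card can hold data that is private to that card.
    """
    if not 0 <= mirrored_gb <= per_gpu_gb:
        raise ValueError("mirrored_gb must be between 0 and per_gpu_gb")
    private_gb = per_gpu_gb - mirrored_gb
    return mirrored_gb + private_gb * gpu_count


# Two 4GB cards with everything mirrored (assumed alternate-frame rendering case).
print(usable_memory_gb(per_gpu_gb=4, gpu_count=2, mirrored_gb=4))

# Same cards, hypothetical engine that only mirrors 3GB of shared assets.
print(usable_memory_gb(per_gpu_gb=4, gpu_count=2, mirrored_gb=3))

# Idealized case with nothing mirrored at all.
print(usable_memory_gb(per_gpu_gb=4, gpu_count=2, mirrored_gb=0))
```

With everything mirrored, the pair behaves like a single 4GB card; the total only approaches 8GB to the extent that the application can decide what does not need to be duplicated.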

pcper-2016-animationlogo-multiGPU.png

If you like it, and want to see more, please share and support us on Patreon. We're putting this out not knowing if it's popular enough to be sustainable. The best way to see more of this is to share!

AMD's 7870 rides again, checking out the new cooler on the A10-7870K

Subject: Processors | August 22, 2016 - 05:37 PM |
Tagged: amd, a10-7870K

Let us leave aside the questionable naming and instead focus on the improved cooler on this ~$130 APU from AMD.  Neoseeker fired up the fun sized, 125W rated cooler on top of the A10-7870K and were pleasantly surprised at the lack of noise even under load.  Encouraged by the performance they overclocked the chip by 500MHz to 4.4GHz and were rewarded with a stable and still very quiet system.  The review focuses more on the improvements the new cooler offers as opposed to the APU itself, which has not changed.  Check out the review if you are considering a lower cost system that only speaks when spoken to.

14.jpg

"In order to find out just how much better the 125W thermal solution will perform, I am going to test the A10-7870K APU mounted on a Gigabyte F2A88X-UP4 motherboard provided by AMD with a set of 16 GB (2 x 8) DDR3 RAM modules set at 2133 MHz speed. I will then run thermal and fan speed tests so a comparison of the results will provide a meaningful data set to compare the near-silent 125W cooler to an older model AMD cooling solution."

Source: Neoseeker

AMD Announces TrueAudio Next

Subject: Graphics Cards | August 18, 2016 - 07:58 PM |
Tagged: amd, TrueAudio, trueaudio next

Using a GPU for audio makes a lot of sense. That said, the original TrueAudio was not really about that, and it didn't really take off. The API was only implemented in a handful of titles, and it required dedicated hardware that AMD has since removed from their latest architectures. It was not about using the extra horsepower of the GPU to simulate sound, although they did have ideas for “sound shaders” in the original TrueAudio.

amd-2016-true-audio-next.jpg

TrueAudio Next, on the other hand, is an SDK that is part of AMD's LiquidVR package. It is based around OpenCL; specifically, it uses AMD's open-source FireRays library to trace the ways that audio can move from source to receiver, including reflections. For high-frequency audio, treating sound as rays is a good assumption, and that range of frequencies is more useful for positional awareness in VR, anyway.

Basically, TrueAudio Next has very little to do with the original.

Interestingly, AMD is providing an interface for TrueAudio Next to reserve compute units, but optionally (and under NDA). This allows audio processing to be unhooked from the video frame rate, provided that the CPU can keep both fed with actual game data. Since audio is typically a secondary thread, it could be ready to send sound calls at any moment. Various existing portions of asynchronous compute could help with this, but allowing developers to wholly reserve a fraction of the GPU should remove the issue entirely. That said, when I was working on a similar project in WebCL, I was looking to the integrated GPU, because it's there and it's idle, so why not? I would assume that, in actual usage, CU reservation would only be enabled if an AMD GPU is the only device installed.

Anywho, if you're interested, then be sure to check out AMD's other post on it, too.

Source: AMD
Author:
Subject: Processors
Manufacturer: AMD
Tagged: Zen, amd

Gunning for Broadwell-E

As I walked away from the St. Regis in downtown San Francisco tonight, I found myself wandering through the streets towards my hotel with something unique in tow. It was a smile. I was smiling, thinking about what AMD had just demonstrated and showed at its latest Zen processor reveal. The importance of this product launch can literally not be overstated for a company struggling to find a foothold to hang on to in a market in which it once had a definitive lead. It’s been many years since I left a conference call, or a meeting, or a press conference feeling genuinely hopeful and enthusiastic about what AMD has shown me. Tonight I had that.

AMD’s CEO Lisa Su and CTO Mark Papermaster took the stage down the street from the Intel Developer Forum to roll out a handful of new architectural details about the Zen architecture while also showing the first performance results comparing it to competing parts from Intel. The crowd in attendance, a mix of media and analysts, was impressed. The feeling was palpable in the room.

zenicon.jpg

It’s late as I write this, and while there are some interesting architecture details to discuss, I think it is in everyone’s best interest that we touch on them lightly for now, and instead refocus on the deep-dive once the Hot Chips information comes out early next week. What you really want to know is clear: can Zen make Intel work again? Can Zen make that $1700 price tag on the Broadwell-E 6950X seem even more ludicrous? Yes.

The Zen Architecture

Much of what was discussed about the Zen architecture is a re-release of what has been out in recent months. This is a completely new, from-the-ground-up microarchitecture and not a revamp of the aging Bulldozer design. It integrates SMT (simultaneous multi-threading), a first for an AMD CPU, to take better advantage of a longer pipeline. Intel has had HyperThreading for a long time now and AMD is finally joining the fold. A high bandwidth and low latency caching system is used to “feed the beast,” as Papermaster put it, and utilizing 14nm process technology (starting at GlobalFoundries) gives efficiency and scaling a significant bump while enabling AMD to scale from notebooks to desktops to servers with the same architecture.

zenpm-10.jpg

By far the most impressive claim from AMD thus far was that of a 40% increase in IPC over previous AMD designs. That’s a HUGE claim and is key to the success or failure of Zen. AMD proved to me today that the claims are real and that we will see the immediate impact of that architecture bump from day one.

zenpm-4.jpg

Press was told of a handful of high level changes to the new architecture as well. Branch prediction gets a complete overhaul. This marks the first AMD processor to have a micro-op cache. A wider execution width and broader instruction schedulers are integrated, all of which adds up to much higher instruction level parallelism to improve single threaded performance.

zenpm-6.jpg

Performance improvements aside, throughput and efficiency go up with Zen as well. AMD has integrated an 8MB L3 cache and improved prefetching for up to 5x the cache bandwidth available per core on the CPU. SMT makes sure the pipeline stays full to prevent “bubbles” that introduce latency and lower efficiency, while region-specific power gating means that we’ll see Zen in notebooks as well as enterprise servers in 2017. It truly is an impressive design from AMD.

zenfull-27.jpg

Summit Ridge, the enthusiast platform that will be the first product available with Zen, is based on the AM4 platform and processors will go up to 8-cores and 16-threads. DDR4 memory support is included, PCI Express 3.0 and what AMD calls “next-gen” IO – I would expect a quick leap forward for AMD to catch up on things like NVMe and Thunderbolt.

The Real Deal – Zen Performance

As part of today’s reveal, AMD is showing the first true comparison between Zen and Intel processors. Sure, AMD showed a Zen-powered system running the upcoming Deus Ex at 4K with a Fury X, but the really impressive results were shown when comparing Zen to a Broadwell-E platform.

zenfull-29.jpg

Using Blender to measure the performance of a rendering workload (a Zen CPU mockup of course), AMD ran an 8-core / 16-thread Zen processor at 3.0 GHz against an 8-core / 16-thread Broadwell-E processor at 3.0 GHz (likely a fixed clocked Core i7-6900K). The point of the demonstration was to showcase the IPC improvements of Zen and it worked: the render completed on the Zen platform a second or two faster than it did on the Intel Broadwell-E system.

DSC01490.jpg

Not much to look at, but Zen on the left, Broadwell-E on the right...

Of course there are lots of caveats: we didn’t set up the systems, I don’t know for sure that GPUs weren’t involved, we don’t know the final clocks of the Zen processors releasing in early 2017, etc. But I took two things away from the demonstration that are very important.

  1. The IPC of Zen is on par with or better than Broadwell.
  2. Zen will scale higher than 3.0 GHz in 8-core configurations.

AMD obviously didn’t state what specific SKUs were going to launch with the Zen architecture, what clock speeds they would run at, or even what TDPs they were targeting. Instead we were left with a vague but understandable remark of “comparable TDPs to Broadwell-E”.

Pricing? Overclocking? We’ll just have to wait a bit longer for that kind of information.

Closing Thoughts

There is clearly a lot more for AMD to share about Zen, but the announcement and showcase made this week with the early prototype products have solidified for me the capability and promise of this new microarchitecture. We have asked for, and needed, as an industry, a competitor to Intel in the enthusiast CPU space – something we haven’t legitimately had since the Athlon X2 days. Zen is what we have been pining for, what gamers and consumers have needed.

zenpm-11.jpg

AMD’s processor stars might finally be aligning for a product that combines performance, efficiency and scalability at the right time. I’m ready for it. Are you?

Author:
Subject: Editorial
Manufacturer: NVIDIA

NVIDIA Today?

It always feels a little odd when covering NVIDIA’s quarterly earnings due to how they present their financial calendar.  No, we are not reporting from the future.  Yes, it can be confusing when comparing results and getting your dates mixed up.  Regardless of the date attached to the earnings, NVIDIA did exceptionally well in a quarter that is typically the second weakest after Q1.

NVIDIA reported revenue of $1.43 billion.  This is a jump from an already strong Q1 where they took in $1.30 billion.  Compare this to the $1.027 billion of its competitor AMD, which provides CPUs as well as GPUs.  NVIDIA sold a lot of GPUs as well as other products.  Their primary money makers were the consumer space GPUs and the professional and compute markets, on which they have a virtual stranglehold at the moment.  The company’s GAAP net income is a very respectable $253 million.
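
For readers who want the ratios behind those figures, here is a quick back-of-the-envelope calculation.  It uses only the numbers quoted above; the variable and function names are just for illustration.

```python
def percent_change(old: float, new: float) -> float:
    """Percentage change going from `old` to `new`."""
    return (new - old) / old * 100.0


# All figures in billions of US dollars, as quoted in the text above.
NVIDIA_Q1_REVENUE = 1.30
NVIDIA_Q2_REVENUE = 1.43
NVIDIA_Q2_NET_INCOME = 0.253   # GAAP net income of $253 million
AMD_REVENUE = 1.027

growth = percent_change(NVIDIA_Q1_REVENUE, NVIDIA_Q2_REVENUE)
net_margin = NVIDIA_Q2_NET_INCOME / NVIDIA_Q2_REVENUE * 100.0
revenue_ratio = NVIDIA_Q2_REVENUE / AMD_REVENUE

print(f"Quarter-over-quarter revenue growth: {growth:.1f}%")
print(f"GAAP net margin: {net_margin:.1f}%")
print(f"NVIDIA revenue relative to AMD: {revenue_ratio:.2f}x")
```

That works out to about 10% sequential growth, a GAAP net margin of roughly 17.7%, and revenue about 1.39 times that of AMD.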

results.png

The release of the latest Pascal based GPUs was the primary mover for the gains for this latest quarter.  AMD has had a hard time competing with NVIDIA for marketshare.  The older Maxwell based chips performed well against the entire line of AMD offerings and typically did so with better power and heat characteristics.  Even though the GTX 970 was somewhat limited in its memory configuration as compared to the AMD products (3.5 GB + .5 GB vs. a full 4 GB implementation) it was a top seller in its class.  The same could be said for the products up and down the stack.

Pascal was released at the end of May, but the company had been shipping chips to its partners as well as creating the “Founder’s Edition” models to its exacting specifications.  These were strong sellers throughout the end of May until the end of the quarter.  NVIDIA recently unveiled their latest Pascal based Quadro cards, but we do not know how much of an impact those have had on this quarter.  NVIDIA has also been shipping, in very limited quantities, the Tesla P100 based units to select customers and outfits.

Wherein the RX 470 teaches us a valuable lesson about deferred procedure calls

Subject: Graphics Cards | August 12, 2016 - 05:44 PM |
Tagged: rx 470, LatencyMon, dpc, amd

When The Tech Report first conducted their review of the RX 470 they saw benchmark behaviour very different from any other GPU in that family but could not figure out what it was and resolve it before the mob arrived with pitchforks and torches demanding they publish or die. 

As it turns out there was indeed something rotten in benchmark: incredibly high DPC latency on the test machine.  Investigation determined the culprit to be the beta BIOS on their ASRock Z170 Extreme7+, specifically the BIOS which allowed you to overclock locked Intel CPUs.  They have just released their new findings along with a look at LatencyMon and DPC in general.  Take a look at the new benchmarks and information about DPC, but also absorb the consequences of demanding articles arrive picoseconds after the NDA expires; if there is a delay in publishing there might just be a damn good reason why.

villagers_with_pitchforks.jpg

"We retested our RX 470 to account for this issue, and we also updated our review with DirectX 12 benchmarks for Rise of the Tomb Raider and Hitman, plus full OpenGL and Vulkan benchmarks for Doom."

AMD Releases Radeon Software Crimson Edition 16.8.1

Subject: Graphics Cards | August 10, 2016 - 04:59 PM |
Tagged: amd, graphics drivers

Alongside the release of the Radeon RX 460 and RX 470 graphics cards, AMD has released the Radeon Software Crimson Edition 16.8.1 drivers. Beyond adding support for these new products, it also adds a Crossfire profile for F1 2016 and fixes a few issues, like Firefox and Overwatch crashing under certain circumstances. It also allows users of the RX 480 to overclock their memory higher than they previously could.

amd-2015-crimson-logo.png

AMD is continuing their trend of steadily releasing graphics drivers, and rapidly fixing important issues as they arise. Also, they have been verbose in their release notes, outlining fixes and known problems as they occur. Users can often track the bugs that affect them as they are added to the Known Issues, then graduated to Fixed Issues. While this often goes unrecognized, it's frustrating as a user to experience a bug and not know whether the company even knows about it, or is just refusing to acknowledge it.

Useful release notes, like those AMD has been publishing, are very helpful in that regard.

Source: AMD

Playing with VR, Call of the Starseed edition

Subject: General Tech | August 10, 2016 - 02:13 PM |
Tagged: gaming, starseed, VR, amd, nvidia, htc vive

When [H]ard|OCP looks at the performance of a VR game, be it a Vive or Rift title, they focus on the gameplay experience as opposed to benchmarks.  There are numerous reasons for this: these games do not tend to stress GPUs like many triple A titles, and the targets are different, with steady render times below 11.1ms being the goal as opposed to higher frame counts.  AMD initially had issues with this game, but the newest driver release has resolved those issues completely.  The takeaway quote in [H]'s conclusions provides the most telling part of the review: "If we were to perform a blind gaming test, you would not be able to identify which GPU you were gaming with at the time."
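
The 11.1ms figure follows directly from the 90 Hz refresh rate of the Vive and Rift headsets: each frame has to be ready within one refresh interval.  The sketch below shows the arithmetic and a simple way to count frames that miss that budget.  The render times in it are made-up sample values, not [H]'s measurements, and the function names are hypothetical.

```python
from typing import Iterable


def frame_budget_ms(refresh_hz: float) -> float:
    """Time available to render one frame, in milliseconds."""
    return 1000.0 / refresh_hz


def count_missed_frames(render_times_ms: Iterable[float], refresh_hz: float = 90.0) -> int:
    """Number of frames whose render time exceeded the refresh interval."""
    budget = frame_budget_ms(refresh_hz)
    return sum(1 for t in render_times_ms if t > budget)


# Hypothetical render times in milliseconds, for illustration only.
sample_render_times = [9.8, 10.5, 11.0, 12.3, 10.9, 14.1]

print(f"Budget at 90 Hz: {frame_budget_ms(90.0):.1f} ms")
print(f"Frames over budget: {count_missed_frames(sample_render_times)}")
```

A frame that misses the budget is the kind that shows up as a dropped or reprojected frame, which is why consistency under the budget matters more than a high average frame rate.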

1470689087vAVYKFP4hl_1_1.jpg

"We are back this week to take another objective look at AMD and NVIDIA GPU performance in one of the the top selling games in the VR-only realm, The Gallery Episode 1: Call of Starseed. This is another GPU-intensive title that has the ability to put some GPUs on their heels. How do the new RX 480 and GeForce 1000 series perform?"

Source: [H]ard|OCP