Computex 2017: AMD Threadripper will include 64 lanes of PCI Express 3.0, Demos with Quad Vega FE

Subject: Processors | May 30, 2017 - 10:49 PM |
Tagged: Threadripper, ryzen, PCI Express, amd

During AMD’s Computex keynote, the company confirmed that every processor on the upcoming Threadripper HEDT platform, first announced earlier in May, will include 64 lanes of PCI Express 3.0. The product line will not be differentiated by PCIe lane count or by memory channels (all parts are quad-channel DDR4). This potentially gives AMD the advantage in system connectivity, as the Intel Skylake-X processor announced just yesterday will sport only 44 lanes of PCIe 3.0 on chip.

Having 64 lanes of PCI Express on Threadripper could be an important differentiation point for the platform, offering the ability to run quad GPUs at full x16 speeds without the need for any PLX-style bridge chips. You could also combine a pair of x16 graphics cards and still have 32 lanes left for NVMe storage, 10 GigE networking devices, multi-channel SAS controllers, etc. And that doesn’t include any additional lanes that the X399 chipset may end up providing. We can’t wait to see what motherboard vendors like ASUS, MSI and Gigabyte create with all that flexibility.
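The lane arithmetic above can be sketched in a few lines of Python. This is only a toy budget: the 64-lane total comes from AMD's announcement, while the device names and the widths assigned to storage, networking and SAS are illustrative assumptions, not platform specifications.

```python
# Toy PCIe lane budget. The 64-lane figure comes from the article; the
# per-device widths below are illustrative assumptions, not platform specs.
CPU_LANES = 64


def remaining_lanes(total, devices):
    """Return the lanes left after giving each device its requested width.

    `devices` is a list of (name, lanes) pairs. Raises ValueError if the
    requests add up to more than `total`.
    """
    used = sum(lanes for _, lanes in devices)
    if used > total:
        raise ValueError(f"requested {used} lanes, only {total} available")
    return total - used


# Four graphics cards at x16 each.
quad_gpu = [(f"gpu{i}", 16) for i in range(4)]

# Two graphics cards plus a hypothetical mix of other devices.
dual_gpu_plus_io = [
    ("gpu0", 16),
    ("gpu1", 16),
    ("nvme0", 4),   # x4 is a common NVMe width (assumption)
    ("nvme1", 4),
    ("10gbe", 8),   # hypothetical x8 network card
    ("sas", 8),     # hypothetical x8 SAS controller
]

print(remaining_lanes(CPU_LANES, quad_gpu))          # 0
print(remaining_lanes(CPU_LANES, dual_gpu_plus_io))  # 8
```

Calling `remaining_lanes(44, quad_gpu)` raises `ValueError`, which is the point of the comparison: four x16 cards do not fit in a 44-lane budget without dropping link widths or adding a bridge chip.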

Holy hell.

On-stage, we saw a couple of demonstrations of what this connectivity capability can provide. First, a Threadripper system was shown running the same Blender rendering demo used in the build up to the initial Ryzen CPU launch.

Next, CEO Lisa Su came back on stage to demo AMD Threadripper with a set of four Radeon Vega Frontier Edition cards working together on a ray tracing workload.

And finally, a gaming demo! AMD Ryzen Threadripper was demoed with dual Radeon RX Vega (the gaming versions) graphics cards running at 4K/Ultra settings on the new Prey PC title. No frame rates were mentioned, no FRAPS in the corner, etc.

(Side note: Radeon Vega FE was confirmed for June 27th launch. Radeon RX Vega will launch at SIGGRAPH at the end of July!)

We still have a ways to go before we can make any definitive comments on Threadripper, and with Intel announcing processors with core counts as high as 18 just yesterday, it’s fair to say that some of the excitement has been dwindling. However, with aggressive pricing and the right messaging from AMD, they still have an amazing opportunity to break away a large segment of the growing, and profitable, HEDT market from Intel.

Source: AMD

May 31, 2017 | 01:06 AM - Posted by James

I have been disappointed with most of the Ryzen coverage from review sites; hopefully they will do better with the HEDT launch. I know you guys aren't computer engineers, but you still should know benchmarking. You guys had the nice ping time charts, but only for Ryzen and Intel 8 core (or was it 10?), and not for Intel 4 core parts. Then you did a latency article that seemed to only include Intel 4 core parts and Ryzen, with no Intel parts with 8 or more cores. Don't you think some of these would be interesting for comparison?

For the Intel parts with more than 4 cores, you take a hit on latency compared to the 4 core. All L3 accesses have to go through the ring bus. For an 8 core, that will be more than 4 stops since you have ring stops for memory and IO. An 8 core may have 11 or 12 stops (2 memory controllers plus at least one IO). This also makes Intel salvage parts less attractive. If your 8 core is a salvaged 10 core, you still have the extra ring stops and you lose cache. With AMD parts, you get fewer cores, but no increase in latency and you get significantly more cache. Intel does keep the latency relatively low (their 8 core will not compete with their 4 core part on latency though), but it comes at a cost. Interconnect burns a massive amount of power in modern processors. Retaining a monolithic last level cache will burn a huge amount of power in that 256-bit interconnect. It doesn't scale past 8 to 10 cores, even burning a lot of power at core clock. You are already increasing latency enough that the tightly coupled 4 core will win. The 18 core part will almost certainly have 2 independent ring buses, with separate cache zones. I don't know who would buy that part at the price anyway. Once you start talking 24 to 32 cores, a monolithic last level cache just isn't doable at all.

AMD can deliver the best of both worlds. They offer the lower latency of a tightly coupled 4 core with the ability to scale up to as many cores as you want. The only problem is that the software needs to be optimized for it. 4 core/8 threads per zone seems pretty optimal. Intel 8 core parts are not the part to get for gaming and software optimization isn't going to change that. You are stuck with the latency penalty for that monolithic last level cache.

This also means that Nvidia's DX11 driver has limited scalability. They have to split up the draw calls, distribute them to multiple threads, and then merge them back together. That requires a huge amount of communication across threads. I would doubt that this can be made to perform well on Ryzen. On Intel parts, it will perform great on little 4 core parts. On an 8 core, it will be hit by the added latency of the monolithic last level cache. With DX12, there is much less communication across threads. All of the threads can submit to the GPU somewhat independently. This is how a low power 8 core system (Xbox Scorpio) will be able to turn out excellent performance.

Intel and Nvidia make these high end "marketing parts" that aren't really reasonable products. Nvidia released the Titan dual GPU cards using cherry picked processors and a ridiculous price to go along with it. AMD just used regular parts and added a water cooler, which made a part that people would actually consider buying (somewhat reasonable price/performance). Same thing with these high end Intel parts. They can release any Xeon they have as a desktop part, but at that price, it is little more than marketing and review fodder in my opinion. I am hoping that the top end Threadripper is under $1000. It really is just two Ryzen parts. The 12 core will just be two 6 core parts. I don't know what the 10 and 14 will be. The 12 and 14 might be the sweet spot though. You get the highest clocks and a huge number of cores. The 64 PCIe lanes are incredible. That is probably due to the architecture. Intel might have some parts without fully functional PCIe lanes that can be sold as 28 lane parts, but their options are limited. With AMD's distributed architecture, I think they should be able to sell just about everything they make.

Anyway, I doubt anyone will read all of that, but as someone who has taken some computer engineering classes, most Ryzen reviews have just been annoying.

May 31, 2017 | 09:26 AM - Posted by psuedonymous

"This also means that Nvidia's DX11 driver has limited scalability. They have to split up the draw calls, distribute them to multiple threads, and then merge them back together."

As opposed to AMD's DX11 driver, which simply does not implement any threading in the first place.

June 1, 2017 | 04:23 PM - Posted by jojolapin102

That was very interesting to read. I am a computer engineering student (I am French, sorry if I make mistakes), but I only start my computer engineering classes next year. You have taught me a lot about CPU cache, and I would like to learn more and more!

May 31, 2017 | 01:13 AM - Posted by Hakuren

While I was stupidly hoping for Intel to price i9 more reasonably in the face of obvious competition from AMD, they went with the usual hardware ransom approach. To pay thousands of dollars for basically minimal performance gain, just to get 44 PCIe lanes, is daft.

If AMD comes in with reasonable pricing (like they did with 'baby' Ryzen, which just smashed Intel), then AMD will have a bumper year without a shadow of a doubt. I hope there will be boards offering more than the basic 8 SATA ports. On such a massively powerful chip, 8 ports is a serious weak point from my point of view. Perhaps built-in SAS controllers? Endless possibilities with 64 lanes.

One thing which is definitively NOT required is RGB. Let it burn and die.

May 31, 2017 | 02:42 AM - Posted by StephanS

Intel's cheapest 10 core is $1000 and the 18 core demands $2000.
I think AMD has plenty of room for the R9 series (10, 12, 14, 16 cores).

Something like this ?

R7  8 core  $450 (24 PCIe)
i9  8 core  $620 (28 PCIe)

R9 10 core  $750 (64 PCIe)
i9 10 core $1000 (44 PCIe)
R9 12 core  $950 (64 PCIe)
R9 14 core $1150 (64 PCIe)
R9 16 core $1450 (64 PCIe)
i9 18 core $2000 (44 PCIe)
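
For a rough comparison, the guesses in the list above can be turned into a price per core. This is only a sketch: the figures are the speculative prices from the list, not announced pricing, and the `LINEUP` and `price_per_core` names are made up for the example.

```python
# (label, cores, price_usd, cpu_pcie_lanes): the speculative figures from
# the list above, not announced pricing.
LINEUP = [
    ("R7", 8, 450, 24),
    ("i9", 8, 620, 28),
    ("R9", 10, 750, 64),
    ("i9", 10, 1000, 44),
    ("R9", 12, 950, 64),
    ("R9", 14, 1150, 64),
    ("R9", 16, 1450, 64),
    ("i9", 18, 2000, 44),
]


def price_per_core(price_usd, cores):
    """Return the price in dollars divided by the core count."""
    return price_usd / cores


# Print one row per part: label, cores, price, price per core, lane count.
for label, cores, price, lanes in LINEUP:
    print(f"{label} {cores:>2} core  ${price:>4}  "
          f"${price_per_core(price, cores):6.2f}/core  {lanes} lanes")
```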

May 31, 2017 | 12:29 PM - Posted by likeonions

This is exactly what I wanted from Ryzen, just had to wait a bit. We need an HEDT platform with quad GPUs, true x16 per card, for photogrammetry, and I knew X99 was long in the tooth. Just hoping the pricing is better than Intel's.
