Get your brains ready
Just before the weekend, Josh and I got a chance to speak with David Kanter about the AMD Zen architecture and what it might mean for the Ryzen processor due out in less than a month. For those of you not familiar with David and his work, he is an analyst and consultant on processor architecture and design through Real World Tech while also serving as a writer and analyst for the Microprocessor Report as part of the Linley Group. If you want to see a discussion forum that focuses on architecture at an incredibly detailed level, the Real World Tech forum will have you covered - it's an impressive place to learn.
David was kind enough to spend an hour with us to talk about a recently-made-public report he wrote on Zen. It's definitely a discussion that dives into details most articles and stories on Zen don't broach, so be prepared to do some pausing and Googling of phrases and technologies you may not be familiar with. Still, for any technology enthusiast who wants to get an expert's opinion on how Zen compares to Intel Skylake and how Ryzen might fare when it's released this year, you won't want to miss it.
Subject: Processors | February 9, 2017 - 02:38 AM | Josh Walrath
Tagged: Zen, Skylake, Samsung, ryzen, kaby lake, ISSCC, Intel, GLOBALFOUNDRIES, amd, AM4, 14 nm FinFET
Yesterday EE Times posted some interesting information that they had gleaned at ISSCC. AMD released a paper describing the design process and advances they were able to achieve with the Zen architecture manufactured on Samsung’s/GF’s 14nm FinFET process. AMD went over some of the basic measurements at the transistor scale and how they compare to what Intel currently has on their latest 14nm process.
The first thing that jumps out is that AMD claims that their 4 core/8 thread x86 core is about 10% smaller than what Intel has with one of their latest CPUs. We assume it is either Kaby Lake or Skylake. AMD did not go over exactly what they were counting when looking at the cores, which matters because there are some significant differences between the two architectures. We are not sure if that 44mm sq. figure includes the L3 cache or the L2 caches. My guess is that it probably includes L2 cache but not L3. I could easily be wrong here.
Going down the table we see that AMD and Samsung/GF are able to get their SRAM sizes down smaller than what Intel is able to do. AMD has double the amount of L2 cache per core, but it takes up only about 60% more area than Intel’s 256 KB L2. AMD's L3 cache is also much smaller than Intel's. Both are 8 MB units, but AMD comes in at 16 mm sq. while Intel is at 19.1 mm sq. There will be differences in how AMD and Intel set up these caches, and until we see L3 performance comparisons we cannot assume too much.
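The L3 figures quoted above can be turned into a rough area-per-megabyte comparison. This is only a sketch built from the two numbers in this post (8 MB in 16 mm sq. for AMD, 8 MB in 19.1 mm sq. for Intel); the helper name is made up for illustration, and it says nothing about how much of each block is tags or control logic.

```python
# Rough L3 area comparison using only the figures quoted in the post.
# area_per_mb is a hypothetical helper, not something from AMD's paper.

def area_per_mb(area_mm2: float, capacity_mb: float) -> float:
    """Return die area spent per megabyte of cache, in mm^2 per MB."""
    return area_mm2 / capacity_mb

amd_l3 = area_per_mb(16.0, 8)    # AMD Zen: 8 MB in 16 mm^2
intel_l3 = area_per_mb(19.1, 8)  # Intel: 8 MB in 19.1 mm^2

# Fraction by which AMD's L3 is smaller than Intel's for the same capacity
l3_saving = 1 - amd_l3 / intel_l3

print(f"AMD:   {amd_l3:.2f} mm^2 per MB")
print(f"Intel: {intel_l3:.2f} mm^2 per MB")
print(f"AMD's L3 takes about {l3_saving:.0%} less area")
```

On these numbers AMD's 8 MB L3 takes roughly 16% less area than Intel's.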
(Image courtesy of ISSCC)
In some of the basic measurements of the different processes we see that Intel has advantages throughout. This is not surprising as Intel has been well known to push process technology beyond what others are able to do. In theory their products will have denser logic throughout, including the SRAM cells. When looking at this information we wonder how AMD has been able to make their cores and caches smaller. Part of that is due to the likely setup of cache control and access.
One of the most likely culprits of this smaller size is the less advanced FPU/SSE/AVX units that AMD has in Zen. They support AVX-256, but it has to be done in double the cycles. They can do single cycle AVX-128, but Intel’s throughput is much higher than what AMD can achieve. AVX is not the end-all, be-all, but it is gaining in importance in high performance computing and editing applications. David Kanter, in his article covering the architecture, explicitly said that AMD made this decision to lower the die size and power constraints for this product.
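The "double the cycles" point can be pictured with a very simplified model in which a vector operation wider than the execution datapath is split into several passes. The 128-bit and 256-bit datapath widths below are assumptions for illustration, inferred from the description above, and the function name is hypothetical; real throughput depends on far more than this.

```python
import math

# Simplified model: an op wider than the datapath is split into
# ceil(op_width / datapath_width) passes through the execution unit.
# The datapath widths are illustrative assumptions, not vendor figures.

def passes_needed(op_bits: int, datapath_bits: int) -> int:
    """Number of passes needed to execute one vector op."""
    return math.ceil(op_bits / datapath_bits)

ZEN_DATAPATH_BITS = 128    # assumed
INTEL_DATAPATH_BITS = 256  # assumed

for op_bits in (128, 256):
    zen = passes_needed(op_bits, ZEN_DATAPATH_BITS)
    intel = passes_needed(op_bits, INTEL_DATAPATH_BITS)
    print(f"{op_bits}-bit op: Zen {zen} pass(es), Intel {intel} pass(es)")
```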
Ryzen will undoubtedly be a pretty large chip overall once both modules and 16 MB of L3 cache are put together. My guess would be in the 220 mm sq. range, but again that is only a guess once all is said and done (northbridge, southbridge, PCI-E controllers, etc.). What is perhaps most interesting of all is that AMD has a part that on the surface is very close to the Broadwell-E based Intel i7 chips. The i7-6900K runs at 3.2 to 3.7 GHz, features 8 cores and 16 threads, and has around 20 MB of L2/L3 cache. AMD’s top end looks to run at 3.6 GHz, features the same number of cores and threads, and has 20 MB of L2/L3 cache. The Intel part is rated at 140 watts TDP while the AMD part will have a max of 95 watts TDP.
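As a quick sanity check on those numbers, the sketch below adds up Ryzen's cache and compares the two TDP ratings. It assumes 512 KB of L2 per core (double Intel's 256 KB, as described earlier in this post); the variable names are purely illustrative.

```python
# Back-of-the-envelope totals from the figures in the post.
# Assumption: 512 KB of L2 per Zen core (double Intel's 256 KB).

CORES = 8
L2_PER_CORE_MB = 0.5   # 512 KB, assumed from "double" Intel's 256 KB
L3_TOTAL_MB = 16

ryzen_cache_mb = CORES * L2_PER_CORE_MB + L3_TOTAL_MB

INTEL_TDP_W = 140      # i7-6900K rating quoted above
AMD_TDP_W = 95         # expected Ryzen maximum quoted above
tdp_ratio = AMD_TDP_W / INTEL_TDP_W

print(f"Ryzen L2+L3: {ryzen_cache_mb:.0f} MB")
print(f"AMD TDP is {tdp_ratio:.0%} of Intel's")
```

The cache total matches the 20 MB quoted, and the rated TDP works out to about two thirds of the Intel part's.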
If Ryzen is truly competitive in this top end space (with a price to undercut Intel, yet not destroy their own margins) then AMD is going to be in a good position for the rest of this year. We will find out exactly what is coming our way next month, but all indications point to Ryzen being competitive in overall performance while being able to undercut Intel in TDPs for comparable cores/threads. We are counting down the days...
Subject: Processors | February 8, 2017 - 06:16 PM | Jeremy Hellstrom
Tagged: kaby lake, i5-7600K, Intel
[H]ard|OCP followed up their series on replacing the TIM underneath the heatspreader on Kaby Lake processors with another series depicting the i5-7600K in the buff. They removed the heatspreader completely and tried watercooling the die directly. As you can see in the video, this requires more work than you might immediately assume. It was not simply shimming that was involved; some of the socket on the motherboard needed to be trimmed with a knife in order to get the waterblock to sit directly on the core. In the end the results were somewhat depressing: the risks involved are high and the benefits almost non-existent. If you are willing to risk it, replacing the TIM and reattaching the heatspreader is a far better choice.
"After our recent experiments with delidding and relidding our 7700K and 7600K to see if we could get better operating temperatures, we decided it was time to go topless! Popping the top on your CPU is one thing, and getting it to work in the current processor socket is another. Get out your pocket knife, we are going to have to make some cuts."
Subject: Processors | February 4, 2017 - 01:22 AM | Sebastian Peak
Tagged: titan x, ryzen, report, processor, nvidia, leak, cpu, benchmark, ashes of the singularity, amd
AMD's upcoming 8-core Ryzen CPU has appeared online in an apparent leak showing performance from an Ashes of the Singularity benchmark run. The benchmark results, available here on imgur and reported by TechPowerUp (among others today), show the result of a run featuring the unreleased CPU paired with an NVIDIA Titan X graphics card.
It is interesting to consider that this rather unusual system configuration was also used by AMD during their New Horizon fan event in December, with an NVIDIA Titan X and Ryzen 8-core processor powering the 4K game demos of Battlefield 1 that were pitted against an Intel Core i7-6900K/Titan X combo.
It is also interesting to note that the processor listed in the screenshot above is (apparently) not an engineering sample, as TechPowerUp points out in their post:
"Unlike some previous benchmark leaks of Ryzen processors, which carried the prefix ES (Engineering Sample), this one carried the ZD Prefix, and the last characters on its string name are the most interesting to us: F4 stands for the silicon revision, while the 40_36 stands for the processor's Turbo and stock speeds respectively (4.0 GHz and 3.6 GHz)."
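To make the naming scheme in that quote concrete, here is a minimal sketch of decoding the clock suffix. Only the "40_36" token comes from the quote; the function name and the assumption that each field is expressed in tenths of a GHz are illustrative guesses, not a documented AMD format.

```python
# Hypothetical decoder for the clock suffix described in the quote above.
# Assumption: each field is a clock speed in tenths of a GHz.

def decode_clock_suffix(suffix: str) -> tuple[float, float]:
    """Split a 'TT_SS' suffix into (turbo_ghz, stock_ghz)."""
    turbo, stock = suffix.split("_")
    return int(turbo) / 10, int(stock) / 10

turbo_ghz, stock_ghz = decode_clock_suffix("40_36")
print(f"Turbo: {turbo_ghz} GHz, stock: {stock_ghz} GHz")
```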
March is fast approaching, and we won't have to wait long to see just how powerful this new processor will be for 4K gaming (and other, less important stuff). For now, I want to find results from an AotS benchmark with a Titan X and i7-6900K to see how these numbers compare!
Subject: Processors | January 30, 2017 - 07:29 PM | Jeremy Hellstrom
Tagged: kaby lake, core i7 7700k, overclocking, delidding, risky business
Recently [H]ard|OCP popped the lid off of an i7-7700k to see if the rumours were true that Intel once again did not use high quality thermal interface material underneath the heatspreader. The experiment was a success in one way: the temperatures dropped 25.28%, from 91C to 68C. However, the performance did not change much; they still could not reach a stable 5GHz overclock. They did not let that initial failure discourage them and spent some more time with their enhanced Kaby Lake processor to find scenarios in which they could reach or pass the 5GHz mark. They met with success when they reduced the RAM frequency to 2666MHz. By disabling Hyperthreading they could reach 5GHz with 3600MHz RAM, but only when they increased the VCore did they manage to break 5GHz.
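The size of that temperature drop follows directly from the two readings. The sketch below recomputes it from the rounded 91C and 68C values quoted above; the function name is just for illustration, and the original measurements may have carried more decimal places than the rounded figures used here.

```python
# Relative temperature drop from the rounded readings quoted in the post.

def percent_drop(before: float, after: float) -> float:
    """Return the drop from `before` to `after` as a percentage of `before`."""
    return (before - after) / before * 100

drop = percent_drop(91, 68)  # delidded vs. stock temperatures, in Celsius
print(f"Temperature drop: {drop:.2f}%")
```

With the rounded readings the drop comes out at a little over 25%, in line with the figure reported.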
Of course you must exercise caution when tweaking to this level; a higher VCore will certainly reduce the lifespan of your chip, and delidding can have a disastrous outcome even if done carefully. If you are interested in trying this, The Tech Report has a link to a 3D printed tool to help you in your endeavours.
"Last week we shared our overclocking results with our retail purchased Core i7-7700K Kaby Lake processor. We then took the Integrated Heat Spreader off, replaced the Thermal Interface Material and tried again for 5GHz with 3600MHz memory and failed. This time, less RAM MHz and more core voltage!"
Subject: Processors | January 16, 2017 - 09:11 PM | Jeremy Hellstrom
Tagged: kaby lake, sandy bridge
Not too long ago the release of a new processor family meant a noticeable improvement from the previous generation, and the only question was how to upgrade, not if you should upgrade. Like many other things, that has passed on into the proverbial good old days, and now we need reviews like this one published by [H]ard|OCP. Is there any noticeable performance difference between the i7-7700K and the i7-2600K outside of synthetic benchmarks?
The test systems are slightly different as the memory has changed: the 7700K has 2666MHz DDR4 while the 2600K has 2133MHz DDR3. Both CPUs are clocked at 4.5GHz, however. Their results show actual performance deltas in productivity software such as HandBrake and Blender, justifying the upgrade for those who focus on content creation. As for gaming, if you have no GPU then you will indeed see performance increases, but nothing compared to buying a GPU.
"There are many HardOCP readers that are still running Sandy Bridge CPUs and have been waiting with anticipation of one day upgrading to a new system. One of the biggest things asked in the last month is just how the 2600K stacks up against the new 7700K processor. So we got hold of one of our readers 2600K systems and put it to the test."
High Bandwidth Cache
Apart from AMD’s other new architecture due out in 2017, its Zen CPU design, there is no other product that has had as much build-up and excitement surrounding it as its Vega GPU architecture. After the world learned that Polaris would be a mainstream-only design that was released as the Radeon RX 480, the focus for enthusiasts shifted straight to Vega. It’s been on the public facing roadmaps for years and signifies the company’s return to the world of high end GPUs, something they have been missing since the release of the Fury X in mid-2015.
Let’s be clear: today does not mark the release of the Vega GPU or products based on Vega. In reality, we don’t even know enough to make highly educated guesses about the performance without more details on the specific implementations. That being said, the information released by AMD today is interesting and shows that Vega will be much more than simply an increase in shader count over Polaris. It reminds me a lot of the build-up to the Fiji GPU release, when the information and speculation about how HBM would affect power consumption, form factor and performance flourished. What we can hope for, and what AMD’s goal needs to be, is a cleaner and more consistent product release than how the Fury X turned out.
The Design Goals
AMD began its discussion about Vega last month by talking about the changes in the world of GPUs and how the data sets and workloads have evolved over the last decade. No longer are GPUs only worried about games; instead they must address professional workloads, enterprise workloads, and scientific workloads. Even more interestingly, just as we have discussed the growing gap between CPU performance and CPU memory bandwidth, AMD posits that the gap between memory capacity and GPU performance is a significant hurdle and limiter to performance and expansion. Game installs, professional graphics sets, and compute data sets continue to skyrocket in size. Game installs now are regularly over 50GB, but compute workloads can exceed petabytes. Even as we saw GPU memory capacities increase from megabytes to gigabytes, reaching as high as 12GB in high end consumer products, AMD thinks there should be more.
Coming from a company that chose to release a high-end product limited to 4GB of memory in 2015, it’s a noteworthy statement.
The High Bandwidth Cache
Bold enough to claim a direct nomenclature change, Vega 10 will feature an HBM2-based high bandwidth cache (HBC) along with a new memory hierarchy to call it into play. This HBC will be a collection of memory on the GPU package just like we saw on Fiji with the first HBM implementation and will be measured in gigabytes. The reason for the move to calling it a cache will be covered below. (But can’t we all get behind the removal of the term “frame buffer”?) Interestingly, this HBC doesn’t have to be HBM2, and in fact I was told that you could expect to see other memory systems on lower cost products going forward; cards that integrate this new memory topology with GDDR5X or some equivalent seem assured.
With the new year comes a new push for performance, efficiency and feature leadership from Qualcomm and its Snapdragon line of mobile SoCs. The Snapdragon 835 was officially announced in November of last year when the partnership with Samsung on 10nm process technology was announced, but we now have the freedom to share more of the details on this new part and how it changes Qualcomm’s position in the ultra-device market. Devices with the new 835 part won’t be on the market for several more months, though, with announcements likely coming at CES this year.
Qualcomm frames the story around the Snapdragon 835 processor with what they call the “five pillars” – five different aspects of mobile processor design that they have addressed with updates and technologies. Qualcomm lists them as battery life (efficiency), immersion (performance), capture, connectivity, and security.
Starting where they start, on battery life and efficiency, the SD 835 has a unique focus that might surprise many. Rather than talking up the improvements in performance of the new processor cores, or the power of the new Adreno GPU, Qualcomm is firmly planted on looking at Snapdragon through the lens of battery life. Snapdragon 835 uses half of the power of Snapdragon 801.
The company touts usage claims of 1+ day of talk time, 5+ days of music playback, 11 hours of 4K video playback, 3 hours of 4K video capture and 2+ hours of sustained VR gaming. These sound impressive, but as we must always do in this market, we must wait for consumer devices from Qualcomm partners to really measure how well this platform will do. Going through a typical power user comparison of a device built on the Snapdragon 835 to one using the 820, Qualcomm thinks it could result in 2 or more hours of additional battery life at the end of the day.
We have already discussed the new Quick Charge 4 technology, which can offer 5 hours of use with just 5 minutes of charge time.
Subject: Processors | January 3, 2017 - 08:54 PM | Jeremy Hellstrom
Tagged: z270, overclocking, kaby lake, Intel, i7-7700k, core i7-7700k, 7th generation core, 7700k, 14nm
Having already familiarized yourself with Intel's new Kaby Lake architecture and the i7-7700k processor in Ryan's review, you may now be wondering how well the new CPU overclocks for others. [H]ard|OCP received three i7-7700k's and three different Z270 motherboards for testing and they set about overclocking these in combination to see what frequency they could reach. Only one of the chips was ever stable at 5GHz, though it is reassuring that it managed that on all three motherboards; the remaining two would only hit 4.8GHz, which is still not a bad result. Drop by to see their settings in full detail.
"After having a few weeks to play around with Intel's new Kaby Lake architecture Core i7-7700K processors, we finally have some results that we want to discuss when it comes to overclocking and the magic 5GHz many of us are looking for, and what we think your chances are of getting there yourself."
Here are some more Processor articles from around the web:
- Intel's Core i7-7700K 'Kaby Lake' CPU @ The Tech Report
- Intel Kaby Lake i7-7700K & i5-7600K Review @ Hardware Canucks
- Intel Core i7-7700K vs 6700K: 22 Games, RX 480 & GTX 1080 @ techPowerUp
- Intel Kaby Lake Core i7-7700K Performance & Z270 Chipset Overview @ Techgage
- Intel 7th Generation Core i7 7700K Processor Review @ OCC
- Intel Kaby Lake Core i7-7700K IPC @ [H]ard|OCP
- Core i5-6400 @ Hardware Secrets
- FX-4300 @ Hardware Secrets
- AMD's New Ryzen CPU - SMT and IPC @ [H]ard|OCP
It probably doesn't surprise any of our readers that there has been a tepid response to the leaks and reviews that have come out about the new Core i7-7700K CPU ahead of the scheduled launch of Kaby Lake-S from Intel. Replacing the Skylake-based 6700K part as the new "flagship" consumer enthusiast CPU, the 7700K has quite a bit stacked against it. We know that Kaby Lake is the first in the new sequence of tick-tock-optimize, and thus there are few architectural changes to any portion of the chip. However, that does not mean that the 7700K and Kaby Lake in general don't offer new capabilities (HEVC) or performance (clock speed).
The Core i7-7700K is in an interesting spot as well with regard to motherboards and platforms. Nearly all motherboards that run the Z170 chipset will be able to run the new Kaby Lake parts without requiring an upgrade to the newly released Z270 chipset. However, the likelihood that any user on a Z170 platform today using a Skylake processor will feel the NEED to upgrade to Kaby Lake is minimal, to say the least. The Z270 chipset only offers a couple of new features compared to last generation, so the upgrade path is again somewhat limited in excitement.
Let's start by taking a look at the Core i7-7700K and how it compares to the previous top-end parts from the consumer processor line and then touch on the changes that Kaby Lake brings to the table.
With the beginning of CES just days away (as I write this), Intel is taking the wrapping paper off of its first gift of 2017 to the industry. As you can see from the slide above, more than just the Kaby Lake-S consumer socketed processors are launching today, but other components including Iris Plus graphics implementations and quad-core notebook implementations will need to wait for another day.
For DIY builders and OEMs, Kaby Lake-S, now known as the 7th Generation Core Processor family, offers some changes and additions. First, we will get a dual-core HyperThreaded processor with an unlocked designation in the Core i3-7350K. Other than the aforementioned Z270 chipset, Kaby Lake will be the first platform compatible with Intel Optane memory. (To be extra clear, I was told that previous processors will NOT be able to utilize Optane in its M.2 form factor.)
Though we have already witnessed Lenovo announcing products using Optane, this is the first official Intel discussion about it. Optane memory will be available in M.2 modules that can be installed on Z270 motherboards, improving snappiness and responsiveness. It seems this will be launched later in the quarter as we don't have any performance numbers or benchmarks to point to demonstrating the advantages that Intel touts. I know both Allyn and I are very excited to see how this differs from previous Intel caching technologies.
| | Core i7-7700K | Core i7-6700K | Core i7-5775C | Core i7-4790K | Core i7-4770K | Core i7-3770K |
|---|---|---|---|---|---|---|
| Architecture | Kaby Lake | Skylake | Broadwell | Haswell | Haswell | Ivy Bridge |
| Socket | LGA 1151 | LGA 1151 | LGA 1150 | LGA 1150 | LGA 1150 | LGA 1155 |
| Base Clock | 4.2 GHz | 4.0 GHz | 3.3 GHz | 4.0 GHz | 3.5 GHz | 3.5 GHz |
| Max Turbo Clock | 4.5 GHz | 4.2 GHz | 3.7 GHz | 4.4 GHz | 3.9 GHz | 3.9 GHz |
| Memory Speeds | Up to 2400 MHz | Up to 2133 MHz | Up to 1600 MHz | Up to 1600 MHz | Up to 1600 MHz | Up to 1600 MHz |
| Cache (L4 Cache) | 8MB | 8MB | 6MB (128MB) | 8MB | 8MB | 8MB |
| System Bus | DMI3 - 8.0 GT/s | DMI3 - 8.0 GT/s | DMI2 - 6.4 GT/s | DMI2 - 5.0 GT/s | DMI2 - 5.0 GT/s | DMI2 - 5.0 GT/s |
| Graphics | HD Graphics 630 | HD Graphics 530 | Iris Pro 6200 | HD Graphics 4600 | HD Graphics 4600 | HD Graphics 4000 |
| Max Graphics Clock | 1.15 GHz | 1.15 GHz | 1.15 GHz | 1.25 GHz | 1.25 GHz | 1.15 GHz |
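To put the clock speed changes in the table above into perspective, here is a small sketch that computes the generational uplift from the 6700K to the 7700K. The clock values are copied from the table; the dictionary layout and function name are only for illustration, and rated clocks are of course not the same thing as measured performance.

```python
# Rated clock speeds (GHz) copied from the table above.
clocks = {
    "Core i7-7700K": {"base": 4.2, "turbo": 4.5},
    "Core i7-6700K": {"base": 4.0, "turbo": 4.2},
}

def uplift(new: float, old: float) -> float:
    """Return the increase from `old` to `new` as a percentage of `old`."""
    return (new - old) / old * 100

new_cpu = clocks["Core i7-7700K"]
old_cpu = clocks["Core i7-6700K"]

base_gain = uplift(new_cpu["base"], old_cpu["base"])
turbo_gain = uplift(new_cpu["turbo"], old_cpu["turbo"])

print(f"Base clock: +{base_gain:.1f}%")
print(f"Max turbo:  +{turbo_gain:.1f}%")
```

On rated clocks alone, that is a gain of about 5% at base and about 7% at max turbo.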