AMD Ryzen Pre-order Starts Today, Specs and Performance Revealed

Subject: Processors | February 22, 2017 - 09:00 AM |
Tagged: Zen, ryzen, preorder, pre-order, handbrake, Cinebench, amd

I know that many of you have been waiting months and years to put your money down for the Zen architecture and Ryzen processors from AMD. Well, that day is finally here: AMD is opening pre-orders for the Ryzen 7 1800X, Ryzen 7 1700X, and Ryzen 7 1700 processors.

That’s the good news. The bad news? You’ll be doing it without the guidance of independent reviews.

For some of you, that won't matter. And I can respect that! Getting your hands on Ryzen and supporting the disruption it offers is something not only AMD fans have been preparing for, but tens of thousands of un-upgraded enthusiasts as well.

slides1wm.jpg

Sorry...AMD doesn't trust us with the slides, it seems.

As AMD proudly announced at our meeting this week, Zen not only met the 40% IPC improvement goal the company set more than a year ago, it exceeded it. AMD claims a more than 52% increase in instructions per clock over Excavator, and based on side conversations, even that is a conservative figure. This does a couple of things for the CPU market immediately: first, it resets performance expectations for what Ryzen will offer when reviews do go live, and second, it may actually put some worry into Intel.

AMD is allowing us to share baseline specifications of the processors, including clock speeds and core counts, as well as some selected benchmarks that show the Ryzen CPUs in an (obviously) favorable light.

|                   | Ryzen R7 1800X | Ryzen R7 1700X | Ryzen R7 1700 | Core i7-6900K | Core i7-6800K | Core i7-7700K |
|-------------------|----------------|----------------|---------------|---------------|---------------|---------------|
| Architecture      | Zen            | Zen            | Zen           | Broadwell-E   | Broadwell-E   | Kaby Lake     |
| Process Tech      | 14nm           | 14nm           | 14nm          | 14nm          | 14nm          | 14nm+         |
| Cores/Threads     | 8/16           | 8/16           | 8/16          | 8/16          | 6/12          | 4/8           |
| Base Clock        | 3.6 GHz        | 3.4 GHz        | 3.0 GHz       | 3.2 GHz       | 3.4 GHz       | 4.2 GHz       |
| Turbo/Boost Clock | 4.0 GHz        | 3.8 GHz        | 3.7 GHz       | 3.7 GHz       | 3.6 GHz       | 4.5 GHz       |
| Cache             | 20MB           | 20MB           | 20MB          | 20MB          | 15MB          | 8MB           |
| TDP               | 95 watts       | 95 watts       | 65 watts      | 140 watts     | 140 watts     | 91 watts      |
| Price             | $499           | $399           | $329          | $1050         | $450          | $350          |

AMD is being extremely aggressive with these prices and with the direct comparisons. The flagship Ryzen 7 1800X will run you just $499, the 1700X $399, and the 1700 $329. For its own comparisons, AMD pitted the Ryzen 7 1800X against the Intel Core i7-6900K, which sells for more than twice as much. Both CPUs have 8 cores and 16 threads, and the AMD part has higher clock speeds as well. If IPC is equivalent (or close), then it makes sense that the 1800X would be a noticeably faster part. If you care about performance per dollar, you should be even more impressed.

For the other comparisons, AMD is pitting the Ryzen 7 1700X with 8 cores and 16 threads against the Core i7-6800K, with 6 cores and 12 threads. Finally, the Ryzen 7 1700, still with an 8C/16T setup, goes against the Core i7-7700K with just 4 cores and 8 threads.

Here is a summary of the performance comparisons AMD is allowing us to show.

perf1-wm.jpg

perf2-wm.jpg

Though it's only a couple of benchmarks, and the results are cherry-picked to show Ryzen in the best light, they are incredibly impressive. In Cinebench R15, the Ryzen 7 1800X is 9% faster than the Core i7-6900K at half the price; even the Ryzen 7 1700X beats it. The 1700X is 34% faster than the Core i7-6800K, and the 1700 is 31% faster than the quad-core Core i7-7700K. The only single-threaded result AMD gave us shows matching performance between the Broadwell-based Core i7-6900K and the new Ryzen 7 1800X. This might suppress some questions about Ryzen's single-threaded performance before reviews go live, but Broadwell is a couple of generations old in Intel's lineup, so we should expect Kaby Lake to surpass it.

The Handbrake benchmark results only included the Core i7-7700K and the Ryzen 7 1700, with the huge advantage going to AMD. That is not unexpected, considering the 2x delta in core and thread count.

perf3-wm.jpg

Finally, the performance-per-dollar conversion of the Cinebench scores makes for a striking visual. With a more than 2x advantage for the Ryzen 7 1800X over the Core i7-6900K, performance-hungry users on a budget will have a lot to think about.
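
That "more than 2x" figure is easy to sanity-check from the numbers AMD shared. A minimal sketch in Python, using only the claimed 9% Cinebench advantage and the $499/$1050 prices quoted in this article (everything else is just illustration):

```python
# Sanity-check AMD's performance-per-dollar claim: the Ryzen 7 1800X is
# said to be 9% faster in Cinebench R15 than the Core i7-6900K, at $499
# versus roughly $1050.
prices = {"Ryzen 7 1800X": 499.0, "Core i7-6900K": 1050.0}
relative_perf = {"Ryzen 7 1800X": 1.09, "Core i7-6900K": 1.00}

perf_per_dollar = {cpu: relative_perf[cpu] / prices[cpu] for cpu in prices}
advantage = perf_per_dollar["Ryzen 7 1800X"] / perf_per_dollar["Core i7-6900K"]
print(f"1800X performance-per-dollar advantage: {advantage:.2f}x")
```

Running this gives roughly a 2.3x advantage, consistent with the "more than 2x" framing on AMD's slide.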

slides2wm.jpg

Sorry...AMD doesn't trust us with the slides, it seems.

Clearly, AMD is very proud of the Ryzen processor and the Zen architecture, and they should be. This is a giant leap forward for the company compared to previous desktop parts. If you want to buy in today and pre-order, we have links below. If you’d rather wait for a full review from PC Perspective (or other outlets), you only have to wait until March 2nd.

Update Feb 22 @ 4:27am: An official Intel spokesperson responded to today's AMD news with the following:

“We take any competition seriously but as we’ve learned, consumers usually take a ‘wait and see’ approach on performance claims for untested products. 7th Gen Intel® Core™ delivers the best experiences, and with 8th Gen Intel Core and new technologies like Intel® Optane™ memory coming soon, Intel will not stop raising the bar.”

While nothing drastic, the Intel comment is interesting in a couple of ways. First, the fact that Intel is responding at all suggests they are rattled to some degree. Second, the mention of the 8th Gen Core processor series indicates that they want potential buyers to know something beyond Kaby Lake is coming down the pipe, a break from Intel's normally stoic demeanor.

Source: AMD

Report: Leaked AMD Ryzen 7 1700X Benchmarks Show Strong Performance

Subject: Processors | February 21, 2017 - 10:54 AM |
Tagged: ryzen, rumor, report, R7, processor, leak, IPC, cpu, Cinebench, benchmark, amd, 1700X

VideoCardz.com, continuing their CPU coverage of the upcoming Ryzen launch, has posted images from XFASTEST depicting the R7 1700X processor and some very promising benchmark screenshots.

AMD-Ryzen-7-1700X.jpg

(Ryzen 7 1700X on the right) Image credit XFASTEST via VideoCardz

The Ryzen 7 1700X is reportedly an 8-core/16-thread processor with a base clock speed of 3.40 GHz, and while overall performance from the leaked benchmarks looks very impressive, it is the single-threaded score from the Cinebench R15 run pictured which really makes this CPU look like major competition for Intel with IPC.

AMD-Ryzen-7-1700X-Cinebench.jpg

Image credit XFASTEST via VideoCardz

An overall score of 1537 is outstanding, placing the CPU almost even with the i7-6900K at 1547 based on results from AnandTech:

AnandTech_Benchmarks.png

Image credit AnandTech

And the single-threaded performance score of the reported Ryzen 7 1700X is 154, which places it above the i7-6900K's score of 153. (It is worth noting that Cinebench R15 shows a clock speed of 3.40 GHz for this CPU, which is the base, while CPU-Z is displaying 3.50 GHz - likely indicating a boost clock, which can reportedly surpass 3.80 GHz with this CPU.)

Other results from the reported leak include 3DMark Fire Strike, with a physics score of 17,916 with Ryzen 7 1700X clocking in at ~3.90 GHz:

AMD-Ryzen-7-1700X-Fire-Strike-Physics.png

Image credit XFASTEST via VideoCardz

We will know soon enough where this and other Ryzen processors stand relative to Intel's current offerings, and if Intel will respond to the (rumored) price/performance double whammy of Ryzen. An i7-6900K retails for $1099 and currently sells for $1049 on Newegg.com, and the rumored pricing (taken from Wccftech), if correct, gives AMD a big win here. Competition is very, very good!

wccftech_chart.PNG

Chart credit Wccftech.com

Source: VideoCardz

AMD Details Zen at ISSCC

Subject: Processors | February 8, 2017 - 09:38 PM |
Tagged: Zen, Skylake, Samsung, ryzen, kaby lake, ISSCC, Intel, GLOBALFOUNDRIES, amd, AM4, 14 nm FinFET

Yesterday EE Times posted some interesting information that they had gleaned at ISSCC.  AMD released a paper describing the design process and the advances they were able to achieve with the Zen architecture, manufactured on Samsung’s/GF’s 14nm FinFET process.  AMD went over some basic measurements at the transistor scale and how they compare to what Intel currently has on their latest 14nm process.

icon.jpg

The first thing that jumps out is that AMD claims their 4-core/8-thread x86 core is about 10% smaller than the equivalent in one of Intel's latest CPUs; we assume it is either Kaby Lake or Skylake.  AMD did not go over exactly what they were counting when looking at the cores, because there are some significant differences between the two architectures.  We are not sure if that 44mm sq. figure includes the L3 cache or the L2 caches.  My guess is that it probably includes L2 cache but not L3, though I could easily be wrong here.

Going down the table, we see that AMD and Samsung/GF are able to get their SRAM sizes smaller than what Intel is able to do.  AMD has double the amount of L2 cache per core, yet it is only about 60% larger than Intel’s 256 KB L2.  AMD’s L3 cache is also smaller than Intel’s: both are 8 MB units, but AMD comes in at 16 mm sq. while Intel is at 19.1 mm sq.  There will be differences in how AMD and Intel set up these caches, and until we see L3 performance comparisons we cannot assume too much.
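
For a rough feel of the density gap, those L3 figures reduce to a simple area-per-megabyte calculation. A back-of-the-envelope sketch using only the 8 MB, 16 mm sq., and 19.1 mm sq. numbers quoted above:

```python
# Back-of-the-envelope L3 cache density from the ISSCC figures: both
# caches are 8 MB, with AMD at 16 mm sq. and Intel at 19.1 mm sq.
l3_mb = 8.0
amd_area_mm2, intel_area_mm2 = 16.0, 19.1

amd_density = l3_mb / amd_area_mm2      # MB per mm sq.
intel_density = l3_mb / intel_area_mm2  # MB per mm sq.
print(f"AMD:   {amd_density:.3f} MB per mm sq.")
print(f"Intel: {intel_density:.3f} MB per mm sq.")
print(f"AMD packs ~{(intel_area_mm2 / amd_area_mm2 - 1) * 100:.0f}% more L3 per unit area")
```

By this crude measure AMD's L3 is roughly 19% denser by area, though, as noted, differences in cache organization make direct comparisons tricky.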

Zen-comparison.png

(Image courtesy of ISSCC)

In some of the basic measurements of the different processes we see that Intel has advantages throughout.  This is not surprising as Intel has been well known to push process technology beyond what others are able to do.  In theory their products will have denser logic throughout, including the SRAM cells.  When looking at this information we wonder how AMD has been able to make their cores and caches smaller.  Part of that is due to the likely setup of cache control and access.

One of the most likely culprits of this smaller size is the less advanced FPU/SSE/AVX units that AMD has in Zen.  They support AVX-256, but it has to be done in double the cycles.  They can do single-cycle AVX-128, but Intel’s throughput is much higher than what AMD can achieve.  AVX is not the end-all, be-all, but it is gaining in importance in high performance computing and editing applications.  David Kanter, in his article covering the architecture, explicitly said that AMD made this decision to lower the die size and power constraints for this product.

Ryzen will undoubtedly be a pretty large chip overall once both modules and 16 MB of L3 cache are put together.  My guess would be in the 220 mm sq. range, but again that is only a guess once all is said and done (northbridge, southbridge, PCI-E controllers, etc.).  What is perhaps most interesting of it all is that AMD has a part that on the surface is very close to the Broadwell-E based Intel i7 chips.  The i7-6900K runs at 3.2 to 3.7 GHz, features 8 cores and 16 threads, and around 20 MB of L2/L3 cache.  AMD’s top end looks to run at 3.6 GHz, features the same number of cores and threads, and has 20 MB of L2/L3 cache.  The Intel part is rated at 140 watts TDP while the AMD part will have a max of 95 watts TDP.

If Ryzen is truly competitive in this top end space (with a price to undercut Intel, yet not destroy their own margins) then AMD is going to be in a good position for the rest of this year.  We will find out exactly what is coming our way next month, but all indications point to Ryzen being competitive in overall performance while being able to undercut Intel in TDPs for comparable cores/threads.  We are counting down the days...

Source: AMD

Jump into Kaby Lake naked

Subject: Processors | February 8, 2017 - 01:16 PM |
Tagged: kaby lake, i5-7600K, Intel

[H]ard|OCP followed up their series on replacing the TIM underneath the heatspreader on Kaby Lake processors with another piece depicting the i5-7600K in the buff.  They removed the heatspreader completely and tried watercooling the die directly.  As you can see in the video, this requires more work than you might assume: it was not simply a matter of shimming, as part of the socket on the motherboard needed to be trimmed with a knife to get the waterblock to sit directly on the core.  In the end the results were somewhat depressing; the risks involved are high and the benefits almost non-existent.  If you are willing to risk it, replacing the TIM and reattaching the heatspreader is a far better choice.

getimage.jpg

"After our recent experiments with delidding and relidding our 7700K and 7600K to see if we could get better operating temperatures, we decided it was time to go topless! Popping the top on your CPU is one thing, and getting it to work in the current processor socket is another. Get out your pocket knife, we are going to have to make some cuts."

Here are some more Processor articles from around the web:

Processors

Source: [H]ard|OCP

Report: AMD Ryzen Performance in Ashes of the Singularity Benchmark

Subject: Processors | February 3, 2017 - 08:22 PM |
Tagged: titan x, ryzen, report, processor, nvidia, leak, cpu, benchmark, ashes of the singularity, amd

AMD's upcoming 8-core Ryzen CPU has appeared online in an apparent leak of performance from an Ashes of the Singularity benchmark run. The benchmark results, available here on imgur and reported by TechPowerUp (among others today), show a run featuring the unreleased CPU paired with an NVIDIA Titan X graphics card.

Ryzen_Ashes_Screenshot.jpg

It is interesting to consider that this rather unusual system configuration was also used by AMD during their New Horizon fan event in December, with an NVIDIA Titan X and Ryzen 8-core processor powering the 4K game demos of Battlefield 1 that were pitted against an Intel Core i7-6900K/Titan X combo.

It is also interesting to note that the processor listed in the screenshot above is (apparently) not an engineering sample, as TechPowerUp points out in their post:

"Unlike some previous benchmark leaks of Ryzen processors, which carried the prefix ES (Engineering Sample), this one carried the ZD Prefix, and the last characters on its string name are the most interesting to us: F4 stands for the silicon revision, while the 40_36 stands for the processor's Turbo and stock speeds respectively (4.0 GHz and 3.6 GHz)."

March is fast approaching, and we won't have to wait long to see just how powerful this new processor will be for 4K gaming (and other, less important stuff). For now, I want to find results from an AotS benchmark with a Titan X and i7-6900K to see how these numbers compare!

Source: TechPowerUp

Living dangerously; delidding your i7-7700k

Subject: Processors | January 30, 2017 - 02:29 PM |
Tagged: kaby lake, core i7 7700k, overclocking, delidding, risky business

Recently [H]ard|OCP popped the lid off of an i7-7700K to see if the rumours were true that, once again, Intel did not use high-quality thermal interface material underneath the heatspreader.  The experiment was a success in one way: temperatures dropped 25.28%, from 91C to 68C. However, performance did not change much, as they still could not reach a stable 5GHz overclock.  They did not let that initial failure discourage them, and spent some more time with their enhanced Kaby Lake processor finding scenarios in which they could reach or pass the 5GHz mark. They met with success when they reduced the RAM frequency to 2666MHz; by disabling Hyperthreading they could reach 5GHz with 3600MHz RAM, but only when they increased the VCore did they manage to break 5GHz.
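
The quoted temperature improvement is easy to verify from the two readings: 91C before delidding, 68C after. A quick check (the result rounds to 25.27%, a hair off the 25.28% figure quoted, presumably down to rounding somewhere upstream):

```python
# Verify the delidding temperature drop quoted above: 91C down to 68C.
before_c, after_c = 91, 68
drop_pct = (before_c - after_c) / before_c * 100
print(f"Temperatures dropped {drop_pct:.2f}%")  # 25.27%
```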

Of course you must exercise caution when tweaking to this level, a higher VCore will certainly reduce the lifespan of your chip and delidding can have a disastrous outcome even if done carefully.  If you are interested in trying this, The Tech Report has a link to a 3D printed tool to help you in your endeavours.

kaby2.jpg

"Last week we shared our overclocking results with our retail purchased Core i7-7700K Kaby Lake processor. We then took the Integrated Heat Spreader off, replaced the Thermal Interface Material and tried again for 5GHz with 3600MHz memory and failed. This time, less RAM MHz and more core voltage!"

Here are some more Processor articles from around the web:

Processors

Source: [H]ard|OCP

Shall we keep hanging out under the Sandy Bridge or head on down to Kaby Lake?

Subject: Processors | January 16, 2017 - 04:11 PM |
Tagged: kaby lake, sandy bridge

Not too long ago, the release of a new processor family meant a noticeable improvement over the previous generation, and the only question was how to upgrade, not if you should.  Like many other things, that has passed into the proverbial good old days, and now we need reviews like this one published by [H]ard|OCP.  Is there any noticeable performance difference between the two chips outside of synthetic benchmarks?

The test systems are slightly different as the memory has changed: the 7700K has 2666MHz DDR4 while the 2600K has 2133MHz DDR3; both CPUs are clocked at 4.5GHz, however.  Their results show real performance deltas in productivity software such as HandBrake and Blender, justifying the upgrade for those who focus on content creation.  As for gaming, upgrading the CPU alone does bring some performance increases, but nothing compared to buying a new GPU.

1484259750Ex0fKuOcpc_1_1.jpg

"There are many HardOCP readers that are still running Sandy Bridge CPUs and have been waiting with anticipation of one day upgrading to a new system. One of the biggest things asked in the last month is just how the 2600K stacks up against the new 7700K processor. So we got hold of one of our readers 2600K systems and put it to the test."

Here are some more Processor articles from around the web:

Processors

Source: [H]ard|OCP

Three Kaby Lakes for three Z270s; it's an overclocking menage a trois

Subject: Processors | January 3, 2017 - 03:54 PM |
Tagged: z270, overclocking, kaby lake, Intel, i7-7700k, core i7-7700k, 7th generation core, 7700k, 14nm

Having already familiarized yourself with Intel's new Kaby Lake architecture and the i7-7700K processor in Ryan's review, you may now be wondering how well the new CPU overclocks for others.  [H]ard|OCP received three i7-7700Ks and three different Z270 motherboards for testing, and set about overclocking them in combination to see what frequency they could reach.  Only one of the chips was ever stable at 5GHz, though it is reassuring that it managed that on all three motherboards; the remaining two would only hit 4.8GHz, which is still not a bad result.  Drop by to see their settings in full detail.

1483420732E61CFZVtYr_1_1.jpg

"After having a few weeks to play around with Intel's new Kaby Lake architecture Core i7-7700K processors, we finally have some results that we want to discuss when it comes to overclocking and the magic 5GHz many of us are looking for, and what we think your chances are of getting there yourself."

Here are some more Processor articles from around the web:

Processors

Source: [H]ard|OCP

Intel Allegedly Working to Replace Sandy Bridge

Subject: Processors | January 2, 2017 - 05:33 PM |
Tagged: sandy bridge, Intel

OC3D is claiming that Intel is working on a significantly new architecture, targeting somewhere around the 2019 or 2020 time frame. The situation parallels AMD’s Bulldozer: while there were several architectures after the initial release, they were all based around the same set of basic assumptions, with tweaks for better IPC, reduced bottlenecks, and so forth. Intel has likewise been using the same fundamentals since Sandy Bridge, albeit fundamentals that aligned much better with how x86 applications were being developed.

Intel-logo.png

According to the report, Intel’s new architecture is expected to remove some old instructions, which will make it less compatible with applications that use these commands. This is actually very similar to what AMD was attempting to do with Bulldozer... to a point. AMD projected that applications would scale well to multiple cores, and use GPUs for floating-point operations; as such, they designed cores in pairs, and decided to eliminate redundant parts, such as half of the floating-point units. Hindsight being 20/20, we now know that developers didn’t change their habits (and earlier Bulldozer parts were allegedly overzealous with cutting out elements in a few areas, too).

In Intel’s case, from what we hear about at the moment, their cuts should be less broad than AMD’s. Rather than projecting a radical shift in programming, they’re just going to cut the fat of their existing instruction set, unless there’s bigger changes planned for the next couple years of development. As for the unlucky applications that use these instructions, OC3D speculates that either Intel or the host operating systems will provide some emulation method, likely in software.

If the things they cut haven’t been used in several years, then you can probably get acceptable performance in the applications that require them via emulation. On the other hand, a bad decision could choke the processor in the same way that Bulldozer, especially the early variants, did for AMD. On the other-other hand, Intel has something that AMD didn’t: the market-share to push (desktop) developers in a given direction. On the fourth hand, which I’ll return to its rightful owner, I promise, we don’t know how much the “(desktop)” clause will translate to overall software in two years.

Right now, it seems like x86 is successfully holding off ARM in performance-critical, consumer applications. If that continues, then Intel might be able to push x86 software development, even if they get a little aggressive like AMD did five-plus-development-time years ago.

Source: OC3D

Drabby Lake has sprung a leak, and so too has the Intel 200 series chipset

Subject: General Tech, Processors | December 15, 2016 - 12:29 PM |
Tagged: leak, kaby lake, intel 200

Tech ARP have an interesting story posted today; it would seem they pried loose the specs of the upcoming Kaby Lake processors and the accompanying Intel 200 chipset.  The top chip, the $349 Core i7-7700K, will have 4 cores and 8 threads running at 4.2 GHz, with an 8 MB L3 cache and a TDP of 95W, while the non-K version will have its core clock dropped to 3.6GHz, its TDP dropped to 65W, and its price lowered to $309.  The chipsets will encompass series similar to previous generations from Intel, including the LGA 1151 Z270, H270, Q270, B250 and Q250.  There is no information in this leak on the socket the server-level C422 and high-end X299 boards will use, but we are sure you can extrapolate from existing rumours and innuendo.  Follow the link for the entire lineup.

techarp_leak.PNG

"As AMD gears up to launch the AMD Ryzen desktop processor in early Q1 2017, Intel has finalised the launch plans for their desktop Kaby Lake processors, and the accompanying 200 Series chipsets.

Although Intel has been extremely secretive, we managed to obtain the specifications and launch details of the desktop Kaby Lake processors, and the 200 Series chipsets. Check it out!"

Here is some more Tech News from around the web:

Tech Talk

Source: TechARP

ARM Partners with Xilinx to Accelerate Path to 7nm

Subject: Processors | December 8, 2016 - 09:00 AM |
Tagged: Xilinx, TSMC, standard cells, layout, FinFET, EDA, custom cell, arm, 7nm

Today ARM is announcing their partnership with Xilinx to deliver design solutions for their products on TSMC’s upcoming 7nm process node.  ARM has previously partnered with Xilinx on other nodes including 28, 20, and 16nm.  Their partnership extends into design considerations to improve the time to market of complex parts and to rapidly synthesize new designs for cutting edge process nodes.

Xilinx is licensing out the latest ARM Artisan Physical IP platform for TSMC’s 7nm.  Artisan Physical IP is a set of tools to help rapidly roll out complex designs as compared to what previous generations of products faced.  ARM has specialized libraries and tools to help implement these designs on a variety of processes and receive good results even on the shortest possible design times.

icon_arm.jpg

Design relies on two basic methodologies.  There is custom cell and then standard cell designs.  Custom cell design allows for a tremendous amount of flexibility in layout and electrical characteristics, but it requires a lot of man-hours to complete even the simplest logic.  Custom cell designs typically draw less power and provide higher clockspeeds than standard cell design.  Standard cells are like Legos in that the cells can be quickly laid out to create complex logic.  Software called EDA (Electronic Design Automation) can quickly place and route these cells.  GPUs lean heavily on standard cells and EDA software to get highly complex products out to market quickly.

These two basic methods have netted good results over the years, but during that time we have seen implementations of standard cells become more custom in how they behave.  While not achieving full custom performance, these semi-custom endeavors have achieved appreciable gains without requiring the man-hours of a fully custom design.

In this particular case ARM is achieving a solid performance in power and speed through automated design that improves upon standard cells, but without the downsides of a fully custom part.  This provides positive power and speed benefits without the extra power draw of a traditional standard cell.  ARM further improves upon this with the ARM Artisan Power Grid Architect (PGA) which simplifies the development of a complex power grid that services a large and complex chip.

We have seen these types of advancements in the GPU world that NVIDIA and AMD enjoy talking about.  A better power grid allows the ASIC to perform at lower power envelopes due to less impedance.  The GPU guys have also utilized high-density libraries to pack the transistors in as tightly as possible, using less space and increasing spatial efficiency.  A smaller chip that requires less power is always a positive development over a larger chip of the same capabilities that requires more.  ARM looks to be doing their own version of these technologies and applying them to TSMC’s upcoming 7nm FinFET process.

TSMC is not releasing this process to mass production until at least 2018.  In 1H 2017 we will see some initial test and early production runs for a handful of partners, with full-blown 7nm production following in 2018.  These early runs are increasingly taken up by companies working on low-power devices.  We can look back at the 20/16/14 nm processes and see that they were initially used by designs that do not require a lot of power and run at moderate clockspeeds.  This shift in who uses new processes came with the introduction of sub-28nm nodes: the complexity of the designs, process steps, materials, and libraries has pushed the higher-performance, power-hungry parts to a secondary position as the foundries attempt to get these next-generation nodes up to speed.  It isn’t until many months of these low-power parts have been pushed through that we see the adjustments and improvements needed to handle the higher power and clockspeed demands of products like desktop CPUs and GPUs.

Zynq-7015-module_large.jpg

ARM is certainly being much more aggressive in addressing next generation nodes and pushing their cutting edge products on them to allow for far more powerful mobile products that also exhibit improved battery life.  This step with 7nm and Xilinx will provide a lot of data to ARM and its partners downstream when the time comes to implement new designs.  Artisan will continue to evolve to allow partners to quickly and efficiently introduce new products on new nodes to the market at an accelerated rate as compared to years past.

Click to read the entire ARM post!

Source: ARM

Leaked Kaby Lake Sample Found and Overclocked

Subject: Processors | November 30, 2016 - 06:52 PM |
Tagged: kaby lake, Intel, core i7 7700k

Someone, who wasn’t Intel, seeded Tom’s Hardware an Intel Core i7-7700k, which is expected for release in the new year. This is the top end of the mainstream SKUs, bringing four cores (eight threads) to 4.2 GHz base, 4.5 GHz boost. Using a motherboard built around the Z170 chipset, they were able to clock the CPU up to 4.8 GHz, which is a little over 4% higher than the Skylake-based Core i7-6700k maximum overclock on the same board.

intel-2016-7700k-tomshardware.jpg

Image Credit: Tom's Hardware
Lucky number i7-77.

Before we continue, these results are based on a single sample. (Update: @7:01pm -- Also, the motherboard they used has some known overclock and stability issues. They mentioned it a bit in the post, like why their BCLK is 99.65MHz, but I forgot to highlight it here. Thankfully, Allyn caught it in the first ten minutes.) This sample has retail branding, but Intel would not confirm that it performs like they expect a retail SKU would. Normally, pre-release products are labeled as such, but there’s no way to tell if this one part is some exception. Beyond concerns that it might be slightly different from what consumers will eventually receive, there is also a huge variation in overclocking performance due to binning. With a sample size of one, we cannot tell whether this chip has an abnormally high, or an abnormally low, defect count, which affects both power and maximum frequency.

That aside, if this chip is representative of Kaby Lake performance, users should expect an increase in headroom for clock rates, but it will come at the cost of increased power consumption. In fact, Tom’s Hardware states that the chip “acts like an overclocked i7-6700K”. Based on this, it seems like, unless they want an extra 4 PCIe lanes on Z270, Kaby Lake’s performance might already be achievable for users with a lucky Skylake.

I should note that Tom’s Hardware didn’t benchmark the iGPU. I don’t really see it used for much more than video encoding anyway, but it would be nice to see if Intel improved in that area, seeing as how they incremented the model number. Then again, even users who are concerned about that will probably be better off just adding a second, discrete GPU anyway.

Rumor: Leaked Zen Prices and SKUs

Subject: Processors | November 28, 2016 - 09:26 PM |
Tagged: amd, Zen, Summit Ridge

Guru3D got hold of a product list, which includes entries for AMD’s upcoming Zen architecture.

Four SKUs are thus rumored to exist:

  • Zen SR3: (65W, quad-core, eight threads, ~$150 USD)
  • Zen SR5: (95W, hexa-core, twelve threads, ~$250 USD)
  • Zen SR7: (95W, octo-core, sixteen threads, ~$350 USD)
  • Special Zen SR7: (95W, octo-core, sixteen threads, ~$500 USD)

The sheet also states that none of these is supposed to contain integrated graphics, just like the current FX line. There is some merit to using integrated GPUs for specific tasks, such as processing video while the main GPU is busy or doing a rapid, massively parallel calculation without the latency of memory copies, but AMD is probably right not to spend resources, such as TDP, fighting our current lack of compatible software and viable use cases for these SKUs.

amd-2016-summit-ridge-guru3d.png

Image Credit: Guru3D

The sheet also contains benchmarks for Cinebench R15. While offline rendering is a task that really should be done on GPUs at this point, especially with permissive, strong, open-source projects like Cycles, it does provide a good example of multi-core performance that scales. In this one test, the Summit Ridge 7 CPU ($350) roughly matches the Intel Core i7-6850K ($600), again, according to this one unconfirmed benchmark. It doesn’t list clock rates, but other rumors claim that the top-end chip will be around 3.2 GHz base, 3.5 GHz boost at stock, with manual overclocks exceeding 4 GHz.

These performance figures suggest that Zen will not beat Skylake on single-threaded performance, but it might be close. That might not matter, however: CPUs these days are converging around a certain level of per-thread performance and differentiating on core count, price, and features. Unfortunately, there don’t seem to have been many leaks regarding enthusiast-level chipsets for Zen, so we don’t know if there will be compelling use cases yet.

Zen is expected early in 2017.

Source: Guru3D

Qualcomm Teases Snapdragon 835, built on Samsung 10nm FinFET

Subject: Processors, Mobile | November 17, 2016 - 07:30 AM |
Tagged: snapdragon, Samsung, qualcomm, FinFET, 835, 10nm

Though we are still months away from shipping devices, Qualcomm has announced that it will be building its upcoming flagship Snapdragon 835 mobile SoC on Samsung’s 10nm 2nd generation FinFET process technology. Qualcomm tells us that integrating the 10nm node in 2017 will keep it “the technology leader in mobile platforms” and this makes the 835 the world's first 10nm production processor.

“Using the new 10nm process node is expected to allow our premium tier Snapdragon 835 processor to deliver greater power efficiency and increase performance while also allowing us to add a number of new capabilities that can improve the user experience of tomorrow’s mobile devices.”

Samsung announced its 10nm FinFET process technology in October of this year and it sports some impressive specifications and benefits to the Snapdragon 835 platform. Per Samsung, it offers “up to a 30% increase in area efficiency with 27% higher performance or up to 40% lower power consumption.” For Qualcomm and its partners, that means a smaller silicon footprint for innovative device designs, including thinner chassis or larger batteries (yes, please).
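To put Samsung's "up to 40% lower power" claim in consumer terms: at a fixed battery capacity and workload, runtime scales inversely with power draw. A back-of-envelope sketch (illustrative arithmetic only, not a measured battery-life figure, and it ignores the display and other non-SoC power draws):

```python
# Back-of-envelope reading of Samsung's "up to 40% lower power" claim:
# at the same workload and battery capacity, SoC-limited runtime scales
# inversely with SoC power draw. Illustrative, not a measurement.
def runtime_gain(power_reduction):
    """Relative runtime from a fractional power reduction (0.40 = 40%)."""
    return 1.0 / (1.0 - power_reduction)

print(f"{runtime_gain(0.40):.2f}x")  # ~1.67x the SoC-limited runtime
```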

qualcomm-logo.jpg

Other details on the Snapdragon 835 are still pending a future reveal, but Qualcomm says that 835 is in production now and will be shipping in commercial devices in the first half of 2017. We did hear that the new 10nm chip is built on "more than 3 billion transistors" - making it an incredibly complex design!

Image_Keith Kressin Qualcomm, Ben Suh Samsung with 10nm Snapdragon 835.jpeg

Keith Kressin SVP, Product Management, Qualcomm Technologies Inc and Ben Suh, SVP, Foundry Marketing, Samsung, show off first 10nm mobile processor, Snapdragon 835, in New York at Qualcomm's Snapdragon Technology Summit.

I am very curious to see how the market reacts to the release of the Snapdragon 835. We are still seeing new devices being released using the 820/821 SoCs, including Google’s own flagship Pixel phones this fall. Qualcomm wants to maintain leadership in the SoC market by innovating on both silicon and software but consumers are becoming more savvy to the actual usable benefits that new devices offer. Qualcomm promises features, performance and power benefits on SD 835 to make the case for your next upgrade.

Full press release after the break!

Source: Qualcomm

NVIDIA Tegra SoC powers new Nintendo Switch gaming system

Subject: Processors, Mobile | October 20, 2016 - 11:40 AM |
Tagged: Nintendo, switch, nvidia, tegra

It's been a hell of a 24 hours for NVIDIA and the Tegra processor. A platform that many considered dead in the water, after its failure to find its way into smartphones or an appreciable number of consumer tablets, has had two major design wins revealed. First, we learned that NVIDIA is powering the new fully autonomous driving system in the Autopilot 2.0 hardware implementation in Tesla's current Model S, Model X, and upcoming Model 3 cars.

Now, we know that Nintendo's long rumored portable and dockable gaming system called Switch is also powered by a custom NVIDIA Tegra SoC.

20-nintendo-switch-1200x923.jpg

We don't know much about the hardware that gives the Switch life, but NVIDIA did post a short blog with some basic information worth looking at. Based on it, we know that the Tegra processor powering this Nintendo system is completely custom and likely uses Pascal-architecture GPU CUDA cores, though we don't know how many, or how powerful the chip will be. It will likely exceed the performance of the Nintendo Wii U, which offered only 0.35 TFLOPS from 320 AMD-based stream processors. How much faster, we just don't know yet.
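That Wii U figure checks out with the usual peak-throughput arithmetic: stream processors times two operations per clock (a fused multiply-add) times clock rate. The ~550 MHz Wii U GPU clock used below is a commonly cited figure rather than an official Nintendo spec.

```python
# Sanity check on the Wii U's 0.35 TFLOPS: peak single-precision FLOPS
# is roughly shaders x 2 ops/clock (FMA) x clock rate. The 550 MHz
# Wii U GPU clock is a commonly cited figure, not an official spec.
def peak_tflops(shaders, clock_ghz, ops_per_clock=2):
    return shaders * ops_per_clock * clock_ghz / 1000.0

print(f"{peak_tflops(320, 0.55):.2f} TFLOPS")  # ~0.35 TFLOPS
```

The same formula gives a rough yardstick for any Switch GPU guesses once CUDA core counts and clocks leak.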

On the CPU side we assume that this is built using an ARM-based processor, most likely off-the-shelf core designs to keep things simple. Basing it on custom designs like Denver might not be necessary for this type of platform. 

Nintendo has traditionally used custom operating systems for its consoles, and that seems to be the case with the Switch as well. NVIDIA mentions a couple of times how much work it put into custom APIs, a custom physics engine, new libraries, and so on.

The Nintendo Switch’s gaming experience is also supported by fully custom software, including a revamped physics engine, new libraries, advanced game tools and libraries. NVIDIA additionally created new gaming APIs to fully harness this performance. The newest API, NVN, was built specifically to bring lightweight, fast gaming to the masses.

We’ve optimized the full suite of hardware and software for gaming and mobile use cases. This includes custom operating system integration with the GPU to increase both performance and efficiency.

The system itself looks pretty damn interesting, with the ability to switch (get it?) between a docked to your TV configuration to a mobile one with attached or wireless controllers. Check out the video below for a preview.

I've asked both NVIDIA and Nintendo for more information on the hardware side, but these companies tend to be tight-lipped about the custom silicon going into console hardware. Hopefully one or the other is excited to tell us about the technology so we can have some interesting specifications to discuss and debate!

UPDATE: A story on The Verge claims that Nintendo "took the chip from the Shield" and put it in the Switch. This is more than likely completely false; the Shield is a significantly dated product and that kind of statement could undersell the power and capability of the Switch and NVIDIA's custom SoC quite dramatically.

Source: Nintendo

Qualcomm Announces Snapdragon 653, 626, and 427 SoCs

Subject: Processors, Mobile | October 18, 2016 - 11:32 AM |
Tagged: SoC, Snapdragon 653, Snapdragon 626, Snapdragon 427, snapdragon, smartphone, qualcomm, mobile

Qualcomm has announced new 400 and 600-series Snapdragon parts, and these new SoCs (Snapdragon 653, 626, and 427) inherit technology found previously on the 800-series parts, including fast LTE connectivity and dual-camera support.

qualcomm-snapdragon-mobile-processor.jpg

The integrated LTE modem has been significantly upgraded in each of these SoCs, and Qualcomm lists these features for the new products:

  • X9 LTE with CAT 7 modem (300Mbps DL; 150Mbps UL) designed to provide users with a 50 percent increase in maximum uplink speeds over the X8 LTE modem.
  • LTE Advanced Carrier Aggregation with up to 2x20 MHz in the downlink and uplink
  • Support for 64-QAM in the uplink
  • Superior call clarity and higher call reliability with the Enhanced Voice Services (EVS) codec on VoLTE calls.
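Qualcomm's "50 percent increase in maximum uplink speeds" claim is easy to verify against the X9's 150 Mbps Cat 7 uplink; the 100 Mbps X8 uplink used below comes from Qualcomm's earlier X8 specs, not this announcement.

```python
# Checking Qualcomm's uplink claim: X9 (150 Mbps Cat 7 uplink) vs.
# X8 (100 Mbps uplink, per Qualcomm's earlier published specs).
x8_uplink_mbps = 100
x9_uplink_mbps = 150

gain = (x9_uplink_mbps / x8_uplink_mbps - 1) * 100
print(f"{gain:.0f}% faster uplink")  # 50% faster uplink
```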

In addition to the new X9 modem, all three SoCs offer faster CPU and GPU performance, with the Snapdragon 653 (which replaces the 652) now supporting up to 8GB of memory, up from a maximum of 4GB previously. Each of the new SoCs also features Qualcomm's Quick Charge 3.0 for fast charging.

SD_600_400.png

Full specifications for these new products can be found on the updated Snapdragon product page.

Availability of the new 600-series Snapdragon processors is set for the end of this year, so we could start seeing handsets with the faster parts soon; while the Snapdragon 427 is expected to ship in devices early in 2017.

Source: Qualcomm

Intel Launches Stratix 10 FPGA With ARM CPU and HBM2

Subject: Processors | October 10, 2016 - 02:25 AM |
Tagged: SoC, Intel, FPGA, Cortex A53, arm, Altera

Intel and its recently acquired Altera division have launched a new FPGA product based on Intel’s 14nm Tri-Gate process, featuring an ARM CPU, a 5.5-million-logic-element FPGA, and HBM2 memory in a single package. The Stratix 10 is aimed at data center, networking, and radar/imaging customers.

The Stratix 10 is an Altera-designed FPGA (field programmable gate array) with 5.5 million logic elements and a new HyperFlex architecture that optimizes registers, pipeline, and critical pathing (feed-forward designs) to increase core performance and increase the logic density by five times that of previous products. Further, the upcoming FPGA SoC reportedly can run at twice the core performance of Stratix V or use up to 70% less power than its predecessor at the same performance level.

Intel Altera Stratix 10.jpg

The increases in logic density, clockspeed, and power efficiency are a combination of the improved architecture and Intel’s 14nm FinFET (Tri-Gate) manufacturing process.

Intel rates the FPGA at 10 TFLOPS of single precision floating point DSP performance and 80 GFLOPS/watt.
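Those two ratings together imply a power figure Intel doesn't state outright; a rough division (illustrative only, since Intel has not published a TDP alongside these numbers):

```python
# Implied power from Intel's two ratings: 10 TFLOPS at 80 GFLOPS/W.
# A rough back-of-envelope division; Intel has not published a TDP here.
tflops = 10.0
gflops_per_watt = 80.0

implied_watts = tflops * 1000.0 / gflops_per_watt
print(f"~{implied_watts:.0f} W")  # ~125 W
```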

Interestingly, Intel is using an ARM processor to feed data to the FPGA chip rather than its own Quark or Atom processors. Specifically, the Stratix 10 uses an ARM CPU with four Cortex A53 cores as well as four stacks of on package HBM2 memory with 1TB/s of bandwidth to feed data to the FPGA. There is also a “secure device manager” to ensure data integrity and security.

The Stratix 10 is aimed at data centers and will be used for specialized tasks that demand high throughput and low latency. According to Intel, the chip is a good candidate as a co-processor to offload and accelerate encryption/decryption, compression/decompression, or Hadoop tasks. It can also be used to power specialized storage controllers and networking equipment.

Intel has started sampling the new chip to potential customers.

Intel Altera Stratix 10 FPGA SoC.png

In general, FPGAs are great at highly parallelized workloads, efficiently taking huge numbers of inputs and processing the data in parallel through custom-programmed logic gates. An FPGA is essentially a program in hardware that can be rewired in the field (though reprogramming is not necessarily fast; depending on the chip it can take hours or longer to switch things up). These processors are used in medical and imaging devices, high-frequency trading hardware, networking equipment, signals intelligence (cell towers, radar, guidance, etc.), bitcoin mining (though ASICs stole that show a few years ago), and even password cracking. They can become almost anything you want, which gives them an advantage over traditional CPUs and graphics cards, though cost and increased coding complexity are prohibitive.
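The "program in hardware" idea boils down to lookup tables (LUTs): each small table can be filled in to behave as any logic gate, and millions of them get wired together. A toy software model of the concept (real FPGAs use wider LUTs and configurable routing; this sketch only illustrates the principle):

```python
# Toy model of FPGA configurability: a 2-input lookup table (LUT) is a
# 4-entry truth table, and "programming" the fabric means filling in
# those tables. Real FPGAs wire together millions of wider LUTs.
def make_lut(truth_table):
    """truth_table maps an (a, b) pair of input bits to an output bit."""
    def lut(a, b):
        return truth_table[(a, b)]
    return lut

# "Program" one LUT as XOR and another as AND: same silicon, different bits.
xor_gate = make_lut({(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0})
and_gate = make_lut({(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 1})

# A half-adder built from the two programmed LUTs.
def half_adder(a, b):
    return xor_gate(a, b), and_gate(a, b)  # (sum, carry)

print(half_adder(1, 1))  # (0, 1)
```

Reprogramming here is just swapping the truth tables, which is why the same chip can be a network packet filter one day and a neural-network accelerator the next.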

The Stratix 10 stood out as interesting to me because of its claimed 10 TFLOPS of single precision performance, which is reportedly the important metric when it comes to training neural networks. In fact, Microsoft recently began deploying FPGAs across its Azure cloud computing platform and plans to build the “world’s fastest AI supercomputer.” The Redmond-based company’s Project Catapult saw Stratix V FPGAs deployed to nearly all of its Azure datacenters, where the programmable silicon forms part of an “acceleration fabric” in its “configurable cloud” architecture. That fabric will initially accelerate the company’s Bing search and AI research efforts, and later be opened up to customers for their own applications.

It is interesting to see Microsoft going with FPGAs, especially as efforts to use GPUs for GPGPU and neural network training and inferencing duties have increased so dramatically over the years (with NVIDIA pushing the latter). It may well be a good call on Microsoft’s part, as it could enable better performance, and researchers would be able to code their AI accelerator platforms down to the gate level to really optimize things. Using higher-level languages and cheaper GPU hardware does have a lower barrier to entry, though. I suppose it will depend on just how much Microsoft charges customers to use the FPGA-powered instances.

FPGAs are in kind of a weird middle ground and while they are definitely not a new technology, they do continue to get more complex and powerful!

What are your thoughts on Intel's new FPGA SoC?

Source: Intel

NVIDIA Teases Low Power, High Performance Xavier SoC That Will Power Future Autonomous Vehicles

Subject: Processors | October 1, 2016 - 06:11 PM |
Tagged: xavier, Volta, tegra, SoC, nvidia, machine learning, gpu, drive px 2, deep neural network, deep learning

Earlier this week at its first GTC Europe event in Amsterdam, NVIDIA CEO Jen-Hsun Huang teased a new SoC code-named Xavier that will be used in self-driving cars and feature the company's newest custom ARM CPU cores and Volta GPU. The new chip will begin sampling at the end of 2017 with product releases using the future Tegra (if they keep that name) processor as soon as 2018.

NVIDIA_Xavier_SOC.jpg

NVIDIA's Xavier is promised to be the successor to the company's Drive PX 2 system, which uses two Tegra X2 SoCs and two discrete Pascal MXM GPUs on a single water-cooled platform. NVIDIA is promising not only to replace those four processors with a single chip at comparable deep learning performance, but to reportedly do so at 20W – less than a tenth of the Drive PX 2's TDP!

The company has not revealed all the nitty-gritty details, but it did tease out a few bits of information. The new processor will feature 7 billion transistors, will be based on a refined 16nm FinFET process, and will consume a mere 20W. It can process two 8K HDR video streams and can hit 20 TOPS (NVIDIA's own rating for deep learning int8 operations).

Specifically, NVIDIA claims that the Xavier SoC will use eight custom ARMv8 (64-bit) CPU cores (it is unclear whether these will be a refined Denver architecture or something else) and a GPU based on its upcoming Volta architecture with 512 CUDA cores. Also, in an interesting twist, NVIDIA is including a "Computer Vision Accelerator" on the SoC, though the company did not go into many details. This bit of silicon may explain how the ~300mm2 die with 7 billion transistors is able to match the 7.2-billion-transistor Pascal-based Tesla P4 (2560 CUDA cores) graphics card at deep learning (tera-operations per second) tasks, on top of the incremental improvements from moving to Volta and new ARMv8 CPU cores on a refined 16nm FF+ process.

  Drive PX Drive PX 2 NVIDIA Xavier Tesla P4
CPU 2 x Tegra X1 (8 x A57 total) 2 x Tegra X2 (8 x A57 + 4 x Denver total) 1 x Xavier SoC (8 x Custom ARM + 1 x CVA) N/A
GPU 2 x Tegra X1 (Maxwell) (512 CUDA cores total) 2 x Tegra X2 GPUs + 2 x Pascal GPUs 1 x Xavier SoC GPU (Volta) (512 CUDA cores) 2560 CUDA Cores (Pascal)
TFLOPS 2.3 TFLOPS 8 TFLOPS ? 5.5 TFLOPS
DL TOPS ? 24 TOPS 20 TOPS 22 TOPS
TDP ~30W (2 x 15W) 250W 20W up to 75W
Process Tech 20nm 16nm FinFET 16nm FinFET+ 16nm FinFET
Transistors ? ? 7 billion 7.2 billion

For comparison, the currently available Tesla P4, based on the Pascal architecture, has a TDP of up to 75W and is rated at 22 TOPS. This would suggest that Volta is a much more efficient architecture (at least for deep learning and half precision)! I am not sure how NVIDIA is able to match its GP104 with only 512 Volta CUDA cores, though its definition of a "core" could have changed and/or the CVA processor may be responsible for closing that gap. Unfortunately, NVIDIA did not disclose a TFLOPS rating for Xavier, so it is difficult to compare, and the chip may not match GP104 at higher precision workloads; it could be wholly optimized for int8 operations rather than floating point performance. Beyond that, I will let Scott dive into the particulars once we have more information!
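The efficiency story is easiest to see as TOPS per watt, computed from the int8 ratings and TDPs in the table above (Xavier's numbers are NVIDIA projections, not measurements):

```python
# Deep-learning efficiency implied by the ratings above: TOPS per watt.
# Xavier's figures are NVIDIA projections for an unreleased chip.
parts = {
    "Drive PX 2": (24, 250),  # (int8 TOPS, TDP in watts)
    "Xavier":     (20, 20),
    "Tesla P4":   (22, 75),
}

for name, (tops, watts) in parts.items():
    print(f"{name}: {tops / watts:.2f} TOPS/W")
```

By this arithmetic, Xavier would land around 1 TOPS/W against roughly 0.3 for the Tesla P4 and 0.1 for Drive PX 2, which is why the 20W claim is the headline number.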

Xavier is more of a teaser than anything, and the chip could very well change dramatically and/or miss the claimed performance targets. Still, it sounds promising, and it is always nice to speculate over road maps. It is an intriguing chip and I am ready for more details, especially on the Volta GPU and just what exactly that Computer Vision Accelerator is (and whether it will be easy to program for).

I am a big fan of the self-driving car and I hope that it succeeds. The momentum certainly looks set to continue as Tesla, VW, BMW, and other automakers push the envelope of what is possible and plan future cars that will include smart driving assists and even cars that can drive themselves. The more local computing power we can throw at automobiles the better; massive datacenters can train the neural networks, but local hardware to run them and make decisions is necessary (you don't want internet latency contributing to the decision of whether to brake or not!).

I hope that NVIDIA's self-proclaimed "AI Supercomputer" turns out to be at least close to the performance they claim! Stay tuned for more information as it gets closer to launch (hopefully more details will emerge at GTC 2017 in the US).

What are your thoughts on Xavier and the whole self-driving car future?

Source: NVIDIA

AMD A12-9800 Overclocked to 4.8 GHz

Subject: Processors | September 27, 2016 - 07:01 AM |
Tagged: overclock, Bristol Ridge, amd

Update 9/27 @ 5:10pm: Added a link to Anandtech's discussion of Bristol Ridge. It was mentioned in the post, but I forgot to add the link itself when I transferred it to the site. The text is the same, though.

While Zen is nearing release, AMD has launched the AM4 platform with updated APUs. They will be based on an updated Excavator architecture, which we discussed during the Carrizo launch in mid-2015. Carrizo came about when AMD decided to focus heavily on the 15W and 35W power targets, giving the best possible experience for that huge market of laptops, in the tasks that those devices usually encounter, such as light gaming and media consumption.

amd-2016-a12-9800-overclock.png

Image Credit: NAMEGT via HWBot

Bristol Ridge, instead, focuses on the 35W and 65W thermal points. It will be targeted more at OEMs who want to release higher-performance products in the holiday time-frame, although consumers will be able to purchase it directly later in the year, according to Anandtech. I'm guessing it won't be pushed too heavily to DIY users, though, because AMD knows those users know Zen is coming.

It turns out that overclockers already have their hands on it, though, and it seems to take a fairly high frequency. NAMEGT, from South Korea, uploaded a CPU-Z screenshot to HWBot that shows the 28nm, quad-core part clocked at 4.8 GHz. The included images claim that this was achieved on air, using AMD's new stock “Wraith” cooler.

Source: HWBot

AMD's Upcoming Socket AM4 Pictured with 1331 Pins

Subject: Processors | September 19, 2016 - 10:35 AM |
Tagged: Socket AM4, processor, FX, cpu, APU, amd, 1331 pins

A report from Hungarian site HWSW (cited by Bit-Tech) has a close-up photo of the new AMD AM4 processor socket, and it looks like this will have 1331 pins (go ahead and count them, if you dare!).

socket_am4.jpg

Image credit: Bit-Tech via HWSW

AMD's newest socket will merge the APU and FX series CPUs into this new AM4 socket, unlike the previous generation which split the two between AM3+ and FM2+. This is great news for system builders, who now have the option of starting with an inexpensive CPU/APU, and upgrading to a more powerful FX processor later on - with the same motherboard.

The new socket will apparently require a new cooler design, which is contrary to early reports (yes, we got it wrong, too) that the AM4 socket would be compatible with existing AM3 cooler mounts (manufacturers could of course offer hardware kits for existing cooler designs). In any case, AMD's new socket accepts even more of the delicate copper pins you love to try not to bend!

Source: Bit-Tech