Intel to Ship FPGA-Accelerated Xeons in Early 2016

Subject: Processors | November 20, 2015 - 06:21 PM |
Tagged: xeon, Intel, FPGA

UPDATE (Nov 26th, 3:30pm ET): A few readers have mentioned that FPGAs take much less than hours to reprogram. I even received an email last night claiming that FPGAs can be reprogrammed in "well under a second." This differs from the sources I read back in ~2013, when I was researching their OpenCL capabilities (for potential evolutions of some projects). That said, multiple readers, including some who claim personal experience with FPGAs, say the hours figure is wrong. Also, I've never used an FPGA myself -- again, I was just researching them to see where some GPU-based projects could go.

Designing integrated circuits, as I've said a few times, is basically a game. You have a blank canvas that you can etch complexity into. The amount of “complexity” depends on your fabrication process, how big your chip is, the intended power, and so forth. Performance depends on how you use the complexity to compute actual tasks. If you know something special about your workload, you can optimize your circuit to do more with less. CPUs are designed to do basically anything, while GPUs assume similar tasks can be run together. If you will only ever run a single program, you can even bake some or all of its source code into hardware called an “application-specific integrated circuit” (ASIC), which is often used for video decoding, rasterizing geometry, and so forth.


This is an old Atom back when Intel was partnered with Altera for custom chips.

FPGAs are circuits that can be configured for a specific application, but can also be reprogrammed later. Changing tasks requires a significant amount of time (sometimes hours) but it is easier than reconfiguring an ASIC, which involves removing it from your system, throwing it in the trash, and printing a new one. FPGAs are not quite as efficient as a dedicated ASIC, but they're about as close as you can get without translating the actual source code directly into a circuit.

Intel, after purchasing FPGA manufacturer Altera, will integrate their technology into Xeons in Q1 2016. This will be useful for offloading specific tasks that dominate a server's total workload. According to PC World, the two will be integrated as a two-chip package, where both the CPU and FPGA can access the same cache. I'm not sure what form of heterogeneous memory architecture Intel is using, but this would be a great example of a part that could benefit from in-place acceleration. You could imagine a simple function being baked into the FPGA to, I don't know, process large videos in very specific ways without expensive copies.

Again, this is not a consumer product, and may never be. Reprogramming an FPGA can take hours, and I can't think of too many situations where consumers will trade off hours of time to switch tasks with high performance. Then again, it just takes one person to think of a great application for it to take off.

Source: PCWorld

Intel Launches Knights Landing-based Xeon Phi AIBs

Subject: Processors | November 18, 2015 - 07:34 AM |
Tagged: Xeon Phi, knights landing, Intel

The add-in board version of the Xeon Phi has just launched, which Intel aims at supercomputing audiences. They also announced that this product will be available as a socketed processor that is embedded in, as PC World states, “a limited number of workstations” by the first half of next year. The interesting part about these processors is that they combine a GPU-like architecture with the x86 instruction set.


Image Credit: Intel (Developer Zone)

In the case of next year's socketed Knights Landing CPUs, you can even boot your OS with it (and no other processor installed). It will probably be a little like running a 72-core Atom-based netbook.

To make it a little more clear, Knights Landing is a 72-core, 512-bit processor. You might wonder how that can compete against a modern GPU, which has thousands of cores, but those are not really cores in the CPU sense. GPUs crunch massive amounts of calculations by essentially tying several cores together, and doing other tricks to minimize die area per effective instruction. NVIDIA ties 32 threads together and pushes them down the silicon in lockstep. As long as they don't diverge, you get 32 computations for very little die area. AMD packs 64 together.

Knights Landing does the same. Its 512-bit registers can hold 16 single-precision (32-bit) values and operate on them simultaneously.

16 times 72 is 1152. All of a sudden, we're in shader-count territory. This is one of the reasons why they can achieve such high performance with “only” 72 cores, compared to the “thousands” that are present on GPUs. They're actually on a similar scale, just counted differently.

Update: (November 18th @ 1:51 pm EST) I just realized that, while I kept saying "one of the reasons", I never elaborated on the other points. Knights Landing also has four threads per core. So that "72 core" is actually "288 thread", with 512-bit registers that can perform sixteen 32-bit SIMD instructions simultaneously. While hyperthreading is not known to be 100% efficient, you could consider Knights Landing to be a GPU with 4608 shader units. Again, it's not the best way to count it, but it could sort-of work.
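This arithmetic is easy to sanity-check. The following is just back-of-envelope math using the figures quoted in this article (512-bit registers, 32-bit floats, 72 cores, 4 threads per core); it says nothing about how the hardware actually schedules work:

```python
# Back-of-envelope "shader count" math for Knights Landing,
# using only the figures quoted in this article.

register_bits = 512       # AVX-512 register width
value_bits = 32           # single-precision float
cores = 72
threads_per_core = 4

lanes = register_bits // value_bits           # 16 SIMD lanes per register
simd_units = cores * lanes                    # 1152 -- "shader-count territory"
with_threads = simd_units * threads_per_core  # 4608 -- the update's figure

print(lanes, simd_units, with_threads)        # 16 1152 4608
```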

So in terms of raw performance, Knights Landing can crunch about 8 TeraFLOPs of single-precision performance, or around 3 TeraFLOPs of double-precision, 64-bit performance. That is around 30% faster than the Titan X in single precision, and around twice the performance of the Titan Black in double precision. NVIDIA basically removed the FP64 compute units from Maxwell / Titan X, so Knights Landing is about 16x faster there, but that's not really a fair comparison; NVIDIA recommends Kepler for double-precision workloads.
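Those ratios check out against the commonly quoted peak-throughput specs for the NVIDIA cards. Note that the GPU numbers below are my assumptions (roughly 6.1 TFLOPs single-precision for the Titan X with FP64 at a 1/32 rate, roughly 1.7 TFLOPs double-precision for the Titan Black), not figures from this article:

```python
# Sanity-checking the Knights Landing vs. NVIDIA comparisons.
knl_sp, knl_dp = 8.0, 3.0      # Knights Landing TeraFLOPs (from the article)
titan_x_sp = 6.1               # assumed Titan X single-precision spec
titan_x_dp = titan_x_sp / 32   # Maxwell runs FP64 at 1/32 rate
titan_black_dp = 1.7           # assumed Titan Black double-precision spec

print(round(knl_sp / titan_x_sp - 1, 2))  # ~0.31 -> "around 30% faster"
print(round(knl_dp / titan_black_dp, 1))  # ~1.8  -> "around twice"
print(round(knl_dp / titan_x_dp))         # 16    -> "about 16x faster"
```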

So interestingly, Knights Landing would be a top-tier graphics card (in terms of shading performance) if it were compatible with typical graphics APIs. Of course, it's not, and it will be priced way higher than, for instance, the AMD Radeon Fury X. Knights Landing isn't available on Intel ARK yet, but previous models are in the $2000 - $4000 range.

Source: PC World

New Intel NUC Models Listed with 6th-Gen Skylake Processors

Subject: Processors, Systems | November 17, 2015 - 11:21 AM |
Tagged: Skylake, NUC6i5SYK, NUC6i5SYH, NUC6i3SYK, NUC6i3SYH, nuc, mini-pc, Intel, i5-6260U, i3-6100U


(Image credit: PCMag)

NUC systems sporting the latest Intel 6th-gen Skylake processors are coming, with the NUC6i5SYH, NUC6i5SYK, NUC6i3SYH, and NUC6i3SYK listed with updated Core i5 and i3 CPUs. As this is a processor refresh, the appearance and product nomenclature remain unchanged (unfortunately).


The four new Skylake Intel NUC models listed on Intel's product page

Here's Intel's description of the Skylake Core i5-powered NUC6i5SYH:

"Intel NUC Kit NUC6i5SYH is equipped with Intel’s newest architecture, the 6th generation Intel Core i5-6260U processor. Intel Iris graphics 540 with 4K display capabilities provides brilliant resolution for gaming and home theaters. NUC5i5SYH has room for a 2.5” drive for additional storage and an M.2 SSD so you can transfer your data at lightning speed. Designed for Windows 10, NUC6i5SYH has the performance to stream media, manage spreadsheets, or create presentations."

The NUC6i5SYH and NUC6i5SYK feature the i5-6260U, a dual-core, Hyper-Threaded 15W part with a base speed of 1.9 GHz and up to 2.8 GHz Turbo. It has 4 MB of cache and supports up to 32GB of 2133 MHz DDR4. The processor also provides Intel Iris graphics 540 (Skylake GT3e), which offers 48 Execution Units and 64 MB of dedicated eDRAM. The lower-end NUC6i3SYH and NUC6i3SYK models offer the i3-6100U, also a dual-core, Hyper-Threaded 15W part, but its speed is fixed at 2.3 GHz without Turbo Boost, and it offers the lesser Intel HD Graphics 520.

Availability and pricing are not yet known, but expect to see the new models for sale soon.

Source: Intel

Report: Intel Broadwell-E Flagship i7-6950X a 10 Core, 20 Thread CPU

Subject: Processors | November 13, 2015 - 06:40 PM |
Tagged: X99, processor, LGA2011-v3, Intel, i7-6950X, HEDT, Haswell-E, cpu, Broadwell-E

Intel's high-end desktop (HEDT) processor line will reportedly be moving from Haswell-E to Broadwell-E soon, and with the move Intel will offer their highest consumer core count to date, according to a post at XFastest which WCCFtech reported on yesterday.


Image credit: VR-Zone

While it had been thought that Broadwell-E would feature the same core counts as Haswell-E (as seen on the leaked slide above), according to the report the upcoming flagship Core i7-6950X will be a massive 10 core, 20 thread part built using Intel's 14 nm process. Broadwell-E is expected to provide an upgrade to those running on Intel's current enthusiast X99 platform before Skylake-E arrives with an all-new chipset.

WCCFtech offered this chart in their report, outlining the differences between the HEDT generations (and providing a glimpse of the future Skylake-E variant):


Intel HEDT generations compared (Credit: WCCFtech)

It isn't all that surprising that one of Intel's LGA2011-v3 processors would arrive on desktops with 10 cores, as these are closely related to the Xeon server processors, and Haswell-based Xeon CPUs are already available with up to 18 cores - though priced far beyond what even the extreme builder would probably find reasonable (not to mention being far less suited to a desktop build based on motherboard compatibility). The projected $999 price tag for the 10-core Extreme Edition part would mark not only the first time an Intel desktop processor reached that core-count milestone, but also the lowest price to date for one of the company's 10-core parts (Xeon or otherwise).

Running Intel HD 530 graphics under Linux

Subject: Processors | November 12, 2015 - 01:22 PM |
Tagged: linux, Skylake, Intel, i5-6600K, hd 530, Ubuntu 15.10

A great way to shave money off of a minimalist system is to skip buying a GPU and use the one present on modern processors, as well as installing Linux instead of buying a Windows license.  The problem with doing so is that playing demanding games is going to be beyond your computer's ability, at least without turning off most of the features that make the game look good.  To help you figure out what your machine would be capable of, there is this article from Phoronix.  Their tests show that Windows 10 currently has a very large performance lead over the same hardware running Ubuntu, as the Windows OpenGL driver is superior to the open-source Linux driver.  This may change sooner rather than later, but for now you will not get the most out of your Skylake's GPU on Linux.


"As it's been a while since my last Windows vs. Linux graphics comparison and haven't yet done such a comparison for Intel's latest-generation Skylake HD Graphics, the past few days I was running Windows 10 Pro x64 versus Ubuntu 15.10 graphics benchmarks with a Core i5 6600K sporting HD Graphics 530."


Source: Phoronix

Samsung Announces Exynos 8 Octa 8890 Application Processor

Subject: Processors, Mobile | November 12, 2015 - 09:30 AM |
Tagged: SoC, smartphone, Samsung Galaxy, Samsung, mobile, Exynos 8890, Exynos 8 Octa, Exynos 7420, Application Processor

Coming just a day after Qualcomm officially launched their Snapdragon 820 SoC, Samsung is today unveiling their latest flagship mobile part, the Exynos 8 Octa 8890.


The Exynos 8 Octa 8890 is built on Samsung’s 14 nm FinFET process like the previous Exynos 7 Octa 7420, and is again based on a big.LITTLE configuration, though the big processing cores are a custom design this time around. The Exynos 7420 was composed of four ARM Cortex A57 cores and four small Cortex A53 cores, and while the small cores in the 8890 are again ARM Cortex A53, the big cores feature Samsung’s “first custom designed CPU based on 64-bit ARMv8 architecture”.

“With Samsung’s own SCI (Samsung Coherent Interconnect) technology, which provides cache-coherency between big and small cores, the Exynos 8 Octa fully utilizes benefits of big.LITTLE structure for efficient usage of the eight cores. Additionally, Exynos 8 Octa is built on highly praised 14nm FinFET process. These all efforts for Exynos 8 Octa provide 30% more superb performance and 10% more power efficiency.”


Another big advancement for the Exynos 8 Octa is the integrated modem, which provides Category 12/13 LTE with download speeds (with carrier aggregation) of up to 600 Mbps, and uploads up to 150 Mbps. This might sound familiar, as it mirrors the LTE Release 12 specs of the new modem in the Snapdragon 820.

Video processing is handled by the Mali-T880 GPU, moving up from the Mali-T760 found in the Exynos 7 Octa. The T880 is “the highest performance and the most energy-efficient mobile GPU in the Mali family”, with up to 1.8x the performance of the T760 while being 40% more energy-efficient. 

Samsung will be taking this new SoC into mass production later this year, and the chip is expected to be featured in the company’s upcoming flagship Galaxy phone.

Full PR after the break.

Source: Samsung

GLOBALFOUNDRIES Achieves 14nm FinFET - Coming to New AMD Products

Subject: Processors | November 6, 2015 - 10:09 AM |
Tagged: tape out, processors, GLOBALFOUNDRIES, global foundries, APU, amd, 14 nm FinFET

GlobalFoundries has today officially announced their success with sample 14 nm FinFET production for upcoming AMD products.


(Image credit: KitGuru)

GlobalFoundries licensed 14 nm LPE and LPP technology from Samsung in 2014, and were producing wafers as early as April of this year. At the time a GF company spokesperson was quoted in this report at KitGuru, stating "the early version (14LPE) is qualified in our fab and our lead product is yielding in double digits. Since 2014, we have taped multiple products and testchips and are seeing rapid progress, in yield and maturity, for volume shipments in 2015." Now they have moved past LPE (Low Power Early) to LPP (Low Power Plus), with new products based on the technology slated for 2016:

"AMD has taped out multiple products using GLOBALFOUNDRIES’ 14nm Low Power Plus (14LPP) process technology and is currently conducting validation work on 14LPP production samples.  Today’s announcement represents another significant milestone towards reaching full production readiness of GLOBALFOUNDRIES’ 14LPP process technology, which will reach high-volume production in 2016."

GlobalFoundries was originally the manufacturing arm of AMD, and has continued to produce the company's processors since the 2009 spin-off. AMD's current desktop FX-8350 CPU was manufactured on 32 nm SOI, and more recently APUs such as the A10-7850K have been produced at 28 nm - both at GlobalFoundries. Intel's latest offerings such as the flagship 6700K desktop CPU are produced on Intel's 14 nm process, and the success of 14LPP production at GlobalFoundries has the potential to bring AMD's new processors closer to parity with Intel (at least from a lithography standpoint).

Full PR after the break.

Report: Unreleased AMD Bristol Ridge SoC Listed Online

Subject: Processors | November 5, 2015 - 09:30 PM |
Tagged: SoC, report, processor, mobile apu, leak, FX-9830PP, cpu, Bristol Ridge, APU, amd

A new report points to an entry from the USB Implementers Forum which shows an unreleased AMD Bristol Ridge SoC.


(AMD via

Bristol Ridge itself is not news, as the report at Computer Base observes (translation):

"A leaked roadmap had previously noted that Bristol Ridge will, in the coming year, be soldered onto motherboards for notebooks and desktop computers in the special FP4 BGA package."


( via Computer Base)

But there is something different about this chip: as the report points out, the model name FX-9830P pictured in the screen grab is consistent with the naming scheme for notebook parts, the highest current model being the FX-8800P (Carrizo), a 35W, 4-thread Excavator part with 512 stream processors from the R7 GPU core.


(BenchLife via Computer Base)

No details are available other than information from a leaked roadmap (above), which points to Bristol Ridge as an FP4 BGA part for mobile, with a desktop variant for socket FM3 that would replace Kaveri/Godavari (and possibly still an Excavator part). New cores are coming in 2016, and we'll have to wait and see for additional details (or until more information inevitably leaks out).

Update, 11/06/15: WCCFtech expounds on the leak:

“Bristol Ridge isn’t just limited to mobility platforms but will also be featured on AM4 desktop platform as Bristol Ridge will be the APU generation available on desktops in 2016 while Zen would be integrated on the performance focused FX processors.”

WCCFtech’s report also included a link to this SiSoftware database entry for an engineering sample of a dual-core Stoney Ridge processor, a low-power mobile part with a 2.7 GHz clock speed. Stoney Ridge will reportedly succeed Carrizo-L for low-power platforms.

The report also provided this chart to reference the new products:



Report: Intel Xeon D SoC to Reach 16 Cores

Subject: Processors | October 23, 2015 - 02:21 PM |
Tagged: Xeon D, SoC, rumor, report, processor, Pentium D, Intel, cpu

Intel's Xeon D SoC lineup will soon expand to include 12-core and 16-core options, after the platform launched earlier this year with the option of 4 or 8 cores for the 14 nm chips.


The report yesterday from CPU World offers new details on the refreshed lineup which includes both Xeon D and Pentium D SoCs:

"According to our sources, Intel have made some changes to the lineup, which is now comprised of 13 Xeon D and Pentium D SKUs. Even more interesting is that Intel managed to double the maximum number of cores, and consequentially combined cache size, of Xeon D design, and the nearing Xeon D launch may include a few 12-core and 16-core models with 18 MB and 24 MB cache."

The move is not unexpected as Intel initially hinted at an expanded offering by the end of the year (emphasis added):

"...the Intel Xeon processor D-1500 product family is the first offering of a line of processors that will address a broad range of low-power, high-density infrastructure needs. Currently available with 4 or 8 cores and 128 GB of addressable memory..."


Current Xeon D Processors

The new flagship Xeon D model will be the D-1577, a 16-core processor with between 18 and 24 MB of L3 cache (exact specifications are not yet known). These SoCs feature integrated platform controller hub (PCH), I/O, and dual 10 Gigabit Ethernet, and the initial offerings had up to a 45W TDP. It would seem likely that a model with double the core count would either necessitate a higher TDP or simply target a lower clock speed. We should know more before too long.
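For what it's worth, the rumored cache figures scale exactly with the core counts, assuming (my pairing, not CPU World's) that 18 MB goes with the 12-core models and 24 MB with the 16-core models:

```python
# L3 cache per core for the rumored Xeon D models, assuming
# 18 MB pairs with 12 cores and 24 MB pairs with 16 cores.
models = {12: 18, 16: 24}  # cores -> MB of L3 cache (rumored)
for cores, cache_mb in models.items():
    print(cores, cache_mb / cores)  # 1.5 MB per core in both cases
```

That 1.5 MB per core also matches the existing 8-core parts with 12 MB of cache, which makes the rumored figures at least internally consistent.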

For further information on Xeon D, please check out our previous coverage: 

Source: CPU-World

Rumor: Apple to Use Custom AMD SoC for Next-Gen iMac

Subject: Processors | October 19, 2015 - 11:28 AM |
Tagged: Zen, SoC, processor, imac, APU, apple, amd

News about AMD has been largely depressing of late, with the introduction of the R9 Fury/Fury X and Nano graphics cards a bright spot in the otherwise tumultuous year that was recently capped by a $65 million APU write down. But one area where AMD has managed to earn a big win has been the console market, where their APUs power the latest machines from Microsoft and Sony. The combination of CPU and a powerful GPU on a single chip is ideal for those small form-factor designs, and likewise it would be ideal for a slim all-in-one PC. But an iMac?


Image credit: Apple

A report from WCCFtech today points to the upcoming Zen architecture from AMD as a likely power source for a potential custom SoC:

"A Semi-custom SOC x86 for the iMac would have to include a high performance x86 component, namely Zen, in addition to a graphics engine to drive the visual experience of the device. Such a design would be very similar to the current semi-custom Playstation 4 and XBOX ONE Accelerated Processing Units, combining x86 CPU cores with a highly capable integrated graphics solution."

Those who don't follow Apple probably don't know the company switched almost exclusively to AMD graphics a short time ago, with NVIDIA solutions phased out of all discrete GPU models. Whether politically motivated or simply the result of AMD providing what Apple wanted from a hardware/driver standpoint I can't say, but it's still a big win for AMD considering Apple's position as one of the largest computer manufacturers - even though its market share is very low in the highly fragmented PC market overall. And while Apple has exclusively used Intel processors in its systems since transitioning away from IBM's PowerPC beginning in 2006, the idea of a custom AMD APU makes a lot of sense for the company, especially for its size- and heat-constrained iMac designs.


Image credit: WCCFtech

Whether or not you'd ever consider buying an iMac - or any other computer from Apple, for that matter - it's still important for the PC industry as a whole that AMD continues to find success and provide competition for Intel. Consumers can only benefit from the potential for improved performance and reduced cost if competition heats up between Intel and AMD, something we really haven't seen on the CPU front in a few years now. With CEO Lisa Su stating in their recent earnings call that AMD "had secured two new semi-custom design wins", it could very well be that we will see Zen in future iMacs, or in other PC all-in-one solutions for that matter.

Regardless, it will be exciting to see some good competition from AMD, even if we will have to wait quite a while for it. Zen isn't ready yet and we have no indication that any such product would be introduced until later next year. It will be interesting to see what Intel might do to compete given their resources. 2016 could be interesting.

Source: WCCFtech

Qualcomm Enters Server CPU Space with 24-Core Socketed Processor

Subject: Processors | October 12, 2015 - 12:24 PM |
Tagged: servers, qualcomm, processor, enterprise, cpu, arm, 24-core

Another player emerges in the CPU landscape: Qualcomm is introducing its first socketed processor for the enterprise market.


Image credit: PC World

A 24-core design based on 64-bit ARM architecture has reached the prototype phase, in a large LGA package resembling an Intel Xeon CPU.

From the report published by PC World:

"Qualcomm demonstrated a pre-production chip in San Francisco on Thursday. It's a purpose-built system-on-chip, different from its Snapdragon processor, that integrates PCIe, storage and other features. The initial version has 24 cores, though the final part will have more, said Anand Chandrasekher, Qualcomm senior vice president."


Image credit: PC World

Qualcomm built proof-of-concept servers with this new processor, "running a version of Linux, with the KVM hypervisor, streaming HD video to a PC. The chip was running the LAMP stack - Linux, the Apache Web server, MySQL, and PHP - and OpenStack cloud software," according to PC World. The functionality of this design demonstrates the chip's potential to power highly energy-efficient servers, making an obvious statement about the potential cost savings for large data companies such as Google and Facebook.

Source: PC World

Android to iPhone Day 17: SoC Performance

Subject: Processors, Mobile | October 12, 2015 - 11:08 AM |
Tagged: iphone 6s, iphone, ios, google, apple, Android, A9

PC Perspective’s Android to iPhone series explores the opinions, views and experiences of the site’s Editor in Chief, Ryan Shrout, as he moves from the Android smartphone ecosystem to the world of the iPhone and iOS. Having been entrenched in the Android smartphone market for 7+ years, the editorial series is less a review of the new iPhone 6s than an exploration of how the current smartphone market compares to each side's expectations.


My iPhone experiment continues, running into the start of the third full week of only carrying and using the new iPhone 6s. Today I am going to focus a bit more on metrics that can be measured in graph form – and that means benchmarks and battery life results. But before I dive into those specifics I need to touch on some other areas.

The most surprising result of this experiment to me, even as I cross into day 17, is that I honestly don’t MISS anything from the previous ecosystem. I theorized at the beginning of this series that I would find applications or use cases that I had adopted with Android that could not be matched on iOS without some significant sacrifices. That isn’t the case – anything that I want to do on the iPhone 6s, I can. Have I needed to find new apps for taking care of my alarms or to monitor my rewards card library? Yes, but the alternatives for iOS are at least as good, and oftentimes I find there are more (and better) solutions. I think it is fair to assume that same feeling of equality would be prevalent for users going in the other direction, iPhone to Android, but I can’t be sure without another move back to Android sometime in the future. It may come to that.


My previous alarm app was replaced with Sleep Cycle

In my Day 3 post I mentioned my worry about the lack of Quick Charging support. Well, I don’t know why Apple doesn’t talk it up more, but the charging rate for the iPhone 6s and iPhone 6s Plus is impressive, and even more so when you pair them with the higher-amperage charger that ships with iPads. Though purely non-scientific thus far, my through-the-day testing showed that I was able to charge the iPhone 6s Plus to 82% (from being dead after a battery test) in the span of 1.5 hours, while the OnePlus 2 was only at 35%. I realize the battery on the OnePlus 2 is larger, but based purely on how much use time you get for your time spent charging, the iPhones appear to be just as fast as any Android phone I have used.

Photo taking with the iPhone 6s still impresses me – more so with the speed than the quality. Image quality is fantastic, and we’ll do more analytical testing in the near future, but while attending events over the weekend, including a Bengals football game (5-0!) and a wedding, the startup process for the camera was snappy and the shutter speed never felt slow. I never thought “Damn, I missed the shot I wanted” and that’s a feeling I’ve had many times over the last several years of phone use.


You don't want to miss photos like this!

There were a couple of annoyances that cropped up, including what I think is a decrease in accuracy of the fingerprint reader on the home button. In the last 4 days I have had more bouncing “try again” notices on the phone than in the entirety of use before that. It’s possible that the button has additional oils from my hands on it or maybe that I am getting lazier about placement of my fingers on the Touch ID, but it’s hard to tell.

Continue reading day 17 of my Android to iPhone editorial!!

Curious just what is special about the AMD Pro line of APUs?

Subject: Processors | October 5, 2015 - 04:48 PM |
Tagged: amd, PRO A12-8800B, Excavator, carrizo pro, Godavari Pro

AMD recently announced a Pro lineup of Excavator-based chips which match their current Carrizo and Godavari lineups as far as specifications go.  This was somewhat confusing, as at first glance there were no real features separating the Pro chips from their non-Pro cousins in the press material from AMD or HP.  Tech ARP posted the slides from the reveal, and they note one key feature that separates the two chip families and explains why businesses should be interested in them.  These are hand-picked dies taken from hand-picked wafers, which AMD chose because they represent the best of the chips they have fabbed.  You should expect performance free from any defects which made it past quality control, and if you are unlucky enough to end up with a less-than-perfect chip anyway, they come with a 36-month extended OEM warranty.

In addition to being hand-picked, machines with an AMD Pro chip will also come with an ARM TrustZone Technology-based AMD Secure Processor onboard.  If you use a mobile device which has a TPM and a crypto-processor onboard you will be familiar with the technology; AMD is the first to bring this open-sourced security platform to Windows-based machines.  Small business owners may also be interested in the AMD PRO Control Center, an inventory management client which will not cost as much as those designed for the enterprise and in theory should be easier to use as well.

This news is of lesser interest to the gamer, but you never know: if you can secure one of these hand-picked chips you may find it gives you a bit more headroom for tweaking than your average run-of-the-mill Godavari or Carrizo would.


"We will not only show you the presentation slides, we also recorded the entire conference call and created a special video presentation based on the conference call for you. We hope you enjoy our work."


Source: Tech ARP

Apple Dual Sources A9 SOCs with TSMC and Samsung: Some Extra Thoughts

Subject: Processors | September 30, 2015 - 09:55 PM |
Tagged: TSMC, Samsung, FinFET, apple, A9, 16 nm, 14 nm

So the other day the nice folks over at Chipworks got word that Apple was in fact sourcing their A9 SOC at both TSMC and Samsung.  This is really interesting news on multiple fronts.  From the information gleaned the two parts are the APL0898 (Samsung fabbed) and the APL1022 (TSMC).

These process technologies have been in the news quite a bit.  As we well know, it has been hard for any foundry not named Intel to go under 28 nm in an effective way.  Even Intel has had some pretty hefty issues with their march to sub-32 nm parts, but they have the resources and financial ability to push through a lot of these hurdles.  One of the bigger problems that affected the foundries was the decision to push FinFETs back beyond what they were initially planning: hit 22/20 nm with planar transistors, and save FinFET technology for 16/14 nm.


The Chipworks graphic that explains the differences between Samsung's and TSMC's A9 products.

There were many reasons why this did not work in an effective way for the majority of products that the foundries were looking to service with a 22/20 nm planar process.  Yes, there were many parts that were fabricated using these nodes, but none of them were higher-power/higher-performance parts that typically garner headlines.  No CPUs, no GPUs, and only a handful of lower-power SOCs (most notably Apple's A8, which was around 89 mm squared and consumed 5 to 10 watts at maximum).  The node just did not scale power very effectively.  It provided a smaller die size, but it did not increase power efficiency and switching performance significantly as compared to 28 nm high performance nodes.

The information Chipworks has provided also verifies that Samsung's 14 nm FF process is more size-optimized than TSMC's 16 nm FF.  There was originally some talk about both nodes being very similar in overall transistor size and density, but Samsung has a slightly tighter design.  Neither of them is smaller than Intel's latest 14 nm, which is going into its second generation form.  Intel still has a significant performance and size advantage over everyone else in the field.  Going back to size, we see the Samsung chip is around 96 mm square while the TSMC chip is 104.5 mm square.  This is not huge, but it does show that the Samsung process is a little tighter and can squeeze more transistors per square mm than TSMC.
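A quick bit of arithmetic on those two die sizes puts the gap in perspective. Since both chips implement (nominally) the same design, the area ratio is a rough proxy for the density difference:

```python
samsung_mm2 = 96.0  # A9 die area on Samsung 14 nm (Chipworks estimate)
tsmc_mm2 = 104.5    # A9 die area on TSMC 16 nm (Chipworks estimate)

# TSMC's version is ~8.9% larger for roughly the same design, i.e.
# Samsung packs close to 9% more logic per square millimeter here.
ratio = tsmc_mm2 / samsung_mm2
print(round((ratio - 1) * 100, 1))  # 8.9
```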

In terms of actual power consumption and clock scaling, we have nothing to go on here.  Both chips are represented in the 6S and 6S+, and testing so far has not shown significant differences between the two SoCs.  In theory one could be performing better than the other, but in reality these chips have not been tested at a low enough level to discern any major performance or power difference.  My gut feeling is that Samsung's process is more mature and running slightly better than TSMC's, but the differences are going to be minimal at best.

The next piece of info that we can glean from this is that there just isn't enough line space for all of the chip companies who want to fabricate their parts with either Samsung or TSMC.  From a chip standpoint, a lot of work has to be done to port a design to two different process nodes.  While 14 and 16 nm are similar in overall size and both use FinFETs, the standard cells and design libraries for Samsung and TSMC are going to be very different.  It is not a simple thing to port a design; a lot of work has to be done at the design stage to make a chip work with both nodes.  I can tell you that there is no way both chips are identical in layout.  This is not a "dumb port" where they just adjust the optics with the same masks and magically make these chips work right off the bat.  Different mask sets for each fab, verification of both designs, and troubleshooting yields through metal-layer changes will be required for each manufacturer.

In the end, this means there simply was not enough capacity at either TSMC or Samsung to handle the demand Apple was expecting.  Because Apple has deep pockets, it contracted both TSMC and Samsung to produce two very similar, but still different, parts.  Apple also likely outbid other major chip firms and locked down whatever wafer capacity Samsung and TSMC had, much to those firms' dismay.  I have no idea what is going on in the background with companies like NVIDIA and AMD when it comes to line space for manufacturing their next-generation parts.  At least for AMD, it seems that their partnership with GLOBALFOUNDRIES and its version of 14 nm FF is having a hard time taking off.  Eventually more production capacity will open up, and yields and bins will improve.  Apple will stop taking up so much space and we can get other products rolling off the line.  In the meantime, enjoy that cutting-edge iPhone 6S/+ with the latest 14/16 nm FF chips.

Source: Chipworks

Oh Hey! Skylake and Broadwell Stock Levels Replenish

Subject: Processors | September 27, 2015 - 07:01 AM |
Tagged: Skylake, iris pro, Intel, Broadwell

Thanks to the Tech Report for pointing this out, but some recent stock level troubles with Skylake and Broadwell have been overcome. Both Newegg and Amazon have a few Core i7-6700Ks that are available for purchase, and both also have the Broadwell Core i7s and Core i5s with Iris Pro graphics. Moreover, Microcenter has stock of the Skylake processor at some of their physical stores with the cheapest price tag of all, but they do not have the Broadwell chips with Iris Pro (they are not even listed).


You'll notice that Skylake is somewhat cheaper than the Core i7 Broadwell, especially on Newegg. That is somewhat expected, as Broadwell with Iris Pro is a larger die than Skylake with Intel HD 530 graphics. A bigger die means that fewer can be cut from a wafer, and thus each costs more (unless the smaller die has a relatively high defect rate to compensate, of course). Also, if you go with Broadwell, you will miss out on the Z170 chipset, because those chips still use Haswell's LGA-1150 socket.
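To put the die-size argument in concrete terms, here is a sketch using the standard dies-per-wafer approximation on a 300 mm wafer. The two die areas below are illustrative round numbers, not official Intel figures, but the relationship holds for any pair of sizes:

```python
import math

def dies_per_wafer(die_area_mm2, wafer_diameter_mm=300):
    """Common approximation: gross dies on a circular wafer,
    discounting the partial dies lost around the edge."""
    radius = wafer_diameter_mm / 2
    return int(math.pi * radius**2 / die_area_mm2
               - math.pi * wafer_diameter_mm / math.sqrt(2 * die_area_mm2))

# Hypothetical areas: a smaller quad-core die vs. a larger
# eDRAM/Iris Pro-class die.
print(dies_per_wafer(122))  # smaller die: more candidates per wafer
print(dies_per_wafer(182))  # larger die: noticeably fewer
```

The edge-loss term penalizes the larger die proportionally more, and that is before yield: a bigger die is also more likely to land on a defect, compounding the cost difference.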

On the other hand, despite being based on an older architecture and having much less thermal headroom, you can find some real-world applications that really benefit from the 128 MB of L4 cache that Iris Pro brings, even if the iGPU itself is unused. The graphics cache can be used by the main processor. In Project Cars, again according to The Tech Report, the i7-5775C measured a 5% higher frame rate than the newer i7-6700K -- when using a GeForce GTX 980. Granted, this was before the FCLK tweak on Skylake, so there are a few oranges mixed with our apples. PCIe rates might be slightly different now.

Regardless, they're all available now. If you were awaiting stock, have fun.

Intel Will Not Bring eDRAM to Socketed Skylake

Subject: Graphics Cards, Processors | September 17, 2015 - 09:33 PM |
Tagged: Skylake, kaby lake, iris pro, Intel, edram

Update: Sept 17, 2015 @ 10:30 ET -- To clarify: I'm speaking of socketed desktop Skylake. There will definitely be Iris Pro in the BGA options.

Before I begin, the upstream story has a few disputes that I'm not entirely sure on. The Tech Report published a post in September that cited an Intel spokesperson, who said that Skylake would not be getting a socketed processor with eDRAM (unlike Broadwell, which got one just before Skylake launched). This could be a big deal, because the fast, on-processor cache can be used by the CPU as well as the iGPU. It is sometimes called "128MB of L4 cache".


Later, ITWorld and others posted stories that said Intel killed off a Skylake processor with eDRAM, citing The Tech Report. After that, Scott Wasson claimed that a story, which may or may not be ITWorld's, had some "scrambled facts" but wouldn't elaborate. Comparing the two articles doesn't really illuminate any massive, glaring issues, but I might just be missing something.

Update: Sept 18, 2015 @ 9:45pm -- So I apparently misunderstood the ITWorld article. They were claiming that Broadwell-C was discontinued, while The Tech Report was talking about Socketed Skylake with Iris Pro. I thought they both were talking about the latter. Moreover, Anandtech received word from Intel that Broadwell-C is, in fact, not discontinued. This is odd, because ITWorld said they had confirmation from Intel. My guess is that someone gave them incorrect information. Sorry that it took so long to update.

In the same thread, Ian Cutress of Anandtech asked whether The Tech Report benchmarked the processor after Intel tweaked its FCLK capabilities, which Scott did not (but is interested in doing). Intel enabled a slight frequency boost on the link between the CPU and PCIe lanes after Skylake shipped, which naturally benefits discrete GPUs. Since the original claim was that Broadwell-C is better than Skylake-K for gaming, giving a 25% boost to that clock (or removing a 20% deficit, depending on how you look at it) could tilt Skylake back above Broadwell. We won't know until it's benchmarked, though.
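For what it's worth, the 25%-versus-20% framing is the same ratio read from two directions. A quick sketch, assuming the widely reported ~800 MHz shipping FCLK against the intended 1 GHz (treat the exact clocks as reported figures rather than confirmed ones):

```python
fclk_early = 800   # MHz, the setting many early Skylake boards reportedly used
fclk_spec = 1000   # MHz, the intended setting

boost = fclk_spec / fclk_early - 1   # the "25% boost" view
loss = 1 - fclk_early / fclk_spec    # the "20% deficit removed" view
print(f"{boost:.0%} boost, or a {loss:.0%} deficit removed")
```

This is the change to the link clock itself; how much of it survives into actual frame rates is exactly what re-benchmarking would reveal.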

Iris Pro and eDRAM, while skipping Skylake, might arrive in future architectures though, such as Kaby Lake. It seems to have been demonstrated that, in some situations, and ones relevant to gamers at that, this eDRAM can help computation -- without even considering the compute potential of a better secondary GPU. One argument is that cutting the extra die room gives Intel better margins, which is almost certainly true, but I wonder how much attention Kaby Lake will get. Especially with AVX-512 and other features debatably removed, it almost feels like Intel is treating this Tock like a Tick, since they didn't really get one with Broadwell, and Kaby Lake will be the architecture that leads us to 10nm. On the other hand, each of these architectures is developed by an independent team, so I might be wrong to compare them serially.

AMD Releases App SDK 3.0 with OpenCL 2.0

Subject: Graphics Cards, Processors | August 30, 2015 - 09:14 PM |
Tagged: amd, carrizo, Fiji, opencl, opencl 2.0

Apart from manufacturers with a heavy first-party focus, such as Apple and Nintendo, hardware is useless without developer support. In this case, AMD has updated their App SDK to include support for OpenCL 2.0, with code samples. It also updates the SDK for Windows 10, Carrizo, and Fiji, but it is not entirely clear how.


That said, OpenCL is important to those two products. Fiji has a very high compute throughput compared to any other GPU at the moment, and its memory bandwidth is often even more important for GPGPU workloads. It is also useful for Carrizo, because parallel compute and HSA features are what make it a unique product. AMD has been creating first-party software and helping popular third-party developers such as Adobe, but a little support for the world at large could bring a killer application or two, especially from the open-source community.

The SDK has been available in pre-release form for quite some time now, but it has finally graduated out of beta. OpenCL 2.0 allows work to be generated on the GPU itself, which is especially useful for tasks whose next steps depend on previous results, because the device can continue without contacting the CPU again.
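A minimal sketch of what that device-side enqueue looks like in OpenCL 2.0 kernel code. The kernel and buffer names here are hypothetical, and running it requires a driver and device that support OpenCL 2.0:

```c
// Hypothetical parent kernel: refine only the elements a previous
// pass flagged, launching follow-up work directly from the GPU.
kernel void refine(global float *data, global const int *needs_refine, int n)
{
    int i = get_global_id(0);
    if (i >= n || !needs_refine[i])
        return;

    // enqueue_kernel() is the OpenCL 2.0 device-side enqueue: the child
    // work item is queued by the GPU itself, with no host round-trip
    // to inspect results and decide whether a second pass is needed.
    enqueue_kernel(get_default_queue(),
                   CLK_ENQUEUE_FLAGS_NO_WAIT,
                   ndrange_1D(1),
                   ^{ data[i] = data[i] * 0.5f; });
}
```

`enqueue_kernel`, `get_default_queue`, and `ndrange_1D` are part of the OpenCL 2.0 kernel language; under OpenCL 1.x, the host CPU would have to read back the flags and launch the second pass itself.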

Source: AMD

This is your Intel HD530 GPU on Linux

Subject: Processors | August 26, 2015 - 02:40 PM |
Tagged: Skylake, Intel, linux, Godavari

Using the GPU embedded in the vast majority of modern processors is a good way to reduce the price of an entry-level system, as indeed is choosing Linux for your OS.  Your performance is not going to match that of a system with a discrete GPU, but with the newer GPU cores available you will be doing much better than in the old days of the IGP.  The first portion of Phoronix's review of the Skylake GPU covers the various driver versions you can choose from, while the rest compares Kaveri, Godavari, Haswell, and Broadwell to the new HD 530 on Skylake CPUs.  Currently the Iris Pro 6200 present on Broadwell is still the best for gaming, though the A10-7870K Godavari's performance is also decent.  Consider one of those two chips now, or await Iris Pro's possible arrival on a newer socketed processor if you are in no hurry.


"Intel's Core i5 6600K and i7 6700K processors released earlier this month feature HD Graphics 530 as the first Skylake graphics processor. Given that Intel's Open-Source Technology Center has been working on open-source Linux graphics driver support for over a year for Skylake, I've been quite excited to see how the Linux performance compares for Haswell and Broadwell as well as AMD's APUs on Linux."

Source: Phoronix

Qualcomm Introduces Adreno 5xx Architecture for Snapdragon 820

Subject: Graphics Cards, Processors, Mobile | August 12, 2015 - 07:30 AM |
Tagged: snapdragon 820, snapdragon, siggraph 2015, Siggraph, qualcomm, adreno 530, adreno

Despite the success of the Snapdragon 805 and even the 808, Qualcomm’s flagship Snapdragon 810 SoC had a tumultuous lifespan.  Rumors and stories about the chip's inability to run in phone form factors without overheating and/or draining battery life were rampant, despite the company’s insistence that the problem was fixed with a very quick second revision of the part. Very few devices used the 810; instead, we saw more of the flagship smartphones use the slightly cut-back SD 808 or the SD 805.

Today at Siggraph, Qualcomm begins the reveal of its new flagship SoC, Snapdragon 820. As the event coinciding with the launch is a graphics-specific show, Qualcomm is focusing on a high-level overview of the graphics portion of the Snapdragon 820: the updated Adreno 5xx architecture and associated designs, and a new camera image signal processor (ISP) aiming to improve the quality of photos and recordings on our mobile devices.


A modern SoC from Qualcomm features many different processors working in tandem to shape the user experience on the device. While the only details we are getting today focus on the Adreno 530 GPU and Spectra ISP, other segments like wireless connectivity, video processing, and digital signal processing are important parts of the computing story. And we are well aware that Qualcomm is readying its own 64-bit processor architecture for the Kryo CPU rather than implementing the off-the-shelf ARM cores used in the 810.

We also know that Qualcomm is targeting a “leading edge” FinFET process technology for SD 820, and though we haven’t been able to confirm anything, it looks very likely that this chip will be built on the same Samsung 14 nm line that built the Exynos 7420.

But over half of the processing on the upcoming Snapdragon 820 will focus on visual tasks; from graphics to gaming to UI animations to image capture and video output, this chip’s die will be dominated by high-performance visuals.

Qualcomm’s list of target goals for SD 820 visuals reads as you would expect: wanting perfection in every area. Wouldn’t we all love a phone or tablet that takes perfect photos every time, always focusing on the right things (or everything) with exceptional low-light performance? Though a lesser-known problem for consumers, having accurate color reproduction from capture, through processing, and on to the display would be a big advantage. And of course, we all want graphics performance that impresses and a user interface that is smooth and reliable while enabling new experiences that we haven’t even thought of in the mobile form factor. Qualcomm thinks that Snapdragon 820 will be able to deliver on all of that.

Continue reading about the new Adreno 5xx architecture!!

Source: Qualcomm

We hear you like Skylake-U news

Subject: Processors | August 11, 2015 - 06:39 PM |
Tagged: skylake-u, Intel

Fanless Tech just posted slides of Skylake-U, the ultraportable version of Skylake, all of which have an impressively low TDP of 15 W that can be reduced to 10 W or, in some cases, all the way down to 7.5 W.  As with previous generations, all are BGA-mounted, which means you will not be able to upgrade, nor are you likely to see them in desktops -- not necessarily a bad thing for this segment of the mobile market, but certainly worth noting.


There will be two i7 models and two i5 models along with a single i3 version; the top models, the Core i7-6600U and Core i5-6300U, sport slightly increased frequencies and support for vPro.  Those two models, along with the i7-6500U and i5-6200U, will have Intel HD Graphics 520 with frequencies of 300/1050 MHz for the i7s and 300/1000 MHz for the i5 and i3 chips.


Along with the Core models will come a single Pentium chip, the 4405U, and a pair of Celerons, the 3955U and 3855U.  They will have HD 510 graphics, clocks of 300/950 MHz (or 300/900 MHz for the Celerons), and you will see slight reductions in the PCIe and storage subsystems on the 4405U and 3855U.  The naming scheme is less confusing than some previous generations, a boon for those with family or friends looking for a new laptop who are perhaps not quite as obsessed with processors as we are.



Source: Fanless Tech