Intel AVX-512 Expanded

Subject: General Tech, Graphics Cards, Processors | July 19, 2014 - 12:05 AM |
Tagged: Xeon Phi, xeon, Intel, avx-512, avx

It is difficult to know what is actually new information in this Intel blog post, but it is interesting nonetheless. Its topic is the AVX-512 extension to x86, designed for Xeon and Xeon Phi processors and co-processors. Basically, last year, Intel announced "Foundation", the minimum support level for AVX-512, as well as Conflict Detection, Exponential and Reciprocal, and Prefetch, which are optional. That earlier blog post was very much focused on Xeon Phi, but it acknowledged that the instructions will make their way to standard, CPU-like Xeons at around the same time.

Intel_Xeon_Phi_Family.jpg

This year's blog post brings in a bit more information, especially for common Xeons. While all AVX-512-supporting processors (and co-processors) will support "AVX-512 Foundation", the instruction set extensions are a bit more scattered.

 
                                          Xeon         Xeon Phi     Xeon Phi
                                          Processors   Processors   Coprocessors (AIBs)
Foundation Instructions                   Yes          Yes          Yes
Conflict Detection Instructions           Yes          Yes          Yes
Exponential and Reciprocal Instructions   No           Yes          Yes
Prefetch Instructions                     No           Yes          Yes
Byte and Word Instructions                Yes          No           No
Doubleword and Quadword Instructions      Yes          No           No
Vector Length Extensions                  Yes          No           No

Source: Intel AVX-512 Blog Post (and my understanding thereof).

So why do we care? Simply put: speed. Vectorization, the purpose of AVX-512, has similar benefits to multiple cores. It is not as flexible as having multiple, unique, independent cores, but it is easier to implement (and works just fine with having multiple cores, too). For an example: imagine that you have to multiply two colors together. The direct way to do it is multiply red with red, green with green, blue with blue, and alpha with alpha. AMD's 3DNow! and, later, Intel's SSE included instructions to multiply two, four-component vectors together. This reduces four similar instructions into a single operation on wider registers.

Smart compilers (and programmers, although that is becoming less common as compilers are pretty good, especially when they are not fighting developers) are able to pack seemingly unrelated data together, too, if they undergo similar instructions. AVX-512 allows for sixteen 32-bit pieces of data to be worked on at the same time. If your pixel only has four, single-precision RGBA data values, but you are looping through 2 million pixels, do four pixels at a time (16 components).
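To make the packing idea concrete, here is a minimal Python sketch. It is not real AVX-512 code, and the names `multiply_lanes` and `multiply_pixels` are made up for illustration. It models a 512-bit register as sixteen 32-bit lanes and multiplies two flattened RGBA buffers four pixels (sixteen components) at a time.

```python
LANES = 16  # sixteen 32-bit lanes fit in one 512-bit register


def multiply_lanes(a, b):
    """Stand-in for one SIMD multiply: all lanes are handled in one step."""
    return [x * y for x, y in zip(a, b)]


def multiply_pixels(src, tint):
    """Multiply two flat RGBA buffers, sixteen components per 'instruction'."""
    if len(src) != len(tint) or len(src) % LANES != 0:
        raise ValueError("buffers must match and be a multiple of 16 long")
    out = []
    instructions = 0
    for i in range(0, len(src), LANES):
        out.extend(multiply_lanes(src[i:i + LANES], tint[i:i + LANES]))
        instructions += 1
    return out, instructions


pixels = [0.5, 0.25, 1.0, 1.0] * 8  # eight RGBA pixels, flattened
tint = [1.0, 0.5, 0.5, 1.0] * 8
result, simd_ops = multiply_pixels(pixels, tint)
scalar_ops = len(pixels)  # one multiply per component without SIMD
```

By the same counting, the 2 million pixel loop above is 8 million component multiplies, which packs into 500,000 sixteen-wide operations.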

For the record, I basically just described "SIMD" (single instruction, multiple data) as a whole.

This theory is part of how GPUs became so powerful at certain tasks. They are capable of pushing a lot of data because they can exploit similarities. If your task is full of similar problems, they can just churn through tonnes of data. CPUs have been doing these tricks, too, just without compromising what they do well.

Source: Intel

Intel Earnings Report (Q2 2014)

Subject: General Tech, Processors, Mobile | July 16, 2014 - 12:37 AM |
Tagged: quarterly results, quarterly earnings, quarterly, Intel, earnings

Another fiscal quarter brings another Intel earnings report. Once again, the company is doing well as a whole but is struggling to gain a foothold in mobile. In three months, it sold 8.7 billion dollars in PC hardware, of which 3.7 billion was profit. Its mobile division, on the other hand, brought in 51 million USD in revenue and lost 1.1 billion dollars for its efforts. In all, the company is profitable -- by about 3.84 billion USD.

Intel-Swimming-in-Money.jpg

One interesting metric which Intel adds to their chart, and I have yet to notice another company listing this information so prominently, is their number of employees, compared between quarters. Last year, Intel employed about 106,000 people, which increased to 106,300 two quarters ago. Between two quarters ago and this last quarter, that number dropped by 1400, to 104,900 employees, which was about 1.3% of their total workforce. There does not seem to be a stated reason for this decline (except for Richard Huddy; we know that he went to AMD).

Intel Process nodes_575px.png

Image Credit: Anandtech

As a final note, Anandtech, when reporting on this story, added a few historical trends near the end. One which caught my attention was the process technology vs. quarter graph, demonstrating their smallest transistor size over the last thirteen-and-a-bit years. We are still slowly approaching 0nm, following an exponential curve as it approaches its asymptote. The width of each step, however, is still fairly regular. It looks like it is getting slightly longer, but not drastically (minus the optical illusion caused by the smaller drops).

Source: Intel

The Third x86-based SoC Player: VIA & Centaur's Isaiah II

Subject: General Tech, Processors, Mobile | July 11, 2014 - 01:58 PM |
Tagged: x86, VIA, isaiah II, Intel, centaur, arm, amd

There might be a third, x86-compatible processor manufacturer who is looking at the mobile market. Intel has been trying to make headway, including the direct development of Android for the x86 architecture. The company also has a few design wins, mostly with Windows 8.1-based tablets but also the occasional Android-based model. Google is rumored to be preparing the "Nexus 8" tablet with one of Intel's Moorefield SoCs. AMD, the second-largest x86 processor manufacturer, is aiming their Mullins platform at tablets and two-in-ones, but cannot afford to play snowplow, at least not like Intel.

via-centaur-countdown.jpg

VIA, through their Centaur Technology division, is expected to announce their own x86-based SoC, too. Called Isaiah II, it is rumored to be a quad core, 64-bit processor with a maximum clock rate of 2.0 GHz. Its GPU is currently unknown. VIA sold their stake in S3 Graphics to HTC back in 2011, who then became majority shareholder over the GPU company. That said, HTC and VIA are very close companies. The chairwoman of HTC is the founder of VIA Technologies. The current President and CEO of VIA, who has been in that position since 1992, is her husband. I expect that the GPU architecture will be provided by S3, or will somehow be based on their technology. I could be wrong. Both companies will obviously do what they think is best.

It would make sense, though, especially if it benefits HTC with cheap but effective SoCs for Android and "full" Windows (not Windows RT) devices.

Or this announcement could be larger than it would appear. Three years ago, VIA filed for a patent which described a processor that can read both x86 and ARM machine language and translate it into its own, internal microinstructions. The Centaur Isaiah II could reasonably be based on that technology. If so, this processor would be able to support either version of Android. Or, after Intel built up the Android x86 code base, maybe they shelved that initiative (or just got that patent for legal reasons).

Android-x86.png

But what about Intel? Honestly, I see this being a benefit for the behemoth. Extra x86-based vendors will probably grow the overall market share, compared to ARM, by helping with software support. Even if it is compatible with both ARM and x86, what Intel needs right now is software. They can only write so much of it themselves. It is possible that VIA, being the original netbook processor vendor, could disrupt the PC market with both x86 and ARM compatibility, but I doubt it.

Centaur Technology, the relevant division of VIA, will make their announcement in less than 51 days.

Source: 3d Center

Fully Enabling the A10-7850K while Utilizing a Standalone GPU

Subject: Processors | July 9, 2014 - 02:42 PM |
Tagged: nvidia, msi, Luxmark, Lightning, hsa, GTX 580, GCN, APU, amd, A88X, A10-7850K

When I first read many of the initial AMD A10 7850K reviews, my primary question was how the APU would act if a different GPU was installed in the system without using the CrossFire X functionality that AMD talked about.  Typically when a user installs a standalone graphics card on the AMD FM2/FM2+ platform, they disable the graphics portion of the APU.  They also have to uninstall the AMD Catalyst driver suite.  So this then leaves the APU as a CPU only, and all of that graphics silicon is left silent and dark.

apu_first.jpg

Who in their right mind would pair a high end graphics card with the A10-7850K? This guy!

Does this need to be the case?  Absolutely not!  The GCN based graphics unit on the latest Kaveri APUs is pretty powerful when used in GPGPU/OpenCL applications.  The 4 cores/2 modules and 8 GCN compute units can push out around 856 GFlops when fully utilized.  We also must consider that the APU is the first fully compliant HSA (Heterogeneous System Architecture) chip, and it handles memory accesses much more efficiently than standalone GPUs.  The shared memory space with the CPU gets rid of a lot of the workarounds typically needed for GPGPU type applications.  It makes sense that users would want to leverage the performance potential of a fully functioning APU while upgrading their overall graphics performance with a higher end standalone GPU.
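For the curious, the 856 GFlops figure can be reproduced with some back-of-the-envelope arithmetic. The sketch below is my own reconstruction, not an official AMD breakdown. The shader count per GCN unit, the 720 MHz GPU clock, the 3.7 GHz CPU clock, and the per-cycle throughput numbers are all assumptions that are not stated in this article.

```python
# Figures taken from the article
CPU_CORES = 4
GCN_UNITS = 8

# Assumptions (not stated in the article)
SHADERS_PER_GCN_UNIT = 64        # assumed shader count per GCN unit
GPU_CLOCK_GHZ = 0.72             # assumed GPU clock
FLOPS_PER_SHADER_PER_CYCLE = 2   # one fused multiply-add counted as 2 FLOPs
CPU_CLOCK_GHZ = 3.7              # assumed CPU base clock
FLOPS_PER_CORE_PER_CYCLE = 8     # assumed single-precision throughput per core

gpu_gflops = (GCN_UNITS * SHADERS_PER_GCN_UNIT
              * FLOPS_PER_SHADER_PER_CYCLE * GPU_CLOCK_GHZ)
cpu_gflops = CPU_CORES * FLOPS_PER_CORE_PER_CYCLE * CPU_CLOCK_GHZ
total_gflops = gpu_gflops + cpu_gflops

print(f"GPU: {gpu_gflops:.1f} GFLOPS, CPU: {cpu_gflops:.1f} GFLOPS, "
      f"total: {total_gflops:.1f} GFLOPS")
```

Under these assumptions the graphics portion accounts for the large majority of the theoretical throughput, which is why leaving it dark is such a waste.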

Getting this to work is very simple.  Assuming that the user has been using the APU as their primary graphics controller, they should update to the latest Catalyst drivers.  If the user is going to use an AMD card, then it would behoove them to totally uninstall the Catalyst driver and re-install only after the new card is installed.  After this is completed, restart the machine, go into the UEFI, and change the primary video boot device to PEG (PCI-Express Graphics) from the integrated unit.  Save the setting and shut down the machine.  Insert the new video card and attach the monitor cable(s) to it.  Boot the machine and either re-install the Catalyst suite if an AMD card is used, or install the latest NVIDIA drivers if that is the graphics choice.

Windows 7 and Windows 8 allow users to install multiple graphics drivers from different vendors.  In my case I utilized a last generation GTX 580 (the MSI N580GTX Lightning) along with the AMD A10 7850K.  These products coexist happily together on the MSI A88X-G45 Gaming motherboard.  The monitor is attached to the NVIDIA card and all games are routed through that since it is the primary graphics adapter.  Performance seems unaffected with both drivers active.

luxmark_setup.PNG

I find it interesting that the GPU portion of the APU is named "Spectre".  Who owns those 3dfx trademarks anymore?

When I load up Luxmark I see three entries: the APU (CPU and GPU portions), the GPU portion of the APU, and then the GTX 580.  Luxmark defaults to the GPUs.  We see these GPUs listed as “Spectre”, which is the GCN portion of the APU, and the NVIDIA GTX 580.  Spectre supports OpenCL 1.2 while the GTX 580 is an OpenCL 1.1 compliant part.

With both GPUs active I can successfully run the Luxmark “Sala” test.  The two units perform better together than when they are run separately.  Adding in the CPU does increase the score, but not by very much (my guess here is that the APU is going to be very memory bandwidth bound in such a situation).  Below we can see the results of the different units separate and together.

luxmark_results_02.png

These results make me hopeful about the potential of AMD’s latest APU.  It can run side by side with a standalone card, and applications can leverage the performance of this unit.  Now all we need is more HSA aware software.  More time and more testing are needed for setups such as this, and we need to see if HSA enabled software really does see a boost from using the GPU portion of the APU as compared to a pure CPU piece of software or code that will run on the standalone GPU.

Personally I find the idea of a heterogeneous solution such as this appealing.  The standalone graphics card handles the actual graphics portions, the CPU handles that code, and the HSA software can then fully utilize the graphics portion of the APU in a very efficient manner.  Unfortunately, we do not have hard numbers on the handful of HSA aware applications out there, especially when used in conjunction with standalone graphics.  We know in theory that this can work (and should work), but until developers get out there and really optimize their code for such a solution, we simply do not know if having an APU will really net the user big gains as compared to something like the i7 4770 or 4790 running pure x86 code.

full_APU_GPU.PNG

In the meantime, at least we know that these products work together without issue.  The mixed mode OpenCL results make a nice case for improving overall performance in such a system.  I would imagine with more time and more effort from developers, we could see some really interesting implementations that will fully utilize a system such as this one.  Until then, happy experimenting!

Source: AMD

Celeron II: The Second Coming

Subject: Processors | July 8, 2014 - 04:23 PM |
Tagged: intel atom, Pentium G3258, overclocking

Technically it is an Anniversary Edition Pentium processor, but it reminds those of us who have been in the game a long time of the old Celeron D's which cost very little and overclocked like mad!  The Pentium G3258 is well under $100 but the stock speed of 3.2GHz is only a recommendation as this processor is just begging to be overclocked.  The Tech Report coaxed it up to 4.8GHz on air cooling, 100MHz higher than the i7-4790K they tested.  A processor that costs about 20% of the price of the 4790K and can almost meet its performance in Crysis 3, without resorting to even high end watercooling, should make any gamer on a budget sit up and take notice.  Sure, you lose the extra cores and other features of the flagship processor, but if you are primarily a gamer these are not your focus; you simply want the fastest processor you can get for a reasonable amount of money.  Stay tuned for more information about the Anniversary Edition Pentium as there are more benchmarks to be run!
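As a quick sanity check on those numbers, here is a small Python sketch of the arithmetic. The $72 list price is the one The Tech Report quotes; the $339 figure for the Core i7-4790K is the launch price reported for that chip and is my assumption for the comparison, since this post does not state it.

```python
STOCK_GHZ = 3.2
OVERCLOCK_GHZ = 4.8
PENTIUM_LIST_USD = 72    # list price quoted by The Tech Report
I7_4790K_LIST_USD = 339  # assumption: reported launch price of the 4790K

# The 4790K they tested topped out 100 MHz lower than the Pentium.
i7_overclock_ghz = OVERCLOCK_GHZ - 0.1

overclock_pct = (OVERCLOCK_GHZ / STOCK_GHZ - 1) * 100
price_ratio_pct = PENTIUM_LIST_USD / I7_4790K_LIST_USD * 100

print(f"Overclock over stock: {overclock_pct:.0f}%")
print(f"Pentium price as a share of the 4790K: {price_ratio_pct:.0f}%")
```

That works out to roughly a 50% overclock on a chip that costs about a fifth as much, which lines up with the "about 20%" claim.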

test-rig.jpg

"This new Pentium is an unlocked dual-core CPU based on the latest 22-nm Haswell silicon. I ran out and picked one up as soon as they went on sale last week. The list price is only 72 bucks, but Micro Center had them on sale for $60. In other words, you can get a processor that will quite possibly run at clock speeds north of 4GHz—with all the per-clock throughput of Intel's very latest CPU core—for the price of a new Call of Shooty game.

Also, ours overclocks like a Swiss watchmaker on meth."

Intel's Knights Landing (Xeon Phi, 2015) Details

Subject: General Tech, Graphics Cards, Processors | July 2, 2014 - 12:55 AM |
Tagged: Intel, Xeon Phi, xeon, silvermont, 14nm

Anandtech has just published a large editorial detailing Intel's Knights Landing. Mostly, it is stuff that we already knew from previous announcements and leaks, such as one by VR-Zone from last November (which we reported on). Officially, few details were given back then, except that it would be available as either a PCIe-based add-in board or as a socketed, bootable, x86-compatible processor based on the Silvermont architecture. Its many cores, threads, and 512-bit registers are each pretty weak, compared to Haswell, for instance, but combine to about 3 TFLOPs of double precision performance.

itsbeautiful.png

Not enough graphs. Could use another 256...

The best way to imagine it is running a PC with a modern, Silvermont-based Atom processor -- only with up to 288 processors listed in your Task Manager (72 actual cores with quad HyperThreading).
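Here is a rough Python sketch of how those headline numbers fit together. The core count, thread count, register width, and 3 TFLOPs target come from the text above. The two vector units per core and the "one fused multiply-add counts as two FLOPs" convention are my assumptions, so treat the implied clock speed as an estimate, not a specification.

```python
CORES = 72
THREADS_PER_CORE = 4
REGISTER_BITS = 512
DOUBLE_BITS = 64
TARGET_TFLOPS = 3.0

# Assumptions (not confirmed in this post)
VECTOR_UNITS_PER_CORE = 2
FLOPS_PER_FMA = 2

logical_processors = CORES * THREADS_PER_CORE
dp_lanes = REGISTER_BITS // DOUBLE_BITS
flops_per_cycle = CORES * VECTOR_UNITS_PER_CORE * dp_lanes * FLOPS_PER_FMA

# Clock speed the chip would need to hit the quoted double precision figure.
implied_clock_ghz = TARGET_TFLOPS * 1e12 / flops_per_cycle / 1e9

print(f"{logical_processors} logical processors, {dp_lanes} doubles per register")
print(f"{flops_per_cycle} FLOPs per cycle, implied clock ~{implied_clock_ghz:.2f} GHz")
```

Under those assumptions, each core only needs to run at roughly 1.3 GHz, which fits the "many weak cores" picture.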

The main limitation of GPUs (and similar coprocessors), however, is memory bandwidth. GDDR5 is often the main bottleneck of compute performance and just about the first thing to be optimized. To compensate, Intel is packaging up to 16GB of memory (stacked DRAM) on the chip itself. This RAM is based on "Hybrid Memory Cube" (HMC), developed by Micron Technology, and supported by the Hybrid Memory Cube Consortium (HMCC). While the actual memory used in Knights Landing is derived from HMC, it uses a proprietary interface that is customized for Knights Landing. Its bandwidth is rated at around 500GB/s. For comparison, the NVIDIA GeForce Titan Black has 336.4GB/s of memory bandwidth.
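To put those bandwidth figures in perspective, this small Python sketch compares the two and works out how many bytes of memory traffic are available per floating-point operation. It uses only numbers quoted in this post (500GB/s, 336.4GB/s, and about 3 TFLOPs of double precision); the "bytes per FLOP" framing is my own addition.

```python
KNL_BANDWIDTH_GBS = 500.0
TITAN_BLACK_BANDWIDTH_GBS = 336.4
KNL_DP_TFLOPS = 3.0

bandwidth_ratio = KNL_BANDWIDTH_GBS / TITAN_BLACK_BANDWIDTH_GBS
bytes_per_flop = (KNL_BANDWIDTH_GBS * 1e9) / (KNL_DP_TFLOPS * 1e12)

print(f"Knights Landing has {bandwidth_ratio:.2f}x the Titan Black's bandwidth")
print(f"That is about {bytes_per_flop:.3f} bytes of memory traffic per FLOP")
```

Even at 500GB/s, there is only a fraction of a byte available per operation, so code that does little math per value loaded will still be bandwidth bound.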

Intel and Micron have worked together in the past. In 2006, the two companies formed "IM Flash" to produce the NAND flash for Intel and Crucial SSDs. Crucial is Micron's consumer-facing brand.

intel-knights-landing.jpg

So the vision for Knights Landing seems to be the bridge between CPU-like architectures and GPU-like ones. For compute tasks, GPUs edge out CPUs by crunching through bundles of similar tasks at the same time, across many (hundreds of, thousands of) computing units. The difference with (at least socketed) Xeon Phi processors is that, unlike most GPUs, Intel does not rely upon APIs, such as OpenCL, and drivers to translate a handful of functions into bundles of GPU-specific machine language. Instead, especially if the Xeon Phi is your system's main processor, it will run standard, x86-based software. The software will just run slowly, unless it is capable of vectorizing itself and splitting across multiple threads. Obviously, OpenCL (and other APIs) would make this parallelization easy, by their host/kernel design, but it is apparently not required.
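As a minimal illustration of "splitting across multiple threads", here is a Python sketch that divides one job into chunks and hands them to a thread pool. The function names are hypothetical, and Python is only used to show the decomposition: CPython threads do not actually speed up CPU-bound work like this, so on real hardware you would do the same split in a compiled language or with processes.

```python
from concurrent.futures import ThreadPoolExecutor


def sum_of_squares(chunk):
    """The per-thread work: the same operation applied to every element."""
    return sum(x * x for x in chunk)


def parallel_sum_of_squares(values, workers=4):
    """Split the input into one chunk per worker and combine partial sums."""
    chunk_size = max(1, (len(values) + workers - 1) // workers)
    chunks = [values[i:i + chunk_size]
              for i in range(0, len(values), chunk_size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(sum_of_squares, chunks))


data = list(range(1000))
serial = sum_of_squares(data)
parallel = parallel_sum_of_squares(data, workers=4)
print(serial == parallel)
```

The catch is that the work has to be independent per chunk, which is exactly the property that a lot of everyday software does not have.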

It is a cool way that Intel arrives at the same goal, based on their background. Especially when you mix-and-match Xeons and Xeon Phis on the same computer, it is a push toward heterogeneous computing -- with a lot of specialized threads backing up a handful of strong ones. I just wonder if providing a more-direct method of programming will really help developers finally adopt massively parallel coding practices.

I mean, without even considering GPU compute, how efficient is most software at splitting into even two threads? Four threads? Eight threads? Can this help drive heterogeneous development? Or will this product simply try to appeal to those who are already considering it?

Source: Intel

The renewed FX-9590, still up to 5GHz

Subject: Processors | June 23, 2014 - 01:05 PM |
Tagged: amd, fx 9590, vishera

Hardware Canucks have just let out AMD's secret on a new take on a Vishera processor: the FX-9590 will now come with a Cooler Master Seidon 120 AIO LCS, which will add $40 to the original $320 price tag.  The base clock of the 8 cores will still be 4.7GHz with a 5GHz boost, but with a TDP of 219W the watercooler should allow the boost clock to be maintained longer.  If you ever planned on overclocking the FX-9590 but never picked it up because of the challenge of cooling it, then here is your chance.

FX-9590-1234.jpg

"It all started with a tweet. AMD teased an unnamed new FX-series chip on Twitter and we've got the inside track. It's a refreshed 5GHz FX-9590 with an included water cooling unit."

Qualcomm Focuses on Android Gaming, Snapdragon Benefits to Gamers, Developers

Subject: Processors, Mobile | June 23, 2014 - 10:08 AM |
Tagged: snapdragon, qualcomm, gaming, Android, adreno

Today Qualcomm has published a 22-page white paper that keys in on the company's focus around Android gaming and the benefits that Qualcomm SoCs offer. As the dominant SoC vendor in the Android ecosystem of smartphones, tablets and handhelds (shipping more than 32% in Q2 of 2013), Qualcomm is able to offer a unique combination of solutions to both developers and gamers that push Android gaming into higher fidelity with more robust game play.

According to the white paper, Android gaming is the fastest growing segment of the gaming market with a 30% compound annual growth rate from 2013 to 2015, as projected by Gartner. Experiences for mobile games have drastically improved since Android was released in 2008 with developers like Epic Games and the Unreal Engine pushing visuals to near-console and near-PC qualities. 
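For a sense of scale, a 30% compound annual growth rate adds up quickly. The short Python sketch below assumes that "2013 to 2015" means two years of compounding, which is my reading and not something the white paper spells out here.

```python
CAGR = 0.30
YEARS = 2  # assumption: 2013 -> 2014 -> 2015

growth_factor = (1 + CAGR) ** YEARS
total_growth_pct = (growth_factor - 1) * 100

print(f"Market size multiplier: {growth_factor:.2f}x "
      f"({total_growth_pct:.0f}% larger)")
```

In other words, under that reading the segment would be about 69% larger in 2015 than in 2013.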

qcgaming1.jpg

Qualcomm is taking a heterogeneous approach to address the requirements of gaming that include AI execution, physics simulation, animation, low latency input and high speed network connectivity in addition to high quality graphics and 3D rendering. Though not directly a part of the HSA standards still in development, the many specialized engines that Qualcomm has developed for its Snapdragon SoC processors including traditional CPUs, GPUs, DSPs, security and connectivity allow the company to create a solution that is built for Android gaming dominance.

qcgaming2.jpg

In the white paper Qualcomm dives into the advantages that the Krait CPU architecture offers for CPU-based tasks as well as the power of the Adreno 4x series of GPUs that offer both raw performance and the flexibility to support current and future gaming APIs. All of this is done with single-digit wattage draw and a passive, fanless design and points to the huge undertaking that mobile gaming requires from an engineering and implementation perspective.

qcgaming3.jpg

For developers, the ability to target Snapdragon architectures with a single code path that can address a scalable product stack allows for the least amount of development time and the most return on investment possible. Qualcomm continues to support the development community with tools and assistance to bring out the peak performance of Krait and Adreno to get games running on lower power parts as well as the latest and upcoming generations of SoCs in flagship devices.

It is great to see Qualcomm focus on this aspect of the mobile market and the challenges presented by it require strong dedication from these engineering teams. Being able to create compelling gaming experiences with high quality imagery while maintaining the required power envelope is a task that many other companies have struggled with.

Check out the new landing page over at Qualcomm if you are interested in more technical information as well as direct access to the white paper detailing the work Qualcomm is putting into its Snapdragon line of SoC for gamers.

Source: Qualcomm

AMD Restructures. Lisa Su Is Now COO.

Subject: Editorial, General Tech, Graphics Cards, Processors, Chipsets | June 13, 2014 - 03:45 PM |
Tagged: x86, restructure, gpu, arm, APU, amd

According to VR-Zone, AMD has reworked their business, last Thursday, sorting each of their projects into two divisions and moving some executives around. The company is now segmented into the "Enterprise, Embedded, and Semi-Custom Business Group", and the "Computing and Graphics Business Group". The company used to be divided between "Computing Solutions", which handled CPUs, APUs, chipsets, and so forth, "Graphics and Visual Solutions", which is best known for GPUs but also contains console royalties, and "All Other", which was... everything else.

amd-new2.png

Lisa Su, former general manager of global business, has moved up to Chief Operating Officer (COO), along with other changes.

This restructure makes sense for a couple of reasons. First, it pairs some unprofitable ventures with other, highly profitable ones. AMD's graphics division has been steadily adding profitability to the company while its CPU division has been mostly losing money. Secondly, "All Other" is about as nebulous as a name can get. Instead of having three unbalanced divisions, one of which makes no sense to someone glancing at AMD's quarterly earnings reports, they should now have two, roughly equal segments.

At the very least, it should look better to an uninformed investor. Someone who does not know the company might look at the sheet and assume that, if AMD divested from everything except graphics, the company would be profitable. If, you know, they did not know that console contracts came into their graphics division because their compute division had x86 APUs, and so forth. This setup is now more aligned to customers, not products.

Source: VR-Zone

Thinking of swapping Linux for Windows on your new Bay Trail NUC?

Subject: Processors | June 5, 2014 - 03:32 PM |
Tagged: baytrail, linux, N2820, ubuntu 14.04, Linux 3.13, Linux 3.15, mesa, nuc

It would seem that installing Linux on your brand new Bay Trail powered NUC will cost you a bit of performance.  The testing Phoronix has performed on the Intel NUC DN2820FYKH shows that it can run Linux without a hitch; however, you will find that your overall graphical performance will dip a bit.  Using Mesa 10.3 and both the current 3.13 kernel and the 3.15 development kernel, Phoronix saw a small delta in performance between Ubuntu 14.04 and Win 8.1 ... until they hit the OpenGL performance.  As there is still no full OpenGL 4.0+ support, there were tests that could not be run, and even with the tests that could be, there was a very large performance gap.  Do not let this worry you; as they point out in the article, there is a dedicated team working on full compliance and you can expect updated results in the near future.

image.php_.jpg

"A few days ago my benchmarking revealed Windows 8.1 is outperforming Ubuntu Linux with the latest Intel open-source graphics drivers on Haswell hardware. I have since conducted tests on the Celeron N2820 NUC, and sadly, the better OpenGL performance is found with Microsoft's operating system."

Source: Phoronix

Computex 2014: Cavium Introduces 48 Core ThunderX ARM Processors

Subject: Processors, Mobile | June 4, 2014 - 08:00 AM |
Tagged: computex, computex 2014, arm, cavium, thunderx

While much of the news coming from Computex was centered around PC hardware, many of ARM's partners are making waves as well. Take Cavium for example, introducing the ThunderX CN88XX family of processors. With a completely custom ARMv8 architectural core design, the ThunderX processors will range from 24 to 48 cores and are targeted at large volume servers and cloud infrastructure. 48 cores!

The ThunderX family will be the first SoC to scale up to 48 cores and, with a clock speed of 2.5 GHz and 16MB of L2 cache, should offer some truly impressive performance levels. Cavium claims to be the first socket-coherent ARM processor as well, using the Cavium Coherent Processor Interconnect. The I/O capacity stretches into the hundreds of Gigabits and quad channel DDR3 and DDR4 memory speeds up to 2.4 GHz keep the processors fed with work.
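To get a feel for how well "fed with work" those cores would be, here is a quick Python estimate of the peak memory bandwidth. It assumes that "2.4 GHz" means 2400 mega-transfers per second and that each of the four channels is a standard 64-bit (8 byte) DDR channel. Neither detail is confirmed in the announcement, so this is a theoretical ceiling under those assumptions.

```python
CORES = 48
MEMORY_CHANNELS = 4

# Assumptions (not stated in the announcement)
TRANSFERS_PER_SECOND_MILLIONS = 2400  # reading "2.4 GHz" as 2400 MT/s
BYTES_PER_TRANSFER = 8                # 64-bit wide DDR channel

peak_bandwidth_gbs = (MEMORY_CHANNELS * TRANSFERS_PER_SECOND_MILLIONS
                      * BYTES_PER_TRANSFER) / 1000
bandwidth_per_core_gbs = peak_bandwidth_gbs / CORES

print(f"Peak memory bandwidth: {peak_bandwidth_gbs:.1f} GB/s")
print(f"Per core with all 48 active: {bandwidth_per_core_gbs:.1f} GB/s")
```

That is not a lot of bandwidth per core, which fits with the web serving and storage style workloads that Cavium is targeting.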

thunderx.jpg

Source: Gigaom.com

Here is the breakdown on the ThunderX families.

ThunderX_CP: Up to 48 highly efficient cores along with integrated virtSOC, dual socket coherency, multiple 10/40 GbE and high memory bandwidth. This family is optimized for private and public cloud web servers, content delivery, web caching, search and social media workloads.

ThunderX_ST: Up to 48 highly efficient cores along with integrated virtSOC, multiple SATAv3 controllers, 10/40 GbE & PCIe Gen3 ports, high memory bandwidth, dual socket coherency, and scalable fabric for east-west as well as north-south traffic connectivity. This family includes hardware accelerators for data protection/integrity/security, user to user efficient data movement (RoCE) and compressed storage. This family is optimized for Hadoop, block & object storage, distributed file storage and hot/warm/cold storage type workloads.

ThunderX_SC: Up to 48 highly efficient cores along with integrated virtSOC, 10/40 GbE connectivity, multiple PCIe Gen3 ports, high memory bandwidth, dual socket coherency, and scalable fabric for east-west as well as north-south traffic connectivity. The hardware accelerators include Cavium’s industry leading, 4th generation NITROX and TurboDPI technology with acceleration for IPSec, SSL, Anti-virus, Anti-malware, firewall and DPI. This family is optimized for Secure Web front-end, security appliances and Cloud RAN type workloads.

ThunderX_NT: Up to 48 highly efficient cores along with integrated virtSOC, 10/40/100 GbE connectivity, multiple PCIe Gen3 ports, high memory bandwidth, dual socket coherency, and scalable fabric with feature rich capabilities for bandwidth provisioning, QoS, traffic shaping and tunnel termination. The hardware accelerators include high packet throughput processing, network virtualization and data monitoring. This family is optimized for media servers, scale-out embedded applications and NFV type workloads.

We spoke with ARM earlier this year about its push into the server market and it is partnerships like these that will begin the ramp up to widespread adoption of ARM-based server infrastructure. The ThunderX family will begin sampling in early Q4 2014 and production should be available by early 2015.

Richard Huddy Departs Intel, Rejoins AMD

Subject: Graphics Cards, Processors | June 3, 2014 - 11:10 AM |
Tagged: Intel, amd, richard huddy

Interesting news is crossing the ocean today as we learn that Richard Huddy, who has previously had stints at NVIDIA, ATI, AMD and most recently, Intel, is teaming up with AMD once again. Richard brings with him years of experience and innovation in the world of developer relations and graphics technology. By bringing back a man often called "the Godfather" of DirectX, AMD wants to prove to the community it is taking PC gaming seriously.

richardhuddy.jpg

The official statement from AMD follows:

AMD is proud to announce the return of the well-respected authority in gaming, Richard Huddy. After three years away from AMD, Richard returns as AMD's Gaming Scientist in the Office of the CTO - he'll be serving as a senior advisor to key technology executives, like Mark Papermaster, Raja Koduri and Joe Macri. AMD is extremely excited to have such an industry visionary back. Having spent his professional career with companies like NVIDIA, Intel and ATI, and having led the worldwide ISV engineering team for over six years at AMD, Mr. Huddy has a truly unique perspective on the PC and Gaming industries.

Mr. Huddy rejoins AMD after a brief stint at Intel, where he had a major impact on their graphics roadmap.  During his career Richard has made enormous contributions to the industry, including the development of DirectX and a wide range of visual effects technologies.  Mr. Huddy’s contributions in gaming have been so significant that he was immortalized as ‘The Scientist’ in Max Payne (if you’re a gamer, you’ll see the resemblance immediately). 

Kitguru has a video from Richard Huddy explaining his reasoning for the move back to AMD.

Source: Kitguru.net

This move points AMD in a very interesting direction going forward. The creation of the Mantle API and the debate around AMD's developer relations programs are going to be hot topics as we move into the summer and I am curious how quickly Huddy thinks he can have an impact.

I have it on good authority we will find out very soon.

Computex 2014: Intel Officially Releases Devil's Canyon, Core i7-4790K

Subject: Processors | June 2, 2014 - 11:30 PM |
Tagged: Intel, i7-4790k, devil's canyon, computex 2014, computex, 4790k

Back in March, we learned from Intel that they were planning to release a new Haswell refresh processor targeted at the overclocking and gaming market, code named Devil's Canyon. As we noted then, this new version of the existing processors will include new CPU packaging and the oft-requested improved thermal interface material (TIM). What wasn't known were the final clock speeds and availability timelines.

slides01.jpg

The new Core i7-4790K processor will ship with a 4.0 GHz base clock with a maximum Turbo clock rate of 4.4 GHz! That is a 500 MHz increase in base clock speed over the Core i7-4770K and should result in a substantial (~10-15%) performance increase. The processor still supports HyperThreading for a total of 8 threads and is fully unlocked for even more clock speed improvements.
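As a quick check on that estimate, here is the base clock arithmetic in Python. The 3.5 GHz figure for the Core i7-4770K is simply the 4.0 GHz base clock minus the 500 MHz increase quoted above. Real-world gains also depend on how long each chip holds its Turbo clocks, so this is an upper-end sketch and not a benchmark.

```python
I7_4790K_BASE_GHZ = 4.0
BASE_CLOCK_INCREASE_GHZ = 0.5

i7_4770k_base_ghz = I7_4790K_BASE_GHZ - BASE_CLOCK_INCREASE_GHZ
base_clock_uplift_pct = BASE_CLOCK_INCREASE_GHZ / i7_4770k_base_ghz * 100

print(f"Base clock uplift over the 4770K: {base_clock_uplift_pct:.1f}%")
```

A base clock uplift of a little over 14% sits at the top of the ~10-15% performance range, which is about what you would expect if nothing else about the chip changes.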

slides04.jpg

All of the other specifications remain the same - HD Graphics 4600, 8MB of L3 cache, 16 lanes of PCI Express, etc.

slides13.jpg

Intel spent some time on the Devil's Canyon Haswell processors to improve the packaging and thermals for overclockers and enthusiasts. The thermal interface material (TIM) that is between the top of the die and the heat spreader has been updated to a next-generation polymer TIM (NGPTIM). The change should improve cooling performance of all currently shipping cooling solutions (air or liquid) but it is still a question just HOW MUCH this change will actually matter. 

You can also tell from the photo comparison above that Intel has added capacitors to the back of the processor to "smooth" power delivery. This, combined with the NGPTIM, should enable a bit more headroom for clock speeds with the Core i7-4790K.

slides08.jpg

In fact, there are two Devil's Canyon processors being launched this month. The Core i7-4790K will sell for $339, the same price as the Core i7-4770K, while the Core i5-4690K will sell for $242. The lower end option is a 3.5 GHz base clock, 3.9 GHz Turbo clock quad-core CPU without HyperThreading. While a nice step over the Core i5-4670K, it's only 100 MHz faster. Clearly the Core i7-4790K is the part everyone is going to be scrambling to buy.
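For a rough sense of scale, the base-clock bumps quoted above can be turned into percentages. This is only a sketch: the older parts' base clocks are inferred from the deltas given in this post (500 MHz and 100 MHz), the function name is made up for illustration, and a clock speed increase does not translate one-to-one into benchmark results.

```python
def clock_uplift_percent(new_mhz, delta_mhz):
    """Percentage base-clock increase, given the new clock and the stated delta."""
    old_mhz = new_mhz - delta_mhz  # older part's base clock, inferred from the delta
    return 100.0 * delta_mhz / old_mhz


# Core i7-4790K: 4.0 GHz base, 500 MHz over the Core i7-4770K
i7_uplift = clock_uplift_percent(4000, 500)
# Core i5-4690K: 3.5 GHz base, 100 MHz over the Core i5-4670K
i5_uplift = clock_uplift_percent(3500, 100)

print(f"i7-4790K base clock uplift: {i7_uplift:.1f}%")
print(f"i5-4690K base clock uplift: {i5_uplift:.1f}%")
```

The i7 works out to a bit over 14%, which sits inside the ~10-15% estimate, while the i5 lands under 3%.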

slides07.jpg

Not to be left out, Intel is offering an unlocked Pentium processor for users on a tighter budget. This dual core CPU runs at 3.2 GHz base frequency and includes not just HD Graphics but support for QuickSync video. 

slides09.jpg

At just $72, the Pentium G3258 will likely be a great choice for gamers that lean towards builds like the one we made for the Titanfall release.

I was hoping to have a processor in hand to run benchmarks and overclocking testing on, but they haven't quite made it to the office yet. The 4.0 GHz clock speed is easily emulated by any 4770K and some BIOS tweaks but the additional overclocking headroom provided by the changed thermal interface is still in question. Honestly, based on conversations with motherboard vendors, Devil's Canyon headroom is only 100-200 MHz over the base Haswell parts, so don't expect to reach 6.0 GHz all of a sudden.

Later in the week we'll have the Core i7-4790K in hand and you can expect a full review shortly thereafter.

Intel Announces Partnership with Rockchip to Produce Low-Cost x86 Atom SoC

Subject: Processors | May 28, 2014 - 02:09 PM |
Tagged: tablet, SoC, Rockchip, mobile, Intel, atom, arm, Android

While details about upcoming Haswell-E processors were reportedly leaking out, an official announcement from Intel was made on Tuesday about another CPU product - and this one isn't a high-end desktop part. The chip giant is partnering with the fabless semiconductor manufacturer Rockchip to create a low-cost SoC for Android devices under the Intel name, reportedly fabricated at TSMC.

rockchip_logo.png

We saw almost exactly the opposite of this arrangement last October, when it was announced that Altera would be using Intel to fab ARMv8 chips. Try to digest this: Instead of Intel agreeing to manufacture another company's chip with ARM's architecture in their fabs, they are going through what is said to be China's #1 tablet SoC manufacturer to produce x86 chips...at TSMC? It's a small - no, a strange world we live in!

From Intel's press release: "Under the terms of the agreement, the two companies will deliver an Intel-branded mobile SoC platform. The quad-core platform will be based on an Intel® Atom™ processor core integrated with Intel's 3G modem technology."

As this upcoming x86 SoC is aimed at entry-level Android tablets this announcement might not seem to be exciting news at first glance, but it fills a short term need for Intel in their quest for market penetration in the ultramobile space dominated by ARM-based SoCs. The likes of Qualcomm, Apple, Samsung, TI, and others (including Rockchip's RK series) currently account for 90% of the market, all using ARM.

As previously noted, this partnership is very interesting from an industry standpoint, as Intel is sharing their Atom IP with Rockchip to make this happen. Though if you think back, the move isn't unprecedented... I recall something about a little company called Advanced Micro Devices that produced x86 chips for Intel in the past, and everything seemed to work out OK there...

atom.png

When might we expect these new products in the Intel chip lineup codenamed SoFIA? Intel states "the dual-core 3G version (is) expected to ship in the fourth quarter of this year, the quad-core 3G version...expected to ship in the first half of 2015, and the LTE version, also due in the first half of next year." And again, this SoC will only be available in low-cost Android tablets under this partnership (though we might speculate on, say, an x86 SoC powered Surface or Ultrabook in the future?).

Source: Intel

Rumor: New Intel Core i7 Haswell-E Processor Specs Allegedly Leaked

Subject: Processors | May 27, 2014 - 03:58 PM |
Tagged: X99, rumors, octocore, lga2011, Intel, Haswell-E, cpu

As with any high-profile release, there have been rumors circulating around Intel's upcoming high-end desktop processors for the X99 chipset, and a report today from Chinese site Coolaler claims to have the specs on these new Haswell-E CPUs.

intel_i7-5960x_5930k_5820k_sp.jpg

The alleged Haswell-E lineup

Of particular interest are the core counts, which appear to have been increased compared to the current Ivy Bridge-E products. The lineup will reportedly include a 6-core i7-5820K, 6-core i7-5930K, and 8-core i7-5960X. Yep, not only are we looking at an octo-core desktop part but now even the "entry-level" Extreme part might have 6 cores.

Nothing wrong with more cores (and this will be especially attractive if we see the same MSRPs as Ivy Bridge-E) but there might be one caveat with the i7-5820K, as the reported specs show fewer PCIe lanes on this CPU with 28, compared to the 40 lanes found on the higher Haswell-E parts (and all current Ivy Bridge-E parts).

Haswell-E would still provide more lanes than the current desktop i7 parts (an i7-4770K has only 16), but the disparity would create an interesting quandary for a potential adopter. Though x8 connections for multi-GPU setups are par for the course already on non-X79 desktop systems, the SATA Express and M.2 standards will put more of a premium on PCIe lane allocation for storage going forward.
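To make the lane quandary concrete, here is a minimal sketch of the budget math using the rumored 28- and 40-lane figures. The device widths are assumptions for illustration (two graphics cards plus a hypothetical x4 M.2 drive), and real platforms hand out lanes in fixed groupings decided by the CPU and motherboard, so simple subtraction is only a first approximation.

```python
def remaining_lanes(cpu_lanes, device_widths):
    """Lanes left after allocation; a negative result means the setup does not fit."""
    return cpu_lanes - sum(device_widths)


DUAL_GPU_X16 = [16, 16]         # two cards, both at full x16
DUAL_GPU_X16_X8 = [16, 8]       # second card dropped to x8
M2_X4 = 4                       # assumed x4 M.2 SSD

# Rumored lane counts from the leaked spec sheet
for name, lanes in (("i7-5820K", 28), ("i7-5930K / i7-5960X", 40)):
    full = remaining_lanes(lanes, DUAL_GPU_X16 + [M2_X4])
    split = remaining_lanes(lanes, DUAL_GPU_X16_X8 + [M2_X4])
    print(f"{name}: x16/x16 + M.2 leaves {full}, x16/x8 + M.2 leaves {split}")
```

Under these assumptions, the 28-lane part cannot run two cards at x16 at all, and an x16/x8 pair plus one x4 drive uses every lane, while the 40-lane parts still have room to spare.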

inte_core_i7-5690x_cpu-z.png

An alleged CPU-Z screenshot of an 8-core i7-5960X part

Of course, there is no official word from Intel on the matter yet, and only speculation on pricing. This is completely unsubstantiated, but is certainly of interest - particularly as hex-core i7s previously commanded the pricing of a more premium part in each prior iteration.

Source: Coolaler

Reuters: Intel CEO says Broadwell PCs for sale by holidays

Subject: Processors | May 19, 2014 - 08:13 AM |
Tagged: Intel, Broadwell, z97, krzanich

Apparently attending Maker Faire gets you more than a look at the latest hacked gadgets produced by the community. Reuters got to talk with Intel CEO Brian Krzanich who confirmed that the company's upcoming Broadwell architecture processors using the new 14nm process technology would be on store shelves in time for the holidays.

"I can guarantee for holiday, and not at the last second of holiday," Krzanich said in an interview. "Back to school - that's a tight one. Back to school you have to really have it on-shelf in July, August. That's going to be tough."

broadwell.jpg

Dissecting that comment we can assume that Broadwell will likely be made available in September or October of this year. This becomes the most precise word from the mouth of Intel about the release of these new parts but of course there wasn't much detail to be had. Though "computers" was mentioned he did not specify notebooks, all-in-ones or desktops. And more importantly for our readers, he did not specify anything about the socketed parts we have been promised would run on the newly released Intel Z97 chipset.

Source: Reuters

Xiaomi MiPad Tablet is Tegra K1 Powered

Subject: General Tech, Graphics Cards, Processors, Mobile | May 15, 2014 - 02:02 PM |
Tagged: nvidia, xaiomi, mipad, tegra k1

Tegra K1 is NVIDIA's new mobile processor and the first to implement the Kepler graphics architecture. In other words, it has all of the same graphics functionality as a desktop GPU with 364 GigaFLOPs of performance (a little faster than a GeForce 9600 GT). This is quite fast for a mobile product. For instance, that amount of graphics performance could max out Unreal Tournament 3 to 2560x1600 and run Crysis at 720p. Being Kepler, it supports OpenGL 4.4, OpenGL ES 3.1, DirectX 11 and 12, and GPU compute languages.

Xiaomi is launching their MiPad in Beijing, today, with an 8-inch 2048x1536 screen and the Tegra K1. They will be available in June (for China) starting at $240 USD for the 16GB version and going up to $270 for the 64GB version. Each version has 2GB of RAM, an 8MP rear-facing camera, and a 5MP front camera.

Now, we wait and see if any Tegra K1 devices come to North America and Europe - especially at that price point.

Source: NVIDIA

AMD Allegedly Preparing New Mobile Kaveri APUs Including the Flagship FX-7600P

Subject: General Tech, Processors | May 11, 2014 - 08:41 PM |
Tagged: ulv, mobile apu, laptop, Kaveri, APU, amd

According to leaked information, AMD will allegedly be releasing mobile versions of its Kaveri APU later this year. There are reportedly seven new processors aimed at laptops and tablets that follow the same basic design as their desktop counterparts: Steamroller CPU cores paired with a GCN-based graphics portion and an integrated memory controller.

According to information obtained by WCCF Tech, AMD will release four ULV and three standard voltage parts. All but one APU will have four Steamroller CPU cores paired with a Radeon R4, R5, R6, or R7 graphics processor with up to 512 GCN cores. The mobile APUs allegedly range in TDP from 17W to 35W and support various AMD technologies including TrueAudio, Mantle, and Eyefinity.

An AMD slide showing a die shot of the desktop "Kaveri" Accelerated Processing Unit (APU).

Of the seven rumored APUs, two of them are OEM-only parts that feature the “FX” moniker. The FX-7500 is the fastest ULV (ultra-low voltage) APU while the FX-7600P is AMD’s flagship mobile processor.

The FX-7600P is the chip that should most interest mobile gamers and enthusiasts looking for a powerful AMD-powered laptop or tablet. This processor allegedly features four CPU cores clocked at 2.7GHz base (that turbo to a maximum of 3.6GHz), a GPU with 512 GCN cores clocked at a base of 600MHz and a boost clock of 666MHz. The chip further uses 4MB of L2 cache and is a 35W TDP part. This should be a decent processor for laptops, offering acceptable general performance and some nice mobile gaming with the beefy integrated GPU!

AMD Mobile Kaveri APU Details Leak.png

The leaked AMD mobile Kaveri APU lineup via WCCF Tech.

Of course, for productivity machines where portability and battery life are bigger concerns, AMD will reportedly be offering up the dual core A6-7000. This 17W ULV processor combines two cores clocked at 2.2GHz (3.0GHz boost), a GPU based on the Radeon R4 with 192 GCN cores (494MHz base and 533MHz boost), and 2MB of L2 cache. Compared to the FX-7600P (and especially the desktop parts), the A6-7000 sips power. We will have to wait for reviews to see how it performs, but it will be facing stiff competition from Intel’s Core i3 Haswell CPUs and even the Bay Trail SoCs which come in at a lower TDP and offer higher thread counts. The GPU capabilities and GPGPU / HSA software advancements (such as LibreOffice adding GPGPU support) will make or break the A6-7000, in my opinion.
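As a back-of-the-envelope comparison of the two GPUs, the leaked core counts and clocks can be converted into theoretical peak single-precision throughput. This sketch assumes each GCN core retires one fused multiply-add (2 FLOPs) per clock, which is the usual way peak figures are quoted; the function name is hypothetical, and real-world performance on a shared-memory APU will fall well short of these peaks.

```python
def peak_gflops(shader_cores, clock_mhz, flops_per_core_per_clock=2):
    """Theoretical peak GFLOPS, assuming one FMA (2 FLOPs) per core per clock."""
    return shader_cores * flops_per_core_per_clock * clock_mhz / 1000.0


# Leaked figures: (GCN cores, base MHz, boost MHz)
leaked_gpus = {
    "FX-7600P": (512, 600, 666),
    "A6-7000": (192, 494, 533),
}

for name, (cores, base_mhz, boost_mhz) in leaked_gpus.items():
    base = peak_gflops(cores, base_mhz)
    boost = peak_gflops(cores, boost_mhz)
    print(f"{name}: {base:.1f} GFLOPS base, {boost:.1f} GFLOPS boost")
```

On paper, that puts the FX-7600P's GPU at a bit over three times the A6-7000's, which is roughly what the 35W versus 17W split would suggest.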

In all, the leaked mobile chips appear to be a decent upgrade over the previous generation. The new mobile APUs will bring incremental performance and power saving benefits to bear against competition from Intel. I’m looking forward to more official information and seeing what the OEMs are able to do with the new chips.

Source: WCCF Tech

AMD Shows Off ARM-Based Opteron A1100 Server Processor And Reference Motherboard

Subject: Processors | May 7, 2014 - 09:26 PM |
Tagged: TrustZone, server, seattle, PCI-E 3.0, opteron a1100, opteron, linux, Fedora, ddr4, ARMv8, arm, amd, 64-bit

AMD showed off its first ARM-based “Seattle” processor running on a reference platform motherboard at an event in San Francisco earlier this week. The new chip, which began sampling in March, is slated for general availability in Q4 2014. The “Seattle” processor will be officially labeled the AMD Opteron A1100.

During the press event, AMD demonstrated the Opteron A1100 running on a reference design motherboard (the Seattle Development Platform). The hardware was used to drive a LAMP software stack including an ARM optimized version of Linux based on RHEL, Apache 2.4.6, MySQL 5.5.35, and PHP 5.4.16. The server was then used to host a WordPress blog that included stream-able video.

AMD Seattle Development Platform Opteron A1100.jpg

Of course, the hardware itself is the new and interesting bit and thanks to the event we now have quite a few details to share.

The Opteron A1100 features eight ARM Cortex-A57 cores clocked at 2.0 GHz (or higher). AMD has further packed in an integrated memory controller, TrustZone encryption hardware, and floating point and NEON video acceleration hardware. Like a true SoC, the Opteron A1100 supports 8 lanes of PCI-E 3.0, eight SATA III 6Gbps ports, and two 10GbE network connections.

The Seattle processor has a total of 4MB of L2 cache (each pair of cores shares 1MB of L2) and 8MB L3 cache that all eight cores share. The integrated memory controller supports DDR3 and DDR4 memory in SO-DIMM, unbuffered DIMM, and registered ECC RDIMM forms (only one type per motherboard) enabling the ARM-based platform to be used in a wide range of server environments (enterprise, SMB, and home servers et al).

AMD has stated that the upcoming Opteron A1100 processor delivers between two and four times the performance of the existing Opteron X series (which uses four x86 Jaguar cores clocked at 1.9 GHz). The A1100 has a 25W TDP and is manufactured by Global Foundries. Despite the slight increase in TDP versus the Opteron X series (the Opteron X2150 is a 22W part), AMD claims the increased performance results in notable improvements in compute/watt performance.
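The compute/watt claim can be sanity-checked with the numbers AMD gave. This is a rough sketch that treats TDP as a stand-in for actual power draw (it is not a measurement) and uses the 22W Opteron X2150 as the baseline; the function name is made up for illustration.

```python
def perf_per_watt_gain(perf_ratio, new_tdp_w, old_tdp_w):
    """Relative performance-per-watt, using TDP as a proxy for power draw."""
    return perf_ratio * old_tdp_w / new_tdp_w


A1100_TDP_W = 25   # Opteron A1100
X2150_TDP_W = 22   # Opteron X2150

low = perf_per_watt_gain(2.0, A1100_TDP_W, X2150_TDP_W)    # "two times" claim
high = perf_per_watt_gain(4.0, A1100_TDP_W, X2150_TDP_W)   # "four times" claim

print(f"Perf/watt gain: {low:.2f}x to {high:.2f}x")
```

If the two-to-four-times performance claim holds, the extra 3W of TDP barely dents it: the A1100 would still come out roughly 1.8 to 3.5 times ahead per watt.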

AMD Opteron Server Processor.png

AMD has engineered a reference motherboard, though partners will also be able to provide customized solutions. The combination of reference motherboard and ARM-based Opteron A1100 is known as the Seattle Development Platform. This reference motherboard features four registered DDR3 DIMM slots for up to 128GB of memory, eight SATA 6Gbps ports, support for standard ATX power supplies, and multiple PCI-E connectors that can be configured to run as a single PCI-E 3.0 x8 slot or two PCI-E 3.0 x4 slots.

The Opteron A1100 is an interesting move from AMD that will target low power servers. The ARM-based server chip has an uphill battle in challenging x86-64 in this space, but the SoC does have several advantages in terms of compute performance per watt and overall cost. AMD has taken the SoC elements (integrated IO, memory, companion processor hardware) of the Opteron X series and its APUs in general, removed the graphics portion, and crammed in as many low power 64-bit ARM cores as possible. This configuration will have advantages over the Opteron X CPU+GPU APU when running applications that use multiple serial threads and can take advantage of large amounts of memory per node (up to 128GB). The A1100 should excel in serving up files and web pages or acting as a caching server where data can be held in memory for fast access.

I am looking forward to the launch as the 64-bit ARM architecture makes its first major inroads into the server market. The benchmarks, and ultimately software stack support, will determine how well it is received and if it ends up being a successful product for AMD, but at the very least it keeps Intel on its toes and offers up an alternative and competitive option.

Source: Tech Report

Also at Intel and Google's Chrome OS Event: Human Rights

Subject: General Tech, Processors | May 7, 2014 - 12:06 AM |
Tagged: conflict-free, Intel, Congo

The Intel and Google keynote speech closed out with a video and an announcement. Each Chrome OS device that they mentioned will be among the first to use Haswell and Bay Trail processors manufactured with conflict-free minerals. They are not abandoning the Democratic Republic of the Congo, rather they seem to be forcing their suppliers to adhere to human rights standards if they want to do business with Intel.

This initiative has apparently led to the creation of the "Conflict-Free Smelter Program" which is run by the Conflict-Free Sourcing Initiative. This industry body includes several other companies, such as AMD, Apple, Foxconn, IBM, Microsoft, NVIDIA, Pegatron, Qualcomm, every laptop manufacturer that I could think of, and over 150 others.

Intel has been discussing this for a little while, and taking positive steps toward this goal along the way. There really are not that many other ways to say it: reducing the suffering in the world is a great goal.

Source: Intel