Intel AVX-512 Expanded

Subject: General Tech, Graphics Cards, Processors | July 19, 2014 - 03:05 AM |
Tagged: Xeon Phi, xeon, Intel, avx-512, avx

It is difficult to know what is actually new information in this Intel blog post, but it is interesting none-the-less. Its topic is the AVX-512 extension to x86, designed for Xeon and Xeon Phi processors and co-processors. Basically, last year, Intel announced "Foundation", the minimum support level for AVX-512, as well as Conflict Detection, Exponential and Reciprocal, and Prefetch, which are optional. This, earlier blog post was very much focused on Xeon Phi, but it acknowledged that the instructions will make their way to standard, CPU-like Xeons at around the same time.

Intel_Xeon_Phi_Family.jpg

This year's blog post brings in a bit more information, especially for common Xeons. While all AVX-512-supporting processors (and co-processors) will support "AVX-512 Foundation", the instruction set extensions are a bit more scattered.

 
Xeon
Processors
Xeon Phi
Processors
Xeon Phi
Coprocessors (AIBs)
Foundation Instructions Yes Yes Yes
Conflict Detection Instructions Yes Yes Yes
Exponential and Reciprocal Instructions No Yes Yes
Prefetch Instructions No Yes Yes
Byte and Word Instructions Yes No No
Doubleword and Quadword Instructions Yes No No
Vector Length Extensions Yes No No

Source: Intel AVX-512 Blog Post (and my understanding thereof).

So why do we care? Simply put: speed. Vectorization, the purpose of AVX-512, has similar benefits to multiple cores. It is not as flexible as having multiple, unique, independent cores, but it is easier to implement (and works just fine with having multiple cores, too). For an example: imagine that you have to multiply two colors together. The direct way to do it is multiply red with red, green with green, blue with blue, and alpha with alpha. AMD's 3DNow! and, later, Intel's SSE included instructions to multiply two, four-component vectors together. This reduces four similar instructions into a single operating between wider registers.

Smart compilers (and programmers, although that is becoming less common as compilers are pretty good, especially when they are not fighting developers) are able to pack seemingly unrelated data together, too, if they undergo similar instructions. AVX-512 allows for sixteen 32-bit pieces of data to be worked on at the same time. If your pixel only has four, single-precision RGBA data values, but you are looping through 2 million pixels, do four pixels at a time (16 components).

For the record, I basically just described "SIMD" (single instruction, multiple data) as a whole.

This theory is part of how GPUs became so powerful at certain tasks. They are capable of pushing a lot of data because they can exploit similarities. If your task is full of similar problems, they can just churn through tonnes of data. CPUs have been doing these tricks, too, just without compromising what they do well.

Source: Intel

Intel Earnings Report (Q2 2014)

Subject: General Tech, Processors, Mobile | July 16, 2014 - 03:37 AM |
Tagged: quarterly results, quarterly earnings, quarterly, Intel, earnings

Another fiscal quarter brings another Intel earnings report. Once again, they are doing well for themselves as a whole but are struggling to gain a foothold in mobile. In three months, they sold 8.7 billion dollars in PC hardware, of which 3.7 billion was profit. Its mobile division, on the other hand, brought in 51 million USD in revenue, losing 1.1 billion dollars for their efforts. In all, the company is profitable -- by about 3.84 billion USD.

Intel-Swimming-in-Money.jpg

One interesting metric which Intel adds to their chart, and I have yet to notice another company listing this information so prominently, is their number of employees, compared between quarters. Last year, Intel employed about 106,000 people, which increased to 106,300 two quarters ago. Between two quarters ago and this last quarter, that number dropped by 1400, to 104,900 employees, which was about 1.3% of their total workforce. There does not seem to be a reason for this decline (except for Richard Huddy, we know that he went to AMD).

Intel Process nodes_575px.png

Image Credit: Anandtech

As a final note, Anandtech, when reporting on this story, added a few historical trends near the end. One which caught my attention was the process technology vs. quarter graph, demonstrating their smallest transistor size over the last thirteen-and-a-bit years. We are still slowly approaching 0nm, following an exponential curve as it approaches its asymptote. The width, however, is still fairly regular. It looks like it is getting slightly longer, but not drastically (minus the optical illusion caused by the smaller drops).

Source: Intel

The Third x86-based SoC Player: VIA & Centaur's Isaiah II

Subject: General Tech, Processors, Mobile | July 11, 2014 - 04:58 PM |
Tagged: x86, VIA, isaiah II, Intel, centaur, arm, amd

There might be a third, x86-compatible processor manufacturer who is looking at the mobile market. Intel has been trying to make headway, including the direct development of Android for the x86 architecture. The company also has a few design wins, mostly with Windows 8.1-based tablets but also the occasional Android-based models. Google is rumored to be preparing the "Nexus 8" tablet with one of Intel's Moorefield SoCs. AMD, the second-largest x86 processor manufacturer, is aiming their Mullins platform at tablets and two-in-ones, but cannot afford to play snowplow, at least not like Intel.

via-centaur-countdown.jpg

VIA, through their Centaur Technology division, is expected to announce their own x86-based SoC, too. Called Isaiah II, it is rumored to be a quad core, 64-bit processor with a maximum clock rate of 2.0 GHz. Its GPU is currently unknown. VIA sold their stake S3 Graphics to HTC back in 2011, who then became majority shareholder over the GPU company. That said, HTC and VIA are very close companies. The chairwoman of HTC is the founder of VIA Technologies. The current President and CEO of VIA, who has been in that position since 1992, is her husband. I expect that the GPU architecture will be provided by S3, or will somehow be based on their technology. I could be wrong. Both companies will obviously do what they think is best.

It would make sense, though, especially if it benefits HTC with cheap but effective SoCs for Android and "full" Windows (not Windows RT) devices.

Or this announcement could be larger than it would appear. Three years ago, VIA filed for a patent which described a processor that can read both x86 and ARM machine language and translate it into its own, internal microinstructions. The Centaur Isaiah II could reasonably be based on that technology. If so, this processor would be able to support either version of Android. Or, after Intel built up the Android x86 code base, maybe they shelved that initiative (or just got that patent for legal reasons).

Android-x86.png

But what about Intel? Honestly, I see this being a benefit for the behemoth. Extra x86-based vendors will probably grow the overall market share, compared to ARM, by helping with software support. Even if it is compatible with both ARM and x86, what Intel needs right now is software. They can only write so much of it themselves. It is possible that VIA, being the original netbook processor, could disrupt the PC market with both x86 and ARM compatibility, but I doubt it.

Centaur Technology, the relevant division of VIA, will make their announcement in less than 51 days.

Source: 3d Center

Podcast #308 - Intel and Mantle, XSPC Watercooling Kits, Quantum Dots, and more!

Subject: General Tech | July 10, 2014 - 01:17 PM |
Tagged: podcast, video, Intel, Mantle, amd, nvidia, XSPC, quantum dots, western digital, My Cloud Mirror, A10-7850K, Kaveri, arm, quakecon

PC Perspective Podcast #308 - 07/10/2014

Join us this week as we discuss Intel using Mantle, XSPC Watercooling Kits, Quantum Dots, and more!

You can subscribe to us through iTunes and you can still access it directly through the RSS page HERE.

The URL for the podcast is: http://pcper.com/podcast - Share with your friends!

  • iTunes - Subscribe to the podcast directly through the iTunes Store
  • RSS - Subscribe through your regular RSS reader
  • MP3 - Direct download link to the MP3 file

Hosts: Ryan Shrout, Josh Walrath, Jeremy Hellstrom, Allyn Malventano, and Morry Tietelman

Program length: 1:25:47

Subscribe to the PC Perspective YouTube Channel for more videos, reviews and podcasts!!

 

Panasonic ARMs will be fabbed at Intel

Subject: General Tech | July 8, 2014 - 01:40 PM |
Tagged: SoC, Panasonic, Intel, arm

Intel has been fabbing ARM chips for Altera since the end of last year after their unprecedented move of allowing non-Intel designs into their fabs.  This decision allowed Intel to increase the percentage of time the fabs were active, as they are no longer able to keep them at full capacity with their own chips and have even mothballed the new Fab 42 in Arizona.  Altera is a good customer, as are Tabula, Netronome and Microsemi but together they are still not enough to bring Intel's capacity close to 100%.  The Register has reported on a new contract with the ink still wet from signing; Panasonic will now be using Intel's Fabs for their ARM based SoCs.   The immense size of Panasonic should keep Intel busy and ensure that they continue to make mountains of money licensing their 14nm-process tri-Gate transistors as well as the Fab time.

panasonic-large1.jpg

"Intel has notched up another customer for its fledgling Foundry business as it tries to make money out of its manufacturing and engineering expertise besides x86 processor sales.

The world's most valuable chip manufacturer said on Monday that Panasonic's audio-visual gear will make future system-on-chips (SoCs) in Intel's factories."

Here is some more Tech News from around the web:

Tech Talk

 

Source: The Register
Manufacturer: Intel

When Magma Freezes Over...

Intel confirms that they have approached AMD about access to their Mantle API. The discussion, despite being clearly labeled as "an experiment" by an Intel spokesperson, was initiated by them -- not AMD. According to AMD's Gaming Scientist, Richard Huddy, via PCWorld, AMD's response was, "Give us a month or two" and "we'll go into the 1.0 phase sometime this year" which only has about five months left in it. When the API reaches 1.0, anyone who wants to participate (including hardware vendors) will be granted access.

AMD_Mantle_Logo.png

AMD inside Intel Inside???

I do wonder why Intel would care, though. Intel has the fastest per-thread processors, and their GPUs are not known to be workhorses that are held back by API call bottlenecks, either. Of course, that is not to say that I cannot see any reason, however...

Read on to see why, I think, Intel might be interested and what this means for the industry.

Intel's Knights Landing (Xeon Phi, 2015) Details

Subject: General Tech, Graphics Cards, Processors | July 2, 2014 - 03:55 AM |
Tagged: Intel, Xeon Phi, xeon, silvermont, 14nm

Anandtech has just published a large editorial detailing Intel's Knights Landing. Mostly, it is stuff that we already knew from previous announcements and leaks, such as one by VR-Zone from last November (which we reported on). Officially, few details were given back then, except that it would be available as either a PCIe-based add-in board or as a socketed, bootable, x86-compatible processor based on the Silvermont architecture. Its many cores, threads, and 512 bit registers are each pretty weak, compared to Haswell, for instance, but combine to about 3 TFLOPs of double precision performance.

itsbeautiful.png

Not enough graphs. Could use another 256...

The best way to imagine it is running a PC with a modern, Silvermont-based Atom processor -- only with up to 288 processors listed in your Task Manager (72 actual cores with quad HyperThreading).

The main limitation of GPUs (and similar coprocessors), however, is memory bandwidth. GDDR5 is often the main bottleneck of compute performance and just about the first thing to be optimized. To compensate, Intel is packaging up-to 16GB of memory (stacked DRAM) on the chip, itself. This RAM is based on "Hybrid Memory Cube" (HMC), developed by Micron Technology, and supported by the Hybrid Memory Cube Consortium (HMCC). While the actual memory used in Knights Landing is derived from HMC, it uses a proprietary interface that is customized for Knights Landing. Its bandwidth is rated at around 500GB/s. For comparison, the NVIDIA GeForce Titan Black has 336.4GB/s of memory bandwidth.

Intel and Micron have worked together in the past. In 2006, the two companies formed "IM Flash" to produce the NAND flash for Intel and Crucial SSDs. Crucial is Micron's consumer-facing brand.

intel-knights-landing.jpg

So the vision for Knights Landing seems to be the bridge between CPU-like architectures and GPU-like ones. For compute tasks, GPUs edge out CPUs by crunching through bundles of similar tasks at the same time, across many (hundreds of, thousands of) computing units. The difference with (at least socketed) Xeon Phi processors is that, unlike most GPUs, Intel does not rely upon APIs, such as OpenCL, and drivers to translate a handful of functions into bundles of GPU-specific machine language. Instead, especially if the Xeon Phi is your system's main processor, it will run standard, x86-based software. The software will just run slowly, unless it is capable of vectorizing itself and splitting across multiple threads. Obviously, OpenCL (and other APIs) would make this parallelization easy, by their host/kernel design, but it is apparently not required.

It is a cool way that Intel arrives at the same goal, based on their background. Especially when you mix-and-match Xeons and Xeon Phis on the same computer, it is a push toward heterogeneous computing -- with a lot of specialized threads backing up a handful of strong ones. I just wonder if providing a more-direct method of programming will really help developers finally adopt massively parallel coding practices.

I mean, without even considering GPU compute, how efficient is most software at splitting into even two threads? Four threads? Eight threads? Can this help drive heterogeneous development? Or will this product simply try to appeal to those who are already considering it?

Source: Intel

Podcast #306 - Budget PC Shootout, the Coolermaster Elite 110, AMD GameWorks competitor

Subject: General Tech | June 26, 2014 - 02:36 PM |
Tagged: xeon, video, seiki, podcast, nvidia, msi, Intel, HDMI 2.0, gt70 2pe, gt70, gameworks, FX-9590, displayport 1.3, coolermaster, amd, 4k

PC Perspective Podcast #306 - 06/26/2014

Join us this week as we discuss our Budget PC Shootout, the Coolermaster Elite 110, an AMD GameWorks competitor and more!

You can subscribe to us through iTunes and you can still access it directly through the RSS page HERE.

The URL for the podcast is: http://pcper.com/podcast - Share with your friends!

  • iTunes - Subscribe to the podcast directly through the iTunes Store
  • RSS - Subscribe through your regular RSS reader
  • MP3 - Direct download link to the MP3 file

Hosts: Ryan Shrout, Josh Walrath, Jeremy Hellstrom, and Allyn Maleventano

Program length: 1:19:12

Subscribe to the PC Perspective YouTube Channel for more videos, reviews and podcasts!!

 

 

Some good news on competition for a change; SSDs may be getting cheaper

Subject: General Tech | June 23, 2014 - 01:38 PM |
Tagged: ssd, kingston, Samsung, Intel, sandisk, rumour

If the information provided to DigiTimes is correct we may be in for a price war between SSD manufacturers.  We have seen price drops in flash memory, especially with the advent of TLC and asynchronous flash which have been heartily approved by most enthusiasts.  However there is a chance that in the coming months competition will start driving prices of SSDs down but may have the opposite impact on other products.  Micron is planning on reducing the amount of memory it sells to other companies in order to ramp up its stock of SSDs and SanDisk has jumped into the market with both feet.  You can also expect to see all the major manufacturers start putting out more M.2 drives as adoption of Intel's Z97 board grows.

pricesfalling.jpg

"The SSD industry is heading for fierce price competition as major suppliers, including Micron Technology, Intel, Kingston Technology, SanDisk and Samsung Electronics, are gearing up efforts to outperform others, according to industry sources."

Here is some more Tech News from around the web:

Tech Talk

Source: DigiTimes

You got your FPGA in my Xeon!

Subject: General Tech | June 19, 2014 - 01:19 PM |
Tagged: xeon, Intel, FPGA

Intel has just revealed what The Register is aptly referring to as the FrankenChip, a hybrid Xeon E5 and FPGA chip.  This will allow large companies to access the power of a Xeon and be able to offload some work onto an FPGA they can program and optimize themselves.  The low power FPGA is actually on the chip, as opposed to Microsoft's recent implementation which saw FPGA's added to PCIe slots.  Intel's solution does not use up a slot and also offers direct access to the Xeon cache hierarchy and system memory via QPI which will allow for increased performance.  Another low power shot has been fired at ARM's attempts to grow their share of the server market but we shall see if the inherent complexity of programming an FPGA to work with an x86 is more or less attractive than switching to ARM.

intel-inside-logo-370x290.jpg

"Intel has expanded its chip customization business to help it take on the hazy threat posed by some of the world's biggest clouds adopting low-power ARM processors."

Here is some more Tech News from around the web:

Tech Talk

Source: The Register