Subject: Processors | July 28, 2016 - 02:47 PM | Tim Verry
Tagged: kaby lake, Intel, gt3e, coffee lake, 14nm
Intel will allegedly be releasing another 14nm processor following Kaby Lake (which is itself a 14nm successor to Skylake) in 2018. The new processors are code named "Coffee Lake" and will be released alongside low power runs of 10nm Cannon Lake chips.
Not much information is known about Coffee Lake outside of leaked slides and rumors, but the first processors slated to launch in 2018 will be mainstream mobile chips that will come in U and HQ mobile flavors, which are 15W to 28W and 35W to 45W TDP chips respectively. Of course, these processors will be built on a very mature 14nm process with the usual small performance and efficiency gains beyond Skylake and Kaby Lake. The chips should have a better graphics unit, but perhaps more interesting is that the slides suggest that Coffee Lake will be the first architecture where Intel will bring "hexacore" (6 core) processors into mainstream consumer chips! The HQ-class Coffee Lake processors will reportedly come in two, four, and six core variants with Intel GT3e class GPUs. Meanwhile the lower power U-class chips top out at dual cores with GT3e class graphics. This is interesting because Intel has previously held back the six core CPUs for its more expensive and higher margin HEDT and Xeon platforms.
Of course, 2018 is also the year for Cannon Lake, which would have been the "tick" in Intel's old tick-tock schedule (which is no more): the chips will move to a smaller process node, and Intel would then improve on the 10nm process in future architectures. Cannon Lake is supposed to be built on the tiny 10nm node, and it appears that the first chips on this node will be ultra low power versions for laptops and tablets. Occupying the ULV platform's U-class (15W) and Y-class (4.5W), Cannon Lake CPUs will be dual cores with GT2 graphics. These chips should sip power while giving comparable performance to Kaby Lake and Coffee Lake, perhaps even matching the performance of the Coffee Lake U processors!
Stay tuned to PC Perspective for more information!
Subject: Processors, Mobile | July 18, 2016 - 12:03 AM | Sebastian Peak
Tagged: softbank, SoC, smartphones, mobile cpu, Cortex-A73, ARM Holdings, arm, acquisition
ARM Holdings is to be acquired by SoftBank for $32 billion USD. This report has been confirmed by the Wall Street Journal, which states that an official announcement of the deal is likely on Monday as "both companies’ boards have agreed to the deal".
(Image credit: director.co.uk)
"Japan’s SoftBank Group Corp. has reached a more than $32 billion deal to buy U.K.-based chip-designer ARM HoldingsPLC, marking a significant push for the Japanese telecommunications giant into the mobile internet, according to a person familiar with the situation." - WSJ
ARM just announced their newest CPU core, the Cortex-A73, at the end of May, with performance and efficiency improvements over the current Cortex-A72 promised with the new architecture.
(Image credit: AnandTech)
We will have to wait and see if this acquisition will have any bearing on future product development, though it seems the acquisition targets the significant intellectual property value of ARM, whose designs can be found in most smartphones.
Subject: Processors, Mobile | July 11, 2016 - 11:44 AM | Sebastian Peak
Tagged: SoC, Snapdragon 821, snapdragon, qualcomm, adreno 530
Announced today, the Snapdragon 821 offers a modest CPU frequency increase over the Snapdragon 820, with clock speeds of up to 2.4 GHz compared to 2.2 GHz with the Snapdragon 820. The new SoC is still implementing Qualcomm's custom quad-core "Kryo" design, which is made up of two dual-core CPU clusters.
"What isn’t in this announcement is that the power cluster will likely be above 2 GHz and GPU clocks look to be around 650 MHz but without knowing whether there are some changes other than clock relative to Adreno 530 we can’t really estimate the performance of this part."
Specifics on the Adreno GPU were not mentioned in the official announcement. The 650 MHz GPU clock reported by Anandtech would offer a modest improvement over the SD820's 624 MHz Adreno 530 GPU. Additionally, the "power cluster" will reportedly move from 1.6 GHz with the SD820 to 2.0 GHz with the SD821.
No telling when this updated SoC will find its way into consumer devices, with the Snapdragon 820 currently available in the Samsung Galaxy S7/S7 Edge, LG G5, OnePlus 3, and a few others.
Subject: Graphics Cards, Processors | June 29, 2016 - 07:27 AM | Sebastian Peak
Tagged: RX 490, radeon, processors, Polaris, graphics card, Bristol Ridge, APU, amd, A12-9800
AMD's current "We're in the Game" promotion offers a glimpse at upcoming product names, including the Radeon RX 490 graphics card, and the new Bristol Ridge APUs.
Visit AMD's gaming promo page and click the link to "check eligibility" to see the following list of products, which includes the new product names:
It seems safe to assume that the new products listed - including the Radeon RX 490 - are close to release, though details on the high-end Polaris GPU are not mentioned. We do have details on the upcoming Bristol Ridge products, with this in-depth preview from Josh published back in April. The A12-9800 and A12-9800E are said to be the flagship products in this new 7th-gen lineup, so there will be new desktop parts with improved graphics soon.
Subject: Processors | June 27, 2016 - 02:40 PM | Jeremy Hellstrom
Tagged: dx12, 6700k, Intel, i7-6950X
[H]ard|OCP has been conducting tests using a variety of CPUs to see how well DX12 distributes load between cores as compared to DX11. Their final article, which covers the 6700K and 6950X, was done a little differently and so cannot be directly compared to the previously tested CPUs. That does not lower the value of the testing; scaling is still very obvious, and the new tests were designed to highlight more common usage scenarios for gamers. Read on to see how well, or how poorly, Ashes of the Singularity scales when using DX12.
"This is our fourth and last installment of looking at the new DX12 API and how it works with a game such as Ashes of the Singularity. We have looked at how DX12 is better at distributing workloads across multiple CPU cores than DX11 in AotS when not GPU bound. This time we compare the latest Intel processors in GPU bound workloads."
Here are some more Processor articles from around the web:
- Intel Skylake Graphics: Windows 10 vs. Ubuntu 16.04 + Latest Open-Source Driver Code @ Phoronix
- AMD Wraith Cooler Performance on FX-6350 Black Edition CPU @ Neoseeker
- Athlon X4 880K @ Hardware Secrets
- AMD Athlon X4 845 CPU Review @ OCC
Subject: Processors | June 24, 2016 - 11:15 PM | Scott Michaud
Tagged: Intel, kaby lake, iGPU, h.265, hevc, vp8, vp9, codec, codecs
Fudzilla isn't really talking about their sources, so it's difficult to gauge how confident we should be, but they claim to have information about the video codecs supported by Kaby Lake's iGPU. This update is supposed to include hardware support for HDR video, the Rec.2020 color gamut, and HDCP 2.2, because, if videos are pirated prior to their release date, the solution is clearly to punish your paying customers with restrictive, compatibility-breaking technology. Time-traveling pirates are the worst.
According to their report, Kaby Lake-S will support VP8, VP9, HEVC 8b, and HEVC 10b, both encode and decode. However, they then go on to say that 10-bit VP9 and 10-bit HEVC do not include hardware encoding. I'm not too knowledgeable about video codecs, but I don't know of any benefits to encoding 8-bit HEVC Main 10. Perhaps someone in our comments can clarify.
Subject: Processors | June 21, 2016 - 10:00 PM | Scott Michaud
Update (June 22nd @ 12:36 AM): Errrr. Right. Accidentally referred to the CPU in terms of TFLOPs. That's incorrect -- it's not a floating-point decimal processor. Should be trillions of operations per second (teraops). Whoops! Also, it has a die area of 64sq.mm, compared to 520sq.mm of something like GF110.
So this is an interesting news post. Graduate students at UC Davis have designed and produced a thousand-core CPU at IBM's facilities. The processor is manufactured on their 32nm process, which is quite old -- about half-way between NVIDIA's Fermi and Kepler if viewed from a GPU perspective. Its die area is not listed, but we've reached out to their press contact for more information. The chip can be clocked up to 1.78 GHz, yielding 1.78 teraops of theoretical performance.
These numbers tell us quite a bit.
The first thing that stands out to me is that the processor is clocked at 1.78 GHz, has 1000 cores, and is rated at 1.78 teraops. This is interesting because modern GPUs (note that this is not a GPU -- more on that later) are rated at twice the clock rate times the number of cores. The factor of two comes in with fused multiply-add (FMA), a*b + c, which can be easily implemented as a single instruction and is widely used in real-world calculations. Two mathematical operations in a single instruction yield a theoretical max of 2 times clock times core count. Since this processor does not count the factor of two, it seems like its instruction set is massively reduced compared to commercial processors.
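The rating arithmetic above can be written out as a quick sketch. The function name and the ops_per_cycle parameter are made up for illustration; this is only the usual "clock times cores times operations per cycle" rule of thumb, not a figure published by the chip's designers.

```python
def peak_ops_per_second(clock_hz, cores, ops_per_cycle=1):
    """Theoretical peak: every core retires ops_per_cycle operations each cycle."""
    return clock_hz * cores * ops_per_cycle

# Figures quoted in the post: 1.78 GHz and 1000 cores.
as_rated = peak_ops_per_second(1.78e9, 1000)                   # one op per cycle
if_fma = peak_ops_per_second(1.78e9, 1000, ops_per_cycle=2)    # GPU-style FMA counting

print(f"as rated:        {as_rated / 1e12:.2f} teraops")
print(f"if FMA counted:  {if_fma / 1e12:.2f} teraops")
```

With one operation per cycle the result matches the quoted 1.78 teraops; the GPU-style factor of two would have doubled it, which is why its absence stands out.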
If they even cut out FMA, what else did they remove from the instruction set? This would at least partially explain why the CPU has such a high theoretical throughput per transistor compared to, say, NVIDIA's GF110, which has a slightly lower TFLOP rating with about five times the transistor count -- and that's ignoring all of the complexity-saving tricks that GPUs play, that this chip does not. Update (June 22nd @ 12:36 AM): Again, none of this makes sense, because it's not a floating-point processor.
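As a rough check on the "about five times the transistor count" comparison, here is a small sketch using only the figures quoted in this post (3 billion transistors and 1.5 TFLOPs for GF110; 0.621 billion transistors and 1.78 teraops for this chip). The helper name is hypothetical, and as the update notes, one rating is floating-point and the other is not, so treat the ratio as illustrative only.

```python
def ops_per_transistor(ops_per_second, transistors):
    """Theoretical operations per second delivered per transistor."""
    return ops_per_second / transistors

# Figures as quoted in the post; the two ratings count different kinds
# of operations, so this is not an apples-to-apples comparison.
gf110 = ops_per_transistor(1.5e12, 3.0e9)
kilocore = ops_per_transistor(1.78e12, 0.621e9)

transistor_ratio = 3.0e9 / 0.621e9   # GF110 transistors vs. this chip
density_ratio = kilocore / gf110     # per-transistor throughput advantage

print(f"GF110 has about {transistor_ratio:.1f}x the transistors")
print(f"this chip delivers about {density_ratio:.1f}x the rated ops per transistor")
```

On these numbers GF110 has roughly 4.8 times the transistors, while the thousand-core chip comes out at roughly 5.7 times the rated throughput per transistor.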
"Big Fermi" uses 3 billion transistors to achieve 1.5 TFLOPs when operating on 32 pieces of data simultaneously (see below). This processor does 1.78 teraops with 0.621 billion transistors.
On the other hand, this chip is different from GPUs in that it doesn't use their complexity-saving tricks. GPUs save die space by tying multiple threads together and forcing them to behave in lockstep. On NVIDIA hardware, 32 threads are bound into a “warp”. On AMD, 64 make up a “wavefront”. On Intel's Xeon Phi, AVX-512 packs sixteen 32-bit values together into a vector and operates on them at once. GPUs use this architecture because, if you have a really big workload, chances are you have very related tasks; neighbouring pixels on a screen will be operating on the same material with slightly offset geometry, multiple vertexes of the same object will be deformed by the same process, and so forth.
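To make the lockstep cost concrete, here is a toy model (not how any real GPU scheduler is implemented): lanes are grouped by a fixed width, and a group must run every distinct branch its lanes take, one after another. The function name and the "shade"/"skip" labels are invented for the example.

```python
def lockstep_passes(branch_per_lane, group_width):
    """Count passes when lanes in a group must execute together.

    Each group runs every distinct branch taken by its lanes in turn,
    with the lanes that did not take that branch sitting idle.
    """
    passes = 0
    for start in range(0, len(branch_per_lane), group_width):
        group = branch_per_lane[start:start + group_width]
        passes += len(set(group))
    return passes

# 64 lanes that all take the same branch, e.g. pixels on one material.
uniform = ["shade"] * 64
# 64 lanes that alternate between two branches.
mixed = ["shade", "skip"] * 32

print(lockstep_passes(uniform, 32))  # 2: one pass per 32-wide group
print(lockstep_passes(mixed, 32))    # 4: each group runs both branches
```

In this model the related workload pays nothing for being tied together, while the divergent one doubles its pass count; independent cores would simply run one branch each.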
This processor, on the other hand, has a thousand cores that are independent. Again, this is wasteful for tasks that map easily to single-instruction-multiple-data (SIMD) architectures, but the reverse (not wasteful in highly parallel tasks that SIMD is wasteful on) is also true. SIMD makes an assumption about your data and tries to optimize how it maps to the real-world -- it's either a valid assumption, or it's not. If it isn't? A chip like this would have multi-fold performance benefits, FLOP for
Subject: Processors | June 15, 2016 - 11:18 PM | Scott Michaud
Tagged: Zen, opteron, amd
We're beginning to see how the Zen architecture will affect AMD's entire product stack. This news refers to their Opteron line of CPUs, which are intended for servers and certain workstations. They tend to allow lots of memory, have lots of cores, and connect to a lot of I/O options and add-in boards at the same time.
In this case, Zen-based Opterons will be available in two, four, sixteen, and thirty-two core options, with two threads per core (yielding four, eight, thirty-two, and sixty-four threads, respectively). TDPs will range between 35W and 180W. Intel's Xeon E7 v4 goes up to 165W for 24 cores (on Broadwell-EX), so AMD has a little more headroom to play with for those extra eight cores. That is obviously a lot, and it should be, again, good for cloud applications that can be parallelized.
As for the I/O side of things, the rumored chip will have 128 PCIe 3.0 lanes. It's unclear whether that is per socket, or total. Its wording sounds like it is per-CPU, although much earlier rumors have said that it has 64 PCIe lanes per socket with dual-socket boards available. It will also support sixteen 10-Gigabit Ethernet connections, which, again, is great for servers, especially with virtualization.
These are expected to launch in 2017. Fudzilla claims that “very late 2016” is possible, but also that it will launch after the high-end desktop parts, which are expected to be delayed until 2017.
Subject: Graphics Cards, Processors | June 13, 2016 - 03:51 PM | Scott Michaud
Tagged: amd, Polaris, Zen, Summit Ridge, rx 480, rx 470, rx 460
AMD has just unveiled their entire RX line of graphics cards at E3 2016's PC Gaming Show. It was a fairly short segment, but it had a few interesting points in it. At the end, they also gave another teaser of Summit Ridge, which uses the Zen architecture.
First, Polaris. As we know, the RX 480 was going to bring >5 TFLOPs at a $199 price point. They elaborated that this will apply to the 4GB version, which likely means that another version with more VRAM will be available, and that implies 8GB. Beyond the RX 480, AMD has also announced the RX 470 and RX 460. Little is known about the 470, but they mentioned that the 460 will have a <75W TDP. This is interesting because the PCIe bus provides 75W of power. This implies that it will not require any external power, and thus could be a cheap and powerful (in terms of esports titles) addition to an existing desktop. This is an interesting way to use the power savings of the die shrink to 14nm!
They also showed off a backpack VR rig. They didn't really elaborate, but it's here.
As for Zen? AMD showed the new architecture running DOOM, and added the circle-with-Zen branding to a 3D model of a CPU. Zen will be coming first to the enthusiast category with (up to?) eight cores, two threads per core (16 threads total).
The AMD Radeon RX 480 will launch on June 29th for $199 USD (4GB). None of the other products have a specific release date.
Subject: Processors | June 8, 2016 - 08:17 AM | Scott Michaud
Tagged: Xeon Phi, Intel, gpgpu
Intel's recent restructure had a much broader impact than I originally believed. Beyond the large number of employees who will lose their jobs, we're even seeing it affect other areas of the industry. Typically, ASUS releases their ZenFone line with x86 processors, which I assumed was based on big subsidies from Intel to push their instruction set into new product categories. This year, ASUS chose the ARM-based Qualcomm Snapdragon, which seemed to me like Intel decided to stop the bleeding.
That brings us to today's news. After over 27 years at Intel, James Reinders accepted the company's early retirement offer, scheduled for his 10001st day with the company, and will step down from his position as Intel's High Performance Computing Director. He worked on the Larrabee and Xeon Phi initiatives, and published several books on parallelism.
According to his letter, it sounds like his retirement offer was part of a company-wide package, and not targeting his division specifically. That would sort-of make sense, because Intel is focusing on cloud and IoT. Xeon Phi is an area where Intel is battling NVIDIA for high-performance servers, and I would expect that it has potential for cloud-based applications. Then again, as I say that, AWS only has a handful of GPU instances, and they are running fairly old hardware at that, so maybe the demand isn't there yet.