NVIDIA Reveals 64-bit Denver CPU Core Details, Headed to New Tegra K1 Powered Devices Later This Year
Subject: Processors | August 12, 2014 - 01:06 AM | Tim Verry
Tagged: tegra k1, project denver, nvidia, Denver, ARMv8, arm, Android, 64-bit
During GTC 2014 NVIDIA launched the Tegra K1, a new mobile SoC that contains a powerful Kepler-based GPU. Initial processors (and the resultant design wins such as the Acer Chromebook 13 and Xiaomi Mi Pad) utilized four ARM Cortex-A15 cores for the CPU side of things, but later this year NVIDIA is deploying a variant of the Tegra K1 SoC that switches out the four A15 cores for two custom (NVIDIA developed) Denver CPU cores.
The custom 64-bit Denver CPU cores use a 7-way superscalar design and run a custom instruction set. Denver is a wide but in-order architecture that allows up to seven operations per clock cycle. NVIDIA is using a custom ISA and on-the-fly binary translation to convert ARMv8 instructions to microcode before execution. A software layer and 128MB cache enhance the Dynamic Code Optimization technology by allowing the processor to examine and optimize the ARM code, convert it to the custom instruction set, and store the converted microcode of frequently used applications in that cache (which can be bypassed for infrequently processed code). Using the wider execution engine and Dynamic Code Optimization (which is transparent to ARM developers and does not require updated applications), NVIDIA touts the dual Denver core Tegra K1 as being at least as powerful as the quad and octa-core packing competition.
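To make the caching idea concrete, here is a minimal toy sketch of a hot-code translation cache. It is only an illustration of the general technique described above, not NVIDIA's implementation: the class name, the `HOT_THRESHOLD` value, and the `native:`/`decoded:` labels are all hypothetical, and the real optimizer's policies are not public in this article.

```python
from collections import Counter

HOT_THRESHOLD = 3  # hypothetical: executions before a block counts as "hot"


class ToyTranslator:
    """Toy model of a hot-code translation cache (not NVIDIA's design)."""

    def __init__(self, threshold=HOT_THRESHOLD):
        self.threshold = threshold
        self.run_counts = Counter()  # how often each block has executed
        self.cache = {}              # block id -> stored "optimized" translation
        self.log = []                # which path each execution took

    def optimize(self, block):
        # Stand-in for the expensive optimize-and-translate step.
        return tuple(f"native:{op}" for op in block)

    def execute(self, block_id, block):
        self.run_counts[block_id] += 1
        if block_id in self.cache:
            # Frequently used code: reuse the stored translation.
            self.log.append("cached")
            return self.cache[block_id]
        if self.run_counts[block_id] >= self.threshold:
            # The block just became hot: optimize once and store the result.
            self.cache[block_id] = self.optimize(block)
            self.log.append("optimized")
            return self.cache[block_id]
        # Infrequently run code bypasses the cache entirely.
        self.log.append("direct")
        return tuple(f"decoded:{op}" for op in block)


t = ToyTranslator()
for _ in range(5):
    t.execute("loop", ["add", "ldr", "cmp"])  # runs often, becomes hot
t.execute("init", ["mov"])                    # runs once, stays cold
print(t.log)
```

In this sketch the loop body is decoded directly twice, optimized on its third run, and served from the cache afterwards, while the one-off block never touches the cache.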
Further, NVIDIA has claimed that at peak throughput (and in specific situations where application code and DCO can take full advantage of the 7-way execution engine) the Denver-based mobile SoC handily outpaces Intel’s Bay Trail, Apple’s A7 Cyclone, and Qualcomm’s Krait 400 CPU cores. In the results of a synthetic benchmark test provided to The Tech Report, the Denver cores were even challenging Intel’s Haswell-based Celeron 2955U processor. Keeping in mind that these are NVIDIA-provided numbers and likely the best results one can expect, Denver is still quite a bit more capable than existing cores. (Note that the Haswell chips would likely pull much farther ahead when presented with applications that cannot be easily executed in-order with limited instruction parallelism.)
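The caveat about instruction parallelism can be illustrated with a deliberately simplified model of a wide in-order machine. The sketch below assumes every operation takes a single cycle and that a stalled operation blocks everything behind it; the function name and the workloads are hypothetical, and it does not model Denver, Haswell, or any real pipeline.

```python
def in_order_cycles(deps, width=7):
    """Count cycles needed to issue all ops in program order.

    deps[i] lists the indices of earlier ops whose results op i needs.
    Assumes single-cycle latency: a result is usable the cycle after issue.
    """
    issue_cycle = {}
    cycle = 0
    i = 0
    n = len(deps)
    while i < n:
        cycle += 1
        issued = 0
        while i < n and issued < width:
            # An op cannot issue in the same cycle as something it depends on.
            if any(issue_cycle[d] >= cycle for d in deps[i]):
                break  # in-order: the stalled op blocks all later ops
            issue_cycle[i] = cycle
            i += 1
            issued += 1
    return cycle


independent = [[] for _ in range(14)]               # no dependencies at all
chain = [[]] + [[i - 1] for i in range(1, 14)]      # each op needs the previous one

print(in_order_cycles(independent))  # 14 ops fill two 7-wide cycles
print(in_order_cycles(chain))        # one op per cycle, width goes unused
```

With fully independent operations the 7-wide model finishes 14 operations in 2 cycles; with a strict dependency chain the same 14 operations take 14 cycles, which is the kind of code where extra issue width alone does not help.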
NVIDIA is ratcheting up mobile CPU performance with its Denver cores, but it is also aiming for an efficient chip and has implemented several power saving tweaks. Beyond the decision to go with an in-order execution engine (with DCO hopefully mostly making up for that), the beefy Denver cores reportedly feature low latency power state transitions (e.g. between active and idle states), power gating, dynamic voltage, and dynamic clock scaling. The company claims that “Denver's performance will rival some mainstream PC-class CPUs at significantly reduced power consumption.” In real terms this should mean that the two Denver cores in place of the quad core A15 design in the Tegra K1 should not result in significantly lower battery life. The two K1 variants are said to be pin compatible such that OEMs and developers can easily bring upgraded models to market with the faster Denver cores.
For those curious, in the Tegra K1 the two Denver cores (clocked at up to 2.5GHz) share a 16-way L2 cache, and each has its own 128KB instruction and 64KB data L1 caches. The 128MB Dynamic Code Optimization cache is held in system memory.
Denver is the first (custom) 64-bit ARM processor for Android (with Apple’s A7 being the first 64-bit smartphone chip), and NVIDIA is working on supporting the next generation Android OS known as Android L.
The dual Denver core Tegra K1 is coming later this year and I am excited to see how it performs. The current K1 chip already has a powerful fully CUDA compliant Kepler-based GPU which has enabled awesome projects such as computer vision and even prototype self-driving cars. With the new Kepler GPU and Denver CPU pairing, I’m looking forward to seeing how NVIDIA’s latest chip is put to work and the kinds of devices it enables.
Are you excited for the new Tegra K1 SoC with NVIDIA’s first fully custom cores?
Subject: General Tech | June 19, 2013 - 09:51 PM | Josh Walrath
Tagged: Volta, nvidia, maxwell, licensing, kepler, Denver, Blogs, arm
Yesterday we all saw the blog piece from NVIDIA that stated that they were going to start licensing their IP to interested third parties. Obviously, there was a lot of discussion about this particular move. Some were in favor, some were opposed, and others yet thought that NVIDIA is now simply roadkill. I believe that it is an interesting move, but we are not yet sure of the exact details or the repercussions of such a decision on NVIDIA’s part.
The biggest bombshell of the entire post was that NVIDIA would be licensing out their latest architecture to interested clients. The Kepler architecture powers the very latest GTX 700 series of cards and at the top end it is considered one of the fastest and most efficient architectures out there. Seemingly, there is a price for this though. Time to dig a little deeper.
Kepler will be the first technology licensed to third party manufacturers. We will not see full GPUs; these will only be integrated into mobile products.
The very latest Tegra parts from NVIDIA do not feature the Kepler architecture for the graphics portion. Instead, the units featured in Tegra can almost be described as GeForce 7000 series parts. The computational units are split between pixel shaders and vertex shaders. They support a maximum compatibility of D3D 9_3 and OpenGL ES 2.0. This is a far cry from a unified shader architecture and support for the latest D3D 11 and OpenGL ES 3.0 specifications. Other mobile units feature the latest Mali and Adreno series of graphics units which are unified and support DX11 and OpenGL ES 3.0.
So why exactly do the latest Tegras not share the Kepler architecture? Hard to say. It could be a variety of factors that include time to market, available engineering teams, and simulations which could dictate if power and performance can be better served by a less complex unit. Kepler is not simple. A Kepler unit that occupies the same die space could potentially consume more power with any given workload, or conversely it could perform poorly given the same power envelope.
We can look at the desktop side of this argument for some kind of proof. At the top end Kepler is a champ. The GTX 680/770 has outstanding performance and consumes far less power than the competition from AMD. When we move down a notch and see the GTX 660 Ti/HD 7800 series of cards, we see much greater parity in performance and power consumption. Going to the HD 7790 as compared to the 650 Ti Boost, we see the Boost part has slightly better performance but consumes significantly more power. Then we move down to the 650 and 650 Ti and these parts do not consume any more power than the competing AMD parts, but they also perform much more poorly. I know these are some pretty hefty generalizations and the engineers at NVIDIA could very effectively port Kepler over to mobile applications without significant performance or power penalties. But so far, we have not seen this work.
Power, performance, and die area aside there is also another issue to factor in. NVIDIA just announced that they are doing this. We have no idea how long this effort has been going, but it is very likely that it has only been worked on for the past six months. In that time NVIDIA needs to hammer out how they are going to license the technology, how much manpower they must provide licensees to get those parts up and running, and what kind of fees they are going to charge. There is a lot of work going on there and this is not a simple undertaking.
So let us assume that some three months ago an interested partner such as Rockchip or Samsung comes knocking at NVIDIA’s door. They work out the licensing agreements and this takes several months. Then we start to see the transfer of technology between the companies. Obviously Samsung and Rockchip are not going to apply this graphics architecture to currently shipping products, but will instead bundle it in with a next generation ARM based design. These designs are not spun out overnight. For example, the 64 bit ARMv8 designs have been finalized for around a year, and we do not expect to see initial parts being shipped until late 1H 2014. So any partner that decides to utilize NVIDIA’s Kepler architecture for such an application will not see this part released until 1H 2015 at the very earliest.
Shield is still based on a GPU possessing separate pixel and vertex shaders. DX11 and OpenGL ES 3.0? Nope!
If someone decides to license this technology from NVIDIA, it will not be of great concern. The next generation of NVIDIA graphics will already be out by that time, and we could very well be approaching the next iteration for the desktop side. NVIDIA plans on releasing a Kepler based mobile unit in 2014 (Logan), which would be a full year in advance of any competing product. In 2015 NVIDIA is planning on releasing an ARM product based on the Denver CPU and Maxwell GPU. So we can easily see that NVIDIA will only be licensing out an older generation product so it will not face direct competition when it comes to GPUs. NVIDIA obviously is hoping that their GPU tech will still be a step ahead of that of ARM (Mali), Qualcomm (Adreno), and Imagination Technologies (PowerVR).
This is an easy and relatively pain-free way to test the waters that ARM, Imagination Technologies, and AMD are already treading. ARM only licenses IP and has shown the world that it can not only succeed at it, but thrive. Imagination Tech used to produce their own chips much like NVIDIA does, but they changed direction and continue to be profitable. AMD recently opened up about their semi-custom design group that will design specific products for customers and then license those designs out. I do not think this is a desperation move by NVIDIA, but it certainly is one that probably is a little late in coming. The mobile market is exploding, and we are approaching a time where nearly every electricity based item will have some kind of logic included in it; billions of chips a year will be sold. NVIDIA obviously wants a piece of that market. Even a small piece of “billions” is going to be significant to the bottom line.