NVIDIA Reveals a 5th CPU Core in Upcoming Kal-El Tegra SoC
Kal-El Tegra SoC to use 5 cores
Recent news from NVIDIA has unveiled some interesting new technical details about the upcoming Kal-El ARM-based Tegra SoC. While we have known for some time that this chip would include a quad-core processor and would likely be the first ARM-based quad-core part on the market, NVIDIA's Matt Wuebbling spilled the beans on a new technology called "Variable SMP" (vSMP) and a fifth core on the die.
An updated diagram shows the fifth "companion" core - Courtesy NVIDIA
This patented technology allows the upcoming Tegra processor to address a couple of key issues that affect smartphones and tablets: standby power consumption and manufacturing process deviations. Even though all five of the cores on Kal-El are going to be based on the ARM Cortex A9 design, they will have very different power characteristics because they are built on different variants of the TSMC 40nm process technology. As is typical of most foundries and process technologies, TSMC has both a "high performance" and a "low power" derivative of the 40nm technology, usually aimed at different projects. The higher performing variant will run at faster clock speeds but will also have more transistor leakage, thus increasing overall power consumption. The low power option does just the opposite: it lowers the frequency ceiling while using less power both at idle and under load.
CPU power and performance curves - Courtesy NVIDIA
NVIDIA's answer to this dilemma is to have both - a single A9 core built on the low power transistors and four A9 cores built on the higher performing transistors. The result is the diagram you saw at the top of this story: a quad-core SoC with a single ARM-based "companion." NVIDIA is calling this strategy Variable Symmetric Multiprocessing, and with some integrated hardware tricks the chip is able to switch between operating on the lower power core OR one to four of the higher power cores. The low power process will support operating frequencies only up to 500 MHz, while the high speed process transistors will be able to hit well above 1-1.2 GHz.
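To make the trade-off concrete, here is a rough sketch of why pairing the two transistor types can pay off. It uses the textbook approximation that core power is leakage plus a switching term proportional to C*V^2*f. Every coefficient below is invented for illustration, including the 1200 MHz ceiling on the fast core; only the 500 MHz companion ceiling comes from NVIDIA's figures, and none of this is NVIDIA's own model.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CoreModel:
    """Toy power model for one core; every number fed in is hypothetical."""
    name: str
    leakage_mw: float   # static power burned whenever the core is powered
    cap_coeff: float    # lumped switching term, in mW per (MHz * V^2)
    v_base: float       # supply voltage extrapolated down to 0 MHz
    v_per_mhz: float    # extra volts needed per MHz of clock speed
    f_max_mhz: float    # frequency ceiling of the process variant

    def power_mw(self, f_mhz: float) -> Optional[float]:
        """Leakage plus C*V^2*f switching power, or None above the ceiling."""
        if f_mhz > self.f_max_mhz:
            return None
        volts = self.v_base + self.v_per_mhz * f_mhz
        return self.leakage_mw + self.cap_coeff * volts ** 2 * f_mhz


# Illustrative values only. Just the 500 MHz ceiling comes from the article.
COMPANION = CoreModel("companion", 5.0, 0.3, 0.80, 0.0008, 500.0)
MAIN = CoreModel("main", 60.0, 0.3, 0.70, 0.0003, 1200.0)


def cheaper_core(f_mhz: float) -> str:
    """Name of the core that reaches f_mhz at the lowest modelled power."""
    options = []
    for core in (COMPANION, MAIN):
        power = core.power_mw(f_mhz)
        if power is not None:
            options.append((power, core.name))
    if not options:
        raise ValueError(f"no modelled core reaches {f_mhz} MHz")
    return min(options)[1]


if __name__ == "__main__":
    for f in (100, 300, 400, 500, 1000):
        print(f, "MHz ->", cheaper_core(f))
```

With these made-up coefficients the low-leakage companion wins at 100 and 300 MHz, while the fast core is already cheaper by 400 MHz because it needs less voltage to reach the same clock. The real crossover point depends on silicon data that is not public here.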
Core management based on workload - Courtesy NVIDIA
When in a very low power or idle state, only the lower power companion core will be awake and active, handling tasks like incoming email, audio playback and other data synchronization. Once the system wakes up and more intensive tasks are started, one or more of the four cores based on the high performance transistors will become active while the companion CPU goes to sleep. NVIDIA has also integrated the capability for any of the primary cores to be put in a low power sleep state thanks to aggressive gating techniques, and in some ways the addition of the "companion" core is an attempt to extend this approach to power management.
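The switching behavior described above can be sketched as a simple policy function. NVIDIA has not published the actual decision logic in the material covered here, so the inputs (`demand_mhz`, `runnable_threads`), the 1000 MHz per-core figure and the function name are all assumptions made for illustration. Only the 500 MHz companion ceiling, the four main cores and the "companion OR one to four main cores" rule come from the article.

```python
import math

COMPANION_MAX_MHZ = 500   # companion core ceiling, per NVIDIA's figures
MAIN_CORE_COUNT = 4       # number of high performance cores


def select_cores(demand_mhz, runnable_threads, main_core_mhz=1000):
    """Pick a cluster and core count for the current workload.

    Hypothetical policy, not NVIDIA's algorithm. Returns a tuple of
    (cluster_name, active_core_count); the two clusters never run together.
    """
    # Light, single-threaded work fits on the companion core alone.
    if runnable_threads <= 1 and demand_mhz <= COMPANION_MAX_MHZ:
        return ("companion", 1)

    # Otherwise wake just enough main cores to cover the demand, without
    # waking more cores than there are threads to run on them.
    needed = math.ceil(demand_mhz / main_core_mhz)
    cores = min(MAIN_CORE_COUNT, needed, max(1, runnable_threads))
    return ("main", max(1, cores))


if __name__ == "__main__":
    print(select_cores(200, 1))    # background sync style load
    print(select_cores(800, 1))    # one busy thread
    print(select_cores(3500, 4))   # heavy multithreaded load
```

A real implementation would also need hysteresis so the chip does not bounce between the two modes when the load hovers near a threshold. That detail is left out to keep the sketch short.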
vSMP offers some interesting characteristics of its own that are impressive from a processor design perspective, like the ability for the companion and the primary cores to share the same L2 cache. This removes any need for additional transistors dedicated to just the companion core, saving die space and avoiding the performance penalties usually involved with data transfers from cache to cache. Also, the operating system and applications have no need to be aware of this additional processor core or the architectural complexities, as the hardware layer is responsible for switching between the two "modes" of operation.
Power savings thanks to vSMP - Courtesy NVIDIA
The end result is lower power consumption at idle AND lower power consumption at peak processing times. All of this is because the "high performance" cores are tweaked to run at higher speeds at lower voltages, leaving the "companion" core to run on different low power transistors during slow or idle periods. NVIDIA claims that Kal-El will be more efficient than the currently shipping dual-core Tegra 2 because of the design innovations discussed here.
Data showing Kal-El using 2-3x less power than competing dual-core CPUs - Courtesy NVIDIA
In fact, NVIDIA's numbers in the provided white paper indicate that the quad-core Kal-El processor can match the performance of other companies' current dual-core processors at 2-3x LOWER power consumption. The top entry in the table above shows the Kal-El SoC downclocked (480 MHz) and running on four cores to match the performance levels of the OMAP4 and QC8660 CPUs. While performance levels are nearly the same, Kal-El is using about 2.8x less power than either of the competing platforms. When all four cores on Kal-El are clocked up to 1 GHz (likely less than the top speed it will offer at launch), the power consumption is still lower than either TI's or Qualcomm's chip while offering more than twice the performance.
Kal-El power consumption and performance compared to competition - Courtesy NVIDIA
All of this comes at a cost though - in a very literal fashion. The real-world cost falls on NVIDIA's customers, who are going to be buying a bigger chip because of the decision to include a fifth ARM Cortex A9 core in a chip that is going to be billed as a "quad-core" unit. The power consumption benefits that NVIDIA is no doubt seeing from this decision (as well as the performance gains from being able to use higher performing silicon on the quad cores) come at the price of a larger die in devices that tend to need to be smaller and more compact. The design choice is interesting though, and addresses the problem of "power versus performance" in a unique way. Even though this decision could affect TSMC's ability to produce the processor reliably, NVIDIA desperately needs to differentiate its product to gain ground in a market of entrenched competitors.