ARM Tech Day 2016: Introducing Cortex-A73, Mali-G71, and CCI-550
New Products for 2017
PC Perspective was invited to Austin, TX on May 11 and 12 to participate in ARM’s yearly tech day. Also invited were a handful of editors and analysts who cover the PC and mobile markets. Those folks were all pretty smart, so it is confusing as to why they invited me. Perhaps word of my unique talent for screenshotting PDFs into near-unreadable JPGs preceded me? Regardless of the reason, I was treated to two full days of in-depth discussion of the latest generation of CPU and GPU cores, 10nm test chips, and information on new licensing options.
Today ARM is announcing their next CPU core with the introduction of the Cortex-A73. They are also unwrapping the latest Mali-G71 graphics technology. Other technologies such as the CCI-550 interconnect are also being revealed. It is a busy and important day for ARM, especially in light of Intel seemingly abandoning the low-power mobile market.
ARM announced the Cortex-A72 in February 2015. Since then it has appeared in most flagship mobile devices from late 2015 and throughout 2016. The market continues to evolve, and changing workloads and form factors have pushed ARM to keep developing and improving their CPU technology.
The Sophia Antipolis, France design group is behind the new A73. The previous several high-end core architectures, including the A72, had been developed by the Austin group. As such, the new design differs quite dramatically from the previous A72. I was actually somewhat taken aback by the differences in the design philosophy of the two groups and the changes between the A72 and A73, but the generational jumps we have seen in the past now make a bit more sense to me.
The marketplace is constantly changing when it comes to workloads and form factors. More and more complex applications are being ported to mobile devices, including hot technologies like AR and VR. Other demanding workloads include 3D/360-degree video, cameras above 20 MP, and 4K/8K displays and their video playback formats. Form factors, on the other hand, have continued to shrink, especially in overall height. We have relatively large screens on most premium devices, but designers have continued to make these phones thinner and thinner over the years. This has put a lot of pressure on ARM and their partners to increase performance while keeping TDPs in check, and even reducing them to fit within the thermal envelope of these extremely thin devices.
The focus for the design of the A73 is simply to increase overall performance, decrease power, and leverage the very latest process nodes from a variety of pure-play foundries. Sounds pretty simple, right? Obviously not. ARM has taken these performance and power considerations and focused their attention on keeping clock speeds up while improving IPC and efficiency (both from a power and a pipeline perspective). The Sophia group started from the ground up with a new design that does not derive from the previous A72.
The base design features a 2-wide superscalar engine with dual decode. The previous A72 featured a triple decode engine. Our first reaction is of course, “Higher numbers of units means better performance!” This is not necessarily true, and ARM has managed to improve IPC in a variety of ways, all the while simplifying the design. The core is code named “Artemis”, which is the product we covered in our Artemis/10nm article a few weeks back. It is designed to be one of the smallest cores that ARM has introduced, with a die size of 0.65 mm² per core on the 10nm process.
Quite a bit of performance and efficiency has been gained by going to a dual decode unit. ARM has implemented an instruction-fusion capability that allows multiple instructions to be fused and dispatched at once. They have also reduced the number of instructions that have to be split into micro-ops. Previously, more complex instructions were split into these micro-ops, which increased latency and consumed extra clock cycles. By radically reworking the front end, ARM is allowing greater efficiency in instruction decode and dispatch. This improves performance, decreases complexity, and gives a good jump in power efficiency.
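To make the "narrower can still be faster" argument a bit more concrete, here is a toy Python sketch of the bookkeeping involved. It is not ARM's front end and does not model any real pipeline: the instruction stream, the `complex` and `fuse_next` flags, the two-micro-ops-per-complex-instruction figure, and the helper names are all made up for illustration. It simply counts how many dispatch slots a stream needs when complex instructions are split, versus when adjacent pairs are fused and nothing is split.

```python
import math

# Hypothetical instruction stream. The op names and flags are illustrative
# assumptions, not real encodings or documented A72/A73 behavior.
#   complex   -> the "splitting" front end breaks this into several micro-ops
#   fuse_next -> the "fusing" front end can merge this with the next instruction
STREAM = [
    {"op": "cmp", "complex": False, "fuse_next": True},
    {"op": "b.ne", "complex": False, "fuse_next": False},
    {"op": "ldp", "complex": True, "fuse_next": False},
    {"op": "add", "complex": False, "fuse_next": False},
    {"op": "cmp", "complex": False, "fuse_next": True},
    {"op": "b.eq", "complex": False, "fuse_next": False},
    {"op": "stp", "complex": True, "fuse_next": False},
    {"op": "mul", "complex": False, "fuse_next": False},
]


def slots_with_splitting(stream, uops_per_complex=2):
    """Dispatch slots needed when every complex instruction is split."""
    return sum(uops_per_complex if ins["complex"] else 1 for ins in stream)


def slots_with_fusion(stream):
    """Dispatch slots needed when fusable pairs share one slot and
    (by assumption in this toy model) nothing is split."""
    slots = 0
    i = 0
    while i < len(stream):
        # A fusable instruction consumes its neighbor as well, if one exists.
        if stream[i]["fuse_next"] and i + 1 < len(stream):
            i += 2
        else:
            i += 1
        slots += 1
    return slots


def dispatch_cycles(slots, width):
    """Cycles to issue `slots` entries through a front end `width` slots wide."""
    return math.ceil(slots / width)


if __name__ == "__main__":
    wide = dispatch_cycles(slots_with_splitting(STREAM), width=3)
    narrow = dispatch_cycles(slots_with_fusion(STREAM), width=2)
    print(f"3-wide with splitting: {wide} cycles")
    print(f"2-wide with fusion:    {narrow} cycles")
```

With this particular made-up stream the 2-wide fusing model needs fewer cycles than the 3-wide splitting model, but that outcome depends entirely on the instruction mix chosen here. Real gains depend on the actual code being run and on pipeline details this sketch ignores.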
The A73 can be combined with the A53 for big.LITTLE configurations. One of the optimal configurations for upcoming midrange phones will be hex-core units featuring two A73s and four A53s. This takes up about the same amount of die space, but improves per-thread performance as well as multithreaded performance. It seems to be a nice compromise that we will likely see showing up in quite a few handsets.