ARM Tech Day 2016: Introducing Cortex-A73, Mali-G71, and CCI-550

Subject: General Tech
Manufacturer: ARM

New Products for 2017

PC Perspective was invited to Austin, TX on May 11 and 12 to participate in ARM’s yearly tech day.  Also invited were a handful of editors and analysts who cover the PC and mobile markets.  Those folks were all pretty smart, so it is confusing as to why they invited me.  Perhaps word of my unique talent for screenshotting PDFs into near-unreadable JPGs preceded me?  Regardless of the reason, I was treated to two full days of in-depth discussion of the latest generation of CPU and GPU cores, 10nm test chips, and information on new licensing options.

Today ARM is announcing their next CPU core with the introduction of the Cortex-A73.  They are also unwrapping the latest Mali-G71 graphics technology.  Other technologies such as the CCI-550 interconnect are also revealed.  It is a busy and important day for ARM, especially in light of Intel seemingly abandoning the sub-1000 milliwatt mobile market.

ARM previously announced the Cortex-A72 in February 2015.  Since that time it has been seen in most flagship mobile devices in late 2015 and throughout 2016.  The market continues to evolve, and the changing workloads and form factors have pushed ARM to continue to develop and improve their CPU technology.

The Sophia Antipolis, France design group is behind the new A73.  The previous several core architectures had been developed by the Cambridge group.  As such, the new design differs quite dramatically from the previous A72.  I was actually somewhat taken aback by the differences in the design philosophy of the two groups and the changes between the A72 and A73, but the generational jumps we have seen in the past now make a bit more sense to me.

The marketplace is constantly changing when it comes to workloads and form factors.  More and more complex applications are being ported to mobile devices, including hot technologies like AR and VR.  Other technologies include 3D/360 degree video, greater than 20 MP cameras, and 4K/8K displays and their video playback formats.  Form factors on the other hand have continued to decrease in size, especially in overall height.  We have relatively large screens on most premium devices, but the designers have continued to make these phones thinner and thinner throughout the years.  This has put a lot of pressure on ARM and their partners to increase performance while keeping TDPs in check, and even reducing them so they more adequately fit in the TDP envelope of these extremely thin devices.

The focus for the design of the A73 is simply to increase overall performance, decrease power, and leverage the very latest process nodes from a variety of pure-play foundries.  Sounds pretty simple, right?  Obviously not.  ARM has taken these performance and power considerations and focused their attention on keeping clockspeeds up while improving IPC and efficiency (both from a power and pipeline perspective).  The Sophia group started from the ground up with a new design that does not derive from the previous A72.

The base design features a 2-wide superscalar engine with dual decode.  The previous A72 featured a triple decode engine.  Our first reaction is of course, “Higher numbers of units means better performance!”  This is not necessarily true, and ARM has managed to improve IPC through a variety of ways, all the while simplifying the design.  The core is code named “Artemis”, which is the product we covered in our Artemis/10nm article a few weeks back.  This is designed to be one of the smallest cores that ARM has introduced with a die size of 0.65mm sq. per core on the 10nm process.

Quite a bit of performance and efficiency has been gained by going to a dual decode unit.  ARM has implemented an instruction-fusion capability that allows multiple instructions to be fused and dispatched at once.  They also have reduced the number of instructions that have to be split into micro-ops.  Previously, more complex instructions were split into these micro-ops, which increases latency and consumes extra clock cycles.  By radically reworking the front end, ARM is allowing greater efficiency in instruction decode and dispatch.  This improves performance, decreases complexity, and gives a good jump in power efficiency.
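
To make the fusion idea a little more concrete, here is a toy Python sketch of why a narrower decoder can still keep up when adjacent instructions are fused into a single decode slot.  This is purely illustrative: the fusible pairs, the function names, and the sample instruction stream are all assumptions of mine, and ARM has not published the A73's actual fusion rules in this form.

```python
# Toy model only: the fusible pairs below are hypothetical and are NOT
# ARM's actual fusion rules for the Cortex-A73.
FUSIBLE_PAIRS = {("cmp", "bne"), ("movw", "movt")}


def fuse(instructions):
    """Greedily merge adjacent instructions that form a fusible pair."""
    fused = []
    i = 0
    while i < len(instructions):
        pair = tuple(instructions[i:i + 2])
        if pair in FUSIBLE_PAIRS:
            # Two instructions now occupy a single decode slot.
            fused.append("+".join(pair))
            i += 2
        else:
            fused.append(instructions[i])
            i += 1
    return fused


def decode_cycles(slots, width=2):
    """Cycles needed to push the decode slots through a `width`-wide decoder."""
    return -(-len(slots) // width)  # ceiling division


# Hypothetical six-instruction stream.
program = ["movw", "movt", "add", "cmp", "bne", "ldr"]
fused_program = fuse(program)

print(fused_program)
print(decode_cycles(program), "decode cycles without fusion")
print(decode_cycles(fused_program), "decode cycles with fusion")
```

In this made-up stream, six instructions collapse into four decode slots, so the 2-wide decoder needs fewer cycles to get through them.  Real hardware is far more involved (dependencies, issue queues, and so on), but the basic arithmetic is the point.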

The A73 can be combined with the A53 for big.LITTLE configurations.  One of the optimal configurations for upcoming midrange phones will be hex-core units featuring two A73s and four A53s.  This takes up about the same amount of die space, but improves per-thread as well as multithreaded performance.  It seems to be a nice compromise that we will likely see showing up in quite a few handsets.

May 30, 2016 | 08:38 PM - Posted by MingLord (not verified)

Is it the sub milliWatt market Intel has left? I thought power consumption of Mali type chips would be more in the Watt range, unless it's stand-by power that is being measured?

Otherwise a nice article. Hopefully phones might start lasting all day for longer soon, and not need charging more and more often as your battery degrades over the months, till eventually you are lucky if you get a morning out of them!

May 30, 2016 | 11:46 PM - Posted by Josh Walrath

It should be "sub 1000 milliwatts"... I gotta find that sentence and change it!

May 30, 2016 | 10:22 PM - Posted by Anonymous (not verified)

You are missing the slides that show the GPU's variable latency clause handling and the ability to split a clause and do work on another unrelated clause while the latency is hidden and other work performed keeping the Quad execution resources operating at a better overall utilization on the clause level of scheduling for execution resources utilization efficiency.

This is described in AnandTech's deep dive into the Mali-G71/Bifrost micro-arch.(1)

Man, AnandTech's two articles, one on the new A73/Artemis CPU core's micro-arch(2), and the one on the Mali-G71/Bifrost GPU micro-arch, are up there on the same level with Microprocessor Report (paywalled) articles this time around! I hope that AnandTech can keep that author around for when AMD's K12 needs to be reviewed! That AnandTech author definitely has a chip arch/design background, and both of those articles are damn good for a publication outside of a paywall.

It's also interesting to see talk about Vulkan and SPIR-V as an alternative (temporary or not?) for an HSA solution instead of HSAIL, in the AnandTech Mali-G71/Bifrost article.(1)

“From a software standpoint, it’s interesting to note that ARM has gone with an OpenCL 2.0-centric approach, intending to make the functionality accessible through that and related (SPIR-V utilizing) APIs such as Vulkan. G71 however does not support the Heterogeneous System Architecture’s HSAIL standard, this despite ARM being a member of the HSA Foundation. ARM did not have too much to say on the matter, but has stated that they never “totally bought into” HSAIL. OpenCL 2.0, by comparison, is a more generic implementation at the API level, leaving ARM to sort out the low level details as they see fit.

At this point heterogeneous compute is still a long term play for ARM. The potential performance improvements are, in the right scenarios, very significant. And using the GPU instead of the CPU is again a sound move when there’s lots of suitable parallel work to throw at it, especially in SoCs where power efficiency is so critical. But it will take time to bring software developers on board, so while the hardware will soon be here, it will take some time for the software to catch up.”(1)

(1) "ARM Unveils Next Generation Bifrost GPU Architecture & Mali-G71: The New High-End Mali"

(2) "The ARM Cortex A73 - Artemis Unveiled"

May 30, 2016 | 11:44 PM - Posted by Josh Walrath

You should check out page 3 where I describe clause handling and... have a slide showing it!

May 31, 2016 | 09:13 AM - Posted by Anonymous (not verified)

Yes I looked, but there are some missing clause slides that describe the variable clause handling/scheduling on the GPU that are still not included. Read the AnandTech articles; they are very deep dives into the new A73/Mali-G71 micro-archs, and I mean really impressive for any publication that is not paywalled. I hope that they don't lose that author to the paywalled publications. It's bad enough that Anand was hired away by Apple, and this new author appears to really be good; he even made some of his own diagrams for his comparisons between ARM Holdings' A72/related line of CPU micro-archs and ARM Holdings' A73/related micro-archs (in his A73 article). So there is really a lot of work there by the author to really dive deep into things. I'm really impressed by his work!

Your article includes more info on the CCI-550, but AnandTech's deep deep dives are really impressive to read, and that's without Anand being around with his great work/contributions. I really hope AnandTech can keep that author around so he can do an article about AMD's K12 when it is officially announced.

May 31, 2016 | 11:15 AM - Posted by Josh Walrath

I could be reading this wrong, but I guess you are hinting that you sorta like that author over at Anand's?

I had 32 slides on 4 pages.  We had about 10 press decks to choose from.  Sorry I couldn't post every single one.  I think there were in total about 150 slides.  Really had to squeeze down what to use to give the best overall understanding without just making it one massive slide presentation.

May 31, 2016 | 12:55 PM - Posted by Anonymous (not verified)

No problems with the total information that you provided, as there is so much new information and new technology coming online. So I'll read all of the articles across many online sources, and you have covered things that other articles have not included, so I'm reading yours and all the others that are out there. It's just those extra few slides on the variable latency clause scheduling on the new Mali GPU that are very interesting in comparison to the other mobile/desktop GPU makers' hardware/thread scheduling methods on their GPU SKUs. ARM Holdings have been very busy since they released the A72 and earlier Mali GPUs, and that is some very innovative design/engineering for the A73 and Mali-G71.

Do you have any links to the ARM press deck offerings? Do any/all of the press decks have PDF/other links that are allowed to be shared, or even white papers from ARM Holdings? That's a lot of technical information that ARM has provided, and with COMPUTEX going on, it must be near impossible to keep up with all the new GPU/CPU/other technology information and hardware that has been premiered over the past month, to go along with the flood of information coming from COMPUTEX, especially from AMD and Nvidia.

It's going to take months of reading for sure just trying to stay on top of things with so much change happening in such a small amount of time. 2016 is going to get even more interesting with Zen and other information releases coming over the remainder of the year. I guess that everyone at PCPer will be benchmarking/reviewing some new AM4 motherboards and some Bristol Ridge CPU (soon to be swapped out with Zen) SKUs with some Polaris GPUs over the next few weeks until the NDAs fully lift. Thanks for all of your hard work.

December 7, 2017 | 07:20 PM - Posted by Catherine (not verified)

FYI - thought it would be worth adding that ARM was a recipient of our 2017 Business Sustainability Award:
