Addressing New Markets
Machine Learning is one of the hot topics in technology, and certainly one that is growing at a very fast rate. Applications such as facial recognition and self-driving cars are powering much of the development going on in this area. So far we have seen CPUs and GPUs being used in ML applications, but in most cases these are not the most efficient ways of doing these highly parallel but relatively computationally simple workloads. New chips have been introduced that are far more focused on machine learning, and now it seems that ARM is throwing their hat into the ring.
ARM is introducing three products under the Project Trillium brand: an ML processor, an OD (Object Detection) processor, and an ARM-developed Neural Network software stack. This project came as a surprise for most of us, but in hindsight it is a logical avenue for them to address as it will be incredibly important moving forward. Currently many applications that require machine learning are not processed at the edge, namely in the consumer’s hand or on a device right next to them. Workloads may be requested from the edge, but most of the heavy duty processing occurs in datacenters located all around the world. This requires communication, and sometimes pretty hefty levels of bandwidth. If either of those is missing, applications requiring ML break down.
Subject: General Tech | October 31, 2017 - 11:04 PM | Josh Walrath
Tagged: MMU-600, Mali-G72, Mali-D71, Mali GPU, mali, Chi, Assertive Display 5, arm, AMBA
Not much can stand in the way of progress. This is particularly true in the mobile market. The competition is so fierce that we have seen yearly refreshes that push the feature and quality levels to new heights. Several years back we saw Apple release their high DPI displays and the rest of the industry followed. We have seen Android and iOS add new software features and capabilities into their products that have pushed the limits of the CPUs and GPUs of these phones. Now we are entering a new era with AR and VR capabilities in phones, and that is only pushing the performance envelope further for handheld devices that may consume only a couple of watts at full power.
One area that has needed an upgrade for a while is the display capabilities cooked into the latest ARM processors. Upcoming high end phones that also support VR will need to display 4K resolutions at 120 Hz, and that is a demanding target. Previously units have been limited to 4K/30 or, in higher end phones, 4K/60 capabilities. With VR being a push in mobile as well as other features that require high resolution displays and high refresh rates, it was imperative for ARM to update their technology on this front.
Previously known as “Cetus”, ARM’s latest display technology comprises three functional units. The Mali-D71 is based on the new Komeda display architecture and can handle the aforementioned 4K resolution at 120 Hz. The second is the MMU-600, a memory management unit that is tightly coupled with the D71 to provide the high bandwidth and low latency memory accesses needed to achieve that 4K/120 spec. Finally, the Assertive Display 5 unit helps the D71 provide HDR support across a wide range of specifications.
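To put the 4K/120 target in perspective, here is a rough sketch of the raw pixel rates involved. It assumes that "4K" means UHD (3840 x 2160) and ignores blanking intervals, bit depth, and compression; the function name is purely illustrative and not part of any ARM software.

```python
# Rough pixel-rate arithmetic for the display modes mentioned above.
# Assumption: "4K" means UHD, 3840 x 2160; blanking intervals are ignored.

UHD_WIDTH = 3840
UHD_HEIGHT = 2160


def pixel_rate(width: int, height: int, refresh_hz: int) -> int:
    """Return the number of pixels that must be delivered each second."""
    return width * height * refresh_hz


if __name__ == "__main__":
    for hz in (30, 60, 120):
        rate = pixel_rate(UHD_WIDTH, UHD_HEIGHT, hz)
        print(f"4K/{hz}: {rate / 1e6:.0f} million pixels per second")
```

Under these assumptions 4K/120 works out to roughly a billion pixels per second, four times the 4K/30 figure.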
The new display processor is highly associated with the latest Mali GPU cores, but with enough work a 3rd party licensee could adapt it to another GPU architecture. This is obviously not the most efficient way of using this technology as it is regarded as a turnkey solution for the Mali GPU products. ARM has developed the software stack for both Android and Linux, and if needed it can develop Windows based drivers to fully leverage the features of this latest product. It is easily attached to 3rd party panel interfaces.
The D71 is somewhat unique in that it adds a tremendous number of features and a lot of speed, but is highly area efficient as compared to previous products. It takes up about half the area of the previous DP-650 unit, but because of the overall design and specialized hardware support in the D71 it features twice the pixel throughput at about 70% of the power consumption. This is an excellent example of inspired design overcoming previous generation limitations.
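Those three ratios can be combined into rough efficiency figures. The sketch below uses only the approximate numbers quoted above (half the area, twice the throughput, about 70% of the power); the constant and function names are illustrative, and the results are back-of-the-envelope estimates rather than measured data.

```python
# Relative efficiency of the D71 versus the DP-650, using only the
# approximate ratios quoted in the text (not measured data).

AREA_RATIO = 0.5        # D71 area relative to DP-650 ("about half")
THROUGHPUT_RATIO = 2.0  # D71 pixel throughput relative to DP-650 ("twice")
POWER_RATIO = 0.7       # D71 power relative to DP-650 ("about 70%")


def relative_gain(throughput_ratio: float, cost_ratio: float) -> float:
    """Throughput per unit of cost (area or power), relative to the baseline."""
    return throughput_ratio / cost_ratio


if __name__ == "__main__":
    per_watt = relative_gain(THROUGHPUT_RATIO, POWER_RATIO)
    per_area = relative_gain(THROUGHPUT_RATIO, AREA_RATIO)
    print(f"Throughput per watt: {per_watt:.2f}x the DP-650")
    print(f"Throughput per unit area: {per_area:.2f}x the DP-650")
```

Taken at face value, the quoted figures imply a bit under three times the throughput per watt and about four times the throughput per unit of die area.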
MMU-600 is a lynchpin in the operation as it provides advanced memory management which improves bandwidth and lowers latency dramatically as compared to the previous unit. It is tightly designed with the D71 and is highly optimized to work with the latest AMBA/CHI interconnect and Mali-G series of GPUs.
The final piece of this release is the Assertive Display 5 functionality. This provides extensive HDR support with a wide variety of panels. It is highly programmable and can provide HDR-like performance even to SDR displays. It has native HDR 10 and HLG support as well as converting HDR content to SDR. It implements blue light filtering in hardware as well as compensation for ambient light using the device sensors. ARM tries to ensure the best possible picture from the screen no matter the conditions.
The latest ARM display solution overcomes many of the limitations of the previous unit as well as adds a few new wrinkles with Assertive Display 5. It can provide top end VR and HDR experiences, as long as the GPU portion of the device can keep up with the needs of the software. ARM has removed a pretty significant hurdle to providing a rich visual experience with handheld devices.
Subject: Processors | October 24, 2017 - 02:12 AM | Josh Walrath
Tagged: arm, cortex, mali, PSA, security, TrustZone, Platform Security Architecture, amd, cortex-m, Armv8-m
It is no wonder that device security dominates news. Every aspect of our lives is approaching always connected status. Whether it is a major company forgetting to change a default password or an inexpensive connected webcam that is easily exploitable, security is now more important than ever.
ARM has a pretty good track record in providing solutions to their partners to enable a more secure computing experience in this online world. Their first entry to address this was SecurCore which was introduced in 2000. Later they released their TrustZone in 2003. Eventually that technology made it into multiple products as well as being adopted by 3rd party chip manufacturers.
Today ARM is expanding the program with this PSA announcement. Platform Security Architecture is a suite of technologies that encompasses software, firmware, and hardware. ARM technology has been included in over 100 billion chips shipped since 1991. ARM expects that another 100 billion will be shipped in the next four years. To get a jump on the situation ARM is introducing this comprehensive security architecture to enable robust security features for products from the very low end IoT to the highest performing server chips featuring ARM designs.
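The scale of that forecast is easier to see as yearly averages. The sketch below assumes "since 1991" is counted up to 2017, the year of this announcement, and treats both 100 billion figures as round numbers; the names are illustrative.

```python
# Average shipment rates implied by the figures quoted above.
# Assumption: "since 1991" is counted up to 2017, the year of the announcement.

PAST_CHIPS = 100e9        # chips shipped since 1991
PAST_YEARS = 2017 - 1991  # 26 years
FUTURE_CHIPS = 100e9      # chips ARM expects to ship next
FUTURE_YEARS = 4          # "in the next four years"


def chips_per_year(chips: float, years: float) -> float:
    """Average number of chips shipped per year over the period."""
    return chips / years


if __name__ == "__main__":
    past = chips_per_year(PAST_CHIPS, PAST_YEARS)
    future = chips_per_year(FUTURE_CHIPS, FUTURE_YEARS)
    print(f"Historical average: {past / 1e9:.1f} billion chips per year")
    print(f"Forecast average:   {future / 1e9:.1f} billion chips per year")
    print(f"Ratio: {future / past:.1f}x")
```

On these assumptions the forecast averages 25 billion chips per year, about 6.5 times the historical average.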
PSA is not being rolled out in any single product today. It is a multi-year journey for ARM and its partners and it can be considered a framework to provide enhanced security across a wide variety of products. The first products to be introduced using this technology will be the Armv8-M class of processors. Cortex-M processors with Trusted Firmware running on the Mbed OS will be the start of the program. Eventually it will branch out into other areas, but ARM is focusing much of its energy on the IoT market and ensuring that there is a robust security component to what could eventually scale out to be a trillion connected products.
There are two new hardware components attached to PSA. The first is the CryptoIsland 300 on-die security enclave. It is essentially a second layer of hardware security beyond that of the original TrustZone. The second is the SDC-600. This is a secure debug port that can be enabled and disabled using certificates. This cuts off a major avenue for security issues. These technologies are integrated into the CPUs themselves and are not offered as a 3rd party chip.
If we truly are looking at 1 trillion connected devices over the next 10 years, security is no longer optional. ARM is hoping to get ahead of this issue by being more proactive in developing these technologies and working with their partners to get them implemented. This technology will evolve over time to include more and more products in the ARM portfolio and hopefully will be adopted by their many licensees.
ARM Refreshes All the Things
This past April ARM invited us to visit Cambridge, England so they could discuss with us their plans for the next year. Quite a bit has changed for the company since our last ARM Tech Day in 2016. They were acquired by SoftBank, but continue to essentially operate as their own company. They now have access to more funds, are less risk averse, and have a greater ability to expand in the ever growing mobile and IoT marketplaces.
The ARM of today certainly is quite different than what we had known 10 years ago when we saw their technology used in the first iPhone. The company back then had good technology, but a relatively small head count. They kept pace with the industry, but were not nearly as aggressive as other chip companies in some areas. Through the past 10 years they have grown not only in numbers, but in technologies that they have constantly expanded on. The company became more PR savvy and communicated more effectively with the press and in the end their primary users. Where once ARM would announce new products and not expect to see shipping products upwards of 3 years away, we are now seeing the company be much more aggressive with their designs and getting them out to their partners so that production ends up happening in months as compared to years.
Several days of meetings and presentations left us a bit overwhelmed by what ARM is bringing to market towards the end of 2017 and most likely beginning of 2018. On the surface it appears that ARM has only done a refresh of the CPU and GPU products, but once we start looking at these products in the greater scheme and how they interact with DynamIQ we see that ARM has changed the mobile computing landscape dramatically. This new computing concept allows greater performance, flexibility, and efficiency in designs. Partners will have far more control over these licensed products to create more value and differentiation as compared to years past.
We have previously covered DynamIQ at PCPer this past March. ARM wanted to seed that concept before they jumped into more discussions on their latest CPUs and GPUs. Previous Cortex products cannot be used with DynamIQ. To leverage that technology we must have new CPU designs. In this article we are covering the Cortex-A55 and Cortex-A75. These two new CPUs on the surface look more like a refresh, but when we dig in we see that some massive changes have been wrought throughout. ARM has taken the concepts of the previous A53 and A73 and expanded upon them fairly dramatically, not only to work with DynamIQ but also by removing significant bottlenecks that have impeded theoretical performance.
A Watershed Moment in Mobile
This past May I was invited to Austin to be briefed on the latest core innovations from ARM and their partners. We were introduced to new CPU and GPU cores, as well as the surrounding technologies that provide the basis of a modern SOC in the ARM family. We also were treated to more information about the process technologies that ARM would embrace with their Artisan and POP programs. ARM is certainly far more aggressive now in their designs and partnerships than they have been in the past, or at least they are more willing to openly talk about them to the press.
The big process news that ARM was able to share at this time was the design of 10nm parts using an upcoming TSMC process node. This was fairly big news as TSMC was still introducing parts on their latest 16nm FF+ line. NVIDIA had not even released their first 16FF+ parts to the world in early May. Apple had dual sourced their 14/16 nm parts from Samsung and TSMC respectively, but these were based on LPE and FF lines (early nodes not yet optimized to LPP/FF+). So the news that TSMC would have a working 10nm process in 2017 was important to many people. 2016 might be a year with some good performance and efficiency jumps, but it seems that 2017 would provide another big leap forward after years of seeming stagnation of pure play foundry technology at 28nm.
Yesterday we received a new announcement from ARM that shows an amazing shift in thought and industry inertia. ARM is partnering with Intel to introduce select products on Intel’s upcoming 10nm foundry process. This news is both surprising and expected. It is surprising in that it happened as quickly as it did. It is expected as Intel is facing a very different world than it had planned for 10 years ago. We could argue that it is much different than they planned for 5 years ago.
Intel is the undisputed leader in process technologies and foundry practices. They are the gold standard of developing new, cutting edge process nodes and implementing them on a vast scale. This has served them well through the years as they could provide product to their customers seemingly on demand. It also allowed them a leg up in technology when their designs may not have fit what the industry wanted or needed (Pentium 4, etc.). It also allowed them to potentially compete in the mobile market with designs that were not entirely suited for ultra-low power. x86 is a modern processor technology with decades of development behind it, but that development focused mainly on performance at higher TDP ranges.
This past year Intel signaled their intent to move out of the sub 5 watt market and cede it to ARM and their partners. Intel’s ultra mobile offerings just did not make an impact in an area that they were expected to. For all of Intel’s advances in process technology, the base ARM architecture is just better suited to these power envelopes. Instead of throwing good money after bad (in the form of development time, wafer starts, rebates) Intel has stepped away from this market.
This leaves Intel with a problem. What to do with extra production capacity? Running a fab is a very expensive endeavor. If these megafabs are not producing chips 24/7, then the company is losing money. This past year Intel has seen their fair share of layoffs and slowing down production/conversion of fabs. The money spent on developing new, cutting edge process technologies cannot stop for the company if they want to keep their dominant position in the CPU industry. Some years back they opened up their process products to select 3rd party companies to help fill in the gaps of production. Right now Intel has far more production line space than they need for the current market demands. Yes, there were delays in their latest Skylake based processors, but those were solved and Intel is full steam ahead. Unfortunately, they do not seem to be keeping their fabs utilized at the level needed or desired. The only real option seems to be opening up some fab space to more potential customers in a market that they are no longer competing directly in.
The Intel Custom Foundry Group is working with ARM to provide access to their 10nm HPM process node. Initial production of these latest generation designs will commence in Q1 2017 with full scale production in Q4 2017. We do not have exact information as to what cores will be used, but we can imagine that they will be Cortex-A73 and A53 parts in big.LITTLE designs. Mali graphics will probably be the first to be offered on this advanced node as well due to the Artisan/POP program. Initial customers have not been disclosed and we likely will not hear about them until early 2017.
This is a big step for Intel. It is also a logical progression for them when we look over the changing market conditions of the past few years. They were unable to adequately compete in the handheld/mobile market with their x86 designs, but they still wanted to profit off of this ever expanding area. The logical way to monetize this market is to make the chips for those that are successfully competing here. This will cut into Intel’s margins, but it should increase their overall revenue base if they are successful here. There is no reason to believe that they won’t be.
The last question we have is if the 10nm HPM node will be identical to what Intel will use for their next generation “Cannonlake” products. My best guess is that the foundry process will be slightly different and will not provide some of the “secret sauce” that Intel will keep for themselves. It will probably be a mobile focused process node that stresses efficiency rather than transistor switching speed. I could be very wrong here, but I don’t believe that Intel will open up their process to everyone that comes to them hat in hand (AMD).
The partnership between ARM and Intel is a very interesting one that will benefit customers around the globe if it is handled correctly from both sides. Intel has a “not invented here” culture that has both benefited it and caused it much grief. Perhaps some flexibility on the foundry side will reap benefits of its own when dealing with very different designs than Intel is used to. This is a titanic move from where Intel probably thought it would be when it first started to pursue the ultra-mobile market, but it is a move that shows the giant can still positively react to industry trends.
New Products for 2017
PC Perspective was invited to Austin, TX on May 11 and 12 to participate in ARM’s yearly tech day. Also invited were a handful of editors and analysts that cover the PC and mobile markets. Those folks were all pretty smart, so it is confusing as to why they invited me. Perhaps word of my unique talent of screenshotting PDFs into near-unreadable JPGs preceded me? Regardless of the reason, I was treated to two full days of in-depth discussion of the latest generation of CPU and GPU cores, 10nm test chips, and information on new licensing options.
Today ARM is announcing their next CPU core with the introduction of the Cortex-A73. They are also unwrapping the latest Mali-G71 graphics technology. Other technologies such as the CCI-550 interconnect are also revealed. It is a busy and important day for ARM, especially in light of Intel seemingly abandoning the sub-5 watt mobile market.
ARM previously announced the Cortex-A72 in February, 2015. Since that time it has been seen in most flagship mobile devices in late 2015 and throughout 2016. The market continues to evolve, and as such the workloads and form factors have pushed ARM to continue to develop and improve their CPU technology.
The Sophia Antipolis, France design group is behind the new A73. The previous several core architectures had been developed by the Cambridge group. As such, the new design differs quite dramatically from the previous A72. I was actually somewhat taken aback by the differences in the design philosophy of the two groups and the changes between the A72 and A73, but the generational jumps we have seen in the past now make a bit more sense to me.
The marketplace is constantly changing when it comes to workloads and form factors. More and more complex applications are being ported to mobile devices, including hot technologies like AR and VR. Other technologies include 3D/360 degree video, greater than 20 MP cameras, and 4K/8K displays and their video playback formats. Form factors on the other hand have continued to decrease in size, especially in overall height. We have relatively large screens on most premium devices, but the designers have continued to make these phones thinner and thinner throughout the years. This has put a lot of pressure on ARM and their partners to increase performance while keeping TDPs in check, and even reducing them so they more adequately fit in the TDP envelope of these extremely thin devices.
10nm Sooner Than Expected?
It seems only yesterday that we had the first major GPU released on 16nm FF+ and now we are talking about ARM about to receive their first 10nm FF test chips! Well, in fact it was yesterday that NVIDIA formally released performance figures on the latest GeForce GTX 1080 which is based on TSMC’s 16nm FF+ process technology. Currently TSMC is going full bore on their latest process node and producing the fastest current graphics chip around. It has taken the foundry industry as a whole a lot longer to develop FinFET technology than expected, but now that they have that piece of the puzzle seemingly mastered they are moving to a new process node at an accelerated rate.
TSMC’s 10nm FF is not well understood by press and analysts yet, but we gather that it is more of a marketing term than a true drop to 10 nm features. Intel has yet to get past 14nm and does not expect 10 nm production until well into next year. TSMC is promising their version in the second half of 2016. We cannot assume that TSMC’s version will match what Intel will be doing in terms of geometries and electrical characteristics, but we do know that it is a step past TSMC’s 16nm FF products. Lithography will likely get a boost with triple patterning exposure. My guess is that the back end will also move away from the “20nm metal” stages that we see with 16nm. All in all, it should be an improved product from what we see with 16nm, but time will tell if it can match the performance and density of competing lines that bear the 10nm name from Intel, Samsung, and GLOBALFOUNDRIES.
ARM has a history of porting their architectures to new process nodes, but they are being a bit more aggressive here than we have seen in the past. It used to be that ARM would announce a new core or technology, and it would take up to two years to be introduced into the market. Now we are seeing technology announcements and actual products hitting the scenes about nine months later. With the mobile market continuing to grow we expect to see products quicker to market still.
The company designed a simplified test chip to tape out and send to TSMC for test production on the aforementioned 10nm FF process. The chip was taped out in December, 2015. The design was shipped to TSMC for mask production and wafer starts. ARM is expecting the finished wafers to arrive this month.
Looking Towards 2016
ARM invited us to a short conversation with them on the prospects of 2016. The initial answer as to how they feel the upcoming year will pan out is, “Interesting”. We covered a variety of topics ranging from VR to process technology. ARM is not announcing any new products at this time, but throughout this year they will continue to push their latest Mali graphics products as well as the Cortex A72.
Trends to Watch in 2016
The one overriding trend that we will see is that of “good phones at every price point”. ARM’s IP scales from very low to very high end mobile SOCs and their partners are taking advantage of the length and breadth of these technologies. High end phones based on custom cores (Apple, Qualcomm) will compete against those licensing the Cortex A72 and A57 parts for their phones. Lower end options that are less expensive and pull less power (which then requires less battery) will flesh out the midrange and budget parts. Unlike several years ago, the products from top to bottom are eminently usable and relatively powerful products.
Camera improvements will also take center stage for many products and continue to be a selling point and an area of differentiation for competitors. Improved sensors and software will obviously be the areas that the ARM partners focus on, but ARM is putting some work into this area as well. Post processing requires quite a bit of power to do quickly and effectively. ARM is helping here by leveraging both the Neon SIMD engine and the power of the Mali GPU.
4K video is becoming more and more common as well with handhelds, and ARM is hoping to leverage that capability in shooting static pictures. A single 4K frame is around 8 megapixels in size. So instead of capturing video, the handheld can achieve a “best shot” type functionality. So the phone captures the 4K video and then users can choose the best shot available to them in that period of time. This is a simple idea that will be a nice feature for those with a product that can capture 4K video.
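The 8 megapixel figure is easy to check, and the "best shot" idea can be sketched in a few lines. This is a minimal illustration, assuming a 4K frame is UHD (3840 x 2160). The text describes users picking the frame themselves; the automatic ranking by a sharpness score shown here is a hypothetical addition, and the scores are made-up inputs rather than anything ARM has described.

```python
from typing import Sequence

# Assumption: a "4K" frame is UHD, 3840 x 2160 pixels.
UHD_WIDTH = 3840
UHD_HEIGHT = 2160


def megapixels(width: int, height: int) -> float:
    """Return the frame size in millions of pixels."""
    return width * height / 1_000_000


def best_shot(scores: Sequence[float]) -> int:
    """Return the index of the frame with the highest score.

    The scores are hypothetical; a real pipeline would derive them from
    image content (focus, motion blur, and so on).
    """
    if not scores:
        raise ValueError("no frames to choose from")
    return max(range(len(scores)), key=lambda i: scores[i])


if __name__ == "__main__":
    print(f"One 4K frame: {megapixels(UHD_WIDTH, UHD_HEIGHT):.2f} MP")
    frame_scores = [0.42, 0.91, 0.65]  # made-up sharpness scores
    print(f"Suggested frame index: {best_shot(frame_scores)}")
```

A phone could use a ranking like this to suggest a frame while still letting the user choose another one from the clip.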
Subject: Editorial | May 21, 2015 - 03:34 PM | Ken Addison
Tagged: podcast, video, amd, hbm, Fiji, g-sync, ips, XB270HU, corsair, Oculus, supermicro, asus, gladius, jem davies, arm, mali
PC Perspective Podcast #350 - 05/21/2015
Join us this week as we discuss AMD's plan for HBM, IPS G-SYNC, GameWorks and The Witcher 3, and more!
The URL for the podcast is: http://pcper.com/podcast - Share with your friends!
- iTunes - Subscribe to the podcast directly through the Store
- RSS - Subscribe through your regular RSS reader
- MP3 - Direct download link to the MP3 file
Hosts: Ryan Shrout, Jeremy Hellstrom, Josh Walrath, and Allyn Malventano
Program length: 1:24:12
Hardware/Software Picks of the Week:
Sebastian: Aukey Quick Charge 2.0 Portable Charger
Subject: Mobile | May 15, 2015 - 01:56 PM | Ryan Shrout
Tagged: video, mali, jem davies, interview, arm
Have you ever wondered how a mobile GPU is born? Or how the architecture of a mobile GPU like ARM Mali differs from the technology in your discrete PC graphics card? Perhaps you just want to know if ideas like HBM (high bandwidth memory) are going to find their way into the mobile ecosystem any time soon?
Josh and I sat down (virtually) with ARM's VP of Technology and Fellow, Jem Davies, to answer these questions and quite a bit more. The resulting interview will shed light on the design process of a mobile GPU, how you get the most out of an SoC that measures power by the milliwatt, what the world of mobile benchmarking needs to do to clean up its act and quite a bit more.
You'd be hard pressed to find a better way to spend the next hour of your day as you will without a doubt walk away more informed about the world of smartphones, tablets and GPUs.