Qualcomm Snapdragon 835 Mobile Platform Performance Preview

Author:
Subject: Processors, Mobile
Manufacturer: Qualcomm

A new start

Qualcomm is finally ready to show the world how the Snapdragon 835 Mobile Platform performs. After months of teases and previews, including the reveal that it was the first processor built on Samsung’s 10nm process technology and a mostly in-depth look at the architectural changes to the CPU and GPU portions of the SoC, the company let a handful of media get some hands-on time with the development reference platform and run some numbers.

To frame the discussion as best I can, I am going to include some sections from my technology overview. This should give some idea of what to expect from Snapdragon 835 and which areas Qualcomm sees providing the widest variation from the previous SD 820/821 products.

Qualcomm frames the story around the Snapdragon 835 processor with what they call the “five pillars” – five different aspects of mobile processor design that they have addressed with updates and technologies. Qualcomm lists them as battery life (efficiency), immersion (performance), capture, connectivity, and security.

Starting where they start, on battery life and efficiency, the SD 835 has a unique focus that might surprise many. Rather than talking up the improvements in performance of the new processor cores, or the power of the new Adreno GPU, Qualcomm is firmly planted on looking at Snapdragon through the lens of battery life. The company claims Snapdragon 835 uses half of the power of Snapdragon 801.

Since we already knew that the Snapdragon 835 was going to be built on the 10nm process from Samsung, the first such high performance part to do so, I was surprised to learn that Qualcomm doesn’t attribute much of the power efficiency improvements to the move from 14nm to 10nm. It makes sense – most in the industry see this transition as modest in comparison to what we’ll see at 7nm. Unlike the move from 28nm to 14/16nm for discrete GPUs, where the process technology was a huge reason for the dramatic power drop we saw, the Snapdragon 835 changes come from a combination of advancements in the power management system and offloading of work from the primary CPU cores to other processors like the GPU and DSP. The more a workload takes advantage of heterogeneous computing systems, the more it benefits from Qualcomm technology as opposed to process technology.

But let’s get to the new CPU, the Kryo 280. The Kryo 280 is the first processor built on ARM’s new semi-custom program called “Built on ARM Cortex Technology” that allows a partner like Qualcomm to take an off-the-shelf core (such as the Cortex-A73) and make modifications to it and rebrand it. This is a shift from the previous options of ARM cores or fully custom. Qualcomm, along with Apple and Samsung, had been the best examples of custom core designs for ARM SoCs, proving that you could do better with the added work of building your own CPU cores with an existing microarchitecture.

The result is an 8-core processor with four large cores and four smaller cores, similar to what we know as the ARM big.LITTLE design. The performance cores run up to 2.45 GHz, share 2MB of L2 cache and are 20% faster than the previous generation in a range of use cases including app loads and VR. The smaller, efficiency cores will clock up to 1.9 GHz and have 1MB of L2 cache. While all eight cores have importance in the SoC, the chip spends 80% of the time running on the efficiency cores so the changes here can be more meaningful on total efficiency. Qualcomm did state that these cores would not be able to work at the same time; only the performance or efficiency cluster can be operating at one time.
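As a quick way to keep the two clusters straight, here is a minimal Python sketch that records the figures quoted above in a small data structure. The `Cluster` class and the helper names are my own illustration, not anything Qualcomm publishes, and peak clock ratio is only a rough proxy since the two clusters are assumed to use different core designs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cluster:
    """One CPU cluster; the fields mirror the numbers quoted in the text."""
    name: str
    cores: int
    max_ghz: float
    shared_l2_mb: int


# Values are the ones stated in the article; the structure is illustrative.
PERFORMANCE = Cluster("performance", cores=4, max_ghz=2.45, shared_l2_mb=2)
EFFICIENCY = Cluster("efficiency", cores=4, max_ghz=1.90, shared_l2_mb=1)
KRYO_280 = (PERFORMANCE, EFFICIENCY)


def total_cores(clusters):
    """Total number of CPU cores across all clusters."""
    return sum(c.cores for c in clusters)


def total_l2_mb(clusters):
    """Total L2 cache in MB across all clusters."""
    return sum(c.shared_l2_mb for c in clusters)


def clock_ratio(fast, slow):
    """Peak clock of the fast cluster divided by the slow cluster's."""
    return fast.max_ghz / slow.max_ghz


if __name__ == "__main__":
    print("cores:", total_cores(KRYO_280))
    print("L2 (MB):", total_l2_mb(KRYO_280))
    print("peak clock ratio:", round(clock_ratio(PERFORMANCE, EFFICIENCY), 2))
```

The performance cluster's peak clock is roughly 1.29 times that of the efficiency cluster, which says nothing about per-clock throughput but does show how wide the frequency gap between the two is.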

Just what is Qualcomm customizing on the cores and what cores are they based on? Typically, Qualcomm isn’t going into much detail, not even telling us what cores are being modified. (I mentioned the Cortex-A73 above, which would typically be paired with the Cortex-A53 in a big.LITTLE configuration.) I was able to get some nuggets of information though. The efficiency core block has minimized transaction power with an increased L2 cache size. The bus interface on the cores was adjusted to fit into Qualcomm’s heterogeneous computing offloading scheme. Branch predictors were modified as well to better match past Kryo cores.

The immersion pillar is where Qualcomm talks about performance improvements. The biggest component in this space is the new Adreno 540 GPU. It is based on the same basic design as the Adreno 5xx series of GPUs with no drastic changes to the architecture itself. Still, Qualcomm claims SD 835 has a 25% GPU performance advantage over the SD 820. Where does that come from? Again, with some vague comments throughout our meetings, I learned that engineers looked for the primary bottlenecks and addressed them with small tweaks. Z-culling was improved to minimize work on occluded pixels. Draw order independent depth projection was added. Tweaks to the ALUs. Higher order mipmaps and mipmap level swapping.

Regardless of how it’s done, a 25% increase in rendering capability should directly translate to improved gaming and VR experiences on Snapdragon 835.

Based on this information alone, I expected the Snapdragon 835 to be a modest improvement in CPU performance but a more significant jump in GPU performance compared to the SD 821 platform immediately before it. Power consumption and battery life improvements, the main draw from Qualcomm for the CPU segment of the SD 835 discussion, are more difficult to measure and impossible to accurately gauge from just a couple of hours with the reference device.

The Snapdragon 835 Mobile Platform Reference Device

Speaking of reference designs, it is worth taking a look at what Qualcomm built to share its new flagship design with partners and with media. As is traditionally the case with smart phones and tablets, reference designs are not typical of what you will see in the market. They will vary in shape, size, cooling capability, battery capacity and feature set, making today’s testing only valid for the raw performance of the SoC itself.

The specifications of said reference design ARE important however. A summary is included in the table below.

|  | Snapdragon 835 Reference Platform | Google Pixel | Huawei Mate 9 | Huawei Mate 8 |
| SoC | Snapdragon 835 | Snapdragon 821 | Kirin 960 | Kirin 950 |
| CPU Cores | Quad-core 2.45 GHz; Quad-core 1.90 GHz | 2x 2.15 GHz Kryo; 2x 1.60 GHz Kryo | 4x 2.36 GHz Cortex-A73; 4x 1.84 GHz Cortex-A53 | 4x 2.3 GHz Cortex-A72; 4x 1.8 GHz Cortex-A53 |
| GPU Cores | Adreno 540 | Adreno 530 | Mali-G71 MP8 | Mali-T880 MP4 |
| RAM | 6GB LPDDR4 | 4GB LPDDR4 | 4GB LPDDR4 | 3GB LPDDR4 |
| Network | Snapdragon X16 LTE | Snapdragon X12 LTE | Integrated LTE Cat. 12 | Integrated LTE Cat. 6 |
| Connectivity | Qualcomm VIVE 802.11ac, 2x2 MU-MIMO, tri-band Wi-Fi; Bluetooth 5.0; USB 3.0; NFC | 802.11ac Wi-Fi, 2x2 MU-MIMO; Bluetooth 4.2; USB 3.0; NFC | Dual-band 802.11ac Wi-Fi; Bluetooth 4.2; USB 2.0; NFC | Dual-band 802.11ac Wi-Fi; Bluetooth 4.2; USB 2.0; NFC |
| OS | Android 7.1.1 | Android 7.1.1 | Android 7.0 | Android 6.0 |

The Snapdragon 835 (with the product code name of MSM8998) uses an 8-core design with four large, high performance cores and four smaller, more power efficient cores. The high-performance cores are rated at 2.45 GHz while the high efficiency cores will run at up to 1.90 GHz. In line with other current flagship phones, the reference design has 6GB of LPDDR4 memory and a 2560x1440 resolution screen. The system was running Android 7.1.1.

Kryo 280 CPU Performance

Our first set of tests will look at scaling and performance of the new Kryo 280 CPU, the first from Qualcomm in an 8-core design and the first to use the semi-custom licensing and integration from ARM.

Geekbench is one of our favorite and most reliable tests for mobile devices with a development team that is responsive and aware of the complication in performance evaluation. In our overall results, the single threaded performance of the Snapdragon 835 SoC comes behind only the Apple A9 and A10 parts, long considered the pinnacle of single thread mobile performance. Compared to Qualcomm’s own SD 821 part, the 835 is 35% faster. It barely edges out the Kirin 960 and 950 processors from HiSilicon/Huawei, making it the current fastest Android silicon for single threaded workloads in our testing.

For multi-thread comparisons, the Snapdragon 835 bests all other competitors, including those from Qualcomm, HiSilicon and Apple. The SD 835 is 52% faster than the SD 821, not a surprise with the doubling of available cores. Compared to the latest Kirin 960, the SD 835 is 3.6% faster and compared to the Apple A10 Fusion SoC, SD 835 is 11% faster.
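Since "X% faster" figures are easy to misread, here is a small Python sketch that turns the multi-thread leads quoted above into relative scores, with the SD 835 normalised to 100. It uses only the percentages from this article; the function names and the normalisation to 100 are my own choices for illustration, and the raw Geekbench scores are not reproduced here.

```python
def slower_score(reference_score, percent_faster):
    """Score of the slower chip when the reference is `percent_faster` % ahead."""
    return reference_score / (1.0 + percent_faster / 100.0)


def percent_slower(percent_faster):
    """Convert 'A is p% faster than B' into 'B is q% slower than A'."""
    return 100.0 * (1.0 - 1.0 / (1.0 + percent_faster / 100.0))


# Multi-thread leads of the SD 835 quoted in the text, in percent.
SD835_LEAD = {"SD 821": 52.0, "Kirin 960": 3.6, "Apple A10 Fusion": 11.0}

# Normalise the SD 835 to 100 so the others read as a percentage of it.
relative = {chip: slower_score(100.0, lead) for chip, lead in SD835_LEAD.items()}

if __name__ == "__main__":
    for chip, score in sorted(relative.items(), key=lambda kv: -kv[1]):
        print(f"{chip}: {score:.1f} (SD 835 = 100)")
    print(f"SD 821 trails by {percent_slower(52.0):.1f}%")
```

Read this way, a 52% lead for the SD 835 means the SD 821 lands at about 66 on the same scale, or roughly 34% lower, not 52% lower.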

These aren’t insignificant numbers; the Snapdragon 821 was behind the Kirin 950 and the Apple A9 part by significant amounts and the performance delta was starting to show its age. With both the A10 and the Kirin 960 already in shipping hardware, Qualcomm needed (and this still needs to be proven out in retail devices) to offer a solution that could compete.

Our memory results from Geekbench show us drastic improvements in total system bandwidth. Reaching a peak of 3649, the Snapdragon 835 is 23.5% more performant than the SD 821 and edges past both Apple and HiSilicon for the top single-threaded result.

The Google Octane benchmark is one used to measure browser performance and all our Android-based testing was done on Chrome. iOS testing was done on Safari. Even though Safari-based testing blows the rest of the results away due to the specific OS/hardware optimizations, the Snapdragon 835 is the fastest Android hardware by 8.7% over the Kirin 960. Compared to the SD 821, this new SoC is 28% faster.

Again, we see that the Apple hardware has the advantage in our JS testing, but the SD 835 has a great showing, matching the performance of the Google Pixel phone powered by the Snapdragon 821. The Kirin 960 is approximately 9% slower.

Adreno 540 GPU Performance

The graphics subsystem of the Snapdragon 835 was expected to be notably faster than the GPU in the 821 based on the marketing and educational information provided by Qualcomm early in the release process. Does it live up to claims?

Let’s start with a look at 3DMark Slingshot in both ES 3.0 and ES 3.1 derivatives. Though Apple hardware doesn’t yet support ES 3.1, the results here put the new Snapdragon 835 and its Adreno 540 at a 30% GPU performance advantage over both the Kirin 960 and the SD 821! In the ES 3.0 results, the SD 835 is again the best performer, beating its own SD 821 by 29% and the Apple A10 by 42%!

In our GFXBench graphics testing the Snapdragon 821 already had a great showing against both Apple and HiSilicon SoCs. Under Manhattan ES 3.0, the SD 835 is 31% faster than the SD 821 and 39% faster than the A10 Fusion GPU. Those are substantial gains in performance and clearly paint the Snapdragon 835 Mobile Platform as a leader in graphics, VR and gaming on mobile platforms. The Kirin 960 is well behind in both test results: the SD 835 is 50% faster in T-Rex and 76% faster in Manhattan!

Other Pillars and Closing Thoughts

Our early performance benchmarks paint an interesting picture of the Qualcomm Snapdragon 835 Mobile Platform in its reference state. First, in terms of the new Kryo 280 CPU implementation, I find that the performance improvements are more substantial than initially expected. Improving on single threaded performance by 35% over the Snapdragon 821, and coming in close behind Apple’s flagship SoC, helps negate any complaints the market and partners would have had about previous generation parts. Despite the race to many-core designs, I still believe that single threaded scalability is the major building block of the user experience and responsiveness on mobile devices. How quickly a device can respond to your actions, based on how quickly it can scale clock speed and to what level, is one of the key points of contention.

Multi-threaded performance goes up more significantly with its octa-core design and a “big.LITTLE-like” implementation, even if they won’t call it that. Using four larger, more powerful cores with four smaller, more power efficient cores gives Qualcomm the ability to pick the best core for the workload and, in theory, allows SD 835 to be more power efficient across a host of applications and conditions. In our testing the new Snapdragon 835 is more than 50% faster than the SD 821 and moves it to the top of the list in performance.

The graphics performance of the Adreno 540 takes the already advantageous place that Qualcomm was in and improves it even further. Our results show 30%+ graphics performance gains, which help us in two ways. First, in theory, if QC implements everything correctly, better performance at peak voltage and clocks means we can operate 60 Hz frame rates with less energy and thus, with more battery life. Second, for high utilization workloads like VR, the SD 835 is clearly going to be the leader in the field. If you need mobile graphics performance, Snapdragon 835 is the best option today.
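To put rough numbers on the first point, here is a minimal Python sketch of the frame budget argument. The 13 ms render time is a hypothetical figure I picked for illustration, not a measurement, and the model assumes the GPU runs at the same voltage and clocks in both cases and simply finishes earlier. Real power behavior depends on idle power and frequency scaling, which this ignores.

```python
REFRESH_HZ = 60.0
FRAME_BUDGET_MS = 1000.0 / REFRESH_HZ  # about 16.67 ms per frame


def gpu_busy_fraction(render_ms, speedup=1.0):
    """Fraction of each 60 Hz frame the GPU spends rendering.

    render_ms is a hypothetical per-frame render time on the older GPU;
    speedup is the newer GPU's throughput relative to it (1.30 = 30% faster).
    """
    new_render_ms = render_ms / speedup
    if new_render_ms > FRAME_BUDGET_MS:
        raise ValueError("frame misses the 60 Hz budget")
    return new_render_ms / FRAME_BUDGET_MS


if __name__ == "__main__":
    old = gpu_busy_fraction(13.0)        # hypothetical 13 ms frame
    new = gpu_busy_fraction(13.0, 1.30)  # same frame on a 30% faster GPU
    print(f"GPU busy per frame: {old:.0%} -> {new:.0%}")
```

Under those assumptions, a frame that kept the older GPU busy for 78% of each 60 Hz interval would occupy the faster part for 60%, leaving more of every frame for the GPU to sit idle or drop to lower clocks.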

There are other areas where Qualcomm has improved the platform surrounding the SoC itself. This includes power efficiency and scaling changes, still photo and video capture capabilities and performance, audio shifts like the inclusion of AptX and Bluetooth 5.0, security and of course, modems that support Gigabit LTE speeds! We have covered these parts of SD 835 in several different stories, all of which are noteworthy and required reading to really understand the value that Qualcomm claims to offer Snapdragon.

Obviously, we are calling this story a preview for a reason – the success of the Snapdragon 835 will ultimately be decided by Qualcomm’s partners and their retail products. We know of a few designs announced with the Snapdragon 835 Mobile Platform including the Sony Xperia XZ Premium and the Samsung Galaxy S8 (which is apparently buying up all the inventory). When we have our hands on those phones and can see how the total solution is implemented, THEN we can make a final judgement on this impressive part. As it stands now with the performance data we have here, the excitement for the Snapdragon 835 should be at its peak. If it meets or exceeds our results here, I see no reason it shouldn’t be an amazing mobile platform.


March 22, 2017 | 05:47 AM - Posted by LauRoman

SoC...

March 22, 2017 | 08:34 AM - Posted by razor512

I wish Qualcomm would do an SOC more similar to what is found in the iphone, where much of the focus is on single threaded performance.

In many of the browser benchmarks, if the CPU usage is monitored, you will see that much of the work is single threaded, thus the A10 SOC gets a major boost.

March 22, 2017 | 09:40 AM - Posted by Mobile_Dom

I think we'd all love to see that, but I don't think Qualcomm has it in them at the moment.

The last great CPU μArch that Qualcomm had was Krait, but I think that was only because A9 stuck around longer than ARM wanted and A15's early implementations were so poor.

Since then, ARM has gotten much better at its μArch design, and I don't think Qualcomm knows what it takes anymore.

Apple's A10 Fusion chip has a single threaded perf. over 50% larger if you look at something like Geekbench.

At the moment, If Apple is Intel, Qualcomm is in danger of becoming Via.

March 22, 2017 | 04:20 PM - Posted by Anonymous (not verified)

ARM is going to do this.

March 22, 2017 | 12:40 PM - Posted by Anonymous (not verified)

The Apple A7 and above CPU designs are fully custom cores that are twice as wide order superscalar as the ARM holdings reference design cores. And this semi-custom Qualcomm 8 core design's cores probably are only 3 wide for A72/A53, or for the A73 2 regular decoders + 1(dedicated FP decoder) for 3 wide like all of the other ARM reference cores. So Qualcomm's cores are narrower than the Apple A7-A10 designs with Apple's 6(decoders) wide core designs really cranking out the per core IPCs.

If Apple eventually decides to create a custom ARMv8A ISA running micro-architecture core with SMT capabilities that efficiency will grow by 15-30% for various workloads.

IF the Qualcomm is making use of the A73(Artemis) then the front end instruction decoder is actually 2 wide +1(FP dedicated decoder) with the decoders instead of 3(A72) general decoders and any FP instructions on the Artemis(1) core are shunted to a side FP("+1") decoder/rename/dispatch for FP/Neon instructions workloads.

How much of the 835's performance comes from the process node shrink is still an unknown but there has to be some power savings and Qualcomm's Hexagon DSP processor is open to general application usage through the API! Qualcomm's value added IP like its Hexagon DSP and Adreno 540 GPU has probably more to do with the high end performance metrics than the CPU cores themselves figure into those performance metrics.

(1)

"The ARM Cortex A73 - Artemis Unveiled"

http://www.anandtech.com/show/10347/arm-cortex-a73-artemis-unveiled/2

March 22, 2017 | 09:50 AM - Posted by JDubz (not verified)

Qualcomm frames the story around the Snapdragon 835 processor with what they call the “five pillars” – five different aspects of mobile processor design that they have addressed with updates and technologies. Qualcomm lists them as battery life (efficiency), immersion (performance), connectivity, and security.

Ryan, you only listed four. You forgot to mention "Capture" as the fifth pillar. I know you don't touch on "Capture" in the article, but I just found it odd that you referenced the "five pillars" but then only listed four of them.

March 23, 2017 | 12:29 PM - Posted by Ryan Shrout

Ha, good point. :)

March 22, 2017 | 12:37 PM - Posted by Gunbuster

With windows phone dead who is Qualcomm going to get to gobble up all their sloppy seconds low performing SOC's?

March 22, 2017 | 01:11 PM - Posted by Anonymous (not verified)

Our memory results from Geekbench show us drastic improvements in total system bandwidth. Reaching a peak of 3649, the Snapdragon 835 is 23.5% more performant than the SD 821 and edges past both Apple and HiSilicon for the top single-threaded result.

Looking at the memory graph, was the 835's performance results mislabeled as Kirin 960?

March 22, 2017 | 04:39 PM - Posted by Travis Walker (not verified)

I want to see this running Windows 10. I'm very interested in seeing how it will work and perform. I doubt it would happen, but I would like to see it running Windows 10 and be paired with a graphics card, something like the 1050 Ti. This would make an awesome mobile gaming laptop if you can do that.

March 22, 2017 | 06:32 PM - Posted by Anonymous (not verified)

1050ti is ~75W. Intel U-series CPUs are 15W. This is something you can buy today.
Snap 835 is ~5W? So you dream of a laptop which is 10% more efficient than what is currently available? Dream bigger!

March 22, 2017 | 08:36 PM - Posted by Anonymous (not verified)

ARM running Windows 10 with all the translation layers needed for win32 applications is not going to happen. Add to that all the Windows 10 bloatware (spyware, adware, UWP layers) and no, this will not work on any low power ARM SKUs outside of the server room SKUs with many more than 8 ARM cores! Winblows is not the answer.

March 22, 2017 | 09:34 PM - Posted by Jeremy Hellstrom

Well, you say that ... https://www.qualcomm.com/news/releases/2016/12/08/qualcomm-collaborates-...

March 23, 2017 | 12:09 PM - Posted by Anonymous (not verified)

"and no this will not work on any low power ARM SKUs outside of the server room SKUs"

"outside of the server room SKUs"

"outside of the server room SKUs", as to say it will not happen in laptops/phones but can happen in server rooms because the server SKUs have the extra cores and higher power budgets to run that bloatware infested OS crap! Windows on ARM will require extra CPU cores and resources to run all the bloat x86 to ARMv8A ISA translation layers that are needed to support win32 applications!

Nobody wants(OEMs or others do not want) M$ in the server room on ARM anymore than they want Intel inside of their phones, nobody wants a ring through their nose. Wait and see how many folks use M$'s OS in their Arm based server rooms. Qualcomm's chips can run Linux very well and much better than Windows with its bloat that will require more CPU resources to run those x86/win32 to ARM ISA translation layers!

March 23, 2017 | 01:20 PM - Posted by Jeremy Hellstrom

RTFA.

Literally about what you claim no one wants actually happening.

March 24, 2017 | 12:37 AM - Posted by Anonymous (not verified)

Qualcomm should be boycotted for abuse of power and generally being assholes.

March 24, 2017 | 05:19 PM - Posted by Reader (not verified)

CPU performance of 16nm Kirin 960 is equal to 10nm SD 835. So, when Kirin moves to 10nm, it will be faster than 835. So what changes did Qualcomm make to the CPU cores?

March 29, 2017 | 09:19 PM - Posted by Anonymous (not verified)

I have the mate 9 at the moment. My predictions were right the SD835 is not going to beat the kirin 960 by much
