Qualcomm’s GPU History
Despite its market dominance, Qualcomm may be one of the least known contenders in the battle for the mobile space. While players like Apple, Samsung, and even NVIDIA are often cited as the most exciting and most revolutionary, none come close to the sheer sales, breadth of technology, and market share that Qualcomm occupies. Brands like Krait and Snapdragon have helped push the company into the top 3 semiconductor companies in the world, following only Intel and Samsung.
Qualcomm was founded in July 1985, when seven industry veterans came together in the den of Dr. Irwin Jacobs’ San Diego home to discuss an idea. They wanted to build “Quality Communications” (thus the name Qualcomm) and outlined a plan that evolved into one of the telecommunications industry’s great start-up success stories.
Though Qualcomm sold its own handset business to Kyocera in 1999, many of today’s most popular mobile devices are powered by Qualcomm’s Snapdragon mobile chipsets with integrated CPU, GPU, DSP, multimedia CODECs, power management, baseband logic and more. In fact the typical “chipset” from Qualcomm encompasses up to 20 different chips of different functions besides just the main application processor. If you are an owner of a Galaxy Note 4, Motorola Droid Turbo, Nexus 6, or Samsung Galaxy S5, then you are most likely a user of one of Qualcomm’s Snapdragon chipsets.
Qualcomm’s GPU History
Before 2006, the mobile GPU as we know it today was largely unnecessary. Feature phones and “dumb” phones were still the large majority of the market with smartphones and mobile tablets still in the early stages of development. At this point all the visual data being presented on the screen, whether on a small monochrome screen or with the color of a PDA, was being drawn through a software renderer running on traditional CPU cores.
But by 2007, the first fixed-function, OpenGL ES 1.0 class of GPUs started shipping in mobile devices. These dedicated graphics processors were originally focused on drawing and updating the user interface on smartphones and personal data devices. Eventually these graphics units were used for what would be considered the most basic gaming tasks.
New Features and Specifications
It is increasingly obvious that in the high end smartphone and tablet market, much like we saw occur over the last several years in the PC space, consumers are becoming more concerned with features and experiences than just raw specifications. There is still plenty to drool over when looking at and talking about 4K screens in the palm of your hand, octa-core processors and mobile SoC GPUs measuring performance in hundreds of GFLOPS, but at the end of the day the vast majority of consumers want a device that “wows” them.
As a result, device manufacturers and SoC vendors are shifting priorities for performance, features and how those are presented both to the public and to the media. Take this week’s Qualcomm event in San Diego where a team of VPs, PR personnel and engineers walked me through the new Snapdragon 810 processor. Rather than showing slide after slide of comparative performance numbers to the competition, I was shown room after room of demos. Wi-Fi, LTE, 4K capture and playback, gaming capability, thermals, antennae modifications, etc. The goal is to showcase the experience of the entire platform – something that Qualcomm has been providing for longer than just about anyone in this business, while educating consumers on the need for balance too.
As a 15-year veteran of the hardware space my first reaction here couldn’t have been scripted any more precisely: a company that doesn’t show performance numbers has something to hide. But I was given time with a reference platform featuring the Snapdragon 810 processor in a tablet form-factor and the results show impressive increases over the 801 and 805 processors from the previous family. Rumors of the chip’s heat issues seem overblown, but that part will be hard to prove for sure until we get retail hardware in our hands to confirm.
Today’s story will outline the primary feature changes of the Snapdragon 810 SoC, though there was so much detail presented at the event with such a short window of time for writing that I definitely won’t be able to get to it all. I will follow up the gory specification details with performance results compared to a wide array of other tablets and smartphones to provide some context to where 810 stands in the market.
Subject: Processors, Mobile | June 23, 2014 - 01:08 PM | Ryan Shrout
Tagged: snapdragon, qualcomm, gaming, Android, adreno
Today Qualcomm has published a 22-page white paper that keys in on the company's focus around Android gaming and the benefits that Qualcomm SoCs offer. As the dominant SoC vendor in the Android ecosystem of smartphones, tablets and handhelds (shipping more than 32% in Q2 of 2013) QC is able to offer a unique combination of solutions to both developers and gamers that push Android gaming into higher fidelity with more robust game play.
According to the white paper, Android gaming is the fastest growing segment of the gaming market with a 30% compound annual growth rate from 2013 to 2015, as projected by Gartner. Experiences for mobile games have drastically improved since Android was released in 2008 with developers like Epic Games and the Unreal Engine pushing visuals to near-console and near-PC qualities.
Qualcomm is taking a heterogeneous approach to address the requirements of gaming that include AI execution, physics simulation, animation, low latency input and high speed network connectivity in addition to high quality graphics and 3D rendering. Though not directly a part of the HSA standards still in development, the many specialized engines that Qualcomm has developed for its Snapdragon SoC processors including traditional CPUs, GPUs, DSPs, security and connectivity allow the company to create a solution that is built for Android gaming dominance.
In the white paper Qualcomm dives into the advantages that the Krait CPU architecture offers for CPU-based tasks as well as the power of the Adreno 4x series of GPUs that offer both raw performance and the flexibility to support current and future gaming APIs. All of this is done with single-digit wattage draw and a passive, fanless design and points to the huge undertaking that mobile gaming requires from an engineering and implementation perspective.
For developers, the ability to target Snapdragon architectures with a single code path that can address a scalable product stack allows for the least amount of development time and the most return on investment possible. Qualcomm continues to support the development community with tools and assistance to bring out the peak performance of Krait and Adreno to get games running on lower power parts as well as the latest and upcoming generations of SoCs in flagship devices.
It is great to see Qualcomm focus on this aspect of the mobile market, and the challenges it presents require strong dedication from these engineering teams. Being able to create compelling gaming experiences with high quality imagery while maintaining the required power envelope is a task that many other companies have struggled with.
Check out the new landing page over at Qualcomm if you are interested in more technical information as well as direct access to the white paper detailing the work Qualcomm is putting into its Snapdragon line of SoCs for gamers.
Subject: Mobile | April 8, 2014 - 07:47 PM | Tim Verry
Tagged: SoC, snapdragon, qualcomm, LTE, ARMv8, adreno, 64-bit
Qualcomm has announced two new flagship 64-bit SoCs with the Snapdragon 808 and Snapdragon 810. The new chips will begin sampling later this year and should start showing up in high end smartphones towards the second half of 2015. The new 800-series parts join the previously announced mid-range Snapdragon 610 and 615 which are also 64-bit ARMv8 parts.
The Snapdragon 810 is Qualcomm's new flagship processor. The chip features four ARM Cortex A57 cores and four Cortex A53 cores in a big.LITTLE configuration, an Adreno 430 GPU, and support for Category 6 LTE (up to 300 Mbps downloads) and LPDDR4 memory. This flagship part uses the 64-bit ARMv8 ISA. The new Adreno 430 GPU integrated in the SoC is reportedly 30% faster than the Adreno 420 GPU in the Snapdragon 805 processor.
In addition to the flagship part, Qualcomm is also releasing the Snapdragon 808 which pairs two Cortex A57 CPU cores and four Cortex A53 CPU cores in a big.LITTLE configuration with an Adreno 418 (approximately 20% faster than the popular Adreno 320) GPU. This chip supports LPDDR3 memory and Qualcomm's new Category 6 LTE modem.
Both the 808 and 810 have Adreno GPUs which support OpenGL ES 3.1. The new chips support a slew of wireless I/O including Category 6 LTE, 802.11ac Wi-Fi, Bluetooth 4.1, and NFC.
Qualcomm is reportedly planning to produce these SoCs on a 20nm process. For reference, the mid-range 64-bit Snapdragon 610 and 615 use a 28nm LP manufacturing process. The new 20nm process (presumably from TSMC) should enable improved battery life and clockspeed headroom on the flagship parts. Exactly how big those gains are will depend on the specific manufacturing process: smaller gains from a bulk/planar process shrink, or greater improvements from more advanced methods such as FD-SOI, assuming the chip has the same transistor count on 20nm as it would on the 28nm process used in existing chips.
The 808 and 810 parts are the new high-end 64-bit chips which will effectively supplant the 32-bit Snapdragon 805, itself a marginal update over the Snapdragon 800. The naming conventions and product lineups are getting a bit crazy here, but suffice it to say that the 808 and 810 are the effective successors to the 800, while the 805 is a stop-gap upgrade as Qualcomm moves to 64-bit ARMv8 and secures manufacturing for the new chips. Those chips should be slightly faster CPU-wise, notably faster GPU-wise, and more capable thanks to the faster cellular modem and 64-bit ISA support.
For those wondering, the press release also states that the company is still working on development of its custom 64-bit Krait CPU architecture. However, it does not appear that 64-bit Krait will be ready by the first half of 2015, which is why Qualcomm has opted to use ARM's Cortex A57 and A53 cores in its upcoming flagship 808 and 810 SoCs.
Subject: Processors, Mobile | February 24, 2014 - 05:30 PM | Ryan Shrout
Tagged: snapdragon 615, snapdragon 610, snapdragon 410, snapdragon, qualcomm, MWC 14, MWC, adreno 405, adreno
Intel, Mediatek and Allwinner have all come out with new SoC announcements at Mobile World Congress and Qualcomm is no different. By far the most interesting release is what it calls the "first commercial" 64-bit Octa-Core chipset with integrated global LTE support. The list of features and technologies included on the chipset is impressive.
The Snapdragon 615 integrates 8 x ARM Cortex-A53 cores that operate on the newer 64-bit ARMv8 architecture while supporting 32-bit for backwards compatibility. Qualcomm is not using a custom designed CPU core for this chipset but the company has stated it will have its own custom 64-bit core sometime in 2015. This 8-core model is divided into a pair of quad-core clusters that will be tuned to different clock speed and power levels, offering the ability to run slightly more efficiently than would be possible with all cores tuned to the highest performance.
Snapdragon 610 is essentially the same design but is limited to a quad-core, single cluster setup.
Both of these parts will integrate the Qualcomm custom built Adreno 405 GPU that brings a DX11 class feature set, along with OpenGL ES 3.0 and OpenCL 1.2. The Adreno 405 performance is still unknown but it should be able to compete with the likes of PowerVR's Series6 used in the Apple A7 and Intel Merrifield parts. Quad HD resolutions are supported up to 2560x1600 and Miracast integration enables wireless display. H.265 hardware decode acceleration also found its way into the 615/610.
Connectivity features of the Snapdragon 615/610 include 802.11ac wireless as well as the company's 3rd generation LTE modem. Category 4 and carrier aggregation are optional.
Qualcomm has publicly stated that the move to 8-core processors with software lacking the capability to manage them properly was a poor decision. But it would appear that the "core race" has infected just about everyone.
Subject: Graphics Cards | February 25, 2013 - 08:01 PM | Josh Walrath
Tagged: nvidia, tegra, tegra 4, Tegra 4i, pixel, vertex, PowerVR, mali, adreno, geforce
When Tegra 4 was introduced at CES there was precious little information about the setup of the integrated GPU. We all knew that it would be a much more powerful GPU, but we were not entirely sure how it was set up. Now NVIDIA has finally released a slew of whitepapers that deal with not only the GPU portion of Tegra 4, but also some of the low level features of the Cortex A15 processor. For this little number I am just going over the graphics portion.
This robust looking fellow is the Tegra 4. Note the four pixel "pipelines" that can output 4 pixels per clock.
The graphics units on the Tegra 4 and Tegra 4i are identical in overall architecture, just that the 4i has fewer units and they are arranged slightly differently. Tegra 4 is comprised of 72 units, 48 of which are pixel shaders. These pixel shaders are VLIW based VEC4 units. The other 24 units are vertex shaders. The Tegra 4i is comprised of 60 units, 48 of which are pixel shaders and 12 are vertex shaders. We knew at CES that it was not a unified shader design, but we were still unsure of the overall makeup of the part. There are some very good reasons why NVIDIA went this route, as we will soon explore.
If NVIDIA were to transition to unified shaders, it would increase the overall complexity and power consumption of the part. Each shader unit would have to be able to handle both vertex and pixel workloads, which means more transistors are needed to handle it. Simpler shaders focused on either pixel or vertex operations are more efficient at what they do, both in terms of transistors used and power consumption. This is the same train of thought when using fixed function units vs. fully programmable. Yes, the programmability will give more flexibility, but the fixed function unit is again smaller, faster, and more efficient at its workload.
On the other hand here we have the Tegra 4i, which gives up half the pixel pipelines and vertex shaders, but keeps all 48 pixel shaders.
If there was one surprise here, it would be that the part is not completely OpenGL ES 3.0 compliant. It is lacking in one major function that is required for certification. This particular part cannot render at FP32 levels. It has been quite a few years since we have heard of anything not being able to do FP32 in the PC market, but it is quite common to not support it in the power and transistor conscious mobile market. NVIDIA decided to go with an FP20 partial precision setup. They claim that for all intents and purposes, it will not be noticeable to the human eye. Colors will still be rendered properly and artifacts will be few and far between. Remember back in the day when NVIDIA supported FP16 and FP32 while they chastised ATI for choosing FP24 with the Radeon 9700 Pro? Times have changed a bit. Going with FP20 is again a power and transistor saving decision. It still supports DX9.3 and OpenGL ES 2.0, but it is not fully OpenGL ES 3.0 compliant. This is not to say that it does not support any 3.0 features. It in fact does support quite a bit of the functionality required by 3.0, but it is still not fully compliant.
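To get a rough feel for what giving up FP32 costs, here is a minimal Python sketch that rounds a value to a reduced number of mantissa bits and reports the relative error. The bit layouts are assumptions on my part: FP16 and FP32 follow the IEEE 754 layouts, but the 16-bit mantissa for FP24 and especially the 13-bit mantissa for FP20 are not stated in the article, and the helper names are hypothetical. The sketch also ignores exponent range, so it only models rounding error, not overflow or denormals.

```python
import math

# Explicit mantissa bits per format. FP16 and FP32 are the IEEE 754 layouts.
# ASSUMPTION: the FP24 and FP20 entries are illustrative guesses, not figures
# taken from the NVIDIA whitepaper discussed above.
MANTISSA_BITS = {"FP16": 10, "FP20": 13, "FP24": 16, "FP32": 23}


def quantize_mantissa(value, mantissa_bits):
    """Round value to the nearest number with the given explicit mantissa bits.

    Exponent range is ignored, so this models rounding error only.
    """
    if value == 0.0 or not math.isfinite(value):
        return value
    frac, exp = math.frexp(value)      # value == frac * 2**exp, 0.5 <= |frac| < 1
    scale = 1 << (mantissa_bits + 1)   # +1 accounts for the implicit leading bit
    return math.ldexp(round(frac * scale) / scale, exp)


def relative_error(value, mantissa_bits):
    """Relative rounding error introduced by quantize_mantissa."""
    return abs(quantize_mantissa(value, mantissa_bits) - value) / abs(value)


coord = 0.123456789  # an arbitrary sample value, e.g. a texture coordinate
for name, bits in MANTISSA_BITS.items():
    print(f"{name}: {quantize_mantissa(coord, bits)!r} "
          f"(relative error {relative_error(coord, bits):.2e})")
```

Under these assumptions the worst-case relative error of each format is bounded by 2^-(mantissa bits + 1), which is why a few extra mantissa bits over FP16 can matter for long chains of shader math even if a single color value looks identical on screen.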
This will be an interesting decision to watch over the next few years. The latest Mali 600 series, PowerVR 6 series, and Adreno 300 series solutions all support OpenGL ES 3.0. Tegra 4 is the odd man out. While most developers have no plans to go to 3.0 anytime in the near future, it will eventually be implemented in software. When that point comes, then the Tegra 4 based devices will be left a bit behind. By then NVIDIA will have a fully compliant solution, but that is little comfort for those buying phones and tablets in the near future that will be saddled with non-compliance once applications hit.
The list of OpenGL ES 3.0 features that are actually present in Tegra 4. The lack of FP32 relegates it to 2.0 compliant status.
The core speed is increased to 672 MHz, well up from the 520 MHz in Tegra 3 (8 pixel and 4 vertex shaders). The GPU can output four pixels per clock, double that of Tegra 3. Once we consider the extra clock speed and pixel pipelines, the Tegra 4 increases pixel fillrate by 2.6x. Pixel and vertex shading will get a huge boost in performance due to the dramatic increase of units and clockspeed. Overall this is a very significant improvement over the previous generation of parts.
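The 2.6x fillrate figure falls straight out of the clock speeds and pixels per clock quoted above. A quick Python sketch of that arithmetic follows; the function name is hypothetical, and the Tegra 3 figure of two pixels per clock is inferred from the statement that Tegra 4 doubles it. These are peak theoretical numbers, not measured results.

```python
def peak_fillrate_mpixels(pixels_per_clock, core_mhz):
    """Peak theoretical pixel fillrate in megapixels per second."""
    return pixels_per_clock * core_mhz


# Tegra 3: 520 MHz, two pixels per clock (inferred: Tegra 4 is "double that").
tegra3 = peak_fillrate_mpixels(2, 520)
# Tegra 4: 672 MHz, four pixels per clock.
tegra4 = peak_fillrate_mpixels(4, 672)

ratio = tegra4 / tegra3
print(f"Tegra 3: {tegra3} Mpixels/s")
print(f"Tegra 4: {tegra4} Mpixels/s")
print(f"Increase: {ratio:.1f}x")
```

The clock bump alone is worth about 1.29x (672 / 520); doubling the pixel pipelines supplies the rest.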
The Tegra 4 can output to a 4K display natively, and that is not the only new feature for this part. Here is a quick list:
2x/4x Multisample Antialiasing (MSAA)
24-bit Z (versus 20-bit Z in the Tegra 3 processor) and 8-bit Stencil
4K x 4K texture size incl. Non-Power of Two textures (versus 2K x 2K in the Tegra 3 processor) – for higher quality textures, and easier to port full resolution textures from console and PC games to Tegra 4 processor. Good for high resolution displays.
16:1 Depth (Z) Compression and 4:1 Color Compression (versus none in Tegra 3 processor) – this is lossless compression and is useful for reducing bandwidth to/from the frame buffer, and especially effective in antialiasing processing when processing multiple samples per pixel
Percentage Closer Filtering for Shadow Texture Mapping and Soft Shadows
Texture border color eliminates coarse MIP-level bleeding
sRGB for Texture Filtering, Render Surfaces and MSAA down-filter
1 - CSAA is no longer supported in Tegra 4 processors
This is a big generational jump, and now we only have to see how it performs against the other top end parts from Qualcomm, Samsung, and others utilizing IP from Imagination and ARM.
Subject: Mobile | December 16, 2011 - 06:00 AM | Tim Verry
Tagged: tegra, SoC, qualcomm, PowerVR, mobile, Android, adreno
Quite a few mobile device manufacturers are implementing graphics processors and image processors based on Imagination Technologies’ PowerVR technology. Popular licensees of Imagination Technologies PowerVR core patents include Intel, LG, Samsung, Sony, and Texas Instruments (a big one in regards to number of SoCs using PowerVR techs for mobile phones).
Interestingly, Qualcomm is not currently licensing the graphics processor portfolio that many other mobile OEMs license. Rather, Qualcomm is licensing the PowerVR display patents. The intellectual property features the PowerVR de-interlacing cores and the de-judder FRC (Frame Rate Conversion) core. The de-interlacing core(s) can do either “motion adaptive (MA) or motion compensated (MC) de-interlacing” as well as a few other algorithms to deliver smooth graphics. Further, the FRC cores take 24 FPS (frames per second) source material and output it at either 120 Hz or 240 Hz while applying image processing to keep the video looking smooth to the eye. The simplest way to show a 24 FPS video on an LCD screen that refreshes at 120 Hz is to display each of those 24 frames five times in a row; grabbing and extrapolating “extra” frames instead involves a bit of math and algorithmic magic, and a simplistic explanation can be read here.
It will be interesting to see how Qualcomm applies the image processing technology to their future SoCs (system on a chip) to entice manufacturers into going with them instead of competition like Texas Instruments or Nvidia’s Tegra chips. The Verge speculates that this Qualcomm and Imagination Technologies deal may be just the first step towards Qualcomm licensing more PowerVR tech, possibly including the GPU portfolio. Whether Qualcomm will ditch their Adreno GPUs remains to be seen. If I had to guess, the SoC maker will invest in more PowerVR IP, but they will not completely abandon their Adreno graphics. Rather, they will continue developing next generation Adreno graphics for use in their SoCs while also integrating the useful and superior aspects of PowerVR graphics and display technologies. Another option may be to develop and sell both platforms, possibly with one as high end competition to Tegra and the other as competition to other low end, low power chips in the rest of the phone market. That would hedge their bets on the future of mobile SoCs, a rapidly advancing industry where what is considered the top tech changes quickly.
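The frame-count side of that conversion is easy to sketch in Python. The snippet below only models plain frame repetition, i.e. how many display refreshes each source frame is held for; it is not how the PowerVR FRC core works, since the motion-compensated interpolation that synthesizes in-between frames is left out entirely. The function name is hypothetical, and the 60 Hz case is my own addition for contrast rather than something from the announcement.

```python
def refreshes_per_frame(source_fps, display_hz, frames):
    """Number of display refreshes each source frame is held for,
    using plain frame repetition (no interpolation)."""
    counts = []
    for i in range(frames):
        start = (i * display_hz) // source_fps      # first refresh showing frame i
        end = ((i + 1) * display_hz) // source_fps  # first refresh showing frame i+1
        counts.append(end - start)
    return counts


print(refreshes_per_frame(24, 120, 4))  # even cadence: every frame held equally
print(refreshes_per_frame(24, 240, 4))
print(refreshes_per_frame(24, 60, 4))   # uneven cadence, the source of judder
```

Because 120 and 240 are whole multiples of 24, every frame is held for the same number of refreshes, whereas a 60 Hz panel has to alternate between two and three refreshes per frame. That uneven cadence is the judder a de-judder core is meant to hide.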