ARM is a company that no longer needs much of an introduction. This was not always the case. ARM has certainly made a name for itself among PC, tablet, and handheld consumers. Its primary source of income is licensing CPU designs as well as its ISA. While names like the Cortex A9 and Cortex A15 are fairly well known, not as many people know about the graphics IP that ARM also licenses. Mali is the product name of the graphics IP, and it encompasses an entire range of features and performance levels that third parties can license.
I was able to get a block of time with Nizar Romdhane, Head of the Mali Ecosystem at ARM, and ask a few questions about Mali, ARM’s plans to address the increasingly important mobile graphics market, and how the company will compete against Imagination Technologies, Intel, AMD, NVIDIA, and Qualcomm.
We would like to thank Nizar for his time, as well as Phil Hughes for facilitating this interview. Stay tuned as we are expecting to continue this series of interviews with other ARM employees in the near future.
NVIDIA Details Tegra 4 and Tegra 4i Graphics
Subject: Graphics Cards | February 25, 2013 - 08:01 PM | Josh Walrath
Tagged: nvidia, tegra, tegra 4, Tegra 4i, pixel, vertex, PowerVR, mali, adreno, geforce
When Tegra 4 was introduced at CES there was precious little information about the setup of the integrated GPU. We all knew that it would be a much more powerful GPU, but we were not entirely sure how it was set up. Now NVIDIA has finally released a slew of whitepapers that deal with not only the GPU portion of Tegra 4, but also some of the low level features of the Cortex A15 processor. For this little number I am just going over the graphics portion.
This robust looking fellow is the Tegra 4. Note the four pixel "pipelines" that can output 4 pixels per clock.
The graphics units on the Tegra 4 and Tegra 4i are identical in overall architecture; the 4i simply has fewer units, and they are arranged slightly differently. Tegra 4 is made up of 72 units, 48 of which are pixel shaders. These pixel shaders are VLIW based VEC4 units. The other 24 units are vertex shaders. The Tegra 4i is made up of 60 units, 48 of which are pixel shaders and 12 of which are vertex shaders. We knew at CES that it was not a unified shader design, but we were still unsure of the overall makeup of the part. There are some very good reasons why NVIDIA went this route, as we will soon explore.
If NVIDIA were to transition to unified shaders, it would increase the overall complexity and power consumption of the part. Each shader unit would have to be able to handle both vertex and pixel workloads, which means more transistors are needed to handle it. Simpler shaders focused on either pixel or vertex operations are more efficient at what they do, both in terms of transistors used and power consumption. This is the same reasoning that applies to fixed function units vs. fully programmable ones. Yes, the programmability will give more flexibility, but the fixed function unit is again smaller, faster, and more efficient at its workload.
On the other hand here we have the Tegra 4i, which gives up half the pixel pipelines and vertex shaders, but keeps all 48 pixel shaders.
If there was one surprise here, it would be that the part is not completely OpenGL ES 3.0 compliant. It is lacking in one major function that is required for certification. This particular part cannot render at FP32 levels. It has been quite a few years since we have heard of anything not being able to do FP32 in the PC market, but it is quite common to not support it in the power and transistor conscious mobile market. NVIDIA decided to go with an FP20 partial precision setup. They claim that for all intents and purposes, it will not be noticeable to the human eye. Colors will still be rendered properly and artifacts will be few and far between. Remember back in the day when NVIDIA supported FP16 and FP32 while they chastised ATI for choosing FP24 with the Radeon 9700 Pro? Times have changed a bit. Going with FP20 is again a power and transistor saving decision. It still supports DX9.3 and OpenGL ES 2.0, but it is not fully OpenGL ES 3.0 compliant. This is not to say that it does not support any 3.0 features. It in fact does support quite a bit of the functionality required by 3.0, but it is still not fully compliant.
This will be an interesting decision to watch over the next few years. The latest Mali 600 series, PowerVR 6 series, and Adreno 300 series solutions all support OpenGL ES 3.0. Tegra 4 is the odd man out. While most developers have no plans to go to 3.0 anytime in the near future, it will eventually be implemented in software. When that point comes, then the Tegra 4 based devices will be left a bit behind. By then NVIDIA will have a fully compliant solution, but that is little comfort for those buying phones and tablets in the near future that will be saddled with non-compliance once applications hit.
This is the list of OpenGL ES 3.0 features that are actually present in Tegra 4; the lack of FP32 relegates it to 2.0 compliant status.
The core speed is increased to 672 MHz, well up from the 520 MHz in Tegra 3 (8 pixel and 4 vertex shaders). The GPU can output four pixels per clock, double that of Tegra 3. Once we consider the extra clock speed and pixel pipelines, the Tegra 4 increases pixel fillrate by 2.6x. Pixel and vertex shading will get a huge boost in performance due to the dramatic increase of units and clockspeed. Overall this is a very significant improvement over the previous generation of parts.
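The 2.6x fillrate figure follows directly from the clock speeds and pixels per clock quoted above. Here is a minimal sketch of that arithmetic; the function name is made up for illustration, and the Tegra 3 figure of two pixels per clock is inferred from the statement that Tegra 4 doubles it. These are theoretical peak numbers, not measured performance.

```python
def peak_pixel_fillrate(clock_mhz, pixels_per_clock):
    """Theoretical peak pixel fillrate in megapixels per second."""
    return clock_mhz * pixels_per_clock


# Figures quoted in the article; Tegra 3's 2 pixels/clock is inferred
# from "four pixels per clock, double that of Tegra 3".
tegra3_fillrate = peak_pixel_fillrate(520, 2)
tegra4_fillrate = peak_pixel_fillrate(672, 4)

fillrate_ratio = tegra4_fillrate / tegra3_fillrate

print(f"Tegra 3: {tegra3_fillrate} Mpixels/s")
print(f"Tegra 4: {tegra4_fillrate} Mpixels/s")
print(f"Increase: {fillrate_ratio:.1f}x")
```

The clock bump alone is worth roughly 1.3x, and the doubled pixel output accounts for the rest.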
The Tegra 4 can output to a 4K display natively, and that is not the only new feature for this part. Here is a quick list:
2x/4x Multisample Antialiasing (MSAA)
24-bit Z (versus 20-bit Z in the Tegra 3 processor) and 8-bit Stencil
4K x 4K texture size incl. Non-Power of Two textures (versus 2K x 2K in the Tegra 3 processor) – for higher quality textures, and easier to port full resolution textures from console and PC games to Tegra 4 processor. Good for high resolution displays.
16:1 Depth (Z) Compression and 4:1 Color Compression (versus none in Tegra 3 processor) – this is lossless compression and is useful for reducing bandwidth to/from the frame buffer, and especially effective in antialiasing processing when processing multiple samples per pixel
Depth Textures
Percentage Closer Filtering for Shadow Texture Mapping and Soft Shadows
Texture border color to eliminate coarse MIP-level bleeding
sRGB for Texture Filtering, Render Surfaces and MSAA down-filter
Note: CSAA is no longer supported in Tegra 4 processors
This is a big generational jump, and now we only have to see how it performs against the other top end parts from Qualcomm, Samsung, and others utilizing IP from Imagination and ARM.
ARM snaps graphics marketshare from the dragon
Subject: General Tech, Mobile | October 22, 2012 - 02:00 PM | Jeremy Hellstrom
Tagged: arm, qualcomm, marketshare, SoC, imagination, Vivante, jon peddie, mali
ARM has made some serious impact on the mobile market with their Mali GPU on their SoC, with Jon Peddie Research reporting they have doubled their market share over the past year. That number is even more impressive when you pair it with the 91.3% growth in the mobile GPU market. Another player, Vivante, quadrupled their share of the market and while their products are found primarily in Asia you may recognize them as a member of the HSA. Their success comes at a cost to Imagination and Qualcomm, both of whom have seen their market shares drop. NVIDIA is currently making up 2.5% of the GPU market for tablets and smartphones which is not too bad when you consider that the other four main players all license their processors out while NVIDIA remains the sole provider of its Tegra SoCs. Get more numbers at The Inquirer.
"CHIP DESIGNERS ARM and Vivante have achieved significant market share gains in the system-on-chip (SoC) GPU market while Imagination and Qualcomm have seen their market shares fall."
Here is some more Tech News from around the web:
- AMD Q3 2012 analyst call talks IP strategy @ SemiAccurate
- Skype details Windows 8 app ahead of 26 October release @ The Inquirer
- Nanya Technology, Inotera to receive new financing to move to 30nm process, say sources @ DigiTimes
AFDS 2012: ARM once again on stage with AMD - partnership incoming?
Subject: Mobile, Shows and Expos | June 11, 2012 - 12:01 AM | Ryan Shrout
Tagged: mali, arm, amd, AFDS
In a blog post over at arm.com, ARM Fellow Jem Davies has made a point to let us all know that he is going to be attending the AMD Fusion Developer Summit yet again, but this time with something more concrete to discuss. In a very self-aware statement, Davies writes in his post that "my appearance last year generated a lot of speculation about the nature of the relationship between ARM and AMD."
From Davies' post:
This year, we have a great deal to discuss. ARM is all about low power and many people in the industry now realize that GPUs have a central role to play in providing highly energy-efficient computing. It’s an exciting future that can grow the ecosystem that surrounds computing. ARM’s unique portfolio of CPU, GPU, interconnect and physical IP puts us at the forefront of one of the most important technological changes in a long time. Reflecting on that and some of those changes, I will be making an announcement at the show.
Emphasis above is ours.
Also worth noting is that Jem Davies does not have his own session at AFDS; rather, we can expect to see him come out on stage during another keynote, likely Phil Rogers' or Mark Papermaster's.
AMD wants into the tablet market. ARM could accelerate that process.
Exactly WHAT the ARM/AMD announcement might be obviously isn't known by many yet, but we have speculated many times that an AMD built, ARM architecture processor, with Radeon-based graphics technology and ARM low-power CPU cores, could help AMD enter the world of ultra-low power SoCs very quickly. Markets like the pending onslaught of Windows 8 RT tablets and clamshells have NVIDIA foaming at the mouth, and AMD would be remiss to not attempt to tackle the same markets and one-up Intel at the same time.
It should be an exciting week! Keep checking pcper.com and our AFDS site tag for all the latest news including keynote live blogs!
Intel and AMD be warned; ARM could grab up to 20% of the laptop market in the next 4 years
Subject: General Tech | July 19, 2011 - 01:02 PM | Jeremy Hellstrom
Tagged: Intel, amd, arm, mali, low power
Those who ignored Microsoft's announcement that Windows 8 will support ARM processors will perhaps take note of Isuppli's claim that ARM could grab 1 in 5 of the laptops sold by 2015. The extremely low power System on a Chip designs that ARM has been selling were at the opposite end of the market from AMD and Intel's X86 chips, but with the rise of the APU the market has undergone a fundamental change. While the X86 makers are trying to lower the power requirements of their APUs, ARM is busy trying to ramp up the power of its chips. There are already several vendors establishing a relationship with ARM, up to and including Apple.
ARM's Cortex A9 and Mali are impressive, but ARM is already talking about console level graphics quality from their next generation of chips, which we will see in roughly 18 months. This improvement will also encompass their next generation of power efficiency research, which should keep power consumption and heat well below what Intel and AMD will be trying to reach. As well, it might provide an interesting opportunity for NVIDIA, as the lack of a license to integrate chips with the new X86 based architecture will not stop them from developing graphics enhancements for ARM based laptops. Drop by The Inquirer for more on this topic.
"CHIP DESIGNER ARM could power over 20 per cent of all laptops shipped in 2015, according to analyst outfit IHS Isuppli.
IHS Isuppli has forecast that the domination of X86 chips in the laptop market will start to diminish as Microsoft releases its Windows 8 operating system. Windows 8 will be the first desktop operating system from Microsoft that will support the ARM architecture that is found in just about every smartphone in existence."
Here is some more Tech News from around the web:
- Foxconn reportedly considering ECS acquisition @ DigiTimes
- ReRAM gets closer to reality @ SemiAccurate
- Samsung SH100 Review @ TechReviewSource
- Ninjalane Podcast - Duke Nukem Forever Favorite Asus Product Listener Mailbag
- Sandberg Hard Disk Cloner Review @ Real World Labs
- Test Driving GNU Hurd, With Benchmarks Against Linux @ Phoronix
- S2TC: A Possible Workaround For The S3TC Patent Situation @ Phoronix
- Cyborg Gaming Lights (amBX) Review @ HardwareHeaven
- Panasonic Lumix GH2 Review @ t-break
- Real World Labs And Thermalright Joint Contest
- Win a Blackberry Bold 9900 @ t-break





