Subject: General Tech | June 19, 2013 - 09:51 PM | Josh Walrath
Tagged: Volta, nvidia, maxwell, licensing, kepler, Denver, Blogs, arm
Yesterday we all saw the blog piece from NVIDIA that stated that they were going to start licensing their IP to interested third parties. Obviously, there was a lot of discussion about this particular move. Some were in favor, some were opposed, and others yet thought that NVIDIA is now simply roadkill. I believe that it is an interesting move, but we are not yet sure of the exact details or the repercussions of such a decision on NVIDIA’s part.
The biggest bombshell of the entire post was that NVIDIA would be licensing out their latest architecture to interested clients. The Kepler architecture powers the very latest GTX 700 series of cards and at the top end it is considered one of the fastest and most efficient architectures out there. Seemingly, there is a price for this though. Time to dig a little deeper.
Kepler will be the first technology licensed to third party manufacturers. We will not see full GPUs; these will only be integrated into mobile products.
The very latest Tegra parts from NVIDIA do not feature the Kepler architecture for the graphics portion. Instead, the units featured in Tegra can almost be described as GeForce 7000 series parts. The computational units are split between pixel shaders and vertex shaders. They support a maximum compatibility of D3D 9_3 and OpenGL ES 2.0. This is a far cry from a unified shader architecture and support for the latest D3D 11 and OpenGL ES 3.0 specifications. Other mobile units feature the latest Mali and Adreno series of graphics units which are unified and support DX11 and OpenGL ES 3.0.
So why exactly do the latest Tegras not share the Kepler architecture? Hard to say. It could be a variety of factors that include time to market, available engineering teams, and simulations which could dictate if power and performance can be better served by a less complex unit. Kepler is not simple. A Kepler unit that occupies the same die space could potentially consume more power with any given workload, or conversely it could perform poorly given the same power envelope.
We can look at the desktop side of this argument for some kind of proof. At the top end Kepler is a champ. The GTX 680/770 has outstanding performance and consumes far less power than the competition from AMD. When we move down a notch to the GTX 660 Ti/HD 7800 series of cards, we see much greater parity in performance and power consumption. Comparing the HD 7790 to the 650 Ti Boost, we see that the Boost part has slightly better performance but consumes significantly more power. Then we move down to the 650 and 650 Ti, and these parts do not consume any more power than the competing AMD parts, but they also perform much more poorly. I know these are some pretty hefty generalizations, and the engineers at NVIDIA could very effectively port Kepler over to mobile applications without significant performance or power penalties. But so far, we have not seen this work.
Power, performance, and die area aside there is also another issue to factor in. NVIDIA just announced that they are doing this. We have no idea how long this effort has been going, but it is very likely that it has only been worked on for the past six months. In that time NVIDIA needs to hammer out how they are going to license the technology, how much manpower they must provide licensees to get those parts up and running, and what kind of fees they are going to charge. There is a lot of work going on there and this is not a simple undertaking.
So let us assume that some three months ago an interested partner such as Rockchip or Samsung came knocking at NVIDIA’s door. They work out the licensing agreements and this takes several months. Then we start to see the transfer of technology between the companies. Obviously Samsung and Rockchip are not going to apply this graphics architecture to currently shipping products, but will instead bundle it in with a next generation ARM based design. These designs are not spun out overnight. For example, the 64 bit ARMv8 designs have been finalized for around a year, and we do not expect to see initial parts being shipped until late 1H 2014. So any partner that decides to utilize NVIDIA’s Kepler architecture for such an application will not see this part released until 1H 2015 at the very earliest.
Shield is still based on a GPU possessing separate pixel and vertex shaders. DX11 and OpenGL ES 3.0? Nope!
If someone decides to license this technology from NVIDIA, it will not be of great concern. The next generation of NVIDIA graphics will already be out by that time, and we could very well be approaching the next iteration for the desktop side. NVIDIA plans on releasing a Kepler based mobile unit in 2014 (Logan), which would be a full year in advance of any competing product. In 2015 NVIDIA is planning on releasing an ARM product based on the Denver CPU and Maxwell GPU. So we can easily see that NVIDIA will only be licensing out an older generation product so it will not face direct competition when it comes to GPUs. NVIDIA obviously is hoping that their GPU tech will still be a step ahead of that of ARM (Mali), Qualcomm (Adreno), and Imagination Technologies (PowerVR).
This is an easy and relatively pain-free way to test the waters that ARM, Imagination Technologies, and AMD are already treading. ARM only licenses IP and has shown the world that it can not only succeed at it, but thrive. Imagination Tech used to produce their own chips much like NVIDIA does, but they changed direction and continue to be profitable. AMD recently opened up about their semi-custom design group that will design specific products for customers and then license those designs out. I do not think this is a desperation move by NVIDIA, but it certainly is one that probably is a little late in coming. The mobile market is exploding, and we are approaching a time where nearly every electricity based item will have some kind of logic included in it; billions of chips a year will be sold. NVIDIA obviously wants a piece of that market. Even a small piece of “billions” is going to be significant to the bottom line.
Subject: General Tech | June 19, 2013 - 03:02 PM | Jeremy Hellstrom
Tagged: amd, Kyoto, berlin, seattle, warsaw, arm
DigiTimes named the four new families of server chip that AMD will be using to keep their products in the server room. Kyoto is known as the Opteron X-series and is available now, based on Jaguar and offering GPU compute enhancements as well as increased CPU performance. The Seattle family will replace these CPUs in the near future and will represent a new era for AMD, as these chips will be clusters of ARM Cortex-A57 cores on AMD's advanced Freedom Fabric. Berlin will be a true x86 AMD chip with the new Steamroller architecture, which will replace Piledriver and support HSA compliant optimizations. Last is Warsaw, which will be the most powerful chip, uniting 12 or 16 Piledriver cores in a chip which is compatible with the current Socket G34 used by the Opteron 6300 family, offering a simple drop in upgrade solution.
"AMD has publicly disclosed its strategy and roadmap to recapture market share in enterprise and data center servers by unveiling products that address key technologies and meet the requirements of the fastest-growing data center and cloud computing workloads."
Here is some more Tech News from around the web:
- Nvidia stretches CUDA coding to ARM chips @ The Register
- Intel previews future 'Knights Landing' Xeon Phi x86 coprocessor with integrated memory @ The Register
- Fusion-io's founding CEO quits board @ The Register
- Apple issues Java patch for Mac OS X users fixing 40 critical vulnerabilities @ The Inquirer
- Flash flaw potentially makes every webcam or laptop a PEEPHOLE @ The Register
- The Linux Kernel As An Exquisitely Sensitive Stability Test For Overclocked Systems @ TechARP
- Samsung EX2F Camera Review - A Low-Light Advanced Point-And-Shoot For Any Photographer @ SSD Review
- Australian unis to test quantum-comms-over-fibre @ The Register
- Uros Goodspeed review: MiFi, but bigger @ Hardware.info
- Adding wireless charging to any phone @ Hack a Day
- Canon PowerShot N Review @ TechReviewSource
- E3 2013: Wrap Up Coverage @ Legit Reviews
Subject: General Tech | June 17, 2013 - 02:37 PM | Jeremy Hellstrom
Tagged: arm, clover trail, tegra 3
ARM might be in for more of a fight than we had thought if they want to keep their market share for the next generation of cellphones, assuming of course that they are sold in North America. The Register posted about research recently done contrasting performance and power efficiency between several mobile CPUs: the Lenovo K900 with a 2.0GHz Atom Z2580, a Samsung Nexus 10 with a dual core 1.7GHz Cortex-A15, a Galaxy S4 phone running a "big.LITTLE" Exynos Octa with paired quad-core Cortex-A15 and Cortex-A7, and even an Asus Nexus 7 with an Nvidia Tegra 3. Those devices give a good representation of current generation technology and it seems that while the performance for the top devices was very similar, Intel's new Atom did it with roughly 2/3 the amperage, specifically an average of 0.85A as opposed to the 1.38A of the second lowest competitor. Atom seems to have finally found a market segment it can do very well in as long as the price is right.
"The industry analysts at ABI Research pitted a Lenovo smartphone based on Intel's Atom-based Clover Trail+ platform against a quartet of ARM-based systems, and Chipzilla's system not only kept pace with the best of them, but did so using less power."
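As a quick check on the "2/3 the amperage" claim, the two averages quoted above can be compared directly. This is only a back-of-the-envelope sketch; the function name is made up for illustration and the inputs are the two figures from the post, nothing more.

```python
def current_ratio(low_amps, high_amps):
    """Return the lower current draw as a fraction of the higher one."""
    return low_amps / high_amps


if __name__ == "__main__":
    atom_amps = 0.85        # average reported for the Atom-based K900
    runner_up_amps = 1.38   # average for the second lowest competitor

    ratio = current_ratio(atom_amps, runner_up_amps)
    print(f"Atom draws {ratio:.1%} of the runner-up's current")
    print(f"That is a {1 - ratio:.1%} reduction")
```

The ratio works out to a little under 62%, so "2/3" is a slightly conservative rounding of the reported numbers.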
Here is some more Tech News from around the web:
- Optimized Binaries Provide Great Benefits For Intel Haswell @ Phoronix
- Samsung releases PCI-Express SSD for ultrabooks @ The Inquirer
- Intel 2014 Haswell-E to pack 8 cores, DDR4, X99 PCH and more @ VR-Zone
- Microsoft unleashes wave of Azure mobile updates @ The Register
- Critical Java SE update due Tuesday fixes 40 flaws @ The Register
- Blackberry 10.2 will support Android 4.2.2 Jelly Bean apps @ The Inquirer
- Android 5.0 Key Lime Pie to come in late October, also optimized for older phones @ VR-Zone
- Letting Bluetooth take the wires out of your headphones @ Hack a Day
- Adding WiFi to a kid’s tablet @ Hack a Day
- Intel bakes smaller, slower flash memory. Aah, now that's progress @ The Register
- TRENDnet AC1200 Dual Band Wireless USB Adapter (TEW-805UB) Review @ Madshrimps
- Computex 2013 Madshrimps Style @ Madshrimps
- AMD Today & Beyond Event @ SilverSpoon, Publika @ TechARP
- ModSynergy 10-Year Celebration Contest - USA and International Edition
Subject: General Tech | June 7, 2013 - 03:18 PM | Jeremy Hellstrom
Tagged: arm, 64bit, servers
With Calxeda and Applied Micro showing off ARM64 based servers at Computex this year, in addition to the existing products coming from Marvell and Dell, DigiTimes' prediction that 64bit ARM processors will quickly grow in popularity seems to be based in fact. It was not too long ago that many thought that ARM was fooling themselves if they thought they could take server space from AMD and Intel, but it looks like they were right to develop server chips. With low power usage becoming more popular than processor overkill and modularity growing in importance, ARM seems poised to perform far beyond expectations. Expect to see a lot more news on ARM64 processors and products over the coming months.
"Although Intel platforms are still the mainstream in the server industry, since 64-bit products have a broader range of applications, and ARM has been aggressively promoting related products, sources from the server industry expect more 64-bit ARM-based products to appear in the market between the end of 2013 and the first quarter of 2014."
Here is some more Tech News from around the web:
- One Year After World IPv6 Launch — Are We There Yet? @ Slashdot
- The best and worst of Computex 2013 @ The Inquirer
- YES, Xbox One DOES need internet, DOES restrict game trading @ The Register
- Interview: Steve Jackson, role-playing game titan @ The Register
- Neteller vs Payoneer - Online Payment and Prepaid Cards @ FunkyKit
- How to Install Linux @ Linux.com
Cortex-A12 fills a gap
Starting off Computex with an interesting announcement, ARM is talking about a new Cortex-A12 core that will attempt to address a performance gap in the SoC ecosystem between the A9 and A15. In the battle to compete with Krait and Intel's Silvermont architecture due in late 2013, ARM definitely needed to address the separation in performance and efficiency of the A9 and A15.
Source: ARM. Top to bottom: Cortex-A15, A12, A9 die size estimate
Targeted at mid-range devices that tend to be more cost (and thus die-size) limited, the Cortex-A12 will ship in late 2014 for product sampling and you should begin seeing hardware for sale in early 2015.
Architecturally, the changes for the upcoming A12 core revolve around a move to fully out of order dual-issue design including the integrated floating point units. The execution units are faster and the memory design has been improved but ARM wasn't ready to talk about specifics with me yet; expect that later in the year.
ARM claims this results in a 40% performance gain for the Cortex-A12 over the Cortex-A9, tested in SPECint. Because product won't even start sampling until late in 2014 we have no way to verify this data yet or to evaluate efficiency claims. That time lag between announcement and release will also give competitors like Intel, AMD and even Qualcomm time to answer back with potential earlier availability.
Subject: General Tech | May 29, 2013 - 05:20 PM | Tim Verry
Tagged: x11, weston, wayland, videocore iv, Raspberry Pi, linux, bcm2835, arm
The Raspberry Pi Foundation has been working with Collabora to fund development of a Wayland display server that is compatible with the Raspberry Pi and also allows the continued use of legacy X applications.
So far, operating systems that run on the Raspberry Pi have used X as the display server and window compositor. The Raspberry Pi Foundation wants to move to a window compositor that will take advantage of the Raspberry Pi's Hardware Video Scaler (HVS) and take the burden of window composition off of the much slower ARM CPU. The Raspberry Pi Foundation has chosen Wayland as the display server for the task.
The Raspberry Pi Model A.
Taking advantage of the HVS and OpenGL ES compatible GPU will make the system feel much more responsive and allow for advanced effects (fading, Exposé-like window browsers, et al.) for those that like a little more bling with their OS.
The Wayland/Weston display server allows for GPU acceleration and window composition using the Pi's VideoCore IV GPU and HVS (which is independent of the hardware units that run OpenGL code). The display server will feed the entire set of windows along with how they should be laid out on screen (stacking order, transparency, 2D transform, etc.) to the HVS, which will hardware accelerate the process and free the ARM CPU up for other tasks.
According to the Raspberry Pi Foundation, the Raspberry Pi's HVS is fairly powerful for a mobile-class SoC with 500 Megapixel/s scaling throughput and 1 Gigapixel per second blending throughput.
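To put those throughput figures in perspective, here is a rough sketch of how many full-screen layers the HVS could touch per frame. The 1920x1080 at 60Hz display mode is an assumption picked for illustration (it is not from the Foundation's numbers), and the function name is hypothetical; real compositing cost depends on how much of the screen each window actually covers.

```python
def full_screen_layers_per_frame(pixels_per_second, width, height, refresh_hz):
    """Return how many full-screen layers fit in one frame's pixel budget."""
    pixels_per_frame_budget = pixels_per_second / refresh_hz
    return pixels_per_frame_budget / (width * height)


if __name__ == "__main__":
    # Throughput figures quoted by the Raspberry Pi Foundation.
    scaling_px_s = 500e6    # 500 Megapixel/s scaling
    blending_px_s = 1e9     # 1 Gigapixel/s blending

    # Assumed display mode, chosen only as an example.
    width, height, refresh_hz = 1920, 1080, 60

    blend = full_screen_layers_per_frame(blending_px_s, width, height, refresh_hz)
    scale = full_screen_layers_per_frame(scaling_px_s, width, height, refresh_hz)
    print(f"Blended full-screen layers per frame: {blend:.2f}")
    print(f"Scaled full-screen layers per frame:  {scale:.2f}")
```

Under that assumption the blending budget covers roughly eight full-screen layers per frame and the scaling budget roughly four, which is plenty for a desktop where most windows cover only part of the screen.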
In addition to GPU acceleration, Wayland will allow non-rectangular windows, fading and other effects, support for legacy X applications with Xwayland, and a scaled window browser.
The Raspberry Pi Foundation has been working with developers since late last year and is nearly ready to roll a technology preview into the next Raspbian operating system release. The developers are still working on improving the performance and reducing memory usage. As a result, the new Wayland/Weston display server is not expected to become the new default in the various Raspberry Pi operating systems until late 2013 at the earliest.
This is a project that is really nice to see, especially since at least a small part of the development work going into supporting the ARM-based Raspberry Pi on Wayland will help other ARM devices and Wayland in general which is becoming an increasingly popular choice in new Linux distributions and the best X alternative so far. Of course, this is primarily going to be a useful update for those Raspberry Pi users that run OSes with GUIs as the responsiveness should be a lot snappier!
If you simply can't wait until later this year, it is possible to install the technology preview (beta) of Wayland/Weston onto the current version of Raspbian Linux by cloning the git project or installing a Raspbian package of Weston 1.0. Blogger Daniel Stone has all the details for installing the display server onto your Pi under the section titled "sounds great; how do i get it?" on this post.
See a video of Wayland technology preview in action on the Raspberry Pi on the Raspberry Pi Foundation's blog.
Read more about the Raspberry Pi at PC Perspective.
Subject: General Tech | April 30, 2013 - 09:46 AM | Tim Verry
Tagged: ssd caching, operating system, linux, kernel 3.9, kernel, arm, 802.11ac
Linus Torvalds recently released a new version of the Linux kernel -- version 3.9 -- that advances the core of the GNU/Linux operating system with a number of new features. Among other tweaks, the new kernel rolls in new drivers, improves virtualization support, adds new hardware sleep modes, and tweaks file system and storage support.
The new kernel has added quite a few new experimental features, but developers/enthusiasts will no longer have to employ the CONFIG_EXPERIMENTAL flag when compiling the kernel in order to enable them. The kernel development team has decided to remove that option, enable the features by default, and merely tag those experimental features in the documentation. One of the experimental features is SSD caching that allows a solid state drive to cache both reads and writes. The SSD can cache frequently accessed data on the faster solid state drive as well as take the write cached data and write it to the hard drive when the IO subsystem isn’t being heavily utilized. The feature is not new to Linux distributions, but the caching support has now been moved to the kernel. Furthermore, the kernel is now RAID-aware when using the btrfs file system and RAID 5 or RAID 6.
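The read and write caching behavior described above can be illustrated with a toy write-back cache. To be clear, this is a conceptual sketch and not the kernel's implementation: the class and method names are invented, plain dictionaries stand in for the SSD and the hard drive, and there is no eviction, so the "SSD" is assumed to be large enough for everything it sees.

```python
class WriteBackCache:
    """Toy write-back cache: a fast store in front of a slow backing store."""

    def __init__(self, backing):
        self.backing = backing  # dict standing in for the slow hard drive
        self.cache = {}         # dict standing in for the fast SSD
        self.dirty = set()      # blocks written to the cache but not yet to disk

    def read(self, block):
        # On a miss, copy the block from the slow store into the cache.
        if block not in self.cache:
            self.cache[block] = self.backing[block]
        return self.cache[block]

    def write(self, block, data):
        # Writes land on the fast store only and are marked dirty.
        self.cache[block] = data
        self.dirty.add(block)

    def flush(self):
        # Called when the IO subsystem is idle: push dirty blocks to disk.
        flushed = len(self.dirty)
        for block in sorted(self.dirty):
            self.backing[block] = self.cache[block]
        self.dirty.clear()
        return flushed


if __name__ == "__main__":
    disk = {0: b"old data"}
    cache = WriteBackCache(disk)
    cache.write(0, b"new data")
    print("disk before flush:", disk[0])
    cache.flush()
    print("disk after flush: ", disk[0])
```

The point of the sketch is the ordering: a write completes as soon as it reaches the fast device, and the slow device only catches up later when `flush` runs.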
On the driver front, Linux Kernel 3.9 now supports Intel’s upcoming 802.11ac Wi-Fi adapters, improved HD audio codec, AMD’s Oland (8500/8600) and Richland GPUs, and additional NVIDIA GPU support. The new kernel also rolls in a power-optimized driver for Intel’s Haswell GPU and several more track pads.
Kernel 3.9 also adds a new suspend/sleep mode. It will use more power than the traditional S3 (suspend to memory) sleep mode because components are not completely powered down (merely at their lowest sleep mode), but the system will be almost-instantly accessible upon exiting the new suspend mode as a result. According to H-Online, this "lightweight suspend" mode would be ideal for mobile devices or hardware used in network appliances. Also interesting is support for a KVM hypervisor on ARM Cortex A15 SoCs as well as some software tweaks to the kernel to improve web server workloads by allowing multiple networking sockets (and associated CPU processes) to listen on the same network port.
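The shared listening port mentioned above is exposed to applications as the `SO_REUSEPORT` socket option, which Linux gained in kernel 3.9. The sketch below is a minimal illustration using Python's standard `socket` module on Linux; the helper names are invented, and for brevity all the sockets are opened in a single process, whereas a real web server would typically open one per worker process.

```python
import socket


def reuseport_listener(host, port):
    """Return a TCP socket listening on (host, port) with SO_REUSEPORT set."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # The option must be set before bind() on every socket sharing the port.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
    sock.bind((host, port))
    sock.listen(16)
    return sock


def shared_port_listeners(count, host="127.0.0.1"):
    """Open `count` listening sockets that all share one kernel-chosen port."""
    first = reuseport_listener(host, 0)  # port 0 lets the kernel pick a free port
    port = first.getsockname()[1]
    others = [reuseport_listener(host, port) for _ in range(count - 1)]
    return port, [first] + others


if __name__ == "__main__":
    port, socks = shared_port_listeners(3)
    print(f"{len(socks)} sockets are listening on port {port}")
    for s in socks:
        s.close()
```

Without the option, the second `bind()` to the same address and port would fail with "Address already in use". With it, the kernel spreads incoming connections across the listening sockets.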
In all, version 3.9 looks to be a worthy upgrade, and one that I hope Linux distro makers will opt for in upcoming releases. I think the new drivers and the SSD caching being rolled into the kernel are the most important features for desktop users, though the networking stack improvements also sound interesting.
For more details, Thorsten Leemhuis has written up an extensive article on the new kernel.
British chip design company ARM recently released an unaudited financial report with details on its Q1 2013 performance. The mobile SoC giant announced that 2.6 billion ARM chips shipped in the first quarter of this year, a 35% improvement over last year and further evidence that ARM still dominates the low-power mobile market.
In fact, the chip designer made $94.9 million in licensing all those ARM chips, which was a big chunk of the company’s total Q1 2013 revenue of $263.9 million. Revenue was up by 26% versus the first quarter of the previous year (Q1 2012), which was only $209.4 million. Further, ARM’s profit (pre-tax) is 89.4 million pounds or approximately $137 million USD.
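The percentages in that paragraph can be rechecked from the dollar figures alone. This is just arithmetic on the numbers quoted in the post; the function names are invented for illustration, and the exchange rate it prints is simply what the two profit figures imply, not an official rate.

```python
def growth(current, previous):
    """Return year-over-year growth as a fraction of the previous value."""
    return (current - previous) / previous


def share(part, total):
    """Return part as a fraction of total."""
    return part / total


if __name__ == "__main__":
    # All values in millions, as quoted in the post.
    revenue_q1_2013 = 263.9
    revenue_q1_2012 = 209.4
    licensing_q1_2013 = 94.9
    profit_gbp = 89.4
    profit_usd = 137.0

    print(f"Revenue growth:         {growth(revenue_q1_2013, revenue_q1_2012):.1%}")
    print(f"Licensing share:        {share(licensing_q1_2013, revenue_q1_2013):.1%}")
    print(f"Implied USD per GBP:    {profit_usd / profit_gbp:.2f}")
```

The revenue growth comes out at about 26%, matching the post, and licensing accounts for roughly 36% of the quarter's revenue.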
ARM saw revenue from licensing and royalties increase year over year (YoY) by 24% and 33% respectively, which indicates that more companies are jumping into the mobile and embedded markets with ARM chips or licenses to make custom designs of their own. According to the report, the company sold five times more Mali GPUs, saw a 50% increase in ARM-powered embedded devices, and noticed a 25% increase in ARM mobile devices year over year. ARM has also started moving ARMv8 (64-bit ARM) licenses. Of the total 22 licenses in Q1 2013, 7 of the licenses were for ARM’s Cortex-A50 series processors along with a single ARMv8 license (a total of 9 to date). In Q1 2013, ARM also sold three Mali GPU licenses, and one of those was for the company’s high-end Skrymir GPU.
In all, ARM had a good first quarter and is showing signs of increased growth. With ARMv8 on the horizon, I am interested to see the company’s numbers next year and how they compare year over year as ARM attempts to take over the server room in particular. The profits and revenue are modest in comparison to X86 giant Intel's Q1 2013 results, but are not bad at all for a company that doesn’t produce chips itself!
Subject: Systems | April 19, 2013 - 03:56 AM | Tim Verry
Tagged: servers, project moonshot, microserver, hp, arm, Applied Micro Circuits, 64-bit
A recent press release from AppliedMicro (Applied Micro Circuits Corporation) announced that the company’s X-Gene server on a chip technology would be used in an upcoming HP Project Moonshot server.
An HP Moonshot server (expect the X-Gene version to be at least slightly different).
The X-Gene is a 64-bit ARM SoC that combines ARM processing cores with networking and storage offload engines as well as a high-speed interconnect networking fabric. AppliedMicro designed the chip to provide ARM-powered servers that will reportedly reduce the Total Cost of Ownership of running webservers in a data center by reducing upfront hardware and ongoing electrical costs.
The X-Gene chips that will appear in HP’s Project Moonshot servers feature a SoC with eight AppliedMicro-designed 64-bit ARMv8 cores clocked at 2.4GHz, four ARM Cortex A5 cores for running the Software Defined Network (SDN) controller, and support for storage IO, PCI-E IO, and integrated Ethernet (four 10Gb Ethernet links). The X-Gene chips are located on daughter cards that slot into a carrier board that has networking fabric to connect all the X-Gene cards (and the SoCs on those cards). Currently, servers using X-Gene SoCs require a hardware switch to connect all of the X-Gene cards in a rack. However, the next-generation 28nm X-Gene chips will eliminate the need for a rack-level hardware switch and will feature 100Gb networking links.
The X-Gene chips in HP Project Moonshot will use relatively little power compared to Xeon-based solutions. AppliedMicro has stated that the X-Gene chips will be at least twice as power efficient, but has not officially released power consumption numbers for the X-Gene chips under load. However, the X-Gene SoCs will use as little as 500mW and 300mW of power at idle and standby (sleep mode) respectively. The 64-bit quad issue, Out of Order Execution chips are some of the most powerful ARM processors to date, though they will soon be joined by ARM’s own 64-bit design(s). I think the X-Gene chips are intriguing, and I am excited to see how well they fare in the data center environment running server applications. ARM has handily taken over the mobile space, but it is still relatively new in the server world. Even so, the 64-bit ARM chips by AppliedMicro (X-Gene) and others are the first step towards ARM being a viable option for servers.
According to AppliedMicro, HP Project Moonshot servers with X-Gene SoCs will be available later this year. You can find the press blast below.
Subject: General Tech | April 12, 2013 - 02:08 AM | Tim Verry
Tagged: SECO, nvidia, mini ITX, kepler, kayla, GTC 13, GTC, CUDA, arm
Last month, NVIDIA revealed its Kayla development platform that combines a quad core Tegra System on a Chip (SoC) with an NVIDIA Kepler GPU. Kayla will be out later this year, but that has not stopped other board makers from putting together their own solutions. One such solution that began shipping earlier this week is the mITX GPU Devkit from SECO.
The new mITX GPU Devkit is a hardware platform for developers to program CUDA applications for mobile devices, desktops, workstations, and HPC servers. It combines a NVIDIA Tegra 3 processor, 2GB of RAM, and 4GB of internal storage (eMMC) on a Qseven module with a Mini-ITX form factor motherboard. Developers can then plug their own CUDA-capable graphics card into the single PCI-E 2.0 x16 slot (which actually runs at x4 speeds). Additional storage can be added via an internal SATA connection, and cameras can be hooked up using the CIC headers.
Rear IO on the mITX GPU Devkit includes:
- 1 x Gigabit Ethernet
- 3 x USB
- 1 x OTG port
- 1 x HDMI
- 1 x DisplayPort
- 3 x Analog audio
- 2 x Serial
- 1 x SD card slot
The SECO platform is proving to be popular for GPGPU in the server space, especially with systems like Pedraforca. The intention of using these types of platforms in servers is to save power by using a low power ARM chip for inter-node communication and basic tasks while the real computing is done solely on the graphics cards. With Intel’s upcoming Haswell-based Xeon chips getting down to 13W TDPs though, systems like this are going to be more difficult to justify. SECO is mostly positioning this platform as a development board, however. One use in that respect is to begin optimizing GPU-accelerated code for mobile devices. With future Tegra chips set to get CUDA-compatible graphics, new software development and optimization of existing GPGPU code for smartphones and tablets will be increasingly important.
Either way, the SECO mITX GPU Devkit is available now for 349 EUR or approximately $360 (in both cases, before any taxes).