Subject: General Tech | April 12, 2013 - 02:08 AM | Tim Verry
Tagged: SECO, nvidia, mini ITX, kepler, kayla, GTC 13, GTC, CUDA, arm
Last month, NVIDIA revealed its Kayla development platform, which combines a quad-core Tegra System on a Chip (SoC) with an NVIDIA Kepler GPU. Kayla will be out later this year, but that has not stopped other board makers from putting together their own solutions. One such solution that began shipping earlier this week is the mITX GPU Devkit from SECO.
The new mITX GPU Devkit is a hardware platform for developers to program CUDA applications for mobile devices, desktops, workstations, and HPC servers. It combines an NVIDIA Tegra 3 processor, 2GB of RAM, and 4GB of internal storage (eMMC) on a Qseven module with a Mini-ITX form factor motherboard. Developers can then plug their own CUDA-capable graphics card into the single PCI-E 2.0 x16 slot (which is electrically wired for only four lanes, so it runs at x4 speeds). Additional storage can be added via an internal SATA connection, and cameras can be hooked up using the CIC headers.
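To put the x4 limitation in perspective, here is a rough back-of-the-envelope sketch of theoretical PCI-E 2.0 bandwidth. It assumes the standard PCI-E 2.0 figures of 5 GT/s per lane with 8b/10b encoding; the function name is made up for illustration, and real-world throughput will be lower once protocol overhead is counted.

```python
# Theoretical PCI-E 2.0 bandwidth per direction for a given lane count.
# Assumes standard PCI-E 2.0 signaling; ignores packet/protocol overhead.
PCIE2_TRANSFER_RATE_GT_S = 5.0      # 5 GT/s per lane
PCIE2_ENCODING_EFFICIENCY = 8 / 10  # 8b/10b line encoding


def pcie2_bandwidth_gb_s(lanes: int) -> float:
    """Peak one-direction bandwidth in GB/s (hypothetical helper)."""
    gbit_per_lane = PCIE2_TRANSFER_RATE_GT_S * PCIE2_ENCODING_EFFICIENCY
    return lanes * gbit_per_lane / 8  # bits -> bytes


x4 = pcie2_bandwidth_gb_s(4)    # what the devkit slot actually provides
x16 = pcie2_bandwidth_gb_s(16)  # what a full x16 slot would provide
print(f"x4: {x4:.1f} GB/s, x16: {x16:.1f} GB/s")
```

In other words, the card gets roughly a quarter of the host bandwidth a full x16 link would offer, which matters most for workloads that shuffle a lot of data between system memory and the GPU.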
Rear IO on the mITX GPU Devkit includes:
- 1 x Gigabit Ethernet
- 3 x USB
- 1 x OTG port
- 1 x HDMI
- 1 x Display Port
- 3 x Analog audio
- 2 x Serial
- 1 x SD card slot
The SECO platform is proving to be popular for GPGPU in the server space, especially with systems like Pedraforca. The intention of using these types of platforms in servers is to save power by using a low power ARM chip for inter-node communication and basic tasks while the real computing is done solely on the graphics cards. With Intel’s upcoming Haswell-based Xeon chips getting down to 13W TDPs though, systems like this are going to be more difficult to justify. SECO is mostly positioning this platform as a development board, however. One use in that respect is to begin optimizing GPU-accelerated code for mobile devices. With future Tegra chips set to get CUDA-compatible graphics, new software development and optimization of existing GPGPU code for smartphones and tablets will be increasingly important.
Either way, the SECO mITX GPU Devkit is available now for 349 EUR or approximately $360 (in both cases, before any taxes).
Subject: General Tech | March 31, 2013 - 08:43 PM | Tim Verry
Tagged: nvidia, lenovo yoga, GTC 2013, GTC, gesture control, eyesight, ECS
During the Emerging Companies Summit at NVIDIA's GPU Technology Conference, Israeli company EyeSight Mobile Technologies' CEO Gideon Shmuel took the stage to discuss the future of its gesture recognition software. He also provided insight into how EyeSight plans to use graphics cards to improve and accelerate the process of identifying, and responding to, finger and hand movements along with face detection.
EyeSight is a five-year-old company that has developed gesture recognition software that can be installed on existing machines (though it appears to be aimed more at OEMs than directly at consumers). It can use standard cameras, such as webcams, to get its 2D input data and then derives a relative Z-axis using proprietary algorithms. This gives EyeSight essentially 2.5D of input data and, camera resolution and frame rate permitting, allows the software to identify and track finger and hand movements. EyeSight CEO Gideon Shmuel stated at the ECS presentation that the software is currently capable of "finger-level accuracy" at 5 meters from a TV.
Gestures include using your fingers as a mouse to point at on-screen objects, waving your hand to turn pages, scrolling, and even giving hand signal cues.
The software is not open source, and there are no plans to move in that direction. The company has 15 patents pending on its technology, several of which it managed to file before the US Patent Office changed from First to Invent to First Inventor to File (heh, which is another article...). The software will support up to 20 million hardware devices in 2013, and EyeSight expects the number of compatible camera-packing devices to increase further to as many as 3.5 billion in 2015. Other features include the ability to transparently map EyeSight input to Android apps without users needing to muck with settings, and the ability to detect faces and "emotional signals" even in low light. According to the website, SDKs are available for Windows, Linux, and Android. On Windows, the software maps the gestures it recognizes to keyboard shortcuts to increase compatibility with existing applications (so long as they support keyboard shortcuts).
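The gesture-to-shortcut idea is easy to picture as a simple lookup table. The sketch below is purely illustrative and is not EyeSight's SDK: the gesture names, the shortcut choices, and the `dispatch` helper are all hypothetical, and the actual key injection is left to a caller-supplied function.

```python
from typing import Callable, Dict, Tuple

Shortcut = Tuple[str, ...]

# Hypothetical gesture names and shortcut choices, for illustration only.
GESTURE_TO_SHORTCUT: Dict[str, Shortcut] = {
    "swipe_left": ("Alt", "Left"),    # e.g. "back" in many applications
    "swipe_right": ("Alt", "Right"),  # e.g. "forward"
    "wave_up": ("PageUp",),
    "wave_down": ("PageDown",),
}


def dispatch(gesture: str, send_keys: Callable[[Shortcut], None]) -> bool:
    """Look up a recognized gesture and hand its shortcut to send_keys.

    Returns False when the gesture has no mapping, so nothing is sent.
    """
    shortcut = GESTURE_TO_SHORTCUT.get(gesture)
    if shortcut is None:
        return False
    send_keys(shortcut)
    return True


sent = []  # stand-in for a real key-injection backend
dispatch("wave_down", sent.append)
dispatch("thumbs_up", sent.append)  # unmapped, silently ignored
print(sent)
```

The appeal of this approach is that applications need no gesture awareness at all; anything that already responds to a keyboard shortcut works unmodified.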
Currently, the EyeSight software mostly runs on the CPU, but the company is investing heavily in GPU support. Moving the processing to GPUs will allow the software to run faster and more power efficiently, especially on mobile devices (NVIDIA's Tegra platform was specifically mentioned). EyeSight's roadmap includes using GPU acceleration to bolster the number of supported gestures, move image processing to the GPU, add velocity and vector control inputs, incorporate a better low-light filter (which will run on the GPU), and offload processing from the CPU. That offloading is meant to optimize power management and save CPU resources for the OS and other applications, which is especially important for mobile devices. Gideon Shmuel also stated that he wants to see the technology being used on "anything with a display," from your smartphone to your air conditioner.
A basic version of the EyeSight input technology reportedly comes installed on the Lenovo Yoga convertible tablet. I think this software has potential, and would provide that Minority Report-like interaction that many enthusiasts wish for. Hopefully, EyeSight can deliver on its claimed accuracy figures and OEMs will embrace the technology by integrating it into future devices.
EyeSight has posted additional video demos and information about its touch-free technology on its website.
Do you think this "touch-free" gesture technology has merit, or will this type of input remain limited to awkward integration in console games?
Subject: General Tech, Graphics Cards | March 20, 2013 - 01:47 PM | Tim Verry
Tagged: tesla, tegra 3, supercomputer, pedraforca, nvidia, GTC 2013, GTC, graphics cards, data centers
There is a lot of talk about heterogeneous computing at GTC, in the sense of adding graphics cards to servers. If you have HPC workloads that can benefit from GPU parallelism, adding GPUs gives you computing performance in less physical space, and using less power, than a CPU only cluster (for equivalent TFLOPS).
However, there was a session at GTC that actually took things to the opposite extreme. Instead of a CPU only cluster or a mixed cluster, Alex Ramirez (leader of the Heterogeneous Architectures Group at Barcelona Supercomputing Center) is proposing a homogeneous GPU cluster called Pedraforca.
Pedraforca V2 combines NVIDIA Tesla GPUs with low power ARM processors. Each node comprises the following components:
- 1 x Mini-ITX carrier board
- 1 x Q7 module (which hosts the ARM SoC and memory)
  - Current config is one Tegra 3 @ 1.3GHz and 2GB DDR2
- 1 x NVIDIA Tesla K20 accelerator card (1170 GFLOPS)
- 1 x InfiniBand 40Gb/s card (via Mellanox ConnectX-3 slot)
- 1 x 2.5" SSD (SATA 3 MLC, 250GB)
The ARM processor is used solely for booting the system and facilitating GPU communication between nodes. It is not intended to be used for computing. According to Dr. Ramirez, in situations where running code on a CPU would be faster, it would be best to have a small number of Intel Xeon powered nodes to do the CPU-favorable computing, and then offload the parallel workloads to the GPU cluster over the InfiniBand connection. This is less than ideal, though; Pedraforca would be most efficient with data sets that can be processed solely on the Tesla cards.
While Pedraforca is not necessarily locked to NVIDIA's Tegra hardware, it is currently the only SoC that meets their needs. The system requires the ARM chip to have PCI-E support. The Tegra 3 SoC has four PCI-E lanes, so the carrier board is using two PLX chips to allow the Tesla and InfiniBand cards to both be connected.
The researcher stated that he is also looking forward to using NVIDIA's upcoming Logan processor in the Pedraforca cluster. It will reportedly be possible to upgrade existing Pedraforca clusters with the new chips by replacing the existing (Tegra 3) Q7 module with one that has the Logan SoC when it is released.
Pedraforca V2 has an initial cluster size of 64 nodes. While the speaker was reluctant to provide TFLOPS performance numbers, as it would depend on the workload, with 64 Tesla K20 cards, it should provide respectable performance. The intent of the cluster is to save power costs by using a low power CPU. If your server kernel and applications can run on GPUs alone, there are noticeable power savings to be had by switching from a ~100W Intel Xeon chip to a lower-power (approximately 2-3W) Tegra 3 processor. If you have a kernel that needs to run on a CPU, it is recommended to run the OS on an Intel server and transfer just the GPU work to the Pedraforca cluster. Each Pedraforca node is reportedly under 300W, with the Tesla card being the majority of that figure. Despite the limitations, and niche nature of the workloads and software necessary to get the full power-saving benefits, Pedraforca is certainly an interesting take on a homogeneous server cluster!
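For a rough sense of scale, the figures quoted in this article can be multiplied out. This is only a paper calculation under stated assumptions, not a performance claim: it uses the 1170 GFLOPS per-card number from the node spec list as a theoretical peak, treats "under 300W" as a per-node ceiling, and takes the upper end of the 2-3W Tegra 3 estimate. The function names are made up for illustration.

```python
# Paper math from the figures quoted in the article; real workloads will
# land well below theoretical peak, which is why the speaker gave no number.
NODES = 64
K20_PEAK_GFLOPS = 1170     # per-card figure from the node spec list
NODE_POWER_CEILING_W = 300 # "under 300W" per node, used as an upper bound
XEON_POWER_W = 100         # "~100W" Intel Xeon
TEGRA3_POWER_W = 3         # upper end of "approximately 2-3W"


def cluster_peak_tflops(nodes: int, gflops_per_card: float) -> float:
    """Aggregate theoretical peak, one GPU per node."""
    return nodes * gflops_per_card / 1000


def cluster_power_ceiling_kw(nodes: int, watts_per_node: float) -> float:
    """Upper bound on node power draw (excludes switches, cooling, etc.)."""
    return nodes * watts_per_node / 1000


def host_cpu_savings_kw(nodes: int, big_cpu_w: float, small_cpu_w: float) -> float:
    """Power saved by swapping the host CPU on every node."""
    return nodes * (big_cpu_w - small_cpu_w) / 1000


print(f"Peak: {cluster_peak_tflops(NODES, K20_PEAK_GFLOPS):.2f} TFLOPS")
print(f"Power ceiling: {cluster_power_ceiling_kw(NODES, NODE_POWER_CEILING_W):.1f} kW")
print(f"CPU swap saves: {host_cpu_savings_kw(NODES, XEON_POWER_W, TEGRA3_POWER_W):.2f} kW")
```

Under those assumptions, the 64-node cluster tops out around 75 TFLOPS on paper within a ceiling of about 19 kW, and trading Xeons for Tegra 3 chips trims a bit over 6 kW across the cluster.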
In another session relating to the path to exascale computing, power use in data centers was listed as one of the biggest hurdles to getting to exaflop levels of performance. While Pedraforca is not the answer to exascale, it should at least be a useful learning experience in wringing the most parallelism out of code and pushing GPGPU to the limits. That research will help other clusters use GPUs more efficiently as researchers explore the future of computing.
The Pedraforca project built upon research conducted on Tibidabo, a multi-core ARM CPU cluster, and CARMA (CUDA on ARM development kit) which is a Tegra SoC paired with an NVIDIA Quadro card. The two slides below show CARMA benchmarks and a Tibidabo cluster.
Stay tuned to PC Perspective for more GTC 2013 coverage!