Subject: Processors | March 14, 2017 - 03:17 PM | Jeremy Hellstrom
Tagged: nvidia, JetsonTX1, Denver, Cortex A57, pascal, SoC
Amongst the furor of the Ryzen launch, NVIDIA's new Jetson TX2 SoC was quietly sent out to reviewers, and today the NDA expired so we can see how it performs. There are more Ryzen reviews below the fold, including Phoronix's Linux testing, if you want to skip ahead. In addition to the specifications in the quote, you will find 8GB of 128-bit LPDDR4 offering memory bandwidth of 58.4 GB/s and 32GB of eMMC for local storage. This Jetson is running JetPack 3.0 L4T, based on the Linux 4.4.15 kernel. Phoronix tested out its performance; see for yourself.
"Last week we got to tell you all about the new NVIDIA Jetson TX2 with its custom-designed 64-bit Denver 2 CPUs, four Cortex-A57 cores, and Pascal graphics with 256 CUDA cores. Today the Jetson TX2 is shipping and the embargo has expired for sharing performance metrics on the JTX2."
Here are some more Processor articles from around the web:
- Hands-On Nvidia Jetson TX2: Fast Processing for Embedded Devices @ Hack a Day
- AMD Ryzen 7 1700X Review; Testing SMT @ Hardware Canucks
- AMD Ryzen 7 1700 Linux Benchmarks: Great Multi-Core Performance For $329 @ Phoronix
Subject: General Tech, Processors | March 12, 2017 - 05:11 PM | Tim Verry
Tagged: pascal, nvidia, machine learning, iot, Denver, Cortex A57, ai
Measuring 50mm x 87mm, the Jetson TX2 packs quite a bit of processing power and I/O, including an SoC with two 64-bit Denver 2 cores with 2MB L2, four ARM Cortex A57 cores with 2MB L2, and a 256-core GPU based on NVIDIA’s Pascal architecture. The TX2 compute module also hosts 8 GB of LPDDR4 (58.3 GB/s) and 32 GB of eMMC storage (SDIO and SATA are also supported). As far as I/O goes, the Jetson TX2 uses a 400-pin connector to attach the compute module to the development board or final product, and the final I/O available to users will depend on the product it is used in. The compute module itself supports up to the following:
- 2 x DSI
- 2 x DP 1.2 / HDMI 2.0 / eDP 1.4
- USB 3.0
- USB 2.0
- 12 x CSI lanes for up to 6 cameras (2.5 Gb/second/lane)
- PCI-E 2.0:
- One x4 + one x1 or two x1 + one x2
- Gigabit Ethernet
The Jetson TX2 runs the “Linux for Tegra” operating system. According to NVIDIA, the Jetson TX2 can deliver up to twice the performance of the TX1, or the same performance as the TX1 at up to twice the efficiency, drawing just 7.5 watts.
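Since NVIDIA's claim is relative ("up to 2x"), the arithmetic behind it can be sketched with hypothetical numbers. Everything below is an assumption for illustration: the TX1 is treated as a 1.0x performance baseline, and a 15 W TX1 draw is assumed purely so the stated ratio works out.

```python
# Hypothetical figures for illustration only; NVIDIA's claim is relative,
# not absolute. TX1 is the 1.0x baseline; the 15 W draw is assumed.
tx1_perf, tx1_watts = 1.0, 15.0

# Claim 1: up to twice the TX1's performance (at comparable power)
tx2_max_perf = 2.0 * tx1_perf

# Claim 2: the TX1's level of performance while drawing only 7.5 W
tx2_eff_perf, tx2_eff_watts = tx1_perf, 7.5

# Performance-per-watt ratio versus the TX1 baseline
efficiency_gain = (tx2_eff_perf / tx2_eff_watts) / (tx1_perf / tx1_watts)
print(efficiency_gain)  # 2.0, i.e. "twice the efficiency"
```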
The extra horsepower afforded by the faster CPU, updated GPU, and increased memory and memory bandwidth will reportedly enable smart end-user devices with faster facial recognition, more accurate speech recognition, and smarter AI and machine learning tasks (e.g. personal assistants, smart street cameras, smarter home automation, etc.). Bringing more processing power locally to these types of Internet of Things devices is a good thing, as less reliance on the cloud potentially means more privacy (unfortunately, there is not as much incentive for companies to make this type of product for the mass market, but you could use the TX2 to build your own).
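As a concrete (if toy) example of the kind of local inference such a device could run, here is a minimal Python sketch of embedding-based face matching. The enrolled names, vectors, and threshold are all made up; a real system would produce embeddings with a GPU-accelerated network rather than hand-written lists.

```python
import math

def cosine_similarity(a, b):
    # Standard cosine similarity between two equal-length vectors
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Hypothetical enrolled face embeddings (a real network would emit
# hundreds of dimensions, not four)
enrolled = {"alice": [0.9, 0.1, 0.2, 0.4],
            "bob":   [0.1, 0.8, 0.7, 0.1]}

def identify(probe, threshold=0.8):
    # Return the best enrolled match above the (made-up) threshold, else None
    name, score = max(((n, cosine_similarity(probe, e))
                       for n, e in enrolled.items()),
                      key=lambda t: t[1])
    return name if score >= threshold else None
```

The point of running this on-device rather than in the cloud is that the probe embedding (derived from a camera frame) never leaves the hardware.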
Cisco will reportedly use the Jetson TX2 to add facial and speech recognition to its Cisco Spark devices. In addition to the hardware, NVIDIA offers SDKs and tools as part of JetPack 3.0. The JetPack 3.0 toolkit includes TensorRT, cuDNN 5.1, VisionWorks 1.6, CUDA 8, and support and drivers for OpenGL 4.5, OpenGL ES 3.2, EGL 1.4, and Vulkan 1.0.
The TX2 will enable better, stronger, and faster (well, I don't know about stronger, heh) industrial control systems, robotics, home automation, embedded computers and kiosks, smart signage, security systems, and other connected IoT devices (that, for the love of all processing, are hopefully hardened and secured so they aren't used as part of a botnet!).
Interested developers and makers can pre-order the Jetson TX2 Development Kit for $599 with a ship date for US and Europe of March 14 and other regions “in the coming weeks.” If you just want the compute module sans development board, it will be available later this quarter for $399 (in quantities of 1,000 or more). The previous generation Jetson TX1 Development Kit has also received a slight price cut to $499.
Subject: General Tech | October 16, 2014 - 01:16 PM | Ken Addison
Tagged: podcast, video, nvidia, GTX 980, sli, 3-way sli, 4-way sli, amd, R9 290X, Samsung, 840 evo, Intel, corsair, HX1000i, gigabyte, Z97X-UD5H, Lenovo, yoga 3 pro, yoga tablet 2, nexus 9, tegra k1, Denver
PC Perspective Podcast #322 - 10/16/2014
Join us this week as we discuss GTX 980 4-Way SLI, Samsung's EVO Performance Fix, Intel Earnings and more!
The URL for the podcast is: http://pcper.com/podcast - Share with your friends!
- iTunes - Subscribe to the podcast directly through the Store
- RSS - Subscribe through your regular RSS reader
- MP3 - Direct download link to the MP3 file
Hosts: Ryan Shrout, Josh Walrath, Morry Tietelman
Program length: 1:26:16
Week in Review:
News items of interest:
0:46:25 You Missed It! PCPer Live! Borderlands: The Pre-Sequel Game Stream Powered by NVIDIA
0:48:20 Trio of Lenovo News
Hardware/Software Picks of the Week:
Ryan: Sonos BOOST
Subject: Mobile | October 15, 2014 - 01:10 PM | Ryan Shrout
Tagged: tegra k1, tegra, nvidia, nexus 9, Nexus, google, Denver
Along with the announcement of the Google Nexus 6 phone, Google is also announcing a new tablet, the Nexus 9. Sporting an 8.9-in IPS screen with a 2048x1536 resolution (4:3 standing strong!), a 6700 mAh battery, and the new Android Lollipop operating system, perhaps the most interesting specification is that it is built around NVIDIA's Tegra K1 SoC. Specifically, the 64-bit version based on the dual-core, custom-built Denver design, marking that architecture's first release in a shipping product.
UPDATE: Amazon.com has the Google Nexus 9 up for pre-order in both 16GB and 32GB capacities!
The Tegra K1 with 64-bit Denver cores is unique in that it marks the first time NVIDIA has not used off-the-shelf cores from ARM in its SoC designs. We also know, based on Tim's news post on PC Perspective in August, that the architecture uses a 7-way superscalar design and actually runs a custom instruction set, with ARMv8 instructions translated to it in real time.
A software layer and a 128MB cache enhance the Dynamic Code Optimization technology by allowing the processor to examine and optimize the ARM code, convert it to the custom instruction set, and cache the converted microcode of frequently used applications (the cache can be bypassed for infrequently processed code). Using the wider execution engine and Dynamic Code Optimization (which is transparent to ARM developers and does not require updated applications), NVIDIA touts the dual Denver core Tegra K1 as being at least as powerful as the quad- and octo-core-packing competition.
It is great news for NVIDIA that Google is using this version of the Tegra K1 (can we please just get a different name for this version of the chip?), as it indicates Google's commitment to the architecture in Android going forward, opening doors for the part's integration into even more devices from other hardware vendors.
More than likely built by HTC, the Nexus 9 will ship in three different colors (black, white, and beige) and has a lot of callbacks to the Nexus 7, one of, if not THE, most popular Android tablets on the market. The tablet has front-facing speakers, which should make it good for headphone-free media consumption when necessary. You'll be able to put the Nexus 9 into a working mode easily with a new magnetically attached keyboard dock, similar to the iPad accessories widely available.
The Nexus 9 weighs in at 425g (the iPad Air weighs 478g) and will have 16GB and 32GB capacity options, going up for pre-order on 10/17 and shipping by 11/03. Google will sell both a 32GB Wi-Fi and a 32GB LTE model, with the LTE version (as well as the Sand color) shipping "later this year." Pricing is set at $399 for the 16GB model, $479 for the 32GB model, and $599 for the 32GB+LTE version. That is quite a price hike for LTE capability, and the $80 gap between the 16GB and 32GB options is annoying as well.
| Spec | Detail |
| --- | --- |
| Screen | 8.9" IPS LCD TFT, 4:3 aspect ratio, QXGA (2048x1536) |
| Size | 153.68 mm x 228.25 mm x 7.95 mm |
| Weight | WiFi: 14.99 ounces (425g); LTE: 15.38 ounces (436g) |
| Camera | Rear: 8MP, f/2.4, 29.2mm focal length (35mm equiv), auto-focus, LED flash; Front: 1.6MP, f/2.4, 26.1mm focal length (35mm equiv), fixed-focus, no flash |
| Audio | Front-facing stereo speakers, complete with HTC’s BoomSound™ technology |
| Memory | 16 or 32 GB eMMC 4.51 storage (actual formatted capacity will be less) |
| CPU | NVIDIA Tegra K1, 64-bit; dual Denver CPUs @ 2.3 GHz |
| GPU | Kepler 192-core GPU |
| Wireless | Broadcom 802.11ac 2x2 (MIMO) |
| Network | Quad-band GSM, CDMA, penta-band HSPA, 4G LTE |
| Power | 6700 mAh; WiFi browsing: up to 9.5 hours; LTE browsing: up to 8.5 hours; video playback: up to 9.5 hours; WiFi standby: up to 30 days; LTE standby: up to 30 days |
| Sensors | GNSS support for GPS, GLONASS, and Beidou; Bosch gyroscope and accelerometer; AKM magnetometer & Hall effect sensor; Capella ambient light sensor |
| Ports & Connectors | Single micro-USB 2.0 for USB data/charging; 3.5mm audio jack; dual front-facing speakers; dual microphones (top/bottom) |
| OS | Android 5.0 Lollipop |
NVIDIA Reveals 64-bit Denver CPU Core Details, Headed to New Tegra K1 Powered Devices Later This Year
Subject: Processors | August 12, 2014 - 01:06 AM | Tim Verry
Tagged: tegra k1, project denver, nvidia, Denver, ARMv8, arm, Android, 64-bit
During GTC 2014 NVIDIA launched the Tegra K1, a new mobile SoC that contains a powerful Kepler-based GPU. Initial processors (and the resultant design wins such as the Acer Chromebook 13 and Xiaomi Mi Pad) utilized four ARM Cortex-A15 cores for the CPU side of things, but later this year NVIDIA is deploying a variant of the Tegra K1 SoC that switches out the four A15 cores for two custom (NVIDIA developed) Denver CPU cores.
The custom 64-bit Denver CPU cores use a 7-way superscalar design and run a custom instruction set. Denver is a wide but in-order architecture that allows up to seven operations per clock cycle. NVIDIA is using a custom ISA and on-the-fly binary translation to convert ARMv8 instructions to microcode before execution. A software layer and a 128MB cache enhance the Dynamic Code Optimization technology by allowing the processor to examine and optimize the ARM code, convert it to the custom instruction set, and cache the converted microcode of frequently used applications (the cache can be bypassed for infrequently processed code). Using the wider execution engine and Dynamic Code Optimization (which is transparent to ARM developers and does not require updated applications), NVIDIA touts the dual Denver core Tegra K1 as being at least as powerful as the quad- and octo-core-packing competition.
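The caching behavior described above can be modeled with a toy sketch. This is purely illustrative, not NVIDIA's implementation: the hot-block threshold, the `uop(...)` microcode strings, and the block-address keys are all invented for the example.

```python
# Toy model of a Dynamic-Code-Optimization-style translation cache:
# frequently executed ARM code blocks are translated once and served from
# a cache, while cold blocks bypass the cache and are re-decoded each time.
HOT_THRESHOLD = 3  # hypothetical: promote a block after 3 executions

class TranslationCache:
    def __init__(self):
        self.counts = {}  # times each block address has been executed
        self.cache = {}   # block address -> cached translated microcode

    def translate(self, arm_code):
        # Stand-in for the real binary translator (ARMv8 -> custom ISA)
        return [f"uop({insn})" for insn in arm_code]

    def execute(self, addr, arm_code):
        self.counts[addr] = self.counts.get(addr, 0) + 1
        if addr in self.cache:          # hot path: reuse cached translation
            return self.cache[addr]
        micro = self.translate(arm_code)
        if self.counts[addr] >= HOT_THRESHOLD:
            self.cache[addr] = micro    # hot enough: keep the translation
        return micro                    # cold blocks bypass the cache
```

The real hardware/software mechanism also re-optimizes hot code (scheduling, renaming), which this sketch omits; it only shows the "translate once, reuse for hot code" idea.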
Further, NVIDIA has claimed that at peak throughput (and in specific situations where application code and DCO can take full advantage of the 7-way execution engine) the Denver-based mobile SoC handily outpaces Intel’s Bay Trail, Apple’s A7 Cyclone, and Qualcomm’s Krait 400 CPU cores. In the results of a synthetic benchmark test provided to The Tech Report, the Denver cores were even challenging Intel’s Haswell-based Celeron 2955U processor. Keeping in mind that these are NVIDIA-provided numbers and likely the best results one can expect, Denver is still quite a bit more capable than existing cores. (Note that the Haswell chips would likely pull much farther ahead when presented with applications that cannot be easily executed in-order with limited instruction parallelism.)
NVIDIA is ratcheting up mobile CPU performance with its Denver cores, but it is also aiming for an efficient chip and has implemented several power-saving tweaks. Beyond the decision to go with an in-order execution engine (with DCO hopefully mostly making up for that), the beefy Denver cores reportedly feature low-latency power state transitions (e.g. between active and idle states), power gating, dynamic voltage, and dynamic clock scaling. The company claims that “Denver's performance will rival some mainstream PC-class CPUs at significantly reduced power consumption.” In real terms, this means that using two Denver cores in place of the quad-core A15 design in the Tegra K1 should not result in significantly lower battery life. The two K1 variants are said to be pin compatible such that OEMs and developers can easily bring upgraded models to market with the faster Denver cores.
For those curious, in the Tegra K1 the two Denver cores (clocked at up to 2.5GHz) share a 16-way L2 cache, and each has 128KB instruction and 64KB data L1 caches to itself. The 128MB Dynamic Code Optimization cache is held in system memory.
Denver is the first (custom) 64-bit ARM processor for Android (with Apple’s A7 being the first 64-bit smartphone chip), and NVIDIA is working on supporting the next generation Android OS known as Android L.
The dual Denver core Tegra K1 is coming later this year, and I am excited to see how it performs. The current K1 chip already has a powerful, fully CUDA-compliant Kepler-based GPU, which has enabled awesome projects such as computer vision and even prototype self-driving cars. With the new Kepler GPU and Denver CPU pairing, I’m looking forward to seeing how NVIDIA’s latest chip is put to work and the kinds of devices it enables.
Are you excited for the new Tegra K1 SoC with NVIDIA’s first fully custom cores?
Subject: General Tech | June 19, 2013 - 09:51 PM | Josh Walrath
Tagged: Volta, nvidia, maxwell, licensing, kepler, Denver, Blogs, arm
Yesterday we all saw the blog piece from NVIDIA that stated that they were going to start licensing their IP to interested third parties. Obviously, there was a lot of discussion about this particular move. Some were in favor, some were opposed, and others yet thought that NVIDIA is now simply roadkill. I believe that it is an interesting move, but we are not yet sure of the exact details or the repercussions of such a decision on NVIDIA’s part.
The biggest bombshell of the entire post was that NVIDIA would be licensing out their latest architecture to interested clients. The Kepler architecture powers the very latest GTX 700 series of cards and at the top end it is considered one of the fastest and most efficient architectures out there. Seemingly, there is a price for this though. Time to dig a little deeper.
Kepler will be the first technology licensed to third-party manufacturers. We will not see full GPUs; these will only be integrated into mobile products.
The very latest Tegra parts from NVIDIA do not feature the Kepler architecture for the graphics portion. Instead, the units featured in Tegra can almost be described as GeForce 7000 series parts. The computational units are split between pixel shaders and vertex shaders. They support a maximum of D3D feature level 9_3 and OpenGL ES 2.0. This is a far cry from a unified shader architecture with support for the latest D3D 11 and OpenGL ES 3.0 specifications. Other mobile units feature the latest Mali and Adreno series of graphics units, which are unified and support DX11 and OpenGL ES 3.0.
So why exactly do the latest Tegras not share the Kepler architecture? Hard to say. It could be a variety of factors, including time to market, available engineering teams, and simulations that could dictate whether power and performance are better served by a less complex unit. Kepler is not simple. A Kepler unit that occupies the same die space could potentially consume more power with any given workload, or conversely it could perform poorly given the same power envelope.
We can look at the desktop side of this argument for some kind of proof. At the top end, Kepler is a champ. The GTX 680/770 has outstanding performance and consumes far less power than the competition from AMD. When we move down a notch to the GTX 660 Ti/HD 7800 series of cards, we see much greater parity in performance and power consumption. Going to the HD 7790 as compared to the 650 Ti Boost, we see the Boost part has slightly better performance but consumes significantly more power. Then we move down to the 650 and 650 Ti, and these parts do not consume any more power than the competing AMD parts, but they also perform much more poorly. I know these are some pretty hefty generalizations, and the engineers at NVIDIA could very effectively port Kepler over to mobile applications without significant performance or power penalties. But so far, we have not seen this work.
Power, performance, and die area aside, there is also another issue to factor in. NVIDIA just announced that they are doing this. We have no idea how long this effort has been going on, but it is very likely that it has only been worked on for the past six months. In that time NVIDIA needs to hammer out how they are going to license the technology, how much manpower they must provide licensees to get those parts up and running, and what kind of fees they are going to charge. There is a lot of work going on there, and this is not a simple undertaking.
So let us assume that some three months ago an interested partner such as Rockchip or Samsung came knocking at NVIDIA’s door. They work out the licensing agreements, and this takes several months. Then we start to see the transfer of technology between the companies. Obviously Samsung and Rockchip are not going to apply this graphics architecture to currently shipping products, but will instead bundle it in with a next-generation ARM-based design. These designs are not spun out overnight. For example, the 64-bit ARMv8 designs have been finalized for around a year, and we do not expect to see initial parts shipping until late 1H 2014. So any partner that decides to utilize NVIDIA’s Kepler architecture for such an application will not see that part released until 1H 2015 at the very earliest.
Shield is still based on a GPU possessing separate pixel and vertex shaders. DX11 and OpenGL ES 3.0? Nope!
If someone decides to license this technology from NVIDIA, it will not be of great concern. The next generation of NVIDIA graphics will already be out by that time, and we could very well be approaching the next iteration for the desktop side. NVIDIA plans on releasing a Kepler-based mobile unit in 2014 (Logan), which would be a full year in advance of any competing product. In 2015 NVIDIA is planning on releasing an ARM product based on the Denver CPU and Maxwell GPU. So we can easily see that NVIDIA will only be licensing out an older-generation product, so it will not face direct competition when it comes to GPUs. NVIDIA obviously is hoping that their GPU tech will still be a step ahead of that of ARM (Mali), Qualcomm (Adreno), and Imagination Technologies (PowerVR).
This is an easy and relatively pain-free way to test the waters that ARM, Imagination Technologies, and AMD are already treading. ARM only licenses IP and has shown the world that it can not only succeed at it, but thrive. Imagination Tech used to produce its own chips much like NVIDIA does, but it changed direction and continues to be profitable. AMD recently opened up about its semi-custom design group that will design specific products for customers and then license those designs out. I do not think this is a desperation move by NVIDIA, but it certainly is one that is probably a little late in coming. The mobile market is exploding, and we are approaching a time where nearly every electricity-based item will have some kind of logic included in it; billions of chips a year will be sold. NVIDIA obviously wants a piece of that market. Even a small piece of “billions” is going to be significant to the bottom line.