Subject: Editorial | April 26, 2018 - 02:34 AM | Josh Walrath
Tagged: Zen, tesla, raja koduri, Jim Keller, Intel, Conroe, Banias, amd
Update: The official Intel announcement can be found here.
For anyone who follows the twists and turns of the semiconductor world, the name “Jim Keller” is approaching legendary proportions. He was a driving force in AMD’s K7 and K8 development, moved on to PA Semi (which was acquired by Apple to produce its class-leading SoCs for the iPhone), and then returned to AMD to become lead architect of the Zen architecture that powers the latest Ryzen CPUs. He then moved to Tesla to head chip development for their autonomous driving program.
Very little was heard from Jim Keller while he was at Tesla. The assumption was that he was quietly working to innovate the chip designs that would give next-generation Tesla vehicles fully autonomous driving capability. That program is still in its infancy, and we have not heard of custom chips being utilized in the latest cars.
Now we have confirmation that Jim has left Tesla and has in fact been hired by Intel. Some months back Raja Koduri was hired by Intel to be in charge of all core development with a special interest in GPUs. It looks as if Raja has persuaded Jim to hop on board and help with what appears to be a stagnant core development team on the CPU side.
Intel has a history of a “not invented here” mentality that has caused the company massive problems in previous years. Its reliance on the Pentium 4 and its further development allowed its primary competitor to sneak up on it and shake up the marketplace. It took a design group out of Israel to set Intel onto a better path with the Banias/Conroe architectures, which then led to the Core architecture that we have seen iterated upon for the past decade.
The company has stagnated again. While the current Core architecture is faster than Zen in terms of IPC, Intel has not pursued innovation in a manner that keeps its competitor at bay. Jim Keller went back to AMD and architected what would become the Zen family of chips. In the space of those years, he took the best technology AMD had to offer and built a new architecture from the ground up that could compete against Intel for a fraction of the R&D budget the semiconductor giant typically spends. Intel stands to lose significant market share in mobile, desktop, and server to the latest offerings from AMD.
Combined with the issues the manufacturing group has run into developing the 10nm process, Intel seems finally to realize that design is what matters most when manufacturing problems hit. We can remember back in the Athlon 64/Pentium 4 days when AMD was 18 months behind on process technology but still held a power/performance edge over Intel. While manufacturing can give a large advantage to any chip, a great design does not have to rely as heavily on cutting-edge process tech to be competitive. Intel should hold all the keys to creating a truly overpowering series of products for its primary markets, but AMD has shown up with the plucky architecture that could cause some serious perturbations throughout the mobile, desktop, and server markets.
It seems that Raja is “getting the gang back together” to revamp the design culture at Intel and more adequately deal with threats to its CPU dominance across the board. They are probably also looking more closely at the ultra-mobile market that ARM has dominated for the past decade. Previous Atom designs have not come close to the efficiency needed to address those markets, but perhaps with a change of leadership and architects we will see Intel successfully address this very important area with the high-performance, high-efficiency chips that we honestly expect them to be able to design.
Jim Keller to Intel looks to be a transformational move, not just because of his expertise in architecture, but also because it signals a shift in how Intel goes about its daily business. Bringing this kind of expertise into the company is a watershed moment, a move away from the “not invented here” mentality that seems to dictate decisions when the company is not facing serious competition. We will see what kind of power Raja and Jim can leverage in changing the culture of the company. What cannot be denied is that Intel has frittered away its advantages in core design by not implementing aggressive product and feature changes over the past decade to ensure its dominance in the CPU world. Compound this situation with the manufacturing woes at 10nm and we can see that Intel needed a shakeup.
Consider Intel shook.
Subject: General Tech | March 27, 2018 - 03:30 PM | Ken Addison
Tagged: nvidia, GTC, quadro, gv100, GP100, tesla, titan v, v100, volta
One of the big missing markets for NVIDIA with their slow rollout of the Volta architecture was professional workstations. Today, NVIDIA announced they are bringing Volta to the Quadro family with the Quadro GV100 card.
Powered by the same GV100 GPU that was announced in the Tesla V100 at last year's GTC, and again late last year in the Titan V, the Quadro GV100 represents a leap forward in computing power for workstation-level applications. While these users could currently be running similar workloads on the TITAN V, as we've seen in the past, Quadro drivers generally provide big performance advantages in these sorts of applications. That said, we'd love to see NVIDIA repeat their move of bringing those optimizations to the TITAN lineup, as they did with the TITAN Xp.
As it is a Quadro, we would expect this to be NVIDIA's first Volta-powered product to provide certified, professional driver code paths for applications such as CATIA, Solid Edge, and more.
NVIDIA also heavily promoted the idea of pairing two of these GV100 cards in one system via NVLink. Considering the lack of NVLink support on the TITAN V, this is also the first time we've seen a Volta card with display outputs support NVLink in more standard workstations.
More importantly, this announcement brings NVIDIA's RTX technology to the professional graphics market.
With popular rendering applications like V-Ray already announcing and integrating support for NVIDIA's OptiX ray-tracing denoiser in their beta branch, it seems only a matter of time before we'll see a broad suite of professional applications supporting RTX technology in real time; for example, ray-traced renders of items being designed in CAD and modeling applications.
This sort of speed represents a potential massive win for professional users, who won't have to waste time waiting for preview renderings to complete to continue iterating on their projects.
The NVIDIA Quadro GV100 is available now directly from NVIDIA for $8,999, which puts it squarely in the same price range as the previous highest-end Quadro, the GP100.
Subject: General Tech, Graphics Cards | January 5, 2018 - 02:59 PM | Jeremy Hellstrom
Tagged: meltdown, spectre, geforce, quadro, NVS, nvidia, tesla, security
If you were wondering whether NVIDIA products are vulnerable to some of the latest security threats, the answer is yes. Your Shield device or GPU is not vulnerable to CVE-2017-5754, aka Meltdown; however, the two variants of Spectre could theoretically be used against you.
Variant 1 (CVE-2017-5753): Mitigations are provided with the security update included in this bulletin. NVIDIA expects to work together with its ecosystem partners on future updates to further strengthen mitigations.
Variant 2 (CVE-2017-5715): Mitigations are provided with the security update included in this bulletin. NVIDIA expects to work together with its ecosystem partners on future updates to further strengthen mitigations.
Variant 3 (CVE-2017-5754): At this time, NVIDIA has no reason to believe that Shield TV/tablet is vulnerable to this variant.
The Android-based Shield tablet should be updated to Shield Experience 5.4, which should arrive before the end of the month. Your Shield TV, should you actually still have a working one, will receive Shield Experience 6.3 in the same time frame.
The GPU is a little more complex as there are several product lines and OSes which need to be dealt with. There should be a new GeForce driver appearing early next week for gaming GPUs, with HPC cards receiving updates on the dates you can see below.
There is no reason to expect Radeon and Vega GPUs to suffer from these issues at this time. Intel could learn a bit from NVIDIA's response, which has been very quick and includes their older hardware.
Subject: General Tech | August 17, 2017 - 12:48 PM | Jeremy Hellstrom
Tagged: nvidia, pascal, grid, tesla, Quadro vDWS
NVIDIA have updated their GRID virtual PC architecture to allow up to 24 virtual desktops, each with 1GB of memory, doubling the previous capacity of their virtual machine tool. Along with this increase comes a new service called Quadro vDWS, which lets you power those virtual desktops with one of their HPC cards, such as the Pascal-based line of Tesla GPU accelerators. For workflows that incorporate things such as VR or photorealism this will offer a significant increase in performance; unfortunately, Minesweeper will not see any improvements. NVIDIA accompanied this launch with a new blade server, the Tesla P6, which has 16GB of memory that can be split into 16 1GB virtual desktops. Drop by The Inquirer for more information, including where to get this new software.
"NVIDIA has announced a new software suite which will allow users to virtualise an operating system to turn the company's ridiculously powerful Tesla GPU servers into powerful workstations."
Here is some more Tech News from around the web:
- Nokia 8 vs Galaxy S8 specs comparison @ The Inquirer
- Roku Gets Tough On Pirate Channels, Warns Users @ Slashdot
- Toshiba must allow Western Digital access to joint-venture assets @ The Register
- OCUK’s Andrew Gibson clears up RX Vega64 pricing disaster @ Kitguru
- How to build your own DIY makeshift levitation machine at home @ The Register
Subject: Graphics Cards | May 10, 2017 - 01:32 PM | Ryan Shrout
Tagged: v100, tesla, nvidia, gv100, gtc 2017
During the opening keynote to NVIDIA’s GPU Technology Conference, CEO Jen-Hsun Huang formally unveiled the latest GPU architecture and the first product based on it. The Tesla V100 accelerator is based on the Volta GPU architecture and features some amazingly impressive specifications. Let’s take a look.
| | Tesla V100 | GTX 1080 Ti | Titan X (Pascal) | GTX 1080 | GTX 980 Ti | TITAN X | GTX 980 | R9 Fury X | R9 Fury |
|---|---|---|---|---|---|---|---|---|---|
| GPU | GV100 | GP102 | GP102 | GP104 | GM200 | GM200 | GM204 | Fiji XT | Fiji Pro |
| Base Clock | - | 1480 MHz | 1417 MHz | 1607 MHz | 1000 MHz | 1000 MHz | 1126 MHz | 1050 MHz | 1000 MHz |
| Boost Clock | 1455 MHz | 1582 MHz | 1480 MHz | 1733 MHz | 1076 MHz | 1089 MHz | 1216 MHz | - | - |
| ROP Units | 128 (?) | 88 | 96 | 64 | 96 | 96 | 64 | 64 | 64 |
| Memory Clock | 878 MHz (?) | 11000 MHz | 10000 MHz | 10000 MHz | 7000 MHz | 7000 MHz | 7000 MHz | 500 MHz | 500 MHz |
| Memory Interface | 4096-bit (HBM2) | 352-bit | 384-bit G5X | 256-bit G5X | 384-bit | 384-bit | 256-bit | 4096-bit (HBM) | 4096-bit (HBM) |
| Memory Bandwidth | 900 GB/s | 484 GB/s | 480 GB/s | 320 GB/s | 336 GB/s | 336 GB/s | 224 GB/s | 512 GB/s | 512 GB/s |
| TDP | 300 watts | 250 watts | 250 watts | 180 watts | 250 watts | 250 watts | 165 watts | 275 watts | 275 watts |
| Peak Compute | 15 TFLOPS | 10.6 TFLOPS | 10.1 TFLOPS | 8.2 TFLOPS | 5.63 TFLOPS | 6.14 TFLOPS | 4.61 TFLOPS | 8.60 TFLOPS | 7.20 TFLOPS |
While we are low on details today, it appears that the fundamental compute units of Volta are similar to Pascal's. The GV100 has 80 SMs in 40 TPCs and 5120 total CUDA cores, a 42% increase over the GP100 GPU used in the Tesla P100 and over the GP102 GPU used in the GeForce GTX 1080 Ti. The structure of the GPU remains the same as GP100, with the CUDA cores organized as 64 single-precision (FP32) and 32 double-precision (FP64) units per SM.
Interestingly, NVIDIA has already told us the clock speed of this new product as well, coming in at 1455 MHz Boost, more than 100 MHz lower than the GeForce GTX 1080 Ti and 25 MHz lower than the Tesla P100.
Volta adds support for a brand new compute unit, though, known as the Tensor Core. With 640 of these on the GPU die, NVIDIA directly targets the neural network and deep learning fields. If this is your first time hearing about Tensor, you should read up on TensorFlow, the open-source machine learning library and its influence on the hardware market. Google has already invested in a Tensor-specific processor, and now NVIDIA throws its hat in the ring.
Adding Tensor Cores to Volta allows the GPU to do mass processing for deep learning, on the order of a 12x improvement over Pascal’s capabilities using CUDA cores only.
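As a rough mental model (our illustration, not NVIDIA's implementation), each Tensor Core performs a fused multiply-add on small matrices, D = A × B + C, as a single operation, rather than issuing each multiply and add through the regular CUDA cores:

```python
# Illustrative sketch of a Tensor Core style fused matrix multiply-add.
# Real Tensor Cores operate on 4x4 FP16 inputs with FP32 accumulation.
def tensor_core_fma(A, B, C):
    """Compute D = A @ B + C for 4x4 matrices given as lists of lists."""
    n = 4
    return [[sum(A[i][k] * B[k][j] for k in range(n)) + C[i][j]
             for j in range(n)] for i in range(n)]

# Example: multiplying by the identity leaves only the accumulator added.
I = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
C = [[1] * 4 for _ in range(4)]
D = tensor_core_fma(I, I, C)  # I @ I + C = I + C
```

The speedup comes from doing that entire block of multiplies and adds per core, per clock, instead of one scalar FMA at a time.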
For users interested in standard usage models, including gaming, the GV100 GPU offers 1.5x improvement in FP32 computing, up to 15 TFLOPS of theoretical performance and 7.5 TFLOPS of FP64. Other relevant specifications include 320 texture units, a 4096-bit HBM2 memory interface and 16GB of memory on-module. NVIDIA claims a memory bandwidth of 900 GB/s which works out to 878 MHz per stack.
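That per-stack memory clock can be checked with quick arithmetic (all values taken from the paragraph above; we assume HBM2's double data rate signaling):

```python
# Back-of-the-envelope check of NVIDIA's quoted memory figures.
bus_bits = 4096                  # HBM2 interface width
bandwidth_bytes = 900e9          # 900 GB/s (decimal)

per_pin_rate = bandwidth_bytes * 8 / bus_bits   # bits per second per pin
clock_mhz = per_pin_rate / 2 / 1e6              # DDR: two transfers per clock

print(round(clock_mhz))  # ~879 MHz, in line with the quoted 878 MHz
```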
Maybe more impressive is the transistor count: 21.1 BILLION! NVIDIA claims that this is the largest chip you can physically make with today’s technology. Considering it is being built on TSMC's 12nm FinFET process and has an 815 mm² die size, I see no reason to doubt them.
Shipping is scheduled for Q3 for the Tesla V100 – at least, that is when NVIDIA promises the DGX-1 system using the chip will reach developers.
I know many of you are interested in the gaming implications and timelines – sorry, I don’t have an answer for you yet. I will say that the bump from 10.6 TFLOPS to 15 TFLOPS is an impressive boost! But if the server variant of Volta isn’t due until Q3 of this year, I find it hard to think NVIDIA would bring the consumer version out faster than that. And whether or not NVIDIA offers gamers the chip with non-HBM2 memory is still a question mark for me and could directly impact performance and timing.
Subject: General Tech | March 29, 2017 - 03:04 AM | Scott Michaud
Tagged: tesla, tencent
Five percent of Tesla Motors has just been purchased by Tencent Holdings Limited. For our audience, this could be interesting in two ways. First, Tesla Motors is currently home to Jim Keller, who designed several CPU architectures at AMD and Apple, including AMD’s K8, Apple’s A4 and A5, and AMD’s recent Zen. Second, Tencent has been purchasing minority chunks of several companies, including almost half of Epic Games, five percent of Activision Blizzard, and a few others, but the move into automotive technologies is somewhat new for them.
From Tesla’s perspective, Tencent could be strong leverage into the Chinese market. In fact, Elon Musk tweeted to Bloomberg Business that they are glad to have Tencent “as an investor and advisor”. Clearly, this means that they consider Tencent to be, in some fashion, an adviser for the company.
Personally, I’m curious how Tencent will affect the energy side of the company, including its subsidiary, SolarCity. I don’t really have anything to base this on, since it’s just as “out of left field” for Tencent as automotive technologies, but it’s something I’ll be glancing at occasionally nonetheless.
Tesla stores your Owner Authentication token in plain text ... which leads to a bad Ashton Kutcher movie
Subject: General Tech | November 25, 2016 - 12:52 PM | Jeremy Hellstrom
Tagged: Android, Malware, hack, tesla, security
You might expect better from Tesla and Elon Musk, but apparently you would be disappointed, as the OAuth token in your car's mobile app is stored in plain text. The token, which is used to control your Tesla, is generated when you enter your username and password. It is good for 90 days, after which you must log in again so a new token can be created. Unfortunately, since that token is stored as plain text, someone who gains access to your Android phone can use it to open your car's doors, start the engine, and drive away. Getting an Android user to install a malicious app that allows someone to take over their device has proven depressingly easy. Comments on Slashdot suggest it is unreasonable to blame Tesla for security issues in your device's OS, which is hard to argue with; on the other hand, it is impossible for Tesla to defend choosing to store your OAuth token in plain text.
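The 90-day lifecycle itself is unremarkable; the problem is where the token lives. A minimal sketch of that lifecycle (class and field names are our own illustration, not Tesla's actual implementation):

```python
# Illustrative model of a bearer token that is valid for 90 days after login.
# The security flaw in the article is not this expiry logic, but that the
# token sits on disk unencrypted: anyone who copies it within the window
# can replay it to control the car.
from datetime import datetime, timedelta

class OwnerToken:
    VALIDITY = timedelta(days=90)

    def __init__(self, issued_at):
        self.issued_at = issued_at
        self.expires_at = issued_at + self.VALIDITY

    def is_valid(self, now):
        return now < self.expires_at

issued = datetime(2016, 11, 25)
token = OwnerToken(issued)
# Valid at day 89, expired at day 91, forcing a fresh login.
```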
"By leveraging security flaws in the Tesla Android app, an attacker can steal Tesla cars. The only hard part is tricking Tesla owners into installing an Android app on their phones, which isn't that difficult according to a demo video from Norwegian firm Promon. This malicious app can use many of the freely available Android rooting exploits to take over the user's phone, steal the OAuth token from the Tesla app and the user's login credentials."
Here is some more Tech News from around the web:
- CERT tells Microsoft to keep EMET alive because it's better than Win 10's own security @ The Register
- Amazon Makes Good On Its Promise To Delete 'Incentivized' Reviews @ Slashdot
- Tech giants warn IoT vendors to get real about security @ The Register
- 8 of the best outdoor gadgets and accessories @ The Inquirer
Subject: General Tech | September 15, 2016 - 01:58 PM | Ryan Shrout
Tagged: VR, video, tesla, Silverstone, podcast, nvidia, msi, MoCA, Maximus VIII Formula, MasterLiquid, holodeck, GFE, geforce experience, euclideon, cooler master, asus, actiontec
PC Perspective Podcast #417 - 09/15/16
Join us this week as we discuss the Maximus VIII Formula, MoCA adapters, GFE logins and more!
The URL for the podcast is: http://pcper.com/podcast - Share with your friends!
- iTunes - Subscribe to the podcast directly through the Store (audio only)
- Google Play - Subscribe to our audio podcast directly through Google Play!
- RSS - Subscribe through your regular RSS reader (audio only)
- MP3 - Direct download link to the MP3 file
Hosts: Ryan Shrout, Allyn Malventano, Josh Walrath and Jeremy Hellstrom
Week in Review:
This episode is brought to you by Casper! (Use code “pcper”)
News items of interest:
Hardware/Software Picks of the Week
Subject: General Tech | September 14, 2016 - 01:06 PM | Jeremy Hellstrom
Tagged: pascal, tesla, p40, p4, nvidia, neural net, m40, M4, HPC
The Register have packaged a nice explanation of the basics of how neural nets work into their quick look at NVIDIA's new Pascal-based HPC cards, the P4 and P40. The tired joke about Zilog or Dick Van Patten stems from research showing that 8-bit precision is most effective when feeding data into a neural net: using 16 or 32-bit values slows the processing down significantly while adding little precision to the results produced. NVIDIA is also perfecting a hybrid mode, where you can opt for a less precise answer produced by your local, presumably limited, hardware, or upload the data to the cloud for the full treatment. This is great for those with security concerns, or when a quicker answer is more valuable than a more accurate one.
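To see why 8 bits can be enough, here is a hedged sketch (our illustration, not NVIDIA's code) of linear quantization, the kind of low-precision representation these cards accelerate:

```python
# Map floating-point weights onto signed 8-bit integers with a single scale
# factor, then map them back. The round-trip error is bounded by half a
# quantization step, which is typically small relative to network noise.
def quantize(values, bits=8):
    qmax = 2 ** (bits - 1) - 1            # 127 for int8
    scale = max(abs(v) for v in values) / qmax
    return [round(v / scale) for v in values], scale

def dequantize(q, scale):
    return [x * scale for x in q]

weights = [0.82, -0.41, 0.05, -0.77, 0.33]
q, scale = quantize(weights)
restored = dequantize(q, scale)
# Every restored weight is within scale / 2 of the original.
```

Dropping from 32-bit to 8-bit storage also quarters the memory traffic per weight, which is where much of the inference speedup comes from.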
As for the hardware, NVIDIA claims the optimizations on the P40 will make it "40 times more efficient" than an Intel Xeon E5 CPU, and it will also provide slightly more throughput than the currently available Titan X. You can expect to see these arrive in the market sometime over the next two months.
"Nvidia has designed a couple of new Tesla processors for AI applications – the P4 and the P40 – and is talking up their 8-bit math performance. The 16nm FinFET GPUs use Nv's Pascal architecture and follow on from the P100 launched in June. The P4 fits on a half-height, half-length PCIe card for scale-out servers, while the beefier P40 has its eyes set on scale-up boxes."
Here is some more Tech News from around the web:
- Windows 10 Anniversary Update might not arrive on your PC until November @ The Inquirer
- iOS 10 reviewed: There’s no reason not to update @ Ars Technica
- iOS 10 rollout goes titsup as update 'bricks' iPhones and iPads @ The Inquirer
- DevOps and the Art of Secure Application Deployment @ Linux.com
- HTC to unveil new Desire smartphones on September 20 @ DigiTimes
- Using a thing made by Microsoft, Apple or Adobe? It probably needs a patch today @ The Register
- New York Fines Viacom, Mattel and Hasbro For Tracking Kids Online @ Slashdot
- Microsoft's Service Fabric for Linux hits public preview @ The Register
Subject: Graphics Cards | June 20, 2016 - 01:57 PM | Scott Michaud
Tagged: tesla, pascal, nvidia, GP100
GP100, the “Big Pascal” chip that was announced at GTC, will be coming to PCIe for enterprise and supercomputer customers in Q4 2016. Previously, it was only announced using NVIDIA's proprietary connection. In fact, they also gave themselves some lead time with their first-party DGX-1 system, which retails for $129,000 USD, although we expect that was more for yield reasons. Josh calculated that each GPU in that system is worth more than the full wafer that its die was manufactured on.
This brings us to the PCIe versions. Interestingly, they have been down-binned from the NVLink version. The boost clock has been dropped to 1300 MHz, from 1480 MHz, although that is matched with a slightly lower TDP (250W versus the NVLink's 300W). This lowers the FP16 performance to 18.7 TFLOPs, down from 21.2, FP32 performance to 9.3 TFLOPs, down from 10.6, and FP64 performance to 4.7 TFLOPs, down from 5.3. This is where we get to the question: did NVIDIA reduce the clocks to hit a 250W TDP and be compatible with the passive cooling technology that previous Tesla cards utilize, or were the clocks dropped to increase yield?
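The down-binned figures are consistent with peak throughput scaling linearly with clock; a quick sanity check (core count assumed from GP100's published specifications):

```python
# Peak FP32 = cores x 2 ops/clock (fused multiply-add) x clock rate.
cores = 3584  # FP32 CUDA cores on GP100 (assumed from published specs)

def tflops(clock_mhz, ops_per_core=2):
    return cores * ops_per_core * clock_mhz * 1e6 / 1e12

fp32_nvlink = tflops(1480)   # ~10.6 TFLOPs, matching the NVLink part
fp32_pcie = tflops(1300)     # ~9.3 TFLOPs, matching the PCIe part
```

FP16 and FP64 follow the same ratio (2x and 0.5x of FP32 on GP100), so the 250W question really does come down to clocks alone.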
They are also providing a 12GB version of the PCIe Tesla P100. I didn't realize that GPU vendors could selectively disable HBM2 stacks, but NVIDIA disabled 4GB of memory, which also drops the bus width to 3072-bit. You would think the simplicity of the circuit would favor dividing work in a power-of-two fashion, so knowing that they can, it makes me wonder why they did. Again, my first reaction is to question GP100 yield, but you wouldn't think that HBM, being such a small part of the package, would let them reclaim many chips by disabling a chunk, right? That is, unless the HBM2 stacks themselves have yield issues, which would be interesting.
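Assuming the conventional GP100 layout of four HBM2 stacks, each 4GB with a 1024-bit interface, the 12GB part's numbers fall out of disabling exactly one stack:

```python
# Illustrative arithmetic only; the four-stack layout is an assumption
# based on GP100's 16GB / 4096-bit full configuration.
stacks, gb_per_stack, bits_per_stack = 4, 4, 1024
enabled = stacks - 1

capacity_gb = enabled * gb_per_stack   # 12 GB
bus_bits = enabled * bits_per_stack    # 3072-bit
bandwidth_gbps = 900 * enabled / stacks  # presumably ~675 GB/s at the same rate
```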
There is also still no word on a 32GB version. Samsung claimed the memory technology, 8GB stacks of HBM2, would be ready for products in Q4 2016 or early 2017. We'll need to wait and see where, when, and why it will appear.