GPU Enthusiasts Are Throwing a FET
NVIDIA is rumored to launch Pascal in early (~April-ish) 2016, although some are skeptical that it will even appear before the summer. The design was finalized months ago, and unconfirmed shipping information claims that chips are being stockpiled, which is typical when preparing to launch a product. It is expected to compete against AMD's rumored Arctic Islands architecture, which will, according to its also rumored numbers, be very similar to Pascal.
This architecture is a big one for several reasons.
Image Credit: WCCFTech
First, it will jump two full process nodes. Current desktop GPUs are manufactured at 28nm, which was first introduced with the GeForce GTX 680 all the way back in early 2012, but Pascal will be manufactured on TSMC's 16nm FinFET+ technology. Smaller features have several advantages, but a huge one for GPUs is the ability to fit more complex circuitry in the same die area. This means that you can include more copies of elements, such as shader cores, and do more in fixed-function hardware, like video encode and decode.
That said, we got a lot more life out of 28nm than we really should have. Chips like GM200 and Fiji are huge, relatively power-hungry, and complex, which is a terrible idea to produce when yields are low. I asked Josh Walrath, who is our go-to for analysis of fab processes, and he believes that FinFET+ is probably even more complicated today than 28nm was in the 2012 timeframe, which was when it launched for GPUs.
It's two full steps forward from where we started, but we've been tiptoeing since then.
Image Credit: WCCFTech
Second, Pascal will introduce HBM 2.0 to NVIDIA hardware. HBM 1.0 was introduced with AMD's Radeon Fury X, and it helped in numerous ways -- from smaller card size to a triple-digit percentage increase in memory bandwidth. The 980 Ti can talk to its memory at about 300GB/s, while Pascal is rumored to push that to 1TB/s. Capacity won't be sacrificed, either. The top-end card is expected to contain 16GB of global memory, which is twice what any console has. This means less streaming, higher resolution textures, and probably even left-over scratch space for the GPU to generate content in with compute shaders. Also, according to AMD, HBM is an easier architecture to communicate with than GDDR, which should mean a savings in die space that could be used for other things.
Third, the architecture includes native support for three levels of floating point precision. Maxwell, due to how limited 28nm was, saved on complexity by reducing 64-bit IEEE 754 decimal number performance to 1/32nd of 32-bit numbers, because FP64 values are rarely used in video games. This saved transistors, but was a huge, order-of-magnitude step back from the 1/3rd ratio found on the Kepler-based GK110. While it probably won't be back to the 1/2 ratio that was found in Fermi, Pascal should be much better suited for GPU compute.
Image Credit: WCCFTech
Mixed precision could help video games too, though. Remember how I said it supports three levels? The third one is 16-bit, which is half of the format that is commonly used in video games. Sometimes, that is sufficient. If so, Pascal is said to do these calculations at twice the rate of 32-bit. We'll need to see whether enough games (and other applications) are willing to drop down in precision to justify the die space that these dedicated circuits require, but it should double the performance of anything that does.
So basically, this generation should provide a massive jump in performance that enthusiasts have been waiting for. Increases in GPU memory bandwidth and the amount of features that can be printed into the die are two major bottlenecks for most modern games and GPU-accelerated software. We'll need to wait for benchmarks to see how the theoretical maps to practical, but it's a good sign.
Subject: General Tech | September 17, 2015 - 12:00 PM | Ken Addison
Tagged: xps 12, video, TSMC, Steam Controller, r9 nano, podcast, pascal, nvidia, msi, hdplex h5, gtx 980ti sea hawk, fury x, Fiji, dell, corsair, amd
PC Perspective Podcast #367 - 09/17/2015
Join us this week as we discuss the AMD R9 Nano, a Corsair GTX 980Ti, NVIDIA Pascal Rumors and more!
The URL for the podcast is: http://pcper.com/podcast - Share with your friends!
- iTunes - Subscribe to the podcast directly through the Store
- RSS - Subscribe through your regular RSS reader
- MP3 - Direct download link to the MP3 file
Hosts: Ryan Shrout, Josh Walrath, Jeremy Hellstrom, and Allyn Malventano
Program length: 1:29:03
Week in Review:
0:03:45 The AMD Radeon R9 Nano Review
0:41:15 This week’s podcast is brought to you by Casper. Use code PCPER at checkout for $50 towards your order!
News item of interest:
1:06:25 More About HDPLEX H5
Hardware/Software Picks of the Week:
Subject: Graphics Cards | September 16, 2015 - 09:16 AM | Sebastian Peak
Tagged: TSMC, Samsung, pascal, nvidia, hbm, graphics card, gpu
According to a report by BusinessKorea TSMC has been selected to produce the upcoming Pascal GPU after initially competing with Samsung for the contract.
Though some had considered the possibility of both Samsung and TSMC sharing production (albeit on two different process nodes, as Samsung is on 14 nm FinFET), in the end the duties fall on TSMC's 16 nm FinFET alone if this report is accurate. The move is not too surprising considering the longstanding position TSMC has maintained as a fab for GPU makers and Samsung's lack of experience in this area.
The report didn't make the release date for Pascal any more clear, naming it "next year" for the new HBM-powered GPU, which will also reportedly feature 16 GB of HBM 2 memory for the flagship version of the card. This would potentially be the first GPU released at 16 nm (unless AMD has something in the works before Pascal's release), as all current AMD and NVIDIA GPUs are manufactured at 28 nm.
Subject: Graphics Cards | July 24, 2015 - 12:16 PM | Sebastian Peak
Tagged: rumor, pascal, nvidia, HBM2, hbm, graphics card, gpu
An exclusive report from Fudzilla claims some outlandish numbers for the upcoming NVIDIA Pascal GPU, including 17 billion transistors and a massive amount of second-gen HBM memory.
According to the report:
"Pascal is the successor to the Maxwell Titan X GM200 and we have been tipped off by some reliable sources that it will have more than a double the number of transistors. The huge increase comes from Pascal's 16 nm FinFET process and its transistor size is close to two times smaller."
The NVIDIA Pascal board (Image credit: Legit Reviews)
Pascal's 16nm FinFET production will be a major change from the existing 28nm process found on all current NVIDIA GPUs. And if this report is accurate they are taking full advantage considering that transistor count is more than double the 8 billion found in the TITAN X.
(Image credit: Fudzilla)
And what about memory? We have long known that Pascal will be NVIDIA's first forray into HBM, and Fudzilla is reporting that up to 32GB of second-gen HBM (HBM2) will be present on the highest model, which is a rather outrageous number even compared to the 12GB TITAN X.
"HBM2 enables cards with 4 HBM 2.0 cards with 4GB per chip, or four HBM 2.0 cards with 8GB per chips results with 16GB and 32GB respectively. Pascal has power to do both, depending on the SKU."
Pascal is expected in 2016, so we'll have plenty of time to speculate on these and doubtless other rumors to come.
Subject: Graphics Cards, Processors, Mobile | July 19, 2015 - 06:59 AM | Scott Michaud
Tagged: Zen, TSMC, Skylake, pascal, nvidia, Intel, Cannonlake, amd, 7nm, 16nm, 10nm
Getting smaller features allows a chip designer to create products that are faster, cheaper, and consume less power. Years ago, most of them had their own production facilities but that is getting rare. IBM has just finished selling its manufacturing off to GlobalFoundries, which was spun out of AMD when it divested from fabrication in 2009. Texas Instruments, on the other hand, decided that they would continue manufacturing but get out of the chip design business. Intel and Samsung are arguably the last two players with a strong commitment to both sides of the “let's make a chip” coin.
So where do you these chip designers go? TSMC is the name that comes up most. Any given discrete GPU in the last several years has probably been produced there, along with several CPUs and SoCs from a variety of fabless semiconductor companies.
Several years ago, when the GeForce 600-series launched, TSMC's 28nm line led to shortages, which led to GPUs remaining out of stock for quite some time. Since then, 28nm has been the stable work horse for countless high-performance products. Recent chips have been huge, physically, thanks to how mature the process has become granting fewer defects. The designers are anxious to get on smaller processes, though.
In a conference call at 2 AM (EDT) on Thursday, which is 2 PM in Taiwan, Mark Liu of TSMC announced that “the ramping of our 16 nanometer will be very steep, even steeper than our 20nm”. By that, they mean this year. Hopefully this translates to production that could be used for GPUs and CPUs early, as AMD needs it to launch their Zen CPU architecture in 2016, as early in that year as possible. Graphics cards have also been on that technology for over three years. It's time.
Also interesting is how TSMC believes that they can hit 10nm by the end of 2016. If so, this might put them ahead of Intel. That said, Intel was also confident that they could reach 10nm by the end of 2016, right until they announced Kaby Lake a few days ago. We will need to see if it pans out. If it does, competitors could actually beat Intel to the market at that feature size -- although that could end up being mobile SoCs and other integrated circuits that are uninteresting for the PC market.
Following the announcement from IBM Research, 7nm was also mentioned in TSMC's call. Apparently they expect to start qualifying in Q1 2017. That does not provide an estimate for production but, if their 10nm schedule is both accurate and also representative of 7nm, that would production somewhere in 2018. Note that I just speculated on an if of an if of a speculation, so take that with a mine of salt. There is probably a very good reason that this date wasn't mentioned in the call.
Back to the 16nm discussion, what are you hoping for most? New GPUs from NVIDIA, new GPUs from AMD, a new generation of mobile SoCs, or the launch of AMD's new CPU architecture? This should make for a highly entertaining comments section on a Sunday morning, don't you agree?
Subject: Graphics Cards | March 17, 2015 - 01:47 PM | Ryan Shrout
Tagged: pascal, nvidia, gtc 2015, GTC, geforce
At the keynote of the GPU Technology Conference (GTC) today, NVIDIA CEO Jen-Hsun Huang disclosed some more updates on the roadmap for future GPU technologies.
Most of the detail was around Pascal, due in 2016, that will introduce three new features including mixed compute precision, 3D (stacked) memory, and NVLink. Mixed precision is a method of computing in FP16, allowing calculations to run much faster at lower accuracy than full single or double precision when they are not necessary. Keeping in mind that Maxwell doesn't have an implementation with full speed DP compute (today), it would seem that NVIDIA is targeting different compute tasks moving forward. Though details are short, mixed precision would likely indicate processing cores than can handle both data types.
3D memory is the ability to put memory on-die with the GPU directly to improve overall memory banwidth. The visual diagram that NVIDIA showed on stage indicated that Pascal would have 750 GB/s of bandwidth, compared to 300-350 GB/s on Maxwell today.
NVLink is a new way of connecting GPUs, improving on bandwidth by more than 5x over current implementations of PCI Express. They claim this will allow for connecting as many as 8 GPUs for deep learning performance improvements (up to 10x). What that means for gaming has yet to be discussed.
NVIDIA made some other interesting claims as well. Pascal will be more than 2x more performance per watt efficient than Maxwell, even without the three new features listed above. It will also ship (in a compute targeted product) with a 32GB memory system compared to the 12GB of memory announced on the Titan X today. Pascal will also have 4x the performance in mixed precision compute.
Subject: General Tech | March 27, 2014 - 01:10 PM | Jeremy Hellstrom
Tagged: pascal, nvlink, nvidia, maxwell, jen-hsun huang, GTC
Before we get to see Volta in action NVIDIA is taking a half step and releasing the Pascal architecture which will use Maxwell-like Streaming Multiprocessors and will introduce stacked or 3D memory which will reside on the same substrate as the GPU. Jen-Hsun claimed this new type of memory will vastly increase the bandwidth available, provide two and a half times the capacity and be four times as energy efficient at the same time. Along with the 3D memory announcement was the revealing of NVLink, an alternative interconnect which he claims will offer 5-12 times the bandwidth of PCIe and will be utilized by HPC systems. From his announcement that NVLink will feature eight 20Gbps lanes per block or as NVIDIA is calling them, bricks, which The Tech Report used to make a quick calculation and came up with an aggregate bandwidth of a brick of around 20GB/s. Read on to see what else was revealed.
"Today during his opening keynote at the Nvidia GPU Technology Conference, CEO Jen-Hsun Huang offered an update to Nvidia's GPU roadmap. The big reveal was about a GPU code-named Pascal, which will be a generation beyond the still-being-introduced Maxwell architecture in the firm's plans."
Here is some more Tech News from around the web:
- Nvidia, VMware join to pipe high-quality 3D graphics from the cloud @ The Register
- Android has 97 Percent of Mobile Malware, But Nearly None in the U.S. @ DailyTech
- Amazon HALVES cloud storage prices after Google's shock slash @ The Register
- Bitcoin mining malware hits Android @ The Inquirer
- Facebook Oculus VR buy causes rift with developers and tech fans @ The Inquirer
- iSAW EXtreme Action Camera @ Kitguru
- Netgear VueZone VZSX2800 Wireless Surveillance Camera Kit @ eTeknix