It always feels a little odd covering NVIDIA's quarterly earnings due to how the company presents its financial calendar. No, we are not reporting from the future, and yes, it is easy to get your dates mixed up when comparing results. Calendar quirks aside, NVIDIA did exceptionally well in a quarter that is typically its second weakest after Q1.
NVIDIA reported revenue of $1.43 billion, a jump from an already strong Q1 in which it took in $1.30 billion. Compare this to the $1.027 billion of its competitor AMD, which sells CPUs as well as GPUs. NVIDIA sold a lot of GPUs along with its other products. The primary money makers were consumer GPUs and the professional and compute markets, where the company has a virtual stranglehold at the moment. GAAP net income came in at a very respectable $253 million.
The release of the latest Pascal based GPUs was the primary driver of the gains this quarter. AMD has had a hard time competing with NVIDIA for market share. The older Maxwell based chips performed well against the entire line of AMD offerings, and typically did so with better power and thermal characteristics. Even though the GTX 970 was somewhat limited in its memory configuration compared to the AMD products (3.5 GB + 0.5 GB vs. a full 4 GB implementation), it was a top seller in its class. The same could be said for the products up and down the stack.
Pascal was released at the end of May, but the company had been shipping chips to its partners beforehand as well as building the "Founders Edition" models to its exacting specifications. These were strong sellers from the end of May through the end of the quarter. NVIDIA recently unveiled its latest Pascal based Quadro cards, but we do not know how much of an impact those had on this quarter. NVIDIA has also been shipping, in very limited quantities, Tesla P100 based units to select customers and outfits.
Subject: General Tech | August 9, 2016 - 03:06 AM | Tim Verry
Tagged: xbox one s, xbox one, TSMC, microsoft, console, 16nm
Microsoft recently unleashed a smaller version of its gaming console in the form of the Xbox One S. The new "S" variant packs an internal power supply, a 4K Blu-ray optical drive, and a smaller (die shrunk) AMD SoC into a 40% smaller package. The new console is clad in all white with black accents and a circular vent on the left half of the top. A USB port and a pairing button have been added to the front, and the power and eject buttons are now physical rather than capacitive (touch sensitive).
Rear I/O remains similar to the original console and includes a power input, two HDMI ports (one input, one output), two USB 3.0 ports, one Ethernet, one S/PDIF audio out, and one IR out port. There is no need for the power brick anymore though as the power supply is now internal. Along with being 40% smaller, it can now be mounted vertically using an included stand. While there is no longer a dedicated Kinect port, it is still possible to add a Kinect to your console using an adapter.
The internal specifications of the Xbox One S remain consistent with the original Xbox One, except that it is now available in a 2TB model. The console is powered by a nearly identical processor that is now 35% smaller thanks to being manufactured on TSMC's smaller 16nm FinFET process node. While the chip is more power efficient, it still features the same eight Jaguar CPU cores clocked at 1.75 GHz and a 12 CU graphics portion (768 stream processors). The new chip also adds support for HDR as well as 4K output and upscaling. The graphics portion is where the Xbox One S gets a bit interesting, because it appears that Microsoft has given the GPU a slight overclock to 914 MHz. Compared to the original Xbox One's 853 MHz, this is a 7.1% increase in clockspeed. The increased GPU clock also results in increased bandwidth for the ESRAM (204 GB/s on the original Xbox One versus 219 GB/s on the Xbox One S).
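Since ESRAM bandwidth scales linearly with the GPU clock, the numbers above are easy to sanity check. A quick back-of-the-envelope sketch, using only the figures from this paragraph:

```python
# Xbox One S GPU clock bump and the ESRAM bandwidth that follows from it.
orig_clock_mhz = 853   # original Xbox One GPU clock
new_clock_mhz = 914    # Xbox One S GPU clock
orig_esram_gbs = 204   # original ESRAM bandwidth (GB/s)

uplift_pct = (new_clock_mhz / orig_clock_mhz - 1) * 100
new_esram_gbs = orig_esram_gbs * new_clock_mhz / orig_clock_mhz

print(f"clock uplift: {uplift_pct:.1f}%")                # just over 7%
print(f"new ESRAM bandwidth: {new_esram_gbs:.0f} GB/s")  # ~219 GB/s
```

The computed bandwidth lands right on Microsoft's quoted 219 GB/s, which suggests the ESRAM figure is a straight consequence of the clock increase rather than a separate change.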
According to Microsoft, the increased GPU clockspeed was necessary to render non-HDR versions of games in real time for Game DVR, Game Streaming, and screenshots. A nice side benefit is that the extra performance can result in improved gameplay in certain titles. In Digital Foundry's testing, Richard Leadbetter found this to be especially true in games with unlocked frame rates, or in games locked to 30 FPS where the original console could not hit 30 FPS consistently. The increased clocks can be felt in slightly smoother gameplay and less screen tearing. For example, they found that the Xbox One S delivered up to 11% higher frame rates in Project Cars (47 FPS versus 44) and between 6% and 8% in Hitman. Further, they found that the higher clocks help performance when playing Xbox 360 titles such as Alan Wake's American Nightmare in backwards compatibility mode.
The 2TB Xbox One S is available now for $400 while the 1TB ($350) and 500GB ($300) versions will be available on the 23rd. For comparison, the 500GB Xbox One (original) is currently $250. The Xbox One 1TB game console varies in price depending on game bundle.
What are your thoughts on the smaller console? While the ever so slight performance boost is a nice bonus, I definitely don't think it is worth upgrading specifically for if you already own an Xbox One. If you have been holding off, though, now is the time to grab a discounted original or the smaller S version! If you are hoping for more performance, definitely wait for Microsoft's Scorpio project or its competitor, the PlayStation 4 Neo (or, even better, a gaming PC, right!? hehe).
I do know that Ryan has gotten his hands on the slimmer Xbox One S, so hopefully we will see some testing of our own as well as a teardown (hint, hint!).
- Xbox One Teardown - Microsoft still hates you
- PC vs. PS4 vs. Xbox One Hardware Comparison: Building a Competing Gaming PC
- Sony PS4 and Microsoft Xbox One Already Hitting a Performance Wall
- Tech Interview: Inside Xbox One S @ Eurogamer
Subject: Graphics Cards | July 16, 2016 - 10:37 PM | Scott Michaud
Tagged: Volta, pascal, nvidia, maxwell, 16nm
For the past few generations, NVIDIA has roughly tried to release a new architecture on a new process node, then release a refresh the following year. This plan hit a hitch when Maxwell was delayed a year, apart from the GTX 750 Ti, and then pushed back to the same 28nm process that Kepler used. Pascal caught up with 16nm, although we know that some hard, physical limitations are right around the corner. The lattice spacing for silicon at room temperature is around 0.54 nm, so we're talking about features on the order of thirty atoms in width.
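For the curious, that "thirty atoms" figure falls out of simple division. A quick sketch using silicon's lattice constant of roughly 0.543 nm (this counts lattice spacings rather than individual atoms, so treat it as an order-of-magnitude estimate only):

```python
si_lattice_nm = 0.543  # silicon lattice constant at room temperature (nm)
feature_nm = 16.0      # nominal feature size of the 16nm node

spacings = feature_nm / si_lattice_nm
print(round(spacings))  # ~29 lattice spacings across a 16nm feature
```

Marketing node names no longer map directly onto any single physical dimension, so the real smallest features differ, but the point stands: we are within a couple of shrinks of counting atoms.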
This rumor claims that NVIDIA is not going to 10nm for Volta. Instead, it will be produced on the same 16nm node that Pascal currently occupies. This is quite interesting, because GPUs scale quite well with added complexity: they consist of many parallel units running at relatively low clock rates, so on an unchanged node the only real ways to increase performance are to make the existing architecture more efficient, or to make a larger chip.
That said, GP100 leaves a lot of room on the table for an FP32-optimized, ~600mm2 part to crush its performance at the high end, similar to how GM200 replaced GK110. The rumored GP102, expected in the ~450mm2 range for Titan or GTX 1080 Ti-style parts, has some room to grow. Like GM200, however, it would also be unappealing to GPU compute users who need FP64. If this is what is going on, and we're totally just speculating at the moment, it would signal that enterprise customers should expect a new GPGPU card every second gaming generation.
That is, of course, unless NVIDIA found ways to make the Maxwell-derived architecture significantly more die-space efficient in Volta. Clocks could get higher, or the circuits themselves could get simpler. You would think that, especially in the latter case, they would have integrated those ideas into Maxwell and Pascal already; but, as with HBM2 memory, there may have been a reason they couldn't.
We'll need to wait and see. The entire rumor could be crap, who knows?
ARM Releases Egil Specs
The final product that ARM showed us at that Austin event is the latest video processing unit that will be paired with their Mali GPUs. The Egil video processor is a next generation unit that will appear later this year alongside the latest Mali-based products up and down the spectrum. It is not tied to the latest G71 GPU, but rather can be used with a multitude of current Mali products.
Video is one of the biggest use cases for modern SoCs in mobile devices. People constantly stream and record video on their handsets and tablets, and current video processors from a variety of sources have some real drawbacks. Pixel density on phones and tablets has increased dramatically, and with it the power required to render video effectively. We have also seen the introduction of new codecs that require serious processing capability to decode.
Egil is a scalable design that can go from one core up to six. A single core can play back video from a variety of codecs at 1080p and up to 80 fps, while the six core configuration can handle 4K video at 120 Hz. This assumes the Egil processor is produced on a 16nm FinFET process or smaller and running at 800 MHz. This gives SoC manufacturers a lot of flexibility to tailor their products for specific targets and markets.
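Those playback claims line up neatly with linear per-core scaling, since a 4K frame carries exactly four times the pixels of a 1080p frame. A quick pixel-throughput check:

```python
# Pixel throughput of the single-core and six-core Egil configurations.
px_1080p = 1920 * 1080          # pixels per 1080p frame
px_4k = 3840 * 2160             # pixels per 4K frame (exactly 4x 1080p)

one_core = px_1080p * 80        # single core: 1080p at up to 80 fps
six_cores = px_4k * 120         # six cores: 4K at 120 Hz

print(six_cores / one_core)     # 6.0 -- exactly six times the throughput
```

In other words, six cores deliver exactly six times the pixel rate of one, which is what you would hope for from a cleanly partitioned fixed-function design.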
The cores themselves are fixed function blocks with dedicated controllers and control logic. Previous video processors were weighted much more heavily toward decode than encode. Now that streaming from mobile devices is pervasive and cameras/optics can capture higher resolutions and bitrates, ARM has redesigned Egil with extensive encoding capabilities: it can decode 4K video while simultaneously encoding four 1080p30 streams.
Egil will eventually find its way into other products such as TVs. These custom SoCs will become even more important as 4K playback and media become more common, along with potential new functionality that has yet to be implemented effectively on TVs. For the time being we will likely see Egil in mobile first, with the initial products hitting the market in the second half of 2016.
ARM is certainly on a roll this year, introducing new CPU, GPU, and now video processor designs. We will start to see these products throughout the end of this year and into the next. The company certainly has not been resting or letting potential competitors get the edge on it. Its products have always focused on low power consumption, but the potential performance looks to satisfy even power hungry users in the mobile and appliance markets. Egil is another solid looking addition to the lineup, bringing impressive performance and codec support for both decode and encode.
GPU Enthusiasts Are Throwing a FET
NVIDIA is rumored to launch Pascal in early (~April-ish) 2016, although some are skeptical that it will even appear before the summer. The design was finalized months ago, and unconfirmed shipping information claims that chips are being stockpiled, which is typical when preparing to launch a product. It is expected to compete against AMD's rumored Arctic Islands architecture, which will, according to its also rumored numbers, be very similar to Pascal.
This architecture is a big one for several reasons.
Image Credit: WCCFTech
First, it will jump two full process nodes. Current desktop GPUs are manufactured at 28nm, which was first introduced with the GeForce GTX 680 all the way back in early 2012, but Pascal will be manufactured on TSMC's 16nm FinFET+ technology. Smaller features have several advantages, but a huge one for GPUs is the ability to fit more complex circuitry in the same die area. This means that you can include more copies of elements, such as shader cores, and do more in fixed-function hardware, like video encode and decode.
That said, we got a lot more life out of 28nm than we really should have. Chips like GM200 and Fiji are huge, relatively power-hungry, and complex, which would be a terrible combination to produce if yields were low. I asked Josh Walrath, who is our go-to for analysis of fab processes, and he believes that FinFET+ today is probably even more complicated than 28nm was in the 2012 timeframe, when it launched for GPUs.
It's two full steps forward from where we started, but we've been tiptoeing since then.
Image Credit: WCCFTech
Second, Pascal will introduce HBM 2.0 to NVIDIA hardware. HBM 1.0 was introduced with AMD's Radeon Fury X, and it helped in numerous ways -- from smaller card size to a triple-digit percentage increase in memory bandwidth. The GTX 980 Ti can talk to its memory at about 336 GB/s, while Pascal is rumored to push that to 1 TB/s. Capacity won't be sacrificed, either: the top-end card is expected to contain 16GB of global memory, twice what any console has. That means less streaming, higher resolution textures, and probably even leftover scratch space for the GPU to generate content in with compute shaders. Also, according to AMD, HBM is an easier architecture to communicate with than GDDR, which should mean a saving in die space that could be used for other things.
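As a rough illustration of where those bandwidth figures come from: peak memory bandwidth is just bus width (in bytes) times per-pin data rate. The configurations below are the commonly cited ones for the 980 Ti's GDDR5 and for rumored HBM2 stacks, not confirmed Pascal specifications:

```python
def peak_bandwidth_gbs(bus_width_bits: int, gbps_per_pin: float) -> float:
    """Peak theoretical memory bandwidth in GB/s."""
    return bus_width_bits / 8 * gbps_per_pin

# GTX 980 Ti: 384-bit GDDR5 at 7 Gbps per pin.
print(peak_bandwidth_gbs(384, 7.0))    # 336.0 GB/s

# Rumored HBM2 config: four stacks x 1024-bit at 2 Gbps per pin.
print(peak_bandwidth_gbs(4096, 2.0))   # 1024.0 GB/s, i.e. ~1 TB/s
```

Notice that HBM gets there with a much lower per-pin rate: the win comes from the enormously wide bus that stacking the memory next to the GPU makes practical.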
Third, the architecture includes native support for three levels of floating point precision. Maxwell, due to how limited 28nm was, saved on complexity by reducing 64-bit IEEE 754 floating point performance to 1/32nd of the 32-bit rate, because FP64 values are rarely used in video games. This saved transistors, but was a huge, order-of-magnitude step back from the 1/3 ratio found on the Kepler-based GK110. While it probably won't return to the 1/2 ratio found in Fermi, Pascal should be much better suited to GPU compute.
Image Credit: WCCFTech
Mixed precision could help video games too, though. Remember how I said it supports three levels? The third is 16-bit, half the width of the format commonly used in video games, and sometimes that is sufficient. If so, Pascal is said to do these calculations at twice the 32-bit rate. We'll need to see whether enough games (and other applications) are willing to drop down in precision to justify the die space these dedicated circuits require, but it should double the performance of anything that does.
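To see why dropping to 16-bit is not free, here is a small, non-Pascal-specific illustration. IEEE 754 half precision keeps only a 10-bit mantissa, so values near 1.0 are spaced about 0.001 apart, and smaller increments simply round away:

```python
# Machine epsilon (the gap between 1.0 and the next representable value)
# for IEEE 754 half and single precision.
fp16_eps = 2 ** -10   # ~0.000977 for binary16 (half)
fp32_eps = 2 ** -23   # ~0.00000012 for binary32 (single)

increment = 1e-4      # a small bump we might add to a value near 1.0

# Anything below half the spacing rounds away when added to 1.0:
print(increment > fp16_eps / 2)   # False -- lost entirely in FP16
print(increment > fp32_eps / 2)   # True  -- preserved in FP32
```

For shading math that tolerates this kind of rounding, dedicated FP16 units could pay for their die space twice over; for everything else, FP32 remains the safe default.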
So basically, this generation should provide the massive jump in performance that enthusiasts have been waiting for. GPU memory bandwidth and the amount of circuitry that can fit on the die are two major bottlenecks for most modern games and GPU-accelerated software, and both are increasing. We'll need to wait for benchmarks to see how the theoretical maps to the practical, but it's a good sign.
Subject: General Tech | October 1, 2015 - 06:17 PM | Ken Addison
Tagged: podcast, video, fable legends, dx12, apple, A9, TSMC, Samsung, 14nm, 16nm, Intel, P3608, NVMe, logitech, g410, TKL, nvidia, geforce now, qualcomm, snapdragon 820
PC Perspective Podcast #369 - 10/01/2015
Join us this week as we discuss the Fable Legends DX12 Benchmark, Apple A9 SoC, Intel P3608 SSD, and more!
The URL for the podcast is: http://pcper.com/podcast - Share with your friends!
- iTunes - Subscribe to the podcast directly through the Store
- RSS - Subscribe through your regular RSS reader
- MP3 - Direct download link to the MP3 file
Hosts: Ryan Shrout, Josh Walrath, Jeremy Hellstrom, and Allyn Malventano
Program length: 1:42:35
Week in Review:
0:54:10 This episode of PC Perspective is brought to you by…Zumper, the quick and easy way to find your next apartment or home rental. To get started and to find your new home go to http://zumper.com/PCP
News item of interest:
Hardware/Software Picks of the Week:
Subject: Graphics Cards, Processors, Mobile | July 19, 2015 - 10:59 AM | Scott Michaud
Tagged: Zen, TSMC, Skylake, pascal, nvidia, Intel, Cannonlake, amd, 7nm, 16nm, 10nm
Getting smaller features allows a chip designer to create products that are faster, cheaper, and consume less power. Years ago, most designers had their own production facilities, but that is becoming rare. IBM has just finished selling off its manufacturing to GlobalFoundries, which itself was spun out of AMD when AMD divested from fabrication in 2009. Texas Instruments, on the other hand, decided to continue manufacturing but get out of the chip design business. Intel and Samsung are arguably the last two players with a strong commitment to both sides of the "let's make a chip" coin.
So where do these chip designers go? TSMC is the name that comes up most often. Any given discrete GPU from the last several years was probably produced there, along with many CPUs and SoCs from a variety of fabless semiconductor companies.
Several years ago, when the GeForce 600-series launched, TSMC's 28nm line faced shortages that kept GPUs out of stock for quite some time. Since then, 28nm has been the stable workhorse for countless high-performance products. Recent chips have been physically huge, enabled by the maturity of the process and its correspondingly low defect rates. The designers are anxious to get onto smaller processes, though.
In a conference call at 2 AM EDT on Thursday (2 PM in Taiwan), Mark Liu of TSMC announced that "the ramping of our 16 nanometer will be very steep, even steeper than our 20nm". By that, they mean this year. Hopefully this translates into production usable for GPUs and CPUs early on, as AMD needs it to launch its Zen CPU architecture as early in 2016 as possible. Graphics cards, meanwhile, have been stuck on 28nm for over three years. It's time.
Also interesting is how TSMC believes that they can hit 10nm by the end of 2016. If so, this might put them ahead of Intel. That said, Intel was also confident that they could reach 10nm by the end of 2016, right until they announced Kaby Lake a few days ago. We will need to see if it pans out. If it does, competitors could actually beat Intel to the market at that feature size -- although that could end up being mobile SoCs and other integrated circuits that are uninteresting for the PC market.
Following the announcement from IBM Research, 7nm was also mentioned in TSMC's call. Apparently they expect to start qualifying in Q1 2017. That does not give an estimate for production but, if their 10nm schedule is both accurate and representative of 7nm, that would put production somewhere in 2018. Note that I just speculated on an if of an if of a speculation, so take that with a mine of salt. There is probably a very good reason this date wasn't mentioned in the call.
Back to the 16nm discussion, what are you hoping for most? New GPUs from NVIDIA, new GPUs from AMD, a new generation of mobile SoCs, or the launch of AMD's new CPU architecture? This should make for a highly entertaining comments section on a Sunday morning, don't you agree?
Subject: Storage, Shows and Expos | June 3, 2015 - 03:47 AM | Allyn Malventano
Tagged: tlc, ssd, micron, flash, computex 2015, computex, 16nm
While 16nm TLC was initially promised for Q4 of 2014, I believe Micron was distracted a little by its dabbling in Dynamic Write Acceleration technology. Micron no doubt wants to offer ever more cost effective SSDs in its portfolio: the new 16nm TLC flash takes up less die space for the same capacity, meaning more dies per 300mm wafer, ultimately translating to a lower cost/GB for consumer SSDs.
Micron's 16nm (MLC) flash
The Crucial MX200 and BX100 SSDs have already been undercutting the competition in cost/GB, so the possibility of even lower cost SSDs is more than welcome - just so long as reliability stays high enough. IMFT has a very solid track record in this regard, so I don't expect any surprises.
Full press blast appears after the break.
For a mere $100 you can pick up the 256GB model, or $200 doubles that to 512GB. That certainly makes the drives attractive, but the performance is there as well, often beating their predecessor, the M500 series. If reliability is a concern, the onboard RAIN feature guards against data loss from bad flash, onboard capacitors allow in-flight writes to finish in the event of a power outage, and the drive carries a three year warranty. Check out the full review at The Tech Report if you need a second opinion after Allyn's review.
"The Crucial MX100 is the first solid-state drive to use Micron's 16-nm MLC NAND. It's also one of the most affordable SSDs around, with the 256GB version priced at $109.99 and the 512GB at $224.99. We take a closer look at how the two stack up against a range of competitors, and the results might surprise you."
Here are some more Storage reviews from around the web:
- Crucial MX100 Solid State Drive @ Benchmark Reviews
- Toshiba Q Series Pro 256GB SSD @ NikKTech
- Samsung 845DC EVO @ SSD Review
- OCZ Vertex 460 240GB Review @ OCC
- OCZ RevoDrive 350 480GB PCIe SSD Review @ Legit Reviews
- Vantec EZ SWAP M3500 Series Review @HiTech Legion
- Netgear ReadyNAS RN312, RN314 & RN316 @ Legion Hardware
- Thecus N4560 SOHO/Home NAS Server Review @ Madshrimps
- Thecus N7710-G @ techPowerUp
- ADATA XPG SDXC UHS-1 U3 Card @ The SSD Review
Introduction, Specifications and Packaging
Back in July of last year, Micron announced production of 16nm flash memory. These are the same 128Gbit dies as the previous generation parts, but at 16nm the dies are smaller, meaning more dies from a single wafer, ultimately translating to lower end user cost.
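The economics here are straightforward to sketch. Using hypothetical die areas (the real 128Gbit die sizes were not disclosed), a first-order dies-per-wafer estimate looks like this:

```python
import math

def dies_per_wafer(wafer_diameter_mm: float, die_area_mm2: float) -> int:
    """First-order estimate ignoring edge loss, scribe lines, and defects."""
    wafer_area = math.pi * (wafer_diameter_mm / 2) ** 2
    return int(wafer_area / die_area_mm2)

# Hypothetical die areas for the same 128Gbit part on two nodes:
old_node = dies_per_wafer(300, 170)  # assumed 20nm-class die area, mm^2
new_node = dies_per_wafer(300, 120)  # assumed shrunken 16nm die area, mm^2

print(old_node, new_node)  # more dies per wafer -> lower cost per GB
```

Since wafer processing cost is roughly fixed per wafer, every extra die from the shrink drops the per-die (and so per-GB) cost almost proportionally, which is exactly the lever Micron is pulling.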
It takes a bit of time for a new flash die shrink to trickle into mainstream products. Early production on a given shrink tends not to deliver competitive endurance. As production continues, the process gets tweaked, resulting in better yields and improved endurance.