Eight is enough, looking at how the new Tesla HPC cards from NVIDIA will work

Subject: General Tech | September 14, 2016 - 01:06 PM |
Tagged: pascal, tesla, p40, p4, nvidia, neural net, m40, M4, HPC

The Register have packaged a nice explanation of the basics of how neural nets work in their quick look at NVIDIA's new Pascal-based HPC cards, the P4 and P40.  The tired joke about Zilog or Dick Van Patten stems from research showing that 8-bit precision is most effective when feeding data into a neural net.  Using 16 or 32-bit values slows the processing down significantly while adding little precision to the results produced.  NVIDIA is also perfecting a hybrid mode, where you can opt for a less precise answer produced by your local, presumably limited, hardware or upload the data to the cloud for the full treatment.  This is great for those with security concerns or when a quicker answer is more valuable than a more accurate one.
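As a rough illustration of the 8-bit idea, here is a minimal Python sketch of symmetric int8 quantization followed by an integer dot product. The function names, the sample weights and inputs, and the single shared scale per vector are all assumptions made for this example; it is not taken from NVIDIA or The Register.

```python
def quantize_int8(values):
    """Map floats to integers in [-127, 127] using one shared scale."""
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127 if max_abs else 1.0
    return [round(v / scale) for v in values], scale


def int8_dot(a_q, a_scale, b_q, b_scale):
    """Dot product on the integer values, rescaled to a float once at the end."""
    acc = sum(x * y for x, y in zip(a_q, b_q))  # integer multiply-accumulate
    return acc * a_scale * b_scale


# Hypothetical weights and inputs, chosen only for the example.
weights = [0.42, -1.30, 0.07, 0.88]
inputs = [1.5, 0.25, -2.0, 0.6]

w_q, w_scale = quantize_int8(weights)
x_q, x_scale = quantize_int8(inputs)

exact = sum(w * x for w, x in zip(weights, inputs))
approx = int8_dot(w_q, w_scale, x_q, x_scale)
print(f"float result: {exact:.4f}, int8 result: {approx:.4f}")
```

The two results should land close together, which is the trade the article describes: much narrower values for only a small loss in accuracy.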

As for the hardware, NVIDIA claims the optimizations on the P40 will make it "40 times more efficient" than an Intel Xeon E5 CPU and it will also provide slightly more throughput than the currently available Titan X.  You can expect to see these arrive in the market sometime over the next two months.

newtesla.PNG

"Nvidia has designed a couple of new Tesla processors for AI applications – the P4 and the P40 – and is talking up their 8-bit math performance. The 16nm FinFET GPUs use Nv's Pascal architecture and follow on from the P100 launched in June. The P4 fits on a half-height, half-length PCIe card for scale-out servers, while the beefier P40 has its eyes set on scale-up boxes."

Source: The Register

Rumor: NVIDIA GeForce GTX 1050 GPU-Z Screenshot

Subject: Graphics Cards | September 6, 2016 - 05:45 PM |
Tagged: nvidia, pascal, gtx 1050, geforce

I don't know why people insist on encoding screenshots from form-based windows in JPEG. You have very little color variation outside of text, which is typically thin and high-contrast against its surroundings. JPEG's discrete cosine transform (a Fourier-related transform) will cause rippling artifacts in the background, which should be a solid color, and the file will almost certainly be larger. Please, everyone, at least check to see how big a PNG will be before encoding it as JPEG. (In case you notice that I encoded it in JPEG too, that's because re-compressing JPEG artifacts makes a PNG's file size blow up, forcing me to use JPEG.)
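The rippling can be reproduced with a few lines of Python. JPEG applies a discrete cosine transform to 8x8 blocks; the sketch below uses a single row of eight samples and simply zeroes the four highest frequencies, which is a crude stand-in (an assumption for illustration) for JPEG's real quantization tables.

```python
import math

N = 8  # one row of a JPEG-style 8x8 block


def dct(samples):
    """Orthonormal DCT-II of a list of N samples."""
    out = []
    for k in range(N):
        s = sum(x * math.cos(math.pi * (n + 0.5) * k / N) for n, x in enumerate(samples))
        out.append(s * math.sqrt((1 if k == 0 else 2) / N))
    return out


def idct(coeffs):
    """Inverse of dct() (DCT-III with matching scaling)."""
    out = []
    for n in range(N):
        s = sum(
            c * math.sqrt((1 if k == 0 else 2) / N) * math.cos(math.pi * (n + 0.5) * k / N)
            for k, c in enumerate(coeffs)
        )
        out.append(s)
    return out


# A hard edge, like dark text next to a light window background.
row = [255, 255, 255, 255, 0, 0, 0, 0]

coeffs = dct(row)
kept = coeffs[:4] + [0.0] * 4  # discard the four highest frequencies
restored = [round(v) for v in idct(kept)]

print(row)
print(restored)  # the flat runs on either side of the edge now wobble
```

A lossless format such as PNG stores the two flat runs exactly, which is why it tends to suit screenshots of form-based windows.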

nvidia-2016-gtx1050leak-benchlife.jpg

It also makes it a bit more difficult to tell whether a screenshot has been manipulated, because the hitches make everything look suspect. Regardless, BenchLife claims to have a leaked GPU-Z result for the GeForce GTX 1050. They claim that it will be using the GP107 die at 75W, although the screenshot claims neither of these. If true, this means that it will not be a further cut-down version of GP106, as seen in the two GTX 1060 parts, which would explain a little bit why they wanted both of them to remain in the 1060 level of branding. (Although why they didn't call the 6GB version the 1060 Ti is beyond me.)

What the screenshot does suggest, though, is that it will have 4GB of GDDR5 memory, on a 128-bit bus. It will have 768 shaders, the same as the GTX 950, although clocked about 15% higher (boost vs boost) and 15W lower, bringing it back into the range of PCIe bus power (75W). That doesn't mean that it will not have a six-pin external power connector, but that could be the case, like the 750 Ti.

This would give it about 2.1 TeraFLOPs of performance, which is on par with the GeForce GTX 660 from a few generations ago, as well as the RX 460, which is also 75W TDP.
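The 2.1 TeraFLOPs figure can be sanity-checked with the usual peak-throughput arithmetic: shaders, times two operations per clock for a fused multiply-add, times clock speed. The sketch below assumes the GTX 950's reference boost clock of 1188 MHz and scales it by the "about 15% higher" figure; the exact clock in the leaked screenshot is not quoted here, so treat the result as an estimate.

```python
def fp32_tflops(shaders, boost_mhz):
    """Peak FP32 throughput, counting one fused multiply-add (2 FLOPs) per shader per clock."""
    return shaders * 2 * boost_mhz * 1e6 / 1e12


# Assumption: GTX 950 reference boost clock, scaled up by roughly 15%.
gtx_950_boost_mhz = 1188
estimated_boost_mhz = gtx_950_boost_mhz * 1.15

estimate = fp32_tflops(768, estimated_boost_mhz)
print(f"Estimated peak FP32: {estimate:.2f} TFLOPs")
```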

Source: Benchlife

Razer Updates Blade and Blade Stealth Laptops

Subject: Systems | September 3, 2016 - 12:10 AM |
Tagged: razer, blade, blade stealth, kaby lake, pascal

The Razer Blade and the Razer Blade Stealth seem to be quite different in their intended usage. The regular model is slightly more expensive than its sibling, but it includes a quad-core (eight thread) Skylake processor and an NVIDIA GTX 1060. The Stealth model, on the other hand, uses a Kaby Lake (the successor to Skylake) dual-core (four thread) processor, and it uses the Intel HD Graphics 620 iGPU instead of adding a discrete part from AMD or NVIDIA.

razer-2016-newlaptops.png

The Stealth model weighs about 2.84 lbs, while the regular model is (relatively) much heavier at 4.1 - 4.3 lbs, depending on the user's choice of screen. The extra weight is likely due in part to the much larger battery, which is needed to power the discrete GPU and last-generation quad-core CPU. Razer claims that the Stealth's 53.6 Wh battery will power the device for 9 hours. They do not seem to make any claims about how long the non-Stealth's 70Wh battery will last. Granted, that would depend on workload anyway.
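Razer's battery claim implies an average power draw, and the same arithmetic shows why the larger Blade's runtime depends so heavily on workload. This is a small Python sketch; the load figures used for the 70 Wh battery are hypothetical values picked for illustration, not measurements.

```python
def average_draw_watts(capacity_wh, runtime_hours):
    """Average power draw implied by a battery capacity and a claimed runtime."""
    return capacity_wh / runtime_hours


def runtime_hours(capacity_wh, draw_watts):
    """Runtime a battery would deliver at a constant power draw."""
    return capacity_wh / draw_watts


# Razer's claimed figures for the Blade Stealth.
stealth_draw = average_draw_watts(53.6, 9)
print(f"Stealth average draw: {stealth_draw:.2f} W")

# Hypothetical loads for the 70 Wh Blade: light use, mixed use, gaming.
for load_watts in (10, 25, 50):
    print(f"{load_watts} W load: {runtime_hours(70, load_watts):.1f} hours")
```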

This is where the interesting choice begins. Both devices are compatible with the Razer Core, which allows externally-attached desktop GPUs to be plugged into Razer laptops. If you look at their website design, the Razer Blade Stealth promotes the Core more prominently, even including a “Buy Now” button for it on the header. They also advertise 100% AdobeRGB color support on the Stealth, which is useful for graphic designers because it can be calibrated to either sRGB (web and video) or print (magazines) color spaces.

To me, the Stealth seems more for a user who wants to bring their laptop to work (or school) on a daily basis, and possibly plug it into a discrete GPU when they get home. Alternatively, the Razer Blade without a suffix is for someone who wants a strong, powerful PC that, while not as fast as a full desktop, is decently portable and even VR ready without external graphics. The higher resolution choices, despite the slower internal graphics, also suggest that the Stealth is more business, while the Blade is more gaming.

Before we go, Razer has also included a license of Fruity Loops Studio 12 Producer Edition. This is a popular piece of software that is used to create music by layering individual instruments and tracks. Even if you license Adobe Creative Cloud, this is an area that Audition can technically overlap with but isn't really designed for. Instead, think GarageBand.

The Razer Blade Stealth is available now, from $999.99 (128GB QHD) to $1999.00 (1TB 4K).

The Razer Blade is also available now, from $1799.99 (256GB 1080p) to $2699.99 (1TB QHD+).

Source: Razer
Manufacturer: ASUS

Specifications and Card Breakdown

The flurry of retail built cards based on NVIDIA's new Pascal GPUs has been hitting us hard at PC Perspective. So much in fact that, coupled with new gaming notebooks, new monitors, new storage and a new church (you should listen to our podcast, really) output has slowed dramatically. How do you write reviews for all of these graphics cards when you don't even know where to start? My answer: blindly pick one and start typing away.

07.jpg

Just after launch day of the GeForce GTX 1060, ASUS sent over the GTX 1060 Turbo 6GB card. Despite the name, the ASUS Turbo line of GTX 10-series graphics cards is the company's most basic, most stock iteration of graphics cards. That isn't necessarily a drawback though - you get reference level performance at the lowest available price and you still get the promises of quality and warranty from ASUS.

With a target MSRP of just $249, does the ASUS GTX 1060 Turbo make the cut for users looking for that perfect mainstream 1080p gaming graphics card? Let's find out.

Continue reading our review of the ASUS GeForce GTX 1060 Turbo 6GB!

EVGA's Water Cooled GTX 1080 FTW Hybrid Runs Cool and Quiet

Subject: Graphics Cards | August 23, 2016 - 04:18 PM |
Tagged: water cooling, pascal, hybrid cooler, GTX 1080, evga

EVGA recently launched a water cooled graphics card that pairs the GTX 1080 processor with the company's FTW PCB and a closed loop (AIO) water cooler to deliver a heavily overclockable card that will set you back $730.

The GTX 1080 FTW Hybrid is interesting because the company has opted to use the same custom PCB design as its FTW cards rather than a reference board. This FTW board features improved power delivery with a 10+2 power phase, two 8-pin PCI-E power connectors, Dual BIOS, and adjustable RGB LEDs. The cooler is shrouded with backlit EVGA logos and has a fan to air cool the memory and VRMs that is reportedly quiet and uses a reverse swept blade design (like their ACX air coolers) rather than a traditional blower style fan. The graphics processor is cooled by a water loop.

EVGA GTX 1080 FTW Hybrid.jpg

The water block and pump sit on top of the GPU with tubes running out to the 120mm radiator. Luckily the fan on the radiator can be easily disconnected, allowing users to use their own fan if they wish. According to YouTuber Jayztwocents, the Precision XOC software controls the speed of the fan on the card itself, but users cannot adjust the radiator fan speed themselves. You can connect your own fan to your motherboard and control it that way, however.

Display outputs include one DVI-D, one HDMI, and three DisplayPort outputs (any four of the five can be used simultaneously).

Out of the box this 215W TDP graphics card has a factory overclock of 1721 MHz base and 1860 MHz boost. Thanks to the water cooler, the GPU stays at a frosty 42°C under load. When switched to the slave BIOS (which has a higher power limit and more aggressive fan curve), the card GPU Boosted to 2025 MHz and hit 51°C (Jayztwocents managed to keep that to 44°C by swapping his own EK-Vardar fan onto the radiator). Not bad, especially considering the Founders Edition hit 85°C on air in our testing! Unfortunately, EVGA did not touch the memory and left the 8GB of GDDR5X at the stock 10 GHz.

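|                 | GTX 1080        | GTX 1080 FTW Hybrid | GTX 1080 FTW Hybrid Slave BIOS |
|-----------------|-----------------|---------------------|--------------------------------|
| GPU             | GP104           | GP104               | GP104                          |
| GPU Cores       | 2560            | 2560                | 2560                           |
| Rated Clock     | 1607 MHz        | 1721 MHz            | 1721 MHz                       |
| Boost Clock     | 1733 MHz        | 1860 MHz            | 2025 MHz                       |
| Texture Units   | 160             | 160                 | 160                            |
| ROP Units       | 64              | 64                  | 64                             |
| Memory          | 8GB             | 8GB                 | 8GB                            |
| Memory Clock    | 10000 MHz       | 10000 MHz           | 10000 MHz                      |
| TDP             | 180 watts       | 215 watts           | ? watts                        |
| Max Temperature | 85°C            | 42°C                | 51°C                           |
| MSRP (current)  | $599 ($699 FE)  | $730                | $730                           |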

The water cooler should help users hit even higher overclocks and/or maintain a consistent GPU Boost clock at much lower temperatures than on air. The GTX 1080 FTW Hybrid graphics card does come at a bit of a premium at $730 (versus $699 for Founders or ~$650+ for custom models), but if you have the room in your case for the radiator this might be a nice option! (Of course custom water cooling is more fun, but it's also more expensive, time consuming, and addictive. hehe)

What do you think about these "hybrid" graphics cards?

Source: EVGA

Serious mobile gaming power from ASUS, if you can afford it

Subject: Mobile | August 19, 2016 - 03:15 PM |
Tagged: asus, ROG, gtx 1070, G752VS OC Edition, pascal, gaming laptop

The mobile version of the GTX 1070, referred to here as the GTX 1070M even if NVIDIA doesn't, is an interesting part sporting 128 more cores than the desktop version, albeit at a lower clock.  Hardware Canucks received the ASUS RoG G752VS OC Edition gaming laptop which uses the mobile GTX 1070, overclocked by 50MHz on the Core and by 150MHz on the 8GB of RAM, along with an i7-6820 running at 3.8GHz.  This particular model will set you back $3000US and offers very impressive performance on either its 17.3" 1080p G-SYNC display or on an external display of your choice.  The difference in performance between the new GTX 1070(M) and the previous GTX 980M is marked; check out the full review to see just how much better this card is ... assuming the price tag doesn't immediately turn you off.

GTX1070-NOTEBOOK-10.jpg

"The inevitable has finally happened: NVIDIA's Pascal architecture has made its way into gaming notebooks....and it is spectacular. In this review we take a GTX 1070-totting laptop out for a spin. "

Podcast #413 - NVIDIA Pascal Mobile, ARM and Intel partner on 10nm, Flash Memory Summit and more!

Subject: Editorial | August 18, 2016 - 02:20 PM |
Tagged: video, podcast, pascal, nvidia, msi, mobile, Intel, idf, GTX 1080, gtx 1070, gtx 1060, gigabyte, FMS, Flash Memory Summit, asus, arm, 10nm

PC Perspective Podcast #413 - 08/18/2016

Join us this week as we discuss the new mobile GeForce GTX 10-series gaming notebooks, ARM and Intel partnering on 10nm, Flash Memory Summit and more!

You can subscribe to us through iTunes and you can still access it directly through the RSS page HERE.

The URL for the podcast is: http://pcper.com/podcast - Share with your friends!

Hosts:  Allyn Malventano, Sebastian Peak, Josh Walrath and Jeremy Hellstrom

Program length: 1:29:39
  1. Week in Review:
  2. This episode of PC Perspective is brought to you by Casper!! Use code “PCPER”
  3. News items of interest:
    1. 0:42:05 Final news from FMS 2016
  4. Hardware/Software Picks of the Week
    1. Ryan: VR Demi Moore
  5. Closing/outro

Subject: Editorial
Manufacturer: NVIDIA

NVIDIA Today?

It always feels a little odd when covering NVIDIA’s quarterly earnings due to how they present their financial calendar.  No, we are not reporting from the future.  Yes, it can be confusing when comparing results and getting your dates mixed up.  Regardless of the date before the earnings, NVIDIA did exceptionally well in a quarter that is typically the second weakest after Q1.

NVIDIA reported revenue of $1.43 billion.  This is a jump from an already strong Q1, when they took in $1.30 billion.  Compare this to the $1.027 billion of its competitor AMD, which provides CPUs as well as GPUs.  NVIDIA sold a lot of GPUs as well as other products.  Their primary money makers were consumer GPUs and the professional and compute markets, where they have a virtual stranglehold at the moment.  The company’s GAAP net income is a very respectable $253 million.
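For a sense of scale, the figures above can be turned into a growth rate, a net margin, and a ratio against AMD. This is plain arithmetic on the numbers quoted in the post, sketched in Python; the variable names are just labels chosen for the example.

```python
# Figures quoted in the post, in billions of US dollars.
nvidia_revenue_this_quarter = 1.43
nvidia_revenue_q1 = 1.30
amd_revenue = 1.027
nvidia_gaap_net_income = 0.253

quarterly_growth = (nvidia_revenue_this_quarter - nvidia_revenue_q1) / nvidia_revenue_q1
net_margin = nvidia_gaap_net_income / nvidia_revenue_this_quarter
revenue_vs_amd = nvidia_revenue_this_quarter / amd_revenue

print(f"Quarter-over-quarter revenue growth: {quarterly_growth:.1%}")
print(f"GAAP net margin: {net_margin:.1%}")
print(f"NVIDIA revenue relative to AMD: {revenue_vs_amd:.2f}x")
```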

results.png

The release of the latest Pascal-based GPUs was the primary driver of the gains this quarter.  AMD has had a hard time competing with NVIDIA for marketshare.  The older Maxwell based chips performed well against the entire line of AMD offerings and typically did so with better power and heat characteristics.  Even though the GTX 970 was somewhat limited in its memory configuration as compared to the AMD products (3.5 GB + .5 GB vs. a full 4 GB implementation) it was a top seller in its class.  The same could be said for the products up and down the stack.

Pascal was released at the end of May, but the company had been shipping chips to its partners as well as creating the “Founder’s Edition” models to its exacting specifications.  These were strong sellers throughout the end of May until the end of the quarter.  NVIDIA recently unveiled their latest Pascal based Quadro cards, but we do not know how much of an impact those have had on this quarter.  NVIDIA has also been shipping, in very limited quantities, the Tesla P100 based units to select customers and outfits.

Click to read more about NVIDIA's latest quarterly results!

Manufacturer: NVIDIA

Is Enterprise Ascending Outside of Consumer Viability?

So a couple of weeks have gone by since the Quadro P6000 (update: was announced) and the new Titan X launched. With them, we received a new chip: GP102. Since Fermi, NVIDIA has labeled their GPU designs with a G, followed by a single letter for the architecture (F, K, M, or P for Fermi, Kepler, Maxwell, and Pascal, respectively), which is then followed by a three digit number. The last digit is the most relevant one, however, as it separates designs by their intended size.

nvidia-2016-Quadro_P6000_7440.jpg

Typically, 0 corresponds to a ~550-600mm2 design, which is about as large a design as fabrication labs can create without error-prone techniques, like multiple exposures (update for clarity: trying to precisely overlap multiple designs to form a larger integrated circuit). 4 corresponds to ~300mm2, although GM204 was pretty large at 398mm2, which was likely to increase the core count while remaining on a 28nm process. Higher numbers, like 6 or 7, fill back the lower-end SKUs until NVIDIA essentially stops caring for that generation. So when we moved to Pascal, jumping two whole process nodes, NVIDIA looked at their wristwatches and said “about time to make another 300mm2 part, I guess?”

The GTX 1080 and the GTX 1070 (GP104, 314mm2) were born.

nvidia-2016-gtc-pascal-banner.png

NVIDIA already announced a 600mm2 part, though. The GP100 had 3840 CUDA cores, HBM2 memory, and an ideal ratio of 1:2:4 between FP64:FP32:FP16 performance. (A 64-bit chunk of memory can store one 64-bit value, two 32-bit values, or four 16-bit values, unless the register is attached to logic circuits that, while smaller, don't know how to operate on the data.) This increased ratio, even over Kepler's 1:6 FP64:FP32, is great for GPU compute, but wasted die area for today's (and tomorrow's) games. I'm predicting that it takes the wind out of Intel's sales, as Xeon Phi's 1:2 FP64:FP32 performance ratio is one of its major selling points, leading to its inclusion in many supercomputers.
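The storage side of that 1:2:4 ratio is easy to see with Python's standard `struct` module, which can pack double, single, and half precision floats. The sample values below are arbitrary and only illustrate the packing; this says nothing about how GP100's registers are actually wired.

```python
import struct

# The same 64 bits can hold one FP64, two FP32, or four FP16 values.
one_fp64 = struct.pack("<d", 3.141592653589793)
two_fp32 = struct.pack("<2f", 3.1415927, 2.7182817)
four_fp16 = struct.pack("<4e", 3.14, 2.72, 1.41, 1.73)

for label, blob in (("1 x FP64", one_fp64), ("2 x FP32", two_fp32), ("4 x FP16", four_fp16)):
    print(f"{label}: {len(blob) * 8} bits")

# Narrower formats keep fewer significant digits when read back.
print(struct.unpack("<d", one_fp64)[0])
print(struct.unpack("<2f", two_fp32)[0])
print(struct.unpack("<4e", four_fp16)[0])
```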

Despite the HBM2 memory controller supposedly being smaller than a GDDR5(X) one, NVIDIA could still save die space while providing 3840 CUDA cores (despite disabling a few on Titan X). The trade-off is that FP64 and FP16 performance had to decrease dramatically, from 1:2 and 2:1 relative to FP32, all the way down to 1:32 and 1:64. This new design comes in at 471mm2, although it's $200 more expensive than what the 600mm2 products, GK110 and GM200, launched at. Smaller dies provide more products per wafer and, better still, the number of defective chips per wafer should stay relatively constant.
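To put rough numbers on the dies-per-wafer point, the sketch below uses a common textbook approximation for gross dies per wafer. It assumes a 300mm wafer and ignores yield, scribe lines, and edge exclusion, so the counts are ballpark figures rather than anything NVIDIA or TSMC has published.

```python
import math


def gross_dies_per_wafer(die_area_mm2, wafer_diameter_mm=300):
    """Approximate whole dies per wafer: wafer area over die area, minus an edge-loss term."""
    wafer_area = math.pi * (wafer_diameter_mm / 2) ** 2
    edge_loss = math.pi * wafer_diameter_mm / math.sqrt(2 * die_area_mm2)
    return int(wafer_area / die_area_mm2 - edge_loss)


# Die areas mentioned in the post: a 600mm2-class chip, GP102, and GP104.
for name, area in (("600mm2 class", 600), ("GP102", 471), ("GP104", 314)):
    print(f"{name}: about {gross_dies_per_wafer(area)} dies per wafer")
```

If the number of defects per wafer stays roughly the same, each defect also ruins a smaller share of the wafer's output when there are more dies on it.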

Anyway, that aside, it puts NVIDIA in an interesting position. Splitting the xx0-class chip into xx0 and xx2 designs allows NVIDIA to lower the cost of their high-end gaming parts, although it cuts out hobbyists who buy a Titan for double-precision compute. More interestingly, it leaves around 150mm2 for AMD to sneak in a design that's FP32-centric, leaving them a potential performance crown.

nvidia-2016-pascal-volta-roadmap-extremetech.png

Image Credit: ExtremeTech

On the other hand, as fabrication node changes are becoming less frequent, it's possible that NVIDIA could be leaving itself room for Volta, too. Last month, it was rumored that NVIDIA would release two architectures at 16nm, in the same way that Maxwell shared 28nm with Kepler. In this case, Volta, on top of whatever other architectural advancements NVIDIA rolls into that design, can also grow a little in size. At that time, TSMC would have better yields, making a 600mm2 design less costly in terms of waste and recovery.

If this is the case, we could see the GPGPU folks receiving a new architecture once every second gaming (and professional graphics) architecture. That is, unless you are a hobbyist. If you are? I would need to be wrong, or NVIDIA would need to somehow bring their enterprise SKU into an affordable price point. The xx0 class seems to have been pushed up and out of viability for consumers.

Or, again, I could just be wrong.

That old chestnut again? Intel compares their current gen hardware against older NVIDIA kit

Subject: General Tech | August 17, 2016 - 12:41 PM |
Tagged: nvidia, Intel, HPC, Xeon Phi, maxwell, pascal, dirty pool

There is a spat going on between Intel and NVIDIA over the slide below, as you can read about over at Ars Technica.  It seems that Intel have reached into the industry's bag of dirty tricks and dusted off an old standby, testing new hardware and software against older products from their competitors.  In this case it was high performance computing products which were tested, Intel's new Xeon Phi against NVIDIA's Maxwell, tested on an older version of the Caffe AlexNet benchmark.

NVIDIA points out that not only would they have done better than Intel if an up-to-date version of the benchmarking software was used, but that the comparison should have been against their current architecture, Pascal.  This is not quite as bad as putting undocumented flags into compilers to reduce the performance of competitors' chips or predatory discount programs, but it shows that the computer industry continues to have only a passing acquaintance with fair play and honest competition.

intel-xeon-phi-performance-claim.jpg

"At this juncture I should point out that juicing benchmarks is, rather sadly, par for the course. Whenever a chip maker provides its own performance figures, they are almost always tailored to the strength of a specific chip—or alternatively, structured in such a way as to exacerbate the weakness of a competitor's product."

Source: Ars Technica