Why Two 4GB GPUs Isn't Necessarily 8GB
We're trying something new here at PC Perspective. Some topics are fairly difficult to explain cleanly without accompanying images. We also like to go fairly deep into specific topics, so we're hoping that we can provide educational cartoons that explain these issues.
This pilot episode is about load-balancing and memory management in multi-GPU configurations. There seems to be a lot of confusion around what was (and was not) possible with DirectX 11 and OpenGL, and even more confusion about what DirectX 12, Mantle, and Vulkan allow developers to do. It highlights three different load-balancing algorithms, and even briefly mentions what LucidLogix was attempting to accomplish almost ten years ago.
If you like it, and want to see more, please share and support us on Patreon. We're putting this out not knowing if it's popular enough to be sustainable. The best way to see more of this is to share!
Subject: General Tech | August 18, 2016 - 06:20 PM | Jeremy Hellstrom
Tagged: Intel, joule, iot, IDF 2016, SoC, 570x, 550x, Intel RealSense
Intel has announced the follow up to Edison and Curie, their current SoC device, called Joule. They have moved away from the Quark processors they previously used to a current generation Atom. The device is designed to compete against NVIDIA's Jetson as it is far more powerful than a Raspberry Pi and will be destined for different usage. It will support Intel RealSense, perhaps appearing in the newly announced Project Alloy VR headset. Drop by Hack a Day for more details on the two soon to be released models, the Joule 570x and 550x.
"The high-end board in the lineup features a quad-core Intel Atom running at 2.4 GHz, 4GB of LPDDR4 RAM, 16GB of eMMC, 802.11ac, Bluetooth 4.1, USB 3.1, CSI and DSI interfaces, and multiple GPIO, I2C, and UART interfaces."
Here is some more Tech News from around the web:
- Microsoft Windows UAC can be bypassed for untraceable hacks @ The Inquirer
- Microsoft PowerShell Goes Open Source and Lands On Linux and Mac @ Slashdot
- The Witcher 3: Wild Hunt Enables NVIDIA Ansel Support For 3D Stereo Screenshots @ Techgage
- ech support scammers mess with hacker's mother, so he retaliated with ransomware @ The Register
- 90 per cent of people ignore security notices because their brains are too busy @ The Inquirer
- NikKTech With Kingston HyperX It's All About Speed Global Giveaway
Subject: Editorial | August 18, 2016 - 06:20 PM | Ryan Shrout
Tagged: video, podcast, pascal, nvidia, msi, mobile, Intel, idf, GTX 1080, gtx 1070, gtx 1060, gigabyte, FMS, Flash Memory Summit, asus, arm, 10nm
PC Perspective Podcast #413 - 08/18/2016
Join us this week as we discuss the new mobile GeForce GTX 10-series gaming notebooks, ARM and Intel partnering on 10nm, Flash Memory Summit and more!
The URL for the podcast is: http://pcper.com/podcast - Share with your friends!
- iTunes - Subscribe to the podcast directly through the Store (audio only)
- Google Play - Subscribe to our audio podcast directly through Google Play!
- RSS - Subscribe through your regular RSS reader (audio only)
- MP3 - Direct download link to the MP3 file
Hosts: Allyn Malventano, Sebastian Peak, Josh Walrath and Jeremy Hellstrom
Week in Review:
This episode of PC Perspective is brought to you by Casper!! Use code “PCPER”
News items of interest:
0:42:05 Final news from FMS 2016
Hardware/Software Picks of the Week
Subject: Graphics Cards, Processors | August 17, 2016 - 05:38 PM | Scott Michaud
Tagged: Xeon Phi, larrabee, Intel
Tom Forsyth, who is currently at Oculus, was once on the core Larrabee team at Intel. Just prior to Intel's IDF conference in San Francisco, which Ryan is at and covering as I type this, Tom wrote a blog post that outlined the project and its design goals, including why it didn't hit market as a graphics device. He even goes into the details of the graphics architecture, which was almost entirely in software apart from texture units and video out. For instance, Larrabee was running FreeBSD with a program, called DirectXGfx, that gave it the DirectX 11 feature set -- and it worked on hundreds of titles, too.
Also, if you found the discussion interesting, then there is plenty of content from back in the day to browse. A good example is an Intel Developer Zone post from Michael Abrash that discussed software rasterization, doing so with several really interesting stories.
Subject: General Tech | August 17, 2016 - 04:41 PM | Jeremy Hellstrom
Tagged: nvidia, Intel, HPC, Xeon Phi, maxwell, pascal, dirty pool
There is a spat going on between Intel and NVIDIA over the slide below, as you can read about over at Ars Technica. It seems that Intel have reached into the industries bag of dirty tricks and polished off an old standby, testing new hardware and software against older products from their competitors. In this case it was high performance computing products which were tested, Intel's new Xeon Phi against NVIDIA's Maxwell, tested on an older version of the Caffe AlexNet benchmark.
NVIDIA points out that not only would they have done better than Intel if an up to date version of the benchmarking software was used, but that the comparison should have been against their current architecture, Pascal. This is not quite as bad as putting undocumented flags into compilers to reduce the performance of competitors chips or predatory discount programs but it shows that the computer industry continues to have only a passing acquaintance with fair play and honest competition.
"At this juncture I should point out that juicing benchmarks is, rather sadly, par for the course. Whenever a chip maker provides its own performance figures, they are almost always tailored to the strength of a specific chip—or alternatively, structured in such a way as to exacerbate the weakness of a competitor's product."
Here is some more Tech News from around the web:
- USB Implementers Forum introduces branding for safe USB-C charging @ The Inquirer
- Some Windows 10 Anniversary Update: SSD freeze @ The Register
- Intel Project Alloy: all-in-one VR headset takes aim at Google's Project Daydream @ The Inquirer
- Wanna build your own drone? Intel emits Linux-powered x86 brains for DIY flying gizmos @ The Register
- Intel's Optane XPoint DIMMs pushed back – source @ The Register
A Watershed Moment in Mobile
This previous May I was invited to Austin to be briefed on the latest core innovations from ARM and their partners. We were introduced to new CPU and GPU cores, as well as the surrounding technologies that provide the basis of a modern SOC in the ARM family. We also were treated to more information about the process technologies that ARM would embrace with their Artisan and POP programs. ARM is certainly far more aggressive now in their designs and partnerships than they have been in the past, or at least they are more willing to openly talk about them to the press.
The big process news that ARM was able to share at this time was the design of 10nm parts using an upcoming TSMC process node. This was fairly big news as TSMC was still introducing parts on their latest 16nm FF+ line. NVIDIA had not even released their first 16FF+ parts to the world in early May. Apple had dual sourced their 14/16 nm parts from Samsung and TSMC respectively, but these were based on LPE and FF lines (early nodes not yet optimized to LPP/FF+). So the news that TSMC would have a working 10nm process in 2017 was important to many people. 2016 might be a year with some good performance and efficiency jumps, but it seems that 2017 would provide another big leap forward after years of seeming stagnation of pure play foundry technology at 28nm.
Yesterday we received a new announcement from ARM that shows an amazing shift in thought and industry inertia. ARM is partnering with Intel to introduce select products on Intel’s upcoming 10nm foundry process. This news is both surprising and expected. It is surprising in that it happened as quickly as it did. It is expected as Intel is facing a very different world than it had planned for 10 years ago. We could argue that it is much different than they planned for 5 years ago.
Intel is the undisputed leader in process technologies and foundry practices. They are the gold standard of developing new, cutting edge process nodes and implementing them on a vast scale. This has served them well through the years as they could provide product to their customers seemingly on demand. It also allowed them a leg up in technology when their designs may not have fit what the industry wanted or needed (Pentium 4, etc.). It also allowed them to potentially compete in the mobile market with designs that were not entirely suited for ultra-low power. x86 is a modern processor technology with decades of development behind it, but that development focused mainly on performance at higher TDP ranges.
This past year Intel signaled their intent to move out of the sub 5 watt market and cede it to ARM and their partners. Intel’s ultra mobile offerings just did not make an impact in an area that they were expected to. For all of Intel’s advances in process technology, the base ARM architecture is just better suited to these power envelopes. Instead of throwing good money after bad (in the form of development time, wafer starts, rebates) Intel has stepped away from this market.
This leaves Intel with a problem. What to do with extra production capacity? Running a fab is a very expensive endeavor. If these megafabs are not producing chips 24/7, then the company is losing money. This past year Intel has seen their fair share of layoffs and slowing down production/conversion of fabs. The money spent on developing new, cutting edge process technologies cannot stop for the company if they want to keep their dominant position in the CPU industry. Some years back they opened up their process products to select 3rd party companies to help fill in the gaps of production. Right now Intel has far more production line space than they need for the current market demands. Yes, there were delays in their latest Skylake based processors, but those were solved and Intel is full steam ahead. Unfortunately, they do not seem to be keeping their fabs utilized at the level needed or desired. The only real option seems to be opening up some fab space to more potential customers in a market that they are no longer competing directly in.
The Intel Custom Foundry Group is working with ARM to provide access to their 10nm HPM process node. Initial production of these latest generation designs will commence in Q1 2017 with full scale production in Q4 2017. We do not have exact information as to what cores will be used, but we can imagine that they will be Cortex-A73 and A53 parts in big.LITTLE designs. Mali graphics will probably be the first to be offered on this advanced node as well due to the Artisan/POP program. Initial customers have not been disclosed and we likely will not hear about them until early 2017.
This is a big step for Intel. It is also a logical progression for them when we look over the changing market conditions of the past few years. They were unable to adequately compete in the handheld/mobile market with their x86 designs, but they still wanted to profit off of this ever expanding area. The logical way to monetize this market is to make the chips for those that are successfully competing here. This will cut into Intel’s margins, but it should increase their overall revenue base if they are successful here. There is no reason to believe that they won’t be.
The last question we have is if the 10nm HPM node will be identical to what Intel will use for their next generation “Cannonlake” products. My best guess is that the foundry process will be slightly different and will not provide some of the “secret sauce” that Intel will keep for themselves. It will probably be a mobile focused process node that stresses efficiency rather than transistor switching speed. I could be very wrong here, but I don’t believe that Intel will open up their process to everyone that comes to them hat in hand (AMD).
The partnership between ARM and Intel is a very interesting one that will benefit customers around the globe if it is handled correctly from both sides. Intel has a “not invented here” culture that has both benefited it and caused it much grief. Perhaps some flexibility on the foundry side will reap benefits of its own when dealing with very different designs than Intel is used to. This is a titanic move from where Intel probably thought it would be when it first started to pursue the ultra-mobile market, but it is a move that shows the giant can still positively react to industry trends.
Subject: Storage | August 16, 2016 - 06:00 PM | Allyn Malventano
Tagged: XPoint, Testbed, Optane, Intel, IDF 2016, idf
IDF 2016 is up and running, and Intel will no doubt be announcing and presenting on a few items of interest. Of note for this Storage Editor are multiple announcements pertaining to upcoming Intel Optane technology products.
Optane is Intel’s branding of their joint XPoint venture with Micron. Intel launched this branding at last year's IDF, and while the base technology is as high as 1000x faster than NAND flash memory, full solutions wrapped around an NVMe capable controller have shown to sit at roughly a 10x improvement over NAND. That’s still nothing to sneeze at, and XPoint settles nicely into the performance gap seen between NAND and DRAM.
Since modern M.2 NVMe SSDs are encroaching on the point of diminishing returns for consumer products, Intel’s initial Optane push will be into the enterprise sector. There are plenty of use cases for a persistent storage tier faster than NAND, but most enterprise software is not currently equipped to take full advantage of the gains seen from such a disruptive technology.
XPoint die. 128Gbit of storage at a ~20nm process.
In an effort to accelerate the development and adoption of 3D XPoint optimized software, Intel will be offering enterprise customers access to an Optane Testbed. This will allow for performance testing and tuning of customers’ software and applications ahead of the shipment of Optane hardware.
I did note something interesting in Micron's FMS 2016 presentation. QD=1 random performance appears to start at ~320,000 IOPS, while the Intel demo from a year ago (first photo in this post) showed a prototype running at only 76,600 IOPS. Using that QD=1 example, it appears that as controller technology improves to handle the large performance gains of raw XPoint, so does performance. Given a NAND-based SSD only turns in 10-20k IOPS at that same queue depth, we're seeing something more along the lines of 16-32x performance gains with the Micron prototype. Those with a realistic understanding of how queues work will realize that the type of gains seen at such low queue depths will have a significant impact in real-world performance of these products.
The speed of 3D XPoint immediately shifts the bottleneck back to the controller, PCIe bus, and OS/software. True 1000x performance gains will not be realized until second generation XPoint DIMMs are directly linked to the CPU.
The raw die 1000x performance gains simply can't be fully realized when there is a storage stack in place (even an NVMe one). That's not to say XPoint will be slow, and based on what I've seen so far, I suspect XPoint haters will still end up burying their heads in the sand once we get a look at the performance results of production parts.
Leaked roadmap including upcoming Optane products
Intel is expected to show a demo of their own more recent Optane prototype, and we suspect similar performance gains there as their controller tech has likely matured. We'll keep an eye out and fill you in once we've seen Intel's newer Optane goodness it in action!
Subject: General Tech, Processors, Displays, Shows and Expos | August 16, 2016 - 05:50 PM | Ryan Shrout
Tagged: VR, virtual reality, project alloy, Intel, augmented reality, AR
At the opening keynote to this summer’s Intel Developer Forum, CEO Brian Krzanich announced a new initiative to enable a completely untether VR platform called Project Alloy. Using Intel processors and sensors the goal of Project Alloy is to move all of the necessary compute into the headset itself, including enough battery to power the device for a typical session, removing the need for a high powered PC and a truly cordless experience.
This is indeed the obvious end-game for VR and AR, though Intel isn’t the first to demonstrate a working prototype. AMD showed the Sulon Q, an AMD FX-based system that was a wireless VR headset. It had real specs too, including a 2560x1440 OLED 90Hz display, 8GB of DDR3 memory, an AMD FX-8800P APU with R7 graphics embedded. Intel’s Project Alloy is currently using unknown hardware and won’t have a true prototype release until the second half of 2017.
There is one key advantage that Intel has implemented with Alloy: RealSense cameras. The idea is simple but the implications are powerful. Intel demonstrated using your hands and even other real-world items to interact with the virtual world. RealSense cameras use depth sensing to tracking hands and fingers very accurately and with a device integrated into the headset and pointed out and down, Project Alloy prototypes will be able to “see” and track your hands, integrating them into the game and VR world in real-time.
The demo that Intel put on during the keynote definitely showed the promise, but the implementation was clunky and less than what I expected from the company. Real hands just showed up in the game, rather than representing the hands with rendered hands that track accurately, and it definitely put a schism in the experience. Obviously it’s up to the application developer to determine how your hands would actually be represented, but it would have been better to show case that capability in the live demo.
Better than just tracking your hands, Project Alloy was able to track a dollar bill (why not a Benjamin Intel??!?) and use it to interact with a spinning lathe in the VR world. It interacted very accurately and with minimal latency – the potential for this kind of AR integration is expansive.
Those same RealSense cameras and data is used to map the space around you, preventing you from running into things or people or cats in the room. This enables the first “multi-room” tracking capability, giving VR/AR users a new range of flexibility and usability.
Though I did not get hands on with the Alloy prototype itself, the unit on-stage looked pretty heavy, pretty bulky. Comfort will obviously be important for any kind of head mounted display, and Intel has plenty of time to iterate on the design for the next year to get it right. Both AMD and NVIDIA have been talking up the importance of GPU compute to provide high quality VR experiences, so Intel has an uphill battle to prove that its solution, without the need for external power or additional processing, can truly provide the untethered experience we all desire.
Subject: Systems | August 16, 2016 - 12:00 PM | Sebastian Peak
Tagged: small form-factor, SFF, nvidia, Lenovo, Killer Networking, Intel, IdeaCentre Y710 Cube, GTX 1080, gaming, gamescom, cube
Lenovo has announced the IdeaCentre Y710 Cube; a small form-factor system designed for gaming regardless of available space, and it can be configured with some very high-end desktop components for serious performance.
"Ideal for gamers who want to stay competitive no matter where they play, the IdeaCentre Y710 Cube comes with a built-in carry handle for easy transport between gaming stations. Housed sleekly within a new, compact cube form factor, it features NVIDIA’s latest GeForce GTX graphics and 6th Gen Intel Core processors to handle today’s most resource-intensive releases."
The Y710 Cube offers NVIDIA GeForce graphics up to the GTX 1080, and up to a 6th-generation Core i7 processor. (Though a specific processor number was not mentioned, this is likely the non-K Core i7-6700 CPU given the 65W cooler specified below).
Lenovo offers a pre-installed XBox One controller receiver with the Y710 Cube to position the small desktop as a console alternative, and the machines are configured with SSD storage and feature Killer Double Shot Pro networking (where the NIC and wireless card are combined for better performance).
- Processor: Up to 6th Generation Intel Core i7 Processor
- Operating System: Windows 10 Home
- Graphics: Up to NVIDIA GeForce GTX 1080; 8 GB
- Memory: Up to 32 GB DDR4
- Storage: Up to 2 TB HDD + 256 GB SSD
- Cooling: 65 W
- Networking: Killer LAN / WiFi 10/100/1000M
- Video: 1x HDMI, 1x VGA
- Rear Ports: 1x USB 2.0 1x USB 3.0
- Front Ports: 2x USB 3.0
- Dimensions (L x D x H): 393.3 x 252.3 x 314.5 mm (15.48 x 9.93 x 12.38 inches)
- Weight: Starting at 16.3 lbs (7.4 kg)
- Carry Handle: Yes
- Accessory: Xbox One Wireless Controller/Receiver (optional)
The IdeaCentre Y710 Cube is part of Lenovo's Gamescom 2016 annoucement, and will be available for purchase starting in October. Pricing starts at $1,299.99 for a version with the GTX 1070.
Subject: General Tech | August 12, 2016 - 05:16 PM | Jeremy Hellstrom
Tagged: 3D XPoint, Intel, FMS 2016
You might have caught our reference to this on the podcast, XPoint is amazingly fast but the marketing clams were an order or magnitude or two off of the real performance levels. Al took some very nice pictures at FMS and covered what Micron had to say about their new QuantX drives. The Register also dropped by and offers a tidbit on the pricing, roughly four to five times as much as current flash or about half the cost of an equivalent amount of RAM. They also compare the stated endurance of 25 complete drive writes per day to existing flash which offers between 10 to 17 depending on the technology used.
The question they ask at the end is one many data centre managers will also be asking, is the actual speed boost worth the cost of upgrading or will other less expensive alternatives be more economical?
"XPoint will substantially undershoot the 1,000-times-faster and 1,000-times-longer-lived-than-flash claims made by Intel when it was first announced – with just a 10-times speed boost and 2.5-times longer endurance in reality."
Here is some more Tech News from around the web:
- Thieves can wirelessly unlock up to 100 million Volkswagens, each at the press of a button @ The Register
- McAfee outs malware dev firm with scores of Download.com installs @ The Register
- Creator of Chatbot that Beat 160K Parking Fines Now Tackling Homelessness @ Slashdot
- New Air-Gap Jumper Covertly Transmits Data in Hard-Drive Sounds @ Slashdot
- Galaxy Note 7 to get Android 7.0 Nougat in 'two to three months' @ The Inquirer