Khronos Announces "Next" OpenGL & Releases OpenGL 4.5

Subject: General Tech, Graphics Cards, Shows and Expos | August 15, 2014 - 08:33 PM |
Tagged: siggraph 2014, Siggraph, OpenGL Next, opengl 4.5, opengl, nvidia, Mantle, Khronos, Intel, DirectX 12, amd

Let's be clear: there are two stories here. The first is the release of OpenGL 4.5 and the second is the announcement of the "Next Generation OpenGL Initiative". Both appear in the same press release, but they are two different statements.

OpenGL 4.5 Released

OpenGL 4.5 expands the core specification with a few extensions that compatible hardware, with OpenGL 4.5 drivers, is guaranteed to support. These include direct_state_access, which allows objects to be created and modified without binding them to the context, and OpenGL ES 3.1 features that were traditionally missing from OpenGL 4, which makes it easier to port OpenGL ES 3.1 applications to desktop OpenGL.
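For the curious, here is roughly what direct state access looks like in practice. This is a minimal sketch under my own assumptions (a 4.5 context and an already-initialized extension loader), not a complete program:

```c
/* Minimal sketch of OpenGL 4.5 direct state access (DSA).
 * Assumes a GL 4.5 context and an initialized loader (GLEW, glad, etc.). */
#include <GL/glew.h>

GLuint make_vertex_buffer(const void *data, GLsizeiptr size)
{
    GLuint buf;

    /* Pre-4.5, this required glGenBuffers + glBindBuffer + glBufferData,
     * disturbing whatever buffer happened to be bound at the time. */

    /* With DSA, the object is created and filled by name, and the
     * context's binding points are never touched. */
    glCreateBuffers(1, &buf);
    glNamedBufferData(buf, size, data, GL_STATIC_DRAW);
    return buf;
}
```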

opengl_logo.jpg

It also adds a few new optional extensions:

ARB_pipeline_statistics_query lets a developer ask the GPU what it has been doing. This could be useful for "profiling" an application (listing completed work to identify optimization points).
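As a rough sketch of what that profiling could look like, the extension plugs into the existing query object API; draw_scene() below is a hypothetical stand-in for real rendering code:

```c
/* Sketch: count vertex shader invocations for one frame using
 * ARB_pipeline_statistics_query. draw_scene() is hypothetical. */
GLuint query;
GLuint64 invocations;

glGenQueries(1, &query);
glBeginQuery(GL_VERTEX_SHADER_INVOCATIONS_ARB, query);
draw_scene();
glEndQuery(GL_VERTEX_SHADER_INVOCATIONS_ARB);

/* Blocks until the result is ready; a real profiler would poll. */
glGetQueryObjectui64v(query, GL_QUERY_RESULT, &invocations);
```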

ARB_sparse_buffer allows developers to perform calculations on pieces of large, generic buffers without backing the whole thing with physical memory. This is similar to ARB_sparse_texture... except that extension is for textures. Buffers are useful for things like vertex data (and so forth).
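A hedged sketch of the idea: reserve a huge virtual buffer, then commit physical memory for only the region in use. Offsets and sizes must be multiples of the implementation's sparse page size, which I am glossing over here.

```c
/* Sketch, assuming ARB_sparse_buffer (plus GL 4.5 DSA) is available. */
GLuint buf;
glCreateBuffers(1, &buf);

/* Reserve 1 GB of address space without backing it with memory. */
glNamedBufferStorage(buf, 1 << 30, NULL,
                     GL_SPARSE_STORAGE_BIT_ARB | GL_DYNAMIC_STORAGE_BIT);

/* Commit physical pages for the first 64 MB only. */
glNamedBufferPageCommitmentARB(buf, 0, 64 << 20, GL_TRUE);
```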

ARB_transform_feedback_overflow_query is apparently designed to let developers decide whether or not to draw objects based on whether a buffer overflowed. I might be wrong, but it seems like this would be useful for deciding whether to draw objects generated by geometry shaders.

KHR_blend_equation_advanced adds new blending equations between objects. If you use Photoshop, these would be "multiply", "screen", "darken", "lighten", "difference", and so forth. On NVIDIA's side, this will be directly supported on Maxwell and Tegra K1 (and later). Fermi and Kepler will support the functionality, but the driver will perform the calculations with shaders. AMD has yet to comment, as far as I can tell.
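Usage is about as simple as blending gets; a sketch, with draw_layer() as a hypothetical draw call:

```c
/* Sketch using KHR_blend_equation_advanced. */
glEnable(GL_BLEND);
glBlendEquation(GL_MULTIPLY_KHR);   /* or GL_SCREEN_KHR, GL_DARKEN_KHR,
                                     * GL_LIGHTEN_KHR, GL_DIFFERENCE_KHR... */
draw_layer();                       /* hypothetical draw call */

/* The extension requires a barrier between draws whose blended
 * geometry overlaps. */
glBlendBarrierKHR();
```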

nvidia-opengl-debugger.jpg

Image from NVIDIA GTC Presentation

For developers, NVIDIA has released 340.65 (340.23.01 for Linux) beta drivers with OpenGL 4.5 support. If you are not looking to create OpenGL 4.5 applications, do not get this driver; you really should not have any use for it at all.

Next Generation OpenGL Initiative Announced

The Khronos Group has also announced "a call for participation" to outline a new specification for graphics and compute. They want it to give developers explicit control over CPU and GPU tasks, support multithreading, impose minimal overhead, use a common shader language, and undergo "rigorous conformance testing". This sounds a lot like the design goals of Mantle (and what we know of DirectX 12).

amd-mantle-queues.jpg

And really, from what I hear and understand, that is what OpenGL needs at this point. Graphics cards look nothing like they did a decade ago (let alone over two decades ago). They now have very similar interfaces and data structures, even if their fundamental architectures vary greatly. If we can draw a line in the sand, legacy APIs could still be supported by the drivers, just not heavily optimized. Before long, available performance for legacy applications would be so high that it wouldn't matter, as long as they continue to run.

On top of that, next-generation drivers should be significantly easier to develop, considering the reduced error checking (and other responsibilities). As I said in Intel's DirectX 12 story, it is still unclear whether this will bring enough of a performance increase to make most optimizations unnecessary, such as those that increase workload or developer effort in exchange for queuing fewer GPU commands. We will need to wait for game developers to use it for a while before we know.

Podcast #313 - New Kaveri APUs, ASUS ROG Swift G-Sync Monitor, Intel Core M Processors and more!

Subject: General Tech | August 14, 2014 - 03:30 PM |
Tagged: video, ssd, ROG Swift, ROG, podcast, ocz, nvidia, Kaveri, Intel, g-sync, FMS 2014, crossblade ranger, core m, Broadwell, asus, ARC 100, amd, A6-7400K, A10-7800, 14nm

PC Perspective Podcast #313 - 08/14/2014

Join us this week as we discuss new Kaveri APUs, ASUS ROG Swift G-Sync Monitor, Intel Core M Processors and more!

You can subscribe to us through iTunes and you can still access it directly through the RSS page HERE.

The URL for the podcast is: http://pcper.com/podcast - Share with your friends!

  • iTunes - Subscribe to the podcast directly through the iTunes Store
  • RSS - Subscribe through your regular RSS reader
  • MP3 - Direct download link to the MP3 file

Hosts: Ryan Shrout, Jeremy Hellstrom, Josh Walrath, and Allyn Malventano

Program length: 1:41:24

Subscribe to the PC Perspective YouTube Channel for more videos, reviews and podcasts!!

Intel and Microsoft Show DirectX 12 Demo and Benchmark

Subject: General Tech, Graphics Cards, Processors, Mobile, Shows and Expos | August 13, 2014 - 09:55 PM |
Tagged: siggraph 2014, Siggraph, microsoft, Intel, DirectX 12, directx 11, DirectX

Along with GDC Europe and Gamescom, Siggraph 2014 is going on in Vancouver, BC. There, Intel had a DirectX 12 demo at its booth. The scene, containing 50,000 asteroids, each in its own draw call, was developed on both Direct3D 11 and Direct3D 12 code paths, and could apparently be switched between them while the demo was running. Intel claims to have measured both power and frame rate.

intel-dx12-LockedFPS.png

Variable power to hit a desired frame rate, DX11 and DX12.

The test system is a Surface Pro 3 with an Intel HD 4400 GPU. Doing a bit of digging, this would make it the i5-based Surface Pro 3. Removing another shovel-load of mystery, that would be the Intel Core i5-4300U: two cores, four threads, 1.9 GHz base clock, up to 2.9 GHz turbo clock, 3 MB of cache, and (of course) the Haswell architecture.

While not top-of-the-line, it is also not bottom-of-the-barrel. It is a respectable CPU.

Intel's demo on this processor shows a significant power reduction in the CPU, and even a slight decrease in GPU power, for the same target frame rate. If power is not throttled, Intel's demo goes from 19 FPS all the way up to a playable 33 FPS.

Intel will discuss more during a video interview, tomorrow (Thursday) at 5pm EDT.

intel-dx12-unlockedFPS-1.jpg

Maximum power in DirectX 11 mode.

For my contribution to the story, I would like to address the first comment on the MSDN article. It claims that this is just an "ideal scenario" of a scene that is bottlenecked by draw calls. The thing is: that is the point. Sure, a game developer could optimize the scene to (maybe) instance objects together, and so forth, but that is unnecessary work. Why should programmers, or worse, artists, need to spend so much of their time developing art so that it can be batched together into fewer, bigger commands? Would it not be much easier, and all-around better, if the content could be developed as it most naturally comes together?
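To make the trade-off concrete, here is an illustrative sketch in OpenGL terms (the same contrast applies to Direct3D); set_per_object_uniforms() is a hypothetical stand-in for per-object state changes:

```c
/* The "naive" way: one draw call per asteroid. Under a high-overhead
 * API, the CPU cost of issuing 50,000 of these dominates the frame. */
for (int i = 0; i < 50000; ++i) {
    set_per_object_uniforms(i);   /* position, rotation, material... */
    glDrawElements(GL_TRIANGLES, index_count, GL_UNSIGNED_INT, 0);
}

/* The "optimized" way: restructure the data (and possibly the art)
 * so everything goes out in one instanced call. */
glDrawElementsInstanced(GL_TRIANGLES, index_count, GL_UNSIGNED_INT,
                        0, 50000);
```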

That, of course, depends on how much performance improvement we will see from DirectX 12, compared to theoretical maximum efficiency. If pushing two workloads through a DX12 GPU takes about the same time as pushing one double-sized workload, then developers can implement whatever solution is most direct.

intel-dx12-unlockedFPS-2.jpg

Maximum power when switching to DirectX 12 mode.

If, on the other hand, pushing two workloads is 1,000x slower than pushing a single, double-sized one, while DirectX 11 was 10,000x slower, then it matters less, because developers will still need to do their tricks in those situations. The closer the two cases get, the fewer occasions on which strict optimization is necessary.

If there are any DirectX 11 game developers, artists, and producers out there, we would like to hear from you. How much would a (let's say) 90% reduction in draw call latency (which is around what Mantle claims) give you, in terms of fewer required optimizations? Can you afford to solve problems "the naive way" now? Some of the time? Most of the time? Would it still be worth it to do things like object instancing and fewer, larger materials and shaders? How often?

Intel is disabling TSX in Haswell due to software failures

Subject: General Tech | August 12, 2014 - 01:07 PM |
Tagged: Intel, haswell, tsx, errata

Transactional Synchronization Extensions (TSX) are a backward-compatible set of instructions that first appeared in some Haswell chips as a way to improve concurrency and multithreading with as little work for the programmer as possible. TSX was intended to improve the scaling of multi-threaded applications on multi-core processors but has not yet been widely adopted. Adoption has now run into another hurdle: in some cases the use of TSX can cause critical software failures, so Intel will disable the instruction set via new BIOS/UEFI updates, which will be pushed out soon. If your software uses the new instructions and you want it to keep doing so, avoid updating your motherboard's BIOS/UEFI and ask your users to do the same. You can read more about this bug/errata, and other famous errata, over at The Tech Report.
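For reference, this is the sort of code TSX enables: a minimal sketch of its RTM interface, assuming a TSX-capable part and compiler support (e.g. GCC's -mrtm). The fallback path is mandatory, since a transaction may abort at any time, and always will once the new microcode disables TSX.

```c
#include <immintrin.h>
#include <pthread.h>

void increment(long *counter, pthread_mutex_t *lock)
{
    unsigned status = _xbegin();
    if (status == _XBEGIN_STARTED) {
        (*counter)++;                 /* runs transactionally, lock-free */
        _xend();
    } else {
        pthread_mutex_lock(lock);     /* transaction aborted: take the lock */
        (*counter)++;
        pthread_mutex_unlock(lock);
    }
}
```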

intel2.jpg

"The TSX instructions built into Intel's Haswell CPU cores haven't become widely used by everyday software just yet, but they promise to make certain types of multithreaded applications run much faster than they can today. Some of the savviest software developers are likely building TSX-enabled software right about now."

Here is some more Tech News from around the web:

Tech Talk

Subject: Processors
Manufacturer: Intel

Coming in 2014: Intel Core M

The era of Broadwell begins in late 2014 and, based on what Intel has disclosed to us today, the processor architecture appears to be impressive in nearly every aspect. Coming off the success of the 22nm Haswell design in 2013, Broadwell-Y will not only be the first to market with the new microarchitecture, but will also be the flagship product on Intel's new 14nm tri-gate process technology.

The Intel Core M processor, as Broadwell-Y has been dubbed, includes impressive technological improvements over previous low-power Intel processors, resulting in lower power, thinner form factors, and longer-lasting battery designs. Broadwell-Y will stretch into even lower TDPs, enabling 9mm or smaller fanless designs that maintain current battery lifespans. A new 2nd-generation FIVR with a modified power delivery design allows for even thinner packaging and a wider range of dynamic frequencies than before. And of course, along with the shift comes an updated converged core design and improved graphics performance.

All of these changes are in service of what Intel claims is a re-invention of the notebook. Compared to 2010, when the company introduced the original Intel Core processor and redirected itself almost completely, Intel Core M and the Broadwell-Y changes will allow for some dramatic platform changes.

broadwell-12.jpg

Notebook thickness will go from 26mm (~1.02 inches) down to as small as 7mm (~0.28 inches), as Intel has proven with its Llama Mountain reference platform. A 4x reduction in total thermal dissipation, while improving core performance by 2x and graphics performance by 7x, is something no other company has been able to do over the same time span. And in the end, one of the most important features for the consumer is getting double the useful battery life from a smaller (and lighter) battery.

But these kinds of advancements just don’t happen by chance – ask any other semiconductor company that is either trying to keep ahead of or catch up to Intel. It takes countless engineers and endless hours to build a platform like this. Today Intel is sharing some key details on how it was able to make this jump, including the move to 14nm FinFET / tri-gate transistor technology and impressive packaging and core design changes in the Broadwell architecture.

Intel 14nm Technology Advancement

Intel consistently creates and builds the most impressive manufacturing and production processes in the world, which has helped it maintain market leadership over rivals in the CPU space. It is also one of the key tenets that Intel hopes will help it deliver on the world of mobile, including tablets and smartphones. At the 22nm node, Intel was the first to offer 3D transistors, which it calls tri-gate and others refer to as FinFET. By focusing on power consumption rather than top-level performance, Intel was able to build the Haswell design (as well as Silvermont for the Atom line) with impressive performance and power scaling, allowing thinner and less power-hungry designs than previous generations. Some enthusiasts might think that Intel has done this at the expense of high-performance components, and there is some truth to that. But Intel believes that committing to this space builds the best future for the company.

Continue reading our reveal of Intel's Broadwell Architecture and 14nm Process Technology!!

MSI Shows X99S SLI Plus Motherboard on Twitter

Subject: Motherboards | August 5, 2014 - 05:38 PM |
Tagged: msi, Intel, X99, x99s sli plus

Well, this just happened.

msix99s.jpg

So there you have it, the X99 chipset is a thing, the MSI X99S SLI Plus is a thing, and it looks damned sexy.

msix99s2.jpg

I lightened up the photo some to show off more of the features, as the black coloring on everything made it all hard to see. Revealed are a total of 8 DIMM slots (DDR4, we assume), four PCI Express x16 slots (though we don't know how many lanes each is connected to), 8 SATA ports, 1 SATA Express port, and some more goodies. What do you guys think? Stoked for the pending Haswell-E / X99 release?

Source: Twitter

The new netbook?

Subject: General Tech, Mobile | July 24, 2014 - 02:32 PM |
Tagged: Intel, microsoft, netbook, Bay Trail

According to DigiTimes, we may see a resurgence of netbooks, this time powered by Bay Trail, which will make them far more usable than the original generation. There are three postulated tiers: a $200-250 range of 10.1-15.6" models, plus $250-400 and $400-600 ranges at 11.6-17.3", larger than the original generation, which failed to attract many consumers. They are currently scheduled to ship with Bay Trail-M, with future models likely to have Braswell inside, in a mix of transformer-style 2-in-1s with touchscreens and more traditional laptop designs. You can expect a maximum thickness of 25mm and a mix of HDD and SSD storage, and we can only hope that the estimated pricing is more accurate than the pricing on Ultrabooks turned out to be.

pr_toshiba_netbook_f.jpg

"For the US$199-249 notebooks, Intel and Microsoft's specification preferences are 10.1- to 15.6-inch clamshell non-touchscreen models using Intel's Bay Trail-M series processors or upcoming Braswell-based processors, which are set to release in the second quarter of 2015."

Here is some more Tech News from around the web:

Tech Talk

Source: DigiTimes

Podcast #310 - NVIDIA SHIELD Tablet, WD 6TB Red and 4TB Red Pro HDDs and more!

Subject: General Tech | July 24, 2014 - 12:58 PM |
Tagged: podcast, video, nvidia, shield, shield tablet, tegra, tegra k1, WD, red, 6tb red, 4tb red pro, A88X-G45 Gaming, xiaomi, maxwell, amd, Intel

PC Perspective Podcast #310 - 07/24/2014

Join us this week as we discuss the NVIDIA SHIELD Tablet, WD 6TB Red and 4TB Red Pro HDDs and more!

You can subscribe to us through iTunes and you can still access it directly through the RSS page HERE.

The URL for the podcast is: http://pcper.com/podcast - Share with your friends!

  • iTunes - Subscribe to the podcast directly through the iTunes Store
  • RSS - Subscribe through your regular RSS reader
  • MP3 - Direct download link to the MP3 file

Hosts: Ryan Shrout, Jeremy Hellstrom, Josh Walrath, and Allyn Malventano

Program length: 1:25:40

Subscribe to the PC Perspective YouTube Channel for more videos, reviews and podcasts!!

Intel Launches SSD Pro 2500 Series for Businesses of All Sizes

Subject: Storage | July 22, 2014 - 04:02 PM |
Tagged: Intel, ssd, Pro 2500, enterprise, encryption, mcafee

Intel has not offered many products that take advantage of its takeover of McAfee, now known as Intel Security, but today's release of the Intel SSD Pro 2500 Series changes that. This family of SSDs will work with McAfee ePolicy Orchestrator to allow the automatic implementation of hardware-based 256-bit encryption on these drives, in a similar manner to what Endpoint Encryption has done in the past. Since the encryption sits in hardware, Intel claims the on-the-fly encryption has no impact on speed. If you use Intel Setup and Configuration Software with vPro, you can even monitor the health of deployed drives. Check out Intel's page here and the PR below.

image003.png

SANTA CLARA, Calif., July 22, 2014 – Intel Corporation today announced an addition to the Intel® Solid-State Drive (SSD) Professional Family: the Intel® SSD Pro 2500 Series. This new business-class SSD delivers lower total cost of ownership, security and manageability features, and blazing-fast SSD performance demanded by today’s business users.

Intel SSD Pro 2500 Series offers IT departments peace of mind with advanced security features and capabilities designed for businesses ranging from small companies through large IT-managed enterprises. Security and remote manageability features, combined with lower annual failure rates than hard disk drives (HDDs), help to reduce the need for resource-intensive deskside visits.

Managing data security is critical for businesses and a challenge for IT leaders. Data breaches, often a result of lost or stolen PCs, can cost a business nearly $50,000 in lost productivity, replacement, data recovery and legal costs.1 To help businesses mitigate the threat of such costly breaches, the Intel Pro 2500 Series SSDs are self-encrypting drives (SED) utilizing hardware-based 256-bit encryption to protect data without a loss of performance. Additionally, the new Intel drives feature the Trusted Computing Group’s OPAL 2.0* standard and are Microsoft eDrive* capable. These policy-based controls help to prevent data breaches and support crypto erase to repurpose the drive for reuse.

“The need to protect assets, keep an eye on the bottom line and ensure employees have the best tools is a challenge for IT departments,” said Rob Crooke, Intel corporate vice president and general manager of the Non-Volatile Memory Solutions Group. “The Intel SSD Pro 2500 Series is a well-rounded solution to help balance those often competing needs. Adding the Pro 2500 Series to the Intel SSD Professional Family delivers a powerful storage solution to help businesses of all sizes meet their critical IT needs.”

“The Intel SSD Pro 2500 Series is the second-generation OPAL-based client storage solution that helps IT departments protect their users’ data and also provides valuable features to reduce operational costs,” stated Candace Worley, senior vice president and general manager, Endpoint Security, McAfee*, part of Intel Security. “The Pro 2500 Series is a perfect companion to our data protection solutions, managed by McAfee ePolicy Orchestrator*, all working in concert to provide IT departments with data security, management and control, wherever their endpoints may be.”

In an environment with Intel® vPro™ Technology, with Intel® Setup and Configuration Software and leading security software, the Pro 2500 Series drives can be managed remotely allowing IT to monitor and report drive health as well as track assets and remedy faults. This remote manageability enforces IT policies to help prevent mishaps and simultaneously provides a great user experience. Embedded and Internet of Things applications can also take advantage of the remote manageability features to help limit the number of IT professionals needed to oversee devices. To assist in protecting user data and lower the total cost of ownership, applications such as ATMs and remote digital signage can be updated, monitored and managed remotely.

“Corporations of every size are facing the growing challenge of protecting sensitive data and ensuring compliance with a litany of data protection laws and regulations,” said Bill Solms, president and CEO of Wave Systems*. “The Intel SSD Pro 2500 Series offers a sound foundation for any data security program, incorporating hardware-level encryption without impacting drive performance. Wave’s on-premise and cloud-based management software complements the Intel SSD Pro 2500 by offering remote drive provisioning, automated password recovery and secure audit logs to document that encryption was in place should a laptop become lost or stolen.”

The Intel SSD Professional Family is part of the Intel® Stable Image Platform Program, including a 15-month availability of the components and drivers for compatibility and stability across a qualified IT image. This helps minimize IT qualification and deployment times. The Intel SSD Pro 2500 Series also features five advance power modes helping to balance performance and power to enable a longer battery life and provide a better mobile experience.

The Intel SSD Pro 2500 Series will be available in both 2.5-inch and M.2 form factors and in capacities ranging from 120GB to 480GB. The Intel SSD Pro 2500 Series is backed by a 5-year limited warranty and features a world-class annualized failure rate (AFR) well below 1 percent. The AFRs of other SSDs and HDDs can reach as high as 5 percent or more in mobile environments.

Source: Intel

Intel AVX-512 Expanded

Subject: General Tech, Graphics Cards, Processors | July 19, 2014 - 03:05 AM |
Tagged: Xeon Phi, xeon, Intel, avx-512, avx

It is difficult to know what is actually new information in this Intel blog post, but it is interesting nonetheless. Its topic is the AVX-512 extension to x86, designed for Xeon and Xeon Phi processors and co-processors. Basically, last year, Intel announced "Foundation", the minimum support level for AVX-512, as well as the optional Conflict Detection, Exponential and Reciprocal, and Prefetch instructions. That earlier blog post was very much focused on Xeon Phi, but it acknowledged that the instructions would make their way to standard, CPU-like Xeons at around the same time.

Intel_Xeon_Phi_Family.jpg

This year's blog post brings in a bit more information, especially for common Xeons. While all AVX-512-supporting processors (and co-processors) will support "AVX-512 Foundation", the instruction set extensions are a bit more scattered.

 
                                          Xeon         Xeon Phi     Xeon Phi
                                          Processors   Processors   Coprocessors (AIBs)
Foundation Instructions                   Yes          Yes          Yes
Conflict Detection Instructions           Yes          Yes          Yes
Exponential and Reciprocal Instructions   No           Yes          Yes
Prefetch Instructions                     No           Yes          Yes
Byte and Word Instructions                Yes          No           No
Doubleword and Quadword Instructions      Yes          No           No
Vector Length Extensions                  Yes          No           No

Source: Intel AVX-512 Blog Post (and my understanding thereof).

So why do we care? Simply put: speed. Vectorization, the purpose of AVX-512, has benefits similar to multiple cores. It is not as flexible as having multiple, unique, independent cores, but it is easier to implement (and it works just fine alongside multiple cores, too). For example: imagine that you have to multiply two colors together. The direct way to do it is to multiply red with red, green with green, blue with blue, and alpha with alpha. AMD's 3DNow! and, later, Intel's SSE included instructions to multiply two four-component vectors together. This reduces four similar instructions into a single instruction operating on wider registers.
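That color example, written both ways. This is a sketch of the idea using SSE intrinsics (which any x86 compiler from the last decade supports):

```c
#include <xmmintrin.h>

/* Four scalar multiplies... */
void mul_color_scalar(float out[4], const float a[4], const float b[4])
{
    out[0] = a[0] * b[0];   /* red   */
    out[1] = a[1] * b[1];   /* green */
    out[2] = a[2] * b[2];   /* blue  */
    out[3] = a[3] * b[3];   /* alpha */
}

/* ...or one SSE multiply across a 128-bit register. */
void mul_color_sse(float out[4], const float a[4], const float b[4])
{
    _mm_storeu_ps(out, _mm_mul_ps(_mm_loadu_ps(a), _mm_loadu_ps(b)));
}
```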

Smart compilers (and programmers, although that is becoming less common as compilers are pretty good, especially when they are not fighting developers) are able to pack seemingly unrelated data together, too, if it undergoes similar instructions. AVX-512 allows sixteen 32-bit pieces of data to be worked on at the same time. If each pixel only has four single-precision RGBA values, but you are looping through 2 million pixels, you can do four pixels at a time (16 components).
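A sketch of that loop with AVX-512F intrinsics; this assumes AVX-512 hardware, a compiler flag like -mavx512f, and a float count that is a multiple of 16 (a real implementation would handle the remainder):

```c
#include <immintrin.h>
#include <stddef.h>

/* Multiply two streams of RGBA pixels, 16 floats (4 pixels) per step. */
void mul_pixels(float *out, const float *a, const float *b, size_t n_floats)
{
    for (size_t i = 0; i < n_floats; i += 16) {
        __m512 va = _mm512_loadu_ps(a + i);
        __m512 vb = _mm512_loadu_ps(b + i);
        _mm512_storeu_ps(out + i, _mm512_mul_ps(va, vb));
    }
}
```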

For the record, I basically just described "SIMD" (single instruction, multiple data) as a whole.

This theory is part of how GPUs became so powerful at certain tasks. They are capable of pushing a lot of data because they can exploit similarities. If your task is full of similar problems, they can just churn through tonnes of data. CPUs have been doing these tricks, too, just without compromising what they do well.

Source: Intel