AMD Announces Radeon R9 285X and R9 285 Graphics Cards

Subject: Graphics Cards | August 23, 2014 - 10:46 AM |
Tagged: amd, radeon, R9, r9 285, 285

Today during AMD's live stream event celebrating 30 years of graphics and gaming, the company spent a bit of time announcing and teasing a pair of new graphics cards, the Radeon R9 285X and R9 285. Both are likely based on the Tonga GPU die; the specifications haven't been confirmed, but most believe the chip will feature 2048 stream processors, 128 texture units, 32 ROPs and a 256-bit memory bus.

r9285.jpg

In a move to raise money for the Child's Play charity, AMD currently has an AMD Radeon R9 285 listed on eBay. The listing shows an ASUS-built, Strix-style cooled retail card, with 2GB of memory being the only specification visible on the box.

r92852.jpg

The R9 285X and R9 285 will more than likely replace the R9 280X and R9 280, and we should see them shipping and available in very early September.

Source: AMD

PCPer Live! Recap - NVIDIA G-Sync Surround Demo and Q&A

Subject: Graphics Cards, Displays | August 22, 2014 - 08:05 PM |
Tagged: video, gsync, g-sync, tom petersen, nvidia, geforce

Earlier today we had NVIDIA's Tom Petersen in studio to discuss the retail availability of G-Sync monitors as well as to get hands-on with a set of three ASUS ROG Swift PG278Q monitors running in G-Sync Surround! It was truly an impressive sight, and if you missed any of it, you can catch the entire replay right here.

Even if seeing the ASUS PG278Q monitor again doesn't interest you (we have our full review of the monitor right here), you won't want to miss the very detailed Q&A, which answers quite a few reader questions about the technology. Covered items include:

  • Potential added latency of G-Sync
  • Future needs for multiple DP connections on GeForce GPUs
  • Upcoming 4K and 1080p G-Sync panels
  • Can G-Sync Surround work through an MST Hub?
  • What happens to G-Sync when the frame rate exceeds the panel refresh rate? Or drops below minimum refresh rate?
  • What does that memory on the G-Sync module actually do??
  • A demo of the new NVIDIA SHIELD Tablet capabilities
  • A whole lot more!

Another big thank you to NVIDIA and Tom Petersen for coming out our way and for spending the time to discuss these topics with our readers. Stay tuned here at PC Perspective as we will have more thoughts and reactions to G-Sync Surround very soon!!

NVIDIA Live Stream: We Want Your Questions!

Subject: Graphics Cards, Displays, Mobile | August 21, 2014 - 05:23 PM |
Tagged: nvidia, video, live, shield, shield tablet, g-sync, gsync, tom petersen

Tomorrow at 12pm EDT / 9am PDT, NVIDIA's Tom Petersen will be stopping by the PC Perspective office to discuss some topics of interest. There has been no lack of topics floating around the world of graphics cards, displays, refresh rates and tablets recently, and I expect the show tomorrow to be incredibly interesting and educational.

On hand, we'll be doing demonstrations of G-Sync Surround (3 panels!) with the ASUS ROG Swift PG278Q display (our review here) and also showing off the SHIELD Tablet (we have a review of that too) with some multiplayer action. If you thought the experience with a single G-Sync monitor was impressive, you will want to hear what a set of three of them can be like.

pcperlive.png

NVIDIA Live Stream with Tom Petersen

9am PT / 12pm ET - August 22nd

PC Perspective Live! Page

The topic list is going to include (but is not limited to):

  • ASUS PG278Q G-Sync monitor
  • G-Sync availability and pricing
  • G-Sync Surround setup, use and requirements
  • Technical issues surrounding G-Sync: latency, buffers, etc.
  • Comparisons of G-Sync to Adaptive Sync
  • SHIELD Tablet game play
  • Altoids?

gsyncsurround.jpg

But we want your questions! Do you have burning issues that you think need to be addressed by Tom and the NVIDIA team about G-Sync, FreeSync, GameWorks, Tegra, tablets, GPUs and more? Nothing is off limits here, though obviously Tom may be cagey on future announcements. Please use the comments section on this news post below (registration not required) to ask your questions and we can organize them before the event tomorrow. We MIGHT even be able to come up with a couple of prizes to give away for live viewers as well...

See you tomorrow!!

An odd Q2 for tablets and PCs

Subject: General Tech, Graphics Cards | August 19, 2014 - 12:30 PM |
Tagged: jon peddie, gpu market share, q2 2014

Jon Peddie Research's latest Market Watch adds even more ironic humour to the media's continuing proclamations of the impending doom of the PC industry. This quarter saw tablet sales decline while overall PC sales were up, and that was without any major releases to drive purchasers to adopt new technology. While JPR does touch on the overall industry, this report is focused on the sale of GPUs and APUs and happens to contain some great news for AMD. AMD saw its overall share of the market increase by 11% from last quarter, which works out to a gain of just over one percentage point of the entire market. Intel saw a small rise in share, and it still holds the majority of the market, as PCs with no discrete GPU are more likely to contain Intel's chips than AMD's. That leaves NVIDIA, which is still banking solely on discrete GPUs and saw more than an 8% decline from last quarter and a drop of almost two percentage points in the total market. Check out the other graphs in JPR's overview right here.

unnamed.jpg

"The big drop in graphics shipments in Q1 has been partially offset by a small rise this quarter. Shipments were up 3.2% quarter-to-quarter, and down 4.5% compared to the same quarter last year."

Here is some more Tech News from around the web:

Tech Talk

Khronos Announces "Next" OpenGL & Releases OpenGL 4.5

Subject: General Tech, Graphics Cards, Shows and Expos | August 15, 2014 - 08:33 PM |
Tagged: siggraph 2014, Siggraph, OpenGL Next, opengl 4.5, opengl, nvidia, Mantle, Khronos, Intel, DirectX 12, amd

Let's be clear: there are two stories here. The first is the release of OpenGL 4.5 and the second is the announcement of the "Next Generation OpenGL Initiative". They both appear in the same press release, but they are two different statements.

OpenGL 4.5 Released

OpenGL 4.5 expands the core specification with a few extensions. Compatible hardware, with OpenGL 4.5 drivers, will be guaranteed to support these. This includes features like direct_state_access, which allows objects to be queried and modified without binding them to the context, and support for OpenGL ES 3.1 features that are traditionally missing from OpenGL 4, which allows easier porting of OpenGL ES 3.1 applications to OpenGL.
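
To make the direct state access change concrete, below is a minimal sketch, assuming an OpenGL 4.5 context and a function loader (glad is used here purely as an example) are already initialized; the buffer contents are placeholders.

```cpp
#include <glad/glad.h>  // assumption: any OpenGL 4.5 function loader will do

// Pre-4.5 style: the buffer must be bound to a target before it can be filled,
// which disturbs whatever was bound to GL_ARRAY_BUFFER at the time.
GLuint MakeBufferBound(const float* data, GLsizeiptr bytes)
{
    GLuint buf = 0;
    glGenBuffers(1, &buf);
    glBindBuffer(GL_ARRAY_BUFFER, buf);
    glBufferData(GL_ARRAY_BUFFER, bytes, data, GL_STATIC_DRAW);
    glBindBuffer(GL_ARRAY_BUFFER, 0);
    return buf;
}

// OpenGL 4.5 direct state access: the object is addressed by name, no binding required.
GLuint MakeBufferDSA(const float* data, GLsizeiptr bytes)
{
    GLuint buf = 0;
    glCreateBuffers(1, &buf);                        // creates an already-initialized buffer object
    glNamedBufferData(buf, bytes, data, GL_STATIC_DRAW);
    return buf;
}
```

The second version never touches the context's GL_ARRAY_BUFFER binding, which is the point of the extension.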

opengl_logo.jpg

It also adds a few new extensions as an option:

ARB_pipeline_statistics_query lets a developer ask the GPU what it has been doing. This could be useful for "profiling" an application (listing completed work to identify optimization points); a rough sketch of issuing such a query follows the extension descriptions below.

ARB_sparse_buffer allows developers to perform calculations on pieces of generic buffers without loading the whole buffer into memory. This is similar to ARB_sparse_texture... except that that extension is for textures. Buffers are useful for things like vertex data (and so forth).

ARB_transform_feedback_overflow_query is apparently designed to let developers choose whether or not to draw objects based on whether the buffer is overflowed. I might be wrong, but it seems like this would be useful for deciding whether or not to draw objects generated by geometry shaders.

KHR_blend_equation_advanced allows new blending equations between objects. If you use Photoshop, this would be "multiply", "screen", "darken", "lighten", "difference", and so forth. On NVIDIA's side, this will be directly supported on Maxwell and Tegra K1 (and later). Fermi and Kepler will support the functionality, but the driver will perform the calculations with shaders. AMD has yet to comment, as far as I can tell.
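
Circling back to the pipeline statistics query mentioned above, here is a minimal, hedged sketch of what asking the GPU "what did you just do?" might look like. It assumes an OpenGL 4.5 context and loader are already in place, and drawScene is a hypothetical helper standing in for real rendering work.

```cpp
#include <glad/glad.h>  // assumption: any OpenGL 4.5 function loader will do
#include <cstdio>

void drawScene();  // hypothetical helper that issues the frame's draw calls

void ProfileFrame()
{
    GLuint query = 0;
    glGenQueries(1, &query);

    glBeginQuery(GL_VERTICES_SUBMITTED_ARB, query);  // count vertices submitted to the pipeline
    drawScene();
    glEndQuery(GL_VERTICES_SUBMITTED_ARB);

    GLuint64 verticesSubmitted = 0;
    glGetQueryObjectui64v(query, GL_QUERY_RESULT, &verticesSubmitted);  // waits for the result
    std::printf("vertices submitted this frame: %llu\n",
                static_cast<unsigned long long>(verticesSubmitted));
    glDeleteQueries(1, &query);
}
```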

nvidia-opengl-debugger.jpg

Image from NVIDIA GTC Presentation

For developers, NVIDIA has launched 340.65 (340.23.01 for Linux) beta drivers. If you are not looking to create OpenGL 4.5 applications, do not get this driver; you really should not have any use for it at all.

Next Generation OpenGL Initiative Announced

The Khronos Group has also announced "a call for participation" to outline a new specification for graphics and compute. They want it to allow developers explicit control over CPU and GPU tasks, be multithreaded, have minimal overhead, have a common shader language, and undergo "rigorous conformance testing". This sounds a lot like the design goals of Mantle (and what we know of DirectX 12).

amd-mantle-queues.jpg

And really, from what I hear and understand, that is what OpenGL needs at this point. Graphics cards look nothing like they did a decade ago (or over two decades ago). They each have very similar interfaces and data structures, even if their fundamental architectures vary greatly. If we can draw a line in the sand, legacy APIs can be supported but not optimized heavily by the drivers. After a short time, available performance for legacy applications would be so high that it wouldn't matter, as long as they continue to run.

Add to that, next-generation drivers should be significantly easier to develop, considering the reduced error checking (and other responsibilities). As I said on Intel's DirectX 12 story, it is still unclear whether this will lead to a large enough performance increase to make most optimizations, such as those which add workload or developer effort in exchange for queuing fewer GPU commands, unnecessary. We will need to wait for game developers to use it for a bit before we know.

AMD Catalyst 14.7 Release Candidate 3

Subject: Graphics Cards | August 14, 2014 - 07:20 PM |
Tagged: catalyst 14.7 RC3, beta, amd

A new Catalyst Release Candidate has arrived and, as with the previous driver, it no longer supports Windows 8.0 or the WDDM 1.2 driver, so please upgrade to Win 7 or Win 8.1 before installing. AMD will eventually release a driver which supports WDDM 1.1 under Win 8.0 for those who do not upgrade.

AMD-Catalyst-12-11-Beta-11-7900-Modded-Driver-Crafted-for-Performance.jpg

Feature Highlights of the AMD Catalyst 14.7 RC3 Driver for Windows

  • Includes all improvements found in the AMD Catalyst 14.7 RC driver
  • Display interface enhancements to improve 4K monitor performance and reduce flickering.
  • Improvements apply to the following products:
    • AMD Radeon R9 290 Series
    • AMD Radeon R9 270 Series
    • AMD Radeon HD 7800 Series
  • Even with these improvements, cable quality and other system variables can affect 4K performance. AMD recommends using DisplayPort 1.2 HBR2 certified cables with a length of 2m (~6 ft) or less when driving 4K monitors.
  • Wildstar: AMD CrossFire profile support
  • Lichdom: Single GPU and Multi-GPU performance enhancements
  • Watch Dogs: Smoother gameplay on single GPU and Multi-GPU configurations

Feature Highlights of the AMD Catalyst 14.7 RC Driver for Windows

  • Includes all improvements found in the AMD Catalyst 14.6 RC driver
    • AMD CrossFire and AMD Radeon Dual Graphics profile update for Plants vs. Zombies
    • Assassin's Creed IV - improved CrossFire scaling (3840x2160 High Settings) up to 93%
    • Collaboration with AOC has identified non-standard display timings as the root cause of 60Hz SST flickering exhibited by the AOC U2868PQU panel on certain AMD Radeon graphics cards.
    • A software workaround has been implemented in AMD Catalyst 14.7 RC driver to resolve the display timing issues with this display. Users are further encouraged to obtain newer display firmware from AOC that will resolve flickering at its origin.
    • Users are additionally advised to utilize DisplayPort-certified cables to ensure the integrity of the DisplayPort data connection.

Feature Highlights of the AMD Catalyst 14.6 RC Driver for Windows

  • Plants vs. Zombies (Direct3D performance improvements):
    • AMD Radeon R9 290X - 1920x1080 Ultra – improves up to 11%
    • AMD Radeon R9 290X - 2560x1600 Ultra – improves up to 15%
    • AMD Radeon R9 290X CrossFire configuration (3840x2160 Ultra) - 92% scaling
  • 3DMark Sky Diver improvements:
    • AMD A4-6300 – improves up to 4%
    • Enables AMD Dual Graphics/AMD CrossFire support
  • Grid Auto Sport: AMD CrossFire profile
  • Wildstar: Power Xpress profile
    • Performance improvements that improve the smoothness of the application
    • Performance improves up to 24% at 2560x1600 on the AMD Radeon R9 and R7 Series of products for both single GPU and multi-GPU configurations.
  • Watch Dogs: AMD CrossFire – Frame pacing improvements
  • Battlefield Hardline Beta: AMD CrossFire profile

Known Issues

  • Running Watch Dogs with a R9 280X CrossFire configuration may result in the application running in CrossFire software compositing mode
  • Enabling Temporal SMAA in a CrossFire configuration when playing Watch Dogs will result in flickering
  • AMD CrossFire configurations with AMD Eyefinity enabled will see instability with Battlefield 4 or Thief when running Mantle
  • Catalyst Install Manager text is covered by Express/Custom radio button text
  • Express Uninstall does not remove C:\Program Files\(AMD or ATI) folder
Source: AMD

Intel and Microsoft Show DirectX 12 Demo and Benchmark

Subject: General Tech, Graphics Cards, Processors, Mobile, Shows and Expos | August 13, 2014 - 09:55 PM |
Tagged: siggraph 2014, Siggraph, microsoft, Intel, DirectX 12, directx 11, DirectX

Along with GDC Europe and Gamescom, Siggraph 2014 is going on in Vancouver, BC. There, Intel had a DirectX 12 demo at their booth. This scene, containing 50,000 asteroids, each in its own draw call, was developed on both Direct3D 11 and Direct3D 12 code paths and could apparently be switched while the demo was running. Intel claims to have measured both power and frame rate.

intel-dx12-LockedFPS.png

Variable power to hit a desired frame rate, DX11 and DX12.

The test system is a Surface Pro 3 with an Intel HD 4400 GPU. Doing a bit of digging, this would make it the i5-based Surface Pro 3. Removing another shovel-load of mystery, this would be the Intel Core i5-4300U with two cores, four threads, 1.9 GHz base clock, up-to 2.9 GHz turbo clock, 3MB of cache, and (of course) based on the Haswell architecture.

While not top-of-the-line, it is also not bottom-of-the-barrel. It is a respectable CPU.

Intel's demo on this processor shows a significant power reduction in the CPU, and even a slight decrease in GPU power, for the same target frame rate. If power was not throttled, Intel's demo goes from 19 FPS all the way up to a playable 33 FPS.

Intel will discuss more during a video interview, tomorrow (Thursday) at 5pm EDT.

intel-dx12-unlockedFPS-1.jpg

Maximum power in DirectX 11 mode.

For my contribution to the story, I would like to address the first comment on the MSDN article. It claims that this is just an "ideal scenario" of a scene that is bottlenecked by draw calls. The thing is: that is the point. Sure, a game developer could optimize the scene to (maybe) instance objects together, and so forth, but that is unnecessary work. Why should programmers, or worse, artists, need to spend so much of their time developing art so that it could be batched together into fewer, bigger commands? Would it not be much easier, and all-around better, if the content could be developed as it most naturally comes together?

That, of course, depends on how much performance improvement we will see from DirectX 12, compared to theoretical max efficiency. If pushing two workloads through a DX12 GPU takes about the same time as pushing one, double-sized workload, then it allows developers to, literally, perform whatever solution is most direct.
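
As a rough Direct3D 11-style sketch of the trade-off being discussed (the function names and the commented-out helper are illustrative, not code from Intel's demo): the naive path issues one draw call per asteroid, while the optimized path batches everything into a single instanced draw, which is exactly the kind of extra authoring work a lower-overhead API aims to make optional.

```cpp
#include <d3d11.h>

// Naive path: one draw call per asteroid, each with its own per-object constants.
// Under Direct3D 11 this is the case that becomes draw-call bound.
void DrawAsteroidsNaive(ID3D11DeviceContext* ctx, UINT indexCount, UINT asteroidCount)
{
    for (UINT i = 0; i < asteroidCount; ++i)
    {
        // UpdatePerObjectConstants(ctx, i);  // hypothetical helper: per-asteroid transform/material
        ctx->DrawIndexed(indexCount, 0, 0);   // 50,000 of these per frame in Intel's demo
    }
}

// Optimized path: per-asteroid data is packed into a buffer up front so that a single
// instanced draw covers the whole field -- less API overhead, more authoring effort.
void DrawAsteroidsInstanced(ID3D11DeviceContext* ctx, UINT indexCount, UINT asteroidCount)
{
    ctx->DrawIndexedInstanced(indexCount, asteroidCount, 0, 0, 0);
}
```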

intel-dx12-unlockedFPS-2.jpg

Maximum power when switching to DirectX 12 mode.

If, on the other hand, pushing two workloads is 1000x slower than pushing a single, double-sized one, but DirectX 11 was 10,000x slower, then it could be less relevant because developers will still need to do their tricks in those situations. The closer it gets, the fewer occasions that strict optimization is necessary.

If there are any DirectX 11 game developers, artists, and producers out there, we would like to hear from you. How much would a (let's say) 90% reduction in draw call latency (which is around what Mantle claims) give you, in terms of fewer required optimizations? Can you afford to solve problems "the naive way" now? Some of the time? Most of the time? Would it still be worth it to do things like object instancing and fewer, larger materials and shaders? How often?

To boldly go where no 290X has gone before?

Subject: Graphics Cards | August 13, 2014 - 06:11 PM |
Tagged: factory overclocked, sapphire, R9 290X, Vapor-X R9 290X TRI-X OC

As far as factory overclocks go, the 1080MHz core and 5.64GHz memory on the new Sapphire Vapor-X 290X are impressive and take the prize for the highest factory overclock on this card that [H]ard|OCP has seen yet. That didn't stop them from pushing it to 1180MHz and 5.9GHz after a little work, which is even more impressive. At both the factory and manual overclocks the card handily beat the reference model, and the manually overclocked benchmarks could meet or beat the overclocked MSI GTX 780 Ti GAMING 3G OC card. Speed is not the only good feature: Intelligent Fan Control keeps two of the three fans from spinning when the GPU is under 60C, which vastly reduces the noise produced by this card. It is currently selling for $646, lower than the $710 that the GeForce is going for as well.

1406869221rJVdvhdB2o_1_6_l.jpg

"We take a look at the SAPPHIRE Vapor-X R9 290X TRI-X OC video card which has the highest factory overclock we've ever encountered on any AMD R9 290X video card. This video card is feature rich and very fast. We'll overclock it to the highest GPU clocks we've seen yet on R9 290X and compare it to the competition."

Here are some more Graphics Card articles from around the web:

Graphics Cards

Source: [H]ard|OCP

Time to update your Gallium3D

Subject: General Tech, Graphics Cards | August 6, 2014 - 01:34 PM |
Tagged: radeon, Gallium3D, catalyst 14.6 Beta, linux, ubuntu 14.04

The new Gallium3D code is up against the closed-source Catalyst 14.6 Beta, running under Ubuntu 14.04 on both the 3.14 and 3.16 Linux kernels, giving Phoronix quite a bit of testing to do. They have numerous cards in their test, ranging from an HD 6770 to an R9 290, though unfortunately there are no Gallium3D results for the R9 290 as it will not function until the release of the Linux 3.17 kernel. Overall the gap is closing: the Catalyst 14.6 Beta still remains the best performer, but the open source alternative is catching up quickly.

image.php_.jpg

"After last week running new Nouveau vs. NVIDIA proprietary Linux graphics benchmarks, here's the results when putting AMD's hardware on the test bench and running both their latest open and closed-source drivers. Up today are the results of using the latest Radeon Gallium3D graphics code and Linux kernel against the latest beta of the binary-only Catalyst driver."

Here is some more Tech News from around the web:

Tech Talk

Source: Phoronix

Rumor: NVIDIA GeForce GTX 880 Is Actually Coming in September?

Subject: General Tech, Graphics Cards | August 3, 2014 - 04:59 PM |
Tagged: nvidia, maxwell, gtx 880

Just recently, we posted a story that claimed NVIDIA was preparing to launch high-end Maxwell in the October/November time frame. Apparently, that was generous. The graphics company is now said to be announcing its GeForce GTX 880 in mid-September, with availability coming later in the month. It is expected to be based on the GM204 architecture (which previous rumors claim is 28nm).

nvidia-geforce.png

It is expected that the GeForce GTX 880 will be available with 4GB of video memory, with an 8GB version possible at some point. As someone who runs multiple (five) monitors, I can tell you that 2GB is not enough for my use case. Windows 7 says the same: it kicks me out of applications to tell me that it does not have enough video memory. This would be enough reason for me to get more GPU memory.

We still do not know how many CUDA cores will be present in the GM204 chip, or if the GeForce GTX 880 will have all of them enabled (though I would be surprised if it didn't). Without any way to derive its theoretical performance, we cannot compare it against the GTX 780 or 780 Ti. It could be significantly faster, it could be marginally faster, or it could be somewhere in between.
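
For context, the missing core count is what keeps us from doing the usual back-of-the-envelope math: peak single-precision throughput is commonly estimated as cores x clock x 2 (one fused multiply-add counts as two floating point operations). The sketch below shows that arithmetic; the GM204 figures in it are purely illustrative placeholders, not leaked specifications.

```cpp
#include <cstdio>

// Rough peak single-precision throughput estimate: cores * clock (GHz) * 2 FLOPs per cycle.
double PeakGflops(int shaderCores, double clockGHz)
{
    return shaderCores * clockGHz * 2.0;
}

int main()
{
    // Known part for reference: GTX 780 Ti, 2880 cores at roughly 0.93 GHz boost (~5.3 TFLOPS).
    std::printf("GTX 780 Ti estimate: %.0f GFLOPS\n", PeakGflops(2880, 0.93));

    // GM204 core count and clocks are unknown -- these numbers are placeholders only.
    std::printf("Hypothetical GTX 880 estimate: %.0f GFLOPS\n", PeakGflops(2048, 1.1));
    return 0;
}
```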

But we will probably find out within two months.

Source: Videocardz

AMD Releases FreeSync Information as a FAQ

Subject: General Tech, Graphics Cards, Displays | July 29, 2014 - 09:02 PM |
Tagged: vesa, nvidia, g-sync, freesync, DisplayPort, amd

Dynamic refresh rates have two main purposes: save power by only forcing the monitor to refresh when a new frame is available, and increase animation smoothness by synchronizing to draw rates (rather than "catching the next bus" at 16.67ms, on the 16.67ms, for 60 Hz monitors). Mobile devices prefer the former, while PC gamers are interested in the latter.

Obviously, the video camera nullifies the effect.

NVIDIA was first to make this public with G-Sync. AMD responded with FreeSync, starting with a proposal that was later ratified by VESA as DisplayPort Adaptive-Sync. AMD, then, took up "Project FreeSync" as an AMD "hardware/software solution" to make use of DisplayPort Adaptive-Sync in a way that benefits PC gamers.

Today's news is that AMD has just released an FAQ which explains the standard much more thoroughly than they have in the past. For instance, it clarifies the distinction between DisplayPort Adaptive-Sync and Project FreeSync. Prior to the FAQ, I thought that FreeSync became DisplayPort Adaptive-Sync, and that was that. Now, it is sounding a bit more proprietary, just built upon an open, VESA standard.

If interested, check out the FAQ at AMD's website.

Source: AMD

NVIDIA 340.52 Drivers Are Now Available

Subject: General Tech, Graphics Cards | July 29, 2014 - 08:27 PM |
Tagged: nvidia, geforce, graphics drivers, shield tablet, shield

Alongside the NVIDIA SHIELD Tablet launch, the company has released their GeForce 340.52 drivers. This version allows compatible devices to use GameStream, and it is also optimized for Metro: Redux and Final Fantasy XIV (China).

nvidia-geforce.png

The driver supports GeForce 8-series graphics cards and later. As a reminder, for GPUs that are not based on the Fermi architecture (or later), 340.xx will be your last driver version. NVIDIA does intend to provide extended support for 340.xx (and earlier) drivers until April 1st, 2016. But, when Fermi, Kepler, and Maxwell move on to 343.xx, Tesla and earlier will not. That said, most of the content of this driver is aimed at Kepler and later. Either way, the driver itself is available for those pre-Fermi cards.

I should also mention that a user of Anandtech's forums noted the removal of Miracast from NVIDIA documentation. NVIDIA has yet to comment, although it is still very short notice, at this point.

Source: NVIDIA

This high end multi-GPU 4k showdown includes overclocking

Subject: Graphics Cards | July 29, 2014 - 02:27 PM |
Tagged: asus, gtx 780, R9 290X DC2 OC, sli, crossfire, STRIX GTX 780 OC 6GB, R9 290X

We have seen [H]ard|OCP test ASUS' STRIX GTX 780 OC 6GB and R9 290X DirectCU II before, but this time they have been overclocked and paired up for a 4K showdown. For a change, NewEgg gives the price advantage to AMD, $589 versus $599 at the time of writing (with odd blips in prices on Amazon). The GTX 780 has been set to 1.2GHz and 6.6GHz while the 290X is at 1.1GHz and 5.6GHz; keep in mind that dual GPU setups may not reach the same frequencies as single cards. Read on for their conclusions and decide if you prefer to brag about a higher overclock or have better overall performance.

14060235239aDa7rbLPT_1_1_l.jpg

"We take the ASUS STRIX GTX 780 OC 6GB video card and run two in SLI and overclock both of these at 4K resolutions to find the ultimate gameplay performance with 6GB of VRAM. We will also compare these to two overclocked ASUS Radeon R9 290X DirectCU II CrossFire video cards for the ultimate VRAM performance showdown."

Here are some more Graphics Card articles from around the web:

Graphics Cards

Source: [H]ard|OCP

Raptr Update Available (for Both NVIDIA and AMD GPUs)

Subject: General Tech, Graphics Cards | July 28, 2014 - 09:00 AM |
Tagged: raptr, pc game streaming

Raptr seems to be gaining in popularity. Total playtime recorded by the online service was up 15% month-over-month, from May to June. The software is made up of a few features that are designed to make the lives of PC gamers easier and better, ranging from optimizing game settings to recording gameplay. If you have used a recent version of GeForce Experience, then you probably have a good idea of what Raptr does.

raptr_game_settings_example_balanced.jpg

Today, Raptr has announced a new, major update. The version's headlining feature is hardware-accelerated video recording and streaming for both AMD and NVIDIA GPUs. Raptr claims that their method leads to basically no performance loss, regardless of which GPU vendor is used. Up to 20 minutes of previous gameplay can be recorded after the fact, and video of unlimited length can be streamed on demand.

Raptr_WOW-Quality-Video.jpg

Notice the recording overlay in the top left.

The other major feature of this version is enhanced sharing of said videos. They can be uploaded to Raptr.com and shared to Facebook and Twitter, complete with hashtags (#BecauseYolo?).

If interested, check out Raptr at their website.

Source: Raptr

Rumor: NVIDIA GeForce 800-Series Is 28nm in Oct/Nov.

Subject: General Tech, Graphics Cards | July 24, 2014 - 07:32 PM |
Tagged: nvidia, gtx 880

Many of our readers were hoping to drop one (or more) Maxwell-based GPUs into their systems for use with their 4K monitors, 3D, or whatever else they need performance for. That has not happened, nor do we even know, for sure, when it will. The latest rumors claim that the NVIDIA GeForce GTX 870 and 880 desktop GPUs will arrive in October or November. More interestingly, they are expected to be based on GM204 at the current 28nm process.

nvidia-pascal-roadmap.jpg

The recent GPU roadmap, as of GTC 2014

NVIDIA has not commented on the delay, at least that I know of, but we can tell something is up from their significantly different roadmap. We can also make a fairly confident guess by paying attention to the industry as a whole. TSMC has been struggling to keep up with 28nm production, having increased wait times by six extra weeks in May, according to Digitimes, and whatever 20nm capacity they had was reportedly gobbled up by Apple until just recently. At around the same time, NVIDIA inserted Pascal between Maxwell and Volta, with 3D memory, NVLink, and a unified memory architecture (which I don't believe they have elaborated on yet).

nvidia-previous-roadmap.jpg

The previous roadmap. (Source: Anandtech)

And, if this rumor is true, Maxwell was pushed from 20nm to a wholly 28nm architecture. It was originally supposed to be the host of unified virtual memory, not Pascal. If I had to make a safe guess, I would assume that NVIDIA needed to redesign their chip for 28nm and, especially with the extra delays at TSMC, cannot get the volume they need until autumn.

Lastly, going by the launch of the 750 Ti, Maxwell will basically be a cleaned-up Kepler architecture. Its compute units were shifted into power-of-two partitions, reducing die area for scheduling logic (and so forth). NVIDIA has been known to stash a few features into each generation, sometimes revealing them well after retail availability, so that is not to say that Maxwell will be just "a more efficient Kepler".

I expect its fundamental architecture should be pretty close, though.

Source: KitGuru

NVIDIA Preparing GeForce 800M (Laptop) Maxwell GPUs?

Subject: General Tech, Graphics Cards, Mobile | July 19, 2014 - 03:29 AM |
Tagged: nvidia, geforce, maxwell, mobile gpu, mobile graphics

Apparently, some hardware sites got their hands on an NVIDIA driver listing with several new product codes. They claim thirteen N16(P/E) chips are listed (although I count twelve (??)). While I do not have much knowledge of NVIDIA's internal product structure, the GeForce GTX 880M, based on Kepler, is apparently listed as N15E.

nvidiamaxwellroadmap.jpg

Things have changed a lot since this presentation.

These new parts will allegedly be based on the second-generation Maxwell architecture. Also, the source believes that these new GPUs will be in the GeForce GTX 800-series, possibly with the MX suffix that was last seen in October 2012 with the GeForce GTX 680MX. Of course, being a long-time PC gamer, the MX suffix does not exactly ring positive with my memory. It used to be the Ti-line that you wanted, and the MX-line that you could afford. But who am I kidding? None of that is relevant these days. Get off my lawn.

Source: Videocardz

Intel AVX-512 Expanded

Subject: General Tech, Graphics Cards, Processors | July 19, 2014 - 03:05 AM |
Tagged: Xeon Phi, xeon, Intel, avx-512, avx

It is difficult to know what is actually new information in this Intel blog post, but it is interesting nonetheless. Its topic is the AVX-512 extension to x86, designed for Xeon and Xeon Phi processors and co-processors. Basically, last year, Intel announced "Foundation", the minimum support level for AVX-512, as well as Conflict Detection, Exponential and Reciprocal, and Prefetch, which are optional. That earlier blog post was very much focused on Xeon Phi, but it acknowledged that the instructions will make their way to standard, CPU-like Xeons at around the same time.

Intel_Xeon_Phi_Family.jpg

This year's blog post brings in a bit more information, especially for common Xeons. While all AVX-512-supporting processors (and co-processors) will support "AVX-512 Foundation", the instruction set extensions are a bit more scattered.

 
  • Foundation Instructions: Xeon Processors Yes, Xeon Phi Processors Yes, Xeon Phi Coprocessors (AIBs) Yes
  • Conflict Detection Instructions: Xeon Processors Yes, Xeon Phi Processors Yes, Xeon Phi Coprocessors (AIBs) Yes
  • Exponential and Reciprocal Instructions: Xeon Processors No, Xeon Phi Processors Yes, Xeon Phi Coprocessors (AIBs) Yes
  • Prefetch Instructions: Xeon Processors No, Xeon Phi Processors Yes, Xeon Phi Coprocessors (AIBs) Yes
  • Byte and Word Instructions: Xeon Processors Yes, Xeon Phi Processors No, Xeon Phi Coprocessors (AIBs) No
  • Doubleword and Quadword Instructions: Xeon Processors Yes, Xeon Phi Processors No, Xeon Phi Coprocessors (AIBs) No
  • Vector Length Extensions: Xeon Processors Yes, Xeon Phi Processors No, Xeon Phi Coprocessors (AIBs) No

Source: Intel AVX-512 Blog Post (and my understanding thereof).

So why do we care? Simply put: speed. Vectorization, the purpose of AVX-512, has similar benefits to multiple cores. It is not as flexible as having multiple, unique, independent cores, but it is easier to implement (and it works just fine alongside multiple cores, too). For example, imagine that you have to multiply two colors together. The direct way to do it is to multiply red with red, green with green, blue with blue, and alpha with alpha. AMD's 3DNow! and, later, Intel's SSE included instructions to multiply two four-component vectors together. This reduces four similar instructions into a single one operating on wider registers.

Smart compilers (and programmers, although that is becoming less common as compilers are pretty good, especially when they are not fighting developers) are able to pack seemingly unrelated data together, too, if they undergo similar instructions. AVX-512 allows for sixteen 32-bit pieces of data to be worked on at the same time. If your pixel only has four, single-precision RGBA data values, but you are looping through 2 million pixels, do four pixels at a time (16 components).
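
As a concrete sketch of that pixel example, assuming a CPU with AVX-512F, a compiler flag along the lines of -mavx512f, and a pixel count that is a multiple of four (the function name is just for illustration):

```cpp
#include <immintrin.h>
#include <cstddef>

// Multiply two images of RGBA float pixels component-wise, 4 pixels (16 floats) per iteration.
void MultiplyColors(const float* a, const float* b, float* out, std::size_t pixelCount)
{
    for (std::size_t i = 0; i < pixelCount * 4; i += 16)
    {
        __m512 va = _mm512_loadu_ps(a + i);                 // 16 components = 4 RGBA pixels
        __m512 vb = _mm512_loadu_ps(b + i);                 // 16 components from the other image
        _mm512_storeu_ps(out + i, _mm512_mul_ps(va, vb));   // one instruction does 16 multiplies
    }
}
```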

For the record, I basically just described "SIMD" (single instruction, multiple data) as a whole.

This theory is part of how GPUs became so powerful at certain tasks. They are capable of pushing a lot of data because they can exploit similarities. If your task is full of similar problems, they can just churn through tonnes of data. CPUs have been doing these tricks, too, just without compromising what they do well.

Source: Intel

Google I/O 2014: Android Extension Pack Announced

Subject: General Tech, Graphics Cards, Mobile, Shows and Expos | July 7, 2014 - 04:06 AM |
Tagged: tegra k1, OpenGL ES, opengl, Khronos, google io, google, android extension pack, Android

Sure, this is a little late. Honestly, when I first heard the announcement, I did not see much news in it. The slide from the keynote (below) showed four points: Tessellation, Geometry Shaders, Computer [sic] Shaders, and ASTC Texture Compression. Honestly, I thought tessellation and geometry shaders were part of the OpenGL ES 3.1 spec, like compute shaders. This led to my immediate reaction: "Oh cool. They implemented OpenGL ES 3.1. Nice. Not worth a news post."

google-android-opengl-es-extensions.jpg

Image Credit: Blogogist

Apparently, they were not part of the ES 3.1 spec (although compute shaders are). My mistake. It turns out that Google is cooking up their own vendor-specific extensions. This is quite interesting, as it adds functionality to the API without the developer needing to target a specific GPU vendor (INTEL, NV, ATI, AMD), wait for approval from the Architecture Review Board (ARB), or use multi-vendor extensions (EXT). In other words, it sounds like developers can target Google's extensions without knowing the actual hardware.

Hiding the GPU vendor from the developer is not the only reason for Google to host their own vendor extension. The added features are mostly from full OpenGL. This makes sense, because it was announced alongside NVIDIA and their Tegra K1, a Kepler-based SoC. Full OpenGL compatibility was NVIDIA's selling point for the K1, due to its heritage as a desktop GPU. But, instead of requiring apps to be programmed with full OpenGL in mind, Google's extension pack brings those features to OpenGL ES 3.1. If developers want to dip their toes into full OpenGL features, they can add a few Android Extension Pack features to their existing ES engine, as sketched below.
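
For anyone wanting to try that, runtime detection would presumably look something like the sketch below. The aggregate extension name checked here, GL_ANDROID_extension_pack_es31a, and the helper itself are assumptions for illustration; the OpenGL ES 3.1 context-creation code is taken as already existing.

```cpp
#include <GLES3/gl31.h>
#include <cstring>

// Returns true if the current OpenGL ES 3.1 context advertises the Android Extension Pack.
bool HasAndroidExtensionPack()
{
    GLint count = 0;
    glGetIntegerv(GL_NUM_EXTENSIONS, &count);
    for (GLint i = 0; i < count; ++i)
    {
        const char* ext = reinterpret_cast<const char*>(glGetStringi(GL_EXTENSIONS, i));
        if (ext && std::strcmp(ext, "GL_ANDROID_extension_pack_es31a") == 0)
            return true;   // tessellation, geometry shaders, ASTC, etc. are available
    }
    return false;          // fall back to the plain ES 3.1 feature set
}
```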

Epic Games' Unreal Engine 4 "Rivalry" Demo from Google I/O 2014.

The last feature, ASTC Texture Compression, was an interesting one. Apparently the Khronos Group, owners of OpenGL, were looking for a new generation of texture compression technologies. NVIDIA suggested their ZIL technology. ARM and AMD also proposed "Adaptive Scalable Texture Compression". ARM and AMD won, although the Khronos Group stated that the collaboration between ARM and NVIDIA made both proposals better than either in isolation.

Android Extension Pack is set to launch with "Android L". The next release of Android is not currently associated with a snack food. If I were their marketer, I would block out the next three versions as 5.x and name them (L)emon, then (M)eringue, and finally (P)ie.

Would I do anything with the two skipped letters before pie? (N)(O).

ASUS STRIX GTX 780 OC 6GB in SLI, better than a Titan and less expensive to boot!

Subject: Graphics Cards | July 4, 2014 - 01:40 PM |
Tagged: STRIX GTX 780 OC 6GB, sli, crossfire, asus, 4k

Multiple monitor and 4K testing of the ASUS STRIX GTX 780 OC cards in SLI is not about the 52MHz out-of-the-box overclock but about the 12GB of VRAM that your system will have. Apart from an issue with BF4, [H]ard|OCP tested the STRIX cards against a pair of reference GTX 780s and R9 290X cards at resolutions of 5760x1200 and 3840x2160. The extra RAM made the STRIX shine in comparison to the reference card: not only was the performance better, but [H] could also raise many of the graphical settings. That still was not enough, however, to push its performance past the 290X cards in CrossFire. One other takeaway from this review is that even 6GB of VRAM is not enough to run Watch_Dogs with Ultra textures at these resolutions.

1402436254j0CnhAb2Z5_1_20_l.jpg

"You’ve seen the new ASUS STRIX GTX 780 OC Edition 6GB DirectCU II video card, now let’s look at two of these in an SLI configuration! We will explore 4K and NV Surround performance with two ASUS STRIX video cards for the ultimate high-resolution experience and see if the extra memory helps this GPU make better strides at high resolutions."

Here are some more Graphics Card articles from around the web:

Graphics Cards

Source: [H]ard|OCP

Intel's Knights Landing (Xeon Phi, 2015) Details

Subject: General Tech, Graphics Cards, Processors | July 2, 2014 - 03:55 AM |
Tagged: Intel, Xeon Phi, xeon, silvermont, 14nm

Anandtech has just published a large editorial detailing Intel's Knights Landing. Mostly, it is stuff that we already knew from previous announcements and leaks, such as one by VR-Zone from last November (which we reported on). Officially, few details were given back then, except that it would be available as either a PCIe-based add-in board or as a socketed, bootable, x86-compatible processor based on the Silvermont architecture. Its many cores, threads, and 512-bit registers are each pretty weak compared to Haswell, for instance, but combine for about 3 TFLOPS of double precision performance.

itsbeautiful.png

Not enough graphs. Could use another 256...

The best way to imagine it is running a PC with a modern, Silvermont-based Atom processor -- only with up to 288 processors listed in your Task Manager (72 actual cores with quad HyperThreading).

The main limitation of GPUs (and similar coprocessors), however, is memory bandwidth. GDDR5 is often the main bottleneck of compute performance and just about the first thing to be optimized. To compensate, Intel is packaging up-to 16GB of memory (stacked DRAM) on the chip, itself. This RAM is based on "Hybrid Memory Cube" (HMC), developed by Micron Technology, and supported by the Hybrid Memory Cube Consortium (HMCC). While the actual memory used in Knights Landing is derived from HMC, it uses a proprietary interface that is customized for Knights Landing. Its bandwidth is rated at around 500GB/s. For comparison, the NVIDIA GeForce Titan Black has 336.4GB/s of memory bandwidth.

Intel and Micron have worked together in the past. In 2006, the two companies formed "IM Flash" to produce the NAND flash for Intel and Crucial SSDs. Crucial is Micron's consumer-facing brand.

intel-knights-landing.jpg

So the vision for Knights Landing seems to be the bridge between CPU-like architectures and GPU-like ones. For compute tasks, GPUs edge out CPUs by crunching through bundles of similar tasks at the same time, across many (hundreds of, thousands of) computing units. The difference with (at least socketed) Xeon Phi processors is that, unlike most GPUs, Intel does not rely upon APIs, such as OpenCL, and drivers to translate a handful of functions into bundles of GPU-specific machine language. Instead, especially if the Xeon Phi is your system's main processor, it will run standard, x86-based software. The software will just run slowly, unless it is capable of vectorizing itself and splitting across multiple threads. Obviously, OpenCL (and other APIs) would make this parallelization easy, by their host/kernel design, but it is apparently not required.
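
As a hedged sketch of what that "standard, x86-based software" might look like when it does vectorize and thread: ordinary C++ with an OpenMP hint, no OpenCL kernels or driver involved. The build flag (something like -fopenmp) and the function are assumptions for illustration; without OpenMP support the pragma is simply ignored and the loop runs serially.

```cpp
#include <vector>
#include <cstddef>

// Scale-and-add over a large array. On a self-booting Xeon Phi this same source runs
// unmodified; the pragma lets the compiler spread iterations across the many hardware
// threads and pack them into wide (512-bit) vector lanes where possible.
void Saxpy(float a, const std::vector<float>& x, std::vector<float>& y)
{
    #pragma omp parallel for simd
    for (std::ptrdiff_t i = 0; i < static_cast<std::ptrdiff_t>(x.size()); ++i)
        y[i] = a * x[i] + y[i];
}
```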

It is a cool way for Intel to arrive at the same goal, based on their background. Especially when you mix and match Xeons and Xeon Phis in the same computer, it is a push toward heterogeneous computing -- with a lot of specialized threads backing up a handful of strong ones. I just wonder if providing a more direct method of programming will really help developers finally adopt massively parallel coding practices.

I mean, without even considering GPU compute, how efficient is most software at splitting into even two threads? Four threads? Eight threads? Can this help drive heterogeneous development? Or will this product simply try to appeal to those who are already considering it?

Source: Intel