Another retail card reveals the results
Since the release of the new AMD Radeon R9 290X and R9 290 graphics cards, we have been very curious about the latest implementation of AMD's PowerTune technology and its scaling of clock frequency as a result of the thermal levels of each graphics card. In the first article covering this topic, I addressed the questions from AMD's point of view - is this really a "configurable" GPU as AMD claims or are there issues that need to be addressed by the company?
The biggest problems I found were the highly variable clock speeds from game to game and from a "cold" GPU to a "hot" GPU. This affects the way many people in the industry test and benchmark graphics cards, as running a game for just a couple of minutes could result in average and reported frame rates that are much higher than what you see 10-20 minutes into gameplay. This was rarely something that had to be dealt with before (especially on AMD graphics cards), so it caught many off-guard.
Because of the new PowerTune technology, as I have discussed several times before, clock speeds are starting off quite high on the R9 290X (at or near the 1000 MHz quoted speed) and then slowly drifting down over time.
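To make the benchmarking concern concrete, here is a minimal sketch of how a short run can overstate the frame rate compared to a run that discards the warm-up period. The function name, the sample format, and every number in `run` are made up for illustration; they are not measurements from any card.

```python
from statistics import mean


def average_fps(samples, warmup_seconds=0):
    """Average frame rate, ignoring samples taken before warmup_seconds.

    samples is a list of (elapsed_seconds, fps) pairs.
    """
    kept = [fps for elapsed, fps in samples if elapsed >= warmup_seconds]
    if not kept:
        raise ValueError("no samples left after discarding the warm-up period")
    return mean(kept)


# Made-up illustrative numbers, NOT measurements: a card that starts fast
# and settles at a lower frame rate once it heats up.
run = [(0, 60.0), (60, 58.0), (120, 55.0), (600, 50.0), (900, 50.0), (1200, 50.0)]

# A two-minute benchmark only ever sees the "cold" samples.
short_run = [sample for sample in run if sample[0] <= 120]

print(round(average_fps(short_run), 1))                # cold, short benchmark
print(round(average_fps(run, warmup_seconds=600), 1))  # steady state only
```

The point is only the shape of the comparison: the same card reports a different average depending on whether the warm-up window is included.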
Another wrinkle occurred when Tom's Hardware reported that retail graphics cards they had seen were showing markedly lower performance than the reference samples sent to reviewers. As a result, AMD quickly released a new driver that attempted to address the problem by normalizing to fan speeds (RPM) rather than fan voltage (percentage). The result was consistent fan speeds on different cards and thus much closer performance.
However, with all that being said, I was still testing retail AMD Radeon R9 290X and R9 290 cards that were PURCHASED rather than sampled, to keep tabs on the situation.
Subject: General Tech, Graphics Cards | November 26, 2013 - 03:18 AM | Scott Michaud
Tagged: R9 290X, r9 290, amd
Multiple sites are reporting that some of AMD's Radeon R9 290 cards can be software-unlocked into 290Xs with a simple BIOS update. While the difference in performance is minor, free extra shader processors might be tempting for some existing owners.
"Binning" is when a manufacturer increases yield by splitting one product into several based on how they test after production. Semiconductor fabrication, specifically, is prone to constant errors and defects. Maybe some of your chips are not stable at 4 GHz but can attain 3.5 or 3.7 GHz. Why throw those out when they can be sold as 3.5 GHz parts?
This is especially relevant to multi-core CPUs and GPUs. Hawaii XT has 2816 Stream processors; a compelling product could be made even with a few of those shut down. The R9 290, for instance, permits 2560 of these cores. The remaining cores have been laser cut or, at least, should have been.
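The frequency-based binning described above can be sketched in a few lines. This is only an illustration: the three speed grades come from the example in the text, while the function name and the idea of a single "maximum stable clock" per chip are simplifying assumptions, not how any manufacturer actually tests parts.

```python
# Speed grades taken from the example in the text (GHz).
SPEED_BINS_GHZ = (4.0, 3.7, 3.5)


def assign_bin(max_stable_ghz, bins=SPEED_BINS_GHZ):
    """Return the fastest grade the chip qualifies for, or None if it fails all.

    max_stable_ghz is a hypothetical test result: the highest clock at
    which the chip ran without errors.
    """
    for grade in sorted(bins, reverse=True):
        if max_stable_ghz >= grade:
            return grade
    return None


for tested in (4.1, 3.8, 3.5, 3.2):
    print(tested, "->", assign_bin(tested))
```

A chip that misses the top grade is not discarded; it simply falls through to the next grade it can sustain, which is the yield gain binning is after.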
Apparently certain batches of Radeon R9 290s were built with fully functional Hawaii XT chips that were software-locked to 290 specifications. There have been reports that several users of cards from multiple OEMs were able to flash a new BIOS to unlock these extra cores. However, other batches seem to be properly locked.
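For a sense of scale, the arithmetic on the two core counts quoted above looks like this. It only counts shader processors and says nothing about clock speeds or real-world frame rates.

```python
HAWAII_XT_CORES = 2816  # full chip (R9 290X), per the text
R9_290_CORES = 2560     # cores enabled on the R9 290, per the text

locked = HAWAII_XT_CORES - R9_290_CORES
share_of_full_chip = locked / HAWAII_XT_CORES
gain_if_unlocked = locked / R9_290_CORES

print(locked)                        # 256
print(f"{share_of_full_chip:.1%}")   # 9.1%
print(f"{gain_if_unlocked:.1%}")     # 10.0%
```

A successful unlock therefore adds 10% more shader processors relative to a stock R9 290.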
This could be interesting for lucky and brave users but I wonder why this happened. I can think of two potential causes:
- Someone (OEMs or AMD) had too many 290X chips, or
- The 290 launch was just that unprepared.
Either way, newer shipments should be properly locked even from affected OEMs. Again, not that it really matters given the performance differences we are talking about.
Subject: General Tech, Graphics Cards | November 22, 2013 - 06:26 PM | Scott Michaud
Tagged: nvidia, jpr, amd
Jon Peddie Research (JPR) reports an 8% rise in quarter-to-quarter shipments of graphics add-in boards (AIBs) for NVIDIA and a decrease of 3% for AMD. This reverses the story from last quarter where NVIDIA lost 8% and AMD gained. In all, NVIDIA holds over half the market (64.5%).
JPR attributed AMD's gains seen last quarter to consumers who added a discrete graphics solution to systems which already contain an integrated product. SLi and Crossfire were noted but pale in comparison. I expect Never Settle to have contributed heavily. This quarter, the free games initiative was reduced with the new GPU lineup. For a decent amount of time, nothing was offered.
At the same time, NVIDIA launched the GTX 780 Ti and their own game bundle. While I do not believe this promotion was as popular as AMD's Never Settle, it probably helped. That said, it is still probably too early to tell whether the Battlefield 4 promotion (or Thief's addition to Silver Tier) will help them regain some ground.
The other vendors, Matrox and S3, were "flat to declining". Their story is the same as last quarter: they shipped fewer than (maybe far fewer than) 7000 units. On the whole, add-in board shipments are rising from last quarter; that quarter, however, was a 5.4% drop from the one before.
Subject: General Tech, Graphics Cards, Systems | November 21, 2013 - 09:47 PM | Scott Michaud
Tagged: nvidia, tesla, supercomputing
GPUs are very efficient in terms of operations per watt. Their architecture is best suited for a gigantic bundle of similar calculations (such as a set of operations for each entry of a large blob of data). These are the tasks which also take up the most computation time especially for, not surprisingly, 3D graphics (where you need to do something to every pixel, fragment, vertex, etc.). It is also very relevant for scientific calculations, financial and other "big data" services, weather prediction, and so forth.
Tokyo Tech KFC achieves over 4 GigaFLOPs per watt of power draw from 160 Tesla K20X GPUs in its cluster. That is about 25% more calculations per watt than the current leader of the Green500 (CINECA Eurora System in Italy, with 3.208 GFLOPs/W).
One interesting trait: this supercomputer will be cooled by oil immersion. NVIDIA offers passively cooled Tesla cards which, according to my understanding of how this works, are very well suited to this fluid system. I am fairly certain that they remove all of the fans before dunking the servers (I figured they would be left on).
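The "about 25%" figure can be checked with a quick calculation. Since the text only says "over 4", the sketch below uses 4.0 GFLOPs/W as a lower bound, so the result is a floor rather than the exact gain.

```python
KFC_GFLOPS_PER_WATT = 4.0       # lower bound: the text says "over 4"
EURORA_GFLOPS_PER_WATT = 3.208  # Green500 leader quoted in the text


def relative_gain(new, old):
    """Fractional improvement of new over old."""
    return new / old - 1.0


gain = relative_gain(KFC_GFLOPS_PER_WATT, EURORA_GFLOPS_PER_WATT)
print(f"{gain:.1%}")  # 24.7%
```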
By the way, was it intentional to name computers dunked in giant vats of heat-conducting oil, "KFC"?
Intel has done a similar test, which we reported on last September, submerging numerous servers for over a year. Another benefit of being green is that you are not nearly as concerned about air conditioning.
NVIDIA is actually taking it to the practical market with another nice supercomputer win.
Other NVIDIA Supercomputing News:
- IBM and NVIDIA collaborate on GPU-accelerating IBM's enterprise software.
- Piz Daint, powered by Tesla K20X GPUs, greenest PFLOP-scale supercomputer.
Subject: Graphics Cards | November 20, 2013 - 01:52 PM | Jeremy Hellstrom
Tagged: mars, asus, ROG MARS 760, gtx 760, dual gpu
Fremont, CA (November 19, 2013) - ASUS Republic of Gamers (ROG) today announced the MARS 760 graphics card featuring two GeForce GTX 760 graphics-processing units (GPUs) capable of delivering incredible gaming performance and ensuring ultra-smooth high-resolution gameplay. The MARS 760 even outpaces the GeForce GTX TITAN — with game performance that’s up to 39% faster overall. The MARS 760 is a two-slot card packed with exclusive ASUS technologies including DirectCU II for 20%-cooler and vastly quieter operation, DIGI+ voltage-regulator module (VRM) for ultra-stable power delivery and GPU Tweak, an easy-to-use utility that lets users safely overclock the two GTX 760 GPUs.
Exclusive ASUS features provide cool, quiet, durable and stable performance
ASUS exclusive DirectCU II technology puts 8 highly-conductive cooling copper heatpipes in direct contact with both GPUs. These heatpipes provide extremely efficient cooling, allowing the MARS 760 to run 20% cooler and vastly quieter than reference GeForce GTX 690 cards. Dual 90mm dust-proof fans help to provide six times (6X) greater airflow than reference design. And with 4GB of GDDR5 video memory, the ASUS ROG MARS 760 is capable of delivering visuals with incredibly high frame rates and no stutter, ensuring extremely smooth gameplay, even at WQHD resolutions. An attention-grabbing LED even illuminates as the MARS 760 is operating under load.
The MARS 760 is equipped with ROG’s acclaimed DIGI+ voltage-regulation module (VRM), featuring a 12-phase power design that reduces power noise by 30% and enhances efficiency by 15%. Custom sourced black metallic capacitors offer 20%-better temperature endurance for a lifespan that’s up to five times (5X) longer. The new card is built with extremely hardwearing polymerized organic-semiconductor capacitors (POSCAPs) and has an aluminum back-plate, further lowering power noise while increasing both durability and stability to unlock overclocking potential.
The exclusive GPU Tweak tuning tool allows quick, simple and safe control over clock speeds, voltages, cooling-fan speeds and power-consumption thresholds; GPU Tweak lets users push the two GTX 760 GPUs even further. The ROG edition of GPU Tweak included with the MARS 760 also enables detailed GPU load-line calibration and VRM-frequency tuning, allowing for the most extensive control and tweaking parameters in order to maximize overclocking potential — all adjusted via an attractive and easy-to-use graphical interface.
The GPU Tweak Streaming feature, the newest addition to the GPU Tweak tool, lets users share on-screen action over the internet in real time so others can watch live as games are played. It’s even possible to add a title to the streaming window along with scrolling text, pictures and webcam images.
- NVIDIA GeForce GTX 760 SLI
- PCI Express 3.0
- 4096MB GDDR5 memory (2GB per GPU)
- 1008MHz (1072MHz boosted) core speed
- 6004 MHz (1501 MHz GDDR5) memory clock
- 512-bit memory interface
- 2560 x 1600 maximum DVI resolution
- 2 x dual-link DVI-I output
- 1 x dual-link DVI-D output
- 1 x Mini DisplayPort output
- HDMI output (via dongle)
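The memory figures in the list above fit together as follows. This sketch assumes the usual GDDR5 convention of four transfers per pin per quoted clock cycle, and it assumes the 512-bit interface and 4096MB totals are the sum across both GPUs (256-bit and 2GB each); the bandwidth number is a theoretical peak derived from those assumptions, not a figure from the press release.

```python
GPUS = 2
MEMORY_CLOCK_MHZ = 1501         # quoted GDDR5 clock
TRANSFERS_PER_CLOCK = 4         # assumption: standard GDDR5 data rate
TOTAL_BUS_WIDTH_BITS = 512      # quoted, assumed to be the sum of both GPUs
TOTAL_MEMORY_MB = 4096          # quoted, 2GB per GPU

effective_mhz = MEMORY_CLOCK_MHZ * TRANSFERS_PER_CLOCK
bus_bits_per_gpu = TOTAL_BUS_WIDTH_BITS // GPUS
memory_mb_per_gpu = TOTAL_MEMORY_MB // GPUS

# Peak bandwidth per GPU: transfers per second * bytes per transfer.
bandwidth_gbs_per_gpu = effective_mhz * 1e6 * bus_bits_per_gpu / 8 / 1e9

print(effective_mhz)                    # 6004
print(bus_bits_per_gpu)                 # 256
print(memory_mb_per_gpu)                # 2048
print(f"{bandwidth_gbs_per_gpu:.1f}")   # 192.1
```

Keep in mind that in an SLI setup each GPU works out of its own 2GB pool, so the per-GPU numbers are the ones that matter for a single frame.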
Subject: General Tech, Graphics Cards | November 18, 2013 - 03:33 PM | Scott Michaud
Tagged: tesla, nvidia, K40, GK110b
The Tesla K20X has ruled NVIDIA's headless GPU portfolio for quite some time now. The part is based on the GK110 chip with 192 shader cores disabled, like the GeForce Titan, and achieves 3.9 TeraFLOPs of compute performance (1.31 TeraFLOPs in double precision). Also, like the Titan, the K20X offers 6GB of memory.
The Tesla K40X
So the layout was basically the following: GK104 ruled the gamer market except for the, in hindsight, oddly-positioned GeForce Titan which was basically a Tesla K20X without a few features like error correction (ECC). The Quadro K6000 was the only card to utilize all 2880 CUDA cores.
Then, at the recent G-Sync event, NVIDIA CEO Jen-Hsun Huang announced the GeForce GTX 780 Ti. This card uses the GK110b processor and incorporates all 2880 CUDA cores, albeit with reduced double-precision performance (for the 780 Ti, not for GK110b in general). So now we have Quadro and GeForce with the full-power Kepler. Your move, Tesla.
And move it did: the Tesla K40 launched this morning and it brought more than just cores.
A brief overview
The GeForce launch was famous for its inclusion of GPU Boost, a feature absent in the Tesla line. It turns out that NVIDIA was paying attention to the feature but wanted to include it in a way that suited data centers. GeForce cards boost based on the status of the card, its temperature or its power draw. This is apparently unsuitable for data centers because they would like every unit operating at a very similar performance. The Tesla K40 has a base clock of 745 MHz but gives the data center two boost clocks that they can manually set: 810 MHz and 875 MHz.
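To see what the two manual boost states are worth, here is a rough estimate of theoretical single-precision throughput at each clock. It assumes the K40 exposes all 2880 CUDA cores (as the "full Kepler" context above suggests) and that each core retires one fused multiply-add, counted as two floating-point operations, per clock. These are peak figures under those assumptions, not benchmark results.

```python
CUDA_CORES = 2880            # assumption: fully enabled GK110b
FLOPS_PER_CORE_PER_CLOCK = 2 # assumption: one fused multiply-add per clock
CLOCKS_MHZ = {"base": 745, "boost 1": 810, "boost 2": 875}  # from the text


def peak_sp_gflops(cores, clock_mhz, flops_per_clock=FLOPS_PER_CORE_PER_CLOCK):
    """Theoretical single-precision GFLOPS at a given core clock."""
    return cores * flops_per_clock * clock_mhz / 1000


base_gflops = peak_sp_gflops(CUDA_CORES, CLOCKS_MHZ["base"])
for name, mhz in CLOCKS_MHZ.items():
    gflops = peak_sp_gflops(CUDA_CORES, mhz)
    uplift = gflops / base_gflops - 1.0
    print(f"{name}: {gflops:.1f} GFLOPS ({uplift:+.1%} vs. base)")
```

Because the clocks are fixed settings rather than temperature-dependent, every card in a rack set to the same state should land on the same number, which is the consistency data centers are asking for.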
Relative performance benchmarks
The Tesla K40 also doubles the amount of RAM to 12GB. Of course this allows for the GPU to work on larger data sets without streaming in the computation from system memory or worse.
There is currently no public information on pricing for the Tesla K40 but it is available starting today. What we do know are the launch OEM partners: ASUS, Bull, Cray, Dell, Eurotech, HP, IBM, Inspur, SGI, Sugon, Supermicro, and Tyan.
If you are interested in testing out a K40, NVIDIA has remotely hosted clusters that your company can sign up for at the GPU Test Drive website.
Subject: General Tech, Graphics Cards | November 14, 2013 - 07:54 PM | Scott Michaud
Tagged: never settle forever, never settle, battlefield 4, amd
UPDATE (11/14/2013): After many complaints from the community about the lack of availability of graphics cards that actually HAD the Battlefield 4 bundle included with them, AMD is attempting to clarify the situation. In a statement sent through email, AMD says that the previous information sent to press "was not clear and has led to some confusion" which is definitely the case. While it was implied that all customers that bought R9 series graphics cards would get a free copy of BF4, when purchased on or after November 13th, the truth is that "add-in-board partners ultimately decide which select AMD Radeon R9 SKUs will include a copy of BF4."
So, how are you to know what SKUs and cards are actually going to include BF4? AMD is trying hard to set up a landing page at http://amd.com/battlefield4 that will give gamers clear, and absolute, listings of which R9 series cards include the free copy of the game. When I pushed AMD for a timeline on exactly when these would be posted, the best I could get was "in the next day or two."
As for users that bought an R9 280X, R9 270X, R9 270, R9 290X or R9 290 after the announcement of the bundle program changes but DID NOT get a copy of BF4, AMD is going to try and help them out by offering up 1,000 Battlefield 4 keys over AMD's social channels. The cynic in me thinks this is another ploy to get more Facebook likes and Twitter followers, but in truth the logistics of verifying purchases at this point would be a nightmare for AMD. Though I don't have details on HOW they are going to distribute these keys, I certainly hope they are going to find a way to target those users that were screwed over in this mess. Follow www.facebook.com/amdgaming or www.twitter.com/amdradeon for more information on this upcoming promotion.
AMD did send over a couple of links to cards that are currently selling with Battlefield 4 included, as an example of what to look for:
As far as I know, the board partners will also decide which online outlets to offer the bundle through, so even if you see the same SKU on Amazon.com, it may not come with Battlefield 4 as well. It appears in this case, and going forward, extreme caution is in order when looking for the right card for you.
END UPDATE (11/14/2013)
AMD announced the first Never Settle on October 22nd, 2012 with Sleeping Dogs, Far Cry 3, Hitman: Absolution, and 20% off of Medal of Honor: Warfighter. The deal was valued at around $170. It has exploded since then to become a choose-your-own-bundle across a variety of tiers.
This bundle is mostly different.
Basically, apart from the R7 260X (I will get to that later), all applicable cards will receive Battlefield 4. This is a one-game promotion unlike Never Settle. Still, it is one very good game that will soon be accelerated with Mantle in an upcoming patch. It should be a good example of games based on Frostbite 3 for at least the next few years.
The qualifying cards are: R9 270, R9 270X, R9 280, R9 280X, R9 290, and R9 290X. They must be purchased from a participating retailer beginning November 13th.
The R7 260X is slightly different because it is more similar to Never Settle. It will not have access to a free copy of Battlefield 4. Instead, the R7 260X will have access to two of six Never Settle Forever Silver Tier games: Hitman: Absolution, Sleeping Dogs, Sniper Elite (V2), Far Cry 3: Blood Dragon, DiRT 3, and (for the first time) THIEF. It is possible that other silver-tier Never Settle Forever owners, who have yet to redeem their voucher, might qualify as well. I am not sure about that. Regardless, THIEF was chosen because the developer worked closely with AMD to support both Mantle and TrueAudio.
Since this deal half-updates Never Settle and half-doesn't... I am unsure what this means for the future of the bundle. They seem to be simultaneously supporting and disavowing it. My personal expectation is that AMD wants to continue with Never Settle but they just cut their margins too thin with this launch. This will be a good question to revisit later in the GPU lifecycle when margins become more comfortable.
What do you think? Does AMD's hyper-aggressive hardware pricing warrant a temporary suspension of Never Settle? I mean, until today, they were being purchased without any bundle whatsoever.
Qualifying R9-Series Cards (purchased after Nov 13 from participating retailers) can check out AMD's Battlefield 4 portal.
Qualifying R7 260X owners, on the other hand, can check out the Never Settle Forever portal.
Subject: Graphics Cards | November 13, 2013 - 09:54 PM | Ryan Shrout
Tagged: video, Mantle, apu13, amd
While attending the AMD APU13 event, an annual developer conference the company uses to promote heterogeneous computing, I got to sit in on a deep dive on AMD Mantle, a new hardware-level API first announced in September. Rather than attempt to re-explain what was explained quite well, I decided to record the session on video and then intermix the slides presented in a produced video for our readers.
The result is likely the best (and seemingly first) explanation of how Mantle actually works and what it does differently than existing APIs like DirectX and OpenGL.
Also, because we had some requests, I am embedding the live blog we ran during Johan Andersson's keynote from APU13. Enjoy!
Subject: Graphics Cards, Processors | November 12, 2013 - 06:10 PM | Ryan Shrout
Tagged: amd, Kaveri, APU, video, hsa
Yesterday at the AMD APU13 developer conference, the company showed off the upcoming Kaveri APU running Battlefield 4 completely on the integrated graphics. I was able to push the AMD guys along and get a little more personal demo to share with our readers. The Kaveri APU had some of its details revealed this week:
- Quad-core Steamroller x86
- 512 Stream Processor GPU
- 856 GFLOPS of theoretical performance
- 3.7 GHz CPU clock speed, 720 MHz GPU clock speed
AMD wanted to be sure we pointed out in this video that the estimated clock speeds used for the FLOP figure may not be what the demo system was running at (likely a bit lower). Also, the version of Battlefield 4 here is the standard retail version; further improvements from the driver team, as well as the upcoming Mantle API implementation, will likely bring even more performance to the APU.
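The 856 GFLOPS figure can be roughly reconstructed from the listed specifications. The per-clock throughput values below are assumptions, not numbers AMD stated here: two operations per Stream Processor per clock (one fused multiply-add) and eight single-precision operations per CPU core per clock. They are chosen because they are common conventions and happen to land on the quoted total.

```python
GPU_STREAM_PROCESSORS = 512
GPU_CLOCK_GHZ = 0.720
GPU_FLOPS_PER_SP_PER_CLOCK = 2     # assumption: one fused multiply-add

CPU_CORES = 4
CPU_CLOCK_GHZ = 3.7
CPU_FLOPS_PER_CORE_PER_CLOCK = 8   # assumption, not an AMD-stated figure

gpu_gflops = GPU_STREAM_PROCESSORS * GPU_FLOPS_PER_SP_PER_CLOCK * GPU_CLOCK_GHZ
cpu_gflops = CPU_CORES * CPU_FLOPS_PER_CORE_PER_CLOCK * CPU_CLOCK_GHZ
total_gflops = gpu_gflops + cpu_gflops

print(f"GPU:   {gpu_gflops:.2f} GFLOPS")
print(f"CPU:   {cpu_gflops:.2f} GFLOPS")
print(f"Total: {total_gflops:.2f} GFLOPS")  # rounds to 856
```

Under these assumptions the GPU accounts for roughly 86% of the theoretical total, which is why the GPU clock caveat above matters so much for the headline number.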
The game was running at 1920x1080 with MOSTLY medium quality settings (lighting set to low) but the results still looked damn impressive and the frame rates were silky and smooth. Considering this is running on a desktop with integrated processor graphics, the game play experience is simply unmatched.
Memory in the system was running at 2133 MHz.
The second demo looks at the image decoding acceleration that AMD is going to enable with Kaveri APUs upon release with a driver. Essentially, as the demonstration shows in the video, AMD is overwriting the integrated Windows JPG decompression algorithm with a new one that utilizes HSA to accelerate on both the x86 and SIMD (GPU) portions of the silicon. The most strenuous demo, which used 22 MP images, saw a 100% increase in performance compared to the Kaveri CPU cores alone.
EVGA Brings Custom GTX 780 Ti Early
Reference cards for new graphics card releases are very important for a number of reasons. Most importantly, these are the cards presented to the media and reviewers that judge the value and performance of these cards out of the gate. These various articles are generally used by readers and enthusiasts to make purchasing decisions, and if first impressions are not good, it can spell trouble. Also, reference cards tend to be the first cards sold in the market (see the recent Radeon R9 290/290X launch) and early adopters get the same technology in their hands; again the impressions reference cards leave will live in forums for eternity.
All that being said, retail cards are where partners can differentiate and keep the various GPUs relevant for some time to come. EVGA is probably the most well known NVIDIA partner and is clearly their biggest outlet for sales. The ACX cooler is one we saw popularized with the first GTX 700-series cards and the company has quickly adapted it to the GTX 780 Ti, released by NVIDIA just last week.
I would normally have a full review for you as soon as we could but thanks to a couple of upcoming trips that will keep me away from the GPU test bed, that will take a little while longer. However, I thought a quick preview was in order to show off the specifications and performance of the EVGA GTX 780 Ti ACX.
As expected, the EVGA ACX design of the GTX 780 Ti is overclocked. While the reference card runs at a base clock of 875 MHz and a typical boost clock of 928 MHz, this retail model has a base clock of 1006 MHz and a boost clock of 1072 MHz. This means that all 2,880 CUDA cores are going to run somewhere around 15% faster on the EVGA ACX model than the reference GTX 780 Ti SKUs.
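The "around 15%" claim follows directly from the four clock speeds quoted above, as this quick check shows. It compares rated clocks only; actual in-game clocks under GPU Boost will vary.

```python
REFERENCE_MHZ = {"base": 875, "boost": 928}   # reference GTX 780 Ti
EVGA_ACX_MHZ = {"base": 1006, "boost": 1072}  # EVGA GTX 780 Ti ACX

overclock = {
    kind: EVGA_ACX_MHZ[kind] / REFERENCE_MHZ[kind] - 1.0
    for kind in REFERENCE_MHZ
}

for kind, gain in overclock.items():
    print(f"{kind}: {gain:+.1%}")
# base: +15.0%
# boost: +15.5%
```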
We should note that though the cooler is custom built by EVGA, the PCB design of this GTX 780 Ti card remains the same as the reference models.