Subject: Editorial, General Tech, Graphics Cards, Processors, Shows and Expos | March 30, 2014 - 01:45 AM | Scott Michaud
Tagged: gdc 14, GDC, GCN, amd
While Mantle and DirectX 12 are designed to reduce overhead and keep GPUs loaded, the conversation shifts when you are limited by shader throughput. Modern graphics processors can contain thousands of compute cores. Video drivers are complex software packages, and one of their many tasks is compiling your shader programs into machine code for the GPU. If that machine code is efficient, it can mean drastically higher frame rates, especially at extreme resolutions and intense quality settings.
Emil Persson of Avalanche Studios, probably known best for the Just Cause franchise, published his slides and speech on optimizing shaders. His talk focuses on AMD's GCN architecture, due to its existence in both console and PC, while bringing up older GPUs for examples. Yes, he has many snippets of GPU assembly code.
AMD's GCN architecture is actually quite interesting, especially dissected as it was in the presentation. It is simpler than its ancestors and much more CPU-like: resources are mapped to memory (and caches of that memory) rather than "slots" (although drivers and APIs often pretend those relics still exist), and vectors are mostly treated as collections of scalars. Tricks that attempt to pack instructions together into vectors, such as using dot products, can simply place irrelevant restrictions on the compiler and optimizer... since it breaks those vector operations back down into the very same component-by-component ops you thought you were avoiding.
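To illustrate that point with a hypothetical sketch (my own illustration, not code from Persson's slides): on GCN, a three-component dot product is lowered into the same per-lane multiply-add sequence you could have written by hand, so "vectorizing" with dot() buys nothing by itself.

```python
def dot3(a, b):
    """Scalar lowering of a 3-component dot product.

    On GCN a shader-level dot product compiles to roughly this
    sequence of per-component vector ALU ops (one lane at a time):
        v_mul_f32  acc, a.x, b.x
        v_mac_f32  acc, a.y, b.y   # multiply-accumulate
        v_mac_f32  acc, a.z, b.z
    """
    acc = a[0] * b[0]
    acc = acc + a[1] * b[1]
    acc = acc + a[2] * b[2]
    return acc
```

The takeaway from the talk is that since the hardware works per-component anyway, writing scalar-friendly shader code gives the compiler more freedom, not less.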
Basically, and it makes sense coming from GDC, this talk rarely glosses over points. It goes over execution speed of one individual op compared to another, at various precisions, and which to avoid (protip: integer divide). Also, fused multiply-add is awesome.
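On the "avoid integer divide" protip, the classic workaround (a generic strength-reduction sketch, not an example taken from the slides) is that when the divisor is a known power of two, a single cheap shift does the same job as the slow divide path.

```python
def div_pow2(x, k):
    # x // (2**k) for non-negative x, done as one shift -- the kind
    # of substitution a shader author (or a good compiler) makes to
    # dodge an expensive integer-divide instruction sequence.
    return x >> k

# Equivalent to the divide for non-negative inputs:
assert all(div_pow2(x, 4) == x // 16 for x in range(1000))
```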
I know I learned.
As a final note, this returns to the discussions we had prior to the launch of the next-generation consoles. Developers are learning how to make their shader code much more efficient on GCN, and that could easily translate to leading PC titles. Especially alongside DirectX 12 and Mantle, which lighten CPU-based bottlenecks, learning how to do more work per FLOP addresses the other side of the equation. Everyone was looking at Mantle as AMD's play for success through harnessing console mindshare (and in terms of Intel vs. AMD, it might help). But honestly, I believe it will be trends like this presentation that prove more significant... even if behind the scenes. Of course developers were always having these discussions, but now console developers will mostly be talking about a single architecture - that is a lot of people talking about very few things.
This is not really reducing overhead; this is teaching people how to do more work with less, especially in situations (high resolutions with complex shaders) where the GPU is most relevant.
Subject: Motherboards | March 6, 2014 - 02:44 AM | Tim Verry
Tagged: mini ITX, micro ATX, Kabini, GCN, FS1B, biostar, AM1
Biostar has officially launched three new AM1 platform motherboards that support AMD's latest Kabini-based desktop SoC. The new Biostar hardware falls under the new AM1M series and includes the micro ATX AM1M-HP board and two mini ITX boards: the AM1MH and AM1ML.
All three boards feature an FS1B SoC socket, two DDR3 DIMM slots, two SATA III 6Gbps ports, one PCI-E 2.0 x16 slot (running at x4 speeds), one PCI-E 2.0 x1 slot, Gigabit Ethernet, and 5.1 channel audio. The micro ATX AM1M-HP adds a legacy PCI slot to the mix. In an interesting twist, Biostar has oriented the memory horizontally above the FS1B socket rather than vertically and to the right of the socket.
Rear I/O on the AM1M-HP and AM1MH boards includes:
- 2 x PS/2
- 1 x HDMI
- 1 x VGA
- 2 x USB 3.0
- 2 x USB 2.0
- 1 x RJ45 (GbE)
- 3 x analog audio
The other mini ITX board (the AM1ML) has the same rear I/O configuration minus the HDMI video output.
Biostar has not released pricing or availability information, but the boards should ship sometime in mid-April.
Low Power and Low Price
Back at CES earlier this year, we came across a couple of interesting motherboards that were neither AM3+ nor FM2+. These small, sparse, and inexpensive boards were actually based on the unannounced AM1 platform. This socket is actually the FS1b socket that is typically reserved for mobile applications which require the use of swappable APUs. The goal here is to provide a low cost, upgradeable platform for emerging markets where price is absolutely key.
AMD has not exactly been living on easy street for the past several years. Their CPU technologies, historically their bread and butter, have not been entirely competitive with Intel's. Helping to prop the company up, though, is a very robust and competitive graphics unit. The standalone and integrated graphics technologies they offer are not only competitive, but also class-leading in some cases. The integration of AMD's GCN architecture into APUs has been their crowning achievement as of late.
This is not to say that AMD is totally deficient in their CPU designs. Their low power/low cost designs that started with the Bobcat architecture all those years back have always been very competitive in terms of performance, price, and power consumption. The latest iteration is the Kabini APU based on the Jaguar core architecture paired with GCN graphics. Kabini will be the part going into the FS1b socket that powers the AM1 platform.
Kabini is a four-core processor (Jaguar) with a 128-unit GCN graphics part (two GCN compute units of 64 shaders each). These APUs will be rated at 25 watts up and down the stack; even the versions with half the cores will still be 25 watt parts. AMD says that 25 watts is the sweet spot in terms of performance, cooling, and power consumption: go lower than that and too much performance is sacrificed, while any higher and it would make more sense to go with a Trinity/Richland/Kaveri solution. That 25 watt figure also encompasses the primary I/O functionality that typically resides on a standalone motherboard chipset. Kabini features 2 SATA 6G ports, 2 USB 3.0 ports, and 8 USB 2.0 ports. It also features multiple PCI-E lanes, including an x4 PCI-E connection for external graphics. The chip also supports DisplayPort, HDMI, and VGA outputs. This is a true SoC from AMD that does a whole lot of work for not a whole lot of power.
Subject: General Tech | January 15, 2014 - 11:56 PM | Tim Verry
Tagged: r9 m290x, r7 m265, r5 m230, mobile gpu, GCN, amd
AMD recently took the wraps off of its latest mobile GPU series in the form of the R5 M200, R7 M200, and R9 M200 series. Currently, there is one GPU in each respective Rx M200 series: the AMD Radeon R5 M230, R7 M265, and R9 M290X. Do not get too excited, however. All of the new mobile GPUs are based on existing desktop-class silicon, not AMD's new Hawaii (Volcanic Islands) GPUs. As such, the Rx M200 series are essentially rebrands of the Radeon HD 8000M series (which was in turn an OEM rebrand of the HD 7000M series) based around AMD's Graphics Core Next 1.0 architecture, and specifically the Pitcairn GPU implementation.
All of the Rx M200 series support DirectX 11.2 Tier 1, up to 4GB GDDR5 memory, and at least 320 GCN shader cores. Information on the mid-range R7 M265 is scarce, but AMD has released information on the low and high end chips. Further, Computer Base has managed to put together specifications for the R5 M230 and R9 M290X. In short, the R5 M230 is a rebranded HD 8570 with higher clockspeeds and support for more memory, while the R9 M290X is a rebranded HD 8970M with official support for DirectX 11.2 Tier 1 (the HD 8970M technically supports it as well). A more detailed breakdown is as follows.
The R9 M290X features 1280 shaders clocked at 850MHz/900MHz (base/boost), 80 texture units, and 32 ROPs. OEMs can pair the GPU with up to 4GB of GDDR5 memory clocked at 1,200 MHz on a 256-bit bus.
The R5 M230 has 320 shaders clocked at 855MHz, 20 texture units, and 4 ROPs. This GPU can support up to 4GB of GDDR5 memory at 1,000MHz over a 64-bit bus.
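As a sanity check on those memory specs (my own back-of-the-envelope arithmetic, not figures from AMD): GDDR5 transfers four bits per pin per memory clock, so peak bandwidth is bus width × memory clock × 4, divided by 8 to get bytes.

```python
def gddr5_bandwidth_gbs(bus_bits, mem_clock_mhz):
    # GDDR5 is quad-pumped: 4 transfers per clock per pin.
    bits_per_second = bus_bits * mem_clock_mhz * 1e6 * 4
    return bits_per_second / 8 / 1e9  # bytes/sec -> GB/s

print(gddr5_bandwidth_gbs(256, 1200))  # R9 M290X: 153.6 GB/s
print(gddr5_bandwidth_gbs(64, 1000))   # R5 M230:   32.0 GB/s
```

The roughly 5x bandwidth gap between the two parts lines up with their positions at opposite ends of the mobile stack.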
Users will be able to get the new Rx M200 series graphics cards in mobile systems from Alienware, Clevo, Lenovo, and MSI. Other manufacturers should pick up the new GPUs soon as well. The new series is not terribly exciting, being nearly identical to the existing HD 8000M counterparts, but it does update the lineup to AMD's new naming and branding scheme. Notably, should AMD release a Hawaii-based mobile GPU, it has not left itself much room as far as naming goes (R9 M295X?).
The 7 Year Console Refresh
The consoles are coming! The consoles are coming! Ok, that is not necessarily true. One is already here and the second essentially is too. This of course brings up the great debate between PCs and consoles. The past has been interesting when it comes to console gaming, as often the consoles would be around a year ahead of PCs in terms of gaming power and prowess. This is no longer the case with this generation of consoles. Cutting edge is now considered mainstream when it comes to processing and graphics. The real incentive to buy this generation of consoles is a lot harder to pin down as compared to years past.
The PS4 retails for $399 US and the upcoming Xbox One is $499. The PS4’s price includes a single controller, while the Xbox’s package includes not just a controller, but also the next generation Kinect device. These prices would be comparable to some low end PCs which include keyboard, mouse, and a monitor that could be purchased from large brick and mortar stores like Walmart and Best Buy. Happily for most of us, we can build our machines to our own specifications and budgets.
As a directive from on high (the boss), we were given the task of building our own low-end gaming and productivity machines at a price as close to that of the consoles and explaining which solution would be superior at the price points given. The goal was to get as close to $500 as possible and still have a machine that would be able to play most recent games at reasonable resolutions and quality levels.
Subject: Processors | November 13, 2013 - 05:35 PM | Josh Walrath
Tagged: Puma, Mullins, mobile, Jaguar, GCN, beema, apu13, APU, amd, 2014
AMD’s APU13 is all about APUs and their programming, but the hardware we have seen so far has been dominated by the upcoming Kaveri products for FM2+. It seems that AMD has more up their sleeves for release next year, and it has somewhat caught me off guard. The Beema and Mullins based products are being announced today, but we do not have exact details on them. The codenames have been around for some time now, but interest has been minimal since they are evolutionary products based on the Kabini and Temash APUs that have been available this year. Little did I know that things would be far more interesting than that.
The basis for Beema and Mullins is the Puma core. This is a highly optimized revision of Jaguar, and in some ways can be considered a new design. All of the basics in terms of execution units, caches, and memory controllers are the same. What AMD has done is go through the design with a fine-toothed comb and make it far more efficient per clock than what we have seen previously. This is still a 28 nm part, but the extra attention and love lavished upon it by AMD has resulted in a much more efficient system architecture for both the CPU and GPU portions.
The parts will be offered in two and four core configurations. Beema will span from 10W to 25W configurations. Mullins will go all the way down to “2W SDP”. SDP essentially means that while the chip can be theoretically rated higher, it will rarely go above that 2W envelope in the vast majority of situations. These chips are expected to be around 2X more efficient per clock than the previous Jaguar based products. This means that at similar clock speeds, Beema and Mullins will pull far less power than that previous gen. It should also allow some higher clockspeeds at the top end 25W area.
These will be some of the first fanless quad cores that AMD will introduce for the tablet market. Previously we have seen tablets utilize the cut down versions of Temash to hit power targets, but with this redesign it is entirely possible to utilize the fully enabled quad core Mullins. AMD has not given us specific speeds for these products, but we can guess that they will be around what we see currently, but the chip will just have a lower TDP rating.
AMD is introducing their new security platform based on ARM TrustZone. Essentially, a small ARM Cortex-A5 is integrated into the design and handles the security aspects of this feature. We were not briefed on how this achieves security, but the slide below gives some of the bullet points of the technology.
Since the pure-play foundries will not have a workable 20 nm process for AMD to jump to in a timely manner, AMD had no other choice but to really optimize the Jaguar core to make it more competitive with products from Intel and the ARM partners. At 28 nm the ARM ecosystem has a power advantage over AMD, while at 22 nm Intel offers similar performance to AMD but with greater power efficiency.
This is a necessary update for AMD as the competition has certainly not slowed down. AMD is more constrained obviously by the lack of a next-generation process node available for 1H 2014, so a redesign of this magnitude was needed. The performance per watt metric is very important here, as it promises longer battery life without giving up the performance people received from the previous Kabini/Temash family of APUs. This design work could be carried over to the next generation of APUs using 20 nm and below, which hopefully will keep AMD competitive with the rest of the market. Beema and Mullins are interesting looking products that will be shown off at CES 2014.
AMD Releases Catalyst 13.11 Beta 9.2 Driver To Correct Performance Variance Issue of R9 290 Series Graphics Cards
Subject: Graphics Cards, Cases and Cooling | November 8, 2013 - 02:41 AM | Tim Verry
Tagged: R9 290X, powertune, hawaii, graphics drivers, gpu, GCN, catalyst 13.11 beta, amd, 290x
AMD recently launched its 290X graphics card, the new high-end single-GPU solution based on the GCN Hawaii architecture. The new GPU is rather large and incorporates an updated version of AMD's PowerTune technology, which automatically adjusts clockspeeds based on temperature while capping the fan at 40% of its maximum speed. Unfortunately, it seems that some 290X cards available at retail exhibited performance characteristics that varied from review units.
AMD has looked into the issue and released the following statement in response to the performance variances (which PC Perspective is looking into as well).
Hello, We've identified that there's variability in fan speeds across AMD R9 290 series boards. This variability in fan speed translates into variability of the cooling capacity of the fan-sink. The flexibility of AMD PowerTune technology enables us to correct this variability in a driver update. This update will normalize the fan RPMs to the correct values.
The correct target RPM values are 2200RPM for the AMD Radeon R9 290X "Quiet mode", and 2650RPM for the R9 290. You can verify these in GPU-Z. If you're working on stories relating to R9 290 series products, please use this driver as it will reduce any variability in fan speeds. This driver will be posted publicly tonight.
From the AMD statement, the performance variance appears to come down to fan speeds differing from card to card. With a GPU rated to run at up to 95C, a fan limited to a 40% maximum, and dynamic clockspeeds, it is only natural that cards could perform differently, especially if case airflow is not up to par. On the other hand, the specific issue pointed out by other technology review sites (to my understanding, Tom's Hardware first reported on the retail-versus-review-sample variance) is that the 40% maximum on certain cards does not actually correspond to the RPM target AMD intended.
AMD intended for the Radeon R9 290X's fan to run at 2200RPM (40%) in Quiet Mode and the fan on the R9 290 (which has a maximum fan speed percentage of 47%) to spin at 2650 RPM in Quiet Mode. However, some cards' 40% settings are not actually hitting those intended RPMs, which causes performance differences as cooling varies and PowerTune adjusts clockspeeds accordingly.
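The relationship AMD is normalizing can be sketched with a simplified model (my own illustration; the real controller is firmware-driven PWM, and the maximum-RPM figure below is inferred from AMD's 40% = 2200 RPM target, not an official spec): a duty-cycle percentage only maps to a fixed RPM if every fan's top speed is identical.

```python
def target_rpm(duty_pct, max_rpm):
    # Naive duty-cycle -> RPM mapping. If max_rpm varies from
    # board to board, the same 40% setting yields different RPMs,
    # and therefore different cooling and different clockspeeds.
    return duty_pct * max_rpm / 100.0

# With an assumed ~5500 RPM maximum, 40% lands on AMD's
# 2200 RPM Quiet Mode target for the R9 290X:
print(target_rpm(40, 5500))  # 2200.0
# A board whose fan tops out lower misses that target:
print(target_rpm(40, 5000))  # 2000.0
```

This is presumably why the driver fix targets RPM values directly rather than a percentage.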
Luckily, AMD has addressed the issue with a driver update. The update ensures that the fans are actually spinning at the intended speeds when set to the 40% (R9 290X) or 47% (R9 290) values in Catalyst Control Center. The new driver, which includes the fix, is Catalyst 13.11 Beta 9.2 and is available for download now.
If you are running a R9 290 or R9 290X in your system, you should consider updating to the latest driver to ensure you are getting the cooling (and as a result gaming) performance you are supposed to be getting.
Catalyst 13.11 Beta 9.2 is available from the AMD website.
- AMD Radeon R9 290X Hawaii - The Configurable GPU?
- AMD Radeon R9 290 4GB Review - Trip to Hawaii for $399
Stay tuned to PC Perspective for more information on the Radeon R9 290 series GPU performance variance issue as it develops.
Image credit: Ryan Shrout (PC Perspective).
Subject: Graphics Cards | October 10, 2013 - 03:29 PM | Jeremy Hellstrom
Tagged: radeon, r9 270x, GCN, sapphire, toxic edition, factory overclocked
We saw the release of the reference R9s yesterday, and today we get to see the custom models, such as the Sapphire TOXIC R9 270X that Legit Reviews just finished benchmarking. The TOXIC sports a 100MHz overclock on both GPU and RAM as well as a custom triple-fan cooler. While it remains a two-slot card, it is longer than the reference model and requires a full foot of clearance inside the case. Read on to see what kind of performance boost you can expect and how much further you can push this card.
"When it comes to discrete graphics, the $199 price point is known as the gamer’s sweet spot by both AMD and NVIDIA. This is arguably the front line in the battle for your money when it comes to gaming graphics cards. The AMD Radeon R9 270X is AMD’s offering to gamers at this competitive price point. Read on to see how it performs!"
Here are some more Graphics Card articles from around the web:
- Gigabyte Radeon R9 270X WindForce OC 2GB @ eTeknix
- ASUS Radeon R9 270X Direct CU II TOP 2GB @ eTeknix
- MSI Radeon R9 270X Hawk Edition Video Card Review @HiTech Legion
- Gigabyte R9 270X Windforce @ LanOC Reviews
- Sapphire R9 280X Toxic Edition OC 3GB @ Kitguru
- MSI Radeon R9 270X GAMING 2GB @ Benchmark Reviews
- AMD Radeon R9 280X / R9 270X from ASUS and MSI @ Hardware.info
- ASUS R9 270X Direct CU II TOP @ Kitguru
- Gigabyte Radeon R9 270X OC 2GB Video Card Review @ HiTech Legion
- ASUS R9 280X Matrix Platinum @ Kitguru
- Will it Crossfire? R9 280X & HD 7970 Scaling Tested @ Hardware Canucks
- AMD Radeon R9 280X Graphics Card Review @ Techgage
- AMD Radeon R7 260X Versus NVIDIA GeForce GTX 650 Ti Boost @ Legit Reviews
Subject: Graphics Cards | October 8, 2013 - 05:30 PM | Jeremy Hellstrom
Tagged: amd, GCN, graphics core next, hd 7790, hd 7870 ghz edition, hd 7970 ghz edition, r7 260x, r9 270x, r9 280x, radeon, ASUS R9 280X DirectCU II TOP
AMD's rebranded cards have arrived, though with a few improvements to the GCN architecture that we already know so well. This particular release seems to be focused on price-to-performance, which is certainly not a bad thing in these uncertain times. The 7970 GHz Edition launched at $500, while the new R9 280X will arrive at $300 - a rather significant price drop, and one we hope doesn't damage AMD's bottom line too badly in the coming quarters. [H]ard|OCP chose the ASUS R9 280X DirectCU II TOP to test, with a custom PCB from ASUS and a mild overclock which helped it pull ahead of the 7970 GHz. AMD has tended to lead off new graphics card families with the low-end and midrange models, and we have yet to see the top-of-the-line R9 290X in action.
Ryan's review, including frame pacing, can be found right here.
"We evaluate the new ASUS R9 280X DirectCU II TOP video card and compare it to GeForce GTX 770 and Radeon HD 7970 GHz Edition. We will find out which video card provides the best value and performance in the $300 price segment. Does it provide better performance than its "competition" in the ~$400 price range?"
Here are some more Graphics Card articles from around the web:
- AMD's Radeon R7 260X @ The Tech Report
- AMD's Radeon R9 280X and 270X @ The Tech Report
- AMD Radeon R9 270X & R7 260X Review @ Neoseeker
- AMD Radeon R7 260X 2GB @ eTeknix
- AMD Radeon R9 270X 2GB @ eTeknix
- AMD Radeon R7 260X, R9 270X and R9 280X @ Hardware.info
- Sapphire AMD Radeon R9 280X Vapor-X OC 3GB @ eTeknix
- Radeon R9 270X and R7 260X @ TechSpot
- AMD Radeon R9 270X & R7 260X @ Legion Hardware
- AMD Radeon R9 270X & R7 260X Review @ Hardware Canucks
- AMD Radeon R9 280X 3GB Review @ Hardware Canucks
- Sapphire R9 280X Vapor X @ Kitguru
- AMD R7 260X @ Kitguru
- AMD R9 270X @ Kitguru
The AMD Radeon R9 280X
Today marks the first step in a revamp of AMD's entire Radeon discrete graphics product stack. Between now and the end of 2013, AMD will completely cycle out Radeon HD 7000 cards and replace them with a new branding scheme. The "HD" branding is on its way out, and it makes sense: consumers have moved on to UHD and WQXGA display standards, so HD is no longer extraordinary.
But I want to be very clear and upfront with you: today is not the day that you’ll learn about the new Hawaii GPU that AMD promised would dominate the performance per dollar metrics for enthusiasts. The Radeon R9 290X will be a little bit down the road. Instead, today’s review will look at three other Radeon products: the R9 280X, the R9 270X and the R7 260X. None of these products are really “new”, though, and instead must be considered rebrands or repositionings.
There are some changes to discuss with each of these products, including clock speeds and more importantly, pricing. Some are specific to a certain model, others are more universal (such as updated Eyefinity display support).
Let’s start with the R9 280X.
AMD Radeon R9 280X – Tahiti aging gracefully
The AMD Radeon R9 280X is built from the exact same ASIC (chip) that powers the previous Radeon HD 7970 GHz Edition, with a few modest changes. The core clock speed of the R9 280X at reference rates is actually about 50 MHz lower than that of the Radeon HD 7970 GHz Edition. The R9 280X GPU will hit a 1.0 GHz rate while the previous model reached 1.05 GHz; not much of a change, but an interesting decision for sure.
Because of that speed difference the R9 280X has a lower peak compute capability of 4.1 TFLOPS compared to the 4.3 TFLOPS of the 7970 GHz. The memory clock speed is the same (6.0 Gbps) and the board power is the same, with a typical peak of 250 watts.
Everything else remains as you know it from the HD 7970 cards. There are 2048 stream processors in the Tahiti version of AMD’s GCN (Graphics Core Next), 128 texture units, and 32 ROPs, all fed by a 384-bit GDDR5 memory bus running at 6.0 Gbps. Yep, still a 3GB frame buffer.
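Those peak figures check out with simple arithmetic (my own back-of-the-envelope, using the usual convention of 2 FLOPs per shader per clock for a fused multiply-add):

```python
def peak_tflops(shaders, clock_ghz, flops_per_clock=2):
    # flops_per_clock=2 assumes one fused multiply-add per cycle,
    # which counts as a multiply plus an add.
    return shaders * clock_ghz * flops_per_clock / 1000.0

print(round(peak_tflops(2048, 1.00), 1))  # R9 280X:     4.1
print(round(peak_tflops(2048, 1.05), 1))  # HD 7970 GHz: 4.3

def memory_bandwidth_gbs(bus_bits, gbps_per_pin):
    # bus width in bits x per-pin data rate, divided by 8 for bytes
    return bus_bits * gbps_per_pin / 8.0

print(memory_bandwidth_gbs(384, 6.0))     # 288.0 GB/s
```

So the 50 MHz clock reduction is exactly what separates the 280X's 4.1 TFLOPS from the 7970 GHz Edition's 4.3, while memory bandwidth is unchanged at 288 GB/s.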