Subject: General Tech, Processors | December 14, 2013 - 01:55 AM | Scott Michaud
Tagged: opteron, arm, amd
The ARMv8 architecture extends the hardware platform to 64-bit. This increase is mostly useful to address massive amounts of memory but can also have other benefits for performance. I think many of us remember the excitement prior to x86-64 and the subsequent let-down when we realized that, for most applications, typical vector extensions kept up in performance especially considering the compatibility issues of the day. It needed to happen but it was a hard sell until... it was just ubiquitous.
AMD has not kept it secret that they are developing 64-bit ARM processors for data centers but, until this week, further details were scarce. Under the codename "Seattle", these processors will be available in four and eight cores. The Opteron branding will expand beyond x86 to include these new processors. The pitch to enterprises is simple: want both ARM and x86? Why bother with two vendors?
Seattle will also support up to 128GB of ECC memory and 10 Gigabit Ethernet for dense, but power efficient, compute clusters. It will be manufactured on the 28nm process.
The majority of AMD's blog post proclaimed its commitment to software support and it is definitely true that they hold a very high status in both the Linux and Apache Foundations. ARMv8 is supported in Linux starting with kernel 3.7.
Seattle is expected to launch in the second half of 2014.
Subject: General Tech, Processors | December 13, 2013 - 08:49 PM | Scott Michaud
Tagged: Intel, haswell
Intel will begin to refresh their Haswell line of processors, according to VR-Zone, starting in Q2 and continuing into Q3. This will be accompanied by their 9-series of motherboard chipsets. The Intel Core i7-4770 and Core i7-4771 will be replaced, not just surpassed, by the Core i7-4790. That said, the only difference is a 100MHz bump to both the base and turbo CPU frequencies.
The K-series processors will come in Q3 and are said to be based on Haswell-E with DDR4 memory. I find this quite confusing because of previous reports that Broadwell-K would appear at roughly the same time. I am unsure what this means for Broadwell-K and I am definitely unsure why some Haswell-E components would be considered part of the Haswell refresh instead of the Haswell-E launch.
Subject: General Tech, Processors | December 10, 2013 - 06:56 PM | Scott Michaud
Tagged: Richland, amd
AMD has been heavily promoting their Kaveri platform leading up to its January launch. This new generation of parts should slowly replace Richland with faster and HSA-compliant silicon. AMD added a new member of the Richland family on October 29th, however, called the A10-6790K. With a base frequency of 4.1 GHz (turbo to 4.3 GHz) and 384 shader cores clocked at 844 MHz, it has a maximum theoretical compute power of 779 GFLOPs.
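The 779 GFLOPs figure can be reproduced with a simple peak-throughput estimate. The sketch below is not taken from AMD's material: it assumes 2 FLOPs per shader per clock on the GPU, 8 FLOPs per core per clock on the CPU, four CPU cores, and the base (not turbo) CPU clock. Those factors are assumptions that happen to land on the quoted number.

```python
def peak_gflops(cpu_cores, cpu_ghz, shaders, gpu_mhz,
                cpu_flops_per_clock=8, gpu_flops_per_clock=2):
    """Return (cpu, gpu, total) theoretical peak throughput in GFLOPs.

    The per-clock factors are assumptions, not figures from the post.
    """
    cpu = cpu_cores * cpu_ghz * cpu_flops_per_clock
    gpu = shaders * (gpu_mhz / 1000.0) * gpu_flops_per_clock
    return cpu, gpu, cpu + gpu


# A10-6790K: assumed quad-core, 4.1 GHz base, 384 shaders at 844 MHz
cpu, gpu, total = peak_gflops(cpu_cores=4, cpu_ghz=4.1, shaders=384, gpu_mhz=844)
print(f"CPU {cpu:.1f} + GPU {gpu:.1f} = {total:.1f} GFLOPs")
```

Under these assumptions the GPU accounts for roughly five sixths of the total, which is why the shader clock matters far more to the headline number than the CPU clock.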
Image Credit: HCW
Carl Nelson of Hardcoreware (HCW) picked one of these APUs up and tested it against a number of metrics (including OpenCL performance) and four similarly priced competitors. Specifically, he found Battlefield 4 playable on low (~35 FPS) at 720p without a discrete graphics solution, which is especially appealing for a home theater PC (HTPC).
Even though better things are on the horizon, you may want to check out his review if only as a comparison to what will arrive next month. Who knows, maybe this fits your $120-130 price point.
Subject: Processors | December 9, 2013 - 06:23 PM | Jeremy Hellstrom
Tagged: xeon e3, Intel, haswell, 1230Lv3
Server chips with low power consumption are in style and the Xeon E3-1230Lv3 certainly qualifies at a tiny 25W TDP. It is a Haswell chip with a base speed of 1.8GHz which would be great for a small business or for a home server. eTeknix compared the performance of this chip to the i7-4770K, which has a TDP more than three times that of the Xeon; that is perhaps a little unfair to the E3, but the i7 is a familiar chip to most enthusiasts. That said, the Xeon doesn't fall too far behind in many tests and, at $250, it is less expensive to slap into a Z87 motherboard and will reduce your power bill somewhat.
"Intel’s Xeon E3-1230Lv3 CPU has been a hotly anticipated processor for a wide variety of target audiences – home users, office users, small business users and enterprise users. Today we’ve got an opportunity to put Intel’s enterprise Xeon E3-1230Lv3 CPU to the test in a professional home user or “prosumer” type of environment, by pairing it up with SuperMicro’s server-grade C7Z87-OCE motherboard. The Intel Xeon E3-1230Lv3 is an important CPU because it offers four cores, eight threads, a 1.8GHz base frequency, a 2.8GHz Turbo frequency and 8MB of cache all for a tiny TDP of just 25W."
Here are some more Processor articles from around the web:
- Intel Core i3 4330 / i5 4440 @ Hardware.info
- Core i5-4670K, Core i5-4670, Core i5-4570 and Core i5-4430 @ X-bit Labs
- How to Overclock an Intel 4770K Guide @ OCC
- All Core i3 Models @ Hardware Secrets
- Intel Core i7 4960X Ivy Bridge Extreme Edition On Linux @ Phoronix
- Intel Core i3 4130 @ Phoronix
- The Workstation & Server CPU Comparison Guide @ TechARP
- All AMD FX CPU Models @ Hardware Secrets
Subject: General Tech, Graphics Cards, Processors | December 3, 2013 - 04:12 AM | Scott Michaud
Tagged: Kaveri, APU, amd
The launch and subsequent availability of Kaveri is scheduled for the CES time frame. The APU unites Steamroller x86 cores with several Graphics Core Next (GCN) cores. The high-end offering, the A10-7850K, is capable of 856 GFLOPs of compute power (most of which is of course from the GPU).
Image/Leak Credit: Prohardver.hu
We now know about two SKUs: the A10-7850K and the A10-7700K. Both parts are quite similar except that the higher model is given a 200 MHz CPU bump, 3.8 GHz to 4.0 GHz, and 33% more GPU units, 6 to 8.
But how does this compare? The original source (prohardver.hu) claims that Kaveri will achieve an average 28 FPS in Crysis 3 on low at 1680x1050; this is a 12% increase over Richland. It also achieved an average 53 FPS with Sleeping Dogs on Medium which is 26% more than Richland.
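The percentages above imply a Richland baseline even though the leak does not quote one. As a rough sketch (the function name is mine, and both inputs are rounded averages, so the results are only approximate), the baseline can be backed out like this:

```python
def implied_baseline(new_fps, pct_increase):
    """Back out the older part's frame rate from the new one and the quoted gain."""
    return new_fps / (1 + pct_increase / 100.0)


# Figures quoted from the prohardver.hu leak
crysis3_richland = implied_baseline(28, 12)
sleeping_dogs_richland = implied_baseline(53, 26)

print(f"Crysis 3 (low, 1680x1050): ~{crysis3_richland:.0f} FPS on Richland")
print(f"Sleeping Dogs (medium):    ~{sleeping_dogs_richland:.0f} FPS on Richland")
```

That works out to a gain of about 3 FPS in Crysis 3 and about 11 FPS in Sleeping Dogs, assuming the percentages were computed against Richland's averages.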
These are healthy increases over the previous generation but do not even account for HSA advantages. I am really curious what will happen if integrated graphics become accessible enough that game developers decide to target it for general compute applications. The reduction in latency (semi-wasted time bouncing memory between compute devices) might open this architecture to where it can really shine.
We will do our best to keep you up to date on this part especially when it launches at CES.
Subject: General Tech, Graphics Cards, Processors | November 28, 2013 - 03:30 AM | Scott Michaud
Tagged: Intel, Xeon Phi, gpgpu
Intel was testing the waters with their Xeon Phi co-processor. Based on the architecture designed for the original Pentium processors, it was released in six products ranging from 57 to 61 cores and 6 to 16GB of RAM. This led to double precision performance of between 1 and 1.2 TFLOPs. It was fabricated using their 22nm tri-gate technology. All of this was under the Knights Corner initiative.
In 2015, Intel plans to have Knights Landing ready for consumption. A modified Silvermont architecture will replace the many simple (basically 15 year-old) cores of the previous generation; up to 72 Silvermont-based cores (each with 4 threads) in fact. It will introduce the AVX-512 instruction set. AVX-512 allows applications to vectorize 8 64-bit (double-precision float or long integer) or 16 32-bit (single-precision float or standard integer) values.
In other words, packing a bunch of related problems into a single instruction.
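The lane counts fall straight out of the register width: 512 bits divided by the element size. The snippet below is only an illustration in plain Python, not real AVX-512 code; `lanes` and `vector_add` are made-up helper names, and the list comprehension merely emulates what a single vector add does across all lanes at once.

```python
import struct

VECTOR_BITS = 512  # width of one AVX-512 register


def lanes(element_bits):
    """How many elements of the given width fit in one 512-bit register."""
    return VECTOR_BITS // element_bits


def vector_add(a, b):
    """Emulate one lane-wise add: every pair of lanes is summed independently."""
    if len(a) != len(b):
        raise ValueError("both operands must have the same number of lanes")
    return [x + y for x, y in zip(a, b)]


# Eight doubles or sixteen floats occupy exactly 64 bytes (512 bits).
eight_doubles = struct.pack("<8d", *[float(i) for i in range(8)])
sixteen_floats = struct.pack("<16f", *[float(i) for i in range(16)])

print(lanes(64), lanes(32))                     # lane counts for 64- and 32-bit values
print(len(eight_doubles), len(sixteen_floats))  # bytes occupied by each packed vector
print(vector_add([1.0] * lanes(64), [2.0] * lanes(64)))
```

On real hardware the loop inside `vector_add` does not exist; the eight (or sixteen) additions are issued as one instruction.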
The most interesting part? Two versions will be offered: Add-In Boards (AIBs) and a standalone CPU. It will not require a host CPU, because of its x86 heritage, if your application is entirely suited for an MIC architecture; unlike a Tesla, it is bootable with existing and common OSes. It can also be paired with standard Xeon processors if you would like a few strong threads with the 288 (72 x 4) the Xeon Phi provides.
And, while I doubt Intel would want to cut anyone else in, VR-Zone notes that this opens the door for AIB partners to make non-reference cards and manage some level of customer support. I'll believe a non-Intel branded AIB only when I see it.
Subject: General Tech, Processors, Storage | November 19, 2013 - 01:15 PM | Ryan Shrout
Tagged: i7-4770k, gold box, deals, amazon, 530 series
I don't often post about the Amazon Gold Box deals, but today the company has some great offerings specific to PC enthusiasts and DIY builders that you might want to take advantage of. Please keep in mind though that these deals are only good today, November 19th!!
The flagship offering is the Intel Core i7-4770K, the company's highest end LGA1150 Haswell processor, which is on sale for $299; $60 off the normal MSRP. That is the best price I have seen on that flagship CPU with the exception of in-store offerings from MicroCenters.
For those of you on a tighter budget, Amazon has the Core i5-3570K Ivy Bridge processor on sale for $199.
Another great price can be had on the Intel 530 Series 240GB SSD that is going for $149; well under the MSRP.
Here are some other interesting deals, all found on the Gold Box deal page:
- Kingston HyperX 8GB kit DDR3-1600 for $64
- Corsair H90 cooler for $69
- Corsair AX760i power supply for $149
- Corsair Obsidian 900D Super Tower case for $289
- Creative Sound Blaster Z PCIE Sound Card for $59
- Corsair K50 Gaming Keyboard for $79
- Corsair M65 Gaming Mouse for $49
- Kingston HyperX Steel Series Siberia V2 Gaming Headset for $64
And just remember: these deals are only good today, November 19th!!
Subject: Processors | November 13, 2013 - 05:35 PM | Josh Walrath
Tagged: Puma, Mullins, mobile, Jaguar, GCN, beema, apu13, APU, amd, 2014
AMD’s APU13 is all about APUs and their programming, but the hardware we have seen so far has been dominated by the upcoming Kaveri products for FM2+. It seems that AMD has more up their sleeves for release this next year, and it has somewhat caught me off guard. The Beema and Mullins based products are being announced today, but we do not have exact details on these products. The codenames have been around for some time now, but interest has been minimal since they are evolutionary products based on Kabini and Temash APUs that have been available this year. Little did I know that things would be far more interesting than that.
The basis for Beema and Mullins is the Puma core. This is a highly optimized revision of Jaguar, and in some ways can be considered a new design. All of the basics in terms of execution units, caches, and memory controllers are the same. What AMD has done is go through the design with a fine-toothed comb and make it far more efficient per clock than what we have seen previously. This is still a 28 nm part, but the extra attention and love lavished upon it by AMD has resulted in a much more efficient system architecture for the CPU and GPU portions.
The parts will be offered in two and four core configurations. Beema will span from 10W to 25W configurations. Mullins will go all the way down to “2W SDP”. SDP essentially means that while the chip can be theoretically rated higher, it will rarely go above that 2W envelope in the vast majority of situations. These chips are expected to be around 2X more efficient per clock than the previous Jaguar based products. This means that at similar clock speeds, Beema and Mullins will pull far less power than the previous generation. It should also allow some higher clockspeeds at the top end 25W area.
These will be some of the first fanless quad cores that AMD will introduce for the tablet market. Previously we have seen tablets utilize the cut down versions of Temash to hit power targets, but with this redesign it is entirely possible to utilize the fully enabled quad core Mullins. AMD has not given us specific speeds for these products, but we can guess that they will be around what we see currently, just with a lower TDP rating.
AMD is introducing their new security platform based on the ARM Trustzone. Essentially a small ARM Cortex A5 is integrated in the design and handles the security aspects of this feature. We were not briefed on how this achieves security, but the slide below gives some of the bullet points of the technology.
Since the pure-play foundries will not have a workable 20 nm process for AMD to jump to in a timely manner, AMD had no other choice but to really optimize the Jaguar core to make it more competitive with products from Intel and the ARM partners. At 28 nm the ARM ecosystem has a power advantage over AMD, while at 22 nm Intel offers similar performance to AMD but with greater power efficiency.
This is a necessary update for AMD as the competition has certainly not slowed down. AMD is more constrained obviously by the lack of a next-generation process node available for 1H 2014, so a redesign of this magnitude was needed. The performance per watt metric is very important here, as it promises longer battery life without giving up the performance people received from the previous Kabini/Temash family of APUs. This design work could be carried over to the next generation of APUs using 20 nm and below, which hopefully will keep AMD competitive with the rest of the market. Beema and Mullins are interesting looking products that will be shown off at CES 2014.
Subject: General Tech, Processors | November 12, 2013 - 06:50 PM | Scott Michaud
Tagged: Kaveri, apu13, amd
AMD will deliver its latest round of APUs (Kaveri) on January 14th. These processors, built on a 28nm process, will combine the Steamroller architecture on the CPU with HSA-compliant Graphics Core Next (GCN) cores on the GPU. Together they are expected to bring 856 GFLOPs of computational performance.
Thomas Ryan at SemiAccurate, however, remembers that AMD expected over a TeraFLOP.
"Of course Kaveri has been a troubled chip for AMD. At this point Kaveri is over a year late and most of that delay is due to a series of internal issues at AMD rather than technical problems. But now with the knowledge that Kaveri missed AMD’s internal performance targets by about 20 percent it’s hard to be very positive about AMD’s next big-core APU."
The problem comes from a reduction in the clock rate AMD expected back in February 2012. Steamroller was expected to reach 4 GHz but that has been slightly reduced to 3.7 GHz; this is obviously a small impact from a compute standpoint (weakened by just under 10 GFLOPs). The GPU, on the other hand, was cut from 900 MHz down to 720 MHz; its performance was reduced by a whole 20% (Update: I originally wrote 25% after accidentally dividing by 720 instead of 900). Using AMD's formula for calculating FLOP performance, Kaveri's 856 GFLOP rating corresponds to an 18% reduction from the original 1050 GFLOP target.
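For anyone who wants to check the arithmetic, here is a minimal sketch. It is not AMD's published formula; it assumes 8 FLOPs per CPU core per clock and 2 FLOPs per stream processor per clock, with the four cores and 512 stream processors listed for Kaveri at APU13. Those assumptions reproduce both the 856 and the 1050 GFLOP figures.

```python
def peak_gflops(cpu_ghz, gpu_mhz, cpu_cores=4, shaders=512):
    """Return (cpu, gpu) theoretical peak in GFLOPs.

    Assumed factors: 8 FLOPs/clock per CPU core, 2 FLOPs/clock per shader.
    """
    cpu = cpu_cores * cpu_ghz * 8
    gpu = shaders * gpu_mhz / 1000.0 * 2
    return cpu, gpu


target_cpu, target_gpu = peak_gflops(cpu_ghz=4.0, gpu_mhz=900)  # February 2012 plan
actual_cpu, actual_gpu = peak_gflops(cpu_ghz=3.7, gpu_mhz=720)  # announced clocks

target = target_cpu + target_gpu
actual = actual_cpu + actual_gpu

cpu_loss = target_cpu - actual_cpu       # GFLOPs lost to the CPU clock reduction
gpu_cut = 1 - actual_gpu / target_gpu    # fractional loss on the GPU side
shortfall = 1 - actual / target          # fractional loss overall

print(f"target {target:.0f} GFLOPs, actual {actual:.0f} GFLOPs")
print(f"CPU loss {cpu_loss:.1f} GFLOPs, GPU cut {gpu_cut:.0%}, overall {shortfall:.1%}")
```

Dividing the 180 MHz GPU reduction by 900 MHz (the original clock) gives the 20% cut; dividing by 720 MHz is what produces the mistaken 25%.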
But, personally, I am still positive about Kaveri.
The introduction of HSA features into mainstream x86 processors has begun. The ability to share memory between the CPU and the GPU could be a big deal, especially for tasks such as AI and physics. AI especially interests me (although I am by no means an expert) because it is a mixture of branching and parallel instructions. The HSA model could, potentially, operate on the data with whichever architecture makes sense. Currently, synchronizing CPU and GPU memory is very costly; you could easily spend most of your processing time budget waiting for memory transfers.
856 GFLOPs is a definite reduction from 1050 GFLOPs. Still, if Kaveri (and APUs going forward) can effectively nullify the latencies involved with GPGPU work, that is a lot of accessible compute; for comparison, an Intel Ivy Bridge-E Core i7 4960X has an instruction throughput of ~160 GFLOPs.
And before you say it: Yes, I know, Ivy Bridge-E can be paired with fast discrete graphics. This combination is ideal for easily separated tasks such as when the CPU prepares a frame and then a GPU draws it; you get the best of both worlds if both can keep working.
But what if your workload is a horrific mish-mash of back-and-forth serial and parallel? That is where AMD might have an edge.
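To make the mish-mash argument concrete, here is a toy cost model. Every number in it is hypothetical and chosen only for illustration; none of them are measurements of Kaveri, Ivy Bridge-E, or any real GPU, and `workload_ms` is a made-up helper.

```python
def workload_ms(handoffs, cpu_ms, gpu_ms, copy_ms):
    """Total time for a job that bounces between CPU and GPU `handoffs` times.

    Each handoff runs a serial CPU step and a parallel GPU step, and pays for
    two memory copies (to the GPU and back). With shared memory, copy_ms is 0.
    """
    return handoffs * (cpu_ms + gpu_ms + 2 * copy_ms)


# Hypothetical discrete setup: fast GPU step, but every handoff pays for copies.
discrete = workload_ms(handoffs=10, cpu_ms=0.2, gpu_ms=0.1, copy_ms=0.5)

# Hypothetical shared-memory APU: GPU step is three times slower, copies are free.
shared = workload_ms(handoffs=10, cpu_ms=0.2, gpu_ms=0.3, copy_ms=0.0)

copy_share = (10 * 2 * 0.5) / discrete  # fraction of discrete time spent copying

print(f"discrete: {discrete:.1f} ms ({copy_share:.0%} of it copying)")
print(f"shared:   {shared:.1f} ms")
```

With these made-up inputs the slower integrated GPU still finishes first, because the discrete setup spends most of its time moving data. Reduce the number of handoffs, or let the copies overlap with compute, and the discrete card pulls ahead again.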
Subject: Graphics Cards, Processors | November 12, 2013 - 06:10 PM | Ryan Shrout
Tagged: amd, Kaveri, APU, video, hsa
Yesterday at the AMD APU13 developer conference, the company showed off the upcoming Kaveri APU running Battlefield 4 completely on the integrated graphics. I was able to push the AMD guys along and get a little more personal demo to share with our readers. The Kaveri APU had some of its details revealed this week:
- Quad-core Steamroller x86
- 512 Stream Processor GPU
- 856 GFLOPS of theoretical performance
- 3.7 GHz CPU clock speed, 720 MHz GPU clock speed
AMD wanted to be sure we pointed out in this video that the estimated clock speeds for FLOP performance may not be what the demo system was run at (likely a bit lower). Also, the version of Battlefield 4 here is the standard retail version; further improvements from the driver team, as well as the upcoming Mantle API implementation, will likely introduce even more performance for the APU.
The game was running at 1920x1080 with MOSTLY medium quality settings (lighting set to low) but the results still looked damn impressive and the frame rates were silky and smooth. Considering this is running on a desktop with integrated processor graphics, the game play experience is simply unmatched.
Memory in the system was running at 2133 MHz.
The second demo looks at the image decoding acceleration that AMD is going to enable with Kaveri APUs upon release with a driver. Essentially, as the demonstration shows in the video, AMD is overwriting the integrated Windows JPG decompression algorithm with a new one that utilizes HSA to accelerate on both the x86 and SIMD (GPU) portions of the silicon. The most strenuous demo, which used 22 MP images, saw a 100% increase in performance compared to the Kaveri CPU cores alone.