Subject: Storage | March 27, 2017 - 12:16 PM | Allyn Malventano
Tagged: XPoint, Optane Memory, Optane, M.2, Intel, cache, 3D XPoint
We are just about to hit two years since Intel and Micron jointly launched 3D XPoint, and there have certainly been a lot of stories about it since. Intel officially launched the P4800X last week, and this week they are officially launching Optane Memory. The base-level information about Optane Memory is mostly unchanged; however, we do have a slide deck we are allowed to pick from to point out some of the things we can look forward to once the new tech starts hitting devices you can own.
Alright, so this is Optane Memory in a nutshell. Put some XPoint memory on an M.2 form factor device, leverage Intel's SRT caching tech, and you get a 16GB or 32GB cache laid over your system's primary HDD.
To help explain what good Optane can do for typical desktop workloads, first we need to dig into Queue Depths a bit. Above are some examples of the typical QD various desktop applications run at. This data is from direct IO trace captures of systems in actual use. Now that we've established that the majority of desktop workloads operate at very low Queue Depths (<= 4), let's see where Optane performance falls relative to other storage technologies:
There's a bit to digest in this chart, but let me walk you through it. The ranges tapering off show the percentage of IOs falling at the various Queue Depths, while the green, red, and orange lines ramping up to higher IOPS (right axis) show relative SSD performance at those same Queue Depths. The key to Optane's performance benefit here is that it can ramp up to full performance at very low QDs, while the other NAND-based parts require significantly more parallel requests to achieve full rated performance. This is what will ultimately lead to much snappier responsiveness for, well, just about anything hitting the storage. Fun fact - there is actually an HDD on that chart. It's the yellow line that you might have mistaken for the horizontal axis :).
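For a rough feel of why latency matters so much at low Queue Depths, here is a minimal sketch based on Little's Law (IOPS is roughly queue depth divided by per-IO latency), capped at a device maximum. The function name, latencies, and IOPS caps are illustrative placeholders, not figures taken from Intel's chart.

```python
def iops_at_qd(queue_depth, latency_us, max_iops):
    """Estimate IOPS with Little's Law, capped at the device's rated maximum.

    latency_us is the average time to complete one IO, in microseconds.
    """
    if queue_depth <= 0 or latency_us <= 0:
        raise ValueError("queue_depth and latency_us must be positive")
    uncapped = queue_depth * 1_000_000 / latency_us
    return min(uncapped, max_iops)


# Hypothetical devices: the numbers are placeholders chosen only to show the shape.
devices = {
    "low-latency (XPoint-like)": {"latency_us": 10, "max_iops": 500_000},
    "higher-latency (NAND-like)": {"latency_us": 100, "max_iops": 400_000},
}

for name, spec in devices.items():
    for qd in (1, 2, 4, 32):
        estimate = iops_at_qd(qd, spec["latency_us"], spec["max_iops"])
        print(f"{name:28s} QD{qd:<3d} ~{estimate:,.0f} IOPS")
```

With these placeholder values, the low-latency device gets to a large share of its cap by QD 4, while the higher-latency device needs far more outstanding requests to approach its own cap. That is the same shape the chart describes.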
As you can see, we have a few integrators on board already. Official support requires a 270 series motherboard and Kaby Lake CPU, but it is possible that motherboard makers could backport the required NVMe v1.1 and Intel RST 15.5 requirements into older systems.
For those curious whether caching is the only way power users will be able to go with Optane, that's not the case. Atop that pyramid there sits an 'Intel Optane SSD', which should basically be a consumer version of the P4800X. It is sure to be an incredibly fast SSD, but that performance will most definitely come at a price!
We should be testing Optane Memory shortly and will finally have some publishable results of this new tech as soon as we can!
If you look at the current 2-in-1 notebook market, it is clear that the single greatest influence is the Lenovo Yoga. Despite initial efforts to differentiate convertible notebook-tablet designs, newly released machines such as the HP Spectre x360 series and the Dell XPS 13" 2-in-1 make it clear that the 360-degree "Yoga-style" hinge is the preferred method.
Today, we are looking at a unique application of the 360-degree hinge, the Lenovo Yoga Book. Will this new take on the 2-in-1 concept be as influential?
The Lenovo Yoga Book is a 10.1" tablet that aims to find a unique way to implement a stylus on a modern touch device. The device itself is a super thin clamshell-style design, featuring an LCD on one side of the device, and a large touch-sensitive area on the opposing side.
This large touch area serves two purposes. Primarily, it acts as a surface for the included stylus that Lenovo is calling the Real Pen. Using the Real Pen, users can do things such as sketch in Adobe Photoshop and Illustrator or take notes in an application such as Microsoft OneNote.
The Real Pen has more tricks up its sleeve than just a normal stylus. It can be converted from a pen with a Stylus tip on it to a full ballpoint pen. When paired with the "Create Pad" included with the Yoga Book, you can write on top of a piece of actual paper using the ballpoint pen, and still have the device pick up on what you are drawing.
Subject: Storage | March 19, 2017 - 12:21 PM | Allyn Malventano
Tagged: XPoint, SSD DC P4800X, Optane Memory, Optane, Intel, client, 750GB, 3D XPoint, 375GB, 1.5TB
Intel brought us out to their Folsom campus last week for some in-depth product briefings. Much of our briefing is still under embargo, but the portion that officially lifts this morning is the SSD DC P4800X:
MSRP for the 375GB model is estimated at $1520 (about $4/GB), which is rather spendy, but given that the product has shown it can effectively displace RAM in servers, we should be comparing the cost/GB with DRAM and not NAND. It should also be noted that this is nearly half the cost/GB of the X25-M at its launch. Capacities will go all the way up to 1.5TB, and U.2 form factor versions are also on the way.
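The cost/GB figure is simple division, sketched below so it can be rerun against whatever DRAM or NAND price you want to compare. The function name is ours, and the only inputs are the estimated MSRP and capacity quoted above.

```python
def cost_per_gb(price_usd, capacity_gb):
    """Return price per gigabyte in US dollars."""
    if capacity_gb <= 0:
        raise ValueError("capacity_gb must be positive")
    return price_usd / capacity_gb


# Estimated MSRP and capacity of the 375GB P4800X, as quoted in the text.
p4800x = cost_per_gb(1520, 375)
print(f"P4800X 375GB: ${p4800x:.2f}/GB")
```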
For those wanting a bit more technical info, the P4800X uses a 7-channel controller, with the 375GB model having 4 dies per channel (28 total). Overprovisioning does not do for Optane what it did for NAND flash, as XPoint can be rewritten at the byte level and does not need to be programmed in (KB) pages and erased in larger (MB) blocks. The only extra space on Optane SSDs is for ECC, firmware, and a small spare area to map out any failed cells.
Those with a keen eye (and calculator) might have noted that the early TBW values only put the P4800X at 30 DWPD for a 3-year period. At the event, Intel confirmed that they anticipate the P4800X to qualify at that same 30 DWPD for a 5-year period by the time volume shipment occurs.
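For anyone who wants to redo that calculator math, here is a minimal sketch converting a DWPD (drive writes per day) rating into total data written over the rating period. The function name and the 365-day year are our assumptions, and results are in decimal petabytes.

```python
def total_written_pb(dwpd, capacity_gb, years, days_per_year=365):
    """Total data written over the rating period, in decimal petabytes."""
    total_gb = dwpd * capacity_gb * days_per_year * years
    return total_gb / 1_000_000  # 1 PB = 1,000,000 GB


# 375GB P4800X at 30 DWPD, for the 3-year and 5-year periods mentioned above.
for years in (3, 5):
    print(f"{years} years at 30 DWPD: {total_written_pb(30, 375, years):.2f} PB written")
```

Holding the same 30 DWPD over 5 years instead of 3 means the drive has to absorb about two thirds more total writes.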
Subject: Processors | March 17, 2017 - 03:48 PM | Jeremy Hellstrom
Tagged: amd, Intel, ryzen, sanity check
Ars Technica asks the question that many reasonable people are also pondering, "Intel still beats Ryzen at games, but how much does it matter?". We here at PCPer have seen the same sorts of responses that Ars has: there is a group of people who expected Ryzen to miraculously beat any and all Intel chips at every possible task. More experienced heads were hoping for about what we received: a chip which can challenge Broadwell, offering performance which improved greatly on AMD's previous architecture. The launch has revealed some growing pains with AMD's new baby but not anything which makes Ryzen bad.
Indeed, with more DX12 or Vulkan games arriving we should see AMD's performance improve, especially if programmers start to take more effective advantage of high core counts. Head over to read the article, unless you feel that is not a requirement to comment on this topic.
"In spite of this, reading the various reviews around the Web—and comment threads, tweets, and reddit posts—one gets the feeling that many were hoping or expecting Ryzen to somehow beat Intel across the board, and there's a prevailing narrative that Ryzen is in some sense a bad gaming chip. But this argument is often paired with the claim that some kind of non-specific "optimization" is going to salvage the processor's performance, that AMD fans just need to keep the faith for a few months, and that soon Ryzen's full power will be revealed."
Here are some more Processor articles from around the web:
- AMD Ryzen 7 1800X, 1700X, and 1700 Processor Review @ Neoseeker
- AMD's Ryzen 5 Processors; A Preview @ Hardware Canucks
- AMD Ryzen 7 1800X 3.6 GHz @ techPowerUp
- AMD Ryzen 7 1700 @ Kitguru
Subject: General Tech | March 13, 2017 - 02:35 PM | Jeremy Hellstrom
Tagged: Intel, mobileye, self driving car, billions
BMW's self driving car division asked Intel and Mobileye to partner together to design the iNext spin off of BMW's electric car division. Mobileye specializes in sensors and software for autonomous or assisted driving; Tesla used their products in the Model S. Their success has not gone unnoticed and today they are Intel's latest acquisition in the IoT market, purchased for a total of roughly $15.3 billion US. Expect to see more Intel Inside stickers on cars, as Intel has recently purchased another IoT firm specializing in chip security as well as one focused on computer vision. Pop by The Inquirer for links to those other purchases.
"On Monday, Intel announced that it has purchased the company for £12.5bn, marking the biggest-ever acquisition of an Israeli tech company. It's also the biggest purchase of a company solely focused on the autonomous driving sector."
Here is some more Tech News from around the web:
- 6 of the most useful Google things no one uses @ The Inquirer
- Malware infecting Androids somewhere in the supply chain @ The Register
- IBM pushes blockchain system for e-transaction @ DigiTimes
- Q Has Nothing on Naomi Wu @ Hack a Day
- User lubed PC with butter, because pressing a button didn't work @ The Register
- Tim Berners-Lee says privacy needs fixing – and calls for 'algorithmic transparency' @ The Register
- Arozzi Arena Gaming Desk Review @ NikKTech
With the introduction of the Intel Kaby Lake processors and Intel Z270 chipset, unprecedented overclocking became the norm. The new processors easily hit a core speed of 5.0GHz with little more than CPU core voltage tweaking. This overclocking performance increase came with a price tag. The Kaby Lake processor runs significantly hotter than previous generation processors, a seeming reversal in temperature trends from previous generation Intel CPUs. At stock settings, the individual cores in the CPU were recorded in testing hitting up to 65C, and that's with a high performance water loop cooling the processor. Per reports from various enthusiast sites, Intel used inferior TIM (thermal interface material) between the CPU die and the underside of the CPU heat spreader, leading to increased temperatures when compared with previous CPU generations (in particular Skylake). This temperature increase did not affect overclocking much since the CPU will hit 5.0GHz easily, but it does impact the means necessary to hit those performance levels.
As with the previous generation Haswell CPUs, a few of the more adventurous enthusiasts used known methods in an attempt to address the heat concerns of the Kaby Lake processor by delidding the processor. Unlike in the initial days of the Haswell processor, the delidding process is much more streamlined with the availability of delidding kits from several vendors. The delidding process still involves physically removing the heat spreader from the CPU, and exposing the CPU die. However, instead of cooling the die directly, the "safer" approach is to clean the die and underside of the heat spreader, apply new TIM (thermal interface material), and re-affix the heat spreader to the CPU. Going this route instead of direct-die cooling is considered safer because no additional or exotic support mechanisms are needed to keep the CPU cooler from crushing your precious die. However, calling it safe is a bit of an overstatement: you are physically separating the heat spreader from the CPU surface and voiding your CPU warranty at the same time. Although if that were a concern, you probably wouldn't be reading this article in the first place.
Subject: Networking, Storage | March 4, 2017 - 11:57 PM | Scott Michaud
Tagged: netgear, Intel, Avoton, recall
While this is more useful for our readers in the IT field, NETGEAR has issued a (non-urgent) recall on sixteen models of Rackmount NAS and Wireless Controller devices. It looks like the reason for this announcement is to maintain customer relations. They are planning to reach out to customers “over the next several months” to figure out a solution for them. Note the relaxed schedule.
The affected model numbers are:
- WC7500 Series:
- WC7500-10000S, WC7500-100INS, WC7500-100PRS, WB7520-10000S, WB7520-100NAS, WB7530-10000S, WB7530-100NAS
- WC7600 Series:
- WC7600-20000S, WC7600-200INS, WC7600-200PRS, WB7620-10000S, WB7620-100NAS, WB7630-10000S, WB7630-100NAS
The Register noticed that each of these devices contains one of Intel’s Avoton-based Atom processors. You may remember our coverage from last month, which also sourced The Register, stating that these chips may fail to boot over time. NETGEAR is not blaming Intel for their recall, but gave The Register a wink and a nudge when pressed: “We’re not naming the vendor but it sounds as if you’ve done your research.”
Again, while this news applies to enterprise customers and it’s entirely possible that Intel (if it actually is the Avoton long-term failure issue) is privately supporting them, it’s good to see NETGEAR being honest and upfront. Problems will arise in the tech industry; often (albeit not always) what matters more is how they are repaired.
What Makes Ryzen Tick
We have been exposed to details about the Zen architecture for the past several Hot Chips conventions as well as other points of information directly from AMD. Zen was a clean sheet design that borrowed some of the best features from the Bulldozer and Jaguar architectures, as well as integrating many new ideas that had not been executed in AMD processors before. The fusion of ideas from higher performance cores, lower power cores, and experience gained in APU/GPU design have all come together in a very impressive package that is the Ryzen CPU.
It is well known that AMD brought back Jim Keller to head the CPU group after the slow downward spiral that AMD entered in CPU design. While the Athlon 64 was a tremendous part for the time, the subsequent CPUs being offered by the company did not retain that leadership position. The original Phenom had problems right off the bat and could not compete well with Intel’s latest dual and quad cores. The Phenom II shored up their position a bit, but in the end could not keep pace with the products that Intel continued to introduce with their newly minted “tick-tock” cycle. Bulldozer had issues out of the gate and did not have performance numbers that were significantly greater than the previous generation “Thuban” 6 core Phenom II product, much less the latest Intel Sandy Bridge and Ivy Bridge products that it would compete with.
AMD attempted to stop the bleeding by iterating and evolving the Bulldozer architecture with Piledriver, Steamroller, and Excavator. The final products based on this design arc seemed to do fine for the markets they were aimed at, but certainly did not regain any marketshare with AMD’s shrinking desktop numbers. No matter what AMD did, the base architecture just could not overcome some of the basic properties that impeded strong IPC performance.
The primary goal of this new architecture is to increase IPC to a level consistent with what Intel has to offer. AMD aimed to increase IPC by at least 40% over the previous Excavator core. This is a pretty aggressive goal considering where AMD was with the Bulldozer architecture that was focused on good multi-threaded performance and high clock speeds. AMD claims that it has in fact increased IPC by an impressive 54% from the previous Excavator based core. Not only has AMD seemingly hit its performance goals, but it exceeded them. AMD also plans on using the Zen architecture to power everything from mobile products to the highest TDP parts offered.
The Zen Core
The basis for Ryzen is the CCX module. Each module contains four Zen cores along with 8 MB of shared L3 cache. Each core has 64 KB of L1 I-cache and 32 KB of L1 D-cache, plus 512 KB of L2 cache. These caches are inclusive. The L3 cache acts as a victim cache which partially copies what is in L1 and L2 caches. AMD has improved the performance of their caches to a very large degree as compared to previous architectures. The arrangement here allows the individual cores to quickly snoop any changes in the caches of the others for shared workloads. So if a cache line is changed on one core, other cores requiring that data can quickly snoop into the shared L3 and read it. Doing this allows the core doing the actual work to not be interrupted by cache read requests from other cores.
Each core can handle two threads, but unlike a Bulldozer module it has a single integer core. Bulldozer modules featured two integer units and a shared FPU/SIMD. Zen gets rid of CMT for good and we have a single integer unit and a single FPU for each core. The core can address two threads by utilizing AMD’s version of SMT (simultaneous multi-threading). There is a primary thread that gets higher priority while the second thread has to wait until resources are freed up. This works far better in the real world than how I explained it, as resources are constantly being shuffled about and the primary thread will not monopolize all resources within the core.
Linked Multi-GPU Arrives... for Developers
The Khronos Group has released the Vulkan 1.0.42 specification, which includes experimental (more on that in a couple of paragraphs) support for VR enhancements, sharing resources between processes, and linking similar GPUs. This spec was released alongside a LunarG SDK and NVIDIA drivers, which are intended for developers, not gamers, that fully implement these extensions.
I would expect that the most interesting feature is experimental support for linking similar GPUs together, similar to DirectX 12’s Explicit Linked Multiadapter, which Vulkan calls a “Device Group”. The idea is that the physical GPUs hidden behind this layer can do things like share resources, such as rendering a texture on one GPU and consuming it in another, without the host code being involved. I’m guessing that some studios, like maybe Oxide Games, will decide to not use this feature. While it’s not explicitly stated, I cannot see how this (or DirectX 12’s Explicit Linked mode) would be compatible in cross-vendor modes. Unless I’m mistaken, that would require AMD, NVIDIA, and/or Intel restructuring their drivers to inter-operate at this level. Still, the assumptions that could be made with grouped devices are apparently popular with enough developers for both the Khronos Group and Microsoft to bother.
A slide from Microsoft's DirectX 12 reveal, long ago.
As for the “experimental” comment that I made in the introduction... I was expecting to see this news around SIGGRAPH, which occurs in late-July / early-August, alongside a minor version bump (to Vulkan 1.1).
I might still be right, though.
The major new features of Vulkan 1.0.42 are implemented as a new classification of extensions: KHX. In the past, vendors, like NVIDIA and AMD, would add new features as vendor-prefixed extensions. Games could query the graphics driver for these abilities, and enable them if available. If these features became popular enough for multiple vendors to have their own implementation of it, a committee would consider an EXT extension. This would behave the same across all implementations (give or take) but not be officially adopted by the Khronos Group. If they did take it under their wing, it would be given a KHR extension (or added as a required feature).
The Khronos Group has added a new layer: KHX. This level of extension sits below KHR, and is not intended for production code. You might see where this is headed. The VR multiview, multi-GPU, and cross-process extensions are not supposed to be used in released video games until they leave KHX status. Unlike a vendor extension, the Khronos Group wants old KHX standards to drop out of existence at some point after they graduate to full KHR status. It’s not something that NVIDIA owns and will keep around for 20 years after its usable lifespan just so old games can behave as expected.
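To make the naming tiers concrete, here is a minimal sketch that sorts extension names by their prefix and drops the experimental ones from a production build. The tier labels and function names are ours, and the sample list is hand-written for illustration; a real application would get its names from the driver (for example via vkEnumerateDeviceExtensionProperties).

```python
def extension_tier(name):
    """Classify a Vulkan extension name by the author tag after 'VK_'."""
    parts = name.split("_")
    if len(parts) < 3 or parts[0] != "VK":
        raise ValueError(f"not a Vulkan extension name: {name!r}")
    tag = parts[1]
    if tag == "KHR":
        return "khronos"        # adopted by the Khronos Group
    if tag == "KHX":
        return "experimental"   # not intended for production code
    if tag == "EXT":
        return "multi-vendor"   # common behavior, not officially adopted
    return "vendor"             # e.g. NV, AMD


def production_safe(names):
    """Keep every extension except the experimental KHX ones."""
    return [n for n in names if extension_tier(n) != "experimental"]


# Hand-written sample list, not the output of any particular driver.
sample = [
    "VK_KHR_swapchain",
    "VK_EXT_debug_report",
    "VK_KHX_device_group",
    "VK_KHX_multiview",
    "VK_NV_glsl_shader",
]

for ext in sample:
    print(f"{ext:24s} {extension_tier(ext)}")
print("Safe to ship with:", production_safe(sample))
```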
How long will that take? No idea. I’ve already mentioned my logical but uneducated guess a few paragraphs ago, but I’m not going to repeat it; I have literally zero facts to base it on, and I don’t want our readers to think that I do. I don’t. It’s just based on what the Khronos Group typically announces at certain trade shows, and the length of time since their first announcement.
The benefit that KHX does bring us is that, whenever these features make it to public release, developers will have already been using it... internally... since around now. When it hits KHR, it’s done, and anyone can theoretically be ready for it when that time comes.
Zen vs. 40 Years of CPU Development
Zen is nearly upon us. AMD is releasing its next generation CPU architecture to the world this week and we saw CPU demonstrations and upcoming AM4 motherboards at CES in early January. We have been shown tantalizing glimpses of the performance and capabilities of the “Ryzen” products that will presumably fill the desktop markets from $150 to $499. I have yet to be briefed on the product stack that AMD will be offering, but we know enough to start to think how positioning and placement will be addressed by these new products.
To get a better understanding of how Ryzen will stack up, we should probably take a look back at what AMD has accomplished in the past and how Intel has responded to some of the stronger products. AMD has been in business for 47 years now and has been a major player in semiconductors for most of that time. It really has only been since the 90s, when AMD started to battle Intel head to head, that people have become passionate about the company and their products.
The industry is a complex and ever-shifting one. AMD and Intel have been two stalwarts over the years. Even though AMD has had more than a few challenging years over the past decade, it still moves forward and expects to compete at the highest level with its much larger and better funded competitor. 2017 could very well be a breakout year for the company with a return to solid profitability in both CPU and GPU markets. I am not the only one who thinks this considering that AMD shares that traded around the $2 mark ten months ago are now sitting around $14.
AMD Through 1996
AMD became a force in the CPU industry due to IBM’s requirement to have a second source for its PC business. Intel originally entered into a cross licensing agreement with AMD to allow it to produce x86 chips based on Intel designs. AMD eventually started to produce their own versions of these parts and became a favorite in the PC clone market. Eventually Intel tightened down on this agreement and then cancelled it, but through near endless litigation AMD ended up with an x86 license deal with Intel.
AMD produced their own Am286 chip that was the first real break from the second sourcing agreement with Intel. Intel balked at sharing their 386 design with AMD and eventually forced the company to develop its own clean room version. The Am386 was released in the early 90s, well after Intel had been producing those chips for years. AMD then developed their own version of the 486, the Am486, which then morphed into the Am5x86. The company made some good inroads with these speedy parts and typically clocked them faster than their Intel counterparts (e.g. Am486 40 MHz and 80 MHz vs. the Intel 486 DX33 and DX66). AMD priced these parts lower so users could achieve better performance per dollar using the same chipsets and motherboards.
Intel released their first Pentium chips in 1993. The initial version was hot and featured the infamous FDIV bug. AMD made some inroads against these parts by introducing the faster Am486 and Am5x86 parts that would achieve clockspeeds from 133 MHz to 150 MHz at the very top end. The 150 MHz part was very comparable in overall performance to the Pentium 75 MHz chip and we saw the introduction of the dreaded “P-rating” on processors.
There is no denying that Intel continued their dominance throughout this time by being the gold standard in x86 manufacturing and design. AMD slowly chipped away at its larger rival and continued to profit off of the lucrative x86 market. Jerry Sanders III set the bar higher for where he wanted the company to go, and he started on a much more aggressive path than many expected the company to take.