** UPDATE 3/13 5 PM **
AMD has posted a follow-up statement that officially clears up much of the conjecture this article was attempting to clarify. Relevant points from their post that relate to this article as well as many of the requests for additional testing we have seen since its posting (emphasis mine):
"We have investigated reports alleging incorrect thread scheduling on the AMD Ryzen™ processor. Based on our findings, AMD believes that the Windows® 10 thread scheduler is operating properly for “Zen,” and we do not presently believe there is an issue with the scheduler adversely utilizing the logical and physical configurations of the architecture."
"Finally, we have reviewed the limited available evidence concerning performance deltas between Windows® 7 and Windows® 10 on the AMD Ryzen™ CPU. We do not believe there is an issue with scheduling differences between the two versions of Windows. Any differences in performance can be more likely attributed to software architecture differences between these OSes."
So there you have it, straight from the horse's mouth. AMD does not believe the problem lies within the Windows thread scheduler. SMT performance in gaming workloads was also addressed:
"Finally, we have investigated reports of instances where SMT is producing reduced performance in a handful of games. Based on our characterization of game workloads, it is our expectation that gaming applications should generally see a neutral/positive benefit from SMT. We see this neutral/positive behavior in a wide range of titles, including: Arma® 3, Battlefield™ 1, Mafia™ III, Watch Dogs™ 2, Sid Meier’s Civilization® VI, For Honor™, Hitman™, Mirror’s Edge™ Catalyst and The Division™. Independent 3rd-party analyses have corroborated these findings.
For the remaining outliers, AMD again sees multiple opportunities within the codebases of specific applications to improve how this software addresses the “Zen” architecture. We have already identified some simple changes that can improve a game’s understanding of the "Zen" core/cache topology, and we intend to provide a status update to the community when they are ready."
We are still digging into the observed differences of toggling SMT compared with disabling the second CCX, but it is good to see AMD issue a clarifying statement here for all of those out there observing and reporting on SMT-related performance deltas.
** END UPDATE **
Editor's Note: The testing you see here was a response to many days of comments and questions to our team on how and why AMD Ryzen processors are seeing performance gaps in 1080p gaming (and other scenarios) in comparison to Intel Core processors. Several outlets have posted that the culprit is the Windows 10 scheduler and its inability to properly allocate work across the logical vs. physical cores of the Zen architecture. As it turns out, we can prove that isn't the case at all. -Ryan Shrout
Initial reviews of AMD’s Ryzen CPU revealed a few inefficiencies in some situations, particularly in gaming workloads running at more common resolutions like 1080p, where the CPU becomes more of a bottleneck when paired with modern GPUs. Lots of folks have theorized about what could be causing these issues, and most recent attention appears to have been directed at the Windows 10 scheduler and its supposed inability to properly place threads on the Ryzen cores for the most efficient processing.
I typically have Task Manager open while running storage tests (they are boring to watch otherwise), and I naturally had it open during Ryzen platform storage testing. I’m accustomed to how the IO workers are distributed across reported threads, and in the case of SMT capable CPUs, distributed across cores. There is a clear difference when viewing our custom storage workloads with SMT on vs. off, and it was dead obvious to me that core loading was working as expected while I was testing Ryzen. I went back and pulled the actual thread/core loading data from my testing results to confirm:
The Windows scheduler has a habit of bouncing processes across available processor threads. This naturally happens as other processes share time with a particular core, with the heavier process not necessarily switching back to the same core. As you can see above, the single IO handler thread was spread across the first four cores during its run, but the Windows scheduler was always hitting just one of the two available SMT threads on any single core at one time.
My testing for Ryan’s Ryzen review consisted of only single threaded workloads, but we can make things a bit clearer by loading down half of the CPU while toggling SMT off. We do this by increasing the worker count (4) to be half of the available threads on the Ryzen processor, which is 8 with SMT disabled in the motherboard BIOS.
SMT OFF, 8 cores, 4 workers
With SMT off, the scheduler is clearly not giving priority to any particular core and the work is spread throughout the physical cores in a fairly even fashion.
Now let’s try with SMT turned back on and doubling the number of IO workers to 8 to keep the CPU half loaded:
SMT ON, 16 (logical) cores, 8 workers
With SMT on, we see a very different result. The scheduler is clearly loading only one thread per core. This could only be possible if Windows was aware of the 2-way SMT (two threads per core) configuration of the Ryzen processor. Do note that sometimes the workload will toggle around every few seconds, but the total loading on each physical core will still remain at ~50%. I chose a workload that saturated its thread just enough for Windows to not shift it around as it ran, making the above result even clearer.
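The ~50% figure is simple arithmetic on the per-thread graphs. Here is a minimal sketch of it, assuming 2-way SMT with sibling threads numbered adjacently (logical threads 2k and 2k+1 share physical core k). That numbering matches the LCore labels used later in this article but is not guaranteed on every system, and the function name and sample loads are hypothetical, not taken from our storage tool.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Average the utilization (in percent) of each pair of sibling SMT
// threads to get a per-physical-core figure.
// Assumption: 2-way SMT, siblings are adjacent in the input vector.
std::vector<double> physical_core_load(const std::vector<double>& logical_load) {
    assert(logical_load.size() % 2 == 0);  // one entry per logical thread
    std::vector<double> per_core;
    for (std::size_t i = 0; i + 1 < logical_load.size(); i += 2) {
        per_core.push_back((logical_load[i] + logical_load[i + 1]) / 2.0);
    }
    return per_core;
}
```

A fully loaded thread sitting next to an idle sibling averages out to 50% for that core, and it stays at 50% no matter which of the two siblings the scheduler happens to pick.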
Synthetic Testing Procedure
While the storage testing methods above provide a real-world example of the Windows 10 scheduler working as expected, we do have another workload that can help demonstrate core balancing with Intel Core and AMD Ryzen processors. A quick and simple custom-built C++ application can be used to generate generic worker threads and monitor for core collisions and resolutions.
This test app has a very straightforward workflow. Every few seconds it generates a new thread, capping at N/2 threads total, where N is equal to the reported number of logical cores. If the OS scheduler is working as expected, it should load 8 threads across 8 physical cores on these 16-thread processors, though which logical core of each physical core gets used will depend on very minute parameters and conditions in the OS background.
By monitoring the APIC_ID through the CPUID instruction, the first application thread monitors all threads and detects and reports on collisions - when a thread from our app is running on the same core as another thread from our app. That thread also reports when those collisions have been cleared. In an ideal and expected environment where Windows 10 knows the boundaries of physical and logical cores, you should never see more than one thread of a core loaded at the same time.
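The collision check itself can be sketched in a few lines. This is not the source of our test app, only a hedged illustration of the idea: it assumes the SMT thread index occupies the lowest bit of the APIC ID (so shifting it away yields a physical core index), and the helper names `physical_core_of` and `find_collisions` are hypothetical. Real code would read the APIC ID via CPUID on each worker and derive the bit width from the CPUID topology leaves.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

// Map an APIC ID to a physical core index by dropping the SMT bit(s).
// Assumption: the SMT thread index lives in the lowest `smt_bits` bits.
unsigned physical_core_of(unsigned apic_id, unsigned smt_bits = 1) {
    assert(smt_bits < 32);
    return apic_id >> smt_bits;
}

// Given the APIC ID each worker was last observed on (index = worker
// number), return every pair of workers that share a physical core.
std::vector<std::pair<std::size_t, std::size_t>>
find_collisions(const std::vector<unsigned>& worker_apic_ids,
                unsigned smt_bits = 1) {
    std::vector<std::pair<std::size_t, std::size_t>> collisions;
    std::map<unsigned, std::vector<std::size_t>> workers_on_core;
    for (std::size_t w = 0; w < worker_apic_ids.size(); ++w) {
        auto& residents =
            workers_on_core[physical_core_of(worker_apic_ids[w], smt_bits)];
        // Every worker already on this core collides with the new one.
        for (std::size_t other : residents) {
            collisions.emplace_back(other, w);
        }
        residents.push_back(w);
    }
    return collisions;
}
```

A monitoring thread could call `find_collisions` on each polling pass and report when the returned list changes from non-empty to empty, which corresponds to a collision being cleared.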
This screenshot shows our app working on the left and the Windows Task Manager on the right with logical cores labeled. While it may look like all logical cores are being utilized at the same time, in fact they are not. At any given point, only one of LCore 0 or LCore 1 is actively processing a thread. Need proof? Check out the modified view of the Task Manager, where I copied the graphs of LCore 1/5/9/13 over the graphs of LCore 0/4/8/12 with inverted colors to aid viewability.
If you look closely, by overlapping the graphs in this way, you can see that the threads migrate from LCore 0 to LCore 1, LCore 4 to LCore 5, and so on. The graphs intersect and fill in to consume ~100% of the physical core. This pattern is repeated for the other 8 logical cores on the right two columns as well.
Running the same application on a Core i7-5960X Haswell-E 8-core processor shows a very similar behavior.
Each pair of logical cores shares a single thread and when thread transitions occur away from LCore N, they migrate perfectly to LCore N+1. It does appear that in this scenario the Intel system is showing a more stable threaded distribution than the Ryzen system. While that may in fact incur some performance advantage for the 5960X configuration, the penalty for intra-core thread migration is expected to be very minute.
The fact that Windows 10 is balancing the 8 thread load specifically between matching logical core pairs indicates that the operating system is perfectly aware of the processor topology and is selecting distinct cores first to complete the work.
Information from this custom application, along with the storage performance tool example above, clearly shows that Windows 10 is attempting to balance work on Ryzen between cores in the same manner that we have experienced with Intel and its HyperThreaded processors for many years.
Subject: General Tech | February 19, 2017 - 05:07 PM | Scott Michaud
Tagged: pc gaming, blizzard, windows, EoL
Most companies have already abandoned Windows XP and Vista, including Microsoft once Vista leaves extended support in April, but Blizzard is known for long-term support. This is the company that is still selling Diablo 2, even producing retail disks for it last I checked, almost seventeen years after it was released (including a patch last year).
Later this year, World of Warcraft, StarCraft II, Diablo III, Hearthstone, and Heroes of the Storm will no longer support Windows XP or Vista. This will not all happen at once, even though it would actually make less sense if they did. I mean, why would they coordinate several teams to release a patch at the same time and maximize annoyance to the affected users who cannot schedule or afford an upgrade at that specific time?
Although, if that’s you, then you should probably get around to it sooner rather than later.
Living Long and Prospering
The open fork of AMD’s Mantle, the Vulkan API, was released exactly a year ago with, as we reported, a hard launch. This meant public, but not main-branch drivers for developers, a few public SDKs, a proof-of-concept patch for The Talos Principle, and, of course, the ratified specification. This sets up the API to find success right out of the gate, and we can now look back over the year since.
Thor's hammer, or a tempest in a teapot?
The elephant in the room is DOOM. This game has successfully integrated the API and it uses many of its more interesting features, like asynchronous compute. Because the API is designed in a sort-of “make a command, drop it on a list” paradigm, the driver is able to select commands based on priority and available resources. AMD’s products got a significant performance boost, relative to OpenGL, catapulting their Fury X GPU up to the enthusiast level that its theoretical performance suggested.
Mobile developers have been picking up the API, too. Google, who is known for banishing OpenCL from their Nexus line and challenging OpenGL ES with their Android Extension Pack (later integrated into OpenGL ES with version 3.2), has strongly backed Vulkan. The API was integrated as a core feature of Android 7.0.
On the engine and middleware side of things, Vulkan is currently “ready for shipping games” as of Unreal Engine 4.14. It is also included in Unity 5.6 Beta, which is expected for full release in March. Frameworks for emulators are also integrating Vulkan, often just to say they did, but sometimes to emulate the quirks of these system’s offbeat graphics co-processors. Many other engines, from Source 2 to Torque 3D, have also announced or added Vulkan support.
Finally, for the API itself, The Khronos Group announced (pg 22 from SIGGRAPH 2016) areas that they are actively working on. The top feature is “better” multi-GPU support. While Vulkan, like OpenCL, allows developers to enumerate all graphics devices and target them, individually, with work, it doesn’t have certain mechanisms, like being able to directly ingest output from one GPU into another. They haven’t announced a timeline for this.
Subject: General Tech | December 23, 2016 - 12:54 PM | Jeremy Hellstrom
Tagged: windows, microsoft, windows 10
Chris Capossela, Chief Marketing Officer at Microsoft, was on Windows Weekly recently and admitted, for the first time, that Microsoft may have gone a bit too far during their "Get Windows 10" extravaganza. This shocking revelation supposedly occurred a short while after they released the version in which the red X in the popup window broke with their GUI's standard and no longer closed the window and cancelled the installation. According to Slashdot this is the first time Microsoft have admitted to the use of excessive rendition techniques on Windows 7 and 8 users.
"It's no secret that Microsoft has been aggressively pushing Windows 10 to users. Over the past year and a half, we have seen users complain about Windows 10 automatically getting downloaded to their computer, and in some cases, getting installed on its own as well. The automatic download irked many users who were on limited or slow data plans, or didn't want to spend gigabytes of data on Windows 10."
Here is some more Tech News from around the web:
- Rising flash demand looks non-volatile. Time to build a fab @ The Register
- Windows 10 nags, Dirty Cow, Microsoft's Linux man love: The Reg's big ones for 2016 @ The Register
- Steam Fined $3 Million For Refusing Refunds @ Slashdot
Subject: General Tech | November 2, 2016 - 12:57 PM | Jeremy Hellstrom
Tagged: microsoft, OEM, windows, EoL
We've known for quite some time that Microsoft planned to stop providing OEMs with keys for Windows 7 or 8.1 this Halloween, and they have made good on that promise. If you already have a valid license you will continue to be able to use it on your machine and even reinstall from scratch, but you won't be able to buy a machine without Windows 10 anymore. On the corporate side this is being ignored; the new machine may ship with Win10 installed but that will not last long. This is your last chance to grab one of the few remaining unused Windows 7 or 8.1 keys. The Register managed to spot at least one company still offering a Win7 downgrade, so get moving if that is your plan.
"If you can get Dell, HP Inc, Lenovo or any other PC-maker to sell you a PC running Windows 7 Professional or Windows 8.1, please let us know how you did it because Microsoft no longer sells the operating system to OEMs."
Here is some more Tech News from around the web:
- LastPass Makes Password Management Free Across All Of Your PCs, Tablets and Smartphones @ Slashdot
- 5 systemd Tools You Should Start Using Now @ Linux.com
- The Sharp Z2 & Sharp M1 Smartphones Revealed @ TechARP
- BlackBerry DTEK60 vs DTEK50 specs comparison @ The Inquirer
- Sound-mufflers chuck acoustic sleep blanket at the noise-plagued @ The Register
- Broadcom buys Brocade for £4.8bn in bid to bolster storage biz @ The Inquirer
- Fancy Bear: Russia-linked hackers blamed for exploiting Windows zero-day flaw @ The Inquirer
- VMware stubs its toe again: NSX has another VM-flattening bug @ The Register
Subject: Systems, Mobile | August 31, 2016 - 02:30 PM | Sebastian Peak
Tagged: Yoga Book, windows, wacom, notebook, Lenovo, Halo Keyboard, Create Pad, Android
Lenovo has unveiled the Yoga Book, a 2-in-1 design with a unique touch-based lower half below a conventional 1920x1200 IPS touch display. Lenovo is calling the Yoga Book "the world’s thinnest and lightest 2-in-1", with a 9.6mm thickness and weight of 1.52 pounds.
This lower section is a hybrid design, combining Lenovo's "Halo Keyboard" virtual keyboard with a surface called "Create Pad"; allowing the lower half to be used for pen writing (with handwriting recognition) and drawing. The "Real Pen" (which is a dual-use ink pen and stylus) offers 2,048 pressure levels and 100-degree angle detection, according to Lenovo, and promises a precise experience when writing and creating artwork.
"The Halo Keyboard re-imagines the possibilities of a modern keyboard, while providing the technology platform for all other standout Yoga Book productivity-driven features, such as the Create Pad and Real Pen. It appears to the user as a full, backlit virtual keyboard with shortcut keys for a typing experience that matches that of a physical keyboard, easily overcoming the challenges of typing on a tablet screen."
"The lack of physical keys also allows the Halo Keyboard’s flush surface to house the Create Pad. For the artists and free hand note-takers, the Create Pad converts into a virtual notepad that instantly digitizes everything from doodles and to-do lists to web page annotations and on-screen notes, using the Real Pen and our Note Saver app."
The Yoga Book is available in both Android and Windows versions, with the Android version offering a custom interface called "Book UI". As to hardware, both versions are powered by an Intel Atom x5-Z8550 processor (quad-core, up to 2.4 GHz) with 4GB of LPDDR3 memory and 64GB of onboard storage (expandable via microSD cards up to 128GB in size).
What about pricing? This might be surprising for a high-concept device like this, as Lenovo has chosen to compete in the $500 tablet space. The Android-powered Yoga Book starts at $499, with the Yoga Book with Windows at $549. Both will be available starting in October.
Subject: General Tech | June 6, 2016 - 03:46 PM | Scott Michaud
Tagged: windows, pc gaming, osx, linux
The next week-and-a-half should be good for video game enthusiasts. E3 2016 starts on June 14th, although EA, Bethesda, Microsoft, Ubisoft, Sony, and AMD (with PCGamer) have press conferences throughout the 12th and the 13th. Of course, not to get lost in the traffic, many entities are releasing their announcements prior to those conferences. For instance, Watch Dogs 2 will have a reveal on this Wednesday, June 8th, five days prior to Ubisoft's press conference.
This post is about a Kickstarter project called Yooka-Laylee, though. This title is being created by Playtonic Games, which contains several past employees of Rare, apparently to create a proper Banjo-Kazooie-style platform title. It raised over two million British Pounds (~3 million USD) and targeted an October 2016 release date. That has since slipped to Q1 2017, but that should be expected for a crowdfunding project, especially when the stretch goals start piling up. It is scheduled to be released on Windows, Mac, and Linux... and a few other boxes.
Of course, they couldn't resist making a Banjo-Kazooie: Nuts & Bolts joke at the end...
... I chuckled.
Subject: Graphics Cards | May 10, 2016 - 12:11 PM | Ryan Shrout
Tagged: windows 10, windows, vrr, variable refresh rate, uwp, microsoft, g-sync, freesync
Back in March, Microsoft's Phil Spencer addressed some of the concerns over the Universal Windows Platform and PC gaming during his keynote address at the Build Conference. He noted that MS would "plan to open up VSync off, FreeSync, and G-Sync in May" and the company would "allow modding and overlays in UWP applications" sometime further into the future. Well, it appears that Microsoft is on point with the May UWP update.
According to the MS DirectX Developer Blog, a Windows 10 update being pushed out today will enable UWP to support unlocked frame rates and variable refresh rate monitors in both G-Sync and FreeSync varieties.
As a direct response to your feedback, we’re excited to announce the release today of new updates to Windows 10 that make gaming even better for game developers and gamers.
Later today, Windows 10 will be updated with two key new features:
Support for AMD’s FreeSync™ and NVIDIA’s G-SYNC™ in Universal Windows Platform games and apps
Unlocked frame rate for Universal Windows Platform (UWP) games and apps
Once applications take advantage of these new features, you will be able to play your UWP games with unlocked frame rates. We expect Gears of War: UE and Forza Motorsport 6: Apex to lead the way by adding this support in the very near future.
This OS update will be gradually rolled out to all machines, but you can download it directly here.
These updates to UWP join the already great support for unlocked frame rate and AMD and NVIDIA’s technologies in Windows 10 for classic Windows (Win32) apps.
Please keep the feedback coming!
Today's update won't automatically enable these features in UWP games like Gears of War or Quantum Break; they will still need to be updated individually by the developer. MS states that Gears of War and Forza will be the first to see these changes, but there is no mention of Quantum Break here, which is a game that could DEFINITELY benefit from the love of variable refresh rate monitors.
Microsoft describes an unlocked frame rate as thus:
Vsync refers to the ability of an application to synchronize game rendering frames with the refresh rate of the monitor. When you use a game menu to “Disable vsync”, you instruct applications to render frames out of sync with the monitor refresh. Being able to render out of sync with the monitor refresh allows the game to render as fast as the graphics card is capable (unlocked frame rate), but this also means that “tearing” will occur. Tearing occurs when part of two different frames are on the screen at the same time.
I should note that these changes do not indicate that Microsoft is going to allow UWP games to go into an exclusive full screen mode - it still believes the disadvantages of that configuration outweigh the advantages. MS wants its overlays and a user's ability to easily Alt-Tab around Windows 10 to remain. Even though MS mentions screen tearing, I don't think that non-exclusive full screen applications will exhibit tearing.
Gears of War on Windows 10 is a game that could definitely use an uncapped render rate and VRR support.
Instead, what is likely occurring, as we saw with the second iteration of the Ashes of the Singularity benchmark, is that the game will have an uncapped render rate internally but that frames rendered OVER 60 FPS (or the refresh rate of the display) will not be shown. This will improve perceived latency as the game will be able to present the most up-to-date frame (with the most up-to-date input data) when the monitor is ready for a new refresh.
UPDATE 5/10/16 @ 4:31pm: Microsoft just got back to me and said that my above statement wasn't correct. Screen tearing will be able to occur in UWP games on Windows 10 after they integrate support for today's patch. Interesting!!
For G-Sync and FreeSync users, the ability to draw to the screen at a wide range of render rates adds a further advantage to uncapped frame rates: no tearing, but also no "dropped" frames caused by running at off-ratios of a standard monitor's refresh rate.
I'm glad to see Microsoft taking these steps at a brisk pace after the feedback from the PC community early in the year. As for UWP's continued evolution, the blog post does tease that we should "expect to see some exciting developments on multiple GPUs in DirectX 12 in the near future."
Subject: General Tech | March 19, 2016 - 04:36 PM | Tim Verry
Tagged: windows, sony, remote play, ps4, game streaming
Sony will be opening up its Remote Play feature to include Windows and Mac PCs with the next system update, version 3.5. In its current form, Remote Play allows users to stream games from their PS4 to certain Sony devices including Xperia phones, Vita handhelds, and the PlayStation TV "microconsole". The new update will let users stream games from the game console to PCs over their home network.
PS4 System Update 3.5 is set to release later this month. While a beta is available, the beta build does not include the streaming feature. It does add support for live streaming to Dailymotion, updates to the social platform (e.g. planned parties), and an incognito mode that allows users to appear offline (how has it taken Sony this long to support that??).
Sony opening up the streaming is a welcome move as it puts it more in line with Microsoft's offering by not requiring specific hardware. Actually, it may be a bit better since users might be able to get away with using older Windows operating systems (Xbox One is limited to Windows 10) as well as streaming to their Macs. Further, Ars is reporting that Sony stopped shipping its PlayStation TV hardware in the US and Europe at the end of 2015. Thus, that may be one of the reasons Sony is moving away from streaming only to Sony hardware. I'm interested in trying out the Remote Play game streaming to see how it compares to the Xbox One to Windows 10 streaming which has worked pretty well so far for me in streaming Forza to my desktop!
Game streaming is proving to be popular, and it is interesting to see that both popular gaming consoles will soon allow you to stream games from the living room to your computers while at the same time Valve and others are pushing for solutions (e.g. Steam In-Home Streaming) to stream games from your PCs to the living room. Exciting times, especially if you're able to use wired network connections!
What do you think about Sony's plans for expanding Remote Play? Did you use the PS TV?
Subject: General Tech | July 30, 2015 - 03:58 PM | Scott Michaud
Tagged: microsoft, windows, windows 10, visual studio
July 29th started the official roll-out of Windows 10 and, for Windows Insiders, was pretty much “Wednesday”. We already had everything of relevance by Monday on the OS side of things, and not even a security patch landed in our Windows Update queue. It was not the only thing that Microsoft launched today, though. While Visual Studio 2015 was released last week, it warned that it was not compatible with pre-10240 SDKs and would delete them during the installation process, leaving you unable to develop SDK apps until the one for 10240 launched on July 29th.
So, coincident with the OS release, Microsoft finally published the 10240 Windows SDK. Now, if you run Visual Studio 2015's installer, it will install the new SDK directly. You do not need to download it from a secondary source. These headers and libraries are placed in the “Windows Kits” folder of your 32-bit Program Files directory... ironically, without deleting the previous SDKs that it threatened to, when run before July 29th. Go figure.
Also, even though DirectX 12 has been in the Windows SDK for quite some time, Microsoft has, also, finally released code examples and they put them on their GitHub page. These samples teach you how to do things like draw a triangle, manage DirectX 11-era contexts alongside DirectX 12 ones in your application, and create an n-body gravity simulation. They welcome pull requests for fixes, although they might appreciate new samples as well.