Windows 10 Task Manager is often used by end users to gauge the performance of their machine, especially when they think something is amiss. There are several reasons why this isn’t really a good performance gauge.
It’s a point-in-time measurement that lacks context of the overall scale of resource usage.
It doesn’t see inside processes to understand the impact of Anti-Virus and other security software on the processes.
Task Manager CPU stats are deceptive and inconsistent, at time of writing (rest of this blog explains).
One would normally expect a CPU to range from 0 to 100%, with an app using, say, 5% of it. But if the CPU has 8 cores and the core running the app’s thread is in boost mode, for example, we’re no longer measuring against 100% CPU; it’s more like 112%. So now the Processes tab is showing you 100% of 112%, and so on.
For example, my co-worker Aaron Margosis wrote a utility that can run a thread at 100% CPU on a core. So in the screenshot below, I’m running a single thread at 100% CPU on one core, on an AMD Ryzen 5900x CPU.
Which is accurate? It depends on what you want to know. Measured against every core running at its nominal 100%, it’s 4% CPU. But the core the thread is on is likely boosted by AMD’s chip technology, so measured against what the boosted core can actually deliver, it shows up as 6%.
Does this seem like a large inconsistency? 2% is no big deal, right?
Let’s expand the experiment to 8 cores (the chip exposes 24 logical processors, so we’ll be OK to run this test).
This is a slightly larger variance. If I were a user complaining my machine was slow, I’d obviously think this test program was eating 42.4% of my CPU, when in reality it’s 33.333%. That’s a 9% variance. Not huge, but still confusing, especially since the tooltip on CPU in both tabs of Task Manager says the same thing: “Total processor utilization across all cores”.
Below is the same test running on 12 of the 24 logical processors of my AMD 5900x.
So now we’re seeing a 13.5% variance, which is over 1% per core. My system’s BIOS is not set to aggressively overclock the CPU; I could probably get bigger variances by doing so. Maybe that’s post #2 for this topic.
These tie back to Performance Monitor as well. The more accurate data points for CPU measurements in Windows 8 and Server 2012 and above are
Processor Information\% Processor Utility
Processor Information\% Privileged Utility
This is where the “Processes” view gets its values.
So one can think of Processes tab on Task Manager as the “% CPU used of all available % of CPU available” vs the Details tab which is more “% CPU used of 100%/core”.
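To make the arithmetic behind those numbers concrete, here is a minimal Python sketch. It assumes each test thread saturates one logical processor and that the machine exposes 24 of them, as in the experiments above. The function names are mine, not anything Task Manager or Perfmon exposes.

```python
# Assumption: 24 logical processors, as on the AMD 5900x used in the post.
LOGICAL_PROCESSORS = 24


def details_tab_percent(busy_threads, logical_processors=LOGICAL_PROCESSORS):
    """Time-based share of CPU if each busy thread saturates one logical processor."""
    return 100.0 * busy_threads / logical_processors


def variance_points(processes_tab_percent, busy_threads):
    """Gap, in percentage points, between the Processes tab and the time-based figure."""
    return processes_tab_percent - details_tab_percent(busy_threads)


print(round(details_tab_percent(1), 2))    # 4.17  (one busy thread)
print(round(details_tab_percent(8), 3))    # 33.333 (eight busy threads)
print(round(variance_points(42.4, 8), 1))  # 9.1   (Processes tab showed 42.4%)
```

The gap is not a fixed offset. It depends on how far the busy cores are boosted at the moment of measurement, which is why it grows as more cores are loaded.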
Microsoft is looking at a better way to display this all the time, cognizant that end users are used to Task Manager for gauging performance, not say, Perfmon or an ETW trace with the Windows ADK.
It is worth noting this variance does not appear to impact virtual machines, so far as I’ve been able to observe at this point.
Sometimes in support you’ll be asked to collect a boot trace to help troubleshoot slow boot or slow logon scenarios. The symptom is that a long time passes from startup to the CTRL+ALT+DEL prompt, or from CTRL+ALT+DEL to a usable desktop. This post will walk you through the steps needed to do this.
The only alternative is to download the ADK for Windows 10, install the Windows Performance Toolkit (aka WPT), and do the trace using either WPRUI (with the boot scenario selected) or use xbootmgr if you prefer command line.
The Windows ADK for Windows 10 is sometimes updated when a new build is out. Usually, for Windows 10, you want to use the most recent ADK’s install of the WPT. At writing that is the ADK for Windows 10 version 2004. You can always get the link to the most current ADK at the page Download and install the Windows ADK. Installing the WPT requires you to run the ADK installer which pulls what you select in the checkboxes from the web (as shown below).
Or if you prefer, you can download and install the redistributable located in my OneDrive. Your call. I put the Build 2004 redist’s for x86 and x64 there.
Once the WPT is installed, the command line to grab a boot trace is:
This of course must be run as administrator. By default an Administrator command prompt puts you in System32, so it’s best to make a directory off C:\ and name it Trace or whatnot and change directory to there to run the command. The output of the trace will be written to the directory where the trace command is run by default.
Run the command. This will reboot the host and then boot the kernel in tracing mode.
Wait for CTRL+ALT+DEL after the machine reboots, then log in.
The trace will count down for 2 minutes and then write to C:\trace.
The interim trace files will have KM and UM in the file name. Those are the pre-merge files for kernel-mode and user-mode events respectively. Once both have been flushed from RAM to disk, xbootmgr will merge the two into a single file and delete the KM and UM working files.
TLDR: At time of writing, Windows 10 20H2 has a bug where the default buffer allocations in boot tracing are inadequate to capture the data of a boot trace. The fix is pretty simple: use good old xbootmgr instead. This is a binary from the older ADK that still gets installed when you install the current ADK.
What am I talking about? How did I find this?
I hit a scenario where I needed a boot trace, so I set it up like so. This is a pretty typical set of options for a boot trace: collect 1st level triage, CPU, DiskIO and File IO events, log to file (the only option in a boot trace), and change your iterations from 3 to 1.
But when the trace rebooted the VM and came back up, it had dropped events. Dropped events mean at some point in the recording, data was lost. Windows knows it lost data but not what type. So this makes interpreting the trace extremely unreliable.
Typically this is due to poor storage performance, so I tested the storage with CrystalDiskMark. Since the VM is hosted on an NVMe drive, it did pretty well.
These numbers are more than adequate for our needs. So what gives? There is a mechanic in collecting traces known as ETW buffers that capture the data from ETW providers.
Think of this as radio waves. Each ETW provider in Windows is a radio station. Each one is broadcasting all the time. When you collect an ETW trace, what you are telling Windows you want to do is listen to a station or set of stations, and collect that data into memory, or in the case of a boot trace, a pair of files. Windows can do this for you usually with no issues, by allocating Non-Paged Kernel memory to trace buffers. In xbootmgr and its cousin, xperf.exe, you can tweak the buffers allocated to the trace, both the count of buffers, and the memory size of each buffer. Typically the default values work just fine, but if you are dealing with a very busy system or terrible storage performance, sometimes you can drop events.
To go back to the radio analogy, dropped events would be like the broadcast missing segments of time, or perhaps static is a way to think of it.
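If the radio analogy doesn’t click, here is a toy Python model of the same idea. It is not how ETW is actually implemented; it just assumes a fixed pool of buffers that drains to disk at a steady rate, and counts the events that don’t fit. The event counts and rates are made up for illustration.

```python
def simulate_trace(events_per_tick, flushed_per_tick, buffer_count, buffer_size):
    """Toy model: count events dropped when buffers fill faster than they drain.

    Each tick, up to flushed_per_tick queued events are written out, then new
    events arrive. Events that do not fit in the free buffer space are dropped.
    """
    capacity = buffer_count * buffer_size
    queued = 0
    dropped = 0
    for arriving in events_per_tick:
        queued -= min(queued, flushed_per_tick)  # drain to disk
        accepted = min(arriving, capacity - queued)
        queued += accepted
        dropped += arriving - accepted
    return dropped


burst = [50, 400, 400, 50, 50]  # hypothetical event counts per tick

# Same workload and same storage speed, only the buffer count changes.
print(simulate_trace(burst, flushed_per_tick=100, buffer_count=4, buffer_size=64))   # 444
print(simulate_trace(burst, flushed_per_tick=100, buffer_count=16, buffer_size=64))  # 0
```

The point of the second call is that storage speed did not change at all. More buffer space alone was enough to ride out the burst, which is the knob xbootmgr and xperf let you turn.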
So back to the scenario, I had dropped events, and I confirmed storage was great. So what next?
I thinned the trace, iteratively, down to just 1st level triage checked and “Light” instead of “Verbose” and still dropped events.
I also tried the “GeneralProfileForLargeServers.wprp” file that is located in the “C:\Program Files (x86)\Windows Kits\10\Windows Performance Toolkit” directory. I tried this because this file has statically set values for buffers. But still, no dice, dropped events.
What finally fixed this was calling xbootmgr directly; after that I had no dropped events. Curious. I can only surmise Windows 10 20H2 has a different configuration than previous Windows versions for the ETW collections.
The command I used is xbootmgr -trace boot -traceflags dispatcher+latency. This rebooted my machine as expected and collected a trace. When I opened it, it had no errors. Success!
Simply double-clicking the resulting .etl file then opened it without a problem.
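If you want to script this, a minimal Python sketch might look like the following. The xbootmgr arguments are exactly the ones quoted above. The helper names, the `dry_run` switch, and the idea of running from a dedicated trace directory are my own assumptions for illustration.

```python
import os
import subprocess


def boot_trace_command():
    """Argument list for the xbootmgr invocation quoted in the post."""
    return ["xbootmgr", "-trace", "boot", "-traceflags", "dispatcher+latency"]


def run_boot_trace(trace_dir, dry_run=True):
    """Create trace_dir and start the boot trace from it.

    With dry_run=True (the default) nothing is executed; the command is only
    returned. With dry_run=False this REBOOTS the machine and must be run
    from an elevated (administrator) session.
    """
    os.makedirs(trace_dir, exist_ok=True)
    cmd = boot_trace_command()
    if not dry_run:
        # Assumption: running from trace_dir keeps the output out of System32.
        subprocess.run(cmd, cwd=trace_dir, check=True)
    return cmd
```

Calling `run_boot_trace(r"C:\Trace")` only prepares the directory and returns the command. Pass `dry_run=False` when you are actually ready for the reboot.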
I’ll be opening a Feedback using the Feedback app and placing a link here shortly. If this impacts you and you’d like to see it fixed please upvote here. I hope this has helped you understand what is going on and how to work around the current issue. Happy Tracing!
One of my routines when installing Windows 10 fresh (or updating builds when it wipes my preferences) is to change Task Manager’s view to report on additional columns of value. Let me show you what I’m doing:
My machine has an uptime of 1 day, 15 hours. I game quite a bit. Running an Nvidia 2080ti.
Somehow while not gaming, I’m using 3GB of dedicated video ram…
While not gaming.
So where is it going?
Xbox game services (I don’t use this, forgot to turn off gamebar doh)
Does this get released when I launch a game? Probably not. Good reminder to shut off what you don’t need for better gaming experiences.
I was looking at space used on my C drive in Windows 10 (just upgraded to 2004 build, yay!) and found something that seemed off to me.
Now, it’s not unusual for driver suites for Bluetooth, sound controllers, GPUs, headsets, etc. to take up some amount of space. That’s fine. Over a GB just to show me a battery status? W-T-F.
So I drilled down in there. It appears the HyperX NGenuity suite downloads the art/text/drivers for all their products, not just the one you use. And it keeps them there. Even if you don’t have the products and don’t intend on ever purchasing them.
Worse, I closed the software, deleted the extra directories, and relaunched only to find that now the NGenuity suite hangs (can’t minimize, move the window, close it, etc) at launch.
So lose a GB of space, or guess what your battery is at in your headset. Buyer beware.
The batch file will delete its source tasks from Task Scheduler and then crash the Windows host with a kernel dump (if you configured step 3; if not, it will be an ‘automatic’ dump, which may still be ok).
Create a Perfmon Alert trigger
Get the PID (process ID) of the crashing user mode process (the Details tab in Task Manager will show you this)
Start perfmon. Create a new user data collector set.
Name it ‘stop above’ and select the “Advanced” radio button.
Select “Performance Counter Alert”
For the top left pane, expand “Process” and then select “ID Process”. Perfmon is quirky so you may need to click something else then click back to “ID Process”. Then in the bottom left pane, pick the process name (in my example, AISuite3). Click Add.
Set the alert to fire when the value is Above the Limit, and set the Limit to the PID (in my case, 777)
Do the same again with a Below rule on 777. The end result should be 2 data collectors.
Click on each blue cube, then on the right, right-click each Data Collector and choose Properties. Set the task to D:\temp\stop.bat
Do this for both, so that each data collector task does the same thing.
Right-click/start both Data Collector Sets so that if the PID of your process changes from the one specified in the two data collector sets, the system crashes.
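Why two alerts instead of one? Perfmon alerts only offer Above and Below comparisons, so “not equal to 777” has to be built from the pair. A small Python sketch of the logic (the function name is mine, and this models only the comparison, not Perfmon itself):

```python
def alerts_fired(observed_pid, expected_pid):
    """Return which of the two alerts (above / below) would fire for a sample."""
    fired = []
    if observed_pid > expected_pid:
        fired.append("above")
    if observed_pid < expected_pid:
        fired.append("below")
    return fired


print(alerts_fired(777, 777))   # []        same PID, nothing happens
print(alerts_fired(4120, 777))  # ['above'] process restarted with a higher PID
print(alerts_fired(300, 777))   # ['below'] process restarted with a lower PID
```

Either alert runs the same batch file, so any change of PID in either direction triggers it.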
When this all happens, the machine will reboot, you’ll have a memory.dmp file in C:\Windows, and a user-mode dump of your target process in the DebugDiag folder (or elsewhere if you configured that path differently).
Problem: You are gaming/typing/using your computer and your foreground window in Windows loses focus. Origin game client may appear briefly in front of your window, then disappear.
TL/DR – in my instance, it was Star Wars Battlefront 2. I even caught it once running Battlefront 2 (an update, maybe?), and it seized control of the foreground 3-4 times in a row. I’ve uninstalled Star Wars Battlefront 2 and the issue appears to be resolved for me. At least 10 hours of non-alt-tab-hell.
I have this annoying problem and saw the developers at EA were struggling with it. So I figured I’d take a look…
I collected a trace with WPRui.exe using the following check boxes:
First Level Triage
Desktop Composition Activity
I then let it record until the problem happened, and saved the recording. The resulting log was 10GB in size (I have 64GB of RAM installed so this was somewhat expected). WPA will not open the file, however; I’ve been sitting at this for the last 2 hours:
So I decided I’d try PerfView. PerfView is like WPA, but different. Way different. You can find it on Github naturally. On Git there are a lot of video and tutorial links, which I ignored with abandon and set off to explore.
The first lesson on PerfView is that by default it won’t display any more than 20,000,000 events, it seems. My trace was a tad, husky, shall we say, so I had to change the command line to “PerfView64.exe /SkipMSec:340000”, which told it “Hey, open the trace, but don’t really load a lot of detail until you pass 340,000 ms of time.” My trace was about 470 seconds long, so this starts really watching around 340 seconds in, give or take.
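The skip value is just seconds converted to milliseconds. If you’d rather think in terms of “keep the last N seconds of the trace”, a tiny Python sketch (helper names are mine, only the /SkipMSec switch comes from the command line above) does the conversion:

```python
def skip_for_tail(trace_seconds, keep_seconds):
    """Seconds to skip so that only the last keep_seconds of the trace load."""
    return max(trace_seconds - keep_seconds, 0)


def perfview_command(skip_seconds, exe="PerfView64.exe"):
    """Build the PerfView command line with /SkipMSec expressed in milliseconds."""
    return f"{exe} /SkipMSec:{int(skip_seconds * 1000)}"


# A ~470 second trace, keeping roughly the last 130 seconds.
print(perfview_command(skip_for_tail(470, 130)))  # PerfView64.exe /SkipMSec:340000
```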
The first area of PerfView I went to was the Process List.
This gave me a list first of all transient processes, then below a list of persistent processes in the trace. I searched for all Origin processes to get the PIDs I might need, and to see which ones died, what their exit codes were, which ones were persistent, etc.
Then I double-clicked on Events. This shows every event that happened in the time range of my view, but I knew what I was looking for. Two in fact. Win32k manages what window is in foreground (and a lot of other things). The specific two events I was looking for were:
Microsoft-Windows-Win32k/FocusedProcessChange and Microsoft-Windows-Win32k/FocusChange. These would tip me off when I left my explorer window temporarily by Origin preempting my UI. I found several of these.
So that’s a list of every time the foreground window in my UI changed. Next I looked for quick transition times. See the 470,010-471,736? This was the flicker. It was interesting because it seemed to ‘minimize’ but (and this is how I’ve always experienced this) still was foreground, until I took it back. One example of the behavior I had was the screen flickered while I was scrolling down a web page with my mouse wheel. After the flicker, I had to click the web page again to get the focus, so scrolling could continue. The problem also happened very close to the end of my trace (its occurrence being why I stopped the trace in the first place).
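Eyeballing the list works for one trace, but the “look for quick transitions” step is easy to sketch in Python if you export the focus events. This assumes the timestamps are in milliseconds and that you have (time, process) pairs. The two timestamps are the ones from my trace; the first event, the process names, and the 2-second threshold are made up for illustration.

```python
def quick_transitions(focus_events, max_gap_ms=2000):
    """Return consecutive focus-change pairs that are at most max_gap_ms apart."""
    ordered = sorted(focus_events)
    pairs = []
    for (t1, p1), (t2, p2) in zip(ordered, ordered[1:]):
        if t2 - t1 <= max_gap_ms:
            pairs.append((t1, t2, p1, p2))
    return pairs


events = [
    (352_400, "explorer"),  # hypothetical earlier focus change
    (470_010, "Origin"),    # focus taken away (timestamp from the trace)
    (471_736, "explorer"),  # focus taken back (timestamp from the trace)
]

for start, end, taker, returner in quick_transitions(events):
    print(f"{taker} held focus for {end - start} ms before {returner} took it back")
```

With these events the only pair under the threshold is the 1,726 ms flicker.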
What is interesting here to me is that Origin 18076 was the normal Origin process. The 55028 was one of the transient ones. It appears to spawn a sub-process named Origin, to do an ad change/rotation? You’ll see what I mean when you see the command line: