Quiesce

Most of the time, I just proceed straight to asking my computer to do stuff for me. Sometimes, especially when I don’t yet know what stuff I want, I will instead start by asking my computer not to do stuff, over and over until it’s actually not doing anything that rises above the expected noise floor. Reducing noise allows my human senses to be more effective.

Central to this whole process is having a noise floor expectation in the first place. That’s hard to define precisely, and varies based on many factors, but generally I use one of these criteria:

  • some process spawning repeatedly for no discernible reason in tight time intervals
  • some process using more than 10% CPU when I think the system should be idle
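
The second criterion is easy to check from a shell. Here is a rough sketch, not a tool I'd call authoritative: over_cpu and busy_procs are names I made up for illustration, and the 10% default is just my rule of thumb from above. Keep in mind that the %cpu number ps reports is computed differently on different systems, so treat it as a hint rather than a measurement.

```shell
# Keep only "pid %cpu command" lines whose %cpu exceeds a threshold.
# (over_cpu is a hypothetical helper name; default threshold is 10.)
over_cpu() {
  awk -v t="${1:-10}" '$2 + 0 > t + 0'
}

# List every process currently above the threshold.
# -A selects all processes; the trailing "=" on each column suppresses the header.
busy_procs() {
  ps -A -o pid= -o pcpu= -o comm= | over_cpu "${1:-10}"
}

# Usage: busy_procs        (anything above 10% CPU)
#        busy_procs 25     (anything above 25% CPU)
```

If this prints nothing while the machine is supposed to be idle, the CPU side of the noise floor is probably fine.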

Over the years, I’ve noticed impressive variety in the range of disquieted things I encounter, and the specific techniques I use to soothe them. This seems like an ideal pretext for a collection of blog entries, so here’s the first one in that category.

If you noticed that Diagnostics Reporter is spawning every 10 seconds, you probably weren’t surprised to find this occurring at that specific interval, because 10 seconds is the minimum respawn delay for a launchd job. In theory it’s an option to disable this job entirely, but that is almost never the first choice, especially if the job is Apple-provided, because there might be various OS behaviors that assume an Apple-provided job is not disabled. Aside from that, I accept the value of reporting diagnostics – just not every 10 seconds.

You probably know several ways to monitor CPU usage, but how do you monitor process executions? These days, the canonical answer is to leverage Apple’s Endpoint Security framework. Here’s a little shell pipeline that uses the eslogger tool to get a realtime view of all process executions, which is then filtered by jq to only emit the timestamp and process args fields from each result. We then highlight any occurrence of “Diagnostics” in the results using grep.

sudo eslogger exec \
  | jq --unbuffered -r '"\(.time) \(.event.exec.args)"' \
  | grep -i --line-buffered -E --color 'Diagnostics|$'

What’s up with this grep invocation?

Normally grep would filter out lines that don’t match the pattern, but we do a little trick here. Running in ‘extended’ mode (-E) enables extended regular expression syntax, in which | is an alternation operator (“or”). The pattern we specify is Diagnostics|$, where $ is a special regex thing called an anchor, which in this case means “end of line”. Because anchors are zero-width, and because every line has an end, this matches all lines but only colorizes the matches that aren’t zero-width.
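
The trick is easy to try without eslogger. This is a small sketch using made-up input lines; highlight is a hypothetical wrapper name, and I use --color=auto here so the color codes only appear when writing to a terminal.

```shell
# Pass every input line through, colorizing only occurrences of the given word.
# The "|$" alternative matches the (zero-width) end of every line,
# so nothing is filtered out.
highlight() {
  grep -i -E --color=auto "$1|\$"
}

# Three sample lines: all three come out, only "Diagnostics" is colorized.
printf '%s\n' \
  'xpcproxy com.apple.DiagnosticsReporter' \
  'trustevaluationagent' \
  'mdworker_shared' \
  | highlight Diagnostics
```

Drop the |$ part of the pattern and the same input produces only the first line.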

Output from eslogger:
2023-10-09T22:09:35.393651108Z ["xpcproxy","com.apple.DiagnosticsReporter"]
2023-10-09T22:09:35.413238655Z ["/System/Library/CoreServices/Diagnostics Reporter.app/Contents/MacOS/Diagnostics Reporter"]
2023-10-09T22:09:45.519159562Z ["xpcproxy","com.apple.DiagnosticsReporter"]
2023-10-09T22:09:45.538428223Z ["/System/Library/CoreServices/Diagnostics Reporter.app/Contents/MacOS/Diagnostics Reporter"]
2023-10-09T22:09:46.515199380Z ["xpcproxy","com.apple.TrustEvaluationAgent.system"]
2023-10-09T22:09:46.533250020Z ["trustevaluationagent"]
2023-10-09T22:09:46.919154479Z ["xpcproxy","com.apple.TrustEvaluationAgent"]
2023-10-09T22:09:46.956647406Z ["trustevaluationagent"]
2023-10-09T22:09:47.178895049Z ["xpcproxy","com.apple.mdworker.shared.04000000-0000-0000-0000-000000000000"]
2023-10-09T22:09:47.181817432Z ["xpcproxy","com.apple.mdworker.shared.20000000-0000-0000-0000-000000000000"]
2023-10-09T22:09:47.205124702Z ["/System/Library/Frameworks/CoreServices.framework/Frameworks/Metadata.framework/Versions/A/Support/mdworker_shared","-s","mdworker","-c","MDSImporterWorker","-m","com.apple.mdworker.shared"]
2023-10-09T22:09:47.206892231Z ["/System/Library/Frameworks/CoreServices.framework/Frameworks/Metadata.framework/Versions/A/Support/mdworker_shared","-s","mdworker","-c","MDSImporterWorker","-m","com.apple.mdworker.shared"]
2023-10-09T22:09:55.639618195Z ["xpcproxy","com.apple.DiagnosticsReporter"]
2023-10-09T22:09:55.661773112Z ["/System/Library/CoreServices/Diagnostics Reporter.app/Contents/MacOS/Diagnostics Reporter"]
2023-10-09T22:10:05.763758148Z ["xpcproxy","com.apple.DiagnosticsReporter"]
2023-10-09T22:10:05.784599565Z ["/System/Library/CoreServices/Diagnostics Reporter.app/Contents/MacOS/Diagnostics Reporter"]
2023-10-09T22:10:15.885406139Z ["xpcproxy","com.apple.DiagnosticsReporter"]
2023-10-09T22:10:15.905317005Z ["/System/Library/CoreServices/Diagnostics Reporter.app/Contents/MacOS/Diagnostics Reporter"]
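
When the output scrolls by too fast to eyeball, a tally helps. Here is a minimal sketch that assumes you have captured lines in the same "timestamp args" shape as above; spawn_counts is a name I invented, and exec.log in the usage line is a hypothetical capture file.

```shell
# Count how often each argument list appears in "timestamp args" lines,
# most frequent first.
spawn_counts() {
  # Drop the leading timestamp (everything up to the first space),
  # then count identical argument lists.
  cut -d ' ' -f 2- | sort | uniq -c | sort -rn
}

# Usage: spawn_counts < exec.log
```

Anything that floats to the top of that list without a good excuse is a candidate for soothing.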

Assuming this Diagnostics Reporter thrashing isn’t “expected”, a working knowledge of launchd would lead to the suspicion that Diagnostics Reporter is being launched because of some state that it can’t successfully clean up when it’s running. Almost all software will tell you how it’s doing, but almost always that monologue is invisible from the exalted heights of the macOS User Experience. The first step is always to get down to earth where you can hear whatever is being said, instead of only hearing (at most) one of the pre-approved statements. Since pretty much all Apple software logs to os_log, we’ll start with a log query. We cut the chatter by asking to filter out debug and info messages, which leaves the default, error, and fault levels. Since eslogger told us the full path to Diagnostics Reporter, we can use that in the log query:

sudo log show --no-debug --no-info --last 1h --predicate \
'senderImagePath == "/System/Library/CoreServices/Diagnostics Reporter.app/Contents/MacOS/Diagnostics Reporter"'

Ignoring the substantial ‘boilerplate’ portions of each line, the messages of interest are:

Error accessing file:///var/db/PanicReporter/current.panic. The file “current.panic” couldn’t be opened because there is no such file.
Error accessing file:///var/db/PanicReporter/display.panic. The file “display.panic” couldn’t be opened because there is no such file.
Error accessing file:///var/db/DiagnosticsReporter/current.ale.crash. File doesn't exist at /var/db/DiagnosticsReporter/current.ale.crash or is not readable.
Error accessing file:///var/db/DiagnosticsReporter/current.watchdog. The file “current.watchdog” couldn’t be opened because there is no such file.
Invalid launch.

Ok, now we’re getting somewhere. Let’s look around at those paths. Interesting that one of these assertions includes the “… or is not readable” clause.

andre@boom ~ % ls -al /var/db/PanicReporter 
andre@boom ~ % 
andre@boom ~ % ls -al /var/db/DiagnosticsReporter
total 0
drwxrwxrwx    3 root  wheel    96 Sep 26 22:51 .
drwxr-xr-x  131 root  wheel  4192 Oct  9 15:18 ..
lrwxr-xr-x    1 root  wheel    66 Apr 11  2022 current.ale.crash -> /Library/Logs/DiagnosticReports/WindowServer-2022-04-11-103855.ips

If you and ls are old pals, it is immediately obvious that this file is a symbolic link, so let’s chase it:

andre@boom ~ % stat /Library/Logs/DiagnosticReports/WindowServer-2022-04-11-103855.ips
stat: /Library/Logs/DiagnosticReports/WindowServer-2022-04-11-103855.ips: stat: No such file or directory

Ah ha. Broken symlink. Who knows how it broke or why Diagnostics Reporter considers this a fatal error. The solution is to delete that broken symlink, since the file it refers to is gone. After doing this, no more Diagnostics Reporter spawning and immediately dying every 10 seconds. We win this round.
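
If you would rather not chase links by hand, find can do it. This is a sketch rather than the one true method: broken_links is a hypothetical helper name, and the directory in the usage line is the one from this particular case.

```shell
# Print every symlink under a directory whose target does not exist.
broken_links() {
  # -type l selects symlinks; `test -e` follows the link and fails
  # when the target is missing, so the negation keeps only dangling links.
  find "$1" -type l ! -exec test -e {} \; -print
}

# Usage: broken_links /var/db/DiagnosticsReporter
```

I’d still look at each result before deleting anything, since a dangling link is sometimes only temporarily dangling.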

There is perhaps an easier way we could have found the problematic state, which is to look at the trigger conditions in the launchd job that spawns Diagnostics Reporter.

andre@boom ~ % for d in /System/Library/LaunchAgents /System/Library/LaunchDaemons ; do find ${d} -iname '*diagnostics*' ; done
/System/Library/LaunchAgents/com.apple.DiagnosticsReporter.plist
/System/Library/LaunchAgents/com.apple.diagnostics_agent.plist
/System/Library/LaunchDaemons/com.apple.InstallerDiagnostics.installerdiagd.plist
/System/Library/LaunchDaemons/com.apple.InstallerDiagnostics.installerdiagwatcher.plist

andre@boom ~ % plutil -p /System/Library/LaunchAgents/com.apple.DiagnosticsReporter.plist
{
  "EnablePressuredExit" => 0
  "EnableTransactions" => 0
  "Label" => "com.apple.DiagnosticsReporter"
  "ProcessType" => "App"
  "Program" => "/System/Library/CoreServices/Diagnostics Reporter.app/Contents/MacOS/Diagnostics Reporter"
  "QueueDirectories" => [
    0 => "/var/db/PanicReporter/"
    1 => "/var/db/DiagnosticsReporter/"
  ]
  "StandardErrorPath" => "/dev/null"
}

The launchd.plist man page says the following about QueueDirectories:

     QueueDirectories <array of strings>
     This optional key keeps the job alive as long as the directory or directories specified are not empty.
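
That explains the respawning: a dangling symlink is still a directory entry, so the directory was never empty. Here is a small sketch of that check, assuming that “not empty” simply means “contains any entry at all” (which matches what I saw, but is my reading of the man page rather than something I’ve confirmed in launchd itself). queue_dir_busy is a made-up name.

```shell
# Succeed if the directory contains any entry at all.
queue_dir_busy() {
  # ls -A lists every entry except . and .. and does not follow symlinks,
  # so a dangling symlink still counts as content.
  [ -n "$(ls -A "$1" 2>/dev/null)" ]
}

# Usage:
#   for d in /var/db/PanicReporter /var/db/DiagnosticsReporter; do
#     queue_dir_busy "$d" && echo "$d is not empty"
#   done
```

Run against the two QueueDirectories before the fix, only /var/db/DiagnosticsReporter would have been reported, which points straight at the culprit.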
