This is the eleventh installment in the 12 day series for SQL tidbits during this holiday season.
Previous articles in this mini-series on quick tidbits:
- SQL Sat LV announcement
- Burning Time
- Reviewing Peers
- Broken Broker
- Peer Identity
- Lost in Space
- Command ‘n Conquer
- Ring in The New
- Queries Going Boom
- Retention of XE Session Data in a Table
Did you know there is a default job in SQL Server that is created with the purpose of removing system health phantom records? This job also helps keep the system tables ,that are related to policy based management, nice and trim if you have policy based management enabled. This job could fail for one of a couple of reasons. And when it fails it could be a little annoying. This article is to discuss fixing one of the causes for this job to fail.
I want to discuss when the job will fail due to the job step related to the purging of the system health phantom records. Having run into this on a few occasions, I found several proposed fixes, but only one really worked consistently.
The error that may be trapped is as follows:
A job step received an error at line 1 in a PowerShell script.
The corresponding line is ‘(Get-Item SQLSERVER:\SQLPolicy\SomeServer\DEFAULT).EraseSystemHealthPhantomRecords()’.
Correct the script and reschedule the job. The error information returned by PowerShell is:
‘SQL Server PowerShell provider error: Could not connect to ‘SomeServer\DEFAULT’.
[Failed to connect to server SomeServer. -->
The first proposed fix came from Microsoft at this link. In the article it proposed the root cause of the problem being due to the servername not being correct. Now that article is specifically for clusters, but I have seen this issue occur more frequently on non-clusters than on clusters. Needless to say, the advice in that article has yet to work for me.
Another proposed solution I found was to try deleting the “\Default” in the agent job that read something like this.
(Get-Item SQLSERVER:\SQLPolicy\SomeServer\Default).EraseSystemHealthPhantomRecords()
Yet another wonderful proposal from the internet suggested using Set-ExecutionPolicy to change the execution policy to UNRESTRICTED.
Failed “fix” after failed “fix” is all I was finding. Then it dawned on me. I had several servers where this job did not fail. I had plenty of examples of how the job should look. Why not check those servers and see if something is different. I found a difference and ever since I have been able to use the same fix on multiple occasions.
The server where the job was succeeding had this in the job step instead of the previously pasted code.
if (‘$(ESCAPE_SQUOTE(INST))’ -eq ‘MSSQLSERVER’) {$a = ‘\DEFAULT’} ELSE {$a = ”};
(Get-Item SQLSERVER:\SQLPolicy\$(ESCAPE_NONE(SRVR))$a).EraseSystemHealthPhantomRecords()
That, to my eyes, is a significant difference. Changing the job step to use this version of the job step has been running successfully for me without error.
I probably should have referenced a different server instead of resorting to the internet in this case. And that stands for many things – check a different server and see if there is a difference and see if you can get it to work on a different server. I could have saved time and frustration by simply looking at local “resources” first.
If you have a failing syspolicy purge job, check to see if it is failing on the phantom record cleanup. If it is, try this fix and help that job get back to dumping the garbage from your server.