Out of all the problems you can have with SQL Server troubleshooting connectivity issues can be the most challenging. When you factor in the complexities of Active Directory and SQL Server’s interaction with it fixing SSPI errors can be down right baffling.
At my company we are moving onto new hardware and along the way standardizing on SQL Server 2005 x64.
Since this is all happening on new hardware I have the luxury of doing most of the work before we flip the switch.
We had one migration under our belt and the second one was looking good when the SSPI came and decided to make sure I spent all my Saturday working.
I ran down my list of things to check:
Date out of sync with domain more than 30 minutes – Nope.
Bad DNS entry – Nope.
Miss configured service account – Nope.
Log on locally with Windows account – Yep.
Now I was stating to get frustrated and kind of worried that a roll back may be called for. I did what all good panicked DBA’s do, I searched the Internet. Most of it I had already tried and some of it just didn’t apply. It’s not often a solid web search is such a complete strike out.
When all else fails, step back look at the problem as a whole and start from the beginning.
What do we know? We can’t establish a trusted login between SQL Server and the domain.
The OS and domain are just fine. You can log in locally with a domain account and you can remotely access other server resources with a domain account.
SQL Server services start up just fine under a domain account.
You can log in locally to SQL Server but not remotely.
SQL authentication works just fine.
That pretty much leaves a configuration issue somewhere. Since we had set this server up it initially had a different name and IP address that would be changed to the old server name and IP address. There are alot of known problems with other parts of SQL Server like reporting services when you do this kind of rename but generally SQL Server is just fine.
You drop the old SQL Server name
sp_dropserver old_name
sp_addserver new_name, local
OR
sp_dropserver [old_name\instancename]
sp_addserver [new_name\instancename], local
Verify everything with these two queries
SELECT @@Servername
SELECT * FROM [servername].msdb.dbo.sysjobs
If you have a problem with the @@Servername you get back nothing or the wrong server name.
The four part select should return data if not you usually get this linked server is not configured for blah blah blah. Which means you may have forgotten the ,local part.
After checking all of these things off again, we still had the issue!
Now I really was stumped.
I didn’t have a lot of other choices and time was running out. I just started going through EVERYTHING that was network related in the SQL Server configuration.
Eventually, I saw the issue. The server IP was correct but the IP in the SQL Server Network Configuration was wrong! I’ve done a ton of renames like this and hadn’t encountered this particular setting not changing when the server IP changed.
I reset it to the new server IP and just like magic SQL Server could authenticate to the domain.
There was much cheering from my peers, ok not really but I was cheering. The server was up and everyone was able to get back to doing other things as the database faded into the background once again.
I just keep telling myself that SQL Server one of the easiest RDMS’s on the market.