August 3, 2016 at 7:53 am
Sorry if you have already seen this under SQL 2005 Forum but it was obviously in the wrong place
I have a server running 2 instances of SQL 2008 R2 SP1.
On one instance the Maintenance Plans are working fine on the other they are suspended because SSIS cannot be found.
"Message
Unable to start execution of step 1 (reason: The SSIS subsystem failed to load [see the SQLAGENT.OUT file for details]; The job has been suspended). The step failed."
I have checked the msdb.subsystems table and manually added a row for SSIS (sp_verify_subsystems won't find SSIS) but every time I restart the SQL Agent the SSIS row disappears.
From what I can tell both instances are setup identically with the same user permissions etc. The use which runs the services is in the local admin group on the server.
The above mentioned server is a live box, the test box is also experiencing the same problem although only having one instance installed and running SQL 2008 R2 SP3.
Any suggestions would be much appreciated. Thanks
August 3, 2016 at 2:38 pm
And what are the details in the SQLAgent.out file when the job fails? You would also want to look at when Agent started, not just when the job fails as if it's purely a subsystem failing to load issue, the reasons are often in the startup portion of the Agent log file.
Sue
August 3, 2016 at 3:05 pm
Run this query in the problematic instance:
select * from msdb.dbo.syssubsystems
See if the locations listed for the .dll's and the .exe's exist, and the files actually are in the locations.
If they don't, figure out where they DO exist, and correct them by updating the table.
sp_configure 'allow updates', 1
reconfigure with override
update msdb.dbo.syssubsystems
set subsystem_dll = replace(subsystem_dll,'Bad Location','Good Location')
from msdb.dbo.syssubsystems
where subsystem_dll like 'Bad Location%'
sp_configure 'allow updates', 0
reconfigure with override
Michael L John
If you assassinate a DBA, would you pull a trigger?
To properly post on a forum:
http://www.sqlservercentral.com/articles/61537/
August 4, 2016 at 1:53 am
Thanks for your suggestions, I have checked the SQLAgent.out file and there are no clues in there, just the following.
2016-07-26 11:17:25 - ! [LOG] Step 1 of job 'Database Backup Plan.Subplan_1' (0xA6E09C11EB71404B8A5DC1AAAF48A0D2) cannot be run because the SSIS subsystem failed to load. The job has been suspended
I have also checked the dll and exe locations and they are where SQl expects them to be.
C:\Program Files\Microsoft SQL Server\MSSQL10_50.MSSQLSERVER\MSSQL\Binn
C:\Program Files\Microsoft SQL Server\100\DTS\Binn
Any other suggestions would be gratefully received.
August 4, 2016 at 7:37 am
If the subsystems data appears correct and all the paths are fine, You can also get the error with permissions issues, not just messed up subsysystems. Check the windows log and see if you have any errors at the same time the job fails, check the owner of the job, check to permissions for the Agent account to both the Binn directories (those directories for subsystem_dll and agent_exe in the syssubsystem table).
With the .Net dependencies, I would guess that could cause some problems as well so just verify that you have .Net install on that server, is current and has the current patches.
I would probably try some other simple package just to make sure SSIS in general seems fine. And then schedule that to make sure all is fine...just even if it does nothing other than executing select * from sysprocesses.
Sue
August 4, 2016 at 8:30 am
Thanks Sue,
I double checked the subsystem paths and they are correct but the row disappears when the Agent is restarted anyway.
I have check the user permissions and the user that run the Agent is in the local admin group on the server.
.Net is installed and patched.
What I can't get my head round is that on the live server where there are 2 instances, one works and one doesn't. This would imply that the problem is not with SSIS as it is a shared service.
I am using the instance which is working to run Maintenance Plan against the one that isn't and the test server.
I also have other server on the same version of OS and SQL that are running fine.
Thanks again
Evan
August 4, 2016 at 9:18 am
I have seen the same behavior with two instances so it's not as weird as it may seem. The syssubsystems table gets populated based on entries in the registry. It reads two different keys - the instance path and the setup path. The instance paths would be different. The system stored procedure that runs when agent starts is going to check these paths and populate the table if needed. I would probably run the same registry check manually on both instances and see what the paths are showing. Run this on both instances and see what you have - I'm pretty sure I grabbed these from the system stored procedures themselves:
Declare @InstRootPath nvarchar(500)
declare @VersionRootPath nvarchar(500)
EXEC master.dbo.xp_instance_regread N'HKEY_LOCAL_MACHINE', N'SOFTWARE\Microsoft\MSSQLServer\Setup', N'SQLPath', @InstRootPath OUTPUT
select @InstRootPath
EXEC master.dbo.xp_regread N'HKEY_LOCAL_MACHINE', N'SOFTWARE\Microsoft\Microsoft Sql Server\100', N'VerSpecificRootDir', @VersionRootPath OUTPUT
select @VersionRootPath
One other question - do you know or remember if either of the instances upgraded from SQL 2005 by chance?
Sue
August 4, 2016 at 5:42 pm
The other thing I keep forgetting...when you restarted SQL Agent, look at the Out file, error log and see if there are errors on startup for the SSIS subsystem. I think that's where you can sometimes get slightly more details if a subsystem fails to load into cache.
Sue
August 5, 2016 at 3:27 am
Thanks again Sue,
I have run the script to query the registry and the output is the same for both instances.
C:\Program Files\Microsoft SQL Server\100C:\Program Files\Microsoft SQL Server\MSSQL10_50.MSSQLSERVER\MSSQL
None of the instances were upgraded from SQL 2005 they were all fresh installs of SQL 2008.
I manually inserted the SSIS row in the subsystems table again and restarted the Agent, again the row disappeared and there are no clues in the SQLAgent.out file, that I can see.
2016-08-05 10:02:30 - ? [100] Microsoft SQLServerAgent version 10.50.6000.34 ((Unknown) unicode retail build) : Process ID 1028
2016-08-05 10:02:30 - ? [101] SQL Server MAMMOTH version 10.50.6220 (0 connection limit)
2016-08-05 10:02:30 - ? [102] SQL Server ODBC driver version 10.50.6000
2016-08-05 10:02:30 - ? [103] NetLib being used by driver is DBNETLIB.DLL; Local host server is
2016-08-05 10:02:30 - ? [310] 1 processor(s) and 2048 MB RAM detected
2016-08-05 10:02:30 - ? [339] Local computer is MAMMOTH running Windows NT 6.1 (7601) Service Pack 1
2016-08-05 10:02:31 - ? [432] There are 11 subsystems in the subsystems cache
2016-08-05 10:02:39 - ! [364] The Messenger service has not been started - NetSend notifications will not be sent
2016-08-05 10:02:39 - ? [129] SQLSERVERAGENT starting under Windows NT service control
2016-08-05 10:02:39 - + [260] Unable to start mail session (reason: No mail profile defined)
2016-08-05 10:02:39 - + [396] An idle CPU condition has not been defined - OnIdle job schedules will have no effect
August 5, 2016 at 12:45 pm
The paths all being the same between two instances isn't right. The subsystem_dll paths should be different between instances on the same server.
I have no idea how the instance root paths could end up the same for different instances in the registry. That would just seem like something went very wrong with the installation of the instance. I would double check that to verify - just run this on both instances to see if the subsystem_dll paths are really the same for both instances:
SELECT subsystem,
FORMATMESSAGE(description_id) as description,
subsystem_dll,
agent_exe,
start_entry_point,
event_entry_point,
stop_entry_point,
max_worker_threads,
subsystem_id
FROM msdb.dbo.syssubsystems
ORDER by subsystem_id
If it turns out they are different, rather than updating syssubsystems yourself and restarting, you may want to run the following instead - after taking a backup of msdb:
use msdb
go
delete from msdb.dbo.syssubsystems
exec msdb.dbo.sp_verify_subsystems 1
And then restart Agent afterwards.
That's generally the code I've seen Microsoft recommend to fix messed up subsystems, same thing they reference in the KB articles. However, being that it reads from the registry, if your registry entries are truly messed up I would guess it won't fix anything.
Sue
August 8, 2016 at 3:56 am
Hi Sue,
My mistake, I should of said that the paths are as I would expect. The failing instances points to the correct locations
C:\Program Files\Microsoft SQL Server\100C:\Program Files\Microsoft SQL Server\MSSQL10_50.MSSQLSERVER\MSSQL
and so does the working instance
C:\Program Files\Microsoft SQL Server\100C:\Program Files\Microsoft SQL Server\MSSQL10_50.AIRWATCH\MSSQL
I have tried the sp_verify_subsystems and it doesn't find the SSIS package despite the information above being present in the registry.
Regards
Evan
August 8, 2016 at 1:59 pm
Okay so now, on the failing instance, does the following file exist:
SQLDTSSS.DLL
Looks like it should be in C:\Program Files\Microsoft SQL Server\MSSQL10_50.MSSQLSERVER\MSSQL\binn
If it does, I would still check the permissions on the folder, don't fall back on "It's a member of the local administrators" since that doesn't really matter for a lot of the permissions the SQL Server service accounts need.
If the service accounts have been changed outside of Configuration manager (i.e changing things in services applet) the appropriate rights can be missed. I've had SSIS messed up due to that before and just re-entering everything for the accounts in Configuration Manager, apply and restarting the services fixed things.
You could check the groups for SQL services on that server and and then the rights using gpedit and checking User Rights Assignment.
Sue
August 11, 2016 at 2:16 am
Hi Sue
Yes the file does exist in that location.
I have checked the folder permissions and all looks ok.
Although the service account hasn't changed I have refreshed the details through Configuration Manager and restarted all the services but still no joy.
August 11, 2016 at 12:26 pm
Hey Evan -
I think we covered most of the known fixes on this - you tried the manually populating the table, tried the MS way of deleting the contents of the table and doing sp_verify_subsystems 1 to refresh, verified all the paths in the registry, verified the files are in the appropriate places, verified the security, nothing in windows or SQL Server logs, etc.
Unfortunately, the only other thing I can think of is to uninstall and reinstall SSIS on that server. You can use scripts to do the backups until you have a maintenance window to do that if you go that route.
Sue
August 12, 2016 at 1:57 am
Thanks for your assistance with this problem Sue, it's a pity we couldn't find a solution but these things happen. I will consider re-installing SSIS but it is only the maintenance plans that are affected and I have set these up through a working instance, so there is no immediate need to do so. I will need to service pack one of the failing instances shortly so I may wait and see what happens then.
Viewing 15 posts - 1 through 14 (of 14 total)
You must be logged in to reply to this topic. Login to reply