July 8, 2010 at 3:21 pm
Suppose I have 2 DB servers and 1 Mail server. DB1 can see the Mail server. DB2 is in another state. It can see DB1 via a VPN but it cannot see the Mail server.
How do I send alerts and job failure notifications from DB2?
Actually I have a solution for Alerts: The Alerts on DB2 call a "Send Alert to DB1" Job, and the Job calls a proc on DB1 which raises an error which fires an alert which sends an email.
But what about DB2 Job failures?
I would really like a "Run this other Job if the current Job fails", similar to what you can do with an Alert response, but I'm not seeing that feature.
I see that I can attach an alert to a job, but I'm not able to make any sense of it. The "new alert" wizard doesn't have a "when this job fails" option, which is what I really want.
Any advice? Thanks!
July 8, 2010 at 3:38 pm
Can't you instead setup a job on DB1 that searches for job failures on DB2 and then sends an email when it finds one?
Tara Kizer
Microsoft MVP for Windows Server System - SQL Server
Ramblings of a DBA (My SQL Server Blog)[/url]
Subscribe to my blog
July 8, 2010 at 3:51 pm
I like Tara's suggestion. If you are using DB1, let that do the work. If it's down, doesn't matter if DB2 does anything.
However, I would rather have you find some way to send a msg from DB2. Can you not use SMTP mail? Or a Q to store messages and then have them pulled from DB1? If this is only through a VPN, it's a limited alerting system.
July 9, 2010 at 7:51 am
I hadn't really considered polling. I had been attacking the problem from an "event forwarding" perspective. With a polling solution, I guess I would need to keep some kind of local "last polled timestamp" so I wouldn't re-detect previous errors.
These are good workarounds. I was hoping there was just something obvious I was missing in the alert/job relationship. I don't really understand what it means to attach an alert to a job, but then that alert just looks like a standard generic alert, and not a "if this job fails" type alert. What's the point of that feature?
Agreed that sending messages directly from the DB2 server would be ideal. It was going to take the network team some time to help put that together (we have PCI restrictions that limit our obvious options), and so my manager asked me to do some kind of forwarding in the mean time.
By the way, I'm very grateful to have received a response from 2 of my favorite DBA's. Thanks!
July 9, 2010 at 8:43 am
will s (7/8/2010)
The "new alert" wizard doesn't have a "when this job fails" option, which is what I really want.
Maybe I'm misunderstanding, but failed jobs can send out Notifications which can either do NetSend, Email or Page.
There's also the SMTP extended stored procedures that someone wrote up for SQL 2000 before the days of DBMail. But if your server can't see the Mail Server, I doubt it would do you any good.
July 9, 2010 at 9:56 am
You can still record the events locally on DB2. Detect them, store them in a local table on the db server, with a timestamp, and then query that from DB1. If you find a new alert, send it on.
July 9, 2010 at 10:48 am
Yep, using Notifications at the DB1 site, they work great. Not an option at DB2.
Not to be too lazy 😉 but it seems like the SQL Agent Event Forwarding feature should handle all of this. I'm not sure why I'm having to build my own forwarding/polling engine. I did try this: On DB2 I turned on Event Forwarding and created a Test Job that fails (intentional divide by 0 error). However my corresponding alerts on DB1 never fired, and I couldn't find any proof in the logs that "event forwarding" had actually occurred. Maybe a job failure is not itself an "event"? (The divide-by-0 error is not an "event" b/c it's only a level 16 error and won't go to the logs for detection unless I explicitly "raiserror with log".)
I do have a script that I use to run every day to detect failed jobs before I got Notifications setup, so I guess I'll set this up in a "polling" job on DB1 as has been suggested. I've just been able to do everything else up to this point using the out-of-the-box features, and it felt more "bona fide"...
I'll post the final solution once I've got it.
July 9, 2010 at 1:12 pm
Okay, I now have a job running every 5 minutes on the "DB2" server that forwards Job errors to the central "DB1" server. The job calls "usp_ForwardSqlJobErrors" which uses a control table + "vwJobStepsInError" to detect job steps that have failed since the last check. I put these in an Admin database, as Tara suggests 😉 The code for the proc and view are below.
The errors are bundled up and passed to [DB1].Admin.dbo.usp_RaiseErrorFromDB2 (names sanitized). I suppose you could have that proc do a number of things, like send an email explicitly, etc. I have it raising an error that is caught by an alert, which then does appropriate nifty notification stuff.
Since my target proc is raising an error, I had to configure the 5 minute job to "report success upon failure", even though I've got a try/catch in there, in order to prevent the job from reporting on itself every 5 minutes. 😛
In retrospect it might have been a better use of my time to push for an SMTP solution from the DB2 server that currently can't see a mail server. Except that this solution actually gives me some error information from the actual step that failed, which is more than I can get from a Job Notification. I may replace my existing "Notifications" infrastructure on my other servers with this...
usp_ForwardSqlJobErrors:
/*
Forward any new job errors to the central server.
created 20100709 by wills
*/
create proc usp_ForwardSqlJobErrors
as
begin
if not exists(select * from INFORMATION_SCHEMA.tables where TABLE_NAME = 'SqlJobErrorForwardingControl')
begin
create table dbo.SqlJobErrorForwardingControl
(
LastRunDateForwarded datetime
)
end
declare @PreviousLastRunDateForwarded datetime
declare @NewLastRunDateForwarded datetime
set @PreviousLastRunDateForwarded = isnull((select max(LastRunDateForwarded)
from dbo.SqlJobErrorForwardingControl), 0)
set @NewLastRunDateForwarded = isnull((select max(RunDate)
from Admin.dbo.vwJobStepsInError
where RunDate > @PreviousLastRunDateForwarded), 0)
if @NewLastRunDateForwarded > @PreviousLastRunDateForwarded begin
--If we have error messages to send, collect them up in a single variable.
--(I had to use nvarchar(4000) due to some bug in nvarchar(max) that was nullifiying all but the last row once I added the Message column).
declare @msg nvarchar(4000)
select @msg = coalesce(@msg + '; ' + char(13), '') + N'Error in JobName:' + JobName + N', Step:' + StepName + N', Msg:' + cast([Message] as nvarchar(100))
from Admin.dbo.vwJobStepsInError
where RunDate > @PreviousLastRunDateForwarded
and RunDate <= @NewLastRunDateForwarded
order by InstanceID desc
--print @msg
begin try
--Send errors to central server.
--(We're calling a proc that is going to raise an error. Try/Catch it.)
exec [DB1].Admin.dbo.usp_RaiseErrorFromDB2 @msg
end try
begin catch
--Proceed.
end catch
--Update control table.
if not exists(select * from dbo.SqlJobErrorForwardingControl)
begin
insert dbo.SqlJobErrorForwardingControl(LastRunDateForwarded)
select @NewLastRunDateForwarded
end
else
begin
update dbo.SqlJobErrorForwardingControl
set LastRunDateForwarded = @NewLastRunDateForwarded
end
end
end
vwJobStepsInError:
/*
Review SQL Agent Job steps that had errors.
usage:
select * from Admin.dbo.vwJobStepsInError
created 20100709 by wills - used SSMS Log File Viewer sql as base
*/
create view vwJobStepsInError
as
select top 100 percent
case tshj.run_date
when 0 then null
else convert(datetime, stuff(stuff(cast(tshj.run_date as nchar(8)), 7, 0, '-'), 5, 0, '-') + N' ' + stuff(stuff(substring(cast(1000000 + tshj.run_time as nchar(7)), 2, 6), 5, 0, ':'), 3, 0, ':'), 120)
end as [RunDate],
tshj.instance_id as [InstanceID],
tshj.job_id as [JobID],
tshj.job_name as [JobName],
tshj.step_id as [StepID],
tshj.step_name as [StepName],
tshj.sql_severity as [SqlSeverity],
tshj.sql_message_id as [SqlMessageID],
tshj.message as [Message],
tshj.run_status as [RunStatus],
tshj.run_duration as [RunDuration],
tshj.operator_emailed as [OperatorEmailed],
tshj.operator_netsent as [OperatorNetsent],
tshj.operator_paged as [OperatorPaged],
tshj.retries_attempted as [RetriesAttempted],
tshj.server as [Server]
from openrowset('sqloledb',
'server=(local);trusted_connection=yes',
'set fmtonly off exec msdb.dbo.sp_help_jobhistory @mode=''FULL''')as tshj
where sql_severity > 0 --errors only
order by [InstanceID] desc
July 9, 2010 at 4:47 pm
You could also reuse this on DB1, let it also then send out alerts for itself.
Sounds like a good solution.
Viewing 9 posts - 1 through 8 (of 8 total)
You must be logged in to reply to this topic. Login to reply