July 10, 2023 at 3:46 am
We are evaluating SQL Server 2022 and have upgraded an SQL Server Fail Over Cluster and AG with with CU 4. In order not to run the jobs created by Reporting services on the passive node, we have added a step to check to all SQL Agent Jobs that does a RAISERROR to exit out when found not to be the active node. Typically this would not show in the SQL Agent Log, however since we upgraded it to 2022, the SQL Agent Log is being flooded with [516] Step 1 for job 0x407ADE6DFC8CF84AAEB2EA38EC037401 failed with SQL error number 50000, severity 16.
I cannot find any reference to why this may be occurring. Does anyone know why this may be, or can suggest what other severity we could use?
Thanks!
July 10, 2023 at 5:03 pm
Instead of raising an error, simply test if the server is the active node.
Michael L John
If you assassinate a DBA, would you pull a trigger?
To properly post on a forum:
http://www.sqlservercentral.com/articles/61537/
July 11, 2023 at 1:48 pm
AG meant for read only secondary , why not use it and configure SSRS there.
Regards
Durai Nagarajan
July 11, 2023 at 1:51 pm
AG meant for read only secondary , why not use it and configure SSRS there.
Not always. If you have a read only secondary, you will need to license the secondary's. You may not need a read-only secondary.
As an example, of the 11 AG's I have in my environment, only 2 have read only secondary's. The rest are set up for DR/HA
Michael L John
If you assassinate a DBA, would you pull a trigger?
To properly post on a forum:
http://www.sqlservercentral.com/articles/61537/
July 11, 2023 at 2:31 pm
correct in terms of license but if no readable secondary we can go for cluster that might save resources right.
Regards
Durai Nagarajan
July 11, 2023 at 2:43 pm
correct in terms of license but if no readable secondary we can go for cluster that might save resources right.
It depends upon the requirements and need. Clustering requires shared storage, etc. It's difficult to cluster across different data centers.
Without an understanding of the OP's requirements, I would not begin to guess what technology would be "better" for their environment.
And, regardless if a secondary is read only, you really do not want most things to execute against the secondary. Any ETL, and most maintenance, would only need to be executed on the primary.
Michael L John
If you assassinate a DBA, would you pull a trigger?
To properly post on a forum:
http://www.sqlservercentral.com/articles/61537/
July 11, 2023 at 4:24 pm
I firmly disagree with the statement that AGs are only for implementing read only secondaries. AGs are more flexible and have fewer fragile components compared to an FCI while storage is about the single cheapest resource in a server environment even if you are all flash.
To mirror what Michael said, don't use errors to kill the job, test it. and I would go a step further and specifically recommend creating a job step that queries the primary status of the AG, then if it is not primary, queries the currently running job to then stop the currently running job. It allows you to boiler plate an agent job step you can reuse on all of your AG jobs instead of nesting the logic of each job instead of an if statement.
July 12, 2023 at 2:25 am
That is what we are doing
/* allow a normal or a primary replica database through */
IF ISNULL(sys.fn_hadr_is_primary_replica('ReportServer' ),1) != 1 BEGIN RAISERROR(' %s is not primary replica for %s',16,1, @@Servername, 'ReportServer') END
This is added as step 1 in all jobs. As I said normally SQL Agent Log wouldn't show this as a error, now it is.
July 12, 2023 at 2:02 pm
Using the code below, we did not raise an error.
If [dbo].[IsPrimaryInAG](agname) = 1 Begin
Execute job
End
I guess what are you trying to find out by raising an error?
/****** Object: UserDefinedFunction [dbo].[IsPrimaryInAG] Script Date: 3/9/2018 12:47:50 PM ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE FUNCTION [dbo].[IsPrimaryInAG]
(
@AG_Name sysname
)
RETURNS tinyint
AS
BEGIN
DECLARE @PrimaryReplica sysname;
SELECT
@PrimaryReplica = HAGS.primary_replica
FROM sys.dm_hadr_availability_group_states HAGS
INNER JOIN sys.availability_groups AG ON HAGS.group_id = AG.group_id
WHERE AG.name = @AG_Name;
IF UPPER(@PrimaryReplica) = UPPER(@@Servername) BEGIN
RETURN 1
END; -- primary
RETURN 0; -- not primary
END;
GO
Michael L John
If you assassinate a DBA, would you pull a trigger?
To properly post on a forum:
http://www.sqlservercentral.com/articles/61537/
July 13, 2023 at 4:17 am
Ok thanks. Will look at that.
But as I said, RaiseError severity 16 never used to report errors in the SQL Agent, now it is with the message
"[516] Step 1 for job 0x9151A0B9139138499B4DB2DFE0852290 failed with SQL error number 229, severity 14"
May 2, 2024 at 5:23 pm
We just upgraded to SQL 2022 and are experiencing the same thing.
To provide some context the reason we use a "Primary Checker" script like the original author is to detect and exit a SQL Agent job if we're not on the primary. This is especially helpful on jobs with a lot of steps. You run the Primary Checker as step 1, and if we're not on the primary then the step will fail because of the RAISEERROR
and you set the job to Exit with Success. This is a very common way of handing jobs with availability groups.
This behavior is new and mostly a problem for us because SentryOne is trapping those errors and throwing alerts at us because the Primary Checker step is failing. Adding IsPrimaryAG
function referenced above would be great for simple jobs, but it would require you to wrap all job steps in it. And it's only valid for TSQL steps.
Necroing this thread to see if the OP came up with a better way to handle this new behavior?
May 2, 2024 at 7:49 pm
We just upgraded to SQL 2022 and are experiencing the same thing.
To provide some context the reason we use a "Primary Checker" script like the original author is to detect and exit a SQL Agent job if we're not on the primary. This is especially helpful on jobs with a lot of steps. You run the Primary Checker as step 1, and if we're not on the primary then the step will fail because of the RAISEERROR and you set the job to Exit with Success. This is a very common way of handing jobs with availability groups.
It may be common, but it makes no sense, just like reorg at 5% fragmentation, rebuild at 30% fragmentation.
This is not really an error. Where did you download this code from?
This behavior is new and mostly a problem for us because SentryOne is trapping those errors and throwing alerts at us because the Primary Checker step is failing. Adding IsPrimaryAG function referenced above would be great for simple jobs, but it would require you to wrap all job steps in it. And it's only valid for TSQL steps.
Not really. Depending upon the situation, you may want to execute a step on both of the nodes.
If this is a job that has a number of steps and you do not want to add this to every step, then create a separate job. Test if it's the primary, and use sp_start_job to run the job you want to run. That is one way to handle this.
You can also, although this is more complicated, create a process that is triggered by the alert 1480, Failover, that enables or disables the jobs. You would also want to execute a separate job/script that does this test on a schedule in case the alert is not triggered.
You can suppress the alerts from SentryOne.
There are other ways to handle this if you think outside the box.
I also cannot get a severity 16 error to write to the log. I copied the code above, and ran it on my 2022 box. I tried a few other things to throw a sev 16 error, and it did not write to the log.
I'm not sure why this is treated as an error. It is simply a test of a condition. If true, keep processing, false, move on. It is no different than an IF statement construct.
Michael L John
If you assassinate a DBA, would you pull a trigger?
To properly post on a forum:
http://www.sqlservercentral.com/articles/61537/
May 2, 2024 at 9:15 pm
Just google "sql agent jobs availability group primary checker" and you will get plenty of hits. Been using it successfully since SQL 2012, but with SQL 2022 the behavior has changed.
SQL Server Agent doesn't have any native constructs for dealing with Availability Groups so people are forced to come up with Rube Goldberg machines to perform what should be a simple thing. Is it a hacky solution, yes, but so are all of the other ones I have seen. A job that kicks off another job. A job that enables and disables other jobs on failover. All suffer from unnecessary complexity. At least the primary checker is simple and honestly it still works fine, it just added noise to the SQL Agent Log.
I think it's perfectly reasonable to investigate why the behavior has changed before going off and rearchitecting a solution.
May 3, 2024 at 12:04 am
This was removed by the editor as SPAM
May 3, 2024 at 12:16 am
LOL OK tough guy. Have a great night.
Viewing 15 posts - 1 through 15 (of 18 total)
You must be logged in to reply to this topic. Login to reply