September 15, 2011 at 6:30 am
Hi Chaps
I have a stored procedure which is scheduled to run hourly - this stored procedure deletes all rows from an existing table and then using 'insert into' repopulates the table with data (and rebuilds indexes too).
My problem is the query which repopulates the table takes a couple of minutes to complete, during which time, because the start of the procedure includes a DELETE FROM, the table being used live is empty. Is there an alternative method which leaves the older data in tact for as long as possible until the update is complete?
My naive and uneducated mind leads me to think of maybe using the SP to insert into a temp table, then deleting from real table, then copying into real table - assuming copying from the temp to real is faster than inserting into whilst running the monster query! But I don't know how to do this.
No doubt there are also much cleaner ways to effectively update table data from a complex queries other than deleting and inserting via a stored procedures but my knowledge is pretty limited I'm ashamed to say.
I've not posted the query in the SP because it really is a monster with multiple union statements but I will if it's needed.
Cheers
September 15, 2011 at 6:38 am
This was removed by the editor as SPAM
September 15, 2011 at 6:41 am
A couple of things.
1) If you're deleting everything from the destination, use TRUNCATE rather than DELETE. TRUNCATE generally runs faster because it does not fully log the rows being deleted. It just logs the extents.
2) If your query is a monster with that many UNIONs, break it up into smaller chunks so that it processes faster. Use individual INSERTS and that will give your users something to hit while the other queries are inserting.
3) If this table is so important that you can't have an "empty result set", why are you completely removing all the data from it?
4) Why aren't you doing this in the off hours if this is so important?
Yes, running the data into either a temp table or a staging table (permanent user table that can be truncated) could work for you, but there are other options.
Option A) Load Temp / Staging table, TRUNCATE / DELETE from the destination and then INSERT. (In that order)
Option B) Put an identity on the table (if there isn't one already). Load a variable with the MAX(MyIdentColumn). INSERT new data into table. DELETE FROM MyTable WHERE MyIdentColumn <= @Var.
If you absolutely must remove old data from the table, instead of doing an OUTER JOIN and just updating it with the new data, then I prefer Option B. But then, I also prefer doing this in the off hours where no one will notice a few minutes of the data doing weird things.
EDIT: FYI, I haven't used MERGE yet (we just upgraded), so all my advice is based on SQL 2k5 and below knowledge.
September 15, 2011 at 7:57 am
use merge statment, it's the best option. In other hand just serialize querys in steps, soo a job don't invoque next execution until finish one.
-
Lic. Andrés M. Aiello
DBA MSSQL - Oracle
http://aiellodba.blogspot.com/
@AndresAiello
September 15, 2011 at 10:25 am
The truncate and changing the union all to multiple selects has halved the time this takes to run; it's now below one minute so thank you very much for the help thus far 🙂
To answer one question - the data is entered regularly, every few minutes in fact, but in a format which is less than useless for reporting (just lousy software I guess!) - unfortunately the users will be running the reports all of the time and expect them to be as 'current' as possible - so an over night regime is not preferred if it can be avoided
Merge sounds like the correct approach but I've not yet fathomed that out.
Here's a snippet from the stored procedure as it is now (sorry about the messy format) - if there is anything particularly stupid that I'm missing please let me know..apart from using merge/cte etc (which would obviously need me to read a few more articles)
USE [ciswarehouse]
GO
/****** Object: StoredProcedure [dbo].[spLdd12] Script Date: 09/15/2011 17:10:17 ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
-- =============================================
-- Author:<Author,,Name>
-- Create date: <Create Date,,>
-- Description:<Description,,>
-- =============================================
ALTER PROCEDURE [dbo].[spLdd12]
AS
BEGIN
-- SET NOCOUNT ON added to prevent extra result sets from
-- interfering with SELECT statements.
SET NOCOUNT ON;
IF OBJECT_ID('MyTransformed','U') IS NOT NULL
truncate table dbo.MyTransformed
Insert into ciswarehouse.dbo.MyTransformed
(StudentRegister, Activity, Student, MarkDate, ShDate, Mark, Weekdate, Present, Absence, Authorised, Unauthorised)
SELECT dbase1.dbo.table1_studentregister.sr_id, dbase1.dbo.table1_activity.a_id, dbase1.dbo.table1_studentregister.sr_student , DATEADD(wk, n. N - 1, dbase1.dbo.table1_activity.a_start), DATEADD(dd, datediff(dd, 0, dateadd(wk, n. N - 1, dbase1.dbo.table1_activity.a_start)), 0),'/' , DATEADD(ww, DATEDIFF(ww, 0, DATEADD(wk, n. N - 1, dbase1.dbo.table1_activity.a_start)), 0), '1','0','0','0'
FROM dbase1.dbo.table1_activity INNER JOIN
dbase1.dbo.table1_studentregister ON dbase1.dbo.table1_activity.a_id = dbase1.dbo.table1_studentregister.sr_activity INNER JOIN
ciswarehouse.dbo.Tally AS n ON n. N BETWEEN 1 AND LEN(dbase1.dbo.table1_activity.a_weekpattern) AND
SUBSTRING(dbase1.dbo.table1_activity.a_weekpattern,n.N,1)='1' AND SUBSTRING(dbase1.dbo.table1_studentregister.sr_weekpattern, n. N, 1)
= '/'
WHERE (dbase1.dbo.table1_activity.a_end > '01-Aug-2011')
Insert into ciswarehouse.dbo.MyTransformed
(StudentRegister, Activity, Student, MarkDate, ShDate, Mark, Weekdate, Present, Absence, Authorised, Unauthorised)
SELECT dbase1.dbo.table1_studentregister.sr_id , dbase1.dbo.table1_activity.a_id, dbase1.dbo.table1_studentregister.sr_student , DATEADD(wk, n. N - 1,
dbase1.dbo.table1_activity.a_start), DATEADD(dd, datediff(dd, 0, dateadd(wk, n. N - 1, dbase1.dbo.table1_activity.a_start)), 0),'O' , DATEADD(ww, DATEDIFF(ww, 0, DATEADD(wk, n. N - 1, dbase1.dbo.table1_activity.a_start)), 0), '0','1','0','1'
FROM dbase1.dbo.table1_activity INNER JOIN
dbase1.dbo.table1_studentregister ON dbase1.dbo.table1_activity.a_id = dbase1.dbo.table1_studentregister.sr_activity INNER JOIN
ciswarehouse.dbo.Tally AS n ON n. N BETWEEN 1 AND LEN(dbase1.dbo.table1_activity.a_weekpattern) AND
SUBSTRING(dbase1.dbo.table1_activity.a_weekpattern,n.N,1)='1' AND SUBSTRING(dbase1.dbo.table1_studentregister.sr_weekpattern, n. N, 1)
= 'O'
WHERE (dbase1.dbo.table1_activity.a_end > '01-Aug-2011')
Insert into ciswarehouse.dbo.MyTransformed
(StudentRegister, Activity, Student, MarkDate, ShDate, Mark, Weekdate, Present, Absence, Authorised, Unauthorised)
SELECT dbase1.dbo.table1_studentregister.sr_id , dbase1.dbo.table1_activity.a_id, dbase1.dbo.table1_studentregister.sr_student , DATEADD(wk, n. N - 1,
dbase1.dbo.table1_activity.a_start), DATEADD(dd, datediff(dd, 0, dateadd(wk, n. N - 1, dbase1.dbo.table1_activity.a_start)), 0),' ' , DATEADD(ww, DATEDIFF(ww, 0, DATEADD(wk, n. N - 1, dbase1.dbo.table1_activity.a_start)), 0), '0','0','0','0'
FROM dbase1.dbo.table1_activity INNER JOIN
dbase1.dbo.table1_studentregister ON dbase1.dbo.table1_activity.a_id = dbase1.dbo.table1_studentregister.sr_activity INNER JOIN
ciswarehouse.dbo.Tally AS n ON n. N BETWEEN 1 AND LEN(dbase1.dbo.table1_activity.a_weekpattern) AND
SUBSTRING(dbase1.dbo.table1_activity.a_weekpattern,n.N,1)='1' AND SUBSTRING(dbase1.dbo.table1_studentregister.sr_weekpattern, n. N, 1)
= ' '
WHERE (dbase1.dbo.table1_activity.a_end > '01-Aug-2011')
..a bunch more selects and insert intos with slightly different WHERE clauses
alter index all on MyTransformed rebuild
end
September 15, 2011 at 10:27 am
Aiello DBA (9/15/2011)
use merge statment, it's the best option. In other hand just serialize querys in steps, soo a job don't invoque next execution until finish one.-
Lic. Andrés M. Aiello
DBA MSSQL - Oracle
http://aiellodba.blogspot.com/
@AndresAiello
Hi - do you mean break the select statements into separate stored procedures and run seperate steps in the server agent or as I've done and replaced the 'union all' with multiple select/insert into ?
September 15, 2011 at 10:39 am
I miss understand the problem. I think that next execution try to read, but it's the app isn it? well the best option is MERGE and you will not need to delete, because the clause "is inteligent" and delete elements that not exist anymore in temp table, update o add dependes of temp table. Are you under 2008?
-
Lic. Andrés M. Aiello
DBA MSSQL - Oracle
http://aiellodba.blogspot.com/
@AndresAiello
September 15, 2011 at 10:44 am
Aiello DBA (9/15/2011)
I miss understand the problem. I think that next execution try to read, but it's the app isn it? well the best option is MERGE and you will not need to delete, because the clause "is inteligent" and delete elements that not exist anymore in temp table, update o add dependes of temp table. Are you under 2008?-
Lic. Andrés M. Aiello
DBA MSSQL - Oracle
http://aiellodba.blogspot.com/
@AndresAiello
Yeah I'm using 2008 - I had to admit I've no idea how to use Merge- atleast not yet 🙂
September 15, 2011 at 10:50 am
l.danks (9/15/2011)
To answer one question - the data is entered regularly, every few minutes in fact, but in a format which is less than useless for reporting (just lousy software I guess!) - unfortunately the users will be running the reports all of the time and expect them to be as 'current' as possible - so an over night regime is not preferred if it can be avoided
This sounds like a case for Transactional Replication to me.
September 15, 2011 at 2:11 pm
Here is how I would do this:
1) Create two tables for the process - an odd/even pair (MyTransformedDataOdd, MyTransformedDataEven).
2) Create 2 synonyms - one for the current data, one for the updated data
3) Check the day, if even - update the even table, if odd update the odd table
The shell of the process would be:
Truncate Table dbo.MyTransformedDataUpdate ...
Insert Into dbo.MyTransformedDataUpdate ...
Drop Synonym MyTransformedDataUpdate;
Drop Synonym MyTransformedDataCurrent;
If datepart(hour, getdate()) % 2 -- Odd
Begin
Create Synonym dbo.MyTransformedDataUpdate For dbo.MyTransformedDataOdd;
Create Synonym dbo.MyTransformedDataCurrent For dbo.MyTransformedDataEven;
End
Else
Create Synonym dbo.MyTransformedDataUpdate For dbo.MyTransformedDataEven;
Create Synonym dbo.MyTransformedDataCurrent For dbo.MyTransformedDataOdd;
End
The drop/create should take less than a second - but, if someone is accessing the table when it runs it will be blocked until it can switch over. You may have to put that piece in a transaction to insure someone doesn't get access while the synonyms are being swapped.
You may need to use something else to determine which tables to use - but the idea is that you always update the non actively used table and then switch the synonyms when ready. Your users would be directed to always use the 'Current' table.
Using MERGE won't help - since that will lock the table (or rows) and could cause blocking for your users as the table is being updated/inserted and I am assuming you don't want your users affected by this process.
Jeffrey Williams
“We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”
― Charles R. Swindoll
How to post questions to get better answers faster
Managing Transaction Logs
September 15, 2011 at 7:31 pm
l.danks (9/15/2011)
Here's a snippet from the stored procedure as it is now (sorry about the messy format) - if there is anything particularly stupid that I'm missing please let me know..
😉 The only thing that falls into that category is the fact that you didn't take the time to format the code which caused you to miss several performance opportunities...
ALL of the code is exactly the same except for some criteria here and there. If you move some of that criteria from the ON's in the FROM clause to CASE functions in the SELECT list, you can greatly improve the speed of execution because you won't need so many queries.
I took your first SELECT, used table aliases to greatly reduce the clutter, reformatted the code for readability, and added some column names for quick understanding of what is what instead of trying to eyeball what things are between the INSERT column list and the SELECT list.
SELECT StudentRegister = sr.sr_id,
Activity = a.a_id,
Student = sr.sr_student,
MarkDate = DATEADD(wk, (n.N-1), a.a_start),
ShDate = DATEADD(dd, DATEDIFF(dd, 0, DATEADD(wk, (n.N-1), a.a_start)), 0),
Mark = '/' ,
Weekdate = DATEADD(ww, DATEDIFF(ww, 0, DATEADD(wk, (n.N-1), a.a_start)), 0),
Flags = '1000'
FROM dbase1.dbo.table1_activity AS a
INNER JOIN dbase1.dbo.table1_studentregister AS sr ON a.a_id = sr.sr_activity
INNER JOIN ciswarehouse.dbo.Tally AS n ON n.N BETWEEN 1 AND LEN(a.a_weekpattern)
AND SUBSTRING(a.a_weekpattern, n.N, 1) = '1'
AND SUBSTRING(sr.sr_weekpattern, n.N, 1) = '/'
WHERE a.a_end > '01-Aug-2011'
Then, I looked at the only differences between the various SELECT's you posted and found that only a couple of literals change from SELECT to SELECT. That led me to the following code which handles ALL 3 of the SELECT's you posted.
SELECT StudentRegister = sr.sr_id,
Activity = a.a_id,
Student = sr.sr_student,
MarkDate = DATEADD(wk, (n.N-1), a.a_start),
ShDate = DATEADD(dd, DATEDIFF(dd, 0, DATEADD(wk, (n.N-1), a.a_start)), 0),
Mark = SUBSTRING(sr.sr_weekpattern, n.N, 1),
Weekdate = DATEADD(ww, DATEDIFF(ww, 0, DATEADD(wk, (n.N-1), a.a_start)), 0),
Flags = CASE --This makes it super easy to see which bits we'll use under different conditions
WHEN SUBSTRING(sr.sr_weekpattern, n.N, 1) = '/' THEN '1000'
WHEN SUBSTRING(sr.sr_weekpattern, n.N, 1) = '0' THEN '0101'
WHEN SUBSTRING(sr.sr_weekpattern, n.N, 1) = ' ' THEN '0000'
END
FROM dbase1.dbo.table1_activity AS a
INNER JOIN dbase1.dbo.table1_studentregister AS sr ON a.a_id = sr.sr_activity
INNER JOIN ciswarehouse.dbo.Tally AS n ON n.N BETWEEN 1 AND LEN(a.a_weekpattern)
AND SUBSTRING(a.a_weekpattern, n.N, 1) = '1'
AND SUBSTRING(sr.sr_weekpattern, n.N, 1) IN ('/','0',' ')
WHERE a.a_end > '01-Aug-2011'
Now... I realize the you don't want a 4 character "Flag" column. So the next thing to do is split it apart. This will also be your first lesson on CTE's. 😉
WITH
cteGetAll AS
( --=== This is the exact same code as above!
SELECT StudentRegister = sr.sr_id,
Activity = a.a_id,
Student = sr.sr_student,
MarkDate = DATEADD(wk, (n.N-1), a.a_start),
ShDate = DATEADD(dd, DATEDIFF(dd, 0, DATEADD(wk, (n.N-1), a.a_start)), 0),
Mark = SUBSTRING(sr.sr_weekpattern, n.N, 1), --Notice... no more literal!
Weekdate = DATEADD(ww, DATEDIFF(ww, 0, DATEADD(wk, (n.N-1), a.a_start)), 0),
Flags = CASE --This makes it super easy to see which bits we'll use under different conditions
WHEN SUBSTRING(sr.sr_weekpattern, n.N, 1) = '/' THEN '1000'
WHEN SUBSTRING(sr.sr_weekpattern, n.N, 1) = '0' THEN '0101'
WHEN SUBSTRING(sr.sr_weekpattern, n.N, 1) = ' ' THEN '0000'
END
FROM dbase1.dbo.table1_activity AS a
INNER JOIN dbase1.dbo.table1_studentregister AS sr ON a.a_id = sr.sr_activity
INNER JOIN ciswarehouse.dbo.Tally AS n ON n.N BETWEEN 1 AND LEN(a.a_weekpattern)
AND SUBSTRING(a.a_weekpattern, n.N, 1) = '1'
AND SUBSTRING(sr.sr_weekpattern, n.N, 1) IN ('/','0',' ')
WHERE a.a_end > '01-Aug-2011'
) --=== Now, use what we calculated above.
INSERT INTO ciswarehouse.dbo.MyTransformed
(StudentRegister, Activity, Student, MarkDate, ShDate, Mark, Weekdate, Present, Absence, Authorised, Unauthorised)
SELECT StudentRegister, Activity, Student, MarkDate, ShDate, Mark, Weekdate,
Present = SUBSTRING(Flags,1,1), --This is where we split the flags out
Absence = SUBSTRING(Flags,2,1),
Authorised = SUBSTRING(Flags,3,1),
Unauthorised = SUBSTRING(Flags,4,1)
FROM cteGetAll
;
THAT little piece of computational heaven replaces ALL of the code you posted... all 3 of the INSERT/SELECT's.
All the CTE is, in this case, is a SUBQUERY very much like you could put in the FROM clause. Its result set is simply used as if it were a table.
Obviously, I don't have your data to test the code with and it's very possible I may have made an error. Soooo.... just comment out the line with the INSERT on it and see if it gives you what you want. If any errors crop up, it should be very easy for you to find and fix them.
It should also be easy for you to see how to incorporate all of the other INSERT/SELECTs you may have.
Once you're sure it's returning what you want, uncomment the INSERT and Bob's your Uncle. 😀
--Jeff Moden
Change is inevitable... Change for the better is not.
September 19, 2011 at 2:41 am
That looks incredibly 'clean' Jeff - I'll test this out later today
Thank you all for your advice and hard work! I gave a poorly worded problem and a poorly constructed query and ended up with a slick piece of code. 😎
September 19, 2011 at 4:33 am
Is there a reason you can't use Transactional Replication? I think it would solve your problem without you having to code anything.
September 22, 2011 at 2:35 am
Sorry about the delay in replying chaps, work got a little hectic.
Hi Brandie, the only real reason if I'm honest is my lack of knowledge/training - that or a MERGE are probably very good ways of doing this (I wouldn't really know but they sound right from the odd link you guys have posted and I've tried to follow). Alas the pragmatism of having a solution presented that I can atleast get my head around (if not fully appreciate) settles it for now. Though sod's law would say I'll be back here in two months needing to do exactly that 😉
Jeff, your piece of magic runs in around 45/50 seconds so quite an improvement I think - the query itself runs quicker but once I stick it into a scheduled stored procedure its back down to 50 seconds..but that is more than good enough.
I do have one slight problem with the two query results not matching, but I do know why..my last 'union all' looked for a NULL rather than an SPACE - I hadn't displayed that in my example..
INNER JOIN ciswarehouse.dbo.Tally AS n
ON n. N BETWEEN 1 AND LEN(ulive.dbo.capd_activity.a_weekpattern)
AND SUBSTRING(ulive.dbo.capd_activity.a_weekpattern,n.N,1)='1'
AND (ulive.dbo.capd_studentregister.sr_weekpattern is null)
But I'll work that out.
Thanks to you all for ideas and help - each has helped me take an extra baby step further than my starting position 'select * from table' 🙂
September 9, 2012 at 8:16 pm
Sorry. I lost track of this thread.
The code I wrote uses BETWEEN 1 and the length of a column. I've recenntly discovered that sometimes SQL Server loses its mind and does a full table scan of the Tally Table when you try to identify the starting value to be used in the Tally Table. If you change the ...
INNER JOIN ciswarehouse.dbo.Tally AS n ON n.N BETWEEN 1 AND LEN(a.a_weekpattern)
... to ...
INNER JOIN ciswarehouse.dbo.Tally AS n ON n.N <= LEN(a.a_weekpattern)
and use a "1" or "unit" based Tally Table, you could see quite a performance improvement in the form of a range scan clustered index seek instead of an index scan.
--Jeff Moden
Change is inevitable... Change for the better is not.
Viewing 15 posts - 1 through 15 (of 15 total)
You must be logged in to reply to this topic. Login to reply