August 29, 2017 at 12:52 pm
Hi everyone,
I'm trying to come up with something that is better performance-wise with a stored procedure I've built. I'm not going to go into too much detail but here's a summary:
I've got a Client_merge table containing two columns: Main_client, Dupe_client (both int type) and I want to loop through every tables in my database and update every values that contains the Dupe_client_id to the Main_client_id. It's basically for merging two client files into one.
Pretty easy but what I've built is definitely not great performance-wise. I'm always looking for new stuff to learn so I would be very happy if someone would explain me a better way to do this.
SET NOCOUNT ON;
DECLARE @No_client_main as int;
DECLARE @No_client_dupe as int;
DECLARE @Code_store_main as nvarchar(3);
DECLARE @Code_store_dupe as nvarchar(3);
DECLARE @Code_store_ref as nvarchar(3);
DECLARE @No_file_main as int;
DECLARE @No_file_main_vc as int;
DECLARE @No_file_dupe as int;
DECLARE @No_file_dupe_vc as int;
DECLARE @No_file_ref as int;
DECLARE @No_file_vc_ref as int;
DECLARE @Derniere_visite_main as datetime;
DECLARE @Derniere_visite_dupe as datetime;
DECLARE @Memo_merge as nvarchar(200);
DECLARE @Memo_merge_dupe as nvarchar(200);
DECLARE @reversed_main_client as bit;
DECLARE @cursorclient as cursor;
SET @cursorclient = CURSOR FOR
SELECT top 1000 No_client_main, No_client_dupe
FROM client_import_merge
WHERE Transfered = 0
OPEN @cursorclient
FETCH NEXT FROM @cursorclient INTO
@No_client_main, @No_client_dupe;
WHILE @@FETCH_STATUS = 0
BEGIN
BEGIN TRY
BEGIN TRANSACTION
SELECT TOP 1 @Code_store_main = Code_store, @No_file_main = No_file, @No_file_main_vc = No_file_VC FROM associe_a where no_client = @No_client_main
SELECT TOP 1 @Code_store_dupe = Code_store, @No_file_dupe = No_file, @No_file_dupe_vc = No_file_VC FROM associe_a where no_client = @No_client_dupe
SET @reversed_main_client = 0;
SET @No_file_ref = @No_file_dupe;
SET @No_file_vc_ref = @No_file_dupe_vc;
SET @Code_store_ref = @Code_store_dupe;
IF @Code_store_main = @Code_store_dupe AND @No_file_main > @No_file_dupe
BEGIN
SET @reversed_main_client = 1;
SET @No_file_ref = @No_file_main;
SET @No_file_vc_ref = @No_file_main_vc;
SET @Code_store_ref = @Code_store_main;
UPDATE Associe_a
SET
No_file = @No_file_dupe,
No_file_vc = @No_file_dupe_vc
WHERE No_client = @No_client_main
AND Code_store = @Code_store_main
UPDATE Associe_a
SET
No_file = @No_file_main,
No_file_vc = @No_file_main_vc
WHERE No_client = @No_client_dupe
AND Code_store = @Code_store_dupe
END
-- Do a bunch of updates like the one below..
UPDATE Facture
SET
No_client = @No_client_main
WHERE No_client = @No_client_dupe
SELECT @Derniere_visite_main = Derniere_visite FROM Suivi WHERE No_client = @No_client_main
SELECT @Derniere_visite_dupe = Derniere_visite FROM Suivi WHERE No_client = @No_client_dupe
IF DATEDIFF(day, ISNULL(@Derniere_visite_main, '1900-01-01 00:00:00.000'), ISNULL(@Derniere_visite_dupe, '1900-01-01 00:00:00.000')) >= 0
BEGIN
UPDATE Suivi
SET
Derniere_visite = @Derniere_visite_dupe
WHERE No_client = @No_client_main
END
SET @Memo_merge = 'Fusion avec file ' + RTRIM(@Code_store_ref) + '-' + CONVERT(nvarchar(12), @No_file_ref) + ', VC(' + CONVERT(nvarchar(12), ISNULL(@No_file_vc_ref, 0)) + ') le ' + CONVERT(char(10), getdate(), 126) + '.';
SET @Memo_merge_dupe = 'Fusion avec file ' + RTRIM(@Code_store_main) + '-' + CONVERT(nvarchar(12), @No_file_main) + ', VC(' + CONVERT(nvarchar(12), ISNULL(@No_file_main_vc, 0)) + ') le ' + CONVERT(char(10), getdate(), 126) + '.';
INSERT INTO Memo(No_client, Date_memo, Memo)
VALUES (@No_client_main, getdate(), @Memo_merge)
INSERT INTO Memo(No_client, Date_memo, Memo)
VALUES (@No_client_dupe, getdate(), @Memo_merge_dupe)
UPDATE client
SET
actif = 0
WHERE No_client = @No_client_dupe
-- Log action
UPDATE client_import_merge
SET
transfered = 1,
transfered_at = GETDATE(),
reversed = @reversed_main_client
WHERE no_client_main = @No_client_main
AND no_client_dupe = @No_client_dupe
COMMIT TRANSACTION
END TRY
BEGIN CATCH
IF @@TRANCOUNT > 0
ROLLBACK TRANSACTION
PRINT ERROR_SEVERITY()
PRINT ERROR_MESSAGE()
PRINT ERROR_STATE()
END CATCH
FETCH NEXT FROM @cursorclient INTO
@No_client_main, @No_client_dupe;
END
CLOSE @cursorclient;
DEALLOCATE @cursorclient;
Maybe including everything was not necessary but did it just in case. Right now I can merge about 7 clients per second, but I have over 50k clients to update so, if there is a better way, please let me know.
Thanks
August 29, 2017 at 1:38 pm
Start by declaring the cursor as fast_forward!
For proper help, you'll need to provide schema, realistic test data and desired results
August 29, 2017 at 1:54 pm
Eirikur Eiriksson - Tuesday, August 29, 2017 1:38 PMStart by declaring the cursor as fast_forward!For proper help, you'll need to provide schema, realistic test data and desired results
I didn't even know that this existed so I'm gonna look at this, thanks.
August 29, 2017 at 11:27 pm
You may want to use this combination <fast forward, local>
Reference I read few years earlier: https://sqlperformance.com/2012/09/t-sql-queries/cursor-options
August 30, 2017 at 7:03 am
SMNayak - Tuesday, August 29, 2017 11:27 PMYou may want to use this combination <fast forward, local>Reference I read few years earlier: https://sqlperformance.com/2012/09/t-sql-queries/cursor-options
Great article, thanks. I ran my stored proc using LOCAL FAST_FORWARD on my test DB and just like in the article, it was 5x faster.
August 30, 2017 at 7:43 am
sibs132 - Wednesday, August 30, 2017 7:03 AMSMNayak - Tuesday, August 29, 2017 11:27 PMYou may want to use this combination <fast forward, local>Reference I read few years earlier: https://sqlperformance.com/2012/09/t-sql-queries/cursor-options
Great article, thanks. I ran my stored proc using LOCAL FAST_FORWARD on my test DB and just like in the article, it was 5x faster.
Don't forget NOT to use a procedural approach. SQL is best for set based operations. But in some situations, you can't avoid writing a loop.
August 30, 2017 at 11:12 am
Looking at the code, at least at first glance, it looks like this doesn't need a cursor at all.
You should be able to these updates in sets....
IF OBJECT_ID('tempdb..#MergeData', 'U') IS NOT NULL
DROP TABLE #MergeData;
SELECT DISTINCT
Code_store_main = aam.Code_store,
No_file_main = aam.No_file,
No_file_main_vc = aam.No_file_VC,
Code_store_dupe = aad.Code_store,
No_file_dupe = aad.No_file,
No_file_dupe_vc = aad.No_file_VC
INTO #MergeData
FROM
dbo.client_import_merge cim
JOIN associe_a aam
ON cim.No_client_main = aam.no_client
JOIN associe_a aad
ON cim.No_client_dupe = aad.no_client;
UPDATE aa SET
aa.No_file = md.No_file_dupe,
aa.No_file_vc = md.No_file_dupe_vc
FROM
Associe_a aa
JOIN #MergeData md
ON aa.No_client = md.No_client_main
AND aa.Code_store = md.Code_store_main
WHERE
aa.Code_store_main = aa.Code_store_dupe
AND aa.No_file_main > aa.No_file_dupe;
-- ... an so on...
August 30, 2017 at 5:42 pm
[/quote]
I have to agree. I didn't see anything in there to merit a cursor.
sibs132, I'd suggest creating a copy of the database on a test server and playing with it there. You'll want to make sure you cover everything and a test database is the place to play with that. Based on my experience, I think you'll find that the work is worth the time you'll save if you do this type of thing semi-frequently. The performance comparison of RBAR (row by agonizing row) versus set-based processing in SQL Server isn't much of a comparison.
Please post back if you have questions.
Viewing 8 posts - 1 through 7 (of 7 total)
You must be logged in to reply to this topic. Login to reply
This website stores cookies on your computer.
These cookies are used to improve your website experience and provide more personalized services to you, both on this website and through other media.
To find out more about the cookies we use, see our Privacy Policy