August 29, 2017 at 12:52 pm
Hi everyone,
I'm trying to come up with something that is better performance-wise with a stored procedure I've built. I'm not going to go into too much detail but here's a summary:
I've got a Client_merge table containing two columns: Main_client, Dupe_client (both int type) and I want to loop through every tables in my database and update every values that contains the Dupe_client_id to the Main_client_id. It's basically for merging two client files into one.
Pretty easy but what I've built is definitely not great performance-wise. I'm always looking for new stuff to learn so I would be very happy if someone would explain me a better way to do this.
SET NOCOUNT ON;
DECLARE @No_client_main as int;
DECLARE @No_client_dupe as int;
DECLARE @Code_store_main as nvarchar(3);
DECLARE @Code_store_dupe as nvarchar(3);
DECLARE @Code_store_ref as nvarchar(3);
DECLARE @No_file_main as int;
DECLARE @No_file_main_vc as int;
DECLARE @No_file_dupe as int;
DECLARE @No_file_dupe_vc as int;
DECLARE @No_file_ref as int;
DECLARE @No_file_vc_ref as int;
DECLARE @Derniere_visite_main as datetime;
DECLARE @Derniere_visite_dupe as datetime;
DECLARE @Memo_merge as nvarchar(200);
DECLARE @Memo_merge_dupe as nvarchar(200);
DECLARE @reversed_main_client as bit;
DECLARE @cursorclient as cursor;
SET @cursorclient = CURSOR FOR
SELECT top 1000 No_client_main, No_client_dupe
FROM client_import_merge
WHERE Transfered = 0
OPEN @cursorclient
FETCH NEXT FROM @cursorclient INTO
@No_client_main, @No_client_dupe;
WHILE @@FETCH_STATUS = 0
BEGIN
BEGIN TRY
BEGIN TRANSACTION
SELECT TOP 1 @Code_store_main = Code_store, @No_file_main = No_file, @No_file_main_vc = No_file_VC FROM associe_a where no_client = @No_client_main
SELECT TOP 1 @Code_store_dupe = Code_store, @No_file_dupe = No_file, @No_file_dupe_vc = No_file_VC FROM associe_a where no_client = @No_client_dupe
SET @reversed_main_client = 0;
SET @No_file_ref = @No_file_dupe;
SET @No_file_vc_ref = @No_file_dupe_vc;
SET @Code_store_ref = @Code_store_dupe;
IF @Code_store_main = @Code_store_dupe AND @No_file_main > @No_file_dupe
BEGIN
SET @reversed_main_client = 1;
SET @No_file_ref = @No_file_main;
SET @No_file_vc_ref = @No_file_main_vc;
SET @Code_store_ref = @Code_store_main;
UPDATE Associe_a
SET
No_file = @No_file_dupe,
No_file_vc = @No_file_dupe_vc
WHERE No_client = @No_client_main
AND Code_store = @Code_store_main
UPDATE Associe_a
SET
No_file = @No_file_main,
No_file_vc = @No_file_main_vc
WHERE No_client = @No_client_dupe
AND Code_store = @Code_store_dupe
END
-- Do a bunch of updates like the one below..
UPDATE Facture
SET
No_client = @No_client_main
WHERE No_client = @No_client_dupe
SELECT @Derniere_visite_main = Derniere_visite FROM Suivi WHERE No_client = @No_client_main
SELECT @Derniere_visite_dupe = Derniere_visite FROM Suivi WHERE No_client = @No_client_dupe
IF DATEDIFF(day, ISNULL(@Derniere_visite_main, '1900-01-01 00:00:00.000'), ISNULL(@Derniere_visite_dupe, '1900-01-01 00:00:00.000')) >= 0
BEGIN
UPDATE Suivi
SET
Derniere_visite = @Derniere_visite_dupe
WHERE No_client = @No_client_main
END
SET @Memo_merge = 'Fusion avec file ' + RTRIM(@Code_store_ref) + '-' + CONVERT(nvarchar(12), @No_file_ref) + ', VC(' + CONVERT(nvarchar(12), ISNULL(@No_file_vc_ref, 0)) + ') le ' + CONVERT(char(10), getdate(), 126) + '.';
SET @Memo_merge_dupe = 'Fusion avec file ' + RTRIM(@Code_store_main) + '-' + CONVERT(nvarchar(12), @No_file_main) + ', VC(' + CONVERT(nvarchar(12), ISNULL(@No_file_main_vc, 0)) + ') le ' + CONVERT(char(10), getdate(), 126) + '.';
INSERT INTO Memo(No_client, Date_memo, Memo)
VALUES (@No_client_main, getdate(), @Memo_merge)
INSERT INTO Memo(No_client, Date_memo, Memo)
VALUES (@No_client_dupe, getdate(), @Memo_merge_dupe)
UPDATE client
SET
actif = 0
WHERE No_client = @No_client_dupe
-- Log action
UPDATE client_import_merge
SET
transfered = 1,
transfered_at = GETDATE(),
reversed = @reversed_main_client
WHERE no_client_main = @No_client_main
AND no_client_dupe = @No_client_dupe
COMMIT TRANSACTION
END TRY
BEGIN CATCH
IF @@TRANCOUNT > 0
ROLLBACK TRANSACTION
PRINT ERROR_SEVERITY()
PRINT ERROR_MESSAGE()
PRINT ERROR_STATE()
END CATCH
FETCH NEXT FROM @cursorclient INTO
@No_client_main, @No_client_dupe;
END
CLOSE @cursorclient;
DEALLOCATE @cursorclient;
Maybe including everything was not necessary but did it just in case. Right now I can merge about 7 clients per second, but I have over 50k clients to update so, if there is a better way, please let me know.
Thanks
August 29, 2017 at 1:38 pm
Start by declaring the cursor as fast_forward!
😎
For proper help, you'll need to provide schema, realistic test data and desired results
August 29, 2017 at 1:54 pm
Eirikur Eiriksson - Tuesday, August 29, 2017 1:38 PMStart by declaring the cursor as fast_forward!
😎For proper help, you'll need to provide schema, realistic test data and desired results
I didn't even know that this existed so I'm gonna look at this, thanks.
August 29, 2017 at 11:27 pm
You may want to use this combination <fast forward, local>
Reference I read few years earlier: https://sqlperformance.com/2012/09/t-sql-queries/cursor-options
August 30, 2017 at 7:03 am
SMNayak - Tuesday, August 29, 2017 11:27 PMYou may want to use this combination <fast forward, local>Reference I read few years earlier: https://sqlperformance.com/2012/09/t-sql-queries/cursor-options
Great article, thanks. I ran my stored proc using LOCAL FAST_FORWARD on my test DB and just like in the article, it was 5x faster.
August 30, 2017 at 7:43 am
sibs132 - Wednesday, August 30, 2017 7:03 AMSMNayak - Tuesday, August 29, 2017 11:27 PMYou may want to use this combination <fast forward, local>Reference I read few years earlier: https://sqlperformance.com/2012/09/t-sql-queries/cursor-options
Great article, thanks. I ran my stored proc using LOCAL FAST_FORWARD on my test DB and just like in the article, it was 5x faster.
Don't forget NOT to use a procedural approach. SQL is best for set based operations. But in some situations, you can't avoid writing a loop.
August 30, 2017 at 11:12 am
Looking at the code, at least at first glance, it looks like this doesn't need a cursor at all.
You should be able to these updates in sets....
IF OBJECT_ID('tempdb..#MergeData', 'U') IS NOT NULL
DROP TABLE #MergeData;
SELECT DISTINCT
Code_store_main = aam.Code_store,
No_file_main = aam.No_file,
No_file_main_vc = aam.No_file_VC,
Code_store_dupe = aad.Code_store,
No_file_dupe = aad.No_file,
No_file_dupe_vc = aad.No_file_VC
INTO #MergeData
FROM
dbo.client_import_merge cim
JOIN associe_a aam
ON cim.No_client_main = aam.no_client
JOIN associe_a aad
ON cim.No_client_dupe = aad.no_client;
UPDATE aa SET
aa.No_file = md.No_file_dupe,
aa.No_file_vc = md.No_file_dupe_vc
FROM
Associe_a aa
JOIN #MergeData md
ON aa.No_client = md.No_client_main
AND aa.Code_store = md.Code_store_main
WHERE
aa.Code_store_main = aa.Code_store_dupe
AND aa.No_file_main > aa.No_file_dupe;
-- ... an so on...
August 30, 2017 at 5:42 pm
[/quote]
I have to agree. I didn't see anything in there to merit a cursor.
sibs132, I'd suggest creating a copy of the database on a test server and playing with it there. You'll want to make sure you cover everything and a test database is the place to play with that. Based on my experience, I think you'll find that the work is worth the time you'll save if you do this type of thing semi-frequently. The performance comparison of RBAR (row by agonizing row) versus set-based processing in SQL Server isn't much of a comparison.
Please post back if you have questions.
Viewing 8 posts - 1 through 7 (of 7 total)
You must be logged in to reply to this topic. Login to reply