August 19, 2014 at 5:58 am
I am new to SQL Server and know next to nothing about T-SQL, although I do know a bit about SQL. Our purchasing department got a database package and some customised T-SQL from an external company and it runs for over 12 hours !! I would appreciate any help or guidance on how to go about troubleshooting the code....
/* appclean.sql *
* $Id: appclean.sql,v 1.6 2014/08/01 12:21:43 bok Exp $ *
* Copyright (c) 2012-2014 Minerva Danmark A/S *
* *
* Programmed by: Bo Kaltoft *
* *
* Script for cleaning the application data for dataload. The *
* result of the script will be partly stored in a result *
* table HUM_DATAIMPORT_RES. The QTY column in the _RES table *
* can be >0 if the QTY must be updated in Innovator or 0 if *
* the application should be deleted. *
* Records from the input table will be marked as processed if *
* they are added to the _RES table or if it is found that they*
* already exist in Innovator otherwise they will need to be *
* loaded. *
* *
* Execution environment must provide: *
* @batchid = <batchid> *
* *
* */
SET NOCOUNT ON
/* Replace null values in indexed columns */
UPDATE innovator.HUM_DATAIMPORT_200 SET [BLOCK] = '' WHERE [BLOCK] IS NULL;
UPDATE innovator.HUM_DATAIMPORT_200 SET [OP CODE] = '' WHERE [OP CODE] IS NULL;
UPDATE innovator.HUM_DATAIMPORT_200 SET [APPL SFX] = '' WHERE [APPL SFX] IS NULL;
declare @number bigint;
DECLARE @family varchar(50);
DECLARE @model varchar(50);
DECLARE @type varchar(50);
DECLARE @option varchar(50);
DECLARE @block varchar(50);
DECLARE @section varchar(50);
DECLARE @item varchar(50);
DECLARE @appsfx varchar(50);
DECLARE @dcno varchar(50);
DECLARE @begdt varchar(50);
DECLARE @enddt varchar(50);
DECLARE @dcpn varchar(50);
DECLARE @qty varchar(50);
DECLARE @mtpid char(32);
DECLARE @strdt varchar(8) = CONVERT(varchar(8),SYSUTCDATETIME(),112);
IF OBJECT_ID('#tempTable') IS NOT NULL DROP Table #tempTable
SELECT
IDENTITY(bigint,1,1) as number,
mtp.[HUM_FAMILY_CODE],
mtp.[HUM_MODEL_CODE],
mtp.[HUM_TYPE_CODE],
mtp.[HUM_OPTION_CODE],
mtp.[HUM_BLOCK],
mtp.[HUM_SECTION],
mtp.[HUM_ITEM_CODE],
mtp.[HUM_APPL_SFX],
mtp.[HUM_DC_NO],
mtp.[HUM_BEG_DT],
mtp.[HUM_END_DT],
mtp.[HUM_ITEM_NUMBER],
mtp.[HUM_QUANTITY],
mtp.[ID]
into #tempTable
FROM innovator.HUM_MODEL_TYPE_PART AS mtp
INNER JOIN innovator.HUM_DATAIMPORT_MM AS mm
ON (mm.[HUM_FAMILY_CODE] COLLATE DATABASE_DEFAULT = mtp.[HUM_FAMILY_CODE] COLLATE DATABASE_DEFAULT
AND mm.[HUM_MODEL_CODE] COLLATE DATABASE_DEFAULT = mtp.[HUM_MODEL_CODE] COLLATE DATABASE_DEFAULT
AND mm.[HUM_TYPE_CODE] COLLATE DATABASE_DEFAULT = mtp.[HUM_TYPE_CODE] COLLATE DATABASE_DEFAULT)
WHERE mm.[BATCHID] = @batchid;
DECLARE @count int = 0;
set @number = (select MIN(number) from #tempTable);
WHILE @number IS NOT NULL
begin
SELECT
@number = number,
@FAMILY = HUM_FAMILY_CODE,
@MODEL = HUM_MODEL_CODE,
@TYPE = HUM_TYPE_CODE,
@option = HUM_OPTION_CODE,
@block = HUM_BLOCK,
@section= HUM_SECTION,
@item = HUM_ITEM_CODE,
@appsfx = HUM_APPL_SFX,
@dcno = HUM_DC_NO,
@begdt = HUM_BEG_DT,
@enddt = HUM_END_DT,
@dcpn = HUM_ITEM_NUMBER,
@qty = HUM_QUANTITY,
@mtpid = ID
FROM #tempTable WHERE number = @number;
print(@family+@model+@type+@option);
/* Set empty strings to null (i200 has '' values) */
/* Try to find deleted Application */
SET @enddt = (SELECT TOP 1 i200.[END DT] FROM innovator.HUM_DATAIMPORT_200 AS i200
WHERE i200.BATCHID = @batchid
AND i200.[PROCESSED] = (0)
AND i200.[ACTIVE] = (1)
AND i200.[L1 DC PN] = @dcpn
AND i200.[FAMILY] = @family
AND i200.[MODEL] = @model
AND i200.[TYPE] = @type
AND i200.[OP CODE] = @option
AND i200.[BLOCK] = @block
AND i200.[SEC] = @section
AND i200.[ITEM] = @item
AND i200.[APPL SFX] = @appsfx
AND i200.[DC NO BEAM] = @dcno
AND i200.[BEG DT] = @begdt);
/* We didn't find a value, so this indicates a delete */
IF(@enddt IS NULL)
BEGIN
SET @count = @count + 1;
INSERT INTO innovator.HUM_DATAIMPORT_RES
([FAMILY],[MODEL],[TYPE],[OPTION],[BLOCK],[SECTION],[ITEM],[APPLSFX],[DC],[BEGDT],
[ENDDT],[L1DCPN],[QTY],[MTPID],[BATCHID],[PROCESSED],[ACTIVE])
VALUES
(@family,@model,@type,@option,@block,@section,@item,@appsfx,@dcno,@begdt,
@strdt,@dcpn,'0',@mtpid,@batchid,(0),(1));
END
/* Or changed Application */
ELSE IF(@qty != (SELECT i200.[QTY] FROM innovator.HUM_DATAIMPORT_200 AS i200
WHERE i200.BATCHID = @batchid
AND i200.[PROCESSED] = (0)
AND i200.[ACTIVE] = (1)
AND i200.[L1 DC PN] = @dcpn
AND i200.[FAMILY] = @family
AND i200.[MODEL] = @model
AND i200.[TYPE] = @type
AND i200.[OP CODE] = @option
AND i200.[BLOCK] = @block
AND i200.[SEC] = @section
AND i200.[ITEM] = @item
AND i200.[APPL SFX] = @appsfx
AND i200.[DC NO BEAM] = @dcno
AND i200.[BEG DT] = @begdt))
BEGIN
SET @count = @count + 1;
INSERT INTO innovator.HUM_DATAIMPORT_RES
([FAMILY],[MODEL],[TYPE],[OPTION],[BLOCK],[SECTION],[ITEM],[APPLSFX],[DC],[BEGDT],
[ENDDT],[L1DCPN],[QTY],[MTPID],[BATCHID],[PROCESSED],[ACTIVE])
VALUES
(@family,@model,@type,@option,@block,@section,@item,@appsfx,@dcno,@begdt,
@enddt,@dcpn,@qty,@mtpid,@batchid,(0),(1));
UPDATE innovator.HUM_DATAIMPORT_200 SET PROCESSED=(1)
WHERE BATCHID = @batchid
AND [PROCESSED] = (0)
AND [ACTIVE] = (1)
AND [L1 DC PN] = @dcpn
AND [FAMILY] = @family
AND [MODEL] = @model
AND [TYPE] = @type
AND [OP CODE] = @option
AND [BLOCK] = @block
AND [SEC] = @section
AND [ITEM] = @item
AND [APPL SFX] = @appsfx
AND [DC NO BEAM] = @dcno
AND [BEG DT] = @begdt;
END
set @number = (select MIN(number) from #tempTable where number>@number);
END
SET NOCOUNT OFF
This is RBAR processing at its worst !!!
August 19, 2014 at 6:50 am
Assuming the code is left as is, an index here or there could take the overall process down from 10 hours to 1 hour. The short answer is that you need to analyze the execution plan. This batch of T-SQL is actually a couple of blocks of parameterized insert / update statements that get executed N (???) times, so to set this up, you may need to copy a block into a query window and focus on a single execution by supplying specific values.
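For example, to time the driving query on its own, you could run something like this in a query window (the batch id below is a placeholder and its type is assumed; I've also left out the COLLATE clauses, so add them back if the collations really do differ):

DECLARE @batchid int = 12345; -- placeholder value and assumed type; use a real batch id
SET STATISTICS IO ON;
SET STATISTICS TIME ON;
SELECT mtp.[HUM_FAMILY_CODE], mtp.[HUM_MODEL_CODE], mtp.[HUM_TYPE_CODE]
FROM innovator.HUM_MODEL_TYPE_PART AS mtp
INNER JOIN innovator.HUM_DATAIMPORT_MM AS mm
    ON mm.[HUM_FAMILY_CODE] = mtp.[HUM_FAMILY_CODE]
    AND mm.[HUM_MODEL_CODE] = mtp.[HUM_MODEL_CODE]
    AND mm.[HUM_TYPE_CODE] = mtp.[HUM_TYPE_CODE]
WHERE mm.[BATCHID] = @batchid;
SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;

Turn on the actual execution plan (Ctrl+M in SSMS) before running it and look for scans on the big tables.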
The following link is to a free e-book published by Red Gate.
http://www.red-gate.com/community/books/sql-server-execution-plans-ed-2
In terms of how the overall process is written, if I had to write this thing from scratch, my first stab would be to do it using a single MERGE statement. But since this process was supplied to you by a 3rd party ISV, it may not be a good idea to tear it apart, unless you have specific technical requirements about how it should function.
"Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho
August 19, 2014 at 7:36 am
Hi Eric,
Thanks for the response....I am sure that they would be more than happy with an hour.. 😛
I have downloaded the PDF and will be looking at it soon.
Again, thanks for the help.
August 19, 2014 at 8:29 am
garry.lovesey (8/19/2014)
This is RBAR processing at its worst !!!
You're absolutely right. This was made by someone who heard that cursors were bad and replaced them with while loops.
If you can change the code, try to go with the MERGE statement as Eric pointed out. Even separate DELETE and INSERT statements that work in a set-based manner would be an improvement, and even a well-written cursor would reduce the time.
If you can't change the code, then the indexes are probably your only solution (along with updated statistics).
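As an untested illustration (the index name is hypothetical and the column choice is a guess; check the execution plan to see what the loop actually seeks on), something like this on the lookup table might help:

-- Hypothetical index: short key to stay under the 900-byte key limit,
-- with the remaining predicate and output columns INCLUDEd
CREATE NONCLUSTERED INDEX IX_HUM_DATAIMPORT_200_Lookup
ON innovator.HUM_DATAIMPORT_200 ([BATCHID], [FAMILY], [MODEL], [TYPE], [L1 DC PN])
INCLUDE ([PROCESSED], [ACTIVE], [OP CODE], [BLOCK], [SEC], [ITEM],
         [APPL SFX], [DC NO BEAM], [BEG DT], [END DT], [QTY]);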
Is it worth trying alternatives to the code?
August 19, 2014 at 8:36 am
We can change the code as long as we run it past the external company for testing. I am sure that they would be more than happy to have some working code that they can sell to their other customers 😀
I shall look at the MERGE statement on BOL to start off.....
August 19, 2014 at 8:45 am
I'm not sure why they have COLLATE clauses in the join, but they're unlikely to help the code.
Keep an eye on all those variables and parameters too for bad parameter sniffing (or none in the case of the variables). That could seriously impact the plan generated.
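If the loop has to stay for now, one experiment worth trying (just a sketch, not a guaranteed win) is adding OPTION (RECOMPILE) to the lookup so the optimizer compiles with the current variable values. Note the switch from SET to SELECT assignment: a SELECT assignment keeps the variable's old value when no row matches, hence the reset first:

SET @enddt = NULL; -- SELECT assignment keeps the old value when no row is found
SELECT TOP 1 @enddt = i200.[END DT]
FROM innovator.HUM_DATAIMPORT_200 AS i200
WHERE i200.BATCHID = @batchid
    AND i200.[PROCESSED] = 0
    AND i200.[ACTIVE] = 1
    AND i200.[L1 DC PN] = @dcpn
    AND i200.[FAMILY] = @family
    AND i200.[MODEL] = @model
    AND i200.[TYPE] = @type
    AND i200.[OP CODE] = @option
    AND i200.[BLOCK] = @block
    AND i200.[SEC] = @section
    AND i200.[ITEM] = @item
    AND i200.[APPL SFX] = @appsfx
    AND i200.[DC NO BEAM] = @dcno
    AND i200.[BEG DT] = @begdt
OPTION (RECOMPILE);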
Other than that, I'm with everyone else, get rid of the RBAR.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
August 19, 2014 at 9:17 am
Hi Grant, thank you for your input. I have sent off an e-mail asking the supplier if they really need the COLLATE statements in there.
Do I need the MERGE statement here, or can I get away with a straight INSERT INTO....SELECT FROM ??
Maybe two of them...one for when [END DT] is NULL and one for when @qty != (SELECT i200.[QTY] FROM innovator.HUM_DATAIMPORT_200 AS i200).
August 19, 2014 at 10:33 am
The check to see if the temp table exists doesn't work. The OBJECT_ID() function with a simple table name only looks in the current database. To check for a temp table you need a three-part name ("IF OBJECT_ID('tempdb..#tempTable') IS NOT NULL").
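In other words:

IF OBJECT_ID('tempdb..#tempTable') IS NOT NULL
    DROP TABLE #tempTable;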
The COLLATE functions will prevent any indexes from being used for this join, so they really should go. Can you just look at the table definitions to see if there really is a collation difference in the three code columns in the first query? I have seen some crazy things in third-party apps, but not as crazy as using multiple collations on related columns in the same database.
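You can check this yourself without waiting for the vendor, with something like:

SELECT o.name AS table_name, c.name AS column_name, c.collation_name
FROM sys.columns AS c
JOIN sys.objects AS o ON o.object_id = c.object_id
WHERE o.object_id IN (OBJECT_ID('innovator.HUM_MODEL_TYPE_PART'),
                      OBJECT_ID('innovator.HUM_DATAIMPORT_MM'))
  AND c.name IN ('HUM_FAMILY_CODE', 'HUM_MODEL_CODE', 'HUM_TYPE_CODE');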
You can use one MERGE statement for the whole procedure (aside from defining the @strdt variable). Your target table is HUM_DATAIMPORT_200, and the OUTPUT clause will log records to HUM_DATAIMPORT_RES. The rows where END_DT IS NULL are not actually updated, but they have to be included so they will show up in the OUTPUT. I used "UPDATE SET PROCESSED=PROCESSED" for these rows so they are not modified.
WITH mtp AS (
SELECT mtp.[HUM_FAMILY_CODE],
mtp.[HUM_MODEL_CODE],
mtp.[HUM_TYPE_CODE],
mtp.[HUM_OPTION_CODE],
mtp.[HUM_BLOCK],
mtp.[HUM_SECTION],
mtp.[HUM_ITEM_CODE],
mtp.[HUM_APPL_SFX],
mtp.[HUM_DC_NO],
mtp.[HUM_BEG_DT],
mtp.[HUM_END_DT],
mtp.[HUM_ITEM_NUMBER],
mtp.[HUM_QUANTITY],
mtp.[ID]
FROM innovator.HUM_MODEL_TYPE_PART mtp
INNER JOIN innovator.HUM_DATAIMPORT_MM AS mm
ON (mm.[HUM_FAMILY_CODE] = mtp.[HUM_FAMILY_CODE]
AND mm.[HUM_MODEL_CODE] = mtp.[HUM_MODEL_CODE]
AND mm.[HUM_TYPE_CODE] = mtp.[HUM_TYPE_CODE])
WHERE mm.BATCHID = @batchid )
MERGE INTO innovator.HUM_DATAIMPORT_200 AS i200
USING mtp
ON i200.BATCHID= @batchid
AND i200.[PROCESSED] = (0)
AND i200.[ACTIVE] = (1)
AND i200.[L1 DC PN] = mtp.[HUM_ITEM_NUMBER]
AND i200.[FAMILY] = mtp.[HUM_FAMILY_CODE]
AND i200.[MODEL] = mtp.[HUM_MODEL_CODE]
AND i200.[TYPE] = mtp.[HUM_TYPE_CODE]
AND i200.[OP CODE] = mtp.[HUM_OPTION_CODE]
AND i200.[BLOCK] = mtp.[HUM_BLOCK]
AND i200.[SEC] = mtp.[HUM_SECTION]
AND i200.[ITEM] = mtp.[HUM_ITEM_CODE]
AND i200.[APPL SFX] = mtp.[HUM_APPL_SFX]
AND i200.[DC NO BEAM] = mtp.[HUM_DC_NO]
AND i200.[BEG DT] = mtp.[HUM_BEG_DT]
AND (i200.END_DT IS NULL OR i200.QTY != mtp.HUM_QUANTITY)
-- The UPDATE does nothing for deleted applications
WHEN MATCHED AND i200.END_DT IS NULL THEN UPDATE SET PROCESSED = i200.PROCESSED
-- For changed Applications, set Processed = 1
WHEN MATCHED THEN UPDATE SET PROCESSED = 1
-- All rows are inserted into HUM_DATAIMPORT_RES
OUTPUT mtp.[HUM_FAMILY_CODE],
mtp.[HUM_MODEL_CODE],
mtp.[HUM_TYPE_CODE],
mtp.[HUM_OPTION_CODE],
mtp.[HUM_BLOCK],
mtp.[HUM_SECTION],
mtp.[HUM_ITEM_CODE],
mtp.[HUM_APPL_SFX],
mtp.[HUM_DC_NO],
mtp.[HUM_BEG_DT],
ISNULL(i200.END_DT, @strdt),
mtp.[HUM_ITEM_NUMBER],
CASE WHEN i200.END_DT IS NULL THEN '0' ELSE i200.QTY END,
mtp.[ID],
@batchid,
0,
1
INTO innovator.HUM_DATAIMPORT_RES
([FAMILY],[MODEL],[TYPE],[OPTION],[BLOCK],[SECTION],[ITEM],[APPLSFX],[DC],[BEGDT],
[ENDDT],[L1DCPN],[QTY],[MTPID],[BATCHID],[PROCESSED],[ACTIVE]);
August 19, 2014 at 11:35 am
garry.lovesey (8/19/2014)
Hi Grant, thank you for your input. I have sent off an e-mail asking the supplier if they really need the COLLATE statements in there. Do I need the MERGE statement here, or can I get away with a straight INSERT INTO....SELECT FROM ??
Maybe two of them...one for when [END DT] is NULL and one for when @qty != (SELECT i200.[QTY] FROM innovator.HUM_DATAIMPORT_200 AS i200).
A MERGE is one possibility. It's certainly something I'd include in any experiments I was doing on this to improve the performance.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
August 20, 2014 at 2:45 am
Hello SSCrazy, I'd like to say thanks for the code.....I have been looking at it this morning and it is coming up with some errors in the OUTPUT section. These are:
ISNULL(i200.[END DT], @strdt),
-- The multi-part identifier "i200.END DT" could not be bound
CASE WHEN i200.[END DT] IS NULL THEN '0' ELSE i200.QTY END,
-- The multi-part identifier "i200.END DT" could not be bound
Is the problem with the space in the column name, even though it has the square brackets around it, or are you not allowed to refer to this column in the OUTPUT section ??
August 20, 2014 at 8:23 am
Those should be END_DT, which I believe is what I originally posted.
August 20, 2014 at 10:25 am
Scott Coleman (8/20/2014)
Those should be END_DT, which I believe is the what I originally posted.
Unfortunately there is no column in i200 with that name, which is why I changed it to be i200.[END DT].
It looks like it won't go into the output unless it is in the SELECT statement. Again, thanks for the original code....
Currently I am looking at a LEFT OUTER JOIN from the temp table to the i200 table to get the [END DT] and [QTY] values. From there it is (hopefully) a straight INSERT.....SELECT FROM into the HUM_DATAIMPORT_RES table.
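Roughly what I have in mind (an untested sketch, reusing the join columns and the @strdt definition from the original script; I would still need a separate UPDATE to set PROCESSED on the changed rows):

DECLARE @strdt varchar(8) = CONVERT(varchar(8), SYSUTCDATETIME(), 112);
INSERT INTO innovator.HUM_DATAIMPORT_RES
    ([FAMILY],[MODEL],[TYPE],[OPTION],[BLOCK],[SECTION],[ITEM],[APPLSFX],[DC],[BEGDT],
     [ENDDT],[L1DCPN],[QTY],[MTPID],[BATCHID],[PROCESSED],[ACTIVE])
SELECT t.HUM_FAMILY_CODE, t.HUM_MODEL_CODE, t.HUM_TYPE_CODE, t.HUM_OPTION_CODE,
    t.HUM_BLOCK, t.HUM_SECTION, t.HUM_ITEM_CODE, t.HUM_APPL_SFX, t.HUM_DC_NO,
    t.HUM_BEG_DT,
    ISNULL(i200.[END DT], @strdt),  -- no i200 match = deleted, stamp today's date
    t.HUM_ITEM_NUMBER,
    CASE WHEN i200.[END DT] IS NULL THEN '0' ELSE t.HUM_QUANTITY END,
    t.ID, @batchid, 0, 1
FROM #tempTable AS t
LEFT JOIN innovator.HUM_DATAIMPORT_200 AS i200
    ON i200.BATCHID = @batchid
    AND i200.[PROCESSED] = 0
    AND i200.[ACTIVE] = 1
    AND i200.[L1 DC PN] = t.HUM_ITEM_NUMBER
    AND i200.[FAMILY] = t.HUM_FAMILY_CODE
    AND i200.[MODEL] = t.HUM_MODEL_CODE
    AND i200.[TYPE] = t.HUM_TYPE_CODE
    AND i200.[OP CODE] = t.HUM_OPTION_CODE
    AND i200.[BLOCK] = t.HUM_BLOCK
    AND i200.[SEC] = t.HUM_SECTION
    AND i200.[ITEM] = t.HUM_ITEM_CODE
    AND i200.[APPL SFX] = t.HUM_APPL_SFX
    AND i200.[DC NO BEAM] = t.HUM_DC_NO
    AND i200.[BEG DT] = t.HUM_BEG_DT
WHERE i200.[END DT] IS NULL OR i200.[QTY] != t.HUM_QUANTITY;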
August 20, 2014 at 11:14 am
Since I was playing with your code without any schema, everything was underlined with red squiggles so it was hard for me to see which fields had underscores and which had embedded spaces.
All columns in both the source (the common table expression that includes the SELECT statement) and the target (HUM_DATAIMPORT_200) are available in the OUTPUT clause. The issue is that the target table cannot be referred to by its alias (i200); you have to use INSERTED or DELETED to clarify which version of the column you mean (e.g. INSERTED.[END DT]).
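So the two lines that failed would become something like:

ISNULL(INSERTED.[END DT], @strdt),
CASE WHEN INSERTED.[END DT] IS NULL THEN '0' ELSE INSERTED.QTY END,

(Since the updates never touch [END DT] or QTY, INSERTED and DELETED hold the same values for those columns here.)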
I used a common table expression to make the MERGE structure a little cleaner. If you moved the source query into the USING clause, the OUTPUT section could also include any column in any of the source tables (with the original table aliases).
August 21, 2014 at 12:05 am
Hi Scott,
Thanks for the explanation.....
I have been working through the MERGE command on a few websites to understand how it works....the latest being this one.
http://www.made2mentor.com/2013/05/writing-t-sql-merge-statements-the-right-way/
Thanks again.....
August 26, 2014 at 5:00 am
A final update.........
The original code was running for 10 hours and then failing with timeout issues. A MERGE statement was created and tested with a small set of data to ensure that the output was the same as before.
This code has now been tested with the same load as before and instead of running for the 10 hours (and failing) it ran for five minutes and completed !!!
I would like to thank everyone who helped on this thread. Especially SSCrazy (Scott Coleman).....:-D
I would also like to thank Jeff Moden for this article that was used in the testing of the tables: http://weblogs.sqlteam.com/jeffs/archive/2004/11/10/2737.aspx