September 13, 2012 at 1:54 pm
I have a procedure that, due to the nature of the data, has to be iterated through record by record. There are 15000 records, with the potential to go over 1 million.
On a previous run, the processing took about 15 min for 110k records. This time, 15000 is taking about 3 hours. I have keys and indexes so I'm not thinking that's the problem. I have lots of PRINT commands in there to show me the progress of the proc. I do this mainly for development purposes and turn them off when it runs in production.
After 2.5 hours of running, I have 720k lines of printed text in the Messages tab. NOthing significant by itself: 'Set Col_1 = 'BLAH' to show me that the variable was populated correctly...
'DocNum = 12345' to show me which doc it's working on now, etc.
It's a lot of text after 15000 iterations. Could this be slowing me down significantly?
Thanks
Crusty
September 13, 2012 at 2:15 pm
Can we ask why it has to be processed record by record?
September 13, 2012 at 2:28 pm
It has to be processed record by record due to nested delimiters. The data is being pushed through an SSIS package; however, we discovered that the data provider has taken multpile rows/columns of data and combined them to 1. The columns can contain pretty much any kind of data and not all the patter isn't the same. The only thing that stays the same is the number of delimiters for the columns and rows.
We start at the First value in the field and count the delimiters for columns and rows. We add the right number of columns to a landing table. Then we start breaking apart the data at the row delimiters and add additional rows to the table. For example
1;2 would then become
1
2
In our case we have 13 additional columns worth of data shoved into 1 column. An example of a record would be:
::2:::::2:::::LTS 15; ::::::::::::16; ::::::::::::17; ::::::::::::18 BK 2 MILES SUBDIVISION
The first round converts that mess into:
::2:::::2:::::LTS 15
::::::::::::16
::::::::::::17
::::::::::::18 BK 2 MILES SUBDIVISION
The second round breaks those up into individual columns of data. As you can see, the possibilities are endless as to how the data is arranged. The provider also doesn't always let us know when things change.
Sounds fun, doesn't it.
September 13, 2012 at 2:31 pm
if you are running the process in SSMS, then there'd be an impact for sure; just check Task Manager: SSMS will be bloated with memory, since it's storing all that text that was printed somewhere;
if you are on the server , that's eventaully going to cause memory pressure, and the OS will probably ask SQL for some more memory, which can slow things down as well.
Lynn has an excellent point: with a tiny bit of effort, anything you are doing Row By Agonizing Row can be converted to a near instantaneos Set based statement, that doesn't care if it's one or a millions rows processed;
the effort's basically the same, and several orders of magnitude faster than RBAR processing.
Lowell
September 13, 2012 at 2:33 pm
I thought I WAS using set-based processing. I"m definitley not using a cursor.....
Pray tell... more feedback on what you're refering to...
BTW... RBAR.... LOL That's funny. :w00t:
September 13, 2012 at 2:38 pm
Would really need to see some sample data (just a couple of rows that emulates the actual data) and the expected results based on that sample data.
From what you posted, not really sure what is expected.
September 13, 2012 at 2:41 pm
September 13, 2012 at 2:46 pm
CptCrusty1 (9/13/2012)
I thought I WAS using set-based processing. I"m definitley not using a cursor.....Pray tell... more feedback on what you're refering to...
BTW... RBAR.... LOL That's funny. :w00t:
Well Capn', if you are printing 'DocNum = 12345' all the time, you are processing records one by one...RBAR can be a while loop, a cursor, or a few other things, like building a stack of commands and executing them in a batch...your not limmited to a cursor, the RBAR is from processing anything one at a time.
The previous post about splitting your data with the DelimitedSplit is probably exactly what you need to resolve the performance issue, and free up a few hours of your time to tackle the next problem.
Lowell
September 13, 2012 at 2:47 pm
Here's the RAW data:
Doc Number|Document Date|Rec Date|Book|Vol|Page|Doc Type|# of Pages|Related|Grantor|Grantee|Legal|Amount|Image ID
1994-00000001|12/27/1993 12:00:00 AM|01/03/1994|SL|16|33|STATE TAX LIEN|2|:OPR:1169:250|TEXAS STATE OF|KFC NATIONAL MANAGEMENT CO|::::::::::::FOR RELEASE SEE OR VOL 1169 PG 250 2/23/94 ||\1994-01-03\1994-00000001-0001.tif;\1994-01-03\1994-00000001-0002.tif
1994-00000002|12/27/1993 12:00:00 AM|01/03/1994|SL|16|35|STATE TAX LIEN|2||TEXAS STATE OF|APACHE MACHINE & MFG INC|::::::::::::FOR RELEASE SEE OR VOL 1164 PG 3 ||\1994-01-03\1994-00000002-0001.tif;\1994-01-03\1994-00000002-0002.tif
1994-00000003|12/27/1993 12:00:00 AM|01/03/1994|SL|16|37|STATE TAX LIEN|2||TEXAS STATE OF|SURBERS FIBERGLASS INC|||\1994-01-03\1994-00000003-0001.tif;\1994-01-03\1994-00000003-0002.tif
1994-00000004|12/27/1993 12:00:00 AM|01/03/1994|SL|16|39|STATE TAX LIEN|2||TEXAS STATE OF|COLOR TILE INC|::::::::::::FOR REL SEE 1167/836 & 1168/501 OR 2/14; ::::::::::::17/94||\1994-01-03\1994-00000004-0001.tif;\1994-01-03\1994-00000004-0002.tif
1994-00000005|12/27/1993 12:00:00 AM|01/03/1994|SL|16|41|STATE TAX LIEN|2||TEXAS STATE OF|PERMIAN PAINTING INC|||\1994-01-03\1994-00000005-0001.tif;\1994-01-03\1994-00000005-0002.tif
1
Here's a Snapshot of the RAW Table... I'm narrowing the focus to the column we're interested in.
DocNum Legal
1994-00007372:2:17:::::17:::::LT 2 BK 17 FAIR OAKS ADDITON 3RD FILING
1994-00007373:16:4:::::4:::::LT 16 & 17 BK 4 EMERALD FOREST ESTATES
1994-00007374:12:35:::::35:::::LT 12 BK 35 FLEETWOOD ADDITION 3RD FILING
1994-00007375:16:4:::::4:::::LT 16 BK 4 UNIVERSITY GARDENS 1ST FILING
1994-00007376:19:2:::::2:::::LT 19 BK 2 REPLAT KENWOOD ESTATES ADDITION
The legal Column in this case only has Column delimiters nested in the data, no record delimiters. These are also pretty consistent. The end result should be additional columns added for each of the Column Delimiters (':') with the data after each delimiter kachunked into the appropriate new column.
IF, for example, there was a Record Delimiter as well, the there would need to be an additional record added for the Docnum in question, and then those values broken out as well.
Whatcha think?
Crusty.
September 13, 2012 at 2:49 pm
OH... and we're using 2005... Figured I should throw it out there.. I'm in a 2008 forum... I know.. someone is going to yell at me...:Whistling:
September 13, 2012 at 2:51 pm
CptCrusty1 (9/13/2012)
Here's the RAW data:Doc Number|Document Date|Rec Date|Book|Vol|Page|Doc Type|# of Pages|Related|Grantor|Grantee|Legal|Amount|Image ID
1994-00000001|12/27/1993 12:00:00 AM|01/03/1994|SL|16|33|STATE TAX LIEN|2|:OPR:1169:250|TEXAS STATE OF|KFC NATIONAL MANAGEMENT CO|::::::::::::FOR RELEASE SEE OR VOL 1169 PG 250 2/23/94 ||\1994-01-03\1994-00000001-0001.tif;\1994-01-03\1994-00000001-0002.tif
1994-00000002|12/27/1993 12:00:00 AM|01/03/1994|SL|16|35|STATE TAX LIEN|2||TEXAS STATE OF|APACHE MACHINE & MFG INC|::::::::::::FOR RELEASE SEE OR VOL 1164 PG 3 ||\1994-01-03\1994-00000002-0001.tif;\1994-01-03\1994-00000002-0002.tif
1994-00000003|12/27/1993 12:00:00 AM|01/03/1994|SL|16|37|STATE TAX LIEN|2||TEXAS STATE OF|SURBERS FIBERGLASS INC|||\1994-01-03\1994-00000003-0001.tif;\1994-01-03\1994-00000003-0002.tif
1994-00000004|12/27/1993 12:00:00 AM|01/03/1994|SL|16|39|STATE TAX LIEN|2||TEXAS STATE OF|COLOR TILE INC|::::::::::::FOR REL SEE 1167/836 & 1168/501 OR 2/14; ::::::::::::17/94||\1994-01-03\1994-00000004-0001.tif;\1994-01-03\1994-00000004-0002.tif
1994-00000005|12/27/1993 12:00:00 AM|01/03/1994|SL|16|41|STATE TAX LIEN|2||TEXAS STATE OF|PERMIAN PAINTING INC|||\1994-01-03\1994-00000005-0001.tif;\1994-01-03\1994-00000005-0002.tif
1
Here's a Snapshot of the RAW Table... I'm narrowing the focus to the column we're interested in.
DocNum Legal
1994-00007372:2:17:::::17:::::LT 2 BK 17 FAIR OAKS ADDITON 3RD FILING
1994-00007373:16:4:::::4:::::LT 16 & 17 BK 4 EMERALD FOREST ESTATES
1994-00007374:12:35:::::35:::::LT 12 BK 35 FLEETWOOD ADDITION 3RD FILING
1994-00007375:16:4:::::4:::::LT 16 BK 4 UNIVERSITY GARDENS 1ST FILING
1994-00007376:19:2:::::2:::::LT 19 BK 2 REPLAT KENWOOD ESTATES ADDITION
The legal Column in this case only has Column delimiters nested in the data, no record delimiters. These are also pretty consistent. The end result should be additional columns added for each of the Column Delimiters (':') with the data after each delimiter kachunked into the appropriate new column.
IF, for example, there was a Record Delimiter as well, the there would need to be an additional record added for the Docnum in question, and then those values broken out as well.
Whatcha think?
Crusty.
That the snapshot doesn't match the data you posted?
And that this still isn't quite where it needs to be as far as expected results, but it provides a little more insight into the problem.
September 13, 2012 at 2:53 pm
here's the data as a readyly consumable CTE;
if i knew what the process was doing, we could add a lot more to this:
With MyTableExample ([Doc Number],[Document Date],[Rec Date],[Book],[Vol],[Page],[Doc Type],[# of Pages],[Related],[Grantor],[Grantee],[Legal],[Amount],[Image ID] )
AS
(
SELECT '1994-00000001','12/27/1993 12:00:00 AM','01/03/1994','SL','16','33','STATE TAX LIEN','2',':OPR:1169:250','TEXAS STATE OF','KFC NATIONAL MANAGEMENT CO','::::::::::::FOR RELEASE SEE OR VOL 1169 PG 250 2/23/94 ','','\1994-01-03\1994-00000001-0001.tif;\1994-01-03\1994-00000001-0002.tif' UNION ALL
SELECT '1994-00000002','12/27/1993 12:00:00 AM','01/03/1994','SL','16','35','STATE TAX LIEN','2','','TEXAS STATE OF','APACHE MACHINE & MFG INC','::::::::::::FOR RELEASE SEE OR VOL 1164 PG 3 ','','\1994-01-03\1994-00000002-0001.tif;\1994-01-03\1994-00000002-0002.tif' UNION ALL
SELECT '1994-00000003','12/27/1993 12:00:00 AM','01/03/1994','SL','16','37','STATE TAX LIEN','2','','TEXAS STATE OF','SURBERS FIBERGLASS INC','','','\1994-01-03\1994-00000003-0001.tif;\1994-01-03\1994-00000003-0002.tif' UNION ALL
SELECT '1994-00000004','12/27/1993 12:00:00 AM','01/03/1994','SL','16','39','STATE TAX LIEN','2','','TEXAS STATE OF','COLOR TILE INC','::::::::::::FOR REL SEE 1167/836 & 1168/501 OR 2/14; ::::::::::::17/94','','\1994-01-03\1994-00000004-0001.tif;\1994-01-03\1994-00000004-0002.tif' UNION ALL
SELECT '1994-00000005','12/27/1993 12:00:00 AM','01/03/1994','SL','16','41','STATE TAX LIEN','2','','TEXAS STATE OF','PERMIAN PAINTING INC','','','\1994-01-03\1994-00000005-0001.tif;\1994-01-03\1994-00000005-0002.tif'
)
SELECT * FROM MyTableExample
Lowell
September 13, 2012 at 2:57 pm
No... The snapshot does not match the data; however, the problem doesn't change. The final result would be......
DocNumStrValSequenceCol_1Col_2Col_3Col_4Col_5Col_6Col_7Col_8Col_9Col_10Col_11Col_12Col_13
1994-00008417:4:6:::::6:::::LT 41NULL46NULLNULLNULLNULL6NULLNULLNULLNULLLT 4
1994-00007372:2:17:::::17:::::LT 2 BK 17 FAIR OAKS ADDITON 3RD FILING1NULL217NULLNULLNULLNULL17NULLNULLNULLNULLLT 2 BK 17 FAIR OAKS ADDITON 3RD FILING
1994-00007373:16:4:::::4:::::LT 16 & 17 BK 4 EMERALD FOREST ESTATES1NULL164NULLNULLNULLNULL4NULLNULLNULLNULLLT 16 & 17 BK 4 EMERALD FOREST ESTATES
1994-00007374:12:35:::::35:::::LT 12 BK 35 FLEETWOOD ADDITION 3RD FILING1NULL1235NULLNULLNULLNULL35NULLNULLNULLNULLLT 12 BK 35 FLEETWOOD ADDITION 3RD FILING
1994-00007375:16:4:::::4:::::LT 16 BK 4 UNIVERSITY GARDENS 1ST FILING1NULL164NULLNULLNULLNULL4NULLNULLNULLNULLLT 16 BK 4 UNIVERSITY GARDENS 1ST FILING
It's kinda ugly, but I'm not sure how else to shove it in here.....
September 13, 2012 at 2:59 pm
Are there always 13 columns in the data for Legal? Looking at one of the data rows you provided it doesn't appear to be the case.
Viewing 15 posts - 1 through 15 (of 59 total)
You must be logged in to reply to this topic. Login to reply