October 8, 2010 at 12:22 pm
Craig Farrell (10/8/2010)
I had assumed, since 2k5 and 2k don't behave, but figured it was worth the question. I'm sure Wayne and the others will be able to get you a direct solution off the actual data. I'll just step back and watch the show unless we can do some form of CSV importing, but I'm not sure how much bandwidth you have to try to get that exported.
lemme see if I can put something together. The more help the better. I am completely lost/frustrated with this whole thing.
October 8, 2010 at 1:06 pm
CSV file attached.
First column = id
Second column = controlnumber
Third column = endno
October 8, 2010 at 2:46 pm
craig-404139 (10/8/2010)
CSV file attached.First column = id
Second column = controlnumber
Third column = endno
That worked. Alright, so we're looking at taking two unique values, figuring out the 'fill', and splitting off the pattern.
@Wayne: You're right. With no numerics ever in the alpha code, yours runs faster. Against this test data my method is 7 seconds, yours is 5. Mind that's with a data return and using select vs. select. I like to keep things as easy as possible.
@Craig-4: I should have a solution soonish. I'm getting null results from endno for some reason and I'm trying to track down the data burp.
Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.
For better assistance in answering your questions[/url] | Forum Netiquette
For index/tuning help, follow these directions.[/url] |Tally Tables[/url]
Twitter: @AnyWayDBA
October 8, 2010 at 2:48 pm
Craig Farrell (10/8/2010)
craig-404139 (10/8/2010)
CSV file attached.First column = id
Second column = controlnumber
Third column = endno
That worked. Alright, so we're looking at taking two unique values, figuring out the 'fill', and splitting off the pattern.
@Wayne: You're right. With no numerics ever in the alpha code, yours runs faster. Against this test data my method is 7 seconds, yours is 5. Mind that's with a data return and using select vs. select. I like to keep things as easy as possible.
@Craig-4: I should have a solution soonish. I'm getting null results from endno for some reason and I'm trying to track down the data burp.
I know about the 13 null rows. Sorry I forgot to remove them. You can discard those rows. It's an error in how I put this together.
October 8, 2010 at 2:50 pm
Found them, the sample rows:
2048205PPCI000737005NULL
2048206PPCI000737009NULL
2048207PPCI000999798NULL
2048208PPCI001346238NULL
2048209PPCI001350958NULL
2048210PPCI001462905NULL
2048211PPCI001620877NULL
2048212PPCI001620882NULL
2048213PPCI001620891NULL
2048214PPCI001620899NULL
2048215PPCI001620909NULL
2048216PPCI001620957NULL
2048217PPCI001620962NULL
They appear to be single page docs. Is this expected in the data?
Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.
For better assistance in answering your questions[/url] | Forum Netiquette
For index/tuning help, follow these directions.[/url] |Tally Tables[/url]
Twitter: @AnyWayDBA
October 8, 2010 at 2:52 pm
Craig Farrell (10/8/2010)
Found them, the sample rows:2048205PPCI000737005NULL
2048206PPCI000737009NULL
2048207PPCI000999798NULL
2048208PPCI001346238NULL
2048209PPCI001350958NULL
2048210PPCI001462905NULL
2048211PPCI001620877NULL
2048212PPCI001620882NULL
2048213PPCI001620891NULL
2048214PPCI001620899NULL
2048215PPCI001620909NULL
2048216PPCI001620957NULL
2048217PPCI001620962NULL
They appear to be single page docs. Is this expected in the data?
Technically yes. But you can discard those rows or update the control = endno. Either way.
October 8, 2010 at 3:00 pm
I think i'm on to something. The below code works if I strip out the PPCI prefix. (Used a update/replace statement on the test table) Runs in a few seconds. Now to modify it to work with the prefixes.....
select d.controlnumber as start
,(select min(a.endno) as id
from document as a
left outer join document as b
on a.endno = b.controlnumber - 1
where b.controlnumber is null
and a.endno >= d.endno
) as ending
from document as d
left outer join document as d1
on d1.endno = d.controlnumber - 1
where d1.endno is null
order by d.controlnumber
These are my initial results with the prefixes stripped out.
000000001 - 000737004
000737006- 000737008
000737010- 000999797
000999799- 001346237
001346239- 001350957
001350959- 001462904
001462906- 001620876
001620878- 001620881
001620883- 001620890
001620892- 001620898
001620900- 001620908
001620910- 001620956
001620958- 001620961
001620963- 002903970
October 8, 2010 at 3:04 pm
This will sound strange but... are you sure you have gaps?
Run this code:
CREATE TABLE #MidPoint
(RowID INT,
Pattern VARCHAR(100),
Startnum BIGINT,
EndNum BIGINT
)
INSERT INTO #Midpoint
select
ROW_NUMBER() OVER (ORDER BY
left(controlnumber, PatIndex('%[0-9]%', controlnumber)-1),
convert(int, SubString(controlnumber, PatIndex('%[0-9]%', controlnumber), 20))
) AS RowNum,
left(controlnumber, PatIndex('%[0-9]%', controlnumber)-1) AS Pattern,
convert(int, SubString(controlnumber, PatIndex('%[0-9]%', controlnumber), 20)) AS StartNum
,ISNULL( convert(int, SubString(endno, PatIndex('%[0-9]%', endno), 20))
, convert(int, SubString(controlnumber, PatIndex('%[0-9]%', controlnumber), 20))) --StartNum if null
AS EndNum
from
gaps
order by
controlnumber
SELECT
p.*, p2.EndNum AS PreviousEndNum,
CASE WHEN (p.startNum = p2.endnum + 1) THEN 'Filled' ELSE 'Gapped' END
FROM
#Midpoint AS p
LEFT JOIN-- Link to the prior row
#Midpoint AS p2
ONp.RowID = p2.RowID + 1
--WHEREp.StartNum <> p2.EndNum + 1
order by
p.RowID
Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.
For better assistance in answering your questions[/url] | Forum Netiquette
For index/tuning help, follow these directions.[/url] |Tally Tables[/url]
Twitter: @AnyWayDBA
October 8, 2010 at 3:10 pm
Did you delete the 'null' endno rows? If so I ran your code against mine and it didnt find any. try running my code against yours and see what happens? You'll see the gaps.
**edit**
Looking at the 13 null rows and comparing my results, I see there are no gaps had I left them in. try removing them and if our codes are the same, we should see the same results.
**double edit**
My ideal output is returned from my code. You code produces alot of extra fields. Any quick way to cut it down? See my previous post for an example or run my code.
October 8, 2010 at 3:25 pm
These are my results with the null rows removed:
Pattern GapStart GapEnd
-------- -------------------- --------------------
PPCI 737005 737005
PPCI 737009 737009
PPCI 999798 999798
PPCI 1346238 1346238
PPCI 1350958 1350958
PPCI 1462905 1462905
PPCI 1620877 1620877
PPCI 1620882 1620882
PPCI 1620891 1620891
PPCI 1620899 1620899
PPCI 1620909 1620909
PPCI 1620957 1620957
PPCI 1620962 1620962
The code modification in question (I'm not nearly done with this, just a validation step confused me):
SELECT
CAST( p.Pattern AS VARCHAR(8)) AS Pattern,
p2.EndNum +1 AS GapStart,
p.StartNum -1 AS GapEnd
--p.*, p2.EndNum AS PreviousEndNum,
--CASE WHEN (p.startNum = p2.endnum + 1) THEN 'Filled' ELSE 'Gapped' END
FROM
#Midpoint AS p
LEFT JOIN-- Link to the prior row
#Midpoint AS p2
ONp.RowID = p2.RowID + 1
WHEREp.StartNum <> p2.EndNum + 1
order by
p.RowID
EDIT:
I just did a spot check on your results, and they're inaccurate. For example, you return this:
000737006 - 000737008
RowID 73230 has startnum 737006 through 737008
You're picking up some false positives with your code.
2nd edit: your row Id is 1972826
Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.
For better assistance in answering your questions[/url] | Forum Netiquette
For index/tuning help, follow these directions.[/url] |Tally Tables[/url]
Twitter: @AnyWayDBA
October 8, 2010 at 3:30 pm
Craig Farrell (10/8/2010)
These are my results with the null rows removed:
Pattern GapStart GapEnd
-------- -------------------- --------------------
PPCI 737005 737005
PPCI 737009 737009
PPCI 999798 999798
PPCI 1346238 1346238
PPCI 1350958 1350958
PPCI 1462905 1462905
PPCI 1620877 1620877
PPCI 1620882 1620882
PPCI 1620891 1620891
PPCI 1620899 1620899
PPCI 1620909 1620909
PPCI 1620957 1620957
PPCI 1620962 1620962
The code modification in question (I'm not nearly done with this, just a validation step confused me):
SELECT
CAST( p.Pattern AS VARCHAR(8)) AS Pattern,
p2.EndNum +1 AS GapStart,
p.StartNum -1 AS GapEnd
--p.*, p2.EndNum AS PreviousEndNum,
--CASE WHEN (p.startNum = p2.endnum + 1) THEN 'Filled' ELSE 'Gapped' END
FROM
#Midpoint AS p
LEFT JOIN-- Link to the prior row
#Midpoint AS p2
ONp.RowID = p2.RowID + 1
WHEREp.StartNum <> p2.EndNum + 1
order by
p.RowID
I see the confusion now. Your code is returning the actual gaps. My code is returning the range of continuous numbers. In the end i'll need to identify the gaps as your code does in addition to the actual ranges. So it does help.
How can I add the logic to remove the prefix to my existing code? Any ideas?
October 8, 2010 at 3:39 pm
Craig if you were local i'd buy you a beer. We killed 2 birds with one stone. I needed to identify the ranges of sequential numbers (my code) and the ranges of the gaps (your code).
Now I just need to tweak my code to dynamically remove the prefix like yours does.
October 8, 2010 at 3:42 pm
craig-404139 (10/8/2010)
Craig if you were local i'd buy you a beer. We killed 2 birds with one stone. I needed to identify the ranges of sequential numbers (my code) and the ranges of the gaps (your code).Now I just need to tweak my code to dynamically remove the prefix like yours does.
Easy enough, gimme 2 minutes here and I'll get you both result sets. 🙂 Btw, the method of removing the pattern portion (PPCI) is the whole substring(charindex()) bit.
I'm almost there. I just goofed something when I tried to include fake 0 rows to catch the 1 through x bits if we start at say, 90. Should be easy enough to do a double result set from the same code.
Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.
For better assistance in answering your questions[/url] | Forum Netiquette
For index/tuning help, follow these directions.[/url] |Tally Tables[/url]
Twitter: @AnyWayDBA
October 8, 2010 at 4:02 pm
This should do it:
DROP TABLE #gaps
DROP TABLE #Contiguous
DROP TABLE #MidPoint
CREATE TABLE #MidPoint
(RowID INT,
PrevRowID INT,
Pattern VARCHAR(100),
Startnum BIGINT,
EndNum BIGINT,
Series INT
)
INSERT INTO #Midpoint
( RowID, Pattern, StartNum, EndNum)
select
ROW_NUMBER() OVER (ORDER BY
left(controlnumber, PatIndex('%[0-9]%', controlnumber)-1),
convert(int, SubString(controlnumber, PatIndex('%[0-9]%', controlnumber), 20))
) AS RowNum,
left(controlnumber, PatIndex('%[0-9]%', controlnumber)-1) AS Pattern,
convert(int, SubString(controlnumber, PatIndex('%[0-9]%', controlnumber), 20)) AS StartNum
,
--ISNULL(
convert(int, SubString(endno, PatIndex('%[0-9]%', endno), 20))
--, convert(int, SubString(controlnumber, PatIndex('%[0-9]%', controlnumber), 20))) --StartNum if null
AS EndNum
from
(SELECT
*
FROM
gaps
UNION ALL -- Create a dummy 0 record for each pattern to catch 1 through x gaps
SELECT
NULL AS id,
Pattern + REPLICATE('0', NumPlaces) AS controlNumber,
Pattern + REPLICATE('0', NumPlaces) AS endno
FROM
(SELECT DISTINCT
left(controlnumber, PatIndex('%[0-9]%', controlnumber)-1) AS Pattern,
LEN( controlnumber) - LEN( left(controlnumber, PatIndex('%[0-9]%', controlnumber))) AS NumPlaces
FROM
gaps
) AS PatZero
) as drv
where
endno is not null
order by
controlnumber
UPDATE #Midpoint SET PrevRowID = RowID -1
CREATE CLUSTERED INDEX idx_Midpoint1 ON #MidPoint (RowID)
--CREATE NONCLUSTERED INDEX idx_Midpoint2 ON #MidPoint( PrevRowID) INCLUDE ( EndNum)
SELECT
CAST( p.Pattern AS VARCHAR(8)) AS Pattern,
p2.EndNum +1 AS GapStart,
p.StartNum -1 AS GapEnd
--p.*, p2.EndNum AS PreviousEndNum,
--CASE WHEN (p.startNum = p2.endnum + 1) THEN 'Filled' ELSE 'Gapped' END
INTO
#gaps
FROM
#Midpoint AS p
LEFT JOIN-- Link to the prior row
#Midpoint AS p2
ONp.PrevRowID = p2.RowID
WHEREp.StartNum <> p2.EndNum + 1
order by
p.RowID
--Quirky Update here. It SHOULD be ran with OPTION( MAXDOP 1). Only needs it for this query.
DECLARE @Series INT
SET @Series = 1
UPDATE p
SET
@Series = Series = CASE WHEN p.StartNum <> p2.EndNum + 1 THEN @Series + 1 ELSE @Series END
FROM
#Midpoint AS p
LEFT JOIN-- Link to the prior row
#Midpoint AS p2
ONp.PrevRowID = p2.RowID
-- ORDER BY p.RowID
-- The clustered index deals with this for us.
OPTION( MAXDOP 1)
SELECT
Pattern,
Series,
MIN( StartNum) AS BeginSeries,
Max( EndNum) AS EndSeries
INTO
#Contiguous
FROM
#MidPoint
GROUP BY
Pattern,
Series
select * from #gaps order by gapstart
select * from #Contiguous
ORDER by
pattern, series, beginseries
Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.
For better assistance in answering your questions[/url] | Forum Netiquette
For index/tuning help, follow these directions.[/url] |Tally Tables[/url]
Twitter: @AnyWayDBA
October 8, 2010 at 4:17 pm
I am getting weird results from the #contiguous table. It looks like:
PPCI412903540
PPCI7102903235
PPCI1932903562
PPCI2122853875
PPCI1022903952
PPCI131402902866
PPCI8592903560
PPCI1452903559
PPCI5492903970
PPCI11452900794
PPCI632903558
PPCI12542903556
PPCI33962903208
PPCI902903539
Viewing 15 posts - 46 through 60 (of 91 total)
You must be logged in to reply to this topic. Login to reply