February 24, 2011 at 3:55 am
ok where to begin.
I've got an excel file with a part number, and a factor
I need to change this list into ranges for each factor.
Each factor can have multiple ranges.
This is more or less what I have:
Create Table XlsDump
(PartNumber varchar(50)
,Factor decimal(18,10))
Insert Into XlsDump
Select '35100-07g00-019' as 'PartNumber', '0.2129380000' as 'Factor'
Union Select '35100-07g01-019', '0.2129380000'
Union Select '35100-07g02-019', '0.2129380000'
Union Select '65310-07G01-000', '0.2129380000'
Union Select '65310-07G02-000', '0.2129380000'
Union Select '65310-07G03-000', '0.2129380000'
Union Select '11400-95855-000', '0.2293830000'
Union Select '11400-95856-000', '0.2293830000'
Union Select '11400-95857-000', '0.2293830000'
Union Select '61000-00001-000', '0.2293830000'
Union Select '61000-00002-000', '0.2293830000'
Union Select '61000-00003-000', '0.2293830000'
Union Select '24356-89000-000', '0.3344860000'
Union Select '24356-89001-000', '0.3344860000'
Union Select '24356-89002-000', '0.3344860000'
Union Select '51400-00001-000', '0.3344860000'
Union Select '51400-00002-000', '0.3344860000'
Union Select '51400-00003-000', '0.3344860000'
Select * from XlsDump
And I should somehow build this up to be:
--Select '65310-07G00-000' as 'RangeFrom','65310-07G10-000' as 'RangeTo','0.2129380000' as 'Factor'
--Thanks for highlighting the mistake
Select '65310-07G00-000' as 'RangeFrom','65310-07G03-000' as 'RangeTo','0.2129380000' as 'Factor'
UNION Select '35100-07g00-019','35100-07G10-YU1','0.2129380000'
UNION Select '11400-95855-000','11400-95855-000','0.2293830000'
UNION Select '61000-00000-000','61000-99999-000','0.2293830000'
UNION Select '24356-89000-000','24356-89131-000','0.3344860000'
UNION Select '51400-00000-000','51400-49999.999','0.3344860000'
I started to do this in excel. I Sorted by PartNumber, then by Factor
Manually this is going to take forever!
The XlsDump table has 56K lines 🙁
It should resolve to about 100 lines with ranges for the 5 unique factors.
How on earth to I tackle this?
The server is 2000, but for doing to formatting I have a 2008 express running on my pc, so can use advanced functions as well if need be.
Any help will be appreciated.
February 24, 2011 at 7:20 am
Your example output has the following in it...
Select '65310-07G00-000' as 'RangeFrom', '65310-07G10-000' as 'RangeTo', '0.2129380000' as 'Factor'
The largest value for 65310 in your example data is '65310-07G03-000'. Where did you get '65310-07G10-000' from in your example output?
--Jeff Moden
Change is inevitable... Change for the better is not.
February 24, 2011 at 12:19 pm
Sorry, just noticed that. That was a bit of actual data that got mixed up in sample data.
Thus the output should be:
SELECT '0.2293830000' As 'Factor'
,'11400-95855-000' as 'PartFrom'
,'11400-95857-000' as 'PartTo'
UNION SELECT '0.3344860000' As 'Factor'
,'24356-89000-000' as 'PartFrom'
,'24356-89002-000' as 'PartTo'
UNION SELECT '0.2129380000' As 'Factor'
,'35100-07g00-019' as 'PartFrom'
,'35100-07g02-019' as 'PartTo'
UNION SELECT '0.3344860000' As 'Factor'
,'51400-00001-000' as 'PartFrom'
,'51400-00003-000' as 'PartTo'
UNION SELECT '0.2293830000' As 'Factor'
,'61000-00001-000' as 'PartFrom'
,'61000-00003-000' as 'PartTo'
UNION SELECT '0.2129380000' As 'Factor'
,'65310-07G01-000' as 'PartFrom'
,'65310-07G03-000' as 'PartTo'
February 24, 2011 at 12:46 pm
I have a solution in mind, but would like to see first if there is an alternative.
I can add an extra column to the table. say (seq int null)
Then Create a cursor that compares the current factor in the cursor with the previous factor
If they are the same the seq number stays the same. If not, it is a new range, and I increment seq with one
Then I can pretty much select min(id) and max(id) for that seq to get the partfrom and partto
But I understand this type of thinking is not very popular around here ;-p
February 24, 2011 at 3:36 pm
Ok this is what I did.
Let see if we can find some cleaner solutions.
I for one still use cursors in these type of scenarios, but only because I'm not aware of any other way..Here goes.
--Begin Transaction
declare @OldFactor as decimal(18,10)
declare @NewFactor as decimal(18,10)
declare @part as varchar(50)
declare @ID as int
Declare @Seq as int
Select @Seq = 0
Declare @STR varchar(100)
declare Factors2Process
cursor Static for
(Select PartNumber, Factor, SymFactorID from SymFactor)
open Factors2Process
fetch Next
From Factors2Process
Into @Part, @NewFactor, @ID
While @@fetch_status = 0
begin
--print @ID
if @NewFactor = @OldFactor
Begin
Select @STR = 'Part: ' + cast(isnull(@Part,0) as varchar(50)) + ' New: ' + cast(isnull(@NewFactor,0) as varchar(50)) + ' ' + ' Old: ' + cast(isnull(@OldFactor,0) as varchar(50)) + ' Seq: ' + cast(isnull(@seq,0) as varchar(50))
Print @STR
End
Else
Begin
Select @seq = @seq + 1
Select @STR = 'Part: ' + cast(isnull(@Part,0) as varchar(50)) + ' New: ' + cast(isnull(@NewFactor,0) as varchar(50)) + ' ' + ' Old: ' + cast(isnull(@OldFactor,0) as varchar(50)) + ' Seq: ' + cast(isnull(@seq,0) as varchar(50))
Print @STR
End
update SymFactor2
Set Seq = @Seq
Where SymFactorID = @ID
Select @OldFactor = @NewFactor
Fetch Next From Factors2Process
Into @Part, @NewFactor, @ID
end
Close Factors2Process
Deallocate Factors2Process
--Commit Transaction
--Rollback Transaction
Create Table #link(FromID int, ToID int, Seq int)
Insert Into #link
Select Distinct
MIN(SymFactorID)
,MAX(SymFactorID)
,Seq
from
SymFactor2 sf_a
Group By
Seq
Select
frm.PartNumber as 'PartFrom'
,t.PartNumber as 'PartTo'
,frm.Factor as 'Factor'
From
#link
Inner Join SymFactor2 frm
on #link.FromID = frm.SymFactorID
Inner Join SymFactor2 t
on #link.ToID = t.SymFactorID
February 24, 2011 at 5:00 pm
Try this:
select Factor,min(PartNumber) as PartFrom,max(PartNumber) as PartTo
from Factors2Process
group BY Factor,stuff(PartNumber,7,5,'')
order by Factor,PartFrom
And by the way - there is no order by on the query in your cursor definition...you should have one.
MM
select geometry::STGeomFromWKB(0x0106000000020000000103000000010000000B0000001000000000000840000000000000003DD8CCCCCCCCCC0840000000000000003DD8CCCCCCCCCC08408014AE47E17AFC3F040000000000104000CDCCCCCCCCEC3F9C999999999913408014AE47E17AFC3F9C99999999991340000000000000003D0000000000001440000000000000003D000000000000144000000000000000400400000000001040000000000000F03F100000000000084000000000000000401000000000000840000000000000003D0103000000010000000B000000000000000000143D000000000000003D009E99999999B93F000000000000003D009E99999999B93F8014AE47E17AFC3F400000000000F03F00CDCCCCCCCCEC3FA06666666666FE3F8014AE47E17AFC3FA06666666666FE3F000000000000003D1800000000000040000000000000003D18000000000000400000000000000040400000000000F03F000000000000F03F000000000000143D0000000000000040000000000000143D000000000000003D, 0);
February 24, 2011 at 11:47 pm
mister.magoo (2/24/2011)
Try this:
select Factor,min(PartNumber) as PartFrom,max(PartNumber) as PartTo
from Factors2Process
group BY Factor,stuff(PartNumber,7,5,'')
order by Factor,PartFrom
And by the way - there is no order by on the query in your cursor definition...you should have one.
From the Actual data, this query returns 13K Ranges.
The Cursor groups it into 500 ranges.
Yeah the table the cursor selects from is pre-formated. I will add the sorting, but do not think it will make a difference. I double checked the cursor by 1st printing the ID to make sure the order is correct. it printed 1, 2, 3, 4 etc... so I'm Happy. Then I also double checked the 1st few ranges, and they came out the same as if I would have done them manually. I'm fairly happy with the results, except I did not expect 500 ranges. I really though it would be about 100.
I would still LOVE to see if there is a non cursor way of doing this.
February 25, 2011 at 12:58 am
ZA_Crafty (2/24/2011)
mister.magoo (2/24/2011)
Try this:
select Factor,min(PartNumber) as PartFrom,max(PartNumber) as PartTo
from Factors2Process
group BY Factor,stuff(PartNumber,7,5,'')
order by Factor,PartFrom
And by the way - there is no order by on the query in your cursor definition...you should have one.
From the Actual data, this query returns 13K Ranges.
The Cursor groups it into 500 ranges.
Yeah the table the cursor selects from is pre-formated. I will add the sorting, but do not think it will make a difference. I double checked the cursor by 1st printing the ID to make sure the order is correct. it printed 1, 2, 3, 4 etc... so I'm Happy. Then I also double checked the 1st few ranges, and they came out the same as if I would have done them manually. I'm fairly happy with the results, except I did not expect 500 ranges. I really though it would be about 100.
I would still LOVE to see if there is a non cursor way of doing this.
The number of ranges produced by this query is dependent on the way you want to group - in my case I used the first and third parts of the PartNumber columns as that seemed to match your expected output (and the results matched your expected output). Could you please clarify what defines a range in your data?
The cursor you are using is not guaranteed to always return the data in the expected order - that is what an ORDER BY clause does for you - it guarantees the order. Just because you have checked a few times and it seems right does not mean it will always be without an ORDER BY.
The query I posted will do what you want if you can just define what criteria should be used to define a range.
MM
select geometry::STGeomFromWKB(0x0106000000020000000103000000010000000B0000001000000000000840000000000000003DD8CCCCCCCCCC0840000000000000003DD8CCCCCCCCCC08408014AE47E17AFC3F040000000000104000CDCCCCCCCCEC3F9C999999999913408014AE47E17AFC3F9C99999999991340000000000000003D0000000000001440000000000000003D000000000000144000000000000000400400000000001040000000000000F03F100000000000084000000000000000401000000000000840000000000000003D0103000000010000000B000000000000000000143D000000000000003D009E99999999B93F000000000000003D009E99999999B93F8014AE47E17AFC3F400000000000F03F00CDCCCCCCCCEC3FA06666666666FE3F8014AE47E17AFC3FA06666666666FE3F000000000000003D1800000000000040000000000000003D18000000000000400000000000000040400000000000F03F000000000000F03F000000000000143D0000000000000040000000000000143D000000000000003D, 0);
February 25, 2011 at 2:25 am
mister.magoo (2/25/2011)
ZA_Crafty (2/24/2011)
mister.magoo (2/24/2011)
Try this:
select Factor,min(PartNumber) as PartFrom,max(PartNumber) as PartTo
from Factors2Process
group BY Factor,stuff(PartNumber,7,5,'')
order by Factor,PartFrom
And by the way - there is no order by on the query in your cursor definition...you should have one.
From the Actual data, this query returns 13K Ranges.
The Cursor groups it into 500 ranges.
Yeah the table the cursor selects from is pre-formated. I will add the sorting, but do not think it will make a difference. I double checked the cursor by 1st printing the ID to make sure the order is correct. it printed 1, 2, 3, 4 etc... so I'm Happy. Then I also double checked the 1st few ranges, and they came out the same as if I would have done them manually. I'm fairly happy with the results, except I did not expect 500 ranges. I really though it would be about 100.
I would still LOVE to see if there is a non cursor way of doing this.
The number of ranges produced by this query is dependent on the way you want to group - in my case I used the first and third parts of the PartNumber columns as that seemed to match your expected output (and the results matched your expected output). Could you please clarify what defines a range in your data?
The cursor you are using is not guaranteed to always return the data in the expected order - that is what an ORDER BY clause does for you - it guarantees the order. Just because you have checked a few times and it seems right does not mean it will always be without an ORDER BY.
The query I posted will do what you want if you can just define what criteria should be used to define a range.
a range is defined by volume of sales. Also the range can overlap. thus a - c is factor 1, d is factor 2 then e-g is gain factor one. So you cannot simply say a to g is factor 1
If this makes sense...
So the way I have it, order 1st by part number, hen by factor, thn for each change in factor the first and las part is the range.
February 25, 2011 at 5:23 am
Ok, so based on your latest bit of information, try this:
sorry - hadn't spotted that you are on sql 2000...removed code that won't work for you...
MM
select geometry::STGeomFromWKB(0x0106000000020000000103000000010000000B0000001000000000000840000000000000003DD8CCCCCCCCCC0840000000000000003DD8CCCCCCCCCC08408014AE47E17AFC3F040000000000104000CDCCCCCCCCEC3F9C999999999913408014AE47E17AFC3F9C99999999991340000000000000003D0000000000001440000000000000003D000000000000144000000000000000400400000000001040000000000000F03F100000000000084000000000000000401000000000000840000000000000003D0103000000010000000B000000000000000000143D000000000000003D009E99999999B93F000000000000003D009E99999999B93F8014AE47E17AFC3F400000000000F03F00CDCCCCCCCCEC3FA06666666666FE3F8014AE47E17AFC3FA06666666666FE3F000000000000003D1800000000000040000000000000003D18000000000000400000000000000040400000000000F03F000000000000F03F000000000000143D0000000000000040000000000000143D000000000000003D, 0);
February 25, 2011 at 7:05 pm
ZA_Crafty (2/25/2011)
mister.magoo (2/25/2011)
ZA_Crafty (2/24/2011)
mister.magoo (2/24/2011)
Try this:
select Factor,min(PartNumber) as PartFrom,max(PartNumber) as PartTo
from Factors2Process
group BY Factor,stuff(PartNumber,7,5,'')
order by Factor,PartFrom
And by the way - there is no order by on the query in your cursor definition...you should have one.
From the Actual data, this query returns 13K Ranges.
The Cursor groups it into 500 ranges.
Yeah the table the cursor selects from is pre-formated. I will add the sorting, but do not think it will make a difference. I double checked the cursor by 1st printing the ID to make sure the order is correct. it printed 1, 2, 3, 4 etc... so I'm Happy. Then I also double checked the 1st few ranges, and they came out the same as if I would have done them manually. I'm fairly happy with the results, except I did not expect 500 ranges. I really though it would be about 100.
I would still LOVE to see if there is a non cursor way of doing this.
The number of ranges produced by this query is dependent on the way you want to group - in my case I used the first and third parts of the PartNumber columns as that seemed to match your expected output (and the results matched your expected output). Could you please clarify what defines a range in your data?
The cursor you are using is not guaranteed to always return the data in the expected order - that is what an ORDER BY clause does for you - it guarantees the order. Just because you have checked a few times and it seems right does not mean it will always be without an ORDER BY.
The query I posted will do what you want if you can just define what criteria should be used to define a range.
a range is defined by volume of sales. Also the range can overlap. thus a - c is factor 1, d is factor 2 then e-g is gain factor one. So you cannot simply say a to g is factor 1
If this makes sense...
So the way I have it, order 1st by part number, hen by factor, thn for each change in factor the first and las part is the range.
With all of that in mind, I've not checked your cursor. 😉 I do have some code that will resolve a million row table in just a couple of seconds, though. Rather than me talk about it, I'll let the comments in the code do the talking...
First, your original test data with some added rows just to proof the pudding...
/*****************************************************************
Build a test table. Note that this is NOT a part of the solution.
*****************************************************************/
--===== Do this experiment in a nice safe place that everyone has
USE TempDB
;
--===== Conditionally drop the test table to make reruns easier
IF OBJECT_ID('TempDB.dbo.XlsDump','U') IS NOT NULL
DROP TABLE dbo.XlsDump
;
--===== Create the test table
CREATE TABLE dbo.XlsDump
(
PartNumber VARCHAR(50),
Factor DECIMAL(18,10)
)
;
--===== Populate the test table. Notice the extra data.
INSERT INTO dbo.XlsDump
(PartNumber, Factor)
SELECT '35100-07g00-019', '0.2129380000' UNION ALL
SELECT '35100-07g01-019', '0.2129380000' UNION ALL
SELECT '35100-07g02-019', '0.2129380000' UNION ALL
SELECT '65310-07G01-000', '0.2129380000' UNION ALL
SELECT '65310-07G02-000', '0.2129380000' UNION ALL
SELECT '65310-07G03-000', '0.2129380000' UNION ALL
--------------------------------------------------------------------
SELECT '65310-07G04-000', '0.3000000000' UNION ALL
SELECT '65310-07G05-000', '0.3000000000' UNION ALL
SELECT '65310-07G06-000', '0.3000000000' UNION ALL
SELECT '65310-07G07-000', '0.2129380000' UNION ALL
SELECT '65310-07G08-000', '0.2129380000' UNION ALL
SELECT '65310-07G09-000', '0.2129380000' UNION ALL
--------------------------------------------------------------------
SELECT '11400-95855-000', '0.2293830000' UNION ALL
SELECT '11400-95856-000', '0.2293830000' UNION ALL
SELECT '11400-95857-000', '0.2293830000' UNION ALL
SELECT '61000-00001-000', '0.2293830000' UNION ALL
SELECT '61000-00002-000', '0.2293830000' UNION ALL
SELECT '61000-00003-000', '0.2293830000' UNION ALL
SELECT '24356-89000-000', '0.3344860000' UNION ALL
SELECT '24356-89001-000', '0.3344860000' UNION ALL
SELECT '24356-89002-000', '0.3344860000' UNION ALL
SELECT '51400-00001-000', '0.3344860000' UNION ALL
SELECT '51400-00002-000', '0.3344860000' UNION ALL
SELECT '51400-00003-000', '0.3344860000'
;
--===== Display the contents of the test table.
SELECT * FROM dbo.XlsDump
;
... and the solution. Be sure to read the article referenced in the code. If you don't follow all of the rules, it could give you the wrong answer and we can't have that. If you do follow the rules, then no worries. The only thing you really have to get used to is how quickly it blows the doors off any other method in SQL Server 2000. 😉
/*****************************************************************
Now we solve the problem using a "checked" "Quirky Update".
Please see the article at the following URL for how it works.
http://www.sqlservercentral.com/articles/T-SQL/68467/
The "check" isn't currently in the article.
*****************************************************************/
--===== Conditionally drop the work table to make reruns easier
IF OBJECT_ID('TempDB..#MyHead','U') IS NOT NULL
DROP TABLE #MyHead
;
--===== Populate the work table from the original data making some translations on the way
SELECT RowNum = IDENTITY(INT,1,1),
PartNumber,
Part = CAST(SUBSTRING(PartNumber,1,CHARINDEX('-',PartNumber)-1) AS VARCHAR(10)),
Factor,
GroupNumber = CAST(NULL AS INT)
INTO #MyHead
FROM dbo.XlsDump
ORDER BY PartNumber, Factor
OPTION (MAXDOP 1) --absolutely essential here
;
--===== Create the quintessential clustered index in the order we need.
-- This should be the only index on this temp table if you can help it.
CREATE UNIQUE CLUSTERED INDEX IX_Composite
ON #MyHead (RowNum,Part,Factor)
WITH FILLFACTOR = 100
;
--===== Declare and preset some variables to support the "Quirky Update"
DECLARE @Counter INT,
@GroupNumber INT,
@PrevPart VARCHAR(10),
@PrevFactor DECIMAL(18,10)
;
SELECT @Counter = 1,
@GroupNumber = 0
;
--===== Do an update just like you would with a cursor but MUCH faster.
-- This just does some grouping based on part and factor.
UPDATE tgt
SET @GroupNumber = GroupNumber = CASE --@Counter is used as an error detector
WHEN @Counter = RowNum
THEN CASE
WHEN Part = @PrevPart AND Factor = @PrevFactor
THEN @GroupNumber
ELSE @GroupNumber + 1
END
ELSE 1/0 --Force error if counter gets out of sync with RowNum for ANY reason
END,
@PrevPart = Part,
@PrevFactor = Factor,
@Counter = @Counter + 1
FROM #MyHead tgt WITH (TABLOCKX) --absolutely required
OPTION (MAXDOP 1) --absolutely required
;
--===== Display the desired result.
SELECT Factor,
PartFrom = MIN(PartNumber),
PartTo = MAX(PartNumber)
FROM #MyHead
GROUP BY Factor, GroupNumber
ORDER BY PartFrom
;
--Jeff Moden
Change is inevitable... Change for the better is not.
February 25, 2011 at 7:12 pm
Sorry... I almost forgot. Here are the results I get from the final SELECT in the solution...
FactorPartFromPartTo
0.229383000011400-95855-00011400-95857-000
0.334486000024356-89000-00024356-89002-000
0.212938000035100-07g00-01935100-07g02-019
0.334486000051400-00001-00051400-00003-000
0.229383000061000-00001-00061000-00003-000
0.212938000065310-07G01-00065310-07G03-000
0.300000000065310-07G04-00065310-07G06-000
0.212938000065310-07G07-00065310-07G09-000
--Jeff Moden
Change is inevitable... Change for the better is not.
February 27, 2011 at 2:11 pm
Jeff! You are my hero. Will go through it again, but I think i got most of it.
I simply love the force error part...
ELSE 1/0
and #MyHead... very creative 😉
February 27, 2011 at 2:36 pm
Heh... thanks for the feedback, ZA. Most people don't get the pun built into that particular table name. 😛
Just as a reminder and a friendly suggestion... the method I posted only works correctly if you follow all the rules. If you haven't taken the time to read the article at the link I posted, at least take the time to read the rules posted near the end of that article.
Last but not least, if there's anything you don't understand about the code, please ask! There's nothing worse than having to support code if you aren't 100% sure of how it works.
--Jeff Moden
Change is inevitable... Change for the better is not.
March 9, 2011 at 6:15 am
OK, so Now a curve ball was dealt.
Not all part numbers follow the same format.
As in my sample, most parts has the format:
00000-000-000
I just came across some partnumbers that crash on line
Part = CAST(SUBSTRING(PartNumber,1,CHARINDEX('-',PartNumber)-1) AS VARCHAR(10)),
They do not contain any "-" characters
This is just for reference? so taking the 1st 7chars for instance and stripping off any special chars like "-" should do the trick?
Viewing 15 posts - 1 through 15 (of 19 total)
You must be logged in to reply to this topic. Login to reply