results incorrect ROW_NUMBER() OVER (PARTITION BY

  • here is query:

    select CompanyCode,RigNumber,RigLocation,NAVLocation,monthdate,

    ROW_NUMBER() OVER (PARTITION BY companycode,rignumber,navlocation ORDER BY monthdate,companycode,rignumber, navlocation ) AS TransNumber

    from RigNumberLocation

    where RigNumber='rig-3037' and CompanyCode='mer'

    order by MonthDate

    here are results:

    CompanyCodeRigNumberRigLocationNAVLocationmonthdateTransNumber

    MER RIG-3037 MERBOW 0503 9/1/2009 1

    MER RIG-3037 MERKIL 0522 10/1/20091

    MER RIG-3037 MERKIL 0522 11/1/20092

    MER RIG-3037 MERKIL 0522 12/1/20093

    MER RIG-3037 MERKIL 0522 1/1/20104

    MER RIG-3037 MERWTX 0518 2/1/20101

    MER RIG-3037 MERWTX 0518 3/1/20102

    MER RIG-3037 MERKIL 0522 4/1/20105*

    MER RIG-3037 MERKIL 0522 5/1/20106*

    MER RIG-3037 MERKIL 0522 6/1/20107*

    So this follows a piece of equipment RIG-3037 and where it is at on a monthly basis. I have a ROW_NUMBER statement that is not coming out correctly. In the first location MERBOW it stayed one month. Thus a TransNumber of 1. Then it moved to MERKIL where it stayed 4 months so we have TransNumber 1,2,3 and 4. Then MERWTX where it stayed two months so TransNumber of 1 and 2. Then it went to MERKIL once again but instead of starting out at 1 the TransNumber started at 5. So i should be seeing TransNumber of 1,2, and 3 instead of 5,6 and 7. I know what it's doing but not why it is doing this. I have tried to work the ROW_NUMBER and the partition and order by in many different ways but it is not coming out correctly. Would anyone be able to look at this and see if i am doing something stupid? Thanks.

  • You probably won't be getting many replies if you don't give us some data in a consumable format.

    Please read the first article linked in my signature line: you'll find a nice way to post sample data and many more people will likely help you.

    -- Gianluca Sartori

  • can you delete this out and i will repost? thanks.

  • You can edit your original post using the "edit" button.

    -- Gianluca Sartori

  • Hi

    If you have a look at this article[/url] by Jeff Moden, it will explain how to group by contiguous islands of dates. This is essentially what you are trying to do.

    While this doesn't quite get the result that you want, it is because there is a large gap in the months between 12/1/2009 and 1/1/2010 for rig location MERKIL

    with sample as (

    select * FROM

    (VALUES

    ('MER','RIG-3037','MERBOW','0503',CAST('20090901' AS DATETIME)),

    ('MER','RIG-3037','MERKIL','0522',CAST('20091001' AS DATETIME)),

    ('MER','RIG-3037','MERKIL','0522',CAST('20091101' AS DATETIME)),

    ('MER','RIG-3037','MERKIL','0522',CAST('20091201' AS DATETIME)),

    ('MER','RIG-3037','MERKIL','0522',CAST('20100101' AS DATETIME)),

    ('MER','RIG-3037','MERWTX','0518',CAST('20100201' AS DATETIME)),

    ('MER','RIG-3037','MERWTX','0518',CAST('20100301' AS DATETIME)),

    ('MER','RIG-3037','MERKIL','0522',CAST('20100401' AS DATETIME)),

    ('MER','RIG-3037','MERKIL','0522',CAST('20100501' AS DATETIME)),

    ('MER','RIG-3037','MERKIL','0522',CAST('20100601' AS DATETIME))

    ) AS D(CompanyCode,RigNumber,RigLocation,NAVLocation,monthdate)

    ),

    grouped as (

    SELECT *,

    DATEADD(month, - ROW_NUMBER() OVER (PARTITION BY CompanyCode,RigNumber,RigLocation,NAVLocation ORDER BY monthdate), monthdate) G

    FROM sample

    )

    select *, ROW_NUMBER() OVER (PARTITION BY CompanyCode,RigNumber,RigLocation,NAVLocation, G ORDER BY monthdate) TransNumber

    from grouped

    order by monthdate;

    Another way to do this ignoring the gaps in the date sequences would be

    with sample as (

    select * FROM

    (VALUES

    ('MER','RIG-3037','MERBOW','0503',CAST('20090901' AS DATETIME)),

    ('MER','RIG-3037','MERKIL','0522',CAST('20091001' AS DATETIME)),

    ('MER','RIG-3037','MERKIL','0522',CAST('20091101' AS DATETIME)),

    ('MER','RIG-3037','MERKIL','0522',CAST('20091201' AS DATETIME)),

    ('MER','RIG-3037','MERKIL','0522',CAST('20100101' AS DATETIME)),

    ('MER','RIG-3037','MERWTX','0518',CAST('20100201' AS DATETIME)),

    ('MER','RIG-3037','MERWTX','0518',CAST('20100301' AS DATETIME)),

    ('MER','RIG-3037','MERKIL','0522',CAST('20100401' AS DATETIME)),

    ('MER','RIG-3037','MERKIL','0522',CAST('20100501' AS DATETIME)),

    ('MER','RIG-3037','MERKIL','0522',CAST('20100601' AS DATETIME))

    ) AS D(CompanyCode, RigNumber, RigLocation, NAVLocation, monthdate)

    ),

    grouped as (

    SELECT *,

    ROW_NUMBER() OVER (PARTITION BY CompanyCode, RigNumber, RigLocation, NAVLocation ORDER BY monthdate)

    - ROW_NUMBER() OVER (PARTITION BY CompanyCode, RigNumber ORDER BY monthdate) G

    FROM sample

    )

    select *, ROW_NUMBER() OVER (PARTITION BY CompanyCode, RigNumber, RigLocation, NAVLocation, G ORDER BY monthdate) TransNumber

    from grouped

    order by monthdate;

  • spaghettidba (11/7/2013)


    You probably won't be getting many replies if you don't give us some data in a consumable format.

    Please read the first article linked in my signature line: you'll find a nice way to post sample data and many more people will likely help you.

    I'm feeling generous this morning.

    This looks more like a running totals problem to me than something you can solve with a ROW_NUMBER().

    WITH SampleData

    (CompanyCode, RigNumber, RigLocation, NAVLocation, monthdate) AS

    (

    SELECT 'MER','RIG-3037','MERBOW','0503',CAST('9/1/2009' AS DATETIME)

    UNION ALL SELECT 'MER','RIG-3037','MERKIL','0522','10/1/2009'

    UNION ALL SELECT 'MER','RIG-3037','MERKIL','0522','11/1/2009'

    UNION ALL SELECT 'MER','RIG-3037','MERKIL','0522','12/1/2009'

    UNION ALL SELECT 'MER','RIG-3037','MERKIL','0522','1/1/2010'

    UNION ALL SELECT 'MER','RIG-3037','MERWTX','0518','2/1/2010'

    UNION ALL SELECT 'MER','RIG-3037','MERWTX','0518','3/1/2010'

    UNION ALL SELECT 'MER','RIG-3037','MERKIL','0522','4/1/2010'

    UNION ALL SELECT 'MER','RIG-3037','MERKIL','0522','5/1/2010'

    UNION ALL SELECT 'MER','RIG-3037','MERKIL','0522','6/1/2010'

    )

    SELECT *

    INTO #Temp

    FROM SampleData

    ALTER TABLE #Temp ALTER COLUMN CompanyCode VARCHAR(10) NOT NULL;

    ALTER TABLE #Temp ALTER COLUMN RigNumber VARCHAR(10) NOT NULL;

    ALTER TABLE #Temp ALTER COLUMN monthdate DATETIME NOT NULL;

    ALTER TABLE #Temp ADD PRIMARY KEY(CompanyCode, RigNumber, monthdate);

    ALTER TABLE #Temp ADD TransactionNumber INT NULL;

    DECLARE @TransNo INT = 0

    ,@KeyValue VARCHAR(20) = NULL

    ,@RigLocation VARCHAR(10);

    UPDATE #Temp WITH(TABLOCKX)

    SET TransactionNumber = CASE WHEN @KeyValue IS NULL THEN @TransNo

    WHEN @KeyValue <> CompanyCode + RigNumber THEN 1

    ELSE @TransNo END

    ,@TransNo = CASE WHEN @RigLocation = RigLocation THEN @TransNo + 1 ELSE 1 END

    ,@RigLocation = RigLocation

    ,@KeyValue = CompanyCode + RigNumber

    OPTION (MAXDOP 1);

    SELECT *

    FROM #Temp

    GO

    DROP TABLE #Temp;

    Before you get too excited about this nasty fast solution to this problem, you should probably (very) carefully read all the rules you'll need to use to ensure that this Quirky Update will work.

    Solving the Running Total and Ordinal Rank Problems[/url]

    This is the fastest solution available in SQL 2008. There are other (slower) ways that it can be done. This article compares them (even though the problem discussed is different in specifics, it falls into the same class of problem):

    Calculating Values within a Rolling Window in Transact SQL [/url]

    In your case, the "rolling window" is CompanyCode, RigNumber and instead of SUMming values in the window you're incrementing a counter.


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • mickyT (11/7/2013)


    Hi

    If you have a look at this article[/url] by Jeff Moden, it will explain how to group by contiguous islands of dates. This is essentially what you are trying to do.

    While this doesn't quite get the result that you want, it is because there is a large gap in the months between 12/1/2009 and 1/1/2010 for rig location MERKIL

    with sample as (

    select * FROM

    (VALUES

    ('MER','RIG-3037','MERBOW','0503',CAST('20090109' AS DATETIME)),

    ('MER','RIG-3037','MERKIL','0522',CAST('20090110' AS DATETIME)),

    ('MER','RIG-3037','MERKIL','0522',CAST('20090111' AS DATETIME)),

    ('MER','RIG-3037','MERKIL','0522',CAST('20090112' AS DATETIME)),

    ('MER','RIG-3037','MERKIL','0522',CAST('20100101' AS DATETIME)),

    ('MER','RIG-3037','MERWTX','0518',CAST('20100102' AS DATETIME)),

    ('MER','RIG-3037','MERWTX','0518',CAST('20100103' AS DATETIME)),

    ('MER','RIG-3037','MERKIL','0522',CAST('20100104' AS DATETIME)),

    ('MER','RIG-3037','MERKIL','0522',CAST('20100105' AS DATETIME)),

    ('MER','RIG-3037','MERKIL','0522',CAST('20100106' AS DATETIME))

    ) AS D(CompanyCode,RigNumber,RigLocation,NAVLocation,monthdate)

    ),

    grouped as (

    SELECT *,

    DATEADD(dd, - ROW_NUMBER() OVER (PARTITION BY CompanyCode,RigNumber,RigLocation,NAVLocation ORDER BY monthdate), monthdate) G

    FROM sample

    )

    select *, ROW_NUMBER() OVER (PARTITION BY CompanyCode,RigNumber,RigLocation,NAVLocation, G ORDER BY monthdate) TransNumber

    from grouped

    order by monthdate;

    Another way to do this ignoring the gaps in the date sequences would be

    with sample as (

    select * FROM

    (VALUES

    ('MER','RIG-3037','MERBOW','0503',CAST('20090109' AS DATETIME)),

    ('MER','RIG-3037','MERKIL','0522',CAST('20090110' AS DATETIME)),

    ('MER','RIG-3037','MERKIL','0522',CAST('20090111' AS DATETIME)),

    ('MER','RIG-3037','MERKIL','0522',CAST('20090112' AS DATETIME)),

    ('MER','RIG-3037','MERKIL','0522',CAST('20100101' AS DATETIME)),

    ('MER','RIG-3037','MERWTX','0518',CAST('20100102' AS DATETIME)),

    ('MER','RIG-3037','MERWTX','0518',CAST('20100103' AS DATETIME)),

    ('MER','RIG-3037','MERKIL','0522',CAST('20100104' AS DATETIME)),

    ('MER','RIG-3037','MERKIL','0522',CAST('20100105' AS DATETIME)),

    ('MER','RIG-3037','MERKIL','0522',CAST('20100106' AS DATETIME))

    ) AS D(CompanyCode, RigNumber, RigLocation, NAVLocation, monthdate)

    ),

    grouped as (

    SELECT *,

    ROW_NUMBER() OVER (PARTITION BY CompanyCode, RigNumber, RigLocation, NAVLocation ORDER BY monthdate)

    - ROW_NUMBER() OVER (PARTITION BY CompanyCode, RigNumber ORDER BY monthdate) G

    FROM sample

    )

    select *, ROW_NUMBER() OVER (PARTITION BY CompanyCode, RigNumber, RigLocation, NAVLocation, G ORDER BY monthdate) TransNumber

    from grouped

    order by monthdate;

    Perhaps there is a way with ROW_NUMBER() and I was just too lazy to look for it! 😛


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • Just realized that I was reading the dates the wrong way.

    Will edit my previous to fix

  • mickyT (11/7/2013)


    Just realized that I was reading the dates the wrong way.

    Will edit my previous to fix

    Other than the dates being wrong in your sample table, it still works because I just ran it against the values I put into my #Temp table.

    Nice solution Micky!


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • Thanks ... inspired by others (Jeff) of course:-)

  • mickyT (11/7/2013)


    Thanks ... inspired by others (Jeff) of course:-)

    Of course. I'd say mine too, but in this particular case since they both work, I'd go with Micky's (advice to the OP).


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • Yes it still works and works great. now i will have to pick it apart a bit to understand it, but i really appreciate both of you helping me with my problem. Thanks again!

Viewing 12 posts - 1 through 11 (of 11 total)

You must be logged in to reply to this topic. Login to reply