Incorrect Sorting

  • Sample data and the query syntax:

    use tempdb

    go

    create table Test (Product nchar(20), Actual_Rate decimal(14,2), Quantity int)

    insert into dbo.Test

    select 'A-1015-7658',1,35.95

    union all

    select 'A-1015-7658',1,35.95

    union all

    select 'A-1015-7658',3,120.19

    union all

    select 'A-1015-7658',2,80.13

    union all

    select 'A-1015-7658',1,45.39

    union all

    select 'A-1015-7658',1,45.39

    select Product,

    case when Quantity > 0 then (Actual_Rate / Quantity)

    else Actual_Rate

    end as Actual_Rate,

    row_number() over(partition by Product order by Actual_Rate desc) as RowNum,

    count(*) over(partition by Product) as Cnt

    from dbo.Test

    drop table dbo.Test

    When I run this it does not sort all of the records correctly, I expected the values to be in descending order. Can anyone help with this?

    Thanks

  • There's no ORDER BY on that query so you can make absolutely no assumptions about the order the data will be returned in. If you want a specific order, add the ORDER BY clause. If that's not there the data will be returned in whatever order it's left in after the query processor finished.

    The order by in the ROW_NUMBER just affects the assignment of the row numbers. It may or may not affect the final order of the data.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Care to elaborate on the problem a bit? What would really help is and explaination of your code (why the count() over, for one). Also, what is the expected results based on the sample data.

  • Sorry, here's the full code:

    use tempdb

    go

    create table Test (Product nchar(20), Actual_Rate decimal(14,2), Quantity int)

    insert into dbo.Test

    select 'A-1015-7658',35.95,1

    union all

    select 'A-1015-7658',35.95,1

    union all

    select 'A-1015-7658',120.19,3

    union all

    select 'A-1015-7658',80.13,2

    union all

    select 'A-1015-7658',45.39,1

    union all

    select 'A-1015-7658',45.39,1

    ;with MedianTest as

    (

    select Product,

    case when Quantity > 0 then (Actual_Rate / Quantity)

    else Actual_Rate

    end as Actual_Rate,

    row_number() over(partition by Product order by Actual_Rate desc) as RowNum,

    count(*) over(partition by Product) as Cnt

    from dbo.Test

    )

    select Product, avg(Actual_Rate) as Median

    from MedianTest

    where RowNum IN((Cnt + 1) / 2, (Cnt + 2) / 2)

    group by Product;

    drop table dbo.Test

    Basically I'm trying to calculate the Median of a set of values. At the moment the sort order is causing problems because the values are in the wrong order.

  • Still waiting for an explanation as well as what the sample data SHOULD look like if it was sorted correctly.

  • David (6/8/2009)


    Basically I'm trying to calculate the Median of a set of values. At the moment the sort order is causing problems because the values are in the wrong order.

    If you're talking about the final sort order of the returned result set, there's no defined order because you have no ORDER BY. If you mean some other order, please can you give more details?

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Am I missing something here, this query gives the median?

  • To find the median of a result set you have to first order the result set and then depending on the number of items in the set either return the average of the two middle items or return the middle item. So for a dataset containing an even number of items the result is the average of the two middle items, for a dataset containing an odd number of items the result is the middle item.

    Running the first part of the query, the statement within the WITH block, returns the following dataset:

    A-1015-7658 40.063333333333316

    A-1015-7658 40.065000000000026

    A-1015-7658 45.390000000000036

    A-1015-7658 45.390000000000046

    A-1015-7658 35.950000000000056

    A-1015-7658 35.950000000000066

    Now to find the median the dataset needs to be sorted:

    A-1015-7658 35.950000000000056

    A-1015-7658 35.950000000000066

    A-1015-7658 40.063333333333316

    A-1015-7658 40.065000000000026

    A-1015-7658 45.390000000000036

    A-1015-7658 45.390000000000046

    However, ordering the dataset like this causes problems with the RowNum. The RowNum is used to locate either the two middle items or the middle item as defined above.

    The answer using my sample data is 40.06. Because my sample data contains an even number of items the result is the average of the two middle items.

  • Apologies, missed that entirely.

    If you calculate actual rate before you assign the row number the method works correctly.

    I haven't investigated the performance hit on this, but if you do the following CTE:

    ;with MedianTest as

    (

    select a.*,

    row_number() over(partition by Product order by Actual_Rate desc) as RowNum,

    count(*) over(partition by Product) as Cnt

    from

    (

    select Product,

    case when Quantity > 0 then (Actual_Rate / Quantity)

    else Actual_Rate

    end as Actual_Rate

    from dbo.Test

    ) a

    )

  • Perfect! Thanks Allister, that works.

  • No bother, btw I really like this solution for finding the median, pretty neat. Will have a play around with large result sets when I get a chanace.

  • Allister Reid (6/9/2009)


    No bother, btw I really like this solution for finding the median, pretty neat. Will have a play around with large result sets when I get a chanace.

    As much as I'd like to take all the credit, I actually found the following article really helpful http://www.sqlmag.com/Article/ArticleID/49827/sql_server_49827.html and it is where the majority of the code was from 😉

  • Cheers, thanks for the link!

Viewing 13 posts - 1 through 12 (of 12 total)

You must be logged in to reply to this topic. Login to reply