Distributing the greatest value by rank in SQL

  • Steve,

    I was actually replying to dwain c. but I do thank you as well. I was out of the office for a few days and didn't try yours until now. With your code though it does not account for a situation where the record being evaluated is less than the previous which was also less than the previous...if you can follow that:

    Avg_Sales..........YourCode_Outcome.......Desired_Outcome

    832.........................832............................832

    672.........................832............................832

    451.........................672............................832

    Hope this is more clear. I still need to be referring either to the "YourCode_Outcome" column with my logic or continue the logic back to the beginning of that particular window.

  • lmeinke (12/14/2012)


    Steve,

    I was actually replying to dwain c. but I do thank you as well. I was out of the office for a few days and didn't try yours until now. With your code though it does not account for a situation where the record being evaluated is less than the previous which was also less than the previous...if you can follow that:

    Avg_Sales..........YourCode_Outcome.......Desired_Outcome

    832.........................832............................832

    672.........................832............................832

    451.........................672............................832

    Hope this is more clear. I still need to be referring either to the "YourCode_Outcome" column with my logic or continue the logic back to the beginning of that particular window.

    You might try using the Quote button to clarify who you are responding to.

    I'm curious what test data you used with my code that you say doesn't work. I thought I tested for the first of a group. If you post that example along with expected results, I might be able to look into it further and suggest something.


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • lmeinke (12/14/2012)


    Steve,

    I was actually replying to dwain c. but I do thank you as well. I was out of the office for a few days and didn't try yours until now. With your code though it does not account for a situation where the record being evaluated is less than the previous which was also less than the previous...if you can follow that:

    Avg_Sales..........YourCode_Outcome.......Desired_Outcome

    832.........................832............................832

    672.........................832............................832

    451.........................672............................832

    Hope this is more clear. I still need to be referring either to the "YourCode_Outcome" column with my logic or continue the logic back to the beginning of that particular window.

    Perhaps I missed something, but your output here doesn't tell me what employee size these avg_sales figures belong to, which is what matters, does it not? Or is there a consideration beyond that? Please post the sample data that you used, so I can see where I went wrong, if that's the case.

    Steve (aka sgmunson) 🙂 🙂 🙂
    Rent Servers for Income (picks and shovels strategy)

  • dwain.c (12/14/2012)


    lmeinke (12/14/2012)


    Steve,

    I was actually replying to dwain c. but I do thank you as well. I was out of the office for a few days and didn't try yours until now. With your code though it does not account for a situation where the record being evaluated is less than the previous which was also less than the previous...if you can follow that:

    Avg_Sales..........YourCode_Outcome.......Desired_Outcome

    832.........................832............................832

    672.........................832............................832

    451.........................672............................832

    Hope this is more clear. I still need to be referring either to the "YourCode_Outcome" column with my logic or continue the logic back to the beginning of that particular window.

    You might try using the Quote button to clarify who you are responding to.

    I'm curious what test data you used with my code that you say doesn't work. I thought I tested for the first of a group. If you post that example along with expected results, I might be able to look into it further and suggest something.

    Dwain (Now I know what the quote button is for),

    Thank you again for your response. The issue you with your code is not the first record it when successively declining values for two or more records. I tried to illustrate with the example above. If you use the '451' in the third row it is looking at the '672' and taking that value as is the larger of the two. It needs to also take into account the preceding record of '932' and use that value. I read an article using the quirky update technique that seems to suggest I could update a variable and hold the greatest value OR populate another column with the greatest value and refer to that column rather than the same column of the preceding row.

    Hope this helps.

    Thank you,

    Lonnie

  • Ok, I see where I had trouble. Here's a fix, but I'd be concerned about it's performance with a large volume of records. It might be really bad...

    ;WITH SampleData (Category, Avg_Sales, Emp_Size, Industry, SalesRank, Emp_Rank) AS (

    SELECT '01-Widgets', 100, '1-4', 'Accommodations', 2, 1 UNION ALL

    SELECT '01-Widgets', 75, '5-9', 'Accommodations', 1, 2 UNION ALL

    SELECT '01-Widgets', 175, '10-19', 'Accommodations', 3, 3 UNION ALL

    SELECT '01-Widgets', 225, '20-49', 'Accommodations', 6, 4 UNION ALL

    SELECT '01-Widgets', 200, '50-99', 'Accommodations', 5, 5 UNION ALL

    SELECT '01-Widgets', 180, '100-199', 'Accommodations', 4, 6

    )

    SELECT SD.Industry, SD.Category, SD.Emp_Size, (

    SELECT MAX(Avg_Sales)

    FROM SampleData AS SD1

    WHERE SD1.Industry = SD.Industry

    AND SD1.Category = SD.Category

    AND SD1.Emp_Rank <= SD.Emp_Rank

    ) AS Max_Sales

    FROM SampleData AS SD

    Use it at your own risk. (NOTE / EDIT: I no longer need a dummy record with a zero value)

    Steve (aka sgmunson) 🙂 🙂 🙂
    Rent Servers for Income (picks and shovels strategy)

  • sgmunson (12/17/2012)


    Ok, I see where I had trouble. Here's a fix, but I'd be concerned about it's performance with a large volume of records. It might be really bad...

    ;WITH SampleData (Category, Avg_Sales, Emp_Size, Industry, SalesRank, Emp_Rank) AS (

    SELECT '01-Widgets', 100, '1-4', 'Accommodations', 2, 1 UNION ALL

    SELECT '01-Widgets', 75, '5-9', 'Accommodations', 1, 2 UNION ALL

    SELECT '01-Widgets', 175, '10-19', 'Accommodations', 3, 3 UNION ALL

    SELECT '01-Widgets', 225, '20-49', 'Accommodations', 6, 4 UNION ALL

    SELECT '01-Widgets', 200, '50-99', 'Accommodations', 5, 5 UNION ALL

    SELECT '01-Widgets', 180, '100-199', 'Accommodations', 4, 6

    )

    SELECT SD.Industry, SD.Category, SD.Emp_Size, (

    SELECT MAX(Avg_Sales)

    FROM SampleData AS SD1

    WHERE SD1.Industry = SD.Industry

    AND SD1.Category = SD.Category

    AND SD1.Emp_Rank <= SD.Emp_Rank

    ) AS Max_Sales

    FROM SampleData AS SD

    Use it at your own risk. (NOTE / EDIT: I no longer need a dummy record with a zero value)

    Steve,

    Thank you very much, while slow (1:40 for roughly 25K records) it works perfectly. This will work for my application. I am grateful for people who think differently from me as I was stuck in a cursor like mentality to solve this....I could see no other way. Thank you again.

    Lonnie

  • Glad to hear it works for you, although I'm thinking there might be a better way than to just regenerate the data every time you need it. If the time of 1:40 is 1 minute 40 seconds, it's not necessary, but if it's 1 hour 40 minutes, you might want to at least consider whether a trigger could update a separate table (quickly enough to avoid insert performance issues) that could always be selected from to get the results of this query without ANY performance issues. The trigger would basically perform what's in the subquery and insert that into a table, or update the table, on every insert or update to the source table (the one that has the Emp_Rank values in it). Just something to think about, and again, if your result time is minutes instead of hours, it's completely unncessary.

    Steve (aka sgmunson) 🙂 🙂 🙂
    Rent Servers for Income (picks and shovels strategy)

  • sgmunson (12/17/2012)


    Glad to hear it works for you, although I'm thinking there might be a better way than to just regenerate the data every time you need it. If the time of 1:40 is 1 minute 40 seconds, it's not necessary, but if it's 1 hour 40 minutes, you might want to at least consider whether a trigger could update a separate table (quickly enough to avoid insert performance issues) that could always be selected from to get the results of this query without ANY performance issues. The trigger would basically perform what's in the subquery and insert that into a table, or update the table, on every insert or update to the source table (the one that has the Emp_Rank values in it). Just something to think about, and again, if your result time is minutes instead of hours, it's completely unncessary.

    Thank you again Steve. It is one minute 40 seconds so it is acceptable at this point. Although your other suggestion will be placed in my memory banks for future reference.

    Lonnie M

  • lmeinke (12/17/2012)


    dwain.c (12/14/2012)


    lmeinke (12/14/2012)


    Steve,

    I was actually replying to dwain c. but I do thank you as well. I was out of the office for a few days and didn't try yours until now. With your code though it does not account for a situation where the record being evaluated is less than the previous which was also less than the previous...if you can follow that:

    Avg_Sales..........YourCode_Outcome.......Desired_Outcome

    832.........................832............................832

    672.........................832............................832

    451.........................672............................832

    Hope this is more clear. I still need to be referring either to the "YourCode_Outcome" column with my logic or continue the logic back to the beginning of that particular window.

    You might try using the Quote button to clarify who you are responding to.

    I'm curious what test data you used with my code that you say doesn't work. I thought I tested for the first of a group. If you post that example along with expected results, I might be able to look into it further and suggest something.

    Dwain (Now I know what the quote button is for),

    Thank you again for your response. The issue you with your code is not the first record it when successively declining values for two or more records. I tried to illustrate with the example above. If you use the '451' in the third row it is looking at the '672' and taking that value as is the larger of the two. It needs to also take into account the preceding record of '932' and use that value. I read an article using the quirky update technique that seems to suggest I could update a variable and hold the greatest value OR populate another column with the greatest value and refer to that column rather than the same column of the preceding row.

    Hope this helps.

    Thank you,

    Lonnie

    I hate leaving loose ends so try this. Bet it is faster on your 25K rows.

    CREATE TABLE #SalesByEmployeeCategory

    (Category VARCHAR(100)

    ,Avg_Sales MONEY

    ,Emp_Size VARCHAR(100)

    ,Industry VARCHAR(100)

    ,DesiredOutput MONEY

    ,EmpRank AS (CAST(LEFT(Emp_Size, CHARINDEX(' ', Emp_Size)) AS INT)) PERSISTED

    PRIMARY KEY CLUSTERED (Industry, Category, EmpRank))

    INSERT INTO #SalesByEmployeeCategory

    (Category, Avg_Sales, Emp_Size, Industry)

    SELECT '01-Widgets',' $100','1 - 4','Accommodations' UNION ALL

    SELECT '01-Widgets',' $75','5 - 9','Accommodations' UNION ALL

    SELECT '01-Widgets',' $175','10 - 19','Accommodations' UNION ALL

    SELECT '01-Widgets',' $225','20 - 49','Accommodations' UNION ALL

    SELECT '01-Widgets',' $200','50 - 99','Accommodations' UNION ALL

    SELECT '02-Furniture',' $100','1 - 4','Manufacturing' UNION ALL

    SELECT '02-Furniture',' $75','5 - 9','Manufacturing' UNION ALL

    SELECT '02-Furniture',' $175','10 - 19','Manufacturing' UNION ALL

    SELECT '02-Furniture',' $225','20 - 49','Manufacturing' UNION ALL

    SELECT '02-Furniture',' $200','50 - 99','Manufacturing' UNION ALL

    SELECT '03-Lighting',' $300','1 - 4','Fixtures' UNION ALL

    SELECT '03-Lighting',' $200','5 - 9','Fixtures' UNION ALL

    SELECT '03-Lighting',' $100','10 - 19','Fixtures' UNION ALL

    SELECT '03-Lighting',' $75','20 - 49','Fixtures' UNION ALL

    SELECT '03-Lighting',' $50','50 - 99','Fixtures'

    DECLARE @IndCat VARCHAR(200) = ''

    ,@LastSales MONEY --= $0

    UPDATE #SalesByEmployeeCategory WITH(TABLOCKX)

    SET

    DesiredOutput=CASE WHEN @LastSales > Avg_Sales AND @IndCat = Industry+Category

    THEN @LastSales ELSE Avg_Sales END

    ,@LastSales=CASE WHEN @IndCat <> Industry+Category THEN 0

    WHEN @LastSales > Avg_Sales THEN @LastSales ELSE Avg_Sales END

    ,@IndCat=Industry+Category

    ,@LastSales=CASE WHEN @IndCat <> Industry+Category THEN 0

    WHEN @LastSales > Avg_Sales THEN @LastSales ELSE Avg_Sales END

    OPTION (MAXDOP 1)

    SELECT *

    FROM #SalesByEmployeeCategory

    DROP TABLE #SalesByEmployeeCategory


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • dwain.c (12/17/2012)


    lmeinke (12/17/2012)


    dwain.c (12/14/2012)


    lmeinke (12/14/2012)


    Steve,

    I was actually replying to dwain c. but I do thank you as well. I was out of the office for a few days and didn't try yours until now. With your code though it does not account for a situation where the record being evaluated is less than the previous which was also less than the previous...if you can follow that:

    Avg_Sales..........YourCode_Outcome.......Desired_Outcome

    832.........................832............................832

    672.........................832............................832

    451.........................672............................832

    Hope this is more clear. I still need to be referring either to the "YourCode_Outcome" column with my logic or continue the logic back to the beginning of that particular window.

    You might try using the Quote button to clarify who you are responding to.

    I'm curious what test data you used with my code that you say doesn't work. I thought I tested for the first of a group. If you post that example along with expected results, I might be able to look into it further and suggest something.

    Dwain (Now I know what the quote button is for),

    Thank you again for your response. The issue you with your code is not the first record it when successively declining values for two or more records. I tried to illustrate with the example above. If you use the '451' in the third row it is looking at the '672' and taking that value as is the larger of the two. It needs to also take into account the preceding record of '932' and use that value. I read an article using the quirky update technique that seems to suggest I could update a variable and hold the greatest value OR populate another column with the greatest value and refer to that column rather than the same column of the preceding row.

    Hope this helps.

    Thank you,

    Lonnie

    I hate leaving loose ends so try this. Bet it is faster on your 25K rows.

    CREATE TABLE #SalesByEmployeeCategory

    (Category VARCHAR(100)

    ,Avg_Sales MONEY

    ,Emp_Size VARCHAR(100)

    ,Industry VARCHAR(100)

    ,DesiredOutput MONEY

    ,EmpRank AS (CAST(LEFT(Emp_Size, CHARINDEX(' ', Emp_Size)) AS INT)) PERSISTED

    PRIMARY KEY CLUSTERED (Industry, Category, EmpRank))

    INSERT INTO #SalesByEmployeeCategory

    (Category, Avg_Sales, Emp_Size, Industry)

    SELECT '01-Widgets',' $100','1 - 4','Accommodations' UNION ALL

    SELECT '01-Widgets',' $75','5 - 9','Accommodations' UNION ALL

    SELECT '01-Widgets',' $175','10 - 19','Accommodations' UNION ALL

    SELECT '01-Widgets',' $225','20 - 49','Accommodations' UNION ALL

    SELECT '01-Widgets',' $200','50 - 99','Accommodations' UNION ALL

    SELECT '02-Furniture',' $100','1 - 4','Manufacturing' UNION ALL

    SELECT '02-Furniture',' $75','5 - 9','Manufacturing' UNION ALL

    SELECT '02-Furniture',' $175','10 - 19','Manufacturing' UNION ALL

    SELECT '02-Furniture',' $225','20 - 49','Manufacturing' UNION ALL

    SELECT '02-Furniture',' $200','50 - 99','Manufacturing' UNION ALL

    SELECT '03-Lighting',' $300','1 - 4','Fixtures' UNION ALL

    SELECT '03-Lighting',' $200','5 - 9','Fixtures' UNION ALL

    SELECT '03-Lighting',' $100','10 - 19','Fixtures' UNION ALL

    SELECT '03-Lighting',' $75','20 - 49','Fixtures' UNION ALL

    SELECT '03-Lighting',' $50','50 - 99','Fixtures'

    DECLARE @IndCat VARCHAR(200) = ''

    ,@LastSales MONEY --= $0

    UPDATE #SalesByEmployeeCategory WITH(TABLOCKX)

    SET

    DesiredOutput=CASE WHEN @LastSales > Avg_Sales AND @IndCat = Industry+Category

    THEN @LastSales ELSE Avg_Sales END

    ,@LastSales=CASE WHEN @IndCat <> Industry+Category THEN 0

    WHEN @LastSales > Avg_Sales THEN @LastSales ELSE Avg_Sales END

    ,@IndCat=Industry+Category

    ,@LastSales=CASE WHEN @IndCat <> Industry+Category THEN 0

    WHEN @LastSales > Avg_Sales THEN @LastSales ELSE Avg_Sales END

    OPTION (MAXDOP 1)

    SELECT *

    FROM #SalesByEmployeeCategory

    DROP TABLE #SalesByEmployeeCategory

    Dwain,

    This is very fast and works well with your data. Unfortunately it does not work with mine. I'm not certain if it has to do with ordering or a primary index but when I run this on my data it assigns the MAX value (within the Category and Industry) to all rows. I will dig into further.

    Thank you,

    Lonnie

  • Dwain,

    As a follow up. There seems to be an issue related to the process order of the data (I know...there is no actual order). When I add an additional column such as a Sales_Rank column with a varchar value of 1-5 your data also takes the highest value for some groups and distributes it across all the rows in that group (Industry+Category).

    CREATE TABLE #SalesByEmployeeCategory

    (Category VARCHAR(100)

    ,Avg_Sales MONEY

    ,Emp_Size VARCHAR(100)

    ,Sales_Rank VARCHAR(100)

    ,Industry VARCHAR(100)

    ,DesiredOutput MONEY

    ,EmpRank AS (CAST(LEFT(Emp_Size, CHARINDEX(' ', Emp_Size)) AS INT)) PERSISTED

    PRIMARY KEY CLUSTERED (Industry, Category, EmpRank))

    INSERT INTO #SalesByEmployeeCategory

    (Category, Avg_Sales, Emp_Size, Sales_Rank, Industry)

    SELECT '01-Widgets',' $100','1 - 4', '4','Accommodations' UNION ALL

    SELECT '03-Lighting',' $75','20 - 49','4','Fixtures' UNION ALL

    SELECT '01-Widgets',' $175','10 - 19','3','Accommodations' UNION ALL

    SELECT '01-Widgets',' $200','50 - 99','2', 'Accommodations' UNION ALL

    SELECT '02-Furniture',' $100','1 - 4','4','Manufacturing' UNION ALL

    SELECT '02-Furniture',' $75','5 - 9','5','Manufacturing' UNION ALL

    SELECT '02-Furniture',' $175','10 - 19','3','Manufacturing' UNION ALL

    SELECT '02-Furniture',' $225','20 - 49','2','Manufacturing' UNION ALL

    SELECT '02-Furniture',' $200','50 - 99','1','Manufacturing' UNION ALL

    SELECT '01-Widgets',' $75','5 - 9','5','Accommodations' UNION ALL

    SELECT '03-Lighting',' $300','1 - 4','1','Fixtures' UNION ALL

    SELECT '03-Lighting',' $200','5 - 9','2','Fixtures' UNION ALL

    SELECT '01-Widgets',' $225','20 - 49','1','Accommodations' UNION ALL

    SELECT '03-Lighting',' $100','10 - 19','3','Fixtures' UNION ALL

    SELECT '03-Lighting',' $50','50 - 99','5','Fixtures'

    DECLARE @IndCat VARCHAR(200) = ''

    ,@LastSales MONEY --= $0

    UPDATE #SalesByEmployeeCategory WITH(TABLOCKX)

    SET

    DesiredOutput=CASE WHEN @LastSales > Avg_Sales AND @IndCat = Industry+Category

    THEN @LastSales ELSE Avg_Sales END

    ,@LastSales=CASE WHEN @IndCat <> Industry+Category THEN 0

    WHEN @LastSales > Avg_Sales THEN @LastSales ELSE Avg_Sales END

    ,@IndCat=Industry+Category

    ,@LastSales=CASE WHEN @IndCat <> Industry+Category THEN 0

    WHEN @LastSales > Avg_Sales THEN @LastSales ELSE Avg_Sales END

    OPTION (MAXDOP 1)

    SELECT *

    FROM #SalesByEmployeeCategory

  • The thing about the Quirky Update is that it's, well... quirky.

    Sometimes you'll need to play with it a might to get the results you expect.

    I'll look at this again but it might be a couple of days because several things have come up (taking my car to the shop this morning for one).


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • Dwain,

    A "Quirky Update" isn't going to work here, as the problem isn't just a matter of retrieving a value from a previous record. It's a matter of finding the maximum value for ALL previous records, so this kind of solution isn't going to do it. Unfortunately, there's really no well-performing way to go about this, and the performance is also going to be heavily dependent on how often the data has non-conforming values, and exactly where they occur. Given that the query I supplied runs in a minute and 40 seconds, there's not a lot of incentive to improve the query until the volume of input records gets larger - or at least large enough to cause a performance problem, anyway. We'd all love to have solutions that are blazingly fast, but sometimes, what's needed is just going to chunk a bit. It's the nature of the beast. That said, however, if you DO find something that runs faster, and that produces the same result, I'd love to see it. I'm always up to learn something new.

    Steve (aka sgmunson) 🙂 🙂 🙂
    Rent Servers for Income (picks and shovels strategy)

  • lmeinke,

    I see where you added the sales_rank column but I don't see how that adversely affects the results. So let's modify the final select slightly.

    CREATE TABLE #SalesByEmployeeCategory

    (Category VARCHAR(100)

    ,Avg_Sales MONEY

    ,Emp_Size VARCHAR(100)

    ,Sales_Rank VARCHAR(100)

    ,Industry VARCHAR(100)

    ,DesiredOutput MONEY

    ,EmpRank AS (CAST(LEFT(Emp_Size, CHARINDEX(' ', Emp_Size)) AS INT)) PERSISTED

    PRIMARY KEY CLUSTERED (Industry, Category, EmpRank))

    INSERT INTO #SalesByEmployeeCategory

    (Category, Avg_Sales, Emp_Size, Sales_Rank, Industry)

    SELECT '01-Widgets',' $100','1 - 4', '4','Accommodations' UNION ALL

    SELECT '03-Lighting',' $75','20 - 49','4','Fixtures' UNION ALL

    SELECT '01-Widgets',' $175','10 - 19','3','Accommodations' UNION ALL

    SELECT '01-Widgets',' $200','50 - 99','2', 'Accommodations' UNION ALL

    SELECT '02-Furniture',' $100','1 - 4','4','Manufacturing' UNION ALL

    SELECT '02-Furniture',' $75','5 - 9','5','Manufacturing' UNION ALL

    SELECT '02-Furniture',' $175','10 - 19','3','Manufacturing' UNION ALL

    SELECT '02-Furniture',' $225','20 - 49','2','Manufacturing' UNION ALL

    SELECT '02-Furniture',' $200','50 - 99','1','Manufacturing' UNION ALL

    SELECT '01-Widgets',' $75','5 - 9','5','Accommodations' UNION ALL

    SELECT '03-Lighting',' $300','1 - 4','1','Fixtures' UNION ALL

    SELECT '03-Lighting',' $200','5 - 9','2','Fixtures' UNION ALL

    SELECT '01-Widgets',' $225','20 - 49','1','Accommodations' UNION ALL

    SELECT '03-Lighting',' $100','10 - 19','3','Fixtures' UNION ALL

    SELECT '03-Lighting',' $50','50 - 99','5','Fixtures'

    DECLARE @IndCat VARCHAR(200) = ''

    ,@LastSales MONEY --= $0

    UPDATE #SalesByEmployeeCategory WITH(TABLOCKX)

    SET

    DesiredOutput=CASE WHEN @LastSales > Avg_Sales AND @IndCat = Industry+Category

    THEN @LastSales ELSE Avg_Sales END

    ,@LastSales=CASE WHEN @IndCat <> Industry+Category THEN 0

    WHEN @LastSales > Avg_Sales THEN @LastSales ELSE Avg_Sales END

    ,@IndCat=Industry+Category

    ,@LastSales=CASE WHEN @IndCat <> Industry+Category THEN 0

    WHEN @LastSales > Avg_Sales THEN @LastSales ELSE Avg_Sales END

    OPTION (MAXDOP 1)

    SELECT n=ROW_NUMBER() OVER (ORDER BY (SELECT NULL)),*

    FROM #SalesByEmployeeCategory

    DROP TABLE #SalesByEmployeeCategory

    This produces the following output:

    n Category Avg_Sales Emp_Size Sales_Rank Industry DesiredOutput EmpRank

    1 01-Widgets 100.00 1 - 4 4 Accommodations 100.00 1

    2 01-Widgets 75.00 5 - 9 5 Accommodations 100.00 5

    3 01-Widgets 175.00 10 - 19 3 Accommodations 175.00 10

    4 01-Widgets 225.00 20 - 49 1 Accommodations 225.00 20

    5 01-Widgets 200.00 50 - 99 2 Accommodations 225.00 50

    6 03-Lighting 300.00 1 - 4 1 Fixtures 300.00 1

    7 03-Lighting 200.00 5 - 9 2 Fixtures 300.00 5

    8 03-Lighting 100.00 10 - 19 3 Fixtures 300.00 10

    9 03-Lighting 75.00 20 - 49 4 Fixtures 300.00 20

    10 03-Lighting 50.00 50 - 99 5 Fixtures 300.00 50

    11 02-Furniture 100.00 1 - 4 4 Manufacturing 100.00 1

    12 02-Furniture 75.00 5 - 9 5 Manufacturing 100.00 5

    13 02-Furniture 175.00 10 - 19 3 Manufacturing 175.00 10

    14 02-Furniture 225.00 20 - 49 2 Manufacturing 225.00 20

    15 02-Furniture 200.00 50 - 99 1 Manufacturing 225.00 50

    Can you please identify by record number (n) which record(s) has the wrong result for the Desired Output column, what it should be and why?

    If none of the records in the above results are wrong (they look correct to me but perhaps I am misunderstanding the question), check the DDL for your actual table (post here if you can):

    1. Does it have exactly the definition shown for the computed (persisted) EmpRank column?

    2. Is there a CLUSTERED INDEX on Industry, Category and EmpRank?

    3. Did you remember to include WITH(TABLOCKX) and OPTION (MAXDOP 1) as I have done?

    4. Are you doing an JOINs or using any other query hints in your actual table UPDATE?


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • sgmunson (12/18/2012)


    Dwain,

    A "Quirky Update" isn't going to work here, as the problem isn't just a matter of retrieving a value from a previous record. It's a matter of finding the maximum value for ALL previous records, so this kind of solution isn't going to do it. Unfortunately, there's really no well-performing way to go about this, and the performance is also going to be heavily dependent on how often the data has non-conforming values, and exactly where they occur. Given that the query I supplied runs in a minute and 40 seconds, there's not a lot of incentive to improve the query until the volume of input records gets larger - or at least large enough to cause a performance problem, anyway. We'd all love to have solutions that are blazingly fast, but sometimes, what's needed is just going to chunk a bit. It's the nature of the beast. That said, however, if you DO find something that runs faster, and that produces the same result, I'd love to see it. I'm always up to learn something new.

    I disagree with you on two points:

    #1

    "A "Quirky Update" isn't going to work here, as the problem isn't just a matter of retrieving a value from a previous record. It's a matter of finding the maximum value for ALL previous records"

    I think that's exactly what my QU is doing.

    #2

    "Given that the query I supplied runs in a minute and 40 seconds, there's not a lot of incentive to improve the query"

    That incentive is always there. We should never be satisfied with something just because it is "good enough." Before you know it we forget about it and the record set grows until the query no longer scales to match and then we have users complaining that the system is slow.

    Now I'm not saying my query is necessarily perfect. It is possible I've misunderstood the desired result in some way. I do believe this problem is a good fit for this solution approach. The QU isn't limited to the prior record. It can retrieve the information off any prior record, given sufficient matching criteria for it to identify exactly what information it needs to retrieve.


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

Viewing 15 posts - 16 through 30 (of 40 total)

You must be logged in to reply to this topic. Login to reply