Comparison Across Columns

  • Hello there. I'm currently wracking my mind over this issue, which is probably stupidly simple, but I think I might be overcomplicating it too much :-P.

    Basically, I have a table that contains some columns with dimensions of an item, like so:

    CREATE TABLE Dimension(

    [Length] float,

    [Width] float,

    [Height] float

    )

    And sample data like so:

    INSERT INTO Dimension([Length], [Width], [Height])

    (SELECT 15.5, 4.2, 3.6

    UNION ALL

    SELECT 4.5, 6.7, 8.4

    UNION ALL

    SELECT 7.1, 9.4, 2.3

    UNION ALL

    SELECT 5.5, 1.7, 5.7

    UNION ALL

    SELECT 7.4, 2.3, 2.1

    )

    Basically, what I need to do is to take the three columns of the table, and compare them against some numbers; specifically, 9, 6.5, and 3. If any given dimension exceeds 9, the item is a failure. If no dimension is less than 3, it's a failure. Anything else is good to go.

    I've been trying some fiddling with a table variable with some UNION ALLs to split the columns into rows, then I tried comparing that way; however, using something like:

    SELECT * WHERE Dimension > MIN(Dimension) AND Dimension < MAX(Dimension)

    Doesn't return anything, even when something meets the criteria. Interestingly, if I use >= and <=, it works perfectly.

    ... Now that I think about it, I could probably just use that approach to accomplish what I'm after, and just use, say, 3.000000001 as the lower bound, and ditto for the upper bound with decimals. Darn. Well, would that be the most elegant way to do it, or is there some better way? I can provide coding for the UNION ALL table variable solution I'm using as well, if needed.

    Thanks for your time!

    - 😀

  • hisakimatama (6/19/2012)Basically, what I need to do is to take the three columns of the table, and compare them against some numbers; specifically, 9, 6.5, and 3. If any given dimension exceeds 9, the item is a failure. If no dimension is less than 3, it's a failure. Anything else is good to go.

    I don't really understand how 6.5 relates to this in the above discussion. To handle:

    1) Fail if any dimension exceeds 9 and

    2) Fail if no dimension is less than 3

    You could do something like this:

    CREATE TABLE #Dimension([Length] float, [Width] float, [Height] float)

    INSERT INTO #Dimension([Length], [Width], [Height])

    SELECT 15.5, 4.2, 3.6

    UNION ALL SELECT 4.5, 6.7, 8.4

    UNION ALL SELECT 7.1, 9.4, 2.3

    UNION ALL SELECT 5.5, 1.7, 5.7

    UNION ALL SELECT 7.4, 2.3, 2.1

    ;WITH CTE AS (

    SELECT [Length], [Width], [Height]

    ,(SELECT MAX(dim)

    FROM (VALUES ([Length]), ([Width]), ([Height])) x(dim)) AS MaxDim

    ,(SELECT MIN(dim)

    FROM (VALUES ([Length]), ([Width]), ([Height])) x(dim)) AS MinDim

    FROM #Dimension)

    SELECT [Length], [Width], [Height], MaxDim, MinDim

    FROM CTE

    WHERE MaxDim < 9 AND MinDim < 3

    DROP TABLE #Dimension


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • Whoops, my mistake :hehe:. Was a bit too frazzled at the time I was typing. One additional clause should be that if two or more values are above 6.5, the item fails. Basically, what I'm doing is seeing if an item can fit into a certain mailing box, and the numbers provided are the dimensions for said box. I think the way I've formulated the check should catch any case where the item won't fit in the box, but I may be wrong... My brain is still full of fuzz. Been programming in C# all day, and I think I'm suffering from SQL withdrawal :blink:

    The CTE solution you provided is very similar to the temp table idea I had, but it seems to have the right formatting that I couldn't quite pin down. I'll experiment with it some when I'm in the office tomorrow and see if I can get it working. Thanks!

    - 😀

  • Understood. A minor modification:

    ;WITH CTE AS (

    SELECT [Length], [Width], [Height]

    ,(SELECT MAX(dim)

    FROM (VALUES ([Length]), ([Width]), ([Height])) x(dim)) AS MaxDim

    ,(SELECT MIN(dim)

    FROM (VALUES ([Length]), ([Width]), ([Height])) x(dim)) AS MinDim

    ,(SELECT CASE WHEN MAX(dim1) > 6.5 AND MAX(dim2) > 6.5 THEN 1 ELSE 0 END

    FROM (VALUES ([Length],[Width]), ([Length], [Height])

    ,([Width], [Height])) x(dim1, dim2)) AS TwoDims

    FROM #Dimension)

    SELECT [Length], [Width], [Height], MaxDim, MinDim, TwoDims

    FROM CTE

    WHERE MaxDim < 9 AND MinDim < 3 AND TwoDims = 0

    BTW. This was lots of fun. Gives me a chance to play around with VALUES sets, which are relatively new to me.

    And it makes a great little workshop on VALUES sets for a SQL training class I'm developing!


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • Great! This seems to work just like I need it to :-D. Just to check my understanding here, VALUES seems to basically create a temp table sort of structure from a list of columns that you provide to it, correct? From there, you're creating two VALUES "tables", and finding the max and min dimensions out of them. After that, you split off dimensions in pairs using VALUES and then check to see if they're both above the 6.5 boundary. After that, the WHERE clause trims out everything that fails the checks, and shows the details of what the VALUES clauses determined. Is that fairly correct?

    Thanks a lot for the help, and for showing me a neat new clause I didn't know about. I think there might be a few things I could use this for instead of messier solutions I had to put together, so I've gained a lot here 😀

    - 😀

  • Your understanding seems to be 100% on the money.

    While I've known about VALUES sets for a long time in theory, I never really had a chance to play with them until a couple of weeks ago when my SQL was upgraded to 2008.

    I wouldn't call them infintely useful but I've definitely found a couple of uses and at least one that outperforms the traditional approach to a noticeable degree.

    Gotta love learning new things - thanks to SSC.com!


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • Hm. Found something that's slightly strange. I've got one item with dimensions 4.5, 7, and 1. This item is oddly being marked as having TwoDims = 1, though I can't quite understand why; if I followed the example correctly, it should be judging that like so:

    4.5, 7 <- 7 is greater than 6.5, but 4.5 isn't

    4.5, 1 <- Neither is greater than 6.5

    7, 1 <- 7 is greater than 6.5, but 1 isn't

    As such, there isn't a case for this item where both values are greater than 6.5, so TwoDims shouldn't be marked with a 1.

    It's happening with a rather small subset of the items I'm examining overall, but as luck would have it, two of today's items have this oddity, so it immediately jumped out at me :hehe:. Any clues on why it's misbehaving in such a strange way?

    - 😀

  • Yes, the problem is that the width is being used for both Dim1 and Dim2. So the max of the values 4.5, 4.5, 7 is 7 and the max of 7, 1, 1 is also 7. I altered Dwain's query by using a CROSS APPLY and changing the expression for the two dims.

    The CROSS APPLY probably saves a little extra computation, because you're only dealing with one derived table instead of four (the three subqueries and the one CTE).

    SELECT [Length], [Width], [Height], MaxDim MinDim, CountLargeDims

    FROM #Dimension

    CROSS APPLY (

    SELECT Max(dim) AS MaxDim

    ,Min(Dim) AS MinDim

    ,COUNT(CASE WHEN Dim >= 6.5 THEN Dim END) AS CountLargeDims

    FROM (VALUES([Length]), ([Width]), ([Height])) AS Dims

    WHERE MaxDim < 9 AND MinDim < 3 AND CountLargeDims > 2

    Drew

    Edited: to correct the ending tag for the /code.

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA

  • Drew hit the nail on the head. My bug, my bad!

    But I can correct it without the CROSS APPLY!

    ;WITH CTE AS (

    SELECT [Length], [Width], [Height]

    ,(SELECT MAX(dim)

    FROM (VALUES ([Length]), ([Width]), ([Height])) x(dim)) AS MaxDim

    ,(SELECT MIN(dim)

    FROM (VALUES ([Length]), ([Width]), ([Height])) x(dim)) AS MinDim

    ,(SELECT COUNT(CASE WHEN dim > 6.5 THEN 1 ELSE NULL END)

    FROM (VALUES ([Length]), ([Width]), ([Height])) x(dim)) AS TwoDims

    FROM #Dimension)

    SELECT [Length], [Width], [Height], MaxDim, MinDim, TwoDims

    FROM CTE

    WHERE MaxDim < 9 AND MinDim < 3 AND TwoDims < 2


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • Ah, that's where I was misunderstanding! I thought it was three sets of pairs, but it's actually two sets of three :-). Makes perfect sense how it wasn't working quite right. Ran both queries and readjusted what I was creating from the data, and everything looks great now. Well, aside from some items that aren't matching to this data because of incorrect data elsewhere in the database, but that's something I'll spend today fixing :-P.

    Thanks again to both of you! I appreciate it quite a bit.

    - 😀

  • dwain.c (6/20/2012)


    Drew hit the nail on the head. My bug, my bad!

    But I can correct it without the CROSS APPLY!

    You can, but I find the CROSS APPLY simpler and more intuitive. I don't have SQL 2008 available at work, so I can't test whether it is also faster.

    Drew

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA

  • drew.allen (6/20/2012)


    dwain.c (6/20/2012)


    Drew hit the nail on the head. My bug, my bad!

    But I can correct it without the CROSS APPLY!

    You can, but I find the CROSS APPLY simpler and more intuitive. I don't have SQL 2008 available at work, so I can't test whether it is also faster.

    Drew

    I do, I can and I'm interested!

    Test harness: 1,000,000 rows

    Note: I had to make a couple of mods to Drew's query to correct syntax and get the row counts equivalent (he was counting a dim at 6.5 as being over 6.5 - not sure which is right).

    SET NOCOUNT ON

    CREATE TABLE #Dimension([Length] float, [Width] float, [Height] float)

    ;WITH Tally (n) AS (

    SELECT TOP 1000000 ROW_NUMBER() OVER (ORDER BY (SELECT NULL))

    FROM sys.all_columns a CROSS JOIN sys.all_columns b

    )

    INSERT INTO #Dimension([Length], [Width], [Height])

    SELECT ROUND(RAND(CHECKSUM(NEWID())) * 3. + 4, 1)

    ,ROUND(RAND(CHECKSUM(NEWID())) * 3. + 4, 1)

    ,ROUND(RAND(CHECKSUM(NEWID())) * 2. + 2, 1)

    FROM Tally

    --SELECT * FROM #Dimension

    PRINT REPLICATE('-', 20) + ' Dwain''s Straight Up VALUES Sets'

    SET STATISTICS IO ON

    SET STATISTICS TIME ON

    ;WITH CTE AS (

    SELECT [Length], [Width], [Height]

    ,(SELECT MAX(dim)

    FROM (VALUES ([Length]), ([Width]), ([Height])) x(dim)) AS MaxDim

    ,(SELECT MIN(dim)

    FROM (VALUES ([Length]), ([Width]), ([Height])) x(dim)) AS MinDim

    ,(SELECT COUNT(CASE WHEN dim > 6.5 THEN 1 ELSE NULL END)

    FROM (VALUES ([Length]), ([Width]), ([Height])) x(dim)) AS TwoDims

    FROM #Dimension)

    SELECT [Length], [Width], [Height], MaxDim, MinDim, TwoDims

    FROM CTE

    WHERE MaxDim < 9 AND MinDim < 3 AND TwoDims < 2

    SET STATISTICS IO OFF

    SET STATISTICS TIME OFF

    PRINT REPLICATE('-', 20) + ' Drew''s CROSS APPLY'

    SET STATISTICS IO ON

    SET STATISTICS TIME ON

    SELECT [Length], [Width], [Height], MaxDim, MinDim, CountLargeDims

    FROM #Dimension

    CROSS APPLY (

    SELECT Max(dim) AS MaxDim

    ,Min(Dim) AS MinDim

    ,COUNT(CASE WHEN Dim > 6.5 THEN Dim END) AS CountLargeDims

    FROM (VALUES([Length]), ([Width]), ([Height])) AS Dims(dim)) x

    WHERE MaxDim < 9 AND MinDim < 3 AND CountLargeDims < 2

    SET STATISTICS IO OFF

    SET STATISTICS TIME OFF

    DROP TABLE #Dimension

    Drum roll please - the results:

    -------------------- Dwain's Straight Up VALUES Sets

    Table '#Dimension__________________________________________________________________________________________________________0000000000D8'. Scan count 5, logical reads 4083, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    SQL Server Execution Times:

    CPU time = 2638 ms, elapsed time = 4461 ms.

    Warning: Null value is eliminated by an aggregate or other SET operation.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 0 ms.

    -------------------- Drew's CROSS APPLY

    Table '#Dimension__________________________________________________________________________________________________________0000000000D8'. Scan count 5, logical reads 4083, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    SQL Server Execution Times:

    CPU time = 1980 ms, elapsed time = 4141 ms.

    Warning: Null value is eliminated by an aggregate or other SET operation.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 0 ms.

    Edit: Ooops! Originally posted results for 100,000 rows and in that case the straight up VALUES sets had a slight edge. Not so at 1,000,000 rows! Looks like Drew edged me out by about 25% (CPU).

    I'm on the fence about whether the CROSS APPLY is more readable. I certainly concur that when you use CROSS APPLY VALUES as an alternate to UNPIVOT, readability is way higher for CROSS APPLY.

    You gotta admit: SSC.com is great for learning, fun and competition!


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • Since there's only three dimensions being tested, I wonder if it's easier to just add computed columns to the base table and do all the calcs inline in one query, something like this:

    ALTER TABLE Dimension

    ADD MaxDim AS CASE

    WHEN Length >= Width AND Length >= Height THEN Length

    WHEN Width >= Length AND Width >= Height THEN Width

    ELSE Height END,

    MinDim AS CASE

    WHEN Length <= Width AND Length <= Height THEN Length

    WHEN Width <= Length AND Width <= Height THEN Width

    ELSE Height END

    GO

    DECLARE @MaxDim float

    DECLARE @MinDim float

    DECLARE @CheckDim float

    SET @MaxDim = 9

    SET @MinDim = 3

    SET @CheckDim = 6.5

    SELECT *

    FROM Dimension

    WHERE

    MaxDim < @MaxDim AND

    MinDim < @MinDim AND

    CASE WHEN Length > @CheckDim THEN 1 ELSE 0 END +

    CASE WHEN Width > @CheckDim THEN 1 ELSE 0 END +

    CASE WHEN Height > @CheckDim THEN 1 ELSE 0 END < 2

    --ORDER BY Length * Width * Height DESC --? list smallest box possible first ?

    SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

  • ScottPletcher (6/21/2012)


    Since there's only three dimensions being tested, I wonder if it's easier to just add computed columns to the base table and do all the calcs inline in one query, something like this:

    ALTER TABLE Dimension

    ADD MaxDim AS CASE

    WHEN Length >= Width AND Length >= Height THEN Length

    WHEN Width >= Length AND Width >= Height THEN Width

    ELSE Height END,

    MinDim AS CASE

    WHEN Length <= Width AND Length <= Height THEN Length

    WHEN Width <= Length AND Width <= Height THEN Width

    ELSE Height END

    GO

    DECLARE @MaxDim float

    DECLARE @MinDim float

    DECLARE @CheckDim float

    SET @MaxDim = 9

    SET @MinDim = 3

    SET @CheckDim = 6.5

    SELECT *

    FROM Dimension

    WHERE

    MaxDim < @MaxDim AND

    MinDim < @MinDim AND

    CASE WHEN Length > @CheckDim THEN 1 ELSE 0 END +

    CASE WHEN Width > @CheckDim THEN 1 ELSE 0 END +

    CASE WHEN Height > @CheckDim THEN 1 ELSE 0 END < 2

    --ORDER BY Length * Width * Height DESC --? list smallest box possible first ?

    Scott - That's cool! And if the OP doesn't want to add computed columns to his table, the computation could be done in a CTE or VIEW as well.


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • That's an interesting solution as well, Scott :-). Unfortunately, I can't add anymore columns to the existing table without causing it to bloat up more. The table in question is a pretty denormalized mess of data, so I don't want to make the problem any worse if I can help it. I was mostly just dragging the three dimension columns out into a temp table as a temporary workspace to keep clutter down. Still, it's a means of solving the problem I hadn't even considered, and it could come in handy in the future as well.

    - 😀

Viewing 15 posts - 1 through 15 (of 20 total)

You must be logged in to reply to this topic. Login to reply