Finding the column containing the minimum value

  • Given a table like this

    declare @Tab1 table

    (

    id int identity(1,1),

    v1 int,

    v2 int,

    v3 int,

    v4 int,

    v5 int,

    v6 int)

    insert @Tab1

    (v1,v2,v3,v4,v5,v6)

    select 1,2,3,4,5,6

    union

    select 2,3,1,4,5,6

    union

    select 3,1,2,4,5,6

    union

    select 4,5,6,2,3,1

    select * from @Tab1 t

    I want a decent-performing way of returning for each row ID, the column number containing the minimum of v1 to v6, thus:

    1,1

    2,3

    3,2

    4,6

    Any ideas?

    Thanks.

    The absence of evidence is not evidence of absence
    - Martin Rees
    The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
    - Phil Parkin

  • Phil the only way i could see it was to use a case statement that evaluates each column;

    this works, but i don't know how it will perform on large data sets:

    also i think we need to use <= for all the value comparisons in case two columns are equal, you take the first column witht he lowest value?

    --based on the setup you provided:

    SELECT Id,

    CASE

    WHEN (v1 <= v2) AND (v1 <= v3) AND (v1 <= v4) AND (v1 <= v5) AND (v1 <= v6)

    THEN 1

    WHEN (v2 <= v1) AND (v2 <= v3) AND (v2 <= v4) AND (v2 <= v5) AND (v2 <= v6)

    THEN 2

    WHEN (v3 <= v1) AND (v3 <= v2) AND (v3 <= v4) AND (v3 <= v5) AND (v3 <= v6)

    THEN 3

    WHEN (v4 <= v1) AND (v4 <= v2) AND (v4 <= v3) AND (v4 <= v5) AND (v4 <= v6)

    THEN 4

    WHEN (v5 <= v1) AND (v5 <= v2) AND (v5 <= v3) AND (v5 <= v4) AND (v5 <= v6)

    THEN 5

    ELSE 6

    END AS ColIndex

    FROM @Tab1 t

    Lowell


    --help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

  • Something similar to this outperforms the obvious pivot approach for removing the blanks from address elements:

    select t.*, x.ColIndex

    from @Tab1 t

    CROSS APPLY (SELECT TOP 1 ColIndex from (

    SELECT ColIndex = 1, ColName = v1 UNION ALL SELECT 2, v2 UNION ALL SELECT 3, v3 UNION ALL SELECT 4, v4 UNION ALL SELECT 5, v5 UNION ALL SELECT 6, v6

    ) d ORDER BY ColName

    ) x (ColIndex)

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • maybe it's just me, but both my solution and Chris's seem to produce the exact same execution plan?, but Chris's is 48% relative to the batch where mine was 17%? so the big case is better?

    Lowell


    --help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

  • Lowell (10/14/2011)


    maybe it's just me, but both my solution and Chris's seem to produce the exact same execution plan?, but Chris's is 48% relative to the batch where mine was 17%? so the big case is better?

    Interesting - here, the plans are different. Your query is a straightup table scan and compute scalar, mine features a sort (and NLIJ) and is 4x "more expensive".

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • i'm an ID-TEN-T;

    i looked at jsu tthe execution plan for the setup table variable...of course they are the same...once i scrolled down, i saw differences.

    Lowell


    --help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

  • Lowell (10/14/2011)


    i'm an ID-TEN-T;

    i looked at jsu tthe execution plan for the setup table variable...of course they are the same...once i scrolled down, i saw differences.

    Haha. Thanks for the excellent input guys.

    The absence of evidence is not evidence of absence
    - Martin Rees
    The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
    - Phil Parkin

  • Anytime Phil. Test them both, I reckon Lowell's MegaCASE will win hands down.

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • ChrisM@Work (10/14/2011)


    Anytime Phil. Test them both, I reckon Lowell's MegaCASE will win hands down.

    Now that you mention it, that was what I was looking at 🙂

    I like Chris' solution, because it's something I wouldn't have come up with, so I set up a test with 1,000,000 rows.

    IF object_id('tempdb..#Tab1') IS NOT NULL

    BEGIN

    DROP TABLE #Tab1

    END

    --1,000,000 Random rows of data

    SELECT TOP 1000000 IDENTITY(INT,1,1) AS ID,

    v1, v2, v3, v4, v5, v6

    INTO #Tab1

    FROM (SELECT (ABS(CHECKSUM(NewId())) % 10) + 1 AS v1,

    (ABS(CHECKSUM(NewId())) % 10) + 1 AS v2,

    (ABS(CHECKSUM(NewId())) % 10) + 1 AS v3,

    (ABS(CHECKSUM(NewId())) % 10) + 1 AS v4,

    (ABS(CHECKSUM(NewId())) % 10) + 1 AS v5,

    (ABS(CHECKSUM(NewId())) % 10) + 1 AS v6

    FROM master.dbo.syscolumns sc1, master.dbo.syscolumns sc2, master.dbo.syscolumns sc3) a

    WHERE (v1 <> v2 AND v1 <> v3 AND v1 <> v4 AND v1 <> v5 AND v1 <> v6) AND

    (v2 <> v3 AND v2 <> v4 AND v2 <> v5 AND v2 <> v6) AND (v3 <> v4 AND v3 <> v5 AND v3 <> v6) AND

    (v4 <> v5 AND v4 <> v6) AND (v5 <> v6)

    --Add a Primary Key

    ALTER TABLE #Tab1

    ADD CONSTRAINT Tab1_PK_ID

    PRIMARY KEY CLUSTERED (ID) WITH FILLFACTOR = 100

    DECLARE @VAR INT

    PRINT '========== BASELINE =========='

    SET STATISTICS TIME ON

    SELECT @VAR = ID FROM #Tab1

    SET STATISTICS TIME OFF

    PRINT REPLICATE('=',80)

    PRINT '========== CROSS JOIN =========='

    SET STATISTICS TIME ON

    SELECT @VAR = x.ColIndex

    FROM #Tab1 t

    CROSS APPLY (

    SELECT TOP 1 ColIndex

    FROM (SELECT ColIndex = 1, ColName = v1 UNION ALL SELECT 2, v2

    UNION ALL SELECT 3, v3 UNION ALL SELECT 4, v4

    UNION ALL SELECT 5, v5 UNION ALL SELECT 6, v6) d

    ORDER BY ColName) x(ColIndex)

    SET STATISTICS TIME OFF

    PRINT REPLICATE('=',80)

    PRINT '========== CASE =========='

    SET STATISTICS TIME ON

    SELECT @VAR = CASE WHEN (v1 <= v2) AND (v1 <= v3) AND (v1 <= v4) AND (v1 <= v5) AND (v1 <= v6)

    THEN 1

    WHEN (v2 <= v1) AND (v2 <= v3) AND (v2 <= v4) AND (v2 <= v5) AND (v2 <= v6)

    THEN 2

    WHEN (v3 <= v1) AND (v3 <= v2) AND (v3 <= v4) AND (v3 <= v5) AND (v3 <= v6)

    THEN 3

    WHEN (v4 <= v1) AND (v4 <= v2) AND (v4 <= v3) AND (v4 <= v5) AND (v4 <= v6)

    THEN 4

    WHEN (v5 <= v1) AND (v5 <= v2) AND (v5 <= v3) AND (v5 <= v4) AND (v5 <= v6)

    THEN 5

    ELSE 6

    END

    FROM #Tab1 t

    SET STATISTICS TIME OFF

    PRINT REPLICATE('=',80)

    ========== BASELINE ==========

    SQL Server Execution Times:

    CPU time = 140 ms, elapsed time = 153 ms.

    ================================================================================

    ========== CROSS JOIN ==========

    SQL Server Execution Times:

    CPU time = 9937 ms, elapsed time = 9950 ms.

    ================================================================================

    ========== CASE ==========

    SQL Server Execution Times:

    CPU time = 516 ms, elapsed time = 509 ms.

    ================================================================================


    Forever trying to learn
    My blog - http://www.cadavre.co.uk/
    For better, quicker answers on T-SQL questions, click on the following...http://www.sqlservercentral.com/articles/Best+Practices/61537/
    For better, quicker answers on SQL Server performance related questions, click on the following...http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

  • 20x faster! I'm going home...:blush:

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Applying Lowell's suggested method to my real-world version of this problem worked splendidly. >300k records updated in less than a second.

    Thanks again.

    The absence of evidence is not evidence of absence
    - Martin Rees
    The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
    - Phil Parkin

  • I'm perfectly aware that Lowell provided a nice lightning-fast solution, but...

    I couldn't resist!

    SELECT id, col = (

    SELECT col

    FROM (

    SELECT id, v1, v2, v3, v4, v5, v6

    ) AS v

    UNPIVOT ( value FOR col IN (v1, v2, v3, v4, v5, v6) ) AS u

    WHERE value = (

    SELECT MIN(value)

    FROM (

    SELECT *

    FROM (

    SELECT id, v1, v2, v3, v4, v5, v6

    ) AS v1

    UNPIVOT ( value FOR col IN (v1, v2, v3, v4, v5, v6) ) AS u

    ) AS src

    )

    )

    FROM #Tab1 AS T

    -- Gianluca Sartori

  • Gianluca Sartori (10/14/2011)


    I couldn't resist!

    Me either 😀

    ========== BASELINE ==========

    SQL Server Execution Times:

    CPU time = 203 ms, elapsed time = 205 ms.

    ================================================================================

    ========== CROSS JOIN ==========

    SQL Server Execution Times:

    CPU time = 12558 ms, elapsed time = 12561 ms.

    ================================================================================

    ========== CASE ==========

    SQL Server Execution Times:

    CPU time = 733 ms, elapsed time = 731 ms.

    ================================================================================

    ========== UNPIVOT ==========

    SQL Server Execution Times:

    CPU time = 3088 ms, elapsed time = 829 ms.

    ================================================================================


    Forever trying to learn
    My blog - http://www.cadavre.co.uk/
    For better, quicker answers on T-SQL questions, click on the following...http://www.sqlservercentral.com/articles/Best+Practices/61537/
    For better, quicker answers on SQL Server performance related questions, click on the following...http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

  • Cadavre (10/17/2011)


    Gianluca Sartori (10/14/2011)


    I couldn't resist!

    Me either 😀

    ========== BASELINE ==========

    SQL Server Execution Times:

    CPU time = 203 ms, elapsed time = 205 ms.

    ================================================================================

    ========== CROSS JOIN ==========

    SQL Server Execution Times:

    CPU time = 12558 ms, elapsed time = 12561 ms.

    ================================================================================

    ========== CASE ==========

    SQL Server Execution Times:

    CPU time = 733 ms, elapsed time = 731 ms.

    ================================================================================

    ========== UNPIVOT ==========

    SQL Server Execution Times:

    CPU time = 3088 ms, elapsed time = 829 ms.

    ================================================================================

    Et tu, Brute!

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Cadavre (10/17/2011)


    Gianluca Sartori (10/14/2011)


    I couldn't resist!

    Me either 😀

    I already knew it was slow, but it was such a fun problem!

    It's something you don't stumble upon very often.

    -- Gianluca Sartori

Viewing 15 posts - 1 through 15 (of 25 total)

You must be logged in to reply to this topic. Login to reply