October 14, 2011 at 6:02 am
Given a table like this
declare @Tab1 table
(
id int identity(1,1),
v1 int,
v2 int,
v3 int,
v4 int,
v5 int,
v6 int)
insert @Tab1
(v1,v2,v3,v4,v5,v6)
select 1,2,3,4,5,6
union
select 2,3,1,4,5,6
union
select 3,1,2,4,5,6
union
select 4,5,6,2,3,1
select * from @Tab1 t
I want a decent-performing way of returning for each row ID, the column number containing the minimum of v1 to v6, thus:
1,1
2,3
3,2
4,6
Any ideas?
Thanks.
The absence of evidence is not evidence of absence.
Martin Rees
You can lead a horse to water, but a pencil must be lead.
Stan Laurel
October 14, 2011 at 6:27 am
Phil the only way i could see it was to use a case statement that evaluates each column;
this works, but i don't know how it will perform on large data sets:
also i think we need to use <= for all the value comparisons in case two columns are equal, you take the first column witht he lowest value?
--based on the setup you provided:
SELECT Id,
CASE
WHEN (v1 <= v2) AND (v1 <= v3) AND (v1 <= v4) AND (v1 <= v5) AND (v1 <= v6)
THEN 1
WHEN (v2 <= v1) AND (v2 <= v3) AND (v2 <= v4) AND (v2 <= v5) AND (v2 <= v6)
THEN 2
WHEN (v3 <= v1) AND (v3 <= v2) AND (v3 <= v4) AND (v3 <= v5) AND (v3 <= v6)
THEN 3
WHEN (v4 <= v1) AND (v4 <= v2) AND (v4 <= v3) AND (v4 <= v5) AND (v4 <= v6)
THEN 4
WHEN (v5 <= v1) AND (v5 <= v2) AND (v5 <= v3) AND (v5 <= v4) AND (v5 <= v6)
THEN 5
ELSE 6
END AS ColIndex
FROM @Tab1 t
Lowell
October 14, 2011 at 6:34 am
Something similar to this outperforms the obvious pivot approach for removing the blanks from address elements:
select t.*, x.ColIndex
from @Tab1 t
CROSS APPLY (SELECT TOP 1 ColIndex from (
SELECT ColIndex = 1, ColName = v1 UNION ALL SELECT 2, v2 UNION ALL SELECT 3, v3 UNION ALL SELECT 4, v4 UNION ALL SELECT 5, v5 UNION ALL SELECT 6, v6
) d ORDER BY ColName
) x (ColIndex)
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
October 14, 2011 at 6:42 am
maybe it's just me, but both my solution and Chris's seem to produce the exact same execution plan?, but Chris's is 48% relative to the batch where mine was 17%? so the big case is better?
Lowell
October 14, 2011 at 6:50 am
Lowell (10/14/2011)
maybe it's just me, but both my solution and Chris's seem to produce the exact same execution plan?, but Chris's is 48% relative to the batch where mine was 17%? so the big case is better?
Interesting - here, the plans are different. Your query is a straightup table scan and compute scalar, mine features a sort (and NLIJ) and is 4x "more expensive".
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
October 14, 2011 at 6:53 am
i'm an ID-TEN-T;
i looked at jsu tthe execution plan for the setup table variable...of course they are the same...once i scrolled down, i saw differences.
Lowell
October 14, 2011 at 7:02 am
Lowell (10/14/2011)
i'm an ID-TEN-T;i looked at jsu tthe execution plan for the setup table variable...of course they are the same...once i scrolled down, i saw differences.
Haha. Thanks for the excellent input guys.
The absence of evidence is not evidence of absence.
Martin Rees
You can lead a horse to water, but a pencil must be lead.
Stan Laurel
October 14, 2011 at 7:04 am
Anytime Phil. Test them both, I reckon Lowell's MegaCASE will win hands down.
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
October 14, 2011 at 7:07 am
ChrisM@Work (10/14/2011)
Anytime Phil. Test them both, I reckon Lowell's MegaCASE will win hands down.
Now that you mention it, that was what I was looking at 🙂
I like Chris' solution, because it's something I wouldn't have come up with, so I set up a test with 1,000,000 rows.
IF object_id('tempdb..#Tab1') IS NOT NULL
BEGIN
DROP TABLE #Tab1
END
--1,000,000 Random rows of data
SELECT TOP 1000000 IDENTITY(INT,1,1) AS ID,
v1, v2, v3, v4, v5, v6
INTO #Tab1
FROM (SELECT (ABS(CHECKSUM(NewId())) % 10) + 1 AS v1,
(ABS(CHECKSUM(NewId())) % 10) + 1 AS v2,
(ABS(CHECKSUM(NewId())) % 10) + 1 AS v3,
(ABS(CHECKSUM(NewId())) % 10) + 1 AS v4,
(ABS(CHECKSUM(NewId())) % 10) + 1 AS v5,
(ABS(CHECKSUM(NewId())) % 10) + 1 AS v6
FROM master.dbo.syscolumns sc1, master.dbo.syscolumns sc2, master.dbo.syscolumns sc3) a
WHERE (v1 <> v2 AND v1 <> v3 AND v1 <> v4 AND v1 <> v5 AND v1 <> v6) AND
(v2 <> v3 AND v2 <> v4 AND v2 <> v5 AND v2 <> v6) AND (v3 <> v4 AND v3 <> v5 AND v3 <> v6) AND
(v4 <> v5 AND v4 <> v6) AND (v5 <> v6)
--Add a Primary Key
ALTER TABLE #Tab1
ADD CONSTRAINT Tab1_PK_ID
PRIMARY KEY CLUSTERED (ID) WITH FILLFACTOR = 100
DECLARE @VAR INT
PRINT '========== BASELINE =========='
SET STATISTICS TIME ON
SELECT @VAR = ID FROM #Tab1
SET STATISTICS TIME OFF
PRINT REPLICATE('=',80)
PRINT '========== CROSS JOIN =========='
SET STATISTICS TIME ON
SELECT @VAR = x.ColIndex
FROM #Tab1 t
CROSS APPLY (
SELECT TOP 1 ColIndex
FROM (SELECT ColIndex = 1, ColName = v1 UNION ALL SELECT 2, v2
UNION ALL SELECT 3, v3 UNION ALL SELECT 4, v4
UNION ALL SELECT 5, v5 UNION ALL SELECT 6, v6) d
ORDER BY ColName) x(ColIndex)
SET STATISTICS TIME OFF
PRINT REPLICATE('=',80)
PRINT '========== CASE =========='
SET STATISTICS TIME ON
SELECT @VAR = CASE WHEN (v1 <= v2) AND (v1 <= v3) AND (v1 <= v4) AND (v1 <= v5) AND (v1 <= v6)
THEN 1
WHEN (v2 <= v1) AND (v2 <= v3) AND (v2 <= v4) AND (v2 <= v5) AND (v2 <= v6)
THEN 2
WHEN (v3 <= v1) AND (v3 <= v2) AND (v3 <= v4) AND (v3 <= v5) AND (v3 <= v6)
THEN 3
WHEN (v4 <= v1) AND (v4 <= v2) AND (v4 <= v3) AND (v4 <= v5) AND (v4 <= v6)
THEN 4
WHEN (v5 <= v1) AND (v5 <= v2) AND (v5 <= v3) AND (v5 <= v4) AND (v5 <= v6)
THEN 5
ELSE 6
END
FROM #Tab1 t
SET STATISTICS TIME OFF
PRINT REPLICATE('=',80)
========== BASELINE ==========
SQL Server Execution Times:
CPU time = 140 ms, elapsed time = 153 ms.
================================================================================
========== CROSS JOIN ==========
SQL Server Execution Times:
CPU time = 9937 ms, elapsed time = 9950 ms.
================================================================================
========== CASE ==========
SQL Server Execution Times:
CPU time = 516 ms, elapsed time = 509 ms.
================================================================================
October 14, 2011 at 7:11 am
20x faster! I'm going home...:blush:
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
October 14, 2011 at 9:14 am
Applying Lowell's suggested method to my real-world version of this problem worked splendidly. >300k records updated in less than a second.
Thanks again.
The absence of evidence is not evidence of absence.
Martin Rees
You can lead a horse to water, but a pencil must be lead.
Stan Laurel
October 14, 2011 at 10:30 am
I'm perfectly aware that Lowell provided a nice lightning-fast solution, but...
I couldn't resist!
SELECT id, col = (
SELECT col
FROM (
SELECT id, v1, v2, v3, v4, v5, v6
) AS v
UNPIVOT ( value FOR col IN (v1, v2, v3, v4, v5, v6) ) AS u
WHERE value = (
SELECT MIN(value)
FROM (
SELECT *
FROM (
SELECT id, v1, v2, v3, v4, v5, v6
) AS v1
UNPIVOT ( value FOR col IN (v1, v2, v3, v4, v5, v6) ) AS u
) AS src
)
)
FROM #Tab1 AS T
-- Gianluca Sartori
October 17, 2011 at 1:54 am
Gianluca Sartori (10/14/2011)
I couldn't resist!
Me either 😀
========== BASELINE ==========
SQL Server Execution Times:
CPU time = 203 ms, elapsed time = 205 ms.
================================================================================
========== CROSS JOIN ==========
SQL Server Execution Times:
CPU time = 12558 ms, elapsed time = 12561 ms.
================================================================================
========== CASE ==========
SQL Server Execution Times:
CPU time = 733 ms, elapsed time = 731 ms.
================================================================================
========== UNPIVOT ==========
SQL Server Execution Times:
CPU time = 3088 ms, elapsed time = 829 ms.
================================================================================
October 17, 2011 at 1:59 am
Cadavre (10/17/2011)
Gianluca Sartori (10/14/2011)
I couldn't resist!Me either 😀
========== BASELINE ==========
SQL Server Execution Times:
CPU time = 203 ms, elapsed time = 205 ms.
================================================================================
========== CROSS JOIN ==========
SQL Server Execution Times:
CPU time = 12558 ms, elapsed time = 12561 ms.
================================================================================
========== CASE ==========
SQL Server Execution Times:
CPU time = 733 ms, elapsed time = 731 ms.
================================================================================
========== UNPIVOT ==========
SQL Server Execution Times:
CPU time = 3088 ms, elapsed time = 829 ms.
================================================================================
Et tu, Brute!
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
October 17, 2011 at 2:18 am
Cadavre (10/17/2011)
Gianluca Sartori (10/14/2011)
I couldn't resist!Me either 😀
I already knew it was slow, but it was such a fun problem!
It's something you don't stumble upon very often.
-- Gianluca Sartori
Viewing 15 posts - 1 through 15 (of 25 total)
You must be logged in to reply to this topic. Login to reply