February 5, 2015 at 10:36 am
This isn't for a project I am currently working on, but I have been curious about how you could make this comparison:
Say you have a table that has records with numbers sort of like lottery winning numbers, say:
TableWinners
num1, num2, num3, num4, num5, num6
33 52 47 23 17 28
... more records with similar structure.
Then you have another table with chosen numbers, same structure as above, TableGuesses.
How could you do the following comparisons between TableGuesses and TableWinners:
1. Compare a single record in TableGuesses to a single record in TableWinners to get a count of the number of numbers that match (kind of a typical lottery type of thing).
2. Compare a single record in TableGuessess to ALL records in TableWinners to see which record in TableWinners is the closest match to the selected record in TableGuesses.
Does that make sense?
Thanks!
February 5, 2015 at 10:44 am
You would probably unpivot the data in both tables so that you have a key for which set of numbers it belongs to, but each winning/drawn number is its own row.
Then for the first one you join the unpivoted tables and count rows and you have a count of matches.
For the second you do the same steps for all rows of the winning numbers table, grouped by a unique identifier of the drawing with a count, and then order the results by number of matches so that you can get the row(s) with the highest matches.
February 5, 2015 at 10:47 am
Interesting...
I have never unpivoted a table, but I think I understand what it is your are saying. I will read up on unpivot to see how to make it work.
Thanks!
February 5, 2015 at 10:48 am
robert.wiglesworth (2/5/2015)
This isn't for a project I am currently working on, but I have been curious about how you could make this comparison:Say you have a table that has records with numbers sort of like lottery winning numbers, say:
TableWinners
num1, num2, num3, num4, num5, num6
33 52 47 23 17 28
... more records with similar structure.
Then you have another table with chosen numbers, same structure as above, TableGuesses.
How could you do the following comparisons between TableGuesses and TableWinners:
1. Compare a single record in TableGuesses to a single record in TableWinners to get a count of the number of numbers that match (kind of a typical lottery type of thing).
2. Compare a single record in TableGuessess to ALL records in TableWinners to see which record in TableWinners is the closest match to the selected record in TableGuesses.
Does that make sense?
Thanks!
Normalise it out, one row per number instead of one row per column, and it's trivial. Left-join guesses to winners on number, and whatever you want to call a "strip", then aggregate by strip. If you can't change the schema then split your columns into rows using CROSS APPLY VALUES (Dwain Camps) or delimitedsplit2k8 (Jeff Moden).
For better assistance in answering your questions, please read this[/url].
Hidden RBAR: Triangular Joins[/url] / The "Numbers" or "Tally" Table: What it is and how it replaces a loop[/url] Jeff Moden[/url]
February 5, 2015 at 11:01 am
Ok, I see what you are saying...
instead of something like this:
drawdate num1 num2 num3 num4 num5 num6
02/05/2015 55 12 45 33 22 06
Do something like this:
drawdate num
02/05/2015 55
02/05/2015 12
02/05/2015 45
02/05/2015 33
02/05/2015 22
02/05/2015 06
Yes? So that would lead to a much taller but narrow table. Is there a performance issue associated with that?
February 5, 2015 at 11:07 am
Basically you are making the rows into columns, each with the identifier of the original row, and one of the winning numbers.
There is an UNPIVOT sql statement, or you can achieve the same logically using CROSS APPLY and VALUES
Some examples to get you started
CREATE TABLE #WinningNumbers
(WinningNumberID int,
num1 int,
num2 int,
num3 int,
num4 int,
num5 int,
num6 int)
INSERT #WinningNumbers (WinningNumberID,num1,num2,num3,num4,num5,num6)
SELECT 1, 12,25,27,31,40,45
--FIRST UNPIVOT
SELECT WinningNumberID, IndNumber, NumberID
FROM
(SELECT WinningNumberID,num1,num2,num3,num4,num5,num6
FROM #WinningNumbers) p
UNPIVOT
(IndNumber FOR NumberID IN (num1,num2,num3,num4,num5,num6)) AS unpiv
--UNPIVOT USING APPLY
SELECT WinningNumberID,IndNumber,NumberID
FROM #WinningNumbers
CROSS APPLY (VALUES ('num1',num1),('num2',num2),('num3',num3),('num4',num4),('num5',num5),('num6',num6)) AS x(NumberID,IndNumber)
DROP TABLE #WinningNumbers
As for whether there is a performance issue, yes, potentially. The actually apply method of unpivoting is fairly fast, but if you are doing that, and then querying the unpivoted table, then if there is a lot of data it could be slow. Note: the performance issue isn't because you are creating more rows, it is because you are scanning and transforming everything then querying it)
Of course, it could be argued that the numbers should stored in this fashion to begin with.
February 5, 2015 at 11:18 am
Interesting...
I have never unpivoted a table, but I think I understand what it is your are saying. I will read up on unpivot to see how to make it work.
Thanks!
Check out An Alternative (Better?) Method to UNPIVOT (SQL Spackle)[/url]
-- Itzik Ben-Gan 2001
February 5, 2015 at 12:12 pm
Using the unpivot method in Dwain's article I came up with this solution...
(Note that I used 4 numbers for brevity)
USE SSRS_POC
GO
IF OBJECT_ID('tempdb..#tablewinners') IS NOT NULL DROP TABLE #tablewinners;
IF OBJECT_ID('tempdb..#tableguesses') IS NOT NULL DROP TABLE #tableguesses;
CREATE TABLE #tableWinners (nid int primary key, n1 int, n2 int, n3 int, n4 int);
CREATE TABLE #tableguesses (nid int primary key, n1 int, n2 int, n3 int, n4 int);
GO
INSERT #tableWinners VALUES (1,12,19,36,40);
INSERT #tableguesses
VALUES(1,11,12,34,41),--1 match (12)
(2,41,34,11,19),--1 match (19)
(3,40,4,19,12);--3 matches (12,19,40)
WITH
winners AS
( SELECT nid, n, nvalue
FROM #tableWinners
CROSS APPLY(VALUES ('n1',n1),('n2',n2),('n3',n3),('n4',n4)) x(n,nvalue)
),
Guesses AS
( SELECT nid, n, nvalue
FROM #tableGuesses
CROSS APPLY (VALUES ('n1',n1),('n2',n2),('n3',n3),('n4',n4)) x(n,nvalue)
)
SELECT
--TOP 1 WITH TIES
g.nid, count(g.nid) AS matches
FROM winners w
JOIN Guesses g ON w.nvalue = g.nvalue
GROUP BY g.nid
--ORDER BY matches DESC
DROP TABLE #tableWinners
DROP TABLE #tableguesses
This will give you how many numbers match. Uncomment the TOP 1 WITH TIES and ORDER BY clauses and it will return the closest match.
Edit: Code formatting
-- Itzik Ben-Gan 2001
February 5, 2015 at 7:04 pm
Let's see if I can work past my blushing here and help out a little bit.
First off, Alan.B's got a great solution as it knocks off both of your specific requirements with basically a single query.
For obvious reasons, I like the CROSS APPLY VALUES approach to UNPIVOT. 🙂
Since this is an academic question according to the OP, let's look at a few perturbations of the requirement:
Suppose that your table of winners contains 5 or less numbers:
1. After unpivoting you want to find any matches in the guesses table that contain all of those numbers. This is relational division with remainder if there are less than 5 numbers in the table.
2. If your table of winners contains exactly 5 numbers and the winner must be a 100% match against the guess, this is relational division with no remainder.
In either cases of 1 and 2, if we assume that the guesses table can contain more than five numbers, this article covers both:
High Performance Relational Division [/url]
3. If your table of winners must be matched by the sequence of occurrence, that's a special case of relational division that Peter Larsson aptly named "ordered relational division."
Identifying a Subsequence in a Sequence, Part 3
4. Suppose you introduce a requirement that in the case of a tie (same count of matches), the "best match" is the one whose difference in the remaining numbers is minimized.
Now that latter is an interesting problem! Since you didn't state it as a requirement I won't try to solve it right now, but don't be surprised if someday you see an article on it!
My thought question: Have you ever been told that your query runs too fast?
My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.
Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St
July 21, 2022 at 12:18 pm
This was removed by the editor as SPAM
July 21, 2022 at 4:34 pm
Hey there!
Each winning/drawn number has its own row, however you would probably unpivot the data in both tables so that you have a key for which group of numbers it belongs to. Then you connect the unpivoted tables for the first one, count the rows, and then you get a count of matches. In order to retrieve the row(s) with the most matches, you need to perform the same processes for each row of the winning numbers table for the second drawing, grouping them by a counted unique drawing identification.
That's basically what most of the posts and a lot of the code on this thread suggests. Have you some alternative code that you'd like to share?
--Jeff Moden
Change is inevitable... Change for the better is not.
July 22, 2022 at 12:57 pm
This was removed by the editor as SPAM
July 27, 2022 at 7:39 pm
This was removed by the editor as SPAM
Viewing 13 posts - 1 through 12 (of 12 total)
You must be logged in to reply to this topic. Login to reply