Select an ID that matches data from mutiple rows

Question

Select an ID that matches data from mutiple rows

Dameeks

SSC Enthusiast

Points: 111
More actions
July 29, 2010 at 11:52 am

#235858

I'm working on a way to retrieve 4 character alpha numeric codes. The codes represent a particular sequence of choices that were made. My first attempt at this is as follows: create two tables 1 that holds the codes and the second holds the sequence the code represents. I'm running in to trouble when trying the retrieve the code from sequence table base on what selections the user has made.
Example:
Codes
10001
20002
30003
40004
50005
Sequences
IDCodeIdSeqDecisionId
1112999
2212999
3223000
4312999
5323000
6333003
7412999
8423000
9433003
10443004
11512999
12523000
13533003
14543004
15553005
What I would want is this. If the user made decision 2999&3000 they will receive code 0002. If they select 2999&3000&30003 they would get code 0003. Anyone suggestions are welcome.
Thanks.

Viewing 13 posts - 1 through 12 (of 12 total)

You must be logged in to reply to this topic. Login to reply

Ben Teraberry SSChampion Points: 11845 More actions · Answer 1

It's unclear from your post what your actual tables are and what your expected output is. Please see the link in my signature for information on how to post questions in such a way that people will be able to help you.

I do believe you're basically looking for the ROW_NUMBER() function, so take a look at that in Books Online.

└> bt

Forum Etiquette: How to post data/code on a forum to get the best help[/url]

Nevyn SSChampion Points: 14210 More actions · Answer 2

Is this in a stored procedure? Can you give us an idea of the input you get sent from the user and the format it is in.

It sounds like you are saying they give you a list of N decision IDs, and you are trying to find the code that has a sequence record for each ID in the list they send. Is that correct?

What format do they send you the list in?

Dameeks SSC Enthusiast Points: 111 More actions · Answer 3

I can be anything: view, stored procedure, scalar function, etc. It doesn't matter to me.

Yes that exactly what I'm trying to do. Return a single code from a list of decision IDs.

Just assume that the individual decision IDs are known before I need to retrieve the code. There is no restriction on the input format. As the user makes selections I would store the decision ID and put them into the format I need.

Nevyn SSChampion Points: 14210 More actions · Answer 4

I'll give this a shot tonight. There are a few ways you could do it.

Can there be only one matching code to a given sequence?

Dameeks SSC Enthusiast Points: 111 More actions · Answer 5

Yes each code would be unique to the set of decisions that were made.

Thanks for your help.

Nevyn SSChampion Points: 14210 More actions · Answer 6

I assume the sequence must also be correct?

For example a user entering 2999,3000 matches code 0002. what if it's 3000, 2999 (same numbers, different sequence) instead?

scott.pletcher SSCertifiable Points: 5958 More actions · Answer 7

I will assume that the input sequence of the DecisionIds does not matter -- as long as all match, it doesn't what order they were in.

I think something like this should do it:

SELECT --add column name here to get final value column from codes table--

FROM (

SELECT CodeId

FROM Sequences

GROUP BY CodeId

HAVING SUM(CASE WHEN DecisionId IN (2999, 3000) --<<-- your list of DecisionIds

THEN 1 ELSE 0 END) = SUM(1)

) AS derived

--add standard code here to join derived.CodeId to codes table to get final result--

You have several methods to incorporate the DecisionId list:

build the statement dynamically, including the chosen values for the DecisionId;

put the DecisionIds in a table and join to it -- easy to mod the code for that, just let me know.

Scott Pletcher, SQL Server MVP 2008-2010

Nevyn SSChampion Points: 14210 More actions · Answer 8

That doesn't quite work, Scott. It will also return any subset matches (with his test data your query returns codes 0001 and 0002)

scott.pletcher SSCertifiable Points: 5958 More actions · Answer 9

OOPS, right, I forgot to add a check against the count of the input values (I didn't test it, since I didn't have test data readily available):

SELECT --add column name here to get final value column from codes table--

FROM (

SELECT CodeId

FROM Sequences

GROUP BY CodeId

HAVING SUM(CASE WHEN DecisionId IN (2999, 3000) --<<-- your list of DecisionIds

THEN 1 ELSE 0 END) = SUM(1)

AND SUM(1) = 2 --<<-- count of the number of DecisionIds in list

) AS derived

--add standard code here to join derived.CodeId to codes table to get final result--

Scott Pletcher, SQL Server MVP 2008-2010

Nevyn SSChampion Points: 14210 More actions · Answer 10

This is what I came up with (and also test data for people to find a better solution):

CREATE TABLE #codes (CodeIDint,

Codechar(4)

)

CREATE TABLE #sequences (SequenceIDint,

CodeIDint,

Seqint,

DecisionIDint)

INSERT INTO #codes (CodeID,Code)

SELECT 1,'0001' UNION ALL

SELECT 2,'0002' UNION ALL

SELECT 3,'0003' UNION ALL

SELECT 4,'0004' UNION ALL

SELECT 5,'0005'

INSERT INTO #sequences (SequenceID,CodeID,Seq,DecisionID)

SELECT 1,1,1,2999 UNION ALL

SELECT 2,2,1,2999 UNION ALL

SELECT 3,2,2,3000 UNION ALL

SELECT 4,3,1,2999 UNION ALL

SELECT 5,3,2,3000 UNION ALL

SELECT 6,3,3,3003 UNION ALL

SELECT 7,4,1,2999 UNION ALL

SELECT 8,4,2,3000 UNION ALL

SELECT 9,4,3,3003 UNION ALL

SELECT 10,4,4,3004 UNION ALL

SELECT 11,5,1,2999 UNION ALL

SELECT 12,5,2,3000 UNION ALL

SELECT 13,5,3,3003 UNION ALL

SELECT 14,5,4,3004 UNION ALL

SELECT 15,5,5,3005

CREATE TABLE #UserInput (Seq int,

DecisionID int)

INSERT INTO #UserInput (Seq, DecisionID)

SELECT 1,2999 UNION ALL

SELECT 2,3000

SELECT der2.Code

FROM (SELECT CodesCount=COUNT(1)

FROM #UserInput) AS der1

INNER JOIN(

SELECT Code = C.code, MatchCount = COUNT(1)

FROM #codes C

INNER JOIN #sequences S ON C.CodeID = S.CodeID

LEFT OUTER JOIN #UserInput U ON S.DecisionID = U.DecisionID

AND S.Seq = U.Seq

GROUP BY C.code) AS der2

ON der1.CodesCount = der2.MatchCount

DROP TABLE #UserInput

DROP TABLE #codes

DROP TABLE #sequences

Note, this is assuming that you've taken the user input for sequence and put it into a temp table called UserInput. If the order they enter decision IDs doesnt matter then Seq doesnt need to be on that table.

Other possible solution if you want a delimited input list and dont want to use a temp table would be to create a derived table denormalizing the DecisionIDs and query it with your whole list as a string. I didnt choose that one as it only works if sequence must match, and you need to be careful with formatting to make it work.

andrewd.smith SSCrazy Eights Points: 8444 More actions · Answer 11

There are several ways of doing this, but here's an old-fashioned method.

I've assumed your table definition looks something like this:

CREATE TABLE Sequences (

ID int NOT NULL PRIMARY KEY,

CodeId int NOT NULL,

Seq int NOT NULL,

DecisionId char(4) NOT NULL

)

GO

I've also taken the liberty to add an index.

CREATE INDEX IDX_Sequences ON Sequences(CodeId, Seq, DecisionId)

GO

Here is your test data in a readily consumable format:

INSERT INTO Sequences(ID, CodeId, Seq, DecisionId)

SELECT 1, 1, 1, '2999' UNION ALL

SELECT 2, 2, 1, '2999' UNION ALL

SELECT 3, 2, 2, '3000' UNION ALL

SELECT 4, 3, 1, '2999' UNION ALL

SELECT 5, 3, 2, '3000' UNION ALL

SELECT 6, 3, 3, '3003' UNION ALL

SELECT 7, 4, 1, '2999' UNION ALL

SELECT 8, 4, 2, '3000' UNION ALL

SELECT 9, 4, 3, '3003' UNION ALL

SELECT 10, 4, 4, '3004' UNION ALL

SELECT 11, 5, 1, '2999' UNION ALL

SELECT 12, 5, 2, '3000' UNION ALL

SELECT 13, 5, 3, '3003' UNION ALL

SELECT 14, 5, 4, '3004' UNION ALL

SELECT 15, 5, 5, '3005'

I have assumed that your input data is strictly a 1-character delimited string containing one or more 4-character descision codes (where the sequence of codes is significant), so that we can efficiently split the string using a Tally table (a table of sequential integers) and populate a temporary table.

In the subsequent SELECT query, both the number of rows in the Sequences table with the selected CodeId column value and the number of rows where the Seq and DecisionId columns match between the Sequences and temporary tables must equal the number of rows in the temporary table.

DECLARE @choices varchar(1000)

DECLARE @n int

SELECT @choices = '2999&3000'

CREATE TABLE #Choice (

Seq int NOT NULL PRIMARY KEY,

DecisionId char(4) NOT NULL

)

INSERT INTO #Choice (Seq, DecisionId)

SELECT T.N, SUBSTRING(@choices, T.N * 5 - 4, 4)

FROM Tally T WHERE (T.N BETWEEN 1 AND (LEN(@choices) + 1) / 5)

SELECT @n = @@ROWCOUNT

SELECT S.CodeId

FROM Sequences S

LEFT OUTER JOIN #Choice C ON (C.Seq = S.Seq AND C.DecisionId = S.DecisionId)

GROUP BY S.CodeId

HAVING (COUNT(C.Seq) = @n AND COUNT(*) = @n)

DROP TABLE #Choice

andrewd.smith SSCrazy Eights Points: 8444 More actions · Answer 12

Nevyn,

Your query needs some fixing up to make it work. The left outer join is not having any filtering effect, so the query is just returning the Code for which the sequence length matches the number of items in the user input.

SELECT der2.Code

FROM ( SELECT CodesCount=COUNT(1)

FROM #UserInput) AS der1

INNER JOIN (

SELECT Code = C.code, MatchCount = COUNT(1)

FROM #codes C

INNER JOIN #sequences S ON C.CodeID = S.CodeID

LEFT OUTER JOIN #UserInput U

ON S.DecisionID = U.DecisionID AND S.Seq = U.Seq

GROUP BY C.code) AS der2

ON der1.CodesCount = der2.MatchCount

To correct this, you could make the following change, which makes the query similar to mine except that I didn't bother including the join to the Codes table.

SELECT der2.Code

FROM ( SELECT CodesCount=COUNT(1)

FROM #UserInput) AS der1

INNER JOIN (

SELECT Code = C.code, CodesCount = COUNT(1), MatchCount = COUNT(U.Seq)

FROM #codes C

INNER JOIN #sequences S ON C.CodeID = S.CodeID

LEFT OUTER JOIN #UserInput U

ON S.DecisionID = U.DecisionID AND S.Seq = U.Seq

GROUP BY C.code) AS der2

ON der1.CodesCount = der2.MatchCount AND der1.CodesCount = der2.CodesCount