Select an ID that matches data from mutiple rows

  • I'm working on a way to retrieve 4 character alpha numeric codes. The codes represent a particular sequence of choices that were made. My first attempt at this is as follows: create two tables 1 that holds the codes and the second holds the sequence the code represents. I'm running in to trouble when trying the retrieve the code from sequence table base on what selections the user has made.

    Example:

    Codes

    10001

    20002

    30003

    40004

    50005

    Sequences

    IDCodeIdSeqDecisionId

    1112999

    2212999

    3223000

    4312999

    5323000

    6333003

    7412999

    8423000

    9433003

    10443004

    11512999

    12523000

    13533003

    14543004

    15553005

    What I would want is this. If the user made decision 2999&3000 they will receive code 0002. If they select 2999&3000&30003 they would get code 0003. Anyone suggestions are welcome.

    Thanks.

  • It's unclear from your post what your actual tables are and what your expected output is. Please see the link in my signature for information on how to post questions in such a way that people will be able to help you.

    I do believe you're basically looking for the ROW_NUMBER() function, so take a look at that in Books Online.

    └> bt



    Forum Etiquette: How to post data/code on a forum to get the best help[/url]

  • Is this in a stored procedure? Can you give us an idea of the input you get sent from the user and the format it is in.

    It sounds like you are saying they give you a list of N decision IDs, and you are trying to find the code that has a sequence record for each ID in the list they send. Is that correct?

    What format do they send you the list in?

  • I can be anything: view, stored procedure, scalar function, etc. It doesn't matter to me.

    Yes that exactly what I'm trying to do. Return a single code from a list of decision IDs.

    Just assume that the individual decision IDs are known before I need to retrieve the code. There is no restriction on the input format. As the user makes selections I would store the decision ID and put them into the format I need.

  • I'll give this a shot tonight. There are a few ways you could do it.

    Can there be only one matching code to a given sequence?

  • Yes each code would be unique to the set of decisions that were made.

    Thanks for your help.

  • I assume the sequence must also be correct?

    For example a user entering 2999,3000 matches code 0002. what if it's 3000, 2999 (same numbers, different sequence) instead?

  • I will assume that the input sequence of the DecisionIds does not matter -- as long as all match, it doesn't what order they were in.

    I think something like this should do it:

    SELECT --add column name here to get final value column from codes table--

    FROM (

    SELECT CodeId

    FROM Sequences

    GROUP BY CodeId

    HAVING SUM(CASE WHEN DecisionId IN (2999, 3000) --<<-- your list of DecisionIds

    THEN 1 ELSE 0 END) = SUM(1)

    ) AS derived

    --add standard code here to join derived.CodeId to codes table to get final result--

    You have several methods to incorporate the DecisionId list:

    build the statement dynamically, including the chosen values for the DecisionId;

    put the DecisionIds in a table and join to it -- easy to mod the code for that, just let me know.

    Scott Pletcher, SQL Server MVP 2008-2010

  • That doesn't quite work, Scott. It will also return any subset matches (with his test data your query returns codes 0001 and 0002)

  • OOPS, right, I forgot to add a check against the count of the input values (I didn't test it, since I didn't have test data readily available):

    SELECT --add column name here to get final value column from codes table--

    FROM (

    SELECT CodeId

    FROM Sequences

    GROUP BY CodeId

    HAVING SUM(CASE WHEN DecisionId IN (2999, 3000) --<<-- your list of DecisionIds

    THEN 1 ELSE 0 END) = SUM(1)

    AND SUM(1) = 2 --<<-- count of the number of DecisionIds in list

    ) AS derived

    --add standard code here to join derived.CodeId to codes table to get final result--

    Scott Pletcher, SQL Server MVP 2008-2010

  • This is what I came up with (and also test data for people to find a better solution):

    CREATE TABLE #codes (CodeIDint,

    Codechar(4)

    )

    CREATE TABLE #sequences (SequenceIDint,

    CodeIDint,

    Seqint,

    DecisionIDint)

    INSERT INTO #codes (CodeID,Code)

    SELECT 1,'0001' UNION ALL

    SELECT 2,'0002' UNION ALL

    SELECT 3,'0003' UNION ALL

    SELECT 4,'0004' UNION ALL

    SELECT 5,'0005'

    INSERT INTO #sequences (SequenceID,CodeID,Seq,DecisionID)

    SELECT 1,1,1,2999 UNION ALL

    SELECT 2,2,1,2999 UNION ALL

    SELECT 3,2,2,3000 UNION ALL

    SELECT 4,3,1,2999 UNION ALL

    SELECT 5,3,2,3000 UNION ALL

    SELECT 6,3,3,3003 UNION ALL

    SELECT 7,4,1,2999 UNION ALL

    SELECT 8,4,2,3000 UNION ALL

    SELECT 9,4,3,3003 UNION ALL

    SELECT 10,4,4,3004 UNION ALL

    SELECT 11,5,1,2999 UNION ALL

    SELECT 12,5,2,3000 UNION ALL

    SELECT 13,5,3,3003 UNION ALL

    SELECT 14,5,4,3004 UNION ALL

    SELECT 15,5,5,3005

    CREATE TABLE #UserInput (Seq int,

    DecisionID int)

    INSERT INTO #UserInput (Seq, DecisionID)

    SELECT 1,2999 UNION ALL

    SELECT 2,3000

    SELECT der2.Code

    FROM (SELECT CodesCount=COUNT(1)

    FROM #UserInput) AS der1

    INNER JOIN(

    SELECT Code = C.code, MatchCount = COUNT(1)

    FROM #codes C

    INNER JOIN #sequences S ON C.CodeID = S.CodeID

    LEFT OUTER JOIN #UserInput U ON S.DecisionID = U.DecisionID

    AND S.Seq = U.Seq

    GROUP BY C.code) AS der2

    ON der1.CodesCount = der2.MatchCount

    DROP TABLE #UserInput

    DROP TABLE #codes

    DROP TABLE #sequences

    Note, this is assuming that you've taken the user input for sequence and put it into a temp table called UserInput. If the order they enter decision IDs doesnt matter then Seq doesnt need to be on that table.

    Other possible solution if you want a delimited input list and dont want to use a temp table would be to create a derived table denormalizing the DecisionIDs and query it with your whole list as a string. I didnt choose that one as it only works if sequence must match, and you need to be careful with formatting to make it work.

  • There are several ways of doing this, but here's an old-fashioned method.

    I've assumed your table definition looks something like this:

    CREATE TABLE Sequences (

    ID int NOT NULL PRIMARY KEY,

    CodeId int NOT NULL,

    Seq int NOT NULL,

    DecisionId char(4) NOT NULL

    )

    GO

    I've also taken the liberty to add an index.

    CREATE INDEX IDX_Sequences ON Sequences(CodeId, Seq, DecisionId)

    GO

    Here is your test data in a readily consumable format:

    INSERT INTO Sequences(ID, CodeId, Seq, DecisionId)

    SELECT 1, 1, 1, '2999' UNION ALL

    SELECT 2, 2, 1, '2999' UNION ALL

    SELECT 3, 2, 2, '3000' UNION ALL

    SELECT 4, 3, 1, '2999' UNION ALL

    SELECT 5, 3, 2, '3000' UNION ALL

    SELECT 6, 3, 3, '3003' UNION ALL

    SELECT 7, 4, 1, '2999' UNION ALL

    SELECT 8, 4, 2, '3000' UNION ALL

    SELECT 9, 4, 3, '3003' UNION ALL

    SELECT 10, 4, 4, '3004' UNION ALL

    SELECT 11, 5, 1, '2999' UNION ALL

    SELECT 12, 5, 2, '3000' UNION ALL

    SELECT 13, 5, 3, '3003' UNION ALL

    SELECT 14, 5, 4, '3004' UNION ALL

    SELECT 15, 5, 5, '3005'

    I have assumed that your input data is strictly a 1-character delimited string containing one or more 4-character descision codes (where the sequence of codes is significant), so that we can efficiently split the string using a Tally table (a table of sequential integers) and populate a temporary table.

    In the subsequent SELECT query, both the number of rows in the Sequences table with the selected CodeId column value and the number of rows where the Seq and DecisionId columns match between the Sequences and temporary tables must equal the number of rows in the temporary table.

    DECLARE @choices varchar(1000)

    DECLARE @n int

    SELECT @choices = '2999&3000'

    CREATE TABLE #Choice (

    Seq int NOT NULL PRIMARY KEY,

    DecisionId char(4) NOT NULL

    )

    INSERT INTO #Choice (Seq, DecisionId)

    SELECT T.N, SUBSTRING(@choices, T.N * 5 - 4, 4)

    FROM Tally T WHERE (T.N BETWEEN 1 AND (LEN(@choices) + 1) / 5)

    SELECT @n = @@ROWCOUNT

    SELECT S.CodeId

    FROM Sequences S

    LEFT OUTER JOIN #Choice C ON (C.Seq = S.Seq AND C.DecisionId = S.DecisionId)

    GROUP BY S.CodeId

    HAVING (COUNT(C.Seq) = @n AND COUNT(*) = @n)

    DROP TABLE #Choice

  • Nevyn,

    Your query needs some fixing up to make it work. The left outer join is not having any filtering effect, so the query is just returning the Code for which the sequence length matches the number of items in the user input.

    SELECT der2.Code

    FROM ( SELECT CodesCount=COUNT(1)

    FROM #UserInput) AS der1

    INNER JOIN (

    SELECT Code = C.code, MatchCount = COUNT(1)

    FROM #codes C

    INNER JOIN #sequences S ON C.CodeID = S.CodeID

    LEFT OUTER JOIN #UserInput U

    ON S.DecisionID = U.DecisionID AND S.Seq = U.Seq

    GROUP BY C.code) AS der2

    ON der1.CodesCount = der2.MatchCount

    To correct this, you could make the following change, which makes the query similar to mine except that I didn't bother including the join to the Codes table.

    SELECT der2.Code

    FROM ( SELECT CodesCount=COUNT(1)

    FROM #UserInput) AS der1

    INNER JOIN (

    SELECT Code = C.code, CodesCount = COUNT(1), MatchCount = COUNT(U.Seq)

    FROM #codes C

    INNER JOIN #sequences S ON C.CodeID = S.CodeID

    LEFT OUTER JOIN #UserInput U

    ON S.DecisionID = U.DecisionID AND S.Seq = U.Seq

    GROUP BY C.code) AS der2

    ON der1.CodesCount = der2.MatchCount AND der1.CodesCount = der2.CodesCount

Viewing 13 posts - 1 through 12 (of 12 total)

You must be logged in to reply to this topic. Login to reply