eliminating dupes & getting more columns with aggregate

  • Ok, I can't seem to work this out. I have a table with some bad data in it. The primary key is made of two columns, fK_id_1 and fK_id_2. The trouble is, I need to choose from a couple of records that have the 'primary' column marked 'y', and choose the one with the most recent date. This query gets me the records I want:

    SELECT [fK_id_1], MAX([date])
    FROM dbo.
    WHERE [primary]='Y' GROUP BY [fK_id_1]

    But, I want to return the fK_id_2 of those records. If I throw that column into the query I get too many records. How can I get the columns I need?

  •      SELECT t1.*

         FROM dbo.

    t1

         JOIN

         ( SELECT [fK_id_1], MAX([date]) MaxDate

           FROM dbo.

           WHERE [primary]='Y'

           GROUP BY [fK_id_1] ) dt

         ON t1.fk_id_1 = dt.fk_id_1 and t1.[date] = dt.MaxDate


    * Noel

  • Yes, that's one I tried, too. The trouble is, the one key and date aren't unique; when I join on them, I return too many rows. And, if I then match the outer table on 'primary' as well (see below), I get too few.

    SELECT t1.*
    FROM dbo.
    t1 JOIN( SELECT [fK_id_1], MAX([date]) MaxDate FROM dbo.
    WHERE [primary]='Y' GROUP BY [fK_id_1] ) dt ON t1.fk_id_1 = dt.fk_id_1 AND t1.[date] = dt.MaxDate AND t1.[primary]='Y'
  • Totally untested...

         SELECT t1.*

         FROM dbo.

    t1

         JOIN

         ( SELECT [fK_id_1], [fK_id_2], MAX([date]) MaxDate

           FROM dbo.

           WHERE [primary]='Y'

           GROUP BY [fK_id_1], [fK_id_2]&nbsp dt

         ON t1.fk_id_1 = dt.fk_id_1 and t1.[date] = dt.MaxDate

    and t1.fk_id_2 = dt.fk_id_2

  • Sorry, I think you should supply more information - best would be DDL and sample data, or at least description of possibilities. We can hardly start looking for a reliable solution if we don't know at least this:

    - what datatype is "Date" column? 

    - does it include time part or is it date only?

    - if several rows have the same ID_1 and date, but different ID_2, which of the rows is "correct" - seems you only want to return one row in such cases?

  • select t1.fk_id_1, t1.fk_id_2, t1.[date]

    from dbo.

    t1

    where [primary] = 'Y'

     and t1.[date] = (select max(t2.[date])

       from dbo.

    t2

       where t1.fk_id_1 = t2.fk_id_1

        and t1.fk_id_2 = t2.fk_id_2)

     

  • Vladan, When most people call a column 'date', it's usually a date datatype. In this case, it appears I don't have time, but the logic should work either way. Yes, I only want one record, which is why I'm trying to get the MAX date.

    Ken, yours isn't eliminating dupes.

    I've discovered part of my troubles. Seems I have a bunch of fK2's that all have the same date. Alas, what I want is only one in each case of fK1, but I don't know if that can be done.

    Here's sample DDL for everyone to work with:

    DECLARE @tmpTable TABLE (
    [fK_1] tinyint,
    [fK_2] tinyint,
    [date] smalldatetime,
    [primary] char(1)
    )
    INSERT @tmpTable VALUES (1,1,'6/30/2005','n')
    INSERT @tmpTable VALUES (1,3,'6/30/2005','y')
    INSERT @tmpTable VALUES (2,2,'7/1/2005','y')
    INSERT @tmpTable VALUES (3,3,'6/30/2005','n')
    INSERT @tmpTable VALUES (3,5,'7/4/2005','y') --this one
    INSERT @tmpTable VALUES (3,6,'7/3/2005','y')
    INSERT @tmpTable VALUES (4,7,'7/9/2005','n')
    INSERT @tmpTable VALUES (5,7,'7/7/2005','y')
    INSERT @tmpTable VALUES (5,8,'7/8/2005','n')
    INSERT @tmpTable VALUES (5,9,'7/8/2005','y') -- this one
    INSERT @tmpTable VALUES (7,1,'7/31/2005','n')
    INSERT @tmpTable VALUES (7,2,'8/1/2005','y')
    INSERT @tmpTable VALUES (6,6,'8/1/2005','y') -- do these
    INSERT @tmpTable VALUES (6,7,'8/1/2005','y') -- three also
    INSERT @tmpTable VALUES (6,8,'8/1/2005','y') -- get selected?
    
    /*shows the number of records I actually want back, and how many dupes are
    connected to each (that should be filtered out)*/
    SELECT [fK_1],COUNT([fK_1])
    FROM @tmpTable
    WHERE [primary]='y'
    GROUP BY [fK_1]
    
    /*shows the records that have dupes and the MAX date that I'm depending on*/
    SELECT [fK_1],MAX([date]) AS [DATE],COUNT([fK_1])
    FROM @tmpTable
    WHERE [primary]='y'
    GROUP BY [fK_1]
    HAVING COUNT([primary])>1
    
    /*shows best effort so far -- still have to deal with #6*/
    SELECT A.[fK_1],A.[fK_2],A.[date] FROM @tmpTable A JOIN (
    SELECT [fK_1],MAX([date]) AS [DATE]
    FROM @tmpTable
    WHERE [primary]='y'
    GROUP BY [fK_1]
    ) B ON A.[fK_1]=B.[fK_1] AND A.[DATE]=B.[DATE] AND A.[primary]='y'
    ORDER BY A.[fK_1]
    
  • Not quite sure what you are trying to do but this should produce the result you want:

    select t1.fk_1, t1.fk_2, t1.[date]

    from @tmpTable t1

    where t1.[primary] = 'Y'

     and t1.[date] = (select max(t2.[date])

       from @tmpTable t2

       where t2.[primary] = 'Y'

     and t1.fk_1 = t2.fk_1)

     and t1.fk_2 = (select max(t3.fk_2)

       from @tmpTable t3

       where t3.[primary] = 'Y'

     and t1.fk_1 = t3.fk_1

     and t1.[date] = t3.[date])

    ORDER BY t1.[fK_1]

     

  • I have to disagree, of course one would suppose so, but unfortunately I've seen many questions on these forums where only after several attempts to solve the problem we were informed, that "startdate" is stored as varchar. Thanks for the DDL, it will be much easier to attempt anything

    Now, the main question remains - which one of the several rows should be returned? Following 3 rows from your example have the same fK_1 and same date, but different fK_2. Which of these rows do you want in the result and why?

    INSERT @tmpTable VALUES (6,6,'8/1/2005','y')

    INSERT @tmpTable VALUES (6,7,'8/1/2005','y')

    INSERT @tmpTable VALUES (6,8,'8/1/2005','y')

  • end-user - did you have an answer to Vladan's question? You need to decide which 1 of those 3 rows you want.

    If you can tell us, then it's easy for us to give you what you need...

    Ryan Randall

    Solutions are easy. Understanding the problem, now, that's the hard part.

Viewing 10 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply