join question

  • SELECT  a.*

      FROM emp_table a INNER JOIN emp_table b

        ON a.emp_name=b.emp_name AND a.section_id=b.section_id AND a.dept_number=b.dept_number AND a.rowid>b.rowid

    order by 1 desc

    WHEN i run the above query i  am getting 325  records.Actually i have only 25 duplicate records.WHEN i add distinct a.* i as shown below  I am getting the correct result.I have a primary key on the column rowid.

    SELECT  DISTINCT a.*

      FROM emp_table a INNER JOIN emp_table b

        ON a.emp_name=b.emp_name AND a.section_id=b.section_id AND a.dept_number=b.dept_number AND a.rowid>b.rowid

    order by 1 desc

    Whats the problem with my first query. all the columns in the query are of int data types.

    Thanks.

  • I'm sorry emp_name in the above query is a varchar datatype.

    Thanks.

  • Keep in mind that every time a match between a and b occurr you get a row for the match. SO for instance if I have data like this

    A

    James

    Fred

    Bill

    B

    James 1

    James 2

    Fred 1

    Bill 1

    Bill 2

    I would get

    James

    James

    Fred

    Bill

    Bill

    Because both James and Bill have more than one match in B.

    To releive you add DISTINCT to get rid of the duplicates.

  • You might have dirty data. ( especially since You didn't expect duplicates )

    What is the primary key of emp_table ?, and what are the Unique Indexes on emp_table ?.

    Can an employee only "belong" to only one section and one dept ?

    If not maybe the database design might be wrong.

    /rockmoose


    You must unlearn what You have learnt

  • I would guess Antares686 is on track.  I'd try adding the name, dept and id from the "b" to help figure out which records are being joined and trouble shoot.

     

    Another solution is use a group by clause to group on the unique name and show ei

    Select

    a.emp_name,a.section_id, a.dept_number,count(a.rowid),min(a.rowid),max(a.rowid)

    from emp_table a

    GROUP BY a.emp_name,a.section_id, a.dept_number

    HAVING count(a.rowid) > 1

     

    If you are always deleting the newer or older one, you can use the min or max results in the where clause of the delete statement

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply