Selecting entries with non-distinct ID

Question

Post reply

Selecting entries with non-distinct ID

Pam Brisjar

SSChampion

Points: 12094
More actions
March 23, 2011 at 1:44 pm

#236720

I'm thinking there's a better way to write this query. Any ideas?
IF OBJECT_ID('TempDB..#persons', 'U') IS NOT NULL
DROP TABLE #persons
GO
CREATE TABLE #persons
(
person_identity INT ,
person_id CHAR(10) ,
person_data VARCHAR(20)
)
INSERT INTO #persons
( person_identity ,
person_id ,
person_data
)
SELECT 1 ,
'123' ,
'abcd'
UNION ALL
SELECT 1 ,
'123' ,
'xyz'
UNION ALL
SELECT 2 ,
'987' ,
'xyz'
UNION ALL
SELECT 2 ,
'545' ,
'nnn'
UNION ALL
SELECT 3 ,
'MTN' ,
'nothing'
UNION ALL
SELECT 3 ,
'269' ,
'nbdfjkl'
UNION ALL
SELECT 3 ,
'76489' ,
NULL
UNION ALL
SELECT 4 ,
'8B937' ,
'hello' ;
WITH PersonDistinct
AS ( SELECT pd.person_identity ,
ROW_NUMBER() OVER ( PARTITION BY pd.person_identity ORDER BY pd.person_id ) AS RowNum
FROM ( SELECT DISTINCT
person_identity ,
person_id
FROM #persons AS p WITH ( NOLOCK )
) AS pd
)
SELECT person_identity ,
RowNum
FROM PersonDistinct
WHERE RowNum = 2
ORDER BY person_identity ,
RowNum
Person_identity is an FK to another table existing in a 1-many relationship with this table. Each person_identity should have only one person_id value associated with it. The difference between rows is the additional data (obviously much more/different than what I have posted in the sample - no realistic way to use these for testing anything here). So I need to identify the person_identities that have more than one person_id associated with them. From the above sample, these are the results I want:
person_identityRowNum
22
32
The query I've written works, but I can't help but suspect there's a better way to write this.
(and no, I did NOT design this schema...)

Viewing 4 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply

WayneS SSC Guru Points: 95461 More actions · Answer 1

Pam,

Actually, I think you wrote a pretty good query.

Wayne
Microsoft Certified Master: SQL Server 2008
Author - SQL Server T-SQL Recipes

If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!

Links:
For better assistance in answering your questions
Performance Problems
Common date/time routines
Understanding and Using APPLY Part 1 & Part 2

Maxim Picard SSCrazy Points: 2052 More actions · Answer 2

I like to write it this way:

SELECT

person_identity,

COUNT(*) AS Num

FROM

(

SELECT

person_identity,

person_id

FROM

#persons

GROUP BY

person_identity,

person_id

) AS PersonDistinct

GROUP BY

person_identity

HAVING

COUNT(*) > 1;

But it does not seem to perform any better, query plans are almost the same.

Pam Brisjar SSChampion Points: 12094 More actions · Answer 3

Thanks, Wayne. I keep wanting simpler but I don't think it's possible. Mostly wanted a sanity check.

Maxim, I thought about going that way as well but I like playing with the new toys :). Good to know they're performing about the same. Makes sense.