December 9, 2005 at 8:36 am
I'm working to find a way to verify in my master student table that each student in a table has only one student ID number. My table has studentnumber, which could be in the table more than once if the student is enrolled at more than one school ... so I'll need (I'm thinking) to use their lastname, firstname and birthdate fields to identify the duplicates, and then identify among the duplicates found within that set instances were the identical records (based on lastname, firstname, birthdate) have differing values in the student ID number.
The overall goal is to identify a student who managed to get two or more student ID numbers assigned to his or her name. I've tried adapting the ACCESS find duplicates query, which will find the duplicates based on lastname, firstname and birthdate, but what it won't do is identify the cases among this group of duplicates where a set of matching records has a different student ID.
Here's some sample data... The records I need to identify are the first two, where the student ID is different, but the FN,LN and BD are the same.
Student ID | FirstName | LastName | Birthdate |
1221144 | Joe | Blow22 | 11/25/1990 |
1221145 | Joe | Blow22 | 11/25/1990 |
1221146 | John | Brown | 2/2/1988 |
1221147 | Sean | West | 2/1/1989 |
1221148 | Eliah | Williams | 4/5/1999 |
A brain buster for me... I'm hoping you have a thought!
Thanks in advance for your help!
December 9, 2005 at 9:05 am
Try this. I have assumed that birthdate is stored as datetime. If it is a varchar, then be careful to check for differences in regional dates ( 1/2/1989 and 2/1/1989 )
select *
from student
where firstname + ' ' + lastname + ' - ' + convert(varchar(10), birthdate, 103) in
(select firstname + ' ' + lastname + ' - ' + convert(varchar(10), birthdate, 103)
from student
group by firstname, lastname, dob
having count(firstname + ' ' + lastname + ' - ' + convert(varchar(10), birthdate, 103)) > 1)
December 9, 2005 at 9:51 am
I think dc's approach is the best one.
December 9, 2005 at 10:48 am
"best one" - from all these numerous posts that're vying with each other to provide the solution..?!?!
**ASCII stupid question, get a stupid ANSI !!!**
December 9, 2005 at 11:51 am
I wouldn't do a string cocatenation for this. This will work for this as well.
SET NOCOUNT ON
DECLARE @Student TABLE
(
StudentID CHAR(7),
FirstName VARCHAR(50),
LastName VARCHAR(50),
BirthDate DATETIME
)
INSERT @Student
SELECT '1221144', 'Joe', 'Blow22', '11/25/1990' UNION
SELECT '1221145', 'Joe', 'Blow22', '11/25/1990' UNION
SELECT '1221146', 'John', 'Brown', '2/2/1988' UNION
SELECT '1221147', 'Sean', 'West', '2/1/1989' UNION
SELECT '1221148', 'Eliah', 'Williams', '4/5/1999'
SELECT *
FROM
@Student A
JOIN
(
SELECT FirstName, LastName, BirthDate
FROM
@Student
GROUP BY FirstName, LastName, BirthDate
HAVING COUNT(StudentID) > 1) B
ON
A.FirstName = B.FirstName
AND A.LastName = B.LastName
AND A.BirthDate = B.BirthDate
Regards,
gova
December 9, 2005 at 11:59 am
Very nice govinn - "I think govinn's approach is the best one." - since there's no concatenation involved.
Hey - the weekend bug's gotten me...
**ASCII stupid question, get a stupid ANSI !!!**
December 9, 2005 at 12:01 pm
Can we say one of the best.
Regards,
gova
December 9, 2005 at 12:04 pm
Nope - since there're only 2 responses - okay - I'll compromise & rephrase...yours is the better one since it avoids concatenation...
**ASCII stupid question, get a stupid ANSI !!!**
December 9, 2005 at 12:54 pm
The result this one produces is the same as I got with the derive Access find duplicates code...... the problem being, it is finding the duplicates based upon same FN,LN,BD, but it's not limiting the result to those among this group who have two or more different student iD numbers, which is what I'm after... ... What I'm seeing in the result are all instances of this student record in the table with multiple instances of the same student ID number included in the returned set.
Thanks!
December 9, 2005 at 1:06 pm
Thanks the the reply..
This query is returning all duplicates based on the FN,LN,Bd whether or not the records returned have the same PERMNUM (student ID) ... the same as I get with my query... but what I need is only the group of records with duplicate FN,LN,BD with different Student ID numbers... so I can find those students who managed to get two different ID numbers...
Thanks!
December 9, 2005 at 1:20 pm
Replace table/column names with the appropriate names for your table.
Select Distinct A.*
From YourTable As A
Where Exists (
Select *
From YourTable As B
Where A.FN = B.FN
And A.LN = B.FN
And A.BD = B.BD
And A.StudentID <> B.StudentID
)
December 9, 2005 at 4:26 pm
Thanks for your (and everyone else here's) expertise and kindness in sharing your knowledge. I am in your debt.
December 12, 2005 at 1:45 am
I was a little puzzled why Govinn's query didn't work. Then I realised that this could be because your table has more columns than you have stated above - a unique id, or something else that distinguishes records with the same FirstName, LastName, BirthDate and StudentID. Try this modification:
SELECT *
FROM
@Student A
JOIN
(
SELECT FirstName, LastName, BirthDate
FROM
@Student
GROUP BY FirstName, LastName, BirthDate
HAVING COUNT(distinct StudentID) > 1) B
ON
A.FirstName = B.FirstName
AND A.LastName = B.LastName
AND A.BirthDate = B.BirthDate
You might want to run only a part of this query, since it is probably the easiest way to identify duplicates - although it is not quite what you have been asking for:
SELECT FirstName, LastName, BirthDate
FROM
@Student
GROUP BY FirstName, LastName, BirthDate
HAVING COUNT(distinct StudentID) > 1
December 12, 2005 at 9:53 am
Thanks.. there are more colums, but none with unique features. Your second query works, but won't fill my requirement due to their being duplicate entries in the table for instances where a student is enrolled in more than one site, either concurrently or due to a mid-year transfer.
My other question regarding your top query here,
SELECT *
FROM
@Student A
JOIN
(
SELECT FirstName, LastName, BirthDate
FROM
@Student
GROUP BY FirstName, LastName, BirthDate
HAVING COUNT(distinct StudentID) > 1) B
ON
A.FirstName = B.FirstName
AND A.LastName = B.LastName
AND A.BirthDate = B.BirthDate
In the "Select From @Student A" part ...are you building a temporary duplicate table of my original table, and doing joins between them to establish the links and counts here.. am i correct in thinking that? What is the "@student" syntax 's purpose here?
Thanks!
December 13, 2005 at 2:07 am
The @Student table is from govinn's post. It is a table variable and not a "real" table in order to avoid changing the table structure of the database when testing the query (I guess). You should replace @Student by the name of your table when you're testing if the query meets your requirements.
Viewing 15 posts - 1 through 14 (of 14 total)
You must be logged in to reply to this topic. Login to reply