making records unique

Question

making records unique

mick burden

SSCarpal Tunnel

Points: 4432
More actions
September 12, 2013 at 5:51 am

#395467

Hi everyone, I've a table with 15 fields with 70,000 rows of data. I now realise I need to make each record unique. I could have gone down the road of using if exists ... statement in my Store Procedure to stop duplicates, but I'd have to test each field to see if they all make a duplicate row. I did think about making all the fields a primary key. The problem with that is I've allowed Nulls in some of the fields, I did think of updating the Null fields with 'N/A' but I can't remember the SQL statment that does it. Can someone please help me resolve this with a suggestion.

Viewing 15 posts - 1 through 15 (of 32 total)

You must be logged in to reply to this topic. Login to reply

mick burden SSCarpal Tunnel Points: 4432 More actions · Answer 1

If I make a new table with the same structure, is there a way I could copy table1 into the new table using select into but substituting NULL with 'N/A'

kapil_kk SSC-Insane Points: 21316 More actions · Answer 2

Its not a good idea to include all fields in a primary key as it will affect the performance....

Did you try DISTINCT to see how many records you are getting after using it or it is returning same 70000 rows after using DISTINCT?

please let us know

_______________________________________________________________
To get quick answer follow this link:
http://www.sqlservercentral.com/articles/Best+Practices/61537/

kapil_kk SSC-Insane Points: 21316 More actions · Answer 3

mick burden (9/12/2013)
If I make a new table with the same structure, is there a way I could copy table1 into the new table using select into but substituting NULL with 'N/A'

Can you please post the table structure and tell us for which fields you are getting repeatitive values?

_______________________________________________________________
To get quick answer follow this link:
http://www.sqlservercentral.com/articles/Best+Practices/61537/

ChrisM@Work SSC Guru Points: 186127 More actions · Answer 4

mick burden (9/12/2013)
Hi everyone, I've a table with 15 fields with 70,000 rows of data. I now realise I need to make each record unique.

Easy.

I could have gone down the road of using if exists ... statement in my Store Procedure to stop duplicates, but I'd have to test each field to see if they all make a duplicate row.

Still easy

I did think about making all the fields a primary key. The problem with that is I've allowed Nulls in some of the fields, I did think of updating the Null fields with 'N/A' but I can't remember the SQL statment that does it. Can someone please help me resolve this with a suggestion.

Probably not the best solution.

What do you really want to do? Remove the duplicate rows? Fix the sproc so that duplicate rows are not output? Both?

^{“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw}

For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

SteveEClarke SSCertifiable Points: 5161 More actions · Answer 5

Yes - you can do a select into, and do a case when login per field - as its a one off pain at first but then its done

________________________________________________________________________________________________
Regards
Steve
SQL 2008 DBA/DBD - MCTS/MCITP

Please don't trust me, test the solutions I give you before using them.

mick burden SSCarpal Tunnel Points: 4432 More actions · Answer 6

I've tried Distinct as requested and there's only about 200 duplicates

mick burden SSCarpal Tunnel Points: 4432 More actions · Answer 7

Yes - you can do a select into, and do a case when login per field - as its a one off pain at first but then its done

how does that work?

mick burden SSCarpal Tunnel Points: 4432 More actions · Answer 8

Here is my script for creating the table

if object_id('PostCodesAndAddressesBt','U')is null

CREATE TABLE PostCodesAndAddressesBt

(ID INT IDENTITY(1,1) PRIMARY KEY CLUSTERED,

Postcode varchar(10) ,

BuildingName varchar(200),

housenumber varchar(20) ,

SubBuilding varchar(100) ,

County varchar(100),

Town varchar(100) ,

LocalName varchar(100) ,

PostOutCode varchar(100),

PostInCode varchar(100),

Throughname varchar(100),

Throughdesc varchar(100),

poboxno varchar(100),

BusinessName varchar(200),

locality varchar(200))

mick burden SSCarpal Tunnel Points: 4432 More actions · Answer 9

mick burden

SSCarpal Tunnel

Points: 4432

September 12, 2013 at 7:11 am

#1649698

I'd like to see both senarios to see what's best

ChrisM@Work SSC Guru Points: 186127 More actions · Answer 10

Can you post a few rows please Mick, showing some dupes?

^{“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw}

For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

ChrisM@Work SSC Guru Points: 186127 More actions · Answer 11

What results do you get from this query?

;WITH Deleter AS (

SELECT

ID,

rn = ROW_NUMBER() OVER(ORDER BY Postcode,BuildingName,housenumber,SubBuilding,County,Town,

LocalName,PostOutCode,PostInCode,Throughname,Throughdesc,poboxno,BusinessName,locality)

FROM PostCodesAndAddressesBt

)

SELECT *

FROM Deleter

WHERE rn > 1

^{“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw}

For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

mick burden SSCarpal Tunnel Points: 4432 More actions · Answer 12

I get to columns of integers, I know the first column is ID numbers, not sure what the other integers are, can you tell me what that statement does please?

ChrisM@home SSC-Insane Points: 24260 More actions · Answer 13

ChrisM@Work (9/12/2013)
What results do you get from this query?
;WITH Deleter AS (
SELECT
ID,
rn = ROW_NUMBER() OVER(ORDER BY Postcode,BuildingName,housenumber,SubBuilding,County,Town,
LocalName,PostOutCode,PostInCode,Throughname,Throughdesc,poboxno,BusinessName,locality)
FROM PostCodesAndAddressesBt
)
SELECT *
FROM Deleter
WHERE rn > 1

I think he meant this, Mick:

;WITH Deleter AS (

SELECT

ID,

rn = ROW_NUMBER() OVER(PARTITION BY Postcode,BuildingName,housenumber,SubBuilding,County,Town,

LocalName,PostOutCode,PostInCode,Throughname,Throughdesc,poboxno,BusinessName,locality)

FROM PostCodesAndAddressesBt

)

SELECT *

FROM Deleter

WHERE rn > 1

[font="Arial"]^{Low-hanging fruit picker and defender of the moggies}[/font]

For better assistance in answering your questions, please read this[/url].

Understanding and using APPLY, (I)[/url] and (II)[/url] Paul White[/url]

Hidden RBAR: Triangular Joins[/url] / The "Numbers" or "Tally" Table: What it is and how it replaces a loop[/url] Jeff Moden[/url]

mick burden SSCarpal Tunnel Points: 4432 More actions · Answer 14

I've run the sql snippet and I'm getting the following error

Msg 4112, Level 15, State 1, Line 4

The ranking function "ROW_NUMBER" must have an ORDER BY clause.