December 13, 2012 at 2:56 am
Hi,
I am using sql server 2008 enterprise edition. I need to insert large number of rows to my table frequently. I have a primary key column in my table that has the hash value by computing 3 columns of the row. if i use bcp process it terminates the process because of duplicate rows in my data. What other best method to insert large datas to table without dublication. It should be as faster as bcp process.
Thanks in advance.
December 13, 2012 at 3:01 am
WITH CTE (
[ColumnName]
,[DuplicateCount]
)
AS(
SELECT
[ColumnName]
,[DuplicateCount]
,ROW_NUMBER() OVER(PARTITION BY [ColumnName]
ORDER BY [ColumnName]
AS DuplicateCount
FROM sample_table
)
DELETE
FROMCTE
WHERE (DuplicateCount > 1)
I think this an efficient way.
December 13, 2012 at 3:09 am
But I dont want to delete the records after insertion. Instead I want to skip the duplicate rows while inserting large number of rows.
December 13, 2012 at 3:19 am
If the data is already in the database then you should be able to do
WITH CTE
AS
(
Select
ROW_NUMBER() OVER(Partition by UniqueColumn ORDER BY UniqueColumn) rn
,<columnlist>
From myTable
)
Insert into BULK Desttable
Select <colunlist>
from CTE
where Rn=1
If its not then you may have to preload it.
_________________________________________________________________________
SSC Guide to Posting and Best Practices
December 13, 2012 at 4:48 am
CTE can be modified to select non duplicate values .
Please Refer,
http://msdn.microsoft.com/en-us/library/ms190766(v=sql.105).aspx
December 13, 2012 at 8:28 pm
saravanakumar-485245 (12/13/2012)
But I dont want to delete the records after insertion. Instead I want to skip the duplicate rows while inserting large number of rows.
You could change your PK to "Ignore Dupes".
The real key is that I would never ever import data directly to its final resting spot. I ALWAYS use a staging table because I ALWAYS validate the data first. If there are dupes present, it's good to know what they are and how many there are. You might also find out that what you're calling a duplicate, really isn't and that you may have selected an incorrect set of columns as the PK.
--Jeff Moden
Change is inevitable... Change for the better is not.
Viewing 6 posts - 1 through 5 (of 5 total)
You must be logged in to reply to this topic. Login to reply