June 6, 2011 at 12:05 pm
I think you misunderstood what I was saying. Your original insert is inserting records it shouldn't because the query you have is not correctly selecting the records the want. This is not a problem with sql. It is that your select is not getting the data you want it to.
_______________________________________________________________
Need help? Help us help you.
Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.
Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.
Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/
June 6, 2011 at 12:14 pm
your original query does not look to me like it is doing what you want it to.
SELECT A.invoice, A.Provider, A.Division, A.BA, A.AGE_AT_DOS, A.SEX, A.TES_CRE, A.INV_CRE_DT,
A.INV_SER_DT, A.Measure
FROM IDXRaw A
WHERE A.INVOICE NOT IN (SELECT B.INVNUM
FROM invoices B
WHERE A.INVOICE = B.INVNUM
AND A.MEASURE = B.MEASURE)
This will not allow a record from IDXRaw be inserted if only the Invoice is in invoices and you want it to be invoice and Measure. You should probably use a left join to get the results you want or look into the Merge statement. Merge is incredibly powerful.
_______________________________________________________________
Need help? Help us help you.
Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.
Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.
Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/
June 7, 2011 at 11:20 am
Are you positive that they are duplicates if any of the columns have anything different leading or falling zeros, spaces could make sql server see the row as unique.
June 7, 2011 at 11:33 am
Grasshopper,
I am pretty darn definite that there are duplicates based on the 2 fields I am considering (DIVISION and INVNUM.) There are something like 1.5 million records I am looking at in my source table. After INSERTING into TARGET table excluding duplicates I end up with ~740,000 records. I have accomplished this by using the IGNORE DUPLICATE KEY flag. I had rather not use that full-time but it worked for this purpose. I believe that my script is written correctly since if I run it a second time zero records are inserted because they already exist. It is like the records are not in the table until after the script finishes? Maybe that is how it is designed? Being a newbie, I am clueless. 🙂
June 7, 2011 at 11:37 am
If you do Just a select with a where clause of one of the values that you know should not have duplicates. Do you get more than one record?
June 7, 2011 at 11:43 am
Yes, I ran the following and was returned with many rows with counts as high as 9.
SELECT INVOICE, MEASURE, COUNT(*)
FROM IDXRaw
GROUP BY INVOICE, MEASURE
IDXRaw is my source table.
June 7, 2011 at 11:46 am
Then your original query should probably have needed to have a distinct .
select distinct [columns] ....
That would be why you inserted duplicates.
_______________________________________________________________
Need help? Help us help you.
Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.
Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.
Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/
June 7, 2011 at 11:46 am
Then I would start with the data it sounds like there is something like spaces or hidden characters have you copy and pasted some of the values in a text to see if you have any spaces or hidden characters?
June 7, 2011 at 11:48 am
Like this
INSERT INTO invoices (INVNUM, PROVIDER, DIVISION, BA, AGE, SEX, TES_CRE_DT,
INV_CRE_DT, INV_SER_DT, measure)
SELECT A.invoice, A.Provider, A.Division, A.BA, A.AGE_AT_DOS, A.SEX, A.TES_CRE, A.INV_CRE_DT,
A.INV_SER_DT, A.Measure
SELECT DISTINCT A.invoice, A.Provider, A.Division, A.BA, A.AGE_AT_DOS, A.SEX, A.TES_CRE, A.INV_CRE_DT,
A.INV_SER_DT, A.Measure
FROM IDXRaw A
left join invoices B on A.INVOICE = B.INVNUM AND A.MEASURE = B.MEASURE
where A.INVOICE is null
_______________________________________________________________
Need help? Help us help you.
Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.
Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.
Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/
June 7, 2011 at 11:52 am
Hey Sean
The only thing is that a distinct does not always work. If some user entered in the same row with only a space different in one of the rows it is distinct. ex:
name sex job
Joe M CarPillot
Joe M Car Pillot
Joe M Car_Pillot
June 7, 2011 at 12:00 pm
Well distinct will always work. In fact each of those IS distinct. It is not going to solve the problem in your case.
If can give the ddl for your source table and a few rows of sample data that will help.
If all three of these exist for the same Invoice and Measure
Joe M CarPillot
Joe M Car Pillot
Joe M Car_Pillot
which one should be inserted?
The solution is fairly simple but you have to figure out which one of those you want. The solution will have to order your source results by something and use the first one for each unique combo you want in the destination.
_______________________________________________________________
Need help? Help us help you.
Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.
Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.
Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/
June 7, 2011 at 12:00 pm
Ok, so I did a SELECT using:
SELECT distinct INVOICE, MEASURE
FROM IDXRaw
am get the right number of records. Doesn't that prove that my duplicates do not have extra characters, spaces, etc... and thus the original INSERT script should have worked without having to use INCLUDE DUPLICATE KEYS?
So now I have 2 options: Use INCLUDE DUPLICATE KEYS or DISTINCT in my select. Still don't understand why the original INSERT doesn't work on first run but will when the records already exist on the TARGET table on the second run.
Thanks for all the help and the quick responses. Y'all are great!
June 7, 2011 at 12:11 pm
If your just looking to find dupliate vaules. The following would work:
SELECT INVOICE, MEASURE, COUNT(*)
FROM IDXRaw
GROUP BY INVOICE, MEASURE HAVING COUNT(*) > 1
June 7, 2011 at 12:14 pm
lgoolsby 86333 (6/7/2011)
Ok, so I did a SELECT using:
SELECT distinct INVOICE, MEASURE
FROM IDXRaw
am get the right number of records. Doesn't that prove that my duplicates do not have extra characters, spaces, etc... and thus the original INSERT script should have worked without having to use INCLUDE DUPLICATE KEYS?
So now I have 2 options: Use INCLUDE DUPLICATE KEYS or DISTINCT in my select. Still don't understand why the original INSERT doesn't work on first run but will when the records already exist on the TARGET table on the second run.
Thanks for all the help and the quick responses. Y'all are great!
Because you are inserting the results of the select statement. Which says give me all records from the source table where the invoice, measure combination does not exist and there are more than 1 record for some of those combinations.
_______________________________________________________________
Need help? Help us help you.
Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.
Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.
Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/
June 7, 2011 at 12:20 pm
Try something like this.
create table #IDXRaw
(
invoice int,
MEASURE varchar(5),
Provider varchar(50)
)
create table #invoices
(
INVNUM int,
MEASURE varchar(5)
)
insert #IDXRaw
select 1, 'gal', 'provid1'
union all
select 1, 'gal', 'provid 1'
union all
select 1, 'gal', 'provid1'
union all
select 1, 'gal', 'provid12312'
union all
select 2, 'ml', 'Joe M CarPillot'
union all
select 2, 'ml', 'Joe M Car Pillot'
union all
select 2, 'ml', 'Joe M Car_Pillot'
;with cte as
(
SELECT A.invoice, A.Measure, A.Provider, ROW_NUMBER() over(partition by invoice, measure order by A.Provider) as RowNum
FROM #IDXRaw A
WHERE A.INVOICE NOT IN
(
SELECT B.INVNUM
FROM #invoices B
WHERE A.INVOICE = B.INVNUM
AND A.MEASURE = B.MEASURE
)
)
insert #invoices
select invoice, MEASURE from cte
where RowNum = 1
Notice how I provided ddl and sample data to make it easy for others to consume easily? You can modify this to suit your exact situation but this should be pretty close. The real decision you have to make it what you want to order your source by.
_______________________________________________________________
Need help? Help us help you.
Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.
Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.
Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/
Viewing 15 posts - 16 through 30 (of 34 total)
You must be logged in to reply to this topic. Login to reply