November 10, 2009 at 2:36 pm
hi All,
I have a task to complete and i am hoping to get some direction on how to get statred with that.
I have received a large amount of data from our overseas office primarily with company and people contact details and about 80% of the data in the excel files is already present in our database and another 20% are new contacts to be added.
But there may be some updates for the 80% in the excel file for address, job title, city etc. I have to update any changes from the excel to reflect in our database.
Intially i have thought to import all the data from excel as a table in sql server 2005 and then compare it with what data we have in our two tables tblcompany, tblpeople and then update the changes in production database.
Is that a correct approach? is there any other way of doing it. And how do i compare both the tables to update only the changes.
Please help, any suggestions will be highly appreicated.
Thanks, Shilpa.
November 10, 2009 at 3:05 pm
Hi,
The approach you have mentioned (to import data into SQL Server and then compare) is a right approach.
regarding your question about how to compare both the tables and then update only changes is possible with using JOIN.
A much better description of the solution would be possible if you can give the create table scripts for the 2 tables and also the condition to look for.
Blog -- LearnSQLWithBru
Join on Facebook Page Facebook.comLearnSQLWithBru
Twitter -- BruMedishetty
November 10, 2009 at 3:15 pm
Hi,
Yes, this is one of the options. You can load everything to a staging table and then update rows that do exist in the target table and insert the rest. There is no point in filtering out rows that did not change from the update statement in my opinion, as updating all rows will not change rows that are equal anyway.
Do you have a column or columns that uniquely identify particular rows? The sql would look something like this:
--update existing rows with imported values
update dest
set cola = a.cola, colb = a.colb, colc = a.colc
from staging a inner join dest b on a.keycol = b.keycol
--insert missing rows
insert dest (cola, colb, colc)
select cola, colb, colc
from staging
where not exists
(select 1 from dest where keycol = staging.keycol)
The other option probably would be to use Fuzzy Lookup Transformation in SSIS if you have manually entered data with typos and spelling mistakes. Probably it wouldn't solve all your problems though.
Regards
Piotr
...and your only reply is slàinte mhath
November 10, 2009 at 4:10 pm
Piotr.Rodak (11/10/2009)
...There is no point in filtering out rows that did not change from the update statement in my opinion, as updating all rows will not change rows that are equal anyway....
That is not true. This simple test script demonstrates that.
if object_id('tempdb..#t','U') is not null drop table #t
go
select
*
into
#t
from
-- Function F_TABLE_NUMBER_RANGE available on this link
-- http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=47685
F_TABLE_NUMBER_RANGE(1,5000000)
order by
number
go
declare @st datetime
declare @x1 int
declare @x2 int
set @st = getdate()
set @x1 = 44
set @x2 = @x1
update #t
set number = number+@x1-@x2
where
number <> number+@x1-@x2
select Elapsed_MS_1 = datediff(ms,0,getdate()-@st)
go
declare @st datetime
declare @x1 int
declare @x2 int
set @st = getdate()
set @x1 = 44
set @x2 = @x1
update #t
set number = number+@x1-@x2
select Elapsed_MS_2 = datediff(ms,0,getdate()-@st)
Results:
(5000000 row(s) affected)
(0 row(s) affected)
Elapsed_MS_1
------------
843
(1 row(s) affected)
(5000000 row(s) affected)
Elapsed_MS_2
------------
7686
(1 row(s) affected)
November 10, 2009 at 5:15 pm
hi Rodak,
Thanks for your suggestion. I have listed the structure of data in Excel file but the same data is stored in two different tables in the database tlbcompany, tblperson.
I can match excel data with table data using COID, PersonID
Do i need to split the Excel file into company and people ?
And i want to update only the rows which are different from the database tables.
Can i use INTERSECT/EXCEPT to find the rows with changes and update only those? how would i go about that?
Columns in excel file
CoID
PersonID
Company
Address
Street
City
State
PostCode
Country
Dx
Phone
Fax
MrMs
FistName
LastName
Title
Specialisation
Direct Phone
Direct Fax
November 10, 2009 at 5:51 pm
Thanks Michael, I realize this is a risky statement. Thanks for your sample. On the other hand I don't think there will be 5 million rows of data sent in Excel files of the OP. That's why I decided to say that it will not make too big difference and will be easier to write (there may be many columns to compare, not only the key).
sharonrao123, I don't know what are your business requirements - when you say that you have same data in tblcompany and tblperson, does it mean that both tables have the same structure? Are COId and PersonId always populated for both tables? I guess that you will have to run the statements I sent earlier separately for companies and for persons, but this is only a guess, you have to decide what result you want to achieve. You can use EXCEPT to capture different rows, but NOT EXISTS will be OK as well.
Regards
Piotr
...and your only reply is slàinte mhath
November 10, 2009 at 7:55 pm
hi Rodak,
Sorry i was not clear in my response. We store company and person details seperately in company (columns : name, address, street, city, state, postcode, country, dx, phone, fax) and people (columns: mrms, firstname, lastname, title, email, directphone, directfax) table and the people table has CoIDas foreing key.
But in Excel file they are combined that should not be a problem ignore that.
Total records in excel file should be around 30,000. And i decided to split the excel sheet data into company and people.
Excel sheet data has aslo got the primary keys for people(personid) and company(coid) but my question really was is there a way to find out which columns have different data when compared between excel and database. And please note that for each record not all the fields have updates for some records it might be address field and for some title field and so on.
I am not really able to decide weather to update all the fields based or just update the fields which have changes. As you said that is something which i have to figure out.
Thanks for your input i really appreciate your time.
Shilpa.
Viewing 7 posts - 1 through 6 (of 6 total)
You must be logged in to reply to this topic. Login to reply