November 28, 2017 at 10:48 am
Hi,
I am writing an ETL routine (stored-procedure based) to fetch data from MySQL and insert it into an MSSQL database. I am using OPENQUERY over a linked server to fetch the data into a temp table and, after some processing, insert it into the MSSQL tables.
We have a few big tables; for example, one of our MySQL tables takes 1 million+ rows from the application daily. I suspect that millions of rows will slow down the process if I use a temp table, so I will use batch processing.
But my question is about best practice and an effective process. Can you please suggest an efficient way to process millions of rows from MySQL?
Thanks
November 28, 2017 at 10:55 am
thbaig - Tuesday, November 28, 2017 10:48 AM
Simply put, do not use a linked server for this; it's not the temp table that will slow down the process. Extract the data on the MySQL side to a flat file, transfer the file to someplace SQL Server has access to, and load the file from there.
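A minimal sketch of that flat-file approach, assuming a hypothetical `orders` table, a date column to filter on, and a share both servers can reach (every name and path here is illustrative, not from the original post):

```sql
-- MySQL side: export the day's rows to a flat file.
-- (Requires the FILE privilege; secure_file_priv may restrict the path.)
SELECT id, customer_id, amount, created_at
INTO OUTFILE '/var/exports/orders_20171128.csv'
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
FROM orders
WHERE created_at >= CURDATE();

-- SQL Server side: bulk-load the transferred file into a staging table.
BULK INSERT dbo.orders_staging
FROM '\\fileshare\exports\orders_20171128.csv'
WITH (
    FIELDTERMINATOR = ',',
    ROWTERMINATOR   = '\n',
    BATCHSIZE       = 100000,  -- commit in chunks, not one giant transaction
    TABLOCK
);
```

BATCHSIZE keeps each commit to a bounded number of rows, so a million-row load never sits in a single transaction.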
November 28, 2017 at 11:00 am
ZZartin - Tuesday, November 28, 2017 10:55 AM
Thank you, but I have to run a job periodically to load and update the data on MSSQL, as MySQL will remain the primary database for the application.
November 28, 2017 at 11:39 am
thbaig - Tuesday, November 28, 2017 11:00 AM
OK, but what's the issue with the proposed solution?
The absence of evidence is not evidence of absence
- Martin Rees
The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
- Phil Parkin
November 28, 2017 at 12:37 pm
I would use SSIS to load the data into your staging table. SSIS will handle chunking out the data so you don't have to move 1 million rows in one transaction. If you want to use a linked server, chunk out the data yourself in a loop.
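If you do stay on the linked server, a keyed batching loop might look like the sketch below. This is only an illustration: the linked server name `MYSQL_LS`, the `orders` table, and the `id` key column are all assumptions. Note that OPENQUERY only accepts a literal string, so the pass-through query has to be built dynamically to advance the key.

```sql
-- Pull rows from MySQL in 50,000-row slices, keyed on an ever-increasing id,
-- so no single pull or insert touches millions of rows at once.
DECLARE @last_id BIGINT = 0,
        @prev_id BIGINT = -1,
        @sql     NVARCHAR(MAX);

WHILE @last_id > @prev_id
BEGIN
    SET @prev_id = @last_id;

    SET @sql = N'INSERT INTO dbo.orders_staging (id, customer_id, amount, created_at)
                 SELECT id, customer_id, amount, created_at
                 FROM OPENQUERY(MYSQL_LS,
                     ''SELECT id, customer_id, amount, created_at
                       FROM orders
                       WHERE id > ' + CAST(@last_id AS NVARCHAR(20)) + N'
                       ORDER BY id
                       LIMIT 50000'')';
    EXEC (@sql);

    -- Advance the watermark; when no new rows arrive, MAX(id) stops moving
    -- and the loop exits.
    SELECT @last_id = ISNULL(MAX(id), @prev_id) FROM dbo.orders_staging;
END
```

The WHERE clause lives inside the pass-through string on purpose: that way MySQL does the filtering and only ships 50,000 rows per iteration, instead of SQL Server dragging the whole table across the link and filtering locally.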
November 28, 2017 at 12:38 pm
Phil Parkin - Tuesday, November 28, 2017 11:39 AM
Dependency. I would need to manage a routine on MySQL to extract the data, then copy it to the network, copy it to MSSQL, and then load it. Also, since this will be a recurring job, I would need to run the whole process on every execution, which doesn't seem optimal.
November 28, 2017 at 12:44 pm
thbaig - Tuesday, November 28, 2017 12:38 PM
SSIS can do everything listed, but it's over-complicating things. Joe's idea is better. Don't use a linked server.
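On the recurring-job concern: a common pattern, whichever transport you pick, is to extract only rows changed since the last run and upsert them into the target. A sketch, assuming the source rows carry a modification timestamp and a hypothetical `dbo.etl_watermarks` bookkeeping table (neither is from the original thread):

```sql
-- Upsert the freshly staged rows into the target table, then record
-- how far we got so the next run only pulls newer rows from MySQL.
DECLARE @watermark DATETIME2;

MERGE dbo.orders AS tgt
USING dbo.orders_staging AS src
    ON tgt.id = src.id
WHEN MATCHED THEN
    UPDATE SET tgt.customer_id = src.customer_id,
               tgt.amount      = src.amount,
               tgt.updated_at  = src.updated_at
WHEN NOT MATCHED THEN
    INSERT (id, customer_id, amount, updated_at)
    VALUES (src.id, src.customer_id, src.amount, src.updated_at);

SELECT @watermark = MAX(updated_at) FROM dbo.orders_staging;

UPDATE dbo.etl_watermarks
SET last_loaded = @watermark
WHERE table_name = 'orders';
```

With a watermark in place, each scheduled run only extracts the delta, so the "whole process on every execution" cost stays proportional to the day's changes rather than the full table.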