July 4, 2016 at 4:41 am
I'm hoping that someone might have some good advice on this one. Basically I have a table with 3.5M rows that I need to transfer from a SQL Server 2008 R2 instance to a MySQL instance. The MySQL instance is hosted on the Amazon Web Services RDS platform, so it has limited functionality.
The source table definition is:
CREATE TABLE [Messages](
[MessageID] [int] NOT NULL,
[ChannelID] [int] NOT NULL,
[MessageText] [ntext] NULL,
[DateCreated] [smalldatetime] NOT NULL,
[DateModified] [smalldatetime] NULL,
CONSTRAINT [PK_Messages] PRIMARY KEY CLUSTERED
(
[MessageID] ASC,
[ChannelID] ASC
) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 90) ON [PRIMARY]
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
The destination MySQL table definition is:
CREATE TABLE `messages` (
`MessageID` int(11) NOT NULL,
`ChannelID` int(11) NOT NULL,
`MessageText` longtext CHARACTER SET utf8 COLLATE utf8_unicode_ci,
`DateCreated` datetime DEFAULT NULL,
`DateModified` datetime DEFAULT NULL,
PRIMARY KEY (`MessageID`,`ChannelID`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
My biggest issues so far have been performance and encoding, mainly due to the NTEXT source column.
If I use an ADO.NET connection for the SQL Server source and an ADO.NET connection for the destination, the data copies correctly but very slowly, as it is processed one row at a time rather than in batches.
If I use an ADO.NET connection for the SQL Server source and an ODBC connection for the destination, the data is again correct but performance is still terrible, as only the row-by-row method works; when set to batch mode I get the error "Parameter arrays with data at execution are not supported".
I then tried evaluating the CozyRoc SSIS+ components, which boast batch support when using their proprietary ODBC Destination. This too has issues: it got through 400K of the 3.5M rows quickly, but then fails with a "string or buffer error". Changing the defaults for the number of rows in the buffer and the buffer size has little effect and it still fails.
I have also tried using bcp to copy the data from SQL Server to a file and then importing it into MySQL using the LOAD DATA LOCAL INFILE syntax, but this too has resulted in issues (rough sketch below).
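For reference, the MySQL side of that attempt looked roughly like this; the file path, delimiters and character set are illustrative, and the file itself came from a bcp export of the Messages table:
LOAD DATA LOCAL INFILE '/tmp/messages.dat'
INTO TABLE messages
CHARACTER SET utf8
FIELDS TERMINATED BY '\t'
LINES TERMINATED BY '\n'
(MessageID, ChannelID, MessageText, DateCreated, DateModified);
The delimiters and character set above are assumptions; whatever is chosen has to agree with how bcp writes out the NTEXT column, which is where the encoding trouble tends to show up.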
Is there any open-source destination that is good at writing to MySQL from SSIS and supports batch inserts?
MCITP SQL 2005, MCSA SQL 2012
July 4, 2016 at 5:13 am
July 4, 2016 at 7:06 am
twin.devil (7/4/2016)
Please check the following: Slow source? Make your data flow buffers smaller!
Thank you for the suggestion; however, I have tried this and there is no improvement.
By default the native SSIS destinations, when writing to MySQL, write rows one at a time. I have confirmed this, as I can see the single-row INSERT statements in the MySQL general log. So whilst it has done what I believe it is designed to do, single-row inserts will always be far slower than multi-row batched inserts.
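For illustration, this is roughly the difference (values made up); the first form is what the general log shows today, the second is what a batching destination would send:
INSERT INTO messages (MessageID, ChannelID, MessageText, DateCreated, DateModified) VALUES (1, 10, 'first message', '2016-07-01 09:00:00', NULL);
INSERT INTO messages (MessageID, ChannelID, MessageText, DateCreated, DateModified) VALUES (2, 10, 'second message', '2016-07-01 09:01:00', NULL);
-- versus one round trip for many rows:
INSERT INTO messages (MessageID, ChannelID, MessageText, DateCreated, DateModified) VALUES
(1, 10, 'first message', '2016-07-01 09:00:00', NULL),
(2, 10, 'second message', '2016-07-01 09:01:00', NULL);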
When trying the same approach with the CozyRoc ODBC destination I can see the inserts are batched, but even with smaller buffers it still fails after processing approximately 400K rows.
MCITP SQL 2005, MCSA SQL 2012
July 4, 2016 at 7:44 am
Generally I use OLEDB sources and destinations in data flow tasks.
Try changing both the source and the destination to OLEDB.
Jez
July 4, 2016 at 7:49 am
Also, do you have Rows per Batch or Maximum insert commit size set to anything?
These can affect insert performance.
Jez
July 4, 2016 at 8:07 am
OLE DB sources and destinations are not an option for MySQL, as no OLE DB provider exists for MySQL.
I have also tried adjusting the batch sizes, both for the data flow container and for the specific data flow destination task, and neither has any positive impact. I have tried values ranging from 50 rows per batch all the way up to 100,000 rows per batch. It makes no difference; the writes to the MySQL instance always end up being single-row inserts.
In theory this should work. At the moment I am developing the SSIS packages in SSDT for SQL 2012; however, the symptoms are the same with BIDS (SQL 2008).
This is just one of the workarounds I have tried, and as yet none provide what I need:
https://blogs.msdn.microsoft.com/mattm/2009/01/07/writing-to-a-mysql-database-from-ssis/
MCITP SQL 2005, MCSA SQL 2012
July 4, 2016 at 11:27 am
Have you tried using the “Balanced Data Distributor” for SSIS?
https://blogs.msdn.microsoft.com/sqlperf/2011/05/25/the-balanced-data-distributor-for-ssis/
The Balanced Data Distributor works with ADO.NET connections, usually with MultipleActiveResultSets and an increased packet size.
July 5, 2016 at 2:01 am
prvmine (7/4/2016)
Have you tried using the “Balanced Data Distributor” for SSIS? https://blogs.msdn.microsoft.com/sqlperf/2011/05/25/the-balanced-data-distributor-for-ssis/
The Balanced Data Distributor works with ADO.NET connections, usually with MultipleActiveResultSets and an increased packet size.
Thank you for the suggestion; I wasn't aware of this feature, so it's worth a try. While it will not insert more rows per batch, it will let me perform inserts in parallel.
MCITP SQL 2005, MCSA SQL 2012
July 5, 2016 at 2:04 am
twin.devil (7/5/2016)
Have you checked your MySQL settings? If not, please check the following link.
Yes, I have checked MySQL; we do not use the MyISAM storage engine, all tables are InnoDB.
Keep in mind that the performance of a single insert is not the issue; rather, it takes far longer to insert 4.5M rows one by one than it would in batches of, say, 10K rows.
MCITP SQL 2005, MCSA SQL 2012
July 5, 2016 at 2:35 am
Using BDD, performance is slightly better; going on the processing rate, it will take 9.4 hours to complete, compared to 23 hours when running a normal data flow.
It's a bit frustrating, because if it wasn't for the Unicode data in the NTEXT column, the CozyRoc ODBC Destination combined with the ANSI MySQL driver (rather than the Unicode MySQL driver) could do it in 30 minutes; the downside is a lot of junk characters in the Unicode data due to the driver.
MCITP SQL 2005, MCSA SQL 2012