December 27, 2008 at 3:01 am
You're pulling the entire contents of two tables, amounting to a total of 15 million rows. That's never going to be quick, and there's no way indexes are going to help with that. Seeing as SQL has to pull 15 million or so rows from disk and put them on the network, that is going to strain both the server and the network.
If each of those 15 million rows is then inserted back, that's going to drive the log cache usage up and the page life expectancy (PLE) down, since the newly allocated pages can't already be in cache.
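For reference, one way to watch both effects while the load runs is to poll sys.dm_os_performance_counters; a minimal sketch (the database name below is a placeholder, and the per-second log counters need two samples to turn into a rate):

-- Page life expectancy: how long, in seconds, pages are staying in the buffer pool
SELECT object_name, counter_name, cntr_value
FROM sys.dm_os_performance_counters
WHERE counter_name = 'Page life expectancy'
  AND object_name LIKE '%Buffer Manager%';

-- Log activity for the target database ('YourTargetDB' is a placeholder name)
SELECT counter_name, instance_name, cntr_value
FROM sys.dm_os_performance_counters
WHERE object_name LIKE '%Databases%'
  AND counter_name IN ('Log Flushes/sec', 'Log Bytes Flushed/sec')
  AND instance_name = 'YourTargetDB';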
At this point, I'm leaning towards saying that the way the app is working is the problem, and that it's not fixable without changing the app.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
December 27, 2008 at 9:09 am
I would agree .... although I would take it a little further and say the problem is with the ODBC driver (more than with the app, Informatica) .... because when I replace the target on the SQL Server with a flat file on the DB server and write back using FTP, the source throughput jumps from 658 rows/sec to 7,500 rows/sec and the total session run time goes down from 7 hours to 30 minutes. And in case that still doesn't eliminate DB write performance as a suspect: when I run bulk inserts from that flat file into the SQL Server (which is roughly three times the total volume read), 35 million rows (18 GB) are loaded into the server in 40 minutes.
I guess I will have to wait and see what improvement the new release of the DataDirect ODBC driver gives .... I will post an update when it is available to us for testing.
Thanks a lot for your help and guidance!
Leon
December 27, 2008 at 10:09 am
leonp (12/27/2008)
when I run bulk inserts from that flat file into the SQL Server (which is roughly three times the total volume read), 35 million rows (18 GB) are loaded into the server in 40 minutes
There's a very big difference between a bulk insert (which is designed to insert massive numbers of rows very quickly) and single-row inserts (which is what your app is doing).
SQL is optimised to do things in sets. It will insert a million rows in a single statement far, far faster than it will process a million single-row inserts. As long as you're doing millions of single-row inserts, there will be a performance problem.
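To illustrate the difference, a rough sketch (the table and column names here are made up, not the actual schema):

-- Row-by-row: one execution and one network round trip per row,
-- roughly what a stream of sp_execute calls with single-row values does
INSERT INTO dbo.Target (Col1, Col2) VALUES (@v1, @v2);   -- repeated millions of times

-- Set-based: one statement moves the entire batch in a single execution
INSERT INTO dbo.Target (Col1, Col2)
SELECT Col1, Col2
FROM dbo.Staging;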
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
December 27, 2008 at 10:37 am
Gila,
I agree .... although the point I was trying to make was that when it runs bulk inserts, SQL Server performs well, which tells me SQL Server itself does not have a bottleneck ..... is that a fair assessment?
Also, you mentioned single-row inserts .... was that based on the logs or the Profiler output I shared? ... I thought the app was running transactions spanning 10,000 row inserts.
Again, is there a way the ODBC driver could issue the inserts in bulk mode?
December 27, 2008 at 12:08 pm
leonp (12/27/2008)
Also, you mentioned single-row inserts .... was that based on the logs or the Profiler output I shared? ... I thought the app was running transactions spanning 10,000 row inserts.
What? That they're slower? That's just a fact of databases.
You said earlier that the inserts were single row, hence my comments on them.
The application is executing a series of sp_executes with values for one row at a time.
It doesn't matter whether they're in individual transactions or in one big one. Inserting a million rows one row at a time is way, way slower than inserting a million rows in a single statement. Even more so when you do the million-row insert as a bulk operation.
It's not a question of whether ODBC supports it or not. A bulk insert is a specific command, not an ordinary insert with a different option enabled. Look up BULK INSERT or bcp in Books Online.
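For reference, a minimal sketch of both, with placeholder file path, table, server and format options (a real load would need the actual delimiters and an appropriate batch size):

-- T-SQL: bulk load a delimited flat file that is visible to the server
BULK INSERT dbo.Target
FROM 'D:\loads\target_extract.dat'          -- placeholder path
WITH (FIELDTERMINATOR = '|',
      ROWTERMINATOR   = '\n',
      TABLOCK,
      BATCHSIZE = 100000);

-- Command line: the same load via bcp (placeholder server and database names)
-- bcp YourDB.dbo.Target in D:\loads\target_extract.dat -S YourServer -T -c -t "|" -b 100000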
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
December 27, 2008 at 1:24 pm
Thanks for straightening me out .... although I understand from the ODBC documentation that, even with the prepared-statement (sp_prepare) code, we could do a multi-row insert by binding an array of parameters that is sent in a single execute:
SQLPrepare(hstmt, "INSERT INTO DailyLedger (...) VALUES (?,?,...)", SQL_NTS);

// pass 100 sets of parameters per execute
SQLSetStmtAttr(hstmt, SQL_ATTR_PARAMSET_SIZE, (SQLPOINTER)100, SQL_IS_UINTEGER);
SQLSetStmtAttr(hstmt, SQL_ATTR_PARAMS_PROCESSED_PTR, &rows_processed, SQL_IS_POINTER);

// specify an array in which to return the status of each set of parameters
SQLSetStmtAttr(hstmt, SQL_ATTR_PARAM_STATUS_PTR, ParamStatusArray, SQL_IS_POINTER);

// bind parameters
...

do {
    // read up to 100 ledger values into bound parameter buffers
    ...
    rc = SQLExecute(hstmt);   // insert a group of 100 rows
} while (!eof);
This should reduce network round trips considerably.
This essentially does a multi-row insert. I should take this back to the ODBC vendor and to the app vendor (Informatica) and ask why we are not issuing the inserts as parameter arrays, why we are doing single-row inserts, and who determines the insert mode: the app (Informatica) or the ODBC driver.