April 10, 2008 at 6:38 am
Hi,
we build an application with an underlying sql server. The interface between application and sql server shall exist of stored procedures, so that the application developer won't have to know how the server stores the data.
So far so good, but I think there will be a performance issue if a stored procedure will be called 1Mio times to store 1Mio records. My question is, how to pass a variable with a whole data structure to a procedure or is there a better concept to handle this?
thx
Tobias
April 10, 2008 at 6:47 am
I'm still a newbie but...
my understanding is that stored procedures are the fastest way to maintain a database
that being said, it is possible to build a dataset (in .NET) and write it to XML - and then use that XML file to update a table
but is that way any faster?
April 10, 2008 at 7:03 am
Perhaps I would add another layer between programmer and database, or at least helper class that would take this one million of objects, convert them to XML and called a proc with xml parameter. You could use that xml within stored proc to insert data to structures you have in DB.
This is IMO the easiest and fastest way to store huge amounts of data.
Piotr
...and your only reply is slàinte mhath
April 10, 2008 at 8:01 am
Everyone else has already said it, but I can expand a bit. XML is the answer. You want to look at ALL your database processes from the approach of working in sets of data, not indvidual rows & columns.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
April 10, 2008 at 8:09 am
I'm sure I'll get yelled at by someone for this post...
Don't be too bound on the "Stored Procedures are the only way" approach.
They are not necessarily the fastest or best solution to every problem. It is very easy to build stored procedures that perform poorly and have security holes. They are not a solution to every problem.
Batching millions of rows through a procedure one at a time or even by building a huge XML string and passing it in as a parameter is probably not the best solution in most cases. Step back a bit and examine all of the options available before you bind yourself to a rule like you have. I believe it is usually good to have developers access your database through stored procedures, but it is clearly not the case in all situations.
April 10, 2008 at 8:24 am
It depends on how you're generating the 1-million rows of data.
If what you're doing is importing large files, then BCP/Bulk Import/SSIS is going to be much faster than calling a stored procedure 1-million times, and much faster than trying to create a 1-million-row XML document.
If what you're doing is having 1-million users create 1 row each, through an web page, then calling the stored procedure once per user is going to be faster.
If you give a little more data on how you get the original 1-million rows, we can help you narrow down the solution a bit more.
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
April 10, 2008 at 8:31 am
No, nobody should beat you up. "It depends" is always the right answer. The factors involved really do make a difference. If the system in question is meant to gather a million rows from another source and then store them, then some bulk load process through a file, possibly using SSIS, is a better approach. But if the term "million" was just a primitive counting function (1,2,3,4, countless) and we're talking about how to process sets of data from an OLTP system, say 100 rows at a time, then opening 100 transactions and committing 100 transactions is a poor choice if those 100 transactions are part of a single set of data that can be done with 1 transaction. The devil really is in the details.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
April 10, 2008 at 11:29 am
Michael Earl (4/10/2008)
I'm sure I'll get yelled at by someone for this post...Don't be too bound on the "Stored Procedures are the only way" approach.
They are not necessarily the fastest or best solution to every problem. It is very easy to build stored procedures that perform poorly and have security holes. They are not a solution to every problem.
Batching millions of rows through a procedure one at a time or even by building a huge XML string and passing it in as a parameter is probably not the best solution in most cases. Step back a bit and examine all of the options available before you bind yourself to a rule like you have. I believe it is usually good to have developers access your database through stored procedures, but it is clearly not the case in all situations.
Well, since you insist on getting yelled at, how's this:
(yelling really, really loud) "How can you possibly post intelligent, reasonable, accurate information on this web page!"
There, does that work for you?
In other words, I agree with you. Procs are usually the best option, but certainly not always.
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
April 10, 2008 at 12:35 pm
Michael Earl (4/10/2008)
I'm sure I'll get yelled at by someone for this post...Don't be too bound on the "Stored Procedures are the only way" approach.
They are not necessarily the fastest or best solution to every problem. It is very easy to build stored procedures that perform poorly and have security holes. They are not a solution to every problem.
Batching millions of rows through a procedure one at a time or even by building a huge XML string and passing it in as a parameter is probably not the best solution in most cases. Step back a bit and examine all of the options available before you bind yourself to a rule like you have. I believe it is usually good to have developers access your database through stored procedures, but it is clearly not the case in all situations.
All right, I'll bite...
Not using an XML parameter to do this doesn't obviate using a Stored Procedure. Say - a Stored procedure that takes a file NAME parameter, and then imports it. Not that I think that SP's are the only choice in this case either - just decoupling the alternatives.
Speaking of which - depending on the app - you may not find XML necessary. While very specific and detailed about the data structure, it's also rather costly storage-wise (and therefore slower to load/import). A simple data structure might do just fine with a simple delimited file structure....
A "straight" data entry system that flushed things into a database on a regular basis may do very well with automating a few flat files, and a DTS,SSIS or BCP proess to push stuff in...
Oh - did I not yell loud enough????
:hehe:
----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?
April 11, 2008 at 12:13 am
Thank you very much so far.
the Application will generate the Data using a bunch parallel processes. In fact today I don't know the exact amount of data. Probably I have to use a stored procedure because of rules of our customer. Reading this thread I see 2 possible solutions:
1. The application builds an xml-String of the data and passes it to a stored procedure. The SP interpretes the xml and stores the data to the target tables.
2. The application builds an flat file and passes a reference to the file to a stored procedure. The SP calls SSIS which will do the rest.
I think the second solution will be faster and I can use the whole functionality of SSIS.
Am I right?
April 11, 2008 at 1:13 am
Hmmm from my experience.. not really. SSIS is a cow - huge, memory hogging beast that is hard to debug and not very nice in managing from code. Also there are security concerns if this application is run under lower priviledged accounts, like IUSR_. Another thing is that you will have to implement synchronized writable access to file and this is definitely slower than inserting an object to collection (with lock keyword, don't forget :))
I had similar scenario in the past and ended up with list of objects that had implemented ToXml method. Then every few seconds I iterated through the collection, called ToXml and stored result in string builder. In the end I called string builders ToString to get final result that was passed to database. Then I cleared the collection. Worked perfectly.
Piotr
...and your only reply is slàinte mhath
April 11, 2008 at 7:50 am
I'd think #2 might be faster, if the SP does the insert directly (no SSIS needed). The assumption would be that the flat file would be perfectly aligned with whatever table it need to be flushed into, so SSIS is overkill. I'd use BCP or BULK INSERT based on an opendatasource call.
that being said - it's also going to be a matter of how "big" the chunks get to be. If it's important to flush it ever few seconds, with an assumption of a relatively small chunk - XML wouldn't be "bad" per se. The bigger the XML file/fragment, the bigger the penalty for having to parse it into usable parts on the other end IMO.
In the end - either way will work. Which one works "best" for you will be a "it depends"....
----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?
Viewing 12 posts - 1 through 11 (of 11 total)
You must be logged in to reply to this topic. Login to reply