March 26, 2009 at 6:38 am
Hi All,
I am working on a desktop application in which I want to update records globally.
My DB has nearly 10 million records.
Please suggest how to update these records as quickly as possible.
Thanks
MJ 😎
March 26, 2009 at 7:20 am
Well, bandwidth and the local machine's capacity are going to be the limiting factors: download 10 million records locally, manipulate them in local client-side memory, then send the changes back to the server.
Chances are that if you do that, the controlling application is generating individual statements for each row. If you fiddle with 5% of the data, that's 500,000 individual update statements. All rows? 10M statements.
My suggestion is to look at the actual issue and see if you can do it on the server side.
What is it your application needs to do, really?
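Just to illustrate the difference: one set-based statement on the server replaces all of those per-row statements. The table, columns, and the change itself below are made up; the real statement depends on what your application actually does:

-- hypothetical table/columns; one statement instead of 500,000 round trips
UPDATE dbo.Items
SET Price = Price * 1.05     -- whatever change the user asked for
WHERE CategoryId = 42;       -- whatever filter the user chose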
Lowell
March 26, 2009 at 7:21 am
Hi
Why do you want to load the data onto the client? What do you want to update in this data?
The fastest way is definitely to do it with SQL on the server.
Greets
Flo
March 26, 2009 at 7:32 am
Actually, I am working on a retail management ERP.
Users filter records using different search criteria and want to update those records.
The worst-case scenario is that s/he selects all records.
For example, a user wants to update all the records that have Retail Price > 1.
MJ 😎
March 26, 2009 at 7:38 am
But I don't think the user wants to see all 10 million items, so I assume your user has a GUI where s/he specifies which items to update. The old guideline was to show at most 500 rows to the user; more than that s/he never wants to see. Maybe today this count has increased to 10,000, but I think that would be the upper boundary.
Once the user has specified the criteria (based on the 10,000 sample rows shown), you can start a stored procedure that does the real work on all the data.
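As a rough sketch (dbo.Items and its columns are invented; the WHERE clause stands in for whatever the user chose in the GUI), the capped preview and the matching-row count could look like this:

-- capped preview for the GUI
SELECT TOP (10000) ItemID, ItemName, RetailPrice
FROM dbo.Items
WHERE RetailPrice > 1
ORDER BY ItemID;

-- total number of matching rows
SELECT COUNT(*) AS MatchingRows
FROM dbo.Items
WHERE RetailPrice > 1;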
Greets
Flo
March 26, 2009 at 7:42 am
I like Florian's idea; I'd suggest something like he says: show the top 500 or so, and then a message like "171,432 records match this criteria... click Next to review the next 500".
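Something like this would page through the matches on SQL Server 2005 and later, using ROW_NUMBER() for the windowing; the table and column names are again just placeholders:

DECLARE @Page int, @PageSize int;
SET @Page = 1;        -- which 500-row page to show
SET @PageSize = 500;

WITH Matches AS
(
    SELECT ItemID, ItemName, RetailPrice,
           ROW_NUMBER() OVER (ORDER BY ItemID) AS RowNum
    FROM dbo.Items
    WHERE RetailPrice > 1
)
SELECT ItemID, ItemName, RetailPrice
FROM Matches
WHERE RowNum BETWEEN (@Page - 1) * @PageSize + 1
                 AND @Page * @PageSize;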
Lowell
March 27, 2009 at 4:02 am
I am giving you the exact scenario:
1. I have a tab control with 3 tabs on it.
2. The 1st tab has a user interface for record filtering.
3. The 2nd tab sets the values to be updated.
4. The 3rd tab updates all the filtered records. The user won't see all the records,
but s/he can check a log file for the erroneous records at the end of the update.
S/he only sees a progress bar showing how many records have been updated out of the total.
MJ 😎
March 27, 2009 at 11:52 am
AMJ (3/27/2009)
I am giving you the exact scenario:
1. I have a tab control with 3 tabs on it.
2. The 1st tab has a user interface for record filtering.
3. The 2nd tab sets the values to be updated.
4. The 3rd tab updates all the filtered records. The user won't see all the records,
but s/he can check a log file for the erroneous records at the end of the update.
S/he only sees a progress bar showing how many records have been updated out of the total.
Hi
So you have a good starting point! 😉
My suggestion:
1st tab
Just fine!
2nd tab
Show some example data to the user, based on the filter criteria from the first tab. You can use TOP(x) for this. You should use an ORDER BY clause on an indexed field that will not be affected by the updates (e.g. the PK column(s) or a CreationDate, ...).
You can simulate the changes either on the client, or use a transaction to change the sample data in the database, select it, and roll back the transaction. Be aware that this may cause locks on the database if the manipulation of the sample data takes too long! If the changes are not too complex, you should use client-side simulation.
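A minimal sketch of the rollback variant, with invented names and a made-up price change; keep the transaction as short as possible to limit blocking:

BEGIN TRANSACTION;

-- apply the pending change to the sample rows only
UPDATE dbo.Items
SET RetailPrice = RetailPrice * 1.05
WHERE ItemID IN (SELECT TOP (500) ItemID
                 FROM dbo.Items
                 WHERE RetailPrice > 1
                 ORDER BY ItemID);

-- show the simulated result to the user
SELECT TOP (500) ItemID, ItemName, RetailPrice
FROM dbo.Items
WHERE RetailPrice > 1
ORDER BY ItemID;

-- undo everything; nothing is persisted
ROLLBACK TRANSACTION;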
3rd tab
If the user confirms the update action, you can first get the COUNT of the rows to be updated, as the limit for a progress bar or something like that. After that, you can use a procedure to do the changes directly on SQL Server without any data transfer to the client application.
For the progress bar I would suggest a procedure that works in steps of 10,000 records. The parameters for the stored procedure should be the filter criteria and an ID that specifies the starting ID of the 10,000-record batch. When the procedure finishes, it returns the maximum affected ID within this batch, and your client application can execute it again. So you have 1,000 executions instead of selecting 10,000,000 records and issuing 10,000,000 individual updates.
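A sketch of such a batch procedure; the table, columns, and the hard-coded change are assumptions, and a real version would take the user's filter criteria as parameters instead:

CREATE PROCEDURE dbo.UpdateItemsBatch
    @StartId int,            -- pass 0 on the first call
    @LastId  int OUTPUT      -- highest ItemID touched, NULL when done
AS
BEGIN
    SET NOCOUNT ON;

    -- find the upper boundary of the next 10,000-row batch
    SELECT @LastId = MAX(ItemID)
    FROM (SELECT TOP (10000) ItemID
          FROM dbo.Items
          WHERE ItemID > @StartId
            AND RetailPrice > 1              -- the user's filter
          ORDER BY ItemID) AS Batch;

    IF @LastId IS NOT NULL
        UPDATE dbo.Items
        SET RetailPrice = RetailPrice * 1.05 -- the requested change
        WHERE ItemID > @StartId
          AND ItemID <= @LastId
          AND RetailPrice > 1;
END;

The client calls this in a loop, feeding each returned @LastId back in as the next @StartId and stopping when @LastId comes back NULL; every round trip is one tick of the progress bar.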
After everything is done, you can show the sample data from the 2nd tab in the 3rd tab, but read directly from the database. Don't forget the ORDER BY criteria from the 2nd tab to avoid confusing the user. 😛
Greets
Flo
March 29, 2009 at 10:03 pm
Thanks for the suggestion.
I implemented this approach and it works fine for me.
It improved the update time by 40%.
Thanks again.
MJ 😎
April 1, 2009 at 5:11 am
Also, get the updates done by stored procedures on the server; that way you do not have to ship data to the client.
The only data you should ever ship to a client is data that absolutely must be displayed or is vital for business logic processing. Get as much work done as possible at the server. This gives you scalability, so the application can perform in an acceptable manner whether it has 100K or 100M records.
Original author: https://github.com/SQL-FineBuild/Common/wiki/ 1-click install and best practice configuration of SQL Server 2019, 2017, 2016, 2014, 2012, 2008 R2, 2008 and 2005.
When I give food to the poor they call me a saint. When I ask why they are poor they call me a communist - Archbishop Hélder Câmara
April 1, 2009 at 5:31 am
Not sure about this, but it sounds odd to me that a user regularly updates up to 10 million rows.
Do you have some more details on your table design and the update activities that the users perform?
Best Regards,
Chris Büttner
April 1, 2009 at 10:07 pm
Yes, you're right, EdVassie - I am creating an SP, but I have to use a cursor to update these records, as I have to create a log file entry for each and every erroneous record.
Christian, I am giving the worst-case scenario here. Sometimes a user may update 1,000 records and sometimes s/he may update 1 million records.
Obviously it is not a daily process, but we have to give the best solution to the user as per his requirement.
MJ 😎
April 2, 2009 at 2:45 am
IMHO you need to make the update processing asynchronous to the user input. If you tie up a user's screen for even one minute while you process their input, you risk them thinking the application has hung and killing it off.
If you make the update process asynchronous, then you should think about SSIS or something similar to do the update. This will process your update far faster than using a cursor (at least 10x faster) and give you all the error handling capabilities you need.
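If the SSIS package (or the update procedure) is wrapped in a SQL Server Agent job, the client can kick it off and return control to the user immediately; the job name below is made up:

-- fire and forget: sp_start_job returns as soon as the job has started
EXEC msdb.dbo.sp_start_job @job_name = N'RetailPriceBulkUpdate';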
Original author: https://github.com/SQL-FineBuild/Common/wiki/ 1-click install and best practice configuration of SQL Server 2019, 2017, 2016, 2014, 2012, 2008 R2, 2008 and 2005.
When I give food to the poor they call me a saint. When I ask why they are poor they call me a communist - Archbishop Hélder Câmara
April 2, 2009 at 4:38 am
AMJ (4/1/2009)
Yes, you're right, EdVassie - I am creating an SP, but I have to use a cursor to update these records, as I have to create a log file entry for each and every erroneous record.
I think cursors should never be the solution. You can also write/select the error items in one set-based batch.
Maybe you should consider posting the relevant part of the cursor from your procedure, with a short description and some example data, in a new thread here. I'm quite sure there will be a solution without a cursor that may be 1,000 times faster.
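For example, the per-row logging that forced the cursor can usually be done as one set-based INSERT before the update; dbo.ErrorLog and the error condition here are invented for illustration:

-- log every row that would fail validation, in a single statement
INSERT INTO dbo.ErrorLog (ItemID, ErrorMessage, LoggedAt)
SELECT ItemID, 'Item name missing', GETDATE()
FROM dbo.Items
WHERE RetailPrice > 1        -- the user's filter
  AND ItemName IS NULL;      -- hypothetical error condition

-- then update only the valid rows, also in a single statement
UPDATE dbo.Items
SET RetailPrice = RetailPrice * 1.05
WHERE RetailPrice > 1
  AND ItemName IS NOT NULL;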
Greets
Flo