April 13, 2011 at 3:16 am
Hi everybody,
I have an application which accesses a database inside SQL Server 2008 R2 edition and generates XML files.
There are thousands of Robot applications which are constantly dumping data into 2 tables 24 X 7.
These 2 tables are also used by my application for generating the XML files.
Currently our database size is 5.5 GB but considering the future, I want to implement a scenario in which my application works on a different database for generating the XML files whereas the Robot applications work on a different database. If the XML files generated by my application do not have real time data thats fine with me.
I have little knowledge about mirroring, replication etc. Can the experts suggest to me what I should to satisfy my requirements.
Thanks in advance.
April 13, 2011 at 3:26 am
Hi,
if I understand you right, you want to run your xml generating application on another copy of the live database. So I can actually exclude mirroring because you can't read data from the mirror. This will be a feature of SQL Server 2011 Denali, but not in SQL Server 2008 R2. I would also would disadvise you from the normal replication because this must be implemented to every new table in your schema. Perhaps a Peer-To-Peer-Replication coul d help, but with this I have no experience. The best in my opinion would be to implement a log shipping with an standby-database from which you can read your data, but keep in mind that this would require a second license of sql server.
Greetings
Patrick Fiedler
[font="Arial"]Kind regards,
Patrick Fiedler
Consultant
April 13, 2011 at 4:30 am
Hi Patrick,
Thanks for your prompt response and clarifying about the different technologies.
Yes, I want my xml generating application to run on another copy of a database.
I try to elaborate the scenario below:
Database Server1 having possibly below mentioned databases
- Custom DB1
- Custom DB2
- Custom DB3
- Project DB (The project DB will have only 2 tables required by the Robot applications to dump data)
Database Server2 having possibly below mentioned database
- Project DB (Will have all the required tables for the project)
So I want only the data from the 2 tables to be transferred from Database Server 1 to Database Server 2.
Could this also be achieved if I can use just one Database Server and have different database names for e.g. "Project DB for Robots" and "Project DB for XML Generator"
Regards
Ramanan
April 13, 2011 at 4:37 am
Yes, if you only want to take a copy of these 2 tables you can use replication (the best fit in your case would be a transactional replication). In this case mirroring and log shipping will be a little bit oversized. 🙂
To implement replication you must first create a publication which publishs the 2 tables. This publication can be subscribed by another database, it doesn't matter whether the subsciber is on the same server or not.
But you can also implement a user defined sql server agent job which is inserting all new data into the copy of the live database.
[font="Arial"]Kind regards,
Patrick Fiedler
Consultant
April 13, 2011 at 4:49 am
Hi Patrick,
Thanks again for your prompt response.
I have some experience with SQL Replication around 5 years back, thanks for the hint will do a trial again.
We have transactional replication as you suggested, Merge and Snapshot replication.
I have the following doubt with transactional replication:
Since the Robot applications will be dumping data 24X7, with transactional replication my second database will also be continuously updated via replication from the publisher. Will this slow down the XML Generating program.
My main purpose for implementing this concept is to make the XML Generating program work as quick as possible even if there is 50 times more data in the database as of the current 5.5 GB data.
By the way I am also working with a German software company in India
April 13, 2011 at 5:17 am
ramanan_iyer (4/13/2011)
Hi Patrick,Thanks again for your prompt response.
I have some experience with SQL Replication around 5 years back, thanks for the hint will do a trial again.
We have transactional replication as you suggested, Merge and Snapshot replication.
I have the following doubt with transactional replication:
Since the Robot applications will be dumping data 24X7, with transactional replication my second database will also be continuously updated via replication from the publisher. Will this slow down the XML Generating program.
My main purpose for implementing this concept is to make the XML Generating program work as quick as possible even if there is 50 times more data in the database as of the current 5.5 GB data.
You say that you want to generate xml as quick as possible, but if you read the data from the same server you should read from the original table. Because in case of replication the log reader will write the data to publish to the distribution database and afterwards the distribution agent will send the data to the subscribers in you case the same machine. That means the log reader and the distribution agent will increase the i/o operations and in case of reading in original tables the sql server can use the cached table content. There will be also no locking problem. The robots will wait until your generating application will finish reading the data and your appliction will wait until the insert is made or get the data without the newly inserted data.
Perhaps you can partition the tables to stripe them on multiple database files on different disks.
By the way I am also working with a German software company in India
http://www.webdirekt.de and http://www.enexion.de[/quote%5D
Sounds interesting 😎 - german companies in India
[font="Arial"]Kind regards,
Patrick Fiedler
Consultant
April 13, 2011 at 5:24 am
Hi again,
There surely are a lot of German companies in India 🙂 Since last 11 years I am working with 2 German Companies.
The current benchmark is:
1. If I switch off external connections to SQL Server database and the Robot application, my XML generating application can generated around 2.1 GB of XML in 1 hour.
2. If the Robots are also accessing the SQL Server database in parallel, the time is significantly increased.
I don't know how the Robot application is developed but I have seen that in case of a deadlock they always get priority and my application is the victim. This seems because my application does not have any locks since it fires only select statements inside the 2 tables that are populated by the Robots.
If you feel replication could also be an added burden, can you suggest me something to improve the performance in this situation. Would partitioning the tables into different disks help with the speed?
Regards
Ramanan
April 13, 2011 at 5:39 am
In nearly all cases the disk is the bottleneck and if you set up a replication to a database which is located on another disk it would really increase the performance. The best would be to save database files and transaction logs to different disk locations. For performance reasons and recovery reasons.
But I would suggest to find out what the reason for the time increase is. Have you monitored the 2 sessions whether there are suspending time windows? Perhaps it would be an option to implement the resource governor to give your application more cpu and memory.
[font="Arial"]Kind regards,
Patrick Fiedler
Consultant
April 13, 2011 at 5:43 am
Hi again,
I have not went into the details about why there is a difference in time when the Robots are also accessing the DB.
Maybe I will try to get some more information and get back to you if required.
Thanks for all your help.
Regards
Ramanan
April 18, 2011 at 7:06 am
Hi again,
I have one more doubt, do you mean to say that implementing replication with a database on another disk would optimize and be useful
OR
Should I just try to have the data and transaction logs on different disks.
Regards
Ramanan
April 18, 2011 at 9:12 am
Hi,
Imagine, you implement a replication, all on the same disk. When the robots insert data, this data will be written also to the distribution database on the same disk. Then the replication log reader would detect new data in distribution db to distribute. It will start sending the data to the subscriber database. That would result in heavy i/o operations on the disk if you place all on the same disk. It must read all the data copied by the robots and write it to the distribution db and sql server writes all the data to the subscriber db.
So if you want to optimize the performance in this case you can take your original database where the robots work, the distribution database and the subscriber db on a separate disk (per db). Furthermore you should split database and log files to different disks.
And you can also optimize your xml export query.
[font="Arial"]Kind regards,
Patrick Fiedler
Consultant
April 18, 2011 at 9:12 am
Hi,
Imagine, you implement a replication, all on the same disk. When the robots insert data, this data will be written also to the distribution database on the same disk. Then the replication log reader would detect new data in distribution db to distribute. It will start sending the data to the subscriber database. That would result in heavy i/o operations on the disk if you place all on the same disk. It must read all the data copied by the robots and write it to the distribution db and sql server writes all the data to the subscriber db.
So if you want to optimize the performance in this case you can take your original database where the robots work, the distribution database and the subscriber db on a separate disk (per db). Furthermore you should split database and log files to different disks.
And you can also optimize your xml export query.
[font="Arial"]Kind regards,
Patrick Fiedler
Consultant
April 18, 2011 at 6:08 pm
Transactional and snapshot Replication is best for your scenerio but mirror cant be use because you cant query to the secondary node if primary node is mirroring , 1 node will be work at a time
Regards,
Syed Jahanzaib Bin Hassan
MCTS | MCITP | OCA | OCP | OCE | SCJP | IBMCDBA
Regards,
Syed Jahanzaib Bin Hassan
BSCS | MCTS | MCITP | OCA | OCP | OCE | SCJP | IBMCDBA
My Blog
www.aureus-salah.com
April 20, 2011 at 8:39 am
Syed Jahanzaib Bin hassan (4/18/2011)
Transactional and snapshot Replication is best for your scenerio but mirror cant be use because you cant query to the secondary node if primary node is mirroring , 1 node will be work at a timeRegards,
Syed Jahanzaib Bin Hassan
MCTS | MCITP | OCA | OCP | OCE | SCJP | IBMCDBA
The mirror can be used in read-only to offload reporting (or in this case xml generation).
Just create a snapshot of the mirror...
Schedule the snapshot refresh depending on how recent the data must be and how much space you have available.
April 23, 2011 at 12:41 am
overhead on the same database
Database snapshot fetch the data from the primary database,It will affect the performance of the database and overall performance of the server
Regards,
Syed Jahanzaib Bin Hassan
MCTS | MCITP | OCA | OCP | OCE | SCJP | IBMCDBA
My Blog
Regards,
Syed Jahanzaib Bin Hassan
BSCS | MCTS | MCITP | OCA | OCP | OCE | SCJP | IBMCDBA
My Blog
www.aureus-salah.com
Viewing 15 posts - 1 through 15 (of 20 total)
You must be logged in to reply to this topic. Login to reply