Slow access of data via a linked server connection

  • My manager has designed a new architecture for our development and user test environments that I don't think will work. I'm hoping someone has got a neat solution for us.

    We have one server for DEV and UAT. We can't afford a second server. We have one database for DEV, another for UAT. A third database, called QAI is 700GB in size, and is shared by both DEV and UAT. They only read from this database, and it is populated via a 3rd-party provider. Processes in DEV and UAT need to extract data from the QAI database and populate tables and memory caches in their own database. We can't put stored procedures in the QAI database, or change the schema.

    The current solution is 3 databases on one Instance (SQL2005) - DEV, UAT and QAI.

    The proposed solution is 3 instances - one for DEV, one for UAT and one for QAI. QAI cannot be replicated in each instance due to the size of the database, and licencing implications.

    I have created a new named instance for DEV with a linked server to the default instance which contains the QAI database. Synonyms have been created to connect the DEV instance to the tables in the QAI database, via the linked server.

    I am now testing my most intensive process, a single stored procedure which extracts pricing and industry classification data from the QAI database for 70,000 equity stocks, at the same time joining onto mapping tables on the local DEV database. The results are inserted to a table on DEV.

    As expected, the performance over the linked server is shocking. The process usually runs for 35 minutes, and is now still running ofer 2:40 hours.

    I am aware of the performance limitations of linked servers, when joining large tables in two instances/servers.

    I am hoping someone has tried to design an infrastructure like this in the past, and knows of a new feature I can use. I have investigated federated servers and distributed partitioned views but they all seem to use linked servers.

    The obvious solution is to redesign my stored procedure or use replication, but we don't have resources for redevelopment right now.

    If you can help, I would appreciate it.


    When in doubt - test, test, test!

    Wayne

  • How many tables are you looking to copy and how frequently will you copy them?

    A solution I've used at some places I've worked at where the table layouts are the same in the different databases is to use BCP with native mode files. Below is a sample .CMD file script for copying one table using windows authentication:

    if "%1" == "" goto :noparm

    bcp TestDatabase.%1 out %1.dat -n -S TestServer -T

    if errorlevel 1 goto errhandle

    SQLCMD -E -S DevServer -d DevDatabase -Q "truncate table %1" -r 1 -b

    if errorlevel 1 goto errhandle

    bcp DevDatabase.%1 in %1.dat -n -S DevServer -T -k -E -h "TABLOCK"

    goto ending

    :noparm

    echo you must specify a table name in the command line parameter.

    :errhandle

    echo script halting!

    :ending

    echo Done.

  • I won't be copying any data. A .NET application will be calling stored procedures on the DEV database in one instance, and the sprocs are extracts data from the QAI database in the other instance, using SELECT. Tables will be joined between the two databases. The QAI database is a repository of information, and we need to extract data from it, and perform math calcs based on this data.

    There are also views and udf's that need to access data in the QAI database. These all reside on the DEV database.

    We need to implement this solution without any code changes. Redevelopment is obviously the answer, but we don't have resources for that.

    I'm hoping for a solution that will allow me to join two tables in different instances, in one SELECT statement, and not have the performance problems of linked servers.


    When in doubt - test, test, test!

    Wayne

  • Why don't you make two selects (one per db) and work with the data in the .net code?

    I suppose this could be a bit faster..

  • That is the only possible solution I can think of, unfortunately we are under resourced and can't do that now. I was hoping for another solution avoiding linked servers.


    When in doubt - test, test, test!

    Wayne

  • Wayne (10/27/2008)


    There are also views and udf's that need to access data in the QAI database. These all reside on the DEV database.

    We need to implement this solution without any code changes. Redevelopment is obviously the answer, but we don't have resources for that.

    I'm hoping for a solution that will allow me to join two tables in different instances, in one SELECT statement, and not have the performance problems of linked servers.

    This is probably the worst case scenario for SQL Server, to have to join two tables from two different instances in one query. If you can't stay on one instance with all 3 databases, you may either have to write each query that does this as a stored procedure that queries the data from the remote instance into a temp table or table variable if really small, and then join that to the local table in the same stored proc to produce your desired results. The other option as already mentioned is to query the data separately and do the join in your application.

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply