Large DB performance question

  • Hello

    Does anyone have an idea about the performance issues regarding JOIN's?

    If I have two tables

    UID | INFO       

    UID | INFO2

    and the two of these are over a million records, being hit regularly but 10's of thousands of web users.  Every record in the second DB is scanned every night to see what is in INFO2.  That is the reason for separation of INFO and INFO2

    What kind of impact would I have selecting INFO1 and INFO2 using a JOIN?

    and

    Is this silly to think this is possible?

    Thanks in advance

    Stefan

  • Hi Stefan,

    As usual, during performance optimization, you should estimate your needs and maybe make a tradeoff: if the given structure cannot perfectly satisfy all of your queries, then it should satisfy those that you execute most frequently and/or during peak hours. If you JOIN those tables the whole day, then maybe you should use a single table. Your should base your decision on exact tests: implement that part of the database in two ways, simulate a regular workload and derive conclusions.

    Regarding JOINS, I assume that UID is the key in both tables. Then, the fastest way to JOIN two large ordered datasets should be MERGE JOIN (that takes two already ordered subsets and JOINS them faster than the NESTED LOOPS method).

    Or, you can post some sample SQL statements on the forum, and somebody else can give you better feedback.

    Regards,

    Goce.

  • From the original post, I am not very clear it's two tables or two databases.

    Join two big tables usually cannot give you good performance. Use table variables, i.e. retrieve data from each tables based on the query criteria, and then join the table variables for the final results. The performance is much better.

    My experience is never join a big table with other tables.

     

Viewing 3 posts - 1 through 2 (of 2 total)

You must be logged in to reply to this topic. Login to reply