Query optimisation

  • Hello all,

    I have two tables with the following structure -

    --t1

    ID bigint

    SourceID varchar(20))

    --t2

    SourceID Varchar(20) primary key

    value int

    I need to join these two tables so that I can get the sum of value from t2 for each ID in t1. To accomplish this I have written the following query -

    SELECT

    ID, SUM(VALUE)

    FROM

    T1 A

    INNER JOIN

    T2 B

    ON

    A.SOURCEID = B.SOURCEID

    GROUP BY

    ID

    I do get the desired result with this query, but my table t1 is huge and it has around 11.5 million rows and it takes around 4 minutes to run the above select, so I was thinking of a way to make this query more performant. I like using IF Exists instead of joins wherever possible, but in this case it seems I cant use that as I need to get the sum(value) from table t2.

    Is there any other possible way I can optimise this query at all or using an inner join is the best approach here?

    Here is the code to create and insert some sample data to the above mentioned tables -

    GO

    create table t1 (ID bigint, SourceID varchar(20))

    GO

    Insert into t1

    select 1, 'A'

    union

    select 2, 'A'

    union

    select 3, 'B'

    union

    select 4, 'C'

    union

    select 5, 'D'

    union

    select 6, 'E'

    GO

    create table t2 (SourceID Varchar(20) primary key, value int)

    GO

    insert into t2

    select 'A', 1

    union

    select 'B', 2

    union

    select 'C', 4

    union

    select 'D', 8

    union

    select 'E', 16

    GO

    Thanks 🙂

  • AllaboutSQL (4/24/2012)


    I do get the desired result with this query, but my table t1 is huge and it has around 11.5 million rows and it takes around 4 minutes to run the above select, so I was thinking of a way to make this query more performant.

    Are the other queries (of similar nature or involving these tables) taking less than 4 minutes to complete?

    Is this the only query taking 4 minutes to run and other SELECTs faster than this?

    Is this query making use the the indexes?

    Please attach the execution plan.

  • Assuming that's an accurate representation of the query, you don't have a WHERE clause. That means you're processing everything in the table. You're not going to be able to improve that, and, as more data gets added, it'll get slower. You need to filter the data in some fashion.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Thank you both for your replies, I have figured out that I can actually filter the data before I perform the join, so the join will now work on a very less number of rows now and will be a lot faster. I still need to test this on the original datasets to to find out how much time I'l save by doing this. I'll update the post once I have some results. Thanks 🙂

  • to be clear - do you have an index on sourceid in t1?

    it might be wise to take a snapshot of the query plan and post it up here

    MVDBA

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply