April 24, 2012 at 3:32 am
Hello all,
I have two tables with the following structure -
--t1
ID bigint
SourceID varchar(20))
--t2
SourceID Varchar(20) primary key
value int
I need to join these two tables so that I can get the sum of value from t2 for each ID in t1. To accomplish this I have written the following query -
SELECT
ID, SUM(VALUE)
FROM
T1 A
INNER JOIN
T2 B
ON
A.SOURCEID = B.SOURCEID
GROUP BY
ID
I do get the desired result with this query, but my table t1 is huge and it has around 11.5 million rows and it takes around 4 minutes to run the above select, so I was thinking of a way to make this query more performant. I like using IF Exists instead of joins wherever possible, but in this case it seems I cant use that as I need to get the sum(value) from table t2.
Is there any other possible way I can optimise this query at all or using an inner join is the best approach here?
Here is the code to create and insert some sample data to the above mentioned tables -
GO
create table t1 (ID bigint, SourceID varchar(20))
GO
Insert into t1
select 1, 'A'
union
select 2, 'A'
union
select 3, 'B'
union
select 4, 'C'
union
select 5, 'D'
union
select 6, 'E'
GO
create table t2 (SourceID Varchar(20) primary key, value int)
GO
insert into t2
select 'A', 1
union
select 'B', 2
union
select 'C', 4
union
select 'D', 8
union
select 'E', 16
GO
Thanks 🙂
April 24, 2012 at 4:02 am
AllaboutSQL (4/24/2012)
I do get the desired result with this query, but my table t1 is huge and it has around 11.5 million rows and it takes around 4 minutes to run the above select, so I was thinking of a way to make this query more performant.
Are the other queries (of similar nature or involving these tables) taking less than 4 minutes to complete?
Is this the only query taking 4 minutes to run and other SELECTs faster than this?
Is this query making use the the indexes?
Please attach the execution plan.
April 24, 2012 at 4:34 am
Assuming that's an accurate representation of the query, you don't have a WHERE clause. That means you're processing everything in the table. You're not going to be able to improve that, and, as more data gets added, it'll get slower. You need to filter the data in some fashion.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
April 25, 2012 at 2:59 am
Thank you both for your replies, I have figured out that I can actually filter the data before I perform the join, so the join will now work on a very less number of rows now and will be a lot faster. I still need to test this on the original datasets to to find out how much time I'l save by doing this. I'll update the post once I have some results. Thanks 🙂
April 25, 2012 at 3:07 am
to be clear - do you have an index on sourceid in t1?
it might be wise to take a snapshot of the query plan and post it up here
MVDBA
Viewing 5 posts - 1 through 4 (of 4 total)
You must be logged in to reply to this topic. Login to reply