October 1, 2013 at 8:14 am
Hello. I ran the following script that too 3 hours 25 minutes to insert about 40,000 rows. Is there any logic that I can put into this to speed it up? Thank you!
insert into T3
select T2.COL1, T2.COL2, T1.COL1, T1.COL2, T1.COL3
from T1
join T2 on T1.COL4 = T2.COL3
where T2.COL4 != 'data1'
and T1.COL1 is not NULL
and T1.COL5 is not NULL
and T1.COL6 = 'data2'
and CAST (T2.COL1 as VARCHAR)+'-'+CAST(T2.COL2 as VARCHAR)+'-'+CAST(T1.COL1 as VARCHAR)+'-'+CAST(T1.COL2 as VARCHAR) not in
(select CAST(COL1 as VARCHAR)+'-'+CAST(COL2 as VARCHAR)+'-'+CAST(COL3 as VARCHAR)+'-'+CAST(COL4 as VARCHAR) from T3)
October 1, 2013 at 8:51 am
there are many causes for performance. but it's good to post little bit more details about this.
Like, do you have created index on some column if yes which type, do you use somewhere have transaction where this tables is used,
i don't know, try this
select T2.COL1, T2.COL2, T1.COL1, T1.COL2, T1.COL3,
CAST(T2.COL1 as VARCHAR)+'-'+CAST(T2.COL2 as VARCHAR)+'-'+CAST(T1.COL1 as VARCHAR)+'-'+CAST(T1.COL2 as VARCHAR) tmpColumn into #temp
from T1 (nolock) join T2 (nolock) on T1.COL4 = T2.COL3
where T2.COL4 != 'data1'
and T1.COL1 is not NULL
and T1.COL5 is not NULL
and T1.COL6 = 'data2'
insert into T3
select * from #temp where tmpColumn not in
(select CAST(COL1 as VARCHAR)+'-'+CAST(COL2 as VARCHAR)+'-'+CAST(COL3 as VARCHAR)+'-'+CAST(COL4 as VARCHAR) from T3)
October 1, 2013 at 8:51 am
There isn't a lot to go on here. If you could share some more information like: How large are the tables in question (both size and number of rows)? Are there indexes that cover this query? What does the Execution Plan show? Did this used to run fast and now it is slow? Is this an ad-hoc query or part of a stored procedure? etc.
October 1, 2013 at 8:54 am
And some examples of actual data in the tables, along with the CREATE TABLE code..
October 1, 2013 at 8:55 am
easy_goer (10/1/2013)
Hello. I ran the following script that too 3 hours 25 minutes to insert about 40,000 rows. Is there any logic that I can put into this to speed it up? Thank you!insert into T3
select T2.COL1, T2.COL2, T1.COL1, T1.COL2, T1.COL3
from T1
join T2 on T1.COL4 = T2.COL3
where T2.COL4 != 'data1'
and T1.COL1 is not NULL
and T1.COL5 is not NULL
and T1.COL6 = 'data2'
and CAST (T2.COL1 as VARCHAR)+'-'+CAST(T2.COL2 as VARCHAR)+'-'+CAST(T1.COL1 as VARCHAR)+'-'+CAST(T1.COL2 as VARCHAR) not in
(select CAST(COL1 as VARCHAR)+'-'+CAST(COL2 as VARCHAR)+'-'+CAST(COL3 as VARCHAR)+'-'+CAST(COL4 as VARCHAR) from T3)
We can't really help performance issues on theoretical tables. I can say that your performance issues are all based in your where clause. You have lots of nonSARGable predicates here.
!= will force a full scan.
Also you cast all these columns to varchar. You didn't specify a length so it will use the default length. Do you know the default length for varchar? Me neither, that is why you should always specify the length.
If you want some real assistance with the performance issues you should take a look at this article.
http://www.sqlservercentral.com/articles/SQLServerCentral/66909/[/url]
_______________________________________________________________
Need help? Help us help you.
Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.
Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.
Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/
October 1, 2013 at 8:56 am
This I reckon is the correct way to write your query:
select T2.COL1, T2.COL2, T1.COL1, T1.COL2, T1.COL3
from T1
join T2
on T1.COL4 = T2.COL3
where T2.COL4 != 'data1'
and T1.COL1 is not NULL
and T1.COL5 is not NULL
and T1.COL6 = 'data2'
AND NOT EXISTS (SELECT 1 FROM T3
WHERE T3.COL1 = T2.COL1
AND T3.COL2 = T2.COL2
AND T3.COL3 = T1.COL1
AND T3.COL4 = T1.COL2)
Check the results against the SELECT part of your original.
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
October 2, 2013 at 10:13 pm
Worked like a charm. Thanks for the assistance!
October 5, 2013 at 7:46 pm
Chris, is this the equivalent of a LEFT JOIN based on 4 columns ?
AND NOT EXISTS (SELECT 1 FROM T3
WHERE T3.COL1 = T2.COL1
AND T3.COL2 = T2.COL2
AND T3.COL3 = T1.COL1
AND T3.COL4 = T1.COL2)
October 5, 2013 at 8:00 pm
homebrew01 (10/5/2013)
Chris, is this the equivalent of a LEFT JOIN based on 4 columns ?
AND NOT EXISTS (SELECT 1 FROM T3
WHERE T3.COL1 = T2.COL1
AND T3.COL2 = T2.COL2
AND T3.COL3 = T1.COL1
AND T3.COL4 = T1.COL2)
Yeh... but faster.
--Jeff Moden
Change is inevitable... Change for the better is not.
October 6, 2013 at 6:50 am
ChrisM@Work (10/1/2013)
This I reckon is the correct way to write your query:
select T2.COL1, T2.COL2, T1.COL1, T1.COL2, T1.COL3
from T1
join T2
on T1.COL4 = T2.COL3
where T2.COL4 != 'data1'
and T1.COL1 is not NULL
and T1.COL5 is not NULL
and T1.COL6 = 'data2'
AND NOT EXISTS (SELECT 1 FROM T3
WHERE T3.COL1 = T2.COL1
AND T3.COL2 = T2.COL2
AND T3.COL3 = T1.COL1
AND T3.COL4 = T1.COL2)
Check the results against the SELECT part of your original.
So in this case, it will select from T1 and T2, only if not in T3 ?
What if you want to include non-matches against T3 ? Would you add to the WHERE clause ?
and T3.COL1 is NULL
and T3.COL2 is NULL
and T3.COL3 is NULL
and T3.COL4 is NULL
October 6, 2013 at 11:14 am
homebrew01 (10/6/2013)
What if you want to include non-matches against T3 ? Would you add to the WHERE clause ?
and T3.COL1 is NULL
and T3.COL2 is NULL
and T3.COL3 is NULL
and T3.COL4 is NULL
I guess it depends on what you are actually trying to do but if any of the t3 columns are NULL, they won't match any of the t2 columns even if they are also NULL so the result is that the rows won't exist and the WHERE NOT EXISTS will still work.
--Jeff Moden
Change is inevitable... Change for the better is not.
October 7, 2013 at 1:25 am
homebrew01 (10/6/2013)
ChrisM@Work (10/1/2013)
This I reckon is the correct way to write your query:
select T2.COL1, T2.COL2, T1.COL1, T1.COL2, T1.COL3
from T1
join T2
on T1.COL4 = T2.COL3
where T2.COL4 != 'data1'
and T1.COL1 is not NULL
and T1.COL5 is not NULL
and T1.COL6 = 'data2'
AND NOT EXISTS (SELECT 1 FROM T3
WHERE T3.COL1 = T2.COL1
AND T3.COL2 = T2.COL2
AND T3.COL3 = T1.COL1
AND T3.COL4 = T1.COL2)
Check the results against the SELECT part of your original.
So in this case, it will select from T1 and T2, only if not in T3 ?
Yes, that's correct.
What if you want to include non-matches against T3 ? Would you add to the WHERE clause ?
Isn't this the same as "not in T3"?
and T3.COL1 is NULL
and T3.COL2 is NULL
and T3.COL3 is NULL
and T3.COL4 is NULL
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
Viewing 12 posts - 1 through 11 (of 11 total)
You must be logged in to reply to this topic. Login to reply