January 14, 2010 at 11:17 am
I wouldn't waste time on clowns that are obviously not professionals. This site houses professionals helping other professionals. There are professionals with varying levels of experience, and then there's the occasional twit who most likely will not stick around. It does take time to provide the necessary scripts and desired results; however, it is time well spent. There are a lot of people like myself who love the challenge but don't have the time to backpedal.
January 14, 2010 at 11:18 am
jcrawf02 (1/14/2010)
So Gail, something that I wonder occasionally (the rest of the time I wonder about everything else :-P) - if I SELECT TOP 10 and ORDER BY the clustered index, can the optimizer figure out that it just needs to look at the first ten records in the index and work with those? Or is it always the case that it will process the query on the dataset and THEN apply the TOP filter?
Depends on the complexity of the query.
SELECT TOP 10 * FROM SomeTable
This is so simple that the TOP can be pushed right down. SQL will only fetch 10 rows from the table; it knows it doesn't need more.
SELECT TOP 10 * FROM SomeTable WHERE AnIndexedColumn = @somevar
Same here. Simple enough to be an index seek that returns only 10 records.
SELECT TOP 10 SomeColumn
FROM ATable
WHERE AnotherColumn = @var
GROUP BY SomeColumn
HAVING COUNT(*) > 10
ORDER BY SomeColumn
Now, for this one, how many rows are needed from the base table to return 10 at the end? Very difficult to say, so in this case it's likely that all qualifying rows will be fetched, aggregated, filtered again, and only then will the TOP be applied.
The optimiser will try really hard to push the row limitation down as far as possible, and I've seen it do things I wouldn't have expected (including pushing a row limitation down to an index seek when I was filtering on the output of Row_Number).
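One way to see which behaviour you're getting (a sketch only; the table and column names here are made up, not from a real schema) is to compare logical reads, or the actual row counts in the execution plan, between a simple TOP and a TOP over an aggregate:

```sql
-- Sketch: SomeTable, ClusteredKeyCol, SomeColumn and AnotherColumn are
-- illustrative names only
DECLARE @var int = 1;

SET STATISTICS IO ON;

-- Simple TOP: the scan/seek itself stops after 10 rows,
-- so logical reads stay small however big the table is
SELECT TOP 10 * FROM SomeTable ORDER BY ClusteredKeyCol;

-- TOP over an aggregate: all qualifying rows must be read and grouped
-- before the TOP can apply, so reads grow with the qualifying row count
SELECT TOP 10 SomeColumn
FROM SomeTable
WHERE AnotherColumn = @var
GROUP BY SomeColumn
HAVING COUNT(*) > 10
ORDER BY SomeColumn;
```

In the first query the "Actual Number of Rows" on the scan or seek operator should show 10; in the second it will show every row that matched the WHERE clause.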
I think this is worthy of a blog post, it's complex.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
January 14, 2010 at 12:05 pm
david.tyler (1/14/2010)
I wouldn't waste time on clowns that are obviously not professionals. This site houses professionals helping other professionals. There are professionals with varying levels of experience, and then there's the occasional twit who most likely will not stick around. It does take time to provide the necessary scripts and desired results; however, it is time well spent. There are a lot of people like myself who love the challenge but don't have the time to backpedal.
If you are so 'professional', why do you resort to name calling?
January 14, 2010 at 12:10 pm
Ifila, thanks for posting the information that was requested. This topic has shown that SSC can and will help when challenged. It is good that your issue is resolved, and I hope that keeps you involved at SSC.
Jason...AKA CirqueDeSQLeil
_______________________________________________
I have given a name to my pain...MCM SQL Server, MVP
SQL RNNR
Posting Performance Based Questions - Gail Shaw
Learn Extended Events
January 14, 2010 at 12:19 pm
GilaMonster (1/14/2010)
I think this is worthy of a blog post, it's complex.
woohoo! 😀
---------------------------------------------------------
How best to post your question
How to post performance problems
Tally Table: What it is and how it replaces a loop
stewsterl 80804 (10/16/2009)
I guess when you stop and try to understand the solution provided, you not only learn but save yourself some headaches when you need to make any slight changes.
January 14, 2010 at 12:20 pm
After 25 years of dealing with IT people, I guess sometimes one can get cantankerous.
Thanks to all the constructive contributors!
January 14, 2010 at 12:46 pm
ifila (1/14/2010)
david.tyler (1/14/2010)
I wouldn't waste time on clowns that are obviously not professionals. This site houses professionals helping other professionals. There are professionals with varying levels of experience, and then there's the occasional twit who most likely will not stick around. It does take time to provide the necessary scripts and desired results; however, it is time well spent. There are a lot of people like myself who love the challenge but don't have the time to backpedal.
If you are so 'professional', why do you resort to name calling?
Careful now, you didn't start out so well either, but we persevered in an attempt to help you solve your particular issue, even if some of us may have gotten a bit sarcastic at times.
Bottom line, we took a challenge, developed a solution, and provided a recommendation on indexes as well. And it didn't cost you any $$.
January 14, 2010 at 5:41 pm
GilaMonster (1/14/2010)
I think this is worthy of a blog post, it's complex.
In the meantime, anyone interested in how TOP and OPTION (FAST n) affect the choice of plan made by the optimizer could look at the following two blogs by the SQL Server Query Optimization Team:
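As a quick illustration of how the OPTION (FAST n) hint is written (a sketch; the table and column names are hypothetical), it asks the optimizer to favour a plan that streams the first n rows quickly, even if the complete result set ends up costing more overall:

```sql
-- Prefer a plan optimized for returning the first 100 rows fast;
-- the query still returns the full result set
SELECT OrderID, OrderDate
FROM dbo.SomeOrders
ORDER BY OrderDate DESC
OPTION (FAST 100);
```

Unlike TOP, the hint does not limit the rows returned; it only biases the plan choice toward low startup cost.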
Paul
Paul White
SQLPerformance.com
SQLkiwi blog
@SQL_Kiwi
January 14, 2010 at 6:20 pm
Thanks Paul.
Gail - can't wait to see that Blog post.
Jason...AKA CirqueDeSQLeil
January 15, 2010 at 2:15 am
ifila (1/14/2010)
After 25 years of dealing with IT people, I guess sometimes one can get cantankerous.
Thanks to all the constructive contributors!
Guilty of cantankerism as charged - but hey, it didn't work out so bad in the end.
Ifila, have you considered putting the row restriction on the innermost part of the query? Using TOP 1000 will give you the top 1000 resumes, which will of course yield perhaps twice as many email addresses, but it does mean that the slowest, most expensive part of the query has a limiter on it. You could use a variable for your limit value and apply it both to the innermost query, to improve performance, and to your outermost query, i.e. the pivot, to return a consistent rowcount.
Cheers
ChrisM
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
January 15, 2010 at 8:43 am
Chris Morris-439714 (1/15/2010)
ifila (1/14/2010)
After 25 years of dealing with IT people, I guess sometimes one can get cantankerous.
Thanks to all the constructive contributors!
Guilty of cantankerism as charged - but hey, it didn't work out so bad in the end.
Ifila, have you considered putting the row restriction on the innermost part of the query? Using TOP 1000 will give you the top 1000 resumes, which will of course yield perhaps twice as many email addresses, but it does mean that the slowest, most expensive part of the query has a limiter on it. You could use a variable for your limit value and apply it both to the innermost query, to improve performance, and to your outermost query, i.e. the pivot, to return a consistent rowcount.
Cheers
ChrisM
Thanks for the suggestion, but I don't understand what you mean :unsure:
January 15, 2010 at 8:55 am
This bit...
SELECT Email1,Email2,Email3, Email4,Email5,Email6,Email7,Email8,Email9,Email10 FROM outputresume3
WHERE ((contains (originalresume, '"j2ee" and "java"'))
AND (currentdateout BETWEEN '2000-01-01' AND '2010-01-06'))
change it to
SELECT top 1000 Email1,Email2,Email3, Email4,Email5,Email6,Email7,Email8,Email9,Email10 FROM outputresume3
WHERE ((contains (originalresume, '"j2ee" and "java"'))
AND (currentdateout BETWEEN '2000-01-01' AND '2010-01-06'))
You will want to put an ORDER BY on that. If you want it random, you could use ORDER BY NEWID(), but you might want it ordered by recency.
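Combining that with the variable suggestion from earlier (a sketch; @limit is a made-up variable name, while the table and columns come from the query above):

```sql
DECLARE @limit int = 1000;

SELECT TOP (@limit)  -- TOP with a variable needs the parentheses
       Email1, Email2, Email3, Email4, Email5,
       Email6, Email7, Email8, Email9, Email10
FROM outputresume3
WHERE CONTAINS(originalresume, '"j2ee" AND "java"')
  AND currentdateout BETWEEN '2000-01-01' AND '2010-01-06'
ORDER BY currentdateout DESC;  -- most recent first; ORDER BY NEWID() for random
```

The same @limit can then be reused on the outermost (pivot) query so both levels stay in step.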
January 15, 2010 at 9:16 am
Chris Morris-439714 (1/15/2010)
This bit...
SELECT Email1,Email2,Email3, Email4,Email5,Email6,Email7,Email8,Email9,Email10 FROM outputresume3
WHERE ((contains (originalresume, '"j2ee" and "java"'))
AND (currentdateout BETWEEN '2000-01-01' AND '2010-01-06'))
change it to
SELECT top 1000 Email1,Email2,Email3, Email4,Email5,Email6,Email7,Email8,Email9,Email10 FROM outputresume3
WHERE ((contains (originalresume, '"j2ee" and "java"'))
AND (currentdateout BETWEEN '2000-01-01' AND '2010-01-06'))
You will want to put an ORDER BY on that. If you want it random, you could use ORDER BY NEWID(), but you might want it ordered by recency.
Excluding the ORDER BY, that is the way I had it working before my post on SSC.
Thanks for the suggestion!
January 15, 2010 at 9:35 am
Yeah but...kinda.
This time round, you know exactly why you would put it there, and why you will (almost) always get more results in your final output than you specified in the row limit. You know that you can select the first 1000 matching resumes, most recent first, then extract all of the email addresses from them, then isolate 1000 email addresses.