June 3, 2010 at 1:32 pm
I was presented with a query that was having problems. The original query was over a view that was over three tables so I broke it down to query directly against the three tables and try to speed up the process. What I found has confused me.
The query is looking for a range of dates, using the BETWEEN clause. The table it is validating these dates against has 489,495 rows. The original query was looking for dates that do not exist in the table. I then found some dates that do exist and ran it for those dates and the results were completely different. When run for the non-existent dates, the query took around two minutes and returned zero records. When run for existing dates, the query took a few seconds and returned the records.
There is not any kind of index on this date, or any other helpful column, so for each query a Clustered Index Scan was run. On the query for existing dates, this scan returned the 489,495 rows before joining with the other tables' data and getting pared down. On the query for the non-existent dates, the scan returned over 43 million rows before joining with the other tables' data and getting pared down (to nothing).
There were some other differences in the execution plan. In the query with existing dates, the first two tables data was joined before joining with the third table's (problem table) data. In the other query, the second table was joined with the third (problem) table before those results were joined with the first table. The only difference was the dates selected.
I wish I could give you some good examples of the tables and the queries but there is just simply too much there to easily put into an example. My biggest question concerns the behavior of the Clustered Index Scan on the problem table based on the different dates and whether they exist or not. I understand how bad joins can cause extra, erroneous data. I just don't understand the difference here.
Is it possible to force the execution plan to behave in a certain manner, i.e. perform joins in a certain manner?
Thanks,
hawg
----------------------------------------------------------01010011010100010100110000100000010100110110010101110010011101100110010101110010001000000101001001101111011000110110101101110011
June 3, 2010 at 1:59 pm
hawg (6/3/2010)
There is not any kind of index on this date, or any other helpful column ...
You need to add the date column to an index. You won't get even decent performance without doing so.
Is it possible to force the execution plan to behave in a certain manner, i.e. perform joins in a certain manner?
Yes, it is possible, but that is generally a last resort and is frowned upon. Option #1 is to fix obvious problems with indexes and then re-evaluate the execution plan.
June 3, 2010 at 2:02 pm
I would love to add an index but this is a third-party database so it's not going to happen - at least not by me. The vendor would have to make that change.
I know overriding the optimizer and forcing the JOIN order is not a good idea but do you know how to do it?
----------------------------------------------------------01010011010100010100110000100000010100110110010101110010011101100110010101110010001000000101001001101111011000110110101101110011
June 3, 2010 at 2:12 pm
hawg (6/3/2010)
I would love to add an index but this is a third-party database so it's not going to happen - at least not by me. The vendor would have to make that change.
You sure you can't? I've worked on SQL databases that host data for 3rd party systems that I've added indexes to.
I know overriding the optimizer and forcing the JOIN order is not a good idea but do you know how to do it?
You're going to get terrible results with a full table scan if you don't have an index on the column(s) in your where clause. You can look at join hints but I can almost guarantee it will do more harm than good.
June 3, 2010 at 2:15 pm
History tells me this is not going to happen with this system. That doesn't mean I won't try, though.
Anyway, I realize an index is the best remedy for poor performance in this case but I did figure out how to force the JOIN order and it went down to under a second so, at least in the interim, this might be my best scenario.
Thanks for the help.
hawg
----------------------------------------------------------01010011010100010100110000100000010100110110010101110010011101100110010101110010001000000101001001101111011000110110101101110011
Viewing 5 posts - 1 through 4 (of 4 total)
You must be logged in to reply to this topic. Login to reply