November 2, 2007 at 8:54 am
I have a question about a query im trying to execute.. The query looks like
DECLARE @Limit DATETIME
SET @Limit = '2007-11-02 14:00:00'
SELECT
a, b, c
FROM
tbl1 LEFT OUTER JOIN tbl2 LEFT OUTER JOIN tbl3, LEFT OUTER JOIN tbl4
WHERE
tbl1.TimeStamp > @Limit
This query never finished (at least takes very long time to execute) BUT if I dont use the @Limit variable
like this:
SELECT
a, b, c
FROM
tbl1 LEFT OUTER JOIN tbl2 LEFT OUTER JOIN tbl3, LEFT OUTER JOIN tbl4
WHERE
tbl1.TimeStamp > '2007-11-02 14:00:00'
the query executes in less than a minute. tbl1.TimeStamp has datatype DATETIME.
Does anyone know the reason that I`m not able to use a variable?
Any help appreciated!
November 2, 2007 at 9:16 am
I'm not sure what you're running this in, but the fact that you're running 3 cross joins (meaning joins with NO join criteria) is going to very quickly return VERY large recordsets.
If each of your tables had 100 records, the query (with no WHERE clause) would return 100 million rows. If they each had 1000 rows, that's a 1 Trillion row data set.
Depending on what you're using to return those results, you will find that the query wasn't done running when it showed you some results.
So - start by adding join criteria (how does table1 relate to table2/ relate to table3?). Then look at the indexing and make sure that it's right. You're going to need some index involving the date field on table1, as well as indexes on the primary and foreign keys.
----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?
November 2, 2007 at 9:22 am
Hi Matt
I have criterias on the joins, but i did'nt write them in the post, sorry for that. Even though I have criterias its still a very large dataset!
Why do the indexing matter, All I do to make it work is to type the variable value directly into the query instead of using the variable?
November 2, 2007 at 9:30 am
THAT doesn't make much sense. you should find that both of them should be the same execution plan.
Did you run DBCC Freeproccache between the two executions? I have a feeling the "second" run reused some of the stats from the first.
the indexing would make a lot of difference, since it would get it to the data much faster (in both scenarios).
----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?
November 2, 2007 at 11:17 am
Get execution plan for each of them. Compare them. If execution plans are very different, check when statistic was last updated for all indexes for all these tables. If not recently, update them.
November 4, 2007 at 6:38 pm
Try this
DECLARE @Limit DATETIME
SET @Limit = '2007-11-02 14:00:00'
SELECT
a, b, c
FROM
tbl1 LEFT OUTER JOIN tbl2 LEFT OUTER JOIN tbl3, LEFT OUTER JOIN tbl4
WHERE
tbl1.TimeStamp > (SELECT @Limit)
I had the same problem once and this small change works well :hehe:
November 4, 2007 at 7:24 pm
If that last suggestion works, it would be amazing to me. It shouldn't matter, other than forcing a new execution plan. But if the optimum plan is chosen, then it should be the same as including the (SELECT).
November 5, 2007 at 12:31 am
Classic parameter sniffing problem.
When you use the value in the query (or a parameter), the optimiser, when it compiles the query, knows the value used in the condition, and can make appropriate decisions based on the number of rows that will be affected (based on the statistics on the table)
When you use a variable, the optimiser cannot see the value and has to make a guess as to how much of the table will be affected by the query. It will assume 30%, and will select a plan based on that.
If you're using a variable, and the 30% os very far off the actual rows, you may end up with a very sub-optimal execution plan, and hence very long execution times.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
November 5, 2007 at 2:21 am
do you have any common field between the tables ?
November 5, 2007 at 2:45 am
Yes, I have some common fields, and these are used in the join clause, so the result dataset is as small as possible.
Is there some solution to this or is the only way not to use a variable? Can I make the optimizer read the value instead of guessing?
I`m quite new to SQL-server and really appreciate your help, thanks.
November 5, 2007 at 6:38 am
You can try the optimise for query hint, if there's a specific value that you usually use. Problem with that is if you pass in other values with vastly different row counts, you'll get poor performance again.
SELECT ... FROM ...
WHERE SomeColumn > @YourVariableName
OPTION (OPTIMIZE FOR (@YourVariableName = 'Common value')
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
November 5, 2007 at 7:12 am
The two most likely possible issues here are as mentioned:
Parameter sniffing, which will result in different execution plans. (can we see them, or have you at least checked them?)
Caching of data. have you repeated this test, changed order etc. To ensure this is the only influencing factor?
I ask because there are many factors, one of which is blocking. Also consider profiling the process to see what it is getting hungup on.
November 5, 2007 at 2:58 pm
Gail Shaw's reply tells you why you've seen the performance difference and others have given you clues about what you can do about it.
Adding to these, here's my suggestion.
Look at the Execution plans for the query when run with/without the parameter and try to understand the differences. I've seen similar situations and the key difference has been the choice of Index used to look up the required rows. If this is also your problem, you can use an Index-hint to force the optimizer to adopt a more efficient execution plan. This may work for you, but beware that it comes with it's own baggage.
Viewing 13 posts - 1 through 12 (of 12 total)
You must be logged in to reply to this topic. Login to reply