December 6, 2010 at 11:07 am
Hello. I pull data from a SQL 2000 environment via SSIS into a 2008 DB of ours.
Select Distinct RCD.Variable4
FROM Route_Call_Detail RCD
WHERE RCD.[DateTime] > '12/4/10' AND RCD.[DateTime] < '12/5/10'
AND SUBSTRING(RCD.Variable4, 1, 6) NOT LIKE '%[^0-9]%' --Thanks to Jeff Moden for this
--AND CONVERT(INT, SUBSTRING(RCD.Variable4, 1, 6)) BETWEEN 5011 AND 7999
This works fine, purely numeric values returned for Variable4 1st 6 chars. Uncomment that last line and I get this error:
Syntax error converting the varchar value 'AZ' to a column of data type int.
There are rows where Variable4 = 'AZ' but I thought those would be short circuited by the statement above. It works in test :crazy: Anyhow, I tried putting this in a case statement like some people have offered but no luck. Keep in mind the query runs in 2000 before getting over to us (we're fortunately not the keepers of the 2000 data).
Thanks,
Ken
December 6, 2010 at 11:19 am
i think you'll just have to wrap a section of the query into a subquery, since the data has to be filtered on the LIKE statement first;
because the second part uses a function (CONVERT)...it's not SARG-able, so a short circuit base don the index seeks cannot occur, right?
SELECT * FROM (
Select
Distinct RCD.Variable4
FROM Route_Call_Detail RCD
WHERE RCD.[DateTime] > '12/4/10'
AND RCD.[DateTime] < '12/5/10'
AND SUBSTRING(RCD.Variable4, 1, 6) NOT LIKE '%[^0-9]%' --Thanks to Jeff Moden for this
) MyAlias
WHERE CONVERT(INT, SUBSTRING(RCD.Variable4, 1, 6)) BETWEEN 5011 AND 7999
Lowell
December 6, 2010 at 11:48 am
SQL makes no guarantee as to what order expressions will be evaluated in and, as such, one cannot assume that any form of 'short-circuiting' will always occur, especially when dealing with data columns and not just constant expressions.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
December 6, 2010 at 11:54 am
Lowell (12/6/2010)
i think you'll just have to wrap a section of the query into a subquery, since the data has to be filtered on the LIKE statement first;because the second part uses a function (CONVERT)...it's not SARG-able, so a short circuit base don the index seeks cannot occur, right?
SELECT * FROM (
Select
Distinct RCD.Variable4
FROM Route_Call_Detail RCD
WHERE RCD.[DateTime] > '12/4/10'
AND RCD.[DateTime] < '12/5/10'
AND SUBSTRING(RCD.Variable4, 1, 6) NOT LIKE '%[^0-9]%' --Thanks to Jeff Moden for this
) MyAlias
WHERE CONVERT(INT, SUBSTRING(RCD.Variable4, 1, 6)) BETWEEN 5011 AND 7999
Makes sense but surprisingly it didn't change anything.
Ken
December 6, 2010 at 11:55 am
You've run into a common issue, and it's a natural one to make. The Where clause limits the data, then we do the rest...
Sadly, this isn't true. You can't make the optimizer run the WHERE first. Not directly. You have no control after you declare the results you want without hinting. Note, this is key. You declare the results you want, you don't give SQL a process, not directly.
The two ways out of this are to build a subquery with the where clause before you do any conversion and then use the OPTION (FORCE ORDER) tag (which I don't recommend without thorough knowledge of the impact), or to drop the data you're going to process to a #TMP and then do conversions after that.
Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.
For better assistance in answering your questions[/url] | Forum Netiquette
For index/tuning help, follow these directions.[/url] |Tally Tables[/url]
Twitter: @AnyWayDBA
December 6, 2010 at 11:57 am
ken.trock (12/6/2010)
Lowell (12/6/2010)
i think you'll just have to wrap a section of the query into a subquery, since the data has to be filtered on the LIKE statement first;because the second part uses a function (CONVERT)...it's not SARG-able, so a short circuit base don the index seeks cannot occur, right?
SELECT * FROM (
Select
Distinct RCD.Variable4
FROM Route_Call_Detail RCD
WHERE RCD.[DateTime] > '12/4/10'
AND RCD.[DateTime] < '12/5/10'
AND SUBSTRING(RCD.Variable4, 1, 6) NOT LIKE '%[^0-9]%' --Thanks to Jeff Moden for this
) MyAlias
WHERE CONVERT(INT, SUBSTRING(RCD.Variable4, 1, 6)) BETWEEN 5011 AND 7999
Makes sense but surprisingly it didn't change anything.
Ken
Heh, what I get for leaving the window open a bit. 🙂 Use OPTION (FORCE ORDER) at the tail of that, last piece of the statement.
Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.
For better assistance in answering your questions[/url] | Forum Netiquette
For index/tuning help, follow these directions.[/url] |Tally Tables[/url]
Twitter: @AnyWayDBA
December 6, 2010 at 12:00 pm
Or, as an (untested) alternative.
Select
Distinct RCD.Variable4
FROM Route_Call_Detail RCD
WHERE RCD.[DateTime] > '12/4/10'
AND RCD.[DateTime] < '12/5/10'
AND CASE WHEN SUBSTRING(RCD.Variable4, 1, 6) NOT LIKE '%[^0-9]%' THEN CONVERT(INT, SUBSTRING(RCD.Variable4, 1, 6)) ELSE NULL END BETWEEN BETWEEN 5011 AND 7999
)
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
December 6, 2010 at 12:41 pm
ken.trock (12/6/2010)
Hello. I pull data from a SQL 2000 environment via SSIS into a 2008 DB of ours.
Select Distinct RCD.Variable4
FROM Route_Call_Detail RCD
WHERE RCD.[DateTime] > '12/4/10' AND RCD.[DateTime] < '12/5/10'
AND SUBSTRING(RCD.Variable4, 1, 6) NOT LIKE '%[^0-9]%' --Thanks to Jeff Moden for this
--AND CONVERT(INT, SUBSTRING(RCD.Variable4, 1, 6)) BETWEEN 5011 AND 7999
This works fine, purely numeric values returned for Variable4 1st 6 chars. Uncomment that last line and I get this error:
Syntax error converting the varchar value 'AZ' to a column of data type int.
There are rows where Variable4 = 'AZ' but I thought those would be short circuited by the statement above. It works in test :crazy: Anyhow, I tried putting this in a case statement like some people have offered but no luck. Keep in mind the query runs in 2000 before getting over to us (we're fortunately not the keepers of the 2000 data).
Thanks,
Ken
Ken,
A LIKE that starts with a '%' won't be able to utilize an index on the column, which may make a noticeable impact. Since you want the first six characters to be digits, you might want to change it to:
AND RCD.Variable4 LIKE '[0-9][0-9][0-9][0-9][0-9][0-9]%'
which is SARGable and thus will utilize an index.
This might also fix the issue you were having... if sql is already doing a scan on the table because of this, it would just go ahead and compare the other column at the same time. You would probably still need to put it into a CTE or sub-query as Lowell showed you.
Wayne
Microsoft Certified Master: SQL Server 2008
Author - SQL Server T-SQL Recipes
December 6, 2010 at 12:49 pm
You're checking the first six characters are digits, so why not:
WHERE LEFT(RCD.Variable4, 4) BETWEEN '5011' AND '7999'
which can potentially make use of an index?
For better assistance in answering your questions, please read this[/url].
Hidden RBAR: Triangular Joins[/url] / The "Numbers" or "Tally" Table: What it is and how it replaces a loop[/url] Jeff Moden[/url]
December 6, 2010 at 1:02 pm
GilaMonster (12/6/2010)
Or, as an (untested) alternative.
Select
Distinct RCD.Variable4
FROM Route_Call_Detail RCD
WHERE RCD.[DateTime] > '12/4/10'
AND RCD.[DateTime] < '12/5/10'
AND CASE WHEN SUBSTRING(RCD.Variable4, 1, 6) NOT LIKE '%[^0-9]%' THEN CONVERT(INT, SUBSTRING(RCD.Variable4, 1, 6)) ELSE NULL END BETWEEN BETWEEN 5011 AND 7999
)
Yes, this works! I was trying to do something like this myself but couldn't quite get there. Thanks everyone who replied.
Ken
December 6, 2010 at 1:18 pm
ken.trock (12/6/2010)
GilaMonster (12/6/2010)
Or, as an (untested) alternative.
Select
Distinct RCD.Variable4
FROM Route_Call_Detail RCD
WHERE RCD.[DateTime] > '12/4/10'
AND RCD.[DateTime] < '12/5/10'
AND CASE WHEN SUBSTRING(RCD.Variable4, 1, 6) NOT LIKE '%[^0-9]%' THEN CONVERT(INT, SUBSTRING(RCD.Variable4, 1, 6)) ELSE NULL END BETWEEN BETWEEN 5011 AND 7999
)
Yes, this works! I was trying to do something like this myself but couldn't quite get there. Thanks everyone who replied.
Watch the performance. Probably want an index on Datetime include Variable4 for this. No point in having Variable4 in the key, that predicate's not exactly SARGable.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
December 6, 2010 at 1:41 pm
Watch the performance. Probably want an index on Datetime include Variable4 for this. No point in having Variable4 in the key, that predicate's not exactly SARGable.
This is actually data from a 3rd party. It can be customized but we don't control it. They at least have an index for us on DateTime (I know, not the greatest name for a column).
Viewing 12 posts - 1 through 11 (of 11 total)
You must be logged in to reply to this topic. Login to reply