February 18, 2008 at 11:59 am
Hi,
I am facing a situation where in the UDF takes more time when compared to the base query itself.
UDF Definition
CREATE FUNCTION [fn_GetStateByZip] (@ZipCode varchar(5))
RETURNS varchar(2)
AS
BEGIN
RETURN(SELECT DISTINCT State FROM zipcodes WHERE Zipcode = @ZipCode)
END
DECLARE @START datetime
SELECT @START = getdate()
SELECT fn_GetStateByZip('37814')
PRINT 'Time taken in ms: ' + CAST(DATEDIFF(ms,@START,getdate()) as char(10));
Output is Time taken in ms: 16
DECLARE @START datetime
SELECT @START = getdate()
SELECT DISTINCT STATE FROM zipcodes where zipcode ='37814'
PRINT 'Time taken in ms: ' + CAST(DATEDIFF(ms,@START,getdate()) as char(10));
Output is Time taken in ms: 0
I ran this test for a couple of times and still get the same kind of result
The zipcodes table have the following indexes
CREATE CLUSTERED INDEX [IX_ZIPCODE] ON [Zipcodes]
(
[ZipCode] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
CREATE NONCLUSTERED INDEX [IX_STATE] ON [Zipcodes]
(
[State] ASC
)
INCLUDE ( [ZipCode]) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
Any help is appreciated
Thanks
Vinoj
February 18, 2008 at 12:38 pm
How many times have you run the test? First section of the test caches the data for the second one. Try reversing the two tests and see what happens.
--Jeff Moden
Change is inevitable... Change for the better is not.
February 18, 2008 at 12:43 pm
On reversing the order, the function at times takes 0 sec. for a sample of 5 test runs the UDF takes 16 ms in 2 tests and 0 secs in the other tests. The base query takes 0 sec all the time
February 18, 2008 at 12:48 pm
Why? That's as bad a double post. You're just gonna make people mad...
--Jeff Moden
Change is inevitable... Change for the better is not.
February 18, 2008 at 12:51 pm
The fact that the UDF is taking longer than the embedded code is normal... there's a bit of an overhead associated with running UDF's in many cases especially when the UDF has to do a bit of disk access.
The code is so simple, I'm pretty sure I wouldn't allow a UDF in my db to do this. If it's for a GUI, I'd probably write it as a stored proc.
--Jeff Moden
Change is inevitable... Change for the better is not.
February 18, 2008 at 1:00 pm
It can be caused by the UDF having to store an execution plan that allows for larger and smaller ranges of answers, based on the table statistics.
The hard-coded version doesn't have to do that. It has one possible answer.
Try running the non-UDF version using a declared variable, see if that changes the speed on it.
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
February 18, 2008 at 1:26 pm
Here are a couple of suggestions you could try.
Change the query in your fucntion to this. I think is is also logically better, because it prevents the possibility of your function having more than one result row.
SELECT TOP 1 State FROM zipcodes WHERE Zipcode = @ZipCode
Add this index to your table to cover the query.
CREATE UNIQUE NONCLUSTERED INDEX [IX_ ZipCode_STATE] ON [Zipcodes]
( [ZipCode] ASC , [State] ASC )
WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF,
IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF,
ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
February 18, 2008 at 2:25 pm
Jeff Moden (2/18/2008)
The fact that the UDF is taking longer than the embedded code is normal... there's a bit of an overhead associated with running UDF's in many cases especially when the UDF has to do a bit of disk access.The code is so simple, I'm pretty sure I wouldn't allow a UDF in my db to do this. If it's for a GUI, I'd probably write it as a stored proc.
I am using these functions in other procedures, one example of such is shown below(this is a body of a stored proc)
SELECT
B.PRODUCTTYPE
, B.POLICYTYPE
FROM
COMPANIES AS A
INNER JOIN COMPANYPRODUCTS AS B
ON A.COMPANYID=B.COMPANYID
AND A.STATE =fn_GetStateByZip('37013')
February 18, 2008 at 2:30 pm
Michael Valentine Jones (2/18/2008)
Here are a couple of suggestions you could try.Change the query in your fucntion to this. I think is is also logically better, because it prevents the possibility of your function having more than one result row.
SELECT TOP 1 State FROM zipcodes WHERE Zipcode = @ZipCode
Add this index to your table to cover the query.
CREATE UNIQUE NONCLUSTERED INDEX [IX_ ZipCode_STATE] ON [Zipcodes]
( [ZipCode] ASC , [State] ASC )
WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF,
IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF,
ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
Thanks for the suggestions. I did try out for the first one. More over it does makes logical sense as you pointed out. Unfortunately cannot go for the second suggestion as there are multiple records for the State,zipcode combination.( Came to know this only once when the data started flowing in). The same table has columns for County and city. So there are certain zipcodes in some states that are shared between counties. 🙁
The table structure is as follows
ZipcodesZipCode
ZipcodesCountyCode
ZipcodesCityCode
ZipcodesState
ZipcodesCountyName
ZipcodesCityName
ZipcodesLocationCode
And I dont want to create an index involving more than 50% of the columns
February 18, 2008 at 2:48 pm
You might try creating the index I suggested as non-unique to cover the query. It lets SQL Server treat the index like a table without a bookmark lookup.
As an alternative, you could modify the clustered index by adding State, County, City to make it unique, or just turn it into a clustered primary key.
February 18, 2008 at 3:11 pm
Michael Valentine Jones (2/18/2008)
You might try creating the index I suggested as non-unique to cover the query. It lets SQL Server treat the index like a table without a bookmark lookup.As an alternative, you could modify the clustered index by adding State, County, City to make it unique, or just turn it into a clustered primary key.
Thanks for the tip. It was helpful.
Viewing 11 posts - 1 through 10 (of 10 total)
You must be logged in to reply to this topic. Login to reply