UDF Taking More Time than the base query

  • Hi,

    I am facing a situation where in the UDF takes more time when compared to the base query itself.

    UDF Definition

    CREATE FUNCTION [fn_GetStateByZip] (@ZipCode varchar(5))

    RETURNS varchar(2)

    AS

    BEGIN

    RETURN(SELECT DISTINCT State FROM zipcodes WHERE Zipcode = @ZipCode)

    END

    DECLARE @START datetime

    SELECT @START = getdate()

    SELECT fn_GetStateByZip('37814')

    PRINT 'Time taken in ms: ' + CAST(DATEDIFF(ms,@START,getdate()) as char(10));

    Output is Time taken in ms: 16

    DECLARE @START datetime

    SELECT @START = getdate()

    SELECT DISTINCT STATE FROM zipcodes where zipcode ='37814'

    PRINT 'Time taken in ms: ' + CAST(DATEDIFF(ms,@START,getdate()) as char(10));

    Output is Time taken in ms: 0

    I ran this test for a couple of times and still get the same kind of result

    The zipcodes table have the following indexes

    CREATE CLUSTERED INDEX [IX_ZIPCODE] ON [Zipcodes]

    (

    [ZipCode] ASC

    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

    CREATE NONCLUSTERED INDEX [IX_STATE] ON [Zipcodes]

    (

    [State] ASC

    )

    INCLUDE ( [ZipCode]) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

    Any help is appreciated

    Thanks

    Vinoj

  • How many times have you run the test? First section of the test caches the data for the second one. Try reversing the two tests and see what happens.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • On reversing the order, the function at times takes 0 sec. for a sample of 5 test runs the UDF takes 16 ms in 2 tests and 0 secs in the other tests. The base query takes 0 sec all the time

  • Why? That's as bad a double post. You're just gonna make people mad...

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • The fact that the UDF is taking longer than the embedded code is normal... there's a bit of an overhead associated with running UDF's in many cases especially when the UDF has to do a bit of disk access.

    The code is so simple, I'm pretty sure I wouldn't allow a UDF in my db to do this. If it's for a GUI, I'd probably write it as a stored proc.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • It can be caused by the UDF having to store an execution plan that allows for larger and smaller ranges of answers, based on the table statistics.

    The hard-coded version doesn't have to do that. It has one possible answer.

    Try running the non-UDF version using a declared variable, see if that changes the speed on it.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • Here are a couple of suggestions you could try.

    Change the query in your fucntion to this. I think is is also logically better, because it prevents the possibility of your function having more than one result row.

    SELECT TOP 1 State FROM zipcodes WHERE Zipcode = @ZipCode

    Add this index to your table to cover the query.

    CREATE UNIQUE NONCLUSTERED INDEX [IX_ ZipCode_STATE] ON [Zipcodes]

    ( [ZipCode] ASC , [State] ASC )

    WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF,

    IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF,

    ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

  • Jeff Moden (2/18/2008)


    The fact that the UDF is taking longer than the embedded code is normal... there's a bit of an overhead associated with running UDF's in many cases especially when the UDF has to do a bit of disk access.

    The code is so simple, I'm pretty sure I wouldn't allow a UDF in my db to do this. If it's for a GUI, I'd probably write it as a stored proc.

    I am using these functions in other procedures, one example of such is shown below(this is a body of a stored proc)

    SELECT

    B.PRODUCTTYPE

    , B.POLICYTYPE

    FROM

    COMPANIES AS A

    INNER JOIN COMPANYPRODUCTS AS B

    ON A.COMPANYID=B.COMPANYID

    AND A.STATE =fn_GetStateByZip('37013')

  • Michael Valentine Jones (2/18/2008)


    Here are a couple of suggestions you could try.

    Change the query in your fucntion to this. I think is is also logically better, because it prevents the possibility of your function having more than one result row.

    SELECT TOP 1 State FROM zipcodes WHERE Zipcode = @ZipCode

    Add this index to your table to cover the query.

    CREATE UNIQUE NONCLUSTERED INDEX [IX_ ZipCode_STATE] ON [Zipcodes]

    ( [ZipCode] ASC , [State] ASC )

    WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF,

    IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF,

    ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

    Thanks for the suggestions. I did try out for the first one. More over it does makes logical sense as you pointed out. Unfortunately cannot go for the second suggestion as there are multiple records for the State,zipcode combination.( Came to know this only once when the data started flowing in). The same table has columns for County and city. So there are certain zipcodes in some states that are shared between counties. 🙁

    The table structure is as follows

    ZipcodesZipCode

    ZipcodesCountyCode

    ZipcodesCityCode

    ZipcodesState

    ZipcodesCountyName

    ZipcodesCityName

    ZipcodesLocationCode

    And I dont want to create an index involving more than 50% of the columns

  • You might try creating the index I suggested as non-unique to cover the query. It lets SQL Server treat the index like a table without a bookmark lookup.

    As an alternative, you could modify the clustered index by adding State, County, City to make it unique, or just turn it into a clustered primary key.

  • Michael Valentine Jones (2/18/2008)


    You might try creating the index I suggested as non-unique to cover the query. It lets SQL Server treat the index like a table without a bookmark lookup.

    As an alternative, you could modify the clustered index by adding State, County, City to make it unique, or just turn it into a clustered primary key.

    Thanks for the tip. It was helpful.

Viewing 11 posts - 1 through 10 (of 10 total)

You must be logged in to reply to this topic. Login to reply