Unexpected performance difference in query with where on varchar column

  • I'm relatively new to SQL 2005 and have a question about the execution performance of an query of mine (full query below). 

    The problem lies in the where statement. The FTO.peildatum column is an varchar(8) and I'm selecting columns with an specific value. in this case 20061001. Since this is aan varchar column I thought I should specify this as FTO.Peildatum = '20061001'. This makes the query, which returns about 29000 rows, take several minutes. If I change this to FTO.Peildatum = 20061001 than it takes only 6 seconds every time exactly. I expected FTO.Peildatum = 20061001 to be slower because a conversion from number to string is involved but it's not. Can anyone explain to me why?

     

    SELECT 

    FTO.Onderwijssector, FTO.Peildatum, FTO.Status, IFO.Aanmaakdatum, IFO.Brin, IFO.Datum_verklaring_ontvangen, IFO.Datum_verklaring_accountant_ontvangen, IFO.Datum_rappel_verstuurd, IFO.Controletotaal_accountant, ISG.Datum_inschrijving, ISG.Geplande_uitschrijfdatum, ISG.Werkelijke_uitschrijfdatum, ISG.Indicatie_bekostiging_inschrijving, ISG.Indicatie_gehandicapt, ISG.Hoogste_vooropleiding, ISG.Indicatie_risicodeelnemer, ISG.Leerweg, ISG.Intensiteit, ISG.Contacturen_per_week, ISG.Indicatie_nieuwkomer, ISG.Behaalde_opleiding, ISG.Datum_opleiding_behaald, ISG.Inschrijvingsvolgnummer, ISG.Opleiding, ISG.Cumi_categorie, ISG.Cumi_ratio, OOE.Persoonsgebonden_nummer, OOE.Geslacht, OOE.Postcodecijfers, OOE.Overlijdensdatum, OOE.Datum_vestiging_in_nederland, OOE.Datum_vertrek_uit_nederland, OOE.Geboorteland, OOE.Geboortedatum, OOE.Geboorteland_ouder_1, OOE.Geslacht_ouder_1, OOE.Geboorteland_ouder_2, OOE.Geslacht_ouder_2, OOE.Land_waarnaar_vertrokken, OOE.Verblijfstitel, OOE.Nationaliteit_1, OOE.Nationaliteit_2, OOE.Leeftijdmeetdatum_1, OOE.Leeftijd_op_meetdatum_1, OOE.Leeftijd_op_meetdatum_2, OOE.Leeftijdmeetdatum_2, OOE.Leeftijd_op_meetdatum_3, OOE.Leeftijdmeetdatum_3, OOE.Leeftijd_op_meetdatum_4, OOE.Leeftijdmeetdatum_4

    FROM

    FTO inner JOIN IFO ON FTO.ID = IFO.ID inner join OOE ON FTO.ID = OOE.FTOID inner join ISG ON ooe.persoonsgebonden_nummer = isg.persoonsgebonden_nummer

    WHERE

    (FTO.Peildatum = 20061001) AND (FTO.Onderwijssector = 'BO') and (OOE.Onderwijssector = 'BO') and (ISG.Onderwijssector = 'BO')

  • It that opposite to what I would have expected. But some questions first:

    You are sure the column in the table is a varchar(8)?

    What is the query plan saying? Is there an implicit conversion going on on that column? How about indexes?

    Does the plan change drastically between the two types of queries? Specifically related to the the table with that column in it.

    Your thinking is right though - a conversion on a column should make it slower. Worst would be the inability to use an index on that column.

    Cheers,CrispinI can't die, there are too many people who still have to meet me!It's not a bug, SQL just misunderstood me!

  • To answer your questions best I can.

    Yes I'm sure to column is varchar(8) and no the query plan is not changing much. I have no index on that column.

    The strang thing is that when I changed the column type to decimal and ran the query again using 20061001 without the quotes. I was even faster, 3 seconds. Leading me to think that the 6 second time in my next post included an conversion.

    The application is still changing so giving that column an different type is not a problem, I would however like to understand what went wrong.

     

  • Most odd.

    If you are using it as a date, I would say keep it as an integer.

    How are you running the query? Through SSMS or through the app when you see the differences?

    You say there is not much change to the query plan. A subtle (Define subtle) change to the plan can be detrimental to the query.

    Also, if you are going to use it as a filter, seriously consider putting an index on it.

    Cheers,CrispinI can't die, there are too many people who still have to meet me!It's not a bug, SQL just misunderstood me!

  • I use both ssms and the app is an report which I make using visual studio 2005. Visual studio is a little slower in showing the data than ssms. But both experience the same quantative speed-up and slow down in changing the query. I agree I should use indexes, but I'm not finshed yet. I'll start studying the query plan.

  • The problem could be something to do with the collation sequence.

    The comparison of varchars with Windows collation sequences, eg Latin1_General_CI_AS, is considerably slower than the comparison of varchars with SQL collation sequences, eg SQL_Latin1_General_CP1_CI_AS. If FTO.Peildatum has a Windows collation sequence then a comparison of the strings might be slower than converting every FTO.Peildatum to an integer. (As crispin has pointed out, this would also mean that any index on FTO.Peildatum would not be used.)

    [Edit]: On thinking about it, this is probably wrong as I suspect this just effects non-equi comparisons. I have not got time to test at the moment.

    As FTO.Peildatum is a date I would suggest you store it as either a datetime or smalldatetime.

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply