Performance on SELECT *

  • Just curious, does anybody know if there is any performance gain or loss when using SELECT * when you actually need all of the columns in a table returned.  IE.

    SELECT ID, FirstName, LastName FROM tblPerson

    SELECT * FROM tblPerson

     

     

  • You do not have any criterion in search. You do not need your data displayed in order. So, you may not need any index in your table.

  • If you're asking purely from a performance perspective, then there should be no difference between the 2.

    The reasons for avoiding Select * in this scenario (scenario= all columns required), are in maintainability and robustness of the solution.

    Adding a column to the table in the future creates more risk if there are "Select *" queries written against the table.

  • I thought it was always good practice not to use SELECT * b/c there was an extra performance hit b/c it had to go and lookup the column names?

  • My understanding had nothing to do with * being better with performance it was in case of change the impact it may have.

    Example I use * on my audit triggers because I want an error to produce when the receiving audit table does not match the origin table.

    However suppose I have your table and I do this

    SELECT * FROM tblPerson

    Then my developer writes and application that references the fields by Ordinal position like so

    RecordsetObject.Field(0).Value ' Outputs ID'

    RecordsetObject.Field(1).Value ' First Name'

    RecordsetObject.Field(2).Value ' Last Name'

    Now if you change the table like so

    ID, First Name, Middle Name, Last Name the application will provide the wrong data values.

    Furthermore I have seen a lot of folks write applications that generate spreadsheets or tables in ASP by looping thru the columns for titles and data so they aren't fixed.

    So now suppose you create a field called Social_Security_Number, now the application without changing it will show the SSN of every person.

    So * was better to avoid for 3 reason, unexpected application results, security of data and network overhead (if you don't need a column it is still being passed in the buffer and on the wire if the app is on a seperate server from the SQL Server.

    But you can still use * to your bennefit in cases like the one where they loop thru the fields to get the name and value.

    Create a view that has the column you want

    SELECT [ID], First_Name, Last_Name FROM dbo.TableX

    Then create a Stored Procedure that uses * to Select * From OurVIewsName, and applications that get column name and value by looping thru the object can be controled via the view. Now if you want to add Middle_Name you can without fear of what effect it has.

  • I understand the affects of using SELECT * on a table that changes.  I was just looking for pure perfmance.  I am curious though Antares686, why a developer would reference a a field in a Recordset as RecordsetObject.Field(0).Value instead RecordsetObject.Field("FirstName").Value.  Thanks for your reponse everyone. 

  • Because you could. Ordinal position does apparently access faster than Column Name becuase ADO had to find the column based name versus being told exactly where to look. Even thou there was a performance bennefit I never tested to see how much. I have seen to many issues come from ordinal position refences than I cared to so I choose not to adopt ordinal position over field name. After a while it became a poor practice as well to use ordinal position as well even thou it does have better performance just too many risks unless you really control your design well especially since a lot of folks used * instead of column names in their queries with their design. But as for a performance diff between use of * and Column Names only what I mentioned on having more data that you need on the wire otherwise if all columns are used there is no difference between that and * to my knowledge. 

  • Actually there's a way to use the ordinal positions and get the performance boost while still being safe. Consider this [LONG] code :

    Private Sub CHARgerTreeViewSectionModules(ByVal ModNode AS NodeByVal FkDB AS Integer)

    ON Error GOTO Gestion

        Dim RsADP AS ADODB.Recordset

        Dim RsPkADP AS ADODB.Recordset

        Dim MemoryDBName AS String

        Dim LaSp AS String

        Dim LeParam AS String

        Dim MySp AS Node

        Dim MyParam AS Node

        Dim SectionInfo AS ParameterInfo

        Dim Done AS Boolean

        Dim ParamInfo AS ParameterInfo

        Dim BlankParameterInfo AS ParameterInfo

        Dim AccessObject AS String

        Dim LaAcObj AS Node

        Dim PkADP AS Integer

        Dim ADPFilter AS String

        Dim CurrentIconKey AS String

        SET RsADP New ADODB.Recordset

        SET RsPkADP New ADODB.Recordset

        

        Dim otPkTypeObj AS ADODB.Field

        Dim otDescTypeObj AS ADODB.Field

        

        Dim opPkADP AS ADODB.Field

        

        Dim oaPkADP AS ADODB.Field

        Dim oaADPName AS ADODB.Field

        Dim oaPkObjADP AS ADODB.Field

        Dim oaObjName AS ADODB.Field

        Dim oaFkTypeObj AS ADODB.Field

        Dim oaChilds AS ADODB.Field

        

        MemoryDBName GetDBName(MyCn.ConnectionString)

        MyCn.EXECUTE "Use Documentation"

        

        OpenDiscRS RsADP"SELECT ADPS.PkADP, ADPS.ADPName, ADPS.DescADP, ObjADPS.PkObjADP, ObjADPS.ObjName, ObjADPS.FkTypeObj, ObjADPS.DescObjADP, (Select count(*) from vwObjADPS where FkParent = ObjADPS.PkObjADP) as Childs FROM ObjADPS INNER JOIN ADPS ON ObjADPS.FkADP = ADPS.PkADP WHERE ADPS.FkDB = "FkDB " and ObjADPS.PkObjADP not in (SELECT Distinct RelationsADPS.FkChild FROM RelationsADPS INNER JOIN ObjADPS ON RelationsADPS.FkChild = ObjADPS.PKObjADP INNER JOIN ADPS ON ObjADPS.FkADP = ADPS.PkADP WHERE ADPS.FkDB = "FkDB &quot ORDER BY ADPS.ADPName, ObjADPS.FkTypeObj, ObjADPS.ObjName"

        

        --'DescADP is in the select only because it is required for the order by, it is not used in this sub otherwise

        OpenDiscRS RsPkADP"SELECT Distinct PkADP, DescADP FROM ADPS WHERE FkDB = "FkDB " ORDER BY DescADP"

        

        LaSp ""

        LeParam ""

        IF NOT RsADP.EOF THEN

            OpenRsADPObjTypes

            

            SET otPkTypeObj RsADPObjTypes.Fields("PkTypeObj"

            SET otDescTypeObj RsADPObjTypes.Fields("DescTypeObj"

            

            SET opPkADP RsPkADP.Fields("PkADP"

            

            SET oaPkADP RsADP.Fields("PkADP"

            SET oaADPName RsADP.Fields("ADPName"

            SET oaPkObjADP RsADP.Fields("PkObjADP"

            SET oaObjName RsADP.Fields("ObjName"

            SET oaFkTypeObj RsADP.Fields("FkTypeObj"

            SET oaChilds RsADP.Fields("Childs"

            WHILE NOT RsPkADP.EOF

                ParamInfo BlankParameterInfo

                RsADP.Filter "PkADP = "opPkADP.Value

                IF NOT RsADP.EOF THEN

                    ADPFilter "PkADP = "opPkADP.Value

                    LaSp oaADPName.Value

                    SET MySp Me.TvDocumentation.Nodes.ADD(ModNodetvwChild"a"KeyIdLaSpGetIconKey(ObjectTypes.Modules))

                    GCount GCount 1

                    KeyId KeyId 1

                    PkADP oaPkADP.Value

                    ParamInfo.id PkADP

                    ParamInfo.name LaSp

                    ParamInfo.NodeType "ADP"

                    MySp.Tag KeyId 1

                    LesInfos(KeyId 1ParamInfo

                    RsADPObjTypes.MoveFirst

                    WHILE NOT RsADPObjTypes.EOF

                        Done False

                        AccessObject otDescTypeObj.Value

                        CurrentIconKey GetIconKey(ConvertServerTypeToObjectType(otDescTypeObj.Value))

                        SET LaAcObj Me.TvDocumentation.Nodes.ADD(MySptvwChild"a"KeyIdAccessObjectCurrentIconKey)

                        KeyId KeyId 1

                        GCount GCount 1

                        RsADP.Filter ADPFilter " And FkTypeObj = "otPkTypeObj.Value

                        WHILE NOT RsADP.EOF AND NOT Done

                            LeParam oaObjName.Value

                            SET MyParam Me.TvDocumentation.Nodes.ADD(LaAcObjtvwChild"a"KeyIdLeParamCurrentIconKey)

                            KeyId KeyId 1

                            GCount GCount 1

                            ParamInfo BlankParameterInfo

                            ParamInfo.id oaPkObjADP.Value

                            --'ParamInfo.Precision is an object

                            ParamInfo.name LeParam

                            ParamInfo.NodeType "ADPOBJ"

                            ParamInfo.VarType AccessObject

                            MyParam.Tag KeyId 1

                            LesInfos(KeyId 1ParamInfo

                            

                            IF oaChilds.Value THEN

                                IF AccessObject "Class"OR AccessObject "Form"OR AccessObject "Report"THEN

                                    ShowSubFormsOrReports MyParamParamInfoTrue

                                END IF

                            END IF

                            --'MyParam.EnsureVisible

                            

                            RsADP.MoveNext

                            IF NOT RsADP.EOF THEN

                                IF PkADP <> oaPkADP.Value THEN

                                    Done True

                                END IF

                            END IF

                        Wend

                        RsADPObjTypes.MoveNext

                    Wend

                END IF

                RsPkADP.MoveNext

            Wend

            CloseRsADPObjTypes

        END IF

        SET otPkTypeObj Nothing

        SET otDescTypeObj Nothing

        

        SET opPkADP Nothing

        

        SET oaPkADP Nothing

        SET oaADPName Nothing

        SET oaPkObjADP Nothing

        SET oaObjName Nothing

        SET oaFkTypeObj Nothing

        SET oaChilds Nothing

        DisposeRS RsADP

        DisposeRS RsPkADP

        MyCn.EXECUTE "Use "MemoryDBName

    EXIT Sub

    Gestion:

        SELECT CASE Err.Number

            CASE ELSE

                SELECT CASE ErrHandler(ModuleNameMe.name"ChargerTreeViewSectionModules" ErrErl())

                    CASE ErrResume

                                Resume

                    CASE ErrResumeNext

                                Resume Next

                    CASE ErrExit

                                MsgBox Err.Description " : "Err.Number

                                EXIT Sub

                END SELECT

        END SELECT

    END Sub

    As you can see I am selecting the data. Then using a adodb.Field object, I set a variable based on the name of the column. However this objects uses the ordinal position, so when I start looping I get the performance boost.

    While I have no benchmark test to present to you. I can confirm that this performance boost is ver far from negligable. This function is part of a huge process to loads a treeview on the form_load. Using this technic (and a few others). I took the whole process from taking 2 minutes to load 2000 nodes in the treeview to taking 8 seconds to load 13 000 nodes. Grant it this performance boost did not come only from using adodb.fields but this technic was one of the greatest contributors along with tuning the queries.

  • The only performance gain achieved by using "SELECT *" is that it takes less time to type.  You might argue that there is some miniscule benefit in compilation time, or less time required to transfer the shorter query over a network, but these are probably not measurable differences.  On the other hand, the drawbacks to using SELECT * are well documented.

    I have heard that there is a performance loss using EXISTS(SELECT * FROM ...) compared to EXISTS(SELECT NULL FROM ...), but I'm not sure if there is a noticable difference or just a theoretical one.

     

  • I already wrote a test on this one.  And even at 1 000 000 executions I could not get a difference (consistent).  I was getting like differences like 3 MS over a few minutes and it was almost evenly distrubuted over * and 1 (I did not test with null but a number).

  • >>I have heard that there is a performance loss using EXISTS(SELECT * FROM ...) compared to EXISTS(SELECT NULL FROM ...),

    That was only on versions 6.5 (or was it 7.0 ?) and below.

    It was a holdover from a "bug" in the optimizer that Microsoft inherited

    from Sybase in ver 4.x of the product, where indexes weren't

    used properly in the sub-query of the EXISTS if Select * was

    used.

    It was due to the optimizer not figuring out that only presence

    of a record needed to be determined and that the actual set

    of columns (whether it be Select * or Select NULL or Select 'Fred')

    was irrelevant.

    No longer an issue in Sql2K and Sql2005.

     

Viewing 11 posts - 1 through 10 (of 10 total)

You must be logged in to reply to this topic. Login to reply