Performance on SELECT *

Question

Performance on SELECT *

Ken Overton

Ten Centuries

Points: 1155
More actions
December 11, 2006 at 10:58 am

#95021

Just curious, does anybody know if there is any performance gain or loss when using SELECT * when you actually need all of the columns in a table returned. IE.
SELECT ID, FirstName, LastName FROM tblPerson
SELECT * FROM tblPerson

Viewing 11 posts - 1 through 10 (of 10 total)

You must be logged in to reply to this topic. Login to reply

SQL ORACLE One Orange Chip Points: 27807 More actions · Answer 1

You do not have any criterion in search. You do not need your data displayed in order. So, you may not need any index in your table.

PW-201837 SSC-Insane Points: 20805 More actions · Answer 2

If you're asking purely from a performance perspective, then there should be no difference between the 2.

The reasons for avoiding Select * in this scenario (scenario= all columns required), are in maintainability and robustness of the solution.

Adding a column to the table in the future creates more risk if there are "Select *" queries written against the table.

El Guapo Old Hand Points: 366 More actions · Answer 3

I thought it was always good practice not to use SELECT * b/c there was an extra performance hit b/c it had to go and lookup the column names?

Antares686 SSC Guru Points: 125444 More actions · Answer 4

My understanding had nothing to do with * being better with performance it was in case of change the impact it may have.

Example I use * on my audit triggers because I want an error to produce when the receiving audit table does not match the origin table.

However suppose I have your table and I do this

SELECT * FROM tblPerson

Then my developer writes and application that references the fields by Ordinal position like so

RecordsetObject.Field(0).Value ' Outputs ID'

RecordsetObject.Field(1).Value ' First Name'

RecordsetObject.Field(2).Value ' Last Name'

Now if you change the table like so

ID, First Name, Middle Name, Last Name the application will provide the wrong data values.

Furthermore I have seen a lot of folks write applications that generate spreadsheets or tables in ASP by looping thru the columns for titles and data so they aren't fixed.

So now suppose you create a field called Social_Security_Number, now the application without changing it will show the SSN of every person.

So * was better to avoid for 3 reason, unexpected application results, security of data and network overhead (if you don't need a column it is still being passed in the buffer and on the wire if the app is on a seperate server from the SQL Server.

But you can still use * to your bennefit in cases like the one where they loop thru the fields to get the name and value.

Create a view that has the column you want

SELECT [ID], First_Name, Last_Name FROM dbo.TableX

Then create a Stored Procedure that uses * to Select * From OurVIewsName, and applications that get column name and value by looping thru the object can be controled via the view. Now if you want to add Middle_Name you can without fear of what effect it has.

Ken Overton Ten Centuries Points: 1155 More actions · Answer 5

I understand the affects of using SELECT * on a table that changes. I was just looking for pure perfmance. I am curious though Antares686, why a developer would reference a a field in a Recordset as RecordsetObject.Field(0).Value instead RecordsetObject.Field("FirstName").Value. Thanks for your reponse everyone.

Antares686 SSC Guru Points: 125444 More actions · Answer 6

Because you could. Ordinal position does apparently access faster than Column Name becuase ADO had to find the column based name versus being told exactly where to look. Even thou there was a performance bennefit I never tested to see how much. I have seen to many issues come from ordinal position refences than I cared to so I choose not to adopt ordinal position over field name. After a while it became a poor practice as well to use ordinal position as well even thou it does have better performance just too many risks unless you really control your design well especially since a lot of folks used * instead of column names in their queries with their design. But as for a performance diff between use of * and Column Names only what I mentioned on having more data that you need on the wire otherwise if all columns are used there is no difference between that and * to my knowledge.

Ninja's_RGR'us SSC Guru Points: 294069 More actions · Answer 7

Actually there's a way to use the ordinal positions and get the performance boost while still being safe. Consider this [LONG] code :

Private Sub CHARgerTreeViewSectionModules(ByVal ModNode AS Node, ByVal FkDB AS Integer)

ON Error GOTO Gestion

Dim RsADP AS ADODB.Recordset

Dim RsPkADP AS ADODB.Recordset

Dim MemoryDBName AS String

Dim LaSp AS String

Dim LeParam AS String

Dim MySp AS Node

Dim MyParam AS Node

Dim SectionInfo AS ParameterInfo

Dim Done AS Boolean

Dim ParamInfo AS ParameterInfo

Dim BlankParameterInfo AS ParameterInfo

Dim AccessObject AS String

Dim LaAcObj AS Node

Dim PkADP AS Integer

Dim ADPFilter AS String

Dim CurrentIconKey AS String

SET RsADP = New ADODB.Recordset

SET RsPkADP = New ADODB.Recordset

Dim otPkTypeObj AS ADODB.Field

Dim otDescTypeObj AS ADODB.Field

Dim opPkADP AS ADODB.Field

Dim oaPkADP AS ADODB.Field

Dim oaADPName AS ADODB.Field

Dim oaPkObjADP AS ADODB.Field

Dim oaObjName AS ADODB.Field

Dim oaFkTypeObj AS ADODB.Field

Dim oaChilds AS ADODB.Field

MemoryDBName = GetDBName(MyCn.ConnectionString)

MyCn.EXECUTE "Use Documentation"

OpenDiscRS RsADP, "SELECT ADPS.PkADP, ADPS.ADPName, ADPS.DescADP, ObjADPS.PkObjADP, ObjADPS.ObjName, ObjADPS.FkTypeObj, ObjADPS.DescObjADP, (Select count(*) from vwObjADPS where FkParent = ObjADPS.PkObjADP) as Childs FROM ObjADPS INNER JOIN ADPS ON ObjADPS.FkADP = ADPS.PkADP WHERE ADPS.FkDB = "& FkDB & " and ObjADPS.PkObjADP not in (SELECT Distinct RelationsADPS.FkChild FROM RelationsADPS INNER JOIN ObjADPS ON RelationsADPS.FkChild = ObjADPS.PKObjADP INNER JOIN ADPS ON ObjADPS.FkADP = ADPS.PkADP WHERE ADPS.FkDB = "& FkDB & &quot ORDER BY ADPS.ADPName, ObjADPS.FkTypeObj, ObjADPS.ObjName"

--'DescADP is in the select only because it is required for the order by, it is not used in this sub otherwise

OpenDiscRS RsPkADP, "SELECT Distinct PkADP, DescADP FROM ADPS WHERE FkDB = "& FkDB & " ORDER BY DescADP"

LaSp = ""

LeParam = ""

IF NOT RsADP.EOF THEN

OpenRsADPObjTypes

SET otPkTypeObj = RsADPObjTypes.Fields("PkTypeObj"

SET otDescTypeObj = RsADPObjTypes.Fields("DescTypeObj"

SET opPkADP = RsPkADP.Fields("PkADP"

SET oaPkADP = RsADP.Fields("PkADP"

SET oaADPName = RsADP.Fields("ADPName"

SET oaPkObjADP = RsADP.Fields("PkObjADP"

SET oaObjName = RsADP.Fields("ObjName"

SET oaFkTypeObj = RsADP.Fields("FkTypeObj"

SET oaChilds = RsADP.Fields("Childs"

WHILE NOT RsPkADP.EOF

ParamInfo = BlankParameterInfo

RsADP.Filter = "PkADP = "& opPkADP.Value

IF NOT RsADP.EOF THEN

ADPFilter = "PkADP = "& opPkADP.Value

LaSp = oaADPName.Value

SET MySp = Me.TvDocumentation.Nodes.ADD(ModNode, tvwChild, "a"& KeyId, LaSp, GetIconKey(ObjectTypes.Modules))

GCount = GCount + 1

KeyId = KeyId + 1

PkADP = oaPkADP.Value

ParamInfo.id = PkADP

ParamInfo.name = LaSp

ParamInfo.NodeType = "ADP"

MySp.Tag = KeyId - 1

LesInfos(KeyId - 1) = ParamInfo

RsADPObjTypes.MoveFirst

WHILE NOT RsADPObjTypes.EOF

Done = False

AccessObject = otDescTypeObj.Value

CurrentIconKey = GetIconKey(ConvertServerTypeToObjectType(otDescTypeObj.Value))

SET LaAcObj = Me.TvDocumentation.Nodes.ADD(MySp, tvwChild, "a"& KeyId, AccessObject, CurrentIconKey)

KeyId = KeyId + 1

GCount = GCount + 1

RsADP.Filter = ADPFilter & " And FkTypeObj = "& otPkTypeObj.Value

WHILE NOT RsADP.EOF AND NOT Done

LeParam = oaObjName.Value

SET MyParam = Me.TvDocumentation.Nodes.ADD(LaAcObj, tvwChild, "a"& KeyId, LeParam, CurrentIconKey)

KeyId = KeyId + 1

GCount = GCount + 1

ParamInfo = BlankParameterInfo

ParamInfo.id = oaPkObjADP.Value

--'ParamInfo.Precision is an object

ParamInfo.name = LeParam

ParamInfo.NodeType = "ADPOBJ"

ParamInfo.VarType = AccessObject

MyParam.Tag = KeyId - 1

LesInfos(KeyId - 1) = ParamInfo

IF oaChilds.Value > 0 THEN

IF AccessObject = "Class"OR AccessObject = "Form"OR AccessObject = "Report"THEN

ShowSubFormsOrReports MyParam, ParamInfo, True

END IF

--'MyParam.EnsureVisible

RsADP.MoveNext

IF NOT RsADP.EOF THEN

IF PkADP <> oaPkADP.Value THEN

Done = True

END IF

Wend

RsADPObjTypes.MoveNext

Wend

END IF

RsPkADP.MoveNext

Wend

CloseRsADPObjTypes

END IF

SET otPkTypeObj = Nothing

SET otDescTypeObj = Nothing

SET opPkADP = Nothing

SET oaPkADP = Nothing

SET oaADPName = Nothing

SET oaPkObjADP = Nothing

SET oaObjName = Nothing

SET oaFkTypeObj = Nothing

SET oaChilds = Nothing

DisposeRS RsADP

DisposeRS RsPkADP

MyCn.EXECUTE "Use "& MemoryDBName

EXIT Sub

Gestion:

SELECT CASE Err.Number

CASE ELSE

SELECT CASE ErrHandler(ModuleName, Me.name, "ChargerTreeViewSectionModules" Err, Erl())

CASE ErrResume

Resume

CASE ErrResumeNext

Resume Next

CASE ErrExit

MsgBox Err.Description & " : "& Err.Number

EXIT Sub

END SELECT

END Sub

As you can see I am selecting the data. Then using a adodb.Field object, I set a variable based on the name of the column. However this objects uses the ordinal position, so when I start looping I get the performance boost.

While I have no benchmark test to present to you. I can confirm that this performance boost is ver far from negligable. This function is part of a huge process to loads a treeview on the form_load. Using this technic (and a few others). I took the whole process from taking 2 minutes to load 2000 nodes in the treeview to taking 8 seconds to load 13 000 nodes. Grant it this performance boost did not come only from using adodb.fields but this technic was one of the greatest contributors along with tuning the queries.

Scott Coleman One Orange Chip Points: 27452 More actions · Answer 8

The only performance gain achieved by using "SELECT *" is that it takes less time to type. You might argue that there is some miniscule benefit in compilation time, or less time required to transfer the shorter query over a network, but these are probably not measurable differences. On the other hand, the drawbacks to using SELECT * are well documented.

I have heard that there is a performance loss using EXISTS(SELECT * FROM ...) compared to EXISTS(SELECT NULL FROM ...), but I'm not sure if there is a noticable difference or just a theoretical one.

Ninja's_RGR'us SSC Guru Points: 294069 More actions · Answer 9

I already wrote a test on this one. And even at 1 000 000 executions I could not get a difference (consistent). I was getting like differences like 3 MS over a few minutes and it was almost evenly distrubuted over * and 1 (I did not test with null but a number).

PW-201837 SSC-Insane Points: 20805 More actions · Answer 10

>>I have heard that there is a performance loss using EXISTS(SELECT * FROM ...) compared to EXISTS(SELECT NULL FROM ...),

That was only on versions 6.5 (or was it 7.0 ?) and below.

It was a holdover from a "bug" in the optimizer that Microsoft inherited

from Sybase in ver 4.x of the product, where indexes weren't

used properly in the sub-query of the EXISTS if Select * was

used.

It was due to the optimizer not figuring out that only presence

of a record needed to be determined and that the actual set

of columns (whether it be Select * or Select NULL or Select 'Fred')

was irrelevant.

No longer an issue in Sql2K and Sql2005.