December 11, 2006 at 10:58 am
Just curious, does anybody know if there is any performance gain or loss when using SELECT * when you actually need all of the columns in a table returned. IE.
SELECT ID, FirstName, LastName FROM tblPerson
SELECT * FROM tblPerson
December 11, 2006 at 11:13 am
You do not have any criterion in search. You do not need your data displayed in order. So, you may not need any index in your table.
December 11, 2006 at 11:51 am
If you're asking purely from a performance perspective, then there should be no difference between the 2.
The reasons for avoiding Select * in this scenario (scenario= all columns required), are in maintainability and robustness of the solution.
Adding a column to the table in the future creates more risk if there are "Select *" queries written against the table.
December 11, 2006 at 11:58 am
I thought it was always good practice not to use SELECT * b/c there was an extra performance hit b/c it had to go and lookup the column names?
December 11, 2006 at 12:41 pm
My understanding had nothing to do with * being better with performance it was in case of change the impact it may have.
Example I use * on my audit triggers because I want an error to produce when the receiving audit table does not match the origin table.
However suppose I have your table and I do this
SELECT * FROM tblPerson
Then my developer writes and application that references the fields by Ordinal position like so
RecordsetObject.Field(0).Value ' Outputs ID'
RecordsetObject.Field(1).Value ' First Name'
RecordsetObject.Field(2).Value ' Last Name'
Now if you change the table like so
ID, First Name, Middle Name, Last Name the application will provide the wrong data values.
Furthermore I have seen a lot of folks write applications that generate spreadsheets or tables in ASP by looping thru the columns for titles and data so they aren't fixed.
So now suppose you create a field called Social_Security_Number, now the application without changing it will show the SSN of every person.
So * was better to avoid for 3 reason, unexpected application results, security of data and network overhead (if you don't need a column it is still being passed in the buffer and on the wire if the app is on a seperate server from the SQL Server.
But you can still use * to your bennefit in cases like the one where they loop thru the fields to get the name and value.
Create a view that has the column you want
SELECT [ID], First_Name, Last_Name FROM dbo.TableX
Then create a Stored Procedure that uses * to Select * From OurVIewsName, and applications that get column name and value by looping thru the object can be controled via the view. Now if you want to add Middle_Name you can without fear of what effect it has.
December 11, 2006 at 12:54 pm
I understand the affects of using SELECT * on a table that changes. I was just looking for pure perfmance. I am curious though Antares686, why a developer would reference a a field in a Recordset as RecordsetObject.Field(0).Value instead RecordsetObject.Field("FirstName").Value. Thanks for your reponse everyone.
December 11, 2006 at 1:43 pm
Because you could. Ordinal position does apparently access faster than Column Name becuase ADO had to find the column based name versus being told exactly where to look. Even thou there was a performance bennefit I never tested to see how much. I have seen to many issues come from ordinal position refences than I cared to so I choose not to adopt ordinal position over field name. After a while it became a poor practice as well to use ordinal position as well even thou it does have better performance just too many risks unless you really control your design well especially since a lot of folks used * instead of column names in their queries with their design. But as for a performance diff between use of * and Column Names only what I mentioned on having more data that you need on the wire otherwise if all columns are used there is no difference between that and * to my knowledge.
December 12, 2006 at 7:51 am
Actually there's a way to use the ordinal positions and get the performance boost while still being safe. Consider this [LONG] code :
Private Sub CHARgerTreeViewSectionModules(ByVal ModNode AS Node, ByVal FkDB AS Integer)
ON Error GOTO Gestion
Dim RsADP AS ADODB.Recordset
Dim RsPkADP AS ADODB.Recordset
Dim MemoryDBName AS String
Dim LaSp AS String
Dim LeParam AS String
Dim MySp AS Node
Dim MyParam AS Node
Dim SectionInfo AS ParameterInfo
Dim Done AS Boolean
Dim ParamInfo AS ParameterInfo
Dim BlankParameterInfo AS ParameterInfo
Dim AccessObject AS String
Dim LaAcObj AS Node
Dim PkADP AS Integer
Dim ADPFilter AS String
Dim CurrentIconKey AS String
SET RsADP = New ADODB.Recordset
SET RsPkADP = New ADODB.Recordset
Dim otPkTypeObj AS ADODB.Field
Dim otDescTypeObj AS ADODB.Field
Dim opPkADP AS ADODB.Field
Dim oaPkADP AS ADODB.Field
Dim oaADPName AS ADODB.Field
Dim oaPkObjADP AS ADODB.Field
Dim oaObjName AS ADODB.Field
Dim oaFkTypeObj AS ADODB.Field
Dim oaChilds AS ADODB.Field
MemoryDBName = GetDBName(MyCn.ConnectionString)
MyCn.EXECUTE "Use Documentation"
OpenDiscRS RsADP, "SELECT ADPS.PkADP, ADPS.ADPName, ADPS.DescADP, ObjADPS.PkObjADP, ObjADPS.ObjName, ObjADPS.FkTypeObj, ObjADPS.DescObjADP, (Select count(*) from vwObjADPS where FkParent = ObjADPS.PkObjADP) as Childs FROM ObjADPS INNER JOIN ADPS ON ObjADPS.FkADP = ADPS.PkADP WHERE ADPS.FkDB = "& FkDB & " and ObjADPS.PkObjADP not in (SELECT Distinct RelationsADPS.FkChild FROM RelationsADPS INNER JOIN ObjADPS ON RelationsADPS.FkChild = ObjADPS.PKObjADP INNER JOIN ADPS ON ObjADPS.FkADP = ADPS.PkADP WHERE ADPS.FkDB = "& FkDB & " ORDER BY ADPS.ADPName, ObjADPS.FkTypeObj, ObjADPS.ObjName"
--'DescADP is in the select only because it is required for the order by, it is not used in this sub otherwise
OpenDiscRS RsPkADP, "SELECT Distinct PkADP, DescADP FROM ADPS WHERE FkDB = "& FkDB & " ORDER BY DescADP"
LaSp = ""
LeParam = ""
IF NOT RsADP.EOF THEN
OpenRsADPObjTypes
SET otPkTypeObj = RsADPObjTypes.Fields("PkTypeObj"
SET otDescTypeObj = RsADPObjTypes.Fields("DescTypeObj"
SET opPkADP = RsPkADP.Fields("PkADP"
SET oaPkADP = RsADP.Fields("PkADP"
SET oaADPName = RsADP.Fields("ADPName"
SET oaPkObjADP = RsADP.Fields("PkObjADP"
SET oaObjName = RsADP.Fields("ObjName"
SET oaFkTypeObj = RsADP.Fields("FkTypeObj"
SET oaChilds = RsADP.Fields("Childs"
WHILE NOT RsPkADP.EOF
ParamInfo = BlankParameterInfo
RsADP.Filter = "PkADP = "& opPkADP.Value
IF NOT RsADP.EOF THEN
ADPFilter = "PkADP = "& opPkADP.Value
LaSp = oaADPName.Value
SET MySp = Me.TvDocumentation.Nodes.ADD(ModNode, tvwChild, "a"& KeyId, LaSp, GetIconKey(ObjectTypes.Modules))
GCount = GCount + 1
KeyId = KeyId + 1
PkADP = oaPkADP.Value
ParamInfo.id = PkADP
ParamInfo.name = LaSp
ParamInfo.NodeType = "ADP"
MySp.Tag = KeyId - 1
LesInfos(KeyId - 1) = ParamInfo
RsADPObjTypes.MoveFirst
WHILE NOT RsADPObjTypes.EOF
Done = False
AccessObject = otDescTypeObj.Value
CurrentIconKey = GetIconKey(ConvertServerTypeToObjectType(otDescTypeObj.Value))
SET LaAcObj = Me.TvDocumentation.Nodes.ADD(MySp, tvwChild, "a"& KeyId, AccessObject, CurrentIconKey)
KeyId = KeyId + 1
GCount = GCount + 1
RsADP.Filter = ADPFilter & " And FkTypeObj = "& otPkTypeObj.Value
WHILE NOT RsADP.EOF AND NOT Done
LeParam = oaObjName.Value
SET MyParam = Me.TvDocumentation.Nodes.ADD(LaAcObj, tvwChild, "a"& KeyId, LeParam, CurrentIconKey)
KeyId = KeyId + 1
GCount = GCount + 1
ParamInfo = BlankParameterInfo
ParamInfo.id = oaPkObjADP.Value
--'ParamInfo.Precision is an object
ParamInfo.name = LeParam
ParamInfo.NodeType = "ADPOBJ"
ParamInfo.VarType = AccessObject
MyParam.Tag = KeyId - 1
LesInfos(KeyId - 1) = ParamInfo
IF oaChilds.Value > 0 THEN
IF AccessObject = "Class"OR AccessObject = "Form"OR AccessObject = "Report"THEN
ShowSubFormsOrReports MyParam, ParamInfo, True
END IF
END IF
--'MyParam.EnsureVisible
RsADP.MoveNext
IF NOT RsADP.EOF THEN
IF PkADP <> oaPkADP.Value THEN
Done = True
END IF
END IF
Wend
RsADPObjTypes.MoveNext
Wend
END IF
RsPkADP.MoveNext
Wend
CloseRsADPObjTypes
END IF
SET otPkTypeObj = Nothing
SET otDescTypeObj = Nothing
SET opPkADP = Nothing
SET oaPkADP = Nothing
SET oaADPName = Nothing
SET oaPkObjADP = Nothing
SET oaObjName = Nothing
SET oaFkTypeObj = Nothing
SET oaChilds = Nothing
DisposeRS RsADP
DisposeRS RsPkADP
MyCn.EXECUTE "Use "& MemoryDBName
EXIT Sub
Gestion:
SELECT CASE Err.Number
CASE ELSE
SELECT CASE ErrHandler(ModuleName, Me.name, "ChargerTreeViewSectionModules" Err, Erl())
CASE ErrResume
Resume
CASE ErrResumeNext
Resume Next
CASE ErrExit
MsgBox Err.Description & " : "& Err.Number
EXIT Sub
END SELECT
END SELECT
END Sub
As you can see I am selecting the data. Then using a adodb.Field object, I set a variable based on the name of the column. However this objects uses the ordinal position, so when I start looping I get the performance boost.
While I have no benchmark test to present to you. I can confirm that this performance boost is ver far from negligable. This function is part of a huge process to loads a treeview on the form_load. Using this technic (and a few others). I took the whole process from taking 2 minutes to load 2000 nodes in the treeview to taking 8 seconds to load 13 000 nodes. Grant it this performance boost did not come only from using adodb.fields but this technic was one of the greatest contributors along with tuning the queries.
December 12, 2006 at 9:06 am
The only performance gain achieved by using "SELECT *" is that it takes less time to type. You might argue that there is some miniscule benefit in compilation time, or less time required to transfer the shorter query over a network, but these are probably not measurable differences. On the other hand, the drawbacks to using SELECT * are well documented.
I have heard that there is a performance loss using EXISTS(SELECT * FROM ...) compared to EXISTS(SELECT NULL FROM ...), but I'm not sure if there is a noticable difference or just a theoretical one.
December 12, 2006 at 9:12 am
I already wrote a test on this one. And even at 1 000 000 executions I could not get a difference (consistent). I was getting like differences like 3 MS over a few minutes and it was almost evenly distrubuted over * and 1 (I did not test with null but a number).
December 12, 2006 at 9:24 am
>>I have heard that there is a performance loss using EXISTS(SELECT * FROM ...) compared to EXISTS(SELECT NULL FROM ...),
That was only on versions 6.5 (or was it 7.0 ?) and below.
It was a holdover from a "bug" in the optimizer that Microsoft inherited
from Sybase in ver 4.x of the product, where indexes weren't
used properly in the sub-query of the EXISTS if Select * was
used.
It was due to the optimizer not figuring out that only presence
of a record needed to be determined and that the actual set
of columns (whether it be Select * or Select NULL or Select 'Fred')
was irrelevant.
No longer an issue in Sql2K and Sql2005.
Viewing 11 posts - 1 through 10 (of 10 total)
You must be logged in to reply to this topic. Login to reply