Blog post #4 in support of Tim Ford’s (b|t) #iwanttohelp, #entrylevel
In SQL Server, in the T-SQL you use to query it, a view looks just like a table (I’m using the AdventureWorks2014 database for all these examples):
SELECT * FROM Production.vProductAndDescription AS vpad;
SELECT vpad.Name, vpad.Description, vpmi.Instructions FROM Production.vProductAndDescription AS vpad JOIN Production.Product AS p ON p.ProductID = vpad.ProductID JOIN Production.vProductModelInstructions AS vpmi ON vpmi.ProductModelID = p.ProductModelID WHERE vpad.ProductID = 891 AND vpad.CultureID = 'fr';
The above query actually combines two views and a table. This is what is commonly referred to as a “code smell”. A code smell is a coding practice that works, but that can lead to problems. In this case, we’re talking about performance problems. The performance problems when using views to join to tables and other views as if they were real tables comes about because a standard view is not a table. Its a query. For example, the second view introduced, vPorductModelInstructions looks like this:
ALTER VIEW [Production].[vProductModelInstructions] AS SELECT [ProductModelID] ,[Name] ,[Instructions].value(N'declare default element namespace "http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/ProductModelManuInstructions"; (/root/text())[1]', 'nvarchar(max)') AS [Instructions] ,[MfgInstructions].ref.value('@LocationID[1]', 'int') AS [LocationID] ,[MfgInstructions].ref.value('@SetupHours[1]', 'decimal(9, 4)') AS [SetupHours] ,[MfgInstructions].ref.value('@MachineHours[1]', 'decimal(9, 4)') AS [MachineHours] ,[MfgInstructions].ref.value('@LaborHours[1]', 'decimal(9, 4)') AS [LaborHours] ,[MfgInstructions].ref.value('@LotSize[1]', 'int') AS [LotSize] ,[Steps].ref.value('string(.)[1]', 'nvarchar(1024)') AS [Step] ,[rowguid] ,[ModifiedDate] FROM [Production].[ProductModel] CROSS APPLY [Instructions].nodes(N'declare default element namespace "http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/ProductModelManuInstructions"; /root/Location') MfgInstructions(ref) CROSS APPLY [MfgInstructions].ref.nodes('declare default element namespace "http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/ProductModelManuInstructions"; step') Steps(ref); GO
That’s a query against the XML stored in the ProductModel table. The view was created to mask the complexity of the necessary XPath code, while providing a mechanism for retrieving the data from the XML. This is a common use of views. However, when we then treat the view as a table, and join it to other tables and views, we present a problem for the optimizer. Because a view is not a table, but is instead a query, the optimizer has to resolve this query in combination with any other views or tables to arrive at an execution plan for the whole combined mess. While the optimizer is very good at what it does, because of the complexity caused by the additional unnecessary processing to figure out which parts of the view is not needed to satisfy the query, it can make poor choices. That can result in poor performance.
If I were to rewrite the query, it would look something like this:
SELECT p.Name, pd.Description, pm.Instructions.value(N'declare default element namespace "http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/ProductModelManuInstructions"; (/root/text())[1]', 'nvarchar(max)') AS Instructions FROM Production.Product AS p JOIN Production.ProductModelProductDescriptionCulture AS pmpdc ON pmpdc.ProductModelID = p.ProductModelID JOIN Production.ProductDescription AS pd ON pd.ProductDescriptionID = pmpdc.ProductDescriptionID JOIN Production.ProductModel AS pm ON pm.ProductModelID = p.ProductModelID CROSS APPLY Instructions.nodes(N'declare default element namespace "http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/ProductModelManuInstructions"; /root/Location') MfgInstructions (ref) CROSS APPLY MfgInstructions.ref.nodes('declare default element namespace "http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/ProductModelManuInstructions"; step') Steps (ref) WHERE p.ProductID = 891 AND pmpdc.CultureID = 'fr';
That’s a lot more complex than the query we had above that only referenced three objects and had only two JOIN operations. However, if you capture the I/O and the execution time for these queries, you’ll see a different story.
I used two methods for measuring performance. I used SET STATISTICS IO and SET STATISTICS TIME to ON for the queries for one set of tests. For another set I used Extended Events. Consistently the execution time for the query with the view was around 110ms. The query that didn’t reference any views was around 37ms. The reads were 155 for the query with views, but only 109 for the query without. If you look at the individual table I/O, you can start to see where the differences come from. These are the results from the query with the views:
Table ‘ProductDescription’. Scan count 0, logical reads 56
Table ‘ProductModelProductDescriptionCulture’. Scan count 28, logical reads 56
Table ‘xml_index_nodes_418100530_256001’. Scan count 13, logical reads 37
Table ‘ProductModel’. Scan count 0, logical reads 2
Table ‘Product’. Scan count 0, logical reads 4
These are the results for the query without the view:
Table ‘ProductDescription’. Scan count 0, logical reads 56
Table ‘xml_index_nodes_418100530_256001’. Scan count 13, logical reads 37
Table ‘ProductModelProductDescriptionCulture’. Scan count 6, logical reads 12
Table ‘ProductModel’. Scan count 0, logical reads 2
Table ‘Product’. Scan count 0, logical reads 2
You can see the differences in both ProductModelProductDescriptionCulture and Product. This is because of the differences in the execution plans caused by differences in the choices made by the optimizer.
A standard view is not a table. There is such a thing as a materialized view or indexed view, which is a table. That’s not what we’re talking about here. While you can use a view as if it was a table, don’t mistake it for a table. A view is just a mask in front of a query. It can’t be used like an object so that you avoid rewriting the same JOIN. That will lead to issues for the optimizer as this simple set of examples showed. Don’t shy away from using views, just understand what their real behavior is. A view is a query, not a table.
The post A View Is Not A Table appeared first on Home Of The Scary DBA.