Blog Post

TSQL Tuesday #18 - Using CTE to Analyze Disk IO

,

   (TSQL Tuesday on Wednesday! - Apologies for the late entry - better late than never)

Today’s topic is CTE’s. Scientifically speaking, solids mostly expand in response to heating and contract on cooling. This response to temperature change is expressed as its coefficient of thermal expansion or CTE.

Wait! Don’t leave this site yet! If you’re response to the above is who cares, then you’re in the right place. This is not a dissertation on physics, but another entry to TSQL Tuesday #18. Bob Pusateri, aka SQLBOB|Twitter, is hosting today. Bob maintains his neat blog called the Outer Join, and he has invited us to write about SQL CTE’s . He challenged us to use the coefficient of thermal expansion in our blog, and so it is done. Yes, the topic is CTE, but we’re talking about Common Table Expressions. 

To learn more about the origins of TSQL Tuesday, click here .

CTE were introduced in SQL Server 2005, and was definitely one of the most exciting tsql features that came along and one that made query writing much more efficient. This powerful enhancement to the TSQL toolset took the concept of the derived table and made it more streamlined in that it can be declared once and referenced multiple times in a query.

A good article on the Introduction of CTE is from the 4GuysFromRolla.com that examines the benefits of this simple construct, and discusses how they provide a more readable and usable approach to derived tables

Another good primer about Common Table Expressions in SQL Server 2005 appeared on SQLServerCentral.com. It defines what they are, and gives examples on their use.

Microsoft provides us, via MSDN, an overview on Using Common Table Expressions. For example, they can be used to:

· Create a recursive query. For more information, see Recursive Queries Using Common Table Expressions.

· Substitute for a view when the general use of a view is not required; that is, you do not have to store the definition in metadata.

· Enable grouping by a column that is derived from a scalar subselect, or a function that is either not deterministic or has external access.

· Reference the resulting table multiple times in the same statement.

The simplified form of the CTE syntax is listed below.

WITH cte_name (optional column list) AS

(

Cte_query

)

statement that uses the above CTE

Of course, they can get much more complex, but the goal is to reduce overhead of query execution, and creates queries that are recursive. A recursive query is one in which a CTE references itself. The common table expression (CTE) provides the significant advantage of being able to reference itself, thereby creating a recursive CTE.

See MSDN’s article on Recursive Queries Using Common Table Expressions.

OK, so what is a practical real-world example of a recursive CTE? There are many.

In the past, I have written articles and talked about gathering IO Statistics. In one article I wrote appearing in SQL Server Magazine, I/O, I/O, It’s Why My Server’s Slow, I offered three tips and some TSQL Code on analyzing and gathering info on SQL Server IO Statistics.

All of the code I presented in my article use CTE’s that leverage one of the DMO’s to calculate different ways to look at IO on the system. Using, sys.dm_io_virtual_file_stats., we can derive a percentage of how much IO is being used, and also the percentage of stalled or stuck IOs. Depending on the construct, we can view IO by database, by physical database file, or by stalls.

You can refer to the article above for more detail on analyzing IO, but I will quote one example here. In the CTE below, you can see the percentage of IO per drive letter. You can use this script to determine which disk is producing the most I\O and which database that disk is related to. This information can be helpful in determining which database file should be moved to another disk.

 

--Calculating the percentage of I/O by Drive
----------------------------------------------------
With g as
(select db_name(mf.database_id) as database_name, mf.physical_name, 
left(mf.physical_name, 1) as drive_letter, 
vfs.num_of_writes, 
vfs.num_of_bytes_written as BYTESWRITTEN, 
vfs.io_stall_write_ms, 
mf.type_desc, vfs.num_of_reads, vfs.num_of_bytes_read, vfs.io_stall_read_ms,
vfs.io_stall, vfs.size_on_disk_bytes
from sys.master_files mf
join sys.dm_io_virtual_file_stats(NULL, NULL) vfs
on mf.database_id=vfs.database_id and mf.file_id=vfs.file_id
--order by vfs.num_of_bytes_written desc)
)
select database_name,drive_letter, BYTESWRITTEN,
Percentage = RTRIM(CONVERT(DECIMAL(5,2),
BYTESWRITTEN*100.0/(SELECT SUM(BYTESWRITTEN) FROM g))) 
--where drive_letter='R' <-- You can put specify drive )))
+ '%'
from g --where drive_letter='R'
order by BYTESWRITTEN desc

You can see a sample output as displayed by the below image:

So, there you have it!  My contribution to T-SQL Tuesday, this 10th of May, 2011.  Better late than never, as they say.

I want to give a shout out to Bob, and thank him for hosting this event.  Happy Tuesday! (And Wednesday :-0)

Rate

You rated this post out of 5. Change rating

Share

Share

Rate

You rated this post out of 5. Change rating