Azure SQL Data Warehouse can sometimes feel like it’s completely different from SQL Server, but under the covers, it is still (mostly) SQL Server and it does have execution plans. Let’s take a look at one.
I’ve created a new SQL Data Warehouse using the sample database available on the portal, AdventureWorksDW. Here’s a query against that database:
SELECT dd.FullDateAlternateKey AS OrderDate, dc.LastName, SUM(fis.SalesAmount) AS SumSalesAmount FROM dbo.FactInternetSales AS fis JOIN dbo.DimDate AS dd ON fis.OrderDateKey = dd.DateKey JOIN dbo.DimCustomer AS dc ON dc.CustomerKey = fis.CustomerKey GROUP BY dd.FullDateAlternateKey, dc.LastName HAVING SUM(fis.SalesAmount) > 5000.0 ORDER BY OrderDate DESC;
If I attempt to capture an execution plan using the SQL Server Management Studio GUI, nothing happens. If I try to use T-SQL commands, I get an error that those commands are not supported with this version of SQL Server. Same thing if I try to capture a plan using Visual Studio. So… now what? Enter our new command:
EXPLAIN SELECT dd.FullDateAlternateKey AS OrderDate, dc.LastName, SUM(fis.SalesAmount) AS SumSalesAmount FROM dbo.FactInternetSales AS fis JOIN dbo.DimDate AS dd ON fis.OrderDateKey = dd.DateKey JOIN dbo.DimCustomer AS dc ON dc.CustomerKey = fis.CustomerKey GROUP BY dd.FullDateAlternateKey, dc.LastName HAVING SUM(fis.SalesAmount) > 5000.0 ORDER BY OrderDate DESC;
If I run this through the SSMS query window, I get a syntax error. So we’re now in Visual Studio. This is how we generate an execution plan from within Azure SQL Data Warehouse. What you get is XML output in the results like this:
According to the documentation on EXPLAIN, I should be able to click on the XML and it will open up to explore. In my version of Visual Studio (2015), I didn’t find that to be the case. Instead I had to copy and paste the XML into an XML file window that I created within Visual Studio. This is what I finished with:
<?xml version="1.0" encoding="utf-8"?> <dsql_query number_nodes="1" number_distributions="60" number_distributions_per_node="60"> <sql>SELECT dd.FullDateAlternateKey AS OrderDate, dc.LastName, SUM(fis.SalesAmount) AS SumSalesAmount FROM dbo.FactInternetSales AS fis JOIN dbo.DimDate AS dd ON fis.OrderDateKey = dd.DateKey JOIN dbo.DimCustomer AS dc ON dc.CustomerKey = fis.CustomerKey GROUP BY dd.FullDateAlternateKey, dc.LastName HAVING SUM(fis.SalesAmount) > 5000.0 ORDER BY OrderDate DESC</sql> <dsql_operations total_cost="5.98868068474576" total_number_operations="13"> <dsql_operation operation_type="RND_ID"> <identifier>TEMP_ID_14</identifier> </dsql_operation> <dsql_operation operation_type="ON"> <location permanent="false" distribution="AllComputeNodes" /> <sql_operations> <sql_operation type="statement">CREATE TABLE [tempdb].[dbo].[TEMP_ID_14] ([DateKey] INT NOT NULL, [FullDateAlternateKey] DATE NOT NULL ) WITH(DATA_COMPRESSION=PAGE);</sql_operation> </sql_operations> </dsql_operation> <dsql_operation operation_type="BROADCAST_MOVE"> <operation_cost cost="1.99584" accumulative_cost="1.99584" average_rowsize="7" output_rows="1188" GroupNumber="12" /> <source_statement>SELECT [T1_1].[DateKey] AS [DateKey], [T1_1].[FullDateAlternateKey] AS [FullDateAlternateKey] FROM [DWTest].[dbo].[DimDate] AS T1_1</source_statement> <destination_table>[TEMP_ID_14]</destination_table> </dsql_operation> <dsql_operation operation_type="RND_ID"> <identifier>TEMP_ID_15</identifier> </dsql_operation> <dsql_operation operation_type="ON"> <location permanent="false" distribution="AllDistributions" /> <sql_operations> <sql_operation type="statement">CREATE TABLE [tempdb].[dbo].[TEMP_ID_15] ([OrderDateKey] INT NOT NULL, [CustomerKey] INT NOT NULL, [SalesAmount] MONEY NOT NULL ) WITH(DATA_COMPRESSION=PAGE);</sql_operation> </sql_operations> </dsql_operation> <dsql_operation operation_type="SHUFFLE_MOVE"> <operation_cost cost="3.93098847457627" accumulative_cost="5.92682847457627" average_rowsize="16" output_rows="60398" GroupNumber="10" /> <source_statement>SELECT [T1_1].[OrderDateKey] AS [OrderDateKey], [T1_1].[CustomerKey] AS [CustomerKey], [T1_1].[SalesAmount] AS [SalesAmount] FROM [DWTest].[dbo].[FactInternetSales] AS T1_1</source_statement> <destination_table>[TEMP_ID_15]</destination_table> <shuffle_columns>CustomerKey;</shuffle_columns> </dsql_operation> <dsql_operation operation_type="RND_ID"> <identifier>TEMP_ID_16</identifier> </dsql_operation> <dsql_operation operation_type="ON"> <location permanent="false" distribution="AllDistributions" /> <sql_operations> <sql_operation type="statement">CREATE TABLE [tempdb].[dbo].[TEMP_ID_16] ([FullDateAlternateKey] DATE NOT NULL, [LastName] NVARCHAR(50) COLLATE SQL_Latin1_General_CP1_CI_AS, [col] MONEY NOT NULL ) WITH(DATA_COMPRESSION=PAGE);</sql_operation> </sql_operations> </dsql_operation> <dsql_operation operation_type="SHUFFLE_MOVE"> <operation_cost cost="0.0618522101694915" accumulative_cost="5.98868068474576" average_rowsize="111" output_rows="136.985" GroupNumber="25" /> <source_statement>SELECT [T1_1].[FullDateAlternateKey] AS [FullDateAlternateKey], [T1_1].[LastName] AS [LastName], [T1_1].[col] AS [col] FROM (SELECT SUM([T2_2].[SalesAmount]) AS [col], [T2_1].[FullDateAlternateKey] AS [FullDateAlternateKey], [T2_2].[LastName] AS [LastName] FROM [tempdb].[dbo].[TEMP_ID_14] AS T2_1 INNER JOIN (SELECT [T3_2].[OrderDateKey] AS [OrderDateKey], [T3_2].[SalesAmount] AS [SalesAmount], [T3_1].[LastName] AS [LastName] FROM [DWTest].[dbo].[DimCustomer] AS T3_1 INNER JOIN [tempdb].[dbo].[TEMP_ID_15] AS T3_2 ON ([T3_2].[CustomerKey] = [T3_1].[CustomerKey])) AS T2_2 ON ([T2_1].[DateKey] = [T2_2].[OrderDateKey]) GROUP BY [T2_1].[FullDateAlternateKey], [T2_2].[LastName]) AS T1_1</source_statement> <destination_table>[TEMP_ID_16]</destination_table> <shuffle_columns>FullDateAlternateKey;</shuffle_columns> </dsql_operation> <dsql_operation operation_type="ON"> <location permanent="false" distribution="AllDistributions" /> <sql_operations> <sql_operation type="statement">DROP TABLE [tempdb].[dbo].[TEMP_ID_15]</sql_operation> </sql_operations> </dsql_operation> <dsql_operation operation_type="ON"> <location permanent="false" distribution="AllComputeNodes" /> <sql_operations> <sql_operation type="statement">DROP TABLE [tempdb].[dbo].[TEMP_ID_14]</sql_operation> </sql_operations> </dsql_operation> <dsql_operation operation_type="RETURN"> <location distribution="AllDistributions" /> <select>SELECT [T1_1].[FullDateAlternateKey] AS [FullDateAlternateKey], [T1_1].[LastName] AS [LastName], [T1_1].[col] AS [col] FROM (SELECT [T2_1].[FullDateAlternateKey] AS [FullDateAlternateKey], [T2_1].[LastName] AS [LastName], [T2_1].[col] AS [col] FROM (SELECT SUM([T3_1].[col]) AS [col], [T3_1].[FullDateAlternateKey] AS [FullDateAlternateKey], [T3_1].[LastName] AS [LastName] FROM [tempdb].[dbo].[TEMP_ID_16] AS T3_1 GROUP BY [T3_1].[FullDateAlternateKey], [T3_1].[LastName]) AS T2_1 WHERE ([T2_1].[col] > CAST ((5000.0) AS DECIMAL (5, 1)))) AS T1_1 ORDER BY [T1_1].[FullDateAlternateKey] DESC</select> </dsql_operation> <dsql_operation operation_type="ON"> <location permanent="false" distribution="AllDistributions" /> <sql_operations> <sql_operation type="statement">DROP TABLE [tempdb].[dbo].[TEMP_ID_16]</sql_operation> </sql_operations> </dsql_operation> </dsql_operations> </dsql_query>
So now we just save this as a .sqlplan file and open it in SSMS, right?
Nope!
See, that’s not a regular execution plan, at all. Instead, it’s a D-SQL plan. It’s not the same as our old execution plans. You can’t open it as a graphical plan (and no, not even in that very popular 3rd party tool, I tried). You will have to learn how to read these plans differently because, well, they are different.
The post Azure SQL Data Warehouse Execution Plans appeared first on Home Of The Scary DBA.