Columnstore Indexes – part 85 (“Important Batch Mode Changes in SQL Server 2016”)

Continuation from the previous 84 parts, the whole series can be found at http://www.nikoport.com/columnstore/

I wanted to write a blog post on some of the changes that were implemented in SQL Server 2016, that might affect workloads migrated from SQL Server 2012 & SQL Server 2014.

Some of the possibilities of acquiring the Batch Execution Mode in the 2012 & 2014 versions of SQL Server, especially in the indirect way are now removed from the product…

To my understanding, this happens for preventing the OLTP Systems from spending whole their resources on analytical style of processing, but in my opinion this will affect the DWH and the Hybrid scenarios mostly, plus some of the edge cases with mixed environments (short-range lookups joining with big tables).

I will use my favourite free Database ContosoRetailDW from Microsoft, to show what some of those changes are about.

Let’s roll out a freshly restored version of it, with upgrading the data files size and most importantly setting the compatibility level to 130, which corresponds to the SQL Server 2016:

USE master;
 
alter database ContosoRetailDW
    set SINGLE_USER WITH ROLLBACK IMMEDIATE;
 
RESTORE DATABASE [ContosoRetailDW] 
    FROM  DISK = N'C:\Install\ContosoRetailDW.bak' WITH  FILE = 1,  
        MOVE N'ContosoRetailDW2.0' TO N'C:\Data\ContosoRetailDW.mdf', 
        MOVE N'ContosoRetailDW2.0_log' TO N'C:\Data\ContosoRetailDW.ldf',  
        NOUNLOAD,  STATS = 5;
 
alter database ContosoRetailDW
    set MULTI_USER;
GO
Use ContosoRetailDW;
GO
ALTER DATABASE [ContosoRetailDW] SET COMPATIBILITY_LEVEL = 130
GO
ALTER DATABASE [ContosoRetailDW] MODIFY FILE ( NAME = N'ContosoRetailDW2.0', SIZE = 2000000KB , FILEGROWTH = 128000KB )
GO
ALTER DATABASE [ContosoRetailDW] MODIFY FILE ( NAME = N'ContosoRetailDW2.0_log', SIZE = 400000KB , FILEGROWTH = 256000KB )
GO

As the next step, let us create a Nonclustered Columnstore Index on the FactOnlineSales table:

CREATE NONCLUSTERED COLUMNSTORE INDEX [ncci_factonlinesales] ON [dbo].[FactOnlineSales]
(
[OnlineSalesKey],
[DateKey],
[StoreKey],
[ProductKey],
[PromotionKey],
[CurrencyKey],
[CustomerKey],
[SalesOrderNumber],
[SalesOrderLineNumber],
[SalesQuantity],
[SalesAmount],
[ReturnQuantity],
[ReturnAmount],
[DiscountQuantity],
[DiscountAmount],
[TotalCost],
[UnitCost],
[UnitPrice],
[ETLLoadID],
[LoadDate],
[UpdateDate]
);

Now to the test query, where I will look for top 1000 product based on their sales in USD since the year 2014:

select top 1000 prod.ProductName, sum(sales.SalesAmount)
from dbo.FactOnlineSales sales with(index([PK_FactOnlineSales_SalesKey]))
inner join dbo.DimProduct prod
on sales.ProductKey = prod.ProductKey
inner join dbo.DimCurrency cur
on sales.CurrencyKey = cur.CurrencyKey
inner join dbo.DimPromotion prom
on sales.PromotionKey = prom.PromotionKey
where cur.CurrencyName = 'USD' and prom.EndDate >= '2004-01-01'
group by prod.ProductName
order by sum(sales.SalesAmount) desc
option (recompile);

This looks like a rather neat basic reporting query that one would meet on a daily basis when working in BI/DWH. The only thing that would differ from a normal plan is that we are forcing the Rowstore PK Index on the FactOnlineSales rather then using the created Columnstore Index. For a number of reasons, when we can’t get out of Query Processor what we need, we have to force the execution with the Index hint. Sometimes Query Processor will even select the path of using different index instead of the one that makes sense (well, because SURPRISE! Nothing is Perfect).

It takes a big amount of time to process this query – 12 seconds on my VM with a hot memory.

Considering it’s execution plan below, you will notice that there is nothing too much suspicious:

Well, imagine that – on the same VM, on the latest SQL Server 2014 version it takes 5.5 Seconds to run this very query! What ? 5.5 Seconds on SQL Server 2014 vs 12 Seconds on SQL Server 2016? So it’s not faster ? Captain, what the heck is the problem with this boat ?

Take a look at the execution plan from the SQL Server 2014:

The difference I see on the first sight is that there is a Parallelism Repartition Stream iterator near the Clustered Index Scan for the DimProduct Dimension. It can’t make the execution times going havoc like they do!

I will share a little secret with you – it’s all about the Batch Execution Mode in SQL Server 2014: all those Hash Match iterators are running in Batch Mode, even though we are not using Columnstore Index anywhere.

In SQL Server 2016 this old (since 2012) functionality has been removed and once you are running your queries in the compatibility level of 130 (SQL Server 2016), your queries that were taking advantage of it – will be running significantly slower.

There is a fast & brutal solution for that problem – set your compatibility level to 120, but do not go there until you have understood all the implications: some of the most important and magnificent improvements for the Batch Execution Mode are functioning only if your database is set to compatibility level 130: single threaded batch mode, batch sorting, window functions, etc.

From what I know, there is no way you can have all of those functionalities working together under the same hood and enjoy the old way of getting Batch Execution Mode without the presence of the Columnstore Index.

For the sake experiment, let’s lower the compatibility level:

USE master
GO
ALTER DATABASE [ContosoRetailDW] 
    SET COMPATIBILITY_LEVEL = 120;
GO

Running the same query against my SQL Server 2016 instance, will bring me some very interesting results:

select top 1000 prod.ProductName, sum(sales.SalesAmount)
from dbo.FactOnlineSales sales with(index([PK_FactOnlineSales_SalesKey]))
inner join dbo.DimProduct prod
on sales.ProductKey = prod.ProductKey
inner join dbo.DimCurrency cur
on sales.CurrencyKey = cur.CurrencyKey
inner join dbo.DimPromotion prom
on sales.PromotionKey = prom.PromotionKey
where cur.CurrencyName = 'USD' and prom.EndDate >= '2004-01-01'
group by prod.ProductName
order by sum(sales.SalesAmount) desc
option (recompile);

It took just 4.3 Seconds!

…

Here is an execution plan for that:

Yes, it looks pretty much like it does in SQL Server 2014.

Start crying now. Yes, it runs much faster in SQL Server 2016 over SQL Server 2014 (around 20%) even though because of the lack of 130 Compatibility Level support, the Sort iterator is running in the Row Execution Mode (meaning it can run even faster).

Such options as a IGNORE_NONCLUSTERED_COLUMNSTORE_INDEX and others will be following the very same path – they will loose the Batch Execution Mode once you upgrade to the 130 Compatibility Level.

The biggest problem with this situation is that things that were running fast – will be running slow and there is no reasonable way to solve this, because when migrating to SQL Server 2016, my goal is to bring the compatibility to the level 130, and not to leave it at 120, but from the other side – I definitely do not want to explain to my clients, why their reports are running slower on SQL Server 2016 once migration is finished.

Dear Microsoft, I need a Trace Flag to make this thing right. Please!

to be continued with Columnstore Indexes – part 86 (“New Trace Flags in SQL Server 2016”)

Columnstore Indexes – part 85 (“Important Batch Mode Changes in SQL Server 2016”)

Rate

Share

Share

Rate

Columnstore Indexes – part 85 (“Important Batch Mode Changes in SQL Server 2016”)

Rate

Share

Share

Rate

Related content

Book Review: Big Red - Voyage of a Trident Submarine

Database Mirroring FAQ: Can a 2008 SQL instance be used as the witness for a 2005 database mirroring setup?

Inserting Markup into a String with SQL

Networking - Part 4

Speaking at Community Events - More Thoughts