Continuation from the previous 85 parts, the whole series can be found at http://www.nikoport.com/columnstore/
With the release of SQL Server 2016, a great number of new functionalities with specific internal structures and own behaviours were created (Columnstore Indexes – part 60 (“3 More Batch Mode Improvements in SQL Server 2016”), Columnstore Indexes – part 60 (“3 More Batch Mode Improvements in SQL Server 2016”), Columnstore Indexes – part 65 (“Clustered Columnstore Improvements in SQL Server 2016”), Columnstore Indexes – part 72 (“InMemory Operational Analytics”)) – so that there is a customer need to execute a good control under some certain circumstances.
2 years ago I have blogged about available Trace Flags & Query Rules for the Columnstore Indexes in SQL Server 2014, and now it is the time to revisit this topic and show what new Trace Flags are available in SQL Server 2016, and this article focuses on this topic.
Batch Sort
The possibility of having your data sorted with the help of the Batch Execution Mode was introduced in SQL Server and some of the key improvements for the SQL Server 2016, such as Window Functions depend on the velocity of execution of the Sort iterator
The vast majority of the time, it works fine, but some rare times it can bring you into some serious trouble.
For that purpose, the documented (I call it documented, since it is mentioned in the KB 3172787) Trace Flag was introduced to allow to disable the Batch Execution Mode for the iterator.
This trace flag works as a Configuration Parameter, as a Global Trace Flag, Session Trace Flag or even QueryTraceOn option).
I will use the free ContosoRetailDW database (it is so easy to play) and run the standard script for restoring a copy of it from C:\Install\, upgrading it’s compatibility level to 130 (batch mode improvements require it) and then dropping the primary clustered key from the FactOnlineSales table and create a Clustered Columnstore Index on it:
USE master; alter database ContosoRetailDW set SINGLE_USER WITH ROLLBACK IMMEDIATE; RESTORE DATABASE [ContosoRetailDW] FROM DISK = N'C:\Install\ContosoRetailDW.bak' WITH FILE = 1, MOVE N'ContosoRetailDW2.0' TO N'C:\Data\ContosoRetailDW.mdf', MOVE N'ContosoRetailDW2.0_log' TO N'C:\Data\ContosoRetailDW.ldf', NOUNLOAD, STATS = 5; alter database ContosoRetailDW set MULTI_USER; GO Use ContosoRetailDW; GO ALTER DATABASE [ContosoRetailDW] SET COMPATIBILITY_LEVEL = 130 GO ALTER DATABASE [ContosoRetailDW] MODIFY FILE ( NAME = N'ContosoRetailDW2.0', SIZE = 2000000KB , FILEGROWTH = 128000KB ) GO ALTER DATABASE [ContosoRetailDW] MODIFY FILE ( NAME = N'ContosoRetailDW2.0_log', SIZE = 400000KB , FILEGROWTH = 256000KB ) GO ALTER TABLE dbo.[FactOnlineSales] DROP CONSTRAINT [PK_FactOnlineSales_SalesKey]; create clustered columnstore Index PK_FactOnlineSales on dbo.FactOnlineSales;
Let’s run the following test query against our Clustered Columnstore table, it should give us in the execution plan a Sort Iterator which will function in the Batch Execution Mode:
declare @loadDate as DateTime; select @loadDate = sales.loadDate from dbo.FactOnlineSales sales inner join dbo.DimPromotion prom on sales.PromotionKey = prom.PromotionKey where prom.DiscountPercent = 0 order by sales.LoadDate;
The 3rd from the left, the Sort Iterator is there, running in the Batch Execution Mode and occupying estimated 34% of the resources. It took just 171ms on my virtual machine.
Now, let’s us enable the Trace Flag 9347 and see if it shall bring the expected impact to the Batch Sort iterator (notice that if you are running on the RTM version of SQL Server, you will face a bug that was fixed in the Cumulative Update 1, which is already available):
declare @loadDate as DateTime; select @loadDate = sales.loadDate from dbo.FactOnlineSales sales inner join dbo.DimPromotion prom on sales.PromotionKey = prom.PromotionKey where prom.DiscountPercent = 0 order by sales.LoadDate option( querytraceon 9347);
Occupying estimated 84%! of the resources, the very same query runs over 15 times slower – 2731 ms, all because of the Sort Iterator being executed in the Row Execution Mode. What makes the performance suffer more is that the Sort Iterator does not get enough memory through the memory grant and that is why it is spilling on the disk.
Top N Sort
The Top N Sort iterator does sort like the above mentioned Sort iterator, with the difference that it passes only top N rows after sorting. There are different optimisations to the used algorithm that make this iterator functioning differently from the Sort iterator.
Given it is a different functionality and a different iterator, SQL Server 2016 has a separate Trace Flag that will allow you to disable the Batch Mode on it – the Trace Flag 9349. It is important to have it separate from the complete Sort Iterator, since a number of times one can catch both operators and having a possibility to shut down just a Top N Sort can be advantageous.
Let’s run a SELECT TOP X query, selecting 100.000 rows from the results of our previous query:
declare @loadDate as DateTime; select top 100000 @loadDate = sales.loadDate from dbo.FactOnlineSales sales inner join dbo.DimPromotion prom on sales.PromotionKey = prom.PromotionKey where prom.DiscountPercent = 0 order by sales.LoadDate;
As you can see on the image above, the execution plan presents TOP N Sort Iterator functioning in the Batch Execution Mode, taking 101ms in total execution time on my VM, while occupying estimated 53% of the execution plan resources.
Let’s execute the same query this time with the help of the Trace Flag 9349, thus preventing the TOP N Sort Iterator from running in the Batch Execution Mode:
declare @loadDate as DateTime; select top 100000 @loadDate = sales.loadDate from dbo.FactOnlineSales sales inner join dbo.DimPromotion prom on sales.PromotionKey = prom.PromotionKey where prom.DiscountPercent = 0 order by sales.LoadDate option(querytraceon 9349);
This time the query took good 1820ms, over 18 times slower then the query with Sort Iterator using the Batch Execution Mode. Even the estimated resource consumption jumped up to 92%, but from the other side the amount of memory that was granted for the second query is around 38MB vs 72MB that was granted for the query with the Batch Execution Mode. When running an OLTP that is using a huge number of parallel connections, the amount of used memory can be the most important resource and keeping it down to an accepted level can become the paramount of keeping the system alive.
Sort in a Complex Parallel Query
Sometimes there are complex queries that suffers from the unnecessary Batch Execution Mode for Sort Iterators and since Cumulative Update 1 for SQL Server 2016, there is a new trace flag – 9358 for disabling it.
I could not easily produce with a sample Database such an example query, but I hope to get back to this blog post one day and provide a practical example.
An important additional part to consider is that according to the official documentation, the good old Trace Flag 4199 will produce the same effect, meaning that if it is active – you will see some of your Sort Iterators running in the Row Execution Mode.
Dynamic memory grant for batch mode operators
Starting with SQL Server 2016, if you have enough RAM and suffering from the TempDB Spills that do have a significant impact on your workload, then you can enable the Trace Flag 9389 that will enable Batch Mode Iterators to request additional memory for the work and thus avoiding producing additional unnecessary I/O.
I am glad that Microsoft has created this functionality and especially that at the current release, it is hidden behind this track, and so Microsoft can learn from the applications before enabling it by default, hopefully in the next major release of SQL Server.
Merge & recompress during columnstore index reorganisation
Some of the most incredibly useful feature in SQL Server 2016 was removal, recompression and merge of the fragmented Row Groups (containing deleted data in the Deleted Bitmap) that I have already described in details in Columnstore Indexes – part 74 (“Row Group Merging & Cleanup, SQL Server 2016 edition”).
In SQL Server 2016 we also have a special knob for disabling this functionality and it is a Trace Flag 10204. In other words, your SQL Server 2016 database with compatibility level 130 will become a kind of SQL Server 2014, where Row Groups maintenance was non-functioning with ALTER INDEX … REORGANIZE command.
Disk-Based Clustered Columnstore
Nothing beats a practical example that is also presentable in a blog post and so here is the code to create an exact copy of a FactOnlineSales with a Clustered Columnstore Index, which will serve us for testing the ALTER INDEX … REORGANIZE command:
CREATE TABLE [dbo].[FactOnlineSales_CCI]( [OnlineSalesKey] [int] NOT NULL, [DateKey] [datetime] NOT NULL, [StoreKey] [int] NOT NULL, [ProductKey] [int] NOT NULL, [PromotionKey] [int] NOT NULL, [CurrencyKey] [int] NOT NULL, [CustomerKey] [int] NOT NULL, [SalesOrderNumber] [nvarchar](20) NOT NULL, [SalesOrderLineNumber] [int] NULL, [SalesQuantity] [int] NOT NULL, [SalesAmount] [money] NOT NULL, [ReturnQuantity] [int] NOT NULL, [ReturnAmount] [money] NULL, [DiscountQuantity] [int] NULL, [DiscountAmount] [money] NULL, [TotalCost] [money] NOT NULL, [UnitCost] [money] NULL, [UnitPrice] [money] NULL, [ETLLoadID] [int] NULL, [LoadDate] [datetime] NULL, [UpdateDate] [datetime] NULL ) ON [PRIMARY] insert into [dbo].[FactOnlineSales_CCI] with (tablockx) (OnlineSalesKey, DateKey, StoreKey, ProductKey, PromotionKey, CurrencyKey, CustomerKey, SalesOrderNumber, SalesOrderLineNumber, SalesQuantity, SalesAmount, ReturnQuantity, ReturnAmount, DiscountQuantity, DiscountAmount, TotalCost, UnitCost, UnitPrice, ETLLoadID, LoadDate, UpdateDate) select OnlineSalesKey, DateKey, StoreKey, ProductKey, PromotionKey, CurrencyKey, CustomerKey, SalesOrderNumber, SalesOrderLineNumber, SalesQuantity, SalesAmount, ReturnQuantity, ReturnAmount, DiscountQuantity, DiscountAmount, TotalCost, UnitCost, UnitPrice, ETLLoadID, LoadDate, UpdateDate from dbo.FactOnlineSales; create clustered index PK_FactOnlineSales_CCI on dbo.FactOnlineSales_CCI (OnlineSalesKey) with (data_compression = page); create clustered columnstore index PK_FactOnlineSales_CCI on dbo.FactOnlineSales_CCI with (drop_existing = on, maxdop = 1);
The next step is to install the CISL – Columnstore Indexes Script Library, which will provide us with the insights on the internal structures of our table:
exec cstore_GetRowGroups @tableName = 'FactOnlineSales_CCI' exec cstore_GetRowGroupsDetails @tableName = 'FactOnlineSales_CCI';
Everything is fine as you can see. We have a copy of our data with the Clustered Columnstore Index.
Let’s enable the Trace Flag 10204 (it does not make a lot of sense to do it on the session level of course, since the automated Tuple Mover runs in the background, unless disabled with the other Trace Flag 634):
dbcc traceon (10204, -1);
We can now easily advance with our delete process, and so the next statement shall delete the first 2 million rows ordered by the column OnlineSalesKey:
;with delCTE as ( select top (2000000) OnlineSalesKey from dbo.FactOnlineSales_CCI del order by OnlineSalesKey ) delete from delCTE;
Let’s consult the internal structures of our FactOnlineSales_CCI table:
exec cstore_GetRowGroups @tableName = 'FactOnlineSales_CCI' exec cstore_GetRowGroupsDetails @tableName = 'FactOnlineSales_CCI';
You can see on the image below, that we have 2 million rows marked as deleted in the Deleted Bitmap and that the very first Row Group is completely deleted:
All we need to do now, is to invoke the Tuple Mover with the ALTER INDEX … REORGANIZE statement, so that we can see the removal of the first RowGroup and Auto-Merge of the second Row Group:
alter index PK_FactOnlineSales_CCI on dbo.FactOnlineSales_CCI reorganize;
The statement does not even take 1 millisecond on my VM .. Because it does nothing. The activated Trace Flag goes deeper than the corresponding Extended Event columnstore_no_rowgroup_qualified_for_merge – which registers nothing at this point. Troubleshooting the issue with another potentially helpful Extended Event columnstore_rowgroup_merge_failed brings nothing as well. The trace flag simply shuts it down without a trace (pun intended).
Running the CISL functions will deliver the same result as we have had above, the ALTER INDEX … REORGANIZE falls back into SQL Server 2014 mode, where it takes care only of the open Delta-Store by default.
exec cstore_GetRowGroups @tableName = 'FactOnlineSales_CCI' exec cstore_GetRowGroupsDetails @tableName = 'FactOnlineSales_CCI';
I guess that this functionality will drive people completely mad, if they know nothing about this trace flag, but hey, if you are using CISL from the version 1.3.1 and later, your cstore_GetSQLInfo.sql will provide you with the correct output (the current file at GitHub gives you this functionality already):
exec dbo.cstore_GetSQLInfo;
The Nonclustered Columnstore
We shall make now the same test we did for the Clustered Columnstore, but this time we shall do it for the Nonclustered Columnstore Index, which is also disk-based, as the previous one:
CREATE TABLE [dbo].[FactOnlineSales_NCCI]( [OnlineSalesKey] [int] NOT NULL, [DateKey] [datetime] NOT NULL, [StoreKey] [int] NOT NULL, [ProductKey] [int] NOT NULL, [PromotionKey] [int] NOT NULL, [CurrencyKey] [int] NOT NULL, [CustomerKey] [int] NOT NULL, [SalesOrderNumber] [nvarchar](20) NOT NULL, [SalesOrderLineNumber] [int] NULL, [SalesQuantity] [int] NOT NULL, [SalesAmount] [money] NOT NULL, [ReturnQuantity] [int] NOT NULL, [ReturnAmount] [money] NULL, [DiscountQuantity] [int] NULL, [DiscountAmount] [money] NULL, [TotalCost] [money] NOT NULL, [UnitCost] [money] NULL, [UnitPrice] [money] NULL, [ETLLoadID] [int] NULL, [LoadDate] [datetime] NULL, [UpdateDate] [datetime] NULL ) ON [PRIMARY] insert into [dbo].[FactOnlineSales_NCCI] with (tablockx) (OnlineSalesKey, DateKey, StoreKey, ProductKey, PromotionKey, CurrencyKey, CustomerKey, SalesOrderNumber, SalesOrderLineNumber, SalesQuantity, SalesAmount, ReturnQuantity, ReturnAmount, DiscountQuantity, DiscountAmount, TotalCost, UnitCost, UnitPrice, ETLLoadID, LoadDate, UpdateDate) select OnlineSalesKey, DateKey, StoreKey, ProductKey, PromotionKey, CurrencyKey, CustomerKey, SalesOrderNumber, SalesOrderLineNumber, SalesQuantity, SalesAmount, ReturnQuantity, ReturnAmount, DiscountQuantity, DiscountAmount, TotalCost, UnitCost, UnitPrice, ETLLoadID, LoadDate, UpdateDate from dbo.FactOnlineSales; create clustered index PK_FactOnlineSales_NCCI on dbo.FactOnlineSales_NCCI (OnlineSalesKey) with (data_compression = page); create nonclustered index Index_FactOnlineSales_NCCI on dbo.FactOnlineSales_NCCI (OnlineSalesKey, DateKey, StoreKey, ProductKey, PromotionKey, CurrencyKey, CustomerKey, SalesOrderNumber, SalesOrderLineNumber, SalesQuantity, SalesAmount, ReturnQuantity, ReturnAmount, DiscountQuantity, DiscountAmount, TotalCost, UnitCost, UnitPrice, ETLLoadID, LoadDate, UpdateDate) with (--drop_existing = on, data_compression = page, maxdop = 1); create nonclustered columnstore index Index_FactOnlineSales_NCCI on dbo.FactOnlineSales_NCCI (OnlineSalesKey, DateKey, StoreKey, ProductKey, PromotionKey, CurrencyKey, CustomerKey, SalesOrderNumber, SalesOrderLineNumber, SalesQuantity, SalesAmount, ReturnQuantity, ReturnAmount, DiscountQuantity, DiscountAmount, TotalCost, UnitCost, UnitPrice, ETLLoadID, LoadDate, UpdateDate) with (drop_existing = on, maxdop = 1);
Let’s consult the internal structures:
exec dbo.cstore_GetRowGroups @tableName = 'FactOnlineSales_NCCI'; exec dbo.cstore_GetRowGroupsDetails @tableName = 'FactOnlineSales_NCCI';
Everything is under control here, let’s rebuild the Index 2 times (data will be first moved from the Deleted Buffer to the deleted Bitmap in the first step)
-- Move data to Deleted Bitmap alter index Index_FactOnlineSales_NCCI on dbo.FactOnlineSales_NCCI reorganize; -- Reorganize & Merge the Row Groups alter index Index_FactOnlineSales_NCCI on dbo.FactOnlineSales_NCCI reorganize;
Let’s consult the internal structures again:
exec dbo.cstore_GetRowGroups @tableName = 'FactOnlineSales_NCCI'; exec dbo.cstore_GetRowGroupsDetails @tableName = 'FactOnlineSales_NCCI';
Same as for the Clustered Columnstore Index, the data will not be migrated or merged between the Row Groups, because of the active trace flag 10204.
Memory-Optimized Columnstore
But let’s take it one step further, and see if this trace flag affects the Memory-Optimised Tables with Clustered Columnstore Index as well:
drop TABLE if exists [dbo].[FactOnlineSales_Hekaton]; CREATE TABLE [dbo].[FactOnlineSales_Hekaton]( [OnlineSalesKey] [int] NOT NULL, [DateKey] [datetime] NOT NULL, [StoreKey] [int] NOT NULL, [ProductKey] [int] NOT NULL, [PromotionKey] [int] NOT NULL, [CurrencyKey] [int] NOT NULL, [CustomerKey] [int] NOT NULL, [SalesOrderNumber] [nvarchar](20) NOT NULL, [SalesOrderLineNumber] [int] NULL, [SalesQuantity] [int] NOT NULL, [SalesAmount] [money] NOT NULL, [ReturnQuantity] [int] NOT NULL, [ReturnAmount] [money] NULL, [DiscountQuantity] [int] NULL, [DiscountAmount] [money] NULL, [TotalCost] [money] NOT NULL, [UnitCost] [money] NULL, [UnitPrice] [money] NULL, [ETLLoadID] [int] NULL, [LoadDate] [datetime] NULL, [UpdateDate] [datetime] NULL, Constraint PK_FactOnlineSales_Hekaton PRIMARY KEY NONCLUSTERED ([OnlineSalesKey]), INDEX NCCI_FactOnlineSales_Hekaton CLUSTERED COLUMNSTORE ) WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA); -- Insert 5 Million Rows from dbo.FactOnlineSales insert into dbo.FactOnlineSales_Hekaton select top 5000000 * from dbo.FactOnlineSales --order by OnlineSalesKey desc
In order to move the data from the Tail Row Group, I will use the newest addition for the CISL library, the cstore_doMaintenance function:
exec dbo.cstore_doMaintenance @execute = 1, @debug = 1, @executeReorganize = 1, @logData = 1, @tableName = 'FactOnlineSales_Hekaton';
Let’s see the current internals of our Memory-Optimised table:
exec cstore_GetRowGroups @tableName = 'FactOnlineSales_Hekaton'; exec cstore_GetRowGroupsDetails @tableName = 'FactOnlineSales_Hekaton';
Looks fine, as expected.
Let’s delete 2 Million Rows from our Memory-Optimised Table:
;with delCTE as ( select top (2000000) OnlineSalesKey from dbo.FactOnlineSales_Hekaton del order by OnlineSalesKey ) delete from delCTE;
Since the process of moving the data from the Tail Row Group does not guarantee the sequential of the data, let’s delete 2.8 million rows, leaving only 200.000 rows active within our table:
;with delCTE as ( select top (2000000) OnlineSalesKey from dbo.FactOnlineSales_Hekaton del order by OnlineSalesKey ) delete from delCTE;
Now, if we ran the Reorganize command on the Memory-Optimised table:
-- Select Object_ID for the InMemory OLTP Table select object_id('dbo.FactOnlineSales_Hekaton'); -- Substitute the object_id with your own from the previous statement exec sys.sp_memory_optimized_cs_migration @object_id = 219147826
or you can do like me and simply invoke the CISL cstore_doMaintance function
exec dbo.cstore_doMaintenance @debug = 1, @executeReorganize = 1, @logData = 1, @tableName = 'FactOnlineSales_Hekaton';
The miracle happens, and you can see below that the internal structure of our Row Groups is very well affected by the Row Migration:
exec cstore_GetRowGroups @tableName = 'FactOnlineSales_Hekaton'; exec cstore_GetRowGroupsDetails @tableName = 'FactOnlineSales_Hekaton';
Surprise! On the image above, we have a Tail Row Group (row_group_id = -1), which is a kind of a Delta-Store, containing the surviving 200.000 rows, and all completely deleted Row Groups are simply removed from the internal structures.
Why did this happen ?
1. There is no REBUILD functionality for the Memory-Optimized Columnstore Tables yet, and so there must be a way to keep the table alive.
2. The Row Migration process for the Memory-Optimized Columnstore Tables moves rows from Tail Row Group into the compressed Row Groups and back, it does function in a different way than a disk-based columnstore and with no REBUILD possibility, this is the only way of cleaning up the rows from the Memory-Optimized Columnstore table (besides dropping and recreating the Columnstore Index, of course).
I am happy to see that the Trace Flag 10204 does not function for the Memory-Optimized Columnstore tables, because technically for the current version of SQL Server it would make no sense at all.
Old Trace Flags & Query Rules
The Trace Flags for SQL Server 2014 were described in Columnstore Indexes – part 35 (“Trace Flags & Query Optimiser Rules”) are still active in SQL Server 2016, I will not provide the tests here, feel free to simply re-run the queries that are specified in that post.
Final Thoughts
These Trace Flags presented in this article are some really great knobs, that Microsoft has inserted into the SQL Server 2016 (some of them are available only from SQL Server 2016 CU1), allowing you to expand the potential performance of your Columnstore Batch Execution Mode queries (Dynamic memory grant with TF 9389), and preventing the Sort operations from running in the Batch Execution Mode (could be important for overloaded OLTP systems).
The possibility of disabling Row Group Merge & Self-Compress should target the extreme OLTP System (Trace Flag 10204) is also nice, but I really hope that no unexperienced engineer will start applying this trace flag everywhere by default – though I will be updating the CISL to recognise and advise the user of this configuration.
to be continued …