Blog Post

Clustered Columnstore Indexes – part 33 (“Tuple Mover that closes open Delta-Stores”)

,

Continuation from the previous 32 parts, starting from http://www.nikoport.com/2013/07/05/clustered-columnstore-indexes-part-1-intro/

We all love and use Columnstore Segments, but we all kind of scared of the open Delta-Stores (yes there can be more than 1).

Why would that be?

– because they are uncompressed heaps which most probably do not benefit from the Batch Mode at all, and even more probably slowing it down;

– because there is no way of how we can measure their size definitely;

What can we do about it:

1. Rebuild the Index – this operation will definitely close all open Delta-Store, shuffle the data and after some time (depending on a number of factors) it will present us a fully “Segmented” Table ?? It might take long time, depending if we have enough partitioning and are using

2. Wait until we can load more data through the normal API (BULK Load API would create another closed Segment without touching on the Delta-Store), filling out the maximum capacity of a Row Group (1.045.678 Rows)

3. Cry. ??

Reorganize with (COMPRESS_ALL_ROW_GROUPS = ON)

Consider an undocumented (at the moment of the writing) hint COMPRESS_ALL_ROW_GROUPS = ON which so far I understand looks for any open Delta-Stores, closes them as they are, without shuffling any additional data and compresses them immediately.

This is our old good friend Tuple Mover, but which corrects any undesired situations. ??

For the demo, I shall be using a freshly restored version of the good old & free Contoso Retail DWH database:

-- Let us create a table which copies structure of FactOnlineSales
CREATE TABLE [dbo].[FactOnlineSales_Test](
[OnlineSalesKey] [int] NOT NULL,
[DateKey] [datetime] NOT NULL,
[StoreKey] [int] NOT NULL,
[ProductKey] [int] NOT NULL,
[PromotionKey] [int] NOT NULL,
[CurrencyKey] [int] NOT NULL,
[CustomerKey] [int] NOT NULL,
[SalesOrderNumber] [nvarchar](20) NOT NULL,
[SalesOrderLineNumber] [int] NULL,
[SalesQuantity] [int] NOT NULL,
[SalesAmount] [money] NOT NULL,
[ReturnQuantity] [int] NOT NULL,
[ReturnAmount] [money] NULL,
[DiscountQuantity] [int] NULL,
[DiscountAmount] [money] NULL,
[TotalCost] [money] NOT NULL,
[UnitCost] [money] NULL,
[UnitPrice] [money] NULL,
[ETLLoadID] [int] NULL,
[LoadDate] [datetime] NULL,
[UpdateDate] [datetime] NULL
) ON [PRIMARY]
-- Create a Clustered Columnstore Index on it
create clustered columnstore index CCI_FactOnlineSales_Test
on dbo.FactOnlineSales_Test;

Now let’s load 100.000 Rows, under the magic number of 102.400 in order to keep an open Delta-Store:

insert into dbo.FactOnlineSales_Test
select top 100000 *
from dbo.FactOnlineSales;

Now we can verify that we have an open Delta-Store at our table:

select object_name(object_id) as TableName, * from sys.column_store_row_groups
where object_id = object_id('FactOnlineSales_Test');

This is what I see at my VM, we have an open Delta-Store, with 100.000 rows in it.
Screen Shot 2014-07-04 at 20.05.46

Now we can go ahead and invoke our Tuple Mover with a hint COMPRESS_ALL_ROW_GROUPS in order to close & compress our Delta-Store:

alter index CCI_FactOnlineSales_Test on FactOnlineSales_Test
Reorganize with (COMPRESS_ALL_ROW_GROUPS = ON);

We need to consult the situation with the row groups again:

select object_name(object_id) as TableName, * from sys.column_store_row_groups
where object_id = object_id('FactOnlineSales_Test');

And here we go:
Compressed_Row_Group_after_Reorganize_with_ COMPRESS_ALL_ROW_GROUPS _hint

I consider this to be an extremely valuable hint, which in the case of a bug with Delta-Store allows us to correct the situation or should we be getting a lot of queries hitting an open Delta-Store, to compress it and to remove the pressure.

Final thoughts:

– I am wondering if when loading the data we could take a good usage in order to get the desired Row Group sizes … ??

– Are there any methods to prevent automated Segment creation for BULK Load API processes with more then 102.400 rows…?

to be continued with Clustered Columnstore Indexes – part 34 (“Deleted Segments Elimination”)

Rate

You rated this post out of 5. Change rating

Share

Share

Rate

You rated this post out of 5. Change rating