Continuation from the previous 74 parts, the whole series can be found at http://www.nikoport.com/columnstore/.
This blog post focuses on one of the newest additions to SQL Server 2016, the feature that most probably has gained a universal appraisal since the announcement – the Stretch DB and its integration with Columnstore Indexes.
Stretch DB or alternatively Stretch Database is a way of spreading your table between SQL Server (on-premises, VM in Azure) and a Azure SQLDatabase. This means that the dat of the table will shared between the SQL Server and the Azure SQLDatabase giving the opportunity to lower the total cost of the local storage, since Azure SQLDatabase is cheap relatively expensive storage typically used on the local SQL Server installations.
This mean that the table data will be separated into Hot Data & Cold Data, where Hot Data is the type of data that is frequently accessed and it extremely important (this is typically some OLTP data) and the Cold Data (this is typically rarely or almost never accessed archival or log data).
For the final user the experience should be the same as before – should he ask for some data that is not on the SQL Server, then it will be read from Azure SQLDatabase by the invocation of remote query, joined with the local results (if any) and then presented to the user.
Currently with SQL Server 2016 CTP 3.2 with GUI Wizard we can only migrate whole table to Azure SQLDatabase but with the help of T-SQL we can already control the predicate filter.
Since the StretchDB is focusing on helping to manage big amounts of data, one of the key elements in order to lower the cost is the usage of Columnstore Indexes and here I will focus on testing StretchDB compatibility with them.
The Test
I will use the free database ContosoRetailDW that comes from Microsoft that I use the backup copied into C:\Install and the data files are placed into C:\Data:
alter database ContosoRetailDW set SINGLE_USER WITH ROLLBACK IMMEDIATE; RESTORE DATABASE [ContosoRetailDW] FROM DISK = N'C:\Install\ContosoRetailDW.bak' WITH FILE = 1, MOVE N'ContosoRetailDW2.0' TO N'C:\Data\ContosoRetailDW.mdf', MOVE N'ContosoRetailDW2.0_log' TO N'C:\Data\ContosoRetailDW.ldf', NOUNLOAD, STATS = 5; alter database ContosoRetailDW set MULTI_USER; GO GO ALTER DATABASE [ContosoRetailDW] SET COMPATIBILITY_LEVEL = 130 GO ALTER DATABASE [ContosoRetailDW] MODIFY FILE ( NAME = N'ContosoRetailDW2.0', SIZE = 2000000KB , FILEGROWTH = 128000KB ) GO ALTER DATABASE [ContosoRetailDW] MODIFY FILE ( NAME = N'ContosoRetailDW2.0_log', SIZE = 400000KB , FILEGROWTH = 256000KB ) GO
Currently there are 2 ways of enabling StretchDB on the database:
– using T-SQL
– using Graphical User Interface (GUI) inside SQL Server Management Studio
For T-SQL we shall need to set up the remote data archive option that is enabling Stretch Database feature on the server, by executing the following command:
exec sp_configure 'remote data archive' , '1'; go Reconfigure; go
After that we shall need to create a master key for the database. This master key will be used for securing the credentials that are used to communicate with Azure SQLDatabase:
CREATE MASTER KEY ENCRYPTION BY PASSWORD='<someRandomP$ss0rd';
For communication with remote part (Azure SQLDatabase) we shall need to create database scooped credentials:
CREATE DATABASE SCOPED CREDENTIAL stretchContosoCredentials WITH IDENTITY = 'niko' , SECRET = '=#$"qwerty $&/:bnuwE3'; GO
We will need to create a Azure SQLServer & SQLDatabase that will have those credentials created in the previous step, or we might simply reuse the existing one.
In my case I simply went to Azure Portal and created a new server with database for this test.
Notice that you will need to give enough permissions on Azure SQLDatabase for your user to be able to create a new database.
After preparing login and user, you can simply execute the following statement, which will create a new database for StretchDB:
ALTER DATABASE ContosoRetailDW SET REMOTE_DATA_ARCHIVE = ON ( SERVER = 'columnstore.database.windows.net', -- My Azure SQL Server CREDENTIAL = stretchContosoCredentials -- Database Scope Credentials ) ;
Refreshing the list of your databases in SSMS, you will notice that the symbol of the database has changed becoming a “kind of a cloud db”, meaning that this database is being stretched to Azure:
Let’s stretch a table, and this time for the experiment I will pick dbo.FactITMachine table.
Since StretchDB at the moment does not support primary & foreign keys (make sense because how can you guarantee consistency with data that is not explicitly included in the operations), we will drop them all, plus we will need to remove the identity from the leading ITMachinekey column:
ALTER TABLE [dbo].[FactITMachine] DROP CONSTRAINT [FK_FactITMachine_DimDate]; GO ALTER TABLE [dbo].[FactITMachine] DROP CONSTRAINT [FK_FactITMachine_DimMachine]; GO ALTER TABLE [dbo].[FactITMachine] DROP CONSTRAINT [PK_FactITMachine]; GO alter table dbo.FactITMachine alter column [ITMachinekey] int;
Let’s set up a filtering predicate function that will determine which rows should be migrated to Azure SQLDatabase and which should stay in SQL Server. For this test I will define 1st of January of 2009 as the cutting point for the data to be migrated:
CREATE FUNCTION dbo.fn_stretchpredicateFactITMachine(@column1 datetime) RETURNS TABLE WITH SCHEMABINDING AS RETURNSELECT 1 AS is_eligible WHERE @column1 < CONVERT(datetime, '1/1/2009', 101);
Before activating stretching on a table with a filter predicate we shall need to enable a global trace flag in SQL Server 2016 CTP 3.2:
dbcc traceon(10422,-1); -- Trace Flag Filtering Predicate
Now let’s activate stretching on our dbo.FactITMachine table using the filter predicate defined in the previous step:
ALTER TABLE dbo.FactITMachine SET ( REMOTE_DATA_ARCHIVE = ON ( FILTER_PREDICATE = dbo.fn_stretchpredicateFactITMachine(DateKey), MIGRATION_STATE = OUTBOUND ) );
If we open monitoring screen for Stretched Databases (right click on our database -> tasks-> stretch -> monitor) :
you will see that our table is successfully migrated according to the filtering predicate we have defined:
To prove that those numbers reflect the actuality, feel free to execute the following T-SQL script which will return exactly 7086 rows corresponding our filtering predicate:
select count(*) from dbo.FactITMachine where Datekey >= '2009-01-01'
If you want to get the rows back into your SQL Server table, you will need to execute the following script that will enable the global trace flag allowing downward migration from Azure SQLDatabase down to your SQL Server, and then altering the state of the table, setting the migration to be InBound:
dbcc traceon(10424,-1); -- UnMigrate Trace flag ALTER TABLE dbo.FactITMachine SET ( REMOTE_DATA_ARCHIVE = ON (MIGRATION_STATE = INBOUND) ) ;
Let’s concentrate on the test of the functionality that matters to these blog series – Columnstore Indexes – let’s create 2 tables with Columnstore Indexes: 1 with Clustered Columnstore Index and the other one with Nonclustered Columnstore Index:
create table dbo.CCI_Stretched( c1 int, c2 varchar(25), index CCI_Stretched_Index clustered columnstore ); create table dbo.NCCI_Stretched( c1 int, c2 varchar(25), index CCI_Stretched_Index nonclustered columnstore (c1,c2) );
Let’s insert a couple of rows into each of the tables:
insert into dbo.CCI_Stretched (c1,c2) select top 100 id, substring(name,0,25) from sys.sysobjects; insert into dbo.NCCI_Stretched (c1,c2) select top 100 id, substring(name,0,25) from sys.sysobjects;
Now let’s go to the GUI in SQL Server Management Studio and enable stretching for our table with the Clustered Columnstore Index:
Clicking through the welcome screen:
Confirming that the table selected is correct:
We are coming to the validation screen:
where you can see that we have received a pretty concrete message that the type of index used in the table (Clustered Columnstore is the only index there) is not supported for Stretch Databases.
If clicking on the error details you will be presented with a more definitive error message confirming that Clustered Columnstore Indexes are not supported.
Ok, this is a message from the GUI and we all know that GUI is the last thing to get updated. Even though the message appears to be very concrete, let’s try to enable stretching with the help of T-SQL:
ALTER TABLE dbo.CCI_Stretched SET ( REMOTE_DATA_ARCHIVE = ON ( MIGRATION_STATE = OUTBOUND ) );
The script was executed without any error message – did it work ?
Let’s open the Stretch Database Monitor – but the table is not listed there !
Opening Stretch Database Health Events in Extended Events gave me the following message:
This message is clearly showing the the remote table could not be created. Verifying that the table is the one I just tried to configure, I execute the following T-SQL:
select object_name(318624178);
The result is the dbo.CCI_Stretched table and so the extended event is correct – we can’t have our table stretched to Azure SQLDatabase.
Well … Thinking about the error message I have decided to try to … correct it.
Given that the database created for the StretchDB is a Standard S3 one, I went to Azure Portal and manually upgraded it to P1 (Premium).
Seconds after the upgrade has finished, and after restarting SSMS (this is an old bug with StretchDB), I have had my table listed as stretched:
To make a further confirmation, I have executed the following query while enabling the showing of the actual execution plan:
select * from dbo.CCI_Stretched;
Having Remote Query and concatenation with the local data is exactly how Stretch DB is working:
Let us hope then that there will be a possibility for choosing an edition while stretching databases. This will allow to avoid all possible problems.
Alternatively Microsoft can always enable Columnstore Indexes for the Standard Edition …
Lets not forget about the Operational Analytics and let’s see if we can stretch a table with updatable Nonclustered Columnstore Indexes:
This is the same error we have seen for the Clustered Columnstore Indexes, so let’s use the power of T-SQL to stretch this table as well:
ALTER TABLE dbo.NCCI_Stretched SET ( REMOTE_DATA_ARCHIVE = ON ( MIGRATION_STATE = OUTBOUND ) );
The result is visible in the Stretch DB Monitor:
Let’s verify if the query is really being run remotely:
select * from dbo.NCCI_Stretched;
The execution plan speaks for itself:
There is one more scenario that one should consider when thinking about Columnstore Indexes: the newest addition for the In-Memory Tables: InMemory Operational Analytics.
Heading to the Surface area limitations and blocking issues for Stretch Database will help you to discover the limitats and the Memory-optimized tables overall are not supported for the current version.
Final Thoughts
Thinking about the impossibility of Columnstore Index stretching i realised that the default Azure SQLDatabase that is created for StretchDB is the Standard one (S3), where Columnstore Indexes are not supported at the moment.
Otherwise the technology should work without any problems.
Time will tell if maybe StretchDB Migration Advisor will recommend & allow Premium DB selection when configuring the feature …
to be continued …