Continuation from the previous 66 parts, the whole series can be found at http://www.nikoport.com/columnstore/.
Continuing from the last couple of blog posts on the improvements for the DataWarehouse scenario with Clustered Columnstore Indexes, this time I want to turn to a very important improvement that will greatly increase the availability of Clustered Columnstore tables when faced with a number of parallel transactions executing Point or Range Lookups.
In an ideal DataWarehouse environment the locking problem should not occur, since the Data Loading process runs in clearly defined timeframes, outside of the heavy reading & processing windows. In the real world, however, there are always a couple of things that need to be upserted in real time while people are still processing information.
Some Business Intelligence & Reporting solutions do allow the insertion of information during daytime activities, and for some of those projects, Clustered Columnstore Indexes were a solution that could not be applied.
In SQL Server 2014, singleton updates & deletes could easily block the reading operations on a Clustered Columnstore Index, as I have shown in Columnstore Indexes – part 8 (“Locking”). The logical solution for this problem would be moving our workload to the Snapshot or Read Committed Snapshot isolation level, paying TempDB (it is always suffering, isn’t it ?) for the privilege of avoiding some of the worst locking & deadlocking problems.
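The TempDB price is real: the row versions that Snapshot & Read Committed Snapshot rely on live in the TempDB version store. As a quick sketch, you can keep an eye on its size with the standard sys.dm_db_file_space_usage DMV:

```sql
-- Approximate size of the version store inside TempDB
USE tempdb;
GO
SELECT SUM(version_store_reserved_page_count) * 8 / 1024. AS VersionStoreMB
	FROM sys.dm_db_file_space_usage;
```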
Well, it happened that in SQL Server 2014, the only 2 Isolation Levels that were not supported were Snapshot Isolation & Read Committed Snapshot, as I have written in Columnstore Indexes – part 7 (“Transaction Isolation”). And as I have shown in December of 2014, after the release of Azure SQL Database v12, in Azure Columnstore, part 2 – Snapshot Isolation & Batch Mode DOP, the very existence of Columnstore Indexes in Azure SQLDatabase is a proof of support for both of the missing isolation levels (since Azure SQLDatabase runs in Read Committed Snapshot).
Just to remind you: the reason we did not have readable secondaries for Clustered Columnstore Indexes in SQL Server 2014 was the lack of support for the Snapshot isolation level.
Let’s take a short test drive of Read Committed Snapshot in order to confirm its availability in SQL Server 2016. For this purpose I will be using the SQL Server 2016 CTP 2.3 release, the latest release available at the moment.
Let’s create the very same CCTest table that I have created in the original blogpost, and a Clustered Columnstore Index on it:
IF OBJECT_ID('dbo.CCTest', 'U') IS NOT NULL
	drop table dbo.CCTest;

create table dbo.CCTest(
	id int not null, -- primary key clustered,
	name varchar(50) not null,
	lastname varchar(50) not null );
GO

create clustered columnstore index CCL_CCTest
	on dbo.CCTest;
GO
Let’s also add a secondary Nonclustered B-tree Index, since we have them in SQL Server 2016, as I have shown in the previous blog posts:
create nonclustered index IX_CCTest_id on dbo.CCTest( id ) Include ( lastname );
Armed with this B-tree Nonclustered Index and with our Clustered Columnstore index, let’s try to insert some data into our table while setting the database into Read Committed Snapshot isolation level:
alter database ContosoRetailDW set SINGLE_USER WITH ROLLBACK IMMEDIATE;

ALTER DATABASE ContosoRetailDW
	SET READ_COMMITTED_SNAPSHOT ON WITH NO_WAIT;
GO

alter database ContosoRetailDW set MULTI_USER;
GO
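If you want to double-check that the setting really took effect before running the test, sys.databases exposes both snapshot-related flags:

```sql
-- Verify the database-level snapshot settings
SELECT name,
	is_read_committed_snapshot_on,
	snapshot_isolation_state_desc
	FROM sys.databases
	WHERE name = 'ContosoRetailDW';
```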
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;

declare @i as int;
declare @max as int;
select @max = isnull(max(id),0) from dbo.CCTest;
set @i = 1;

begin tran
while @i <= 50000
begin
	insert into dbo.CCTest ( id, name, lastname )
		values ( @max + @i, 'SomeName_', 'SomeLastName_' );

	set @i = @i + 1;
end;
commit;
The transaction works perfectly without any trouble in SQL Server 2016 CTP 2.3, whereas in SQL Server 2014 it would always produce the following error message:
SNAPSHOT isolation level is not supported on a table which has a clustered columnstore index.
This is a great improvement, but I just wish that it had been implemented for SQL Server 2014 as a part of Service Pack 1, because this way we could have used Readable Secondaries in Availability Groups as part of a solid solution for any enterprise client.
Now we can go on and see what kind of locking we get in our database when reading and writing data.
Let’s take out our T-SQL microscope and examine how the system behaves when we insert a row into a table with a Clustered Columnstore Index and a Nonclustered B-Tree Index:
begin tran
insert into dbo.CCTest ( id, name, lastname )
	values ( 100000, 'SomeName_', 'SomeLastName_' );
Notice that I did not commit the insertion of a new Tuple.
For verifying what is getting locked and with which specific lock, I will use the following query (the very same one from the original blog post on locking):
-- What's being locked
SELECT dm_tran_locks.request_session_id,
	dm_tran_locks.resource_database_id,
	DB_NAME(dm_tran_locks.resource_database_id) AS dbname,
	CASE WHEN resource_type = 'object'
		THEN OBJECT_NAME(dm_tran_locks.resource_associated_entity_id)
		ELSE OBJECT_NAME(partitions.OBJECT_ID)
	END AS ObjectName,
	partitions.index_id,
	indexes.name AS index_name,
	dm_tran_locks.resource_type,
	dm_tran_locks.resource_description,
	dm_tran_locks.resource_associated_entity_id,
	dm_tran_locks.request_mode,
	dm_tran_locks.request_status
	FROM sys.dm_tran_locks
		LEFT JOIN sys.partitions
			ON partitions.hobt_id = dm_tran_locks.resource_associated_entity_id
		JOIN sys.indexes
			ON indexes.OBJECT_ID = partitions.OBJECT_ID
				AND indexes.index_id = partitions.index_id
	WHERE resource_associated_entity_id > 0
		AND resource_database_id = DB_ID()
	ORDER BY request_session_id, resource_associated_entity_id;
Even though we have just 1 open Delta-Store with 50,000+1 (not yet committed) rows, we can clearly see that, as in SQL Server 2014, we are blocking the Delta-Store with an IX (Intent Exclusive) lock – hey, we are inserting data here! But additionally, compared with SQL Server 2014, we have another IX (Intent Exclusive) lock on the data page associated with our Nonclustered B-Tree Index, and we are locking the key with an X (Exclusive) lock. Everything looks very logical and quite expected, because underneath a simple query we need to modify 2 structures (the Delta-Store & the B-Tree Index) – but what has actually improved?
Can we shake off some of the locking ?
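Before answering that, it helps to confirm which structures we are actually dealing with. The sys.column_store_row_groups catalog view (available since SQL Server 2014) shows the open Delta-Store and its row count:

```sql
-- Inspect the Row Groups / Delta-Stores of our test table
SELECT row_group_id, state_description, total_rows, deleted_rows
	FROM sys.column_store_row_groups
	WHERE object_id = OBJECT_ID('dbo.CCTest');
```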
Let’s run another query reading data from the table, fetching rows with a specific id and expecting the Query Optimiser to take advantage of the Nonclustered B-Tree Index (extra points, of course, for going “nasty” with the Serializable Isolation Level – because I want it to get extra-complicated):
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
GO

begin tran
select id, lastname
	from dbo.CCTest
	where id = 10;
What’s being locked there:
After running a parallel reading transaction that is using the Serializable isolation level, there are 3 extra locks to be noticed: 2 RangeS-S locks (Shared Range Locks on the Keys) and an IS (Intent Shared) lock on the data page where that data is located.
What does that mean ?
Are you/we/they happy ?
Should we be happy or sad about it ?
Well, I am certainly happy, if you ask me.
Let’s run another query in a new window, this time running a default transaction isolation level (Read Committed Snapshot):
begin tran
select id, lastname
	from dbo.CCTest
	where id = 15;
Taking another look at what’s being locked is nice – there are no changes! None at all.
Because we are simply reading a version of the currently persisted data, temporarily stored in TempDB, as normally happens with Snapshot & Read Committed Snapshot.
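If you are curious which sessions are actually reading from the version store at any given moment, there is a dedicated DMV for it (a sketch, to be run while the reading transaction is still open):

```sql
-- Sessions currently relying on row versioning
SELECT session_id, transaction_id, is_snapshot, elapsed_time_seconds
	FROM sys.dm_tran_active_snapshot_database_transactions;
```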
This means that as long as you are running your default transactions for reading data that is covered by the Nonclustered B-Tree Index, you will enjoy normal, non-blocking behaviour, just as you would expect from a traditional Rowstore table.
Let’s commit the last, third transaction with reading id = 15, and start a new transaction where we shall try to process all the columns from our table (Remember, the Nonclustered B-Tree Index contains data for only 2 columns – Id & LastName):
commit;

begin tran
select id, name, lastname
	from dbo.CCTest
	where id = 15;
I can’t find any changes to the current locking structures – and that makes me happy.
I see a free path to reading the data under the default Read Committed Snapshot isolation level, without any blocking, even while inserting data – which is a major problem in SQL Server 2014.
At this point, you should go back into each of the first 2 transactions and commit the workload by issuing the COMMIT T-SQL command:
commit;
Not stopping at what we have seen for a simple data insertion, let’s see a little bit more of the behaviour of SQL Server 2016:
Truncate table dbo.CCTest;

declare @i as int;
declare @max as int;
select @max = isnull(max(id),0) from dbo.CCTest;
set @i = 1;

begin tran
while @i <= 1048576
begin
	insert into dbo.CCTest ( id, name, lastname )
		values ( @max + @i, 'SomeName_', 'SomeLastName_' );

	set @i = @i + 1;
end;
commit;
Let’s open a new window in SSMS, and insert some new data:
begin tran
insert into dbo.CCTest ( id, name, lastname )
	values ( 100000, 'SomeName_', 'SomeLastName_' );
Once again, using the same query for verifying the current locks:
-- What's being locked
SELECT dm_tran_locks.request_session_id,
	dm_tran_locks.resource_database_id,
	DB_NAME(dm_tran_locks.resource_database_id) AS dbname,
	CASE WHEN resource_type = 'object'
		THEN OBJECT_NAME(dm_tran_locks.resource_associated_entity_id)
		ELSE OBJECT_NAME(partitions.OBJECT_ID)
	END AS ObjectName,
	partitions.index_id,
	indexes.name AS index_name,
	dm_tran_locks.resource_type,
	dm_tran_locks.resource_description,
	dm_tran_locks.resource_associated_entity_id,
	dm_tran_locks.request_mode,
	dm_tran_locks.request_status
	FROM sys.dm_tran_locks
		LEFT JOIN sys.partitions
			ON partitions.hobt_id = dm_tran_locks.resource_associated_entity_id
		JOIN sys.indexes
			ON indexes.OBJECT_ID = partitions.OBJECT_ID
				AND indexes.index_id = partitions.index_id
	WHERE resource_associated_entity_id > 0
		AND resource_database_id = DB_ID()
	ORDER BY request_session_id, resource_associated_entity_id;
We can observe locking identical to what we have seen in SQL Server 2014 as far as the Columnstore structures are concerned (2 Delta-Stores – 1 open & 1 closed, both with IX locks), while the difference in SQL Server 2016 for our test table is simply the locks that the Nonclustered B-Tree Index is adding (2 data-page IX locks and 1 key X lock).
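To see those 2 Delta-Stores (1 open & 1 closed) for yourself, SQL Server 2016 adds a new DMV, sys.dm_db_column_store_row_group_physical_stats, which exposes the per-Row-Group state directly:

```sql
-- New in SQL Server 2016: physical per-Row-Group statistics
SELECT row_group_id, state_desc, total_rows, deleted_rows
	FROM sys.dm_db_column_store_row_group_physical_stats
	WHERE object_id = OBJECT_ID('dbo.CCTest');
```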
Now let’s take a more detailed look at our T-SQL friends DELETE & UPDATE, focusing on the DELETE statement first:
commit;

begin tran
delete from dbo.CCTest
	where id = 150;
I will give you a hint :) – take a close look at the lock taken on the Delta-Store (or you could make it a Compressed Segment, it does not really matter at this point): it is an IX (Intent Exclusive) lock, a HUGE difference from the X (Exclusive) lock taken in SQL Server 2014. This comes at the price of taking an IX lock on the data page and an X lock on the key, both on the Nonclustered Index.
But still – IX on the Columnstore Index! This will allow you to read and process the data, even while deleting data!
Making sure that no scenario escapes this test, let’s rebuild the Columnstore index and then invoke the same operation again:
alter index CCL_CCTest on dbo.CCTest rebuild;
Everything is still the same, so it truly does not matter whether we are deleting data from the Columnstore Compressed Segments or from an open/closed Delta-Store.
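One reason the Row Group can stay on an IX lock is that a Columnstore DELETE never touches the compressed data itself – the affected rows are only marked in the Deleted Bitmap. You can verify this (a sketch, after committing the delete) by checking the deleted_rows counter:

```sql
-- Deleted rows are only flagged, not physically removed
SELECT row_group_id, state_description, deleted_rows
	FROM sys.column_store_row_groups
	WHERE object_id = OBJECT_ID('dbo.CCTest')
		AND deleted_rows > 0;
```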
Now it’s time for the T-SQL UPDATE command, ever popular and needed:
commit;

begin tran
update dbo.CCTest
	set lastname = 'Updated Name'
	where id = 2;
Notice that I am using columns from the covering Nonclustered Index here, and here are the locking results:
You can see that because I have rebuilt my Clustered Columnstore Index, the Query Optimiser has to process both of the Row Groups (Segment Elimination did not work this time), but they are both locked with an IX lock, making it a great improvement over SQL Server 2014, where we had an X (Exclusive) lock on the Columnstore Row Groups.
Let’s do something different for fun: let’s update a huge part of our table, avoiding the leading column of our Nonclustered B-Tree Index:
commit;

begin tran
update dbo.CCTest
	set name = 'Updated Name'
	where lastname = 'SomeLastName_';
This time our locking is looking like this:
No Nonclustered B-Tree Indexes were locked this time… This serves as a warning: if you are updating or deleting from your Clustered Columnstore table with queries that are faster to process on the Clustered Columnstore Index itself (or at least the Query Optimiser believes so), then you are in trouble…
Let’s commit the transaction and start a different one, this time trying to update data with some traditional hints:
commit;

begin tran
update dbo.CCTest with(ROWLOCK)
	set lastname = 'Updated Name'
	where id = 150;
Immediately we get the following error message:
Msg 651, Level 16, State 1, Line 3
Cannot use the ROW granularity hint on the table “dbo.CCTest” because locking at the specified granularity is inhibited.
This functionality looks like it has not changed since SQL Server 2014, even though the Query Optimiser should be able to use the Nonclustered B-Tree Index as the base for this operation, because we are using only covered columns.
Interestingly enough, the following code works without provoking any error messages:
commit;

begin tran
update dbo.CCTest with(PAGLOCK)
	set lastname = 'Updated Name'
	where id = 150;
It looks like SQL Server 2016 CTP 2.3 has already implemented support for Page-Level Locking, while Row-Level Locking is still eluding us – hopefully it will make its way into the final release. I see enough usage for this scenario and I hope that the Microsoft Development team considers it as well.
LOCKRES:
Coming back to the topic of %%lockres%%, there is no news so far:
select top 5 %%lockres%% from dbo.CCTest
There is still the same error message:
Msg 3625, Level 16, State 203, Line 21
‘SE CSI: Virtual Column Support’ is not yet implemented.
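For comparison, %%lockres%% (an undocumented virtual column, so use it for troubleshooting only) still works fine against a regular Rowstore table – for example, against a hypothetical copy of our test table built with the commented-out clustered primary key instead of the Columnstore Index:

```sql
-- Works on a Rowstore table; dbo.CCTest_RowStore is a hypothetical
-- copy of dbo.CCTest with a clustered primary key on id
SELECT TOP 5 %%lockres%%, id
	FROM dbo.CCTest_RowStore;
```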
In the end, I reset my test database by switching the Read Committed Snapshot isolation level off:
ALTER DATABASE ContosoRetailDW SET READ_COMMITTED_SNAPSHOT OFF;
Final Thoughts
I am really very happy about the improvements on the locking level for Clustered Columnstore Indexes, especially since the Snapshot & Read Committed Snapshot isolation levels take care of so many typical transactional locking problems.
The improvements on the Update & Delete operations leave me even happier. Some people might call those minor improvements, but if your workload is suffering from blocking today, they can make a real difference.
The key here, of course, is having the right indexes for the Update & Delete operations; otherwise you will fall back to the Columnstore Exclusive locking as in SQL Server 2014.
I wish there were a little bit more granularity and depth in the locking for Columnstore Indexes, but maybe this is just the technical geek in me talking…
to be continued with Columnstore Indexes – part 68 (“Data Loading, Delta-Stores & Vertipaq Compression Optimisation”)