Denali – Day 5: Column-store indexes (aka Project Apollo)
Denali brings another great feature, aimed especially at data warehouses, where data is very large and mostly static (read-only) and users need queries to return as quickly as possible: the "columnstore" index. With a columnstore index, the way data is stored on a page changes. Normally each page stores rows (records) of a table, but when you create a columnstore index, each page stores data from a single column. The technology builds on the VertiPaq engine introduced with PowerPivot for SQL Server 2008 R2 Analysis Services, and it uses VertiPaq compression.
When you query a table you usually need only a few columns, but because pages store whole rows, the engine has to read every row, and for a large table that means scanning many rows and many pages. When the data is stored in a columnstore index, only the pages for the requested columns are read, so far fewer pages are scanned and the query runs much faster.
For very large, read-only data-warehouse workloads, columnstore indexes have been reported to improve query performance by 10 to 100 times.
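As a sketch of the kind of query that benefits (the table and column names here are hypothetical), an aggregate that touches only two columns of a wide fact table reads just those columns' pages from a columnstore index instead of scanning every row:

```sql
-- Illustrative only: dbo.FactSales is a hypothetical wide fact table.
-- With a columnstore index on it, this query reads only the pages for
-- ProductKey and SalesAmount, not the whole row for every record.
SELECT ProductKey, SUM(SalesAmount) AS TotalSales
FROM dbo.FactSales
GROUP BY ProductKey;
```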
That is the good side of columnstore indexes; they have their own limitations as well:
- Only one columnstore index is allowed per table
- Only a non-clustered columnstore index is allowed
- A table with a columnstore index is read-only for as long as the index exists
- A table with a columnstore index cannot participate in replication, change data capture (CDC), or an indexed view
- A columnstore index cannot be created on FILESTREAM, computed, sparse, or large-object (BLOB) columns
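The read-only behavior is easy to see in practice. This sketch (again with a hypothetical table name) shows what happens if you try DML while the columnstore index exists:

```sql
-- Once a columnstore index exists on dbo.FactSales, the table is
-- read-only: INSERT, UPDATE, and DELETE all fail until the index is
-- disabled or dropped.
UPDATE dbo.FactSales
SET SalesAmount = 0
WHERE ProductKey = 1;
-- Fails with an error saying data cannot be updated in a table
-- that has a columnstore index.
```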
Because the table becomes read-only, the ways to perform DML on a table with a columnstore index are:
- Disable the columnstore index, perform the DML, then rebuild the index
- Use partitioning and partition switching
- UNION ALL (keep new rows in a separate updatable table and combine it with the read-only table at query time) – not a good idea, but an option…
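The first workaround can be sketched like this (index and table names are hypothetical):

```sql
-- Workaround sketch: disable the columnstore index, run the DML,
-- then rebuild the index so queries can use it again.
ALTER INDEX IX_FactSales_ColumnStore ON dbo.FactSales DISABLE;

INSERT INTO dbo.FactSales (ProductKey, SalesAmount)
VALUES (42, 199.99);

ALTER INDEX IX_FactSales_ColumnStore ON dbo.FactSales REBUILD;
```

Note that the rebuild re-reads and recompresses the data, so this approach suits periodic batch loads rather than frequent small updates.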
The T-SQL syntax is:
CREATE [ NONCLUSTERED ] COLUMNSTORE INDEX index_name
ON <object> ( column [ ,...n ] )
[ WITH ( <column_index_option> [ ,...n ] ) ]
[ ON {
{ partition_scheme_name ( column_name ) }
| filegroup_name
| "default"
}
]
[ ; ]
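A concrete instance of the syntax above, using the same hypothetical fact table as earlier:

```sql
-- Create a non-clustered columnstore index covering the columns
-- most queries aggregate over (names are illustrative).
CREATE NONCLUSTERED COLUMNSTORE INDEX IX_FactSales_ColumnStore
ON dbo.FactSales (ProductKey, OrderDateKey, SalesAmount);
```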
A maximum of 1024 columns can be included in a columnstore index.
*Again, use this only for large, read-only data. It is not a good fit for OLTP-style DML workloads on small or medium tables.