Stairway to Columnstore Indexes
SQL Server 2012 and later offer a very different type of index from the traditional b-tree, the in-memory columnstore index. These indexes use a column-based storage model, as well as a new 'batch mode' of query execution and can offer huge performance increases for certain workloads. But how are they built, how do they work, and why do they manage to have such a dramatic impact on performance? In this stairway, Hugo Kornelis explains all, with his usual mix of concise description and detailed demonstration.
- This level introduces of you to the fundamentals of columnstore indexes, introdused in SQL Server 2012 to manage the indexing of very large tables.
- To fully appreciate just how different columnstore indexes are, and why work so well in reporting and online analytical processing (OLAP) workloads, but not for online transaction processing (OLTP), we must first look at the traditional “rowstore” indexes.
- The performance increase columnstore indexes grant when reading data from the index is offset by the expensive process required to build the index. In this Stairway level, Hugo Kornelis walks you through the steps SQL Server takes when building (or rebuilding) a columnstore index.
- In this level, we will first take a look at how to recognize columnstore indexes in some of the generic catalog views. After that we will investigate the new catalog views that have been added just for columnstore indexes.
- Earlier levels have shown how Columnstore Indexes work effectively with static data. In most tables however, data is hardly ever static. We are constantly inserting new rows, and updating or deleting existing rows. If you think about what this means for a columnstore index, you will realize that this comes with some unique challenges.
- This level looks in detail at what happens when we update or delete data from a clustered columnstore index, the impact it has on concurrent data access, and how without careful maintenance the efficiency of columnstore indexes can degrade over time.
- In this level, we will focus on optimization techniques to apply while building the nonclustered columnstore index, which is available in all versions of SQL Server from 2012 up.
- In Level 7, we looked at optimizing rowgroup elimination for a nonclustered columnstore index. For a clustered columnstore index, the same technique can be used but the steps and syntax change a bit. This will be covered later – but first, let’s take a look at another significant difference between nonclustered and clustered columnstore indexes, […]
- In this level, Hugo explains what batch mode execution is, how it differs from row mode execution, and what its limitations are.
- In this level, Hugo Kornelis looks at how to rewrite your queries to best take advantage of batch mode.
- Hugo Kornelis continues his exploration of the types of queries that can end up running in row mode when accessing columnstore indexes. He demonstrates how careful rewriting can often yield a logically equivalent query that runs in batch mode instead, and therefore gains the best possible performance benefit.
- The previous levels of this stairway describe details, features, and limitations of columnstore indexes in SQL Server. But they do not answer what should be the first question for every database professional: should columnstore indexes be used in my databases; on what tables should they be used; and should they be clustered or nonclustered columnstore indexes?
- This stairway series was started in 2015. As such, the focus was on SQL Server 2012 and SQL Server 2014 only. When SQL Server 2016 was released, with lots of improvements in the columnstore technology, I decided to finish the planned levels with the original focus on SQL Server 2012 and 2014, and add one extra level with a brief overview of the improvements available in SQL Server 2016.