Now that I have written about In-Memory Tables and Migrating to In-Memory tables, let’s look at indexes and how they are created and how they work within those tables. As you can imagine indexes, called memory optimized indexes are different for these types of tables, so let’s see just how different that are from regular tables.
Before we dive into this subject it is VERY important to note the biggest differences.
First, ALL memory optimized indexes MUST be created when the table is created or migrated. You cannot add indexes in an existing table without dropping and recreating the table.
Secondly, currently you can only have 8 indexes per table including your primary key. Remember that every table must have a primary key to enforce a secondary copy for a minimum of schema durability This means you can only really add 7 additional indexes so be sure to understand your workloads and plan indexing accordingly.
Third, Memory Optimized Indexes only exists in memory they are not persisted to disk and are not logged in the transaction logs. Therefore, this means they are also recreated upon database startup and do incur a performance hit as they are rebuilt.
Next, there is no such thing as key lookups against an In-Memory table, as all indexes are by nature a covering index. The index uses a pointer to the actual rows to get the needed fields instead of using a primary key like physical tables do. Therefore, these are much more efficient in returning the proper data.
Lastly, there also is no such thing as fragmentation for these indexes, since these are not read from disk. Unlike on disk indexes, these do not have a fixed page length. On disk index use physical page structures within the B-Tree, determining how much of the page should be filled is what the Fill Factor does. Since this is not a requirement fragmentation does not exist.
Ok now that we made it through all of that, let’s look at the types of indexes you can create and gain an understanding of what they are and how they are created.
Nonclustered HASH Index – This index is used to access the In-Memory version of the table, called a Hash. These are great for predicates that are singleton lookups and not ranges of values. These are optimized for seeks of equality values. For example, WHERE Name = ‘Joe’. Something to keep in mind when determining what to include in your indexes is this; if your query has two or more fields as your predicate and your index only consists of one of those fields, you will get a scan. It will not seek on that one field that was included. Understanding your workloads and indexing on the appropriate fields (or a combinations thereof) is important since you are limited to only 7 additional indexes. Given that this In-Memory OLTP is mainly focused on heavy insert/update workloads, and less so reading, this should be less of a concern.
These types of indexes are highly optimized and do not work very well if there are a lot of duplicate values in an index, the more unique your values better the index performance gains you will get. It is always important to know your data.
When it comes to these indexes knowing your memory consumption plays a part. The hash index type is a fixed length and consume a fixed amount of memory determined upon creation. The amount of memory is determined by the Bucket Count value. It is extremely important to make sure this value is as accurate as possible. Right sizing this number can make or break your performance, too low of a number according to Microsoft “can significantly impact workload performance and recovery time of a database. Look for my upcoming blog on determining bucket counts for more information. Meanwhile you can learn more about hash indexes at docs.microsoft.
Using T-SQL (both methods give the same result)
Example One (Note the index comes after the table fields)
CREATE TABLE [Sales] ([ProductKey] INT NOT NULL, [OrderDateKey] [int] NOT NULL, INDEX IDX_ProductKey HASH ([ProductKey]) WITH (BUCKET_COUNT = 100)) WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_ONLY)
Example Two (Note the index comes after the field)
CREATE TABLE [Sales] ([ProductKey] INT NOT NULL INDEX IDX_ProductKey HASH WITH (BUCKET_COUNT = 100), [OrderDateKey] [int] NOT NULL) WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_ONLY)
Nonclustered Index – These are also used to access the In-Memory version of the table however, these are optimized for range values such as less than and equal to, inequality predicates and sorts orders. Examples are WHERE DATE between ‘20190101’ and ‘20191231’ and WHERE DATE <> ‘20191231’. These indexes do not require a bucket count or fixed memory amount. The memory consumed by these indexes are determined by the actual row counts and size of the indexed key columns which makes it a simpler to create.
Moreover, in contrast to hash indexes which needs all fields required for your predicate to be part of your index to get a seek, these do not. If your predicates have more than one field and your index has that one of those as its leading index key value, then you can still attain a seek.
Using T-SQL (both methods give the same result)
Example One (Note the index comes after the table fields)
CREATE TABLE [Sales] ([ProductKey] INT NOT NULL, [OrderDateKey] [int] NOT NULL, INDEX IDX_ProductKey ([ProductKey])) WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_ONLY)
Example Two (Note the index comes after the field)
CREATE TABLE [Sales] ([ProductKey] INT NOT NULL INDEX IDX_ProductKey, [OrderDateKey] [int] NOT NULL) WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_ONLY)
Determining which index type to use can be tricky but Microsoft has given us a great guide in the below chart.
As you can see there some key differences to how In- Memory table indexes, memory optimized indexes, work compared to the normal disk indexes we are used to. Like with any other table design it is important to consider your index needs before you embark on creating or migrating to memory optimized tables. You’ll be happy you did.