Ideal table for Memory Optimization ?

  • isuckatsql (4/1/2014)


    I have a simple id, name table with six million records, that i converted to a "memory optimized' table.

    It now runs six times slower with a 'select count(*)' or 'select like' statement.

    I ran the following command

    select object_name(object_id), * from sys.dm_db_xtp_table_memory_stats

    and it claims 'memory used by table' as 1444239 kb !

    Is it the number of rows that is killing the performance or lack or memory ?

    Thanks

    Milos gave you a good explanation why it's slower

    What you really want to o for such queries, is using a Clustered Columnstore Index - unless this table really is OLTP. But this specific query would be a hell lot faster on CCI

    Andreas

    ---------------------------------------------------
    MVP SQL Server
    Microsoft Certified Master SQL Server 2008
    Microsoft Certified Solutions Master Data Platform, SQL Server 2012
    www.insidesql.org/blogs/andreaswolter
    www.andreas-wolter.com

  • I did find some evidence that an index hint on a range index running in a natively compiled proc was the quickest, but results seemed a bit variable. For example on my Azure VM, this was always the case, but on my local VM (albeit CTP2) it varied.

    Try this simple rig yourself and let me know how you get on. Run the queries in different orders, try different scales, remember to set your SPID beforehand.

    NB This is a SQLCMD Mode script, so enable it in SSMS ( Query Menu > SQLCMD Mode ) before you start.

    :on error exit

    USE hekatondb

    GO

    SET NOCOUNT ON

    GO

    :setvar spid 51-- your spid

    --:setvar scale 10000-- 10 thousand

    --:setvar scale 100000-- 1 hundred thousand

    --:setvar scale 1000000-- 1 million

    :setvar scale 10000000-- 10 million

    --:setvar scale 20000000-- 20 million

    IF OBJECT_ID('dbo.usp_hekatonCount') IS NOT NULL DROP PROC dbo.usp_hekatonCount

    IF OBJECT_ID('dbo.usp_hekatonCount2') IS NOT NULL DROP PROC dbo.usp_hekatonCount2

    IF OBJECT_ID('dbo.hekatonTable') IS NOT NULL DROP TABLE dbo.hekatonTable

    GO

    CREATE TABLE dbo.hekatonTable (

    rowIdINT NOT NULL,

    col1INT NOT NULL,

    col2BIGINT NOT NULL,

    col3BIGINT NOT NULL,

    dateAddedDATETIME NOT NULL DEFAULT GETDATE(),

    CONSTRAINT PK_hekatonTable PRIMARY KEY NONCLUSTERED HASH ( rowId ) WITH ( BUCKET_COUNT = $(scale) ),

    -- Range indexes

    INDEX ridx_hekatonTable__someData ( col1 )

    )

    WITH ( MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_ONLY )

    GO

    IF OBJECT_ID('dbo.usp_hekatonCount') IS NOT NULL DROP PROC dbo.usp_hekatonCount

    GO

    CREATE PROC dbo.usp_hekatonCount

    WITH NATIVE_COMPILATION, SCHEMABINDING, EXECUTE AS OWNER

    AS

    BEGIN ATOMIC

    WITH ( TRANSACTION ISOLATION LEVEL = SNAPSHOT, LANGUAGE = N'us_english' )

    SELECT COUNT(*) compiled_proc_index_hint_hash FROM dbo.hekatonTable WITH ( INDEX(PK_hekatonTable) )

    END

    GO

    IF OBJECT_ID('dbo.usp_hekatonCount2') IS NOT NULL DROP PROC dbo.usp_hekatonCount2

    GO

    CREATE PROC dbo.usp_hekatonCount2

    WITH NATIVE_COMPILATION, SCHEMABINDING, EXECUTE AS OWNER

    AS

    BEGIN ATOMIC

    WITH ( TRANSACTION ISOLATION LEVEL = SNAPSHOT, LANGUAGE = N'us_english' )

    SELECT COUNT(*) compiled_proc_index_hint_range FROM dbo.hekatonTable WITH ( INDEX(ridx_hekatonTable__someData) )

    END

    GO

    ;WITH cte AS (

    SELECT TOP $(scale)

    ROW_NUMBER() OVER ( ORDER BY ( SELECT 1 ) ) rn

    FROM master.sys.columns c1

    CROSS JOIN master.sys.columns c2

    CROSS JOIN master.sys.columns c3

    )

    INSERT INTO dbo.hekatonTable ( rowId, col1, col2, col3, dateAdded )

    SELECT rn, rn % 1000000, rn % 2000000, rn % 3000000, DATEADD( day, rn % 300, '1 Jan 2014' )

    FROM cte

    GO

    -- Start an XE session to record the timings

    IF EXISTS( SELECT * FROM sys.server_event_sessions WHERE name='xes_sql_statement_completed' )

    DROP EVENT SESSION xes_sql_statement_completed ON SERVER;

    GO

    -- Create the Event Session for given spid

    CREATE EVENT SESSION xes_sql_statement_completed

    ON SERVER

    ADD EVENT sqlserver.sql_statement_completed

    ( ACTION

    (

    sqlserver.sql_text

    )

    WHERE sqlserver.session_id = $(spid)

    )

    ADD TARGET package0.ring_buffer( SET max_memory = 2048 )

    WITH ( MAX_DISPATCH_LATENCY = 5 SECONDS )

    GO

    -- Start the Event Session

    ALTER EVENT SESSION xes_sql_statement_completed

    ON SERVER

    STATE = START

    GO

    GO

    SELECT COUNT(*) interop_index_hint_hash FROM dbo.hekatonTable WITH ( INDEX(PK_hekatonTable) )

    GO

    SELECT COUNT(*) interop_index_hint_range FROM dbo.hekatonTable WITH ( INDEX(ridx_hekatonTable__someData) )

    GO

    EXEC dbo.usp_hekatonCount2

    GO

    EXEC dbo.usp_hekatonCount

    GO

    -- Read the session data

    DECLARE @target_data XML

    SELECT @target_data = CAST( target_data AS XML )

    FROM sys.dm_xe_sessions AS s

    INNER JOIN sys.dm_xe_session_targets AS t ON t.event_session_address = s.address

    WHERE s.name = 'xes_sql_statement_completed'

    --SELECT @target_data

    SELECT sql_text, cpu, duration duration_microseconds, CAST( duration / 1000000.00 AS DECIMAL(10,2) ) duration_s

    FROM

    (

    SELECT

    e.c.value('(action[@name="sql_text"]/value/text())[1]', 'NVARCHAR(MAX)') AS sql_text,

    e.c.value('(data[@name="cpu_time"]/value/text())[1]', 'INT') AS cpu,

    e.c.value('(data[@name="duration"]/value/text())[1]', 'INT') AS duration,

    e.c.query('.') rawXML

    FROM @target_data.nodes('RingBufferTarget/event[@name="sql_statement_completed"]') e(c)

    ) x

    ORDER BY duration_s

    GO

    ALTER EVENT SESSION xes_sql_statement_completed

    ON SERVER

    STATE = STOP

    GO

    Regarding the difference between normal tables and these, for a COUNT(*) against a normal table, SQL Server will automatically pick the smallest index to scan, with read-ahead working hard for you to get the pages into memory if they're not already there. For a large and wide table, a non-clustered index could be significantly smaller (in 8k pages) than the main clustered index for example. With in-memory OLTP tables, all indexes cover all columns: quote from Kalen Delaney's CTP2 whitepaper:

    "...all indexes on a memory-optimized table are actually covering indexes"

    So this might suggest why there is a difference, although I am only speculating and couldn't really reproduce it myself. Also, as the others have hinted, COUNT(*) probably isn't the main business case for using this feature, but this was fun to investigate!

    I also looked at the Like problem but that's for another post...

  • OK scaled up my Azure VM to 8 core / 14GB and the rows to 50 million. Created similar(ish) test rigs for: normal table, columnstore, in-memory OLTP table and the results are as follows:

    COUNT(*) on 50 million rows

    Clustered Columnstore - 0.11s

    Normal table (cdx) - 1.7s

    Normal table (ncdx) - 1.8s

    In-memory OLTP - native compiled proc with range index hint - 8.8s

    In-memory OLTP - interop SQL with range index hint - 12s

    In-memory OLTP - native compiled proc with hash index hint - 18s

    In-memory OLTP - interop SQL with hash index hint - 21s

    Notes: Consider the results as approximate as I'm on a VM. The time it took to load the normal table ( ~30 mintes ) was significantly longer than the time it took to load the in-memory OLTP table ( ~1 minute ).

    So I think we all agree COUNT(*) is not a great idea for in-memory OLTP tables, expect it to be slower than disk-based and columnstore tables. If you do have to do it, try with a natively compiled stored proc and range index hint. Consider using in-memory OLTP for what it's designed for, eg shock absorber pattern, high concurrency etc

Viewing 3 posts - 16 through 17 (of 17 total)

You must be logged in to reply to this topic. Login to reply