Degraded Performance in 2017 vs 2008 R2 with inserts

  • We're finally upgrading from SQL Server 2008 R2 Enterprise to SQL Server 2017 Standard and have moved our test database to the new server.  Both VMs are on similar hardware with 4 vCPUs.

    Of course, given the chasm of versions we expected much better performance.  Unfortunately, that's not the case.

    Our inserts on the 2008 R2 box are the quickest and the 2017 box exhibits 250% slower inserts (100k-1 million rows) in a plain vanilla empty table with a clustered primary key (no indices, triggers, or anything).

    The new server has more RAM but its memory usage seems limited to 40GB despite the allocation not being set to a limit - it's still higher than our old server.  Our system is low usage but we do have tables with 200 million and 500 million rows with a scientific model running occasionally that needs to run quickly and reliably to make the results available to users.

    We've tried compatibility levels of 2008, 2012, 2014, 2016 and of course 2017 on the 2017 box but they all run at the same pace so it doesn't seem to the Cardinality Estimator introduced in 2014 at least in the plain vanilla inserts in an empty table.  However, in testing our model on the SQL Server 2017 box, compatibility levels of 2014 and above were slower and more unstable so the Cardinality Estimator is probably also at play here but first we'd like to understand why the simple inserts are slower before we tackle the Cardinality Estimator.

    It's only inserts that are slower.  Batch updates and deletes were within 1-2 seconds for 1 million rows on both boxes so the performance is more than acceptable.

    Does anyone know what's causing this performance degradation and how to go on about fixing it?  We're going to benchmark the vms to understand if something might be slower but from the other operations like updates and deletes, the 2017 box can be as quick so we doubt it's a configuration issue but cannot rule it out without testing.

    The slow inserts affect our model in a couple of ways, it takes much longer for the model to execute delaying the results and also strains our server for longer and since we use parallelism (24 operations) in the model, the operations tend to deadlock more often when it's slower and stability is affected in addition to performance - our success rate of each parallel process has dropped from 100% to 90-95%.

    Any help would be greatly appreciated.

    • This topic was modified 7 months, 1 week ago by  MichaelT.
  • UPDATE

    I decided to test performance of individual updates by looping through 100,000 rows and updating each one.  No transactions.

    The performance degradation for the 1st run was 250% which matches the insert peformance degradation.  However, the 2nd and 3rd runs were 4500% percent faster on 2008R2.

    My only guess at a 18x jump in speed is that the table or recordset is in-memory at this point.

    I ran the test again and cleared the tables, inserted, updated, and re-updated and the results were with a 2.5x performance boost over 2017 on the 1st run and a 45x bump on the same operation.

    • This reply was modified 7 months, 1 week ago by  MichaelT.
  • similar hardware does not mean similar performance - so I would check those.

     

    you say they are VM - details? onprem, cloud, vendor (AWS, MS, VMWare) - VM type if cloud

    Disks - local, SAN, type, and if cloud, with our without encryption, with or without write/read cache, normal HDD, SSD, SSD Premium.

    AV - could it be enabled on new server and no configured to ignore SQL Files?

    there would be other aspects to look for depending on details above.

     

    and to weed out server config use SQLIO to measure disk IO performance. see following links

    https://learn.microsoft.com/en-us/troubleshoot/sql/tools/sqliosim-utility-simulate-activity-disk-subsystem

    and

    https://techcommunity.microsoft.com/t5/sql-server-support-blog/choosing-what-sqlio-tests-to-run-and-automating-the-tests/ba-p/333997

  • Thanks for the reply and the information.

    It's VMWare.

    Disks should be local SSDs (no encryption).

    I checked AV (whew) - I thought that could be it but turning Real-Time monitoring off had no impact which is also surprising.

    I'm trying to use that SQLIOSim tool but it's not intuitive even after setting everything up and I'm not sure what to look for.

    I did run a query to check the following

    num_of_reads,

    num_of_bytes_read,

    io_stall_read_ms,

    num_of_writes,

    num_of_bytes_written,

    io_stall_write_ms,

    io_stall,

    size_on_disk_bytes

    FROM

    sys.dm_io_virtual_file_stats(null, null) as vfs

    IO_Stall is higher on the new machine for the plain vanilla database table.

    Sequential Deletes are also affected by 2.5x so the behavior is very consistent across inserts, updates, and deletes.

    Summarizing data (Grouping 2 million rows in a table with 300+ million) took 1.4 seconds on the old 2008 R2 server and 1.9 seconds on the new 2017 server - that's 35% longer so reads are also slower but not as impacted as the other operations which are affected by 2.5x.

    Can I run CrystalDisk Mark?

     

    • This reply was modified 7 months, 1 week ago by  MichaelT.
  • being VMWare I would speak with the team that maintains it and check config of new vm vs old - as well as server specs itself.

    things to look at

    • Memory overcommit - in SQL box this should NOT be allowed
    • Memory Ballooning should be disabled
    • CPU overcommit - for different reasons but should not be allowed
    • CPU Ready
    • Cpu high performance should be set to always max performance in both Host Physical Box Bios, VMware Settings and within Windows vm itself.
    • Disk driver used
    • Disk block size - it does have some impact but should not be the issue here.

    other things may also be impacting this from the VMWare side - get their team to look into all different details

    this document highlights most of the things that should be done https://www.vmware.com/content/dam/digitalmarketing/vmware/en/pdf/solutions/sql-server-on-vmware-best-practices-guide.pdf

  • Thank you! We'll look into those settings.

    Just out of curiosity, if the machines are identical in terms of all specs with more RAM for the SQL 2017 should we be expecting performance improvements?

    Essentially, is SQL 2017 generally faster in normal CRUD operations out-of-the-box than SQL 2008 R2?  I'd expect improvements but that could just be a false expectation on my behalf.

  • I've run the operations many times while looking at Task Manager CPU, Memory, and Disk Usage.

    SQL Server 2017

    CPU: 25%

    Total Bytes: ~2,000,000

     

    SQL Server 2008 R2

    CPU: 13%

    Total Bytes: ~6,000,000

     

    We're clearly writing more on the SQL Server 2008 R2 machine with lower CPU usage.

     

    The 25% number on the SQL 2017 troubles me as it's a 4-CPU vm and I wonder if it's CPU bound and limited a single CPU of the 4.

     

  • Aside from what been said, If your data has changed then see if you need to update index stats especially on your heavily used tables.

    ----------------------------------------------------

  • MMartin1 wrote:

    Aside from what been said, If your data has changed then see if you need to update index stats especially on your heavily used tables.

    We use Ola Hallengren's maintenance plans to update stats.

    Our testing of CRUD is being done on an empty table locally with an Identity PK to eliminate all these variables. We are seeing 2.5x performance penalties in Inserts, Updates, and Deletes.

    You're 100% right - in our main tables for the model with half a billon records and inserts of millions of records with 24 parallel transactions over the network followed by spatial updates of all records would require the table's stats to be updates, the index stats need to be updated frequently as we learned the hard way 🙂

  • I would give 'dbcc updateusage' a try.  I can tell you this has fixed weird disk performance on new servers for me more than once.

    https://learn.microsoft.com/en-us/sql/t-sql/database-console-commands/dbcc-updateusage-transact-sql?view=sql-server-ver16

  • So we ran Crystal DiskMark and it produced number of 2,000 on the SQL Server 2017 and 1,000 on the SQL Server 2008 R2.  These seemed a bit off for SSDs as we benched 9,000 on a standalone server that's a lower spec model.

    We decided to copy a 2.5GB file and the SQL Server 2008 R2 vm sustained speeds of about 400-600MB/sec while copying the file and the SQL Server 2017 sustained speeds of 500-800MB/sec.  Both vms copied the file very, very quickly.  My laptop hits 1.0-1.5GB and it's hard to see the popup.  The SSDs are pretty quick.  It's a bit strange that the SQL 2017 has faster IO in file exploser but is executing slower.

    We did notice that the SQL Server 2017 serves has a slower CPU Xeon Gold 6130 2.1 Ghz (32 CPUs) while the SQL Server 2008 R2 is running on CPU Xeon Gold 6226R 2.9 Ghz.

    The CPUMark is 30% higher the 6226R vs the 6130R in CPU Mark.

    Can the CPU be responsible for 2.5x performance difference?

    https://www.cpubenchmark.net/compare/Intel-Xeon-Gold-6226R-vs-Intel-Xeon-Gold-6130/3728vs3126

     

     

  • I gave you a series of links for IO testing - and I've also gave you a series of questions to ask your VM admin - have these get back to you with the differences between the 2 VM's - and with anything else at the host (and/or NAS if you use it) level that can cause this difference

    cpu itself would make a difference when processing queries - in straight inserts only if you have FK's on the tables you are loading to (as these will need to be processed and these will definitely need cpu (and for this the CPU' being always at MAX performance will make a difference)

  • frederico_fonseca wrote:

    I gave you a series of links for IO testing - and I've also gave you a series of questions to ask your VM admin - have these get back to you with the differences between the 2 VM's - and with anything else at the host (and/or NAS if you use it) level that can cause this difference

    I downloaded SQLSimIO but I couldn't get it to work:-) I just tried to download SQLIO but it's no longer available and pointed to DiskSpd which I downloaded and ran:

    The results were 440MiB for the SQL Server 2008 R2 and 955MiB for the SQL Server 2017.  So the SQL Server 2017 seems to have a much quicker IO configuration.  Not sure why as I suspect they have identical hardware.

    However, even though, SQL Server 2017 is much faster in IO, it's 2.5x slower in the simplest database operations.

    frederico_fonseca wrote:

    cpu itself would make a difference when processing queries - in straight inserts only if you have FK's on the tables you are loading to (as these will need to be processed and these will definitely need cpu (and for this the CPU' being always at MAX performance will make a difference)

    The table is empty without any FKs.  Here's the definition.

     

    CREATE TABLE [dbo].[Data](

    [Id] [int] IDENTITY(1,1) NOT NULL,

    [Comment] [varchar](50) NOT NULL,

    [CreateDate] [datetime] NOT NULL,

    CONSTRAINT [PK_Data] PRIMARY KEY CLUSTERED

    (

    [Id] ASC

    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

    ) ON [PRIMARY]

    GO

    ALTER TABLE [dbo].[Data] ADD  CONSTRAINT [DF_Data_CreateDate]  DEFAULT (getdate()) FOR [CreateDate]

    GO

     

  • SQLIOSIM is part of the standard SQL installation since 2008 - located on the binn directory (for a sql 2022 it would be C:\Program Files\Microsoft SQL Server\MSSQL16.MSSQLSERVER\MSSQL\Binn - assuming default C install)

    see attached 3 config files and a command line (SQLIOtest.txt) to execute the tests.

    NOTE: these should be run with SQL Instance shutdown - they WILL HAMMER your system so do not do it while users active.

    you will need to change the location of the files (both within the .ini files) and on the cmd file (loaded as a .txt, change it to .cmd once on  your system)

    lines to change on ini files are

    FileName=c:\sqliotest\sqliosim.mdx <<--- change drive and location to be the same drive (and volume as your .mdf files)

    FileName=c:\sqliotest\sqliosim.ldx <<--- change drive and location to be the same drive (and volume as your .ldf files)

    you will need at least 10 GB free on the drives above

     

    on .cmd file change both locations of sqliosim.com and the location of the ini and log files to be what you desire.

    "C:\Program Files\Microsoft SQL Server\MSSQL16.MSSQLSERVER\MSSQL\Binn\SQLIOSIM.COM" -cfg C:\sqliotest\sqliosim.cfg.ini -log C:\sqliotest\sqliosim.log.xml

    the log files, bottom section (containing Final Summary  for file xxx) will show us what the systems are doing.

     

     

     

    Attachments:
    You must be logged in to view attached files.
  • ah well... files don't wanna be loaded anymore - so putting contents here

    sqliosim.cfg.ini

    [CONFIG]
    ErrorFile=sqliosim.log.xml
    ;CPUCount=2
    ;Affinity=0
    ;IOAffinity=0
    ;MaxMemoryMB=209
    StopOnError=TRUE
    TestCycles=1
    TestCycleDuration=300
    CacheHitRatio=10000
    NoBuffering=TRUE
    WriteThrough=TRUE
    MaxOutstandingIO=0
    TargetIODuration=100
    AllowIOBursts=TRUE
    UseScatterGather=TRUE
    ForceReadAhead=FALSE
    DeleteFilesAtStartup=TRUE
    DeleteFilesAtShutdown=TRUE
    StampFiles=FALSE

    [File1]
    FileName=C:\sqliotest\sqliosim.mdx
    InitialSize=500
    MaxSize=5000
    Increment=500
    Shrinkable=TRUE
    LogFile=FALSE
    Sparse=FALSE

    [File2]
    FileName=C:\sqliotest\sqliosim.ldx
    InitialSize=500
    MaxSize=5000
    Increment=500
    Shrinkable=FALSE
    LogFile=TRUE
    Sparse=FALSE

    sqliosim.seqwrites.cfg.ini

    [CONFIG]
    ErrorFile=sqliosim.log.xml
    ;CPUCount=2
    ;Affinity=0
    ;IOAffinity=0
    ;MaxMemoryMB=209
    StopOnError=TRUE
    TestCycles=1
    TestCycleDuration=300
    CacheHitRatio=10000
    NoBuffering=TRUE
    WriteThrough=TRUE
    MaxOutstandingIO=0
    TargetIODuration=100
    AllowIOBursts=TRUE
    UseScatterGather=TRUE
    ForceReadAhead=FALSE
    DeleteFilesAtStartup=TRUE
    DeleteFilesAtShutdown=TRUE
    StampFiles=FALSE

    [RandomUser]
    UserCount=0
    JumpToNewRegionPercentage=500
    MinIOChainLength=50
    MaxIOChainLength=100
    RandomUserReadWriteRatio=9000
    MinLogPerBuffer=64
    MaxLogPerBuffer=8192
    RollbackChance=100
    SleepAfter=5
    YieldPercentage=0
    CPUSimulation=FALSE
    CPUCyclesMin=0
    CPUCyclesMax=0

    [AuditUser]
    UserCount=0
    BuffersValidated=64
    DelayAfterCycles=2
    AuditDelay=200

    [ReadAheadUser]
    UserCount=0
    BuffersRAMin=32
    BuffersRAMax=64
    DelayAfterCycles=2
    RADelay=200

    [BulkUpdateUser]
    ;UserCount=4
    BuffersBUMin=600
    BuffersBUMax=1000
    DelayAfterCycles=2
    BUDelay=1

    [ShrinkUser]
    MinShrinkInterval=120
    MaxShrinkInterval=600
    MinExtends=1
    MaxExtends=20

    [File1]
    FileName=c:\sqliotest\sqliosim.mdx
    InitialSize=500
    MaxSize=5000
    Increment=500
    Shrinkable=FALSE
    LogFile=FALSE
    Sparse=FALSE

    [File2]
    FileName=c:\sqliotest\sqliosim.ldx
    InitialSize=500
    MaxSize=5000
    Increment=500
    Shrinkable=FALSE
    LogFile=TRUE
    Sparse=FALSE

    sqliosim.seqwrites_64k.cfg.ini

    [CONFIG]
    ErrorFile=sqliosim.log.xml
    ;CPUCount=2
    ;Affinity=0
    ;IOAffinity=0
    ;MaxMemoryMB=209
    StopOnError=TRUE
    TestCycles=1
    TestCycleDuration=300
    CacheHitRatio=10000
    NoBuffering=TRUE
    WriteThrough=TRUE
    MaxOutstandingIO=0
    TargetIODuration=100
    AllowIOBursts=TRUE
    UseScatterGather=TRUE
    ForceReadAhead=FALSE
    DeleteFilesAtStartup=TRUE
    DeleteFilesAtShutdown=TRUE
    StampFiles=FALSE

    [RandomUser]
    UserCount=0
    JumpToNewRegionPercentage=500
    MinIOChainLength=50
    MaxIOChainLength=100
    RandomUserReadWriteRatio=9000
    MinLogPerBuffer=64
    MaxLogPerBuffer=8192
    RollbackChance=100
    SleepAfter=5
    YieldPercentage=0
    CPUSimulation=FALSE
    CPUCyclesMin=0
    CPUCyclesMax=0

    [AuditUser]
    UserCount=0
    BuffersValidated=64
    DelayAfterCycles=2
    AuditDelay=200

    [ReadAheadUser]
    UserCount=0
    BuffersRAMin=32
    BuffersRAMax=64
    DelayAfterCycles=2
    RADelay=200

    [BulkUpdateUser]
    ;UserCount=4
    BuffersBUMin=64
    BuffersBUMax=128
    DelayAfterCycles=2
    BUDelay=1

    [ShrinkUser]
    MinShrinkInterval=120
    MaxShrinkInterval=600
    MinExtends=1
    MaxExtends=20

    [File1]
    FileName=c:\sqliotest\sqliosim.mdx
    InitialSize=500
    MaxSize=5000
    Increment=500
    Shrinkable=FALSE
    LogFile=FALSE
    Sparse=FALSE

    [File2]
    FileName=c:\sqliotest\sqliosim.ldx
    InitialSize=500
    MaxSize=5000
    Increment=500
    Shrinkable=FALSE
    LogFile=TRUE
    Sparse=FALSE

Viewing 15 posts - 1 through 15 (of 36 total)

You must be logged in to reply to this topic. Login to reply