Memory lacking causing disk issues?

  • ALZDBA (1/20/2011)


    shifty1981 (1/19/2011)


    ...

    I have been told that rebuilding our indexes typically takes about 2 hours and that it puts a pretty heft load on the server. However I hoping to find a query that I could run daily (like mid day our lowest usage time) and determine the index fragmentation and then somehow determine if we should rebuild.

    Need Index maintenance ?: have a look at IndexOptimize at http://ola.hallengren.com/Documentation.html

    Off course you'll also find that kind of scripts at SSC.

    Attached you can find the proc I'm using to only rebuild indexes that have a certain amount of fragmentation.

    Yikes! Didn't realize the code would be so involved. reminds me of day 1 in my foreign language class. We have only about 4 indexes we are pretty sure affect our performance noticeably.

    We're using SQL server 2008 R2 Enterprise x64 edition so my understanding is that this can be done while database is online right?

    There are some limitations. ref: http://technet.microsoft.com/en-us/library/ms190981.aspx

    Online rebuilds cannot be done on indexes which hold image, ntext, text, varchar(max), nvarchar(max), varbinary(max), and xml data typed columns.

    If rebuilding a clustering index, keep in mind that one holds the actual data pages at leaf level ! so the limitations also go for that one.

    our column data types are bigint, int, bit, binary(20) for the indexes in question.

    The other admin said he'd been trying to reindex every month, but there's no certainty if he actually remembered to do this. I'm hoping there might be an event log or something in SQL I can query and then track. We have tools that can do this easily if given a query (either SQL or eventid to look for).

    If you schedule this maintenance script using a sqlagent job, you'll have your logging in the job history.

    great point. we have to get in a habit of running that. is all I need to do is start the SQL agent service and then I can do this?

    I'm not sure how to determine if we are using Lock Pages in Memory.

    Lock pages in memory is meanth for your sqlserver instance so it will not page out memory if other stuff on your server needs ram. It is granted at windows level for your sqlserver service account.

    SQLServer locking is sqlserver internally only and is a whole other chapter than lock pages in memory.

    There is a DMV exposing this in column locked_page_allocations_kb:

    Select * -- euhm yes * to show all available stuff. Pick the ones you are interested in later.

    from sys.dm_os_process_memory

    ;

    Also at startup time, your sqlserver instance records a row in the sqlserver errorlog file if the service account has been granted "lock pages in memory" at windows level.

    The recorded message is : Using locked pages for buffer pool

    (double check your errorlog file hasn't been rolled over since startup or you'll not find this message)

    refs:

    - How to: Enable the Lock Pages in Memory Option (Windows)

    http://msdn.microsoft.com/en-us/library/ms190730.aspx

    - Lock Pages in Memory ... do you really need it? http://blogs.technet.com/askperf/archive/2008/03/25/lock-pages-in-memory-do-you-really-need-it.aspx

    We are a 64 bit OS. While it says it is not needed, the other articles seem to indicate this may not be the case (at the server level). we are using the max memory setting though.

    below are the results from that query from two of our heavy load servers:

    server6:

    server9:

  • Fyi - we're tracking stats on the servers. below are the ones for server6 and below that are server9.

    SERVER6:

    Average Latch Wait Time (ms)51.227

    Average Wait Time (ms)4567.500

    Batch Requests/sec499.314

    BLBD - DB total (KB)1560882176.000

    BLBD - Log File (KB)32506872.000

    BLBD - Log File Used (KB)348793.575

    BLBD - Log Growths31.000

    BLBD - Percent Log Used0.533

    BLBD - Transactions/sec145.687

    Buffer cache hit ratio99.119

    Checkpoint pages/sec697.022

    Full scans40.478

    Latch waits1540.848

    Lazy writes/sec10.001

    Lock Blocks93386.325

    Lock Blocks Allocated18677647.358

    Lock Memory (KB)13295413.733

    Lock Requests/sec1703278.310

    Lock Timeouts/sec50.797

    Lock Wait Time (ms)15231.991

    Lock Waits/sec3.375

    Number of Deadlocks/sec0.000

    Page File Size - F103068819456.000

    Page File Usage %0.258

    Page life expectancy211.008

    Page Reads/sec528.671

    Page reads/sec (SQL:BM)6062.149

    Page Splits/sec56.353

    Page Writes/sec0.000

    Page writes/sec (SQL:BM)768.543

    Pages Input/sec535.825

    Pages/sec535.825

    SQL Cache Memory (KB)2078.800

    System Up Time (sec)792615.018

    Target Server Memory (KB)67108864.000

    TEMPDB - Data total (KB)20739136.000

    TEMPDB - Log File (KB)1108792.000

    TEMPDB - Log File Used (KB)308513.425

    TEMPDB - Log Growths71.000

    TEMPDB - Percent Log Used27.333

    Total Available Memory (MB)18428.508

    Total Server Memory (KB)67108864.000

    Virtual Memory (bytes)85202730222.933

    SERVER9:

    Average Latch Wait Time (ms)71.205

    Average Wait Time (ms)540.565

    Batch Requests/sec498.734

    BLBD - DB total (KB)1372927040.000

    BLBD - Log File (KB)35914232.000

    BLBD - Log File Used (KB)146908.400

    BLBD - Log Growths0.000

    BLBD - Percent Log Used0.000

    BLBD - Transactions/sec154.300

    Buffer cache hit ratio98.837

    Checkpoint pages/sec1048.666

    FileInfo Row Count121258180.750

    Full scans25.068

    Latch waits872.205

    Lazy writes/sec0.315

    Lock Blocks54667.667

    Lock Blocks Allocated1783277.542

    Lock Memory (KB)345924.667

    Lock Requests/sec1665702.003

    Lock Timeouts/sec61.303

    Lock Wait Time (ms)5596.332

    Lock Waits/sec10.166

    Number of Deadlocks/sec0.000

    Page File Size - E67640426496.000

    Page File Usage %4.319

    Page life expectancy786.250

    Page Reads/sec739.442

    Page reads/sec (SQL:BM)1007.912

    Page Splits/sec37.732

    Page Writes/sec0.388

    Page writes/sec (SQL:BM)1065.807

    Pages Input/sec1165.327

    Pages/sec1262.727

    SQL Cache Memory (KB)82050.400

    System Up Time (sec)486751.455

    Target Server Memory (KB)27598959.933

    TEMPDB - Data total (KB)378496.000

    TEMPDB - Log File (KB)16568.000

    TEMPDB - Log File Used (KB)6603.150

    TEMPDB - Log Growths27.000

    TEMPDB - Percent Log Used39.358

    Total Available Memory (MB)124.825

    Total Server Memory (KB)27598977.000

    Virtual Memory (bytes)39500024934.400

  • Craig Farrell (1/7/2011)


    Jason,

    %DiskTime and Queues are no longer valid parameters if you're working against SAN systems. If you have independent RAIDs, that's different.

    Craig, I have a different experience. Yes, you can measure %disktime correctly. I can actually monitor logical disk counters, in Idera DM almost perfectly accurate (You can also do in perfmon, it should be accurate). You can read out the %disktime for a particular LUN in an array. You just need the right tool. The reading has proven accuracy of the scenario. That is proven also, after we move the DB from a LUN to a local RAID1, the problem suddenly disappeared.

  • FreeListStalls ? FreePages? MemoryGrantPending ?

    Did you have a lot read-ahead reads?

    They are perfmon counters. See perfmon explain button. FreeListStalls - as few possible. FreePages - above 10K MemoryGrantPending < 2

    You show us the %DiskTime is 7358%?

    can you comment on what specifically led you to say we're short on memory? Thanks.

    If you have lots RAM when utilized, SQLServer will cache into buffer-pool and procedure cache, contention bottle-neck should be at PAGELATCH level because SQLServer will read/write to cache first if found. LAZYWRITER will write dirty pages back via disk IO. That alleviates the burst of IO load. If you are seeing PAGEIOLATCH, that is contention on disk IO. If you max-out the memory but still getting PAGEIOLATCH, your disk (RAID array) is your bottle-neck. Reconfig adding more disks on your RAID10 to provide more IOPS.

    %DiskTime is 7358% does not make sense if the tool measurment is correct. I am saying over 100% does not make sense. If you read this in perfmon, make sure the scale value is correct for the readout.

    I will be happy to take on a moonlighting consulting job for you. That is all the time I have for now.

  • suspicion of excessive paging due to a lack of enough memory.

    Your initial question is "suspicion of excessive paging due to a lack of enough memory."

    Just seeing your memory size and DB size and initial data you posted, from experiences I say yes (without material evidences yet. Of course, if your DB is largely dormant, that is differnet. You should archive if that is the case.) because adding RAM does alleviate burst of IO demand. If 2 TB all active on 32 GB RAM, it is not balanced RAM vs IO. Of course, it again depends. Average healthy human beings have average height-weight ratio.

    So, check your memory counters first, then check your disk counters, "logical disk" if on LUN or partition, physical disk if physical. If you don't know, research, read up. (If you need a consutant? email me)

    Now if you are asking "why my query so slow" because index fragmentation etc. other issues, you should see my SQL Saturday #57 presentation.

  • jswong05 (3/19/2011)


    %DiskTime is 7358% does not make sense if the tool measurment is correct. I am saying over 100% does not make sense.

    http://support.microsoft.com/kb/310067

    Hence the reason why the disk time % counter has fallen out of favour if working with RAID arrays and SANs. I prefer disk idle % which will fall between 0 and 100, along with sec/read, sec/write.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass

Viewing 6 posts - 31 through 35 (of 35 total)

You must be logged in to reply to this topic. Login to reply