Can huge free space in a data file make queries slower?

  • 1) The pages are still 8k, no matter what. Nothing about database growth changes that. The concern here is that growing at 1MB, each growth goes to some other location on the disk, all mixed and scattered. Then, any scan of anything sizeable has to skip all over the disk to find the data. By growing in larger chunks on the disk itself, you're placing your data into more contiguous locations (although, see Jeff's comments on deletes). This makes for better performance overall. This is a huge topic. Here's a very good article on it.

    2) Partitioning is a data management tool used mostly in rolling window style data loads (only keep the last 24 months, partition by month, and we'll roll off the oldest partition, that kind of thing). It's not a performance tool. Further, when you implement partitioning, you must be able to guarantee partition elimination to even hope for decent performance. Partition elimination only occurs if every query has the partition key as part of the filtering criteria to eliminate all other partitions and only search within a single partition (see the sketch after this list). If you're scanning across all partitions for every query, yeah, performance is going to be atrocious and there is no real way to improve that. If you're keeping data forever (no rolling window for management) and you can't do partition elimination, then partitions are nothing but overhead and are not helping you in any way.

    3) Shrink only when necessary. Get away from the idea that shrinking is somehow something you MUST have. It's something you use if/when/where you need to, but generally, in most circumstances, you should not NEED to.
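
    To make the partition-elimination point in 2) concrete, here is a minimal sketch; dbo.Sales, SaleDate, and CustomerID are hypothetical names, assuming a table partitioned by year on SaleDate.

    -- Partition elimination is possible: the partition key is in the WHERE
    -- clause, so only the 2018 partition has to be read.
    SELECT SUM(s.Amount)
    FROM dbo.Sales AS s
    WHERE s.SaleDate >= '20180101'
      AND s.SaleDate <  '20190101';

    -- No partition elimination: the filter never references the partition key,
    -- so every partition gets scanned.
    SELECT SUM(s.Amount)
    FROM dbo.Sales AS s
    WHERE s.CustomerID = 42;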

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Cool2018 - Friday, November 2, 2018 12:58 AM

    andycadley - Thursday, November 1, 2018 10:12 PM

    Cool2018 - Thursday, November 1, 2018 8:39 PM

    I am sorry, I have 3 questions below. Would you please share your answers?

    1. Is a COMMIT what makes a delete actually happen? Am I right? Currently our stored procedure does not have a COMMIT statement; we just delete and insert. So is it not an actual delete? If we don't commit, when will SQL Server really delete? Should we add a COMMIT to our script, even though our DB is currently in auto-commit mode?

    SQL Server only requires a COMMIT statement if a transaction is in effect, either as the result of a BEGIN TRANSACTION or because IMPLICIT_TRANSACTIONS has been set to ON (don't do this, it's a terrible idea) directly or via setting ANSI_DEFAULTS to ON.

    Usually a DELETE (or any other SQL statement) takes immediate effect and the results are committed when the statement completes, assuming it does not error.
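
    To make that concrete, here is a minimal sketch of the difference between autocommit and an explicit transaction (dbo.DemoStage and its columns are made-up names):

    -- Autocommit (the default): each statement commits on its own as soon as it
    -- completes successfully; no COMMIT statement is needed.
    DELETE FROM dbo.DemoStage WHERE LoadDate < '20180101';

    -- Explicit transaction: nothing is permanent until COMMIT, and a ROLLBACK
    -- here would undo both statements together.
    BEGIN TRANSACTION;
        DELETE FROM dbo.DemoStage WHERE LoadDate < '20180101';
        INSERT INTO dbo.DemoStage (LoadDate, Amount) VALUES ('20181102', 100);
    COMMIT TRANSACTION;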

    Dear Andycadley,

    Thank you so much for your explanation.
    I have one question here. When we do a commit, will the committed rows be removed from the transaction log?

    Thank you so much & Best Regards,
    Cool

    Nope. Nothing gets removed from the log until you do a log backup (unless we're talking SIMPLE recovery).
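
    In FULL recovery, it's the log backup that lets SQL Server reuse the space taken by committed log records, along these lines (the database name and path are placeholders):

    -- A log backup copies the log records out and marks the inactive portion of
    -- the log as reusable; in SIMPLE recovery that reuse happens automatically.
    BACKUP LOG MyDatabase TO DISK = N'D:\Backups\MyDatabase_log.trn';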

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Cool2018 - Friday, November 2, 2018 12:54 AM

    Dear Andycadley,

    Highly appreciated, and thank you so much for your explanation and advice.
    1) I just had a thought. If the free space is refilled by SQL Server, the pages will not be in order, for example Jan, Apr, Dec, Feb instead of Jan, Feb, Mar, Apr. Can that also lead to a performance issue? Or, since we use an index, is it not an issue? Am I right to say that?
    If we use a columnstore index, is it handled the same way as other indexes? Any idea?
    2) After your explanation, I recalled that data pages are all the same size. So if we add less data to one page and more to another, that will leave free space in the less full pages and affect performance. Am I right? If yes, any idea how we can handle it?
    3) We will now have a new requirement that needs to truncate the whole table and then reinsert the rows to get the latest data. In that case, the inserted rows will fill the space produced by the truncate, right? So I do not need a DB shrink even though the free space is big.
    I am sorry if I am asking you stupid questions. I have very limited knowledge on this.

    Thank you so much & Best Regards,

    1) Columnstore is a completely different critter from traditional row store indexes. It compresses the data it stores, but you will need to rebuild these indexes regularly. This is because changes to them are stored in what is called a delta store (a b-tree, row store structure under the covers). Getting the deleted rows out of the compressed data and moving added or changed rows into the compressed data is best handled through the rebuild process. If you're doing nightly data loads, also include nightly index rebuilds AFTER the data load completes (a rebuild sketch follows this list).

    2) Don't sweat space on the page man. You're nowhere near having to worry about that as your performance bottleneck. Let's walk before we run.

    3) Again and again, you're hung up on the idea that the data file storage allocation is somehow tied to performance. Yes, there is some correlation, but it's not that direct. You have a bunch of other problems. You've grown your database in little chunks so it's scattered all over the disk. You've further scattered it by shrinking over and over. You've partitioned data, but can't guarantee partition elimination. All this taken together is killing your performance. Stop obsessing over file size. Walk away from that issue.
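
    As a rough sketch of the post-load rebuild in point 1 (the table and index names here are invented):

    -- After the nightly load completes, rebuild the columnstore index so the
    -- delta store rows get compressed into columnstore segments and deleted
    -- rows are physically removed.
    ALTER INDEX CCI_BigTable ON dbo.BigTable REBUILD;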

    Strong recommendation. You sound like you're managing a very large database (databases?) and you're inexperienced. Get a consultant in to help out and train you up. Poking at the edges of the problems, especially when you don't understand the fundamentals, is not going to get you anywhere.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Grant Fritchey - Friday, November 2, 2018 4:11 AM

    1) The pages are still 8k, no matter what. Nothing about database growth changes that. The concern here is that growing at 1MB, each growth goes to some other location on the disk, all mixed and scattered. Then, any scan of anything sizeable has to skip all over the disk to find the data. By growing in larger chunks on the disk itself, you're placing your data into more contiguous locations (although, see Jeff's comments on deletes). This makes for better performance overall. This is a huge topic. Here's a very good article on it.

    2) Partitioning is a data management tool used mostly in rolling window style data loads (only keep the last 24 months, partition by month, and we'll roll off the oldest partition, that kind of thing). It's not a performance tool. Further, when you implement partitioning, you must be able to guarantee partition elimination to even hope for decent performance. Partition elimination only occurs if every query has the partition key as part of the filtering criteria to eliminate all other partitions and only search within a single partition. If you're scanning across all partitions for every query, yeah, performance is going to be atrocious and there is no real way to improve that. If you're keeping data forever (no rolling window for management) and you can't do partition elimination, then partitions are nothing but overhead and are not helping you in any way.

    3) Shrink only when necessary. Get away from the idea that shrinking is somehow something you MUST have. It's something you use if/when/where you need to, but generally, in most circumstances, you should not NEED to.

    Dear Grant,

    Noted with thanks.
    Highly appreciated, and thank you so much for all your advice.

    Thank you & Best Regards

  • Cool2018 - Thursday, November 1, 2018 8:39 PM

    Jeff Moden - Thursday, November 1, 2018 5:12 PM

    There IS a form of freespace that WILL cause your code to run slower and also consume more memory.  If you do a bunch of deletes but they don't actually delete all the data from the pages, then you can end up with a very low page density, which sometimes wastes tons of memory.  And, it's possible that you can have this waste with virtually zero logical fragmentation.

    If your index maintenance routines aren't looking at avg_page_space_used_in_percent, you may be missing out.  Heh... even if you do, you may still be missing out because it's an average.

    Dear Jeff,

    Thank you so much for your advice.
    I am sorry, I have 3 questions below. Would you please share your answers?
    1. Is a COMMIT what makes a delete actually happen? Am I right? Currently our stored procedure does not have a COMMIT statement; we just delete and insert. So is it not an actual delete? If we don't commit, when will SQL Server really delete? Should we add a COMMIT to our script, even though our DB is currently in auto-commit mode?
    2. How do we look at and resolve avg_page_space_used_in_percent during our index maintenance routine job?
    3. You are right. Currently our SQL Server DB memory usage is very high; it has reached almost 100%. Is that really due to the free space? Our table has 9 million+ rows with 300+ columns and we extract it per role, so with 9 roles our data extraction is 9 * 9 = 81 million+ rows. Currently our DB is partitioned by year. Can partitioning make our queries slower?

    Thank you & Best Regards,
    Cool

    For 1, usually you don't need a commit unless you've made the mistake of turning implicit transactions on.  What I was talking about was if your deletes don't completely empty the pages they're deleting from.
    For 2, sys.dm_db_index_physical_stats, which is at the core of all index maintenance routines, contains the avg_page_space_used_in_percent (see the query sketch after this list).  If it's below 70, the index almost certainly needs to be rebuilt.  Depending on how the index is populated, anything less than 95% may be an indication to rebuild.  As was previously suggested, after you get done with your periodic loads, you may want to rebuild the indexes.  With a table of this size, rebuilding the clustered index will require a little tender loving care to keep from blowing out both the MDF and the LDF file, especially if you have some form of replication running that uses the log file as the source of changes to the database.
    For 3, SQL Server will use as much memory as it can and that's not normally a problem.  Having things drop from memory and be reloaded from the disk is.  As for having a table with 300 columns, that seems a bit crazy except for an initial load from an external source with the possible exception of telephone CDRs.  The partitioning by year should help with index maintenance because I'm assuming that older data is never reinserted or updated? And extracting all 9 million rows 9 different ways also seems a bit crazy but, truth be told, I don't know enough about what the data is nor how it will be used to actually say that nor enough to be able to offer much help. 
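
    For point 2 above, a query along these lines (against the current database, using Jeff's thresholds as a rough guide) is one way to see page density and fragmentation per index:

    -- Page density and logical fragmentation for every index in the current
    -- database.  SAMPLED (or DETAILED) mode is needed to populate
    -- avg_page_space_used_in_percent; LIMITED mode leaves it NULL.
    SELECT OBJECT_NAME(ips.object_id)          AS table_name,
           i.name                              AS index_name,
           ips.avg_page_space_used_in_percent,    -- low values = wasted space and memory
           ips.avg_fragmentation_in_percent,
           ips.page_count
    FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'SAMPLED') AS ips
    JOIN sys.indexes AS i
        ON  i.object_id = ips.object_id
        AND i.index_id  = ips.index_id
    WHERE ips.page_count > 1000                   -- ignore trivially small indexes
    ORDER BY ips.avg_page_space_used_in_percent;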

    Grant covered what partitioning is all about.  Yes, it can actually slow queries down especially poorly written queries that don't take advantage of the temporal nature of the partitioning.  It's meant mostly as an administrative tool to do things like preventing the need for backing up data that will never change once written and to greatly decrease index maintenance for the same reason.  It can also make "deletes" by temporal range nasty fast by using SWITCH to switch out entire temporal sections in an instant.  Temporally sectioned loads can be done in a similar fashion.

    But, again, we just don't know enough about your data or its use to be of much help.  Unless we know more about the structure and usage of the data, all we can do is offer high level suggestions that you research further to see if they're even appropriate for what you actually want to do.

    I will say that if you're using the defaults for growth of the MDF and LDF files, that's a huge mistake, especially on the LDF file.  Small initial size and small increments will make a huge number of VLFs (Virtual Log Files) in the LDF and, yes, that can affect performance.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Just to add a bit about performance, most of the performance is in the code when applied to a correctly indexed table.  If you have code that is generated by the front end, is always changing literals (or the code in general), and isn't parameterized, then you'll end up with a shedload of recompiles.  We had some code that ran kind of fast when it executed (100ms, which actually sucked, IMHO) but, every time it was passed to SQL Server, it took 22 SECONDS to compile simply because of the amount of data it was playing against.  Obviously, that's not what you want and you need to make sure that such a thing isn't happening.  We rewrote it to use a stored procedure and the code executed in about 10ms and suffered no recompiles.
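
    As a rough illustration of the difference (table, column, and parameter names are invented), compare an ad hoc statement with ever-changing literals to a parameterized call:

    -- Ad hoc text with changing literals: each distinct literal can compile its
    -- own plan, so a chatty front end pays the compile cost over and over.
    --   SELECT OrderID, Amount FROM dbo.Orders WHERE CustomerID = 42;
    --   SELECT OrderID, Amount FROM dbo.Orders WHERE CustomerID = 43;

    -- Parameterized version: one plan is compiled and then reused.
    EXEC sys.sp_executesql
         N'SELECT OrderID, Amount FROM dbo.Orders WHERE CustomerID = @CustomerID;',
         N'@CustomerID INT',
         @CustomerID = 42;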

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • p.s.  If your new requirement is to truncate the whole table and then reload it, it's almost stupid for the table to be partitioned.  It's buying you very little and may actually hurt the performance of your queries.

    If you need to load it without much interference with your OLAP processes, then you need to build a "parallel" table and, once it's loaded, rebuild a synonym to point at it.  The next time you need to load it, reverse the process.
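
    A rough sketch of that synonym swap (every object name here is a placeholder):

    -- Reporting code always references the synonym dbo.BigReport, never the
    -- base tables.  Load the offline copy while the current one stays in use...
    TRUNCATE TABLE dbo.BigReport_B;
    INSERT INTO dbo.BigReport_B (SomeKey, SomeValue)
    SELECT SomeKey, SomeValue
    FROM dbo.SourceData;

    -- ...then repoint the synonym, which is a quick metadata-only change.
    DROP SYNONYM dbo.BigReport;
    CREATE SYNONYM dbo.BigReport FOR dbo.BigReport_B;
    -- On the next load, fill dbo.BigReport_A and point the synonym back at it.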

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden - Friday, November 2, 2018 7:32 AM

    For 1, usually you don't need a commit unless you've made the mistake of turning implicit transactions on.  What I was talking about was if your deletes don't completely empty the pages they're deleting from.
    For 2, sys.dm_db_index_physical_stats, which is at the core of all index maintenance routines, contains the avg_page_space_used_in_percent.  If it's below 70, the index almost certainly needs to be rebuilt.  Depending on how the index is populated, anything less than 95% may be an indication to rebuild.  As was previously suggested, after you get done with your periodic loads, you may want to rebuild the indexes.  With a table of this size, rebuilding the clustered index will require a little tender loving care to keep from blowing out both the MDF and the LDF file especially if you have some form of replication running that uses the log file as the source of changes to the database.
    For 3, SQL Server will use as much memory as it can and that's not normally a problem.  Having things drop from memory and be reloaded from the disk is.  As for having a table with 300 columns, that seems a bit crazy except for an initial load from an external source with the possible exception of telephone CDRs.  The partitioning by year should help with index maintenance because I'm assuming that older data is never reinserted or updated? And extracting all 9 million rows 9 different ways also seems a bit crazy but, truth be told, I don't know enough about what the data is nor how it will be used to actually say that nor enough to be able to offer much help. 

    Grant covered what partitioning is all about.  Yes, it can actually slow queries down especially poorly written queries that don't take advantage of the temporal nature of the partitioning.  It's meant mostly as an administrative tool to do things like preventing the need for backing up data that will never change once written and to greatly decrease index maintenance for the same reason.  It can also make "deletes" by temporal range nasty fast by using SWITCH to switch out entire temporal sections in an instant.  Temporally sectioned loads can be done in a similar fashion.

    But, again, we just don't know enough about your data or its use to be of much help.  Unless we know more about the structure and usage of the data, all we can do is offer high level suggestions that you research further to see if they're even appropriate for what you actually want to do.

    I will say that if you're using the defaults for growth of the MDF and LDF files, that's a huge mistake, especially on the LDF file.  Small initial size and small increments will make a huge number of VLFs (Virtual Log Files) in the LDF and, yes, that can affect performance.

    Dear Jeff,

    Thank you so much for all your advice and explanations.
    Our current DB size is only 90GB and free space is 45GB, although the autogrowth size is the default 1MB. Our estimated DB growth per month is only 600MB. Since our autogrowth size is so small and we still have 45GB of free space, our DB hasn't used autogrowth yet. Am I right?
    I think it is still OK. Right?

    Thank you so much & Best Regards,
    Cool

  • Cool2018 - Friday, November 2, 2018 9:35 AM

    Dear Jeff,

    Thank you so much for all your advice and explanations.
    Our current DB size is only 90GB and free space is 45GB, although the autogrowth size is the default 1MB. Our estimated DB growth per month is only 600MB. Since our autogrowth size is so small and we still have 45GB of free space, our DB hasn't used autogrowth yet. Am I right?
    I think it is still OK. Right?

    Thank you so much & Best Regards,
    Cool

    IF you have Instant File Initialization turned on, autogrowth of the MDF file is no problem.  Considering your estimated monthly growth, I'd likely set autogrowth to something between 100MB and 1000MB.
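
    For example (the database and logical file names are placeholders; the real logical names are in sys.database_files):

    -- Grow the data file in fixed 500MB steps instead of the 1MB default.
    ALTER DATABASE MyDatabase
        MODIFY FILE (NAME = N'MyDatabase_Data', FILEGROWTH = 500MB);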

    As for whether or not you've used autogrowth on the MDF file, yes... Unless the initial size of the file was set to 90GB when the MDF file was first created or someone intentionally grew it to that size, then you've definitely used autogrowth on your MDF file sometime in the past.

    As I explained a bit before, the LDF file (Transaction Log File) is much different.  What is the current size of your log file and what are all the growth settings for it?

    As for why half your allocated disk space is empty... what is the size of the largest table (including all indexes for the table) and the largest single index in the entire database?
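
    One way to answer that is something along these lines (reserved pages are 8KB each, so * 8 / 1024 gives MB):

    -- Reserved space per user table (all of its indexes included), largest first.
    SELECT OBJECT_SCHEMA_NAME(ps.object_id)        AS schema_name,
           OBJECT_NAME(ps.object_id)               AS table_name,
           SUM(ps.reserved_page_count) * 8 / 1024  AS reserved_mb
    FROM sys.dm_db_partition_stats AS ps
    WHERE OBJECTPROPERTY(ps.object_id, 'IsUserTable') = 1
    GROUP BY ps.object_id
    ORDER BY reserved_mb DESC;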

    And I wouldn't do any more shrinks until you find out what caused the freespace because it's just going to grow again.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden - Friday, November 2, 2018 3:29 PM

    Cool2018 - Friday, November 2, 2018 9:35 AM

    Dear Jeff,

    Thank you so much for all your advice and explanations.
    Our current DB size is only 90GB and free space is 45GB, although the autogrowth size is the default 1MB. Our estimated DB growth per month is only 600MB. Since our autogrowth size is so small and we still have 45GB of free space, our DB hasn't used autogrowth yet. Am I right?
    I think it is still OK. Right?

    Thank you so much & Best Regards,
    Cool

    IF you have Instant File Initialization turned on, autogrowth of the MDF file is no problem.  Considering your estimated monthly growth, I'd likely set autogrowth to something between 100MB and 1000MB.

    As for whether or not you've used autogrowth on the MDF file, yes... Unless the initial size of the file was set to 90GB when the MDF file was first created or someone intentionally grew it to that size, then you've definitely used autogrowth on your MDF file sometime in the past.

    As I explained a bit before, the LDF file (Transaction Log File) is much different.  What is the current size of your log file and what are all the growth settings for it?

    As for why half your allocated disk space is empty... what is the size of the largest table (including all indexes for the table) and the largest single index in the entire database?

    And I wouldn't do any more shrinks until you find out what caused the freespace because it's just going to grow again.

    Dear Jeff,

    Highly appreciated, and thank you so much for your enlightenment.
    I am sorry, I didn't think and check carefully. Actually you are right, because I remember that when I checked the DB size about 2 months ago it was smaller than this. But I don't know why so much free space is shown in DB properties -> free space. I just noticed that my mdf has only 3% free space left. I will surely set autogrowth. However, I am thinking of manually increasing my mdf size by 10GB every year so that we have more control over the DB. I saw an article saying that setting a big autogrowth number can sometimes cause problems, so I am thinking of setting the autogrowth size to 1GB, which is the same as your idea. Do you think that is OK? Our log file autogrowth is 10%.
    Also, I will turn off my scheduled DB shrink job so that it won't conflict with adding free space.
    I am also thinking of defragmenting our physical drive to resolve the current performance issue. I think our physical drive is badly fragmented.
    My DB file sizes and % free space are as follows:

    name       size (MB)   percent_free
    mdf          5560.4     3
    ldf         33080.3    99
    ndf-1998      286.3    95
    ndf-1999      310.9    93
    ndf-2000      319.8    95
    ndf-2001      318.0    95
    ndf-2002      329.1    93
    ndf-2003      341.8    93
    ndf-2004      240.4    90
    ndf-2005      286.1    83
    ndf-2006      322.0    80
    ndf-2007      323.1    82
    ndf-2008      346.4    80
    ndf-2009      352.9    83
    ndf-2010      371.7    83
    ndf-2011     4211.1    90
    ndf-2012     5329.4    90
    ndf-2013     5827.0    90
    ndf-2014     5918.9    90
    ndf-2015     6057.1    90
    ndf-2016     6009.3    90
    ndf-2017     6886.7    85
    ndf-2018     5645.2    80
    ndf-2019        7.8    55
    ndf-2020        7.4    52

    Thank you so much & Best Regards,
    Cool

  • I'm not sure why you have so much free space for the NDF files.  Something is wrong there or you've measured something wrong or it's used space and not free space.

    You're also going to have to rebuild your log file.  Setting it to a % for growth is a bad thing to do.  If the default size of 1MB was left and you grow by 10%, it will take 73 fragments to grow to just 1GB.  The first 68 growths will each be less than 64MB so that's 68*4 VLFs, and the last 5 growths will be >64MB but less than 1GB so that's 5*8 VLFs, for a total of 312 VLFs just for the first GB of log file space.  That's way too many for this database.  When you get a chance, change the growth size of the log file to at least 1GB, then shrink your log file to as close to 0 as possible and then grow it to 1GB and let nature take its course from there.
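
    A sketch of that sequence (the logical log file name is a placeholder; take a log backup first if you're in FULL recovery so the shrink can actually release space):

    -- Shrink the log as close to empty as it will go (target size in MB)...
    DBCC SHRINKFILE (N'MyDatabase_Log', 1);

    -- ...then grow it back in one step and set a sane fixed growth increment.
    ALTER DATABASE MyDatabase MODIFY FILE (NAME = N'MyDatabase_Log', SIZE = 1GB);
    ALTER DATABASE MyDatabase MODIFY FILE (NAME = N'MyDatabase_Log', FILEGROWTH = 1GB);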

    As for the size of your log file being at 33GB, that's also a bit nuts, just like all the free space in your NDF files is a bit nuts.  It probably occurs when you do index rebuilds.  In the FULL Recovery Model, Rebuilds are fully logged.  If you're not doing replication or log shipping, you can switch to the BULK LOGGED recovery model and your index Rebuilds will not only be minimally logged, they'll also run faster.  Don't forget to switch back to the Full Recovery Model if that's where you started out at.  Reorgs will always be fully logged regardless of the Recovery Model being used and they use a whole lot more resources than advertised.  I steer clear of Reorgs.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Ah... Just noticed that you're using 2014.  The VLF algorithm changed there.  See the following article...
    https://www.sqlskills.com/blogs/paul/important-change-vlf-creation-algorithm-sql-server-2014/

    The bottom line, though, is that growing by percent is still the wrong thing to do, especially if you start out at just 1 MB, and you end up with too many VLFs (IMHO).  I'd still recommend shrinking the log file to as close to zero as you can get it and then force the first growth to 1GB for this database.
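
    If you want to see how many VLFs you have right now, DBCC LOGINFO returns one row per VLF for the current database:

    -- One row per VLF; the row count is your VLF count.
    DBCC LOGINFO;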

    Revisiting your database and considering the size of the NDF files, I guess I'd be tempted to NOT partition that table.  Unless you have really painfully slow disks or a super limited amount of memory, it doesn't seem worth partitioning to me.  That's just my opinion, though, because I work with single column indexes that are larger than 5GB.  YMMV.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden - Friday, November 2, 2018 9:32 PM

    I'm not sure why you have so much free space for the NDF files.  Something is wrong there or you've measured something wrong or it's used space and not free space.

    You're also going to have to rebuild your log file.  Setting it to a % for growth is a bad thing to do.  If the default size of 1MB was left and you grow by 10%, it will take 73 fragments to grow to just 1GB.  The first 68 growths will each be less than 64MB so that's 68*4 VLFs, and the last 5 growths will be >64MB but less than 1GB so that's 5*8 VLFs, for a total of 312 VLFs just for the first GB of log file space.  That's way too many for this database.  When you get a chance, change the growth size of the log file to at least 1GB, then shrink your log file to as close to 0 as possible and then grow it to 1GB and let nature take its course from there.

    As for the size of your log file being at 33GB, that's also a bit nuts, just like all the free space in your NDF files is a bit nuts.  It probably occurs when you do index rebuilds.  In the FULL Recovery Model, Rebuilds are fully logged.  If you're not doing replication or log shipping, you can switch to the BULK LOGGED recovery model and your index Rebuilds will not only be minimally logged, they'll also run faster.  Don't forget to switch back to the Full Recovery Model if that's where you started out at.  Reorgs will always be fully logged regardless of the Recovery Model being used and they use a whole lot more resources than advertised.  I steer clear of Reorgs.

    Dear Jeff,

    Highly appreciated, and thank you so much for your valuable advice. It is really helpful for us.
    For the free space %, I used the query below.

    -- Size and free space (in MB) for each file in the current database.
    -- size and FILEPROPERTY(name, 'SpaceUsed') are both in 8KB pages, so /128.0 converts to MB.
    ;WITH f AS
    (
        SELECT name, size = size / 128.0
        FROM sys.database_files
    ),
    s AS
    (
        SELECT name, size, free = size - CONVERT(INT, FILEPROPERTY(name, 'SpaceUsed')) / 128.0
        FROM f
    )
    SELECT name, size, free, percent_free = free * 100.0 / size
    FROM s;


    I will take your advice for the log file too. I really appreciate your advice and time. I have one more point of confusion here. To resolve our mdf fragmentation, will index rebuilds and reorgs alone be sufficient? Is it also good to do a physical disk defragmentation?

    We highly appreciate all your advice. It is very valuable for us.

    Thank you so much and Best Regards,
    Cool

  • Jeff Moden - Friday, November 2, 2018 9:43 PM

    Ah... Just noticed that you're using 2014.  The VLF algorithm changed there.  See the following article...
    https://www.sqlskills.com/blogs/paul/important-change-vlf-creation-algorithm-sql-server-2014/

    The bottom line, though, is that growing by percent is still the wrong thing to do, especially if you start out at just 1 MB, and you end up with too many VLFs (IMHO).  I'd still recommend shrinking the log file to as close to zero as you can get it and then force the first growth to 1GB for this database.

    Revisiting your database and considering the size of the NDF files, I guess I'd be tempted to NOT partition that table.  Unless you have really painfully slow disks or a super limited amount of memory, it doesn't seem worth partitioning to me.  That's just my opinion, though, because I work with single column indexes that are larger than 5GB.  YMMV.

    Dear Jeff,

    Noted with thanks.
    We will take your advice and review our partitioning. Actually, we have our own doubts about the partitioning as well. We will check it again and review.
    Your advice is highly appreciated.

    Thank you so much & Best Regards,
    Cool

  • Cool2018 - Saturday, November 3, 2018 10:22 AM

    Jeff Moden - Friday, November 2, 2018 9:32 PM

    I'm not sure why you have so much free space for the NDF files.  Something is wrong there or you've measured something wrong or it's used space and not free space.

    You're also going to have to rebuild your log file.  Setting it to a % for growth is a bad thing to do.  If the default size of 1MB was left and you grow by 10%, it will take 73 fragments to grow to just 1GB.  The first 68 growths will each be less than 64MB so that's 68*4 VLFs, and the last 5 growths will be >64MB but less than 1GB so that's 5*8 VLFs, for a total of 312 VLFs just for the first GB of log file space.  That's way too many for this database.  When you get a chance, change the growth size of the log file to at least 1GB, then shrink your log file to as close to 0 as possible and then grow it to 1GB and let nature take its course from there.

    As for the size of your log file being at 33GB, that's also a bit nuts, just like all the free space in your NDF files is a bit nuts.  It probably occurs when you do index rebuilds.  In the FULL Recovery Model, Rebuilds are fully logged.  If you're not doing replication or log shipping, you can switch to the BULK LOGGED recovery model and your index Rebuilds will not only be minimally logged, they'll also run faster.  Don't forget to switch back to the Full Recovery Model if that's where you started out at.  Reorgs will always be fully logged regardless of the Recovery Model being used and they use a whole lot more resources than advertised.  I steer clear of Reorgs.

    Dear Jeff,

    Highly appreciated, and thank you so much for your valuable advice. It is really helpful for us.
    For the free space %, I used the query below.

    -- Size and free space (in MB) for each file in the current database.
    -- size and FILEPROPERTY(name, 'SpaceUsed') are both in 8KB pages, so /128.0 converts to MB.
    ;WITH f AS
    (
        SELECT name, size = size / 128.0
        FROM sys.database_files
    ),
    s AS
    (
        SELECT name, size, free = size - CONVERT(INT, FILEPROPERTY(name, 'SpaceUsed')) / 128.0
        FROM f
    )
    SELECT name, size, free, percent_free = free * 100.0 / size
    FROM s;


    I will take your advice for the log file too. I really appreciate your advice and time. I have one more point of confusion here. To resolve our mdf fragmentation, will index rebuilds and reorgs alone be sufficient? Is it also good to do a physical disk defragmentation?

    We highly appreciate all your advice. It is very valuable for us.

    Thank you so much and Best Regards,
    Cool

    On the physical disk defragmentation... if you have a SAN, most people will insist that the SAN takes care of disk fragmentation auto-magically.  I don't know if that's actually true or not because I've not been able to play with a SAN to find out.  I do tend to be skeptical but they might be right.

    As for defragging your indexes... if you don't have any dependencies on the log file for such things as log shipping, replication, etc., then, again, my recommendation for index maintenance for this database is as follows (a T-SQL sketch of these steps appears after the list)...
    1. If your Recovery Model is FULL or Bulk_logged, take a log file backup.
    2. Change the Recovery Model to Bulk_Logged if it's not already there to allow for minimal logging during rebuilds.
    3. Rebuild the indexes that need it.  Since this appears to be a "WORM" table (Write Once, Read Many.  In other words, you truncate it, load it, and use it without modifying the data), you should probably rebuild the indexes if they have <92 avg_page_space_used_in_percent or > 10 avg_fragmentation_in_percent (even at just 10% fragmentation, Reorg can take quite the toll on the log file for large tables).
    4. Change the Recovery Model back to FULL if that's where you started.
    5. Take another log file backup to minimize the amount of log file span that had minimal logging in it (important for Point-in-Time restores).
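
    A sketch of those steps for one index (the database, table, index, and backup paths are all placeholders):

    -- 1. Log backup first (FULL or BULK_LOGGED only).
    BACKUP LOG MyDatabase TO DISK = N'D:\Backups\MyDatabase_log_before.trn';

    -- 2. Allow minimal logging for the rebuild.
    ALTER DATABASE MyDatabase SET RECOVERY BULK_LOGGED;

    -- 3. Rebuild what needs it, per the thresholds above.
    ALTER INDEX PK_BigTable ON dbo.BigTable REBUILD;

    -- 4. Back to FULL if that's where you started.
    ALTER DATABASE MyDatabase SET RECOVERY FULL;

    -- 5. Another log backup to close off the minimally logged interval
    --    (point-in-time restore can't stop inside it).
    BACKUP LOG MyDatabase TO DISK = N'D:\Backups\MyDatabase_log_after.trn';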

    Whether the table is partitioned or not and based on the freespace in your MDF and NDF files, your used data size should drop to something less than 12GB.  If you do the minimal logging thing when you rebuild your indexes, you probably won't need more than about 2-4GB except for when you do your data loads.  If you learn how to do minimal logging for those as well as for index maintenance, you might be able to get away with even less.

    And, if the actual usage is going to drop to as little as your chart suggests, I'd definitely NOT partition this table.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
