What's taking up all my disc space

  • Hi

    I have an overnight job that performs a lot of tasks

    At a point in time, I am getting a message from our network department to say that one of the discs is running low on space (they use VMWare monitoring tools)

    It's not actually running out but could

    Is there an easy/best practice approach to finding which specific task(s) cause this issue?

    Thanks

    Damian.

    - Damian

  • You can capture database growth events with an Extended Events session, along with the statement that caused the growth.

    Something similar to this should do:

    CREATE EVENT SESSION [DatabaseGrowth]

    ON SERVER

    ADD EVENT sqlserver.database_file_size_change(

    SET collect_database_name=(1)

    ACTION(sqlserver.client_hostname,sqlserver.sql_text)

    )

    WITH (STARTUP_STATE=ON)

    Add targets according to your needs.

    -- Gianluca Sartori

  • Thanks for the response Gianluca

    In this instance, I need to know the code that causes the issue

    Is there a way of establishing the impact of each component part of code or do I have to do this one step at a time

    e.g. let it run through, log the impact, analyse the log or ?

    Thanks

    - Damian

  • There is really no way to know in advance. Measure, capture, act accordingly.

    -- Gianluca Sartori

  • DamianC (11/3/2015)


    Hi

    I have an overnight job that performs a lot of tasks

    At a point in time, I am getting a message from our network department to say that one of the discs is running low on space (they use VMWare monitoring tools)

    It's not actually running out but could

    Is there an easy/best practice approach to finding which specific task(s) cause this issue?

    Thanks

    Damian.

    Which disk? What's on it? MDF? LDF? TempDB? Mix? I ask because that will help narrow down the search for the cause.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • DamianC (11/3/2015)


    In this instance, I need to know the code that causes the issue

    If you log the database extensions then would the time at which they occur tell you which job / step / "action" it was? or do you have multiple jobs running at overlapping times?

    You can review when the Job / Step ran (but you might need to increase the history size that SQL Agent keeps in case it is getting purged before you get a chance to see it)

  • Jeff Moden (11/3/2015)


    DamianC (11/3/2015)


    Hi

    I have an overnight job that performs a lot of tasks

    At a point in time, I am getting a message from our network department to say that one of the discs is running low on space (they use VMWare monitoring tools)

    It's not actually running out but could

    Is there an easy/best practice approach to finding which specific task(s) cause this issue?

    Thanks

    Damian.

    Which disk? What's on it? MDF? LDF? TempDB? Mix? I ask because that will help narrow down the search for the cause.

    Heh... I repeat... this is step #1. 😉

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Hi

    Thanks for the replies

    Basically, MDF, LDF and TempDB are all on the same disk, on the same drive (E)

    I am aware this is not optimal and I am going to look at the hardware structure although my hands are a little tied due to available resources

    We use a Staging (raw data) -> Enterprise (merged data) -> Presentation (Kimball style structure) approach

    So, there's 2 jobs with multiple steps involving a full set of ETL processes followed by maintenance

    I can review duration but how do I review disk space?

    Somewhere amongst this whole process disk space is running low

    I could sit and step through this manually but wondering if there are tools or methods for monitoring then reviewing

    Thanks

    - Damian

  • DamianC (11/5/2015)


    Hi

    Thanks for the replies

    Basically, MDF, LDF and TempDB are all on the same disk, on the same drive (E)

    I am aware this is not optimal and I am going to look at the hardware structure although my hands are a little tied due to available resources

    We use a Staging (raw data) -> Enterprise (merged data) -> Presentation (Kimball style structure) approach

    So, there's 2 jobs with multiple steps involving a full set of ETL processes followed by maintenance

    I can review duration but how do I review disk space?

    Somewhere amongst this whole process disk space is running low

    I could sit and step through this manually but wondering if there are tools or methods for monitoring then reviewing

    Thanks

    The next step that I'd take is to try to determine which file or files are causing your blowout. The following will list the file sizes in descending order by size in MB. I also included the GROWTH factor because an already large file with a percent growth factor can grow by a huge amount in a single growth.

    SELECT SampleDT = GETDATE()

    ,SizeMB = size/128.0

    ,DbName = DB_NAME(database_id)

    ,FileType = type_desc

    ,LogicalName = name

    ,PhysicalName = physical_name

    ,growth = CASE

    WHEN is_percent_growth = 0

    THEN CAST(growth/128.0 AS VARCHAR(20))+'MB'

    ELSE CAST(growth AS VARCHAR(20))+'%'

    END

    FROM sys.master_files

    ORDER BY SizeMB DESC

    ;

    This will at least give us a clue as to what we're looking for. For example, if a datafile (MDF or NDF) for a given database is large and you run sp_spaceused on it and it comes up with a large amount of free space, then it could very well be that index rebuilds are the problem. If a log file is huge, it could be caused by index reorgs or rebuilds or the fact that the processes you're running make a whole lot of log entries and probably need to be fixed (as in made more efficient). If it's TempDB that's gone huge, it could very well be an insufficient criteria problem (people try to get around this with DISTINCT or GROUP BY instead of fixing the code) that causes huge accidental cross-joins in the form of many-to-many joins.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 9 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply