SQL Storing Too Much Data

  • Hi All,

    I have a table that contains 3million rows and no indexes defined. It was has the following datatypes:

    ColumnNameDataTypeLengthColumnPrecisionNullable

    Source_BKeyvarchar (255)255255NOT NULL

    Period_Cumulative_BKeyvarchar (255)255255NOT NULL

    Product_SBU_2_BKeyvarchar (255)255255NOT NULL

    Gross_Profit_L2_BKeyvarchar (255)255255NOT NULL

    Base_BKeyvarchar (255)255255NOT NULL

    Exchange_Rate_Type_BKeyvarchar (255)255255NOT NULL

    Month_BKeyvarchar (255)255255NOT NULL

    Planning_Years_BKeyvarchar (255)255255NOT NULL

    Calendar_BKeyvarchar (255)255255NOT NULL

    Local_Valuenumeric1325

    Exch_Ratefloat853

    Euro_Valuenumeric1325

    Local_Value_Unfactoredfloat853

    Euro_Value_Unfactoredfloat853

    Create_Datedatetime823NOT NULL

    Last_Update_Datedatetime823NOT NULL

    The size of the table reports as being 15GB when running sp_spaceused:

    rowsreserveddataindex_sizeunused

    3842204 16583496 KB16271872 KB64 KB311560 KB

    The number of characters in each of the varchar fields is nowhere near 255 and i've verified this using the following syntax to calculate the true number of bytes that SQl should be storing for each column:

    sum(datalength([Source_BKey])+2)

    If I add each column storage size I get nowhere near the 15GB total (i calculate 0.5GB in size). It would seem that SQL is storing a lot more number of bytes than i'm expecting but i thought that using the datalength function would return the true stored number of bytes per row?

    Does anyone know how this could be happening?

    Thanks in advance

  • A HEAP table (without a clustered index) has a table size which includes the space of all the deleted rows, that ever existed, EVER.

    so if you insert and delete a lot, the table size will always reflect the space take by deleted rows.

    http://sqlserverpedia.com/wiki/Heaps#Deletes_and_Heaps

    Deletes and Heaps

    When data is deleted from a heap using a DELETE statement, SQL Server will not release the space; it remains allocated to the heap. This leads to space bloat that wastes valuable resources. To address this problem, you can do any of the following:

    that one of the many reasons that every table should have a clustered index in SQL, if not a primary key.

    Lowell


    --help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

  • you could build a clustered index and then drop it again to clear things up, but really you ought to consider adding a clustered index and leaving it there, as well as adding an index or two that might speed up queries.

  • You might also read this about choosing your clustered key: http://www.sqlservercentral.com/articles/Indexing/68563/

  • Thank you both for your quick replies! I've created a clustered index and now the table reports as being 600MB as expected so that's great.

    Just wondered why the sp_spaceused did not show massive amounts of free space?

    Regards,

    Chris

  • davas (11/29/2012)


    Thank you both for your quick replies! I've created a clustered index and now the table reports as being 600MB as expected so that's great.

    Just wondered why the sp_spaceused did not show massive amounts of free space?

    Regards,

    Chris

    This is the very first line of text from the article that Lowell referenced.

    When data is deleted from a heap using a DELETE statement, SQL Server will not release the space; it remains allocated to the heap.

    The space is still allocated so so_spaceused sees it as allocated.

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • davas (11/29/2012)


    Thank you both for your quick replies! I've created a clustered index and now the table reports as being 600MB as expected so that's great.

    Just wondered why the sp_spaceused did not show massive amounts of free space?

    Regards,

    Chris

    that's that weirdness of a HEAP, that the space is not reported as free, even though you might think it should be free, since it's deleted data.

    I'd guess it's a a legacy design issue, that since it has a work around (adding the clustered index) there's no priority in making the SQL engine do anything else with the deleted data.

    Lowell


    --help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

  • Thanks both for your responses regarding the heap. I guess I'd like to find a way to detect if the same is happening on other tables too (with a big impact) through a script. I know how to determine which are heaps and which are clustered which is a start but is there a way to measure the wasted space?

    Regards,

    Chris

  • davas (11/29/2012)


    Thanks both for your responses regarding the heap. I guess I'd like to find a way to detect if the same is happening on other tables too (with a big impact) through a script. I know how to determine which are heaps and which are clustered which is a start but is there a way to measure the wasted space?

    Regards,

    Chris

    There is the rub. The system doesn't see it as wasted space. Do you have lots of heaps in your system? Can you add a clustered index to those other tables? Heaps have a number of issues that make them more difficult to work with.

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • Sean Lange (11/29/2012)


    There is the rub. The system doesn't see it as wasted space. Do you have lots of heaps in your system? Can you add a clustered index to those other tables? Heaps have a number of issues that make them more difficult to work with.

    I've inherited these databases that are part of a data warehouse ETL process. I've just started looking at them in detail (I'm an 'accidental DBA') to see how they are configured to help me understand how I can improve performance of data loads etc.

    I think by creating a handful of clustered indexes I've just reduced the size of the database by 26GB to 3GB. We seem to have very few tables with clustered indexes but it is a staging database that isn't touched by users. I intend to add appropriate indexes where i can that will speed up data load times.

    Many thanks for all of your help everyone!

    Regards,

    Chris

Viewing 10 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply