Table size is Huge!!!

  • ScottPletcher (8/6/2014)


    LutzM (8/6/2014)


    Here are a few options to make that single table "smaller":

    If the table contains a rather large number of columns (e.g. 50 or more) you might consider "vertical partitioning", meaning to add a second table with the same primary key and splitt the columns (if columns can be grouped logically, even better...).

    If the size is due to a large number of rows, you might consider "horizontal partitioning",e.g. by moving "old data" to one or more archive table(s), e.g. one per year.

    Finally, evaluate the data type used for the column definitions and if it's defined for the size needed. For example, a NCHAR(255) to store either "Yes" or "No" could be reduced to VARCHAR(3)...

    Excellent points overall.

    But a Yes/No column should be changed to char(3), not varchar(3) (barring lots of NULL values).

    Or disallow nulls and make it a char(5) to include "maybe" or "dunno" which would better fit a Yes/No column than a null. 😉

    Jason...AKA CirqueDeSQLeil
    _______________________________________________
    I have given a name to my pain...MCM SQL Server, MVP
    SQL RNNR
    Posting Performance Based Questions - Gail Shaw[/url]
    Learn Extended Events

  • GPO (8/6/2014)


    I'm surprised nobody has discussed normalization... or did I miss something? It would be good if @smtzac could script the CREATE TABLE and tell us a bit about it (how many rows, what is it's purpose etc). Is it a transactional table or a flat data dump? Also what version od SQL Server is it on?

    Agreed and that's why I've been asking for what the table is actually used for. Even better, as you say, the CREATE TABLE statement (along with the CREATEs for indexes and constraints) would go a long way in determining many different aspects including but not limited to possible compression by normalization. If it's a "field level audit table", we're likely talking about a full blown EAV, which will have much different "normalization" requirements than an well designed OLTP table but possibly also has some unique attributes where things like nightly maintenance (backups, index reorgs/rebuilds, etc) could be reduced to a tiny fraction of what they currently are all while possibly enabling the ability to do online piecemeal restores.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • smtzac (8/6/2014)


    Thank you guys for your time 🙂

    Since we have no clue what the table is actually being used for, we're just shooting in the dark. Take some time to post the CREATE TABLE statement along with all of the indexes for the table along with a brief description of what the table is actually used for and let us really try to help.

    Also, run the following code against the table (change the schema and table name in the code) and attach the output.

    SELECT *

    FROM sys.dm_db_index_physical_stats(DB_ID(),OBJECT_ID('dbo.tablenamehere'),NULL,NULL,'SAMPLED')

    ;

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • You can create Server and database audit to check what and who is inserting records in a table.

    Also if possible use partitions for performance.

  • mssqlsrv (8/6/2014)


    You can create Server and database audit to check what and who is inserting records in a table.

    Also if possible use partitions for performance.

    Oh, be careful now... partitioning won't necessarily give you any performance advantage and, in fact, could slow things down a bit depending, of course, on what you partition by vs the most common queries. A well written query against a properly indexed table will work with great performance.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Sorry guys for late response. Not sure what is this table created for . I have scripted out:

    CREATE TABLE [dbo].[ABCD](

    [A] [uniqueidentifier] NULL,

    [uniqueidentifier] NULL,

    [C] [nvarchar](100) NULL,

    [D] [nvarchar](max) NULL,

    [E] [nvarchar](max) NULL,

    [F] [nvarchar](100) NULL,

    [G] [datetime] NULL,

    [H] [nvarchar](100) NULL,

    [datetime] NULL,

    [J] [bit] NULL,

    [K] [bit] NULL,

    [L] [bit] NULL,

    [M] [nvarchar](2) NULL

    ) ON [PRIMARY]

  • smtzac (8/7/2014)


    CREATE TABLE [dbo].[ABCD](

    [A] [uniqueidentifier] NULL,

    [uniqueidentifier] NULL,

    [C] [nvarchar](100) NULL,

    [D] [nvarchar](max) NULL,

    [E] [nvarchar](max) NULL,

    [F] [nvarchar](100) NULL,

    [G] [datetime] NULL,

    [H] [nvarchar](100) NULL,

    [datetime] NULL,

    [J] [bit] NULL,

    [K] [bit] NULL,

    [L] [bit] NULL,

    [M] [nvarchar](2) NULL

    ) ON [PRIMARY]

    Lordy.

  • Jeff Moden (8/7/2014)


    mssqlsrv (8/6/2014)


    You can create Server and database audit to check what and who is inserting records in a table.

    Also if possible use partitions for performance.

    Oh, be careful now... partitioning won't necessarily give you any performance advantage and, in fact, could slow things down a bit depending, of course, on what you partition by vs the most common queries. A well written query against a properly indexed table will work with great performance.

    erm...Agree with Jeff on this one.

    While partitioning can sometimes offer a performance boost, it should not be implemented with the hopes of it being a magic "turbo" button.

    Jason...AKA CirqueDeSQLeil
    _______________________________________________
    I have given a name to my pain...MCM SQL Server, MVP
    SQL RNNR
    Posting Performance Based Questions - Gail Shaw[/url]
    Learn Extended Events

  • My guess (based on bitter experience); it is an audit table populated by an AFTER UPDATE trigger and something is updating the source table without a WHERE clause.

  • smtzac (8/7/2014)


    Sorry guys for late response. Not sure what is this table created for . I have scripted out:

    CREATE TABLE [dbo].[ABCD](

    [A] [uniqueidentifier] NULL,

    [uniqueidentifier] NULL,

    [C] [nvarchar](100) NULL,

    [D] [nvarchar](max) NULL,

    [E] [nvarchar](max) NULL,

    [F] [nvarchar](100) NULL,

    [G] [datetime] NULL,

    [H] [nvarchar](100) NULL,

    [datetime] NULL,

    [J] [bit] NULL,

    [K] [bit] NULL,

    [L] [bit] NULL,

    [M] [nvarchar](2) NULL

    ) ON [PRIMARY]

    I'm guessing you've obfuscated the actual column names. Hopefully no one would create a table in production with that naming convention. Given this, I have no idea what the table is for either, so I can't comment about whether the table is properly normalized or if the choice of data types are appropriate.

    What I can tell you is that the uniqueidentifier columns are 16 bytes fixed each, and if I had to bet, one or maybe both of those columns are the clustered primary key. Also, those nvarchar(max) are storage hogs. If this table contains millions of rows, then I could easily picture it at 300 GB, but with those nvarchar(max) columns there is practically no limit, each row could contain a 100 page term paper.

    Have you checked the average fragmentation percentage and page density? The following could take more than an hour to run, but the result will tell us a lot about what's going on.

    exec sp_spaceused 'dbo.ABCD';

    dbcc showcontig ('dbo.ABCD') with fast, all_indexes;

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • smtzac (8/7/2014)


    Sorry guys for late response. Not sure what is this table created for . I have scripted out:

    CREATE TABLE [dbo].[ABCD](

    [A] [uniqueidentifier] NULL,

    [uniqueidentifier] NULL,

    [C] [nvarchar](100) NULL,

    [D] [nvarchar](max) NULL,

    [E] [nvarchar](max) NULL,

    [F] [nvarchar](100) NULL,

    [G] [datetime] NULL,

    [H] [nvarchar](100) NULL,

    [datetime] NULL,

    [J] [bit] NULL,

    [K] [bit] NULL,

    [L] [bit] NULL,

    [M] [nvarchar](2) NULL

    ) ON [PRIMARY]

    Wow! Not much help there. Are those the real table and column names? Do you know what this table is used for? For example, is it an audit table of some form?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Every column nullable. No constraints. Not even a primary key. No indexes. I think someone is pulling your leg. What possible reliable use could it be? I'm tempted to think you might be able to claw back 300gb fairly easily if you get my drift.:-)

    Seriously though, go back to the people who gave you this task and find out what they were hoping to achieve with this table. Because whatever their intentions were I'll bet this table is not the answer they were hoping for. It's not really a "relational database table" at all in the true sense. It's just a massive, loose affiliation of ones and zeros polluting your disk. If it is being queried (and I can't see how it could be) you'll never be sure you're getting the right answer.

    ...One of the symptoms of an approaching nervous breakdown is the belief that ones work is terribly important.... Bertrand Russell

  • GPO (8/7/2014)


    Every column nullable. No constraints. Not even a primary key. No indexes. I think someone is pulling your leg. What possible reliable use could it be? I'm tempted to think you might be able to claw back 300gb fairly easily if you get my drift.:-)

    Seriously though, go back to the people who gave you this task and find out what they were hoping to achieve with this table. Because whatever their intentions were I'll bet this table is not the answer they were hoping for. It's not really a "relational database table" at all in the true sense. It's just a massive, loose affiliation of ones and zeros polluting your disk. If it is being queried (and I can't see how it could be) you'll never be sure you're getting the right answer.

    It looks like a typical table from PeopleSoft, for anyone who has ever had the misfortune. 🙂

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • Eric M Russell (8/7/2014)


    GPO (8/7/2014)


    Every column nullable. No constraints. Not even a primary key. No indexes. I think someone is pulling your leg. What possible reliable use could it be? I'm tempted to think you might be able to claw back 300gb fairly easily if you get my drift.:-)

    Seriously though, go back to the people who gave you this task and find out what they were hoping to achieve with this table. Because whatever their intentions were I'll bet this table is not the answer they were hoping for. It's not really a "relational database table" at all in the true sense. It's just a massive, loose affiliation of ones and zeros polluting your disk. If it is being queried (and I can't see how it could be) you'll never be sure you're getting the right answer.

    It looks like a typical table from PeopleSoft, for anyone who has ever had the misfortune. 🙂

    I was gonna say it looks quite familiar to a bunch of others too.

    Jason...AKA CirqueDeSQLeil
    _______________________________________________
    I have given a name to my pain...MCM SQL Server, MVP
    SQL RNNR
    Posting Performance Based Questions - Gail Shaw[/url]
    Learn Extended Events

Viewing 14 posts - 16 through 28 (of 28 total)

You must be logged in to reply to this topic. Login to reply