Huge number of tables

  • Also my desk landline is sitting in the corner, the cord coiled up as it was when first (dis)connected. 

    And I very much hope it stays that way 🙂

  • JackCarrington - Monday, August 13, 2018 6:46 AM

    Jeff Moden - Saturday, August 11, 2018 1:13 PM

    Luis Cazares - Wednesday, August 8, 2018 9:11 AM

    JackCarrington - Wednesday, August 8, 2018 7:39 AM

    Hi all,

    I've inherited a situation where we have a business critical system with gigantic number of tables, 240,000 - yes, nearly a quarter of a million. 

    I'll investigate further and update this post but my guess for now is that these comprise either the vendor or other parties periodically inserting data via Select Into and then should have deleted/archived elderly data, but didn't .

    Would like appreciate any/all input.

    Run away as fast as you can!
    You could actually try to identify which tables are not being used and start cleaning as fast as possible. Hopefully, there's at least a decent naming system (clear names instead of random or semi-random fixed length strings)

    BWAAA-HAAAA!!!!   That would be my first reaction, as well BUT... people need to get over that because it's a real easy place to shine and prove your worth early on in the game, especially if you use the "rename it to mark it for deletion" method.

    Indeed it is and that's one reason I'm doing this (the other being I frickken LOVE playing detective!!).

    Jeff that was a sterling suggestion sir, thank you!!

    Heh... it's the very same method that I use here at work.  I don't have as many tables to deal with as you but I have 3 different schemas with 3 different time frames...

    Scratch Schema:  If the table hasn't been read from in a week, it automatically gets renamed and then gets dropped one week after the rename.
    Arch Schema.  If the table hasn't been read from in 93 days, it automatically gets renamed and then gets dropped one week after the rename.
    DBO Schema.  If the table hasn't been read in 180 days, it goes on a proposed kill list and distributed with a threat to rename it and kill it as if it were in the scratch schema starting "now".

    Perhaps that why I don't actually have a quarter million tables to research. 😀

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Eirikur Eiriksson - Saturday, August 11, 2018 11:04 AM

    Jeff Moden - Saturday, August 11, 2018 10:52 AM

    Rather than dropping tables that you may have to restore when the phone does ring, rename them with a prefix of "ToBeDeleted_yyyymmdd_" where "yyyymmdd" is a date that's 5 weeks out from the current date.  Build a "sweeper" that scans for anything that has a date less than the current date in the table name and then only drop those. 

    If the phone does ring, it's a whole lot easier to simply remove the prefix from the table name than it is to do a restore to restore a single or even a couple of tables.

    Good point on the renaming Jeff, you are much kinder than I!
    😎

    Having gone through this kind of things few times, my question would always be "so what was the schema and table name again?" When work areas spill into the production, sometimes one just has to do a proper spring/summer/autumn/winter cleaning.

    You asked in another thread if the GUIDs were "Serializable", not sure what that means but I presume they are actually system generated.

    After running

    SELECT * FROM sys.tables

    WHERE name LIKE '%[0-9][0-9][0-9]%'

    Order by name

    I read through the list - one thing that struck me was how the meaningful prefix names were the usual suspects, in fact over 212,000 of the total were comprised just the following 15 table name prefixes: 

    SELECT COUNT (*) FROM sys.tables WHERE name LIKE 'ACCURAL_%'And name LIKE '%[0-9][0-9][0-9]%'

    SELECT COUNT (*) FROM sys.tables WHERE name LIKE 'ACKNOWLEDGEMENTS_%'And name LIKE '%[0-9][0-9][0-9]%'

    SELECT COUNT (*) FROM sys.tables WHERE name LIKE 'ADHOCQUERY_STATICIDSET_%'And name LIKE '%[0-9][0-9][0-9]%'

    SELECT COUNT (*) FROM sys.tables WHERE name LIKE 'ATTRIBUTE%'And name LIKE '%[0-9][0-9][0-9]%'

    SELECT COUNT (*) FROM sys.tables WHERE name LIKE 'BATCHATTRIBUTE%'And name LIKE '%[0-9][0-9][0-9]%'

    SELECT COUNT (*) FROM sys.tables WHERE name LIKE 'BATCHCONTROLREPORT_%'And name LIKE '%[0-9][0-9][0-9]%'

    SELECT COUNT (*) FROM sys.tables WHERE name LIKE 'CORRESPONDENCE_%'And name LIKE '%[0-9][0-9][0-9]%'

    SELECT COUNT (*) FROM sys.tables WHERE name LIKE 'EXPORT_%'And name LIKE '%[0-9][0-9][0-9]%'

    SELECT COUNT (*) FROM sys.tables WHERE name LIKE 'GENERATE_BACS_%'And name LIKE '%[0-9][0-9][0-9]%'

    SELECT COUNT (*) FROM sys.tables WHERE name LIKE 'IDSETGENERIC_%'And name LIKE '%[0-9][0-9][0-9]%'

    SELECT COUNT (*) FROM sys.tables WHERE name LIKE 'IMPORTSELECTION_STATICIDSET_%'And name LIKE '%[0-9][0-9][0-9]%'

    SELECT COUNT (*) FROM sys.tables WHERE name LIKE 'MKTMAILINGPREACTIVATION_%'And name LIKE '%[0-9][0-9][0-9]%'

    SELECT COUNT (*) FROM sys.tables WHERE name LIKE 'MKTSEGMENTATION%' And name LIKE '%[0-9][0-9][0-9]%'

    SELECT COUNT (*) FROM sys.tables WHERE name LIKE 'POSTTOGL_%'And name LIKE '%[0-9][0-9][0-9]%'

    SELECT

    COUNT (*) FROM sys.tables WHERE name LIKE 'tmp_%'And name LIKE

    BUT, the following listed 7689 entries:

    SELECT

    Count (* )FROM sys.tables
    WHERE name not LIKE '%[0-9]%'

  • JackCarrington - Tuesday, August 14, 2018 6:26 AM

    You asked in another thread if the GUIDs were "Serializable", not sure what that means but I presume they are actually system generated.

    After running

    SELECT * FROM sys.tables

    WHERE name LIKE '%[0-9][0-9][0-9]%'

    Order by name

    I read through the list - one thing that struck me was how the meaningful prefix names were the usual suspects, in fact the following 15 comprised over 212,000 of the total: 

    SELECT COUNT (*) FROM sys.tables WHERE name LIKE 'ACCURAL_%'And name LIKE '%[0-9][0-9][0-9]%'

    ...

    SELECT

    COUNT (*) FROM sys.tables WHERE name LIKE 'tmp_%'And name LIKE '%[0-9][0-9][0-9]%'

    Just be careful with the wildcards - the underscore is a wildcard in its own right.  See https://docs.microsoft.com/en-us/sql/t-sql/language-elements/like-transact-sql?view=sql-server-2017 for more wildcard woes

    Thomas Rushton
    blog: https://thelonedba.wordpress.com

  • ThomasRushton - Tuesday, August 14, 2018 6:38 AM

    JackCarrington - Tuesday, August 14, 2018 6:26 AM

    You asked in another thread if the GUIDs were "Serializable", not sure what that means but I presume they are actually system generated.

    After running

    SELECT * FROM sys.tables

    WHERE name LIKE '%[0-9][0-9][0-9]%'

    Order by name

    I read through the list - one thing that struck me was how the meaningful prefix names were the usual suspects, in fact the following 15 comprised over 212,000 of the total: 

    SELECT COUNT (*) FROM sys.tables WHERE name LIKE 'ACCURAL_%'And name LIKE '%[0-9][0-9][0-9]%'

    ...

    SELECT

    COUNT (*) FROM sys.tables WHERE name LIKE 'tmp_%'And name LIKE '%[0-9][0-9][0-9]%'

    Just be careful with the wildcards - the underscore is a wildcard in its own right.  See https://docs.microsoft.com/en-us/sql/t-sql/language-elements/like-transact-sql?view=sql-server-2017 for more wildcard woes

    You're quite right, although I had found some code using an 'Escape' switch which allowed SQL to treat underscores as searchable values, I've nonetheless completely avoided searching for them.

    All the names of the likely excess tables contain at least one instance of 3 consecutive numerals within the suffix character string (which is not the case with the vast majority of 'base' tables - and  I'm open to a better name - created in 2009).  For example:

    ACCURAL_3ed70ee7_4e52_4d16_841b_14cc8da601d0
    ATTRIBUTE4731FE64AA10408CAB7BE47A76EF133C
    EXPORT_f38018ee_6efa_4937_b340_8a567c240a7a
    MKTSEGMENTATIONDATA_NORMALGIFTS_9C0654FD_1236_465D_8A02_9C550A8830B8
    POSTTOGL_EX_b091fca9_284a_4f45_8cd6_d3973378c764

    Anyway, the latest update:

    There are 7,689 'base' tables (which is still a ridiculously high number, but of course not the same magnitude as .25million) with (relatively) short, meaningful, unsuffixed names like

    CHANGEAGENT
    APPUSER
    APPUSERAUDIT etc. 

    Which of course leaves some 243,000 tables with crazy, GUID-suffixed names.   If I analyse by Create_Date, I find that:

    1) 2009-07-09 - the oldest (base) tables, 2,547, were created, presumably when the system was first implemented.  Of those, only 11 had GUID suffixes (the rest thus likely being base tables);
    2) 2009-07-10 - 154 tables, ALL base tables, were added the next day;
    3) 2009-07-10 to 2009-09-10 -  8 base tables added, but on ;
    4) 2010-01-06 13:46:59.337, the ghastly process of adding GUID-suffixed table begins in earnest - and hasn't stopped since. Next planned step is to bust open a few of these suffixed tables, for now they seem to be just a few rows deep, which would indicate staging tables that aren't being cleaned up.

  • JackCarrington - Tuesday, August 14, 2018 8:32 AM

    ThomasRushton - Tuesday, August 14, 2018 6:38 AM

    JackCarrington - Tuesday, August 14, 2018 6:26 AM

    You asked in another thread if the GUIDs were "Serializable", not sure what that means but I presume they are actually system generated.

    After running

    SELECT * FROM sys.tables

    WHERE name LIKE '%[0-9][0-9][0-9]%'

    Order by name

    I read through the list - one thing that struck me was how the meaningful prefix names were the usual suspects, in fact the following 15 comprised over 212,000 of the total: 

    SELECT COUNT (*) FROM sys.tables WHERE name LIKE 'ACCURAL_%'And name LIKE '%[0-9][0-9][0-9]%'

    ...

    SELECT

    COUNT (*) FROM sys.tables WHERE name LIKE 'tmp_%'And name LIKE '%[0-9][0-9][0-9]%'

    Just be careful with the wildcards - the underscore is a wildcard in its own right.  See https://docs.microsoft.com/en-us/sql/t-sql/language-elements/like-transact-sql?view=sql-server-2017 for more wildcard woes

    You're quite right, although I had found some code using an 'Escape' switch which allowed SQL to treat underscores as searchable values, I've nonetheless completely avoided searching for them.

    All the names of the likely excess tables contain at least one instance of 3 consecutive numerals within the suffix character string (which is not the case with the vast majority of 'base' tables - and  I'm open to a better name - created in 2009).  For example:

    ACCURAL_3ed70ee7_4e52_4d16_841b_14cc8da601d0
    ATTRIBUTE4731FE64AA10408CAB7BE47A76EF133C
    EXPORT_f38018ee_6efa_4937_b340_8a567c240a7a
    MKTSEGMENTATIONDATA_NORMALGIFTS_9C0654FD_1236_465D_8A02_9C550A8830B8
    POSTTOGL_EX_b091fca9_284a_4f45_8cd6_d3973378c764

    Anyway, the latest update:

    There are 7,689 'base' tables (which is still a ridiculously high number, but of course not the same magnitude as .25million) with (relatively) short, meaningful, unsuffixed names like

    CHANGEAGENT
    APPUSER
    APPUSERAUDIT etc. 

    Which of course leaves some 243,000 tables with crazy, GUID-suffixed names.   If I analyse by Create_Date, I find that:

    1) 2009-07-09 - the oldest (base) tables, 2,547, were created, presumably when the system was first implemented.  Of those, only 11 had GUID suffixes (the rest thus likely being base tables);
    2) 2009-07-10 - 154 tables, ALL base tables, were added the next day;
    3) 2009-07-10 to 2009-09-10 -  8 base tables added, but on ;
    4) 2010-01-06 13:46:59.337, the ghastly process of adding GUID-suffixed table begins in earnest - and hasn't stopped since. Next planned step is to bust open a few of these suffixed tables, for now they seem to be just a few rows deep, which would indicate staging tables that aren't being cleaned up.

    Or, really bad for you, they're only recording differences from the first 11 tables.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • JackCarrington - Tuesday, August 14, 2018 8:32 AM

    ThomasRushton - Tuesday, August 14, 2018 6:38 AM

    JackCarrington - Tuesday, August 14, 2018 6:26 AM

    You asked in another thread if the GUIDs were "Serializable", not sure what that means but I presume they are actually system generated.

    After running

    SELECT * FROM sys.tables

    WHERE name LIKE '%[0-9][0-9][0-9]%'

    Order by name

    I read through the list - one thing that struck me was how the meaningful prefix names were the usual suspects, in fact the following 15 comprised over 212,000 of the total: 

    SELECT COUNT (*) FROM sys.tables WHERE name LIKE 'ACCURAL_%'And name LIKE '%[0-9][0-9][0-9]%'

    ...

    SELECT

    COUNT (*) FROM sys.tables WHERE name LIKE 'tmp_%'And name LIKE '%[0-9][0-9][0-9]%'

    Just be careful with the wildcards - the underscore is a wildcard in its own right.  See https://docs.microsoft.com/en-us/sql/t-sql/language-elements/like-transact-sql?view=sql-server-2017 for more wildcard woes

    You're quite right, although I had found some code using an 'Escape' switch which allowed SQL to treat underscores as searchable values, I've nonetheless completely avoided searching for them.

    All the names of the likely excess tables contain at least one instance of 3 consecutive numerals within the suffix character string (which is not the case with the vast majority of 'base' tables - and  I'm open to a better name - created in 2009).  For example:

    ACCURAL_3ed70ee7_4e52_4d16_841b_14cc8da601d0
    ATTRIBUTE4731FE64AA10408CAB7BE47A76EF133C
    EXPORT_f38018ee_6efa_4937_b340_8a567c240a7a
    MKTSEGMENTATIONDATA_NORMALGIFTS_9C0654FD_1236_465D_8A02_9C550A8830B8
    POSTTOGL_EX_b091fca9_284a_4f45_8cd6_d3973378c764

    Anyway, the latest update:

    There are 7,689 'base' tables (which is still a ridiculously high number, but of course not the same magnitude as .25million) with (relatively) short, meaningful, unsuffixed names like

    CHANGEAGENT
    APPUSER
    APPUSERAUDIT etc. 

    Which of course leaves some 243,000 tables with crazy, GUID-suffixed names.   If I analyse by Create_Date, I find that:

    1) 2009-07-09 - the oldest (base) tables, 2,547, were created, presumably when the system was first implemented.  Of those, only 11 had GUID suffixes (the rest thus likely being base tables);
    2) 2009-07-10 - 154 tables, ALL base tables, were added the next day;
    3) 2009-07-10 to 2009-09-10 -  8 base tables added, but on ;
    4) 2010-01-06 13:46:59.337, the ghastly process of adding GUID-suffixed table begins in earnest - and hasn't stopped since. Next planned step is to bust open a few of these suffixed tables, for now they seem to be just a few rows deep, which would indicate staging tables that aren't being cleaned up.

    Obviously you haven't seen some of the commercial ERP systems.  PeopleSoft Finance and HR systems have over 55,000 tables each.

  • and I expect in any given installation 50K of them are empty 🙂

  • aaron.reese - Friday, August 17, 2018 6:20 AM

    and I expect in any given installation 50K of them are empty 🙂

    Except on Tuesdays. 😀

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden - Friday, August 17, 2018 6:48 AM

    aaron.reese - Friday, August 17, 2018 6:20 AM

    and I expect in any given installation 50K of them are empty 🙂

    Except on Tuesdays. 😀

    Just pondering on the page life expectancy and plan cache retention, sounds like a Mainframe/AS400/File based system to SQL Server conversion.
    😎

    Having seen systems that generated a number of table for each customer/service/delivery combination, I'm not surprised but not an design I would promote either. The legacy converts are tainted with a muddy puddle of schema, subject and logic, as logical as feeding a frog by throwing it in the blender with the frog-food.

  • aaron.reese - Friday, August 17, 2018 6:20 AM

    and I expect in any given installation 50K of them are empty 🙂

    All depends on what modules you have purchased and use, but just about where I was working at the time.

  • Eirikur Eiriksson - Friday, August 17, 2018 7:23 AM

    Jeff Moden - Friday, August 17, 2018 6:48 AM

    aaron.reese - Friday, August 17, 2018 6:20 AM

    and I expect in any given installation 50K of them are empty 🙂

    Except on Tuesdays. 😀

    Just pondering on the page life expectancy and plan cache retention, sounds like a Mainframe/AS400/File based system to SQL Server conversion.
    😎

    Having seen systems that generated a number of table for each customer/service/delivery combination, I'm not surprised but not an design I would promote either. The legacy converts are tainted with a muddy puddle of schema, subject and logic, as logical as feeding a frog by throwing it in the blender with the frog-food.

    Well, if I remember correctly Steve Jones actually worked for PeopleSoft prior to them being bought by Oracle.  He may be able to answer part of that for us.

  • Lynn Pettis - Friday, August 17, 2018 8:52 AM

    Eirikur Eiriksson - Friday, August 17, 2018 7:23 AM

    Jeff Moden - Friday, August 17, 2018 6:48 AM

    aaron.reese - Friday, August 17, 2018 6:20 AM

    and I expect in any given installation 50K of them are empty 🙂

    Except on Tuesdays. 😀

    Just pondering on the page life expectancy and plan cache retention, sounds like a Mainframe/AS400/File based system to SQL Server conversion.
    😎

    Having seen systems that generated a number of table for each customer/service/delivery combination, I'm not surprised but not an design I would promote either. The legacy converts are tainted with a muddy puddle of schema, subject and logic, as logical as feeding a frog by throwing it in the blender with the frog-food.

    Well, if I remember correctly Steve Jones actually worked for PeopleSoft prior to them being bought by Oracle.  He may be able to answer part of that for us.

    So that's the reason for the website having so many redundant threads 😀
    😎
    And of course, blocking a spam will require dropping hundreds of tables.....😛

  • Jeff Moden - Friday, August 17, 2018 6:48 AM

    aaron.reese - Friday, August 17, 2018 6:20 AM

    and I expect in any given installation 50K of them are empty 🙂

    Except on Tuesdays. 😀

    Sorry to be a dummy...I rolled that one around in the noggin a few minutes, and zilch...

  • JackCarrington - Thursday, August 23, 2018 1:00 AM

    Jeff Moden - Friday, August 17, 2018 6:48 AM

    aaron.reese - Friday, August 17, 2018 6:20 AM

    and I expect in any given installation 50K of them are empty 🙂

    Except on Tuesdays. 😀

    Sorry to be a dummy...I rolled that one around in the noggin a few minutes, and zilch...

    No... you got it, actually.  It's meant to be a totally nonsensical "requirement" that just comes out of the blue and you wonder where it came from in a very "OK... I got nothing from that" manner that makes you ask "Seriously???".  It's meant purely in jest and as a form of acknowledgement and agreement that some other crazy thing, like having 50,000 tables and most of them are empty, does actually seem a bit crazy.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 15 posts - 16 through 29 (of 29 total)

You must be logged in to reply to this topic. Login to reply