What is the best practice for maintaining and using a big log database?

  • sm_iransoftware (1/31/2015)


    Hi Again

I want to do something with this database.

First of all, I should start with a small database.

But the current DB has many connections from the site, and I should replace it with a new, empty database (with the same structure).

    This is my plan :

1- Create a new database with the same structure.

2- Rename the current database to olddb.

with this code:

    USE master

    GO

    EXEC sp_dboption CurDataBase, 'Single User', True

    EXEC sp_renamedb 'CurDataBase', 'OldDataBase'

    GO

3- Rename Newdb to the current DB.

    USE master

    GO

    EXEC sp_renamedb 'NewDataBase', 'CurDataBase'

Is this correct, and is the T-SQL code OK? (Please don't forget the many connections to CurDataBase, which is a log DB. Losing a few seconds of data is not a problem.)

    Thank you

I typically use RESTRICTED_USER rather than SINGLE_USER. I believe that SINGLE_USER will allow one (1) connection from ANY authenticated user, which may have unintended consequences. RESTRICTED_USER will allow multiple connections from anyone with SYSADMIN membership (which should be just you or another DBA). That's probably closer to what you intend. However, if the web application account is a member of SYSADMIN, then you've screwed the pooch, and it doesn't matter.

    The "WITH ROLLBACK AFTER X SECONDS" option will first allow X seconds for existing connections to complete, while blocking new connections, before setting database in restricted mode.

ALTER DATABASE CurDataBase SET RESTRICTED_USER WITH ROLLBACK AFTER 30 SECONDS;
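
Folded into your three steps, the whole swap could look something like the sketch below. This uses ALTER DATABASE in place of sp_dboption and sp_renamedb (both deprecated); the database names are your own placeholders:

    USE master;
    GO
    -- quiesce the busy log database, giving open transactions 30 seconds to finish
    ALTER DATABASE CurDataBase SET RESTRICTED_USER WITH ROLLBACK AFTER 30 SECONDS;
    -- swap the names; the new, empty database takes over the original name
    ALTER DATABASE CurDataBase MODIFY NAME = OldDataBase;
    ALTER DATABASE NewDataBase MODIFY NAME = CurDataBase;
    -- the renamed OldDataBase keeps the RESTRICTED_USER setting;
    -- reopen it only if something other than a DBA still needs to read it
    -- ALTER DATABASE OldDataBase SET MULTI_USER;
    GO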

If the web application were logging click stream records to a file, with periodic table loads, then this would be a much simpler process. You don't want to insert click stream records directly into the database. While you've got the hood up on the database, consider making that change; see the sketch below.
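
For example, a scheduled load along these lines (table name, file path, and format are only placeholders) would replace thousands of singleton inserts per hour with one bulk operation:

    -- hypothetical hourly load of the click stream file into the audit table
    BULK INSERT dbo.ClickStream
    FROM 'D:\weblogs\clickstream_current.csv'
    WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n', TABLOCK, BATCHSIZE = 50000);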

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

Maybe I missed it somewhere, but it seems as though you are reporting from the same system you insert into. You generally want to build a warehouse to store your historical data and keep one to a few months on your operational system, depending on your business needs of course. In the warehouse you can apply the partitioning as you see fit for your reporting needs; a rough sketch follows below.

You can also create an OLAP cube that is itself partitioned. It summarises data, and once that's done you don't need to rerun the process on the older partitions of the cube; thus you can take that detailed source data out of your warehouse database and into some other form of archive.
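
As a rough sketch of the warehouse side (all names and boundary dates are purely illustrative), monthly partitioning of the history table could look like this:

    CREATE PARTITION FUNCTION pf_ClickMonth (date)
        AS RANGE RIGHT FOR VALUES ('2015-01-01', '2015-02-01', '2015-03-01');

    CREATE PARTITION SCHEME ps_ClickMonth
        AS PARTITION pf_ClickMonth ALL TO ([PRIMARY]);

    -- the history table is created on the partition scheme, keyed by click date,
    -- so old months can be switched out or archived without touching current data
    CREATE TABLE dbo.ClickStreamHistory
    (
        ClickDate date             NOT NULL,
        PageId    int              NOT NULL,
        SessionId uniqueidentifier NOT NULL
    ) ON ps_ClickMonth (ClickDate);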

    ----------------------------------------------------

  • MMartin1 (2/3/2015)


Maybe I missed it somewhere, but it seems as though you are reporting from the same system you insert into. You generally want to build a warehouse to store your historical data and keep one to a few months on your operational system, depending on your business needs of course. In the warehouse you can apply the partitioning as you see fit for your reporting needs.

You can also create an OLAP cube that is itself partitioned. It summarises data, and once that's done you don't need to rerun the process on the older partitions of the cube; thus you can take that detailed source data out of your warehouse database and into some other form of archive.

All of that is good advice. However, in this case I don't think he really needs a data warehouse or partitioning; simply capture those three million daily inserts into a flat file and then perform hourly bulk inserts into the click stream audit table. It's not really complicated; it's just that the 30 table inserts per second are killing the web application and the reporting team.

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • Eric M Russell (2/3/2015)



All of that is good advice. However, in this case I don't think he really needs a data warehouse or partitioning; simply capture those three million daily inserts into a flat file and then perform hourly bulk inserts into the click stream audit table. It's not really complicated; it's just that the 30 table inserts per second are killing the web application and the reporting team.

I just have a hunch that the audit table will become the next big issue if it isn't properly planned. For one thing, as I think you mentioned, they need to consider whether the full URL string can be substituted with a simple page code number; I don't see a good reason why they cannot do this in the permanent audit table. A sketch of what I mean follows below.
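
Something like this hypothetical lookup table is all it would take (the names and the PageId column on the click table are only illustrative); the permanent audit rows then carry a 4-byte PageId instead of repeating the URL text on every click:

    CREATE TABLE dbo.Page
    (
        PageId int IDENTITY(1,1) NOT NULL PRIMARY KEY,
        Url    varchar(2000)     NOT NULL
    );

    -- reports join back to dbo.Page only when the full URL text is actually needed
    SELECT p.Url, COUNT(*) AS Hits
    FROM   dbo.ClickStream AS c
    JOIN   dbo.Page        AS p
           ON p.PageId = c.PageId
    GROUP BY p.Url;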

    ----------------------------------------------------

  • MMartin1 (2/3/2015)



I just have a hunch that the audit table will become the next big issue if it isn't properly planned. For one thing, as I think you mentioned, they need to consider whether the full URL string can be substituted with a simple page code number; I don't see a good reason why they cannot do this in the permanent audit table.

Spot on. If you plan now for the maintenance nightmare that the audit table will quickly become, it won't ever become a nightmare, and, yes, some form of temporal partitioning is usually the best (IMHO) way to go on such audit tables. Using a staging table as the target of well-written audit triggers also helps; a minimal sketch follows below.
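
A minimal sketch of that pattern, with purely illustrative table and column names: a set-based trigger writes only narrow rows to a staging table, which a scheduled job drains into the wide history table later.

    CREATE TABLE dbo.AuditStaging
    (
        AuditId   bigint IDENTITY(1,1) NOT NULL PRIMARY KEY,
        OrderId   int      NOT NULL,
        Action    char(1)  NOT NULL,  -- 'I' = insert, 'U' = update
        AuditDate datetime NOT NULL DEFAULT (GETDATE())
    );
    GO
    CREATE TRIGGER trg_Orders_Audit ON dbo.Orders
    AFTER INSERT, UPDATE
    AS
    BEGIN
        SET NOCOUNT ON;
        -- set-based: one insert covers however many rows the statement touched
        INSERT INTO dbo.AuditStaging (OrderId, Action)
        SELECT i.OrderId,
               CASE WHEN EXISTS (SELECT 1 FROM deleted) THEN 'U' ELSE 'I' END
        FROM inserted AS i;
    END;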

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

To cover some of what has been said, 35 inserts per second should be trivial for a well written audit system. It's usually the tables or a shedload of inappropriate indexes that cause things to be slow. I'd love to see the triggers on this one. If they're "generic", they're slow, period. I've also seen people write some of the craziest things into audit triggers. A sketch of a lean insert target follows below.
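
For what it's worth, one possible "lean" shape for a table taking that kind of insert rate (names are only illustrative) is narrow rows, a single ever-increasing clustered key, and nothing extra on the insert path:

    CREATE TABLE dbo.ClickStream
    (
        ClickId   bigint IDENTITY(1,1) NOT NULL,
        ClickDate datetime2(0)     NOT NULL CONSTRAINT DF_ClickStream_ClickDate DEFAULT (SYSDATETIME()),
        SessionId uniqueidentifier NOT NULL,
        PageId    int              NOT NULL,
        CONSTRAINT PK_ClickStream PRIMARY KEY CLUSTERED (ClickId)
    );
    -- reporting indexes belong on the warehouse/archive copy, not on the insert path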

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

You can shrink your log database's transaction log by changing its recovery model using SQL Server Management Studio. To do this, log in to SSMS and run the following in a query window, substituting the relevant names; and yes, before doing so, take a full backup of your database.

    DBCC SHRINKFILE(<log_file_name_Log>)

    BACKUP LOG <database> WITH TRUNCATE_ONLY

    DBCC SHRINKFILE(<log_file_name_Log>)

You can also consider database partitioning, configuring the log record size to manage log database growth, and configuring partition rollover to manage log database size. The options above are the main ways to maintain a large log database. I hope this helps; if you need more details, just ask.

    Thanks & Regards

  • Jeff Moden (2/3/2015)


To cover some of what has been said, 35 inserts per second should be trivial for a well written audit system. It's usually the tables or a shedload of inappropriate indexes that cause things to be slow. I'd love to see the triggers on this one. If they're "generic", they're slow, period. I've also seen people write some of the craziest things into audit triggers.

From what the OP described, this isn't so much an audit table; he didn't mention triggers. It's more like click streaming, where every time a user clicks on a URL the web application inserts into the table. So in addition to the overhead of 35 record inserts per second, the web application itself is burdened with constantly making connections and database calls. Partitioning would marginally help the BI reporting team with blocking issues, but batching up inserts into a flat file with hourly bulk loads would be a game changer for both the database and the application.

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

He may also want to consider placing a unique key constraint on session + product + time or date, with the IGNORE_DUP_KEY option set, to ensure that redundant click events aren't being recorded; see the sketch below. If I had to guess, the BI team is probably more concerned with how many unique impressions each product gets on a periodic basis than with counting the total number of times that user 2834312 clicked on a specific product. That could potentially pare down the table by 90%, especially if they are recording link hover-over events.
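
Roughly like this (table and column names are only illustrative); with IGNORE_DUP_KEY on, duplicate click rows are silently discarded instead of raising an error:

    CREATE UNIQUE NONCLUSTERED INDEX UX_ClickStream_Dedup
        ON dbo.ClickStream (SessionId, ProductId, ClickDate)
        WITH (IGNORE_DUP_KEY = ON);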

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

I am using SQL Server Enterprise Edition.
