Handling Huge SQL Database

  • Hi,

    How to know the following details when selecting from an SQL database of size in Terabytes.

    (let us say it has years of collected data in the tables)

    1. Is there any difference between taking 1 day data and 10 days data from this huge database.

    2. How to know that how much [tempdb] is being occupied for each queries.

    please let me know is there a way to find these details.

    Thank you

    regards

    Enbee.

    Regards | Enbee

  • Enbee (7/2/2008)


    1. Is there any difference between taking 1 day data and 10 days data from this huge database.

    Depends on the indexes that you have.

    2. How to know that how much [tempdb] is being occupied for each queries.

    You can monitor TempDB the same way you monitor user databases for space used and the like. Also you can use he perfmon per database counters for TempDB.

    You can also use the DMV sys.dm_db_task_space_usage to see usage per session and request

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • On the first question, it depends on your data, indexes and what your query is doing. I would guess that 10 years data would be 10 times as many rows as 1 year, and would thus take at least a little longer. On the other hand, a query like:

    declare @Value sql_variant

    select @Value = Col1

    from dbo.Table1

    where Col2 > dateadd(year, -1, getdate())

    Something like that may run exactly the same on 1 year vs 10 years.

    In other words, there's really no way to tell without testing, or at least looking at some execution plans.

    On the second question, Gail's answer is probably the best way to do this. Before I knew how to monitor these things correctly, I used to open the folder where the tempdb file was, and just watch the file grow while a query ran. It works, but there are better ways. 🙂

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • The first thing to do is to really "Know your data". You need to have an idea of what is stored and how it is. Then you will be able to tell what's appropriate for the hardware you have.


    * Noel

  • and know how it's queried. If the queries use dates, index on the dates, but you have to know how the queries, or at least the common ones, are structured to index for them.

  • As far as 1 day or 10 days data access, assuming an index (either clustered or non) AND roughly smooth data distribution levels, you should have an index seek/bookmark lookup for either for 1 year or 10 years data. 10/365 is about 2.5% or so and is by far the highest percentage of the 4 possible combinations here, and that should be less than the level at which the optimizer would choose to switch to a table scan for access. If you are non indexed on the date column, well all accesses will be equal - a full table scan. You may need to pack a lunch for that on the 10 year scan. 😀

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • Thank you all for the responses.

    Here I have one more question.

    Lets say, I have a table of size 2 TB(say Table000 ).

    If, I have to run a simple query on that table (SELECT Top 1000 * FROM Table000)

    How much tempdb space should be available.

    or otherwise, is there any dependency with tempDB space for running queries on huge databases.

    please let me know your options.

    Regards | Enbee

  • That particular query shouldn't need tempDB space, as it has no query operators that require a work table. Sorts, hash joins, hash aggregates and other operators require space in TempDB. Simple seeks/scans don't (usually).

    or otherwise, is there any dependency with tempDB space for running queries on huge databases

    Depends on the type of queries you are running. It's not something that has a fixed value that can be predicted upfront

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Thank you.

    Regards | Enbee

Viewing 9 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply