Very Large Table

  • what is the quickest way to obtain the max(id) from a table with over 50 million rows?

    Currently the simple select written like this:

    SELECT max(id) FROM tablename WITH (NOLOCK)

    is taking forever. I will need to load records into another table based off of this information. I'd like to automate this but have to get that max(id) first. Is there a faster way?

    Thanks


    Aurora

  • The nolock will actually have more logical IO than not using it (don't beleive me, test it)

    Try this

    set statistics IO on

    SELECT max(id) FROM tablename WITH (NOLOCK)

    Now look at logical IO

    now run

    SELECT max(id) FROM tablename

    your logical IO will be less, it probably doesn't make a big difference though.

    My suggestion would be to place a non-clustered index on the ID column and test the logical IO (I bet it will be much faster, especially if you do not use composite clustered indexes)

    You could also do

    select top 1 id from tablename

    order by id desc

    but that will roughly be the same IO as your max statement.

    HTH

    Ray Higdon MCSE, MCDBA, CCNA

    ------------

    Ray Higdon MCSE, MCDBA, CCNA

  • As Ray suggested, create no-clustered index on that ID column and Check the execution plan to enaure the select uses the index.

  • Thanks guys, that worked like a charm! Oddly enough, I made the mistake no DBA should ever make and that is assume. Naturally I thought the index was already in place (I've inherited this stuff)and that seemed obvious. Nonetheless, two lessons were learned today. Thaks for your input.

    Aurora


    Aurora

  • Okay another caveat, I need the record back with not only the max(id) but also the max(entrydate). How can I get SQL to return just one row, fast?

    Aurora


    Aurora

  • Have you tried same suggestion to column 'entrydate'?

  • Yes, but it will not return 1 row. Here's what I'm trying to do: Write a DTS package that will append only Inserts into a SQL table from an Oracle table. We used to truncate the SQL table and refresh it on a nightly basis. That was fine when the table was small but now its over 50 million rows. So I only want to bring over the new inserts. Any thoughts on that?

    Thanks so much for all your help.

    Aurora


    Aurora

  • quote:


    Yes, but it will not return 1 row


    It should return only one row.

    quote:


    So I only want to bring over the new inserts. Any thoughts on that?


    In your DTS task, select only records that matches the enteydate. You need setup Microsoft ODBC driver for ORACLE DSN in SQL Server machine or create a linked server to Oracle database.

  • So in my DTS task I would select the records that are => 'current max(entrydate)'? I don't want to bring over duplicates. So perhaps it needs to be > current max(entrydate). But I need to get that max entrydate before I append data nightly. Any thoughts on the most efficient way to do that?


    Aurora

  • What about to use GETDATE function to get the current date and time and other Date and Time Functions. See BOL for details.

  • DOes the entry date have an index on it and id the id field an indentity field that increments each entry and thus should have a value that will always be relative to the entry date. In otherwords will the max id also occurr on the max entry date. If so then try something like this

    select max(id) as A, max(entrydate) as B from tblX where entrydate = (select max(entrydate) from tblX)

  • Well, I'm not sure I understand the problem correctly, but one of the main questions seems to be whether the last row (with MAX(id)) always contains the highest entrydate - even in case that you insert values for several days at once..? Also, are you sure that there will never be any new records in your source table with a date that already has some records in the target table? If both is OK, then you can get the date e.g. with

    SELECT TOP 1 entrydate FROM table ORDER BY id DESC

    and then select from the source only those records that have higher date.

    If there is no such rule, then IMHO the best way to avoid duplicates is to have a unique ID in the source table which would be written to the target table during inserts - that allows you to control any possible problems quite easily.

Viewing 12 posts - 1 through 11 (of 11 total)

You must be logged in to reply to this topic. Login to reply