How to optimize a query

  • begin tran

    declare @forward table
    (rowId int identity(1,1),
     ID int,
     CHAR_1 varchar(20),
     CHAR_4 varchar(20))

    declare @rowId int, @maxRowId int, @ID int,
            @CHAR_1 varchar(20),
            @CHAR_4 varchar(20)

    insert into @forward (ID, CHAR_1, CHAR_4)
    select ID, CHAR_1, CHAR_4
    from MESSAGE
    where CHAR_1 in
          (select CHAR_1
           from MESSAGE
           where CHAR_1 is not null
           group by CHAR_1
           having count(*) > 1)
      and CHAR_4 is null

    select @rowId = min(rowId),
           @maxRowId = max(rowId)
    from @forward

    while @rowId <= @maxRowId
    begin
        select @ID = ID,
               @CHAR_1 = CHAR_1,
               @CHAR_4 = CHAR_4
        from @forward
        where rowId = @rowId

        /* Delete only those duplicate records which are not present in MATTERS table */
        if not exists (select * from MATTERS as PM where PM.ID = @ID)
        begin
            /* Deleting the record from PROFILES table */
            delete from PROFILES where PROFILES.ID = @ID

            /* Deleting the record from BUSINESS_UNITS table */
            delete from BUSINESS_UNITS where BUSINESS_UNITS.ID = @ID

            /* Deleting the record from ORGANIZATIONS table */
            delete from ORGANIZATIONS where ORGANIZATIONS.ID = @ID

            /* Deleting the record from MESSAGE table */
            delete from MESSAGE where MESSAGE.ID = @ID
        end

        set @rowId = @rowId + 1
    end

    I am using this query to delete duplicate records from the MESSAGE table. I have 10,000 duplicate records, and when I run this query it takes 6 hours to delete them. I have to reduce the execution time.

    How can I optimize this query? How can I use JOINs here, joining the table variable with the MATTERS table? I am a bit confused.

    Please give your valuable suggestions to optimize this query.

    Thanks in advance!

  • First, transactions take many locks, and the chance that somebody is waiting on one of those locks is high. Your transactions should be as short as possible. A BEGIN TRAN outside of the loop with the COMMIT (which you forgot, by the way) at the very end is not the shortest possible transaction. That means: move your BEGIN TRAN to just before the first DELETE, and put your COMMIT just after the last DELETE - inside the loop.
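    For example, the loop body could wrap just the four DELETEs in its own short transaction (a sketch reusing the tables from the original script):

        while @rowId <= @maxRowId
        begin
            select @ID = ID from @forward where rowId = @rowId

            begin tran
                delete from PROFILES where PROFILES.ID = @ID
                delete from BUSINESS_UNITS where BUSINESS_UNITS.ID = @ID
                delete from ORGANIZATIONS where ORGANIZATIONS.ID = @ID
                delete from MESSAGE where MESSAGE.ID = @ID
            commit tran

            set @rowId = @rowId + 1
        end

    That way each iteration holds its locks only for the duration of its own four DELETEs.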

    Install the sp_WhoIsActive script and see what you are waiting on (a lock? disk?).

    Second, check for missing indexes on child tables: if any "child" tables reference a table you want to delete from, they should have an index on the foreign key column(s), or you will be faced with a full scan of every child table, even if you delete just one row from the parent table.
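    For example, if a (hypothetical) child table MESSAGE_ATTACHMENTS referenced MESSAGE(ID) through a MessageID column, you would want something like:

        create index IX_MESSAGE_ATTACHMENTS_MessageID
        on MESSAGE_ATTACHMENTS (MessageID)

    (MESSAGE_ATTACHMENTS and MessageID are made-up names here - check the actual foreign keys pointing at the tables you delete from.)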

    With the sqlxdetails plugin you can check those tables and indexes in seconds (www.sqlxdetails.com).

    With a SQL trace (Profiler), select the "Showplan XML" event and see whether there are any scans of large tables or indexes (yes, an index scan is a bad thing! An index seek/lookup is good). Also include the SP:StmtStarting event to see exactly which statement takes a long time.

    Third, deleting one-by-one is a bad thing. There are much more efficient methods that delete many rows with a single DELETE command, using TOP (N) and the OUTPUT clause, followed by a @@ROWCOUNT check that ends the loop when there are no more rows. That is called "deleting in chunks". It would be better to delete everything at once if you have a data warehouse, but if you are on a production OLTP system it is better to delete in chunks, so you do not hold the locks too long and block other simultaneous users. But never one-by-one; that is the slowest method. There are also better techniques to find duplicates. But do that only if the first two suggestions (which are really quick to check) do not solve your problem.
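    A minimal sketch of the chunked pattern, assuming the duplicate IDs have already been collected into a #Staging temp table:

        while 1 = 1
        begin
            delete top (1000) M
            from MESSAGE M
            join #Staging S on M.ID = S.ID

            if @@rowcount = 0 break
        end

    Each pass removes up to 1000 rows in one statement, and the loop ends as soon as a pass deletes nothing.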

    _____________________________________________________
    Microsoft Certified Master: SQL Server 2008
    XDetails Addin - for SQL Developers
    blog.sqlxdetails.com - Transaction log myths
  • Hi Vedran, thanks for the reply. How can I join the table variable with the MATTERS table to delete duplicates? I am hoping that joins can solve this - loading the duplicate records into the table variable and then deleting.

  • I'd use a "slightly" different approach:

    Instead of HAVING (COUNT(*) > 1) I'd use a CTE with the ROW_NUMBER() OVER (PARTITION BY CHAR_1 ORDER BY <insert_preferred_column> DESC) AS Row approach and query for Row > 1.

    I'd insert the ID values into a staging temp table, adding an index on the ID column either after the insert or before it (and controlling the ascending order during the insert).
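    A sketch of that CTE, assuming a #Staging temp table with an ID column; the ORDER BY column below is a placeholder - pick the column that decides which duplicate survives:

        ;with Dupes as
        (
            select ID,
                   row_number() over (partition by CHAR_1
                                      order by ID desc) as Row
            from MESSAGE
            where CHAR_1 is not null
              and CHAR_4 is null
        )
        insert into #Staging (ID)
        select ID from Dupes where Row > 1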

    Then I would use the staging table all at once:

    DELETE FROM PROFILES
    INNER JOIN Staging ON PROFILES.ID = Staging.ID

    /* Deleting the record from BUSINESS_UNITS table */
    DELETE FROM BUSINESS_UNITS
    INNER JOIN Staging ON BUSINESS_UNITS.ID = Staging.ID

    /* Deleting the record from ORGANIZATIONS table */
    DELETE FROM ORGANIZATIONS
    INNER JOIN Staging ON ORGANIZATIONS.ID = Staging.ID

    DELETE FROM MESSAGE
    INNER JOIN Staging ON MESSAGE.ID = Staging.ID

    It should be a matter of seconds instead of hours (depending on the original table size and the indexes on those tables).



    Lutz
    A pessimist is an optimist with experience.

    How to get fast answers to your question
    How to post performance related questions
    Links for Tally Table, Cross Tabs and Dynamic Cross Tabs, Delimited Split Function

  • Also, to help you discover where most of the processing time is being spent, you could do the following in SSMS:

    SET STATISTICS IO ON
    SET STATISTICS TIME ON

    -- run your query here

    SET STATISTICS IO OFF
    SET STATISTICS TIME OFF

    If you are able to let it run, also include the actual execution plan; this will allow you to identify full table scans etc. that are hindering your performance.

    If not, you could use the estimated plan, but this will not necessarily be the plan actually used.

    MCT
    MCITP Database Admin 2008
    MCITP Database Dev 2008
    www.jnrit.com.au/Blog.aspx

  • tinnusqlserver (4/27/2011)


    how can i join the 'table variable' with the 'matters' table to delete duplicates

    I would rather use a temp table than a table variable, because a table variable does not have statistics; the optimizer assumes it has only one row and can choose a suboptimal (slow) execution plan for the SQL command.
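    For example, a temp-table version of the staging table could look like this (a sketch; the PRIMARY KEY gives you both an index and real statistics):

        create table #Staging
        (
            ID int not null primary key
        )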

    LutzM showed you a very nice method to join in a DELETE command. Another syntax, which I prefer for DELETE and UPDATE commands with multiple joins, instead of

    DELETE FROM PROFILES

    INNER JOIN Staging on PROFILES.ID=Staging.ID

    is

    DELETE p -- you can replace that with "SELECT *" if you want to see the rows

    FROM PROFILES p

    JOIN Staging s on p.ID = s.ID

    _____________________________________________________
    Microsoft Certified Master: SQL Server 2008
    XDetails Addin - for SQL Developers
    blog.sqlxdetails.com - Transaction log myths
  • Vedran Kesegic (4/27/2011)


    tinnusqlserver (4/27/2011)


    how can i join the 'table variable' with the 'matters' table to delete duplicates

    I would rather use a temp table than a table variable, because table variable does not have statistics and optimizer assumes it has only one row and can find suboptimal (slow) execution plan for the sql command.

    LutzM showed you a very nice method to join in DELETE command. Another syntax which I prefer for DELETE and UPDATE command with multiple joins over

    DELETE FROM PROFILES

    INNER JOIN Staging on PROFILES.ID=Staging.ID

    is

    DELETE p -- you can replace that with "SELECT *" if you want to see the rows

    FROM PROFILES p

    JOIN Staging s on p.ID = s.ID

    Yup. Good point, Vedran. I actually just copied and modified the original statement without thinking about the syntax in general. But you're definitely right.



    Lutz
    A pessimist is an optimist with experience.

    How to get fast answers to your question
    How to post performance related questions
    Links for Tally Table, Cross Tabs and Dynamic Cross Tabs, Delimited Split Function
