can this UPDATE statement be optimized?

  • I have an imported table with group names that are sharing a special code among rows. The business rule states that the most recently entered group should have the special code and that all others should be blank. The example will probably make more sense. My question is this: is there a more efficient way to do my update query?

    The scenario requires that the table be loaded as is seen in the example. The only option to change the data is through an UPDATE statement.

    Thanks!

    IF OBJECT_ID('tempdb..#Test') IS NOT NULL DROP TABLE #Test

    --

    CREATE TABLE #Test (

    GroupName varchar(50) NOT NULL,

    SpecCode varchar(10) NULL,

    DateEntered datetime )

    --

    INSERT #Test (GroupName, SpecCode, DateEntered)

    SELECT 'Test Group 1', '1111-1111', '1/1/2009'

    UNION ALL

    SELECT 'Test Group 2', '1111-1111', '1/2/2009'

    UNION ALL

    SELECT 'Test Group 3', '1111-1111', '2/1/2009'

    UNION ALL

    SELECT 'Test Group 4', '2222-2222', '1/1/2009'

    UNION ALL

    SELECT 'Test Group 5', '3333-3333', '1/1/2009'

    UNION ALL

    SELECT 'Test Group 6', '3333-3333', '1/2/2009'

    --

    /*

    Table looks like this:

    --

    GroupName SpecCode DateEntered

    Test Group 11111-11112009-01-01

    Test Group 21111-11112009-01-02

    Test Group 31111-11112009-02-01

    Test Group 42222-22222009-01-01

    Test Group 53333-33332009-01-01

    Test Group 63333-33332009-01-02

    --

    Needs to look like this after UPDATE:

    --

    GroupName SpecCode DateEntered

    Test Group 1 2009-01-01

    Test Group 2 2009-01-02

    Test Group 3 1111-1111 2009-02-01

    Test Group 4 2222-2222 2009-01-01

    Test Group 5 2009-01-01

    Test Group 6 3333-3333 2009-01-02

    */

    --

    -- Update table so that for all rows that are

    -- sharing a SpecCode, only the most recently

    -- entered row keeps the code. The other rows

    -- have a blank SpecCode. But this only applies

    -- to rows which are sharing the SpecCode

    --

    SELECT *

    FROM #Test

    --

    UPDATE t

    SET SpecCode = ''

    FROM #Test t

    JOIN (

    SELECT GroupName, DateEntered, SpecCode, ROW_NUMBER() OVER (PARTITION BY SpecCode ORDER BY DateEntered DESC) AS RowRank

    FROM #Test

    WHERE SpecCode IN (

    SELECT SpecCode

    FROM #Test

    WHERE SpecCode IS NOT NULL

    GROUP BY SpecCode

    HAVING COUNT(*) > 1 )

    ) x

    ON t.GroupName = x.GroupName

    AND t.DateEntered = x.DateEntered

    AND t.SpecCode = x.SpecCode

    AND x.RowRank <> 1

    --

    SELECT *

    FROM #Test

  • Here's how I'd write the update. You'd have to test on a volume of data to see whether it's enough better to be worth it.

    ;with CTE as

    (select SpecCode,

    row_number() over (partition by SpecCode order by DateEntered desc) as Row

    from #Test)

    update CTE

    set SpecCode = ''

    where Row > 1;

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • How about this:

    update a

    set speccode = ''

    from #test a

    where not exists

    (select 1

    from #test b

    where a.speccode = b.speccode

    group by speccode

    having max(dateentered) = a.dateentered)

    Randy

  • The requirement is "the most recently entered group should have the special code and that all others should be blank."

    GSquared's solution needs a minor correction to partition by GroupName instead of SpecCode but is much faster than an exists solution.

    GSquared solution:

    Table 'Groups'. Scan count 1, logical reads 5817

    SQL Server Execution Times: CPU time = 60 ms, elapsed time = 107 ms.

    Exists solution:

    Table 'Groups'. Scan count 2, logical reads 5834

    Table 'Worktable'. Scan count 101, logical reads 18313

    SQL Server Execution Times: CPU time = 291 ms, elapsed time = 349 ms.

    CREATE TABLE Groups (

    GroupName varchar(50) NOT NULL,

    DateEntered datetime NOT NULL,

    SpecCode varchar(10) NULL,

    Constraint Groups_P primary key (GroupName, DateEntered))

    /*

    The master.dbo.Tally table is the conventional "numbers" or "sequences" table and the SQL to create that table and populate is in Jeff Moden's article at http://www.sqlservercentral.com/articles/T-SQL/62867/

    */

    INSERT Groups

    (GroupName, SpecCode, DateEntered)

    SELECT'TEST GROUP ' + cast( Groups.N as varchar(8) )

    ,cast( Groups.N as varchar(8) )

    ,DATEADD(dd, Days.N,'2009-06-01')

    FROMmaster.dbo.Tally as Groups

    CROSS JOIN

    master.dbo.Tally as Days

    WHEREGroups.N between 1 and 100

    ANDDays.N between 1 and 30

    go

    update statistics Groups with fullscan

    go

    set statistics io on

    set statistics time on

    go

    begin tran

    ;with CTE as

    (select SpecCode,

    row_number() over (partition by GroupName order by DateEntered desc) as Row

    from Groups)

    update CTE

    set SpecCode = ''

    where Row > 1;

    select * fromGroups

    rollback

    begin tran

    UPDATEGroups

    setSpecCode = ''

    whereEXISTS

    (Select 1

    fromgroups as GroupLater

    whereGroupLater.GroupName = Groups.GroupName

    andGroupLater.DateEntered> Groups.DateEntered

    )

    select * fromGroups

    rollback

    SQL = Scarcely Qualifies as a Language

  • Here is another way:UPDATE

    T

    SET

    SpecCode = ''

    FROM

    (

    SELECT SpecCode, ROW_NUMBER() OVER (PARTITION BY SpecCode ORDER BY DateEntered DESC) AS RowRank

    FROM #Test

    ) AS T

    WHERE T.RowRank > 1

    EDIT: I'm getting lazy and didn't read all of Carl's solution. I just saw that he was using a CTE which is not needed so I came up with this solution, which I see is bascially the same as Carls CTE.

  • Actually, GSquared proposed the CTE solution but I was interested in comparing the performance to the "legacy" exists solution, which is needed when the RDBMS does not support window functions such as RANK.

    Here is another solution that uses except and also works better than the EXISTS solution and shows the need for the "corresponding" option on EXCEPT.

    Table 'Groups'. Scan count 3, logical reads 6051

    Table 'Worktable'. Scan count 1, logical reads 6114

    SQL Server Execution Times: CPU time = 80 ms, elapsed time = 136 ms.

    begin tran

    UPDATEGroups

    setSpecCode = ''

    FROMGROUPS

    JOIN(select GroupName, DateEntered

    from GROUPS

    EXCEPT

    select GroupName, MAX(DateEntered)

    from GROUPS

    GROUP BY GroupName

    ) as GroupLatest

    on GroupLatest.GroupName = Groups.GroupName

    rollback

    SQL = Scarcely Qualifies as a Language

  • OOPS - For the EXCEPT, missed a condition in the ON, so please add:

    and GroupLatest.DateEntered = Groups.DateEntered

    Table 'Groups'. Scan count 3, logical reads 5851

    Table 'Worktable'. Scan count 1, logical reads 5914

    SQL Server Execution Times: CPU time = 110 ms, elapsed time = 149 ms.

    SQL = Scarcely Qualifies as a Language

  • Carl Federl (6/17/2009)


    Actually, GSquared proposed the CTE solution but I was interested in comparing the performance to the "legacy" exists solution, which is needed when the RDBMS does not support window functions such as RANK.

    Ahh yes.. indeed GSquared did. *sigh* I need to get my head in the game.. 🙂

  • For fun here is a one more way using a LEFT OUTER JOIN:UPDATE

    T

    SET

    speccode = ''

    FROM

    #Test AS T

    LEFT OUTER JOIN

    (

    select speccode, MAX(dateentered) AS MaxDate

    from #test

    group by speccode

    ) AS D

    ON T.SpecCode = D.SpecCode

    AND T.dateentered = d.MaxDate

    WHERE

    D.SpecCode IS NULL

  • No, if you change mine to partition by GroupName, it doesn't get the desired end result. It needs to be the way I wrote it to comply with the specification.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

Viewing 10 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply