SSC Members Name

  • Of course!

    Actually I thought some of you might have a better solution than I might come up with.

    Anyone got a better performing solution than Gail?

  • Nope, just would have written it with a CTE instead of a derived table.

  • Steve Jones - Editor (4/30/2009)


    Anyone got a better performing solution than Gail?

    Heh. Now you know we'll need to see the execution plan to better answer that. 😀

    Seriously though, we do need to see it. 🙂

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • Hmm, depending on the duplicates density, this might be faster:

    ;WITH cteUsers as (

    SELECT UserID

    , UserName

    , ROW_NUMBER() OVER (Partition By UserName Order By LastLoggedIn DESC) incr

    FROM Users

    )

    UPDATE cteUsers

    SET UserName = UserName + CAST(incr-1 AS VARCHAR(3))

    Where incr > 1

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • RBarryYoung (4/30/2009)


    Steve Jones - Editor (4/30/2009)


    Anyone got a better performing solution than Gail?

    Heh. Now you know we'll need to see the execution plan to better answer that. 😀

    Use this to populate rather than a few hardcoded rows

    Insert Into Users (UserName, LastLoggedIn)

    SELECT TOP 40000 CHAR(65+FLOOR(RAND(a.column_id*801020 + a.column_id*705020)*12)) + CHAR(65+FLOOR(RAND(a.column_id*6010 + b.object_id)*5)), DATEADD(dd,RAND(a.column_id*600 + b.object_id/1.5)*500,'2008/01/01')

    FROM master.sys.columns a CROSS JOIN master.sys.columns b

    It gives lots of dups, expand the varchar in the update to VARCHAR(5).

    Exec plan attached.

    SQL Server parse and compile time:

    CPU time = 0 ms, elapsed time = 0 ms.

    Table 'Users'. Scan count 2, logical reads 80538, physical reads 0

    Table 'Worktable'. Scan count 1, logical reads 80842, physical reads 0

    SQL Server Execution Times:

    CPU time = 422 ms, elapsed time = 445 ms.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • RBarryYoung (4/30/2009)


    Hmm, depending on the duplicates density, this might be faster:

    Better on IOs, worse on time (at least on my PC)

    SQL Server parse and compile time:

    CPU time = 0 ms, elapsed time = 0 ms.

    Table 'Users'. Scan count 1, logical reads 81083

    SQL Server Execution Times:

    CPU time = 1250 ms, elapsed time = 1291 ms.

    Take the where clause out and it's way better than the update with join

    ;WITH cteUsers as (

    SELECT UserID

    , UserName

    , ROW_NUMBER() OVER (Partition By UserName Order By LastLoggedIn DESC) incr

    FROM Users

    )

    UPDATE cteUsers

    SET UserName = UserName + CAST(incr-1 AS VARCHAR(5))

    SQL Server parse and compile time:

    CPU time = 0 ms, elapsed time = 22 ms.

    Table 'Users'. Scan count 1, logical reads 81199.

    SQL Server Execution Times:

    CPU time = 438 ms, elapsed time = 458 ms.

    Since there's no way to index that incr column, the where's not helping much. Might help with writes. I've got way too many dupes in the test data to tell.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • GilaMonster (4/30/2009)


    RBarryYoung (4/30/2009)


    Hmm, depending on the duplicates density, this might be faster:

    Better on IOs, worse on time (at least on my PC)

    Yeah, I'm pretty sure that there's a whole "seek vs scan" threshold in there, even on the output side. My SWAG would be less than 5%, seek (use the WHERE clause), more than 20%, definitely scan (no WHERE clause), in-between, "it depends".

    Take the where clause out and it's way better than the update with join

    ;WITH cteUsers as (

    SELECT UserID

    , UserName

    , ROW_NUMBER() OVER (Partition By UserName Order By LastLoggedIn DESC) incr

    FROM Users

    )

    UPDATE cteUsers

    SET UserName = UserName + CAST(incr-1 AS VARCHAR(5))

    Don't forget to put the CASE back in if you do.

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • Ha! I knew someone would take up the challenge.

    I'll run a couple execution plans on these at night, see which one works better in my 800k row table.

  • RBarryYoung (4/30/2009)


    Don't forget to put the CASE back in if you do.

    True. That's what I get for posting while playing computer games....

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • OK, I used to following to generate test data:

    delete from Users

    Insert into Users

    Select TOP 1000000

    Left(c1.name, 19)

    + right(cast(newid() as varchar(36)), 2)

    + Left(c2.name, 19)

    , Cast( abs(c1.object_id+c2.column_id)/1000.0 as datetime) as LastLoggedIn

    From master.sys.system_columns c1, master.sys.system_columns c2

    Which gives about 50,000 duplicates which is 5% (it varies though and this technique does not produce any really long chains, though I don't think that that should matter much).

    At that density the WHERE version takes about 5 sec on my system and the No-WHERE version takes about 20. So 10% is probably a good dividing line.

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • The other thing to beware of is that there are issues with names whose length hits the max length (40 chars). I didn't see it until I used my last generating method, because it tends to produce really long names.

    The upshot is that if you have 40-char names that are duplicates, you cannot just add a digit on to the end because then the name will be too long and it will fail.

    I doubt that you have very many of these, Steve and you may not have any, so it's questionable how much work you want to do to cover this case. I would suggest just ignoring them by putting a "LEFT(... , 40)" around the final value of the UserName assignment.

    Then if you have any of them and still need to fix them, use a different query or just do it by hand. Seriously, there cannot be that many, who uses a 40-character name?

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • ALTER TABLE Users ALTER COLUMN UserName VARCHAR(42)

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Here's another approach (using a large hammer):

    DECLARE@sql VARCHAR(500);

    SET@sql = 'ALTER DATABASE ' + DB_NAME() + ' SET RECOVERY BULK_LOGGED;'

    EXECUTE(@sql)

    IF@@error != 0 RETURN;

    ;WITHCTE (UserID, SeqNo, UserName, LastLoggedIn) AS

    (

    SELECTUserID,

    ROW_NUMBER() OVER (PARTITION BY UserName ORDER BY LastLoggedIn DESC) AS SeqNo,

    UserName + CONVERT(VARCHAR(10), ROW_NUMBER() OVER (PARTITION BY UserName ORDER BY LastLoggedIn DESC)), LastLoggedIn

    FROMdbo.Users

    )

    SELECTUserId, UserName, LastLoggedIn

    INTOdbo.[9D2213FB-3FAB-40BA-9F7C-7C3B9CB8A7F7]

    FROMCTE

    WHERESeqNo > 1

    BEGIN TRANSACTION

    BEGIN TRY

    DROP TABLE dbo.Users;

    EXECUTE sp_rename N'dbo.[9D2213FB-3FAB-40BA-9F7C-7C3B9CB8A7F7]', N'Users', 'OBJECT'

    END TRY

    BEGIN CATCH

    IF XACT_STATE() != 0 ROLLBACK TRANSACTION;

    RAISERROR('Carp!', 16, 1);

    END CATCH

    COMMIT TRANSACTION;

    144 logical reads, < 400ms elapsed on my laptop.

    Paul

  • Excluding, of course, creation of any constraints or indexes

    With the same table that I used earlier, I make it

    Table 'Users'. Scan count 1, logical reads 269, physical reads 0

    SQL Server Execution Times:

    CPU time = 109 ms, elapsed time = 373 ms.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • I used:

    Insert Into dbo.Users (UserName, LastLoggedIn)

    SELECT TOP 40000 CHAR(65+FLOOR(RAND(a.column_id*801020 + a.column_id*705020)*12)) + CHAR(65+FLOOR(RAND(a.column_id*6010 + b.object_id)*5)), DATEADD(dd,RAND(a.column_id*600 + b.object_id/1.5)*500,'2008/01/01')

    FROM master.sys.columns a CROSS JOIN master.sys.columns b

    from a previous post...was that yours? If so, the royalty cheque (check) is in the post 🙂

    Yes I forgot to re-create the primary key - silly boy. And yes, other stuff may well need to be added. It was just an alternative idea.

    ALTER TABLE dbo.Users ADD CONSTRAINT [PK dbo.Users UserID] PRIMARY KEY CLUSTERED (UserID) WITH (ONLINE=OFF,MAXDOP=4,FILLFACTOR=100);

    [font="Courier New"]Table 'Users'. Scan count 1, logical reads 152, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    SQL Server Execution Times:

    CPU time = 62 ms, elapsed time = 225 ms.[/font]

Viewing 15 posts - 16 through 30 (of 36 total)

You must be logged in to reply to this topic. Login to reply