SSC Members Name

Question

Post reply

SSC Members Name

Viewing 15 posts - 16 through 30 (of 36 total)

You must be logged in to reply to this topic. Login to reply

Steve Jones - SSC Editor SSC Guru Points: 736592 More actions · Answer 1

Of course!

Actually I thought some of you might have a better solution than I might come up with.

Anyone got a better performing solution than Gail?

Lynn Pettis SSC Guru Points: 442467 More actions · Answer 2

Nope, just would have written it with a CTE instead of a derived table.

RBarryYoung SSC Guru Points: 143329 More actions · Answer 3

Steve Jones - Editor (4/30/2009)
Anyone got a better performing solution than Gail?

Heh. Now you know we'll need to see the execution plan to better answer that. 😀

Seriously though, we do need to see it. 🙂

[font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung [font="Arial Black"]
Proactive Performance Solutions, Inc. [/font][font="Verdana"] "Performance is our middle name."[/font]

RBarryYoung SSC Guru Points: 143329 More actions · Answer 4

Hmm, depending on the duplicates density, this might be faster:

;WITH cteUsers as (

SELECT UserID

, UserName

, ROW_NUMBER() OVER (Partition By UserName Order By LastLoggedIn DESC) incr

FROM Users

)

UPDATE cteUsers

SET UserName = UserName + CAST(incr-1 AS VARCHAR(3))

Where incr > 1

[font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung [font="Arial Black"]
Proactive Performance Solutions, Inc. [/font][font="Verdana"] "Performance is our middle name."[/font]

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 5

RBarryYoung (4/30/2009)
Steve Jones - Editor (4/30/2009)
Anyone got a better performing solution than Gail?
Heh. Now you know we'll need to see the execution plan to better answer that. 😀

Use this to populate rather than a few hardcoded rows

Insert Into Users (UserName, LastLoggedIn)

SELECT TOP 40000 CHAR(65+FLOOR(RAND(a.column_id*801020 + a.column_id*705020)*12)) + CHAR(65+FLOOR(RAND(a.column_id*6010 + b.object_id)*5)), DATEADD(dd,RAND(a.column_id*600 + b.object_id/1.5)*500,'2008/01/01')

FROM master.sys.columns a CROSS JOIN master.sys.columns b

It gives lots of dups, expand the varchar in the update to VARCHAR(5).

Exec plan attached.

SQL Server parse and compile time:
CPU time = 0 ms, elapsed time = 0 ms.
Table 'Users'. Scan count 2, logical reads 80538, physical reads 0
Table 'Worktable'. Scan count 1, logical reads 80842, physical reads 0
SQL Server Execution Times:
CPU time = 422 ms, elapsed time = 445 ms.

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 6

RBarryYoung (4/30/2009)
Hmm, depending on the duplicates density, this might be faster:

Better on IOs, worse on time (at least on my PC)

SQL Server parse and compile time:
CPU time = 0 ms, elapsed time = 0 ms.
Table 'Users'. Scan count 1, logical reads 81083
SQL Server Execution Times:
CPU time = 1250 ms, elapsed time = 1291 ms.

Take the where clause out and it's way better than the update with join

;WITH cteUsers as (

SELECT UserID

, UserName

, ROW_NUMBER() OVER (Partition By UserName Order By LastLoggedIn DESC) incr

FROM Users

)

UPDATE cteUsers

SET UserName = UserName + CAST(incr-1 AS VARCHAR(5))

SQL Server parse and compile time:
CPU time = 0 ms, elapsed time = 22 ms.
Table 'Users'. Scan count 1, logical reads 81199.
SQL Server Execution Times:
CPU time = 438 ms, elapsed time = 458 ms.

Since there's no way to index that incr column, the where's not helping much. Might help with writes. I've got way too many dupes in the test data to tell.

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

RBarryYoung SSC Guru Points: 143329 More actions · Answer 7

GilaMonster (4/30/2009)
RBarryYoung (4/30/2009)
Hmm, depending on the duplicates density, this might be faster:
Better on IOs, worse on time (at least on my PC)

Yeah, I'm pretty sure that there's a whole "seek vs scan" threshold in there, even on the output side. My SWAG would be less than 5%, seek (use the WHERE clause), more than 20%, definitely scan (no WHERE clause), in-between, "it depends".

Take the where clause out and it's way better than the update with join
;WITH cteUsers as (
SELECT UserID
, UserName
, ROW_NUMBER() OVER (Partition By UserName Order By LastLoggedIn DESC) incr
FROM Users
)
UPDATE cteUsers
SET UserName = UserName + CAST(incr-1 AS VARCHAR(5))

Don't forget to put the CASE back in if you do.

[font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung [font="Arial Black"]
Proactive Performance Solutions, Inc. [/font][font="Verdana"] "Performance is our middle name."[/font]

Steve Jones - SSC Editor SSC Guru Points: 736592 More actions · Answer 8

Ha! I knew someone would take up the challenge.

I'll run a couple execution plans on these at night, see which one works better in my 800k row table.

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 9

RBarryYoung (4/30/2009)
Don't forget to put the CASE back in if you do.

True. That's what I get for posting while playing computer games....

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

RBarryYoung SSC Guru Points: 143329 More actions · Answer 10

OK, I used to following to generate test data:

delete from Users

Insert into Users

Select TOP 1000000

Left(c1.name, 19)

+ right(cast(newid() as varchar(36)), 2)

+ Left(c2.name, 19)

, Cast( abs(c1.object_id+c2.column_id)/1000.0 as datetime) as LastLoggedIn

From master.sys.system_columns c1, master.sys.system_columns c2

Which gives about 50,000 duplicates which is 5% (it varies though and this technique does not produce any really long chains, though I don't think that that should matter much).

At that density the WHERE version takes about 5 sec on my system and the No-WHERE version takes about 20. So 10% is probably a good dividing line.

[font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung [font="Arial Black"]
Proactive Performance Solutions, Inc. [/font][font="Verdana"] "Performance is our middle name."[/font]

RBarryYoung SSC Guru Points: 143329 More actions · Answer 11

The other thing to beware of is that there are issues with names whose length hits the max length (40 chars). I didn't see it until I used my last generating method, because it tends to produce really long names.

The upshot is that if you have 40-char names that are duplicates, you cannot just add a digit on to the end because then the name will be too long and it will fail.

I doubt that you have very many of these, Steve and you may not have any, so it's questionable how much work you want to do to cover this case. I would suggest just ignoring them by putting a "LEFT(... , 40)" around the final value of the UserName assignment.

Then if you have any of them and still need to fix them, use a different query or just do it by hand. Seriously, there cannot be that many, who uses a 40-character name?

[font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung [font="Arial Black"]
Proactive Performance Solutions, Inc. [/font][font="Verdana"] "Performance is our middle name."[/font]

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 12

ALTER TABLE Users ALTER COLUMN UserName VARCHAR(42)

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Paul White SSC Guru Points: 150468 More actions · Answer 13

Here's another approach (using a large hammer):

DECLARE@sql VARCHAR(500);

SET@sql = 'ALTER DATABASE ' + DB_NAME() + ' SET RECOVERY BULK_LOGGED;'

EXECUTE(@sql)

IF@@error != 0 RETURN;

;WITHCTE (UserID, SeqNo, UserName, LastLoggedIn) AS

(

SELECTUserID,

ROW_NUMBER() OVER (PARTITION BY UserName ORDER BY LastLoggedIn DESC) AS SeqNo,

UserName + CONVERT(VARCHAR(10), ROW_NUMBER() OVER (PARTITION BY UserName ORDER BY LastLoggedIn DESC)), LastLoggedIn

FROMdbo.Users

)

SELECTUserId, UserName, LastLoggedIn

INTOdbo.[9D2213FB-3FAB-40BA-9F7C-7C3B9CB8A7F7]

FROMCTE

WHERESeqNo > 1

BEGIN TRANSACTION

BEGIN TRY

DROP TABLE dbo.Users;

EXECUTE sp_rename N'dbo.[9D2213FB-3FAB-40BA-9F7C-7C3B9CB8A7F7]', N'Users', 'OBJECT'

END TRY

BEGIN CATCH

IF XACT_STATE() != 0 ROLLBACK TRANSACTION;

RAISERROR('Carp!', 16, 1);

END CATCH

COMMIT TRANSACTION;

144 logical reads, < 400ms elapsed on my laptop.

Paul

Paul White
All articles available on SQL.kiwi
@SQL_Kiwi

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 14

Excluding, of course, creation of any constraints or indexes

With the same table that I used earlier, I make it

Table 'Users'. Scan count 1, logical reads 269, physical reads 0
SQL Server Execution Times:
CPU time = 109 ms, elapsed time = 373 ms.

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Paul White SSC Guru Points: 150468 More actions · Answer 15

I used:

Insert Into dbo.Users (UserName, LastLoggedIn)

SELECT TOP 40000 CHAR(65+FLOOR(RAND(a.column_id*801020 + a.column_id*705020)*12)) + CHAR(65+FLOOR(RAND(a.column_id*6010 + b.object_id)*5)), DATEADD(dd,RAND(a.column_id*600 + b.object_id/1.5)*500,'2008/01/01')

FROM master.sys.columns a CROSS JOIN master.sys.columns b

from a previous post...was that yours? If so, the royalty cheque (check) is in the post 🙂

Yes I forgot to re-create the primary key - silly boy. And yes, other stuff may well need to be added. It was just an alternative idea.

ALTER TABLE dbo.Users ADD CONSTRAINT [PK dbo.Users UserID] PRIMARY KEY CLUSTERED (UserID) WITH (ONLINE=OFF,MAXDOP=4,FILLFACTOR=100);

[font="Courier New"]Table 'Users'. Scan count 1, logical reads 152, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

SQL Server Execution Times:

CPU time = 62 ms, elapsed time = 225 ms.[/font]

Paul White
All articles available on SQL.kiwi
@SQL_Kiwi