Need help on performance and integrity

Question

Need help on performance and integrity

Viewing 11 posts - 106 through 115 (of 115 total)

You must be logged in to reply to this topic. Login to reply

The Dixie Flatline SSC Guru Points: 53354 More actions · Answer 1

smunson (11/3/2008)
Just a thought, but why would a set of data that starts with rows 1 thru 3 as duplicates, having only rows 2 and 3 be marked, and yet, if row 4 were C and then rows 5 thru 7 were 'A' dupes, all 3 of those dupes would be marked? I'm still trying to concoct ANY kind of scenario where that kind of lack of consistency is an essential part of anything other than some bizarre logic puzzle. At this point, I can't even remember if you have to identify a break between a C and a D, and your example didn't specify on that either.

Steve, 5 would be marked as a break from C back to A, not as a dup.

I was surprised to see the sudden burst of traffic on this. It's very tempting to unsubscribe here, but I'm going to stick around a while longer hoping to hear solid confirmation that one of the solutions was correct. I don't think we'll ever get a solid explanation of the "why" behind this pretzel logic. However, we could solve the issue of performance among using the million-row generation.

__________________________________________________

Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

Jeff Moden SSC Guru Points: 1004432 More actions · Answer 2

Heh... I wanna know who ever said that "set based" meant "everything done in one query... 😛

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

AnzioBake SSCarpal Tunnel Points: 4134 More actions · Answer 3

HI.

I too, would Love to know why you would do this, but I had more fun writing the solution that performs reasonably well on a million rows. Maybe it is just a "problem" with sites like these....there are people on here because they SQL is the bigger part of their daily jobs and then...

there are those who have a problem.

I get the impression the OP was using non-DB logic to solve a problem and got stuck in that, and therefore did not know how to explain the overall problem, just the step he got stuck at.

my 0.02 ZAR

AnzioBake SSCarpal Tunnel Points: 4134 More actions · Answer 4

HI.

I too, would Love to know why you would do this, but I had more fun writing the solution that performs reasonably well on a million rows. Maybe it is just a "problem" with sites like these....there are people on here because they SQL is the bigger part of their daily jobs and then...

there are those who have a problem.

I get the impression the OP was using non-DB logic to solve a problem and got stuck in that, and therefore did not know how to explain the overall problem, just the step he got stuck at.

BTW Jeff, where you able to use the random data generator?

my 0.02 ZAR

Jeff Moden SSC Guru Points: 1004432 More actions · Answer 5

AnzioBake (11/3/2008)
BTW Jeff, where you able to use the random data generator?

Yep... made a couple of changes to it to make it run just a little faster... give it a try...

SELECT TOP 1000000

IDENTITY(INT,1,1) AS RowID,

ABS(CHECKSUM(NEWID()))%2+1 AS RowKey,

CASE WHEN ABS(CHECKSUM(NEWID()))%30 < 16 THEN 'EN'

WHEN ABS(CHECKSUM(NEWID()))%30 > 24 THEN 'ZH'

ELSE 'DE'

END AS Locale,

CASE WHEN ABS(CHECKSUM(NEWID()))%10 < 7 THEN 'A'

ELSE 'C'

END AS Code

INTO dbo.SSCData

FROM Master.dbo.SysComments sc1

CROSS JOIN Master.dbo.SysComments sc2

Sorry I haven't spent much time on this post, yet... I'll be back. 🙂

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

AnzioBake SSCarpal Tunnel Points: 4134 More actions · Answer 6

Nice....That is an interesting way to get a random number i.e. newID combined with checksum()....

I like it

Jeff Moden SSC Guru Points: 1004432 More actions · Answer 7

You can thank Matt Miller for that little gem. I used to use RAND(NEWID), but CHECKSUM is quite a bit faster. Either way, don't need a loop because NEWID does provide a random number. You just use CHECKSUM or RAND to do a conversion. CHECKSUM just happens to return an INT which makes it scream.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

SwePeso SSC-Dedicated Points: 39757 More actions · Answer 8

Just beware that CHECKSUM can return a negative value.

Use ABS(CHECKSUM(NEWID())) to get positive random values.

N 56°04'39.16"
E 12°55'05.25"

Jeff Moden SSC Guru Points: 1004432 More actions · Answer 9

Heh... thanks, Peter... I put that in my code but forgot to mention it.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Garadin One Orange Chip Points: 29613 More actions · Answer 10

If anyone does do any testing of these methods, please post the results to this thread. I'm curious as to the outcome of this. Otherwise, I'll try to get something together this weekend.

Seth Phelabaum

Consistency is only a virtue if you're not a screwup. 😉

Links: How to Post Sample Data[/url] :: Running Totals[/url] :: Tally Table[/url] :: Cross Tabs/Pivots[/url] :: String Concatenation[/url]

The Dixie Flatline SSC Guru Points: 53354 More actions · Answer 11

Hey Seth,

Work got busy so I had to leave this behind. Please let us know if you pursue it. I don't expect my cte-on-cte solution to be the winner, but the update approach should run quicker than the while loop.

Let me ask a question about set-based solutions based on a comment Jeff made earlier. Although doing everything in one query may not be the fastest solution to a problem, isn't it theoretically possible to use a single query (with subqueries and/or ctes) to solve a problem that has a set-based solution?

Bob

P.S. Thanks to all for ABS(CHECKSUM(NEWID())). That one went immediately to the utility belt.

__________________________________________________

Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills