Help with SQL Coding

Question

Post reply

Help with SQL Coding

Viewing 15 posts - 16 through 30 (of 66 total)

You must be logged in to reply to this topic. Login to reply

Paul White SSC Guru Points: 150468 More actions · Answer 1

Or even:

WHILE EXISTS

(

SELECT *

FROM #EventPersonInfo

WHERE app_id IS NULL

)

BEGIN

-- Get a new random number

SET @temp = ABS(CHECKSUM(NEWID()));

-- Do the update

UPDATE TOP (1)

#EventPersonInfo

SET app_id = @temp

WHERE app_id IS NULL

AND NOT EXISTS

(

SELECT *

FROM #EventPersonInfo WITH (UPDLOCK, HOLDLOCK)

WHERE app_id = @temp

);

END;

Paul White
All articles available on SQL.kiwi
@SQL_Kiwi

Dave Ballantyne SSC-Dedicated Points: 33667 More actions · Answer 2

Beyond the call of duty Paul 😉

Clear Sky SQL
My Blog[/url]

Steve Jones - SSC Editor SSC Guru Points: 736478 More actions · Answer 3

Nice, Paul. I assume those are provided for testing, or other solutions and not for this problem. It's overkill and inefficient for what the OP is trying to accomplish here.

Eugene Elutin SSC Guru Points: 59322 More actions · Answer 4

You didn't get. It is just a joke!

You should: Ha! Ha! Ha!

Am I right?

If not, I have some even better solution. It will use multiple temp tables, few CTE's, Jeff favoured "quirky update" techique and, of course couple of loops and couple of cursors (possible three)!

Do you want to see it?

😀

_____________________________________________
"The only true wisdom is in knowing you know nothing"
"O skol'ko nam otkrytiy chudnyh prevnosit microsofta duh!":-D
(So many miracle inventions provided by MS to us...)

How to post your question to get the best and quick help[/url]

Paul White SSC Guru Points: 150468 More actions · Answer 5

Steve Jones - Editor (7/13/2010)
Nice, Paul. I assume those are provided for testing, or other solutions and not for this problem.

No.

It's overkill and inefficient for what the OP is trying to accomplish here.

Thanks for your opinion.

Paul White
All articles available on SQL.kiwi
@SQL_Kiwi

Paul White SSC Guru Points: 150468 More actions · Answer 6

Eugene Elutin (7/13/2010)
You didn't get. It is just a joke! You should: Ha! Ha! Ha! Am I right?

Not entirely, but I can see why you would say that.

Nobody here (except Tasnim) has enough information to say how this problem could best be solved. It may be, for example, that the design cannot be changed, and a random number must indeed be generated for rows that currently contain NULL.

It may also be that the table only contains a few hundred rows and is read-only except for an overnight (off-line) batch process. We just don't know.

Rather than just join in with the predictable "your design is wrong, you need to write a set-based solution" response, I decided to show a couple of examples of row-by-row processing code, incorporating some of the features that would be required to make it robust and suitable for concurrent processing.

If I get feedback from Tasnim, I might go on to explain how my code shows that row-by-row processing can get quite complex, and will usually not perform adequately. The thread might then lead on to be an interesting and positive learning experience. It could happen.

That said, if the requirement is to add 'n' new unique random positive integers to replace an unknown number of NULLs, the set-based solutions are non-trivial too, and may not perform any better.

Notwithstanding Steve's attitude, I intend to pick back up on the theme tomorrow.

Paul

Paul White
All articles available on SQL.kiwi
@SQL_Kiwi

Eugene Elutin SSC Guru Points: 59322 More actions · Answer 7

Paul White NZ (7/13/2010)
...
the set-based solutions are non-trivial too, and may not perform any better.
...

You are absoultely right. Actually, strictly speaking, it is impossible to guarantee absolute uniqueness in a set-based solution.

You will need still check for duplicates. Basically, any such solution will contain some sort of loop. In one of the recent posts I have explained some OP how to generate set of random numbers:

Some thing along the lines:

1. GENERATE SET OF RANDOM NUMBERS

2. CHECK THE GENERATED SET FOR DUPLICATES

3. IF NOT FOUND THEN EXIT

ELSE:

REGENERATE RANDOM NUMBERS FOR DUPLICTES ONLY

AND GOTO STEP 2

Post #945594 in http://www.sqlservercentral.com/Forums/Topic945541-338-1.aspx

Actually, RAND([base]) can be replaced with manipulation on NEWID() as explained in the futher posts on the above topic.

_____________________________________________
"The only true wisdom is in knowing you know nothing"
"O skol'ko nam otkrytiy chudnyh prevnosit microsofta duh!":-D
(So many miracle inventions provided by MS to us...)

How to post your question to get the best and quick help[/url]

Paul White SSC Guru Points: 150468 More actions · Answer 8

Thanks, Eugene. I'm glad someone else appreciates the subtleties.

Paul White
All articles available on SQL.kiwi
@SQL_Kiwi

tasnim.siddiqi Ten Centuries Points: 1063 More actions · Answer 9

Hi Everyone!

First of all let me take the time to thank all of you guys, Paul, Steve, Old hand for helping with my case here. Really appreciate it guys 🙂

To Paul: A few things for my table. I am a database programmer (a new one). My table will have some 90000+ rows and I need to maintain unique app_IDs for each of them. The table rows will increase gradually, so every time new rows are added, a unique app_ID should be generated. As stated previously, Identity column will solve the problem, but its not random and therefore the idea is not very robust.

I am C / C++ programmer and I have easily achieved such objectives with arrays (as tables) and rand() functions. Due to the if/else and while loops, in C and because I am familiar with it I could make robust code in C to get this job done. I have described the algorithm before and you all understood the right thing, just that I needed to get it done with unique random number instead of identity column style.

Eugene Elutin SSC Guru Points: 59322 More actions · Answer 10

tasnim.siddiqi (7/13/2010)
...
To Paul: A few things for my table. I am a database programmer (a new one). My table will have some 90000+ rows and I need to maintain unique app_IDs for each of them. The table rows will increase gradually, so every time new rows are added, a unique app_ID should be generated. As stated previously, Identity column will solve the problem, but its not random and therefore the idea is not very robust.
I am C / C++ programmer and I have easily achieved such objectives with arrays (as tables) and rand() functions. Due to the if/else and while loops, in C and because I am familiar with it I could make robust code in C to get this job done. I have described the algorithm before and you all understood the right thing, just that I needed to get it done with unique random number instead of identity column style.

1. Thanks Lord, you are not former assembly programmer (as me, for example), otherwise you would want to implement something even more robust than c/c++. 😀

2. If you want to be a database programmer (specificaly T-SQL), you should learn how the thing should be properly done in T-SQL. Query language is not procedural - it based on absolutely different concept. But, even if it would be, just think a bit what you would end up with, if VB6.0 programmer would start coding in c++ using the practice of coding in VB6.0 and saying that he knows that they work in vb therefore he is going to use them in c++...

3. Could you please enlight us a bit about why and how the identity column in SQLServer is not robust enough? What particular aspect of using it worries you?

4. From what you described so far about your table, defining App_id as identity is a right design solution in SQLServer. There is nothing in your details, except the fact that you are c/c++ programmer and you like using random numbers in your c/c++ code, justifies using random number for app_id. You wil create headache for yourself and future developers.

This forum provides one of the best opportunities to learn SQL Server and helps to became a professional database programmer. Of course it will only work if you want it. If your choice is to use c/c++ practices while developing in T-SQL, no one here is able to stop you.

Just be aware, that any normal T-SQL programmer who will see your code will call it (including you, I afraid) by not very nice names...

_____________________________________________
"The only true wisdom is in knowing you know nothing"
"O skol'ko nam otkrytiy chudnyh prevnosit microsofta duh!":-D
(So many miracle inventions provided by MS to us...)

How to post your question to get the best and quick help[/url]

Paul White SSC Guru Points: 150468 More actions · Answer 11

tasnim.siddiqi (7/13/2010)
As stated previously, Identity column will solve the problem, but its not random and therefore the idea is not very robust.

Can you explain that a little more? I am unsure *why* you feel random numbers would be more 'robust'. Robust in what scenario? Just interested.

Paul White
All articles available on SQL.kiwi
@SQL_Kiwi

tasnim.siddiqi Ten Centuries Points: 1063 More actions · Answer 12

Robust in the sense that it will be more reliable, I mean with just ascending ordered numbers the app_IDs will have a biased numbering form. I mean it is like skewness, if you have done statistics before. It is better to avoid such things.

I hope I could clarify you.

Paul White SSC Guru Points: 150468 More actions · Answer 13

tasnim.siddiqi (7/14/2010)
Robust in the sense that it will be more reliable, I mean with just ascending ordered numbers the app_IDs will have a biased numbering form. I mean it is like skewness, if you have done statistics before. It is better to avoid such things.

Ok, I understand the general concern - but why is that skewness important? Is there something about the application or design that *requires* a random distribution? The more you can tell us about it, the better really.

Paul White
All articles available on SQL.kiwi
@SQL_Kiwi

Paul White SSC Guru Points: 150468 More actions · Answer 14

In the meantime, in case you are just looking for something simple to get you started:

DECLARE @temp INTEGER;

BEGIN TRANSACTION;

-- Find a new random number that does not already exist in the table

WHILE (@temp IS NULL OR EXISTS (SELECT * FROM Event_Person_Info WITH (UPDLOCK, HOLDLOCK) WHERE App_ID = @temp))

SET @temp = ABS(CHECKSUM(NEWID()));

-- Update the first NULL App_ID

UPDATE TOP (1)

Event_Person_Info

SET App_ID = @temp

WHERE App_ID IS NULL;

COMMIT TRANSACTION;

That's based on your original code, and just updates the first NULL App_ID found to a random positive integer, which does not already exist in the table. The code I presented previously shows various ways to extend this idea.

Paul White
All articles available on SQL.kiwi
@SQL_Kiwi

Steve Jones - SSC Editor SSC Guru Points: 736478 More actions · Answer 15

tasnim.siddiqi (7/14/2010)
Robust in the sense that it will be more reliable, I mean with just ascending ordered numbers the app_IDs will have a biased numbering form. I mean it is like skewness, if you have done statistics before. It is better to avoid such things.
I hope I could clarify you.

There's nothing about an increasing key that makes it more or less reliable or robust. You are fundamentally missing the point. Even in a hash, if you can hash to a known unique item, it is fine. There's no need for a distribution.

In terms of statistics, I'm not sure why skewing matters. I don't think it's "better" to avoid skewing in general. There are places it is desirable, and places it's not.

The reason an increasing key helps is performance. You can read here, about halfway down, where it talks about performance implications. When you insert into the middle of a SQL table, you potentially cause a page split. Think about inserting something in the middle of a large file on disk. It takes time and resources, way more than using the end of the file.

http://www.sqlskills.com/blogs/kimberly/post/Ever-increasing-clustering-key-the-Clustered-Index-Debateagain!.aspx