statistical analysis of RAND()

Question

statistical analysis of RAND()

Wayne Lawton

Hall of Fame

Points: 3532
More actions
November 15, 2005 at 9:05 am

#92030

G'day all,
Has anyone ran a statistical analysis of the T-SQL RAND() function? What were the results? How random is it? Any idea which algorithm it is based on?
I can certainly run my own series of tests but hopefully someone has already done so and can save me a few hours or days of work.
Thanks in advance
Wayne

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply

Kenneth Wilhelmsson SSC-Dedicated Points: 30043 More actions · Answer 1

Nope, no analysis. But it's not particularly random, though it may be good enough depending on your purpose.

For the most cases, NEWID() works better for generating random stuff. But, it depends. Why do you ask? Practical reason or just academic?

/Kenneth

Wayne Lawton Hall of Fame Points: 3532 More actions · Answer 2

Practical reasons, actually. One of my clients has a web app I support that uses random numbers as part of the mechanism to prevent spoofing and other related attacks. Currently the approach is to use a custom random number generator because the mechanism built in to SQL Server was not well understood at the time. Specifically, when provided a specific seed, the same sequence of pseudo-random numbers is generated.

In my spare time, I also provide backend development and support for a couple of multi-player games. There are times when a pseudo-random number in the backend would reduce the number of messages moving between the layers. In many cases, "good enough" is exactly that. In other cases, such as a game of chance where money is related to the outcome, the standards are a bit higher.

Both scenarios are leading me to re-evaluate how well random numbers are supported in SQL Server.

I found an article on the Microsoft web site by Brian Connolly here: http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnsqlpro04/html/sp04c1.asp that discusses a few options.

I am running a simple analysis using distribution over a particular target set and will post the results back in a few days. I welcome any input from others who may have done similar work.

Have a great day,

Wayne

Kenneth Wilhelmsson SSC-Dedicated Points: 30043 More actions · Answer 3

Here's something I had lying in a dark corner that may be of some help.

The general rule to use rand to generate random numbers from <lower limit> to <upper limit> is:

select convert(int, <upper limit - lower limit + 1> * rand() + <lower limit> )

So, to get numbers from 1 to 100:

select convert(int, 100 * rand() + 1)

From 18 to 30:

select convert(int, 13 * rand() + 18)

/Kenneth

Bill Talada SSChampion Points: 11954 More actions · Answer 4

Pseudo-random numbers in SqlServer had 32K different values when I investigated them in version 6.5. Since v6.5 had problems with the "hot spot" on tables due to page locking, I generated my own "identitys" partly using rand() which dramatically reduced contention.

I noticed at work that the same people kept getting hit with random drug testing. I suspect it is due to a poor statistical randomness in the program.

Randomness is quite a philosophical issue since it can be argued nothing is random in the universe once the underlying laws are understood. The best answer may be looping a million times on the front end and see where a user keystroke interrupts the loop. I've heard of USB devices that approach true randomness.

Good luck

john-132685 SSCertifiable Points: 6134 More actions · Answer 5

I've used the hotbits website for grabbing random numbers and loading them into tables for use with the various simulations I need. (Random numbers generated from radioactive decay.)

http://www.fourmilab.ch/hotbits/