Strange behavior with selecting random rows

  • Hi All,

    The problem i'm trying to solve is randomizing data in one of our tables, and I am trying to avoid using a cursor loop. there are 1000 street names in a table (RND_STREET) with an identity column and a street name:

    Structure: ID int identity

    street_name varchar(40)

    I have created a view & function to select a random integer between a min & max value, which has been successfully tested:

    ==============

    CREATE VIEW vRandNumber

    AS

    SELECT RAND() as RandNumber

    ALTER FUNCTION RandNumberInt(@Min int, @max-2 int)

    RETURNS INT

    AS

    BEGIN

    RETURN @min-2 + (select RandNumber from dbo.vRandNumber) * (@Max-@Min)

    END

    select dbo.RandNumberInt(1, 1000) -- returns a single record every time.

    ==============

    However, the fun starts when incorporated into a select:

    select convert(varchar(5),dbo.RandNumberInt(1, 100000)) + ' ' + street_name

    from rnd_street

    where ID = dbo.RandNumberInt(1, 1000)

    can return no records, a single on, or even multiple ones.

    ==============

    Am i missing something, or is this a bug?

    Thank you,

    John

  • you want the random values from the table ? if so use newid() in order by

    ex : select * from tablename order by newid()

    Regards- Deepak

  • I'm sorry, i forgot to include the entire statement:

    update RND_Customer_Master

    set Address1 = (select convert(varchar(5),dbo.RandNumberInt(1, 100000)) + ' ' + street_name

    from rnd_street

    where ID = dbo.RandNumberInt(1, 1000) ),

    RoutingNumber = convert(varchar(10), ABS(CAST(CAST(NEWID() AS VARBINARY) AS int)) % 1000000),

    BankAcctNumber = convert(varchar(10), ABS(CAST(CAST(NEWID() AS VARBINARY) AS int)) % 1000000),

    Zip = convert(varchar(5), ABS(CAST(CAST(NEWID() AS VARBINARY) AS int)) % 100000),

    first_name = (select contact

    from RND_contact

    where ID = dbo.RandNumberInt(1, 1000)),

    name = (select first_name + last_name

    from RND_LNAME, RND_FNAME

    where RND_FNAME.ID = dbo.RandNumberInt(1, 1000)

    and RND_LNAME.ID = RND_FNAME.ID )

  • You can use this to generate numbers :

    CHECKSUM(NEWID()) % <max value goes here>

    You might wabt to use ABS() as well because this generates negative numbers as well.

  • Hi,

    The problem is getting multiple rows returned, not the random number generation. I've tried a couple of different randomization techniques, but continue to get 0, 1, or multiple rows returned with the select.

    It becomes a show-stopper when attempting to update a record and the sub-query returns zero or multiple rows when it should return exactly one...

    where ID = dbo.RandNumberInt(1, 1000)

  • Stop using that function. THIS IS random > CHECKSUM(NEWID())

  • i don't care about the randomness -- it's the multiple rows being returned that's the problem.

  • Have you checked your data? Does the ID column have a unique constraint on it?

    Does this return anything:

    select ID, COUNT(*) from MyTable group by ID having COUNT(*) > 1

    John

  • yes, the ID is an identity column w/unique values.

    CREATE TABLE [dbo].[RND_Street](

    [ID] [int] IDENTITY(1,1) NOT NULL,

    [street_name] [varchar](40) NULL

    ) ON [PRIMARY]

  • Identity property doesn't guarantee uniqueness. Did that query I posted return any rows?

    John

  • john-from-ct (8/9/2011)


    It becomes a show-stopper when attempting to update a record and the sub-query returns zero or multiple rows when it should return exactly one...

    where ID = dbo.RandNumberInt(1, 1000)

    That's not going to return 1 row each time. It's not going to return a consistent number of rows. The function gets evaluated for each row, so each row gets a different random number and the cases where it matches the ID will be returned.

    You can see that behaviour with a simpler example

    Run this a few times:

    SELECT column_id, name, dbo.RandNumberInt(1, 100) FROM sys.columns AS c

    WHERE column_id = dbo.RandNumberInt(1, 100)

    You'll get a different number of rows, and different rows, each time because each time that function's evaluated you get a different number. Also the value the function returns in the select will be different from the one in the where clause. Function called twice, two different random numbers

    Now, what are you trying to do here?

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • update the customer table with values from a set of street names with random addresses, without using a cursor & loop. I have tables with a unique identity column and i want a random function to select a single record from the address (and other) tables.

    the source tables have a similar structure:

    ID - unique number from 1 to 1000 (identity column)

    value - street / last name / first name, etc

  • Table definitions please. Sample data would be helpful (garbage/test data)

    Where does the randomised data come from?

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • select ID, COUNT(*)

    from RND_Street

    group by ID

    having COUNT(*) > 1

    returns zero rows.

  • john-from-ct (8/9/2011)


    select ID, COUNT(*)

    from RND_Street

    group by ID

    having COUNT(*) > 1

    returns zero rows.

    That's not what Gail is saying. She's saying that the function gets run EVERY ROW. So in theory it's possible to get more than 1 or even 0 rows.

    She want's table DDL so she can give you a tested solution.

Viewing 15 posts - 1 through 15 (of 22 total)

You must be logged in to reply to this topic. Login to reply