Random Number Generating

  • Hi All,

    I need to generate random unique numbers for 10 million records.

    Here is my requirement

    1. The numbers should begin with 11.
    2. The minimum length of the number should be 14
    3. The maximum length of the number should be 25
    4. Need to generate the random unique numbers for 10 million records.

    Thanks

    Sathish Kumar

     

  • sathishkm wrote:

    Hi All,

    I need to generate random unique numbers for 10 million records.

    Here is my requirement

    1. The numbers should begin with 11.
    2. The minimum length of the number should be 14
    3. The maximum length of the number should be 25
    4. Need to generate the random unique numbers for 10 million records.

    Thanks

    Sathish Kumar

    So, it won't actually be a numeric column because your starting value of "11" would have to be left-padded with 12 spaces or zeros to have a minimum length of 14.  Are you certain that you're good with that mistake?

    I'm also curious as to why you need "random" numbers on such a small range compared to the 14 character minimum size.  It would be real easy to generate 10 million integers and sort them in a Random Order in a table.

     

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  •  

    reduce the length to

    The numbers should begin with 11.

    The minimum length of the number should be 10

    The maximum length of the number should be 15

    Need to generate the random unique numbers for 10 million records.

  • DECLARE  @RowsToGenerate as int = 10000000
    SELECT DISTINCT TOP(@RowsToGenerate) x.Num
    FROM (SELECT LEFT('11' + CONVERT(varchar,(CONVERT(bigint,ABS(CHECKSUM(NEWID()))))) + CONVERT(varchar,(CONVERT(bigint,ABS(CHECKSUM(NEWID()))))) ,10+T.RandNum) Num
    FROM dbo.fnTally(1,CONVERT(int,@RowsToGenerate * 1.2))
    CROSS APPLY(VALUES (CAST(ABS(CHECKSUM(NEWID())) % 6 AS INT))) T(RandNum)) x
    ;

    You just need to install this little gem: dbo.fnTally

  • sathishkm wrote:

    reduce the length to

    The numbers should begin with 11. The minimum length of the number should be 10 The maximum length of the number should be 15 Need to generate the random unique numbers for 10 million records.

    That doesn't answer my original questions about the final datatype because, according to your new requirements, the number "11" would still need to be LPADDED with 8 zeros or spaces.  Or, RPADDED like Jonathan did.

    We just don't know which from your description.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden wrote:

    sathishkm wrote:

    reduce the length to

    The numbers should begin with 11. The minimum length of the number should be 10 The maximum length of the number should be 15 Need to generate the random unique numbers for 10 million records.

    That doesn't answer my original questions about the final datatype because, according to your new requirements, the number "11" would still need to be LPADDED with 8 zeros or spaces.  Or, RPADDED like Jonathan did.

    We just don't know which from your description.

    They aren't padded, the length of the varchar is the length of the number in it.

    Now they want the maximum length to be only 15 characters the number will fit in a bigint so can be converted to one:

    DECLARE  @RowsToGenerate as int = 10000000
    SELECT DISTINCT TOP(@RowsToGenerate) convert(bigint, x.Num) Num
    FROM (SELECT LEFT('11' + CONVERT(varchar,(CONVERT(bigint,ABS(CHECKSUM(NEWID()))))) + CONVERT(varchar,(CONVERT(bigint,ABS(CHECKSUM(NEWID()))))) ,10+T.RandNum) Num
    FROM dbo.fnTally(1,CONVERT(int,@RowsToGenerate * 1.2))
    CROSS APPLY(VALUES (CAST(ABS(CHECKSUM(NEWID())) % 6 AS INT))) T(RandNum)) x
    ;

    I just noticed that this method won't have any numbers that have a zero after the 11. So numbers like 110345678912 will never occur. I can't see that this would be a problem though.

  • Jeff Moden wrote:

    sathishkm wrote:

    reduce the length to

    The numbers should begin with 11. The minimum length of the number should be 10 The maximum length of the number should be 15 Need to generate the random unique numbers for 10 million records.

    That doesn't answer my original questions about the final datatype because, according to your new requirements, the number "11" would still need to be LPADDED with 8 zeros or spaces.  Or, RPADDED like Jonathan did.

    We just don't know which from your description.

    Jeff, unless I'm thinking about this incorrectly (entirely possible), you could actually use a BIGINT data type and have the correct format. Simply generate the numbers 1 - 10M and then simply add 110000000000000 to each number.

    Or even use an INT data type by adding 1100000000 instead, as it would still meat the 10 char minimum.

  • Something like this...

    WITH 
    cte_n1 (n) AS (SELECT 1 FROM (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) n (n)),-- 10
    cte_n2 (n) AS (SELECT 1 FROM cte_n1 a CROSS JOIN cte_n1 b),-- 100
    cte_n3 (n) AS (SELECT 1 FROM cte_n2 a CROSS JOIN cte_n2 b),-- 10,000
    cte_Tally (n) AS (
    SELECT TOP(10000000)-- (10M rows to be generated)
    ROW_NUMBER() OVER (ORDER BY a.n)
    FROM
    cte_n3 a CROSS JOIN cte_n3 b-- 100,000,000
    ORDER BY
    a.n
    )
    SELECT
    n = ROW_NUMBER() OVER (ORDER BY NEWID() ),
    rand_num = t.n + 110000000000000
    FROM
    cte_Tally t;
    GO
  • Jason A. Long wrote:

    Jeff Moden wrote:

    sathishkm wrote:

    reduce the length to

    The numbers should begin with 11. The minimum length of the number should be 10 The maximum length of the number should be 15 Need to generate the random unique numbers for 10 million records.

    That doesn't answer my original questions about the final datatype because, according to your new requirements, the number "11" would still need to be LPADDED with 8 zeros or spaces.  Or, RPADDED like Jonathan did.

    We just don't know which from your description.

    Jeff, unless I'm thinking about this incorrectly (entirely possible), you could actually use a BIGINT data type and have the correct format. Simply generate the numbers 1 - 10M and then simply add 110000000000000 to each number.

    Or even use an INT data type by adding 1100000000 instead, as it would still meat the 10 char minimum.

    Yes... that's where I was going to go EXCEPT the OP said the minimum number would be "11"... not "110000000000000".

    Edit... actually the OP didn't say the "minimum" would be 11... he said they would start with 11.  It took that to mean "starting value" rather than "the first two digits of the number".

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • I read it as "the number should be between 10 and 15 digits in length and the first two digits must be 11".

    Some clarification from the OP would be helpful.

  • Jason A. Long wrote:

    I read it as "the number should be between 10 and 15 digits in length and the first two digits must be 11".

    Some clarification from the OP would be helpful.

    If the way you've read it is correct, then I absolutely agree with the BIGINT method you did.  Heh... obviously I didn't read it that way and so, I agree... we need to hear from the OP.

    And, welcome back!

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • p.s.  This is really going to be fun when they need another 10 million unique random values to fold into the first 10 million. 😀

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden wrote:

    And, welcome back!

    Thanks Jeff... I've haven't actually gone anywhere. I just do more reading than posting these days (mostly due to reading on an iOS devise).

    Every once in awhile, I'll see a topic that interests me enough to open it on a PC and work out a solution.

  • Jeff Moden wrote:

    p.s.  This is really going to be fun when they need another 10 million unique random values to fold into the first 10 million. 😀

    It shouldn't be too bad as long as the OP keeps the first range between 1 and 10,000,000. Just run the same code but add 10,000,000 to the [cte_Tally].[n] value.

    ROW_NUMBER() OVER (ORDER BY a.n) + 10000000
    ...
    n = ROW_NUMBER() OVER (ORDER BY NEWID() ) + 10000000,

    That way the "next 10M" will be 10,000,001 - 20,000,000

  • Jason A. Long wrote:

    Jeff Moden wrote:

    p.s.  This is really going to be fun when they need another 10 million unique random values to fold into the first 10 million. 😀

    It shouldn't be too bad as long as the OP keeps the first range between 1 and 10,000,000. Just run the same code but add 10,000,000 to the [cte_Tally].[n] value.

    ROW_NUMBER() OVER (ORDER BY a.n) + 10000000
    ...
    n = ROW_NUMBER() OVER (ORDER BY NEWID() ) + 10000000,

    That way the "next 10M" will be 10,000,001 - 20,000,000

    That would absolutely work but, the way the OP was talking, it also seemed like he wanted "Random Lengths" so the range of numbers would be much larger than just 10 million (8 digits +2 for the "11").

    Heh... like I saw someone's title on an article before.  "Nope.  You don't need an answer... you need a QUESTION!" 😀

     

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 15 posts - 1 through 15 (of 17 total)

You must be logged in to reply to this topic. Login to reply