Insert random data in an extra column while transfering data (INSERT INTO ..FROM SELECT..)

  • We have a table with two columns, we want to transfer this table to another table and add a column with random data

    This is the source table:

    indx col1

    ----------- -----------

    1 1

    2 1

    3 1

    4 1

    This is the destination table we want:

    indx col1 numVal

    ----------- ----------- ----------------------

    1 1 0,26113531332694

    2 1 0,297826084044096

    3 1 0,238910995882056

    4 1 0,481712234763395

    To do this we have a script like the script below. This script runs well, but it takes more than 2 hours on a table with 3.000.000 records.

    DROP TABLE myTempTableSource

    GO

    CREATE TABLE [dbo].[myTempTableSource](

    [indx] [int] IDENTITY(1,1) NOT NULL,

    [col1] [int] NULL,

    CONSTRAINT [PK_myTempTableSource] PRIMARY KEY CLUSTERED

    (

    [indx] ASC

    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

    ) ON [PRIMARY]

    GO

    INSERT INTO [myTempTableSource] VALUES(1)

    INSERT INTO [myTempTableSource] VALUES(1)

    INSERT INTO [myTempTableSource] VALUES(1)

    INSERT INTO [myTempTableSource] VALUES(1)

    /*Create Destination Table */

    DROP TABLE myTempTableDestination

    GO

    CREATE TABLE [dbo].[myTempTableDestination](

    [indx] [int] IDENTITY(1,1) NOT NULL,

    [col1] [int] NULL,

    [numVal] [float] NULL DEFAULT RAND(),

    CONSTRAINT [PK_myTempTableDestination] PRIMARY KEY CLUSTERED

    (

    [indx] ASC

    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

    ) ON [PRIMARY]

    GO

    /* Tranfer data */

    INSERT INTO [myTempTableDestination] SELECT [col1], NULL FROM [myTempTableSource]

    /* Update column numVal to random numbers */

    DECLARE @rowcount int

    DECLARE @row int

    SET @rowcount = (SELECT MAX(indx) FROM [myTempTableSource])

    SET @row = 0

    WHILE (@row <= @rowcount)

    BEGIN

    UPDATE [myTempTableDestination]

    SET numVal = RAND()

    FROM [myTempTableDestination]

    WHERE indx = @row

    SELECT @row = @row+1

    END

    /* Show result */

    SELECT * FROM [myTempTableSource]

    SELECT * FROM [myTempTableDestination]

    Is it possible to fill the column with random data while inserting the data (so that I don't have run the update statements)?

    The things I have tried:

  • Update all the rows at once (but al the rows get the same random number then)UPDATE [myTempTableDestination] SET numVal = RAND()
  • I have added a DEFAULT CONSTRAINT to the column numVal (but I don't know how to fill it with a INSERT INTO ..FROM SELECT..)

    This one works fine

    INSERT INTO [myTempTableDestination] VALUES (1, DEFAULT)

    This one gives an error and if it works I'm affraid all the rows have the same random number

    INSERT INTO [myTempTableDestination]

    SELECT [col1], DEFAULT FROM [myTempTableSource]

    Msg 156, Level 15, State 1, Line 1

    Incorrect syntax near the keyword 'DEFAULT'.

  • Regards,

    Marco Hofman

  • SELECT (ABS(CHECKSUM(NEWID())) % 100000 /* 0 to 99999 */) / 100000.0 FROM sys.tables

  • :w00t:

    I have changed the /* Tranfer data */ part to the code below and I have removed the /* Update column numVal to random numbers */ part

    /*Create source Table */

    DROP TABLE myTempTableSource

    GO

    CREATE TABLE [dbo].[myTempTableSource](

    [indx] [int] IDENTITY(1,1) NOT NULL,

    [col1] [int] NULL,

    CONSTRAINT [PK_myTempTableSource] PRIMARY KEY CLUSTERED

    (

    [indx] ASC

    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

    ) ON [PRIMARY]

    GO

    INSERT INTO [myTempTableSource] VALUES(1)

    INSERT INTO [myTempTableSource] VALUES(1)

    INSERT INTO [myTempTableSource] VALUES(1)

    INSERT INTO [myTempTableSource] VALUES(1)

    /*Create Destination Table */

    DROP TABLE myTempTableDestination

    GO

    CREATE TABLE [dbo].[myTempTableDestination](

    [indx] [int] IDENTITY(1,1) NOT NULL,

    [col1] [int] NULL,

    [numVal] [float] NULL,

    CONSTRAINT [PK_myTempTableDestination] PRIMARY KEY CLUSTERED

    (

    [indx] ASC

    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

    ) ON [PRIMARY]

    GO

    /* Tranfer data */

    INSERT INTO [myTempTableDestination] SELECT [col1], ((ABS(CHECKSUM(NEWID())) % 100000 /* 0 to 99999 */) / 100000.0) FROM [myTempTableSource]

    /* Show result */

    SELECT TOP 100 * FROM [myTempTableDestination]

    Now it transfers the data and generates a random number in a few seconds (I tested 9.000.000 million records in 43 seconds).

    This is amazing. It will take some time before I understand what your statement does, but I will figure it out.

    Thanks a lot !!!

    Marco

  • newid() = random data

    checksum turns that into an integer

    abs makes it positive.

    % (modulo) sets the top range (0 to 99999)

    / turns it into a number between 0 and 1.

    / 0.0 turns it into a decimal so that you don't get 0 for all rows because of an implicit casting to int.

  • Viewing 4 posts - 1 through 3 (of 3 total)

    You must be logged in to reply to this topic. Login to reply