July 20, 2011 at 2:47 pm
We have a table with two columns, we want to transfer this table to another table and add a column with random data
This is the source table:
indx col1
----------- -----------
1 1
2 1
3 1
4 1
This is the destination table we want:
indx col1 numVal
----------- ----------- ----------------------
1 1 0,26113531332694
2 1 0,297826084044096
3 1 0,238910995882056
4 1 0,481712234763395
To do this we have a script like the script below. This script runs well, but it takes more than 2 hours on a table with 3.000.000 records.
DROP TABLE myTempTableSource
GO
CREATE TABLE [dbo].[myTempTableSource](
[indx] [int] IDENTITY(1,1) NOT NULL,
[col1] [int] NULL,
CONSTRAINT [PK_myTempTableSource] PRIMARY KEY CLUSTERED
(
[indx] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
INSERT INTO [myTempTableSource] VALUES(1)
INSERT INTO [myTempTableSource] VALUES(1)
INSERT INTO [myTempTableSource] VALUES(1)
INSERT INTO [myTempTableSource] VALUES(1)
/*Create Destination Table */
DROP TABLE myTempTableDestination
GO
CREATE TABLE [dbo].[myTempTableDestination](
[indx] [int] IDENTITY(1,1) NOT NULL,
[col1] [int] NULL,
[numVal] [float] NULL DEFAULT RAND(),
CONSTRAINT [PK_myTempTableDestination] PRIMARY KEY CLUSTERED
(
[indx] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
/* Tranfer data */
INSERT INTO [myTempTableDestination] SELECT [col1], NULL FROM [myTempTableSource]
/* Update column numVal to random numbers */
DECLARE @rowcount int
DECLARE @row int
SET @rowcount = (SELECT MAX(indx) FROM [myTempTableSource])
SET @row = 0
WHILE (@row <= @rowcount)
BEGIN
UPDATE [myTempTableDestination]
SET numVal = RAND()
FROM [myTempTableDestination]
WHERE indx = @row
SELECT @row = @row+1
END
/* Show result */
SELECT * FROM [myTempTableSource]
SELECT * FROM [myTempTableDestination]
Is it possible to fill the column with random data while inserting the data (so that I don't have run the update statements)?
The things I have tried:
UPDATE [myTempTableDestination] SET numVal = RAND()
This one works fine
INSERT INTO [myTempTableDestination] VALUES (1, DEFAULT)
This one gives an error and if it works I'm affraid all the rows have the same random number
INSERT INTO [myTempTableDestination]
SELECT [col1], DEFAULT FROM [myTempTableSource]
Msg 156, Level 15, State 1, Line 1
Incorrect syntax near the keyword 'DEFAULT'.
Regards,
Marco Hofman
July 20, 2011 at 2:52 pm
SELECT (ABS(CHECKSUM(NEWID())) % 100000 /* 0 to 99999 */) / 100000.0 FROM sys.tables
July 20, 2011 at 3:39 pm
:w00t:
I have changed the /* Tranfer data */ part to the code below and I have removed the /* Update column numVal to random numbers */ part
/*Create source Table */
DROP TABLE myTempTableSource
GO
CREATE TABLE [dbo].[myTempTableSource](
[indx] [int] IDENTITY(1,1) NOT NULL,
[col1] [int] NULL,
CONSTRAINT [PK_myTempTableSource] PRIMARY KEY CLUSTERED
(
[indx] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
INSERT INTO [myTempTableSource] VALUES(1)
INSERT INTO [myTempTableSource] VALUES(1)
INSERT INTO [myTempTableSource] VALUES(1)
INSERT INTO [myTempTableSource] VALUES(1)
/*Create Destination Table */
DROP TABLE myTempTableDestination
GO
CREATE TABLE [dbo].[myTempTableDestination](
[indx] [int] IDENTITY(1,1) NOT NULL,
[col1] [int] NULL,
[numVal] [float] NULL,
CONSTRAINT [PK_myTempTableDestination] PRIMARY KEY CLUSTERED
(
[indx] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
/* Tranfer data */
INSERT INTO [myTempTableDestination] SELECT [col1], ((ABS(CHECKSUM(NEWID())) % 100000 /* 0 to 99999 */) / 100000.0) FROM [myTempTableSource]
/* Show result */
SELECT TOP 100 * FROM [myTempTableDestination]
Now it transfers the data and generates a random number in a few seconds (I tested 9.000.000 million records in 43 seconds).
This is amazing. It will take some time before I understand what your statement does, but I will figure it out.
Thanks a lot !!!
Marco
July 20, 2011 at 3:45 pm
newid() = random data
checksum turns that into an integer
abs makes it positive.
% (modulo) sets the top range (0 to 99999)
/ turns it into a number between 0 and 1.
/ 0.0 turns it into a decimal so that you don't get 0 for all rows because of an implicit casting to int.
Viewing 4 posts - 1 through 3 (of 3 total)
You must be logged in to reply to this topic. Login to reply