December 14, 2007 at 5:08 am
I need to copy the data from one table to another table. which method will provide the optimized performance?
1) insert into
or
2) select * into table from table.
which method is better???????
December 14, 2007 at 5:14 am
Change to bulk recovery mode and use "Select into". Select into is minimally logged (see http://msdn2.microsoft.com/en-us/library/ms191244.aspx).
Regards,
Andras
December 14, 2007 at 5:23 am
I would do this in two stages.
SELECT * INTO newtable FROM oldtable WHERE 1=0
INSERT INTO newtable
SELECT * FROM oldtable.
The first statement creates the table in the desired structure but with no records, the 2nd populates it.
Things to watch out for.
1. Identity values. INSERT INTO SELECT * won't work if there are identity values.
2. The size of the table in terms of number of records. If you are talking about millions of records then having a number of batched inserts may be a better bet and won't blow your log file to bits.
December 14, 2007 at 5:57 am
I think David recommends his method because SELECT/INTO will temporarily lock SysObjects while the data is transferring...
However, I prefer the SELECT/INTO method for two reasons...
1. If you CAN temporarily change to the Bulk or Simple recovery modes, there's a lot less logging (other factors must also be true... see Books Online for details).
2. It's nasty fast even with a couple of million rows... you won't have the locks on SysObjects for long. Other processes will simply wait.
--Jeff Moden
Change is inevitable... Change for the better is not.
December 14, 2007 at 6:25 am
So why would SELECT INTO not lock sysobjects?
Surely any form of creation of an object would have a small lock on sysobjects?
The other point is, how often are records created in sysobjects in any case? Application code doesn't create tables, constraints, procs or functions so surely any locking on sysobjects is irrelevant?
December 14, 2007 at 5:31 pm
So why do you recommend creating the table first?
--Jeff Moden
Change is inevitable... Change for the better is not.
December 15, 2007 at 3:04 pm
The main problem SELECT INTO is that is locks sysobjects (especially in tempdb) when lots of procedures make use of temporary tables having to wait until the insert is finished. (if I read it right from previous posts)
December 15, 2007 at 3:38 pm
Yep... know that... see my post above...
--Jeff Moden
Change is inevitable... Change for the better is not.
December 15, 2007 at 8:01 pm
Hi,
Check the following post http://sql-server-performance.com/Community/forums/p/12553/70520.aspx
Regards,
Ahmed
December 17, 2007 at 12:32 pm
Jeff Moden (12/14/2007)
So why do you recommend creating the table first?
Using SELECT INTO to create the initial table structure with no records is "blink of an eye" fast.
SELECT INTO to populate the entire recordset is painfully slow compared to INSERT INTO. In addition you can batch up your INSERT INTOs if you want to.
December 18, 2007 at 8:16 am
SELECT INTO to populate the entire recordset is painfully slow compared to INSERT INTO
Not real sure where you're coming up with that... I agree that SELECT/INTO will hold a lock on the sysObjects table longer than the Create/Insert/Select... but, if you run the following code a couple of times, you'll see that SELECT/INTO is consistently about 40% faster than INSERT/SELECT and it doesn't matter what the recovery mode is...
SET NOCOUNT ON
--=============================================================================
-- Million row test for SELECT/INTO
--=============================================================================
IF OBJECT_ID('dbo.jbmTest','U') IS NOT NULL
DROP TABLE dbo.jbmTest
DECLARE @StartTime DATETIME
SET @StartTime = GETDATE()
--===== Create and populate the Test table on the fly
SELECT TOP 1000000
sc1.ID AS ID1, sc2.ID AS ID2
INTO dbo.jbmTest
FROM Master.dbo.SysColumns sc1,
Master.dbo.SysColumns sc2
SELECT CONVERT(CHAR(12),GETDATE()-@StartTime,114) + ' Duration SELECT/INTO'
GO
--=============================================================================
-- Million row test for INSERT/SELECT
--=============================================================================
IF OBJECT_ID('dbo.jbmTest','U') IS NOT NULL
DROP TABLE dbo.jbmTest
DECLARE @StartTime DATETIME
SET @StartTime = GETDATE()
--===== Create the empty table
SELECT sc1.ID AS ID1, sc2.ID AS ID2
INTO dbo.jbmTest
FROM Master.dbo.SysColumns sc1,
Master.dbo.SysColumns sc2
WHERE 1 = 0
--===== Populate the table using INSERT/SELECT
INSERT INTO dbo.jbmTest
SELECT TOP 1000000
sc1.ID AS ID1, sc2.ID AS ID2
FROM Master.dbo.SysColumns sc1,
Master.dbo.SysColumns sc2
SELECT CONVERT(CHAR(12),GETDATE()-@StartTime,114) + ' Duration INSERT/SELECT'
--Jeff Moden
Change is inevitable... Change for the better is not.
December 18, 2007 at 11:23 am
Indeed, the select into is faster in the example
Select into: avg 1-2 seconds
Table 'Worktable'. Scan count 201, logical reads 6574, physical reads 0, read-ahead reads 0.
Table 'jbmTest'. Scan count 0, logical reads 1, physical reads 0, read-ahead reads 0.
Table 'syscolumns'. Scan count 2, logical reads 45, physical reads 2, read-ahead reads 19.
Insert into: 3 seconds
Table 'Worktable'. Scan count 201, logical reads 6574, physical reads 0, read-ahead reads 0.
Table 'jbmTest2'. Scan count 0, logical reads 1000001, physical reads 0, read-ahead reads 0.
Table 'syscolumns'. Scan count 2, logical reads 45, physical reads 1, read-ahead reads 17.
The select into has the advantage that no checking is required on doubles/constraints because there can't be none.
That might explain why insert into requires 1000001 logical reads. (even though there are no constraints/indexes specified?)
December 18, 2007 at 2:09 pm
Jeff Moden (12/14/2007)
I think David recommends his method because SELECT/INTO will temporarily lock SysObjects while the data is transferring...However, I prefer the SELECT/INTO method for two reasons...
1. If you CAN temporarily change to the Bulk or Simple recovery modes, there's a lot less logging (other factors must also be true... see Books Online for details).
2. It's nasty fast even with a couple of million rows... you won't have the locks on SysObjects for long. Other processes will simply wait.
You ain't kidding about fast. I inherited a stored procedure that ran like a lame dog when dealing with a million or so records - took about ten hours to process. :sick:
I tracked it down to an Insert Into. I switched it to a Select...Into and the time dropped to 15 minutes for the entire procedure. 😎 Needless to say, the folks I work for were VERY happy about it.
If there's any other way to wring additional speed out of such an operation I'd LOVE to hear about it! 😀
December 18, 2007 at 2:52 pm
10 hours to 15 minutes? Heh... I like SELECT/INTO a lot but I've not seen that type of performance gain just by switching like that before... Are you sure you're not overlooking something else that you did?
--Jeff Moden
Change is inevitable... Change for the better is not.
December 19, 2007 at 5:36 am
Jeff Moden (12/18/2007)
10 hours to 15 minutes? Heh... I like SELECT/INTO a lot but I've not seen that type of performance gain just by switching like that before... Are you sure you're not overlooking something else that you did?
You know, now that you mention it I also change a Delete From to a Truncate Table. I'd forgotten that until I just reviewed the code to answer your question. Ah, the power of minimally logged operations!
Viewing 15 posts - 1 through 15 (of 23 total)
You must be logged in to reply to this topic. Login to reply