SELECT * INTO table2 FROM table1 - Performance Impact - What do you think?

  • Hi All,

    Data: 9 million records

    RAM: 144 GB

    What about the performance of SELECT * INTO table2 FROM table1 under the bulk-logged recovery model?

    Here it is taking around 1 hour.

    Does anyone have any idea how to get this down to 1-10 minutes?

    Regards,

    Varun

  • If possible, use a WHERE clause to limit the data instead of copying all 9 million rows.

    Pre-allocate the storage space required so SQL Server doesn't need to grow the files during the load. A sketch follows below.
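
    A minimal sketch of the pre-allocation idea, assuming hypothetical database and logical file names (MyDb, MyDb_Data, MyDb_Log); size the files to comfortably hold the copy before running it:

        -- Grow the data and log files up front so autogrow doesn't interrupt the load.
        -- Database name, logical file names, and sizes are placeholders.
        ALTER DATABASE MyDb MODIFY FILE (NAME = MyDb_Data, SIZE = 120GB);
        ALTER DATABASE MyDb MODIFY FILE (NAME = MyDb_Log,  SIZE = 20GB);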

  • What is the total size of these 9 million rows?



    Ole Kristian Velstadbråten Bangås - Virinco - Facebook - Twitter

    Concatenating Row Values in Transact-SQL

  • sp_spaceused output (values in KB):

    reserved: 60,656,288
    data: 32,717,008
    index_size: 27,924,440
    unused: 14,840
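
    (That output comes from a call along these lines; the table name is a placeholder:)

        -- Reports reserved, data, index, and unused space for one table, in KB.
        EXEC sp_spaceused N'dbo.table1';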

  • If the table holds a large amount of data, I think SELECT * INTO should be avoided due to its limitations.

    I would create the new table ahead of time with partitions (each partition on a different disk) and use INSERT INTO ... SELECT, along the lines of the sketch below.

    I would also consider testing data compression on both tables.
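
    A rough sketch of that approach, assuming an INT id column; the filegroup, table, column, and boundary values are all placeholders:

        -- Spread the target across filegroups (one per disk) and page-compress it.
        CREATE PARTITION FUNCTION pf_id (INT)
            AS RANGE RIGHT FOR VALUES (3000000, 6000000);

        CREATE PARTITION SCHEME ps_id
            AS PARTITION pf_id TO (FG1, FG2, FG3);

        CREATE TABLE dbo.table2
        (
            id      INT          NOT NULL,
            payload VARCHAR(100) NULL
        )
        ON ps_id (id)
        WITH (DATA_COMPRESSION = PAGE);

        INSERT INTO dbo.table2 (id, payload)
        SELECT id, payload
        FROM dbo.table1;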

  • Is that for the DB or the table?

  • I would look into SSIS and play with the fast-load options and the batch size to get an optimal solution. Consider dropping the indexes on the target table and recreating them after the load; a sketch follows below.

    Edit: no files are involved, so forget BULK INSERT.
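
    A minimal sketch of the drop-and-recreate idea; index, table, and column names are placeholders:

        -- Load into the table without the index, then rebuild it afterwards.
        DROP INDEX ix_table2_id ON dbo.table2;

        -- ... run the bulk load here ...

        CREATE NONCLUSTERED INDEX ix_table2_id
            ON dbo.table2 (id);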

  • Daxesh Patel (6/13/2012)


    If the table holds a large amount of data, I think SELECT * INTO should be avoided due to its limitations.

    I would create the new table ahead of time with partitions (each partition on a different disk) and use INSERT INTO ... SELECT.

    I would also consider testing data compression on both tables.

    As the DB is in BULK_LOGGED mode, SELECT ... INTO is minimally logged, whereas INSERT INTO ... SELECT is fully logged, so it will write more to the transaction log, resulting in a potentially longer execution time.
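
    (A quick way to confirm the recovery model; the database name is a placeholder:)

        -- Shows SIMPLE, FULL, or BULK_LOGGED for the database in question.
        SELECT name, recovery_model_desc
        FROM sys.databases
        WHERE name = N'YourDatabase';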

  • As already mentioned, it would be better to run this in batches (see the sketch below). However, is your database used in the evening? Are there any downtime periods for the application that would limit the impact on users?

    If the DB is not used in the evening, that would be an ideal time to run your script to populate the other table.
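
    A hypothetical batching sketch, keyed on an ascending id column; table, column, and batch-size values are placeholders:

        -- Copy rows in 500k chunks so each transaction stays small and restartable.
        DECLARE @batch INT = 500000,
                @maxId INT = 0;

        -- Resume from whatever has already been copied.
        SELECT @maxId = ISNULL(MAX(id), 0) FROM dbo.table2;

        WHILE EXISTS (SELECT 1 FROM dbo.table1 WHERE id > @maxId)
        BEGIN
            INSERT INTO dbo.table2 (id, payload)
            SELECT TOP (@batch) id, payload
            FROM dbo.table1
            WHERE id > @maxId
            ORDER BY id;

            -- Advance the watermark to the highest id copied so far.
            SELECT @maxId = MAX(id) FROM dbo.table2;
        END;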

    ***The first step is always the hardest *******

  • anthony.green (6/13/2012)


    Daxesh Patel (6/13/2012)


    [quoted text snipped]

    As the DB is in BULK_LOGGED mode, SELECT ... INTO is minimally logged, whereas INSERT INTO ... SELECT is fully logged, so it will write more to the transaction log, resulting in a potentially longer execution time.

    Agreed, but overall performance should be tested under different scenarios. SELECT * INTO creates the new table in the default filegroup, so you cannot take advantage of partitions. With the log file on a different disk, the extra logging may cost less overall than the gain in write performance. Again, I don't know what infrastructure is available.

    If enough disks are available, this is worth trying; if not, then yes, SELECT ... INTO is the best option and not much can be done beyond improving the read side (e.g. compression).

    Also, this operation can be minimally logged...

    This is from BOL

    Using INSERT INTO…SELECT to Bulk Import Data with Minimal Logging

    You can use INSERT INTO <target_table> SELECT <columns> FROM <source_table> to efficiently transfer a large number of rows from one table, such as a staging table, to another table with minimal logging. Minimal logging can improve the performance of the statement and reduce the possibility of the operation filling the available transaction log space during the transaction.

    Minimal logging for this statement has the following requirements:

    - The recovery model of the database is set to simple or bulk-logged.
    - The target table is empty or is a nonempty heap.
    - The target table is not used in replication.
    - The TABLOCK hint is specified for the target table.

    Rows that are inserted into a heap as the result of an insert action in a MERGE statement may also be minimally logged.

    Unlike the BULK INSERT statement, which holds a less restrictive Bulk Update lock, INSERT INTO…SELECT with the TABLOCK hint holds an exclusive (X) lock on the table. This means that you cannot insert rows using parallel insert operations.
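
    A minimal sketch of that pattern, assuming dbo.table2 was pre-created as an empty heap (all names are placeholders):

        -- TABLOCK makes the insert eligible for minimal logging under
        -- simple or bulk-logged recovery, per the requirements above.
        INSERT INTO dbo.table2 WITH (TABLOCK)
        SELECT *
        FROM dbo.table1;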

  • Daxesh Patel (6/13/2012)


    [nested quotes and BOL excerpt snipped - see the previous posts]

    This is also assuming that there are other benefits to the partitioning. If the OP is simply trying to speed things up even though the current operation is acceptable, partitioning may cause more problems than it solves. Also, the OP states the machine has 144 GB of RAM, but we don't know how the instance is configured to use that memory. There are many variables here that need to be considered to give a proper answer.

    Jared
    CE - Microsoft

  • All of your information is valid, but performance is not improving.

    Sometimes batches take a much longer time 🙂

  • How many processors/cores? What is your instance's maxdop setting? What is the instance's max memory setting?

    Jared
    CE - Microsoft

  • 24 processors

    Max degree of parallelism: 0

    Max server memory: 2147483647 MB (the default)

    Any clue?

    Yeah, first set your server's max server memory to a sensible value. For 144GB of memory, that would likely be around 130GB (see the sp_configure sketch below).
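
    A minimal sketch, assuming ~130 GB is the right cap for this box (the setting is in MB):

        -- Cap SQL Server's memory so the OS keeps some headroom.
        EXEC sp_configure 'show advanced options', 1;
        RECONFIGURE;
        EXEC sp_configure 'max server memory (MB)', 133120;  -- 130 GB
        RECONFIGURE;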

    Taking a few steps back... Why is a SELECT ... INTO of 9 million rows being run often enough that its performance is a concern? What's the reasoning and background?

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
