SELECT * INTO table2 FROM table1 - Performance Impact - what do you think?

  • GilaMonster, that is a very good question. We are doing a kind of data warehousing; here we are creating a set of tables for reports, and these tables are created after running this procedure. Since we are dealing with huge data volumes, we went for SELECT * INTO. Also, I didn't mention one thing: we are doing some aggregate functions in the select query, for example SELECT c1, SUM(c2), c3, c4... INTO table2 FROM table1 GROUP BY c1, c3, c4.

    The query is using table spool operators and so on, but I am restricted from changing the logic for certain reasons.
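    A cleaned-up sketch of the pattern described above (table and column names are placeholders taken from the post, so treat this as illustrative only):

    ```sql
    -- Aggregate SELECT ... INTO; the new table is created on the fly.
    -- Aggregated columns must be aliased, or INTO cannot name them.
    SELECT   c1,
             SUM(c2) AS sum_c2,
             c3,
             c4
    INTO     table2
    FROM     table1
    GROUP BY c1, c3, c4;
    ```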

  • Daxesh Patel (6/13/2012)


    If a table holds a large amount of data, I think SELECT * INTO should be avoided due to its limitations.

    I would create the new table ahead of time with partitions (each partition on a different disk) and use INSERT INTO ... SELECT.

    I would also consider, in my tests, applying data compression to both tables.

    Be careful about this approach. Partitioning is a data management tool, not a performance enhancement tool. You could actually slow down the inserts instead of speeding them up.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • I've sped up a lot of procedures recently by removing SELECT ... INTOs and changing the procedures to work directly off the base tables. Try that if it's possible and see if it's faster. With good indexing, it may very well be.

    Are you saying then that the query you want help with is not the SELECT * INTO Table2 FROM Table1 that you asked about and that many people have spent time trying to help you with?

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • No, Gail Shaw, it is not like that. We are also doing both of the steps.

  • Varun R (6/13/2012)


    GilaMonster, that is a very good question. We are doing a kind of data warehousing; here we are creating a set of tables for reports, and these tables are created after running this procedure. Since we are dealing with huge data volumes, we went for SELECT * INTO. Also, I didn't mention one thing: we are doing some aggregate functions in the select query, for example SELECT c1, SUM(c2), c3, c4... INTO table2 FROM table1 GROUP BY c1, c3, c4.

    The query is using table spool operators and so on, but I am restricted from changing the logic for certain reasons.

    So, you are dropping the tables and recreating them each time? If so, why? Why not just add/update the missing/changed data? If this is a "warehouse," data should not be changing all that frequently (some may argue it should not change at all, only be inserted) except for inserts.

    Jared
    CE - Microsoft

  • This is not a pure data warehouse. This code was developed by another team; currently we are doing optimization. We can't change this logic, because it would affect the whole system as it stands. I know that what they did is wrong, but currently our focus is to improve the performance of the existing procedures.

  • Grant Fritchey (6/13/2012)


    Daxesh Patel (6/13/2012)


    If a table holds a large amount of data, I think SELECT * INTO should be avoided due to its limitations.

    I would create the new table ahead of time with partitions (each partition on a different disk) and use INSERT INTO ... SELECT.

    I would also consider, in my tests, applying data compression to both tables.

    Be careful about this approach. Partitioning is a data management tool, not a performance enhancement tool. You could actually slow down the inserts instead of speeding them up.

    I have to take my statement back; yes, partitions may slow down inserts, especially when the destination table is wipe-and-load.

    Thanks

  • How about ensuring the destination table is on a different filegroup in the database from the source, with that filegroup mapped to files on different physical disks from the source?

    Mike

  • Varun R (6/13/2012)


    Hi All,

    Data: 9 million records

    144 GB RAM

    What about the performance of SELECT * INTO table2 FROM table1 in the BULK_LOGGED recovery model?

    Here it is taking around 1 hour.

    Does anyone have any idea how to improve the query to 1-10 minutes?

    Regards,

    Varun

    I have to echo Gail's thoughts on this. Instead of copying the data, just use it. Write good queries against good indexes and you should have no problems. When it comes to properly written queries, 9 million rows just isn't that much.

    As a sidebar, if you can move "reporting" to a different server that's kept up to date by (hopefully) a SAN snapshot, then do it. You won't have to worry about copying 9-million-row tables, and you won't have to worry about a reporting user's mistaken query dragging the system down to a crawl.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Daxesh Patel (6/13/2012)


    if table has a large size data, I think SELECT * INTO should be avoided due to its limitations.

    What "limitations" are you speaking of?

    --Jeff Moden



  • I feel the best option would be to use SET ROWCOUNT with a WHILE loop to move the data in batches. This could improve the performance a bit.
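    For what it's worth, SET ROWCOUNT is deprecated for INSERT/UPDATE/DELETE, and a TOP-based loop is the usual modern equivalent. A minimal sketch, assuming the table1/table2 names from the thread and assuming c1 can identify rows not yet copied (neither assumption comes from the original posts):

    ```sql
    -- Batched copy: repeat until a partial batch signals the end.
    DECLARE @batch INT = 100000;

    WHILE 1 = 1
    BEGIN
        INSERT INTO table2 (c1, c2, c3, c4)
        SELECT TOP (@batch) s.c1, s.c2, s.c3, s.c4
        FROM   table1 AS s
        WHERE  NOT EXISTS (SELECT 1 FROM table2 AS t WHERE t.c1 = s.c1);

        IF @@ROWCOUNT < @batch BREAK;  -- fewer rows than the batch size = done
    END;
    ```

    Keeping batches modest also keeps the transaction log from ballooning the way one giant insert can.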

  • Jeff Moden (6/13/2012)


    Daxesh Patel (6/13/2012)


    if table has a large size data, I think SELECT * INTO should be avoided due to its limitations.

    What "limitations" are you speaking of?

    It has many limitations, mentioned here.

    Some of them:

    - you cannot specify a filegroup; INTO always creates the new table in the default filegroup

    - you cannot specify the data types of columns (especially computed columns in the SELECT)

    Basically, there is no control over the new table definition.

  • Daxesh Patel (6/14/2012)


    ...

    - you cannot specify the data types of columns (especially computed columns in the SELECT)

    Basically, there is no control over the new table definition.

    You can specify data types of columns:

    SELECT CAST(NULL AS TINYINT) AS Col1,
           CAST(NULL AS CHAR(10)) AS Col2,
           CAST(1 + 15/100 AS NUMERIC(10,5)) AS Col3
    INTO   NewTable;

    _____________________________________________
    "The only true wisdom is in knowing you know nothing"
    "O skol'ko nam otkrytiy chudnyh prevnosit microsofta duh!":-D
    (So many miracle inventions provided by MS to us...)

    How to post your question to get the best and quick help

  • Daxesh Patel (6/14/2012)


    ...

    - you cannot specify a filegroup; INTO always creates the new table in the default filegroup

    ...

    It depends; sometimes (for example, in a DW serial load) the following is possible:

    ALTER DATABASE MyDB
    MODIFY FILEGROUP NewOne DEFAULT;
    GO

    SELECT CAST(NULL AS TINYINT) AS Col1,
           CAST(NULL AS CHAR(10)) AS Col2,
           CAST(1 + 15/100 AS NUMERIC(10,5)) AS Col3
    INTO   NewTable;
    GO

    ALTER DATABASE MyDB
    MODIFY FILEGROUP OldOne DEFAULT;
    GO


  • Thanks for the cover, Eugene.

    You can even specify whether a column is supposed to be NULLable through the use of ISNULL.

    About the only things that you can't do with SELECT/INTO are transfer the formulas for computed columns and automatically transfer keys/indexes. Even in the FULL recovery model, SELECT/INTO is frequently faster than creating the table first.

    In other words, "It Depends". 😀
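    A small sketch of the ISNULL trick: because ISNULL returns a non-nullable expression, SELECT ... INTO creates that column as NOT NULL (NewTable2 is an illustrative name, not from the thread):

    ```sql
    SELECT ISNULL(CAST(NULL AS INT), 0) AS Col1,  -- created NOT NULL
           CAST(NULL AS VARCHAR(20))    AS Col2   -- created NULLable
    INTO   NewTable2;
    ```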

    --Jeff Moden


