Select INTO vs. Creating a table

  • I'm reviewing some code that will run on fairly large record sets (~1M records or more). In the code the developer is selecting about 25 columns (mostly integers) and casting them to Numeric (38,10) into another table.

    select CAST(COL1 as NUMERIC(38,10),

    CAST(COL2 as NUMERIC(38,10),

    .

    .

    .

    CAST(COL25 as NUMERIC(38,10)

    INTO TBL2

    FROM TBL1

    WHERE XYZ

    Strictly from a performance standpoint would it be more efficient to declare the table ahead of time with the expected data types and do an INSERT without CASTing values instead of the SELECT INTO with the explicit casts or does it not matter since SQL will need to convert the data anyway?

    I'm thinking creating the table ahead of time would be more efficient as the CASTING would make SQL process each record one row at a time rather than set based.

    Just curious....

  • Michael Smith-356657 (1/21/2014)


    I'm reviewing some code that will run on fairly large record sets (~1M records or more). In the code the developer is selecting about 25 columns (mostly integers) and casting them to Numeric (38,10) into another table.

    select CAST(COL1 as NUMERIC(38,10),

    CAST(COL2 as NUMERIC(38,10),

    .

    .

    .

    CAST(COL25 as NUMERIC(38,10)

    INTO TBL2

    FROM TBL1

    WHERE XYZ

    Strictly from a performance standpoint would it be more efficient to declare the table ahead of time with the expected data types and do an INSERT without CASTing values instead of the SELECT INTO with the explicit casts or does it not matter since SQL will need to convert the data anyway?

    I'm thinking creating the table ahead of time would be more efficient as the CASTING would make SQL process each record one row at a time rather than set based.

    Just curious....

    Really isn't going to make much difference either way. Either you use an explicit cast with select into or you let the engine make an implicit cast.

    If you really want to know which way would perform better, you should setup the scenario on your test server and do some performance tests.

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • Michael Smith-356657 (1/21/2014)


    I'm thinking creating the table ahead of time would be more efficient as the CASTING would make SQL process each record one row at a time rather than set based.

    No. Are you thinking of user-defined scalar functions which are executed once for each row?

    If you define the table and specify a different data type, SQL has to convert anyway, it just does it implicitly.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Thanks for the replies. I had a feeling it wouldn't make a difference but didn't know if there were any behind the scenes efficienes the query optimizer could use.

  • It's a preference more than an optimization. I personally prefer to pre-construct my tables for clarity and to allow for index placement manipulation (usually applying the cluster before instead of after on mass inserts is more effecient in my code, Nonclusters are up for grabs as they usually don't matter as much).

    Not sure if that helps.


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

    For better assistance in answering your questions[/url] | Forum Netiquette
    For index/tuning help, follow these directions.[/url] |Tally Tables[/url]

    Twitter: @AnyWayDBA

  • Actually the SELECT ... INTO is more efficient if the statement can be minimally logged.

    I did a search in Books Online for something to verify that:

    "

    With minimal logging, using the SELECT… INTO statement can be more efficient than creating a table and then populating the table with an INSERT statement.

    "

    SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

  • Insert ... Select can also be minimally logged.

    Though how much it matters with temp tables which don't log redo information as it it (which is most of what's different between full logging and bulk logging) is another matter entirely.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • All depends on your scenario, env and load and most important - locking requirements.

    While doing INTO you lock tables fro the whole duration of your query.

    While loading result into pre-allocated is not, (or significantly decrease locking time)

    Mario

  • mario17 (1/21/2014)


    While doing INTO you lock tables fro the whole duration of your query.

    While loading result into pre-allocated is not, (or significantly decrease locking time)

    Whut? The source tables will be read for roughly the same amount of time if it's an Insert Into Select vs. a Select Into. The target will have difference schema locks (I think it depends on version) between Sch-S and Sch-M, as the Select Into will hold it during the creation. I do not offhand remember when it will release the Sch-M lock on the target.

    Other than that... nope, same locks.


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

    For better assistance in answering your questions[/url] | Forum Netiquette
    For index/tuning help, follow these directions.[/url] |Tally Tables[/url]

    Twitter: @AnyWayDBA

  • mario17 (1/21/2014)


    While doing INTO you lock tables fro the whole duration of your query.

    While loading result into pre-allocated is not, (or significantly decrease locking time)

    That was fixed in SQL 7 iirc.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass

Viewing 10 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply