February 17, 2014 at 9:33 am
Hello!
I have two tables with this info:
TABLE 1
COL1 COL2 COL3
AAA BBB CCC
QQQ WWW EEE
AAA SSS DDD
WWW EEE RRR
BBB BBB BBB
TABLE 2
COL1 COL2 COL3 COL4
b b b 343
a a a 344
c c c 345
d d d 346
e e e 347
I want to insert TABLE 1 into TABLE 2 with a query that will auto increment to COL4 looking like this:
COL1 COL2 COL3 COL4
b b b 343
a a a 344
c c c 345
d d d 346
e e e 347
AAA BBB CCC 348
QQQ WWW EEE 349
AAA SSS DDD 350
WWW EEE RRR 351
BBB BBB BBB 352
I know this could be done easily by altering the column to an auto-increment (IDENTITY) datatype, but I can't do that at the moment. Any help on this would be awesome.
February 17, 2014 at 9:42 am
imba215 (2/17/2014)
Hello! I have two tables with this info:
TABLE 1
COL1 COL2 COL3
AAA BBB CCC
QQQ WWW EEE
AAA SSS DDD
WWW EEE RRR
BBB BBB BBB
TABLE 2
COL1 COL2 COL3 COL4
b b b 343
a a a 344
c c c 345
d d d 346
e e e 347
I want to insert TABLE 1 into TABLE 2 with a query that will auto increment to COL4 looking like this:
COL1 COL2 COL3 COL4
b b b 343
a a a 344
c c c 345
d d d 346
e e e 347
AAA BBB CCC 348
QQQ WWW EEE 349
AAA SSS DDD 350
WWW EEE RRR 351
BBB BBB BBB 352
I know this could be done easily by altering the column to an auto-increment (IDENTITY) datatype, but I can't do that at the moment. Any help on this would be awesome.
Insert into [Table 2] (Col1, Col2, Col3, Col4)
select Col1, Col2, Col3
, (Select max(Col4) from [Table 2]) + ROW_NUMBER() over (Order by [Col1])
from [Table 1]
Russel Loski, MCSE Business Intelligence, Data Platform
February 17, 2014 at 9:44 am
Something like this?
Insert TableB (col1, col2, col3, col4)
select col1, col2, col3, ROW_NUMBER() over (order by newid()) + (select MAX(col4) from tableb)
from tablea
--edit--
Looks like Russel was typing almost the exact same thing as I was...only faster. 😉
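One caveat worth adding (an editorial note, not part of the original replies): the MAX(col4) + ROW_NUMBER() pattern is not safe under concurrent writers, because two sessions can read the same maximum and generate duplicate values. A sketch of one mitigation, serialising the insert with an exclusive table lock (table names follow the reply above):

```sql
-- Sketch: hold an exclusive table lock for the duration of the insert so
-- two sessions can't read the same MAX(col4) and collide.
begin transaction;

insert TableB with (tablockx) (col1, col2, col3, col4)
select col1, col2, col3,
       row_number() over (order by newid()) + (select max(col4) from TableB)
from TableA;

commit transaction;
```

If only one process ever inserts into the table, the plain versions above are fine as-is.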
_______________________________________________________________
Need help? Help us help you.
Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.
Need to split a string? Try Jeff Moden's splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.
Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/
February 17, 2014 at 2:42 pm
Thank you so much!
February 20, 2014 at 7:46 am
This is interesting. Intuitively, I'd expect pre-calculating the max value to be more efficient than using a correlated subquery:
declare @MaxId int
select @MaxId = max(col4) from #TableB
insert #TableB (Col1, Col2, Col3, Col4)
select Col1, Col2, Col3, row_number() over (order by newid()) + @MaxId
from #TableA
But testing says otherwise: the variable-based approach is significantly worse, and I wonder why. I'd (perhaps simplistically) expect the subquery to execute once per row of the main query, but it clearly doesn't.
Test script if you're interested:
-- create some test tables and data
create table #TableA (Col1 varchar(10),Col2 varchar(10),Col3 varchar(10))
create table #TableB (Col1 varchar(10),Col2 varchar(10),Col3 varchar(10),Col4 int)
declare @i int
set @i = 1
-- there are better ways to do this, but this is easy to understand
while @i <= 100000
begin
    insert #TableB values ('a','b','c',@i)
    -- only add rows to A for every 10th row added to B
    if @i % 10 = 0
    begin
        insert #TableA values ('AAA','BBB','CCC')
    end
    set @i = @i + 1
end
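-- Editor's note (a sketch, not in the original script): the loop above can be
-- replaced by a single set-based insert driven by a numbers CTE. Left
-- commented out so the row counts below still come from the loop.
-- ;with nums as (
--     select top (100000) row_number() over (order by (select null)) as n
--     from sys.all_objects a cross join sys.all_objects b
-- )
-- insert #TableB (Col1, Col2, Col3, Col4)
-- select 'a', 'b', 'c', n from nums
-- -- (#TableA can be filled the same way with top (10000))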
select count(*) from #TableA
-- 10000
select count(*) from #TableB
-- 100000
set statistics io on
-----------------------
-- test without indexes
-----------------------
-- DON'T DO THIS ON A PRODUCTION SYSTEM
dbcc dropcleanbuffers
dbcc freeproccache
-- correlated subquery version
insert #TableB (Col1, Col2, Col3, Col4)
select Col1, Col2, Col3, row_number() over (order by newid()) + (select max(Col4) from #TableB)
from #TableA
/*
Table '#TableB'. Scan count 1, logical reads 10349, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table '#TableA'. Scan count 1, logical reads 34, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
*/
-- clear the newly added data
delete #TableB where Col4 > 100000
dbcc dropcleanbuffers
dbcc freeproccache
-- pre calculated id version
declare @MaxId int
select @MaxId = max(col4) from #TableB
insert #TableB (Col1, Col2, Col3, Col4)
select Col1, Col2, Col3, row_number() over (order by newid()) + @MaxId
from #TableA
/*
Table '#TableB'. Scan count 1, logical reads 349, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table '#TableB'. Scan count 0, logical reads 10037, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table '#TableA'. Scan count 1, logical reads 34, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
*/
set statistics io off
-- clear the newly added data
delete #TableB where Col4 > 100000
------------------
-- test with index
------------------
create index ix1 on #TableB(col4)
dbcc dropcleanbuffers
dbcc freeproccache
-- correlated subquery version
insert #TableB (Col1, Col2, Col3, Col4)
select Col1, Col2, Col3, row_number() over (order by newid()) + (select max(Col4) from #TableB)
from #TableA
/*
Table '#TableB'. Scan count 1, logical reads 30194, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table '#TableA'. Scan count 1, logical reads 34, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
*/
-- clear the newly added data
delete #TableB where Col4 > 100000
dbcc dropcleanbuffers
dbcc freeproccache
-- pre calculated id version
select @MaxId = max(col4) from #TableB
insert #TableB (Col1, Col2, Col3, Col4)
select Col1, Col2, Col3, row_number() over (order by newid()) + @MaxId
from #TableA
/*
Table '#TableB'. Scan count 1, logical reads 3, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table '#TableB'. Scan count 0, logical reads 30191, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table '#TableA'. Scan count 1, logical reads 34, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
*/
----------------------------
-- test with clustered index
----------------------------
delete #TableB where Col4 > 100000
drop index ix1 on #TableB
create clustered index ix1 on #TableB(col4)
dbcc dropcleanbuffers
dbcc freeproccache
-- correlated subquery version
insert #TableB (Col1, Col2, Col3, Col4)
select Col1, Col2, Col3, row_number() over (order by newid()) + (select max(Col4) from #TableB)
from #TableA
/*
Table '#TableB'. Scan count 1, logical reads 21551, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table '#TableA'. Scan count 1, logical reads 34, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
*/
-- clear the newly added data
delete #TableB where Col4 > 100000
-- pre calculated id version
select @MaxId = max(col4) from #TableB
insert #TableB (Col1, Col2, Col3, Col4)
select Col1, Col2, Col3, row_number() over (order by newid()) + @MaxId
from #TableA
/*
Table '#TableB'. Scan count 1, logical reads 3, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table '#TableB'. Scan count 0, logical reads 21548, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table '#TableA'. Scan count 1, logical reads 34, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
*/
set statistics io off
------------
-- test time
------------
-- run inserts 100 times
create table #results (i int, t char(1), s datetime, e datetime, diff int)
declare @s datetime
, @e datetime
set @i = 1
while @i <= 100
begin
    set @s = getdate()
    -- variable version
    select @MaxId = max(col4) from #TableB
    insert #TableB (Col1, Col2, Col3, Col4)
    select Col1, Col2, Col3, row_number() over (order by newid()) + @MaxId
    from #TableA
    set @e = getdate()
    insert #results select @i, 'V', @s, @e, datediff(ms, @s, @e)
    delete #TableB where Col4 > 100000
    dbcc dropcleanbuffers
    dbcc freeproccache
    set @s = getdate()
    -- correlated subquery version
    insert #TableB (Col1, Col2, Col3, Col4)
    select Col1, Col2, Col3, row_number() over (order by newid()) + (select max(Col4) from #TableB)
    from #TableA
    set @e = getdate()
    insert #results select @i, 'C', @s, @e, datediff(ms, @s, @e)
    delete #TableB where Col4 > 100000
    dbcc dropcleanbuffers
    dbcc freeproccache
    set @i = @i + 1
end
-- show the result
select * from #results
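A small summary query (an editorial addition to the script above) makes the comparison easier to read than the raw rows:

```sql
-- average, min and max duration (ms) per approach:
-- 'V' = variable version, 'C' = subquery version
select t,
       avg(diff) as avg_ms,
       min(diff) as min_ms,
       max(diff) as max_ms
from #results
group by t;
```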
February 20, 2014 at 6:38 pm
irobertson (2/20/2014)
This is interesting. Intuitively, I'd expect pre-calculating the max value to be more efficient than using a correlated subquery:
declare @MaxId int
select @MaxId = max(col4) from #TableB
insert #TableB (Col1, Col2, Col3, Col4)
select Col1, Col2, Col3, row_number() over (order by newid()) + @MaxId
from #TableA
But testing says otherwise: the variable-based approach is significantly worse, and I wonder why. I'd (perhaps simplistically) expect the subquery to execute once per row of the main query, but it clearly doesn't.
I have seen cases where putting the aggregate into a local variable instead of doing it in the main query does indeed help. It depends on whether the optimizer is smart enough to realize that's something it only needs to do once.
I would not have even considered a loop for this, mainly because of my mantra (below).
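An editorial aside: strictly speaking, `(select max(Col4) from #TableB)` is an *uncorrelated* scalar subquery (it references nothing from the outer query), which is why it can be evaluated a single time. One way to check what the optimizer actually did is to capture the plan; a sketch for the subquery version:

```sql
-- Sketch: capture the actual execution plan for the insert. If the
-- aggregate computing MAX(Col4) appears once in the plan, feeding a
-- Compute Scalar, the subquery was evaluated a single time rather
-- than once per row.
set statistics xml on;

insert #TableB (Col1, Col2, Col3, Col4)
select Col1, Col2, Col3,
       row_number() over (order by newid()) + (select max(Col4) from #TableB)
from #TableA;

set statistics xml off;
```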
My thought question: Have you ever been told that your query runs too fast?
My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.
Need to UNPIVOT? Why not CROSS APPLY VALUES instead?
Since random numbers are too important to be left to chance, let's generate some!
Learn to understand recursive CTEs by example.