Adding duplicate rows to a resultset

Question

Adding duplicate rows to a resultset

G33kKahuna

SSCommitted

Points: 1726
More actions
March 24, 2009 at 2:50 pm

#207447

I have an interesting challenge to create duplicate rows based on a column value in a result set.
As an example, I have a resultset with 2 records.
---------------------------------
| id | Product | Description | Qty |
---------------------------------
| 1 | P1 | Product 1 | 2 |
| 2 | P2 | Product 2 | 3 |
---------------------------------
The resultset should look like below with 5 rows
---------------------------------
| id | Product | Description | Qty |
---------------------------------
| 1 | P1 | Product 1 | 2 |
| 1 | P1 | Product 1 | 2 |
| 2 | P2 | Product 2 | 3 |
| 2 | P2 | Product 2 | 3 |
| 2 | P2 | Product 2 | 3 |
---------------------------------
I thought about using temp tables and repeating the rows based on the qty value with recursive loops. I wasnt sure if there is a better option or possibly a query trick.
help is much appreciated!
July 21, 2023 at 8:44 am

This was removed by the editor as SPAM

Viewing 10 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply

andrewd.smith SSCrazy Eights Points: 8444 More actions · Answer 1

You should use a Tally (or Numbers) table for this. There are several articles about Tally tables on this site. You can join to your table to a Tally table, limiting the number of rows selected from the Tally table based on the value of your Qty column.

Florian Reischl SSC-Dedicated Points: 37301 More actions · Answer 2

Hi

You can use a recursive CTE with a counter like this:

DECLARE @t TABLE (id INT, product VARCHAR(100), qty INT)

INSERT INTO @t

SELECT 1, 'P1', 2

UNION SELECT 2, 'P2', 3

;WITH recurse (id, product, qty, cnt) AS

(

SELECT id, product, qty, qty

FROM @t

UNION ALL

SELECT id, product, qty, cnt - 1

FROM recurse

WHERE cnt - 1 != 0

)

SELECT id, product, qty

FROM recurse

ORDER BY id

Greets

Flo

Florian Reischl SSC-Dedicated Points: 37301 More actions · Answer 3

andrewd.smith (3/24/2009)
You should use a Tally (or Numbers) table for this. There are several articles about Tally tables on this site. You can join to your table to a Tally table, limiting the number of rows selected from the Tally table based on the value of your Qty column.

... sure you are right. The Tally Solution is the better approach!

SELECT id, product, qty

FROM @t t

JOIN Tally ON Tally.N <= t.qty

Thanks for correction! 🙂

Flo

G33kKahuna SSCommitted Points: 1726 More actions · Answer 4

G33kKahuna

SSCommitted

Points: 1726

March 27, 2009 at 8:52 am

#967497

Thanks guy .... tally table worked like a charm

Paul White SSC Guru Points: 150467 More actions · Answer 5

If the maximum value for quantity is relatively small, or if you don't have a Numbers Table available, something like the following provides an efficient alternative:

declare @data

table (

id integer primary key,

product char(2) not null,

[description] varchar(100) not null,

quantity tinyint not null

);

insert @data

(

id,

product,

[description],

quantity

)

values (

1,

'P1',

'Product 1',

2

);

insert @data

(

id,

product,

[description],

quantity

)

values (

2,

'P2',

'Product 2',

3

);

select d.id,

d.product,

d.[description],

d.quantity

from @data d

cross

apply (

-- Assuming MAX(quantity) <= 9

select convert(integer, 1) union all

select 2 union all

select 3 union all

select 4 union all

select 5 union all

select 6 union all

select 7 union all

select 8 union all

select 9

) ca (rn)

where ca.rn <= quantity

order by

d.id;

For larger values, a dynamic Numbers Table could be used (generated using TOP (n) and the ROW_NUMBER() function over some system table CROSS JOINed to itself a number of times:

select top (1000)

rn = ROW_NUMBER() over (order by a1.allocation_unit_id)

from master.sys.allocation_units a1,

master.sys.allocation_units a2,

master.sys.allocation_units a3,

master.sys.allocation_units a4

order by

rn;

Paul White
SQLPerformance.com
SQLkiwi blog
@SQL_Kiwi

Jeff Moden SSC Guru Points: 1004354 More actions · Answer 6

Paul White (12/4/2009)
For larger values, a dynamic Numbers Table could be used (generated using TOP (n) and the ROW_NUMBER() function over some system table CROSS JOINed to itself a number of times:
select top (1000)
rn = ROW_NUMBER() over (order by a1.allocation_unit_id)
from master.sys.allocation_units a1,
master.sys.allocation_units a2,
master.sys.allocation_units a3,
master.sys.allocation_units a4
order by
rn;

Ooooo.... be careful. The idea of using a single cross join is a great idea because it's nasty fast and doesn't cost much. When you start adding addition cross-joins and you go over the boundry of one of the single cross-joins, it starts to become very expensive insofar as possible growth in the TempDB log file.

I built a "little" race between the Cross-Join method and the cascading CTE method that Itzik has in his book to build a Billion row numeric table just to see wha'hapuns. 😛 The cascading CTE method did nothing to TempDB... the multiple Cross-Join method cause the TempDB logfile to blow out to over 40GB.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Paul White SSC Guru Points: 150467 More actions · Answer 7

Thanks Jeff,

Yes one should be careful. I wouldn't advocate using the method for billions of rows 😉

I had much smaller maximum values for 'Quantity' in mind - maybe thousands.

(And if a Tally/Numbers table is to hand that's obviously the way to go anyway)

I have heard you mention your test versus Itzik's method before - one day I'd like to look into it for myself. Is there anywhere I can find the code for your tests? Just if it's easy - it intrigues me slightly. I imagine the issue is the normal one (row count spools) but I'd be good to know for sure.

Paul

Paul White
SQLPerformance.com
SQLkiwi blog
@SQL_Kiwi

Jeff Moden SSC Guru Points: 1004354 More actions · Answer 8

Paul White (12/6/2009)
Thanks Jeff,
Yes one should be careful. I wouldn't advocate using the method for billions of rows 😉
I had much smaller maximum values for 'Quantity' in mind - maybe thousands.
(And if a Tally/Numbers table is to hand that's obviously the way to go anyway)
I have heard you mention your test versus Itzik's method before - one day I'd like to look into it for myself. Is there anywhere I can find the code for your tests? Just if it's easy - it intrigues me slightly. I imagine the issue is the normal one (row count spools) but I'd be good to know for sure.
Paul

Ah... you can do the tests for yourself... under 8k, the Tally table normally wins especially after it caches. For VARCHAR(MAX), believe it or not, a well written WHILE loop will many times win (of course, I'm thinking "splits" here). Of course, for splits, a very well written CLR will win (and I do mean very well written like the guy that FLO introduced on that very long split thread).

If you need something up to 16 million, I'd recommend the single cross join on sys.All_Columns in 2k5.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)