How to write this query?

Question

Post reply

How to write this query?

manssourianm

Mr or Mrs. 500

Points: 556
More actions
March 21, 2018 at 4:54 pm

#410344

Hi,
I have this dataset:
create table test1 (
[Category] char(1) null,
[CNT] int null
)
insert into test1 ([Category], [CNT])
values ('A', 5)
;
insert into test1 ([Category], [CNT])
values ('B', 3)
;
How can I end up with this dataset:
create table test2
([Category] char(1) null)
insert into test2 (category)
values ('A')
;
insert into test2 (category)
values ('A')
;
insert into test2 (category)
values ('A')
;
insert into test2 (category)
values ('A')
;
insert into test2 (category)
values ('A')
;
insert into test2 (category)
values ('B')
;
insert into test2 (category)
values ('B')
;
insert into test2 (category)
values ('B')
;
The idea is one separate row for each category, that corresponds to the count of the category.
Any help is appreciated!
(Sorry, I couldn't fix the line spacing in the code...)
-M

Viewing 10 posts - 1 through 10 (of 10 total)

You must be logged in to reply to this topic. Login to reply

Lynn Pettis SSC Guru Points: 442470 More actions · Answer 1

Like this, for one method:
create table dbo.test1 ([Category] char(1) null, [CNT] int null );

create table dbo.test2
([Category] char(1) null);

insert into dbo.test1 ([Category], [CNT])
values ('A', 5),('B', 3);

WITH eTally(n) AS (SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) n FROM [sys].[all_columns] AS [ac] CROSS JOIN [sys].[all_columns] AS [ac2])
INSERT INTO dbo.test2([Category])
SELECT t1.[Category]
FROM
dbo.test1 t1
CROSS APPLY (SELECT TOP(t1.[CNT]) n FROM eTally ORDER BY n) t;

SELECT * FROM dbo.test2;

DROP TABLE [dbo].[test2];
DROP TABLE [dbo].[test1];

ScottPletcher SSC Guru Points: 101235 More actions · Answer 2

I don't really see the need for the overhead and potential issues of reading the sys.all_columns view.

;WITH cteTally10 AS ( SELECT * FROM (VALUES(0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) AS numbers(number) ), cteTally1000 AS ( SELECT ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) AS number FROM cteTally10 c1 CROSS JOIN cteTally10 c2 CROSS JOIN cteTally10 c3 ) --INSERT INTO dbo.test2 ( [Category] ) SELECT t1.[Category] FROM dbo.test1 t1 INNER JOIN cteTally1000 t ON t.number BETWEEN 1 AND t1.CNT

SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

Lynn Pettis SSC Guru Points: 442470 More actions · Answer 3

ScottPletcher - Thursday, March 22, 2018 10:48 AM
I don't really see the need for the overhead and potential issues of reading the sys.all_columns view.
;WITH cteTally10 AS ( SELECT * FROM (VALUES(0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) AS numbers(number) ), cteTally1000 AS ( SELECT ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) AS number FROM cteTally10 c1 CROSS JOIN cteTally10 c2 CROSS JOIN cteTally10 c3 ) --INSERT INTO dbo.test2 ( [Category] ) SELECT t1.[Category] FROM dbo.test1 t1 INNER JOIN cteTally1000 t ON t.number BETWEEN 1 AND t1.CNT

It was a quick, throw together cte based tally table, one that Jeff has even used. I could have just as easily taken the time to type this:
with e1(n) as (select 1 from (values (1),(1),(1),(1),(1),(1),(1),(1),(1),(1))dt(n)), e2(n) as (select 1 from e1 a cross join e1 b), e4(n) as (select 1 from e2 a cross join e2 b), etally(n) as (select n = row_number() over (order by (select null)) from e4 a cross join e4 b)

Jacob Wilkins One Orange Chip Points: 27976 More actions · Answer 4

To add a further tweak, nothing about the ROW_NUMBER is needed here, so there's no need to add the overhead of generating it.

Something like this is sufficient and avoids the unnecessary overhead:

WITH n AS (SELECT n FROM (VALUES(0),(0),(0),(0),(0),(0),(0),(0),(0),(0))n(n)), n3 AS (SELECT n=n1.n FROM n n1, n n2, n n3)

SELECT t1.[Category] FROM dbo.test1 t1 CROSS APPLY (SELECT TOP (cnt) n FROM n3)n;

The overhead's pretty small, but real. 🙂

Cheers!

EDIT: I should point out that this isn't guaranteed (that there's additional overhead); it's been a while since I've run the full battery of tests, but I remember that sometimes the optimizer implemented the query forms in the same way. Still, I'd rather not just hope that the optimizer gets it right, and if it's unnecessary, there's no need to clutter up the query 🙂

manssourianm Mr or Mrs. 500 Points: 556 More actions · Answer 5

Thanks everybody!
I ended up using the response by Mr. Jacob Wilkins.
There is something I'm not understanding though: The cte that is called 'n', why is it just a table of 10 zeros? I don't get how that makes this query work. Also, I noticed the query doesn't work for values over 1,000. What can I do to change that upper limit? Sorry if I'm being dense...
-m

Jacob Wilkins One Orange Chip Points: 27976 More actions · Answer 6

That sort of CTE is part of a common construct to form an on-the-fly tally or numbers table.

The specific values don't really matter; it's just generating a certain number of rows. In this case, the first CTE generates 10 rows. The second CTE cross joins it to itself 3 times, forming 10^3=1000 rows.

Most commonly, a ROW_NUMBER is calculated over those rows to generate the tally or numbers table. In this case, we don't need a list of numbers; we just need enough rows to "explode" your original rows into the appropriate number of new rows. For example, if you have something with a count of 5 on one row, we need to make that result in 5 new rows, so we use APPLY to "match" that row to 5 rows from the CTE.

If you need more than 10^3, just add an appropriate number of cross joins in the second CTE.

Cheers!

manssourianm Mr or Mrs. 500 Points: 556 More actions · Answer 7

Thank you sir, your explanation made me understand it perfectly. I added one more join for a 10K limit which should be enough.
-martin

The Dixie Flatline SSC Guru Points: 53354 More actions · Answer 8

Just thought I'd mention that I tried both cte tally schemes and they produce a different number of rows. Jacob, yours seems to produce a doubled result set.

__________________________________________________

Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

Jacob Wilkins One Orange Chip Points: 27976 More actions · Answer 9

The Dixie Flatline - Tuesday, March 27, 2018 11:14 AM
Just thought I'd mention that I tried both cte tally schemes and they produce a different number of rows. Jacob, yours seems to produce a doubled result set.

I'd need to see a repro script 🙂

IF OBJECT_ID('test1') IS NOT NULL DROP TABLE test1;

CREATE TABLE test1 ([Category] CHAR(1) NULL,[CNT] INT NULL); INSERT INTO test1 ([Category], [CNT]) VALUES ('A', 5),('B', 3);

WITH n AS (SELECT n FROM (VALUES(0),(0),(0),(0),(0),(0),(0),(0),(0),(0))n(n)), n3 AS (SELECT n=n1.n FROM n n1, n n2, n n3)

SELECT t1.[Category] FROM dbo.test1 t1 CROSS APPLY (SELECT TOP (cnt) n FROM n3)n;

WITH cteTally10 AS ( SELECT * FROM (VALUES(0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) AS numbers(number) ), cteTally1000 AS ( SELECT ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) AS number FROM cteTally10 c1 CROSS JOIN cteTally10 c2 CROSS JOIN cteTally10 c3 ) SELECT t1.[Category] FROM dbo.test1 t1 INNER JOIN cteTally1000 t ON t.number BETWEEN 1 AND t1.CNT;

Results:

Category -------- A A A A A B B B

(8 rows affected)

Category -------- A B A B A B A A

(8 rows affected)

If you mean the tally itself, and not its application to this particular query, I'd still need to see a repro script 🙂

IF OBJECT_ID('test1') IS NOT NULL DROP TABLE test1;

CREATE TABLE test1 ([Category] CHAR(1) NULL,[CNT] INT NULL); INSERT INTO test1 ([Category], [CNT]) VALUES ('A', 5),('B', 3);

WITH n AS (SELECT n FROM (VALUES(0),(0),(0),(0),(0),(0),(0),(0),(0),(0))n(n)), n3 AS (SELECT n=n1.n FROM n n1, n n2, n n3)

SELECT COUNT(*) FROM n3;

WITH cteTally10 AS ( SELECT * FROM (VALUES(0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) AS numbers(number) ), cteTally1000 AS ( SELECT ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) AS number FROM cteTally10 c1 CROSS JOIN cteTally10 c2 CROSS JOIN cteTally10 c3 ) SELECT COUNT(*) FROM cteTally1000;

Results:

----------- 1000

(1 row affected)

----------- 1000

(1 row affected)

Cheers!