May 23, 2012 at 9:15 am
OK, I'm trying and failing (to an extent) to get my head around set-based iteration.
As a sometime dev who is used to loops (go VBScript go!) and more 'traditional' programming, I decided this was a topic that I really needed to learn something about.
So I'm wondering if anyone can help with this example.
Here's me creating a table and populating it with data using a loop-based method:
CREATE TABLE fooTable (data VARCHAR(8000) )
DECLARE @loopCounter INT
SET @loopCounter = 10000
WHILE @loopCounter > 0
BEGIN
INSERT INTO fooTable
SELECT name from sys.databases
SET @loopCounter = @loopCounter - 1
END
So far, so good. I should have the name of every database on my server, duplicated 10000 times. 5 databases, 50000 rows. This query took about 10 seconds to execute (I didn't time it accurately).
Now I'm struggling to render this as a set-based approach.
From what I understand, I should be using a function based approach to populate the data rather than a looping one. I.e. if I was using an INT value or similar I could probably specify a range or other mathematical function - I chose a VARCHAR to make it a bit difficult.
Would be grateful for any help in understanding this topic. Thanks.
---
Note to developers:Want to get the best help? Click here https://www.sqlservercentral.com/articles/forum-etiquette-how-to-post-datacode-on-a-forum-to-get-the-best-help (Jeff Moden)
My blog: http://uksqldba.blogspot.com
Visit http://www.DerekColley.co.uk to find out more about me.
May 23, 2012 at 9:22 am
I could write pages of explanation here, but I'd really recommend reading Jeff's article on the Tally/Numbers table [/url], which more eloquently explains it than I ever could.
With your particular example, running 10000 seperate inserts is the same as CROSS JOINING that table to another table with 10000 rows in it. E.g.:
To give a very simple example, using the persisted tally table in Jeff's example:
INSERT INTO fooTable
SELECT name from sys.databases
CROSS JOIN Tally
where N<=10000
May 23, 2012 at 9:25 am
derek.colley (5/23/2012)
OK, I'm trying and failing (to an extent) to get my head around set-based iteration.As a sometime dev who is used to loops (go VBScript go!) and more 'traditional' programming, I decided this was a topic that I really needed to learn something about.
So I'm wondering if anyone can help with this example.
Here's me creating a table and populating it with data using a loop-based method:
CREATE TABLE fooTable (data VARCHAR(8000) )
DECLARE @loopCounter INT
SET @loopCounter = 10000
WHILE @loopCounter > 0
BEGIN
INSERT INTO fooTable
SELECT name from sys.databases
SET @loopCounter = @loopCounter - 1
END
So far, so good. I should have the name of every database on my server, duplicated 10000 times. 5 databases, 50000 rows. This query took about 10 seconds to execute (I didn't time it accurately).
Now I'm struggling to render this as a set-based approach.
From what I understand, I should be using a function based approach to populate the data rather than a looping one. I.e. if I was using an INT value or similar I could probably specify a range or other mathematical function - I chose a VARCHAR to make it a bit difficult.
Would be grateful for any help in understanding this topic. Thanks.
try this
with x (id) as
(select 1 as id
union all
select ID+1 from X as id
where id<10000
)
insert into fooTable
select d.name from sysdatabases d cross join x option (maxrecursion 10000);
MVDBA
May 23, 2012 at 9:27 am
+1 howard beat me to it, but slightly different solution
MVDBA
May 23, 2012 at 9:30 am
Thanks both, I'll read the tally tables article and give both solutions a try.
---
Note to developers:Want to get the best help? Click here https://www.sqlservercentral.com/articles/forum-etiquette-how-to-post-datacode-on-a-forum-to-get-the-best-help (Jeff Moden)
My blog: http://uksqldba.blogspot.com
Visit http://www.DerekColley.co.uk to find out more about me.
May 23, 2012 at 9:46 am
Howard, your approach is good but populates the table with only the first database name returned by SELECT name FROM sys.databases. It produces all the rows, but they are all identical - the remainder are discarded - rather than inserting in complete 'sets'.
Mike - Got Msg 319 ('Incorrect syntax near the keyword 'with'...') I checked my code with what you put, matches fine. Typo maybe?
Thanks
---
Note to developers:Want to get the best help? Click here https://www.sqlservercentral.com/articles/forum-etiquette-how-to-post-datacode-on-a-forum-to-get-the-best-help (Jeff Moden)
My blog: http://uksqldba.blogspot.com
Visit http://www.DerekColley.co.uk to find out more about me.
May 23, 2012 at 9:52 am
any code before the "WITH x as" must be terminated with a ;
MVDBA
May 23, 2012 at 9:54 am
see my example
create table footable (name varchar(1000))
go
select 'aaaaa';
with x (id) as
(select 1 as id
union all
select ID+1 from X as id
where id<10000
)
insert into fooTable
select d.name from sysdatabases d cross join x option (maxrecursion 10000);
drop table footable
MVDBA
May 23, 2012 at 9:54 am
Hi Derek, the statement immediately previous to the WITH keyword must terminated by a semicolon.
May 23, 2012 at 9:56 am
derek.colley (5/23/2012)
Howard, your approach is good but populates the table with only the first database name returned by SELECT name FROM sys.databases. It produces all the rows, but they are all identical - the remainder are discarded - rather than inserting in complete 'sets'.
Hmm, that shouldn't be the case. You'll have to provide a complete example of what you're running from start to finish to work out what's wrong.
Are you sure one database isn't just at the top - e.g. the arbitrary order is different when you select back the records?
E.g.:
select data, count(*) from fooTable
group by data
Returns only one database name??
May 23, 2012 at 10:02 am
The following ran about 1 second on my computer, ymmv.
CREATE TABLE dbo.fooTable (data VARCHAR(8000) );
GO
WITH
e1(n) AS (SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1), -- 10 rows
e2(n) AS (SELECT a.n FROM e1 a CROSS JOIN e1 b), -- 100 rows
e4(n) AS (SELECT a.n FROM e2 a CROSS JOIN e2 b), -- 10,000 rows
cteNumbers(n) AS (SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM e4)
INSERT INTO dbo.fooTable (data)
SELECT NAME FROM sys.databases CROSS JOIN cteNumbers;
SELECT * FROM dbo.fooTable;
DROP TABLE dbo.fooTable;
GO
May 23, 2012 at 10:02 am
Howard
Here's the example code I ran:
CREATE TABLE numbers ( N INT)
DECLARE @loopCounter INT -- yes, I know :-)
SET @loopCounter = 1
WHILE @loopCounter < 10001
BEGIN
INSERT INTO numbers VALUES ( @loopCounter )
SET @loopCounter = @loopCounter + 1
END
RAISERROR('Numbers table created.',0,0) WITH NOWAIT
CREATE TABLE fooTable ( data VARCHAR(8000) )
INSERT INTO fooTable
SELECT name FROM sys.databases
CROSS JOIN numbers
WHERE N < 10000
RAISERROR('fooTable populated.',0,0) WITH NOWAIT
SELECT TOP 50 * FROM fooTable
Here's the result set:
Numbers table created.
fooTable populated.
master
master
master
... -- goes on for 46 rows
master
---
Note to developers:Want to get the best help? Click here https://www.sqlservercentral.com/articles/forum-etiquette-how-to-post-datacode-on-a-forum-to-get-the-best-help (Jeff Moden)
My blog: http://uksqldba.blogspot.com
Visit http://www.DerekColley.co.uk to find out more about me.
May 23, 2012 at 10:06 am
Mike - I corrected the WITH to ;WITH and it ran beautifully - 28 seconds. Many thanks.
Lynn - trying yours now 🙂 The more practice I get with this kind of thing, the better I'll understand it.
---
Note to developers:Want to get the best help? Click here https://www.sqlservercentral.com/articles/forum-etiquette-how-to-post-datacode-on-a-forum-to-get-the-best-help (Jeff Moden)
My blog: http://uksqldba.blogspot.com
Visit http://www.DerekColley.co.uk to find out more about me.
May 23, 2012 at 10:08 am
SELECT TOP 50 * FROM fooTable
This is the problem. You're populating your table with 50,000 rows of data and then trying to determine whether it's done it correctly by only looking at the top 50 rows. There are 10000 rows that should have that database name, so it's no proof at all that the process hasn't worked just because you've looked at 50 and they're all the same!
The key thing to remember about SQL is that inserted order is irrelevant. If you want a particular order, you must have a column that defines it and include it in the order by clause.
May 23, 2012 at 10:17 am
Lynn - 4 seconds on my ageing kit - thanks very much!
Howard - was not knocking your effort (would be a bit rich considering my approach wasn't set based!) but I was after an ordered insert in the absence of a PK. The difference between the loop-based approach and your set based one was that the loop-based one inserted as follows:
1 a
2 b
3 c
4 d
5 e
6 a
7 b
...
10000 e
At the moment your code does:
1 a
2 a
3 a
...
? b
? b
? b
...
? e
I agree that inserted order is irrelevant, your code achieved exactly what was wanted - but imagine I e.g. wanted to add a PK at a later date i.e.
SELECT IDENTITY(INT, 1, 1) AS id INTO fooTable --or similar...
The order would then become relevant.
I think I'll have to go and read up properly on this since it's clear there's a few ways of achieving set based iteration.
Thanks to everyone for your help and examples.
---
Note to developers:Want to get the best help? Click here https://www.sqlservercentral.com/articles/forum-etiquette-how-to-post-datacode-on-a-forum-to-get-the-best-help (Jeff Moden)
My blog: http://uksqldba.blogspot.com
Visit http://www.DerekColley.co.uk to find out more about me.
Viewing 15 posts - 1 through 15 (of 46 total)
You must be logged in to reply to this topic. Login to reply