July 14, 2011 at 5:45 am
I've a table with +100 million rows that I want to partition, and as I'm restricted to standard edition 2008 R2 I'm going to use a partitioned view. The four tables (currently) and the view have been setup and I'm now thinking about migrating the data. Bulk insert and bcp can't be used with partitioned views so I'm left with a standard insert into, or is there an alternative?
If I have to use a single insert into statement, can I batch it in any way using a transaction to do, say 100,000 rows, then wait before doing the next batch?
The tables are partitioned on a smalldatetime column that is part of the key (one table per year).
July 14, 2011 at 11:33 am
I might be missing something - but why would you want to migrate data?
I'm assuming you are talking about an approach similar to this -> http://www.b4pjs.co.uk/archive/2011/04/using-dynamically-created-partitioned-views/
Edit: Looks like my brain finally woke up - you want to migrate data from your existing table to the newly setup tables that will be used in the view.
I remember reading a forum post a few days back here about partition switching in standard edition - let me see if I can find that.
July 14, 2011 at 12:21 pm
Yes, that's what I'm after. Tried batching:
declare @BatchSize int
select @BatchSize = 200000
while 1 > 0
begin
begin tran
insert into Destination
(
...
)
select top(@BatchSize)
...
from
Source S
where
not exists --for batching
(
select *
from
Destination D
where
D.Key = S.Key
)
if @@rowcount > 0
begin
commit tran
waitfor delay '00:00:01'
end
else
begin
commit tran
break
end
end
But...it doesn't like the sub-query as it references the partitioned destination view. BOL says:
"INSERT, UPDATE, and DELETE actions against a partitioned view are not allowed if there is a self-join with the same view or with any of the member tables in the statement."
This isn't a self-join but seems to apply to a sub-query as well.
I hope that inner/left joins between the view and another table are ok for insert/update/delete, otherwise it will be pretty useless. Something like:
update Destination
set ...
from Destination D
inner join TableA A on D.Key = A.Key
Is this going to work where Destination is a partitioned view - bit worried now?
July 14, 2011 at 12:54 pm
The script below works if you have an identity column in the table - does the approach below help in your case?
/*
IF OBJECT_ID('dbo.Testvw') IS NOT NULL
DROP VIEW dbo.Testvw;
IF OBJECT_ID('dbo.TestTab_Main') IS NOT NULL
DROP TABLE dbo.TestTab_Main;
IF OBJECT_ID('dbo.TestTab_2008') IS NOT NULL
DROP TABLE dbo.TestTab_2008;
IF OBJECT_ID('dbo.TestTab_2009') IS NOT NULL
DROP TABLE dbo.TestTab_2009;
IF OBJECT_ID('dbo.TestTab_2010') IS NOT NULL
DROP TABLE dbo.TestTab_2010;
*/
GO
CREATE TABLE dbo.TestTab_Main(RowId int IDENTITY(1,1) NOT NULL PRIMARY KEY CLUSTERED,
DateColumn smalldatetime NOT NULL);
CREATE TABLE dbo.TestTab_2008(RowId int NOT NULL PRIMARY KEY CLUSTERED,
DateColumn smalldatetime NOT NULL);
CREATE TABLE dbo.TestTab_2009(RowId int NOT NULL PRIMARY KEY CLUSTERED,
DateColumn smalldatetime NOT NULL);
CREATE TABLE dbo.TestTab_2010(RowId int NOT NULL PRIMARY KEY CLUSTERED,
DateColumn smalldatetime NOT NULL);
GO
CREATE VIEW dbo.Testvw
WITH SCHEMABINDING
AS
SELECT RowId,DateColumn FROM dbo.TestTab_2008
UNION ALL
SELECT RowId,DateColumn FROM dbo.TestTab_2009
UNION ALL
SELECT RowId,DateColumn FROM dbo.TestTab_2010
GO
INSERT dbo.TestTab_Main(DateColumn)
SELECT DATEADD(hour,T1.number,'20080101') FROM
(SELECT number FROM master.dbo.spt_values WHERE type = 'P') T1
CROSS JOIN
(SELECT TOP 10 number FROM master.dbo.spt_values) T2
UNION ALL
SELECT DATEADD(hour,T1.number,'20090101') FROM
(SELECT number FROM master.dbo.spt_values WHERE type = 'P') T1
CROSS JOIN
(SELECT TOP 10 number FROM master.dbo.spt_values) T2
UNION ALL
SELECT DATEADD(hour,T1.number,'20100101') FROM
(SELECT number FROM master.dbo.spt_values WHERE type = 'P') T1
CROSS JOIN
(SELECT TOP 10 number FROM master.dbo.spt_values) T2;
GO
DECLARE @MaxCtr int, @Batch int=100, @RowCtr int;
WHILE(1=1)
BEGIN
SELECT @MaxCtr = ISNULL(MAX(RowId),0) FROM dbo.TestTab_2008;
INSERT dbo.TestTab_2008(RowId,DateColumn)
SELECT TOP (@Batch) RowId,DateColumn FROM dbo.TestTab_Main
WHERE DATEPART(YEAR,DateColumn)=2008
AND RowId > @MaxCtr
ORDER BY RowId ASC;
SET @RowCtr = @@ROWCOUNT;
IF @RowCtr = 0
BREAK;
END
WHILE(1=1)
BEGIN
SELECT @MaxCtr = ISNULL(MAX(RowId),0) FROM dbo.TestTab_2009;
INSERT dbo.TestTab_2009(RowId,DateColumn)
SELECT TOP (@Batch) RowId,DateColumn FROM dbo.TestTab_Main
WHERE DATEPART(YEAR,DateColumn)=2009
AND RowId > @MaxCtr
ORDER BY RowId ASC;
SET @RowCtr = @@ROWCOUNT;
IF @RowCtr = 0
BREAK;
END
WHILE(1=1)
BEGIN
SELECT @MaxCtr = ISNULL(MAX(RowId),0) FROM dbo.TestTab_2010;
INSERT dbo.TestTab_2010(RowId,DateColumn)
SELECT TOP (@Batch) RowId,DateColumn FROM dbo.TestTab_Main
WHERE DATEPART(YEAR,DateColumn)=2010
AND RowId > @MaxCtr
ORDER BY RowId ASC;
SET @RowCtr = @@ROWCOUNT;
IF @RowCtr = 0
BREAK;
END
SELECT * FROM dbo.TestTab_Main;
SELECT * FROM dbo.TestTab_2008;
SELECT * FROM dbo.TestTab_2009;
SELECT * FROM dbo.TestTab_2010;
SELECT * FROM dbo.Testvw;
July 14, 2011 at 1:03 pm
Yes, that's very useful for the initial populating of the view base tables. But thinking there is a more fundamental problem that would affect existing stored procedures that are used for insert/update/delete on the partitioned view- see my other recent post.
July 14, 2011 at 1:13 pm
Just as a side note: instead of
WHERE DATEPART(YEAR,DateColumn)=2009
I'd rather recommend use
WHERE DateColumn>='20090101' AND DateColumny'20100101'
The original version won't benefit from a (hopefully) existing index on DateColumn.
July 14, 2011 at 1:38 pm
Good point. Plus I just realized that I really did not have a partitioned view in my original code.
/*
IF OBJECT_ID('dbo.Testvw') IS NOT NULL
DROP VIEW dbo.Testvw;
IF OBJECT_ID('dbo.TestTab_Main') IS NOT NULL
DROP TABLE dbo.TestTab_Main;
IF OBJECT_ID('dbo.TestTab_2008') IS NOT NULL
DROP TABLE dbo.TestTab_2008;
IF OBJECT_ID('dbo.TestTab_2009') IS NOT NULL
DROP TABLE dbo.TestTab_2009;
IF OBJECT_ID('dbo.TestTab_2010') IS NOT NULL
DROP TABLE dbo.TestTab_2010;
*/
CREATE TABLE dbo.TestTab_Main(RowId int IDENTITY(1,1) NOT NULL, DateColumn smalldatetime NOT NULL);
CREATE TABLE dbo.TestTab_2008(RowId int NOT NULL, DateColumn smalldatetime NOT NULL
CHECK(DateColumn >= '20080101' AND DateColumn < '20090101') PRIMARY KEY CLUSTERED(RowId,DateColumn));
CREATE TABLE dbo.TestTab_2009(RowId int NOT NULL, DateColumn smalldatetime NOT NULL
CHECK(DateColumn >= '20090101' AND DateColumn < '20100101') PRIMARY KEY CLUSTERED(RowId,DateColumn));
CREATE TABLE dbo.TestTab_2010(RowId int NOT NULL, DateColumn smalldatetime NOT NULL
CHECK(DateColumn >= '20100101' AND DateColumn < '20110101') PRIMARY KEY CLUSTERED(RowId,DateColumn));
GO
CREATE VIEW dbo.Testvw
WITH SCHEMABINDING
AS
SELECT RowId,DateColumn FROM dbo.TestTab_2008
UNION ALL
SELECT RowId,DateColumn FROM dbo.TestTab_2009
UNION ALL
SELECT RowId,DateColumn FROM dbo.TestTab_2010
GO
INSERT dbo.TestTab_Main(DateColumn)
SELECT DATEADD(hour,T1.number,'20080101') FROM
(SELECT number FROM master.dbo.spt_values WHERE type = 'P') T1
CROSS JOIN
(SELECT TOP 10 number FROM master.dbo.spt_values) T2
UNION ALL
SELECT DATEADD(hour,T1.number,'20090101') FROM
(SELECT number FROM master.dbo.spt_values WHERE type = 'P') T1
CROSS JOIN
(SELECT TOP 10 number FROM master.dbo.spt_values) T2
UNION ALL
SELECT DATEADD(hour,T1.number,'20100101') FROM
(SELECT number FROM master.dbo.spt_values WHERE type = 'P') T1
CROSS JOIN
(SELECT TOP 10 number FROM master.dbo.spt_values) T2;
GO
DECLARE @MaxCtr int, @Batch int=100, @RowCtr int;
WHILE(1=1)
BEGIN
SELECT @MaxCtr = ISNULL(MAX(RowId),0) FROM dbo.TestTab_2008;
INSERT dbo.TestTab_2008(RowId,DateColumn)
SELECT TOP (@Batch) RowId,DateColumn FROM dbo.TestTab_Main
WHERE DateColumn >= '20080101' AND DateColumn < '20090101'--DATEPART(YEAR,DateColumn)=2008
AND RowId > @MaxCtr
ORDER BY RowId ASC;
SET @RowCtr = @@ROWCOUNT;
IF @RowCtr = 0
BREAK;
END
WHILE(1=1)
BEGIN
SELECT @MaxCtr = ISNULL(MAX(RowId),0) FROM dbo.TestTab_2009;
INSERT dbo.TestTab_2009(RowId,DateColumn)
SELECT TOP (@Batch) RowId,DateColumn FROM dbo.TestTab_Main
WHERE DateColumn >= '20090101' AND DateColumn < '20100101'--DATEPART(YEAR,DateColumn)=2009
AND RowId > @MaxCtr
ORDER BY RowId ASC;
SET @RowCtr = @@ROWCOUNT;
IF @RowCtr = 0
BREAK;
END
WHILE(1=1)
BEGIN
SELECT @MaxCtr = ISNULL(MAX(RowId),0) FROM dbo.TestTab_2010;
INSERT dbo.TestTab_2010(RowId,DateColumn)
SELECT TOP (@Batch) RowId,DateColumn FROM dbo.TestTab_Main
WHERE DateColumn >= '20100101' AND DateColumn < '20110101'--DATEPART(YEAR,DateColumn)=2010
AND RowId > @MaxCtr
ORDER BY RowId ASC;
SET @RowCtr = @@ROWCOUNT;
IF @RowCtr = 0
BREAK;
END
SELECT * FROM dbo.TestTab_Main;
SELECT * FROM dbo.TestTab_2008;
SELECT * FROM dbo.TestTab_2009;
SELECT * FROM dbo.TestTab_2010;
SELECT * FROM dbo.Testvw;
-- partitioned view checks - check exec plan to check that
-- only the appropriate base table is accessed
SELECT * FROM dbo.Testvw WHERE RowId = 250
SELECT * FROM dbo.Testvw WHERE RowId = 42250
July 14, 2011 at 1:40 pm
What table would a value for DateColumn ='20100101' get inserted? 😉
July 14, 2011 at 1:55 pm
LutzM (7/14/2011)
What table would a value for DateColumn ='20100101' get inserted? 😉
That's what happens when I update the script without looking at what I did or testing what I did 😛
Viewing 9 posts - 1 through 8 (of 8 total)
You must be logged in to reply to this topic. Login to reply