Partitioned views based on DATETIME field

Question

Post reply

Partitioned views based on DATETIME field

EricsON

Old Hand

Points: 375
More actions
April 3, 2003 at 4:06 am

#80507

Anyone have any ideas/best practices on partitioning large volumes of timeline based data using a DATETIME field.
I am estimating more than 40 million rows in a single table over a period of 8 months. I have read elsewhere that 40 million rows is considered the theoretical maximum number of rows that should be present in a single table. Hence my thinking of splitting the data up based on DATETIME. My idea was to split it on the month number. ie. into quarters. So in effect I would have data_q1, data_q2, data_q3 and data_q4. Then bringing it all together in a partitioned view called data which would mask the base tables.
Now my problem so far involves defining the CHECK constraint for the base tables. I don't see why DATEPART(MONTH, date) IN (1, 2, 3) is not valid. BOL says only certain operators are permitted for an updatable partitioned view and they don't mention anything about deterministic functions.
Anyone have any ideas on this that could help me?

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply

Allen Cui-55137 SSC Guru Points: 51650 More actions · Answer 1

Allen Cui-55137

SSC Guru

Points: 51650

April 3, 2003 at 7:56 am

#453029

Can you post your script?

EricsON Old Hand Points: 375 More actions · Answer 2

Here is an example I quickly threw together to illustrate the problem:



CREATE TABLE dbo.data_q1(

    [date]    DATETIMENOT NULL CHECK (MONTH([date]) IN (1, 2, 3)),

    refid    INTEGER        NOT NULL,

    data     INTEGERNOT NULL

CONSTRAINT PK_data_q1 PRIMARY KEY CLUSTERED ([date], refid)

)

GO

CREATE TABLE dbo.data_q2(

    [date]    DATETIMENOT NULL CHECK (MONTH([date]) IN (4, 5, 6)),

    refid    INTEGER        NOT NULL,

    data     INTEGERNOT NULL

CONSTRAINT PK_data_q2 PRIMARY KEY CLUSTERED ([date], refid)

)

GO

CREATE TABLE dbo.data_q3(

    [date]    DATETIMENOT NULL CHECK (MONTH([date]) IN (7, 8, 9)),

    refid    INTEGER        NOT NULL,

    data     INTEGERNOT NULL

CONSTRAINT PK_data_q3 PRIMARY KEY CLUSTERED ([date], refid)

)

GO

CREATE TABLE dbo.data_q4(

    [date]    DATETIMENOT NULL CHECK (MONTH([date]) IN (10, 11, 12)),

    refid    INTEGER        NOT NULL,

    data     INTEGERNOT NULL

CONSTRAINT PK_data_q4 PRIMARY KEY CLUSTERED ([date], refid)

)

GO

CREATE VIEW dbo.data WITH SCHEMABINDING

AS

SELECT[date],

refid,

data

FROMdbo.data_q1

UNION ALL

SELECT[date],

refid,

data

FROMdbo.data_q2

UNION ALL

SELECT[date],

refid,

data

FROMdbo.data_q3

UNION ALL

SELECT[date],

refid,

data

FROMdbo.data_q4

GO



INSERT INTO dbo.data VALUES ('01 Jan 2003 08:30:00', 0, 0)

INSERT INTO dbo.data VALUES ('01 Feb 2003 08:30:00', 0, 0)

INSERT INTO dbo.data VALUES ('01 Mar 2003 08:30:00', 0, 0)

INSERT INTO dbo.data VALUES ('01 Apr 2003 08:30:00', 0, 0)

INSERT INTO dbo.data VALUES ('01 May 2003 08:30:00', 0, 0)

INSERT INTO dbo.data VALUES ('01 Jun 2003 08:30:00', 0, 0)

INSERT INTO dbo.data VALUES ('01 Jul 2003 08:30:00', 0, 0)

INSERT INTO dbo.data VALUES ('01 Aug 2003 08:30:00', 0, 0)

INSERT INTO dbo.data VALUES ('01 Sep 2003 08:30:00', 0, 0)

INSERT INTO dbo.data VALUES ('01 Oct 2003 08:30:00', 0, 0)

INSERT INTO dbo.data VALUES ('01 Nov 2003 08:30:00', 0, 0)

INSERT INTO dbo.data VALUES ('01 Dec 2003 08:30:00', 0, 0)

jmendoza Grasshopper Points: 14 More actions · Answer 3

you're using the IN operator in your check constrainst which apparently is not allowed for partitioned views. from BOL:

"The CHECK constraints can only use these operators: BETWEEN, AND, OR, <, <=, >, >=, ="

if you want to partition by quarter, how bout writing your constraints something like:

CHECK([DATE] >= '01-JAN-2003' and [DATE] < '01-APR-2003')

etc.

EricsON Old Hand Points: 375 More actions · Answer 4

The problem in my mind is not that I am using the IN operator, but rather the MONTH() or DATEPART() functions. It seems that functions are not eligible in this case.

Sure, I could make it compare to actual datetime values, but that implies that I need to define a good few years worth to prevent the system falling over.

What I suppose I need is a more mathematical approach that satisfies SQL Server's internal desires. 🙂 You got any more ideas?

Allen Cui-55137 SSC Guru Points: 51650 More actions · Answer 5

I modified your script and add a new partitioning column for partition data into different tables. It seems not the operator issue because I used "IN" in check constraint. I would suspect problem comes from datetime datatype.

create TABLE dbo.data_q1(

[date] DATETIMENOT NULL ,

logmonth int CHECK (logmonth IN (1, 2, 3)),

refid INTEGER NOT NULL,

data INTEGERNOT NULL

CONSTRAINT PK_data_q1 PRIMARY KEY CLUSTERED (logmonth, refid ))

GO

create TABLE dbo.data_q2(

[date] DATETIMENOT NULL ,

logmonth int CHECK (logmonth IN (4, 5, 6)),

refid INTEGER NOT NULL,

data INTEGERNOT NULL

CONSTRAINT PK_data_q2 PRIMARY KEY CLUSTERED (logmonth, refid ))

GO

create TABLE dbo.data_q3(

[date] DATETIMENOT NULL ,

logmonth int CHECK (logmonth IN (7,8,9)),

refid INTEGER NOT NULL,

data INTEGERNOT NULL

CONSTRAINT PK_data_q3 PRIMARY KEY CLUSTERED (logmonth, refid ))

GO

create TABLE dbo.data_q4(

[date] DATETIMENOT NULL ,

logmonth int CHECK (logmonth IN (10, 11, 12)),

refid INTEGER NOT NULL,

data INTEGERNOT NULL

CONSTRAINT PK_data_q4 PRIMARY KEY CLUSTERED (logmonth, refid ))

GO

create VIEW dbo.data

AS

SELECT*

FROMdbo.data_q1

UNION ALL

SELECT*

FROMdbo.data_q2

UNION ALL

SELECT*

FROMdbo.data_q3

UNION ALL

SELECT*

FROMdbo.data_q4

GO

INSERT INTO dbo.data VALUES ('01 Jan 2003 08:30:00', DATEPART(mm, '01 Jan 2003 08:30:00'),0, 0)

INSERT INTO dbo.data VALUES ('01 Feb 2003 08:30:00', DATEPART(mm, '01 Feb 2003 08:30:00'),0, 0)

INSERT INTO dbo.data VALUES ('01 Mar 2003 08:30:00', DATEPART(mm, '01 Mar 2003 08:30:00'),0, 0)

INSERT INTO dbo.data VALUES ('01 Apr 2003 08:30:00', DATEPART(mm, '01 Apr 2003 08:30:00'),0, 0)

INSERT INTO dbo.data VALUES ('01 May 2003 08:30:00', DATEPART(mm, '01 May 2003 08:30:00'),0, 0)

INSERT INTO dbo.data VALUES ('01 Jun 2003 08:30:00', DATEPART(mm, '01 Jun 2003 08:30:00'),0, 0)

INSERT INTO dbo.data VALUES ('01 Jul 2003 08:30:00', DATEPART(mm, '01 Jul 2003 08:30:00'),0, 0)

INSERT INTO dbo.data VALUES ('01 Aug 2003 08:30:00', DATEPART(mm, '01 Aug 2003 08:30:00'),0, 0)

INSERT INTO dbo.data VALUES ('01 Sep 2003 08:30:00', DATEPART(mm, '01 Sep 2003 08:30:00'),0, 0)

INSERT INTO dbo.data VALUES ('01 Oct 2003 08:30:00', DATEPART(mm, '01 Oct 2003 08:30:00'),0, 0)

INSERT INTO dbo.data VALUES ('01 Nov 2003 08:30:00', DATEPART(mm, '01 Nov 2003 08:30:00'),0, 0)

INSERT INTO dbo.data VALUES ('01 Dec 2003 08:30:00', DATEPART(mm, '01 Dec 2003 08:30:00'),0, 0)

jmendoza Grasshopper Points: 14 More actions · Answer 6

i guess i misunderstood what you were after. i assumed that your date was a date entered, date created, or some such, where the range of allowable values is bounded. in such cases you can create the required partitions before they're needed (either manually or preferably through a recurring scheduled job that creates and/or drops tables and then recompiles the view(s)).

sounds like you just wanna hash dates into one of four partitions; though it seems to me this only delays the problem you set out to avoid (hitting 40M rows/table).

and, yeah, you're right 🙂 it appears that IN is perfectly acceptable in updatable partitioned views. this worked fine:

create table t1(

col1 int not null primary key check([col1] in (1)))

create table t2(

col1 int not null primary key check([col1] in (2)))

create view v

as

select col1

from t1

union all

select col1

from t2

insert v values(1)

insert v values(2)