January 16, 2013 at 9:21 am
Hi there,
For my DWH I am using partitioning for all my bigger tables.
As I would like to use two dimensions for partitioning I decided to create different tables to splitt the years.
To query the data I use a view that simply joins the data:
CREATE VIEW view_all_years
WITH SCHEMABINDING
SELECT datum, attribute1_id, attribute2_id
FROM table_2012
UNION ALL
SELECT datum, attribute1_id, attribute2_id
FROM table_2013
The following contrains help the optimizer to eliminate tables that are not needed for the queries.
ALTER TABLE table_2012 WITH CHECK ADD CONSTRAINT [CK_table_2012] CHECK (([date]>='2012-01-01' AND [date]<='2012-12-31'))
ALTER TABLE table_2013 WITH CHECK ADD CONSTRAINT [CK_table_2013] CHECK (([date]>='2013-01-01' AND [date]<='2013-12-31'))
However I am now thinking, if I could use the same view to insert the data, so I do not have to check, if the data I am importing belongs to which table an so do not have to react with the INSERT statement in my ETL.
I tried to insert data and it works:
insert into view_all_years
(datum, attribute1_id, attribute2_id)
VALUES
('2012-01-01', 1, 2),
('2013-01-01', 1, 2)
While it is technical possible, I would like to now, if I have disadvantages in performance with this?
I saw in the execution plan, that it plans to insert the data in all underlying tables while a insert into into the specified table wouldn't.
To specify my enviroment:
SQL Server 2012 SP1 EE
I will have to insert more then 20 mio rows a day
I am looking forward to your thought
Mitch
January 16, 2013 at 10:33 am
Why multiple tables? Wouldn't it simplify things tremendously if you just partitioned the table itself?
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
January 17, 2013 at 2:42 am
I separated the data in different tables to be able to better handle recovery szenarios, rebuild index or anything else.
20 Mio rows * 365 = 7,3 billion every year.
I created 3 tables for each year. (3 to represent my current partioning scheme for "normal" tables)
20 mio. rows each day are splittet about 40% 40%, 20% over the three tables. So that is 3 bil, 3bil, 1,5 bil rows for each table a year.
I will hold data for at least 5 years.
However I partioned the table within by date as well, as most of the queries do not need the hole year -> a common query is 30days from a reporting day
January 17, 2013 at 6:26 am
The execution plan worried me enough so I will modify me SSIS Package to load the dynamicly in the right table and not in the view
Viewing 4 posts - 1 through 3 (of 3 total)
You must be logged in to reply to this topic. Login to reply