Performance Problem when using partitioned view against actual table

Question

Post reply

Performance Problem when using partitioned view against actual table

John Burchel

Hall of Fame

Points: 3611
More actions
April 5, 2007 at 8:54 am

#62642

Hello everyone,

I am having some performance problems when using a wrapper view instead of a table.

i.e the view

Create view view_Personprofile

Select * from Personprofile_200612
Union all
Select * from Personprofile_200701

is a lot slower then querying the actual underlying tables, i.e running queries on Personprofile_200612 etc.

Does anyone know why this is ?

Thanks

Kindest Regards,
John Burchel (Trainee Developer)

Viewing 15 posts - 1 through 15 (of 24 total)

You must be logged in to reply to this topic. Login to reply

Ray M SSC-Insane Points: 21093 More actions · Answer 1

Can you elaborate any more?

A few things you need to make sure a partitioned view perform well.

Each table needs to have a check constraint on the value you are partitioning by, to mee it looks like some date column with month and year.

It lets the optimizer know which real table to check when running a query.

if you are joing partitioned view to partitioned view you will likely experience performance problems.

Is each table properly indexed.

from BOL

CHECK constraints are not needed for the partitioned view to return the correct results. However, if the CHECK constraints have not been defined, the query optimizer must search all the tables instead of only those that cover the search condition on the partitioning column. Without the CHECK constraints, the view operates like any other view with UNION ALL. The query optimizer cannot make any assumptions about the values stored in different tables and it cannot skip searching the tables that participate in the view definition.

http://msdn2.microsoft.com/en-us/library/ms190019.aspx

Cheers

Jeff Moden SSC Guru Points: 1004339 More actions · Answer 2

Real answer is... don't use a partitioned view. SQL Server can handle huge tables.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

John Burchel Hall of Fame Points: 3611 More actions · Answer 3

The only reason why I am intending to use partitioned view is that I can manage the underlying tables easily, i.e index rebuild time for a monthly table is lesser than that of a big table, maintenance time is lower, i.e monthly filegroups can be dropped instead of deleting from the big table and then shrinking the table.

It just makes it easier to manage really.

Kindest Regards,

John Burchel (Trainee Developer)

Jeff Moden SSC Guru Points: 1004339 More actions · Answer 4

Ah! Sorry and understood... guess that's one of the advantages of partitioned pass through views.

Not sure what the problem with the view is... I just created two small 200k row tables with a nice index on both. The execution plan on the raw code which is a SUM/GROUP BY of the two tables, and the code that uses the pass-through view is identical, Statistics IO shows precisely the same number of reads, etc, and Statistics Time shows precisely the same amount of CPU usage for the execution (although it does show quite a difference in the display-to-grid time where the view is MUCH slower).

If the raw code you're running and the view code are, in fact, identical (ie. you copied the view code without the CREATE), about the only thing I can think of is that perhaps the view got messed up somehow or something was done to the underlying table schema after the view was created and simply needs to be "recompiled". Rerunning the view code as an ALTER is one way to do it...

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Jeff Moden SSC Guru Points: 1004339 More actions · Answer 5

Here's the test code I used John... both tables have a composite clustered index on SomeID and SomeNumber...

CREATE VIEW vBIGTEST AS

SELECT *

FROM BigTest

UNION ALL

SELECT *

FROM BigTest2

PRINT REPLICATE('-',78)

SELECT SomeID,SUM(SomeNumber)

FROM BigTest

GROUP BY SomeID

UNION ALL

SELECT SomeID,SUM(SomeNumber)

FROM BigTest2

GROUP BY SomeID

PRINT REPLICATE('-',78)

SELECT SomeID,SUM(SomeNumber)

FROM vBigTest

GROUP BY SomeID

UNION ALL

SELECT SomeID,SUM(SomeNumber)

FROM vBigTest

GROUP BY SomeID

PRINT REPLICATE('-',78)

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

David McFarland SSChampion Points: 11815 More actions · Answer 6

I realize that it was mentioned upthread, but it is absolutely vital that you have the necessary constraints, John. Perhaps you could post a dump of the CREATE TABLE script.

We use partitioned views as wrappers that hold the last 14 days as individual tables, as well as the last 100 or so weeks before that, in weekly tables. With the constraints in place, my queries that require only a subset of those days all run as fast against the view as they would against the underlying tables.

John Burchel Hall of Fame Points: 3611 More actions · Answer 7

Hi Guys,

Thanks for all your efforts, I will explain the problem further.

To start with, below is the DDL

CREATE TABLE [dbo].[PersonProfile_2007_01] (

[PersonURN] [varchar] (30),

[AddressURN] [nvarchar] (50) ,

[ProfileDate] [varchar] (10) ,

[LISTID] [varchar] (10),

[DUNSNUMBER] [varchar] (10),

[CLIENTNO] INT,

[STATUS] [varchar] (40),

[ORGNAME] [varchar] (200),

[BusinessUnitID] INT,

[LifeStage] [varchar] (10)

&nbsp

CREATE UNIQUE CLUSTERED INDEX [PERSONPROFILE_INDEX] ON [dbo].[PERSONPROFILE] ([ProfileDate] , [PersonURN] ) WITH FILFACTOR = 60

GO

CREATE INDEX NC_PERSONPROFILE ON PERSONPROFILE (CLIENTNO,PROFILEDATE)

GO

CREATE INDEX NC2_PERSONPROFILE ON PERSONPROFILE (CLIENTNO,LISTID,PROFILEDATE)

ALTER TABLE [dbo].[PersonProfile_2007_01] WITH CHECK ADD CONSTRAINT [Check_PersonProfile_2007_01] CHECK (([Profiledate]>='2007-01-01 00:00:00.000' AND [Profiledate]<= '2007-01-31 23:59:59.000'))

Wrapper View

CREATE VIEW VWQUERY

AS

SELECT * FROM PersonProfile_2007_01

UNION ALL

SELECT * FROM PersonProfile_2007_02

Here is the bottleneck

SELECT PERSONURN, ADDRESSURN, PROFILEDATE,CLIENTNO, STATUS

FROM VWQUERY

WHERE LISTID = 'L01'

AND PROFILEDATE BETWEEN 'DATEA' AND 'DATEB'

GROUP BY PROFILEDATE, CLIENTNO

The query above runs with quick speed and if i replace the vwQuery with the actual table, it still runs with quick speed. BUT If i add the order by below

ORDER BY PROFILEDATE DESC, CLIENTNO DESC

The query runs really really slow, if its with the view, but if its with the underlying table, it runs really fast.

SELECT PERSONURN, ADDRESSURN, PROFILEDATE,CLIENTNO, STATUS

FROM PersonProfile_2007_01

WHERE LISTID = 'L01'

AND PROFILEDATE BETWEEN 'DATEA' AND 'DATEB'

GROUP BY PROFILEDATE, CLIENTNO

ORDER BY PROFILEDATE DESC, CLIENTNO DESC

the script here runs quick, but if PersonProfile_2007_01 is replaced with the view, it runs about 30 x slower.

SELECT PERSONURN, ADDRESSURN, PROFILEDATE,CLIENTNO, STATUS

FROM VwQuery

WHERE LISTID = 'L01'

AND PROFILEDATE BETWEEN 'DATEA' AND 'DATEB'

GROUP BY PROFILEDATE, CLIENTNO

ORDER BY PROFILEDATE DESC, CLIENTNO DESC

This query runs really slow with the view.

I am trying to find the solution to the problem, can anyone help here.

Kindest Regards,

John Burchel (Trainee Developer)

David McFarland SSChampion Points: 11815 More actions · Answer 8

Why are you using the GROUP BY clause when you have no aggregates, out of curiousity? I'd try removing that first.

Next, if that doesn't take care of it, try this and let us know how it runs:

SELECT

a.*

FROM

(

SELECT PERSONURN, ADDRESSURN, PROFILEDATE,CLIENTNO, STATUS

FROM VwQuery

WHERE LISTID = 'L01'

AND PROFILEDATE BETWEEN 'DATEA' AND 'DATEB'

) AS a

ORDER BY a.PROFILEDATE DESC, a.CLIENTNO DESC

Finally, and just in case it's a weird bug with SQL Server, lose the DESC qualifiers in the ORDER BY clause. Don't put ASC, just let it default and tell us what happens.

John Burchel Hall of Fame Points: 3611 More actions · Answer 9

Sorry for the query, I havnt got the exact query here with me, but I do know it has a group by on those 2 columns, it looks something like that, probably I have added something extra, as I am doing this from the top of my head.

The problem only seems to happen, when the order by is added, what I think the problem is explained below:

when the order by is used, it orders the data based on the entire records in the wrapper view instead of just the underlying table concerned. If I query the monthly table and order the data, performance is awesome.

So in real terms, for example if I was querying data in January, i.e data A = 1st of January and date B = 30th of January, instead of querying the personprofile_2007_01 underlying table and only ordereding the data based on the underlying table (personprofile_2007_01), i.e query personprofile_2007_01 and order by personprofile_2007_01.profiledate and personprofile_2007_01.clientno its actually ordering by VwQuery.profiledate and VwQuery.clientno.

So I need to find a way of leting the query optimiser query the (concerned based on the date) underlying table instead of looking at each underlying table in the wrapper view.

Kindest Regards,

John Burchel (Trainee Developer)

David McFarland SSChampion Points: 11815 More actions · Answer 10

David McFarland

SSChampion

Points: 11815

April 9, 2007 at 5:25 am

#699285

My query that I wanted you to try should do that.

John Burchel Hall of Fame Points: 3611 More actions · Answer 11

Sorry for the mistake in the query, the group by isnt there, so you are right. but on a different note, I used your query and the performance was still very slow, but thesame query on the monthly table for the profile dates concerned runs like a dream.

The problem that i think is that, on the wrapper view there is no mechanism for deciding what monthly table to use according to the profiledate.

i.e if you query the view for a profiledate between '2007-01-01 00:00:000' and '2007-02-01 00:00:000', the view should not have to search all the monthly tables, but instead only query Personprofile_2007_01

Kindest Regards,

John Burchel (Trainee Developer)

David McFarland SSChampion Points: 11815 More actions · Answer 12

The Order by should be applied after the WHERE clause in that wrapper query, which was the goal. Is it possible for you to post the exact queries that you are running, as well as the execution plan (using SET SHOWPLAN_TEXT) for them?

John Burchel Hall of Fame Points: 3611 More actions · Answer 13

The query below is the query I'm actually running, and its just the order by thats giving problems, the order by is also needed as the most recent profiledate is what users are most interested in, if the order by is taken off, then it orders is Ascending order which is what the clustered index does by default:

SELECT PERSONURN, ADDRESSURN, PROFILEDATE,CLIENTNO, STATUS

FROM VwQuery

WHERE LISTID = 'L01'

AND PROFILEDATE BETWEEN '2007-02-01 23:59:59.000' AND '2007-03-19 20:11:23.000'

GROUP BY PROFILEDATE, CLIENTNO

ORDER BY PROFILEDATE DESC, CLIENTNO DESC

Kindest Regards,

John Burchel (Trainee Developer)

Cory E. SSCertifiable Points: 7076 More actions · Answer 14

David,

Earlier, you stated: "We use partitioned views as wrappers that hold the last 14 days as individual tables, as well as the last 100 or so weeks before that, in weekly tables. With the constraints in place, my queries that require only a subset of those days all run as fast against the view as they would against the underlying tables."

Is there any chance I could get you to elaborate on that? I may have a need to do something very similar, and I would greatly appreciate any direction you could point me in.

Thanks!

Cory

-- Cory