improve performance table with millions of rows

Question

improve performance table with millions of rows

diego.piccolboni

Mr or Mrs. 500

Points: 508
More actions
December 19, 2008 at 9:26 am

#126551

Hi
I use sql server 2005 and I wanted to ask an opinion for the creation of a table with millions of rows...
The table has a primary key and therefore indexes.
improve performance by dividing the table for years? :hehe: :exclamation:
ie
it is better to have a table with 24 million of rows or 12 tables with 2 million of rows
Thank you

Viewing 14 posts - 1 through 13 (of 13 total)

You must be logged in to reply to this topic. Login to reply

David Benoit SSC-Dedicated Points: 34562 More actions · Answer 1

That all depends on table design and access patterns. Being that you are on 2000 you could use a partitioned view to bring the multiple table scenario into one "table" and you can reap some benefits through that but again access patterns and designs are the critical component to the answer. So, if you want further input it would be good to post the table create script and some of the frequently used queries for consideration.

Hopefully that makes sense.

David

@SQLTentmaker

“He is no fool who gives what he cannot keep to gain that which he cannot lose” - Jim Elliot

diego.piccolboni Mr or Mrs. 500 Points: 508 More actions · Answer 2

This is the table

CREATE TABLE [dbo].[Vendite](

[Data] [datetime] NOT NULL,

[Mese] [int] NOT NULL,

[Anno] [int] NOT NULL,

[Societa] [varchar](3) NOT NULL,

[Codice] [varchar](9) NOT NULL,

[TipologiaCliente] [varchar](1) NOT NULL,

[Classe] [int] NOT NULL,

[Importo1] [decimal](19, 2) NOT NULL,

[Importo2] [decimal](19, 2) NOT NULL,

[Importo3] [decimal](19, 2) NULL,

CONSTRAINT [PK_Vendite] PRIMARY KEY CLUSTERED

(

[Mese] ASC,

[Anno] ASC,

[Societa] ASC,

[Codice] ASC

)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

) ON [PRIMARY]

queries may require the join over two years

some ideas?

Thank you

David Benoit SSC-Dedicated Points: 34562 More actions · Answer 3

Making some great assumptions as I still don't have queries but if you are going to be retrieving 2 years of data (is that what you were stating?) then you would be able to gain some benefit from breaking the tables down physically and creating a partitioned view over the top of them. You should review the information in BOL regarding restrictions associated with partitioned views and do some testing before jumping in and making the change. Part of that testing should be with running standard queries in a side-by-side scenario to see if you are going to reap any benefit.

Again, some assumptions being made...

David

@SQLTentmaker

“He is no fool who gives what he cannot keep to gain that which he cannot lose” - Jim Elliot

diego.piccolboni Mr or Mrs. 500 Points: 508 More actions · Answer 4

Example query

solution with single table

SELECT DISTINCT M.Mese,M.Anno,M.Societa,M.Codice FROM

dbo.VenditeAnno(@M,@Anno) M

LEFT OUTER JOIN dbo.VenditeAnno(@P,@AnnoPrec) P

ON M.Societa = P.Societa AND M.Codice = P.Codice

WHERE M.Classe = xxx

VenditeAnno is a function that returns the data a year

Thank you

David Benoit SSC-Dedicated Points: 34562 More actions · Answer 5

I'm a little confused by the sql statement as it doesn't appear to be referencing the table at all but I am guessing that you are within the function. What is the function actually returning? Very curious.

David

@SQLTentmaker

“He is no fool who gives what he cannot keep to gain that which he cannot lose” - Jim Elliot

Ninja's_RGR'us SSC Guru Points: 294069 More actions · Answer 6

Move whatever the function is doing into a derived table. The millions of call to that function is what is killing the performance.. no amount of indexing an overcome that.

We need to see the function's code to go any further.

diego.piccolboni Mr or Mrs. 500 Points: 508 More actions · Answer 7

FUNCTION [dbo].[VenditeAnno]

(

@Mese Int,

@Anno Int

)

RETURNS TABLE

AS

RETURN

(SELECT Vendite.Data, Vendite.Mese As Mese,Vendite.Anno as Anno,

Vendite.Societa, Vendite.Codice, Vendite.TipologiaCliente, Vendite.Classe,

Vendite.Importo1, Vendite.Importo2, Vendite.Importo3

FROM Vendite

WHERE Vendite.Mese=@Mese AND Vendite.Anno=@Anno

)

Ninja's_RGR'us SSC Guru Points: 294069 More actions · Answer 8

What's the point of the left join in the query? I don't understand what you are trying to do there.

David Benoit SSC-Dedicated Points: 34562 More actions · Answer 9

Looks to be a year over year comparison, which if this is the case the physical separation of the data into multiple tables by year and using a partitioned view should help you get some better results. You should be able to see that in the execution plan when you do some testing. The only caveat to that would be that you may not be able to get the detail of the table access due to the use of the function. Hard to say until you try it.

I haven't used a function in the way you are using it but it seems odd to me. Is this efficient?

David

@SQLTentmaker

“He is no fool who gives what he cannot keep to gain that which he cannot lose” - Jim Elliot

magarity kerns SSCrazy Points: 2763 More actions · Answer 10

I think you might get much better performance if you make that concatenated PK a non-clustered index and change the clustered to just those 2 columns in your query's WHERE clause.

David Benoit SSC-Dedicated Points: 34562 More actions · Answer 11

...because the way the function is written, and the columns contained in the function, the current indexing is not going to be sufficient and there will be some lookups. I haven't played with the table but it sure appears that way to me.

Just my thoughts.

David

@SQLTentmaker

“He is no fool who gives what he cannot keep to gain that which he cannot lose” - Jim Elliot

diego.piccolboni Mr or Mrs. 500 Points: 508 More actions · Answer 12

I have tried with SET STATISTICS IO

Time is almost identical

The difference is that the solution with a single table run 1 physics reading and then all logic

while The solution with multiple tables running a 1 physical reading and then 1 logical reading for each year...

the keys and the structure of the tables is always the same

who benefits if I share the table??

Thank you

David Benoit SSC-Dedicated Points: 34562 More actions · Answer 13

Can you include the statistics IO output? Also, can you attach the two query plans?

David

@SQLTentmaker

“He is no fool who gives what he cannot keep to gain that which he cannot lose” - Jim Elliot