large table

  • Assuming there is a very large table in a database, do you have any ideas for getting good performance out of it? Any ideas are welcome.

  • Depends on how big it is. I had a similar situation with one of my old clients; the table size was 30+ GB. If you see any performance problems, I would suggest taking advantage of filegroups: place the table in a separate filegroup. Also consider better indexes, UPDATE STATISTICS, etc.
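
    As a hedged sketch of the filegroup part (the database, file, path, and index names here are invented for illustration):

    -- Add a new filegroup and a file for it on a separate disk
    ALTER DATABASE MyDB ADD FILEGROUP FG_BigTable;
    ALTER DATABASE MyDB ADD FILE
    (
        NAME = 'MyDB_BigTable',
        FILENAME = 'E:\SQLData\MyDB_BigTable.ndf',
        SIZE = 40GB
    ) TO FILEGROUP FG_BigTable;

    -- Rebuilding the clustered index on the new filegroup moves the table with it
    -- (assumes dbo.BigTable already has a clustered index named CIX_BigTable on ID)
    CREATE CLUSTERED INDEX CIX_BigTable ON dbo.BigTable (ID)
        WITH (DROP_EXISTING = ON)
        ON FG_BigTable;

    -- Refresh the optimizer statistics on the table
    UPDATE STATISTICS dbo.BigTable;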

    Shas3

  • Thanks for the reply. It's about 30 GB. That's a good solution; anything else?

  • You could also try horizontal partitioning and placing the partitions on separate filegroups which reside on different physical RAID volumes. The same can be done with the indexes. Check the following topics in Books Online: "Partitioning", "Using Files and Filegroups" and "Data Placement Using Filegroups".
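
    On SQL Server 2005 and later you can also declare the partitioning directly rather than building it by hand; a minimal sketch, with all names, boundary dates, and filegroups invented:

    -- Rows are mapped to partitions by date range
    CREATE PARTITION FUNCTION pf_OrderDate (datetime)
        AS RANGE RIGHT FOR VALUES ('2004-01-01', '2005-01-01');

    -- Each partition lands on its own filegroup (ideally separate physical volumes)
    CREATE PARTITION SCHEME ps_OrderDate
        AS PARTITION pf_OrderDate TO (FG_2003, FG_2004, FG_2005);

    CREATE TABLE dbo.Orders
    (
        OrderID   int      NOT NULL,
        OrderDate datetime NOT NULL
    ) ON ps_OrderDate (OrderDate);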

    Hope this helps

    Phill Carter

    --------------------

    Colt 45 - the original point and click interface


  • In addition to what Phill said, I think any good performance tuning starts with a good design. If the design is sub-optimal, any performance tuning and tweaking is useless.

    So if redesigning could be an alternative, you might want to post DDL of that table.

    Frank

    http://www.insidesql.de

    http://www.familienzirkus.de


  • Please also take care of these notes:

    When you want a report based on complicated queries joining several tables, and these tables are updated frequently, then you lock those tables (or you wait for a previous transaction's locks on them to be released before you can take your own). So everyone else using these tables is blocked as well, and the system appears to hang, because they are all waiting in the lock queue for those tables!

    The best approach is to run the query on the server (by making a view on SQL Server) and, with each table name in the query, add the hint "with (nolock)". That way you read the data as it currently is (which may include changes from uncommitted transactions), you don't wait for locks to be freed, you don't lock these tables yourself, and any other transaction will complete successfully (for you and for others) 🙂.

    You would write something like this in your view:

    select Invoices.*, Situation.*
    from Situation with (nolock)
    inner join Invoices with (nolock)
        on Situation.SituationID = Invoices.Situation
    where Situation.SituationID = 1
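
    Wrapped up as the view the post describes, it might look like the following; the view name and the explicit column list are assumptions (SELECT * across both tables could collide on duplicate column names inside a view):

    CREATE VIEW dbo.vw_InvoicesBySituation
    AS
    SELECT i.InvoiceID, i.InvoiceDate,          -- hypothetical columns
           s.SituationID, s.SituationName
    FROM dbo.Situation AS s WITH (NOLOCK)
    INNER JOIN dbo.Invoices AS i WITH (NOLOCK)
        ON s.SituationID = i.Situation
    WHERE s.SituationID = 1;

    The per-session equivalent is SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED, which applies the same dirty-read behavior to every table the session touches.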

    Disadvantages:

    - When using "with (nolock)", you may not see a transaction committed at that very moment (you read the data as it was), but you will see it on any later read.

    Advantages:

    - No hangs at all.

    - Very fast response.

    - Less lock activity recorded in the SQL log file.

    Also: you may need to add more memory (RAM) if the server still hangs a little after converting your queries to pass-through queries, because every transaction or query is executed on the server first, and only the processed data is then passed to your PC.

    ALSO:

    Try to put the DATA and the LOG file of the SQL database on a partition that is formatted with a 32 KB or 64 KB cluster size (not the default of 4 KB),

    because the default page size SQL Server uses in memory is 8 KB.

    If you use the default cluster size when formatting the partition, then to fetch one page (8 KB) you need two hits on the hard disk (4 KB each).

    But if you use a 32 KB cluster size when formatting the partition, then with one hit on the hard disk (32 KB) you get four pages into memory. That is much faster, because the slowest part of the whole system is reading from and writing to the hard disk.

    I use 32 KB, not 64 KB, because with 64 KB you may end up with a lot of empty pages in memory (wasted memory, and your database files may grow larger).
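
    For reference, the cluster (allocation unit) size is chosen when the volume is formatted; a hedged example of the Windows command (the drive letter is hypothetical, and formatting erases the volume):

    format E: /FS:NTFS /A:32K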

    This, of course, will further help you solve the hang problem. Bye!

    Alamir Mohamed

    Alamir_mohamed@yahoo.com



  • Thanks to all of you. Very good ideas. Creating a stored procedure to access this large table and binding it to the cache can also be an asset.

    If this occurred in your environment, what would your workflow be? Just a general discussion. Hope to get your replies.

  • Absolutely the most important thing in a large table is its clustered index. Optimizing this will help you more than partitioning or any hardware mods.

    Ideally the clustered index is unique, so that SQL can use the key value itself as the row locator instead of adding a hidden uniquifier to that value.

    For updates, having a clustered index that is too wide or is modified too often can really hurt performance. That's why a lot of people use the identity column as the clustered index, as it's static and incremental. However, while this may improve your DML statements, it's going to suck if you don't use that identity column in your lookups.

    For lookups, you MUST reference the clustered index in the joins or where clause if you want any sort of speed. Really, it doesn't matter how big your table is; you can get millisecond returns if SQL is able to do a seek on the clustered index.

    All that really means is that you're taking a hit to maintain your clustered index one way or another, so it had better be useful. Making it too wide can hurt your DML performance, but a wide index is more useful for lookups. It's a trade-off.

    As an example, I have a lookup table that has customer IDs and their First Name, Last Name, Zip, and Birthday. This table is used to return the ID based on a lookup of all the columns in the table (except ID, of course). This table is rebuilt on a weekly basis, and static the rest of the time.

    So, in this situation I put all my columns into my clustered index. 🙂 Actually, I made the combination of all these columns my Primary Key (clustered), as each lookup must return 0 or 1 records.
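
    A sketch of that kind of table; the column names and types are assumptions:

    CREATE TABLE dbo.CustomerLookup
    (
        CustomerID int         NOT NULL,
        FirstName  varchar(50) NOT NULL,
        LastName   varchar(50) NOT NULL,
        Zip        char(5)     NOT NULL,
        Birthday   datetime    NOT NULL,
        -- every lookup column goes into the clustered primary key
        CONSTRAINT PK_CustomerLookup PRIMARY KEY CLUSTERED
            (LastName, FirstName, Zip, Birthday)
    );

    -- This lookup can be answered by a single clustered index seek:
    SELECT CustomerID
    FROM dbo.CustomerLookup
    WHERE LastName = 'Smith' AND FirstName = 'Jan'
      AND Zip = '12345' AND Birthday = '1970-01-01';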

    Of course this isn't always the way to do it, but I hope you get my drift. Trust me, the clustered index will make all the difference.

    Signature is NULL

  • Thanks, Calvin. Indexing is the first job; what is your suggestion for the next steps?

  • test test test...

    Also, Normalization:

    Is there redundant data in the table?

    Can this table be broken up into x number of smaller tables?

    Data types: avoid char and varchar if possible, especially for key values or repeated values (the same value in more than one row). int will give you good performance for most things.

    A method some people are using is called "Horizontal Partitioning". This is basically

    1) breaking the table up into smaller chunks, say by date or something

    2) Put check constraints on the column you're breaking up the table by. This gives it min and max values allowed for that table.

    3) Build a view over the tables (a partitioned view).

    From what I understand, if this is set up correctly SQL will know exactly what table to use to get the data. Never tried it myself, though...
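
    A minimal sketch of that pattern (all table, column, and constraint names here are invented):

    CREATE TABLE dbo.Sales2003
    (
        SaleID   int      NOT NULL,
        SaleDate datetime NOT NULL
            CHECK (SaleDate >= '20030101' AND SaleDate < '20040101'),
        CONSTRAINT PK_Sales2003 PRIMARY KEY (SaleID, SaleDate)
    );

    CREATE TABLE dbo.Sales2004
    (
        SaleID   int      NOT NULL,
        SaleDate datetime NOT NULL
            CHECK (SaleDate >= '20040101' AND SaleDate < '20050101'),
        CONSTRAINT PK_Sales2004 PRIMARY KEY (SaleID, SaleDate)
    );
    GO

    CREATE VIEW dbo.Sales
    AS
    SELECT SaleID, SaleDate FROM dbo.Sales2003
    UNION ALL
    SELECT SaleID, SaleDate FROM dbo.Sales2004;
    GO

    -- Thanks to the CHECK constraints, this only touches dbo.Sales2004:
    -- SELECT * FROM dbo.Sales WHERE SaleDate = '20040615';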

    Other nonclustered indexes are important as well, although it's amazing how many times a composite clustered index will do away with the need for these.

    Signature is NULL

  • Good, good, good...

  • Basically, how fast your response times are depends on your application.

    If you have only single-row selects and updates, then the size of the database does not matter, provided you have well-designed indexes.

    But if you have queries scanning a large amount of data, then there aren't too many mysteries left. Of course you can save some milliseconds or seconds by doing as suggested above (separate filegroups on separate spindles, or even creating several physical files within one filegroup...), and also try to put your indexes on a separate filegroup/spindle.

    If you are running ORDER BY queries, try to index the ordered columns.
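
    For example (the table and column names are hypothetical):

    -- Lets SQL return rows in OrderDate order without a sort step
    CREATE NONCLUSTERED INDEX IX_Orders_OrderDate
        ON dbo.Orders (OrderDate);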

    And in this case tempdb should be large enough (avoid autogrow during online operations) and should reside on a separate, very fast disk.

    Of course, try to avoid autogrow during online hours for every one of your files. The same goes for updating statistics.
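
    Pre-sizing the files during a maintenance window is one way to keep autogrow out of the online hours; a sketch with invented logical file names and sizes:

    ALTER DATABASE MyDB
        MODIFY FILE (NAME = 'MyDB_Data', SIZE = 50GB);
    ALTER DATABASE MyDB
        MODIFY FILE (NAME = 'MyDB_Log', SIZE = 10GB);

    -- tempdb as well ('tempdev' is the default logical name of its data file)
    ALTER DATABASE tempdb
        MODIFY FILE (NAME = 'tempdev', SIZE = 8GB);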

    By the way: up-to-date statistics and defragmented indexes are very important!

    I like to do all those maintenance tasks outside the peak periods.

    Bye

    Gabor



