June 20, 2007 at 8:05 am
In general, what are the recommendations, from a sizing perspective, for adding additional data files to a filegroup? In a new version of our DW I'm seeing the possibility of having some very large data files, in excess of 200 GB.
June 20, 2007 at 9:12 am
I'm not sure I've seen sizing recommendations tied to a particular file size; usually extra files get added because of space constraints. Filegroups are also a nice tool if you have sections of read-only data: you can move them to their own filegroups and then back them up far less often, saving time and resources while still having recoverability.
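A rough sketch of what I mean, with made-up database, filegroup, and path names:
-- Sketch only: [Archive2006] is a hypothetical read-only filegroup.
-- Mark the filegroup read-only once its data stops changing...
ALTER DATABASE MyDW MODIFY FILEGROUP [Archive2006] READ_ONLY;
-- ...back it up once after it goes read-only...
BACKUP DATABASE MyDW
FILEGROUP = 'Archive2006'
TO DISK = 'X:\Backups\MyDW_Archive2006.bak';
-- ...then the routine backups only need to cover the read/write filegroups.
BACKUP DATABASE MyDW
READ_WRITE_FILEGROUPS
TO DISK = 'X:\Backups\MyDW_rw.bak';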
June 20, 2007 at 10:26 am
Storage may have upper limits. Also, in our DW I've split things out into logical filegroups to aid with performance and recovery (there's a rough sketch of the DDL after the list). I've split out the filegroups as follows:
1. A staging data and a staging index filegroup. If we lose these it's no big deal, since they're used strictly to gather data from our subject areas prior to analytical processing and data aggregation. I can easily recreate the empty filegroups and then create the empty staging tables from a DDL extract produced by the Database Publishing Wizard backup that runs every night.
2. A data and an index filegroup for each subject area of the DW.
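Roughly what the filegroup setup looks like; the database name, file paths, and sizes below are only placeholders:
-- Sketch only: names and paths are illustrative.
ALTER DATABASE MyDW ADD FILEGROUP [StagingData];
ALTER DATABASE MyDW ADD FILEGROUP [StagingIndex];
ALTER DATABASE MyDW ADD FILE
(NAME = 'StagingData1', FILENAME = 'E:\Data\MyDW_StagingData1.ndf', SIZE = 10GB)
TO FILEGROUP [StagingData];
-- staging tables live on their own filegroup, so losing it only costs empty tables
CREATE TABLE dbo.StageOrders (OrderID INT NOT NULL, LoadDate DATETIME NOT NULL)
ON [StagingData];
The subject-area filegroups follow the same pattern, e.g. a [SalesData] and [SalesIndex] pair per subject area.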
June 20, 2007 at 10:53 am
Thanks for the info, guys. Tim, do you have any sort of general rule as to when you add another file to your data or index filegroups, for example when the last file reaches a certain size? Or I should say, how are you deciding the number of files to include in your data and index filegroups?
June 20, 2007 at 12:58 pm
It depends on what you want. You can use filegroups for performance, e.g. placing read-intensive data on its own RAID 5 set of physical disks and write-intensive data on a RAID 10 set.
You can also use them to organise data for easier management, i.e. placing data on separate filegroups by function. You will need to know the size of the data in question; that will give you an idea of how much space to allocate to each filegroup.
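For example (the drive letter, filegroup, and table names here are only illustrative; assume R:\ sits on its own array):
-- Sketch only: put a filegroup's file on the array that suits its access pattern
ALTER DATABASE MyDW ADD FILEGROUP [ReadHeavy];
ALTER DATABASE MyDW ADD FILE
(NAME = 'ReadHeavy1', FILENAME = 'R:\Data\MyDW_ReadHeavy1.ndf', SIZE = 50GB)
TO FILEGROUP [ReadHeavy];
-- to size the allocation, check how much space the tables you plan to move actually use
EXEC sp_spaceused 'dbo.FactSales';   -- table name is a placeholder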
June 22, 2007 at 8:07 am
The factor that is usually the biggest limit for me is restore time. If you lose a single file, what is the maximum time you're willing to spend restoring it? Compare that to the maximum data you'll have and any upper limits on the number of data files, and find a happy medium.
At one time I considered 8 GB to be a big file. If 200 GB is the size of your db, just splitting it up to spread the load across multiple drives (yes, I know RAID takes care of that) can help; maybe shoot for 32 GB files. And even over a gigabit network, a 200 GB file can take a while to restore. If I recall some of my tests, 200 GB could take upwards of a couple of hours (of course filesystem performance, CPU, etc. play into that).
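To put some shape on the single-file restore, a sketch with made-up database, file, and backup names:
-- restore only the damaged file from the last full (or filegroup) backup...
RESTORE DATABASE MyDW
FILE = 'MyDW_Data3'
FROM DISK = 'X:\Backups\MyDW_full.bak'
WITH NORECOVERY;
-- ...then roll forward the log backups taken since, finishing with RECOVERY
RESTORE LOG MyDW
FROM DISK = 'X:\Backups\MyDW_log1.trn'
WITH RECOVERY;
As a rough sanity check on the timing: gigabit tops out around 100 MB/sec in practice, so 200 GB is on the order of 2,000 seconds (about half an hour) of raw transfer before verification, file initialization, and everything else stretch it out. Smaller files keep that single-file restore window short.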
June 22, 2007 at 9:15 am
Thanks for the info, Bob. Yes, I remember the 8 GB data file days. I'm looking at a 3 TB warehouse, so I'll probably try to keep files under 200 GB.
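For what it's worth, the kind of thing I have in mind for capping file sizes (names and numbers are only illustrative):
-- cap each data file so it can't grow past 200 GB...
ALTER DATABASE MyDW
MODIFY FILE (NAME = 'SalesData1', MAXSIZE = 200GB, FILEGROWTH = 1GB);
-- ...and add another file to the filegroup as the existing ones approach the cap
ALTER DATABASE MyDW ADD FILE
(NAME = 'SalesData2', FILENAME = 'F:\Data\MyDW_SalesData2.ndf',
SIZE = 10GB, MAXSIZE = 200GB, FILEGROWTH = 1GB)
TO FILEGROUP [SalesData];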
June 22, 2007 at 10:09 am
Hi Rob,
Sorry for my delay. Here is a stored procedure that I use to notify me when individual data files are > 70% full. I call it from an Agent job first thing in the morning and send the results to myself via Database Mail (there's a sketch of the job step after the procedure).
USE [DBMonitor]
GO
/****** Object: StoredProcedure [dbo].[spDBSizerptV1] Script Date: 06/22/2007 10:07:13 ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER OFF
GO
CREATE PROCEDURE [dbo].[spDBSizerptV1]
AS
BEGIN
SET QUOTED_IDENTIFIER OFF
SET NOCOUNT ON
DECLARE @myCmd VARCHAR(2000)
CREATE TABLE #T2 (
_dbname SYSNAME,        -- database name (can be up to 128 characters)
_TypeDesc VARCHAR(10),
_CurrentSize FLOAT,
_MaxSize FLOAT,
_data_space_id INT,
_name SYSNAME           -- filegroup name
)
-- populate temp table #T2 for each database
SELECT @myCmd = 'INSERT INTO #T2 SELECT ''?'', LEFT(a.type_desc,10), a.size*8/1024, a.max_size*8/1024, a.data_space_id, b.name FROM [?].sys.database_files a JOIN [?].sys.filegroups b ON a.data_space_id = b.data_space_id WHERE a.type = 0'
exec sp_MSforeachdb @command1=@myCmd
-- show filegroups whose data files are on average more than 70% full; files with unrestricted growth (max_size <= 0) are excluded
SELECT _dbname AS [Database Name],
_name AS [File Group],
SUM(_CurrentSize) AS [Current File Size (MB)],
SUM(_MaxSize) AS [Max File Size (MB)],
ROUND(SUM(_CurrentSize)/SUM(_MaxSize),2)*100 AS [Pct Full]
FROM #T2
WHERE _MaxSize > 0
AND _TypeDesc = 'ROWS'
GROUP BY _dbname, _name
HAVING ROUND(SUM(_CurrentSize)/SUM(_MaxSize),2)*100 > 70
ORDER BY _dbname, _name ASC
-- clean up
DROP TABLE #T2
END
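And the Agent job step is basically just a Database Mail call; the profile name and address below are placeholders:
-- run the report through Database Mail so the >70% list lands in my inbox each morning
EXEC msdb.dbo.sp_send_dbmail
@profile_name = 'DBA Mail',
@recipients = 'dba@example.com',
@subject = 'Data files over 70% full',
@query = 'EXEC DBMonitor.dbo.spDBSizerptV1';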