April 17, 2007 at 12:10 am
Hello SQL admins,
I know that converting from DATETIME to SMALLDATETIME should save us 4 bytes of data per row, which over millions of rows would add up to some savings.
My question is this: I have altered columns from DATETIME to SMALLDATETIME, yet after running DBCC UPDATEUSAGE as well as sp_spaceused (with the @updateusage = 'true' parameter), the same amount of used data is still being reported.
So can someone explain why the space used for the table's data remains the same? I understand that one could create a new table with the adjusted datatypes and load it, and then you would see some space saved. However, are there any DBCC commands you know of that will "refresh" the underlying structure and reflect the change in the data file?
My thanks in advance for your answers. I realize storage gets cheaper every day, but we are looking into this option to help us keep our sexy and work out our situation.
April 17, 2007 at 2:26 am
Space savings from such changes are preserved within the database for future use. A DBCC SHRINKDATABASE might release the space to the OS.
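For example, a minimal sketch (the database name, logical file name, and the targets are placeholders; shrinking can also fragment indexes, so test it first):

-- Release unused space back to the OS, leaving roughly 10 percent free inside the files
DBCC SHRINKDATABASE ('YourDatabase', 10)
-- Or target a single data file by its logical name and a size in MB
DBCC SHRINKFILE ('YourDatabase_Data', 100)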
April 17, 2007 at 3:02 am
- did you execute your db maintenance (DBCC DBREINDEX on the table)?
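Something along these lines, as a sketch ('YourDatabase' and 'dbo.YourTable' are placeholders for your own names):

-- Rebuild all indexes on the table so the rows are repacked onto pages
DBCC DBREINDEX ('dbo.YourTable', '', 0)
-- Then refresh the usage figures before checking sp_spaceused again
DBCC UPDATEUSAGE ('YourDatabase', 'dbo.YourTable')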
Johan
April 17, 2007 at 7:48 am
First off, it is a myth that changing datatypes will have any significant size impact unless you make a significant change in the total average length of a row. If it is an indexed column you may (not guaranteed) see a modest saving, but generally nothing to warrant such a change in most cases.
It depends heavily on the total average length of your rows. Consider: if a record averages 1K, you fit about 8 records to the page. If you change one DATETIME column to SMALLDATETIME you save 4 bytes per row, or 32 bytes per page. Because the average row is far larger than your savings, you will not see any change in the overall size of the data, as you can still only fit 8 records on every 8K page.
So even if you do maintenance on the table and shrink the database, odds are you will not recover more than a few bytes to a few megabytes over millions of rows.
The best way to know your impact is to figure the average row length and divide 8K by it to work out how many records per data page you can hold; remember, only whole numbers (rounded down) count. You can also use DBCC SHOWCONTIG to get that information for an existing table. Next, figure the new average length after the datatype change and work out how many records fit on an 8K page. If the whole number is exactly the same, you will not see any savings on the server.
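A quick way to get the average record size (a sketch; 'dbo.YourTable' is a placeholder, and the 8,060 bytes of usable row space per page is an approximation):

-- TABLERESULTS output includes AverageRecordSize, MinimumRecordSize, MaximumRecordSize and Pages
DBCC SHOWCONTIG ('dbo.YourTable') WITH TABLERESULTS
-- Records per page is then roughly 8060 / AverageRecordSize, rounded down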
Now consider (in theory) you have 300 million rows and each row averages 996 bytes, including the overhead for row markers that you don't generally account for. That means you have 8 records per page. So you change the DATETIME column to SMALLDATETIME hoping to save 4 bytes per row, or about 1.12 GB:
(300,000,000 * 4) / 1024 / 1024 / 1024
but in reality, because you only changed the average length from 996 to 992, you can still fit only 8 records to a page, thus your savings = 0 bytes.
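That arithmetic as a runnable sketch (assuming roughly 8,060 usable bytes of row space per page; the row count and lengths are the hypothetical figures above):

DECLARE @rows bigint, @old_len int, @new_len int
SELECT @rows = 300000000, @old_len = 996, @new_len = 992
SELECT  8060 / @old_len           AS rows_per_page_before,  -- 8
        8060 / @new_len           AS rows_per_page_after,   -- still 8
        @rows / (8060 / @old_len) AS pages_before,          -- 37,500,000
        @rows / (8060 / @new_len) AS pages_after            -- identical, so nothing is saved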
April 17, 2007 at 9:55 am
Thank you Antares,
That was very informative and makes a lot of sense. I guess our best bet is to defragment or reindex some tables if need be, and shrink the transaction logs, for any kind of space savings that could also be released back to the operating system.
I will, however, continue to do a little testing. We have some tables with multiple DATETIME columns that may add up to a magic number, but from what you mention it looks like it could be a long shot. At the least I would be learning some more of the nitty gritty of how data is stored on disk.
Thanks !
April 17, 2007 at 10:32 am
A lot of the time the biggest savings you can find come simply from normalizing the tables.
Example:
-- Before:
CREATE TABLE Facilities (
    Facility_ID int NOT NULL PRIMARY KEY,
    -- ...
    Published_Phone numeric(9,0) NULL,
    Backdoor_Phone numeric(9,0) NULL,
    Fax_Phone numeric(9,0) NULL,
    Other_Phone numeric(9,0) NULL
    -- ...
)
Was changed like so:
CREATE TABLE Facilities (
    Facility_ID tinyint NOT NULL PRIMARY KEY
    -- ...
)
CREATE TABLE Phone_Types (
    Phone_Type_ID int NOT NULL PRIMARY KEY,
    Phone_Type varchar(10) NOT NULL
)
CREATE TABLE Facility_Phone_Numbers (
    Facility_ID tinyint NOT NULL REFERENCES Facilities (Facility_ID),
    Phone_Type_ID int NOT NULL REFERENCES Phone_Types (Phone_Type_ID),
    Phone_Number numeric(9,0) NOT NULL
)
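A usage sketch against that layout (the 'Fax' literal is just for illustration):

-- Pull the fax numbers, resolving the phone type through the lookup table
SELECT  f.Facility_ID, pt.Phone_Type, fpn.Phone_Number
FROM    Facility_Phone_Numbers fpn
        JOIN Phone_Types pt ON pt.Phone_Type_ID = fpn.Phone_Type_ID
        JOIN Facilities f ON f.Facility_ID = fpn.Facility_ID
WHERE   pt.Phone_Type = 'Fax'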
There were a number of other fields we addressed like this as well, which reduced a lot of the space we had been taking up, improved data reliability, and improved application scalability. In the example we could add phone types without recoding the app or the db, just simple data changes.
Also, consider using lookup tables for things such as the phone types above: instead of putting the text version in the main table, you can replace significant portions of reused and lengthy data with a small key.
Further things you may not even realize matter are variable length datatypes and rearranging columns based on nullability. For every NULL variable length column (varchar, varbinary, etc.) there are 2 bytes of overhead if later columns contain data or are not nullable. Many people (including myself in the past) design the table logically without concern for this, but the logical order of columns in a table means nothing to anyone other than the person reading the DDL directly. Place variable length columns that are not nullable first, then nullable columns in order of usage (least often NULL first, those populated least last). This will shave off some space, but again it depends on how significantly the row length changes.
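A sketch of that ordering idea (hypothetical table and column names):

CREATE TABLE Customer_Notes (
    Note_ID     int NOT NULL PRIMARY KEY,  -- fixed length columns first
    Created_On  smalldatetime NOT NULL,
    Note_Text   varchar(500) NOT NULL,     -- variable length but never NULL
    Reviewed_By varchar(50) NULL,          -- usually populated
    Follow_Up   varchar(200) NULL          -- rarely populated, so it goes last
)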
Lastly, consider the datatype you use for your clustered index. If it is an identity column there will be little slack (empty space) produced over a period of time unless you delete records. Something like a uniqueidentifier, which is extremely random, can mean that over a short period of time you have a lot of page splits and many pages left 20-50% empty because of it.
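The contrast, as a hypothetical sketch (table and column names made up): the identity key always appends new rows at the end of the clustered index, while the random NEWID() values land all over it and split pages.

CREATE TABLE Orders_Identity (
    Order_ID int IDENTITY(1,1) NOT NULL PRIMARY KEY CLUSTERED,
    Ordered_On smalldatetime NOT NULL
)
CREATE TABLE Orders_Guid (
    Order_ID uniqueidentifier NOT NULL DEFAULT NEWID() PRIMARY KEY CLUSTERED,
    Ordered_On smalldatetime NOT NULL
)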
April 17, 2007 at 10:46 am
Antares,
We do make use of lookup tables in most of our applications; these tables in particular are not taking full advantage of them. Also, another good point on column order and indexed columns. You're a total effin DBA stud, dude!
This is great, I am very thankful for your advice!
April 17, 2007 at 11:03 am
Thanks, I spent several months playing with DBCC PAGE and a few other undocumented items, which helped me understand what was going on inside the system. Reading the columns in their raw format was very informative, and I made dozens of posts very early on SSC about what I found.
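For anyone curious, a rough sketch of the kind of thing involved (DBCC PAGE is undocumented; the database id, file and page numbers here are placeholders):

-- Send DBCC output to the client connection instead of the error log
DBCC TRACEON (3604)
-- Dump file 1, page 50 of database id 5, interpreting the rows (print option 3)
DBCC PAGE (5, 1, 50, 3)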
Also, if you ever have a chance check out The Guru's Guide to SQL Server Architecture and Internals by Ken Henderson.