December 25, 2018 at 8:09 am
Hi Experts,
I am having two tables. 1 is Master having merely 28 records and 2nd, Transaction table, is having around 56 millions of records. Transaction table is being referenced by Master table. Maser table's PK (ID column)has been defined as Int, however there are no chances of growing Master table beyond 50 / 75 records is near future. Hence I am thinking to change the Master Table's ID column's datatype from Int to TinyInt. I want to create POC to showcase how much space would be recovered post changing the datatype from Int to TinyInt in Transaction table and it's relevant indexes.
Please let me know if any other details to be shared.
Thanks in advance,
Mahesh
MH-09-AM-8694
December 25, 2018 at 12:10 pm
Mahesh Bote - Tuesday, December 25, 2018 8:09 AMHi Experts,I am having two tables. 1 is Master having merely 28 records and 2nd, Transaction table, is having around 56 millions of records. Transaction table is being referenced by Master table. Maser table's PK (ID column)has been defined as Int, however there are no chances of growing Master table beyond 50 / 75 records is near future. Hence I am thinking to change the Master Table's ID column's datatype from Int to TinyInt. I want to create POC to showcase how much space would be recovered post changing the datatype from Int to TinyInt in Transaction table and it's relevant indexes.
Please let me know if any other details to be shared.
Thanks in advance,
Mahesh
Personally, I would not change the data type of the ID column in either the master or transaction table. Changing it because you don't foresee it increasing to 50 or 75 rows of data in the near future. Remember that a tiny int only holds the values 0 to 127. Keeping it as an integer allows it to grow as needed without having to worry about a restructure in the future.
December 25, 2018 at 4:07 pm
Lynn Pettis - Tuesday, December 25, 2018 12:10 PMMahesh Bote - Tuesday, December 25, 2018 8:09 AMHi Experts,I am having two tables. 1 is Master having merely 28 records and 2nd, Transaction table, is having around 56 millions of records. Transaction table is being referenced by Master table. Maser table's PK (ID column)has been defined as Int, however there are no chances of growing Master table beyond 50 / 75 records is near future. Hence I am thinking to change the Master Table's ID column's datatype from Int to TinyInt. I want to create POC to showcase how much space would be recovered post changing the datatype from Int to TinyInt in Transaction table and it's relevant indexes.
Please let me know if any other details to be shared.
Thanks in advance,
MaheshPersonally, I would not change the data type of the ID column in either the master or transaction table. Changing it because you don't foresee it increasing to 50 or 75 rows of data in the near future. Remember that a tiny int only holds the values 0 to 127. Keeping it as an integer allows it to grow as needed without having to worry about a restructure in the future.
Careful now. TinyInt has no negative values and it holds all values from 0 thru 255 because the 8th bit in the byte is not used for the sign of the number. And with 56 million * 3 bytes (168 MB savings for just one column on one table) saved in one table alone, it might be worth it especially if the system has memory issues. If that table has NCI's that also contain the column, well... you get the idea.
I do, however, very much agree with you about possibly not knowing the future scalability of the smaller table. In seeming contrast though, I also have a ton of reference tables that will never have more than a single byte PK be they numeric or simply alphabetic.
Of course, there are also other considerations like what-code-where may go south because of a mismatch in datatypes that didn't exist before even if they are compatible datatypes.
--Jeff Moden
Change is inevitable... Change for the better is not.
December 26, 2018 at 12:48 am
Jeff Moden - Tuesday, December 25, 2018 4:07 PM...And with 56 million * 3 bytes (168 MB savings for just one column on one table) saved in one table alone, it might be worth it especially if the system has memory issues.
If that table has NCI's that also contain the column, well... you get the idea.
I do, however, very much agree with you about possibly not knowing the future scalability of the smaller table. In seeming contrast though, I also have a ton of reference tables that will never have more than a single byte PK be they numeric or simply alphabetic.
Of course, there are also other considerations like what-code-where may go south because of a mismatch in datatypes that didn't exist before even if they are compatible datatypes.
Thanks Jeff.
1. Yes, Transaction table has NCI having ID( from Master Table) in it.
2. Checked with BA team on the future scalability of the master table, they confirmed that it won't grow beyond 75 in any circumstances. Also few other tables are having same master table's id, defined as Int having part of NCI as well.
So in short wanted to know methods to calculate how much space can be recovered post changes.
Thanks again,
Mahesh
MH-09-AM-8694
December 26, 2018 at 3:45 am
Mahesh Bote - Wednesday, December 26, 2018 12:48 AMJeff Moden - Tuesday, December 25, 2018 4:07 PM...And with 56 million * 3 bytes (168 MB savings for just one column on one table) saved in one table alone, it might be worth it especially if the system has memory issues.
If that table has NCI's that also contain the column, well... you get the idea.
I do, however, very much agree with you about possibly not knowing the future scalability of the smaller table. In seeming contrast though, I also have a ton of reference tables that will never have more than a single byte PK be they numeric or simply alphabetic.
Of course, there are also other considerations like what-code-where may go south because of a mismatch in datatypes that didn't exist before even if they are compatible datatypes.
Thanks Jeff.
1. Yes, Transaction table has NCI having ID( from Master Table) in it.
2. Checked with BA team on the future scalability of the master table, they confirmed that it won't grow beyond 75 in any circumstances. Also few other tables are having same master table's id, defined as Int having part of NCI as well.So in short wanted to know methods to calculate how much space can be recovered post changes.
Thanks again,
Mahesh
Here is a formula for the calculation
😎
DECLARE @Row_Count INT = 1000000;
DECLARE @COL_COUNT INT = 3;
DECLARE @INT_SIZE INT = 8;
DECLARE @TINYSIZE INT = 1;
SELECT
((@INT_SIZE - @TINYSIZE) * @Row_Count * @COL_COUNT) AS BYTES_SAVED
,ROUND(((@INT_SIZE - @TINYSIZE) * @Row_Count * @COL_COUNT) / POWER(1024.0,2),3) AS MB_SAVED
;
December 26, 2018 at 3:47 am
Mahesh Bote - Wednesday, December 26, 2018 12:48 AMJeff Moden - Tuesday, December 25, 2018 4:07 PM...And with 56 million * 3 bytes (168 MB savings for just one column on one table) saved in one table alone, it might be worth it especially if the system has memory issues.
If that table has NCI's that also contain the column, well... you get the idea.
I do, however, very much agree with you about possibly not knowing the future scalability of the smaller table. In seeming contrast though, I also have a ton of reference tables that will never have more than a single byte PK be they numeric or simply alphabetic.
Of course, there are also other considerations like what-code-where may go south because of a mismatch in datatypes that didn't exist before even if they are compatible datatypes.
Thanks Jeff.
1. Yes, Transaction table has NCI having ID( from Master Table) in it.
2. Checked with BA team on the future scalability of the master table, they confirmed that it won't grow beyond 75 in any circumstances. Also few other tables are having same master table's id, defined as Int having part of NCI as well.So in short wanted to know methods to calculate how much space can be recovered post changes.
Thanks again,
Mahesh
Really the question is something that you could calculate yourself with regards to the max saving.
looking at the storage used by each datatype you can easily calculate what is the max saving you will get. https://docs.microsoft.com/en-us/sql/t-sql/data-types/int-bigint-smallint-and-tinyint-transact-sql?view=sql-server-2017
how much space will really be saved will depend on other factors such as compression, page fill factor, record length, average record length whether this column is nullable and what percentage of the table have null values (could be none on this case) and so on and there is no easy way to calculate it other than test with a sample (or full table) to see what is the final size.
for example if your record size is such that (worst case scenario) you have 1 row per page, changing this datatype will not save you any space at all as reducing 3 bytes will still not allow for more rows per page.
so a possible way to calculate would be to determine how many rows per page you have now, how many those 3 bytes will give you per page once implemented and see how many pages in total you would save. simple math
December 26, 2018 at 8:28 am
Mahesh Bote - Wednesday, December 26, 2018 12:48 AMJeff Moden - Tuesday, December 25, 2018 4:07 PM...And with 56 million * 3 bytes (168 MB savings for just one column on one table) saved in one table alone, it might be worth it especially if the system has memory issues.
If that table has NCI's that also contain the column, well... you get the idea.
I do, however, very much agree with you about possibly not knowing the future scalability of the smaller table. In seeming contrast though, I also have a ton of reference tables that will never have more than a single byte PK be they numeric or simply alphabetic.
Of course, there are also other considerations like what-code-where may go south because of a mismatch in datatypes that didn't exist before even if they are compatible datatypes.
Thanks Jeff.
1. Yes, Transaction table has NCI having ID( from Master Table) in it.
2. Checked with BA team on the future scalability of the master table, they confirmed that it won't grow beyond 75 in any circumstances. Also few other tables are having same master table's id, defined as Int having part of NCI as well.So in short wanted to know methods to calculate how much space can be recovered post changes.
Thanks again,
Mahesh
Just remember that specifications that won't change can change.
December 26, 2018 at 8:38 am
Lynn Pettis - Wednesday, December 26, 2018 8:28 AMMahesh Bote - Wednesday, December 26, 2018 12:48 AMJeff Moden - Tuesday, December 25, 2018 4:07 PM...And with 56 million * 3 bytes (168 MB savings for just one column on one table) saved in one table alone, it might be worth it especially if the system has memory issues.
If that table has NCI's that also contain the column, well... you get the idea.
I do, however, very much agree with you about possibly not knowing the future scalability of the smaller table. In seeming contrast though, I also have a ton of reference tables that will never have more than a single byte PK be they numeric or simply alphabetic.
Of course, there are also other considerations like what-code-where may go south because of a mismatch in datatypes that didn't exist before even if they are compatible datatypes.
Thanks Jeff.
1. Yes, Transaction table has NCI having ID( from Master Table) in it.
2. Checked with BA team on the future scalability of the master table, they confirmed that it won't grow beyond 75 in any circumstances. Also few other tables are having same master table's id, defined as Int having part of NCI as well.So in short wanted to know methods to calculate how much space can be recovered post changes.
Thanks again,
MaheshJust remember that specifications that won't change can change.
I guess we'll going to see another question then
😎
My thought is that by applying compression on the column, the benefit will be greater and no downstream maintenance required if things do change.
Lynn Pettis - Wednesday, December 26, 2018 8:28 AMMahesh Bote - Wednesday, December 26, 2018 12:48 AMJeff Moden - Tuesday, December 25, 2018 4:07 PM...And with 56 million * 3 bytes (168 MB savings for just one column on one table) saved in one table alone, it might be worth it especially if the system has memory issues.
If that table has NCI's that also contain the column, well... you get the idea.
I do, however, very much agree with you about possibly not knowing the future scalability of the smaller table. In seeming contrast though, I also have a ton of reference tables that will never have more than a single byte PK be they numeric or simply alphabetic.
Of course, there are also other considerations like what-code-where may go south because of a mismatch in datatypes that didn't exist before even if they are compatible datatypes.
Thanks Jeff.
1. Yes, Transaction table has NCI having ID( from Master Table) in it.
2. Checked with BA team on the future scalability of the master table, they confirmed that it won't grow beyond 75 in any circumstances. Also few other tables are having same master table's id, defined as Int having part of NCI as well.So in short wanted to know methods to calculate how much space can be recovered post changes.
Thanks again,
MaheshJust remember that specifications that won't change can change.
Columnar compression is probably the way to go here, more benefit and less downstream maintenance.
😎
December 26, 2018 at 8:50 am
Also, changing from INT to TINYINT will require some down time to rebuild the tables and indexes.
December 26, 2018 at 9:05 am
Mahesh Bote - Wednesday, December 26, 2018 12:48 AMJeff Moden - Tuesday, December 25, 2018 4:07 PM...And with 56 million * 3 bytes (168 MB savings for just one column on one table) saved in one table alone, it might be worth it especially if the system has memory issues.
If that table has NCI's that also contain the column, well... you get the idea.
I do, however, very much agree with you about possibly not knowing the future scalability of the smaller table. In seeming contrast though, I also have a ton of reference tables that will never have more than a single byte PK be they numeric or simply alphabetic.
Of course, there are also other considerations like what-code-where may go south because of a mismatch in datatypes that didn't exist before even if they are compatible datatypes.
Thanks Jeff.
1. Yes, Transaction table has NCI having ID( from Master Table) in it.
2. Checked with BA team on the future scalability of the master table, they confirmed that it won't grow beyond 75 in any circumstances. Also few other tables are having same master table's id, defined as Int having part of NCI as well.So in short wanted to know methods to calculate how much space can be recovered post changes.
Thanks again,
Mahesh
You should know some things about the datatypes. They ARE in "Books Online" and on the network for Microsoft Docs.
An INT datatype is 4 bytes. A TINYINT is 1 Byte. There will be a 3 byte savings per row per index (always include the Clustered Index and any Non Clustered Index that contains the column). That won't include the additional savings in the B-Tree of the index but that's nearly trivial compared to the Leaf Level, which is where the data for the index lives.
Then, do the simple math. 3*TheNumberOfRows*TheNumberOfIndexes.
Of course, once you're made the datatype change, you'll need to rebuild the indexes (Clustered at least plus any Non Clustered Indexes that contain the column either as a key or an INCLUDE).
As the others have stated, columnar compression may also bring huge benefits, especially for this relatively low cardinality column).
--Jeff Moden
Change is inevitable... Change for the better is not.
January 3, 2019 at 4:37 am
thank you all for your valuable inputs.
MH-09-AM-8694
Viewing 11 posts - 1 through 10 (of 10 total)
You must be logged in to reply to this topic. Login to reply