This is a followup to a previous post where I likened SQL Server float datatype to Muhammad Ali. Specifically, I wanted to address the phrase "negligible data storage increase" to the test for different numeric data types.
In short, I will dismiss the notion that float provides a realistic advantage in storage of large numbers. (Also, I had a blog title idea too catchy to pass on.)
I'll avoid recapping in total my earlier post about the dangerous consequences of rounding with the float data type. The float data type is still popular with developers and database designers because of its supposed advantages in data storage size. (Also, some code-first modelers use float as a default. Also dangerous. And lazy.)
Too often, developers and database designers ignore the large potential for inconsistent rounding with float (and real) data types, whereas a decimal(p,s) data type is not only consistent with storage, but it provides transparent logical intent to the development team. No surprises with truncation of numerals to the right of the decimal point with the decimal data type.
create table realtest (real1 float(24))
create table floattest (float53 float)--float(53) is the default
create table decimal9test (decimal9 decimal(9,3))
create table decimal19test (decimal19 decimal(19,3))
go
insert into realtest (real1) Values (123456.123)
insert into floattest (float53) Values (123456.123)
insert into decimal9test (decimal9) Values (123456.123)
insert into decimal19test (decimal19) Values (123456.123)
go
insert into realtest (real1) select real1 -1. from realtest
insert into floattest (float53) select float53 -1. from floattest
insert into decimal9test (decimal9) select decimal9 -1. from decimal9test
insert into decimal19test (decimal19) select decimal19 -1. from decimal19test
go 21
go
select sizemb = SUM(sizemb), tablename from (
select SizeMb = (p.in_row_reserved_page_count*8.)/1024.,
tablename = o.name, indexname = i.name
from sys.dm_db_partition_stats p
inner join sys.objects o on p.object_id = o.object_id
inner join sys.indexes i on i.object_id = o.object_id and i.index_id = p.index_id
where o.is_ms_shipped = 0
) x
where tablename in ('floattest', 'realtest', 'decimal9test', 'decimal19test')
group by tablename
order by sizemb desc
go
select top 1 * from realtest
select top 1 * from floattest
select top 1 * from decimal9test
select top 1 * from decimal19test
go
First, the resultsets showing the storage of 2+ million rows in a single-column table.
tablename | row_count | sizemb |
---|---|---|
realtest | 2097152 | 26.382812 |
floattest | 2097152 | 34.445312 |
decimal9test | 2097152 | 28.382812 |
decimal19test | 2097152 | 36.507812 |
Here's the values that were actually stored, notice the important differences.
create table realtest (real1 float(24) )
create table floattest(float53 float) --float(53) is the default
create table decimal19test(decimal19 decimal(19,7) )
...
tablename | row_count | sizemb |
---|---|---|
realtest | 2097152 | 26.382812 |
floattest | 2097152 | 34.445312 |
decimal19test | 2097152 | 36.507812 |
Let's apply DATA_COMPRESSION = PAGE to this data.
create table realtest (real1 float(24) ) with (data_compression = page)
create table floattest(float53 float) with (data_compression = page)
create table decimal19test(decimal19 decimal(19,7) )with (data_compression = page)
...
tablename | row_count | sizemb |
---|---|---|
floattest | 2097152 | 26.382812 |
decimal19test | 2097152 | 24.320312 |
Since page data compression is a very popular, very widespread and very useful tool on modern Enterprise edition SQL Servers, we can rule out float as having any space saving advantage. In short, float loses the last ground it stood on.
The real and float data type columns only compressed by 15% and 23% respectively. The fixed decimal data type compressed, in this case, by 43.5% (36.5mb to 24.3mb).
If we scale the page compressed data comparison up to 134 million rows...
tablename | row_count | sizemb |
---|---|---|
floattest | 134217728 | 1686.070312 |
decimal19test | 134217728 | 1556.015625 |