August 14, 2018 at 10:59 am
By default, Image data (and some other types) are written to PRIMARY. Is there any way to find out how much space is consumed by these images?
August 14, 2018 at 11:03 am
Tom John-342103 - Tuesday, August 14, 2018 10:59 AMBy default, Image data (and some other types) are written to PRIMARY. Is there any way to find out how much space is consumed by these images?
You can use DATALENGTHSELECT SUM(DATALENGTH(myImageCol))
FROM myTable
https://docs.microsoft.com/en-us/sql/t-sql/functions/datalength-transact-sql?view=sql-server-2017
August 14, 2018 at 11:27 am
Jonathan AC Roberts - Tuesday, August 14, 2018 11:03 AMTom John-342103 - Tuesday, August 14, 2018 10:59 AMBy default, Image data (and some other types) are written to PRIMARY. Is there any way to find out how much space is consumed by these images?You can use DATALENGTH
SELECT SUM(DATALENGTH(myImageCol))
FROM myTable
https://docs.microsoft.com/en-us/sql/t-sql/functions/datalength-transact-sql?view=sql-server-2017
Thanks. That worked with one modification as shown below:
SELECT SUM(cast(DATALENGTH(BinaryImageData) as bigint)) FROM IMAGEINDEX
I had to do this because the value returned was >2^31. The value was 31,116,152,328
August 14, 2018 at 8:18 pm
Tom John-342103 - Tuesday, August 14, 2018 11:27 AMJonathan AC Roberts - Tuesday, August 14, 2018 11:03 AMTom John-342103 - Tuesday, August 14, 2018 10:59 AMBy default, Image data (and some other types) are written to PRIMARY. Is there any way to find out how much space is consumed by these images?You can use DATALENGTH
SELECT SUM(DATALENGTH(myImageCol))
FROM myTable
https://docs.microsoft.com/en-us/sql/t-sql/functions/datalength-transact-sql?view=sql-server-2017Thanks. That worked with one modification as shown below:
SELECT SUM(cast(DATALENGTH(BinaryImageData) as bigint)) FROM IMAGEINDEXI had to do this because the value returned was >2^31. The value was 31,116,152,328
That's actually going to provide a bit of an inaccurate count. Yes, you're counting the bytes in the image but you're not counting the number of bytes it takes to store the data. You need to count the pages in the LOB and divide that by 128.0 to get the accurate count of the bytes consumed not just by the images but by the pages they are stored on.
The hard part is if you have more than one LOB column in the same table. I've not had to split them out before and so don't know how to do such a thing off the top of my head.
--Jeff Moden
Change is inevitable... Change for the better is not.
August 15, 2018 at 5:47 am
Jeff Moden - Tuesday, August 14, 2018 8:18 PMTom John-342103 - Tuesday, August 14, 2018 11:27 AMJonathan AC Roberts - Tuesday, August 14, 2018 11:03 AMTom John-342103 - Tuesday, August 14, 2018 10:59 AMBy default, Image data (and some other types) are written to PRIMARY. Is there any way to find out how much space is consumed by these images?You can use DATALENGTH
SELECT SUM(DATALENGTH(myImageCol))
FROM myTable
https://docs.microsoft.com/en-us/sql/t-sql/functions/datalength-transact-sql?view=sql-server-2017Thanks. That worked with one modification as shown below:
SELECT SUM(cast(DATALENGTH(BinaryImageData) as bigint)) FROM IMAGEINDEXI had to do this because the value returned was >2^31. The value was 31,116,152,328
That's actually going to provide a bit of an inaccurate count. Yes, you're counting the bytes in the image but you're not counting the number of bytes it takes to store the data. You need to count the pages in the LOB and divide that by 128.0 to get the accurate count of the bytes consumed not just by the images but by the pages they are stored on.
The hard part is if you have more than one LOB column in the same table. I've not had to split them out before and so don't know how to do such a thing off the top of my head.
Thanks Jeff. I actually moved the image data to it's own file group. The file has a size of 47,626,321,920 bytes with no wasted space. The sum of the data lengths is 31,116,152,328.
How do I get the page count from the LOB.
August 15, 2018 at 7:10 am
Tom John-342103 - Wednesday, August 15, 2018 5:47 AMJeff Moden - Tuesday, August 14, 2018 8:18 PMTom John-342103 - Tuesday, August 14, 2018 11:27 AMJonathan AC Roberts - Tuesday, August 14, 2018 11:03 AMTom John-342103 - Tuesday, August 14, 2018 10:59 AMBy default, Image data (and some other types) are written to PRIMARY. Is there any way to find out how much space is consumed by these images?You can use DATALENGTH
SELECT SUM(DATALENGTH(myImageCol))
FROM myTable
https://docs.microsoft.com/en-us/sql/t-sql/functions/datalength-transact-sql?view=sql-server-2017Thanks. That worked with one modification as shown below:
SELECT SUM(cast(DATALENGTH(BinaryImageData) as bigint)) FROM IMAGEINDEXI had to do this because the value returned was >2^31. The value was 31,116,152,328
That's actually going to provide a bit of an inaccurate count. Yes, you're counting the bytes in the image but you're not counting the number of bytes it takes to store the data. You need to count the pages in the LOB and divide that by 128.0 to get the accurate count of the bytes consumed not just by the images but by the pages they are stored on.
The hard part is if you have more than one LOB column in the same table. I've not had to split them out before and so don't know how to do such a thing off the top of my head.
Thanks Jeff. I actually moved the image data to it's own file group. The file has a size of 47,626,321,920 bytes with no wasted space. The sum of the data lengths is 31,116,152,328.
How do I get the page count from the LOB.
One way is to execute sys.dm_db_index_physical stats on the related table's Clustered Index and then look for the LOB_DATA marker for the row. Note that it will include not only the column in question but anything else that qualifies as LOB_DATA. There may also be an out of row marker. I don't remember off the top of my head but it's marked something like ROW_OVERFLOW or something like that. That's for any normal VARCHARS/NVARCHARS that overflowed the ~8060 / 8094 byte mark for the row.
--Jeff Moden
Change is inevitable... Change for the better is not.
August 15, 2018 at 7:11 am
Jeff Moden - Wednesday, August 15, 2018 7:10 AMTom John-342103 - Wednesday, August 15, 2018 5:47 AMJeff Moden - Tuesday, August 14, 2018 8:18 PMTom John-342103 - Tuesday, August 14, 2018 11:27 AMJonathan AC Roberts - Tuesday, August 14, 2018 11:03 AMTom John-342103 - Tuesday, August 14, 2018 10:59 AMBy default, Image data (and some other types) are written to PRIMARY. Is there any way to find out how much space is consumed by these images?You can use DATALENGTH
SELECT SUM(DATALENGTH(myImageCol))
FROM myTable
https://docs.microsoft.com/en-us/sql/t-sql/functions/datalength-transact-sql?view=sql-server-2017Thanks. That worked with one modification as shown below:
SELECT SUM(cast(DATALENGTH(BinaryImageData) as bigint)) FROM IMAGEINDEXI had to do this because the value returned was >2^31. The value was 31,116,152,328
That's actually going to provide a bit of an inaccurate count. Yes, you're counting the bytes in the image but you're not counting the number of bytes it takes to store the data. You need to count the pages in the LOB and divide that by 128.0 to get the accurate count of the bytes consumed not just by the images but by the pages they are stored on.
The hard part is if you have more than one LOB column in the same table. I've not had to split them out before and so don't know how to do such a thing off the top of my head.
Thanks Jeff. I actually moved the image data to it's own file group. The file has a size of 47,626,321,920 bytes with no wasted space. The sum of the data lengths is 31,116,152,328.
How do I get the page count from the LOB.
One way is to execute sys.dm_db_index_physical stats on the related table's Clustered Index and then look for the LOB_DATA marker for the row. Note that it will include not only the column in question but anything else that qualifies as LOB_DATA. There may also be an out of row marker. I don't remember off the top of my head but it's marked something like ROW_OVERFLOW or something like that. That's for any normal VARCHARS/NVARCHARS that overflowed the ~8060 / 8094 byte mark for the row.
p.s. And there's always wasted space in pages because the data just won't fit right. Even LOB_DATA lives on 8K pages.
--Jeff Moden
Change is inevitable... Change for the better is not.
August 15, 2018 at 7:19 am
Jeff Moden - Wednesday, August 15, 2018 7:11 AMJeff Moden - Wednesday, August 15, 2018 7:10 AMTom John-342103 - Wednesday, August 15, 2018 5:47 AMJeff Moden - Tuesday, August 14, 2018 8:18 PMTom John-342103 - Tuesday, August 14, 2018 11:27 AMJonathan AC Roberts - Tuesday, August 14, 2018 11:03 AMTom John-342103 - Tuesday, August 14, 2018 10:59 AMBy default, Image data (and some other types) are written to PRIMARY. Is there any way to find out how much space is consumed by these images?You can use DATALENGTH
SELECT SUM(DATALENGTH(myImageCol))
FROM myTable
https://docs.microsoft.com/en-us/sql/t-sql/functions/datalength-transact-sql?view=sql-server-2017Thanks. That worked with one modification as shown below:
SELECT SUM(cast(DATALENGTH(BinaryImageData) as bigint)) FROM IMAGEINDEXI had to do this because the value returned was >2^31. The value was 31,116,152,328
That's actually going to provide a bit of an inaccurate count. Yes, you're counting the bytes in the image but you're not counting the number of bytes it takes to store the data. You need to count the pages in the LOB and divide that by 128.0 to get the accurate count of the bytes consumed not just by the images but by the pages they are stored on.
The hard part is if you have more than one LOB column in the same table. I've not had to split them out before and so don't know how to do such a thing off the top of my head.
Thanks Jeff. I actually moved the image data to it's own file group. The file has a size of 47,626,321,920 bytes with no wasted space. The sum of the data lengths is 31,116,152,328.
How do I get the page count from the LOB.
One way is to execute sys.dm_db_index_physical stats on the related table's Clustered Index and then look for the LOB_DATA marker for the row. Note that it will include not only the column in question but anything else that qualifies as LOB_DATA. There may also be an out of row marker. I don't remember off the top of my head but it's marked something like ROW_OVERFLOW or something like that. That's for any normal VARCHARS/NVARCHARS that overflowed the ~8060 / 8094 byte mark for the row.
p.s. And there's always wasted space in pages because the data just won't fit right. Even LOB_DATA lives on 8K pages.
Tom John now has a rule of thumb for this type of data:
August 16, 2018 at 11:19 am
Jeff Moden - Wednesday, August 15, 2018 7:10 AMTom John-342103 - Wednesday, August 15, 2018 5:47 AMJeff Moden - Tuesday, August 14, 2018 8:18 PMTom John-342103 - Tuesday, August 14, 2018 11:27 AMJonathan AC Roberts - Tuesday, August 14, 2018 11:03 AMTom John-342103 - Tuesday, August 14, 2018 10:59 AMBy default, Image data (and some other types) are written to PRIMARY. Is there any way to find out how much space is consumed by these images?You can use DATALENGTH
SELECT SUM(DATALENGTH(myImageCol))
FROM myTable
https://docs.microsoft.com/en-us/sql/t-sql/functions/datalength-transact-sql?view=sql-server-2017Thanks. That worked with one modification as shown below:
SELECT SUM(cast(DATALENGTH(BinaryImageData) as bigint)) FROM IMAGEINDEXI had to do this because the value returned was >2^31. The value was 31,116,152,328
That's actually going to provide a bit of an inaccurate count. Yes, you're counting the bytes in the image but you're not counting the number of bytes it takes to store the data. You need to count the pages in the LOB and divide that by 128.0 to get the accurate count of the bytes consumed not just by the images but by the pages they are stored on.
The hard part is if you have more than one LOB column in the same table. I've not had to split them out before and so don't know how to do such a thing off the top of my head.
Thanks Jeff. I actually moved the image data to it's own file group. The file has a size of 47,626,321,920 bytes with no wasted space. The sum of the data lengths is 31,116,152,328.
How do I get the page count from the LOB.
One way is to execute sys.dm_db_index_physical stats on the related table's Clustered Index and then look for the LOB_DATA marker for the row. Note that it will include not only the column in question but anything else that qualifies as LOB_DATA. There may also be an out of row marker. I don't remember off the top of my head but it's marked something like ROW_OVERFLOW or something like that. That's for any normal VARCHARS/NVARCHARS that overflowed the ~8060 / 8094 byte mark for the row.
Tom: it is incredibly difficult to get an accurate accounting of the actual footprint of a particular row, column, or value. There are various types of overhead to account for, various structures for storing large data, external factors, etc. For example:
Still, if you have just a single LOB column, then you can get somewhat close by using the following query, executed in the DB containing the table (be sure to put the correct table name into the OBJECT_ID function):
SELECT [allocation_unit_type_desc], [allocated_page_file_id], [allocated_page_page_id], [page_type_desc],
'---' AS [---], *
FROM sys.dm_db_database_page_allocations(DB_ID(), OBJECT_ID(N'SchemaName.TableName'), NULL, NULL, N'DETAILED')
WHERE [allocation_unit_type_desc] = N'LOB_DATA';
You can then add in the number of rows * 18 for the number of bytes required in-row (NULL rows take up 0 bytes, at least for this column).
----
P.S. Jeff, re: types that can overflow, don't forget about VARBINARY , and I believe also XML :).
P.P.S. Tom, re: your statement of "By default, Image data (and some other types) are written to PRIMARY", I do not believe that is correct. I am fairly certain that LOB data is stored in the same file group that the table / clustered index is in. I think you are referring to the TEXT_IMAGE_ON option that allows you to specify a different file group, but I don't think that implies that without it being specified that the data is in PRIMARY. If anything, I would expect the "default" to be the default file group, and that is not necessarily PRIMARY (I always create a different file group for user data and mark it as "default" so that hopefully nothing but system meta-data goes into PRIMARY).
SQL# — https://SQLsharp.com/ ( SQLCLR library ofover 340 Functions and Procedures)
Sql Quantum Lift — https://SqlQuantumLift.com/ ( company )
Sql Quantum Leap — https://SqlQuantumLeap.com/ ( blog )
Info sites — Collations • Module Signing • SQLCLR
August 16, 2018 at 12:38 pm
Image data (or successor types) is a bit of a pain. I am satisfied, at this point, with the option of putting image data into it's own file group. What I found interesting is that the process of getting the data into the new structure was time consuming. Image data doesn't seem to copy very fast. To do this, I renamed the original table and all of its constraints and indexes. Recreated the table with the TextImage in it's own file group and then copied from the renamed table to the recreated table. To copy 10,000,000 or so rows with around 50GB of image data took more than 12 hours.
August 16, 2018 at 1:19 pm
Tom John-342103 - Thursday, August 16, 2018 12:38 PMImage data (or successor types) is a bit of a pain. I am satisfied, at this point, with the option of putting image data into it's own file group. What I found interesting is that the process of getting the data into the new structure was time consuming. Image data doesn't seem to copy very fast. To do this, I renamed the original table and all of its constraints and indexes. Recreated the table with the TextImage in it's own file group and then copied from the renamed table to the recreated table. To copy 10,000,000 or so rows with around 50GB of image data took more than 12 hours.
Did you pre-grow the data files (data and log)? Did you copy it over in batches of less than 3000 rows? Of course, you have 29 GB of data, taking up 3,817,933 LOB pages at bare minimum (assuming they all fit into 3150 byte pages), and likely thousands more. So I can imagine that it takes time to break up 200k values into the proper structure, allocate that structure, store the data, and write it again (everything is written to both log and data files). So I wouldn't expect it to happen super quick-like 😉 .
Take care, Solomon..
SQL# — https://SQLsharp.com/ ( SQLCLR library ofover 340 Functions and Procedures)
Sql Quantum Lift — https://SqlQuantumLift.com/ ( company )
Sql Quantum Leap — https://SqlQuantumLeap.com/ ( blog )
Info sites — Collations • Module Signing • SQLCLR
Viewing 11 posts - 1 through 10 (of 10 total)
You must be logged in to reply to this topic. Login to reply