March 4, 2014 at 1:14 pm
Hello Experts,
a coworker asked me how to convert raw binary data for for readability. He has a table that contain raw binary data it looks like this:
0x020000000500000003000000FFF67F3F21A734430264803FD72F803F010000000DFEFF3EAFA640431FF6003F217E003F0100...
I used bcp to export it out to file using various file format, however the data is not readable. My question is...is it possible to determine which doc type (txt,jpg,pdf, etc.) based on this binary data, or whatever mean to make the data human readable? Eventually he may have to go back and ask the client how to decode the data, but in the meantime I wonder if there is anything we can do about it?
I suspect the data is encrypted.
Thanks in advance!
March 4, 2014 at 1:33 pm
Is it a file stored by a client or is he perchance looking at ssis or ssrs binary stored in the database?
Jason...AKA CirqueDeSQLeil
_______________________________________________
I have given a name to my pain...MCM SQL Server, MVP
SQL RNNR
Posting Performance Based Questions - Gail Shaw[/url]
Learn Extended Events
March 4, 2014 at 2:00 pm
That looks a lot like a byte array. This is the typical way of storing binary type data in the database. I don't think there is a way to determine what file type it is based on the information. We have a table where we can store literally any possible file type. The way we deal with it is to capture the file name along with the contents.
We have another system that stores nothing but pdfs. The data in both these looks just like what you posted. Depending on what this is used for you might be to make some guesses at the file type.
_______________________________________________________________
Need help? Help us help you.
Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.
Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.
Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/
March 4, 2014 at 2:11 pm
in a similar post a couple of years ago, i did a TSQL example that looked at the first x bytes and compared them to known values to find gif/jpg/bmp/png files;
see this post here, and the rest of the thread too:
http://www.sqlservercentral.com/Forums/FindPost1179452.aspx
you'd have to expand the mapping, and find what doc,docx,pdf,rtf,xls,xlsx, etc all start with.
the right thing would have been to store the original filename witht eh varbinary, of course, but you could reverse engineer about 99% of them i bet.
Lowell
March 4, 2014 at 2:19 pm
Lowell (3/4/2014)
in a similar post a couple of years ago, i did a TSQL example that looked at the first x bytes and compared them to known values to find gif/jpg/bmp/png files;see this post here, and the rest of the thread too:
http://www.sqlservercentral.com/Forums/FindPost1179452.aspx
you'd have to expand the mapping, and find what doc,docx,pdf,rtf,xls,xlsx, etc all start with.
the right thing would have been to store the original filename with the varbinary, of course, but you could reverse engineer about 99% of them i bet.
Cool stuff Lowell.
Jason...AKA CirqueDeSQLeil
_______________________________________________
I have given a name to my pain...MCM SQL Server, MVP
SQL RNNR
Posting Performance Based Questions - Gail Shaw[/url]
Learn Extended Events
March 4, 2014 at 2:49 pm
Is it a file stored by a client or is he perchance looking at ssis or ssrs binary stored in the database?
I heard it is stored by the clients application. I think Sean was right it may be byte array
Thank you guy for your advise, I just want to make sure I didn't miss out chance to know a method to deal with this stuff, otherwise it's not worth to go extra miles to find it out.
Thanks again!
March 4, 2014 at 2:55 pm
Lowell (3/4/2014)
in a similar post a couple of years ago, i did a TSQL example that looked at the first x bytes and compared them to known values to find gif/jpg/bmp/png files;see this post here, and the rest of the thread too:
http://www.sqlservercentral.com/Forums/FindPost1179452.aspx
you'd have to expand the mapping, and find what doc,docx,pdf,rtf,xls,xlsx, etc all start with.
the right thing would have been to store the original filename witht eh varbinary, of course, but you could reverse engineer about 99% of them i bet.
Sweet. I was thinking we could possibly reverse engineer the first part of the file comparing to known values since the various file types do all appear to have very similar info at the beginning of the byte array.
_______________________________________________________________
Need help? Help us help you.
Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.
Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.
Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/
March 6, 2014 at 2:38 pm
Sean Lange (3/4/2014)
Lowell (3/4/2014)
in a similar post a couple of years ago, i did a TSQL example that looked at the first x bytes and compared them to known values to find gif/jpg/bmp/png files;see this post here, and the rest of the thread too:
http://www.sqlservercentral.com/Forums/FindPost1179452.aspx
you'd have to expand the mapping, and find what doc,docx,pdf,rtf,xls,xlsx, etc all start with.
the right thing would have been to store the original filename witht eh varbinary, of course, but you could reverse engineer about 99% of them i bet.
Sweet. I was thinking we could possibly reverse engineer the first part of the file comparing to known values since the various file types do all appear to have very similar info at the beginning of the byte array.
Googling "file signatures", "magic numbers" etc brings up some pages.
Viewing 8 posts - 1 through 7 (of 7 total)
You must be logged in to reply to this topic. Login to reply