March 2, 2007 at 9:51 pm
I have a unique situation where I am saving the formats of a grid cell. I will be having thousands of grids that have thousands of cells, so very quickly I will be dealing with about 100,000,000 cells. The table structure for the cell is as follows:
grid_rowGUID
grid_columnGUID
grid_cellValue
grid_cellFormat
My question is - which is better, VarBinary(Max) or XML?
Better in this case is defined as access speed, retrieval speed, and storage space. I only get one shot at this, so the feedback is super important!
March 2, 2007 at 10:29 pm
What is a "grid cell"?
What does your application do?
March 3, 2007 at 4:39 am
You might also be better looking at some other "serialization" mechanism so you get one table per grid rather than one big table as this is likely to perform better.
Can you give us some more background?
- James
--
James Moore
Red Gate Software Ltd
March 3, 2007 at 10:19 am
I just posted and I cannot find the post...is this even working?
March 3, 2007 at 10:26 am
Some times the session times out and you lose the message. If you had to hit post reply twice to get the post working then that is what happened.
March 3, 2007 at 10:55 am
Great...OK...here is my #2 try at replying
Thank you all for answering
In summary of what I am trying to do and what I am working with:
I have an application that has been in development for 3 years and is almost finished. It uses SQL Server 2005 and a windows interface.
This specific module is a cash flow forecast that operates much like an excel spreadsheet that has multiple sheets within it.
A GridCell refers to a specific cell within a UI table display. I am using Syncfusion grid control for all of the loading and saving, formula and value management.
A Template is defined as the following:
1) Provides an average of 25 editable sheets per template
2) Each sheet contains an average of 38 columns and 100 rows
3) Is completely customizable with add, edit, delete of sheets, columns, rows, cells
4) A template will be copied an average of 200 times per year for individual custoimization by the user.
5) There will be an average of 25 templates per SQL database installed
6) In one year, there will be an average of 19 million cells (25*38*100*200) or approximately 100 million after 5 years.
7) A single template is dynamically created and can also be deleted.
A Template Set within the SQL database is as follows:
Template Table --> TemplateID (PK)
Sheet Table --> SheetID (PK), TemplateID
Column Table --> Column ID (pk), SheetID, columnFormatting (binary or xml)
Row Table --> RowID(pk), SheetID, rowFormatting (binary or xml)
Cell Table --> {RowID, ColumnID}(pk), cellFormatting (binary or xml), CellValue, SheetID (this makes the dataset management easier within the app)
**There are other fields that store specific detail information, but not included for this discussion.
columnFormatting, RowFormatting, and CellFormatting store the "skin" display but does not have any direct impact or use for calculations or value usage.
--------------------------
Application Use
The application functions much like an excel spreadsheet, except that it allows users to pull data values from previous Balance Sheets, Income Statements, Deferred Taxes, and Owners Equity (all of these are template based and are working well).
The big design problem/deviation was that now a template is not used except to jump start an actual cash flow forecast. Where in all other template designs, the template is a create once, use all of the time with the same structure (i.e. no copying or customization per use).
-----------------------
Solution Approach #1: Create individual tables per grid display
Answer: If I create an individual template table for each forecast, then I would be dealing with more than a 25000 tables in 1 year (25 sheets/template *200 copies *5 tables).
I could create single set of tables for each template created. So there would be 125 tables (25*5 tables) to manage. This is possible, but I am not sure if there is much benefit to the database management (select and update queries). And the storage of the formatting would be still be bad.
----------------------------------------
Solution #2: Look at some other serialization
Answer: There are only 2 different serializations avaiable to me: binary and xml. If I put everything into binary or XML at the grid level then I will lose the ability to retrieve individual cell level values. I am also with the understanding that I would not be able to perform efficient text searches within the binary/xml data object within SQL 2005.
----------------------------------------
IF you guys think that a combined solution approach where I break out the tables for each Template Set (for use in managing the customization data), and then implement a full grid serialization/xml solution then I will pursue that approach. But it would be much better if there was a good solution that did not require me changing the current database structure since all of the app code uses it.
Thank you tons for all of your feedback and help! I don't have anybody that have the necessary skills to discuss this particular design challenge - and your feedback and opinion is greatly appreciated!
-Eric
March 5, 2007 at 12:15 am
You can search xml very efficiently in SQL 2005, if you use xquery and have the appropriate xml indexes on the column.
What do you want to do with the data after its in the db? Will you be retrieving entire rows? Searching withing the xml?
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
March 5, 2007 at 9:59 am
I will be searching within the xml for specific values. In this case, whenever there will be a referencing formula (e.g. [Cash_On_Hand{pk199365}] * .25), I will be looking up the referencing text variable based on its primary key within the formulas.
At other times, I will also need to pull back references based on column or row position.
Would it be possible for you to point me to a good reference that would explain what you mean by "the appropriate xml indexes on the column". Are you referring to some index within the stored xml document? or the index within the sql table?
I am reasonably skilled with SQL, but have no experience (until now) with xml inside of sql. I have googled tons and have a few xml books, but the true functionality of XML *inside* of SQL still escapes me.
In the mean time, I will implement the combined approach of breaking out the tables and then using the xml storage object. I'll just cross my fingers that the sql won't die with large 5MB-15MB xml files.
Let me know if you have any other ideas - Many HUGE Thanks for the feedback!
-Eric
March 6, 2007 at 11:17 pm
Would it be possible for you to point me to a good reference that would explain what you mean by "the appropriate xml indexes on the column". Are you referring to some index within the stored xml document? or the index within the sql table?
SQL 2005 can create an index on the xml data. What that does is index the contents of the xml, shreading it for fast searches. There are two main types - a prmary xml index and secondary xml indexes. The secondary can be created on for searching on element, attribute or path. You'll need to use xQuery to query the xml.
Look in books online for a good summary. Search for CREATE XML INDEX and you should get a good summary at least.
I als don't have that much expereince with xml in SQL 2005. I've never needed to use it yet.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
March 7, 2007 at 8:40 am
Thanks Gail!
I have been doing lots of reading based on your suggestion from your post a couple of days ago. It looks like it is exactly what is needed for situations like this.
Even with all of the reading, I had not come across the basic information of primary and secondary xml indexes. Most everybody on the Internet is giving lots either very specific information or very generic. General sample based tutorials are hard to find.
At this point, I am going to remove the cell table and post the entire grid as an xml object in the sheet table. Following your advice, I will create xml indexes for these xml objects and viola' I get to keep my job another year
Thank you tons for the feedback! I learned something!
Best Wishes
-Eric
Viewing 10 posts - 1 through 9 (of 9 total)
You must be logged in to reply to this topic. Login to reply