January 29, 2007 at 2:41 am
Hi all,
Does anyone have any suggestions for loading and storing a matrix of data that changes in either dimension on a daily basis?
I have to keep each day's data together as a single entity. I know this is not what an RDBMS is designed for, but it's what I have.
Feel free to chip in with anything sensible.
Thanks
Will
January 30, 2007 at 4:07 am
What about converting it into an XML document and then storing it using the xml datatype?
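Something like this, for example - just a rough sketch, and the table and element names here are invented:

```sql
-- One row per day, with the whole matrix held in an xml column
CREATE TABLE DailyMatrix
(
    LoadDate   datetime NOT NULL PRIMARY KEY,
    MatrixData xml      NOT NULL
);

-- Shredding the xml back into rows and columns with nodes()/value()
SELECT  c.value('@row', 'int')         AS RowNum,
        c.value('@col', 'int')         AS ColNum,
        c.value('.',    'varchar(50)') AS CellValue
FROM    DailyMatrix
CROSS APPLY MatrixData.nodes('/matrix/cell') AS t(c)
WHERE   LoadDate = '20070129';
```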
Markus
[font="Verdana"]Markus Bohse[/font]
January 30, 2007 at 4:18 am
Thanks Markus,
Would there be a big overhead in then converting the xml datatype back to columns and rows, once I had it in the db as XML? Is it possible to do?
I'm in a new environment here with SQL 2005 and haven't used the xml datatype before.
All tips gratefully received.
Will
January 30, 2007 at 6:37 am
I think I'd just go with a simple table - row# or name, col# or name, value. So for a 3x3 matrix, you might have these rows:
1, 1, "A"
1, 2, "B"
1, 3, "C"
2, 1, "D"
2, 2, "E"
Etc
That will store any matrix you want. The easiest way to handle changes would be to delete all the rows for the matrix from the table and then add the new matrix back in a single transaction. A fancier implementation might handle just the adds/changes/deletes made to the matrix, but that would be extra work and you'd have to work to minimize round trips.
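A quick sketch of what I mean (table and column names are just made up):

```sql
-- One row per cell; LoadDate keeps each day's matrix together as one entity
CREATE TABLE MatrixCell
(
    LoadDate  datetime    NOT NULL,
    RowNum    int         NOT NULL,
    ColNum    int         NOT NULL,
    CellValue varchar(50) NOT NULL,
    CONSTRAINT PK_MatrixCell PRIMARY KEY (LoadDate, RowNum, ColNum)
);

-- Replace a day's matrix atomically: delete then re-insert in one transaction
BEGIN TRANSACTION;
    DELETE FROM MatrixCell WHERE LoadDate = '20070129';
    INSERT INTO MatrixCell (LoadDate, RowNum, ColNum, CellValue)
    VALUES ('20070129', 1, 1, 'A');
    -- ...one INSERT per remaining cell...
COMMIT TRANSACTION;
```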
January 31, 2007 at 12:59 am
Hi,
did you think about storing your data in a star schema, with columns and rows as dimensions, adding a date dimension to let you identify the data belonging to a certain day?
You can then add data on a daily basis and decide to keep a certain range of days or delete them if not needed any more.
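Roughly along these lines - a sketch only, all names invented:

```sql
-- Dimension tables for date, row and column
CREATE TABLE DimDate   (DateKey int PRIMARY KEY, LoadDate datetime NOT NULL);
CREATE TABLE DimRow    (RowKey  int PRIMARY KEY, RowName varchar(50));
CREATE TABLE DimColumn (ColKey  int PRIMARY KEY, ColName varchar(50));

-- Fact table: one row per cell, keyed on all three dimensions
CREATE TABLE FactMatrix
(
    DateKey   int NOT NULL REFERENCES DimDate(DateKey),
    RowKey    int NOT NULL REFERENCES DimRow(RowKey),
    ColKey    int NOT NULL REFERENCES DimColumn(ColKey),
    CellValue varchar(50),
    CONSTRAINT PK_FactMatrix PRIMARY KEY (DateKey, RowKey, ColKey)
);

-- Ageing out old days is then a single delete on the date key
DELETE FROM FactMatrix WHERE DateKey < 20070101;
```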
Michael
January 31, 2007 at 2:12 am
Thanks Michael,
I have no prior knowledge of datawarehousing so I'll read up on your suggestion. I'm working my way through the SSIS loading mechanism now.
Thanks to everybody for your help and keep all the suggestions coming please.
Will
January 31, 2007 at 12:58 pm
You need to have a better idea of that 1000x700 matrix. Besides the dimensions changing you need to ask yourself:
How sparse is it?
How many entries change between one day and the next?
How much and what kinds of data are there in each matrix location?
Without answers to at least these questions there will be no meaningful answer.
February 1, 2007 at 3:38 am
Hi Michael ,
All valid questions,
How sparse is it?
All fields in the matrix would be populated. The number of columns could vary widely on a daily basis: day 1, 1200 cols; day 2, 950 cols; day 3, 800 cols; etc. The number of rows is pretty static by comparison, between 650 and 700.
How many entries change between one day and the next?
Almost all of the data would change on a daily basis, and each day's data must be saved in its entirety. I would not ever want to update any data.
How much and what kinds of data is there in each matrix location?
A row of financial IDs, one column of business dates, and the rest of the data would be numeric(18,2) held as varchar, as no computation would need to be performed on the *numeric* data.
What I have done is count the number of rows in the input dataset, create a table on the fly using that row count as the number of columns, and then transpose the data so data columns become data rows. That gets me around hitting the maximum number of columns in a SQL table (1024).
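The "create a table on the fly" step looks roughly like this - a sketch with invented names, where @RowCount is the row count taken from the input file (each input row becomes a column after the transpose):

```sql
-- Build a CREATE TABLE statement with one varchar column per input row
DECLARE @RowCount int, @i int, @sql nvarchar(max);
SET @RowCount = 700;   -- counted from the input dataset
SET @sql = N'CREATE TABLE DailyMatrix_20070129 (ColId int NOT NULL';
SET @i = 1;
WHILE @i <= @RowCount
BEGIN
    SET @sql = @sql + N', c' + CAST(@i AS nvarchar(10)) + N' varchar(50)';
    SET @i = @i + 1;
END;
SET @sql = @sql + N')';
EXEC sp_executesql @sql;
```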
I'm just wondering how other guys would skin this cat.
Will
February 1, 2007 at 4:05 am
So you have:
null   fid1   fid2   fid3   ...
date1  val11  val12  val13  ...
date2  val21  val22  val23  ...
date3  val31  val32  val33  ...
...
I presume you receive this as a text file of some sort.
Certainly you can save it into a blob column without great difficulty. Somewhere in this forum I published some VBScript that lets you do this quite easily.
The other issue is that if there is a frequent need to do something with this data, then you might want to pre-process the original into some format that lets you quickly load it into a real SQL Server table - like what Andy mentioned.
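On SQL 2005 itself, one way to get the raw file into a blob is OPENROWSET with SINGLE_BLOB - a sketch, with the table name and file path invented:

```sql
-- One row per day, the overnight file stored as-is in a blob column
CREATE TABLE DailyMatrixRaw
(
    LoadDate datetime       NOT NULL PRIMARY KEY,
    RawFile  varbinary(max) NOT NULL
);

-- Load the whole tab-delimited file in one shot
INSERT INTO DailyMatrixRaw (LoadDate, RawFile)
SELECT '20070129', BulkColumn
FROM OPENROWSET(BULK 'C:\feeds\matrix_20070129.txt', SINGLE_BLOB) AS f;
```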
February 1, 2007 at 4:16 am
Michael ,
It's like you've seen the file already - yep, I receive it as an overnight tab-delimited text file.
My transpose stored proc does exactly what Andy suggested (thanks Andy) anyway.
My users would search for a matrix by LoadDate, and for all or a subset of positions within the matrix, so would the blob still cut it for that?
Can I rip a blob up into cols and rows once I had the data in a SQL blob field?
Will
February 1, 2007 at 5:20 am
I have never 'ripped up' a blob like you want to do. I would stay away from doing this in a stored procedure, as it would be very slow. Neither do I know if you can manipulate a blob in a stored procedure using column variables or @-variables. As I said earlier, the only way I have ever dealt with getting blobs in and out is via VBScript.
Thus the problem is one of transforming the original text file into something and storing it in a blob. The format should be such that it can quickly be unloaded from the blob and transformed into a real SQL Server table.
The things that come to my mind are:
1. Transform it into MS Access. Once unloaded to a file, you can then get at the data using a SQL Server link.
2. Transform it into a SQL Server table. Surely there is a way to export/import a single table using DTS (don't know).
3. XML - although I haven't dealt much with this on SQL Server.
The problem is that if you need to save the original then you would have to deal with two blobs.
The technical solution would depend a lot on the application context in which all this manipulation occurs.