using varbinary as an array of floats?

Question

Post reply

using varbinary as an array of floats?

mstephens

Mr or Mrs. 500

Points: 525
More actions
March 8, 2006 at 5:41 am

#189563

Does anyone know how i might store an array of floats in a field in a database?
The array would need to be variable in length but i would like to be able to extract the appropriate value from a particular index within the array within a regular query and be able to use AVG and STDEV statistical functions for the given index over a selected rowset.
Any ideas would be much appreciated
Matt

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply

paramind SSCrazy Points: 2160 More actions · Answer 1

I can't think of any reason, why one would want to do this. Why don't you store the values in another table?

_/_/_/ paramind _/_/_/

mstephens Mr or Mrs. 500 Points: 525 More actions · Answer 2

We have an application that is logging data from a scientific instrument and all of the values in the array are taken at the same time. Different runs of the application might choose to store more or less readings in the array depending on what's being monitored. Whilst we could have a table that effectively just holds each of the floats from the array there would be a significant amount of data to go along with each value to indicate where it was taken, what time, etc. that would be duplicated on every row whereas it would be in the database only once with the array approach.

We know we can store our data in a binary form like this but i am looking for possible ways to manage it better in queries/stored procedures rather than querying for the raw data and then manipulating it in procedural code.

I know that PostgreSQL supports an array datatype, i'm just looking to achieve similar things in SQL Server.

Matt

paramind SSCrazy Points: 2160 More actions · Answer 3

>Whilst we could have a table that effectively just holds each of the floats from the array there would be a significant amount of data to go along with each value to indicate where it was taken, what time, etc. that would be duplicated on every row whereas it would be in the database only once with the array approach.<

You we're talking of floats and their respective indexes only in the original post, not of [time taken][location taken]; that still won't change things.

In case you take many [index]-[float-value] pairs at one time and in one location, two additional tables are required. There's no reason to repeat location, time per indexed float value.

And, in effect, this will probably be less costly even in terms of space (see BOL on the actual storage of varbinary), let alone in terms of speed when performing math operations (AVG etc) on that data.

As for space: if your varbinaries may grow large and may differ significantly in length from record to record inline storage (in the regular table space) is no advisable option. For every varbinary first a 16-Byte-Pointer to the actual data is stored in the table space whereas the data itself resides in an extent. This occurs automatically/hidden and will not be obvious, if you don't simply know this or check the database's overall size.

What you need to repeat is nothing but the primary key of your main record. (I assume that's where your [time taken][location taken]) is placed. If you have no PK, add a surrogate key, say an integer.

Regarding the 16-byte-pointer, the first four recordid's in the n-table are for free Moreover the indexes (!I do not talk of physical indexes in database terms) are for free too, since there's no need to store them, if you perform your inserts in the presumed order. Moreover, index markers are not necessary either - at least 2 more free bytes per float stored [index markers: you need markers, e.g. "(...)" to tell indexes and values apart, when you store your non-fixed-length values in a varbinary).

Summary: As long, as you don't trade 100 gb of memory + 10 cpu's for one gb of harddisk you're better off with an extra table - and in case you do, I'll offer you a deal

_/_/_/ paramind _/_/_/

mstephens Mr or Mrs. 500 Points: 525 More actions · Answer 4

OK, thanks. I hear what you're saying. I guess i've got to just take the plunge and try to fill a test database with a load of data to judge performance/space requirements and then take it from there.

Cheers

JeffB Hall of Fame Points: 3541 More actions · Answer 5

Paramind is suggesting using a table structure like this:

create table Reading

(

ReadingID bigint Identity(1,1),

timetaken datetime,

locationid int

--add other qualifying data

)

create table ReadingData

(

ReadingID bigint,

Value float

)

--then the following is very simple

select r.ReadingID, avg(d.value)

from reading r

inner join readingdata d

on r.ReadingID = d.ReadingID

group by r.ReadingID

jim parker SSC Rookie Points: 49 More actions · Answer 6

Or there are database systems that allow multi-valued fields.

We still use Universe (now owned by IBM) which allows sub,sub,sub values in a field (first delimiter is ctl+} ) and can present interesting problems in queries and programming (a variation of Pick Basic).