calculating averages and standard deviations

  • I have six fields named involved, engaged, explored, shared, laughed, and communicated. I need to take the value stored in these fields and add them together, divide by the number of fields to get the average. I also need to get the standard deviation. I know how to get the average and the standard deviation, but the trick is I need to be able to calculate the averages and standard deviation on any combination of 5 or 6 of the fields.

    For clarification, if I have a null value for the involved field then I would add engaged, explored, shared, laughed, and communicated and divide by 5 to get the average. I would do the same thing if any one of the six fields is null (engaged is null then add all the rest and divide by 5). If however, more than one field is null for a particular set then I do not need to calculate the average or standard deviation as it has been determined that there is not enough data to make a determination.

    I have this same problem with a set of eight different fields where any two fields can be null and the average and standard deviations are deemed valid, but more than two null fields make the result invalid.

    Ask for clarification if this does not make sense, but I can't seem to get my hands around the best way to do this.

  • It might look something like this.

    Assuming you have some sort of key field. we will call it pkey

    I would select this into a temp table

    select

    involved,

    case

    where involved is null then 0

    else

    1

    end as involvednullflg,

    etc...

    after you do this you can add up all the null flags and use that as the number of columns without a null.

    This may not be the absolute best way to do this but it would work.

    Dan

    If only I could snap my figures and have all the correct indexes apear and the buffer clean and.... Start day dream here.

  • Why don't you use the built-in functions STDEV() and AVG()?

    Both ignore NULL values by default...

    To check if at least one of your columns is null you could simply add those cols and check if the result is null.

    It's a little more complicated to check if any two of the columns are null. I'd probably use ISNULL(CAST(CAST(col1 AS BIT) AS INT),10) + ISNULL(CAST(CAST(col2 AS BIT) AS INT),10)+... > 19, maybe even in a computed (persisted) column...

    Edit: but I'm not sure if this is any better than Dans solution...

    Edit2: Syntax corrected



    Lutz
    A pessimist is an optimist with experience.

    How to get fast answers to your question[/url]
    How to post performance related questions[/url]
    Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

  • Thanks all for the responses. The answers have at least started me thinking in another direction. To clarify things, i don't need to calculate an average for a day but for a collection of days. Same with the Standard deviation. So I capture the data for the six fields each day for a month. Need to average the daily sums of the six fields at the end of a time period, typically the end of a calendar month. Same for the standard deviation.

    So I have been doing the math in a sql view to add the values together for each day and divide these by the number of values. Then I take the the sum of these averages, and divide that by the number of valid days that data was captured. I then simply get the standard deviation in SSRS using stdev in the report.

Viewing 4 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply