How to calculatate the correlation coefficient

  • Hi

    Who knows an elegant way to calculate the correlation coefficient of two series (columns)  in T-SQL. It seems there is no aggregate function as there is one for the standard deviation.

    Many thanks in advance

    Urs

     

  • b/c ther'es no standard function for this, u'd have to write your own function according to a well-known algorhithm. I'm sure it'll be elegant enough

  • If the columns a VARCHAR or the like, you can use SOUNDEX (won't work on numbers).

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • There two Logs in the SQL Server math functions so that is a place to start float data type is required but you can see in the code sample you can convert to varchar.  Hope this helps.

    http://msdn2.microsoft.com/en-us/library/ms177516.aspx

    Kind regards,
    Gift Peddie

  • Implemented from the formula for the sample correlation coefficient on

    http://en.wikipedia.org/wiki/Correlation (I'm assuming you want the linear

    correlation coefficient).

    Table called "data", columns "x" and "y":

    select (count(*) * sum(x * y) - sum(x) * sum(y)) /

    (sqrt(count(*) * sum(x * x) - sum(x) * sum(x)) *

    sqrt(count(*) * sum(y * y) - sum(y) * sum(y)))

    from data

    I've tried it with some sample data, and the results look "reasonable" so

    I think I've got the implementation right. It is a little numerically unstable

    though - I got a value just a shade over 1 for a perfect linear fit.

    Tony

  • Thank you very much to all of you. I implemented Tony's formula and I got exact the result what I was looking for. I will perform some more tests to check the stability for my secific data sets.

    Urs

     

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply