Is the formula for STDEV defined anywhere?

Question

Post reply

Is the formula for STDEV defined anywhere?

Out For Justice

SSCrazy

Points: 2733
More actions
November 3, 2004 at 5:45 am

#66227

Hi,
We are using STDEV in some stored procedures in SQL 2000. For our test plan, we have a defined set of data and results that have been calculated using Excel 2000. We are seeing some differences that we can't account for and given that there is more than one accepted way of calculating standard deviations, I was wondering if the discrepancies are down to Excel 2000 and SQL 2000 doing it in slightly different ways.
The Excel help is useful enough to give a definition of the formula it uses for STDEV, but I haven't been able to find the same information for SQL 2000 in the BOL.
Anyone know where I might find the formula?
Thanks!

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply

Frank Kalis SSC Guru Points: 111183 More actions · Answer 1

Sure, there are myriads of source out there.

Personally I like this one:

http://mathworld.wolfram.com

--
Frank Kalis
Microsoft SQL Server MVP
Webmaster: http://www.insidesql.org/blogs
My blog: http://www.insidesql.org/blogs/frankkalis/[/url]

Out For Justice SSCrazy Points: 2733 More actions · Answer 2

Thanks, but what I actually meant was does anyone know which STDEV formula SQL 2000 uses?

Frank Kalis SSC Guru Points: 111183 More actions · Answer 3

Oh, I see.

But are you concerned about the difference on the 14th decimal place anyway?

When you play with the example given in BOL and recalculate this in Excel you'll get the following results:

SQL Server: 4,7731890108535753

Excel: 4,7731890108535800

You see that Excel "rounds up" to the 14th decimal place.

Now what happens in SQL Server? I'm quite sure you won't find anywhere the exact implementation of the formula. However I guess something very similar to this happens internally:

USE pubs

DECLARE @a DECIMAL(18,10)

DECLARE @b-2 DECIMAL(18,10)

SELECT @a= SUM(royalty)/(COUNT(royalty)*1.0) FROM titles

SELECT @b-2=SUM(SQUARE(royalty-@a)) FROM titles

SELECT SQRT(@b/(COUNT(royalty)-1)) FROM titles

-----------------------------------------------------

4.7731890108535753

(1 row(s) affected)

--
Frank Kalis
Microsoft SQL Server MVP
Webmaster: http://www.insidesql.org/blogs
My blog: http://www.insidesql.org/blogs/frankkalis/[/url]

Frank Kalis SSC Guru Points: 111183 More actions · Answer 4

Oh my, probably easier is to simply look at the execution plan

StmtText

----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

|--Compute Scalar(DEFINE[Expr1002]=sqrt(If (If ([Expr1003]-1>0) then (([Expr1005]-square([Expr1004])/Convert([Expr1003]))/Convert(([Expr1003]-1))) else NULL=NULL) then If ([Expr1003]-1>0) then (([Expr1005]-square([Expr1004])/Convert([Expr1003]))/Conver

|--Stream Aggregate(DEFINE[Expr1003]=COUNT_BIG([titles].[royalty]), [Expr1004]=SUM(Convert([titles].[royalty])), [Expr1005]=SUM(square(Convert([titles].[royalty])))))

|--Clustered Index Scan(OBJECT[pubs].[dbo].[titles].[UPKCL_titleidind]))

--
Frank Kalis
Microsoft SQL Server MVP
Webmaster: http://www.insidesql.org/blogs
My blog: http://www.insidesql.org/blogs/frankkalis/[/url]

Out For Justice SSCrazy Points: 2733 More actions · Answer 5

Thanks Frank, this information is very useful! I was trying to ascertain if the differences in the results we are calculating in our app and the expected results in our test plan could have been due to different standard deviation methodologies.

From what you have told me, it looks as if there is probably a bug in our code that we need to look into.