November 3, 2004 at 5:45 am
Hi,
We are using STDEV in some stored procedures in SQL 2000. For our test plan, we have a defined set of data and results that have been calculated using Excel 2000. We are seeing some differences that we can't account for and given that there is more than one accepted way of calculating standard deviations, I was wondering if the discrepancies are down to Excel 2000 and SQL 2000 doing it in slightly different ways.
The Excel help is useful enough to give a definition of the formula it uses for STDEV, but I haven't been able to find the same information for SQL 2000 in the BOL.
Anyone know where I might find the formula?
Thanks!
November 3, 2004 at 5:50 am
Sure, there are myriads of source out there.
Personally I like this one:
--
Frank Kalis
Microsoft SQL Server MVP
Webmaster: http://www.insidesql.org/blogs
My blog: http://www.insidesql.org/blogs/frankkalis/[/url]
November 3, 2004 at 5:58 am
Thanks, but what I actually meant was does anyone know which STDEV formula SQL 2000 uses?
November 3, 2004 at 6:43 am
Oh, I see.
But are you concerned about the difference on the 14th decimal place anyway?
When you play with the example given in BOL and recalculate this in Excel you'll get the following results:
SQL Server: 4,7731890108535753
Excel: 4,7731890108535800
You see that Excel "rounds up" to the 14th decimal place.
Now what happens in SQL Server? I'm quite sure you won't find anywhere the exact implementation of the formula. However I guess something very similar to this happens internally:
USE pubs
DECLARE @a DECIMAL(18,10)
DECLARE @b-2 DECIMAL(18,10)
SELECT @a= SUM(royalty)/(COUNT(royalty)*1.0) FROM titles
SELECT @b-2=SUM(SQUARE(royalty-@a)) FROM titles
SELECT SQRT(@b/(COUNT(royalty)-1)) FROM titles
-----------------------------------------------------
4.7731890108535753
(1 row(s) affected)
--
Frank Kalis
Microsoft SQL Server MVP
Webmaster: http://www.insidesql.org/blogs
My blog: http://www.insidesql.org/blogs/frankkalis/[/url]
November 3, 2004 at 6:49 am
Oh my, probably easier is to simply look at the execution plan
StmtText
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
|--Compute Scalar(DEFINE[Expr1002]=sqrt(If (If ([Expr1003]-1>0) then (([Expr1005]-square([Expr1004])/Convert([Expr1003]))/Convert(([Expr1003]-1))) else NULL=NULL) then If ([Expr1003]-1>0) then (([Expr1005]-square([Expr1004])/Convert([Expr1003]))/Conver
|--Stream Aggregate(DEFINE[Expr1003]=COUNT_BIG([titles].[royalty]), [Expr1004]=SUM(Convert([titles].[royalty])), [Expr1005]=SUM(square(Convert([titles].[royalty])))))
|--Clustered Index Scan(OBJECT[pubs].[dbo].[titles].[UPKCL_titleidind]))
--
Frank Kalis
Microsoft SQL Server MVP
Webmaster: http://www.insidesql.org/blogs
My blog: http://www.insidesql.org/blogs/frankkalis/[/url]
November 3, 2004 at 6:55 am
Thanks Frank, this information is very useful! I was trying to ascertain if the differences in the results we are calculating in our app and the expected results in our test plan could have been due to different standard deviation methodologies.
From what you have told me, it looks as if there is probably a bug in our code that we need to look into.
Viewing 6 posts - 1 through 5 (of 5 total)
You must be logged in to reply to this topic. Login to reply