February 8, 2010 at 8:33 pm
Comments posted to this topic are about the item Forecasting with SQL
February 9, 2010 at 6:46 am
SSAS is completely capable of handling forecasting tasks utilizing a variety of different econometric algorithms.
I would STRONGLY urge readers to consult with a professional statistician or economist before undertaking this type of project.
There are a number of different data afflictions that can have serious adverse impacts on the accuracy of an econometric forecasting solution.
Autocorrelated error terms, unit roots, non-linear time series and others that can cause issues when building ARIMA or regression models and result in spurrious relationships. A trained practitioner will know what to look for and, more importantly, how to correct for these conditions. The spurrious relationships that can result from improper econometric data analysis can result in using independent variables that really have no impact on the dependent variable or throwing out independent variables that actually do have a statistically significant relationship but just need a little prep work that a trained econometrician would understand.
When these forecasting tools are used to base critical business decisions that impact people's lives and livelihoods, PLEASE be sure and leave this type of work to the experts.
February 9, 2010 at 7:11 am
Martin Vrieze (2/9/2010)
When these forecasting tools are used to base critical business decisions that impact people's lives and livelihoods, PLEASE be sure and leave this type of work to the experts.
Point of Order: Those who created the credit default swap catastrophe using these types of methods were experts. Having an expert consultant is no substitute to understanding at least the underlying principles of the models you're using. Don't just leave the work to the experts; work with them. If it's important enough to pay someone to do, it's important enough for you to understand it.
February 9, 2010 at 7:16 am
Point of Order: Those who created the credit default swap catastrophe using these types of methods were experts. Having an expert consultant is no substitute to understanding at least the underlying principles of the models you're using. Don't just leave the work to the experts; work with them. If it's important enough to pay someone to do, it's important enough for you to understand it.
Spoken with true wisdom. Well said.
February 9, 2010 at 7:55 am
Hi Martin,
I appreciate your comments and words of pause. It is important to realize that picking up this forecast model will not get you a job as a forecaster ;P
This article was not intended to be a replacement for a forecast system or to ignore the tools available in analysis services. It was intended to be an entry point into the world of forecasting, in the case where analysis services is not an option. You are certainly aware that linear regression is only one of many models to apply and that each model differs in it's usefulness for each application.
Hopefully, folks will find this exercise useful in understanding the some of the basic language of forecasting and, fundamentally, what this model is and how it works. Learning about a new skill can be daunting, but I hope that we will not be discouraged from starting merely by the possible depth of the subject.
Mark
February 9, 2010 at 8:57 am
February 9, 2010 at 9:29 am
sknox (2/9/2010)
Martin Vrieze (2/9/2010)
When these forecasting tools are used to base critical business decisions that impact people's lives and livelihoods, PLEASE be sure and leave this type of work to the experts.Point of Order: Those who created the credit default swap catastrophe using these types of methods were experts. Having an expert consultant is no substitute to understanding at least the underlying principles of the models you're using. Don't just leave the work to the experts; work with them. If it's important enough to pay someone to do, it's important enough for you to understand it.
IIRC most of the debacle was the result of ONE FORMULA that EVERYONE USED. Just took it for granted that it worked. But it was flawed. Businesses WANTED to use it though (serious bias) because it gave essentially all buckets of risk an AAA rating.
Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service
February 9, 2010 at 11:15 am
Great topic. I had to dig through all of my fearful stats days to do trending a few years back with SSRS 2005 as we were not going to be buying add on tools... So there I was building my own trend lines.
It isn't just trending where lack of understanding can bite us. To many people write reports with no understanding of the data and far more consume the reports with even less understanding. It is so frightening the number of business decisions that are made with invalid, incorrect, and incomplete data.
February 9, 2010 at 12:05 pm
Nice article. Well written.
Jason...AKA CirqueDeSQLeil
_______________________________________________
I have given a name to my pain...MCM SQL Server, MVP
SQL RNNR
Posting Performance Based Questions - Gail Shaw[/url]
Learn Extended Events
February 9, 2010 at 3:54 pm
Nice work, Mark.
A couple of points, though. First, the word "variance" is frequently misused when people mean to say "variation". This can cause ambiguity in math/stats where "variance" is defined as the mean of the squared residuals (a residual being the difference between the regression line at point Xi and the raw data value Yi). The square root of variance is actually the more familiar "standard deviation".
Second, I can explain why the notation used in linear regression is at odds with that of trigonometry and Cartesian geometry. The reason for writing "y = a + bx" is generality. Suppose you wanted to fit a 2nd-order poynomial to a data series that that had curvature, not just slope; you would then be solving "y = a + bx + cx^2". You could keep adding powers of x if you had good reason to believe that the underlying phenomenon possessed many degrees of freedom.
And, as if that wasn't enough, in the general problem (where any kind of approximating function is possible), you would actually write the coefficients as a0, a1, a2,... and the "basis functions" as f0(x), f1(x), f2(x),... . For linear regression, f0(x) = 1 and f1(x) = x.
Of course, there are weird and wonderful techniques for fitting data to nonlinear functions, but it might not be practical to do it with SQL.
Cheers!
- Al
February 9, 2010 at 9:14 pm
Thanks for the clarifications, Al. I appreciate the finer points and I am sure this will be helpful to any readers as well.
Mark
February 10, 2010 at 5:37 am
Do you have any useful info on calculating multiple linear regression with T-SQL?
February 10, 2010 at 12:48 pm
piet_dj (2/10/2010)
Do you have any useful info on calculating multiple linear regression with T-SQL?
Here http://developerworkshop.net/software.html
It also handles logarithmic, exponential and power regression.
N 56°04'39.16"
E 12°55'05.25"
February 11, 2010 at 7:44 am
Very good article, thanks. When I picked up the code snippet and tried to use it, I saw one problem. In the last section of code, labeled "Create Forecast", the term that's multiplied by the seasonality factor is missing a parenthesis. That is,
MAX(A) + (MAX(B) * MAX(Forecastkey) + 1) should be
(MAX(A) + (MAX(B) * MAX(Forecastkey) + 1))
Correct me if I'm wrong, but this is the change that it took to make my forecast look correct.
February 11, 2010 at 9:35 am
ltaylor 73774 (2/11/2010)
Very good article, thanks. When I picked up the code snippet and tried to use it, I saw one problem. In the last section of code, labeled "Create Forecast", the term that's multiplied by the seasonality factor is missing a parenthesis. That is,MAX(A) + (MAX(B) * MAX(Forecastkey) + 1) should be
(MAX(A) + (MAX(B) * MAX(Forecastkey) + 1))
Correct me if I'm wrong, but this is the change that it took to make my forecast look correct.
Yes, you are correct. When the trend is multiplied against seasonality, the parenthese are needed.
Thank you for finding that error! I will get the download updated.
Mark
Viewing 15 posts - 1 through 15 (of 34 total)
You must be logged in to reply to this topic. Login to reply