Forecasting

Recently I got the chance to do some forecasting work for a major American broadcasting corporation. There were many lessons I learned from this work, so I will try to lay them out in an easily digestible (as opposed to chronological, and therefore confusing) way.

To start off with, I conjecture that there are roughly two ways to forecast: Time Series and Parametric. Regardless of what the textbooks tell you, it is much more practical to first ask yourself:

Can the series (that I am trying to forecast) be expressed as a function of another set of data points?
Think polynomials, think regression. If yes, ask yourself:
Do I have a dependable source for this other set of data points?
If your answer to either of the questions was 'no', then Time Series is your only viable option. Whip out those Excel sheets and your favorite stats primer, and get cracking with the textbook approach.
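
If it helps, here is a minimal sketch of that textbook time-series route, in Python rather than Excel: Holt-Winters exponential smoothing from statsmodels on a made-up monthly series. The data and parameters are purely illustrative, not the actual series I worked with.

```python
# Toy time-series forecast: Holt-Winters exponential smoothing on a
# made-up monthly series with trend + seasonality + noise (illustrative only).
import numpy as np
import pandas as pd
from statsmodels.tsa.holtwinters import ExponentialSmoothing

rng = np.random.default_rng(0)
months = pd.date_range("2006-01-01", periods=36, freq="MS")
y = pd.Series(
    100 + 2 * np.arange(36)                          # trend
    + 10 * np.sin(2 * np.pi * np.arange(36) / 12)    # yearly seasonality
    + rng.normal(0, 3, 36),                          # randomness
    index=months,
)

# Hold out the last 6 months as the test run
train, test = y[:-6], y[-6:]

model = ExponentialSmoothing(
    train, trend="add", seasonal="add", seasonal_periods=12
).fit()
forecast = model.forecast(6)

# Compare the forecast against the held-out actuals
print(pd.DataFrame({"actual": test.round(1), "forecast": forecast.round(1)}))
```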

However, if both your answers were yes, life could still be interesting. Don't wait for me to tell you: go and collate the data from wherever it is right now, arrange it prettily on an Excel sheet, apply your favorite font and hold your breath... Now exhale, and download XLMiner. The trial version, of course.

The tools you are looking at right now are Multiple Regression, Artificial Neural Networks and Auto-regression. So familiarize yourself with the theory from Wikipedia, make the donation because you appreciate the work Wikipedia does, and run MR, ANN and auto-regression on your data sets, one at a time. Fiddle with the parameters to your heart's content (there is a rough sketch of this fiddling after the list below) because
(1) No matter what you might have understood from the theory, you haven't understood the theory. Like how? For example, I assumed that by decreasing the step size for the ANN, I would be able to stabilize the system faster. To my chagrin, larger step sizes actually sped up the stabilization (fewer epochs were needed before the error leveled off). Perhaps the system was chaotic, perhaps the larger step size was simply closer to optimal. Whatever it was, I am sure I haven't understood the theory so well that I can build the perfect model in one go. So keep trying, keep fiddling.
(2) You can only learn more about your data set. Every new scenario you run has the potential to show you something about your data that you did not know before, or that you could use in a later hypothesis.
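To make the fiddling concrete, here is the rough sketch promised above, in Python with scikit-learn standing in for XLMiner and with entirely made-up data: a multiple regression and a small neural net run side by side, plus a sweep over the neural net's learning rate (the "step size") to see how many epochs it takes for the error to level off.

```python
# Illustrative only: multiple regression vs. a small neural net on fake data,
# plus a learning-rate sweep to watch how fast training levels off.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(1)
X = rng.normal(size=(300, 3))   # three made-up driver variables
y = 5 + 2 * X[:, 0] - 3 * X[:, 1] + 0.5 * X[:, 2] ** 2 + rng.normal(0, 0.5, 300)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Multiple regression baseline
mr = LinearRegression().fit(X_train, y_train)
print("MR  MAE:", round(mean_absolute_error(y_test, mr.predict(X_test)), 3))

# Neural net: vary the learning rate and watch epochs-to-convergence
for lr in (0.0005, 0.005, 0.05):
    ann = MLPRegressor(hidden_layer_sizes=(10,), learning_rate_init=lr,
                       max_iter=5000, random_state=0).fit(X_train, y_train)
    print(f"ANN lr={lr}: epochs={ann.n_iter_}, "
          f"MAE={mean_absolute_error(y_test, ann.predict(X_test)):.3f}")
```

Which learning rate levels off fastest will depend entirely on your data; on mine, as noted above, the larger step sizes happened to win.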
Anyhow, we are getting ahead of ourselves. I promised to share learnings, and one other thing I learned was the importance of parametric modeling. So if anyone asks you why you built a parametric model, your answer might be something like this:
While time series could suffice in many situations (the motto of a good analyst is to always look for the best results, not to get caught up in cool stunts), time series heavily depends on historicals, and the degree to which the past will repeat itself is very uncertain. So rather than derive that perfect Trend-Cyclicality-Seasonality-Randomness decomposition, which might fall apart tomorrow due to some demand-side, political or macroeconomic crunch, you can build a parametric model based on those demand-side, political or macroeconomic variables and later run what-if scenarios for the client's viewing pleasure.
All the same, if your Time Series forecast is bang on target in the test runs, you can go ahead and quote it as your primary analysis, and use the parametric models for that new-and-improved value add.
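
To show what I mean by a what-if scenario, here is a minimal sketch in Python. The drivers ("ad_spend", "gdp_growth") and every number below are invented for illustration, not the client's model: fit a regression on a couple of demand-side / macroeconomic variables, then re-run the prediction under alternative assumptions about those variables.

```python
# Illustrative what-if scenarios on a parametric (regression) model.
# The drivers and figures are invented purely for the sketch.
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(2)
hist = pd.DataFrame({
    "ad_spend": rng.uniform(1.0, 5.0, 48),      # made-up historical drivers
    "gdp_growth": rng.uniform(-1.0, 4.0, 48),
})
hist["viewership"] = (20 + 6 * hist["ad_spend"] + 3 * hist["gdp_growth"]
                      + rng.normal(0, 2, 48))    # made-up target series

model = LinearRegression().fit(hist[["ad_spend", "gdp_growth"]],
                               hist["viewership"])

# What-if scenarios for the next period: base case, a downturn, a big campaign
scenarios = pd.DataFrame(
    {"ad_spend": [3.0, 3.0, 5.0], "gdp_growth": [2.0, -1.0, 2.0]},
    index=["base case", "macro crunch", "heavy campaign"],
)
scenarios["forecast"] = model.predict(scenarios[["ad_spend", "gdp_growth"]])
print(scenarios.round(1))
```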

3 comments:

RD said...

Good one Nitish!
Esp. enjoyed the writing style. Crisp and absolutely unambiguous!
However, I would like to voice a counter-opinion on some points.
History repeats itself: and by history I mean the pattern. That's why we have cyclic and seasonal patterns in historic data.
However, cyclic and seasonal patterns follow the trend (uptick or downward sloping, as the case may be). Randomness, on the other hand, is unpredictable: it's random. Such randomness is especially evident when the data points are sparse, e.g. sales of slow-moving items. Randomness can also be observed in a set of historic data that follows a normal distribution (bell-shaped curve). Even this randomness can be tracked and accounted for in the forecast if causal data for the randomness has been captured along with the historic data.
Parametric models can be very useful in predicting a trend or an event which can cause randomness; however, the quantum of effect due to the changes modeled in parametric models cannot be captured very accurately. This error, coupled with the accuracy error built into any forecasting technique, might attenuate the usefulness of the result that a parametric model provides.
It is for this reason, I guess, that there is still so much reliance on forecasting based on historic data rather than anything else, and why we observe great blunders like the subprime crisis.

El Scorcho said...

I agree with whatever 'rd' said above, but I don't think they are counterpoints as such:

(1) An effect which can cause "randomness" is one that can be said to break a model. Usually you try to cover such events in your Risk Plan, which is a model of sorts in its own right, depending on how deeply you plan for the risks; that in turn depends on how much impact those events can have and how valuable the quantity being modeled really is (since risk planning can get expensive if taken far enough).

(2) The subprime crisis was NOT caused by bad models, but rather by a lack of faith in the models. It was through models and risk scores that the mortgages were categorized as subprime in the first place. Since the risk was higher, the returns on derivatives based on them were also higher. This led some desperate investors and glory-seeking investment bankers to invest in these bonds. When the actual borrowers (who had taken out the mortgages in the first place) started defaulting, the risk in the models started getting realized, which led to a general panic. As you might know, there were not many collapses in the market (as in, no huge IBs bit the dust), and that was primarily because they had their risk models in place and knew when to bail out.

Hope that disconnects the subprime crisis and modeling for you. Anything else?

El Scorcho said...

btw
el scorcho = nitish
didn't feel like logging out-in...