ForecastingHere are some facts about forecasting. The first two are true for just about every business bigger than a kid’s curbside lemonade stand.1. Everything in business is driven by “the forecast.”2. Forecasts are almost always wrong.3. Forecasts are either quantitative or qualitative. Quantitative forecasts are causal, explanatory, or time series. Qualitative forecasts are based on judgment and expert opinion. Both quantitative and qualitative forecasts are numerical; it is the method – not the result – that warrants the name.Causal: Explains why the variable of interest changes.Explanatory: Explains that the variable of interest changes when one or more other variables change, and by how much, but not why.Time Series: Doesn’t explain anything; it just says “If there have been no major changes in the business environment lately and if you don’t expect any major changes in the near future, this is what’s coming. More or less.”4. Forecasts are either aggregate or itemized.Aggregate forecasts tend to be more accurate, because errors tend to cancel each other out. Suppose we accurately forecast enrollment for UB for next Fall. The CFO will praise us, because we got the tuition revenue exactly right. But suppose we have twice as many engineering students show up as we expected, and half as many liberal arts students. Our planning, staffing and scheduling is now worthless.Itemized forecasts can be itemized at a variety of levels. Continuing the UB example, we can forecast enrollment for, say, the School of Business, the Accounting Department, or for Accounting 101. The more finely we focus our attention, the less accurate the forecast is likely to be, because we’re dealing with smaller and smaller samples.5. Limitations of forecasts.Some things just can’t be forecasted, at least not with anything even close to useful accuracy. Natural phenomena, like earthquakes and hurricanes, are simply not well enough understood to forecast.Economic phenomena, like booms and busts, or even ordinary economic indicators like the DJIA or unemployment, are notoriously resistant to forecasting techniques.6. The Forecast “Horizon.”One of the distinguishing characteristics of a forecast is its horizon – the farther the horizon extends into the future, the less accurate the forecast. That means “the farther you go beyond the region where you have data, the less accurate the forecast will be.”Local weather forecasters are pretty good at forecasting 4 or 5 days out, but anything more than two weeks into the future is pretty much a guess. The same is true for business forecasts. Short horizons, say, up to a year, can be quite good, but any forecast beyond that is a crap shoot. It is also necessary to know the forecast error. How much “wiggle room” is included? Saying that next summer will be warmer than winter isn’t terribly helpful.Time Series Forecasting (TSF):Basic Assumptions:1. The future will be an extension of the past. Everything that’s been happening will continue to happen, at least across the forecast horizon. (If this assumption isn’t warranted, then this technique is worthless.)2. All the causes of variation in the variable of interest are imbedded in the data. The causes of variation, whatever they are, caused the variable of interest to do what it did. Therefore we don’t need to know what those causes are, just that they will continue as they have in the past.TSF compared with other methods:Multiple Regression (MR): You choose the variables on which you think the variable of interest depends. You form a linear relationship among the variables and the computer determines the coefficients, like this:y = ax1 + bx2 + … + kxn + CHere, y is the variable of interest, say, next month’s sales, and the x’s represent what we believe to be dependent variables, say, the amount of money we spend on advertising, current inventory levels, and so on. Plug in the values of the dependent variables, choose the coefficients, and out comes next month’s sales – in theory.Assumptions of MR:1. The variable of interest, y, is dependent on the xi’s.2. The variable of interest depends on all of them and does not depend on anything else.3. Although the usual formulation requires that the variable of interest be linearly dependent on each of the independent variables, non-linear relationships can be included.Problems:1. If the assumptions are wrong, the forecast will be less valid.2. If you forgot an important variable, the forecast is less valid and less accurate.Advantages of TSF viz. MR:1. You cannot leave out an important variable, or include an unimportant variable.2. In terms of implementation, Excel’s Regression tool is static. That means you have to re-run it each time you make a change. (We will generate the important outputs of Excel’s built-in tool so that they’ll respond immediately to any changes.)Disadvantages of TSF viz MR:1. You gain no insight into the causes of variation.2. Occasionally, summed periodic dependencies appear to be aperiodic. In ordinary American this means that if your sales are subject to seasonal variation as well as longer term business cycles, the combination of both effects may mask any apparent regularity and appear to be basically random.Moving Averages (MA):1. Moving Averages are simple, but they do not account for periodic variation and they are lagging indicators.2. They are only valid for “stationary” series (i.e., no trend, no cycles). Note: This is what books say. Moving averages will follow consistent trends very nicely, and depending on the period you choose, can “see through” cyclical or seasonal data.3. MA places equal weight on all data values, regardless of how long ago they occurred, but there are variations on this technique (see next section) that preferentially emphasize more recent data and can account for cyclical variation. But this technique is still a lagging indicator.Exponential Smoothing (ES):Exponential smoothing is like a moving average but it places more weight on more recent values and less and less weight on values from the past. If F(t) is the function value for time period t, A(t) is the actual value for time period t, then the forecast f(t+1) for the next period is given by f(t+1) = F(t) = ?A(t) + (1 – ?)F(t – 1) where ? is a constant between 0 and 1.1. ES is simple, but depends on your choice of ?, and is a lagging indicator.2. ES is valid only for stationary trends.3. ES can be extended to account for linear trends. When it is, it’s called “Holt’s Method.” If you’re interested, you can Google this.4. ES can be extended to account for linear trends and seasonal variation. When it is, it’s called “Winter’s Method.” You can also Google this.5. TSF does all this (and more) at least as easily. Further, TSF can relatively easily account for non-linear trends and multiple cycles “at no additional cost.”Time Series Forecasting (TSF):A “time series” is a set of equally spaced observations of the variable of interest made over time.In general, the method that follows does the following.First we will “clean up” the data by removing outliers, if any. Then we will compute a non-linear trend line and extend that line into the future across the forecast horizon. Next we’ll calculate any seasonal effects and add them to the trend.This method works better than any other for non-random time series. Unfortunately, most economic indicators (especially the stock market) are pretty much random. Fortunately, most business data (like sales, revenue, costs) are much less random.TSF – The Details:1. Plot the data, y(t) vs. t. What does it look like? The variable of interest (hereafter, just “y(t)”) changes as time goes by. If y(t) exhibits a pattern, it is an ideal candidate for TSF. The pattern can be complex, perhaps including both seasonal and cyclical variations on top of variable rates of rising and declining, but is nevertheless a pattern.If there is no detectable pattern, and the time series is essentially random, only the most general kind of forecast is possible. These tend to be much less helpful.Figure 1. Sample Monthly Sales DataOur example data looks like what’s shown in Figure 1: 48 months of sales data, and our job is to forecast next year’s sales, by month. When we look at this data, it looks a little like a seismograph recording. There is a general trend of rising sales, but at first glance, there’s little else that seems at all helpful and forecasting next year’s monthly sales seems impossible.But if we look closer, there seem to be some peaks and valleys that might occur regularly. Using Excel to identify which months have the highest sales, and our eyes to look at the actual data, we notice that the 6th month and the 18th month have noticeable peaks, and that the 9th month and the 21st month have noticeable valleys, or troughs. Then we notice that the peaks are 12 months apart, as are the troughs. Further examination shows that this pattern repeats across all 48 months. Oh, goody! That’s lucky for us, as we’ll soon enough see.2. Remove outliers, if any. Outliers are the values of y(t) that represent rare, non-repeating occurrences. Statistically, outliers are points more than ?3? from the mean. A better method for recognizing outliers is to remove points that are more than ?3? from the trend-line.A word of warning is necessary here. The computer will find all the points that are outside the standard statistical requirement for being an outlier. These points may or may not actually be outliers and you will have to use your judgment. An example is jewelry sales in the U.S. Jewelry sales are fairly constant throughout the year but peak dramatically in December. I’ve seen data for jewelry sales and the computer invariably says that all the December sales are outliers. Statistically, they are outliers, but that’s the nature of the business. The key phrase for determining which points are outliers is “rare, non-repeating” points.Excel formula for identifying outliers: Name the range containing the data, “Data”. Then, in some cell above that range, type =AVERAGE(Data). Name that cell “Avg”. In the cell just below the average, type =STDEV.S(Data). Name that cell “StdDev”. In the column next to the data, type =IF(ABS(Data-Avg)>3*StdDev,“OUTLIER”,“”). Pop quiz: Why do we need the absolute value (ABS) function?A better formula (trend-based outliers):i. In column A, number the data values as 1,2,3, etc. Name that range “PtNbr” or “Period”.ii. With the chart selected:Chart Tools ? Design ? Add Chart Element?Trendline ? More Trendline Options ?“Linear” and “Display Formula”) ?Close.(LClick on formula and drag it to where you can see it.) [You can also LClick on the “+” sign at the top right corner of the graph and go directly to “Trendline.”]iii. Copy formula into col C, using data point numbers in col A as “x” values as follows:=2.54*PtNbr+643.1 (for example) and copy down. Name this range “LinearTrend”. Of course, you will use the real coefficients based on your data.Pop Quiz: What do the constants represent, in business terms? The answer is NOT “slope” and “intercept”! While that’s correct for your algebra class, we need to understand what the numbers mean in business terms.iv. Create column D as the difference between the linear formula and the data values.=Data-LinearTrend and copy down. Name this range “Diff”.v. Find the standard deviation of the differences and check them for outliers.Pop quiz: Why is this better?Outliers are replaced as follows: yt = ½(yt-1 + yt+1). In the rare case where there are two outliers together, yo1 = 2/3ya + 1/3 yb and yo2=1/3ya + 2/3 yb. These formulas are what you will find in reference books. It is an extremely marketable skill to be able to translate formulas from books into Excel. The key to doing this is to understand what each symbol represents and where those values are on the spreadsheet. In this case, yt represents the outlier value at time t, and yt-1 and yt+1 represent the data values just before and just after the outlier value.There is a caveat here – if yt is at a peak (or trough) of the time series, you need to use your judgment.3. Replace missing values. Use the same method as for removing outliers.4. Find the trend. Here’s where this whole TSF technique shows its value. What we’re going to do is called “multiple regression.” This whole topic is usually covered in statistics courses under the title of “Correlation and Regression.” I’ve uploaded a fairly detailed description of this topic and its application to TSF to Canvas in a file called “Mgt301-Correlation&Regression”. Read it on your own and ask questions in class about anything you don’t understand.Now we all know that the equation y = ax + b is a straight line. Very few real-life situations are straight lines. They’re used because it’s a simple equation and back in the day, before cheap and powerful computers, that’s all we could do by hand. Now we can use better equations, but there’s something I need to say before we go any further.Excel offers a number of built-in regression equations including polynomials up to degree six, logarithmic, exponential, and moving average equations. Why these are almost never appropriate is because time-based business data (like sales) doesn’t change terribly rapidly. Even a second-order polynomial rises and falls more rapidly than real business data. Exponential equations are what is frequently used, but they’re only good for a limited time frame. Any CEO who tells his people and/or stockholders that “our goal is to maintain 5% growth” doesn’t understand basic math. A constant percentage growth rate is an exponential function, like the one you learned in Finance 309: where r is the periodic interest rate and n is the number of periods. If you graph this, it eventually rises literally “exponentially fast.” Exponential growth simply cannot be sustained.So what we need is an equation that doesn’t rise or fall too rapidly, but does allow for little dips and rises as time goes by. Here is such a function: .The first term will rise faster than linear (or fall, depending on the sign of a), linear, slower than linear, and it can decay slowly enough to keep the other terms from rising or falling too quickly. The last term is a constant which moves the whole curve up or down without changing its shape.All we need to do is figure out what the coefficients ought to be, so that our trend equation describes, as accurately as possible, the correlation between sales and time. Here’s how.We are going to determine how well our data correlates with time. “Time” for us is simply the chronological progression of the data. That means the first time period is “1”, the second is “2”, and so on for as many time periods as you have for your data.This translates, mathematically, to finding the coefficients of the function as described on the previous page. When you want to find out how to do that, you can Google it, but to do that in a way that Google can figure out what you’re asking you should know that what we’re doing is called “finding a least squares fit of an over-determined non-linear regression equation.” Whew! You can also look through an applied math book for the same thing. When you do either of those things, you’ll find that the solution for the coefficients is given byc=(XTX)-1XTy.Not much help, huh? Here’s what it’s saying.Although this looks like advanced math, involving matrices, their transposes, products, and inverses, its derivation is based on the following. (The transpose of a matrix is found by reversing its rows and columns: the first row becomes the first column of the transpose, the second row becomes the second column, and so on.)The simple equation, 3x = 7, can be solved using fancy words, like this. “Pre-multiply both sides by the inverse of the coefficient.” That gives 3-1 3x = 3-1 7, or 1/3 3x = 1/3 7, or x = 7/3.OK? Now let’s remember systems of equations from high school, something like this:This is still fairly easy to solve, right? Now, I can write that system using matrix notation as follows:(If your high school was like my high school, you remember this; otherwise, trust me.) I can express this in general notation, like this:where is the coefficient matrix, is the vector of unknowns, and is the right-hand side.Now this matrix form of the equation is an exact analogue of the original equation, 3x = 7 and the same rules apply for solving it, namely, pre-multiply both sides by the inverse of the coefficient. Doing that gives,which is the solution for any system of equations for which the coefficient inverse exists. Something similar produces the equation above, but it’s slightly more complicated because the coefficient matrix isn’t square. Here’s what that means.In the work we’ll be doing, we usually have more equations than unknowns. You know that we can solve two equations in two unknowns, 3 equations in 3 unknowns, and so on. But we will have, for example, 48 equations in 5 unknowns. This is called an “over-determined” system and the method for solving it is called “Least Squares” minimization. That is, we will find a function that minimizes the total squared distance of each of the points from the function. Its derivation isn’t important for us, but the solution is.So – we have five terms in the function we think will work well. We need a column of the values for each of them, as follows.i. First create 5 columns, labeled ““T^(3/2)”, “T”, “T^(1/2)”, “T^– (1/2)”, “Const”. In this case, “T” represents “time” rather than “transpose.”ii. In each column, insert the formulas =PtNbr^(3/2), =PtNbr, =SQRT(PtNbr), =1/SQRT(PtNbr), 1. Notice that the formula matches the column heading except for the constant term. Copy these formulas down to the bottom of the data and make sure that Excel doesn’t increment the constant term; it should be “1” all the way down. This is the matrix “X” in the equation above.iii. Name the range of these values “X”. (Highlight only the numbers, not the column headings!)iv. Name the data values “Y”. (Same thing – only the numbers.)To make the next instruction clearer, here is the equation again.c=(XTX)-1XTy.As usual, we read this equation “from the inside out.” The inner-most term is XTX. That part of the equation is written below in green. After finding that product (“MMULT” means “matrix multiply”) we need to find its inverse. The Excel command for that is MINVERSE, which means “matrix inverse” and is in un-bolded red. Be sure to notice the closing parentheses in matching colors.The next term we need is XTy. That product is written in bold red. Finally we need to multiply (XTX)-1 by XTy. That’s shown in blue.This will give us the answer we want, but the numbers will be in a column and we want them to be in a row, so the last operation is to TRANSPOSE that column to a row. That’s shown in a color Excel calls “Red, Accent 2, 40% lighter” and I call “light purple-ish.”So – finally we’re ready!v. Highlight one row immediately above the column headings of X and without touching the mouse, enter the following array formula. This is all one formula; it doesn’t fit on this page, but you will write it all in one line in Excel.=TRANSPOSE(MMULT(MINVERSE(MMULT(TRANSPOSE(T),T)),MMULT(TRANSPOSE(T),Y))) and press CTRL-SHIFT-ENTER.These 5 values are the coefficients of the trend equation.Name this range “Coeffs”. Even though the explanation took almost 3 pages, all you have to do is enter the formula; it takes less than 30 seconds.Once you have the coefficients of the equation of the trend that most closely follows your data, write that equation into an appropriate column, using the “data point number” cells, which you’ve named “PtNbr” as values of t. Let’s assume that X is in columns I through M and that the first row of numbers is row 5. Label column N as “RegrLine”. Then, in cell N6, write the following formula:=SUMPRODUCT(Coeffs,I5:M5) and copy this equation down through the last data value. Name this range “RegrEqn” or “NonlinearTrend”.5. If the data is seasonal, add the seasonal effects to the trend line. When I use the word “seasonal,” I’m not implying that the data changes with the seasons the way the weather does. Often, business data does change with the seasons because of holidays, summer vacations, and so on, but “season” actually represents any cyclical pattern that repeats as time passes.i.Decide on the length of the “season.” How many data points are there before the pattern repeats exactly? If the pattern repeats almost exactly, then that’s not good enough; your data isn’t seasonal. The good news is that doing this step for non-seasonal data doesn’t hurt much. (Pop quiz: Why not?)ii.Create a Moving Average (MA) of length “season.” This is usually 12 for monthly data, 4 for quarterly data, 5, 6, or 7 for daily data, and so on. Regardless of what’s “usual,” useyour eyes and your brain to find the season length. Here’s how to do that.The example data has peaks every June and November, and a trough every January,April, and September. We can see this simply by looking at the plot of the data. Since the pattern of higher and lower sales repeats every 12 months, we’ll create a 12-month moving average.A “moving average” is the average of sales for one full period, which in our case is 12 months. We take a full period average because the pattern repeats every period and because the peaks and troughs will be “averaged out,” i.e., the above-average months and the below-average months, when averaged together, should create a rather smooth line.The average of the first 12 months is found with a simple AVERAGE function, namely, =AVERAGE(D6:D17) if the actual raw data is in column D. As we “move” the average, the next 12 month period is from D7:D18 (February through January), the next is from D8:D19 (March through February), and so on.But where do we write this equation? We now have column N that contains the trend equation. Let’s put the moving average in column O. So we’ll label this column (in cell O5) “12-Mo MA”.The average monthly sales for the first year really ought to be put at the midpoint of the year. The midpoint of the year is after June and before July, but there’s no cell for “after June but before July” so we’ll put the formula in the cell for June, just because.Now copy the formula down until we get to the last June in the data. The formula at that point should be, for our example data, =AVERAGE(D42:D53). D53 is the last cell that has any data in it.iii. To make the MA reflect the true midpoint of the year, we need to “center” it if the length of the season is an even number. The value that’s in the first June cell really ought to be between June and July. The value that’s in the first July cell really should be between July and August. So if we find the average of those two cells, we’ll get the correct average for “July.”So, we’ll create a column for “CMA” (Centered MA). In cell P5, enter the label “CMA.” In the cell for “July” we’ll average the MA values for June and July with the formula =AVERAGE(O11,O12) and copy that down to the last MA value.iv. Look for “inflection points.” An inflection point is a point where the CMA makes a rather abrupt change of direction. (There’s an illustration of this in the file “Mgt301-Correlation&Regression.”)The basic assumption of this method is that there have not been any major changes in the business environment at least as far back as your data goes, nor will there be any major changes in the business environment at for the period of the forecast horizon. The CMA is especially valuable for finding such changes.The method is simple: just graph the raw data and the CMA on the same graph. If there’s an inflection point, you cannot use any of the data that accumulated prior to the inflection point. Pop quiz: Why not?If you find an inflection point, through away all the data prior to that point and start over with the remaining data. (Don’t really throw it away, because when you get to writing up your results, you’ll need to show the original data with the inflection point before proceeding with the rest of your report.)v. Calculate the Raw Seasonal Index (RSI). A “seasonal index” is the ratio of actual sales for a particular month compared to the average sales for that year. Because the “average sales” is a moving average, we always get the precise seasonal index for a particular month based on the average for which that particular month is the middle month. (That’s a confusing sentence; you may need to read it several times.)So – another column. Write the label “RSI” in cell Q5. For the first value for which the CMA exists write the formula =D12/P12 and format it as a percentage. The result will be how much that month’s sales exceed or fall short of the average. Then copy that formula down to the last cell for which the CMA exists.When you look at the values, you’ll notice that September, for example, always has a value less than 100%. For the example data, the values are in the upper 80’s, which means that September sales are about 86% or 87% of average. June, on the other hand, has values that are 10% to 15% above average.vi. Calculate the Average Seasonal Index (ASI).This goes in column R. You noticed that the RSI values for a particular month are approximately the same, but not exactly the same. So this step will find the average RSI for all occurrences of the same month.This is an annoying formula – it is the average of all the RSI values for their particular point in the data. For example, if the data is monthly, it is the average of all the July values of RSI, then all the August values, and so on. You can’t simply copy the formula down because the cell references will advance so that they no longer include data toward the top of the sheet and will include non-existent data at the bottom.The first RSI value we have in our example data is for July, so the average RSI for all Julys is =AVERAGE(Q12, Q24, Q36). We will copy this formula down for 12 months, if possible. In this case it is possible; that won’t always be the case. The important point is to write a formula that includes all the occurrences of each month.Once you’ve calculated the ASI for 12 months, you know the ASI for all months. So you can extend the ASI for all the months of data and into the future across the forecast horizon by simply copying the 12 calculated ASI values. Name this range “ASI”.So, for the first January, the formula in cell R6 is simply =R18. This can be copied down through the first June. From the second July, the formula is =R12 and that can be copied as far into the future as your horizon extends.6. Add the seasonal effects to the trend line. This is easy – just multiply the trend equation value by the ASI. Put the label “Model” in cell S5, the formula =RegrEqn*ASI in cell S6 and copy down to the last data value.Now graph the raw data (column D) and the Model data (column S) on the same graph. For our sample data it should look like the graph below (Figure 2). The agreement of the Model and the data is so close that it’s hard to tell which line is which.7. Extend the model across the forecast horizon. This, too, is easy.Extend the time periods across the horizon. Our forecast horizon will be one year, so that means you need to extend the numbers in column A from 48 through 60. Highlight the bottom-most equation columns (I53:M53) and copy them down through the horizon. Copy the ASI formula across the horizon. Finally, copy the model formula across the horizon. Name this range “Model”.When you graph the result (Data and extended Model), you’ll see the forecast. It’s shown on the next page.Figure 2: Example Data and ModelFigure 3: Example Data and Next Year’s Monthly ForecastThe only thing left to do is to label the graphs nicely. They’ll need a title, axis titles, and actual dates on the x-axis.In the professional world (but not required for this project) explore other curve-fitting techniques and other forecasting methods. Choose the forecast that best matches your “gut instinct” which is based on your intimate and detailed knowledge of your particular industry and your particular requirements. The method given here, however, is often the best.Now comes an absolutely critical part of any forecast – an error analysis.8. Perform various error analyses.i. First, calculate and plot the errors. The source of errors is always the model; the data can never be wrong, it simply is what it is. So the error is the discrepancy between the data and the model, expressed as a percentage relative to the data. Here’s how.Yet another column, this time column R. Put the label “PctErr” in cell T5. In cell T6, the formula is =(Model–Data)/Data and format it as a percentage. Then copy the formula down to the bottom of the data. (There’s no way we know what the future error will be until we get the future data.) Name this column “PctErr”.Create a graph of the errors. If the model has captured all the information that’s in the data, the errors will be small, random, and approximately normally distributed. We need to check each one of those criteria.ii. Is the error small? “Small” is a subjective judgment. Most people define “small” as meaning “if the average absolute value of the errors is less than 10% then the errors are small.”That measure of small-ness is called the MAPE, or Mean Absolute Percentage Error. Its formula is =AVERAGE(ABS(PctErr)) and CONTROL-SHIFT-ENTER. Put this formula in cell T4 and the label “MAPE” in cell T3.iii. Is the error random? “Randomness” is a surprisingly difficult topic. We are hard-wired from hundreds of thousands of years of evolution to recognize patterns. If I asked you to identify which of the following sequences of five digits is random, which would you choose? 1-2-3-4-5; 3-9-2-7;4-4-4-4-4. I’ll bet that most of you chose the center sequence. The answer is that all of them could be random. We have such an automatic response to patterns that it’s hard to believe that 1-2-3-4-5 could possibly be random.People who’ve studied “random” sequences have come up with two measures of randomness called “runs tests.” We will perform both of them and we hope that our model passes both of them.The first runs test is the number of “above/below” runs that a sequence has with respect to the average value of the sequence. The idea is that a random sequence will be above average and then change to below average an expected number of times.The expected number of “above/below” runs for random data is , where n is the number of data points, and the standard deviation of those runs is . If you recall your basic statistics, we can form a 95% confidence interval estimate of the true number of runs from our sample by calculating , where “2” is the real-world’s value for Z or t for 95% confidence. If the number of above/below runs is within that interval, we’ll agree that our errors pass the above/below runs test.The second runs test is an “up/down” test. A random sequence will change direction an expected number of times. The expected number of “up/down” runs for random data is and . As before we’ll use 95% for our confidence level and the interval becomes . If the number of up/down runs is within that interval, we’ll agree that our errors pass the up/down runs test.Now, finding the number of above/below and up/down runs is a pain in the neck (and eyes) if you try to count them from the graph. So we’ll get Excel to do it for us.The technique that Excel uses in its regression tool and which I’ve detailed for you in step 4 (page 8) guarantees that the average error will be zero. That means the end of an above-run (or below-run) occurs when the error plot crosses the x-axis. That means that when the error values change sign, we should count an above/below run.Now, we can’t know whether the first value crosses the x-axis; it is either above or below. The first possible chance for a sign change is when we look at the second value. Therefore the formula begins in the second row of data. The formula is =IF(SIGN(X6)=SIGN(X7),0,1) and it goes in row 7. (I don’t know that the column will be X – that depends on where you choose to put this column of values.)Counting up/down runs is a little trickier. Here, we have to count each occurrence of a change in the sign of the slope of the error graph. Be able to tell me why the following is the correct Excel formula and why the first occurrence of the formula is in the third row of data. =IF((X7-X6)*(X8-X7)>0,0,1).When these formulas have been copied down to the bottom of the data, simply SUM the column to find the number of runs. Then write the formulas for the confidence intervals and you’ll know whether or not your model’s errors pass these randomness tests.If your error analysis fails one or both of the randomness tests, that means that there’s information in the data that’s not captured by your model. That means that your model can be improved. If you can’t improve your model, make sure that you include an appropriate disclaimer when you write up your results.iv. Are the errors normally distributed? There are two ways to do this. The first is to use a ?2 (chi, pronounced “ki”, rhymes with “high”, squared) test and the other is to calculate a MAPE for you error distribution relative to a normal distribution. We won’t use either.All we’re going to do is create a frequency histogram of the errors and say that the errors either are or are not approximately normally distributed. I expect that you’ll tell the truth here and not pretend that the results are better than they are.9. Establish a confidence bound for your forecast. Again, from basic statistics, you’ll do this by adding a 2?e band (95% confidence limit) to your model and the forecast. The s term is the standard deviation of the errors. Your forecast is now yf? 2?e and you are ready to write up your results.10. Write-up:Despite all your work in getting to this point, very little of that work (except the results) will be included in this short (2 – pages) write-up. (This is annoying. You’ve done a lot of work to get these numbers and you don’t get to talk about it. That’s the nature of almost all data analysis work that you’ll do. The boss wants the answer, or at best, the Cliff Notes version of your efforts. The good news is that if you are asked about how you arrived at your forecast, you’ll be on solid ground.)If any of you have ever been in the Army, you’ve learned that there are three sections to every report or presentation: start by telling them what you’re going to tell them, then tell them, and finish by telling them what you told them. That’s basically what you’re going to do in this write-up. (It’s also very close to the way that a graduate thesis is organized.)The first thing you need to know to do a write-up like this is who your audience is. I’m your audience. I know what you’ve done. You don’t want to give me a tutorial on time series forecasting; I’ve already been there, done that.The next thing you want to know is only say what needs to be said. I can’t emphasize this enough. Bosses don’t have a lot of time to read a long memo. (And as of now, I have 63 of these reports to read. Do yourself a huge favor and don’t make me read more than necessary! But – be sure to say all that needs to be said. Writing is easy. Writing well is hard.)Begin by treating this as a business memo. That means the heading of your memo will be something very much like this:To: Prof.From: StudentDate: mm/dd/yyyySubject: Forecast of
Order Management
Premium Service
- 100% Custom papers
- Any delivery date
- 100% Confidentiality
- 24/7 Customer support
- The finest writers & editors
- No hidden charges
- No resale promise
Format and Features
- Approx. 275 words / page
- All paper formats (APA, MLA, Harvard, Chicago/Turabian)
- Font: 12 point Arial/Times New Roman
- Double and single spacing
- FREE bibliography page
- FREE title page
0% Plagiarism
We take all due measures in order to avoid plagiarisms in papers. We have strict fines policy towards those writers who use plagiarisms and members of QAD make sure that papers are original.