
LESSON 7 Understanding Data Patterns and Forecasting Techniques

Assigned Reading
Selected assigned readings can be found on the BUSI 460 Course Resources website, under "Online Readings".

1. UBC Real Estate Division. 2011. BUSI 460 Course Workbook. Vancouver: UBC Real Estate Division. Lesson 7: Understanding Data Patterns and Forecasting Techniques.
2. Hanke, John E. & Wichern, Dean W. 2004. Business Forecasting, Eighth Edition. Prentice Hall. Chapter 3: Exploring Data Patterns and Choosing a Forecasting Technique, p. 57-85 (excluding cases and problems); Chapter 4: Moving Averages and Smoothing Methods; Chapter 9: The Box-Jenkins (ARIMA) Methodology, p. 381-393.
   OR
   Hanke, John E. & Wichern, Dean W. 2005. Business Forecasting, Custom Edition. Pearson Custom Publishing. Chapter 3: Exploring Data Patterns and Choosing a Forecasting Technique, p. 1-29; Chapter 4: Moving Averages and Smoothing Methods, p. 31-68; Chapter 9: The Box-Jenkins (ARIMA) Method, p. 69-81.

Recommended Reading
Selected recommended readings can be found on the BUSI 460 Course Resources website, under "Online Readings".

1. Hanke, John E. & Wichern, Dean W. 2004. Business Forecasting, Eighth Edition. Prentice Hall. Chapter 11: Managing the Forecasting Process, p. 485-494.
2. Torto Wheaton Research. June 2001. Forecasting in a Rapidly Changing Economy: Base Case and Recession Scenarios. www.twr.com (free registration required).
3. G. David Garson. 2005. Time Series Analysis. Notes to accompany course Public Administration 765, NC State University. (This is an extremely clear source that explains time series analysis using SPSS. You can also follow links to other helpful statistical explanations.)

Learning Objectives
After completing this lesson, the student should be able to:

1. identify and analyze data patterns and characteristics;
2. select appropriate analytical and forecasting techniques to address critical factors in real estate decision making;
3. measure and interpret forecasting error in statistical forecasting models; and
4. determine the adequacy of a forecasting technique for statistical modelling.


WARNING: Lessons 7 and 8 are statistics intensive! These lessons require a working knowledge of statistical concepts such as measures of central tendency, measures of dispersion, sampling distributions, and regression analysis. If you are not familiar with basic statistical measures or have not recently completed a university-level statistics course, we strongly recommend that you:
• review selected lessons from the BUSI 344 course, "Statistical and Computer Applications in Valuation" (available under "Pre-Reading" on the course website);
• read the suggested readings; and
• have a statistics text handy for reference.

Instructor's Comments
If you practice real estate appraisal, property management, brokerage, or other consulting activity, your daily work is consumed with forecasting activity. But how often do you take time to critically evaluate your forecasts? Why should you care? Will this course help you in your daily work? Consider the following scenarios:

1. An appraiser, working on behalf of a Surrey office landlord, prepares an appraisal report for an office building which is subject to assessment appeal. One of the issues is the market vacancy of the property as of July 1, 2006. The appraiser's opinion of market vacancy is, coincidentally, identical to the third party market survey for office properties in Surrey. The appraiser is called to give testimony at the appeal hearing and is subjected to cross-examination by the Assessor's legal counsel on the subject of market vacancy. The Assessor's counsel has a basic understanding of forecasting principles and through a series of questions draws the appraiser to conclude that the market forecast was really based on the published third party survey. Counsel goes on to point out the limitations and flaws in the third party market forecast when applied to the property under appeal. How could the appraiser have been better prepared for the hearing?

2. A residential property developer (client) in Toronto regularly purchases consulting services to provide demographic forecasts for product demand and absorption. The client has always trusted the advice provided in the consultant's reports, but the latest numbers don't seem to be consistent with the developer's gut feeling for market dynamics. The client is conflicted since she wants to gain a better understanding of the consultant's forecast but realizes that she doesn't have the background to interpret the underlying methods, assumptions, and statistical measures. If you were the client, what would you do? Does the consultant have a duty to the client to "de-mystify" the forecast measures of reliability?

3. You have been engaged by a provincial agency to review an appraisal establishing a sale price for a surplus public property suitable for industrial use. The market analysis component of the report consistently contains the phrase "in my judgement" as support for the conclusions on the property's market appeal, target market, exposure period, trends in local market conditions, and the broader economic outlook. The property has been valued at $5,000,000 and, given the scarcity of vacant industrial sites in the area, will generate strong interest when exposed to the market. When you follow up with the appraiser, it becomes clear that the quantitative and qualitative analysis to back up the statements of opinion cannot be produced. As a result of the appraiser's "guesstimates", the credibility of the entire appraisal is suspect and you conclude that the terms of reference for the assignment have not been met.

In today's business environment of increasing due diligence, sole reliance on the phrase "in my judgement" as support for critical forecasts is becoming unacceptable. Eventually, everyone is called to account for their forecasts.


The case studies above offer a few real-life scenarios illustrating the importance of forecasting support; in reality, real estate research and analysis cannot be completed without some level of forecasting activity.

There are two times in a man's life that he should not speculate: when he can't afford it and when he can. (Mark Twain)

To understand what we mean by the term "forecast", consider the following example from The Real Estate Game, in which author W.J. Poorvu notes there is very little new under the real estate sun.1 Most events in real estate have occurred before. As an example, Poorvu discussed how real estate investment trusts (REITs) usually revive in popularity every decade or so. Poorvu's view is that the value of real estate is linked to interest rates and the equities (stock) market. His hypothesis (in 1999) was that real estate is a more attractive investment when interest rates are high and the stock market is performing well; when interest rates are low and equities are not doing well, real estate investments are not attractive. In other words, the pattern of high demand followed by low demand for REITs echoes the business cycle, which generally follows a ten-year pattern.

Forecasting relies on the observation and analysis of such patterns. If a particular event follows a regular sequence, forecasters can use historical information to develop models to predict future occurrences of the event. Most critical decisions are made in the face of uncertainty. Forecasters can assist the decision making process by examining and learning from history how certain events that are critical for the decision unfold. This knowledge can be applied to develop the most appropriate models to forecast future events.

Scott Armstrong, in the Principles of Forecasting, points out that forecasting is often confused with planning. According to Armstrong, planning is about what the future should look like, while forecasting is an attempt to learn what the future will likely look like.2 However, there is a strong linkage between the two activities, since forecasting plays a role in the planning process by providing feedback on possible outcomes, which in turn leads to changes in plans. For example, a market forecast predicting strong demand for large office floor-plates or energy efficient office space may result in a change to architectural plans for new office buildings.

Forecasting Fact: A forecast is a prediction of a value for the future or another situation based on observation and analysis of data patterns.

So far in this course, we have provided examples to illustrate why it is dangerous to forecast without supporting analysis. However, the success of any action/decision will often depend on the outcome of future events. Therefore, it is important to obtain good and accurate forecasts of the future whenever possible. We can only make accurate predictions of events if we have good information to apply to those forecasts, such as past events, trends, and consequences. This lesson and the next will guide you through some techniques that apply past information to develop forecasts. As noted earlier in the Instructor's Comments, this lesson is fairly technical, but we will try to keep it simple by working through basic illustrations using a step-by-step approach. The examples we use may not be exactly the same as those you will use in your work, but they illustrate the important points on how to deal with a variety of data generally encountered in any workplace.
By the end of this lesson and the next you will be able to develop forecasts for a number of events. In this lesson, we will focus on producing forecasts using a single data series (univariate analysis); in Lesson 8, we will build on this and demonstrate how to create forecasts when more variables are introduced (multivariate analysis).

1. Poorvu, W.J. 1999. The Real Estate Game: The Intelligent Guide to Decision-Making and Investment. The Free Press, New York. p. 4.

2. Armstrong, Scott J. 2001. Principles of Forecasting: A Handbook for Researchers and Practitioners. Kluwer Academic Publishers, Boston. p. 2.


Lesson 6 offered an introduction to forecasting theory and was relatively short. If Lesson 6 can be considered an appetizer, then Lessons 7 and 8 are the main course (with lots for you to digest!). However, keep in mind that much of this length results from the numerous tables and figures we have provided to illustrate various data patterns and the computer output necessary in developing forecasts. There are also numerous text boxes providing software commands.

Statistical Software – SPSS


PLANNING AHEAD Lessons 7 and 8 illustrate statistical forecasting methods using the SPSS software package. We recommend that students use the software so they can follow along with the instructions in these lessons and get the most out of this course. You should install SPSS now and download the appropriate databases from your course website (under "Online Resources"). If you require SPSS assistance, you should review the "SPSS Orientation" document and the SPSS screencast videos that you can find on your course website.

SPSS Software Updates SPSS frequently updates the current version of their software. While there are some minor operational differences in each version, students can typically easily adapt to the changes made in each version of the software. This course workbook is written specifically for Version 19.0, but past versions back to 15.0 work perfectly well in this course.

The complex calculations required in statistical forecasting are best carried out using statistical software, which makes the application of complex formulas and difficult math quite straightforward. The software allows you to diagnose the patterns in your data and provides a variety of options for forecasting depending on the results of your diagnosis. In this and the next lesson, we will focus on the use of the SPSS ("Statistical Package for the Social Sciences") software package. Throughout these lessons we will discuss forecasting techniques and then illustrate the steps in SPSS. In our view, SPSS has the right combination of simplicity in implementation, clarity of help files, and ability to implement a variety of techniques to make it our recommended program. Keep in mind that you can complete a number of the less complex statistical forecasts with Microsoft Excel, although more steps are generally required and the outcome is not as elegant. As various forecasting procedures are described, we will present the associated SPSS commands in a text box, like the one below. The data used for the illustrations can be found on the course website. An attempt has been made to use real property data to illustrate the practical application of SPSS for real estate forecasting. By working through these illustrations and following the steps, you will become more familiar with the program and with the forecasting techniques.


SPSS Instructions: Help Files
The steps required for the SPSS software will be illustrated in text boxes throughout the lessons. As a first step, you may wish to review the SPSS help files, online help, and tutorials, particularly those related to forecasting. For example, if you view "Time Series" in the Help menu, the first topic is Sequence Charts. The "How to" feature will guide you through the process of generating a sequence chart. This "How to" option is available for many commands, and can really help you in your work. We strongly suggest that you use these features to enrich your learning.

To use the How to command for sequence charts:
Choose: Help → Topics and click on the "Index" tab. Scroll down to "time series" (or enter "time series" in the search box). Ensure "time series" is selected in the results and press Enter. Below the title "Sequence Charts" is an icon of a chart; click on the chart icon and then click on How to.

Lesson Plan
Forecasting draws on information from past data to predict the future. The optimal forecasting technique for any given situation is somewhat dependent on the nature of the data that is available and the decision to be made or problem to be solved. Because data characteristics can differ significantly, it is necessary to have a detailed look at the data itself. Since most statistical forecasting techniques use a particular type of data called "time series data" (defined below), the lesson will devote considerable time to examining the properties of this type of data. The outline below provides an overview for the first two sections of this lesson: (1) the characteristics of data series, and (2) development and implementation of a forecast model building strategy.

Forecasting Fact: The optimal forecasting technique for any given situation depends on the nature of available data and the decision to be made or problem to be solved.

1. Statistical Data Characteristics:
   • Data types.
   • Autocorrelation and autocorrelation functions and how these provide information on data patterns.
   • Time series data patterns:
     • Stationary (also called horizontal)
     • Trend
     • Seasonal
     • Cyclical

2. Statistical Forecasting Using Time Series:
   • Forecasting using averaging and patterns.
   • Dealing with non-constant variance (Appendix 1).
   • Choosing the best forecast by:
     • Identifying and selecting the initial forecast model;
     • Estimating model coefficients (or parameters); and
     • Determining the model's adequacy by analyzing "residuals".


The textbox below provides an example of how a single data series might be used to provide a simplistic forecast for medium-end homes. This example will be referred to again later in this lesson, as we introduce various forecasting techniques.
Example: Forecasting Simple City's Housing Needs
A property developer in Simple City is planning to invest in an "affordable" single family housing development aimed at first-time homebuyers. The developer has sought your advice on whether to build in a particular neighbourhood of the city. In your examination of recent property sales to first-time buyers, you noticed that more and more couples are buying their new homes shortly after getting married. You hypothesize that this may be a result of down-payments being received as a wedding present, family advice to begin accumulating equity in housing, or cultural reasons. Based on your hypothesis, you decide to examine the relationship between the number of weddings occurring in the city and demand for new affordable single family homes. You advise the developer that a forecast of weddings is needed to time open houses and advertising for new homes. You must carry out a forecast of marriages based on past data. We will illustrate techniques for this in various parts of Lesson 7. [A note on realism: this example is clearly an over-simplification and likely relies on questionable assumptions. Please keep in mind we are using it only as an illustration of techniques!]

Statistical Data Characteristics


Data Types
Selection of an appropriate forecasting technique depends on the type of data being analyzed. There are basically three data types: cross-sectional, time series, and panel data. Cross-sectional data: Cross-sectional data consists of observations on a given set of variables at one point in time. For example, property tax assessment organizations will usually know the square feet of living area for all properties in the jurisdiction (the population of properties) at a single point in time, such as July 2005. This is cross-sectional data. Another example is the property characteristics data used to create mass appraisal models. Each piece of information on a house is a variable. Each house is an observation. There are many observations that are measured at one point in time for the model (for example, at the assessment date). For cross-sectional data, the order of the observations is not important; e.g., there is no need for data on house A to appear before data on house B.
Forecasting Fact: There are four time-series data patterns: stationary, trend, seasonal, and cyclical.

Time series data: Time series data consists of observations on a set of variables over time. Monthly inflation data for the last ten years is an example of time-series data. The minute-by-minute spot price of the Canadian dollar is a time series. The dates of Canadian federal elections are a time series. Unlike cross-sectional data, in time-series data the ordering of data is an essential piece of information. Therefore, we often call this a "data series".

Panel data: Panel data consists of observations on a set of variables across many units as well as over time. For example, the assessed values of each of the houses in one neighbourhood for the past 10 years would be a panel data set. The fact that you have data on many homes makes the data cross-sectional. But because you have this data for 10 time periods, you also have a time series. The houses in this neighbourhood form a panel that has been followed over time.
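To make the three data structures concrete, here is a minimal sketch in Python/pandas (not part of the SPSS material used in this lesson); the property names and values are hypothetical, chosen only to mirror the examples above.

```python
import pandas as pd

# Cross-sectional: many units, one point in time (the order of rows is irrelevant)
cross_section = pd.DataFrame(
    {"living_area_sqft": [1850, 2200, 1430]},
    index=["house_A", "house_B", "house_C"])

# Time series: one unit observed over many periods (the order of rows is essential)
time_series = pd.Series(
    [2.1, 2.4, 2.2],
    index=pd.period_range("2003-01", periods=3, freq="M"),
    name="inflation_rate")

# Panel: many units followed over many periods (cross-section plus time series)
panel = pd.DataFrame(
    {"assessed_value": [310000, 325000, 284000, 295000]},
    index=pd.MultiIndex.from_product(
        [["house_A", "house_B"], [2003, 2004]], names=["house", "year"]))

print(cross_section, time_series, panel, sep="\n\n")
```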


As mentioned above, forecasting mainly uses time-series data. This data type possesses unique characteristics that will be used in building forecasting models. To reveal these unique time series characteristics, we will introduce correlation and autocorrelation concepts. These concepts will help examine the time series data patterns and determine the corresponding forecasting technique to use with this data.
Data Attributes
Before applying a forecasting technique, the researcher must have confidence in the data. In the introduction to Chapter 3 of Business Forecasting, Hanke and Wichern identify criteria which can be used to evaluate the usefulness of data for analysis. The authors' criteria are that data should be:
• reliable and accurate;
• relevant;
• consistent; and
• timely.

Researchers use the term "meta-data" to describe the attributes of a data set. Meta-data gives the user of the analysis helpful context about the data. For example, the meta-data for a CMHC Monthly Housing extract would include a definition for each variable in the data set, the date the data was collected or assembled, data sources, and other information that explains the relationships that exist in the data set.

Autocorrelation Coefficient and Autocorrelation Function


Correlation describes how one variable is related to another. If the two variables move together (e.g., housing demand and house prices), then we say they have a positive correlation. If two variables move in opposite directions from each other (e.g., apartment construction and vacancy rate), then they have a negative correlation. Correlation shows how two variables are related; autocorrelation shows the relationship between the values of one variable at two different times. For example, autocorrelation on an annual housing starts data series would measure how the number of housing starts today (current year) relates to the number of housing starts a year ago and how last year's starts relate to the previous year's. The term "auto" means self; e.g., "automobile" means a self-propelled, moving vehicle. Autocorrelation permits analysis of data patterns, including seasonal and other time-related trends.3
Key Concept: Autocorrelation Autocorrelation describes how a variable moves in relation to itself, showing how the current value of a variable moves in relation to the past values of the same variable. Autocorrelation uses the term "lags" to refer to time periods: one lag means comparing the current value with the previous period; two lags means comparing the current value with the value two periods ago. For a one period lag, we are looking at the relationship of one period's value over the previous period's value during the entire length of the time series.

Knowing the relationship between the current value of a variable and its past values provides insight into the variable's future behaviour. The autocorrelation coefficient4 shows the degree to which the latest value of a number is related to its past value throughout the entire time series. The autocorrelation relationship can be positive or negative, strong or weak. For example, in a time series, if a high value for the last period leads us to predict a high current value, then the variable displays positive autocorrelation for one lag.
3. Hanke, John E. & Wichern, Dean W. 2004. Business Forecasting, Eighth Edition. Prentice Hall. p. 60.

4. The formula for the autocorrelation coefficient is provided in Hanke & Wichern's text, Business Forecasting, p. 60.


There are many variables in real estate that exhibit autocorrelation. For example, a property's monthly rent is positively correlated with the previous month's rent because rent is usually the same month-to-month or perhaps only raised slightly. Patterns like this are what allow decision-makers to make future plans. Another data pattern one could examine would be the autocorrelation between two successive months' interest rates: December with November, November with October, etc. This would give us the autocorrelation coefficient for a one month lag, which is an average of the relationship between each pair of consecutive months throughout the entire data series. Similarly, the autocorrelation coefficient for a two month lag would be the average of the correlation between December and October, November and September, etc., throughout the entire series.

The autocorrelation coefficient is usually denoted as rk, where the subscript k is the interval (or lag) between the two periods being examined. For example, if we are examining the correlation between the current month's value and last month's value (lagged one period), then k=1. If we are considering the correlation lagged two periods, then k=2. For monthly data, k=12 would measure the relation between values for the current month and the same month of the previous year. The value of the autocorrelation coefficient (r) ranges from +1 to -1. Here are some examples of how to interpret the autocorrelation coefficient:

• If there is a perfect relationship between today's value and last period's value (k=1), the autocorrelation coefficient would equal 1.
• If there is a strong negative relationship between the current value of the variable and the value two periods ago (k=2), then the autocorrelation coefficient would have a negative value near -1, likely less than -0.5.
• If there is little or no relationship between the current value and the past values of the variable, then any autocorrelation coefficient calculated would be close to zero.
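For readers who want to see the arithmetic, the short Python sketch below implements the standard autocorrelation coefficient calculation (the formula itself is given in Hanke & Wichern, as noted in footnote 4): rk is the sum of cross-products of deviations from the mean for observations k periods apart, divided by the total sum of squared deviations. The series used here is a made-up example, not course data.

```python
def autocorrelation(y, k):
    """Lag-k autocorrelation coefficient r_k of a time series y (a list of numbers)."""
    n = len(y)
    mean = sum(y) / n
    # Sum of cross-products of deviations for pairs of observations k periods apart
    numerator = sum((y[t] - mean) * (y[t + k] - mean) for t in range(n - k))
    # Total sum of squared deviations from the mean
    denominator = sum((y[t] - mean) ** 2 for t in range(n))
    return numerator / denominator

# A hypothetical series that drifts upward: successive values resemble each other,
# so the lag-1 coefficient is clearly positive
series = [10, 12, 13, 15, 16, 18, 21, 22]
print(round(autocorrelation(series, 1), 3))  # positive, well above zero
print(round(autocorrelation(series, 2), 3))
```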
Autocorrelation and Causation Correlation coefficients have to be interpreted very carefully. If two variables are correlated it does not mean that one variable causes the other. In statistical terms, we say "correlation does not imply causation". For example, a strong correlation may be found between house break-ins and beer sales. This does not imply that beer drinking causes one to break into a home. It could be that the variables are correlated only because both activities occur at a higher frequency in the summer months. The same caution holds for autocorrelation. If the current value of a variable is correlated to past values of the variable, this does not mean that there is a cause/effect relationship between them.

Autocorrelation Function
The autocorrelation function shows a series of autocorrelation coefficients of a time series for a selected number of lags. If the selected number of lags is 2, the function will generate 2 autocorrelation coefficients: r1 and r2. These coefficients provide clues as to the patterns in the data series. For example, they will show whether the time series has a strong correlation with its value one period earlier. Time series statistical software easily generates an autocorrelation function; the forecaster simply selects the number of lags required for the analysis. The maximum number of lags has to be two less than the sample size, since two data points are needed for a comparison to be possible. In SPSS, the default is 16 lags; i.e., SPSS will compare up to 16 periods in the past to the current value. However, the forecaster can change this default to another option.

7.8

Understanding Data Patterns and Forecasting Techniques

SPSS Instructions: Autocorrelations
You can generate the autocorrelation function using SPSS.
Choose: Analyze → Forecasting → Autocorrelations
You can then choose the variable for which you want the autocorrelation function. You also choose the maximum number of lags, under "Options". Under Display, uncheck "Partial autocorrelations".

The autocorrelation between the current period and the previous period is referred to as the autocorrelation at lag 1. Similarly, the autocorrelation between the current period and two periods ago is called the autocorrelation at lag 2. The autocorrelation function provides a quick examination of the correlation between various lags. For example, if we have quarterly data with a seasonal pattern, we would expect a high positive correlation at lag 4, as well as at lag 8. This is because every fourth period represents the same season. We may also find strong negative autocorrelation between the current season and other seasons. Later in this lesson, where we discuss seasonal data in more detail, we will show this pattern with Canadian wedding data from 1995 to 2004. The autocorrelation function will be a critical component of the next section in describing how to identify various time series data patterns. In summary, the autocorrelation function applied to a data variable can be used to answer the following questions about time series data:5
• Is the data random or does the data have a trend?
• Is the data stationary (e.g., the data mean and variance remain constant over time) or non-stationary?
• Is the data seasonal?

If the data is random, the autocorrelation coefficient for any lag period will be very low or close to 0, meaning there is no apparent relationship within the data. If the data is stationary, the autocorrelation coefficient will generally begin high for lag k = 1 and diminish to 0 rapidly for subsequent lag periods. Non-stationary data, or data exhibiting a trend, will have an autocorrelation function significantly different from 0 (positive or negative) for the first few lags, gradually becoming closer to 0 with successive lags. If the data is seasonal, a significant autocorrelation coefficient will be apparent at the seasonal time lag (e.g., lag=12 for months or lag=4 for quarters). It is important to note that data will generally exhibit more than one pattern (e.g., an increasing trend and a seasonal pattern). This is illustrated in the following discussion on time series data patterns.

Forecasting Fact: Autocorrelation is the correlation between values in a time series at time t and time t-k for a fixed lag k. A large autocorrelation number means a relationship exists in the data series at a specific lag period.

5. Hanke, John E. & Wichern, Dean W. 2004. Business Forecasting, Eighth Edition. Prentice Hall. p. 63.


Time Series Data Patterns


This lesson focuses on examining a single variable, which is the simplest of time-series datasets. The data series consists of a number of observations for a single variable at different time periods. For example, this data could be the monthly inflation rates for the past 10 years, interest rates for the past 18 months, daily stock price changes, or, for our property developer in Simple City, quarterly marriage data.

The two steps in analyzing time series data are:

(a) Graph the time series data
• The data should be graphed to visually see the type of pattern: is the series progressively increasing or is it decreasing through time?
• There are various graphing techniques available, including scatter diagrams, line graphs, or bar graphs. You can choose the visual approach that is optimal for your data.

(b) Generate an autocorrelation function
• We will next use SPSS to generate an autocorrelation function, showing its output in a table. The pattern of the autocorrelations will usually help explain the pattern of the data. The autocorrelation output will also provide you with statistical tests to determine if the autocorrelation is important (i.e., "significant" in statistical terms).

The four general types of time series data patterns are:
1. stationary (also called horizontal)
2. trend
3. seasonal
4. cyclical

These data patterns are distinguished by their graphs and by the types of autocorrelation function they generate. It should be noted that any time series may have more than one pattern. For example, a data series could have both a trend and a seasonal pattern. For this reason, data patterns are referred to as "components" of the data series. Data, such as monthly housing sales, could have both an upward trend over time and display a seasonal pattern within the year. In this case, the forecaster will need to identify the data's trend component and seasonal component, and then combine them to predict the future number of housing sales.

1. Stationary Pattern

A stationary data pattern indicates that the data series neither increases nor decreases over time. Thus, the series is "stationary" over time, with a constant mean and constant variance throughout its history. If you graph a stationary series, it will look horizontal, and its points will stay within a fixed distance around the mean. If a stationary time series is separated into two or more smaller series, the mean of each smaller series will be equal (constant). The variance of the smaller series will also be equal. In a stationary data series, past values of the data are unrelated to the current value. The data looks like random points around a constant level. Because the data appears to be a series of random points, past values of the variable cannot be used to predict future values. As an example, reconsider the discussion in Lesson 6 of Eugene Fama's "Random Walk Theory", which theorized there is no better way of predicting movements in a stock market price than looking at current values. In other words, stock market prices have a horizontal pattern over the time frame of interest to most investors, and past data do not predict future values.


The following example illustrates time series data with a stationary pattern. "Anna-Marie's Pools and Spas" is a chain of stores in Manitoba selling pools and pool supplies. Anna-Marie is considering opening a new store in Saxon, Manitoba and has approached you as an advisor. She has a number of markets she is considering for her new store and wants to carefully examine each of these markets before making her selection. She wants to know if this is a good year to open a new store in Saxon, or if she would be better advised to wait a few years. She has asked you to examine the pattern of pool sales in Saxon in past years, using data on pool permits as a proxy for sales. Table 7.1 shows this data for the last 15 years.6
Table 7.1 Pool Permits Issued in Saxon, Manitoba, 1990–2004

Year    Pool permits issued
1990    28
1991    32
1992    35
1993    46
1994    36
1995    17
1996    42
1997    4
1998    26
1999    33
2000    8
2001    38
2002    59
2003    42
2004    19

To answer Anna-Marie's question, you need to determine if this data has a trend or if it displays a cyclical pattern. You are not concerned with seasonality, as the data are annual. The first step is to graph this data over time. Open the poolpermits.sav file (available from the Online Readings section of your Course Resources webpage) and follow these steps in SPSS:
• Graphs → Legacy Dialogs → Scatter/Dot → Simple Scatter → Define.
• Select poolpermitsissued as the Y-axis variable and Year as the X-axis variable → OK.

6. Each of the data sets used in the illustrations in this lesson can be downloaded from the course website.


Figure 7.1 Number of Pool Permits Issued per Year

The data appears to be randomly scattered, although with a bit of clustering around a mean of roughly 30 permits per year. From a visual analysis, there does not appear to be any trend or cyclical pattern in this data. Your first guess is that this data is stationary. Once you have examined the graph of the data, the next step in the analysis is to generate the autocorrelation function, to see if the data is indeed random. This is done using SPSS, following the commands listed earlier in this lesson.7
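If you want to verify the SPSS output outside the program, the following Python sketch computes the same autocorrelation function and Box-Ljung (Ljung-Box) statistics for the permit counts in Table 7.1. It assumes the statsmodels library is installed; small numerical differences from Table 7.2 are possible because packages use slightly different standard error formulas.

```python
import numpy as np
from statsmodels.tsa.stattools import acf

# Pool permits issued in Saxon, Manitoba, 1990-2004 (Table 7.1)
permits = np.array([28, 32, 35, 46, 36, 17, 42, 4, 26, 33, 8, 38, 59, 42, 19],
                   dtype=float)

# acf() returns the autocorrelation coefficients; with qstat=True it also returns
# the Ljung-Box (Box-Ljung) statistic and its significance level for each lag
r, q, p = acf(permits, nlags=13, qstat=True, fft=False)

for lag in range(1, 14):
    print(f"lag {lag:2d}: r = {r[lag]:6.3f}   "
          f"Box-Ljung = {q[lag - 1]:6.3f}   Sig. = {p[lag - 1]:.3f}")
```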

Forecasting Fact: A stationary data series does not increase or decrease over time.

7. Any statistical package will calculate autocorrelation coefficients.


Table 7.2 Autocorrelations for Pool Permits
Series: Pool permits issued

Lag   Autocorrelation   Std. Error(a)   Box-Ljung Value   df   Sig.(b)
1     -.032             .234            .019              1    .892
2     -.214             .226            .916              2    .633
3      .111             .217            1.179             3    .758
4     -.244             .208            2.560             4    .634
5     -.187             .198            3.449             5    .631
6     -.064             .188            3.567             6    .735
7     -.058             .177            3.674             7    .816
8     -.039             .166            3.730             8    .881
9      .216             .153            5.705             9    .769
10     .098             .140            6.198             10   .798
11    -.043             .125            6.318             11   .851
12    -.041             .108            6.458             12   .891
13    -.015             .089            6.488             13   .927

a The underlying process assumed is independence (white noise). b Based on the asymptotic chi-square approximation.

The SPSS autocorrelation command produces the following output:
• Lag
• Autocorrelation coefficients
• Standard error "Std. Error(a)" of the autocorrelation coefficient
• Box-Ljung statistic
• Degrees of freedom of the Box-Ljung statistic
• Level of significance "Sig.(b)" of the Box-Ljung statistic

The first column in the output, "Lag", shows the selected or default number of lags for the autocorrelation function. By default, SPSS has generated 13 lags for the pool permits data set (two less than its 15 total observations). The second column, "Autocorrelation", shows the values of the autocorrelation coefficients (r1 to r13) that form the autocorrelation function. Recall that autocorrelation at lag 1 (r1) is an average of the relationship between each pair of consecutive annual permit counts throughout the entire data series. For our example, it is the average of the relationship between 2004 and 2003, 2003 and 2002, 2002 and 2001, 2001 and 2000, ..., and 1991 and 1990. Autocorrelation at lag 2 is the average of the relationship between 2004 and 2002, 2003 and 2001, 2002 and 2000, ..., and 1992 and 1990. Autocorrelation at lag 13 is the average relationship between 2004 and 1991 and between 2003 and 1990.

The pool permits autocorrelation function shows little relationship between any two data points. The autocorrelation at lag 1 shows that in any two successive years there seems to be, on average, only a very slight drop in pool permits (r1=-0.032). If we look at the function at lag 2, the number of permits again seems to be negatively related (r2=-0.214); however, at lag 3, the number of permits seems to be positively related (r3=0.111). The correlation appears to switch between positive and negative relationships over the years, apparently randomly. As well, note that all of the coefficients are very low in value, indicating a weak relationship between the numbers of pool permits issued in successive years.


Forecasting Fact: The standard error is the difference between a predicted value and the actual value for a variable. If the autocorrelation coefficient is divided by the standard error, the resulting ratio should be greater than 2 (in absolute value) for the coefficient to be significant.

The third column is the "standard error" of the autocorrelation coefficient. In statistical terms, "error" does not refer to a mistake, it refers to the "residual" or difference between a predicted value and the actual value for a variable. A low standard error will mean that the autocorrelation coefficient is significant, while a high standard error indicates a high enough degree of statistical error that the autocorrelation coefficient cannot be relied upon (statistically insignificant).

For any given lag, if the autocorrelation coefficient is significant it implies that the average relationship of the time series at that given lag is similar and consistent throughout history. This is why we can rely on the average to describe the relationship. If a value is insignificant, then the autocorrelation coefficient is an average of very different numbers. Some of the relationships may be positive while others may be negative. In other words, we have little certainty of any relationship between the data observations at that lag.

The size of the standard error should always be compared to the size of the measure it relates to. As a rule of thumb, in order to have significant autocorrelation coefficients, the standard errors should be smaller than half the size of the autocorrelation coefficients.8 In other words, for a significant autocorrelation coefficient, when you divide the coefficient by the standard error the ratio should be greater than two (in absolute value). The SPSS output for the pool permits data shows standard errors that are generally larger than the autocorrelation coefficients. This indicates that the autocorrelation coefficients for the pool permits data are not significant, and implies that over time there is no pattern in the pool permits data. The data may be random (called "white noise").

The last three columns provide information on the "Box-Ljung Statistic" (BLS) or modified Box-Pierce Q Statistic. The BLS provides a check on whether the weighted sum of squares of a sequence of autocorrelation errors is significantly different from a distribution represented by "white noise".9 In simple terms, this statistic tests the randomness of a time series of data. If the Box-Ljung Statistic is not significant at lags of approximately 1/4 of the sample size, then the series can be viewed as random. Since there are 13 lags in this example, we would verify the BLS at 3 or 4 lags to determine its significance.
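As a quick worked example of this rule of thumb, take lag 1 in Table 7.2: -0.032 divided by its standard error of 0.234 is roughly -0.14, nowhere near ±2, so the coefficient is not significant. The short sketch below simply applies the same check to every lag, with the coefficients and standard errors copied from Table 7.2.

```python
# Autocorrelation coefficients and standard errors from Table 7.2 (pool permits)
coefficients = [-0.032, -0.214, 0.111, -0.244, -0.187, -0.064, -0.058,
                -0.039, 0.216, 0.098, -0.043, -0.041, -0.015]
std_errors = [0.234, 0.226, 0.217, 0.208, 0.198, 0.188, 0.177,
              0.166, 0.153, 0.140, 0.125, 0.108, 0.089]

for lag, (r, se) in enumerate(zip(coefficients, std_errors), start=1):
    ratio = r / se
    # Rule of thumb: |coefficient / standard error| should exceed 2 to be significant
    flag = "significant" if abs(ratio) > 2 else "not significant"
    print(f"lag {lag:2d}: ratio = {ratio:5.2f} -> {flag}")
```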
Forecasting Fact: A Box-Ljung Sig.(b), or significance level, of .05 or less is desirable. This means the forecaster has less than a 5% chance of being wrong in stating that autocorrelation exists between the two time periods.

The generally acceptable cut-off for a significance level is equal to or less than 0.05 (95% confidence level, CL). If the significance level is less than 0.05, the forecaster has less than a 5 percent chance of being wrong in stating that there is autocorrelation between the two time periods. In other words, you can reasonably assume there is a positive trend in the data if the lag autocorrelation is positive and the Sig.(b) value is less than 0.05.

In our example, the Box-Ljung Statistic shows high significance levels at all lags, including those one-quarter of the way through the sample, indicating there is no significant relationship in the permits data at any lag or time interval [sig.(b) greater than 0.05]. This means that the correlation coefficients ("autocorrelations" in the second column) are not statistically significant; our conclusion is that the permits data are random over time, confirming our guess from viewing the graph.
8. A common related test is called a t-test, found by dividing the estimate by the standard error. If the t-test is greater than 2, then you are likely to have an estimate whose value is not equal to 0. In other words, if t>2, the result is usually statistically significant.

9. Wharton Business School, University of Pennsylvania, online definitions. http://armstrong.wharton.upenn.edu/dictionary/definitions/box-pierce%20test.html


The BLS significance levels definitively confirm there is no trend or cyclicality in this data. The pattern is stationary and random: white noise. We advise Anna-Marie that past pool sales are a poor predictor of future pool sales, and that she needs to identify more relevant and reliable predictors before proceeding to invest further. For more in-depth information on the Box-Ljung statistic and the related Q statistic, refer to Chapter 3 of the Business Forecasting text.
Looking Ahead: Forecasting Stationary Data
To forecast values for stationary time-series data, we need to build a model in which other variables can be used to estimate the value of the target variable; for example, rather than using past numbers of pool permits, we might use housing sales or the number of very hot days in the summer to predict pool purchases. The time series in itself does not offer any discerning patterns that can be relied upon to predict future values; i.e., the only information the series provides about possible future values is that they will vary around its mean. We will discuss how to build models with more than one variable in Lesson 8.

2. Trend Pattern

The trend of a data series is the component that causes the data to have a long-term predictable change over time. If a data series is rising over a long period of time, we say it has an upward trend. For example, housing prices have a long-term upward trend. In the 1960s, a house could be purchased in a small city for $5,000. Since then, house prices have increased more or less consistently over time, based on both demand (rising population and incomes) and supply conditions (rising general inflation and increases in construction costs). Table 7.3 shows the number of houses under construction (housing starts) in July in Toronto for the period 1994 to 2004. To access this data set, open the housingannual.sav file.
Table 7.3 Housing Under Construction in the Month of July, for the City of Toronto10

Year    Housing Under Construction
1994    12,258
1995    11,472
1996    13,346
1997    14,099
1998    15,615
1999    22,306
2000    26,404
2001    33,819
2002    35,982
2003    36,478
2004    40,185

10. Series drawn from CANSIM II, series V731173, Table number 270003, Toronto, Ontario, Housing under Construction, 1976 Census Definitions; Total Units. Monthly series from January 1972 to May 2005. Note the last ten years of data were chosen because these contain a clearly discernible trend. We will come back to this data later and see how autocorrelation is possible even if data does not trend upward over time.


The construction data indicate the number of houses under construction in Toronto increased over this decade, showing an upward trend. A graph of this data will visually reveal the data pattern. We would expect to see a graph with a line that is moving upward (positively sloped) at a fairly constant rate. Note that there may be another pattern in the series, but the trend should be the most noticeable of the patterns. Figure 7.2 shows the numbers above in a scatter graph.
Figure 7.2 Houses under Construction in Toronto in July, 1994–2004

This data has an obvious upward trend over time. Between 1994 and 1998, construction activity in Toronto was increasing slowly. From 1998 to 2001, Toronto was in the midst of a housing boom. Between 2001 and 2004, the growth in housing activity slowed somewhat again. Our next step is to calculate the autocorrelation coefficients to see if there is statistical evidence of a trend. The autocorrelation coefficients are in Table 7.4.


Table 7.4 Autocorrelations for July Housing Construction Data
Series: Housing Construction

Lag   Autocorrelation   Std. Error(a)   Box-Ljung Value   df   Sig.(b)
1      0.781            0.264            8.727            1    0.003
2      0.533            0.251           13.243            2    0.001
3      0.242            0.237           14.287            3    0.003
4     -0.070            0.221           14.388            4    0.006
5     -0.295            0.205           16.456            5    0.006
6     -0.433            0.187           21.818            6    0.001
7     -0.449            0.167           29.012            7    0.000
8     -0.377            0.145           35.782            8    0.000
9     -0.280            0.118           41.404            9    0.000

a The underlying process assumed is independence (white noise). b Based on the asymptotic chi-square approximation.

The autocorrelation coefficients show a relatively strong positive correlation within the series, especially for the first two lags. The correlation weakens as the number of lags increases and turns negative after the fourth lag. This pattern is distinctive in data with a trend. That is, the first autocorrelation coefficient is fairly high (0.781), showing that the data is highly correlated year to year. The second is somewhat lower (0.533), showing a weaker relationship for two-year intervals. The pattern continues downward until the coefficients are negative. The two most recent observations (lags 1 and 2) appear to have the strongest positive correlation with current levels, indicating a positive trend. Comparing the standard errors to the autocorrelation coefficients, it seems that the 3rd, 4th, and 5th coefficients are not significant since the errors are large.11 However, the longer lags are negative and significant. This is a typical result for a time series with a trend, where the coefficients at low lags are positive while those at long lags are negative. The Box-Ljung Statistics indicate that all lags are significant for the construction data, since the Sig.(b) is less than 0.05 for all. This confirms that the construction data exhibits a trend.

Next, we will turn to another data pattern that is commonly found in time series: the seasonal data pattern. In fact, this pattern will help us answer the question asked at the beginning of this lesson: when is the best time for the new home builder to hold open houses in order to time these with the wedding market?

3. Seasonal Pattern

Data has a seasonal component if it has a pattern of change that repeats itself every year. For example, housing sales have a seasonal pattern, with more sales in spring and fewer in winter. In Canada, there is a distinct seasonality to moving dates, with July 1 sometimes referred to as "national moving day". Seasonality is such a strong feature in most Canadian data that Statistics Canada has developed a seasonal adjustment formula, known as X11, to allow their output to show both seasonally adjusted series and raw series (not adjusted). For example, unemployment rates are usually seasonally adjusted, because without seasonal adjustments people would be appalled at the high unemployment rate every February and ecstatic at the great reduction in unemployment in June.
11. Using the rule of thumb test for significance: an autocorrelation coefficient is not significant if the result of dividing the autocorrelation coefficient by the standard error is less than 2.


Seasonal variations can depend on the weather or on other features related to the annual calendar. Sales of flowers increase in February and May, while other sales depend on the school year or Christmas. In some markets, new home sales are highest in spring and lowest in winter.12 Seasonal fluctuations will show as autocorrelation between data for the same season in each year. For example, we would expect autocorrelation in a data series between sales in December over a number of years, and for sales in January, etc. We will now examine the marriagesquarterly.sav data series, which has seasonal variations, in order to calculate the autocorrelation coefficients. Table 7.5 shows the number of marriages recorded in Canada from 1995 to 2004, on a quarterly basis (3 month intervals).13
Table 7.5 Number of Marriages in Canada, 1995–2004

Year  Quarter  Number of Marriages      Year  Quarter  Number of Marriages
1995  1        16,197                   2000  1        15,677
1995  2        42,193                   2000  2        39,627
1995  3        72,919                   2000  3        74,972
1995  4        28,942                   2000  4        27,119
1996  1        16,090                   2001  1        14,621
1996  2        43,641                   2001  2        38,676
1996  3        68,382                   2001  3        67,655
1996  4        28,578                   2001  4        25,666
1997  1        15,681                   2002  1        14,832
1997  2        41,365                   2002  2        38,197
1997  3        68,779                   2002  3        67,441
1997  4        27,481                   2002  4        26,268
1998  1        15,677                   2003  1        14,408
1998  2        39,234                   2003  2        37,890
1998  3        36,587                   2003  3        67,119
1998  4        28,323                   2003  4        25,631
1999  1        15,180                   2004  1        14,715
1999  2        39,563                   2004  2        37,904
1999  3        71,407                   2004  3        67,858
1999  4        29,592                   2004  4        25,900

12. It is important to test such statements in local market areas. For example, this may not be true in some colder climate areas in Canada. As well, some large housing projects can distort sales due to project build-time or pre-sales.

13. This data is from CANSIM II, Births, Deaths and Marriages Series: Marriages, Vital Statistics, Table 530001, V92.


This data clearly shows a seasonal pattern. The first quarter (January to March, winter) has consistently fewer marriages than any other quarter. The second quarter (April to June, spring), has the second highest number of marriages in a year. The third quarter (July to September, summer), has the most marriages in each of the 10 years of data. Finally, the fourth quarter (October to December, fall) has the second fewest number of marriages in each of the years in the data series. The seasonal pattern is even more obvious when the data is graphed (Figure 7.3). We have used a bar graph because it shows the seasonal pattern very clearly. We have used different shading patterns to represent different seasons. The seasonal pattern is clearly visible. Each winter, marriages are low. The number of marriages rises in spring, reaches a peak in summer, then declines in the fall and reaches a minimum for the year in the winter months. We will now confirm this relationship using the autocorrelation function.
Figure 7.3 Number of Marriages in Canada, 1995–2004


Table 7.6 Autocorrelations for Marriages from 1995 to 2004
Series: Number of Marriages

Lag   Autocorrelation   Std. Error(a)   Box-Ljung Value   df   Sig.(b)
1     -.061             .152            .161              1    .688
2     -.852             .150            32.235            2    .000
3     -.027             .148            32.269            3    .000
4      .894             .146            69.539            4    .000
5     -.058             .144            69.700            5    .000
6     -.762             .142            98.372            6    .000
7     -.030             .140            98.418            7    .000
8      .798             .138            131.859           8    .000
9     -.050             .136            131.993           9    .000
10    -.676             .134            157.617           10   .000
11    -.029             .131            157.666           11   .000
12     .703             .129            187.306           12   .000
13    -.043             .127            187.421           13   .000
14    -.592             .124            210.065           14   .000
15    -.022             .122            210.096           15   .000
16     .607             .120            235.862           16   .000

a The underlying process assumed is independence (white noise).
b Based on the asymptotic chi-square approximation.

We would expect the autocorrelation function to show strong positive autocorrelation between periods of the same season (winter with winter, summer with summer, etc.) and strong negative autocorrelation between periods from opposite seasons (fall with spring, summer with winter). The autocorrelation report (Table 7.6) shows exactly this pattern, with high positive autocorrelations for same seasons at the fourth, eighth, and twelfth lags. Recall that autocorrelation at lag 4 (r4) is an average of the relationship between each pair of wedding counts one year apart throughout the entire data series. For our example, it is the average of the relationship between fall 2004 and fall 2003, summer 2004 and summer 2003, spring 2004 and spring 2003, winter 2004 and winter 2003, ..., and winter 1996 and winter 1995. Autocorrelation at lag 8 is an average of the relationship between each pair of wedding counts two years apart: fall 2004 and fall 2002, summer 2004 and summer 2002, spring 2004 and spring 2002, ..., winter 1997 and winter 1995.

The report also shows that autocorrelations for opposite seasons at the second, sixth, and tenth lags are strongly negative. This is consistent with the fact that marriages are consistently high in summer and low in winter. There are negative autocorrelations at the first and third lags since we are comparing the number of weddings in different seasons. For example, the first lag compares the number of weddings one quarter apart (winter 2004 and fall 2004, fall 2004 and summer 2004, etc.), while the second lag compares weddings two quarters apart (fall with spring, summer with winter, etc.).

The Box-Ljung Statistic shows all autocorrelations as significant except at lag 1, which has a 68.8% [sig.(b) = 0.688] chance of being zero rather than negative. This is because the one period lag relationship is not consistent: from summer to fall the number of weddings decreases, while from winter to spring the number is increasing. So on average the one period lag shows no relationship.


If data are seasonal in nature, it is likely that a data provider would give you the data in a seasonally adjusted format. This would allow you to use the data to make forecasts, avoiding any impact from the seasonal relationship within the data. If you are gathering your own data, you may need to make your own seasonal adjustments. SPSS contains commands for deseasonalizing data; a sketch of one alternative approach is shown below. We will explain further how to forecast with seasonal data later in this lesson, in the section titled "Statistical Forecasting with Time Series".
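As a hedged illustration of deseasonalizing outside SPSS, the sketch below applies the statsmodels seasonal decomposition routine to the quarterly marriage series. The file name comes from the course website, but the column name ("marriages") and the loading step are assumptions about how the .sav file is organized; adjust them to match the actual variable name in the data set.

```python
import pandas as pd
from statsmodels.tsa.seasonal import seasonal_decompose

# Load the quarterly marriage counts; the column name "marriages" is assumed here
data = pd.read_spss("marriagesquarterly.sav")
marriages = pd.Series(
    data["marriages"].to_numpy(),
    index=pd.period_range("1995Q1", periods=len(data), freq="Q").to_timestamp(),
    name="marriages")

# Additive decomposition with a 4-period (quarterly) seasonal cycle
decomposition = seasonal_decompose(marriages, model="additive", period=4)

# Removing the estimated seasonal component yields a deseasonalized series, so the
# summer wedding peak no longer dominates quarter-to-quarter comparisons
deseasonalized = marriages - decomposition.seasonal
print(deseasonalized.head(8))
```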

Forecasting Fact: Data which exhibit a seasonal pattern will show strong autocorrelation at lags that are multiples of 4 for quarterly data (e.g., the seasons of the year).

4. Cyclical Pattern

Data is cyclical if it has wave-like fluctuations, either around a constant level (if the data is stationary, but with a cycle) or around the trend (e.g., a sine wave along an upward sloping line). A classic example of the cyclical data pattern is the "business cycle", where economies tend to have periods of expansion followed by recession, with each cycle taking between 10 and 15 years to complete. There are other cycles that are much shorter; for example, the lunar cycle is 28 days. To illustrate a cyclical pattern we will use the Toronto housing construction data from our trend analysis earlier in this lesson. As mentioned, certain data series may exhibit more than one pattern. The housing construction series displays both a trend over the last decade and a cycle over a longer period. To better illustrate this cycle, we will extend the housing data series back to 1972 and include all months, rather than just July. You can view this data in the housingmonthly.sav file. The first column contains the year, the second column contains the month, and the third contains the number of units of housing under construction in that period. There are 401 data points in the series.
SPSS Instructions
Choose: Analyze → Forecasting → Sequence Charts. Under "Variables", select the "Housing" variable. Under "Time Axis Labels", select "Date" → OK.

First, we will graph the data (Figure 7.4).


Figure 7.4 Housing under Construction in Toronto, Monthly, from January 1972 to May 2005

Housing under construction was high in the mid-1970s, low in the early 1980s, high again in the late 1980s, low in the mid-1990s, then high again in the last two and a half years. Not surprisingly, the early 1980s and mid-1990s were economic recessions and the late 1980s and late 1990s were booms. This data series shows a cyclical pattern over the long term; however, there appears to be a positive trend in the last decade. Housing under construction data seems to follow the pattern of the economic cycle. If the number of units under construction is a leading indicator of the strength of housing demand, a real estate developer may be wise to monitor the indicators for economic growth to understand the housing market and to forecast housing needs.14 To confirm our analysis regarding the pattern in the data series, we need to examine the autocorrelation function (Table 7.7; note that lags have been set to 50 here; the maximum number of lags can be changed using the Options button on the Autocorrelations window).

14 Of course, the developer would need to be careful that this leading indicator wasn't showing an impending oversupply in the market.


Table 7.7 Housing under Construction in Toronto, Monthly
Series: Housing

Lag   Autocorrelation   Std. Error(a)   Box-Ljung Value     df   Sig.(b)
 1         0.990            0.050            396.3            1    0.000
 2         0.979            0.050            784.8            2    0.000
 3         0.968            0.050           1165.2            3    0.000
 4         0.955            0.050           1536.6            4    0.000
 5         0.941            0.050           1898.3            5    0.000
 6         0.927            0.049           2250.1            6    0.000
 7         0.911            0.049           2590.6            7    0.000
 8         0.895            0.049           2920.1            8    0.000
 9         0.878            0.049           3237.5            9    0.000
10         0.859            0.049           3542.2           10    0.000
11         0.840            0.049           3834.4           11    0.000
12         0.820            0.049           4114.0           12    0.000
13         0.800            0.049           4380.7           13    0.000
14         0.779            0.049           4634.3           14    0.000
15         0.757            0.049           4874.4           15    0.000
16         0.735            0.049           5101.3           16    0.000
17         0.713            0.049           5315.2           17    0.000
18         0.690            0.049           5516.0           18    0.000
19         0.666            0.049           5703.8           19    0.000
 .           .                .                .               .     .
47         0.059            0.047           7428.0           47    0.000
48         0.042            0.047           7428.8           48    0.000
49         0.026            0.047           7429.1           49    0.000
50         0.009            0.047           7429.1           50    0.000

a The underlying process assumed is independence (white noise). b Based on the asymptotic chi-square approximation.

This autocorrelation table shows a slow decline in the values of the coefficients as the number of lags increases. This indicates there is positive autocorrelation in the series and that the housing activity in any month is related to the previous month's activity. This is similar to the autocorrelation function generated in the trend pattern analysis using the July data presented in Table 7.4; however, using the monthly data for a longer time period, the autocorrelation is much higher and is never negative. In this example, the strength of the autocorrelation is extremely high. The coefficient for lag 1 is 0.99, indicating that the current month's level is almost entirely determined by last month's level: in any given month the housing construction total is nearly identical to last month's total. The totals are changing, but by a very small margin. For forecasting purposes this indicates that future values will depend on the last available level. However, using this approach to forecast a cyclical time series is problematic because at some point the series will eventually change direction from increasing to decreasing. For this type of data series, where historical totals change by only a small margin, the best way to extract more information is to explore how the series moves from one period to the next and then look more closely at how this movement changes. We essentially want to examine whether the month-to-month changes are increasing or decreasing at regular intervals.


To examine these changes, we will use a method called "differencing". Differencing simply generates a new time series by subtracting the previous value from the current value, for the entire original series. In statistical terms, this is called the "first difference series". For the housing data we take the current month's value minus the previous month's value to yield a new data series containing data for all periods except the first (oldest) period in our database (because we do not have data for the month before the first period). Table 7.8 shows an example of differenced data for the first five rows of the Housing Under Construction database.
Table 7.8 Sample Differenced Data, Housing Under Construction

Date          Housing    Difference
01/01/1972    41,744
02/01/1972    39,455       -2,289
03/01/1972    38,899         -556
04/01/1972    38,438         -461
05/01/1972    39,765        1,327
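If you keep the series in a script rather than in SPSS, the first difference can be generated in one line. A minimal Python sketch using pandas, with the five values from Table 7.8 typed in directly:

```python
import pandas as pd

# First five observations from Table 7.8 (houses under construction, monthly)
housing = pd.Series(
    [41744, 39455, 38899, 38438, 39765],
    index=pd.period_range("1972-01", periods=5, freq="M"),
)

# First difference: current value minus previous value; the first period has no predecessor
diff = housing.diff()
print(diff)
# 1972-02: -2289, 1972-03: -556, 1972-04: -461, 1972-05: 1327 (1972-01 is NaN),
# matching the Difference column of Table 7.8.
```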

Differencing is a common technique used in forecasting to remove trend characteristics from a dataset. This method effectively transforms a trending, cyclical, or seasonal data series into a stationary one. Most time series data are closely related to the previous period's value and, therefore, better information can often be obtained by examining these relative changes rather than the total values. When time series data changes by small margins from period to period, the best approach is to explore how the data moves (e.g., the rate of change). There are many examples of real estate data where a trend is evident. For example, house prices are not random: they depend highly on market activity and the expectations of market participants. A competitive and efficient market leads to some degree of consistency in pricing. When a new house is put on the market, the owner does not have complete freedom in pricing, because they must set it at a price that is likely to attract a purchaser. A forecaster looking at residential property values will be interested in the monthly increase in values. Are the values of similar homes increasing on average by $2,000 every month? Are the values increasing at a diminishing rate or are they accelerating upward? The "difference" between this month's price and last month's price is what is captured in a differenced data series.
Forecasting Fact: Differencing Data
Differencing data removes a great deal of autocorrelation and provides the forecaster with information on changes in the movement of data over time. Differencing data is needed when forecasting two data patterns:
1. Data with a trend.
2. Data with a strong autocorrelation component at lag 1 (above 0.90), where the autocorrelation at subsequent lags diminishes slowly.


Another example where a differenced series helps interpret real estate values is when comparing house price increases in different markets. If an analyst wants to know which real estate market has faster-rising prices, Vancouver or Sooke (a small town on Vancouver Island), it is difficult to compare the absolute values, because Vancouver's real estate values start so much higher, and the percentage changes can therefore be misleading. Instead, if you looked at how each month's average price differs from the previous month's average price, you would get a sense of how price changes are unfolding in absolute terms. Once the differenced data is generated (Table 7.8), the two steps for analyzing time series patterns may be undertaken: graphing the data and generating the autocorrelation function.
SPSS Instructions: Creating a Time Series
You can generate a new time series using SPSS. Choose: Transform → Create Time Series. Select Difference in the Function box. Next, select the variable for which you want to generate a differenced time series and move it to the New variable(s) box. SPSS will automatically generate a name for the time series variable. Select OK.

Figure 7.5 shows the differenced data widely scattered with an average of near zero, meaning that the difference between one month's data and the previous month's data could be positive or negative (no apparent trend). The data in the graph appears to be fairly random month-to-month. However, there does appear to be a slight wave pattern to the data series.
Figure 7.5 Differenced Data


Table 7.9 shows the autocorrelation function for the differenced data. The table has the same layout as the one generated for the non-differenced series, but the autocorrelations are much weaker. The function shows that the autocorrelation coefficients on the first two lags, 0.258 and 0.156, are significant (the coefficients are more than double their standard errors). The Box-Ljung statistics (used to test the randomness of a time series) tell us that the difference series is not random. These numbers show a weak but significant autocorrelation in the series.
Table 7.9 Housing under Construction in Toronto, Monthly Difference
Series: Housing Under Construction (Monthly Difference)

Lag   Autocorrelation   Std. Error(a)   Box-Ljung Value     df   Sig.(b)
 1         0.258            0.050            26.912           1    0.000
 2         0.156            0.050            36.751           2    0.000
 3         0.061            0.050            38.249           3    0.000
 4         0.051            0.050            39.320           4    0.000
 5         0.017            0.050            39.436           5    0.000
 6         0.091            0.049            42.821           6    0.000
 7         0.050            0.049            43.844           7    0.000
 8         0.073            0.049            46.011           8    0.000
 9         0.096            0.049            49.822           9    0.000
10         0.072            0.049            51.940          10    0.000
 .           .                .                .               .     .
47         0.011            0.047           122.688          47    0.000
48         0.079            0.047           125.571          48    0.000
49         0.035            0.047           126.136          49    0.000
50        -0.001            0.047           126.137          50    0.000

a The underlying process assumed is independence (white noise). b Based on the asymptotic chi-square approximation.

Forecasting Fact
Differencing is a good technique for providing information on changes in the movement of data over time: acceleration, deceleration, or other patterns. However, multiple regression may be more suitable for analyzing cyclical data.

As stated earlier, forecasting cyclical data is problematic since it is difficult to capture a downturn or upturn in the series. For cyclical data series, the best forecasting approach involves multiple regression analysis, incorporating information from other variables. In our example, it seems the housing construction activity follows the business cycle pattern. Therefore, indicators of economic activity may be good candidates for modeling housing construction.

Summary: Recognizing Time Series Data Patterns


Recognizing data patterns requires both an in-depth understanding of the main data patterns and practice. Forecasting practitioners need to develop a keen eye for repetitive patterns, as these are the key to forecasting the future. Whether you are working with data on housing sales, office vacancy rates, retail absorption, or government incentives to encourage development, you must find the patterns in the data in order to make forecasts. Once you identify the data pattern, you can model the data series and perform a forecast. We have covered the four most common data patterns, but we have not covered the additional complexity of combinations of data patterns that can occur in some data. We have also not covered the issue of non-constant variance; this is discussed in Appendix 1 at the end of this lesson.


Having described patterns in historic time series data, we will now shift the focus of the lesson to forecasting the various time series into the future. In the remainder of this lesson, note that we are limiting our coverage to using past data for only one variable in order to forecast future values of that one variable's series. This simple forecasting is often adequate for many forecasting needs. In Lesson 8 we will examine the use of multiple variables.

Statistical Forecasting Using Time Series Data


Forecasting using time series data builds directly on the patterns we have examined so far in this lesson. In the following section, we will lead you through a number of techniques to forecast time series data for these data patterns.

How to Forecast Using Averaging and Patterns


The forecasting methods most commonly applied to real estate data are:
1. The naïve forecast
2. The average forecast
3. The moving average forecast
4. The exponential smoothing technique
5. The autoregressive integrated moving average forecast (ARIMA)

In the following pages, we will review each method in turn, illustrating how to perform a simple forecast using real estate data. We will also review the advantages and disadvantages of each forecasting method. SPSS offers time series analysis commands, but in order to use these the data must first be identified as a time series. This tells SPSS that the data is not cross-sectional, and allows SPSS to carry out some unique forecasting procedures. You specify the type of data, e.g., weekly, monthly, or quarterly, and the program assigns a periodic value based on the number of periods in the cycle. For example, a one year cycle has twelve monthly data points (periodicity = 12) or four quarterly data points (periodicity = 4). By assigning data periodicity, you can then use SPSS' seasonal adjustment options in forecast commands.
SPSS Instructions: Assigning Periodicity
To assign periodicity to data in SPSS:
Choose Transform → Date and Time Wizard → Assign periodicity to a dataset → Finish.
Select the appropriate periodicity for the data. As an example, for monthly data, choose: Years, months. In the "First Case Is" boxes, enter the first year and month of the data series. Once entered, the program will recognize that the data is monthly, and will assign a periodicity of 12 (12 periods per year). Click OK.
Once the periodicity of the data is assigned, you can perform forecasts using the Analyze command. We will review these commands later in this section.
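If you work in Python rather than SPSS, a loose analogue to assigning periodicity is attaching a dated index with a known frequency to the series. A minimal pandas sketch with invented values (the series name and numbers are placeholders):

```python
import pandas as pd

# Twelve invented monthly observations; freq="MS" marks them as month-start dates
index = pd.date_range(start="2004-01-01", periods=12, freq="MS")
sales = pd.Series([20, 22, 25, 30, 34, 40, 42, 39, 33, 28, 24, 21], index=index)

print(sales.index.freqstr)   # "MS": the series now carries its monthly periodicity
```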


1. The Naïve Forecast

The naïve forecast assumes that the best predictor of future values is the most recent data available. This can sometimes be useful for a very short-term forecast of an event for which there is not much historical data. An example would be a demand forecast for a new concept in housing or a new construction material that has been on the market for a very short time. The simplest naïve forecast estimates the value for a specific time period using the previous period's value, so the first forecast value is equal to the last available value. In formula terms, this would look like the following:

$\hat{y}_{t+1} = y_t$

where:
• the letter y is used to represent the data for the variable under study;
• the subscript t represents the time period being estimated, and t+1 means an estimate one period into the future; and
• the hat (^) indicates the term is an estimate of y and not the actual data.

A naïve forecast can be used to predict only one period ahead, or, if it is used for more than one period, it assumes that all future periods will be identical to the last period for which data was available. Furthermore, because it only uses one period's information, the part of the value of yt that is due to random error is treated as if it were a permanent feature of the variable. Both of these are weaknesses of the naïve technique: it basically ignores all past information beyond one period. As well, naïve forecasts assume only the variable being forecasted is significant, with all other variables that might influence the forecast excluded from the analysis.15 In other words, it is assumed that the forecast is not affected by the environment (e.g., housing starts are not affected by interest rates). For the Pool Permit data presented in Table 7.1, the number of permits issued in 2004 was 19. A naïve forecast would therefore predict 19 permits issued each year after 2004.
The naïve technique is introduced here because it is a good way to start thinking about forecasting. A naïve forecast is appropriate when a very quick, back-of-the-envelope type forecast is required and is of "good enough" quality. This forecast is only suitable for a very short time period, such as one day, week, month, or quarter, or at most a year. This approach should rarely be used in practice; in most cases, even in simple forecasts, a forecaster will at least use an averaging technique, as shown in the next section. Any research that relies on the naïve model as a main forecasting tool should be questioned.

A "second naïve forecast" uses information from two periods of data. This would be used if the data exhibit a trend pattern. In this case, the formula for the naïve forecast would be:

$\hat{y}_{t+1} = y_t + (y_t - y_{t-1})$

15 Management 105: Business Forecasting, School of Business Administration, California State University Sacramento (excerpt from PowerPoint presentation), http://www.csus.edu/indiv/j/jensena/mgmt105/naive_01.pps


The forecast is adjusted by taking into account the change from yt-1 to yt (the "first difference"). The forecast essentially equals the last value plus the change that occurred between the last two values. If there was an increase in value between the last two data points, this level change is assumed to be permanent and is therefore added to future periods. We will now examine an example of a second naïve forecast using data on July Housing Under Construction (Table 7.10). A second naïve model requires only the last two data points, in this case 2004 (t) and 2003 (t-1).
Table 7.10 Housing Under Construction in the Month of July, for the City of Toronto

Year    Housing Under Construction (Variable Name: HouseCon)
2003    36,478
2004    40,185

The forecast for HouseCon would be:

HouseCon2005 = HouseCon2004 + (HouseCon2004 - HouseCon2003)
or
HouseCon2005 = 40,185 + (40,185 - 36,478) = 43,892

Based on this second naïve forecast, the forecaster would predict that in July 2005 there would be 43,892 houses under construction in Toronto. To predict the 2006 number, the forecaster would use the 2005 forecast for yt and the 2004 data for yt-1.
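The arithmetic is simple enough to script. A minimal Python sketch of the one-step second naïve forecast, reproducing the 43,892 figure above (the variable names are illustrative; the two July values are typed in directly):

```python
def second_naive(y_t, y_prev):
    """Second naive forecast: last value plus the most recent change."""
    return y_t + (y_t - y_prev)

house_2003, house_2004 = 36_478, 40_185           # July housing under construction, Toronto
forecast_2005 = second_naive(house_2004, house_2003)
print(forecast_2005)                               # 43892
```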
Forecasting Exercise: Naïve Forecast
Produce a naïve forecast for the year 2004 for the Housing Under Construction data, using data from 2002 and 2003.
Answer:
HouseCon2004 = HouseCon2003 + (HouseCon2003 - HouseCon2002)
HouseCon2004 = 36,478 + (36,478 - 35,982) = 36,974

This type of naïve forecast can also be adjusted for seasonality. If the data are seasonal, the forecaster can use last year's data for the same season to forecast the same season's data for this year and future years. This is still fairly simple and uses very little of the information in the data. These forecasts are evaluated by simply comparing the forecast outcome with the corresponding actual outcomes. To determine the best type of forecast to use, a forecaster must examine the data pattern to see which method is optimal; however, there are very few situations in which either of these simplistic naïve techniques would be optimal.
Forecasting Fact
Despite its simplicity, the naïve forecast is one of the most commonly used forecasting techniques in the business community.


2. Average Forecast

The averaging method uses the mean of all relevant observations to forecast the next period's value. It is best applied to data where the process generating the data is stationary. For example, for an assessor inspecting residential properties, the variation in the number of homes inspected weekly is small and seemingly random; therefore, the averaging method could be used to forecast the weekly number of home inspections an assessor will make. It is very simple, and can be updated each period by re-calculating the average. In formula terms, given 10 periods of data, the forecast for the 11th period would be:

$\hat{y}_{11} = \frac{1}{10}\sum_{t=1}^{10} y_t$

In other words, you would compile data from 10 weeks of inspections, take the average, and use this as an estimate of the 11th week. The data is stationary, in that there is no discernible trend (the inspector is not getting any faster or slower, there are no seasonal influences, etc.). Applying this technique to the Pool Permits example, we will use all 15 data points to forecast the year 2005:

$\text{PoolPermits}_{2005} = \frac{1}{15}\sum_{t=1990}^{2004}\text{PoolPermits}_t = \frac{28+32+35+46+36+17+42+4+26+33+8+38+59+42+19}{15} = 31$

Under this method, the forecast for all future values of pool permits is 31. That is, without any new information, the estimate for PoolPermits2006 = 31, PoolPermits2007 = 31, and even PoolPermits2050 = 31. If the data are actually stationary (random variations around a mean) and no other information is available to forecast pool permits, then an average of the past gives an excellent prediction of the future. This may, in fact, be the best forecast of pool permits available. However, like the naïve model, the basic average is rarely used in professional forecasting. It is only useful for a variable that is stable and has no trend or other time-varying pattern, and it is rare for data to actually be this stable.
Forecasting Exercise: Average Forecast
Using the data from Table 7.3, Housing under Construction in July, produce an average forecast for 2004 using the data from 1994 to 2003. Compare this to the actual data for 2004. Does the forecast, in fact, underestimate the data?
Answer:
• The 2004 forecast for housing under construction is 22,178 (the average housing under construction from 1994 to 2003 is 22,177.9).
• The actual housing under construction for 2004 is 40,185.
• The forecast has underestimated the level by 18,007 units.
• The average is only useful for a stationary variable that moves randomly. If the data has an upward trend, the averaging technique will systematically underestimate the forecast value.


3. Moving Average Forecast

A moving average is very similar to the basic averaging technique, with one exception: for a moving average only a specific number of data points are used to calculate the average. The average is recalculated each period by dropping the oldest data point and adding a new one. The number of data points used to calculate the moving average is called the span of the moving average. For example, using monthly data, a moving average of span 4 would take the average of the past four months of data and use this average to forecast next month's value. This is a procedure that can be easily done in SPSS, or even a spreadsheet program such as Microsoft Excel.
Excel Instructions: Moving Average
• Move to the column to the right of your data.
• Calculate the first average (of span 4) next to the row for the fifth value.
• Then copy the formula down to obtain moving averages for the entire data column.
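The same calculation can be scripted. A minimal pandas sketch, using a handful of illustrative values (placeholders, not the housingmonthly.sav series); the span-4 average of the four most recent observations is shifted forward one period so it acts as the forecast of the next value, as in the Excel layout above:

```python
import pandas as pd

# Illustrative monthly values (placeholders only)
housing = pd.Series([41744, 39455, 38899, 38438, 39765, 40120, 40510])

# Prior moving average of span 4: average of the previous four observations,
# shifted one period forward so each row holds the forecast for that period
movav4 = housing.rolling(window=4).mean().shift(1)
print(pd.DataFrame({"actual": housing, "movav4_forecast": movav4}))
# The first forecast appears at the fifth row, matching the Excel instructions.
```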

The moving average is not capable of forecasting seasonal variation and is somewhat weak in forecasting data with a trend. If the trend is consistent, that is, if data are continuously rising, then the moving average will underestimate future values, but not by as much as the simple average would. A moving average is somewhat useful for forecasting cyclical variables, or variables that have a pattern that varies. It is normally used for smoothing data that has great variation. An example of a time series with a great deal of variation is stock prices. Despite concerns raised by Eugene Fama that stock market prices cannot be predicted any better than by using today's price, many stock market analysts use moving averages to track and forecast stock prices. For example, Globeinvestor.com and other investor websites offer moving averages of stocks as a tool for understanding and possibly purchasing stocks.

A Moving Average Example from Vancouver The City of Vancouver provides an interesting example of the use of moving averages. Vancouver's property assessments for land are based on a three-year moving average, in an effort to "smooth out" dramatic shifts in tax burden resulting from volatile sub-market prices. In other words, if property values on the west-side increase by 30% while those on the east-side increase by only 10%, then applying the same tax rate to both would result in substantial tax increases for west-side property owners. Can you see a problem with this approach? What might happen if land values in Vancouver fall, when the average "smoothed" tax value may actually exceed market value?


SPSS Instructions: Creating a Moving Average
Open the housingmonthly.sav data set. Choose: Transform → Create Time Series. Select the variable to be averaged (in this case, Housing), and move it to the right using the arrow key. With the formula under "New Variable" highlighted, select the Function: Prior Moving Average. In the Name box, change the name to one that will provide you with better information than the default name (e.g., "movav4"). In Span, enter the desired span (e.g., with monthly periodicity, setting this to 4 implies quarters). Press Change to ensure that all of your selections are recorded in the formula. Re-read the function under New Variable(s) to confirm that your new variable name has been recorded. Press OK to transform the new variable. The SPSS output will show you that the variable was transformed. Returning to the data screen, you will now see a new column of data for the new variable created. To view a graph of this data, use the following command: Choose: Analyze → Forecasting → Sequence Charts. Under "Variables", select the moving average variable (e.g., "movav4") and, optionally, the original variable for reference (e.g., "Housing"). Under "Time Axis Labels", select "Date" → OK.

Forecasting Exercise: Moving Average
Using the data for housing under construction from housingmonthly.sav, create a 4-period moving average and a 6-period moving average forecast for June 2005.
Answer:
• The forecast using the 4-period moving average for June 2005 is 39,414 units.
• The forecast using the 6-period moving average for June 2005 is 40,071 units.

For the Toronto Housing Under Construction data, we have generated the following two graphs in SPSS: Figure 7.6 shows a moving average of span 4; Figure 7.7 shows a moving average of span 12. In both graphs, the black line is the actual data and the dashed line is the moving average.


Figure 7.6 Moving Average Span 4

Figure 7.7 Moving Average Span 12


The span 4 moving average line displays many of the same shifts in direction seen in the raw data. However, note that these changes follow the changes in the actual data, and do not predict them (notice how the dashed line appears one notch to the right of the data line at all points). The span 12 line is much smoother and misses many of the short-term changes in direction completely. In other words, the span 12 tends to miss the turning point, but follows closely through a trend phase.
Forecasting Exercise: Graphing Moving Averages As an exercise, reproduce the graphs in Figures 7.6 and 7.7 for a 4, 6, and 12 period moving average.

Forecasting Fact
A moving average is useful for reducing the volatility of a data series, even data which has already been seasonally adjusted.

Moving averages are used to smooth a data series. This technique is therefore useful when data is highly volatile, but recent trends provide the best information on likely future values. However, this method is usually considered inferior to exponential smoothing, which we describe next.

4. Exponential Smoothing

When forecast techniques build in weights as well as averaging, they are referred to as smoothing techniques. In the moving average technique discussed above, the forecast is generated by using only information within the span and with each data point in the span having equal weight or influence on the forecast. All other historical information is ignored. Exponential smoothing or weighted moving average is similar to the moving average technique, except the forecast applies the greatest weight to the variable's most recent value. This technique is again appropriate for data without a trend.
Forecasting Fact Exponential smoothing is a popular technique for producing a "smoothed" time series since the most recent data (implicitly assumed to be most reliable) is given the most weight.

The goal in exponential smoothing is to generate a model that will best predict the current level and then use this current level prediction to forecast future levels. Considering the data series history, the best model is built by adjusting the weights and selecting the model that will minimize the forecast error. This technique continuously updates the previous forecast as new information is available.

The formula for the exponential smoothing forecast would be:

$\hat{y}_{t+1} = \alpha y_t + (1-\alpha)\hat{y}_t$

where:
• $\hat{y}_{t+1}$ is the forecast for the next period (this is sometimes shown as $S_t$, the smoothed observation);
• $\alpha$ is the smoothing weight ($0 < \alpha < 1$);
• $y_t$ is the observation for the current period; and
• $\hat{y}_t$ is the previous forecast for the current period.

This formula can be extended to allow for forecasts based on any number of previous time periods, as follows:

$\hat{y}_{t+1} = \alpha y_t + \alpha(1-\alpha)y_{t-1} + \alpha(1-\alpha)^2 y_{t-2} + \alpha(1-\alpha)^3 y_{t-3} + \alpha(1-\alpha)^4 y_{t-4} + \cdots$
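To make the recursion concrete, here is a minimal Python sketch of simple exponential smoothing. The data values and the smoothing weight of 0.3 are assumptions for illustration; in practice the weight would be chosen to minimize forecast error, as discussed below.

```python
def exponential_smoothing(series, alpha):
    """Apply the recursion f_next = alpha*y + (1-alpha)*f_previous through the series;
    the final element returned is the smoothed forecast for the next, unobserved period."""
    forecast = series[0]               # common convention: initialize with the first observation
    forecasts = [forecast]
    for y in series[1:]:
        forecast = alpha * y + (1 - alpha) * forecast   # blend newest observation with prior forecast
        forecasts.append(forecast)
    return forecasts

data = [120, 132, 125, 140, 138, 150]   # placeholder series with no obvious trend
print(exponential_smoothing(data, alpha=0.3))
```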


The next period forecast is generated as an exponentially smoothed value of all the past values. If the smoothing weight ($\alpha$) is close to 1, older observations are heavily discounted and the forecast responds quickly to the most recent value. If the weight is close to 0, recent observations receive relatively little weight and the next period forecast will be similar to the previous forecast of the current period. The trick to choosing the best $\alpha$ value is to select the value which results in the smallest mean squared error (the sum of the squared errors divided by the sample size, for predicted versus actual values). Hanke and Wichern discuss three exponential smoothing techniques:
• Exponential Smoothing
• Exponential Smoothing Adjusted for Trend: Holt's Method
• Exponential Smoothing Adjusted for Trend and Seasonal Variation: Winters' Method

According to Hanke and Wichern, exponential smoothing is better suited to stationary time series data, Holt's method is appropriate for trend data, while Winters' method works well with data that exhibit both trend and seasonal variation. The equations for the Holt and Winters' methods can become quite complicated, so they are not reproduced here. For details on the model equations and examples of their application, please refer to Chapter 4 of the Hanke and Wichern text. All three of these methods are available in statistical software such as SPSS. Because of its complexity, a detailed exponential smoothing example has not been provided here. You may wish to read Appendix 2 at the end of this lesson to see how an exponential smoothing model would be applied to forecast housing starts for our Toronto database.

5. Autoregressive Integrated Moving Average (ARIMA)

The fifth and final approach we will cover for generating a forecast using averaging and patterns is the autoregressive integrated moving average (ARIMA). The ARIMA, also known as the Box-Jenkins approach, was developed for "univariate" time series, meaning a time series which consists of a sequence of single observations recorded at fixed time intervals. According to Hanke and Wichern, the key difference between the ARIMA methodology and other regression models is that ARIMA models do not use independent variables; ARIMA models use the information about the patterns in the time series data.16 This approach provides accurate forecasts for all patterns and combines three major forecasting techniques:
• Autoregression (AR)
• Differencing (called integration) (I)
• Moving Average (MA)

AR: An autoregressive analysis is a linear regression of a variable against its own past values. The regression equation for an autoregressive process would look like:

$Y_t = \alpha + \beta Y_{t-1} + \varepsilon_t$

To run a regression model, the data on Yt (the dependent variable at time t) and Yt-1 (the dependent variable at various time lags)17 are used to estimate values for the parameters, $\alpha$ and $\beta$, and the series of residuals, $\varepsilon_t$. Once the parameter estimates are found, the forecast for future values (e.g., Yt+1) can be calculated.
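Because the AR(1) equation is simply a regression of the series on its own first lag, it can be estimated with ordinary least squares. A minimal Python sketch with a placeholder series (here the estimated intercept and slope play the roles of $\alpha$ and $\beta$ above; they are not the smoothing weight of the previous section):

```python
import numpy as np

# Placeholder, roughly stationary series (not the Toronto data)
y = np.array([102.0, 98.0, 105.0, 110.0, 107.0, 112.0, 109.0, 115.0, 113.0, 118.0])

y_curr, y_lag = y[1:], y[:-1]                  # pair each value with its previous value
beta, slope_intercept = np.polyfit(y_lag, y_curr, 1), None
beta, alpha = np.polyfit(y_lag, y_curr, 1)     # OLS slope (beta) and intercept (alpha)
forecast_next = alpha + beta * y[-1]           # one-step-ahead forecast from the last observation
print(round(alpha, 2), round(beta, 2), round(forecast_next, 1))
```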

16 Hanke, John E. & Wichern, Dean W. 2004. Business Forecasting, Eighth Edition. Prentice Hall.
17 In this case, the dependent variable Y is playing the role of an independent variable in the regression equation.


An autoregressive model works well for stationary time series data. However, as we learn below, when the data is non-stationary we require even more sophisticated approaches.

I: Differencing (or integration) allows the forecast model to use more information about the movement of the data than is possible when analyzing the raw data. Non-stationary (trend) data generally require differencing to remove the trend, or in other words, to convert the data to a stationary series. For a differenced model, we would regress the differences between two consecutive values of Y against past differences. The regression equation would look like:

$(Y_t - Y_{t-1}) = \alpha + \beta(Y_{t-1} - Y_{t-2}) + \varepsilon_t$

For both of these equations, the expected value of the residuals is zero. Therefore, future forecasts do not depend on any forecast of the residuals, $\varepsilon_t$.

MA: Finally, an ARIMA with a moving average component includes a forecast of the residuals using the moving average approach. If there is a pattern in the errors, that pattern is incorporated into the forecast as a moving average. According to the US National Institute of Standards and Technology:

A moving average model is conceptually a linear regression of the current value of the series against the white noise or random shocks of one or more prior values of the series. The random shocks at each point are assumed to come from the same distribution, typically a normal distribution, with location at zero and constant scale. The distinction in this model is that these random shocks are propagated to future values of the time series. Fitting the MA estimates is more complicated than with AR models because the error terms are not observable. This means that iterative non-linear fitting procedures need to be used in place of linear least squares. MA models also have a less obvious interpretation than AR models.18
Forecasting Fact
ARIMA combines the autocorrelation function, data differencing, and moving averages. It produces highly accurate forecasting results, but is complex to apply and interpret; e.g., it could be problematic to explain the results to a client.

The ARIMA model can handle each of the data patterns described in this lesson, including stationary or non-stationary (i.e., data without/with a trend) and seasonal data. The ARIMA process is iterative: the forecaster first chooses parameters for each of the autoregressive, integration, and moving average components; second, the model coefficients are estimated; and third, the results are evaluated. Finally, this process is repeated until the desired results are acquired.

There are three steps in the development of an ARIMA model:
1. Model Identification
2. Model Estimation
3. Model Validation

18 National Institute of Standards and Technology. 2006. Engineering Statistics Handbook. US Dept of Commerce. Chapter 6.4.4, Univariate Time Series Models, http://www.itl.nist.gov/div898/handbook/pmc/section4/pmc446.htm


In step 1, the researcher's goal is to detect any seasonality that may exist, and to identify the order for the seasonal autoregressive and seasonal moving average terms. If the period is known, for example, monthly, then a seasonal AR 12 term or a seasonal MA 12 term would be adopted.19 In step 2, the researcher estimates the order of the autoregressive (AR) and moving average (MA) terms. The ARIMA command appears as ARIMA(p,d,q), where:
• p is the order of the autoregressive part (AR);
• d is the amount of differencing (integration, I); and
• q is the order of the moving average (MA).

Each of the parameters p, d, and q will be explained briefly below, and in more detail in Appendix 3, where a more comprehensive ARIMA example is presented. The forecaster must decide the values of p, d, and q that are best for a given data series. The choice of the parameters will depend on the patterns in the autocorrelation functions (ACF) and the partial autocorrelation functions (PACF). Estimating the parameters for ARIMA models is extremely complex, but the task is made easier with statistical software programs. You have already examined autocorrelation functions for each of the data patterns discussed earlier in this lesson; they show the correlation between values at different lags. Autocorrelation can occur either because data between two periods are directly related to each other or because data between two periods are related to each other through a "chain effect". That is, if one period's value depends strongly on the previous period's value, then the autocorrelation could be high for a number of lags. For example, in the monthly data each month's housing starts depend greatly on the previous month's housing starts, and we see (in Table 7.7) that there is autocorrelation for many lags. The partial autocorrelation function (PACF) is similar to the autocorrelation function, but controls for the chain effect: the partial autocorrelation table shows only the direct correlations between periods. If the only direct correlation is between two consecutive periods, the partial autocorrelation table would show a spike at the first lag, due to the direct effect of the previous period's value on the current period's value of housing under construction, but no other direct effects, as all other autocorrelations are due to the chain effect of the extremely strong one-period correlation. This will be illustrated below.

Tails and Spikes

To choose the p, d, and q parameters, we check the autocorrelation and partial autocorrelation functions to see if they have either a "tail" or one or more "spikes". A tail refers to a pattern where the first value is high and then the subsequent values fall slowly (like a dog's tail). A spike is one peak with only low values surrounding it. Figure 7.8 shows a function with a "tail" on the left and a function with a "spike" on the right. The presence of tails and spikes is key to the choice of ARIMA model.20 The basic rule is this:
• If the autocorrelation function has a tail and the partial autocorrelation function has one or more spikes, then the data can be best forecast using an autoregressive model, ARIMA(p,0,0). The number for the autoregressive parameter (p) is equal to the number of spikes in the partial autocorrelation function.
• If the autocorrelation function has one or more spikes, then the data can be forecast using a moving average model, ARIMA(0,0,q). The number for the moving average parameter (q) is equal to the number of spikes in the autocorrelation function.

19 Ibid.
20 In Chapter 9, Hanke and Wichern present a series of graphs showing patterns of tails and spikes and the parameter choices for each.


However, the basic rule does not clarify where differencing is needed, or what to do when both the autocorrelation and partial autocorrelation functions have tails. Let us now consider these two cases. Data should be differenced in two situations:
• If the data has a trend.
• If the data is autoregressive and the first autocorrelation is above 0.95.

Figure 7.8 Functions with Tails and Spikes

If the data should be differenced, then you would examine the autocorrelation and partial autocorrelation functions of the differenced data and apply the same basic rule for determining whether the differenced data is best forecast using an autoregressive or moving average model. With differenced data, an autoregressive model would be an ARIMA(p,1,0). A moving average model would be ARIMA(0,1,q). We have entered the difference parameter as "1" because it is extremely rare for data to require more than one order of differencing. The final case is where, for either the data or the differenced data, both the autocorrelation and partial autocorrelation functions have tails. In this case, you would set both p=1 and q=1, test the model, then increase each of the parameters (p and q) in turn, until you have found the most accurate model. Measures to test the accuracy of the models are the subject of the last section in this lesson. The ARIMA method is one of the more complicated methods used to forecast data. It is popular because it can forecast for all data patterns, and because it is often one of the most accurate forecasting tools available. Because the method uses three modeling techniques (autoregression, differencing and moving average) to produce a forecast, it can be more accurate than many models for longer-term forecasts. Its negative is that the modeling process can be difficult to explain to decision-makers. For this reason, it may sometimes be overlooked in favour of simpler models, especially for short-term forecasts. Because of its complexity, a detailed ARIMA example has not been provided here. You may wish to read Appendix 3 at the end of this lesson to see how an ARIMA model would be applied to forecast housing starts for our Toronto database.
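In practice, most analysts fit ARIMA models with a statistics package rather than by hand. As one illustration outside SPSS, the sketch below uses the statsmodels library in Python (assumed to be installed); the ARIMA(1,1,0) order and the placeholder data are assumptions chosen for the example, not a recommendation for the Toronto series.

```python
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

# Placeholder upward-trending series; in a real study this would be the actual data
y = np.array([100, 104, 109, 112, 118, 121, 127, 133, 136, 142,
              147, 151, 158, 163, 169, 172, 180, 185, 191, 196], dtype=float)

# order=(p, d, q): one autoregressive term, one order of differencing, no MA term
model = ARIMA(y, order=(1, 1, 0))
result = model.fit()

print(result.summary())          # coefficient estimates and fit statistics
print(result.forecast(steps=3))  # forecasts for the next three periods
```

The same library also provides plot_acf and plot_pacf (in statsmodels.graphics.tsaplots), which can be used to inspect the tails and spikes described above when choosing p, d, and q.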


Summary: Forecasting Using Averaging and Patterns


We have now examined five options for forecasting based on past values of one data series. The five techniques vary in their complexity and in the usefulness of the results. The simpler methods are easy to apply, but have lessened accuracy. The more complex methods offer better quality and more generalizable results, but at the cost of difficulty in analysis, both for the analyst and for the client, in terms of understanding what actually went into making the forecast and how reliable it actually may be (i.e., if clients do not understand the mechanisms underlying the forecast or the measures in the output, it is difficult to critique the results). In the next lesson, we will expand our repertoire beyond single variable data series, to look at forecasting with data that involves multiple variables. We will complete this lesson by discussing how to choose between forecasting techniques.

How to Choose the Best Forecast


So far in this lesson, we have introduced data patterns and described how you can use autocorrelation tables to identify some of these patterns. We have also described a number of different methods for forecasting time series data. Now we will examine how to evaluate the forecasting techniques to determine which will provide the best forecast. The first step in choosing a forecast is to ensure that the forecaster has properly modeled the data series and eliminated all non-stationarity. This is done by obtaining an autocorrelation function of the errors produced by the forecast method (note that errors refers to the difference between the model's estimated values and the actual observed values). If the errors exhibit no autocorrelation, then the forecaster can start to examine the quality of the forecast. We will provide an example of this step below.
Forecasting Tip: Examine the Errors
The first step in choosing a forecast is to check the data patterns, autocorrelations, and partial autocorrelations, using these to identify the best forecasting method. Once a forecast is completed, you should repeat this process with the errors of the forecast to ensure that these are "white noise". The error analysis can also identify if there are problems of non-constant variance or heteroskedasticity; this is explained in Appendix 1.

Given a set of time-series data, a forecaster chooses a forecasting technique based on the patterns in the data. For example, if a strong trend is exhibited, then an ARIMA (0,1,0) might be chosen; or if the data is white noise, then an averaging method may be most appropriate (instead of ARIMA). Next, the forecaster separates the data series into two sections. The first section, called the initialization period or fitting period, contains the bulk of the data and is used to create the forecast. The second section, called the test or forecasting period, sets aside data (often the last 2 to 10 data points) for final testing of the forecasting techniques. This helps determine which may be the best technique. Two separate stages of tests are carried out on a forecast. The first stage tests "in-sample" data. In this stage, we first check to see if the forecasting method has eliminated autocorrelation. Then we examine the accuracy of the forecast within the fitted period, the data that was used to produce the forecast. The second stage is an "out-of-sample" test, which tests the accuracy of the forecast by seeing how well it predicts values for the forecasting period. There are a number of measures used for the accuracy tests. They include:


• Mean absolute deviation (MAD);
• Mean square error (MSE);
• Mean absolute percentage error (MAPE); and
• Mean percentage error (MPE).

All four of these measures examine the size of the forecasting errors to determine whether the forecasting method is accurate or unbiased. The first three measures provide information on the accuracy of the forecast, while the last provides information on whether or not the forecast is biased, i.e., whether a forecast is predicting values that are either consistently high or consistently low. For all four measures, a result that is close to zero is preferred. That is, the better the forecast, the smaller the errors.
The forecasting error is the difference between the value of a variable that is observed in a given period, t, and the value of the variable that is obtained by the forecast technique. The error for any time period t can therefore be described by $e_t$, where:

$e_t = Y_t - \hat{Y}_t$   (equation 3.6, p. 79)

The usefulness of each of these measures depends on the nature of the data and the concerns of the forecaster. Basically, the choice depends on how much a forecaster is concerned with some types of errors compared to others.21 We will illustrate the choice of measures using a simple forecast of two data points. Keep in mind that the forecasting error $e_t$ in a given period, t, is the difference between the variable's observed value ($Y_t$) and the value obtained from the forecast model ($\hat{Y}_t$). Assume that the actual number for the first data point is 10 and the actual number for the second data point is 50. The forecast values for the first and the second data points are 16 and 30 respectively. Given the forecast's estimate of the first data point ($\hat{Y}_t = 16$) and the actual number for it ($Y_t = 10$), the error for this data point is $e_t = 10 - 16 = -6$. For the second data point, comparing the estimate ($\hat{Y}_t = 30$) with the actual ($Y_t = 50$) gives an error of $e_t = 50 - 30 = 20$. The mean absolute deviation (MAD) is:

$MAD = \frac{1}{n}\sum_{t=1}^{n}\left|Y_t - \hat{Y}_t\right|$   (equation 3.7, p. 79)

Note: the absolute value sign means that you convert all negative numbers to positive numbers. For the example forecast of the two points, MAD = (6+20)/2 =13.

21 The formulas and definitions for these measures are provided in Chapter 3 of Hanke and Wichern.


This measure simply provides the mean (average) size of the forecast error, hence the use of absolute values. The average of absolute values measures size but ignores the sign (positive or negative) of the error. This measure is useful when a forecaster wishes to examine the accuracy of different forecasts using the same data series, and is not concerned about whether the average is due to many small errors or only a few large errors. The mean square error (MSE) is:

$MSE = \frac{1}{n}\sum_{t=1}^{n}\left(Y_t - \hat{Y}_t\right)^2$   (equation 3.8, p. 80)

For the example forecast, MSE = (36+400)/2=218. Notice that using MAD, the effect of the second error on the calculation is only three times as large as the first. However, in the MSE, both errors are squared, and so the effect of the second error is over 10 times as large as the first error. The MSE penalizes large errors. Using this measure, a forecast that predicts a few observations very badly will usually be rejected in favour of a forecast that predicts all observations with only small errors.

The mean square error (MSE) is the most important measure for judging forecast precision and many statistical programs produce only this measure. SPSS uses the MSE to choose the 10 best forecasts. When you choose to produce a forecast of future values using the "Save" option, SPSS automatically uses the forecast with the smallest MSE.

The mean absolute percentage error (MAPE) is presented for completeness, but may not be extremely useful in practice. The mean absolute percentage error (MAPE) is similar to the MAD, except that each error is divided by the value of the variable itself. This is why it is called a percentage, because it measures the error as a percentage of the value of the observation. The mean absolute percentage error (MAPE) is:

$MAPE = \frac{1}{n}\sum_{t=1}^{n}\frac{\left|Y_t - \hat{Y}_t\right|}{Y_t}$   (equation 3.9, p. 80)

The MAPE for our simple example is: (6/10 + 20/50)/2 = (0.60 + 0.40)/2 = 0.50. This means our forecast is, on average, 50% wrong. Note that in percentage terms the bigger error occurs on the first value, whose forecast is 60% wrong, despite the fact that in level terms its error is smaller, at 6 compared to the second error of 20. One advantage of the MAPE is that it can be used to compare both forecasts of different series whose values have different magnitudes and forecasts of one series using different forecasting methods. This is because all errors are measured in percentage terms. For example, if you are forecasting both housing starts and housing prices, you could use the MAPE to tell the client which series has the better forecast. The mean percentage error (MPE) is used to examine whether a forecast is biased, that is, whether a forecast is predicting values either consistently high or consistently low. Because the sign of the measure is important, this measure uses neither absolute values nor squared values. The mean percentage error (MPE) is:

$MPE = \frac{1}{n}\sum_{t=1}^{n}\frac{Y_t - \hat{Y}_t}{Y_t}$   (equation 3.10, p. 80)
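Under the definitions above, all four measures can be computed in a few lines of code. A minimal Python sketch, checked against the two-point example used in this section (actual values of 10 and 50, forecast values of 16 and 30); the function name is illustrative only:

```python
import numpy as np

def forecast_accuracy(actual, forecast):
    """MAD, MSE, MAPE, and MPE as defined in equations 3.7 to 3.10 (rounded to 4 decimals)."""
    actual, forecast = np.asarray(actual, float), np.asarray(forecast, float)
    errors = actual - forecast                       # e_t = Y_t - Y-hat_t
    return {
        "MAD": round(float(np.mean(np.abs(errors))), 4),
        "MSE": round(float(np.mean(errors ** 2)), 4),
        "MAPE": round(float(np.mean(np.abs(errors) / actual)), 4),
        "MPE": round(float(np.mean(errors / actual)), 4),
    }

print(forecast_accuracy([10, 50], [16, 30]))
# {'MAD': 13.0, 'MSE': 218.0, 'MAPE': 0.5, 'MPE': -0.1}
```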


If the mean percentage error is positive, then the forecast is systematically underestimating the value of the observations, with $\hat{Y}_t$ generally smaller than $Y_t$; we would say that the forecast has a downward bias. On the other hand, if the MPE is negative, then the forecast is systematically overestimating the value of the observations, meaning that the forecast has an upward bias. For our example, the MPE is (-6/10 + 20/50)/2 = (-0.6 + 0.4)/2 = -0.1. Since the MPE is close to zero and we are using only 2 data points, this would likely indicate no bias. The MPE is used to compare two forecasting methods, where the least biased method produces an MPE very close to zero.

Let's now examine each of these measures using a very simple forecast of the July Housing Under Construction data presented in Table 7.3 earlier in this lesson. We will compare two forecasts:
1. the second naïve forecast (which includes a difference term); and
2. the moving average method.
The in-sample period will be 1994 to 2002; the out-of-sample check will use the years 2003 and 2004. This means we will produce our forecast using data from 1994 to 2002 and ignore the data for 2003 and 2004. To check the quality of our forecast, we will:
1. Check the errors of the forecast for autocorrelation.
2. Check the errors for 2003 and 2004 and test which forecast produces the more accurate result.
Recall that the second naïve forecast uses the last period's value of housing under construction, plus the most recent change, to forecast the current value. The data and forecasts are presented below.
Table 7.11 Fitted Values and Forecast Errors for Second Naive Forecast (With Difference)

Year   Housing Under Construction Data   Naive Forecast   "Fitting Period" Forecast Error
1994          12,258
1995          11,472
1996          13,346                        10,686               -2,660
1997          14,099                        15,220                1,121
1998          15,615                        14,852                 -763
1999          22,306                        17,131               -5,175
2000          26,404                        28,997                2,593
2001          33,819                        30,502               -3,317
2002          35,982                        41,234                5,252
2003          36,478            Not included, saved for test data
2004          40,185            Not included, saved for test data


Table 7.12 shows the autocorrelation function for the errors of the second naïve forecast. The table shows there is no significant autocorrelation in the errors (the standard errors are large and the Box-Ljung statistics indicate randomness). This means the naïve forecast method is acceptable for this data. We will use the tests described above to determine whether it is the better forecast.
Table 7.12 Autocorrelation: Second Naive Forecast

Lag   Autocorrelation   Std. Error(a)   Box-Ljung Value   df   Sig.(b)
 1        -0.525            0.309            2.889          1    0.089
 2         0.292            0.282            3.962          2    0.138
 3        -0.134            0.252            4.245          3    0.236
 4        -0.165            0.218            4.816          4    0.307
 5         0.191            0.178            5.964          5    0.310

a The underlying process assumed is independence (white noise). b Based on the asymptotic chi-square approximation.

SPSS Instructions: Table 7.12
Manually add the two columns to the annual housing database: Naive Forecast and Fitting Period (as in Table 7.11). Choose: Analyze → Forecasting → Autocorrelations → Variables: FittingPeriodNF.

Tables 7.13 and 7.14 show results for the moving average forecast.
Table 7.13 Fitted Values and Forecast Errors for 3 Span Moving Average Forecast

Year   Housing Under Construction Data   3 Span Moving Average Forecast   "Fitting Period" Forecast Error
1994          12,258
1995          11,472
1996          13,346
1997          14,099                          12,359                            1,740
1998          15,615                          12,972                            2,643
1999          22,306                          14,353                            7,953
2000          26,404                          17,340                            9,064
2001          33,819                          21,442                           12,377
2002          35,982                          27,510                            8,472
2003          36,478                Not included, saved for test data
2004          40,185                Not included, saved for test data


Table 7.14 Autocorrelations: Moving Average Forecast

Lag   Autocorrelation   Std. Error(a)   Box-Ljung Value   df   Sig.(b)
 1         0.477            0.323            2.189          1    0.139
 2        -0.072            0.289            2.251          2    0.324
 3        -0.397            0.250            4.770          3    0.189
 4        -0.417            0.204            8.946          4    0.062

a The underlying process assumed is independence (white noise). b Based on the asymptotic chi-square approximation.

SPSS Instructions: Table 7.14
Add the two columns to the annual housing database: 3 Span Moving Average and Fitting Period (as in Table 7.13). Choose: Analyze → Forecasting → Autocorrelations → Variables: FittingPeriodMA.

The Table 7.14 results show no clear pattern in the autocorrelations and no Sig.(b) results below the critical value of 0.05. This shows there is no autocorrelation in the errors for this forecast. Both forecasts have now passed the first test, so we move on to the "in-sample" test. We will apply each forecast to the fitted data (1994 to 2002), and then check the results for both accuracy and bias. The fitted forecasts start at 1996 for the second naïve forecast and at 1997 for the three-period moving average forecast, as the earlier years are used to create these initial forecasts. Table 7.15 shows the calculations needed to perform all of the tests. The first column is the year, the second column is the data, the third column is the forecast, and the fourth column is the forecast error. Columns 5 to 8 show the calculation for each of the tests: the absolute value of the error is needed to find the MAD, the squared error is used to find the MSE, the absolute percent error is used to find the MAPE, and the percent error is used to find the MPE. The last row of each section therefore produces these test measures.


Table 7.15 Measuring Forecast Error for the Housing Under Construction, July Data

Naive Forecast
Year      Yt         Ŷt        Error     |Error|    Squared error    Abs. % error   % error
1996    13,346     22,944      9,598      9,598       92,121,604        0.719         0.719
1997    14,099     15,220      1,121      1,121        1,256,641        0.080         0.080
1998    15,615     14,852       -763        763          582,169        0.049        -0.049
1999    22,306     17,131     -5,175      5,175       26,780,625        0.232        -0.232
2000    26,404     28,997      2,593      2,593        6,723,649        0.098         0.098
2001    33,819     30,502     -3,317      3,317       11,002,489        0.098        -0.098
2002    35,982     41,234      5,252      5,252       27,583,504        0.146         0.146
Total                          9,309     27,819      166,050,681        1.422         0.664
Mean                           1,330      3,974       23,721,526        0.203         0.095

Moving Average Forecast
Year      Yt         Ŷt        Error     |Error|    Squared error    Abs. % error   % error
1997    14,099     12,359      1,740      1,740        3,028,760        0.123         0.123
1998    15,615     12,972      2,643      2,643        6,983,687        0.169         0.169
1999    22,306     14,353      7,953      7,953       63,244,907        0.357         0.357
2000    26,404     17,340      9,064      9,064       82,156,096        0.343         0.343
2001    33,819     21,442     12,377     12,377      153,198,380        0.366         0.366
2002    35,982     27,510      8,472      8,472       71,780,432        0.235         0.235
Total                         47,395     47,863      428,457,505        1.819         1.784
Mean                           6,771      6,838       61,208,215        0.260         0.255

Using the figures in Table 7.15, we can now calculate each measure of forecasting accuracy. Table 7.16 compares the four measures for both the naïve and moving average forecasts. We have already established that each of these forecasts has eliminated autocorrelation in the error terms.
Table 7.16 Comparison of Quality Measures for In-sample Forecast

          Naive         Moving Average
MAD         3,974             6,838
MSE    23,721,526        61,208,215
MAPE        0.203             0.260
MPE         0.095             0.255


For each of the measures, the naïve forecast outperforms the moving average forecast, though not always by a large margin. The MAD, MSE, and MAPE all show that the errors for the moving average forecast are larger than the errors for the naïve forecast. The MPE shows that the moving average forecast is biased downward, with all of its forecasts lower than the actual values. Lastly, we perform the "out-of-sample" test: we apply each forecast to the test data (2003 and 2004), and then check the results for both accuracy and bias.
Table 7.17 Measuring Forecast Error for the Housing Under Construction, July Data; "Out-of-Sample" Test

Naive Forecast
Year      Yt         Ŷt        Error     |Error|    Squared error    Abs. % error   % error
2003    36,478     38,145      1,667      1,667        2,778,889        0.046         0.046
2004    40,185     40,308        123        123           15,129        0.003         0.003
Total                          1,790      1,790        2,794,018        0.049         0.049

Moving Average Forecast
Year      Yt         Ŷt        Error     |Error|    Squared error    Abs. % error   % error
2003    36,478     32,068      4,410      4,410       19,445,160        0.121         0.121
2004    40,185     33,956      6,229      6,229       38,794,904        0.155         0.155
Total                         10,638     10,638       58,240,064        0.276         0.276
Mean                           5,319      5,319       29,120,032        0.138         0.138
Table 7.18  Comparison of Quality Measures for Out-of-sample Test

        Naïve        Moving Average
MAD     1,790        5,319
MSE     2,794,018    29,120,032
MAPE    0.049        0.138
MPE     0.049        0.138

The out-of-sample test confirms the results of the in-sample test. The error measures are all much larger for the moving average forecast than they are for the naïve forecast. This confirms that the naïve forecast is superior for short-term forecasts for this data. Note that the results of forecast comparisons are often not as clear-cut as in this example. It is possible for one measure to favour one forecast, while another measure favours a different forecast. The forecaster will have to decide which measure is most important for the problem. Two considerations will likely influence this choice. First, as noted in Lesson 6, it is good practice to choose the measure that will be used to evaluate the forecast technique before the forecast tests are carried out (i.e., to eliminate bias on the part of the forecaster). Second, keep in mind that most statistical programs only produce the MSE.
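To make this point concrete, here is a tiny, made-up illustration (not from the workbook) of how two error measures can disagree: two forecasts with the same MAD can have very different MSEs, because the MSE penalizes a single large miss heavily.

errors_a = [3, -3, 3, -3]     # consistently small misses
errors_b = [0, 0, 0, 12]      # mostly perfect, one big miss

for name, errs in [("A", errors_a), ("B", errors_b)]:
    mad = sum(abs(e) for e in errs) / len(errs)
    mse = sum(e ** 2 for e in errs) / len(errs)
    print(name, "MAD =", mad, "MSE =", mse)
# both have MAD = 3.0, but the MSE is 9.0 for A and 36.0 for B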


Of course, it is very possible that the number of houses under construction depends on more than how many houses were built in the past. Examining whether the forecast could be improved by adding other variables requires the forecaster to use multiple regression modelling. This will be examined in Lesson 8.
Applying the Forecasting Model

We have covered many aspects of creating forecasting models, but what might be lost in all the details is what these results mean. For the Housing Under Construction data, we have now determined that the second naïve forecast produced good results. We found this by applying the model to past values to predict them and then comparing our predictions to what really happened. For example, in 2004 there were 40,185 housing starts and our model predicted 40,308; we found that, on average, the prediction "errors" were acceptable. Now that we have a model we are happy with, we can use it to forecast future values! To carry out the forecast, we would use all the available data to calculate values for the future, then use forecast values to predict even further into the future. For example, the forecast for 2005 would use data from 2004 and 2003. The result is 43,892 [40,185 + (40,185 − 36,478)]. For 2006, we would use the 2005 forecast value as part of the calculation. The forecast value for 2006 is 47,599 [43,892 + (43,892 − 40,185)]. The analyst in our example has now predicted housing starts in Toronto and can use this information to aid in their development strategies. Note: if we had tested ARIMA as well, and found that it produced the best model, we would predict future values by using the "Save" command in SPSS.
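The chained calculation described above can also be sketched in a few lines of code (illustrative only; the values come from the text):

y_2003, y_2004 = 36478, 40185

def second_naive(latest, previous):
    # forecast = latest value plus the most recent observed change
    return latest + (latest - previous)

f_2005 = second_naive(y_2004, y_2003)    # 40,185 + (40,185 - 36,478) = 43,892
f_2006 = second_naive(f_2005, y_2004)    # 43,892 + (43,892 - 40,185) = 47,599
print(f_2005, f_2006)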

Summary
This lesson has led you through the study of data patterns, forecasting with a data series, and making choices among alternative forecasts. At this point, you have a good set of tools to help clients in making simple time series forecasts and, perhaps more importantly, in evaluating the forecasts provided to you and your clients by "expert" forecasters. However, all of the univariate forecasting methods presented in Lesson 7 share a common limitation: they are designed to forecast using only information from a single data series. Lesson 8 will build on this lesson by introducing forecasts based on more than one variable: multiple regression analysis. Furthermore, the lesson will also discuss methods that introduce judgment to the forecasting process.


APPENDIX 1 How to Deal with Non-Constant Variance


Recall that in a stationary data series, neither the mean nor the variance of the data series changes as time progresses. Changes in the mean refer to situations where the value of the data increases (or decreases) over time. We have dealt with changes in the mean through differencing. However, we also need to consider how to deal with changes in the variance. An example of this might be where the variation in a price is small in early periods, but prices begin to fluctuate wildly over time; this might occur when a market becomes deregulated.

When analyzing time series data, graphing the data is an important first step, because it can show patterns in the data that might not otherwise be apparent. For example, if the graph of a variable shows a pattern in how the vertical distance between the lowest and highest values changes (e.g., increasing or decreasing as you move from left to right in the graph), then you have a problem called heteroskedasticity. This is illustrated in Figure 7.A1: the graph on the left shows the errors/residuals resulting from estimating annual marriages using an AR1 regression (discussed in the ARIMA techniques in Appendix 3). The errors from this regression increase from the beginning of the data to the end, with the variance getting wider with each successive year. The graph on the right shows the errors from a separate regression of marriages using an ARIMA(4,0,0) process on seasonally adjusted marriage data (p=4 means there were four significant spikes in the autocorrelation graphs, showing a seasonal autoregressive pattern). In this case, the errors appear to have a constant size as we move from left to right in the graph. There is one spike in the errors, but a single spike is not sufficient to indicate heteroskedasticity. In this example, making the seasonal adjustment and changing the forecasting model eliminated the heteroskedasticity in the errors.
Figure 7.A1 Errors With and Without Heteroskedasticity
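A rough, code-based companion to the visual check (illustrative only, with made-up residuals; this is not a formal statistical test): compare the spread of the residuals in the early and late parts of the sample. A large difference is a warning sign worth following up with a residual plot.

import statistics

# made-up residuals whose spread grows over time (illustrative only)
residuals = [2, -3, 4, -2, 3, -5, 9, -12, 15, -18, 22, -25]

half = len(residuals) // 2
print("early SD:", statistics.stdev(residuals[:half]))   # small spread
print("late SD:",  statistics.stdev(residuals[half:]))   # much larger spread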


It is good practice in forecasting to always examine graphs of the "errors" (residuals) from a forecasting equation. Understanding how well the equation forecasts the data is an important part of choosing the best forecast. You may recall from previous coursework, where you created models for estimating house prices, that the "residuals" are part of the regression equation. In equation terms, they represent the part of the value of the left-hand side variable (the target or dependent variable) that is not explained by your variables on the right-hand side (the independent variables). In a regression model for house prices, the residual would refer to the amount of the variation of house prices that is not explained by the variables used in the regression equation. For forecasting time series data, as in this lesson, the residual refers to the variation in the expected future value of a variable that is not explained by past values of this variable.

If the residual graph shows heteroskedasticity, then you should change the way you measure your variable. Usually the problem is that the errors get larger over time, partly because the size of the variable gets larger over time; for example, real estate prices tend to increase over time, so a forecast of future real estate prices will have errors that become more and more magnified. The way to fix this is to transform the variable so that it does not grow over time. Mathematically, a function called the natural logarithm is the ideal solution for any numbers that grow exponentially over time. As a matter of fact, this problem occurs so regularly that SPSS offers a solution in the command box for autocorrelation and for most of its forecasting techniques. Depending on how fast the errors are growing, it may also be possible to transform the data by dividing it by the year or by a time variable.

If you choose to transform the data yourself, you will need to remember to transform the forecast back to the original measure before you use it. For example, if you divide the data by year and then forecast using this transformed data, you will need to multiply the forecast by year to get results you can use. On the other hand, if the growth of the data is compounded year after year, then you can use the natural logarithm transformation and rely on the program to produce a forecast of the correct data without your intervention.
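A minimal sketch of the log-transform-and-back idea (illustrative values, not from the workbook): forecast the logged series, then convert the result back to original units with the exponential function.

import math

prices = [100.0, 112.0, 126.0, 141.0, 158.0]        # illustrative, roughly compounding growth
log_prices = [math.log(p) for p in prices]

# naive forecast on the logged scale: next log-value = last log-value + last change
next_log = log_prices[-1] + (log_prices[-1] - log_prices[-2])
next_price = math.exp(next_log)                      # back-transform to original units
print(round(next_price, 1))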
SPSS Instructions: Correcting Heteroskedasticity
For the autocorrelation function: Choose Analyze → Forecasting → Autocorrelations → Natural log transform.
For ARIMA: Choose Analyze → Forecasting → Create Models. Under "Method", choose ARIMA; click Criteria and select Natural log transform under "Transformation".
Even though the program will use a natural logarithm transformation to produce the forecast, the fitted variable and forecast will appear in the same units as the original data. In other words, the natural logarithm transformation is invisible to the user. Its purpose is only to adjust the errors in the model to eliminate heteroskedasticity.

This simple step can make the difference between a good forecast and one that is systematically biased downward. This is an important point when forecasting real estate values, since prices tend to trend upward over time, meaning heteroskedasticity is a regular concern.


APPENDIX 2 Exponential Smoothing Example


Because of the complexity of the formulas, exponential smoothing is best carried out within a statistical software package such as SPSS. Instructions for Holt's method are provided below.22 The monthly Housing under Construction data is used for this example.
SPSS Instructions: Exponential Smoothing
Choose Analyze → Forecasting → Create Models.
Move the time series variable you want to forecast (Housing) to the Dependent Variables box.
Under "Method", choose Exponential Smoothing; click Criteria; select Holt's linear trend. Click Continue.
Click the Save tab. Under the Save column, select Predicted Values.
Click the Options tab. Choose "First case after end of estimation period through specified date"; type 2006 under Year and 12 under Month. Click OK.
To view the forecast, select Analyze → Forecasting → Sequence charts and select the time series (Date) and the variables (Predicted Value and original Housing variables, to show both on the graph).
Note: if you receive an error report: "In series Housing, a missing value was found", then you need to check the data to see if there are any missing values, which are shown as periods in the Data View screen. This will happen if you have used Exponential Smoothing to create a forecast and then run subsequent forecasts: you have forecasted values beyond April 2005 (likely shown as new "fitted" variables), but there are no actual observations for "Housing", so these missing values show as periods. To run further iterations of Exponential Smoothing forecasts, you need to remove these missing values. You may:
1. Select Data → Select Cases to set a filter, such that only observations up to 05/2005 are used (the first 401 observations). Select "Based on time or case range", click "Range", and for "Last Case" specify 2005 for Year and 05 for Month. When you click OK and return to the "Data View" screen, you'll see a slash through the observations beyond row 401, showing these are not considered in the analysis; OR
2. Delete the entire rows for these future values (select the entire row and press Delete); OR
3. Save the database with a new name (if you wish to save your results) and reopen the original database.

22  SPSS offers two methods to apply smoothing in forecasts. Both are valid. One method, not discussed here, is called "T4253H smoothing" and is created by the following command: under the menu bar, choose Transform → Create Time Series, and under "Functions" choose "Smoothing". This command produces a forecast for the entire data series in a way similar to the moving average. This forecast uses medians of a set of past values to forecast the value one period ahead.


These SPSS commands create an exponential smoothing forecast. We will focus on monthly data for housing under construction to show how exponential smoothing can be used to make forecasts. For the housing data, we have applied Holt's method, which incorporates a trend correction to the forecast. We ran the model with two different settings: first, we used the grid search option to allow SPSS to show us the optimal weights (based on minimizing the sum of squared errors); second, we fixed the Alpha value at 0.3 together with a grid search for the Gamma value. The SPSS exponential smoothing output will include a summary of the selection process, the forecast values for the entire series, and the errors of the series. The last two outputs are created as additions to the data set (new variables in new columns). Table 7.A1 shows the "Smallest Sums of Squared Errors" summary, identifying the top ten models in terms of minimizing the sum of squares.
Table 7.A1  Smallest Sums of Squared Errors (Series: Housing)

Model rank   Alpha (Level)   Gamma (Trend)   Sums of Squared Errors
1            1.00000         .20000          399866932.63653
2            1.00000         .00000          407355223.10051
3             .90000         .20000          414353236.75514
4            1.00000         .40000          427107419.98645
5             .90000         .40000          432856218.98064
6             .90000         .00000          434071892.08396
7             .80000         .20000          441237009.04575
8             .80000         .40000          454532156.28376
9             .90000         .60000          455580501.95575
10           1.00000         .60000          464531006.95800

The minimum sum of squared errors model is the first one presented. In this best fitting model, the weight ("Alpha") on the most recent observation is 1.0, which means that the best predictor of the current value is the immediately preceding observation. If you view the forecasted values, you will notice they are all very close to the 39,900 housing starts in May 2005. The weight ("Gamma") on the trend component is 0.2, which indicates there is a slight upward trend to the data.
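For readers who want to see the mechanics outside SPSS, here is a minimal pure-Python sketch of Holt's linear-trend smoothing with the level and trend weights ("Alpha" and "Gamma") discussed above. The data values are made up for illustration; this is not the SPSS implementation.

def holt(series, alpha, gamma):
    level = series[0]
    trend = series[1] - series[0]          # simple initialisation
    fitted = []
    for y in series[1:]:
        fitted.append(level + trend)       # one-step-ahead forecast for this period
        new_level = alpha * y + (1 - alpha) * (level + trend)
        trend = gamma * (new_level - level) + (1 - gamma) * trend
        level = new_level
    return fitted, level, trend

# with alpha = 1 the forecast collapses to "last value plus trend", which is why the
# grid-search result above behaves almost like a one-step-ahead (naive) forecast
fitted, level, trend = holt([100, 104, 109, 115, 118, 124], alpha=1.0, gamma=0.2)
next_forecast = level + trend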


SPSS Instructions: Table 7.A1
To produce Table 7.A1:
Ensure any rows beyond the dates of the original data have been removed (that is, delete any previously calculated forecasts).
File → New → Syntax
Type the following commands into the Syntax window (exactly as shown, with periods):
PREDICT THRU YEAR 2006 MONTH 12.
EXSMOOTH /VARIABLES=Housing /MODEL=HOLT /ALPHA=GRID(0 1 .1) /GAMMA=GRID(0 1 .2) /INITIAL=CALCULATE.
When done, click Run → All.
To view the forecast, select Analyze → Forecasting → Sequence charts.
To get the second forecast (shown in Figure 7.A3), change the /ALPHA line in the above syntax to /ALPHA=.3 and the /GAMMA line to /GAMMA=GRID(0 1 .3). Re-run the syntax statements and reproduce the sequence chart.

Figure 7.A2 shows the forecast along with the actual data. The series forecast closely follows the data set, with very little smoothing in this case. If the forecaster is interested in more smoothing, then he or she would change the "Alpha" parameter to a lower value, meaning less weight is placed on the immediately preceding observation. Figure 7.A3 illustrates the effect of altering the Alpha and Gamma weights: it shows the result of setting Alpha to 0.3, while continuing to allow grid search to choose Gamma (with "By" set to 0.1). Gamma is found to fit best at 0.3. With Alpha lowered, the forecast is smoother, because the weight of 0.3 on the most recent data leaves more weight to be distributed over the past data values used in the forecast. Comparing Figure 7.A3 to Figure 7.A2, the exponential smoothing forecast with Alpha=0.3 predicts that the data will turn down in the near future. The downturn occurs because the data points just before the last data point are lower than the last data point; the exponential smoothing forecast gives weight to these lower values and predicts that future values will fall below the last value seen in the data.


Figure 7.A2 Exponential Smoothing, Alpha = 1.0, Gamma = 0.2

Figure 7.A3 Exponential Smoothing, Alpha = 0.3, Gamma = 0.3


For this particular data, as seen in Figure 7.A2, the exponential smoothing forecast with both Alpha and Gamma chosen by grid search looks like a "one-step ahead" or "naïve" forecast. We see this when applying the Holt exponential smoothing method to the monthly housing data. If we allow the program to choose the parameters, the result puts almost all of the weight on the previous period's value to predict the current value. In other words, with Alpha = 1 this is effectively a "one-step ahead" forecast, the same as that presented earlier as the naïve forecast. In a Holt model for this data series, the only difference between the forecast for the current period and the previous period's value derives from the Gamma parameter, and the difference is very small. This means that, if the data did not have a slight upward trend, the one-step ahead forecast would be as good as any exponential smoothing forecast for this data. It is only the small parameter on the trend that gives the exponential smoothing forecast extra information beyond the information in the one-step ahead forecast.

Because the difference between the naïve forecast and the exponential smoothing forecast is small for this data, a one-step ahead forecast may be fairly good for short-term forecasts of one or two periods. However, neither the one-step ahead forecast nor the exponential smoothing forecast is best for forecasting data with this high a level of autocorrelation: differencing is needed. The important implication of Alpha = 1 is that this data should really be differenced before it is forecast. Recall that differencing the data creates a series of the differences between the current period's value and the previous period's value. By doing this, you would be able to use information about how movements between data in two periods change, and whether there is a pattern in these differences that can provide more information than the pattern of the original data series. Using a driving analogy, if today's distance is the same as yesterday's, then to learn about a driving pattern, you may need to look at how speeds change (yesterday's drive may have been at a constant speed, but today's had many stops and starts). However, even with differencing, exponential smoothing suffers from a limitation that all methods presented in this lesson share: it is designed to forecast using only information from a single data series. Lesson 8 will present forecasting methods using multiple regression analysis.
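As a small aside (illustrative values only), first differencing simply replaces the levels with period-to-period changes; a forecast made on the differenced series is then added back to the last observed level:

housing_levels = [36478, 40185, 39900, 40250, 39800]              # illustrative level values
differences = [later - earlier for earlier, later in zip(housing_levels, housing_levels[1:])]

# a model would be fitted to `differences`; a forecast of the next difference is then
# added back to the last observed level to recover a forecast of the series itself
next_level_forecast = housing_levels[-1] + differences[-1]        # e.g., "last change repeats"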


APPENDIX 3 ARIMA Example


This appendix illustrates how ARIMA can be used to forecast monthly housing under construction data for Toronto. We will present each step in the decision process, in particular explaining how the parameters for an ARIMA forecast model are chosen. Finally, we will graph the forecasted future values. To forecast using ARIMA, you need to examine the autocorrelation and partial autocorrelation functions. The commands to produce these functions are shown below.
SPSS Instructions: ACF and PACF Tables
To obtain ACF and PACF tables and functions in SPSS:
Choose Analyze → Forecasting → Autocorrelations. Select the variable and move it to the Variable box.
Select Autocorrelations and Partial autocorrelations for the Display box.
Select Options. Select Maximum number of lags (generally 25% of the data series or a maximum of 50; 28, in our case here).

The first step in determining the ARIMA parameters is to examine the autocorrelation and partial autocorrelation functions. Instead of presenting the data in tables, we will look at the graphs for these functions. Figure 7.A4 shows that the autocorrelation function has a "tail" and the first autocorrelation is above 0.95. Figure 7.A5 shows that the partial autocorrelation function has a single "spike".

Figure 7.A4 Autocorrelation Function


Figure 7.A5 Partial Autocorrelation Function

To decide upon the ARIMA parameters, we can apply the decision rules presented earlier in the lesson. The first autocorrelation is 0.99 at the first lag, meaning 99% of the current period's value is determined by the previous period's value. Because there is no tail in the partial autocorrelation data, we expect that the moving average forecast is inappropriate for this data. We conclude that the data needs to be differenced. This means we have an ARIMA(p,1,0). Now, we must examine the differenced data to determine its pattern. In SPSS, we return to the Autocorrelations input panel, under "Transform" we select "Difference = 1", and under "Options" we change the Lags to 14 (to better see the pattern in the lower lags). Below we reproduce the autocorrelations and also show the partial autocorrelations for the differenced data. Figure 7.A6 shows these two graphs side-by-side (note that we have modified the Y-axis scale to better highlight the relationships).
Figure 7.A6 ACF and PACF for Differenced Data
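Outside SPSS, comparable ACF/PACF diagnostics for the differenced series could be produced in Python (a sketch only, with a made-up monthly series standing in for the housing data; it assumes pandas, matplotlib, and statsmodels are installed):

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf

# synthetic stand-in for the monthly housing under construction series
rng = np.random.default_rng(0)
housing = pd.Series(30000 + np.cumsum(rng.normal(100, 500, 120)),
                    index=pd.date_range("1995-01-01", periods=120, freq="MS"))

diffed = housing.diff().dropna()           # first difference, as decided above

fig, axes = plt.subplots(1, 2, figsize=(10, 4))
plot_acf(diffed, lags=14, ax=axes[0])      # autocorrelation function
plot_pacf(diffed, lags=14, ax=axes[1])     # partial autocorrelation function
plt.show()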


The differenced data continues to display significant autocorrelation, although much less than the level (non-differenced) data. We therefore can use an autoregressive forecasting method on the differenced data. The partial autocorrelation function above shows significant autocorrelations at lag = 1 and then again at lag = 12. The significant autocorrelation at lag 1 tells us that the differenced data has a first order autoregressive process and a serial correlation between adjoining periods. This tells us that the autoregressive parameter (p) should equal 1. We will therefore apply an ARIMA(1,1,0) model to the data. The bumps in both the autocorrelation and partial autocorrelation at lag = 12 tell us that we also need to examine the seasonal autocorrelations and partial autocorrelations for this data to fully specify the ARIMA model. An ARIMA model including seasonal parameters is written as ARIMA(p,d,q)(sp,sd,sq). The second set of parameters refers to the seasonal autoregression (sp), seasonal difference (sd), and seasonal moving average (sq). You diagnose the patterns of the seasonal part of the data in the same way as you do the period-to-period part. That is, you need to obtain ACF and PACF tables and graphs for the seasons. However, you do not need to look at each lag's autocorrelation, as we did above. For seasonal data, you would examine the autocorrelation for annual lags only.
SPSS Instructions: Creating seasonal ACF and PACF functions and graphs
Choose Analyze → Forecasting → Autocorrelations. Select the variable (e.g., Housing).
Under Transform: select Difference = 1.
Under Options: enter a large enough number in Maximum Number of Lags to allow you to evaluate the seasonal autocorrelation function. Select Display autocorrelations at periodic lags. Press Continue, then OK.

We select a number of lags that will represent years. Because our data is monthly, we need to look at autocorrelations at lag = 12, lag = 24, and so on. We choose the maximum number of lags = 72 because this will give us 6 periodic lags for the 6 years of data (6 × 12 = 72). The ACF is presented below, followed by the PACF for the seasonal lags. We will show both the tables and the graphs because the patterns are not clear by looking at the graphs alone.
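Continuing the Python sketch from above (it assumes the differenced series `diffed` and statsmodels are available, and that the series is long enough for 72 lags), the seasonal-lag autocorrelations can be pulled out directly:

from statsmodels.tsa.stattools import acf

acf_values = acf(diffed, nlags=72)                     # index 0 is lag 0
seasonal = {lag: round(float(acf_values[lag]), 3) for lag in range(12, 73, 12)}
print(seasonal)                                        # autocorrelations at lags 12, 24, ..., 72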


Table 7.A2  Autocorrelations (Series: DIFF(Housing,1))

                                   Box-Ljung Statistic
Lag   Autocorrelation   Std.Error(a)   Value     df   Sig.(b)
12    .197              .049           74.531    12   .000
24    .090              .048           93.411    24   .000
36    .089              .048          104.680    36   .000
48    .079              .047          125.571    48   .000
60    .129              .046          138.003    60   .000
72    .100              .045          152.323    72   .000

a The underlying process assumed is independence (white noise). b Based on the asymptotic chi-square approximation.

Table 7.A3  Partial Autocorrelations (Series: DIFF(Housing,1))

Lag   Partial Autocorrelation   Std.Error
1     .141                      .050
2     .036                      .050
3     .081                      .050
4     .065                      .050
5     .089                      .050
6     .059                      .050

Figure 7.A7 ACF and PACF at Seasonal Lags


The ACF has a tail, with a high initial autocorrelation and then falling autocorrelations in the next three lags. There is a jump at lag = 60, but as this is five years back, we can probably safely ignore it. The PACF has a single spike, with the second lag much smaller than the first. Therefore, we will use a first order autoregressive process (sp=1) to model the seasonal part of the data. We can now combine each of the steps above to pick all of the parameters for the ARIMA. Earlier, we differenced the housing data and found the differenced data could be modeled as an AR of order 1. We saw no evidence of a need for a moving average model. This meant our first three parameters are (p, d, q) = (1,1,0). For the seasonal lags, we found that a first order autoregressive forecast would be appropriate. Because the autoregressive parameter at lag 12 was far less than 0.95 and did not show a trend, we did not apply a seasonal difference. Again, we saw no evidence of a need for a moving average model. Our next three parameters are therefore (sp, sd, sq) = (1,0,0). We conclude that the monthly housing under construction data could best be modeled by an ARIMA(1,1,0)(1,0,0), which could be described as "a differenced AR1 model with a seasonal autoregressive component". We therefore use this model to forecast the data series, illustrated below.
SPSS Instructions: Creating an ARIMA forecast in SPSS
Choose Analyze → Forecasting → Create Models.
Move the time series variable that you want to forecast to the Dependent Variables box.
Under "Method", choose ARIMA; click Criteria.
Under the "Model" tab, set the desired parameters for non-seasonal (p,d,q) and seasonal (sp,sd,sq) to specify the model you want to use to generate a forecast (e.g., p=1, d=1, q=0, sp=1, sd=0, sq=0). Deselect Include constant in model; click Continue.
Click the Statistics tab; under "Statistics for Individual Models", choose Parameter estimates.
Click the Save tab: under the "Save" column, select Predicted Values.
Click the Options tab; choose First case after end of estimation period through specified date; type 2006 under Year and 12 under Month (for example). Click OK.
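For readers working outside SPSS, an equivalent model could be fitted in Python with statsmodels (a sketch only, not the SPSS implementation; it assumes `housing` is a monthly pandas Series with a datetime index, as in the earlier sketch):

from statsmodels.tsa.arima.model import ARIMA

model = ARIMA(housing, order=(1, 1, 0), seasonal_order=(1, 0, 0, 12))
result = model.fit()
print(result.summary())                  # includes the AR(1) and seasonal AR(1) estimates

future = result.forecast(steps=19)       # e.g., forecasts through the end of 2006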

Table 7.A4  ARIMA Model Parameters

                                   Estimate   Std Error   T       Approx. Sig.
Non-Seasonal Lags   AR1            0.240      0.049       4.933   .000
Seasonal Lags       Seasonal AR1   0.192      0.051       3.755   .000


Table 7.A5 and Figure 7.A8 show the data from 2000 on, illustrating the actual data as a solid line and the ARIMA(1,1,0)(1,0,0) forecast as a dashed line. We also show the dashed line extending beyond the data, illustrating the future forecast through to the end of 2006. The forecast captures much of the curvature of the housing under construction data. It continues to forecast some curvature into the future.
Table 7.A5  Housing Under Construction: ARIMA Forecast Data

Date      Housing Under Construction / Forecast
Actual
Jan-05    40622
Feb-05    40018
Mar-05    37525
Apr-05    39950
Forecast
May-05    40162
Jun-05    40243
Jul-05    40283
Aug-05    40797
Sep-05    40769
Oct-05    40803
Nov-05    40742
Dec-05    40660
Jan-06    40368
Feb-06    40252
Mar-06    39774
Apr-06    40239
May-06    40280
Jun-06    40295
Jul-06    40303
Aug-06    40401
Sep-06    40396
Oct-06    40402
Nov-06    40391
Dec-06    40375

Figure 7.A8 Housing Under Construction: ARIMA Forecast Graph


Each data series is different. Using the steps shown in this example, and studying the examples in the text, you can learn to systematically analyze data patterns and choose the best parameters for a forecast using the ARIMA method. You will still need to determine whether this method is more complex than would be acceptable to the clients that you serve, or whether a simpler method will be more appropriate. Luckily, as computer programs continue to improve, producing these forecasts becomes easier, and the software often provides much of the explanation you will need to keep up-to-date in understanding forecasting methods.

A surprising aspect of the ARIMA forecast above is how close the forecasted future data points are to the naïve forecast. The forecast includes some variability due to the autoregressive and seasonal pattern from the data series. However, the last data point is very close to the preceding data value, leaving a forecaster with little idea whether this data is about to continue an upward motion or whether we should expect a downturn. The ARIMA model, in a sense, deals with this by forecasting future values that are relatively flat, with some upward and downward motion based on the information from previous patterns. Because the forecast has only a small amount of variation, it is fair to ask: "Why go to all the trouble of modeling using ARIMA if the naïve forecast is quite close?" This is a reasonable question. The answer is that ARIMA models can also use other variables in creating a forecast. The limited variation in the forecast in this example is mainly due to the fact that only the series' past values are used. Using only one variable is always very limiting. Lesson 8 will show you how to incorporate other variables into time series models, including ARIMA. A multiple regression model using ARIMA can provide a much better forecast than a naïve model.


Review and Discussion Questions


1. Based on the monthly data series for Housing Under Construction, create four forecast models using:
A. second naïve forecast (with difference)
B. exponential smoothing
C. ARIMA (1,1,0)
D. ARIMA (vary parameters from the set used above)

Keep the last four periods for out-of-sample testing.
2. Explain your choices of parameters for the two ARIMA models.
3. Calculate the four measures of forecasting error.
4. Select the best forecast, and justify your choice.
5. Apply the model to predict housing starts to the end of 2007.


ASSIGNMENT 7 Understanding Data Patterns and Forecasting Techniques

Marks: 1 mark per question.

1. Which of the following is an example of time series data?
(1) The assessed values of each of the houses in Naramata over the past 10 years.
(2) The square feet of living area for all properties in Prince Rupert in August 2006.
(3) The minute-by-minute spot price of the Japanese Yen.
(4) The number of bedrooms of each of the houses in Moose Jaw in January 2007.

2. Which of the following interpretations of autocorrelation coefficients is correct?
(1) If there is a perfect relationship between today's value and last period's value (k=1), the autocorrelation coefficient would equal 0.
(2) If there is a strong negative relationship between the current value of the variable and the value two periods ago (with k=2), then the autocorrelation coefficient would have a negative value near -1, likely less than 0.5.
(3) If there is little or no relationship between the current value and the past values of the variable, then any autocorrelation coefficient calculated would be close to -1.
(4) If there is a positive relationship between today's value and last period's value (k=2), the autocorrelation coefficient would equal 1.5.

3. When considering quarterly data with a seasonal pattern, you would expect:
(1) a high positive correlation at lag 4, as well as at lag 8.
(2) no correlation between lag 4 and lag 8 as every fourth period represents the same season.
(3) a strong positive autocorrelation between the current season and other seasons.
(4) no correlation whatsoever as seasonal data is always random.



4. Which of the following statements are TRUE?
A. A stationary data pattern indicates that the data series neither increases nor decreases over time.
B. Data has a seasonal component if it has a pattern of change that repeats itself every month.
C. The trend of a data series is the component that causes the data to have a long-term but unpredictable change over time.
D. Data is cyclical if it has wave-like fluctuations, either around the constant (if data is stationary, but with a cycle) or around the trend (e.g., a sine wave along an upward sloping line).
(1) A, B, and D
(2) C and B
(3) A and D
(4) All of the above

5. The size of the standard error should always be compared to the size of the measure it relates to, in order to determine whether the autocorrelation coefficients are significant. In order to have significant autocorrelation coefficients:
A. the standard errors should be smaller than half the size of the autocorrelation coefficients.
B. the standard errors should be larger than half the size of the autocorrelation coefficients.
C. if the t-test is greater than 2, then you are likely to have an estimate whose value is not equal to 0. In other words, if t>2, the result is usually statistically significant.
D. if the t-test is greater than 1, then you are likely to have an estimate whose value is not equal to 0. In other words, if t>1, the result is usually statistically significant.
(1) A only
(2) D only
(3) A and C
(4) B and C



THE NEXT 2 (TWO) QUESTIONS ARE BASED ON THE FOLLOWING INFORMATION:

Consider the following data on the annual number of campers at the Beachside campground:

Year   Number of Visitors
1997   12,003
1998   17,548
1999   11,821
2000   14,548
2001   12,684
2002   14,753
2003   16,876
2004   21,159
2005   13,896
2006   15,784
2007   19,354

6. Using a 2nd naïve forecast, what would be the estimated number of campers for 2008?
(1) 20,632 campers
(2) 21,242 campers
(3) 19,354 campers
(4) 22,924 campers

7. Using a three-period moving average forecast, what would be the estimated number of campers for 2008?
(1) 19,345 campers
(2) 16,345 campers
(3) 16,687 campers
(4) 15,493 campers



8. Given the following autocorrelation function, which forecast method should be used?

[Autocorrelation function chart: lags 1 through 10, values shown on a scale of 0 to 80]

(1) A moving average ARIMA (0,0,2)
(2) A moving average ARIMA (2,0,0)
(3) A moving average ARIMA (0,0,8)
(4) A moving average ARIMA (8,0,0)

THE NEXT 4 (FOUR) QUESTIONS ARE BASED ON THE FOLLOWING INFORMATION:


Measuring Forecast Error for the Housing Under Construction, September Data

Second Naïve Forecast
Year   Data (Yt)   Forecast (Ŷt)   Error (Yt − Ŷt)
2000   16,990      19,584          -2,594
2001   17,850      18,710            -860
2002   18,922      19,994          -1,072
2003   18,552      18,182             370
2004   22,595      26,638          -4,043
2005   21,850      21,105             745
2006   22,895      23,940          -1,045

9. What is the mean absolute deviation (MAD) for the Second Naïve Forecast of Housing Under Construction, September Data?
(1) 1,533
(2) 1,045
(3) -1,214
(4) 10,729



10. What is the mean square error (MSE) for the Second Naïve Forecast of Housing Under Construction, September Data?
(1) 2,349,213
(2) 1,474,142
(3) 3,821,060
(4) 1,092,025

11. What is the mean absolute percentage error (MAPE) for the Second Naïve Forecast of Housing Under Construction, September Data?
(1) 0.5361
(2) 0.0766
(3) 0.7921
(4) -0.0612

12. What is the mean percentage error (MPE) for the Second Naïve Forecast of Housing Under Construction, September Data?
(1) 0.5361
(2) -0.0456
(3) 0.0456
(4) -0.0612

13. To ensure that non-stationarity has been eliminated from the data series, the forecaster should:
(1) obtain an autocorrelation function of the errors produced by the forecast method.
(2) look at the partial autocorrelation function and identify if there is a tail.
(3) run an autoregressive integrated moving average (ARIMA).
(4) look for exponential smoothing.

14. Which of the following statements regarding exponential smoothing is/are TRUE?
A. Exponential smoothing is similar to the moving average technique, except the forecast applies the greatest weight to the variable's most recent value.
B. Holt's method is better-suited for stationary time series data, exponential smoothing is appropriate for trend data, and Winter's method works well with trend and seasonally adjusted data.
C. Exponential smoothing is appropriate for data with a trend.
D. Exponential smoothing continually revises an estimate in the light of more recent experiences.
(1) A and B
(2) B and C
(3) A and D
(4) All of the above



15. Data should be differenced if:
A. the data has no trend
B. the data is autoregressive and the first autocorrelation is above 0.95
C. the data has a trend
D. the data is autoregressive and the partial autocorrelation is below 0.95
(1) A or B
(2) B or C
(3) C or D
(4) A or D

Total Marks: 15

PLANNING AHEAD
Now that you have completed a lesson on forecasting methods (Lesson 6) and forecasting techniques for single variables, you can begin to think about how this applies to your major project. Discuss your initial thoughts as to how you might incorporate statistical forecasting into your project: (a) What methods do you propose to apply? (b) How would you apply them? (c) How would you interpret/critique the results? (d) How might these results be implemented? (e) What constraints do you foresee in applying these in your major project (e.g., data limitations) and how do you propose to deal with these constraints? To generate ideas, it may help to review the "Hardware Store Location" case study in Lesson 8.

***End of Assignment 7***

