Identifying relationships Dr James Abdey

Applied Marketing (Market Research Methods) Topic 8: Identifying relationships

Dr James Abdey

Overview
- Relationship between two variables
- Correlation
- Regression
- The simple linear regression model
- Parameter estimation
- Interpretation of correlation coefficient
- Coefficient of determination, $R^2$
- Prediction
- Regression diagnostics
- Worked example
- Multiple linear regression


We consider regression analysis, which is used for explaining variation in market share, sales, brand preference etc. This may use explanatory variables such as advertising, price, distribution and product quality. Starting with correlation, we proceed to the simple linear regression model, followed by multiple linear regression.

Relationship between two variables

We now investigate the relationship between two variables. When we have data on two variables (X and Y), we have bivariate data. We will consider how to:
- measure the strength of the relationship
- model the relationship
- predict the value of one variable on the basis of the other


The first thing to do with data is to provide a graphical representation. For one variable this might be a histogram, stem-and-leaf diagram etc. For two variables we produce a scatter diagram. This must include the following:
- title
- axis labels
- units
and be accurate!


Assume that we have some data in paired form: $(x_i, y_i)$, $i = 1, 2, \ldots, n$. An example might be unemployment and crime figures for 12 areas of a city, of interest to insurers in setting policy premia for people insuring against theft.

Unemp., x   Offences, y   Unemp., x   Offences, y
2614        6200          1687        5420
1160        4610          1287        5588
1055        5336          1869        5719
1199        5411          2283        6336
2157        5808          1162        5103
2305        6004          1201        5268


We plot X on the horizontal axis, and Y on the vertical axis. This emphasises any relationship between the variables.

[Figure: Scatter plot of Crime against Unemployment. Horizontal axis: Unemployment; vertical axis: Number of offences.]

A positive, linear relationship is apparent: X and Y increase together, roughly linearly. The points do not lie exactly on a straight line, hence the implied linear relationship is not exact. Such an upward shape is termed positive correlation. We will see later how to quantify correlation.


Other examples of scatter plots include:
- LHS: Negative correlation (Y decreases as X increases)
- RHS: Uncorrelated data (no obvious (linear) relationship between X and Y)

[Figure: two panels, each titled "Scatter plot", with x on the horizontal axis and y on the vertical axis.]

Correlation

Correlation measures the strength of the linear relationship between two variables, each measured on an interval scale.
- Positive correlation: the two variables tend to vary in the same direction
- Negative correlation: the two variables tend to vary in the opposite direction
- Perfect correlation: the two variables have points which all lie exactly on a straight line


If there exists a perfect linear relationship between X and Y, we can represent them using an equation of the form $Y = \alpha + \beta X$, where $\alpha$ represents the intercept of the line and $\beta$ represents the slope or gradient of the line.

Examples of anticipated correlation:

Variables                                   Correlation
Height & weight                             Positive
Rainfall & sunshine hours                   Negative
Ice cream sales & sun cream sales           Positive
Hours of study & exam mark                  Positive
Car's petrol consumption & goals scored     Zero


- Positive correlation: large X with large Y; small X with small Y
- Negative correlation: large X with small Y; small X with large Y

However, since X and Y may have widely different numerical values, we need to take this into account. We do this by considering how far each of the two scores is from its own mean.


So, we are interested in the degree to which variations in variable values are related to each other. Our basis for the measurement of correlation is

$$\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y}) = \sum_{i=1}^{n} x_i y_i - n\bar{x}\bar{y}$$

Unfortunately, this measure is extremely sensitive to the units in which the variables are measured. We would prefer a measure of correlation to remain the same regardless of the units of measurement (e.g. days, hours, minutes or seconds).


So, we use the following to measure the correlation for (sample) data:

$$r = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum (x_i - \bar{x})^2 \sum (y_i - \bar{y})^2}} = \frac{\sum x_i y_i - n\bar{x}\bar{y}}{\sqrt{\left(\sum x_i^2 - n\bar{x}^2\right)\left(\sum y_i^2 - n\bar{y}^2\right)}}$$

Returning to the unemployment/crime dataset:

$$\sum x_i = 19979, \quad \sum x_i^2 = 36695129, \quad \sum y_i = 66803, \quad \sum y_i^2 = 374471231, \quad \sum x_i y_i = 113784494$$

Since $n = 12$, we have $\bar{x} = 19979/12 = 1664.92$ and $\bar{y} = 66803/12 = 5566.92$. Hence the (sample) correlation coefficient, $r$, is $r = 0.861$. Of course, in practice we can use software like SPSS to calculate $r$ for us!

The (sample) correlation coefficient, $r$, takes values between $-1$ and $1$, i.e. $-1 \le r \le 1$.
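The calculation of $r$ can be reproduced outside SPSS. The following is a minimal Python sketch using only the standard library and the 12 data pairs from the table; the function names are illustrative choices, not part of the slides. It also illustrates the earlier point about units: rescaling x changes the raw cross-product sum but not $r$.

```python
import math

# Unemployment (x) and offences (y) for the 12 city areas in the example.
x = [2614, 1160, 1055, 1199, 2157, 2305, 1687, 1287, 1869, 2283, 1162, 1201]
y = [6200, 4610, 5336, 5411, 5808, 6004, 5420, 5588, 5719, 6336, 5103, 5268]


def corrected_cross_product(xs, ys):
    """Sum of (x_i - x_bar)(y_i - y_bar), via sum(x_i * y_i) - n * x_bar * y_bar."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    return sum(a * b for a, b in zip(xs, ys)) - n * x_bar * y_bar


def sample_correlation(xs, ys):
    """Sample correlation coefficient r, using the formula given above."""
    s_xy = corrected_cross_product(xs, ys)
    s_xx = corrected_cross_product(xs, xs)
    s_yy = corrected_cross_product(ys, ys)
    return s_xy / math.sqrt(s_xx * s_yy)


r = sample_correlation(x, y)
print(round(r, 3))  # 0.861

# Measuring unemployment in thousands changes the raw cross-product sum
# by a factor of 1000, but leaves r unchanged.
x_thousands = [value / 1000 for value in x]
r_rescaled = sample_correlation(x_thousands, y)
print(corrected_cross_product(x, y), corrected_cross_product(x_thousands, y))
print(round(r_rescaled, 3))
```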

- $r > 0$ indicates positive correlation, with $r = 1$ indicating perfect positive correlation
- $r < 0$ indicates negative correlation, with $r = -1$ indicating perfect negative correlation
- The closer $|r|$ is to 1, the stronger the linear relationship is


Regression

Here we introduce simple linear regression. This is only part of a very large topic in statistical analysis. In the simple model, we have two variables Y and X:
- Y is the dependent (or response) variable: that which we are trying to explain
- X is the independent (or explanatory) variable: the factor we think influences Y

The simple linear regression model

Assume a true (population) linear relationship between a response variable $y$ and an explanatory variable $x$ of the approximate form:

$$y = \alpha + \beta x$$

- $\alpha$ and $\beta$ are fixed, but unknown, population parameters
- $\alpha$ is the y-intercept
- $\beta$ is the slope of the line

We seek to estimate $\alpha$ and $\beta$ using (paired) sample data $(x_i, y_i)$, $i = 1, \ldots, n$.

Particularly in business, we would not expect a perfect linear relationship between the two variables. Hence we modify this basic model to

$$y = \alpha + \beta x + \varepsilon$$

Here $\varepsilon$ is some random perturbation from the initial approximate line. In other words, each y observation almost lies on the postulated line, but jumps off the line according to the random variable $\varepsilon$, which is often referred to as the error term.

Parameter estimation: The least squares method

For given sample data we could produce a scatter plot. Any linear relationship would be visible, and this would suggest performing a (simple) linear regression. We estimate the population regression line; this estimated line is often termed the line of best fit.

How do we choose the line of best fit? We require a formal criterion for determining it. Estimation of $\alpha$ and $\beta$ will be by least squares estimation. Specifically, we seek to minimise the sum of the squared residuals, where a residual is the difference between the actual (observed) y value and its predicted (fitted) value.

The least squares estimator for $\beta$ is

$$\hat{\beta} = \frac{\sum x_i y_i - n\bar{x}\bar{y}}{\sum x_i^2 - n\bar{x}^2}$$

The least squares estimator for $\alpha$ is

$$\hat{\alpha} = \bar{y} - \hat{\beta}\bar{x}$$

Hence the line of best fit has equation:

$$\hat{y} = \hat{\alpha} + \hat{\beta} x$$

Again, this is routinely calculated in SPSS.

Returning to the unemployment/crime dataset:

$$\sum x_i = 19979, \quad \sum x_i^2 = 36695129, \quad \sum y_i = 66803, \quad \sum y_i^2 = 374471231, \quad \sum x_i y_i = 113784494$$

Since $n = 12$, we have $\bar{x} = 19979/12 = 1664.92$ and $\bar{y} = 66803/12 = 5566.92$, hence

$$\hat{\beta} = \frac{\sum x_i y_i - n\bar{x}\bar{y}}{\sum x_i^2 - n\bar{x}^2} = \frac{113784494 - (12 \times 1664.92 \times 5566.92)}{36695129 - (12 \times 1664.92^2)} = 0.7468$$


We estimate the intercept to be

$$\hat{\alpha} = \bar{y} - \hat{\beta}\bar{x} = 5566.92 - 0.7468 \times 1664.92 = 4323.6$$

Hence the least squares regression line is

$$\hat{y} = 4323.6 + 0.7468x$$

Note the $\hat{y}$ notation, where the hat denotes an estimated value.
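The two estimators can be checked directly against the data. This is a minimal sketch in plain Python (the function name `least_squares_line` is an illustrative choice); because it carries full precision rather than the rounded intermediate values used above, its output may differ from 4323.6 and 0.7468 in the last reported digit.

```python
# Unemployment (x) and offences (y) for the 12 city areas in the example.
x = [2614, 1160, 1055, 1199, 2157, 2305, 1687, 1287, 1869, 2283, 1162, 1201]
y = [6200, 4610, 5336, 5411, 5808, 6004, 5420, 5588, 5719, 6336, 5103, 5268]


def least_squares_line(xs, ys):
    """Return (alpha_hat, beta_hat) for the line of best fit y = alpha + beta * x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Numerator and denominator of the least squares slope estimator.
    s_xy = sum(a * b for a, b in zip(xs, ys)) - n * x_bar * y_bar
    s_xx = sum(a * a for a in xs) - n * x_bar ** 2
    beta_hat = s_xy / s_xx
    alpha_hat = y_bar - beta_hat * x_bar
    return alpha_hat, beta_hat


alpha_hat, beta_hat = least_squares_line(x, y)
print(f"y_hat = {alpha_hat:.1f} + {beta_hat:.4f} x")
```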

Interpretation of correlation coefficient

In the case of perfect correlation between X and Y, we can predict Y directly and exactly from X. In the case of zero correlation between X and Y, knowledge of X tells us nothing about Y. Here we consider measuring the extent to which the values of one variable can be used to predict the values of another where the correlation is neither $-1$, nor $0$, nor $1$.

Our overall objective is to explain the response variable Y, which is a random variable. We try to explain the variation in Y. Using simple linear regression, we attempt this using a single explanatory variable, X. The total variation in the response variable sample data is simply

$$\sum_{i=1}^{n} (y_i - \bar{y})^2$$

We term this the total sum of squares (TSS).

We can decompose TSS into two components:
- the amount we are able to explain using the model, called the explained sum of squares (ESS); and
- the remaining variation that we are unable to explain with the model, called the residual sum of squares (RSS)


Hence, TSS = ESS + RSS

Coefficient of determination, $R^2$

We can assess the overall fit of a model using $R^2$. This measures the proportion of the total variability in the response variable explained by the model. This statistic is known as the coefficient of determination and is defined as

$$R^2 = \frac{\text{ESS}}{\text{TSS}}, \qquad 0 \le R^2 \le 1$$

The closer $R^2$ is to 1, the better the explanatory power of the model. Note that $R^2 = r^2$ for a simple linear model, so we can also compute it from $r$ (the correlation coefficient).

Returning to the crime/unemployment dataset, let's assign Y and X as follows:
- Y = number of offences
- X = unemployment

The least squares regression line was $\hat{y} = 4323.6 + 0.7468x$. The correlation coefficient was 0.861, therefore $R^2 = 0.861^2 = 0.7413$. This means we can explain 74.13% of the variation in number of offences using unemployment.
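The decomposition TSS = ESS + RSS and the identity $R^2 = \text{ESS}/\text{TSS}$ can be verified numerically on the same data. The sketch below is illustrative only (variable names are my own); working at full precision it gives an $R^2$ of roughly 0.741, which matches $0.861^2$ up to the rounding of $r$.

```python
# Unemployment (x) and offences (y) for the 12 city areas in the example.
x = [2614, 1160, 1055, 1199, 2157, 2305, 1687, 1287, 1869, 2283, 1162, 1201]
y = [6200, 4610, 5336, 5411, 5808, 6004, 5420, 5588, 5719, 6336, 5103, 5268]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Least squares estimates, as defined earlier.
beta_hat = (sum(a * b for a, b in zip(x, y)) - n * x_bar * y_bar) / (
    sum(a * a for a in x) - n * x_bar ** 2
)
alpha_hat = y_bar - beta_hat * x_bar
fitted = [alpha_hat + beta_hat * xi for xi in x]

# Total, explained and residual sums of squares.
tss = sum((yi - y_bar) ** 2 for yi in y)
ess = sum((fi - y_bar) ** 2 for fi in fitted)
rss = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))

r_squared = ess / tss
print(f"TSS = {tss:.1f}, ESS + RSS = {ess + rss:.1f}, R^2 = {r_squared:.4f}")
```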

Prediction

One of the purposes in calculating the line of best fit is prediction. Specifically, for some value of $x$, we can provide a prediction for $y$. So, returning to the example, how many offences would you predict if there were 2000 unemployed people in a city area? Answer: just substitute the desired value of $x$ into the least squares regression line:

$$\hat{y} = 4323.6 + 0.7468 \times 2000 = 5817$$

Provided we are predicting $y$ for an $x$ value that is within the range of the available $x$ data, we can be fairly confident in the prediction. This is what we call interpolation. However, if we base our prediction on an $x$ value outside the available $x$ data, then we should view the prediction with caution. This would be an example of extrapolation, which is risky since the relationship between $x$ and $y$ may change for such values of $x$.
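As a small illustration of the interpolation/extrapolation distinction, the sketch below wraps the fitted line in a hypothetical helper, `predict`, that also reports whether the requested $x$ lies outside the observed unemployment range (1055 to 2614 in the data table). The helper and its flag are assumptions for illustration, not something SPSS provides in this form.

```python
def predict(x_new, alpha_hat=4323.6, beta_hat=0.7468, x_min=1055, x_max=2614):
    """Return (predicted offences, True if x_new is outside the observed x range)."""
    y_hat = alpha_hat + beta_hat * x_new
    is_extrapolation = not (x_min <= x_new <= x_max)
    return y_hat, is_extrapolation


prediction, extrapolating = predict(2000)
print(round(prediction), extrapolating)  # 5817 False

# 5000 unemployed is far beyond the largest observed value, so the
# prediction is flagged and should be viewed with caution.
far_prediction, far_extrapolating = predict(5000)
print(round(far_prediction), far_extrapolating)
```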

Regression diagnostics

The usefulness of a fitted regression model rests on a basic assumption: $E(y) = \alpha + \beta x$. Furthermore, inference such as hypothesis tests, confidence intervals and predictive intervals only makes sense if the error terms are (approximately) independent and normal with constant variance $\sigma^2$. Therefore it is important to check that these conditions are met in practice; this task is called regression diagnostics. Basic idea: look into the residuals $\hat{\varepsilon}_i$ or the normalised residuals $\hat{\varepsilon}_i / \hat{\sigma}$.

What to look for?
- Do the residuals manifest IID normal behaviour?
- Is the scatter plot of $\hat{\varepsilon}_i$ versus $x_i$ patternless?
- Is the scatter plot of $\hat{\varepsilon}_i$ versus $\hat{y}_i$ patternless?
- Is the scatter plot of $\hat{\varepsilon}_i$ versus $i$ patternless?

If you see trends, periodic patterns or increasing variation in any one of the above scatter plots, it is very likely that at least one assumption is violated!
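The residuals themselves are easy to compute once the line has been fitted. The following sketch does so for the crime/unemployment data. It assumes the usual estimate of $\sigma$ based on $n - 2$ degrees of freedom, which the slides do not state explicitly; the resulting values would then be plotted against $x_i$, $\hat{y}_i$ and $i$ in whatever plotting tool is to hand.

```python
import math

# Unemployment (x) and offences (y) for the 12 city areas in the example.
x = [2614, 1160, 1055, 1199, 2157, 2305, 1687, 1287, 1869, 2283, 1162, 1201]
y = [6200, 4610, 5336, 5411, 5808, 6004, 5420, 5588, 5719, 6336, 5103, 5268]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
beta_hat = (sum(a * b for a, b in zip(x, y)) - n * x_bar * y_bar) / (
    sum(a * a for a in x) - n * x_bar ** 2
)
alpha_hat = y_bar - beta_hat * x_bar

# Residual = observed y minus fitted y.
fitted = [alpha_hat + beta_hat * xi for xi in x]
residuals = [yi - fi for yi, fi in zip(y, fitted)]

# Assumption: sigma is estimated with n - 2 degrees of freedom.
sigma_hat = math.sqrt(sum(e * e for e in residuals) / (n - 2))
normalised = [e / sigma_hat for e in residuals]

for i, (xi, fi, e, z) in enumerate(zip(x, fitted, residuals, normalised), start=1):
    print(f"i={i:2d}  x={xi:5d}  fitted={fi:8.1f}  residual={e:8.1f}  normalised={z:6.2f}")
```

By construction, least squares residuals sum to zero and are uncorrelated with $x$, so any visible pattern in the plots points to a problem with the model assumptions rather than with the fitting arithmetic.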

Two other issues in regression diagnostics: outliers and influential observations.

Outlier: An unusually small or unusually large $y_i$ which lies outside of the majority of observations. An outlier is often caused by an error in either sampling or recording data. If so, we should correct it before proceeding with the regression analysis. If an observation which looks like an outlier indeed belongs to the sample and no errors in sampling or recording were discovered, we may use a more complex model or distribution to accommodate this outlier. For example, stock returns often exhibit extreme values and they often cannot be modelled satisfactorily by a normal regression model.

Influential observation: An $x_i$ which is far away from the other $x_i$ values. Such an observation may have a large influence on the fitted regression line.


Regression: Worked example

We apply the simple linear regression method to study the relationship between two series of financial returns: a regression of Cisco Systems stock returns, $y$, on S&P500 Index returns, $x$. This regression model is an example of the CAPM (Capital Asset Pricing Model).

Stock returns:

$$\text{Return} = \frac{\text{Current price} - \text{Previous price}}{\text{Previous price}} \approx \log\left(\frac{\text{current price}}{\text{previous price}}\right)$$

when the difference between the two prices is small.
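The quality of the log approximation can be seen with a couple of numbers. The prices below are hypothetical, chosen only to contrast a small price change with a large one; they are not taken from the Cisco or S&P500 data.

```python
import math


def simple_return(current_price, previous_price):
    """(current - previous) / previous."""
    return (current_price - previous_price) / previous_price


def log_return(current_price, previous_price):
    """Natural log of the price ratio."""
    return math.log(current_price / previous_price)


# Hypothetical small move: 100.0 -> 100.5 (the two definitions nearly agree).
small_simple = simple_return(100.5, 100.0)
small_log = log_return(100.5, 100.0)

# Hypothetical large move: 100.0 -> 150.0 (the approximation is poor).
large_simple = simple_return(150.0, 100.0)
large_log = log_return(150.0, 100.0)

print(f"small move: simple={small_simple:.5f}, log={small_log:.5f}")
print(f"large move: simple={large_simple:.5f}, log={large_log:.5f}")
```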

Remark: Daily prices are definitely not independent. However, daily returns may be seen as a sequence of uncorrelated random variables.

For S&P500, the average daily return is -0.04%, the maximum daily return is 4.65%, the minimum daily return is -6.01%, and the standard deviation is 1.40%. For Cisco, the average daily return is -0.13%, the maximum daily return is 15.42%, the minimum daily return is -13.44%, and the standard deviation is 4.23%.

Descriptive Statistics

                     N     Range   Minimum   Maximum   Mean     Std. Deviation   Variance
SP500                252   10.66   -6.00     4.65      -.0424   1.40017          1.960
Cisco                252   28.85   -13.44    15.42     -.1336   4.23419          17.928
Valid N (listwise)   252

Remark: Cisco is much more volatile than the S&P500. There is clear synchronisation between the movements of the two series of returns.

We fit a regression model:

$$\text{Cisco} = \alpha + \beta \, \text{S\&P500} + \varepsilon$$

Rationale: Part of the fluctuation in Cisco returns was driven by the fluctuation of the S&P500 returns.
Coefficients (a)

                Unstandardized Coefficients   Standardized Coefficients                     95.0% Confidence Interval for B
Model 1         B        Std. Error           Beta                        t        Sig.     Lower Bound   Upper Bound
(Constant)      -.012    .064                                             -.188    .851     -.139         .114
Cisco           .227     .015                 .687                        14.943   .000     .197          .257

a. Dependent Variable: SP500

[Table: Model Summary, with columns Model, R, R Square, Adjusted R Square, Std. Error of the Estimate.]

a. Predictors: (Constant), Cisco
b. Dependent Variable: SP500

When testing the statistical significance of regression coefficients, we just need to look at the p-value. The smaller the p-value, the more significant the result, i.e. the stronger the evidence that the true parameter value is different from zero. In practice, we treat p-values smaller than 0.05 as being statistically significant (at the 5% significance level).

The estimated slope: $\hat{\beta} = 2.077$. The null hypothesis $H_0: \beta = 0$ is rejected with a p-value reported as 0.000 (i.e. less than 0.0005): extremely significant.

Attempted interpretation: When the market index goes up by 1%, Cisco stock goes up by 2.077%, on average. However, the error term $\varepsilon$ in the model is large, with an estimated $\hat{\sigma} = 3.08\%$.

The p-value for testing $H_0: \alpha = 0$ is 0.815, so we cannot reject the hypothesis that $\alpha = 0$. Recall $\hat{\alpha} = \bar{y} - \hat{\beta}\bar{x}$, and both $\bar{y}$ and $\bar{x}$ are very close to 0.
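The reported slope can be cross-checked from summary statistics alone, using the standard identity $\hat{\beta} = r \, s_y / s_x$ for simple linear regression. The sketch below assumes that the standardized coefficient (Beta) of .687 in the coefficients table equals the sample correlation $r$ between the two return series, which holds for any regression with a single explanatory variable. The means and standard deviations are taken from the descriptive statistics table.

```python
# Summary statistics from the descriptive statistics table (in % per day).
x_bar, s_x = -0.0424, 1.40017   # S&P500 returns
y_bar, s_y = -0.1336, 4.23419   # Cisco returns

# Assumption: the standardized coefficient .687 is the sample correlation r.
r = 0.687

# Slope and intercept of the regression of Cisco on S&P500.
beta_hat = r * s_y / s_x
alpha_hat = y_bar - beta_hat * x_bar

print(f"beta_hat  = {beta_hat:.4f}")
print(f"alpha_hat = {alpha_hat:.4f}")
```

The slope agrees with the quoted 2.077 up to the rounding of $r$, and the intercept comes out very close to zero, consistent with the large p-value for $H_0: \alpha = 0$.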

$R^2 = 47.2\%$ of the variation of Cisco stock may be explained by the variation of the S&P500 index, or in other words 47.2% of the risk in Cisco stock is the market-related risk (see CAPM below).

CAPM: A simple asset pricing model in finance:

$$y_i = \alpha + \beta x_i + \varepsilon_i$$

where $y_i$ is a stock return and $x_i$ is a market return at time $i$.


Total risk of the stock:

$$\frac{1}{n}\sum_{i=1}^{n} (y_i - \bar{y})^2 = \frac{1}{n}\sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2 + \frac{1}{n}\sum_{i=1}^{n} (y_i - \hat{y}_i)^2$$

Market-related (or systematic) risk:

$$\frac{1}{n}\sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2 = \hat{\beta}^2 \, \frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})^2$$

Firm-specific risk:

$$\frac{1}{n}\sum_{i=1}^{n} (y_i - \hat{y}_i)^2$$

$\beta$ measures the market-related (or systematic) risk of the stock. Market-related risk is unavoidable, while firm-specific risk may be diversified away through hedging. Variance is a simple measure (and one of the most frequently used) for risk in finance.
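The risk decomposition can be illustrated with the numbers already quoted. The sketch below uses the slope 2.077 and the standard deviations from the descriptive statistics table. One caveat is an assumption on my part: the tabulated variances presumably use the $n - 1$ divisor rather than $1/n$, which does not affect the ratio of market-related to total risk.

```python
import math

beta_hat = 2.077                # estimated slope from the regression output
var_market = 1.40017 ** 2       # variance of S&P500 returns
var_stock = 4.23419 ** 2        # variance of Cisco returns (total risk)

# Market-related (systematic) risk: beta_hat^2 times the market variance.
systematic_risk = beta_hat ** 2 * var_market

# Firm-specific risk: what remains of the total risk.
firm_specific_risk = var_stock - systematic_risk

share_systematic = systematic_risk / var_stock
print(f"systematic share of total risk = {share_systematic:.3f}")
print(f"sqrt(firm-specific risk)       = {math.sqrt(firm_specific_risk):.2f}")
```

The systematic share comes out at about 47%, in line with the quoted $R^2$, and the square root of the firm-specific part is close to the estimated $\hat{\sigma}$ of 3.08%.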


Multiple linear regression

Previously we saw simple linear regression, which had one explanatory variable. Often one explanatory variable is not enough to explain variation in the response variable, so we add more linear explanatory variables.

Multiple linear regression examples

Absenteeism in the workforce could be due to:
- hours worked
- flexibility in work practice
- salary paid...

Salary for managers could be related to:
- qualifications
- experience
- hours worked
- performance...

Remember the aim of statistics is prediction and decision making. In order to make the best predictions and decisions we need to use the best models. This often means making more complex models by adding more explanation, but not too complex (Occam's razor).

The multiple linear model

Suppose $y$ is the manager's salary, with $x_1$ = qualifications, $x_2$ = experience, $x_3$ = hours and $x_4$ = performance:

$$y = \beta_0 + \beta_{\text{qual}} x_1 + \beta_{\text{exp}} x_2 + \beta_{\text{hrs}} x_3 + \beta_{\text{per}} x_4 + \varepsilon$$

We can visualise up to $n = 3$ dimensions.

Multiple linear regression uses least squares estimation, like simple linear regression. That is, we minimise the sum of the squared residuals in all dimensions. This sounds tricky, but fortunately software (SPSS etc.) takes care of that for us.
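To give a feel for what the software does, here is a minimal pure-Python sketch that fits a multiple regression by solving the normal equations $(X^\top X)\mathbf{b} = X^\top \mathbf{y}$ with Gaussian elimination. Everything in it is an assumption for illustration: the function names, the use of only two explanatory variables (qualifications and experience), and the made-up, noise-free salary data generated from known coefficients so that the result is easy to check. Real packages use more numerically robust methods, and the sketch assumes no perfect collinearity among the explanatory variables.

```python
def solve_linear_system(a, b):
    """Solve a * z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are left untouched.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    z = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (m[i][n] - tail) / m[i][i]
    return z


def fit_multiple_regression(rows, ys):
    """Least squares coefficients [b0, b1, ..., bk] via the normal equations."""
    design = [[1.0] + list(row) for row in rows]  # leading 1 for the intercept
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * yi for d, yi in zip(design, ys)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical data: (qualifications score, years of experience) per manager.
managers = [(1, 2), (2, 1), (3, 5), (4, 3), (5, 8), (2, 6)]
# Hypothetical salaries (in thousands) generated without noise from
# salary = 20 + 3 * qualifications + 1.5 * experience.
salaries = [20 + 3 * q + 1.5 * e for q, e in managers]

coefficients = fit_multiple_regression(managers, salaries)
print([round(c, 6) for c in coefficients])
```

Because the hypothetical salaries contain no error term, least squares recovers the generating coefficients (up to floating-point error); with real data the fitted values would only approximate the observations.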
