
Linear Regression Model

Daniel Romero Rodriguez


2018

Predictive Models
Predictive modeling is a process that uses data mining and
probability to forecast outcomes. Each model is made up of a
number of predictors, which are variables that are likely to
influence future results.

Regression and Classification Problems

Classification Problems
Classification is the process of predicting the class of given
data points. Classes are sometimes called targets, labels, or
categories.
Examples: spam detection, medical diagnosis, fraud detection,
credit scoring, image classification, etc.

Classification Models
Logistic Regression
Linear Discriminant Analysis (LDA)
Classification Trees
K Nearest Neighbors

Linear Regression Model
Problems involving sets of variables, where it is known that
some inherent relationship exists among the variables, can be
solved with regression models.
Examples: house price based on key features, gas mileage
based on engine capacity, epidemiology, economics.

A model can be linear or nonlinear depending on the equation
structure and the relationship between the dependent and
independent variables.

Regression Models
Regression Trees
Linear Regression Model

Linear Regression Model
A simple linear regression model has the following structure:

$Y = \beta_0 + \beta_1 X + \varepsilon$

$\beta_0$ is the intercept.
$\beta_1$ is the slope: the change in $Y$ per unit change in $X$.
$Y$ is the dependent variable, the variable to be predicted.
$X$ is the independent variable; it can be controlled and is not random.
$\varepsilon$ is the random error (perturbation), with $E[\varepsilon] = 0$.
Linear Regression Model
• The objective is to estimate $\beta_0$ and $\beta_1$ based on a dataset with pairs $(x_i, y_i)$.

• $b_0$ and $b_1$ are estimates of $\beta_0$ and $\beta_1$:

$\hat{y} = b_0 + b_1 x$

Linear Regression Model Steps
• 1. Dispersion diagram (scatter plot)
• 2. Estimate $b_0$ and $b_1$
• 3. Validate the model with analysis of variance (ANOVA)
• 4. Coefficient of determination analysis ($R^2$)
• 5. Residuals analysis (normality, homoscedasticity, and independence)
• 6. Confidence intervals for the coefficients $b_0$ and $b_1$
• 7. Confidence interval for $\hat{Y}$ and prediction interval for $Y$
Step 1: Dispersion Diagram

Step 2: Estimate 𝑏0 and 𝑏1
We shall find b0 and b1, the estimates of β0 and β1 , so that the sum of the
squares of the residuals is a minimum. The residual sum of squares is often
called the sum of squares of the errors about the regression line and is
denoted by SSE.

$b_0 = \frac{\sum y_i - b_1 \sum x_i}{n} = \bar{y} - b_1 \bar{x} \qquad b_1 = \frac{n \sum x_i y_i - \sum x_i \sum y_i}{n \sum x_i^2 - \left(\sum x_i\right)^2}$
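As a quick sketch (Python/NumPy, not part of the original slides), the two closed-form formulas translate directly to code; the helper name fit_simple_linear is illustrative:

```python
import numpy as np

def fit_simple_linear(x, y):
    """Return (b0, b1) minimizing the sum of squared residuals (SSE)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    # b1 = (n * sum(x*y) - sum(x) * sum(y)) / (n * sum(x^2) - (sum(x))^2)
    b1 = (n * np.sum(x * y) - np.sum(x) * np.sum(y)) / (n * np.sum(x ** 2) - np.sum(x) ** 2)
    b0 = y.mean() - b1 * x.mean()  # b0 = y_bar - b1 * x_bar
    return b0, b1
```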

Example Errors

Example
A company assigns different prices to an electronic product. The following
table shows the product sales for different prices.

Sales: 400 440 380 450 420 420 380 350
Price:  60  50  65  45  50  55  60  65

Develop a dispersion diagram and compute a linear regression model.
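A possible solution sketch (Python with NumPy and matplotlib, not from the slides), assuming the natural column pairing of the table above:

```python
import numpy as np
import matplotlib.pyplot as plt

price = np.array([60, 50, 65, 45, 50, 55, 60, 65], dtype=float)
sales = np.array([400, 440, 380, 450, 420, 420, 380, 350], dtype=float)

# Dispersion (scatter) diagram
plt.scatter(price, sales)
plt.xlabel("Price")
plt.ylabel("Sales")
plt.show()

# Least-squares line; np.polyfit with degree 1 returns (slope, intercept)
b1, b0 = np.polyfit(price, sales, 1)
print(f"sales_hat = {b0:.2f} + ({b1:.2f}) * price")  # approx. 644.52 - 4.26 * price
```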

Step 3: Model Validation ANOVA
The analysis of variance splits the variation of $Y$ into components (regression model and error).

The idea is that the dependent variable $Y$ is explained by $\beta_0 + \beta_1 X$, and that the error has a small influence on the variation.

• $H_0: \beta_1 = 0$
• $H_1: \beta_1 \neq 0$
Step 3: Model Validation ANOVA

$\sum_{i=1}^{n} (y_i - \bar{y})^2 = \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2 + \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$

Sum of Squares Total ($SST$) = Sum of Squares Regression ($SSR$) + Sum of Squares Error ($SSE$)

Step 3: Model Validation ANOVA
• 𝑆𝑆𝑇: Total Variability of 𝑌.
• 𝑆𝑆𝑅: Variability of 𝑌 explained by the regression model.
• 𝑆𝑆𝐸: Variability of 𝑌 explained by the error.

$SST = S_{yy} \qquad S_{yy} = \sum (y_i - \bar{y})^2$

$SSR = b_1 S_{xy} \qquad S_{xy} = \frac{n \sum x_i y_i - \sum x_i \sum y_i}{n}$

$SSE = S_{yy} - b_1 S_{xy} \qquad S_{xx} = \frac{n \sum x_i^2 - \left(\sum x_i\right)^2}{n}$
Step 3: Model Validation ANOVA
Source     | SS  | DF    | MS                | f_Test
Regression | SSR | 1     | MSR = SSR / 1     | (SSR / 1) / (SSE / (n - 2))
Error      | SSE | n - 2 | MSE = SSE / (n-2) |
Total      | SST | n - 1 |                   |
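A sketch of the full ANOVA computation for the pricing example, using the $S_{xx}$, $S_{xy}$, $S_{yy}$ formulas above and SciPy's F distribution for the p-value (illustrative, not from the slides):

```python
import numpy as np
from scipy import stats

x = np.array([60, 50, 65, 45, 50, 55, 60, 65], dtype=float)          # price
y = np.array([400, 440, 380, 450, 420, 420, 380, 350], dtype=float)  # sales
n = len(x)

Sxy = (n * np.sum(x * y) - np.sum(x) * np.sum(y)) / n
Sxx = (n * np.sum(x ** 2) - np.sum(x) ** 2) / n
Syy = np.sum((y - y.mean()) ** 2)

b1 = Sxy / Sxx            # same slope as the closed-form formula in Step 2
SST, SSR = Syy, b1 * Sxy
SSE = SST - SSR

MSR, MSE = SSR / 1, SSE / (n - 2)
f_test = MSR / MSE
p_value = stats.f.sf(f_test, 1, n - 2)  # small p-value -> reject H0: beta1 = 0
print(f"f_Test = {f_test:.2f}, p-value = {p_value:.4f}")
```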

Step 3: Model Validation ANOVA

$H_0: \beta_1 = 0$
$H_1: \beta_1 \neq 0$

If $H_0$ is not rejected, then the model is not valid for predicting $Y$.

If $H_0$ is rejected, then the model explains the variability of $Y$ and is therefore an appropriate model.

When $f_{Test}$ is large, $H_0$ is rejected in favor of $H_1$, which means that $\beta_1 \neq 0$.

Step 4: Coefficient of Determination

The coefficient of determination, denoted $R^2$, is the proportion of the variance in the dependent variable that is predictable from the independent variable(s):

$R^2 = 1 - \frac{SSE}{SST}$
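As a quick cross-check (Python/NumPy sketch, not from the slides): in simple linear regression, $R^2$ equals the squared Pearson correlation between $x$ and $y$, so it can be computed without the ANOVA sums:

```python
import numpy as np

x = np.array([60, 50, 65, 45, 50, 55, 60, 65], dtype=float)
y = np.array([400, 440, 380, 450, 420, 420, 380, 350], dtype=float)

# For simple linear regression, R^2 = r^2 (squared Pearson correlation).
r = np.corrcoef(x, y)[0, 1]
print(f"R^2 = {r ** 2:.3f}")
```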

Example
The grades of a class of 9 students on a midterm report (x) and on the final examination (y) are as follows:

x (midterm): 77 50 71 40 46 90 96 99 67
y (final):   82 66 78 34 47 85 99 99 68

(a) Estimate the linear regression line.
(b) Carry out an ANOVA.
(c) Estimate the final examination grade of a student who received a grade of 85 on the midterm report.
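A possible solution sketch for parts (a) and (c), with the data read off the table above (Python/NumPy, not from the slides):

```python
import numpy as np

midterm = np.array([77, 50, 71, 40, 46, 90, 96, 99, 67], dtype=float)  # x
final = np.array([82, 66, 78, 34, 47, 85, 99, 99, 68], dtype=float)    # y

# (a) least-squares line; np.polyfit returns (slope, intercept) for degree 1
b1, b0 = np.polyfit(midterm, final, 1)
print(f"y_hat = {b0:.3f} + {b1:.3f} x")

# (c) point estimate at a midterm grade of 85
print(f"Estimated final grade for x = 85: {b0 + b1 * 85:.1f}")
```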
Problem: Simple Linear Regression
A study was made of the amount of converted sugar in a certain process at various temperatures. The data were coded and recorded as follows:

Temperature (x):     1.0 1.1 1.2 1.3 1.4 1.5 1.6 1.7 1.8 1.9 2.0
Converted Sugar (y): 7.7 7.8 8.2 8.4 8.8 8.9 8.6 9.0 9.3 9.2 10.5

(a) Estimate the linear regression line.
(b) Estimate the mean amount of converted sugar produced when the coded temperature is 1.75.
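A matching solution sketch for this problem (Python/NumPy, not from the slides):

```python
import numpy as np

temperature = np.array([1, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9, 2], dtype=float)
sugar = np.array([7.7, 7.8, 8.2, 8.4, 8.8, 8.9, 8.6, 9, 9.3, 9.2, 10.5], dtype=float)

# (a) least-squares line
b1, b0 = np.polyfit(temperature, sugar, 1)
print(f"sugar_hat = {b0:.3f} + {b1:.3f} * temperature")

# (b) estimated mean converted sugar at coded temperature 1.75
print(f"Estimate at x = 1.75: {b0 + b1 * 1.75:.3f}")
```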
Step 5: Residuals Analysis - Normality Test

• $H_0$: Residuals are normally distributed with mean $\mu = 0$ and variance $\sigma^2$ estimated as $s^2 = \frac{SSE}{n-2}$.
• $H_1$: Residuals are not normally distributed.

The residuals are estimated as $e_i = y_i - \hat{y}_i$. A goodness-of-fit test is carried out to evaluate the hypothesis.
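One way to carry out the check is the Shapiro-Wilk test from SciPy; this is an illustrative choice, since the slides do not name a specific goodness-of-fit test (sketch on the pricing data from the earlier example):

```python
import numpy as np
from scipy import stats

x = np.array([60, 50, 65, 45, 50, 55, 60, 65], dtype=float)          # price
y = np.array([400, 440, 380, 450, 420, 420, 380, 350], dtype=float)  # sales

b1, b0 = np.polyfit(x, y, 1)
residuals = y - (b0 + b1 * x)       # e_i = y_i - y_hat_i

stat, p = stats.shapiro(residuals)  # H0: residuals are normally distributed
print(f"Shapiro-Wilk W = {stat:.3f}, p-value = {p:.3f}")
```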

Step 5: Residuals Analysis - Homogeneity of Variance

• $H_0$: Residuals have homogeneous variance.
• $H_1$: Residuals have heterogeneous variance.

Plot the standardized residuals versus $\hat{y}$ or $x$ and evaluate whether there are patterns or variance changes.
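A minimal sketch of this diagnostic plot, assuming the pricing data from the earlier example (Python with matplotlib, not from the slides):

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.array([60, 50, 65, 45, 50, 55, 60, 65], dtype=float)
y = np.array([400, 440, 380, 450, 420, 420, 380, 350], dtype=float)

b1, b0 = np.polyfit(x, y, 1)
fitted = b0 + b1 * x
residuals = y - fitted
s = np.sqrt(np.sum(residuals ** 2) / (len(x) - 2))  # s^2 = SSE / (n - 2)

# Standardized residuals vs. fitted values; look for funnels or trends.
plt.scatter(fitted, residuals / s)
plt.axhline(0, linestyle="--")
plt.xlabel("Fitted values")
plt.ylabel("Standardized residuals")
plt.show()
```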

Step 5: Residuals Analysis- Independence
Residual independence can be tested by analyzing the residuals plot or by using a statistical test such as Durbin-Watson.

Step 5: Residuals Analysis- Independence
The Durbin-Watson statistic is estimated from the residuals $e_i$.

The statistic lies in the range $0 \le D_w \le 4$. When $D_w$ is close to 2, the residuals are assumed independent. For a given significance level, the critical values $d_l$ and $d_u$ are found from the table. The decision is based on the following guidelines:
If $0 \le D_w \le d_l$: positive correlation
If $d_l \le D_w \le d_u$: inconclusive
If $d_u \le D_w \le 4 - d_u$: residuals are independent
If $4 - d_u \le D_w \le 4 - d_l$: inconclusive
If $4 - d_l \le D_w \le 4$: negative correlation
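A direct sketch of the statistic from its definition, $D_w = \sum_{i=2}^{n}(e_i - e_{i-1})^2 / \sum_{i=1}^{n} e_i^2$; the helper name is illustrative, and statsmodels provides the same computation as statsmodels.stats.stattools.durbin_watson:

```python
import numpy as np

def durbin_watson(residuals):
    """D_w = sum_{i=2..n} (e_i - e_{i-1})^2 / sum_{i=1..n} e_i^2."""
    e = np.asarray(residuals, dtype=float)
    # np.diff gives e_i - e_{i-1}; values of D_w near 2 suggest independence.
    return np.sum(np.diff(e) ** 2) / np.sum(e ** 2)
```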

Step 6: Inference on $\beta_0$ and $\beta_1$
The estimates $b_0$ and $b_1$ of $\beta_0$ and $\beta_1$ are values of the random variables $B_0$ and $B_1$, with $E[B_0] = \beta_0$ and $E[B_1] = \beta_1$.

The variances of $B_0$ and $B_1$ are:

$V(B_0) = s^2 \left[ \frac{1}{n} + \frac{\bar{x}^2}{S_{xx}} \right] \qquad V(B_1) = \frac{s^2}{S_{xx}}$

where $s^2$ is given by $s^2 = \frac{SSE}{n-2}$.

Step 6: Inference on $\beta_0$ and $\beta_1$
The $(1 - \alpha)100\%$ confidence intervals for $\beta_0$ and $\beta_1$ are:

$b_0 - t_{n-2,\,\alpha/2} \sqrt{V(B_0)} \le \beta_0 \le b_0 + t_{n-2,\,\alpha/2} \sqrt{V(B_0)}$

$b_1 - t_{n-2,\,\alpha/2} \sqrt{V(B_1)} \le \beta_1 \le b_1 + t_{n-2,\,\alpha/2} \sqrt{V(B_1)}$
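A sketch computing both intervals for the pricing example, using SciPy's t quantile; $\alpha = 0.05$ is an arbitrary illustration value (not from the slides):

```python
import numpy as np
from scipy import stats

x = np.array([60, 50, 65, 45, 50, 55, 60, 65], dtype=float)
y = np.array([400, 440, 380, 450, 420, 420, 380, 350], dtype=float)
n, alpha = len(x), 0.05

b1, b0 = np.polyfit(x, y, 1)
Sxx = np.sum((x - x.mean()) ** 2)
s2 = np.sum((y - (b0 + b1 * x)) ** 2) / (n - 2)  # s^2 = SSE / (n - 2)
t = stats.t.ppf(1 - alpha / 2, n - 2)

half_b0 = t * np.sqrt(s2 * (1 / n + x.mean() ** 2 / Sxx))  # t * sqrt(V(B0))
half_b1 = t * np.sqrt(s2 / Sxx)                            # t * sqrt(V(B1))
print(f"beta0: [{b0 - half_b0:.2f}, {b0 + half_b0:.2f}]")
print(f"beta1: [{b1 - half_b1:.3f}, {b1 + half_b1:.3f}]")
```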

Step 7: Inference on $Y$
The $(1 - \alpha)100\%$ confidence interval for the mean response $\mu_{Y|X=x_0}$ is:

$\hat{y}_0 - t_{n-2,\,\alpha/2} \sqrt{V(\hat{Y})} \le \mu_{Y|X=x_0} \le \hat{y}_0 + t_{n-2,\,\alpha/2} \sqrt{V(\hat{Y})}$

where

$V(\hat{Y}) = s^2 \left[ \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}} \right]$
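A sketch of this interval for the pricing example; $x_0 = 55$ and $\alpha = 0.05$ are arbitrary illustration values (not from the slides):

```python
import numpy as np
from scipy import stats

x = np.array([60, 50, 65, 45, 50, 55, 60, 65], dtype=float)
y = np.array([400, 440, 380, 450, 420, 420, 380, 350], dtype=float)
n, alpha, x0 = len(x), 0.05, 55.0

b1, b0 = np.polyfit(x, y, 1)
Sxx = np.sum((x - x.mean()) ** 2)
s2 = np.sum((y - (b0 + b1 * x)) ** 2) / (n - 2)
t = stats.t.ppf(1 - alpha / 2, n - 2)

y0 = b0 + b1 * x0
# half-width = t * sqrt(V(Y_hat)) with V(Y_hat) = s^2 * (1/n + (x0 - x_bar)^2 / Sxx)
half = t * np.sqrt(s2 * (1 / n + (x0 - x.mean()) ** 2 / Sxx))
print(f"mean response at x0 = {x0}: [{y0 - half:.2f}, {y0 + half:.2f}]")
```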

Step 7: Inference on $Y$
The $(1 - \alpha)100\%$ prediction interval for a new observation $Y$ at $x = x_0$ is:

$\hat{y}_0 - t_{n-2,\,\alpha/2} \sqrt{V(Y)} \le Y \le \hat{y}_0 + t_{n-2,\,\alpha/2} \sqrt{V(Y)}$

where

$V(Y) = s^2 \left[ 1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}} \right]$
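The matching sketch for the prediction interval; the only change from the previous block is the extra $1 +$ term inside the square root (same illustrative $x_0$ and $\alpha$):

```python
import numpy as np
from scipy import stats

x = np.array([60, 50, 65, 45, 50, 55, 60, 65], dtype=float)
y = np.array([400, 440, 380, 450, 420, 420, 380, 350], dtype=float)
n, alpha, x0 = len(x), 0.05, 55.0

b1, b0 = np.polyfit(x, y, 1)
Sxx = np.sum((x - x.mean()) ** 2)
s2 = np.sum((y - (b0 + b1 * x)) ** 2) / (n - 2)
t = stats.t.ppf(1 - alpha / 2, n - 2)

y0 = b0 + b1 * x0
# V(Y) = s^2 * (1 + 1/n + (x0 - x_bar)^2 / Sxx): the extra 1 widens the interval
half = t * np.sqrt(s2 * (1 + 1 / n + (x0 - x.mean()) ** 2 / Sxx))
print(f"new observation at x0 = {x0}: [{y0 - half:.2f}, {y0 + half:.2f}]")
```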

