
The Islamic University of Gaza

Faculty of Engineering
Civil Engineering Department

Numerical Analysis
ECIV 3306

Chapter 17

Least Squares Regression


CURVE FITTING
Part 5
This part describes techniques for fitting curves (curve fitting) to
discrete data in order to obtain intermediate estimates.

There are two general approaches for curve fitting:


Least squares regression:
The data exhibit a significant degree of scatter. The strategy is
to derive a single curve that represents the general trend
of the data.
Interpolation:
The data are very precise. The strategy is to pass a curve or a
series of curves through each of the points.
CURVE FITTING
Introduction
Part 5
In engineering, two types of applications are
encountered:
Trend analysis. Predicting values of the dependent
variable; may include extrapolation beyond the data
points or interpolation between them.

Hypothesis testing. Comparing an existing
mathematical model with measured data.
Mathematical Background
Arithmetic mean. The sum of the individual data
points (yi) divided by the number of points (n).

$$\bar{y} = \frac{\sum y_i}{n}, \qquad i = 1, \dots, n$$

Standard deviation. The most common measure of spread
for a sample:

$$s_y = \sqrt{\frac{S_t}{n-1}}, \qquad S_t = \sum \left(y_i - \bar{y}\right)^2$$
Mathematical Background (cont'd)

Variance. Representation of spread by the square of
the standard deviation:

$$s_y^2 = \frac{\sum \left(y_i - \bar{y}\right)^2}{n-1} = \frac{\sum y_i^2 - \left(\sum y_i\right)^2 / n}{n-1}$$
Coefficient of variation. Quantifies the spread of the
data relative to the mean:

$$\text{c.v.} = \frac{s_y}{\bar{y}} \times 100\%$$
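To make these definitions concrete, here is a minimal Python sketch (the name summary_stats is illustrative, not from the lecture) that computes the quantities above for a sample:

```python
import math

def summary_stats(y):
    """Mean, standard deviation, variance, and coefficient of variation."""
    n = len(y)
    ybar = sum(y) / n                          # arithmetic mean
    St = sum((yi - ybar) ** 2 for yi in y)     # sum of squares about the mean
    var = St / (n - 1)                         # sample variance
    sy = math.sqrt(var)                        # standard deviation
    cv = sy / ybar * 100                       # coefficient of variation (%)
    return ybar, sy, var, cv
```

For the y-values of the straight-line example later in this chapter (0.5, 2.5, 2.0, 4.0, 3.5, 6.0, 5.5), this returns a mean of 3.4286 and a standard deviation of 1.9457, matching the error-analysis slide.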
Least Squares Regression
Chapter 17
Linear Regression
Fitting a straight line to a set of paired
observations: (x1, y1), (x2, y2), ..., (xn, yn).
y = a0 + a1 x + e
a1 - slope
a0 - intercept
e - error, or residual, between the model and
the observations
Linear Regression: Residual
Linear Regression: Question

How do we find a0 and a1 so that the error is
minimized?
Linear Regression: Criteria for a Best Fit
$$\min \sum_{i=1}^{n} e_i = \sum_{i=1}^{n} \left(y_i - a_0 - a_1 x_i\right)$$

This criterion is inadequate: positive and negative residuals
cancel. For example, two residuals with e1 = -e2 contribute zero
to the sum even though both points are missed.
Linear Regression: Criteria for a Best Fit
$$\min \sum_{i=1}^{n} |e_i| = \sum_{i=1}^{n} \left|y_i - a_0 - a_1 x_i\right|$$

This criterion is also inadequate: it does not yield a unique
best-fit line.
Linear Regression: Criteria for a Best Fit

$$\min \left( \max_{1 \le i \le n} |e_i| \right), \qquad e_i = y_i - a_0 - a_1 x_i$$

This minimax criterion is also ill-suited, since it gives undue
influence to a single outlying point.
Linear Regression: Least Squares Fit

$$S_r = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} \left(y_{i,\text{measured}} - y_{i,\text{model}}\right)^2 = \sum_{i=1}^{n} \left(y_i - a_0 - a_1 x_i\right)^2$$

$$\min S_r = \min \sum_{i=1}^{n} \left(y_i - a_0 - a_1 x_i\right)^2$$

Minimizing the sum of the squares of the residuals yields a
unique line for a given set of data.

Linear Regression: Least Squares Fit

$$\min S_r = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} \left(y_i - a_0 - a_1 x_i\right)^2$$

The coefficients a0 and a1 that minimize Sr must satisfy
the following conditions:

$$\frac{\partial S_r}{\partial a_0} = 0, \qquad \frac{\partial S_r}{\partial a_1} = 0$$
Linear Regression:
Determination of a0 and a1

$$\frac{\partial S_r}{\partial a_0} = -2 \sum \left(y_i - a_0 - a_1 x_i\right) = 0$$

$$\frac{\partial S_r}{\partial a_1} = -2 \sum \left(y_i - a_0 - a_1 x_i\right) x_i = 0$$

Expanding, and noting that $\sum a_0 = n\,a_0$, yields the normal
equations:

$$n\,a_0 + \left(\sum x_i\right) a_1 = \sum y_i$$

$$\left(\sum x_i\right) a_0 + \left(\sum x_i^2\right) a_1 = \sum x_i y_i$$

These are 2 equations with 2 unknowns that can be solved
simultaneously.
Linear Regression:
Determination of a0 and a1

$$a_1 = \frac{n \sum x_i y_i - \sum x_i \sum y_i}{n \sum x_i^2 - \left(\sum x_i\right)^2}$$

Using the mean values:

$$a_0 = \bar{y} - a_1 \bar{x}$$
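These closed-form formulas translate directly into code; the following is a minimal Python sketch (the name fit_line is illustrative):

```python
def fit_line(x, y):
    """Least-squares straight line y = a0 + a1*x through paired observations."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxy = sum(xi * yi for xi, yi in zip(x, y))       # sum of x_i * y_i
    sxx = sum(xi * xi for xi in x)                   # sum of x_i^2
    a1 = (n * sxy - sx * sy) / (n * sxx - sx ** 2)   # slope
    a0 = sy / n - a1 * sx / n                        # intercept = ybar - a1*xbar
    return a0, a1
```

Applied to the table in the example that follows, fit_line returns a0 ≈ 0.0714 and a1 ≈ 0.8393.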
Least Squares Fit of a Straight Line:
Example

Fit a straight line to the x and y values in the
following table:

xi     yi      xi*yi    xi^2
1      0.5     0.5      1
2      2.5     5.0      4
3      2.0     6.0      9
4      4.0     16.0     16
5      3.5     17.5     25
6      6.0     36.0     36
7      5.5     38.5     49
Sum    28      24.0     119.5    140

$$\bar{x} = \frac{28}{7} = 4, \qquad \bar{y} = \frac{24}{7} = 3.428571$$
Least Squares Fit of a Straight Line: Example (cont'd)

$$a_1 = \frac{n \sum x_i y_i - \sum x_i \sum y_i}{n \sum x_i^2 - \left(\sum x_i\right)^2} = \frac{7 \times 119.5 - 28 \times 24}{7 \times 140 - 28^2} = 0.8392857$$

$$a_0 = \bar{y} - a_1 \bar{x} = 3.428571 - 0.8392857 \times 4 = 0.07142857$$

y = 0.07142857 + 0.8392857 x
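As a quick numeric check (a sketch, not part of the original slides), the tabulated sums reproduce these coefficients:

```python
x = [1, 2, 3, 4, 5, 6, 7]
y = [0.5, 2.5, 2.0, 4.0, 3.5, 6.0, 5.5]
n = len(x)
sx, sy = sum(x), sum(y)                      # 28, 24.0
sxy = sum(a * b for a, b in zip(x, y))       # 119.5
sxx = sum(a * a for a in x)                  # 140
a1 = (n * sxy - sx * sy) / (n * sxx - sx ** 2)
a0 = sy / n - a1 * sx / n
print(a0, a1)    # 0.0714285..., 0.8392857...
```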
Error Quantification of Linear Regression

The sum of the squares of the residuals around the
regression line is Sr:

$$S_r = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} \left(y_i - a_0 - a_1 x_i\right)^2$$

Standard deviation: $$s_y = \sqrt{\frac{S_t}{n-1}}$$

Standard error of the estimate: $$s_{y/x} = \sqrt{\frac{S_r}{n-2}}$$
Error Quantification of Linear Regression

The total sum of the squares around the mean of
the dependent variable, y, is St:

$$S_t = \sum \left(y_i - \bar{y}\right)^2$$

The sum of the squares of the residuals around the
regression line is Sr:

$$S_r = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} \left(y_i - a_0 - a_1 x_i\right)^2$$
Error Quantification of Linear Regression

St − Sr quantifies the improvement, or error
reduction, obtained by describing the data with a
straight line rather than as an average value.

$$r^2 = \frac{S_t - S_r}{S_t}$$

r²: coefficient of determination
r: correlation coefficient
Error Quantification of Linear Regression

For a perfect fit:
Sr = 0 and r = r² = 1, signifying that the line
explains 100 percent of the variability of the
data.
For r = r² = 0, Sr = St and the fit represents no
improvement over simply using the mean.
Least Squares Fit of a Straight Line: Example
(Error Analysis)

y = 0.07142857 + 0.8392857 x

xi     yi      ymodel    (yi - ybar)^2   ei^2 = (yi - a0 - a1*xi)^2
1      0.5     0.9107    8.5765          0.1687
2      2.5     1.7500    0.8622          0.5625
3      2.0     2.5893    2.0408          0.3473
4      4.0     3.4286    0.3265          0.3265
5      3.5     4.2679    0.0051          0.5896
6      6.0     5.1071    6.6122          0.7972
7      5.5     5.9464    4.2908          0.1993
Sum    28      24.0      22.7143         2.9911

$$S_t = \sum \left(y_i - \bar{y}\right)^2 = 22.7143 \qquad S_r = \sum e_i^2 = 2.9911$$

$$r^2 = \frac{S_t - S_r}{S_t} = 0.868 \qquad r = \sqrt{0.868} = 0.932$$
Least Squares Fit of a Straight Line:
Example (Error Analysis)

The standard deviation (quantifies the spread around the mean):

$$s_y = \sqrt{\frac{S_t}{n-1}} = \sqrt{\frac{22.7143}{7-1}} = 1.9457$$

The standard error of the estimate (quantifies the spread around the
regression line):

$$s_{y/x} = \sqrt{\frac{S_r}{n-2}} = \sqrt{\frac{2.9911}{7-2}} = 0.7735$$

Because $s_{y/x} < s_y$, the linear regression model is a good fit.
Algorithm for linear regression
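The flowchart from the original slide is not reproduced here; the Python sketch below (function name illustrative) gives one possible implementation of the complete procedure: fit the line, then compute the error statistics of this chapter.

```python
import math

def linear_regression(x, y):
    """Fit y = a0 + a1*x and return the fit plus its error statistics."""
    n = len(x)
    sx, sy_sum = sum(x), sum(y)
    sxy = sum(a * b for a, b in zip(x, y))
    sxx = sum(a * a for a in x)
    a1 = (n * sxy - sx * sy_sum) / (n * sxx - sx ** 2)   # slope
    a0 = sy_sum / n - a1 * sx / n                        # intercept
    ybar = sy_sum / n
    St = sum((yi - ybar) ** 2 for yi in y)               # spread about the mean
    Sr = sum((yi - a0 - a1 * xi) ** 2
             for xi, yi in zip(x, y))                    # spread about the line
    sy = math.sqrt(St / (n - 1))     # standard deviation
    syx = math.sqrt(Sr / (n - 2))    # standard error of the estimate
    r2 = (St - Sr) / St              # coefficient of determination
    return a0, a1, sy, syx, r2
```

For the worked example above, this returns a0 ≈ 0.0714, a1 ≈ 0.8393, sy ≈ 1.9457, sy/x ≈ 0.7735, and r² ≈ 0.868.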
Linearization of Nonlinear Relationships
Linear regression assumes that the relationship between
the dependent and independent variables is linear.
However, a few types of nonlinear functions
can be transformed into linear regression
problems:
The exponential equation.
The power equation.
The saturation-growth-rate equation.
Linearization of Nonlinear Relationships
1. The exponential equation:

$$y = a_1 e^{b_1 x}$$

$$\ln y = \ln a_1 + b_1 x$$
Linearization of Nonlinear Relationships
2. The power equation:

$$y = a_2 x^{b_2}$$

$$\log y = \log a_2 + b_2 \log x$$
Linearization of Nonlinear Relationships
3. The saturation-growth-rate equation:

$$y = a_3 \frac{x}{b_3 + x}$$

$$\frac{1}{y} = \frac{1}{a_3} + \frac{b_3}{a_3} \cdot \frac{1}{x}$$
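All three transformations wrap around a single straight-line fit. The sketch below is illustrative (the helper names are not from the lecture); each function fits the transformed data and converts the intercept and slope back to the original parameters:

```python
import math

def fit_line(x, y):
    """Closed-form least-squares line; returns (intercept, slope)."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxy = sum(a * b for a, b in zip(x, y))
    sxx = sum(a * a for a in x)
    a1 = (n * sxy - sx * sy) / (n * sxx - sx ** 2)
    return sy / n - a1 * sx / n, a1

def fit_exponential(x, y):            # y = a1*exp(b1*x): regress ln y on x
    c0, c1 = fit_line(x, [math.log(v) for v in y])
    return math.exp(c0), c1           # a1 = e^intercept, b1 = slope

def fit_power(x, y):                  # y = a2*x^b2: regress log y on log x
    c0, c1 = fit_line([math.log10(v) for v in x], [math.log10(v) for v in y])
    return 10 ** c0, c1               # a2 = 10^intercept, b2 = slope

def fit_saturation(x, y):             # y = a3*x/(b3 + x): regress 1/y on 1/x
    c0, c1 = fit_line([1 / v for v in x], [1 / v for v in y])
    return 1 / c0, c1 / c0            # a3 = 1/intercept, b3 = slope/intercept
```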
Example
Fit the following equation:

$$y = a_2 x^{b_2}$$

to the data in the following table. Taking logarithms,

$$\log y = \log\left(a_2 x^{b_2}\right) = \log a_2 + b_2 \log x$$

Let Y = log y, X = log x, a0 = log a2, a1 = b2; then

$$Y = a_0 + a_1 X$$

xi     yi      X = log xi   Y = log yi
1      0.5     0.000        -0.301
2      1.7     0.301         0.230
3      3.4     0.477         0.532
4      5.7     0.602         0.756
5      8.4     0.699         0.924
Sum    15      19.7          2.079        2.141
Example (cont'd)

Xi     Yi      X*i = log(Xi)   Y*i = log(Yi)   X*i·Y*i   X*i^2
1      0.5     0.0000          -0.3010         0.0000    0.0000
2      1.7     0.3010           0.2304         0.0694    0.0906
3      3.4     0.4771           0.5315         0.2536    0.2276
4      5.7     0.6021           0.7559         0.4551    0.3625
5      8.4     0.6990           0.9243         0.6460    0.4886
Sum    15      19.700           2.079          2.141     1.424     1.169

$$a_1 = \frac{n \sum X_i^* Y_i^* - \sum X_i^* \sum Y_i^*}{n \sum X_i^{*2} - \left(\sum X_i^*\right)^2} = \frac{5 \times 1.424 - 2.079 \times 2.141}{5 \times 1.169 - 2.079^2} = 1.75$$

$$a_0 = \bar{Y}^* - a_1 \bar{X}^* = 0.4282 - 1.75 \times 0.41584 = -0.300$$
Linearization of Nonlinear
Functions: Example (cont'd)

log y = -0.300 + 1.75 log x

$$y = a_2 x^{b_2}$$

log a2 = -0.300, so a2 = 10^(-0.300) = 0.5
b2 = a1 = 1.75

$$y = 0.5\,x^{1.75}$$
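A quick numeric check of this example (a sketch using base-10 logarithms):

```python
import math

x = [1, 2, 3, 4, 5]
y = [0.5, 1.7, 3.4, 5.7, 8.4]
X = [math.log10(v) for v in x]
Y = [math.log10(v) for v in y]
n = len(X)
sX, sY = sum(X), sum(Y)
sXY = sum(a * b for a, b in zip(X, Y))
sXX = sum(a * a for a in X)
a1 = (n * sXY - sX * sY) / (n * sXX - sX ** 2)   # b2 ≈ 1.75
a0 = sY / n - a1 * sX / n                        # log a2 ≈ -0.300
print(10 ** a0, a1)                              # a2 ≈ 0.50, b2 ≈ 1.75
```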
Polynomial Regression

Some engineering data are poorly represented by a straight line.
For these cases a curve is better suited to fit
the data.
The least squares method can readily be
extended to fit the data to higher order
polynomials.
Polynomial Regression (cont'd)

For data exhibiting curvature, a parabola is preferable to a
straight line.
Polynomial Regression (cont'd)

A 2nd-order polynomial (quadratic) is defined by:

$$y = a_0 + a_1 x + a_2 x^2 + e$$

The residuals between the model and the data:

$$e_i = y_i - a_0 - a_1 x_i - a_2 x_i^2$$

The sum of the squares of the residuals:

$$S_r = \sum e_i^2 = \sum \left(y_i - a_0 - a_1 x_i - a_2 x_i^2\right)^2$$
Polynomial Regression (cont'd)

$$\frac{\partial S_r}{\partial a_0} = -2 \sum \left(y_i - a_0 - a_1 x_i - a_2 x_i^2\right) = 0$$

$$\frac{\partial S_r}{\partial a_1} = -2 \sum \left(y_i - a_0 - a_1 x_i - a_2 x_i^2\right) x_i = 0$$

$$\frac{\partial S_r}{\partial a_2} = -2 \sum \left(y_i - a_0 - a_1 x_i - a_2 x_i^2\right) x_i^2 = 0$$

Setting the derivatives to zero yields 3 linear equations in the 3
unknowns a0, a1, a2, which can be solved simultaneously:

$$\sum y_i = n\,a_0 + a_1 \sum x_i + a_2 \sum x_i^2$$

$$\sum x_i y_i = a_0 \sum x_i + a_1 \sum x_i^2 + a_2 \sum x_i^3$$

$$\sum x_i^2 y_i = a_0 \sum x_i^2 + a_1 \sum x_i^3 + a_2 \sum x_i^4$$
Polynomial Regression (cont'd)

A 3×3 system of linear equations must be solved to determine
the coefficients of the polynomial:

$$\begin{bmatrix} n & \sum x_i & \sum x_i^2 \\ \sum x_i & \sum x_i^2 & \sum x_i^3 \\ \sum x_i^2 & \sum x_i^3 & \sum x_i^4 \end{bmatrix} \begin{bmatrix} a_0 \\ a_1 \\ a_2 \end{bmatrix} = \begin{bmatrix} \sum y_i \\ \sum x_i y_i \\ \sum x_i^2 y_i \end{bmatrix}$$

The standard error and the coefficient of determination:

$$s_{y/x} = \sqrt{\frac{S_r}{n-3}} \qquad r^2 = \frac{S_t - S_r}{S_t}$$
Polynomial Regression (cont'd)

In general, the mth-order polynomial is

$$y = a_0 + a_1 x + a_2 x^2 + \dots + a_m x^m + e$$

and a system of (m+1)×(m+1) linear equations must be solved to
determine its coefficients.

The standard error:

$$s_{y/x} = \sqrt{\frac{S_r}{n - (m+1)}}$$

The coefficient of determination:

$$r^2 = \frac{S_t - S_r}{S_t}$$
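A minimal NumPy sketch of this general procedure (the name polyfit_normal is illustrative) assembles the (m+1)×(m+1) normal equations from the data and solves them:

```python
import numpy as np

def polyfit_normal(x, y, m):
    """Least-squares mth-order polynomial via the normal equations.

    Returns the coefficients a0, a1, ..., am of
    y = a0 + a1*x + ... + am*x^m.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # A[j, k] = sum(x^(j+k)),  b[j] = sum(x^j * y)
    A = np.array([[np.sum(x ** (j + k)) for k in range(m + 1)]
                  for j in range(m + 1)])
    b = np.array([np.sum(x ** j * y) for j in range(m + 1)])
    return np.linalg.solve(A, b)
```

For the second-order example that follows, polyfit_normal(x, y, 2) returns approximately [2.47857, 2.35929, 1.86071]. (For high orders the normal equations become ill-conditioned; library routines such as numpy.polyfit solve the same least-squares problem by a more stable factorization.)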
Polynomial Regression: Example
Fit a second-order polynomial to the data:

xi     yi      xi^2   xi^3   xi^4   xi*yi   xi^2*yi
0      2.1     0      0      0      0       0
1      7.7     1      1      1      7.7     7.7
2      13.6    4      8      16     27.2    54.4
3      27.2    9      27     81     81.6    244.8
4      40.9    16     64     256    163.6   654.4
5      61.1    25     125    625    305.5   1527.5
Sum    15      152.6  55     225    979     585.6   2488.8

$$\bar{x} = \frac{15}{6} = 2.5, \qquad \bar{y} = \frac{152.6}{6} = 25.433$$
Polynomial Regression: Example (cont'd)

The system of simultaneous linear equations:

$$\begin{bmatrix} 6 & 15 & 55 \\ 15 & 55 & 225 \\ 55 & 225 & 979 \end{bmatrix} \begin{bmatrix} a_0 \\ a_1 \\ a_2 \end{bmatrix} = \begin{bmatrix} 152.6 \\ 585.6 \\ 2488.8 \end{bmatrix}$$

Solving gives a0 = 2.47857, a1 = 2.35929, a2 = 1.86071:

$$y = 2.47857 + 2.35929\,x + 1.86071\,x^2$$

$$S_t = \sum \left(y_i - \bar{y}\right)^2 = 2513.39 \qquad S_r = \sum e_i^2 = 3.74657$$
Polynomial Regression: Example (cont'd)

xi     yi      ymodel    ei^2      (yi - ybar)^2
0      2.1     2.4786    0.14332   544.42889
1      7.7     6.6986    1.00286   314.45929
2      13.6    14.6400   1.08158   140.01989
3      27.2    26.3030   0.80491   3.12229
4      40.9    41.6870   0.61951   239.22809
5      61.1    60.7930   0.09439   1272.13489
Sum    15      152.6     3.74657   2513.39333

The standard error of the estimate:

$$s_{y/x} = \sqrt{\frac{S_r}{n - (m+1)}} = \sqrt{\frac{3.74657}{6-3}} = 1.12$$

The coefficient of determination:

$$r^2 = \frac{2513.39 - 3.74657}{2513.39} = 0.99851, \qquad r = \sqrt{r^2} = 0.99925$$
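This error analysis can be verified numerically with a short sketch:

```python
import numpy as np

x = np.array([0, 1, 2, 3, 4, 5], dtype=float)
y = np.array([2.1, 7.7, 13.6, 27.2, 40.9, 61.1])

# Solve the 3x3 normal equations for the quadratic fit
A = np.array([[np.sum(x ** (j + k)) for k in range(3)] for j in range(3)])
b = np.array([np.sum(x ** j * y) for j in range(3)])
a0, a1, a2 = np.linalg.solve(A, b)          # 2.47857, 2.35929, 1.86071

ymodel = a0 + a1 * x + a2 * x ** 2
Sr = np.sum((y - ymodel) ** 2)              # 3.74657
St = np.sum((y - y.mean()) ** 2)            # 2513.39
print(np.sqrt(Sr / (len(x) - 3)))           # s_y/x ≈ 1.12
print((St - Sr) / St)                       # r^2 ≈ 0.99851
```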
Multiple Linear Regression

A useful extension of linear regression is the case
where y is a linear function of two independent
variables, x1 and x2:

$$y = a_0 + a_1 x_1 + a_2 x_2 + e$$

$$S_r = \sum e_i^2 = \sum \left(y_i - a_0 - a_1 x_{1i} - a_2 x_{2i}\right)^2$$

The coefficients yielding the minimum sum of squares
are obtained by solving:

$$\begin{bmatrix} n & \sum x_{1i} & \sum x_{2i} \\ \sum x_{1i} & \sum x_{1i}^2 & \sum x_{1i} x_{2i} \\ \sum x_{2i} & \sum x_{1i} x_{2i} & \sum x_{2i}^2 \end{bmatrix} \begin{bmatrix} a_0 \\ a_1 \\ a_2 \end{bmatrix} = \begin{bmatrix} \sum y_i \\ \sum x_{1i} y_i \\ \sum x_{2i} y_i \end{bmatrix}$$
Multiple Linear Regression: Example

x1     x2     y      x1^2    x2^2   x1*x2   x1*y    x2*y
0      0      5      0       0      0       0       0
2      1      10     4       1      2       20      10
2.5    2      9      6.25    4      5       22.5    18
1      3      0      1       9      3       0       0
4      6      3      16      36     24      12      18
7      2      27     49      4      14      189     54
Sum    16.5   14     54      76.25  54      48      243.5   100

$$\begin{bmatrix} 6 & 16.5 & 14 \\ 16.5 & 76.25 & 48 \\ 14 & 48 & 54 \end{bmatrix} \begin{bmatrix} a_0 \\ a_1 \\ a_2 \end{bmatrix} = \begin{bmatrix} 54 \\ 243.5 \\ 100 \end{bmatrix}$$

Solving gives a0 = 5, a1 = 4, a2 = -3:

$$y = 5 + 4 x_1 - 3 x_2$$
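As a check (a sketch), building the normal equations from the raw data and solving reproduces the coefficients:

```python
import numpy as np

x1 = np.array([0, 2, 2.5, 1, 4, 7])
x2 = np.array([0, 1, 2, 3, 6, 2])
y = np.array([5, 10, 9, 0, 3, 27], dtype=float)

n = len(y)
A = np.array([[n,         x1.sum(),       x2.sum()],
              [x1.sum(),  (x1**2).sum(),  (x1*x2).sum()],
              [x2.sum(),  (x1*x2).sum(),  (x2**2).sum()]])
b = np.array([y.sum(), (x1*y).sum(), (x2*y).sum()])
print(np.linalg.solve(A, b))    # [ 5.  4. -3.]
```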
Multiple Linear Regression: Exercise

For the previous example, find Sr, sy/x, and r.
