Like The Least Square Methods But Without A Trial and Error Procedure

THE MINIMUM DISTRIBUTION METHOD Joseph Stanovsky PhD
2012 by J S

PROLOGUE
This paper describes an algebraic solution for two lines best fitting nearly linear test data. The two lines are orthogonal and intersect at the centroid of the data points.

THE LEAST SQUARE SOLUTION
The method of least squares is used to define a line best fitting nearly linear test data. The traditional method of solution is trial and error, in which every solution is an approximate solution. In the following figure, the least squares measurement in y is parallel to the y-axis and W = 1 is parallel to the z-axis.

[Figure: typical least square solution data in the x-y plane, showing a trial line, a data point at x, y, and W = 1 acting at x, y parallel to the z-axis.]
INTRODUCING DETAILS OF THE MINIMUM DISTRIBUTION METHOD
The minimum distribution method produces an algebraic solution, and that solution is exact when all data points lie on a common line. Two lines are produced: the first is the line best fitting the nearly linear test data, and the second is orthogonal to the first. The two lines intersect at the centroid of the data. In the following sketches, the first (leftmost) shows the orthogonal lines and the central sketch shows the displacement y*. The sketch at the right shows W acting on the x-y plane at point p, located at x and y. Errors in x, y and W are included.

[Figure, three sketches: (left) the 1st solution line and the 2nd solution line, orthogonal to the 1st, intersecting at the data centroid; (center) data point p at x, y and its displacement y* from the 1st line; (right) W acting at data point p, with $x = x(p)[1+\varepsilon_x(p)]$, $y = y(p)[1+\varepsilon_y(p)]$ and $W = W(p)[1+\varepsilon_W(p)]$.]

Unique features of the minimum distribution method

THE 2ND SOLUTION LINE
[Figure: the 2nd solution line and data point p; the displacement of p is orthogonal to the 2nd solution line.]
The displacement in the 2nd solution is a consequence of the minimum distribution theory.

THE CENTROIDAL Z-AXIS MOMENT OF INERTIA
The second moment of the distance from the best fitting line to a data point is the moment of inertia, whether it is written $I_y = \int x^2\,dy$ or $I_z = \int (x^2 + y^2)$, or described as a modulus of data distribution. The least squares method takes its descriptive label from the title of an 1805 paper, Méthode des moindres carrés, by Adrien-Marie Legendre (1752-1833).
INTRODUCTION
Carl Friedrich Gauss (1777-1855) applied the method of least squares to astronomy in 1795 and published it in Theoria Motus Corporum Coelestium in sectionibus conicis solem ambientium (1809). Adrien-Marie Legendre, working in France and also in astronomy, published the method in 1805, while the Irishman Robert Adrain, working in the US, published the method in 1808 and applied it to survey measurements.

FROM TEST DATA TO TWO LINE SOLUTIONS
Figure 1 shows a random distribution of test data points. This data is used to calculate the slope and y-intercept of the line best fitting the data, and a second line that is orthogonal to the best fitting line. Another feature of this algebraic solution is that both the measurements and the errors of x, y and W are included. The errors in each measurement can be measured or estimated. They are identified as $\varepsilon_x(p)$, $\varepsilon_y(p)$ and $\varepsilon_W(p)$.

[Figure 1. Labels recovered from the figure: x, y, Y, x*, y*, cg1, cg2.]

Figure 1: The line OG is the line best fitting the data points W(p), in which the x-measurements are randomly spaced. The line O-O defines the y-intercept b, while point G locates the centroid of the N data points. The data point centroid is located by the calculated measurements $cg_1$ and $cg_2$.

The slope of the best fitting line in Fig. 1 is defined by the angle $\theta$. The second best fitting line is not shown but is described as follows: its slope angle is $\theta + 90^\circ$, it is orthogonal to the best fitting line shown, and it also intersects the centroid G of the data points. A typical data point W(p), eq. (1), is identified by p and a measurement error $\varepsilon_W(p)$. W(p) is located in Fig. 1 by the displacements x and y, with x- and y-measurement errors equal to $\varepsilon_x(p)$ and $\varepsilon_y(p)$, respectively. The z-axis W(p) errors are $\varepsilon_W(p)$ as defined in eq. (1), whereas the X(p) and Y(p) errors are shown in eqs. (2) and (3).

$$\mathcal{W}(p) = W(p) + \varepsilon_W(p)\,W(p) = W(p)[1+\varepsilon_W(p)] \tag{1}$$
$$X(p) = x(p) + \varepsilon_x(p)\,x(p) = x(p)[1+\varepsilon_x(p)] \tag{2}$$
$$Y(p) = y(p) + \varepsilon_y(p)\,y(p) = y(p)[1+\varepsilon_y(p)] \tag{3}$$
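The value of W(p) can be defined as 1, or by the number of repetitions made with identical experimental data. Equations (4) and (5) represent the vector and the magnitude, respectively, for the N values of W(p).

$$\sum_{p=1}^{n} W(p)_i = \sum_{p=1}^{n} \begin{bmatrix} 0 \\ 0 \\ W(p)[1+\varepsilon_W(p)] \end{bmatrix} \tag{4}$$
$$\sum_{p=1}^{n} \mathcal{W}(p) = \sum_{p=1}^{n} W(p)[1+\varepsilon_W(p)] \tag{5}$$

Equation (6) locates a typical data point. It is used to define the first moment of W(p) about the origin O, $m_i = q_i \times W(p)_i$, as shown in eq. (7).

$$q_i = \begin{bmatrix} X(p) \\ Y(p) \\ 0 \end{bmatrix} \tag{6}$$
$$m_i = q_i \times W(p)_i = \begin{bmatrix} X(p) \\ Y(p) \\ 0 \end{bmatrix} \times \begin{bmatrix} 0 \\ 0 \\ W(p)[1+\varepsilon_W(p)] \end{bmatrix} = \begin{bmatrix} Y(p)\,W(p)[1+\varepsilon_W(p)] \\ -X(p)\,W(p)[1+\varepsilon_W(p)] \\ 0 \end{bmatrix} \tag{7}$$

Combined, eqs. (5) and (7) are the principal ingredients of Varignon's theorem, $\bar{q} \times \sum W = \sum m$, which locates the centroidal measurements $\bar{q} = [cg_1,\ cg_2,\ 0]$. Using the 1-component of eq. (7) in Varignon's theorem produces eq. (8).

$$\sum_{p=1}^{n} m^{O}_{1} = \sum_{p=1}^{n} y(p)[1+\varepsilon_y(p)]\,W(p)[1+\varepsilon_W(p)] = cg_2 \sum_{p=1}^{n} W(p)[1+\varepsilon_W(p)] \tag{8}$$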
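The cross product in eq. (7) can be checked numerically. The following is a minimal Python sketch and not part of the original derivation: the helper name `cross` and the sample data point are illustrative assumptions, and all measurement errors are taken as zero.

```python
def cross(a, b):
    """Cross product of two 3-component vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

# Hypothetical data point: X(p) = 2, Y(p) = 3, with W(p) = 1.5 along the z-axis.
q = (2.0, 3.0, 0.0)
W = (0.0, 0.0, 1.5)

m = cross(q, W)
print(m)  # (4.5, -3.0, 0.0), i.e. (Y W, -X W, 0) as in eq. (7)
```

The first component carries Y(p) and the second carries -X(p), which is why the 1-component of eq. (7) is used for the y-measurement of the centroid.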
When solved, eq. (8) yields eq. (9), in which $cg_2$ is the y-measurement in Fig. 1.

$$cg_2 = \frac{\sum_{p=1}^{n} y(p)[1+\varepsilon_y(p)]\,W(p)[1+\varepsilon_W(p)]}{\sum_{p=1}^{n} W(p)[1+\varepsilon_W(p)]} \tag{9}$$
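Using the 2-component of eq. (7) in Varignon's theorem produces eq. (10). The 2-component of $\bar{q} \times \sum W$ is $-cg_1 \sum W$, so the right side carries the same negative sign as the left.

$$\sum_{p=1}^{n} m^{O}_{2} = \sum_{p=1}^{n} -x(p)[1+\varepsilon_x(p)]\,W(p)[1+\varepsilon_W(p)] = -cg_1 \sum_{p=1}^{n} W(p)[1+\varepsilon_W(p)] \tag{10}$$

When solved, eq. (10) yields eq. (11), in which $cg_1$ is the x-measurement in Fig. 1.

$$cg_1 = \frac{\sum_{p=1}^{n} x(p)[1+\varepsilon_x(p)]\,W(p)[1+\varepsilon_W(p)]}{\sum_{p=1}^{n} W(p)[1+\varepsilon_W(p)]} \tag{11}$$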
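The centroid formulas in eqs. (9) and (11) amount to weighted averages. The sketch below is one way to compute them in Python under stated assumptions: the function name `centroid` and the parameter names `ex`, `ey`, `ew` (relative errors in x, y and W) are hypothetical, and the sample values are the Case 1 and Case 3 test data used later in the paper.

```python
def centroid(x, y, w, ex=None, ey=None, ew=None):
    """Return (cg1, cg2) following eqs. (9) and (11).

    ex, ey, ew are lists of relative errors (hypothetical names);
    passing None applies no error to that measurement.
    """
    n = len(x)
    ex = ex or [0.0] * n
    ey = ey or [0.0] * n
    ew = ew or [0.0] * n
    X = [xi * (1 + e) for xi, e in zip(x, ex)]   # eq. (2)
    Y = [yi * (1 + e) for yi, e in zip(y, ey)]   # eq. (3)
    W = [wi * (1 + e) for wi, e in zip(w, ew)]   # eq. (1)
    total = sum(W)                               # eq. (5)
    cg1 = sum(Xi * Wi for Xi, Wi in zip(X, W)) / total
    cg2 = sum(Yi * Wi for Yi, Wi in zip(Y, W)) / total
    return cg1, cg2

# Case 1: uniformly spaced data, W = 1 for every point.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
y1 = [1.25, 2.0, 2.75, 3.5, 4.25]
print(centroid(x1, y1, [1.0] * 5))  # (3.0, 2.75)

# Case 3 x-values and weights: cg1 is approximately 3.217.
x3 = [1.0, 1.75, 3.25, 3.75, 4.75]
w3 = [1.0, 1.25, 1.5, 1.75, 2.0]
print(round(centroid(x3, y1, w3)[0], 3))
```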
Equations (9) and (11) locate the centroid of the N experimental data points using a moment center at the origin O. Although the information developed so far can be used to determine the equation of the line best fitting the experimental data, it is useful to perform an operation like that in eq. (7) using a moment center at G. This is done by re-calculating the coordinates of W(p). The new coordinates are those shown in eqs. (12) and (13).

$$X'(p) = X(p) - cg_1 = x(p)[1+\varepsilon_x(p)] - cg_1 \tag{12}$$
$$Y'(p) = Y(p) - cg_2 = y(p)[1+\varepsilon_y(p)] - cg_2 \tag{13}$$

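The components in eqs. (12) and (13) are used to define the displacement vector, measured from G to W(p), shown in eq. (14).

$$L_i = \begin{bmatrix} x(p)[1+\varepsilon_x(p)] - cg_1 \\ y(p)[1+\varepsilon_y(p)] - cg_2 \\ 0 \end{bmatrix} \tag{14}$$

The two unit vectors, $\lambda_i$ and $\mu_i$, shown in eq. (15) point parallel and perpendicular to the best fitting line OG.

$$\lambda_i = \begin{bmatrix} \cos\theta \\ \sin\theta \\ 0 \end{bmatrix}, \qquad \mu_i = \begin{bmatrix} -\sin\theta \\ \cos\theta \\ 0 \end{bmatrix} \tag{15}$$

Using eqs. (14) and (15), calculate the components of $L_i$ in eq. (14) that are parallel to and perpendicular to the best fitting line OG. The component parallel to line OG is developed by the dot product $L_i \cdot \lambda_i$ in eq. (16), and the component perpendicular to line OG is defined by the dot product $L_i \cdot \mu_i$ in eq. (17).

$$L_i \cdot \lambda_i = \begin{bmatrix} x(p)[1+\varepsilon_x(p)] - cg_1 \\ y(p)[1+\varepsilon_y(p)] - cg_2 \\ 0 \end{bmatrix} \cdot \begin{bmatrix} \cos\theta \\ \sin\theta \\ 0 \end{bmatrix} \tag{16}$$

The dot product in eq. (16) is evaluated in eq. (16a).

$$L_i \cdot \lambda_i = \left( x(p)[1+\varepsilon_x(p)] - cg_1 \right)\cos\theta + \left( y(p)[1+\varepsilon_y(p)] - cg_2 \right)\sin\theta \tag{16a}$$

$$L_i \cdot \mu_i = \begin{bmatrix} x(p)[1+\varepsilon_x(p)] - cg_1 \\ y(p)[1+\varepsilon_y(p)] - cg_2 \\ 0 \end{bmatrix} \cdot \begin{bmatrix} -\sin\theta \\ \cos\theta \\ 0 \end{bmatrix} \tag{17}$$

The dot product in eq. (17) is evaluated in eq. (17a).

$$L_i \cdot \mu_i = (-1)\left( x(p)[1+\varepsilon_x(p)] - cg_1 \right)\sin\theta + \left( y(p)[1+\varepsilon_y(p)] - cg_2 \right)\cos\theta \tag{17a}$$

Next, repeat moment equation (7), but instead of a moment center at O, use G as the moment center. Writing the moment equation in matrix form is not practical because of the algebraic length of eqs. (16a) and (17a). The matrix cross product containing eqs. (16a) and (17a) is therefore simplified by the substitutions in eqs. (18) and (19).

$$A(p) = \left( x(p)[1+\varepsilon_x(p)] - cg_1 \right)\cos\theta + \left( y(p)[1+\varepsilon_y(p)] - cg_2 \right)\sin\theta \tag{18}$$
$$B(p) = (-1)\left( x(p)[1+\varepsilon_x(p)] - cg_1 \right)\sin\theta + \left( y(p)[1+\varepsilon_y(p)] - cg_2 \right)\cos\theta \tag{19}$$

As a consequence of this substitution, the vector cross product for the first moment about the OG-axis, with center of moments at G, is defined in eq. (20).

$$M_i = \begin{bmatrix} 0 \\ B(p) \\ 0 \end{bmatrix} \times \begin{bmatrix} 0 \\ 0 \\ W(p)[1+\varepsilon_W(p)] \end{bmatrix} = \begin{bmatrix} B(p)\,\{W(p)[1+\varepsilon_W(p)]\} \\ 0 \\ 0 \end{bmatrix} \tag{20}$$

Next, calculate the second moment about the OG-axis, $I^{G}_{i}$. The cross product is defined by the matrix definition of the cross product, eq. (21).

$$I^{OG}_{i} = \begin{bmatrix} 0 \\ B(p) \\ 0 \end{bmatrix} \times \begin{bmatrix} B(p)\,\{W(p)[1+\varepsilon_W(p)]\} \\ 0 \\ 0 \end{bmatrix} \tag{21}$$

The second moment matrix in eq. (21) produces the following components.
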
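$$I^{OG}_{1} = 0 \tag{22}$$
$$I^{OG}_{2} = 0 \tag{23}$$
$$I^{OG}_{3} = \sum_{p=1}^{n} (-1)\,B(p)\,\{ B(p)\,\{W(p)[1+\varepsilon_W(p)]\} \} \tag{24}$$

When the algebra in eq. (24) is performed, eq. (24) reduces to eq. (25).

$$I^{G}_{3} = (-1) \sum_{p=1}^{n} \{W(p)[1+\varepsilon_W(p)]\}\, B(p)^2 \tag{25}$$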
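To see how the second moment in eq. (25) varies with the angle of the line through G, it can be evaluated for a few trial angles. This is a minimal Python sketch under stated assumptions: the function name `second_moment` is hypothetical, all measurement errors are set to zero, and the function returns the magnitude of eq. (25), that is, the sum of W(p) B(p)^2 without the (-1) factor. The data are the Case 1 test values used later in the paper.

```python
import math

def second_moment(x, y, w, theta):
    """Sum of W(p) * B(p)**2 about a line through the centroid G.

    theta is the angle of the line in radians; B(p) follows eq. (19)
    with every measurement error taken as zero (an assumption).
    """
    total = sum(w)
    cg1 = sum(xi * wi for xi, wi in zip(x, w)) / total
    cg2 = sum(yi * wi for yi, wi in zip(y, w)) / total
    s, c = math.sin(theta), math.cos(theta)
    return sum(wi * (-(xi - cg1) * s + (yi - cg2) * c) ** 2
               for xi, yi, wi in zip(x, y, w))

x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.25, 2.0, 2.75, 3.5, 4.25]
w = [1.0] * 5

# Trial angles in degrees: along the x-axis, near the data line, along the y-axis.
for deg in (0.0, 36.87, 90.0):
    print(deg, round(second_moment(x, y, w, math.radians(deg)), 4))
# The three values are approximately 5.625, 0.0 and 10.0.
```

Because the Case 1 points lie exactly on a line of slope 0.75, the second moment is essentially zero near 36.87 degrees and larger at every other trial angle.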
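Equation (26) is an intermediate solution for eq. (25).

$$B(p)^2 = \left[ (-1)\left( x(p)[1+\varepsilon_x(p)] - cg_1 \right)\sin\theta + \left( y(p)[1+\varepsilon_y(p)] - cg_2 \right)\cos\theta \right]^2 \tag{26}$$

The algebraic operations in eq. (26) are shown in eq. (27).

$$I^{OG}_{3} = (-1) \sum_{p=1}^{n} \{W(p)[1+\varepsilon_W(p)]\} \Big\{ \left( x(p)[1+\varepsilon_x(p)] - cg_1 \right)^2 \sin^2\theta + \left( y(p)[1+\varepsilon_y(p)] - cg_2 \right)^2 \cos^2\theta + (-2)\left( x(p)[1+\varepsilon_x(p)] - cg_1 \right)\left( y(p)[1+\varepsilon_y(p)] - cg_2 \right)\cos\theta\,\sin\theta \Big\} \tag{27}$$

The slope of the line best fitting the experimental data, defined by $\theta$ in eq. (27), is shown schematically in Fig. 1. Maximum or minimum values of the second moment occur at the angles found by differentiating eq. (27) with respect to $\theta$. The result is eq. (28).

$$\frac{dI^{G}_{3}}{d\theta} = (-1) \sum_{p=1}^{n} \{W(p)[1+\varepsilon_W(p)]\} \Big\{ \left( x(p)[1+\varepsilon_x(p)] - cg_1 \right)^2 2\sin\theta\cos\theta + (-1)\left( y(p)[1+\varepsilon_y(p)] - cg_2 \right)^2 2\cos\theta\sin\theta + (-2)\left( x(p)[1+\varepsilon_x(p)] - cg_1 \right)\left( y(p)[1+\varepsilon_y(p)] - cg_2 \right)\left( -\sin^2\theta + \cos^2\theta \right) \Big\} \tag{28}$$

Equation (28) is simplified by the identities $\sin 2\theta = 2\sin\theta\cos\theta$ and $\cos 2\theta = -\sin^2\theta + \cos^2\theta$. Equation (29) is the result of these substitutions in eq. (28), with $dI/d\theta$ set equal to 0 and the magnification factor $\{W(p)[1+\varepsilon_W(p)]\}$ divided out, a step that is exact when that factor is the same for every data point. $\tan 2\theta$ is defined by eq. (29).

$$0 = \sum_{p=1}^{n} \Big\{ \left[ (-1)\left( x(p)[1+\varepsilon_x(p)] - cg_1 \right)^2 + \left( y(p)[1+\varepsilon_y(p)] - cg_2 \right)^2 \right] \sin 2\theta + (-2)(-\cos 2\theta)\left( x(p)[1+\varepsilon_x(p)] - cg_1 \right)\left( y(p)[1+\varepsilon_y(p)] - cg_2 \right) \Big\} \tag{29}$$

Equation (30) establishes the solution for $\tan 2\theta$. The result does not contain the magnification factor $\{W(p)[1+\varepsilon_W(p)]\}$.

$$\tan 2\theta = \frac{\sum_{p=1}^{n} (-2)\left( x(p)[1+\varepsilon_x(p)] - cg_1 \right)\left( y(p)[1+\varepsilon_y(p)] - cg_2 \right)}{\sum_{p=1}^{n} \left[ (-1)\left( x(p)[1+\varepsilon_x(p)] - cg_1 \right)^2 + \left( y(p)[1+\varepsilon_y(p)] - cg_2 \right)^2 \right]} \tag{30}$$

Equation (30) produces two values of $\theta$. One solution is the angle $\theta$ in Fig. 1; the second solution defines an angle $\theta^*$ orthogonal to the line OG. One solution defines the angle for which the data distribution is closest to the best fitting line, which is the same as a minimum moment of inertia, while the orthogonal line produces a maximum distribution modulus, or the maximum moment of inertia.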
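The two roots of eq. (30) and the choice between them can be illustrated with a short calculation. This is a sketch, not the author's procedure: the function name `solve_eq30` is hypothetical, unit values of W and zero measurement errors are assumed, and the minimum root is picked by evaluating the sum of B(p)^2 from eq. (19) at both candidate angles.

```python
import math

def solve_eq30(x, y):
    """Return (theta_min, theta_max) in degrees for W = 1 and zero errors."""
    n = len(x)
    cg1, cg2 = sum(x) / n, sum(y) / n
    dx = [xi - cg1 for xi in x]
    dy = [yi - cg2 for yi in y]
    num = sum(-2.0 * a * b for a, b in zip(dx, dy))    # numerator of eq. (30)
    den = sum(-a * a + b * b for a, b in zip(dx, dy))  # denominator of eq. (30)

    def moment(t):
        # Sum of B(p)**2 about a line through G at angle t, eq. (19).
        return sum((-a * math.sin(t) + b * math.cos(t)) ** 2
                   for a, b in zip(dx, dy))

    t1 = 0.5 * math.atan2(num, den)  # one root of tan(2 theta) = num / den
    t2 = t1 + math.pi / 2            # the orthogonal root
    t_min, t_max = sorted((t1, t2), key=moment)
    return math.degrees(t_min), math.degrees(t_max)

# Case 1 test data from the paper: points on the line y = 0.75 x + 0.5.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.25, 2.0, 2.75, 3.5, 4.25]
theta_min, theta_max = solve_eq30(x, y)
print(round(theta_min, 2), round(theta_max, 2))  # 36.87 -53.13
```

An angle of -53.13 degrees describes the same line as 126.87 degrees, that is, the first solution plus 90 degrees. `math.atan2` is used instead of a plain division so that a zero denominator in eq. (30) does not raise an error.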
THE BEST FITTING LINE
The analytic equation for a straight line is shown in eq. (31), in which x and y define a point on the line, m is the slope of the line and b is the y-intercept.

$$y = mx + b \tag{31}$$

The value of b is defined in eq. (32) by substituting, in eq. (31), $cg_2$ for y, $cg_1$ for x, and $\tan\theta$ for m.

$$b = cg_2 - cg_1 \tan\theta \tag{32}$$

CONCLUSIONS
The solutions developed in eqs. (30) and (32) define two parameters for a line best fitting N data points measured in an experiment, in which N must be greater than 2. The slope of this straight line results from $\tan 2\theta$, from which $\theta$ is defined. Substituting $\theta$ into eq. (15) produces a pair of orthogonal axes defined by the unit vectors $\lambda_i$ and $\mu_i$. The unit vector $\lambda_i$ is associated with the minimum moment of inertia (a term common in engineering), while the unit vector $\mu_i$ defines the axis for which the moment of inertia is a maximum.

Distribution is suggested as an alternative description to the engineering terms of moments of inertia, products of inertia, or polar moments of inertia. Inertia is a word extracted from the Latin writings of the early mathematicians because there was no English equivalent that could be used to translate inertia from Latin to English. Thus, the word inertia was simply imported from the Latin text into English texts. This import process has been used when reading other languages. For example, when the documents gathered and stored by Arab traders were translated, there were many Arabic words that had no English equivalents. Consider this short list of words that came (after 1492) into the English language from the library at Toledo: azimuth, magazine, tabak (tobacco), sugar, or wadi (a creek bed). Many Latin words found their way into English; after 1620 came words like orbit, revolution and propaganda.

Least squares, used by Legendre, probably takes its name from the sum of the squares of N residuals [1]. That sum is:

$$S = \sum_{i=1}^{n} \left( y_i - f(x_i) \right)^2 \tag{33}$$
In contrast, the terms in both the numerator and the denominator of eq. (30) have very real meanings. The numerator is the product particle distribution about the z-axis, an axis that pierces the x-y plane at the centroid of the N particles located by eqs. (9) and (11). Similarly, the denominator of eq. (30) describes a distribution modulus. In engineering terms, the numerator is twice the product of inertia of the N particles about the x and y axes through G, while the denominator is the difference between the two moments of inertia about those axes.

There are at least two differences between least squares solutions and the vector algebra used in the derivation from which eqs. (30) and (32) emerge. First, the least squares method assumes that all experimental errors are randomly distributed. In the vector method presented in this paper, or minimum distribution theory, experimental errors can be measured or estimated and subsequently applied to the x, y and W measurements. Second, the minimum distribution theory defines the y-intercept by an algebraic process, with no need to make any prior estimates as is necessary in the least squares method. If there is any complication caused by including errors in a minimum distribution solution, it is that the effect of an error or errors usually requires multiple solutions. However, multiple solutions determine only the relative effects produced by positive or negative errors in the measurements.

For many minimum distribution solutions, the x-y coordinate axes are orthogonal, but this is not a requirement. The reference X-Y axes, however, must be orthogonal. An x-axis might be defined by a unit vector rotated by an angle $\alpha$ from the reference X-axis, so that $s_x = [\cos\alpha,\ \sin\alpha,\ 0]$. If the y-axis is rotated by a negative $\alpha$ from the reference Y-axis, then the unit vector is $s_y = [\sin\alpha,\ \cos\alpha,\ 0]$. If these axes were orthogonal, the dot product $s_x \cdot s_y$ would be zero. (A dot product of two vectors is defined as the product of the magnitudes of the vectors multiplied by the cosine of the angle between the vectors.) For the case cited, $s_x \cdot s_y = 2\cos\alpha\,\sin\alpha$, which is zero only if $\alpha = 0^\circ$ or $90^\circ$. As a consequence of using non-orthogonal coordinates, the experimental x and y measurements can have three components, of which none are necessarily zero.

And finally, but only if the minimum distribution theory is seriously considered, tested and found useful, it is recommended that the least squares method not be used. Instead of its use as a staple of statistical calculations, the minimum distribution method should be made part of the mathematics of vector algebra and introduced into statistics.

A TEST OF THE MINIMUM DISTRIBUTION METHOD
Three sets of experimental data are used: (1) a set of data with no errors, in which the x and y components are uniformly spaced; (2) a perfect set of data, with x-axis errors, in which the x and y components are not uniformly spaced; and (3) a set of data with errors $\varepsilon_x(p)$, $\varepsilon_y(p)$ and $\varepsilon_W(p)$, with positive and negative errors. In each case the test condition fits the equation y = mx + b, beginning with m = 0.75 and b = 0.5.
Test data (the last row lists the centroid measurements x̄ and ȳ and the sum of W):

            Case 1                   Case 2                   Case 3
  p       x      y      W         x      y      W         x      y      W
  1     1.00   1.25   1.00      1.00   1.25   1.00      1.00   1.25   1.00
  2     2.00   2.00   1.00      1.75   2.00   1.00      1.75   2.00   1.25
  3     3.00   2.75   1.00      3.25   2.75   1.00      3.25   2.75   1.50
  4     4.00   3.50   1.00      3.75   3.50   1.00      3.75   3.50   1.75
  5     5.00   4.25   1.00      4.75   4.25   1.00      4.75   4.25   2.00
 sum    3.00   2.750  5.0       2.90   2.750  5.0       3.217  2.883  7.5

For reference purposes (columns 9 and 10 of the original table): the radius of gyration is $k = \sqrt{I(\text{max or min}) / \sum W}$.

Case 1
  p     x-x̄      y-ȳ      I(max)    I(min)    2(xy)    (x²-y²)
  1    -2.00    -1.500    4.000     2.2500    6.00     1.7500
  2    -1.00    -0.750    1.000     0.5625    1.50     0.4375
  3     0.00     0.000    0.000     0.0000    0.00     0.0000
  4     1.00     0.750    1.000     0.5625    1.50     0.4375
  5     2.00     1.500    4.000     2.2500    6.00     1.7500
 sum                     10.0       5.625    15.00     4.3750
 Tan 2θ = 3.4290, θ = 36.8709, k(max) = 1.4142, k(min) = 1.0607

Case 2
  p     x-x̄      y-ȳ      I(max)    I(min)    2(xy)    (x²-y²)
  1    -1.90    -1.500    3.6100    2.2500    5.700    1.3600
  2    -1.15    -0.750    1.3225    0.5625    1.725    0.7600
  3     0.35     0.000    0.1225    0.0000    0.000    0.1225
  4     0.85     0.750    0.7225    0.5625    1.275    0.1600
  5     1.85     1.500    3.4225    2.2500    5.550    1.1725
 sum                      9.2000    5.6250   14.250    3.5750
 Tan 2θ = 3.9860, θ = 37.9582, k(max) = 1.3565, k(min) = 1.0607

Case 3
  p     x-x̄       y-ȳ       2(xy)     (x²-y²)   I(max)    I(min)
  1    -2.2167   -1.6333    7.2411    2.2458    4.9136    2.6678
  2    -1.4667   -1.1333    3.3244    0.8667    2.1511    1.2844
  3     0.0333    0.1167    0.0078    0.0125    0.0011    0.0136
  4     0.5333    0.3667    0.3911    0.1500    0.2844    0.1344
  5     1.5333    1.1167    3.4244    1.1042    2.3511    1.2469
 sum                       14.3889    4.3542    9.7014    5.3472
 Tan 2θ = 3.3305, θ = 36.6436, k(max) = 1.3851, k(min) = 0.7619

A summary of the best fitting lines:
Case 1. y = 0.7500 x + 0.5
Case 2. y = 0.7801 x + 0.4876
Case 3. y = 0.7422 x + 0.4959
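The Case 1 and Case 2 lines in the summary can be reproduced with a few lines of Python. The sketch below is an illustration under stated assumptions, not the author's worksheet: the function name `fit_line` is hypothetical, W = 1 and zero measurement errors are assumed (so Case 3 is not covered), and `math.atan2` is used to select the minimum root of eq. (30) directly.

```python
import math

def fit_line(x, y):
    """Slope m and intercept b from eqs. (30) and (32), W = 1, no errors."""
    n = len(x)
    cg1, cg2 = sum(x) / n, sum(y) / n
    sxx = sum((xi - cg1) ** 2 for xi in x)
    syy = sum((yi - cg2) ** 2 for yi in y)
    sxy = sum((xi - cg1) * (yi - cg2) for xi, yi in zip(x, y))
    # Same ratio as eq. (30); these atan2 arguments pick the minimum root.
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    m = math.tan(theta)
    b = cg2 - cg1 * m  # eq. (32)
    return m, b

y = [1.25, 2.0, 2.75, 3.5, 4.25]
m1, b1 = fit_line([1.0, 2.0, 3.0, 4.0, 5.0], y)      # Case 1
m2, b2 = fit_line([1.0, 1.75, 3.25, 3.75, 4.75], y)  # Case 2

print(f"Case 1: y = {m1:.4f} x + {b1:.4f}")  # slope 0.7500, intercept 0.5000
print(f"Case 2: y = {m2:.4f} x + {b2:.4f}")  # slope about 0.7801, intercept about 0.4877
```

The Case 2 intercept computed this way agrees with the tabulated 0.4876 to within rounding in the fourth decimal place. A data set whose best fitting line is vertical would make `math.tan` return a very large slope, so this sketch is only suitable for lines that are not close to vertical.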
MY FIRST BOOK PURCHASE [Lagniappe]
In a window display at Henry's Book Store on the 20th of May in 1944 I saw the Handbook of Mathematical Tables and Formulas, a book compiled for the Handbook Publishers, Inc. of Sandusky, Ohio by Richard Stevens Burington, Ph.D. I entered the store, where I was met by the owner. It was obvious to the owner and to me that a 16-year-old high school pupil had no immediate need for a book of formulas, but there was this letter on the first page of this second edition addressed "To the Student who uses this Handbook". I read the letter and then bought the book. The last three sentences of that letter are: "The subject matter is not ephemeral but everlasting -- as true in the future as it has been in the past. By all means, retain this book for your own reference library. You will need it many times in years to come."

ABOUT THE 6th CENTURY CHINESE π [Lagniappe]
The Chinese π is defined by the ratio 355/113, or 3.14159292. The traditional π is 3.14159265, so the Chinese value is greater by only:

$$100 \times \frac{3.14159292 - 3.14159265}{3.14159265} = 8.491 \times 10^{-6}\ \%$$
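The percentage above can be confirmed with full floating-point precision. This short Python check is an illustration only; the variable names are arbitrary.

```python
import math

ratio = 355 / 113
excess_percent = 100 * (ratio - math.pi) / math.pi

print(f"{ratio:.8f}")           # 3.14159292
print(f"{excess_percent:.3e}")  # 8.491e-06
```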