
Lecture 4 Least Square Fit


Given numeric data points, find an equation that approximates the data
with a least square fit. This is one of many techniques for getting
an analytical approximation to numeric data.


 The problem is stated as follows:
   Given measured data for values of Y based on values of X1, X2 and X3, e.g.

    Y_actual         X1      X2     X3    observation, i
    --------       -----   -----  -----   --------------
     32.5           1.0     2.5    3.7                 1
      7.2           2.0     2.5    3.6                 2
      6.9           3.0     2.7    3.5                 3
     22.4           2.2     2.1    3.1                 4
     10.4           1.5     2.0    2.6                 5
     11.3           1.6     2.0    3.1                 6

  Find A,B and C such that   Y_approximate =  A * X1 + B * X2 + C * X3
  and such that the sum of (Y_actual - Y_approximate) squared is minimized.

 The method for determining the coefficients A,B and C follows directly
 from the problem definition and the mathematical analysis given below.

 Set up and solve the system of linear equations:
 (Each SUM is for i=1 thru 6 per table above)

 | SUM(X1*X1)  SUM(X1*X2)  SUM(X1*X3) |   | A |   | SUM(X1*Y) |
 | SUM(X2*X1)  SUM(X2*X2)  SUM(X2*X3) | x | B | = | SUM(X2*Y) |
 | SUM(X3*X1)  SUM(X3*X2)  SUM(X3*X3) |   | C |   | SUM(X3*Y) |
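
 For the six observations above, the whole computation fits in a few
 lines of C.  The following is a minimal sketch, not one of the course
 sample files: it accumulates the sums and solves the resulting 3x3
 system by Cramer's rule (reasonable only for a system this small).

 #include <stdio.h>

 /* determinant of a 3x3 matrix, expanded along the first row */
 static double det3(double m[3][3])
 {
     return m[0][0]*(m[1][1]*m[2][2] - m[1][2]*m[2][1])
          - m[0][1]*(m[1][0]*m[2][2] - m[1][2]*m[2][0])
          + m[0][2]*(m[1][0]*m[2][1] - m[1][1]*m[2][0]);
 }

 int main(void)
 {
     double y[6]    = {32.5, 7.2, 6.9, 22.4, 10.4, 11.3};
     double x[6][3] = {{1.0,2.5,3.7},{2.0,2.5,3.6},{3.0,2.7,3.5},
                       {2.2,2.1,3.1},{1.5,2.0,2.6},{1.6,2.0,3.1}};
     double m[3][3] = {{0.0}}, r[3] = {0.0}, c[3];

     for (int i = 0; i < 6; i++)          /* accumulate the sums */
         for (int j = 0; j < 3; j++) {
             for (int k = 0; k < 3; k++)
                 m[j][k] += x[i][j] * x[i][k];   /* SUM(Xj*Xk) */
             r[j] += x[i][j] * y[i];             /* SUM(Xj*Y)  */
         }

     double d = det3(m);
     for (int k = 0; k < 3; k++) {        /* Cramer's rule      */
         double mk[3][3];                 /* column k -> rhs    */
         for (int i = 0; i < 3; i++)
             for (int j = 0; j < 3; j++)
                 mk[i][j] = (j == k) ? r[i] : m[i][j];
         c[k] = det3(mk) / d;
     }
     printf("A = %g  B = %g  C = %g\n", c[0], c[1], c[2]);
     return 0;
 }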
 

Now, suppose you wanted a constant term to make the fit:
  Y_approximate =  Y0 + A * X1 + B * X2 + C * X3
Then the linear equations would be:

 | SUM( 1* 1)  SUM( 1*X1)  SUM( 1*X2)  SUM( 1*X3) |   | Y0 |   | SUM( 1*Y) |
 | SUM(X1* 1)  SUM(X1*X1)  SUM(X1*X2)  SUM(X1*X3) |   | A  |   | SUM(X1*Y) |
 | SUM(X2* 1)  SUM(X2*X1)  SUM(X2*X2)  SUM(X2*X3) | x | B  | = | SUM(X2*Y) |
 | SUM(X3* 1)  SUM(X3*X1)  SUM(X3*X2)  SUM(X3*X3) |   | C  |   | SUM(X3*Y) |
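
 Rather than coding each case separately, the accumulation can be
 written once.  The following is a minimal C sketch (build_normal is a
 hypothetical helper, not one of the sample files below): treating the
 constant 1 as independent variable number zero makes the 4x4 system
 above fall out of the same loops.

 /* accumulate the normal equations into the augmented matrix
    m = [ M | rhs ], which the caller zeroes.  When a constant
    term Y0 is wanted, set t[i][0] = 1.0 for every observation i,
    as in the 4x4 system above. */
 void build_normal(int nd, int nt, const double y[nd],
                   const double t[nd][nt], double m[nt][nt+1])
 {
     for (int i = 0; i < nd; i++)          /* each observation  */
         for (int j = 0; j < nt; j++) {    /* each matrix row   */
             for (int k = 0; k < nt; k++)
                 m[j][k] += t[i][j] * t[i][k];  /* SUM(Xj*Xk)   */
             m[j][nt] += t[i][j] * y[i];        /* SUM(Xj*Y)    */
         }
 }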


 Y is called the dependent variable and X1 .. Xn the independent variables.
 The procedures below implement a few special cases and the general case.
    The number of independent variables can vary, e.g. 2D, 3D, etc.
    The approximation equation may use powers of the independent variables.
    The user may create additional independent variables, e.g. X2 = SIN(X1),
    with the restriction that the independent variables be linearly
    independent, i.e.  Xi not equal  p*Xj + q  for any i,j and constants p,q.



 The mathematical derivation of the least square fit is as follows:

 Given data for the dependent variable Y in terms of the independent
 variables S,T,U and V, consider that there exists a function F
 such that     Y = F(S,T,U,V)
 The problem is to find coefficients A,B,C and D such that
            Y_approximate = A * S + B * T + C * U + D * V
 and such that the sum of ( Y - Y_approximate ) squared is minimized.

 Note: A, B, C, D are scalars. S, T, U, V, Y, Y_approximate are vectors.

 To find the minimum of  SUM((Y - Y_approximate)^2)
 the derivatives must be taken with respect to the coefficients
 A,B,C and D, and all must equal zero simultaneously. The steps follow:

  SUM((Y - Y_approximate)^2) = SUM((Y - A*S - B*T - C*U - D*V)^2)

 d/dA =  -2 * SUM( S * (Y - A*S - B*T - C*U - D*V) )
 d/dB =  -2 * SUM( T * (Y - A*S - B*T - C*U - D*V) )
 d/dC =  -2 * SUM( U * (Y - A*S - B*T - C*U - D*V) )
 d/dD =  -2 * SUM( V * (Y - A*S - B*T - C*U - D*V) )

 Setting each of the above equal to zero (a minimum occurs where the
 derivative is zero), dividing out the -2, expanding the products and
 moving the Y term to the right hand side gives

  SUM( A*S*S + B*S*T + C*S*U + D*S*V ) = SUM( S*Y )
  SUM( A*T*S + B*T*T + C*T*U + D*T*V ) = SUM( T*Y )
  SUM( A*U*S + B*U*T + C*U*U + D*U*V ) = SUM( U*Y )
  SUM( A*V*S + B*V*T + C*V*U + D*V*V ) = SUM( V*Y )

 Distributing the SUM over each term and factoring out the constant
 coefficients A,B,C and D yields

  A * SUM(S*S) + B * SUM(S*T) + C * SUM(S*U) + D * SUM(S*V) = SUM(S*Y)
  A * SUM(T*S) + B * SUM(T*T) + C * SUM(T*U) + D * SUM(T*V) = SUM(T*Y)
  A * SUM(U*S) + B * SUM(U*T) + C * SUM(U*U) + D * SUM(U*V) = SUM(U*Y)
  A * SUM(V*S) + B * SUM(V*T) + C * SUM(V*U) + D * SUM(V*V) = SUM(V*Y)

 To find the coefficients A,B,C and D solve the linear system of equations

    | SUM(S*S)  SUM(S*T)  SUM(S*U)  SUM(S*V) |   | A |   | SUM(S*Y) |
    | SUM(T*S)  SUM(T*T)  SUM(T*U)  SUM(T*V) | x | B | = | SUM(T*Y) |
    | SUM(U*S)  SUM(U*T)  SUM(U*U)  SUM(U*V) |   | C |   | SUM(U*Y) |
    | SUM(V*S)  SUM(V*T)  SUM(V*U)  SUM(V*V) |   | D |   | SUM(V*Y) |
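
 Any linear-equation solver finishes the job.  Below is a minimal C
 sketch using Gaussian elimination with partial pivoting; solve is a
 hypothetical name, and the sample files listed later contain the
 course's own solvers.

 #include <math.h>

 /* solve the n x n system held in the augmented matrix m = [ M | rhs ];
    the coefficients are returned in c.  No singularity checks. */
 void solve(int n, double m[n][n+1], double c[n])
 {
     for (int k = 0; k < n; k++) {            /* eliminate column k */
         int p = k;                           /* pick pivot row     */
         for (int i = k+1; i < n; i++)
             if (fabs(m[i][k]) > fabs(m[p][k])) p = i;
         for (int j = k; j <= n; j++) {       /* swap rows k and p  */
             double t = m[k][j]; m[k][j] = m[p][j]; m[p][j] = t;
         }
         for (int i = k+1; i < n; i++) {
             double f = m[i][k] / m[k][k];
             for (int j = k; j <= n; j++) m[i][j] -= f * m[k][j];
         }
     }
     for (int i = n-1; i >= 0; i--) {         /* back substitution  */
         c[i] = m[i][n];
         for (int j = i+1; j < n; j++) c[i] -= m[i][j] * c[j];
         c[i] /= m[i][i];
     }
 }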

 Some observations:
     S,T,U and V must be linearly independent.
     There must be more data sets (Y, S, T, U, V) than variables.
     The analysis does not depend on the number of independent variables.
     A polynomial fit results from the substitutions S=1, T=X, U=X^2, V=X^3
     (a sketch follows this list).
     The general case of any order polynomial in any number of variables
     may be handled with substitutions, for example, S=1, T=a, U=b, V=a^2,
     W=a*b, X=b^2, etc., up to terms such as a^5 * b^6.
     Any number of terms may be used. The "1" is for the constant term.
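
 As a concrete instance of the substitution S=1, T=X, U=X^2, V=X^3,
 the following C sketch expands raw (X,Y) data into the term table
 that build_normal above would consume; build_terms is a hypothetical
 helper, not one of the course routines.

 /* fill t[i][j] = X[i]^j for j = 0 .. nt-1, so a cubic fit uses
    nt = 4 terms: 1, X, X^2, X^3 */
 void build_terms(int nd, int nt, const double x[nd], double t[nd][nt])
 {
     for (int i = 0; i < nd; i++) {
         double p = 1.0;
         for (int j = 0; j < nt; j++) {
             t[i][j] = p;
             p *= x[i];
         }
     }
 }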


Sample code in various languages:

least_square_fit.c
least_square_fit_c.out

least_square_fit.f90
least_square_fit_f90.out

least_square_fit.java
least_square_fit_java.out

generic_real_least_square_fit.ada

lsfit.m MATLAB source code (tiny!)
lsfit_m.out MATLAB output and plot




To see how polynomials in two and three variables may be fit,
note that the terms for two and three variables in a polynomial are:
    a^0 b^0          a^0 b^0 c^0   constant, "1"

    a^1 b^0          a^1 b^0 c^0   just a
    a^0 b^1          a^0 b^1 c^0   just b
                     a^0 b^0 c^1

    a^2 b^0          a^2 b^0 c^0   just a^2
    a^1 b^1          a^1 b^1 c^0   a*b
    a^0 b^2          a^1 b^0 c^1
                     a^0 b^2 c^0
                     a^0 b^1 c^1
                     a^0 b^0 c^2

Then terms with a sum of the powers equal to 3, 4, etc.
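
 The ordering above is easy to generate.  This tiny C program, a
 sketch rather than one of the sample files, prints the three
 variable terms a^p b^q c^r grouped by total power:

 #include <stdio.h>

 int main(void)
 {
     int max_power = 2;
     for (int s = 0; s <= max_power; s++)      /* total power */
         for (int p = s; p >= 0; p--)          /* power of a  */
             for (int q = s - p; q >= 0; q--)  /* power of b  */
                 printf("a^%d b^%d c^%d\n", p, q, s - p - q);
     return 0;
 }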

Note that the matrix has the first row as the sum of each term
multiplied by the first term. The second row is the sum of each
term multiplied by the second term, etc.

The data for the terms comes from the raw data sets of
y_actual a b    or  y_actual a b c

being used to determine a fit
y_approx=F(a,b) or  y_approx=F(a,b,c)

Terms for the data point  y1  a1  b1  are:
 1  a1  b1  a1^2  a1*b1  b1^2   with right hand side value  y1

These terms are multiplied by the first term, "1" and added to row 1.
These terms are multiplied by the second term, "a1" and added to row 2, etc.

Then the terms for data point  y2  a2  b2  are:
 1  a2  b2  a2^2  a2*b2  b2^2   with right hand side value  y2

These terms are multiplied by the first term, "1" and added to row 1.
These terms are multiplied by the second term, "a2" and added to row 2, etc.

Then the simultaneous equations are solved for the coefficients C1, C2, ...
to get the approximating function

 y_approx = F(a,b) = C1 + C2*a + C3*b + C4*a^2 + C5*a*b + C6*b^2
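
In C the row by row accumulation just described might look like the
sketch below; add_point is a hypothetical helper that folds one data
point into the augmented matrix, which is then handed to any linear
solver, such as the solve sketch earlier.

 #define NT 6   /* terms: 1, a, b, a^2, a*b, b^2 */

 /* add one data point (a, b, y) to the normal equations held in
    the augmented matrix m = [ M | rhs ] */
 void add_point(double m[NT][NT+1], double a, double b, double y)
 {
     double t[NT] = {1.0, a, b, a*a, a*b, b*b};
     for (int j = 0; j < NT; j++) {       /* row j of the matrix */
         for (int k = 0; k < NT; k++)
             m[j][k] += t[j] * t[k];      /* term k times term j */
         m[j][NT] += t[j] * y;            /* right hand side     */
     }
 }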

The following sample programs compute the least square fit of
internally generated data for low polynomial powers and compute
the accuracy of the fit in terms of root-mean-square error,
average error and maximum error. Note that the fit becomes exact
when the data is from a low order polynomial and the fit uses at
least that order polynomial.
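
Those error measures might be computed as in the following sketch;
fit_errors is a hypothetical name, and yfit is assumed to hold the
fitted value at each data point.

 #include <math.h>

 /* root-mean-square, average and maximum absolute error of a fit */
 void fit_errors(int nd, const double y[], const double yfit[],
                 double *rms, double *avg, double *max_err)
 {
     double s2 = 0.0, s1 = 0.0, mx = 0.0;
     for (int i = 0; i < nd; i++) {
         double e = fabs(y[i] - yfit[i]);
         s2 += e * e;
         s1 += e;
         if (e > mx) mx = e;
     }
     *rms = sqrt(s2 / nd);
     *avg = s1 / nd;
     *max_err = mx;
 }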

least_square_fit_2d.c
least_square_fit_2d.out

least_square_fit_3d.c
least_square_fit_3d.out

You can translate the above to your favorite language.

Now, if everything works, here is a live interactive demonstration of
least square fit. The files needed are:
Matrix.java
LeastSquareFit.java
LeastSquareFitFrame.java
LeastSquareFit.out
LeastSquareFitAbout.txt
LeastSquareFitHelp.txt
LeastSquareFitEvaluate.txt
LeastSquareFitAlgorithm.txt
LeastSquareFitIntegrate.txt

