CMSC 455 Lecture 4, Least Square Fit

    <- previous    index    next ->

Lecture 4 Least Square Fit


Given numeric data points, find an equation that approximates the data
with a least square fit. This is one of many techniques for getting
an analytical approximation to numeric data.


 The problem is stated as follows :
   Given measured data for values of Y based on values of X1,X2 and X3. e.g.

    Y_actual         X1      X2     X3    observation, i
    --------       -----   -----  -----
     32.5           1.0     2.5    3.7                 1
      7.2           2.0     2.5    3.6                 2
      6.9           3.0     2.7    3.5                 3
     22.4           2.2     2.1    3.1                 4
     10.4           1.5     2.0    2.6                 5
     11.3           1.6     2.0    3.1                 6

  Find a, b and c such that   Y_approximate =  a * X1 + b * X2 + c * X3
  and such that the sum of (Y_actual - Y_approximate) squared is minimized.

 The method for determining the coefficients a, b and c follows directly
 form the problem definition and mathematical analysis given below.

 Set up and solve the system of linear equations:
 (Each SUM is for i=1 thru 6 per table above)

 | SUM(X1*X1)  SUM(X1*X2)  SUM(X1*X3) |   | a |   | SUM(X1*Y) |
 | SUM(X2*X1)  SUM(X2*X2)  SUM(X2*X3) | x | b | = | SUM(X2*Y) |
 | SUM(X3*X1)  SUM(X3*X2)  SUM(X3*X3) |   | c |   | SUM(X3*Y) |
 

Now, suppose you wanted a constant term to make the fit:
  Y_approximate =  Y0 + a * X1 + b * X2 + c * X3
Then the linear equations would be:

 | SUM( 1* 1)  SUM( 1*X1)  SUM( 1*X2)  SUM( 1*X3) |   | Y0 |   | SUM( 1*Y) |
 | SUM(X1* 1)  SUM(X1*X1)  SUM(X1*X2)  SUM(X1*X3) |   | a  |   | SUM(X1*Y) |
 | SUM(X2* 1)  SUM(X2*X1)  SUM(X2*X2)  SUM(X2*X3) | x | b  | = | SUM(X2*Y) |
 | SUM(X3* 1)  SUM(X3*X1)  SUM(X3*X2)  SUM(X3*X3) |   | c  |   | SUM(X3*Y) |

Note the symmetry! Easy to program.

Note the simultaneous equations, from Lecture 3:  |A| x |X| = |Y|
|A| and |Y| easily computable, solve for |X| to get Y0, a, b and c

We now have a simple equation to compute Y approximately from a reasonable
range of X1, X2, and X3.

 Y is called the dependent variable and X1 .. Xn the independent variables.
 The procedures below implement a few special cases and the general case.
    The number of independent variables can vary, e.g. 2D, 3D, etc. .
    The approximation equation may use powers of the independent variables
    The user may create additional independent variables e.g. X2 = SIN(X1)
    with the restriction that the independent variables are linearly
    independent.  e.g.  Xi not equal  p Xj + q  for all i,j,p,q


Mathematical derivation

 The mathematical derivation of the least square fit is as follows :

 Given data for the independent variable Y in terms of the dependent
 variables S,T,U and V  consider that there exists a function F
 such that     Y = F(S,T,U,V)
 The problem is to find coefficients a,b,c and d such that
            Y_approximate = a * S + b * T + c * U + d * V
 and such that the sum of ( Y - Y_approximate ) squared is minimized.

 Note: a, b, c, d are scalars. S, T, U, V, Y, Y_approximate are vectors.

 To find the minimum of  SUM((Y - Y_approximate)^2)
 the derivatives must be taken with respect to a,b,c and d and
 all must equal zero simultaneously. The steps follow :

  SUM((Y - Y_approximate)^2) = SUM((Y - a*S - b*T - c*U - d*V)^2)

 d/da =  -2 * S * SUM( Y - a*S - b*T - c*U - d*V )
 d/db =  -2 * T * SUM( Y - a*S - b*T - c*U - d*V )
 d/dc =  -2 * U * SUM( Y - a*S - b*T - c*U - d*V )
 d/dd =  -2 * V * SUM( Y - a*S - b*T - c*U - d*V )

 Setting each of the above equal to zero (derivative minimum at zero)
 and putting constant term on left, the -2 is factored out,
 the independent variable is moved inside the summation

  SUM( a*S*S + b*S*T + c*S*U + d*S*V = S*Y )
  SUM( a*T*S + b*T*T + c*T*U + d*T*V = T*Y )
  SUM( a*U*S + b*U*T + c*U*U + d*U*V = U*Y )
  SUM( a*V*S + b*V*T + c*V*U + d*V*V = V*Y )

 Distributing the SUM inside yields

  a * SUM(S*S) + b * SUM(S*T) + c * SUM(S*U) + d * SUM(S*V) = SUM(S*Y)
  a * SUM(T*S) + b * SUM(T*T) + c * SUM(T*U) + d * SUM(T*V) = SUM(T*Y)
  a * SUM(U*S) + b * SUM(U*T) + c * SUM(U*U) + d * SUM(U*V) = SUM(U*Y)
  a * SUM(V*S) + b * SUM(V*T) + c * SUM(V*U) + d * SUM(V*V) = SUM(V*Y)

 To find the coefficients a,b,c and d solve the linear system of equations

    | SUM(S*S)  SUM(S*T)  SUM(S*U)  SUM(S*V) |   | a |   | SUM(S*Y) |
    | SUM(T*S)  SUM(T*T)  SUM(T*U)  SUM(T*V) | x | b | = | SUM(T*Y) |
    | SUM(U*S)  SUM(U*T)  SUM(U*U)  SUM(U*V) |   | c |   | SUM(U*Y) |
    | SUM(V*S)  SUM(V*T)  SUM(V*U)  SUM(V*V) |   | d |   | SUM(V*Y) |

 Some observations :
     S,T,U and V must be linearly independent.
     There must be more data sets (Y, S, T, U, V) than variables.
     The analysis did not depend on the number of independent variables
     A polynomial fit results from the substitutions S=1, T=x, U=x^2, V=x^3
     The general case for any order polynomial of any number of variables
     may be used with a substitution, for example, S=1, T=x, U=y, V=x^2,
     W=x*y, X= y^2,  etc to terms such as x^5 * y^6.
     Any number of terms may be used. The "1" is for the constant term.

fitting a simple polynomial, 1D

Now, suppose you wanted to fit a simple polynomial:
Given the value of Y for at least four values of X, 
  Y_approximate =  C0 + C1 * X + C2 * X^2 +  C3 * X^3
Then the linear equations would be:

| SUM(1  *1) SUM(1  *X) SUM(1  *X^2) SUM(1  *X^3) | | C0 |   | SUM(1  *Y) |
| SUM(X  *1) SUM(X  *X) SUM(X  *X^2) SUM(X  *X^3) | | C1 |   | SUM(X  *Y) |
| SUM(X^2*1) SUM(X^2*X) SUM(X^2*X^2) SUM(X^2*X^3) |x| C2 | = | SUM(X^2*Y) |
| SUM(X^3*1) SUM(X^3*X) SUM(X^3*X^2) SUM(X^3*X^3) | | C3 |   | SUM(X^3*Y) |

Note that the (i,j) subscript in the matrix has X^(i+j)
for i=0..3, j=0..3

Solve the simultaneous equations for C0, C1, C2, C3
Note that the sum is taken over all observations and the "1" is
just shown to emphasize the symmetry.

Sample code in various languages:

least_square_fit.c
least_square_fit_c.out

least_square_fit.f90
least_square_fit_f90.out

least_square_fit.java
least_square_fit_java.out
least_square_fit_3d.java
least_square_fit_3d_java.out
least_square_fit_4d.java
least_square_fit_4d_java.out
A specialized version for use later with PDE's
lsfit.java
test_lsfit.java
test_lsfit_java.out
test_lsfit2.java
test_lsfit2_java.out
uses simeq.java

The Makefile entry that makes  test_lsfit_java.out
test_lsfit_java.out: test_lsfit.java lsfit.java simeq.java
	javac -cp . simeq.java
	javac -cp . lsfit.java
	javac -cp . test_lsfit.java
	java  -cp . test_lsfit > test_lsfit_java.out
	rm -f *.class




least_square_fit.adb
least_square_fit_ada.out
least_square_fit_2d.adb
least_square_fit_2d_ada.out
least_square_fit_3d.adb
least_square_fit_3d_ada.out
real_arrays.ads
real_arrays.adb

A specialized version for use later with PDE's
lsfit.ads
lsfit.adb
test_lsfit6.adb
test_lsfit6_ada.out
array4d.ads
test_lsfit5.adb
test_lsfit5_ada.out
array5d.ads
test_lsfit4.adb
test_lsfit4_ada.out
array4d.ads
test_lsfit3.adb
test_lsfit3_ada.out
array3d.ads
test_lsfit2.adb
test_lsfit2_ada.out
integer_arrays.ads
integer_arrays.adb
real_arrays.ads
real_arrays.adb

The Makefile entry to make  test_lsfit6_ada.out
test_lsfit6_ada.out: test_lsfit6.adb lsfit.ads lsfit.adb \
                    array3d.ads array4d.ads array5d.ads array6d.ads \
                    real_arrays.ads real_arrays.adb \
                    integer_arrays.ads integer_arrays.adb
	gnatmake test_lsfit6.adb
	./test_lsfit6 > test_lsfit6_ada.out
	rm -f test_lsfit
	rm -f *.ali
	rm -f *.o

similarly for test_lsfit2_ada.out, test_lsfit3_ada.out,
              test_lsfit4_ada.out, test_lsfit5_ada.out


Similar code in plain C
lsfit.h
lsfit.c
test_lsfit.c
test_lsfit_c.out



truncated series vs lsfit for sin(x+y) and sin(x+y+z)
using code from above, another test case:
fit_sin.adb demonstrates that a lsfit to a specified power,
does not give the same coefficients as a truncated approximation,
to the same power, and is a more accurate fit.

fit_sin.adb
fit_sin_ada.out


comparison to very small Matlab code
generic_real_least_square_fit.ada

lsfit.m MatLab source code (tiny!)
lsfit_m.out MatLab output and plot




comparison to small Python code

test_polyfit.py Python source code
test_polyfit_py.out Python output and plot






Terms for fitting two and three variables, 2D and 3D
And to see how polynomials in two and three variables may be fit:

Note that the terms for two and three variables in a polynomial are:
    a^0 b^0          a^0 b^0 c^0   constant, "1"

    a^1 b^0          a^1 b^0 c^0   just a
    a^0 b^1          a^0 b^1 c^0   just b
                     a^0 b^0 c^1

    a^2 b^0          a^2 b^0 c^0   just a^2
    a^1 b^1          a^1 b^1 c^0   a*b
    a^0 b^2          a^1 b^0 c^1
                     a^0 b^2 c^0
                     a^0 b^1 c^1
                     a^0 b^0 c^2

Then terms with a sum of the powers equal to 3, 4, 5, 6 are available

Note that the matrix has the first row as the sum of each term
multiplied by the first term. The second row is the sum of each
term multiplied by the second term, etc.

The data for the terms is from the raw data sets of
y_actual a b    or  y_actual a b c 

being used to determine a fit
y_approx=F(a,b) or  y_approx=F(a,b,c)

Terms for the data point y1  a1  b1 are:
 1  a1  b1  a1^2  a1*b1  b1^2  the constant term  y1

These terms are multiplied by the first term, "1" and added to row 1.
These terms are multiplied by the second term, "a1" and added to row 2, etc.

Then the terms for data point  y2  a2  b2 are:
 1  a2  b2  a2^2  a2*b2  b2^2   the constant term  y2

These terms are multiplied by the first term, "1" and added to row 1.
These terms are multiplied by the second term, "a2" and added to row 2, etc.

Then the simultaneous equations are solved for the coefficients C1, C2, ...
to get the approximating function

 y_approx = F(a,b) = C1 + C2*a + C3*b + C4*a^2 + C5*a*b + C6*b^2

The following sample programs compute the least square fit of
internally generated data for low polynomial powers and compute
the accuracy of the fit in terms of root-mean-square error,
average error and maximum error. Note that the fit becomes exact
when the data is from a low order polynomial and the fit uses at
least that order polynomial.

least_square_fit_2d.c
least_square_fit_2d.out

least_square_fit_3d.c
least_square_fit_3d.out

least_square_fit_4d.c
least_square_fit_4d.out

least_square_fit2d.adb
least_square_fit2d_ada.out
real_arrays.ads
reak_arrays.adb

You can translate the above to your favorite language.


Now, if everything works, a live interactive demonstration of
least square fit. 

The files needed are:
Matrix.java
LeastSquareFit.java
LeastSquareFitFrame.java
LeastSquareFit.out
LeastSquareFitAbout.txt
LeastSquareFitHelp.txt
LeastSquareFitEvaluate.txt
LeastSquareFitAlgorithm.txt
LeastSquareFitIntegrate.txt

The default parameter is an n, all, point fit.
Then set parameter to 3, for n=3, third order polynomial fit.
Then set parameter to 4 and 5 to see improvement.


Homework 2


Spline fit, possible demonstration
There are many other ways to fit a set of points.
The Spline fit smooths the fit by controlling derivatives.

SplineFrame.java
(Replace "LeastSquareFit" with "Spline" then get same files as above.)
Demonstration to see a different type of fit.

    <- previous    index    next ->

Lecture 4 Least Square Fit

Mathematical derivation

fitting a simple polynomial, 1D

A specialized version for use later with PDE's

truncated series vs lsfit for sin(x+y) and sin(x+y+z)

comparison to very small Matlab code

comparison to small Python code

Terms for fitting two and three variables, 2D and 3D

Now, if everything works, a live interactive demonstration of

Spline fit, possible demonstration

Other links

Go to top