"

Section 4.6 Best Approximation and Least Squares

Remark: Given a nonzero vector $u$ in $\mathbb{R}^n$, consider the problem of decomposing a vector $y$ in $\mathbb{R}^n$ into the sum of two vectors, one a multiple of $u$ and the other orthogonal to $u$. We wish to write $y=\alpha u+z$, where $\alpha$ is a scalar and $z$ is a vector orthogonal to $u$, so that $z=y-\alpha u$. Now $y-\alpha u$ is orthogonal to $u$ if and only if $(y-\alpha u)\cdot u=0$, if and only if $y\cdot u-\alpha\,(u\cdot u)=0$, if and only if $\alpha=\frac{y\cdot u}{u\cdot u}$, which is the weight of $y$ with respect to $u$ in the linear combination over an orthogonal set $S$ containing $u$. The vector $\hat{y}=\frac{y\cdot u}{u\cdot u}\,u$ is called the orthogonal projection of $y$ onto $u$, and the vector $y-\hat{y}=z$ is called the component of $y$ orthogonal to $u$. Notice that $\hat{y}=\frac{y\cdot u}{u\cdot u}\,u=\frac{y\cdot(cu)}{(cu)\cdot(cu)}\,(cu)$ whenever $c\neq 0$. Hence $\hat{y}$ is determined by the subspace $L$ spanned by $u$ (the line passing through $0$ and $u$). We denote $\hat{y}$ by $\mathrm{proj}_L y$ and call it the orthogonal projection of $y$ onto $L$.

 

Example 1: Let $y=\begin{bmatrix}3\\4\\5\end{bmatrix}$ and $u=\begin{bmatrix}1\\2\\3\end{bmatrix}$.

Find the orthogonal projection of $y$ onto $u$. Then write $y$ as the sum of two orthogonal vectors, one in $\mathrm{Span}\{u\}$ and one orthogonal to $u$.
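The computation is a direct application of the formula above. Here is a minimal NumPy sketch, taking the entries of $y$ and $u$ as printed (the helper name proj_onto is ours, for illustration):

```python
import numpy as np

def proj_onto(y, u):
    """Orthogonal projection of y onto the line spanned by u: (y.u / u.u) u."""
    return (y @ u) / (u @ u) * u

y = np.array([3.0, 4.0, 5.0])
u = np.array([1.0, 2.0, 3.0])

y_hat = proj_onto(y, u)   # component in Span{u}: (26/14) u = [13/7, 26/7, 39/7]
z = y - y_hat             # component orthogonal to u: [8/7, 2/7, -4/7]
print(y_hat, z, z @ u)    # z @ u is 0 (up to rounding), confirming orthogonality
```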

 

 

Exercise 1: Let $y=\begin{bmatrix}2\\1\\3\end{bmatrix}$ and $u=\begin{bmatrix}3\\2\\4\end{bmatrix}$. Find the orthogonal projection of $y$ onto $u$. Then write $y$ as the sum of two orthogonal vectors, one in $\mathrm{Span}\{u\}$ and one orthogonal to $u$.

The orthogonal projection of a point in $\mathbb{R}^2$ onto a line through the origin has an important analogue in $\mathbb{R}^n$. Given a vector $y$ and a subspace $W$ of $\mathbb{R}^n$, there is a vector $\hat{y}$ in $W$ such that $\hat{y}$ is the unique vector in $W$ for which $y-\hat{y}$ is orthogonal to $W$, and $\hat{y}$ is the unique vector in $W$ closest to $y$. These two properties provide the key to finding the least-squares solutions of linear systems.

 

Theorem: Let $W$ be a subspace of $\mathbb{R}^n$. Then each $y$ in $\mathbb{R}^n$ can be written uniquely in the form $y=\hat{y}+z$, where $\hat{y}$ is in $W$ and $z$ is orthogonal to $W$. In fact, if $\{u_1,\dots,u_p\}$ is any orthogonal basis of $W$, then $\hat{y}=\frac{y\cdot u_1}{u_1\cdot u_1}\,u_1+\dots+\frac{y\cdot u_p}{u_p\cdot u_p}\,u_p$ and $z=y-\hat{y}$.
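The formula in the theorem is computed term by term, one projection per basis vector. Below is a minimal NumPy sketch of it; the orthogonal pair used for the demonstration is a hypothetical choice of ours, not data from the text (check: $u_1\cdot u_2=-2-1+3=0$):

```python
import numpy as np

def proj_subspace(y, basis):
    """proj_W y = sum of (y.u_i / u_i.u_i) u_i over an ORTHOGONAL basis {u_i} of W."""
    return sum((y @ u) / (u @ u) * u for u in basis)

u1 = np.array([2.0, 1.0, 1.0])    # hypothetical orthogonal basis of W
u2 = np.array([-1.0, -1.0, 3.0])
y  = np.array([3.0, 4.0, 5.0])

y_hat = proj_subspace(y, [u1, u2])
z = y - y_hat
print(z @ u1, z @ u2)             # both 0 (up to rounding): z is orthogonal to W
```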

 

Definition: The vector $\hat{y}$ is called the orthogonal projection of $y$ onto $W$ and is often written as $\mathrm{proj}_W y$.

 

Proof sketch: For each $i$, orthogonality of the basis gives $z\cdot u_i=(y-\hat{y})\cdot u_i=y\cdot u_i-\frac{y\cdot u_i}{u_i\cdot u_i}(u_i\cdot u_i)=0$, so $z$ is orthogonal to every $u_i$ and hence to all of $W$. For uniqueness, if $y=\hat{y}_1+z_1=\hat{y}_2+z_2$ with each $\hat{y}_i$ in $W$ and each $z_i$ orthogonal to $W$, then $\hat{y}_1-\hat{y}_2=z_2-z_1$ lies in both $W$ and $W^{\perp}$, so it equals $0$.

 

 

Example 2: Let $W=\mathrm{Span}\{u_1,u_2\}$ where $u_1=\begin{bmatrix}2\\1\\1\end{bmatrix}$ and $u_2=\begin{bmatrix}1\\1\\3\end{bmatrix}$. Notice that $\{u_1,u_2\}$ is an orthogonal basis of $W$. Let $y=\begin{bmatrix}3\\4\\5\end{bmatrix}$. Write $y$ as the sum of a vector in $W$ and a vector orthogonal to $W$.

 

 

Exercise 2: Let $W=\mathrm{Span}\{u_1,u_2\}$ where $u_1=\begin{bmatrix}2\\1\\1\end{bmatrix}$ and $u_2=\begin{bmatrix}1\\5\\3\end{bmatrix}$. Notice that $\{u_1,u_2\}$ is an orthogonal basis of $W$. Let $y=\begin{bmatrix}4\\3\\4\end{bmatrix}$. Write $y$ as the sum of a vector in $W$ and a vector orthogonal to $W$.

 

Remark: If $\{u_1,\dots,u_p\}$ is an orthogonal basis for $W$ and if $y$ happens to be in $W$, then the formula for $\mathrm{proj}_W y$ is exactly the same as the representation of $y$ given in the theorem above; that is, $\mathrm{proj}_W y=y$.

 

Theorem: Let $W$ be a subspace of $\mathbb{R}^n$, let $y$ be any vector in $\mathbb{R}^n$, and let $\hat{y}$ be the orthogonal projection of $y$ onto $W$. Then $\hat{y}$ is the closest point in $W$ to $y$, in the sense that $\|y-\hat{y}\|<\|y-v\|$ for all $v$ in $W$ distinct from $\hat{y}$.

 

Remark: The vector $\hat{y}$ is called the best approximation to $y$ by elements of $W$. The distance from $y$ to $v$, given by $\|y-v\|$, can be regarded as the “error” of using $v$ in place of $y$. The theorem says that this error is minimized when $v=\hat{y}$. This theorem also shows that $\hat{y}$ does not depend on the particular orthogonal basis used to compute it.
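One can watch the minimization happen numerically: no point of $W$ beats $\hat{y}$. A minimal sketch, reusing a hypothetical orthogonal basis of our own choosing:

```python
import numpy as np
rng = np.random.default_rng(0)

u1 = np.array([2.0, 1.0, 1.0])    # hypothetical orthogonal basis of W
u2 = np.array([-1.0, -1.0, 3.0])
y  = np.array([3.0, 4.0, 5.0])

y_hat = (y @ u1)/(u1 @ u1)*u1 + (y @ u2)/(u2 @ u2)*u2
best = np.linalg.norm(y - y_hat)

# Random points v = c1 u1 + c2 u2 of W never have smaller error than y_hat.
for _ in range(1000):
    c1, c2 = rng.normal(size=2)
    v = c1*u1 + c2*u2
    assert np.linalg.norm(y - v) >= best - 1e-9
print("minimum error:", best)
```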

 

Example 3: The distance from a point $y$ in $\mathbb{R}^n$ to a subspace $W$ is defined as the distance from $y$ to the nearest point in $W$. Find the distance from $y$ to $W$, where $W=\mathrm{Span}\{u_1,u_2\}$ with $u_1=\begin{bmatrix}2\\3\\0\end{bmatrix}$, $u_2=\begin{bmatrix}3\\2\\1\end{bmatrix}$, and $y=\begin{bmatrix}2\\1\\2\end{bmatrix}$.
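By the theorem, the distance is $\|y-\mathrm{proj}_W y\|$. As a numerical check one can compute $\mathrm{proj}_W y$ from any spanning set, orthogonal or not, by least squares onto the matrix whose columns span $W$ (anticipating the least-squares material later in this section). A sketch, taking the entries above as printed:

```python
import numpy as np

U = np.array([[2.0, 3.0],
              [3.0, 2.0],
              [0.0, 1.0]])   # columns span W, entries as printed above
y = np.array([2.0, 1.0, 2.0])

c, *_ = np.linalg.lstsq(U, y, rcond=None)   # coefficients of proj_W y in the columns of U
y_hat = U @ c                               # proj_W y, the nearest point of W
print(np.linalg.norm(y - y_hat))            # distance from y to W
```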

Exercise 3: The distance from a point $y$ in $\mathbb{R}^n$ to a subspace $W$ is defined as the distance from $y$ to the nearest point in $W$. Find the distance from $y$ to $W$, where $W=\mathrm{Span}\{u_1,u_2\}$ with $u_1=\begin{bmatrix}1\\2\\1\end{bmatrix}$, $u_2=\begin{bmatrix}1\\1\\1\end{bmatrix}$, and $y=\begin{bmatrix}3\\4\\5\end{bmatrix}$.

 

Example 4: Find the best approximation to $y$ by vectors of the form $c_1u_1+c_2u_2$, where $u_1=\begin{bmatrix}2\\3\\2\\2\end{bmatrix}$, $u_2=\begin{bmatrix}2\\2\\0\\1\end{bmatrix}$, and $y=\begin{bmatrix}2\\1\\5\\3\end{bmatrix}$.

 

 

Exercise 4: Find the best approximation to $y$ by vectors of the form $c_1u_1+c_2u_2$, where $u_1=\begin{bmatrix}2\\3\\1\\4\end{bmatrix}$, $u_2=\begin{bmatrix}3\\0\\2\\1\end{bmatrix}$, and $y=\begin{bmatrix}2\\1\\4\\3\end{bmatrix}$.

 

Definition: If $A$ is $m\times n$ and $b$ is in $\mathbb{R}^m$, a least-squares solution of $Ax=b$ is an $\hat{x}$ in $\mathbb{R}^n$ such that $\|b-A\hat{x}\|\le\|b-Ax\|$ for all $x$ in $\mathbb{R}^n$.

 

Remark: 1. Given $A$ and $b$, apply the Best Approximation Theorem to the subspace $\mathrm{Col}\,A$ and let $\hat{b}=\mathrm{proj}_{\mathrm{Col}A}\,b$. Because $\hat{b}$ is in the column space of $A$, the equation $Ax=\hat{b}$ is consistent, and there is an $\hat{x}$ in $\mathbb{R}^n$ such that $A\hat{x}=\hat{b}$. Since $\hat{b}$ is the closest point in $\mathrm{Col}\,A$ to $b$, a vector $\hat{x}$ is a least-squares solution of $Ax=b$ if and only if $\hat{x}$ satisfies $A\hat{x}=\hat{b}=\mathrm{proj}_{\mathrm{Col}A}\,b$.

2. $\hat{b}$ has the property that $b-\hat{b}$ is orthogonal to $\mathrm{Col}\,A$, so $b-A\hat{x}$ is orthogonal to each column of $A$, i.e. $a_j\cdot(b-A\hat{x})=0$ for every column $a_j$ of $A$, or $A^T(b-A\hat{x})=0$, which is equivalent to $A^TA\hat{x}=A^Tb$.
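This orthogonality is easy to observe numerically: the residual $b-A\hat{x}$ of the normal-equations solution is annihilated by $A^T$. A minimal sketch with hypothetical random data:

```python
import numpy as np
rng = np.random.default_rng(1)

A = rng.normal(size=(5, 3))   # hypothetical overdetermined system
b = rng.normal(size=5)

x_hat = np.linalg.solve(A.T @ A, A.T @ b)   # solve the normal equations
print(A.T @ (b - A @ x_hat))                # ~ [0, 0, 0]: residual is orthogonal to Col A
```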

 

Definition: $A^TAx=A^Tb$ is called the normal equations for $Ax=b$. A solution of $A^TAx=A^Tb$ is often denoted by $\hat{x}$.

 

Theorem: The set of least-squares solutions of $Ax=b$ coincides with the nonempty set of solutions of the normal equations $A^TAx=A^Tb$.

 

 

Example 5: Find a least-squares solution of the inconsistent system $Ax=b$ for

$A=\begin{bmatrix}3&0\\0&2\\1&1\end{bmatrix}$, $b=\begin{bmatrix}1\\0\\3\end{bmatrix}$.
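A sketch of the computation via the normal equations, assuming the matrix reads row by row as printed above:

```python
import numpy as np

A = np.array([[3.0, 0.0],
              [0.0, 2.0],
              [1.0, 1.0]])
b = np.array([1.0, 0.0, 3.0])

AtA = A.T @ A                     # [[10, 1], [1, 5]]
Atb = A.T @ b                     # [6, 3]
x_hat = np.linalg.solve(AtA, Atb)
print(x_hat)                      # [27/49, 24/49] ~ [0.551, 0.490]
```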

 

 

Exercise 5: Find a least-squares solution of the inconsistent system $Ax=b$ for $A=\begin{bmatrix}2&1\\1&0\\0&2\end{bmatrix}$, $b=\begin{bmatrix}0\\1\\2\end{bmatrix}$.

Theorem: Let $A$ be an $m\times n$ matrix. The following statements are logically equivalent:

(a) The equation $Ax=b$ has a unique least-squares solution for each $b$ in $\mathbb{R}^m$.

(b) The columns of $A$ are linearly independent.

(c) The matrix $A^TA$ is invertible.

When these statements are true, the least-squares solution $\hat{x}$ is given by $\hat{x}=(A^TA)^{-1}A^Tb$.
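When the columns of $A$ are independent, the closed form above matches what a library least-squares routine returns. A minimal sketch with hypothetical random data (whose columns are independent with probability 1):

```python
import numpy as np
rng = np.random.default_rng(2)

A = rng.normal(size=(6, 3))   # hypothetical tall matrix with independent columns
b = rng.normal(size=6)

x_closed = np.linalg.inv(A.T @ A) @ A.T @ b     # x_hat = (A^T A)^{-1} A^T b
x_lstsq, *_ = np.linalg.lstsq(A, b, rcond=None) # library least-squares solver
print(np.allclose(x_closed, x_lstsq))           # True
```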

 

Remark: When a least-squares solution $\hat{x}$ is used to produce $A\hat{x}=\hat{b}$ as an approximation to $b$, the distance from $b$ to $A\hat{x}=\hat{b}$, namely $\|b-A\hat{x}\|$, is called the least-squares error of this approximation.

 

Example 6: Find a least-squares solution of the inconsistent system $Ax=b$ for
$A=\begin{bmatrix}1&1\\1&0\\0&2\\1&1\end{bmatrix}$, $b=\begin{bmatrix}1\\0\\2\\3\end{bmatrix}$.
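A sketch that solves the normal equations and also reports the least-squares error from the remark above, assuming the matrix reads row by row as printed:

```python
import numpy as np

A = np.array([[1.0, 1.0],
              [1.0, 0.0],
              [0.0, 2.0],
              [1.0, 1.0]])
b = np.array([1.0, 0.0, 2.0, 3.0])

x_hat = np.linalg.solve(A.T @ A, A.T @ b)   # normal equations: [[3,2],[2,6]] x = [4,8]
error = np.linalg.norm(b - A @ x_hat)       # least-squares error ||b - A x_hat||
print(x_hat, error)                         # [4/7, 8/7], sqrt(18/7) ~ 1.604
```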

 

 

Exercise 6: Find a least-squares solution of the inconsistent system $Ax=b$ for
$A=\begin{bmatrix}0&1\\1&0\\2&1\\1&2\end{bmatrix}$, $b=\begin{bmatrix}1\\1\\0\\2\end{bmatrix}$.

 

GroupWork 1: True or False. $A$ is an $m\times n$ matrix and $b$ is in $\mathbb{R}^m$.

a. The general least squares problem is to find an $x$ that makes $Ax$ as close as possible to $b$.

b. A least squares solution of $Ax=b$ is a vector $\hat{x}$ that satisfies $A\hat{x}=\hat{b}$, where $\hat{b}$ is the orthogonal projection of $b$ onto $\mathrm{Col}\,A$.

c. A least squares solution of $Ax=b$ is a vector $\hat{x}$ such that $\|b-Ax\|\le\|b-A\hat{x}\|$ for all $x$ in $\mathbb{R}^n$.

d. Any solution of $A^TAx=A^Tb$ is a least squares solution of $Ax=b$.

e. If the columns of $A$ are linearly independent, then the equation $Ax=b$ has exactly one least squares solution.

f. For each $y$ and each subspace $W$, the vector $y-\mathrm{proj}_W y$ is orthogonal to $W$.

g. The orthogonal projection $\hat{y}$ of $y$ onto a subspace $W$ can sometimes depend on the orthogonal basis for $W$ used to compute $\hat{y}$.

 

GroupWork 2: Find a formula for the least squares solution of $Ax=b$ when the columns of $A$ are orthonormal.

 

GroupWork 3: True or False. $A$ is an $m\times n$ matrix and $b$ is in $\mathbb{R}^m$.

a. If $b$ is in the column space of $A$, then every solution of $Ax=b$ is a least squares solution.

b. The least squares solution of $Ax=b$ is the point in the column space of $A$ closest to $b$.

c. A least squares solution of $Ax=b$ is a list of weights that, when applied to the columns of $A$, produces the orthogonal projection of $b$ onto $\mathrm{Col}\,A$.

d. If $\hat{x}$ is a least squares solution of $Ax=b$, then $\hat{x}=(A^TA)^{-1}A^Tb$.

e. If $y$ is in a subspace $W$, then the orthogonal projection of $y$ onto $W$ is $y$ itself.

f. The best approximation to $y$ by elements of a subspace $W$ is given by the vector $y-\mathrm{proj}_W y$.

 

GroupWork 4: Describe all least squares solutions of the system

$x+y=2$

$x+y=4$

 

License


Matrices Copyright © 2019 by Kuei-Nuan Lin is licensed under a Creative Commons Attribution 4.0 International License, except where otherwise noted.