Page Author: [Shubodh Sai](mailto:[email protected]?Subject=The%20Notion%20Note-taking%20Project%20|%20Least%20Squares%20Optimization)
$$ Ax = b, \quad A \text{ is an } m \times n \text{ matrix, where } m > n. $$
Note: We will consider only full-rank matrices. There are special cases, such as low-rank systems (e.g., the rank constraint enforced in fundamental (F) matrix computation) and systems of unknown rank, which we won't cover in this lecture.
In practice, the matrix is full rank (because of noise) and the system is overdetermined (more equations than unknowns, i.e., more data). This is assumed throughout this lecture.
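As a concrete illustration of such a system, here is a minimal NumPy sketch (the data and parameter values are hypothetical, chosen only for demonstration): fitting a line to noisy samples yields a tall, full-column-rank matrix for which no exact solution exists.

```python
import numpy as np

# Hypothetical setup: fit a line y = x0 + x1 * t to m noisy samples,
# giving an overdetermined, full-rank system Ax = b with m > n.
rng = np.random.default_rng(0)
m, n = 50, 2

t = np.linspace(0.0, 1.0, m)
A = np.column_stack([np.ones(m), t])           # m x n design matrix
x_true = np.array([1.0, 2.0])                  # true parameters (illustrative)
b = A @ x_true + 0.1 * rng.standard_normal(m)  # noisy observations

print(np.linalg.matrix_rank(A))  # n = 2, i.e., full column rank
# Because of the noise, no x satisfies Ax = b exactly.
```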
Since an exact solution generally does not exist, we instead minimize the norm of the residual, which represents the error:
$$ \|Ax-b\|_2 $$
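Continuing the sketch above, this minimization can be done numerically with NumPy's least-squares solver (a sketch, not the method derived in this lecture):

```python
# Continuing the sketch above: minimize ||Ax - b||_2.
# np.linalg.lstsq returns (solution, residuals, rank, singular values).
x_ls, *_ = np.linalg.lstsq(A, b, rcond=None)

print(x_ls)                          # close to x_true
print(np.linalg.norm(A @ x_ls - b))  # the minimal residual norm
# Perturbing the solution can only increase the residual:
print(np.linalg.norm(A @ (x_ls + 0.01) - b))
```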
The Linear Algebra Perspective: (Gilbert Strang's lectures)
The Calculus Perspective:
Ah, but this norm is not differentiable (at points where the residual is zero), so I cannot say much about its convexity directly. What I can do is square it, since minimizing the norm and minimizing its square are equivalent problems (squaring is monotonic on non-negative values). See this for more details. So let's minimize