CSE 575 Scalar-by-vector Derivatives Used in Linear Regression
This article was originally written in Chinese. Permission to translate has been granted by the author. The link to the original post is here.
In linear regression, we may encounter scalar-by-vector derivatives when calculating the residual error. This article covers two of them: $\frac{\partial (x^T A x)}{\partial x}$ and $\frac{\partial (a^T x)}{\partial x}$.
Scalar-by-vector Derivative
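$f$ is a differentiable multivariable function, denoted as $f(x) = f(x_1, x_2, \dots, x_n)$. We also have $x = (x_1, x_2, \dots, x_n)^T$. The derivative of $f$ with respect to $x$ is defined as the following $n$-dimensional vector:
$$\frac{\partial f}{\partial x} = \left( \frac{\partial f}{\partial x_1}, \frac{\partial f}{\partial x_2}, \dots, \frac{\partial f}{\partial x_n} \right)^T \tag{1}$$
which is a column vector composed of $f$’s partial derivatives.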
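To make the definition concrete, here is a minimal sketch that approximates each partial derivative with a central finite difference and collects the results into one vector. The helper `numerical_gradient`, the step size `h`, and the test function `example` are hypothetical names and values chosen for illustration; they do not come from the original post.

```python
def numerical_gradient(f, x, h=1e-6):
    """Approximate the gradient of a scalar function f at the point x.

    x is a list of floats. Entry k of the result approximates the
    partial derivative of f with respect to x[k].
    """
    grad = []
    for k in range(len(x)):
        forward = list(x)
        backward = list(x)
        forward[k] += h   # nudge only the k-th coordinate
        backward[k] -= h
        grad.append((f(forward) - f(backward)) / (2 * h))
    return grad


def example(x):
    # Hypothetical test function: f(x) = x_1^2 + 3 * x_2
    return x[0] ** 2 + 3 * x[1]


gradient = numerical_gradient(example, [2.0, 5.0])
print(gradient)  # approximately [4.0, 3.0]
```

The analytic partial derivatives of this test function are $2 x_1$ and $3$, so at the point $(2, 5)$ the gradient should be close to $(4, 3)^T$.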
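Derivative of $x^T A x$
Let $A = (a_{ij}) \in \mathbb{R}^{n \times n}$, where $a_j$ is the $j$-th column vector of $A$, and $\tilde{a}_i^T$ the $i$-th row vector of $A$. According to $x^T A x = \sum_{i=1}^{n} \sum_{j=1}^{n} a_{ij} x_i x_j$, the question is now
$$\frac{\partial (x^T A x)}{\partial x} = \frac{\partial}{\partial x} \sum_{i=1}^{n} \sum_{j=1}^{n} a_{ij} x_i x_j$$
We take the partial derivative with respect to $x_k$. Only the terms where $i = k$ or $j = k$ stay, since all the others are treated as constants.
$$\frac{\partial (x^T A x)}{\partial x_k} = \sum_{i=1}^{n} a_{ik} x_i + \sum_{j=1}^{n} a_{kj} x_j = a_k^T x + \tilde{a}_k^T x \tag{2}$$
Substituting (2) into (1), we get
$$\frac{\partial (x^T A x)}{\partial x} = \begin{pmatrix} a_1^T x + \tilde{a}_1^T x \\ \vdots \\ a_n^T x + \tilde{a}_n^T x \end{pmatrix} = A^T x + A x = (A + A^T) x \tag{3}$$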
Derivation complete!
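As a sanity check, the sketch below assumes the function in this section is the quadratic form $f(x) = x^T A x$ and compares the closed-form gradient $(A + A^T) x$ against central finite differences. The matrix `A`, the point `x`, and all helper names are hypothetical and chosen only for illustration. `A` is deliberately non-symmetric, so the result differs from $2 A x$.

```python
def quadratic_form(A, x):
    """Evaluate x^T A x as a double sum over a_ij * x_i * x_j."""
    n = len(x)
    return sum(A[i][j] * x[i] * x[j] for i in range(n) for j in range(n))


def closed_form_gradient(A, x):
    """Evaluate (A + A^T) x, one row at a time."""
    n = len(x)
    return [sum((A[k][j] + A[j][k]) * x[j] for j in range(n)) for k in range(n)]


def numerical_gradient(f, x, h=1e-6):
    """Central-difference approximation of the gradient of f at x."""
    grad = []
    for k in range(len(x)):
        forward = list(x)
        backward = list(x)
        forward[k] += h
        backward[k] -= h
        grad.append((f(forward) - f(backward)) / (2 * h))
    return grad


A = [[1.0, 2.0],
     [3.0, 4.0]]          # hypothetical, non-symmetric on purpose
x = [1.0, -1.0]           # hypothetical evaluation point

analytic = closed_form_gradient(A, x)
numeric = numerical_gradient(lambda v: quadratic_form(A, v), x)

print(analytic)  # [-3.0, -3.0]
print(numeric)   # approximately the same values
```

When $A$ is symmetric, $(A + A^T) x$ simplifies to $2 A x$, which is the form that appears when $A = X^T X$ in least squares.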
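Derivative of $a^T x$
Let $a = (a_1, a_2, \dots, a_n)^T$ be a constant vector with scalar entries $a_i$, and we define $f(x)$ as
$$f(x) = a^T x = \sum_{i=1}^{n} a_i x_i$$
Take the partial derivative with respect to $x_k$ and we have
$$\frac{\partial f}{\partial x_k} = a_k \tag{4}$$
Substituting (4) into (1), we get
$$\frac{\partial (a^T x)}{\partial x} = (a_1, a_2, \dots, a_n)^T = a$$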
Derivation complete!
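The same kind of check works here. The sketch assumes the function in this section is the linear form $f(x) = a^T x$, whose gradient is the constant vector $a$ regardless of where it is evaluated. The vector `a`, the point `x`, and the helper names are hypothetical values picked for illustration.

```python
def linear_form(a, x):
    """Evaluate a^T x as the sum of a_i * x_i."""
    return sum(a_i * x_i for a_i, x_i in zip(a, x))


def numerical_gradient(f, x, h=1e-6):
    """Central-difference approximation of the gradient of f at x."""
    grad = []
    for k in range(len(x)):
        forward = list(x)
        backward = list(x)
        forward[k] += h
        backward[k] -= h
        grad.append((f(forward) - f(backward)) / (2 * h))
    return grad


a = [2.0, -1.0, 0.5]      # hypothetical constant vector
x = [1.0, 2.0, 3.0]       # hypothetical evaluation point

numeric = numerical_gradient(lambda v: linear_form(a, v), x)
print(numeric)  # approximately [2.0, -1.0, 0.5], i.e. the vector a
```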
What’s Next
Now that we know all of the above, let’s take on the residual error matrix.