CSE 575 Scalar-by-vector Derivatives Used in Linear Regression

This article was originally written in Chinese. Permission to translate has been granted by the author. The link to the original post is here.

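In linear regression, we may encounter scalar-by-vector derivatives when calculating the residual error. This article is going to talk about two derivatives: $\frac{d(\mathbf{x}^T A \mathbf{x})}{d\mathbf{x}}$ and $\frac{d(\mathbf{a}^T \mathbf{x})}{d\mathbf{x}}$.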

Scalar-by-vector Derivative

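$f$ is a differentiable multivariable function, denoted as $f(x_1, x_2, \ldots, x_n)$. We also have $\mathbf{x} = (x_1, x_2, \ldots, x_n)^T$. The derivative of $f$ with respect to $\mathbf{x}$ is defined as the following $n$-dimensional vector:

$$\frac{df}{d\mathbf{x}} = \left( \frac{\partial f}{\partial x_1}, \frac{\partial f}{\partial x_2}, \ldots, \frac{\partial f}{\partial x_n} \right)^T \tag{1}$$

which is a column vector composed of $f$’s partial derivatives.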
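To make the definition concrete, here is a minimal sketch that approximates each partial derivative with a central difference and collects the results into one vector. The function name `numerical_gradient`, the step size `h`, and the example function are assumptions chosen for illustration; they are not part of the original post.

```python
def numerical_gradient(f, x, h=1e-6):
    """Approximate df/dx: one central-difference partial derivative per component of x."""
    grad = []
    for k in range(len(x)):
        x_plus = list(x)
        x_minus = list(x)
        x_plus[k] += h   # nudge only the k-th variable up
        x_minus[k] -= h  # and down, holding the others fixed
        grad.append((f(x_plus) - f(x_minus)) / (2 * h))
    return grad


# Hypothetical example: f(x) = x_1^2 + 3 * x_2, whose exact gradient is (2 * x_1, 3).
def f(x):
    return x[0] ** 2 + 3 * x[1]


gradient = numerical_gradient(f, [1.0, 2.0])
print(gradient)  # approximately [2.0, 3.0]
```

Each entry of the returned list plays the role of one partial derivative in the column vector defined above.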

Derivative of $\mathbf{x}^T A \mathbf{x}$

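Let $A = (a_{ij})$ be an $n \times n$ matrix with $A = (\mathbf{a}_1, \mathbf{a}_2, \ldots, \mathbf{a}_n) = (\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_n)^T$, where $\mathbf{a}_j$ are the column vectors of $A$, and $\mathbf{r}_i^T$ the row vectors of $A$. According to $\mathbf{x}^T A \mathbf{x} = \sum_{i=1}^{n} \sum_{j=1}^{n} a_{ij} x_i x_j$, the question is now

$$\frac{d}{d\mathbf{x}} \sum_{i=1}^{n} \sum_{j=1}^{n} a_{ij} x_i x_j$$

We take the partial derivative with respect to $x_k$. Only the terms where $i = k$ or $j = k$ stay, since all the others are treated as constants.

$$\frac{\partial}{\partial x_k} \sum_{i=1}^{n} \sum_{j=1}^{n} a_{ij} x_i x_j = \sum_{j=1}^{n} a_{kj} x_j + \sum_{i=1}^{n} a_{ik} x_i = \mathbf{r}_k^T \mathbf{x} + \mathbf{a}_k^T \mathbf{x} \tag{2}$$

Bring (2) into (1) and we can solve

$$\frac{d(\mathbf{x}^T A \mathbf{x})}{d\mathbf{x}} = \left( \mathbf{r}_1^T \mathbf{x} + \mathbf{a}_1^T \mathbf{x}, \ldots, \mathbf{r}_n^T \mathbf{x} + \mathbf{a}_n^T \mathbf{x} \right)^T = A\mathbf{x} + A^T\mathbf{x} = (A + A^T)\mathbf{x} \tag{3}$$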

Derivation complete!
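As a sanity check on the identity $\frac{d(\mathbf{x}^T A \mathbf{x})}{d\mathbf{x}} = (A + A^T)\mathbf{x}$, the sketch below compares the closed form against a finite-difference gradient. The helper names, the step size, and the values of `A` and `x` are assumptions made for illustration. `A` is deliberately non-symmetric, so using $2A\mathbf{x}$ instead would give a different answer.

```python
def quadratic_form(A, x):
    """x^T A x written out as the double sum over a_ij * x_i * x_j."""
    n = len(x)
    return sum(A[i][j] * x[i] * x[j] for i in range(n) for j in range(n))


def closed_form_gradient(A, x):
    """(A + A^T) x, computed one component at a time."""
    n = len(x)
    return [sum((A[k][j] + A[j][k]) * x[j] for j in range(n)) for k in range(n)]


def numerical_gradient(f, x, h=1e-6):
    """Central-difference approximation of each partial derivative of f at x."""
    grad = []
    for k in range(len(x)):
        x_plus = list(x)
        x_minus = list(x)
        x_plus[k] += h
        x_minus[k] -= h
        grad.append((f(x_plus) - f(x_minus)) / (2 * h))
    return grad


# Hypothetical test values; A is not symmetric on purpose.
A = [[1.0, 2.0], [3.0, 4.0]]
x = [1.0, -1.0]

analytic = closed_form_gradient(A, x)
numeric = numerical_gradient(lambda v: quadratic_form(A, v), x)
print(analytic)  # [-3.0, -3.0]
print(numeric)   # approximately the same values
```

When $A$ is symmetric, the same formula reduces to $2A\mathbf{x}$.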

Derivative of $\mathbf{a}^T \mathbf{x}$

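Let $\mathbf{a} = (a_1, a_2, \ldots, a_n)^T$ and we write $\mathbf{a}^T \mathbf{x}$ as

$$\mathbf{a}^T \mathbf{x} = \sum_{i=1}^{n} a_i x_i$$

Take the partial derivative with respect to $x_k$ and we have

$$\frac{\partial (\mathbf{a}^T \mathbf{x})}{\partial x_k} = a_k \tag{4}$$

Bring (4) into (1)

$$\frac{d(\mathbf{a}^T \mathbf{x})}{d\mathbf{x}} = (a_1, a_2, \ldots, a_n)^T = \mathbf{a} \tag{5}$$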

Derivation complete!
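The linear case, $\frac{d(\mathbf{a}^T \mathbf{x})}{d\mathbf{x}} = \mathbf{a}$, can be checked the same way. In this minimal sketch the helper names and the values of `a` and `x` are assumptions picked for illustration. The gradient should match `a` no matter which `x` is used.

```python
def linear_form(a, x):
    """a^T x written out as the sum over a_i * x_i."""
    return sum(a_i * x_i for a_i, x_i in zip(a, x))


def numerical_gradient(f, x, h=1e-6):
    """Central-difference approximation of each partial derivative of f at x."""
    grad = []
    for k in range(len(x)):
        x_plus = list(x)
        x_minus = list(x)
        x_plus[k] += h
        x_minus[k] -= h
        grad.append((f(x_plus) - f(x_minus)) / (2 * h))
    return grad


# Hypothetical test values.
a = [2.0, -1.0, 0.5]
x = [0.3, 1.2, -4.0]

numeric = numerical_gradient(lambda v: linear_form(a, v), x)
print(numeric)  # approximately [2.0, -1.0, 0.5], i.e. the vector a itself
```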

What’s Next

Now that we know both derivatives, let’s take on the residual error matrix.

Written on February 8, 2018