Optimality of the mean. One fact that we used implicitly in the lecture is the following: If we want to summarize a bunch of numbers $x_1, \ldots, x_n$ by a single number $s$, the best choice for $s$, the one that minimizes the average squared error, is the mean of the $x_i$'s. Let's see why this is true.

We begin by defining a suitable loss function. Any value $s \in \mathbb{R}$ induces a mean squared loss (MSE) given by:

$L(s) = \frac{1}{n} \sum_{i=1}^{n} (x_i - s)^2.$

We want to find the $s$ that minimizes this function.

(a) Compute the derivative of $L(s)$.

(b) What value of $s$ is obtained by setting the derivative $dL/ds$ to zero?
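Before working through the calculus, the claim can be checked numerically. The sketch below is only an illustration, not part of the exercise: the data values and the helper names `mse_loss` and `grid_minimizer` are assumptions made up for this example. It evaluates the mean squared loss on a fine grid of candidate values of $s$ and compares the best grid point with the mean of the data.

```python
def mse_loss(xs, s):
    """Mean squared loss L(s) = (1/n) * sum over i of (x_i - s)^2."""
    return sum((x - s) ** 2 for x in xs) / len(xs)


def grid_minimizer(xs, lo, hi, steps):
    """Return the grid point in [lo, hi] with the smallest loss (brute force)."""
    candidates = [lo + (hi - lo) * k / steps for k in range(steps + 1)]
    return min(candidates, key=lambda s: mse_loss(xs, s))


# Hypothetical sample data, chosen only for illustration.
xs = [2.0, 4.0, 4.0, 5.0, 10.0]

mean = sum(xs) / len(xs)
best = grid_minimizer(xs, min(xs), max(xs), steps=800)

print("mean of the data:     ", mean)
print("best s found on grid: ", best)
print("loss at the mean:     ", mse_loss(xs, mean))
print("loss at the mean + 1: ", mse_loss(xs, mean + 1.0))
```

A grid search can only locate the minimizer up to the grid spacing, so agreement with the mean here is a sanity check rather than a proof. Parts (a) and (b) supply the actual argument.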