Note on error choices

Relative errors

  • We normally like to use relative errors because we can set a desired relative error tolerance and have it work for solutions of arbitrary size.
  • For a single variable $x$: $$\epsilon_r = \frac{x_k-x_{k+1}}{x_{k+1}}.$$
  • Here, we are assuming some iterative calculation of $x$. If we knew the solution $x$, we could write the error in our current approximation $\hat{x}$ as $\epsilon_r = (\hat{x}-x)/x.$ Since we don’t know the solution, we approximate $x$ as $x_{k+1}$ and $\hat{x}$ as $x_k$.
  • We normally want the magnitude of the error, so we either take the absolute value or the square root of the square (see the short scalar sketch after this list):

$$\epsilon_r = \left|\frac{x_k-x_{k+1}}{x_{k+1}}\right| = \sqrt{\left(\frac{x_k-x_{k+1}}{x_{k+1}}\right)^2}.$$

  • Now, suppose that $x$ is a vector of $n$ unknowns that we are solving together.
  • We need some measure of the relative error.
  • We can either think in terms of components, or we can think in terms of the vector.
    • In terms of components, we want some average relative error.
    • In terms of the vector, we want some relative length of the error vector.
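As a concrete sketch of the scalar case, here is a minimal iterative calculation with a relative-error stopping test. The Newton iteration for $\sqrt{2}$ and the tolerance value are illustrative assumptions, not part of these notes:

    # Newton iteration for sqrt(2): x_{k+1} = (x_k + 2/x_k)/2 (assumed example problem)
    x = 1.0        # initial guess, x_k
    tol = 1.0e-8   # desired relative error tolerance (assumed value)

    for _ in range(100):
        xnew = 0.5 * (x + 2.0 / x)      # x_{k+1}
        eps_r = abs((x - xnew) / xnew)  # magnitude of the relative error
        x = xnew
        if eps_r < tol:
            break

    print(x)  # ~1.41421356

Because the stopping test uses the relative error, the same tol would work unchanged if the solution were a million times larger or smaller.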

Relative error in components

  • Mean relative error: $$\epsilon_1 = \frac{1}{n}\sum_{i=1}^{n}\left|\frac{x_{i,k}-x_{i,k+1}}{x_{i,k+1}}\right|.$$

  • Root mean square relative error: $$\epsilon_2 = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(\frac{x_{i,k}-x_{i,k+1}}{x_{i,k+1}}\right)^2}.$$

  • Both of these equations measure the average relative error in a solution component.
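As a quick sketch of both measures with NumPy (xold and x are made-up iterates, with x the newer one, matching the notation used elsewhere in these notes):

    import numpy as np

    xold = np.array([300.0, 0.0012, 5.4])  # x_k (made-up values)
    x    = np.array([301.5, 0.0011, 5.3])  # x_{k+1} (made-up values)
    n = x.size

    # Mean relative error, eps_1
    eps1 = np.sum(np.abs((xold - x) / x)) / n

    # Root mean square relative error, eps_2
    eps2 = np.sqrt(np.sum(((xold - x) / x)**2) / n)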

Relative error vector

  • Here, we’ll use the 2-norm as the length of a vector (the square root of the sum of the squared components).

  • This is the length of the error vector divided by the length of the solution vector: $$\epsilon_3 = \frac{\sqrt{\sum_{i=1}^{n}(x_{i,k}-x_{i,k+1})^2}}{\sqrt{\sum_{i=1}^{n} x_{i,k+1}^2}}.$$

    • This can be written in Python as np.linalg.norm(xold-x)/np.linalg.norm(x)
    • This is not a very good choice.
      • It implicitly assumes the components are comparable in magnitude and units.
      • Suppose we have $x[0] = T$ (K), and $x[1] = m$ (kg). Comparing and combining these components doesn’t make sense: the magnitudes may be wildly different, and the units don’t even match.
  • A better equation is

    $$\epsilon_4 = \sqrt{\sum_{i=1}^{n}\left(\frac{x_{i,k}-x_{i,k+1}}{x_{i,k+1}}\right)^2}.$$
    • This is effectively the length of the relative error vector. Each component error is scaled by that component’s magnitude. (Any units would cancel.)
    • In Python, you can write this as np.linalg.norm((xold-x)/x)
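To make the difference concrete, here is a small sketch using the temperature/mass example above; the numbers are made up for illustration:

    import numpy as np

    xold = np.array([300.0, 0.0012])  # previous iterate: [T in K, m in kg] (made-up)
    x    = np.array([301.5, 0.0011])  # current iterate

    # eps_3: the temperature component dominates both norms, so the ~9%
    # change in the mass is nearly invisible.
    eps3 = np.linalg.norm(xold - x) / np.linalg.norm(x)   # ~5e-3

    # eps_4: each component is scaled by its own magnitude, so both
    # variables contribute sensibly (and the units cancel).
    eps4 = np.linalg.norm((xold - x) / x)                 # ~9e-2

Here $\epsilon_3$ reports roughly half a percent change even though the mass changed by about 9%, while $\epsilon_4$ reflects the mass change.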