In a previous blog post, we explored how to summarise data patterns using the method of running mean in two dimensions.
This post looks at the method of running mean in three dimensions.
For illustration, we consider the dataset displayed in Figure 1.
These data show death rates among young males in the UK by Age and by Calendar year.
The age range is from 1 to 18, and the calendar year ranges from 1990 to 2010.
We want to summarise the mortality rates in terms of ages (\(x\)) and calendar years (\(y\)).

Let us denote by \(Z_{x,y}\) the observed death rate at age \(x\) and
calendar year \(y\), \(x\in \{1,\,\cdots,\,18\}\;\text{and}\; y\in\{1990,\,\cdots,\,2010\}\).
The relation between death rate, age and time can be summarised as
\[ Z_{x,y} = \mathcal{S}(x, y) + \varepsilon_{x,y} \]
where \(\mathcal{S}\) is some function to be determined and \(\varepsilon\) represents the noise.
There are many ways to estimate \(\mathcal{S}\).
In this post, we use the simple method of running mean, also known as the moving average.
We shall look at alternative methods in subsequent posts.
In the method of running mean, \(\mathcal{S}\) is estimated at each point by the average of the nearest neighbours.
That is, the estimate \(\hat{\mathcal{S}}(x, y)\) of \(\mathcal{S}(x,y)\) can be obtained by averaging the observed death rates
at ages and calendar years in the vicinity of \((x,y)\).
Various neighbourhood structures and shapes can be chosen. One simple choice consists of
taking the points that fall inside the circle of radius \(R\) centred at the target point.
We refer to \(R\) as the neighbourhood radius.
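As a sketch, the circular-neighbourhood estimate can be written out explicitly. The notation \(N_R(x,y)\) for the neighbourhood is introduced here for illustration only. We also assume that points on the circle itself are included, and that distances are measured with one year of age and one calendar year as equal units:
\[ \hat{\mathcal{S}}(x, y) = \frac{1}{|N_R(x,y)|} \sum_{(u,v)\,\in\, N_R(x,y)} Z_{u,v}, \qquad N_R(x,y) = \left\{(u,v) : (u-x)^2 + (v-y)^2 \le R^2\right\} \]
where \(|N_R(x,y)|\) is the number of observed points in the neighbourhood. Near the edges of the data, the neighbourhood simply contains fewer points.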
Figure 2 illustrates how the fitted running mean surface behaves with respect to the neighbourhood radius.

This figure shows that the fitted running mean surface converges toward the underlying data as the neighbourhood radius decreases; in particular, the running mean becomes identical to the data when \(R=0\), because the neighbourhood then contains only the target point itself.
At the other end, the running mean surface converges to the overall mean in the data as the neighbourhood radius increases.
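The two limiting cases can be checked numerically. The following Python snippet is a minimal sketch rather than the code used to produce the figures. The function name `running_mean_2d` is hypothetical, the data are random numbers standing in for the 18 ages by 21 calendar years grid (not the real death rates), and we assume that the boundary of the circle is included and that grid steps in age and in year count as equal units.

```python
import numpy as np

def running_mean_2d(z, radius):
    """Average each cell of z over the grid points within `radius` of it.

    Assumptions (not from the original post): distances are measured in
    grid steps, the circle boundary is included, and neighbourhoods are
    simply truncated at the edges of the grid.
    """
    z = np.asarray(z, dtype=float)
    n_rows, n_cols = z.shape
    rows = np.arange(n_rows)
    cols = np.arange(n_cols)
    smoothed = np.empty_like(z)
    for i in range(n_rows):
        for j in range(n_cols):
            # Squared distance from the target point (i, j) to every grid point.
            d2 = (rows[:, None] - i) ** 2 + (cols[None, :] - j) ** 2
            inside = d2 <= radius ** 2
            smoothed[i, j] = z[inside].mean()
    return smoothed

# Synthetic stand-in for the ages x calendar years grid (not the real data).
rng = np.random.default_rng(0)
z = rng.random((18, 21))

same = running_mean_2d(z, 0)    # R = 0: each neighbourhood is the point itself
flat = running_mean_2d(z, 100)  # R larger than the grid diagonal: every point is a neighbour

print(np.allclose(same, z))         # running mean equals the data
print(np.allclose(flat, z.mean()))  # running mean equals the overall mean
```

The double loop is kept for readability; for larger grids, a convolution with a disc-shaped kernel would be a faster alternative, at the cost of extra care at the edges.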
A great advantage of the running mean is its simplicity. However, it has many limitations.
We shall look at better smoothing methods in subsequent posts.