July 31, 2015
Moving Average vs Windowed Average
Averages are used in software all the time as a low-pass filter. One common filtering method is a windowed average: an average over only a specific number of the most recent samples. Each new sample replaces the oldest sample, and the average is taken across this set of data. Consider a scenario where a new data point comes in every second, and you need to keep track of the average over the past 5 seconds.
Data    Average
1004    -
1003    -
1003    -
998     -
995     1000.6
993     998.4
1001    998
1008    999
1000    999.4
998     1000
Here we see 10 samples. The average doesn't start until the 5th sample because we want an average of 5 samples. From the 5th sample on, the average is simply the average of the 5 most recent samples. To implement this, a circular buffer is typically used. We keep 5 samples, and always overwrite the oldest sample when a new sample arrives. Here are the 10 iterations of a circular buffer from the scenario above:

Slot     1       2       3       4       5       6       7       8       9       10
1        1004    1004    1004    1004    1004    993     993     993     993     993
2        -       1003    1003    1003    1003    1003    1001    1001    1001    1001
3        -       -       1003    1003    1003    1003    1003    1008    1008    1008
4        -       -       -       998     998     998     998     998     1000    1000
5        -       -       -       -       995     995     995     995     995     998
Average  -       -       -       -       1000.6  998.4   998     999     999.4   1000
In each column, the one value that differs from the previous column is the new sample. Notice how in column 6, the first row has been replaced. In column 7, the second row is replaced, and so on. Implementing this in software is simple, and there are some tricks for keeping the average up to date without having to sum the entire array each time, which I have written about.
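The circular buffer described above could be sketched in C roughly as follows. This is a minimal sketch and not necessarily the implementation from the earlier article: the names `WindowedAverage`, `windowed_init`, `windowed_add`, and `windowed_get` are made up for illustration, and the running-sum trick shown (subtract the sample being overwritten, add the new one) is one common way to avoid re-summing the array.

```c
#include <stddef.h>
#include <stdint.h>

#define WINDOW_SIZE 5  /* samples in the window (5, as in the scenario above) */

/* Hypothetical type for illustration; not from the original article. */
typedef struct {
    int32_t samples[WINDOW_SIZE]; /* circular buffer of the most recent samples */
    int32_t sum;                  /* running sum of everything in the buffer */
    size_t  next;                 /* index of the slot holding the oldest sample */
    size_t  count;                /* samples stored so far, capped at WINDOW_SIZE */
} WindowedAverage;

/* Clear the buffer so the next sample is treated as the first. */
void windowed_init(WindowedAverage *w)
{
    for (size_t i = 0; i < WINDOW_SIZE; i++) {
        w->samples[i] = 0;
    }
    w->sum = 0;
    w->next = 0;
    w->count = 0;
}

/* Overwrite the oldest sample and adjust the running sum to match. */
void windowed_add(WindowedAverage *w, int32_t sample)
{
    w->sum -= w->samples[w->next];  /* subtracts 0 until the buffer has wrapped */
    w->samples[w->next] = sample;
    w->sum += sample;
    w->next = (w->next + 1) % WINDOW_SIZE;
    if (w->count < WINDOW_SIZE) {
        w->count++;
    }
}

/* Returns 1 and writes the average once the window is full, otherwise 0. */
int windowed_get(const WindowedAverage *w, double *average)
{
    if (w->count < WINDOW_SIZE) {
        return 0;
    }
    *average = (double)w->sum / WINDOW_SIZE;
    return 1;
}
```

Feeding in the ten samples from the table, `windowed_get` reports nothing for the first four samples, 1000.6 after the fifth, and 1000 after the tenth.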
A windowed average suffers from a couple of drawbacks, however. The biggest is the need to keep an array of all the previous samples. A small processor might not have enough memory for the array a windowed average needs, especially if there are a lot of samples.
When filtering, this can be solved by using a first-order low-pass filter instead, known as a moving average. In the past I've written about using a software low-pass filter. I most often use this filter when reading what should, for the most part, be a DC value from an Analog to Digital Converter (ADC). Noise can be introduced into this signal for a number of reasons, but this simple filter is a very inexpensive way to take the noise out.
Here is the function for a first-order low-pass filter:
O_{n} = C I_{n} + (1 - C) O_{n-1}
Where I is an array of input data, O is the output, and C is the filter coefficient with a value between 0 and 1. In implementation it is easier to think of this low-pass filter as a poor man's windowed average. Rather than think of a filter coefficient, consider the number of samples that will be averaged together. In this scenario, the equation becomes:
O_{n} = I_{n} + O_{n-1} - O_{n-1} / N
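Where N is the number of samples. In this form O is a running sum rather than the average itself: it is the first equation with C = 1/N, multiplied through by N, so the average is O_{n} / N. This acts like a windowed average over N samples, sort of. The difference is most apparent when the input makes a sudden change. Take for instance this setup: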
Here is an average of the last 5 samples using both the windowed and low-pass methods. After 5 samples, the data makes a large change. The windowed average reaches the new value after 5 samples, whereas the low-pass filter takes much longer. In theory, the low-pass filter never reaches the new value; it only approaches it asymptotically. With integer arithmetic, it takes about 20 samples in this scenario. The larger the change, the longer it takes for the low-pass filter to reach the new value. When doing a software filter, this is not typically a problem, just something that must be accounted for when selecting the value for N.
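To get a feel for how the settling time grows with the size of the change, the integer form of the filter can be run against a step input. The sketch below is only an illustration: `lowpass_settle_samples` and its parameters are hypothetical names, it assumes non-negative values and n > 0, and it does not reproduce the exact data set used above.

```c
#include <stdint.h>

/* Count how many samples the integer low-pass filter needs before its
 * output (sum / n) equals `to`, after the input steps from `from` to `to`.
 * Assumes non-negative values and n > 0. `limit` caps the loop as a guard. */
unsigned lowpass_settle_samples(int32_t from, int32_t to, int32_t n, unsigned limit)
{
    int32_t sum = from * n;   /* start with the filter settled at `from` */
    unsigned steps = 0;

    while (sum / n != to && steps < limit) {
        sum -= sum / n;       /* remove one average's worth */
        sum += to;            /* add the new sample, now stuck at `to` */
        steps++;
    }
    return steps;
}
```

For example, `lowpass_settle_samples(0, 10, 4, 1000)` returns 11, whereas a windowed average over 4 samples would settle after exactly 4. A larger step, such as 0 to 100 with the same n, takes more samples still.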
One nice thing about using a moving average with N rather than a coefficient is the ability to avoid taxing operations like multiplies and divides. If N is a power of two, the divide becomes a right shift. This allows for a heavy filter, such as N = 256, with barely any overhead. Here are the two lines of code needed to add a new sample to this average:
averageSum -= averageSum >> AVERAGE_SHIFT; // remove one average's worth (sum / N)
averageSum += newSample;                   // add the new sample
When the average is desired, it is simply the average sum divided by the number of samples, N.
average = averageSum >> AVERAGE_SHIFT;
Getting this average started often involves forcing the average sum to reflect the first sample. That is, force the average to be whatever the first sample is.
averageSum = firstSample << AVERAGE_SHIFT;
Since the low-pass filter takes time to react to changes, this allows the filter to start much closer to the actual average, assuming the first sample is fairly close to the average.
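Put together, the update, read, and first-sample steps above might be wrapped up as in the following sketch. The `MovingAverage` struct, the function names, the `seeded` flag, and the choice of `AVERAGE_SHIFT` = 4 are all assumptions made for the example; only the three shift-based statements come from the lines above.

```c
#include <stdint.h>

#define AVERAGE_SHIFT 4  /* N = 1 << 4 = 16 samples (hypothetical choice) */

/* Hypothetical wrapper type for illustration. */
typedef struct {
    uint32_t sum;     /* accumulator: roughly N times the current average */
    int      seeded;  /* 0 until a first sample has been used as the seed */
} MovingAverage;

void moving_init(MovingAverage *m)
{
    m->sum = 0;
    m->seeded = 0;
}

/* Force the average to equal `sample` (the first-sample method). */
void moving_seed(MovingAverage *m, uint16_t sample)
{
    m->sum = (uint32_t)sample << AVERAGE_SHIFT;
    m->seeded = 1;
}

/* Add one sample; the very first sample seeds the filter instead. */
void moving_add(MovingAverage *m, uint16_t sample)
{
    if (!m->seeded) {
        moving_seed(m, sample);
        return;
    }
    m->sum -= m->sum >> AVERAGE_SHIFT;  /* remove one average's worth */
    m->sum += sample;                   /* add the new sample */
}

/* Current average: the accumulator divided by N. */
uint16_t moving_get(const MovingAverage *m)
{
    return (uint16_t)(m->sum >> AVERAGE_SHIFT);
}
```

With these settings, adding 1000 twice leaves the average at 1000, and a following sample of 1016 moves it to 1001. The same `moving_seed` call could be reused later whenever the program knows the input is about to jump to a new level.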
In addition to how much filtering is being done, the other factor to consider is integer size. For example, when filtering 10-bit ADC data with a 16-bit average sum, the shift cannot be larger than 6 (N = 64) without overflowing the sum.
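The sizing rule in that example comes down to a subtraction: the sum needs the sample width plus the shift in bits. A small helper along these lines could make the check explicit. `max_average_shift` is a hypothetical name, and the sketch assumes an unsigned accumulator (a signed one has one bit less to work with).

```c
#include <stdint.h>

/* Largest shift that keeps an unsigned accumulator of `accumulator_bits`
 * from overflowing when samples are at most `sample_bits` wide.
 * Returns 0 when the accumulator is no wider than the samples. */
unsigned max_average_shift(unsigned sample_bits, unsigned accumulator_bits)
{
    if (sample_bits >= accumulator_bits) {
        return 0;
    }
    return accumulator_bits - sample_bits;
}
```

For the case above, `max_average_shift(10, 16)` gives 6, so N = 64. Moving the same 10-bit data to a 32-bit accumulator would allow a shift of up to 22.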
The biggest benefit of using a moving average filter is that it only requires storage for an accumulator, no matter how many samples are to be averaged together. Together with using a shift for a divide, it is both small and fast. This makes it very useful in small microprocessors and in high-speed digital signal processing.
There are times when a windowed average is better than a moving average. Windowed averages work better in situations where the average value can change abruptly to a new steady state, because they can react faster to the change. However, there is often a way to account for this using a moving average as well. Sometimes an abrupt change can be anticipated, and the average reset when it takes place. For instance, if monitoring a voltage that can be switched from a low level to a higher level, the program might know this change is going to take place (perhaps because the software initiated it). When it happens, the moving average sum is reset with the first-sample method after the change has taken place. This allows the filter to react quickly to the change. This also works with the windowed average.
I personally use a windowed average when I have plenty of memory to work with, if for no other reason than that calculating the coefficient in meaningful terms is easy. For instance, if one has 1 sample/sec and averages 10 samples together, the filter is a 10-second average. With a moving average, this is harder to quantify. Keeping a windowed average also lets one keep additional statistics such as standard deviation, which we will address in the next article. When memory and/or speed are key, nothing beats a power-of-two-based moving average.