7.3.2. Moving average, Savitzky-Golay and deriving filters

This section presents the following functions: ktk.filters.smooth, ktk.filters.savgol and ktk.filters.deriv.

7.3.2.1. Smoothing using a moving average

The moving average is an excellent filter for removing noise that follows a specific time pattern. A classic example is the day-to-day evaluation of a process that is sensitive to weekends (for example, the number of workers who enter a building). A moving average with a window length of 7 days is ideal for evaluating the general trend of this signal without considering intra-week fluctuations. Let’s first load some noisy data:

import kineticstoolkit.lab as ktk
import matplotlib.pyplot as plt

ts = ktk.load(ktk.doc.download("filters_types_of_noise.ktk.zip"))

ts.plot(["clean", "periodic_noise"], ".-")
_images/82fb027030cdf422dad1bc1b3018102f78a97b61cf7c71b73a41cbdc5edb987f.png

This signal contains periodic noise with a period of five seconds. Using a 5-second moving average filter:

filtered = ktk.filters.smooth(ts, window_length=5)

gives the blue curve below:

_images/8986dc75d13c5abebd2aa05e841318c462ca968ff5658cb38c308341278818ac.png

As expected, the periodic noise was completely removed. However, the moving average also smoothed the signal itself, so some of its dynamics were lost.
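To see why a window that matches the noise period works so well, here is a minimal NumPy sketch. It does not use the documentation data file: the synthetic signal and the variable names are assumptions made for illustration, and np.convolve stands in for ktk.filters.smooth, whose internals are not shown here.

```python
import numpy as np

# Synthetic stand-in for the documentation data (an assumption, not the
# real file): a linear trend plus noise that repeats every 5 samples.
n = np.arange(50)
clean = 0.1 * n
periodic_noise = clean + np.sin(2 * np.pi * n / 5)

# Moving average over 5 samples: each output is the mean of one window.
window_length = 5
kernel = np.ones(window_length) / window_length
filtered = np.convolve(periodic_noise, kernel, mode="valid")

# "valid" keeps only full windows, which drops 2 samples at each end,
# so we compare against the matching central part of the clean signal.
half = window_length // 2
residual = np.max(np.abs(filtered - clean[half:-half]))
print(f"Largest remaining error: {residual:.2e}")
```

Each window covers exactly one period of the noise, so the noise averages to zero in every window. The linear trend survives here only because the mean of a straight line over a centred window equals its value at the centre; a signal with more curvature would be flattened, which is the loss of dynamics mentioned above.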

7.3.2.2. Smoothing using a Savitzky-Golay filter

The Savitzky-Golay filter is a generalization of the moving average. Instead of taking the mean of the n points of a moving window, the Savitzky-Golay filter fits a polynomial of a given order over each window and keeps the value of this polynomial at the centre of the window. A moving average is therefore a special case of the Savitzky-Golay filter, with a polynomial of order 0.
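The following sketch illustrates this relationship with plain NumPy. The function savgol_sketch is a hypothetical helper written for this example, not part of Kinetics Toolkit, and it leaves the edges undefined, which a real implementation would handle differently.

```python
import numpy as np


def savgol_sketch(y, window_length, poly_order):
    """Fit a polynomial on each full window; keep its value at the centre."""
    half = window_length // 2
    x = np.arange(-half, half + 1)  # window abscissa, centred on 0
    out = np.full(len(y), np.nan)   # edges are left undefined in this sketch
    for i in range(half, len(y) - half):
        coefs = np.polyfit(x, y[i - half : i + half + 1], poly_order)
        out[i] = np.polyval(coefs, 0.0)
    return out


# A purely quadratic test signal (0, 1, 4, 9, ...)
y = np.arange(9, dtype=float) ** 2

# Order 0: the fitted constant is the mean of the window.
order0 = savgol_sketch(y, window_length=5, poly_order=0)
moving_average = np.convolve(y, np.ones(5) / 5, mode="same")

# Order 2: a quadratic fit follows a quadratic signal exactly.
order2 = savgol_sketch(y, window_length=5, poly_order=2)

print(np.allclose(order0[2:-2], moving_average[2:-2]))
print(np.allclose(order2[2:-2], y[2:-2]))
```

Away from the edges, the order-0 result matches the moving average, while the order-2 result reproduces the quadratic signal without flattening it. This is why a higher polynomial order preserves more of the signal's dynamics for the same window length.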

It is a powerful filter for data that are heavily quantized, particularly if we want to differentiate these data. Let’s plot some heavily quantized data:

ts.plot(["clean", "quantized"], ".-")
_images/bf122cd50fb30724f24f1290f4bf734e5183dc5237682e0b1d239aa28bf6c366.png

To smooth this signal using a second-order Savitzky-Golay filter with a window length of 7:

filtered = ktk.filters.savgol(ts, poly_order=2, window_length=7)

which gives the blue curve below:

_images/d959b746e7a687847cccf3f7277a044faf8fec79c2918db85ea66892ec4ee39d.png
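For readers curious about what a second-order filter with a window length of 7 computes, the least-squares fit can be reduced to a fixed set of 7 weights. The sketch below derives these weights with NumPy and applies them to a made-up quantized signal. The data and variable names are assumptions for illustration, and this is not necessarily how ktk.filters.savgol is implemented.

```python
import numpy as np

window_length, poly_order = 7, 2
half = window_length // 2
x = np.arange(-half, half + 1)

# Design matrix of the least-squares fit: columns are 1, x and x**2.
A = np.vander(x, poly_order + 1, increasing=True)

# Row 0 of the pseudo-inverse maps the 7 samples of a window to the
# constant term of the fitted polynomial, i.e. its value at x = 0.
kernel = np.linalg.pinv(A)[0]
print(np.round(kernel * 21, 6))

# Hypothetical quantized data (not the documentation file): a sine wave
# rounded to the nearest integer.
time = np.arange(100) / 10.0
quantized = np.round(3 * np.sin(time))

# Applying the weights to every full window is a plain convolution.
filtered = np.convolve(quantized, kernel[::-1], mode="valid")
```

The weights are the classic (-2, 3, 6, 7, 6, 3, -2) / 21 smoothing coefficients. Unlike the uniform weights of a moving average, they give more importance to the centre of the window and slightly negative weights to its ends.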

7.3.2.3. Deriving TimeSeries

Heavily quantized signals are often difficult to differentiate because they contain lots of plateaus that, once differentiated, become series of spikes. For instance, let’s see how differentiating a quantized signal works without filtering, using ktk.filters.deriv:

derived = ktk.filters.deriv(ts)

derived.plot(["clean", "quantized"], ".-")
_images/0feed6b076005d14685ba1da07bdee4ca636092b91752960175cdd51a37fbb7a.png
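The origin of these spikes can be reproduced with a few lines of NumPy. This sketch uses a made-up ramp rather than the documentation data, and np.diff as a simple finite-difference stand-in. It is an assumption that this resembles what ktk.filters.deriv does internally.

```python
import numpy as np

# Hypothetical signal: a slow ramp sampled at 10 Hz, then quantized to
# a resolution of 1 unit, which creates long plateaus.
time = np.arange(100) / 10.0
clean = 0.35 * time            # the true slope is 0.35 units/s everywhere
quantized = np.round(clean)

# Finite-difference derivative between successive samples.
naive = np.diff(quantized) / np.diff(time)

print("Number of non-zero samples:", np.count_nonzero(naive))
print("Largest derivative value:", naive.max())
```

On a plateau, successive samples are equal and the derivative is zero. Whenever the signal jumps to the next quantization level, the whole jump is attributed to a single sampling interval, which produces a spike far larger than the true slope.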

We can instead differentiate a signal using a Savitzky-Golay filter, which consists of differentiating the polynomial that is fitted over the moving window. Using a second-order Savitzky-Golay filter with a window length of 7:

derived_savgol = ktk.filters.savgol(
    ts, poly_order=2, window_length=7, deriv=1
)

which gives the blue curve below:

_images/3f4ac07ec003225839efc16aa89953337079ea4ec1b31b997835c375c0ef70c9.png

As observed, the derivative of the heavily quantized signal is now similar to the derivative of the clean signal.
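As a rough illustration of this approach, the sketch below fits a second-order polynomial on each 7-sample window of a made-up quantized sine wave and evaluates the derivative of that polynomial at the centre of the window. The helper savgol_deriv_sketch and the test signal are hypothetical, written for this example only. They are not the Kinetics Toolkit implementation.

```python
import numpy as np


def savgol_deriv_sketch(y, dt, window_length=7, poly_order=2):
    """Differentiate the polynomial fitted on each full window."""
    half = window_length // 2
    x = np.arange(-half, half + 1) * dt  # window time axis, centred on 0
    out = np.full(len(y), np.nan)        # edges are left undefined here
    for i in range(half, len(y) - half):
        coefs = np.polyfit(x, y[i - half : i + half + 1], poly_order)
        out[i] = np.polyval(np.polyder(coefs), 0.0)
    return out


# Hypothetical data: a sine wave sampled at 100 Hz, rounded to integers.
dt = 0.01
time = np.arange(200) * dt
clean = 10 * np.sin(np.pi * time)
quantized = np.round(clean)
true_deriv = 10 * np.pi * np.cos(np.pi * time)

# Unfiltered finite differences versus the polynomial-based derivative.
naive = np.diff(quantized) / dt
smooth = savgol_deriv_sketch(quantized, dt)

valid = ~np.isnan(smooth)
err_naive = np.max(np.abs(naive - true_deriv[:-1]))
err_savgol = np.max(np.abs(smooth[valid] - true_deriv[valid]))
print(f"Worst error, finite differences: {err_naive:.1f}")
print(f"Worst error, fitted polynomial:  {err_savgol:.1f}")
```

Because each estimate is based on 7 samples instead of 2, a single quantization jump is spread over the whole window rather than concentrated in one spike. A longer window would reduce the remaining error further, at the cost of smoothing fast changes in the signal.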