7.3.2. Moving average, Savitzky-Golay and differentiation filters

This section presents the following functions: ktk.filters.smooth, ktk.filters.savgol, and ktk.filters.deriv.

7.3.2.1. Smoothing using a moving average

The moving average is an excellent filter for removing noise that follows a known periodic pattern. A classic example is the day-to-day evaluation of a process that is sensitive to weekends (for example, the number of workers who enter a building). A moving average with a window length of 7 days is ideal for evaluating the general trend of this signal, because each window covers exactly one week and the intra-week fluctuations average out. Let’s first load some noisy data:

import kineticstoolkit.lab as ktk
import matplotlib.pyplot as plt

ts = ktk.load(ktk.doc.download("filters_types_of_noise.ktk.zip"))

ts.plot(["clean", "periodic_noise"], '.-')
[Figure: the clean and periodic_noise signals]

This signal contains periodic noise with a period of five seconds. Applying a moving average whose window spans one full period of this noise (here, window_length=5):

filtered = ktk.filters.smooth(ts, window_length=5)

gives the blue curve below:

[Figure: result of the moving average filter, shown as the blue curve]

As expected, the periodic noise was completely removed, because each window spans exactly one period of the noise. The signal itself was also averaged, however, and we therefore lost some of its dynamics.
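To illustrate why matching the window length to the noise period removes the noise, here is a minimal NumPy sketch. It does not use Kinetics Toolkit: the `moving_average` helper, the synthetic linear trend and the sinusoidal noise are all assumptions made for this illustration, and the edges are simply dropped rather than handled the way `ktk.filters.smooth` handles them.

```python
import numpy as np


def moving_average(x, window_length):
    """Centered moving average; returns only the fully covered samples."""
    kernel = np.ones(window_length) / window_length
    return np.convolve(x, kernel, mode="valid")


# Hypothetical data: a linear trend plus noise with a 5-sample period
t = np.arange(50.0)
clean = 0.1 * t
noisy = clean + np.sin(2 * np.pi * t / 5)

window_length = 5  # one full period of the noise
smoothed = moving_average(noisy, window_length)

# "valid" mode drops window_length // 2 samples at each end
half = window_length // 2
max_error = np.max(np.abs(smoothed - clean[half:-half]))
print(f"Largest deviation from the clean trend: {max_error:.2e}")
```

Over any five consecutive samples the noise sums to zero, so only the trend survives. A linear trend is left untouched by a centered average; a signal with curvature would be attenuated, which is the loss of dynamics mentioned above.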

7.3.2.2. Smoothing using a Savitzky-Golay filter

The Savitzky-Golay filter is a generalization of the moving average. Instead of taking the mean of the n points of a moving window, it fits a polynomial of a given order to the points of each window and keeps the value of that polynomial at the center of the window. A moving average is therefore the particular case of a Savitzky-Golay filter with a polynomial of order 0.

It is a powerful filter for heavily quantized data, particularly if we want to differentiate these data. Let’s plot some heavily quantized data:

ts.plot(["clean", "quantized"], ".-")
[Figure: the clean and quantized signals]

To smooth this signal using a second-order Savitzky-Golay filter with a window length of 7:

filtered = ktk.filters.savgol(ts, poly_order=2, window_length=7)

which gives the blue curve below:

[Figure: result of the Savitzky-Golay filter, shown as the blue curve]
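The polynomial-fitting idea behind the Savitzky-Golay filter can be sketched with NumPy alone. This is an illustration under stated assumptions, not the implementation used by `ktk.filters.savgol`: the `savgol_sketch` helper is hypothetical, it fits each window explicitly with `np.polyfit` (slow but easy to read), and it leaves the edge samples as NaN instead of extrapolating them.

```python
import numpy as np


def savgol_sketch(x, window_length, poly_order):
    """Fit a polynomial over each centered window and keep its center value."""
    half = window_length // 2
    offsets = np.arange(-half, half + 1)
    out = np.full(len(x), np.nan)  # edges are left undefined in this sketch
    for i in range(half, len(x) - half):
        window = x[i - half : i + half + 1]
        coeffs = np.polyfit(offsets, window, poly_order)
        out[i] = np.polyval(coeffs, 0.0)  # fitted value at the window center
    return out


# Hypothetical noisy data
rng = np.random.default_rng(0)
x = rng.normal(size=30)

# Order 0: the fitted "polynomial" is a constant, i.e. the window mean
order0 = savgol_sketch(x, window_length=7, poly_order=0)
moving_avg = np.convolve(x, np.ones(7) / 7, mode="valid")
same_as_moving_average = np.allclose(order0[3:-3], moving_avg)

# Order 2: a purely quadratic signal is reproduced without distortion
k = np.arange(30.0)
quadratic = 0.5 * k**2 - 3.0 * k + 1.0
order2 = savgol_sketch(quadratic, window_length=7, poly_order=2)
quadratic_preserved = np.allclose(order2[3:-3], quadratic[3:-3])
```

The first check confirms that order 0 reduces to the moving average. The second shows why a higher order preserves more of the signal dynamics: any content that a polynomial of that order can represent within a window passes through unchanged.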

7.3.2.3. Differentiating TimeSeries

Heavily quantized signals are often difficult to differentiate because they contain many plateaus that, once differentiated, turn into series of spikes. For instance, let’s see what differentiating a quantized signal gives without any filtering, using ktk.filters.deriv:

derived = ktk.filters.deriv(ts)

derived.plot(["clean", "quantized"], ".-")
[Figure: derivatives of the clean and quantized signals obtained with ktk.filters.deriv]

We can also differentiate a signal using a Savitzky-Golay filter, which consists of differentiating the polynomial that is fitted over the moving window. Using a second-order Savitzky-Golay filter with a window length of 7:

derived_savgol = ktk.filters.savgol(
    ts, poly_order=2, window_length=7, deriv=1
)

which gives the blue curve below:

[Figure: derivative obtained with the Savitzky-Golay filter, shown as the blue curve]

As observed, the derivative of the heavily quantized signal obtained this way is similar to the derivative of the clean signal.
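The contrast between the two differentiation approaches can be reproduced on synthetic data with NumPy alone. The sketch below rests on several assumptions: the sine signal, the quantization step of 0.1 and the `savgol_derivative_sketch` helper are invented for this illustration, the plain finite difference (`np.diff`) only stands in for an unfiltered derivative, and neither part is meant to reproduce the internals of `ktk.filters.deriv` or `ktk.filters.savgol`.

```python
import numpy as np


def savgol_derivative_sketch(x, dt, window_length, poly_order):
    """Differentiate the polynomial fitted over each centered window."""
    half = window_length // 2
    offsets = np.arange(-half, half + 1) * dt  # time relative to the center
    out = np.full(len(x), np.nan)  # edges are left undefined in this sketch
    for i in range(half, len(x) - half):
        window = x[i - half : i + half + 1]
        coeffs = np.polyfit(offsets, window, poly_order)
        out[i] = np.polyval(np.polyder(coeffs), 0.0)  # slope at the center
    return out


# Hypothetical data: a sine wave quantized in steps of 0.1
dt = 0.05
t = np.arange(0, 10, dt)
clean = np.sin(t)
quantized = np.round(clean * 10) / 10

# Unfiltered finite difference: plateaus give zeros, steps give spikes
naive = np.diff(quantized) / dt
t_mid = (t[:-1] + t[1:]) / 2
naive_error = np.max(np.abs(naive - np.cos(t_mid)))

# Second-order Savitzky-Golay derivative with a window length of 7
sg = savgol_derivative_sketch(quantized, dt, window_length=7, poly_order=2)
valid = ~np.isnan(sg)
savgol_error = np.max(np.abs(sg[valid] - np.cos(t[valid])))

print(f"Max error, finite difference: {naive_error:.2f}")
print(f"Max error, Savitzky-Golay:    {savgol_error:.2f}")
```

Because the slope is estimated from seven samples at once instead of two, a single quantization step no longer dominates the estimate. A longer window would reduce the spikes further, at the cost of smoothing out genuine fast changes in the derivative.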