7.3.3. Removing artefacts using a median filter
Sometimes, a TimeSeries contains some bad measurements that really stand out from other data. If such artefacts are impossible to remove at the source and there are not many of them, then the median filter is a simple way to filter them out.
Let’s first load some noisy data:
import kineticstoolkit.lab as ktk
ts = ktk.load(ktk.doc.download("filters_types_of_noise.ktk.zip"))
ts.plot(["clean", "artefacts"], ".-")
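A median filter with a window length of 3 replaces each point with the median of that point and its two immediate neighbours, that is, the middle of the three values once they are sorted. An isolated artefact is never the middle value, so it is discarded.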
filtered = ktk.filters.median(ts, window_length=3)
which gives the blue curve below:
[Figure: signal filtered with window_length=3]
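To make the principle concrete, here is a minimal pure-Python sketch of a moving median applied to a short list with one isolated artefact. The `moving_median` helper is hypothetical and written only for illustration. It is not the implementation behind `ktk.filters.median`, and leaving the edge samples unchanged is an assumption made here for simplicity.

```python
from statistics import median


def moving_median(values, window_length=3):
    """Moving median of a list; edge samples are left unchanged (assumption)."""
    if window_length < 1 or window_length % 2 == 0:
        raise ValueError("window_length must be a positive odd integer")
    half = window_length // 2
    filtered = list(values)
    for i in range(half, len(values) - half):
        # Median of the window centred on sample i
        filtered[i] = median(values[i - half : i + half + 1])
    return filtered


# Toy signal with a single artefact (9.0) at index 2
signal = [1.0, 1.1, 9.0, 1.3, 1.4, 1.5]
filtered = moving_median(signal, window_length=3)
print(filtered)
```

In every window that contains it, the artefact is the largest of the three values, so it is never selected as the median.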
Most artefacts were removed, but not the ones at 30 and 31 seconds. These two artefacts are consecutive, so in each three-point window centred on one of them, two of the three values are artefacts and the median is itself an artefact value. A filter with a larger window length can remove them, at the expense of more signal loss.
filtered = ktk.filters.median(ts, window_length=5)
which gives the blue curve below:
[Figure: signal filtered with window_length=5]
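The effect of the window length on consecutive artefacts can be reproduced on a toy list. This is a sketch under the same assumptions as any hand-written moving median: `moving_median` is a hypothetical helper, not the code behind `ktk.filters.median`, and edge samples are simply left unchanged.

```python
from statistics import median


def moving_median(values, window_length):
    """Moving median of a list; edge samples are left unchanged (assumption)."""
    half = window_length // 2
    filtered = list(values)
    for i in range(half, len(values) - half):
        filtered[i] = median(values[i - half : i + half + 1])
    return filtered


# Toy signal with two consecutive artefacts (9.0 and 9.2)
signal = [1.0, 1.1, 9.0, 9.2, 1.4, 1.5, 1.6]

w3 = moving_median(signal, window_length=3)
w5 = moving_median(signal, window_length=5)

print("window 3:", w3)  # artefact values survive
print("window 5:", w5)  # artefacts removed, neighbouring samples altered
```

With a window of 3, the two artefacts form the majority of the windows centred on them, so an artefact value remains. With a window of 5 they are a minority and disappear, but the clean sample at index 4 is also changed from 1.4 to 1.6, which illustrates the signal loss that comes with a longer window.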