2022/01/08  Views: 29  Theme: 红绯

# Signal processing (time series analysis) for scientific data analysis with Python: Part 1

## Applying a running mean filter to a time series

Nita Ghosh · Jan 27, 2021 · 5 min read

These days the internet is flooded with resources that help one learn data science and/or machine learning. Oftentimes, you would find a junior scientist like myself immersed in loads of data and trying to make a little sense of it (which you may also call time series analysis in fancy terms). It was only when I was scouring the internet for scientific data analysis guidelines that I noticed the paucity of resources. Thus, I thought it would be fantastic to share some of the techniques that I have learnt, and am still learning, in the course of my data/signal analysis work, using Python as the programming language.

Below are different fields of application for time series/signal analysis that might benefit from this read:

1. Scientific data analysis (spectral signal analysis for spectroscopy/bioscience)
2. Audio signal processing, digital signal processing
3. Speech signal processing
4. Image processing
5. Financial data processing
6. Seismology, etc.

Because of my background as well as my future interests, my discussion here will focus mainly on scientific data/signal analysis.

In this series (stay tuned for parts 2 and 3) I wish to cover several tools required for signal processing. These are listed below.

• Temporal filtering
• Convolution
• Wavelet transformation
• Spectral analysis (Fast Fourier Transform)
• Time-frequency domain analysis
• Cleaning/denoising data
• Resampling (up and down)
• Interpolation and extrapolation
• Feature detection
• Signal-to-noise ratio

# Running mean filter (Theory)

In this first part I am going to introduce you to the application of a smoothing filter called the running mean filter, or mean smoothing filter. As the name suggests, this filter works by setting each data point in the filtered signal to the average of a finite number of surrounding data points in the original signal. The number of data points (k) that you go backward and forward in time is the key parameter of a mean-smoothing filter. It is called the order of the filter.

The equation governing the mean-smoothing process might look cryptic, but what it does is actually quite straightforward. You set each point in the filtered signal y(t) to the sum of the data points in the original signal x(t), going from k points backward in t to k points forward, divided by the number of time points, which is 2k+1. The plus one accounts for the original data point itself, on top of the k points backward and the k points forward.
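
Written out from the description above, the running mean of order k can be stated as follows. This is a reconstruction from the text, and the summation index i is a name assumed here for illustration:

```latex
y(t) = \frac{1}{2k+1} \sum_{i=t-k}^{t+k} x(i)
```

For example, with k = 1 and three consecutive samples x = (2, 4, 9), the middle point of the filtered signal becomes (2 + 4 + 9)/3 = 5.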

The image above shows the effect of mean-filter smoothing (blue markers) on a randomly generated dataset (red markers). The mean smoothing filter basically takes away the sharp edges of the original signal and flattens it toward the local mean. The larger the order (k), the smoother the filtered signal. Now, you can already notice that there’s something funny happening at the edges. These are called ‘edge effects’. In general, you always get something bizarre happening at the edges of your time series when you apply a temporal filter. However, there are some well-established methods that allow you to work around that problem. I will discuss them in more detail in later parts of this series.
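
To make the role of the order and the edge effects concrete, here is a minimal sketch on a seven-point toy signal. The helper name `running_mean` and the toy values are assumptions made for illustration, and leaving the edge points at zero is just one simple convention, not the only way to handle them.

```python
import numpy as np

def running_mean(x, k):
    """Running mean of order k; the first and last k points are left at zero."""
    x = np.asarray(x, dtype=float)
    y = np.zeros(len(x))
    for i in range(k, len(x) - k):
        # average of the 2k+1 points centered on i
        y[i] = np.mean(x[i - k:i + k + 1])
    return y

x = np.array([1.0, 5.0, 3.0, 9.0, 2.0, 8.0, 4.0])  # toy signal (hypothetical values)
y1 = running_mean(x, 1)  # order 1: 3-point windows
y2 = running_mean(x, 2)  # order 2: 5-point windows

print(y1)
print(y2)
```

With k = 1 only the first and last points are left unfiltered, while with k = 2 two points on each side are lost. Away from the edges, the k = 2 output varies less from point to point than the k = 1 output, which is the smoothing effect described above.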

# Coding in Python

Now let me show what the code looks like in Python. Here I assume that readers have a basic level of familiarity with Python.

## Import libraries

First, let’s import the libraries that are required to run the code.

```python
# only NumPy and Matplotlib are needed for this part
import numpy as np
import matplotlib.pyplot as plt

# Jupyter magic to render plots inline; remove this line in a plain script
%matplotlib inline
```

## Create artificial data with noise

Let’s create an arbitrary time series signal (with time on the x-axis). We set the sampling rate of this signal to 1000 Hz and let time run from 0 to 3 seconds in steps of 1/sigRate. To create the signal we linearly interpolate across 15 random points. You can find more detail about the interpolation function in the NumPy documentation for numpy.interp.

```python
plt.rcParams["font.size"] = 16
plt.rcParams['figure.figsize'] = (20, 10)

sigRate = 1000  # Hz
time = np.arange(0, 3, 1/sigRate)
n = len(time)
p = 15  # poles for random interpolation
ampl = np.interp(np.linspace(0, p, n), np.arange(0, p), np.random.rand(p)*30)

plt.plot(time, ampl)
```
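
As a small illustration of what `np.interp` does in the block above, the sketch below interpolates between three known points. The arrays `xp`, `fp` and `x_new` are made-up values chosen for illustration.

```python
import numpy as np

xp = np.array([0.0, 1.0, 2.0])     # x positions where the values are known
fp = np.array([0.0, 10.0, 20.0])   # known values at those positions
x_new = np.array([0.5, 1.5, 3.0])  # positions to evaluate

# linear interpolation between neighbouring known points
y_new = np.interp(x_new, xp, fp)
print(y_new)
```

The first two results lie halfway between their neighbours (5 and 15). The third position, 3.0, lies beyond the last known point, so by default `np.interp` returns the last known value (20). The same thing happens in the article's call: `np.linspace(0, p, n)` runs up to 15 while the known positions from `np.arange(0, p)` stop at 14, so the final stretch of `ampl` is flat.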

Let’s have a look at the plot already.

We now add arbitrary noise to the signal.

```python
noiseamp = 5  # standard deviation of the Gaussian noise
noise = noiseamp*np.random.randn(n)
signal = ampl + noise

plt.plot(time, ampl)
plt.plot(time, signal)
```
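
Because the noise is random, the plots change on every run. If reproducible figures are wanted, one option is a seeded generator, as in this sketch. The seed value 0 is an arbitrary assumption, and `np.random.default_rng` is used here in place of the article's `np.random.randn`.

```python
import numpy as np

n = 3000      # number of samples, matching 3 s at 1000 Hz
noiseamp = 5  # standard deviation of the Gaussian noise

rng = np.random.default_rng(seed=0)  # seed chosen arbitrarily
noise = noiseamp * rng.standard_normal(n)

# the sample standard deviation should be close to noiseamp
print(noise.std())
```

Re-running this block with the same seed reproduces exactly the same noise, which makes it easier to compare filter settings later.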

This is what the final signal with noise looks like:

## Running mean filter or mean smoothing filter

Next we implement the running-mean filter in the time domain. Keep in mind that it is also possible to apply this kind of filter in the spectral domain; I intend to discuss this in one of the future posts. Here the variable filtSig is initialized with zeros, so the first and last k points of the filtered signal stay at zero. One can also initialize it differently, for example as a copy of the signal, in which case only the start and end points of the filtered signal change and the rest remains the same. We loop over i from signal[k] to signal[n-k-1]. Notice that for a running mean of order k, signal[k] is the (k+1)-th point in the time series. This is because signal is a NumPy array, and NumPy arrays are indexed from zero.

```python
# initialize the filtered signal with zeros
filtSig = np.zeros(n)
k = 20  # filter order: number of points on each side

for i in range(k, n-k):
    # each point is the average of the 2k+1 points centered on i
    filtSig[i] = np.mean(signal[i-k:i+k+1])
```
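
The explicit loop is easy to read but slow for long signals. As an alternative sketch (not the article's method), the same 2k+1 point average can be computed with `np.convolve` and a uniform kernel. The test signal and the names `filt_loop` and `filt_conv` are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)       # arbitrary seed
signal = rng.standard_normal(500)    # stand-in noisy signal
n = len(signal)
k = 20

# loop version: average of the 2k+1 points centered on i
filt_loop = np.zeros(n)
for i in range(k, n - k):
    filt_loop[i] = np.mean(signal[i - k:i + k + 1])

# convolution version: uniform kernel whose weights sum to one
kernel = np.ones(2 * k + 1) / (2 * k + 1)
filt_conv = np.convolve(signal, kernel, mode="same")

# the two agree everywhere except within k points of the edges
print(np.allclose(filt_loop[k:n - k], filt_conv[k:n - k]))
```

The two versions differ only near the edges: the loop leaves those points at zero, while `mode="same"` treats the signal as zero outside its range, which is another form of edge effect.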

Plot the result

## Compare over order (k = 10 to 20)

```python
for k in range(10, 21):  # orders k = 10, 11, ..., 20
    # re-initialize so edge values from a previous order do not linger
    filtSig = np.zeros(n)
    for i in range(k, n-k):
        filtSig[i] = np.mean(signal[i-k:i+k+1])
    # one curve per filter order
    plt.plot(time, filtSig)

# raw noisy signal, nearly transparent, for reference
plt.plot(time, signal, marker='o', alpha=0.01)
```
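
Besides comparing the curves by eye, one can put a number on the effect of the order. The sketch below rebuilds a signal along the lines of the one above with a fixed seed and measures the root-mean-square error between the filtered signal and the noise-free `ampl`. The helper `running_mean`, the seed, the chosen orders and the use of RMSE as a measure are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)  # arbitrary seed for reproducibility
sigRate = 1000                  # Hz
time = np.arange(0, 3, 1 / sigRate)
n = len(time)
p = 15
ampl = np.interp(np.linspace(0, p, n), np.arange(0, p), rng.random(p) * 30)
signal = ampl + 5 * rng.standard_normal(n)

def running_mean(x, k):
    """Running mean of order k; edge points are left at zero."""
    y = np.zeros(len(x))
    for i in range(k, len(x) - k):
        y[i] = np.mean(x[i - k:i + k + 1])
    return y

kmax = 20
inner = slice(kmax, n - kmax)  # skip the edges so every order is compared fairly
rmse = {}
for k in (1, 5, kmax):
    filt = running_mean(signal, k)
    rmse[k] = np.sqrt(np.mean((filt[inner] - ampl[inner]) ** 2))
    print(k, rmse[k])
```

For this slowly varying signal the error shrinks as k grows, because averaging more points cancels more of the noise. For signals with sharp features, a large k would eventually smear those features out as well.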

Please let me know in the comment section if you have any queries regarding this part. In the next part of the series I will talk about the application of the Gaussian smoothing filter, and then we will try to understand the differences between the two.
