How to Calculate a Moving Average in a Pandas DataFrame

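In this article, we look at how to calculate moving averages in a pandas DataFrame. A moving average is the mean of the data over a fixed span of time. Also called a rolling average, it is computed by averaging the values of a time series over k consecutive periods.

There are three types of moving averages:

Simple moving average (SMA)
Exponential moving average (EMA)
Cumulative moving average (CMA)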

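Simple Moving Average (SMA)

A simple moving average is the unweighted mean of the previous k data points. The larger k is, the smoother the curve, but this comes at the cost of accuracy, because the average responds more slowly to recent changes. If the data points are $p_1, p_2, \ldots, p_n$, the simple moving average over the last k points is:

$$SMA_k = \frac{p_{n-k+1} + p_{n-k+2} + \cdots + p_n}{k} = \frac{1}{k} \sum_{i=n-k+1}^{n} p_i$$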

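In Python, we can calculate a moving average with the .rolling() method. It provides rolling windows over the data, and applying the mean function to those windows gives the moving average. The window size is passed as the argument in .rolling(window).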
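As a quick illustration of how .rolling(window).mean() behaves, here is a minimal sketch on a toy series. The values and the name `prices` are made up for this example and are not part of the article's dataset.

```python
import pandas as pd

# Hypothetical toy series, used only for illustration
prices = pd.Series([10.0, 12.0, 14.0, 16.0, 18.0])

# 3-point simple moving average via a rolling window
sma3 = prices.rolling(3).mean()

# The same quantity computed by hand for the last position
manual_last = (14.0 + 16.0 + 18.0) / 3

# The first two entries are NaN because the window is not yet full
print(sma3)
print(manual_last)
```

The leading NaN values are why the article later calls dropna() after adding the SMA column.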

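Now let's look at an example that calculates a simple rolling average over 30 days, that is, over a window of 30 rows.

Step 1: Import the libraries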

# importing Libraries
 
# importing pandas as pd
import pandas as pd
 
# importing numpy as np
# for Mathematical calculations
import numpy as np
 
# importing pyplot from matplotlib as plt
# for plotting graphs
import matplotlib.pyplot as plt
plt.style.use('default')
%matplotlib inline

Step 2: Import the data

To import the data, we use the pandas.read_csv() function.

# importing time-series data
reliance = pd.read_csv('RELIANCE.NS.csv', index_col='Date',
                       parse_dates=True)
 
# Printing dataFrame
reliance.head()

Output: [table: the first five rows of the DataFrame]
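For readers who do not have the RELIANCE.NS.csv file at hand, the following sketch shows what index_col='Date' and parse_dates=True do, using a small in-memory CSV. The column layout and every value here are invented stand-ins, not the real file's contents.

```python
import io
import pandas as pd

# Hypothetical CSV content standing in for 'RELIANCE.NS.csv';
# the values below are invented for illustration, not real prices
csv_text = """Date,Open,Close
2020-01-01,100.0,101.0
2020-01-02,101.0,103.0
2020-01-03,103.0,102.0
"""

# index_col='Date' makes the Date column the index;
# parse_dates=True parses that index into datetime values
df = pd.read_csv(io.StringIO(csv_text), index_col='Date', parse_dates=True)

print(type(df.index).__name__)
print(df.head())
```

With a datetime index in place, plots of the DataFrame get dates on the x-axis automatically.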

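Step 3: Calculate the simple moving average

To calculate the SMA in Python, we use the pandas DataFrame.rolling() function, which performs calculations over a rolling window. On that rolling window we call .mean() to compute the average of each window.

Syntax: DataFrame.rolling(window, min_periods=None, center=False, win_type=None, on=None, axis=0).mean()

Parameters:

window : Size of the window, i.e. the number of observations used for each window's calculation.
min_periods : Minimum number of observations in the window required to produce a value (otherwise the result is NA).
center : If True, sets the window label at the center of the window instead of its right edge.
win_type : Sets the window type; None means all points in the window are weighted equally.
on : For a DataFrame, the datetime-like column on which to compute the rolling window, instead of the index.
axis : int or str, default 0. This argument is deprecated in recent pandas versions.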
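The effect of min_periods and center is easiest to see on a tiny series. This is a minimal sketch with made-up values; the variable names are illustrative only.

```python
import pandas as pd

s = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0])  # hypothetical toy data

# Default: NaN until the window holds 3 observations
default = s.rolling(3).mean()

# min_periods=1: average whatever observations are available so far
partial = s.rolling(3, min_periods=1).mean()

# center=True: each result is labeled at the middle of its window,
# so NaN appears at both ends instead of only at the start
centered = s.rolling(3, center=True).mean()

print(pd.DataFrame({'default': default,
                    'min_periods=1': partial,
                    'center=True': centered}))
```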

# updating our dataFrame to have only
# one column 'Close' as rest all columns
# are of no use for us at the moment
# using .to_frame() to convert pandas series
# into dataframe.
reliance = reliance['Close'].to_frame()
 
# calculating simple moving average
# using .rolling(window).mean() ,
# with window size = 30
reliance['SMA30'] = reliance['Close'].rolling(30).mean()
 
# removing all the NULL values using
# dropna() method
reliance.dropna(inplace=True)
 
# printing Dataframe
reliance

Output: [table: the Close and SMA30 columns, with the leading NaN rows dropped]

Step 4: Plot the simple moving average

# plotting Close price and simple
# moving average of 30 days using .plot() method
reliance[['Close', 'SMA30']].plot(label='RELIANCE',
                                  figsize=(16, 8))

Output: [figure: line plot of Close and SMA30]

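Cumulative Moving Average (CMA)

The cumulative moving average is the mean of all values up to and including the current one. For data points $x_1, x_2, \ldots$, the CMA at time t is:

$$CMA_t = \frac{x_1 + x_2 + \cdots + x_t}{t}$$

When calculating the CMA there is no fixed window size; the window keeps growing as time passes. In Python, we can calculate the CMA with the .expanding() method. Now we will see an example that calculates the CMA of the closing price. The result column is named CMA30 only to mirror the SMA example; no 30-day window is involved.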
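Before the full walkthrough, a toy series makes the growing window concrete: the expanding mean should match the running sum divided by the number of observations seen so far. The data and names below are made up for this sketch.

```python
import numpy as np
import pandas as pd

x = pd.Series([2.0, 4.0, 6.0, 8.0])  # hypothetical toy data

# Cumulative moving average via an expanding window
cma = x.expanding().mean()

# The same thing by hand: running sum / count of observations so far
manual = x.cumsum() / np.arange(1, len(x) + 1)

print(cma.tolist())
print(manual.tolist())
```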

Step 1: Import the libraries

# importing Libraries
 
# importing pandas as pd
import pandas as pd
 
# importing numpy as np
# for Mathematical calculations
import numpy as np
 
# importing pyplot from matplotlib as plt
# for plotting graphs
import matplotlib.pyplot as plt
plt.style.use('default')
%matplotlib inline

Step 2: Import the data

To import the data, we use the pandas.read_csv() function.

# importing time-series data
reliance = pd.read_csv('RELIANCE.NS.csv',
                       index_col='Date',
                       parse_dates=True)
 
# Printing dataFrame
reliance.head()

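Step 3: Calculate the cumulative moving average

To calculate the CMA in Python, we use the DataFrame.expanding() function. It applies our aggregation function (the mean, in this case) cumulatively over all the data seen so far.

Syntax: DataFrame.expanding(min_periods=1, center=None, axis=0, method='single').mean()

Parameters:

min_periods : int, default 1. Minimum number of observations in the window required to produce a value (otherwise the result is NA).
center : bool, default False. Sets the label at the center of the window. This argument is deprecated in newer pandas versions.
axis : int or str, default 0.
method : str {'single', 'table'}, default 'single'. Whether the operation runs per single column or row ('single') or over the entire object ('table').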
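A short sketch of min_periods with an expanding window, on made-up data: requiring at least three observations suppresses the earliest, least stable averages.

```python
import pandas as pd

x = pd.Series([2.0, 4.0, 6.0, 8.0])  # hypothetical toy data

# Require at least 3 observations before reporting an average
cma_min3 = x.expanding(min_periods=3).mean()

# The first two entries are NaN; later entries average all data so far
print(cma_min3)
```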

# updating our dataFrame to have only
# one column 'Close' as rest all columns
# are of no use for us at the moment
# using .to_frame() to convert pandas series
# into dataframe.
reliance = reliance['Close'].to_frame()
 
# calculating cumulative moving
# average using .expanding().mean()
reliance['CMA30'] = reliance['Close'].expanding().mean()
 
# printing Dataframe
reliance

Output: [table: the Close and CMA30 columns]

Step 4: Plot the cumulative moving average

# plotting Close price and cumulative moving
# average using .plot() method
reliance[['Close', 'CMA30']].plot(label='RELIANCE',
                                  figsize=(16, 8))

Output: [figure: line plot of Close and CMA30]

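Exponential Moving Average (EMA)

The exponential moving average (EMA) is a weighted average of past data points in which the weights decay exponentially with age, so the most recent data points carry the greatest weight. The formula for the EMA at time t is:

$$EMA_t = \begin{cases} x_0, & t = 0 \\ \alpha x_t + (1 - \alpha)\, EMA_{t-1}, & t > 0 \end{cases}$$

where $x_t$ is the observation at time t and α is the smoothing factor. In Python, the EMA is calculated with the .ewm() method. We can pass the span, which plays the role of the window size, as the argument in .ewm(span=...); the smoothing factor is then α = 2 / (span + 1).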
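To connect the recursion $EMA_t = \alpha x_t + (1 - \alpha) EMA_{t-1}$ (starting from $EMA_0 = x_0$) to pandas, the sketch below computes it by hand on a made-up series and compares it with .ewm(). Note that the hand recursion corresponds to adjust=False; with the default adjust=True, which the article's code uses, pandas normalizes the weights differently and the early values differ.

```python
import pandas as pd

x = pd.Series([10.0, 11.0, 13.0, 12.0])  # hypothetical toy data
span = 3
alpha = 2 / (span + 1)  # smoothing factor implied by the span (0.5 here)

# Hand-rolled recursion: EMA_0 = x_0, EMA_t = alpha*x_t + (1-alpha)*EMA_{t-1}
ema = [x.iloc[0]]
for value in x.iloc[1:]:
    ema.append(alpha * value + (1 - alpha) * ema[-1])

# pandas equivalent of the recursion (adjust=False)
pandas_ema = x.ewm(span=span, adjust=False).mean()

# Default pandas behaviour (adjust=True), shown for comparison
pandas_ema_adjusted = x.ewm(span=span).mean()

print(ema)
print(pandas_ema.tolist())
print(pandas_ema_adjusted.tolist())
```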

Now let's look at an example that calculates the EMA with a span of 30 days.

Step 1: Import the libraries

# importing Libraries
 
# importing pandas as pd
import pandas as pd
 
# importing numpy as np
# for Mathematical calculations
import numpy as np
 
# importing pyplot from matplotlib as plt
# for plotting graphs
import matplotlib.pyplot as plt
plt.style.use('default')
%matplotlib inline

Step 2: Import the data

To import the data, we use the pandas.read_csv() function.

# importing time-series data
reliance = pd.read_csv('RELIANCE.NS.csv',
                       index_col='Date',
                       parse_dates=True)
 
# Printing dataFrame
reliance.head()

Output: [table: the first five rows of the DataFrame]

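Step 3: Calculate the exponential moving average

To calculate the EMA in Python, we use the DataFrame.ewm() function, which provides exponentially weighted functions. We call .mean() on it to compute the EMA.

Syntax: DataFrame.ewm(com=None, span=None, halflife=None, alpha=None, min_periods=0, adjust=True, ignore_na=False, axis=0, times=None).mean()

Parameters:

com : float, optional. Specifies the decay in terms of center of mass, α = 1 / (1 + com).
span : float, optional. Specifies the decay in terms of span, α = 2 / (span + 1).
halflife : float, str, timedelta, optional. Specifies the decay in terms of half-life.
alpha : float, optional. The smoothing factor itself, with 0 < α ≤ 1.
min_periods : int, default 0. Minimum number of observations in the window required to produce a value (otherwise the result is NA).
adjust : bool, default True. Divides by a decaying adjustment factor in the early periods to account for the imbalance in relative weights (treating the EWMA as a moving average).
ignore_na : bool, default False. Ignores missing values when calculating weights; specify True to reproduce pre-0.15.0 behavior.
axis : The axis to use. 0 means rows and 1 means columns.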
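Since com, span and alpha are three ways of specifying the same decay, the following sketch checks that they give identical results when converted with α = 2 / (span + 1) and com = (span - 1) / 2. The series is made up for the example.

```python
import numpy as np
import pandas as pd

x = pd.Series([10.0, 11.0, 13.0, 12.0, 15.0])  # hypothetical toy data
span = 30

# Three equivalent ways to specify the same decay
by_span = x.ewm(span=span).mean()
by_alpha = x.ewm(alpha=2 / (span + 1)).mean()
by_com = x.ewm(com=(span - 1) / 2).mean()

print(np.allclose(by_span, by_alpha), np.allclose(by_span, by_com))
```

Only one of com, span, halflife and alpha should be passed in a given call.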

# updating our dataFrame to have only
# one column 'Close' as rest all columns
# are of no use for us at the moment
# using .to_frame() to convert pandas
# series into dataframe.
reliance = reliance['Close'].to_frame()
 
# calculating exponential moving average
# using .ewm(span).mean() , with span = 30
reliance['EWMA30'] = reliance['Close'].ewm(span=30).mean()
 
# printing Dataframe
reliance

Output: [table: the Close and EWMA30 columns]

Step 4: Plot the exponential moving average

# plotting Close price and exponential
# moving averages of 30 days
# using .plot() method
reliance[['Close', 'EWMA30']].plot(label='RELIANCE',
                                   figsize=(16, 8))