How to Calculate MAPE in Python

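In this article, we will see how to compute, in Python, one of the measures of forecast accuracy: the Mean Absolute Percentage Error (MAPE), also known as the Mean Absolute Percentage Deviation (MAPD). MAPE quantifies how accurate our forecasts are. The "M" in MAPE stands for mean, the average over a series of values; "A" stands for absolute, because absolute values are used so that positive and negative errors do not cancel each other out; "P" stands for percentage, which makes this accuracy metric a relative one; and "E" stands for error, because the metric measures how far off our forecasts are.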

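Consider the following example, where we have the sales information of a store. The Day No. column identifies the day, the Actual Sales column holds the actual sales value for each day, and the Forecast Sales column holds the forecasted sales figure (possibly produced by an ML model). The APE column holds the Absolute Percentage Error (APE), the relative error between the actual and forecasted values for that day. The percentage error is (actual - forecast) / actual, and the APE is the absolute value of that percentage error.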

Day No.	Actual Sales	Forecast Sales	Absolute Percentage Error (APE)
1	136	134	0.015
2	120	124	0.033
3	138	132	0.043
4	155	141	0.090
5	149	149	0.0

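The MAPE value is then found by taking the mean of the APE values. With n days, actual values A_t and forecast values F_t, the formula can be written as:

$$\mathrm{MAPE} = \frac{1}{n}\sum_{t=1}^{n}\left|\frac{A_t - F_t}{A_t}\right|$$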
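As a quick check by hand, the mean can be worked out from the five days of the table above. The APE values are shown here to four decimal places, so the last digit is approximate:

```latex
\mathrm{MAPE} \approx \frac{0.0147 + 0.0333 + 0.0435 + 0.0903 + 0}{5}
             = \frac{0.1818}{5}
             \approx 0.0364 \approx 3.64\%
```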

Let us see how to do the same for the above dataset in Python.

# Define the dataset as python lists
actual   = [136, 120, 138, 155, 149]
forecast = [134, 124, 132, 141, 149]
  
# Consider a list APE to store the
# APE value for each of the records in dataset
APE = []
  
# Iterate over the list values
for day in range(len(actual)):
  
    # Calculate percentage error
    per_err = (actual[day] - forecast[day]) / actual[day]
  
    # Take absolute value of
    # the percentage error (APE)
    per_err = abs(per_err)
  
    # Append it to the APE list
    APE.append(per_err)
  
# Calculate the MAPE
MAPE = sum(APE)/len(APE)
  
# Print the MAPE value and percentage
print(f'''
MAPE   : { round(MAPE, 2) }
MAPE % : { round(MAPE*100, 2) } %
''')

Output:

MAPE   : 0.04
MAPE % : 3.64 %

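The MAPE is a non-negative floating-point number. The best possible value is 0.0, and higher values indicate a less accurate forecast. How large the MAPE must be before a forecast counts as poor depends on the use case. In the output above, the forecast looks good enough: the MAPE shows that the daily sales forecasts are off by about 3.6% on average.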
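The loop above can also be wrapped in a small reusable function. The sketch below uses only the standard library; the name `mape` and its error handling are illustrative assumptions, not part of the original example. Because each APE divides by the actual value, the sketch chooses to reject inputs that contain a zero actual value instead of returning a misleading number.

```python
from typing import Sequence


def mape(actual: Sequence[float], forecast: Sequence[float]) -> float:
    """Return the MAPE as a fraction (0.05 means 5 %).

    Hypothetical helper for illustration only.
    """
    if len(actual) != len(forecast):
        raise ValueError("actual and forecast must have the same length")
    if len(actual) == 0:
        raise ValueError("at least one observation is required")
    if any(a == 0 for a in actual):
        # APE divides by the actual value, so a zero makes it undefined.
        raise ValueError("MAPE is undefined when an actual value is 0")

    # Absolute percentage error for each (actual, forecast) pair
    ape = [abs((a - f) / a) for a, f in zip(actual, forecast)]
    return sum(ape) / len(ape)


actual = [136, 120, 138, 155, 149]
forecast = [134, 124, 132, 141, 149]
print(f"MAPE % : {mape(actual, forecast) * 100:.2f} %")  # MAPE % : 3.64 %
```

Raising an error is only one possible policy for zero actual values; skipping those rows or switching to a different metric may suit other use cases better.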

If you work with time series data in Python, you are probably using pandas or NumPy. In that case, you can use the following code to get the MAPE.

import pandas as pd
import numpy as np
  
# Define the function to return the MAPE values
def calculate_mape(actual, predicted) -> float:
  
    # Convert actual and predicted to numpy arrays
    # (np.asarray returns existing arrays unchanged)
    actual, predicted = np.asarray(actual), np.asarray(predicted)
  
    # Calculate the MAPE value and return
    return round(np.mean(np.abs((
      actual - predicted) / actual)) * 100, 2)
  
if __name__ == '__main__':
  
    # CALCULATE MAPE FROM PYTHON LIST
    actual    = [136, 120, 138, 155, 149]
    predicted = [134, 124, 132, 141, 149]
  
    # Get MAPE for python list as parameters
    print("py list  :",
          calculate_mape(actual,
                         predicted), "%")
  
    # CALCULATE MAPE FROM NUMPY ARRAY
    actual    = np.array([136, 120, 138, 155, 149])
    predicted = np.array([134, 124, 132, 141, 149])
  
    # Get MAPE for numpy arrays as parameters
    print("np array :", 
          calculate_mape(actual,
                         predicted), "%")
  
    # CALCULATE MAPE FROM PANDAS DATAFRAME
      
    # Define the pandas dataframe
    sales_df = pd.DataFrame({
        "actual"    : [136, 120, 138, 155, 149],
        "predicted" : [134, 124, 132, 141, 149]
    })
  
    # Get MAPE for pandas series as parameters
    print("pandas df:", 
          calculate_mape(sales_df.actual, 
                         sales_df.predicted), "%")