How to Fill NaN Values with the Mean in Pandas?

Use the mean() function to compute the average of the column that contains NaN values, then pass that average to fillna() to replace the NaNs.

First, import the required libraries:

import pandas as pd
import numpy as np

Create a DataFrame with two columns and some NaN values. The NaN values are entered with NumPy's NaN constant:

# np.nan is the portable spelling; the np.NaN alias was removed in NumPy 2.0
dataFrame = pd.DataFrame(
   {
      "Car": ['BMW', 'Lexus', 'Lexus', 'Mustang', 'Bentley', 'Mustang'],
      "Units": [100, 150, np.nan, 80, np.nan, np.nan]
   }
)

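Compute the mean of the column that contains NaN values, here the Units column. mean() skips NaN by default, so only 100, 150 and 80 are averaged, giving 110: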

meanVal = dataFrame['Units'].mean()
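
The value 110 depends on how mean() treats missing entries. The following is a minimal sketch, assuming a standalone Series named `units` (a name chosen here for illustration) with the same values as the Units column. It compares the default behaviour with a manual calculation and with `skipna=False`:

```python
import numpy as np
import pandas as pd

# Same values as the Units column; `units` is an illustrative name
units = pd.Series([100, 150, np.nan, 80, np.nan, np.nan])

# Default: NaN entries are skipped (skipna=True)
mean_default = units.mean()

# Manual equivalent: sum of the non-NaN values divided by their count
mean_manual = units.dropna().sum() / units.count()

# With skipna=False, any NaN in the data makes the result NaN
mean_strict = units.mean(skipna=False)

print(mean_default, mean_manual, mean_strict)
```

Because the default skips NaN, the result can be passed straight to fillna() without cleaning the column first.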

Replace each NaN with the mean of its column. The mean computed above is 110, so the NaN values are replaced with 110:

# Assign the result back; fillna(inplace=True) on a column selection warns in recent pandas
dataFrame['Units'] = dataFrame['Units'].fillna(meanVal)
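
When several numeric columns contain NaN, the same idea can be applied to all of them in one call. This is a sketch only: the `Price` column and its values are hypothetical and not part of the original example. It relies on DataFrame.fillna() accepting a Series of per-column fill values:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {
        "Car": ["BMW", "Lexus", "Lexus", "Mustang", "Bentley", "Mustang"],
        "Units": [100, 150, np.nan, 80, np.nan, np.nan],
        # Hypothetical second numeric column, added for illustration only
        "Price": [40.0, np.nan, 60.0, 20.0, np.nan, 40.0],
    }
)

# One mean per numeric column; the text column "Car" is ignored
column_means = df.mean(numeric_only=True)

# Each NaN is filled with the mean of its own column; df itself is unchanged
filled = df.fillna(column_means)

print(filled)
```

fillna() returns a new DataFrame here, so the original `df` keeps its NaN values unless the result is assigned back.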

Example

Here is the complete code:

import pandas as pd
import numpy as np

# Create DataFrame (np.nan is the portable spelling; np.NaN was removed in NumPy 2.0)
dataFrame = pd.DataFrame(
   {
      "Car": ['BMW', 'Lexus', 'Lexus', 'Mustang', 'Bentley', 'Mustang'],
      "Units": [100, 150, np.nan, 80, np.nan, np.nan]
   }
)

print("DataFrame ...")
print(dataFrame)

# Find the mean of the column that contains NaN values, i.e. the Units column here.
# mean() skips NaN, so it averages 100, 150 and 80; therefore the mean is 110
meanVal = dataFrame['Units'].mean()

# Replace NaNs with the mean of the column where they are located.
# The mean calculated above is 110, so NaN values will be replaced with 110.
# Assigning the result back avoids calling fillna(inplace=True) on a column selection.
dataFrame['Units'] = dataFrame['Units'].fillna(meanVal)
print("\nUpdated Dataframe after filling NaN values with mean...")
print(dataFrame)

Output

This produces the following output:

DataFrame ...
       Car   Units
0      BMW   100.0
1    Lexus   150.0
2    Lexus     NaN
3  Mustang    80.0
4  Bentley     NaN
5  Mustang     NaN

Updated Dataframe after filling NaN values with mean...
       Car   Units
0      BMW   100.0
1    Lexus   150.0
2    Lexus   110.0
3  Mustang    80.0
4  Bentley   110.0
5  Mustang   110.0
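
The printed result can also be checked programmatically. The sketch below rebuilds the same DataFrame and counts the missing values before and after the fill; the names `missing_before` and `missing_after` are illustrative:

```python
import numpy as np
import pandas as pd

dataFrame = pd.DataFrame(
    {
        "Car": ["BMW", "Lexus", "Lexus", "Mustang", "Bentley", "Mustang"],
        "Units": [100, 150, np.nan, 80, np.nan, np.nan],
    }
)

# Number of NaN entries in Units before filling
missing_before = int(dataFrame["Units"].isna().sum())

# Fill with the column mean and assign the result back
dataFrame["Units"] = dataFrame["Units"].fillna(dataFrame["Units"].mean())

# Number of NaN entries in Units after filling
missing_after = int(dataFrame["Units"].isna().sum())

print(missing_before, missing_after)
```

Filling with the mean leaves the column mean itself unchanged, which is one reason this strategy is a common default for simple imputation.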
