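Python: Compute the Last Value of Each Group in a Pandas DataFrame
To compute the last value of each group, call last() on the object returned by groupby(). First, import the required library under an alias:
import pandas as pd
Create a DataFrame with 3 columns: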
dataFrame = pd.DataFrame(
    {
        "Car": ['BMW', 'Lexus', 'BMW', 'Tesla', 'Lexus', 'Tesla'],
        "Place": ['Delhi', 'Bangalore', 'Pune', 'Punjab', 'Chandigarh', 'Mumbai'],
        "Units": [100, 150, 50, 80, 110, 90]
    }
)
Now, group the DataFrame by the Car column:
groupDF = dataFrame.groupby("Car")
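The grouped object does not print as a table, so it can help to inspect it before aggregating. The following is a minimal sketch, reusing the same sample data, that checks how many rows fall into each group and looks at the rows of one group:

```python
import pandas as pd

dataFrame = pd.DataFrame(
    {
        "Car": ['BMW', 'Lexus', 'BMW', 'Tesla', 'Lexus', 'Tesla'],
        "Place": ['Delhi', 'Bangalore', 'Pune', 'Punjab', 'Chandigarh', 'Mumbai'],
        "Units": [100, 150, 50, 80, 110, 90]
    }
)

groupDF = dataFrame.groupby("Car")

# Number of rows in each group, indexed by the group key
group_sizes = groupDF.size()
print(group_sizes)

# All rows that belong to the "BMW" group, in their original order
bmw_rows = groupDF.get_group("BMW")
print(bmw_rows)
```

Each car appears twice in the sample data, so every group holds two rows, and the last of those rows is what last() picks up.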
Compute the last value of each group, then reset the index so that Car becomes a regular column again:
res = groupDF.last()
res = res.reset_index()
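As an alternative to calling reset_index() afterwards, groupby() accepts as_index=False, which keeps the grouping key as an ordinary column. The sketch below is one possible variant, not the only way to write it; the name res_alt is chosen here purely for illustration:

```python
import pandas as pd

dataFrame = pd.DataFrame(
    {
        "Car": ['BMW', 'Lexus', 'BMW', 'Tesla', 'Lexus', 'Tesla'],
        "Place": ['Delhi', 'Bangalore', 'Pune', 'Punjab', 'Chandigarh', 'Mumbai'],
        "Units": [100, 150, 50, 80, 110, 90]
    }
)

# Two-step version: aggregate, then move the Car index back into a column
res = dataFrame.groupby("Car").last().reset_index()

# One-step version: as_index=False keeps Car as a column from the start
res_alt = dataFrame.groupby("Car", as_index=False).last()

print(res_alt)
```

Both versions should give one row per car with the same Car, Place and Units values.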
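Example
Below is the complete code. For each repeated value in the Car column it shows the last occurrence, that is, the last value of each group: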
import pandas as pd

# Build a DataFrame with three columns
dataFrame = pd.DataFrame(
    {
        "Car": ['BMW', 'Lexus', 'BMW', 'Tesla', 'Lexus', 'Tesla'],
        "Place": ['Delhi', 'Bangalore', 'Pune', 'Punjab', 'Chandigarh', 'Mumbai'],
        "Units": [100, 150, 50, 80, 110, 90]
    }
)

print("DataFrame ...")
print(dataFrame)

# Group the DataFrame by the Car column
groupDF = dataFrame.groupby("Car")

# Take the last value of each group, then turn Car back into a regular column
res = groupDF.last()
res = res.reset_index()

print("\nLast of group values =")
print(res)
Output
This produces the following output:
DataFrame ...
     Car       Place  Units
0    BMW       Delhi    100
1  Lexus   Bangalore    150
2    BMW        Pune     50
3  Tesla      Punjab     80
4  Lexus  Chandigarh    110
5  Tesla      Mumbai     90

Last of group values =
     Car       Place  Units
0    BMW        Pune     50
1  Lexus  Chandigarh    110
2  Tesla      Mumbai     90
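One caveat worth knowing: by default, last() skips missing values, so it returns the last non-null entry of each column rather than the physically last row of the group. The sample data above has no missing values, so the two coincide there. The sketch below uses a small made-up DataFrame (the data and variable names are assumptions for illustration only) to show the difference, with tail(1) as one way to get the actual last row per group:

```python
import numpy as np
import pandas as pd

# Hypothetical data: the last BMW row has a missing Units value
df = pd.DataFrame(
    {
        "Car": ['BMW', 'BMW', 'Tesla'],
        "Units": [100, np.nan, 80]
    }
)

# last() skips NaN by default, so BMW falls back to the earlier value
last_non_null = df.groupby("Car")["Units"].last()
print(last_non_null)

# tail(1) keeps the final row of each group as-is, NaN included,
# and preserves the original row index
last_rows = df.groupby("Car").tail(1)
print(last_rows)
```

Which behavior is wanted depends on whether a missing value in the final row should count as "the last value" or be skipped.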