Python - How to Group a Pandas DataFrame by Minutes?
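We will use the groupby() method to group a Pandas DataFrame, and pd.Grouper to select the column and the frequency to group on. Our example uses car sale records: we group the rows into minute-based intervals and compute the sum of the registration prices in each interval.
First, assume the following is our Pandas DataFrame with three columns. The Date_of_Purchase column holds timestamps that contain both a date and a time: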
dataFrame = pd.DataFrame(
{
"Car": ["Audi", "Lexus", "Tesla", "Mercedes", "BMW", "Toyota", "Nissan", "Bentley", "Mustang"],
"Date_of_Purchase": [
pd.Timestamp("2021-07-28 00:10:00"),
pd.Timestamp("2021-07-28 00:12:00"),
pd.Timestamp("2021-07-28 00:15:00"),
pd.Timestamp("2021-07-28 00:16:00"),
pd.Timestamp("2021-07-28 00:17:00"),
pd.Timestamp("2021-07-28 00:20:00"),
pd.Timestamp("2021-07-28 00:35:00"),
pd.Timestamp("2021-07-28 00:42:00"),
pd.Timestamp("2021-07-28 00:57:00"),
],
"Reg_Price": [1000, 1400, 1100, 900, 1700, 1800, 1300, 1150, 1350]
}
)
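Next, use pd.Grouper inside groupby() to select the Date_of_Purchase column. The frequency is set to 7min, so the rows are grouped into 7-minute intervals. We select Reg_Price explicitly because the Car column holds text, and pass min_count=1 so that intervals with no sales show NaN instead of 0:
grouped = dataFrame.groupby(pd.Grouper(key="Date_of_Purchase", freq="7min"))[["Reg_Price"]].sum(min_count=1)
print(f"\nGroup DataFrame by 7 minutes...\n{grouped}")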
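As a cross-check on what the Grouper call does, the sketch below compares it with resample() on the same column and frequency. The three-row sales frame is a hypothetical cut-down sample, not the full table from this article, and the variable names are ours.

```python
import pandas as pd

# Hypothetical three-row sample of purchase records (illustration only).
sales = pd.DataFrame(
    {
        "Date_of_Purchase": pd.to_datetime(
            ["2021-07-28 00:10:00", "2021-07-28 00:12:00", "2021-07-28 00:15:00"]
        ),
        "Reg_Price": [1000, 1400, 1100],
    }
)

# Group on the timestamp column in 7-minute intervals via pd.Grouper.
by_grouper = sales.groupby(
    pd.Grouper(key="Date_of_Purchase", freq="7min")
)["Reg_Price"].sum()

# Resample on the same column with the same frequency.
by_resample = sales.resample("7min", on="Date_of_Purchase")["Reg_Price"].sum()

print(by_grouper)
print(by_resample)
```

Both calls should place the first two purchases in the interval labelled 00:07 and the third in the interval labelled 00:14. Grouper is the more convenient choice when the time interval has to be combined with other grouping keys in the same groupby() call.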
Example
Following is the code:
import pandas as pd
# A DataFrame with a Date_of_Purchase timestamp column
dataFrame = pd.DataFrame(
{
"Car": ["Audi", "Lexus", "Tesla", "Mercedes", "BMW", "Toyota", "Nissan", "Bentley", "Mustang"],
"Date_of_Purchase": [
pd.Timestamp("2021-07-28 00:10:00"),
pd.Timestamp("2021-07-28 00:12:00"),
pd.Timestamp("2021-07-28 00:15:00"),
pd.Timestamp("2021-07-28 00:16:00"),
pd.Timestamp("2021-07-28 00:17:00"),
pd.Timestamp("2021-07-28 00:20:00"),
pd.Timestamp("2021-07-28 00:35:00"),
pd.Timestamp("2021-07-28 00:42:00"),
pd.Timestamp("2021-07-28 00:57:00"),
],
"Reg_Price": [1000, 1400, 1100, 900, 1700, 1800, 1300, 1150, 1350]
}
)
print(f"DataFrame...\n{dataFrame}")
# Grouper selects the Date_of_Purchase column inside groupby(); freq="7min" gives 7-minute intervals.
# Reg_Price is selected explicitly, and min_count=1 keeps empty intervals as NaN rather than 0.
grouped = dataFrame.groupby(pd.Grouper(key="Date_of_Purchase", freq="7min"))[["Reg_Price"]].sum(min_count=1)
print(f"\nGroup DataFrame by 7 minutes...\n{grouped}")
Output
This will produce the following output:
DataFrame...
Car Date_of_Purchase Reg_Price
0 Audi 2021-07-28 00:10:00 1000
1 Lexus 2021-07-28 00:12:00 1400
2 Tesla 2021-07-28 00:15:00 1100
3 Mercedes 2021-07-28 00:16:00 900
4 BMW 2021-07-28 00:17:00 1700
5 Toyota 2021-07-28 00:20:00 1800
6 Nissan 2021-07-28 00:35:00 1300
7 Bentley 2021-07-28 00:42:00 1150
8 Mustang 2021-07-28 00:57:00 1350
Group DataFrame by 7 minutes...
Reg_Price
Date_of_Purchase
2021-07-28 00:07:00 2400.0
2021-07-28 00:14:00 5500.0
2021-07-28 00:21:00 NaN
2021-07-28 00:28:00 NaN
2021-07-28 00:35:00 1300.0
2021-07-28 00:42:00 1150.0
2021-07-28 00:49:00 NaN
2021-07-28 00:56:00 1350.0
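The interval labels above start at 00:07 rather than at the first purchase (00:10) because, by default, the 7-minute bins are counted from midnight of the first day. If you would rather anchor the bins at the first timestamp, recent pandas versions (1.1 and later, as far as we know) accept an origin argument on pd.Grouper. The following is a minimal sketch under that assumption, reusing the timestamps and prices from the example; the names sales and anchored are ours.

```python
import pandas as pd

# Same purchase times and prices as the example above.
sales = pd.DataFrame(
    {
        "Date_of_Purchase": pd.to_datetime(
            [
                "2021-07-28 00:10:00", "2021-07-28 00:12:00", "2021-07-28 00:15:00",
                "2021-07-28 00:16:00", "2021-07-28 00:17:00", "2021-07-28 00:20:00",
                "2021-07-28 00:35:00", "2021-07-28 00:42:00", "2021-07-28 00:57:00",
            ]
        ),
        "Reg_Price": [1000, 1400, 1100, 900, 1700, 1800, 1300, 1150, 1350],
    }
)

# origin="start" counts the 7-minute bins from the first timestamp (00:10)
# instead of from midnight; min_count=1 leaves empty bins as NaN.
anchored = sales.groupby(
    pd.Grouper(key="Date_of_Purchase", freq="7min", origin="start")
)["Reg_Price"].sum(min_count=1)

print(anchored)

# Drop the intervals in which nothing was sold.
print(anchored.dropna())
```

With this anchoring the bin edges shift, so the per-interval sums differ from the output above even though the data is the same.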