Predicting Vehicle Count from Sensor Data

Prerequisites: Regression and Classification | Supervised Machine Learning

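Sensors installed at road junctions collect vehicle-count data for the different junctions and pass it to the transport manager. Our task is to predict the total number of vehicles from this sensor data.

This article explains how to handle timestamped sensor data and predict the number of vehicles at a given time.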

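Dataset description:

The data set contains two attributes: DateTime and Vehicles. Vehicles is the class label, that is, the value to be predicted.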



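The class label is numeric, so regression suits this problem. Regression is a supervised learning technique that fits a function mapping the input attributes to a numeric value, learned from historical data. Here the class label, i.e. the Vehicles attribute, is numeric, so regression is the appropriate approach.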

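Random Forest Regressor is an ensemble technique: it builds many decision trees from the input data and, for each row/tuple, returns the average of the predictions of all trees.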
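The averaging can be checked directly. The following is a minimal sketch on synthetic data (the data, the variable names, and the choice of 20 trees are assumptions for illustration, not part of the article's data set). It compares the forest's prediction with the manual mean of its individual trees, which scikit-learn exposes as estimators_.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic regression data (an assumption for illustration only).
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(200, 2))
y = 3 * X[:, 0] + X[:, 1] + rng.normal(0, 0.5, size=200)

# Fit a small forest of 20 trees.
forest = RandomForestRegressor(n_estimators=20, random_state=0).fit(X, y)

# Predict a few rows with every individual tree, then average per row.
sample = X[:5]
per_tree = np.stack([tree.predict(sample) for tree in forest.estimators_])
manual_average = per_tree.mean(axis=0)

# The forest's own prediction for the same rows.
forest_prediction = forest.predict(sample)

# True when the forest prediction equals the mean over its trees.
print(np.allclose(manual_average, forest_prediction))
```

Each row of per_tree holds one tree's predictions, so the mean along axis 0 is the per-row average described above.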

Syntax (as in older scikit-learn releases; recent releases name the default criterion 'squared_error', default max_features to 1.0, and no longer accept min_impurity_split):

RandomForestRegressor(n_estimators=100, *, criterion='mse', max_depth=None, min_samples_split=2, min_samples_leaf=1, min_weight_fraction_leaf=0.0, max_features='auto', max_leaf_nodes=None, min_impurity_decrease=0.0, min_impurity_split=None, bootstrap=True, oob_score=False, n_jobs=None, random_state=None, verbose=0, warm_start=False, ccp_alpha=0.0, max_samples=None)

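Steps:

Import the necessary modules
Load the data set
Analyze the data
Convert the DateTime attribute (which is in timestamp format) into week, day, hour, month, etc.
Build the model
Train the model
Test the data
Predict the result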

Step 1: Import the pandas module and load the data set into a data frame.

# importing the pandas module for
# data frame
import pandas as pd
 
 
# load the data set into train variable.
train = pd.read_csv('vehicles.csv')
 
# display top 5 values of data set
train.head()

Output:

[Figure: first five rows of the data set]
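The "analyze the data" step from the list of steps can be sketched as a few quick checks after loading. In this minimal sketch the four rows in csv_text are hypothetical stand-ins for vehicles.csv (the real file is not reproduced here), and the name sample is likewise an assumption. It parses DateTime while reading, then inspects column types, missing values, and the spread of the counts.

```python
import io

import pandas as pd

# Hypothetical rows standing in for vehicles.csv (illustration only).
csv_text = """DateTime,Vehicles
2015-11-01 00:00:00,15
2015-11-01 01:00:00,13
2015-11-01 02:00:00,10
2015-11-01 03:00:00,7
"""

# parse_dates converts the DateTime column to timestamps while reading.
sample = pd.read_csv(io.StringIO(csv_text), parse_dates=["DateTime"])

# Column types: DateTime should be a datetime type, Vehicles an integer.
print(sample.dtypes)

# Number of missing values per column.
print(sample.isna().sum())

# Summary statistics of the class label.
print(sample["Vehicles"].describe())
```

With the real file, the same calls would apply to pd.read_csv('vehicles.csv', parse_dates=['DateTime']).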

Step 2: Define functions that extract the month, day, hour, etc. from the timestamp (DateTime), and load the results into separate columns.

# function to get all data from time stamp
 
# get date
def get_dom(dt):
    return dt.day
 
# get week day
def get_weekday(dt):
    return dt.weekday()
 
# get hour
def get_hour(dt):
    return dt.hour
 
# get year
def get_year(dt):
    return dt.year
 
# get month
def get_month(dt):
    return dt.month
 
# get year day
def get_dayofyear(dt):
    return dt.dayofyear
 
# get year week
def get_weekofyear(dt):
    return dt.weekofyear
 
 
# parse the DateTime strings into timestamps in one vectorized call
train['DateTime'] = pd.to_datetime(train['DateTime'])
train['date'] = train['DateTime'].map(get_dom)
train['weekday'] = train['DateTime'].map(get_weekday)
train['hour'] = train['DateTime'].map(get_hour)
train['month'] = train['DateTime'].map(get_month)
train['year'] = train['DateTime'].map(get_year)
train['dayofyear'] = train['DateTime'].map(get_dayofyear)
train['weekofyear'] = train['DateTime'].map(get_weekofyear)
 
# display
train.head()

Output:

[Figure: first five rows of the data set with the extracted date columns]
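As an alternative to one helper function per field, the same columns can be derived with the vectorized .dt accessor of pandas. This is a sketch under assumptions: the two timestamps and the names frame, stamp, and features are made up for illustration. It uses .dt.isocalendar().week for the week number, since the .dt.weekofyear accessor is no longer available in pandas 2.0 and later.

```python
import pandas as pd

# Two made-up timestamps standing in for the DateTime column.
frame = pd.DataFrame(
    {"DateTime": pd.to_datetime(["2015-01-11 00:00:00", "2015-11-01 13:00:00"])}
)

# The .dt accessor exposes datetime fields for the whole column at once.
stamp = frame["DateTime"].dt

features = pd.DataFrame({
    "date": stamp.day,            # day of the month
    "weekday": stamp.weekday,     # Monday=0 ... Sunday=6
    "hour": stamp.hour,
    "month": stamp.month,
    "year": stamp.year,
    "dayofyear": stamp.dayofyear,
    # ISO week number, converted from pandas' nullable UInt32 to int64
    "weekofyear": stamp.isocalendar().week.astype("int64"),
})

print(features)
```

The column names match those created in Step 2, so the result can be used in place of the mapped columns.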

Step 3: Separate the class label and store it in the target variable.

# the DateTime column is no longer needed once its
# parts have been extracted, so remove it
train = train.drop(['DateTime'], axis=1)
 
# separating class label for training the data
train1 = train.drop(['Vehicles'], axis=1)
 
# class label is stored in target
target = train['Vehicles']
 
print(train1.head())
target.head()

Output:

[Figure: first five rows of train1 and of target]

Step 4: Create and train the model with the machine learning algorithm, then predict the result for a test input.

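# importing Random forest
from sklearn.ensemble import RandomForestRegressor

# defining the RandomForestRegressor
m1 = RandomForestRegressor()

# training the model on the extracted features
m1.fit(train1, target)

# testing: the values follow the column order of train1, assumed to be
# date=11, weekday=6, hour=0, month=1, year=2015, dayofyear=11, weekofyear=2
m1.predict([[11, 6, 0, 1, 2015, 11, 2]])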
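The model above is trained on every row, so no data is left to measure how well it predicts. One possible way to test it, sketched here on synthetic hourly counts (the data, the 80/20 chronological split, and the use of mean absolute error are assumptions for illustration, not the article's method), is to hold out the most recent hours and compare predictions with the true counts.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error

# Synthetic hourly counts with a daily cycle (illustration only).
rng = np.random.default_rng(42)
times = pd.date_range("2015-11-01", periods=24 * 60, freq=pd.Timedelta(hours=1))
hours = times.hour.to_numpy()
counts = 20 + 10 * np.sin(2 * np.pi * hours / 24) + rng.normal(0, 2, len(times))

data = pd.DataFrame({
    "date": times.day,
    "weekday": times.weekday,
    "hour": times.hour,
    "month": times.month,
    "year": times.year,
    "dayofyear": times.dayofyear,
    "Vehicles": counts,
})

# Chronological split: the first 80% of hours train, the last 20% test.
split = int(len(data) * 0.8)
features = data.drop(columns=["Vehicles"])
X_train, X_test = features.iloc[:split], features.iloc[split:]
y_train, y_test = data["Vehicles"].iloc[:split], data["Vehicles"].iloc[split:]

model = RandomForestRegressor(n_estimators=50, random_state=0)
model.fit(X_train, y_train)

# Mean absolute error on the hours the model has never seen.
mae = mean_absolute_error(y_test, model.predict(X_test))
print(f"MAE on held-out hours: {mae:.2f}")
```

Splitting by time rather than at random keeps future hours out of the training set, which is closer to how a traffic forecast would be used.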