Randomly Split a Pandas DataFrame by a Given Ratio


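Splitting a Pandas DataFrame is useful in machine learning, artificial intelligence, and similar fields, where a given dataset is divided into training data and test data. Let's see how to randomly split a pandas DataFrame by a given ratio. For this task we will use two DataFrame methods together: DataFrame.sample() and DataFrame.drop().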

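The syntax of these methods is as follows:

  • DataFrame.sample()

Syntax: DataFrame.sample(n=None, frac=None, replace=False, weights=None, random_state=None, axis=None)

Return type: A new object of the same type as the caller, containing n items randomly sampled from the caller object.

  • DataFrame.drop()

Syntax: DataFrame.drop(labels=None, axis=0, index=None, columns=None, level=None, inplace=False, errors='raise')

Returns: A DataFrame with the specified labels removed (or None if inplace=True).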
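As a quick illustration of how the two calls fit together, here is a minimal sketch on a small made-up frame. The column name `value` and the seed `42` are assumptions chosen for this example only; passing `random_state` simply makes the random draw repeatable.

```python
import pandas as pd

# Hypothetical toy data, used only for illustration
df = pd.DataFrame({'value': range(10)})

# Draw 30% of the rows at random; random_state fixes the seed
sampled = df.sample(frac=0.3, random_state=42)

# Alternatively, draw an exact number of rows with n
two_rows = df.sample(n=2, random_state=42)

# Drop the sampled rows by index label to obtain the remainder
remainder = df.drop(sampled.index)

print(len(sampled), len(two_rows), len(remainder))  # 3 2 7
```

Note that `frac` and `n` are alternatives: `sample()` accepts one or the other, not both in the same call.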

Example: First, let's create a DataFrame.

# Importing required libraries
import pandas as pd
 
record = {
    'course_name': ['Data Structures', 'Python',
                    'Machine Learning', 'Web Development'],
    'student_name': ['Ankit', 'Shivangi',
                     'Priya', 'Shaurya'],
    'student_city': ['Chennai', 'Pune',
                     'Delhi', 'Mumbai'],
    'student_gender': ['M', 'F',
                       'F', 'M'] }
 
# Creating a dataframe
df = pd.DataFrame(record)
 
# show the dataframe
df

Output:

[Image: the DataFrame]

Example 1: Randomly split a DataFrame in a 1:1 ratio.

# Importing required libraries
import pandas as pd
 
record = {
    'course_name': ['Data Structures', 'Python',
                    'Machine Learning', 'Web Development'],
    'student_name': ['Ankit', 'Shivangi',
                     'Priya', 'Shaurya'],
    'student_city': ['Chennai', 'Pune',
                     'Delhi', 'Mumbai'],
    'student_gender': ['M', 'F',
                       'F', 'M'] }
 
# Creating a dataframe
df = pd.DataFrame(record)
 
# Creating a dataframe with 50%
# values of original dataframe
part_50 = df.sample(frac = 0.5)
 
# Creating dataframe with
# rest of the 50% values
rest_part_50 = df.drop(part_50.index)
 
print("\n50% of the given DataFrame:")
print(part_50)
 
print("\nrest 50% of the given DataFrame:")
print(rest_part_50)

Output:

[Image: the DataFrame divided into two halves]
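One caveat worth keeping in mind: `df.drop(part_50.index)` removes rows by index label, so this approach relies on the index labels being unique. If labels repeat, dropping one label removes every row that carries it. The sketch below, using a hypothetical frame with repeated labels and an arbitrary seed, shows one possible guard: resetting the index before splitting.

```python
import pandas as pd

# Hypothetical frame whose index labels repeat
df = pd.DataFrame({'value': [10, 20, 30, 40]},
                  index=['a', 'a', 'b', 'b'])

# Give every row a unique positional label before splitting
df_unique = df.reset_index(drop=True)

part = df_unique.sample(frac=0.5, random_state=0)
rest = df_unique.drop(part.index)

# The two parts together cover every row exactly once
print(len(part) + len(rest) == len(df_unique))  # True
```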

Example 2: Randomly split a DataFrame in a 3:1 ratio.

# Importing required libraries
import pandas as pd
 
record = {
    'course_name': ['Data Structures', 'Python',
                    'Machine Learning', 'Web Development'],
    'student_name': ['Ankit', 'Shivangi',
                     'Priya', 'Shaurya'],
    'student_city': ['Chennai', 'Pune',
                     'Delhi', 'Mumbai'],
    'student_gender': ['M', 'F',
                       'F', 'M'] }
 
# Creating a dataframe
df = pd.DataFrame(record)
 
# Creating a dataframe with 75%
# values of original dataframe
part_75 = df.sample(frac = 0.75)
 
# Creating dataframe with
# rest of the 25% values
rest_part_25 = df.drop(part_75.index)
 
print("\n75% of the given DataFrame:")
print(part_75)
 
print("\nrest 25% of the given DataFrame:")
print(rest_part_25)

Output:

[Image: the DataFrame divided in a 3:1 ratio]
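The two examples differ only in the fraction passed to sample(), so the pattern can be wrapped in a small helper. The function below is a sketch, not part of pandas: the name `split_by_ratio`, its parameters, and the toy data are all assumptions made for illustration, and it presumes the index labels are unique.

```python
import pandas as pd

def split_by_ratio(df, first, second, random_state=None):
    """Randomly split df into two parts sized in the ratio first:second.

    Hypothetical helper; assumes df has unique index labels.
    """
    # Convert the ratio into the fraction of rows for the first part
    frac = first / (first + second)
    part_a = df.sample(frac=frac, random_state=random_state)
    # Everything not sampled goes into the second part
    part_b = df.drop(part_a.index)
    return part_a, part_b

# Toy data for illustration only
df = pd.DataFrame({'value': range(8)})

train, test = split_by_ratio(df, 3, 1, random_state=1)
print(len(train), len(test))  # 6 2
```

When the row count is not evenly divisible by the ratio, sample() rounds the number of rows it draws, so the resulting sizes only approximate the requested ratio.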
