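Get the Substring of a Column in Pandas-Python
Now we will see how to get a substring of every value in a column of a Pandas DataFrame. This kind of extraction is very useful when working with data. For example, suppose one column holds the first and last names of different people, and we need to extract the first 3 letters of each name to create their usernames.
Example 1:
We can loop over the row positions of the column and compute the substring of each value.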
# importing pandas as pd
import pandas as pd
# creating a dictionary (named data so the built-in dict is not shadowed)
data = {'Name': ["John Smith", "Mark Wellington",
                 "Rosie Bates", "Emily Edward"]}
# converting the dictionary to a
# dataframe
df = pd.DataFrame.from_dict(data)
# storing first 3 letters of name;
# .loc writes into df itself, whereas assigning through
# df.iloc[i].Name may only change a temporary copy
for i in range(len(df)):
    df.loc[i, 'Name'] = df.loc[i, 'Name'][:3]
df
Output:
   Name
0   Joh
1   Mar
2   Ros
3   Emi
Note: For more information, refer to Python Extracting rows using Pandas.
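A row-by-row loop slices each value as an ordinary Python string, so it fails as soon as the column contains a missing value, because None and NaN cannot be sliced. The following is a minimal sketch of guarding the loop with pd.isna; the data with a missing name is hypothetical and not part of the examples in this article.

```python
import pandas as pd

# Hypothetical data: the second name is missing (assumption for illustration)
df = pd.DataFrame({'Name': ["John Smith", None, "Rosie Bates"]})

for i in range(len(df)):
    value = df.loc[i, 'Name']
    # Skip missing entries; slicing None or NaN would raise a TypeError
    if pd.isna(value):
        continue
    df.loc[i, 'Name'] = value[:3]

print(df)
```

The vectorized .str methods shown in the later examples handle missing values without an explicit check, which is one reason they are usually preferred over a loop.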
Example 2: In this example we will use str.slice().
# importing pandas as pd
import pandas as pd
# creating a dictionary (named data so the built-in dict is not shadowed)
data = {'Name': ["John Smith", "Mark Wellington",
                 "Rosie Bates", "Emily Edward"]}
# converting the dictionary to a
# dataframe
df = pd.DataFrame.from_dict(data)
# storing first 3 letters of name as username
df['UserName'] = df['Name'].str.slice(0, 3)
df
Output:
              Name  UserName
0       John Smith       Joh
1  Mark Wellington       Mar
2      Rosie Bates       Ros
3     Emily Edward       Emi
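str.slice() takes start, stop and step arguments that behave like an ordinary Python slice, so it is not limited to prefixes. The sketch below illustrates this on a standalone Series; the variable names and the missing entry are assumptions added for illustration.

```python
import pandas as pd

# Two names from the example plus a hypothetical missing value
names = pd.Series(["John Smith", "Mark Wellington", None])

# Characters at positions 0, 1 and 2
first3 = names.str.slice(0, 3)
# Last four characters (negative start, no stop)
last4 = names.str.slice(start=-4)
# Every second character within the first six
every_other = names.str.slice(0, 6, 2)

print(first3.tolist())
print(last4.tolist())
print(every_other.tolist())
```

The missing entry stays missing in every result instead of raising an error.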
Example 3: We can also use the str accessor in a different way, by indexing it with square brackets.
# importing pandas as pd
import pandas as pd
# creating a dictionary (named data so the built-in dict is not shadowed)
data = {'Name': ["John Smith", "Mark Wellington",
                 "Rosie Bates", "Emily Edward"]}
# converting the dictionary to a dataframe
df = pd.DataFrame.from_dict(data)
# storing first 3 letters of name as username
df['UserName'] = df['Name'].str[:3]
df
Output:
              Name  UserName
0       John Smith       Joh
1  Mark Wellington       Mar
2      Rosie Bates       Ros
3     Emily Edward       Emi
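Square-bracket indexing on the str accessor accepts a single position as well as a slice, and the result is again a Series, so further string methods can be chained onto it. A minimal sketch follows; the Initial column and the lower-casing step are assumptions added for illustration, not part of the original example.

```python
import pandas as pd

df = pd.DataFrame({'Name': ["John Smith", "Mark Wellington",
                            "Rosie Bates", "Emily Edward"]})

# A single index returns one character per row
df['Initial'] = df['Name'].str[0]
# A slice followed by another .str method: lower-case 3-letter usernames
df['UserName'] = df['Name'].str[:3].str.lower()

print(df)
```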
Example 4: We can also use str.extract for this task. In this example, we will store each person's last name in the "LastName" column.
# importing pandas as pd
import pandas as pd
# creating a dictionary (named data so the built-in dict is not shadowed)
data = {'Name': ["John Smith", "Mark Wellington",
                 "Rosie Bates", "Emily Edward"]}
# converting the dictionary to a dataframe
df = pd.DataFrame.from_dict(data)
# storing lastname of each person: the pattern captures
# the last word before the end of the string
df['LastName'] = df['Name'].str.extract(r'\b(\w+)$',
                                        expand=True)
df
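Output:
              Name    LastName
0       John Smith       Smith
1  Mark Wellington  Wellington
2      Rosie Bates       Bates
3     Emily Edward      Edward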
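str.extract returns one column per capture group, and named groups become the column names. The sketch below splits each name into two parts in a single call; the pattern and the FirstName column are assumptions for illustration, and the pattern only matches values made of exactly two words.

```python
import pandas as pd

df = pd.DataFrame({'Name': ["John Smith", "Mark Wellington",
                            "Rosie Bates", "Emily Edward"]})

# Two named groups: the resulting DataFrame has columns FirstName and LastName
parts = df['Name'].str.extract(r'^(?P<FirstName>\w+)\s+(?P<LastName>\w+)$')

# With a single group, expand=False returns a Series instead of a DataFrame
last = df['Name'].str.extract(r'\b(\w+)$', expand=False)

print(parts)
print(last)
```

Rows that do not match the pattern would receive missing values in every extracted column.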