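Find the Position of an Element in a Pandas DataFrame in Python
In this article, we will see how to find the position of an element in a DataFrame using a user-defined function. First, let's create a simple DataFrame from a list of tuples, with the columns 'Name', 'Age', 'City' and 'Section'.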
# Import pandas library
import pandas as pd

# List of tuples
students = [('Ankit', 23, 'Delhi', 'A'),
            ('Swapnil', 22, 'Delhi', 'B'),
            ('Aman', 22, 'Dehradun', 'A'),
            ('Jiten', 22, 'Delhi', 'A'),
            ('Jeet', 21, 'Mumbai', 'B')
            ]

# Creating Dataframe object
df = pd.DataFrame(students, columns=['Name', 'Age', 'City', 'Section'])

# Display the DataFrame (in a notebook or interactive session)
df
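Output:
      Name  Age      City Section
0    Ankit   23     Delhi       A
1  Swapnil   22     Delhi       B
2     Aman   22  Dehradun       A
3    Jiten   22     Delhi       A
4     Jeet   21    Mumbai       B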
Example 1: Find the position of a single element in the DataFrame.
# Import pandas library
import pandas as pd

# List of tuples
students = [('Ankit', 23, 'Delhi', 'A'),
            ('Swapnil', 22, 'Delhi', 'B'),
            ('Aman', 22, 'Dehradun', 'A'),
            ('Jiten', 22, 'Delhi', 'A'),
            ('Jeet', 21, 'Mumbai', 'B')
            ]

# Creating Dataframe object
df = pd.DataFrame(students, columns=['Name', 'Age', 'City', 'Section'])

# This function will return a list of
# positions where element exists
# in the dataframe.
def getIndexes(dfObj, value):
    # Empty list
    listOfPos = []

    # isin() method will return a dataframe with
    # boolean values, True at the positions
    # where element exists
    result = dfObj.isin([value])

    # any() method will return
    # a boolean series (one value per column)
    seriesObj = result.any()

    # Get list of column names where
    # element exists
    columnNames = list(seriesObj[seriesObj].index)

    # Iterate over the list of columns and
    # extract the row index where element exists
    for col in columnNames:
        rows = list(result[col][result[col]].index)
        for row in rows:
            listOfPos.append((row, col))

    # This list contains a list of tuples with
    # the index of element in the dataframe
    return listOfPos

# Calling getIndexes() function to get
# the index positions of all occurrences
# of 22 in the dataframe
listOfPositions = getIndexes(df, 22)

print('Index positions of 22 in Dataframe : ')

# Printing the positions
for pos in listOfPositions:
    print(pos)
Output:
Index positions of 22 in Dataframe : 
(1, 'Age')
(2, 'Age')
(3, 'Age')
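Now let's look at how getIndexes() works. It relies on two methods: DataFrame.isin() and DataFrame.any(). isin() accepts a list of values and returns a DataFrame of booleans with the same shape as the original DataFrame: a cell is True where the given element is present and False everywhere else. any() then reduces that boolean DataFrame to a Series holding one boolean per column, which is True if the column contains at least one True. From this Series we get the names of the columns that contain the element 22 by keeping only the entries that are True. Next, we iterate over each selected column of the boolean DataFrame and, for each one, find the rows that hold True. Each combination of a row index and a column name where True occurs is an index position of 22 in the DataFrame. This is how getIndexes() finds the exact position of every occurrence of the given element and stores it as a (row, column) tuple. Finally, it returns the list of these tuples.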
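The steps described above can be traced one object at a time. The sketch below is only illustrative: it rebuilds the same student data and exposes each intermediate result for the value 22. The variable names `mask`, `col_has_hit`, `cols_with_hit` and `positions` are our own choices and do not appear in the original function.

```python
import pandas as pd

students = [('Ankit', 23, 'Delhi', 'A'),
            ('Swapnil', 22, 'Delhi', 'B'),
            ('Aman', 22, 'Dehradun', 'A'),
            ('Jiten', 22, 'Delhi', 'A'),
            ('Jeet', 21, 'Mumbai', 'B')]
df = pd.DataFrame(students, columns=['Name', 'Age', 'City', 'Section'])

# Step 1: boolean DataFrame with the same shape as df,
# True in every cell whose value is 22
mask = df.isin([22])

# Step 2: one boolean per column, True if the column holds 22 anywhere
col_has_hit = mask.any()

# Step 3: names of the columns that contain 22
cols_with_hit = list(col_has_hit[col_has_hit].index)

# Step 4: for each such column, collect the row labels where mask is True
positions = [(row, col)
             for col in cols_with_hit
             for row in mask[col][mask[col]].index]

print(mask)
print(col_has_hit)
print(cols_with_hit)
print(positions)
```

Printing `mask` and `col_has_hit` side by side makes it easy to see why only the 'Age' column is visited in the final step.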
Example 2: Find the positions of multiple elements in the DataFrame.
# Import pandas library
import pandas as pd

# List of tuples
students = [('Ankit', 23, 'Delhi', 'A'),
            ('Swapnil', 22, 'Delhi', 'B'),
            ('Aman', 22, 'Dehradun', 'A'),
            ('Jiten', 22, 'Delhi', 'A'),
            ('Jeet', 21, 'Mumbai', 'B')
            ]

# Creating Dataframe object
df = pd.DataFrame(students, columns=['Name', 'Age', 'City', 'Section'])

# This function will return a
# list of positions where
# element exists in dataframe
def getIndexes(dfObj, value):
    # Empty list
    listOfPos = []

    # isin() method will return a dataframe with
    # boolean values, True at the positions
    # where element exists
    result = dfObj.isin([value])

    # any() method will return
    # a boolean series (one value per column)
    seriesObj = result.any()

    # Get list of columns where element exists
    columnNames = list(seriesObj[seriesObj].index)

    # Iterate over the list of columns and
    # extract the row index where element exists
    for col in columnNames:
        rows = list(result[col][result[col]].index)
        for row in rows:
            listOfPos.append((row, col))

    # This list contains a list of tuples with
    # the index of element in the dataframe
    return listOfPos

# Create a list which contains all the elements
# whose index position you need to find
listOfElems = [22, 'Delhi']

# Using dictionary comprehension to find
# index positions of multiple elements
# in dataframe
dictOfPos = {elem: getIndexes(df, elem) for elem in listOfElems}

print('Position of given elements in Dataframe are : ')

# Looping through key, value pairs
# in the dictionary
for key, value in dictOfPos.items():
    print(key, ' : ', value)
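Output:
Position of given elements in Dataframe are : 
22  :  [(1, 'Age'), (2, 'Age'), (3, 'Age')]
Delhi  :  [(0, 'City'), (1, 'City'), (3, 'City')]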
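For comparison, the same lookup can be written without explicit loops over columns. This is a minimal sketch of an alternative, not the method used in the examples above: the helper name `get_positions` is hypothetical, and it assumes NumPy is available alongside pandas. It uses `numpy.where` to read the True cells of the boolean frame in row-major order, so positions come back ordered by row rather than column by column. Each position is then read back with `df.at` as a sanity check.

```python
import numpy as np
import pandas as pd

students = [('Ankit', 23, 'Delhi', 'A'),
            ('Swapnil', 22, 'Delhi', 'B'),
            ('Aman', 22, 'Dehradun', 'A'),
            ('Jiten', 22, 'Delhi', 'A'),
            ('Jeet', 21, 'Mumbai', 'B')]
df = pd.DataFrame(students, columns=['Name', 'Age', 'City', 'Section'])

def get_positions(frame, value):
    """Return (row label, column label) pairs where frame equals value.

    Hypothetical helper for illustration only.
    """
    # Boolean NumPy array, True where the cell matches value
    mask = frame.isin([value]).to_numpy()
    # Integer positions of the True cells, in row-major order
    row_idx, col_idx = np.where(mask)
    # Translate integer positions back to row and column labels
    return [(frame.index[r], frame.columns[c])
            for r, c in zip(row_idx, col_idx)]

elems = [22, 'Delhi']
positions_by_elem = {elem: get_positions(df, elem) for elem in elems}

for elem, positions in positions_by_elem.items():
    # Read every reported cell back to confirm it really holds elem
    assert all(df.at[row, col] == elem for row, col in positions)
    print(elem, ':', positions)
```

With this data each element occurs in a single column, so the row-major ordering happens to match the column-by-column ordering of getIndexes(); with matches spread across several columns the two orderings would differ.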