Python Pandas Index.drop_duplicates()

The pandas Index.drop_duplicates() function returns a new Index with duplicate values removed.

Syntax of Pandas Index.drop_duplicates()

Syntax: Index.drop_duplicates(keep='first')

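Parameters: keep : {'first', 'last', False}, default 'first'

'first' : Drop duplicates except for the first occurrence (default).
'last' : Drop duplicates except for the last occurrence.
False : Drop every value that occurs more than once.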

Returns: deduplicated : Index
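
To make the three keep options concrete, here is a minimal sketch that applies each one to the same small Index. The sample labels and the variable names (first, last, none_kept) are illustrative assumptions, not part of the pandas API.

```python
import pandas as pd

# Illustrative Index: "a" and "b" each appear twice, "c" once.
idx = pd.Index(["a", "b", "a", "c", "b"])

# keep='first': keep the first occurrence of each label.
first = idx.drop_duplicates(keep="first").tolist()

# keep='last': keep the last occurrence of each label.
last = idx.drop_duplicates(keep="last").tolist()

# keep=False: drop every label that occurs more than once.
none_kept = idx.drop_duplicates(keep=False).tolist()

print(first)      # ['a', 'b', 'c']
print(last)       # ['a', 'c', 'b']
print(none_kept)  # ['c']
```

Note that the surviving labels stay in their original relative order, which is why keep='last' yields a different ordering from keep='first'.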

Index.drop_duplicates() examples

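The function lets you choose which duplicates to keep: you can drop every value that occurs more than once, or keep only the first or the last occurrence of each duplicated value.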

Example 1: Use Index.drop_duplicates() to remove duplicate values from the Index, keeping only the first occurrence of each value.

# Import pandas as pd
import pandas as pd
 
# Create the Index
idx = pd.Index([10, 11, 5, 5, 22, 5, 3, 11])
 
# Drop duplicate labels, keeping the first occurrence of each.
# drop_duplicates() returns a new Index, so keep the result.
result = idx.drop_duplicates(keep='first')
print(result)

Output:

The returned Index keeps only the first occurrence of each label: 10, 11, 5, 22, 3.
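
Index.drop_duplicates() returns a new Index and leaves the original unchanged, so the result has to be assigned to a name before it can be used. The following sketch illustrates this with the same data as Example 1; the variable name deduped is an illustrative assumption.

```python
import pandas as pd

idx = pd.Index([10, 11, 5, 5, 22, 5, 3, 11])

# The call returns a new Index; idx itself is not modified.
deduped = idx.drop_duplicates(keep="first")

print(idx.tolist())      # [10, 11, 5, 5, 22, 5, 3, 11]
print(deduped.tolist())  # [10, 11, 5, 22, 3]
```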

Example 2: Use Index.drop_duplicates() to remove every label that occurs more than once, so that no duplicated value is kept in the Index.

# Import pandas as pd
import pandas as pd
 
# Create the Index
idx = pd.Index([10, 11, 5, 5, 22, 5, 3, 11])
 
# Drop every label that occurs more than once.
# drop_duplicates() returns a new Index, so keep the result.
result = idx.drop_duplicates(keep=False)
 
# Print the de-duplicated Index
print(result)
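
The remaining option, keep='last', is not shown in the examples above. As a hedged sketch on the same data, the snippet below keeps the last occurrence of each label and cross-checks the result against a boolean mask built with Index.duplicated(). The variable names last_kept and via_mask are illustrative assumptions.

```python
import pandas as pd

idx = pd.Index([10, 11, 5, 5, 22, 5, 3, 11])

# Keep only the last occurrence of each label.
last_kept = idx.drop_duplicates(keep="last")

# Cross-check: mask out every position flagged as a duplicate
# when the last occurrence is treated as the one to keep.
via_mask = idx[~idx.duplicated(keep="last")]

print(last_kept.tolist())  # [10, 22, 5, 3, 11]
print(via_mask.tolist())   # [10, 22, 5, 3, 11]
```

Here the earlier occurrences of 11 and 5 are dropped, so those labels appear at the positions of their final occurrences.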