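How to unstack or pivot in Pandas

Data analysis often requires turning some columns of a table or DataFrame into rows, or some rows into columns. In Pandas, the unstack and pivot operations serve this purpose.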

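The unstack operation

The unstack operation pivots a level of the row index into the column labels. When the row index has only one level, no level is left to serve as rows, so unstack returns a Series with a two-level (hierarchical) index instead: the outer level holds the original column names and the inner level holds the original row labels.

The following example applies unstack to a DataFrame with a single-level index: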

import pandas as pd

df = pd.DataFrame({'A': ['a', 'b', 'c'],
                   'B': [1, 2, 3],
                   'C': [4, 5, 6]})

print(df)
#     A  B  C
# 0  a  1  4
# 1  b  2  5
# 2  c  3  6

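# Use column 'A' as the (single-level) row index.
df = df.set_index('A')
# With only one index level, unstack() returns a Series
# indexed by (column name, 'A' label).
df = df.unstack()

print(df)
#    A
# B  a    1
#    b    2
#    c    3
# C  a    4
#    b    5
#    c    6
# dtype: int64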

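In the example above, set_index makes column 'A' the index, and unstack then moves the column labels 'B' and 'C' into the index. The result is a Series with a two-level index, where the outer level is the original column name and the inner level is the value of 'A'. For instance, the element at (B, a) is 1 and the element at (C, b) is 5.

The unstack method can also take one or more levels out of a multi-level row index and rotate them into columns: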
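To make the two-level index concrete, here is a minimal sketch that rebuilds the same frame and looks up individual values. The variable names `frame` and `stacked` are chosen here purely for illustration.

```python
import pandas as pd

frame = pd.DataFrame({'A': ['a', 'b', 'c'],
                      'B': [1, 2, 3],
                      'C': [4, 5, 6]}).set_index('A')

# With a single-level row index, unstack() returns a Series whose
# outer index level is the column name and inner level is the 'A' label.
stacked = frame.unstack()

print(type(stacked).__name__)      # Series
print(stacked.index.nlevels)       # 2
print(stacked.loc[('B', 'a')])     # 1
print(stacked.loc[('C', 'b')])     # 5

# Selecting on the outer level gives back the values of one original column.
print(stacked['C'].tolist())       # [4, 5, 6]
```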

df = pd.DataFrame([('bar', 'one', 1, 2),
                   ('bar', 'two', 2, 4),
                   ('foo', 'one', 3, 6),
                   ('foo', 'two', 4, 8)],
                  columns=['A', 'B', 'C', 'D'])

df = df.set_index(['A', 'B'])
df = df.unstack(1)

print(df)
#       C      D    
# B   one two one two
# A                 
# bar   1   2   2   4
# foo   3   4   6   8
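The level passed to unstack can be given by position or by name. The sketch below, which reuses the same data, checks that both spellings agree and shows what happens when the other level ('A') is unstacked instead. The names `by_name`, `by_position` and `by_a` are illustrative only.

```python
import pandas as pd

df = pd.DataFrame([('bar', 'one', 1, 2),
                   ('bar', 'two', 2, 4),
                   ('foo', 'one', 3, 6),
                   ('foo', 'two', 4, 8)],
                  columns=['A', 'B', 'C', 'D']).set_index(['A', 'B'])

# Referring to the level by name is equivalent to using its position.
by_name = df.unstack('B')
by_position = df.unstack(1)
print(by_name.equals(by_position))      # True

# Unstacking 'A' moves it into the columns and leaves 'B' as the rows.
by_a = df.unstack('A')
print(by_a.index.tolist())              # ['one', 'two']
print(by_a.columns.tolist())
# [('C', 'bar'), ('C', 'foo'), ('D', 'bar'), ('D', 'foo')]
print(by_a.loc['two', ('D', 'foo')])    # 8
```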

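The pivot operation

The pivot operation reshapes a long table into a matrix-like layout. It accepts three parameters: index and columns, which define the row and column labels of the result, and values, which names the column used to fill the cells.

The following example uses pivot to turn some rows of a DataFrame into columns: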

import pandas as pd

df = pd.DataFrame({'foo': ['three', 'three', 'one', 'one', 'one', 'two', 'two', 'three'],
                   'bar': ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H'],
                   'baz': [2, 4, 6, 8, 10, 12, 14, 16]})

pivoted = df.pivot(index='foo', columns='bar', values='baz')

print(pivoted)
# bar      A    B    C    D     E     F     G     H
# foo
# one    NaN  NaN  6.0  8.0  10.0   NaN   NaN   NaN
# three  2.0  4.0  NaN  NaN   NaN   NaN   NaN  16.0
# two    NaN  NaN  NaN  NaN   NaN  12.0  14.0   NaN

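In the example above, pivot spreads the 'bar' column across the columns of the result: the column labels are the unique values of 'bar', the row labels are the unique values of 'foo', and the cells are filled from 'baz'. Combinations of 'foo' and 'bar' that do not occur in the original data are filled with NaN, which is why the values are shown as floats.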
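The two operations are closely related. As a rough sketch, the same table can be obtained by setting 'foo' and 'bar' as a two-level index and unstacking 'bar'. The name `via_unstack` is introduced here only for the comparison.

```python
import pandas as pd

df = pd.DataFrame({'foo': ['three', 'three', 'one', 'one', 'one', 'two', 'two', 'three'],
                   'bar': ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H'],
                   'baz': [2, 4, 6, 8, 10, 12, 14, 16]})

pivoted = df.pivot(index='foo', columns='bar', values='baz')

# Build the (foo, bar) index by hand, then rotate 'bar' into the columns.
via_unstack = df.set_index(['foo', 'bar'])['baz'].unstack('bar')

print(pivoted.equals(via_unstack))       # True
print(pivoted.loc['one', 'C'])           # 6.0
print(int(pivoted.isna().sum().sum()))   # 16 empty cells out of 24
```

pivot is the more convenient spelling when the labels live in ordinary columns, while unstack is the natural choice when the data already carries a multi-level index.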