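Display a Pandas DataFrame in Heatmap Style
The Pandas library for Python is widely used because it provides a range of data structures, along with many operations for numerical and time-series data. Displaying a Pandas DataFrame in heatmap style gives the user a visual representation of the numeric data. It provides an overview of the whole DataFrame, which makes the key points in the data much easier to spot.
A heatmap is a two-dimensional, matrix-like graphic that visualizes numeric data as cells. Each cell is colored, and the shade of the color represents how the cell's value relates to the other values in the DataFrame. The following are some ways to display a Pandas DataFrame in heatmap style.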
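Consider this DataFrame as an example: a 4x4 table of integers with rows labeled 1 to 4 and columns labeled A to D.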
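For reference, the example DataFrame can be built as in the minimal sketch below. The values are the ones used by every method in this article; printing the DataFrame gives the plain tabular view that the heatmaps restyle.

```python
import pandas as pd

# Row labels and column labels of the example DataFrame
idx = ['1', '2', '3', '4']
cols = list('ABCD')

# The 4x4 table of values used throughout this article
df = pd.DataFrame([[10, 20, 30, 40],
                   [50, 30, 8, 15],
                   [25, 14, 41, 8],
                   [7, 14, 21, 28]],
                  columns=cols, index=idx)

# Plain, unstyled view of the data
print(df)
```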
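Method 1: Using the Pandas library.
In this method, the Pandas library is used both to create the DataFrame and to render the heatmap. Each cell of the heatmap shows the corresponding value from the DataFrame. The implementation is given below.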
# Python program to generate a heatmap
# which displays the value in each cell
# corresponding to the given dataframe

# import required libraries
# (background_gradient also needs matplotlib installed)
import pandas as pd

# defining index for the dataframe
idx = ['1', '2', '3', '4']

# defining columns for the dataframe
cols = list('ABCD')

# entering values in the index and columns
# and converting them into a pandas dataframe
df = pd.DataFrame([[10, 20, 30, 40], [50, 30, 8, 15],
                   [25, 14, 41, 8], [7, 14, 21, 28]],
                  columns=cols, index=idx)

# displaying the dataframe as a heatmap
# with the sequential colourmap viridis;
# the styled table renders in a Jupyter notebook
df.style.background_gradient(cmap='viridis')\
    .set_properties(**{'font-size': '20px'})
Output:
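Method 2: Using the matplotlib library.
In this method, the Pandas DataFrame is displayed as a heatmap whose cells are color-coded according to the values in the DataFrame. A color bar appears alongside the heatmap and serves as the legend for the figure. The implementation is given below.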
# Python program to generate a heatmap
# which represents a pandas dataframe
# in colour coding schemes

# import required libraries
import matplotlib.pyplot as plt
import pandas as pd

# Defining index for the dataframe
idx = ['1', '2', '3', '4']

# Defining columns for the dataframe
cols = list('ABCD')

# Entering values in the index and columns
# and converting them into a pandas dataframe
df = pd.DataFrame([[10, 20, 30, 40], [50, 30, 8, 15],
                   [25, 14, 41, 8], [7, 14, 21, 28]],
                  columns=cols, index=idx)

# Displaying dataframe as a heatmap
# with diverging colourmap as RdYlBu
plt.imshow(df, cmap="RdYlBu")

# Displaying a color bar to understand
# which color represents which range of data
plt.colorbar()

# Assigning labels of x-axis
# according to the dataframe columns
plt.xticks(range(len(df.columns)), df.columns)

# Assigning labels of y-axis
# according to the dataframe index
plt.yticks(range(len(df.index)), df.index)

# Displaying the figure
plt.show()
Output:
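The matplotlib heatmap in Method 2 shows colors only. One possible way to also print each value inside its cell is `Axes.text`, as in the sketch below. The loop and the variable names (`fig`, `ax`, `im`) are illustrative choices, not part of the original example.

```python
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame([[10, 20, 30, 40],
                   [50, 30, 8, 15],
                   [25, 14, 41, 8],
                   [7, 14, 21, 28]],
                  columns=list('ABCD'), index=['1', '2', '3', '4'])

fig, ax = plt.subplots()

# Draw the heatmap and attach a color bar to it
im = ax.imshow(df, cmap='RdYlBu')
fig.colorbar(im, ax=ax)

# Label the ticks with the DataFrame's column and index labels
ax.set_xticks(range(len(df.columns)))
ax.set_xticklabels(df.columns)
ax.set_yticks(range(len(df.index)))
ax.set_yticklabels(df.index)

# Write each value at the center of its cell
# (x is the column position, y is the row position)
for row in range(df.shape[0]):
    for col in range(df.shape[1]):
        ax.text(col, row, str(df.iloc[row, col]),
                ha='center', va='center')

# Call plt.show() to display the figure in an interactive session
```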
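Method 3: Using the Seaborn library.
In this method, a heatmap is generated from the Pandas DataFrame; its cells contain the corresponding DataFrame values and are color-coded. A color bar appears alongside the heatmap and serves as the legend for the figure. The implementation is given below.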
# Python program to generate heatmap which
# represents a pandas dataframe in color-coding schemes
# along with values mentioned in each cell

# import required libraries
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# In a Jupyter notebook, also run the magic command:
# %matplotlib inline

# Defining figure size
# for the output plot
fig, ax = plt.subplots(figsize=(12, 7))

# Defining index for the dataframe
idx = ['1', '2', '3', '4']

# Defining columns for the dataframe
cols = list('ABCD')

# Entering values in the index and columns
# and converting them into a pandas dataframe
df = pd.DataFrame([[10, 20, 30, 40], [50, 30, 8, 15],
                   [25, 14, 41, 8], [7, 14, 21, 28]],
                  columns=cols, index=idx)

# Displaying dataframe as a heatmap
# with diverging colourmap as RdYlGn
sns.heatmap(df, cmap='RdYlGn', linewidths=0.30, annot=True, ax=ax)
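Output:
If the top and bottom rows of the output plot are not drawn at their full height (a cropping issue known to occur with matplotlib 3.1.1), add the following two lines after the last line of the code above.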
bottom, top = ax.get_ylim()
ax.set_ylim(bottom + 0.5, top - 0.5)
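As a small variation on Method 3, the sketch below shows a few `sns.heatmap` keyword arguments that control the annotations and the color bar. The choice of `fmt='d'`, the `vmin`/`vmax` limits, and the color-bar label text are illustrative assumptions, not part of the original example.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

df = pd.DataFrame([[10, 20, 30, 40],
                   [50, 30, 8, 15],
                   [25, 14, 41, 8],
                   [7, 14, 21, 28]],
                  columns=list('ABCD'), index=['1', '2', '3', '4'])

fig, ax = plt.subplots(figsize=(8, 5))

sns.heatmap(df,
            cmap='RdYlGn',
            annot=True,                   # write the value inside each cell
            fmt='d',                      # format annotations as integers
            vmin=0, vmax=50,              # fix the color scale (assumed limits)
            linewidths=0.3,               # thin gaps between cells
            cbar_kws={'label': 'value'},  # hypothetical color-bar label
            ax=ax)                        # draw on the axes created above
```

Fixing `vmin` and `vmax` keeps the color scale identical across several heatmaps, which makes them easier to compare side by side.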
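Method 4: Generating a correlation matrix using the Pandas library.
A correlation matrix is a special kind of heatmap that reveals some insights about the DataFrame. Its cells show correlation coefficients, which measure the strength of the linear relationship between pairs of variables (columns) in the DataFrame. In this method, only the Pandas library is used to generate the correlation matrix. The implementation is given below.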
# Python program to generate heatmap
# which represents correlation between
# columns of a pandas dataframe

# import required libraries
import pandas as pd

# Defining index for the dataframe
idx = ['1', '2', '3', '4']

# Defining columns for the dataframe
cols = list('ABCD')

# Entering values in the index and columns
# and converting them into a pandas dataframe
df = pd.DataFrame([[10, 20, 30, 40], [50, 30, 8, 15],
                   [25, 14, 41, 8], [7, 14, 21, 28]],
                  columns=cols, index=idx)

# generating pairwise correlation between columns
corr = df.corr()

# Displaying the correlation matrix as a heatmap
# with diverging colourmap as coolwarm
corr.style.background_gradient(cmap='coolwarm')
Output:
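To make the correlation coefficient in Method 4 concrete, the sketch below computes the Pearson coefficient between columns A and B by hand and compares it with the value from `df.corr()`, which uses the Pearson method by default. The helper variable names (`a_c`, `b_c`, `r_manual`, `r_pandas`) are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([[10, 20, 30, 40],
                   [50, 30, 8, 15],
                   [25, 14, 41, 8],
                   [7, 14, 21, 28]],
                  columns=list('ABCD'), index=['1', '2', '3', '4'])

a = df['A'].to_numpy(dtype=float)
b = df['B'].to_numpy(dtype=float)

# Center each column on its mean
a_c = a - a.mean()
b_c = b - b.mean()

# Pearson r: sum of products of deviations divided by the
# square root of the product of the sums of squared deviations
r_manual = (a_c * b_c).sum() / np.sqrt((a_c ** 2).sum() * (b_c ** 2).sum())

# The same coefficient as computed by pandas
r_pandas = df.corr().loc['A', 'B']

print(r_manual, r_pandas)
```

The coefficient always lies between -1 and 1, and the diagonal of the correlation matrix is 1 because every column is perfectly correlated with itself.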
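Method 5: Generating a correlation matrix using the Seaborn library.
A correlation matrix can also be generated with the Seaborn library. The cells of the resulting heatmap contain the correlation coefficients, but the annotated values are rounded, unlike in the heatmap produced by the Pandas library. The implementation is given below.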
# Python program to generate a heatmap
# which represents the correlation between
# columns of a pandas dataframe

# import required libraries
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Defining figure size
# for the output plot
fig, ax = plt.subplots(figsize=(12, 7))

# Defining index for the dataframe
idx = ['1', '2', '3', '4']

# Defining columns for the dataframe
cols = list('ABCD')

# Entering values in the index and columns
# and converting them into a pandas dataframe
df = pd.DataFrame([[10, 20, 30, 40], [50, 30, 8, 15],
                   [25, 14, 41, 8], [7, 14, 21, 28]],
                  columns=cols, index=idx)

# Generating pairwise correlation between columns
corr = df.corr()

# Displaying the correlation matrix as a heatmap
# with the coefficient written in each cell
sns.heatmap(corr, annot=True, ax=ax)
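Output:
If the top and bottom rows of the output plot are not drawn at their full height, add the following two lines after the last line of the code above.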
bottom, top = ax.get_ylim()
ax.set_ylim(bottom + 0.5, top - 0.5)
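Because a correlation matrix is symmetric, half of the heatmap in Method 5 is redundant. The sketch below hides the cells above the diagonal using the `mask` parameter of `sns.heatmap` and sets the number of displayed decimals with `fmt`. The colormap, the `vmin`/`vmax` limits, and the two-decimal format are illustrative assumptions.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

df = pd.DataFrame([[10, 20, 30, 40],
                   [50, 30, 8, 15],
                   [25, 14, 41, 8],
                   [7, 14, 21, 28]],
                  columns=list('ABCD'), index=['1', '2', '3', '4'])

corr = df.corr()

# True marks the cells to hide: everything strictly above the diagonal
mask = np.triu(np.ones(corr.shape, dtype=bool), k=1)

fig, ax = plt.subplots(figsize=(8, 6))

sns.heatmap(corr,
            mask=mask,          # skip the redundant upper triangle
            annot=True,
            fmt='.2f',          # show two decimals in each cell
            cmap='coolwarm',
            vmin=-1, vmax=1,    # correlation coefficients lie in [-1, 1]
            ax=ax)
```

Pinning the color scale to the full range from -1 to 1 keeps the midpoint of the diverging colormap at zero, so positive and negative correlations are easy to tell apart.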