How to Get Descriptive Statistics for a Pandas DataFrame
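The describe() method in Pandas computes descriptive statistics: count, mean, standard deviation, minimum, maximum, and percentiles for numeric data, and count, unique, top, and freq for non-numeric data. This article shows how to get descriptive statistics for a Pandas DataFrame.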
Syntax:
df['cname'].describe(percentiles = None, include = None, exclude = None)
df.describe(percentiles = None, include = None, exclude = None)
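Parameters:
percentiles: the percentiles to include in the output, given as numbers between 0 and 1. The default is [0.25, 0.5, 0.75].
include: the data types to include in the result. With the default, None, a DataFrame that contains numeric columns is summarized on those columns only; 'all' summarizes every column. Ignored for a Series.
exclude: the data types to leave out of the result. With the default, None, nothing is excluded. Ignored for a Series.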
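The parameters above can be tried out with a small sketch like the following. It reuses the cart data from the examples in this article; the variable names custom, numeric_only, and non_numeric are chosen here for illustration only, and np.number is used as the dtype selector for numeric columns.

```python
import numpy as np
from pandas import DataFrame

# Same sample data as the examples in this article
df = DataFrame({
    'Product': ['Mobile', 'AC', 'Mobile', 'Sofa', 'Laptop'],
    'Price': [20000, 28000, 22000, 19000, 45000],
    'Year': [2014, 2015, 2016, 2017, 2018],
})

# Request the 10th, 50th, and 90th percentiles instead of the defaults
custom = df['Price'].describe(percentiles=[0.1, 0.5, 0.9])
print(custom)

# Keep only the numeric columns (Price and Year)
numeric_only = df.describe(include=[np.number])
print(numeric_only)

# Leave out the numeric columns, so only Product is summarized
non_numeric = df.describe(exclude=[np.number])
print(non_numeric)
```

Passing a list of dtypes keeps the call explicit about which columns are summarized, which can help when a DataFrame mixes text and numbers.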
Example 1:
# Import package
from pandas import DataFrame
# Create DataFrame
cart = {'Product': ['Mobile', 'AC', 'Mobile', 'Sofa', 'Laptop'],
        'Price': [20000, 28000, 22000, 19000, 45000],
        'Year': [2014, 2015, 2016, 2017, 2018]
        }
df = DataFrame(cart, columns = ['Product', 'Price', 'Year'])
# Original DataFrame
print("Original DataFrame:\n", df)
# Describing descriptive statistics of Price
print("\nDescriptive statistics of Price:\n")
stats = df['Price'].describe()
print(stats)
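Output: the original DataFrame, followed by a Series with the count, mean, std, min, 25%, 50%, 75%, and max of Price.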
Example 2:
# Import package
from pandas import DataFrame
# Create DataFrame
cart = {'Product': ['Mobile', 'AC', 'Mobile', 'Sofa', 'Laptop'],
        'Price': [20000, 28000, 22000, 19000, 45000],
        'Year': [2014, 2015, 2016, 2017, 2018]
        }
df = DataFrame(cart, columns = ['Product', 'Price', 'Year'])
# Original DataFrame
print("Original DataFrame:\n", df)
# Describing descriptive statistics of Year
print("\nDescriptive statistics of year:\n")
stats = df['Year'].describe()
print(stats)
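Output: the original DataFrame, followed by a Series with the count, mean, std, min, 25%, 50%, 75%, and max of Year.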
Example 3:
# Import package
from pandas import DataFrame
# Create DataFrame
cart = {'Product': ['Mobile', 'AC', 'Mobile', 'Sofa', 'Laptop'],
        'Price': [20000, 28000, 22000, 19000, 45000],
        'Year': [2014, 2015, 2016, 2017, 2018]
        }
df = DataFrame(cart, columns = ['Product', 'Price', 'Year'])
# Original DataFrame
print("Original DataFrame:\n", df)
# Describing descriptive statistics of whole dataframe
print("\nDescriptive statistics of whole dataframe:\n")
stats = df.describe(include = 'all')
print(stats)
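Output: the original DataFrame, followed by a table with the rows count, unique, top, freq, mean, std, min, 25%, 50%, 75%, and max for every column. Statistics that do not apply to a column's data type appear as NaN.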
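To see how the result of include = 'all' is laid out, the sketch below reads individual cells from it. It assumes the same cart data as Example 3; the lookups with .loc and the transposed view are illustrative additions, not part of the original example.

```python
import pandas as pd
from pandas import DataFrame

# Same sample data as Example 3
df = DataFrame({
    'Product': ['Mobile', 'AC', 'Mobile', 'Sofa', 'Laptop'],
    'Price': [20000, 28000, 22000, 19000, 45000],
    'Year': [2014, 2015, 2016, 2017, 2018],
})

stats = df.describe(include='all')

# unique, top, and freq are filled in only for the text column
print(stats.loc[['unique', 'top', 'freq'], 'Product'])

# Numeric statistics are NaN for the text column, and vice versa
print(pd.isna(stats.loc['mean', 'Product']))
print(pd.isna(stats.loc['unique', 'Price']))

# Transposing gives one row per column, which can be easier to scan
print(stats.T)
```

The result is an ordinary DataFrame indexed by statistic name, so any cell can be selected with .loc[row, column].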
Example 4:
In this example, let us print each descriptive statistic separately.
from pandas import DataFrame
# Create DataFrame
cart = {'Product': ['Mobile', 'AC', 'Mobile', 'Sofa', 'Laptop'],
        'Price': [20000, 28000, 22000, 19000, 45000],
        'Year': [2014, 2015, 2016, 2017, 2018]
        }
df = DataFrame(cart, columns = ['Product', 'Price', 'Year'])
# Original DataFrame
print("Original DataFrame:\n", df)
# Print Count of Price
print("\nCount of Price:\n")
counts = df['Price'].count()
print(counts)
# Print mean of Price
print("\nMean of Price:\n")
m = df['Price'].mean()
print(m)
# Print maximum value of Price
print("\nMaximum value of Price:\n")
mx = df['Price'].max()
print(mx)
# Print standard deviation of Price
print("\nStandard deviation of Price:\n")
sd = df['Price'].std()
print(sd)
Output: the original DataFrame, followed by each statistic of Price printed under its own label.
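As an alternative to calling count(), mean(), max(), and std() one by one, the same values can be read from a single describe() result, since it is a Series indexed by statistic name. The sketch below assumes the same cart data; the use of agg() and the variable name selected are illustrative additions.

```python
from pandas import DataFrame

# Same sample data as Example 4
df = DataFrame({
    'Product': ['Mobile', 'AC', 'Mobile', 'Sofa', 'Laptop'],
    'Price': [20000, 28000, 22000, 19000, 45000],
    'Year': [2014, 2015, 2016, 2017, 2018],
})

stats = df['Price'].describe()

# Each statistic is available under its label
print("Count:", stats['count'])
print("Mean:", stats['mean'])
print("Max:", stats['max'])
print("Std:", stats['std'])

# agg() computes only the statistics that are named
selected = df['Price'].agg(['count', 'mean', 'max', 'std'])
print(selected)
```

Reading from describe() computes everything once, while agg() is a reasonable choice when only a few statistics are needed.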