How to Calculate Cramer's V in Python

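Cramer's V: a measure of the strength of association between two nominal variables. A nominal variable is measured on a scale that sorts observations into unordered categories. Cramer's V lies between 0 and 1, inclusive. A value of 0 means the two variables are not associated at all, and a value of 1 means they are perfectly associated. Cramer's V can be computed with the following formula.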

$$V = \sqrt{\frac{X^2 / N}{\min(C - 1,\ R - 1)}}$$

Here,

X^2: the chi-square statistic
N: the total sample size
R: the number of rows
C: the number of columns
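To make the formula concrete, here is a minimal sketch that computes the chi-square statistic by hand with NumPy and then applies the formula. The function name `cramers_v` is a hypothetical helper introduced for illustration, and the sketch assumes every row and column total is non-zero so that no expected count is zero.

```python
import numpy as np

def cramers_v(table):
    """Cramer's V for a 2-D table of counts (no continuity correction)."""
    observed = np.asarray(table, dtype=float)
    n = observed.sum()
    # Expected counts under independence: (row total * column total) / N
    expected = np.outer(observed.sum(axis=1), observed.sum(axis=0)) / n
    # Pearson chi-square statistic
    chi2 = ((observed - expected) ** 2 / expected).sum()
    rows, cols = observed.shape
    return np.sqrt((chi2 / n) / min(rows - 1, cols - 1))

# Perfect association: each row category maps to exactly one column category
print(cramers_v([[10, 0], [0, 10]]))    # 1.0
# No association: every row has the same column proportions
print(cramers_v([[10, 20], [20, 40]]))  # 0.0
```

The two toy tables hit the endpoints of the range: the diagonal table gives 1 and the proportional table gives 0.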

Example 1:

Let us calculate Cramer's V for a 3×3 table.

# Load necessary packages and functions
import scipy.stats as stats
import numpy as np
  
# Make a 3 x 3 table
dataset = np.array([[13, 17, 11], [4, 6, 9],
                    [20, 31, 42]])
  
# Finding Chi-squared test statistic,
# sample size, and minimum of rows
# and columns
X2 = stats.chi2_contingency(dataset, correction=False)[0]
N = np.sum(dataset)
minimum_dimension = min(dataset.shape)-1
  
# Calculate Cramer's V
result = np.sqrt((X2/N) / minimum_dimension)
  
# Print the result
print(result)

Output:


Cramer's V comes out to approximately 0.12, which indicates a weak association between the two variables in the table.
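The examples here start from a table of counts that has already been tabulated. When the data instead arrives as raw pairs of category labels, the table has to be built first. The following is a minimal sketch of one way to do that with NumPy; the `colors` and `sizes` arrays are made-up illustration data, not part of the examples in this article.

```python
import numpy as np

# Hypothetical raw observations: one (color, size) pair per item
colors = np.array(["red", "blue", "red", "green", "blue", "red"])
sizes = np.array(["S", "M", "S", "L", "M", "L"])

# Map each label to an integer index (labels come back sorted)
row_labels, row_idx = np.unique(colors, return_inverse=True)
col_labels, col_idx = np.unique(sizes, return_inverse=True)

# Count how often each (row, column) combination occurs
table = np.zeros((len(row_labels), len(col_labels)), dtype=int)
np.add.at(table, (row_idx, col_idx), 1)

print(row_labels)  # row order of the table
print(col_labels)  # column order of the table
print(table)
```

The resulting `table` can be passed to `stats.chi2_contingency` in the same way as the hard-coded arrays.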

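Example 2:

Now we will calculate Cramer's V for a larger table whose dimensions are unequal. For the 5×4 table below, Cramer's V comes out to approximately 0.14, which again indicates a weak association between the two variables in the table.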

# Load necessary packages and functions
import scipy.stats as stats
import numpy as np
  
# Make a 5 x 4 table
dataset = np.array([[4, 13, 17, 11], [4, 6, 9, 12],
                    [2, 7, 4, 2], [5, 13, 10, 12],
                    [5, 6, 14, 12]])
  
# Finding Chi-squared test statistic, 
# sample size, and minimum of rows and
# columns
X2 = stats.chi2_contingency(dataset, correction=False)[0]
N = np.sum(dataset)
minimum_dimension = min(dataset.shape)-1
  
# Calculate Cramer's V
result = np.sqrt((X2/N) / minimum_dimension)
  
# Print the result
print(result)
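As a cross-check, the manual calculation can be compared with SciPy's built-in helper. This is a sketch that assumes SciPy 1.7 or later, where `scipy.stats.contingency.association` is available; with `method="cramer"` it returns Cramer's V and applies no continuity correction by default.

```python
import numpy as np
from scipy.stats import chi2_contingency
from scipy.stats.contingency import association

# Same 5 x 4 table as in Example 2
dataset = np.array([[4, 13, 17, 11], [4, 6, 9, 12],
                    [2, 7, 4, 2], [5, 13, 10, 12],
                    [5, 6, 14, 12]])

# Manual calculation, as in the examples
x2 = chi2_contingency(dataset, correction=False)[0]
manual = np.sqrt((x2 / dataset.sum()) / (min(dataset.shape) - 1))

# Built-in calculation (requires SciPy >= 1.7)
builtin = association(dataset, method="cramer")

print(manual, builtin)
print(np.isclose(manual, builtin))
```

If the two values agree, the hand-written formula matches SciPy's definition for this table.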