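How to Use the dist() Function in R
In this article, we look at how to use the dist() function in the R programming language.
R provides a built-in dist() function (part of the stats package) that computes one of six different distances between every unique pair of rows of a numeric matrix. dist() takes a numeric matrix as its argument, together with a method that names the type of distance to measure. The method must be one of "euclidean", "maximum", "manhattan", "canberra", "binary" or "minkowski"; if it is omitted, "euclidean" is used. It accepts further arguments as well, but these are optional.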
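Syntax:
dist(vect, method = " ", diag = TRUE or FALSE, upper = TRUE or FALSE)
Parameters
- vect: a numeric matrix (or data frame) in which each row is one vector
- method: the distance to measure. It must be one of "euclidean", "maximum", "manhattan", "canberra", "binary" or "minkowski"
- diag: a logical value (TRUE or FALSE) indicating whether the diagonal of the distance matrix should be printed by print.dist
- upper: a logical value (TRUE or FALSE) indicating whether the upper triangle of the distance matrix should be printed by print.dist
Return type
It returns an object of class "dist".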
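As a minimal sketch of the call and its return value (the matrix `m` and its row names are made up for illustration), the snippet below builds a small matrix, calls dist() with its defaults and inspects the result:

```r
# Hypothetical 3 x 2 matrix; each row is one vector to compare
m <- rbind(a = c(0, 0), b = c(3, 4), c = c(6, 8))

# Defaults: method = "euclidean", diag = FALSE, upper = FALSE
d <- dist(m)

class(d)       # "dist"
length(d)      # 3 unique pairs: a-b, a-c, b-c
as.matrix(d)   # full symmetric matrix with zeros on the diagonal
```

A "dist" object stores only the lower triangle, so as.matrix() is a convenient way to look up a single pair by row name, for example as.matrix(d)["a", "c"].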
Now let's see how to use the dist() function to compute each of these distances.
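Euclidean distance
The Euclidean distance between two points in Euclidean space is the length of the line segment between them. It can be calculated from the Cartesian coordinates of the points using the Pythagorean theorem, which is why it is occasionally called the Pythagorean distance.
For example, in a two-dimensional space with two points Point1 (x1, y1) and Point2 (x2, y2), the Euclidean distance is given by $\sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2}$.
The Euclidean distance between two vectors is given by the following formula.
$\sqrt{\sum_i (\mathrm{vect1}_i - \mathrm{vect2}_i)^2}$
where:
- vect1 is the first vector
- vect2 is the second vector
For example, given two vectors, vect1 as (2, 1, 5, 8) and vect2 as (1, 2, 4, 9), their Euclidean distance is $\sqrt{(2-1)^2 + (1-2)^2 + (5-4)^2 + (8-9)^2} = \sqrt{4}$, which equals 2.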
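The worked example above can be checked in a few lines. This is only a sketch: `manual` and `viaDist` are names chosen here for illustration, and the two vectors are the ones from the example.

```r
vect1 <- c(2, 1, 5, 8)
vect2 <- c(1, 2, 4, 9)

# Direct translation of the formula: square root of the summed squared differences
manual <- sqrt(sum((vect1 - vect2)^2))

# The same value from dist(); rbind() stacks the two vectors as rows
viaDist <- as.numeric(dist(rbind(vect1, vect2), method = "euclidean"))

manual    # 2
viaDist   # 2
```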
Syntax:
dist(vect, method = "euclidean", diag = TRUE or FALSE, upper = TRUE or FALSE)
Example: Euclidean distance
# R program to illustrate how to calculate
# euclidean distance using dist() function
# Initializing a vector
vect1 <- c(1, 4, 8, 9, 2, 3)
# Initializing another vector
vect2 <- c(9, 4, 1, 2, 4, 7)
# Initializing another vector
vect3 <- c(1, 7, 9, 3, 2, 8)
# Initializing another vector
vect4 <- c(2, 1, 4, 7, 8, 9)
# Initializing another vector
vect5 <- c(1, 4, 8, 3, 9, 2)
# Initializing another vector
vect6 <- c(3, 7, 8, 6, 5, 9)
# Row-bind the vectors into a single matrix (one vector per row)
twoDimensionalVect <- rbind(vect1, vect2, vect3,
                            vect4, vect5, vect6)
print("Euclidean distance between each pair of vectors is: ")
cat("\n\n")
# Compute the Euclidean distance between every unique pair of rows
# by passing the matrix to dist() with method = "euclidean";
# diag = TRUE and upper = TRUE print the full symmetric matrix
dist(twoDimensionalVect, method = "euclidean", diag = TRUE, upper = TRUE)
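Output
Manhattan distance
The Manhattan distance is a distance measure between two points in an N-dimensional vector space. It is defined as the sum of the absolute differences between the coordinates in each dimension. For example, in a two-dimensional space with two points Point1 (x1, y1) and Point2 (x2, y2), the Manhattan distance is given by $|x_1 - x_2| + |y_1 - y_2|$.
In R, the Manhattan distance is calculated between vectors. The Manhattan distance between two vectors is given by the following formula.
$\sum_i |\mathrm{vect1}_i - \mathrm{vect2}_i|$
where:
- vect1 is the first vector
- vect2 is the second vector
For example, given two vectors, vect1 as (3, 6, 8, 9) and vect2 as (1, 7, 8, 10), their Manhattan distance is $|3-1| + |6-7| + |8-8| + |9-10| = 2 + 1 + 0 + 1$, which equals 4.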
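The same arithmetic can be reproduced in R. This is a small sketch using the two example vectors; `manual` and `viaDist` are illustrative names.

```r
vect1 <- c(3, 6, 8, 9)
vect2 <- c(1, 7, 8, 10)

# Sum of absolute coordinate differences, exactly as in the formula
manual <- sum(abs(vect1 - vect2))

# The same value computed by dist()
viaDist <- as.numeric(dist(rbind(vect1, vect2), method = "manhattan"))

manual    # 4
viaDist   # 4
```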
Syntax:
dist(vect, method = "manhattan", diag = TRUE or FALSE, upper = TRUE or FALSE)
Example: Manhattan distance
# R program to illustrate how to calculate
# Manhattan distance
# using dist() function
# Initializing a vector
vect1 <- c(1, 4, 8, 9, 2, 3)
# Initializing another vector
vect2 <- c(9, 4, 1, 2, 4, 7)
# Initializing another vector
vect3 <- c(1, 7, 9, 3, 2, 8)
# Initializing another vector
vect4 <- c(2, 1, 4, 7, 8, 9)
# Initializing another vector
vect5 <- c(1, 4, 8, 3, 9, 2)
# Initializing another vector
vect6 <- c(3, 7, 8, 6, 5, 9)
# Row-bind the vectors into a single matrix (one vector per row)
twoDimensionalVect <- rbind(vect1, vect2, vect3,
                            vect4, vect5, vect6)
print("Manhattan distance between each pair of vectors is: ")
cat("\n\n")
# Compute the Manhattan distance between every unique pair of rows
# by passing the matrix to dist() with method = "manhattan";
# diag = TRUE and upper = TRUE print the full symmetric matrix
dist(twoDimensionalVect, method = "manhattan", diag = TRUE, upper = TRUE)
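Output
Maximum distance
The maximum distance between two vectors A and B is the largest absolute difference between any pair of corresponding elements (it is also known as the Chebyshev or supremum distance). In R, the maximum distance is calculated between vectors. The maximum distance between two vectors is given by the following formula.
$\max_i |\mathrm{vect1}_i - \mathrm{vect2}_i|$
where:
- vect1 is the first vector
- vect2 is the second vector
For example, given two vectors, vect1 as (3, 6, 8, 9) and vect2 as (1, 8, 9, 10), their maximum distance is given by max(|3-1|, |6-8|, |8-9|, |9-10|) = max(2, 2, 1, 1), which equals 2.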
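A short sketch that checks this example against dist(); `manual` and `viaDist` are illustrative names.

```r
vect1 <- c(3, 6, 8, 9)
vect2 <- c(1, 8, 9, 10)

# Largest absolute difference between corresponding elements
manual <- max(abs(vect1 - vect2))

# The same value computed by dist()
viaDist <- as.numeric(dist(rbind(vect1, vect2), method = "maximum"))

manual    # 2
viaDist   # 2
```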
Syntax:
dist(vect, method = "maximum", diag = TRUE or FALSE, upper = TRUE or FALSE)
Example: Maximum distance
# R program to illustrate how to calculate Maximum distance
# using dist() function
# Initializing a vector
vect1 <- c(1, 4, 8, 9, 2, 3)
# Initializing another vector
vect2 <- c(9, 4, 1, 2, 4, 7)
# Initializing another vector
vect3 <- c(1, 7, 9, 3, 2, 8)
# Initializing another vector
vect4 <- c(2, 1, 4, 7, 8, 9)
# Initializing another vector
vect5 <- c(1, 4, 8, 3, 9, 2)
# Initializing another vector
vect6 <- c(3, 7, 8, 6, 5, 9)
# Row-bind the vectors into a single matrix (one vector per row)
twoDimensionalVect <- rbind(vect1, vect2, vect3, vect4, vect5, vect6)
print("Maximum distance between each pair of vectors is: ")
cat("\n\n")
# Compute the maximum distance between every unique pair of rows
# by passing the matrix to dist() with method = "maximum";
# diag = TRUE and upper = TRUE print the full symmetric matrix
dist(twoDimensionalVect, method = "maximum", diag = TRUE, upper = TRUE)
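Output
Canberra distance
The Canberra distance is a numerical measure of the distance between pairs of points in a vector space, in which each coordinate difference is scaled by the magnitudes of the two coordinates. In R, the Canberra distance is calculated between vectors. The Canberra distance between two vectors is given by the following formula.
$\sum_i \frac{|\mathrm{vect1}_i - \mathrm{vect2}_i|}{|\mathrm{vect1}_i| + |\mathrm{vect2}_i|}$
where:
- vect1 is the first vector
- vect2 is the second vector
For example, given two vectors, vect1 as (2, 2, 7, 5) and vect2 as (3, 8, 3, 5), their Canberra distance is |2 - 3| / (2 + 3) + |2 - 8| / (2 + 8) + |7 - 3| / (7 + 3) + |5 - 5| / (5 + 5) = 0.2 + 0.6 + 0.4 + 0, which equals 1.2.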
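The example can be checked with a short sketch (`manual` and `viaDist` are illustrative names). All elements here are positive, so none of the terms has a zero denominator.

```r
vect1 <- c(2, 2, 7, 5)
vect2 <- c(3, 8, 3, 5)

# Each absolute difference is divided by the sum of the absolute values
manual <- sum(abs(vect1 - vect2) / (abs(vect1) + abs(vect2)))

# The same value computed by dist()
viaDist <- as.numeric(dist(rbind(vect1, vect2), method = "canberra"))

manual    # 1.2
viaDist   # 1.2
```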
Syntax:
dist(vect, method = "canberra", diag = TRUE or FALSE, upper = TRUE or FALSE)
Example: Canberra distance
# R program to illustrate how to calculate
# Canberra distance
# using dist() function
# Initializing a vector
vect1 <- c(1, 4, 8, 9, 2, 3)
# Initializing another vector
vect2 <- c(9, 4, 1, 2, 4, 7)
# Initializing another vector
vect3 <- c(1, 7, 9, 3, 2, 8)
# Initializing another vector
vect4 <- c(2, 1, 4, 7, 8, 9)
# Initializing another vector
vect5 <- c(1, 4, 8, 3, 9, 2)
# Initializing another vector
vect6 <- c(3, 7, 8, 6, 5, 9)
# Row-bind the vectors into a single matrix (one vector per row)
twoDimensionalVect <- rbind(vect1, vect2, vect3,
                            vect4, vect5, vect6)
print("Canberra distance between each pair of vectors is: ")
cat("\n\n")
# Compute the Canberra distance between every unique pair of rows
# by passing the matrix to dist() with method = "canberra";
# diag = TRUE and upper = TRUE print the full symmetric matrix
dist(twoDimensionalVect, method = "canberra", diag = TRUE, upper = TRUE)
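Output
Binary distance
The binary distance between two vectors A and B treats each vector as a sequence of binary bits: non-zero elements are "on" and zero elements are "off". The distance is the proportion of positions in which exactly one of the two vectors is on, among the positions in which at least one of them is on. In other words, it is (number of positions where only one of vect1 and vect2 is non-zero) / (number of positions where at least one of vect1 and vect2 is non-zero).
Here
- vect1 is the first vector
- vect2 is the second vector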
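A small sketch of how method = "binary" treats its input, with non-zero elements counted as "on" and zeros as "off". The two vectors and the names `on1`, `on2`, `manual` and `viaDist` are made up for illustration.

```r
# Hypothetical vectors containing zeros, so the "on"/"off" pattern differs
vect1 <- c(1, 0, 1, 0, 0)
vect2 <- c(1, 1, 0, 0, 0)

on1 <- vect1 != 0
on2 <- vect2 != 0

# Positions where exactly one is on, divided by positions where at least one is on
manual <- sum(xor(on1, on2)) / sum(on1 | on2)

# The same value computed by dist()
viaDist <- as.numeric(dist(rbind(vect1, vect2), method = "binary"))

manual    # 2/3: positions 2 and 3 differ, positions 1 to 3 have at least one "on"
viaDist   # 2/3
```

Because only the zero/non-zero pattern matters, two vectors with no zero elements always have a binary distance of 0, whatever their actual values.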
Syntax:
dist(vect, method = "binary", diag = TRUE or FALSE, upper = TRUE or FALSE)
Example: Binary distance
# R program to illustrate how to calculate
# Binary distance
# using dist() function
# Initializing a vector
vect1 <- c(1, 4, 8, 9, 2, 3)
# Initializing another vector
vect2 <- c(9, 4, 1, 2, 4, 7)
# Initializing another vector
vect3 <- c(1, 7, 9, 3, 2, 8)
# Initializing another vector
vect4 <- c(2, 1, 4, 7, 8, 9)
# Initializing another vector
vect5 <- c(1, 4, 8, 3, 9, 2)
# Initializing another vector
vect6 <- c(3, 7, 8, 6, 5, 9)
# Row-bind the vectors into a single matrix (one vector per row)
twoDimensionalVect <- rbind(vect1, vect2, vect3,
                            vect4, vect5, vect6)
print("Binary distance between each pair of vectors is: ")
cat("\n\n")
# Compute the binary distance between every unique pair of rows
# by passing the matrix to dist() with method = "binary".
# Note: every element of these vectors is non-zero, so all
# binary distances in this example are 0
dist(twoDimensionalVect, method = "binary", diag = TRUE, upper = TRUE)
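Output
Minkowski distance
The Minkowski distance is a distance measured between two points in an N-dimensional space. It is a generalization of the Euclidean and Manhattan distances. For example, in a two-dimensional space with two points Point1 (x1, y1) and Point2 (x2, y2), the Minkowski distance is given by $(|x_1 - x_2|^p + |y_1 - y_2|^p)^{1/p}$.
In R, the Minkowski distance is calculated between vectors. The Minkowski distance between two vectors is given by the following formula.
$\left(\sum_i |\mathrm{vect1}_i - \mathrm{vect2}_i|^p\right)^{1/p}$
where:
- vect1 is the first vector
- vect2 is the second vector
- p is the power, a positive number (p = 1 gives the Manhattan distance and p = 2 gives the Euclidean distance)
R's built-in dist() function computes the Minkowski distance between every pair of rows of a matrix.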
Syntax:
dist(vect, method = "minkowski", p = power, diag = TRUE or FALSE, upper = TRUE or FALSE)
For example, given two vectors, vect1 as (3, 6, 8, 9) and vect2 as (2, 7, 7, 10), their Minkowski distance with p = 2 is $(|3-2|^2 + |6-7|^2 + |8-7|^2 + |9-10|^2)^{1/2} = 4^{1/2}$, which equals 2.
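The sketch below checks this example and also shows how the result changes with p. The helper `minkowski()` is a hypothetical function written here for illustration; it is not part of R.

```r
vect1 <- c(3, 6, 8, 9)
vect2 <- c(2, 7, 7, 10)

# Hypothetical helper: direct translation of the Minkowski formula
minkowski <- function(a, b, p) sum(abs(a - b)^p)^(1 / p)

m <- rbind(vect1, vect2)

manualP2 <- minkowski(vect1, vect2, p = 2)
distP2   <- as.numeric(dist(m, method = "minkowski", p = 2))
distP1   <- as.numeric(dist(m, method = "minkowski", p = 1))

manualP2   # 2
distP2     # 2, same as method = "euclidean"
distP1     # 4, same as method = "manhattan"
```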
Example: Minkowski distance
# R program to illustrate how to calculate
# Minkowski distance
# using dist() function
# Initializing a vector
vect1 <- c(1, 4, 8, 9, 2, 3)
# Initializing another vector
vect2 <- c(9, 4, 1, 2, 4, 7)
# Initializing another vector
vect3 <- c(1, 7, 9, 3, 2, 8)
# Initializing another vector
vect4 <- c(2, 1, 4, 7, 8, 9)
# Initializing another vector
vect5 <- c(1, 4, 8, 3, 9, 2)
# Initializing another vector
vect6 <- c(3, 7, 8, 6, 5, 9)
# Row-bind the vectors into a single matrix (one vector per row)
twoDimensionalVect <- rbind(vect1, vect2, vect3,
                            vect4, vect5, vect6)
print("Minkowski distance between each pair of vectors is: ")
cat("\n\n")
# Compute the Minkowski distance between every unique pair of rows
# by passing the matrix to dist() with method = "minkowski" and the
# power p; diag = TRUE and upper = TRUE print the full symmetric matrix
dist(twoDimensionalVect, method = "minkowski", diag = TRUE, upper = TRUE, p = 2)
Output