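Python Broadcasting with NumPy Arrays
The term broadcasting describes how NumPy treats arrays of different shapes during arithmetic operations. Subject to certain constraints, the smaller array is "broadcast" across the larger one so that the two have compatible shapes.
Broadcasting provides a way to vectorize array operations so that looping happens in C rather than in Python (NumPy itself is implemented in C). It does this without making unnecessary copies of the data, which usually leads to efficient algorithm implementations. There are also cases where broadcasting is a bad idea, because it leads to inefficient memory use that slows the computation down.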
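The claim that no unnecessary copies are made can be checked directly. The sketch below is only an illustration, with variable names (row, view) chosen for this example: it uses np.broadcast_to to build the broadcast form of a small array and inspects its strides. A stride of 0 along an axis means every step along that axis re-reads the same memory.

```python
import numpy as np

# A small 1-D array of float64 values (8 bytes per element).
row = np.array([1.0, 2.0, 3.0])          # shape (3,)

# Ask NumPy for the broadcast form of `row` with shape (4, 3).
# The result is a read-only view, not a copy.
view = np.broadcast_to(row, (4, 3))

print(view.shape)                    # (4, 3)
print(view.strides)                  # (0, 8): stepping along axis 0 moves 0 bytes
print(np.shares_memory(view, row))   # True: both use the same buffer
```

Arithmetic such as row + matrix relies on the same mechanism internally, so the smaller operand is not physically replicated before the operation runs.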
Example:
import numpy as np
a = np.array([5, 7, 3, 1])
b = np.array([90, 50, 0, 30])
# The arrays are compatible because they have the same shape,
# so the multiplication is simply element-wise.
c = a * b
print(c)
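An example for deeper understanding:
Suppose we have a large data set in which each data point is a list of parameters. In NumPy this is a 2-D array where each row is a data point and the number of rows is the size of the data set. Suppose we want to scale all of this data, with each parameter getting its own scaling factor; in other words, every parameter is multiplied by some factor.
To make this concrete, let's count the calories in foods using a macronutrient breakdown. Roughly speaking, the caloric parts of food are fat (9 calories per gram), protein (4 calories per gram) and carbohydrates (4 calories per gram). So if we list some foods (our data) and, for each food, its macronutrient breakdown (the parameters), we can multiply each nutrient by its caloric value (apply the scaling) to compute the caloric breakdown of each food.
With this transformation we can compute all kinds of useful information, for example the total number of calories in a given food, or, given the breakdown of a dinner, how many calories came from protein.
Let's look at a naive way of producing this computation with NumPy: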
from numpy import array, zeros_like

# Each row is a food; the columns are fat, protein and carbohydrates.
macros = array([
    [0.8, 2.9, 3.9],
    [52.4, 23.6, 36.5],
    [55.2, 31.7, 23.9],
    [14.4, 11, 4.9]
])
# Create a new array filled with zeros,
# of the same shape as macros.
result = zeros_like(macros)
# Calories per gram of fat, protein and carbohydrates, as stated above.
cal_per_macro = array([9, 4, 4])
# Now multiply each row of macros by cal_per_macro.
# In NumPy, `*` is element-wise multiplication between two arrays.
for i in range(macros.shape[0]):
    result[i, :] = macros[i, :] * cal_per_macro
result
Output:
array([[  7.2,  11.6,  15.6],
       [471.6,  94.4, 146. ],
       [496.8, 126.8,  95.6],
       [129.6,  44. ,  19.6]])
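The explicit Python loop is not needed: multiplying the 2-D array by the 1-D array lets broadcasting apply the factors to every row. The following is a minimal sketch, assuming the per-gram values quoted in the text (9, 4, 4); the names loop_result, broadcast_result and total_calories are chosen for this illustration only.

```python
import numpy as np

# Rows are foods; columns are fat, protein and carbohydrates.
macros = np.array([
    [0.8, 2.9, 3.9],
    [52.4, 23.6, 36.5],
    [55.2, 31.7, 23.9],
    [14.4, 11, 4.9],
])
# Calories per gram of fat, protein, carbohydrates (values from the text).
cal_per_macro = np.array([9, 4, 4])

# Naive version: one Python-level iteration per row.
loop_result = np.zeros_like(macros)
for i in range(macros.shape[0]):
    loop_result[i, :] = macros[i, :] * cal_per_macro

# Broadcasting version: shape (4, 3) * shape (3,) gives shape (4, 3).
broadcast_result = macros * cal_per_macro

# Both approaches agree, and row sums give total calories per food.
print(np.allclose(loop_result, broadcast_result))
total_calories = broadcast_result.sum(axis=1)
print(total_calories)
```

The broadcast form expresses the same computation in a single expression and moves the per-row loop out of Python.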
Algorithm:
Input: array A with m dimensions and array B with n dimensions.
p = max(m, n)
if m < p:
    left-pad A's shape with 1s until it also has p dimensions
else if n < p:
    left-pad B's shape with 1s until it also has p dimensions
result_dims = new list with p elements
for i in p-1 ... 0:
    A_dim_i = A.shape[i]
    B_dim_i = B.shape[i]
    if A_dim_i != 1 and B_dim_i != 1 and A_dim_i != B_dim_i:
        raise ValueError("could not broadcast")
    else:
        # The sizes are equal or one of them is 1, so keep the larger size
        result_dims[i] = max(A_dim_i, B_dim_i)
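The pseudocode above can be turned into a small runnable function. This is a sketch for illustration, not NumPy's actual implementation; the name broadcast_shape is hypothetical, and the function works on shape tuples rather than on arrays. Its results are compared against the shape reported by np.broadcast.

```python
import numpy as np

def broadcast_shape(shape_a, shape_b):
    """Return the broadcast result shape, or raise ValueError (illustrative helper)."""
    p = max(len(shape_a), len(shape_b))
    # Left-pad the shorter shape with 1s so both have p dimensions.
    a = (1,) * (p - len(shape_a)) + tuple(shape_a)
    b = (1,) * (p - len(shape_b)) + tuple(shape_b)
    result = []
    for dim_a, dim_b in zip(a, b):
        if dim_a != 1 and dim_b != 1 and dim_a != dim_b:
            raise ValueError(f"could not broadcast {shape_a} with {shape_b}")
        # A size-1 dimension stretches to match the other operand.
        result.append(dim_b if dim_a == 1 else dim_a)
    return tuple(result)

print(broadcast_shape((4, 3), (3,)))   # (4, 3)
print(broadcast_shape((3, 1), (2,)))   # (3, 2)
print(broadcast_shape((2, 3), ()))     # (2, 3)

# Cross-check against NumPy itself.
reference = np.broadcast(np.empty((3, 1)), np.empty((2,))).shape
print(reference == broadcast_shape((3, 1), (2,)))
```

Because the dimension checks are independent of each other, the loop order does not affect the result; the sketch simply walks the padded shapes from left to right.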
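Broadcasting rules:
Broadcasting two arrays together follows these rules:
1. If the arrays do not have the same rank (number of dimensions), prepend 1s to the shape of the lower-rank array until both shapes have the same length.
2. Two arrays are compatible in a dimension if they have the same size in that dimension, or if one of them has size 1 in that dimension.
3. The arrays can be broadcast together if they are compatible in all dimensions.
4. After broadcasting, each array behaves as if its shape were the element-wise maximum of the shapes of the two input arrays.
5. In any dimension where one array has size 1 and the other has size greater than 1, the first array behaves as if it were copied along that dimension.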
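The rules can be observed on a small pair of arrays. The sketch below uses illustrative names (col, row, total); np.broadcast_arrays is used only to make the "behaves as if copied" rule visible, and the last part shows a pair of shapes that fails the compatibility rule.

```python
import numpy as np

col = np.array([[0], [10], [20]])   # shape (3, 1)
row = np.array([1, 2, 3, 4])        # shape (4,), treated as (1, 4) by rule 1

# Each dimension pairs a 1 with a larger size, so the shapes are compatible
# and the result has the element-wise maximum shape (3, 4).
total = col + row
print(total.shape)   # (3, 4)
print(total)

# Show what each operand "looks like" after broadcasting.
col_b, row_b = np.broadcast_arrays(col, row)
print(col_b.shape, row_b.shape)   # (3, 4) (3, 4)

# Shapes (3,) and (4,) differ in their only dimension and neither is 1,
# so NumPy refuses to broadcast them.
try:
    np.array([1, 2, 3]) + np.array([1, 2, 3, 4])
    incompatible = False
except ValueError:
    incompatible = True
print(incompatible)   # True
```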
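Example 1: one-dimensional array
import numpy as np
a = np.array([17, 11, 19])  # 1-D array of shape (3,)
print(a)
b = 3
print(b)
# Broadcasting happens here because the operands
# have different numbers of dimensions: the scalar b
# is stretched to match the shape of a.
c = a + b
print(c)
Output: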
[17 11 19]
3
[20 14 22]
Example 2: two-dimensional array
import numpy as np
A = np.array([[11, 22, 33], [10, 20, 30]])
print(A)
b = 4
print(b)
C = A + b
print(C)
Output:
[[11 22 33]
 [10 20 30]]
4
[[15 26 37]
 [14 24 34]]
Example 3:
import numpy as np
v = np.array([12, 24, 36])
w = np.array([45, 55])
# To compute an outer product, first reshape v into a column
# vector of shape (3, 1), then broadcast it against w to get
# an output of shape (3, 2): the outer product of v and w.
print(np.reshape(v, (3, 1)) * w)
x = np.array([[12, 22, 33], [45, 55, 66]])
# x has shape (2, 3) and v has shape (3,),
# so they broadcast to (2, 3): v is added to each row of x.
print(x + v)
# Add a vector to each column of a matrix. x has shape (2, 3)
# and w has shape (2,). Transposing x gives shape (3, 2), which
# broadcasts against w to give a result of shape (3, 2).
# Transposing that yields the final result of shape (2, 3).
print((x.T + w).T)
# Another solution is to reshape w into a column vector
# of shape (2, 1), which broadcasts directly against x
# to produce the same output.
print(x + np.reshape(w, (2, 1)))
# Multiply a matrix by a constant. x has shape (2, 3).
# NumPy treats scalars as arrays of shape (),
# so the two broadcast together to shape (2, 3).
print(x * 2)
Output:
[[ 540  660]
 [1080 1320]
 [1620 1980]]
[[ 24  46  69]
 [ 57  79 102]]
[[ 57  67  78]
 [100 110 121]]
[[ 57  67  78]
 [100 110 121]]
[[ 24  44  66]
 [ 90 110 132]]
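The reshape step in the outer-product example can also be written with np.newaxis indexing, and the result can be checked against np.outer. This is a small sketch using the same v and w as above; the result names are chosen for this illustration.

```python
import numpy as np

v = np.array([12, 24, 36])
w = np.array([45, 55])

# Three equivalent ways to obtain the (3, 2) outer product.
outer_reshape = np.reshape(v, (3, 1)) * w   # explicit reshape, as in the example
outer_newaxis = v[:, np.newaxis] * w        # add a length-1 axis by indexing
outer_builtin = np.outer(v, w)              # NumPy's dedicated function

print(np.array_equal(outer_reshape, outer_newaxis))
print(np.array_equal(outer_reshape, outer_builtin))
```

Indexing with np.newaxis avoids hard-coding the length of v, which can be convenient when the array size is not known in advance.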
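Plotting a two-dimensional function:
Broadcasting is also often used when displaying images based on two-dimensional functions. To evaluate a function z = f(x, y) on a grid, x can be held as a row of values and y as a column of values, so that broadcasting produces z for every (x, y) pair.
Example (plotting sine and cosine curves, computed element-wise over an array, with matplotlib):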
import numpy as np
import matplotlib.pyplot as plt
# Computes x and y coordinates for
# points on sine and cosine curves
x = np.arange(0, 3 * np.pi, 0.1)
y_sin = np.sin(x)
y_cos = np.cos(x)
# Plot the points using matplotlib
plt.plot(x, y_sin)
plt.plot(x, y_cos)
plt.xlabel('x axis label')
plt.ylabel('y axis label')
plt.title('Sine and Cosine')
plt.legend(['Sine', 'Cosine'])
plt.show()
Output: a plot of the sine and cosine curves, titled "Sine and Cosine".
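The plotting example uses only one-dimensional arrays. For a function z = f(x, y), broadcasting can build the whole grid of values without an explicit double loop. The sketch below is an illustration under assumed choices: the function sin(x) * cos(y) and the grid sizes (50 and 40 points) are arbitrary and are not taken from the original example.

```python
import numpy as np

# x varies along columns, y along rows.
x = np.linspace(0, 2 * np.pi, 50)                  # shape (50,)
y = np.linspace(0, 2 * np.pi, 40)[:, np.newaxis]   # shape (40, 1)

# Shapes (50,) and (40, 1) broadcast to (40, 50):
# z[i, j] holds f(x[j], y[i]) for every grid point.
z = np.sin(x) * np.cos(y)

print(z.shape)   # (40, 50)
```

An array such as z can then be passed to matplotlib's plt.imshow to display the function as an image.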