OpenCV Saliency Map | 极客教程

Here we use a Gaussian pyramid to build a simple saliency map.

A saliency map is an image that highlights the parts of a picture that most readily attract the human eye.

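Saliency maps are now usually computed with deep learning, but early methods built Gaussian pyramids from the RGB or HSV components of an image and obtained the saliency map by taking differences between pyramid levels (for example, the method of Itti et al.).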

Here we use the Gaussian pyramid obtained in Problem 75 to compute a simple saliency map. The algorithm is as follows:

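1. Resize the image with bilinear interpolation to $\frac{1}{128}$, $\frac{1}{64}$, $\frac{1}{32}$, and so on, starting with $\frac{1}{128}$. Note that the Python implementation below instead uses the scale factors $\frac{1}{2}$, $\frac{1}{4}$, $\frac{1}{8}$, $\frac{1}{16}$ and $\frac{1}{32}$, and enlarges each reduced image back to the original size so that all levels have the same shape and can be subtracted pixel by pixel.
2. Take pairwise differences between the levels of the resulting pyramid (we number the levels 0, 1, 2, 3, 4, 5).
3. Sum all the differences obtained in step 2 and normalize the result to $[0,255]$.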
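
Steps 2 and 3 can be written compactly as follows. This is only a sketch, under assumptions taken from the Python implementation further down: the differences are absolute differences, and normalization divides by the maximum value. The symbols $P_i$ (pyramid level $i$ resized back to the original size), $\Omega$ (the chosen set of level pairs), $D$ and $S$ are notation introduced here for illustration.

```latex
D(x, y) = \sum_{(i, j) \in \Omega} \left| P_i(x, y) - P_j(x, y) \right|,
\qquad
S(x, y) = 255 \cdot \frac{D(x, y)}{\max_{(x', y')} D(x', y')}
```

With this normalization the largest accumulated difference maps to 255, and a pixel where all chosen levels agree maps to 0.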

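Completing these steps yields the saliency map. Step 2 does not specify which pairs of images to use, but with well-chosen pairs you can obtain a saliency map like the one in the answer.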

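The figure clearly shows that the newt's eyes and the regions whose color differs from their surroundings turn white; these are the places where the human eye tends to linger.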

The answer uses the pairs $(0,1)$, $(0,3)$, $(0,5)$, $(1,4)$, $(2,3)$ and $(3,5)$.
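
Since the choice of pairs is left open, it may help to treat the pair list as a parameter. The following is a minimal sketch under stated assumptions, not the tutorial's own code: the names `PAIRS` and `saliency_from_pairs` are hypothetical, the pyramid is assumed to be a list of 2-D arrays of identical shape, and the guard against an all-zero result is an addition that the original listing does not have.

```python
import numpy as np

# Level pairs listed in the text; any other list of (i, j) pairs can be passed instead.
PAIRS = [(0, 1), (0, 3), (0, 5), (1, 4), (2, 3), (3, 5)]


def saliency_from_pairs(pyramid, pairs=PAIRS):
    """Sum absolute differences of the chosen level pairs and scale to [0, 255].

    Assumes `pyramid` is a list of 2-D arrays that all share the same shape.
    """
    out = np.zeros_like(pyramid[0], dtype=np.float32)
    for i, j in pairs:
        a = pyramid[i].astype(np.float32)
        b = pyramid[j].astype(np.float32)
        out += np.abs(a - b)

    peak = out.max()
    if peak > 0:  # assumption: leave an all-zero map untouched instead of dividing by zero
        out = out / peak * 255
    return out
```

Passing a different `pairs` list makes it easy to compare how the resulting map changes with the selection.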

Input (imori.jpg)	Output

Python implementation:

import cv2
import numpy as np

# Convert a BGR image to grayscale
def BGR2GRAY(img):
    # weighted sum of the R, G, B channels (OpenCV stores them in B, G, R order)
    gray = 0.2126 * img[..., 2] + 0.7152 * img[..., 1] + 0.0722 * img[..., 0]
    return gray

# Bi-Linear interpolation
def bl_interpolate(img, ax=1., ay=1.):
    if len(img.shape) > 2:
        H, W, C = img.shape
    else:
        H, W = img.shape
        C = 1

    aH = int(ay * H)
    aW = int(ax * W)

    # build the (aH, aW) grids of target pixel coordinates
    y = np.arange(aH).repeat(aW).reshape(aH, -1)
    x = np.tile(np.arange(aW), (aH, 1))

    # map target coordinates back to positions in the original image
    y = (y / ay)
    x = (x / ax)

    # np.int was removed from recent NumPy releases; the built-in int works everywhere
    ix = np.floor(x).astype(int)
    iy = np.floor(y).astype(int)

    ix = np.minimum(ix, W-2)
    iy = np.minimum(iy, H-2)

    # get distance 
    dx = x - ix
    dy = y - iy

    if C > 1:
        dx = np.repeat(np.expand_dims(dx, axis=-1), C, axis=-1)
        dy = np.repeat(np.expand_dims(dy, axis=-1), C, axis=-1)

    # interpolation
    out = (1-dx) * (1-dy) * img[iy, ix] + dx * (1 - dy) * img[iy, ix+1] + (1 - dx) * dy * img[iy+1, ix] + dx * dy * img[iy+1, ix+1]

    out = np.clip(out, 0, 255)
    out = out.astype(np.uint8)

    return out

# make image pyramid
def make_pyramid(gray):
    # first element
    pyramid = [gray]
    # each scale
    for i in range(1, 6):
        # define scale
        a = 2. ** i

        # down scale
        p = bl_interpolate(gray, ax=1./a, ay=1. / a)

        # up scale
        p = bl_interpolate(p, ax=a, ay=a)

        # add pyramid list
        pyramid.append(p.astype(np.float32))

    return pyramid

# make saliency map
def saliency_map(pyramid):
    # get shape
    H, W = pyramid[0].shape

    # prepare out image
    out = np.zeros((H, W), dtype=np.float32)

    # add each difference
    out += np.abs(pyramid[0] - pyramid[1])
    out += np.abs(pyramid[0] - pyramid[3])
    out += np.abs(pyramid[0] - pyramid[5])
    out += np.abs(pyramid[1] - pyramid[4])
    out += np.abs(pyramid[2] - pyramid[3])
    out += np.abs(pyramid[3] - pyramid[5])

    # normalization
    out = out / out.max() * 255

    return out


# Read image
img = cv2.imread("imori.jpg").astype(np.float64)

# grayscale
gray = BGR2GRAY(img)

# pyramid
pyramid = make_pyramid(gray)

# pyramid -> saliency
out = saliency_map(pyramid)

out = out.astype(np.uint8)

# Show and save result
cv2.imshow("result", out)
cv2.waitKey(0)
cv2.imwrite("out.jpg", out)