How do you convert an image to a PyTorch tensor?
A PyTorch tensor is an n-dimensional array (matrix) whose elements all share a single data type. Tensors are similar to NumPy arrays; the key difference is that a tensor can use a GPU to accelerate numerical computation. To speed up such computation, images are converted to tensors.
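The similarity to NumPy, and the GPU point, can be sketched as follows. This is a minimal illustration (the array values are arbitrary); `torch.from_numpy` shares memory with the source array, and `.to(device)` moves the tensor onto a GPU only when one is available.

```python
import numpy as np
import torch

# A NumPy array and a PyTorch tensor hold the same kind of data
arr = np.array([[1.0, 2.0], [3.0, 4.0]])
t = torch.from_numpy(arr)  # zero-copy conversion from ndarray to tensor

# Unlike a NumPy array, a tensor can be moved to a GPU for computation
device = "cuda" if torch.cuda.is_available() else "cpu"
t = t.to(device)

print(t.shape)  # torch.Size([2, 2])
```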
To convert an image to a PyTorch tensor, follow these steps:
Steps
- Import the required libraries. The required libraries are torch, torchvision, and Pillow.
- Read the image. The image must be either a PIL image or a numpy.ndarray (HxWxC) with values in the range [0, 255], where H, W, and C are the height, width, and number of channels of the image.
- Define a transform to convert the image to a tensor. We use transforms.ToTensor() to define the transform.
- Convert the image to a tensor using the transform defined above.
Input Image
Example 1
# Import the required libraries
import torch
from PIL import Image
import torchvision.transforms as transforms

# Read the image
image = Image.open('Penguins.jpg')

# Define a transform to convert the image to a tensor
transform = transforms.ToTensor()

# Convert the image to a PyTorch tensor
tensor = transform(image)

# Print the converted image tensor
print(tensor)
Output
tensor([[[0.4510, 0.4549, 0.4667, ..., 0.3333, 0.3333, 0.3333],
[0.4549, 0.4510, 0.4627, ..., 0.3373, 0.3373, 0.3373],
[0.4667, 0.4588, 0.4667, ..., 0.3451, 0.3451, 0.3412],
...,
[0.6706, 0.5020, 0.5490, ..., 0.4627, 0.4275, 0.3333],
[0.4196, 0.5922, 0.6784, ..., 0.4627, 0.4549, 0.3569],
[0.3569, 0.3529, 0.4784, ..., 0.3922, 0.4314, 0.3490]],
[[0.6824, 0.6863, 0.7020, ..., 0.6392, 0.6392, 0.6392],
[0.6863, 0.6824, 0.6980, ..., 0.6314, 0.6314, 0.6314],
[0.6980, 0.6902, 0.6980, ..., 0.6392, 0.6392, 0.6353],
...,
[0.7255, 0.5412, 0.5765, ..., 0.5255, 0.5020, 0.4157],
[0.4706, 0.6314, 0.7098, ..., 0.5255, 0.5294, 0.4392],
[0.4196, 0.3961, 0.5020, ..., 0.4510, 0.5059, 0.4314]],
[[0.8157, 0.8196, 0.8353, ..., 0.7922, 0.7922, 0.7922],
[0.8196, 0.8157, 0.8314, ..., 0.7882, 0.7882, 0.7882],
[0.8314, 0.8235, 0.8314, ..., 0.7961, 0.7961, 0.7922],
...,
[0.6235, 0.5059, 0.6157, ..., 0.4863, 0.4941, 0.4196],
[0.3922, 0.6000, 0.7176, ..., 0.4863, 0.5216, 0.4431],
[0.3686, 0.3647, 0.4863, ..., 0.4235, 0.4980, 0.4353]]])
In the above Python program, we converted a PIL image to a tensor.
Example 2
We can also read the image using OpenCV. An image read with OpenCV is of type numpy.ndarray, and we can convert a numpy.ndarray to a tensor using transforms.ToTensor(). Note that OpenCV loads images in BGR channel order, so we convert to RGB before applying the transform. Have a look at the following example.
# Import the required libraries
import torch
import cv2
import torchvision.transforms as transforms

# Read the image; OpenCV returns a numpy.ndarray in BGR channel order
image = cv2.imread('Penguins.jpg')
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)

# Define a transform to convert the image to a tensor
transform = transforms.ToTensor()

# Convert the image to a PyTorch tensor
tensor = transform(image)

# Print the converted image tensor
print(tensor)
Output
tensor([[[0.4510, 0.4549, 0.4667, ..., 0.3333, 0.3333, 0.3333],
[0.4549, 0.4510, 0.4627, ..., 0.3373, 0.3373, 0.3373],
[0.4667, 0.4588, 0.4667, ..., 0.3451, 0.3451, 0.3412],
...,
[0.6706, 0.5020, 0.5490, ..., 0.4627, 0.4275, 0.3333],
[0.4196, 0.5922, 0.6784, ..., 0.4627, 0.4549, 0.3569],
[0.3569, 0.3529, 0.4784, ..., 0.3922, 0.4314, 0.3490]],
[[0.6824, 0.6863, 0.7020, ..., 0.6392, 0.6392, 0.6392],
[0.6863, 0.6824, 0.6980, ..., 0.6314, 0.6314, 0.6314],
[0.6980, 0.6902, 0.6980, ..., 0.6392, 0.6392, 0.6353],
...,
[0.7255, 0.5412, 0.5765, ..., 0.5255, 0.5020, 0.4157],
[0.4706, 0.6314, 0.7098, ..., 0.5255, 0.5294, 0.4392],
[0.4196, 0.3961, 0.5020, ..., 0.4510, 0.5059, 0.4314]],
[[0.8157, 0.8196, 0.8353, ..., 0.7922, 0.7922, 0.7922],
[0.8196, 0.8157, 0.8314, ..., 0.7882, 0.7882, 0.7882],
[0.8314, 0.8235, 0.8314, ..., 0.7961, 0.7961, 0.7922],
...,
[0.6235, 0.5059, 0.6157, ..., 0.4863, 0.4941, 0.4196],
[0.3922, 0.6000, 0.7176, ..., 0.4863, 0.5216, 0.4431],
[0.3686, 0.3647, 0.4863, ..., 0.4235, 0.4980, 0.4353]]])