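Painting has the concept of perspective: near objects look large and far objects look small. Something close to the viewer appears big, while something far away appears small.
Photographs share this property. When a program has to make decisions based on an image, however, we often want to remove the perspective effect, because that makes the image easier to process. A perspective transform does exactly that.
First we need the mapping between the image before and after the transform. Pick 4 points in the original image, specify where those 4 points should land after the transform, and pass both sets to **cv2.getPerspectiveTransform()** to obtain the mapping matrix.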
import matplotlib.pyplot as plt
import numpy as np
import cv2
img = plt.imread('./road-3133502_640.jpg')
src = np.float32([
[0, 400],
[640, 400],
[300, 260],
[350, 260],
])
dst = np.float32([
[200, 426],
[400, 426],
[200, 0],
[400, 0],
])
M = cv2.getPerspectiveTransform(src, dst)
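In the figure, the blue dots are the points before the mapping and the red dots are the specified positions after the mapping.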
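To make the mapping concrete, here is a minimal NumPy sketch of the linear system that a 4-point perspective matrix has to satisfy. The helper names `perspective_matrix` and `apply_homography` are hypothetical and exist only for this illustration; this is not a description of how OpenCV implements cv2.getPerspectiveTransform() internally, only the same constraints solved directly.

```python
import numpy as np

def perspective_matrix(src, dst):
    """Solve for the 3x3 matrix that maps 4 src points onto 4 dst points."""
    A, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        # u = (h0*x + h1*y + h2) / (h6*x + h7*y + 1), and likewise for v
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.extend([u, v])
    h = np.linalg.solve(np.array(A, dtype=np.float64),
                        np.array(b, dtype=np.float64))
    return np.append(h, 1.0).reshape(3, 3)

def apply_homography(M, pts):
    """Map 2D points through M using homogeneous coordinates."""
    pts_h = np.hstack([pts, np.ones((len(pts), 1))])
    mapped = pts_h @ M.T
    return mapped[:, :2] / mapped[:, 2:3]

# Same point pairs as in the article
src = np.array([[0, 400], [640, 400], [300, 260], [350, 260]], dtype=np.float64)
dst = np.array([[200, 426], [400, 426], [200, 0], [400, 0]], dtype=np.float64)

M_sketch = perspective_matrix(src, dst)
mapped = apply_homography(M_sketch, src)  # should land on dst
```

Each source point, once divided by its third homogeneous coordinate, should land on the corresponding destination point.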
Next we call **cv2.warpPerspective()** to apply the perspective transform to the image.
warped = cv2.warpPerspective(img, M, (img.shape[1],img.shape[0]), flags=cv2.INTER_NEAREST)
f,(ax1,ax2) = plt.subplots(1,2)
ax1.imshow(img)
ax1.scatter(src[:,:1],src[:,1:2],color='b')
ax1.scatter(dst[:,:1],dst[:,1:2],color='r')
ax2.imshow(warped)
plt.show()
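Look! We have "straightened" the road.
There are many ways to recognize objects in an image. If the object you want to recognize has a distinctive color, separating it from the background by color first is a good starting point.
For example, suppose we want to separate the sky from the pyramid in this image.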
cv2.inRange(hsv, low, high)
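cv2.inRange() takes an HSV image, a lower threshold and an upper threshold, and returns a single-channel (grayscale) image. A pixel whose H, S and V values all lie within the thresholds is set to 255 and shows as white; a pixel with any value below the lower threshold or above the upper threshold is set to 0 and shows as black.
Next we can use this grayscale image as a mask: the white areas show the original image and the black areas hide it.
You can use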
cv2.bitwise_and(img,img, mask= mask)
or do the same thing with NumPy:
img[mask != 255] = [0, 0, 0]
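The thresholding rule described above can be sketched in plain NumPy. The helper `in_range` is a hypothetical stand-in written for this illustration, not OpenCV's implementation, and the two-pixel "image" is made-up data.

```python
import numpy as np

def in_range(img, low, high):
    """Return 255 where every channel lies within [low, high], else 0."""
    low = np.array(low)
    high = np.array(high)
    inside = np.all((img >= low) & (img <= high), axis=-1)
    return inside.astype(np.uint8) * 255

# A tiny 1x2 "HSV image": the first pixel is inside the range, the second is not
hsv = np.array([[[10, 200, 200], [100, 200, 200]]], dtype=np.uint8)
mask = in_range(hsv, (0, 0, 0), (30, 240, 255))

# Black out everything the mask does not select
masked = hsv.copy()
masked[mask != 255] = [0, 0, 0]
```

The second pixel fails only on its H channel, which is enough to exclude it.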
# coding=utf-8
import cv2
import numpy as np
def apply_color_mask(hsv, img, low, high):
    # Keep only the pixels of img whose HSV values lie within [low, high]
    mask = cv2.inRange(hsv, low, high)
    res = cv2.bitwise_and(img, img, mask=mask)
    return res

img = cv2.imread('b.jpg')
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
# The function returns the masked image, not the mask itself
result = apply_color_mask(hsv, img, (0, 0, 0), (30, 240, 255))
cv2.imshow("result", result)
cv2.waitKey(5000)
To extract several colors, extract each color separately and then merge the masks.
The following extracts white and yellow.
mask_yellow = cv2.inRange(hsv, (0, 100, 100), (80, 255, 255))
mask_white = cv2.inRange(hsv, (0, 0, 160), (255, 80, 255))
mask_lane = cv2.bitwise_or(mask_yellow, mask_white)
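Because each mask holds only 0 or 255, merging masks is a per-pixel OR. A small NumPy sketch with made-up mask values shows the effect; `np.bitwise_or` is used here as a stand-in for cv2.bitwise_or.

```python
import numpy as np

# Two hypothetical 1x4 masks: each marks a different set of pixels
mask_a = np.array([[255, 0, 0, 0]], dtype=np.uint8)
mask_b = np.array([[0, 0, 255, 255]], dtype=np.uint8)

# A pixel is kept if either mask selected it
merged = np.bitwise_or(mask_a, mask_b)
```

When picking thresholds, keep in mind that for 8-bit images OpenCV stores the H channel in the range 0 to 179, so an upper H bound above 179 simply means "no upper limit".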
cv2.addWeighted() blends two images together.
cv2.addWeighted(src1, alpha, src2, beta, gamma[, dst[, dtype]])
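The result is computed per pixel as src1 * alpha + src2 * beta + gamma, saturated to the output type. The following NumPy sketch illustrates that formula for uint8 images; `add_weighted` is a hypothetical helper, not the OpenCV function, and the pixel values are made up.

```python
import numpy as np

def add_weighted(src1, alpha, src2, beta, gamma=0.0):
    """Blend two uint8 images: src1*alpha + src2*beta + gamma, clipped to 0..255."""
    blended = src1.astype(np.float64) * alpha + src2.astype(np.float64) * beta + gamma
    return np.clip(np.rint(blended), 0, 255).astype(np.uint8)

a = np.array([[100, 250]], dtype=np.uint8)
b = np.array([[50, 50]], dtype=np.uint8)

# 100 + 0.4*50 = 120; 250 + 0.4*50 = 270, which saturates at 255
out = add_weighted(a, 1.0, b, 0.4)
```

Note that with alpha = 1 the first image is not dimmed, so bright areas can saturate, as the second pixel does here.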
Let's try it. First read the two images.
im1 = plt.imread('1.jpg')
im2 = plt.imread('2.jpg')
Resize the second image to the size of the first, then blend the two, giving the second image a weight of 0.4.
im2 = cv2.resize(im2, (im1.shape[1],im1.shape[0]))
img = cv2.addWeighted(im1, 1, im2, 0.4, 0)
plt.imshow(img)
plt.show()
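If you use Photoshop often, you are surely familiar with Gaussian blur. It is frequently applied to a background layer, together with a slight defocus, to imitate the shallow depth of field of a DSLR lens at a wide aperture and make a picture look more polished.
Gaussian blur is the convolution of an image with a normal distribution. Because the normal distribution is also called the Gaussian distribution, the technique is called Gaussian blur.
You can also think of Gaussian blur as replacing each pixel with a weighted average of its neighbors, where nearer neighbors count more. The kernel size is the range over which that average is taken.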
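To show what such a kernel looks like, here is a small NumPy sketch that builds one. `gaussian_kernel` is a hypothetical helper for illustration, and it takes sigma explicitly, which is an assumption of this sketch rather than something the article specifies.

```python
import numpy as np

def gaussian_kernel(size, sigma):
    """Build a normalized size x size Gaussian kernel."""
    offsets = np.arange(size) - (size - 1) / 2.0
    weights_1d = np.exp(-(offsets ** 2) / (2.0 * sigma ** 2))
    kernel = np.outer(weights_1d, weights_1d)
    # Normalize so the weights sum to 1 and brightness is preserved
    return kernel / kernel.sum()

kernel = gaussian_kernel(5, sigma=1.0)
center_weight = kernel[2, 2]  # largest weight: the pixel itself
corner_weight = kernel[0, 0]  # smallest weight: the farthest neighbor
```

The weights fall off with distance from the center, which is what distinguishes a Gaussian blur from a plain box average.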
Here is how to apply a Gaussian blur in Python.
import matplotlib.pyplot as plt
import cv2
img = plt.imread('1.jpg')
plt.imshow(img)
plt.show()
img = cv2.GaussianBlur(img, (15, 15), 0)
plt.imshow(img)
plt.show()
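Edge detection picks out the lines where color changes sharply; it is built on convolution. For example, suppose we need to detect the lane lines in the image below.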
# coding=utf-8
import cv2
import numpy as np
img = cv2.imread('b.jpg')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
gaus = cv2.GaussianBlur(gray, (3, 3), 0)
edges = cv2.Canny(gaus, 50, 150,apertureSize = 3)
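Display the edge detection result to see the effect.
Edge detection only marks the points that lie on edges. To draw actual lines, we can use the Hough transform to connect those points into line segments.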
lines = cv2.HoughLinesP(edges, 1, np.pi / 180, 100, minLineLength=10, maxLineGap=50)
# HoughLinesP returns None when no line segment is found
if lines is not None:
    for line in lines:
        x1, y1, x2, y2 = line[0]
        cv2.line(img, (x1, y1), (x2, y2), (0, 255, 0), 2)
cv2.imshow("houghline", img)
cv2.waitKey(5000)
Here is the result.
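The idea behind the Hough transform is voting: every edge point votes for all lines (rho, theta) that could pass through it, and cells with many votes correspond to real lines. The sketch below shows the classic accumulator in NumPy. It is a simplified illustration of the standard Hough transform, not the probabilistic variant that cv2.HoughLinesP uses, and `hough_accumulate` and the sample points are hypothetical.

```python
import numpy as np

def hough_accumulate(points, thetas, rho_max):
    """Count votes for each (rho, theta) cell; rows are rho + rho_max."""
    acc = np.zeros((2 * rho_max + 1, len(thetas)), dtype=np.int32)
    theta_idx = np.arange(len(thetas))
    for x, y in points:
        # Each point lies on the line rho = x*cos(theta) + y*sin(theta)
        rhos = np.rint(x * np.cos(thetas) + y * np.sin(thetas)).astype(int)
        acc[rhos + rho_max, theta_idx] += 1
    return acc

# Ten edge points on the horizontal line y = 5
points = [(x, 5) for x in range(10)]
thetas = np.deg2rad(np.arange(0, 180))  # 1 degree resolution
rho_max = 20
acc = hough_accumulate(points, thetas, rho_max)

# Votes collected by the cell rho = 5, theta = 90 degrees
votes_for_line = acc[5 + rho_max, 90]
```

All ten points agree on the cell for rho = 5 and theta = 90 degrees, which is the horizontal line y = 5. Neighboring angles can collect the same number of votes on such a short line, which is one reason real implementations need a threshold and a suitable resolution.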
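In computer vision and image processing, images often come in different sizes. In that case we can resize them so that all the image data has a uniform size.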
cv2.resize(src, dsize[, dst[, fx[, fy[, interpolation]]]])
Here is an example. The axes of the two plots show how the size changes before and after resizing.
# coding=utf-8
import cv2
import matplotlib.pyplot as plt
img = plt.imread('57a032c121f15.jpg')
newimg = cv2.resize(img,(500,500))
f,(a1,a2) = plt.subplots(1,2)
a1.imshow(img)
a2.imshow(newimg)
plt.show()
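One detail that is easy to get wrong is that dsize is given as (width, height), while a NumPy image shape is (height, width, channels). The nearest-neighbor sketch below makes the ordering explicit. `resize_nearest` is a hypothetical helper for illustration only; cv2.resize uses bilinear interpolation by default, so its pixel values will generally differ.

```python
import numpy as np

def resize_nearest(img, dsize):
    """Nearest-neighbor resize; dsize is (width, height) as in cv2.resize."""
    new_w, new_h = dsize
    h, w = img.shape[:2]
    # For each output row/column, pick the source row/column it falls on
    rows = np.arange(new_h) * h // new_h
    cols = np.arange(new_w) * w // new_w
    return img[rows][:, cols]

img = np.arange(4 * 6 * 3, dtype=np.uint8).reshape(4, 6, 3)  # height 4, width 6
small = resize_nearest(img, (3, 2))                          # width 3, height 2
```

The output shape is (2, 3, 3): height first, even though the width was passed first.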
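When training a machine learning model, too little data or data that is too concentrated can lead to poor results or to overfitting.
We can use cv2.flip() to flip images. This increases the amount of data and can make the model less biased toward one orientation.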
cv2.flip(src, flipCode[, dst])
# coding=utf-8
import matplotlib.pyplot as plt
import numpy as np
import cv2
img = plt.imread('./cat.jpg')
newimg = cv2.flip(img,1)
f,(ax1,ax2) = plt.subplots(1,2)
ax1.imshow(img)
ax2.imshow(newimg)
plt.show()
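The flipCode argument selects the axis: 0 flips vertically (around the x-axis), a positive value flips horizontally (around the y-axis), and a negative value flips both ways. The same operations can be sketched with NumPy slicing on a small made-up array:

```python
import numpy as np

img = np.array([[1, 2, 3],
                [4, 5, 6]])

flip_vertical = img[::-1, :]    # corresponds to flipCode = 0
flip_horizontal = img[:, ::-1]  # corresponds to flipCode > 0 (used in the article)
flip_both = img[::-1, ::-1]     # corresponds to flipCode < 0
```

For data augmentation, a horizontal flip is usually the safe choice, since an upside-down image may no longer resemble anything the model will see in practice.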
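cv2.Sobel() computes the gradient of an image. We can then filter and process the image according to its gradient.
Internally it simply convolves the image with a kernel. For the details, see my video tutorial "人工智能-卷积的原理" (Artificial Intelligence: How Convolution Works).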
import cv2
import numpy as np
im = cv2.imread('lion.jpg')
im = cv2.cvtColor(im, cv2.COLOR_BGR2GRAY)
x = cv2.Sobel(im, cv2.CV_16S, 1, 0)
y = cv2.Sobel(im, cv2.CV_16S, 0, 1)
absX = cv2.convertScaleAbs(x)
absY = cv2.convertScaleAbs(y)
dst = cv2.addWeighted(absX, 0.5, absY, 0.5, 0)
cv2.imshow('origin', im)
cv2.imshow("absX", absX)
cv2.imshow("absY", absY)
cv2.imshow("Result", dst)
cv2.waitKey(0)
cv2.destroyAllWindows()
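The derivative computed by cv2.Sobel() can be negative and can exceed 255, whereas the original image is uint8. A uint8 output does not have enough bits and the values would be truncated, so we use a 16-bit signed type, cv2.CV_16S. Afterwards cv2.convertScaleAbs() converts the result back to uint8; otherwise the image cannot be displayed properly.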
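A hand-rolled version of the 3x3 Sobel x kernel makes the value range visible. This is a sketch under simplifying assumptions: `correlate_valid` is a hypothetical helper that only computes positions where the kernel fits entirely inside the image, whereas cv2.Sobel pads the borders and returns an output of the same size as the input.

```python
import numpy as np

SOBEL_X = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], dtype=np.int32)

def correlate_valid(img, kernel):
    """Slide the kernel over img and sum the products (no border padding)."""
    kh, kw = kernel.shape
    h, w = img.shape
    out = np.zeros((h - kh + 1, w - kw + 1), dtype=np.int32)
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            window = img[i:i + kh, j:j + kw].astype(np.int32)
            out[i, j] = np.sum(window * kernel)
    return out

# A bright vertical stripe: one rising edge and one falling edge
img = np.array([[0, 0, 255, 255, 0, 0]] * 3, dtype=np.uint8)
grad = correlate_valid(img, SOBEL_X)

# Rough analogue of convertScaleAbs: absolute value, then saturate to uint8
abs_grad = np.clip(np.abs(grad), 0, 255).astype(np.uint8)
```

Here `grad` is `[[1020, 1020, -1020, -1020]]`: the rising edge gives a large positive value and the falling edge a large negative one, neither of which fits in uint8.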
def dir_threshold(img, sobel_kernel=3, thresh=(0, np.pi/2)):
    # Apply the following steps to img
    # 1) Convert to grayscale
    gray = cv2.cvtColor(img, cv2.COLOR_RGB2GRAY)
    # 2) Take the gradient in x and y separately
    sobelx = cv2.Sobel(gray, cv2.CV_64F, 1, 0, ksize=sobel_kernel)
    sobely = cv2.Sobel(gray, cv2.CV_64F, 0, 1, ksize=sobel_kernel)
    # 3) Take the absolute value of the x and y gradients
    abs_sobelx = np.absolute(sobelx)
    abs_sobely = np.absolute(sobely)
    # 4) Use np.arctan2(abs_sobely, abs_sobelx) to calculate the direction of the gradient
    absgraddir = np.arctan2(abs_sobely, abs_sobelx)
    # 5) Create a binary mask where direction thresholds are met
    binary_output = np.zeros_like(absgraddir)
    binary_output[(absgraddir >= thresh[0]) & (absgraddir <= thresh[1])] = 1
    # 6) Return this mask as your binary_output image
    return binary_output
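To see what the direction values mean, consider three made-up gradient samples. Because absolute values are taken first, np.arctan2 returns an angle between 0 and pi/2: 0 when the change is purely along x (a vertical edge), pi/2 when it is purely along y (a horizontal edge).

```python
import numpy as np

# Hypothetical gradient samples: pure x, pure y, and equal in both directions
gx = np.array([10.0, 0.0, -5.0])
gy = np.array([0.0, 10.0, 5.0])

direction = np.arctan2(np.abs(gy), np.abs(gx))  # 0, pi/2, pi/4

# Keep only roughly diagonal gradients, the way dir_threshold would with this range
thresh = (0.7, 1.3)
selected = (direction >= thresh[0]) & (direction <= thresh[1])
```

Only the third sample, at pi/4 (about 0.785), falls inside the example range. The range (0.7, 1.3) is an illustrative choice, not a value from the article.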
# Define a function that thresholds the S channel
# of the image in HLS color space
def hls_select(image, thresh=(90, 255)):
    # 1) Convert to HLS color space
    hls = cv2.cvtColor(image, cv2.COLOR_RGB2HLS)
    S = hls[:, :, 2]
    # 2) Apply a threshold to the S channel
    binary = np.zeros_like(S)
    binary[(S > thresh[0]) & (S <= thresh[1])] = 1
    # 3) Return a binary image of threshold result
    return binary
# Define a function that applies Sobel x and y,
# then computes the magnitude of the gradient
# and applies a threshold.
def mag_thresh(img, sobel_kernel=3, mag_thresh=(0, 255)):
    # Apply the following steps to img
    # 1) Convert to grayscale
    gray = cv2.cvtColor(img, cv2.COLOR_RGB2GRAY)
    # 2) Take the gradient in x and y separately
    sobelx = cv2.Sobel(gray, cv2.CV_64F, 1, 0, ksize=sobel_kernel)
    sobely = cv2.Sobel(gray, cv2.CV_64F, 0, 1, ksize=sobel_kernel)
    # 3) Calculate the magnitude
    gradmag = np.sqrt(sobelx**2 + sobely**2)
    # 4) Scale to 8-bit (0 - 255) and convert to type = np.uint8
    scale_factor = np.max(gradmag)/255
    gradmag = (gradmag/scale_factor).astype(np.uint8)
    # 5) Create a binary mask where mag thresholds are met
    binary_output = np.zeros_like(gradmag)
    binary_output[(gradmag >= mag_thresh[0]) & (gradmag <= mag_thresh[1])] = 1
    # 6) Return this mask as your binary_output image
    return binary_output
def abs_sobel_thresh(img, orient='x', thresh_min=0, thresh_max=255):
    # Apply the following steps to img
    # 1) Convert to grayscale
    gray = cv2.cvtColor(img, cv2.COLOR_RGB2GRAY)
    # 2) Take the derivative in x or y given orient = 'x' or 'y'
    if orient == 'x':
        sobel = cv2.Sobel(gray, cv2.CV_64F, 1, 0)
    elif orient == 'y':
        sobel = cv2.Sobel(gray, cv2.CV_64F, 0, 1)
    else:
        raise ValueError("orient must be 'x' or 'y'")
    # 3) Take the absolute value of the derivative or gradient
    abs_sobel = np.absolute(sobel)
    # 4) Scale to 8-bit (0 - 255) then convert to type = np.uint8
    scaled_sobel = np.uint8(255*abs_sobel/np.max(abs_sobel))
    # 5) Create a mask of 1's where the scaled gradient magnitude
    # is >= thresh_min and <= thresh_max
    binary_output = np.zeros_like(scaled_sobel)
    binary_output[(scaled_sobel >= thresh_min) & (scaled_sobel <= thresh_max)] = 1
    # 6) Return this mask as your binary_output image
    return binary_output
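Each of the functions above returns a binary mask of 0s and 1s, so their results can be combined with boolean logic. The article does not say how to combine them; the rule below (both directional gradients agree, or magnitude and direction agree) is one common pattern and is an assumption here, shown on made-up 1x4 masks rather than real function output.

```python
import numpy as np

# Hypothetical outputs of abs_sobel_thresh (x and y), mag_thresh and dir_threshold
gradx = np.array([[1, 1, 0, 0]], dtype=np.uint8)
grady = np.array([[1, 0, 0, 0]], dtype=np.uint8)
mag_binary = np.array([[0, 0, 1, 1]], dtype=np.uint8)
dir_binary = np.array([[0, 0, 1, 0]], dtype=np.uint8)

combined = np.zeros_like(gradx)
combined[((gradx == 1) & (grady == 1)) |
         ((mag_binary == 1) & (dir_binary == 1))] = 1
```

The first pixel is kept because both directional gradients fire, and the third because magnitude and direction both fire.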
References: