Python3 + OpenCV3: Cracking a Verification Code
Jan 29, 2019
Versions
opencv-contrib-python 3.4.5.20
python 3.7.1
numpy 1.16.0
Install OpenCV
pip install opencv-python==3.4.5.20
pip install numpy
pip install matplotlib
Steps
1 . Get the picture (http://bsr.twse.com.tw/bshtm)
2 . Reduce picture noise (OpenCV)
3 . Use machine learning
This post only covers how to cut the picture into individual characters.
First, we will use several image processing functions from OpenCV.
1 . cv2.imread({img})
- img -> File Name , Example : test1.jpg
2 . cv2.erode({img},{kernel},{iterations})
- img -> Image array (for example, the result of cv2.imread), not a file name
- kernel -> Structuring element ; the erosion acts over a neighborhood of the kernel's size ,
Example : np.ones((3, 3), np.uint8) is a 3x3 kernel
- iterations -> Number of times erosion is applied
3 . cv2.dilate({img},{kernel},{iterations})
- img -> Image array (for example, the result of cv2.imread), not a file name
- kernel -> Structuring element ; the dilation acts over a neighborhood of the kernel's size ,
Example : np.ones((3, 3), np.uint8) is a 3x3 kernel
- iterations -> Number of times dilation is applied
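To make the effect of erosion and dilation concrete, here is a minimal pure-Python sketch on a tiny grid. It only imitates what a 3x3 all-ones kernel does (erosion takes the minimum of each pixel's neighborhood, dilation takes the maximum). The helper names `erode3x3` and `dilate3x3` are made up for this illustration and are not OpenCV functions, and ignoring neighbors outside the image is an assumption about border handling.

```python
def _morph3x3(grid, pick):
    """Apply `pick` (min or max) over each pixel's 3x3 neighborhood."""
    rows, cols = len(grid), len(grid[0])
    out = []
    for r in range(rows):
        new_row = []
        for c in range(cols):
            # Collect the neighbors that fall inside the image (assumption:
            # pixels outside the border are simply ignored).
            neighbours = [
                grid[rr][cc]
                for rr in range(max(r - 1, 0), min(r + 2, rows))
                for cc in range(max(c - 1, 0), min(c + 2, cols))
            ]
            new_row.append(pick(neighbours))
        out.append(new_row)
    return out


def erode3x3(grid):
    # Hypothetical helper: a pixel stays white only if its whole neighborhood is white.
    return _morph3x3(grid, min)


def dilate3x3(grid):
    # Hypothetical helper: a pixel turns white if any neighbor is white.
    return _morph3x3(grid, max)


# A 3x3 white square plus one isolated white "noise" pixel at row 2, column 5.
image = [
    [0,   0,   0,   0, 0,   0, 0],
    [0, 255, 255, 255, 0,   0, 0],
    [0, 255, 255, 255, 0, 255, 0],
    [0, 255, 255, 255, 0,   0, 0],
    [0,   0,   0,   0, 0,   0, 0],
]

eroded = erode3x3(image)    # the noise pixel disappears, the square shrinks to its center
opened = dilate3x3(eroded)  # the square grows back, the noise pixel does not

for row in opened:
    print(row)
```

Eroding first and dilating afterwards is why small specks vanish while larger shapes keep roughly their size.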
4 . cv2.threshold({img},{thresh},{maxval},{type})
Returns two values : the threshold that was used and the thresholded image
- img -> Image array (for example, the result of cv2.imread), not a file name
- thresh -> Threshold value that every pixel is compared against ; what happens above and below it depends on type
- maxval -> Value assigned by THRESH_BINARY and THRESH_BINARY_INV
- type ->
THRESH_BINARY (A pixel exceeding the threshold is set to maxval ; otherwise it is set to 0.)
THRESH_BINARY_INV (The inverse of THRESH_BINARY : a pixel exceeding the threshold is set to 0 ; otherwise it is set to maxval.)
THRESH_TRUNC (A pixel exceeding the threshold is set to the threshold ; otherwise it is unchanged.)
THRESH_TOZERO (A pixel exceeding the threshold is unchanged ; otherwise it is set to 0.)
THRESH_TOZERO_INV (The inverse of THRESH_TOZERO : a pixel exceeding the threshold is set to 0 ; otherwise it is unchanged.)
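The five threshold types are easier to compare side by side on a few sample values. The sketch below is a plain-Python restatement of the per-pixel rule, not OpenCV code: `threshold_pixel` and the string labels are hypothetical names chosen for this example, and the real call remains cv2.threshold with the cv2.THRESH_* flags.

```python
def threshold_pixel(value, thresh, maxval, kind):
    """Return the output for one pixel under the named threshold type."""
    above = value > thresh  # strictly greater than the threshold
    if kind == "BINARY":
        return maxval if above else 0
    if kind == "BINARY_INV":
        return 0 if above else maxval
    if kind == "TRUNC":
        return thresh if above else value
    if kind == "TOZERO":
        return value if above else 0
    if kind == "TOZERO_INV":
        return 0 if above else value
    raise ValueError("unknown threshold type: " + kind)


pixels = [100, 170, 200]  # made-up sample values
kinds = ["BINARY", "BINARY_INV", "TRUNC", "TOZERO", "TOZERO_INV"]

# Map every threshold type to its result for thresh=170, maxval=255.
results = {
    kind: [threshold_pixel(p, 170, 255, kind) for p in pixels]
    for kind in kinds
}

for kind in kinds:
    print(kind, results[kind])
```

Note that a pixel exactly equal to the threshold (170 here) counts as not exceeding it.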
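5 . cv2.cvtColor({img},{code})
This function converts an image from one color space to another
- code -> Conversion to apply , Example : cv2.COLOR_BGR2GRAY turns a BGR image into grayscale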
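6 . cv2.findContours({img},{mode},{method})
This function finds the contours (outlines) in a binary image ; cv2.boundingRect then gives the rectangle around a contour
- mode ->
- cv2.RETR_EXTERNAL : Retrieve only the outermost contours.
- cv2.RETR_LIST : Retrieve all contours, with no hierarchy.
- cv2.RETR_CCOMP : Retrieve all contours and organize them in two levels : the first level holds the outer boundaries of objects, the second level holds the boundaries of the holes inside them. A contour nested inside a hole goes back to the first level.
- cv2.RETR_TREE : Retrieve all contours and rebuild the full hierarchy of nested contours.
- method -> Contour approximation method , Example : cv2.CHAIN_APPROX_SIMPLE keeps only the end points of straight segments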
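One practical detail about cv2.findContours: OpenCV 3.x (the version used here) returns three values (image, contours, hierarchy), while OpenCV 4.x returns two (contours, hierarchy). A small helper can hide that difference. This is only a sketch: `grab_contours` is a hypothetical name, and the tuples below are stand-ins rather than real OpenCV output.

```python
def grab_contours(result):
    """Pick the contour list out of a cv2.findContours return value."""
    if len(result) == 3:
        # OpenCV 3.x layout: (image, contours, hierarchy)
        return result[1]
    if len(result) == 2:
        # OpenCV 4.x layout: (contours, hierarchy)
        return result[0]
    raise ValueError("unexpected findContours result of length %d" % len(result))


# Stand-in return values (strings instead of arrays) to show both layouts.
opencv3_style = ("image", ["contour_a", "contour_b"], "hierarchy")
opencv4_style = (["contour_a", "contour_b"], "hierarchy")

print(grab_contours(opencv3_style))
print(grab_contours(opencv4_style))
```

With a real image the call would look like `contours = grab_contours(cv2.findContours(thresh1, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE))`.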
import cv2
import matplotlib.pyplot as plt
import numpy as np


# Show an image
def printPhoto(img):
    plt.imshow(img)
    plt.show()


# Convert to grayscale
def grayify(image):
    return cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)


# When two neighboring boxes start less than 20 px apart, keep only the larger one
def delRepeatPic(ary):
    for i in range(len(ary) - 1):
        (x1, y1, w1, h1) = ary[i]
        (x2, y2, w2, h2) = ary[i + 1]
        if x2 - x1 < 20:
            if w1 * h1 > w2 * h2:
                del ary[i + 1]
            else:
                del ary[i]
            # The list changed, so restart the scan instead of using stale indexes
            return delRepeatPic(ary)
    return ary


img = cv2.imread('test1.jpg')

# Reduce noise: erode, blur, dilate
kernel = np.ones((3, 3), np.uint8)
img = cv2.erode(img, kernel, iterations=1)
img = cv2.medianBlur(img, 1)
kernel = np.ones((2, 2), np.uint8)
img = cv2.dilate(img, kernel, iterations=1)

# Binarize
ret1, img = cv2.threshold(img, 170, 255, cv2.THRESH_BINARY)
img = cv2.medianBlur(img, 1)
ret, thresh1 = cv2.threshold(img, 180, 255, cv2.THRESH_BINARY)
thresh1 = np.clip(thresh1, 0, 255)
thresh1 = cv2.cvtColor(thresh1, cv2.COLOR_BGR2GRAY)

# Find the contours (OpenCV 3.x returns three values) and sort them left to right
im2, contours, _ = cv2.findContours(thresh1, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
cnts = sorted([(c, cv2.boundingRect(c)[0]) for c in contours], key=lambda x: x[1])

# Keep only the boxes that are large enough to be a character
ary = []
for (c, _) in cnts:
    (x, y, w, h) = cv2.boundingRect(c)
    if w > 20 and h > 20:
        ary.append((x, y, w, h))

# Remove overlapping boxes, then show each character
ary = delRepeatPic(ary)
for (x, y, w, h) in ary:
    printPhoto(thresh1[y:y+h, x:x+w])
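The last part of the script (sort the bounding boxes left to right, drop the small ones, and keep the larger of two boxes whose left edges are less than 20 px apart) can be tried without any image. In the sketch below the boxes are made-up (x, y, w, h) values, and `dedupe_boxes` is an iterative variant written for illustration; it is not the author's delRepeatPic, and on a tie in area it keeps the earlier box.

```python
def dedupe_boxes(boxes, min_gap=20):
    """Keep one box per character: among boxes whose x starts are closer
    than `min_gap`, keep the one with the larger area."""
    kept = []
    for box in sorted(boxes, key=lambda b: b[0]):  # left to right by x
        if kept and box[0] - kept[-1][0] < min_gap:
            previous = kept[-1]
            # Same character: replace the stored box only if this one is larger.
            if box[2] * box[3] > previous[2] * previous[3]:
                kept[-1] = box
        else:
            kept.append(box)
    return kept


# Made-up bounding boxes in the order a contour search might return them.
raw_boxes = [
    (60, 5, 28, 42),
    (5, 8, 30, 40),
    (12, 15, 22, 21),   # nested inside the first character
    (110, 6, 8, 9),     # too small to be a character
    (150, 4, 31, 44),
]

# Same size filter as the script: width and height must both exceed 20 px.
large_boxes = [b for b in raw_boxes if b[2] > 20 and b[3] > 20]
characters = dedupe_boxes(large_boxes)

print(characters)
```

Each remaining tuple can then be used to slice the thresholded image, as in `thresh1[y:y+h, x:x+w]`.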