Pattern Recognition 1 - Expert System
Objectives
- Create a working image analysis and pattern recognition pipeline
- Object labeling and feature extraction
- Creating an Expert System for Optical Character Recognition (OCR)
Libraries needed for this lab
- Numpy
- Matplotlib
- Scikit-image
(all included in the Anaconda distribution)
Object detection
Given the following image:
import numpy as np
from matplotlib import pyplot as plt
from skimage.io import imread
%matplotlib inline
im = imread('../data/doc1.png')
print(im.shape, im.dtype)
plt.figure(figsize=(20,15))
plt.imshow(im, cmap=plt.cm.gray)
plt.show()
(851, 1068) uint8
- Segment the image to isolate the text from the background
- Label individual characters and extract the centroid position, the bounding box, and useful features for each character.
Useful documentation: skimage.measure module.
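One possible way to approach the two steps above is sketched here. It assumes dark text on a light background and uses Otsu's threshold, which is only one of several reasonable segmentation choices. The function name `extract_characters` and the synthetic `demo` image are illustrative and not part of the lab material; on the real document you would pass `im` instead.

```python
import numpy as np
from skimage.filters import threshold_otsu
from skimage.measure import label, regionprops


def extract_characters(im):
    """Segment dark text on a light background and describe each blob."""
    # Otsu's threshold splits the grey levels into two classes;
    # pixels at or below it are treated as text (assumption: dark ink).
    mask = im <= threshold_otsu(im)
    # Connected-component labelling: 0 is background, 1..N are objects.
    labels = label(mask, connectivity=2)
    characters = []
    for region in regionprops(labels):
        characters.append({
            'label': region.label,
            'centroid': region.centroid,  # (row, col)
            'bbox': region.bbox,          # (min_row, min_col, max_row, max_col)
            'area': region.area,
            'euler_number': region.euler_number,
            'extent': region.extent,
        })
    return labels, characters


# Synthetic stand-in for a document: two dark blobs on a light page.
demo = np.full((40, 60), 220, dtype=np.uint8)
demo[10:30, 5:15] = 30    # filled rectangle
demo[10:30, 30:50] = 30   # rectangle with a hole, loosely like an 'O'
demo[15:25, 35:45] = 220

demo_labels, demo_chars = extract_characters(demo)
for c in demo_chars:
    print(c['label'], c['bbox'], c['euler_number'])
```

Note that connected-component labeling treats every blob as one object, so letters made of several parts (the dot of an 'i' or 'j', for instance) come out as separate labels and may need to be merged in a later step.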
## -- Your code here -- ##
Line detection and letter ordering
- Find the labels (objects) that belong to each text line
- Order each character in a text line from left to right
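A minimal sketch of one way to do this, assuming each character is described by a dict holding the `centroid` and `bbox` values returned by regionprops. The idea is to sweep the characters from top to bottom and start a new line whenever a centroid falls outside the vertical band of the current line. The function name `group_into_lines`, the `name` key, and the sample data are hypothetical.

```python
def group_into_lines(characters):
    """Group character dicts into text lines, each sorted left to right."""
    lines = []  # each entry: {'top': ..., 'bottom': ..., 'chars': [...]}
    # Sweep from top to bottom using the centroid row.
    for char in sorted(characters, key=lambda c: c['centroid'][0]):
        row = char['centroid'][0]
        top, _, bottom, _ = char['bbox']
        current = lines[-1] if lines else None
        if current is not None and current['top'] <= row <= current['bottom']:
            # Centroid lies inside the current line's band: same line.
            current['chars'].append(char)
            current['top'] = min(current['top'], top)
            current['bottom'] = max(current['bottom'], bottom)
        else:
            # Otherwise this character opens a new line.
            lines.append({'top': top, 'bottom': bottom, 'chars': [char]})
    # Within a line, reading order is given by the centroid column.
    return [sorted(line['chars'], key=lambda c: c['centroid'][1])
            for line in lines]


# Hypothetical characters, deliberately given out of order.
sample = [
    {'name': 'b', 'centroid': (12.0, 30.0), 'bbox': (5, 25, 20, 35)},
    {'name': 'a', 'centroid': (14.0, 10.0), 'bbox': (10, 5, 20, 15)},
    {'name': 'd', 'centroid': (44.0, 28.0), 'bbox': (40, 24, 50, 32)},
    {'name': 'c', 'centroid': (42.0, 8.0), 'bbox': (35, 4, 50, 12)},
]
ordered = group_into_lines(sample)
print([[c['name'] for c in line] for line in ordered])
# [['a', 'b'], ['c', 'd']]
```

This simple band test can merge two lines if tall characters or descenders from one line reach into the next; a projection of the binary mask onto the vertical axis is another option worth comparing.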
## -- Your code here -- ##
Expert System classification
- Using the features provided by scikit-image's regionprops function, propose a set of rules to automatically recognize some letters.
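As an illustration of what such hand-written rules might look like, the sketch below uses the Euler number (number of objects minus number of holes), the bounding-box aspect ratio, and the extent (object area divided by bounding-box area). The function name `classify_character`, the letters chosen, and every threshold are assumptions made for the example; they have not been tuned on doc1.png.

```python
def classify_character(features):
    """Apply a few hand-written rules to one character's features.

    features: dict with 'euler_number', 'bbox' and 'extent' keys,
    as produced by regionprops. Returns a letter, or '?' if no rule fires.
    """
    euler = features['euler_number']
    min_row, min_col, max_row, max_col = features['bbox']
    aspect = (max_col - min_col) / (max_row - min_row)  # width / height
    extent = features['extent']

    if euler == -1:
        # Two holes: 'B' is a candidate (so is '8').
        return 'B'
    if euler == 0:
        # One hole: shared by 'O', 'A', 'D', 'P', ... More rules would be
        # needed to separate them; 'O' is returned as a placeholder.
        return 'O'
    if aspect < 0.35 and extent > 0.8:
        # Tall, narrow and almost solid: 'I' (or 'l'). Illustrative thresholds.
        return 'I'
    return '?'


print(classify_character({'euler_number': -1, 'bbox': (0, 0, 10, 8), 'extent': 0.6}))
print(classify_character({'euler_number': 1, 'bbox': (0, 0, 20, 4), 'extent': 0.95}))
```

Topological features such as the Euler number are attractive for a rule-based system because they do not depend on the size of the font, whereas thresholds on aspect ratio or extent usually have to be adjusted per typeface.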
## -- Your code here -- ##
Full OCR pipeline
Using all the previous exercises, create a function that takes a text image as input and outputs the recognized text.
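The last stage of such a pipeline, turning ordered characters into a string, could look like the sketch below. It assumes the characters are already grouped into lines and sorted left to right, and it inserts a space when the horizontal gap between two bounding boxes is large compared with the median character width. The name `assemble_text`, the `gap_ratio` parameter and its default value are hypothetical choices for this example.

```python
from statistics import median


def assemble_text(lines, classify, gap_ratio=0.5):
    """Turn ordered lines of character dicts into a string.

    lines: list of lists of dicts with a 'bbox' key, ordered left to right
    classify: callable mapping a character dict to a one-character string
    gap_ratio: a gap wider than gap_ratio * median character width
               is written as a space (assumed heuristic)
    """
    rows = []
    for line in lines:
        if not line:
            continue
        widths = [c['bbox'][3] - c['bbox'][1] for c in line]
        space_gap = gap_ratio * median(widths)
        text = ''
        previous = None
        for char in line:
            # Gap between the previous box's right edge and this box's left edge.
            if previous is not None and char['bbox'][1] - previous['bbox'][3] > space_gap:
                text += ' '
            text += classify(char)
            previous = char
        rows.append(text)
    return '\n'.join(rows)


# Hypothetical, already ordered characters; 'name' stands in for a classifier.
demo_lines = [
    [{'name': 'a', 'bbox': (0, 0, 10, 8)},
     {'name': 'b', 'bbox': (0, 10, 10, 18)},
     {'name': 'c', 'bbox': (0, 30, 10, 38)}],
    [{'name': 'd', 'bbox': (20, 0, 30, 8)}],
]
result = assemble_text(demo_lines, classify=lambda c: c['name'])
print(repr(result))
# 'ab c\nd'
```

Passing the classifier in as an argument keeps this step independent of the rules used for recognition, so the rule set can be changed without touching the text assembly.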
def OCR(im):
    # Placeholder: segment, order and classify the characters of im here.
    text = ''
    return text

print(OCR(im))