
OPTICAL CHARACTER RECOGNITION

Made By:
Dhairya Goel - 02814803115
Madhwan Sharma - 60214803115
DEFINITION

Optical Character Recognition (OCR) is the mechanical or
electronic conversion of images of typewritten or printed
text into machine-encoded text.
PROBLEM OVERVIEW
• Humans are bound to make errors sooner or later,
especially when continuously performing mundane, boring
tasks such as digitization or security.

• We are often unable to perceive certain digits because of
factors such as motion, lack of digit clarity and
illumination.

• It is these problems that have led us to delve into this
topic.
PURPOSE
• The main purpose of an OCR system based on grid
infrastructure is to perform Document Image Analysis, i.e.
the processing of electronic document formats converted
from paper formats, more effectively.

• This improves the accuracy of character recognition
during document processing.

• Here the OCR technique derives the meaning of the
characters and their font properties from their bit-mapped
images.
• The primary objective is to speed up character
recognition in document processing. As a result, the system
can process a large number of documents in less time.

• Since our character recognition is based on a grid
infrastructure, it aims to recognize multiple heterogeneous
characters that belong to different languages, with
different font properties and alignments.
STEPS IN OCR
PRE-PROCESSING

• Deals with improving the quality of the image so that the
system can recognize it better.

• Consists of: Noise Removal, Deblurring, Binarization and
Edge Detection.
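
Two of the steps listed above, noise removal and binarization, can be sketched in a few lines. This is a minimal illustration only, not the authors' implementation: the 3x3 median filter, the fixed threshold of 128, and the function names `median_filter` and `binarize` are all assumptions, and the image is assumed to be a list of rows of grayscale values in the range 0 to 255.

```python
def median_filter(image):
    """Noise removal sketch: replace each pixel by the median of its 3x3 window.

    Border pixels use only the neighbours that exist.
    """
    rows, cols = len(image), len(image[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window = sorted(
                image[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
            )
            # Upper-middle element of the sorted window (keeps integer values).
            out[r][c] = window[len(window) // 2]
    return out


def binarize(image, threshold=128):
    """Binarization sketch: 1 (ink) for pixels darker than threshold, else 0."""
    return [[1 if px < threshold else 0 for px in row] for row in image]


# A light background with a single dark noise speck in the middle.
noisy = [
    [200, 200, 200],
    [200, 0, 200],
    [200, 200, 200],
]
clean = binarize(median_filter(noisy))
```

Filtering before thresholding means an isolated dark pixel is smoothed away rather than being turned into ink. A practical system would more likely choose the threshold adaptively than use a fixed value.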
FEATURE EXTRACTION
• Transforming the input data into a set of features is
called Feature Extraction.

• Feature extraction is performed on the raw data before the
k-NN algorithm is applied to the transformed data in
feature space.

• This extracts properties that identify a character
uniquely and differentiate between similar characters.
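
As one possible illustration of turning a character image into a feature vector, the sketch below uses zoning: the binarized image is split into equal zones and the fraction of ink pixels in each zone becomes one feature. The slides do not say which features the authors used, so zoning, the function name `zoning_features` and the `zones` parameter are assumptions made for this example.

```python
def zoning_features(binary_image, zones=2):
    """Return the ink density of each zone, read row by row.

    binary_image is a list of rows of 0/1 values. Rows and columns that do
    not fit into a whole zone are ignored in this sketch.
    """
    rows, cols = len(binary_image), len(binary_image[0])
    zone_h, zone_w = rows // zones, cols // zones
    if zone_h == 0 or zone_w == 0:
        raise ValueError("image is too small for the requested number of zones")
    features = []
    for zr in range(zones):
        for zc in range(zones):
            cells = [
                binary_image[r][c]
                for r in range(zr * zone_h, (zr + 1) * zone_h)
                for c in range(zc * zone_w, (zc + 1) * zone_w)
            ]
            features.append(sum(cells) / len(cells))
    return features


# A vertical stroke along the left edge of a 4x4 image.
stroke = [
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
]
print(zoning_features(stroke))  # [0.5, 0.0, 0.5, 0.0]
```

The stroke puts ink only in the two left-hand zones, which is the kind of property that separates it from a character with ink spread across all zones.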
Example
CLASSIFICATION
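
The feature extraction slide names k-NN as the algorithm applied in feature space. A minimal sketch of k-nearest-neighbour classification follows, assuming Euclidean distance, majority voting and k = 3. None of these choices is confirmed by the slides, and the function name `knn_classify` and the training data are hypothetical.

```python
import math
from collections import Counter


def knn_classify(sample, training_set, k=3):
    """Label a feature vector by majority vote among its k nearest neighbours.

    training_set is a list of (feature_vector, label) pairs.
    """
    neighbours = sorted(
        training_set,
        key=lambda item: math.dist(sample, item[0]),  # Euclidean distance
    )[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Hypothetical training data: four-element feature vectors with labels.
training = [
    ([0.5, 0.0, 0.5, 0.0], "I"),
    ([0.6, 0.0, 0.4, 0.0], "I"),
    ([0.5, 0.5, 0.5, 0.5], "O"),
    ([0.4, 0.6, 0.5, 0.5], "O"),
]
predicted = knn_classify([0.5, 0.1, 0.5, 0.0], training)
```

Here the two nearest neighbours are both labelled "I", so the sample is classified as "I" even though the third neighbour is an "O".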
USES
• It is widely used as a form of data entry from printed
paper records, such as passport documents, invoices, bank
statements, business cards, mail or other documents.

• It is a common method of digitizing printed texts so that
they can be electronically edited, searched, stored more
compactly, displayed online and used in machine processes
such as machine translation, text-to-speech, key data and
text mining.
CONCLUSION
• OCR technology provides fast, automated data capture,
which can save organizations considerable time and labour
costs.

• The system's advantages include automation of mundane
tasks, low time complexity, a very small database and high
adaptability to untrained inputs, with only a small number
of features to calculate.
