Abstract: The applications on numbers are across-the-board that there is much scope for study. The chic of writing numbers is diverse and comes in a variety of form, size and fonts. Identification of Indian languages scripts is challenging problems. In Optical Character Recognition [OCR], machine printed or handwritten characters/numerals are recognized. There are plentiful approaches that deal with problem of detection of numerals/character depending on the sort of feature extracted and different way of extracting them. This paper proposes a recognition scheme for handwritten Hindi (devnagiri) numerals; most admired one in Indian subcontinent our work focused on a technique in feature extraction i.e. Local-based approach, a method using 16-segment display concept, which is extracted from halftoned images & Binary images of isolated numerals. These feature vectors are fed to neural classifier model that has been trained to recognize a Hindi numeral. The archetype of system has been tested on varieties of image of numerals. Experimentation result shows that recognition rate of halftoned images is 98 % compared to binary images (95%).
Abstract: This paper presents the analysis of similarity between local decisions, in the process of alphanumeric hand-prints classification. From the analysis of local characteristics of handprinted numerals and characters, extracted by a zoning method, the set of classification decisions is obtained and the similarity among them is investigated. For this purpose the Similarity Index is used, which is an estimator of similarity between classifiers, based on the analysis of agreements between their decisions. The experimental tests, carried out using numerals and characters from the CEDAR and ETL database, respectively, show to what extent different parts of the patterns provide similar classification decisions.
Abstract: In the present study, a support vector machine (SVM) learning approach to character recognition is proposed. Simple
feature detectors, similar to those found in the human visual system, were used in the SVM classifier. Alphabetic characters were rotated
to 8 different angles and using the proposed cognitive model, all characters were recognized with 100% accuracy and specificity.
These same results were found in psychiatric studies of human character recognition.
Abstract: Recognition of characters greatly depends upon the features used. Several features of the handwritten Arabic characters are selected and discussed. An off-line recognition system based on the selected features was built. The system was trained and tested with realistic samples of handwritten Arabic characters. Evaluation of the importance and accuracy of the selected features is made. The recognition based on the selected features give average accuracies of 88% and 70% for the numbers and letters, respectively. Further improvements are achieved by using feature weights based on insights gained from the accuracies of individual features.
Abstract: Document image processing has become an
increasingly important technology in the automation of office
documentation tasks. During document scanning, skew is inevitably
introduced into the incoming document image. Since the algorithm
for layout analysis and character recognition are generally very
sensitive to the page skew. Hence, skew detection and correction in
document images are the critical steps before layout analysis. In this
paper, a novel skew detection method is presented for binary
document images. The method considered the some selected
characters of the text which may be subjected to thinning and Hough
transform to estimate skew angle accurately. Several experiments
have been conducted on various types of documents such as
documents containing English Documents, Journals, Text-Book,
Different Languages and Document with different fonts, Documents
with different resolutions, to reveal the robustness of the proposed
method. The experimental results revealed that the proposed method
is accurate compared to the results of well-known existing methods.
Abstract: Neural processors have shown good results for
detecting a certain character in a given input matrix. In this paper, a
new idead to speed up the operation of neural processors for character
detection is presented. Such processors are designed based on cross
correlation in the frequency domain between the input matrix and the
weights of neural networks. This approach is developed to reduce the
computation steps required by these faster neural networks for the
searching process. The principle of divide and conquer strategy is
applied through image decomposition. Each image is divided into
small in size sub-images and then each one is tested separately by
using a single faster neural processor. Furthermore, faster character
detection is obtained by using parallel processing techniques to test the
resulting sub-images at the same time using the same number of faster
neural networks. In contrast to using only faster neural processors, the
speed up ratio is increased with the size of the input image when using
faster neural processors and image decomposition. Moreover, the
problem of local subimage normalization in the frequency domain is
solved. The effect of image normalization on the speed up ratio of
character detection is discussed. Simulation results show that local
subimage normalization through weight normalization is faster than
subimage normalization in the spatial domain. The overall speed up
ratio of the detection process is increased as the normalization of
weights is done off line.
Abstract: Pattern recognition is the research area of Artificial
Intelligence that studies the operation and design of systems that
recognize patterns in the data. Important application areas are image
analysis, character recognition, fingerprint classification, speech
analysis, DNA sequence identification, man and machine
diagnostics, person identification and industrial inspection. The
interest in improving the classification systems of data analysis is
independent from the context of applications. In fact, in many
studies it is often the case to have to recognize and to distinguish
groups of various objects, which requires the need for valid
instruments capable to perform this task. The objective of this article
is to show several methodologies of Artificial Intelligence for data
classification applied to biomedical patterns. In particular, this work
deals with the realization of a Computer-Aided Detection system
(CADe) that is able to assist the radiologist in identifying types of
mammary tumor lesions. As an additional biomedical application of
the classification systems, we present a study conducted on blood
samples which shows how these methods may help to distinguish
between carriers of Thalassemia (or Mediterranean Anaemia) and
healthy subjects.
Abstract: Hidden Markov Model (HMM) is a stochastic method
which has been used in various signal processing and character
recognition. This study proposes to use HMM to recognize Javanese
characters from a number of different handwritings, whereby HMM
is used to optimize the number of state and feature extraction. An
85.7 % accuracy is obtained as the best result in 16-stated vertical
model using pure HMM. This initial result is satisfactory for
prompting further research.
Abstract: Computerized lip reading has been one of the most
actively researched areas of computer vision in recent past because
of its crime fighting potential and invariance to acoustic environment.
However, several factors like fast speech, bad pronunciation,
poor illumination, movement of face, moustaches and beards make
lip reading difficult. In present work, we propose a solution for
automatic lip contour tracking and recognizing letters of English
language spoken by speakers using the information available from
lip movements. Level set method is used for tracking lip contour
using a contour velocity model and a feature vector of lip movements
is then obtained. Character recognition is performed using modified
k nearest neighbor algorithm which assigns more weight to nearer
neighbors. The proposed system has been found to have accuracy
of 73.3% for character recognition with speaker lip movements as
the only input and without using any speech recognition system in
parallel. The approach used in this work is found to significantly
solve the purpose of lip reading when size of database is small.
Abstract: Optical character recognition of cursive scripts
presents a number of challenging problems in both segmentation and
recognition processes in different languages, including Persian. In
order to overcome these problems, we use a newly developed Persian
word segmentation method and a recognition-based segmentation
technique to overcome its segmentation problems. This method is
robust as well as flexible. It also increases the system-s tolerances to
font variations. The implementation results of this method on a
comprehensive database show a high degree of accuracy which meets
the requirements for commercial use. Extended with a suitable pre
and post-processing, the method offers a simple and fast framework
to develop a full OCR system.
Abstract: This paper includes two novel techniques for skew
estimation of binary document images. These algorithms are based on
connected component analysis and Hough transform. Both these
methods focus on reducing the amount of input data provided to
Hough transform. In the first method, referred as word centroid
approach, the centroids of selected words are used for skew detection.
In the second method, referred as dilate & thin approach, the selected
characters are blocked and dilated to get word blocks and later
thinning is applied. The final image fed to Hough transform has the
thinned coordinates of word blocks in the image. The methods have
been successful in reducing the computational complexity of Hough
transform based skew estimation algorithms. Promising experimental
results are also provided to prove the effectiveness of the proposed
methods.
Abstract: Current OCR technology does not allow to
accurately recognizing small text images, such as those found
in web images. Our goal is to investigate new approaches to
recognize very low resolution text images containing antialiased
character shapes.
This paper presents a preliminary study on the variability of
such characters and the feasibility to discriminate them by
using geometrical features. In a first stage we analyze the
distribution of these features. In a second stage we present a
study on the discriminative power for recognizing isolated
characters, using various rendering methods and font
properties. Finally we present interesting results of our
evaluation tests leading to our conclusion and future focus.
Abstract: In this paper, a new proposed system for Persian
printed numeral characters recognition with emphasis on
representation and recognition stages is introduced. For the first time,
in Persian optical character recognition, geometrical central moments
as character image descriptor and fuzzy min-max neural network for
Persian numeral character recognition has been used. Set of different
experiments on binary images of regular, translated, rotated and
scaled Persian numeral characters has been done and variety of
results has been presented. The best result was 99.16% correct
recognition demonstrating geometrical central moments and fuzzy
min-max neural network are adequate for Persian printed numeral
character recognition.
Abstract: Much research into handwritten Thai character
recognition have been proposed, such as comparing heads of
characters, Fuzzy logic and structure trees, etc. This paper presents a
system of handwritten Thai character recognition, which is based on
the Ant-minor algorithm (data mining based on Ant colony
optimization). Zoning is initially used to determine each character.
Then three distinct features (also called attributes) of each character
in each zone are extracted. The attributes are Head zone, End point,
and Feature code. All attributes are used for construct the
classification rules by an Ant-miner algorithm in order to classify
112 Thai characters. For this experiment, the Ant-miner algorithm is
adapted, with a small change to increase the recognition rate. The
result of this experiment is a 97% recognition rate of the training set
(11200 characters) and 82.7% recognition rate of unseen data test
(22400 characters).
Abstract: Cluster analysis is the name given to a diverse collection of techniques that can be used to classify objects (e.g. individuals, quadrats, species etc). While Kohonen's Self-Organizing Feature Map (SOFM) or Self-Organizing Map (SOM) networks have been successfully applied as a classification tool to various problem domains, including speech recognition, image data compression, image or character recognition, robot control and medical diagnosis, its potential as a robust substitute for clustering analysis remains relatively unresearched. SOM networks combine competitive learning with dimensionality reduction by smoothing the clusters with respect to an a priori grid and provide a powerful tool for data visualization. In this paper, SOM is used for creating a toroidal mapping of two-dimensional lattice to perform cluster analysis on results of a chemical analysis of wines produced in the same region in Italy but derived from three different cultivators, referred to as the “wine recognition data" located in the University of California-Irvine database. The results are encouraging and it is believed that SOM would make an appealing and powerful decision-support system tool for clustering tasks and for data visualization.
Abstract: Character segmentation is an important preprocessing
step for text recognition. In degraded documents, existence of
touching characters decreases recognition rate drastically, for any
optical character recognition (OCR) system. In this paper we have
proposed a complete solution for segmenting touching characters in
all the three zones of printed Gurmukhi script. A study of touching
Gurmukhi characters is carried out and these characters have been
divided into various categories after a careful analysis. Structural
properties of the Gurmukhi characters are used for defining the
categories. New algorithms have been proposed to segment the
touching characters in middle zone, upper zone and lower zone.
These algorithms have shown a reasonable improvement in
segmenting the touching characters in degraded printed Gurmukhi
script. The algorithms proposed in this paper are applicable only to
machine printed text. We have also discussed a new and useful
technique to segment the horizontally overlapping lines.
Abstract: Dealing with hundreds of features in character
recognition systems is not unusual. This large number of features
leads to the increase of computational workload of recognition
process. There have been many methods which try to remove
unnecessary or redundant features and reduce feature dimensionality.
Besides because of the characteristics of Farsi scripts, it-s not
possible to apply other languages algorithms to Farsi directly. In this
paper some methods for feature subset selection using genetic
algorithms are applied on a Farsi optical character recognition (OCR)
system. Experimental results show that application of genetic
algorithms (GA) to feature subset selection in a Farsi OCR results in
lower computational complexity and enhanced recognition rate.
Abstract: Optical Character Recognition (OCR) is a very old and of great interest in pattern recognition field. In this paper we introduce a very powerful approach to recognize Persian text. We have used morphological operators, especially Hit/Miss operator to descript each sub-word and by using a template matching approach we have tried to classify generated description. We used just one font in two different sizes to verify our approach. We achieved a very good rate, up to 99.9%.
Abstract: This paper presents a new approach to tackle the problem of recognizing machine-printed Arabic texts. Because of the difficulty of recognizing cursive Arabic words, the text has to be normalized and segmented to be ready for the recognition stage. The new scheme for recognizing Arabic characters depends on multiple parallel neural networks classifier. The classifier has two phases. The first phase categories the input character into one of eight groups. The second phase classifies the character into one of the Arabic character classes in the group. The system achieved high recognition rate.
Abstract: The Block Sorting problem is to sort a given
permutation moving blocks. A block is defined as a substring
of the given permutation, which is also a substring of the
identity permutation. Block Sorting has been proved to be
NP-Hard. Until now two different 2-Approximation algorithms
have been presented for block sorting. These are the best known
algorithms for Block Sorting till date. In this work we present
a different characterization of Block Sorting in terms of a
transposition cycle graph. Then we suggest a heuristic,
which we show to exhibit a 2-approximation performance
guarantee for most permutations.