Urdu Nastaleeq Optical Character Recognition

Scholarly

Volume: 1, Issue: 8, 2007, Page No: 2322-2325

International Journal of Information, Control and Computer Sciences

ISSN: 2517-9942

DOI: 10.5281/zenodo.1055691

This paper discusses the characteristics of the Urdu script, the Nastaleeq style, and a simple yet novel and robust technique for recognizing printed Urdu script without a lexicon. Urdu, which belongs to the Arabic script family, is cursive and complex by nature. The main complexity of compound (connected) Urdu text lies not in the connections themselves but in the shapes a character takes depending on whether it appears at the beginning, in the middle, or at the end of a word. The character recognition technique presented here uses this inherent complexity of the Urdu script to solve the problem. A word is scanned and analyzed for its level of complexity; each point where the level of complexity changes is marked as a character boundary, and the segmented characters are fed to neural networks. A prototype of the system has been tested on Urdu text and currently achieves 93.4% accuracy on average.
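
The segmentation idea in the abstract can be illustrated with a minimal sketch. The abstract does not define how "level of complexity" is measured, so the code below assumes a stand-in measure: the number of separate ink runs in each pixel column of a binarized word image. The function names (`column_complexity`, `change_points`, `segment`) and the image representation are hypothetical and are not taken from the paper; the neural network stage is omitted.

```python
from typing import List, Tuple

# Binary word image as rows of pixels: 1 = ink, 0 = background (assumed format).
Image = List[List[int]]


def column_complexity(img: Image) -> List[int]:
    """Assumed complexity measure: number of separate vertical ink runs per column."""
    height, width = len(img), len(img[0])
    profile = []
    for x in range(width):
        runs, prev = 0, 0
        for y in range(height):
            # A new run starts wherever ink follows background.
            if img[y][x] == 1 and prev == 0:
                runs += 1
            prev = img[y][x]
        profile.append(runs)
    return profile


def change_points(profile: List[int]) -> List[int]:
    """Columns whose complexity differs from that of the previous column."""
    return [x for x in range(1, len(profile)) if profile[x] != profile[x - 1]]


def segment(img: Image) -> List[Tuple[int, int]]:
    """Cut the word at every change point; returns (start, end) column spans,
    end exclusive, skipping spans that contain no ink."""
    profile = column_complexity(img)
    cuts = [0] + change_points(profile) + [len(profile)]
    return [(a, b) for a, b in zip(cuts, cuts[1:]) if any(profile[a:b])]


if __name__ == "__main__":
    toy_word = [
        [1, 1, 0, 0, 0, 1],
        [0, 0, 0, 0, 0, 1],
        [1, 1, 1, 1, 0, 1],
    ]
    # Each span would then be cropped and passed to a classifier.
    print(segment(toy_word))
```

The columns are scanned left to right here purely for simplicity; Urdu is written right to left, and a real system would also need to handle the diagonal, overlapping layout of Nastaleeq, which this toy measure does not address.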
