Recognition-based Segmentation in Persian Character Recognition

Optical character recognition of cursive scripts, including Persian, poses challenging problems in both segmentation and recognition. To address them, we combine a newly developed Persian word segmentation method with a recognition-based segmentation technique that compensates for the remaining segmentation errors. The method is robust and flexible, and it increases the system's tolerance to font variations. Results on a comprehensive database show a high degree of accuracy, which meets the requirements for commercial use. Extended with suitable pre- and post-processing, the method offers a simple and fast framework for developing a full OCR system.
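As an illustration of the general idea behind recognition-based segmentation, the sketch below chooses among candidate cut points by asking a recognizer to score each candidate segment and keeping the cut sequence with the highest total log-confidence. It is a generic dynamic-programming sketch and not the method of this paper: the function name `segment_by_recognition`, the `classify` callback, the `max_span` limit, and the toy scores are all assumptions made for the example.

```python
import math
from typing import Callable, List, Optional, Tuple

# Hypothetical recognizer interface: given the start and end column of a
# candidate segment, return (label, confidence in (0, 1]).
Classifier = Callable[[int, int], Tuple[str, float]]


def segment_by_recognition(
    cuts: List[int],
    classify: Classifier,
    max_span: int = 3,
) -> Tuple[List[str], float]:
    """Pick the cut sequence whose segments the recognizer trusts most.

    `cuts` are sorted candidate cut columns, including both word borders.
    A segment may merge up to `max_span` consecutive candidate pieces.
    """
    n = len(cuts)
    best = [-math.inf] * n          # best log-score of a path ending at cut j
    back: List[Optional[Tuple[int, str]]] = [None] * n
    best[0] = 0.0

    for j in range(1, n):
        for i in range(max(0, j - max_span), j):
            if best[i] == -math.inf:
                continue
            label, confidence = classify(cuts[i], cuts[j])
            if confidence <= 0.0:
                continue            # the recognizer rejects this segment
            score = best[i] + math.log(confidence)
            if score > best[j]:
                best[j] = score
                back[j] = (i, label)

    # Walk the back-pointers from the right border to the left border.
    labels: List[str] = []
    j = n - 1
    while j > 0:
        step = back[j]
        if step is None:
            raise ValueError("no valid segmentation for these cuts")
        j, label = step
        labels.append(label)
    labels.reverse()
    return labels, best[n - 1]


# Toy recognizer with made-up confidences, standing in for a real classifier.
TOY_SCORES = {
    (0, 5): ("A", 0.9), (5, 10): ("B", 0.8),
    (0, 3): ("X", 0.3), (3, 5): ("Y", 0.3),
    (3, 10): ("Z", 0.2), (0, 10): ("W", 0.1),
}


def toy_classify(start: int, end: int) -> Tuple[str, float]:
    return TOY_SCORES.get((start, end), ("?", 1e-6))


labels, score = segment_by_recognition([0, 3, 5, 10], toy_classify)
print(labels)
```

In this toy case the spurious cut at column 3 is ignored, because merging the first two pieces gives a segment the recognizer scores much higher than the pieces on their own. Over-segmenting first and letting the recognizer decide which cuts to keep is what makes this family of methods tolerant of imperfect initial cuts.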



