OHASD: The First On-Line Arabic Sentence Database Handwritten on Tablet PC
In this paper we present the first Arabic sentence dataset for
on-line handwriting recognition written on a Tablet PC. The dataset
is natural, simple, and clear. Texts are sampled from daily
newspapers. To collect naturally written handwriting, the forms are
dictated to the writers. The current version of the dataset includes
154 paragraphs written by 48 writers and contains more than 3,800
words and more than 19,400 characters. The handwritten texts were
written mainly by researchers from different research centers. Word
extraction is needed before the dataset can be used in a recognition
system, so we also present a new word extraction technique based on
the cursive nature of Arabic handwriting. The technique is applied
to this dataset with good results, which can serve as a benchmark
for future research.
@article{Elanwar:62192,
  author = "Randa I. M. Elanwar and Mohsen A. Rashwan and Samia A. Mashali",
  title = "OHASD: The First On-Line Arabic Sentence Database Handwritten on Tablet PC",
  journal = "International Journal of Information, Control and Computer Sciences",
  abstract = "In this paper we present the first Arabic sentence dataset for
on-line handwriting recognition written on a Tablet PC. The dataset
is natural, simple, and clear. Texts are sampled from daily
newspapers. To collect naturally written handwriting, the forms are
dictated to the writers. The current version of the dataset includes
154 paragraphs written by 48 writers and contains more than 3,800
words and more than 19,400 characters. The handwritten texts were
written mainly by researchers from different research centers. Word
extraction is needed before the dataset can be used in a recognition
system, so we also present a new word extraction technique based on
the cursive nature of Arabic handwriting. The technique is applied
to this dataset with good results, which can serve as a benchmark
for future research.",
  keywords = "Arabic, handwriting recognition, on-line dataset",
  volume = "4",
  number = "12",
  pages = "1972-6",
}