OHASD: The First On-Line Arabic Sentence Database Handwritten on Tablet PC

In this paper we present the first Arabic sentence dataset for on-line handwriting recognition written on tablet pc. The dataset is natural, simple and clear. Texts are sampled from daily newspapers. To collect naturally written handwriting, forms are dictated to writers. The current version of our dataset includes 154 paragraphs written by 48 writers. It contains more than 3800 words and more than 19,400 characters. Handwritten texts are mainly written by researchers from different research centers. In order to use this dataset in a recognition system word extraction is needed. In this paper a new word extraction technique based on the Arabic handwriting cursive nature is also presented. The technique is applied to this dataset and good results are obtained. The results can be considered as a bench mark for future research to be compared with.