Abstract: Current OCR technology cannot accurately recognize
small text images, such as those found in web images. Our goal
is to investigate new approaches for recognizing very
low-resolution text images containing anti-aliased character
shapes.
This paper presents a preliminary study of the variability of
such characters and the feasibility of discriminating them
using geometrical features. In a first stage, we analyze the
distribution of these features. In a second stage, we study
their discriminative power for recognizing isolated
characters under various rendering methods and font
properties. Finally, we present the results of our
evaluation tests, which lead to our conclusions and future focus.
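
The abstract does not say which geometrical features are used. As a purely illustrative sketch, the code below assumes intensity-weighted moments (ink mass, centre of gravity, and second-order central moments), which remain well defined when anti-aliasing produces intermediate gray levels. The function name and the glyph representation are hypothetical and are not taken from the paper.

```python
def geometric_features(glyph):
    """Intensity-weighted moments of a small grayscale glyph.

    glyph: list of rows, each a list of ink intensities in [0, 1]
    (0 = background, 1 = full ink). Anti-aliased pixels take
    intermediate values. This representation is an assumption.
    """
    mass = sum(sum(row) for row in glyph)
    if mass == 0:
        raise ValueError("glyph contains no ink")

    # Centre of gravity, weighted by ink intensity.
    cx = sum(x * v for row in glyph for x, v in enumerate(row)) / mass
    cy = sum(y * v for y, row in enumerate(glyph) for v in row) / mass

    # Second-order central moments: horizontal and vertical spread of the ink.
    mu20 = sum((x - cx) ** 2 * v
               for row in glyph for x, v in enumerate(row)) / mass
    mu02 = sum((y - cy) ** 2 * v
               for y, row in enumerate(glyph) for v in row) / mass

    return {"mass": mass, "cx": cx, "cy": cy, "mu20": mu20, "mu02": mu02}


# A 3x3 anti-aliased vertical stroke: full ink in the middle, faded ends.
stroke = [
    [0.0, 0.5, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.5, 0.0],
]
features = geometric_features(stroke)
```

For this stroke the horizontal spread `mu20` is zero and the vertical spread `mu02` is positive, which is the kind of contrast a classifier could use to separate thin vertical shapes from wider ones.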
Abstract: This paper proposes a method, combining color and
layout features, for identifying documents captured from low-resolution
handheld devices. On the one hand, the color density surface of the
document image is estimated and represented by an equivalent
ellipse; on the other hand, the shallow layout structure of the document
is computed and represented hierarchically. The combined color and
layout features are stored in a symbolic file, which is unique to
each document and is called the document's visual signature. Our
identification method first uses the color information in the
signatures to narrow the search space to documents with a
similar color distribution, and then selects the document with the
most similar layout structure in the remaining search space.
Our experiment considers slide documents, which are often captured
using handheld devices.
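
The two-stage search described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the equivalent ellipse is derived from the first and second moments of the density surface, the ellipse distance is a plain Euclidean distance over the ellipse parameters, and the hierarchical layout is simplified to a flat set of block labels compared with Jaccard similarity. All function names, the signature format, and the `color_tolerance` parameter are hypothetical.

```python
import math


def equivalent_ellipse(density):
    """Ellipse with the same first and second moments as a 2-D density.

    density: dict mapping (x, y) -> non-negative weight (assumed format).
    Returns (cx, cy, semi_major, semi_minor, angle_in_radians).
    """
    total = sum(density.values())
    cx = sum(x * w for (x, _), w in density.items()) / total
    cy = sum(y * w for (_, y), w in density.items()) / total
    sxx = sum((x - cx) ** 2 * w for (x, _), w in density.items()) / total
    syy = sum((y - cy) ** 2 * w for (_, y), w in density.items()) / total
    sxy = sum((x - cx) * (y - cy) * w for (x, y), w in density.items()) / total

    # Eigenvalues of the 2x2 covariance matrix give the spread per axis.
    mean = (sxx + syy) / 2
    half_gap = math.sqrt(((sxx - syy) / 2) ** 2 + sxy ** 2)
    lam_major = mean + half_gap
    lam_minor = max(mean - half_gap, 0.0)
    angle = 0.5 * math.atan2(2 * sxy, sxx - syy)

    # A uniformly filled ellipse with semi-axis a has variance a^2 / 4.
    return cx, cy, 2 * math.sqrt(lam_major), 2 * math.sqrt(lam_minor), angle


def layout_similarity(a, b):
    """Jaccard similarity between two sets of layout block labels."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def identify(query, signatures, color_tolerance=0.1):
    """Return the name of the best-matching signature, or None.

    Stage 1 keeps documents whose color ellipse is close to the query's.
    Stage 2 picks the most similar layout among the remaining candidates.
    """
    candidates = [
        name for name, sig in signatures.items()
        if math.dist(query["ellipse"], sig["ellipse"]) <= color_tolerance
    ]
    return max(
        candidates,
        key=lambda name: layout_similarity(query["layout"],
                                           signatures[name]["layout"]),
        default=None,
    )
```

Filtering on color first keeps the more expensive layout comparison confined to a small candidate set, at the cost of rejecting a document whose layout matches but whose color estimate is off.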
Abstract: This paper proposes a method, combining color and layout features, for identifying documents captured from low-resolution handheld devices. On the one hand, the color density surface of the document image is estimated and represented by an equivalent ellipse; on the other hand, the shallow layout structure of the document is computed and represented hierarchically. Our identification method first uses the color information in the documents to narrow the search space to documents with a similar color distribution, and then selects the document with the most similar layout structure in the remaining search space.