Combining Color and Layout Features for the Identification of Low-resolution Documents

This paper proposes a method that combines color and layout features to identify documents captured with low-resolution handheld devices. On the one hand, the color density surface of the document image is estimated and represented by an equivalent ellipse; on the other hand, the document's shallow layout structure is computed and represented hierarchically. The combined color and layout features are arranged in a symbolic file, called the document's visual signature, which is unique to each document. Our identification method first uses the color information in the signatures to narrow the search space to documents with a similar color distribution, and then selects the document with the most similar layout structure from the remaining candidates. Our experiment considers slide documents, which are often captured using handheld devices.
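
The two-stage search described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration rather than the method of the paper: the signature is modeled as a dictionary with an `ellipse` tuple and a `layout` list of bounding boxes, the equivalent ellipse is taken to be the ellipse with the same first and second moments as a 2-D color density, and layout similarity is approximated by the mean best intersection-over-union of boxes. The names `equivalent_ellipse`, `identify`, and `color_tol` are hypothetical.

```python
import math

def equivalent_ellipse(density):
    """Moment-based ellipse of a 2-D density given as {(x, y): weight}.

    Returns (cx, cy, a, b, theta): centre, standard deviations along the
    principal axes, and orientation of the major axis in radians.
    """
    total = sum(density.values())
    cx = sum(x * w for (x, _), w in density.items()) / total
    cy = sum(y * w for (_, y), w in density.items()) / total
    sxx = sum(w * (x - cx) ** 2 for (x, _), w in density.items()) / total
    syy = sum(w * (y - cy) ** 2 for (_, y), w in density.items()) / total
    sxy = sum(w * (x - cx) * (y - cy) for (x, y), w in density.items()) / total
    # Closed-form eigenvalues of the 2x2 covariance matrix.
    half_trace = (sxx + syy) / 2
    root = math.sqrt(((sxx - syy) / 2) ** 2 + sxy ** 2)
    a = math.sqrt(max(half_trace + root, 0.0))
    b = math.sqrt(max(half_trace - root, 0.0))
    theta = 0.5 * math.atan2(2 * sxy, sxx - syy)
    return cx, cy, a, b, theta

def iou(p, q):
    """Intersection over union of two boxes given as (x0, y0, x1, y1)."""
    w = min(p[2], q[2]) - max(p[0], q[0])
    h = min(p[3], q[3]) - max(p[1], q[1])
    inter = max(w, 0) * max(h, 0)
    area_p = (p[2] - p[0]) * (p[3] - p[1])
    area_q = (q[2] - q[0]) * (q[3] - q[1])
    union = area_p + area_q - inter
    return inter / union if union else 0.0

def layout_similarity(query_boxes, cand_boxes):
    """Mean, over query boxes, of the best IoU with any candidate box."""
    if not query_boxes or not cand_boxes:
        return 0.0
    best = [max(iou(q, c) for c in cand_boxes) for q in query_boxes]
    return sum(best) / len(best)

def identify(query, database, color_tol):
    """Stage 1: keep documents whose ellipse is close to the query's.
    Stage 2: return the id of the survivor with the most similar layout,
    or None if no document passes the color filter."""
    survivors = {
        doc_id: sig for doc_id, sig in database.items()
        # Compare centre and axis lengths only (an illustrative choice).
        if math.dist(query["ellipse"][:4], sig["ellipse"][:4]) <= color_tol
    }
    if not survivors:
        return None
    return max(
        survivors,
        key=lambda d: layout_similarity(query["layout"], survivors[d]["layout"]),
    )
```

In this sketch the color stage is a cheap filter, so the more expensive layout comparison runs only on the documents that survive it. The tolerance `color_tol` controls that trade-off: a small value prunes aggressively but risks discarding the true match when the capture distorts colors.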



