Journey on Image Clustering Based on Color Composition

Image clustering is the process of grouping images by similarity. It typically relies on color, texture, edge, or shape features, or on a combination of them. This research explores image clustering based on color composition. Three main components must be considered: the color space, the image representation (feature extraction), and the clustering method itself. By combining techniques from each of the three components, we aim to identify which combination produces the best clustering results. The color spaces are RGB, HSV, and L*a*b*. The image representations are the histogram and the Gaussian Mixture Model (GMM), and the clustering methods are K-Means and Agglomerative Hierarchical Clustering. The experimental results show that the GMM representation performs better when combined with the RGB and L*a*b* color spaces, whereas the histogram performs better when combined with HSV. The experiments also show that K-Means outperforms Agglomerative Hierarchical Clustering for image clustering.
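
To make the three components concrete, the following is a minimal sketch of one combination named above (HSV color space, histogram representation, K-Means clustering). It is an illustration under stated assumptions, not the implementation used in the experiments: the function names, the hue-only histogram with 8 bins, the farthest-point centroid initialization, and the toy "images" (short lists of RGB pixels) are all hypothetical choices made to keep the example small and deterministic.

```python
import colorsys


def hue_histogram(pixels, bins=8):
    """Normalized hue histogram of an image given as (r, g, b) tuples in 0..255.

    Assumption: only the H channel of HSV is binned, for brevity.
    """
    hist = [0.0] * bins
    for r, g, b in pixels:
        h, _s, _v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        # h lies in [0, 1]; clamp so that h == 1.0 falls into the last bin.
        hist[min(int(h * bins), bins - 1)] += 1.0
    total = float(len(pixels))
    return [count / total for count in hist]


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def init_centroids(vectors, k):
    """Farthest-point initialization (an assumption, chosen to be deterministic)."""
    centroids = [list(vectors[0])]
    while len(centroids) < k:
        farthest = max(
            vectors,
            key=lambda v: min(squared_distance(v, c) for c in centroids),
        )
        centroids.append(list(farthest))
    return centroids


def kmeans(vectors, k, iterations=20):
    """Plain K-Means: alternate nearest-centroid assignment and mean update."""
    centroids = init_centroids(vectors, k)
    labels = [0] * len(vectors)
    for _ in range(iterations):
        labels = [
            min(range(k), key=lambda c: squared_distance(v, centroids[c]))
            for v in vectors
        ]
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Toy data: two reddish and two bluish "images", each a list of 16 pixels.
images = [
    [(250, 10, 10)] * 12 + [(250, 200, 10)] * 4,
    [(240, 30, 20)] * 14 + [(250, 200, 10)] * 2,
    [(10, 10, 250)] * 12 + [(10, 200, 250)] * 4,
    [(20, 30, 240)] * 15 + [(10, 200, 250)] * 1,
]

histograms = [hue_histogram(image) for image in images]
labels = kmeans(histograms, k=2)
print(labels)
```

On this toy data the two reddish images share one label and the two bluish images share the other. Swapping the color space, the representation (for example a GMM fitted to the pixels), or the clustering method changes only the corresponding function, which is the kind of combination the study compares.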



