Accelerating GLA with an M-Tree

In this paper, we propose a novel improvement for the generalized Lloyd Algorithm (GLA). Our algorithm makes use of an M-tree index built on the codebook which makes it possible to reduce the number of distance computations when the nearest code words are searched. Our method does not impose the use of any specific distance function, but works with any metric distance, making it more general than many other fast GLA variants. Finally, we present the positive results of our performance experiments.




References:
[1] A. K. Jain and R. C. Dubes, Algorithms for Clustering Data. Prentice-
Hall, 1988.
[2] U. M. Fayyad, G. Piatetsky-Shapiro, P. Smyth, and Uthurusamy,
Advances in Knowledge Discovery and Data Mining. AAAI/MIT Press,
1996.
[3] R. O. Duda and P. E. Hart, Pattern Classification and Scene Analysis.
New York: John Wiley & Sons, 1973.
[4] A. Gersho and R. M. Gray, Vector Quantization and Signal
Compression. Boston: Kluwer Academic, 1992.
[5] P. Ciaccia, M. Patella, P. Zezula, "M -tree: An efficient access method
for similarity search in metric spaces," in Proceedings of the 23rd
Conference on Very Large Databases, 1997, pp. 426-435.
[6] T. Kanungo, D. M. Mount, N. S. Netanyahu, C. Piatko, R. Silverman,
and A. Y. Wu, "Computing nearest neighbors for moving points and
applications to clustering," in Proceedings of the 10th Annual ACM
SIAM Symposium on Discrete Algorithms, 1999, pp. S931-S932.
[7] K. Alsabti, S. Ranka, and V. Singh, , "An efficient k-means clustering
algorithm," in Proceedings of the 1st Workshop on High Performance
Data Mining, 1998.
[8] D. Pelleg and A. Moore, "Accelerating exact k-means algorithms with
geometric reasoning," in Proceedings of the ACM SIGKDD
International Conference on Knowledge Discovery and Data Mining,
1999, pp. 277-281.
[9] C.-D. Bei and R. M. Gray, "An improvement of the minimum distortion
encoding algorithm for vector quantization," IEEE Transactions on
Communications, vol. 33, no. 10, pp. 1132-1133, October 1985.
[10] S.-W. Ra and J.-K. Kim, "A fast mean-distance-ordered partial codebook
search algorithm for image vector quantization," IEEE Transactions on
Circuits and Systems, vol. 40, no. 9, pp. 576-579, September 1993.
[11] S.-H. Chen and W. M. Hsieh, "Fast algorithm for VQ codebook design,"
IEE Proceedings-I, vol. 138, no. 5, pp. 357-362, October 1991.
[12] T. Kaukoranta, P. Fränti, and O. Nevalainen, "A fast exact GLA based
on code vector activity detection," IEEE Transactions on Image
Processing, vol. 9, no. 8, pp. 1337-1342, August 2000.
[13] J. K. Uhlmann, "Satisfying general proximity/similarity queries with
metric trees," Information Processing Letters, vol. 40, no. 4, pp. 175-
179, November 1991.