A Parallel Implementation of k-Means in MATLAB

The aim of this work is a parallel implementation of k-means in MATLAB
that reduces the execution time. Specifically, a new MATLAB function for
the serial k-means algorithm is developed that meets all the requirements
for conversion to a MATLAB function with parallel computations.
Additionally, two different variants for defining the initial values are
presented. The parallel approach is then described. Finally, performance
tests report the computation times with respect to the numbers of
features and classes.
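
As a rough illustration of the idea summarized above, the following is a minimal sketch of k-means whose assignment step is written as a parfor loop. It is not the function developed in this work: the name kmeans_sketch, its arguments, and the random-row initialization are assumptions made only for this example, and the two initialization variants mentioned above are not reproduced here. The sketch assumes MATLAB R2016b or later (for implicit expansion); without the Parallel Computing Toolbox the parfor loop is expected to simply run serially.

```matlab
function [labels, centers] = kmeans_sketch(X, k, maxIter)
% KMEANS_SKETCH  Illustrative k-means with a parallel assignment step.
%   X       n-by-d data matrix (n points, d features)
%   k       number of classes
%   maxIter maximum number of iterations
% Hypothetical helper for illustration only.

n = size(X, 1);
% Assumed initialization: pick k distinct rows of X at random.
centers = X(randperm(n, k), :);
labels = zeros(n, 1);

for iter = 1:maxIter
    newLabels = zeros(n, 1);
    % Assignment step: each point is handled independently,
    % so the loop iterations can be distributed over workers.
    parfor i = 1:n
        d = sum((centers - X(i, :)).^2, 2);  % squared distance to each center
        [~, idx] = min(d);
        newLabels(i) = idx;
    end

    % Stop when no point changes its class.
    if isequal(newLabels, labels)
        break;
    end
    labels = newLabels;

    % Update step: each center becomes the mean of its assigned points.
    for j = 1:k
        members = (labels == j);
        if any(members)
            centers(j, :) = mean(X(members, :), 1);
        end
    end
end
end
```

A call such as [labels, centers] = kmeans_sketch(X, 3, 100) would cluster the rows of X into three classes. The assignment step is the natural candidate for parfor because its cost grows with the number of points, features, and classes, while the update step is comparatively cheap.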