Clustering of Variables Based on a Probabilistic Approach Defined on the Hypersphere

We consider n individuals described by p standardized variables, so that each variable is represented by a point on the surface of the unit hypersphere S^{n-1}. For a given set of n individuals, we suppose that the observable variables come from a mixture of bipolar Watson distributions defined on the hypersphere. The EM and Dynamic Clusters algorithms are used to identify this mixture. We obtain parameter estimates for each Watson component and, from them, a partition of the set of variables into homogeneous groups. Additionally, we present a factor analysis model in which the unobservable factors are the maximum likelihood estimators of the Watson directional parameters, that is, the first principal component of the data matrix associated with each previously identified group. This alternative model leads to directly interpretable solutions (simple structure) and avoids factor rotation.
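The hard-assignment (Dynamic Clusters style) variant of this procedure can be illustrated with a minimal Python sketch. It is not the authors' implementation, and it rests on several assumptions: the function names (`standardize_to_sphere`, `init_axes`, `dominant_axis`, `cluster_variables`) are hypothetical, the number of groups k is supplied by the user, initialization uses a simple farthest-first rule, and the Watson concentration parameters needed by the full EM algorithm are not estimated. Each variable is assigned to the axis with which it has the largest squared cosine, and each axis is then re-estimated as the leading eigenvector of its group's scatter matrix, which is also the normalized first principal component of that group.

```python
import numpy as np

def standardize_to_sphere(X):
    """Center each column (variable) and scale it to unit Euclidean norm."""
    Xc = X - X.mean(axis=0)
    return Xc / np.linalg.norm(Xc, axis=0)

def init_axes(U, k):
    """Farthest-first start (an assumption): take variable 0, then repeatedly
    add the variable least aligned with the axes already chosen."""
    chosen = [0]
    while len(chosen) < k:
        sq_cos = (U[:, chosen].T @ U) ** 2          # len(chosen) x p
        chosen.append(int(sq_cos.max(axis=0).argmin()))
    return U[:, chosen].copy()

def dominant_axis(members):
    """Leading eigenvector of the scatter matrix members @ members.T,
    obtained as the first left singular vector of the group's data matrix."""
    left, _, _ = np.linalg.svd(members, full_matrices=False)
    return left[:, 0]

def cluster_variables(X, k, n_iter=50):
    """Partition the p columns of X (n x p) into k groups around axes."""
    U = standardize_to_sphere(X)                    # columns have unit norm
    axes = init_axes(U, k)                          # n x k
    labels = np.full(U.shape[1], -1)
    for _ in range(n_iter):
        # Squared cosines make the assignment insensitive to sign (bipolar).
        new_labels = ((axes.T @ U) ** 2).argmax(axis=0)
        if np.array_equal(new_labels, labels):
            break                                   # partition is stable
        labels = new_labels
        for j in range(k):
            members = U[:, labels == j]
            if members.shape[1] > 0:                # keep old axis if empty
                axes[:, j] = dominant_axis(members)
    return labels, axes
```

Calling `labels, axes = cluster_variables(X, k=2)` on an n x p data matrix returns a group label for each variable and one unit-norm axis per group. Because only squared cosines are used, a variable and its negative always fall in the same group.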




