Web Application for Profiling Scientific Institutions through Citation Mining

Data mining has recently been applied to scientific bibliographic databases to trace the pathways of knowledge or to assess the core scientific relevance of a Nobel laureate or a country. This specific form of data mining has been named citation mining, and it integrates citation bibliometrics with text mining. In this paper we present an improved web implementation of statistical physics algorithms that performs the text mining component of citation mining. In particular, we use an entropy-like distance between compressed texts as an indicator of their similarity. Finally, we include the recently proposed h-index to characterize scientific production. We have used this web implementation to identify the users, applications, and impact of the Mexican scientific institutions located in the State of Morelos.
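
The two quantitative ingredients named in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formula of the entropy-like distance, so the sketch substitutes the normalized compression distance, a standard compression-based similarity measure, and assumes zlib as the compressor. The function names compressed_size, compression_distance and h_index are hypothetical and are not taken from the web implementation.

```python
import zlib


def compressed_size(text: str) -> int:
    """Size in bytes of the zlib-compressed UTF-8 encoding of text."""
    return len(zlib.compress(text.encode("utf-8"), 9))


def compression_distance(a: str, b: str) -> float:
    """Normalized compression distance between two texts.

    Assumption: this is a stand-in for the entropy-like distance
    mentioned in the abstract, not the authors' exact measure.
    Values near 0 suggest similar texts, values near 1 dissimilar ones.
    """
    ca, cb = compressed_size(a), compressed_size(b)
    cab = compressed_size(a + b)
    return (cab - min(ca, cb)) / max(ca, cb)


def h_index(citations: list[int]) -> int:
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank  # the top `rank` papers all have >= rank citations
        else:
            break
    return h
```

With real compressors the distance is only approximate: it need not be exactly 0 for identical texts and can slightly exceed 1, so it is better suited to ranking documents by similarity than to reading as an absolute value.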




