Mining Genes Relations in Microarray Data Combined with Ontology in Colon Cancer Automated Diagnosis System
The MATCH project [1] entails the development of an
automatic diagnosis system that aims to support the treatment of colon
cancer by discovering mutations that occur in tumour
suppressor genes (TSGs) and contribute to the development of
cancerous tumours. The system is built on a)
colon cancer clinical data and b) biological information that will be
derived by data mining techniques from genomic and proteomic
sources. The core mining module will consist of popular, well-tested
hybrid feature extraction methods and new combined
algorithms designed especially for the project. Elements of rough
sets, evolutionary computing, cluster analysis, self-organizing maps
and association rules will be used to discover the relations
between genes and their influence on tumours [2]-[11].
The methods used to process the data have to address its high
complexity, potential inconsistency and missing values. They must
integrate all the useful information
necessary to answer the expert's question. For this purpose, the system
has to learn from data, or allow a domain specialist to specify
interactively, the part of the knowledge structure it needs to answer a
given query. The program should also take into account the
importance or rank of the particular parts of the data it analyses and
adjust the algorithms used accordingly.
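As a rough illustration of how rough-set elements [2] can cope with the inconsistency mentioned above, the following minimal sketch computes the lower and upper approximations of a decision class over a tiny, invented table of discretized gene-expression levels. The gene names, the values and the function `approximations` are hypothetical and are not taken from the MATCH system. Rows with a missing value (`None`) are simply skipped here, which is only one of several possible policies.

```python
from collections import defaultdict

# Hypothetical toy decision table: discretized expression levels for two
# genes plus a diagnosis label. All values are invented for illustration.
SAMPLES = [
    {"geneA": "high", "geneB": "low",  "label": "tumour"},
    {"geneA": "high", "geneB": "low",  "label": "normal"},  # conflicts with row 0
    {"geneA": "low",  "geneB": "low",  "label": "normal"},
    {"geneA": "high", "geneB": "high", "label": "tumour"},
    {"geneA": None,   "geneB": "high", "label": "tumour"},  # missing value
]


def approximations(samples, attributes, target):
    """Return (lower, upper) approximations of the class `target`
    as sets of row indices, using indiscernibility over `attributes`."""
    # Group rows that cannot be told apart on the chosen attributes.
    blocks = defaultdict(set)
    for idx, row in enumerate(samples):
        key = tuple(row[a] for a in attributes)
        if None in key:
            continue  # assumption: skip rows with missing values
        blocks[key].add(idx)

    positive = {i for i, row in enumerate(samples) if row["label"] == target}
    lower, upper = set(), set()
    for block in blocks.values():
        if block <= positive:   # every row in the block has the target label
            lower |= block
        if block & positive:    # at least one row in the block has it
            upper |= block
    return lower, upper


lower, upper = approximations(SAMPLES, ["geneA", "geneB"], "tumour")
print("certainly tumour:", sorted(lower))
print("possibly tumour:", sorted(upper))
```

The boundary region `upper - lower` contains the samples whose diagnosis cannot be decided from the chosen genes alone, here the two conflicting rows.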
[1] http://www.match-project.com/
[2] Pawlak Z. (1982) Rough sets. International Journal of Information and
Computer Sciences, 11(5):341-356.
[3] Pawlak Z. and Slowinski R. (1994) Rough set approach to multi-attribute
decision analysis. European Journal of Operational Research,
72(3):443-459.
[4] Slezak D. (2005) Association Reducts: A Framework for Mining Multi-attribute
Dependencies. ISMIS 2005: 354-363.
[5] Wroblewski J. (1996) Theoretical Foundations of Order-Based Genetic
Algorithms. Fundam. Inform. 28(3-4): 423-430.
[6] Wroblewski J., Slezak D. (2003) Order Based Genetic Algorithms for
the Search of Approximate Entropy Reducts. RSFDGrC 2003: 308-311.
[7] Yao H., Hamilton H.J., Butz C.J. (2004) A Foundational Approach to
Mining Itemset Utilities from Databases. SDM 2004.
[8] Yao J.T., Yao Y.Y., and Zhao, Y. (2005) Foundations of classification,
in: Lin, T.Y., Ohsuga, S., Liau, C.J. and Hu, X. (Eds), Foundations and
Novel Approaches in Data Mining, Springer, Berlin, pp. 75-97.
[9] Yao Y.Y., Zhong, N. and Zhao, Y. (2004) A three-layered conceptual
framework of data mining, Proceedings of ICDM'04 Workshop of
Foundation of Data Mining, 215-221.
[10] Ziarko, W. (1989) A technique for discovering and analysis of cause-effect
relationships in empirical data. International Joint Conference on
Artificial Intelligence, Proceedings of the Workshop on Knowledge
Discovery in Databases, Detroit, pp. 390-396.
[11] Ziarko, W. (1989) Determination of locally optimal set of features for
representation of implicit knowledge. Proceedings of International
Conference on Computing and Information, Toronto, North Holland,
pp. 433-438.
[12] Baskin C., García-Sastre A., Tumpey T. (2004) Integration of Clinical
Data, Pathology, and cDNA Microarrays in Influenza Virus-Infected
Pigtailed Macaques. Journal of Virology, October 2004, pp. 10420-10432,
Vol. 78, No. 19.
[13] Casey R. M. (2005) Bioinformatics Data Integration. Business
Intelligence Network
[14] Pasquier, C. et al. THEA: ontology-driven analysis of microarray data.
Bioinformatics 20(16), 2636-2643, 2004.
[15] Radetzki, U., Bode, T., Witterstein, G., Gnasa et al. (2003) A Service-Centric
Computing Environment for Heterogeneous Biological
Databases and Methods. In R. Spang, P. Beziat, and M. Vingron (eds.):
Currents in Computational Molecular Biology (RECOMB 2003), pp. 25-26,
April 2003, Berlin, Germany.
[16] Burger, M., Graepel, T., Obermayer, K.: Self-organizing maps:
Generalizations and new optimization techniques. Neurocomputing 20
(1998) pp. 173-190.
[17] Kohonen, T.: Self-organized formation of topologically correct feature
maps. Biological Cybernetics 43 (1982) pp. 59-69.
[18] Gruzdz, A., Ihnatowicz, A., Slezak, D.: Interactive gene clustering - A
case study of breast cancer microarray data. Information Systems
Frontiers (2006) 8:21-27.
@article{"International Journal of Medical, Medicine and Health Sciences:51795",
  author = "A. Gruzdz and A. Ihnatowicz and J. Siddiqi and B. Akhgar",
  title = "Mining Genes Relations in Microarray Data Combined with Ontology in Colon Cancer Automated Diagnosis System",
  keywords = "Bioinformatics, gene expression, ontology, self-organizing maps.",
  volume = "2",
  number = "4",
  pages = "137-5",
}