A System to Integrate and Manipulate Protein Database Using BioPerl and XML
The size, complexity and number of databases used for protein
information have caused bioinformatics to lag behind in adapting to
the need to handle this distributed information. Integrating the
information from different databases into one database is a
challenging problem. Our main research aim is to develop a tool that
can be used to access and manipulate protein information from
different databases. In our approach, we have integrated databases
such as Swiss-Prot, PDB, InterPro, and EMBL, and transformed them
from flat-file format into relational form using XML and BioPerl.
We show that the tool can search protein data sets of different
sizes stored in the relational database, and that results are
retrieved faster than from the flat-file databases. A web-based user
interface allows users to access or search for protein information
in the local database.
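
The flat-file-to-relational step described in the abstract can be illustrated with a minimal sketch. It is written in Python with only the standard library rather than BioPerl, and it is not the authors' implementation. The record layout (a simplified Swiss-Prot-style subset with ID, AC and DE lines and a sequence block ended by //), the accession values, the XML element names and the single-table schema are all assumptions made for illustration.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical, simplified Swiss-Prot-style flat-file text (illustrative only).
FLAT_FILE = """\
ID   TEST1_HUMAN
AC   DEMO01;
DE   Hypothetical test protein 1.
SQ   SEQUENCE
     MKTAYIAKQR QISFVKSHFS
//
ID   TEST2_MOUSE
AC   DEMO02;
DE   Hypothetical test protein 2.
SQ   SEQUENCE
     GSHMLEDPVA
//
"""


def parse_flat_file(text):
    """Yield one dict per record from the simplified flat-file layout."""
    record, in_sequence = {}, False
    for line in text.splitlines():
        if line.startswith("//"):          # record terminator
            if record:
                yield record
            record, in_sequence = {}, False
        elif line.startswith("SQ"):        # sequence residues follow this line
            in_sequence = True
            record["sequence"] = ""
        elif in_sequence:
            record["sequence"] += line.replace(" ", "")
        elif line.startswith("ID"):
            record["entry_name"] = line[5:].split()[0]
        elif line.startswith("AC"):
            record["accession"] = line[5:].split(";")[0].strip()
        elif line.startswith("DE"):
            record["description"] = line[5:].strip()


def records_to_xml(records):
    """Build an intermediate XML tree (element names are assumptions)."""
    root = ET.Element("proteins")
    for rec in records:
        node = ET.SubElement(root, "protein", accession=rec["accession"])
        for field in ("entry_name", "description", "sequence"):
            ET.SubElement(node, field).text = rec[field]
    return root


def load_xml_into_db(root, conn):
    """Create a one-table schema and insert one row per <protein> element."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS protein ("
        "accession TEXT PRIMARY KEY, entry_name TEXT, "
        "description TEXT, sequence TEXT)"
    )
    rows = [
        (
            node.get("accession"),
            node.findtext("entry_name"),
            node.findtext("description"),
            node.findtext("sequence"),
        )
        for node in root.iter("protein")
    ]
    conn.executemany("INSERT INTO protein VALUES (?, ?, ?, ?)", rows)
    conn.commit()


def build_database(text=FLAT_FILE):
    """Run the whole pipeline into an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    load_xml_into_db(records_to_xml(parse_flat_file(text)), conn)
    return conn


def find_by_accession(conn, accession):
    """Look a protein up by primary key instead of scanning the flat file."""
    return conn.execute(
        "SELECT entry_name, description, sequence FROM protein "
        "WHERE accession = ?",
        (accession,),
    ).fetchone()
```

Calling `build_database()` and then `find_by_accession(conn, "DEMO01")` returns the stored row for that accession. The primary-key lookup is one plausible reason a relational store could answer searches faster than a linear scan of flat files, although the sketch itself measures nothing.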
@article{"International Journal of Medical, Medicine and Health Sciences:49688",
  author = "Zurinahni Zainol and Rosalina Abdul Salam and Rosni Abdullah and Nur'Aini and Wahidah Husain",
  title = "A System to Integrate and Manipulate Protein Database Using BioPerl and XML",
  abstract = "The size, complexity and number of databases used for protein
information have caused bioinformatics to lag behind in adapting to
the need to handle this distributed information. Integrating the
information from different databases into one database is a
challenging problem. Our main research aim is to develop a tool that
can be used to access and manipulate protein information from
different databases. In our approach, we have integrated databases
such as Swiss-Prot, PDB, InterPro, and EMBL, and transformed them
from flat-file format into relational form using XML and BioPerl.
We show that the tool can search protein data sets of different
sizes stored in the relational database, and that results are
retrieved faster than from the flat-file databases. A web-based user
interface allows users to access or search for protein information
in the local database.",
  keywords = "Protein sequence database, relational database, integrated database.",
  volume = "1",
  number = "6",
  pages = "354-4",
}