Fractal Analysis of 16S rRNA Gene Sequences in Archaea Thermophiles

A nucleotide sequence can be expressed as a numerical sequence by assigning each nucleotide its proton number. The resulting numerical sequence can be analyzed for its fractal dimension, which can then be compared across species in terms of evolution and chemical properties. We investigated such nucleotide fluctuation in the 16S rRNA gene of archaea thermophiles. The studied archaea thermophiles were Archaeoglobus fulgidus, Methanothermobacter thermautotrophicus, Methanocaldococcus jannaschii, Pyrococcus horikoshii, and Thermoplasma acidophilum. These five archaea-euryarchaeota thermophiles have fractal dimension values ranging from 1.93 to 1.97. Computer simulation shows that random sequences would have an average fractal dimension of about 2 with a standard deviation of about 0.015. The fractal dimension was found to correlate negatively with the thermophile's optimal growth temperature, with an R² value of 0.90 (N = 5). Including two archaea-crenarchaeota thermophiles reduces the R² value to 0.66 (N = 7), and further including two bacterial thermophiles reduces it to 0.50 (N = 9). The fractal dimension correlates positively with the sequence GC content, with an R² value of 0.89 for the five archaea-euryarchaeota thermophiles (and 0.74 for the entire set of N = 9), although computer simulation shows little correlation. The highest correlation (positive) was found between the fractal dimension and di-nucleotide Shannon entropy. However, Shannon entropy and sequence GC content were observed to correlate with optimal growth temperature, with R² values of 0.8 (negative) and 0.88 (positive), respectively, for the entire set of 9 thermophiles; thus the correlation lacks species specificity.
Together with another correlation study of bacterial radiation dosage with RecA repair gene sequence fractal dimension, it is postulated that fractal dimension analysis is a sensitive tool for studying the relationship between genotype and phenotype among closely related sequences.
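
The quantities named in the abstract can be illustrated with a minimal sketch. It rests on several assumptions that the abstract does not state: the "proton number" of a nucleotide is taken as the total proton count of its nucleobase (A = 70, C = 58, G = 78, T = 66, obtained by summing atomic numbers), the fractal dimension is estimated with a Higuchi-style curve-length method, di-nucleotide entropy is computed over overlapping pairs in bits, and all function names and the `kmax` parameter are hypothetical.

```python
import math
import random
from collections import Counter

# Assumed mapping: total proton count (sum of atomic numbers) of each nucleobase.
PROTONS = {"A": 70, "C": 58, "G": 78, "T": 66}


def to_numeric(seq):
    """Map a DNA string to proton numbers, skipping any non-ACGT symbol."""
    return [PROTONS[b] for b in seq.upper() if b in PROTONS]


def _moments(xs, ys):
    """Return the centered sums Sxx, Syy, Sxy used by slope and R^2."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxx, syy, sxy


def r_squared(xs, ys):
    """Coefficient of determination of a simple linear fit of ys on xs."""
    sxx, syy, sxy = _moments(xs, ys)
    return sxy ** 2 / (sxx * syy)


def higuchi_fd(x, kmax=8):
    """Higuchi-style fractal dimension: slope of ln L(k) versus ln(1/k)."""
    n = len(x)
    log_inv_k, log_len = [], []
    for k in range(1, kmax + 1):
        lengths = []
        for m in range(k):  # one sub-series per starting offset m
            steps = (n - 1 - m) // k
            if steps < 1:
                continue
            dist = sum(abs(x[m + i * k] - x[m + (i - 1) * k])
                       for i in range(1, steps + 1))
            # Normalise for the number of steps and the lag k.
            lengths.append(dist * (n - 1) / (steps * k) / k)
        log_inv_k.append(math.log(1.0 / k))
        log_len.append(math.log(sum(lengths) / len(lengths)))
    sxx, _, sxy = _moments(log_inv_k, log_len)
    return sxy / sxx


def gc_content(seq):
    """Fraction of G and C among the ACGT symbols of the sequence."""
    bases = [b for b in seq.upper() if b in PROTONS]
    return sum(b in "GC" for b in bases) / len(bases)


def dinucleotide_entropy(seq):
    """Shannon entropy (bits) of overlapping di-nucleotide frequencies."""
    s = "".join(b for b in seq.upper() if b in PROTONS)
    counts = Counter(s[i:i + 2] for i in range(len(s) - 1))
    total = sum(counts.values())
    return -sum(c / total * math.log2(c / total) for c in counts.values())


# Illustrative run on a random sequence of roughly 16S rRNA gene length.
rng = random.Random(0)
random_seq = "".join(rng.choice("ACGT") for _ in range(1500))
fd_random = higuchi_fd(to_numeric(random_seq))
print(fd_random, gc_content(random_seq), dinucleotide_entropy(random_seq))
```

Under these assumptions, an uncorrelated random sequence should give an estimate close to 2, in line with the simulation result quoted above. To reproduce a correlation such as fractal dimension against optimal growth temperature, `r_squared` would be called with one value per species. The species data themselves are not given here and are not invented in this sketch.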




