A Comparison of SVM-based Criteria in Evolutionary Method for Gene Selection and Classification of Microarray Data

An evolutionary method whose selection and recombination operations are based on generalization error bounds of the support vector machine (SVM) can select a subset of potentially informative genes for an SVM classifier very efficiently [7]. In this paper, we use the derivative of the error bound (first-order criterion) to select and recombine gene features in the evolutionary process, and compare its performance with that of the error bound itself (zero-order criterion). We also investigate several error bounds and their derivatives to find the best criterion for gene selection and classification. We use 7 cancer-related human gene expression datasets to evaluate the zero-order and first-order criteria. Though both criteria follow the same strategy in theory, experimental results demonstrate the best criterion for microarray gene expression data.

A Phenomic Algorithm for Reconstruction of Gene Networks

The goal of gene expression analysis is to understand the processes that underlie the regulatory networks and pathways controlling inter-cellular and intra-cellular activities. Microarray datasets are now used extensively for this purpose. The scope of such analysis has recently broadened towards the reconstruction of gene networks and other holistic approaches of systems biology. Evolutionary methods are proving successful on such problems, and a number of them have been proposed. However, all these methods are based on processing genotypic information. There is therefore a need to develop evolutionary methods that address phenotypic interactions together with genotypic interactions. We present a novel evolutionary approach, called the Phenomic algorithm, in which the focus is on phenotypic interaction. We use the expression profiles of genes to model the interactions between them at the phenotypic level. We apply this algorithm to the yeast sporulation dataset and show that it can identify gene networks with relative ease.

Role of Oxidative DNA Damage in Pathogenesis of Diabetic Neuropathy

Oxidative stress is considered to be a cause of the onset and progression of type 2 diabetes mellitus (T2DM) and of its complications, including neuropathy. It is a deleterious process that can be an important mediator of damage to cell structures: proteins, lipids and DNA. Data suggest that DNA repair is impaired in patients with diabetes and diabetic neuropathy, which prevents effective removal of lesions. Objective: The aim of our study was to evaluate the association of the hOGG1 (326 Ser/Cys) and XRCC1 (194 Arg/Trp, 399 Arg/Gln) gene polymorphisms, whose proteins are involved in the BER pathway, with DNA repair efficiency in patients with type 2 diabetes and diabetic neuropathy compared with healthy subjects. Genotypes were determined by PCR-RFLP analysis in 385 subjects, including 117 with type 2 diabetes, 56 with diabetic neuropathy and 212 with normal glucose metabolism. The polymorphisms studied include codon 326 of hOGG1 and codons 194 and 399 of XRCC1 in the base excision repair (BER) genes. The comet assay was carried out using peripheral blood lymphocytes from the patients and controls. This test enabled the evaluation of DNA damage in cells exposed to hydrogen peroxide alone and in combination with endonuclease III (Nth). The results of the polymorphism analysis were examined statistically by calculating odds ratios (OR) and their 95% confidence intervals (95% CI) using χ² tests. Our data indicate that patients with type 2 diabetes mellitus (including those with neuropathy) had a higher frequency of the XRCC1 399Arg/Gln polymorphism in the homozygous form (GG) (OR: 1.85 [95% CI: 1.07-3.22], P=0.3) and also an increased frequency of the 399Gln (G) allele (OR: 1.38 [95% CI: 1.03-1.83], P=0.3). No association was found between the other polymorphisms and an increased risk of diabetes or diabetic neuropathy. In T2DM patients with neuropathy, repair of oxidative DNA damage induced by hydrogen peroxide was less efficient in both the presence and absence of the Nth enzyme.
The results of our study suggest that the XRCC1 399 Arg/Gln polymorphism is a significant risk factor for T2DM in the Polish population. The data obtained suggest that decreased efficiency of DNA repair in cells from patients with diabetes and neuropathy may be associated with oxidative stress. Additionally, patients with neuropathy are characterized by even greater sensitivity to oxidative damage than patients with diabetes, which suggests that free radicals participate in the pathogenesis of neuropathy.
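
To make the odds-ratio calculation mentioned in the abstract concrete, the following is a minimal sketch of an OR with a Woolf (log-method) 95% confidence interval for a 2x2 table. The function name and the counts are hypothetical and are not taken from the study, and the abstract does not state which interval method the authors used.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Woolf (log-method) confidence interval for a 2x2 table.

    a: cases with the variant,    b: cases without it
    c: controls with the variant, d: controls without it
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for the log method")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper

# Hypothetical counts, for illustration only (not the study's data)
or_value, ci_low, ci_high = odds_ratio_ci(40, 60, 25, 75)
print(f"OR = {or_value:.2f}, 95% CI = [{ci_low:.2f}, {ci_high:.2f}]")
```

An interval that excludes 1 is conventionally read as a statistically significant association at the 5% level.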

Modeling Stress-Induced Regulatory Cascades with Artificial Neural Networks

Yeast cells live in a constantly changing environment that requires the continuous adaptation of their genomic program in order to sustain their homeostasis, survive and proliferate. Due to the advancement of high-throughput technologies, there is currently a large amount of data, such as gene expression, gene deletion and protein-protein interactions, for S. cerevisiae under various environmental conditions. Mining these datasets requires efficient computational methods capable of integrating different types of data, identifying inter-relations between different components and inferring functional groups or 'modules' that shape intracellular processes. This study uses computational methods to delineate some of the mechanisms used by yeast cells to respond to environmental changes. The GRAM algorithm is first used to integrate gene expression data and ChIP-chip data in order to find modules of co-expressed and co-regulated genes as well as the transcription factors (TFs) that regulate these modules. Since transcription factors are themselves transcriptionally regulated, a three-layer regulatory cascade consisting of the TF-regulators, the TFs and the regulated modules is subsequently considered. This three-layer cascade is then modeled quantitatively using artificial neural networks (ANNs), where the input layer corresponds to the expression of the upstream transcription factors (TF-regulators) and the output layer corresponds to the expression of genes within each module. This work (a) shows that the expression of at least 33 genes over time and for different stress conditions is well predicted by the expression of the top-layer transcription factors, including cases in which the effect of upstream regulators is shifted in time, and (b) identifies at least 6 novel regulatory interactions that were not previously associated with stress-induced changes in gene expression.
These findings suggest that the combination of gene expression and protein-DNA interaction data with artificial neural networks can successfully model biological pathways and capture quantitative dependencies between distant regulators and downstream genes.

Community Detection-based Analysis of the Human Interactome Network

The study of proteomics has reached unexpected levels of interest as a direct consequence of its discovered influence on some complex biological phenomena, such as problematic diseases like cancer. This paper presents a new technique that allows for an accurate analysis of the human interactome network. It is a two-step analysis process that first detects each protein's absolute importance through the betweenness centrality computation. The second step then determines the functionally related communities of proteins. For this purpose, we use a community detection technique that is based on the edge betweenness calculation. The new technique was thoroughly tested on real biological data, and the results reveal some interesting properties of the proteins that are involved in the carcinogenesis process. Apart from its experimental usefulness, the novel technique is also computationally efficient in terms of execution time. Based on the analysis results, some topological features of cancer-mutated proteins are presented and a possible optimization solution for cancer drug design is suggested.
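
The first step described above, ranking proteins by betweenness centrality, can be sketched as follows. This is an illustrative pure-Python version of Brandes' algorithm for an unweighted, undirected graph, and it is not the authors' implementation. The protein names and the toy network are hypothetical, and the edge-betweenness community detection of the second step is not shown.

```python
from collections import deque

def betweenness_centrality(adjacency):
    """Unnormalized node betweenness for an undirected, unweighted graph
    given as {node: set_of_neighbours} (Brandes' algorithm)."""
    centrality = {v: 0.0 for v in adjacency}
    for source in adjacency:
        # Breadth-first search that counts shortest paths from the source
        order = []
        predecessors = {v: [] for v in adjacency}
        n_paths = {v: 0 for v in adjacency}
        distance = {v: -1 for v in adjacency}
        n_paths[source] = 1
        distance[source] = 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adjacency[v]:
                if distance[w] < 0:
                    distance[w] = distance[v] + 1
                    queue.append(w)
                if distance[w] == distance[v] + 1:
                    n_paths[w] += n_paths[v]
                    predecessors[w].append(v)
        # Accumulate pair dependencies in reverse breadth-first order
        dependency = {v: 0.0 for v in adjacency}
        for w in reversed(order):
            for v in predecessors[w]:
                dependency[v] += n_paths[v] / n_paths[w] * (1 + dependency[w])
            if w != source:
                centrality[w] += dependency[w]
    # Each undirected pair was counted once from each endpoint
    return {v: c / 2 for v, c in centrality.items()}

# Toy star-shaped network with hypothetical protein names
toy = {"P1": {"P2"}, "P2": {"P1", "P3", "P4"}, "P3": {"P2"}, "P4": {"P2"}}
scores = betweenness_centrality(toy)
```

In this toy network every shortest path between two outer proteins passes through P2, so P2 receives the highest score and the outer proteins score zero.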

Changes to Oxidative Stress Levels Following Exposure to Formaldehyde in Lymphocytes

Formaldehyde is a chemical substance that is used illegally to preserve foods such as fish and vegetables. It can promote carcinogenesis. Superoxide dismutases are important antioxidative enzymes that catalyze the dismutation of superoxide anion into oxygen and hydrogen peroxide. The resultant level of oxidative stress in formaldehyde-treated lymphocytes was investigated. Human lymphocytes were treated with formaldehyde at concentrations of 0, 20, 40, 60, 80 and 120 μmol/L for 12 hours. After 12 hours of treatment, the change in superoxide dismutase activity was measured in the formaldehyde-treated lymphocytes. The results showed that formaldehyde concentrations of 60, 80 and 120 μmol/L significantly decreased superoxide dismutase activity in lymphocytes (P < 0.05). The change in superoxide dismutase activity in formaldehyde-treated lymphocytes may serve as a biomarker for detecting cellular injury, such as damage to DNA, due to formaldehyde exposure.

Ultrasonic Evaluation of Bone Callus Growth in a Rabbit Tibial Distraction Model

Ultrasound is useful in demonstrating bone mineral density of regenerating osseous tissue as well as structural alterations. A proposed ultrasound method, which included ultrasonography and acoustic parameters measurement, was employed to evaluate its efficacy in monitoring the bone callus changes in a rabbit tibial distraction osteogenesis (DO) model. The findings demonstrated that ultrasonographic images depicted characteristic changes of the bone callus, typical of histology findings, during the distraction phase. Follow-up acoustic parameters measurement of the bone callus, including speed of sound, reflection and attenuation, showed significant linear changes over time during the distraction phase. The acoustic parameters obtained during the distraction phase also showed moderate to strong correlation with consolidated bone callus density and micro-architecture measured by micro-computed tomography at the end of the consolidation phase. The results support the preferred use of ultrasound imaging in the early monitoring of bone callus changes during DO treatment.

Function of miR-125b in Zebrafish Neurogenesis

MicroRNAs are an important class of gene expression regulators that are involved in many biological processes including embryogenesis. miR-125b is a conserved microRNA that is enriched in the nervous system. We have previously reported the function of miR-125b in neuronal differentiation of human cell lines. We also discovered the function of miR-125b in regulating p53 in human and zebrafish. Here we further characterize the brain defects in zebrafish embryos injected with morpholinos against miR-125b. Our data confirm the essential role of miR-125b in brain morphogenesis, particularly in maintaining the balance between proliferation, cell death and differentiation. We identified lunatic fringe (lfng) as an additional target of miR-125b in human and zebrafish and suggest that lfng may mediate the function of miR-125b in neurogenesis. Together, this report reveals new insights into the function of miR-125b during neural development of zebrafish.

Preparation and Bioevaluation of DOTA-Cyclic RGD Peptide Dimer Labeled with 68Ga

Radiolabeled cyclic RGD peptides targeting integrin αvβ3 are reported as promising agents for the early diagnosis of metastatic tumors. With the aim of improving tumor uptake and retention of the peptide, the cyclic RGD peptide dimer E[c(RGDfK)]2 (E = glutamic acid, f = phenylalanine, K = lysine) coupled to the bifunctional chelator DOTA was custom synthesized and radiolabeled with 68Ga. Radiolabeling of the cyclic RGD peptide dimer with 68Ga was carried out using HEPES buffer, and biological evaluation of the complex was done in nude mice bearing HT29 tumors.

A Novel Microarray Biclustering Algorithm

Biclustering aims at identifying several biclusters that reveal potential local patterns in a microarray matrix. A bicluster is a sub-matrix of the microarray consisting of only a subset of genes that are co-regulated under a subset of conditions. In this study, we extend the idea of subspace clustering to present a K-biclusters clustering (KBC) algorithm for the microarray biclustering problem. Besides minimizing the dissimilarities between genes and bicluster centers within all biclusters, the objective function of the KBC algorithm also minimizes the residues within all biclusters based on the mean squared residue model. In addition, the objective function maximizes the entropy of conditions to encourage more conditions to contribute to the identification of biclusters. The KBC algorithm adopts a K-means type clustering process to optimize the partition into K biclusters efficiently. A set of experiments on a practical microarray dataset demonstrates the performance of the proposed KBC algorithm.
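
The mean squared residue referred to above can be illustrated with a short sketch. It follows the standard Cheng and Church definition and covers only the residue term, not the full KBC objective, which also includes dissimilarity and entropy terms. The function name and the toy expression matrix are assumptions made for illustration.

```python
def mean_squared_residue(matrix, rows, cols):
    """Mean squared residue (Cheng and Church definition) of the sub-matrix
    of `matrix` selected by row indices `rows` and column indices `cols`."""
    sub = [[matrix[i][j] for j in cols] for i in rows]
    n_rows, n_cols = len(rows), len(cols)
    row_means = [sum(r) / n_cols for r in sub]
    col_means = [sum(sub[i][j] for i in range(n_rows)) / n_rows
                 for j in range(n_cols)]
    overall = sum(row_means) / n_rows
    total = 0.0
    for i in range(n_rows):
        for j in range(n_cols):
            # Residue: deviation of the entry from the additive row/column model
            residue = sub[i][j] - row_means[i] - col_means[j] + overall
            total += residue ** 2
    return total / (n_rows * n_cols)

# Hypothetical expression matrix: rows are genes, columns are conditions.
# The first three conditions follow an additive pattern; the fourth does not.
expression = [
    [1.0, 2.0, 3.0, 9.0],
    [2.0, 3.0, 4.0, 0.0],
    [4.0, 5.0, 6.0, 5.0],
]
coherent = mean_squared_residue(expression, [0, 1, 2], [0, 1, 2])
mixed = mean_squared_residue(expression, [0, 1, 2], [0, 1, 2, 3])
```

A residue close to zero indicates that the selected genes vary coherently across the selected conditions, which is why adding the fourth condition raises the score.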

Eukaryotic Gene Prediction by an Investigation of Nonlinear Dynamical Modeling Techniques on EIIP Coded Sequences

Many digital signal processing techniques have been used to automatically distinguish protein-coding regions (exons) from non-coding regions (introns) in DNA sequences. In this work, we have characterized these sequences according to their nonlinear dynamical features such as moment invariants, correlation dimension, and largest Lyapunov exponent estimates. We have applied our model to a number of real sequences encoded into a time series using EIIP sequence indicators. In order to discriminate between coding and non-coding DNA regions, the phase space trajectory was first reconstructed for coding and non-coding regions. Nonlinear dynamical features were then extracted from those regions and used to investigate the differences between them. Our results indicate that the nonlinear dynamical characteristics differ significantly between coding regions (CR) and non-coding regions (NCR) in DNA sequences. Finally, the classifier is tested on real genes in which the coding and non-coding regions are well known.
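
The encoding and phase-space reconstruction steps described above can be sketched as follows. The EIIP values are the ones commonly quoted for the four nucleotides. The embedding dimension and delay are arbitrary illustrative choices, since the abstract does not state the values used, and the feature extraction (correlation dimension, Lyapunov exponents) is not shown.

```python
# Electron-ion interaction potential (EIIP) values commonly used for nucleotides
EIIP = {"A": 0.1260, "C": 0.1340, "G": 0.0806, "T": 0.1335}

def eiip_encode(sequence):
    """Map a DNA string to a numeric series of EIIP values."""
    return [EIIP[base] for base in sequence.upper()]

def delay_embed(series, dimension, delay):
    """Time-delay embedding: each point is
    (x[t], x[t + delay], ..., x[t + (dimension - 1) * delay])."""
    span = (dimension - 1) * delay
    return [
        tuple(series[t + k * delay] for k in range(dimension))
        for t in range(len(series) - span)
    ]

# Short hypothetical sequence; dimension and delay are assumed values
signal = eiip_encode("ATGCGT")
trajectory = delay_embed(signal, dimension=3, delay=1)
```

Each tuple in `trajectory` is one point of the reconstructed phase-space trajectory, from which nonlinear features would then be estimated.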

Dexamethasone: Impact on Testicular Activity

Dexamethasone (Dex) is a synthetic glucocorticoid that is used in therapy. However, prolonged treatments with high doses are often required. This causes side effects that interfere with the activity of several endocrine systems, including the gonadotropic axis. The aim of our study was to determine the effect of Dex on testicular function in prepubertal Wistar rats. Newborn Wistar rats were given intraperitoneal injections of Dex (1 μg of Dex dissolved in NaCl 0.9% / 5 g bw) for 20 days and then sacrificed at the age of 40 days. A control group received NaCl 0.9%. The rats were weighed daily. The plasma levels of testosterone, LH and FSH were measured by radioimmunoassay. A histomorphometric study was performed on sections of testis. Treated groups showed a significant decrease in body weight (p

Prevalence of Epstein-Barr Virus Latent Membrane Protein-1 in Jordanian Patients with Hodgkin's Lymphoma and Non-Hodgkin's Lymphoma

The aim of this study was to estimate the frequency of EBV infection in Hodgkin's lymphoma (HL) and non-Hodgkin's lymphoma (NHL) occurring in Jordanian patients. A total of 55 patients with lymphoma were examined in this study. Of the 55 patients, 30 were diagnosed with HL and 25 with NHL. The four HL subtypes were observed, with the majority of cases exhibiting the mixed cellularity (MC) subtype, followed by nodular sclerosis (NS). High grade was the commonest subtype of NHL in our sample, followed by low grade. The presence of EBV was detected by immunostaining for expression of latent membrane protein-1 (LMP-1). LMP-1 expression was more frequent in patients with HL (60.0%) than in patients with NHL (32.0%). The frequency of LMP-1 expression was also higher in patients with the MC subtype (61.11%) than in patients with NS (28.57%). No age or gender difference in the occurrence of EBV infection was observed among patients with HL. By contrast, the prevalence of EBV infection in NHL patients aged below 50 was lower (16.66%) than in NHL patients aged 50 or above (46.15%). In addition, EBV infection was more frequent in females with NHL (38.46%) than in males with NHL (25%). In NHL cases, the frequency of EBV infection in the intermediate grade (60.0%) was high compared with the frequency in the low (25%) or high (25%) grades. In conclusion, analysis of LMP-1 expression indicates an important role for this viral oncogene in the pathogenesis of EBV-associated malignant lymphomas. These data also support previous findings that people with EBV may develop lymphoma and suggest that efforts to keep lymphoma risk low should be considered for people with EBV infection.

A Dynamic Time-Lagged Correlation based Method to Learn Multi-Time Delay Gene Networks

A gene network describes the regulatory relationships among genes. Each gene has activators and inhibitors that regulate its expression positively and negatively, respectively. Genes themselves are believed to act as activators and inhibitors of other genes. They can even activate one set of genes and inhibit another set. Identifying gene networks is one of the most crucial and challenging problems in bioinformatics. Most work done so far assumes either that there is no time delay in gene regulation or that there is a constant time delay. Here we propose a Dynamic Time-Lagged Correlation Based Method (DTCBM) to learn gene networks. It uses time-lagged correlation to find potential gene interactions, then uses a post-processing stage to remove false gene interactions due to common parents, and finally uses dynamic correlation thresholds for each gene to construct the gene network. DTCBM finds correlations between gene expression signals shifted in time, and therefore takes into consideration the multi-time-delay relationships among the genes. The method is implemented in MATLAB, and experimental results on Saccharomyces cerevisiae gene expression data, together with a comparison with other methods, indicate that it performs better.
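
The time-lagged correlation at the core of the method can be sketched as below. This is an illustration in Python rather than the authors' MATLAB code, and it covers only the lag search. The post-processing stage and the dynamic per-gene thresholds are not shown, and the function names and expression profiles are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)

def best_lag(regulator, target, max_lag):
    """Return (lag, correlation) maximizing |corr(regulator[t], target[t + lag])|.
    Both profiles are assumed to have the same length."""
    best = (0, 0.0)
    for lag in range(max_lag + 1):
        x = regulator[: len(regulator) - lag]
        y = target[lag:]
        r = pearson(x, y)
        if abs(r) > abs(best[1]):
            best = (lag, r)
    return best

# Hypothetical profiles: the target repeats the regulator two time points later
regulator = [0.1, 0.9, 0.3, 0.7, 0.2, 0.8, 0.4, 0.6, 0.5, 0.0]
target = [0.0, 0.0] + regulator[:-2]
lag, r = best_lag(regulator, target, max_lag=3)
```

Taking the absolute value of the correlation lets the same search pick up both activation (positive correlation) and inhibition (negative correlation).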

MIM: A Species Independent Approach for Classifying Coding and Non-Coding DNA Sequences in Bacterial and Archaeal Genomes

A number of competing methodologies have been developed to identify genes and classify DNA sequences into coding and non-coding sequences. This classification process is fundamental in gene finding and gene annotation tools and is one of the most challenging tasks in bioinformatics and computational biology. An information theory measure based on mutual information has shown good accuracy in classifying DNA sequences into coding and non-coding. In this paper we describe a species-independent iterative approach that distinguishes coding from non-coding sequences using the mutual information measure (MIM). A set of sixty prokaryotes is used to extract universal training data. To facilitate comparisons with the published results of other researchers, a test set of 51 bacterial and archaeal genomes was used to evaluate MIM. These results demonstrate that MIM produces superior results while remaining species independent.
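
As a rough illustration of a mutual-information measure on DNA, the sketch below computes the mutual information between nucleotides a fixed distance apart. The exact measure used by MIM may differ. The function name, the default distance, and the toy sequences are assumptions made for illustration.

```python
import math
from collections import Counter

def mutual_information(sequence, distance=1):
    """Mutual information (in bits) between nucleotides `distance` positions apart."""
    pairs = [(sequence[i], sequence[i + distance])
             for i in range(len(sequence) - distance)]
    if not pairs:
        return 0.0
    total = len(pairs)
    joint = Counter(pairs)
    first = Counter(a for a, _ in pairs)
    second = Counter(b for _, b in pairs)
    mi = 0.0
    for (a, b), count in joint.items():
        p_ab = count / total
        mi += p_ab * math.log2(p_ab / ((first[a] / total) * (second[b] / total)))
    return mi

# Toy sequences: a strictly periodic one and a homopolymer
periodic = mutual_information("ACGT" * 25)
uniform = mutual_information("A" * 100)
```

The periodic sequence gives a value close to 2 bits because each nucleotide fully determines its successor, while the homopolymer gives zero because it carries no information.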

Phylogenetic Inference from 18S rRNA Gene Sequences of Horseshoe Crabs, Tachypleus gigas between Tanjung Dawai, Kedah and Cherating, Pahang, Peninsular Malaysia

Phylogenetic analysis using the most conserved portions of the 18S rRNA gene revealed the phylogenetic relationship between the two populations. DNA divergence analysis showed nucleotide diversity values of -0.00838 for the Tanjung Dawai, Kedah population and -0.00708 for the Cherating, Pahang population. The net nucleotide divergence among populations (Da) was -0.0073, indicating low polymorphism among the populations studied. The total number of mutations was higher in the Tanjung Dawai, Kedah samples (73) than in the Cherating, Pahang samples (59), while 8 mutations were shared across the populations, revealing evolutionary change in the genome of Malaysian T. gigas. The tree topology of both populations, inferred using the Neighbour-joining method by comparing 1791 bp of partial 18S rRNA sequence, revealed that T. gigas haplotypes clustered into seven clades, suggesting genetic diversity among populations derived from a common ancestor.

Quantitative Characteristics of Rainbow Trout, Oncorhynchus mykiss, Neo-Males (XX Genotype) and Super-Males (YY Genotype) Sperm

Rainbow trout homogametic males (XX or YY sex genotype) can be obtained through masculinisation of genetic females or induced androgenesis, respectively. The aim of this study was to compare the reproductive potential of neo-males (XX) and super-males (YY) with that of heterogametic males (XY). We measured spermatozoa motility parameters, sperm concentration and osmolality, and characterized protein profiles in samples of stripped and testicular sperm obtained from XY and YY males, and in testicular sperm of XX males. Spermatozoa motility, as measured by both the subjective method and CASA, showed no differences between testicular sperm of XX males and stripped sperm of XY and YY males, whereas testicular sperm of XY and YY males had significantly lower motility. Protein densitometry showed similarities in protein profile between the seminal plasma of XY and YY males and the testicular fluids of XX males. Testes of XX males showed specific histological structures of cysts consisting of hypertrophied Sertoli cells.

Symmetry Breaking and the Emergence of Branching Structures in Morphogenesis: Minimal Conditions and Mechanical Interactions between Cells

The minimal condition for symmetry breaking in the morphogenesis of a cellular population was investigated using cellular automata based on reaction-diffusion dynamics. In particular, the study looked for the possible emergence of branching structures due to mechanical interactions. The model used two types of cells and an external gradient. The results showed that the external gradient influenced the movement of type-I cells, and that clusters formed by type-II cells acted as a barrier to the movement of type-I cells.

Dynamic Bayesian Networks Modeling for Inferring Genetic Regulatory Networks by Search Strategy: Comparison between Greedy Hill Climbing and MCMC Methods

Using Dynamic Bayesian Networks (DBN) to model genetic regulatory networks from gene expression data is one of the major paradigms for inferring the interactions among genes. Averaging over a collection of models to predict the network is preferable to relying on a single high-scoring model. In this paper, two kinds of model search approaches are compared: Greedy hill-climbing Search with Restarts (GSR) and Markov Chain Monte Carlo (MCMC) methods. GSR is preferred in many papers, but no comparison study has established which one is better for DBN models. Different types of experiments have been carried out to provide a benchmark test for these approaches. Our experimental results demonstrate that, on average, the MCMC methods outperform GSR in the accuracy of the predicted network and have comparable time efficiency. By proposing different variations of MCMC and employing a simulated annealing strategy, the MCMC methods become more efficient and stable. Apart from comparing these approaches, another objective of this study is to investigate the feasibility of using DBN modeling approaches for inferring gene networks from a few snapshots of high-dimensional gene profiles. Through synthetic data experiments as well as systematic data experiments, the experimental results reveal how the performance of these approaches is influenced as the target gene network varies in network size, data size and system complexity.

Novel Hybrid Method for Gene Selection and Cancer Prediction

Microarray data profile gene expression on a whole-genome scale and therefore provide a good way to study associations between gene expression and the occurrence or progression of cancer. More and more researchers have realized that microarray data are helpful for predicting cancer samples. However, the number of genes is much larger than the sample size, which makes this task very difficult. Identifying the significant genes that cause cancer has therefore become an urgent, popular and difficult research topic. Many feature selection algorithms proposed in the past focus on improving the accuracy of cancer prediction at the expense of ignoring the correlations between features. In this work, a novel framework (named SGS) is presented for stable gene selection and efficient cancer prediction. The proposed framework first applies a clustering algorithm to find gene groups in which the genes have high correlation coefficients, then selects the significant genes in each group with Bayesian Lasso and the important gene groups with group Lasso, and finally builds a prediction model on the reduced gene space with an efficient classification algorithm (such as SVM, 1NN or regression). Experimental results on real-world data show that the proposed framework often outperforms existing feature selection and prediction methods such as SAM, IG and Lasso-type prediction models.