Abstract: The most important subtype of non-Hodgkin's
lymphoma is diffuse large B-cell lymphoma. Approximately
40% of the patients suffering from it respond well to therapy,
whereas the remainder need more aggressive treatment to
improve their chances of survival. Data mining techniques have helped
to identify the class of the lymphoma efficiently. Even so,
thousands of genes must be processed to obtain the results.
This paper compares several attribute selection methods that
aim to reduce the number of genes to be searched, with the goal
of a more effective procedure overall.
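As an illustration of what a filter-style attribute selection step can look like, the sketch below ranks genes by a simple signal-to-noise score between two classes and keeps the top few. The abstract does not name the methods it compares, so the scoring rule, the function name `rank_genes`, the 0/1 label encoding, and the toy expression values are all assumptions made for this example.

```python
from statistics import mean, pstdev


def rank_genes(expression, labels, top_k):
    """Rank genes by a signal-to-noise score between two classes.

    expression: list of samples, each a list of gene expression values.
    labels: list of 0/1 class labels, one per sample (hypothetical encoding).
    Returns the indices of the top_k highest-scoring genes.
    """
    n_genes = len(expression[0])
    scores = []
    for g in range(n_genes):
        class_a = [row[g] for row, y in zip(expression, labels) if y == 0]
        class_b = [row[g] for row, y in zip(expression, labels) if y == 1]
        spread = pstdev(class_a) + pstdev(class_b)
        # Small constant guards against division by zero for constant genes.
        score = abs(mean(class_a) - mean(class_b)) / (spread + 1e-9)
        scores.append((score, g))
    scores.sort(reverse=True)
    return [g for _, g in scores[:top_k]]


# Made-up toy data: 4 samples, 3 genes; only gene 0 separates the classes.
toy_expression = [
    [5.0, 1.0, 3.0],
    [5.2, 0.9, 2.0],
    [1.0, 1.1, 3.1],
    [0.8, 1.0, 1.9],
]
toy_labels = [0, 0, 1, 1]
selected = rank_genes(toy_expression, toy_labels, top_k=1)
```

A classifier would then be trained only on the columns listed in `selected`, which is how a filter reduces the number of genes to be searched.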
Abstract: The classification of the protein structure is commonly
not performed for the whole protein but for structural domains, i.e.,
compact functional units preserved during evolution. Hence, a first
step to a protein structure classification is the separation of the
protein into its domains. We approach the problem of protein domain
identification by proposing a novel graph theoretical algorithm. We
represent the protein structure as an undirected, unweighted and
unlabeled graph whose nodes correspond to the secondary structure
elements of the protein. This graph is called the protein graph. The
domains are then identified as partitions of the graph corresponding
to vertex sets obtained by the maximization of an objective function,
which mutually maximizes the cycle distributions found in the
partitions of the graph. Our algorithm does not utilize any other kind
of information besides the cycle-distribution to find the partitions. If
a partition is found, the algorithm is iteratively applied to each of
the resulting subgraphs. As a stopping criterion, we numerically calculate
a significance level that indicates the stability of the predicted
partition against a random rewiring of the protein graph. Hence,
our algorithm automatically terminates its iterative application. We
present results for one- and two-domain proteins, compare our
results with the domains manually assigned in the SCOP database,
and discuss the differences.
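The idea of testing a graph property against random rewiring can be sketched as follows. This is not the authors' objective function: the abstract does not define the cycle distribution, so the example uses the number of triangles (3-cycles) as a stand-in statistic, and a degree-preserving double edge swap as one possible rewiring scheme. All function names and the toy graph are assumptions.

```python
import random


def count_triangles(edges):
    """Count 3-cycles in an undirected simple graph given as an edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    # Each triangle is seen once per edge, hence the division by 3.
    return sum(len(adj[u] & adj[v]) for u, v in edges) // 3


def degree_sequence(edges):
    """Map each node to its degree."""
    deg = {}
    for u, v in edges:
        deg[u] = deg.get(u, 0) + 1
        deg[v] = deg.get(v, 0) + 1
    return deg


def rewire(edges, n_swaps, rng):
    """Attempt n_swaps degree-preserving double edge swaps."""
    edge_list = [tuple(sorted(e)) for e in edges]
    edge_set = set(edge_list)
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(edge_list)), 2)
        a, b = edge_list[i]
        c, d = edge_list[j]
        if len({a, b, c, d}) < 4:
            continue  # swap would create a self-loop
        new1, new2 = tuple(sorted((a, d))), tuple(sorted((c, b)))
        if new1 in edge_set or new2 in edge_set:
            continue  # swap would create a duplicate edge
        edge_set -= {edge_list[i], edge_list[j]}
        edge_set |= {new1, new2}
        edge_list[i], edge_list[j] = new1, new2
    return edge_list


def empirical_p_value(edges, statistic, n_random=200, seed=0):
    """Fraction of rewired graphs whose statistic is at least the observed one."""
    rng = random.Random(seed)
    observed = statistic(edges)
    hits = 0
    for _ in range(n_random):
        randomized = rewire(edges, n_swaps=10 * len(edges), rng=rng)
        if statistic(randomized) >= observed:
            hits += 1
    return (hits + 1) / (n_random + 1)


# Hypothetical toy graph: two triangles joined by a single edge.
toy_edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
p_value = empirical_p_value(toy_edges, count_triangles)
```

In an iterative scheme of the kind described, a partition would be accepted only while such a value stays below a chosen significance threshold, which gives a natural point to stop splitting.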
Abstract: Breast cancer is the second most frequent cancer in women after cervical cancer. Surgery is the most common treatment for breast cancer, followed by chemotherapy as a treatment of choice. Although effective, chemotherapy causes serious side effects. Controlled-release drug delivery is an alternative method to improve the efficacy and safety of the treatment. It can keep the drug concentration within tumor tissue between the minimum effective concentration (MEC) and the minimum toxic concentration (MTC), reducing both damage to normal tissue and side effects. Because in vivo experiments on such a system can be time-consuming and labor-intensive, a mathematical model is desirable for studying the effects of important parameters before the experiments are performed. Here, we describe a 3D mathematical model to predict the release of doxorubicin from pluronic gel to treat human breast cancer. This model can ultimately be used to design the in vivo experiments effectively.
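To make the MEC/MTC window concrete, here is a deliberately simplified sketch: a 1D explicit finite-difference diffusion model with first-order drug elimination, not the 3D model the abstract describes. The governing equation, every parameter value, the depot size, and the function names are assumptions chosen only so that the example runs; they are not taken from the paper.

```python
def simulate_release(n_cells=50, steps=2000, diffusivity=1e-3, dx=0.1,
                     dt=1.0, elimination=1e-4, depot_cells=5):
    """Diffuse drug from a depot along a 1D line of tissue cells.

    Uses an explicit scheme with zero-flux boundaries. All units and
    values are hypothetical (normalized concentration, arbitrary length).
    """
    r = diffusivity * dt / dx ** 2
    # The explicit scheme is only stable for r <= 0.5.
    assert r <= 0.5, "time step too large for the explicit scheme"
    conc = [0.0] * n_cells
    for i in range(depot_cells):
        conc[i] = 1.0  # initial drug load in the gel depot
    for _ in range(steps):
        new = conc[:]
        for i in range(n_cells):
            left = conc[i - 1] if i > 0 else conc[i]
            right = conc[i + 1] if i < n_cells - 1 else conc[i]
            new[i] = (conc[i] + r * (left - 2 * conc[i] + right)
                      - elimination * dt * conc[i])
        conc = new
    return conc


def fraction_in_window(conc, mec, mtc):
    """Fraction of cells whose concentration lies between MEC and MTC."""
    inside = sum(1 for c in conc if mec <= c <= mtc)
    return inside / len(conc)


profile = simulate_release()
# Hypothetical thresholds in the same normalized units as the profile.
coverage = fraction_in_window(profile, mec=0.05, mtc=0.5)
```

Sweeping `diffusivity` or `elimination` and watching `coverage` is the kind of parameter study such a model allows before any in vivo work.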
Abstract: Breast cancer detection techniques have been reported
to aid radiologists in analyzing mammograms. We note that most
techniques are performed on uncompressed digital mammograms.
Mammogram images are very large, necessitating the use of
compression to reduce storage and transmission requirements. In this
paper, we present an algorithm for the detection of
microcalcifications in the JPEG2000 domain. The algorithm is based
on the statistical properties of the wavelet transform that the
JPEG2000 coder employs. Simulations were carried out at
different compression ratios. The sensitivity of this algorithm ranges
from 92% with a false positive rate of 4.7 under lossless compression
down to 66% with a false positive rate of 2.1 under lossy compression
at a compression ratio of 100:1.
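The sketch below shows, in one dimension, how wavelet detail coefficients can expose a small bright spot. It implements one level of the reversible 5/3 lifting transform used in JPEG2000 lossless coding, with symmetric boundary extension. The detection rule (flag coefficients larger than `k` times the mean absolute detail value) is a hypothetical stand-in, since the abstract does not say which statistical properties the algorithm uses; the function names and the toy row are also assumptions.

```python
def forward_53(signal):
    """One level of the reversible 5/3 lifting transform (even-length ints)."""
    half = len(signal) // 2
    detail = []
    for n in range(half):
        # Symmetric extension at the right edge.
        right = signal[2 * n + 2] if 2 * n + 2 < len(signal) else signal[2 * n]
        detail.append(signal[2 * n + 1] - ((signal[2 * n] + right) >> 1))
    approx = []
    for n in range(half):
        # Symmetric extension at the left edge.
        prev = detail[n - 1] if n > 0 else detail[0]
        approx.append(signal[2 * n] + ((prev + detail[n] + 2) >> 2))
    return approx, detail


def inverse_53(approx, detail):
    """Invert forward_53 exactly (the transform is integer-reversible)."""
    half = len(approx)
    even = []
    for n in range(half):
        prev = detail[n - 1] if n > 0 else detail[0]
        even.append(approx[n] - ((prev + detail[n] + 2) >> 2))
    signal = []
    for n in range(half):
        right = even[n + 1] if n + 1 < half else even[n]
        signal.extend([even[n], detail[n] + ((even[n] + right) >> 1)])
    return signal


def flag_outliers(detail, k=3.0):
    """Hypothetical rule: flag details above k times the mean absolute value."""
    mean_abs = sum(abs(d) for d in detail) / len(detail)
    return [i for i, d in enumerate(detail) if abs(d) > k * mean_abs]


# Made-up image row: flat background with one bright pixel at index 5.
row = [10, 10, 10, 10, 10, 60, 10, 10]
approx, detail = forward_53(row)
suspects = flag_outliers(detail)
```

Working on coefficients like `detail` directly is what makes detection in the compressed domain possible without fully decoding the image first.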
Abstract: Data Mining aims at discovering knowledge out of
data and presenting it in a form that is easily comprehensible to
humans. One useful application in Egypt is cancer
management, especially the management of acute lymphoblastic
leukemia (ALL), which is the most common type of cancer in
children.
This paper discusses the process of designing a prototype that can
help in the management of childhood ALL, which has a great
significance in the health care field. It also has a social impact
by helping to decrease the rate of infection in children in Egypt, and it
provides valuable information about the distribution and
segmentation of ALL in Egypt, which may be linked to possible
risk factors.
Undirected Knowledge Discovery is used because, in this
research project, there is no target field, as the data provided is
mainly subjective. This is done in order to quantify the subjective
variables. The computer is therefore asked to identify
significant patterns in the provided medical data about ALL. This
may be achieved by collecting the data necessary for the
system, determining the data mining technique to be used for the
system, and choosing the most suitable implementation tool for the
domain.
The research uses a data mining tool, Clementine, to
apply the Decision Trees technique. We feed it with data extracted from
real-life cases taken from specialized cancer institutes. Relevant
medical case details, such as patient medical history and diagnosis,
are analyzed, classified, and clustered in order to improve disease
management.
Abstract: This study investigated the pattern and seasonal index of influenza cases in Thailand. Our results showed that southern Thailand had the highest influenza incidence among the four regions of Thailand (i.e. north, northeast, central and southern Thailand). The influenza pattern in southern Thailand was similar to that of northeastern Thailand. Seasonal index values of influenza cases in Thailand were higher in the hot season than in the wet season. Influenza cases started to increase at the beginning of the hot season (April), reached a maximum in August, rapidly declined in the middle of the wet season and reached the lowest value in December. Seasonal index values for northern Thailand differed from other regions of Thailand.
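A seasonal index can be computed in several ways, and the abstract does not say which one was used. The sketch below shows one common simple form, assumed here for illustration: the average count for each calendar month divided by the overall monthly average, scaled to 100. The function name and the case counts are made up and are not the Thai surveillance data.

```python
def seasonal_index(monthly_counts_by_year):
    """Return 12 index values, where 100 means an average month.

    monthly_counts_by_year: list of 12-element lists, one list per year.
    """
    n_years = len(monthly_counts_by_year)
    month_means = [
        sum(year[m] for year in monthly_counts_by_year) / n_years
        for m in range(12)
    ]
    overall = sum(month_means) / 12
    return [100.0 * mm / overall for mm in month_means]


# Made-up counts for two years, January through December.
toy_counts = [
    [10, 12, 15, 20, 25, 30, 35, 40, 30, 20, 12, 8],
    [12, 14, 17, 22, 27, 34, 37, 44, 32, 22, 14, 10],
]
indices = seasonal_index(toy_counts)
peak_month = indices.index(max(indices)) + 1  # 1 = January
```

Months with an index above 100 carry more than their share of cases, which is how a peak and a trough such as those reported above would be read off.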
Abstract: This study aimed to develop a forecasting model for the incidence of Dengue Haemorrhagic Fever (DHF) in Northern Thailand using time series analysis. We developed Seasonal Autoregressive Integrated Moving Average (SARIMA) models on data collected between 2003 and 2006 and then validated the models using data collected between January and September 2007. The results showed that the regressive forecast curves were consistent with the pattern of actual values. The most suitable model was the SARIMA(2,0,1)(0,2,0)12 model, with an Akaike Information Criterion (AIC) of 12.2931 and a Mean Absolute Percent Error (MAPE) of 8.91713. The fit of the SARIMA(2,0,1)(0,2,0)12 model was adequate for the data, with a Portmanteau statistic Q20 = 8.98644 (χ²(20,95) = 27.5871, P > 0.05). This indicated that there was no significant autocorrelation between residuals at different lag times in the SARIMA(2,0,1)(0,2,0)12 model.
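The two diagnostics quoted above can be sketched in a few lines. MAPE is computed in its usual form. For the portmanteau statistic, the abstract does not say whether the Box-Pierce or the Ljung-Box variant was used, so the Ljung-Box form is assumed here. The function names and the numbers in the example are made up and are not the DHF data.

```python
def mape(actual, forecast):
    """Mean absolute percent error, in percent (actual values must be nonzero)."""
    errors = [abs((a - f) / a) for a, f in zip(actual, forecast)]
    return 100.0 * sum(errors) / len(errors)


def ljung_box_q(residuals, max_lag):
    """Ljung-Box portmanteau statistic over lags 1..max_lag.

    Q = n (n + 2) * sum_k r_k^2 / (n - k), where r_k is the lag-k
    sample autocorrelation of the mean-centered residuals.
    """
    n = len(residuals)
    mean_res = sum(residuals) / n
    centered = [e - mean_res for e in residuals]
    denom = sum(e * e for e in centered)
    total = 0.0
    for k in range(1, max_lag + 1):
        r_k = sum(centered[t] * centered[t - k] for t in range(k, n)) / denom
        total += r_k ** 2 / (n - k)
    return n * (n + 2) * total


# Made-up illustration values.
example_mape = mape([100.0, 200.0], [110.0, 180.0])
example_q = ljung_box_q([1.0, -1.0] * 4, max_lag=1)
```

A Q value below the chi-square critical value, as reported above, is read as no significant autocorrelation remaining in the residuals.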
Abstract: In this paper we present a method for gene ranking
from DNA microarray data. More precisely, we calculate the correlation
networks, which are unweighted and undirected graphs, from
microarray data of cervical cancer, where each network represents
a tissue of a certain tumor stage and each node in the network
represents a gene. From these networks we extract one tree for
each gene by a local decomposition of the correlation network. The
interpretation of a tree is that it represents the n-nearest neighbor
genes on the n-th level of a tree, measured by the Dijkstra distance,
and, hence, gives the local embedding of a gene within the correlation
network. For the obtained trees we measure the pairwise similarity
between trees rooted at the same gene from normal to cancerous
tissues. This evaluates the modification of the tree topology due to
progression of the tumor. Finally, we rank the obtained similarity
values from all tissue comparisons and select the top ranked genes.
For these genes the local neighborhood in the correlation networks
changes most between normal and cancerous tissues. As a result,
we find that the top-ranked genes are candidates suspected to be
involved in tumor growth, which indicates that our method
captures essential information from the underlying DNA microarray
data of cervical cancer.
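The local decomposition described above can be sketched as follows. In an unweighted graph, breadth-first search yields the same distances as Dijkstra's algorithm, so the levels of a gene's tree are simply the sets of genes at distance 1, 2, 3, and so on from it. The abstract does not specify the tree similarity measure, so a per-level Jaccard overlap is used as a stand-in. The gene names, the two toy networks, and the function names are hypothetical.

```python
from collections import deque


def bfs_levels(adjacency, root):
    """Group nodes by shortest-path distance from root (unweighted graph)."""
    dist = {root: 0}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v in adjacency.get(u, ()):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    levels = {}
    for node, d in dist.items():
        levels.setdefault(d, set()).add(node)
    return levels


def level_similarity(levels_a, levels_b):
    """Mean Jaccard overlap of matching levels (stand-in similarity measure)."""
    depths = (set(levels_a) | set(levels_b)) - {0}
    if not depths:
        return 1.0
    total = 0.0
    for d in depths:
        a = levels_a.get(d, set())
        b = levels_b.get(d, set())
        total += len(a & b) / len(a | b)
    return total / len(depths)


# Hypothetical correlation networks for two tissue states.
normal = {"g1": ["g2", "g3"], "g2": ["g1", "g4"], "g3": ["g1"], "g4": ["g2"]}
tumor = {"g1": ["g2"], "g2": ["g1", "g3"], "g3": ["g2", "g4"], "g4": ["g3"]}

similarity = {
    gene: level_similarity(bfs_levels(normal, gene), bfs_levels(tumor, gene))
    for gene in normal
}
# Lowest similarity first: genes whose neighborhood changed the most.
ranking = sorted(similarity, key=similarity.get)
```

Under this stand-in measure, a low similarity for a gene means its local neighborhood differs strongly between the two networks, which is the property the ranking is meant to surface.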