Abstract: A feature weighting and selection method is proposed
which uses the structure of a weightless neuron and exploits the
principles that govern the operation of Genetic Algorithms and
Evolution. Features are coded onto chromosomes in a novel way
which allows weighting information regarding the features to be
directly inferred from the gene values. The proposed method is
significant in that it addresses several problems concerned with
algorithms for feature selection and weighting as well as providing
significant advantages such as speed, simplicity and suitability for
real-time systems.
Abstract: In the recent past, there has been an increasing interest
in applying evolutionary methods to Knowledge Discovery in
Databases (KDD) and a number of successful applications of Genetic
Algorithms (GA) and Genetic Programming (GP) to KDD have been
demonstrated. The predominant representation of the
discovered knowledge is the standard Production Rules (PRs) in the
form If P Then D. The PRs, however, are unable to handle
exceptions and do not exhibit variable precision. Censored
Production Rules (CPRs), an extension of PRs proposed by
Michalski & Winston, exhibit variable precision and support an
efficient mechanism for handling exceptions. A CPR is an
augmented production rule of the form:
If P Then D Unless C, where C (Censor) is an exception to the rule.
Such rules are employed in situations in which the conditional
statement 'If P Then D' holds frequently and the assertion C holds
rarely. By using a rule of this type we are free to ignore the exception
condition when the resources needed to establish its presence are
tight or there is simply no information available as to whether it
holds or not. Thus, the 'If P Then D' part of the CPR expresses
important information, while the Unless C part acts only as a switch
and changes the polarity of D to ~D.
This paper presents a classification algorithm based on an evolutionary
approach that discovers comprehensible rules with exceptions in the
form of CPRs.
The proposed approach has a flexible chromosome encoding, where
each chromosome corresponds to a CPR. Appropriate genetic
operators are suggested and a fitness function is proposed that
incorporates the basic constraints on CPRs. Experimental results are
presented to demonstrate the performance of the proposed algorithm.
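As a minimal illustration of how a rule of the form If P Then D Unless C can be applied to a record, the following Python sketch evaluates a CPR with an optional censor check. The class name CPR, the function decide, and the bird example are hypothetical and are not taken from the paper; the sketch only shows the switch behaviour of the censor, not the evolutionary discovery of rules.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional

Record = Dict[str, object]


@dataclass
class CPR:
    """Hypothetical container for a rule 'If P Then D Unless C'."""
    premise: Callable[[Record], bool]            # P
    decision: str                                # D
    censor: Callable[[Record], Optional[bool]]   # C; None means "unknown"


def decide(rule: CPR, record: Record, check_censor: bool = True) -> Optional[str]:
    """Return D, '~D' when the censor is known to hold, or None if P fails."""
    if not rule.premise(record):
        return None
    # The censor is consulted only when resources allow (check_censor=True).
    # An unknown censor (None) leaves the decision D unchanged.
    if check_censor and rule.censor(record) is True:
        return "~" + rule.decision
    return rule.decision


# Illustrative rule: If bird Then flies Unless penguin.
flies = CPR(
    premise=lambda r: r.get("is_bird") is True,
    decision="flies",
    censor=lambda r: r.get("is_penguin"),
)

tweety = {"is_bird": True, "is_penguin": False}
pingu = {"is_bird": True, "is_penguin": True}
unknown = {"is_bird": True}  # no information about the censor
```

Calling decide(flies, pingu, check_censor=False) shows the low-resource case: the censor is ignored and the rule answers with D.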
Abstract: This paper and its companion (Part 2) deal with
the modeling and optimization of two NP-hard problems in the production
planning of flexible manufacturing systems (FMS): the part type selection
problem and the loading problem. The part type selection problem and
the loading problem are strongly related and heavily influence the
system's efficiency and productivity. The problems become more complex
when operational flexibilities, such as the possibility of processing an
operation on alternative machines with alternative tools, are
considered. These problems have been modeled and solved
simultaneously using a real-coded genetic algorithm (RCGA),
which uses an array of real numbers as the chromosome representation.
These real numbers can be converted into a part type sequence and the
machines that are used to process the part types. This first part of the
paper focuses on modeling the problems and discusses how
the novel chromosome representation can be applied to solve the
problems. The second part will discuss the effectiveness of the
RCGA in solving various test bed problems.
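The abstract does not state how the real numbers are converted into a sequence and a machine choice. One common option is random-key decoding, sketched below in Python purely as an assumption: the first n genes are sorted to obtain the part type order, and the last n genes are scaled to pick one of the alternative machines. The function name decode, the two-genes-per-part layout, and the sample data are hypothetical.

```python
from typing import Dict, List, Tuple


def decode(chromosome: List[float],
           alternatives: Dict[str, List[str]]) -> List[Tuple[str, str]]:
    """Decode 2*n reals in [0, 1) into an ordered list of (part type, machine)."""
    parts = sorted(alternatives)          # fixed gene-to-part mapping
    n = len(parts)
    if len(chromosome) != 2 * n:
        raise ValueError("expected two genes per part type")
    order_keys = chromosome[:n]
    machine_keys = chromosome[n:]
    # Random-key decoding: a smaller key places the part earlier in the sequence.
    order = sorted(range(n), key=lambda i: order_keys[i])
    plan = []
    for i in order:
        options = alternatives[parts[i]]
        # Scale the gene to an index into the list of alternative machines.
        choice = min(int(machine_keys[i] * len(options)), len(options) - 1)
        plan.append((parts[i], options[choice]))
    return plan


# Hypothetical instance: three part types with alternative machines.
alternatives = {"P1": ["M1", "M2"], "P2": ["M2"], "P3": ["M1", "M3", "M4"]}
plan = decode([0.70, 0.10, 0.40, 0.90, 0.30, 0.50], alternatives)
# plan == [("P2", "M2"), ("P3", "M3"), ("P1", "M2")]
```

Because every array of reals in [0, 1) decodes to a valid sequence, standard real-valued crossover and mutation can be applied without repair steps, which is the usual motivation for this kind of encoding.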
Abstract: Genetic algorithm (GA) based solution techniques
are suitable for optimization because of their capacity for
simultaneous multidimensional search. Many GA variants have been
tried in the past to solve optimal power flow (OPF), one of the
nonlinear problems of electric power systems. Issues such as
convergence speed, the accuracy of the optimal solution obtained
after a number of generations, and the handling of
system constraints in OPF remain subjects of discussion. The results
obtained for GA-Fuzzy OPF on various power systems have shown
faster convergence and lower generation costs compared to other
approaches. This paper presents an enhanced GA-Fuzzy OPF (EGA-OPF)
using penalty factors to handle line flow constraints and load
bus voltage limits for both the normal network and the contingency case
with congestion. In addition to a crossover and mutation rate
adaptation scheme that adapts crossover and mutation probabilities
for each generation based on fitness values of previous generations, a
block swap operator is also incorporated in the proposed EGA-OPF. The
line flow limits and load bus voltage magnitude limits are handled by
incorporating line overflow and load voltage penalty factors,
respectively, in each chromosome's fitness function. The effects of
different penalty factor settings are also analyzed under the contingent
state.
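The penalty-factor idea can be sketched in Python as follows. This is only an illustration under stated assumptions: the quadratic form of the penalties, the function name penalized_cost, and all parameter names are hypothetical, and the line flows and bus voltages are assumed to come from a power flow solution computed elsewhere. The abstract does not give the exact penalty expression used in EGA-OPF.

```python
from typing import Sequence


def penalized_cost(generation_cost: float,
                   line_flows: Sequence[float],
                   line_limits: Sequence[float],
                   bus_voltages: Sequence[float],
                   v_min: float, v_max: float,
                   k_line: float, k_volt: float) -> float:
    """Generation cost plus penalties for line overflow and voltage violations."""
    # Line overflow: amount by which |flow| exceeds its limit (0 if within limit).
    line_penalty = sum(max(0.0, abs(f) - lim) ** 2
                       for f, lim in zip(line_flows, line_limits))
    # Load bus voltage: distance outside the band [v_min, v_max] (0 if inside).
    volt_penalty = sum(max(0.0, v_min - v, v - v_max) ** 2
                       for v in bus_voltages)
    return generation_cost + k_line * line_penalty + k_volt * volt_penalty


# Hypothetical values: one overloaded line and one over-voltage bus.
cost = penalized_cost(
    generation_cost=100.0,
    line_flows=[1.5, 0.5], line_limits=[1.0, 1.0],
    bus_voltages=[1.00, 1.10], v_min=0.95, v_max=1.05,
    k_line=100.0, k_volt=1000.0,
)
```

A GA that minimizes this value would rank a feasible solution by its generation cost alone, while larger k_line and k_volt push infeasible chromosomes further down the ranking; this is the kind of trade-off that a study of penalty factor settings examines.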
Abstract: Re-entrant scheduling is an important search problem
with many constraints in the flow shop. In the literature, a number of
approaches have been investigated, from exact methods to
meta-heuristics. This paper presents a genetic algorithm that encodes
the problem as multi-level chromosomes to reflect the dependent
relationship between the re-entrant possibility and resource consumption.
The novel encoding preserves the information in the data intact
and speeds up convergence to near-optimal solutions. To test the
effectiveness of the method, it has been applied to the
resource-constrained re-entrant flow shop scheduling problem.
Computational results show that the proposed GA performs better than
the simulated annealing algorithm in terms of makespan.
Abstract: SeqWord Gene Island Sniffer, a new program for
the identification of mobile genetic elements in sequences of bacterial chromosomes, is presented. This program is based on the
analysis of oligonucleotide usage variations in DNA sequences. 3,518 mobile genetic elements were identified in 637 bacterial
genomes and further analyzed by sequence similarity and the
functionality of encoded proteins. The results of this study are stored in an open database (http://anjie.bi.up.ac.za/geidb/geidbhome.php).
The developed computer program and the database provide information valuable for further investigation of the
distribution of mobile genetic elements and virulence factors among bacteria. The program is available for download at www.bi.up.ac.za/SeqWord/sniffer/index.html.
Abstract: This research presents a system for post processing of
data that takes mined flat rules as input and discovers crisp as well as
fuzzy hierarchical structures using a Learning Classifier System
approach. A Learning Classifier System (LCS) is a machine
learning technique that combines evolutionary computing,
reinforcement learning, supervised or unsupervised learning and
heuristics to produce adaptive systems. An LCS learns by interacting
with an environment from which it receives feedback in the form of
numerical reward. Learning is achieved by trying to maximize the
amount of reward received. A crisp description of a concept usually
cannot represent human knowledge completely and practically. In the
proposed Learning Classifier System, the initial population is constructed
as a random collection of HPR–trees (related production rules) and
crisp / fuzzy hierarchies are evolved. A fuzzy subsumption relation is
suggested for the proposed system and, based on a Subsumption Matrix
(SM), a suitable fitness function is proposed. Suitable genetic
operators are proposed for the chosen chromosome representation
method. To implement reinforcement, a suitable reward and
punishment scheme is also proposed. Experimental results are
presented to demonstrate the performance of the proposed system.
Abstract: The job shop scheduling problem (JSSP) is well known as one of the most difficult combinatorial optimization problems. This paper presents a hybrid genetic algorithm for the JSSP with the objective of minimizing makespan. The efficiency of the genetic algorithm is enhanced by integrating it with a local search method. The chromosome representation of the problem is based on operations. Schedules are constructed using a procedure that generates full active schedules. In each generation, a local search heuristic based on Nowicki and Smutnicki's neighborhood is applied to improve the solutions. The approach is tested on a set of standard instances taken from the literature and compared with other approaches. The computational results validate the effectiveness of the proposed algorithm.
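To make the operation-based representation concrete, the Python sketch below decodes a chromosome in which each job index appears once per operation, the k-th occurrence of a job standing for its k-th operation. Note that this is a simple semi-active decoder (each operation starts as early as its job and machine allow, in chromosome order), not the full active schedule procedure or the local search described in the abstract. The function name makespan and the two-job instance are hypothetical.

```python
from typing import List, Tuple

# Each job is a list of (machine, processing time) pairs in technological order.
Job = List[Tuple[int, int]]


def makespan(chromosome: List[int], jobs: List[Job]) -> int:
    """Decode an operation-based chromosome and return the schedule makespan."""
    next_op = [0] * len(jobs)        # index of the next unscheduled operation per job
    job_ready = [0] * len(jobs)      # time at which each job's last operation ends
    n_machines = 1 + max(m for job in jobs for m, _ in job)
    machine_ready = [0] * n_machines
    for j in chromosome:
        machine, duration = jobs[j][next_op[j]]
        # An operation waits for both its job predecessor and its machine.
        start = max(job_ready[j], machine_ready[machine])
        job_ready[j] = machine_ready[machine] = start + duration
        next_op[j] += 1
    return max(job_ready)


# Hypothetical 2-job, 2-machine instance.
jobs = [
    [(0, 3), (1, 2)],   # job 0: machine 0 for 3, then machine 1 for 2
    [(1, 2), (0, 4)],   # job 1: machine 1 for 2, then machine 0 for 4
]
```

For this instance the interleaved chromosome [0, 1, 0, 1] yields a makespan of 7, while [0, 0, 1, 1] yields 11, which illustrates how the gene order alone determines schedule quality.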
Abstract: Y chromosome microdeletions are the most common
genetic cause of male infertility and screening for these
microdeletions in azoospermic or severely oligospermic men is now
standard practice. Analysis of the Y chromosome in men with
azoospermia or severe oligozoospermia has resulted in the
identification of three regions in the euchromatic part of the long arm
of the human Y chromosome (Yq11) that are frequently deleted in
men with otherwise unexplained spermatogenic failure. PCR analysis
of microdeletions in the AZFa, AZFb and AZFc regions of the
human Y chromosome is an important screening tool. The aim of this
study was to analyse the type of microdeletions in men with fertility
disorders in Slovakia. We evaluated 227 patients with azoospermia
and with normal karyotype. All patient samples were analyzed
cytogenetically. The Devyser AZF set was used for PCR amplification
of sequence-tagged sites (STS) of the AZFa, AZFb and AZFc regions
of the Y chromosome. Fluorescently labeled primers for all
markers were used in one multiplex PCR reaction, and the genetic
analyzer ABi 3500xl (Life Technologies) was used for automated
visualization and identification of the STS markers. We report 13 cases of
deletions in the AZF region (5.73%). Particular types of deletions were
recorded in each of the AZFa, AZFb and AZFc regions. Microdeletions in
the AZFc region were the most frequent. The study confirmed that the
percentage of microdeletions in the AZF region is low in Slovak
azoospermic patients, but important from a prognostic view.
Abstract: Flow-shop scheduling problem (FSP) deals with the
scheduling of a set of jobs that visit a set of machines in the same
order. The FSP is NP-hard, which means that no efficient algorithm
for solving the problem to optimality is known. To meet
time requirements and to minimize the makespan of
large permutation flow-shop scheduling problems in which there are
sequence-dependent setup times on each machine, this paper
develops a hybrid genetic algorithm (HGA). The proposed HGA
applies a modified approach to generate the initial population of
chromosomes and also uses an improved heuristic called the iterated
swap procedure to improve the initial solutions. The author also uses
three genetic operators to produce good new offspring. The results are
compared to some recently developed heuristics, and computational
experimental results show that the proposed HGA performs very
competitively with respect to accuracy and efficiency of solution.
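The objective being minimized can be illustrated with a short Python sketch that computes the makespan of a job permutation with sequence-dependent setup times. It assumes anticipatory setups (a machine may be set up before the job arrives from the previous machine) and no setup before the first job; both are modelling assumptions not stated in the abstract. The function name sdst_makespan and the sample data are hypothetical.

```python
from typing import List


def sdst_makespan(perm: List[int],
                  proc: List[List[int]],
                  setup: List[List[List[int]]]) -> int:
    """Makespan of a permutation flow shop with sequence-dependent setups.

    proc[j][m]     : processing time of job j on machine m
    setup[m][i][j] : setup time on machine m when job j directly follows job i
    """
    n_machines = len(proc[0])
    done = [0] * n_machines   # completion time of the latest job on each machine
    prev = None
    for j in perm:
        for m in range(n_machines):
            # Machine m is ready once the previous job is done and the setup is made.
            machine_free = done[m] + (setup[m][prev][j] if prev is not None else 0)
            # Job j arrives when it has finished on machine m - 1.
            job_arrives = done[m - 1] if m > 0 else 0
            done[m] = max(machine_free, job_arrives) + proc[j][m]
        prev = j
    return done[-1]


# Hypothetical 2-job, 2-machine instance.
proc = [[3, 2], [2, 4]]
setup = [
    [[0, 1], [2, 0]],   # machine 0
    [[0, 1], [1, 0]],   # machine 1
]
```

In this instance the permutation [0, 1] gives a makespan of 10 and [1, 0] gives 9, so a GA chromosome holding the permutation can use this value directly as its fitness.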
Abstract: Artificial Immune Systems (AIS) have been adopted for decades
as heuristic algorithms for solving combinatorial problems.
Nevertheless, many of these applications exploited the benefits of AIS
but seldom proposed approaches for enhancing its
efficiency. In this paper, we continue previous research and develop
a Self-evolving Artificial Immune System II (SEAIS II) by coordinating the T
and B cells of the immune system and building a block-based artificial
chromosome to reduce computation time and improve
performance on problems of different complexity. Plasma cells and
clonal selection, which are related to the function of the immune
response, are also designed. The immune response helps
the AIS achieve both global and local search ability and avoid being
trapped in local optima. The experimental results
validate that SEAIS II is effective in solving
permutation flow-shop problems.
Abstract: The design of a gravity dam is performed through an
interactive process involving a preliminary layout of the structure
followed by a stability and stress analysis. This study presents a
method to define the optimal top width of a gravity dam with a genetic
algorithm. To solve the optimization task (minimizing the cost of the
dam), an optimization routine based on genetic algorithms (GAs) was
implemented in an Excel spreadsheet. It was found to perform well,
and the GA parameters were optimized in a parametric study. Using the
parameters found in the parametric study, the top width of the gravity
dam was optimized and the result compared to a gradient-based
optimization method (classic method). The results of the two methods
were in close agreement. In the optimum dam cross section, the ratio
of dam base to dam height is almost equal to 0.85, and the ratio of dam
top width to dam height is almost equal to 0.13. The computerized
methodology may help in computing the optimal
top width for a wide range of gravity dam heights.
Abstract: Genetic Folding (GF), a new class of EA, is
introduced for the first time. It is based on chromosomes composed
of floating genes structurally organized in a parent form and
separated by dots. The genotype/phenotype system of GF
generates a kernel expression, which is the objective function of a
superior classifier. In this work the question of satisfying the
mapping's rules in evolving populations is addressed by analyzing
populations undergoing either Mercer's or non-Mercer's rule. The
results presented here show that populations undergoing Mercer's
rules improve, in practice, model selection for the Support Vector
Machine (SVM). The experiment is trained on a multi-classification
problem and tested on the nonlinear Ionosphere dataset. The aim of this
paper is to answer the question of evolving Mercer's rule in SVM,
using genetic folding with or without satisfied kernel rules,
applied to complicated domains and problems.
Abstract: Eukaryotic protein-coding genes are interrupted by spliceosomal introns, which are removed from the RNA transcripts before translation into a protein. The exon-intron structures of different eukaryotic species are quite different from each other, and the evolution of such structures raises many questions. We try to address some of these questions using statistical analysis of whole genomes. We go through all the protein-coding genes in a genome and study correlations between the net length of all the exons in a gene, the number of the exons, and the average length of an exon. We also take average values of these features for each chromosome and study correlations between those averages on the chromosomal level. Our data show universal features of exon-intron structures common to animals, plants, and protists (specifically, Arabidopsis thaliana, Caenorhabditis elegans, Drosophila melanogaster, Cryptococcus neoformans, Homo sapiens, Mus musculus, Oryza sativa, and Plasmodium falciparum). We have verified a linear correlation between the number of exons in a gene and the length of the protein coded by the gene: the protein length increases in proportion to the number of exons. On the other hand, the average length of an exon always decreases with the number of exons. Finally, chromosome clustering based on average chromosome properties and parameters of linear regression between the number of exons in a gene and the net length of those exons demonstrates that these average chromosome properties are genome-specific features.
Abstract: This paper explores the university course timetabling
problem. There are several characteristics that make scheduling and
timetabling problems particularly difficult to solve: they have huge
search spaces, they are often highly constrained, they require
sophisticated solution representation schemes, and they usually
require very time-consuming fitness evaluation routines. Thus
standard evolutionary algorithms lack the efficiency to deal with
them. In this paper we propose a memetic algorithm that
incorporates problem-specific knowledge such that most of the
chromosomes generated are decoded into feasible solutions.
Generating a vast number of feasible chromosomes allows the search
process to progress in a time-efficient manner. Experimental
results exhibit the advantages of the developed Hybrid Genetic
Algorithm over the standard Genetic Algorithm.
Abstract: Low temperature (LT) is one of the most important abiotic
stresses causing loss of yield in wheat (T. aestivum). Four major
genes in wheat (Triticum aestivum L.), with the dominant alleles
designated Vrn–A1, Vrn–B1, Vrn–D1 and Vrn4, are known to have
large effects on the vernalization response, but the effects on cold
hardiness are ambiguous. Poor cold tolerance has restricted winter
wheat production in regions of high winter stress [9]. It is known
that nearly all wheat chromosomes [5], or at least 10 of the
21 chromosome pairs, are important in winter hardiness [15]. The
objective of the present study was to clarify the role of each chromosome
in cold tolerance. For this purpose we used 20 isogenic lines of
wheat. In each of these isogenic lines, a single chromosome from the
‘Bezostaya’ variety (a winter habit cultivar) was substituted into the
‘Capple desprez’ variety. The plant materials were grown under
controlled conditions at 20 °C and a 16 h day length in moderately
cold areas of Iran at Karaj Agricultural Research Station in 2006-07,
and the acclimation period was completed over about 4 weeks in a
cold room at 4 °C. The cold hardiness of these isogenic lines was
measured by LT50 (the temperature at which 50% of the plants are
killed by freezing stress). The experimental design was a
randomized complete block design (RCBD) with three replicates. The results
showed that chromosome 5A had a major effect on freezing
tolerance, followed by chromosomes 1A and 4A, which had a smaller effect on this
trait. Further studies are essential to understanding the importance of
each chromosome in controlling cold hardiness in wheat.
Abstract: Whole genome duplication (WGD) increased the
number of chromosomes of the yeast Saccharomyces cerevisiae from 8 to
16. Although the number of chromosomes in the genome
of this organism has been retained since WGD, chromosomal rearrangement
events have caused an evolutionary distance between the current genome
and its ancestor. Studies of eukaryotic genomes using evolutionary-based
approaches have shown that the rearrangement distance is an
approximable problem. In the case of S. cerevisiae, we show that the
rearrangement distance can be obtained by using a dedoubled adjacency
graph drawn for 55 large paired chromosomal regions that originated
from WGD. Then, we provide a program extracted from a C program
database to draw a dedoubled genome adjacency graph for S.
cerevisiae. From a bioinformatics perspective, using the duplicated
blocks of the current genome of S. cerevisiae, we infer that the genomic
organization of eukaryotes has the potential to provide valuable
detailed information about their ancestral genome.
Abstract: A sense-antisense gene pair (SAGP) is a pair of two oppositely transcribed genes sharing a common region on a chromosome. In mammalian genomes, SAGPs can be organized in more complex sense-antisense gene architectures (CSAGA) in which at least one gene can share loci with two or more antisense partners. Many dozens of CSAGAs can be found in the human genome. However, CSAGAs have not been systematically identified and characterized in the context of their role in human diseases, including cancers. In this work we characterize the structural-functional properties of a cluster of 5 genes (TMEM97, IFT20, TNFAIP1, POLDIP2 and TMEM199), termed the TNFAIP1/POLDIP2 module. This cluster is organized as a CSAGA in cytoband 17q11.2. Affymetrix U133A&B expression data of two large cohorts of breast cancer patients (410 patients in total) and patient survival data were used. For both studied cohorts, we demonstrate (i) strong and reproducible transcriptional co-regulatory patterns of genes of the TNFAIP1/POLDIP2 module in breast cancer cell subtypes and significant associations of the TNFAIP1/POLDIP2 CSAGA with (ii) amplification of the CSAGA region in breast cancer, (iii) cancer aggressiveness (e.g. genetic grades) and (iv) disease-free patient survival. Moreover, gene pairs of this module demonstrate a strong synergistic effect in the prognosis of time to breast cancer relapse. We suggest that the TNFAIP1/POLDIP2 cluster can be considered a novel type of structural-functional gene module in the human genome.
Abstract: In this paper, we probe into the traffic assignment problem using a chromosome-learning-based path finding method in simulation, which models the driver's behavior in the within-day process. By simply combining and changing the traffic route chromosomes, the driver at an intersection chooses his next route. Various crossover and mutation rules are proposed with extensive examples.
Abstract: This paper presents a novel algorithm for stereo
correspondence with the rank transform. In this algorithm we use a
genetic algorithm to obtain an accurate disparity map. Genetic
algorithms are efficient search methods based on principles of
population genetics, i.e. mating, chromosome crossover, gene
mutation, and natural selection. Finally, morphology is employed to
remove the errors and discontinuities.
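For readers unfamiliar with the rank transform, the Python sketch below shows the standard definition: each pixel is replaced by the number of pixels in its local window that are darker than it. The abstract does not say how the transform is combined with the genetic algorithm, so this covers only the preprocessing step. The function name rank_transform, the border handling (out-of-image neighbours are ignored), and the sample image are assumptions made for illustration.

```python
from typing import List


def rank_transform(image: List[List[int]], radius: int = 1) -> List[List[int]]:
    """Replace each pixel by the count of darker in-bounds pixels in its window."""
    h, w = len(image), len(image[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            centre = image[y][x]
            count = 0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    yy, xx = y + dy, x + dx
                    # Neighbours outside the image are skipped (assumed convention).
                    if 0 <= yy < h and 0 <= xx < w and image[yy][xx] < centre:
                        count += 1
            out[y][x] = count
    return out


# Hypothetical 3x3 grey-level image.
image = [
    [10, 20, 30],
    [40, 50, 60],
    [70, 80, 90],
]
ranks = rank_transform(image)
# ranks == [[0, 1, 1], [2, 4, 3], [2, 4, 3]]
```

Because only the ordering of intensities matters, the transform is unchanged by any increasing brightness or gain change, which is why it is a popular basis for matching costs between the left and right images.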