Abstract: In this paper we consider the Neumann problem for
a fourth-order differential equation. First we define the weighted Sobolev space
$W^2_\alpha$ and the generalized solution of this equation. Then we study the existence and uniqueness of the generalized solution,
and describe the spectrum and the domain of definition of the corresponding operator.
Abstract: Obtaining labeled data in supervised learning is often
difficult and expensive, and thus the trained learning algorithm tends
to overfit because of the small amount of training data. As a result,
some researchers have focused on using unlabeled data, which does
not necessarily follow the same generative distribution as the labeled
data, to construct high-level features for improving performance on
supervised learning tasks. In this paper, we investigate the impact of
the relationship between unlabeled and labeled data for classification
performance. Specifically, we apply different unlabeled data sets,
which have different degrees of relation to the labeled data, to a
handwritten digit classification task based on the MNIST dataset. Our
experimental results show that the higher the degree of relation
between unlabeled and labeled data, the better the classification
performance. Although unlabeled data drawn from a completely
different generative distribution than the labeled data gives the lowest
classification performance, the performance achieved is still high.
This expands the applicability of supervised
learning algorithms that use unsupervised learning.
Abstract: In this article, the fuzzy ART network algorithm is modified to make it supervised. The modification consists of searching for the comparison, training and vigilance parameters that give the minimum quadratic distance between the outputs of the training base and those obtained by the network. The same process is applied to determine the parameters of the fuzzy ARTMAP that give the best-performing network. A further modification consists in making the fuzzy ARTMAP learn a base of examples not just once, as is customary, but for as long as its architecture keeps evolving or the target error has not been reached. In this way, we do not have to worry about the values to impose on the eight parameters of the network. To evaluate each of these three modified networks, their performances are compared. As an application, we classify an image of the bay of Algiers taken by SPOT XS. The evaluation criteria are the training duration, the mean square error (MSE) in the control step and the rate of correct classification per class. The results of this study, presented as curves, tables and images, show that the modified fuzzy ARTMAP offers the best compromise between quality and computing time.
Abstract: Sparse representation, which can represent high-dimensional
data effectively, has been successfully used in computer vision
and pattern recognition problems. However, it doesn't consider the
label information of data samples. To overcome this limitation,
we develop a novel dimensionality reduction algorithm, namely
discriminatively regularized sparse subspace learning (DR-SSL), in this
paper. The proposed DR-SSL algorithm not only makes use of
the sparse representation to model the data, but also effectively
employs the label information to guide the procedure of dimensionality
reduction. In addition, the presented algorithm can effectively deal
with the out-of-sample problem. The experiments on gene-expression
data sets show that the proposed algorithm is an effective tool for
dimensionality reduction and gene-expression data classification.
Abstract: This paper proposes a method for improving the classification
efficiency of a classification model. The model is used
in a risk search system and extracts specific labels from articles
posted at bulletin board sites. The system can analyze the important
discussions composed of the articles. The improvement method
introduces ensemble learning methods that use multiple classification
models. Also, it introduces expressions related to the specific labels
into generation of word vectors. The paper applies the improvement
method to articles collected from three bulletin board sites selected
by users and verifies the effectiveness of the improvement method.
Abstract: In this article we discuss improving
multi-class classification using a multi-layer
perceptron. The approach considered consists in breaking down the
n-class problem into two-class subproblems. Each
two-class subproblem is trained independently; in the test phase,
the vector to be classified is presented to all the two-class
models, and the elected class is the strongest one, that is, the one that does not
lose any competition with the other classes. The recognition rates
obtained with the multi-class approach by two-class decomposition
are clearly better than those obtained with the simple multi-class
approach.
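The election step described above can be sketched as follows. This is only an illustration under stated assumptions: the two-class models here are toy nearest-centre rules standing in for trained perceptrons, and the names make_duel, elect_class and the class centres are hypothetical, not taken from the paper.

```python
from itertools import combinations


def make_duel(centers, a, b):
    """Toy two-class model for classes a and b (stand-in for a trained perceptron).

    It returns whichever of the two labels has the closer 1-D centre.
    """
    def duel(x):
        return a if abs(x - centers[a]) <= abs(x - centers[b]) else b
    return duel


def elect_class(x, centers):
    """Run every two-class duel on x and return the class that loses none.

    Returns None when no class wins all of its duels.
    """
    labels = sorted(centers)
    wins = {label: 0 for label in labels}
    for a, b in combinations(labels, 2):
        wins[make_duel(centers, a, b)(x)] += 1
    undefeated = [label for label in labels if wins[label] == len(labels) - 1]
    return undefeated[0] if undefeated else None


# Hypothetical class centres for a three-class problem.
centers = {"A": 0.0, "B": 5.0, "C": 10.0}
print(elect_class(4.2, centers))  # prints "B"
```

An n-class problem needs n(n-1)/2 such duels, and a class is elected only if it wins all n-1 of its own.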
Abstract: There are many situations where input feature vectors are incomplete, and methods to tackle the problem have been studied for a long time. A commonly used procedure is to replace each missing value with an imputation. This paper presents a method to perform categorical missing data imputation from numerical and categorical variables. The imputations are based on Simpson's fuzzy min-max neural networks where the input variables for learning and classification are just numerical. The proposed method extends the input to categorical variables by introducing new fuzzy sets, a new operation and a new architecture. The procedure is tested and compared with others using opinion poll data.
Abstract: The healthcare environment is generally perceived as
being information rich yet knowledge poor. However, there is a lack
of effective analysis tools to discover hidden relationships and trends
in data. In fact, valuable knowledge can be discovered by
applying data mining techniques in healthcare systems. In this
study, we present an effective methodology for extracting significant
patterns from Coronary Heart Disease warehouses for the prediction of heart
attacks, which unfortunately continue to be a leading cause
of mortality worldwide. For this purpose,
we propose to dynamically enumerate the optimal subsets of the
reduced features of high interest using the rough sets technique
combined with dynamic programming. We then propose to
validate the classification using Random Forest (RF) decision trees to
identify the risky heart disease cases. This work is based on a large
amount of data collected from several clinical institutions and based on
patients' medical profiles. Moreover, the experts' knowledge in
this field has been taken into consideration in order to define the
disease, its risk factors, and to establish significant knowledge
relationships among the medical factors. A computer-aided system is
developed for this purpose based on a population of 525 adults. The
performance of the proposed model is analyzed and evaluated against
a set of benchmark techniques applied to this classification problem.
Abstract: Decision fusion is one of the hot research topics in
the classification area; it aims to achieve the best possible
performance for the task at hand. In this paper, we
investigate the usefulness of this concept to improve change
detection accuracy in remote sensing. To this end, the outputs of
two fuzzy change detectors, based respectively on
simultaneous and comparative analysis of multitemporal
data, are fused by using fuzzy integral operators. This
method fuses the objective evidence produced by the
change detectors with respect to fuzzy measures that express
the difference of performance between them. The proposed
fusion framework is evaluated in comparison with some
ordinary fuzzy aggregation operators. Experiments carried
out on two SPOT images showed that the fuzzy integral was
the best performing. It improves the change detection
accuracy while attempting to equalize the accuracy rate in
both the change and no-change classes.
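As an illustration of fusing two detector outputs with respect to a fuzzy measure, here is a minimal sketch of the discrete Choquet integral, one common fuzzy integral. The abstract does not say which fuzzy integral operators were used, so this choice is an assumption, and the scores and measure values below are hypothetical.

```python
def choquet_integral(scores, measure):
    """Fuse per-source scores with a discrete Choquet integral.

    scores  -- dict mapping each source name to its score in [0, 1]
    measure -- dict mapping each frozenset of sources to its fuzzy measure
    """
    ordered = sorted(scores, key=scores.get)  # sources by ascending score
    fused, previous = 0.0, 0.0
    for i, source in enumerate(ordered):
        coalition = frozenset(ordered[i:])  # sources scoring at least this much
        fused += (scores[source] - previous) * measure[coalition]
        previous = scores[source]
    return fused


# Hypothetical change memberships for one pixel and hypothetical fuzzy measures.
scores = {"simultaneous": 0.8, "comparative": 0.4}
measure = {
    frozenset(["simultaneous"]): 0.7,
    frozenset(["comparative"]): 0.5,
    frozenset(["simultaneous", "comparative"]): 1.0,
}
print(round(choquet_integral(scores, measure), 2))  # prints 0.68
```

The measure of each single detector expresses how much it is trusted on its own, so the fused value leans towards the more reliable detector when the two disagree.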
Abstract: In this paper, we present a novel statistical approach to
corpus-based speech synthesis. Classically, phonetic information is
defined and treated as an acoustic reference to be respected. Accordingly,
many studies have been devoted to acoustic unit classification.
This type of classification separates units according to their
symbolic characteristics. Target cost and concatenation cost
were classically defined for unit selection.
In corpus-based speech synthesis systems that use large text
corpora, cost functions have been limited to a juxtaposition of symbolic
criteria, and the acoustic information of units is not exploited in the
definition of the target cost.
In this manuscript, we take into consideration the phonetic
information of units corresponding to acoustic information. This is done
by defining a probabilistic linguistic bigram model
used for unit selection. The selected units are extracted from
the English TIMIT corpus.
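To make the idea of a probabilistic bigram model over units concrete, here is a minimal sketch that estimates P(next unit | previous unit) by relative frequency. The function name and the toy unit sequences are hypothetical; the abstract does not describe how the model is estimated or how it enters the selection cost.

```python
from collections import Counter


def bigram_probabilities(sequences):
    """Estimate P(next | previous) from unit sequences by relative frequency."""
    pair_counts = Counter()
    context_counts = Counter()
    for units in sequences:
        for previous, following in zip(units, units[1:]):
            pair_counts[(previous, following)] += 1
            context_counts[previous] += 1
    return {
        pair: count / context_counts[pair[0]]
        for pair, count in pair_counts.items()
    }


# Hypothetical phone-label sequences, not taken from TIMIT.
training = [["sh", "iy"], ["sh", "ix"], ["sh", "iy"]]
probabilities = bigram_probabilities(training)
print(round(probabilities[("sh", "iy")], 3))  # prints 0.667
```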
Abstract: In this paper, the relative performance of spectral
classification of short exon and intron sequences of the human and
eleven model organisms is studied. In the simulations, all
combinations of sixteen one-sequence numerical representations, four
threshold values, and four window lengths are considered. Sequences
of 150-base length are chosen and for each organism, a total of
16,000 sequences are used for training and testing. Results indicate
that an appropriate combination of one-sequence numerical
representation, threshold value, and window length is essential for
arriving at top spectral classification results. For fixed-length
sequences, the precision of exon and intron classification obtained
for different organisms is not the same because of their genomic
differences. In general, precision increases as sequence length
increases.
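The roles of the numerical representation and the threshold can be sketched as follows. This is only an illustration under assumptions: it takes the spectral feature to be the power at frequency 1/3 (the period-3 peak commonly associated with coding regions), uses a hypothetical base-to-number mapping and threshold, and omits windowing. None of these details are given in the abstract.

```python
import cmath

# Hypothetical one-sequence numerical representation (one number per base).
MAPPING = {"A": 1.0, "C": -1.0, "G": -1.0, "T": 1.0}


def period3_power(sequence, mapping=MAPPING):
    """Spectral power of the mean-removed numeric sequence at frequency 1/3."""
    values = [mapping[base] for base in sequence]
    mean = sum(values) / len(values)
    component = sum(
        (value - mean) * cmath.exp(-2j * cmath.pi * position / 3)
        for position, value in enumerate(values)
    )
    return abs(component) ** 2 / len(values)


def looks_like_exon(sequence, threshold, mapping=MAPPING):
    """Label a sequence exon-like when its period-3 power exceeds the threshold."""
    return period3_power(sequence, mapping) > threshold


periodic = "ACG" * 50      # 150 bases with an exact period of 3
alternating = "AC" * 75    # 150 bases with an exact period of 2
print(looks_like_exon(periodic, threshold=10.0))     # prints True
print(looks_like_exon(alternating, threshold=10.0))  # prints False
```

Changing the mapping or the threshold changes which sequences pass, which is why the combination of the two matters.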
Abstract: The ElectroEncephaloGram (EEG) is useful for
clinical diagnosis and biomedical research. EEG signals often
contain strong ElectroOculoGram (EOG) artifacts produced
by eye movements and eye blinks, especially in EEG recorded
from frontal channels. These artifacts obscure the underlying
brain activity, making its visual or automated inspection
difficult. The goal of ocular artifact removal is to remove
ocular artifacts from the recorded EEG, leaving the underlying
background signals due to brain activity. In recent times,
Independent Component Analysis (ICA) algorithms have
demonstrated superior potential in obtaining the least
dependent source components. In this paper, the independent
components are obtained by using the JADE algorithm (the best
separating algorithm) and are classified as either artifact
components or neural components. A neural network is used for
the classification of the obtained independent components.
A neural network requires input features that accurately represent
the true character of the input signals so that the
network can classify the signals based on the key
characteristics that differentiate the various signals. In this
work, Auto Regressive (AR) coefficients are used as the input
features for classification. Two neural network approaches
are used to learn classification rules from EEG data. First, a
Polynomial Neural Network (PNN) trained by the GMDH (Group
Method of Data Handling) algorithm is used; second, a
feed-forward neural network (FNN) classifier trained by a standard
back-propagation algorithm is used. The
results show that JADE-FNN performs better than JADE-PNN.
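As an illustration of how AR coefficients could be extracted as features from one component, here is a minimal sketch using the Yule-Walker equations solved by the Levinson-Durbin recursion. The estimation method, the model order and the synthetic test signal are all assumptions; the abstract specifies none of them.

```python
import random


def autocovariance(signal, max_lag):
    """Biased sample autocovariance of the mean-removed signal for lags 0..max_lag."""
    n = len(signal)
    mean = sum(signal) / n
    centered = [sample - mean for sample in signal]
    return [
        sum(centered[t] * centered[t + lag] for t in range(n - lag)) / n
        for lag in range(max_lag + 1)
    ]


def ar_coefficients(signal, order):
    """Yule-Walker AR estimate via the Levinson-Durbin recursion.

    Returns [a_1, ..., a_p] for the model x[t] = sum_k a_k * x[t - k] + e[t].
    """
    r = autocovariance(signal, order)
    coeffs, error = [], r[0]
    for m in range(1, order + 1):
        acc = r[m] - sum(coeffs[j] * r[m - 1 - j] for j in range(m - 1))
        reflection = acc / error
        coeffs = [
            coeffs[j] - reflection * coeffs[m - 2 - j] for j in range(m - 1)
        ] + [reflection]
        error *= 1.0 - reflection ** 2
    return coeffs


# Synthetic AR(1) signal standing in for one independent component.
rng = random.Random(0)
signal = [0.0]
for _ in range(4999):
    signal.append(0.6 * signal[-1] + rng.gauss(0.0, 1.0))
print(ar_coefficients(signal, order=2))  # first value should be close to 0.6
```

The resulting coefficient list would form the feature vector passed to the classifier for that component.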
Abstract: Recent years have seen a growing trend towards the
integration of multiple information sources to support large-scale
prediction of protein-protein interaction (PPI) networks in model
organisms. Despite advances in computational approaches, the
combination of multiple “omic” datasets representing the same type
of data, e.g. different gene expression datasets, has not been
rigorously studied. Furthermore, there is a need to further investigate
the inference capability of powerful approaches, such as fully connected
Bayesian networks, in the context of the prediction of PPI
networks. This paper addresses these limitations by proposing a
Bayesian approach to integrate multiple datasets, some of which
encode the same type of “omic” data, to support the identification of
PPI networks. The case study reported involved the combination of
three gene expression datasets relevant to human heart failure (HF).
In comparison with two traditional methods, Naive Bayesian and
maximum likelihood ratio approaches, the proposed technique can
accurately identify known PPI and can be applied to infer potentially
novel interactions.
Abstract: The one-class support vector machine “support vector
data description” (SVDD) is an ideal approach for anomaly or outlier
detection. However, for the applicability of SVDD in real-world
applications, the ease of use is crucial. The results of SVDD are
strongly determined by the choice of the regularisation parameter C
and the kernel parameter of the widely used RBF kernel. While for
two-class SVMs the parameters can be tuned using cross-validation
based on the confusion matrix, for a one-class SVM this is not
possible, because only true positives and false negatives can occur
during training. This paper proposes an approach to find the optimal
set of parameters for SVDD solely based on a training set from
one class and without any user parameterisation. Results on artificial
and real data sets are presented, underpinning the usefulness of the
approach.
Abstract: Many approaches to handwritten Thai character
recognition have been proposed, such as comparing heads of
characters, fuzzy logic and structure trees. This paper presents a
system of handwritten Thai character recognition, which is based on
the Ant-miner algorithm (data mining based on ant colony
optimization). Zoning is initially used to determine each character.
Then three distinct features (also called attributes) of each character
in each zone are extracted. The attributes are Head zone, End point,
and Feature code. All attributes are used to construct the
classification rules by an Ant-miner algorithm in order to classify
112 Thai characters. For this experiment, the Ant-miner algorithm is
adapted, with a small change to increase the recognition rate. The
result of this experiment is a 97% recognition rate on the training set
(11200 characters) and an 82.7% recognition rate on unseen test data
(22400 characters).
Abstract: Knowledge sharing in general and the contextual
access to knowledge in particular, still represent a key challenge in
the knowledge management framework. Researchers in the semantic
web and human-machine interface fields study techniques to enhance this
access. For instance, in the semantic web, information retrieval is
based on domain ontology. In human-machine interfaces, keeping
track of user's activity provides some elements of the context that can
guide the access to information. We suggest an approach based on
these two key guidelines, whilst avoiding some of their weaknesses.
The approach permits a representation of both the context and the
design rationale of a project for efficient access to knowledge. In
fact, the method consists of an information retrieval environment
that, on the one hand, can infer knowledge, modeled as a semantic
network, and on the other hand, is based on the context and the
objectives of a specific activity (the design). The environment we
defined can also be used to gather similar project elements in order to
build classifications of tasks, problems, arguments, etc. produced in a
company. These classifications can show the evolution of design
strategies in the company.
Abstract: Results of Chilean wine classification based on the
information provided by an electronic nose are reported in this paper.
The classification scheme consists of two stages. In the first stage,
Principal Component Analysis is used as a feature extraction method to
reduce the dimensionality of the original information. Then, a Radial
Basis Function Neural Network is used as a pattern recognition
technique to perform the classification. The objective of this study is
to classify different Cabernet Sauvignon, Merlot and Carménère wine
samples from different years, valleys and vineyards of Chile.
Abstract: Text categorization (the assignment of texts in natural language into predefined categories) is an important and extensively studied problem in Machine Learning. Currently, popular techniques developed to deal with this task include many preprocessing and learning algorithms, many of which in turn require tuning nontrivial internal parameters. Although partial studies are available, many authors fail to report values of the parameters they use in their experiments, or reasons why these values were used instead of others. The goal of this work then is to create a more thorough comparison of preprocessing parameters and their mutual influence, and report interesting observations and results.
Abstract: In quality control of freeze-dried durian, crispiness is
a key quality index of the product. Generally, crispiness testing has to be
done by a destructive method. Nondestructive testing of
crispiness is desirable because the samples can then be reused for other
kinds of testing. This paper proposes a crispiness classification
method for freeze-dried durians using fuzzy logic for decision
making. The physical changes of a freeze-dried durian include the
pores appearing in the images. Three physical features including (1)
the diameters of pores, (2) the ratio of the pore area and the
remaining area, and (3) the distribution of the pores are considered to
contribute to the crispiness. Fuzzy logic is applied to make the
decision. The experimental results, compared with food expert
opinion, show that the accuracy of the proposed classification
method is 83.33 percent.
Abstract: This work presents a neural network model for the
clustering analysis of data based on Self Organizing Maps (SOM).
The model evolves during the training stage towards a hierarchical
structure according to the input requirements. The hierarchical structure
symbolizes a specialization tool that provides refinements of the
classification process. The structure behaves like a single map with
different resolutions depending on the region to analyze. The benefits
and performance of the algorithm are discussed in application to the
Iris dataset, a classical example for pattern recognition.