Abstract: The main goal of data mining is to extract accurate, comprehensible and interesting knowledge from databases that may be considered as large search spaces. In this paper, a new, efficient type of Genetic Algorithm (GA) called uniform two-level GA is proposed as a search strategy to discover truly interesting, high-level prediction rules, a difficult problem and relatively little researched, rather than discovering classification knowledge as usual in the literatures. The proposed method uses the advantage of uniform population method and addresses the task of generalized rule induction that can be regarded as a generalization of the task of classification. Although the task of generalized rule induction requires a lot of computations, which is usually not satisfied with the normal algorithms, it was demonstrated that this method increased the performance of GAs and rapidly found interesting rules.
Abstract: Research papers are usually evaluated via peer
review. However, peer review has limitations in evaluating research
papers. In this paper, Scienstein and the new idea of 'collaborative
document evaluation' are presented. Scienstein is a project to
evaluate scientific papers collaboratively based on ratings, links,
annotations and classifications by the scientific community using the
internet. In this paper, critical success factors of collaborative
document evaluation are analyzed. That is the scientists- motivation
to participate as reviewers, the reviewers- competence and the
reviewers- trustworthiness. It is shown that if these factors are
ensured, collaborative document evaluation may prove to be a more
objective, faster and less resource intensive approach to scientific
document evaluation in comparison to the classical peer review
process. It is shown that additional advantages exist as collaborative
document evaluation supports interdisciplinary work, allows
continuous post-publishing quality assessments and enables the
implementation of academic recommendation engines. In the long
term, it seems possible that collaborative document evaluation will
successively substitute peer review and decrease the need for
journals.
Abstract: Eigenvector methods are gaining increasing acceptance in the area of spectrum estimation. This paper presents a successful attempt at testing and evaluating the performance of two of the most popular types of subspace techniques in determining the parameters of multiexponential signals with real decay constants buried in noise. In particular, MUSIC (Multiple Signal Classification) and minimum-norm techniques are examined. It is shown that these methods perform almost equally well on multiexponential signals with MUSIC displaying better defined peaks.
Abstract: Mammographic images and data analysis to
facilitate modelling or computer aided diagnostic (CAD) software development should best be done using a common database that can handle various mammographic image file
formats and relate these to other patient information.
This would optimize the use of the data as both primary
reporting and enhanced information extraction of research data could be performed from the single dataset. One desired
improvement is the integration of DICOM file header information into the database, as an efficient and reliable source of supplementary patient information intrinsically
available in the images.
The purpose of this paper was to design a suitable database to link and integrate different types of image files and gather common information that can be further used for research
purposes. An interface was developed for accessing, adding,
updating, modifying and extracting data from the common
database, enhancing the future possible application of the data in CAD processing.
Technically, future developments envisaged include the creation of an advanced search function to selects image files
based on descriptor combinations. Results can be further used for specific CAD processing and other research. Design of a
user friendly configuration utility for importing of the required fields from the DICOM files must be done.
Abstract: The International Classification of Primary Care (ICPC), which belongs to the WHO Family of International Classifications (WHO-FIC), has a low granularity, which is convenient for describing general medical practice. However, its lack of specificity makes it useful to be used along with an interface terminology. An international survey has been performed, using a questionnaire sent by email to experts from 25 countries, in order to describe the terminologies interfacing with ICPC. Eleven interface terminologies have been identified, developed in Argentina, Australia, Belgium (2), Canada, Denmark, France, Germany, Norway, South Africa, and The Netherlands. Globally, these systems have been poorly assessed until now.
Abstract: In the current economy of increasing global
competition, many organizations are attempting to use knowledge as
one of the means to gain sustainable competitive advantage. Besides
large organizations, the success of SMEs can be linked to how well
they manage their knowledge. Despite the profusion of research
about knowledge management within large organizations, fewer
studies tried to analyze KM in SMEs.
This research proposes a new framework showing the determinant
role of organizational dimensions onto KM approaches. The paper
and its propositions are based on a literature review and analysis.
In this research, personalization versus codification,
individualization versus institutionalization and IT-based versus non
IT-based are highlighted as three distinct dimensions of knowledge
management approaches.
The study contributes to research by providing a more nuanced
classification of KM approaches and provides guidance to managers
about the types of KM approaches that should be adopted based on
the size, geographical dispersion and task nature of SMEs.
To the author-s knowledge, the paper is the first of its kind to
examine if there are suitable configurations of KM approaches for
SMEs with different dimensions. It gives valuable information, which
hopefully will help SME sector to accomplish KM.
Abstract: The distinction among urban, periurban and rural areas represents a classical example of uncertainty in land classification. Satellite images, geostatistical analysis and all kinds of spatial data are very useful in urban sprawl studies, but it is important to define precise rules in combining great amounts of data to build complex knowledge about territory. Rough Set theory may be a useful method to employ in this field. It represents a different mathematical approach to uncertainty by capturing the indiscernibility. Two different phenomena can be indiscernible in some contexts and classified in the same way when combining available information about them. This approach has been applied in a case of study, comparing the results achieved with both Map Algebra technique and Spatial Rough Set. The study case area, Potenza Province, is particularly suitable for the application of this theory, because it includes 100 municipalities with different number of inhabitants and morphologic features.
Abstract: In this paper we propose a robust environmental sound classification approach, based on spectrograms features driven from log-Gabor filters. This approach includes two methods. In the first methods, the spectrograms are passed through an appropriate log-Gabor filter banks and the outputs are averaged and underwent an optimal feature selection procedure based on a mutual information criteria. The second method uses the same steps but applied only to three patches extracted from each spectrogram.
To investigate the accuracy of the proposed methods, we conduct experiments using a large database containing 10 environmental sound classes. The classification results based on Multiclass Support Vector Machines show that the second method is the most efficient with an average classification accuracy of 89.62 %.
Abstract: Monitoring lightning electromagnetic pulses (sferics)
and other terrestrial as well as extraterrestrial transient radiation signals
is of considerable interest for practical and theoretical purposes
in astro- and geophysics as well as meteorology. Managing a continuous
flow of data, automisation of the detection and classification
process is important. Features based on a combination of wavelet and
statistical methods proved efficient for analysis and characterisation
of transients and as input into a radial basis function network that is
trained to discriminate transients from pulse like to wave like.
Abstract: Classifier fusion may generate more accurate
classification than each of the basic classifiers. Fusion is often based
on fixed combination rules like the product, average etc. This paper
presents decision templates as classifier fusion method for the
recognition of the handwritten English and Farsi numerals (1-9).
The process involves extracting a feature vector on well-known
image databases. The extracted feature vector is fed to multiple
classifier fusion. A set of experiments were conducted to compare
decision templates (DTs) with some combination rules. Results from
decision templates conclude 97.99% and 97.28% for Farsi and
English handwritten digits.
Abstract: The standard investigational method for obstructive
sleep apnea syndrome (OSAS) diagnosis is polysomnography (PSG),
which consists of a simultaneous, usually overnight recording of
multiple electro-physiological signals related to sleep and
wakefulness. This is an expensive, encumbering and not a readily
repeated protocol, and therefore there is need for simpler and easily
implemented screening and detection techniques. Identification of
apnea/hypopnea events in the screening recordings is the key factor
for the diagnosis of OSAS. The analysis of a solely single-lead
electrocardiographic (ECG) signal for OSAS diagnosis, which may
be done with portable devices, at patient-s home, is the challenge of
the last years. A novel artificial neural network (ANN) based
approach for feature extraction and automatic identification of
respiratory events in ECG signals is presented in this paper. A
nonlinear principal component analysis (NLPCA) method was
considered for feature extraction and support vector machine for
classification/recognition. An alternative representation of the
respiratory events by means of Kohonen type neural network is
discussed. Our prospective study was based on OSAS patients of the
Clinical Hospital of Pneumology from Iaşi, Romania, males and
females, as well as on non-OSAS investigated human subjects. Our
computed analysis includes a learning phase based on cross signal
PSG annotation.
Abstract: Discovering new biological knowledge from the highthroughput biological data is a major challenge to bioinformatics today. To address this challenge, we developed a new approach for protein classification. Proteins that are evolutionarily- and thereby functionally- related are said to belong to the same classification. Identifying protein classification is of fundamental importance to document the diversity of the known protein universe. It also provides a means to determine the functional roles of newly discovered protein sequences. Our goal is to predict the functional classification of novel protein sequences based on a set of features extracted from each protein sequence. The proposed technique used datasets extracted from the Structural Classification of Proteins (SCOP) database. A set of spectral domain features based on Fast Fourier Transform (FFT) is used. The proposed classifier uses multilayer back propagation (MLBP) neural network for protein classification. The maximum classification accuracy is about 91% when applying the classifier to the full four levels of the SCOP database. However, it reaches a maximum of 96% when limiting the classification to the family level. The classification results reveal that spectral domain contains information that can be used for classification with high accuracy. In addition, the results emphasize that sequence similarity measures are of great importance especially at the family level.
Abstract: In hydrocyclones, the particle separation efficiency is
limited by the suspended fine particles, which are discharged with the
coarse product in the underflow. It is well known that injecting water
in the conical part of the cyclone reduces the fine particle fraction in
the underflow. This paper presents a mathematical model that
simulates the water injection in the conical component. The model
accounts for the fluid flow and the particle motion. Particle
interaction, due to hindered settling caused by increased density and
viscosity of the suspension, and fine particle entrainment by settling
coarse particles are included in the model. Water injection in the
conical part of the hydrocyclone is performed to reduce fine particle
discharge in the underflow. The model demonstrates the impact of
the injection rate, injection velocity, and injection location on the
shape of the partition curve. The simulations are compared with
experimental data of a 50-mm cyclone.
Abstract: Retinal vascularity assessment plays an important role in diagnosis of ophthalmic pathologies. The employment of digital images for this purpose makes possible a computerized approach and has motivated development of many methods for automated vascular tree segmentation. Metrics based on contingency tables for binary classification have been widely used for evaluating performance of these algorithms and, concretely, the accuracy has been mostly used as measure of global performance in this topic. However, this metric shows very poor matching with human perception as well as other notable deficiencies. Here, a new similarity function for measuring quality of retinal vessel segmentations is proposed. This similarity function is based on characterizing the vascular tree as a connected structure with a measurable area and length. Tests made indicate that this new approach shows better behaviour than the current one does. Generalizing, this concept of measuring descriptive properties may be used for designing functions for measuring more successfully segmentation quality of other complex structures.
Abstract: Motor imagery classification provides an important basis for designing Brain Machine Interfaces [BMI]. A BMI captures and decodes brain EEG signals and transforms human thought into actions. The ability of an individual to control his EEG through imaginary mental tasks enables him to control devices through the BMI. This paper presents a method to design a four state BMI using EEG signals recorded from the C3 and C4 locations. Principle features extracted through principle component analysis of the segmented EEG are analyzed using two novel classification algorithms using Elman recurrent neural network and functional link neural network. Performance of both classifiers is evaluated using a particle swarm optimization training algorithm; results are also compared with the conventional back propagation training algorithm. EEG motor imagery recorded from two subjects is used in the offline analysis. From overall classification performance it is observed that the BP algorithm has higher average classification of 93.5%, while the PSO algorithm has better training time and maximum classification. The proposed methods promises to provide a useful alternative general procedure for motor imagery classification
Abstract: The fine structure of supercavitation in the wake of a
symmetrical cylinder is studied with high-speed video cameras. The
flow is observed in a cavitation tunnel at the speed of 8m/sec when the
sidewall and the wake are partially filled with the massive cavitation
bubbles. The present experiment observed that a two-dimensional
ripple wave with a wave length of 0.3mm is propagated in a
downstream direction, and then abruptly increases to a thicker
three-dimensional layer. IR-photography recorded that the wakes
originated from the horseshoe vortexes alongside the cylinder. The
wake was developed to inside the dead water zone, which absorbed the
bubbly wake propelled from the separated vortices at the center of the
cylinder. A remote sensing classification technique (maximum most
likelihood) determined that the surface porosity was 0.2, and the mean
speed in the mixed wake was 7m/sec. To confirm the existence of
two-dimensional wave motions in the interface, the experiments were
conducted at a very low frequency, and showed similar gravity waves
in both the upper and lower interfaces.
Abstract: This paper presents the design and implementation of
the WebGD, a CORBA-based document classification and retrieval
system on Internet. The WebGD makes use of such techniques as Web,
CORBA, Java, NLP, fuzzy technique, knowledge-based processing
and database technology. Unified classification and retrieval model,
classifying and retrieving with one reasoning engine and flexible
working mode configuration are some of its main features. The
architecture of WebGD, the unified classification and retrieval model,
the components of the WebGD server and the fuzzy inference engine
are discussed in this paper in detail.
Abstract: In order to develop forest management strategies in
tropical forest in Malaysia, surveying the forest resources and
monitoring the forest area affected by logging activities is essential.
There are tremendous effort has been done in classification of land
cover related to forest resource management in this country as it is a
priority in all aspects of forest mapping using remote sensing and
related technology such as GIS. In fact classification process is a
compulsory step in any remote sensing research. Therefore, the main
objective of this paper is to assess classification accuracy of
classified forest map on Landsat TM data from difference number of
reference data (200 and 388 reference data). This comparison was
made through observation (200 reference data), and interpretation
and observation approaches (388 reference data). Five land cover
classes namely primary forest, logged over forest, water bodies, bare
land and agricultural crop/mixed horticultural can be identified by
the differences in spectral wavelength. Result showed that an overall
accuracy from 200 reference data was 83.5 % (kappa value
0.7502459; kappa variance 0.002871), which was considered
acceptable or good for optical data. However, when 200 reference
data was increased to 388 in the confusion matrix, the accuracy
slightly improved from 83.5% to 89.17%, with Kappa statistic
increased from 0.7502459 to 0.8026135, respectively. The accuracy
in this classification suggested that this strategy for the selection of
training area, interpretation approaches and number of reference data
used were importance to perform better classification result.
Abstract: The paper describes a new approach for fingerprint
classification, based on the distribution of local features (minute
details or minutiae) of the fingerprints. The main advantage is that
fingerprint classification provides an indexing scheme to facilitate
efficient matching in a large fingerprint database. A set of rules based
on heuristic approach has been proposed. The area around the core
point is treated as the area of interest for extracting the minutiae
features as there are substantial variations around the core point as
compared to the areas away from the core point. The core point in a
fingerprint has been located at a point where there is maximum
curvature. The experimental results report an overall average
accuracy of 86.57 % in fingerprint classification.
Abstract: A new approach for facial expressions recognition based on face context and adaptively weighted sub-pattern PCA (Aw-SpPCA) has been presented in this paper. The facial region and others part of the body have been segmented from the complex environment based on skin color model. An algorithm has been proposed to accurate detection of face region from the segmented image based on constant ratio of height and width of face (δ= 1.618). The paper also discusses on new concept to detect the eye and mouth position. The desired part of the face has been cropped to analysis the expression of a person. Unlike PCA based on a whole image pattern, Aw-SpPCA operates directly on its sub patterns partitioned from an original whole pattern and separately extracts features from them. Aw-SpPCA can adaptively compute the contributions of each part and a classification task in order to enhance the robustness to both expression and illumination variations. Experiments on single standard face with five types of facial expression database shows that the proposed method is competitive.