Abstract: Classification is an important topic in machine learning
and bioinformatics. Many datasets have been introduced for
classification tasks. A dataset contains multiple features, and the quality of features influences the classification accuracy of the dataset.
The classification power of each feature differs. In this study, we
propose the Classification Influence Index (CII) as an indicator of the classification power of each feature. The CII enables evaluation of the
features in a dataset and improvement of classification accuracy through transformation of the dataset. In experiments using the CII
and the k-nearest neighbor classifier on real datasets, we confirmed that the proposed index yields a meaningful improvement
in classification accuracy.
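The abstract does not give the CII formula, but the idea of scoring each feature's classification power can be sketched with a hypothetical proxy: leave-one-out 1-NN accuracy computed from one feature at a time (the function name and toy data below are illustrative, not the paper's definition):

```python
# Hypothetical sketch: score each feature's classification power with
# leave-one-out 1-NN accuracy, then keep the strongest features.
# (The paper's actual CII definition is not reproduced here.)

def single_feature_loo_accuracy(values, labels):
    """Leave-one-out 1-NN accuracy using one feature alone."""
    correct = 0
    for i, v in enumerate(values):
        # nearest neighbour among the remaining samples
        j = min((k for k in range(len(values)) if k != i),
                key=lambda k: abs(values[k] - v))
        if labels[j] == labels[i]:
            correct += 1
    return correct / len(values)

# Toy data: feature 0 separates the classes, feature 1 is noise.
X = [[0.1, 5.0], [0.2, 1.0], [0.3, 4.0], [2.1, 4.9], [2.2, 1.1], [2.3, 0.9]]
y = [0, 0, 0, 1, 1, 1]

scores = [single_feature_loo_accuracy([row[f] for row in X], y)
          for f in range(2)]
```

On this toy data the discriminative feature scores 1.0 and the noise feature 0.0, which is the kind of ranking a per-feature index enables.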
Abstract: This paper presents an implementation of an object tracking system for video sequences. Object tracking is an important task in many vision applications. Video analysis involves two main steps: detection of interesting moving objects and tracking of those objects from frame to frame. Most tracking algorithms rely on pre-specified preprocessing methods. In our work, we have implemented several object tracking algorithms (Meanshift, Camshift, Kalman filter) with different preprocessing methods and evaluated their performance on different video sequences. The obtained results show good performance with respect to applicability and the evaluation criteria.
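One of the trackers named above, the Kalman filter, can be sketched in one dimension with a constant-velocity model (the noise parameters and data below are illustrative, not the paper's implementation):

```python
# Minimal 1-D constant-velocity Kalman filter as a tracking sketch.
# State is (position, velocity); only position is measured.

def kalman_track(measurements, q=1e-3, r=0.25):
    """Filter noisy position measurements; returns filtered positions."""
    x, v = measurements[0], 0.0          # state estimate
    P = [[1.0, 0.0], [0.0, 1.0]]         # state covariance
    out = []
    for z in measurements:
        # predict with F = [[1, 1], [0, 1]] (dt = 1), process noise q
        x, v = x + v, v
        P = [[P[0][0] + P[1][0] + P[0][1] + P[1][1] + q, P[0][1] + P[1][1]],
             [P[1][0] + P[1][1], P[1][1] + q]]
        # update with measurement z of the position (H = [1, 0])
        s = P[0][0] + r
        k0, k1 = P[0][0] / s, P[1][0] / s
        resid = z - x
        x, v = x + k0 * resid, v + k1 * resid
        P = [[(1 - k0) * P[0][0], (1 - k0) * P[0][1]],
             [P[1][0] - k1 * P[0][0], P[1][1] - k1 * P[0][1]]]
        out.append(x)
    return out

track = kalman_track([0.0, 1.1, 1.9, 3.2, 3.9, 5.1])
```

The filtered track follows the rising measurements while smoothing their jitter; in a real tracker the same recursion runs on 2-D object coordinates.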
Abstract: This paper presents the development of a wavelet-based
algorithm for distinguishing between magnetizing inrush
currents and power system fault currents. The algorithm is an adequate,
reliable, fast and computationally efficient tool. The proposed
technique consists of a preprocessing unit based on discrete wavelet
transform (DWT) in combination with an artificial neural network
(ANN) for detecting and classifying fault currents. The DWT acts as
an extractor of distinctive features in the input signals at the relay
location. This information is then fed into an ANN for classifying
fault and magnetizing inrush conditions. A 220/55/55 V, 50 Hz
laboratory transformer connected to a 380 V power system was
simulated using ATP-EMTP. The DWT was implemented in
Matlab, and a Coiflet mother wavelet was used to analyze primary
currents and generate training data. The simulated results presented
clearly show that the proposed technique can accurately discriminate
between magnetizing inrush and fault currents in transformer
protection.
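The DWT feature-extraction idea can be sketched with a hand-rolled one-level Haar transform (the paper uses a Coiflet wavelet; Haar is substituted here only to keep the sketch dependency-free, and the signals are synthetic):

```python
# Sketch of DWT feature extraction: the energy of the detail
# coefficients is a simple discriminative feature for a classifier.
import math

def haar_dwt(signal):
    """One-level Haar DWT: returns (approximation, detail) coefficients."""
    approx = [(signal[i] + signal[i + 1]) / math.sqrt(2)
              for i in range(0, len(signal) - 1, 2)]
    detail = [(signal[i] - signal[i + 1]) / math.sqrt(2)
              for i in range(0, len(signal) - 1, 2)]
    return approx, detail

def detail_energy(signal):
    """Energy of the detail coefficients."""
    _, d = haar_dwt(signal)
    return sum(c * c for c in d)

smooth = [math.sin(2 * math.pi * t / 16) for t in range(16)]  # slow wave
spiky = [(-1) ** t for t in range(16)]                        # fast transient
```

A smooth waveform concentrates its energy in the approximation band, while a fast transient shows up strongly in the detail band; features like these are what the ANN would classify.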
Abstract: Medical image segmentation based on image smoothing followed by edge detection is of great importance in image processing. This paper proposes a novel algorithm for medical image segmentation based on vigorous smoothing, driven by identification of the noise type, followed by edge detection, an approach that is a boon to medical image diagnosis. The algorithm takes a medical image as input, preprocesses it to remove the noise content by applying a suitable filter after identifying the noise type, and finally carries out edge detection for segmentation. The algorithm consists of three parts. First, the type of noise present in the medical image is identified as additive, multiplicative or impulsive by analysis of local histograms, and the image is denoised with a Median, Gaussian or Frost filter accordingly. Second, edge detection is performed on the filtered image using the Canny edge detector. Third, the edge-detected image is segmented by the method of normalized-cut eigenvectors. The method is validated through experiments on real images, with the algorithm simulated on the MATLAB platform. The simulation results show that the proposed algorithm is effective on low-quality or marginally vague images with high spatial redundancy, low contrast and considerable noise, and has potential for practical use in medical image diagnosis.
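The impulse-noise branch of such a pipeline can be illustrated with a 3x3 median filter (noise-type detection from local histograms and the Canny/normalized-cut stages are omitted; the image below is a toy array):

```python
# Sketch: a 3x3 median filter removes a salt-and-pepper outlier while
# leaving flat regions unchanged.
import statistics

def median_filter3(img):
    """3x3 median filter; border pixels are copied unchanged."""
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            window = [img[y + dy][x + dx]
                      for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
            out[y][x] = statistics.median(window)
    return out

img = [[10, 10, 10, 10],
       [10, 255, 10, 10],   # a single impulse ("salt") pixel
       [10, 10, 10, 10],
       [10, 10, 10, 10]]
clean = median_filter3(img)
```

The impulse at (1, 1) is replaced by the neighbourhood median of 10, which is why median filtering is the standard choice once impulsive noise has been identified.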
Abstract: The presented work is motivated by a French law
regarding nuclear waste management. A new conceptual Accelerator
Driven System (ADS) designed for the Minor Actinides (MA)
transmutation has been assessed by numerical simulation. The
MUltiple Spallation Target (MUST) ADS combines high thermal power (up to 1.4 GWth) and high specific power. A 30 mA and 1
GeV proton beam is divided into three secondary beams delivered to three liquid lead-bismuth spallation targets. Neutron and thermal-hydraulic
simulations have been performed with the MURE code, based on the Monte-Carlo transport code MCNPX. A methodology has been developed to define the characteristics of the MUST ADS concept according to a specific transmutation scenario. The reference
scenario is based on a MA feed (neptunium, americium and curium)
coming from European Pressurized Reactors (EPR), and a plutonium multi-reprocessing
strategy is accounted for. The MUST ADS reference
concept is a sodium cooled fast reactor. The MA fuel at equilibrium is mixed with MgO inert matrix to limit the core reactivity and
improve the fuel thermal conductivity. The fuel is irradiated over five
years. Five years of cooling and two years for the fuel fabrication are
taken into account. The MUST ADS reference concept burns about 50% of the initial MA inventory during a complete cycle. In terms of
mass, up to 570 kg/year are transmuted in one unit. The methodology used to design the MUST ADS and to calculate the fuel
composition at equilibrium is described in detail in the paper. A detailed fuel evolution analysis is performed, and the reference scenario is compared to a scenario in which only americium transmutation is performed.
Abstract: In Geographic Information Systems, one source
of the required geographic data is digitizing analog maps and
evaluating aerial and satellite photos. In this study, a method is
discussed that can be used to extract vectorial features and
create vectorized drawing files from aerial photos, and software
implementing it has been developed. Converting from raster to
vector, also known as vectorization, is the most important step
in creating vectorized drawing files. In the developed algorithm,
the aerial photo is first preprocessed: it is
converted to grayscale if necessary, noise is reduced, filters are
applied, and object edges are detected. After these steps,
the pixels constituting the photo are followed from upper left
to bottom right by examining their neighborhood relationships, and
one-pixel-wide lines or polylines are obtained. The traced lines must be
marked to prevent confusion while vectorization continues,
because unmarked pixels can be perceived as new lines, while simply
erasing them can cause discontinuities in the vector drawing;
therefore the image is converted from 2-bit to 8-bit form and the
detected pixels are expressed with a distinct value.
In conclusion, the aerial photo can be converted to a vector form
consisting of lines and polylines that can be opened in any CAD
application.
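The pixel-following step can be sketched as an 8-connected walk from the first foreground pixel in scan order, marking visited pixels with a distinct value instead of erasing them, mirroring the 2-bit-to-8-bit marking idea (the image and marker value below are illustrative):

```python
# Sketch of line tracing: walk 8-connected foreground neighbours and
# mark visited pixels so they are neither re-detected nor erased.

VISITED = 2  # distinct marker value for already-traced pixels

def trace_line(img):
    """Trace one 8-connected line; returns the list of (row, col) points."""
    h, w = len(img), len(img[0])
    # first foreground pixel, scanning top-left to bottom-right
    start = next(((y, x) for y in range(h) for x in range(w)
                  if img[y][x] == 1), None)
    if start is None:
        return []
    path, cur = [], start
    while cur is not None:
        y, x = cur
        img[y][x] = VISITED
        path.append(cur)
        cur = next(((y + dy, x + dx)
                    for dy in (-1, 0, 1) for dx in (-1, 0, 1)
                    if 0 <= y + dy < h and 0 <= x + dx < w
                    and img[y + dy][x + dx] == 1), None)
    return path

img = [[0, 0, 0, 0],
       [0, 1, 1, 0],
       [0, 0, 0, 1],
       [0, 0, 0, 0]]
line = trace_line(img)
```

After tracing, the pixels remain in the image with a different value, so later passes see them as "done" rather than as new lines, avoiding both double detection and discontinuities.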
Abstract: The frontal area in the brain is known to be involved in
behavioral judgement. Because a Kanji character can be discriminated
from other characters both visually and linguistically, we
hypothesized that in Kanji character discrimination, frontal event-related potential
(ERP) waveforms reflect two discrimination processes in separate
time periods: one based on visual analysis and the other based
on lexical access. To examine this hypothesis, we recorded ERPs
while subjects performed a Kanji lexical decision task. In this task, either a
known Kanji character, an unknown Kanji character or a symbol was
presented, and the subject had to report whether the presented character
was known to them or not. The same response
was required for unknown Kanji trials and symbol trials. For signal
preprocessing, we examined a method using independent component
analysis for artifact rejection, found it effective, and
adopted it. In the ERP results, there
were two time periods in which the frontal ERP waveforms were
significantly different between the unknown Kanji trials and the
symbol trials: around 170 ms and around 300 ms after stimulus onset.
This result supported our hypothesis. In addition, it suggests
that Kanji character lexical access may be fully completed by around
260 ms after stimulus onset.
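The core ERP analysis step, averaging epochs per condition and comparing mean amplitude in a time window, can be sketched as follows (ICA artifact rejection and the statistics are omitted; the epochs below are synthetic):

```python
# Sketch: average epochs per condition, then compare mean amplitude in
# a time window where the conditions are expected to differ.

def erp_average(epochs):
    """Average a list of equal-length epochs sample-by-sample."""
    n = len(epochs)
    return [sum(e[t] for e in epochs) / n for t in range(len(epochs[0]))]

def window_mean(erp, start, stop):
    """Mean amplitude of an ERP over the sample window [start, stop)."""
    return sum(erp[start:stop]) / (stop - start)

# synthetic epochs: condition A has a response in samples 3..6, B does not
cond_a = [[0, 0, 0, 4, 5, 4, 0, 0], [0, 0, 0, 5, 6, 5, 0, 0]]
cond_b = [[0, 0, 0, 1, 0, 1, 0, 0], [0, 0, 0, 0, 1, 0, 0, 0]]
diff = (window_mean(erp_average(cond_a), 3, 6)
        - window_mean(erp_average(cond_b), 3, 6))
```

A large window difference between conditions in a given latency band is what, with proper statistics, identifies the two discrimination periods.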
Abstract: Many factors affect the success of Machine Learning
(ML) on a given task. The representation and quality of the instance
data is first and foremost. If there is much irrelevant and redundant
information present or noisy and unreliable data, then knowledge
discovery during the training phase is more difficult. It is well known
that data preparation and filtering steps take a considerable amount of
processing time in ML problems. Data pre-processing includes data
cleaning, normalization, transformation, feature extraction and
selection, etc. The product of data pre-processing is the final training
set. It would be ideal if a single sequence of data pre-processing
algorithms performed best for every data set, but this is not
the case. We therefore present the best-known algorithms for each
step of data pre-processing so that one can achieve the best
performance for a given data set.
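Two of the steps listed above, data cleaning and normalization, can be sketched as mean imputation of missing values followed by z-score standardization (column values below are illustrative):

```python
# Sketch of two common pre-processing steps: mean imputation of
# missing values, then z-score normalization of the cleaned column.
import statistics

def impute_mean(column):
    """Replace None entries with the mean of the present values."""
    present = [v for v in column if v is not None]
    m = statistics.fmean(present)
    return [m if v is None else v for v in column]

def zscore(column):
    """Standardize a column to zero mean and unit (population) variance."""
    m = statistics.fmean(column)
    s = statistics.pstdev(column)
    return [(v - m) / s for v in column]

col = zscore(impute_mean([1.0, None, 3.0, 5.0]))
```

The resulting column has zero mean and unit variance, which is the form many downstream learners expect; which imputation or scaling variant works best is exactly the per-dataset question the survey addresses.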
Abstract: Brain-Computer Interface (BCI) research has grown
rapidly in recent years. Functional near-infrared spectroscopy (fNIRS)
is one of the latest technologies that uses light in the near-infrared
range to measure brain activity. Because near-infrared technology
allows the design of safe, portable, wearable, non-invasive and wireless
monitoring systems, fNIRS monitoring of brain
hemodynamics can be valuable for understanding brain tasks. In
this paper, we present results of fNIRS signal analysis indicating that
there exist distinct patterns of hemodynamic responses from which
brain tasks can be recognized, a step toward developing a BCI. We applied two
mathematical tools separately: wavelet analysis for
preprocessing, as signal filters and feature extractors, and neural
networks as the classification module for recognizing brain tasks. We
also compare our proposal with other methods; it
performs better, with an average classification accuracy of 99.9%.
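The wavelet filtering step can be sketched with one Haar level and soft-thresholding of the detail coefficients (the paper does not specify its wavelet family or thresholds; Haar and the signal below are stand-ins):

```python
# Sketch of wavelet-style denoising: one Haar level, soft-threshold the
# detail coefficients, then invert the transform.
import math

def haar_denoise(x, thresh):
    """Denoise an even-length signal via thresholded Haar coefficients."""
    r2 = math.sqrt(2)
    a = [(x[i] + x[i + 1]) / r2 for i in range(0, len(x), 2)]
    d = [(x[i] - x[i + 1]) / r2 for i in range(0, len(x), 2)]
    # soft-threshold the details: shrink toward zero
    d = [math.copysign(max(abs(c) - thresh, 0.0), c) for c in d]
    out = []
    for ai, di in zip(a, d):
        out += [(ai + di) / r2, (ai - di) / r2]
    return out

noisy = [1.0, 1.4, 1.2, 0.8, 1.1, 0.9, 5.0, 1.0]  # spike at index 6
smooth = haar_denoise(noisy, thresh=1.0)
```

Small fluctuations are flattened while the sharp artifact is attenuated; the cleaned signal is what would then feed feature extraction and the neural-network classifier.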
Abstract: This paper proposes an optimization of neural network
weights and topology using genetic evolution and the
backpropagation training algorithm. The proposed crossover and
mutation operators aim to adapt the network architectures and
weights during the evolution process. Through a specific inheritance
procedure, the weights are transmitted from the parents to their
offspring, which allows re-exploitation of the already trained
networks and hence accelerates the global convergence of the
algorithm. In the preprocessing phase, a new feature extraction
method is proposed based on Legendre moments with the maximum
entropy principle (MEP) as a selection criterion. This allows a global
reduction of the search space in the design of the networks. The proposed
method has been applied and tested on the well known MNIST
database of handwritten digits.
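The weight-inheritance idea can be sketched as uniform crossover over flat weight vectors plus small Gaussian mutations, so offspring start from already-trained parent weights instead of random ones (operator names and rates below are illustrative, not the paper's operators):

```python
# Sketch of GA weight inheritance: uniform crossover of parent weight
# vectors, then small Gaussian mutations.
import random

def crossover(parent_a, parent_b, rng):
    """Uniform crossover: each gene comes from either parent."""
    return [a if rng.random() < 0.5 else b
            for a, b in zip(parent_a, parent_b)]

def mutate(weights, rng, rate=0.1, sigma=0.05):
    """Perturb each weight with probability `rate`."""
    return [w + rng.gauss(0.0, sigma) if rng.random() < rate else w
            for w in weights]

rng = random.Random(0)
pa, pb = [1.0, 1.0, 1.0, 1.0], [-1.0, -1.0, -1.0, -1.0]
child = mutate(crossover(pa, pb, rng), rng)
```

Each offspring weight stays close to one of its parents' trained values, which is what lets backpropagation resume from a good starting point rather than from scratch.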
Abstract: In this paper, a new face recognition method based on
PCA (Principal Component Analysis), LDA (Linear Discriminant
Analysis) and neural networks is proposed. The method consists of
four steps: i) preprocessing, ii) dimension reduction using PCA, iii)
feature extraction using LDA and iv) classification using a neural
network. The combination of PCA and LDA is used to improve the
capability of LDA when only a few image samples are available, and
a neural classifier is used to reduce the number of misclassifications
caused by non-linearly separable classes. The proposed method was tested on
the Yale face database. Experimental results on this database
demonstrated the effectiveness of the proposed method for face
recognition, with fewer misclassifications than previous
methods.
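The dimension-reduction step alone can be sketched as projection onto the first principal component, computed by power iteration on the covariance matrix (the LDA and neural-network stages are omitted; the data are toy values):

```python
# Sketch of PCA dimension reduction: leading eigenvector of the data
# covariance via power iteration, then projection onto it.

def first_pc(X, iters=200):
    """Leading eigenvector of the covariance of the rows of X."""
    n, d = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(d)]
    C = [[sum((X[i][a] - means[a]) * (X[i][b] - means[b])
              for i in range(n)) / n
          for b in range(d)] for a in range(d)]
    v = [1.0] * d
    for _ in range(iters):
        w = [sum(C[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return v, means

# toy data: variance lies almost entirely along the first axis
X = [[0.0, 0.1], [1.0, 0.0], [2.0, 0.1], [3.0, 0.0], [4.0, 0.1]]
pc, mu = first_pc(X)
proj = [sum((row[j] - mu[j]) * pc[j] for j in range(2)) for row in X]
```

The recovered component aligns with the high-variance axis, so each face vector collapses to a few such projections before LDA extracts the discriminative directions.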
Abstract: Biclustering is a very useful data mining technique for
identifying patterns in which different genes are correlated over a
subset of conditions in gene expression analysis. Association rule
mining is an efficient way to achieve biclustering, as in the
BIMODULE algorithm, but it is sensitive to the values of its
input parameters and to the discretization procedure used in the
preprocessing step; moreover, when noise is present, classical association
rule miners discover multiple small fragments of the true bicluster
but miss the true bicluster itself. This paper formally presents a
generalized noise-tolerant bicluster model, termed μBicluster. An
iterative algorithm based on the proposed model, termed BIDENS,
is introduced that can discover a set of k possibly overlapping
biclusters simultaneously. Our model uses a more flexible method to
partition the dimensions to preserve meaningful and significant
biclusters. The proposed algorithm can discover biclusters that
are hard for BIMODULE to discover. An experimental study on yeast and
human gene expression data and several artificial datasets shows that
our algorithm offers substantial improvements over several
previously proposed biclustering algorithms.
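As background, a standard bicluster-quality measure, the Cheng-Church mean squared residue (lower is better), can be sketched as follows; this is the classical score, not the paper's μBicluster definition:

```python
# Sketch: mean squared residue (MSR) of a candidate bicluster.
# An additive expression pattern has MSR 0; noise raises it.

def msr(M):
    """Mean squared residue of a submatrix M (list of equal-length rows)."""
    n, m = len(M), len(M[0])
    row = [sum(r) / m for r in M]
    col = [sum(M[i][j] for i in range(n)) / n for j in range(m)]
    mean = sum(row) / n
    return sum((M[i][j] - row[i] - col[j] + mean) ** 2
               for i in range(n) for j in range(m)) / (n * m)

coherent = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]   # additive pattern
noisy = [[1, 5, 2], [4, 1, 3], [2, 2, 5]]
```

A noise-tolerant model effectively relaxes this kind of score so that a bicluster with a few corrupted cells is still reported whole instead of fragmenting.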
Abstract: Clustering is the process of subdividing an input data set into a desired number of subgroups so that members of the same subgroup are similar and members of different subgroups have diverse properties. Many heuristic algorithms have been applied to the clustering problem, which is known to be NP-hard. Genetic algorithms have been used in a wide variety of fields to perform clustering; however, the technique normally has a long running time relative to the input set size. This paper proposes an efficient genetic algorithm for clustering very large data sets, especially image data sets. The genetic algorithm uses highly time-efficient techniques along with preprocessing of the input data set. We test our algorithm on both artificial and real image data sets, both of large size. The experimental results show that our algorithm outperforms the k-means algorithm in running time as well as clustering quality.
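Two GA ingredients for clustering, a fitness function (negative within-cluster SSE over a label assignment) and a label mutation operator, can be sketched on 1-D points; the full evolutionary loop and the paper's specific operators are omitted:

```python
# Sketch of GA clustering components: fitness of a label assignment and
# random label mutation (1-D points for brevity).
import random

def fitness(points, labels, k):
    """Negative sum of squared distances to each cluster's centroid."""
    sse = 0.0
    for c in range(k):
        members = [p for p, l in zip(points, labels) if l == c]
        if not members:
            continue
        cx = sum(members) / len(members)
        sse += sum((p - cx) ** 2 for p in members)
    return -sse

def mutate_labels(labels, k, rng, rate=0.1):
    """Reassign each point to a random cluster with probability `rate`."""
    return [rng.randrange(k) if rng.random() < rate else l for l in labels]

points = [0.0, 0.2, 0.1, 9.8, 10.0, 10.2]
good = [0, 0, 0, 1, 1, 1]
bad = [0, 1, 0, 1, 0, 1]
```

Selection keeps assignments like `good` (tight clusters, fitness near 0) over `bad` (mixed clusters, strongly negative fitness), and mutation keeps exploring; the paper's contribution is making this search time-efficient on large image data.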
Abstract: Character segmentation is an important preprocessing step for text recognition. In degraded documents, the presence of touching characters drastically decreases the recognition rate of any optical character recognition (OCR) system. In this paper, a study of touching Gurmukhi characters is carried out, and after careful analysis these characters have been divided into various categories. Structural properties of Gurmukhi characters are used to define the categories. New algorithms have been proposed to segment touching characters in the middle zone. These algorithms have shown a reasonable improvement in segmenting touching characters in degraded Gurmukhi script. The algorithms proposed in this paper are applicable only to machine-printed text.
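A standard baseline for segmenting touching machine-printed characters is a projection-profile cut: the interior column with the fewest foreground pixels is a candidate split point (the paper's structural, Gurmukhi-specific rules are not reproduced here):

```python
# Sketch of a vertical projection-profile cut between touching glyphs.

def cut_column(img):
    """Index of the interior column with the fewest foreground pixels."""
    w = len(img[0])
    profile = [sum(row[x] for row in img) for x in range(w)]
    return min(range(1, w - 1), key=lambda x: profile[x])

# two touching 3-column blobs joined by a thin bridge at column 3
img = [[1, 1, 1, 0, 1, 1, 1],
       [1, 1, 1, 1, 1, 1, 1],
       [1, 1, 1, 0, 1, 1, 1]]
```

The profile dips at the thin bridge joining the two blobs, so the cut lands there; structural rules like the paper's refine this when the simple minimum fails.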
Abstract: Microarrays have become effective, broadly used tools in biological and medical research, addressing a wide range of problems including the classification of disease subtypes and tumors. Many statistical methods are available for analyzing and systematizing these complex data into meaningful information, and one of the main goals in analyzing gene expression data is the detection of samples or genes with similar expression patterns. In this paper, we assess and compare the performance of several clustering methods under different data preprocessing strategies, including normalization and noise removal. We also evaluate each clustering method with validation measures on both simulated data and real gene expression data. We conclude that the clustering methods commonly used in microarray data analysis are affected by the normalization strategy and by the degree of noise in the datasets.
Abstract: Image registration plays an important role in the
diagnosis of dental pathologies such as dental caries, alveolar bone
loss and periapical lesions. This paper presents a new wavelet-based
algorithm for registering noisy, poor-contrast dental x-rays.
The proposed algorithm has two stages. The first stage is a preprocessing
stage that removes noise from the x-ray images using a Gaussian
filter. The second stage is a geometric transformation stage:
the proposed work uses two levels of affine transformation, and wavelet
coefficients are correlated instead of gray values. The algorithm has been
applied to a number of pre- and post-RCT (root canal treatment)
periapical radiographs. Root Mean Square Error (RMSE) and
correlation coefficients (CC) are used for quantitative evaluation.
The proposed technique outperforms a conventional multiresolution-strategy-based
image registration technique and manual registration.
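The two evaluation metrics named above, RMSE and the correlation coefficient, can be sketched over flattened image intensities (the sample vectors are illustrative):

```python
# Sketch of the registration-quality metrics: RMSE and Pearson
# correlation between a reference image and an aligned image.
import math

def rmse(a, b):
    """Root mean square error between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))

def corr(a, b):
    """Pearson correlation coefficient between two vectors."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    num = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    den = math.sqrt(sum((x - ma) ** 2 for x in a)
                    * sum((y - mb) ** 2 for y in b))
    return num / den

ref = [10.0, 20.0, 30.0, 40.0]
aligned = [11.0, 19.0, 31.0, 39.0]
```

A good registration drives RMSE toward 0 and CC toward 1; comparing these two numbers across methods is how the paper quantifies its improvement.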
Abstract: In this paper a new approach to face recognition is presented that achieves a double dimension reduction, making the system computationally efficient while yielding better recognition results. In pattern recognition techniques, the discriminative information of an image increases with resolution up to a certain extent; consequently, face recognition results improve with increasing face image resolution and level off beyond a certain resolution level. In the proposed model of face recognition, an image decimation algorithm is first applied to the face image to reduce its dimension to the resolution level that provides the best recognition results. The Discrete Cosine Transform (DCT) is then applied to the face image for its computational speed and feature extraction potential, and a subset of DCT coefficients from low to mid frequencies that represents the face adequately and provides the best recognition results is retained. A trade-off between the decimation factor, the number of DCT coefficients retained, and the recognition rate with minimum computation is obtained. Preprocessing of the image is carried out to increase robustness against variations in pose and illumination level. This new model has been tested on several databases, including the ORL database, the Yale database and a color database, and has performed much better than other techniques. The significance of the model is twofold: (1) dimension reduction to an effective and suitable face image resolution, and (2) retention of the appropriate DCT coefficients to achieve the best recognition results under varying image poses, intensity and illumination levels.
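The DCT coefficient-selection idea can be illustrated in 1-D with a naive orthonormal DCT-II: for a smooth signal, a small leading (low-frequency) subset carries almost all of the energy (the signal and subset size below are illustrative):

```python
# Sketch of DCT energy compaction: compute a naive orthonormal DCT-II
# and compare the energy in the leading coefficients to the total.
import math

def dct2(x):
    """Naive orthonormal DCT-II of a 1-D signal."""
    N = len(x)
    out = []
    for k in range(N):
        s = sum(x[n] * math.cos(math.pi * (n + 0.5) * k / N)
                for n in range(N))
        scale = math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)
        out.append(scale * s)
    return out

signal = [math.cos(math.pi * n / 16) for n in range(16)]  # smooth signal
coeffs = dct2(signal)
total = sum(c * c for c in coeffs)
low = sum(c * c for c in coeffs[:4])   # keep only 4 of 16 coefficients
```

Over 95% of the energy survives in a quarter of the coefficients, which is why retaining a low-to-mid-frequency DCT subset can represent a face adequately while cutting computation.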
Abstract: Several combinations of the preprocessing algorithms,
feature selection techniques and classifiers can be applied to the data
classification tasks. This study introduces a new, accurate classifier
consisting of four components: Signal-to-Noise as a feature selection
technique, a support vector machine, a Bayesian neural network,
and AdaBoost as an ensemble algorithm.
To verify the effectiveness of the proposed classifier, seven well-known
classifiers are applied to four datasets. The experiments show
that the suggested classifier improves the classification rates on
all datasets.
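The Signal-to-Noise feature-selection component can be sketched with the classic Golub-style score, the absolute mean difference between classes over the sum of their standard deviations (the data below are illustrative):

```python
# Sketch of Signal-to-Noise feature ranking for two-class data:
# |mean_0 - mean_1| / (std_0 + std_1), higher = more discriminative.
import statistics

def snr_score(values, labels):
    """Golub-style signal-to-noise score of one feature."""
    a = [v for v, c in zip(values, labels) if c == 0]
    b = [v for v, c in zip(values, labels) if c == 1]
    return abs(statistics.fmean(a) - statistics.fmean(b)) / (
        statistics.pstdev(a) + statistics.pstdev(b))

y = [0, 0, 0, 1, 1, 1]
good = [1.0, 1.2, 0.8, 5.0, 5.3, 4.7]   # well separated between classes
bad = [1.0, 5.0, 3.0, 1.1, 5.2, 2.9]    # overlapping classes
```

Features are ranked by this score and only the top ones are passed to the downstream classifiers, reducing dimensionality before the ensemble stage.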
Abstract: In this paper we present an algorithm that allows
object tracking close to real time in Full HD videos.
The frame rate (FR) of a video stream is considered to be between
5 and 30 frames per second, so real-time track building is
achieved if the algorithm can process 5 or more frames per second. The
principal idea is to use fast algorithms in the preprocessing step to
obtain key points and then track them. The cost of matching
points during assignment depends strongly on the number of points;
because of this, we limit the number of points to the
most informative ones.
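The point-limiting step can be sketched as keeping only the k keypoints with the highest detector response, since assignment cost grows quickly with the number of points (the tuple layout and data are illustrative):

```python
# Sketch: keep only the k strongest keypoints before matching.

def top_k_points(points, k):
    """points: list of (x, y, response); keep the k highest-response."""
    return sorted(points, key=lambda p: p[2], reverse=True)[:k]

pts = [(10, 20, 0.9), (15, 22, 0.2), (40, 5, 0.7), (8, 30, 0.4)]
kept = top_k_points(pts, 2)
```

Halving the point count roughly quarters the size of a pairwise matching problem, which is what keeps per-frame processing within the real-time budget.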
Abstract: This paper presents a method for the detection of the optic disc (OD) in the retina which takes advantage of powerful preprocessing techniques: contrast enhancement, the Gabor wavelet transform for vessel segmentation, mathematical morphology, and the Earth Mover's distance (EMD) for the matching process. The OD detection algorithm is based on matching the expected directional pattern of the retinal blood vessels. The vessel segmentation method produces segmentations by classifying each image pixel as vessel or non-vessel based on the pixel's feature vector. Feature vectors are composed of the pixel's intensity and 2-D Gabor wavelet transform responses taken at multiple scales. A simple matched filter is proposed to roughly match the direction of the vessels in the OD vicinity using the EMD. The minimum distance provides an estimate of the OD center coordinates. The method's performance is evaluated on the publicly available DRIVE and STARE databases. On the DRIVE database the OD center was detected correctly in all 40 images (100%), and on the STARE database the OD was detected correctly in 76 of the 81 images, even in rather difficult pathological situations.
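For 1-D histograms of equal total mass, the Earth Mover's distance used in the matching step reduces to a cumulative-difference sum, which can be sketched directly (the histograms below are toy directional bins, not the paper's templates):

```python
# Sketch of 1-D Earth Mover's distance between two equal-mass
# histograms: sum of absolute cumulative differences.

def emd_1d(p, q):
    """EMD between two 1-D histograms with equal total mass."""
    total, cum = 0.0, 0.0
    for a, b in zip(p, q):
        cum += a - b        # running surplus of mass to move right
        total += abs(cum)   # cost of moving it one bin
    return total

d_near = emd_1d([0, 1, 0], [0, 0, 1])  # mass moved one bin
d_far = emd_1d([1, 0, 0], [0, 0, 1])   # mass moved two bins
```

Unlike a bin-by-bin difference, EMD grows with how far the mass must move, so a candidate location whose vessel-direction histogram is close to the expected pattern yields the minimum distance, estimating the OD center.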