Abstract: With more and more mobile health applications
appearing as smartphones grow in popularity, the possibility arises
that these data can be used to improve the medical diagnostic process,
as well as the overall quality of healthcare, while at the same time
lowering costs. However, to date there have been no reports of a
successful combination of patient-generated data from smartphones
with data from clinical routine. In this paper we describe how these
two types of data can be combined in a secure way without
modification to hospital information systems, and how they can
together be used in a medical expert system for automatic nutritional
classification and triage.
Abstract: This paper presents an efficient fusion algorithm for
iris images that generates stable features for recognition in unconstrained
environments. Recently, iris recognition systems have focused on real
scenarios in daily life that do not require the subject's cooperation. Given
large variations in the environment, the objective of this paper is to
combine information from multiple images of the same iris. The
result of image fusion is a new image that is more stable for further
iris recognition than each original noisy iris image. A wavelet-based
approach for multi-resolution image fusion is applied in the fusion
process. Detection of the iris image is based on the AdaBoost
algorithm, and a local binary pattern (LBP) histogram is then
applied to texture classification with a weighting scheme.
Experiments showed that the features generated by the proposed
fusion algorithm can improve the performance of a verification system
based on iris recognition.
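The LBP texture step mentioned above can be sketched minimally: each pixel is coded by comparing it with its 8 neighbours, and the codes are accumulated into a 256-bin histogram that serves as the texture descriptor. The tiny image, single radius, and binning below are illustrative assumptions, not the paper's exact configuration.

```python
def lbp_histogram(img):
    """256-bin histogram of basic 8-neighbour LBP codes for a 2-D
    grayscale image given as a list of lists."""
    h, w = len(img), len(img[0])
    hist = [0] * 256
    # Offsets of the 8 neighbours, ordered clockwise from the top-left.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            center = img[y][x]
            code = 0
            for bit, (dy, dx) in enumerate(offsets):
                if img[y + dy][x + dx] >= center:
                    code |= 1 << bit
            hist[code] += 1
    return hist

# Tiny example: a 3x3 image has a single interior pixel, whose four
# upper/right neighbours are brighter, giving code 0b00001111 (15).
example = [[9, 9, 9],
           [1, 5, 9],
           [1, 1, 1]]
print(lbp_histogram(example)[15])  # → 1
```

In a full system one such histogram would be computed per image block and the blocks compared with a weighting scheme, as the abstract describes.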
Abstract: The 3D body movement signals captured during
human-human conversation include clues not only to the content of
people’s communication but also to their culture and personality.
This paper is concerned with automatic extraction of this information
from body movement signals. For the purpose of this research, we
collected a novel corpus from 27 subjects and arranged them into
groups according to their culture. We then arranged each group into
pairs, and each pair conversed about different topics.
A state-of-the-art recognition system is applied to the problems of
person, culture, and topic recognition. We borrowed modeling,
classification, and normalization techniques from speech recognition.
We used Gaussian Mixture Modeling (GMM) as the main technique
for building our three systems, obtaining 77.78%, 55.47%, and
39.06% accuracy from the person, culture, and topic recognition systems
respectively. In addition, we combined the above GMM systems with
Support Vector Machines (SVM) to obtain 85.42%, 62.50%, and
40.63% accuracy for person, culture, and topic recognition
respectively.
Although direct comparison among these three recognition
systems is difficult, it seems that our person recognition system
performs best for both GMM and GMM-SVM, suggesting that inter-subject
differences (i.e., subjects' personality traits) are a major
source of variation. When removing these traits from culture and
topic recognition systems using the Nuisance Attribute Projection
(NAP) and the Intersession Variability Compensation (ISVC)
techniques, we obtained 73.44% and 46.09% accuracy from culture
and topic recognition systems respectively.
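The GMM classification idea used above can be sketched with a single diagonal Gaussian per class (a 1-component GMM): fit each class by maximum likelihood, then pick the class with the highest log-likelihood at test time. Real systems train multi-component mixtures with EM; the subject names and feature values below are toy assumptions.

```python
import math

def fit_gaussian(samples):
    """Diagonal-covariance Gaussian fit by maximum likelihood."""
    n, d = len(samples), len(samples[0])
    mean = [sum(s[j] for s in samples) / n for j in range(d)]
    var = [sum((s[j] - mean[j]) ** 2 for s in samples) / n + 1e-6
           for j in range(d)]                 # small floor avoids zero variance
    return mean, var

def log_likelihood(x, mean, var):
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))

def classify(x, models):
    """Assign x to the class whose model gives the highest likelihood."""
    return max(models, key=lambda c: log_likelihood(x, *models[c]))

# Toy "body movement" feature vectors for two subjects.
train = {"subject_A": [[0.1, 0.2], [0.2, 0.1], [0.15, 0.18]],
         "subject_B": [[0.9, 1.0], [1.1, 0.8], [1.0, 0.9]]}
models = {c: fit_gaussian(v) for c, v in train.items()}
print(classify([0.12, 0.15], models))  # falls near subject_A's cluster
```

The GMM-SVM combination and NAP/ISVC compensation in the abstract operate on top of such per-class models and are omitted here.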
Abstract: The study of the electrical signals produced by the neural
activity of the human brain is called electroencephalography. In this
paper, we propose an automatic and efficient EEG signal
classification approach. The proposed approach is used to classify the
EEG signal into two classes: epileptic seizure or not. In the proposed
approach, we start with extracting the features by applying Discrete
Wavelet Transform (DWT) in order to decompose the EEG signals
into sub-bands. These features, extracted from details and
approximation coefficients of DWT sub-bands, are used as input to
Principal Component Analysis (PCA). The classification is based on
reducing the feature dimension using PCA and deriving the support
vectors using a Support Vector Machine (SVM). The experiments are
performed on a real, standard dataset, and a very high level of
classification accuracy is obtained.
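The feature-extraction step described above can be sketched with a one-level Haar DWT, which splits a 1-D EEG signal into approximation and detail coefficients; summary statistics of each sub-band then serve as features. The paper uses multi-level DWT followed by PCA and SVM; those later stages are assumed to be handled by standard libraries, and the signal values below are illustrative.

```python
def haar_dwt(signal):
    """One level of the orthonormal Haar wavelet transform."""
    s = 2 ** 0.5
    approx = [(signal[i] + signal[i + 1]) / s
              for i in range(0, len(signal) - 1, 2)]
    detail = [(signal[i] - signal[i + 1]) / s
              for i in range(0, len(signal) - 1, 2)]
    return approx, detail

def band_features(coeffs):
    """Mean and variance of one sub-band, used as features."""
    n = len(coeffs)
    mean = sum(coeffs) / n
    var = sum((c - mean) ** 2 for c in coeffs) / n
    return [mean, var]

eeg = [1.0, 3.0, 2.0, 2.0, 5.0, 1.0, 0.0, 4.0]   # toy EEG samples
a, d = haar_dwt(eeg)
features = band_features(a) + band_features(d)    # input to PCA/SVM
```

Because the Haar transform is orthonormal, the energy of the sub-band coefficients equals the energy of the original signal, which is a quick correctness check.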
Abstract: In this paper we consider the rule reduct generation
problem. Rule Reduct Generation (RG) and Modified Rule
Generation (MRG) algorithms, which are used to solve this problem,
are well known. As an alternative to these algorithms, we develop the
Pruning Rule Generation (PRG) algorithm. We compare the PRG algorithm
with RG and MRG.
Abstract: Member States shall establish zones and
agglomerations throughout their territory to assess and manage air
quality in order to comply with European directives.
In Italy decree 155/2010, transposing Directive 2008/50/EC on
ambient air quality and cleaner air for Europe, merged into a single
act the previous provisions on ambient air quality assessment and
management, including those resulting from the implementation of
Directive 2004/107/EC relating to arsenic, cadmium, nickel, mercury
and polycyclic aromatic hydrocarbons in ambient air.
Decree 155/2010 introduced stricter rules for identifying zones on
the basis of the characteristics of the territory instead of on
pollution levels, as was done in the past. The implementation of such
new criteria has reduced the great variability of the previous zoning,
leading to a significant reduction of the total number of zones and to
a complete and uniform ambient air quality assessment and
management throughout the country.
The present document is related to the definition of the new zones in
Italy according to Decree 155/2010. In particular, the paper contains
a description and analysis of the outcome of zoning and
classification.
Abstract: In Brazil, neonatal mortality rate is considered
incompatible with the country development conditions, and has been
a Public Health concern. Reduction in infant mortality rates has also
been part of the Millennium Development Goals, a commitment
made by member countries of the United Nations
(UN), including Brazil. Fetal mortality rate is considered a highly
sensitive indicator of health care quality. Suitable actions, such as
good quality and access to health services may contribute positively
towards reduction in these fetal and neonatal rates. With appropriate
antenatal follow-up and health care during gestation and delivery,
some death causes could be reduced or even prevented by means of
early diagnosis and intervention, as well as changes in risk factors
and interventions. Objectives: To study the quality of maternal and
infant health care based on fetal and neonatal mortality, as well as the
possible actions to prevent those deaths in Botucatu (Brazil).
Methods: Classification of prevention according to the International
Classification of Diseases and the modified Wigglesworth´s
classification. In order to evaluate adequacy, indicators of quality of
antenatal and delivery care were established by the authors. Results:
Considering fetal deaths, 56.7% of them occurred before delivery,
which reveals possible shortcomings in antenatal care, and 38.2% of
them were a result of intra-labor changes, which could be prevented
or reduced by adequate obstetric management. These findings were
different from those in the group of early neonatal deaths which were
also studied. Adequacy of health services showed that antenatal and
childbirth care was appropriate for 24% and 33.3% of pregnant
women, respectively, which corroborates the results of prevention.
These results revealed that shortcomings in obstetric and antenatal
care could be the causes of deaths in the study. Early and late
neonatal deaths have similar characteristics: 76% could be prevented
or reduced mainly by adequate newborn care (52.9%) and adequate
health care for gestational women (11.7%). When adequacy of care
was evaluated, childbirth and newborn care was adequate in 25.8%
and antenatal care was adequate in 16.1%. In conclusion, a direct
relationship was found between the adequacy and quality of care
rendered to pregnant women and newborns and fetal and infant
mortality. Moreover, our findings highlight that deaths could be
prevented by adequate obstetric and neonatal management.
Abstract: Text mining techniques are generally applied for
classifying the text, finding fuzzy relations and structures in data
sets. This research draws on a wide range of text mining capabilities.
One common application is text classification and event extraction,
which encompasses deducing specific knowledge concerning incidents
referred to in texts. The main contribution of this paper is the
clarification of a concept graph generation mechanism, which is based
on a text classification and optimal fuzzy relationship extraction.
Furthermore, the work presented in this paper explains the application
of fuzzy relationship extraction and branch and bound (BB) method
to simplify the texts.
Abstract: In this study, we examine the data-loss tolerance of a Support Vector Machine (SVM) based activity recognition model and its multi-activity classification performance when data are received over a lossy wireless sensor network. Initially, the classification algorithm we use is evaluated for resilience to random data loss using 3D acceleration sensor data for sitting, lying, walking, and standing actions. The results show that the proposed classification method can recognize these activities successfully despite high data loss. Secondly, the effect of differentiated quality-of-service performance on activity recognition success is measured with activity data acquired from a multi-hop wireless sensor network, which introduces high data loss. The effect of the number of nodes on reliability and multi-activity classification success is demonstrated in a simulation environment. To the best of our knowledge, the effect of data loss in a wireless sensor network on the activity detection success rate of an SVM-based classification algorithm has not been studied before.
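The loss-tolerance experiment described above can be sketched as follows. The paper trains an SVM on 3D acceleration data; here a simple nearest-centroid classifier stands in for the SVM, random packet loss is simulated by dropping a fraction of each test window's samples, and the window is classified from the mean of the surviving samples. All data, class centres, and the loss rate are synthetic, illustrative assumptions.

```python
import random

random.seed(0)

# Toy class centres in 3-D acceleration space (stand-ins for trained models).
ACTIVITIES = {"sitting": [0.0, 0.0, 1.0], "walking": [0.3, 0.1, 0.9]}

def make_window(base, n=50):
    """Synthetic acceleration samples scattered around a class centre."""
    return [[b + random.gauss(0, 0.05) for b in base] for _ in range(n)]

def classify(window, loss_rate):
    """Drop each sample with probability loss_rate, then assign the
    window to the nearest class centre by its mean acceleration."""
    kept = [s for s in window if random.random() > loss_rate]
    if not kept:                          # every sample lost: no decision
        return None
    mean = [sum(s[j] for s in kept) / len(kept) for j in range(3)]
    return min(ACTIVITIES, key=lambda a: sum(
        (mean[j] - ACTIVITIES[a][j]) ** 2 for j in range(3)))

# Even with 60% random loss, windows are almost always classified correctly.
correct = sum(classify(make_window(v), 0.6) == k
              for k, v in ACTIVITIES.items() for _ in range(20))
print(correct, "of 40 windows correct under 60% random loss")
```

The robustness comes from the window-level feature (a mean) being insensitive to which individual samples survive, which mirrors the abstract's finding that recognition succeeds despite high data loss.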
Abstract: In this paper, we present a robust algorithm to recognize extracted text from grocery product images captured by mobile phone cameras. Recognition of such text is challenging since text in grocery product images varies in its size, orientation,
style, illumination, and can suffer from perspective distortion.
Pre-processing is performed to make the characters scale and
rotation invariant. Since text degradations cannot be appropriately
defined using well-known geometric transformations such
as translation, rotation, affine transformation, and shearing, we
use the character's entire set of black pixels as our feature vector.
Classification is performed with a minimum distance classifier
using the maximum likelihood criterion, which delivers a very
promising Character Recognition Rate (CRR) of 89%. We
achieve a considerably higher Word Recognition Rate (WRR) of
99% when using low-level linguistic knowledge about product
words during the recognition process.
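The recognition step described above can be sketched minimally: each character is represented by its raw black-pixel bitmap flattened into a feature vector, and a minimum distance classifier assigns it to the nearest class template. The 3x3 glyph templates and the noisy input below are toy assumptions, not the paper's actual data.

```python
def flatten(bitmap):
    """Flatten a 2-D binary bitmap into one feature vector."""
    return [p for row in bitmap for p in row]

def min_distance_classify(bitmap, templates):
    """Assign the bitmap to the class template at minimum distance."""
    x = flatten(bitmap)
    def dist(label):
        t = flatten(templates[label])
        return sum((a - b) ** 2 for a, b in zip(x, t))
    return min(templates, key=dist)

templates = {
    "I": [[0, 1, 0],
          [0, 1, 0],
          [0, 1, 0]],
    "L": [[1, 0, 0],
          [1, 0, 0],
          [1, 1, 1]],
}
# A noisy "L" with one flipped pixel is still closest to the L template.
noisy_L = [[1, 0, 0],
           [1, 0, 1],
           [1, 1, 1]]
print(min_distance_classify(noisy_L, templates))  # → L
```

Word-level linguistic knowledge, as the abstract notes, would then re-rank character hypotheses against a lexicon of product words.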
Abstract: Red blood cells (RBC) are the most common types of
blood cells and are the most intensively studied in cell biology. The
lack of RBCs is a condition in which the hemoglobin level
is lower than normal and is referred to as “anemia”. Abnormalities in
RBCs will affect the exchange of oxygen. This paper presents a
comparative study for various techniques for classifying the RBCs as
normal or abnormal (anemic) using WEKA, an open-source
collection of machine learning algorithms for data
mining applications. The algorithms tested are the Radial Basis Function
neural network, the Support Vector Machine, and the K-Nearest Neighbors
algorithm. Two sets of combined features were utilized for
classification of blood cell images. The first set, consisting exclusively
of geometrical features, was used to identify whether the tested blood
cell is spherical or non-spherical, while the second
set, consisting mainly of textural features, was used to recognize the
types of the spherical cells. We provide an evaluation based on
applying these classification methods to our RBC image dataset,
which was obtained from Serdang Hospital, Malaysia, and
measuring the accuracy of the test results. The best achieved
classification rates are 97%, 98%, and 79% for Support vector
machines, Radial Basis Function neural network, and K-Nearest
Neighbors algorithm respectively.
Abstract: Image spam is a kind of email spam where the spam
text is embedded in an image. It is a new spamming technique
used by spammers to send their messages to a large number of internet
users. Spam email has become a big problem in the lives of internet
users, causing time consumption and economic losses. The main
objective of this paper is to detect image spam by using the histogram
properties of an image. Though there are many techniques to
automatically detect and avoid this problem, spammers employ
new tricks to bypass those techniques; as a result, those techniques are
inefficient at detecting spam mails. In this paper we propose a
new method to detect image spam. The image features are
extracted using the RGB histogram, the HSV histogram, and the
combination of both. Based on the optimized image
feature set, classification is done using the k-Nearest Neighbor (k-NN)
algorithm. Experimental results show that our method achieves
better accuracy. From the results it is known that the combination of RGB
and HSV histograms with the k-NN algorithm gives the best accuracy in
spam detection.
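The feature pipeline described above can be sketched as follows: coarse RGB and HSV histograms are concatenated into one feature vector, and a nearest-neighbour rule labels an image as spam or ham. The images here are small lists of (r, g, b) pixels with toy values, and the bin count and k = 1 are illustrative assumptions.

```python
import colorsys

BINS = 4  # coarse binning per channel (illustrative choice)

def histogram(values):
    """Normalised BINS-bin histogram of channel values in [0, 1]."""
    hist = [0] * BINS
    for v in values:
        hist[min(int(v * BINS), BINS - 1)] += 1
    return [c / len(values) for c in hist]

def features(pixels):
    """Combined RGB + HSV histogram feature vector."""
    rgb = [c for ch in range(3)
           for c in histogram([p[ch] for p in pixels])]
    hsv_pixels = [colorsys.rgb_to_hsv(*p) for p in pixels]
    hsv = [c for ch in range(3)
           for c in histogram([p[ch] for p in hsv_pixels])]
    return rgb + hsv

def knn_label(pixels, train):
    """1-nearest-neighbour label over (pixels, label) training pairs."""
    x = features(pixels)
    def dist(item):
        return sum((a - b) ** 2 for a, b in zip(x, features(item[0])))
    return min(train, key=dist)[1]

spam_img = [(0.9, 0.9, 0.9), (0.95, 0.9, 0.85), (0.9, 0.85, 0.9)]
ham_img = [(0.1, 0.4, 0.2), (0.2, 0.5, 0.3), (0.15, 0.45, 0.25)]
train = [(spam_img, "spam"), (ham_img, "ham")]
test = [(0.92, 0.88, 0.9), (0.9, 0.9, 0.88), (0.93, 0.91, 0.89)]
print(knn_label(test, train))  # bright, text-like image → "spam"
```

A real system would use many training images and a larger k; the structure of the feature vector is the point here.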
Abstract: The growth over centuries in the volume of text data, such
as books and articles in libraries, has made it necessary to establish
effective mechanisms to locate them. Early techniques such as
abstraction, indexing, and the use of classification categories have
marked the birth of a new field of research called "Information
Retrieval". Information Retrieval (IR) can be defined as the task of
defining models and systems whose purpose is to facilitate access to
a set of documents in electronic form (a corpus), allowing a user to
find the documents relevant to him, that is to say, the contents that
match his information needs.
Most information retrieval models use a specific data
structure to index a corpus, called an "inverted file" or "inverted
index".
This inverted file collects information on all terms across the corpus
documents, specifying the identifiers of the documents that contain the
term in question, the frequency of each term in the documents of the
corpus, the positions of the occurrences of the word, and so on.
In this paper we use an object-oriented database (db4o) instead of
the inverted file; that is to say, instead of searching for a term in the
inverted file, we search for it in the db4o database.
The purpose of this work is a comparative study to see whether
object-oriented databases can compete with the inverted index
in terms of access speed and resource consumption on a large
volume of data.
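The inverted file described above can be sketched as follows: for each term it records the documents containing it and the positions of its occurrences (from which the term frequency follows). The toy corpus is illustrative; a db4o-based variant would persist equivalent objects in the object database instead of this in-memory structure.

```python
from collections import defaultdict

def build_inverted_index(corpus):
    """term -> doc_id -> list of positions (frequency = len(positions))."""
    index = defaultdict(lambda: defaultdict(list))
    for doc_id, text in corpus.items():
        for pos, term in enumerate(text.lower().split()):
            index[term][doc_id].append(pos)
    return index

corpus = {
    1: "information retrieval with an inverted file",
    2: "the inverted file maps terms to documents",
}
index = build_inverted_index(corpus)
postings = index["inverted"]
print(sorted(postings))    # identifiers of documents containing the term
print(len(postings[2]))    # term frequency in document 2
```

Querying is then a dictionary lookup per term; the comparative question in the abstract is whether an object database can match this lookup in speed and resource usage.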
Abstract: Many of the ever-growing elderly population require
exercise, such as running, for health management. One important
element of a runner’s training is the choice of shoes for exercise; shoes
are important because they provide the interface between the feet and
the road. When we purchase shoes, we may instinctively choose a pair
after trying on many different pairs of shoes. Selecting the shoes
instinctively may work, but it does not guarantee a suitable fit for
running activities. Therefore, if we could select suitable shoes for each
runner from the viewpoint of brain activities, it would be helpful for
validating shoe selection. In this paper, we describe how brain
activities show different characteristics during a particular task,
corresponding to different properties of shoes. Using five subjects, we
performed a verification experiment with weight, softness, and
flexibility as shoe properties. So that differences in shoe properties
would affect the brain, subjects ran for 10 min. Before and after
running, subjects conducted a paced auditory serial addition task
(PASAT) as the particular task; and the subjects’ brain activities
during the PASAT are evaluated based on oxyhemoglobin and
deoxyhemoglobin relative concentration changes, measured by
near-infrared spectroscopy (NIRS). When the brain works actively,
oxyhemoglobin and deoxyhemoglobin concentrations change
drastically; therefore, we calculate the maximum values of the
concentration changes. To normalize the relative concentration changes
after running, the maximum value after running is divided by the
maximum value before running to obtain the evaluation parameters.
The classification of the groups of
shoes is expressed on a self-organizing map (SOM). As a result,
deoxyhemoglobin can make clusters for two of the three types of
shoes.
Abstract: Tumor is an uncontrolled growth of tissues in any part
of the body. Tumors are of different types and they have different
characteristics and treatments. A brain tumor is inherently serious and
life-threatening because it grows in the limited space of the
intracranial cavity (the space formed inside the skull). Locating the
tumor within an MR (magnetic resonance) image of the brain is an
integral part of the treatment of a brain tumor. This segmentation task requires
classification of each voxel as either tumor or non-tumor, based on
the description of the voxel under consideration. Many ongoing studies
in the medical field use Markov Random Fields (MRF) for
segmentation of MR images. Even though the segmentation quality is
good, computing the probabilities and estimating the parameters is
difficult. To overcome these issues, a Conditional
Random Field (CRF) is used in this paper for segmentation, along
with modified artificial bee colony optimization and the modified
fuzzy possibilistic c-means (MFPCM) algorithm. This work mainly
focuses on reducing the computational complexity found in
existing methods and aims at higher accuracy. The
efficiency of this work is evaluated using the parameters such as
region non-uniformity, correlation and computation time. The
experimental results are compared with the existing methods such as
MRF with improved Genetic Algorithm (GA) and MRF-Artificial
Bee Colony (MRF-ABC) algorithm.
Abstract: Mammography has been one of the most reliable
methods for early detection of breast cancer. Different
lesions are characteristic of breast cancer, such as
microcalcifications, masses, architectural distortions, and bilateral
asymmetry. One of the major challenges of analysing digital
mammograms is how to extract efficient features for accurate
cancer classification. In this paper we propose a hybrid feature
extraction method to detect and classify all four signs of breast
cancer. The proposed method is based on multiscale surrounding
region dependence method, Gabor filters, multi fractal analysis,
directional and morphological analysis. The extracted features are
input to a self-adaptive resource allocation network (SRAN) classifier
for classification. The validity of our approach is extensively
demonstrated using the two benchmark data sets of the Mammographic
Image Analysis Society (MIAS) and the Digital Database for Screening
Mammography (DDSM), and the results are promising.
Abstract: Skin detection is an important task for computer
vision systems; a good skin detection method is key to a successful
overall system.
Colour is a good descriptor for image segmentation and
classification, and it allows skin colour to be detected in images.
Lighting changes and objects whose colour is similar to skin
colour make skin detection difficult.
In this paper, we propose a method using the YCbCr colour space
for skin detection and elimination of lighting effects; we then use
texture information to eliminate the false regions detected by the
YCbCr skin model.
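The YCbCr skin model described above can be sketched as follows: RGB pixels are converted to YCbCr (ITU-R BT.601 coefficients) and labelled skin when Cb and Cr fall inside a skin range. The exact thresholds below (77-127 for Cb, 133-173 for Cr) are a commonly cited choice assumed here, not necessarily the paper's values, and the texture-based false-region removal is omitted.

```python
def rgb_to_ycbcr(r, g, b):
    """ITU-R BT.601 conversion for 8-bit RGB values."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr

def is_skin(r, g, b):
    """Threshold the chrominance channels; luma (Y) is ignored, which
    is what gives the model its robustness to lighting changes."""
    _, cb, cr = rgb_to_ycbcr(r, g, b)
    return 77 <= cb <= 127 and 133 <= cr <= 173

print(is_skin(220, 170, 140))  # light skin tone → True
print(is_skin(30, 90, 200))    # blue pixel → False
```

Ignoring the Y channel is exactly the "lighting effects elimination" step the abstract mentions: brightness varies with illumination while Cb/Cr stay comparatively stable.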
Abstract: Existing data mining methods cannot be applied directly to
spatial data, because spatial data require the consideration of spatial
specificities, such as spatial relationships.
This paper focuses on classification with decision trees, one of the
data mining techniques. We propose an extension of the C4.5
algorithm for spatial data, based on two different approaches:
join materialization and querying the different tables on the fly.
Similar works have been done on these two main approaches; the
first, join materialization, favors processing time at the expense of
memory space, whereas the second, querying the different tables on
the fly, favors memory space at the expense of processing time.
The modified C4.5 algorithm requires three input tables: a target
table, a neighbor table, and a spatial join index that contains the
possible spatial relationships between the objects in the target table
and those in the neighbor table. The proposed algorithms are applied
to a spatial dataset in the accidentology domain.
A comparative study of our approach with other works on
classification by spatial decision trees is detailed.
Abstract: The Molluca Collision Zone is located at the junction of
the Eurasian, Australian, Pacific, and Philippine plates. Between
the Sangihe arc, west of the collision zone, and the Halmahera arc to
its east, the collision is active and convex toward the Molluca Sea.
This research analyzes the behavior of earthquake occurrence in the
Molluca Collision Zone: the distribution of earthquakes in each
partition region, the type of distribution of earthquake occurrence in
each partition region, the mean occurrence of earthquakes in each
partition region, and the correlation between the partition regions. We
calculate the number of earthquakes using a partition method and
analyze their behavior using conventional statistical methods. In
this research, we used data on shallow earthquakes with
magnitudes ≥ 4 on the Richter scale (period 1964-2013). From the
results, we can classify the partitioned regions based on the
correlation into two classes: strong and very strong. This
classification can be used for an early warning system in disaster
management.
Abstract: Microarray technology is universally used in the study
of disease diagnosis using gene expression levels. The main
shortcoming of gene expression data is that it includes thousands of
genes and a small number of samples. Abundant methods and
techniques have been proposed for tumor classification using
microarray gene expression data. Feature or gene selection methods
can be used to mine the genes that are directly involved in
classification and to eliminate irrelevant genes. In this paper,
statistical measures like T-Statistics, Signal-to-Noise Ratio (SNR)
and F-Statistics are used to rank the genes. The ranked genes are used
for further classification. Particle Swarm Optimization (PSO)
algorithm and Shuffled Frog Leaping (SFL) algorithm are used to
find the significant genes from the top-m ranked genes. The Naïve
Bayes Classifier (NBC) is used to classify the samples based on the
significant genes. The proposed work is applied to the Lung and
Ovarian datasets. The experimental results show that the proposed
method achieves 100% accuracy on all the datasets, and the results
are compared with previous works.
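The signal-to-noise ratio (SNR) ranking mentioned above can be sketched as follows: for each gene, SNR = (mean of class 1 − mean of class 2) / (std of class 1 + std of class 2), and genes are ranked by |SNR|. The toy expression matrix is illustrative; the PSO/SFL selection and the Naive Bayes classifier that follow are omitted.

```python
def mean_std(values):
    """Mean and (population) standard deviation of a list of values."""
    n = len(values)
    m = sum(values) / n
    s = (sum((v - m) ** 2 for v in values) / n) ** 0.5
    return m, s

def snr(gene_class1, gene_class2):
    """Signal-to-noise ratio between two classes for one gene."""
    m1, s1 = mean_std(gene_class1)
    m2, s2 = mean_std(gene_class2)
    return (m1 - m2) / (s1 + s2 + 1e-9)  # small term avoids zero division

def rank_genes(class1, class2):
    """class1/class2: gene -> list of expression values per sample;
    returns gene names ranked by |SNR|, most discriminative first."""
    return sorted(class1,
                  key=lambda g: abs(snr(class1[g], class2[g])),
                  reverse=True)

# Toy expression values: geneA differs strongly between classes,
# geneB barely differs at all.
tumor = {"geneA": [5.0, 5.2, 5.1], "geneB": [2.0, 2.6, 2.3]}
normal = {"geneA": [1.0, 1.1, 0.9], "geneB": [2.1, 2.4, 2.2]}
print(rank_genes(tumor, normal))  # geneA separates the classes best
```

The top-m genes from such a ranking are then the search space for the PSO and SFL algorithms described in the abstract.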