Abstract: Over the past several years, researchers have shown great interest in assessing the mobility of elderly people as a measure of their functional status. Such an assessment is usually done with tests that require the subject to walk a certain distance, turn around, and finally sit back down. This study aims to provide an at-home monitoring system that assesses the patient’s status continuously. To this end, we propose a technique to automatically detect when a subject sits down while walking at home. We used a Doppler radar system to capture the motion of the subjects. More than 20 features were extracted from the radar signals, of which 11 were retained based on their Intraclass Correlation Coefficient (ICC > 0.75). A sequential floating forward selection wrapper was then applied to further narrow down the feature vector. Finally, five features were fed to a Linear Discriminant Analysis classifier, which achieved an accuracy of 93.75%, a precision of 95%, and a recall of 90%.
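The wrapper step in this abstract can be illustrated with a minimal sketch of sequential floating forward selection. It is not the authors' implementation: the `score` callback is a hypothetical stand-in for the cross-validated accuracy of a classifier (for example LDA) on a feature subset, and the integer toy scores below are invented purely for illustration.

```python
from typing import Callable, Dict, List, Tuple


def sffs(n_features: int, score: Callable[[List[int]], float], k: int) -> List[int]:
    """Sequential floating forward selection of k feature indices.

    `score` is an assumed callback returning the quality of a subset, e.g.
    the cross-validated accuracy of a classifier trained on those features.
    """
    selected: List[int] = []
    best: Dict[int, Tuple[float, List[int]]] = {}  # best (score, subset) per size
    while len(selected) < k:
        # Forward step: add the feature that helps the current subset most.
        candidates = [f for f in range(n_features) if f not in selected]
        add = max(candidates, key=lambda f: score(selected + [f]))
        selected = selected + [add]
        s = score(selected)
        if len(selected) not in best or s > best[len(selected)][0]:
            best[len(selected)] = (s, list(selected))
        # Floating step: drop a feature while that beats the best smaller subset.
        while len(selected) > 2:
            drop = max(selected, key=lambda f: score([g for g in selected if g != f]))
            reduced = [g for g in selected if g != drop]
            if score(reduced) > best[len(reduced)][0]:
                selected = reduced
                best[len(reduced)] = (score(reduced), list(reduced))
            else:
                break
    return best[k][1]


# Toy scoring (hypothetical): features 0 and 1 are individually useful but
# redundant with each other, so picking both is penalised.
USEFULNESS = [50, 40, 30, 10]


def toy_score(subset: List[int]) -> float:
    total = sum(USEFULNESS[f] for f in subset)
    if 0 in subset and 1 in subset:
        total -= 35
    return total


chosen = sffs(n_features=4, score=toy_score, k=3)
print(sorted(chosen))  # [0, 2, 3]
```

In a real pipeline the toy score would be replaced by a classifier evaluation on the 11 ICC-filtered features, with `k=5` to match the final feature count reported above.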
Abstract: In Pakistan, environmental degradation and the consequent deterioration of human health have accelerated rapidly in the past decade due to solid waste mismanagement. As the situation worsens with time, proper waste management practices are urgently needed, especially in semi-urban and rural areas of Pakistan. This study uses the concept of a Waste Bank, which involves a transfer station for the collection of sorted waste fractions and their delivery to target markets such as recycling industries, biogas plants, and composting facilities. The efficiency and effectiveness of a Waste Bank depend strongly on proficient sorting and collection of solid waste fractions at the household level. However, the social attitude towards such a solution in semi-urban and rural areas of Pakistan demands certain prerequisites to make it workable. Considering these factors, the objectives of this study are to: [A] Obtain reliable data about the quantity and characteristics of generated waste to define the feasibility of the business and design factors such as required storage area, retention time, and transportation frequency of the system. [B] Analyze the effects of various social factors on waste generation to inform future projections. [C] Quantify the improvement in waste sorting efficiency after an awareness campaign. We selected Gujrat city in the Central Punjab province of Pakistan because it is semi-urban and adjoined by rural areas. A total of 60 houses (20 from each of three selected colonies), belonging to different social status, were selected. Awareness sessions about waste segregation were given through brochures and individual lectures in each selected household. The waste that households had attempted to sort was then sampled in the three colored bags provided as part of the awareness campaign. Finally, refined waste sorting, weighing of the various fractions, and measurement of dry mass were performed in an environmental laboratory using standard methods.
Sorting efficiency improved from 0 to 52% as a result of the awareness campaign. Average waste generation (dry mass basis) was 460 kg/year per household and 68 kg/year per capita. Extrapolating these values to Gujrat Tehsil gives a total waste generation of 101921 tons of dry mass (DM) per year. The waste comprised (i) organic decomposables (29.2%, 29710 tons/year DM), (ii) recyclables (37.0%, 37726 tons/year DM), which included plastic, paper, metal, and glass, and (iii) trash (33.8%, 34485 tons/year DM), which consisted mainly of polythene bags, medicine packaging, diapers, and wrappers. Waste generation was higher in colonies with comparatively higher income and better living standards. In future work, data will be collected for all four seasons, and the improvement due to expanding the awareness campaign to educational institutes will be quantified. This waste management system can potentially fulfill vital sustainable development goals (e.g. clean water and sanitation), reduce the need to harvest fresh resources from the ecosystem, create business and job opportunities, and consequently solve one of the most pressing environmental issues of the country.
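As a quick consistency check on the figures reported above, the short script below recomputes the percentage shares from the three tonnage values and confirms that they add up to the stated total. The household size it derives from the per-household and per-capita figures is an inference made here, not a number given in the abstract.

```python
# Dry-mass tonnages per year for Gujrat Tehsil, as reported in the abstract.
fractions_tons = {
    "organic decomposable": 29710,
    "recyclables": 37726,
    "trash": 34485,
}

total_tons = sum(fractions_tons.values())

# Share of each fraction in percent, rounded to one decimal place.
shares = {
    name: round(100 * tons / total_tons, 1)
    for name, tons in fractions_tons.items()
}

# Implied average household size (derived here, not stated in the source):
# 460 kg/year per household divided by 68 kg/year per capita.
persons_per_household = 460 / 68

print(total_tons)   # 101921
print(shares)       # {'organic decomposable': 29.2, 'recyclables': 37.0, 'trash': 33.8}
print(round(persons_per_household, 1))  # 6.8
```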
Abstract: To study environmental contamination by cytostatic drugs in Portuguese hospitals, sampling campaigns were conducted in three hospitals in 2015 (112 samples). Platinum-containing drugs and fluorouracil were chosen because both were administered in high amounts. The detection limit was 0.01 pg/cm² for platinum and 0.1 pg/cm² for fluorouracil. The results show that spills occur mainly on the patient's chair, and the most frequently reported cause is an inadequately closed wrapper. Day hospital facilities had the largest number of contaminated samples and higher levels of contamination.
Abstract: Predictive data analysis and modeling with machine learning techniques become challenging in the presence of too many explanatory variables, or features. Too many features are known not only to slow algorithms down but also to decrease model prediction accuracy. This study uses a housing dataset with 79 quantitative and qualitative features that describe various aspects people consider when buying a new house. The Boruta algorithm, which performs feature selection using a wrapper approach built around random forest, is used in this study. The feature selection process yields 49 confirmed features, which are then used to develop predictive random forest models. The study also explores five different data partitioning ratios, and their impact on model accuracy is captured using the coefficient of determination (R-squared) and root mean square error (RMSE).
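The two evaluation metrics named in this abstract, and the idea of varying the train/test partition, can be sketched in plain Python as follows. The split ratios and the toy numbers are assumptions for illustration only; the abstract does not list the five ratios that were used.

```python
import math
from typing import List, Sequence, Tuple


def r_squared(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean square error of the predictions."""
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mse)


def partition(rows: List[int], train_fraction: float) -> Tuple[List[int], List[int]]:
    """Split rows into a training part and a test part (no shuffling here)."""
    cut = round(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]


# Hypothetical partitioning ratios, chosen only to show the loop structure.
rows = list(range(10))
for fraction in (0.5, 0.6, 0.7, 0.8, 0.9):
    train, test = partition(rows, fraction)
    print(fraction, len(train), len(test))

# Toy targets and predictions (invented values).
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [2.0, 2.0, 3.0, 3.0]
print(r_squared(y_true, y_pred), rmse(y_true, y_pred))
```

In practice a model would be fitted on each training part and both metrics computed on the corresponding test part, so that the effect of the ratio on accuracy can be compared.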
Abstract: Attribute or feature selection is one of the basic
strategies to improve the performances of data classification tasks,
and, at the same time, to reduce the complexity of classifiers,
and it is a particularly fundamental one when the number
of attributes is relatively high. Its application to unsupervised
classification is restricted to a limited number of experiments in
the literature. Evolutionary computation has already proven itself
to be a very effective choice to consistently reduce the number
of attributes towards a better classification rate and a simpler
semantic interpretation of the inferred classifiers. We present a feature
selection wrapper model composed of a multi-objective evolutionary
algorithm, the clustering method Expectation-Maximization (EM),
and the classifier C4.5 for the unsupervised classification of data
extracted from a psychological test named BASC-II (Behavior
Assessment System for Children - II ed.) with two objectives:
Maximizing the likelihood of the clustering model and maximizing
the accuracy of the obtained classifier. We present a methodology
to integrate feature selection for unsupervised classification, model
evaluation, decision making (to choose the most satisfactory model
according to an a posteriori process in a multi-objective context), and
testing. We compare the performance of the classifiers obtained by the
multi-objective evolutionary algorithms ENORA and NSGA-II, and
the best solution is then validated by the psychologists who collected
the data.
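With two objectives to maximize (clustering likelihood and classifier accuracy), candidate feature subsets are compared by Pareto dominance, and the a posteriori decision step then picks one model from the non-dominated set. The sketch below shows only that dominance filter; it is not ENORA or NSGA-II, and the objective values are hypothetical.

```python
from typing import List, Tuple

Objectives = Tuple[float, float]  # (log-likelihood, accuracy), both maximized


def dominates(a: Objectives, b: Objectives) -> bool:
    """True if a is at least as good as b everywhere and strictly better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(points: List[Objectives]) -> List[Objectives]:
    """Keep the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical evaluations of four candidate feature subsets.
candidates = [(-120.0, 0.80), (-100.0, 0.78), (-110.0, 0.85), (-130.0, 0.70)]
front = pareto_front(candidates)
print(front)  # [(-100.0, 0.78), (-110.0, 0.85)]
```

Neither remaining point is better on both objectives, which is why a separate decision-making step is needed to choose the final model.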
Abstract: Financial forecasting using machine learning techniques has received considerable attention in the last decade. In this ongoing work, we show how machine learning of graphical models can infer and visualize causal interactions between different banks in the Saudi equities market. One important discovery from such learned causal graphs is how companies influence each other and to what extent. We present a set of Gaussian graphical models together with ensemble penalized feature selection methods that combine a filter method, a wrapper method, and a regularizer. The different ensemble combinations are also compared. The best ensemble method will be used to infer the causal relationships between banks in the Saudi equities market.
Abstract: An Open Agent System platform based on the High Level
Architecture is proposed to support applications involving
heterogeneous agents. The basic idea is to develop different wrappers
for different agent systems, which are wrapped as federates to join a
federation. The platform is based on High Level Architecture and the
advantages of this open standard are naturally inherited, such as
system interoperability and reuse. Especially, the federal architecture
allows different federates to be heterogeneous so as to support the
integration of different agent systems. Furthermore, both implicit
communication and explicit communication between agents can be
supported. Then, taking the wrapper RTI_JADE as an example, its
components are discussed. Finally, the performance of RTI_JADE is
analyzed. The results show that RTI_JADE works very efficiently.
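The idea of wrapping each agent system so that it can join a federation as a federate resembles the adapter pattern. The sketch below is a loose illustration under that assumption: every class and method name is hypothetical, and it does not use or imitate the real HLA RTI or JADE interfaces.

```python
from typing import Dict, List, Protocol


class Federate(Protocol):
    """Interface the toy federation expects (a hypothetical simplification)."""

    name: str

    def receive(self, sender: str, payload: str) -> None: ...


class ToyAgentSystem:
    """Stand-in for a native agent platform with its own message API."""

    def __init__(self) -> None:
        self.inbox: List[str] = []

    def deliver_native(self, text: str) -> None:
        self.inbox.append(text)


class AgentSystemWrapper:
    """Adapts one agent system to the federate interface."""

    def __init__(self, name: str, system: ToyAgentSystem) -> None:
        self.name = name
        self._system = system

    def receive(self, sender: str, payload: str) -> None:
        # Translate the federation-level message into the native call.
        self._system.deliver_native(f"{sender}: {payload}")


class Federation:
    """Routes each message to every federate except its sender."""

    def __init__(self) -> None:
        self._members: Dict[str, Federate] = {}

    def join(self, federate: Federate) -> None:
        self._members[federate.name] = federate

    def send(self, sender: str, payload: str) -> None:
        for name, member in self._members.items():
            if name != sender:
                member.receive(sender, payload)


system_a, system_b = ToyAgentSystem(), ToyAgentSystem()
federation = Federation()
federation.join(AgentSystemWrapper("a", system_a))
federation.join(AgentSystemWrapper("b", system_b))
federation.send("a", "hello")
print(system_b.inbox)  # ['a: hello']
```

Supporting another agent platform would then mean writing one more wrapper class, without changing the federation code.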
Abstract: Globalization and the resulting increase in competition among companies have raised the importance of making well-timed decisions. Strategies that are flexible and adaptive to a changing market stand a greater chance of being effective in the long term. At the same time, managing the entire product lifecycle has emerged as a critical area for investment. Well-organized tools that apply past experience to new cases therefore help managers make sound decisions. Case-based reasoning (CBR) solves a new problem by using or adapting solutions to old problems. In this paper, an adapted CBR model with k-nearest neighbor (K-NN) is employed to suggest decisions for a given product in the middle-of-life phase. The set of solutions is weighted by CBR following the principle of group decision making. A genetic algorithm wrapper approach is employed to generate optimal feature subsets. A department store dataset covering various products, collected over two years, is used. K-fold cross-validation is used to evaluate the classification accuracy rate. Empirical results are compared with a classical case-based reasoning algorithm that has no special process for feature selection, a CBR-PCA algorithm based on filter-approach feature selection, and an Artificial Neural Network. The results indicate that, in this specific case, the proposed model predicts more effectively than the two CBR algorithms.
Abstract: A generic and extendible Multi-Agent Data Mining
(MADM) framework, MADMF (the Multi-Agent Data Mining
Framework) is described. The central feature of the framework is that
it avoids the use of agreed meta-language formats by supporting a
framework of wrappers.
The advantage offered is that the framework is easily extendible,
so that further data agents and mining agents can simply be added to
the framework. A demonstration MADMF framework is currently
available. The paper includes details of the MADMF architecture and
the wrapper principle incorporated into it. A full description and
evaluation of the framework's operation is provided by considering
two MADM scenarios.
Abstract: In this work, we present a novel active learning approach
for learning a visual object detection system. Our system
is composed of an active learning mechanism as a wrapper around
a sub-algorithm that implements an online boosting-based learning
object detector. At its core is a combination of a bootstrap procedure
and a semi-automatic learning process based on the online boosting
procedure. The idea is to exploit the availability of the classifier during
learning to automatically label training samples and incrementally
improve the classifier. This addresses the issue of reducing labeling
effort while obtaining better performance. In addition, we propose
a verification process for further improvement of the classifier.
The idea is to allow re-updates on seen data during learning to
stabilize the detector. The main contribution of this empirical study
is a demonstration that active learning based on an online boosting
approach trained in this manner can achieve results comparable to, or
even better than, those of a framework trained in the conventional manner using
much more labeling effort. Experiments on challenging data
sets for specific object detection problems show the effectiveness of
our approach.
Abstract: Developing an accurate classifier for high-dimensional microarray datasets is a challenging task because only small sample sizes are available. Therefore, it is important to determine a set of relevant genes that classify the data well. Traditionally, gene selection methods select the top-ranked genes according to their discriminatory power. These genes are often correlated with each other, resulting in redundancy. In this paper, we propose a hybrid method that uses feature ranking and a wrapper method (Genetic Algorithm with multiclass SVM) to identify a set of relevant genes that classify the data more accurately. A new fitness function for the genetic algorithm is defined that focuses on selecting the smallest set of genes that provides maximum accuracy. Experiments have been carried out on four well-known datasets. The proposed method provides better results than those reported in the literature in terms of both classification accuracy and number of genes selected.
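The abstract does not give the form of its fitness function, so the following is only one plausible shape for a fitness that trades classification accuracy against subset size. The weighted-sum form, the parameter name `weight`, and its default value are all assumptions made here for illustration.

```python
def fitness(accuracy: float, n_selected: int, n_total: int, weight: float = 0.8) -> float:
    """Hypothetical GA fitness: reward accuracy, and reward smaller gene subsets.

    accuracy   -- classifier accuracy on the selected genes, in [0, 1]
    n_selected -- number of genes in the candidate subset
    n_total    -- number of genes available after the ranking step
    weight     -- assumed trade-off between accuracy and compactness
    """
    if n_selected == 0:
        return 0.0  # an empty subset cannot classify anything
    size_reward = 1.0 - n_selected / n_total
    return weight * accuracy + (1.0 - weight) * size_reward


# Two subsets with equal accuracy: the smaller one scores higher.
small = fitness(accuracy=0.95, n_selected=20, n_total=200)
large = fitness(accuracy=0.95, n_selected=50, n_total=200)
print(small > large)  # True
```

With a high `weight`, accuracy dominates and subset size mostly acts as a tie-breaker, which matches the stated goal of the smallest gene set at maximum accuracy.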
Abstract: Sharing consistent and correct master data among
disparate applications in a reverse-logistics chain has long been
recognized as an intricate problem. Although a master data
management (MDM) system can surely assume that responsibility,
applications that need to co-operate with it must comply with
proprietary query interfaces provided by the specific MDM system. In
this paper, we present an RFID-ready MDM system which makes
master data readily available to any participating application in a
reverse-logistics chain. We propose an RFID-wrapper as a part of our
MDM. It acts as a gateway between any data retrieval request and
query interfaces that process it. With the RFID-wrapper, any
participating applications in a reverse-logistics chain can easily
retrieve master data in a way that is analogous to retrieval of any other
RFID-based logistics transactional data.
Abstract: The internet has become an attractive avenue for
global e-business, e-learning, knowledge sharing, etc. Due to
continuous increase in the volume of web content, it is not practically
possible for a user to extract information by browsing and integrating
data from a huge amount of web sources retrieved by the existing
search engines. The semantic web technology enables advancement
in information extraction by providing a suite of tools to integrate
data from different sources. To take full advantage of semantic web,
it is necessary to annotate existing web pages into semantic web
pages. This research develops a tool, named OWIE (Ontology-based
Web Information Extraction), for semantic web annotation using
domain specific ontologies. The tool automatically extracts
information from HTML pages with the help of pre-defined ontologies
and gives them semantic representation. Two case studies have been
conducted to analyze the accuracy of OWIE.
Abstract: This paper addresses the problem of building a unified
structure to describe a peer-to-peer system. Our approach uses the
well-known notations in the P2P area, and provides a global
architecture that puts a separation between the platform specific
characteristics and the logical ones. In order to enable the navigation
of a peer across platforms, a roaming layer is added. The latter
provides a capability to define a unique identification of a peer and
ensures the mapping between this identification and those used in
each platform. The mapping task is performed by a special wrapper. In
addition, an ontology is proposed to give a clear presentation of the
structure of the P2P system without regard to the content and the
resources managed by the peer. The ontology is created according to
the Semantic Web paradigm using the OWL language, so the
structure of the system is considered as a web resource.
Abstract: This article addresses feature selection for breast
cancer diagnosis. The proposed process uses a wrapper approach
based on a Genetic Algorithm (GA) and case-based reasoning (CBR).
The GA searches the space of possible
feature subsets, and CBR is employed to evaluate
each subset. Experimental results show that the
proposed model is comparable to other models on the Wisconsin
Diagnostic Breast Cancer (WDBC) dataset.
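A GA-based wrapper of this kind can be sketched as follows. This is not the authors' model: leave-one-out 1-nearest-neighbor retrieval is used here as a simple stand-in for the CBR evaluation, the GA settings are arbitrary, and the tiny dataset is invented rather than taken from WDBC.

```python
import random
from typing import List


def loo_1nn_accuracy(X: List[List[float]], y: List[int], mask: List[int]) -> float:
    """Leave-one-out accuracy of 1-NN retrieval on the features kept by mask."""
    idx = [j for j, keep in enumerate(mask) if keep]
    if not idx:
        return 0.0
    correct = 0
    for i, xi in enumerate(X):
        best_d, best_label = None, None
        for k, xk in enumerate(X):
            if k == i:
                continue  # the query case may not retrieve itself
            d = sum((xi[j] - xk[j]) ** 2 for j in idx)
            if best_d is None or d < best_d:
                best_d, best_label = d, y[k]
        correct += best_label == y[i]
    return correct / len(X)


def ga_select(X, y, pop_size=12, generations=15, p_mut=0.1, seed=0) -> List[int]:
    """Evolve binary feature masks; fitness is the wrapper accuracy above."""
    rng = random.Random(seed)
    n = len(X[0])

    def fit(mask: List[int]) -> float:
        return loo_1nn_accuracy(X, y, mask)

    # Start from the full feature set plus random masks.
    pop = [[1] * n] + [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size - 1)]
    for _ in range(generations):
        pop.sort(key=fit, reverse=True)
        elite = pop[:2]  # elitism: the best masks survive unchanged
        children = []
        while len(children) < pop_size - len(elite):
            p1, p2 = rng.sample(pop[: pop_size // 2], 2)
            cut = rng.randrange(1, n)  # one-point crossover
            child = p1[:cut] + p2[cut:]
            child = [1 - g if rng.random() < p_mut else g for g in child]
            children.append(child)
        pop = elite + children
    return max(pop, key=fit)


# Invented toy data: features 0 and 2 separate the classes, feature 1 is noise.
X = [[0.10, 5.0, 0.30], [0.20, 1.0, 0.10], [0.15, 9.0, 0.20], [0.05, 3.0, 0.25],
     [0.90, 4.0, 0.80], [0.80, 8.0, 0.90], [0.95, 2.0, 0.70], [0.85, 6.0, 0.85]]
y = [0, 0, 0, 0, 1, 1, 1, 1]

best_mask = ga_select(X, y)
print(best_mask, loo_1nn_accuracy(X, y, best_mask))
```

Because the full feature set is in the initial population and the best masks are carried over unchanged, the returned mask never scores worse than using all features.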