Abstract: Recently GPS data is used in a lot of studies to
automatically reconstruct travel patterns for trip survey. The aim is to
minimize the use of questionnaire surveys and travel diaries so as to
reduce their negative effects. In this paper data acquired from GPS and
accelerometer embedded in smart phones is utilized to predict the
mode of transportation used by the phone carrier. For prediction,
Support Vector Machine (SVM) and Adaptive boosting (AdaBoost)
are employed. Moreover a unique method to improve the prediction
results from these algorithms is also proposed. Results suggest that the
prediction accuracy of AdaBoost after improvement is relatively better
than the rest.
Abstract: The tourism industry has been widely used to eradicate poverty, due to the ability to generate income, employment as well as improving the quality of life. The industry has faced rapid growth with support from local residents who were involved directly and indirectly in tourism activities. Their support and behaviour does not only facilitate in boosting tourists’ satisfaction levels, but at the same time it contributes to the word-of-mouth promotion among the visitors. In order to ensure the success of the industry, the involvement and participation of the local communities are pertinent. This paper endeavours on local community attitudes, benefit and their support toward future tourism development in Tioman Island. Through a series of descriptive and factor analyses, various useful understandings on the issues of interest revealed. The findings indicated that community with personal benefit will support future development. Meanwhile, the finding also revealed that the community with negative perception still supports future tourism development due to their over reliance on this sector as their main source of income and destination development means.
Abstract: This paper presents a 14-bit cyclic-pipelined Analog to digital converter (ADC) running at 1 MS/s. The architecture is based on a 1.5-bit per stage structure utilizing digital correction for each stage. The ADC consists of two 1.5-bit stages, one shift register delay line, and digital error correction logic. Inside each 1.5-bit stage, there is one gain-boosting op-amp and two comparators. The ADC was implemented in 0.18µm CMOS process and the design has an area of approximately 0.2 mm2. The ADC has a differential input range of 1.2 Vpp. The circuit has an average power consumption of 3.5mA with 10MHz sampling clocks. The post-layout simulations of the design satisfy 12-bit SNDR with a full-scale sinusoid input.
Abstract: Increasing attention has been given in academia to the concept of corporate social responsibility. Also, the number of companies that undertake social responsibility initiatives has been boosting day by day since behaving in a socially responsible manner brings a lot to the companies. Literature provides various benefits of social responsibility and under which situations these benefits could be realized. However, most of these studies focus on one aspect of the consequences of behaving in a socially responsible manner and there is no study that unifies the conditions that a company should fulfill to make customers prefer its brand. This study aims to fill this gap. More specifically, the purpose of this study is to identify the conditions that a socially responsible company should fulfill in order to attract customers. To this end, a scale is developed and its reliability and validity is assessed through the method of Multitrait- Multimethod Matrix.
Abstract: Text Mining is around applying knowledge discovery
techniques to unstructured text is termed knowledge discovery in text
(KDT), or Text data mining or Text Mining. In decision tree
approach is most useful in classification problem. With this
technique, tree is constructed to model the classification process.
There are two basic steps in the technique: building the tree and
applying the tree to the database. This paper describes a proposed
C5.0 classifier that performs rulesets, cross validation and boosting
for original C5.0 in order to reduce the optimization of error ratio.
The feasibility and the benefits of the proposed approach are
demonstrated by means of medial data set like hypothyroid. It is
shown that, the performance of a classifier on the training cases from
which it was constructed gives a poor estimate by sampling or using a
separate test file, either way, the classifier is evaluated on cases that
were not used to build and evaluate the classifier are both are large. If
the cases in hypothyroid.data and hypothyroid.test were to be
shuffled and divided into a new 2772 case training set and a 1000
case test set, C5.0 might construct a different classifier with a lower
or higher error rate on the test cases. An important feature of see5 is
its ability to classifiers called rulesets. The ruleset has an error rate
0.5 % on the test cases. The standard errors of the means provide an
estimate of the variability of results. One way to get a more reliable
estimate of predictive is by f-fold –cross- validation. The error rate of
a classifier produced from all the cases is estimated as the ratio of the
total number of errors on the hold-out cases to the total number of
cases. The Boost option with x trials instructs See5 to construct up to
x classifiers in this manner. Trials over numerous datasets, large and
small, show that on average 10-classifier boosting reduces the error
rate for test cases by about 25%.
Abstract: This paper proposes to use ETM+ multispectral data
and panchromatic band as well as texture features derived from the
panchromatic band for land cover classification. Four texture features
including one 'internal texture' and three GLCM based textures
namely correlation, entropy, and inverse different moment were used
in combination with ETM+ multispectral data. Two data sets
involving combination of multispectral, panchromatic band and its
texture were used and results were compared with those obtained by
using multispectral data alone. A decision tree classifier with and
without boosting were used to classify different datasets. Results
from this study suggest that the dataset consisting of panchromatic
band, four of its texture features and multispectral data was able to
increase the classification accuracy by about 2%. In comparison, a
boosted decision tree was able to increase the classification accuracy
by about 3% with the same dataset.
Abstract: In this work, we present a novel active learning approach
for learning a visual object detection system. Our system
is composed of an active learning mechanism as wrapper around
a sub-algorithm which implement an online boosting-based learning
object detector. In the core is a combination of a bootstrap procedure
and a semi automatic learning process based on the online boosting
procedure. The idea is to exploit the availability of classifier during
learning to automatically label training samples and increasingly
improves the classifier. This addresses the issue of reducing labeling
effort meanwhile obtain better performance. In addition, we propose
a verification process for further improvement of the classifier.
The idea is to allow re-update on seen data during learning for
stabilizing the detector. The main contribution of this empirical study
is a demonstration that active learning based on an online boosting
approach trained in this manner can achieve results comparable or
even outperform a framework trained in conventional manner using
much more labeling effort. Empirical experiments on challenging data
set for specific object deteciton problems show the effectiveness of
our approach.
Abstract: Chronic hepatitis B can evolve to cirrhosis and liver
cancer. Interferon is the only effective treatment, for carefully selected
patients, but it is very expensive. Some of the selection criteria are
based on liver biopsy, an invasive, costly and painful medical procedure.
Therefore, developing efficient non-invasive selection systems,
could be in the patients benefit and also save money. We investigated
the possibility to create intelligent systems to assist the Interferon
therapeutical decision, mainly by predicting with acceptable accuracy
the results of the biopsy. We used a knowledge discovery in integrated
medical data - imaging, clinical, and laboratory data. The resulted
intelligent systems, tested on 500 patients with chronic hepatitis
B, based on C5.0 decision trees and boosting, predict with 100%
accuracy the results of the liver biopsy. Also, by integrating the other
patients selection criteria, they offer a non-invasive support for the
correct Interferon therapeutic decision. To our best knowledge, these
decision systems outperformed all similar systems published in the
literature, and offer a realistic opportunity to replace liver biopsy in
this medical context.
Abstract: In this paper, the application of multiple Elman neural networks to time series data regression problems is studied. An ensemble of Elman networks is formed by boosting to enhance the performance of the individual networks. A modified version of the AdaBoost algorithm is employed to integrate the predictions from multiple networks. Two benchmark time series data sets, i.e., the Sunspot and Box-Jenkins gas furnace problems, are used to assess the effectiveness of the proposed system. The simulation results reveal that an ensemble of boosted Elman networks can achieve a higher degree of generalization as well as performance than that of the individual networks. The results are compared with those from other learning systems, and implications of the performance are discussed.
Abstract: With the globalized production and logistics
environment, the need for reducing the product development interval
and lead time, having a faster response to orders, conforming to quality
standards, fair tracking, and boosting information exchanging
activities with customers and partners, and coping with changes in the
management environment, manufacturers are in dire need of an
information management system in their manufacturing environments.
There are lots of information systems that have been designed to
manage the condition or operation of equipment in the field but
existing systems have a decentralized architecture, which is not
unified. Also, these systems cannot effectively handle the status data
extraction process upon encountering a problem related to protocols or
changes in the equipment or the setting. In this regard, this paper will
introduce a system for processing and saving the status info of
production equipment, which uses standard representation formats, to
enable flexible responses to and support for variables in the field
equipment. This system can be used for a variety of manufacturing and
equipment settings and is capable of interacting with higher-tier
systems such as MES.
Abstract: The problem of ranking (rank regression) has become popular in the machine learning community. This theory relates to problems, in which one has to predict (guess) the order between objects on the basis of vectors describing their observed features. In many ranking algorithms a convex loss function is used instead of the 0-1 loss. It makes these procedures computationally efficient. Hence, convex risk minimizers and their statistical properties are investigated in this paper. Fast rates of convergence are obtained under conditions, that look similarly to the ones from the classification theory. Methods used in this paper come from the theory of U-processes as well as empirical processes.
Abstract: Bagging and boosting are among the most popular re-sampling ensemble methods that generate and combine a diversity of regression models using the same learning algorithm as base-learner. Boosting algorithms are considered stronger than bagging on noise-free data. However, there are strong empirical indications that bagging is much more robust than boosting in noisy settings. For this reason, in this work we built an ensemble using an averaging methodology of bagging and boosting ensembles with 10 sub-learners in each one. We performed a comparison with simple bagging and boosting ensembles with 25 sub-learners on standard benchmark datasets and the proposed ensemble gave better accuracy.
Abstract: In this paper we designed and implemented a new
ensemble of classifiers based on a sequence of classifiers which were
specialized in regions of the training dataset where errors of its
trained homologous are concentrated. In order to separate this
regions, and to determine the aptitude of each classifier to properly
respond to a new case, it was used another set of classifiers built
hierarchically. We explored a selection based variant to combine the
base classifiers. We validated this model with different base
classifiers using 37 training datasets. It was carried out a statistical
comparison of these models with the well known Bagging and
Boosting, obtaining significantly superior results with the
hierarchical ensemble using Multilayer Perceptron as base classifier.
Therefore, we demonstrated the efficacy of the proposed ensemble,
as well as its applicability to general problems.
Abstract: SoftBoost is a recently presented boosting algorithm,
which trades off the size of achieved classification margin and
generalization performance. This paper presents a performance
evaluation of SoftBoost algorithm on the generic object recognition
problem. An appearance-based generic object recognition
model is used. The evaluation experiments are performed using
a difficult object recognition benchmark. An assessment with respect
to different degrees of label noise as well as a comparison to
the well known AdaBoost algorithm is performed. The obtained
results reveal that SoftBoost is encouraged to be used in cases
when the training data is known to have a high degree of noise.
Otherwise, using Adaboost can achieve better performance.
Abstract: Traffic Management and Information Systems, which rely on a system of sensors, aim to describe in real-time traffic in urban areas using a set of parameters and estimating them. Though the state of the art focuses on data analysis, little is done in the sense of prediction. In this paper, we describe a machine learning system for traffic flow management and control for a prediction of traffic flow problem. This new algorithm is obtained by combining Random Forests algorithm into Adaboost algorithm as a weak learner. We show that our algorithm performs relatively well on real data, and enables, according to the Traffic Flow Evaluation model, to estimate and predict whether there is congestion or not at a given time on road intersections.
Abstract: Bagging and boosting are among the most popular resampling ensemble methods that generate and combine a diversity of classifiers using the same learning algorithm for the base-classifiers. Boosting algorithms are considered stronger than bagging on noisefree data. However, there are strong empirical indications that bagging is much more robust than boosting in noisy settings. For this reason, in this work we built an ensemble using a voting methodology of bagging and boosting ensembles with 10 subclassifiers in each one. We performed a comparison with simple bagging and boosting ensembles with 25 sub-classifiers, as well as other well known combining methods, on standard benchmark datasets and the proposed technique was the most accurate.
Abstract: In recent years, a number of works proposing the
combination of multiple classifiers to produce a single
classification have been reported in remote sensing literature. The
resulting classifier, referred to as an ensemble classifier, is
generally found to be more accurate than any of the individual
classifiers making up the ensemble. As accuracy is the primary
concern, much of the research in the field of land cover
classification is focused on improving classification accuracy. This
study compares the performance of four ensemble approaches
(boosting, bagging, DECORATE and random subspace) with a
univariate decision tree as base classifier. Two training datasets,
one without ant noise and other with 20 percent noise was used to
judge the performance of different ensemble approaches. Results
with noise free data set suggest an improvement of about 4% in
classification accuracy with all ensemble approaches in
comparison to the results provided by univariate decision tree
classifier. Highest classification accuracy of 87.43% was achieved
by boosted decision tree. A comparison of results with noisy data
set suggests that bagging, DECORATE and random subspace
approaches works well with this data whereas the performance of
boosted decision tree degrades and a classification accuracy of
79.7% is achieved which is even lower than that is achieved (i.e.
80.02%) by using unboosted decision tree classifier.
Abstract: Studies in neuroscience suggest that both global and
local feature information are crucial for perception and recognition of
faces. It is widely believed that local feature is less sensitive to
variations caused by illumination, expression and illumination. In
this paper, we target at designing and learning local features for face
recognition. We designed three types of local features. They are
semi-global feature, local patch feature and tangent shape feature.
The designing of semi-global feature aims at taking advantage of
global-like feature and meanwhile avoiding suppressing AdaBoost
algorithm in boosting weak classifies established from small local
patches. The designing of local patch feature targets at automatically
selecting discriminative features, and is thus different with traditional
ways, in which local patches are usually selected manually to cover
the salient facial components. Also, shape feature is considered in
this paper for frontal view face recognition. These features are
selected and combined under the framework of boosting algorithm
and cascade structure. The experimental results demonstrate that the
proposed approach outperforms the standard eigenface method and
Bayesian method. Moreover, the selected local features and
observations in the experiments are enlightening to researches in
local feature design in face recognition.