Using Probe Person Data for Travel Mode Detection

Recently GPS data is used in a lot of studies to automatically reconstruct travel patterns for trip survey. The aim is to minimize the use of questionnaire surveys and travel diaries so as to reduce their negative effects. In this paper data acquired from GPS and accelerometer embedded in smart phones is utilized to predict the mode of transportation used by the phone carrier. For prediction, Support Vector Machine (SVM) and Adaptive boosting (AdaBoost) are employed. Moreover a unique method to improve the prediction results from these algorithms is also proposed. Results suggest that the prediction accuracy of AdaBoost after improvement is relatively better than the rest.

Community Behaviour and Support towards Island Tourism Development

The tourism industry has been widely used to eradicate poverty, due to the ability to generate income, employment as well as improving the quality of life. The industry has faced rapid growth with support from local residents who were involved directly and indirectly in tourism activities. Their support and behaviour does not only facilitate in boosting tourists’ satisfaction levels, but at the same time it contributes to the word-of-mouth promotion among the visitors. In order to ensure the success of the industry, the involvement and participation of the local communities are pertinent. This paper endeavours on local community attitudes, benefit and their support toward future tourism development in Tioman Island. Through a series of descriptive and factor analyses, various useful understandings on the issues of interest revealed. The findings indicated that community with personal benefit will support future development. Meanwhile, the finding also revealed that the community with negative perception still supports future tourism development due to their over reliance on this sector as their main source of income and destination development means.

14-Bit 1MS/s Cyclic-Pipelined ADC

This paper presents a 14-bit cyclic-pipelined Analog to digital converter (ADC) running at 1 MS/s. The architecture is based on a 1.5-bit per stage structure utilizing digital correction for each stage. The ADC consists of two 1.5-bit stages, one shift register delay line, and digital error correction logic. Inside each 1.5-bit stage, there is one gain-boosting op-amp and two comparators. The ADC was implemented in 0.18µm CMOS process and the design has an area of approximately 0.2 mm2. The ADC has a differential input range of 1.2 Vpp. The circuit has an average power consumption of 3.5mA with 10MHz sampling clocks. The post-layout simulations of the design satisfy 12-bit SNDR with a full-scale sinusoid input.

Prerequisites to Increase the Purchase Intent fora Socially Responsible Company –Development of a Scale

Increasing attention has been given in academia to the concept of corporate social responsibility. Also, the number of companies that undertake social responsibility initiatives has been boosting day by day since behaving in a socially responsible manner brings a lot to the companies. Literature provides various benefits of social responsibility and under which situations these benefits could be realized. However, most of these studies focus on one aspect of the consequences of behaving in a socially responsible manner and there is no study that unifies the conditions that a company should fulfill to make customers prefer its brand. This study aims to fill this gap. More specifically, the purpose of this study is to identify the conditions that a socially responsible company should fulfill in order to attract customers. To this end, a scale is developed and its reliability and validity is assessed through the method of Multitrait- Multimethod Matrix.

Text Mining Technique for Data Mining Application

Text Mining is around applying knowledge discovery techniques to unstructured text is termed knowledge discovery in text (KDT), or Text data mining or Text Mining. In decision tree approach is most useful in classification problem. With this technique, tree is constructed to model the classification process. There are two basic steps in the technique: building the tree and applying the tree to the database. This paper describes a proposed C5.0 classifier that performs rulesets, cross validation and boosting for original C5.0 in order to reduce the optimization of error ratio. The feasibility and the benefits of the proposed approach are demonstrated by means of medial data set like hypothyroid. It is shown that, the performance of a classifier on the training cases from which it was constructed gives a poor estimate by sampling or using a separate test file, either way, the classifier is evaluated on cases that were not used to build and evaluate the classifier are both are large. If the cases in hypothyroid.data and hypothyroid.test were to be shuffled and divided into a new 2772 case training set and a 1000 case test set, C5.0 might construct a different classifier with a lower or higher error rate on the test cases. An important feature of see5 is its ability to classifiers called rulesets. The ruleset has an error rate 0.5 % on the test cases. The standard errors of the means provide an estimate of the variability of results. One way to get a more reliable estimate of predictive is by f-fold –cross- validation. The error rate of a classifier produced from all the cases is estimated as the ratio of the total number of errors on the hold-out cases to the total number of cases. The Boost option with x trials instructs See5 to construct up to x classifiers in this manner. Trials over numerous datasets, large and small, show that on average 10-classifier boosting reduces the error rate for test cases by about 25%.

Fusion of ETM+ Multispectral and Panchromatic Texture for Remote Sensing Classification

This paper proposes to use ETM+ multispectral data and panchromatic band as well as texture features derived from the panchromatic band for land cover classification. Four texture features including one 'internal texture' and three GLCM based textures namely correlation, entropy, and inverse different moment were used in combination with ETM+ multispectral data. Two data sets involving combination of multispectral, panchromatic band and its texture were used and results were compared with those obtained by using multispectral data alone. A decision tree classifier with and without boosting were used to classify different datasets. Results from this study suggest that the dataset consisting of panchromatic band, four of its texture features and multispectral data was able to increase the classification accuracy by about 2%. In comparison, a boosted decision tree was able to increase the classification accuracy by about 3% with the same dataset.

Efficient Boosting-Based Active Learning for Specific Object Detection Problems

In this work, we present a novel active learning approach for learning a visual object detection system. Our system is composed of an active learning mechanism as wrapper around a sub-algorithm which implement an online boosting-based learning object detector. In the core is a combination of a bootstrap procedure and a semi automatic learning process based on the online boosting procedure. The idea is to exploit the availability of classifier during learning to automatically label training samples and increasingly improves the classifier. This addresses the issue of reducing labeling effort meanwhile obtain better performance. In addition, we propose a verification process for further improvement of the classifier. The idea is to allow re-update on seen data during learning for stabilizing the detector. The main contribution of this empirical study is a demonstration that active learning based on an online boosting approach trained in this manner can achieve results comparable or even outperform a framework trained in conventional manner using much more labeling effort. Empirical experiments on challenging data set for specific object deteciton problems show the effectiveness of our approach.

Artificial Intelligence Support for Interferon Treatment Decision in Chronic Hepatitis B

Chronic hepatitis B can evolve to cirrhosis and liver cancer. Interferon is the only effective treatment, for carefully selected patients, but it is very expensive. Some of the selection criteria are based on liver biopsy, an invasive, costly and painful medical procedure. Therefore, developing efficient non-invasive selection systems, could be in the patients benefit and also save money. We investigated the possibility to create intelligent systems to assist the Interferon therapeutical decision, mainly by predicting with acceptable accuracy the results of the biopsy. We used a knowledge discovery in integrated medical data - imaging, clinical, and laboratory data. The resulted intelligent systems, tested on 500 patients with chronic hepatitis B, based on C5.0 decision trees and boosting, predict with 100% accuracy the results of the liver biopsy. Also, by integrating the other patients selection criteria, they offer a non-invasive support for the correct Interferon therapeutic decision. To our best knowledge, these decision systems outperformed all similar systems published in the literature, and offer a realistic opportunity to replace liver biopsy in this medical context.

The Application of an Ensemble of Boosted Elman Networks to Time Series Prediction: A Benchmark Study

In this paper, the application of multiple Elman neural networks to time series data regression problems is studied. An ensemble of Elman networks is formed by boosting to enhance the performance of the individual networks. A modified version of the AdaBoost algorithm is employed to integrate the predictions from multiple networks. Two benchmark time series data sets, i.e., the Sunspot and Box-Jenkins gas furnace problems, are used to assess the effectiveness of the proposed system. The simulation results reveal that an ensemble of boosted Elman networks can achieve a higher degree of generalization as well as performance than that of the individual networks. The results are compared with those from other learning systems, and implications of the performance are discussed.

The Status Info Processing and Keeping System for Production Equipment

With the globalized production and logistics environment, the need for reducing the product development interval and lead time, having a faster response to orders, conforming to quality standards, fair tracking, and boosting information exchanging activities with customers and partners, and coping with changes in the management environment, manufacturers are in dire need of an information management system in their manufacturing environments. There are lots of information systems that have been designed to manage the condition or operation of equipment in the field but existing systems have a decentralized architecture, which is not unified. Also, these systems cannot effectively handle the status data extraction process upon encountering a problem related to protocols or changes in the equipment or the setting. In this regard, this paper will introduce a system for processing and saving the status info of production equipment, which uses standard representation formats, to enable flexible responses to and support for variables in the field equipment. This system can be used for a variety of manufacturing and equipment settings and is capable of interacting with higher-tier systems such as MES.

Ranking - Convex Risk Minimization

The problem of ranking (rank regression) has become popular in the machine learning community. This theory relates to problems, in which one has to predict (guess) the order between objects on the basis of vectors describing their observed features. In many ranking algorithms a convex loss function is used instead of the 0-1 loss. It makes these procedures computationally efficient. Hence, convex risk minimizers and their statistical properties are investigated in this paper. Fast rates of convergence are obtained under conditions, that look similarly to the ones from the classification theory. Methods used in this paper come from the theory of U-processes as well as empirical processes.

Combining Bagging and Additive Regression

Bagging and boosting are among the most popular re-sampling ensemble methods that generate and combine a diversity of regression models using the same learning algorithm as base-learner. Boosting algorithms are considered stronger than bagging on noise-free data. However, there are strong empirical indications that bagging is much more robust than boosting in noisy settings. For this reason, in this work we built an ensemble using an averaging methodology of bagging and boosting ensembles with 10 sub-learners in each one. We performed a comparison with simple bagging and boosting ensembles with 25 sub-learners on standard benchmark datasets and the proposed ensemble gave better accuracy.

Judges System for Classifiers Specialization

In this paper we designed and implemented a new ensemble of classifiers based on a sequence of classifiers which were specialized in regions of the training dataset where errors of its trained homologous are concentrated. In order to separate this regions, and to determine the aptitude of each classifier to properly respond to a new case, it was used another set of classifiers built hierarchically. We explored a selection based variant to combine the base classifiers. We validated this model with different base classifiers using 37 training datasets. It was carried out a statistical comparison of these models with the well known Bagging and Boosting, obtaining significantly superior results with the hierarchical ensemble using Multilayer Perceptron as base classifier. Therefore, we demonstrated the efficacy of the proposed ensemble, as well as its applicability to general problems.

Performance Comparison and Evaluation of AdaBoost and SoftBoost Algorithms on Generic Object Recognition

SoftBoost is a recently presented boosting algorithm, which trades off the size of achieved classification margin and generalization performance. This paper presents a performance evaluation of SoftBoost algorithm on the generic object recognition problem. An appearance-based generic object recognition model is used. The evaluation experiments are performed using a difficult object recognition benchmark. An assessment with respect to different degrees of label noise as well as a comparison to the well known AdaBoost algorithm is performed. The obtained results reveal that SoftBoost is encouraged to be used in cases when the training data is known to have a high degree of noise. Otherwise, using Adaboost can achieve better performance.

Traffic Flow Prediction using Adaboost Algorithm with Random Forests as a Weak Learner

Traffic Management and Information Systems, which rely on a system of sensors, aim to describe in real-time traffic in urban areas using a set of parameters and estimating them. Though the state of the art focuses on data analysis, little is done in the sense of prediction. In this paper, we describe a machine learning system for traffic flow management and control for a prediction of traffic flow problem. This new algorithm is obtained by combining Random Forests algorithm into Adaboost algorithm as a weak learner. We show that our algorithm performs relatively well on real data, and enables, according to the Traffic Flow Evaluation model, to estimate and predict whether there is congestion or not at a given time on road intersections.

Combining Bagging and Boosting

Bagging and boosting are among the most popular resampling ensemble methods that generate and combine a diversity of classifiers using the same learning algorithm for the base-classifiers. Boosting algorithms are considered stronger than bagging on noisefree data. However, there are strong empirical indications that bagging is much more robust than boosting in noisy settings. For this reason, in this work we built an ensemble using a voting methodology of bagging and boosting ensembles with 10 subclassifiers in each one. We performed a comparison with simple bagging and boosting ensembles with 25 sub-classifiers, as well as other well known combining methods, on standard benchmark datasets and the proposed technique was the most accurate.

Ensemble Learning with Decision Tree for Remote Sensing Classification

In recent years, a number of works proposing the combination of multiple classifiers to produce a single classification have been reported in remote sensing literature. The resulting classifier, referred to as an ensemble classifier, is generally found to be more accurate than any of the individual classifiers making up the ensemble. As accuracy is the primary concern, much of the research in the field of land cover classification is focused on improving classification accuracy. This study compares the performance of four ensemble approaches (boosting, bagging, DECORATE and random subspace) with a univariate decision tree as base classifier. Two training datasets, one without ant noise and other with 20 percent noise was used to judge the performance of different ensemble approaches. Results with noise free data set suggest an improvement of about 4% in classification accuracy with all ensemble approaches in comparison to the results provided by univariate decision tree classifier. Highest classification accuracy of 87.43% was achieved by boosted decision tree. A comparison of results with noisy data set suggests that bagging, DECORATE and random subspace approaches works well with this data whereas the performance of boosted decision tree degrades and a classification accuracy of 79.7% is achieved which is even lower than that is achieved (i.e. 80.02%) by using unboosted decision tree classifier.

Learning to Recognize Faces by Local Feature Design and Selection

Studies in neuroscience suggest that both global and local feature information are crucial for perception and recognition of faces. It is widely believed that local feature is less sensitive to variations caused by illumination, expression and illumination. In this paper, we target at designing and learning local features for face recognition. We designed three types of local features. They are semi-global feature, local patch feature and tangent shape feature. The designing of semi-global feature aims at taking advantage of global-like feature and meanwhile avoiding suppressing AdaBoost algorithm in boosting weak classifies established from small local patches. The designing of local patch feature targets at automatically selecting discriminative features, and is thus different with traditional ways, in which local patches are usually selected manually to cover the salient facial components. Also, shape feature is considered in this paper for frontal view face recognition. These features are selected and combined under the framework of boosting algorithm and cascade structure. The experimental results demonstrate that the proposed approach outperforms the standard eigenface method and Bayesian method. Moreover, the selected local features and observations in the experiments are enlightening to researches in local feature design in face recognition.