Multi-Level Air Quality Classification in China Using Information Gain and Support Vector Machine

Machine learning and data mining are two important tools for extracting useful information and knowledge from large datasets. In machine learning, classification is a widely used technique for predicting qualitative variables and is generally preferred over regression from an operational point of view. Because air pollution has increased enormously in many countries, especially China, air quality classification has become one of the most important topics in air quality research and modelling. This study introduces a hybrid classification model based on information theory and the Support Vector Machine (SVM), using air quality data for four cities in China (Beijing, Guangzhou, Shanghai and Tianjin) from January 1, 2014 to April 30, 2016. China's Ministry of Environmental Protection classifies daily air quality into six levels (Serious Pollution, Severe Pollution, Moderate Pollution, Light Pollution, Good and Excellent) based on the Air Quality Index (AQI) value. Using information theory, information gain (IG) is calculated and feature selection is performed for both categorical and continuous numeric features. The SVM algorithm is then applied to the selected features with cross-validation. The final evaluation shows that the hybrid IG and SVM model outperforms SVM alone, Artificial Neural Network (ANN) and K-Nearest Neighbours (KNN) models in terms of both accuracy and complexity.
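As an illustration of the feature-selection step, the following is a minimal sketch of information gain for a categorical feature, computed as the entropy of the class labels minus the conditional entropy given the feature. The feature names (wind, season) and the toy data are hypothetical and do not come from the study; continuous features would first need to be discretised, which is not shown here.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(feature_values, labels):
    """IG = H(labels) - H(labels | feature) for a categorical feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


# Hypothetical toy data: four days, two candidate features, one class label.
wind = ["calm", "calm", "windy", "windy"]
season = ["winter", "summer", "winter", "summer"]
level = ["Moderate Pollution", "Moderate Pollution", "Good", "Good"]

gains = {
    "wind": information_gain(wind, level),
    "season": information_gain(season, level),
}
# Rank features by information gain, highest first.
ranked = sorted(gains, key=gains.get, reverse=True)
print(ranked)
```

In this toy case the wind feature separates the two classes perfectly while the season feature carries no information, so a selection step that keeps only the highest-gain features would keep wind and drop season before the classifier is trained.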

Block Based Imperial Competitive Algorithm with Greedy Search for Traveling Salesman Problem

The imperial competitive algorithm (ICA) is a multi-agent algorithm. Each agent is like a kingdom with its own countries; the strongest country in each kingdom is called the imperialist and the others are colonies. Countries compete with the imperialist of their own kingdom by evolving: a country moves through the search space to find better solutions with higher fitness and so become the new imperialist. The main idea of this paper is to use this characteristic of ICA to explore the search space when solving combinatorial problems. In addition, we use greedy search to strengthen the local search ability. To verify the proposed algorithm, experiments are carried out on traveling salesman problem (TSP) instances from the traveling salesman problem library (TSPLIB). The results show that the proposed algorithm performs better than other known methods.
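The abstract does not specify which greedy search is used. As one possible reading, the sketch below builds a TSP tour with the nearest-neighbour heuristic, a common greedy construction; the function names and the four-city instance are hypothetical and are not taken from the paper or from TSPLIB.

```python
from math import dist


def tour_length(tour, cities):
    """Total length of a closed tour given as a list of city indices."""
    return sum(
        dist(cities[tour[i]], cities[tour[(i + 1) % len(tour)]])
        for i in range(len(tour))
    )


def greedy_tour(cities, start=0):
    """Nearest-neighbour construction: always visit the closest unvisited city."""
    unvisited = set(range(len(cities))) - {start}
    tour = [start]
    while unvisited:
        last = cities[tour[-1]]
        # Ties are broken by the lower city index so the result is deterministic.
        nearest = min(unvisited, key=lambda j: (dist(last, cities[j]), j))
        tour.append(nearest)
        unvisited.remove(nearest)
    return tour


# Hypothetical instance: the four corners of a unit square.
cities = [(0, 0), (0, 1), (1, 1), (1, 0)]
tour = greedy_tour(cities)
print(tour, tour_length(tour, cities))
```

A greedy tour of this kind is cheap to compute and could serve as a starting point or a local improvement step inside a population-based search, which is the role the abstract assigns to greedy search.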

Applying Sequential Pattern Mining to Generate Block for Scheduling Problems

The main idea of this paper is to use sequential pattern mining to find information that helps in locating high-performance solutions. This information is combined and defined as blocks. Using the blocks to generate artificial chromosomes (ACs) can improve the structure of solutions. Estimation of Distribution Algorithms (EDAs) are adapted to solve combinatorial problems. Although many of these approaches are advantageous for this application, only some of them are used to enhance its efficiency. Generating ACs from both patterns and EDAs can increase diversity. According to the experimental results, the proposed algorithm achieves better performance in solving permutation flow-shop problems.

Self-evolving Artificial Immune System via Developing T and B Cell for Permutation Flow-shop Scheduling Problems

The Artificial Immune System has been applied as a heuristic algorithm for decades. Nevertheless, many of these applications exploited the benefits of the algorithm but seldom proposed approaches for enhancing its efficiency. In this paper, a Self-evolving Artificial Immune System is proposed by developing the T and B cells of the immune system and building a self-evolving mechanism that adapts to the complexity of different problems. This research focuses on enhancing the efficiency of Clonal Selection, which is responsible for producing Affinities to resist invading Antigens. T and B cells are the main mechanisms by which Clonal Selection produces different combinations of Antibodies. The development of T and B cells therefore influences how efficiently Clonal Selection searches for better solutions. Furthermore, to make the two cells cooperate better, a co-evolutionary strategy is applied to coordinate them and produce Antibodies more effectively. Finally, this work uses flow-shop scheduling instances from the OR-Library to validate the proposed algorithm.

Development of Decision Support System for House Evaluation and Purchasing

A home is important to Chinese people. Because the information on house attributes and surrounding environments is incomplete at most real estate agencies, most house buyers find it difficult to consider all factors effectively and can only search for candidates using a sorting-based approach. This study develops a decision support system for house purchasing in which the surrounding facilities of each house are quantified. All house factors considered and the customer's preferences are then incorporated into the Simple Multi-Attribute Ranking Technique (SMART) to support house evaluation. To evaluate the validity of the proposed approach, an empirical study was conducted with a real estate agency. Based on customer requirements and preferences, the proposed approach can identify better candidate houses by considering the overall house attributes and surrounding facilities.
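A SMART-style evaluation is commonly implemented as a weighted sum of normalised attribute ratings. The sketch below assumes min-max normalisation and uses hypothetical attribute names, weights and houses; the actual attributes, scales and weights used in the study are not given in the abstract.

```python
def smart_scores(houses, weights, lower_is_better=()):
    """Weighted sum of min-max normalised ratings for each candidate house."""
    total_weight = sum(weights.values())
    scores = {name: 0.0 for name in houses}
    for attr, weight in weights.items():
        values = [h[attr] for h in houses.values()]
        lo, hi = min(values), max(values)
        for name, house in houses.items():
            # Rate each house on a 0..1 scale; identical values get a neutral 0.5.
            rating = 0.5 if hi == lo else (house[attr] - lo) / (hi - lo)
            if attr in lower_is_better:
                rating = 1.0 - rating
            scores[name] += (weight / total_weight) * rating
    return scores


# Hypothetical candidates and customer preference weights.
houses = {
    "A": {"price": 300, "parks_nearby": 2},
    "B": {"price": 200, "parks_nearby": 1},
}
weights = {"price": 3, "parks_nearby": 1}

scores = smart_scores(houses, weights, lower_is_better=("price",))
best = max(scores, key=scores.get)
print(scores, best)
```

Unlike sorting on a single attribute, the weighted sum lets a house that is slightly worse on one factor still rank first when it is much better on the factors the customer weights most heavily.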

Combine a Population-based Incremental Learning with Artificial Immune System for Intrusion Detection System

This research focuses on developing an intrusion detection system (IDS) that uses an artificial immune system (AIS) with population-based incremental learning (PBIL). The AIS has a powerful ability to distinguish and eliminate antigens when they intrude into the human body. PBIL adjusts new learning on the basis of past learning experience. We therefore propose an intrusion detection system called PBIL-AIS, which combines the PBIL and AIS approaches for evolutionary computing. In the AIS part, we design three mechanisms (clonal selection, negative selection and antibody level) to strengthen AIS performance. The experimental results show that the PBIL-AIS IDS achieves high accuracy in detecting intrusion connections.
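The idea of adjusting new learning from past experience can be illustrated with the textbook PBIL update, in which a probability vector is nudged toward the best individual of each generation. This is a generic sketch under assumed parameters (learning rate, population size, bit-string encoding, a toy fitness function); it does not reproduce the PBIL-AIS design or its AIS mechanisms.

```python
import random


def pbil_update(prob, best, learning_rate=0.1):
    """Shift each probability toward the corresponding bit of the best individual."""
    return [(1 - learning_rate) * p + learning_rate * b for p, b in zip(prob, best)]


def pbil(fitness, n_bits, pop_size=20, generations=50, learning_rate=0.1, seed=0):
    """Basic PBIL loop: sample a population, pick the best, update the vector."""
    rng = random.Random(seed)
    prob = [0.5] * n_bits  # start with no preference for 0 or 1
    for _ in range(generations):
        population = [
            [1 if rng.random() < p else 0 for p in prob] for _ in range(pop_size)
        ]
        best = max(population, key=fitness)
        prob = pbil_update(prob, best, learning_rate)
    return prob


# Toy fitness (hypothetical): count of 1-bits in the candidate.
final_prob = pbil(sum, n_bits=8)
print([round(p, 2) for p in final_prob])
```

The probability vector acts as the memory of past generations: each update keeps most of the previous vector and blends in only a fraction of the new best individual, controlled by the learning rate.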

The Design of Self-evolving Artificial Immune System II for Permutation Flow-shop Problem

The Artificial Immune System has been adopted as a heuristic algorithm for solving combinatorial problems for decades. Nevertheless, many of these applications exploited the benefits of the algorithm but seldom proposed approaches for enhancing its efficiency. In this paper, we continue our previous research and develop a Self-evolving Artificial Immune System II (SEAIS II) by coordinating the T and B cells of the immune system and building block-based artificial chromosomes, in order to shorten computation time and improve performance on problems of different complexity. The Plasma cell and clonal selection are designed in relation to the function of the Immune Response, which gives the AIS both global and local search ability and prevents it from becoming trapped in local optima. The experimental results show significant performance and validate that SEAIS II is effective in solving permutation flow-shop problems.

Building a Trend Based Segmentation Method with SVR Model for Stock Turning Detection

This research focuses on developing a new segmentation method, called the trend based segmentation method (TBSM), for improving forecasting models. In general, piece-wise linear representation (PLR) can find good pairs of trading points in time series data, but in a complicated stock environment it does not work well for stock forecasting because a stock has more than one trading trend. If the trends in the stock price are taken into account when generating trading signals, the precision of the forecasting model improves. Therefore, TBSM with a support vector regression (SVR) model is used to detect trading points for various Taiwanese and American stocks under different trend tendencies. The experimental results show that our trading system is more profitable and can be implemented in real time in the stock market.
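For context, the PLR baseline mentioned above is often implemented as a top-down split: the point farthest from the straight line joining the two ends of a segment becomes a turning point if its distance exceeds a threshold. The sketch below shows only this generic PLR step, with a hypothetical function name, threshold and price series; the trend-based extension in TBSM and the SVR model are not described in enough detail in the abstract to reproduce.

```python
def plr_turning_points(prices, threshold):
    """Top-down PLR: return indices of segment end points (needs >= 2 prices)."""

    def split(lo, hi):
        if hi - lo < 2:
            return []
        slope = (prices[hi] - prices[lo]) / (hi - lo)
        worst, worst_gap = None, 0.0
        for i in range(lo + 1, hi):
            # Vertical distance from the price to the line joining the end points.
            gap = abs(prices[i] - (prices[lo] + slope * (i - lo)))
            if gap > worst_gap:
                worst, worst_gap = i, gap
        if worst is None or worst_gap <= threshold:
            return []
        return split(lo, worst) + [worst] + split(worst, hi)

    return [0] + split(0, len(prices) - 1) + [len(prices) - 1]


# Hypothetical price series with a single peak in the middle.
prices = [1, 2, 3, 2, 1]
points = plr_turning_points(prices, threshold=0.5)
print(points)
```

The interior indices returned by the split are candidate turning points, so a smaller threshold yields more segments and more candidate trading points.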

Atrial Fibrillation Analysis Based on Blind Source Separation in 12-lead ECG

Atrial fibrillation is the most common sustained arrhythmia encountered by clinicians. Because the waveform of atrial fibrillation in atrial activation is not visible to the human eye, it is necessary to develop an automatic diagnosis system. The 12-lead ECG is now available in hospitals and is suitable for estimating the AA period with Independent Component Analysis (ICA). In this research, we also adopt a second-order blind identification approach to transform the sources extracted by ICA into a more precise signal, and then use a frequency-domain algorithm to perform the classification. In experiments on clinical data, we obtain significant results.

Emotion Classification by Incremental Association Language Features

Major Depressive Disorder has been a burden on medical expenses in Taiwan, as it has around the world. Major Depressive Disorder can be divided into different categories according to previous human activities. With machine learning, we can classify emotion in correct textual language in advance, which can help medical diagnosis recognize the variation in Major Depressive Disorder automatically. Incremental association language features describe the characteristics and relationships of words discovered in a sentence. Classification suffers from an overlapping-category problem. In this paper, we aim to improve classification performance under the principle of avoiding overlapping categories. We present an approach, called Association Language Features by its Category (ALFC), that discovers words in a sentence which occur together with high frequency and cannot overlap between categories. Experimental results show that ALFC distinguishes Major Depressive Disorder well and achieves better performance. We also compare the approach with a baseline and with mutual information, which use single words alone or a correlation measure.