Analysis of Relation between Unlabeled and Labeled Data to Self-Taught Learning Performance

Obtaining labeled data in supervised learning is often difficult and expensive, and thus the trained learning algorithm tends to be overfitting due to small number of training data. As a result, some researchers have focused on using unlabeled data which may not necessary to follow the same generative distribution as the labeled data to construct a high-level feature for improving performance on supervised learning tasks. In this paper, we investigate the impact of the relationship between unlabeled and labeled data for classification performance. Specifically, we will apply difference unlabeled data which have different degrees of relation to the labeled data for handwritten digit classification task based on MNIST dataset. Our experimental results show that the higher the degree of relation between unlabeled and labeled data, the better the classification performance. Although the unlabeled data that is completely from different generative distribution to the labeled data provides the lowest classification performance, we still achieve high classification performance. This leads to expanding the applicability of the supervised learning algorithms using unsupervised learning.

Modified Fuzzy ARTMAP and Supervised Fuzzy ART: Comparative Study with Multispectral Classification

In this article a modification of the algorithm of the fuzzy ART network, aiming at returning it supervised is carried out. It consists of the search for the comparison, training and vigilance parameters giving the minimum quadratic distances between the output of the training base and those obtained by the network. The same process is applied for the determination of the parameters of the fuzzy ARTMAP giving the most powerful network. The modification consist in making learn the fuzzy ARTMAP a base of examples not only once as it is of use, but as many time as its architecture is in evolution or than the objective error is not reached . In this way, we don-t worry about the values to impose on the eight (08) parameters of the network. To evaluate each one of these three networks modified, a comparison of their performances is carried out. As application we carried out a classification of the image of Algiers-s bay taken by SPOT XS. We use as criterion of evaluation the training duration, the mean square error (MSE) in step control and the rate of good classification per class. The results of this study presented as curves, tables and images show that modified fuzzy ARTMAP presents the best compromise quality/computing time.

Magnesium Borate Synthesis by Microwave Method Using MgCl2.6H2O and H3BO3

There are many kinds of metal borates found not only in nature but also synthesized in the laboratory such as magnesium borates. Due to its excellent properties, as remarkable ceramic materials, they have also application areas in anti-wear and friction reducing additives as well as electro-conductive treating agents. The synthesis of magnesium borate powders can be fulfilled simply with two different methods, hydrothermal and thermal synthesis. Microwave assisted method, also another way of producing magnesium borate, can be classified into thermal synthesis because of using the principles of solid state synthesis. It also contributes producing particles with small size and high purity in nano-size material synthesize. In this study the production of magnesium borates, are aimed using MgCl2.6H2O and H3BO3. The identification of both starting materials and products were made by the equipments of, X-Ray Diffraction (XRD) and Fourier Transform Infrared Spectroscopy (FT-IR). After several synthesis steps magnesium borates were synthesized and characterized by XRD and FT-IR, as well.

Three-player Domineering

Domineering is a classic two-player combinatorial game usually played on a rectangular board. Three-player Domineering is the three-player version of Domineering played on a three dimensional board. Experimental results are presented for x×y ×z boards with x + y + z < 10 and x, y, z ≥ 2. Also, some theoretical results are shown for 2 × 2 × n board with n even and n ≥ 4.

Gene Expression Data Classification Using Discriminatively Regularized Sparse Subspace Learning

Sparse representation which can represent high dimensional data effectively has been successfully used in computer vision and pattern recognition problems. However, it doesn-t consider the label information of data samples. To overcome this limitation, we develop a novel dimensionality reduction algorithm namely dscriminatively regularized sparse subspace learning(DR-SSL) in this paper. The proposed DR-SSL algorithm can not only make use of the sparse representation to model the data, but also can effective employ the label information to guide the procedure of dimensionality reduction. In addition,the presented algorithm can effectively deal with the out-of-sample problem.The experiments on gene-expression data sets show that the proposed algorithm is an effective tool for dimensionality reduction and gene-expression data classification.

Enhancements in Blended e-Learning Management System

A learning management system (commonly abbreviated as LMS) is a software application for the administration, documentation, tracking, and reporting of training programs, classroom and online events, e-learning programs, and training content (Ellis 2009). (Hall 2003) defines an LMS as \"software that automates the administration of training events. All Learning Management Systems manage the log-in of registered users, manage course catalogs, record data from learners, and provide reports to management\". Evidence of the worldwide spread of e-learning in recent years is easy to obtain. In April 2003, no fewer than 66,000 fully online courses and 1,200 complete online programs were listed on the TeleCampus portal from TeleEducation (Paulsen 2003). In the report \" The US market in the Self-paced eLearning Products and Services:2010-2015 Forecast and Analysis\" The number of student taken classes exclusively online will be nearly equal (1% less) to the number taken classes exclusively in physical campuses. Number of student taken online course will increase from 1.37 million in 2010 to 3.86 million in 2015 in USA. In another report by The Sloan Consortium three-quarters of institutions report that the economic downturn has increased demand for online courses and programs.

Impovement of a Label Extraction Method for a Risk Search System

This paper proposes an improvement method of classification efficiency in a classification model. The model is used in a risk search system and extracts specific labels from articles posted at bulletin board sites. The system can analyze the important discussions composed of the articles. The improvement method introduces ensemble learning methods that use multiple classification models. Also, it introduces expressions related to the specific labels into generation of word vectors. The paper applies the improvement method to articles collected from three bulletin board sites selected by users and verifies the effectiveness of the improvement method.

Enhancing Cache Performance Based on Improved Average Access Time

A high performance computer includes a fast processor and millions bytes of memory. During the data processing, huge amount of information are shuffled between the memory and processor. Because of its small size and its effectiveness speed, cache has become a common feature of high performance computers. Enhancing cache performance proved to be essential in the speed up of cache-based computers. Most enhancement approaches can be classified as either software based or hardware controlled. The performance of the cache is quantified in terms of hit ratio or miss ratio. In this paper, we are optimizing the cache performance based on enhancing the cache hit ratio. The optimum cache performance is obtained by focusing on the cache hardware modification in the way to make a quick rejection to the missed line's tags from the hit-or miss comparison stage, and thus a low hit time for the wanted line in the cache is achieved. In the proposed technique which we called Even- Odd Tabulation (EOT), the cache lines come from the main memory into cache are classified in two types; even line's tags and odd line's tags depending on their Least Significant Bit (LSB). This division is exploited by EOT technique to reject the miss match line's tags in very low time compared to the time spent by the main comparator in the cache, giving an optimum hitting time for the wanted cache line. The high performance of EOT technique against the familiar mapping technique FAM is shown in the simulated results.

Learning Style and Learner Satisfaction in a Course Delivery Context

This paper describes the results and implications of a correlational study of learning styles and learner satisfaction. The relationship of these empirical concepts was examined in the context of traditional versus e-blended modes of course delivery in an introductory graduate research course. Significant results indicated that the visual side of the visual-verbal dimension of students- learning style(s) was positively correlated to satisfaction with themselves as learners in an e-blended course delivery mode and negatively correlated to satisfaction with the classroom environment in the context of a traditional classroom course delivery mode.

Decomposition Method for Neural Multiclass Classification Problem

In this article we are going to discuss the improvement of the multi classes- classification problem using multi layer Perceptron. The considered approach consists in breaking down the n-class problem into two-classes- subproblems. The training of each two-class subproblem is made independently; as for the phase of test, we are going to confront a vector that we want to classify to all two classes- models, the elected class will be the strongest one that won-t lose any competition with the other classes. Rates of recognition gotten with the multi class-s approach by two-class-s decomposition are clearly better that those gotten by the simple multi class-s approach.

A Remote Sensing Approach for Vulnerability and Environmental Change in Apodi Valley Region, Northeast Brazil

The objective of this study was to improve our understanding of vulnerability and environmental change; it's causes basically show the intensity, its distribution and human-environment effect on the ecosystem in the Apodi Valley Region, This paper is identify, assess and classify vulnerability and environmental change in the Apodi valley region using a combined approach of landscape pattern and ecosystem sensitivity. Models were developed using the following five thematic layers: Geology, geomorphology, soil, vegetation and land use/cover, by means of a Geographical Information Systems (GIS)-based on hydro-geophysical parameters. In spite of the data problems and shortcomings, using ESRI-s ArcGIS 9.3 program, the vulnerability score, to classify, weight and combine a number of 15 separate land cover classes to create a single indicator provides a reliable measure of differences (6 classes) among regions and communities that are exposed to similar ranges of hazards. Indeed, the ongoing and active development of vulnerability concepts and methods have already produced some tools to help overcome common issues, such as acting in a context of high uncertainties, taking into account the dynamics and spatial scale of asocial-ecological system, or gathering viewpoints from different sciences to combine human and impact-based approaches. Based on this assessment, this paper proposes concrete perspectives and possibilities to benefit from existing commonalities in the construction and application of assessment tools.

A Technique for Execution of Written Values on Shared Variables

The current paper conceptualizes the technique of release consistency indispensable with the concept of synchronization that is user-defined. Programming model concreted with object and class is illustrated and demonstrated. The essence of the paper is phases, events and parallel computing execution .The technique by which the values are visible on shared variables is implemented. The second part of the paper consist of user defined high level synchronization primitives implementation and system architecture with memory protocols. There is a proposition of techniques which are core in deciding the validating and invalidating a stall page .

Categorical Missing Data Imputation Using Fuzzy Neural Networks with Numerical and Categorical Inputs

There are many situations where input feature vectors are incomplete and methods to tackle the problem have been studied for a long time. A commonly used procedure is to replace each missing value with an imputation. This paper presents a method to perform categorical missing data imputation from numerical and categorical variables. The imputations are based on Simpson-s fuzzy min-max neural networks where the input variables for learning and classification are just numerical. The proposed method extends the input to categorical variables by introducing new fuzzy sets, a new operation and a new architecture. The procedure is tested and compared with others using opinion poll data.

Dynamic Features Selection for Heart Disease Classification

The healthcare environment is generally perceived as being information rich yet knowledge poor. However, there is a lack of effective analysis tools to discover hidden relationships and trends in data. In fact, valuable knowledge can be discovered from application of data mining techniques in healthcare system. In this study, a proficient methodology for the extraction of significant patterns from the Coronary Heart Disease warehouses for heart attack prediction, which unfortunately continues to be a leading cause of mortality in the whole world, has been presented. For this purpose, we propose to enumerate dynamically the optimal subsets of the reduced features of high interest by using rough sets technique associated to dynamic programming. Therefore, we propose to validate the classification using Random Forest (RF) decision tree to identify the risky heart disease cases. This work is based on a large amount of data collected from several clinical institutions based on the medical profile of patient. Moreover, the experts- knowledge in this field has been taken into consideration in order to define the disease, its risk factors, and to establish significant knowledge relationships among the medical factors. A computer-aided system is developed for this purpose based on a population of 525 adults. The performance of the proposed model is analyzed and evaluated based on set of benchmark techniques applied in this classification problem.

Change Detector Combination in Remotely Sensed Images Using Fuzzy Integral

Decision fusion is one of hot research topics in classification area, which aims to achieve the best possible performance for the task at hand. In this paper, we investigate the usefulness of this concept to improve change detection accuracy in remote sensing. Thereby, outputs of two fuzzy change detectors based respectively on simultaneous and comparative analysis of multitemporal data are fused by using fuzzy integral operators. This method fuses the objective evidences produced by the change detectors with respect to fuzzy measures that express the difference of performance between them. The proposed fusion framework is evaluated in comparison with some ordinary fuzzy aggregation operators. Experiments carried out on two SPOT images showed that the fuzzy integral was the best performing. It improves the change detection accuracy while attempting to equalize the accuracy rate in both change and no change classes.

Implementing an Intuitive Reasoner with a Large Weather Database

In this paper, the implementation of a rule-based intuitive reasoner is presented. The implementation included two parts: the rule induction module and the intuitive reasoner. A large weather database was acquired as the data source. Twelve weather variables from those data were chosen as the “target variables" whose values were predicted by the intuitive reasoner. A “complex" situation was simulated by making only subsets of the data available to the rule induction module. As a result, the rules induced were based on incomplete information with variable levels of certainty. The certainty level was modeled by a metric called "Strength of Belief", which was assigned to each rule or datum as ancillary information about the confidence in its accuracy. Two techniques were employed to induce rules from the data subsets: decision tree and multi-polynomial regression, respectively for the discrete and the continuous type of target variables. The intuitive reasoner was tested for its ability to use the induced rules to predict the classes of the discrete target variables and the values of the continuous target variables. The intuitive reasoner implemented two types of reasoning: fast and broad where, by analogy to human thought, the former corresponds to fast decision making and the latter to deeper contemplation. . For reference, a weather data analysis approach which had been applied on similar tasks was adopted to analyze the complete database and create predictive models for the same 12 target variables. The values predicted by the intuitive reasoner and the reference approach were compared with actual data. The intuitive reasoner reached near-100% accuracy for two continuous target variables. For the discrete target variables, the intuitive reasoner predicted at least 70% as accurately as the reference reasoner. Since the intuitive reasoner operated on rules derived from only about 10% of the total data, it demonstrated the potential advantages in dealing with sparse data sets as compared with conventional methods.

Application of Pattern Search Method to Power System Security Constrained Economic Dispatch

Direct search methods are evolutionary algorithms used to solve optimization problems. (DS) methods do not require any information about the gradient of the objective function at hand while searching for an optimum solution. One of such methods is Pattern Search (PS) algorithm. This paper presents a new approach based on a constrained pattern search algorithm to solve a security constrained power system economic dispatch problem (SCED). Operation of power systems demands a high degree of security to keep the system satisfactorily operating when subjected to disturbances, while and at the same time it is required to pay attention to the economic aspects. Pattern recognition technique is used first to assess dynamic security. Linear classifiers that determine the stability of electric power system are presented and added to other system stability and operational constraints. The problem is formulated as a constrained optimization problem in a way that insures a secure-economic system operation. Pattern search method is then applied to solve the constrained optimization formulation. In particular, the method is tested using one system. Simulation results of the proposed approach are compared with those reported in literature. The outcome is very encouraging and proves that pattern search (PS) is very applicable for solving security constrained power system economic dispatch problem (SCED).

Unit Selection Algorithm Using Bi-grams Model For Corpus-Based Speech Synthesis

In this paper, we present a novel statistical approach to corpus-based speech synthesis. Classically, phonetic information is defined and considered as acoustic reference to be respected. In this way, many studies were elaborated for acoustical unit classification. This type of classification allows separating units according to their symbolic characteristics. Indeed, target cost and concatenation cost were classically defined for unit selection. In Corpus-Based Speech Synthesis System, when using large text corpora, cost functions were limited to a juxtaposition of symbolic criteria and the acoustic information of units is not exploited in the definition of the target cost. In this manuscript, we token in our consideration the unit phonetic information corresponding to acoustic information. This would be realized by defining a probabilistic linguistic Bi-grams model basically used for unit selection. The selected units would be extracted from the English TIMIT corpora.

Exons and Introns Classification in Human and Other Organisms

In the paper, the relative performances on spectral classification of short exon and intron sequences of the human and eleven model organisms is studied. In the simulations, all combinations of sixteen one-sequence numerical representations, four threshold values, and four window lengths are considered. Sequences of 150-base length are chosen and for each organism, a total of 16,000 sequences are used for training and testing. Results indicate that an appropriate combination of one-sequence numerical representation, threshold value, and window length is essential for arriving at top spectral classification results. For fixed-length sequences, the precisions on exon and intron classification obtained for different organisms are not the same because of their genomic differences. In general, precision increases as sequence length increases.

Seismic Vulnerability Assessment of Buildings in Algiers Area

Several models of vulnerability assessment have been proposed. The selection of one of these models depends on the objectives of the study. The classical methodologies for seismic vulnerability analysis, as a part of seismic risk analysis, have been formulated with statistical criteria based on a rapid observation. The information relating to the buildings performance is statistically elaborated. In this paper, we use the European Macroseismic Scale EMS-98 to define the relationship between damage and macroseismic intensity to assess the seismic vulnerability. Applying to Algiers area, the first step is to identify building typologies and to assign vulnerability classes. In the second step, damages are investigated according to EMS-98.