Implementation of IEEE 802.15.4 Packet Analyzer

A packet analyzer is a tool for debugging sensor network systems and is convenient for developers. In this paper, we introduce a new packet analyzer based on an embedded system. The proposed packet analyzer is compatible with IEEE 802.15.4, which is suitable for the wireless communication standard for sensor networks, and is available for remote control by adopting a server-client scheme based on the Ethernet interface. To confirm the operations of the packet analyzer, we have developed two types of sensor nodes based on PIC4620 and ATmega128L microprocessors and tested the functions of the proposed packet analyzer by obtaining the packets from the sensor nodes.

An Experimental Study of a Self-Supervised Classifier Ensemble

Learning using labeled and unlabelled data has received considerable amount of attention in the machine learning community due its potential in reducing the need for expensive labeled data. In this work we present a new method for combining labeled and unlabeled data based on classifier ensembles. The model we propose assumes each classifier in the ensemble observes the input using different set of features. Classifiers are initially trained using some labeled samples. The trained classifiers learn further through labeling the unknown patterns using a teaching signals that is generated using the decision of the classifier ensemble, i.e. the classifiers self-supervise each other. Experiments on a set of object images are presented. Our experiments investigate different classifier models, different fusing techniques, different training sizes and different input features. Experimental results reveal that the proposed self-supervised ensemble learning approach reduces classification error over the single classifier and the traditional ensemble classifier approachs.

A Proposed Trust Model for the Semantic Web

A serious problem on the WWW is finding reliable information. Not everything found on the Web is true and the Semantic Web does not change that in any way. The problem will be even more crucial for the Semantic Web, where agents will be integrating and using information from multiple sources. Thus, if an incorrect premise is used due to a single faulty source, then any conclusions drawn may be in error. Thus, statements published on the Semantic Web have to be seen as claims rather than as facts, and there should be a way to decide which among many possibly inconsistent sources is most reliable. In this work, we propose a trust model for the Semantic Web. The proposed model is inspired by the use trust in human society. Trust is a type of social knowledge and encodes evaluations about which agents can be taken as reliable sources of information or services. Our proposed model allows agents to decide which among different sources of information to trust and thus act rationally on the semantic web.

SUPAR: System for User-Centric Profiling of Association Rules in Streaming Data

With a surge of stream processing applications novel techniques are required for generation and analysis of association rules in streams. The traditional rule mining solutions cannot handle streams because they generally require multiple passes over the data and do not guarantee the results in a predictable, small time. Though researchers have been proposing algorithms for generation of rules from streams, there has not been much focus on their analysis. We propose Association rule profiling, a user centric process for analyzing association rules and attaching suitable profiles to them depending on their changing frequency behavior over a previous snapshot of time in a data stream. Association rule profiles provide insights into the changing nature of associations and can be used to characterize the associations. We discuss importance of characteristics such as predictability of linkages present in the data and propose metric to quantify it. We also show how association rule profiles can aid in generation of user specific, more understandable and actionable rules. The framework is implemented as SUPAR: System for Usercentric Profiling of Association Rules in streaming data. The proposed system offers following capabilities: i) Continuous monitoring of frequency of streaming item-sets and detection of significant changes therein for association rule profiling. ii) Computation of metrics for quantifying predictability of associations present in the data. iii) User-centric control of the characterization process: user can control the framework through a) constraint specification and b) non-interesting rule elimination.

Mining Sequential Patterns Using I-PrefixSpan

In this paper, we propose an improvement of pattern growth-based PrefixSpan algorithm, called I-PrefixSpan. The general idea of I-PrefixSpan is to use sufficient data structure for Seq-Tree framework and separator database to reduce the execution time and memory usage. Thus, with I-PrefixSpan there is no in-memory database stored after index set is constructed. The experimental result shows that using Java 2, this method improves the speed of PrefixSpan up to almost two orders of magnitude as well as the memory usage to more than one order of magnitude.

Text Mining Technique for Data Mining Application

Text Mining is around applying knowledge discovery techniques to unstructured text is termed knowledge discovery in text (KDT), or Text data mining or Text Mining. In decision tree approach is most useful in classification problem. With this technique, tree is constructed to model the classification process. There are two basic steps in the technique: building the tree and applying the tree to the database. This paper describes a proposed C5.0 classifier that performs rulesets, cross validation and boosting for original C5.0 in order to reduce the optimization of error ratio. The feasibility and the benefits of the proposed approach are demonstrated by means of medial data set like hypothyroid. It is shown that, the performance of a classifier on the training cases from which it was constructed gives a poor estimate by sampling or using a separate test file, either way, the classifier is evaluated on cases that were not used to build and evaluate the classifier are both are large. If the cases in hypothyroid.data and hypothyroid.test were to be shuffled and divided into a new 2772 case training set and a 1000 case test set, C5.0 might construct a different classifier with a lower or higher error rate on the test cases. An important feature of see5 is its ability to classifiers called rulesets. The ruleset has an error rate 0.5 % on the test cases. The standard errors of the means provide an estimate of the variability of results. One way to get a more reliable estimate of predictive is by f-fold –cross- validation. The error rate of a classifier produced from all the cases is estimated as the ratio of the total number of errors on the hold-out cases to the total number of cases. The Boost option with x trials instructs See5 to construct up to x classifiers in this manner. Trials over numerous datasets, large and small, show that on average 10-classifier boosting reduces the error rate for test cases by about 25%.

Efficient Implementation of Serial and Parallel Support Vector Machine Training with a Multi-Parameter Kernel for Large-Scale Data Mining

This work deals with aspects of support vector learning for large-scale data mining tasks. Based on a decomposition algorithm that can be run in serial and parallel mode we introduce a data transformation that allows for the usage of an expensive generalized kernel without additional costs. In order to speed up the decomposition algorithm we analyze the problem of working set selection for large data sets and analyze the influence of the working set sizes onto the scalability of the parallel decomposition scheme. Our modifications and settings lead to improvement of support vector learning performance and thus allow using extensive parameter search methods to optimize classification accuracy.

Evolving Neural Networks using Moment Method for Handwritten Digit Recognition

This paper proposes a neural network weights and topology optimization using genetic evolution and the backpropagation training algorithm. The proposed crossover and mutation operators aims to adapt the networks architectures and weights during the evolution process. Through a specific inheritance procedure, the weights are transmitted from the parents to their offsprings, which allows re-exploitation of the already trained networks and hence the acceleration of the global convergence of the algorithm. In the preprocessing phase, a new feature extraction method is proposed based on Legendre moments with the Maximum entropy principle MEP as a selection criterion. This allows a global search space reduction in the design of the networks. The proposed method has been applied and tested on the well known MNIST database of handwritten digits.

A Novel Approach to Persian Online Hand Writing Recognition

Persian (Farsi) script is totally cursive and each character is written in several different forms depending on its former and later characters in the word. These complexities make automatic handwriting recognition of Persian a very hard problem and there are few contributions trying to work it out. This paper presents a novel practical approach to online recognition of Persian handwriting which is based on representation of inputs and patterns with very simple visual features and comparison of these simple terms. This recognition approach is tested over a set of Persian words and the results have been quite acceptable when the possible words where unknown and they were almost all correct in cases that the words where chosen from a prespecified list.

FPGA Implementation of the “PYRAMIDS“ Block Cipher

The “PYRAMIDS" Block Cipher is a symmetric encryption algorithm of a 64, 128, 256-bit length, that accepts a variable key length of 128, 192, 256 bits. The algorithm is an iterated cipher consisting of repeated applications of a simple round transformation with different operations and different sequence in each round. The algorithm was previously software implemented in Cµ code. In this paper, a hardware implementation of the algorithm, using Field Programmable Gate Arrays (FPGA), is presented. In this work, we discuss the algorithm, the implemented micro-architecture, and the simulation and implementation results. Moreover, we present a detailed comparison with other implemented standard algorithms. In addition, we include the floor plan as well as the circuit diagrams of the various micro-architecture modules.

Contribution to the Query Optimization in the Object-Oriented Databases

Appeared toward 1986, the object-oriented databases management systems had not known successes knew five years after their birth. One of the major difficulties is the query optimization. We propose in this paper a new approach that permits to enrich techniques of query optimization existing in the object-oriented databases. Seen success that knew the query optimization in the relational model, our approach inspires itself of these optimization techniques and enriched it so that they can support the new concepts introduced by the object databases.

Incorporation of Long-Term Redundancy in ECG Time Domain Compression Methods through Curve Simplification and Block-Sorting

We suggest a novel method to incorporate longterm redundancy (LTR) in signal time domain compression methods. The proposition is based on block-sorting and curve simplification. The proposition is illustrated on the ECG signal as a post-processor for the FAN method. Test applications on the new so-obtained FAN+ method using the MIT-BIH database show substantial improvement of the compression ratio-distortion behavior for a higher quality reconstructed signal.

Novel Ridge Orientation Based Approach for Fingerprint Identification Using Co-Occurrence Matrix

In this paper we use the property of co-occurrence matrix in finding parallel lines in binary pictures for fingerprint identification. In our proposed algorithm, we reduce the noise by filtering the fingerprint images and then transfer the fingerprint images to binary images using a proper threshold. Next, we divide the binary images into some regions having parallel lines in the same direction. The lines in each region have a specific angle that can be used for comparison. This method is simple, performs the comparison step quickly and has a good resistance in the presence of the noise.

The Relevance of Data Warehousing and Data Mining in the Field of Evidence-based Medicine to Support Healthcare Decision Making

Evidence-based medicine is a new direction in modern healthcare. Its task is to prevent, diagnose and medicate diseases using medical evidence. Medical data about a large patient population is analyzed to perform healthcare management and medical research. In order to obtain the best evidence for a given disease, external clinical expertise as well as internal clinical experience must be available to the healthcare practitioners at right time and in the right manner. External evidence-based knowledge can not be applied directly to the patient without adjusting it to the patient-s health condition. We propose a data warehouse based approach as a suitable solution for the integration of external evidence-based data sources into the existing clinical information system and data mining techniques for finding appropriate therapy for a given patient and a given disease. Through integration of data warehousing, OLAP and data mining techniques in the healthcare area, an easy to use decision support platform, which supports decision making process of care givers and clinical managers, is built. We present three case studies, which show, that a clinical data warehouse that facilitates evidence-based medicine is a reliable, powerful and user-friendly platform for strategic decision making, which has a great relevance for the practice and acceptance of evidence-based medicine.

Searching for Similar Informational Articles in the Internet Channel

In terms of total online audience, newspapers are the most successful form of online content to date. The online audience for newspapers continues to demand higher-quality services, including personalized news services. News providers should be able to offer suitable users appropriate content. In this paper, a news article recommender system is suggested based on a user-s preference when he or she visits an Internet news site and reads the published articles. This system helps raise the user-s satisfaction, increase customer loyalty toward the content provider.

SMaTTS: Standard Malay Text to Speech System

This paper presents a rule-based text- to- speech (TTS) Synthesis System for Standard Malay, namely SMaTTS. The proposed system using sinusoidal method and some pre- recorded wave files in generating speech for the system. The use of phone database significantly decreases the amount of computer memory space used, thus making the system very light and embeddable. The overall system was comprised of two phases the Natural Language Processing (NLP) that consisted of the high-level processing of text analysis, phonetic analysis, text normalization and morphophonemic module. The module was designed specially for SM to overcome few problems in defining the rules for SM orthography system before it can be passed to the DSP module. The second phase is the Digital Signal Processing (DSP) which operated on the low-level process of the speech waveform generation. A developed an intelligible and adequately natural sounding formant-based speech synthesis system with a light and user-friendly Graphical User Interface (GUI) is introduced. A Standard Malay Language (SM) phoneme set and an inclusive set of phone database have been constructed carefully for this phone-based speech synthesizer. By applying the generative phonology, a comprehensive letter-to-sound (LTS) rules and a pronunciation lexicon have been invented for SMaTTS. As for the evaluation tests, a set of Diagnostic Rhyme Test (DRT) word list was compiled and several experiments have been performed to evaluate the quality of the synthesized speech by analyzing the Mean Opinion Score (MOS) obtained. The overall performance of the system as well as the room for improvements was thoroughly discussed.

Object-Oriented Simulation of Simulating Anticipatory Systems

The present paper is oriented to problems of simulation of anticipatory systems, namely those that use simulation models for the aid of anticipation. A certain analogy between use of simulation and imagining will be applied to make the explication more comprehensible. The paper will be completed by notes of problems and by some existing applications. The problems consist in the fact that simulation of the mentioned anticipatory systems end is simulation of simulating systems, i.e. in computer models handling two or more modeled time axes that should be mapped to real time flow in a nondescent manner. Languages oriented to objects, processes and blocks can be used to surmount the problems.

Using Heuristic Rules from Sentence Decomposition of Experts- Summaries to Detect Students- Summarizing Strategies

Summarizing skills have been introduced to English syllabus in secondary school in Malaysia to evaluate student-s comprehension for a given text where it requires students to employ several strategies to produce the summary. This paper reports on our effort to develop a computer-based summarization assessment system that detects the strategies used by the students in producing their summaries. Sentence decomposition of expert-written summaries is used to analyze how experts produce their summary sentences. From the analysis, we identified seven summarizing strategies and their rules which are then transformed into a set of heuristic rules on how to determine the summarizing strategies. We developed an algorithm based on the heuristic rules and performed some experiments to evaluate and support the technique proposed.

An Artificial Immune System for a Multi Agent Robotics System

This paper explores an application of an adaptive learning mechanism for robots based on the natural immune system. Most of the research carried out so far are based either on the innate or adaptive characteristics of the immune system, we present a combination of these to achieve behavior arbitration wherein a robot learns to detect vulnerable areas of a track and adapts to the required speed over such portions. The test bed comprises of two Lego robots deployed simultaneously on two predefined near concentric tracks with the outer robot capable of helping the inner one when it misaligns. The helper robot works in a damage-control mode by realigning itself to guide the other robot back onto its track. The panic-stricken robot records the conditions under which it was misaligned and learns to detect and adapt under similar conditions thereby making the overall system immune to such failures.

Application of Extreme Learning Machine Method for Time Series Analysis

In this paper, we study the application of Extreme Learning Machine (ELM) algorithm for single layered feedforward neural networks to non-linear chaotic time series problems. In this algorithm the input weights and the hidden layer bias are randomly chosen. The ELM formulation leads to solving a system of linear equations in terms of the unknown weights connecting the hidden layer to the output layer. The solution of this general system of linear equations will be obtained using Moore-Penrose generalized pseudo inverse. For the study of the application of the method we consider the time series generated by the Mackey Glass delay differential equation with different time delays, Santa Fe A and UCR heart beat rate ECG time series. For the choice of sigmoid, sin and hardlim activation functions the optimal values for the memory order and the number of hidden neurons which give the best prediction performance in terms of root mean square error are determined. It is observed that the results obtained are in close agreement with the exact solution of the problems considered which clearly shows that ELM is a very promising alternative method for time series prediction.