A Preference-Based Multi-Agent Data Mining Framework for Social Network Service Users' Decision Making

Multi-Agent Systems (MAS) emerged in the pursuit to improve our standard of living, and hence can manifest complex human behaviors such as communication, decision making, negotiation and self-organization. Social Network Services (SNSs) have attracted millions of users, many of whom have integrated these sites into their daily practices. The domains of MAS and SNS have many similarities in architecture, features and functions. Exploring social network users' behavior through a multi-agent model is therefore our research focus, in order to generate more accurate and meaningful information for SNS users. An application of MAS is the e-Auction and e-Rental services of the Universiti Cyber AgenT (UniCAT), a social network for students at Universiti Tunku Abdul Rahman (UTAR), Kampar, Malaysia, built around the Belief-Desire-Intention (BDI) model. However, in spite of its various advantages, the BDI model has also been found to have some shortcomings. This paper therefore proposes a multi-agent framework utilizing a modified BDI model, Belief-Desire-Intention in Dynamic and Uncertain Situations (BDIDUS), using the UniCAT system as a case study.

On the Efficient Implementation of a Serial and Parallel Decomposition Algorithm for Fast Support Vector Machine Training Including a Multi-Parameter Kernel

This work deals with aspects of support vector machine learning for large-scale data mining tasks. Based on a decomposition algorithm for support vector machine training that can be run in serial as well as in shared-memory parallel mode, we introduce a transformation of the training data that allows an expensive generalized kernel to be used at no additional cost. We present experiments for the Gaussian kernel, but other kernel functions can be used as well. To further speed up the decomposition algorithm, we analyze the critical problem of working set selection for large training data sets. In addition, we analyze the influence of the working set size on the scalability of the parallel decomposition scheme. Our tests and conclusions led to several modifications of the algorithm and improved the overall support vector machine learning performance. Our method allows extensive parameter search methods to be used to optimize classification accuracy.
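As an illustration of how a data transformation can absorb a multi-parameter kernel, the sketch below assumes a Gaussian kernel with one width per feature, exp(-sum_i g_i (x_i - y_i)^2). Under that assumption, rescaling every feature once by sqrt(g_i) lets the plain Gaussian kernel reproduce the same values. The abstract does not state the paper's actual transformation, so the kernel form, the function names and the `gammas` values are all hypothetical.

```python
import math

def generalized_gaussian(x, y, gammas):
    """Gaussian kernel with one width parameter per feature (assumed form)."""
    return math.exp(-sum(g * (a - b) ** 2 for g, a, b in zip(gammas, x, y)))

def rescale(x, gammas):
    """Pre-scale each feature by sqrt(gamma_i); done once per training point."""
    return [math.sqrt(g) * a for g, a in zip(gammas, x)]

def plain_gaussian(x, y):
    """Standard Gaussian kernel with unit width."""
    return math.exp(-sum((a - b) ** 2 for a, b in zip(x, y)))

# Hypothetical data and per-feature widths, for illustration only.
x, y = [1.0, 2.0, 3.0], [0.5, 2.5, 1.0]
gammas = [0.1, 2.0, 0.5]

direct = generalized_gaussian(x, y, gammas)
via_scaling = plain_gaussian(rescale(x, gammas), rescale(y, gammas))
```

Both values agree up to rounding, because (sqrt(g)·a - sqrt(g)·b)^2 = g·(a - b)^2; the per-feature parameters then cost nothing extra during kernel evaluation.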

Ozone Decomposition over Silver-Loaded Perlite

The Bulgarian natural expanded mineral perlite, obtained from Bentonite AD (a deposit of "The Broken Mountain" for perlite mining, near the village of Vodenicharsko, in the municipality of Djebel), was loaded with silver both in ionic form (Ag+, 2 and 5 wt%, by the incipient wetness impregnation method) and as atomic silver (Ag0, using Tollens' reagent in the silver mirror reaction). Physicochemical characterization of the samples is provided by DC arc-AES, XRD, DR-IR and UV-VIS. The aim of this work was to obtain and test the silver-loaded catalyst for ozone decomposition. The samples loaded with atomic silver show ca. 80% conversion of ozone 20 minutes after the reaction starts. The conversion then decreases to ca. 20% but remains stable over time.

Portable Virtual Piano Design

The purpose of this study is to design a portable virtual piano. By utilizing optical fiber gloves and the virtual piano software designed by this study, the user can play the piano anywhere at any time. This virtual piano consists of three major parts: finger tapping identification, hand movement and positioning identification, and MIDI software sound effect simulation. To play the virtual piano, the user wears optical fiber gloves and simulates piano key tapping motions. The finger bending information detected by the optical fiber gloves can tell when piano key tapping motions are made. Images captured by a video camera are analyzed, hand locations and moving directions are positioned, and the corresponding scales are found. The system integrates finger tapping identification with information about hand placement in relation to corresponding piano key positions, and generates MIDI piano sound effects based on this data. This experiment shows that the proposed method achieves an accuracy rate of 95% for determining when a piano key is tapped.

A Brain Inspired Approach for Multi-View Patterns Identification

Biologically, the human brain processes information in both unimodal and multimodal ways. Information is progressively abstracted and seamlessly fused, and the fusion of multimodal inputs allows a holistic understanding of a problem. The proliferation of technology has produced an exponentially growing variety of data sources, which can be likened to the state of multimodality in the human brain. This is the inspiration for developing a methodology for exploring multimodal data and identifying multi-view patterns. Specifically, we propose a brain-inspired conceptual model that allows exploration and identification of patterns at different levels of granularity, different types of hierarchies and different types of modalities. A structurally adaptive neural network is deployed to implement the proposed model. Furthermore, the acquisition of multi-view patterns with the proposed model is demonstrated and discussed with some experimental results.

Determining of Threshold Levels of Burst by Burst AQAM/CDMA in Slow Rayleigh Fading Environments

In this paper, we determine the threshold levels of adaptive modulation in a burst-by-burst CDMA system using a suboptimum method that attempts to increase the average bits per symbol (BPS) rate of the transceiver system by switching between different modulation modes under variable channel conditions. In this method, we choose the minimum values of average bit error rate (BER) and the maximum values of average BPS at different values of average channel signal-to-noise ratio (SNR) and then calculate the corresponding threshold levels, so that when the instantaneous SNR increases, a higher-order modulation is employed to increase throughput and, vice versa, when the instantaneous SNR decreases, a lower-order modulation is employed to improve the BER. In the transmission step, according to a comparison between the estimate obtained from pilot symbols and the set of suboptimum threshold levels, the system chooses one of the states no transmission, BPSK, 4QAM and square 16QAM for modulating the data. The channel assumed in this paper is a slow Rayleigh fading channel.
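The switching rule described above can be sketched as a lookup against an ordered list of SNR thresholds. The threshold values below are hypothetical placeholders (the abstract does not give the suboptimum levels the paper derives), and `select_mode` and `average_bps` are illustrative names.

```python
import bisect

# Hypothetical switching thresholds in dB, lowest to highest.
# The paper's own suboptimum levels are not given in the abstract.
THRESHOLDS_DB = [5.0, 10.0, 17.0]
MODES = ["no transmission", "BPSK", "4QAM", "16QAM"]
BITS_PER_SYMBOL = {"no transmission": 0, "BPSK": 1, "4QAM": 2, "16QAM": 4}

def select_mode(snr_db, thresholds=THRESHOLDS_DB):
    """Pick the modulation mode for one burst from its estimated SNR."""
    # bisect_right counts how many thresholds the SNR has reached.
    return MODES[bisect.bisect_right(thresholds, snr_db)]

def average_bps(snr_samples_db):
    """Average bits per symbol over a sequence of bursts."""
    modes = [select_mode(s) for s in snr_samples_db]
    return sum(BITS_PER_SYMBOL[m] for m in modes) / len(modes)

# One SNR estimate per burst, e.g. obtained from pilot symbols.
bursts = [3.0, 7.0, 12.0, 20.0]
chosen = [select_mode(s) for s in bursts]
```

Raising a threshold trades throughput for a lower BER in the corresponding mode, which is the trade-off the threshold search has to balance.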

Web Traffic Mining using Neural Networks

With the explosive growth of data available on the Internet, personalization of this information space has become a necessity. With the rapidly increasing popularity of the WWW, Websites play a crucial role in conveying knowledge and information to end users. Discovering hidden and meaningful information about Web users' usage patterns is critical for determining effective marketing strategies and for optimizing Web server usage to accommodate future growth. The task of mining useful information becomes more challenging when the Web traffic volume is enormous and keeps growing. In this paper, we propose an intelligent model to discover and analyze useful knowledge from the available Web log data.

A Modified Fuzzy C-Means Algorithm for Natural Data Exploration

In data mining, fuzzy clustering algorithms have demonstrated an advantage over crisp clustering algorithms in dealing with the challenges posed by large collections of vague and uncertain natural data. This paper reviews the concepts of fuzzy logic and fuzzy clustering. The classical fuzzy c-means algorithm is presented and its limitations are highlighted. Based on a study of the fuzzy c-means algorithm and its extensions, we propose a modification to the c-means algorithm that overcomes its limitations in calculating the new cluster centers and in finding the membership values for natural data. The efficiency of the modified method is demonstrated on real data collected for Bhutan's Gross National Happiness (GNH) program.
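For reference, the two update steps of the classical fuzzy c-means algorithm (membership values and cluster centers) can be sketched as follows. This is the textbook algorithm only, not the modification proposed in the paper; the data, the initial centers and the fuzzifier m = 2 are illustrative assumptions.

```python
import math

def memberships(points, centers, m=2.0):
    """Classical FCM membership of each point in each cluster (one row per point)."""
    u = []
    for p in points:
        d = [math.dist(p, c) for c in centers]
        if 0.0 in d:
            # The point coincides with a center: split full membership among those.
            hits = [1.0 if di == 0.0 else 0.0 for di in d]
            row = [h / sum(hits) for h in hits]
        else:
            row = [1.0 / sum((di / dk) ** (2.0 / (m - 1.0)) for dk in d) for di in d]
        u.append(row)
    return u

def update_centers(points, u, m=2.0):
    """New centers: mean of the points weighted by membership to the power m."""
    dim = len(points[0])
    centers = []
    for i in range(len(u[0])):
        w = [row[i] ** m for row in u]
        total = sum(w)
        centers.append([sum(wj * p[k] for wj, p in zip(w, points)) / total
                        for k in range(dim)])
    return centers

def fuzzy_c_means(points, centers, m=2.0, iterations=50):
    """Alternate the two updates for a fixed number of iterations."""
    for _ in range(iterations):
        centers = update_centers(points, memberships(points, centers, m), m)
    return centers, memberships(points, centers, m)

# Illustrative one-dimensional data with two obvious groups.
data = [[0.0], [0.2], [0.1], [5.0], [5.2], [5.1]]
final_centers, final_u = fuzzy_c_means(data, centers=[[1.0], [4.0]])
```

Each row of `final_u` sums to 1, so a point between the two groups would receive partial membership in both rather than a crisp assignment.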

Applying Fuzzy FP-Growth to Mine Fuzzy Association Rules

In data mining, association rules are used to find associations between the different items in a transaction database. As data is collected and stored, valuable rules can be found through association rule mining and applied to help managers execute marketing strategies and establish sound market frameworks. This paper uses Fuzzy Frequent Pattern growth (FFP-growth) to derive fuzzy association rules. First, we apply fuzzy partition methods and determine a membership function over the quantitative value of each transaction item. Next, we apply FFP-growth to carry out the mining process. In addition, to understand the impact of the Apriori algorithm and the FFP-growth algorithm on execution time and on the number of generated association rules, experiments are performed with databases of different sizes and with different thresholds. Finally, the experimental results show that the FFP-growth algorithm is more efficient than other existing methods.
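The fuzzy partition step can be illustrated with triangular membership functions that map a purchased quantity to degrees of "Low", "Middle" and "High". The region boundaries, item names and transactions below are hypothetical; the abstract does not specify which membership functions the paper uses.

```python
def triangular(x, a, b, c):
    """Triangular membership: 0 at a and c, rising to 1 at the peak b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)

# Hypothetical fuzzy regions over quantities 0..10, given as (a, b, c).
REGIONS = {"Low": (-5, 0, 5), "Middle": (0, 5, 10), "High": (5, 10, 15)}

def fuzzify(quantity):
    """Degree of membership of a quantity in every fuzzy region."""
    return {name: triangular(quantity, *abc) for name, abc in REGIONS.items()}

def fuzzy_support(transactions, item, region):
    """Sum of membership degrees for (item, region), divided by the database size."""
    total = sum(fuzzify(t[item])[region] for t in transactions if item in t)
    return total / len(transactions)

# Illustrative transactions: item -> purchased quantity.
transactions = [{"milk": 3, "bread": 8}, {"milk": 5}, {"bread": 10}]
milk_middle = fuzzy_support(transactions, "milk", "Middle")
```

A frequent-pattern miner then works with these fractional supports in place of the 0/1 counts used for crisp items.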

Discovery of Quantified Hierarchical Production Rules from Large Set of Discovered Rules

Automated discovery of rules is, due to its applicability, one of the most fundamental and important methods in KDD. It has been an active research area in the recent past. Hierarchical representation allows us to easily manage the complexity of knowledge, to view the knowledge at different levels of detail, and to focus our attention on the interesting aspects only. One such efficient and easy-to-understand system is the Hierarchical Production Rules (HPRs) system. An HPR, a standard production rule augmented with generality and specificity information, is of the following form: Decision If <condition> Generality Specificity. HPR systems are capable of handling taxonomical structures inherent in knowledge about the real world. This paper focuses on the issue of mining quantified rules with a crisp hierarchical structure using a Genetic Programming (GP) approach to knowledge discovery. The post-processing scheme presented in this work uses quantified production rules as the initial individuals of GP and discovers the hierarchical structure. In the proposed approach, rules are quantified using Dempster-Shafer theory. Suitable genetic operators are proposed for the suggested encoding. Based on the Subsumption Matrix (SM), an appropriate fitness function is suggested. Finally, Quantified Hierarchical Production Rules (HPRs) are generated from the discovered hierarchy using Dempster-Shafer theory. Experimental results are presented to demonstrate the performance of the proposed algorithm.

Content Based Sampling over Transactional Data Streams

This paper investigates the problem of sampling from transactional data streams. We introduce CFISDS, a content-based sampling algorithm that works on a landmark window model of data streams and preserves a more informative sample in the sample space. The algorithm, which is based on closed frequent itemset mining, first initializes a concept lattice using initial data and then updates the lattice structure using an incremental mechanism. The incremental mechanism inserts, updates and deletes nodes in the concept lattice in batches. The algorithm extracts the final samples on user demand. Experimental results show the accuracy of CFISDS on synthetic and real datasets, although CFISDS is not faster than existing sampling algorithms such as Z and DSS.

Constraint Based Frequent Pattern Mining Technique for Solving GCS Problem

The Generalized Center String (GCS) problem generalizes the Common Approximate Substring and Common Substring problems. GCS is known to be NP-hard, and the difficulty lies in the explosion of potential candidates. The longest center string must be found without knowing in advance which sequences may not contain any motif, as is the case in any particular biological gene process. GCS can be solved by frequent pattern mining techniques and is known to be fixed-parameter tractable for a fixed input sequence length and symbol set size. Efficient methods known as Bpriori algorithms can solve GCS with reasonable time/space complexity. The Bpriori 2 and Bpriori 3-2 algorithms have been proposed to find center strings of any length and the positions of all their instances in the input sequences. In this paper, we reduce the time/space complexity of the Bpriori algorithm with the Constraint Based Frequent Pattern (CBFP) mining technique, which integrates the ideas of constraint-based mining and FP-tree mining. The CBFP mining technique solves the GCS problem not only for center strings of any length, but also for the positions of all their mutated copies in the input sequences. It constructs a trie-like FP-tree to represent the mutated copies of center strings of any length, along with constraints that restrain the growth of the consensus tree. The complexity of the CBFP mining technique and of the Bpriori algorithm is analyzed for the worst case and the average case. The correctness of the algorithm, compared with the Bpriori algorithm, is shown using artificial data.

Parallel and Distributed Mining of Association Rule on Knowledge Grid

In a virtual organization, a Knowledge Discovery (KD) service contains distributed data resources and computing grid nodes. A computational grid is integrated with a data grid to form a Knowledge Grid, which implements the Apriori algorithm for mining association rules on a grid network. This paper describes the development of a parallel and distributed version of the Apriori algorithm on the Globus Toolkit using the Message Passing Interface extended with Grid Services (MPICH-G2). The Knowledge Grid is created on top of the data and computational grids to support decision making in real-time applications. The case study describes the design and implementation of local and global mining of frequent item sets. The experiments were conducted on different configurations of the grid network, and the computation time was recorded for each operation. We analyzed our results for the various grid configurations; they show that the speedup in computation time is almost superlinear.
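The split between local and global mining can be illustrated with a simple count-then-aggregate scheme: each node counts itemsets in its own partition, and the local counts are summed to decide global frequency. This is a single-process sketch under simplifying assumptions. It enumerates all k-itemsets instead of using Apriori candidate pruning, it replaces message passing with a plain loop, and the function names and data are hypothetical.

```python
from collections import Counter
from itertools import combinations

def local_counts(partition, k):
    """Count every k-itemset occurring in one node's local transactions."""
    counts = Counter()
    for transaction in partition:
        for itemset in combinations(sorted(set(transaction)), k):
            counts[itemset] += 1
    return counts

def global_frequent(partitions, k, min_support):
    """Sum the local counts of all nodes and keep itemsets meeting min_support."""
    total = Counter()
    for partition in partitions:  # on a grid, each partition is counted on its own node
        total.update(local_counts(partition, k))
    n = sum(len(p) for p in partitions)
    return {s: c for s, c in total.items() if c / n >= min_support}

# Two hypothetical grid nodes, each holding part of the transaction database.
node_a = [["a", "b", "c"], ["a", "b"]]
node_b = [["a", "b"], ["b", "c"]]
frequent_pairs = global_frequent([node_a, node_b], k=2, min_support=0.5)
```

Only the small count tables need to be exchanged between nodes, not the transactions themselves, which is what makes this style of aggregation attractive for distributed data.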

Determining Cluster Boundaries Using Particle Swarm Optimization

The self-organizing map (SOM) is a well-known data reduction technique used in data mining. Data visualization can reveal structure in data sets that is otherwise hard to detect from raw data alone. However, interpretation through visual inspection is prone to errors and can be very tedious. There are several techniques for the automatic detection of clusters of code vectors found by SOMs, but they generally do not take into account the distribution of code vectors; this may lead to unsatisfactory clustering and poor definition of cluster boundaries, particularly where the density of data points is low. In this paper, we propose the use of a generic particle swarm optimization (PSO) algorithm for finding cluster boundaries directly from the code vectors obtained from SOMs. The application of our method to unlabeled call data from a mobile phone operator demonstrates its feasibility. The PSO algorithm utilizes the U-matrix of the SOM to determine cluster boundaries; the results of this novel automatic method correspond well to boundary detection through visual inspection of code vectors and through the k-means algorithm.
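The U-matrix mentioned above can be computed directly from the SOM code vectors. The sketch below assumes a rectangular grid with 4-neighborhoods and uses a tiny hand-made codebook; it covers only the U-matrix itself, not the PSO search the paper applies to it.

```python
import math

def u_matrix(codebook):
    """Average distance from each node's code vector to its grid neighbors.

    codebook[r][c] is the code vector of the SOM node at row r, column c.
    """
    rows, cols = len(codebook), len(codebook[0])
    result = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            neighbors = ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
            dists = [math.dist(codebook[r][c], codebook[nr][nc])
                     for nr, nc in neighbors
                     if 0 <= nr < rows and 0 <= nc < cols]
            # A 1x1 map has no neighbors; report 0.0 in that case.
            result[r][c] = sum(dists) / len(dists) if dists else 0.0
    return result

# Hypothetical 1x4 map whose code vectors form two tight groups.
codebook = [[[0.0], [0.1], [5.0], [5.1]]]
u = u_matrix(codebook)
```

High U-matrix values (the two middle nodes here) mark regions where neighboring code vectors are far apart, which is where a cluster boundary is expected.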

A Degraded Practical MIMOME Channel: Issues in Secret Data Communications

In this paper, a Gaussian multiple input multiple output multiple eavesdropper (MIMOME) channel is considered, where a transmitter communicates with a receiver in the presence of an eavesdropper. We present a technique for determining the secrecy capacity of the multiple input multiple output (MIMO) channel under Gaussian noise. We transform the degraded MIMOME channel into multiple single input multiple output (SIMO) Gaussian wire-tap channels and then use a scalar approach to convert it into two equivalent multiple input single output (MISO) channels. The secrecy capacity model is then developed for the condition where the channel state information (CSI) of the main channel only is known to the transmitter. The results show that secret communication is possible when the eavesdropper channel noise is greater than a cutoff noise level. The outage probability of the secrecy capacity and the effect of fading are also analyzed.
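The cutoff behavior can be seen in the simplest textbook setting, the scalar real Gaussian wire-tap channel, where the secrecy capacity is the difference between the main-channel and eavesdropper-channel capacities, floored at zero. The sketch below covers only that scalar case, not the paper's MIMOME model; the power and noise values are illustrative.

```python
import math

def secrecy_capacity(power, noise_main, noise_eve):
    """Secrecy capacity (bits per channel use) of a scalar Gaussian wire-tap channel."""
    c_main = 0.5 * math.log2(1.0 + power / noise_main)
    c_eve = 0.5 * math.log2(1.0 + power / noise_eve)
    # Secrecy is only possible when the eavesdropper's channel is the worse one.
    return max(0.0, c_main - c_eve)

# Illustrative values: transmit power 3, main-channel noise 1.
noisier_eavesdropper = secrecy_capacity(3.0, 1.0, 3.0)
cleaner_eavesdropper = secrecy_capacity(3.0, 1.0, 0.5)
```

In this scalar case the cutoff is the main-channel noise level: the capacity is positive only when the eavesdropper's noise exceeds it.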

Join and Meet Block Based Default Definite Decision Rule Mining from IDT and an Incremental Algorithm

Using maximal consistent blocks of the tolerance relation on the universe of an incomplete decision table, the concepts of join block and meet block are introduced and studied. Along with the tolerance class, other blocks such as the tolerant kernel and compatible kernel of an object are also discussed. Upper and lower approximations based on those blocks are also defined. Default definite decision rules acquired from an incomplete decision table are proposed. An incremental algorithm for updating default definite decision rules is suggested for effective mining from an incomplete decision table to which data is appended. Through an example, we demonstrate how default definite decision rules based on maximal consistent blocks, join blocks and meet blocks are acquired, and how optimization is done with the support of the discernibility matrix and discernibility function of the incomplete decision table.

Prediction of Overall Efficiency in Multistage Gear Trains

A mathematical model for determining the overall efficiency of a multistage tractor gearbox, including all gear, lubricant and surface-finish related parameters and operating conditions, is presented. Sliding friction, rolling friction and windage losses were considered as the main sources of power loss in the gearing system. A computer code in FORTRAN was developed to simulate the model. Sliding friction contributes about 98% of the total power loss for gear trains operating at relatively low speeds (less than 2000 rpm input speed). Rolling frictional losses decrease with increased load, while windage losses are only significant for gears running at very high speeds (greater than 3000 rpm). The results also showed that the overall efficiency varies over the path of contact of the gear meshes, ranging from 94% to 99.5%.
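The bookkeeping behind an overall efficiency figure can be sketched as follows, assuming the sliding, rolling and windage losses of each stage are already known in watts. The loss values are made up for illustration; the paper's model computes them from gear geometry, lubricant and operating conditions, which this sketch does not attempt, and it is written in Python rather than the FORTRAN of the original code.

```python
def overall_efficiency(input_power, stage_losses):
    """Output power over input power for a gear train.

    stage_losses holds one (sliding, rolling, windage) tuple per stage, in watts.
    """
    power = input_power
    for sliding, rolling, windage in stage_losses:
        power -= sliding + rolling + windage  # power passed on to the next stage
    return power / input_power

def sliding_share(stage_losses):
    """Fraction of the total power loss caused by sliding friction."""
    sliding = sum(s for s, _, _ in stage_losses)
    total = sum(s + r + w for s, r, w in stage_losses)
    return sliding / total

# Hypothetical two-stage train at low speed: windage assumed negligible.
losses = [(10.0, 0.5, 0.0), (9.0, 0.5, 0.0)]
efficiency = overall_efficiency(1000.0, losses)
share = sliding_share(losses)
```

With these made-up numbers the train is 98% efficient and sliding friction accounts for 95% of the loss.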

A Hybrid Data Mining Method for the Medical Classification of Chest Pain

Data mining techniques have been used in medical research for many years and are known to be effective. To address problems faced by emergency departments, such as long waiting times, congestion and delayed patient care, this study concentrates on building a hybrid methodology that combines data mining techniques such as association rules and classification trees. The methodology is applied to real-world emergency data collected from a hospital and is evaluated by comparison with other techniques. The methodology is expected to help physicians make a faster and more accurate classification of chest pain diseases.

Advanced Information Extraction with n-gram based LSI

The number of documents being created is growing at an increasing pace, while most of them cover already known topics and few introduce new concepts. This has started a new era in the information retrieval (IR) discipline, with requirements of its own: digging into topics and concepts and finding subtopics or relations between topics. Until now, IR research has been concerned with retrieving documents about a general topic or clustering documents under generic subjects. However, these conventional approaches cannot go deep into the content of documents, which makes it difficult for people to reach the documents they are searching for. We therefore need new ways of mining document sets, where the critical point is to know much about the contents of the documents. As a solution, we propose to enhance LSI, one of the proven IR techniques, by supporting its vector space with n-gram forms of words. The positive results we have obtained are shown in two application areas of the IR domain: querying a document database and clustering the documents in the database.

A PSO-Based Optimum Design of PID Controller for a Linear Brushless DC Motor

This paper presents a particle swarm optimization (PSO) method for determining the optimal proportional-integral-derivative (PID) controller parameters for speed control of a linear brushless DC motor. The proposed approach has superior features, including easy implementation, stable convergence characteristics and good computational efficiency. The brushless DC motor is modelled in Simulink and the PSO algorithm is implemented in MATLAB. Compared with the Genetic Algorithm (GA) and Linear Quadratic Regulator (LQR) methods, the proposed method was more efficient in improving the step response characteristics, such as reducing the steady-state error, rise time, settling time and maximum overshoot in speed control of a linear brushless DC motor.
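The idea of tuning PID gains with PSO can be sketched in a few lines. Everything below is an illustrative assumption rather than the paper's setup: the plant is a hypothetical first-order system (not the brushless DC motor model), the cost is the integral of absolute error of a unit step response, the PSO is a generic textbook variant with fixed inertia and acceleration coefficients, and the code is Python rather than MATLAB/Simulink.

```python
import random

def step_response_cost(kp, ki, kd, tau=0.5, dt=0.01, t_end=3.0):
    """Integral of absolute error for a unit step on the plant tau*dy/dt = u - y."""
    y, integral, prev_error, cost = 0.0, 0.0, 1.0, 0.0
    for _ in range(int(t_end / dt)):
        error = 1.0 - y
        integral += error * dt
        derivative = (error - prev_error) / dt
        u = kp * error + ki * integral + kd * derivative
        y += dt * (u - y) / tau  # explicit Euler step
        prev_error = error
        cost += abs(error) * dt
    return cost

def pso(cost, bounds, n_particles=15, iterations=30, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Generic PSO: returns the best position found and its cost."""
    rng = random.Random(seed)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * len(bounds) for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]
    best_cost = [cost(*p) for p in pos]
    g = min(range(n_particles), key=best_cost.__getitem__)
    g_pos, g_cost = best_pos[g][:], best_cost[g]
    for _ in range(iterations):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                vel[i][d] = (w * vel[i][d]
                             + c1 * rng.random() * (best_pos[i][d] - pos[i][d])
                             + c2 * rng.random() * (g_pos[d] - pos[i][d]))
                # Move the particle and clamp it to the search bounds.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            c = cost(*pos[i])
            if c < best_cost[i]:
                best_pos[i], best_cost[i] = pos[i][:], c
                if c < g_cost:
                    g_pos, g_cost = pos[i][:], c
    return g_pos, g_cost

# Hypothetical search ranges for (Kp, Ki, Kd).
BOUNDS = [(0.0, 10.0), (0.0, 10.0), (0.0, 0.2)]
gains, cost_value = pso(step_response_cost, BOUNDS)
```

The derivative gain range is kept small because a large Kd destabilizes the explicit Euler simulation at this step size; a different plant model or integrator would call for different bounds.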