Finding Fuzzy Association Rules Using FWFP-Growth with Linguistic Supports and Confidences

In data mining, association rules are used to discover relations among the items in a transaction database. Once the data has been collected and stored, association rules can reveal valuable patterns and help managers develop marketing strategies and plan the market framework. In this paper, we apply fuzzy partition methods to determine the membership functions for the quantitative values of each transaction item. Managers can also express the importance of items as linguistic terms, which are transformed into fuzzy sets of weights. Fuzzy weighted frequent pattern growth (FWFP-Growth) is then used to carry out the mining process. The method is expected to be more efficient than the Apriori algorithm in generating the complete set of association rules. An example is given to illustrate the proposed approach.
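
To make the ideas of membership functions and weighted support concrete, the following is a minimal sketch and not the FWFP-Growth algorithm itself. The triangular regions in `REGIONS`, the crisp item weights, and the choice of `min` to combine degrees and weights are all assumptions made for illustration; the paper uses fuzzy sets of weights derived from linguistic terms.

```python
def triangular(x, a, b, c):
    """Membership degree of x in a triangular fuzzy set with feet a, c and peak b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


# Hypothetical fuzzy partition of purchase quantities (a < b < c for every region).
REGIONS = {"low": (0, 1, 6), "middle": (1, 6, 11), "high": (6, 11, 16)}


def fuzzy_weighted_support(transactions, itemset, weights):
    """Average weighted membership of a fuzzy itemset over all transactions.

    transactions: list of dicts mapping item -> purchased quantity
    itemset:      list of (item, region) pairs, e.g. [("milk", "middle")]
    weights:      dict mapping item -> importance in [0, 1] (assumed crisp here)
    """
    total = 0.0
    for quantities in transactions:
        # Degree to which this transaction matches every (item, region) pair.
        degree = min(
            triangular(quantities.get(item, 0), *REGIONS[region])
            for item, region in itemset
        )
        # Assumption: the itemset weight is the smallest item weight.
        weight = min(weights[item] for item, _ in itemset)
        total += degree * weight
    return total / len(transactions)


transactions = [{"milk": 6, "bread": 6}, {"milk": 1}]
itemset = [("milk", "middle"), ("bread", "middle")]
weights = {"milk": 1.0, "bread": 0.5}
support = fuzzy_weighted_support(transactions, itemset, weights)
```

An itemset would count as frequent when this support reaches a chosen minimum support, which in the paper's setting is itself given linguistically.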

Analysis and Classification of HIV-1 Subtype Viruses by AR Model through Artificial Neural Networks

The HIV-1 genome is highly heterogeneous. Because of this variation, the features of the HIV-1 genome span a wide range. As a result, the infectivity of the virus varies depending on which chemokine receptors it uses. R5 HIV viruses use the CCR5 coreceptor, X4 viruses use CXCR4, and R5X4 viruses can utilize both coreceptors. Recently, bioinformatics studies have sought to classify R5X4 viruses using experimental data on the HIV-1 genome. In this study, R5X4 HIV viruses were classified using an autoregressive (AR) model together with artificial neural networks (ANNs). The statistical data of R5X4, R5 and X4 viruses were analyzed using signal processing methods and ANNs. The accessible residues of these virus sequences were obtained and modeled with an AR model, since the residue data are large and differ in dimension from one sequence to another. The pre-processed data were then used to develop various ANN structures for identifying R5X4 viruses. Furthermore, ROC analysis was applied to the ANNs to show their real performance. The results indicate that R5X4 viruses were successfully classified, with high sensitivity and specificity values in both training and testing ROC analysis for the RBF network, which gives the best performance among the ANN structures.
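
The role of the AR model here is to turn sequences of different lengths into feature vectors of one fixed length that a neural network can accept. A minimal sketch of that step follows, assuming the coefficients are estimated with the Yule-Walker equations solved by the Levinson-Durbin recursion; the abstract does not say which estimator the authors used, and the two input signals are made-up numbers.

```python
def autocorrelation(signal, max_lag):
    """Biased sample autocorrelation r[0..max_lag] of a mean-centered signal."""
    n = len(signal)
    mean = sum(signal) / n
    centered = [v - mean for v in signal]
    return [
        sum(centered[t] * centered[t + k] for t in range(n - k)) / n
        for k in range(max_lag + 1)
    ]


def levinson_durbin(r, order):
    """Solve the Yule-Walker equations for x[t] = sum_i a[i] * x[t-1-i] + e[t]."""
    coeffs = []
    err = r[0]  # prediction error power, assumed non-zero (non-constant signal)
    for k in range(1, order + 1):
        acc = r[k] - sum(coeffs[j] * r[k - 1 - j] for j in range(k - 1))
        refl = acc / err  # reflection coefficient of this order
        coeffs = [coeffs[j] - refl * coeffs[k - 2 - j] for j in range(k - 1)] + [refl]
        err *= 1.0 - refl * refl
    return coeffs


def ar_features(signal, order):
    """Fixed-length AR feature vector for a signal of any length > order."""
    return levinson_durbin(autocorrelation(signal, order), order)


# Two hypothetical residue-accessibility signals of different lengths.
short_signal = [0.1, 0.5, 0.3, 0.9, 0.2, 0.7, 0.4]
long_signal = [1.0, 0.2, 0.8, 0.4, 0.9, 0.1, 0.6, 0.3, 0.7, 0.5]
features_a = ar_features(short_signal, 3)
features_b = ar_features(long_signal, 3)
```

Both feature vectors have length 3 regardless of the input length, which is the property that makes them usable as ANN inputs.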

Bifurcation Analysis in a Two-neuron System with Different Time Delays

In this paper, we consider a two-neuron system with time-delayed connections between neurons. By analyzing the associated transcendental characteristic equation, we investigate its linear stability and demonstrate the occurrence of a Hopf bifurcation. Explicit formulae for determining the direction of the Hopf bifurcation and the stability of the bifurcating periodic solutions are obtained using normal form theory and center manifold theory. Numerical simulation results are given to support the theoretical predictions. Finally, the main conclusions are given.

Automata Theory Approach for Solving Frequent Pattern Discovery Problems

The various types of frequent pattern discovery problems, namely the frequent itemset, sequence and graph mining problems, are solved in different ways that are nevertheless similar in certain aspects. The approaches to discovering such patterns fall into two main classes: level-wise methods and database projection-based methods. Level-wise algorithms generally use clever indexing structures to discover the patterns. In this paper, a new approach based on the level-wise method is proposed for discovering frequent sequences and tree-like patterns efficiently. Because level-wise algorithms spend a lot of time on the subpattern testing problem, the new approach introduces the idea of using automata theory to solve this problem.
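
The subpattern testing problem can be pictured with a very small state machine. The sketch below is an illustration only and not the automaton construction proposed in the paper: it assumes sequences of single symbols and treats a candidate pattern as a chain of states, advancing one state each time the next expected symbol is read.

```python
def accepts(pattern, sequence):
    """Return True if pattern occurs in sequence as a (non-contiguous) subsequence.

    The state is the number of pattern symbols matched so far; the automaton
    accepts when it has consumed the whole pattern.
    """
    state = 0
    for symbol in sequence:
        if state < len(pattern) and symbol == pattern[state]:
            state += 1
    return state == len(pattern)


def support(pattern, database):
    """Number of database sequences that contain the pattern."""
    return sum(1 for sequence in database if accepts(pattern, sequence))


def frequent(candidates, database, min_support):
    """Keep the candidates of one level whose support reaches min_support."""
    return [p for p in candidates if support(p, database) >= min_support]


database = ["abcd", "acd", "bd", "abd"]
level_two = frequent(["ad", "bd", "ca"], database, min_support=3)
```

In a level-wise miner, `frequent` would be called once per level on the candidates generated from the previous level's results.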

Comparation Treatment Method for Industrial Tempeh Waste by Constructed Wetland and Activated Sludge

Ever since the industrial revolution began, our ecosystem has changed, and the negative effects outweigh the positive ones. Industrial waste is usually released into bodies of water such as rivers or the sea. Tempeh waste is one example of waste that carries many hazardous and unwanted substances that affect the surrounding environment. Tempeh is a popular fermented food in Asia that is rich in nutrients and active substances. Tempeh liquid waste in particular can cause air pollution, and if it penetrates the soil it contaminates the groundwater, making the water unfit for consumption. Moreover, bacteria thrive in the polluted water and are often responsible for many kinds of diseases. The treatment used for this chemical waste is biological treatment, such as constructed wetland and activated sludge. These treatments can reduce both physical and chemical parameters such as temperature, TSS, pH, BOD, COD, NH3-N, NO3-N, and PO4-P. They are applied before the waste is released into the water. The result is a comparison between constructed wetland and activated sludge that determines which method is better suited to reducing the physical and chemical substances in the waste.

Application of a Similarity Measure for Graphs to Web-based Document Structures

Due to the tremendous amount of information provided by the World Wide Web (WWW), developing methods for mining the structure of web-based documents is of considerable interest. In this paper, we present a similarity measure for graphs representing web-based hypertext structures. Our similarity measure is mainly based on a novel representation of a graph as linear integer strings whose components represent structural properties of the graph. The similarity of two graphs is then defined via the optimal alignment of the underlying property strings. We apply the well-known technique of sequence alignment to solve a novel and challenging problem: measuring the structural similarity of generalized trees. In other words, we first transform our graphs, considered as high-dimensional objects, into linear structures. Then we derive similarity values from the alignments of the property strings in order to measure the structural similarity of generalized trees. Hence, we transform a graph similarity problem into a string similarity problem in order to develop an efficient graph similarity measure. We demonstrate that our similarity measure captures important structural information by applying it to two different test sets consisting of graphs representing web-based document structures.
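
The pipeline of "graph to property string, then alignment" can be sketched as follows. This is a simplified illustration under stated assumptions and not the authors' measure: the structural property is taken to be the number of nodes on each level, the input is assumed to be a plain rooted tree without cycles or cross links, and the alignment uses a standard global (Needleman-Wunsch style) score with hypothetical match, mismatch and gap values.

```python
def level_sizes(children, root):
    """Property string of a rooted tree: the number of nodes on each level."""
    sizes, frontier = [], [root]
    while frontier:
        sizes.append(len(frontier))
        frontier = [c for node in frontier for c in children.get(node, [])]
    return sizes


def alignment_score(s, t, match=1, mismatch=-1, gap=-1):
    """Optimal global alignment score of two integer strings (dynamic programming)."""
    prev = [j * gap for j in range(len(t) + 1)]
    for i in range(1, len(s) + 1):
        cur = [i * gap]
        for j in range(1, len(t) + 1):
            diag = prev[j - 1] + (match if s[i - 1] == t[j - 1] else mismatch)
            cur.append(max(diag, prev[j] + gap, cur[j - 1] + gap))
        prev = cur
    return prev[-1]


# Two hypothetical web document structures given as parent -> children maps.
site_a = {"index": ["a", "b"], "a": ["c"], "b": [], "c": []}
site_b = {"home": ["x", "y"], "x": [], "y": []}

string_a = level_sizes(site_a, "index")
string_b = level_sizes(site_b, "home")
score = alignment_score(string_a, string_b)
```

A higher score indicates more similar level structures; dividing by the length of the longer string would give a value that is easier to compare across graph pairs.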

Improved C-Fuzzy Decision Tree for Intrusion Detection

As the number of networked computers grows, intrusion detection becomes an essential component in keeping networks secure. Various approaches to intrusion detection are currently in use, each with its own merits and demerits. This paper presents our work to test and improve the performance of a new class of decision tree, the c-fuzzy decision tree, for detecting intrusions. The work also includes identifying the best candidate feature subset for building an efficient c-fuzzy decision tree based Intrusion Detection System (IDS). We investigated the usefulness of the c-fuzzy decision tree for developing an IDS with a data partition based on horizontal fragmentation. Empirical results indicate the usefulness of our approach in developing an efficient IDS.

A Cross-Layer Approach for Cooperative MIMO Multi-hop Wireless Sensor Networks

In this work, we study the problem of determining the minimum scheduling length that can satisfy end-to-end (ETE) traffic demand in scheduling-based multi-hop WSNs with a cooperative multiple-input multiple-output (MIMO) transmission scheme. Specifically, we present a cross-layer formulation for the joint routing, scheduling and stream control problem by incorporating various power and rate adaptation schemes, and by taking into account an antenna beam pattern model and the signal-to-interference-and-noise ratio (SINR) constraint at the receiver. In this context, we also propose column generation (CG) solutions that avoid the complexity of enumerating all possible sets of scheduling links.

Modeling Language for Constructing Solvers in Machine Learning: Reductionist Perspectives

Finding an efficient algorithm for a given specific problem has long been a matter of study. An alternative approach, orthogonal to this one, is called reduction. For a given problem, the reduction approach studies how to convert the original problem into subproblems. This paper proposes a formal modeling language that supports the reduction approach in order to build a solver quickly. We show three examples from the wide area of learning problems. The benefit is fast prototyping of algorithms for a new problem. Note that our formal modeling language is not intended to provide an efficient notation for data mining applications, but to assist designers who develop solvers in machine learning.

Visual Object Tracking in 3D with Color Based Particle Filter

This paper addresses the problem of determining the current 3D location of a moving object and robustly tracking it from a sequence of camera images. The approach presented here uses a particle filter and does not perform any explicit triangulation. Only the color of the object to be tracked is required; no precise motion model is needed. The observation model we have developed avoids color filtering of the entire image. This, together with the Monte Carlo techniques inside the particle filter, provides real-time performance. Experiments with two real cameras are presented and the lessons learned are discussed. The approach scales easily to more than two cameras and new sensor cues.
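
The predict, weight and resample cycle of a particle filter over 3D positions can be sketched as below. This is a toy illustration under stated assumptions and not the authors' tracker: the motion model is a plain random walk (reflecting that no precise motion model is required), and the color-based camera observation is replaced by a hypothetical Gaussian likelihood centered on a fixed `target` so that the example runs without images.

```python
import math
import random


def particle_filter_step(particles, likelihood, motion_noise, rng):
    """One cycle: diffuse the particles, weight them, resample by weight."""
    # Predict: random-walk diffusion stands in for an unknown motion model.
    moved = [tuple(c + rng.gauss(0.0, motion_noise) for c in p) for p in particles]
    # Update: in a real tracker each 3D particle would be projected into every
    # camera image and scored by color similarity at the projected pixel.
    weights = [likelihood(p) for p in moved]
    if sum(weights) == 0.0:
        return moved  # no information in this frame, keep the diffused set
    # Resample: draw the new particle set proportionally to the weights.
    return rng.choices(moved, weights=weights, k=len(moved))


def estimate(particles):
    """Mean 3D position of the particle set."""
    n = len(particles)
    return tuple(sum(p[i] for p in particles) / n for i in range(3))


def demo(seed=0, steps=30, count=500):
    rng = random.Random(seed)
    target = (1.0, 2.0, 3.0)  # hypothetical true object position

    def likelihood(p):
        # Stand-in observation model: Gaussian in the distance to the target.
        return math.exp(-2.0 * sum((a - b) ** 2 for a, b in zip(p, target)))

    particles = [tuple(rng.uniform(-5.0, 5.0) for _ in range(3)) for _ in range(count)]
    for _ in range(steps):
        particles = particle_filter_step(particles, likelihood, 0.2, rng)
    return estimate(particles), target, len(particles)
```

Because only the particles are scored, the observation model never has to process the whole image, which is the source of the real-time behavior described above.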

Sensitivity Analysis in Power Systems Reliability Evaluation

In this paper, sensitivity analysis is performed for the reliability evaluation of power systems. When examining the reliability of a system, it is useful to recognize how the results change as component parameters are varied. This knowledge helps engineers understand the impact of poor data and gives insight into how reliability can be improved. For these reasons, a sensitivity analysis can be performed. Finally, a real network is used to test the presented method.
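
As a small illustration of how a reliability result changes when a component parameter is varied, the sketch below estimates the sensitivity of a system index by finite differences. The abstract does not name the reliability indices or the system model, so everything here is an assumption: components are repairable with constant failure rate and repair rate, steady-state availability is mu / (lambda + mu), and the system is a simple series arrangement.

```python
def availability(failure_rate, repair_rate):
    """Steady-state availability of one repairable component."""
    return repair_rate / (failure_rate + repair_rate)


def series_availability(components):
    """Availability of a series system: every component must be up."""
    result = 1.0
    for failure_rate, repair_rate in components:
        result *= availability(failure_rate, repair_rate)
    return result


def sensitivity(components, index, rel_step=1e-6):
    """Central-difference estimate of d(system availability) / d(failure rate)
    for the component at the given index."""
    failure_rate, repair_rate = components[index]
    h = failure_rate * rel_step
    up = list(components)
    down = list(components)
    up[index] = (failure_rate + h, repair_rate)
    down[index] = (failure_rate - h, repair_rate)
    return (series_availability(up) - series_availability(down)) / (2.0 * h)


# Hypothetical data: (failures per year, repairs per year) for each component.
network = [(1.0, 9.0), (0.5, 9.5)]
ranking = sorted(range(len(network)), key=lambda i: sensitivity(network, i))
```

Sorting components by sensitivity shows where better data or better equipment would have the largest effect on the system index.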

Improving Classification in Bayesian Networks using Structural Learning

Naïve Bayes classifiers are simple probabilistic classifiers. Classification extracts patterns from a data file containing a set of labeled training examples and is currently one of the most significant areas in data mining. However, Naïve Bayes assumes independence among the features. Learning the structure among the features can therefore help with the classification problem. In this study, structural learning in Bayesian networks is proposed for cases where relationships exist between the features used by Naïve Bayes. The improvement in classification from structural learning is shown for cases where relationships exist between the features, that is, when they are not independent.

Data Mining in Oral Medicine Using Decision Trees

Data mining has been used very frequently to extract hidden information from large databases. This paper suggests the use of decision trees for continuously extracting the clinical reasoning, in the form of medical experts' actions, that is inherent in a large number of EMRs (Electronic Medical Records). In this way, the extracted data could be used to teach students of oral medicine a number of orderly processes for dealing with patients who present with different problems within the practice context over time.

Prospects, Problems of Marketing Research and Data Mining in Turkey

The objective of this paper is to review and assess the methodological issues and problems in marketing research and in data and knowledge mining in Turkey. In summary, academic marketing research publications in Turkey have significant problems. The most vital problem seems to be related to modeling: most of the publications had major weaknesses in modeling. There were also serious problems regarding measurement and scaling, sampling, and analysis. Analysis myopia seems to be the most important problem for young academics in Turkey. Another very important finding is the lack of publications on data and knowledge mining in the academic world.

Extensiveness and Effectiveness of Corporate Governance Regulations in South-Eastern Europe

The purpose of the article is to illustrate the main characteristics of the corporate governance challenge facing the countries of South-Eastern Europe (SEE) and subsequently to determine and assess the extensiveness and effectiveness of corporate governance regulations in these countries. We therefore start with an overview of the key problems of corporate governance in transition. We then address the issue of corporate governance measurement for SEE countries. To this end, we include a review of the methodological framework for determining both the extensiveness and the effectiveness of corporate governance legislation. We then focus on the actual analysis of the quality of corporate governance codes, as well as of the effectiveness of legal institutions, and provide a measure of corporate governance in Romania and other SEE emerging markets. The paper concludes by emphasizing the corporate governance enforcement gap and by identifying research issues that require further study.

An Attribute-Centre Based Decision Tree Classification Algorithm

Decision tree algorithms have a very important place among the classification models of data mining. In the literature, algorithms use the entropy concept or the Gini index to form the tree. The shape of the classes and their closeness to each other are some of the factors that affect the performance of the algorithm. In this paper, we introduce a new decision tree algorithm that employs a data (attribute) folding method and the variation of the class variables over the branches to be created. A comparative performance analysis has been carried out between the proposed algorithm and C4.5.
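
For reference, the two impurity measures that conventional decision tree algorithms rely on can be computed as in the sketch below. It shows only the standard entropy and Gini index definitions and the information gain of a candidate split; it does not implement the attribute-centre method proposed in the paper, and the labels are made-up data.

```python
import math
from collections import Counter


def class_probabilities(labels):
    """Relative frequency of each class among the labels."""
    n = len(labels)
    return [count / n for count in Counter(labels).values()]


def entropy(labels):
    """Shannon entropy in bits of the class distribution."""
    return -sum(p * math.log2(p) for p in class_probabilities(labels))


def gini(labels):
    """Gini index of the class distribution."""
    return 1.0 - sum(p * p for p in class_probabilities(labels))


def information_gain(labels, groups):
    """Entropy reduction obtained by splitting labels into the given groups."""
    n = len(labels)
    remainder = sum(len(g) / n * entropy(g) for g in groups)
    return entropy(labels) - remainder


labels = ["a", "a", "b", "b"]
perfect_split = [["a", "a"], ["b", "b"]]
useless_split = [["a", "b"], ["a", "b"]]
```

A tree builder based on these measures picks, at each node, the split with the highest gain (or the lowest weighted Gini index).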

Presenting a Combinatorial Feature to Estimate Depth of Anesthesia

Determining the depth of anesthesia is a challenging problem in the context of biomedical signal processing. Various methods have been suggested for determining a quantitative index of the depth of anesthesia, but most of these methods suffer from high sensitivity during surgery. A novel method based on the energy scattering of samples in the wavelet domain is suggested to represent the basic content of the electroencephalogram (EEG) signal. In this method, the EEG signal is first decomposed into different sub-bands; the samples are then squared and an energy sequence is constructed over each scale and time; this sequence is normalized, and finally the entropy of the resulting sequences is suggested as a reliable index. Empirical results showed that applying the proposed method to EEG signals can classify the awake, moderate and deep anesthesia states similarly to BIS.
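
The chain "decompose, square, normalize, take the entropy" can be sketched as follows. This is a simplified illustration and not the authors' index: the Haar wavelet is an assumption (the abstract does not name the wavelet), energies are summed per sub-band rather than tracked over time, and the input is a short made-up signal whose length must be divisible by 2 to the power of `levels`.

```python
import math


def haar_step(signal):
    """One level of the orthonormal Haar transform: (approximation, detail)."""
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal) - 1, 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal) - 1, 2)]
    return approx, detail


def subband_energies(signal, levels):
    """Energy (sum of squared coefficients) of each detail band and the final
    approximation band."""
    energies = []
    approx = list(signal)
    for _ in range(levels):
        approx, detail = haar_step(approx)
        energies.append(sum(d * d for d in detail))
    energies.append(sum(a * a for a in approx))
    return energies


def wavelet_entropy(signal, levels):
    """Shannon entropy in bits of the normalized sub-band energies.
    Assumes the signal is not identically zero."""
    energies = subband_energies(signal, levels)
    total = sum(energies)
    probs = [e / total for e in energies if e > 0.0]
    return -sum(p * math.log2(p) for p in probs)


impulse = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
constant = [1.0] * 8
```

An impulse spreads its energy over all scales and gives a high entropy, while a constant signal concentrates all energy in one band and gives zero, which is the kind of contrast an index of signal regularity exploits.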

Gender Differences in Entrepreneurship: Situation, Characteristics, Motivation and Entrepreneurial Behavior of Women Entrepreneurs in Switzerland

Entrepreneurs are important for national labour markets and economies in that they contribute significantly to economic growth, provide the majority of jobs, and create new ones. According to the Global Entrepreneurship Monitor’s “Report on Women and Entrepreneurship”, investment in women’s entrepreneurship is an important way to exponentially increase the impact of new venture creation, and finding ways to empower women’s participation and success in entrepreneurship is critical for more sustainable and successful economic development. Our results confirm that there are still differences between men and women entrepreneurs. The reasons seem to be the lack of specific business skills, the less extensive social network, and the lack of identification patterns among women. Those differences can be explained by the fact that women still have fewer opportunities to make a career. If this is correct, we can predict an increasing proportion of women among entrepreneurs in the coming years. Concerning the development of a favorable environment for developing and enhancing women’s entrepreneurship activities, our results show that integration into a network and the presence of a role model are without doubt determining elements in the choice to launch an entrepreneurial activity, as well as a precious resource for the success of the company.

SUPAR: System for User-Centric Profiling of Association Rules in Streaming Data

With the surge of stream processing applications, novel techniques are required for the generation and analysis of association rules in streams. Traditional rule mining solutions cannot handle streams because they generally require multiple passes over the data and do not guarantee results within a small, predictable time. Although researchers have been proposing algorithms for generating rules from streams, there has not been much focus on their analysis. We propose association rule profiling, a user-centric process for analyzing association rules and attaching suitable profiles to them depending on their changing frequency behavior over a previous snapshot of time in a data stream. Association rule profiles provide insights into the changing nature of associations and can be used to characterize them. We discuss the importance of characteristics such as the predictability of linkages present in the data and propose a metric to quantify it. We also show how association rule profiles can aid in the generation of user-specific, more understandable and actionable rules. The framework is implemented as SUPAR: System for User-centric Profiling of Association Rules in streaming data. The proposed system offers the following capabilities: i) continuous monitoring of the frequency of streaming itemsets and detection of significant changes therein for association rule profiling; ii) computation of metrics for quantifying the predictability of associations present in the data; iii) user-centric control of the characterization process, where the user can control the framework through a) constraint specification and b) elimination of non-interesting rules.
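
The idea of monitoring a rule's frequency behavior over a stream and attaching a profile to it can be sketched with a sliding window. This is a minimal illustration and not the SUPAR implementation: the class name, the fixed-size window, the `tolerance` threshold and the three profile labels are all hypothetical choices made for the example.

```python
from collections import deque


class SlidingWindowRules:
    """Keeps the most recent transactions and answers support/confidence queries."""

    def __init__(self, size):
        self.window = deque(maxlen=size)  # old transactions fall out automatically

    def add(self, transaction):
        self.window.append(frozenset(transaction))

    def count(self, itemset):
        items = frozenset(itemset)
        return sum(1 for t in self.window if items <= t)

    def support(self, itemset):
        return self.count(itemset) / len(self.window) if self.window else 0.0

    def confidence(self, antecedent, consequent):
        base = self.count(antecedent)
        both = self.count(set(antecedent) | set(consequent))
        return both / base if base else 0.0


def profile(confidences, tolerance=0.1):
    """Label a rule by how its confidence changed over the observed snapshots."""
    change = confidences[-1] - confidences[0]
    if change > tolerance:
        return "rising"
    if change < -tolerance:
        return "falling"
    return "stable"


monitor = SlidingWindowRules(size=2)
history = []
for transaction in [{"a", "b"}, {"a", "b"}, {"a"}, {"a"}]:
    monitor.add(transaction)
    history.append(monitor.confidence({"a"}, {"b"}))
label = profile(history)
```

Here the rule a -> b starts with confidence 1.0 and ends at 0.0 as newer transactions replace older ones, so it is labeled "falling".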

Network Anomaly Detection using Soft Computing

One main drawback of intrusion detection systems is their inability to detect new attacks that do not have known signatures. In this paper, we discuss an intrusion detection method that uses independent component analysis (ICA) based feature selection heuristics and rough fuzzy clustering of the data. ICA separates the independent components (ICs) from the monitored variables. Rough sets decrease the amount of data and remove redundancy, while fuzzy methods allow objects to belong to several clusters simultaneously, with different degrees of membership. Our approach allows us not only to recognize known attacks but also to detect activity that may be the result of a new, unknown attack. Experimental results are reported on the Knowledge Discovery and Data Mining (KDD Cup 1999) dataset.