Mining Network Data for Intrusion Detection through Naïve Bayesian with Clustering

Network security attacks are violations of information security policy that have received much attention from the computational intelligence community in recent decades. Data mining has become a very useful technique for detecting network intrusions by extracting useful knowledge from large volumes of network data or logs. The naïve Bayesian classifier is one of the most popular data mining algorithms for classification, and it provides an optimal way to predict the class of an unknown example. It has been observed that a single set of probabilities derived from the data is not sufficient to achieve a good classification rate. In this paper, we propose a new learning algorithm for mining network logs to detect network intrusions through the naïve Bayesian classifier. The algorithm first clusters the network logs into several groups based on the similarity of the logs, and then calculates the prior and conditional probabilities for each group of logs. To classify a new log, the algorithm determines which cluster the log belongs to and then uses that cluster's probability set to classify it. We tested the performance of the proposed algorithm on the KDD99 benchmark network intrusion detection dataset, and the experimental results show that it improves detection rates and reduces false positives for different types of network intrusions.
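
The cluster-then-classify idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how logs are clustered, so the sketch assumes categorical attributes, fixed cluster centers supplied by the caller, Hamming distance as the similarity measure, and Laplace smoothing. The names `hamming`, `nearest`, `train`, `classify`, and `n_values`, as well as the toy logs, are all hypothetical.

```python
from collections import Counter, defaultdict
import math

def hamming(a, b):
    """Number of attribute positions where two logs differ."""
    return sum(x != y for x, y in zip(a, b))

def nearest(centers, log):
    """Index of the most similar center (ties go to the lowest index)."""
    return min(range(len(centers)), key=lambda i: hamming(centers[i], log))

def train(logs, labels, centers):
    """Group logs by nearest center and collect one set of counts per cluster."""
    models = {}
    for log, label in zip(logs, labels):
        m = models.setdefault(nearest(centers, log),
                              {"prior": Counter(), "cond": defaultdict(Counter)})
        m["prior"][label] += 1
        for j, value in enumerate(log):
            m["cond"][(label, j)][value] += 1
    return models

def classify(models, centers, log, n_values=2):
    """Classify a log with the probability set of the cluster it falls into."""
    m = models[nearest(centers, log)]  # assumes the cluster saw training logs
    total = sum(m["prior"].values())

    def score(label):
        s = math.log(m["prior"][label] / total)
        for j, value in enumerate(log):
            counts = m["cond"][(label, j)]
            # Laplace smoothing; n_values is an assumed number of values per attribute
            s += math.log((counts[value] + 1) / (m["prior"][label] + n_values))
        return s

    return max(m["prior"], key=score)

# Toy data (hypothetical, not taken from KDD99)
centers = [("tcp", "http"), ("udp", "dns")]
logs = [("tcp", "http"), ("tcp", "ftp"), ("tcp", "http"),
        ("udp", "dns"), ("udp", "ntp"), ("udp", "dns")]
labels = ["normal", "attack", "normal", "normal", "attack", "normal"]
models = train(logs, labels, centers)
```

Each cluster keeps its own priors and conditional counts, so the same attribute value can carry different evidence in different clusters, which is the point of using more than one probability set.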

Enhanced GA-Fuzzy OPF under both Normal and Contingent Operation States

Genetic algorithm (GA) based solution techniques are well suited to optimization because of their ability to search multiple dimensions simultaneously. Many GA variants have been tried in the past to solve optimal power flow (OPF), one of the nonlinear problems of electric power systems. Issues such as convergence speed, the accuracy of the optimal solution obtained after a number of generations, and the handling of system constraints in OPF remain subjects of discussion. The results obtained for GA-Fuzzy OPF on various power systems have shown faster convergence and lower generation costs compared with other approaches. This paper presents an enhanced GA-Fuzzy OPF (EGA-OPF) that uses penalty factors to handle line flow constraints and load bus voltage limits for both the normal network and the contingency case with congestion. In addition to a crossover and mutation rate adaptation scheme, which adapts the crossover and mutation probabilities for each generation based on the fitness values of previous generations, a block swap operator is incorporated in the proposed EGA-OPF. The line flow limits and load bus voltage magnitude limits are handled by incorporating line overflow and load voltage penalty factors, respectively, in each chromosome's fitness function. The effects of different penalty factor settings are also analyzed under the contingent state.
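
As a rough illustration of penalty-based constraint handling, the sketch below adds line overflow and load voltage penalties to a generation cost. The abstract does not give the penalty form, so the quadratic penalties, the default weights `k_line` and `k_volt`, and the function name `penalized_cost` are assumptions made for this example only.

```python
def penalized_cost(gen_cost, line_flows, line_limits, bus_voltages,
                   v_min, v_max, k_line=1000.0, k_volt=1000.0):
    """Generation cost plus penalties for constraint violations.

    The quadratic form and the weights are assumptions, not the paper's values.
    """
    # Penalize only the part of each line flow that exceeds its limit.
    overflow = sum(max(0.0, abs(flow) - limit) ** 2
                   for flow, limit in zip(line_flows, line_limits))
    # Penalize load bus voltages below v_min or above v_max.
    voltage = sum(max(0.0, v_min - v, v - v_max) ** 2 for v in bus_voltages)
    return gen_cost + k_line * overflow + k_volt * voltage

# A feasible chromosome keeps its raw cost; an infeasible one is pushed up.
feasible = penalized_cost(100.0, [0.8, -0.9], [1.0, 1.0], [1.0, 0.97], 0.95, 1.05)
infeasible = penalized_cost(100.0, [0.8, -1.2], [1.0, 1.0], [1.0, 0.93], 0.95, 1.05)
```

A GA that minimizes this value will tend to discard chromosomes with overloaded lines or out-of-range voltages, and larger weights make that pressure stronger.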

Effect of Transmission Codes on Hybrid SC/MRC Diversity Reception MQAM system over Rayleigh Fading Channels

In this paper, the effect of transmission codes on the performance of coherent square M-ary quadrature amplitude modulation (CSMQAM) under hybrid selection/maximal-ratio combining (H-S/MRC) diversity is analysed. The fading channels are modeled as frequency non-selective, slow, independent and identically distributed Rayleigh fading channels corrupted by additive white Gaussian noise (AWGN). The results for coded MQAM are computed numerically for the case of the (24,12) extended Golay code and compared with uncoded MQAM under H-S/MRC diversity by plotting error probabilities versus average signal-to-noise ratio (SNR) for various values of L and N, in order to examine the improvement in the performance of the digital communication system as the number of selected diversity branches is increased. The results for no diversity, conventional SC and Lth order MRC schemes are also plotted for comparison. The closed-form analytical results derived in this paper are sufficiently simple that they can be computed numerically without any approximations. The analytical results presented in this paper are expected to provide useful information for the design and analysis of digital communication systems over wireless fading channels.

Estimating Word Translation Probabilities for Thai – English Machine Translation using EM Algorithm

Selecting, from a set of target-language words, the translation that conveys the correct sense of the source word and yields more fluent target-language output is one of the core problems in machine translation. In this paper we compare three methods of estimating word translation probabilities for selecting the translation word in Thai – English machine translation: (1) a method based on the frequency of word translations, (2) a method based on the collocation of word translations, and (3) a method based on the Expectation Maximization (EM) algorithm. For evaluation we used Thai – English parallel sentences generated by NECTEC. The EM-based method performs best among the three and gives satisfactory results.
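
To make the EM-based estimate concrete, here is a minimal sketch in the style of IBM Model 1. The abstract does not state which EM formulation the authors used, so this particular model is an assumption, as is the function name `em_translation_probs`. The source tokens `x`, `y`, `z`, `w` are placeholders, not real Thai words.

```python
from collections import defaultdict

def em_translation_probs(pairs, iterations=10):
    """Estimate t(target | source) from sentence pairs (IBM Model 1 style EM)."""
    src_vocab = {s for src, _ in pairs for s in src}
    tgt_vocab = {t for _, tgt in pairs for t in tgt}
    # Start from a uniform distribution over target words.
    t = {(tw, sw): 1.0 / len(tgt_vocab) for sw in src_vocab for tw in tgt_vocab}
    for _ in range(iterations):
        count = defaultdict(float)  # expected co-occurrence counts
        total = defaultdict(float)  # expected counts per source word
        for src, tgt in pairs:
            for tw in tgt:
                norm = sum(t[(tw, sw)] for sw in src)
                for sw in src:
                    frac = t[(tw, sw)] / norm  # E-step: soft alignment weight
                    count[(tw, sw)] += frac
                    total[sw] += frac
        # M-step: renormalise the expected counts for each source word.
        t = {(tw, sw): count[(tw, sw)] / total[sw] for (tw, sw) in t}
    return t

# Placeholder parallel corpus: (source tokens, target tokens)
pairs = [(["x", "y"], ["the", "house"]),
         (["x", "z"], ["the", "book"]),
         (["w", "z"], ["a", "book"])]
probs = em_translation_probs(pairs)
best_for_x = max(["the", "house", "book", "a"], key=lambda tw: probs[(tw, "x")])
```

Unlike a plain frequency count, the soft alignment weights let evidence from one sentence pair sharpen the estimates for words in another, because a source word that is well explained elsewhere stops competing for the remaining target words.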

Info-participation of the Disabled Using the Mixed Preference Data in Improving Their Travel Quality

Today, the preferences and participation of TD groups such as the elderly and the disabled are still lacking in transportation planning decisions, and their reactions to certain types of policies are not well known. Thus, a clear methodology is needed. This study aimed to develop a method for extracting the preferences of the disabled for use in the policy-making stage, which can also guide future estimations. The method combines cluster analysis and data filtering, using data from Arao city (Japan). It proceeds as follows: the TD group is defined with the cluster analysis tool; their travel preferences are tabulated from the household surveys by policy variable-impact pairs, by zones, and by trip purposes; and the final outcome is the preference probabilities of the disabled. The preferences vary by trip purpose. For work trips, they are accessibility and transit system quality policies, with the accompanying impacts of modal shifts towards public mode use, decreasing travel costs, and an increase in trip rate. For social trips, they are the same accessibility and transit system policies leading to the same mode shift impact, together with the travel quality policy area leading to a trip rate increase. These results indicate which policies to focus on and can be used for scenario generation in models, or for any other planning purpose as a decision support tool.

Integrating Context Priors into a Decision Tree Classification Scheme

Scene interpretation systems need to match (often ambiguous) low-level input data to concepts from a high-level ontology. In many domains, these decisions are uncertain and benefit greatly from proper context. This paper demonstrates the use of decision trees for estimating class probabilities for regions described by feature vectors, and shows how context can be introduced in order to improve the matching performance.

Bond Graph and Bayesian Networks for Reliable Diagnosis

The Bond Graph, as a unified multidisciplinary tool, is widely used not only for dynamic modelling but also for Fault Detection and Isolation because of its structural and causal properties. A binary Fault Signature Matrix is systematically generated, but making the final binary decision is not always feasible because of the problems revealed by such a method. The purpose of this paper is to introduce a methodology for improving the classical binary method of decision-making, so that unknown and identical failure signatures can be treated and robustness improved. The approach consists of associating the evaluated residuals and the component reliability data to build a Hybrid Bayesian Network. This network is used in two distinct inference procedures: one for the continuous part and the other for the discrete part. The continuous nodes of the network are the prior probabilities of the component failures, which are used by the inference procedure on the discrete part to compute the posterior probabilities of the failures. The developed methodology is applied to a real steam generator pilot process.
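
The role of reliability-based priors in separating identical failure signatures can be shown with a small Bayes' rule calculation. This is a deliberately simplified, purely discrete sketch that assumes exactly one component has failed. It is not the Hybrid Bayesian Network of the paper, and the component names and numbers are hypothetical.

```python
def failure_posteriors(priors, likelihoods):
    """Posterior P(component failed | observed residuals) by Bayes' rule.

    Assumes mutually exclusive single-fault hypotheses (a simplification).
    """
    joint = {c: priors[c] * likelihoods[c] for c in priors}
    evidence = sum(joint.values())
    return {c: p / evidence for c, p in joint.items()}

# Hypothetical reliability-based priors for three components.
priors = {"pump": 0.02, "valve": 0.01, "sensor": 0.005}
# Pump and valve share the same signature, so the residuals fit both equally.
likelihoods = {"pump": 0.9, "valve": 0.9, "sensor": 0.1}
posteriors = failure_posteriors(priors, likelihoods)
```

A binary signature match cannot rank the pump against the valve here, whereas the posterior does: with equal likelihoods, the ratio of the posteriors equals the ratio of the priors.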

Error Rate Probability for Coded MQAM with MRC Diversity in the Presence of Cochannel Interferers over Nakagami-Fading Channels

Exact expressions for the bit-error probability (BEP) of coherent square detection of uncoded and coded M-ary quadrature amplitude modulation (MQAM), using an array of antennas with maximal ratio combining (MRC) in an interference-limited flat fading channel in a Nakagami-m fading environment, are derived. The analysis assumes an arbitrary number of independent and identically distributed Nakagami interferers. The results for coded MQAM are computed numerically for the case of the (24,12) extended Golay code and compared with uncoded MQAM by plotting error probabilities versus average signal-to-interference ratio (SIR) for various values of the order of diversity N and the number of distinct symbols M, in order to examine the effect of cochannel interferers on the performance of the digital communication system. The diversity gains and net gains are also presented in tabular form in order to examine the performance of the digital communication system in the presence of interferers as the order of diversity increases. The analytical results presented in this paper are expected to provide useful information for the design and analysis of digital communication systems with space diversity in wireless fading channels.

Augmentation Opportunity of Transmission Control Protocol Performance in Wireless Networks and Cellular Systems

The advancement of wireless technology, together with the wide use of mobile devices, has drawn the attention of the research and technological communities towards wireless environments such as Wireless Local Area Networks (WLANs), Wireless Wide Area Networks (WWANs), mobile systems, and ad-hoc networks. Unfortunately, wired and wireless networks differ significantly in terms of link reliability, bandwidth, and propagation delay. By adopting new solutions for these enhanced telecommunications, superior quality, efficiency, and opportunities will be provided where wireless communications were otherwise unfeasible. Some researchers define 4G as a significant improvement on 3G, in which the issues of current cellular networks will be solved and data transfer will play a more significant role. For others, 4G unifies cellular and wireless local area networks, and introduces new routing techniques, efficient solutions for sharing dedicated frequency bands, and increased mobility and bandwidth capacity. This paper discusses the possible solutions and enhancements that have been proposed to improve the performance of the Transmission Control Protocol (TCP) over different wireless networks, and investigates each approach in terms of its advantages and disadvantages.

Learning Spatio-Temporal Topology of a Multi-Camera Network by Tracking Multiple People

This paper presents a novel approach for representing the spatio-temporal topology of a camera network with overlapping and non-overlapping fields of view (FOVs). The topology is determined by tracking moving objects and establishing object correspondence across multiple cameras. To track people successfully in multiple camera views, we used the Merge-Split (MS) approach to handle object occlusion in a single camera and a grid-based approach to extract accurate object features. In addition, we considered the appearance of people and the transition time between entry and exit zones for tracking objects across blind regions of multiple cameras with non-overlapping FOVs. The main contribution of this paper is to estimate transition times between various entry and exit zones, and to graphically represent the camera topology as an undirected weighted graph using the transition probabilities.

Reliability Analysis of Underground Pipelines Using Subset Simulation

An advanced Monte Carlo simulation method, called Subset Simulation (SS), is presented in this paper for the time-dependent reliability prediction of underground pipelines. SS can provide better resolution at low failure probability levels by efficiently investigating rare failure events, which are commonly encountered in pipeline engineering applications. In the SS method, random samples leading to progressive failure are generated efficiently and used to compute probabilistic performance through statistical variables. SS gains its efficiency by expressing the probability of a rare event as a product of the larger conditional probabilities of a sequence of intermediate events. The efficiency of SS is demonstrated by numerical studies, and attention in this work is devoted to scrutinising the robustness of SS in pipe reliability assessment. It is hoped that this development work can promote the use of SS tools for uncertainty propagation in the decision-making process of reliability prediction for underground pipeline networks.
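
The product form behind Subset Simulation can be written out explicitly. The notation is an assumption, since the abstract defines no symbols: F is the failure event and F_1, ..., F_m is a nested sequence of intermediate events ending at F.

```latex
P(F) = P(F_1) \prod_{i=1}^{m-1} P(F_{i+1} \mid F_i),
\qquad F_1 \supset F_2 \supset \dots \supset F_m = F
```

For instance, with m = 4 levels and every factor equal to 0.1, the product gives P(F) = 10^{-4}, so a rare event is reached through four estimates of much more frequent conditional events.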

Optimal Image Compression Based on Sign and Magnitude Coding of Wavelet Coefficients

The wavelet transform is a very powerful tool for image compression. One of its advantages is that it provides both spatial and frequency localization of image energy. However, wavelet transform coefficients are defined by both a magnitude and a sign. While algorithms exist for efficiently coding the magnitude of the transform coefficients, they are not efficient for coding their sign. It is generally assumed that there is no compression gain to be obtained from coding the sign. Only recently have some authors begun to investigate the sign of wavelet coefficients in image coding. Some authors have assumed that the sign information bit of wavelet coefficients may be encoded with an estimated probability of 0.5; the same assumption is made for the refinement information bit. In this paper, we propose a new method for Separate Sign Coding (SSC) of wavelet image coefficients. The sign and the magnitude of wavelet image coefficients are examined to obtain their online probabilities. We use scalar quantization, in which the information on whether the wavelet coefficient belongs to the lower or the upper sub-interval of the uncertainty interval is also examined. We show that the sign information and the refinement information may be encoded with a probability of approximately 0.5 only after about five bit planes. Two maps are entropy encoded separately: the sign map and the magnitude map. The refinement information on whether the wavelet coefficient belongs to the lower or the upper sub-interval of the uncertainty interval is also entropy encoded. An algorithm is developed and simulations are performed on three standard grey-scale images: Lena, Barbara and Cameraman. Five scales are computed using the biorthogonal 9/7 wavelet filter bank. The results are compared with the JPEG2000 standard in terms of peak signal-to-noise ratio (PSNR) for the three images and in terms of subjective (visual) quality. It is shown that the proposed method outperforms JPEG2000. The proposed method is also compared with other codecs in the literature, and it is shown to perform very well in terms of PSNR.

Adaptive Network Intrusion Detection Learning: Attribute Selection and Classification

In this paper, a new learning approach for network intrusion detection using the naïve Bayesian classifier and the ID3 algorithm is presented, which identifies effective attributes from the training dataset, calculates the conditional probabilities for the best attribute values, and then correctly classifies all the examples of the training and testing datasets. Most current intrusion detection datasets are dynamic and complex and contain a large number of attributes. Some of the attributes may be redundant or contribute little to detection. It has been shown experimentally that selecting significant attributes is important for designing real-world intrusion detection systems (IDS). The purpose of this study is to identify effective attributes from the training dataset to build a classifier for network intrusion detection using data mining algorithms. The experimental results on the KDD99 benchmark intrusion detection dataset demonstrate that this new approach achieves high classification rates and reduces false positives using limited computational resources.

Fractal Dimension: An Index to Quantify Parameters in Genetic Algorithms

Genetic Algorithms (GAs) are direct search methods that require little information about the design space. This characteristic, together with their robustness, has made them very popular in recent decades. On the other hand, there is no guarantee that the method will reach the optimum result, which obliges the designer to run such algorithms more than once to obtain more reliable results. There have been many attempts to modify the algorithms to make them more efficient. In this paper, the fractal dimension (specifically, the Box Counting Method) is applied to establish the complexity of the design space, which is then used to determine the mutation and crossover probabilities (Pm and Pc). The methodology is followed by a numerical example for clarification. It is concluded that this modification improves the efficiency of GAs and yields more reliable results, especially for design spaces with higher fractal dimensions.
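
A minimal sketch of the box-counting estimate for a set of 2-D points is given below. It covers only the fractal dimension itself; the abstract does not say how the dimension is mapped to Pm and Pc, so that step is not attempted. The function names and the sample point sets are assumptions made for illustration.

```python
import math

def box_count(points, size):
    """Number of grid boxes of the given side length that contain a point."""
    return len({(math.floor(x / size), math.floor(y / size)) for x, y in points})

def box_counting_dimension(points, sizes):
    """Least-squares slope of log N(size) against log(1 / size)."""
    xs = [math.log(1.0 / s) for s in sizes]
    ys = [math.log(box_count(points, s)) for s in sizes]
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den

sizes = [1 / 4, 1 / 8, 1 / 16]
# Sanity checks on sets whose dimension is known: a line and a filled square.
line = [(i / 1024, i / 1024) for i in range(1024)]
square = [(i / 64, j / 64) for i in range(64) for j in range(64)]
line_dim = box_counting_dimension(line, sizes)
square_dim = box_counting_dimension(square, sizes)
```

The diagonal line yields a slope close to 1 and the filled square a slope close to 2, and a more irregular design space would fall between such values.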

Transient Analysis of a Single-Server Queue with Fixed-Size Batch Arrivals

The transient analysis of a queuing system with fixed-size batch Poisson arrivals and a single server with exponential service times is presented. The focus of the paper is on the functions that arise in the analysis of the transient behaviour of the queuing system. These functions are shown to be a generalization of the modified Bessel functions of the first kind, with the batch size B as the generalizing parameter. Results for the case of single-packet arrivals are obtained first. The similarities between the two families of functions are then used to obtain results for the general case of a batch-arrival queue with a batch size larger than one.

Frame and Burst Acquisition in TDMA Satellite Communication Networks with Transponder Hopping

The paper presents frame and burst acquisition in a satellite communication network based on time division multiple access (TDMA), in which transmissions may be carried on different transponders. A unique word pattern is used for the acquisition process. The search for the frame is aided by soft decisions on QPSK-modulated signals in an additive white Gaussian noise channel. Results show that when the false alarm rate is low, the probability of detection is also low and the acquisition time is long. Conversely, when the false alarm rate is high, the probability of detection is also high and the acquisition time is short. Thus, system operators can accept higher false alarm rates in exchange for higher detection probabilities and shorter acquisition times.
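
The trade-off between false alarms and detection can be illustrated with a simplified unique word detector. The sketch assumes hard-decision bits and a threshold on the number of matching bits, which differs from the soft-decision search used in the paper. The function names, the word length of 32 bits, and the bit error rate of 0.05 are hypothetical.

```python
from math import comb

def tail(n, threshold, p):
    """P(at least `threshold` successes in n independent Bernoulli(p) trials)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k)
               for k in range(threshold, n + 1))

def detection_tradeoff(n_bits, threshold, bit_error_rate):
    """Return (false alarm probability, detection probability) for a threshold."""
    # Random data agrees with each unique word bit with probability 0.5.
    p_false_alarm = tail(n_bits, threshold, 0.5)
    # A real unique word agrees on each bit unless a bit error occurs.
    p_detection = tail(n_bits, threshold, 1 - bit_error_rate)
    return p_false_alarm, p_detection

strict = detection_tradeoff(32, 30, 0.05)  # demand at least 30 of 32 matching bits
loose = detection_tradeoff(32, 24, 0.05)   # accept 24 or more matching bits
```

Lowering the threshold raises both probabilities at once, which matches the behaviour reported above: a high false alarm rate goes together with a high detection probability.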

A Formal Approach for Proof Constructions in Cryptography

In this article we explore the application of a formal proof system to verification problems in cryptography. Cryptographic properties concerning the correctness or security of cryptographic algorithms are of great interest. Besides some basic lemmata, we explore an implementation of a complex function that is used in cryptography. More precisely, we describe formal properties of this implementation that we prove by computer. We describe formalized probability distributions (σ-algebras, probability spaces and conditional probabilities). These are given in the formal language of the formal proof system Isabelle/HOL. Moreover, we prove Bayes' formula by computer. We also describe an application of the presented formalized probability distributions to cryptography. Furthermore, this article shows that computer proofs of complex cryptographic functions are possible by presenting an implementation of the Miller-Rabin primality test that admits formal verification. Our achievements are a step towards computer verification of cryptographic primitives and provide a basis for computer verification in cryptography. Computer verification can be applied to further problems in cryptographic research if the corresponding basic mathematical knowledge is available in a database.

Ruin Probabilities with Dependent Rates of Interest and Autoregressive Moving Average Structures

This paper studies ruin probabilities in two discrete-time risk models in which premiums, claims and rates of interest are modelled by three autoregressive moving average processes. Generalized Lundberg inequalities for ruin probabilities are derived using a recursive technique. A numerical example is given to illustrate the application of these probability inequalities.

Cash Flow Optimization on Synthetic CDOs

Collateralized Debt Obligations (CDOs) are not as widely used nowadays as they were before the 2007 subprime crisis. Nonetheless, optimizing the cash flows associated with synthetic CDOs remains an enthralling challenge. A Gaussian-based model is used here, in which default correlation and unconditional probabilities of default are highlighted. Numerous simulations are then performed with this model for different scenarios in order to evaluate the associated cash flows given a specific number of defaults at different periods of time. Cash flows are calculated not on a single bought or sold tranche but on a combination of bought and sold tranches. Under some assumptions, the simplex algorithm provides a way to find the maximum cash flow according to the correlation of defaults and the maturities. The Gaussian model used is not realistic in crisis situations. In addition, the present system does not handle buying or selling a portion of a tranche, only the whole tranche. However, the work provides the investor with relevant elements for deciding what to buy and sell, and when.

Dynamic Routing to Multiple Destinations in IP Networks using Hybrid Genetic Algorithm (DRHGA)

In this paper we propose a novel dynamic least-cost multicast routing protocol for IP networks that uses a hybrid genetic algorithm. Our protocol finds the multicast tree with minimum cost subject to delay, degree, and bandwidth constraints. The proposed protocol has the following features: i. a heuristic local search function has been devised and embedded in the normal genetic operations to increase speed and obtain the optimized tree; ii. it efficiently handles the dynamic situations that arise from either a change in the multicast group membership or a node or link failure; iii. two different crossover and mutation probabilities are used to maintain the diversity of solutions and achieve quick convergence. The simulation results show that the proposed protocol generates dynamic multicast trees with lower cost. The results also show that the proposed algorithm has a better convergence rate, a better dynamic request success rate and a shorter execution time than other existing algorithms. The effects of the degree and delay constraints on the multicast tree are also analyzed in terms of search success rate.