Knowledge-Driven Decision Support System Based on Knowledge Warehouse and Data Mining by Improving Apriori Algorithm with Fuzzy Logic

In recent years, we have seen an increasing importance of research and study on knowledge source, decision support systems, data mining and procedure of knowledge discovery in data bases and it is considered that each of these aspects affects the others. In this article, we have merged information source and knowledge source to suggest a knowledge based system within limits of management based on storing and restoring of knowledge to manage information and improve decision making and resources. In this article, we have used method of data mining and Apriori algorithm in procedure of knowledge discovery one of the problems of Apriori algorithm is that, a user should specify the minimum threshold for supporting the regularity. Imagine that a user wants to apply Apriori algorithm for a database with millions of transactions. Definitely, the user does not have necessary knowledge of all existing transactions in that database, and therefore cannot specify a suitable threshold. Our purpose in this article is to improve Apriori algorithm. To achieve our goal, we tried using fuzzy logic to put data in different clusters before applying the Apriori algorithm for existing data in the database and we also try to suggest the most suitable threshold to the user automatically.

A Relationship Extraction Method from Literary Fiction Considering Korean Linguistic Features

The knowledge of the relationship between characters can help readers to understand the overall story or plot of the literary fiction. In this paper, we present a method for extracting the specific relationship between characters from a Korean literary fiction. Generally, methods for extracting relationships between characters in text are statistical or computational methods based on the sentence distance between characters without considering Korean linguistic features. Furthermore, it is difficult to extract the relationship with direction from text, such as one-sided love, because they consider only the weight of relationship, without considering the direction of the relationship. Therefore, in order to identify specific relationships between characters, we propose a statistical method considering linguistic features, such as syntactic patterns and speech verbs in Korean. The result of our method is represented by a weighted directed graph of the relationship between the characters. Furthermore, we expect that proposed method could be applied to the relationship analysis between characters of other content like movie or TV drama.

An Empirical Evaluation of Performance of Machine Learning Techniques on Imbalanced Software Quality Data

The development of change prediction models can help the software practitioners in planning testing and inspection resources at early phases of software development. However, a major challenge faced during the training process of any classification model is the imbalanced nature of the software quality data. A data with very few minority outcome categories leads to inefficient learning process and a classification model developed from the imbalanced data generally does not predict these minority categories correctly. Thus, for a given dataset, a minority of classes may be change prone whereas a majority of classes may be non-change prone. This study explores various alternatives for adeptly handling the imbalanced software quality data using different sampling methods and effective MetaCost learners. The study also analyzes and justifies the use of different performance metrics while dealing with the imbalanced data. In order to empirically validate different alternatives, the study uses change data from three application packages of open-source Android data set and evaluates the performance of six different machine learning techniques. The results of the study indicate extensive improvement in the performance of the classification models when using resampling method and robust performance measures.

Four Phase Methodology for Developing Secure Software

A simple and robust approach for developing secure software. A Four Phase methodology consists in developing the non-secure software in phase one, and for the next three phases, one phase for each of the secure developing types (i.e. self-protected software, secure code transformation, and the secure shield). Our methodology requires first the determination and understanding of the type of security level needed for the software. The methodology proposes the use of several teams to accomplish this task. One Software Engineering Developing Team, a Compiler Team, a Specification and Requirements Testing Team, and for each of the secure software developing types: three teams of Secure Software Developing, three teams of Code Breakers, and three teams of Intrusion Analysis. These teams will interact among each other and make decisions to provide a secure software code protected against a required level of intruder.

A Survey on Data-Centric and Data-Aware Techniques for Large Scale Infrastructures

Large scale computing infrastructures have been widely developed with the core objective of providing a suitable platform for high-performance and high-throughput computing. These systems are designed to support resource-intensive and complex applications, which can be found in many scientific and industrial areas. Currently, large scale data-intensive applications are hindered by the high latencies that result from the access to vastly distributed data. Recent works have suggested that improving data locality is key to move towards exascale infrastructures efficiently, as solutions to this problem aim to reduce the bandwidth consumed in data transfers, and the overheads that arise from them. There are several techniques that attempt to move computations closer to the data. In this survey we analyse the different mechanisms that have been proposed to provide data locality for large scale high-performance and high-throughput systems. This survey intends to assist scientific computing community in understanding the various technical aspects and strategies that have been reported in recent literature regarding data locality. As a result, we present an overview of locality-oriented techniques, which are grouped in four main categories: application development, task scheduling, in-memory computing and storage platforms. Finally, the authors include a discussion on future research lines and synergies among the former techniques.

X-Corner Detection for Camera Calibration Using Saddle Points

This paper discusses a corner detection algorithm for camera calibration. Calibration is a necessary step in many computer vision and image processing applications. Robust corner detection for an image of a checkerboard is required to determine intrinsic and extrinsic parameters. In this paper, an algorithm for fully automatic and robust X-corner detection is presented. Checkerboard corner points are automatically found in each image without user interaction or any prior information regarding the number of rows or columns. The approach represents each X-corner with a quadratic fitting function. Using the fact that the X-corners are saddle points, the coefficients in the fitting function are used to identify each corner location. The automation of this process greatly simplifies calibration. Our method is robust against noise and different camera orientations. Experimental analysis shows the accuracy of our method using actual images acquired at different camera locations and orientations.

Coloured Petri Nets Model for Web Architectures of Web and Database Servers

Web application architecture is important to achieve the desired performance for the application. Performance analysis studies are conducted to evaluate existing or planned systems. Web applications are used by hundreds of thousands of users simultaneously, which sometimes increases the risk of server failure in real time operations. We use Coloured Petri Net (CPN), a very powerful tool for modelling dynamic behaviour of a web application system. CPNs extend the vocabulary of ordinary Petri nets and add features that make them suitable for modelling large systems. The major focus of this work is on server side of web applications. The presented work focuses on modelling restructuring aspects, with major focus on concurrency and architecture, using CPN. It also focuses on bringing out the appropriate architecture for web and database servers given the number of concurrent users.

A Two-Stage Adaptation towards Automatic Speech Recognition System for Malay-Speaking Children

Recently, Automatic Speech Recognition (ASR) systems were used to assist children in language acquisition as it has the ability to detect human speech signal. Despite the benefits offered by the ASR system, there is a lack of ASR systems for Malay-speaking children. One of the contributing factors for this is the lack of continuous speech database for the target users. Though cross-lingual adaptation is a common solution for developing ASR systems for under-resourced language, it is not viable for children as there are very limited speech databases as a source model. In this research, we propose a two-stage adaptation for the development of ASR system for Malay-speaking children using a very limited database. The two stage adaptation comprises the cross-lingual adaptation (first stage) and cross-age adaptation. For the first stage, a well-known speech database that is phonetically rich and balanced, is adapted to the medium-sized Malay adults using supervised MLLR. The second stage adaptation uses the speech acoustic model generated from the first adaptation, and the target database is a small-sized database of the target users. We have measured the performance of the proposed technique using word error rate, and then compare them with the conventional benchmark adaptation. The two stage adaptation proposed in this research has better recognition accuracy as compared to the benchmark adaptation in recognizing children’s speech.

Analytic Network Process in Location Selection and Its Application to a Real Life Problem

Location selection presents a crucial decision problem in today’s business world where strategic decision making processes have critical importance. Thus, location selection has strategic importance for companies in boosting their strength regarding competition, increasing corporate performances and efficiency in addition to lowering production and transportation costs. A right choice in location selection has a direct impact on companies’ commercial success. In this study, a store location selection problem of Carglass Turkey which operates in vehicle glass branch is handled. As this problem includes both tangible and intangible criteria, Analytic Network Process (ANP) was accepted as the main methodology. The model consists of control hierarchy and BOCR subnetworks which include clusters of actors, alternatives and criteria. In accordance with the management’s choices, five different locations were selected. In addition to the literature review, a strict cooperation with the actor group was ensured and maintained while determining the criteria and during whole process. Obtained results were presented to the management as a report and its feasibility was confirmed accordingly.

Performance Analysis of OQSMS and MDDR Scheduling Algorithms for IQ Switches

Due to the increasing growth of internet users, the emerging applications of multicast are growing day by day and there is a requisite for the design of high-speed switches/routers. Huge amounts of effort have been done into the research area of multicast switch fabric design and algorithms. Different traffic scenarios are the influencing factor which affect the throughput and delay of the switch. The pointer based multicast scheduling algorithms are not performed well under non-uniform traffic conditions. In this work, performance of the switch has been analyzed by applying the advanced multicast scheduling algorithm OQSMS (Optimal Queue Selection Based Multicast Scheduling Algorithm), MDDR (Multicast Due Date Round-Robin Scheduling Algorithm) and MDRR (Multicast Dual Round-Robin Scheduling Algorithm). The results show that OQSMS achieves better switching performance than other algorithms under the uniform, non-uniform and bursty traffic conditions and it estimates optimal queue in each time slot so that it achieves maximum possible throughput.

Data Mining Approach for Commercial Data Classification and Migration in Hybrid Storage Systems

Parallel hybrid storage systems consist of a hierarchy of different storage devices that vary in terms of data reading speed performance. As we ascend in the hierarchy, data reading speed becomes faster. Thus, migrating the application’ important data that will be accessed in the near future to the uppermost level will reduce the application I/O waiting time; hence, reducing its execution elapsed time. In this research, we implement trace-driven two-levels parallel hybrid storage system prototype that consists of HDDs and SSDs. The prototype uses data mining techniques to classify application’ data in order to determine its near future data accesses in parallel with the its on-demand request. The important data (i.e. the data that the application will access in the near future) are continuously migrated to the uppermost level of the hierarchy. Our simulation results show that our data migration approach integrated with data mining techniques reduces the application execution elapsed time when using variety of traces in at least to 22%.

Implementation of ADETRAN Language Using Message Passing Interface

This paper describes the Message Passing Interface (MPI) implementation of ADETRAN language, and its evaluation on SX-ACE supercomputers. ADETRAN language includes pdo statement that specifies the data distribution and parallel computations and pass statement that specifies the redistribution of arrays. Two methods for implementation of pass statement are discussed and the performance evaluation using Splitting-Up CG method is presented. The effectiveness of the parallelization is evaluated and the advantage of one dimensional distribution is empirically confirmed by using the results of experiments.

CRLH and SRR Based Microwave Filter Design Useful for Communication Applications

CRLH (composite right/left-handed) based and SRR (split-ring resonator) based filters have been designed at microwave frequency which can provide better performance compared to conventional edge-coupled band-pass filter designed around the same frequency, 2.45 GHz. Both CRLH and SRR are unit cells used in metamaterial design. The primary aim of designing filters with such structures is to realize size reduction and also to realize novel filter performance. The CRLH based filter has been designed in microstrip transmission line, while the SRR based filter is designed with SRR loading in waveguide. The CRLH based filter designed at 2.45 GHz provides an insertion loss of 1.6 dB with harmonic suppression up to 10 GHz with 67 % size reduction when compared with a conventional edge-coupled band-pass filter designed around the same frequency. One dimensional (1-D) SRR matrix loaded in a waveguide shows the possibility of realizing a stop-band with sharp skirts in the pass-band while a stop-band in the pass-band of normal rectangular waveguide with tailoring of the dimensions of SRR unit cells. Such filters are expected to be very useful for communication systems at microwave frequency.

Enhancing the Network Security with Gray Code

Nowadays, network is an essential need in almost every part of human daily activities. People now can seamlessly connect to others through the Internet. With advanced technology, our personal data now can be more easily accessed. One of many components we are concerned for delivering the best network is a security issue. This paper is proposing a method that provides more options for security. This research aims to improve network security by focusing on the physical layer which is the first layer of the OSI model. The layer consists of the basic networking hardware transmission technologies of a network. With the use of observation method, the research produces a schematic design for enhancing the network security through the gray code converter.

Calcification Classification in Mammograms Using Decision Trees

Cancer affects people globally with breast cancer being a leading killer. Breast cancer is due to the uncontrollable multiplication of cells resulting in a tumour or neoplasm. Tumours are called ‘benign’ when cancerous cells do not ravage other body tissues and ‘malignant’ if they do so. As mammography is an effective breast cancer detection tool at an early stage which is the most treatable stage it is the primary imaging modality for screening and diagnosis of this cancer type. This paper presents an automatic mammogram classification technique using wavelet and Gabor filter. Correlation feature selection is used to reduce the feature set and selected features are classified using different decision trees.

Using Vulnerability to Reduce False Positive Rate in Intrusion Detection Systems

Intrusion Detection Systems are an essential tool for network security infrastructure. However, IDSs have a serious problem which is the generating of massive number of alerts, most of them are false positive ones which can hide true alerts and make the analyst confused to analyze the right alerts for report the true attacks. The purpose behind this paper is to present a formalism model to perform correlation engine by the reduction of false positive alerts basing on vulnerability contextual information. For that, we propose a formalism model based on non-monotonic JClassicδє description logic augmented with a default (δ) and an exception (є) operator that allows a dynamic inference according to contextual information.

Computational Feasibility Study of a Torsional Wave Transducer for Tissue Stiffness Monitoring

A torsional piezoelectric ultrasonic transducer design is proposed to measure shear moduli in soft tissue with direct access availability, using shear wave elastography technique. The measurement of shear moduli of tissues is a challenging problem, mainly derived from a) the difficulty of isolating a pure shear wave, given the interference of multiple waves of different types (P, S, even guided) emitted by the transducers and reflected in geometric boundaries, and b) the highly attenuating nature of soft tissular materials. An immediate application, overcoming these drawbacks, is the measurement of changes in cervix stiffness to estimate the gestational age at delivery. The design has been optimized using a finite element model (FEM) and a semi-analytical estimator of the probability of detection (POD) to determine a suitable geometry, materials and generated waves. The technique is based on the time of flight measurement between emitter and receiver, to infer shear wave velocity. Current research is centered in prototype testing and validation. The geometric optimization of the transducer was able to annihilate the compressional wave emission, generating a quite pure shear torsional wave. Currently, mechanical and electromagnetic coupling between emitter and receiver signals are being the research focus. Conclusions: the design overcomes the main described problems. The almost pure shear torsional wave along with the short time of flight avoids the possibility of multiple wave interference. This short propagation distance reduce the effect of attenuation, and allow the emission of very low energies assuring a good biological security for human use.

An Improved Face Recognition Algorithm Using Histogram-Based Features in Spatial and Frequency Domains

In this paper, we propose an improved face recognition algorithm using histogram-based features in spatial and frequency domains. For adding spatial information of the face to improve recognition performance, a region-division (RD) method is utilized. The facial area is firstly divided into several regions, then feature vectors of each facial part are generated by Binary Vector Quantization (BVQ) histogram using DCT coefficients in low frequency domains, as well as Local Binary Pattern (LBP) histogram in spatial domain. Recognition results with different regions are first obtained separately and then fused by weighted averaging. Publicly available ORL database is used for the evaluation of our proposed algorithm, which is consisted of 40 subjects with 10 images per subject containing variations in lighting, posing, and expressions. It is demonstrated that face recognition using RD method can achieve much higher recognition rate.

A Cuckoo Search with Differential Evolution for Clustering Microarray Gene Expression Data

A DNA microarray technology is a collection of microscopic DNA spots attached to a solid surface. Scientists use DNA microarrays to measure the expression levels of large numbers of genes simultaneously or to genotype multiple regions of a genome. Elucidating the patterns hidden in gene expression data offers a tremendous opportunity for an enhanced understanding of functional genomics. However, the large number of genes and the complexity of biological networks greatly increase the challenges of comprehending and interpreting the resulting mass of data, which often consists of millions of measurements. It is handled by clustering which reveals the natural structures and identifying the interesting patterns in the underlying data. In this paper, gene based clustering in gene expression data is proposed using Cuckoo Search with Differential Evolution (CS-DE). The experiment results are analyzed with gene expression benchmark datasets. The results show that CS-DE outperforms CS in benchmark datasets. To find the validation of the clustering results, this work is tested with one internal and one external cluster validation indexes.

Recognition of Tifinagh Characters with Missing Parts Using Neural Network

In this paper, we present an algorithm for reconstruction from incomplete 2D scans for tifinagh characters. This algorithm is based on using correlation between the lost block and its neighbors. This system proposed contains three main parts: pre-processing, features extraction and recognition. In the first step, we construct a database of tifinagh characters. In the second step, we will apply “shape analysis algorithm”. In classification part, we will use Neural Network. The simulation results demonstrate that the proposed method give good results.