Performance Optimization of Data Mining Application Using Radial Basis Function Classifier

Text data mining is a process of exploratory data analysis. Classification maps data into predefined groups or classes. It is often referred to as supervised learning because the classes are determined before examining the data. This paper describes proposed radial basis function Classifier that performs comparative crossvalidation for existing radial basis function Classifier. The feasibility and the benefits of the proposed approach are demonstrated by means of data mining problem: direct Marketing. Direct marketing has become an important application field of data mining. Comparative Cross-validation involves estimation of accuracy by either stratified k-fold cross-validation or equivalent repeated random subsampling. While the proposed method may have high bias; its performance (accuracy estimation in our case) may be poor due to high variance. Thus the accuracy with proposed radial basis function Classifier was less than with the existing radial basis function Classifier. However there is smaller the improvement in runtime and larger improvement in precision and recall. In the proposed method Classification accuracy and prediction accuracy are determined where the prediction accuracy is comparatively high.

Application of Java-based Pointcuts in Aspect Oriented Programming (AOP) for Data Race Detection

Wide applicability of concurrent programming practices in developing various software applications leads to different concurrency errors amongst which data race is the most important. Java provides greatest support for concurrent programming by introducing various concurrency packages. Aspect oriented programming (AOP) is modern programming paradigm facilitating the runtime interception of events of interest and can be effectively used to handle the concurrency problems. AspectJ being an aspect oriented extension to java facilitates the application of concepts of AOP for data race detection. Volatile variables are usually considered thread safe, but they can become the possible candidates of data races if non-atomic operations are performed concurrently upon them. Various data race detection algorithms have been proposed in the past but this issue of volatility and atomicity is still unaddressed. The aim of this research is to propose some suggestions for incorporating certain conditions for data race detection in java programs at the volatile fields by taking into account support for atomicity in java concurrency packages and making use of pointcuts. Two simple test programs will demonstrate the results of research. The results are verified on two different Java Development Kits (JDKs) for the purpose of comparison.

Performance Assessment of Computational Gridon Weather Indices from HOAPS Data

Long term rainfall analysis and prediction is a challenging task especially in the modern world where the impact of global warming is creating complications in environmental issues. These factors which are data intensive require high performance computational modeling for accurate prediction. This research paper describes a prototype which is designed and developed on grid environment using a number of coupled software infrastructural building blocks. This grid enabled system provides the demanding computational power, efficiency, resources, user-friendly interface, secured job submission and high throughput. The results obtained using sequential execution and grid enabled execution shows that computational performance has enhanced among 36% to 75%, for decade of climate parameters. Large variation in performance can be attributed to varying degree of computational resources available for job execution. Grid Computing enables the dynamic runtime selection, sharing and aggregation of distributed and autonomous resources which plays an important role not only in business, but also in scientific implications and social surroundings. This research paper attempts to explore the grid enabled computing capabilities on weather indices from HOAPS data for climate impact modeling and change detection.

Verification and Validation for Java Classes using Design by Contract. The Modular External Approach

Since the conception of JML, many tools, applications and implementations have been done. In this context, the users or developers who want to use JML seem surounded by many of these tools, applications and so on. Looking for a common infrastructure and an independent language to provide a bridge between these tools and JML, we developed an approach to embedded contracts in XML for Java: XJML. This approach offer us the ability to separate preconditions, posconditions and class invariants using JML and XML, so we made a front-end which can process Runtime Assertion Checking, Extended Static Checking and Full Static Program Verification. Besides, the capabilities for this front-end can be extended and easily implemented thanks to XML. We believe that XJML is an easy way to start the building of a Graphic User Interface delivering in this way a friendly and IDE independency to developers community wich want to work with JML.

Face Detection in Color Images using Color Features of Skin

Because of increasing demands for security in today-s society and also due to paying much more attention to machine vision, biometric researches, pattern recognition and data retrieval in color images, face detection has got more application. In this article we present a scientific approach for modeling human skin color, and also offer an algorithm that tries to detect faces within color images by combination of skin features and determined threshold in the model. Proposed model is based on statistical data in different color spaces. Offered algorithm, using some specified color threshold, first, divides image pixels into two groups: skin pixel group and non-skin pixel group and then based on some geometric features of face decides which area belongs to face. Two main results that we received from this research are as follow: first, proposed model can be applied easily on different databases and color spaces to establish proper threshold. Second, our algorithm can adapt itself with runtime condition and its results demonstrate desirable progress in comparison with similar cases.

A Novel Architecture for Wavelet based Image Fusion

In this paper, we focus on the fusion of images from different sources using multiresolution wavelet transforms. Based on reviews of popular image fusion techniques used in data analysis, different pixel and energy based methods are experimented. A novel architecture with a hybrid algorithm is proposed which applies pixel based maximum selection rule to low frequency approximations and filter mask based fusion to high frequency details of wavelet decomposition. The key feature of hybrid architecture is the combination of advantages of pixel and region based fusion in a single image which can help the development of sophisticated algorithms enhancing the edges and structural details. A Graphical User Interface is developed for image fusion to make the research outcomes available to the end user. To utilize GUI capabilities for medical, industrial and commercial activities without MATLAB installation, a standalone executable application is also developed using Matlab Compiler Runtime.

Iterative Process to Improve Simple Adaptive Subdivision Surfaces Method with Butterfly Scheme

Subdivision surfaces were applied to the entire meshes in order to produce smooth surfaces refinement from coarse mesh. Several schemes had been introduced in this area to provide a set of rules to converge smooth surfaces. However, to compute and render all the vertices are really inconvenient in terms of memory consumption and runtime during the subdivision process. It will lead to a heavy computational load especially at a higher level of subdivision. Adaptive subdivision is a method that subdivides only at certain areas of the meshes while the rest were maintained less polygons. Although adaptive subdivision occurs at the selected areas, the quality of produced surfaces which is their smoothness can be preserved similar as well as regular subdivision. Nevertheless, adaptive subdivision process burdened from two causes; calculations need to be done to define areas that are required to be subdivided and to remove cracks created from the subdivision depth difference between the selected and unselected areas. Unfortunately, the result of adaptive subdivision when it reaches to the higher level of subdivision, it still brings the problem with memory consumption. This research brings to iterative process of adaptive subdivision to improve the previous adaptive method that will reduce memory consumption applied on triangular mesh. The result of this iterative process was acceptable better in memory and appearance in order to produce fewer polygons while it preserves smooth surfaces.

Automatic Distance Compensation for Robust Voice-based Human-Computer Interaction

Distant-talking voice-based HCI system suffers from performance degradation due to mismatch between the acoustic speech (runtime) and the acoustic model (training). Mismatch is caused by the change in the power of the speech signal as observed at the microphones. This change is greatly influenced by the change in distance, affecting speech dynamics inside the room before reaching the microphones. Moreover, as the speech signal is reflected, its acoustical characteristic is also altered by the room properties. In general, power mismatch due to distance is a complex problem. This paper presents a novel approach in dealing with distance-induced mismatch by intelligently sensing instantaneous voice power variation and compensating model parameters. First, the distant-talking speech signal is processed through microphone array processing, and the corresponding distance information is extracted. Distance-sensitive Gaussian Mixture Models (GMMs), pre-trained to capture both speech power and room property are used to predict the optimal distance of the speech source. Consequently, pre-computed statistic priors corresponding to the optimal distance is selected to correct the statistics of the generic model which was frozen during training. Thus, model combinatorics are post-conditioned to match the power of instantaneous speech acoustics at runtime. This results to an improved likelihood in predicting the correct speech command at farther distances. We experiment using real data recorded inside two rooms. Experimental evaluation shows voice recognition performance using our method is more robust to the change in distance compared to the conventional approach. In our experiment, under the most acoustically challenging environment (i.e., Room 2: 2.5 meters), our method achieved 24.2% improvement in recognition performance against the best-performing conventional method.

Modeling and Analysis of Adaptive Buffer Sharing Scheme for Consecutive Packet Loss Reduction in Broadband Networks

High speed networks provide realtime variable bit rate service with diversified traffic flow characteristics and quality requirements. The variable bit rate traffic has stringent delay and packet loss requirements. The burstiness of the correlated traffic makes dynamic buffer management highly desirable to satisfy the Quality of Service (QoS) requirements. This paper presents an algorithm for optimization of adaptive buffer allocation scheme for traffic based on loss of consecutive packets in data-stream and buffer occupancy level. Buffer is designed to allow the input traffic to be partitioned into different priority classes and based on the input traffic behavior it controls the threshold dynamically. This algorithm allows input packets to enter into buffer if its occupancy level is less than the threshold value for priority of that packet. The threshold is dynamically varied in runtime based on packet loss behavior. The simulation is run for two priority classes of the input traffic – realtime and non-realtime classes. The simulation results show that Adaptive Partial Buffer Sharing (ADPBS) has better performance than Static Partial Buffer Sharing (SPBS) and First In First Out (FIFO) queue under the same traffic conditions.

Comanche – A Compiler-Driven I/O Management System

Most scientific programs have large input and output data sets that require out-of-core programming or use virtual memory management (VMM). Out-of-core programming is very error-prone and tedious; as a result, it is generally avoided. However, in many instance, VMM is not an effective approach because it often results in substantial performance reduction. In contrast, compiler driven I/O management will allow a program-s data sets to be retrieved in parts, called blocks or tiles. Comanche (COmpiler MANaged caCHE) is a compiler combined with a user level runtime system that can be used to replace standard VMM for out-of-core programs. We describe Comanche and demonstrate on a number of representative problems that it substantially out-performs VMM. Significantly our system does not require any special services from the operating system and does not require modification of the operating system kernel.

Static and Dynamic Complexity Analysis of Software Metrics

Software complexity metrics are used to predict critical information about reliability and maintainability of software systems. Object oriented software development requires a different approach to software complexity metrics. Object Oriented Software Metrics can be broadly classified into static and dynamic metrics. Static Metrics give information at the code level whereas dynamic metrics provide information on the actual runtime. In this paper we will discuss the various complexity metrics, and the comparison between static and dynamic complexity.

Theoretical Considerations for Software Component Metrics

We have defined two suites of metrics, which cover static and dynamic aspects of component assembly. The static metrics measure complexity and criticality of component assembly, wherein complexity is measured using Component Packing Density and Component Interaction Density metrics. Further, four criticality conditions namely, Link, Bridge, Inheritance and Size criticalities have been identified and quantified. The complexity and criticality metrics are combined to form a Triangular Metric, which can be used to classify the type and nature of applications. Dynamic metrics are collected during the runtime of a complete application. Dynamic metrics are useful to identify super-component and to evaluate the degree of utilisation of various components. In this paper both static and dynamic metrics are evaluated using Weyuker-s set of properties. The result shows that the metrics provide a valid means to measure issues in component assembly. We relate our metrics suite with McCall-s Quality Model and illustrate their impact on product quality and to the management of component-based product development.

Towards Model-Driven Communications

In modern distributed software systems, the issue of communication among composing parts represents a critical point, but the idea of extending conventional programming languages with general purpose communication constructs seems difficult to realize. As a consequence, there is a (growing) gap between the abstraction level required by distributed applications and the concepts provided by platforms that enable communication. This work intends to discuss how the Model Driven Software Development approach can be considered as a mature technology to generate in automatic way the schematic part of applications related to communication, by providing at the same time high level specialized languages useful in all the phases of software production. To achieve the goal, a stack of languages (meta-meta¬models) has been introduced in order to describe – at different levels of abstraction – the collaborative behavior of generic entities in terms of communication actions related to a taxonomy of messages. Finally, the generation of platforms for communication is viewed as a form of specification of language semantics, that provides executable models of applications together with model-checking supports and effective runtime environments.

Evaluating per-user Fairness of Goal-Oriented Parallel Computer Job Scheduling Policies

Fair share objective has been included into the goaloriented parallel computer job scheduling policy recently. However, the previous work only presented the overall scheduling performance. Thus, the per-user performance of the policy is still lacking. In this work, the details of per-user fair share performance under the Tradeoff-fs(Tx:avgX) policy will be further evaluated. A basic fair share priority backfill policy namely RelShare(1d) is also studied. The performance of all policies is collected using an event-driven simulator with three real job traces as input. The experimental results show that the high demand users are usually benefited under most policies because their jobs are large or they have a lot of jobs. In the large job case, one job executed may result in over-share during that period. In the other case, the jobs may be backfilled for performances. However, the users with a mixture of jobs may suffer because if the smaller jobs are executing the priority of the remaining jobs from the same user will be lower. Further analysis does not show any significant impact of users with a lot of jobs or users with a large runtime approximation error.