Abstract: Fault-proneness of a software module is the
probability that the module contains faults. A correlation exists
between the fault-proneness of the software and the measurable
attributes of the code (i.e. the static metrics) and of the testing (i.e.
the dynamic metrics). Early detection of fault-prone software
components enables verification experts to concentrate their time and
resources on the problem areas of the software system under
development. This paper introduces Genetic Algorithm based
software fault prediction models with Object-Oriented metrics. The
contribution of this paper is that it has used Metric values of JEdit
open source software for generation of the rules for the classification
of software modules in the categories of Faulty and non faulty
modules and thereafter empirically validation is performed. The
results shows that Genetic algorithm approach can be used for
finding the fault proneness in object oriented software components.
Abstract: Prediction of fault-prone modules provides one way to
support software quality engineering. Clustering is used to determine
the intrinsic grouping in a set of unlabeled data. Among various
clustering techniques available in literature K-Means clustering
approach is most widely being used. This paper introduces K-Means
based Clustering approach for software finding the fault proneness of
the Object-Oriented systems. The contribution of this paper is that it
has used Metric values of JEdit open source software for generation
of the rules for the categorization of software modules in the
categories of Faulty and non faulty modules and thereafter
empirically validation is performed. The results are measured in
terms of accuracy of prediction, probability of Detection and
Probability of False Alarms.
Abstract: Fault-proneness of a software module is the
probability that the module contains faults. To predict faultproneness
of modules different techniques have been proposed which
includes statistical methods, machine learning techniques, neural
network techniques and clustering techniques. The aim of proposed
study is to explore whether metrics available in the early lifecycle
(i.e. requirement metrics), metrics available in the late lifecycle (i.e.
code metrics) and metrics available in the early lifecycle (i.e.
requirement metrics) combined with metrics available in the late
lifecycle (i.e. code metrics) can be used to identify fault prone
modules using Genetic Algorithm technique. This approach has been
tested with real time defect C Programming language datasets of
NASA software projects. The results show that the fusion of
requirement and code metric is the best prediction model for
detecting the faults as compared with commonly used code based
model.
Abstract: The aim of this paper is to rank the impact of Object
Oriented(OO) metrics in fault prediction modeling using Artificial
Neural Networks(ANNs). Past studies on empirical validation of
object oriented metrics as fault predictors using ANNs have focused
on the predictive quality of neural networks versus standard
statistical techniques. In this empirical study we turn our attention to
the capability of ANNs in ranking the impact of these explanatory
metrics on fault proneness. In ANNs data analysis approach, there is
no clear method of ranking the impact of individual metrics. Five
ANN based techniques are studied which rank object oriented
metrics in predicting fault proneness of classes. These techniques are
i) overall connection weights method ii) Garson-s method iii) The
partial derivatives methods iv) The Input Perturb method v) the
classical stepwise methods. We develop and evaluate different
prediction models based on the ranking of the metrics by the
individual techniques. The models based on overall connection
weights and partial derivatives methods have been found to be most
accurate.
Abstract: Missing data is a persistent problem in almost all
areas of empirical research. The missing data must be treated very
carefully, as data plays a fundamental role in every analysis.
Improper treatment can distort the analysis or generate biased results.
In this paper, we compare and contrast various imputation techniques
on missing data sets and make an empirical evaluation of these
methods so as to construct quality software models. Our empirical
study is based on NASA-s two public dataset. KC4 and KC1. The
actual data sets of 125 cases and 2107 cases respectively, without
any missing values were considered. The data set is used to create
Missing at Random (MAR) data Listwise Deletion(LD), Mean
Substitution(MS), Interpolation, Regression with an error term and
Expectation-Maximization (EM) approaches were used to compare
the effects of the various techniques.
Abstract: There is lot of work done in prediction of the fault proneness of the software systems. But, it is the severity of the faults that is more important than number of faults existing in the developed system as the major faults matters most for a developer and those major faults needs immediate attention. In this paper, we tried to predict the level of impact of the existing faults in software systems. Neuro-Fuzzy based predictor models is applied NASA-s public domain defect dataset coded in C programming language. As Correlation-based Feature Selection (CFS) evaluates the worth of a subset of attributes by considering the individual predictive ability of each feature along with the degree of redundancy between them. So, CFS is used for the selecting the best metrics that have highly correlated with level of severity of faults. The results are compared with the prediction results of Logistic Models (LMT) that was earlier quoted as the best technique in [17]. The results are recorded in terms of Accuracy, Mean Absolute Error (MAE) and Root Mean Squared Error (RMSE). The results show that Neuro-fuzzy based model provide a relatively better prediction accuracy as compared to other models and hence, can be used for the modeling of the level of impact of faults in function based systems.
Abstract: In this paper, subtractive clustering based fuzzy inference system approach is used for early detection of faults in the function oriented software systems. This approach has been tested with real time defect datasets of NASA software projects named as PC1 and CM1. Both the code based model and joined model (combination of the requirement and code based metrics) of the datasets are used for training and testing of the proposed approach. The performance of the models is recorded in terms of Accuracy, MAE and RMSE values. The performance of the proposed approach is better in case of Joined Model. As evidenced from the results obtained it can be concluded that Clustering and fuzzy logic together provide a simple yet powerful means to model the earlier detection of faults in the function oriented software systems.