A K-Means Based Clustering Approach for Finding Faulty Modules in Open Source Software Systems

Prediction of fault-prone modules provides one way to support software quality engineering. Clustering is used to determine the intrinsic grouping in a set of unlabeled data. Among various clustering techniques available in literature K-Means clustering approach is most widely being used. This paper introduces K-Means based Clustering approach for software finding the fault proneness of the Object-Oriented systems. The contribution of this paper is that it has used Metric values of JEdit open source software for generation of the rules for the categorization of software modules in the categories of Faulty and non faulty modules and thereafter empirically validation is performed. The results are measured in terms of accuracy of prediction, probability of Detection and Probability of False Alarms.

Prediction of Reusability of Object Oriented Software Systems using Clustering Approach

In literature, there are metrics for identifying the quality of reusable components but the framework that makes use of these metrics to precisely predict reusability of software components is still need to be worked out. These reusability metrics if identified in the design phase or even in the coding phase can help us to reduce the rework by improving quality of reuse of the software component and hence improve the productivity due to probabilistic increase in the reuse level. As CK metric suit is most widely used metrics for extraction of structural features of an object oriented (OO) software; So, in this study, tuned CK metric suit i.e. WMC, DIT, NOC, CBO and LCOM, is used to obtain the structural analysis of OO-based software components. An algorithm has been proposed in which the inputs can be given to K-Means Clustering system in form of tuned values of the OO software component and decision tree is formed for the 10-fold cross validation of data to evaluate the in terms of linguistic reusability value of the component. The developed reusability model has produced high precision results as desired.