The Classification Performance in Parametric and Nonparametric Discriminant Analysis for a Class-Unbalanced Data of Diabetes Risk Groups

Problems arising from unbalanced data sets are common in real-world applications. Many researchers have found that, because of unequal class distribution, the performance of existing classifiers tends to be biased towards the majority class. The k-nearest neighbors nonparametric discriminant analysis is a method that has been proposed for classifying unbalanced classes with good performance. In this study, discriminant analysis methods are investigated with respect to their misclassification error rates for class-imbalanced data of three diabetes risk groups. The purpose of this study was to compare the classification performance of parametric and nonparametric discriminant analysis in a three-class classification of class-imbalanced data of diabetes risk groups. Data from a project on maintaining the health of 599 employees of a government hospital in Bangkok were obtained for the classification problem. The employees were divided into three diabetes risk groups: non-risk (90%), risk (5%), and diabetic (5%). The original data, comprising the variables diabetes risk group, age, gender, blood glucose, and BMI, were analyzed and bootstrapped into 50 and 100 samples of 599 observations each for additional estimation of the misclassification error rate. Each data set was examined for departure from multivariate normality and for equality of the covariance matrices of the three risk groups. Both the original data and the bootstrap samples showed nonnormality and unequal covariance matrices. The parametric linear discriminant function, the quadratic discriminant function, and the nonparametric k-nearest neighbors discriminant function were applied to the original data and to the 50 and 100 bootstrap samples. In searching for the optimal classification rule, the prior probabilities were set to both equal proportions (0.33:0.33:0.33) and unequal proportions (0.90:0.05:0.05, 0.80:0.10:0.10, and 0.70:0.15:0.15).
The results from the 50 and 100 bootstrap samples indicated that the k-nearest neighbors approach with k=3 or k=4 and prior probabilities for non-risk:risk:diabetic of 0.90:0.05:0.05 or 0.80:0.10:0.10 gave the smallest misclassification error rate. The k-nearest neighbors approach is therefore suggested for classifying three-class imbalanced data of diabetes risk groups.
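
As an illustration of how prior probabilities enter a k-nearest neighbors discriminant rule, the following is a minimal sketch and not the study's actual procedure. It assumes the common form in which an observation is assigned to the class maximizing prior × (neighbors from that class among the k nearest) / (class size), and it uses Euclidean distance and a resubstitution error within each bootstrap sample. The function names `knn_classify` and `bootstrap_error` and all data are hypothetical.

```python
import math
import random
from collections import Counter


def knn_classify(x, train_X, train_y, priors, k=3):
    """Assign x to the class with the largest prior_t * k_t / n_t.

    k_t is the number of the k nearest training points that belong to
    class t and n_t is the size of class t (assumed form of the rule).
    """
    class_sizes = Counter(train_y)
    # Indices of the k training points closest to x (Euclidean distance,
    # an assumption; other metrics such as Mahalanobis are also used).
    nearest = sorted(range(len(train_X)),
                     key=lambda i: math.dist(x, train_X[i]))[:k]
    votes = Counter(train_y[i] for i in nearest)
    scores = {c: priors[c] * votes.get(c, 0) / class_sizes[c]
              for c in class_sizes}
    return max(scores, key=scores.get)


def bootstrap_error(X, y, priors, k=3, n_boot=50, seed=0):
    """Average resubstitution error rate over n_boot bootstrap samples."""
    rng = random.Random(seed)
    n = len(X)
    rates = []
    for _ in range(n_boot):
        # Resample n observations with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        bX = [X[i] for i in idx]
        by = [y[i] for i in idx]
        errors = sum(knn_classify(bX[i], bX, by, priors, k) != by[i]
                     for i in range(n))
        rates.append(errors / n)
    return sum(rates) / n_boot
```

Under this assumed rule, a large prior for the majority class can outweigh a local majority of minority-class neighbors, which is one reason the choice of priors matters for imbalanced data.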

A New Approach for Classifying Large Number of Mixed Variables

The problem of classifying objects into one of a number of predefined groups when the measured variables are of mixed types has been of interest to statisticians for many years. Several methods for dealing with this situation have been introduced, including parametric, semi-parametric and nonparametric approaches. This paper discusses the problem of classifying data when the number of measured mixed variables is larger than the sample size. A proposed idea that integrates dimensionality reduction via principal component analysis with a discriminant function based on the location model is discussed. The study aims to offer practitioners another potential tool for classification problems in which the observed variables are mixed and too numerous.
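
For orientation, the location model allocation rule can be written in its commonly cited two-group form with equal costs and prior probabilities. The abstract does not state the rule, so the notation here is an assumption: the binary variables define cells $m$, $\mathbf{y}$ is the vector of continuous variables (under the proposed idea this might be replaced by principal component scores, which is also an assumption), $\boldsymbol{\mu}_{im}$ is the mean of $\mathbf{y}$ in cell $m$ of group $\pi_i$, $\boldsymbol{\Sigma}$ is the common covariance matrix, and $p_{im}$ is the probability of cell $m$ in group $\pi_i$.

```latex
\text{allocate } (\mathbf{y}, m) \text{ to } \pi_1 \text{ if }\quad
(\boldsymbol{\mu}_{1m} - \boldsymbol{\mu}_{2m})^{\top} \boldsymbol{\Sigma}^{-1}
\left\{ \mathbf{y} - \tfrac{1}{2}\,(\boldsymbol{\mu}_{1m} + \boldsymbol{\mu}_{2m}) \right\}
\;\ge\; \log\frac{p_{2m}}{p_{1m}},
\qquad \text{otherwise to } \pi_2 .
```

Each cell thus has its own linear discriminant function, which is why the number of parameters grows quickly with the number of variables and why some form of dimensionality reduction becomes attractive.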

Target Detection using Adaptive Progressive Thresholding Based Shifted Phase-Encoded Fringe-Adjusted Joint Transform Correlator

A new target detection technique is presented in this paper for the identification of small boats in coastal surveillance. The proposed technique employs an adaptive progressive thresholding (APT) scheme to first process the given input scene and separate any objects present in the scene from the background. This preprocessing step yields an image containing only the foreground objects, such as boats, trees and other cluttered regions, and hence significantly reduces the search region for the correlation step. The processed image is then fed to the shifted phase-encoded fringe-adjusted joint transform correlator (SPFJTC) technique, which produces a single, delta-like correlation peak for a potential target present in the input scene. A post-processing step uses a peak-to-clutter ratio (PCR) to determine whether the boat in the input scene is authorized or unauthorized. Simulation results are presented to show that the proposed technique can successfully determine the presence of an authorized boat and identify any intruding boat present in the given input scene.
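
The post-processing idea can be sketched as follows. The abstract does not define the PCR, so this sketch assumes one common reading: the highest value in the correlation plane divided by the highest value outside a small guard window around that peak. The function name `peak_to_clutter_ratio`, the `guard` parameter, and any decision threshold applied to the result are hypothetical.

```python
def peak_to_clutter_ratio(plane, guard=1):
    """Ratio of the main correlation peak to the largest clutter value.

    plane is a 2-D list of non-negative correlation intensities. Values
    within `guard` rows and columns of the main peak are treated as part
    of the peak and excluded from the clutter (an assumed definition).
    """
    rows, cols = len(plane), len(plane[0])
    # Locate the main peak and its position.
    peak, pr, pc = max((plane[r][c], r, c)
                       for r in range(rows) for c in range(cols))
    # Largest value outside the guard window around the peak.
    clutter = max(
        (plane[r][c] for r in range(rows) for c in range(cols)
         if abs(r - pr) > guard or abs(c - pc) > guard),
        default=0.0,
    )
    return float("inf") if clutter == 0 else peak / clutter
```

A sharp, isolated peak gives a large ratio, while a plane dominated by clutter gives a ratio near 1; how the ratio is mapped to an authorized or unauthorized decision is not specified in the abstract.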

Improved Segmentation of Speckled Images Using an Arithmetic-to-Geometric Mean Ratio Kernel

In this work, we improve a previously developed segmentation scheme aimed at extracting edge information from speckled images using a maximum likelihood edge detector. The scheme was based on finding a threshold for the probability density function of a new kernel defined as the arithmetic mean-to-geometric mean ratio field over a circular neighborhood set and, in a general context, is founded on a likelihood random field model (LRFM). The segmentation algorithm was applied to discriminated speckle areas obtained using simple elliptic discriminant functions based on measures of the signal-to-noise ratio with fractional order moments. A rigorous stochastic analysis was used to derive an exact expression for the cumulative density function of the probability density function of the random field. Based on this, an accurate probability of error was derived and the performance of the scheme was analysed. The improved segmentation scheme performed well for both simulated and real images and showed superior results to those previously obtained using the original LRFM scheme and standard edge detection methods. In particular, the false alarm probability was markedly lower than that of the original LRFM method, with oversegmentation artifacts virtually eliminated. The importance of this work lies in the development of a stochastic-based segmentation that allows an accurate quantification of the probability of false detection. Non-visual quantification and misclassification in medical ultrasound speckled images is relatively new and is of interest to clinicians.
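
The kernel itself can be illustrated with a minimal sketch, assuming strictly positive pixel intensities and a neighborhood consisting of all pixels within a given radius of the center. The function name `am_gm_ratio` and the `radius` parameter are hypothetical, and the statistically derived threshold described in the abstract is not reproduced here.

```python
import math


def am_gm_ratio(image, row, col, radius=1):
    """Arithmetic-to-geometric mean ratio over a circular neighborhood.

    image is a 2-D list of strictly positive intensities. The ratio is
    1 for a constant neighborhood and grows as the values spread out.
    """
    values = [
        image[r][c]
        for r in range(max(0, row - radius), min(len(image), row + radius + 1))
        for c in range(max(0, col - radius), min(len(image[0]), col + radius + 1))
        # Keep only pixels inside the circle of the given radius.
        if (r - row) ** 2 + (c - col) ** 2 <= radius ** 2
    ]
    arithmetic = sum(values) / len(values)
    # Geometric mean computed through logarithms for numerical stability.
    geometric = math.exp(sum(math.log(v) for v in values) / len(values))
    return arithmetic / geometric
```

By the inequality of arithmetic and geometric means the ratio is never below 1, so values well above 1 indicate a neighborhood that straddles regions of different intensity, which is what makes the ratio usable as an edge statistic.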