Image Spam Detection Using Color Features and K-Nearest Neighbor Classification

Image spam is a kind of email spam where the spam text is embedded with an image. It is a new spamming technique being used by spammers to send their messages to bulk of internet users. Spam email has become a big problem in the lives of internet users, causing time consumption and economic losses. The main objective of this paper is to detect the image spam by using histogram properties of an image. Though there are many techniques to automatically detect and avoid this problem, spammers employing new tricks to bypass those techniques, as a result those techniques are inefficient to detect the spam mails. In this paper we have proposed a new method to detect the image spam. Here the image features are extracted by using RGB histogram, HSV histogram and combination of both RGB and HSV histogram. Based on the optimized image feature set classification is done by using k- Nearest Neighbor(k-NN) algorithm. Experimental result shows that our method has achieved better accuracy. From the result it is known that combination of RGB and HSV histogram with k-NN algorithm gives the best accuracy in spam detection.

Selecting the Best Sub-Region Indexing the Images in the Case of Weak Segmentation Based On Local Color Histograms

Color Histogram is considered as the oldest method used by CBIR systems for indexing images. In turn, the global histograms do not include the spatial information; this is why the other techniques coming later have attempted to encounter this limitation by involving the segmentation task as a preprocessing step. The weak segmentation is employed by the local histograms while other methods as CCV (Color Coherent Vector) are based on strong segmentation. The indexation based on local histograms consists of splitting the image into N overlapping blocks or sub-regions, and then the histogram of each block is computed. The dissimilarity between two images is reduced, as consequence, to compute the distance between the N local histograms of the both images resulting then in N*N values; generally, the lowest value is taken into account to rank images, that means that the lowest value is that which helps to designate which sub-region utilized to index images of the collection being asked. In this paper, we make under light the local histogram indexation method in the hope to compare the results obtained against those given by the global histogram. We address also another noteworthy issue when Relying on local histograms namely which value, among N*N values, to trust on when comparing images, in other words, which sub-region among the N*N sub-regions on which we base to index images. Based on the results achieved here, it seems that relying on the local histograms, which needs to pose an extra overhead on the system by involving another preprocessing step naming segmentation, does not necessary mean that it produces better results. In addition to that, we have proposed here some ideas to select the local histogram on which we rely on to encode the image rather than relying on the local histogram having lowest distance with the query histograms.

Cardiac Disorder Classification Based On Extreme Learning Machine

In this paper, an extreme learning machine with an automatic segmentation algorithm is applied to heart disorder classification by heart sound signals. From continuous heart sound signals, the starting points of the first (S1) and the second heart pulses (S2) are extracted and corrected by utilizing an inter-pulse histogram. From the corrected pulse positions, a single period of heart sound signals is extracted and converted to a feature vector including the mel-scaled filter bank energy coefficients and the envelope coefficients of uniform-sized sub-segments. An extreme learning machine is used to classify the feature vector. In our cardiac disorder classification and detection experiments with 9 cardiac disorder categories, the proposed method shows significantly better performance than multi-layer perceptron, support vector machine, and hidden Markov model; it achieves the classification accuracy of 81.6% and the detection accuracy of 96.9%.

Image Contrast Enhancement based Sub-histogram Equalization Technique without Over-equalization Noise

In order to enhance the contrast in the regions where the pixels have similar intensities, this paper presents a new histogram equalization scheme. Conventional global equalization schemes over-equalizes these regions so that too bright or dark pixels are resulted and local equalization schemes produce unexpected discontinuities at the boundaries of the blocks. The proposed algorithm segments the original histogram into sub-histograms with reference to brightness level and equalizes each sub-histogram with the limited extents of equalization considering its mean and variance. The final image is determined as the weighted sum of the equalized images obtained by using the sub-histogram equalizations. By limiting the maximum and minimum ranges of equalization operations on individual sub-histograms, the over-equalization effect is eliminated. Also the result image does not miss feature information in low density histogram region since the remaining these area is applied separating equalization. This paper includes how to determine the segmentation points in the histogram. The proposed algorithm has been tested with more than 100 images having various contrasts in the images and the results are compared to the conventional approaches to show its superiority.

A new Adaptive Approach for Histogram based Mouth Segmentation

The segmentation of mouth and lips is a fundamental problem in facial image analyisis. In this paper we propose a method for lip segmentation based on rg-color histogram. Statistical analysis shows, using the rg-color-space is optimal for this purpose of a pure color based segmentation. Initially a rough adaptive threshold selects a histogram region, that assures that all pixels in that region are skin pixels. Based on that pixels we build a gaussian model which represents the skin pixels distribution and is utilized to obtain a refined, optimal threshold. We are not incorporating shape or edge information. In experiments we show the performance of our lip pixel segmentation method compared to the ground truth of our dataset and a conventional watershed algorithm.

Resource Leveling in Construction Projects using Re- Modified Minimum Moment Approach

An attempt in this paper proposes a re-modification to the minimum moment approach of resource leveling which is a modified minimum moment approach to the traditional method by Harris. The method is based on critical path method. The new approach suggests the difference between the methods in the selection criteria of activity which needs to be shifted for leveling resource histogram. In traditional method, the improvement factor found first to select the activity for each possible day of shifting. In modified method maximum value of the product of Resources Rate and Free Float was found first and improvement factor is then calculated for that activity which needs to be shifted. In the proposed method the activity to be selected first for shifting is based on the largest value of resource rate. The process is repeated for all the remaining activities for possible shifting to get updated histogram. The proposed method significantly reduces the number of iterations and is easier for manual computations.

Burstiness Reduction of a Doubly Stochastic AR-Modeled Uniform Activity VBR Video

Stochastic modeling of network traffic is an area of significant research activity for current and future broadband communication networks. Multimedia traffic is statistically characterized by a bursty variable bit rate (VBR) profile. In this paper, we develop an improved model for uniform activity level video sources in ATM using a doubly stochastic autoregressive model driven by an underlying spatial point process. We then examine a number of burstiness metrics such as the peak-to-average ratio (PAR), the temporal autocovariance function (ACF) and the traffic measurements histogram. We found that the former measure is most suitable for capturing the burstiness of single scene video traffic. In the last phase of this work, we analyse statistical multiplexing of several constant scene video sources. This proved, expectedly, to be advantageous with respect to reducing the burstiness of the traffic, as long as the sources are statistically independent. We observed that the burstiness was rapidly diminishing, with the largest gain occuring when only around 5 sources are multiplexed. The novel model used in this paper for characterizing uniform activity video was thus found to be an accurate model.

Developing the Color Temperature Histogram Method for Improving the Content-Based Image Retrieval

This paper proposes a new method for image searches and image indexing in databases with a color temperature histogram. The color temperature histogram can be used for performance improvement of content–based image retrieval by using a combination of color temperature and histogram. The color temperature histogram can be represented by a range of 46 colors. That is more than the color histogram and the dominant color temperature. Moreover, with our method the colors that have the same color temperature can be separated while the dominant color temperature can not. The results showed that the color temperature histogram retrieved an accurate image more often than the dominant color temperature method or color histogram method. This also took less time so the color temperature can be used for indexing and searching for images.

Parametric Modeling Approach for Call Holding Times for IP based Public Safety Networks via EM Algorithm

This paper presents parametric probability density models for call holding times (CHTs) into emergency call center based on the actual data collected for over a week in the public Emergency Information Network (EIN) in Mongolia. When the set of chosen candidates of Gamma distribution family is fitted to the call holding time data, it is observed that the whole area in the CHT empirical histogram is underestimated due to spikes of higher probability and long tails of lower probability in the histogram. Therefore, we provide the Gaussian parametric model of a mixture of lognormal distributions with explicit analytical expressions for the modeling of CHTs of PSNs. Finally, we show that the CHTs for PSNs are fitted reasonably by a mixture of lognormal distributions via the simulation of expectation maximization algorithm. This result is significant as it expresses a useful mathematical tool in an explicit manner of a mixture of lognormal distributions.

A Quantum Algorithm of Constructing Image Histogram

Histogram plays an important statistical role in digital image processing. However, the existing quantum image models are deficient to do this kind of image statistical processing because different gray scales are not distinguishable. In this paper, a novel quantum image representation model is proposed firstly in which the pixels with different gray scales can be distinguished and operated simultaneously. Based on the new model, a fast quantum algorithm of constructing histogram for quantum image is designed. Performance comparison reveals that the new quantum algorithm could achieve an approximately quadratic speedup than the classical counterpart. The proposed quantum model and algorithm have significant meanings for the future researches of quantum image processing.

Grouping and Indexing Color Features for Efficient Image Retrieval

Content-based Image Retrieval (CBIR) aims at searching image databases for specific images that are similar to a given query image based on matching of features derived from the image content. This paper focuses on a low-dimensional color based indexing technique for achieving efficient and effective retrieval performance. In our approach, the color features are extracted using the mean shift algorithm, a robust clustering technique. Then the cluster (region) mode is used as representative of the image in 3-D color space. The feature descriptor consists of the representative color of a region and is indexed using a spatial indexing method that uses *R -tree thus avoiding the high-dimensional indexing problems associated with the traditional color histogram. Alternatively, the images in the database are clustered based on region feature similarity using Euclidian distance. Only representative (centroids) features of these clusters are indexed using *R -tree thus improving the efficiency. For similarity retrieval, each representative color in the query image or region is used independently to find regions containing that color. The results of these methods are compared. A JAVA based query engine supporting query-by- example is built to retrieve images by color.

High Capacity Data Hiding based on Predictor and Histogram Modification

In this paper, we propose a high capacity image hiding technology based on pixel prediction and the difference of modified histogram. This approach is used the pixel prediction and the difference of modified histogram to calculate the best embedding point. This approach can improve the predictive accuracy and increase the pixel difference to advance the hiding capacity. We also use the histogram modification to prevent the overflow and underflow. Experimental results demonstrate that our proposed method within the same average hiding capacity can still keep high quality of image and low distortion

Tests for Gaussianity of a Stationary Time Series

One of the primary uses of higher order statistics in signal processing has been for detecting and estimation of non- Gaussian signals in Gaussian noise of unknown covariance. This is motivated by the ability of higher order statistics to suppress additive Gaussian noise. In this paper, several methods to test for non- Gaussianity of a given process are presented. These methods include histogram plot, kurtosis test, and hypothesis testing using cumulants and bispectrum of the available sequence. The hypothesis testing is performed by constructing a statistic to test whether the bispectrum of the given signal is non-zero. A zero bispectrum is not a proof of Gaussianity. Hence, other tests such as the kurtosis test should be employed. Examples are given to demonstrate the performance of the presented methods.

A New Method in Short-Term Heart Rate Variability — Five-Class Density Histogram

A five-class density histogram with an index named cumulative density was proposed to analyze the short-term HRV. 150 subjects participated in the test, falling into three groups with equal numbers -- the healthy young group (Young), the healthy old group (Old), and the group of patients with congestive heart failure (CHF). Results of multiple comparisons showed a significant differences of the cumulative density in the three groups, with values 0.0238 for Young, 0.0406 for Old and 0.0732 for CHF (p