Abstract: Matching high-dimensional features between images is computationally expensive for exhaustive search approaches in computer vision. Although the feature dimension can be reduced by simplifying the prior knowledge of homography, matching accuracy may degrade as a tradeoff. In this paper, we present a feature matching method based on the k-means algorithm that reduces the matching cost and matches features between images without relying on a simplified geometric assumption. Experimental results show that the proposed method outperforms previous linear exhaustive search approaches in terms of the inlier ratio of matched pairs.
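As a rough illustration of how clustering can cut the cost of matching, the sketch below groups the features of one image with k-means and then searches only the cluster nearest to each query feature instead of the whole set. It is a minimal sketch under assumptions, not the authors' method: the function names (`kmeans`, `match_features`), the value of `k`, and the plain Euclidean distance are hypothetical, and the abstract does not state how the clusters are actually used.

```python
import random


def dist2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign(points, centroids):
    """Group point indices by their nearest centroid."""
    clusters = [[] for _ in centroids]
    for i, p in enumerate(points):
        nearest = min(range(len(centroids)), key=lambda c: dist2(p, centroids[c]))
        clusters[nearest].append(i)
    return clusters


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means; returns (centroids, clusters of point indices)."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    dim = len(points[0])
    for _ in range(iters):
        for c, members in enumerate(assign(points, centroids)):
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(points[i][d] for i in members) / len(members)
                                for d in range(dim)]
    return centroids, assign(points, centroids)


def match_features(features_a, features_b, k=2):
    """Match each feature of A to its nearest feature inside one cluster of B.

    Assumption: searching only the closest cluster is an approximation and
    may miss the true nearest neighbour near cluster boundaries.
    """
    centroids, clusters = kmeans(features_b, k)
    matches = []
    for i, f in enumerate(features_a):
        c = min(range(len(centroids)), key=lambda j: dist2(f, centroids[j]))
        if clusters[c]:
            j = min(clusters[c], key=lambda idx: dist2(f, features_b[idx]))
            matches.append((i, j))
    return matches


features_b = [(0, 0), (0, 1), (10, 10), (10, 11)]
features_a = [(0.1, 0.9), (9.9, 10.2)]
print(match_features(features_a, features_b, k=2))  # [(0, 1), (1, 2)]
```

With `n` features in the second image, each query compares against `k` centroids plus roughly `n / k` cluster members, rather than all `n` features.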
Abstract: Image Multi-label Classification (IMC) assigns a label or a set of labels to an image. The high demand for image annotation and archiving on the web has led researchers to develop many algorithms for this application domain. Existing IMC techniques have two drawbacks: they take into account neither the description of the elementary characteristics of the image nor the correlation between labels. In this paper, we present an algorithm (MIML-HOGLPP) that handles both limitations simultaneously. The algorithm uses the histogram of gradients as its feature descriptor and applies the Label Priority Power-set as a multi-label transformation to address label correlation. Experiments show that MIML-HOGLPP performs better than two existing techniques on some of the evaluation metrics.
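To make the idea of a power-set transformation concrete, the sketch below maps each distinct label set to a single class id, so that a standard multi-class classifier can learn label combinations (and hence label correlation) directly. This shows only the plain label power-set transformation; the "priority" component of the abstract's method is not described there and is not modeled here. All function names are hypothetical.

```python
def fit_label_powerset(label_sets):
    """Assign one class id per distinct label set, in first-seen order."""
    class_of = {}
    for labels in label_sets:
        key = frozenset(labels)
        if key not in class_of:
            class_of[key] = len(class_of)
    return class_of


def to_classes(label_sets, class_of):
    """Turn multi-label targets into single-class targets."""
    return [class_of[frozenset(labels)] for labels in label_sets]


def to_labels(class_id, class_of):
    """Recover the label set that a predicted class id stands for."""
    inverse = {cid: labels for labels, cid in class_of.items()}
    return set(inverse[class_id])


train_labels = [{"sky", "sea"}, {"sky"}, {"sea", "sky"}, {"tree"}]
mapping = fit_label_powerset(train_labels)
print(to_classes(train_labels, mapping))  # [0, 1, 0, 2]
```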
Abstract: In this paper, a scalable augmented reality framework for handheld devices is presented. The framework uses a server-client data communication structure, in which the search for tracking targets among a database of images is performed on the server side while pixel-wise 3D tracking is performed on the client side, which in this case is a handheld mobile device. Image search on the server side adopts a residual-enhanced image descriptor representation that makes the framework scalable. The tracking algorithm on the client side is based on a gravity-aligned feature descriptor, which takes advantage of a sensor-equipped mobile device, and an optimized intensity-based image alignment approach that ensures the accuracy of 3D tracking. Automatic content streaming is achieved by using a key-frame selection algorithm, client working phase monitoring and standardized rules for content communication between the server and client. A recognition accuracy test performed on a standard dataset shows that the method adopted in the presented framework outperforms the Bag-of-Words (BoW) method used in some previous systems. Experimental tests conducted on a set of video sequences indicate real-time performance of the tracking system, with a frame rate of 15-30 frames per second. The presented framework is shown to be functional in practical situations with a demonstration application on a campus walk-around.
Abstract: Texture is an important characteristic in real and
synthetic scenes. Texture analysis plays a critical role in inspecting
surfaces and provides important techniques in a variety of
applications. Although several descriptors have been presented to
extract texture features, the development of object recognition is still a
difficult task due to the complex aspects of texture. Recently, many
robust and scaling-invariant image features such as SIFT, SURF and
ORB have been successfully used in image retrieval and object
recognition. In this paper, we compare the texture classification
performance of these feature descriptors combined with k-means
clustering. Different classifiers, including K-NN, Naive Bayes, Back
Propagation (BP) Neural Network, Decision Tree and Kstar, were applied to
three texture image sets: UIUCTex, KTH-TIPS and Brodatz.
Experimental results show that SIFT achieves the best average
accuracy on UIUCTex and KTH-TIPS, while SURF performs best on the
Brodatz texture set. The BP neural network performs best in
test set classification among all classifiers used.
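One common way to combine local descriptors with k-means clustering is a bag-of-visual-words histogram: each descriptor is assigned to its nearest cluster centre and the image is represented by the normalized counts, which can then be fed to any of the classifiers listed. The sketch below shows only that quantization step, assuming the centroids have already been learned; whether the paper uses exactly this encoding is an assumption, and `bow_histogram` is a hypothetical name.

```python
def bow_histogram(descriptors, centroids):
    """Normalized histogram of nearest-centroid assignments for one image."""
    counts = [0] * len(centroids)
    for d in descriptors:
        nearest = min(
            range(len(centroids)),
            key=lambda c: sum((x - y) ** 2 for x, y in zip(d, centroids[c])),
        )
        counts[nearest] += 1
    total = sum(counts)
    if total == 0:  # an image without descriptors gets an all-zero vector
        return [0.0] * len(centroids)
    return [count / total for count in counts]


# Toy 2-D "descriptors"; real SIFT descriptors would have 128 dimensions.
centroids = [(0, 0), (10, 10)]
descriptors = [(1, 0), (0, 1), (9, 9), (1, 1)]
print(bow_histogram(descriptors, centroids))  # [0.75, 0.25]
```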
Abstract: The Obturator Foramen is a specific structure in pelvic
bone images, and its recognition is a new concept in medical image
processing. Moreover, segmentation of bone structures such as the
Obturator Foramen plays an essential role in clinical research in
orthopedics. In this paper, we present a novel method to analyze the
similarity between the substructures of the imaged region and a
hand-drawn template as a preprocessing step for computing pelvic
bone rotation on hip radiographs. The method integrates
Marker-controlled Watershed segmentation with a Zernike moment
feature descriptor to detect the Obturator Foramen accurately.
Marker-controlled Watershed segmentation is applied to
separate the Obturator Foramen from the background. Then, the
Zernike moment feature descriptor is used to match the
binary template image against the segmented binary image for
final extraction of the Obturator Foramen. Finally, the pelvic bone rotation
rate of each hip radiograph is calculated automatically to
select and eliminate hip radiographs for further studies that depend
on pelvic bone angle measurements. The proposed method was tested
on 100 randomly selected hip radiographs. The experimental results
demonstrate that the proposed method segments the Obturator
Foramen with 96% accuracy.
Abstract: One of the most critical decision points in the design of a
face recognition system is the choice of an appropriate face representation.
Effective feature descriptors are expected to convey sufficient, invariant
and non-redundant facial information. In this work we propose a set of
Hahn moments as a new approach for feature description. Hahn moments
have been widely used in image analysis due to their invariance, non-redundancy
and ability to extract features both globally and locally.
To assess the applicability of Hahn moments to face recognition, we
conduct two experiments on the Olivetti Research Laboratory (ORL)
database and the University of Notre Dame (UND) X1 biometric collection.
A fusion of the global features with the features from local facial
regions is used as input to a conventional k-NN classifier. The
method correctly recognizes 93% of subjects on
the ORL database and 94% on the UND database.
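The fusion-then-classify step can be sketched as follows: global and local feature vectors are concatenated into one vector, which is then classified by majority vote among its k nearest training vectors. This is a minimal sketch under assumptions. The moment computation itself is omitted, the feature values are made up for illustration, and `fuse` and `knn_predict` are hypothetical names rather than the authors' implementation.

```python
from collections import Counter


def fuse(global_features, local_feature_sets):
    """Concatenate global features with the features of each local region."""
    fused = list(global_features)
    for region_features in local_feature_sets:
        fused.extend(region_features)
    return fused


def knn_predict(train, query, k=3):
    """Majority vote among the k training vectors closest to `query`.

    `train` is a list of (vector, label) pairs; distance is Euclidean.
    """
    nearest = sorted(
        train,
        key=lambda item: sum((a - b) ** 2 for a, b in zip(item[0], query)),
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy data: one global value plus two one-value local regions per face.
train = [
    ([0, 0, 0], "A"), ([0, 1, 0], "A"),
    ([5, 5, 5], "B"), ([6, 5, 5], "B"),
]
query = fuse([0.2], [[0.1], [0.0]])
print(knn_predict(train, query, k=3))  # A
```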
Abstract: The Scale Invariant Feature Transform (SIFT) has been
widely applied, but extracting SIFT features is complicated and
time-consuming. In this paper, to meet the demands of real-time
applications, SIFT is parallelized and optimized on a cluster system;
the resulting implementation is named pSIFT. Redundant storage and communication are
used for boundary data to improve performance, and before
the feature descriptors are represented, data reallocation is adopted to
keep the load balanced in pSIFT. Experimental results show that pSIFT
achieves good speedup and scalability.
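Redundant storage of boundary data is often realized by giving each worker a few extra rows beyond its own strip (a "halo"), so that neighbourhood operations near strip edges need no extra communication. The sketch below computes such overlapping row ranges. It is a generic illustration only: the abstract does not describe pSIFT's actual partitioning, and the function name and `halo` parameter are hypothetical.

```python
def split_rows_with_halo(height, workers, halo):
    """Split `height` image rows among `workers`, padding each strip.

    Returns one (start, end) half-open row range per worker. Each range
    covers the worker's own rows plus up to `halo` redundant rows on either
    side, clipped to the image bounds.
    """
    base, extra = divmod(height, workers)
    ranges = []
    start = 0
    for w in range(workers):
        rows = base + (1 if w < extra else 0)  # spread the remainder evenly
        end = start + rows
        ranges.append((max(0, start - halo), min(height, end + halo)))
        start = end
    return ranges


print(split_rows_with_halo(height=10, workers=3, halo=2))
# [(0, 6), (2, 9), (5, 10)]
```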
Abstract: Linearization of graph embedding has emerged
as an effective dimensionality reduction technique in pattern
recognition. However, due to its linear nature, it may not be optimal for
nonlinearly distributed real-world data, such as faces. Therefore, a
kernelization of graph embedding is proposed as a dimensionality
reduction technique for face recognition. To further boost the
recognition capability of the proposed technique, Fisher's
criterion is adopted in the objective function for better data
discrimination. The proposed technique is able to characterize the
underlying intra-class structure as well as the inter-class separability.
Experimental results on the FRGC database validate the effectiveness of
the proposed technique as a feature descriptor.
Abstract: The purpose of this paper is to detect humans in images.
This paper proposes a method for extracting human body feature descriptors consisting of projected edge component series. The feature descriptor can express the appearance and shape of humans through the local
and global distribution of edges. Our method is evaluated with a linear SVM classifier on the Daimler-Chrysler pedestrian dataset and tested with
various sub-region sizes. The results show that the accuracy of the
proposed method is similar to that of the Histogram of Oriented Gradients (HOG)
feature descriptor, while its feature extraction process is simpler and faster than existing methods.
Abstract: In this paper, we propose the distribution of mesh
normal vector directions as a feature descriptor of a 3D model.
Normal vectors represent the overall shape of a model well. The
distribution of normal vectors is sampled in proportion to each
polygon's area, so that surfaces with less
area contribute less to the feature descriptor, which
enhances retrieval performance. In terms of ANMRR,
the proposed descriptor shows an improvement of approximately 12.4% to 34.7% over the existing
method.
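An area-weighted normal distribution can be sketched as a histogram in which every triangle votes for the direction of its normal with a weight equal to its area. The sketch below uses a deliberately coarse binning (six bins, one per signed coordinate axis) and direct area weighting instead of sampling. Both are simplifying assumptions for illustration, not the binning or sampling scheme of the paper, and all names are hypothetical.

```python
import math


def sub(a, b):
    """Component-wise difference of two 3-D points."""
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def cross(u, v):
    """Cross product of two 3-D vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def normal_histogram(vertices, triangles):
    """Area-weighted histogram of triangle normal directions.

    Bins are ordered +x, -x, +y, -y, +z, -z; each triangle falls into the
    bin of its normal's dominant axis and contributes its area as weight.
    """
    hist = [0.0] * 6
    for i, j, k in triangles:
        n = cross(sub(vertices[j], vertices[i]), sub(vertices[k], vertices[i]))
        length = math.sqrt(sum(c * c for c in n))
        if length == 0:  # skip degenerate triangles
            continue
        area = 0.5 * length  # |cross product| is twice the triangle area
        axis = max(range(3), key=lambda a: abs(n[a]))
        hist[2 * axis + (0 if n[axis] > 0 else 1)] += area
    total = sum(hist)
    return [h / total for h in hist] if total else hist


vertices = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 2)]
triangles = [(0, 1, 2), (0, 3, 1)]  # areas 0.5 (+z normal) and 1.0 (+y normal)
print(normal_histogram(vertices, triangles))
```

In this toy mesh the larger triangle accounts for two thirds of the total weight, which is the effect the area-proportional sampling in the abstract is meant to achieve.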
Abstract: Content-based Image Retrieval (CBIR) aims at searching image databases for specific images that are similar to a given query image, based on matching features derived from the image content. This paper focuses on a low-dimensional color-based indexing technique for achieving efficient and effective retrieval performance. In our approach, the color features are extracted using the mean shift algorithm, a robust clustering technique. The cluster (region) mode is then used as the representative of the image in 3-D color space. The feature descriptor consists of the representative color of a region and is indexed using a spatial indexing method based on the R*-tree, thus avoiding the high-dimensional indexing problems associated with the traditional color histogram. Alternatively, the images in the database are clustered based on region feature similarity using Euclidean distance. Only the representative features (centroids) of these clusters are indexed using the R*-tree, thus improving efficiency. For similarity retrieval, each representative color in the query image or region is used independently to find regions containing that color. The results of these methods are compared. A Java-based query engine supporting query-by-example is built to retrieve images by color.
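The mode-seeking step of mean shift can be sketched in a few lines: starting from a pixel color, repeatedly move to the mean of all colors within a fixed radius until the position stops changing. The sketch below uses a flat kernel in 3-D color space. The bandwidth value, the flat kernel, and the function name are assumptions for illustration and are not taken from the paper.

```python
def mean_shift_mode(points, start, bandwidth, iters=50, tol=1e-6):
    """Follow flat-kernel mean shift from `start` to a local density mode."""
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    x = [float(v) for v in start]
    for _ in range(iters):
        near = [p for p in points if dist2(p, x) <= bandwidth ** 2]
        if not near:  # no samples inside the window; stay where we are
            break
        new = [sum(p[d] for p in near) / len(near) for d in range(len(x))]
        converged = dist2(new, x) < tol ** 2
        x = new
        if converged:
            break
    return x


# Toy colors: a dark cluster and a bright cluster in RGB space.
colors = [(10, 10, 10), (12, 10, 10), (11, 13, 10),
          (200, 200, 200), (202, 200, 200)]
print(mean_shift_mode(colors, start=(12, 10, 10), bandwidth=10))
# [11.0, 11.0, 10.0]
```

The returned mode is a single 3-D point, which is what keeps the index low-dimensional compared with a full color histogram.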
Abstract: An automatic method for the extraction of feature points for face-based applications is proposed. The system is based upon volumetric feature descriptors, which in this paper have been extended to incorporate scale space. The method is robust to noise and can extract local and holistic features simultaneously from faces stored in a database. Extracted features are stable over a range of faces, with results indicating that, in terms of intra-ID variability, the technique can outperform manual landmarking.
Abstract: We propose a fast and robust hierarchical face detection system that finds and localizes face images with a cascade of classifiers. Three modules contribute to the efficiency of our detector. First, heterogeneous feature descriptors are exploited to enrich the types and number of features for face representation. Second, a PSO-Adaboost algorithm is proposed to efficiently select discriminative features from a large pool of available features and combine them into the final ensemble classifier. Compared with the standard exhaustive Adaboost for feature selection, the new PSO-Adaboost algorithm reduces the training time by up to 20 times. Finally, a three-stage hierarchical classifier framework is developed for rapid background removal. In particular, candidate face regions are detected more quickly by using a large window in the first stage. Nonlinear SVM classifiers are used instead of decision stump functions in the last stage to remove the remaining complex non-face patterns that cannot be rejected in the previous two stages. Experimental results show that our detector achieves superior performance on the CMU+MIT frontal face dataset.
Abstract: In this paper, we propose a new approach to query-by-humming, focusing on MP3 song databases. Since melody representation is much more difficult for MP3 songs than for symbolic performance data, we extract feature descriptors from the vocal part of the songs. Our approach is based on signal filtering, sub-band spectral processing, MDCT coefficient analysis and peak energy detection, ignoring the background music as much as possible. Finally, we apply a dual dynamic programming algorithm for feature similarity matching. Experiments show its online performance in terms of precision and efficiency.
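Dynamic programming is a natural fit for comparing a hummed query with a stored melody because the two sequences usually differ in tempo. The sketch below uses classic dynamic time warping (DTW) as a stand-in: the abstract's "dual dynamic programming" algorithm is not specified, so this is a swapped-in, well-known technique rather than the authors' method, and the pitch values are made up for illustration.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two numeric sequences.

    cost[i][j] holds the cheapest alignment of a[:i] with b[:j]; each step
    may advance in `a`, in `b`, or in both.
    """
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(cost[i - 1][j],      # a advances
                                    cost[i][j - 1],      # b advances
                                    cost[i - 1][j - 1])  # both advance
    return cost[n][m]


# The same contour hummed with one held note aligns at zero cost.
print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))  # 0.0
print(dtw_distance([1, 2, 3], [2, 3, 4]))     # 2.0
```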