Abstract: We have proposed an information filtering system
using index word selection from a document set based on the
topics included in a set of documents. This method narrows
down the particularly characteristic words in a document set
and the topics are obtained by Sparse Non-negative Matrix
Factorization. In information filtering, a document is often
represented with the vector in which the elements correspond
to the weight of the index words, and the dimension of the
vector becomes larger as the number of documents is
increased. Therefore, it is possible that useless words as index
words for the information filtering are included. In order to
address the problem, the dimension needs to be reduced. Our
proposal reduces the dimension by selecting index words
based on the topics included in a document set. We have
applied the Sparse Non-negative Matrix Factorization to the
document set to obtain these topics. The filtering is carried out
based on a centroid of the learning document set. The centroid
is regarded as the user-s interest. In addition, the centroid is
represented with a document vector whose elements consist of
the weight of the selected index words. Using the English test
collection MEDLINE, thus, we confirm the effectiveness of
our proposal. Hence, our proposed selection can confirm the
improvement of the recommendation accuracy from the other
previous methods when selecting the appropriate number of
index words. In addition, we discussed the selected index
words by our proposal and we found our proposal was able to
select the index words covered some minor topics included in
the document set.
Abstract: Image retrieval is a topic where scientific interest is currently high. The important steps associated with image retrieval system are the extraction of discriminative features and a feasible similarity metric for retrieving the database images that are similar in content with the search image. Gabor filtering is a widely adopted technique for feature extraction from the texture images. The recently proposed sparsity promoting l1-norm minimization technique finds the sparsest solution of an under-determined system of linear equations. In the present paper, the l1-norm minimization technique as a similarity metric is used in image retrieval. It is demonstrated through simulation results that the l1-norm minimization technique provides a promising alternative to existing similarity metrics. In particular, the cases where the l1-norm minimization technique works better than the Euclidean distance metric are singled out.
Abstract: In this paper, novel statistical sampling based equalization techniques and CNN based detection are proposed to increase the spectral efficiency of multiuser communication systems over fading channels. Multiuser communication combined with selective fading can result in interferences which severely deteriorate the quality of service in wireless data transmission (e.g. CDMA in mobile communication). The paper introduces new equalization methods to combat interferences by minimizing the Bit Error Rate (BER) as a function of the equalizer coefficients. This provides higher performance than the traditional Minimum Mean Square Error equalization. Since the calculation of BER as a function of the equalizer coefficients is of exponential complexity, statistical sampling methods are proposed to approximate the gradient which yields fast equalization and superior performance to the traditional algorithms. Efficient estimation of the gradient is achieved by using stratified sampling and the Li-Silvester bounds. A simple mechanism is derived to identify the dominant samples in real-time, for the sake of efficient estimation. The equalizer weights are adapted recursively by minimizing the estimated BER. The near-optimal performance of the new algorithms is also demonstrated by extensive simulations. The paper has also developed a (Cellular Neural Network) CNN based approach to detection. In this case fast quadratic optimization has been carried out by t, whereas the task of equalizer is to ensure the required template structure (sparseness) for the CNN. The performance of the method has also been analyzed by simulations.
Abstract: Given a large sparse signal, great wishes are to
reconstruct the signal precisely and accurately from lease number of
measurements as possible as it could. Although this seems possible
by theory, the difficulty is in built an algorithm to perform the
accuracy and efficiency of reconstructing. This paper proposes a new
proved method to reconstruct sparse signal depend on using new
method called Least Support Matching Pursuit (LS-OMP) merge it
with the theory of Partial Knowing Support (PSK) given new method
called Partially Knowing of Least Support Orthogonal Matching
Pursuit (PKLS-OMP).
The new methods depend on the greedy algorithm to compute the
support which depends on the number of iterations. So to make it
faster, the PKLS-OMP adds the idea of partial knowing support of its
algorithm. It shows the efficiency, simplicity, and accuracy to get
back the original signal if the sampling matrix satisfies the Restricted
Isometry Property (RIP).
Simulation results also show that it outperforms many algorithms
especially for compressible signals.
Abstract: In this paper a one-dimension Self Organizing Map
algorithm (SOM) to perform feature selection is presented. The
algorithm is based on a first classification of the input dataset on a
similarity space. From this classification for each class a set of
positive and negative features is computed. This set of features is
selected as result of the procedure. The procedure is evaluated on an
in-house dataset from a Knowledge Discovery from Text (KDT)
application and on a set of publicly available datasets used in
international feature selection competitions. These datasets come
from KDT applications, drug discovery as well as other applications.
The knowledge of the correct classification available for the training
and validation datasets is used to optimize the parameters for positive
and negative feature extractions. The process becomes feasible for
large and sparse datasets, as the ones obtained in KDT applications,
by using both compression techniques to store the similarity matrix
and speed up techniques of the Kohonen algorithm that take
advantage of the sparsity of the input matrix. These improvements
make it feasible, by using the grid, the application of the
methodology to massive datasets.
Abstract: There are two major variants of the Simplex
Algorithm: the revised method and the standard, or tableau method.
Today, all serious implementations are based on the revised method
because it is more efficient for sparse linear programming problems.
Moreover, there are a number of applications that lead to dense linear
problems so our aim in this paper is to present some computational
results on parallel implementation of dense Simplex Method. Our
implementation is implemented on a SMP cluster using C
programming language and the Message Passing Interface MPI.
Preliminary computational results on randomly generated dense
linear programs support our results.
Abstract: In this paper, we propose an adaptation of the Patricia-Tree for sparse datasets to generate non redundant rule associations. Using this adaptation, we can generate frequent closed itemsets that are more compact than frequent itemsets used in Apriori approach. This adaptation has been experimented on a set of datasets benchmarks.
Abstract: Clusters of microcalcifications in mammograms are an
important sign of breast cancer. This paper presents a complete
Computer Aided Detection (CAD) scheme for automatic detection of
clustered microcalcifications in digital mammograms. The proposed
system, MammoScan μCaD, consists of three main steps. Firstly
all potential microcalcifications are detected using a a method for
feature extraction, VarMet, and adaptive thresholding. This will also
give a number of false detections. The goal of the second step,
Classifier level 1, is to remove everything but microcalcifications.
The last step, Classifier level 2, uses learned dictionaries and sparse
representations as a texture classification technique to distinguish
single, benign microcalcifications from clustered microcalcifications,
in addition to remove some remaining false detections. The system
is trained and tested on true digital data from Stavanger University
Hospital, and the results are evaluated by radiologists. The overall
results are promising, with a sensitivity > 90 % and a low false
detection rate (approx 1 unwanted pr. image, or 0.3 false pr. image).
Abstract: We present new finite element methods for Helmholtz and Maxwell equations on general three-dimensional polyhedral meshes, based on domain decomposition with boundary elements on the surfaces of the polyhedral volume elements. The methods use the lowest-order polynomial spaces and produce sparse, symmetric linear systems despite the use of boundary elements. Moreover, piecewise constant coefficients are admissible. The resulting approximation on the element surfaces can be extended throughout the domain via representation formulas. Numerical experiments confirm that the convergence behavior on tetrahedral meshes is comparable to that of standard finite element methods, and equally good performance is attained on more general meshes.
Abstract: Although the level crossing concept has been the subject of intensive investigation over the last few years, certain problems of great interest remain unsolved. One of these concern is distribution of threshold levels. This paper presents a new threshold level allocation schemes for level crossing based on nonuniform sampling. Intuitively, it is more reasonable if the information rich regions of the signal are sampled finer and those with sparse information are sampled coarser. To achieve this objective, we propose non-linear quantization functions which dynamically assign the number of quantization levels depending on the importance of the given amplitude range. Two new approaches to determine the importance of the given amplitude segment are presented. The proposed methods are based on exponential and logarithmic functions. Various aspects of proposed techniques are discussed and experimentally validated. Its efficacy is investigated by comparison with uniform sampling.
Abstract: An algorithm for learning an overcomplete dictionary
using a Cauchy mixture model for sparse decomposition of an underdetermined
mixing system is introduced. The mixture density
function is derived from a ratio sample of the observed mixture
signals where 1) there are at least two but not necessarily more
mixture signals observed, 2) the source signals are statistically
independent and 3) the sources are sparse. The basis vectors of the
dictionary are learned via the optimization of the location parameters
of the Cauchy mixture components, which is shown to be more
accurate and robust than the conventional data mining methods
usually employed for this task. Using a well known sparse
decomposition algorithm, we extract three speech signals from two
mixtures based on the estimated dictionary. Further tests with
additive Gaussian noise are used to demonstrate the proposed
algorithm-s robustness to outliers.
Abstract: The study of proteomics reached unexpected levels of
interest, as a direct consequence of its discovered influence over some
complex biological phenomena, such as problematic diseases like
cancer. This paper presents the latest authors- achievements regarding
the analysis of the networks of proteins (interactome networks), by
computing more efficiently the betweenness centrality measure. The
paper introduces the concept of betweenness centrality, and then
describes how betweenness computation can help the interactome net-
work analysis. Current sequential implementations for the between-
ness computation do not perform satisfactory in terms of execution
times. The paper-s main contribution is centered towards introducing
a speedup technique for the betweenness computation, based on
modified shortest path algorithms for sparse graphs. Three optimized
generic algorithms for betweenness computation are described and
implemented, and their performance tested against real biological
data, which is part of the IntAct dataset.
Abstract: Computed tomography and laminography are heavily investigated in a compressive sensing based image reconstruction framework to reduce the dose to the patients as well as to the radiosensitive devices such as multilayer microelectronic circuit boards. Nowadays researchers are actively working on optimizing the compressive sensing based iterative image reconstruction algorithm to obtain better quality images. However, the effects of the sampled data’s properties on reconstructed the image’s quality, particularly in an insufficient sampled data conditions have not been explored in computed laminography. In this paper, we investigated the effects of two data properties i.e. sampling density and data incoherence on the reconstructed image obtained by conventional computed laminography and a recently proposed method called spherical sinusoidal scanning scheme. We have found that in a compressive sensing based image reconstruction framework, the image quality mainly depends upon the data incoherence when the data is uniformly sampled.
Abstract: Synthetic aperture radar (SAR) imaging usually
requires echo data collected continuously pulse by pulse with certain
bandwidth. However in real situation, data collection or part of signal
spectrum can be interrupted due to various reasons, i.e. there will be
gaps in spatial spectrum. In this case we need to find ways to fill out
the resulted gaps and get image with defined resolution. In this paper
we introduce our work on how to apply iterative spatially variant
apodization (Super-SVA) technique to extrapolate the spatial
spectrum in both azimuthal and range directions so as to fill out the
gaps and get correct radar image.
Abstract: We propose an enhanced collaborative filtering
method using Hofstede-s cultural dimensions, calculated for 111
countries. We employ 4 of these dimensions, which are correlated to
the costumers- buying behavior, in order to detect users- preferences
for items. In addition, several advantages of this method
demonstrated for data sparseness and cold-start users, which are
important challenges in collaborative filtering. We present
experiments using a real dataset, Book Crossing Dataset.
Experimental results shows that the proposed algorithm provide
significant advantages in terms of improving recommendation
quality.
Abstract: Symbolic Circuit Analysis (SCA) is a technique used
to generate the symbolic expression of a network. It has become a
well-established technique in circuit analysis and design. The
symbolic expression of networks offers excellent way to perform
frequency response analysis, sensitivity computation, stability
measurements, performance optimization, and fault diagnosis. Many
approaches have been proposed in the area of SCA offering different
features and capabilities. Numerical Interpolation methods are very
common in this context, especially by using the Fast Fourier
Transform (FFT). The aim of this paper is to present a method for
SCA that depends on the use of Wavelet Transform (WT) as a
mathematical tool to generate the symbolic expression for large
circuits with minimizing the analysis time by reducing the number of
computations.
Abstract: Texture classification is an important image processing
task with a broad application range. Many different techniques for
texture classification have been explored. Using sparse approximation
as a feature extraction method for texture classification is a relatively
new approach, and Skretting et al. recently presented the Frame
Texture Classification Method (FTCM), showing very good results on
classical texture images. As an extension of that work the FTCM is
here tested on a real world application as detection of abnormalities
in mammograms. Some extensions to the original FTCM that are
useful in some applications are implemented; two different smoothing
techniques and a vector augmentation technique. Both detection of
microcalcifications (as a primary detection technique and as a last
stage of a detection scheme), and soft tissue lesions in mammograms
are explored. All the results are interesting, and especially the results
using FTCM on regions of interest as the last stage in a detection
scheme for microcalcifications are promising.
Abstract: In this paper three different approaches for person
verification and identification, i.e. by means of fingerprints, face and
voice recognition, are studied. Face recognition uses parts-based
representation methods and a manifold learning approach. The
assessment criterion is recognition accuracy. The techniques under
investigation are: a) Local Non-negative Matrix Factorization
(LNMF); b) Independent Components Analysis (ICA); c) NMF with
sparse constraints (NMFsc); d) Locality Preserving Projections
(Laplacianfaces). Fingerprint detection was approached by classical
minutiae (small graphical patterns) matching through image
segmentation by using a structural approach and a neural network as
decision block. As to voice / speaker recognition, melodic cepstral
and delta delta mel cepstral analysis were used as main methods, in
order to construct a supervised speaker-dependent voice recognition
system. The final decision (e.g. “accept-reject" for a verification
task) is taken by using a majority voting technique applied to the
three biometrics. The preliminary results, obtained for medium
databases of fingerprints, faces and voice recordings, indicate the
feasibility of our study and an overall recognition precision (about
92%) permitting the utilization of our system for a future complex
biometric card.
Abstract: In this paper we introduce a novel kernel classifier
based on a iterative shrinkage algorithm developed for compressive
sensing. We have adopted Bregman iteration with soft and hard
shrinkage functions and generalized hinge loss for solving l1 norm
minimization problem for classification. Our experimental results
with face recognition and digit classification using SVM as the
benchmark have shown that our method has a close error rate
compared to SVM but do not perform better than SVM. We have
found that the soft shrinkage method give more accuracy and in some
situations more sparseness than hard shrinkage methods.
Abstract: In this work, I present a review on Sparse Distributed
Memory for Small Cues (SDMSCue), a variant of Sparse Distributed
Memory (SDM) that is capable of handling small cues. I then conduct
and show some cognitive experiments on SDMSCue to test its
cognitive soundness compared to SDM. Small cues refer to input
cues that are presented to memory for reading associations; but have
many missing parts or fields from them. The original SDM failed to
handle such a problem. SDMSCue handles and overcomes this
pitfall. The main idea in SDMSCue; is the repeated projection of the
semantic space on smaller subspaces; that are selected based on the
input cue length and pattern. This process allows for Read/Write
operations using an input cue that is missing a large portion.
SDMSCue is augmented with the use of genetic algorithms for
memory allocation and initialization. I claim that SDM functionality
is a subset of SDMSCue functionality.