Abstract: Reverse engineering of full-genomic interaction networks based on compendia of expression data has been successfully applied for a number of model organisms. This study adapts these approaches for an important non-model organism: The major human fungal pathogen Candida albicans. During the infection process, the pathogen can adapt to a wide range of environmental niches and reversibly changes its growth form. Given the importance of these processes, it is important to know how they are regulated. This study presents a reverse engineering strategy able to infer fullgenomic interaction networks for C. albicans based on a linear regression, utilizing the sparseness criterion (LASSO). To overcome the limited amount of expression data and small number of known interactions, we utilize different prior-knowledge sources guiding the network inference to a knowledge driven solution. Since, no database of known interactions for C. albicans exists, we use a textmining system which utilizes full-text research papers to identify known regulatory interactions. By comparing with these known regulatory interactions, we find an optimal value for global modelling parameters weighting the influence of the sparseness criterion and the prior-knowledge. Furthermore, we show that soft integration of prior-knowledge additionally improves the performance. Finally, we compare the performance of our approach to state of the art network inference approaches.
Abstract: User-based Collaborative filtering (CF), one of the
most prevailing and efficient recommendation techniques, provides
personalized recommendations to users based on the opinions of other
users. Although the CF technique has been successfully applied in
various applications, it suffers from serious sparsity problems. The
cloud-model approach addresses the sparsity problems by
constructing the user-s global preference represented by a cloud
eigenvector. The user-based CF approach works well with dense
datasets while the cloud-model CF approach has a greater
performance when the dataset is sparse. In this paper, we present a
hybrid approach that integrates the predictions from both the
user-based CF and the cloud-model CF approaches. The experimental
results show that the proposed hybrid approach can ameliorate the
sparsity problem and provide an improved prediction quality.
Abstract: A proof of convergence of a new continuation algorithm for computing the Analytic SVD for a large sparse parameter– dependent matrix is given. The algorithm itself was developed and numerically tested in [5].
Abstract: Every commercial bank optimises its asset portfolio
depending on the profitability of assets and chosen or imposed
constraints. This paper proposes and applies a stylized model for
optimising banks' asset and liability structure, reflecting profitability
of different asset categories and their risks as well as costs associated
with different liability categories and reserve requirements. The level
of detail for asset and liability categories is chosen to create a
suitably parsimonious model and to include the most important
categories in the model. It is shown that the most appropriate
optimisation criterion for the model is the maximisation of the ratio
of net interest income to assets. The maximisation of this ratio is
subject to several constraints. Some are accounting identities or
dictated by legislative requirements; others vary depending on the
market objectives for a particular bank. The model predicts variable
amount of assets allocated to loan provision.
Abstract: In comparison to the original SVM, which involves a
quadratic programming task; LS–SVM simplifies the required
computation, but unfortunately the sparseness of standard SVM is
lost. Another problem is that LS-SVM is only optimal if the training
samples are corrupted by Gaussian noise. In Least Squares SVM
(LS–SVM), the nonlinear solution is obtained, by first mapping the
input vector to a high dimensional kernel space in a nonlinear
fashion, where the solution is calculated from a linear equation set. In
this paper a geometric view of the kernel space is introduced, which
enables us to develop a new formulation to achieve a sparse and
robust estimate.
Abstract: Sparse representation which can represent high dimensional
data effectively has been successfully used in computer vision
and pattern recognition problems. However, it doesn-t consider the
label information of data samples. To overcome this limitation,
we develop a novel dimensionality reduction algorithm namely
dscriminatively regularized sparse subspace learning(DR-SSL) in this
paper. The proposed DR-SSL algorithm can not only make use of
the sparse representation to model the data, but also can effective
employ the label information to guide the procedure of dimensionality
reduction. In addition,the presented algorithm can effectively deal
with the out-of-sample problem.The experiments on gene-expression
data sets show that the proposed algorithm is an effective tool for
dimensionality reduction and gene-expression data classification.
Abstract: In this paper, the implementation of a rule-based
intuitive reasoner is presented. The implementation included two
parts: the rule induction module and the intuitive reasoner. A large
weather database was acquired as the data source. Twelve weather
variables from those data were chosen as the “target variables"
whose values were predicted by the intuitive reasoner. A “complex"
situation was simulated by making only subsets of the data available
to the rule induction module. As a result, the rules induced were
based on incomplete information with variable levels of certainty.
The certainty level was modeled by a metric called "Strength of
Belief", which was assigned to each rule or datum as ancillary
information about the confidence in its accuracy. Two techniques
were employed to induce rules from the data subsets: decision tree
and multi-polynomial regression, respectively for the discrete and the
continuous type of target variables. The intuitive reasoner was tested
for its ability to use the induced rules to predict the classes of the
discrete target variables and the values of the continuous target
variables. The intuitive reasoner implemented two types of
reasoning: fast and broad where, by analogy to human thought, the
former corresponds to fast decision making and the latter to deeper
contemplation. . For reference, a weather data analysis approach
which had been applied on similar tasks was adopted to analyze the
complete database and create predictive models for the same 12
target variables. The values predicted by the intuitive reasoner and
the reference approach were compared with actual data. The intuitive
reasoner reached near-100% accuracy for two continuous target
variables. For the discrete target variables, the intuitive reasoner
predicted at least 70% as accurately as the reference reasoner. Since
the intuitive reasoner operated on rules derived from only about 10%
of the total data, it demonstrated the potential advantages in dealing
with sparse data sets as compared with conventional methods.
Abstract: This paper presents a new feature based dense stereo
matching algorithm to obtain the dense disparity map via dynamic
programming. After extraction of some proper features, we use some
matching constraints such as epipolar line, disparity limit, ordering
and limit of directional derivative of disparity as well. Also, a coarseto-
fine multiresolution strategy is used to decrease the search space
and therefore increase the accuracy and processing speed. The
proposed method links the detected feature points into the chains and
compares some of the feature points from different chains, to
increase the matching speed. We also employ color stereo matching
to increase the accuracy of the algorithm. Then after feature
matching, we use the dynamic programming to obtain the dense
disparity map. It differs from the classical DP methods in the stereo
vision, since it employs sparse disparity map obtained from the
feature based matching stage. The DP is also performed further on a
scan line, between any matched two feature points on that scan line.
Thus our algorithm is truly an optimization method. Our algorithm
offers a good trade off in terms of accuracy and computational
efficiency. Regarding the results of our experiments, the proposed
algorithm increases the accuracy from 20 to 70%, and reduces the
running time of the algorithm almost 70%.
Abstract: The advantage of solving the complex nonlinear
problems by utilizing fuzzy logic methodologies is that the
experience or expert-s knowledge described as a fuzzy rule base can
be directly embedded into the systems for dealing with the problems.
The current limitation of appropriate and automated designing of
fuzzy controllers are focused in this paper. The structure discovery
and parameter adjustment of the Branched T-S fuzzy model is
addressed by a hybrid technique of type constrained sparse tree
algorithms. The simulation result for different system model is
evaluated and the identification error is observed to be minimum.
Abstract: Delay-Tolerant Networks (DTNs) are sparse, wireless
networks where disconnections are common due to host mobility and
low node density. The Message Ferrying (MF) scheme is a mobilityassisted
paradigm to improve connectivity in DTN-like networks. A
ferry or message ferry is a special node in the network which has
a per-determined route in the deployed area and relays messages
between mobile hosts (MHs) which are intermittently connected.
Increased contact opportunities among mobile hosts and the ferry
improve the performance of the network, both in terms of message
delivery ratio and average end-end delay. However, due to the inherent
mobility of mobile hosts and pre-determined periodicity of the
message ferry, mobile hosts may often -miss- contact opportunities
with a ferry. In this paper, we propose the combination of stationary
ferry access points (FAPs) with MF routing to increase contact
opportunities between mobile hosts and the MF and consequently
improve the performance of the DTN. We also propose several
placement models for deploying FAPs on MF routes. We evaluate the
performance of the FAP placement models through comprehensive
simulation. Our findings show that FAPs do improve the performance
of MF-assisted DTNs and symmetric placement of FAPs outperforms
other placement strategies.
Abstract: In this paper an efficient incomplete factorization preconditioner is proposed for the Least Mean Squares (LMS) adaptive filter. The proposed preconditioner is approximated from a priori knowledge of the factors of input correlation matrix with an incomplete strategy, motivated by the sparsity patter of the upper triangular factor in the QRD-RLS algorithm. The convergence properties of IPLMS algorithm are comparable with those of transform domain LMS(TDLMS) algorithm. Simulation results show efficiency and robustness of the proposed algorithm with reduced computational complexity.
Abstract: Effective evaluation of software development effort is an important aspect of successful project management. Based on a large database with 4106 projects ever developed, this study statistically examines the factors that influence development effort. The factors found to be significant for effort are project size, average number of developers that worked on the project, type of development, development language, development platform, and the use of rapid application development. Among these factors, project size is the most critical cost driver. Unsurprisingly, this study found that the use of CASE tools does not necessarily reduce development effort, which adds support to the claim that the use of tools is subtle. As many of the current estimation models are rarely or unsuccessfully used, this study proposes a parsimonious parametric model for the prediction of effort which is both simple and more accurate than previous models.
Abstract: In the present paper, we propose numerical methods for solving the Stein equation AXC - X - D = 0 where the matrix A is large and sparse. Such problems appear in discrete-time control problems, filtering and image restoration. We consider the case where the matrix D is of full rank and the case where D is factored as a product of two matrices. The proposed methods are Krylov subspace methods based on the block Arnoldi algorithm. We give theoretical results and we report some numerical experiments.
Abstract: Restarted GMRES methods augmented with approximate eigenvectors are widely used for solving large sparse linear systems. Recently a new scheme of augmenting with error approximations is proposed. The main aim of this paper is to develop a restarted GMRES method augmented with the combination of harmonic Ritz vectors and error approximations. We demonstrate that the resulted combination method can gain the advantages of two approaches: (i) effectively deflate the small eigenvalues in magnitude that may hamper the convergence of the method and (ii) partially recover the global optimality lost due to restarting. The effectiveness and efficiency of the new method are demonstrated through various numerical examples.
Abstract: Text categorization is the problem of classifying text
documents into a set of predefined classes. After a preprocessing
step, the documents are typically represented as large sparse vectors.
When training classifiers on large collections of documents, both the
time and memory restrictions can be quite prohibitive. This justifies
the application of feature selection methods to reduce the
dimensionality of the document-representation vector. In this paper,
three feature selection methods are evaluated: Random Selection,
Information Gain (IG) and Support Vector Machine feature selection
(called SVM_FS). We show that the best results were obtained with
SVM_FS method for a relatively small dimension of the feature
vector. Also we present a novel method to better correlate SVM
kernel-s parameters (Polynomial or Gaussian kernel).
Abstract: In this paper a combined feature selection method is
proposed which takes advantages of sample domain filtering,
resampling and feature subset evaluation methods to reduce
dimensions of huge datasets and select reliable features. This method
utilizes both feature space and sample domain to improve the process
of feature selection and uses a combination of Chi squared with
Consistency attribute evaluation methods to seek reliable features.
This method consists of two phases. The first phase filters and
resamples the sample domain and the second phase adopts a hybrid
procedure to find the optimal feature space by applying Chi squared,
Consistency subset evaluation methods and genetic search.
Experiments on various sized datasets from UCI Repository of
Machine Learning databases show that the performance of five
classifiers (Naïve Bayes, Logistic, Multilayer Perceptron, Best First
Decision Tree and JRIP) improves simultaneously and the
classification error for these classifiers decreases considerably. The
experiments also show that this method outperforms other feature
selection methods.
Abstract: Phylogenetic tree is a graphical representation of the
evolutionary relationship among three or more genes or organisms.
These trees show relatedness of data sets, species or genes
divergence time and nature of their common ancestors. Quality of a
phylogenetic tree requires parsimony criterion. Various approaches
have been proposed for constructing most parsimonious trees. This
paper is concerned about calculating and optimizing the changes of
state that are needed called Small Parsimony Algorithms. This paper
has proposed enhanced small parsimony algorithm to give better
score based on number of evolutionary changes needed to produce
the observed sequence changes tree and also give the ancestor of the
given input.
Abstract: In-core memory requirement is a bottleneck in solving
large three dimensional Navier-Stokes finite element problem
formulations using sparse direct solvers. Out-of-core solution
strategy is a viable alternative to reduce the in-core memory
requirements while solving large scale problems. This study
evaluates the performance of various out-of-core sequential solvers
based on multifrontal or supernodal techniques in the context of
finite element formulations for three dimensional problems on a
Windows platform. Here three different solvers, HSL_MA78,
MUMPS and PARDISO are compared. The performance of these
solvers is evaluated on a 64-bit machine with 16GB RAM for finite
element formulation of flow through a rectangular channel. It is
observed that using out-of-core PARDISO solver, relatively large
problems can be solved. The implementation of Newton and
modified Newton's iteration is also discussed.
Abstract: Due to the ever growing amount of publications about
protein-protein interactions, information extraction from text is
increasingly recognized as one of crucial technologies in
bioinformatics. This paper presents a Protein Interaction Extraction
System using a Link Grammar Parser from biomedical abstracts
(PIELG). PIELG uses linkage given by the Link Grammar Parser to
start a case based analysis of contents of various syntactic roles as
well as their linguistically significant and meaningful combinations.
The system uses phrasal-prepositional verbs patterns to overcome
preposition combinations problems. The recall and precision are
74.4% and 62.65%, respectively. Experimental evaluations with two
other state-of-the-art extraction systems indicate that PIELG system
achieves better performance. For further evaluation, the system is
augmented with a graphical package (Cytoscape) for extracting
protein interaction information from sequence databases. The result
shows that the performance is remarkably promising.
Abstract: HSDPA is a new feature which is introduced in
Release-5 specifications of the 3GPP WCDMA/UTRA standard to
realize higher speed data rate together with lower round-trip times.
Moreover, the HSDPA concept offers outstanding improvement of
packet throughput and also significantly reduces the packet call
transfer delay as compared to Release -99 DSCH. Till now the
HSDPA system uses turbo coding which is the best coding technique
to achieve the Shannon limit. However, the main drawbacks of turbo
coding are high decoding complexity and high latency which makes
it unsuitable for some applications like satellite communications,
since the transmission distance itself introduces latency due to
limited speed of light. Hence in this paper it is proposed to use LDPC
coding in place of Turbo coding for HSDPA system which decreases
the latency and decoding complexity. But LDPC coding increases the
Encoding complexity. Though the complexity of transmitter
increases at NodeB, the End user is at an advantage in terms of
receiver complexity and Bit- error rate. In this paper LDPC Encoder
is implemented using “sparse parity check matrix" H to generate a
codeword at Encoder and “Belief Propagation algorithm "for LDPC
decoding .Simulation results shows that in LDPC coding the BER
suddenly drops as the number of iterations increase with a small
increase in Eb/No. Which is not possible in Turbo coding. Also same
BER was achieved using less number of iterations and hence the
latency and receiver complexity has decreased for LDPC coding.
HSDPA increases the downlink data rate within a cell to a theoretical
maximum of 14Mbps, with 2Mbps on the uplink. The changes that
HSDPA enables includes better quality, more reliable and more
robust data services. In other words, while realistic data rates are
only a few Mbps, the actual quality and number of users achieved
will improve significantly.