Abstract: Biological sequences from different species are called orthologs if they evolved from a sequence of a common ancestor species and they have the same biological function. Approximations of Kolmogorov complexity or entropy of biological sequences are already well known to be useful in extracting similarity information between such sequences - in the interest, for example, of ortholog detection. As is well known, the exact Kolmogorov complexity is not algorithmically computable. In practice one can approximate it by computable compression methods. However, such compression methods do not provide a good approximation to Kolmogorov complexity for short sequences. Herein is suggested a new approach to overcome the problem that compression approximations may not work well on short sequences. This approach is inspired by new, conditional computations of Kolmogorov entropy. A main contribution of the empirical work described shows the new set of entropy-based machine learning attributes provides good separation between positive (ortholog) and negative (non-ortholog) data - better than with good, previously known alternatives (which do not employ some means to handle short sequences well). Also empirically compared are the new entropy-based attribute set and a number of other, more standard similarity attribute sets commonly used in genomic analysis. The various similarity attributes are evaluated by cross validation, through boosted decision tree induction C5.0, and by Receiver Operating Characteristic (ROC) analysis. The results point to the conclusion: the new, entropy-based attribute set by itself is not the one giving the best prediction; however, it is the best attribute set for use in improving the other, standard attribute sets when conjoined with them.
Abstract: A novel physico-chemical route to produce few-layer graphene nanoribbons with atomically smooth edges is reported, via acid treatment (H2SO4:HNO3) followed by characteristic thermal shock processes involving extremely cold substances. Samples were studied by scanning electron microscopy (SEM), transmission electron microscopy (TEM), X-ray diffraction (XRD), Raman spectroscopy and X-ray photoelectron spectroscopy. This method demonstrates the importance of having the nanotubes open ended for an efficient, uniform unzipping along the nanotube axis. These nanoribbons are approximately 210 nm wide and consist of a few layers, as observed by transmission electron microscopy. The produced nanoribbons exhibit different chiralities, as observed by high resolution transmission electron microscopy. This method is able to provide graphene nanoribbons with atomically smooth edges which could be used in various applications including sensors, gas adsorption materials and composite fillers, among others.
Abstract: One of the main issues in Computer Vision is to extract the movement of one or several points or objects of interest in an image or video sequence to conduct any kind of study or control process. Different techniques to solve this problem have been applied in numerous areas such as surveillance systems, analysis of traffic, motion capture, image compression, navigation systems and others, where the specific characteristics of each scenario determine the approach to the problem. This paper puts forward a Computer Vision based algorithm to analyze fish trajectories in high turbulence conditions in artificial structures called vertical slot fishways, designed to allow the upstream migration of fish through obstructions in rivers. The suggested algorithm calculates the position of the fish at every instant, starting from images recorded with a camera and using neural networks to perform fish detection on the images. Different laboratory tests have been carried out in a full scale fishway model with live fish, allowing the reconstruction of the fish trajectory and the measurement of the fish's velocities and accelerations. These data can provide useful information to design more effective vertical slot fishways.
Abstract: Clusters of microcalcifications in mammograms are an
important sign of breast cancer. This paper presents a complete
Computer Aided Detection (CAD) scheme for automatic detection of
clustered microcalcifications in digital mammograms. The proposed
system, MammoScan μCaD, consists of three main steps. Firstly
all potential microcalcifications are detected using a method for
feature extraction, VarMet, and adaptive thresholding. This will also
give a number of false detections. The goal of the second step,
Classifier level 1, is to remove everything but microcalcifications.
The last step, Classifier level 2, uses learned dictionaries and sparse
representations as a texture classification technique to distinguish
single, benign microcalcifications from clustered microcalcifications,
in addition to removing some remaining false detections. The system
is trained and tested on true digital data from Stavanger University
Hospital, and the results are evaluated by radiologists. The overall
results are promising, with a sensitivity > 90 % and a low false
detection rate (approximately 1 unwanted detection per image, or 0.3 false detections per image).
Abstract: The main aim of this study is to describe and introduce a method of numerical analysis for obtaining approximate solutions of the SIR-SI differential equations (susceptible-infective-recovered for human populations; susceptible-infective for vector populations) that represent a model for dengue disease transmission. Firstly, we describe the ordinary differential equations for the SIR-SI disease transmission models. Then, we introduce the numerical analysis of solutions of this continuous time, discrete space SIR-SI model by simplifying the continuous time scale to a densely populated, discrete time scale. This is followed by the application of this numerical analysis of solutions of the SIR-SI differential equations to the estimation of relative risk using continuous time, discrete space dengue data of Kuala Lumpur, Malaysia. Finally, we present the results of the analysis, comparing and displaying the results in graphs, a table and maps. The numerical analysis of solutions that we implemented offers a useful and potentially superior model for estimating relative risks based on continuous time, discrete space data for vector-borne infectious diseases, specifically dengue disease.
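The abstract does not write out the equations; as a rough sketch, a forward-Euler discretization of a standard SIR-SI host-vector model might look like the following (all parameter names and rate terms here are illustrative assumptions in the style of common dengue models, not taken from the study):

```python
import numpy as np

def sir_si_euler(beta_h, beta_v, gamma, mu_v, Nh, Nv,
                 S0, I0, R0, Sv0, Iv0, dt, steps):
    """Forward-Euler discretization of a generic SIR-SI host-vector
    model (illustrative parameterization, not the paper's).
    Hosts: S -> I on contact with infective vectors, I -> R at rate gamma.
    Vectors: born/die at rate mu_v, S_v -> I_v on contact with infective hosts."""
    S, I, R, Sv, Iv = map(float, (S0, I0, R0, Sv0, Iv0))
    traj = [(S, I, R, Sv, Iv)]
    for _ in range(steps):
        new_h = beta_h * S * Iv / Nh          # new host infections
        new_v = beta_v * Sv * I / Nh          # new vector infections
        dS, dI, dR = -new_h, new_h - gamma * I, gamma * I
        dSv = mu_v * Nv - new_v - mu_v * Sv   # vector births minus deaths
        dIv = new_v - mu_v * Iv
        S, I, R = S + dt * dS, I + dt * dI, R + dt * dR
        Sv, Iv = Sv + dt * dSv, Iv + dt * dIv
        traj.append((S, I, R, Sv, Iv))
    return np.array(traj)
```

Because the host equations sum to zero, the discretized host population S + I + R stays constant at every step, which is a convenient sanity check on the scheme.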
Abstract: The Minimum Weighted Vertex Cover (MWVC) problem is a classic NP-complete graph optimization problem. Given an undirected graph G = (V, E) and a weighting function defined on the vertex set, the minimum weighted vertex cover problem is to find a vertex set S ⊆ V of minimum total weight subject to the constraint that every edge of G has at least one endpoint in S. In this paper an effective algorithm, called the Support Ratio Algorithm (SRA), is designed to find the minimum weighted vertex cover of a graph. Computational experiments are designed and conducted to study the performance of the proposed algorithm. Extensive simulation results show that the SRA can yield better solutions than other existing algorithms found in the literature for solving the minimum vertex cover problem.
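The abstract does not spell out the support-ratio rule, but SRA belongs to the greedy family of weighted vertex cover heuristics. As a minimal sketch of that family, the following scores each vertex by its uncovered degree divided by its weight (the scoring rule is an assumption for illustration, not the paper's exact rule):

```python
def greedy_weighted_vertex_cover(vertices, weights, edges):
    """Greedy heuristic for minimum weighted vertex cover:
    repeatedly pick the vertex maximizing (uncovered degree) / weight,
    a support-ratio-style score (illustrative, not the paper's SRA)."""
    uncovered = {frozenset(e) for e in edges}
    cover = set()
    while uncovered:
        # Vertex covering the most remaining edges per unit weight.
        best = max(vertices,
                   key=lambda v: sum(1 for e in uncovered if v in e)
                                 / weights[v])
        cover.add(best)
        uncovered = {e for e in uncovered if best not in e}
    return cover
```

On a star graph with a unit-weight center the heuristic picks the center alone; making the center very heavy flips the choice to the leaves, which is the behavior a weight-aware score is meant to produce.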
Abstract: We present a theory for optimal filtering of infinite sets of random signals. There are several new distinctive features of the proposed approach. First, we provide a single optimal filter for processing any signal from a given infinite signal set. Second, the filter is presented in the special form of a sum with p terms where each term is represented as a combination of three operations. Each operation is a special stage of the filtering aimed at facilitating the associated numerical work. Third, an iterative scheme is implemented into the filter structure to provide an improvement in the filter performance at each step of the scheme. The final step of the scheme concerns signal compression and decompression. This step is based on the solution of a new rank-constrained matrix approximation problem. The solution to the matrix problem is described in this paper. A rigorous error analysis is given for the new filter.
Abstract: We present new finite element methods for Helmholtz and Maxwell equations on general three-dimensional polyhedral meshes, based on domain decomposition with boundary elements on the surfaces of the polyhedral volume elements. The methods use the lowest-order polynomial spaces and produce sparse, symmetric linear systems despite the use of boundary elements. Moreover, piecewise constant coefficients are admissible. The resulting approximation on the element surfaces can be extended throughout the domain via representation formulas. Numerical experiments confirm that the convergence behavior on tetrahedral meshes is comparable to that of standard finite element methods, and equally good performance is attained on more general meshes.
Abstract: The Generalized Center String (GCS) problem generalizes the Common Approximate Substring and Common Substring problems. GCS is known to be NP-hard; the difficulty lies in the explosion of potential candidates. The longest center string must be found even though it is not known in advance, for any particular biological gene process, which sequences may not contain any motifs. GCS can be solved by frequent pattern-mining techniques and is known to be fixed-parameter tractable with respect to the input sequence length and symbol set size. Efficient methods known as the Bpriori algorithms can solve GCS with reasonable time/space complexity; the Bpriori 2 and Bpriori 3-2 algorithms find center strings of any length together with the positions of all their instances in the input sequences. In this paper, we reduce the time/space complexity of the Bpriori algorithm with a Constraint-Based Frequent Pattern mining (CBFP) technique which integrates the ideas of constraint-based mining and FP-tree mining. The CBFP mining technique solves the GCS problem not only for center strings of any length, but also for the positions of all their mutated copies in the input sequences. It constructs a TRIE-like FP-tree to represent the mutated copies of center strings of any length, along with constraints to restrain the growth of the consensus tree. The complexity analysis for the CBFP mining technique and the Bpriori algorithm is done for the worst case and the average case. The correctness of the algorithm, compared with the Bpriori algorithm on artificial data, is shown.
Abstract: Crude oil blending is an important unit operation in the petroleum refining industry. A good model for the blending system is beneficial for supervisory operation, prediction of export petroleum quality, and model-based optimal control. Since the blending cannot follow the ideal mixing rule in practice, we propose a static neural network to approximate the blending properties. Using the dead-zone approach, we propose a new robust learning algorithm and give a theoretical analysis. Real crude oil blending data are used to illustrate the neural modeling approach.
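The abstract does not give the network structure or the dead-zone bound; as a minimal sketch of the dead-zone idea for a linear-in-parameters approximator (threshold and learning rate are illustrative assumptions), the update freezes the weights whenever the modeling error is small enough to be attributable to bounded noise:

```python
import numpy as np

def dead_zone_update(w, x, y, eta=0.01, dead_zone=0.05):
    """One dead-zone gradient step for a linear-in-parameters model
    y_hat = w . x (illustrative sketch, not the paper's algorithm).
    Weights move only when |error| exceeds the dead-zone threshold,
    so they do not chase bounded measurement noise."""
    e = float(np.dot(w, x)) - y        # modeling error
    if abs(e) <= dead_zone:
        return w                       # inside dead zone: no update
    return w - eta * e * x             # outside: plain gradient step
```

The same guard can wrap the backpropagation step of a static neural network: skip the weight update for samples whose output error falls inside the dead zone.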
Abstract: The degradation of selected pharmaceuticals in some
water matrices was studied by using several chemical treatments. The
pharmaceuticals selected were the beta-blocker metoprolol, the
nonsteroidal anti-inflammatory naproxen, the antibiotic amoxicillin,
and the analgesic phenacetin; and their degradations were conducted
by using UV radiation alone, ozone, Fenton's reagent, a Fenton-like
system, a photo-Fenton system, and combinations of UV radiation and
ozone with H2O2, TiO2, Fe(II), and Fe(III). The water matrices, in
addition to ultra-pure water, were a reservoir water, a groundwater,
and two secondary effluents from two municipal WWTP. The results
reveal that the presence of any second oxidant enhanced the
oxidation rates, with the systems UV/TiO2 and O3/TiO2 providing the
highest degradation rates. It is also observed in most of the
investigated oxidation systems that the degradation rate followed the
sequence: amoxicillin > naproxen > metoprolol > phenacetin. Lower
rates were obtained with the pharmaceuticals dissolved in natural
waters and secondary effluents, due to the organic matter present,
which consumes part of the oxidant agents.
Abstract: In this paper, an alternating implicit block method for solving the two-dimensional scalar wave equation is presented. The new method consists of two stages for each time step, implemented in alternating directions, which are very simple to compute. To increase the speed of computation, a group of adjacent points is computed simultaneously. It is shown that the presented method increases the maximum time step size and is more accurate than the conventional finite difference time domain (FDTD) method and other existing methods with natural ordering.
Abstract: Using maximal consistent blocks of the tolerance relation on the universe of an incomplete decision table, the concepts of join block and meet block are introduced and studied. Besides the tolerance class, other blocks such as the tolerant kernel and compatible kernel of an object are also discussed. Upper and lower approximations based on those blocks are also defined. Default definite decision rules acquired from an incomplete decision table are proposed in the paper. An incremental algorithm to update default definite decision rules is suggested for effective mining tasks on an incomplete decision table into which data is appended. Through an example, we demonstrate how default definite decision rules based on maximal consistent blocks, join blocks and meet blocks are acquired, and how optimization is done with the support of the discernibility matrix and discernibility function in the incomplete decision table.
Abstract: New graph similarity methods have been proposed in this work with the aim of refining the chemical information extracted from molecule matching. For this purpose, data fusion of the isomorphic and nonisomorphic subgraphs into a new similarity measure, the Approximate Similarity, was carried out by several approaches. The application of the proposed method to the development of quantitative structure-activity relationships (QSAR) has provided reliable tools for predicting several pharmacological parameters: binding of steroids to the globulin-corticosteroid receptor, the activity of benzodiazepine receptor compounds, and blood-brain barrier permeability. Acceptable results were obtained for the models presented here.
Abstract: Gauteng, the province with the greatest industrial and population density and the economic hub of South Africa, also generates the greatest amount of waste, both general and hazardous. Therefore the province has a significant need to develop and apply appropriate integrated waste management policies that ensure that waste is recognised as a serious problem and is managed in an effective, integrated manner to preserve both present and future human health and the environment. This paper reflects on Gauteng's waste outlook, in particular the province's General Waste Minimisation Plan and its Integrated Waste Management Policy. The paper also looks at general waste generation and recyclable waste streams, as well as recycling and separation-at-source initiatives in the province. Both the quantity and nature of solid waste differ considerably across the socio-economic spectrum. People in informal settlements generate an average of 0.16 kg per person per day, whereas 2 kg per day is not unusual in affluent areas. For example, the amount of waste generated in Johannesburg is approximately 1.2 kg per person per day.
Abstract: Investigation results are presented for high-density hydrogen heating by a high-current electric arc at initial pressures from 5 MPa to 160 MPa, with current amplitudes up to 1.6 MA and current rise rates of 10⁹-10¹¹ A/s. When the initial pressure and current rise rate are changed, the channel temperature varies from several electronvolts to hundreds of electronvolts. The arc channel radius is several millimeters, but the radius of the discharge chamber is approximately an order of magnitude greater than that of the arc channel. The high efficiency of gas heating is caused by radiation absorption by the hydrogen surrounding the arc. The current channel consists of vapour from the initiating wire. At a current rise rate of 10⁹ A/s and relatively small current amplitude, gas heating occurs through radiation absorption in the transparency band of hydrogen by the wire vapours, with photon energies less than 13.6 eV. At a current rise rate of 10¹¹ A/s, gas heating is due to hydrogen absorption of soft X-rays from the discharge channel.
Abstract: This paper presents and discusses a model that allows local segmentation by using statistical information of a given image. It is based on the Chan-Vese model, curve evolution, partial differential equations and the binary level set method. The proposed model uses the piecewise constant approximation of the Chan-Vese model to compute the Signed Pressure Force (SPF) function, which attracts the curve to the true object boundaries. The implemented model is used to extract weld defects from weld radiographic images, with the aim of calculating the perimeters and surfaces of those weld defects; encouraging results are obtained on synthetic and real radiographic images.
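As a rough sketch of how an SPF function is built from the piecewise constant Chan-Vese approximation: c1 and c2 are the mean intensities inside and outside the current curve, and the SPF is the signed, normalized deviation of each pixel from the mid-level (c1 + c2)/2. The normalization convention below is the common one from the region-based active contour literature and is assumed here, not quoted from the paper:

```python
import numpy as np

def spf(image, inside_mask):
    """Signed Pressure Force from the piecewise constant Chan-Vese
    model (illustrative sketch). c1/c2 are mean intensities inside/
    outside the curve; the result is positive where the pixel is
    brighter than (c1+c2)/2 and negative otherwise, scaled to [-1, 1]."""
    c1 = image[inside_mask].mean()     # mean intensity inside the curve
    c2 = image[~inside_mask].mean()    # mean intensity outside
    f = image - (c1 + c2) / 2.0
    return f / (np.max(np.abs(f)) + 1e-12)   # eps guards flat images
```

During curve evolution the sign of the SPF drives the contour to shrink over background pixels and expand over object pixels, which is what pulls it onto the true boundaries.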
Abstract: The wavelet transform is one of the most important methods used in signal processing. In this study, we have introduced the frequency-energy characteristics of local earthquakes using the discrete wavelet transform. The frequency-energy characteristic was analyzed depending on the difference between P- and S-wave arrival times and the noise within the records. We have found that local earthquakes have similar characteristics. If the frequency-energy characteristics can be found accurately, this gives us a hint for calculating the P- and S-wave arrival times. It can be seen that the wavelet transform provides a successful approximation for this. In this study, approximately 100 earthquakes with 500 records were analyzed.
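The study's wavelet family is not named; as a minimal stand-in, a Haar discrete wavelet transform (repeated averaging and differencing) already yields a frequency-energy profile: the relative energy a record carries in each detail band plus the final approximation:

```python
import numpy as np

def haar_band_energies(signal, levels=4):
    """One-dimensional Haar DWT via repeated averaging/differencing
    (illustrative sketch; the study's wavelet is not specified).
    Returns relative energies: final approximation first, then the
    detail bands from coarsest to finest."""
    a = np.asarray(signal, dtype=float)
    detail_energy = []
    for _ in range(levels):
        if len(a) < 2:
            break
        if len(a) % 2:                     # pad odd lengths
            a = np.append(a, a[-1])
        approx = (a[0::2] + a[1::2]) / np.sqrt(2)
        detail = (a[0::2] - a[1::2]) / np.sqrt(2)
        detail_energy.append(np.sum(detail ** 2))
        a = approx
    energies = np.array([np.sum(a ** 2)] + detail_energy[::-1])
    return energies / energies.sum()
```

A slowly varying record concentrates its energy in the approximation band, while a high-frequency record concentrates it in the finest detail band; comparing such profiles around candidate arrival times is the kind of frequency-energy analysis the abstract describes.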
Abstract: In this paper, a direct torque control - space vector
modulation (DTC-SVM) scheme is presented for a six-phase speed
and voltage sensorless induction motor (IM) drive. The decoupled
torque and stator flux control is achieved based on IM stator flux field orientation. The rotor speed is detected by on-line estimation of
the rotor angular slip speed and the stator flux vector speed. In addition, a simple method is introduced to estimate the stator resistance.
Moreover, in this control scheme the voltage sensors are eliminated
and actual motor phase voltages are approximated by using PWM
inverter switching times and the dc link voltage. Finally, some simulation and experimental results are presented to verify the
effectiveness and capability of the proposed control scheme.
Abstract: Accurate demand forecasting is one of the key issues in inventory management of spare parts. The problem of modeling future consumption becomes especially difficult for lumpy patterns, which are characterized by intervals with no demand and periods with actual demand occurrences showing large variation in demand levels. Many forecasting methods may therefore perform poorly when demand for an item is lumpy. In this study, based on the characteristics of lumpy demand patterns of spare parts, a hybrid forecasting approach has been developed which uses a multi-layered perceptron neural network and a traditional recursive method for forecasting future demands. In the described approach, the multi-layered perceptron is adapted to forecast occurrences of non-zero demands, and a conventional recursive method is then used to estimate the quantity of non-zero demands. In order to evaluate the performance of the proposed approach, its forecasts were compared to those obtained by using the Syntetos & Boylan approximation, a recently employed multi-layered perceptron neural network, a generalized regression neural network and an Elman recurrent neural network. The models were applied to forecast future demand for spare parts of the Arak Petrochemical Company in Iran, using 30 types of real data sets. The results indicate that the forecasts obtained by our proposed model are superior to those obtained by the other methods.
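For reference, the Syntetos & Boylan approximation used as a benchmark above is a bias-corrected Croston method: demand sizes and inter-demand intervals are smoothed separately, and the Croston ratio is multiplied by (1 - alpha/2). A minimal sketch (the initialization convention and smoothing constant below are common choices, assumed for illustration):

```python
def sba_forecast(demand, alpha=0.1):
    """Syntetos & Boylan Approximation for intermittent/lumpy demand
    (illustrative sketch). Croston-style exponential smoothing of
    non-zero demand sizes (z) and inter-demand intervals (p), with
    the (1 - alpha/2) bias-correction factor applied to z/p."""
    nonzero = [(t, d) for t, d in enumerate(demand) if d > 0]
    if not nonzero:
        return 0.0
    z = float(nonzero[0][1])           # initial demand-size estimate
    p = float(nonzero[0][0] + 1)       # initial interval estimate
    q = 0                              # periods since last demand
    for d in demand[nonzero[0][0] + 1:]:
        q += 1
        if d > 0:                      # update only at demand occurrences
            z = alpha * d + (1 - alpha) * z
            p = alpha * q + (1 - alpha) * p
            q = 0
    return (1 - alpha / 2) * z / p
```

For smooth demand of 5 units every period, z stays at 5 and p at 1, so the forecast is 0.95 × 5 = 4.75 with alpha = 0.1; the bias correction deliberately shades Croston's ratio downward.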