Abstract: Octree compression techniques have been used
for several years to compress large three-dimensional data
sets into homogeneous regions. This compression technique
is ideally suited to datasets in which similar values occur in
clusters. Oil engineers represent reservoirs as a three-dimensional
grid in which hydrocarbons occur naturally in clusters. This
research looks at the efficiency of storing these grids using
octree compression techniques where grid cells are broken
into active and inactive regions. Initial experiments yielded
high compression ratios, as only active leaf nodes and their
ancestor header nodes are stored as a bitstream in a file on
disk. Savings in computational time and memory were possible
at decompression, as only active leaf nodes are sent to the
graphics card, eliminating the need to reconstruct the original
matrix. This results in a more compact vertex table, which can
be loaded onto the graphics card more quickly, giving shorter
refresh delays.
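The storage idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the bit layout (one type bit per stored node, followed by an 8-bit child mask for header nodes) and the restriction to a cubic grid whose edge is a power of two are all assumptions made for illustration.

```python
def build_octree(active, x0, y0, z0, size):
    """Build an octree over a cubic boolean grid (edge = power of two, assumed).

    A node is False (all inactive), True (all active), or a list of
    eight children (mixed region, i.e. a header node).
    """
    if size == 1:
        return bool(active[x0][y0][z0])
    half = size // 2
    children = [
        build_octree(active, x0 + dx, y0 + dy, z0 + dz, half)
        for dx in (0, half) for dy in (0, half) for dz in (0, half)
    ]
    # Collapse homogeneous regions into a single leaf.
    if all(child is True for child in children):
        return True
    if all(child is False for child in children):
        return False
    return children


def encode(node, bits):
    """Append a pre-order bitstream; inactive regions are never stored."""
    if node is False:
        return  # an entirely inactive grid produces an empty stream
    if node is True:
        bits.append(1)  # active leaf
        return
    bits.append(0)  # header node, followed by its child-presence mask
    bits.extend(0 if child is False else 1 for child in node)
    for child in node:
        if child is not False:
            encode(child, bits)


def active_leaves(node, x0, y0, z0, size):
    """Yield (x, y, z, edge) for each active leaf without rebuilding the grid."""
    if node is False:
        return
    if node is True:
        yield (x0, y0, z0, size)
        return
    half = size // 2
    offsets = [(dx, dy, dz)
               for dx in (0, half) for dy in (0, half) for dz in (0, half)]
    for child, (dx, dy, dz) in zip(node, offsets):
        yield from active_leaves(child, x0 + dx, y0 + dy, z0 + dz, half)
```

Under this assumed layout, a 4 x 4 x 4 grid with one fully active 2 x 2 x 2 octant and one additional active cell is encoded in 20 bits, compared with 64 bits for a one-bit-per-cell mask, and `active_leaves` returns just two cubes.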
Abstract: Because the majority of faults in a software system are found in a few of its modules, there is a need to investigate the modules that are affected more severely than others, and proper maintenance needs to be done on time, especially for critical applications. In this paper, we apply different predictor models to NASA's public domain defect dataset coded in the Perl programming language. Different machine learning algorithms belonging to the different learner categories of the WEKA project, together with a Mamdani-based fuzzy inference system and a neuro-fuzzy based system, have been evaluated for modeling maintenance severity, that is, the impact of fault severity. The results are recorded in terms of Accuracy, Mean Absolute Error (MAE) and Root Mean Squared Error (RMSE). The results show that the neuro-fuzzy based model provides relatively better prediction accuracy than the other models and hence can be used for predicting the maintenance severity of software.
Abstract: A new method for color image segmentation using fuzzy logic is proposed in this paper. Our aim is to automatically produce a fuzzy system for color classification and image segmentation with the smallest number of rules and the minimum error rate. Particle swarm optimization is a subclass of evolutionary algorithms inspired by the social behavior of fish, bees, birds, and other animals that live together in colonies. We use the comprehensive learning particle swarm optimization (CLPSO) technique to find optimal fuzzy rules and membership functions because it discourages premature convergence. Each particle of the swarm encodes a set of fuzzy rules. During evolution, a population member tries to maximize a fitness criterion, which here combines a high classification rate with a small number of rules. Finally, the particle with the highest fitness value is selected as the best set of fuzzy rules for image segmentation. Our results, using this method for soccer field image segmentation in RoboCup contests, show 89% performance. This method needs less computational load than other methods such as ANFIS, because it generates a smaller number of fuzzy rules. The large and varied training dataset makes the proposed method invariant to illumination noise.
Abstract: Using 1 km grid datasets representing monthly mean
precipitation, monthly mean temperature, and dry matter production
(DMP), we considered the regional plant production ability in
Southeast and South Asia, and also employed pixel-by-pixel
correlation analysis to assess the strength of the relationship between
climate factors and plant production. While annual DMP in South Asia was
generally below 2,000 kg, that in most parts of Southeast
Asia exceeded 2,500 to 3,000 kg. This suggests that plant production in
Southeast Asia was superior to that in South Asia. However, Rain-Use
Efficiency (RUE), representing dry matter production per 1 mm of
precipitation, showed that the inland Indochina Peninsula and India
had higher values than the islands of Southeast Asia. In the
correlation analysis between climate factors and DMP,
most parts of the Indochina Peninsula showed negative correlation
coefficients between DMP and both precipitation and temperature; the
Malay Peninsula and the islands showed a negative correlation with
precipitation and a positive one with temperature; and most of India,
which dominates South Asia, showed a positive correlation with
precipitation and a negative one with temperature. In addition, the areas
where the absolute value of the correlation coefficient exceeded 0.8 were
regarded as "susceptible" to climate factors, and the areas where it was
smaller than 0.2 as "insusceptible". Following this
discrimination, a map implying the expected impacts of climate change
was provided.
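The per-pixel quantities described above can be sketched as follows. This is only an illustration of the stated definitions (RUE as DMP per 1 mm of precipitation, and the 0.8 and 0.2 thresholds on the absolute correlation coefficient); the function names and the "intermediate" label for values between the two thresholds are assumptions, not taken from the paper.

```python
import math


def rain_use_efficiency(dmp, precipitation_mm):
    """Dry matter production per 1 mm of precipitation for one pixel."""
    return dmp / precipitation_mm


def pearson(xs, ys):
    """Pearson correlation of two equally long series for one pixel."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if norm_x == 0 or norm_y == 0:
        return float("nan")  # correlation is undefined for a constant series
    return cov / (norm_x * norm_y)


def classify(r, high=0.8, low=0.2):
    """Label a pixel by the absolute value of its correlation coefficient."""
    if abs(r) > high:
        return "susceptible"
    if abs(r) < low:
        return "insusceptible"
    return "intermediate"  # hypothetical label for the remaining range
```

Applying `pearson` to the monthly DMP series and the monthly precipitation (or temperature) series of each pixel, and then `classify` to the result, yields the kind of susceptibility map the abstract refers to.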
Abstract: Saturated hydraulic conductivity of soil is an
important property in processes involving water and solute flow in
soils. It is difficult to measure
and can be highly variable, requiring a large number of replicate
samples. In this study, 60 sets of soil samples were collected in the
Saqhez region of Kurdistan province, Iran. Statistics such as the
Correlation Coefficient (R), Root Mean Square Error (RMSE), Mean
Bias Error (MBE) and Mean Absolute Error (MAE) were used to
evaluate multiple linear regression models built with different numbers
of datasets. The multiple linear regression models were
evaluated when only the percentages of sand, silt, and clay content (SSC)
were used as inputs, and when SSC and bulk density, Bd, (SSC+Bd)
were used as inputs. For the 50 datasets, the R, RMSE, MBE and MAE
values were 0.925, 15.29, -1.03 and
12.51 for method (SSC), and 0.927, 15.28, -1.11
and 12.92 for method (SSC+Bd), respectively, for the relationship
obtained from multiple linear regression on the data. For the 10
datasets, the R, RMSE, MBE and MAE values were 0.725, 19.62,
-9.87 and 18.91 for method (SSC), and 0.618,
24.69, -17.37 and 22.16 for method (SSC+Bd), respectively, which
shows that as the number of datasets increases, the precision of the
estimated saturated hydraulic conductivity increases.
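For reference, the four statistics named above can be computed as in the sketch below. The function name is illustrative, and the sign convention for MBE (predicted minus observed, so a negative value means underestimation) is an assumption; the abstract does not state which convention was used.

```python
import math


def regression_metrics(observed, predicted):
    """Return R, RMSE, MBE and MAE for paired observed/predicted values."""
    n = len(observed)
    # Assumed convention: error = predicted - observed.
    errors = [p - o for o, p in zip(observed, predicted)]
    mbe = sum(errors) / n
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)

    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    r = cov / math.sqrt(var_o * var_p)  # Pearson correlation coefficient

    return {"R": r, "RMSE": rmse, "MBE": mbe, "MAE": mae}
```

Because MBE keeps the sign of each error while MAE does not, a model can have an MBE near zero and still a large MAE when over- and underestimates cancel.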
Abstract: In the past years a lot of effort has been made in the
field of face detection. The human face contains important features
that can be used by vision-based automated systems in order to
identify and recognize individuals. Face location, the primary step of
the vision-based automated systems, finds the face area in the input
image. Accurately locating the face is still a challenging task.
The Viola-Jones framework has been widely used by researchers
to detect the location of faces and objects in a given image. Face
detection classifiers are shared by public communities, such as
OpenCV. An evaluation of these classifiers will help researchers
choose the best classifier for their particular need. This work focuses
on the evaluation of face detection classifiers with regard to facial
landmarks.
Abstract: This paper presents a real-time force sensing
instrument that is designed for human gait analysis. It is
capable of recording and monitoring ground reaction forces exerted
by the human foot during various activities such as walking, running and
jumping in real time. Overall, the instrument consists mainly of
three elements: the force sensing mat, a signal conditioning circuit and a
data acquisition device. The force sensing mat contains an
array of force sensing elements. To control and process the incoming
signal from the force sensing mat, Force-Logger and Force-Reloader
were developed using National Instruments LabVIEW. This paper
describes the architecture of the force sensing mat, signal
conditioning circuit and the real time streaming of the incoming data
from the force sensing mat. Additionally, a preliminary experiment
dataset is presented in this paper.
Abstract: This article addresses feature selection for breast
cancer diagnosis. The proposed process uses a wrapper approach
based on a Genetic Algorithm (GA) and case-based reasoning (CBR).
The GA searches the problem space for candidate
subsets of features, and CBR is employed to estimate the evaluation
result of each subset. The experimental results show that the
proposed model is comparable to other models on the Wisconsin
Diagnostic Breast Cancer (WDBC) dataset.
Abstract: The literature reports a large number of approaches for
measuring the similarity between protein sequences. Most of these
approaches estimate this similarity using alignment-based techniques
that do not necessarily yield biologically plausible results, for two
reasons.
First, for the case of non-alignable (i.e., not yet definitively aligned
and biologically approved) sequences such as multi-domain, circular
permutation and tandem repeat protein sequences, alignment-based
approaches do not succeed in producing biologically plausible results.
This is due to the nature of the alignment, which is based on the
matching of subsequences in equivalent positions, while non-alignable
proteins often have similar and conserved domains in non-equivalent
positions.
Second, the alignment-based approaches lead to similarity measures
that depend heavily on the parameters set by the user for the alignment
(e.g., gap penalties and substitution matrices). For easily alignable
protein sequences, it's possible to supply a suitable combination of
input parameters that allows such an approach to yield biologically
plausible results. However, for difficult-to-align protein sequences,
supplying different combinations of input parameters yields different
results. Such variable results create ambiguities and complicate the
similarity measurement task.
To overcome these drawbacks, this paper describes a novel and
effective approach for measuring the similarity between protein
sequences, called SAF for Substitution and Alignment Free. Without
resorting either to the alignment of protein sequences or to substitution
relations between amino acids, SAF is able to efficiently detect the
significant subsequences that best represent the intrinsic properties of
protein sequences, those underlying the chronological dependencies of
structural features and biochemical activities of protein sequences.
Moreover, by using a new efficient subsequence matching scheme,
SAF more efficiently handles protein sequences that contain similar
structural features with significant meaning in chronologically
non-equivalent positions. To show the effectiveness of SAF, extensive
experiments were performed on protein datasets from different
databases, and the results were compared with those obtained by
several mainstream algorithms.
Abstract: Mining sequential patterns from large customer transaction databases has been recognized as a key research topic in database systems. However, previous work has focused mostly on mining sequential patterns at a single concept level. In this study, we introduce concept hierarchies into this problem and present several algorithms for discovering multiple-level sequential patterns based on the hierarchies. An experiment was conducted to assess the performance of the proposed algorithms. Performance was measured by the relative time spent on completing the mining tasks on two different datasets. The experimental results show that performance depends on the characteristics of the datasets and on the pre-defined minimum support threshold for each level of the concept hierarchy. Based on the experimental results, some suggestions are also given on how to select an appropriate algorithm for a given dataset.
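To make the notion of a multiple-level sequential pattern concrete, the sketch below counts the support of a pattern after lifting items to a higher level of a concept hierarchy. It is not one of the algorithms proposed in the paper: the child-to-parent dictionary, the function names and the simplification that each sequence element is a single item are all assumptions for illustration.

```python
def ancestor(item, parent, steps):
    """Climb `steps` levels up the hierarchy (child -> parent mapping)."""
    for _ in range(steps):
        item = parent.get(item, item)  # items at the top level stay unchanged
    return item


def is_subsequence(pattern, sequence):
    """True if `pattern` occurs in `sequence` in order, gaps allowed."""
    remaining = iter(sequence)
    return all(item in remaining for item in pattern)


def support(pattern, sequences, parent, steps=0):
    """Fraction of customer sequences containing `pattern` at the given level."""
    lifted = [[ancestor(item, parent, steps) for item in seq]
              for seq in sequences]
    return sum(is_subsequence(pattern, seq) for seq in lifted) / len(sequences)
```

A pattern that is rare at the item level (for example a specific drink followed by a specific snack) can become frequent once both items are replaced by their categories, which is why a separate minimum support threshold per level is useful.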
Abstract: Increasing detection rates and reducing false positive
rates are important problems in Intrusion Detection Systems (IDS).
Although preventative techniques such as access control and
authentication attempt to prevent intruders, these can fail, and as a
second line of defence, intrusion detection has been introduced. Rare
events are events that occur very infrequently; detection of rare
events is a common problem in many domains. In this paper we
propose an intrusion detection method that combines rough sets and
fuzzy clustering. Rough sets are used to decrease the amount of data and
remove redundancy. Fuzzy c-means clustering allows objects to
belong to several clusters simultaneously, with different degrees of
membership. Our approach allows us not only to recognize known
attacks but also to detect suspicious activity that may be the result of
a new, unknown attack. The experimental results on the Knowledge
Discovery and Data Mining (KDD Cup 1999) dataset show that the
method is efficient and practical for intrusion detection systems.
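The degrees of membership mentioned above can be illustrated with the two update steps of standard fuzzy c-means. This is a generic sketch of the textbook formulas, not the authors' pipeline: the function names, the fuzzifier default `m = 2.0` and the handling of a point that coincides with a center are assumptions.

```python
import math


def memberships(point, centers, m=2.0):
    """Membership of one point in each cluster (values sum to 1)."""
    dists = [math.dist(point, c) for c in centers]
    if any(d == 0 for d in dists):
        # Assumed convention: a point lying on a center belongs fully to it.
        hits = [1.0 if d == 0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    exponent = 2.0 / (m - 1.0)
    return [1.0 / sum((di / dk) ** exponent for dk in dists) for di in dists]


def update_centers(points, u, m=2.0):
    """Recompute centers as membership-weighted means; u[i][j] is the
    membership of point i in cluster j."""
    dims = len(points[0])
    centers = []
    for j in range(len(u[0])):
        weights = [row[j] ** m for row in u]
        total = sum(weights)
        centers.append(tuple(
            sum(w * p[d] for w, p in zip(weights, points)) / total
            for d in range(dims)
        ))
    return centers
```

Alternating the two functions until the memberships stop changing gives the usual fuzzy c-means iteration; a record with no dominant membership in any cluster is then a natural candidate for the "suspicious activity" the abstract describes.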
Abstract: Segmentation and quantification of stenosis is an
important task in assessing coronary artery disease. One of the main
challenges is measuring the real diameter of curved vessels.
Moreover, uncertainty in segmentation of different tissues in the
narrow vessel is an important issue that affects accuracy. This paper
proposes an algorithm to extract coronary arteries and measure the
degree of stenosis. A Markovian fuzzy clustering method is applied to
model the uncertainty that arises from the partial volume effect. The
algorithm comprises segmentation, centreline extraction, estimation of
the plane orthogonal to the centreline, and measurement of the degree of
stenosis. To evaluate accuracy and reproducibility, the approach
has been applied to a vascular phantom and the results are compared
with the real diameter. The results for 10 patient datasets have been
visually judged by a qualified radiologist. The results reveal the
superiority of the proposed method over the Conventional
Thresholding Method (CTM) on both datasets.
Abstract: This paper details the application of a genetic
programming framework for induction of useful classification rules
from a database of income statements, balance sheets, and cash flow
statements for North American public companies. Potentially
interesting classification rules are discovered. Anomalies in the
discovery process merit further investigation of the application of
genetic programming to the dataset for the problem domain.
Abstract: Distance visualization of large datasets often takes the form of remote viewing and zooming of stored static images. However, the continuous increase in the size of datasets and in the cost of visualization operations leads to insufficient performance on traditional desktop computers. Additionally, visualization techniques such as isosurface rendering depend on the available resources of the running machine and on the size of the datasets. Moreover, the continuous demand for computing power and the continuous increase in the size of datasets result in an urgent need for a grid computing infrastructure. However, some issues arise in current grids, such as resources at the client machines that are insufficient to process large datasets. On top of that, different output devices and different network bandwidths between the visualization pipeline components often result in output that is suitable for one machine but not for another. In this paper we investigate how grid services can be used to support remote visualization of large datasets and to break the constraint of physical co-location of resources by applying grid computing technologies. We present our grid-enabled architecture for remote interactive visualization of large medical datasets (circa 5 million polygons) on clients with modest resources.
Abstract: The goal of data mining algorithms is to discover
useful information embedded in large databases. One of the most
important data mining problems is discovery of frequently occurring
patterns in sequential data. In a multidimensional sequence each
event depends on more than one dimension. The search space is quite
large and the serial algorithms are not scalable for very large
datasets. To address this, it is necessary to study scalable parallel
implementations of sequence mining algorithms.
In this paper, we present a model for multidimensional sequence
and describe a parallel algorithm based on data parallelism.
Simulation experiments show good load balancing and scalable,
acceptable speedup over different numbers of processors and problem
sizes, and demonstrate that our approach can work efficiently in a real
parallel computing environment.
Abstract: Empirical insights into the implementation of logistics competencies at the top management level are scarce. This paper addresses this issue with an explorative approach which is based on a dataset of 872 observations in the years 2000, 2004 and 2008 using quantitative content analysis from annual reports of the 500 publicly listed firms with the highest global research and development expenditures according to the British Department for Business Innovation and Skills. We find that logistics competencies are more pronounced in Asian companies than in their European or American counterparts. On an industrial level the results are quite mixed. Using partial point-biserial correlations we show that logistics competencies are positively related to financial performance.
Abstract: In this paper, we evaluate the performance of some wavelet-based coding algorithms, namely 3D QT-L, 3D SPIHT and JPEG2K. In the first step we carry out an objective comparison between the three coders. For this purpose, eight MRI head scan test sets of 256 x 256 x 124 voxels have been used. Results show superior performance of the 3D SPIHT algorithm, whereas 3D QT-L outperforms JPEG2K. The second step consists of evaluating the robustness of the 3D SPIHT and JPEG2K coding algorithms over wireless transmission. Compressed dataset images are transmitted over an AWGN wireless channel or over a Rayleigh wireless channel. Results show the superiority of JPEG2K over these two models. In fact, it has been deduced that JPEG2K is more robust to coding errors. Thus we may conclude that error-correcting codes are necessary in order to protect the transmitted medical information.
Abstract: The paper presents an on-line recognition machine
(RM) for continuous/isolated, dynamic and static gestures that arise
in Flight Deck Officer (FDO) training. RM is based on a generic pattern
recognition framework. Gestures are represented as templates using
summary statistics. The proposed recognition algorithm exploits temporal
and spatial characteristics of gestures via dynamic programming
and a Markovian process. The algorithm predicts the corresponding index
of incremental input data in the templates in an on-line mode.
Accumulated consistency in the sequence of predictions provides a
similarity measurement (Score) between input data and the templates.
The algorithm provides an intuitive mechanism for automatic detection
of start/end frames of continuous gestures. In the present paper,
we consider isolated gestures. The performance of RM is evaluated
using four datasets - artificial (W TTest), hand motion (Yang) and
FDO (tracker, vision-based). RM achieves comparable results which
are in agreement with other on-line and off-line algorithms such as
hidden Markov model (HMM) and dynamic time warping (DTW).
The proposed algorithm has the additional advantage of providing
timely feedback for training purposes.
Abstract: Segmentation is an important step in medical image
analysis and classification for radiological evaluation or computer
aided diagnosis. Computer Aided Diagnosis (CAD) of lung CT
generally first segments the area of interest (the lung) and then analyzes
the separately obtained area for nodule detection in order to
diagnose the disease. For a normal lung, segmentation can be
performed by making use of the excellent contrast between air and
surrounding tissues. However, this approach fails when the lung is
affected by high density pathology. Dense pathologies are present in
approximately a fifth of clinical scans, and for computer analysis
such as detection and quantification of abnormal areas it is vital that
the entire lung region of the image is provided and that no
part present in the original image is removed. In this paper we
propose a lung segmentation technique which accurately
segments the lung parenchyma from lung CT scan images. The
algorithm was tested against 25 datasets of different patients
received from Ackron University, USA and Aga Khan Medical
University, Karachi, Pakistan.
Abstract: The amount of information being produced by the field of biology has grown manifold and now requires the extensive use of computer techniques for its management. Within this sea of biological information, protein sequence similarity is key information for detecting protein evolutionary relationships. Protein sequence similarity typically implies homology, which in turn may imply structural and functional similarities. In this work, we propose a learning method for detecting remote protein homology. The proposed method uses a transformation that converts protein sequences into fixed-dimensional representative feature vectors. Each feature vector records the sensitivity of a protein sequence to a set of amino acid substrings generated from the protein sequences of interest. These features are then used in conjunction with support vector machines for the detection of remote protein homology. The proposed method is tested and evaluated on two different benchmark protein datasets, and it is able to deliver improvements over most of the existing homology detection methods.