Abstract: Recent developments in storage technology and
networking architectures have made it possible for broad areas of applications to rely on data streams for quick response and accurate
decision making. Data streams are generated from events of real world so existence of associations, which are among the occurrence of these events in real world, among concepts of data streams is
logical. Extraction of these hidden associations can be useful for prediction of subsequent concepts in concept shifting data streams. In this paper we present a new method for learning association among
concepts of data stream and prediction of what the next concept will be. Knowing the next concept, an informed update of data model will be possible. The results of conducted experiments show that the proposed method is proper for classification of concept shifting data
streams.
Abstract: As email communications have no consistent
authentication procedure to ensure authenticity, we present an
investigative analysis approach for detecting forged emails based on
Random Forests and Naïve Bayes classifiers. Instead of investigating
the email headers, we use the body content to extract a unique writing
style for each of the possible suspects. Our approach consists of four main
steps: (1) the cybercrime investigator extracts different effective
features, including structural, lexical, linguistic, and syntactic
evidence, from previous emails of all the possible suspects; (2) the
extracted feature vectors are normalized to increase the accuracy
rate; (3) the normalized features are then used to train the learning
engine; (4) upon receiving the anonymous email M, we apply the
feature extraction process to produce a feature vector. Finally, using
the machine learning classifiers, the email is assigned to the
suspect whose writing style most closely matches M. Experimental
results on real data sets show the improved performance of the
proposed method and its ability to identify the authors with a
very limited number of features.
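The pipeline above (feature extraction, normalization, training, matching) can be sketched roughly as follows; the feature values, suspect names and the nearest-centroid matching rule are illustrative assumptions, not the paper's actual Random Forests / Naïve Bayes classifiers.

```python
import numpy as np

# Hypothetical stylometric feature vectors (e.g. average sentence length,
# punctuation rate, function-word frequency) from previous emails.
training = {
    "suspect_a": np.array([[18.0, 0.12, 0.30], [17.5, 0.11, 0.29]]),
    "suspect_b": np.array([[9.0, 0.25, 0.10], [9.5, 0.27, 0.12]]),
    "suspect_c": np.array([[25.0, 0.05, 0.40], [24.0, 0.06, 0.41]]),
}

# Min-max normalization bounds computed over all training vectors.
all_rows = np.vstack(list(training.values()))
lo, hi = all_rows.min(axis=0), all_rows.max(axis=0)

def normalize(v):
    return (v - lo) / (hi - lo)

def attribute(anonymous_vector):
    """Assign the anonymous email M to the suspect whose mean
    normalized style vector is closest (Euclidean distance)."""
    m = normalize(anonymous_vector)
    centroids = {s: normalize(rows).mean(axis=0) for s, rows in training.items()}
    return min(centroids, key=lambda s: np.linalg.norm(m - centroids[s]))

print(attribute(np.array([9.2, 0.26, 0.11])))  # closest to suspect_b's style
```

A real system would replace the nearest-centroid rule with the trained classifiers, but the normalization and matching flow is the same.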
Abstract: Two separate experiments, with barley and alfalfa, were
conducted in a 2×8 factorial completely randomized design with four
replicates. The factors were inoculation (M) with Glomus mosseae or no inoculation
(M0), and seven levels of contaminants (Co, Cd, Pb and
combinations) plus an uncontaminated control treatment (C). Heavy
metals in plant tissues and soil were quantified by Inductively
Coupled Plasma Optical Emission Spectrometry (ICP-OES) (Varian
Liberty 150AX Turbo). The phytoextraction coefficient of each contaminant
was calculated as the concentration of heavy metals in the shoot (mg kg-1) divided by the
concentration of heavy metals in the soil (mg kg-1). In barley, the
highest phytoextraction coefficients for Pb, Cd and Co were found in
M0Pb, M0PbCoCd and MCo, respectively (P
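The coefficient defined above is a simple ratio of shoot to soil concentration; a minimal sketch, with made-up concentrations rather than the study's measurements:

```python
def phytoextraction_coefficient(shoot_mg_per_kg, soil_mg_per_kg):
    """Ratio of heavy-metal concentration in the shoot (mg/kg)
    to that in the soil (mg/kg)."""
    return shoot_mg_per_kg / soil_mg_per_kg

# Illustrative values: 12 mg/kg Pb in shoot tissue, 48 mg/kg Pb in soil.
print(phytoextraction_coefficient(12.0, 48.0))  # 0.25
```

A coefficient above 1 indicates the plant concentrates the metal relative to the soil, which is the property of interest for phytoextraction.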
Abstract: Face recognition is a field with multidimensional
applications, and extensive work has been done on most of its
details. Face recognition using PCA is one such approach. In this
paper, PCA features are used for feature extraction, and the face
under consideration is matched against the test image using
eigenface coefficients. The crux of the work lies in optimizing the
Euclidean distance and in testing the same algorithm using Matlab,
an efficient tool with a powerful user interface and a simple way of
representing complex images.
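The matching step described here, choosing the identity whose eigenface-coefficient vector lies nearest in Euclidean distance, can be sketched as follows; the coefficient values and identity labels are illustrative assumptions, and the original work uses Matlab rather than Python.

```python
import numpy as np

# Toy "eigenface coefficient" vectors: projections of gallery face images
# onto the leading principal components (values are illustrative only).
gallery = {
    "person_1": np.array([2.1, -0.5, 0.8]),
    "person_2": np.array([-1.0, 1.7, 0.2]),
}

def match(test_coeffs):
    """Return the gallery identity whose eigenface coefficient vector
    has the smallest Euclidean distance to the test vector."""
    return min(gallery, key=lambda p: np.linalg.norm(gallery[p] - test_coeffs))

print(match(np.array([2.0, -0.4, 0.9])))  # nearest to person_1
```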
Abstract: This work presents the extraction of copper(II)
from sulphuric acid solutions with sodium diethyldithiocarbamate
(SDDT). Six different organic diluents were tested: dichloromethane,
chloroform, carbon tetrachloride, toluene, xylene and cyclohexane.
The SDDT/chloroform pair proved to be the most selective in
removing the copper cations and was therefore used
throughout the experimental study.
The effects of operating parameters such as the initial concentration
of the extracting agent, the agitation time, the agitation speed and the
acid concentration were considered.
For an initial Cu(II) concentration of 63 ppm in a 0.5 M sulphuric
acid solution, with 20 mg of the extracting agent, an
extraction percentage of about 97.8% and a distribution coefficient
of 44.42 were obtained, confirming the performance
of the SDDT/chloroform pair.
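Under the usual definitions (and assuming equal phase volumes), the two reported quantities relate to the initial and residual aqueous concentrations as sketched below; the residual value of about 1.39 ppm is back-calculated here for illustration, not taken from the paper.

```python
def extraction_percentage(c_initial, c_aqueous_final):
    """Percentage of solute transferred from the aqueous phase."""
    return 100.0 * (c_initial - c_aqueous_final) / c_initial

def distribution_coefficient(c_initial, c_aqueous_final):
    """Ratio of solute in the organic phase to the aqueous phase,
    assuming equal phase volumes."""
    return (c_initial - c_aqueous_final) / c_aqueous_final

# Illustrative check against the reported figures: 63 ppm initial Cu(II),
# ~1.39 ppm left in the aqueous phase after extraction.
print(round(extraction_percentage(63.0, 1.39), 1))    # ~97.8
print(round(distribution_coefficient(63.0, 1.39), 1)) # ~44.3
```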
Abstract: Artemia is one of the most conspicuous invertebrates
associated with aquaculture. It can be considered a model
organism, offering numerous advantages for comprehensive and
multidisciplinary studies using morphological or molecular methods.
Since DNA extraction is an important step in any molecular
experiment, a new and rapid method of DNA extraction from adult
Artemia is described in this study. In addition, the efficiency of this
technique was compared with two widely used alternatives,
namely the Chelex® 100 resin and SDS-chloroform methods. Data
analysis revealed that the new method is the easiest and most cost-effective
of the three, allowing quick and efficient extraction of DNA
from the adult animal.
Abstract: Erwinia carotovora var. carotovora is the main cause of soft rot in potatoes. Hyphaene thebaica was studied for biocontrol of E. carotovora, and its extract inhibited the growth of E. carotovora on solid medium. A comparative study of classical and ultrasound-assisted extraction of Hyphaene thebaica fruit was carried out. The use of ultrasound significantly decreased the total treatment time and increased the total amount of crude extract. The crude extract was evaluated in vitro by a bioassay technique, which revealed that treating paper disks with the ultrasound extract of Hyphaene thebaica reduced the growth of the pathogen and produced inhibition zones of up to 38 mm in diameter. The antioxidant activity of the ultrasound-assisted ethanolic extract of Doum fruit (Hyphaene thebaica) was determined. The data obtained showed that the extract contains secondary metabolites such as tannins, saponins, flavonoids, phenols, steroids, terpenoids, glycosides and alkaloids.
Abstract: This paper describes a text mining technique for automatically extracting association rules from collections of textual documents. The technique is called Extracting Association Rules from Text (EART). It relies on keyword features for discovering association rules among the keywords labeling the documents. In this work, the EART system ignores the order in which the words occur, focusing instead on the words and their statistical distributions in documents. The main contribution of the technique is that it integrates XML technology with an Information Retrieval scheme (TF-IDF) for keyword/feature selection, which automatically selects the most discriminative keywords for use in association rule generation, and uses a data mining technique for association rule discovery. It consists of three phases: a text preprocessing phase (transformation, filtration, stemming and indexing of the documents), an Association Rule Mining (ARM) phase (applying our algorithm for Generating Association Rules based on a Weighting scheme, GARW) and a visualization phase (visualization of results). Experiments were applied to web-page news documents related to the outbreak of the bird flu disease. The extracted association rules contain important features and describe the informative news included in the document collection. The performance of the EART system was compared with that of a system using the Apriori algorithm in terms of execution time and the evaluation of the extracted association rules.
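The TF-IDF weighting used for keyword selection can be sketched as follows; the toy corpus is illustrative, and the actual EART system applies this within its preprocessing and GARW phases.

```python
import math

docs = [
    ["bird", "flu", "outbreak", "asia"],
    ["bird", "flu", "vaccine"],
    ["stock", "market", "report"],
]

def tfidf(term, doc, corpus):
    """TF-IDF weight: term frequency in the document times the
    inverse document frequency over the corpus."""
    tf = doc.count(term) / len(doc)
    df = sum(1 for d in corpus if term in d)
    idf = math.log(len(corpus) / df)
    return tf * idf

# "flu" appears in 2 of 3 documents while "asia" appears in only one,
# so "asia" is the more discriminative keyword for the first document.
print(tfidf("asia", docs[0], docs) > tfidf("flu", docs[0], docs))  # True
```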
Abstract: An important step in studying the statistics of
fingerprint minutia features is to reliably extract minutia features from
the fingerprint images. A new reliable method of computation for
minutiae feature extraction from fingerprint images is presented. A
fingerprint image is treated as a textured image. An orientation flow
field of the ridges is computed for the fingerprint image. To
accurately locate ridges, a new ridge-orientation-based computation
method is proposed. After ridge segmentation, a new computational
method is proposed for smoothing the ridges. The ridge skeleton
image is obtained and then smoothed using morphological operators
to detect the features. A post-processing stage eliminates a large
number of false features from the detected set of minutiae.
The detected features are observed to be reliable and accurate.
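The orientation flow field mentioned above is commonly estimated blockwise from image gradients; the sketch below is a generic least-squares estimator, not necessarily the paper's exact ridge-orientation computation.

```python
import numpy as np

def orientation_field(img, block=8):
    """Blockwise dominant gradient orientation via the doubled-angle
    least-squares estimate: theta = 0.5 * atan2(2*sum(GxGy), sum(Gx^2 - Gy^2)).
    The ridge direction is perpendicular to this gradient orientation."""
    gy, gx = np.gradient(img.astype(float))  # row- and column-direction gradients
    h, w = img.shape
    theta = np.zeros((h // block, w // block))
    for i in range(0, h - block + 1, block):
        for j in range(0, w - block + 1, block):
            bx = gx[i:i + block, j:j + block]
            by = gy[i:i + block, j:j + block]
            num = 2.0 * np.sum(bx * by)
            den = np.sum(bx ** 2 - by ** 2)
            theta[i // block, j // block] = 0.5 * np.arctan2(num, den)
    return theta

# Synthetic vertical ridges: the gradient points horizontally, so the
# doubled-angle orientation is 0 in every block.
img = np.tile(np.sin(np.linspace(0, 8 * np.pi, 32)), (32, 1))
print(np.allclose(orientation_field(img), 0.0))  # True
```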
Abstract: We present a new method for the fully automatic 3D
reconstruction of the coronary artery centerlines, using two X-ray
angiogram projection images from a single rotating monoplane
acquisition system. During the first stage, the input images are
smoothed using curve evolution techniques. Next, a simple yet
efficient multiscale method, based on the information of the Hessian
matrix, for the enhancement of the vascular structure is introduced.
Hysteresis thresholding using different image quantiles is used to
threshold the arteries. This stage is followed by a thinning procedure
to extract the centerlines. The resulting skeleton image is then pruned
using morphological and pattern recognition techniques to remove
non-vessel-like structures. Finally, edge-based stereo correspondence
is solved using a parallel evolutionary optimization method based on
symbiosis. The detected 2D centerlines combined with disparity
map information allow the reconstruction of the 3D vessel
centerlines. The proposed method has been evaluated on patient data
sets.
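The Hessian-based enhancement step can be sketched at a single scale in the spirit of Frangi's vesselness measure; the parameters `beta` and `c`, and the bright-vessel convention, are illustrative assumptions rather than the paper's exact multiscale formulation.

```python
import numpy as np

def vesselness(img, beta=0.5, c=0.5):
    """Single-scale, Frangi-style vesselness from the eigenvalues of the
    image Hessian. Bright tubular structures (lam2 < 0) are enhanced."""
    gy, gx = np.gradient(img.astype(float))
    hyx, hxx = np.gradient(gx)  # second derivatives via repeated gradients
    hyy, _ = np.gradient(gy)
    mean = (hxx + hyy) / 2.0
    diff = np.sqrt(((hxx - hyy) / 2.0) ** 2 + hyx ** 2)
    l1, l2 = mean + diff, mean - diff
    swap = np.abs(l1) > np.abs(l2)  # order so |lam1| <= |lam2|
    lam1 = np.where(swap, l2, l1)
    lam2 = np.where(swap, l1, l2)
    rb = lam1 / (lam2 + 1e-12)          # blobness ratio
    s = np.sqrt(lam1 ** 2 + lam2 ** 2)  # second-order structureness
    v = np.exp(-rb ** 2 / (2 * beta ** 2)) * (1 - np.exp(-s ** 2 / (2 * c ** 2)))
    v[lam2 > 0] = 0.0                   # keep only bright-on-dark ridges
    return v

# A bright horizontal line scores higher than the flat background.
img = np.zeros((21, 21))
img[10, :] = 1.0
v = vesselness(img)
print(v[10, 10] > v[2, 2])  # True
```

A multiscale version would evaluate this over a range of Gaussian smoothing scales and keep the maximum response per pixel.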
Abstract: A human verification system is presented in this
paper. The system consists of several steps: background subtraction,
thresholding, line connection, region growing, morphology, star
skeletonization, feature extraction, feature matching, and decision
making. The proposed system combines the advantages of star
skeletonization and simple statistical features. Correlation matching
and probability voting are used for verification, followed by a
logical operation in the decision-making stage. The proposed system
uses a small number of features, and its reliability is
convincing.
Abstract: Lanthanum oxide is to be recovered from monazite,
which contains about 13.44% lanthanum oxide. The principal
objective of this study is to extract lanthanum oxide from
monazite of the Moemeik Myitsone area. The treatment of monazite in
this study involves three main steps: extraction of lanthanum
hydroxide from monazite using caustic soda; digestion with nitric
acid and precipitation with ammonium hydroxide; and calcination of
lanthanum oxalate to lanthanum oxide.
Abstract: ICA, which is generally used for the blind source separation
problem, has been tested for feature extraction in a speech recognition
system to replace the phoneme-based MFCC approach. Applying
the generated cepstral coefficients to ICA as preprocessing yields
a new signal processing approach, which gives much better
results than MFCC and ICA separately, both for word and speaker
recognition. The mixing matrix A is different before and after MFCC,
as expected, since Mel is a nonlinear scale. Cepstral coefficients
generated from linear predictive coefficients, being independent,
prove to be the right candidates for ICA. Matlab is the tool used for
all comparisons. The database used consists of samples from ISOLET.
Abstract: A multi-residue analysis method for penicillins was
developed and validated for bovine muscle, chicken, milk, and flatfish.
Detection was based on liquid chromatography tandem mass
spectrometry (LC/MS/MS). The developed method was validated for
specificity, precision, recovery, and linearity. The analytes were
extracted with 80% acetonitrile and cleaned up by a single
reversed-phase solid-phase extraction step. Six penicillins presented
recoveries higher than 76%, with the exception of amoxicillin
(59.7%). Relative standard deviations (RSDs) were not more than
10%. LOQ values ranged from 0.1 to 4.5 µg/kg. The method was
applied to 128 real samples. Benzylpenicillin was detected in 15
samples, cloxacillin in 7 samples, and oxacillin in 2 samples,
but the detected levels were below the MRLs for penicillins.
Abstract: The use of High Order Statistics (HOS) analysis is
expected to provide many candidate features that can be selected for pattern recognition. More candidate features can
be extracted by simple manipulation through a specific mathematical function prior to the HOS analysis. A feature extraction
method using HOS analysis combined with a Difference-to-the-Nth-Power manipulation has been examined for Automatic
Modulation Recognition (AMR), performing scheme recognition of three digital modulation signals, i.e. QPSK, 16QAM and 64QAM, in an
AWGN transmission channel. Simulation results are reported
for HOS analysis up to order 12 and Difference-to-the-Nth-Power manipulation up to N = 4. The AMR accuracy rate
obtained with the Simple Decision classifier is 90% at SNR > 10 dB, while the Voted Decision method achieves
96% at SNR > 2 dB.
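The kind of HOS feature involved can be illustrated with normalized higher-order moments of ideal constellation points; this is a generic sketch of why higher orders separate QPSK from QAM, not the paper's Difference-to-the-Nth-Power scheme.

```python
import numpy as np

def hos_features(x, max_order=8):
    """Normalized absolute moments E[|x|^k] / E[|x|^2]^(k/2) for even k,
    a simple scale-invariant higher-order-statistics feature set."""
    p = np.mean(np.abs(x) ** 2)
    return [np.mean(np.abs(x) ** k) / p ** (k / 2) for k in range(2, max_order + 1, 2)]

# QPSK has a constant envelope, so every normalized moment equals 1;
# 16QAM's varying envelope makes its higher moments larger.
qpsk = np.exp(1j * (np.pi / 4 + np.pi / 2 * np.arange(4)))
levels = np.array([-3.0, -1.0, 1.0, 3.0])
re, im = np.meshgrid(levels, levels)
qam16 = (re + 1j * im).ravel()

print(all(abs(m - 1.0) < 1e-9 for m in hos_features(qpsk)))       # True
print(hos_features(qam16)[1] > hos_features(qpsk)[1])             # True
```

In an actual AMR system these features would be computed on noisy received symbols and fed to the decision stage.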
Abstract: A wide spectrum of systems require reliable
personal recognition schemes to either confirm or determine the
identity of an individual. This paper considers multimodal
biometric systems and their applicability to access control,
authentication and security applications. Strategies for feature
extraction and sensor fusion are considered and contrasted. Issues
related to performance assessment, deployment and standardization
are discussed. Finally, future directions of biometric system
development are discussed.
Abstract: Named Entity Recognition (NER) aims to classify each word of a document into predefined target named entity classes and is nowadays considered fundamental for many Natural Language Processing (NLP) tasks such as information retrieval, machine translation, information extraction, question answering systems and others. This paper reports on the development of an NER system for Bengali and Hindi using a Support Vector Machine (SVM). Though this state-of-the-art machine learning technique has been widely applied to NER in several well-studied languages, its use for Indian languages (ILs) is very new. The system makes use of the different contextual information of the words along with a variety of features that are helpful in predicting the four different named entity (NE) classes: Person name, Location name, Organization name and Miscellaneous name. We have used annotated corpora of 122,467 tokens of Bengali and 502,974 tokens of Hindi, tagged with the twelve different NE classes defined as part of the IJCNLP-08 NER Shared Task for South and South East Asian Languages (SSEAL). In addition, we have manually annotated 150K wordforms of the Bengali news corpus, developed from the web archive of a leading Bengali newspaper. We have also developed an unsupervised algorithm to generate lexical context patterns from a part of the unlabeled Bengali news corpus. The lexical patterns have been used as features of the SVM to improve the system performance. The NER system has been tested with gold standard test sets of 35K and 60K tokens for Bengali and Hindi, respectively. Evaluation demonstrated recall, precision, and f-score values of 88.61%, 80.12%, and 84.15%, respectively, for Bengali and 80.23%, 74.34%, and 77.17%, respectively, for Hindi. The results show an improvement in the f-score of 5.13% with the use of context patterns. A statistical analysis, ANOVA, is also performed to compare the performance of the proposed NER system with that of the existing HMM-based system for both languages.
Abstract: Surface metrology with image processing is a challenging task with wide applications in industry. Surface roughness can be evaluated using a texture classification approach. An important aspect here is the appropriate selection of features that characterize the surface. We propose an effective combination of features for multi-scale and multi-directional analysis of engineering surfaces. The features include the standard deviation, kurtosis and the Canny edge detector. We apply the method by analyzing the surfaces with the Discrete Wavelet Transform (DWT) and the Dual-Tree Complex Wavelet Transform (DT-CWT). We used the Canberra distance metric for similarity comparison between the surface classes. Our database includes surface textures manufactured by three machining processes, namely milling, casting and shaping. The comparative study shows that DT-CWT outperforms DWT, giving a correct classification performance of 91.27% with the Canberra distance metric.
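The Canberra distance used for similarity comparison is defined coordinate-wise; a minimal sketch with made-up feature values (e.g. standard deviation, kurtosis and edge density per wavelet subband):

```python
def canberra(u, v):
    """Canberra distance: sum of |u_i - v_i| / (|u_i| + |v_i|),
    skipping coordinates where both entries are zero."""
    return sum(abs(a - b) / (abs(a) + abs(b)) for a, b in zip(u, v) if a or b)

# Illustrative wavelet-subband feature vectors for two surfaces.
print(round(canberra([0.5, 3.1, 0.2], [0.4, 2.9, 0.25]), 4))  # 0.2556
```

Because each coordinate is normalized by its own magnitude, small features contribute as much as large ones, which suits heterogeneous texture features.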
Abstract: In this paper we propose a method for finding the video
frames representing one sign of the finger alphabet. The method is
based on determining the hand location, segmentation, and the use of
standard video quality evaluation metrics. Metric calculation is
performed only in regions of interest. A sliding mechanism for finding
local extrema and an adaptive threshold based on local averaging are used
for key frame selection. The success rate is evaluated by recall,
precision and the F1 measure. The method's effectiveness is compared
with the metrics applied to all frames. The proposed method is fast, effective
and relatively easy to realize through simple input video preprocessing
and the subsequent use of tools designed for video quality measurement.
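The evaluation measures named above are standard; a minimal sketch with illustrative frame indices (not the paper's data):

```python
def precision_recall_f1(detected, ground_truth):
    """Precision, recall and F1 for key-frame detection,
    with key frames given as sets of frame indices."""
    tp = len(detected & ground_truth)
    precision = tp / len(detected) if detected else 0.0
    recall = tp / len(ground_truth) if ground_truth else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Illustrative example: 4 of 5 detected key frames are correct,
# and 4 of the 6 true key frames were found.
p, r, f = precision_recall_f1({10, 25, 40, 55, 70}, {10, 25, 40, 55, 90, 120})
print(round(p, 2), round(r, 2), round(f, 3))  # 0.8 0.67 0.727
```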
Abstract: The MATCH project [1] entails the development of an
automatic diagnosis system that aims to support the treatment of colon
cancer by discovering mutations that occur in tumour
suppressor genes (TSGs) and contribute to the development of
cancerous tumours. The system is based on a)
colon cancer clinical data and b) biological information that will be
derived by data mining techniques from genomic and proteomic
sources. The core mining module will consist of popular, well-tested
hybrid feature extraction methods and new combined
algorithms designed especially for the project. Elements of rough
sets, evolutionary computing, cluster analysis, self-organizing maps
and association rules will be used to discover the associations
between genes and their influence on tumours [2]-[11].
The methods used to process the data have to address its high
complexity, potential inconsistency and the problems of dealing with
missing values. They must integrate all the useful information
necessary to answer the expert's question. For this purpose, the system
has to learn from data, or allow a domain specialist to interactively
specify, the part of the knowledge structure it needs to answer a
given query. The program should also take into account the
importance/rank of the particular parts of the data it analyses, and
adjust the algorithms used accordingly.