Abstract: Named Entity Recognition (NER) aims to classify each word of a document into predefined target named entity classes and is nowadays considered fundamental for many Natural Language Processing (NLP) tasks such as information retrieval, machine translation, information extraction, question answering and others. This paper reports on the development of a NER system for Bengali and Hindi using a Support Vector Machine (SVM). Though this state-of-the-art machine learning technique has been widely applied to NER in several well-studied languages, its use for Indian languages (ILs) is very new. The system makes use of the different contextual information of the words along with a variety of features that are helpful in predicting the four named entity (NE) classes: Person name, Location name, Organization name and Miscellaneous name. We have used annotated corpora of 122,467 tokens of Bengali and 502,974 tokens of Hindi tagged with the twelve different NE classes defined as part of the IJCNLP-08 NER Shared Task for South and South East Asian Languages (SSEAL). In addition, we have manually annotated 150K wordforms of the Bengali news corpus, developed from the web-archive of a leading Bengali newspaper. We have also developed an unsupervised algorithm to generate lexical context patterns from a part of the unlabeled Bengali news corpus. The lexical patterns have been used as features of the SVM in order to improve the system performance. The NER system has been tested with gold standard test sets of 35K and 60K tokens for Bengali and Hindi, respectively. Evaluation results have demonstrated recall, precision and f-score values of 88.61%, 80.12% and 84.15%, respectively, for Bengali and 80.23%, 74.34% and 77.17%, respectively, for Hindi. Results show an improvement in the f-score of 5.13% with the use of context patterns. A statistical analysis (ANOVA) is also performed to compare the performance of the proposed NER system with that of the existing HMM based system for both languages.
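The reported f-scores can be checked against the reported recall and precision. The sketch below assumes the f-score is the balanced F-measure (the harmonic mean of recall and precision); the abstract does not state the formula, but this assumption reproduces both reported values. The function name `f_score` is illustrative only.

```python
def f_score(recall: float, precision: float) -> float:
    """Balanced F-measure: harmonic mean of recall and precision."""
    if recall + precision == 0:
        return 0.0
    return 2 * recall * precision / (recall + precision)


# Recall and precision percentages as reported in the abstract.
bengali = f_score(88.61, 80.12)
hindi = f_score(80.23, 74.34)

print(f"Bengali f-score: {bengali:.2f}")
print(f"Hindi f-score: {hindi:.2f}")
```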
Abstract: Surface metrology with image processing is a challenging task with wide applications in industry. Surface roughness can be evaluated using a texture classification approach. An important aspect here is the appropriate selection of features that characterize the surface. We propose an effective combination of features for multi-scale and multi-directional analysis of engineering surfaces. The features include standard deviation, kurtosis and the Canny edge detector. We apply the method by analyzing the surfaces with the Discrete Wavelet Transform (DWT) and the Dual-Tree Complex Wavelet Transform (DT-CWT). We use the Canberra distance metric for similarity comparison between the surface classes. Our database includes surface textures manufactured by three machining processes, namely Milling, Casting and Shaping. The comparative study shows that DT-CWT outperforms DWT, giving a correct classification rate of 91.27% with the Canberra distance metric.
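As an illustration of the similarity comparison, the Canberra distance between two feature vectors p and q is the sum over i of |p_i - q_i| / (|p_i| + |q_i|), with terms whose denominator is zero counted as zero. The following is a minimal sketch of nearest-class matching with this metric; the feature vectors and the helper name `nearest_class` are made up for illustration and are not taken from the paper.

```python
def canberra(p, q):
    """Canberra distance between two equal-length feature vectors."""
    if len(p) != len(q):
        raise ValueError("feature vectors must have the same length")
    total = 0.0
    for a, b in zip(p, q):
        denom = abs(a) + abs(b)
        if denom > 0:  # a 0/0 term contributes nothing
            total += abs(a - b) / denom
    return total


def nearest_class(query, class_features):
    """Return the class whose reference vector is closest to the query."""
    return min(class_features, key=lambda name: canberra(query, class_features[name]))


# Hypothetical reference vectors (e.g. standard deviation, kurtosis, edge density).
references = {
    "Milling": [0.80, 3.10, 0.25],
    "Casting": [0.35, 2.20, 0.60],
    "Shaping": [0.55, 4.00, 0.10],
}
print(nearest_class([0.78, 3.00, 0.27], references))
```

Because each term is normalised by the magnitude of the two values, features on very different scales contribute comparably, which is one reason this metric is often paired with heterogeneous texture features.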
Abstract: Biometrics methods include recognition techniques
such as fingerprint, iris, hand geometry, voice, face, ears and gait. The gait recognition approach has some advantages, for example it
does not need the prior concern of the observed subject and it can
record many biometric features in order to make deeper analysis, but
most research proposals incur a high computational cost. This
paper shows a gait recognition system with feature subtraction on a
bundle rectangle drawn over the observed person. Statistical results
within a database of 500 videos are shown.
Abstract: A frictionless contact problem for a two-layer orthotropic elastic medium loaded through a rigid flat stamp is considered. It is assumed that tensile tractions are not allowed and only compressive tractions can be transmitted across the interface. The effect of gravity is taken into consideration in the solution. If the external load on the rigid stamp is less than or equal to a critical value, continuous contact between the layers is maintained. The problem is expressed in terms of a singular integral equation by using the theory of elasticity and Fourier transforms. Numerical results for the initial separation point, the critical separation load and the contact stress distribution are presented.
Abstract: This paper investigates the activity of the
gastrocnemius (Gas) muscle in healthy subjects during salat (ruku'
position) and a specific exercise [Unilateral Plantar Flexion Exercise
(UPFE)] using electromyography (EMG). Both the lateral and medial
Gas muscles were assessed. A group of undergraduates aged between
19 and 25 years voluntarily participated in this study. The myoelectric
activity of the muscles was recorded and analyzed. The findings
indicated that the muscles contracted during both the salat
and the exercise, with almost the same EMG level.
The Wilcoxon rank sum test showed no significant difference between
ruku' and UPFE for either the medial (p=0.082) or the lateral (p=0.226)
Gas muscle. Therefore, salat may be useful as a strengthening
exercise and also in rehabilitation programs for lower limb activities.
Abstract: In this paper we propose a method for finding video
frames representing one sign in the finger alphabet. The method is
based on determining hand location, segmentation and the use of
standard video quality evaluation metrics. Metric calculation is
performed only in regions of interest. A sliding mechanism for finding
local extrema and an adaptive threshold based on local averaging are used
for key frame selection. The success rate is evaluated by recall,
precision and F1 measure. The effectiveness of the method is compared
with that of metrics applied to all frames. The proposed method is fast, effective
and relatively easy to realize by simple input video preprocessing
and subsequent use of tools designed for video quality measurement.
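The key frame selection step can be sketched roughly as follows. This is only one possible reading of "sliding mechanism" and "adaptive threshold based on local averaging": it assumes a per-frame score (for example, a difference metric between consecutive frames inside the region of interest) that is low while the hand holds a sign, and it keeps frames that are local minima of their window and fall below a multiple of the local mean. The function name, the `half_window` and `factor` parameters and the sample scores are all hypothetical.

```python
def select_key_frames(scores, half_window=2, factor=1.0):
    """Return indices of frames that are local minima and below the local mean."""
    keys = []
    n = len(scores)
    for i in range(n):
        lo = max(0, i - half_window)
        hi = min(n, i + half_window + 1)
        window = scores[lo:hi]                  # sliding window around frame i
        local_mean = sum(window) / len(window)  # adaptive threshold base
        is_local_min = scores[i] == min(window)
        if is_local_min and scores[i] < factor * local_mean:
            keys.append(i)
    return keys


# Made-up per-frame scores with two clear dips (frames 2 and 6).
frame_scores = [9, 8, 1, 7, 9, 8, 2, 9]
print(select_key_frames(frame_scores))
```

A completely flat score sequence yields no key frames under this rule, since no frame falls strictly below its local mean.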
Abstract: The MATCH project [1] entails the development of an
automatic diagnosis system that aims to support the treatment of colon
cancer by discovering mutations that occur in tumour
suppressor genes (TSGs) and contribute to the development of
cancerous tumours. The system is built on a)
colon cancer clinical data and b) biological information that will be
derived by data mining techniques from genomic and proteomic
sources. The core mining module will consist of popular, well
tested hybrid feature extraction methods and new combined
algorithms designed especially for the project. Elements of rough
sets, evolutionary computing, cluster analysis, self-organizing maps
and association rules will be used to discover the annotations
between genes, and their influence on tumours [2]-[11].
The methods used to process the data have to address their high
complexity, potential inconsistency and the problems of dealing with
missing values. They must integrate all the useful information
necessary to answer the expert's question. For this purpose, the system
has to learn from data, or allow a domain specialist to specify
interactively, the part of the knowledge structure it needs to answer a
given query. The program should also take into account the
importance/rank of the particular parts of the data it analyses, and adjust
the algorithms used accordingly.
Abstract: A methanolic extract from seeds of tamarind
(Tamarindus indica) was prepared by Soxhlet extraction
and evaluated for total phenolic content by the Folin-Ciocalteu method.
The methanolic extract was then screened in vitro for biological activities:
anti-melanogenic activity by the tyrosinase inhibition test, anti-inflammatory
activity by the cyclooxygenase 1 (COX-1) and
cyclooxygenase 2 (COX-2) inhibition tests, and cytotoxicity by a screening
test with Vero cells. The results showed that the total phenolic content
of the extract was 27.72 mg of gallic acid
equivalent per g of dry weight. The tyrosinase inhibition
exerted by the tamarind seed extract (1 mg/ml) was
52.13 ± 0.42 %. The extract had no inhibitory effect on
COX-1 and COX-2 enzymes and no cytotoxic effect on Vero cells. The
findings indicate that the tested seed extract possesses
anti-melanogenic activity without toxic effects. However, it did
not exhibit anti-inflammatory activity. Further studies should include the
use of advanced biological models to confirm this biological activity,
as well as the isolation and characterization of the purified
compounds that the extract contains.
Abstract: Human identification at a distance has recently gained
growing interest from computer vision researchers. Gait recognition
aims essentially to address this problem by identifying people based
on the way they walk [1]. Gait recognition has 3 steps. The first step
is preprocessing, the second step is feature extraction and the third
one is classification. This paper focuses on the classification step that
is essential to increase the CCR (Correct Classification Rate).
Multilayer Perceptron (MLP) is used in this work. Neural Networks
imitate the human brain to perform intelligent tasks [3]. They can
represent complicated relationships between input and output and
acquire knowledge about these relationships directly from the data
[2]. In this paper we apply MLP NN for 11 views in our database and
compare the CCR values for these views. Experiments are performed
with the NLPR databases, and the effectiveness of the proposed
method for gait recognition is demonstrated.
Abstract: The purpose of this paper is to solve the problem of protecting aerial lines from high impedance faults (HIFs) in distribution systems. This investigation successfully applies the 3I0 zero sequence current to solve HIF problems. A feature extraction system based on the discrete wavelet transform (DWT) and a feature identification technique founded on statistical confidence are then applied to discriminate effectively between HIFs and switch operations. Pattern recognition of HIFs based on the continuous wavelet transform (CWT) is also proposed. Staged fault testing results demonstrate that the proposed wavelet based algorithm is feasible and performs well.
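For context, the zero sequence current I0 is one third of the phasor sum of the three phase currents, so 3I0 = Ia + Ib + Ic. It vanishes for a balanced system and becomes non-zero when current leaks to ground. The sketch below illustrates only this relation with made-up phasor values; it is not the paper's detection algorithm, and the helper names are illustrative.

```python
import cmath
import math


def phasor(magnitude, angle_deg):
    """Build a complex phasor from magnitude and angle in degrees."""
    return cmath.rect(magnitude, math.radians(angle_deg))


def residual_current(ia, ib, ic):
    """Return 3I0, the phasor sum of the three phase currents."""
    return ia + ib + ic


# Balanced three-phase currents: the residual is numerically zero.
balanced = residual_current(phasor(100, 0), phasor(100, -120), phasor(100, 120))

# Hypothetical small ground leakage on phase A: the residual becomes visible.
faulted = residual_current(phasor(105, 0), phasor(100, -120), phasor(100, 120))

print(f"|3I0| balanced: {abs(balanced):.6f} A")
print(f"|3I0| faulted:  {abs(faulted):.6f} A")
```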
Abstract: The extraction of meaningful information from images
could be an alternative method for time series analysis. In this paper,
we propose a graphical analysis of time series grouped into a table
with an adjusted colour scale for numerical values. The advantages of
this method are also discussed. The proposed method is easy to
understand and is flexible enough to accommodate the standard methods of
pattern recognition and verification, especially for noisy
environmental data.
Abstract: The proposed system identifies the species of a wood
using the textural features present in its bark. Each species of wood
has its own unique patterns in its bark, which enables the proposed
system to identify it accurately. Automatic wood recognition
has not yet been well established, mainly due to a lack of research in this
area and the difficulty of obtaining a wood database. In our work, a
wood recognition system has been designed based on pre-processing
techniques, feature extraction and correlation of the features of the
wood species for their classification. Texture classification is a problem
that has been studied and tested using different methods due to its
valuable usage in various pattern recognition problems, such as wood
recognition and rock classification. The most popular technique used
for textural classification is the Gray-level Co-occurrence Matrix
(GLCM). The features extracted from the enhanced images
using the GLCM are correlated, which determines the classification
of the various wood species. The results obtained show a
high rate of recognition accuracy, indicating that the techniques used are
suitable for implementation for commercial purposes.
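To make the GLCM step concrete, the sketch below builds a normalised co-occurrence matrix for one pixel offset and derives two common texture features from it (contrast and energy). It is a minimal pure-Python illustration on a tiny made-up image, assuming grey levels already quantised to integers in the range 0 to levels-1; the abstract does not say which offsets or features the authors used.

```python
def glcm(image, levels, dx=1, dy=0):
    """Normalised grey-level co-occurrence matrix for the offset (dx, dy)."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
    total = sum(sum(row) for row in counts)
    if total == 0:
        raise ValueError("offset leaves no pixel pairs inside the image")
    return [[v / total for v in row] for row in counts]


def contrast(p):
    """Sum of p[i][j] * (i - j)^2: large for abrupt grey-level changes."""
    return sum(p[i][j] * (i - j) ** 2 for i in range(len(p)) for j in range(len(p)))


def energy(p):
    """Sum of squared probabilities: large for uniform textures."""
    return sum(v * v for row in p for v in row)


# Tiny made-up 4x4 image with four grey levels.
patch = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
matrix = glcm(patch, levels=4)  # horizontal neighbour, distance 1
print(f"contrast = {contrast(matrix):.4f}, energy = {energy(matrix):.4f}")
```

Feature vectors computed this way for several offsets could then be compared between a query bark image and reference species, for example by correlation as the abstract describes.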
Abstract: The effect of wheat flour extraction rates on flour
composition, farinographic characteristics and the quality of
sourdough naans was investigated. The results indicated that by
increasing the extraction rate, the amount of protein, fiber, fat and
ash increased, whereas moisture content decreased. Farinographic
characteristics such as water absorption and dough development time
increased with an increase in flour extraction rate, but the dough
stabilities and tolerance indices were reduced with an increase in
flour extraction rate. Titratable acidity for both sourdough and
sourdough naans also increased along with flour extraction rate. The
study showed that the overall quality of sourdough naans was affected
by both the flour extraction rate and the starter culture used. Sensory
analysis of sourdough naans revealed that the desirable extraction rate
for sourdough naan was 76%.
Abstract: Groundwater has become the most dependable source
of fresh water for agriculture, domestic and industrial uses in the past
few decades. This wide use of groundwater, if left uncontrolled and
unmonitored, will lead to overexploitation, causing sea water intrusion in the
coastal areas and illegal water marketing. Several Policies and Acts
have been enacted to regulate and manage the use of this valuable
resource. In spite of this, the over-extraction of groundwater beyond
the recharging capacity of aquifers and the deterioration in the quality of
groundwater are continuing. The current study aims at reviewing the
Acts and Policies existing in the State of Tamil Nadu and at the
national level regarding groundwater regulation and management.
Further, the rights associated with the usage of
groundwater resources and the gaps in these policies are
analyzed. Some suggestions are made to reform the existing
groundwater policies for better management and regulation of the
resource.
Abstract: This paper describes an optimal approach for feature
subset selection to classify leaves based on a Genetic Algorithm
(GA) and Kernel-based Principal Component Analysis (KPCA). Due
to the high complexity of selecting the optimal features,
classification has become a critical task in analysing leaf image
data. Initially the shape, texture and colour features are extracted
from the leaf images. These extracted features are optimized through
the separate functioning of GA and KPCA. The approach then performs
an intersection operation over the subsets obtained from the
optimization process. Finally, the most common matching subset is
forwarded to train the Support Vector Machine (SVM). Our
experimental results show that the application of GA
and KPCA for feature subset selection using SVM as a classifier is
computationally effective and improves the accuracy of the classifier.
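The intersection step can be illustrated with a few lines of code. The sketch assumes that each optimizer returns a subset of named features; the feature names and the two subsets are invented for illustration, and how the authors map KPCA output back to original features is not stated in the abstract.

```python
def common_features(ga_subset, kpca_subset):
    """Features selected by both optimizers, in a stable sorted order."""
    return sorted(set(ga_subset) & set(kpca_subset))


# Hypothetical subsets chosen independently by GA and KPCA.
ga_selected = ["aspect_ratio", "eccentricity", "mean_hue", "glcm_contrast"]
kpca_selected = ["eccentricity", "glcm_contrast", "solidity"]

shared = common_features(ga_selected, kpca_selected)
print(shared)  # this subset would then be used to train the classifier
```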
Abstract: Electromyography (EMG) signal processing has been investigated extensively for various applications such as rehabilitation systems. Specifically, the wavelet transform has served as a powerful technique to scrutinize EMG signals since it is consistent with the nature of EMG as a non-stationary signal. In this paper, the efficiency of the wavelet transform in surface EMG feature extraction is investigated over four levels of wavelet decomposition, and a comparative study between different mother wavelets is carried out. To identify the best function and level of wavelet analysis, two evaluation criteria, the scatter plot and the RES index, are used. Four wavelet families, namely Daubechies, Coiflets, Symlets and Biorthogonal, are studied in the wavelet decomposition stage. The results show that only features from the first and second levels of wavelet decomposition yield good performance, and that some functions of various wavelet families can improve the class separability of different hand movements.
Abstract: In this paper, a new approach for target recognition based on the Empirical mode decomposition (EMD) algorithm of Huang et al. [11] and the energy tracking operator of Teager [13]-[14] is introduced. The conjunction of these two methods is called Teager-Huang analysis. This approach is well suited for nonstationary signal analysis. The impulse response (IR) of a target is first band-pass filtered into subsignals (components) called Intrinsic mode functions (IMFs) with well defined Instantaneous frequency (IF) and Instantaneous amplitude (IA). Each IMF is a zero-mean AM-FM component. In the second step, the energy of each IMF is tracked using the Teager energy operator (TEO). IF and IA, useful to describe the time-varying characteristics of the signal, are estimated using the Energy separation algorithm (ESA) of Maragos et al. [16]-[17]. In the third step, a set of features such as skewness and kurtosis is extracted from the IF, IA and IMF energy functions. The Teager-Huang analysis is tested on a set of synthetic IRs of sonar targets with different physical characteristics (density, velocity, shape, ...). PCA is first applied to the features to discriminate between manufactured and natural targets. The manufactured patterns are classified into spheres and cylinders. One hundred percent correct recognition is achieved with twenty-three echoes, where sixteen IRs, used for training, are noise-free and seven IRs, used for the testing phase, are corrupted with white Gaussian noise.
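The energy tracking step relies on the discrete Teager energy operator, commonly written as Ψ[x(n)] = x(n)² − x(n−1)·x(n+1). For a pure tone x(n) = A·cos(Ωn + φ) it returns the constant A²·sin²(Ω), which is why it can track amplitude and frequency jointly. The sketch below demonstrates only this property on a synthetic tone with made-up parameters; it does not implement EMD or the energy separation algorithm.

```python
import math


def teager_energy(x):
    """Discrete Teager energy operator, defined for interior samples only."""
    return [x[n] ** 2 - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]


# Synthetic tone with made-up amplitude, digital frequency and phase.
amplitude, omega, phase = 2.0, 0.3, 0.5
tone = [amplitude * math.cos(omega * n + phase) for n in range(50)]

energy = teager_energy(tone)
expected = amplitude ** 2 * math.sin(omega) ** 2
print(f"TEO output ~ {energy[0]:.6f}, A^2 sin^2(omega) = {expected:.6f}")
```

The output has two samples fewer than the input because the operator needs one neighbour on each side.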
Abstract: In this work, we are interested in developing a speech denoising tool using the discrete wavelet packet transform (DWPT). This speech denoising tool will be employed for applications of recognition, coding and synthesis. For noise reduction, instead of applying the classical thresholding technique, some wavelet packet nodes are set to zero and the others are thresholded. To estimate the non-stationary noise level, we employ the spectral entropy. Our proposed technique is compared with classical denoising methods based on thresholding and spectral subtraction in order to evaluate our approach. The experimental implementation uses speech signals corrupted by two sorts of noise: white noise and Volvo noise. The results obtained from listening tests show that our proposed technique is better than spectral subtraction. The results obtained from SNR computation show the superiority of our technique over the classical thresholding method using the modified hard thresholding function based on the μ-law algorithm.
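For reference, the classical hard and soft thresholding rules and an output SNR measure can be written in a few lines. This sketch shows only the textbook forms applied to a made-up coefficient list; it does not reproduce the node-zeroing scheme, the spectral entropy noise estimate or the μ-law based modified threshold mentioned in the abstract, and the function names are illustrative.

```python
import math


def hard_threshold(coeffs, t):
    """Keep coefficients whose magnitude exceeds t, zero the rest."""
    return [c if abs(c) > t else 0.0 for c in coeffs]


def soft_threshold(coeffs, t):
    """Zero small coefficients and shrink the others towards zero by t."""
    return [math.copysign(max(abs(c) - t, 0.0), c) for c in coeffs]


def snr_db(clean, estimate):
    """Signal-to-noise ratio in dB of an estimate against the clean signal."""
    signal_power = sum(s * s for s in clean)
    noise_power = sum((s - e) ** 2 for s, e in zip(clean, estimate))
    if noise_power == 0:
        return math.inf
    return 10 * math.log10(signal_power / noise_power)


coefficients = [0.5, -2.0, 3.0]  # made-up wavelet coefficients
print(hard_threshold(coefficients, 1.0))
print(soft_threshold(coefficients, 1.0))
print(f"{snr_db([1.0, 1.0], [1.0, 0.9]):.2f} dB")
```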
Abstract: This paper proposes a new facial feature extraction approach, the Walsh-Hadamard Transform (WHT). This approach is based on the correlation between local pixels of the face image. Its primary advantage is the simplicity of its computation. The paper compares the proposed approach, WHT, which has traditionally been used in data compression, with two other known approaches, Principal Component Analysis (PCA) and the Discrete Cosine Transform (DCT), using the face database of the Olivetti Research Laboratory (ORL). In spite of its simple computation, the proposed algorithm (WHT) gave results very close to those obtained by PCA and DCT. This paper initiates research into WHT and the family of frequency transforms and examines their suitability for feature extraction in face recognition applications.
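The computational simplicity mentioned above comes from the fact that the WHT needs only additions and subtractions. The following is a minimal sketch of the standard fast Walsh-Hadamard transform on a one-dimensional sequence, unnormalised and in natural (Hadamard) ordering; applying it to the rows and then the columns of an image block would be one way to obtain a 2D transform, but how the paper builds its face features is not described in the abstract.

```python
def fwht(values):
    """Unnormalised fast Walsh-Hadamard transform (natural ordering)."""
    a = list(values)
    n = len(a)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    h = 1
    while h < n:
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                x, y = a[i], a[i + h]
                a[i], a[i + h] = x + y, x - y  # butterfly: sum and difference
        h *= 2
    return a


sample = [1, 0, 1, 0, 0, 1, 1, 0]
spectrum = fwht(sample)
print(spectrum)

# The transform is its own inverse up to a factor of n.
restored = [v // len(sample) for v in fwht(spectrum)]
print(restored)
```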
Abstract: The CMLP building was developed to be a model for
sustainability with strategies to reduce water use, energy use and pollution,
and to provide a healthy environment for the building occupants. The
aim of this paper is to investigate the environmental effects of the energy
used by this building. An LCA (life cycle analysis) was conducted to measure
the real environmental effects produced by the use of energy. The
impact categories most affected by the energy use were found to be
human health effects and ecotoxicity. Natural gas
extraction, uranium milling for nuclear energy production, and
blasting for mining and infrastructure construction are the processes
contributing the most to emissions in the human health effects category. Data
comparing LCA results of the CMLP building with those of a conventional
building showed that the energy used by the CMLP building causes
less damage to the environment and human health than that of a
conventional building.