Abstract: This paper discusses a method of modeling text for Polish.
The method transforms continuous input text into a text consisting of
sentences in so-called canonical form, characterized by, among other
things, a complete structure and the absence of anaphora and ellipsis.
The transformation is lossless with respect to the content of the text
being transformed. The modeling method was developed for the Thetos
system, which translates written Polish texts into Polish Sign
Language. We believe that the method can also be used in other
applications that deal with natural language, e.g. in a text summary
generator for Polish.
Abstract: Automatic extraction of event information from social text
streams (emails, social network sites, blogs, etc.) is a vital
requirement for many applications, such as event planning and
management systems and security applications. The key information
components needed from event-related text are the event title,
location, participants, date and time. Emails differ from other social
text streams in layout, format and conversation style, and they are
the most commonly used communication channel for broadcasting and
planning events.
We have therefore chosen emails as our dataset. In our work, we
have employed two statistical NLP methods, namely Finite State
Machines (FSM) and Hidden Markov Models (HMM), for the
extraction of event-related contextual information. An application
has been developed that compares the two methods on the event
extraction task. It comprises two modules, one for each method, and
works with both bulk and direct user input. The results are evaluated
using precision, recall and F-score. Experiments show that both
methods achieve high performance and accuracy; HMM performed well on
title extraction, whereas FSM proved to be better for venue, date and
time.
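The abstract does not describe the states or transitions of the FSM module. As a minimal sketch of the general idea only, the following hand-written finite state machine pulls `HH:MM` style time expressions out of email text. The state names, the transition table and the function `extract_times` are illustrative assumptions, not the authors' design, and no range checking of hours or minutes is attempted.

```python
def char_class(ch):
    """Reduce a character to the input symbol the machine reacts to."""
    if ch.isdigit():
        return "digit"
    if ch == ":":
        return "colon"
    return "other"


# Hypothetical transition table: (state, input symbol) -> next state.
# Any pair that is missing from the table means the machine rejects.
TRANSITIONS = {
    ("START", "digit"): "HOUR1",
    ("HOUR1", "digit"): "HOUR2",
    ("HOUR1", "colon"): "COLON",
    ("HOUR2", "colon"): "COLON",
    ("COLON", "digit"): "MIN1",
    ("MIN1", "digit"): "MIN2",
}
ACCEPT = "MIN2"


def extract_times(text):
    """Return every substring of the form H:MM or HH:MM found in text."""
    matches = []
    i = 0
    while i < len(text):
        state = "START"
        j = i
        # Run the machine from position i until it accepts or rejects.
        while j < len(text) and state != ACCEPT:
            state = TRANSITIONS.get((state, char_class(text[j])))
            if state is None:
                break
            j += 1
        if state == ACCEPT:
            matches.append(text[i:j])
            i = j  # continue scanning after the match
        else:
            i += 1  # no match starting here, try the next position
    return matches


print(extract_times("Meeting at 10:30 in Room 5, ends 12:00."))
```

A table-driven machine keeps the grammar of the pattern in data rather than in control flow, so adding a field such as a date would mean adding states and transitions rather than rewriting the scanning loop.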
Abstract: A focus on environmental issues, including the reduction of scrap and consumer residuals, together with the recovery of economic value over the life cycle of products, gives companies an important competitive advantage. The aim of this paper is to present a new mixed nonlinear facility location-allocation model for recycling collection networks that considers multiple echelons, multiple suppliers, multiple collection centers and multiple facilities. To support realistic decision making, demands, returns, capacities, costs and distances are treated as uncertain in our model. For this purpose, a fuzzy mathematical programming-based possibilistic approach from the recent literature is adopted as the solution methodology for the proposed mixed nonlinear programming (MNLP) model. Computational experiments are provided to illustrate the applicability of the designed model in a supply chain environment and to help decision makers with their analysis.
Abstract: Named Entity Recognition (NER) aims to classify each word of a document into predefined target named entity classes and is nowadays considered fundamental for many Natural Language Processing (NLP) tasks such as information retrieval, machine translation, information extraction and question answering. This paper reports on the development of a NER system for Bengali and Hindi using a Support Vector Machine (SVM). Though this state-of-the-art machine learning technique has been widely applied to NER in several well-studied languages, its application to Indian languages (ILs) is very new. The system makes use of the contextual information of the words along with a variety of features that are helpful in predicting the four named entity (NE) classes, namely Person name, Location name, Organization name and Miscellaneous name. We have used annotated corpora of 122,467 tokens of Bengali and 502,974 tokens of Hindi tagged with the twelve different NE classes defined as part of the IJCNLP-08 NER Shared Task for South and South East Asian Languages (SSEAL). In addition, we have manually annotated 150K wordforms of a Bengali news corpus, developed from the web archive of a leading Bengali newspaper. We have also developed an unsupervised algorithm to generate lexical context patterns from a part of the unlabeled Bengali news corpus. The lexical patterns have been used as features of the SVM in order to improve the system performance. The NER system has been tested with gold standard test sets of 35K and 60K tokens for Bengali and Hindi, respectively. Evaluation results have demonstrated recall, precision and F-score values of 88.61%, 80.12% and 84.15%, respectively, for Bengali and 80.23%, 74.34% and 77.17%, respectively, for Hindi. Results show an improvement in the F-score of 5.13% with the use of context patterns.
A statistical analysis (ANOVA) is also performed to compare the performance of the proposed NER system with that of the existing HMM-based system for both languages.
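The reported F-scores are consistent with the balanced F-measure, the harmonic mean of precision and recall. The abstract does not state which F-measure variant was used, so the formula below is an assumption; the short check recomputes both figures from the reported precision and recall values. The function name `f_score` is illustrative.

```python
def f_score(precision, recall):
    """Balanced F-measure: the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall percentages as reported in the abstract.
bengali = f_score(precision=80.12, recall=88.61)
hindi = f_score(precision=74.34, recall=80.23)

print(round(bengali, 2), round(hindi, 2))  # 84.15 77.17
```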
Abstract: Adsorption of Toluidine blue dye from aqueous solutions onto Neem Leaf Powder (NLP) has been investigated. The surface of this natural material was characterized by particle size analysis, Scanning Electron Microscopy (SEM), Fourier Transform Infrared (FTIR) spectroscopy and X-Ray Diffraction (XRD). The effects of process parameters such as initial concentration, pH, temperature and contact duration on the adsorption capacity have been evaluated; pH was found to be the most influential parameter. The equilibrium data were analyzed using the Langmuir and Freundlich isotherm models. Kinetic models, namely the pseudo-first-order model, the pseudo-second-order model and the Elovich equation, were used to describe the kinetic data. The experimental data were well fitted by the Langmuir adsorption isotherm model and the pseudo-second-order kinetic model. The thermodynamic parameters, namely the free energy of adsorption (ΔG°), enthalpy change (ΔH°) and entropy change (ΔS°), were also determined and evaluated.
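For reference, the models named above are commonly written in the following textbook forms. The abstract gives no equations, so the notation is an assumption: q_e and q_t are the amounts adsorbed at equilibrium and at time t, C_e is the equilibrium concentration, q_m is the monolayer capacity, and K_L, K_F, n, k_1, k_2, alpha and beta are model constants.

```latex
% Equilibrium isotherms
q_e = \frac{q_m K_L C_e}{1 + K_L C_e}            % Langmuir
q_e = K_F \, C_e^{1/n}                           % Freundlich

% Kinetic models (linearized forms)
\ln(q_e - q_t) = \ln q_e - k_1 t                 % pseudo-first-order
\frac{t}{q_t} = \frac{1}{k_2 q_e^{2}} + \frac{t}{q_e}   % pseudo-second-order
q_t = \frac{1}{\beta}\ln(\alpha\beta) + \frac{1}{\beta}\ln t   % Elovich

% Thermodynamic relation linking the three reported parameters
\Delta G^{\circ} = \Delta H^{\circ} - T \, \Delta S^{\circ}
```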
Abstract: Human identification at a distance has recently gained
growing interest from computer vision researchers. Gait recognition
aims essentially to address this problem by identifying people based
on the way they walk [1]. Gait recognition has three steps: the first
is preprocessing, the second is feature extraction and the third is
classification. This paper focuses on the classification step, which
is essential for increasing the CCR (Correct Classification Rate).
A Multilayer Perceptron (MLP) is used in this work. Neural networks
imitate the human brain to perform intelligent tasks [3]. They can
represent complicated relationships between input and output and
acquire knowledge about these relationships directly from the data
[2]. In this paper we apply an MLP neural network to 11 views in our
database and compare the CCR values for these views. Experiments are performed
with the NLPR databases, and the effectiveness of the proposed
method for gait recognition is demonstrated.
Abstract: XML files contain data in a well-structured format. Studying the format, or the semantics of the grammar, helps in retrieving the data quickly. Many algorithms have been described for searching data in XML files. A number of approaches rely on the document structure, while others are tied to the document content. In the former case the user must know the structure of the document, whereas information retrieval techniques based on NLP depend on the content of the document. As a result, the results may be irrelevant or incomplete, and the search may take more time. This paper presents fast XML retrieval techniques that use a new indexing technique and the concept of RXML. When indexing an XML document, the system takes into account both the document content and the document structure and assigns a value to each tag in the file. When querying the system, the user is not constrained to a fixed query format.
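The abstract does not specify the indexing scheme or what RXML assigns to each tag. Purely to illustrate what indexing on both content and structure can mean, the sketch below maps each word to the tag paths under which it occurs, using Python's standard `xml.etree.ElementTree` parser. The function `build_index`, the path format and the sample document are hypothetical and are not the paper's technique.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def build_index(xml_text):
    """Map each lowercase word to the set of tag paths whose text contains it."""
    root = ET.fromstring(xml_text)
    index = defaultdict(set)

    def walk(element, path):
        current = f"{path}/{element.tag}"
        # Only the text directly inside the element is indexed here;
        # tail text after child elements is ignored to keep the sketch short.
        for word in (element.text or "").split():
            index[word.lower()].add(current)
        for child in element:
            walk(child, current)

    walk(root, "")
    return index


# Hypothetical sample document.
doc = (
    "<library><book><title>XML Retrieval</title>"
    "<author>Ada</author></book></library>"
)
index = build_index(doc)
print(sorted(index["xml"]))  # ['/library/book/title']
```

With such an index a query can be answered by word alone, or narrowed by path when the user happens to know the structure, so knowledge of the document layout becomes optional rather than required.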
Abstract: In this paper, we propose a multiple-objective optimization model for the portfolio selection problem, aimed at investors looking to diversify their equity investments across a number of equity markets. Based on Markowitz's M-V model, we developed a Fuzzy Mixed Integer Multi-Objective Nonlinear Programming Problem (FMIMONLP) to maximize investors' future gains in equity markets and to determine the optimal proportion of the budget to be invested in different equities. A numerical example with a comprehensive analysis of artificial data from several equity markets is presented to illustrate the proposed model and its solution method. The model performed well compared with its deterministic version.
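As background for the Markowitz mean-variance (M-V) model on which the proposal builds, the sketch below evaluates the two classical quantities, expected portfolio return and portfolio variance, for a given allocation. It covers only the deterministic M-V quantities and none of the fuzzy, integer or multi-objective extensions of the paper. The function name and all numbers are made up for illustration and do not come from the paper.

```python
def portfolio_mean_variance(weights, expected_returns, covariance):
    """Return (expected return, variance) of a portfolio under the M-V model.

    weights          -- fraction of the budget placed in each asset
    expected_returns -- expected return of each asset
    covariance       -- covariance matrix of asset returns (list of rows)
    """
    n = len(weights)
    mean = sum(w * r for w, r in zip(weights, expected_returns))
    variance = sum(
        weights[i] * covariance[i][j] * weights[j]
        for i in range(n)
        for j in range(n)
    )
    return mean, variance


# Illustrative two-asset example with invented figures.
weights = [0.5, 0.5]
expected_returns = [0.10, 0.06]
covariance = [
    [0.04, 0.01],
    [0.01, 0.02],
]
mean, variance = portfolio_mean_variance(weights, expected_returns, covariance)
print(mean, variance)
```

In the classical formulation an investor trades these two quantities against each other, for example by minimizing the variance subject to a minimum expected return.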