Abstract: Personal name matching is a core task in national citizen
databases, text and web mining, information retrieval, online library
systems, e-commerce, and record linkage systems, and it has motivated
extensive research in the area. Traditional name matching methods are
suited to English and other Latin-based languages. Asian languages
without explicit word boundaries, such as Myanmar, still require a
sound-alike matching system for Unicode-based applications. We
therefore propose a matching algorithm that derives sound-alike
(phonetic) patterns suited to Myanmar character spelling. Given the
nature of Myanmar characters, we address word boundary fragmentation
and character collation. A pattern conversion algorithm builds word
patterns from the fragmented and collated text, and we define Myanmar
sound-alike phonetic groups to support phonetic matching. The
experimental results show a fragmentation accuracy of 99.32% and a
processing time of 1.72 ms.
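The phonetic-group step mentioned in the abstract above can be illustrated with a minimal sketch. It is not the authors' algorithm: the groups in `PHONETIC_GROUPS` are hypothetical, Latin letters stand in for Myanmar characters, and the fragmentation and collation steps are left out. It only shows how replacing each character with a group code lets two differently spelled names compare as equal.

```python
# Hypothetical sound-alike groups; Latin letters stand in for Myanmar characters.
PHONETIC_GROUPS = [
    {"b", "p"},
    {"d", "t"},
    {"g", "k"},
    {"s", "z"},
]

# Map every grouped character to the index of its group (as a string code).
CHAR_TO_GROUP = {
    ch: str(index)
    for index, group in enumerate(PHONETIC_GROUPS)
    for ch in group
}


def to_pattern(name: str) -> str:
    """Replace each grouped character with its group code; keep the rest."""
    return "".join(CHAR_TO_GROUP.get(ch, ch) for ch in name.lower())


def sounds_alike(a: str, b: str) -> bool:
    """Two names match when their phonetic patterns are identical."""
    return to_pattern(a) == to_pattern(b)


if __name__ == "__main__":
    print(to_pattern("bita"), to_pattern("pida"))
    print(sounds_alike("bita", "pida"))
```

A real system would build the lookup table from the Myanmar phonetic groups defined in the paper and would run it on syllables produced by the fragmentation step.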
Abstract: This article presents a fuzzy-logic-based method for holding elections among the members of a group. Linguistic variables are the objects of decision on the election cards, and inference is based on t-norms and s-norms. In this election method, the election cards are questionnaires, each comprising several questions with several choices. The choices are words from natural language. The method uses center of gravity (COG) defuzzification and is implemented as a computer program in MATLAB. Finally, the method is illustrated with two examples: choosing a head for the members of a research group and choosing a representative for students.
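The combination of linguistic choices, t-norms, s-norms, and COG defuzzification described above can be sketched as follows. This is an illustrative Python sketch, not the paper's MATLAB program: the terms in `TERMS`, their triangular membership functions, the 0 to 10 scale, and the use of vote shares as firing strengths are all assumptions. The minimum serves as the t-norm and the maximum as the s-norm.

```python
# Hypothetical linguistic terms on a 0-10 scale: (left foot, peak, right foot).
TERMS = {"poor": (-5, 0, 5), "fair": (0, 5, 10), "good": (5, 10, 15)}
GRID = [i / 10 for i in range(101)]  # discretized universe 0.0 .. 10.0


def triangle(x, a, b, c):
    """Triangular membership function with feet a and c and peak b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)


def cog_score(answers):
    """Turn one candidate's list of linguistic answers into a crisp score."""
    if not answers or any(a not in TERMS for a in answers):
        raise ValueError("answers must be a non-empty list of known terms")
    # Firing strength of each term: the share of voters who chose it.
    strength = {t: answers.count(t) / len(answers) for t in TERMS}
    # Clip each term with min (t-norm), then combine with max (s-norm).
    mu = [
        max(min(strength[t], triangle(x, *TERMS[t])) for t in TERMS)
        for x in GRID
    ]
    # Center of gravity of the aggregated fuzzy set.
    return sum(x * m for x, m in zip(GRID, mu)) / sum(mu)


def elect(ballots):
    """Return the candidate with the highest defuzzified score."""
    return max(ballots, key=lambda name: cog_score(ballots[name]))


if __name__ == "__main__":
    ballots = {"A": ["good", "good", "fair"], "B": ["poor", "fair", "fair"]}
    print(elect(ballots))
```

Other t-norm and s-norm pairs (for example product and probabilistic sum) could be swapped in at the aggregation line without changing the rest of the sketch.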
Abstract: Databases have become ubiquitous. Almost all IT applications store information in and retrieve information from databases. Retrieving information from a database requires knowledge of technical languages such as Structured Query Language (SQL). However, the majority of users who interact with databases do not have a technical background and are intimidated by the idea of using languages such as SQL. This has led to the development of a few Natural Language Database Interfaces (NLDBIs). An NLDBI allows the user to query the database in a natural language. This paper presents the architecture of a new NLDBI system, describes its implementation, and discusses the results obtained. In most typical NLDBI systems, the natural language statement is converted into an internal representation based on syntactic and semantic knowledge of the natural language. This representation is then converted into queries using a representation converter. A natural language query is translated into an equivalent SQL query after passing through various processing stages. The approach has been tested on primitive database queries with certain constraints.
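The pipeline described above (natural language statement, internal representation, SQL query) can be illustrated with a small pattern-based sketch. It is an assumption-laden toy, not the system from the paper: the lexicons `TABLES`, `COLUMNS`, and `OPERATORS`, the sentence pattern, and the function `to_sql` are all hypothetical, and only one primitive query shape is handled.

```python
import re

# Hypothetical lexicons mapping natural-language words to schema names.
TABLES = {"employees": "employee", "students": "student"}
COLUMNS = {"name": "name", "names": "name", "salary": "salary", "age": "age"}
OPERATORS = {"greater than": ">", "less than": "<", "equal to": "="}

# One primitive sentence shape: "<verb> [the] <column> of [all] <table>
# [whose|with <column> [is] <operator> <number>]".
PATTERN = re.compile(
    r"^(?:show|list|give)\s+(?:the\s+)?(?P<col>\w+)\s+of\s+(?:all\s+)?(?P<table>\w+)"
    r"(?:\s+(?:whose|with)\s+(?P<wcol>\w+)\s+(?:is\s+)?"
    r"(?P<op>greater than|less than|equal to)\s+(?P<val>\d+))?\s*$",
    re.IGNORECASE,
)


def lookup(lexicon, word, kind):
    """Map a word to a schema name, or fail with a readable error."""
    try:
        return lexicon[word.lower()]
    except KeyError:
        raise ValueError(f"unknown {kind}: {word!r}") from None


def to_sql(question):
    """Translate a primitive English question into a SQL string."""
    match = PATTERN.match(question.strip())
    if match is None:
        raise ValueError("unsupported question")
    # Internal representation: the named groups of the match.
    sql = "SELECT {} FROM {}".format(
        lookup(COLUMNS, match["col"], "column"),
        lookup(TABLES, match["table"], "table"),
    )
    if match["wcol"] is not None:
        # The value is digits only (enforced by the pattern), so it is
        # safe to place directly in the query text.
        sql += " WHERE {} {} {}".format(
            lookup(COLUMNS, match["wcol"], "column"),
            OPERATORS[match["op"].lower()],
            match["val"],
        )
    return sql


if __name__ == "__main__":
    print(to_sql("Show the names of all employees whose salary is greater than 5000"))
```

A fuller interface would replace the single regular expression with syntactic and semantic analysis, as the abstract describes, and would use parameterized queries for non-numeric values.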
Abstract: The localization of software products is essential for reaching users in the international market. An important part of this task is translating the user interface into local languages. Because graphical interfaces are usually optimized for the size of the texts in the original language, certain user controls (e.g. text labels and buttons in dialogs) may grow after translation to the point where they overlap each other. This not only looks unpleasant but also makes the program more difficult (or even impossible) to use, so the arrangement of the controls must be corrected afterwards. The correction should preserve the original structure of the interface (e.g. the relation of logically coherent controls); furthermore, it is important to keep the design well proportioned: the formation of large empty areas should be avoided. This paper describes an algorithm that automatically rearranges the controls of a graphical user interface according to these principles. The algorithm has been implemented and integrated into a translation support system, and it produced visually pleasing results in most test cases.
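The overlap problem described above can be illustrated in one dimension. The sketch below is not the paper's algorithm, which is not specified in the abstract: the `Control` type, the `resolve_row` function, and the `min_gap` parameter are hypothetical. It shifts controls in a single row to the right only as far as needed, which keeps their left-to-right order and avoids opening large empty areas.

```python
from dataclasses import dataclass


@dataclass
class Control:
    name: str
    x: int       # left edge in pixels
    width: int   # width after translation


def resolve_row(controls, min_gap=4):
    """Shift controls rightwards so that none overlaps its left neighbour.

    Controls that do not overlap keep their original position.
    """
    placed = []
    right_edge = None  # right edge of the previously placed control
    for control in sorted(controls, key=lambda c: c.x):
        if right_edge is None:
            x = control.x
        else:
            x = max(control.x, right_edge + min_gap)
        placed.append(Control(control.name, x, control.width))
        right_edge = x + control.width
    return placed


if __name__ == "__main__":
    # The label grew from 40 to 80 pixels after translation and now
    # overlaps the button that starts at x=60.
    row = [Control("label", 10, 80), Control("button", 60, 40)]
    for control in resolve_row(row):
        print(control)
```

A two-dimensional version would also have to handle vertical neighbours, container resizing, and groups of logically related controls, which the abstract names as constraints.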
Abstract: Information retrieval aims to study models and build
systems that allow a user to find the documents relevant to the
user's information need. Information search remains a difficult
problem because natural language is hard to represent and process,
for example owing to polysemy. Intentional structures promise to be a
new paradigm for extending existing document structures and enhancing
the different phases of document processing, such as creation,
editing, search, and retrieval. Recognizing the intentions of the
authors of texts can reduce the scale of this problem. In this
article, we present an intention recognition system based on a
semi-automatic method for extracting intentional information from a
text corpus. The system can also update the ontology of intentions to
enrich the knowledge base containing all possible intentions of a
domain. The approach relies on the construction of a semi-formal
ontology, which is considered the conceptualization of the
intentional information contained in a text. An experiment on
scientific publications in the field of computer science was carried
out to validate this approach.
Abstract: Parsing is important in linguistics and natural language
processing for understanding the syntax and semantics of a natural
language grammar. Parsing natural language text is challenging
because of problems such as ambiguity and inefficiency. In addition,
the interpretation of natural language text depends on context-based
techniques. A probabilistic component is essential for resolving
ambiguity in both syntax and semantics, thereby increasing the
accuracy and efficiency of the parser. The Tamil language has some
inherent features that make parsing more challenging. To address
them, a lexicalized and statistical approach is applied to parsing
with the aid of a language model. Statistical models mainly focus on
the semantics of the language and are suitable for large-vocabulary
tasks, whereas structural methods focus on syntax and model
small-vocabulary tasks. A trigram-based statistical language model
for Tamil with a medium vocabulary of 5000 words has been built.
Although statistical parsing gives better performance through trigram
probabilities and a large vocabulary size, it has some disadvantages,
such as its focus on semantics rather than syntax and its lack of
support for free word order and long-term relationships. To overcome
these disadvantages, a structural component is incorporated into
statistical language models, which leads to hybrid language models.
This paper attempts to build a phrase-structured hybrid language
model that resolves the above-mentioned disadvantages. In developing
the hybrid language model, a new part-of-speech tag set for Tamil has
been developed with more than 500 tags, giving wider coverage. A
phrase-structured treebank has been developed with 326 Tamil
sentences covering more than 5000 words. A hybrid language model has
been trained on the phrase-structured treebank using the
immediate-head parsing technique. A lexicalized and statistical
parser that employs this hybrid language model and the immediate-head
parsing technique gives better results than pure grammar-based and
trigram-based models.
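The trigram component mentioned in the abstract above can be illustrated with a minimal maximum-likelihood sketch. It is not the authors' model: the function names, the `<s>` and `</s>` padding symbols, and the placeholder tokens are assumptions, no smoothing is applied, and the structural (phrase-structure) component of the hybrid model is not covered.

```python
from collections import Counter


def train_trigram(sentences):
    """Count trigrams and their two-word contexts over tokenized sentences."""
    trigrams, contexts = Counter(), Counter()
    for words in sentences:
        # Pad so that the first real word and the end marker get a context.
        padded = ["<s>", "<s>"] + list(words) + ["</s>"]
        for i in range(2, len(padded)):
            context = (padded[i - 2], padded[i - 1])
            trigrams[context + (padded[i],)] += 1
            contexts[context] += 1
    return trigrams, contexts


def trigram_prob(trigrams, contexts, w1, w2, w3):
    """Maximum-likelihood estimate of P(w3 | w1, w2); 0.0 for unseen contexts."""
    context = (w1, w2)
    if contexts[context] == 0:
        return 0.0
    return trigrams[(w1, w2, w3)] / contexts[context]


if __name__ == "__main__":
    # Placeholder tokens; a real model would be trained on Tamil sentences.
    corpus = [["a", "b", "c"], ["a", "b", "d"]]
    trigrams, contexts = train_trigram(corpus)
    print(trigram_prob(trigrams, contexts, "a", "b", "c"))
```

With a treebank of only a few hundred sentences, most trigrams are unseen, so a practical model would add smoothing or back-off to bigram and unigram estimates.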