Sounds Alike Name Matching for Myanmar Language

Personal name matching is a core task in national citizen databases, text and web mining, information retrieval, online library systems, e-commerce, and record linkage systems, and it has motivated extensive research. Traditional name matching methods are suited to English and other Latin-based languages. Asian languages without word boundaries, such as Myanmar, still lack a sounds-alike matching system for Unicode-based applications. We therefore propose a matching algorithm that derives sounds-alike (phonetic) patterns suited to Myanmar character spelling. Given the nature of Myanmar characters, we address word boundary fragmentation and character collation. A pattern conversion algorithm builds word patterns from the fragmented and collated characters. We also create Myanmar sounds-alike phonetic groups to support phonetic matching. Experimental results show a fragmentation accuracy of 99.32% and a processing time of 1.72 ms.
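
The fragment-then-group idea can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the syllable break rule is a simplified heuristic, the phonetic groups are illustrative assumptions (letters commonly pronounced alike in modern Burmese), and the names `fragment`, `phonetic_pattern`, and `sounds_alike` are hypothetical.

```python
import re

# Simplified rule (an assumption): start a new syllable before a consonant
# (U+1000..U+1021) unless it is stacked (preceded by virama U+1039) or closes
# the previous syllable (followed by asat U+103A or virama U+1039).
_BREAK = re.compile(r"(?<!\u1039)([\u1000-\u1021])(?![\u103a\u1039])")

def fragment(text):
    """Split Myanmar text into rough syllable units."""
    starts = [m.start() for m in _BREAK.finditer(text)]
    if not starts or starts[0] != 0:
        starts.insert(0, 0)
    bounds = starts + [len(text)]
    return [text[i:j] for i, j in zip(bounds, bounds[1:]) if i < j]

# Illustrative sounds-alike groups, NOT the groups defined in the paper.
PHONETIC_GROUPS = [
    "\u1002\u1003",              # GA, GHA
    "\u1012\u1013\u100d\u100e",  # DA, DHA, DDA, DDHA
    "\u1017\u1018",              # BA, BHA
    "\u103b\u103c",              # MEDIAL YA, MEDIAL RA
]
# Map every letter in a group to the first letter of that group.
_CANON = {ch: group[0] for group in PHONETIC_GROUPS for ch in group}

def phonetic_pattern(name):
    """Convert a name to a list of syllables with group letters canonicalized."""
    return ["".join(_CANON.get(ch, ch) for ch in syl) for syl in fragment(name)]

def sounds_alike(a, b):
    """Two names match when their phonetic patterns are identical."""
    return phonetic_pattern(a) == phonetic_pattern(b)
```

Comparing patterns syllable by syllable, rather than as raw strings, keeps the match tolerant of spelling variants within a group while still distinguishing names that differ in any other character.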

Fuzzy Voting in Internal Elections of Educational and Party Organizations

This article presents a method, founded on fuzzy logic, for elections among the members of a group. Linguistic variables are the objects of decision on election cards, and deduction is based on t-norms and s-norms. In this election method, the election cards are questionnaires. Each questionnaire comprises several questions, each with several choices. The choices are words from natural language. The method uses center of gravity (COG) defuzzification and is implemented as a computer program in MATLAB. Finally, the method is illustrated by solving two examples: choosing a head for the members of a research group and choosing a representative for students.
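
A rough Python sketch of how linguistic answers could be turned into a crisp score with COG defuzzification follows. The abstract does not give the membership functions or the rule base, so the triangular sets, the 0 to 10 score axis, the choice of min as t-norm and max as s-norm, and the names `triangle`, `TERMS`, `score`, and `winner` are all assumptions made for illustration.

```python
def triangle(a, b, c):
    """Triangular membership function with feet a, c and peak b (a <= b <= c)."""
    def mu(x):
        if x <= a or x >= c:
            # Shoulder case: the peak may coincide with a foot.
            return 1.0 if x == b else 0.0
        return (x - a) / (b - a) if x <= b else (c - x) / (c - b)
    return mu

# Hypothetical linguistic terms on a 0..10 score axis.
TERMS = {
    "poor": triangle(0, 0, 5),
    "fair": triangle(0, 5, 10),
    "good": triangle(5, 10, 10),
}

def score(answers, steps=1000):
    """Defuzzify one candidate's collected answers to a crisp score."""
    # Weight of each term = share of answers that chose it.
    weights = {t: answers.count(t) / len(answers) for t in TERMS}
    num = den = 0.0
    for i in range(steps + 1):
        x = 10 * i / steps
        # Clip each term by its weight (min, a t-norm), then unite (max, an s-norm).
        mu = max(min(weights[t], TERMS[t](x)) for t in TERMS)
        num += x * mu
        den += mu
    return num / den if den else 0.0  # discrete center of gravity

def winner(ballots):
    """ballots maps each candidate to the list of terms chosen for them."""
    return max(ballots, key=lambda cand: score(ballots[cand]))
```

For example, `winner({"A": ["good", "fair"], "B": ["poor", "fair"]})` selects the candidate whose aggregated fuzzy set has the higher center of gravity.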

Natural Language Database Interface for Selection of Data Using Grammar and Parsing

Databases have become ubiquitous. Almost all IT applications store information in and retrieve information from databases. Retrieving information from a database requires knowledge of technical languages such as Structured Query Language (SQL). However, the majority of users who interact with databases do not have a technical background and are intimidated by the idea of using languages such as SQL. This has led to the development of a few Natural Language Database Interfaces (NLDBIs). An NLDBI allows the user to query the database in a natural language. This paper presents the architecture of a new NLDBI system, describes its implementation, and discusses the results obtained. In most typical NLDBI systems, the natural language statement is converted into an internal representation based on syntactic and semantic knowledge of the natural language. This representation is then converted into queries by a representation converter. A natural language query is thus translated into an equivalent SQL query after passing through several processing stages. The work has been tested on primitive database queries under certain constraints.
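
The translation from a restricted natural language question to SQL can be sketched in Python as below. This is only an illustration under stated assumptions, not the system described in the paper: the one-rule grammar, the `students` schema, and the names `SCHEMA`, `translate`, and `demo` are hypothetical. The sketch uses the standard `re` and `sqlite3` modules.

```python
import re
import sqlite3

# Hypothetical schema, used to validate table and column names.
SCHEMA = {"students": {"name", "age", "city"}}

# One grammar rule: <verb> <columns> of|from <table> [where <column> is <value>]
QUERY = re.compile(
    r"^(?:show|list|get)\s+(?P<cols>[\w\s,]+?)\s+(?:of|from)\s+(?P<table>\w+)"
    r"(?:\s+where\s+(?P<col>\w+)\s+is\s+(?P<val>.+?))?\s*$",
    re.IGNORECASE,
)

def translate(question):
    """Return (sql, params) for a question inside the tiny grammar."""
    m = QUERY.match(question.strip())
    if m is None:
        raise ValueError("question is outside the supported grammar")
    table = m["table"].lower()
    if table not in SCHEMA:
        raise ValueError(f"unknown table: {table}")
    cols = [c for c in re.split(r"\s*,\s*|\s+and\s+", m["cols"].lower().strip()) if c]
    if cols == ["all"]:
        select = "*"
    else:
        unknown = [c for c in cols if c not in SCHEMA[table]]
        if unknown:
            raise ValueError(f"unknown columns: {unknown}")
        select = ", ".join(cols)
    sql = f"SELECT {select} FROM {table}"
    params = ()
    if m["col"] is not None:
        col = m["col"].lower()
        if col not in SCHEMA[table]:
            raise ValueError(f"unknown column: {col}")
        sql += f" WHERE {col} = ?"   # the value is bound, never interpolated
        params = (m["val"],)
    return sql, params

def demo():
    """Run one translated question against an in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE students (name TEXT, age INTEGER, city TEXT)")
    conn.executemany(
        "INSERT INTO students VALUES (?, ?, ?)",
        [("Asha", 20, "Pune"), ("Ravi", 22, "Delhi")],
    )
    sql, params = translate("show name and age of students where city is Pune")
    rows = conn.execute(sql, params).fetchall()
    conn.close()
    return rows
```

Identifiers are checked against the schema before they reach the SQL string, and the user-supplied value is passed as a bound parameter, so the generated query cannot be altered by the wording of the question.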

Automatic Rearrangement of Localized Graphical User Interface

The localization of software products is essential for reaching users in the international market. An important part of this task is translating the user interface into local languages. Because graphical interfaces are usually optimized for the size of the texts in the original language, certain controls (e.g. text labels and buttons in dialogs) may grow after translation to the point where they overlap each other. This not only looks unpleasant but also makes the program more difficult (or even impossible) to use, so the arrangement of the controls must be corrected afterwards. The correction should preserve the original structure of the interface (e.g. the relation of logically coherent controls). It is also important to keep the well-proportioned design: the formation of large empty areas should be avoided. This paper describes an algorithm that automatically rearranges the controls of a graphical user interface based on these principles. The algorithm has been implemented and integrated into a translation support system, and it produced visually pleasing results in most test cases.
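
The abstract does not spell out the algorithm, so the following Python sketch only illustrates the underlying problem in one dimension: controls in a single row grow after translation and are shifted right just enough to remove overlaps, keeping their original order. The `Control` type, the `rearrange_row` function, and the `min_gap` parameter are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Control:
    name: str
    x: int       # left edge in pixels
    width: int   # width in pixels

def rearrange_row(controls, new_widths, min_gap=4):
    """Apply translated widths and resolve overlaps within one row.

    A control moves only when the grown control to its left would otherwise
    overlap it, so the original order is kept and no extra empty space is added.
    """
    result = []
    prev_right = None
    for c in sorted(controls, key=lambda ctl: ctl.x):
        width = new_widths.get(c.name, c.width)
        x = c.x if prev_right is None else max(c.x, prev_right + min_gap)
        result.append(Control(c.name, x, width))
        prev_right = x + width
    return result
```

A full solution would also have to handle two dimensions, grouping of logically related controls, and the dialog's own size, none of which this sketch attempts.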

Semi-Automatic Analyzer to Detect Authorial Intentions in Scientific Documents

Information retrieval aims to study models and build systems that allow a user to find the documents relevant to his or her information need. Information search remains a difficult problem because natural language is hard to represent and process, for example because of polysemy. Intentional structures promise to be a new paradigm for extending existing document structures and enhancing the different phases of document processing, such as creation, editing, search, and retrieval. Recognizing the intentions of the authors of texts can reduce the scale of this problem. In this article, we present an intention recognition system based on a semi-automatic method for extracting intentional information from a text corpus. The system can also update the ontology of intentions to enrich the knowledge base containing all possible intentions of a domain. The approach relies on the construction of a semi-formal ontology, which is considered the conceptualization of the intentional information contained in a text. An experiment on scientific publications in the field of computer science was conducted to validate the approach.

Structural Parsing of Natural Language Text in Tamil Using Phrase Structure Hybrid Language Model

Parsing is important in linguistics and natural language processing for understanding the syntax and semantics of a natural language grammar. Parsing natural language text is challenging because of problems such as ambiguity and inefficiency. In addition, the interpretation of natural language text depends on context-based techniques. A probabilistic component is essential for resolving ambiguity in both syntax and semantics, thereby increasing the accuracy and efficiency of the parser. The Tamil language has some inherent features that make parsing more challenging. To address them, a lexicalized and statistical approach is applied in parsing with the aid of a language model. Statistical models focus mainly on the semantics of the language and are suitable for large-vocabulary tasks, whereas structural methods focus on syntax and model small-vocabulary tasks. A trigram-based statistical language model for Tamil with a medium vocabulary of 5000 words has been built. Although statistical parsing gives better performance through trigram probabilities and a large vocabulary size, it has some disadvantages: it focuses on semantics rather than syntax, and it lacks support for free word order and long-term relationships. To overcome these disadvantages, a structural component is incorporated into the statistical language model, which leads to hybrid language models. This paper attempts to build a phrase-structured hybrid language model that resolves the disadvantages mentioned above. In developing the hybrid language model, a new part-of-speech tag set for Tamil with more than 500 tags has been created to give wider coverage. A phrase-structured Treebank has been developed with 326 Tamil sentences covering more than 5000 words. The hybrid language model has been trained on the phrase-structured Treebank using the immediate head parsing technique. A lexicalized and statistical parser that employs this hybrid language model and the immediate head parsing technique gives better results than pure grammar-based and trigram-based models.
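
As an illustration of the statistical component only, a trigram language model can be sketched in Python as follows. The add-one smoothing, the sentence boundary markers `<s>` and `</s>`, the placeholder tokens, and the names `train_trigram` and `prob` are assumptions; the abstract does not state how the Tamil model was estimated, and the structural (phrase structure) component is not covered here.

```python
from collections import Counter

def train_trigram(sentences):
    """Count trigrams and their two-word contexts over tokenized sentences."""
    tri, ctx, vocab = Counter(), Counter(), set()
    for words in sentences:
        vocab.update(words)
        toks = ["<s>", "<s>"] + list(words) + ["</s>"]
        for i in range(2, len(toks)):
            tri[(toks[i - 2], toks[i - 1], toks[i])] += 1
            ctx[(toks[i - 2], toks[i - 1])] += 1
    vocab.add("</s>")  # the end marker is a possible next token
    return tri, ctx, vocab

def prob(model, w1, w2, w3):
    """P(w3 | w1, w2) with add-one (Laplace) smoothing."""
    tri, ctx, vocab = model
    return (tri[(w1, w2, w3)] + 1) / (ctx[(w1, w2)] + len(vocab))

# Placeholder tokens stand in for Tamil words.
model = train_trigram([["a", "b", "c"], ["a", "b", "d"]])
```

With this toy corpus, `prob(model, "a", "b", "c")` is (1 + 1) / (2 + 5) = 2/7, because the context `a b` occurs twice and the vocabulary of possible next tokens has five entries. A model of this kind sees only the two preceding words, which is the limitation on long-term relationships that the hybrid model is meant to address.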