Finding Fuzzy Association Rules Using FWFP-Growth with Linguistic Supports and Confidences

In data mining, association rules are used to discover relations among the items in a transaction database. Once the data have been collected and stored, valuable rules can be found through association-rule mining, helping managers devise marketing strategies and plan market frameworks. In this paper, we apply fuzzy partition methods to determine membership functions for the quantitative values of each transaction item. In addition, managers can express the importance of items as linguistic terms, which are transformed into fuzzy sets of weights. Fuzzy weighted frequent pattern growth (FWFP-Growth) is then used to carry out the mining process. The proposed method is expected to improve on the Apriori algorithm by deriving the complete set of association rules more efficiently. An example is given to clearly illustrate the proposed approach.
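
As a concrete illustration of the fuzzification step described above, the following minimal Python sketch (our own, not the authors' implementation; the region boundaries and importance weights are illustrative assumptions) converts a quantitative item value into weighted membership degrees for three linguistic regions:

```python
# Minimal sketch (not the authors' implementation): a quantitative purchase
# amount is converted into fuzzy regions with triangular membership functions
# and weighted by a manager-assigned linguistic importance term.
# Region boundaries and importance weights below are illustrative assumptions.

def triangular(x, a, b, c):
    """Triangular membership function with peak at b."""
    if x == b:
        return 1.0
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x < b else (c - x) / (c - b)

# Fuzzy partition of the quantity axis (0..10) into three linguistic regions.
REGIONS = {"Low": (0, 0, 5), "Middle": (0, 5, 10), "High": (5, 10, 10)}

# Linguistic importance terms mapped to scalar weights (illustrative).
IMPORTANCE = {"Unimportant": 0.25, "Ordinary": 0.5,
              "Important": 0.75, "Very important": 1.0}

def fuzzify(item, quantity, importance_term):
    """Return weighted membership degrees for one transaction item."""
    w = IMPORTANCE[importance_term]
    return {f"{item}.{region}": round(w * triangular(quantity, *abc), 2)
            for region, abc in REGIONS.items()}

print(fuzzify("bread", 7, "Important"))
# {'bread.Low': 0.0, 'bread.Middle': 0.45, 'bread.High': 0.3}
```

FWFP-Growth would then mine frequent patterns over such weighted fuzzy regions rather than over crisp item counts.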

Fuzzy Processing of Uncertain Data

In practice, we often come across situations where it is necessary to make decisions based on incomplete or uncertain data. In control systems this may be due to an unknown exact mathematical model, or to its excessive complexity (e.g. nonlinearity), which makes it necessary to simplify it or to solve it using a rule base. In the case of databases, data retrieval compares the requirements of the selection query with the stored data using a similarity measure, where both the query and the data themselves may contain vague terms, for example in the form of linguistic qualifiers. In this paper, we focus on the processing of uncertain data in databases and demonstrate it on an example of multi-criteria decision making in the selection of variants specified by a large number of technical parameters.
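
To make the idea of matching vague query terms against stored technical parameters concrete, here is a small illustrative sketch; the membership functions, parameters and variants are assumptions, not taken from the paper:

```python
# Illustrative sketch (assumed membership functions, parameters and variants,
# not the paper's exact model): vague query terms such as "low price" and
# "high power" are matched against stored technical parameters, and variants
# are ranked by the aggregated degree of satisfaction.

def low(x, full=100, zero=300):    # fully satisfied below `full`, not at all above `zero`
    return max(0.0, min(1.0, (zero - x) / (zero - full)))

def high(x, zero=50, full=150):    # fully satisfied above `full`
    return max(0.0, min(1.0, (x - zero) / (full - zero)))

variants = {                       # hypothetical variants: (price, power)
    "A": (180, 120),
    "B": (120, 80),
    "C": (260, 140),
}

# Query: "low price AND high power"; min() plays the role of fuzzy conjunction.
scores = {name: min(low(price), high(power))
          for name, (price, power) in variants.items()}
for name, s in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(name, round(s, 2))      # A 0.6, B 0.3, C 0.2
```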

Deixis and Personalization in Ad Slogans

This study examines the use of the persuasive strategy of deixis and personalization in advertising slogans. This rhetorical/stylistic and linguistic strategy has been widely used in advertising slogans for over a century. A total of five hundred advertising slogans of multinational companies in both the product and service sectors were collected. The analysis reveals the three main components of this strategy to be deictic words, absolute uniqueness, and personal pronouns. The percentage and mean of use of the three components are tabulated. The findings show that advertisers have used this persuasive strategy in creative ways to persuade consumers to buy their products and services.

An Investigation into Kanji Character Discrimination Process from EEG Signals

The frontal area of the brain is known to be involved in behavioral judgement. Because a Kanji character can be discriminated from other characters both visually and linguistically, we hypothesized that, in Kanji character discrimination, frontal event-related potential (ERP) waveforms reflect two discrimination processes in separate time periods: one based on visual analysis and the other based on lexical access. To examine this hypothesis, we recorded ERPs while subjects performed a Kanji lexical decision task. In this task, either a known Kanji character, an unknown Kanji character, or a symbol was presented, and the subject had to report whether the presented character was a Kanji character known to them or not. The same response was required for unknown Kanji trials and symbol trials. As a preprocessing step, we evaluated a method using independent component analysis for artifact rejection, found it effective, and therefore applied it. In the ERP results, there were two time periods in which the frontal ERP waveforms differed significantly between the unknown Kanji trials and the symbol trials: around 170 ms and around 300 ms after stimulus onset. This result supports our hypothesis. In addition, the result suggests that lexical access for Kanji characters may be fully completed by around 260 ms after stimulus onset.
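
The ICA-based artifact rejection step can be illustrated with a short sketch; this is a generic reconstruction of the procedure using scikit-learn's FastICA, not the authors' exact pipeline, and the data and component selection are placeholders:

```python
# Generic sketch of ICA-based artifact rejection (not the authors' exact
# pipeline), using scikit-learn's FastICA. `eeg` is a (time_points, channels)
# array of placeholder data; in practice the artifact component would be chosen
# by inspecting component waveforms and scalp topographies.
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(0)
eeg = rng.standard_normal((1000, 8))          # placeholder for recorded EEG

ica = FastICA(n_components=8, random_state=0)
sources = ica.fit_transform(eeg)              # unmix into independent components

artifact_components = [0]                     # e.g. an ocular (blink) component
sources[:, artifact_components] = 0.0         # remove the artifact activity

clean_eeg = ica.inverse_transform(sources)    # project back to channel space
print(clean_eeg.shape)                        # (1000, 8)
```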

Multi Language Text Editor for Burushaski and Urdu through Unicode

This paper introduces Burushaski, an isolated and unique ancient language spoken in Hunza, Nagar, Yasin, and parts of Gilgit in the Northern Areas of Pakistan. It explains the working mechanism of a multi-language text editor for Urdu and Burushaski, developed using the ISO/IEC 10646 Unicode standard with Urdu and Burushaski OpenType fonts. The editor gives this regional ancient language access to modern information technology for its promotion and preservation. The main objective of this research is to help preserve the heritage of such rare languages and provide a convenient means of automation. It also assists those who are interested in undertaking research on Burushaski or keen to trace phonetic relationships between the national language Urdu and Burushaski. Since the editor covers both Burushaski and Urdu, it can play an important role in introducing Burusho linguistic culture to the world at large. As a result of this research, publishing Burushaski through IT means becomes possible.

The Determination of Rating Points of Objects with Qualitative Characteristics and their Usage in Decision Making Problems

The paper presents a method developed to assess rating points of objects with qualitative indexes. The novelty of the method lies in the use of linguistic scales that make it possible to formalize the values of the indexes with the help of fuzzy sets. As a result, dissimilar indexes can be handled correctly on a unified basis and stable final results can be obtained. The obtained rating points are used in decision making based on fuzzy expert opinions.
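
A minimal sketch of how linguistic index values might be formalized as fuzzy sets and turned into a rating point is shown below; the scale, weights and defuzzification rule are our own illustrative assumptions rather than the authors' exact method:

```python
# Sketch under our own assumptions (the paper's scales and aggregation may
# differ): linguistic values of qualitative indexes are mapped to triangular
# fuzzy numbers, aggregated with index weights, and defuzzified into a rating point.

SCALE = {                     # triangular fuzzy numbers (l, m, u) on [0, 1]
    "poor":      (0.00, 0.00, 0.25),
    "fair":      (0.00, 0.25, 0.50),
    "good":      (0.25, 0.50, 0.75),
    "very good": (0.50, 0.75, 1.00),
    "excellent": (0.75, 1.00, 1.00),
}

def rating_point(assessments, weights):
    """Weighted aggregation of linguistic assessments, defuzzified by centroid."""
    l = sum(w * SCALE[a][0] for a, w in zip(assessments, weights))
    m = sum(w * SCALE[a][1] for a, w in zip(assessments, weights))
    u = sum(w * SCALE[a][2] for a, w in zip(assessments, weights))
    return (l + m + u) / 3.0   # centroid of the aggregated triangular number

# Three qualitative indexes with normalized weights (illustrative values).
print(round(rating_point(["good", "excellent", "fair"], [0.5, 0.3, 0.2]), 3))  # 0.575
```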

Arabic Word Semantic Similarity

This paper is concerned with the production of an Arabic word semantic similarity benchmark dataset, the first of its kind for Arabic, developed specifically to assess the accuracy of word semantic similarity measures. Semantic similarity is an essential component of numerous applications in fields such as natural language processing, artificial intelligence, linguistics, and psychology. Most of the reported work has been done for English; to the best of our knowledge, there is no word similarity measure developed specifically for Arabic. In this paper, an Arabic benchmark dataset of 70 word pairs is presented. New methods and the best available techniques have been used to produce the dataset, including selecting and creating the materials, collecting human ratings from a representative sample of participants, and calculating the overall ratings. This dataset will make a substantial contribution to future work in the field of Arabic word semantic similarity and will hopefully serve as a reference basis against which to evaluate and compare different methodologies in the field.

Fuzzy Approach for Ranking of Motor Vehicles Involved in Road Accidents

An increasing number of vehicles and a lack of awareness among road users may lead to road accidents. However, no specific literature was found that ranks vehicles involved in accidents based on fuzzy variables describing road users. This paper proposes a ranking of four selected motor vehicle types involved in road accidents. Human and non-human factors normally linked with road accidents are considered for the ranking. The imprecision and vagueness inherent in the subjective assessments of experts led to the application of fuzzy set theory to the ranking problem. Data in the form of linguistic variables were collected from three authorised personnel of three Malaysian Government agencies. Fuzzy TOPSIS, a multi-criteria decision-making method, was applied in the computational procedures. The analysis shows that motorcycles yielded the highest closeness coefficient, at 0.6225. A ranking can be drawn from the magnitude of the closeness coefficients, which places motorcycles in first rank.
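
For readers unfamiliar with the closeness coefficient reported above, the following sketch shows the corresponding step of TOPSIS in its crisp form; the decision matrix and weights are illustrative assumptions, and the paper itself applies the fuzzy variant to linguistic expert ratings:

```python
# Sketch of the closeness-coefficient step of TOPSIS in its crisp form; the
# paper applies the fuzzy variant to linguistic expert ratings, and the matrix
# and weights below are illustrative assumptions only.
import numpy as np

# Rows: vehicle types; columns: accident-related criteria, normalized so that
# larger values mean stronger association with accidents.
X = np.array([[0.9, 0.8, 0.7],    # motorcycles
              [0.5, 0.6, 0.4],    # cars
              [0.3, 0.4, 0.5],    # buses
              [0.2, 0.3, 0.3]])   # lorries
w = np.array([0.5, 0.3, 0.2])     # criterion weights from expert judgement

V = X * w                                        # weighted normalized matrix
ideal, anti_ideal = V.max(axis=0), V.min(axis=0)
d_plus  = np.linalg.norm(V - ideal, axis=1)      # distance to the ideal solution
d_minus = np.linalg.norm(V - anti_ideal, axis=1) # distance to the anti-ideal solution
cc = d_minus / (d_plus + d_minus)                # closeness coefficient per vehicle

print(dict(zip(["motorcycle", "car", "bus", "lorry"], cc.round(4))))
```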

Fuzzy Trust for Peer-to-Peer Based Systems

Trust management is one of the main challenges in Peer-to-Peer (P2P) systems. The lack of centralized control makes it difficult to control the behavior of the peers. Reputation systems are one approach to providing trust assessment in P2P systems. In this paper, we use fuzzy logic to model trust in a P2P environment. Our trust model combines first-hand (direct experience) and second-hand (reputation) information to allow peers to represent and reason with uncertainty regarding other peers' trustworthiness. Fuzzy logic helps handle the imprecise and uncertain nature of trust. Linguistic labels are used to enable peers to assign a trust level intuitively. Our fuzzy trust model is flexible in that inference rules are used to weight first-hand and second-hand information accordingly.
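
A toy sketch of how linguistic trust labels and a confidence-weighted combination of first-hand and second-hand information might look is given below; this is a crisp simplification for illustration only, not the paper's fuzzy inference rules:

```python
# Toy sketch (a crisp simplification, not the paper's fuzzy inference rules):
# linguistic trust labels are mapped to numeric levels, and first-hand
# experience is weighted against second-hand reputation according to how much
# direct interaction history is available.

LABELS = {"distrust": 0.0, "low": 0.25, "medium": 0.5, "high": 0.75, "full": 1.0}

def combined_trust(first_hand_label, second_hand_label, n_direct_interactions):
    """Weight direct experience by how much history the peer actually has."""
    alpha = min(1.0, n_direct_interactions / 10.0)   # confidence in first-hand info
    first = LABELS[first_hand_label]
    second = LABELS[second_hand_label]
    return alpha * first + (1.0 - alpha) * second

def to_label(value):
    """Map a crisp trust value back to the nearest linguistic label."""
    return min(LABELS, key=lambda lab: abs(LABELS[lab] - value))

score = combined_trust("high", "low", n_direct_interactions=8)
print(round(score, 2), to_label(score))   # 0.65 high
```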

Weak Measurement Theory for Discrete Scales

With the increasing spread of computers and the internet among culturally, linguistically and geographically diverse communities, issues of internationalization and localization are becoming increasingly important. For some of these issues, such as different scales for length and temperature, there is a well-developed measurement theory. For others, such as date formats, no such theory appears to be possible. This paper fills a gap by developing a measurement theory for a previously overlooked class of scales: discrete and interval-valued scales such as spanner and shoe sizes. The paper gives a theoretical foundation for a class of data representation problems.
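
The following toy sketch illustrates what a discrete, interval-valued scale looks like in practice; the shoe-size intervals are illustrative values, not part of the paper:

```python
# Toy sketch of a discrete, interval-valued scale: each labelled size denotes
# an interval of the underlying continuous quantity, so conversions go via
# intervals and need not round-trip exactly. The interval values are illustrative.

EU_SHOE = {38: (24.0, 24.6), 39: (24.7, 25.3), 40: (25.4, 26.0)}   # foot length, cm

def to_size(length_cm, scale):
    """Map a continuous measurement to the discrete size whose interval contains it."""
    for size, (lo, hi) in scale.items():
        if lo <= length_cm <= hi:
            return size
    return None

print(to_size(25.1, EU_SHOE))   # 39
print(EU_SHOE[39])              # (24.7, 25.3): the size names an interval, not a point
```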

Greek Compounds: A Challenging Case for the Parsing Techniques of PC-KIMMO v.2

In this paper we describe the recognition process of Greek compound words using the PC-KIMMO software. We try to show certain limitations of the system with respect to the principles of compound formation in Greek. Moreover, we discuss the computational processing of phenomena such as stress and syllabification which are indispensable for the analysis of such constructions and we try to propose linguistically-acceptable solutions within the particular system.

ORank: An Ontology Based System for Ranking Documents

The growing volume of information on the internet creates an increasing need for new (semi-)automatic methods for retrieving documents and ranking them according to their relevance to the user query. In this paper, after a brief review of ranking models, a new ontology-based approach for ranking HTML documents is proposed and evaluated under various circumstances. Our approach is a combination of conceptual, statistical and linguistic methods; this combination preserves the precision of ranking without sacrificing speed. The approach exploits natural language processing techniques to extract phrases and stem words. An ontology-based conceptual method is then used to annotate documents and expand the query. For query expansion, the spread activation algorithm is improved so that the expansion can be performed along various aspects. The annotated documents and the expanded query are processed to compute the relevance degree using statistical methods. The outstanding features of our approach are (1) combining conceptual, statistical and linguistic features of documents, (2) expanding the query with its related concepts before comparing it to documents, (3) extracting and using both words and phrases to compute the relevance degree, (4) improving the spread activation algorithm so that expansion is based on a weighted combination of different conceptual relationships, and (5) allowing variable document vector dimensions. A ranking system called ORank has been developed to implement and test the proposed model. The test results are included at the end of the paper.
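
The query-expansion step based on spread activation can be sketched as follows; this is a simplified generic version with an illustrative ontology graph, whereas ORank additionally weights different conceptual relationship types:

```python
# Simplified spread activation for query expansion (our generic version with an
# illustrative ontology graph; ORank additionally weights different conceptual
# relationship types when propagating activation).

ontology = {   # concept -> [(related_concept, relationship_weight), ...]
    "vehicle": [("car", 0.9), ("engine", 0.6)],
    "car":     [("wheel", 0.7), ("engine", 0.8)],
    "engine":  [("fuel", 0.5)],
}

def spread_activation(seeds, graph, decay=0.8, threshold=0.2):
    """Propagate activation from the query concepts through the ontology."""
    activation = dict(seeds)
    frontier = list(seeds.items())
    while frontier:
        concept, act = frontier.pop()
        for neighbor, weight in graph.get(concept, []):
            new_act = act * weight * decay
            if new_act >= threshold and new_act > activation.get(neighbor, 0.0):
                activation[neighbor] = new_act
                frontier.append((neighbor, new_act))
    return activation

print(spread_activation({"vehicle": 1.0}, ontology))
# e.g. {'vehicle': 1.0, 'car': 0.72, 'engine': 0.48, 'wheel': 0.40}
```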

Communicative Competence: Novice versus Professional Engineers' Perceptions

The notion of communicative competence has been deemed fuzzy in communication studies. This fuzziness has led to tensions among engineers across tenures in interpreting what constitutes communicative competence. The study investigates novice and professional engineers' understanding of this notion in terms of two main elements of communicative competence: linguistic and rhetorical competence. Novice engineers are final-year engineering students, whilst professional engineers are engineers with at least five years of working experience. Novice and professional engineers were interviewed to gauge their perceptions of the linguistic and rhetorical features deemed necessary to enhance communicative competence for the profession. Both groups indicated awareness of, and differences in, the importance of the sub-sets of communicative competence, namely rhetorical explanatory competence, linguistic oral immediacy competence, technical competence and meta-cognitive competence. Such differences, a possible attribute of the learning theory, inadvertently indicate subtle differences in the way novice and professional engineers perceive communicative competence.

Prediction of Reusability of Object Oriented Software Systems using Clustering Approach

In the literature, there are metrics for identifying the quality of reusable components, but a framework that uses these metrics to precisely predict the reusability of software components still needs to be worked out. If these reusability metrics are identified in the design phase, or even in the coding phase, they can help reduce rework by improving the quality of reuse of software components and hence improve productivity through a probabilistic increase in the reuse level. As the CK metrics suite is the most widely used set of metrics for extracting the structural features of object-oriented (OO) software, this study uses a tuned CK metrics suite, i.e. WMC, DIT, NOC, CBO and LCOM, to obtain a structural analysis of OO software components. An algorithm is proposed in which the tuned metric values of an OO software component are given as input to a K-Means clustering system, and a decision tree is built with 10-fold cross validation of the data to evaluate the component in terms of a linguistic reusability value. The developed reusability model has produced high-precision results, as desired.
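
A rough sketch of clustering components by their CK metric vectors is shown below; the metric values, cluster count and linguistic labels are illustrative assumptions, not the tuned values used in the study:

```python
# Rough sketch (illustrative metric values and labels, not the study's tuned
# values): components are clustered by their CK metric vectors and each cluster
# is mapped to a linguistic reusability value.
import numpy as np
from sklearn.cluster import KMeans

# Columns: WMC, DIT, NOC, CBO, LCOM for a few hypothetical components.
ck = np.array([[12, 2, 1,  4, 10],
               [45, 5, 0, 18, 60],
               [ 8, 1, 3,  3,  5],
               [30, 4, 1, 12, 40],
               [10, 2, 2,  5,  8]])

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(ck)

# Heuristic labelling: the cluster with the lower average complexity/coupling
# is tagged "high reusability", the other "low reusability".
order = np.argsort(kmeans.cluster_centers_.sum(axis=1))
labels = {int(order[0]): "high reusability", int(order[1]): "low reusability"}
for i, c in enumerate(kmeans.labels_):
    print(f"component {i}: {labels[int(c)]}")
```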

Micropower Fuzzy Linguistic-Hedges Circuit in Current-Mode Approach

In this paper, based on a novel synthesis, a set of new simplified circuit designs implementing the linguistic-hedge operations used to adjust fuzzy membership functions is presented. The circuits work in current mode and employ floating-gate MOS (FGMOS) transistors operating in the weak-inversion region. Compared to other proposed circuits, these circuits feature a greatly reduced number of elements, a low supply voltage (0.7 V), and low power consumption (60 dB). A set of fuzzy linguistic-hedge circuits, including absolutely, very, much more, more, plus, minus, more or less, and slightly, has been implemented in a 0.18 µm CMOS process. HSPICE simulation results confirm the validity of the proposed design technique and show the high performance of the circuits.
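
For reference, the linguistic-hedge operations realized by such circuits are commonly defined as powers of the membership degree; the sketch below uses textbook exponents, which may differ from the authors' exact definitions:

```python
# Textbook definitions of linguistic hedges as powers of the membership degree
# (the circuits realize such operations in current mode); the exponents below
# are commonly used values and may differ from the authors' exact definitions.

HEDGES = {
    "absolutely":   lambda mu: mu ** 8,      # very strong concentration (illustrative)
    "much more":    lambda mu: mu ** 3,
    "very":         lambda mu: mu ** 2,      # concentration
    "more":         lambda mu: mu ** 1.5,
    "plus":         lambda mu: mu ** 1.25,
    "minus":        lambda mu: mu ** 0.75,
    "more or less": lambda mu: mu ** 0.5,    # dilation
    "slightly":     lambda mu: mu ** 0.25,
}

mu = 0.64
for name, hedge in HEDGES.items():
    print(f"{name:13s} {hedge(mu):.3f}")
```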

Maya Semantic Technique: A Mathematical Technique Used to Determine Partial Semantics for Declarative Sentences

This research uses computational linguistics, an area of study that employs computers to process natural language, and aims at discerning the patterns that exist in declarative sentences used in technical texts. The approach is mathematical, and the focus is on instructional texts found on web pages. The technique developed by the author, named the MAYA Semantic Technique, is organized into four stages. In the first stage, the parts of speech in each sentence are identified. In the second stage, the subject of the sentence is determined. In the third stage, MAYA performs a frequency analysis on the remaining words to determine the verb and its object. In the fourth stage, MAYA performs a statistical analysis to determine the content of the web page. The advantage of the MAYA Semantic Technique lies in its use of mathematical principles to represent grammatical operations, which improves processing and accuracy when applied to unambiguous text. The MAYA Semantic Technique is part of a proposed architecture for an entire web-based intelligent tutoring system. On a sample set of sentences, partial semantics derived using the MAYA Semantic Technique were approximately 80% accurate. The system currently processes technical text in one domain, namely Cµ programming, in which all the keywords and programming concepts are known and understood.
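
The kind of processing relied on in the first and third stages (part-of-speech identification and word-frequency analysis) can be illustrated with generic NLTK code; this sketch is not the MAYA Semantic Technique itself, and the example sentence is ours:

```python
# Generic NLTK sketch of the operations named in stages one and three
# (part-of-speech tagging and word-frequency analysis); this is not the MAYA
# Semantic Technique itself, and the example sentence is ours.
import nltk
from collections import Counter

# Resource names vary across NLTK versions; quietly try the common ones.
for resource in ("punkt", "punkt_tab", "averaged_perceptron_tagger",
                 "averaged_perceptron_tagger_eng"):
    nltk.download(resource, quiet=True)

sentence = "The compiler translates the source code into machine code."
tagged = nltk.pos_tag(nltk.word_tokenize(sentence))   # stage 1: parts of speech
print(tagged)

# Stage-3 style frequency analysis over the verbs and nouns.
content = [w.lower() for w, t in tagged if t.startswith(("VB", "NN"))]
print(Counter(content).most_common(3))
```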

Extraction of Significant Phrases from Text

Prospective readers can quickly determine whether a document is relevant to their information need if the significant phrases (or keyphrases) of the document are provided. Although keyphrases are useful, not many documents have keyphrases assigned to them, and manually assigning keyphrases to existing documents is costly. There is therefore a need for automatic keyphrase extraction. This paper introduces a new domain-independent keyphrase extraction algorithm. The algorithm treats keyphrase extraction as a classification task and uses a combination of statistical and computational linguistics techniques, a new set of attributes, and a new machine learning method to distinguish keyphrases from non-keyphrases. The experiments indicate that this algorithm performs better than other keyphrase extraction tools and significantly outperforms Microsoft Word 2000's AutoSummarize feature. The domain independence of the algorithm has also been confirmed in our experiments.
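
A simplified sketch of framing keyphrase extraction as candidate generation plus feature computation is given below; the stop list, features and ranking rule are generic stand-ins, not the paper's attribute set or learning method:

```python
# Simplified sketch of keyphrase extraction framed as candidate generation plus
# feature computation (a generic stand-in; the paper uses its own attribute set
# and a new learning method to classify candidates).
import re
from collections import Counter

STOP = {"the", "a", "an", "of", "to", "and", "is", "are", "in", "for", "this", "as"}

def candidates(text, max_len=3):
    """Yield candidate phrases (word n-grams not starting/ending with a stop word)."""
    words = re.findall(r"[a-z]+", text.lower())
    for n in range(1, max_len + 1):
        for i in range(len(words) - n + 1):
            gram = words[i:i + n]
            if gram[0] not in STOP and gram[-1] not in STOP:
                yield " ".join(gram), i / len(words)

def features(text):
    """Per-candidate features: normalized frequency and first-occurrence position."""
    freq, first_pos = Counter(), {}
    for phrase, pos in candidates(text):
        freq[phrase] += 1
        first_pos.setdefault(phrase, pos)
    total = sum(freq.values())
    return {p: (freq[p] / total, first_pos[p]) for p in freq}

# A trained classifier would consume these feature vectors; here, ranking by
# frequency and earliness stands in for the learned decision rule.
text = ("Keyphrase extraction assigns keyphrases to documents. "
        "Keyphrase extraction is treated as a classification task.")
ranked = sorted(features(text).items(), key=lambda kv: (-kv[1][0], kv[1][1]))
print(ranked[:3])
```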

A Case Study of Reactive Focus on Form through Negotiation on Spoken Errors: Does It Work for All Learners?

This case study investigates the effects of reactive focus on form through negotiation on the linguistic development of an adult EFL learner in an exclusive private EFL classroom. The findings revealed that in this classroom negotiated feedback occurred significantly more often than non-negotiated feedback. However, it was also found that in the long run the learner was significantly more successful in correcting his own errors when he had received non-negotiated rather than negotiated feedback. This study therefore argues that although negotiated feedback seems to be effective for some learners in the short run, it is non-negotiated feedback that seems to be more effective in the long run. This long-lasting effect might be attributed to the impact of the schooling system, which is itself indicative of the dominant culture, or to the absence of other interlocutors in the course of interaction.

Ontology Population via NLP Techniques in Risk Management

In this paper we propose an NLP-based method for ontology population from texts and apply it to semi-automatically instantiate a generic knowledge base (generic domain ontology) in the risk management domain. The approach is semi-automatic and relies on domain expert intervention for validation. It is based on a set of instance recognition rules built on syntactic structures and on the predicative power of verbs in the instantiation process. It is not domain dependent, since it relies chiefly on linguistic knowledge. A description of an experiment performed on part of the ontology of the PRIMA project (supported by the European Community) is given. A first validation of the method is carried out by populating this ontology with Chemical Fact Sheets from the Environmental Protection Agency. The results of this experiment complete the paper and support the hypothesis that relying on the predicative power of verbs in the instantiation process improves performance.
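
A verb-driven instance recognition rule of the kind described can be sketched with a dependency parser; the trigger verbs and ontology classes below are hypothetical, not the rules used in the paper, and the sketch requires spaCy with its small English model:

```python
# Hypothetical verb-driven instance recognition rule (the trigger verbs and
# ontology classes are illustrative, not the rules used in the paper).
# Requires spaCy and its small English model:
#   python -m spacy download en_core_web_sm
import spacy

nlp = spacy.load("en_core_web_sm")
TRIGGER_VERBS = {"cause", "provoke", "trigger"}   # assumed predicative verbs

def extract_instances(sentence):
    """If a trigger verb is found, its subject populates a 'RiskFactor' class
    and its direct object populates a 'Consequence' class (illustrative)."""
    doc = nlp(sentence)
    instances = []
    for token in doc:
        if token.pos_ == "VERB" and token.lemma_ in TRIGGER_VERBS:
            subjects = [c for c in token.children if c.dep_ in ("nsubj", "nsubjpass")]
            objects = [c for c in token.children if c.dep_ in ("dobj", "obj")]
            if subjects and objects:
                instances.append(("RiskFactor", subjects[0].text))
                instances.append(("Consequence", objects[0].text))
    return instances

print(extract_instances("Benzene exposure causes leukemia."))
# e.g. [('RiskFactor', 'exposure'), ('Consequence', 'leukemia')]
```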