Abstract: In data mining, association rules are used to discover
relations among the items in a transaction database. Once the
data has been collected and stored, valuable rules can be found through
association rule mining, helping managers carry out marketing strategy
and plan the market framework. In this paper, we apply fuzzy partition
methods to determine the membership functions of the quantitative values of
each transaction item. In addition, managers can express the
importance of items as linguistic terms, which are transformed into
fuzzy sets of weights. Next, fuzzy weighted frequent pattern growth
(FWFP-Growth) is used to complete the data mining process. The
method is expected to improve on the Apriori algorithm in the overall
efficiency of association rule mining. An example is given to
illustrate the proposed approach.
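The fuzzification and weighting steps described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's FWFP-Growth implementation: the three regions (Low, Middle, High), the triangular weight terms, the centroid defuzzification, and the names `fuzzy_partition` and `weighted_fuzzy_support` are all hypothetical, and the tree-building and mining stages are omitted.

```python
# Hypothetical linguistic importance terms as triangular fuzzy numbers (a, b, c).
WEIGHT_TERMS = {
    "Unimportant": (0.0, 0.0, 0.5),
    "Ordinary": (0.0, 0.5, 1.0),
    "Important": (0.5, 1.0, 1.0),
}


def fuzzy_partition(x, lo, mid, hi):
    """Map a quantity to membership degrees in Low, Middle and High.

    Assumes lo < mid < hi. Low is 1 at lo and falls to 0 at mid,
    Middle peaks at mid, and High rises from mid to 1 at hi.
    """
    x = min(max(x, lo), hi)  # clamp to the partition range
    if x <= mid:
        middle = (x - lo) / (mid - lo)
        return {"Low": 1.0 - middle, "Middle": middle, "High": 0.0}
    high = (x - mid) / (hi - mid)
    return {"Low": 0.0, "Middle": 1.0 - high, "High": high}


def centroid(tfn):
    """Reduce a triangular fuzzy number to a crisp weight (assumed choice)."""
    return sum(tfn) / 3.0


def weighted_fuzzy_support(transactions, item, region, importance, bounds):
    """Weighted fuzzy support of (item, region) over all transactions.

    Transactions that do not contain the item contribute zero.
    """
    total = sum(
        fuzzy_partition(t[item], *bounds)[region]
        for t in transactions
        if item in t
    )
    return centroid(WEIGHT_TERMS[importance]) * total / len(transactions)


if __name__ == "__main__":
    # Hypothetical purchase quantities, for illustration only.
    db = [{"milk": 5}, {"milk": 10}, {"bread": 2}]
    print(weighted_fuzzy_support(db, "milk", "Middle", "Important", (0, 5, 10)))
```

Item-region pairs whose weighted support meets a minimum threshold would then be the candidates passed on to a frequent-pattern mining stage.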
Abstract: In practice, we often come across situations where it is
necessary to make decisions based on incomplete or uncertain data.
In control systems, this may be because the exact mathematical
model is unknown, or because the model is excessively complex (e.g.
nonlinear) and must be simplified or solved using a rule base. In
the case of databases, a search uses a similarity measure to
compare the requirements of the selection with the stored data,
where both the select query and the data itself may contain vague
terms, for example in the form of linguistic qualifiers. In this paper,
we focus on the processing of uncertain data in databases and
demonstrate it on an example of multi-criteria decision making in the
selection of variants specified by a large number of technical
parameters.
Abstract: This study examines the use of the persuasive strategy
of deixis and personalization in advertising slogans. This rhetorical/
stylistic and linguistic strategy has been found to be widely used in
advertising slogans for over a century. A total of five hundred
advertising slogans of multinational companies in both product and
service sectors were obtained. The analysis reveals the 3 main
components of this strategy as being deictic words, absolute
uniqueness and personal pronouns. The percentage and mean of the
use of the 3 components are tabulated. The findings show that
advertisers have used this persuasive strategy in creative ways to
persuade consumers to buy their products and services.
Abstract: The frontal area in the brain is known to be involved in
behavioral judgement. Because a Kanji character can be discriminated
visually and linguistically from other characters, in Kanji character
discrimination, we hypothesized that frontal event-related potential
(ERP) waveforms reflect two discrimination processes in separate
time periods: one based on visual analysis and the other based
on lexical access. To examine this hypothesis, we recorded ERPs
while subjects performed a Kanji lexical decision task. In this task, either a
known Kanji character, an unknown Kanji character or a symbol was
presented and the subject had to report if the presented character was
a known Kanji character for the subject or not. The same response
was required for unknown Kanji trials and symbol trials. As a preprocessing
step, we examined the performance of an artifact rejection method
based on independent component analysis, found
it effective, and therefore used it. In the ERP results, there
were two time periods in which the frontal ERP waveforms were
significantly different between the unknown Kanji trials and the
symbol trials: around 170ms and around 300ms after stimulus onset.
This result supported our hypothesis. In addition, the result suggests
that Kanji character lexical access may be fully completed by around
260ms after stimulus onset.
Abstract: This paper introduces Burushaski, an isolated and unique ancient language spoken in Hunza, Nagar, Yasin and parts of Gilgit in the Northern Areas of Pakistan. It explains the working mechanism of a Multi Language Text Editor for Urdu and Burushaski, developed using the ISO/IEC 10646 Unicode standard and OpenType fonts for Urdu and Burushaski. The editor gives this regional ancient language access to modern information technology for its promotion and preservation. The main objective of this research is to help preserve the heritage of such rare languages and to provide a smart means of automation. It also assists those who are interested in undertaking research on Burushaski or keen to trace the phonetic relationship between Urdu, the national language, and Burushaski. Since the editor covers both Burushaski and Urdu, it can play an important role in introducing Burusho linguistic culture to the world at large. As a result of this research, Burushaski publication through IT means becomes possible.
Abstract: The paper presents the method developed to assess
rating points of objects with qualitative indexes. The novelty of the
method lies in the authors' use of linguistic scales, which allow
the values of the indexes to be formalized with the help of fuzzy sets. As
a result, it is possible to operate correctly with dissimilar indexes on
a unified basis and to obtain stable final results. The obtained rating
points are used in decision making based on fuzzy expert opinions.
Abstract: This paper is concerned with the production of an Arabic word semantic similarity (WSS) benchmark dataset. It is the first of its kind for Arabic, developed specifically to assess the accuracy of word semantic similarity measurements. Semantic similarity is an essential component of numerous applications in fields such as natural language processing, artificial intelligence, linguistics, and psychology. Most of the reported work has been done for English. To the best of our knowledge, there is no word similarity measure developed specifically for Arabic. In this paper, an Arabic benchmark dataset of 70 word pairs is presented. New methods and the best available techniques have been used in this study to produce the Arabic dataset. This includes selecting and creating materials, collecting human ratings from a representative sample of participants, and calculating the overall ratings. This dataset will make a substantial contribution to future work in the field of Arabic WSS, and we hope it will serve as a reference basis from which to evaluate and compare different methodologies in the field.
Abstract: The increasing number of vehicles and a lack of awareness among road users may lead to road accidents. However, no specific literature was found that ranks vehicles involved in accidents based on fuzzy variables of road users. This paper proposes a ranking of four selected motor vehicles involved in road accidents. Human and non-human factors that are normally linked with road accidents are considered for the ranking. The imprecision or vagueness inherent in the subjective assessment of the experts has led to the application of fuzzy set theory to deal with ranking problems. Data in the form of linguistic variables were collected from three authorised personnel of three Malaysian Government agencies. The multi-criteria decision making method fuzzy TOPSIS was applied in the computational procedures. The analysis shows that motorcycles yielded the highest closeness coefficient, at 0.6225. A ranking can be drawn using the magnitude of the closeness coefficient, and motorcycles were ranked first.
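The closeness coefficient used for ranking in fuzzy TOPSIS can be sketched as below. This is a simplified illustration, not the study's computation: it assumes ratings and weights are triangular fuzzy numbers already normalized to [0, 1], uses (1, 1, 1) and (0, 0, 0) as the fuzzy positive and negative ideal solutions, and measures distance with the vertex method. The function names and all input values are hypothetical and do not reproduce the reported 0.6225.

```python
from math import sqrt


def tfn_mul(a, b):
    """Approximate product of two triangular fuzzy numbers (component-wise)."""
    return tuple(x * y for x, y in zip(a, b))


def tfn_dist(a, b):
    """Vertex-method distance between two triangular fuzzy numbers."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / 3.0)


def closeness(ratings, weights):
    """Closeness coefficient of one alternative.

    ratings: one normalized triangular fuzzy number per criterion.
    weights: one triangular fuzzy weight per criterion.
    """
    fpis, fnis = (1.0, 1.0, 1.0), (0.0, 0.0, 0.0)
    weighted = [tfn_mul(r, w) for r, w in zip(ratings, weights)]
    d_plus = sum(tfn_dist(v, fpis) for v in weighted)   # distance to ideal
    d_minus = sum(tfn_dist(v, fnis) for v in weighted)  # distance to anti-ideal
    return d_minus / (d_plus + d_minus)


if __name__ == "__main__":
    # Hypothetical fuzzy ratings for two alternatives on two criteria.
    weights = [(0.7, 0.9, 1.0), (0.3, 0.5, 0.7)]
    alternatives = {
        "A": [(0.7, 0.9, 1.0), (0.5, 0.7, 0.9)],
        "B": [(0.1, 0.3, 0.5), (0.3, 0.5, 0.7)],
    }
    scores = {name: closeness(r, weights) for name, r in alternatives.items()}
    for name in sorted(scores, key=scores.get, reverse=True):
        print(name, round(scores[name], 4))
```

Sorting the alternatives by descending closeness coefficient gives the ranking; the alternative closest to the ideal solution is ranked first.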
Abstract: Trust management is one of the weak points of Peer-to-Peer (P2P) systems. The lack of centralized control makes it difficult to control the behavior of the peers. A reputation system is one approach to providing trust assessment in a P2P system. In this paper, we use fuzzy logic to model trust in a P2P environment. Our trust model combines first-hand (direct experience) and second-hand (reputation) information to allow peers to represent and reason with uncertainty regarding other peers' trustworthiness. Fuzzy logic can help in handling the imprecise nature and uncertainty of trust. Linguistic labels are used to enable peers to assign a trust level intuitively. Our fuzzy trust model is flexible in that inference rules are used to weight first-hand and second-hand information accordingly.
Abstract: With the increasing spread of computers and the internet among culturally, linguistically and geographically diverse communities, issues of internationalization and localization are becoming increasingly important. For some of these issues, such as different scales for length and temperature, there is a well-developed measurement theory. For others, such as date formats, no such theory will be possible. This paper fills a gap by developing a measurement theory for a previously overlooked class of scales, based on discrete and interval-valued scales such as spanner and shoe sizes. The paper gives a theoretical foundation for a class of data representation problems.
Abstract: In this paper we describe the recognition process of Greek compound words using the PC-KIMMO software. We try to show certain limitations of the system with respect to the principles of compound formation in Greek. Moreover, we discuss the computational processing of phenomena such as stress and syllabification which are indispensable for the analysis of such constructions and we try to propose linguistically-acceptable solutions within the particular system.
Abstract: The growing volume of information on the
internet creates an increasing need to develop new (semi)automatic
methods for retrieving documents and ranking them according to
their relevance to the user query. In this paper, after a brief review
of ranking models, a new ontology based approach for ranking
HTML documents is proposed and evaluated in various
circumstances. Our approach is a combination of conceptual,
statistical and linguistic methods. This combination preserves the
precision of ranking without losing speed. Our approach
exploits natural language processing techniques for extracting
phrases and stemming words. Then an ontology based conceptual
method will be used to annotate documents and expand the query.
To expand a query the spread activation algorithm is improved so
that the expansion can be done in various aspects. The annotated
documents and the expanded query will be processed to compute
the relevance degree exploiting statistical methods. The outstanding
features of our approach are (1) combining conceptual, statistical
and linguistic features of documents, (2) expanding the query with
its related concepts before comparing to documents, (3) extracting
and using both words and phrases to compute relevance degree, (4)
improving the spread activation algorithm to do the expansion based
on weighted combination of different conceptual relationships and
(5) allowing variable document vector dimensions. A ranking
system called ORank is developed to implement and test the
proposed model. The test results will be included at the end of the
paper.
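Query expansion by spread activation over weighted conceptual relationships, as described in feature (4) of this abstract, can be sketched as follows. This is a generic illustration and not ORank's algorithm: the graph layout, the relation names, the relation weights, the decay factor, the threshold and the function name `expand_query` are all assumptions.

```python
def expand_query(seeds, graph, rel_weights, decay=0.8, threshold=0.1, max_hops=2):
    """Spread activation from seed concepts through an ontology graph.

    graph maps a concept to a list of (neighbor, relation) pairs.
    rel_weights maps a relation name to a weight in [0, 1].
    Returns a dict of concept -> activation; seeds have activation 1.0.
    """
    activation = {concept: 1.0 for concept in seeds}
    frontier = dict(activation)
    for _ in range(max_hops):
        next_frontier = {}
        for concept, act in frontier.items():
            for neighbor, relation in graph.get(concept, []):
                spread = act * decay * rel_weights.get(relation, 0.0)
                if spread < threshold:
                    continue  # too weak to propagate further
                if spread > activation.get(neighbor, 0.0):
                    # keep the strongest activation seen for each concept
                    activation[neighbor] = spread
                    next_frontier[neighbor] = spread
        frontier = next_frontier
    return activation


if __name__ == "__main__":
    # Hypothetical ontology fragment and relation weights.
    ontology = {
        "car": [("vehicle", "superclass"), ("wheel", "part")],
        "vehicle": [("truck", "subclass")],
    }
    weights = {"superclass": 0.9, "part": 0.5, "subclass": 0.6}
    for concept, act in expand_query(["car"], ontology, weights).items():
        print(concept, round(act, 4))
```

The resulting activations could serve as term weights for the expanded query when it is compared against annotated documents.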
Abstract: The notion of communicative competence has been deemed fuzzy in communication studies. This fuzziness has led to tensions among engineers across tenures in interpreting what constitutes communicative competence. The study seeks to investigate novice and professional engineers' understanding of this notion in terms of two main elements of communicative competence: linguistic and rhetorical competence. Novice engineers are final year engineering students, whilst professional engineers are engineers who have at least 5 years of working experience. Novice and professional engineers were interviewed to gauge their perceptions of the linguistic and rhetorical features deemed necessary to enhance communicative competence for the profession. Both groups indicated awareness of, and differed on, the importance of the sub-sets of communicative competence, namely rhetorical explanatory competence, linguistic oral immediacy competence, technical competence and meta-cognitive competence. Such differences, possibly attributable to the learning theory, inadvertently indicate subtle differences in the way novice and professional engineers perceive communicative competence.
Abstract: The literature offers metrics for assessing the
quality of reusable components, but a framework that uses
these metrics to precisely predict the reusability of software components
still needs to be worked out. If identified
in the design phase or even in the coding phase, these reusability metrics can help
reduce rework by improving the quality of reuse of the software component
and hence improve productivity through a probable increase in
the reuse level. Because the CK metric suite is the most widely used set of metrics for
extracting structural features of object-oriented (OO) software,
this study uses a tuned CK metric suite, i.e. WMC, DIT, NOC, CBO
and LCOM, to obtain the structural analysis of OO-based
software components. An algorithm is proposed in which the
tuned values of the OO software component are given as inputs to a
K-Means clustering system, and a decision tree is
formed with 10-fold cross validation of the data to evaluate the component in
terms of a linguistic reusability value. The developed
reusability model has produced high precision results as desired.
Abstract: In this paper, based on a novel synthesis, a set of new simplified circuit designs implementing the linguistic-hedge operations for adjusting the fuzzy membership function set is presented. The circuits work in current mode and employ floating-gate MOS (FGMOS) transistors that operate in the weak inversion region. Compared to other proposed circuits, these circuits feature a substantial reduction in the number of elements, low supply voltage (0.7V), low power consumption (60dB). In this paper, a set of fuzzy linguistic hedge circuits, including absolutely, very, much more, more, plus minus, more or less and slightly, has been implemented in a 0.18 µm CMOS process. Simulation results from Hspice confirm the validity of the proposed design technique and show the high performance of the circuits.
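For readers unfamiliar with linguistic hedges, the operations these circuits implement are commonly modeled in software as power functions applied to a membership degree. The sketch below shows only the two classical cases, concentration ("very", exponent 2) and dilation ("more or less", exponent 0.5); the exponents the circuits realize for the other hedges are not stated in the abstract, so none are assumed here. The name `apply_hedge` is hypothetical.

```python
# Classical power-function model of two linguistic hedges.
HEDGE_EXPONENTS = {
    "very": 2.0,          # concentration: sharpens the fuzzy set
    "more or less": 0.5,  # dilation: widens the fuzzy set
}


def apply_hedge(mu, hedge):
    """Return the hedged membership degree for mu in [0, 1]."""
    if not 0.0 <= mu <= 1.0:
        raise ValueError("membership degree must lie in [0, 1]")
    return mu ** HEDGE_EXPONENTS[hedge]


if __name__ == "__main__":
    for hedge in HEDGE_EXPONENTS:
        print(hedge, apply_hedge(0.25, hedge))
```

An exponent above 1 lowers every intermediate membership degree, while an exponent below 1 raises it; degrees of exactly 0 and 1 are unchanged.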
Abstract: This research uses computational linguistics, an area of study that employs a computer to process natural language, and aims at discerning the patterns that exist in declarative sentences used in technical texts. The approach is mathematical, and the focus is on instructional texts found on web pages. The technique developed by the author and named the MAYA Semantic Technique is used here and organized into four stages. In the first stage, the parts of speech in each sentence are identified. In the second stage, the subject of the sentence is determined. In the third stage, MAYA performs a frequency analysis on the remaining words to determine the verb and its object. In the fourth stage, MAYA does statistical analysis to determine the content of the web page. The advantage of the MAYA Semantic Technique lies in its use of mathematical principles to represent grammatical operations which assist processing and accuracy if performed on unambiguous text. The MAYA Semantic Technique is part of a proposed architecture for an entire web-based intelligent tutoring system. On a sample set of sentences, partial semantics derived using the MAYA Semantic Technique were approximately 80% accurate. The system currently processes technical text in one domain, namely Cµ programming. In this domain all the keywords and programming concepts are known and understood.
Abstract: Prospective readers can quickly determine whether a document is relevant to their information need if the significant phrases (or keyphrases) in this document are provided. Although keyphrases are useful, not many documents have keyphrases assigned to them, and manually assigning keyphrases to existing documents is costly. Therefore, there is a need for automatic keyphrase extraction. This paper introduces a new domain independent keyphrase extraction algorithm. The algorithm approaches the problem of keyphrase extraction as a classification task, and uses a combination of statistical and computational linguistics techniques, a new set of attributes, and a new machine learning method to distinguish keyphrases from non-keyphrases. The experiments indicate that this algorithm performs better than other keyphrase extraction tools and that it significantly outperforms Microsoft Word 2000's AutoSummarize feature. The domain independence of this algorithm has also been confirmed in our experiments.
Abstract: This case study investigates the effects of reactive
focus on form through negotiation on the linguistic development of
an adult EFL learner in an exclusive private EFL classroom. The
findings revealed that in this classroom negotiated feedback occurred
significantly more often than non-negotiated feedback. However, it
was also found that in the long run the learner was significantly more
successful in correcting his own errors when he had received non-negotiated
feedback than negotiated feedback. This study, therefore,
argues that although negotiated feedback seems to be effective for
some learners in the short run, it is non-negotiated feedback which
seems to be more effective in the long run. This long lasting effect
might be attributed to the impact of the schooling system, which is itself
indicative of the dominant culture, or to the absence of other
interlocutors in the course of interaction.
Abstract: In this paper we propose an NLP-based method for
Ontology Population from texts and apply it to semi-automatically
instantiate a Generic Knowledge Base (Generic Domain Ontology) in
the risk management domain. The approach is semi-automatic and
uses a domain expert intervention for validation. The proposed
approach relies on a set of Instances Recognition Rules based on
syntactic structures, and on the predicative power of verbs in the
instantiation process. It is not domain dependent since it heavily
relies on linguistic knowledge.
A description of an experiment performed on a part of the
ontology of the PRIMA project (supported by the European
Community) is given. A first validation of the method is done by
populating this ontology with Chemical Fact Sheets from the
Environmental Protection Agency. The results of this experiment
complete the paper and support the hypothesis that relying on the
predicative power of verbs in the instantiation process improves the
performance.
Abstract: This paper describes a system in which various text summarization methods can be adapted to Polish. The structure of the system is presented, and its modular construction and access via the Internet are outlined.