Information Retrieval: Improving Question Answering Systems by Query Reformulation and Answer Validation

Question answering (QA) aims at retrieving precise information from a large collection of documents. Most question answering systems are composed of three main modules: question processing, document processing and answer processing. The question processing module plays an important role in QA systems by reformulating questions. Answer processing is an emerging topic in QA systems, which are often required to rank and validate candidate answers. These techniques, which aim at finding short and precise answers, are often based on semantic relations and co-occurring keywords. This paper discusses a new model for question answering that improves two main modules, question processing and answer processing, both of which affect the evaluation of the system's operation. Question processing rests on two important components. The first is question classification, which specifies the types of question and answer. The second is reformulation, which converts the user's question into a form that the QA system can understand in a specific domain. The objective of the answer validation task is to judge the correctness of an answer returned by a QA system according to the text snippet given to support it. To validate answers, we apply candidate answer filtering and candidate answer ranking, followed by a final validation step based on user voting. This paper also describes a new architecture for the question and answer processing modules, covering the modeling, implementation and evaluation of the system. The system differs from most question answering systems in its answer validation model, which makes it better suited to finding exact answers. In an evaluation on a total of 50 questions, the model showed a 92% improvement in the system's decisions.

A New Approach of Fuzzy Methods for Evaluating of Hydrological Data

The main design criteria for most hydraulic structures are essentially based on runoff or discharge of water. Two important criteria are runoff and return period. These measures are mostly calculated or estimated from stochastic data. Another feature of hydrological data is their imprecision. Therefore, to deal with both uncertainty and imprecision, a new fuzzy method for evaluating hydrological measures is developed, based on Buckley's estimation method. The method introduces triangular fuzzy numbers for different measures, in which both uncertainty and imprecision are considered. In addition, since another important consideration in most hydrological studies is the comparison of a measure across different months or years, a new fuzzy comparison method, consistent with the special form of the proposed fuzzy numbers, is also developed. Finally, to illustrate the methods more explicitly, the two algorithms are tested on one simple example and a real case study.

Fuzzy Shortest Paths Approximation for Solving the Fuzzy Steiner Tree Problem in Graphs

In this paper, we deal with the Steiner tree problem (STP) on a graph in which a fuzzy number, instead of a real number, is assigned to each edge. We propose a modification of the shortest paths approximation based on the fuzzy shortest paths (FSP) evaluations. Since a fuzzy min operation using the extension principle leads to nondominated solutions, we propose another approach to solving the FSP using Cheng's centroid point fuzzy ranking method.
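
The idea of comparing fuzzy path lengths through a ranking index can be sketched as follows. This is a minimal illustration, not the authors' algorithm: it assumes triangular fuzzy edge weights written as (low, mode, high) and ranks them by the x-coordinate of the centroid, (low + mode + high) / 3, which is a simplification standing in for Cheng's centroid point index. The function names and graph layout are hypothetical.

```python
import heapq

def add_tfn(p, q):
    """Add two triangular fuzzy numbers given as (low, mode, high)."""
    return (p[0] + q[0], p[1] + q[1], p[2] + q[2])

def centroid_x(t):
    """x-coordinate of the centroid of a triangular fuzzy number (assumed ranking index)."""
    return (t[0] + t[1] + t[2]) / 3.0

def fuzzy_shortest_path(graph, source, target):
    """Dijkstra-style search in which fuzzy path lengths are compared by centroid_x.

    graph: dict mapping node -> list of (neighbor, triangular fuzzy weight).
    Assumes non-negative weights. Returns (fuzzy length, path),
    or (None, []) if the target is unreachable.
    """
    start = (0.0, 0.0, 0.0)
    heap = [(0.0, source, start, [source])]
    done = set()
    while heap:
        _, node, dist, path = heapq.heappop(heap)
        if node in done:
            continue
        done.add(node)
        if node == target:
            return dist, path
        for nxt, weight in graph.get(node, []):
            if nxt not in done:
                new_dist = add_tfn(dist, weight)
                heapq.heappush(heap, (centroid_x(new_dist), nxt, new_dist, path + [nxt]))
    return None, []
```

Because this particular index is additive, the search behaves like ordinary Dijkstra on defuzzified weights while still returning the full fuzzy length of the chosen path. A ranking index that is not additive would need a different search strategy.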

Ranking Alternatives in Multi-Criteria Decision Analysis using Common Weights Based on Ideal and Anti-ideal Frontiers

One of the most important issues in multi-criteria decision analysis (MCDA) is to determine the weights of criteria so that all alternatives can be compared based on the collective performance of criteria. In this paper, one of the popular methods in data envelopment analysis (DEA), known as common weights (CWs), is used to determine the weights in MCDA. Two frontiers, named the ideal and anti-ideal frontiers, are defined in place of ideal and anti-ideal alternatives, based on two newly proposed CWs models. Ideal and anti-ideal frontiers are more flexible than ideal and anti-ideal alternatives. From the optimal solutions of these two models, the distances of an alternative from the ideal and anti-ideal frontiers are derived. Then, a relative distance is introduced to measure the value of each alternative. The suggested models are linear and remain feasible despite weight restrictions. An example is presented to explain the method and to compare it with the existing literature.

Systematic Analysis of Dynamic Association of Health Outcomes with Computer Usage for Office Staff

This paper systematically investigates the time-dependent health outcomes for office staff during computer work using a developed mathematical model. The model describes time-dependent health outcomes in multiple body regions associated with computer usage. The association is explicitly presented as a dose-response relationship parametrized by body region parameters. Using the developed model, we perform extensive static and dynamic investigations of the health outcomes. We compare the body regions at risk and provide various severity rankings of the changes in discomfort rate with respect to computer-related workload over time for the study population. Application of the developed model reveals a wide range of findings. Such a broad spectrum of investigations in a single report is lacking in the literature. The model analysis shows that the highest average severity levels of discomfort occur in the neck, shoulder, eyes, shoulder joint/upper arm, upper back, low back and head. The biggest weekly changes in discomfort rates are in the eyes, neck, head, shoulder, shoulder joint/upper arm and upper back. The fastest discomfort rate is found in the neck, followed by the shoulder, eyes, head, shoulder joint/upper arm and upper back. Most of our findings are consistent with the literature, which demonstrates that the developed model and results are applicable and valuable and can be used to assess the correlation between the amount of computer-related workload and health risk.

Online Brands: A Comparative Study of World Top Ranked Universities with Science and Technology Programs

University websites are considered one of the primary brand touch points for multiple stakeholders, but most of them do not have designs that create favorable impressions. Some of the elements that web designers should carefully consider are appearance, content, functionality, usability and search engine optimization. However, priority should be placed on website simplicity and negative space. In terms of content, previous research suggests that universities should include reputation, learning environment, graduate career prospects, destination image, cultural integration and a virtual tour on their websites. The study examines how the top 200 world-ranked science and technology-based universities present their brands online and whether their websites capture these content dimensions. Content analysis of the websites revealed that the top-ranking universities captured these dimensions to varying degrees. In addition, the UK-based university gave higher priority to website simplicity and negative space than the Malaysian-based university.

An MCDM Approach to Selection Scheduling Rule in Robotic Flexible Assembly Cells

Multiple criteria decision making (MCDM) is an approach to ranking solutions and finding the best one when two or more solutions are available. In this study, an MCDM approach is proposed to select the most suitable scheduling rule for robotic flexible assembly cells (RFACs). Two MCDM methods, the Analytic Hierarchy Process (AHP) and the Technique for Order Preference by Similarity to Ideal Solution (TOPSIS), are used to solve the scheduling rule selection problem. The AHP method is employed to determine the weights of the evaluation criteria, while the TOPSIS method is employed to obtain the final ranking order of the scheduling rules. Four criteria are used to evaluate the scheduling rules, and four RFAC scheduling policies are examined to choose the most appropriate one. A numerical example illustrates the application of the suggested methodology. The results show that the methodology is practical and works in RFAC settings.
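
The TOPSIS ranking step mentioned above can be sketched as follows. This is a minimal sketch of the standard crisp TOPSIS procedure under stated assumptions (vector normalization, Euclidean distances); the abstract does not give the paper's criteria, data or implementation, so the function name and argument layout are hypothetical.

```python
import math

def topsis(matrix, weights, benefit):
    """Return the TOPSIS closeness coefficient of each alternative.

    matrix:  rows are alternatives, columns are criteria (crisp scores).
    weights: one weight per criterion, for example taken from AHP.
    benefit: one bool per criterion; True if larger is better, False for a cost.
    Assumes no all-zero column and at least two distinct alternatives.
    """
    n_alt, n_crit = len(matrix), len(weights)
    # Vector-normalize each column, then apply the criterion weight.
    norms = [math.sqrt(sum(matrix[i][j] ** 2 for i in range(n_alt)))
             for j in range(n_crit)]
    v = [[weights[j] * matrix[i][j] / norms[j] for j in range(n_crit)]
         for i in range(n_alt)]
    cols = list(zip(*v))
    # Ideal and anti-ideal solutions, taken criterion by criterion.
    ideal = [max(c) if benefit[j] else min(c) for j, c in enumerate(cols)]
    anti = [min(c) if benefit[j] else max(c) for j, c in enumerate(cols)]
    scores = []
    for row in v:
        d_plus = math.dist(row, ideal)   # distance to the ideal solution
        d_minus = math.dist(row, anti)   # distance to the anti-ideal solution
        scores.append(d_minus / (d_plus + d_minus))
    return scores
```

Alternatives are then ranked in descending order of the returned coefficient, so a value closer to 1 indicates an alternative nearer to the ideal solution.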

Ranking Fuzzy Numbers Based On Epsilon-Deviation Degree

Nejad and Mashinchi (2011) proposed a revision for ranking fuzzy numbers based on the areas of the left and the right sides of a fuzzy number. However, this method still has some shortcomings, such as a lack of discriminative power when ranking similar fuzzy numbers and no guarantee of consistency between the ranking of fuzzy numbers and the ranking of their images. To overcome these drawbacks, we propose an epsilon-deviation degree method based on the left area and the right area of a fuzzy number and on the concept of the centroid point. The main advantage of the new approach is an innovative index value that can be used to evaluate and rank fuzzy numbers consistently. Numerical examples are presented to illustrate the efficiency and superiority of the proposed method.

Development of Decision Support System for House Evaluation and Purchasing

Home is important for Chinese people. Because the information that most real estate agencies provide on house attributes and surrounding environments is incomplete, most house buyers find it difficult to consider all factors effectively and can only search for candidates with a sorting-based approach. This study aims to develop a decision support system for house purchasing in which the surrounding facilities of each house are quantified. All considered house factors and customer preferences are then incorporated into the Simple Multi-Attribute Ranking Technique (SMART) to support house evaluation. To evaluate the validity of the proposed approach, an empirical study was conducted with data from a real estate agency. Based on customer requirements and preferences, the proposed approach can identify better candidate houses by considering the overall house attributes and surrounding facilities.
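
A SMART-style evaluation can be sketched as a weighted additive score. This is a minimal sketch under assumptions, not the system described above: the linear 0-1 rescaling of each attribute is one common choice, and the attribute names used in the usage note are hypothetical.

```python
def smart_scores(houses, weights, higher_is_better):
    """Weighted additive scores in [0, 1] for each candidate house.

    houses:           dict name -> dict attribute -> numeric value.
    weights:          dict attribute -> raw importance weight (any positive scale).
    higher_is_better: dict attribute -> bool (False for costs such as price).
    """
    total = sum(weights.values())
    norm_w = {attr: w / total for attr, w in weights.items()}  # weights sum to 1
    scores = {}
    for name, attrs in houses.items():
        score = 0.0
        for attr, w in norm_w.items():
            values = [h[attr] for h in houses.values()]
            lo, hi = min(values), max(values)
            # Linear 0-1 rescaling; an attribute with no spread gets a neutral 0.5.
            u = 0.5 if hi == lo else (attrs[attr] - lo) / (hi - lo)
            if not higher_is_better[attr]:
                u = 1.0 - u
            score += w * u
        scores[name] = score
    return scores
```

For example, calling it with hypothetical attributes such as `price` (a cost) and `nearby_parks` (a benefit) gives each house a single score that reflects both the customer's weights and the quantified surroundings, which is what allows a ranking beyond single-attribute sorting.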

Multipath Routing Sensor Network for Finding Crack in Metallic Structure Using Fuzzy Logic

To collect data from all sensor nodes, some changes to the Dynamic Source Routing (DSR) protocol are proposed. At each hop level, a route-ranking technique is used to distribute packets dynamically across the selected routes. The rank of a route is calculated from parameters such as delay, residual energy and probability of packet loss. A hybrid topology of DMPR (Disjoint Multi Path Routing) and MMPR (Meshed Multi Path Routing) is formed, in which a braided topology is used in the faulty zones of the network. To reduce energy consumption, variable transmission ranges are used instead of a fixed transmission range. To reduce the number of dropped packets, a fuzzy logic inference scheme is used to insert different types of delays dynamically. A rule-based system infers the membership function strength, which is used to calculate the final delay to be inserted at each node in the different clusters. In the braided path, a 'Dual Line ACK Link' scheme is proposed for sending an ACK signal from a damaged node or link to a parent node, so that no link-error or node-failure message is lost. This paper develops the theoretical aspects of a model that may be applied to collecting data from any large hanging iron structure with the help of a wireless sensor network. Analyzing these data is a matter for materials science and civil structural engineering and is outside the scope of this paper.

Starting Pitcher Rotation in the Chinese Professional Baseball League based on AHP and TOPSIS

The rotation of starting pitchers is a strategic issue that has a significant impact on the performance of a professional team. Choosing an optimal starting pitcher from among many alternatives is a multi-criteria decision-making (MCDM) problem. In this study, a model using the Analytic Hierarchy Process (AHP) and the Technique for Order Preference by Similarity to the Ideal Solution (TOPSIS) is proposed to arrange the starting pitcher rotation for teams of the Chinese Professional Baseball League. The AHP is used to analyze the structure of the starting pitcher selection problem and to determine the weights of the criteria, while the TOPSIS method is used to make the final ranking. An empirical analysis is conducted to illustrate the use of the model for the starting pitcher rotation problem. The results demonstrate the effectiveness and feasibility of the proposed model.
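
The AHP weighting step can be sketched as follows. This is a minimal sketch, not the paper's computation: it uses the row geometric mean approximation of the priority vector rather than the principal eigenvector, and the random index values are the commonly tabulated Saaty values for matrices of order 3 to 10. The criteria and judgments behind any input matrix are assumptions supplied by the user.

```python
import math

# Commonly tabulated random consistency index values (Saaty), by matrix order.
RANDOM_INDEX = {3: 0.58, 4: 0.90, 5: 1.12, 6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45, 10: 1.49}

def ahp_weights(pairwise):
    """Approximate AHP priority weights with the row geometric mean method.

    pairwise: square reciprocal matrix, pairwise[i][j] = importance of i over j.
    """
    n = len(pairwise)
    geo_means = [math.prod(row) ** (1.0 / n) for row in pairwise]
    total = sum(geo_means)
    return [g / total for g in geo_means]

def consistency_ratio(pairwise, weights):
    """Consistency ratio CR = CI / RI for a matrix of order 3 to 10."""
    n = len(pairwise)
    # Estimate lambda_max as the average of (A w)_i / w_i.
    lambda_max = sum(
        sum(pairwise[i][j] * weights[j] for j in range(n)) / weights[i]
        for i in range(n)
    ) / n
    consistency_index = (lambda_max - n) / (n - 1)
    return consistency_index / RANDOM_INDEX[n]
```

A consistency ratio below about 0.1 is the usual rule of thumb for accepting the pairwise judgments. The resulting weights can then be passed to a ranking method such as TOPSIS.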

A New Fuzzy DSS/ES for Stock Portfolio Selection using Technical and Fundamental Approaches in Parallel

A Decision Support System/Expert System for stock portfolio selection is presented. In the first phase, both technical and fundamental data are used to estimate technical and fundamental return and risk; in the second phase, the estimated values are aggregated with the investor's preferences to produce a suitable stock portfolio. The first phase uses two expert systems, one responsible for technical estimation and the other for fundamental estimation. In the technical expert system, twenty-seven candidate variables are identified for each stock, and the effective variables are selected using a rough sets-based clustering (RC) method. Next, two fuzzy rule-bases are developed for each stock with the fuzzy C-Means method and the Takagi-Sugeno-Kang (TSK) approach, one for return estimation and the other for risk. The parameters of the rule-bases are then tuned with the backpropagation method. In parallel, for the fundamental expert system, fuzzy rule-bases in the form of "IF-THEN" rules are identified through brainstorming with stock market experts, with input data derived from financial statements; as a result, two fuzzy rule-bases are generated for all the stocks, one for return and the other for risk. In the second phase, user preferences are represented by four criteria and obtained by questionnaire. Using an expert system, the four estimated values of return and risk are aggregated with the respective values of user preference. Finally, a fuzzy rule-base with four rules processes these values and produces a ranking score for each stock, which leads to a satisfactory portfolio for the user. Data were gathered for the stocks of six manufacturing companies over the period 2003-2006.

Method for Solving Fully Fuzzy Assignment Problems Using Triangular Fuzzy Numbers

In this paper, a new method is proposed to find the fuzzy optimal solution of fuzzy assignment problems by representing all the parameters as triangular fuzzy numbers. The advantages of the proposed method are also discussed. To illustrate the method, a fuzzy assignment problem is solved and the results obtained are discussed. The proposed method is easy to understand and to apply for finding the fuzzy optimal solution of fuzzy assignment problems occurring in real-life situations.

Fuzzy Approach for Ranking of Motor Vehicles Involved in Road Accidents

The increasing number of vehicles and a lack of awareness among road users may lead to road accidents. However, no specific literature was found that ranks vehicles involved in accidents based on fuzzy variables of road users. This paper proposes a ranking of four selected motor vehicles involved in road accidents. Human and non-human factors that are normally linked with road accidents are considered for the ranking. The imprecision or vagueness inherent in the subjective assessment of experts has led to the application of fuzzy set theory to ranking problems. Data in the form of linguistic variables were collected from three authorised personnel of three Malaysian Government agencies. The multi-criteria decision-making method fuzzy TOPSIS was applied in the computational procedures. The analysis shows that motorcycles yielded the highest closeness coefficient, at 0.6225. A ranking can be drawn from the magnitude of the closeness coefficient, and motorcycles were ranked first.
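
For reference, the closeness coefficient in fuzzy TOPSIS is usually defined as below. This is the commonly used formulation with triangular fuzzy numbers and the vertex distance; the abstract does not state which variant the paper applies, so the notation here is an assumption.

```latex
% Vertex distance between triangular fuzzy numbers
% \tilde{m} = (m_1, m_2, m_3) and \tilde{n} = (n_1, n_2, n_3):
d(\tilde{m}, \tilde{n}) = \sqrt{\tfrac{1}{3}\left[(m_1 - n_1)^2 + (m_2 - n_2)^2 + (m_3 - n_3)^2\right]}

% Distances of alternative i from the fuzzy ideal (v_j^{*}) and
% anti-ideal (v_j^{-}) solutions over the k criteria:
d_i^{*} = \sum_{j=1}^{k} d(\tilde{v}_{ij}, \tilde{v}_j^{*}), \qquad
d_i^{-} = \sum_{j=1}^{k} d(\tilde{v}_{ij}, \tilde{v}_j^{-})

% Closeness coefficient, with 0 \le CC_i \le 1:
CC_i = \frac{d_i^{-}}{d_i^{*} + d_i^{-}}
```

Under this definition a larger closeness coefficient means the alternative lies closer to the fuzzy ideal solution, which is why the alternative with the highest value takes the first rank.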

A Study of the Effectiveness of the Routing Decision Support Algorithm

Multi-criteria decision making (MCDM) methods such as the analytic hierarchy process, ELECTRE and multi-attribute utility theory are critically studied. They show irregularities in the reliability with which they rank the best alternatives. The Routing Decision Support (RDS) algorithm attempts to address some of these deficiencies. This paper gives a mathematical verification that the RDS algorithm conforms to the test criteria for an effective MCDM method when a linear preference function is considered.

A Framework for Ranking Quality of Information on Weblog

The vast amount of information on the World Wide Web is created and published by many different types of providers. Unlike books and journals, most of this information is not subject to editing or peer review by experts. This lack of quality control, together with the explosion of web sites, makes the task of finding quality information on the web especially critical. Meanwhile, new facilities for producing web pages, such as Blogs, make this issue more significant, because Blogs have simple content management tools that enable non-experts to build easily updatable web diaries or online journals. However, despite a decade of active research in information quality (IQ), there is not yet a framework for measuring information quality on Blogs. This paper presents a novel experimental framework for ranking the quality of information on Weblogs. The results of data analysis revealed seven IQ dimensions for Weblogs. For each dimension, variables and related coefficients were calculated so that the presented framework is able to assess the IQ of Weblogs automatically.

Development of Accident Predictive Model for Rural Roadway

This paper presents a study comprising accident analysis, a black spot study and the development of accident prediction models, based on data collected on a rural roadway, Federal Route 50 (F050), Malaysia. Road accident trends and a black spot ranking were established for F050. The development of the accident prediction model concentrates on the Parit Raja area from KM 19 to KM 23. A multiple non-linear regression method was used to relate the discrete accident data to the road and traffic flow explanatory variables. The dependent variable was modeled as the number of crashes, namely accident point weighting; however, accident point weighting has rarely been taken into account in road accident prediction models. The results show that the existing number of major access points without traffic lights, a rise in speed, an increase in Annual Average Daily Traffic (AADT), a growing number of motorcycles and motorcars, and a reduced time gap are the potential contributors to increased accident rates on multiple rural roadways.

Principal Component Analysis-Ranking as a Variable Selection Method for the Simultaneous Spectrophotometric Determination of Phenol, Resorcinol and Catechol in Real Samples

The simultaneous determination of phenol, resorcinol and catechol in multicomponent mixtures using a chemometric technique, a PC-ranking artificial neural network (PCranking-ANN) algorithm, is reported in this study. Based on the data correlation coefficient method, 3 representative PCs are selected from the scores of the original UV spectral data (35 PCs) as the input patterns for the ANN to build a neural network model. The results were obtained after 8000 iterations. The RMSEP values for phenol, resorcinol and catechol with PCranking-ANN were 0.6680, 0.0766 and 0.1033, respectively. The calibration matrices covered 0.50-21.0, 0.50-15.1 and 0.50-20.0 μg ml-1 for phenol, resorcinol and catechol, respectively. The proposed method was successfully applied to the determination of phenol, resorcinol and catechol in synthetic and water samples.
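
The RMSEP figures quoted above are root mean square errors of prediction. A minimal sketch of the usual definition, assuming paired predicted and reference concentrations from a prediction set (the function name and the sample values in the usage note are illustrative, not data from the study):

```python
import math

def rmsep(predicted, reference):
    """Root mean square error of prediction over paired concentration values."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(p - r) ** 2 for p, r in zip(predicted, reference)]
    return math.sqrt(sum(squared_errors) / len(predicted))
```

For instance, `rmsep([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])` averages the squared errors 0, 0 and 4 and returns the square root of 4/3. The result carries the same units as the concentrations, so a lower value indicates predictions closer to the reference values.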

Language and Retrieval Accuracy

One of the major challenges in the Information Retrieval field is handling the massive amount of information available to Internet users. Existing ranking techniques and strategies that govern the retrieval process fall short of expected accuracy. Often relevant documents are buried deep in the list of documents returned by the search engine. In order to improve retrieval accuracy we examine the issue of language effect on the retrieval process. Then, we propose a solution for a more biased, user-centric relevance for retrieved data. The results demonstrate that using indices based on variations of the same language enhances the accuracy of search engines for individual users.

Study of Environmental Effects on Sunflower Oil Percent based on Graphical Method

Biplots can be used to evaluate cultivars for their oil percent potential and stability and to evaluate trial sites for their discriminating ability and representativeness. Multi-environmental trial (MET) data for the oil percent of 10 open-pollinating sunflower cultivars were analyzed to investigate genotype-environment interactions. The genotypes were evaluated in four locations with different climatic conditions in Iran in 2010. In each location, a Randomized Complete Block design with four replications was used. According to both mean and stability, Zaria, Master and R453 had the highest performance among all cultivars. The graphical analysis identified the best cultivar for each environment. Cultivars Berezans and Record performed best in Khoy and Islamabad. Zaria and R453 were the best genotypes in Sari and Karaj, followed by Master and Favorit. The GGE biplot indicated two mega-environments: the first group contained Karaj, Khoy and Islamabad, and the second contained Sari. The best discriminating location was Karaj, followed by Khoy, Islamabad and Sari. The most representative genotypes were Zaria, R453, Master and Favorit. The ranking of the ten cultivars based on their oil percent was Zaria > R453 ≈ Master ≈ Favorit > Record ≈ Berezans > Sor > Lakumka > Bulg3 > Bulg5.