A Proposed Hybrid Approach for Feature Selection in Text Document Categorization

Text document categorization involves a large amount of data, or features. The high dimensionality of the feature space is troublesome and can degrade classification performance. Feature selection is therefore considered one of the crucial steps in text document categorization. Selecting the best features to represent documents reduces the dimensionality of the feature space and hence increases performance. Many approaches have been implemented by various researchers to overcome this problem. This paper proposes a novel hybrid approach for feature selection in text document categorization based on Ant Colony Optimization (ACO) and Information Gain (IG). We also present state-of-the-art algorithms proposed by several other researchers.
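As a rough illustration of how IG scoring and an ACO-style search could be combined, the following sketch scores binary term-presence features by information gain and lets artificial ants sample feature subsets biased by pheromone and IG. All function names, parameters and the subset fitness here are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def information_gain(X, y):
    """Per-feature IG for binary term-presence features X (n_docs x n_terms)."""
    def entropy(labels):
        _, counts = np.unique(labels, return_counts=True)
        p = counts / counts.sum()
        return -np.sum(p * np.log2(p))
    base = entropy(y)
    gains = np.empty(X.shape[1])
    for j in range(X.shape[1]):
        present = X[:, j] > 0
        p = present.mean()
        h1 = entropy(y[present]) if present.any() else 0.0
        h0 = entropy(y[~present]) if (~present).any() else 0.0
        gains[j] = base - (p * h1 + (1 - p) * h0)
    return gains

def aco_select(gains, n_select=20, n_ants=10, n_iter=30,
               rho=0.1, alpha=1.0, beta=2.0, seed=0):
    """ACO-style search: pheromone on features, IG as heuristic desirability."""
    rng = np.random.default_rng(seed)
    tau = np.ones(len(gains))                  # pheromone trail per feature
    eta = gains / (gains.max() + 1e-12)        # heuristic information from IG
    best, best_score = None, -np.inf
    for _ in range(n_iter):
        for _ant in range(n_ants):
            prob = (tau ** alpha) * (eta ** beta) + 1e-12
            subset = rng.choice(len(gains), size=n_select,
                                replace=False, p=prob / prob.sum())
            score = gains[subset].sum()        # stand-in for classifier accuracy
            if score > best_score:
                best, best_score = subset, score
        tau *= (1 - rho)                       # evaporation
        tau[best] += best_score                # reinforce the best subset found
    return np.sort(best)
```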

Virtual Reality for Mutual Understanding in Landscape Planning

This paper argues that fostering mutual understanding in landscape planning is as much about the planners educating stakeholder groups as the stakeholders educating the planners. In other words, it requires an epistemological agreement as to the meaning and nature of place, especially where an effort is made to go beyond the quantitative aspects, which can be achieved through the phenomenological experience of a Virtual Reality (VR) environment. This education needs to be a bi-directional process in which distance can be both a temporal and a spatial separation of participants. There needs to be a common framework of understanding in which neither 'side' is disadvantaged during the process of information exchange, and it follows that a medium such as VR offers an effective way of overcoming some of the shortcomings of traditional media by taking advantage of continuing advances in Information and Communication Technology (ICT). In this paper we make particular reference to VR as an extension to Geographical Information Systems (GIS). VR as a two-way communication tool offers considerable potential, particularly in the area of Public Participation GIS (PPGIS). Information-rich virtual environments that can operate over broadband networks are now possible and thus allow for the representation of large amounts of qualitative and quantitative information side-by-side. Therefore, with broadband access becoming standard for households and enterprises alike, distributed virtual reality environments have great potential to contribute to enabling stakeholder participation and mutual learning within the planning context.

Development of a Catchment Water Quality Model for Continuous Simulations of Pollutants Build-up and Wash-off

Estimation of runoff water quality parameters is required to determine appropriate water quality management options. Various models are used to estimate runoff water quality parameters. However, most models provide event-based estimates of water quality parameters for specific sites. The work presented in this paper describes the development of a model that continuously simulates the accumulation and wash-off of water quality pollutants in a catchment. The model allows estimation of pollutant build-up during dry periods and pollutant wash-off during storm events. The model was developed by integrating two individual models: a rainfall-runoff model and a catchment water quality model. The rainfall-runoff model is based on the time-area runoff estimation method. The model allows users to estimate the time of concentration using a range of established methods. The model also allows estimation of the continuing runoff losses using any of the available estimation methods (i.e., constant, linearly varying or exponentially varying). Pollutant build-up in a catchment was represented by one of three pre-defined functions: power, exponential, or saturation. Similarly, pollutant wash-off was represented by one of three different functions: power, rating-curve, or exponential. The developed runoff water quality model was set up to simulate the build-up and wash-off of total suspended solids (TSS), total phosphorus (TP) and total nitrogen (TN). The application of the model was demonstrated using available runoff and TSS field data from road and roof surfaces in the Gold Coast, Australia. The model provided an excellent representation of the field data, demonstrating the simplicity yet effectiveness of the proposed model.
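For concreteness, the three build-up and three wash-off forms named above might look as follows in Python; the coefficient names (a, b, Bmax, k, c, n) and the example values are illustrative placeholders, not the paper's calibrated parameters.

```python
import numpy as np

# Pollutant build-up over t antecedent dry days.
def buildup_power(t, a, b):           return a * t ** b
def buildup_exponential(t, Bmax, k):  return Bmax * (1.0 - np.exp(-k * t))
def buildup_saturation(t, Bmax, k):   return Bmax * t / (k + t)

# Wash-off of the available surface load B during a storm with runoff rate q over dt hours.
def washoff_power(B, q, c, n, dt):    return B * (1.0 - np.exp(-c * q ** n * dt))
def washoff_rating_curve(q, c, n):    return c * q ** n   # load rate driven by flow alone
def washoff_exponential(B, k, dt):    return B * (1.0 - np.exp(-k * dt))

# Example: 5 dry days of exponential build-up, then a 2 h storm at 10 mm/h.
B = buildup_exponential(t=5.0, Bmax=120.0, k=0.4)          # load available on the surface
removed = washoff_power(B, q=10.0, c=0.05, n=1.2, dt=2.0)  # load removed by the storm
print(f"build-up: {B:.1f}, washed off: {removed:.1f}")
```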

Infrared Face Recognition Using Distance Transforms

In this work we present an efficient approach for face recognition in the infrared spectrum. In the proposed approach, physiological features are extracted from thermal images in order to build a unique thermal faceprint. A distance transform is then used to obtain an invariant representation for face recognition. The extracted physiological features are related to the distribution of blood vessels under the skin of the face. This blood network is unique to each individual and can be used for infrared face recognition. The obtained results are promising and show the effectiveness of the proposed scheme.
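A minimal sketch of the distance-transform step, assuming a binary vessel mask has already been segmented from the thermal image (the segmentation itself and any registration are omitted); the matching score shown is a simple illustrative choice, not necessarily the authors' metric.

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

def vessel_distance_map(vessel_mask):
    """Euclidean distance transform of a binary blood-vessel map: each pixel
    holds its distance to the nearest vessel pixel, giving a smooth,
    shift-tolerant representation of the thermal faceprint."""
    return distance_transform_edt(~np.asarray(vessel_mask, dtype=bool))

def match_score(mask_a, mask_b):
    """Symmetric score: mean distance of each vessel map to the other (lower is better)."""
    da, db = vessel_distance_map(mask_a), vessel_distance_map(mask_b)
    a, b = np.asarray(mask_a, bool), np.asarray(mask_b, bool)
    return 0.5 * (da[b].mean() + db[a].mean())
```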

Parallelization and Optimization of SIFT Feature Extraction on Cluster System

The Scale Invariant Feature Transform (SIFT) has been widely applied, but extracting SIFT features is complicated and time-consuming. In this paper, to meet the demands of real-time applications, SIFT is parallelized and optimized on a cluster system; the resulting implementation is named pSIFT. Redundant storage and communication are used for boundary data to improve performance, and before computation of the feature descriptors, data reallocation is adopted to maintain load balance in pSIFT. Experimental results show that pSIFT achieves good speedup and scalability.
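The abstract does not spell out the partitioning scheme, but the redundant-boundary idea can be sketched as row-wise tiling with replicated halo rows, so each worker can detect keypoints near its tile edges without mid-computation communication. The helper below is an assumption-laden illustration, not pSIFT's actual code.

```python
import numpy as np

def partition_with_halo(image, n_workers, halo):
    """Split an image row-wise across workers, replicating `halo` boundary rows.

    Returns (tile, lo, hi) per worker; a worker keeps only keypoints whose row
    lies in [lo, hi), so redundantly stored halo rows cost memory, not merging.
    """
    h = image.shape[0]
    bounds = np.linspace(0, h, n_workers + 1, dtype=int)
    return [(image[max(0, lo - halo):min(h, hi + halo)], lo, hi)
            for lo, hi in zip(bounds[:-1], bounds[1:])]
```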

Voltage Stability Investigation of Grid-Connected Wind Farm

At present, it is very common to find renewable energy resources, especially wind power, connected to distribution systems. The impact of this wind power on voltage distribution levels has been addressed in the literature. The majority of these works deal with determining the maximum active and reactive power that can be connected at a system load bus before the voltage at that bus reaches the voltage collapse point, using the traditional PV-curve methods reported in many references. A theoretical expression for the maximum power transfer through a grid, as limited by voltage stability, is formulated using an exact representation of the distribution line with ABCD parameters. The expression is used to plot PV curves at various power factors of a radial system, from which limiting values of reactive power can be obtained. This paper presents a method to study the relationship between the active power and voltage (PV) at the load bus in order to identify the voltage stability limit. It forms a foundation for building a permitted operating region that complies with the voltage stability limit at the point of common coupling (PCC) to which a wind farm is connected.
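As an illustration of tracing a PV curve from the two-port relation Vs = A*Vr + B*Ir, the sketch below numerically solves for the receiving-end voltage at each load level and power factor. The line constants and per-unit values are made-up examples, and the paper's closed-form maximum-power expression is not reproduced.

```python
import numpy as np
from scipy.optimize import fsolve

def receiving_voltage(P, pf, A, B, Vs=1.0):
    """Solve Vs = A*Vr + B*Ir for the receiving-end voltage magnitude (per unit).

    A, B are the line's complex ABCD constants; the load draws P at lagging
    power factor pf, so S = P + j*P*tan(acos(pf)) and Ir = conj(S / Vr)."""
    S = P * (1.0 + 1j * np.tan(np.arccos(pf)))

    def mismatch(x):
        Vr = x[0] + 1j * x[1]
        Vsc = A * Vr + B * np.conj(S / Vr)
        return [abs(Vsc) - Vs, np.angle(Vsc)]   # sending voltage as the reference

    xr, xi = fsolve(mismatch, [0.9, -0.1])
    return abs(xr + 1j * xi)

# Trace a PV curve for a short line (A = 1, B = Z) at 0.95 lagging power factor.
A, B = 1.0 + 0j, 0.03 + 0.12j
for P in np.linspace(0.2, 2.0, 7):
    print(f"P = {P:.2f} pu -> Vr ~ {receiving_voltage(P, 0.95, A, B):.3f} pu")
```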

No One Set of Parameter Values Can Simulate the Epidemics Due to SARS Occurring at Different Localities

A mathematical model for the transmission of SARS is developed. In addition to dividing the population into susceptible (high and low risk), exposed, infected, quarantined, diagnosed and recovered classes, we have included a class called untraced. The model simulates the Gompertz curves which best represent the cumulative numbers of probable SARS cases in Hong Kong and Singapore. The values of the parameters in the model which produce the best fit of the observed data for each city are obtained by using a differential evolution algorithm. It is seen that the parameter values needed to simulate the observed daily behaviors of the two epidemics are different.
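The fitting machinery can be illustrated with SciPy's differential evolution driving a least-squares fit of a Gompertz curve. The data below are synthetic stand-ins generated for the demo, not the Hong Kong or Singapore case counts, and the compartmental model itself is omitted.

```python
import numpy as np
from scipy.optimize import differential_evolution

def gompertz(t, K, b, c):
    """Cumulative cases: K * exp(-b * exp(-c * t))."""
    return K * np.exp(-b * np.exp(-c * t))

# Synthetic stand-in data for the demo (NOT the Hong Kong/Singapore case counts).
t = np.arange(0.0, 60.0, 5.0)
observed = gompertz(t, 1750.0, 5.0, 0.12) + np.random.default_rng(1).normal(0, 15, t.size)

def sse(params):
    return np.sum((gompertz(t, *params) - observed) ** 2)

result = differential_evolution(sse, bounds=[(500, 5000), (0.1, 20), (0.01, 1.0)], seed=1)
K, b, c = result.x
print(f"fitted K = {K:.0f}, b = {b:.2f}, c = {c:.3f}")
```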

Graph-Based Text Similarity Measurement by Exploiting Wikipedia as Background Knowledge

Text similarity measurement is a fundamental issue in many textual applications such as document clustering, classification, summarization and question answering. However, prevailing approaches based on the Vector Space Model (VSM) suffer to varying degrees from the limitation of the Bag of Words (BOW) assumption, which ignores the semantic relationships among words. Enriching document representations with background knowledge from Wikipedia has proven to be an effective way to address this problem, but most existing methods still cannot avoid flaws similar to those of BOW in a new vector space. In this paper, we propose a novel text similarity measurement which goes beyond the VSM and can capture the semantic affinity between documents. Specifically, it is a unified graph model that exploits Wikipedia as background knowledge and synthesizes both document representation and similarity computation. Experimental results on two different datasets show that our approach significantly improves on VSM-based methods in both text clustering and classification.
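The paper's unified graph model is not specified in this abstract, but the general idea of measuring similarity through shared Wikipedia concepts rather than shared words can be caricatured as follows; the bipartite construction and Jaccard scoring are simplifying assumptions for illustration only.

```python
import networkx as nx

def build_graph(docs_concepts):
    """Bipartite graph: document nodes linked to the Wikipedia concepts they mention.

    `docs_concepts` maps a document id to the set of Wikipedia concepts found in
    it (obtaining that mapping by annotating text against Wikipedia is assumed)."""
    g = nx.Graph()
    for doc, concepts in docs_concepts.items():
        for c in concepts:
            g.add_edge(('doc', doc), ('concept', c))
    return g

def similarity(g, d1, d2):
    """Concept-overlap similarity: Jaccard over the concept neighbourhoods."""
    n1 = set(g.neighbors(('doc', d1)))
    n2 = set(g.neighbors(('doc', d2)))
    return len(n1 & n2) / len(n1 | n2) if n1 | n2 else 0.0

g = build_graph({'a': {'Jaguar', 'Cat', 'Predator'}, 'b': {'Jaguar', 'Car', 'Engine'}})
print(similarity(g, 'a', 'b'))   # overlap via the shared 'Jaguar' concept
```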

A Wavelet-Based Object Watermarking System for Image and Video

Efficient storage, transmission and use of video information are key requirements in many multimedia applications currently being addressed by MPEG-4. To fulfill these requirements, a new approach for representing video information, which relies on an object-based representation, has been adopted. Object-based watermarking schemes are therefore needed for copyright protection. This paper proposes a novel blind object watermarking scheme for images and video using the in-place lifting shape-adaptive discrete wavelet transform (SA-DWT). In order to make the watermark robust and transparent, it is embedded in the average of wavelet blocks using a visual model based on the human visual system. The n least significant bits (LSBs) of the wavelet coefficients are adjusted in concert with the average. Simulation results show that the proposed watermarking scheme is perceptually invisible and robust against many attacks such as lossy image/video compression (e.g. JPEG, JPEG2000 and MPEG-4), scaling, adding noise, filtering, etc.
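A toy version of average-based embedding in the wavelet domain is sketched below using PyWavelets, with an ordinary (non-shape-adaptive) Haar DWT and quantization of block means; the SA-DWT, the HVS visual model, and the LSB adjustment step are omitted, so this only gestures at the scheme's structure.

```python
import numpy as np
import pywt

def embed_bits(image, bits, block=8, delta=4.0):
    """Embed watermark bits by quantising the mean of wavelet blocks.

    Toy stand-in for SA-DWT average-based embedding: an ordinary Haar DWT
    replaces the shape-adaptive transform, and no HVS model weights the strength."""
    cA, detail = pywt.dwt2(np.asarray(image, dtype=float), 'haar')
    k = 0
    for i in range(0, cA.shape[0] - block + 1, block):
        for j in range(0, cA.shape[1] - block + 1, block):
            if k >= len(bits):
                return pywt.idwt2((cA, detail), 'haar')
            blk = cA[i:i + block, j:j + block]
            mean = blk.mean()
            # quantise the block mean onto an even/odd lattice chosen by the bit
            target = (np.round(mean / delta - 0.5 * bits[k]) + 0.5 * bits[k]) * delta
            blk += target - mean      # shift all coefficients in the block equally
            k += 1
    return pywt.idwt2((cA, detail), 'haar')
```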

Cumulative Learning Based on Dynamic Clustering of Hierarchical Production Rules (HPRs)

An important structuring mechanism for knowledge bases is building clusters based on the content of their knowledge objects. The objects are clustered based on the principle of maximizing the intraclass similarity and minimizing the interclass similarity. Clustering can also facilitate taxonomy formation, that is, the organization of observations into a hierarchy of classes that group similar events together. Hierarchical representation allows us to easily manage the complexity of knowledge, to view the knowledge at different levels of detail, and to focus our attention on the interesting aspects only. One such efficient and easy-to-understand system is the Hierarchical Production Rule (HPR) system. An HPR, a standard production rule augmented with generality and specificity information, is of the form: Decision If <condition> Generality <generality-information> Specificity <specificity-information>. HPR systems are capable of handling the taxonomical structures inherent in knowledge about the real world. In this paper, a set of related HPRs is called a cluster and is represented by an HPR-tree. This paper discusses an algorithm based on a cumulative learning scenario for dynamic structuring of clusters. The proposed scheme incrementally incorporates new knowledge into the set of clusters from previous episodes and also maintains a summary of clusters, called a Synopsis, to be used in future episodes. Examples are given to demonstrate the behaviour of the proposed scheme. The suggested incremental structuring of clusters would be useful in mining data streams.
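A minimal data structure for HPRs and HPR-trees might look like the following; the class layout and the toy taxonomy are illustrative assumptions rather than the paper's representation.

```python
from dataclasses import dataclass, field

@dataclass
class HPR:
    """Hierarchical Production Rule: a production rule augmented with
    generality/specificity links, forming the nodes of an HPR-tree."""
    decision: str
    condition: str
    generality: 'HPR | None' = None                          # more general rule (parent)
    specificity: list['HPR'] = field(default_factory=list)   # more specific rules (children)

    def add_specific(self, rule: 'HPR'):
        rule.generality = self
        self.specificity.append(rule)

# A tiny HPR-tree (cluster): 'animal' generalises 'bird', which generalises 'sparrow'.
animal = HPR('is_animal', 'moves and breathes')
bird = HPR('is_bird', 'is_animal and has feathers')
animal.add_specific(bird)
bird.add_specific(HPR('is_sparrow', 'is_bird and is small and brown'))
```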

How Does Psychoanalysis Help in Reconstructing Political Thought? An Exercise of Interpretation

The significance of psychology in studying politics is embedded in philosophical issues as well as behavioural pursuits. The former is often associated with Sigmund Freud and his followers; the latter is inspired by the writings of Harold Lasswell. Political psychology, or psychopolitics, has left its own impression on political thought ever since it began deciphering the concepts of human nature and political propaganda. More importantly, psychoanalysis views political thought as a text whose latent content needs to be explored from its manifest content. In other words, it reads the text symptomatically and interprets the hidden truth. This paper explains the paradigm of dream interpretation applied by Freud. The dream work is a process comprising four successive activities: condensation, displacement, representation and secondary revision. Texts dealing with political thought can also be interpreted on these principles. Freud's method of dream interpretation draws on the hermeneutic model of philological research. It provides a theoretical perspective and technical rules for the interpretation of symbolic structures. The task of interpretation remains a discovery of the equivalence of symbols and actions through perpetual analogies. Psychoanalysis can help in studying political thought in two ways: to study textual distortion, where Freud's dream interpretation is used as a paradigm for exploring the latent text from its manifest text; and to apply Freud's psychoanalytic concepts and theories, ranging from the individual mind to civilization, religion, war and politics.

Floating-Point Scaling for BSS Gain Control

In Blind Source Separation (BSS) processing, taking advantage of the scaling-factor indeterminacy and building on the floating-point representation, we propose a scaling technique applied to the separation matrix to avoid saturation or attenuation of the recovered source signals. This technique performs Automatic Gain Control (AGC) in an on-line BSS environment. We demonstrate its effectiveness using the implementation of a division-free BSS algorithm with two inputs and two outputs. The technique is computationally cheap and efficient for hardware implementation.
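The core trick, scaling by powers of two so that only the floating-point exponent changes, can be sketched as a per-block AGC on the separation matrix; the scaling rule and target level below are assumptions for illustration, not the paper's exact update.

```python
import numpy as np

def agc_rescale(W, y_block, target=0.5):
    """Power-of-two AGC for a BSS separation matrix W (a sketch of the idea).

    Exploiting the scaling indeterminacy of BSS, each output row of W is
    scaled by 2**k so the recovered signal peak approaches `target` without
    saturating. Powers of two only touch the floating-point exponent, so the
    'multiply' is essentially free in hardware (no mantissa arithmetic)."""
    peaks = np.max(np.abs(y_block), axis=1)                       # per-output peak
    k = np.floor(np.log2(target / np.maximum(peaks, 1e-30))).astype(int)
    return np.ldexp(W, k[:, None])                                # row i scaled by 2**k[i]
```

Applied once per processing block in an on-line setting, this keeps each recovered signal near the target amplitude while leaving the separation directions untouched.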

Evaluation of a Bio-Mechanism by Graphed Static Equilibrium Forces

The unique structural configuration found in the human foot allows easy walking; similar movement is hard to imitate even for an ape. It is evident that human ambulation is related to the foot structure itself. Suppose the bones are represented as vertices and the joints as edges. This leads to the development of a special graph that represents the human foot. On a footprint there are points-of-contact, which touch the ground and involve specific vertices. Theoretically, for an ideal ambulation, these points provide the reactions onto the ground, i.e., the static equilibrium forces. They are arranged in sequence in the form of a path, and the ambulating footprint follows this path. Crossbreeding the human foot graph with this path results in a representation that describes the profile of an ideal ambulation. This profile cites the locations where the points-of-contact experience normal reaction forces, highlighting the significance of these points.
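The construction can be illustrated with a toy graph; the bone list below is deliberately simplified and anatomically incomplete, and the contact sequence is a made-up example of the path the reaction forces follow.

```python
import networkx as nx

# Illustrative only: bones as vertices, joints as edges.
foot = nx.Graph()
foot.add_edges_from([
    ('calcaneus', 'talus'), ('talus', 'navicular'), ('navicular', 'cuneiform1'),
    ('cuneiform1', 'metatarsal1'), ('metatarsal1', 'hallux'),
    ('calcaneus', 'cuboid'), ('cuboid', 'metatarsal5'), ('metatarsal5', 'toe5'),
])

# Points-of-contact on the footprint, in the order they load during ambulation.
contact_path = ['calcaneus', 'metatarsal5', 'metatarsal1', 'hallux']

# 'Crossbreeding' graph and path: the walk over the graph linking the contacts.
route = []
for a, b in zip(contact_path, contact_path[1:]):
    route += nx.shortest_path(foot, a, b)[:-1]
route.append(contact_path[-1])
print(route)
```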

Designing Pictograms for Food Portion Size

The objective of this paper is to investigate a new approach based on the idea of pictograms for food portion size. This approach adopts the model of the United States Pharmacopeia Drug Information (USP-DI). The representation of each food portion size is composed of three parts: frame, the connotation of dietary portion size, and layout. To investigate users' comprehension of this approach, two experiments were conducted with 122 Taiwanese participants, 60 male and 62 female, aged between 16 and 64 (divided into age groups of 16-30, 31-45 and 46-64). In Experiment 1, the mean correct rate for the understanding of food items is 48.54% (S.D. = 95.08) and the mean response time 2.89 s (S.D. = 2.14). The difference in the correct rates for different age groups is significant (P*=0.00

Using Dempster-Shafer Theory in XML Information Retrieval

XML is a markup language which is becoming the standard format for information representation and data exchange. A major purpose of XML is the explicit representation of the logical structure of a document. Much research has been performed on exploiting the logical structure of documents in information retrieval in order to precisely extract the user's information need from large collections of XML documents. In this paper, we describe an XML information retrieval weighting scheme that tries to find the most relevant elements in XML documents in response to a user query. We present this weighting model for information retrieval systems that utilize plausible inferences to infer the relevance of elements in XML documents. We also add to this model the Dempster-Shafer theory of evidence to express the uncertainty in plausible inferences, and the Dempster-Shafer rule of combination to combine evidence derived from different inferences.
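Dempster's rule of combination itself is standard and can be stated compactly; the two mass functions in the example, one from document structure and one from term matching, are hypothetical illustrations of combining evidence about an element's relevance.

```python
def combine(m1, m2):
    """Dempster's rule of combination for mass functions over frozenset focal elements.

    m1, m2 map frozensets (subsets of the frame of discernment) to masses summing to 1."""
    raw, conflict = {}, 0.0
    for a, wa in m1.items():
        for b, wb in m2.items():
            inter = a & b
            if inter:
                raw[inter] = raw.get(inter, 0.0) + wa * wb
            else:
                conflict += wa * wb
    if conflict >= 1.0:
        raise ValueError("total conflict: sources are contradictory")
    return {s: w / (1.0 - conflict) for s, w in raw.items()}

# Two pieces of evidence about an element being relevant (R) or non-relevant (N).
R, N, RN = frozenset('R'), frozenset('N'), frozenset('RN')
m_struct = {R: 0.6, RN: 0.4}             # evidence from document structure
m_terms = {R: 0.5, N: 0.2, RN: 0.3}      # evidence from term matching
print(combine(m_struct, m_terms))
```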

Dynamic Decompression for Text Files

Compression algorithms reduce the redundancy in data representation to decrease the storage required for that data. Lossless compression researchers have developed highly sophisticated approaches, such as Huffman encoding, arithmetic encoding, the Lempel-Ziv (LZ) family, Dynamic Markov Compression (DMC), Prediction by Partial Matching (PPM), and Burrows-Wheeler Transform (BWT) based algorithms. Decompression is also required to retrieve the original data by lossless means. This paper presents a compression scheme for text files coupled with the principle of dynamic decompression, which decompresses only the section of the compressed text file required by the user instead of decompressing the entire file. Dynamically decompressed files offer better disk space utilization due to higher compression ratios compared to most currently available text file formats.
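The principle of dynamic decompression can be illustrated by compressing a text in independent fixed-size blocks, so that a requested byte range inflates only the blocks that cover it; this sketch uses zlib for convenience and is not the paper's actual scheme.

```python
import zlib

BLOCK = 64 * 1024  # compress the text in fixed-size blocks

def compress_blocks(text: bytes):
    """Compress independently per block so any section can be decompressed alone."""
    return [zlib.compress(text[i:i + BLOCK]) for i in range(0, len(text), BLOCK)]

def read_range(blocks, start, end):
    """Dynamic decompression: inflate only the blocks covering bytes [start, end)."""
    first, last = start // BLOCK, (end - 1) // BLOCK
    data = b''.join(zlib.decompress(blocks[b]) for b in range(first, last + 1))
    return data[start - first * BLOCK : end - first * BLOCK]

blocks = compress_blocks(b'hello world ' * 20000)
print(read_range(blocks, 130000, 130012))   # touches only the needed block(s)
```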

Robust Image Transmission Over Time-varying Channels using Hierarchical Joint Source Channel Coding

In this paper, a joint source-channel coding (JSCC) scheme for time-varying channels is presented. The proposed scheme uses a hierarchical framework for both the source encoder and transmission via QAM modulation. Hierarchical joint source-channel codes with hierarchical QAM constellations are designed to track the channel variations, which yields higher throughput by adapting certain parameters of the receiver to the channel variation. We consider the problem of still image transmission over time-varying channels with channel state information (CSI) available at 1) the receiver only and 2) both the transmitter and the receiver. We describe an algorithm that optimizes hierarchical source codebooks by minimizing the distortion due to the source quantizer and channel impairments. Simulation results for image transmission show that the proposed hierarchical system outperforms conventional schemes based on a single modulator and channel-optimized source coding.
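One standard way to realise hierarchical QAM is to let two coarse (high-priority) bits pick the quadrant and two refinement bits the point within it, with a parameter alpha = d1/d2 widening the quadrant spacing; the mapping below is a generic sketch of this idea, not the paper's exact constellation design.

```python
import numpy as np

def hierarchical_16qam(coarse_bits, refine_bits, alpha=1.0, d2=2.0):
    """Hierarchical 16-QAM: 2 coarse bits pick the quadrant, 2 refinement bits
    the point inside it; alpha = d1/d2 >= 1 widens quadrant spacing so the
    high-priority bits survive worse channels (alpha = 1 is uniform 16-QAM)."""
    d1 = alpha * d2
    c = np.asarray(coarse_bits).reshape(-1, 2)   # quadrant-selecting bit pairs
    r = np.asarray(refine_bits).reshape(-1, 2)   # in-quadrant bit pairs
    sign = 2 * c - 1                             # map {0,1} -> {-1,+1}
    level = d1 / 2 + d2 * r                      # distance from each axis
    pts = sign * level
    return pts[:, 0] + 1j * pts[:, 1]

symbols = hierarchical_16qam([1, 1, 0, 1], [0, 0, 1, 0], alpha=3.0)
print(symbols)   # larger alpha pushes quadrants apart, protecting the coarse bits
```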

Evolutionary Approach for Automated Discovery of Censored Production Rules

In the recent past, there has been an increasing interest in applying evolutionary methods to Knowledge Discovery in Databases (KDD), and a number of successful applications of Genetic Algorithms (GA) and Genetic Programming (GP) to KDD have been demonstrated. The most predominant representation of the discovered knowledge is the standard Production Rule (PR) in the form If P Then D. PRs, however, are unable to handle exceptions and do not exhibit variable precision. Censored Production Rules (CPRs), an extension of PRs proposed by Michalski & Winston, exhibit variable precision and support an efficient mechanism for handling exceptions. A CPR is an augmented production rule of the form: If P Then D Unless C, where C (the Censor) is an exception to the rule. Such rules are employed in situations in which the conditional statement 'If P Then D' holds frequently and the assertion C holds rarely. By using a rule of this type we are free to ignore the exception conditions when the resources needed to establish their presence are tight or there is simply no information available as to whether they hold or not. Thus, the 'If P Then D' part of the CPR expresses important information, while the Unless C part acts only as a switch that changes the polarity of D to ~D. This paper presents a classification algorithm, based on an evolutionary approach, that discovers comprehensible rules with exceptions in the form of CPRs. The proposed approach uses a flexible chromosome encoding, where each chromosome corresponds to a CPR. Appropriate genetic operators are suggested and a fitness function is proposed that incorporates the basic constraints on CPRs. Experimental results are presented to demonstrate the performance of the proposed algorithm.
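One plausible chromosome encoding for CPRs, given only what the abstract states, is a don't-care-augmented attribute pattern for P and another for C, plus a class label D; the encoding, the toy data, and the censor penalty in the fitness are illustrative assumptions, not the paper's operators.

```python
import numpy as np

rng = np.random.default_rng(0)
N_ATTR = 6  # binary attributes; 0 = require absent, 1 = require present, 2 = don't care

def random_cpr():
    """Chromosome for 'If P Then D Unless C': two attribute patterns and a class."""
    return {'P': rng.integers(0, 3, N_ATTR),
            'C': rng.integers(0, 3, N_ATTR),
            'D': int(rng.integers(0, 2))}

def matches(pattern, x):
    return all(p == 2 or p == xi for p, xi in zip(pattern, x))

def predict(cpr, x):
    """If P holds answer D, unless the censor C also holds (then flip to ~D)."""
    if not matches(cpr['P'], x):
        return None                        # rule does not fire
    return 1 - cpr['D'] if matches(cpr['C'], x) else cpr['D']

def fitness(cpr, X, y):
    """Toy fitness: accuracy on fired examples, minus a penalty when the censor
    fires often (the 'C holds rarely' constraint from the text)."""
    outcomes = [(predict(cpr, x), yi) for x, yi in zip(X, y)]
    fired = [(p, yi) for p, yi in outcomes if p is not None]
    if not fired:
        return 0.0
    acc = sum(p == yi for p, yi in fired) / len(fired)
    censored = sum(matches(cpr['P'], x) and matches(cpr['C'], x) for x in X) / len(X)
    return acc - 0.5 * censored

X = rng.integers(0, 2, (20, N_ATTR))
y = rng.integers(0, 2, 20)
print(fitness(random_cpr(), X, y))
```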

Designing and Implementing an Innovative Course about World Wide Web, Based on the Conceptual Representations of Students

The Internet is nowadays included in all national curricula for elementary school. A comparative study of their goals leads to the conclusion that a complete curriculum should aim at students' acquisition of the abilities to navigate and search for information, and should additionally emphasize the evaluation of the information provided by the World Wide Web. In a constructivist knowledge framework, the design of a course has to take into consideration the conceptual representations of students. This paper presents the conceptual representations of eleven-year-old students, attending the sixth grade of Greek elementary school, about the World Wide Web, and their use in the design and implementation of an innovative course.

Solving Part Type Selection and Loading Problem in Flexible Manufacturing System Using Real Coded Genetic Algorithms – Part I: Modeling

This paper and its companion (Part II) deal with the modeling and optimization of two NP-hard problems in the production planning of flexible manufacturing systems (FMS): the part type selection problem and the loading problem. The part type selection problem and the loading problem are strongly related and heavily influence the system's efficiency and productivity. The complexity of the problems increases when operational flexibilities, such as the possibility of an operation being processed on alternative machines with alternative tools, are considered. These problems have been modeled and solved simultaneously using real-coded genetic algorithms (RCGA), which use an array of real numbers as the chromosome representation. These real numbers can be converted into the part type sequence and the machines that are used to process the part types. This first part of the paper focuses on modeling the problems and discusses how the novel chromosome representation can be applied to solve them. The second part will discuss the effectiveness of the RCGA in solving various test bed problems.
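A common way to turn an array of reals into a sequence plus discrete choices is random-keys decoding: rank the first half of the chromosome to obtain the part-type order and scale the second half to machine indices. This is an assumed illustration of the idea, since the paper's exact mapping is given in the full text.

```python
import numpy as np

def decode_chromosome(keys, n_machines):
    """Decode a real-coded chromosome into a part-type sequence plus machine picks.

    First half: random keys whose ascending ranks give the part-type processing
    order. Second half: reals in [0, 1) scaled to choose among alternative
    machines. (Illustrative decoding; the paper's exact mapping may differ.)"""
    keys = np.asarray(keys)
    half = len(keys) // 2
    sequence = np.argsort(keys[:half])                  # part-type order by rank
    machines = (keys[half:] * n_machines).astype(int)   # machine index per part type
    return sequence, machines

chrom = np.random.default_rng(2).random(8)   # 4 part types, 3 alternative machines
print(decode_chromosome(chrom, n_machines=3))
```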