TOSOM: A Topic-Oriented Self-Organizing Map for Text Organization

The self-organizing map (SOM) model is a well-known neural network model with wide spread of applications. The main characteristics of SOM are two-fold, namely dimension reduction and topology preservation. Using SOM, a high-dimensional data space will be mapped to some low-dimensional space. Meanwhile, the topological relations among data will be preserved. With such characteristics, the SOM was usually applied on data clustering and visualization tasks. However, the SOM has main disadvantage of the need to know the number and structure of neurons prior to training, which are difficult to be determined. Several schemes have been proposed to tackle such deficiency. Examples are growing/expandable SOM, hierarchical SOM, and growing hierarchical SOM. These schemes could dynamically expand the map, even generate hierarchical maps, during training. Encouraging results were reported. Basically, these schemes adapt the size and structure of the map according to the distribution of training data. That is, they are data-driven or dataoriented SOM schemes. In this work, a topic-oriented SOM scheme which is suitable for document clustering and organization will be developed. The proposed SOM will automatically adapt the number as well as the structure of the map according to identified topics. Unlike other data-oriented SOMs, our approach expands the map and generates the hierarchies both according to the topics and their characteristics of the neurons. The preliminary experiments give promising result and demonstrate the plausibility of the method.

PIIN Suppression Using Random Diagonal Code for Spectral Amplitude Coding Optical CDMA System

A new code for spectral-amplitude coding optical code-division multiple-access system is proposed called Random diagonal (RD) code. This code is constructed using code segment and data segment. One of the important properties of this code is that the cross correlation at data segment is always zero, which means that Phase Intensity Induced Noise (PIIN) is reduced. For the performance analysis, the effects of phase-induced intensity noise, shot noise, and thermal noise are considered simultaneously. Bit-error rate (BER) performance is compared with Hadamard and Modified Frequency Hopping (MFH) codes. It is shown that the system using this new code matrices not only suppress PIIN, but also allows larger number of active users compare with other codes. Simulation results shown that using point to point transmission with three encoded channels, RD code has better BER performance than other codes, also its found that at 0 dbm PIIN noise are 10-10 and 10-11 for RD and MFH respectively.

Color Image Segmentation using Adaptive Spatial Gaussian Mixture Model

An adaptive spatial Gaussian mixture model is proposed for clustering based color image segmentation. A new clustering objective function which incorporates the spatial information is introduced in the Bayesian framework. The weighting parameter for controlling the importance of spatial information is made adaptive to the image content to augment the smoothness towards piecewisehomogeneous region and diminish the edge-blurring effect and hence the name adaptive spatial finite mixture model. The proposed approach is compared with the spatially variant finite mixture model for pixel labeling. The experimental results with synthetic and Berkeley dataset demonstrate that the proposed method is effective in improving the segmentation and it can be employed in different practical image content understanding applications.

Mining and Visual Management of XML-Based Image Collections

This article describes Uruk, the virtual museum of Iraq that we developed for visual exploration and retrieval of image collections. The system largely exploits the loosely-structured hierarchy of XML documents that provides a useful representation method to store semi-structured or unstructured data, which does not easily fit into existing database. The system offers users the capability to mine and manage the XML-based image collections through a web-based Graphical User Interface (GUI). Typically, at an interactive session with the system, the user can browse a visual structural summary of the XML database in order to select interesting elements. Using this intermediate result, queries combining structure and textual references can be composed and presented to the system. After query evaluation, the full set of answers is presented in a visual and structured way.

A Prediction of Attractive Evaluation Objects Based On Complex Sequential Data

This paper proposes a method that predicts attractive evaluation objects. In the learning phase, the method inductively acquires trend rules from complex sequential data. The data is composed of two types of data. One is numerical sequential data. Each evaluation object has respective numerical sequential data. The other is text sequential data. Each evaluation object is described in texts. The trend rules represent changes of numerical values related to evaluation objects. In the prediction phase, the method applies new text sequential data to the trend rules and evaluates which evaluation objects are attractive. This paper verifies the effect of the proposed method by using stock price sequences and news headline sequences. In these sequences, each stock brand corresponds to an evaluation object. This paper discusses validity of predicted attractive evaluation objects, the process time of each phase, and the possibility of application tasks.

Human Detection using Projected Edge Feature

The purpose of this paper is to detect human in images. This paper proposes a method for extracting human body feature descriptors consisting of projected edge component series. The feature descriptor can express appearances and shapes of human with local and global distribution of edges. Our method evaluated with a linear SVM classifier on Daimler-Chrysler pedestrian dataset, and test with various sub-region size. The result shows that the accuracy level of proposed method similar to Histogram of Oriented Gradients(HOG) feature descriptor and feature extraction process is simple and faster than existing methods.

Preservation of Molecular Ozone in a Clathrate Hydrate : Three-Phase (Gas + Liquid + Hydrate) Equilibrium Measurements for O3 + O2 + CO2 + H2O Systems

This paper reports the three-phase (gas + liquid + hydrate) equilibrium pressure versus temperature data for a (O3 + O2 + CO2 + H2O) system for developing the hydrate-based technology to preserve ozone, a chemically unstable substance, for various industrial, medical and consumer uses. These data cover the temperature range from 272 K to 277 K, corresponding to pressures from 1.6 MPa to 3.1 MPa, for each of the three different (O3 + O2)-to-CO2 or O2-to-CO2 molar ratios in the gas phase, which are approximately 4 : 6, 5 : 5, respectively. The mole fraction of ozone in the gas phase was ~0.03 , which are the densest ozone fraction to artificially form O3 containing hydrate ever reported in the literature. Based on these data, the formation of hydrate containing high-concentration ozone, as high as 1 mass %, will be expected.

Influence of Radio Frequency Identification Technology in Logistic, Inventory Control and Supply Chain Optimization

The main aim of Supply Chain Management (SCM) is to produce, distribute, logistics and deliver goods and equipment in right location, right time, right amount to satisfy costumers, with minimum time and cost waste. So implementing techniques that reduce project time and cost, and improve productivity and performance is very important. Emerging technologies such as the Radio Frequency Identification (RFID) are now making it possible to automate supply chains in a real time manner and making them more efficient than the simple supply chain of the past for tracing and monitoring goods and products and capturing data on movements of goods and other events. This paper considers concepts, components and RFID technology characteristics by concentration of warehouse and inventories management. Additionally, utilization of RFID in the role of improving information management in supply chain is discussed. Finally, the facts of installation and this technology-s results in direction with warehouse and inventory management and business development will be presented.

Web Application to Profiling Scientific Institutions through Citation Mining

Recently the use of data mining to scientific bibliographic data bases has been implemented to analyze the pathways of the knowledge or the core scientific relevances of a laureated novel or a country. This specific case of data mining has been named citation mining, and it is the integration of citation bibliometrics and text mining. In this paper we present an improved WEB implementation of statistical physics algorithms to perform the text mining component of citation mining. In particular we use an entropic like distance between the compression of text as an indicator of the similarity between them. Finally, we have included the recently proposed index h to characterize the scientific production. We have used this web implementation to identify users, applications and impact of the Mexican scientific institutions located in the State of Morelos.

A Hybrid Recommender System based on Collaborative Filtering and Cloud Model

User-based Collaborative filtering (CF), one of the most prevailing and efficient recommendation techniques, provides personalized recommendations to users based on the opinions of other users. Although the CF technique has been successfully applied in various applications, it suffers from serious sparsity problems. The cloud-model approach addresses the sparsity problems by constructing the user-s global preference represented by a cloud eigenvector. The user-based CF approach works well with dense datasets while the cloud-model CF approach has a greater performance when the dataset is sparse. In this paper, we present a hybrid approach that integrates the predictions from both the user-based CF and the cloud-model CF approaches. The experimental results show that the proposed hybrid approach can ameliorate the sparsity problem and provide an improved prediction quality.

A Serializability Condition for Multi-step Transactions Accessing Ordered Data

In mobile environments, unspecified numbers of transactions arrive in continuous streams. To prove correctness of their concurrent execution a method of modelling an infinite number of transactions is needed. Standard database techniques model fixed finite schedules of transactions. Lately, techniques based on temporal logic have been proposed as suitable for modelling infinite schedules. The drawback of these techniques is that proving the basic serializability correctness condition is impractical, as encoding (the absence of) conflict cyclicity within large sets of transactions results in prohibitively large temporal logic formulae. In this paper, we show that, under certain common assumptions on the graph structure of data items accessed by the transactions, conflict cyclicity need only be checked within all possible pairs of transactions. This results in formulae of considerably reduced size in any temporal-logic-based approach to proving serializability, and scales to arbitrary numbers of transactions.

WebGD: A CORBA-based Document Classification and Retrieval System on the Web

This paper presents the design and implementation of the WebGD, a CORBA-based document classification and retrieval system on Internet. The WebGD makes use of such techniques as Web, CORBA, Java, NLP, fuzzy technique, knowledge-based processing and database technology. Unified classification and retrieval model, classifying and retrieving with one reasoning engine and flexible working mode configuration are some of its main features. The architecture of WebGD, the unified classification and retrieval model, the components of the WebGD server and the fuzzy inference engine are discussed in this paper in detail.

Shape Restoration of the Left Ventricle

This paper describes an automatic algorithm to restore the shape of three-dimensional (3D) left ventricle (LV) models created from magnetic resonance imaging (MRI) data using a geometry-driven optimization approach. Our basic premise is to restore the LV shape such that the LV epicardial surface is smooth after the restoration. A geometrical measure known as the Minimum Principle Curvature (κ2) is used to assess the smoothness of the LV. This measure is used to construct the objective function of a two-step optimization process. The objective of the optimization is to achieve a smooth epicardial shape by iterative in-plane translation of the MRI slices. Quantitatively, this yields a minimum sum in terms of the magnitude of κ 2, when κ2 is negative. A limited memory quasi-Newton algorithm, L-BFGS-B, is used to solve the optimization problem. We tested our algorithm on an in vitro theoretical LV model and 10 in vivo patient-specific models which contain significant motion artifacts. The results show that our method is able to automatically restore the shape of LV models back to smoothness without altering the general shape of the model. The magnitudes of in-plane translations are also consistent with existing registration techniques and experimental findings.

Revisiting the Concept of Risk Analysis within the Context of Geospatial Database Design: A Collaborative Framework

The aim of this research is to design a collaborative framework that integrates risk analysis activities into the geospatial database design (GDD) process. Risk analysis is rarely undertaken iteratively as part of the present GDD methods in conformance to requirement engineering (RE) guidelines and risk standards. Accordingly, when risk analysis is performed during the GDD, some foreseeable risks may be overlooked and not reach the output specifications especially when user intentions are not systematically collected. This may lead to ill-defined requirements and ultimately in higher risks of geospatial data misuse. The adopted approach consists of 1) reviewing risk analysis process within the scope of RE and GDD, 2) analyzing the challenges of risk analysis within the context of GDD, and 3) presenting the components of a risk-based collaborative framework that improves the collection of the intended/forbidden usages of the data and helps geo-IT experts to discover implicit requirements and risks.

Satellite Data Classification Accuracy Assessment Based from Reference Dataset

In order to develop forest management strategies in tropical forest in Malaysia, surveying the forest resources and monitoring the forest area affected by logging activities is essential. There are tremendous effort has been done in classification of land cover related to forest resource management in this country as it is a priority in all aspects of forest mapping using remote sensing and related technology such as GIS. In fact classification process is a compulsory step in any remote sensing research. Therefore, the main objective of this paper is to assess classification accuracy of classified forest map on Landsat TM data from difference number of reference data (200 and 388 reference data). This comparison was made through observation (200 reference data), and interpretation and observation approaches (388 reference data). Five land cover classes namely primary forest, logged over forest, water bodies, bare land and agricultural crop/mixed horticultural can be identified by the differences in spectral wavelength. Result showed that an overall accuracy from 200 reference data was 83.5 % (kappa value 0.7502459; kappa variance 0.002871), which was considered acceptable or good for optical data. However, when 200 reference data was increased to 388 in the confusion matrix, the accuracy slightly improved from 83.5% to 89.17%, with Kappa statistic increased from 0.7502459 to 0.8026135, respectively. The accuracy in this classification suggested that this strategy for the selection of training area, interpretation approaches and number of reference data used were importance to perform better classification result.

Utilization of Advanced Data Storage Technology to Conduct Construction Industry on Clear Environment

Construction projects generally take place in uncontrolled and dynamic environments where construction waste is a serious environmental problem in many large cities. The total amount of waste and carbon dioxide emissions from transportation vehicles are still out of control due to increasing construction projects, massive urban development projects and the lack of effective tools for minimizing adverse environmental impacts in construction. This research is about utilization of the integrated applications of automated advanced tracking and data storage technologies in the area of environmental management to monitor and control adverse environmental impacts such as construction waste and carbon dioxide emissions. Radio Frequency Identification (RFID) integrated with the Global Position System (GPS) provides an opportunity to uniquely identify materials, components, and equipments and to locate and track them using minimal or no worker input. The transmission of data to the central database will be carried out with the help of Global System for Mobile Communications (GSM).

A Thought on Exotic Statistical Distributions

The statistical distributions are modeled in explaining nature of various types of data sets. Although these distributions are mostly uni-modal, it is quite common to see multiple modes in the observed distribution of the underlying variables, which make the precise modeling unrealistic. The observed data do not exhibit smoothness not necessarily due to randomness, but could also be due to non-randomness resulting in zigzag curves, oscillations, humps etc. The present paper argues that trigonometric functions, which have not been used in probability functions of distributions so far, have the potential to take care of this, if incorporated in the distribution appropriately. A simple distribution (named as, Sinoform Distribution), involving trigonometric functions, is illustrated in the paper with a data set. The importance of trigonometric functions is demonstrated in the paper, which have the characteristics to make statistical distributions exotic. It is possible to have multiple modes, oscillations and zigzag curves in the density, which could be suitable to explain the underlying nature of select data set.

Supercompression for Full-HD and 4k-3D (8k)Digital TV Systems

In this work, we developed the concept of supercompression, i.e., compression above the compression standard used. In this context, both compression rates are multiplied. In fact, supercompression is based on super-resolution. That is to say, supercompression is a data compression technique that superpose spatial image compression on top of bit-per-pixel compression to achieve very high compression ratios. If the compression ratio is very high, then we use a convolutive mask inside decoder that restores the edges, eliminating the blur. Finally, both, the encoder and the complete decoder are implemented on General-Purpose computation on Graphics Processing Units (GPGPU) cards. Specifically, the mentio-ned mask is coded inside texture memory of a GPGPU.

Quantitative Analysis of PCA, ICA, LDA and SVM in Face Recognition

Face recognition is a technique to automatically identify or verify individuals. It receives great attention in identification, authentication, security and many more applications. Diverse methods had been proposed for this purpose and also a lot of comparative studies were performed. However, researchers could not reach unified conclusion. In this paper, we are reporting an extensive quantitative accuracy analysis of four most widely used face recognition algorithms: Principal Component Analysis (PCA), Independent Component Analysis (ICA), Linear Discriminant Analysis (LDA) and Support Vector Machine (SVM) using AT&T, Sheffield and Bangladeshi people face databases under diverse situations such as illumination, alignment and pose variations.

Analysis of Food Security Situation among Nigerian Rural Farmers

This paper analysed the food security situation among Nigerian rural farmers. Data collected on 202 rural farmers from Benue State were analysed using descriptive and inferential statistics. The study revealed that majority of the respondents (60.83%) had medium dietary diversity. Furthermore, household daily calorie requirement for the food secure households was 10,723 and the household daily calorie consumption was 12,598, with a surplus index of 0.04. The food security index was 1.16. The Household daily per capita calorie consumption was 3,221.2. For the food insecure households, the household daily calorie requirement was 20,213 and the household daily calorie consumption was 17,393. The shortfall index was 0.14. The food security index was 0.88. The Household daily per capita calorie consumption was 2,432.8. The most commonly used coping strategies during food stress included intercropping (99.2%), reliance on less preferred food (98.1%), limiting portion size at meal times (85.8%) and crop diversification (70.8%).