Identification of Most Frequently Occurring Lexis in Body-enhancement Medicinal Unsolicited Bulk e-mails

e-mail has become an important means of electronic communication but the viability of its usage is marred by Unsolicited Bulk e-mail (UBE) messages. UBE consists of many types like pornographic, virus infected and 'cry-for-help' messages as well as fake and fraudulent offers for jobs, winnings and medicines. UBE poses technical and socio-economic challenges to usage of e-mails. To meet this challenge and combat this menace, we need to understand UBE. Towards this end, the current paper presents a content-based textual analysis of more than 2700 body enhancement medicinal UBE. Technically, this is an application of Text Parsing and Tokenization for an un-structured textual document and we approach it using Bag Of Words (BOW) and Vector Space Document Model techniques. We have attempted to identify the most frequently occurring lexis in the UBE documents that advertise various products for body enhancement. The analysis of such top 100 lexis is also presented. We exhibit the relationship between occurrence of a word from the identified lexis-set in the given UBE and the probability that the given UBE will be the one advertising for fake medicinal product. To the best of our knowledge and survey of related literature, this is the first formal attempt for identification of most frequently occurring lexis in such UBE by its textual analysis. Finally, this is a sincere attempt to bring about alertness against and mitigate the threat of such luring but fake UBE.

Estimation of Skew Angle in Binary Document Images Using Hough Transform

This paper includes two novel techniques for skew estimation of binary document images. These algorithms are based on connected component analysis and Hough transform. Both these methods focus on reducing the amount of input data provided to Hough transform. In the first method, referred as word centroid approach, the centroids of selected words are used for skew detection. In the second method, referred as dilate & thin approach, the selected characters are blocked and dilated to get word blocks and later thinning is applied. The final image fed to Hough transform has the thinned coordinates of word blocks in the image. The methods have been successful in reducing the computational complexity of Hough transform based skew estimation algorithms. Promising experimental results are also provided to prove the effectiveness of the proposed methods.

Fuzzy Ideology based Long Term Load Forecasting

Fuzzy Load forecasting plays a paramount role in the operation and management of power systems. Accurate estimation of future power demands for various lead times facilitates the task of generating power reliably and economically. The forecasting of future loads for a relatively large lead time (months to few years) is studied here (long term load forecasting). Among the various techniques used in forecasting load, artificial intelligence techniques provide greater accuracy to the forecasts as compared to conventional techniques. Fuzzy Logic, a very robust artificial intelligent technique, is described in this paper to forecast load on long term basis. The paper gives a general algorithm to forecast long term load. The algorithm is an Extension of Short term load forecasting method to Long term load forecasting and concentrates not only on the forecast values of load but also on the errors incorporated into the forecast. Hence, by correcting the errors in the forecast, forecasts with very high accuracy have been achieved. The algorithm, in the paper, is demonstrated with the help of data collected for residential sector (LT2 (a) type load: Domestic consumers). Load, is determined for three consecutive years (from April-06 to March-09) in order to demonstrate the efficiency of the algorithm and to forecast for the next two years (from April-09 to March-11).

Using Perspective Schemata to Model the ETL Process

Data Warehouses (DWs) are repositories which contain the unified history of an enterprise for decision support. The data must be Extracted from information sources, Transformed and integrated to be Loaded (ETL) into the DW, using ETL tools. These tools focus on data movement, where the models are only used as a means to this aim. Under a conceptual viewpoint, the authors want to innovate the ETL process in two ways: 1) to make clear compatibility between models in a declarative fashion, using correspondence assertions and 2) to identify the instances of different sources that represent the same entity in the real-world. This paper presents the overview of the proposed framework to model the ETL process, which is based on the use of a reference model and perspective schemata. This approach provides the designer with a better understanding of the semantic associated with the ETL process.

The Development of the Multi-Agent Classification System (MACS) in Compliance with FIPA Specifications

The paper investigates the feasibility of constructing a software multi-agent based monitoring and classification system and utilizing it to provide an automated and accurate classification of end users developing applications in the spreadsheet domain. The agents function autonomously to provide continuous and periodic monitoring of excels spreadsheet workbooks. Resulting in, the development of the MultiAgent classification System (MACS) that is in compliance with the specifications of the Foundation for Intelligent Physical Agents (FIPA). However, different technologies have been brought together to build MACS. The strength of the system is the integration of the agent technology with the FIPA specifications together with other technologies that are Windows Communication Foundation (WCF) services, Service Oriented Architecture (SOA), and Oracle Data Mining (ODM). The Microsoft's .NET widows service based agents were utilized to develop the monitoring agents of MACS, the .NET WCF services together with SOA approach allowed the distribution and communication between agents over the WWW that is in order to satisfy the monitoring and classification of the multiple developer aspect. ODM was used to automate the classification phase of MACS.

A Hybrid Machine Learning System for Stock Market Forecasting

In this paper, we propose a hybrid machine learning system based on Genetic Algorithm (GA) and Support Vector Machines (SVM) for stock market prediction. A variety of indicators from the technical analysis field of study are used as input features. We also make use of the correlation between stock prices of different companies to forecast the price of a stock, making use of technical indicators of highly correlated stocks, not only the stock to be predicted. The genetic algorithm is used to select the set of most informative input features from among all the technical indicators. The results show that the hybrid GA-SVM system outperforms the stand alone SVM system.

A Crisis Communication Network Based on Embodied Conversational Agents System with Mobile Services

In this paper, we proposed a new framework to incorporate an intelligent agent software robot into a crisis communication portal (CCNet) in order to send alert news to subscribed users via email and other mobile services such as Short Message Service (SMS), Multimedia Messaging Service (MMS) and General Packet Radio Services (GPRS). The content on the mobile services can be delivered either through mobile phone or Personal Digital Assistance (PDA). This research has shown that with our proposed framework, the embodied conversation agents system can handle questions intelligently with our multilayer architecture. At the same time, the extended framework can take care of delivery content through a more humanoid interface on mobile devices.

A Web Oriented Spread Spectrum Watermarking Procedure for MPEG-2 Videos

In the last decade digital watermarking procedures have become increasingly applied to implement the copyright protection of multimedia digital contents distributed on the Internet. To this end, it is worth noting that a lot of watermarking procedures for images and videos proposed in literature are based on spread spectrum techniques. However, some scepticism about the robustness and security of such watermarking procedures has arisen because of some documented attacks which claim to render the inserted watermarks undetectable. On the other hand, web content providers wish to exploit watermarking procedures characterized by flexible and efficient implementations and which can be easily integrated in their existing web services frameworks or platforms. This paper presents how a simple spread spectrum watermarking procedure for MPEG-2 videos can be modified to be exploited in web contexts. To this end, the proposed procedure has been made secure and robust against some well-known and dangerous attacks. Furthermore, its basic scheme has been optimized by making the insertion procedure adaptive with respect to the terminals used to open the videos and the network transactions carried out to deliver them to buyers. Finally, two different implementations of the procedure have been developed: the former is a high performance parallel implementation, whereas the latter is a portable Java and XML based implementation. Thus, the paper demonstrates that a simple spread spectrum watermarking procedure, with limited and appropriate modifications to the embedding scheme, can still represent a valid alternative to many other well-known and more recent watermarking procedures proposed in literature.

Partial Connection Architecture for Mobile Computing

In mobile computing environments, there are many new non existing problems in the distributed system, which is consisted of stationary hosts because of host mobility, sudden disconnection by handoff in wireless networks, voluntary disconnection for efficient power consumption of a mobile host, etc. To solve the problems, we proposed the architecture of Partial Connection Manager (PCM) in this paper. PCM creates the limited number of mobile agents according to priority, sends them in parallel to servers, and combines the results to process the user request rapidly. In applying the proposed PCM to the mobile market agent service, we understand that the mobile agent technique could be suited for the mobile computing environment and the partial connection problem management.

Neural-Symbolic Machine-Learning for Knowledge Discovery and Adaptive Information Retrieval

In this paper, a model for an information retrieval system is proposed which takes into account that knowledge about documents and information need of users are dynamic. Two methods are combined, one qualitative or symbolic and the other quantitative or numeric, which are deemed suitable for many clustering contexts, data analysis, concept exploring and knowledge discovery. These two methods may be classified as inductive learning techniques. In this model, they are introduced to build “long term" knowledge about past queries and concepts in a collection of documents. The “long term" knowledge can guide and assist the user to formulate an initial query and can be exploited in the process of retrieving relevant information. The different kinds of knowledge are organized in different points of view. This may be considered an enrichment of the exploration level which is coherent with the concept of document/query structure.

Night-Time Traffic Light Detection Based On SVM with Geometric Moment Features

This paper presents an effective traffic lights detection method at the night-time. First, candidate blobs of traffic lights are extracted from RGB color image. Input image is represented on the dominant color domain by using color transform proposed by Ruta, then red and green color dominant regions are selected as candidates. After candidate blob selection, we carry out shape filter for noise reduction using information of blobs such as length, area, area of boundary box, etc. A multi-class classifier based on SVM (Support Vector Machine) applies into the candidates. Three kinds of features are used. We use basic features such as blob width, height, center coordinate, area, area of blob. Bright based stochastic features are also used. In particular, geometric based moment-s values between candidate region and adjacent region are proposed and used to improve the detection performance. The proposed system is implemented on Intel Core CPU with 2.80 GHz and 4 GB RAM and tested with the urban and rural road videos. Through the test, we show that the proposed method using PF, BMF, and GMF reaches up to 93 % of detection rate with computation time of in average 15 ms/frame.

Programming Aid Tool for Detecting Common Mistakes of Novice Programmers in OpenMP Code

OpenMP is an API for parallel programming model of shared memory multiprocessors. Novice OpenMP programmers often produce the code that compiler cannot find human errors. It was investigated how compiler coped with the common mistakes that can occur in OpenMP code. The latest version(4.4.3) of GCC is used for this research. It was found that GCC compiled the codes without any errors or warnings. In this paper the programming aid tool is presented for OpenMP programs. It can check 12 common mistakes that novice programmer can commit during the programming of OpenMP. It was demonstrated that the programming aid tool can detect the various common mistakes that GCC failed to detect.

A New Concept for Deriving the Expected Value of Fuzzy Random Variables

Fuzzy random variables have been introduced as an imprecise concept of numeric values for characterizing the imprecise knowledge. The descriptive parameters can be used to describe the primary features of a set of fuzzy random observations. In fuzzy environments, the expected values are usually represented as fuzzy-valued, interval-valued or numeric-valued descriptive parameters using various metrics. Instead of the concept of area metric that is usually adopted in the relevant studies, the numeric expected value is proposed by the concept of distance metric in this study based on two characters (fuzziness and randomness) of FRVs. Comparing with the existing measures, although the results show that the proposed numeric expected value is same with those using the different metric, if only triangular membership functions are used. However, the proposed approach has the advantages of intuitiveness and computational efficiency, when the membership functions are not triangular types. An example with three datasets is provided for verifying the proposed approach.

Performance Analysis of Quantum Cascaded Lasers

Improving the performance of the QCL through block diagram as well as mathematical models is the main scope of this paper. In order to enhance the performance of the underlined device, the mathematical model parameters are used in a reliable manner in such a way that the optimum behavior was achieved. These parameters play the central role in specifying the optical characteristics of the considered laser source. Moreover, it is important to have a large amount of radiated power, where increasing the amount of radiated power represents the main hopping process that can be predicted from the behavior of quantum laser devices. It was found that there is a good agreement between the calculated values from our mathematical model and those obtained with VisSim and experimental results. These demonstrate the strength of mplementation of both mathematical and block diagram models.

An Approach for Integration of Industrial Robot with Vision System and Simulation Software

Utilization of various sensors has made it possible to extend capabilities of industrial robots. Among these are vision sensors that are used for providing visual information to assist robot controllers. This paper presents a method of integrating a vision system and a simulation program with an industrial robot. The vision system is employed to detect a target object and compute its location in the robot environment. Then, the target object-s information is sent to the robot controller via parallel communication port. The robot controller uses the extracted object information and the simulation program to control the robot arm for approaching, grasping and relocating the object. This paper presents technical details of system components and describes the methodology used for this integration. It also provides a case study to prove the validity of the methodology developed.

Augmentation Opportunity of Transmission Control Protocol Performance in Wireless Networks and Cellular Systems

The advancement in wireless technology with the wide use of mobile devices have drawn the attention of the research and technological communities towards wireless environments, such as Wireless Local Area Networks (WLANs), Wireless Wide Area Networks (WWANs), and mobile systems and ad-hoc networks. Unfortunately, wired and wireless networks are expressively different in terms of link reliability, bandwidth, and time of propagation delay and by adapting new solutions for these enhanced telecommunications, superior quality, efficiency, and opportunities will be provided where wireless communications were otherwise unfeasible. Some researchers define 4G as a significant improvement of 3G, where current cellular network’s issues will be solved and data transfer will play a more significant role. For others, 4G unifies cellular and wireless local area networks, and introduces new routing techniques, efficient solutions for sharing dedicated frequency bands, and an increased mobility and bandwidth capacity. This paper discusses the possible solutions and enhancements probabilities that proposed to improve the performance of Transmission Control Protocol (TCP) over different wireless networks and also the paper investigated each approach in term of advantages and disadvantages.

Moment Invariants in Image Analysis

This paper aims to present a survey of object recognition/classification methods based on image moments. We review various types of moments (geometric moments, complex moments) and moment-based invariants with respect to various image degradations and distortions (rotation, scaling, affine transform, image blurring, etc.) which can be used as shape descriptors for classification. We explain a general theory how to construct these invariants and show also a few of them in explicit forms. We review efficient numerical algorithms that can be used for moment computation and demonstrate practical examples of using moment invariants in real applications.

Methods for Case Maintenance in Case-Based Reasoning

Case-Based Reasoning (CBR) is one of machine learning algorithms for problem solving and learning that caught a lot of attention over the last few years. In general, CBR is composed of four main phases: retrieve the most similar case or cases, reuse the case to solve the problem, revise or adapt the proposed solution, and retain the learned cases before returning them to the case base for learning purpose. Unfortunately, in many cases, this retain process causes the uncontrolled case base growth. The problem affects competence and performance of CBR systems. This paper proposes competence-based maintenance method based on deletion policy strategy for CBR. There are three main steps in this method. Step 1, formulate problems. Step 2, determine coverage and reachability set based on coverage value. Step 3, reduce case base size. The results obtained show that this proposed method performs better than the existing methods currently discussed in literature.

A Comparative Study of Fine Grained Security Techniques Based on Data Accessibility and Inference

This paper analyzes different techniques of the fine grained security of relational databases for the two variables-data accessibility and inference. Data accessibility measures the amount of data available to the users after applying a security technique on a table. Inference is the proportion of information leakage after suppressing a cell containing secret data. A row containing a secret cell which is suppressed can become a security threat if an intruder generates useful information from the related visible information of the same row. This paper measures data accessibility and inference associated with row, cell, and column level security techniques. Cell level security offers greatest data accessibility as it suppresses secret data only. But on the other hand, there is a high probability of inference in cell level security. Row and column level security techniques have least data accessibility and inference. This paper introduces cell plus innocent security technique that utilizes the cell level security method but suppresses some innocent data to dodge an intruder that a suppressed cell may not necessarily contain secret data. Four variations of the technique namely cell plus innocent 1/4, cell plus innocent 2/4, cell plus innocent 3/4, and cell plus innocent 4/4 respectively have been introduced to suppress innocent data equal to 1/4, 2/4, 3/4, and 4/4 percent of the true secret data inside the database. Results show that the new technique offers better control over data accessibility and inference as compared to the state-of-theart security techniques. This paper further discusses the combination of techniques together to be used. The paper shows that cell plus innocent 1/4, 2/4, and 3/4 techniques can be used as a replacement for the cell level security.

Recognition of Isolated Handwritten Latin Characters using One Continuous Route of Freeman Chain Code Representation and Feedforward Neural Network Classifier

In a handwriting recognition problem, characters can be represented using chain codes. The main problem in representing characters using chain code is optimizing the length of the chain code. This paper proposes to use randomized algorithm to minimize the length of Freeman Chain Codes (FCC) generated from isolated handwritten characters. Feedforward neural network is used in the classification stage to recognize the image characters. Our test results show that by applying the proposed model, we reached a relatively high accuracy for the problem of isolated handwritten when tested on NIST database.