Abstract: E-mail has become an important means of electronic
communication, but its viability is marred by Unsolicited
Bulk E-mail (UBE) messages. UBE takes many forms, such as
pornographic, virus-infected and 'cry-for-help' messages, as well
as fake and fraudulent offers for jobs, winnings and medicines. UBE
poses technical and socio-economic challenges to the use of e-mail.
To meet this challenge and combat this menace, we need to
understand UBE. Towards this end, this paper presents a
content-based textual analysis of more than 2700 body-enhancement
medicinal UBE messages. Technically, this is an application of text
parsing and tokenization to unstructured textual documents, and we
approach it using Bag of Words (BOW) and Vector Space Document
Model techniques. We have attempted to identify the most
frequently occurring lexis in the UBE documents that advertise
various products for body enhancement. An analysis of the top 100
such lexis is also presented. We exhibit the relationship between the
occurrence of a word from the identified lexis set in a given UBE
and the probability that the UBE advertises a
fake medicinal product. To the best of our knowledge and survey of
related literature, this is the first formal attempt to identify the
most frequently occurring lexis in such UBE by textual analysis.
Finally, this is a sincere attempt to raise alertness against, and
mitigate the threat of, such luring but fake UBE.
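The bag-of-words and vector-space steps described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the tokenizer, the function names (`tokenize`, `top_lexis`, `to_vector`) and the toy corpus are all assumptions made for the example.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase a message body and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())

def top_lexis(documents, k):
    """Return the k most frequent tokens across all documents."""
    counts = Counter()
    for doc in documents:
        counts.update(tokenize(doc))
    return [word for word, _ in counts.most_common(k)]

def to_vector(document, vocabulary):
    """Map a document to term counts over a fixed vocabulary (vector space model)."""
    counts = Counter(tokenize(document))
    return [counts[word] for word in vocabulary]

# Hypothetical toy corpus, not drawn from the paper's data set.
corpus = [
    "Cheap pills, best pills online",
    "Order pills online today",
    "Best price today",
]
vocabulary = top_lexis(corpus, 3)
vectors = [to_vector(doc, vocabulary) for doc in corpus]
```

Each document becomes a fixed-length count vector over the selected lexis, which is the form a later statistical analysis would consume.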
Abstract: This paper presents two novel techniques for skew
estimation of binary document images. The algorithms are based on
connected component analysis and the Hough transform. Both
methods focus on reducing the amount of input data provided to
the Hough transform. In the first method, referred to as the word centroid
approach, the centroids of selected words are used for skew detection.
In the second method, referred to as the dilate & thin approach, the selected
characters are blocked and dilated to form word blocks, and
thinning is then applied. The final image fed to the Hough transform contains the
thinned coordinates of the word blocks in the image. The methods have
been successful in reducing the computational complexity of Hough
transform based skew estimation algorithms. Promising experimental
results are also provided to demonstrate the effectiveness of the proposed
methods.
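The idea of feeding only a small point set (such as word centroids) to the Hough transform can be sketched as below. This is a simplified illustration under stated assumptions, not the paper's algorithm: the function name `estimate_skew`, the restricted angle range, the step sizes and the synthetic centroids are all hypothetical, and text lines are assumed to follow y = y0 + x·tan(skew) in the coordinate system of the points.

```python
import math
from collections import Counter

def estimate_skew(points, angle_range=15.0, angle_step=0.5, rho_step=2.0):
    """Estimate skew in degrees from a small set of (x, y) points via Hough voting."""
    best_angle, best_votes = 0.0, -1
    steps = int(round(2 * angle_range / angle_step))
    for i in range(steps + 1):
        skew = -angle_range + i * angle_step
        # A text line tilted by `skew` has its normal at skew + 90 degrees.
        theta = math.radians(skew + 90.0)
        cos_t, sin_t = math.cos(theta), math.sin(theta)
        # Vote: each point falls into one rho bin for this angle.
        votes = Counter(round((x * cos_t + y * sin_t) / rho_step)
                        for x, y in points)
        peak = max(votes.values())
        if peak > best_votes:
            best_angle, best_votes = skew, peak
    return best_angle

# Synthetic centroids on three text lines skewed by 5 degrees (made-up data).
slope = math.tan(math.radians(5.0))
centroids = [(x, y0 + x * slope)
             for y0 in (50, 100, 150)
             for x in range(0, 401, 40)]
skew_estimate = estimate_skew(centroids)
```

Because only a few dozen centroids vote instead of every foreground pixel, the accumulator loop stays small, which is the kind of input reduction the abstract describes.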
Abstract: Load forecasting plays a paramount role in the operation and management of power systems. Accurate estimation of future power demand for various lead times facilitates the task of generating power reliably and economically. This paper studies long term load forecasting, that is, the forecasting of future loads for a relatively large lead time (months to a few years). Among the various techniques used in load forecasting, artificial intelligence techniques provide greater forecast accuracy than conventional techniques. Fuzzy logic, a very robust artificial intelligence technique, is used in this paper to forecast load on a long term basis. The paper gives a general algorithm for long term load forecasting. The algorithm extends a short term load forecasting method to long term load forecasting and concentrates not only on the forecast values of load but also on the errors incorporated into the forecast. By correcting the errors in the forecast, forecasts with very high accuracy have been achieved. The algorithm is demonstrated with data collected for the residential sector (LT2 (a) type load: domestic consumers). Load is determined for three consecutive years (from April-06 to March-09) in order to demonstrate the efficiency of the algorithm and to forecast for the next two years (from April-09 to March-11).
Abstract: Data Warehouses (DWs) are repositories that contain the unified history of an enterprise for decision support. Data must be Extracted from information sources, Transformed and integrated, and then Loaded (ETL) into the DW using ETL tools. These tools focus on data movement, and models are used only as a means to this end. From a conceptual viewpoint, the authors aim to innovate the ETL process in two ways: 1) by making compatibility between models explicit in a declarative fashion, using correspondence assertions, and 2) by identifying the instances of different sources that represent the same real-world entity. This paper presents an overview of the proposed framework for modelling the ETL process, which is based on the use of a reference model and perspective schemata. This approach gives the designer a better understanding of the semantics associated with the ETL process.
Abstract: The paper investigates the feasibility of constructing a software multi-agent based monitoring and classification system and using it to provide automated and accurate classification of end users developing applications in the spreadsheet domain. The agents function autonomously to provide continuous and periodic monitoring of Excel spreadsheet workbooks. The result is the MultiAgent Classification System (MACS), which complies with the specifications of the Foundation for Intelligent Physical Agents (FIPA). Several technologies have been brought together to build MACS. The strength of the system is the integration of agent technology and the FIPA specifications with other technologies: Windows Communication Foundation (WCF) services, Service Oriented Architecture (SOA), and Oracle Data Mining (ODM). Microsoft .NET Windows service based agents were used to develop the monitoring agents of MACS, while .NET WCF services together with the SOA approach allowed distribution of, and communication between, agents over the WWW, in order to support monitoring and classification of multiple developers. ODM was used to automate the classification phase of MACS.
Abstract: In this paper, we propose a hybrid machine learning
system based on Genetic Algorithm (GA) and Support Vector
Machines (SVM) for stock market prediction. A variety of indicators
from the technical analysis field of study are used as input features.
We also make use of the correlation between stock prices of different
companies to forecast the price of a stock, making use of technical
indicators of highly correlated stocks, not only the stock to be
predicted. The genetic algorithm is used to select the set of most
informative input features from among all the technical indicators.
The results show that the hybrid GA-SVM system outperforms the
standalone SVM system.
Abstract: In this paper, we propose a new framework that incorporates an intelligent agent software robot into a crisis communication portal (CCNet) in order to send alert news to subscribed users via email and mobile services such as Short Message Service (SMS), Multimedia Messaging Service (MMS) and General Packet Radio Service (GPRS). The content on the mobile services can be delivered through either a mobile phone or a Personal Digital Assistant (PDA). This research shows that, with the proposed framework and its multilayer architecture, the embodied conversational agent system can handle questions intelligently. At the same time, the extended framework can deliver content through a more humanoid interface on mobile devices.
Abstract: In the last decade digital watermarking procedures have
been increasingly applied to implement the copyright protection
of multimedia digital contents distributed on the Internet. To this
end, it is worth noting that many watermarking procedures
for images and videos proposed in the literature are based on spread
spectrum techniques. However, some scepticism about the robustness
and security of such watermarking procedures has arisen because
of some documented attacks which claim to render the inserted
watermarks undetectable. On the other hand, web content providers
wish to exploit watermarking procedures characterized by flexible and
efficient implementations and which can be easily integrated in their
existing web services frameworks or platforms. This paper presents
how a simple spread spectrum watermarking procedure for MPEG-2
videos can be modified to be exploited in web contexts. To this end,
the proposed procedure has been made secure and robust against some
well-known and dangerous attacks. Furthermore, its basic scheme
has been optimized by making the insertion procedure adaptive with
respect to the terminals used to open the videos and the network transactions
carried out to deliver them to buyers. Finally, two different
implementations of the procedure have been developed: the former
is a high performance parallel implementation, whereas the latter is
a portable Java and XML based implementation. Thus, the paper
demonstrates that a simple spread spectrum watermarking procedure,
with limited and appropriate modifications to the embedding scheme,
can still represent a valid alternative to many other well-known and
more recent watermarking procedures proposed in the literature.
Abstract: In mobile computing environments, many new problems
arise that do not exist in distributed systems consisting of
stationary hosts, owing to host mobility, sudden
disconnection by handoff in wireless networks, voluntary
disconnection for efficient power consumption of a mobile host, etc.
To solve these problems, we propose the architecture of a Partial
Connection Manager (PCM) in this paper. PCM creates a limited
number of mobile agents according to priority, sends them in parallel
to servers, and combines the results to process the user request rapidly.
In applying the proposed PCM to the mobile market agent service, we
find that the mobile agent technique is well suited to the
mobile computing environment and to managing the partial connection
problem.
Abstract: In this paper, a model for an information retrieval
system is proposed which takes into account that knowledge about
documents and the information needs of users are dynamic. Two
methods are combined, one qualitative or symbolic and the other
quantitative or numeric, which are deemed suitable for many
clustering contexts, data analysis, concept exploring and
knowledge discovery. These two methods may be classified as
inductive learning techniques. In this model, they are introduced to
build “long term” knowledge about past queries and concepts in a
collection of documents. The “long term” knowledge can guide
and assist the user to formulate an initial query and can be
exploited in the process of retrieving relevant information. The
different kinds of knowledge are organized according to different points of
view. This may be considered an enrichment of the exploration
level which is coherent with the concept of document/query
structure.
Abstract: This paper presents an effective method for detecting traffic
lights at night. First, candidate blobs of traffic lights are
extracted from an RGB color image. The input image is represented in the
dominant color domain using the color transform proposed by Ruta,
and red and green dominant regions are then selected as candidates.
After candidate blob selection, we apply a shape filter for noise
reduction using blob information such as length, area, and area of the
bounding box. A multi-class classifier based on SVM (Support
Vector Machine) is then applied to the candidates. Three kinds of features
are used. We use basic features such as blob width, height, center
coordinate, area, and area of blob. Brightness-based stochastic features are also
used. In particular, geometry-based moment values between the
candidate region and its adjacent region are proposed and used to improve
the detection performance. The proposed system is implemented on an
Intel Core CPU at 2.80 GHz with 4 GB RAM and tested with
urban and rural road videos. Through the test, we show that the
proposed method using PF, BMF, and GMF reaches a detection rate of up to
93% with an average computation time of 15 ms/frame.
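A shape filter of the kind mentioned in this abstract might look like the sketch below. The abstract does not give the actual criteria, so the `Blob` fields, the function name `passes_shape_filter` and every threshold here are hypothetical values chosen only to illustrate filtering on area, aspect ratio and bounding-box fill.

```python
from collections import namedtuple

# Hypothetical blob record: bounding-box width/height and pixel count.
Blob = namedtuple("Blob", ["width", "height", "area"])

def passes_shape_filter(blob, min_area=9, max_area=900,
                        max_aspect=1.5, min_fill=0.5):
    """Keep roughly round, compact blobs; all thresholds are illustrative guesses."""
    if blob.width <= 0 or blob.height <= 0:
        return False
    if not (min_area <= blob.area <= max_area):
        return False
    # Aspect ratio of the bounding box (1.0 means square).
    aspect = max(blob.width, blob.height) / min(blob.width, blob.height)
    # Share of the bounding box covered by blob pixels.
    fill = blob.area / (blob.width * blob.height)
    return aspect <= max_aspect and fill >= min_fill

candidates = [
    Blob(10, 10, 78),   # compact and round: plausible lamp
    Blob(40, 5, 150),   # elongated streak
    Blob(10, 10, 20),   # sparse noise inside a large box
]
kept = [b for b in candidates if passes_shape_filter(b)]
```

Only the blobs that survive this cheap test would go on to the more expensive SVM classification stage.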
Abstract: OpenMP is an API for parallel programming on shared memory multiprocessors. Novice OpenMP programmers often write code containing human errors that the compiler cannot find. We investigated how the compiler copes with common mistakes that can occur in OpenMP code. The latest version (4.4.3) of GCC was used for this research. GCC compiled the erroneous code without any errors or warnings. This paper presents a programming aid tool for OpenMP programs. It can check for 12 common mistakes that novice programmers can make when writing OpenMP programs. We demonstrate that the programming aid tool can detect various common mistakes that GCC failed to detect.
Abstract: Fuzzy random variables (FRVs) have been introduced as an imprecise concept of numeric values for characterizing imprecise knowledge. Descriptive parameters can be used to describe the primary features of a set of fuzzy random observations. In fuzzy environments, expected values are usually represented as fuzzy-valued, interval-valued or numeric-valued descriptive parameters using various metrics. Instead of the area metric usually adopted in the relevant studies, this study proposes a numeric expected value based on a distance metric and on the two characteristics of FRVs (fuzziness and randomness). Compared with existing measures, the results show that the proposed numeric expected value is the same as those obtained with the other metrics if only triangular membership functions are used. However, the proposed approach has the advantages of intuitiveness and computational efficiency when the membership functions are not triangular. An example with three datasets is provided to verify the proposed approach.
Abstract: The main scope of this paper is improving the performance of the QCL through block diagram and mathematical models. To enhance the performance of the device under consideration, the parameters of the mathematical model are chosen so that optimum behavior is achieved. These parameters play the central role in specifying the optical characteristics of the considered laser source. Moreover, it is important to have a large amount of radiated power, where increasing the amount of radiated power represents the main hopping process that can be predicted from the behavior of quantum laser devices. Good agreement was found between the values calculated from our mathematical model and those obtained with VisSim and from experimental results. This demonstrates the strength of the implementation of both the mathematical and block diagram models.
Abstract: Utilization of various sensors has made it possible to
extend capabilities of industrial robots. Among these are vision
sensors that are used for providing visual information to assist robot
controllers. This paper presents a method of integrating a vision
system and a simulation program with an industrial robot. The vision
system is employed to detect a target object and compute its location
in the robot environment. Then, the target object's information is sent
to the robot controller via a parallel communication port. The robot
controller uses the extracted object information and the simulation
program to control the robot arm for approaching, grasping and
relocating the object. This paper presents technical details of system
components and describes the methodology used for this integration.
It also provides a case study to prove the validity of the methodology
developed.
Abstract: Advances in wireless technology and the wide
use of mobile devices have drawn the attention of the research and
technological communities towards wireless environments, such as
Wireless Local Area Networks (WLANs), Wireless Wide Area
Networks (WWANs), mobile systems and ad-hoc networks.
Unfortunately, wired and wireless networks differ significantly
in terms of link reliability, bandwidth, and propagation delay.
By adopting new solutions for these enhanced
telecommunications, superior quality, efficiency, and opportunities
will be provided where wireless communications were otherwise
unfeasible. Some researchers define 4G as a significant improvement
of 3G, where current cellular networks' issues will be solved and data
transfer will play a more significant role. For others, 4G unifies
cellular and wireless local area networks, and introduces new routing
techniques, efficient solutions for sharing dedicated frequency bands,
and increased mobility and bandwidth capacity. This paper
discusses the possible solutions and enhancements that have been
proposed to improve the performance of the Transmission Control
Protocol (TCP) over different wireless networks, and
examines each approach in terms of its advantages and disadvantages.
Abstract: This paper aims to present a survey of object
recognition/classification methods based on image moments. We
review various types of moments (geometric moments, complex
moments) and moment-based invariants with respect to various
image degradations and distortions (rotation, scaling, affine
transform, image blurring, etc.) which can be used as shape
descriptors for classification. We explain a general theory of how to
construct these invariants and also show a few of them in explicit
form. We review efficient numerical algorithms that can be used
for moment computation and demonstrate practical examples of
using moment invariants in real applications.
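As a small worked illustration of geometric moments and a moment invariant, the sketch below computes raw moments, central moments, and the first Hu invariant (the sum of the normalized central moments of orders (2,0) and (0,2)) for a tiny binary image. The function names and the test shape are assumptions for the example; this is a direct, unoptimized evaluation rather than one of the efficient algorithms the survey reviews.

```python
def raw_moment(image, p, q):
    """Geometric moment m_pq = sum over pixels of x^p * y^q * f(x, y)."""
    return sum((x ** p) * (y ** q) * value
               for y, row in enumerate(image)
               for x, value in enumerate(row))

def central_moment(image, p, q):
    """Central moment mu_pq, taken about the image centroid (translation invariant)."""
    m00 = raw_moment(image, 0, 0)
    cx = raw_moment(image, 1, 0) / m00
    cy = raw_moment(image, 0, 1) / m00
    return sum(((x - cx) ** p) * ((y - cy) ** q) * value
               for y, row in enumerate(image)
               for x, value in enumerate(row))

def hu_phi1(image):
    """First Hu invariant: eta_20 + eta_02, with eta_pq = mu_pq / mu_00^(1 + (p+q)/2)."""
    mu00 = central_moment(image, 0, 0)
    eta20 = central_moment(image, 2, 0) / mu00 ** 2
    eta02 = central_moment(image, 0, 2) / mu00 ** 2
    return eta20 + eta02

# Hypothetical L-shaped binary image, its 90-degree rotation, and a shifted copy.
shape = [[1, 1, 0],
         [1, 0, 0],
         [1, 0, 0]]
rotated = [list(row) for row in zip(*shape[::-1])]
shifted = [[0, 0, 0, 0]] + [[0] + row for row in shape]

phi_original = hu_phi1(shape)
phi_rotated = hu_phi1(rotated)
phi_shifted = hu_phi1(shifted)
```

The three values agree up to floating-point error, which is what makes such invariants usable as shape descriptors for classification.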
Abstract: Case-Based Reasoning (CBR) is a machine
learning approach to problem solving and learning that has attracted a lot
of attention over the last few years. In general, CBR is composed of
four main phases: retrieve the most similar case or cases, reuse the
case to solve the problem, revise or adapt the proposed solution, and
retain the learned cases before returning them to the case base for
learning purposes. Unfortunately, in many cases, this retain process
causes uncontrolled case base growth, which affects the
competence and performance of CBR systems. This paper proposes a
competence-based maintenance method based on a deletion policy
strategy for CBR. There are three main steps in this method: Step 1,
formulate problems; Step 2, determine coverage and reachability sets
based on coverage value; Step 3, reduce case base size. The results
obtained show that the proposed method performs better than the
existing methods currently discussed in the literature.
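The coverage and reachability sets in Step 2 can be illustrated with a toy case base. The abstract does not define them, so this sketch assumes the common competence-model reading: the coverage set of a case is the set of cases it can solve, and the reachability set of a case is the set of cases that can solve it. The `solves` test (numeric cases within a fixed radius) and the greedy deletion rule are hypothetical stand-ins, not the paper's method.

```python
def solves(a, b, radius=1.0):
    """Hypothetical adaptation test: case a can solve case b if they are close."""
    return abs(a - b) <= radius

def coverage(case, base):
    """Cases in `base` that `case` can solve."""
    return {c for c in base if solves(case, c)}

def reachability(case, base):
    """Cases in `base` that can solve `case`."""
    return {c for c in base if solves(c, case)}

def reduce_case_base(base):
    """Greedily delete cases while every original case stays solvable."""
    kept = set(base)
    # Try to delete low-coverage cases first.
    for case in sorted(base, key=lambda c: (len(coverage(c, base)), c)):
        remaining = kept - {case}
        # Deleting is safe only if every original case is still reachable.
        if all(reachability(c, remaining) for c in base):
            kept = remaining
    return kept

case_base = [0.0, 0.5, 1.0, 5.0]   # made-up one-dimensional cases
reduced = reduce_case_base(case_base)
```

In this toy run the isolated case is kept because nothing else can solve it, while redundant neighbours are removed, which is the trade-off a deletion policy has to manage.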
Abstract: This paper analyzes different techniques for fine-grained security of relational databases with respect to two variables: data accessibility and inference. Data accessibility measures the amount of data available to users after a security technique is applied to a table. Inference is the proportion of information leakage after suppressing a cell containing secret data. A row containing a suppressed secret cell can become a security threat if an intruder derives useful information from the related visible information in the same row. This paper measures the data accessibility and inference associated with row, cell, and column level security techniques. Cell level security offers the greatest data accessibility, as it suppresses secret data only, but it also carries a high probability of inference. Row and column level security techniques have the least data accessibility and inference. This paper introduces the cell plus innocent security technique, which uses the cell level security method but also suppresses some innocent data, so that an intruder cannot assume that a suppressed cell necessarily contains secret data. Four variations of the technique, namely cell plus innocent 1/4, 2/4, 3/4, and 4/4, are introduced; they suppress innocent data equal to 1/4, 2/4, 3/4, and 4/4, respectively, of the true secret data inside the database. Results show that the new technique offers better control over data accessibility and inference than the state-of-the-art security techniques. The paper further discusses how the techniques can be combined, and shows that the cell plus innocent 1/4, 2/4, and 3/4 techniques can be used as a replacement for cell level security.
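The cell plus innocent idea can be sketched on a small in-memory table. This is an illustration under assumptions, not the paper's implementation: innocent cells are chosen uniformly at random (the abstract does not say how they are selected), suppression is represented by `None`, data accessibility is taken as the share of cells left visible, and the names `cell_plus_innocent` and `data_accessibility` are hypothetical.

```python
import random

def cell_plus_innocent(table, secret_cells, fraction, seed=0):
    """Suppress secret cells plus `fraction` times as many random innocent cells."""
    rng = random.Random(seed)
    all_cells = [(r, c) for r, row in enumerate(table) for c in range(len(row))]
    innocent = [cell for cell in all_cells if cell not in secret_cells]
    # Number of innocent cells to hide, capped by how many exist.
    extra = min(len(innocent), round(fraction * len(secret_cells)))
    suppressed = set(secret_cells) | set(rng.sample(innocent, extra))
    masked = [[None if (r, c) in suppressed else value
               for c, value in enumerate(row)]
              for r, row in enumerate(table)]
    return masked, suppressed

def data_accessibility(masked):
    """Share of cells still visible after suppression (assumed definition)."""
    cells = [value for row in masked for value in row]
    return sum(value is not None for value in cells) / len(cells)

# Made-up 4x4 table with one secret cell per row.
table = [[f"r{r}c{c}" for c in range(4)] for r in range(4)]
secrets = {(0, 3), (1, 3), (2, 3), (3, 3)}
masked, suppressed = cell_plus_innocent(table, secrets, fraction=2 / 4)
accessibility = data_accessibility(masked)
```

With four secret cells and the 2/4 variant, two innocent cells are hidden as well, so an observer who sees a blank cell cannot be sure it held secret data, at the cost of slightly lower accessibility than plain cell level security.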
Abstract: In a handwriting recognition problem, characters can
be represented using chain codes. The main problem in representing
characters using chain codes is optimizing the length of the chain
code. This paper proposes using a randomized algorithm to minimize
the length of Freeman Chain Codes (FCC) generated from isolated
handwritten characters. A feedforward neural network is used in the
classification stage to recognize the image characters. Our test results
show that by applying the proposed model, we reached a relatively
high accuracy for the problem of isolated handwritten character
recognition when tested on the NIST database.
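For readers unfamiliar with Freeman chain codes, the encoding step can be sketched as follows. Only the encoding of an already-traced path is shown; the randomized length minimization and the neural network classifier from the abstract are not reproduced. The direction numbering (code 0 pointing east, codes increasing counter-clockwise, y growing upwards) is one common convention and is an assumption here, as are the function name and the sample path.

```python
# Direction codes: assumed convention with 0 = east, increasing counter-clockwise,
# and y growing upwards.
DIRECTIONS = {(1, 0): 0, (1, 1): 1, (0, 1): 2, (-1, 1): 3,
              (-1, 0): 4, (-1, -1): 5, (0, -1): 6, (1, -1): 7}

def freeman_chain_code(path):
    """Encode a sequence of 8-connected (x, y) points as Freeman direction codes."""
    codes = []
    for (x0, y0), (x1, y1) in zip(path, path[1:]):
        step = (x1 - x0, y1 - y0)
        if step not in DIRECTIONS:
            raise ValueError(f"points {(x0, y0)} and {(x1, y1)} are not 8-neighbours")
        codes.append(DIRECTIONS[step])
    return codes

# Made-up stroke traced through five pixels.
stroke = [(0, 0), (1, 0), (2, 1), (2, 2), (1, 2)]
chain = freeman_chain_code(stroke)
```

The length of the resulting code equals the number of steps in the traced path, so the order in which a character's pixels are visited determines how long the code is, which is the quantity the paper seeks to minimize.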