Abstract: I/O workload is a critical and important factor to
analyze I/O pattern and to maximize file system performance.
However to measure I/O workload on running distributed parallel file
system is non-trivial due to collection overhead and large volume of
data. In this paper, we measured and analyzed file system activities on
two large-scale cluster systems which had TFlops level high
performance computation resources. By comparing file system
activities of 2009 with those of 2006, we analyzed the change of I/O
workloads by the development of system performance and high-speed
network technology.
Abstract: Spatial and mobile computing evolves. This paper
describes a smart modeling platform called “GeoSEMA". This
approach tends to model multidimensional GeoSpatial Evolutionary
and Mobile Agents. Instead of 3D and location-based issues, there
are some other dimensions that may characterize spatial agents, e.g.
discrete-continuous time, agent behaviors. GeoSEMA is seen as a
devoted design pattern motivating temporal geographic-based
applications; it is a firm foundation for multipurpose and
multidimensional special-based applications. It deals with
multipurpose smart objects (buildings, shapes, missiles, etc.) by
stimulating geospatial agents.
Formally, GeoSEMA refers to geospatial, spatio-evolutive and
mobile space constituents where a conceptual geospatial space model
is given in this paper. In addition to modeling and categorizing
geospatial agents, the model incorporates the concept of inter-agents
event-based protocols. Finally, a rapid software-architecture
prototyping GeoSEMA platform is also given. It will be
implemented/ validated in the next phase of our work.
Abstract: The problem of frequent pattern discovery is defined
as the process of searching for patterns such as sets of features or items that appear in data frequently. Finding such frequent patterns
has become an important data mining task because it reveals associations, correlations, and many other interesting relationships
hidden in a database. Most of the proposed frequent pattern mining
algorithms have been implemented with imperative programming
languages. Such paradigm is inefficient when set of patterns is large
and the frequent pattern is long. We suggest a high-level declarative
style of programming apply to the problem of frequent pattern
discovery. We consider two languages: Haskell and Prolog. Our
intuitive idea is that the problem of finding frequent patterns should
be efficiently and concisely implemented via a declarative paradigm
since pattern matching is a fundamental feature supported by most
functional languages and Prolog. Our frequent pattern mining
implementation using the Haskell and Prolog languages confirms our
hypothesis about conciseness of the program. The comparative
performance studies on line-of-code, speed and memory usage of
declarative versus imperative programming have been reported in the
paper.
Abstract: Nowadays, computer worms, viruses and Trojan horse
become popular, and they are collectively called malware. Those
malware just spoiled computers by deleting or rewriting important
files a decade ago. However, recent malware seems to be born to earn
money. Some of malware work for collecting personal information so
that malicious people can find secret information such as password for
online banking, evidence for a scandal or contact address which relates
with the target. Moreover, relation between money and malware
becomes more complex. Many kinds of malware bear bots to get
springboards. Meanwhile, for ordinary internet users,
countermeasures against malware come up against a blank wall.
Pattern matching becomes too much waste of computer resources,
since matching tools have to deal with a lot of patterns derived from
subspecies. Virus making tools can automatically bear subspecies of
malware. Moreover, metamorphic and polymorphic malware are no
longer special. Recently there appears malware checking sites that
check contents in place of users' PC. However, there appears a new
type of malicious sites that avoids check by malware checking sites. In
this paper, existing protocols and methods related with the web are
reconsidered in terms of protection from current attacks, and new
protocol and method are indicated for the purpose of security of the
web.
Abstract: Independent spanning trees (ISTs) provide a number of advantages in data broadcasting. One can cite the use in fault tolerance network protocols for distributed computing and bandwidth. However, the problem of constructing multiple ISTs is considered hard for arbitrary graphs. In this paper we present an efficient algorithm to construct ISTs on hypercubes that requires minimum resources to be performed.
Abstract: Generator of hypotheses is a new method for data mining. It makes possible to classify the source data automatically and produces a particular enumeration of patterns. Pattern is an expression (in a certain language) describing facts in a subset of facts. The goal is to describe the source data via patterns and/or IF...THEN rules. Used evaluation criteria are deterministic (not probabilistic). The search results are trees - form that is easy to comprehend and interpret. Generator of hypotheses uses very effective algorithm based on the theory of monotone systems (MS) named MONSA (MONotone System Algorithm).
Abstract: The mathematical framework for studying of a fuzzy approximate reasoning is presented in this paper. Two important defuzzification methods (Area defuzzification and Height defuzzification) besides the center of gravity method which is the best well known defuzzification method are described. The continuity of the defuzzification methods and its application to a fuzzy feedback control are discussed.
Abstract: Quantitative Investigation of impact of the factors' contribution towards measuring the reusability of software components could be helpful in evaluating the quality of developed or developing reusable software components and in identification of reusable component from existing legacy systems; that can save cost of developing the software from scratch. But the issue of the relative significance of contributing factors has remained relatively unexplored. In this paper, we have use the Taguchi's approach in analyzing the significance of different structural attributes or factors in deciding the reusability level of a particular component. The results obtained shows that the complexity is the most important factor in deciding the better Reusability of a function oriented Software. In case of Object Oriented Software, Coupling and Complexity collectively play significant role in high reusability.
Abstract: Given a parallel program to be executed on a heterogeneous
computing system, the overall execution time of the program
is determined by a schedule. In this paper, we analyze the worst-case
performance of the list scheduling algorithm for scheduling tasks
of a parallel program in a mixed-machine heterogeneous computing
system such that the total execution time of the program is minimized.
We prove tight lower and upper bounds for the worst-case
performance ratio of the list scheduling algorithm. We also examine
the average-case performance of the list scheduling algorithm. Our
experimental data reveal that the average-case performance of the list
scheduling algorithm is much better than the worst-case performance
and is very close to optimal, except for large systems with large
heterogeneity. Thus, the list scheduling algorithm is very useful in
real applications.
Abstract: Encrypted messages sending frequently draws the attention
of third parties, perhaps causing attempts to break and
reveal the original messages. Steganography is introduced to hide
the existence of the communication by concealing a secret message
in an appropriate carrier like text, image, audio or video. Quantum
steganography where the sender (Alice) embeds her steganographic
information into the cover and sends it to the receiver (Bob) over a
communication channel. Alice and Bob share an algorithm and hide
quantum information in the cover. An eavesdropper (Eve) without
access to the algorithm can-t find out the existence of the quantum
message. In this paper, a text quantum steganography technique based
on the use of indefinite articles (a) or (an) in conjunction with the nonspecific
or non-particular nouns in English language and quantum
gate truth table have been proposed. The authors also introduced a
new code representation technique (SSCE - Secret Steganography
Code for Embedding) at both ends in order to achieve high level of
security. Before the embedding operation each character of the secret
message has been converted to SSCE Value and then embeds to cover
text. Finally stego text is formed and transmits to the receiver side.
At the receiver side different reverse operation has been carried out
to get back the original information.
Abstract: Cryptography provides the secure manner of
information transmission over the insecure channel. It authenticates
messages based on the key but not on the user. It requires a lengthy
key to encrypt and decrypt the sending and receiving the messages,
respectively. But these keys can be guessed or cracked. Moreover,
Maintaining and sharing lengthy, random keys in enciphering and
deciphering process is the critical problem in the cryptography
system. A new approach is described for generating a crypto key,
which is acquired from a person-s iris pattern. In the biometric field,
template created by the biometric algorithm can only be
authenticated with the same person. Among the biometric templates,
iris features can efficiently be distinguished with individuals and
produces less false positives in the larger population. This type of iris
code distribution provides merely less intra-class variability that aids
the cryptosystem to confidently decrypt messages with an exact
matching of iris pattern. In this proposed approach, the iris features
are extracted using multi resolution wavelets. It produces 135-bit iris
codes from each subject and is used for encrypting/decrypting the
messages. The autocorrelators are used to recall original messages
from the partially corrupted data produced by the decryption process.
It intends to resolve the repudiation and key management problems.
Results were analyzed in both conventional iris cryptography system
(CIC) and non-repudiation iris cryptography system (NRIC). It
shows that this new approach provides considerably high
authentication in enciphering and deciphering processes.
Abstract: The convergence of heterogeneous wireless access technologies characterizes the 4G wireless networks. In such converged systems, the seamless and efficient handoff between
different access technologies (vertical handoff) is essential and remains a challenging problem. The heterogeneous co-existence of access technologies with largely different characteristics creates a decision problem of determining the “best" available network at
“best" time to reduce the unnecessary handoffs. This paper proposes a dynamic decision model to decide the “best" network at “best"
time moment to handoffs. The proposed dynamic decision model make the right vertical handoff decisions by determining the “best"
network at “best" time among available networks based on, dynamic
factors such as “Received Signal Strength(RSS)" of network and
“velocity" of mobile station simultaneously with static factors like Usage Expense, Link capacity(offered bandwidth) and power
consumption. This model not only meets the individual user needs but also improve the whole system performance by reducing the unnecessary handoffs.
Abstract: The vast amount of information on the World Wide
Web is created and published by many different types of providers.
Unlike books and journals, most of this information is not subject to
editing or peer review by experts. This lack of quality control and the
explosion of web sites make the task of finding quality information
on the web especially critical. Meanwhile new facilities for
producing web pages such as Blogs make this issue more significant
because Blogs have simple content management tools enabling nonexperts
to build easily updatable web diaries or online journals. On
the other hand despite a decade of active research in information
quality (IQ) there is no framework for measuring information quality
on the Blogs yet. This paper presents a novel experimental
framework for ranking quality of information on the Weblog. The
results of data analysis revealed seven IQ dimensions for the Weblog.
For each dimension, variables and related coefficients were
calculated so that presented framework is able to assess IQ of
Weblogs automatically.
Abstract: This article presents the evolution and technological changes implemented on the full scale simulators developed by the Simulation Department of the Instituto de Investigaciones Eléctricas1 (Mexican Electric Research Institute) and located at different training centers around the Mexican territory, and allows US to know the last updates, basically from the input/output view point, of the current simulators at some facilities of the electrical sector as well as the compatible industry of the electrical manufactures and industries such as Comision Federal de Electricidad (CFE*, The utility Mexican company). Tendencies of these developments and impact within the operators- scope are also presented.
Abstract: Tumor classification is a key area of research in the
field of bioinformatics. Microarray technology is commonly used in
the study of disease diagnosis using gene expression levels. The
main drawback of gene expression data is that it contains thousands
of genes and a very few samples. Feature selection methods are used
to select the informative genes from the microarray. These methods
considerably improve the classification accuracy. In the proposed
method, Genetic Algorithm (GA) is used for effective feature
selection. Informative genes are identified based on the T-Statistics,
Signal-to-Noise Ratio (SNR) and F-Test values. The initial candidate
solutions of GA are obtained from top-m informative genes. The
classification accuracy of k-Nearest Neighbor (kNN) method is used
as the fitness function for GA. In this work, kNN and Support Vector
Machine (SVM) are used as the classifiers. The experimental results
show that the proposed work is suitable for effective feature
selection. With the help of the selected genes, GA-kNN method
achieves 100% accuracy in 4 datasets and GA-SVM method
achieves in 5 out of 10 datasets. The GA with kNN and SVM
methods are demonstrated to be an accurate method for microarray
based tumor classification.
Abstract: COSMED K4b2 is a portable electrical device designed to test pulmonary functions. It is ideal for many applications that need the measurement of the cardio-respiratory response either in the field or in the lab is capable with the capability to delivery real time data to a sink node or a PC base station with storing data in the memory at the same time. But the actual sensor outputs and data received may contain some errors, such as impulsive noise which can be related to sensors, low batteries, environment or disturbance in data acquisition process. These abnormal outputs might cause misinterpretations of exercise or living activities to persons being monitored. In our paper we propose an effective and feasible method to detect and identify errors in applications by principal component analysis (PCA) and a back propagation (BP) neural network.
Abstract: Petri Net being one of the most useful graphical tools for modelling complex asynchronous systems, we have used Petri Net to model multi-track railway level crossing system. The roadway has been augmented with four half-size barriers. For better control, a three stage control mechanism has been introduced to ensure that no road-vehicle is trapped on the level crossing. Timed Petri Net is used to include the temporal nature of the signalling system. Safeness analysis has also been included in the discussion section.
Abstract: Database management systems that integrate user preferences promise better solution for personalization, greater flexibility and higher quality of query responses. This paper presents a tentative work that studies and investigates approaches to express user preferences in queries. We sketch an extend capabilities of SQLf language that uses the fuzzy set theory in order to define the user preferences. For that, two essential points are considered: the first concerns the expression of user preferences in SQLf by so-called fuzzy commensurable predicates set. The second concerns the bipolar way in which these user preferences are expressed on mandatory and/or optional preferences.
Abstract: Simulation is a very powerful method used for highperformance
and high-quality design in distributed system, and now
maybe the only one, considering the heterogeneity, complexity and
cost of distributed systems. In Grid environments, foe example, it is
hard and even impossible to perform scheduler performance
evaluation in a repeatable and controllable manner as resources and
users are distributed across multiple organizations with their own
policies. In addition, Grid test-beds are limited and creating an
adequately-sized test-bed is expensive and time consuming.
Scalability, reliability and fault-tolerance become important
requirements for distributed systems in order to support distributed
computation. A distributed system with such characteristics is called
dependable. Large environments, like Cloud, offer unique
advantages, such as low cost, dependability and satisfy QoS for all
users. Resource management in large environments address
performant scheduling algorithm guided by QoS constrains. This
paper presents the performance evaluation of scheduling heuristics
guided by different optimization criteria. The algorithms for
distributed scheduling are analyzed in order to satisfy users
constrains considering in the same time independent capabilities of
resources. This analysis acts like a profiling step for algorithm
calibration. The performance evaluation is based on simulation. The
simulator is MONARC, a powerful tool for large scale distributed
systems simulation. The novelty of this paper consists in synthetic
analysis results that offer guidelines for scheduler service
configuration and sustain the empirical-based decision. The results
could be used in decisions regarding optimizations to existing Grid
DAG Scheduling and for selecting the proper algorithm for DAG
scheduling in various actual situations.
Abstract: One astonishing capability of humans is to recognize thousands of different objects visually, and to learn the semantic association between those objects and words referring to them. This work is an attempt to build a computational model of such capacity,simulating the process by which infants learn how to recognize objects and words through exposure to visual stimuli and vocal sounds.One of the main fact shaping the brain of a newborn is that lights and colors come from entities of the world. Gradually the visual system learn which light sensations belong to same entities, despite large changes in appearance. This experience is common between humans and several other mammals, like non-human primates. But humans only can recognize a huge variety of objects, most manufactured by himself, and make use of sounds to identify and categorize them. The aim of this model is to reproduce these processes in a biologically plausible way, by reconstructing the essential hierarchy of cortical circuits on the visual and auditory neural paths.