Abstract: IPsec has become a standard information security
technology throughout the Internet community. It provides a well-defined
architecture that takes into account confidentiality, authentication,
integrity, secure key exchange, and protection against replay
attacks. For connectionless security services on a per-packet
basis, the IETF IPsec Working Group has standardized two extension
headers (AH and ESP) as well as key exchange and authentication protocols. It is
also working on a lightweight key exchange protocol and MIBs for
security management. IPsec technology has been implemented on
various platforms in IPv4 and IPv6, gradually replacing old
application-specific security mechanisms. IPv4 and IPv6 are not
directly compatible, so programs and systems designed for one
standard cannot communicate with those designed for the other. We
propose the design and implementation of a controlled Internet security
system, an IPsec-based Internet information security system for
IPv4/IPv6 networks, and we report performance
measurements. With features such as the improved scalability and
routing, security, ease-of-configuration, and higher performance of
IPv6, the controlled Internet security system provides consistent
security policy and integrated security management on IPsec-based
Internet security system.
Abstract: Clustering categorical data is more complicated than
clustering numerical data because of its special properties. Scalability
and memory constraints are the challenging problems in clustering large
data sets. This paper presents an incremental algorithm to cluster
categorical data. Frequencies of attribute values contribute much to
clustering similar categorical objects. In this paper we propose new
similarity measures based on the frequencies of attribute values and
their cardinalities. The proposed measures and the algorithm are
evaluated on data sets from the UCI data repository. Results
show that the proposed method generates better clusters than the
existing one.
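The abstract above does not define its similarity measures, so the following is only a minimal sketch of the general idea of frequency-based incremental clustering. The `Cluster` class, the averaged relative-frequency similarity, and the `threshold` parameter are all assumptions made for illustration and are not the measures proposed in the paper.

```python
from collections import Counter


class Cluster:
    """Keeps per-attribute value frequencies for the objects assigned so far."""

    def __init__(self, n_attrs):
        self.size = 0
        self.counts = [Counter() for _ in range(n_attrs)]

    def add(self, obj):
        self.size += 1
        for counter, value in zip(self.counts, obj):
            counter[value] += 1

    def similarity(self, obj):
        # Hypothetical measure: mean relative frequency of the object's
        # attribute values inside this cluster (1.0 means a perfect match).
        return sum(c[v] / self.size for c, v in zip(self.counts, obj)) / len(obj)


def incremental_cluster(objects, threshold):
    """Single pass: join the most similar cluster or open a new one."""
    clusters, labels = [], []
    for obj in objects:
        best, best_sim = None, -1.0
        for idx, cluster in enumerate(clusters):
            sim = cluster.similarity(obj)
            if sim > best_sim:
                best, best_sim = idx, sim
        if best is None or best_sim < threshold:
            clusters.append(Cluster(len(obj)))
            best = len(clusters) - 1
        clusters[best].add(obj)
        labels.append(best)
    return labels


data = [("red", "small"), ("red", "small"), ("blue", "large"),
        ("red", "medium"), ("blue", "large")]
labels = incremental_cluster(data, threshold=0.5)
```

Because each object is seen once and only the frequency counters are stored, memory grows with the number of clusters and distinct attribute values rather than with the number of objects, which is the property an incremental scheme relies on.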
Abstract: The I/O workload is a critical factor in
analyzing I/O patterns and file system performance. However, tracing I/O
operations on the fly in a distributed parallel file system is non-trivial due
to collection overhead and the large volume of data. In this paper, we
design and implement a parallel file system logging method for high
performance computing using a shared memory-based multi-layer
scheme. It minimizes overhead by reducing the response time of logging
operations and provides an efficient post-processing scheme through
shared memory. A separate logging server can collect sequential logs
from multiple clients in a cluster through packet communication.
Implementation and evaluation results show the low overhead and high
scalability of this architecture for high performance parallel logging
analysis.
Abstract: This work deals with aspects of support vector learning for large-scale data mining tasks. Based on a decomposition algorithm that can be run in serial and parallel mode, we introduce a data transformation that allows an expensive generalized kernel to be used without additional cost. In order to speed up the decomposition algorithm, we analyze the problem of working set selection for large data sets and the influence of the working set size on the scalability of the parallel decomposition scheme. Our modifications and settings improve support vector learning performance and thus allow extensive parameter search methods to be used to optimize classification accuracy.
Abstract: This paper presents Qmulus, a cloud-based GPS
model. Qmulus is designed to compute the best possible route, the one that
leads the driver to the specified destination in the shortest time
while taking into account real-time constraints. Intelligence
incorporated into Qmulus's design makes it capable of generating and
assigning priorities to a list of optimal routes through customizable
dynamic updates. The goal of this design is to minimize travel and
cost overheads, maintain reliability and consistency, and provide
scalability and flexibility. The proposed model focuses on
narrowing the gap between a client application and a cloud
service so as to render seamless operations. Qmulus's system
model is closely integrated, and its concept has the potential to be
extended into several other integrated applications, making it capable
of adapting to different media and resources.
Abstract: The demand for autonomous resource
management for distributed systems has increased in recent
years. Distributed systems require an efficient and powerful
communication mechanism between applications running on
different hosts and networks. The use of mobile agent
technology to distribute and delegate management tasks
promises to overcome the scalability and flexibility limitations
of the currently used centralized management approach. This
work proposes a multiagent system that adopts mobile agents
as a technology for task distribution, result collection, and
management of resources in large-scale distributed systems. A
new mobile agent-based approach for collecting results from
distributed system elements is presented. An artificial
intelligence technique based on intelligent agents gives the
system proactive behavior. The presented results are based
on a design example of an application operating in a mobile
environment.
Abstract: Locality Sensitive Hashing (LSH) is one of the most
promising techniques for solving the nearest neighbour search problem in
high-dimensional spaces. Euclidean LSH is the most popular variation
of LSH and has been successfully applied in many multimedia
applications. However, Euclidean LSH has limitations that
affect its structure and query performance. The main limitation of
Euclidean LSH is its large memory consumption: in order to achieve
good accuracy, a large number of hash tables is required. In this
paper, we propose a new hashing algorithm to overcome the storage
space problem and improve query time, while keeping an
accuracy similar to that achieved by the original Euclidean LSH.
Experimental results on a real large-scale dataset show that the
proposed approach achieves good performance and consumes less
memory than Euclidean LSH.
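To make the memory issue in the abstract above concrete, here is a minimal sketch of the standard Euclidean LSH baseline, where each hash has the form floor((a . v + b) / w) with a Gaussian vector a and a uniform offset b. It does not implement the new algorithm the abstract proposes; the class name `EuclideanLSHIndex` and its parameters are illustrative assumptions.

```python
import math
import random
from collections import defaultdict


class EuclideanLSHIndex:
    """Baseline Euclidean LSH: several tables, each keyed by a tuple of hashes."""

    def __init__(self, dim, num_tables, hashes_per_table, bucket_width, seed=0):
        rng = random.Random(seed)
        self.w = bucket_width
        # One (a, b) pair per hash: a ~ N(0, 1)^dim, b ~ U[0, w).
        self.funcs = [
            [([rng.gauss(0.0, 1.0) for _ in range(dim)],
              rng.uniform(0.0, bucket_width))
             for _ in range(hashes_per_table)]
            for _ in range(num_tables)
        ]
        self.tables = [defaultdict(list) for _ in range(num_tables)]

    def _key(self, table_funcs, v):
        return tuple(
            math.floor((sum(ai * vi for ai, vi in zip(a, v)) + b) / self.w)
            for a, b in table_funcs
        )

    def add(self, item_id, v):
        # Every item is stored once per table, which is why memory grows
        # linearly with the number of tables.
        for funcs, table in zip(self.funcs, self.tables):
            table[self._key(funcs, v)].append(item_id)

    def candidates(self, q):
        found = set()
        for funcs, table in zip(self.funcs, self.tables):
            found.update(table.get(self._key(funcs, q), []))
        return found


index = EuclideanLSHIndex(dim=3, num_tables=4, hashes_per_table=2, bucket_width=4.0)
index.add("a", [1.0, 2.0, 3.0])
hits = index.candidates([1.0, 2.0, 3.0])
```

Raising `num_tables` improves the chance that a true neighbour shares a bucket with the query in at least one table, but every additional table stores another reference to every item.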
Abstract: XML is becoming a de facto standard for online data exchange. Existing XML filtering techniques based on a publish/subscribe model focus on highly structured data marked up with XML tags. These techniques are efficient in filtering data-centric XML documents but are not effective in filtering the element contents of document-centric XML. With the previous XPath specification, some queries that adequately filter element contents are difficult to write. To solve this, we propose an extended XPath specification that includes the special matching character '%', as used in the LIKE operation of SQL. We also present a novel technique for filtering a collection of document-centric XML documents, called Pfilter, which is able to exploit the extended XPath specification. We report several performance studies that evaluate efficiency and scalability using the multi-query processing time (MQPT).
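The abstract above does not give the syntax of the extended XPath, so the following is only a sketch of the matching semantics it borrows from SQL LIKE, where '%' matches any run of characters. The helper names `like_to_regex` and `filter_elements` are hypothetical, and the element path is evaluated with the ordinary `xml.etree.ElementTree` path subset rather than with Pfilter.

```python
import re
import xml.etree.ElementTree as ET


def like_to_regex(pattern):
    """Translate a LIKE-style pattern ('%' = any characters) into a regex."""
    parts = [re.escape(part) for part in pattern.split("%")]
    return re.compile(".*".join(parts), re.DOTALL)


def filter_elements(xml_text, path, pattern):
    """Return the text of elements on `path` whose content matches `pattern`."""
    root = ET.fromstring(xml_text)
    regex = like_to_regex(pattern)
    return [el.text for el in root.findall(path)
            if el.text is not None and regex.fullmatch(el.text)]


doc = ("<news>"
       "<item><title>Grid scheduling survey</title></item>"
       "<item><title>XML filtering</title></item>"
       "</news>")
matched = filter_elements(doc, "./item/title", "%filter%")
```

The structural part of the query selects elements by tag, as in data-centric filtering, while the '%' pattern tests the element content, which is the part that matters for document-centric XML.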
Abstract: In the context of spectrum surveillance, a new method
to recover the code of a spread spectrum signal is presented for the case where the
receiver has no knowledge of the transmitter's spreading sequence. In
our previous paper, we used a genetic algorithm (GA) to recover the
spreading code. Although genetic algorithms (GAs) are well known
for their robustness in solving complex optimization problems,
increasing the length of the code often leads
to unacceptably slow convergence. To solve this problem we
introduce Particle Swarm Optimization (PSO) into code estimation in
spread spectrum communication systems. In the search process for
code estimation, the PSO algorithm has the merits of rapid
convergence to the global optimum, without being trapped in local
suboptima, and good robustness to noise. In this paper we describe
how to implement PSO as a component of a search algorithm in
code estimation. Swarm intelligence boasts a number of advantages
due to the use of mobile agents, among them scalability, fault
tolerance, adaptation, speed, modularity, autonomy, and
parallelism. These properties make swarm intelligence very attractive
for spread spectrum code estimation. They also make swarm
intelligence suitable for a variety of other kinds of channels. Our
results compare swarm-based algorithms with genetic
algorithms, and also show the performance of the PSO algorithm in the code
estimation process.
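The PSO search mentioned in the abstract above can be sketched with the textbook velocity and position update. This is a generic minimizer under stated assumptions, not the authors' estimator: the function `pso_minimize`, its default coefficients, and the sphere function used as a stand-in fitness are all illustrative. In a code estimation setting, the fitness would instead score a candidate spreading sequence against the received signal.

```python
import random


def pso_minimize(fitness, dim, bounds, n_particles=20, n_iters=200,
                 inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Textbook global-best PSO; returns (best_position, best_value)."""
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [fitness(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Inertia + pull toward personal best + pull toward global best.
                vel[i][d] = (inertia * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = fitness(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def sphere(x):
    # Stand-in fitness with a single minimum at the origin.
    return sum(t * t for t in x)


best, best_val = pso_minimize(sphere, dim=2, bounds=(-5.0, 5.0))
```

Each particle only needs its own best position and the shared global best, so the per-iteration work is independent across particles, which is the source of the parallelism and scalability the abstract attributes to swarm methods.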
Abstract: Service discovery is a very important component of Service Oriented Architectures (SOA). This paper presents two alternative approaches to customising the query results of a private service registry such as Universal Description, Discovery and Integration (UDDI). The customisation is performed based on pre-defined and/or real-time changing parameters. This work identifies the requirements, designs and additional mechanisms that must be applied to UDDI in order to support this customisation capability. We also detail the implementation of the approaches and examine their performance and scalability. Based on our experimental results, we conclude that both approaches can be used to customise registry query results, but storing personalization parameters in an external resource yields better performance while being less scalable as the size of the query results increases. We believe these approaches, when combined with a semantics-enabled service registry, will enhance the service discovery methods within a private UDDI registry environment.
Abstract: This paper explores the scalability issues associated
with solving the Named Entity Recognition (NER) problem using
Support Vector Machines (SVM) and high-dimensional features. The
performance results of a set of experiments conducted using binary
and multi-class SVM with increasing training data sizes are
examined. The NER domain chosen for these experiments is the
biomedical publications domain, especially selected due to its
importance and inherent challenges. A simple machine learning
approach is used that eliminates prior language knowledge such as
part-of-speech or noun phrase tagging thereby allowing for its
applicability across languages. No domain-specific knowledge is
included. The accuracy measures achieved are comparable to those
obtained using more complex approaches, which constitutes a
motivation to investigate ways to improve the scalability of multi-class
SVM in order to make the solution more practical and usable.
Improving training time of multi-class SVM would make support
vector machines a more viable and practical machine learning
solution for real-world problems with large datasets. An initial
prototype results in great improvement of the training time at the
expense of memory requirements.
Abstract: Ontologies and tagging systems are two different ways to organize the knowledge present in the current Web. In this paper we propose a simple method to model folksonomies, as tagging systems, with ontologies. We show the scalability of the method using real data sets. The modeling method is composed of a generic ontology that represents any folksonomy and an algorithm to transform the information contained in folksonomies to the generic ontology. The method allows representing folksonomies at any instant of time.
Abstract: Design patterns describe good solutions to common
and recurring problems in program design. Applying design
patterns in software design and implementation has significant
effects on software quality metrics such as flexibility, usability,
reusability, scalability and robustness. There is no standard rule for
using design patterns. In some situations, a pattern
applied to a specific problem itself uses another pattern.
In this paper, we study the effect of using a chain of patterns on
software quality metrics.
Abstract: In this paper we describe the design and implementation of a parallel algorithm for data assimilation with the ensemble Kalman filter (EnKF) for the oil reservoir history matching problem. The use of a large number of observations from time-lapse seismic data leads to a long turnaround time for the analysis step, in addition to the time-consuming simulations of the realizations. For efficient parallelization it is important to consider parallel computation at the analysis step. Our experiments show that parallelizing the analysis step in addition to the forecast step yields good scalability, exploiting the same set of resources with some additional effort.
Abstract: Simulation is a very powerful method for high-performance
and high-quality design in distributed systems, and is now
perhaps the only one, considering the heterogeneity, complexity and
cost of distributed systems. In Grid environments, for example, it is
hard or even impossible to perform scheduler performance
evaluation in a repeatable and controllable manner, as resources and
users are distributed across multiple organizations with their own
policies. In addition, Grid test-beds are limited, and creating an
adequately sized test-bed is expensive and time consuming.
Scalability, reliability and fault tolerance become important
requirements for distributed systems in order to support distributed
computation. A distributed system with such characteristics is called
dependable. Large environments, such as Clouds, offer unique
advantages, such as low cost, dependability and QoS satisfaction for all
users. Resource management in large environments calls for
efficient scheduling algorithms guided by QoS constraints. This
paper presents the performance evaluation of scheduling heuristics
guided by different optimization criteria. The algorithms for
distributed scheduling are analyzed in order to satisfy user
constraints while at the same time considering the independent capabilities of
resources. This analysis acts as a profiling step for algorithm
calibration. The performance evaluation is based on simulation. The
simulator is MONARC, a powerful tool for large-scale distributed
systems simulation. The novelty of this paper consists in synthetic
analysis results that offer guidelines for scheduler service
configuration and support empirically based decisions. The results
could be used in decisions regarding optimizations to existing Grid
DAG scheduling and for selecting the proper algorithm for DAG
scheduling in various real-world situations.
Abstract: Key management is a vital component in any modern security protocol. Due to scalability and practical implementation considerations, automatic key management seems a natural choice in significantly large virtual private networks (VPNs). In this context, the IETF Internet Key Exchange (IKE) is the most promising protocol under permanent review. We have made a humble effort to pinpoint IKEv2's net gain over IKEv1 due to recent modifications in its original structure, along with a brief overview of the salient improvements between the two versions. We have used the NIIST VPN simulator from the US National Institute of Standards and Technology (NIST) to compare important performance metrics.
Abstract: This work deals with aspects of support vector machine learning for large-scale data mining tasks. Based on a decomposition algorithm for support vector machine training that can be run in serial as well as shared memory parallel mode, we introduce a transformation of the training data that allows an expensive generalized kernel to be used without additional cost. We present experiments for the Gaussian kernel, but other kernel functions can be used as well. In order to further speed up the decomposition algorithm, we analyze the critical problem of working set selection for large training data sets. In addition, we analyze the influence of the working set size on the scalability of the parallel decomposition scheme. Our tests and conclusions led to several modifications of the algorithm and to an improvement of overall support vector machine learning performance. Our method allows extensive parameter search methods to be used to optimize classification accuracy.
Abstract: Resource discovery in Grids is critical for efficient
resource allocation and management. The heterogeneous nature and
dynamic availability of resources make resource discovery a
challenging task. As the number of nodes increases from tens to
thousands, scalability becomes essential. Peer-to-Peer (P2P)
techniques, on the other hand, provide effective implementation of
scalable services and applications. In this paper we propose a model
for resource discovery in the Condor middleware using the four-axis
framework defined in the P2P approach. The proposed model enhances
Condor to incorporate the functionality of a P2P system, thus aiming to
make Condor more scalable, flexible, reliable and robust.
Abstract: Evolvable hardware (EHW) is a developing field that
applies evolutionary algorithms (EAs) to automatically design circuits,
antennas, robot controllers, etc. A lot of research has been done in this
area, and several different EAs have been introduced to tackle
numerous problems, such as scalability and evolvability. However, every
time a specific EA is chosen for solving a particular task, all its
components, such as population size, initialization, selection
mechanism, mutation rate, and genetic operators, should be selected
in order to achieve the best results. Over the last three decades, the
selection of the right parameters for the EA's components for solving
different "test problems" has been investigated. In this paper, the
behaviour of the mutation rate in designing logic circuits, which has not
been studied before, is analyzed in depth. The mutation rate for an
EHW system modifies the number of inputs of each logic gate, the
functionality (for example, from AND to NOR) and the connectivity
between logic gates. The behaviour of the mutation has been
analyzed based on the number of generations, genotype redundancy
and number of logic gates for the evolved circuits. The experimental
results characterize the behaviour of the mutation rate during
evolution for the design and optimization of simple logic circuits.
The experimental results suggest the best mutation rate to be used for
designing combinational logic circuits. The research presented is
particularly important for those who would like to implement a
dynamic mutation rate inside the evolutionary algorithm for evolving
digital circuits. Research on the mutation rate during the last 40
years is also summarized.
Abstract: Since 1992, the year in which Hugo de Garis published
the first paper on Evolvable Hardware (EHW), a period of intense
creativity has followed. EHW has been actively researched, developed
and applied to various problems. Different approaches have been
proposed, leading to three main classifications: extrinsic, mixtrinsic
and intrinsic EHW. Each of these approaches is of real interest.
Nevertheless, although extrinsic evolution generates some
excellent results, intrinsic systems are not so advanced. This
paper suggests three possible solutions for implementing a run-time
configuration intrinsic EHW system: an FPGA-based Run-Time
Configuration system, a JBits-based Run-Time Configuration system
and a Multi-board functional-level Run-Time Configuration system.
The main characteristic of the proposed architectures is that they are
implemented on Field Programmable Gate Arrays. A comparison of the
proposed solutions demonstrates that multi-board functional-level
run-time configuration is superior in terms of scalability, flexibility
and ease of implementation.