Abstract: The aim of this work is the parallel implementation
of k-means in MATLAB, in order to reduce the execution time.
Specifically, a new function in MATLAB for the serial k-means algorithm
is developed, which meets all the requirements for the conversion to a
function in MATLAB with parallel computations. Additionally, two
different variants for the definition of initial values are presented.
In the sequel, the parallel approach is presented. Finally, the
performance tests for the computation times with respect to the numbers
of features and classes are illustrated.
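The serial algorithm described above can be sketched as follows. This is a minimal illustration in Python rather than MATLAB, and it is not the authors' function: the abstract does not say which two initialization variants are used, so `init_random` and `init_first_k` are assumptions chosen only to show how the initial values can be swapped. All function names are hypothetical.

```python
import random

def init_random(points, k, seed=0):
    """Variant A (assumed): pick k distinct data points at random."""
    rng = random.Random(seed)
    return [list(p) for p in rng.sample(points, k)]

def init_first_k(points, k):
    """Variant B (assumed): use the first k data points."""
    return [list(p) for p in points[:k]]

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def assign(points, centroids):
    """Assignment step: label each point with its nearest centroid.
    Each point is handled independently of the others."""
    return [min(range(len(centroids)), key=lambda j: sq_dist(p, centroids[j]))
            for p in points]

def update(points, labels, centroids):
    """Update step: each centroid becomes the mean of its assigned points."""
    new = []
    for j, c in enumerate(centroids):
        members = [p for p, lab in zip(points, labels) if lab == j]
        if not members:
            new.append(list(c))  # keep the centroid of an empty cluster
        else:
            new.append([sum(m[d] for m in members) / len(members)
                        for d in range(len(c))])
    return new

def kmeans(points, centroids, max_iter=100):
    """Alternate assignment and update until the labels stop changing."""
    labels = None
    for _ in range(max_iter):
        new_labels = assign(points, centroids)
        if new_labels == labels:
            break
        labels = new_labels
        centroids = update(points, labels, centroids)
    return centroids, labels
```

Because the assignment step treats every point independently, it is the natural candidate for distribution across workers in a parallel version; how the paper actually partitions the work is not stated in the abstract.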
Abstract: In this work, the hemodynamics in the sinuses of
Valsalva after Transcatheter Aortic Valve Implantation is numerically
examined. We focus on the physical results in the two-dimensional
case. We use a finite element methodology based on a Lagrange
multiplier technique that couples the dynamics of blood
flow and the leaflets’ movement. A massively parallel implementation
of a monolithic and fully implicit solver allows more accuracy and
significant computational savings. The elastic properties of the aortic
valve are disregarded, and the numerical computations are performed
under physiologically correct pressure loads. Computational results
indicate that blood flow may be subject to stagnation in the lower
domain of the sinuses of Valsalva after Transcatheter Aortic Valve
Implantation.
Abstract: The main goal of this article is to describe the online
flood monitoring and prediction system Floreon+ primarily developed
for the Moravian-Silesian region in the Czech Republic and the basic
process it uses for running automatic rainfall-runoff and
hydrodynamic simulations along with their calibration and
uncertainty modeling. It takes a long time to execute such a process
sequentially, which is not acceptable in the online scenario, so the use
of a high performance computing environment is proposed for all
parts of the process to shorten their duration. Finally, a case study on
the Ostravice River catchment is presented that shows actual
durations and their gain from the parallel implementation.
Abstract: Many problems in computer vision and image
processing present potential for parallel implementations through one
of the three major paradigms: geometric parallelism, algorithmic
parallelism and processor farming. Static process scheduling
techniques are used successfully to exploit geometric and algorithmic
parallelism, while dynamic process scheduling is better suited to
dealing with the independent processes inherent in the processor
farming paradigm. This paper considers the application of parallel or
multi-computers to a class of problems exhibiting spatial data
characteristic of the geometric paradigm. However, by using
the processor farming paradigm, a dynamic scheduling technique is
developed to suit the MIMD structure of the multi-computers. A
hybrid scheme of scheduling is also developed and compared with
the other schemes. The specific problem chosen for the investigation
is the Hough transform for line detection.
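For reference, the voting step of the Hough transform for line detection can be sketched as below. This is a serial, minimal sketch under stated assumptions: the normal parameterization rho = x cos(theta) + y sin(theta), a one-degree angular resolution, and hypothetical function names. It does not reproduce the scheduling schemes studied in the paper.

```python
import math

def hough_lines(points, width, height, n_theta=180):
    """Vote in (rho, theta) space for a list of (x, y) edge pixels."""
    max_rho = int(math.ceil(math.hypot(width, height)))
    # Row index rho + max_rho covers rho in [-max_rho, max_rho].
    acc = [[0] * n_theta for _ in range(2 * max_rho + 1)]
    thetas = [math.pi * t / n_theta for t in range(n_theta)]
    for x, y in points:
        for t, theta in enumerate(thetas):
            rho = int(round(x * math.cos(theta) + y * math.sin(theta)))
            acc[rho + max_rho][t] += 1
    return acc, max_rho

def strongest_line(acc, max_rho):
    """Return (rho, theta in degrees, votes) of the fullest accumulator cell."""
    n_theta = len(acc[0])
    votes, r, t = max((acc[r][t], r, t)
                      for r in range(len(acc)) for t in range(n_theta))
    return r - max_rho, 180.0 * t / n_theta, votes
```

The votes cast by different edge pixels are independent, so one plausible decomposition is to hand blocks of pixels (or slices of the image) to workers and sum their partial accumulators afterwards; whether that matches the paper's dynamic or hybrid scheme is not stated in the abstract.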
Abstract: The aim of this paper is to investigate the
performance of the developed two point block method designed for
two processors for directly solving large nonstiff systems of
higher-order ordinary differential equations (ODEs). The method calculates
the numerical solution at two points simultaneously and produces
two new equally spaced solution values within a block and it is
possible to assign the computational tasks at each time step to a
single processor. The algorithm was implemented in the C
language and the parallel computation was done in a shared-memory
parallel environment. Numerical results are given to compare the
efficiency of the developed method to the sequential timing. For
large problems, the parallel implementation produced 1.95 speed-up
and 98% efficiency for the two processors.
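The two reported figures are related by the usual definitions, speed-up = T_serial / T_parallel and efficiency = speed-up / number of processors. A small sketch, assuming these standard definitions are the ones used (the abstract does not define them), shows that a speed-up of 1.95 on two processors corresponds to 97.5%, which rounds to the quoted 98%:

```python
def speedup(t_serial, t_parallel):
    """Ratio of sequential to parallel execution time."""
    return t_serial / t_parallel

def efficiency(speedup_value, processors):
    """Fraction of ideal linear speed-up that was achieved."""
    return speedup_value / processors

print(f"{efficiency(1.95, 2):.1%}")  # prints 97.5%
```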
Abstract: This paper describes a new parallel sorting algorithm,
based on Odd-Even Mergesort, called division and concurrent mixes.
The main idea of the algorithm is that each processor first uses a
sequential algorithm to sort one part of the vector; the processors
then work in pairs to merge two of these sorted sections into a
larger sorted section. After several iterations, the vector is
completely sorted. The paper describes the implementation of the new
algorithm in a message-passing environment (such as MPI). It also
compares the experimental results with the sequential quicksort
algorithm and with parallel implementations (also in MPI) of
quicksort and bitonic sort. The comparison was carried out on an
8-processor cluster under GNU/Linux, which runs on a single PC
processor.
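The idea of sorting sections locally and then combining them in pairs can be sketched as follows. This is a serial simulation under stated assumptions: each list in `sections` stands in for the data held by one processor, the pairing is a simple tree of merges, and the function names are hypothetical. The paper's actual algorithm runs over MPI and may pair processors and distribute the merged data differently.

```python
import heapq

def split(vector, p):
    """Divide the vector into p nearly equal contiguous sections (p >= 1)."""
    n = len(vector)
    bounds = [i * n // p for i in range(p + 1)]
    return [vector[bounds[i]:bounds[i + 1]] for i in range(p)]

def division_and_merge(vector, p):
    """Sort each section sequentially, then merge sections pairwise
    until a single sorted vector remains."""
    sections = [sorted(s) for s in split(vector, p)]  # local sequential sort
    while len(sections) > 1:
        merged = []
        for i in range(0, len(sections), 2):
            pair = sections[i:i + 2]  # the last "pair" may hold one section
            merged.append(list(heapq.merge(*pair)))
        sections = merged
    return sections[0]
```

With p sections, this scheme needs about log2(p) merge rounds, and the merges within one round are independent of each other.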
Abstract: There are two major variants of the Simplex
Algorithm: the revised method and the standard, or tableau method.
Today, all serious implementations are based on the revised method
because it is more efficient for sparse linear programming problems.
However, a number of applications lead to dense linear
problems, so our aim in this paper is to present some computational
results on a parallel implementation of the dense Simplex Method. Our
implementation runs on an SMP cluster and uses the C
programming language and the Message Passing Interface (MPI).
Preliminary computational results on randomly generated dense
linear programs support our approach.
Abstract: In the last decade digital watermarking procedures have
become increasingly applied to implement the copyright protection
of multimedia digital contents distributed on the Internet. To this
end, it is worth noting that a lot of watermarking procedures
for images and videos proposed in the literature are based on spread
spectrum techniques. However, some scepticism about the robustness
and security of such watermarking procedures has arisen because
of some documented attacks which claim to render the inserted
watermarks undetectable. On the other hand, web content providers
wish to exploit watermarking procedures characterized by flexible and
efficient implementations and which can be easily integrated in their
existing web services frameworks or platforms. This paper presents
how a simple spread spectrum watermarking procedure for MPEG-2
videos can be modified to be exploited in web contexts. To this end,
the proposed procedure has been made secure and robust against some
well-known and dangerous attacks. Furthermore, its basic scheme
has been optimized by making the insertion procedure adaptive with
respect to the terminals used to open the videos and the network transactions
carried out to deliver them to buyers. Finally, two different
implementations of the procedure have been developed: the former
is a high performance parallel implementation, whereas the latter is
a portable Java and XML based implementation. Thus, the paper
demonstrates that a simple spread spectrum watermarking procedure,
with limited and appropriate modifications to the embedding scheme,
can still represent a valid alternative to many other well-known and
more recent watermarking procedures proposed in the literature.
Abstract: In a previous work, we presented the numerical
solution of the two-dimensional second-order telegraph partial
differential equation discretized by the centred and rotated five-point
finite difference discretizations, namely the explicit group (EG) and
explicit decoupled group (EDG) iterative methods, respectively. In
this paper, we utilize a domain decomposition algorithm on these
group schemes to divide the tasks involved in solving the same
equation. The objective of this study is to describe the development
of the parallel group iterative schemes under the OpenMP programming
environment as a way to reduce the computational costs of the
solution processes using multicore technologies. A detailed
performance analysis of the parallel implementations of point and
group iterative schemes will be reported and discussed.
Abstract: In this paper, a new reverse converter for the moduli set {2^n, 2^n - 1, 2^(n-1) - 1} is presented. We improved a previously introduced conversion algorithm to derive an efficient hardware design for the reverse converter. The hardware architecture of the proposed converter is based on carry-save adders and regular binary adders, without the requirement for modular adders. The presented design is faster than the latest introduced reverse converter for the moduli set {2^n, 2^n - 1, 2^(n-1) - 1}. It also has better performance than the reverse converters for the recently introduced moduli set {2^(n+1) - 1, 2^n, 2^n - 1}.
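To make the task of a reverse converter concrete, the sketch below recovers an integer from its residues for the moduli set {2^n, 2^n - 1, 2^(n-1) - 1} using the textbook Chinese Remainder Theorem. It is a software illustration only, with hypothetical function names, and is not the adder-based hardware algorithm of the paper. It assumes n >= 3 so that all three moduli exceed 1, and it needs Python 3.8 or later for the modular inverse via `pow`.

```python
def moduli_set(n):
    """Return the three pairwise coprime moduli for a given n (n >= 3)."""
    return (2**n, 2**n - 1, 2**(n - 1) - 1)

def to_residues(x, moduli):
    """Forward conversion: integer to residue representation."""
    return tuple(x % m for m in moduli)

def reverse_convert(residues, moduli):
    """Reverse conversion by the Chinese Remainder Theorem."""
    M = 1
    for m in moduli:
        M *= m  # dynamic range of the residue number system
    x = 0
    for r, m in zip(residues, moduli):
        Mi = M // m
        x += r * Mi * pow(Mi, -1, m)  # pow(Mi, -1, m) is the inverse of Mi mod m
    return x % M
```

For n = 4 the moduli are (16, 15, 7), giving a dynamic range of 1680, and every integer in that range is recovered from its three residues.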
Abstract: A state-of-the-art Speaker Identification (SI) system requires a robust feature extraction unit followed by a speaker modeling scheme for generalized representation of these features. Over the years, Mel-Frequency Cepstral Coefficients (MFCC), modeled on the human auditory system, have been used as a standard acoustic feature set for SI applications. However, due to the structure of its filter bank, MFCC captures vocal tract characteristics more effectively in the lower frequency regions. This paper proposes a new set of features using a complementary filter bank structure, which improves the distinguishability of speaker-specific cues present in the higher frequency zone. Unlike high-level features that are difficult to extract, the proposed feature set involves little computational burden during the extraction process. When combined with MFCC via a parallel implementation of speaker models, the proposed feature set significantly outperforms baseline MFCC. This proposition is validated by experiments conducted on two different kinds of public databases, namely YOHO (microphone speech) and POLYCOST (telephone speech), with Gaussian Mixture Models (GMM) as a classifier for various model orders.
Abstract: Various mechanisms providing mutual exclusion and
thread synchronization can be used to support parallel processing
within a single computer. Instead of using locks, semaphores, barriers
or other traditional approaches, in this paper we focus on alternative
ways for making better use of modern multithreaded architectures
and preparing hash tables for concurrent accesses. Hash structures
will be used to demonstrate and compare two entirely different
approaches (rule based cooperation and hardware synchronization
support) to an efficient parallel implementation using traditional
locks. The comparison includes implementation details, performance
ranking and scalability issues. We aim at understanding the effects
the parallelization schemes have on the execution environment with
special focus on the memory system and memory access
characteristics.
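As a point of reference for the lock-based baseline mentioned above, a hash table protected by traditional locks might look like the sketch below. The class name, the chained-bucket layout, and the use of several locks that each guard a group of buckets are all assumptions made for illustration; the abstract does not describe the baseline's design, and the two alternative approaches it studies are not shown here.

```python
import threading

class LockedHashTable:
    """Chained hash table; each lock guards a fixed group of buckets."""

    def __init__(self, n_buckets=64, n_locks=8):
        self._buckets = [[] for _ in range(n_buckets)]
        self._locks = [threading.Lock() for _ in range(n_locks)]

    def _locate(self, key):
        """Return the bucket for a key and the lock that guards it."""
        b = hash(key) % len(self._buckets)
        return self._buckets[b], self._locks[b % len(self._locks)]

    def put(self, key, value):
        bucket, lock = self._locate(key)
        with lock:
            for i, (k, _) in enumerate(bucket):
                if k == key:
                    bucket[i] = (key, value)  # overwrite an existing entry
                    return
            bucket.append((key, value))

    def get(self, key, default=None):
        bucket, lock = self._locate(key)
        with lock:
            for k, v in bucket:
                if k == key:
                    return v
        return default

    def __len__(self):
        # Not synchronized: call only when no writer threads are running.
        return sum(len(b) for b in self._buckets)
```

Threads that touch buckets guarded by different locks do not block each other, while threads hitting the same group are serialized; this contention is the kind of cost that alternatives to locking aim to reduce.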
Abstract: We report on the development of a model to
understand why the range of experience with respect to HIV
infection is so diverse, especially with respect to the latency period.
To investigate this, an agent-based approach is used to extract high-level
behaviour which cannot be described analytically from the set
of interaction rules at the cellular level. A network of independent
matrices mimics the chain of lymph nodes. Dealing with massively
multi-agent systems requires major computational effort. However,
parallelisation methods are a natural consequence and advantage of
the multi-agent approach and, using the MPI library, are here
implemented, tested and optimized. Our current focus is on the
various implementations of the data transfer across the network.
Three communication strategies are proposed and tested, showing
that the most efficient approach is communication based on the
natural lymph-network connectivity.