Abstract: Mathematical and statistical applications are being
developed with ever greater complexity and accuracy. This added
precision and complexity means that applications need more
computational power in order to run faster. Multicore environments
play an important role in reducing the execution time of these
applications, because they allow more parallelism inside each node.
However, taking advantage of this parallelism is not an easy task,
because we have to deal with problems such as inter-core
communication, data locality, memory sizes (cache and RAM),
synchronization, and data dependencies in the model. These issues
become more important when we wish to improve the application's
performance and scalability. This paper describes an optimization
method developed for the Systemic Model of Banking Originated Losses
(SYMBOL) tool of the European Commission. The method is based on
analyzing the application's weaknesses in order to exploit the
advantages of multicore architectures. All improvements are applied
automatically and transparently, with the aim of improving the
performance metrics of the tool. Finally, experimental evaluations
show the effectiveness of the new optimized version, which achieves
a considerable improvement in execution time: in the best case
tested, the execution time of the automatic parallel version is
around 96% lower than that of the original serial version.
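As a quick check of what that figure implies, and assuming the 96% refers to the reduction in wall-clock time relative to the serial run, the corresponding speedup works out as follows (the symbols S, T_serial and T_parallel are introduced here for illustration only):

\[
S = \frac{T_{\mathrm{serial}}}{T_{\mathrm{parallel}}}
  = \frac{T_{\mathrm{serial}}}{(1 - 0.96)\,T_{\mathrm{serial}}}
  = \frac{1}{0.04} = 25
\]

Because the reduction is only quoted as "around 96%", the factor of 25 should be read as approximate.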
Abstract: Advances in processor architecture, such as multicore,
increase the size and complexity of parallel computer systems.
Several parallel languages can be used to write parallel programs
for multicore architectures. One of these is OpenMP, which is
embedded in C/C++ or Fortran. Because of this new architecture and
its complexity, it is very important to evaluate the performance of
OpenMP constructs, kernels, and application programs on multicore
systems. Performance measurement is the activity of collecting
information about the execution characteristics of a program.
Performance tools consist of at least three interfacing software
layers: instrumentation, measurement, and analysis. The
instrumentation layer defines the measured performance events. The
measurement layer determines which performance events are actually
captured and how they are measured by the tool. The analysis layer
processes the performance data and summarizes it into a form that
can be displayed in performance tools. In this paper, a number of
OpenMP performance tools are surveyed, explaining how each is used
to collect, analyse, and display performance data.
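The three layers can be illustrated with a deliberately tiny sketch in C. It is not taken from any of the surveyed tools: the names `region_stats`, `region_begin`, `region_end`, `mean_time` and `scaled_sum` are hypothetical, and real tools record far richer events. The only OpenMP library call used is `omp_get_wtime`, with a fallback to `clock` when OpenMP is not enabled.

```c
#include <time.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Measurement layer: how a timestamp is actually obtained. */
double now_seconds(void)
{
#ifdef _OPENMP
    return omp_get_wtime();                   /* wall-clock time */
#else
    return (double)clock() / CLOCKS_PER_SEC;  /* CPU time, stand-in only */
#endif
}

/* Hypothetical record for one instrumented region. */
typedef struct {
    const char *name;
    double elapsed;   /* accumulated seconds */
    long calls;       /* number of times the region ran */
} region_stats;

/* Instrumentation layer: markers that define the measured event. */
double region_begin(void) { return now_seconds(); }

void region_end(region_stats *r, double t0)
{
    r->elapsed += now_seconds() - t0;
    r->calls++;
}

/* Analysis layer: reduce the raw data to something displayable. */
double mean_time(const region_stats *r)
{
    return r->calls ? r->elapsed / (double)r->calls : 0.0;
}

/* Example workload wrapped by the instrumentation markers. */
double scaled_sum(const double *x, int n, region_stats *r)
{
    double t0 = region_begin();
    double s = 0.0;
    #pragma omp parallel for reduction(+:s)
    for (int i = 0; i < n; i++) {
        s += 2.0 * x[i];
    }
    region_end(r, t0);
    return s;
}
```

After several calls to `scaled_sum` with the same `region_stats` object, `mean_time` gives the average time per call for that region.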
Abstract: Gröbner basis calculation forms a key part of computational
commutative algebra and many other areas. One important
consequence of the theory of Gröbner bases is that it provides a
means to solve a system of non-linear equations. This is why it has
become very important in areas where the solution of non-linear
equations is needed, for instance in algebraic cryptanalysis and
coding theory. This paper explores a parallel-distributed
implementation of Gröbner basis calculation over GF(2), using
Buchberger's algorithm. OpenMP and MPI-C language constructs have
been used to implement the scheme. Some relevant results are
provided to compare the performance of the standalone and hybrid
(parallel-distributed) implementations.
Abstract: OpenMP is an API for the shared-memory multiprocessor parallel programming model. Novice OpenMP programmers often write code containing mistakes that the compiler cannot detect. We investigated how a compiler copes with common mistakes that can occur in OpenMP code, using the latest version of GCC (4.4.3). GCC compiled the erroneous code without any errors or warnings. This paper presents a programming aid tool for OpenMP programs. It checks for 12 common mistakes that novice programmers can make when writing OpenMP code. We demonstrate that the tool detects various common mistakes that GCC failed to detect.
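To make this class of problem concrete, here is a minimal sketch of one well-known OpenMP mistake, a shared accumulator updated without a `reduction` clause. Whether this particular mistake is among the twelve that the tool checks is not stated in the abstract, so it is only an illustrative assumption, and the function name `sum_array` is hypothetical. The buggy form is kept in a comment because it contains a data race that compilers generally accept silently.

```c
#include <stddef.h>

/*
 * Buggy variant (do not use): every thread updates the shared
 * variable `sum` without synchronization, so the result is
 * unpredictable when more than one thread runs.
 *
 *     long sum = 0;
 *     #pragma omp parallel for
 *     for (long i = 0; i < (long)n; i++)
 *         sum += a[i];
 */

/* Corrected variant: each thread accumulates into a private copy
 * of `sum`, and the copies are combined when the loop finishes. */
long sum_array(const int *a, size_t n)
{
    long sum = 0;
    #pragma omp parallel for reduction(+:sum)
    for (long i = 0; i < (long)n; i++) {
        sum += a[i];
    }
    return sum;
}
```

Both variants are syntactically valid OpenMP, which is why a checker that looks beyond syntax is needed to flag the first one.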
Abstract: This paper develops and implements an entirely new
algorithm for solving the three-dimensional Poisson equation. This
equation is used in research on turbulent mixing, computational
fluid dynamics, atmospheric fronts, ocean flows, and so on.
Moreover, to raise the productivity of demanding calculations, the
most up-to-date and effective parallel programming technology was
applied: MPI in combination with OpenMP directives, which makes it
possible to solve problems with very large data volumes. The
results can be used in solving important applied and fundamental
problems in mathematics and physics.
Abstract: In a previous work, we presented the numerical
solution of the two-dimensional second-order telegraph partial
differential equation discretized by the centred and rotated five-point
finite difference discretizations, namely the explicit group (EG) and
explicit decoupled group (EDG) iterative methods, respectively. In
this paper, we apply a domain decomposition algorithm to these
group schemes to divide the tasks involved in solving the same
equation. The objective of this study is to describe the development
of the parallel group iterative schemes under the OpenMP programming
environment as a way to reduce the computational cost of the
solution process using multicore technologies. A detailed
performance analysis of the parallel implementations of the point and
group iterative schemes is reported and discussed.
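The general idea of dividing a five-point sweep among cores can be sketched as follows. This is a plain point-wise averaging sweep used as a stand-in, not the EG or EDG scheme and not a telegraph-equation solver; the function name `five_point_sweep` and the row-major grid layout are assumptions made for the example.

```c
/* One point-wise sweep over the interior of an n x n grid stored
 * row-major in `u`, writing the results to `u_new`. Each interior
 * point becomes the average of its four neighbours in the centred
 * five-point stencil. Boundary entries of `u_new` are not touched. */
void five_point_sweep(const double *u, double *u_new, int n)
{
    /* schedule(static) with no chunk size gives each thread one
     * contiguous block of rows, i.e. a strip-wise decomposition
     * of the domain. Reads come from u and writes go to u_new,
     * so the iterations are independent of one another. */
    #pragma omp parallel for schedule(static)
    for (int i = 1; i < n - 1; i++) {
        for (int j = 1; j < n - 1; j++) {
            u_new[i * n + j] = 0.25 * (u[(i - 1) * n + j]
                                     + u[(i + 1) * n + j]
                                     + u[i * n + j - 1]
                                     + u[i * n + j + 1]);
        }
    }
}
```

Group schemes update small blocks of points together instead of single points, but the same strip-wise division of the outer loop is one plausible way to distribute that work.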
Abstract: The H.264/AVC standard is a highly efficient video
codec providing high-quality video at low bit rates. Because it
employs advanced techniques, its computational complexity is high.
This complexity is the major problem in implementing a real-time
encoder and decoder. Parallelism is one approach to this problem,
and it can be implemented on a multi-core system.
We analyze macroblock-level parallelism, which ensures the same bit
rate with high concurrency of processors. In order to reduce the
encoding time, a dynamic data partition based on macroblock regions
is proposed. The data partition has advantages in load balancing and
data communication overhead. Using the data partition, the encoder
obtains a speed-up of more than 3.59x on a four-processor system.
This work can be applied to other multimedia processing applications.
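As a rough illustration of partitioning a frame by macroblock region, the sketch below splits the macroblock rows of a frame evenly among workers. It is a static split shown only for orientation, not the dynamic partition proposed in the paper, and the helper names `macroblocks_along` and `macroblock_row_range` are hypothetical. The only codec fact relied on is that an H.264 macroblock covers 16x16 luma samples.

```c
/* Number of 16x16 macroblocks needed to cover `pixels` luma
 * samples along one axis (rounded up). */
int macroblocks_along(int pixels)
{
    return (pixels + 15) / 16;
}

/* Half-open range [*first, *last) of macroblock rows assigned to
 * worker `k` out of `workers`. Rows are divided as evenly as
 * possible; the first (mb_rows % workers) workers get one extra. */
void macroblock_row_range(int mb_rows, int workers, int k,
                          int *first, int *last)
{
    int base = mb_rows / workers;
    int extra = mb_rows % workers;
    *first = k * base + (k < extra ? k : extra);
    *last = *first + base + (k < extra ? 1 : 0);
}
```

For a 1920x1080 frame this gives 120 macroblock columns and 68 macroblock rows, so four workers receive 17 rows each. For reference, a speed-up of 3.59 on four processors corresponds to a parallel efficiency of about 90% (3.59 / 4).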