Abstract: Many-core GPUs provide high computing ability and
substantial bandwidth; however, optimizing irregular applications
like SpMV on GPUs becomes a difficult but meaningful task. In this
paper, we propose a novel method to improve the performance of
SpMV on GPUs. A new storage format called HYB-R is proposed to
exploit GPU architecture more efficiently. The COO portion of the
matrix is partitioned recursively into a ELL portion and a COO
portion in the process of creating HYB-R format to ensure that there
are as many non-zeros as possible in ELL format. The method of
partitioning the matrix is an important problem for HYB-R kernel, so
we also try to tune the parameters to partition the matrix for higher
performance. Experimental results show that our method can get
better performance than the fastest kernel (HYB) in NVIDIA-s
SpMV library with as high as 17% speedup.
Abstract: Iris localization is a very important approach in
biometric identification systems. Identification process usually is
implemented in three levels: iris localization, feature extraction, and
pattern matching finally. Accuracy of iris localization as the first step
affects all other levels and this shows the importance of iris
localization in an iris based biometric system. In this paper, we
consider Daugman iris localization method as a standard method,
propose a new method in this field and then analyze and compare the
results of them on a standard set of iris images. The proposed method
is based on the detection of circular edge of iris, and improved by
fuzzy circles and surface energy difference contexts. Implementation
of this method is so easy and compared to the other methods, have a
rather high accuracy and speed. Test results show that the accuracy of
our proposed method is about Daugman method and computation
speed of it is 10 times faster.
Abstract: Various mechanisms providing mutual exclusion and
thread synchronization can be used to support parallel processing
within a single computer. Instead of using locks, semaphores, barriers
or other traditional approaches in this paper we focus on alternative
ways for making better use of modern multithreaded architectures
and preparing hash tables for concurrent accesses. Hash structures
will be used to demonstrate and compare two entirely different
approaches (rule based cooperation and hardware synchronization
support) to an efficient parallel implementation using traditional
locks. Comparison includes implementation details, performance
ranking and scalability issues. We aim at understanding the effects
the parallelization schemes have on the execution environment with
special focus on the memory system and memory access
characteristics.
Abstract: Wireless mobile communications have experienced
the phenomenal growth through last decades. The advances in
wireless mobile technologies have brought about a demand for high
quality multimedia applications and services. For such applications
and services to work, signaling protocol is required for establishing,
maintaining and tearing down multimedia sessions. The Session
Initiation Protocol (SIP) is an application layer signaling protocols,
based on request/response transaction model. This paper considers
SIP INVITE transaction over an unreliable medium, since it has been
recently modified in Request for Comments (RFC) 6026. In order to
help in assuring that the functional correctness of this modification is
achieved, the SIP INVITE transaction is modeled and analyzed using
Colored Petri Nets (CPNs). Based on the model analysis, it is
concluded that the SIP INVITE transaction is free of livelocks and
dead codes, and in the same time it has both desirable and
undesirable deadlocks. Therefore, SIP INVITE transaction should be
subjected for additional updates in order to eliminate undesirable
deadlocks. In order to reduce the cost of implementation and
maintenance of SIP, additional remodeling of the SIP INVITE
transaction is recommended.
Abstract: Computations with higher than the IEEE 754 standard double-precision (about 16 significant digits) are required recently. Although there are available software routines in Fortran and C for high-precision computation, users are required to implement such routines in their own computers with detailed knowledges about them. We have constructed an user-friendly online system for octupleprecision computation. In our Web system users with no knowledges about high-precision computation can easily perform octupleprecision computations, by choosing mathematical functions with argument(s) inputted, by writing simple mathematical expression(s) or by uploading C program(s). In this paper we enhance the Web system above by adding the facility of uploading Fortran programs, which have been widely used in scientific computing. To this end we construct converter routines in two stages.