Abstract: With the popularity of multi-core and many-core architectures, there is a strong need for software frameworks that support parallel programming methodologies. In this paper we introduce JConqurr, an Eclipse toolkit that is easy to use and provides robust support for flexible parallel programming. JConqurr is a multi-core and many-core programming toolkit for Java that supports common parallel programming patterns, including task, data, divide-and-conquer and pipeline parallelism. The toolkit uses an annotation and directive mechanism to convert sequential code into parallel code. In addition, we propose a novel mechanism to achieve parallelism using graphical processing units (GPUs). Experiments with common parallelizable algorithms show that the toolkit can be used easily and efficiently to convert sequential code to parallel code, and that significant performance gains can be achieved.
Abstract: Industrial robots play a vital role in automation;
however, little effort has been made to apply robots to
machining work such as grinding, cutting, milling, drilling and
polishing. Robot parallel manipulators have high stiffness,
rigidity and accuracy, which cannot be provided by conventional
serial robot manipulators. The aim of this paper is to perform the
modeling and the workspace analysis of a 3 DOF Parallel
Manipulator (3 DOF PM). The 3 DOF PM was modeled and
simulated using 'ADAMS'. The concept involved is based on the
transformation of motion from a screw joint to a spherical joint
through a connecting link. The Parallel Manipulator (PM) is
modeled using screw joints for very
accurate positioning. A workspace analysis was carried out for the
determination of work volume of the 3 DOF PM. The position of the
spherical joints connected to the moving platform and the
circumferential points of the moving platform were considered for
finding the workspace. After the simulation, the positions of the joints
of the moving platform were recorded against simulation time, and
these points were supplied to 'MATLAB' to obtain the
work envelope. 'AUTOCAD' was then used to determine the work
volume. The values obtained were compared with an analytical
approach based on the Pappus-Guldinus theorem. The analysis
considered two parameters: the link length and the radius of the
moving platform. The results show that the radius of the
moving platform is directly proportional to the work volume for a
constant link length and the link length is also directly proportional
to the work volume, at a constant radius of the moving platform.
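The analytical check mentioned above rests on the Pappus-Guldinus (second centroid) theorem: the volume swept by a plane region revolved about an external axis equals the region's area times the distance travelled by its centroid. The following is a minimal sketch of that formula only; the function name and the torus dimensions are illustrative assumptions and are not taken from the paper.

```python
import math

def pappus_volume(area, centroid_radius, angle=2 * math.pi):
    """Volume swept by a plane region revolved about an external axis.

    area: area of the generating region
    centroid_radius: distance from the region's centroid to the axis
    angle: sweep angle in radians (a full revolution by default)
    """
    if area < 0 or centroid_radius < 0:
        raise ValueError("area and centroid_radius must be non-negative")
    return angle * centroid_radius * area

# Sanity check against a torus (illustrative numbers): a circle of radius r
# revolved at distance R has the known volume 2 * pi^2 * R * r^2.
r, R = 1.0, 3.0
torus = pappus_volume(math.pi * r ** 2, R)
```

For a work envelope, the generating region would be the cross-section of the workspace in a plane containing the axis of revolution.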
Abstract: In this study, a mathematical model was proposed to
predict the growth of
Pseudomonas aeruginosa and rhamnolipid production under nitrogen-limited
(sodium nitrate) fed-batch fermentation, and the accuracy of the model
was assessed. All of the parameters used in this model were obtained
individually without using any data from the literature.
The overall growth kinetics of the strain were evaluated using a
dual-parallel substrate Monod equation, which was characterized using
data from several batch experiments. Fed-batch data under different
glycerol (as the sole carbon source, C/N=10) concentrations and feed
flow rates were used to describe the proposed fed-batch model and
other parameters. To verify the accuracy of the proposed
model, several verification experiments were performed over a wide
range of initial glycerol concentrations. The results showed an
acceptable prediction for rhamnolipid production (less than 10%
error), whereas the errors in biomass prediction were less than 23%. It
was also found that rhamnolipid production by P. aeruginosa was
more sensitive at low glycerol concentrations.
Based on the findings of this work, it was concluded that the
proposed model could effectively be employed for rhamnolipid
production by this strain under fed-batch fermentation on up to
80 g l⁻¹ glycerol.
Abstract: In this paper, a method for matching image segments
using triangle-based (geometrical) regions is proposed. Triangular
regions are formed from triples of vertex points obtained from a
keypoint detector (SIFT). However, triangle regions are subject to
noise and distortion around the edges and vertices (especially acute
angles). Therefore, these triangles are expanded into parallelogram-shaped
regions. The extracted image segments inherit an important
triangle property: invariance to affine distortion. Given two
images, matching corresponding regions is conducted by computing
the relative affine matrix, rectifying one of the regions w.r.t. the other
one, then calculating the similarity between the reference and
rectified region. The experimental tests show the efficiency and
robustness of the proposed algorithm against geometrical distortion.
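The step of "computing the relative affine matrix" can be illustrated with the standard result that three non-collinear point correspondences determine a 2-D affine map uniquely. The sketch below is a generic construction under that assumption, not the authors' implementation; the function names and the sample triangles are hypothetical.

```python
def affine_from_triangles(src, dst):
    """Return (a, b, c, d, tx, ty) such that
    x' = a*x + b*y + tx and y' = c*x + d*y + ty
    maps the three src vertices onto the three dst vertices."""
    (x0, y0), (x1, y1), (x2, y2) = src
    (u0, v0), (u1, v1), (u2, v2) = dst
    # Edge vectors of the source triangle (columns of a 2x2 matrix P).
    e1x, e1y = x1 - x0, y1 - y0
    e2x, e2y = x2 - x0, y2 - y0
    det = e1x * e2y - e2x * e1y
    if abs(det) < 1e-12:
        raise ValueError("source triangle is degenerate")
    # Edge vectors of the destination triangle (columns of Q).
    f1x, f1y = u1 - u0, v1 - v0
    f2x, f2y = u2 - u0, v2 - v0
    # Linear part M = Q * P^-1, written out element by element.
    a = (f1x * e2y - f2x * e1y) / det
    b = (-f1x * e2x + f2x * e1x) / det
    c = (f1y * e2y - f2y * e1y) / det
    d = (-f1y * e2x + f2y * e1x) / det
    # Translation chosen so that the first vertex maps exactly.
    tx = u0 - (a * x0 + b * y0)
    ty = v0 - (c * x0 + d * y0)
    return a, b, c, d, tx, ty

def apply_affine(params, point):
    a, b, c, d, tx, ty = params
    x, y = point
    return a * x + b * y + tx, c * x + d * y + ty

# Hypothetical triangles: a unit right triangle scaled by (2, 3) and shifted.
src = [(0, 0), (1, 0), (0, 1)]
dst = [(2, 3), (4, 3), (2, 6)]
params = affine_from_triangles(src, dst)
```

Once the map is known, one region can be resampled into the other's frame (rectified) before a similarity measure is computed between the two.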
Abstract: The I/O workload is a critical factor in
analyzing I/O patterns and file system performance. However, tracing I/O
operations on the fly in a distributed parallel file system is non-trivial due
to the collection overhead and the large volume of data. In this paper, we
design and implement a parallel file system logging method for high
performance computing using a shared-memory-based multi-layer
scheme. It minimizes the overhead by reducing the response time of
logging operations and provides an efficient post-processing scheme through
shared memory. A separate logging server can collect sequential logs
from multiple clients in a cluster through packet communication.
Implementation and evaluation results show the low overhead and high
scalability of this architecture for high performance parallel logging
analysis.
Abstract: The aim of this paper is to investigate the
performance of a two-point block method designed for
two processors for directly solving large non-stiff systems of higher-order
ordinary differential equations (ODEs). The method calculates
the numerical solution at two points simultaneously and produces
two new equally spaced solution values within a block and it is
possible to assign the computational tasks at each time step to a
single processor. The algorithm was implemented in C,
and the parallel computation was run in a shared-memory
parallel environment. Numerical results compare the
efficiency of the developed method against sequential timing. For
large problems, the parallel implementation achieved a speed-up of 1.95
and 98% efficiency on two processors.
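The two reported figures are related by the usual definitions: speed-up is sequential time divided by parallel time, and efficiency is speed-up divided by the number of processors. A minimal sketch of that arithmetic follows; the function names are illustrative.

```python
def speedup(t_sequential, t_parallel):
    """Ratio of sequential run time to parallel run time."""
    if t_parallel <= 0:
        raise ValueError("t_parallel must be positive")
    return t_sequential / t_parallel

def efficiency(speedup_value, n_processors):
    """Speed-up per processor; 1.0 would be ideal linear scaling."""
    if n_processors <= 0:
        raise ValueError("n_processors must be positive")
    return speedup_value / n_processors

# A speed-up of 1.95 on two processors gives 0.975,
# which rounds to the reported 98%.
eff = efficiency(1.95, 2)
```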
Abstract: In this paper we address a multi-objective scheduling problem for unrelated parallel machines. In unrelated parallel systems, the processing cost/time of a given job may vary from machine to machine. The objective of scheduling is to simultaneously determine the job-machine assignment and the job sequencing on each machine in such a way that the total cost of the schedule is minimized. The cost function consists of three components: machining cost, earliness/tardiness penalties and makespan-related cost. Such a scheduling problem is combinatorial in nature. Therefore, a Simulated Annealing approach is employed to provide good solutions within reasonable computational times. Computational results show that the proposed approach can efficiently solve such complicated problems.
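To make the problem structure concrete, here is a minimal simulated annealing sketch for unrelated parallel machines with a three-part cost (machining cost, earliness/tardiness penalties, makespan cost). The abstract does not give the neighbourhood move, cooling schedule, weights or data layout, so all of these (job relocation, geometric cooling, the `weights` tuple, the parameter defaults) are assumptions made for illustration only.

```python
import math
import random

def schedule_cost(schedule, proc_time, mach_cost, due, w_early, w_tardy, w_makespan):
    """Total cost of a schedule; schedule[m] is the job sequence on machine m."""
    total, makespan = 0.0, 0.0
    for m, jobs in enumerate(schedule):
        t = 0.0
        for j in jobs:
            t += proc_time[j][m]                      # completion time of job j
            total += mach_cost[j][m]                  # machining cost
            total += w_early * max(0.0, due[j] - t)   # earliness penalty
            total += w_tardy * max(0.0, t - due[j])   # tardiness penalty
        makespan = max(makespan, t)
    return total + w_makespan * makespan

def neighbour(schedule, rng):
    """Move one random job to a random position on a random machine."""
    new = [list(jobs) for jobs in schedule]
    src = rng.choice([m for m, jobs in enumerate(new) if jobs])
    job = new[src].pop(rng.randrange(len(new[src])))
    dst = rng.randrange(len(new))
    new[dst].insert(rng.randint(0, len(new[dst])), job)
    return new

def anneal(proc_time, mach_cost, due, n_machines, weights=(1.0, 2.0, 1.0),
           t_start=50.0, t_end=0.01, alpha=0.95, moves_per_temp=50, seed=0):
    rng = random.Random(seed)
    # Initial solution: deal the jobs out to the machines in round-robin order.
    current = [[] for _ in range(n_machines)]
    for j in range(len(proc_time)):
        current[j % n_machines].append(j)

    def cost(s):
        return schedule_cost(s, proc_time, mach_cost, due, *weights)

    cur_cost = cost(current)
    best, best_cost = current, cur_cost
    temp = t_start
    while temp > t_end:
        for _ in range(moves_per_temp):
            cand = neighbour(current, rng)
            delta = cost(cand) - cur_cost
            # Always accept improvements; accept worse moves with
            # probability exp(-delta / temp).
            if delta <= 0 or rng.random() < math.exp(-delta / temp):
                current, cur_cost = cand, cur_cost + delta
                if cur_cost < best_cost:
                    best, best_cost = current, cur_cost
        temp *= alpha                                 # geometric cooling
    return best, cost(best)

# Hypothetical instance: 4 jobs, 2 machines; proc_time[j][m], mach_cost[j][m].
pt = [[3, 5], [2, 4], [6, 3], [4, 4]]
mc = [[1, 2], [2, 1], [3, 1], [1, 1]]
due = [4, 6, 5, 9]
best, best_cost = anneal(pt, mc, due, n_machines=2)
```

A single relocation move changes both the assignment and the sequence, which is why one neighbourhood operator is enough to explore the joint decision described in the abstract.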
Abstract: This work deals with aspects of support vector learning for large-scale data mining tasks. Based on a decomposition algorithm that can be run in serial and parallel mode, we introduce a data transformation that allows an expensive generalized kernel to be used without additional cost. To speed up the decomposition algorithm, we analyze the problem of working set selection for large data sets and the influence of the working set size on the scalability of the parallel decomposition scheme. Our modifications and settings improve support vector learning performance and thus allow extensive parameter search methods to be used to optimize classification accuracy.
Abstract: This paper deals with the optimal design of two-channel recursive parallelogram quadrature mirror filter (PQMF) banks. The analysis and synthesis filters of the PQMF bank are composed of two-dimensional (2-D) recursive digital all-pass filters (DAFs) with a nonsymmetric half-plane (NSHP) support region. The design problem can be facilitated by using the 2-D doubly complementary half-band (DC-HB) property possessed by the analysis and synthesis filters. To find the coefficients of the 2-D recursive NSHP DAFs, we formulate the design problem as an optimization problem that can be solved using a weighted least-squares (WLS) algorithm in the minimax (L∞) optimal sense. The designed 2-D recursive PQMF bank achieves perfect magnitude response and possesses satisfactory phase response without requiring an extra phase equalizer. Simulation results are also provided for illustration and comparison.
Abstract: Among various testing methodologies, Built-in Self-Test
(BIST) is recognized as a low-cost, effective paradigm. Also,
full adders are one of the basic building blocks of most arithmetic
circuits in all processing units. In this paper, an optimized testable 2-bit
full adder is proposed as a test building block. Then, a BIST
procedure is introduced to scale up the building block and to generate
self-testable n-bit full adders. The target design can achieve 100%
fault coverage using an insignificant amount of hardware redundancy.
Moreover, overall test time is reduced by utilizing polymorphic
gates and by testing full adder building blocks in parallel.
Abstract: A transient eddy current problem is solved in the
present paper by the Laplace transform method for the case of
a double conductor line located parallel to a conducting half-space.
The Fourier sine and cosine integral transforms are used in order to
find the Laplace transform of the solution. The inverse Laplace
transform of the solution is found in closed form. The integrated
electromotive force per unit length of the double conductor line is
calculated in the form of an improper integral.
Abstract: Single crystals of magnesium alloys such as Mg-1Al,
Mg-1Zn-0.5Y, Mg-3Li, and AZ31 were successfully fabricated in this
study using the modified Bridgman method. Single crystals of pure Mg
were also made. To determine the exact orientation of the crystals, the
Laue back-reflection method and pole figure measurements were carried
out on each single crystal. The single crystals were 10 mm in diameter
and 120 mm in length. Hardness and compression tests were conducted,
and the results revealed that hardness and strength depended strongly
on orientation. The closer the orientation was to the basal one, the
higher the hardness and compressive strength. The effect of alloying
was not greater than that of orientation. After compressive deformation
of the single crystals, the orientation of the crystals was found to
rotate and become parallel to the basal orientation.
Abstract: We discuss signal detection through nonlinear
threshold systems. The detection performance is assessed by the
probability of error P_er. We establish that: (1) when the signal is
completely suprathreshold, noise always degrades signal detection,
both in the single threshold system and in the parallel array of
threshold devices. (2) When the signal is slightly subthreshold, noise
degrades signal detection in the single threshold system, but in the
parallel array noise can improve signal detection, i.e., stochastic
resonance (SR) exists in the array. (3) When the signal is predominantly
subthreshold, noise can always improve signal detection, and SR
always exists, not only in the single threshold system but also in the
parallel array. (4) The array can improve signal detection by increasing the
number of threshold devices. These results further extend the
applicability of SR in signal detection.
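The basic mechanism behind noise-assisted detection in a parallel array can be illustrated with a toy simulation: every device receives the same signal plus its own independent noise and fires when the sum exceeds the threshold. This is only a sketch under assumed conditions (Gaussian noise, two arbitrary subthreshold signal levels, 2000 devices); the abstract does not specify the noise distribution, signal model or decision rule used to compute the probability of error.

```python
import random

def firing_fraction(signal, threshold, noise_std, n_devices, rng):
    """Fraction of threshold devices that fire for one input sample.

    Each device sees the same signal plus its own independent Gaussian
    noise and outputs 1 when the noisy input exceeds the threshold.
    """
    fired = sum(
        1 for _ in range(n_devices)
        if signal + rng.gauss(0.0, noise_std) > threshold
    )
    return fired / n_devices

rng = random.Random(0)
threshold = 1.0
low, high = 0.0, 0.8   # assumed signal levels, both below the threshold

# Without noise no device fires for either level, so the array output
# carries no information about which level was sent.
quiet = (firing_fraction(low, threshold, 0.0, 2000, rng),
         firing_fraction(high, threshold, 0.0, 2000, rng))

# With noise the firing fraction depends on the signal level, so the
# summed array output can separate the two levels.
noisy = (firing_fraction(low, threshold, 0.5, 2000, rng),
         firing_fraction(high, threshold, 0.5, 2000, rng))
```

Averaging over more devices reduces the fluctuation of the firing fraction, which is consistent with result (4) that detection improves as the number of threshold devices grows.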
Abstract: Designing and implementing intelligent systems has become a crucial factor in the innovation and development of better space technology products. A neural network is a parallel system, capable of resolving paradigms that linear computing cannot. A field programmable gate array (FPGA) is a digital device that offers reprogrammability and robust flexibility. For neural-network-based instrument prototypes in real-time applications, conventional specific VLSI neural chip design suffers from limitations in time and cost. With low-precision artificial neural network (ANN) design, FPGAs offer higher speed and smaller size for real-time applications than VLSI and DSP chips. Many researchers have therefore made great efforts to realize neural networks (NNs) using FPGA techniques. In this paper, ANN and FPGA techniques are briefly introduced. VHDL code is also proposed to implement ANNs and to present simulation results with floating point arithmetic. Synthesis results for the ANN controller are developed using Precision RTL. The proposed VHDL implementation provides a flexible, fast method with a high degree of parallelism for implementing ANNs. Implementing the multi-layer NN using a lookup table (LUT) reduces both the resource utilization and the execution time.
Abstract: The approach based on the wavelet transform has
been widely used for image denoising due to its multi-resolution
nature, its ability to produce high levels of noise reduction and the
low level of distortion introduced. However, by removing noise, high
frequency components belonging to edges are also removed, which
leads to blurring the signal features. This paper proposes a new
method of image noise reduction based on local variance and edge
analysis. The analysis is performed by dividing an image into 32 x 32
pixel blocks, and transforming the data into the wavelet domain. A fast
lifting wavelet spatial-frequency decomposition and reconstruction is
developed, with the advantages of computational efficiency and
minimized boundary effects. Adaptive thresholding by local
variance estimation and edge strength measurement can effectively
reduce image noise while preserving the features of the original image
corresponding to the boundaries of the objects. Experimental results
demonstrate that the method performs well for images contaminated
by natural and artificial noise, and that it can be adapted to
different classes of images and types of noise. The proposed algorithm
provides a potential solution with parallel computation for real time
or embedded system application.
Abstract: Nanotechnology based on epitaxial systems
involves single or arranged misfit dislocations. In general, whatever
the type of dislocation or the geometry of the array formed by the
dislocations, it is important for experimental studies to know exactly
the stress distribution, for which there is no analytical expression [1,
2]. This work uses numerical analysis to study the relaxation of
epitaxial layers having at their interface a periodic network of edge
misfit dislocations. The stress distribution is estimated by using
isotropic elasticity. The results show that the thickness of the two
sheets is a crucial parameter in the stress distributions and then in the
profile of the two sheets.
A comparative study between the case of a single dislocation and
the case of a parallel network shows that the layers relax better when
the interface is covered by a parallel arrangement of misfit dislocations.
Consequently, a single dislocation at the interface produces a
significant stress field, which can be reduced by inserting a parallel
network of dislocations with suitable periodicity.
Abstract: The ''cocktail party problem'' is well known in connection with human auditory ability: we can recognize the specific sound we want to listen to even when many undesirable sounds or noises are mixed with it. Blind source separation (BSS) based on independent component analysis (ICA) is one of the methods by which a particular signal can be separated from mixed signals under simple assumptions. In this paper, we propose an online approach to blind source separation using the sliding DFT and time-domain independent component analysis. The proposed method reduces computational complexity in comparison with conventional methods, and can be applied to parallel processing using digital signal processors (DSPs) and similar hardware. We evaluate this method and show its effectiveness.
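The sliding DFT named above updates a DFT bin in constant time per new sample, instead of recomputing the whole transform for each window position. The sketch below shows the standard single-bin recurrence only; it is not the authors' separation algorithm, and the function name, window length and test signal are illustrative assumptions.

```python
import cmath

def sliding_dft(samples, n, k):
    """Track bin k of the length-n DFT over the most recent n samples.

    Uses the recurrence X_k(m) = (X_k(m-1) - x(m-n) + x(m)) * exp(2j*pi*k/n),
    with samples before the start of the stream treated as zero.
    Returns one bin value per input sample.
    """
    twiddle = cmath.exp(2j * cmath.pi * k / n)
    window = [0.0] * n          # circular buffer of the last n samples
    acc = 0j
    out = []
    for i, x in enumerate(samples):
        oldest = window[i % n]  # sample that leaves the window
        window[i % n] = x
        acc = (acc - oldest + x) * twiddle
        out.append(acc)
    return out

# Illustrative signal; compare the last tracked value with a direct DFT
# of the final window.
xs = [float((i * i) % 7 - 3) for i in range(20)]
n, k = 8, 3
tracked = sliding_dft(xs, n, k)
direct = sum(v * cmath.exp(-2j * cmath.pi * k * j / n)
             for j, v in enumerate(xs[-n:]))
```

Each bin is updated independently of the others, which is what makes the per-bin work easy to distribute across parallel processing units.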
Abstract: The protection of parallel transmission lines has been a challenging task due to mutual coupling between the adjacent circuits of the line. This paper presents a novel scheme for the detection and classification of faults on parallel transmission lines. The proposed approach uses a combination of the wavelet transform and a neural network to solve the problem. The wavelet transform is a powerful mathematical tool that can be employed as a fast and very effective means of analyzing power system transient signals, while an artificial neural network has the ability to classify non-linear relationships between measured signals by identifying different patterns in the associated signals. The proposed algorithm consists of time-frequency analysis of fault-generated transients using the wavelet transform, followed by pattern recognition using an artificial neural network to identify the type of fault. MATLAB/Simulink is used to generate fault signals and verify the correctness of the algorithm. The adaptive discrimination scheme is tested by simulating different types of fault and varying the fault resistance, fault location and fault inception time on a given power system model. The simulation results show that the proposed scheme for fault diagnosis is able to classify all the faults on the parallel transmission line rapidly and correctly.
Abstract: This paper presents the implementation of an attitude controller for a small UAV using a field programmable gate array (FPGA). Because of the small-size constraint, a miniature, more compact and computationally capable autopilot platform is needed for such systems. Moreover, a UAV autopilot has to deal with extremely adverse situations in the shortest possible time while accomplishing its mission. FPGAs have recently established themselves as fast, parallel, real-time processing devices in a compact size. This work utilizes this fact and implements different attitude controllers for a small UAV in an FPGA, using its parallel processing capabilities. The attitude controller is designed in the MATLAB/Simulink environment. The discrete version of this controller is implemented using pipelining followed by retiming, to reduce the critical path and thereby the clock period of the controller datapath. The pipelined, retimed, parallel PID controller is implemented using "System Generator", an efficient rapid-prototyping and testing development tool developed by Xilinx for FPGA implementation. The improved timing performance enables the controller to react rapidly to any changes in the attitude of the UAV.
Abstract: The full search block matching algorithm is widely used for hardware implementation of motion estimators in video compression algorithms. In this paper we propose a new architecture, which consists of a 2D parallel processing unit and a 1D unit working in parallel. The proposed architecture reduces both data access power and computational power, which are the main causes of power consumption in integer motion estimation. It also completes the operations in nearly the same number of clock cycles as a 2D systolic array architecture. In this work, the sum of absolute differences (SAD), the most repeated operation in block matching, is calculated in two steps. The first step is to calculate the SAD for alternate rows using the 2D parallel unit. If the SAD calculated by the parallel unit is less than the stored minimum SAD, the SAD of the remaining rows is calculated by the 1D unit. Early termination, which stops avoidable computations, is achieved with the help of the alternate rows method proposed in this paper and by finding a low initial SAD value based on motion vector prediction. Data reuse is applied to the reference blocks in the same search area, which significantly reduces memory access.
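The two-step SAD with early termination can be sketched in software as follows. Because every SAD term is non-negative, a partial SAD over alternate rows that already reaches the stored minimum guarantees the full SAD cannot beat it, so the remaining rows can be skipped without changing the result. This is a sequential sketch of the control flow only, not the proposed hardware architecture; the function names and the tiny example blocks are hypothetical.

```python
def row_sad(cur, ref, rows):
    """Sum of absolute differences over the given rows of two blocks."""
    return sum(abs(c - r) for y in rows for c, r in zip(cur[y], ref[y]))

def two_step_sad(cur, ref, best_so_far):
    """Return the full SAD, or None if rejected after the first step."""
    n = len(cur)
    partial = row_sad(cur, ref, range(0, n, 2))         # step 1: alternate rows
    if partial >= best_so_far:
        return None                                     # early termination
    return partial + row_sad(cur, ref, range(1, n, 2))  # step 2: remaining rows

def best_match(cur, candidates, initial_sad=float("inf")):
    """Index and SAD of the best candidate block.

    initial_sad models a low starting value, for example one obtained
    from motion vector prediction.
    """
    best_idx, best = None, initial_sad
    for i, ref in enumerate(candidates):
        sad = two_step_sad(cur, ref, best)
        if sad is not None and sad < best:
            best_idx, best = i, sad
    return best_idx, best

# Hypothetical 2x2 blocks: the second candidate is rejected after step 1.
cur = [[1, 2], [3, 4]]
candidates = [
    [[1, 2], [3, 9]],   # full SAD 5
    [[9, 9], [3, 4]],   # first-row SAD 15, not below the stored minimum of 5
    [[1, 2], [3, 5]],   # full SAD 1
]
result = best_match(cur, candidates)
```

A lower `initial_sad` lets more candidates be rejected after the first step, which is the role motion vector prediction plays in the abstract.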