Abstract: Real-time image and video processing is in demand in
many computer vision applications, e.g. video surveillance, traffic
management and medical imaging. The processing of those video
applications requires high computational power. Thus, the optimal
solution is the collaboration of CPU and hardware accelerators. In
this paper, a Canny edge detection hardware accelerator is proposed.
Edge detection is one of the basic building blocks of video and image
processing applications. It is a common block in the pre-processing
phase of image and video processing pipelines. Our approach
offloads the Canny edge detection algorithm from the
processing system (PS) to the programmable logic (PL), taking
advantage of the High Level Synthesis (HLS) tool flow to accelerate the
implementation on the Zynq platform. The resulting implementation
enables up to a 100x performance improvement through hardware
acceleration. CPU utilization drops and the frame rate
rises to 60 fps for a 1080p full HD input video stream.
Abstract: This paper describes cycle-accurate simulation results for weight values learned by an auto-encoder behavior model in pre-route simulation. Using these results, we visualized the first-layer representations with natural images. Much recent deep learning work has focused on learning high-level abstractions of unlabeled raw data by unsupervised feature learning. However, when handling such large amounts of data, the computational complexity and run time of the learning methods have limited advanced research. These limitations arose because the algorithms were computed using only single-core CPUs. For this reason, parallel hardware such as FPGAs has been seen as a possible solution. We adopted and simulated a ready-made auto-encoder to design a behavior model in Verilog HDL before designing hardware. With the pre-route simulation of the auto-encoder behavior model, we obtained cycle-accurate results for the parameters of each hidden layer using ModelSim. Cycle-accurate results are a very important factor in designing parallel digital hardware. Finally, this paper shows that the behavior-model-based pre-route simulation operates correctly. Moreover, we visualized the learned latent representations of the first hidden layer with the Kyoto natural image dataset.
Abstract: This paper suggests a design methodology for the hardware and software of the electronic control unit (ECU) of safety-critical vehicle applications such as braking and steering. The hardware architecture is a high-integrity system that incorporates a high-performance 32-bit CPU and a separate peripheral control processor (PCP), together with an external watchdog CPU. Communication between the main CPU and the PCP takes place via a common area of RAM and events on either processor that are invoked by interrupts. Safety-related software is also implemented to provide a reliable, self-testing computing environment for safety-critical and high-integrity applications. The validity of the design approach is shown using hardware-in-the-loop simulation (HILS) for an electric power steering (EPS) system, which consists of the EPS mechanism, the designed ECU, and monitoring tools.
Abstract: Medical images are an integral part of e-health care and e-diagnosis systems. Medical image watermarking is widely used to protect patients’ information from malicious alteration and manipulation. The watermarked medical images are transmitted over the internet among patients, primary physicians, and referred physicians. The images are highly prone to corruption in the wireless transmission medium due to various noises, deflection, and refraction. Distortion in the received images leads to faulty watermark detection and inappropriate disease diagnosis. To address this issue, this paper adds an error correction code (ECC), the (8, 4) Hamming code, to an existing watermarking system. In addition, we implement the computationally complex ECC on a graphics processing unit (GPU) to accelerate it and support real-time requirements. Experimental results show that the GPU achieves considerable speedup over the sequential CPU implementation while maintaining 100% ECC efficiency.
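To make the (8, 4) Hamming code named in the abstract above concrete, the following is a minimal sketch of the standard extended Hamming code: a Hamming(7, 4) codeword plus one overall parity bit, which corrects any single-bit error and detects any double-bit error. The bit layout, the function names `hamming84_encode` and `hamming84_decode`, and the status strings are assumptions chosen for illustration. The abstract does not describe the authors' bit ordering or their GPU implementation, and this sequential version is not it.

```python
def hamming84_encode(nibble):
    """Encode a 4-bit value (0..15) as 8 bits: [p1, p2, d1, p4, d2, d3, d4, p0]."""
    d1, d2, d3, d4 = [(nibble >> shift) & 1 for shift in (3, 2, 1, 0)]
    p1 = d1 ^ d2 ^ d4          # covers codeword positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4          # covers codeword positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4          # covers codeword positions 4, 5, 6, 7
    word = [p1, p2, d1, p4, d2, d3, d4]
    p0 = 0
    for bit in word:           # overall parity over the 7-bit Hamming word
        p0 ^= bit
    return word + [p0]


def hamming84_decode(bits):
    """Return (nibble, status); status is 'ok', 'corrected' or 'double_error'."""
    c = list(bits)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    syndrome = s1 | (s2 << 1) | (s4 << 2)   # 1-based position of a single error
    overall = 0
    for bit in c:
        overall ^= bit
    if overall == 1:
        # Odd number of flipped bits: assume one error and repair it.
        if syndrome:
            c[syndrome - 1] ^= 1
        else:
            c[7] ^= 1                        # the overall parity bit itself flipped
        status = "corrected"
    elif syndrome:
        # Even parity but non-zero syndrome: two errors, not correctable.
        return None, "double_error"
    else:
        status = "ok"
    nibble = (c[2] << 3) | (c[4] << 2) | (c[5] << 1) | c[6]
    return nibble, status


if __name__ == "__main__":
    codeword = hamming84_encode(0b1011)
    codeword[4] ^= 1                         # simulate one corrupted bit in transit
    print(hamming84_decode(codeword))
```

Each 4-bit group is encoded and decoded independently of the others, which is the kind of workload that maps naturally onto one GPU thread per codeword.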
Abstract: Cloud computing (CC) and mobile cloud computing (MCC) have advanced rapidly in the last few years. Today, MCC is improving quickly in terms of hardware (memory, embedded sensors, power consumption, touch screen, etc.), software (increasingly sophisticated mobile applications), and transmission (higher data transmission rates achieved with different technologies such as 3G). This paper presents a review of the concepts of CC and MCC. It then discusses what has been done regarding middleware in cloud and mobile cloud computing. Finally, it shows the architecture of our proposed middleware along with the functionalities it will provide to mobile clients in order to overcome well-known problems such as low battery power, slow CPU speed, and little memory.
Abstract: The conjugate gradient method has been widely used
to solve large-scale unconstrained optimization problems owing to its
iteration count, memory use, CPU time, and convergence properties.
In this paper we find a new class of nonlinear conjugate gradient
coefficients with global convergence properties proved under exact line
search. The numerical results for our new β_k compare well
with those of well-known formulas.
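The abstract above does not give the new coefficient, so the sketch below illustrates the general scheme with one of the well-known formulas instead: the Fletcher-Reeves coefficient β_k = ‖g_{k+1}‖² / ‖g_k‖², applied to a convex quadratic f(x) = ½ xᵀAx − bᵀx, for which the exact line search step has a closed form. The function name `cg_fletcher_reeves`, the test matrix, and the tolerance are assumptions for illustration only.

```python
def matvec(A, x):
    """Dense matrix-vector product for a list-of-rows matrix."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def cg_fletcher_reeves(A, b, x0, tol=1e-10, max_iter=100):
    """Minimize 0.5*x'Ax - b'x (A symmetric positive definite).

    Returns (x, iterations). Uses exact line search and the
    Fletcher-Reeves coefficient, not the coefficient proposed in the paper.
    """
    x = list(x0)
    g = [ax - bi for ax, bi in zip(matvec(A, x), b)]   # gradient A x - b
    d = [-gi for gi in g]                              # first direction: steepest descent
    for k in range(max_iter):
        gg = dot(g, g)
        if gg ** 0.5 < tol:
            return x, k
        Ad = matvec(A, d)
        alpha = -dot(g, d) / dot(d, Ad)                # exact minimizer along d
        x = [xi + alpha * di for xi, di in zip(x, d)]
        g_new = [gi + alpha * adi for gi, adi in zip(g, Ad)]
        beta = dot(g_new, g_new) / gg                  # Fletcher-Reeves beta_k
        d = [-gn + beta * di for gn, di in zip(g_new, d)]
        g = g_new
    return x, max_iter


if __name__ == "__main__":
    A = [[4.0, 1.0], [1.0, 3.0]]
    b = [1.0, 2.0]
    print(cg_fletcher_reeves(A, b, [0.0, 0.0]))
```

Comparing a new coefficient against established ones amounts to swapping the single `beta` line and recording iteration counts and CPU time on a common set of test problems.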
Abstract: Accurate modeling of high speed RLC interconnects
has become a necessity to address signal integrity issues in current
VLSI design. To accurately model a dispersive system of interconnects
at higher frequencies, a full-wave analysis is required.
However, conventional circuit simulation of interconnects with
full-wave models is extremely CPU-intensive. We present an algorithm
for reducing large VLSI circuits to much smaller ones with similar
input-output behavior. A key feature of our method, called Frequency
Shift Technique, is that it is capable of reducing linear time-varying
systems. This enables it to capture frequency-translation and sampling
behavior, important in communication subsystems such as mixers,
RF components and switched-capacitor filters. Reduction is obtained
by projecting the original system described by linear differential
equations into a lower dimension. Experiments have been carried out
using Cadence Design Simulator, and they indicate that the proposed
technique achieves a greater percentage reduction with less CPU time than
other model order reduction techniques in the literature. We
also present applications to RF circuit subsystems, obtaining size
reductions and evaluation speedups of orders of magnitude with
insignificant loss of accuracy.
Abstract: Today, design requirements are extending more and
more from electronic (analogue and digital) to multidisciplinary design.
These current needs imply implementation of methodologies to make
the CAD product reliable in order to improve time to market, study
costs, reusability and reliability of the design process.
This paper proposes a high-level design approach applied to the
characterization and the optimization of Switched-Current Sigma-
Delta Modulators. It uses the new hardware description language
VHDL-AMS to help the designers to optimize the characteristics of
the modulator at a high level with a considerably reduced CPU time
before passing to a transistor level characterization.
Abstract: Skin color can provide a useful and robust cue
for human-related image analysis, such as face detection,
pornographic image filtering, hand detection and tracking,
people retrieval in databases and on the Internet, etc. The major
problem with such skin color detection algorithms is
that they are time consuming and hence cannot be applied to a
real-time system. To overcome this problem, we introduce a new
fast technique for skin detection which can be applied in a
real-time system. In this technique, instead of testing each image
pixel to label it as skin or non-skin (as in classic techniques),
we skip a set of pixels. The reason for the skipping process is
the high probability that neighbors of skin color pixels are
also skin pixels, especially in adult images, and vice versa. The
proposed method can rapidly detect skin and non-skin color
pixels, which in turn dramatically reduces the CPU time
required for the protection process. Since many fast detection
techniques are based on image resizing, we apply our
proposed pixel skipping technique with image resizing to
obtain better results. The performance evaluation of the
proposed skipping and hybrid techniques in terms of the
measured CPU time is presented. Experimental results
demonstrate that the proposed methods achieve better results
than the relevant classic method.
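The pixel-skipping idea in the abstract above can be sketched as follows. The abstract does not specify which pixels are skipped or which skin classifier is used, so this sketch makes two assumptions that are purely illustrative: it tests one pixel per `step` × `step` block and copies that label to the whole block, and it uses a simple fixed RGB threshold rule as a stand-in classifier. The names `is_skin` and `skin_mask_skipping` are hypothetical.

```python
def is_skin(rgb):
    """Stand-in skin test using a fixed RGB threshold rule (an assumption)."""
    r, g, b = rgb
    return (r > 95 and g > 40 and b > 20
            and max(rgb) - min(rgb) > 15
            and abs(r - g) > 15 and r > g and r > b)


def skin_mask_skipping(image, step=2):
    """Label pixels as skin / non-skin, testing only one pixel per block.

    image is a list of rows of (r, g, b) tuples. Returns (mask, tests), where
    tests counts how many pixels were actually classified.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    mask = [[False] * width for _ in range(height)]
    tests = 0
    for y in range(0, height, step):
        for x in range(0, width, step):
            label = is_skin(image[y][x])     # classify only the block's first pixel
            tests += 1
            # Neighbours are assumed to share the label of the tested pixel.
            for yy in range(y, min(y + step, height)):
                for xx in range(x, min(x + step, width)):
                    mask[yy][xx] = label
    return mask, tests


if __name__ == "__main__":
    skin, sky = (220, 170, 140), (30, 60, 200)
    image = [[skin, skin, sky, sky] for _ in range(4)]
    mask, tests = skin_mask_skipping(image, step=2)
    print(tests, mask[0])
```

With `step=1` every pixel is tested, which corresponds to the classic per-pixel approach. Larger steps reduce the number of classifier calls roughly by a factor of `step` squared, at the cost of coarser region boundaries.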
Abstract: The volume of XML data exchange is explosively
increasing, and the need for efficient mechanisms of XML data
management is vital. Many XML storage models have been proposed
for storing DTD-independent XML documents in relational database
systems. Benchmarking is the best way to highlight pros and cons of
different approaches. In this study, we use a common benchmarking
scheme, known as XMark, to compare the most cited and newly
proposed DTD-independent methods in terms of logical reads,
physical I/O, CPU time and duration. We show the effect of Label
Path, of extracting values and storing them in another table, and of the
type of join needed for each method's query answering.
Abstract: Discovery schools in Jordan are connected in one flat
ATM bridge network. All schools connected to the network will hear
broadcast traffic. A high percentage of unwanted traffic, such as
broadcast, consumes the bandwidth between the schools and QRC.
Routers in QRC have high CPU utilization. The number of
connections on the router is very high and may exceed the manufacturer's
recommended specifications. One way to minimize the number of
connections to the routers in QRC, and to minimize broadcast traffic, is
to use PPPoE. In this study, a PPPoE solution is presented
which shows high performance for the clients when accessing the
school server resources. Despite the large number of the discovery
schools at MoE, the experimental results show that the PPPoE
solution is able to yield a satisfactory performance for each client at
the school and noticeably reduce the traffic broadcast to the QRC.
Abstract: In many turbulent flows, a major part of the flow field
involves no complicated turbulent behavior. In this research work, in
order to reduce required memory and CPU time, the flow field was
decomposed into several blocks, each block with its own
turbulence model. A two-dimensional backward-facing step was considered
here. Four combinations of the Prandtl mixing length and standard
k-ε models were implemented as well. Computer memory and CPU
time consumption in addition to numerical convergence and accuracy
of the obtained results were mainly investigated. Observations
showed that a suitable combination of turbulence models in different
blocks led to results with the same accuracy as using the high-order
turbulence model for all of the blocks, in addition to reductions in
memory and CPU time consumption.
Abstract: Measures of complexity and entropy have not converged to a single quantitative description of the levels of organization of complex systems. Such a measure is increasingly needed in all disciplines that study complex systems. To address this problem, starting from the most fundamental principle in physics, a new measure for the quantity of organization and the rate of self-organization in complex systems, based on the principle of least (stationary) action, is applied here to a model system: the central processing unit (CPU) of computers. The quantity of organization for several generations of CPUs shows a double exponential rate of change of organization with time. The exact functional dependence has a fine, S-shaped structure, revealing some of the mechanisms of self-organization. The principle of least action helps to explain the mechanism of increase of organization through quantity accumulation and constraint and curvature minimization with an attractor, the least average sum of actions of all elements and for all motions. This approach can help describe, quantify, measure, manage, design, and predict the future behavior of complex systems in order to achieve the highest rates of self-organization and improve their quality. It can be applied to other complex systems in physics, chemistry, biology, ecology, economics, cities, network theory, and other fields where complex systems are present.
Abstract: Numerical integration of the initial boundary value problem for the advection equation in ℝ³ is considered. The method used is a conditionally stable semi-Lagrangian advection scheme with high-order interpolation on an unstructured mesh. To increase the integration time step, the BFECC method with TVD limiter correction is used. The method is adapted to a parallel graphics processing unit environment using NVIDIA CUDA and applied in a Navier-Stokes solver. It is shown that the calculation on an NVIDIA GeForce 8800 GPU is 184 times faster than on one processor of an AMD X2 4800+ CPU. The method is extended to an incompressible fluid dynamics solver. Flow over a cylinder in the 3D case is compared with experimental data.
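The BFECC (back and forth error compensation and correction) step named in the abstract above can be illustrated in one dimension. The sketch below is a deliberately simplified setting and not the authors' method: it assumes a uniform periodic 1D grid, constant velocity, linear interpolation, and no TVD limiter, whereas the abstract describes high-order interpolation on an unstructured 3D mesh with limiting, running in CUDA. The function names `advect`, `bfecc_step`, and `compare_schemes` are hypothetical.

```python
import math


def advect(u, shift):
    """One semi-Lagrangian step on a periodic grid.

    The new value at node i is the old field sampled at the departure point
    i - shift (measured in cells), using linear interpolation.
    """
    n = len(u)
    out = []
    for i in range(n):
        x = i - shift
        j = math.floor(x)
        frac = x - j
        out.append((1.0 - frac) * u[j % n] + frac * u[(j + 1) % n])
    return out


def bfecc_step(u, shift):
    """BFECC: advect forward, advect back, compensate half the round-trip error."""
    forward = advect(u, shift)
    back = advect(forward, -shift)
    corrected = [ui + 0.5 * (ui - bi) for ui, bi in zip(u, back)]
    return advect(corrected, shift)


def compare_schemes(n=64, shift=0.5, steps=128):
    """Advect a sine wave once around the periodic domain with both schemes.

    Returns (max error of plain semi-Lagrangian, max error of BFECC).
    With the defaults, steps * shift == n, so the exact solution is the
    initial profile.
    """
    u0 = [math.sin(2.0 * math.pi * i / n) for i in range(n)]
    plain, bfecc = list(u0), list(u0)
    for _ in range(steps):
        plain = advect(plain, shift)
        bfecc = bfecc_step(bfecc, shift)
    err_plain = max(abs(a - b) for a, b in zip(plain, u0))
    err_bfecc = max(abs(a - b) for a, b in zip(bfecc, u0))
    return err_plain, err_bfecc


if __name__ == "__main__":
    print(compare_schemes())
```

Each BFECC step costs three basic advection passes, all of which update every node independently of the others, so the scheme is a natural fit for one GPU thread per node.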
Abstract: An embedded hardware simulator is a valuable computer-aided
tool for embedded application development. This paper focuses
on the ARM926EJ-S MMU, builds state transition models and
formally verifies critical properties of the models. The state transition
models include an instruction loading model, a data reading model, and
a data writing model. The properties of the models are described in the
CTL specification language, and they are verified in VIS. The results
obtained in VIS demonstrate that the critical properties of MMU are
satisfied in the state transition models. The correct models can be
used to implement the MMU component in our simulator. At the
end of this paper, the experimental results show that the MMU can
successfully accomplish memory access requests from the CPU.
Abstract: On-board Error Detection and Correction (EDAC)
devices aim to secure data transmitted between the central
processing unit (CPU) of a satellite onboard computer and its local
memory. This paper presents a comparison of the performance of
four low complexity EDAC techniques for application in Random
Access Memories (RAMs) on-board small satellites. The
performance of a newly proposed EDAC architecture is measured
and compared with three different EDAC strategies, using the same
FPGA technology. A statistical analysis of single-event upset (SEU)
and multiple-bit upset (MBU) activity in commercial memories
onboard Alsat-1 is given for a period of 8 years.
Abstract: This paper presents an effective traffic light detection
method for night-time. First, candidate blobs of traffic lights are
extracted from the RGB color image. The input image is represented in the
dominant color domain using the color transform proposed by Ruta,
then red and green color dominant regions are selected as candidates.
After candidate blob selection, we apply a shape filter for noise
reduction using blob information such as length, area, area of
bounding box, etc. A multi-class classifier based on SVM (Support
Vector Machine) is applied to the candidates. Three kinds of features
are used. We use basic features such as blob width, height, center
coordinate, area, and area of blob. Brightness-based stochastic features are also
used. In particular, geometric moment values between the
candidate region and the adjacent region are proposed and used to improve
the detection performance. The proposed system is implemented on an
Intel Core CPU at 2.80 GHz with 4 GB RAM and tested with
urban and rural road videos. Through the test, we show that the
proposed method using PF, BMF, and GMF reaches a detection rate of up to 93 %
with an average computation time of 15 ms/frame.
Abstract: A separation-kernel-based operating system (OS) has been designed for use in secure embedded systems by applying formal methods to the design of the separation-kernel part. The separation kernel is a small OS kernel that provides an abstract distributed environment on a single CPU. The design of the separation kernel was verified using two formal methods, the B method and the Spin model checker. A newly designed semi-formal method, the extended state transition method, was also applied. An OS comprising the separation-kernel part and additional OS services on top of the separation kernel was prototyped on the Intel IA-32 architecture. Developing and testing of a prototype embedded application, a point-of-sale application, on the prototype OS demonstrated that the proposed architecture and the use of formal methods to design its kernel part are effective for achieving a secure embedded system having a high-assurance separation kernel.
Abstract: This research simulates a natural phenomenon,
the ocean wave. Our goal is to simulate the ocean wave at a
real-time rate with the water surface interacting with objects. The
wave in this research is calm and smooth, caused by the force of the
wind above the ocean surface. To make the simulation of the
wave real-time, GPU and
multithreading techniques are used. Because
new-generation CPUs for personal computers have multiple cores, they
are well suited to multithreading, which utilizes more than one
core at a time. The simulation is programmed in C with
OpenGL. To make the simulated wave look more realistic, we
applied an OpenGL technique called cube mapping (environment
mapping) to make the water surface reflective.